npm - create-ai-project - Versions diffs - 1.23.1 → 1.23.2 - Mend

create-ai-project 1.23.1 → 1.23.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26)

package/.claude/agents-en/acceptance-test-generator.md CHANGED

@@ -172,6 +172,8 @@ describe('[Feature Name] Integration Test', () => {
   // @category: core-functionality
   // @dependency: PaymentService, OrderRepository, Database
   // @complexity: high
+  // Primary failure mode: payment succeeds but the order row is absent or unpersisted
+  // Proof obligation: the order is persisted only after a successful payment; the external payment gateway is the only boundary that may be mocked
   it.todo('AC1: Successful payment creates persisted order with correct status')
   // AC1-error: "Payment failure shows user-friendly error message"
@@ -180,10 +182,16 @@ describe('[Feature Name] Integration Test', () => {
   // @category: core-functionality
   // @dependency: PaymentService, ErrorHandler
   // @complexity: medium
+  // Primary failure mode: payment failure still creates an order, or the error is swallowed without a user-visible message
+  // Proof obligation: a failed payment surfaces an actionable error and leaves no order persisted; only the payment gateway may be mocked
   it.todo('AC1-error: Failed payment displays error without creating order')
 })
 ```
+**Proof annotations** (apply to every skeleton, alongside the metadata above): each `it.todo` carries two comment lines that hand the proof contract to the test implementer and to integration-test-reviewer (these map to the task template's Proof Obligations fields):
+- `Primary failure mode`: the specific regression that turns this test red — the behavior the AC promises and would break
+- `Proof obligation`: what the implemented test must assert to prove the claim — the boundary to traverse, the observable state before/after for state-changing ACs, and which boundaries may be mocked and why. Phrase it as design intent describing what to assert; the implementer writes the executable assertions and mock setup
 ### E2E Test Files
 Generate **separate files per lane**: `*.fixture-e2e.test.[ext]` for fixture-e2e, `*.service-e2e.test.[ext]` for service-integration-e2e. Each emitted file MUST carry a `@lane:` header so downstream agents (work-planner, task-decomposer, executor) can route correctly.
@@ -205,6 +213,8 @@ describe('[Feature Name] fixture-e2e', () => {
   // @lane: fixture-e2e
   // @dependency: full-ui (mocked backend)
   // @complexity: medium
+  // Primary failure mode: a step transition or its observable state is lost across the journey
+  // Proof obligation: each step's UI transition and resulting state are asserted in sequence; only the backend is mocked (canned responses)
   it.todo('User Journey: Cart-to-confirmation flow with mocked payment')
 })
 ```
@@ -226,6 +236,8 @@ describe('[Feature Name] service-integration-e2e', () => {
   // @lane: service-integration-e2e
   // @dependency: full-system
   // @complexity: high
+  // Primary failure mode: the order row or downstream event is absent after a real cross-service purchase
+  // Proof obligation: the DB row, published event, and enqueued email are observed against the real local stack; nothing on the asserted path is mocked
   it.todo('User Journey: Complete purchase persists order and publishes downstream event')
 })
 ```
@@ -240,6 +252,8 @@ describe('[Feature Name] service-integration-e2e', () => {
 // ROI: [value] | Test Type: property-based
 // @category: core-functionality
 // fast-check: fc.property(fc.constantFrom([input variations]), (input) => [invariant])
+// Primary failure mode: an input in the generated domain violates the stated invariant
+// Proof obligation: the invariant holds for all generated inputs; no boundary is mocked
 it.todo('[AC#]-property: [invariant in natural language]')
 ```
@@ -316,7 +330,7 @@ Upon completion, report in the following JSON format. Detailed meta information
 ## Constraints and Quality Standards
 **Required Compliance**:
-- Output `it.todo` skeletons only: each skeleton contains verification points, expected results, and pass criteria as comments inside `it.todo` blocks.
+- Output `it.todo` skeletons only: each skeleton contains verification points, expected results, pass criteria, primary failure mode, and proof obligation as comments inside `it.todo` blocks.
   Implementation code, assertions (`expect`), and mock setup must not be included — downstream consumers parse `it.todo` presence to determine phase placement and review status.
 - Clearly state verification points, expected results, and pass criteria for each test
 - Preserve original AC statements in comments (ensure traceability)
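
The hunks above add two comment lines ahead of every `it.todo`. As a rough illustration of how a downstream consumer could read them back, here is a minimal sketch. The helper names `extractProofAnnotations` and `missingAnnotations` are hypothetical and do not come from the package; the sketch assumes each annotation sits on a single `//` comment line before its `it.todo`.

```typescript
// Hypothetical helper, not part of create-ai-project. It assumes one-line
// annotations placed before the it.todo they belong to.
interface ProofAnnotation {
  title: string;               // the it.todo title
  primaryFailureMode?: string; // text after "Primary failure mode:"
  proofObligation?: string;    // text after "Proof obligation:"
}

function extractProofAnnotations(source: string): ProofAnnotation[] {
  const results: ProofAnnotation[] = [];
  let failureMode: string | undefined;
  let obligation: string | undefined;

  for (const rawLine of source.split("\n")) {
    const line = rawLine.trim();
    const fm = line.match(/^\/\/\s*Primary failure mode:\s*(.+)$/);
    const po = line.match(/^\/\/\s*Proof obligation:\s*(.+)$/);
    const todo = line.match(/^it\.todo\((['"`])(.*)\1\)/);

    if (fm) {
      failureMode = fm[1];
    } else if (po) {
      obligation = po[1];
    } else if (todo) {
      // Attach whatever annotations were seen since the previous it.todo.
      results.push({
        title: todo[2] ?? "",
        primaryFailureMode: failureMode,
        proofObligation: obligation,
      });
      failureMode = undefined;
      obligation = undefined;
    }
  }
  return results;
}

// Titles of skeletons that lack either annotation.
function missingAnnotations(source: string): string[] {
  return extractProofAnnotations(source)
    .filter((a) => !a.primaryFailureMode || !a.proofObligation)
    .map((a) => a.title);
}
```

A check like `missingAnnotations` could run before skeletons are handed to the planner, so a skeleton without a proof contract is caught early rather than at review time.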

package/.claude/agents-en/code-reviewer.md CHANGED

@@ -106,6 +106,11 @@ For each function/method in implementation files, check against coding-standards
   - Counts as coverage: the test body executes at least one assertion that exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty list, null result) count when absence is the AC's expectation
   - Non-substantive examples: `skip`/`xit` left on a test that should run, TODO-only or placeholder body, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
   - Action on non-substantive: record as `coverage_gap` with rationale citing the AC reference and the specific substance issue (file:line)
+- **Proof verification per cited test** (beyond substance):
+  - When applies: a test counts as substantive coverage for an AC marked fulfilled
+  - Primary-failure-mode source: cite the claim's recorded Proof Obligation (task file) or test skeleton annotation; derive from the AC only when neither exists, so the judgment matches what the test author targeted
+  - Counts as proof: the test turns red under that primary failure mode and exercises the claimed boundary directly
+  - Action when unproven: a test that passes yet would stay green if the claimed behavior regressed → record as `coverage_gap` with rationale naming the unproven failure mode (file:line)
 #### Finding Classification
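
The "Proof verification per cited test" rule distinguishes a test that asserts something real from a test that would turn red under the primary failure mode. A minimal sketch of that distinction follows, using hypothetical in-memory stand-ins (`checkout`, `regressedCheckout`, `provesClaim`); none of these names come from the package, and checks are modelled as functions returning `true` when the test would be green.

```typescript
// Hypothetical stand-ins for illustration only.
type Order = { id: string; status: "paid" };
type Checkout = (orders: Order[], paymentOk: boolean) => { ok: boolean };
type Check = (impl: Checkout) => boolean; // true means the test is green

// Correct behavior: an order is persisted only after a successful payment.
const checkout: Checkout = (orders, paymentOk) => {
  if (paymentOk) orders.push({ id: "o-1", status: "paid" });
  return { ok: paymentOk };
};

// Primary failure mode injected: payment succeeds but no order row is written.
const regressedCheckout: Checkout = (_orders, paymentOk) => ({ ok: paymentOk });

// Substantive but unproven: it asserts the response, never the order row.
const responseOnlyCheck: Check = (impl) => impl([], true).ok === true;

// Proving: observes the persisted state before and after the action.
const provingCheck: Check = (impl) => {
  const orders: Order[] = [];
  const before = orders.length;
  impl(orders, true);
  return before === 0 && orders.length === 1 && orders[0]?.status === "paid";
};

// A check proves the claim when it is green on the correct implementation
// and red once the primary failure mode is injected.
function provesClaim(check: Check): boolean {
  return check(checkout) && !check(regressedCheckout);
}
```

Under these assumptions `responseOnlyCheck` stays green on the regressed implementation, which is the situation the diff asks the reviewer to record as a `coverage_gap`.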

package/.claude/agents-en/document-reviewer.md CHANGED

@@ -23,7 +23,7 @@ You are an AI assistant specialized in technical document review.
   - `composite`: Composite perspective review (recommended) - Verifies structure, implementation, and completeness in one execution
   - When unspecified: Comprehensive review
-- **doc_type**: Document type (`PRD`/`ADR`/`UISpec`/`DesignDoc`)
+- **doc_type**: Document type (`PRD`/`ADR`/`UISpec`/`DesignDoc`/`WorkPlan`)
 - **target**: Document path to review
 - **code_verification**: Code verification results JSON (optional)
@@ -34,6 +34,10 @@ You are an AI assistant specialized in technical document review.
   - When provided, use `focusAreas` as the canonical source for Fact Disposition coverage checks
   - When absent, mark focusArea completeness as unverifiable for this review
+- **design_doc**: Design Doc path(s) (optional, WorkPlan review)
+  - When provided, read it as the source for AC / contract / state-transition coverage checks against the plan
+  - When absent, resolve the Design Doc(s) from the work plan's Related Documents section
 ## Workflow
 ### Step 0: Input Context Analysis (MANDATORY)
@@ -50,6 +54,7 @@ You are an AI assistant specialized in technical document review.
 - Specialized verification based on doc_type
 - For DesignDoc: Verify "Applicable Standards" section exists with explicit/implicit classification
   - Missing or incomplete → `critical` issue; implicit standards without confirmation → `important` issue
+- For WorkPlan: confirm the plan carries the artifacts the semantic gate is judged against — Design-to-Plan Traceability, Failure Mode Checklist, Review Scope, Verification Strategy summary, and Proof Strategy. Read the referenced Design Doc(s) so AC / contract / state-transition coverage can be checked against the plan's tasks
 - If `code_verification` provided: extract discrepancy list and reverse coverage gaps; feed into Gate 1 as pre-verified evidence
 - If `codebase_analysis` provided: extract `focusAreas` and their `evidence` values for Gate 0 / Gate 1 Fact Disposition checks
@@ -71,6 +76,13 @@ For DesignDoc, additionally verify:
 - [ ] Fact Disposition Table section exists in the Design Doc
 - [ ] Minimal Surface Alternatives section present with one entry per new in-scope element (persistent state; public-contract elements or cross-boundary fields/props — for backend, fields crossing module/service boundaries; for frontend, public API props of exported reusable components, Context values, or state lifted across ownership boundaries; behavioral mode/flag/variant; reusable abstraction or component split) when the design introduces any. Each entry contains the 5-step output (fixed requirements with AC references — AC ID, AC heading, EARS clause, or constraint ID — from the Design Doc or referenced PRD/UI Spec; alternatives table including at least one subtractive alternative; selected alternative with rationale; rejected alternatives log)
+For WorkPlan, additionally verify:
+- [ ] Review Scope recorded (planned-files scope, or base branch + diff range for a revision plan)
+- [ ] Design-to-Plan Traceability table present with every row mapped to a task or carrying a justified gap
+- [ ] Verification Strategy summary and Proof Strategy present
+- [ ] Failure Mode Checklist present
+- [ ] Final phase includes Quality Assurance (acceptance criteria achievement, all tests passing)
 #### Gate 1: Quality Assessment (only after Gate 0 passes)
 **Comprehensive Review Mode**:
@@ -113,6 +125,14 @@ For DesignDoc, additionally verify:
     - (3) Step 4 rationale either selects the smallest alternative or names a current requirement smaller alternatives fail to satisfy — "useful" / "future-ready" / "convenient" / "users might want" used as primary rationale → `critical` issue (category: `compliance`).
     - (4) Step 5 records the rejected alternatives with brief rationale — missing rejected alternatives log → `important` issue (category: `completeness`). Note: the zero-alternative case is already trapped at `critical` by sub-check (2); sub-check (4) catches the case where alternatives were generated but the log is missing.
+- **Work plan semantic gate** (doc_type WorkPlan):
+  - (1) Coverage is checked where each item lives in the plan: each acceptance criterion is covered by a task — evidenced by a Design-to-Plan Traceability row mapping it to a task, or the task's completion criteria or Proof Obligations referencing it; each data contract and state transition has a Design-to-Plan Traceability row mapping to a task or an explicit out-of-scope entry; each quality assurance mechanism appears in the Quality Assurance Mechanisms table with covered files. An item with no such coverage → `critical` issue (category: `completeness`). Distinguish the cause for an uncovered acceptance criterion: when the Design Doc supports it but no task maps to it (plan omission, fixable by re-planning) → `critical`; when the Design Doc or inputs give it no basis (a gap re-planning cannot fix) → the `rejected` trigger per the Verdict mapping below
+  - (2) The early verification point sits in an early phase rather than the final phase — deferral to the final phase → `important` issue (category: `consistency`)
+  - (3) Each cross-boundary, public-boundary, or persisted-state change names a task that verifies it through the real boundary — missing → `important` issue (category: `completeness`)
+  - (4) Each traceability table present (Design-to-Plan, UI Spec Component, Connection Map, ADR Bindings) is filled to a granularity that resolves its target task — under-specified rows → `important` issue (category: `completeness`)
+  - (5) The Failure Mode Checklist covers the plan's applicable domain-independent categories (same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility) — missing applicable category → `recommended` issue (category: `completeness`)
+  - Verdict mapping (WorkPlan): any semantic-gate `critical` issue forces the verdict to at least `needs_revision` — except a coverage gap traceable to a missing or contradictory Design Doc/input element (which re-planning cannot fix) → `rejected`; an `important`-only set caps the verdict at `approved_with_conditions`
 **Perspective-specific Mode**:
 - Implement review based on specified mode and focus
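
The WorkPlan verdict mapping above can be read as a ceiling on the verdict. The sketch below is one possible reading, not the reviewer's actual logic: the `unfixableByReplanning` flag and the `approved` verdict name are assumptions introduced for illustration, and other gates may lower the result further.

```typescript
type Severity = "critical" | "important" | "recommended";
// "approved" is an assumed name; the diff only names the other three verdicts.
type Verdict = "approved" | "approved_with_conditions" | "needs_revision" | "rejected";

interface GateIssue {
  severity: Severity;
  // Hypothetical flag: true when the gap traces to a missing or contradictory
  // Design Doc / input element, which re-planning cannot fix.
  unfixableByReplanning?: boolean;
}

// Best verdict the semantic gate still allows for the given issues.
function semanticGateCeiling(issues: GateIssue[]): Verdict {
  if (issues.some((i) => i.severity === "critical" && i.unfixableByReplanning === true)) {
    return "rejected";
  }
  if (issues.some((i) => i.severity === "critical")) {
    return "needs_revision";
  }
  if (issues.some((i) => i.severity === "important")) {
    return "approved_with_conditions";
  }
  // Assumption: recommended-only or empty issue sets do not cap the verdict.
  return "approved";
}
```

Checking the unfixable case first keeps the exception from being shadowed by the general `critical` rule.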

package/.claude/agents-en/integration-test-reviewer.md CHANGED

@@ -73,6 +73,15 @@ Verify the following for each test case:
 | Internal Components | Use actual | Unnecessary mocking |
 | Log Output Verification | Use vi.fn() | Mock without verification |
+### 5. Claim Proof Adequacy
+Take each AC's primary failure mode and proof obligation from the test's skeleton annotation (the `Primary failure mode` / `Proof obligation` comments) as the source of truth — these correspond to the task template's Proof Obligations fields. Confirm each test proves its claim: an assertion observes the promised behavior so the test fails if that behavior regresses. Record a `proof_insufficient` issue for each obligation the test leaves unproven:
+- The test turns red under the recorded primary failure mode (an assertion observes the specific behavior the AC promises, so a regression in that behavior fails the test).
+- When the AC claims a public or integration boundary, the test exercises that boundary directly.
+- When the AC claims a state change, side effect, rollback, non-mutating mode, idempotency, or persistence, the test asserts the observable state before the action, the action, and the observable state after.
+- Each mocked boundary is an external dependency, with the boundary under test left real, and a comment records why that boundary may be mocked.
+- Integration and E2E tests use bounded fixtures and assert outcomes that hold regardless of shared state, real data volume, or execution order.
 ## Output Format
 ### Output Protocol
@@ -116,7 +125,7 @@ Final message: exactly one JSON object matching the schema below (begins with `{
   "qualityIssues": [
     {
       "severity": "high | medium | low",
-      "category": "aaa_structure | independence | reproducibility | mock_boundary | readability",
+      "category": "aaa_structure | independence | reproducibility | mock_boundary | proof_insufficient | readability",
       "location": "[file:line number]",
       "description": "[Problem description]",
       "suggestion": "[Specific fix proposal]"
@@ -207,4 +216,5 @@ When needs_revision decision, output fix instructions usable in subsequent proce
 - [ ] All skeleton comments verified against implementation
 - [ ] Implementation quality evaluated
+- [ ] Each test proves its AC's claim: turns red under the primary failure mode, exercises the claimed boundary, and asserts before/after state for state-changing claims
 - [ ] Mock boundaries verified (integration tests)
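
To make the "Claim Proof Adequacy" bullets concrete, here is a small sketch of the shape they describe for the failed-payment AC: the external gateway is replaced with a canned response, the repository and service stay real, and state is observed before and after the action. All class and function names are hypothetical, and a hand-rolled fake is used instead of `vi.fn()` so the sketch runs without a test framework.

```typescript
// Hypothetical names throughout; this is not code from create-ai-project.
interface PaymentGateway {
  charge(amountCents: number): { ok: boolean };
}

class InMemoryOrderRepository {
  private rows: { id: number; amountCents: number }[] = [];
  save(amountCents: number): void {
    this.rows.push({ id: this.rows.length + 1, amountCents });
  }
  count(): number {
    return this.rows.length;
  }
}

type PlaceOrderResult = { ok: true } | { ok: false; message: string };

class OrderService {
  private gateway: PaymentGateway;
  private repo: InMemoryOrderRepository;
  constructor(gateway: PaymentGateway, repo: InMemoryOrderRepository) {
    this.gateway = gateway;
    this.repo = repo;
  }
  placeOrder(amountCents: number): PlaceOrderResult {
    if (!this.gateway.charge(amountCents).ok) {
      return { ok: false, message: "Payment failed. Check your card details and try again." };
    }
    this.repo.save(amountCents);
    return { ok: true };
  }
}

// Returns the obligations left unproven; an empty list means the test is green.
function runFailedPaymentCheck(): string[] {
  // Mocked boundary: the payment gateway is an external dependency, so a canned
  // decline stands in for it. The service and repository under test stay real.
  const decliningGateway: PaymentGateway = { charge: () => ({ ok: false }) };
  const repo = new InMemoryOrderRepository();
  const service = new OrderService(decliningGateway, repo);

  const before = repo.count();             // observable state before the action
  const result = service.placeOrder(1200); // the action
  const after = repo.count();              // observable state after the action

  const unproven: string[] = [];
  if (result.ok || result.message.length === 0) unproven.push("actionable error surfaced");
  if (after !== before) unproven.push("no order persisted");
  return unproven;
}
```

The fresh repository per run is the "bounded fixture" part: the outcome does not depend on shared state or execution order.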

package/.claude/agents-en/task-decomposer.md CHANGED

@@ -104,6 +104,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
    - Concrete implementation steps
    - **Quality Assurance Mechanisms** (derived from work plan header — see Quality Assurance Mechanism Propagation below)
    - **Operation Verification Methods** (derived from Verification Strategy in work plan)
+   - **Proof Obligations** (per claim — see Proof Obligation Propagation below)
    - Completion criteria
 6. **Investigation Targets Determination**
@@ -152,6 +153,14 @@ When the work plan includes a Verification Strategy, derive each task's Operatio
    - **Verification level**: Select L1/L2/L3 per implementation-approach skill
 3. **Investigation Targets**: Include resources needed for verification (e.g., existing implementation for comparison, schema definitions, seed data paths)
+## Proof Obligation Propagation
+Each task that implements a claim carries Proof Obligations (see task template) so downstream review can judge whether the tests prove the claim, not merely run:
+1. **Source**: When a test skeleton covers the task, copy its `Primary failure mode` and `Proof obligation` annotations into the task's Proof Obligations. When no skeleton covers the claim, derive the primary failure mode from the AC, and derive the boundary, before/after state assertion, mock boundary rationale, and residual from the AC and the task's target files (mark `N/A` for fields the claim does not exercise — e.g., no state assertion for a non-state-changing claim).
+2. **Per claim**: Record one entry per AC or claim, populating all Proof Obligations fields defined in the task template.
+3. **Apply when claims exist**: Tasks with no behavioral claim (e.g., pure config or scaffolding) omit the section.
 ## UI Spec Propagation
 When the work plan contains a UI Spec Component → Task Mapping table, propagate component references to each implementation task as follows:
@@ -348,6 +357,7 @@ Please execute decomposed tasks according to the order.
 - [ ] Overall design document creation
 - [ ] Implementation efficiency and rework prevention (pre-identification of common processing, clarification of impact scope)
 - [ ] Investigation Targets specified for every task (specific file paths, not vague categories)
+- [ ] Proof Obligations recorded for each claim-implementing task (primary failure mode + boundary to exercise)
 - [ ] Quality Assurance Mechanisms from work plan header propagated to relevant tasks
 ## Task Design Principles
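
The three propagation rules in "Proof Obligation Propagation" can be sketched as a small function. The field names below are illustrative only, since the real fields live in the task template, which this diff does not show; `propagateProofObligations` and its input shape are assumptions.

```typescript
// Illustrative shapes; the task template defines the real Proof Obligations fields.
interface ProofSource {
  primaryFailureMode: string;
  proofObligation: string;
}

interface ClaimInput {
  claim: string;               // AC or claim text
  changesState: boolean;
  skeleton?: ProofSource;      // annotations copied from a covering test skeleton
  derivedFromAc?: ProofSource; // fallback derived from the AC by the decomposer
  stateAssertion?: string;     // before/after description for state-changing claims
}

interface ProofObligationEntry extends ProofSource {
  claim: string;
  source: "skeleton" | "ac";
  stateAssertion: string;      // "N/A" when the claim does not change state
}

function propagateProofObligations(claims: ClaimInput[]): ProofObligationEntry[] | undefined {
  // Rule 3: tasks with no behavioral claim omit the section.
  if (claims.length === 0) return undefined;

  // Rule 2: one entry per AC or claim.
  return claims.map((c): ProofObligationEntry => {
    // Rule 1: prefer skeleton annotations, otherwise derive from the AC.
    const chosen = c.skeleton ?? c.derivedFromAc;
    if (!chosen) throw new Error(`No proof source for claim: ${c.claim}`);
    if (c.changesState && !c.stateAssertion) {
      throw new Error(`State-changing claim needs a before/after assertion: ${c.claim}`);
    }
    return {
      claim: c.claim,
      source: c.skeleton ? "skeleton" : "ac",
      primaryFailureMode: chosen.primaryFailureMode,
      proofObligation: chosen.proofObligation,
      stateAssertion: c.changesState ? (c.stateAssertion ?? "N/A") : "N/A",
    };
  });
}
```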

package/.claude/agents-en/work-planner.md CHANGED

@@ -38,6 +38,9 @@ Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementati
 **Common rules (all approaches)**:
 - **Include Verification Strategy summary in work plan header** for downstream task reference
 - **Include adopted Quality Assurance Mechanisms in work plan header** for downstream task reference — list each adopted mechanism with tool name, what it enforces, configuration path, and covered files (literal file paths or directory prefixes from Design Doc, or "project-wide" if not scoped to specific files)
+- **Include a Proof Strategy in the work plan header** (see plan template) — name the proof obligation source (test skeleton annotations when skeletons are provided, otherwise each AC's primary failure mode) and state that every claim-implementing task records Proof Obligations for downstream review
+- **Record the Review Scope in the work plan header** — for a fresh pre-implementation plan, the planned-files scope derived from the Design Doc and task target files; for a revision plan over existing work, the base branch and diff range — so the work plan review and downstream verification share one scope
+- **Include a Failure Mode Checklist in the work plan** (see plan template) — enumerate all eight domain-independent failure categories (same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility), mark which apply, and map each applicable one to its covering task(s), keeping entries free of project-specific names
 - Include verification tasks in the phase corresponding to Verification Strategy's verification timing
 - When test skeletons are provided, place integration test implementation in corresponding phases and E2E test execution in the final phase
 - When test skeletons are not provided, include test implementation tasks based on Design Doc acceptance criteria
@@ -364,6 +367,9 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia
   - [ ] Every row maps to at least one covering task
 - [ ] Plan header includes `Implementation Readiness: pending` (medium / large only)
 - [ ] Verification Strategy extracted from Design Doc and included in plan header
+- [ ] Proof Strategy included in plan header (proof obligation source + per-task propagation rule)
+- [ ] Review Scope recorded in plan header (base branch / diff range / changed-files scope)
+- [ ] Failure Mode Checklist included, applicable categories mapped to covering tasks, free of project-specific names
 - [ ] Adopted Quality Assurance Mechanisms extracted from Design Doc and included in plan header
 - [ ] Phase structure matches implementation approach (vertical → value unit phases, horizontal → layer phases)
 - [ ] Early verification point placed in Phase 1 (when Verification Strategy specifies one)
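
The Failure Mode Checklist rule (enumerate all eight categories, mark which apply, map each applicable one to covering tasks) lends itself to a mechanical check. The sketch below is an assumption about how such a check might look; the row shape and `checklistProblems` are hypothetical, while the eight category names are taken from the diff.

```typescript
// The eight domain-independent categories named in the diff.
const FAILURE_CATEGORIES = [
  "same-value",
  "no-op",
  "empty input",
  "invalid option",
  "missing config",
  "unavailable boundary",
  "shared-state dependency",
  "rollback-only visibility",
] as const;

type FailureCategory = (typeof FAILURE_CATEGORIES)[number];

// Hypothetical row shape; the plan template defines the real layout.
interface ChecklistRow {
  category: FailureCategory;
  applies: boolean;
  coveringTasks: string[]; // task identifiers, e.g. "T-03"
}

// Lists what is wrong with a checklist; an empty result means it is complete.
function checklistProblems(rows: ChecklistRow[]): string[] {
  const problems: string[] = [];
  for (const category of FAILURE_CATEGORIES) {
    const row = rows.find((r) => r.category === category);
    if (!row) {
      problems.push(`missing category: ${category}`);
    } else if (row.applies && row.coveringTasks.length === 0) {
      problems.push(`no covering task: ${category}`);
    }
  }
  return problems;
}
```

Categories marked as not applicable pass without tasks, which matches the rule that only applicable categories need a covering task.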

package/.claude/agents-ja/acceptance-test-generator.md CHANGED

@@ -174,18 +174,26 @@ describe('[機能名] Integration Test', () => {
   // @category: core-functionality
   // @dependency: PaymentService, OrderRepository, Database
   // @complexity: high
+  // 主要な故障モード: 決済は成功したのに注文行が存在しない、または永続化されていない
+  // 証明義務: 注文は決済成功後にのみ永続化される。モックしてよい境界は外部の決済ゲートウェイのみ
   it.todo('AC1: 決済成功で正しいステータスの注文が永続化される')
   // AC1-error: "決済失敗でユーザーフレンドリーなエラーメッセージを表示"
   // ROI: 23 (BV:8 × Freq:2 + Legal:0 + Defect:7)
-  // 振る舞い: 決済失敗 → ユーザーに実行可能なエラー表示 → 注文未作成
+  // 振る舞い: 決済失敗 → ユーザーに対処可能なエラー表示 → 注文未作成
   // @category: core-functionality
   // @dependency: PaymentService, ErrorHandler
   // @complexity: medium
+  // 主要な故障モード: 決済失敗でも注文が作成される、またはエラーがユーザーに見える形で表示されず握り潰される
+  // 証明義務: 決済失敗時は対処可能なエラーを提示し、注文を永続化しない。モックしてよいのは決済ゲートウェイのみ
   it.todo('AC1-error: 決済失敗でエラー表示し注文を作成しない')
 })
 ```
+**証明注釈**（すべてのスケルトンに、上記メタ情報とともに付与）: 各 `it.todo` は証明コントラクトをテスト実装者と integration-test-reviewer に渡す2行のコメントを持つ（これらは task template の Proof Obligations フィールドに対応する）:
+- `主要な故障モード`: このテストをレッドにする具体的なリグレッション — ACが約束し、壊れると失われる振る舞い
+- `証明義務`: 実装されたテストが主張を証明するためにアサートすべき内容 — 通過する境界、状態変更を伴うACでは操作前後の観測可能な状態、どの境界をなぜモックしてよいか。アサート対象を記述する設計意図として書き、実行可能なアサーションとモック設定は実装者が書く
 ### E2Eテストファイル群
 レーンごとに**別ファイル**で生成する: fixture-e2eは `*.fixture-e2e.test.[ext]`、service-integration-e2eは `*.service-e2e.test.[ext]`。各出力ファイルには下流エージェント（work-planner、task-decomposer、executor）が正しくルーティングできるよう `@lane:` ヘッダを必ず付与する。
@@ -207,6 +215,8 @@ describe('[機能名] fixture-e2e', () => {
   // @lane: fixture-e2e
   // @dependency: full-ui (mocked backend)
   // @complexity: medium
+  // 主要な故障モード: ジャーニー中のステップ遷移またはその観測可能な状態が失われる
+  // 証明義務: 各ステップの UI 遷移と結果状態を順に検証する。モックするのはバックエンドのみ（固定レスポンス）
   it.todo('ユーザージャーニー: モック決済でのカートから確認までのフロー')
 })
 ```
@@ -228,6 +238,8 @@ describe('[機能名] service-integration-e2e', () => {
   // @lane: service-integration-e2e
   // @dependency: full-system
   // @complexity: high
+  // 主要な故障モード: 実サービス間の購入後に注文行または下流イベントが存在しない
+  // 証明義務: DB行・発行イベント・キュー投入メールを実ローカルスタックに対して観測する。アサート対象の経路上は何もモックしない
   it.todo('ユーザージャーニー: 購入完了で注文が永続化され下流イベントが発行される')
 })
 ```
@@ -242,6 +254,8 @@ describe('[機能名] service-integration-e2e', () => {
 // ROI: [値] | テスト種別: property-based
 // @category: core-functionality
 // fast-check: fc.property(fc.constantFrom([入力バリエーション]), (input) => [不変条件])
+// 主要な故障モード: 生成ドメイン内のある入力が記述された不変条件に違反する
+// 証明義務: 生成された全入力で不変条件が成立する。境界はモックしない
 it.todo('[AC番号]-property: [不変条件を自然言語で記述]')
 ```
@@ -318,7 +332,7 @@ it.todo('[AC番号]-property: [不変条件を自然言語で記述]')
 ## 制約と品質基準
 **必須準拠事項**:
-- `it.todo`スケルトンのみ出力: 各スケルトン内にコメントとして検証観点、期待結果、合格基準を記述。
+- `it.todo`スケルトンのみ出力: 各スケルトン内にコメントとして検証観点、期待結果、合格基準、主要な故障モード、証明義務を記述。
   実装コード、アサーション(`expect`)、モックセットアップは含めない — 下流の処理で`it.todo`の有無によりフェーズ配置やレビュー判定が行われる。
 - 各テストの検証観点、期待結果、合格基準を明確に記述
 - コメントに元のAC文を保持（トレーサビリティ確保）

package/.claude/agents-ja/code-reviewer.md CHANGED

@@ -106,6 +106,11 @@ Step 1で抽出した各識別子仕様（リソース名、エンドポイン
   - カバレッジとして数える条件: テスト本体で実行されるアサーションのうち少なくとも1つが、AC の観測可能な振る舞いを検証している。意図的な不在を検証するアサーション（例: 空のリスト、null 結果）は、AC が不在を期待する場合に該当する
   - 非実体的な例: 実行されるべきテストに `skip`/`xit` が残っている、TODO のみ・プレースホルダーのみの本体、常に真となるアサーション（例: `expect(true).toBe(true)`、`expect(arr.length).toBeGreaterThanOrEqual(0)`）
   - 非実体的な場合のアクション: `coverage_gap` として記録し、rationale に該当する AC の参照と具体的な実体性の問題（file:line）を記載する
+- **引用された各テストの証明検証（実体性を超えて）**:
+  - 適用対象: fulfilled と判定した AC の実体的なカバレッジとして数えられるテスト
+  - 主要な故障モードの出所: 主張に記録された Proof Obligation（タスクファイル）またはテストスケルトンの注釈を参照する。いずれも存在しない場合のみ AC から導出し、判定がテスト作成者の狙いと一致するようにする
+  - 証明として数える条件: テストがその主要な故障モードでレッドになり、主張された境界を直接通過する
+  - 未証明の場合のアクション: テストはパスするのに、主張された振る舞いがリグレッションしてもグリーンのまま → `coverage_gap` として記録し、rationale に未証明の故障モードを明記（file:line）
 #### 検出事項の分類

package/.claude/agents-ja/document-reviewer.md CHANGED

@@ -23,7 +23,7 @@ skills: documentation-criteria, technical-spec, project-context, typescript-rule
   - `composite`: 複合観点レビュー（推奨）- 構造・実装・完全性を一度に検証
   - 未指定時: 総合的レビュー
-- **doc_type**: ドキュメントタイプ（`PRD`/`UISpec`/`ADR`/`DesignDoc`）
+- **doc_type**: ドキュメントタイプ（`PRD`/`UISpec`/`ADR`/`DesignDoc`/`WorkPlan`）
 - **target**: レビュー対象のドキュメントパス
 - **code_verification**: コード検証結果のJSON（任意）
@@ -34,6 +34,10 @@ skills: documentation-criteria, technical-spec, project-context, typescript-rule
   - 提供された場合、`focusAreas`をFact Dispositionカバレッジチェックの正典ソースとして使用
   - 未提供の場合、focusAreaの完全性は本レビューでは検証不能として扱う
+- **design_doc**: Design Docのパス（任意、WorkPlanレビュー用）
+  - 提供された場合、計画に対するAC / コントラクト / 状態遷移のカバレッジチェックのソースとして読み込む
+  - 未提供の場合、作業計画書の「関連ドキュメント」セクションからDesign Docを解決する
 ## 作業フロー
 ### ステップ0: 入力コンテキスト分析（必須）
@@ -50,6 +54,7 @@ skills: documentation-criteria, technical-spec, project-context, typescript-rule
 - doc_typeに基づく特化した検証
 - DesignDocの場合:「適用基準」セクションの存在をexplicit/implicit分類付きで確認
   - 欠落・不完全 → `critical`、implicit基準の未確認 → `important`
+- WorkPlanの場合: セマンティックゲートの判定対象となる成果物が計画に含まれることを確認 — 設計-計画トレーサビリティ、故障モードチェックリスト、レビュースコープ、検証戦略の要約、証明戦略。参照されているDesign Docを読み込み、AC / コントラクト / 状態遷移のカバレッジを計画のタスクに対して確認できるようにする
 - `code_verification`が提供された場合: 不整合リストと逆方向カバレッジのギャップを抽出し、Gate 1の事前検証エビデンスとして組み込む
 - `codebase_analysis`が提供された場合: `focusAreas`とその`evidence`値を抽出し、Gate 0 / Gate 1のFact Dispositionチェックに使用
@@ -71,6 +76,13 @@ DesignDocの場合、追加で以下を確認:
 - [ ] Fact Disposition TableセクションがDesign Docに存在する
 - [ ] Minimal Surface Alternatives セクションが存在し、新規に導入される適用対象要素（永続状態 / 公開コントラクト要素または境界を越えるフィールド・Props — バックエンドではモジュール/サービス境界を越えるフィールド、フロントエンドではエクスポートされた再利用可能コンポーネントの公開 API Props・Context 値・所有境界を越えて持ち上げられた状態 / 振る舞いモード・フラグ・バリアント / 再利用可能な抽象またはコンポーネント分割）ごとに1エントリ持つ（適用対象要素を導入する場合）。各エントリには5ステップの結果が含まれる（確定要件 — Design Docまたは参照PRD/UI SpecのAC参照（AC ID、AC見出し、EARS節、または制約ID）、削減的な代替案を1つ以上含む比較表、根拠付きの選定結果、不採用案の記録）
+WorkPlanの場合、追加で以下を確認:
+- [ ] レビュースコープが記録されている（変更予定ファイルの範囲、または改訂計画ではベースブランチ + diff範囲）
+- [ ] 設計-計画トレーサビリティ表が存在し、各行がタスクにマッピングされているか正当化されたギャップを持つ
+- [ ] 検証戦略の要約と証明戦略が存在する
+- [ ] 故障モードチェックリストが存在する
+- [ ] 最終フェーズに品質保証が含まれる（受入基準の達成、全テストのパス）
 #### Gate 1: 品質評価（Gate 0通過後のみ実施）
 **総合レビューモード**:
@@ -113,6 +125,14 @@ DesignDocの場合、追加で以下を確認:
     - (3) ステップ4 の根拠が、最小の代替案を選定するか、より小さい代替案では満たせない現要件を名指している — 「便利」「将来対応」「実装が楽」「ユーザーが欲しがるかも」が主たる根拠として使われている → `critical`（カテゴリ: `compliance`）。
     - (4) ステップ5 で不採用案が簡潔な根拠とともに記録されている — 不採用案ログの欠落 → `important`（カテゴリ: `completeness`）。注: 代替案ゼロのケースはサブチェック(2)で先に `critical` として検出される。サブチェック(4)は代替案は生成されたが記録が抜けているケースを検出する。
+- **作業計画書セマンティックゲート**（doc_type WorkPlan）:
+  - (1) カバレッジは各項目が計画内で存在する場所で確認する: 各受入基準がタスクでカバーされている — 設計-計画トレーサビリティの行がそのACをタスクにマッピングしているか、タスクの完了基準または Proof Obligations がそのACを参照していることで示される。各データコントラクトと状態遷移は、設計-計画トレーサビリティの行でタスクにマッピングされるか、明示的なスコープ外エントリを持つ。各品質保証メカニズムは、カバー対象ファイルとともに品質保証メカニズム表に現れる。いずれのカバレッジもない項目 → `critical`（カテゴリ: `completeness`）。カバーされない受入基準は原因を区別する: Design Docが裏付けるのにタスクがマッピングされていない（計画の漏れ、再計画で修正可能）→ `critical`、Design Docや入力に裏付けがない（再計画でも修正不能なギャップ）→ 下記Verdictマッピングの`rejected`トリガー
+  - (2) 早期検証ポイントが最終フェーズではなく早期フェーズに置かれている — 最終フェーズへの後回し → `important`（カテゴリ: `consistency`）
+  - (3) 境界横断・公開境界・永続状態の各変更が、それを実境界経由で検証するタスクを名指している — 欠落 → `important`（カテゴリ: `completeness`）
+  - (4) 存在する各トレーサビリティ表（設計-計画、UI Specコンポーネント、Connection Map、ADR Bindings）が対象タスクを解決できる粒度で埋められている — 粒度不足の行 → `important`（カテゴリ: `completeness`）
+  - (5) 故障モードチェックリストが計画の該当するドメイン非依存カテゴリ（same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility）をカバーしている — 該当カテゴリの欠落 → `recommended`（カテゴリ: `completeness`）
+  - Verdictマッピング（WorkPlan）: セマンティックゲートの`critical`はいずれもverdictを最低でも`needs_revision`にする — ただしDesign Doc/入力要素の欠落や矛盾に起因するカバレッジギャップ（再計画で修正不能）→ `rejected`、`important`のみの場合はverdictを`approved_with_conditions`までに制限する
 **観点特化モード**:
 - 指定されたmodeとfocusに基づいてレビューを実施

package/.claude/agents-ja/integration-test-reviewer.md CHANGED

@@ -73,6 +73,15 @@ skills: integration-e2e-testing, typescript-testing, project-context
 | 内部コンポーネント | 実物使用 | 不要なモック化 |
 | ログ出力検証 | vi.fn()使用 | 検証なしのモック |
+### 5. 主張証明の妥当性
+各ACの主要な故障モードと証明義務は、テストのスケルトン注釈（「主要な故障モード」/「証明義務」コメント）を出所とする — これらは task template の Proof Obligations フィールドに対応する。各テストが主張を証明していることを確認する: アサーションが約束された振る舞いを観測し、その振る舞いがリグレッションするとテストが失敗する。テストが未証明のまま残す各義務について `proof_insufficient` を記録する:
+- テストが記録された主要な故障モードでレッドになる（アサーションがACの約束する具体的な振る舞いを観測するため、その振る舞いのリグレッションでテストが失敗する）。
+- ACが公開境界または統合境界を主張する場合、テストはその境界を直接通過する。
+- ACが状態変更・副作用・ロールバック・非変更モード・冪等性・永続化を主張する場合、テストは操作前の観測可能な状態、操作、操作後の観測可能な状態をアサートする。
+- モックする各境界は外部依存であり、テスト対象の境界は実物のまま残し、その境界をモックしてよい理由をコメントで記録する。
+- 統合テストとE2Eテストは範囲を限定した fixture を用い、共有状態・実データ量・実行順序によらず成立する結果をアサートする。
 ## 出力フォーマット
 ### 出力プロトコル
@@ -116,7 +125,7 @@ skills: integration-e2e-testing, typescript-testing, project-context
   "qualityIssues": [
     {
       "severity": "high | medium | low",
-      "category": "aaa_structure | independence | reproducibility | mock_boundary | readability",
+      "category": "aaa_structure | independence | reproducibility | mock_boundary | proof_insufficient | readability",
       "location": "[ファイル:行番号]",
       "description": "[問題の説明]",
       "suggestion": "[具体的な修正提案]"
@@ -207,4 +216,5 @@ needs_revision判定時、後続処理で使用できる修正指示を出力:
 - [ ] すべてのスケルトンコメントを実装と照合
 - [ ] 実装品質を評価
+- [ ] 各テストがACの主張を証明している: 主要な故障モードでレッドになり、主張された境界を通過し、状態変更を伴う主張では操作前後の状態をアサートする
 - [ ] Mock境界を検証（統合テスト）

package/.claude/agents-ja/task-decomposer.md CHANGED

@@ -104,6 +104,7 @@ implementation-approachスキルで決定された実装戦略パターンに基
    - `## Implementation Steps (TDD: Red-Green-Refactor)`
    - `## Quality Assurance Mechanisms`（作業計画書ヘッダーから導出 — 下記「品質保証メカニズムの伝播」参照）
    - `## Operation Verification Methods`（作業計画書の Verification Strategy から導出）
+   - `## Proof Obligations`（主張ごと — 下記「証明義務の伝播」参照）
    - `## Completion Criteria`
 6. **Investigation Targets の決定**
@@ -152,6 +153,14 @@ implementation-approachスキルで決定された実装戦略パターンに基
    - **検証レベル**: implementation-approachスキルに従いL1/L2/L3を選択
 3. **調査対象**: 検証に必要なリソースを含める（例: 比較対象の既存実装、スキーマ定義、seed dataのパス）
+## 証明義務の伝播
+主張を実装する各タスクは Proof Obligations（task template参照）を持ち、下流のレビューがテストが主張を証明しているか（単に実行されるだけでないか）を判定できるようにする:
+1. **出所**: テストスケルトンがタスクをカバーしている場合、その「主要な故障モード」と「証明義務」の注釈をタスクの Proof Obligations に転記する。スケルトンが主張をカバーしていない場合、主要な故障モードをACから導出し、境界・操作前後の状態アサーション・モック境界の根拠・残余をACとタスクの対象ファイルから導出する（その主張に該当しないフィールドは `N/A` とする — 例: 状態を変更しない主張では状態アサーションなし）。
+2. **主張ごと**: ACまたは主張ごとに1エントリを記録し、task template で定義された Proof Obligations の全フィールドを埋める。
+3. **主張がある場合に適用**: 振る舞いの主張を持たないタスク（純粋な設定やスキャフォールディング等）は本セクションを省略する。
 ## UI Spec伝播
 作業計画書に「UI Specコンポーネント → タスクマッピング」表が含まれる場合、各実装タスクにコンポーネント参照を以下のように伝播する:
@@ -348,6 +357,7 @@ implementation-approachスキルで決定された実装戦略パターンに基
 - [ ] 全体設計書の作成
 - [ ] 実装効率と手戻り防止（共通処理の事前識別、影響範囲の明確化）
 - [ ] 全タスクに調査対象が指定されている（具体的なファイルパス、曖昧なカテゴリではない）
+- [ ] 主張を実装する各タスクに Proof Obligations を記録（主要な故障モード + 検証する境界）
 - [ ] 作業計画書ヘッダーの品質保証メカニズムを該当タスクに伝播済み
 ## タスク設計の原則

package/.claude/agents-ja/work-planner.md CHANGED

@@ -38,6 +38,9 @@ Design Doc、UI Spec、PRD、ADR（提供されている場合）を読み込み
 **全アプローチ共通**:
 - **検証戦略の要約を作業計画書ヘッダーに記載**（後続タスクへの参照用）
 - **採用した品質保証メカニズムを作業計画書ヘッダーに記載**（後続タスクへの参照用） — 各メカニズムについてツール名、検証内容、設定パス、カバー範囲（Design Docのファイルパスまたはディレクトリプレフィックス、スコープが限定されない場合は "project-wide"）を記載
+- **証明戦略を作業計画書ヘッダーに記載**（plan template参照） — 証明義務の出所（スケルトンが提供される場合はテストスケルトンの注釈、なければ各ACの主要な故障モード）を明示し、主張を実装する各タスクが下流レビュー用に Proof Obligations を記録する旨を記載する
+- **レビュースコープを作業計画書ヘッダーに記録** — 実装前の新規計画では Design Doc とタスク対象ファイルから導出した変更予定ファイルの範囲、既存作業に対する改訂計画ではベースブランチと diff範囲 — を記録し、作業計画書レビューと下流検証が同一スコープを共有できるようにする
+- **故障モードチェックリストを作業計画書に含める**（plan template参照） — ドメイン非依存の8つの故障カテゴリ（same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility）をすべて列挙し、該当するものに印を付け、該当する各カテゴリをカバーするタスクにマッピングする。エントリにはプロジェクト固有の名称を含めない
 - 検証戦略の検証タイミングに対応するフェーズに検証タスクを配置
 - テストスケルトンが提供されている場合、統合テスト実装を対応するフェーズに配置し、E2Eテスト実行を最終フェーズに配置
 - テストスケルトンが提供されていない場合、Design Docの受入条件に基づくテスト実装タスクを含める
@@ -361,6 +364,9 @@ Design Docの技術的依存関係と実装アプローチに基づいてフェ
   - [ ] 全行が少なくとも1つのカバータスクにマッピングされている
 - [ ] 計画書ヘッダーに `Implementation Readiness: pending` を含める（medium / large のみ）
 - [ ] Design Docから検証戦略を抽出し計画書ヘッダーに記載
+- [ ] 計画書ヘッダーに証明戦略を記載（証明義務の出所 + タスクごとの伝播ルール）
+- [ ] 計画書ヘッダーにレビュースコープを記録（ベースブランチ / diff範囲 / 変更ファイル範囲）
+- [ ] 故障モードチェックリストを含め、該当カテゴリをカバーするタスクにマッピングし、プロジェクト固有の名称を含めない
 - [ ] Design Docから採用済み品質保証メカニズムを抽出し計画書ヘッダーに記載
 - [ ] フェーズ構成が実装アプローチと整合（垂直 → 価値単位フェーズ、水平 → レイヤーフェーズ）
 - [ ] 早期検証ポイントをPhase 1に配置（検証戦略で指定されている場合）

package/.claude/commands-en/front-build.md CHANGED

@@ -59,9 +59,22 @@ Analyze the Consumed Task Set and determine the action required. Note: when `$AR
 |-------|----------|-------------|
 | Tasks exist | Consumed Task Set is non-empty | User's execution instruction serves as batch approval → Enter autonomous execution immediately |
 | No tasks + plan supplied via `$ARGUMENTS` | `$ARGUMENTS` provided AND Consumed Task Set empty | Confirm with user → run task-decomposer (which will emit `*-frontend-task-*.md` per the frontend naming rule) |
-| Neither exists + Design Doc exists + `$ARGUMENTS` provided | `$ARGUMENTS` provided, no plan, no Consumed Task Set, but docs/design/*.md exists | Invoke work-planner to create work plan from Design Doc, then proceed to task decomposition |
+| Neither exists + Design Doc exists + `$ARGUMENTS` provided | `$ARGUMENTS` provided, no plan, no Consumed Task Set, but docs/design/*.md exists | Invoke work-planner to create work plan from Design Doc, then run **Work Plan Review** (see below) before task decomposition |
 | Neither exists | No `$ARGUMENTS`, no plan, no Consumed Task Set, no Design Doc | Report missing prerequisites to user and stop |
+## Work Plan Review (when this recipe created the plan)
+When the decision flow above created the work plan from a Design Doc, review it before decomposition:
+1. Invoke document-reviewer using Agent tool:
+   - `subagent_type`: "document-reviewer"
+   - `description`: "Work plan review"
+   - `prompt`: "doc_type: WorkPlan target: docs/plans/[plan-name].md design_doc: [the Design Doc path]. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Failure Mode Checklist, and Review Scope."
+2. Branch on the reviewer's `verdict.decision`:
+   - `needs_revision` → re-invoke work-planner (update) with the findings and re-review until `approved`/`approved_with_conditions`
+   - `rejected` → stop before task decomposition and escalate to the user
+3. Present the reviewed plan for batch approval before task decomposition.
 ## Task Decomposition Phase (Conditional)
 When the Consumed Task Set is empty:

package/.claude/commands-en/front-plan.md CHANGED

@@ -23,6 +23,7 @@ description: Create frontend work plan from design document and obtain plan appr
 - Design document selection
 - Test skeleton generation with acceptance-test-generator
 - Work plan creation with work-planner
+- Work plan review with document-reviewer
 - Plan approval obtainment
 **Responsibility Boundary**: This command completes with work plan approval.
@@ -59,8 +60,20 @@ Invoke work-planner using Agent tool:
   `prompt`: "Create work plan from Design Doc at [path]."
 - Follow subagents-orchestration-guide Prompt Construction Rule for additional prompt parameters
-- Present work plan to user for review. If user requests changes, re-invoke work-planner with revised parameters
-- Highlight steps with unclear scope or external dependencies and ask user to confirm
+### Step 4: Work Plan Review
+Invoke document-reviewer to review the work plan:
+- `subagent_type`: "document-reviewer"
+- `description`: "Work plan review"
+- `prompt`: "doc_type: WorkPlan target: docs/plans/[plan-name].md design_doc: [the Design Doc path selected in Step 1]. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Failure Mode Checklist, and Review Scope."
+- The work plan is a derivation of the Design Doc, so plan-fidelity findings are resolved without user input. Branch on the reviewer's `verdict.decision`:
+  - `needs_revision`: re-invoke work-planner in update mode with the findings and re-review, repeating until `approved` or `approved_with_conditions`
+  - `approved` / `approved_with_conditions`: proceed to Step 5
+  - `rejected`: escalate to the user
+### Step 5: Present for Approval
+- Present the reviewed work plan to the user for batch approval. If the user requests changes, re-invoke work-planner with revised parameters and re-run Step 4.
+- Highlight steps with unclear scope or external dependencies and ask the user to confirm
 **Scope**: Up to work plan creation and obtaining approval for plan content.
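
As a rough illustration of the Step 4 branching above, the review loop can be sketched in TypeScript. Only the four `verdict.decision` values come from the diff; `runWorkPlanReview`, `Verdict`, `Outcome`, the `review` / `revise` callbacks, and the `maxRounds` cap are hypothetical names introduced for this sketch (the recipe itself states no retry cap).

```typescript
// The four decision strings are taken from the diff; everything else is hypothetical.
type Decision =
  | "approved"
  | "approved_with_conditions"
  | "needs_revision"
  | "rejected";

interface Verdict {
  decision: Decision;
  findings: string[];
}

type Outcome =
  | { kind: "present_for_approval"; rounds: number }
  | { kind: "escalate_to_user"; rounds: number };

// `review` stands in for a document-reviewer invocation and `revise` for a
// work-planner update-mode invocation; both are placeholders, not real APIs.
function runWorkPlanReview(
  review: () => Verdict,
  revise: (findings: string[]) => void,
  maxRounds = 5, // assumption: a safety cap that the recipe does not specify
): Outcome {
  for (let round = 1; round <= maxRounds; round++) {
    const verdict = review();
    switch (verdict.decision) {
      case "approved":
      case "approved_with_conditions":
        // Step 5: hand the reviewed plan to the user for batch approval.
        return { kind: "present_for_approval", rounds: round };
      case "rejected":
        return { kind: "escalate_to_user", rounds: round };
      case "needs_revision":
        // Resolved without user input: re-plan with the findings, then re-review.
        revise(verdict.findings);
        break;
    }
  }
  // Cap reached without approval: fall back to the user.
  return { kind: "escalate_to_user", rounds: maxRounds };
}
```

Escalating once `maxRounds` is exhausted is a choice made for this sketch so that a reviewer that never approves cannot loop indefinitely.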

package/.claude/commands-en/plan.md CHANGED

@@ -23,6 +23,7 @@ description: Create work plan from design document and obtain plan approval
 - Design document selection
 - E2E test skeleton generation (optional, with user confirmation)
 - Work plan creation with work-planner
+- Work plan review with document-reviewer
 - Plan approval obtainment
 **Responsibility Boundary**: This command completes with work plan approval.
@@ -55,7 +56,20 @@ Follow subagents-orchestration-guide skill strictly and create work plan with th
      `prompt`: "Create work plan from Design Doc at [path]."
    - Follow subagents-orchestration-guide Prompt Construction Rule for additional prompt parameters
-   - Interact with user to complete plan and obtain approval for plan content
+4. **Work Plan Review**
+   Invoke document-reviewer to review the work plan:
+   - `subagent_type`: "document-reviewer"
+   - `description`: "Work plan review"
+   - `prompt`: "doc_type: WorkPlan target: docs/plans/[plan-name].md design_doc: [the Design Doc path selected in Step 1]. Review semantic traceability to the Design Doc, early verification placement, real-boundary verification coverage, Failure Mode Checklist, and Review Scope."
+   - The work plan is a derivation of the Design Doc, so plan-fidelity findings are resolved without user input. Branch on the reviewer's `verdict.decision`:
+     - `needs_revision`: re-invoke work-planner in update mode with the findings and re-review, repeating until `approved` or `approved_with_conditions`
+     - `approved` / `approved_with_conditions`: proceed to Step 5
+     - `rejected`: escalate to the user
+5. **Present for Approval**
+   - Present the reviewed work plan to the user for batch approval. If the user requests changes, re-invoke work-planner with revised parameters and re-run Step 4.
+   - Highlight steps with unclear scope or external dependencies and ask the user to confirm
 Create a work plan from the selected design document, clarifying specific implementation steps and risks.

package/.claude/commands-ja/front-build.md CHANGED

@@ -59,9 +59,22 @@ Consumed Task Set を確認し、適切な対応を決定する。注: `$ARGUMEN
 |------|------|--------------|
 | タスク存在 | Consumed Task Set が非空 | ユーザーの実行指示をバッチ承認として自律実行へ移行 |
 | タスクなし + `$ARGUMENTS`で計画書指定 | `$ARGUMENTS`が提供され Consumed Task Set が空 | ユーザーに確認 → task-decomposer実行（frontend命名ルールにより `*-frontend-task-*.md` を出力する） |
-| どちらもなし＋Design Docあり + `$ARGUMENTS`提供 | `$ARGUMENTS`が提供され、計画書なし、Consumed Task Setなし、ただし docs/design/*.md が存在 | work-plannerでDesign Docから作業計画書を作成し、タスク分解へ進む |
+| どちらもなし＋Design Docあり + `$ARGUMENTS`提供 | `$ARGUMENTS`が提供され、計画書なし、Consumed Task Setなし、ただし docs/design/*.md が存在 | work-plannerでDesign Docから作業計画書を作成し、タスク分解の前に**作業計画書レビュー**（下記参照）を行う |
 | どちらもなし | `$ARGUMENTS`なし、計画書なし、Consumed Task Setなし、Design Docなし | 前提条件未達成をユーザーに報告して停止 |
+## 作業計画書レビュー（本レシピが計画書を作成した場合）
+上記の判断フローでDesign Docから作業計画書を作成した場合、タスク分解の前にレビューする:
+1. Agentツールでdocument-reviewerを呼び出す:
+   - `subagent_type`: "document-reviewer"
+   - `description`: "作業計画書レビュー"
+   - `prompt`: "doc_type: WorkPlan target: docs/plans/[plan-name].md design_doc: [Design Docのパス]。Design Docへの意味的トレーサビリティ、早期検証の配置、実境界での検証カバレッジ、故障モードチェックリスト、レビュースコープをレビューする。"
+2. reviewerの `verdict.decision` で分岐する:
+   - `needs_revision` → 所見を渡してwork-plannerをupdateモードで再実行し、`approved`/`approved_with_conditions` になるまで再レビューする
+   - `rejected` → タスク分解の前に停止しユーザーにエスカレーションする
+3. レビュー済みの計画書をタスク分解の前にバッチ承認のため提示する。
 ## タスク分解フェーズ（条件付き）
 Consumed Task Set が空の場合：

package/.claude/commands-ja/front-plan.md CHANGED

@@ -23,6 +23,7 @@ description: 設計ドキュメントからフロントエンド作業計画書
 - 設計書の選択
 - acceptance-test-generatorによるテストスケルトン生成
 - work-plannerによる作業計画書作成
+- document-reviewerによる作業計画書レビュー
 - 計画承認の取得
 **責務境界**: このコマンドは作業計画書承認で責務完了。
@@ -59,7 +60,19 @@ Agentツールでwork-plannerを呼び出す:
   `prompt`: "[パス]のDesign Docから作業計画を作成。"
 - subagents-orchestration-guideのPrompt Construction Ruleに従い追加パラメータを構成
-- 作業計画書をユーザーに提示しレビューを受ける。変更要望があればwork-plannerを修正パラメータで再実行
+### Step 4: 作業計画書レビュー
+document-reviewerを呼び出し作業計画書をレビューする:
+- `subagent_type`: "document-reviewer"
+- `description`: "作業計画書レビュー"
+- `prompt`: "doc_type: WorkPlan target: docs/plans/[plan-name].md design_doc: [ステップ1で選択したDesign Docのパス]。Design Docへの意味的トレーサビリティ、早期検証の配置、実境界での検証カバレッジ、故障モードチェックリスト、レビュースコープをレビューする。"
+- 作業計画書はDesign Docの派生物であるため、計画の忠実性に関する指摘はユーザー入力なしで解消する。reviewerの `verdict.decision` で分岐する:
+  - `needs_revision`: 所見を渡してwork-plannerをupdateモードで再実行し、`approved`/`approved_with_conditions` になるまで再レビューを繰り返す
+  - `approved` / `approved_with_conditions`: ステップ5へ進む
+  - `rejected`: ユーザーにエスカレーションする
+### Step 5: 承認のための提示
+- レビュー済みの作業計画書をユーザーにバッチ承認のため提示する。変更要望があればwork-plannerを修正パラメータで再実行し、ステップ4を再実行する。
 - スコープが不明確なステップや外部依存があるステップを強調し、ユーザーに確認を求める
 **スコープ**: 作業計画書作成と計画内容の承認取得まで。

package/.claude/commands-ja/plan.md CHANGED

@@ -23,6 +23,7 @@ description: 設計書から作業計画書を作成し計画承認を取得
 - 設計書の選択
 - E2Eテストスケルトン生成（オプション、ユーザー確認後）
 - work-plannerによる作業計画書作成
+- document-reviewerによる作業計画書レビュー
 - 計画承認の取得
 **責務境界**: このコマンドは作業計画書承認で責務完了。
@@ -55,7 +56,20 @@ description: 設計書から作業計画書を作成し計画承認を取得
      `prompt`: "[パス]のDesign Docから作業計画を作成。"
    - subagents-orchestration-guideのプロンプト構成ルールに従い追加パラメータを設定
-   - ユーザーと対話して計画を完成させ、計画内容の承認を得る
+4. **作業計画書レビュー**
+   document-reviewerを呼び出し作業計画書をレビューする:
+   - `subagent_type`: "document-reviewer"
+   - `description`: "作業計画書レビュー"
+   - `prompt`: "doc_type: WorkPlan target: docs/plans/[plan-name].md design_doc: [ステップ1で選択したDesign Docのパス]。Design Docへの意味的トレーサビリティ、早期検証の配置、実境界での検証カバレッジ、故障モードチェックリスト、レビュースコープをレビューする。"
+   - 作業計画書はDesign Docの派生物であるため、計画の忠実性に関する指摘はユーザー入力なしで解消する。reviewerの `verdict.decision` で分岐する:
+     - `needs_revision`: 所見を渡してwork-plannerをupdateモードで再実行し、`approved`/`approved_with_conditions` になるまで再レビューを繰り返す
+     - `approved` / `approved_with_conditions`: ステップ5へ進む
+     - `rejected`: ユーザーにエスカレーションする
+5. **承認のための提示**
+   - レビュー済みの作業計画書をユーザーにバッチ承認のため提示する。変更要望があればwork-plannerを修正パラメータで再実行し、ステップ4を再実行する。
+   - スコープが不明確なステップや外部依存があるステップを強調し、ユーザーに確認を求める
 選択された設計書から作業計画書を作成し、実装の具体的なステップとリスクを明確にします。

package/.claude/skills-en/documentation-criteria/references/plan-template.md CHANGED

@@ -4,6 +4,7 @@ Created Date: YYYY-MM-DD
 Type: feature|fix|refactor
 Estimated Impact: X files
 Related Issue/PR: #XXX (if any)
+Review Scope: [planned-files scope derived from Design Doc and task targets; for a revision plan over existing work, base branch + diff range]
 <!-- The line below is medium / large only — small scale uses task-template instead of this plan-template. Keep the value line free of trailing comments so downstream parsers can extract the bare status (pending | ready | escalated). -->
 Implementation Readiness: pending
@@ -26,6 +27,10 @@ Implementation Readiness: pending
 - **Success criteria**: [extracted from Design Doc]
 - **Failure response**: [extracted from Design Doc]
+### Proof Strategy
+- **Proof obligation source**: [test skeleton annotations (primary failure mode, proof obligation) when skeletons exist; otherwise each AC's primary failure mode]
+- **Per-task propagation**: every task that implements a claim records Proof Obligations (see task template) so downstream review can judge whether the tests prove the claim, not merely run
 ## Quality Assurance Mechanisms (from Design Doc)
 Adopted quality gates for the change area. Each task in this plan must satisfy these mechanisms.
@@ -48,6 +53,21 @@ Maps each Design Doc technical requirement to the covering task(s). One row per
 **Gap Status values**: `covered` (task exists), `gap` (no task — requires justification in Notes, user confirmation required before plan approval)
+## Failure Mode Checklist
+Domain-independent failure categories this implementation must guard against. Enumerate all eight categories, mark each in the Applies? column as yes/no, and list a covering task for each that applies; keep entries free of project-specific names.
+| Category | Applies? | Covered By Task(s) |
+|---|---|---|
+| same-value | yes/no | [Phase X Task Y] |
+| no-op | yes/no | |
+| empty input | yes/no | |
+| invalid option | yes/no | |
+| missing config | yes/no | |
+| unavailable boundary | yes/no | |
+| shared-state dependency | yes/no | |
+| rollback-only visibility | yes/no | |
 ## UI Spec Component → Task Mapping
 Include this section when a UI Spec is among the inputs. Maps each component documented in the UI Spec to the task(s) that implement it. task-decomposer reads this table to populate each task's Investigation Targets with the corresponding UI Spec section. Omit the section when no UI Spec exists.
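
The Failure Mode Checklist rules added in this hunk (enumerate all eight categories, and give every applicable category a covering task) could be checked mechanically. The following is a minimal TypeScript sketch under that reading; the category names come from the template, while `ChecklistRow`, `validateChecklist`, and the message strings are hypothetical and not part of the package.

```typescript
// Category names are copied from the plan-template table in the diff.
const FAILURE_CATEGORIES = [
  "same-value",
  "no-op",
  "empty input",
  "invalid option",
  "missing config",
  "unavailable boundary",
  "shared-state dependency",
  "rollback-only visibility",
] as const;

// Hypothetical in-memory shape for one parsed table row.
interface ChecklistRow {
  category: string;
  applies: boolean;
  coveredBy: string[]; // e.g. ["Phase 1 Task 2"]
}

// Returns a list of problems; an empty list means the checklist satisfies both rules.
function validateChecklist(rows: ChecklistRow[]): string[] {
  const problems: string[] = [];
  for (const category of FAILURE_CATEGORIES) {
    const row = rows.find((r) => r.category === category);
    if (!row) {
      problems.push(`missing category: ${category}`);
      continue;
    }
    if (row.applies && row.coveredBy.length === 0) {
      problems.push(`no covering task: ${category}`);
    }
  }
  return problems;
}
```

The sketch does not attempt the "free of project-specific names" rule, which likely needs a reviewer's judgment rather than a string check.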

package/.claude/skills-en/documentation-criteria/references/task-template.md CHANGED

@@ -56,9 +56,21 @@ Each row is an ADR decision the implementation in this task must comply with.
 - **Failure response**: [What to do if verification fails — e.g., "reassess approach before proceeding", "escalate to user"]
 - **Verification level**: [L1: Functional operation as end-user feature / L2: New tests added and passing / L3: Code builds without errors]
+## Proof Obligations
+(One entry per AC or claim this task implements. Derived from test skeleton annotations when present, otherwise from the AC's primary failure mode. Each test must prove its claim. Repeat the block below once per claim; the heading carries the AC ID or claim ID so downstream review can resolve coverage per claim.)
+### Obligation: [AC ID or claim ID]
+- **Claim**: [the behavior the AC promises]
+- **Primary failure mode**: [the regression the test turns red on]
+- **Boundary to exercise**: [public/integration boundary the test traverses, or "in-process unit"]
+- **State assertion**: [observable state before → action → after for state-changing claims; "N/A" otherwise]
+- **Mock boundary rationale**: [which boundaries may be mocked and why; "none" when all real]
+- **Residual**: [what this proof leaves unestablished, if any]
 ## Completion Criteria
 - [ ] All added tests pass
 - [ ] Operation verified per Operation Verification Methods above
+- [ ] Each Proof Obligation is met: the test turns red under its primary failure mode and exercises the stated boundary
 - [ ] Deliverables created (for research/design tasks)
 - [ ] (When Binding Decisions exist) Every Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (file:line, test result, or command output)
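
To make the shape of one Proof Obligations entry concrete, here is a hedged TypeScript sketch. The field meanings follow the template block above; the `ProofObligation` interface, its property names, and the `missingFields` helper are hypothetical illustrations, not something the package ships.

```typescript
// Hypothetical typed view of one "### Obligation: [AC ID or claim ID]" block.
interface ProofObligation {
  id: string;                    // AC ID or claim ID from the heading
  claim: string;                 // the behavior the AC promises
  primaryFailureMode: string;    // the regression the test turns red on
  boundaryToExercise: string;    // public/integration boundary, or "in-process unit"
  stateAssertion: string;        // before -> action -> after, or "N/A"
  mockBoundaryRationale: string; // which boundaries may be mocked and why, or "none"
  residual?: string;             // optional: what the proof leaves unestablished
}

const REQUIRED_FIELDS = [
  "id",
  "claim",
  "primaryFailureMode",
  "boundaryToExercise",
  "stateAssertion",
  "mockBoundaryRationale",
] as const;

// Lists required fields left blank; "N/A" and "none" count as filled-in values.
function missingFields(obligation: ProofObligation): string[] {
  return REQUIRED_FIELDS.filter((field) => obligation[field].trim() === "");
}
```

Treating `residual` as optional mirrors the template's "if any" wording; the other fields are treated as required here because the template gives each an explicit fallback value.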

package/.claude/skills-en/subagents-orchestration-guide/SKILL.md CHANGED

@@ -158,7 +158,7 @@ Subagents respond in JSON format. Key fields for orchestrator decisions:
 | code-verifier | status (consistent/mostly_consistent/needs_review/inconsistent), consistencyScore, discrepancies[], reverseCoverage (dataOperationsInCode, testBoundariesSectionPresent). Pre-implementation: verifies Design Doc claims against existing codebase. Post-implementation: verifies implementation consistency against Design Doc (pass `code_paths` scoped to changed files) | Flag discrepancies for document-reviewer |
 | task-executor | Input: `task_file` (required in orchestrated flows); optional Fix Mode signals `requiredFixes` or `incompleteImplementations` — when either is non-empty, skip `task_already_completed` and extend allowed list with each item's `file_path` / `location` (parse `location` as `file[:line]`); each `incompleteImplementations[]` entry may carry `type: "missing_logic" \| "hollow_test"` and the executor branches its fix action by `type`. Output: status (escalation_needed/completed), filesModified[], testsAdded, requiresTestReview, runnableCheck{level, executed, command, result, substance, substanceIssue, reason}, escalation_type ∈ {task_file_not_found, task_already_completed, target_files_missing, design_compliance_violation, similar_function_found, similar_component_found, investigation_target_not_found, out_of_scope_file, dependency_version_uncertain, binding_decision_violation, test_environment_not_ready}. | On escalation_needed: handle by escalation_type |
 | quality-fixer | Input: `task_file` (path to current task file — always pass this in orchestrated flows), `filesModified` (extract from the upstream implementation step's response — passes the task's write set as the primary scope for stub-detection; falls back to `git diff HEAD` when omitted), `runnableCheck` (extract from the upstream implementation step's response — passes the test execution evidence including `substance` and `substanceIssue` so the substance check has the runtime signal; omit when the upstream did not run tests). Status: approved/stub_detected/blocked. `stub_detected` → `incompleteImplementations[]` items carry `type: "missing_logic" \| "hollow_test"`; route back to the implementation step (which branches its fix action on `type`), then re-run quality-fixer. `blocked` → see quality-fixer blocked handling below | On stub_detected: re-invoke the implementation step. On blocked: see handling below |
-| document-reviewer | approvalReady (true/false) | Proceed to next step on true; request fixes on false |
+| document-reviewer | verdict.decision (approved/approved_with_conditions/needs_revision/rejected) | Proceed on approved/approved_with_conditions; request fixes on needs_revision; escalate on rejected |
 | design-sync | sync_status (NO_CONFLICTS/CONFLICTS_FOUND) | On CONFLICTS_FOUND: present conflicts to user before proceeding |
 | integration-test-reviewer | status (approved/needs_revision/blocked), requiredFixes | On needs_revision: re-invoke the routed executor in Fix Mode with the same task_file and requiredFixes[] |
 | security-reviewer | status (approved/approved_with_notes/needs_revision/blocked), findings, notes, requiredFixes | On needs_revision: create a consolidated fix task file with the affected file paths from `requiredFixes[].location` populated into Target Files, then invoke the routed executor in Fix Mode with that task_file and the `requiredFixes[]` array, then quality-fixer, then re-invoke security-reviewer to verify resolution. On blocked: escalate to user with the blocking findings — fix is not within the agent layer's authority |
@@ -175,7 +175,7 @@ When quality-fixer returns `status: "blocked"`, discriminate by `reason`:
 When receiving new features or change requests, I first request requirement analysis from requirement-analyzer.
 According to scale determination:
-### Large Scale (6+ Files) - 13 Steps (backend) / 15 Steps (frontend/fullstack)
+### Large Scale (6+ Files) - 14 Steps (backend) / 16 Steps (frontend/fullstack)
 1. requirement-analyzer → Requirement analysis + Check existing PRD **[Stop]**
 2. prd-creator → PRD creation
@@ -190,10 +190,11 @@ According to scale determination:
 11. document-reviewer → Design Doc review (pass code-verifier results as code_verification; cross-layer: per Design Doc)
 12. design-sync → Consistency verification **[Stop: Design Doc Approval]**
 13. acceptance-test-generator → Test skeleton generation, pass to work-planner (*1)
-14. work-planner → Work plan creation **[Stop: Batch approval]**
-15. task-decomposer → Autonomous execution → Completion report
+14. work-planner → Work plan creation
+15. document-reviewer → Work plan review (doc_type: WorkPlan; pass the Design Doc path so AC/contract/state coverage is traceable). On `needs_revision`: re-invoke work-planner (update) and re-review until `approved`/`approved_with_conditions` — the plan is a derivation of the Design Doc, so plan-fidelity findings need no user adjudication. On `rejected`: escalate to user. **[Stop: Batch approval]**
+16. task-decomposer → Autonomous execution → Completion report
-### Medium Scale (3-5 Files) - 9 Steps (backend) / 11 Steps (frontend/fullstack)
+### Medium Scale (3-5 Files) - 10 Steps (backend) / 12 Steps (frontend/fullstack)
 1. requirement-analyzer → Requirement analysis **[Stop]**
 2. **(frontend/fullstack only)** Ask user for prototype code → ui-spec-designer → UI Spec creation (UI Spec informs component structure for technical design)
@@ -204,8 +205,9 @@ According to scale determination:
 7. document-reviewer → Design Doc review (pass code-verifier results as code_verification; cross-layer: per Design Doc)
 8. design-sync → Consistency verification **[Stop: Design Doc Approval]**
 9. acceptance-test-generator → Test skeleton generation, pass to work-planner (*1)
-10. work-planner → Work plan creation **[Stop: Batch approval]**
-11. task-decomposer → Autonomous execution → Completion report
+10. work-planner → Work plan creation
+11. document-reviewer → Work plan review (doc_type: WorkPlan; pass the Design Doc path so AC/contract/state coverage is traceable). On `needs_revision`: re-invoke work-planner (update) and re-review until `approved`/`approved_with_conditions` — the plan is a derivation of the Design Doc, so plan-fidelity findings need no user adjudication. On `rejected`: escalate to user. **[Stop: Batch approval]**
+12. task-decomposer → Autonomous execution → Completion report
 ### Small Scale (1-2 Files) - 2 Steps

package/.claude/skills-ja/documentation-criteria/references/plan-template.md CHANGED

@@ -4,6 +4,7 @@
 種別: feature|fix|refactor
 想定影響範囲: Xファイル
 関連Issue/PR: #XXX（該当する場合）
+レビュースコープ: [Design Docとタスク対象から導出した変更予定ファイルの範囲。既存作業に対する改訂計画の場合はベースブランチ + diff範囲]
 <!-- 下記の行はmedium / large規模のみ — small規模はこのplan-templateではなくtask-templateを使用する。値の行は末尾コメントを付けず、下流のパーサがbare statusの抽出（pending | ready | escalated）を行えるように保つこと。 -->
 Implementation Readiness: pending
@@ -26,6 +27,10 @@ Implementation Readiness: pending
 - **成功基準**: [Design Docから抽出]
 - **失敗時の対応**: [Design Docから抽出]
+### 証明戦略
+- **証明義務の出所**: [スケルトンが存在する場合はテストスケルトンの注釈（主要な故障モード、証明義務）。存在しない場合は各ACの主要な故障モード]
+- **タスクごとの伝播**: 主張を実装する各タスクは Proof Obligations（task template参照）を記録し、下流のレビューがテストが主張を証明しているか（単に実行されるだけでないか）を判定できるようにする
 ## 品質保証メカニズム（Design Docより）
 変更対象領域で採用された品質ゲート。本計画の各タスクはこれらのメカニズムを満たす必要がある。
@@ -48,6 +53,21 @@ Design Docの各技術要件をカバーするタスクへのマッピング。
 **ギャップステータス値**: `covered`（タスクあり）、`gap`（タスクなし — 備考に理由必須、計画承認前にユーザー確認が必要）
+## 故障モードチェックリスト
+この実装が防ぐべきドメイン非依存の故障カテゴリ。8カテゴリすべてを列挙し、各カテゴリの該当列に yes/no を記入し、該当する各カテゴリにカバーするタスクを挙げる。エントリにはプロジェクト固有の名称を含めない。
+| カテゴリ | 該当? | カバーするタスク |
+|---|---|---|
+| same-value | yes/no | [Phase X タスクY] |
+| no-op | yes/no | |
+| empty input | yes/no | |
+| invalid option | yes/no | |
+| missing config | yes/no | |
+| unavailable boundary | yes/no | |
+| shared-state dependency | yes/no | |
+| rollback-only visibility | yes/no | |
 ## UI Specコンポーネント → タスクマッピング
 入力にUI Specが含まれる場合のみ本セクションを記載する。UI Specに記述された各コンポーネントを実装するタスクへのマッピング。task-decomposerはこの表を読み込み、対応するUI Specセクションを各タスクのInvestigation Targetsに伝播させる。UI Specがない場合は本セクションを省略する。

package/.claude/skills-ja/documentation-criteria/references/task-template.md CHANGED

@@ -56,9 +56,21 @@ Metadata:
 - **失敗時の対応**: [検証失敗時の対処 — 例: 「続行前にアプローチを再評価」「ユーザーにエスカレーション」]
 - **検証レベル**: [L1: エンドユーザー機能としての動作確認 / L2: 新規テスト追加・パス / L3: ビルドエラーなし]
+## Proof Obligations
+（このタスクが実装するACまたは主張ごとに1エントリ。スケルトンの注釈がある場合はそこから、なければACの主要な故障モードから導出する。各テストは主張を証明すること。下記ブロックを主張ごとに繰り返し、見出しにAC IDまたは主張IDを記載して下流のレビューが主張ごとにカバレッジを解決できるようにする。）
+### Obligation: [AC IDまたは主張ID]
+- **主張**: [ACが約束する振る舞い]
+- **主要な故障モード**: [テストがレッドになるリグレッション]
+- **検証する境界**: [テストが通過する公開/統合境界、または "in-process unit"]
+- **状態アサーション**: [状態変更を伴う主張では 操作前 → 操作 → 操作後 の観察可能な状態。それ以外は "N/A"]
+- **モック境界の根拠**: [どの境界をモックしてよいか、その理由。すべて実物なら "none"]
+- **残余**: [この証明で未確立のまま残る事項があれば記載]
 ## Completion Criteria
 - [ ] 追加した全テストがパス
 - [ ] Operation Verification Methods に基づく動作確認完了
+- [ ] 各 Proof Obligation が満たされている: テストが主要な故障モードでレッドになり、指定された境界を通過する
 - [ ] 成果物作成完了（調査・設計タスクの場合）
 - [ ] （Binding Decisionsがある場合）全てのCompliance Checkが最終実装に対して`Y`と評価され、根拠（file:line、テスト結果、またはコマンド出力）がInvestigation Notesに記録されている

package/.claude/skills-ja/subagents-orchestration-guide/SKILL.md CHANGED

@@ -156,7 +156,7 @@ description: サブエージェントのタスク分担と連携を調整。規
 | code-verifier | status (consistent/mostly_consistent/needs_review/inconsistent), consistencyScore, discrepancies[], reverseCoverage (dataOperationsInCode, testBoundariesSectionPresent). 実装前: Design Docの主張を既存コードに対して検証。実装後: 実装のDesign Doc整合性を検証（`code_paths`で変更ファイルにスコープ） | discrepanciesをdocument-reviewerに連携 |
 | task-executor | 入力: `task_file`（オーケストレーションフローでは必須）; 任意の Fix Mode シグナル `requiredFixes` または `incompleteImplementations` — いずれかが非空の場合、`task_already_completed` チェックをスキップし、各項目の `file_path` / `location`（`location` は `file[:line]` として解釈）で許可リストを拡張する。`incompleteImplementations[]` の各エントリは `type: "missing_logic" \| "hollow_test"` を持ち得て、executor は `type` で修正アクションを分岐する。出力: status (escalation_needed/completed), filesModified[], testsAdded, requiresTestReview, runnableCheck{level, executed, command, result, substance, substanceIssue, reason}, escalation_type ∈ {task_file_not_found, task_already_completed, target_files_missing, design_compliance_violation, similar_function_found, similar_component_found, investigation_target_not_found, out_of_scope_file, dependency_version_uncertain, binding_decision_violation, test_environment_not_ready} | escalation_needed時: escalation_type別に対応 |
 | quality-fixer | 入力: `task_file`（現在のタスクファイルパス — オーケストレーションフローでは常に渡す）、`filesModified`（上流の実装ステップのレスポンスから抽出 — 当該タスクの書き込み集合を未完成実装検出の主要スコープとして渡す。省略時は `git diff HEAD` にフォールバック）、`runnableCheck`（上流の実装ステップのレスポンスから抽出 — `substance` と `substanceIssue` を含むテスト実行のエビデンスを渡し、Substance チェックが実行時のシグナルを受け取れるようにする。上流がテストを実行していない場合は省略可）。Status: approved/stub_detected/blocked。`stub_detected` → `incompleteImplementations[]` の各エントリは `type: "missing_logic" \| "hollow_test"` を持ち、`type` で executor 側の修正アクションを分岐させた上で上流の実装ステップに差し戻し、本実装完了後にquality-fixerを再実行。`blocked` → 下記quality-fixer blockedハンドリング参照 | stub_detected: 実装ステップを再実行。blocked: 下記参照 |
-| document-reviewer | approvalReady (true/false) | trueで次ステップへ。falseで修正を依頼 |
+| document-reviewer | verdict.decision (approved/approved_with_conditions/needs_revision/rejected) | approved/approved_with_conditionsで次へ。needs_revisionで修正依頼。rejectedでエスカレーション |
 | design-sync | sync_status (NO_CONFLICTS/CONFLICTS_FOUND) | CONFLICTS_FOUND時: 矛盾をユーザーに提示してから進む |
 | integration-test-reviewer | status (approved/needs_revision/blocked), requiredFixes | needs_revision時: 同じ task_file と requiredFixes[] を渡してルーティング先の executor を Fix Mode で再実行 |
 | security-reviewer | status (approved/approved_with_notes/needs_revision/blocked), findings, notes, requiredFixes | needs_revision時: `requiredFixes[].location` から影響ファイルパスを抽出して Target Files に投入した統合修正タスクファイルを作成し、その task_file と `requiredFixes[]` 配列を渡してルーティング先の executor を Fix Mode で起動。続いて quality-fixer を実行し、最後に security-reviewer を再起動して解消を検証する。blocked 時: ブロッキング findings を添えてユーザーにエスカレーション — エージェント層の権限外の修正である |
@@ -170,7 +170,7 @@ quality-fixerが `status: "blocked"` を返した場合、`reason`で判別：
 ## 作業計画時の基本フロー
-### 大規模（6ファイル以上） - 13ステップ（バックエンド） / 15ステップ（フロントエンド/フルスタック）
+### 大規模（6ファイル以上） - 14ステップ（バックエンド） / 16ステップ（フロントエンド/フルスタック）
 1. requirement-analyzer → 要件分析 + 既存PRD確認 **[停止]**
 2. prd-creator → PRD作成
@@ -185,10 +185,11 @@ quality-fixerが `status: "blocked"` を返した場合、`reason`で判別：
 11. document-reviewer → Design Docレビュー（code-verifier結果をcode_verificationとして入力。レイヤー横断時: Design Doc毎に実行）
 12. design-sync → 整合性検証 **[停止: Design Doc承認]**
 13. acceptance-test-generator → テストスケルトン生成、work-plannerに渡す (*1)
-14. work-planner → 作業計画書作成 **[停止: 一括承認]**
-15. task-decomposer → 自律実行 → 完了報告
+14. work-planner → 作業計画書作成
+15. document-reviewer → 作業計画書レビュー（doc_type: WorkPlan。AC/コントラクト/状態のカバレッジをトレースできるようDesign Docのパスを渡す）。`needs_revision` の場合: work-plannerを（updateで）再実行し `approved`/`approved_with_conditions` になるまで再レビューする — 作業計画書はDesign Docの派生物であるため、計画の忠実性に関する指摘にユーザー裁定は不要。`rejected` の場合: ユーザーにエスカレーション。 **[停止: 一括承認]**
+16. task-decomposer → 自律実行 → 完了報告
-### 中規模（3-5ファイル） - 9ステップ（バックエンド） / 11ステップ（フロントエンド/フルスタック）
+### 中規模（3-5ファイル） - 10ステップ（バックエンド） / 12ステップ（フロントエンド/フルスタック）
 1. requirement-analyzer → 要件分析 **[停止]**
 2. **（フロントエンド/フルスタックのみ）** プロトタイプコードの有無を確認 → ui-spec-designer → UI Spec作成（コンポーネント構造が技術設計に反映されるため先に実施）
@@ -199,8 +200,9 @@ quality-fixerが `status: "blocked"` を返した場合、`reason`で判別：
 7. document-reviewer → Design Docレビュー（code-verifier結果をcode_verificationとして入力。レイヤー横断時: Design Doc毎に実行）
 8. design-sync → 整合性検証 **[停止: Design Doc承認]**
 9. acceptance-test-generator → テストスケルトン生成、work-plannerに渡す (*1)
-10. work-planner → 作業計画書作成 **[停止: 一括承認]**
-11. task-decomposer → 自律実行 → 完了報告
+10. work-planner → 作業計画書作成
+11. document-reviewer → 作業計画書レビュー（doc_type: WorkPlan。AC/コントラクト/状態のカバレッジをトレースできるようDesign Docのパスを渡す）。`needs_revision` の場合: work-plannerを（updateで）再実行し `approved`/`approved_with_conditions` になるまで再レビューする — 作業計画書はDesign Docの派生物であるため、計画の忠実性に関する指摘にユーザー裁定は不要。`rejected` の場合: ユーザーにエスカレーション。 **[停止: 一括承認]**
+12. task-decomposer → 自律実行 → 完了報告
 ### 小規模（1-2ファイル） - 2ステップ

package/CHANGELOG.md CHANGED

@@ -5,6 +5,13 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [1.23.2] - 2026-06-01
+### Added
+- **Proof-quality review gate** (agents, skills) — Verification is raised from "tests exist" to "tests prove the claim's primary failure mode." `acceptance-test-generator` emits `Primary failure mode` / `Proof obligation` annotations on every skeleton; `work-planner`, `task-decomposer`, plan-template, and task-template carry per-claim Proof Obligations into task files; `integration-test-reviewer` adds a Claim Proof Adequacy check with a `proof_insufficient` issue type; `code-reviewer` requires AC-coverage tests to turn red under the recorded primary failure mode and exercise the claimed boundary directly
+- **Planning-fidelity review gate** (agents, commands, skills) — The work plan now gets the same automated review the Design Doc does, before implementation starts. `document-reviewer` gains a `WorkPlan` doc_type semantic gate (Design-to-Plan traceability, early-verification placement, real-boundary coverage, Failure Mode Checklist, Review Scope) with a verdict mapping and a `design_doc` input; `work-planner` and plan-template add a Proof Strategy, Review Scope, and Failure Mode Checklist; the plan / front-plan recipes review the plan and self-heal `needs_revision` (re-plan and re-review) while escalating `rejected`; front-build reviews a plan it creates from a Design Doc before decomposition; `subagents-orchestration-guide` wires WorkPlan review into the Medium/Large planning flow and moves the `document-reviewer` output contract to `verdict.decision`
 ## [1.23.1] - 2026-05-28
 ### Added

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "create-ai-project",
-  "version": "1.23.1",
+  "version": "1.23.2",
   "packageManager": "npm@10.8.2",
   "description": "TypeScript boilerplate with skills and sub-agents for Claude Code. Prevents context exhaustion through role-based task splitting.",
   "keywords": [