create-ai-project 1.23.5 → 1.24.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (65)

package/.claude/agents-en/acceptance-test-generator.md CHANGED

@@ -189,13 +189,13 @@ describe('[Feature Name] Integration Test', () => {
 })
 ```
-**Proof annotations** (apply to every skeleton, alongside the metadata above): each `it.todo` carries two comment lines that hand the proof contract to the test implementer and to integration-test-reviewer (these map to the task template's Proof Obligations fields):
+**Proof annotations** (apply to every skeleton, alongside the metadata above): each `it.todo` carries two comment lines that hand the proof contract to the test implementer and the downstream review (these map to the task template's Proof Obligations fields):
 - `Primary failure mode`: the specific regression that turns this test red — the behavior the AC promises and would break
 - `Proof obligation`: what the implemented test must assert to prove the claim — the boundary to traverse, the observable state before/after for state-changing ACs, and which boundaries may be mocked and why. For behavior-changing ACs, name the boundary path (branch, state, input class, lifecycle step, or fallback) the test must traverse when the main path alone would stay green through the regression. Phrase it as design intent describing what to assert; the implementer writes the executable assertions and mock setup
 ### E2E Test Files
-Generate **separate files per lane**: `*.fixture-e2e.test.[ext]` for fixture-e2e, `*.service-e2e.test.[ext]` for service-integration-e2e. Each emitted file MUST carry a `@lane:` header so downstream agents (work-planner, task-decomposer, executor) can route correctly.
+Generate **separate files per lane**: `*.fixture-e2e.test.[ext]` for fixture-e2e, `*.service-e2e.test.[ext]` for service-integration-e2e. Each emitted file MUST carry a `@lane:` header so downstream steps can route correctly.
 **fixture-e2e example** (UI journey with mocked backend, runs in CI without infrastructure):

package/.claude/agents-en/code-reviewer.md CHANGED

@@ -50,9 +50,10 @@ Read the Design Doc **in full** and extract:
 - Architecture design and data flow
 - Interface contracts (function signatures, API endpoints, data structures)
 - Identifier specifications (resource names, endpoint paths, configuration keys, error codes, schema/model names)
+- Binding observable contracts: column/label sets and order, derived-display rules, and state-lifecycle negatives; plus Field Propagation Map rows that carry a Serialized Format + Consumer Parse Rule
 - Error handling policy
 - Non-functional requirements
-- **Fact Disposition Table rows** (when the section exists): record each row as `{fact_id, disposition, rationale, evidence, relatedFiles}` — the Related Files column carries the paths the designer must verify; read each listed file during Step 4-1. These rows become verification targets in Step 2-4.
+- **Fact Disposition Table rows** (when the section exists): record each row as `{fact_id, disposition, rationale, evidence, relatedFiles}` — the Related Files column carries the paths the designer must verify; read each listed file during Step 4-1. These rows become verification targets in Step 4-1.
 Then load the task context that drives adjacent-case review (Step 2-1):
@@ -93,6 +94,13 @@ Assign confidence based on evidence count:
 - **medium**: 2 sources agree
 - **low**: 1 source only (implementation exists but no test or type confirmation)
+#### 2-4. Reference Contract and Boundary Verification
+Runs independently of the AC loop, so observable contracts that are not tied to an AC are also verified.
+1. For each binding observable value extracted in Step 1 (column/label set and order, derived-display rule, state-lifecycle negative), verify the implementation reproduces it exactly. A deviation is a `dd_violation` whose rationale names it a reference contract gap (the required observable value vs the implemented one).
+2. For each Field Propagation Map serialized boundary extracted in Step 1 (Serialized Format + Consumer Parse Rule), verify the producer emits the recorded representation and the consumer parses it by the recorded rule. A mismatch between the two sides is a `dd_violation` whose rationale names it a boundary contract gap (what the producer emits vs what the consumer parses).
 ### 3. Assess Code Quality
 Read each implementation file and evaluate against coding-standards skill:
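
The new Step 2-4 item 1 asks for an exact comparison between a binding observable value and what the implementation produces. A minimal sketch of that check for a column/label set and order is shown below. The `Finding` shape and the `checkColumnOrder` helper are hypothetical names for illustration only; the package does not define them, and the real review is done by the agent reading code, not by this function.

```typescript
// Hypothetical finding shape; only the `dd_violation` label and the
// "reference contract gap" wording come from the diff text.
type Finding = { kind: "dd_violation"; rationale: string };

// Compare the column set and order required by the Design Doc against the
// implemented one. Returns null when they match exactly.
function checkColumnOrder(
  required: readonly string[],
  implemented: readonly string[],
): Finding | null {
  const matches =
    required.length === implemented.length &&
    required.every((column, index) => column === implemented[index]);
  if (matches) {
    return null;
  }
  return {
    kind: "dd_violation",
    rationale:
      `reference contract gap: required [${required.join(", ")}] ` +
      `vs implemented [${implemented.join(", ")}]`,
  };
}
```

Under these assumptions, a reordered column list is reported even when the set of columns is unchanged, which is the distinction between content fidelity and coverage that the step draws.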

package/.claude/agents-en/codebase-analyzer.md CHANGED

@@ -89,7 +89,7 @@ For each element discovered in Steps 2-3:
    **Cardinality target**: 5-15 entries for typical changes. When candidate count exceeds 15, keep all category 1 and 2 entries; merge category 3 entries into the `factsToAddress` text of the related category 1/2 entry.
-   **Generate `fact_id`** with this format: `<repo-relative-primary-file-path>:<primary-symbol-or-focus-area-label>` using the main file anchoring the fact set and the exact symbol name when one exists; otherwise use a short normalized focus-area label. **For cross-layer features**: when a shared type, schema, or API contract is referenced from multiple layers, anchor `fact_id` to the canonical source file (the definition site closest to the shared module — e.g., `packages/shared/schemas/user.ts:User`), so that per-layer codebase-analyzer runs produce identical `fact_id` values for the same concept and cross-layer disposition conflicts remain detectable.
+   **Generate `fact_id`** with this format: `<repo-relative-primary-file-path>:<primary-symbol-or-focus-area-label>` using the main file anchoring the fact set and the exact symbol name when one exists; otherwise use a short normalized focus-area label. **For cross-layer features**: when a shared type, schema, or API contract is referenced from multiple layers, anchor `fact_id` to the canonical source file (the definition site closest to the shared module — e.g., `packages/shared/schemas/user.ts:User`), so that per-layer runs produce identical `fact_id` values for the same concept and cross-layer disposition conflicts remain detectable.
    **Populate `evidence`** with a single reference string in one of these forms (pick the most specific that applies): `existingElements[name='<name>']` / `constraints[location='<file>:<line>']` / `<file>:<line>`. Record exactly one form per focus area.
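
The `fact_id` format in the hunk above can be sketched as a small helper. `makeFactId` is a hypothetical name, and the path normalization (forward slashes, no leading `./`) is an assumption added here so that two runs anchoring to the same canonical file produce the same string; the source only specifies the `<path>:<symbol-or-label>` shape.

```typescript
// Build a fact_id of the form "<repo-relative-path>:<symbol-or-label>".
// Normalization rules below are an assumption, not part of the package.
function makeFactId(primaryFilePath: string, symbolOrLabel: string): string {
  const normalizedPath = primaryFilePath
    .replace(/\\/g, "/") // tolerate Windows-style separators
    .replace(/^\.\//, ""); // drop a leading "./"
  return `${normalizedPath}:${symbolOrLabel}`;
}

// Two per-layer runs that anchor to the same canonical source file.
const backendFactId = makeFactId("packages/shared/schemas/user.ts", "User");
const frontendFactId = makeFactId("./packages/shared/schemas/user.ts", "User");
```

Because both calls resolve to the canonical definition site, the two identifiers are equal, which is what keeps cross-layer disposition conflicts detectable.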

package/.claude/agents-en/document-reviewer.md CHANGED

@@ -54,7 +54,7 @@ You are an AI assistant specialized in technical document review.
 - Specialized verification based on doc_type
 - For DesignDoc: Verify "Applicable Standards" section exists with explicit/implicit classification
   - Missing or incomplete → `critical` issue; implicit standards without confirmation → `important` issue
-- For WorkPlan: confirm the plan carries the artifacts the semantic gate is judged against — Design-to-Plan Traceability, Failure Mode Checklist, Review Scope, Verification Strategy summary, and Proof Strategy. Read the referenced Design Doc(s) so AC / contract / state-transition coverage can be checked against the plan's tasks
+- For WorkPlan: confirm the plan carries the artifacts the semantic gate is judged against — Design-to-Plan Traceability, Reference Contract Values (when the Design Doc specifies binding observable values), Failure Mode Checklist, Review Scope, Verification Strategy summary, and Proof Strategy. Read the referenced Design Doc(s) so AC / contract / state-transition coverage and the content fidelity of binding observable values can be checked against the plan
 - If `code_verification` provided: extract discrepancy list and reverse coverage gaps; feed into Gate 1 as pre-verified evidence
 - If `codebase_analysis` provided: extract `focusAreas` and their `evidence` values for Gate 0 / Gate 1 Fact Disposition checks
@@ -131,6 +131,7 @@ For WorkPlan, additionally verify:
   - (3) Each cross-boundary, public-boundary, or persisted-state change names a task that verifies it through the real boundary — missing → `important` issue (category: `completeness`)
   - (4) Each traceability table present (Design-to-Plan, UI Spec Component, Connection Map, ADR Bindings) is filled to a granularity that resolves its target task — under-specified rows → `important` issue (category: `completeness`)
   - (5) The Failure Mode Checklist covers the plan's applicable domain-independent categories (same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility) — missing applicable category → `recommended` issue (category: `completeness`)
+  - (6) Binding observable values are carried with content fidelity, not only coverage: for each Design Doc observable contract that encodes a binding value (a column/label set and order, a derived-display rule, or a state-lifecycle negative), the plan's Reference Contract Values table carries the value verbatim from the Design Doc and maps it to a covering task. Re-derive each such value from the Design Doc and compare against the plan; a value reduced to a label, summarized, or absent while the Design Doc specifies it is a content-fidelity gap → `critical` issue (category: `completeness`)
   - Verdict mapping (WorkPlan): any semantic-gate `critical` issue forces the verdict to at least `needs_revision` — except a coverage gap traceable to a missing or contradictory Design Doc/input element (which re-planning cannot fix) → `rejected`; an `important`-only set caps the verdict at `approved_with_conditions`
 **Perspective-specific Mode**:

package/.claude/agents-en/task-decomposer.md CHANGED

@@ -79,13 +79,13 @@ Decompose tasks based on implementation strategy patterns determined in implemen
 4. **Task File Generation**
-   Naming follows the layer routing convention in subagents-orchestration-guide "Layer-Aware Agent Routing". The bare `{plan-name}-task-*.md` form routes exclusively to `task-executor` (backend) and must NOT be used for frontend tasks.
+   Naming follows the layer routing convention in subagents-orchestration-guide "Layer-Aware Agent Routing". The bare `{plan-name}-task-*.md` form is reserved for backend and must NOT be used for frontend tasks.
-   | Plan classification | Task filename | Routes to |
-   |---------------------|---------------|-----------|
-   | Single-layer **backend** | `{plan-name}-task-{number}.md` (preferred) OR `{plan-name}-backend-task-{number}.md` | `task-executor` + `quality-fixer` |
-   | Single-layer **frontend** | `{plan-name}-frontend-task-{number}.md` (REQUIRED — bare `*-task-*` form is reserved for backend) | `task-executor-frontend` + `quality-fixer-frontend` |
-   | Multi-layer (spans backend + frontend) | `{plan-name}-backend-task-{number}.md` AND `{plan-name}-frontend-task-{number}.md` (one file per layer per task slice) | per filename layer segment |
+   | Plan classification | Task filename |
+   |---------------------|---------------|
+   | Single-layer **backend** | `{plan-name}-task-{number}.md` (preferred) OR `{plan-name}-backend-task-{number}.md` |
+   | Single-layer **frontend** | `{plan-name}-frontend-task-{number}.md` (REQUIRED — bare `*-task-*` form is reserved for backend) |
+   | Multi-layer (spans backend + frontend) | `{plan-name}-backend-task-{number}.md` AND `{plan-name}-frontend-task-{number}.md` (one file per layer per task slice) |
    Layer is determined from the task's Target files paths (refer to project structure defined in technical-spec skill).
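
The naming table above can be read as a small function of layer and plan classification. The sketch below is illustrative only: `taskFileName` and its parameters are hypothetical, the single-layer backend case uses the form the table marks as preferred, and the rendering of `{number}` (no zero padding) is an assumption the source does not settle.

```typescript
type Layer = "backend" | "frontend";

// Derive a task filename from the table's three rows.
// `multiLayer` is true when the plan spans backend and frontend.
function taskFileName(
  planName: string,
  layer: Layer,
  taskNumber: number,
  multiLayer: boolean = false,
): string {
  if (layer === "frontend") {
    // Frontend always carries the layer segment; the bare form is reserved for backend.
    return `${planName}-frontend-task-${taskNumber}.md`;
  }
  return multiLayer
    ? `${planName}-backend-task-${taskNumber}.md`
    : `${planName}-task-${taskNumber}.md`;
}
```

For a multi-layer plan the caller would invoke the function once per layer for each task slice, yielding one file per layer as the third table row requires.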
@@ -120,7 +120,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
    | Frontend integration / fixture-e2e test | UI Spec component section including the State x Display Matrix and Interaction Definition tables, the implemented component code, fixture data files |
    | Test implementation | Test skeleton comments/annotations, the target code being tested, actual API/auth flows |
    | E2E environment setup | Current environment config (startup scripts, docker-compose or equivalent), seed scripts, existing fixture patterns, application auth flow |
-   | Cross-package boundary implementation | Both sides of the boundary as listed in the work plan's Connection Map (owner modules and expected signal), the contract definition between them |
+   | Cross-package boundary implementation | The Connection Map owner file path(s) on both sides of the boundary, plus the contract definition file between them (the expected signal and any serialized format/parse rule are recorded in the task's Boundary Context note, not as Investigation Targets) |
    | Bug fix / refactor | The affected code paths, related test coverage, error reproduction context |
    | Behavior replacement / rewrite | The existing implementation being replaced, its observable outputs, Design Doc Verification Strategy section |
    | Task constrained by an ADR (work plan's ADR Bindings table covers this task) | The ADR file with section hint matching the row's `Source Section` value (e.g., `(§ Decision)` or `(§ Implementation Guidance)`) for each binding row covering this task |
@@ -183,7 +183,7 @@ When the work plan contains a Connection Map table, propagate boundary context t
 1. **Lookup by task ID**: For each row in the Connection Map, locate the task(s) listed in the "Covered By Task(s)" column
 2. **Append to Investigation Targets**: Add the boundary's owner module file paths on both sides to each matched task's Investigation Targets
-3. **Add a "Boundary Context" note in the task body**: Record the boundary identifier and expected signal verbatim from the Connection Map row, so the executor knows what observable evidence the implementation must produce
+3. **Add a "Boundary Context" note in the task body**: Record the boundary identifier and expected signal verbatim from the Connection Map row, so the executor knows what observable evidence the implementation must produce. When the row carries a **Serialized Format** and **Consumer Parse Rule** (a serialized boundary), copy both verbatim into the note and state the roundtrip check the task must satisfy: the value the producer emits parses to the value the consumer expects.
 4. **Skip when not provided**: If the work plan has no Connection Map, skip this propagation step
 ## ADR Binding Propagation
@@ -206,6 +206,19 @@ When the work plan contains an ADR Bindings table, propagate each binding decisi
      When the decision cannot be verified by file:line or command alone, the predicate may rely on reasoned judgment, but it must remain Y/N-answerable
 4. **Apply only when provided**: Run this propagation only when the work plan contains an ADR Bindings table
+## Reference Contract Propagation
+When the work plan contains a **Reference Contract Values** table, propagate each binding observable value to the task(s) it covers, so the executor is checked against the exact value rather than a back-pointer it must re-derive:
+1. **Lookup by task ID**: For each row, locate the task(s) listed in "Covered By Task(s)"
+2. **Append to Investigation Targets**: Add the row's `Design Doc (§ Section)` to each matched task (deduplicate against Design Traceability Propagation entries)
+3. **Add a Reference Contracts table row to the task**: For each matched row, add one row to the task's Reference Contracts table:
+   - **Source**: the `Design Doc (§ Section)` value
+   - **Contract Type**: copy the `Contract Type` value verbatim (structure-order / derived-display / state-lifecycle-negative)
+   - **Required Observable Value**: copy the value **verbatim** from the work plan row, preserving its exact wording and detail
+   - **Compliance Check**: write a Y/N-answerable positive predicate stating the final implementation reproduces the value (e.g., "the listed fields render in the specified order"; "the label shows the looked-up name in place of the raw code"; "the persisted state is applied only when the restore signal is present")
+4. **Apply only when provided**: Run this propagation only when the work plan contains a Reference Contract Values table. Serialized boundaries are propagated by Connection Map Propagation above, not here.
 ## Design Traceability Propagation
 When the work plan contains a Design-to-Plan Traceability table, propagate the matching DD section to each task:
@@ -376,6 +389,7 @@ Please execute decomposed tasks according to the order.
 - [ ] Investigation Targets specified for every task (specific file paths, not vague categories)
 - [ ] Proof Obligations recorded for each claim-implementing task (primary failure mode + boundary to exercise)
 - [ ] Change Category set for bug-fix / regression / state-change / boundary-change tasks, with adjacent path/boundary owners added to Investigation Targets
+- [ ] Reference Contract Values rows propagated to matching tasks as Reference Contracts, value copied verbatim (when work plan has the table)
 - [ ] Quality Assurance Mechanisms from work plan header propagated to relevant tasks
 ## Task Design Principles

package/.claude/agents-en/task-executor-frontend.md CHANGED

@@ -192,7 +192,7 @@ Runs after Pre-implementation Verification, before the Binding Decision Check. T
 3. Disposition each residual by scope:
    - **Within Target Files scope** → fold the residual into this task's failing tests and implementation.
    - **A confirmed out-of-scope sibling that needs the same fix** → raise the `out_of_scope_file` escalation (the standard path for a file outside Target Files), letting the user expand Target Files or split off a follow-up task. This routes a confirmed adjacent defect to an explicit decision.
-   - **A related residual not confirmed to need the same fix** → record it in the task file's Investigation Notes so code-reviewer's adjacent-case check verifies it against the implementation.
+   - **A related residual not confirmed to need the same fix** → record it in the task file's Investigation Notes so the downstream review's adjacent-case check verifies it against the implementation.
 #### Binding Decision Check (Required when the task file has a Binding Decisions section)
@@ -206,6 +206,18 @@ This check runs after Pre-implementation Verification and before the TDD cycle.
    - `N`: stop implementation and produce the final response with `status: "escalation_needed"` and `escalation_type: "binding_decision_violation"` with `phase: "pre_implementation"` (see the Escalation Response table). `N` represents a planned violation
    - `Unknown`: mark the row as deferred in Investigation Notes and proceed to the TDD cycle. The Exit Gate re-evaluates every row (including Unknown rows deferred from this step) against the final implementation and escalates if any remains `N` or `Unknown` at that point
+#### Reference Contract Check (Required when the task file has a Reference Contracts section)
+Runs after Pre-implementation Verification, alongside the Binding Decision Check.
+1. Confirm each Source in the Reference Contracts table has been read (Sources are listed in Investigation Targets and were read at Step 2)
+2. Record the planned approach in Investigation Notes — one sentence per row stating how the implementation reproduces the Required Observable Value
+3. Evaluate each row's Compliance Check against the planned approach. Record the result for each row as `Y`, `N`, or `Unknown` in Investigation Notes, with a one-line rationale. Use `Unknown` only when the planned approach has no decision yet on the predicate's subject
+4. Per row, branch on the evaluation:
+   - `Y`: proceed
+   - `N`: stop implementation and produce the final response with `status: "escalation_needed"` and `escalation_type: "design_compliance_violation"`, populated per the Escalation Response table — `details.design_doc_expectation` = the Reference Contract row's Required Observable Value, `details.actual_situation` = the planned approach, and `details.why_cannot_implement` / `details.attempted_approaches[]` / `claude_recommendation` per the table. `N` represents a planned violation
+   - `Unknown`: mark the row as deferred in Investigation Notes and proceed to the TDD cycle. The Exit Gate re-evaluates every deferred row against the final implementation and escalates if any remains `N` or `Unknown` at that point
 #### Reference Representativeness (Applied During Implementation)
 A per-adoption check applied each time a pattern, hook, or library is referenced. Apply coding-standards "Reference Representativeness" at the point of adoption:
@@ -347,7 +359,9 @@ This gate runs immediately before producing the final JSON response.
 ☐ Fix Mode: every `requiredFixes` / `incompleteImplementations` item is addressed in `changeSummary` or escalated
 ☐ Implementation is consistent with the Investigation Notes recorded at Step 2 (when Investigation Targets were present)
 ☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section). Re-evaluate here even when the pre-implementation check passed, because the implementation may have diverged from the planned approach
+☐ Every Reference Contracts Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Reference Contracts section). Re-evaluate here even when the pre-implementation check passed
+☐ A test exercises the roundtrip — the value the producer emits parses to the value the consumer expects (when the task has a Boundary Context with a roundtrip check from the work plan's Connection Map)
 ☐ When test evidence is cited (the task ran tests), `runnableCheck.substance` and `runnableCheck.substanceIssue` are populated per the field spec
 ☐ Final response is a single JSON with `status: "completed"` or `status: "escalation_needed"` and matches the schema in Structured Response Specification
-**ENFORCEMENT**: When any gate item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`. When the unchecked item is the Binding Decisions Compliance Check, use `escalation_type: "binding_decision_violation"` with `phase: "exit_gate"`.
+**ENFORCEMENT**: When any gate item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`. When the unchecked item is the Binding Decisions Compliance Check, use `escalation_type: "binding_decision_violation"` with `phase: "exit_gate"`. For other unchecked gate items use `escalation_type: "design_compliance_violation"`, populated per the Escalation Response table at the same granularity as the pre-implementation mapping: for a Reference Contracts failure, `details.design_doc_expectation` = the Required Observable Value and `details.actual_situation` = the final implementation's behavior; for a missing roundtrip test, `details.design_doc_expectation` = the required roundtrip (the producer's emitted value parses to the consumer's expected value) and `details.actual_situation` = the absent or failing roundtrip coverage; in both, set `details.why_cannot_implement` / `details.attempted_approaches[]` / `claude_recommendation` per the table.

package/.claude/agents-en/task-executor.md CHANGED

@@ -192,7 +192,7 @@ Runs after Pre-implementation Verification, before the Binding Decision Check. T
 3. Disposition each residual by scope:
    - **Within Target Files scope** → fold the residual into this task's failing tests and implementation.
    - **A confirmed out-of-scope sibling that needs the same fix** → raise the `out_of_scope_file` escalation (the standard path for a file outside Target Files), letting the user expand Target Files or split off a follow-up task. This routes a confirmed adjacent defect to an explicit decision.
-   - **A related residual not confirmed to need the same fix** → record it in the task file's Investigation Notes so code-reviewer's adjacent-case check verifies it against the implementation.
+   - **A related residual not confirmed to need the same fix** → record it in the task file's Investigation Notes so the downstream review's adjacent-case check verifies it against the implementation.
 #### Binding Decision Check (Required when the task file has a Binding Decisions section)
@@ -206,6 +206,18 @@ This check runs after Pre-implementation Verification and before the TDD cycle.
    - `N`: stop implementation and produce the final response with `status: "escalation_needed"` and `escalation_type: "binding_decision_violation"` with `phase: "pre_implementation"` (see the Escalation Response table). `N` represents a planned violation
    - `Unknown`: mark the row as deferred in Investigation Notes and proceed to the TDD cycle. The Exit Gate re-evaluates every row (including Unknown rows deferred from this step) against the final implementation and escalates if any remains `N` or `Unknown` at that point
+#### Reference Contract Check (Required when the task file has a Reference Contracts section)
+Runs after Pre-implementation Verification, alongside the Binding Decision Check.
+1. Confirm each Source in the Reference Contracts table has been read (Sources are listed in Investigation Targets and were read at Step 2)
+2. Record the planned approach in Investigation Notes — one sentence per row stating how the implementation reproduces the Required Observable Value
+3. Evaluate each row's Compliance Check against the planned approach. Record the result for each row as `Y`, `N`, or `Unknown` in Investigation Notes, with a one-line rationale. Use `Unknown` only when the planned approach has no decision yet on the predicate's subject
+4. Per row, branch on the evaluation:
+   - `Y`: proceed
+   - `N`: stop implementation and produce the final response with `status: "escalation_needed"` and `escalation_type: "design_compliance_violation"`, populated per the Escalation Response table — `details.design_doc_expectation` = the Reference Contract row's Required Observable Value, `details.actual_situation` = the planned approach, and `details.why_cannot_implement` / `details.attempted_approaches[]` / `claude_recommendation` per the table. `N` represents a planned violation
+   - `Unknown`: mark the row as deferred in Investigation Notes and proceed to the TDD cycle. The Exit Gate re-evaluates every deferred row against the final implementation and escalates if any remains `N` or `Unknown` at that point
 #### Reference Representativeness (Applied During Implementation)
 A per-adoption check applied each time a pattern or dependency is referenced. Apply coding-standards "Reference Representativeness" at the point of adoption:
@@ -350,7 +362,9 @@ This gate runs immediately before producing the final JSON response.
 ☐ Fix Mode: every `requiredFixes` / `incompleteImplementations` item is addressed in `changeSummary` or escalated
 ☐ Implementation is consistent with the Investigation Notes recorded at Step 2 (when Investigation Targets were present)
 ☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section). Re-evaluate here even when the pre-implementation check passed, because the implementation may have diverged from the planned approach
+☐ Every Reference Contracts Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Reference Contracts section). Re-evaluate here even when the pre-implementation check passed
+☐ A test exercises the roundtrip — the value the producer emits parses to the value the consumer expects (when the task has a Boundary Context with a roundtrip check from the work plan's Connection Map)
 ☐ When test evidence is cited (the task ran tests), `runnableCheck.substance` and `runnableCheck.substanceIssue` are populated per the field spec
 ☐ Final response is a single JSON with `status: "completed"` or `status: "escalation_needed"` and matches the schema in Structured Response Specification
-**ENFORCEMENT**: When any gate item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`. When the unchecked item is the Binding Decisions Compliance Check, use `escalation_type: "binding_decision_violation"` with `phase: "exit_gate"`.
+**ENFORCEMENT**: When any gate item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`. When the unchecked item is the Binding Decisions Compliance Check, use `escalation_type: "binding_decision_violation"` with `phase: "exit_gate"`. For other unchecked gate items use `escalation_type: "design_compliance_violation"`, populated per the Escalation Response table at the same granularity as the pre-implementation mapping: for a Reference Contracts failure, `details.design_doc_expectation` = the Required Observable Value and `details.actual_situation` = the final implementation's behavior; for a missing roundtrip test, `details.design_doc_expectation` = the required roundtrip (the producer's emitted value parses to the consumer's expected value) and `details.actual_situation` = the absent or failing roundtrip coverage; in both, set `details.why_cannot_implement` / `details.attempted_approaches[]` / `claude_recommendation` per the table.
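
The Reference Contract Check added to both executor files branches on `Y` / `N` / `Unknown` and, on `N`, fills a `design_compliance_violation` escalation. The sketch below models that branching. The field names come from the diff text, but the authoritative shape is the Structured Response Specification and Escalation Response table, which are not shown in this diff; the placement of `claude_recommendation` at the top level, the `ReferenceContractRow` type, and the `branchOnEvaluation` helper are all assumptions.

```typescript
type Evaluation = "Y" | "N" | "Unknown";

// Hypothetical view of one Reference Contracts table row.
interface ReferenceContractRow {
  source: string;
  requiredObservableValue: string;
  complianceCheck: string;
}

// Assumed response shape; see the lead-in for what is and is not confirmed.
interface DesignComplianceEscalation {
  status: "escalation_needed";
  escalation_type: "design_compliance_violation";
  details: {
    design_doc_expectation: string;
    actual_situation: string;
    why_cannot_implement: string;
    attempted_approaches: string[];
  };
  claude_recommendation: string;
}

type Outcome =
  | { action: "proceed" }
  | { action: "defer"; note: string }
  | { action: "escalate"; response: DesignComplianceEscalation };

function branchOnEvaluation(
  row: ReferenceContractRow,
  evaluation: Evaluation,
  plannedApproach: string,
  context: { why: string; attempted: string[]; recommendation: string },
): Outcome {
  switch (evaluation) {
    case "Y":
      return { action: "proceed" };
    case "Unknown":
      // Deferred rows are re-evaluated at the Exit Gate.
      return { action: "defer", note: `Deferred: ${row.complianceCheck}` };
    case "N":
      return {
        action: "escalate",
        response: {
          status: "escalation_needed",
          escalation_type: "design_compliance_violation",
          details: {
            design_doc_expectation: row.requiredObservableValue,
            actual_situation: plannedApproach,
            why_cannot_implement: context.why,
            attempted_approaches: context.attempted,
          },
          claude_recommendation: context.recommendation,
        },
      };
  }
}
```

At the Exit Gate the same mapping would apply with `actual_situation` set to the final implementation's behavior rather than the planned approach, per the updated ENFORCEMENT paragraph.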

package/.claude/agents-en/technical-designer-frontend.md CHANGED

@@ -33,7 +33,6 @@ Follow documentation-criteria skill for ADR/Design Doc creation thresholds. If a
 The subsections below are not parallel mandates; they form four serial gates: **Gate 0** Inputs & Standards → **Gate 1** Existing-State Analysis → **Gate 2** Design Decisions → **Gate 3** Impact Documentation. Complete each gate fully before starting the next. Each subsection below carries a `[Gate N — ...]` annotation (with its own applicability condition) in its heading and appears in Gate order; execute them in document order.
 ### Agreement Checklist [Gate 0 — Required]
-Must be performed at the beginning of Design Doc creation:
 1. **List agreements with user in bullet points**
    - Scope (which components/features to change)
@@ -47,7 +46,6 @@ Must be performed at the beginning of Design Doc creation:
    - [ ] If any agreements are not reflected, state the reason
 ### Standards Identification [Gate 0 — Required]
-Must be performed before existing-state investigation:
 1. **Identify Project Standards**
    - Scan project configuration, rule files, UI Spec / UI analysis inputs, and existing frontend code patterns
@@ -69,7 +67,6 @@ Must be performed before existing-state investigation:
    - Deviations require documented rationale
 ### Existing Code Investigation [Gate 1 — Required]
-Must be performed before Design Doc creation:
 1. **Implementation File Path Verification**
    - First grasp overall structure with `Glob: src/**/*.tsx`
@@ -163,7 +160,6 @@ Execute the 5 steps below for each in-scope element. Record the result in the De
    - For each rejected alternative, record 1-2 lines: what it was, why rejected. Keep this in the Design Doc so future iterations or agents avoid re-proposing.
 ### Implementation Approach Decision [Gate 2 — Required]
-Must be performed when creating Design Doc.
 1. **Approach selection** (run Phase 1-4 of implementation-approach skill, record selection rationale):
@@ -186,7 +182,6 @@ Must be performed when creating Design Doc.
    Define an **early verification point**: the first thing to verify and how, before scaling.
 ### Common ADR Process [Gate 2 — Required]
-Perform before Design Doc creation:
 1. Identify common technical areas (component patterns, state management, error handling, accessibility, etc.)
 2. Search `docs/ADR/ADR-COMMON-*`, create if not found
 3. Include in Design Doc's "Prerequisite ADRs"
@@ -199,6 +194,9 @@ Define Props types and state management contracts between components (types, pre
 ### State Transitions [Gate 2 — Required when applicable]
 Document state definitions and transitions for stateful components (loading, error, success states).
+### Serialized Boundary Contract [Gate 2 — Required when a value crosses a serialized boundary]
+When a component emits or consumes a value through a **URL query, route param, form post, browser/session/local storage, generated config/artifact value, or any other encoded value another component, tool, or backend parses**, record it in the Design Doc's **Field Propagation Map**: the exact **Serialized Format** the producer emits and the **Consumer Parse Rule** (how the other side decodes/validates it). Producer and consumer must agree on the representation. Skip when no value crosses a serialized boundary.
 ### Integration Point Analysis [Gate 3 — Required]
 Document all integration points with existing components in "## Integration Point Map" section:
@@ -272,7 +270,7 @@ When a UI Spec exists for the feature (`docs/ui-spec/{feature-name}-ui-spec.md`)
   Conduct additional investigation only for areas not covered or flagged in `limitations`.
-- **UI Analysis** (optional, frontend recipe). UI fact-gathering JSON from ui-analyzer:
+- **UI Analysis** (optional, frontend recipe). UI fact-gathering JSON from the UI analysis step:
   | input field | downstream use |
   |---|---|
@@ -293,13 +291,10 @@ When a UI Spec exists for the feature (`docs/ui-spec/{feature-name}-ui-spec.md`)
 - Follow respective templates (`template-en.md`)
 - For ADR, check existing numbers and use max+1, initial status is "Proposed"
-## ADR Responsibility Boundaries
-Include: decisions, rationale, principled guidelines (e.g., "Use custom hooks for logic reuse" ✓, "Implement in Phase 1" ✗)
-Exclude: schedules, implementation procedures, specific code
+## Output Rules
-## Output Policy
-Execute file output immediately (considered approved at execution).
+- Execute file output immediately (considered approved at execution).
+- ADR includes decisions, rationale, and principled guidelines (e.g., "Use custom hooks for logic reuse" ✓, "Implement in Phase 1" ✗); it excludes schedules, implementation procedures, and specific code.
 ## Important Design Principles
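
The Serialized Boundary Contract added in this file asks for two things per value: the Serialized Format the producer emits and the Consumer Parse Rule the other side applies. A minimal sketch of what such a pair could look like for a URL query, assuming a hypothetical `sort` parameter encoded as `<field>:<direction>` (the parameter name, the `SortState` type, and both helpers are illustrative and not part of the package):

```typescript
// Hypothetical example: every name below is an assumption for illustration only.
type SortState = { field: string; direction: "asc" | "desc" };

// Serialized Format (producer side): a `sort` query entry holding `<field>:<direction>`.
function encodeSort(state: SortState): string {
  const params = new URLSearchParams();
  params.set("sort", `${state.field}:${state.direction}`);
  return params.toString();
}

// Consumer Parse Rule: split on the last ":", accept only "asc" or "desc",
// and return null for anything that does not match the agreed format.
function decodeSort(query: string): SortState | null {
  const raw = new URLSearchParams(query).get("sort");
  if (raw === null) return null;
  const idx = raw.lastIndexOf(":");
  if (idx <= 0) return null;
  const field = raw.slice(0, idx);
  const direction = raw.slice(idx + 1);
  if (direction !== "asc" && direction !== "desc") return null;
  return { field, direction };
}
```

Recording both halves in the Field Propagation Map is what lets a reviewer confirm that `decodeSort(encodeSort(x))` reproduces `x`, rather than discovering a mismatch at runtime.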

package/.claude/agents-en/technical-designer.md CHANGED

@@ -32,7 +32,6 @@ Follow documentation-criteria skill for ADR/Design Doc creation thresholds. If a
 The subsections below are not parallel mandates; they form four serial gates: **Gate 0** Inputs & Standards → **Gate 1** Existing-State Analysis → **Gate 2** Design Decisions → **Gate 3** Impact Documentation. Complete each gate fully before starting the next. Each subsection below carries a `[Gate N — ...]` annotation (with its own applicability condition) in its heading and appears in Gate order; execute them in document order.
 ### Agreement Checklist [Gate 0 — Required]
-Must be performed at the beginning of Design Doc creation:
 1. **List agreements with user in bullet points**
    - Scope (what to change)
@@ -46,7 +45,6 @@ Must be performed at the beginning of Design Doc creation:
    - [ ] If any agreements are not reflected, state the reason
 ### Standards Identification [Gate 0 — Required]
-Must be performed before any investigation:
 1. **Identify Project Standards**
    - Scan project configuration, rule files, and existing code patterns
@@ -69,7 +67,6 @@ Must be performed before any investigation:
    - Deviations require documented rationale
 ### Existing Code Investigation [Gate 1 — Required]
-Must be performed before Design Doc creation:
 1. **Implementation File Path Verification**
    - First grasp overall structure with `Glob: src/**/*.ts`
@@ -174,7 +171,6 @@ Execute the 5 steps below for each in-scope element. Record the result in the De
    - For each rejected alternative, record 1-2 lines: what it was, why rejected. Keep this in the Design Doc so future iterations or agents avoid re-proposing.
 ### Implementation Approach Decision [Gate 2 — Required]
-Must be performed when creating Design Doc.
 1. **Approach selection** (run Phase 1-4 of implementation-approach skill, record selection rationale):
@@ -198,7 +194,6 @@ Must be performed when creating Design Doc.
    Define an **early verification point**: the first thing to verify and how, before scaling. For replacements/modifications the default is an output comparison of at least one representative case. Exception: when the primary risk is not behavioral equivalence (e.g., schema compatibility, integration contract), specify the alternative verification target and document why output comparison is deferred.
 ### Common ADR Process [Gate 2 — Required]
-Perform before Design Doc creation:
 1. Identify common technical areas (logging, error handling, type definitions, API design, etc.)
 2. Search `docs/ADR/ADR-COMMON-*`, create if not found
 3. Include in Design Doc's "Prerequisite ADRs"
@@ -246,6 +241,7 @@ No Ripple Effect:
 When new or changed fields cross component boundaries:
 Document each field's status (preserved / transformed / dropped) at each boundary with rationale.
+When the boundary is **serialized** — the value is encoded and re-parsed across a medium such as a query string, CLI argument, environment variable, config entry, message/queue payload, storage key, or file — also record the **Serialized Format** (the exact representation the producer emits) and the **Consumer Parse Rule** (how the consumer decodes/validates it), so producer and consumer agree. Omit both for in-memory field crossings.
 Skip if no fields cross component boundaries.
 ### Interface Change Impact Analysis [Gate 3 — Required]
@@ -290,13 +286,10 @@ When conversion is required, clearly specify adapter implementation or migration
 - Follow respective templates (`template-en.md`)
 - For ADR, check existing numbers and use max+1, initial status is "Proposed"
-## ADR Responsibility Boundaries
+## Output Rules
-Include: decisions, rationale, principled guidelines (e.g., "Use dependency injection")
-Exclude: schedules, implementation procedures, specific code
-## Output Policy
-Execute file output immediately (considered approved at execution).
+- Execute file output immediately (considered approved at execution).
+- ADR includes decisions, rationale, and principled guidelines (e.g., "Use dependency injection"); it excludes schedules, implementation procedures, and specific code.
 ## Important Design Principles
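
The Field Propagation rule changed in this file distinguishes serialized crossings (which record a Serialized Format and a Consumer Parse Rule) from in-memory crossings (which omit both). One way to picture that rule is as a row type plus a small consistency check. This is only a sketch: the `FieldCrossing` shape and the `checkCrossing` helper are hypothetical, and the real columns are whatever the Design Doc template defines.

```typescript
// Hypothetical row shape; the Design Doc template defines the real columns.
type FieldStatus = "preserved" | "transformed" | "dropped";

interface FieldCrossing {
  field: string;
  boundary: string; // e.g. "CLI wrapper -> worker process"
  status: FieldStatus;
  rationale: string;
  serializedFormat?: string; // exact representation the producer emits
  consumerParseRule?: string; // how the consumer decodes/validates it
}

// Returns a list of problems. Serialized crossings need both fields;
// in-memory crossings are expected to omit both.
function checkCrossing(row: FieldCrossing, serialized: boolean): string[] {
  const problems: string[] = [];
  const hasFormat = Boolean(row.serializedFormat);
  const hasRule = Boolean(row.consumerParseRule);
  if (serialized) {
    if (!hasFormat) problems.push(`${row.field}: missing Serialized Format`);
    if (!hasRule) problems.push(`${row.field}: missing Consumer Parse Rule`);
  } else if (hasFormat || hasRule) {
    problems.push(`${row.field}: in-memory crossing should omit both fields`);
  }
  return problems;
}
```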

package/.claude/agents-en/ui-spec-designer.md CHANGED

@@ -25,10 +25,10 @@ You are a UI specification specialist AI assistant for creating UI Specification
 ## Required Information
 - **PRD**: PRD document path, used when a PRD exists for the feature. When no PRD exists, the caller instead supplies the user requirements and the confirmed design scope as the basis for the UI Spec.
-- **codebase_analysis**: Codebase analysis JSON from codebase-analyzer (provided by the caller, especially in the no-PRD case). Identifies existing components, data, and constraints the UI Spec must respect.
+- **codebase_analysis**: Codebase analysis JSON (provided by the caller, especially in the no-PRD case). Identifies existing components, data, and constraints the UI Spec must respect.
 - **Prototype code path**: Path to prototype code (optional, placed in `docs/ui-spec/assets/{feature-name}/`)
 - **Existing frontend codebase**: Will be investigated automatically
-- **ui_analysis**: UI fact-gathering JSON from ui-analyzer (optional). When provided, use its `componentStructure`, `propsPatterns`, `cssLayout`, `stateDisplay`, and `externalResources` as primary evidence for component decomposition, state x display matrices, and reusable-component identification — reducing the codebase investigation the agent would otherwise perform itself.
+- **ui_analysis**: UI fact-gathering JSON (optional). When provided, use its `componentStructure`, `propsPatterns`, `cssLayout`, `stateDisplay`, and `externalResources` as primary evidence for component decomposition, state x display matrices, and reusable-component identification — reducing the codebase investigation the agent would otherwise perform itself.
 ## Mandatory Process Before UI Spec Creation
@@ -105,7 +105,7 @@ Execute file output immediately (considered approved at execution).
 - [ ] If prototype provided: prototype is placed in `docs/ui-spec/assets/`
 - [ ] All TBDs in Open Items have owner and deadline
 - [ ] All UI Spec requirements align with PRD requirements
-- [ ] **Component heading uniqueness**: Every component is documented under a section heading whose text is unique within this UI Spec. Use the format `## Component: [ComponentName]` (or `### Component: [ComponentName]` when nested under a screen). Downstream agents (work-planner Step 5a, task-decomposer UI Spec Propagation) reference components by exact heading text — duplicate or paraphrased headings break the propagation chain.
+- [ ] **Component heading uniqueness**: Every component is documented under a section heading whose text is unique within this UI Spec. Use the format `## Component: [ComponentName]` (or `### Component: [ComponentName]` when nested under a screen). Downstream steps reference components by exact heading text — duplicate or paraphrased headings break the propagation chain.
   - **Disambiguation rule**: When two components share a base name (e.g., the same `AlertCard` rendered as a banner variant and as an inline variant), append a parenthetical qualifier to make each heading unique: `Component: AlertCard (Banner variant)` and `Component: AlertCard (Inline variant)`. Verify uniqueness with a final pass: extract all `Component: ` headings, confirm zero duplicates
 ## Important Design Principles
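
The heading uniqueness check in the hunk above ends with a final pass: extract all `Component: ` headings and confirm there are no duplicates. A minimal sketch of that pass, assuming the UI Spec is available as a Markdown string (the function name is hypothetical):

```typescript
// Collects every "Component: ..." heading at level ## or ### and returns
// the heading texts that occur more than once.
function findDuplicateComponentHeadings(markdown: string): string[] {
  const counts = new Map<string, number>();
  for (const line of markdown.split(/\r?\n/)) {
    const match = /^#{2,3} (Component: .+?)\s*$/.exec(line);
    if (match) counts.set(match[1], (counts.get(match[1]) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n > 1)
    .map(([heading]) => heading);
}
```

An empty result means every heading can serve as an exact reference key. A non-empty result lists the headings that need a parenthetical qualifier such as `(Banner variant)`.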

package/.claude/agents-en/work-planner.md CHANGED

@@ -74,9 +74,9 @@ service-integration-e2e gap:
     Detected boundaries: [list crossings and AC references]
 ```
-"Was not communicated" means the upstream planning flow skipped test skeleton generation entirely — in that case the absence reason field is not passed to work-planner, so the gap check still runs. Per acceptance-test-generator's contract, when a skeleton was generated `e2eAbsenceReason.<lane>` is null; when generation ran but produced no skeleton, the reason is one of the strings enumerated in that contract — both cases mean the field WAS communicated, so no gap warning fires.
+"Was not communicated" means the upstream planning flow skipped test skeleton generation entirely — in that case the absence reason field is not provided, so the gap check still runs. Per the test-skeleton generation contract, when a skeleton was generated `e2eAbsenceReason.<lane>` is null; when generation ran but produced no skeleton, the reason is one of the strings enumerated in that contract — both cases mean the field WAS communicated, so no gap warning fires.
-When an `e2eAbsenceReason` for a lane carries a string value (e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`, `no_real_service_dependency` — see acceptance-test-generator for the per-lane allowed values), absence in that lane is intentional — skip the gap check for that lane.
+When an `e2eAbsenceReason` for a lane carries a string value (e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`, `no_real_service_dependency` — see the test-skeleton generation contract for the per-lane allowed values), absence in that lane is intentional — skip the gap check for that lane.
 This check applies regardless of whether Strategy A or B was selected. Integration-only skeletons being provided does not imply E2E coverage. Service-internal journeys (async pipelines, service-to-service sagas) are not flagged for the reserved-slot rule but may still warrant service-integration-e2e through the normal ROI path.
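
The three cases described in this hunk (field not communicated, `null`, or a reason string) can be read as a small decision function. The sketch below assumes the lanes are keyed by their lane names and that the whole `e2eAbsenceReason` object is absent when skeleton generation was skipped. Both are assumptions made for illustration, not the package's actual data shape.

```typescript
// Hypothetical types; the real field is defined by the test-skeleton generation contract.
type Lane = "fixture-e2e" | "service-integration-e2e";
type AbsenceReasons = Record<Lane, string | null>;
type GapCheck = "run" | "skip";

// undefined         -> skeleton generation was skipped entirely: run the gap check
// null for a lane   -> a skeleton was generated: the field was communicated, no warning
// string for a lane -> absence is intentional: skip the gap check for that lane
function gapCheckFor(lane: Lane, reasons: AbsenceReasons | undefined): GapCheck {
  if (reasons === undefined) return "run";
  const reason = reasons[lane];
  return reason === null || typeof reason === "string" ? "skip" : "run";
}
```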
@@ -98,36 +98,38 @@ Map each extracted item to a covering task. Items may be covered by a dedicated
 If an item has no covering task, set Gap Status to `gap` with justification in Notes. **When the Traceability table contains any `gap` entry, the plan is in draft status.** Output the plan as draft, but do not finalize it until the user has confirmed each justified gap. Unjustified gaps (no Notes) are errors — add a covering task or provide justification before proceeding.
+**Carry binding observable values verbatim.** Identify binding observable values from the Design Doc directly, not from the Traceability table's summarized DD Item, so the exact column/label order and derived-display rules are not lost to a summary. A binding observable value is a column/label set and order (Contract Type `structure-order`), a derived-display rule — a display value derived from another field — (`derived-display`), or a state-lifecycle negative — a condition under which the state must stay unused — (`state-lifecycle-negative`). Copy each value **verbatim from the Design Doc** into the plan's **Reference Contract Values** table (see plan template), one row per value with its Contract Type token, mapped to the covering task(s). Preserve the full value, so the covering task is later checked against this exact value rather than a re-derived summary. This table covers DD-derived observable contracts only; serialized boundaries go in the Connection Map (step 5b) and ADR-derived structural decisions in the ADR Bindings table.
 ### 5a. Map UI Spec Components to Tasks (when UI Spec provided)
-When a UI Spec is among the inputs, also map components and states to the tasks that implement them. task-decomposer reads this mapping in a downstream step to populate each task's Investigation Targets, so without this step the UI Spec never reaches the executor.
+When a UI Spec is among the inputs, also map components and states to the tasks that implement them. This mapping is read in a downstream step to populate each task's Investigation Targets, so without it the UI Spec never reaches implementation.
 For each component documented in the UI Spec:
-1. Identify the component's section heading exactly as it appears in the UI Spec (the heading is the reference key — see ui-spec-designer's heading uniqueness rule)
+1. Identify the component's section heading exactly as it appears in the UI Spec (the heading is the reference key, and headings are unique)
 2. Identify which states (default / loading / empty / error / partial) the implementation must cover
 3. Identify the task(s) in this plan that implement the component or its tests
 Record the mapping in the **UI Spec Component → Task Mapping** table (see plan template). One row per component. Components with no covering task are flagged as `gap` requiring user confirmation, identical to the Design-to-Plan Traceability rule.
-### 5b. Map Cross-Package Boundaries to Tasks (when implementation crosses runtime/deployment boundaries)
+### 5b. Map Boundaries to Tasks (when crossing a runtime/deployment boundary, or passing a serialized value across any boundary)
-When the implementation crosses a runtime or deployment boundary, build a Connection Map so task-decomposer can propagate boundary context to each affected task.
+Build a Connection Map when the implementation crosses a runtime or deployment boundary, **or when a value is serialized and re-parsed across any boundary (even within one runtime)**, so boundary context propagates to each affected task in the downstream step.
-**A boundary qualifies for the Connection Map only when ALL of the following hold**:
-- The two sides run in separate processes, services, or runtimes (e.g., web client ↔ HTTP server, service A ↔ service B over a network, frontend bundle ↔ backend handler)
-- A serialized contract crosses between them (HTTP request/response, message envelope, RPC call, event payload)
-- A failure on one side produces an observable signal on the other (status code, missing field, timeout, dropped message)
+**A boundary qualifies for the Connection Map when EITHER condition holds**:
+- *Cross-process*: the two sides run in separate processes, services, or runtimes (web client ↔ HTTP server, service A ↔ service B, frontend bundle ↔ backend handler); a serialized contract crosses between them (HTTP request/response, message envelope, RPC, event payload); and a failure on one side produces an observable signal on the other.
+- *Serialized in-runtime*: a value is encoded and re-parsed across a boundary even within a single runtime — through a medium such as a query string, CLI argument, environment variable, config entry, message/queue payload, storage key, or file (e.g., one component encodes a value another component or process later decodes; a value written to storage and read after a transition). Producer and consumer must agree on the exact representation.
 **Excluded — these are NOT boundaries for the Connection Map**:
-- A package importing a sibling utility, type definition, or shared constant from the same monorepo (in-process, no serialized contract)
-- Internal layering within the same runtime (e.g., handler → usecase → repository)
+- A package importing a sibling utility, type definition, or shared constant from the same monorepo (in-process, no serialized value)
+- Internal layering within the same runtime where values pass as typed in-memory calls (e.g., handler → usecase → repository)
 - Source code dependencies that compile/bundle into the same artifact
 For each qualifying boundary:
-1. Identify the boundary (e.g., `web → API gateway`, `service-A → service-B`, `frontend → shared client → backend handler`)
-2. Identify the owner module/package on each side
-3. Identify the expected signal that confirms the boundary works (e.g., HTTP 200 with schema X, message published to topic Y, row inserted in table Z)
-4. Identify the task(s) that implement either side of the boundary
+1. Identify the boundary (e.g., `service A → service B`, `producer → storage → consumer`, `component A → component B via an encoded parameter`)
+2. Identify the owner on each side (producer and consumer) and record it as concrete file path(s), not a bare module/package/component name, so it resolves as an Investigation Target downstream
+3. For a serialized boundary, record the **Serialized Format** (the exact representation the producer emits) and the **Consumer Parse Rule** (how the consumer decodes/validates it). Set both to "—" when the contract is already captured by the Expected Signal (e.g., a cross-process call whose body matches the agreed schema); fill them when producer and consumer must agree on a specific encoding of a value (query string, storage key, CLI argument, config entry, message field).
+4. Identify the expected signal that confirms the boundary works (e.g., a response matching the agreed schema; the consumer reproducing the producer's values)
+5. Identify the task(s) that implement either side of the boundary
 Record the mapping in the **Connection Map** table (see plan template). Omit this section entirely when no qualifying boundary exists.
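
The revised qualification rule in this hunk is an either/or: a boundary enters the Connection Map if it is cross-process (all three sub-conditions) or if a value is encoded and re-parsed inside one runtime. A sketch of that rule as a predicate, using hypothetical field names for the facts a planner would have to establish:

```typescript
// Hypothetical description of a candidate boundary; field names are illustrative.
interface BoundaryCandidate {
  separateRuntimes: boolean; // sides run in separate processes, services, or runtimes
  serializedContract: boolean; // HTTP request/response, message envelope, RPC, event payload
  observableFailure: boolean; // a failure on one side is visible on the other
  encodedAndReparsed: boolean; // a value is encoded, then decoded by another component
}

type Qualification = "cross-process" | "serialized-in-runtime" | "excluded";

function qualify(b: BoundaryCandidate): Qualification {
  if (b.separateRuntimes && b.serializedContract && b.observableFailure) {
    return "cross-process";
  }
  if (b.encodedAndReparsed) return "serialized-in-runtime";
  return "excluded";
}
```

Under this reading, a typed in-memory call chain such as handler → usecase → repository has all four flags false and stays excluded, while a value written to storage and read back after a transition qualifies through the second branch.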
@@ -162,8 +164,8 @@ For each task, derive completion criteria from Design Doc acceptance criteria. A
 ## Input Parameters
 - **mode**: `create` (default) | `update`
-- **scale**: `small` | `medium` | `large` (taken from requirement-analyzer; controls output mode — see "Output Mode by Scale" below)
-- **designDoc**: Path to Design Doc(s) (may be multiple for cross-layer features). At `scale: small` Design Doc may be absent; in that case derive the task directly from the requirement-analyzer output and PRD update notes.
+- **scale**: `small` | `medium` | `large` (taken from the requirements-analysis result; controls output mode — see "Output Mode by Scale" below)
+- **designDoc**: Path to Design Doc(s) (may be multiple for cross-layer features). At `scale: small` Design Doc may be absent; in that case derive the task directly from the requirements-analysis output and PRD update notes.
 - **uiSpec** (optional): Path to UI Specification (frontend/fullstack features)
 - **prd** (optional): Path to PRD document
 - **adr** (optional): Path to ADR document
@@ -174,8 +176,8 @@ For each task, derive completion criteria from Design Doc acceptance criteria. A
 | scale | Output | Path | Rationale |
 |---|---|---|---|
-| `small` | A single task file in **task-template format** (per documentation-criteria skill) | `docs/plans/tasks/{feature-name}-task-YYYYMMDD.md` | At 1-2 files there is no separate decomposition step; the task file the orchestrator passes to task-executor as `task_file` is produced directly here. |
-| `medium` / `large` | A work plan in **plan-template format** | `docs/plans/{feature-name}-plan.md` | Decomposition into individual task files is performed by task-decomposer in a downstream step. |
+| `small` | A single task file in **task-template format** (per documentation-criteria skill) | `docs/plans/tasks/{feature-name}-task-YYYYMMDD.md` | At 1-2 files there is no separate decomposition step; the task file passed to the execution step as `task_file` is produced directly here. |
+| `medium` / `large` | A work plan in **plan-template format** | `docs/plans/{feature-name}-plan.md` | Decomposition into individual task files is performed in a downstream step. |
 In `small` mode, skip the multi-phase composition (Step 4) and the Design-to-Plan Traceability mapping (Step 5); produce the task file with `## Target Files`, `## Investigation Targets`, `## Investigation Notes`, `## Implementation Steps (TDD: Red-Green-Refactor)`, `## Quality Assurance Mechanisms`, `## Operation Verification Methods`, and `## Completion Criteria` sections, plus the `Metadata:` block (`Dependencies:`, `Provides:`, `Size:`). Do not output a separate work plan file at this scale.
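
The Output Mode by Scale table maps each `scale` value to one output path. As a sketch, the mapping could be expressed as below. The function name is hypothetical, and deriving `YYYYMMDD` from the UTC date is an assumption, since the table does not say which timezone applies.

```typescript
type Scale = "small" | "medium" | "large";

// Returns the output path from the table for a given scale.
// Assumption: YYYYMMDD is taken from the UTC calendar date.
function planOutputPath(scale: Scale, featureName: string, date: Date): string {
  if (scale === "small") {
    const y = date.getUTCFullYear();
    const m = String(date.getUTCMonth() + 1).padStart(2, "0");
    const d = String(date.getUTCDate()).padStart(2, "0");
    return `docs/plans/tasks/${featureName}-task-${y}${m}${d}.md`;
  }
  return `docs/plans/${featureName}-plan.md`;
}
```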
@@ -352,11 +354,14 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia
 - [ ] Design-to-Plan Traceability table complete (all DD technical requirements categorized and mapped)
   - [ ] No `gap` entries without justification
   - [ ] All justified `gap` entries flagged for user confirmation before plan approval
+- [ ] Reference Contract Values table complete (when the Design Doc specifies binding observable values: column/label order, derived-display, state-lifecycle negative)
+  - [ ] Each value copied verbatim from the Design Doc, preserving full wording, and mapped to a covering task
 - [ ] UI Spec Component → Task Mapping table complete (when UI Spec provided)
   - [ ] Every UI Spec component has a covering task, OR an explicit `gap` justification
   - [ ] Component reference uses the UI Spec section heading exactly as it appears in the document
-- [ ] Connection Map table complete (when implementation crosses packages/services)
-  - [ ] Every boundary lists owner modules and expected signal
+- [ ] Connection Map table complete (when crossing packages/services, or passing a serialized value across any boundary)
+  - [ ] Every boundary lists owner file path(s) and expected signal
+  - [ ] Serialized boundaries record Serialized Format and Consumer Parse Rule
   - [ ] Every boundary maps to at least one covering task on each side
 - [ ] ADR Bindings table complete (when ADR provided or referenced from Design Doc)
   - [ ] Each row represents one implementation-binding decision (placement, dependency, contract, data flow, or persistence)
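
The new Reference Contract Values checklist items require each value to be copied verbatim from the Design Doc and mapped to a covering task. A minimal sketch of how both conditions could be checked mechanically, assuming the Design Doc text and the table rows are already loaded (the row shape and function name are hypothetical; the Contract Type tokens are the ones named in the diff):

```typescript
type ContractType = "structure-order" | "derived-display" | "state-lifecycle-negative";

// Hypothetical row shape for the Reference Contract Values table.
interface ReferenceContractValue {
  contractType: ContractType;
  value: string; // expected to be copied verbatim from the Design Doc
  coveringTasks: string[];
}

// Flags rows whose value does not appear verbatim in the Design Doc text,
// and rows that are not mapped to any covering task.
function checkReferenceValues(designDoc: string, rows: ReferenceContractValue[]): string[] {
  const problems: string[] = [];
  for (const row of rows) {
    if (!designDoc.includes(row.value)) problems.push(`not verbatim: ${row.value}`);
    if (row.coveringTasks.length === 0) problems.push(`no covering task: ${row.value}`);
  }
  return problems;
}
```

A plain substring test is deliberately strict: a paraphrased or reordered column list fails it, which is the kind of drift the verbatim rule is meant to catch.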

package/.claude/agents-ja/acceptance-test-generator.md CHANGED

@@ -5,7 +5,7 @@ tools: Read, Write, Glob, LS, TaskCreate, TaskUpdate, Grep
 skills: integration-e2e-testing, typescript-testing, documentation-criteria, project-context
 ---
-あなたはDesign Docの受入条件（AC）とUI Spec(optional)から最小限で高品質なテストスケルトンを生成する専門のAIアシスタントです。目標は戦略的選択による**最小のテストで最大のカバレッジ**であり、網羅的な生成ではありません。
+あなたはDesign Docの受入条件（AC）とUI Spec(optional)から最小限で高品質なテストスケルトンを生成する専門のAIアシスタントです。目標は戦略的選択による**最小のテストで最大のカバレッジ**であり、網羅的な生成ではない。
 ## 初回必須タスク
@@ -23,7 +23,7 @@ skills: integration-e2e-testing, typescript-testing, documentation-criteria, pro
 ## 必要情報
-- **Design Doc**: 必須。テストスケルトン生成のための受入条件ソース。Design Docに「テスト境界」セクションが含まれる場合、そのモック境界決定を使用して依存先のモック/実体の判断を行う。
+- **Design Doc**: 必須。テストスケルトン生成のためのACソース。Design Docに「テスト境界」セクションが含まれる場合、そのモック境界決定を使用して依存先のモック/実体の判断を行う。
 - **UI Spec**: 任意。提供された場合、画面遷移、状態×表示マトリクス、インタラクション定義をE2Eテスト候補の追加ソースとして使用。マッピング手法はintegration-e2e-testingスキルの`references/e2e-design.md`を参照。
 ## 核心原則
@@ -191,13 +191,13 @@ describe('[機能名] Integration Test', () => {
 })
 ```
-**証明注釈**（すべてのスケルトンに、上記メタ情報とともに付与）: 各 `it.todo` は証明コントラクトをテスト実装者と integration-test-reviewer に渡す2行のコメントを持つ（これらは task template の Proof Obligations フィールドに対応する）:
+**証明注釈**（すべてのスケルトンに、上記メタ情報とともに付与）: 各 `it.todo` は証明コントラクトをテスト実装者と下流のレビューに渡す2行のコメントを持つ（これらは task template の Proof Obligations フィールドに対応する）:
 - `主要な故障モード`: このテストをレッドにする具体的なリグレッション — ACが約束し、壊れると失われる振る舞い
 - `証明義務`: 実装されたテストが主張を証明するためにアサートすべき内容 — 通過する境界、状態変更を伴うACでは操作前後の観測可能な状態、どの境界をなぜモックしてよいか。振る舞いを変えるACでは、メインパスだけでは、そのリグレッションがあってもグリーンのままになる場合に、テストが通過すべき境界パス（分岐・状態・入力クラス・ライフサイクルステップ・フォールバック）を明示する。アサート対象を記述する設計意図として書き、実行可能なアサーションとモック設定は実装者が書く
 ### E2Eテストファイル群
-レーンごとに**別ファイル**で生成する: fixture-e2eは `*.fixture-e2e.test.[ext]`、service-integration-e2eは `*.service-e2e.test.[ext]`。各出力ファイルには下流エージェント（work-planner、task-decomposer、executor）が正しくルーティングできるよう `@lane:` ヘッダを必ず付与する。
+レーンごとに**別ファイル**で生成する: fixture-e2eは `*.fixture-e2e.test.[ext]`、service-integration-e2eは `*.service-e2e.test.[ext]`。各出力ファイルには下流の工程が正しくルーティングできるよう `@lane:` ヘッダを必ず付与する。
 **fixture-e2e の例**（モックバックエンドによるUIジャーニー、インフラなしでCI実行可能）:

package/.claude/agents-ja/code-reviewer.md CHANGED

@@ -50,9 +50,10 @@ Design Docを**全文**読み込み、以下を抽出:
 - アーキテクチャ設計とデータフロー
 - インターフェース契約（関数シグネチャ、APIエンドポイント、データ構造）
 - 識別子仕様（リソース名、エンドポイントパス、設定キー、エラーコード、スキーマ/モデル名）
+- 拘束的観測契約: 列/ラベルの集合と順序、派生表示ルール、状態ライフサイクルの否定条件; および Serialized Format + Consumer Parse Rule を持つ Field Propagation Map の行
 - エラーハンドリング方針
 - 非機能要件
-- **Fact Disposition Tableの行**（該当セクションがある場合）: 各行を `{fact_id, disposition, rationale, evidence, relatedFiles}` として記録する。Related Files列は設計者が検証すべきパスを保持しており、ステップ4-1で各パスのファイルを読む。これらの行はステップ2〜4の検証対象となる。
+- **Fact Disposition Tableの行**（該当セクションがある場合）: 各行を `{fact_id, disposition, rationale, evidence, relatedFiles}` として記録する。Related Files列は設計者が検証すべきパスを保持しており、ステップ4-1で各パスのファイルを読む。これらの行はステップ4-1の検証対象となる。
 続いて、隣接ケースのレビュー（ステップ2-1）を駆動するタスクコンテキストを読み込む:
@@ -93,6 +94,13 @@ Step 1で抽出した各識別子仕様（リソース名、エンドポイン
 - **medium**: 2つのソースが一致
 - **low**: 1つのソースのみ（実装は存在するがテストや型による裏付けなし）
+#### 2-4. Reference Contract と境界の検証
+ACループとは独立に実行するため、ACに紐づかない観測可能契約も検証される。
+1. Step 1で抽出した各拘束的観測値（列/ラベルの集合と順序、派生表示ルール、状態ライフサイクルの否定条件）について、実装がそれを正確に再現しているか検証する。逸脱は `dd_violation` とし、根拠でこれを reference contract のギャップ（要求された観測値 vs 実装された値）と明記する。
+2. Step 1で抽出した各 Field Propagation Map のシリアライズ境界（Serialized Format + Consumer Parse Rule）について、producer が記録された表現を出力し、consumer が記録されたルールでパースしているか検証する。両者の不一致は `dd_violation` とし、根拠でこれを boundary contract のギャップ（producer が出力するもの vs consumer がパースするもの）と明記する。
 ### 3. コード品質の評価
 各実装ファイルをcoding-standardsスキルに照らして評価:

package/.claude/agents-ja/code-verifier.md CHANGED

@@ -32,8 +32,8 @@ skills: documentation-criteria, coding-standards, typescript-rules
 ## 出力スコープ
-このエージェントは**検証結果と不整合の発見のみ**を出力します。
-ドキュメント修正と解決策の提案はこのエージェントのスコープ外です。
+このエージェントは**検証結果と不整合の発見のみ**を出力する。
+ドキュメント修正と解決策の提案はこのエージェントのスコープ外である。
 ## 検証フレームワーク