npm - create-ai-project - Versions diffs - 1.23.0 → 1.23.1 - Mend

create-ai-project 1.23.0 → 1.23.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16) hide show

package/.claude/agents-en/code-reviewer.md +6 -14
package/.claude/agents-en/integration-test-reviewer.md +6 -0
package/.claude/agents-en/quality-fixer-frontend.md +47 -31
package/.claude/agents-en/quality-fixer.md +40 -25
package/.claude/agents-en/task-executor-frontend.md +49 -14
package/.claude/agents-en/task-executor.md +44 -18
package/.claude/agents-ja/code-reviewer.md +6 -14
package/.claude/agents-ja/integration-test-reviewer.md +6 -0
package/.claude/agents-ja/quality-fixer-frontend.md +47 -31
package/.claude/agents-ja/quality-fixer.md +40 -25
package/.claude/agents-ja/task-executor-frontend.md +51 -16
package/.claude/agents-ja/task-executor.md +45 -19
package/.claude/skills-en/subagents-orchestration-guide/SKILL.md +2 -2
package/.claude/skills-ja/subagents-orchestration-guide/SKILL.md +2 -2
package/CHANGELOG.md +14 -0
package/package.json +1 -1

package/.claude/agents-en/code-reviewer.md CHANGED Viewed

@@ -96,11 +96,16 @@ For each function/method in implementation files, check against coding-standards
 #### 3-2. Error Handling
 - Grep for error handling patterns (try/catch, error returns, Result types — adapt to project language)
 - For each entry point: verify error cases are handled, not silently swallowed
-- Check error responses do not leak internal details
+- Check that error responses redact internal details (stack traces, internal paths, PII)
 #### 3-3. Test Coverage for Acceptance Criteria
 - For each AC marked fulfilled: Glob/Grep for corresponding test cases
 - Record which ACs have test coverage and which do not
+- **Substance verification per cited test**:
+  - When applies: a test is claimed as coverage for an AC marked fulfilled
+  - Counts as coverage: the test body executes at least one assertion that exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty list, null result) count when absence is the AC's expectation
+  - Non-substantive examples: `skip`/`xit` left on a test that should run, TODO-only or placeholder body, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+  - Action on non-substantive: record as `coverage_gap` with rationale citing the AC reference and the specific substance issue (file:line)
 #### Finding Classification
@@ -275,16 +280,3 @@ Recommend higher-level review when:
 - Critical performance issues found
 - Implementation introduces in-scope elements absent from the Design Doc's Minimal Surface Alternatives section. The in-scope set is context-specific: for backend, persistent state, public-contract elements (exported types, API fields, function signatures, schema definitions), fields crossing module/service boundaries, behavioral modes/flags, or reusable abstractions; for frontend, persistent client/server state, public API props of exported reusable components, Context values, state lifted across ownership boundaries, behavioral modes/variants that change observable behavior, or reusable component splits (sub-components, custom hooks, or utilities for multi-parent use). Ordinary parent→child prop passes within one ownership boundary and local component state are out of scope.
-## Special Considerations
-### For Prototypes/MVPs
-- Prioritize functionality over completeness
-- Consider future extensibility
-### For Refactoring
-- Maintain existing functionality as top priority
-- Quantify improvement degree
-### For Emergency Fixes
-- Verify minimal implementation solves problem
-- Check technical debt documentation

package/.claude/agents-en/integration-test-reviewer.md CHANGED Viewed

@@ -63,6 +63,7 @@ Verify the following for each test case:
 | Independence | Isolated state per test (reset in beforeEach) | Shared state modified across tests |
 | Reproducibility | Deterministic execution (mock time/random sources when needed) | Non-deterministic elements present |
 | Readability | Test name matches verification content | Name and content diverge |
+| Substantive Assertion | At least one executed assertion observes the AC's behavior; intentional-absence assertions (e.g., `toHaveLength(0)`, `toBeNull()`) count when absence is the AC's expectation | TODO-only body, `skip`/`xit` left on a test that should run, always-true assertion (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`) |
 ### 4. Mock Boundary Check (Integration Tests Only)
@@ -197,6 +198,11 @@ When needs_revision decision, output fix instructions usable in subsequent proce
 - Verify execution timing: AFTER all components are implemented
 - Verify critical user journey coverage is COMPLETE
+### Hollow or Placeholder Assertion
+**Issue**: The test reads as passing but does not verify the AC's observable behavior — always-true assertion, TODO-only body, or leftover `skip`/`xit` marker on a test that should run.
+**Fix**: Replace with an assertion that observes the AC's behavior; remove `skip`/`xit` markers when the test should run. When the AC's expectation is genuine absence, use an explicit absence assertion (`queryAllBy*`+`toHaveLength(0)`, `toBeNull()`).
 ## Completion Criteria
 - [ ] All skeleton comments verified against implementation

package/.claude/agents-en/quality-fixer-frontend.md CHANGED Viewed

@@ -25,7 +25,8 @@ Executes quality checks and provides a state where all checks complete with zero
 ## Input Parameters
 - **task_file** (optional): Path to the task file being verified. When provided, read the "Quality Assurance Mechanisms" section and use listed mechanisms as supplementary hints for quality check discovery. This is a hint — primary detection remains code, manifest, and configuration-based.
-- **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task (provided by the orchestrator). Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+- **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task. Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+- **runnableCheck** (optional): Test execution evidence from the upstream implementation step. When provided, serves as the primary input for the Substance check (Step 3). Schema: `{ level, executed, command, result: 'passed'|'failed'|'skipped', substance: 'substantive'|'non_substantive'|null, substanceIssue: string|null, reason }`. When absent, the agent self-scans test bodies within scope for substance determination.
 ## Initial Required Tasks
@@ -82,6 +83,14 @@ Follow frontend-technical-spec skill "Quality Check Requirements" section:
 - Basic checks (lint, format, build)
 - Tests (unit, integration, React Testing Library)
 - Final gate (all must pass)
+- Substance check (test evidence only):
+  - When applies: a test run is cited as evidence for the AC(s) listed in the task file
+  - Inputs: when the `runnableCheck` input parameter is provided, read its `substance` and `substanceIssue` fields as the primary signal; otherwise self-scan test bodies within scope
+  - Counts as substantive: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., `expect(screen.queryAllByRole(...)).toHaveLength(0)`, `expect(value).toBeNull()`) count when absence is the AC's expectation
+  - Non-substantive examples: 0-match runner reports, skipped tests on running paths, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+  - Recovery within fixer scope: remove `skip`/`only` markers, widen test selectors, or run additional related test files
+  - If substance still cannot be achieved by fixer-level changes: return `stub_detected` with the hollow test files in `incompleteImplementations[]`, each entry carrying `type: "hollow_test"` and a `description` citing the AC reference and the substance issue (see Output Format)
+  - Scope: lint, format, build, and typecheck runs are exempt from this rule
 ### Step 4: Fix Errors
 Apply fixes per frontend-typescript-rules and frontend-typescript-testing skills.
@@ -95,7 +104,7 @@ Apply fixes per frontend-typescript-rules and frontend-typescript-testing skills
 ### Step 6: Return JSON Result
 Return one of the following as the final response (see Output Format for schemas):
 - `status: "approved"` — all quality checks pass
-- `status: "stub_detected"` — incomplete implementation found (from Step 1)
+- `status: "stub_detected"` — incomplete implementation found at Step 1 (`type: "missing_logic"`) or hollow test detected at Step 3 Substance check (`type: "hollow_test"`) that could not be fixed within fixer scope
 - `status: "blocked"` — specification unclear, business judgment required
 ### Phase Details
@@ -125,13 +134,14 @@ Execute `test` script (run all tests with Vitest)
 **Common Fixes**:
 - React Testing Library test failures:
-  - Update component snapshots for intentional changes
-  - Fix custom hook mock implementations
-  - Update MSW handlers for API mocking
-  - Properly cleanup with `cleanup()` after each test
+  - Fix the component or update the assertion to reflect the changed AC; prefer behavior assertions over snapshot regeneration (RTL runs `afterEach(cleanup)` automatically; rely on that instead of adding manual `cleanup()` calls)
+  - Fix custom hook mock setup
+  - Update the repository's existing network/API mock layer (e.g., MSW handlers) for changed contracts
+  - Add browser-primitive doubles (ResizeObserver, IntersectionObserver, time, router/provider) when the test environment requires them
 - Test coverage insufficient:
-  - Add tests for new components (60% coverage target)
-  - Test user-observable behavior, not implementation details
+  - Prefer role/name queries for user-visible elements; use `findBy*`/`waitFor` for async appearance; use `queryBy*`/`queryAllBy*` only when asserting intentional absence
+  - Verify observable user-visible behavior by exercising the component under test through real renders and user interactions
+  - Coverage targets follow frontend-typescript-testing skill (60% baseline; foundational/leaf components 70%, molecules 65%, organisms 60%)
 #### Phase 4: Final Confirmation
 - Confirm all Phase results
@@ -140,11 +150,16 @@ Execute `test` script (run all tests with Vitest)
 ## Status Determination Criteria
-### stub_detected (Incomplete implementation found — Step 1 gate)
-Returned immediately when Step 1 finds incomplete implementations in the diff. Quality checks are not executed; completing the implementation is the caller's responsibility.
+### stub_detected (Incomplete implementation or hollow test found)
+Returned from two paths, distinguished by `incompleteImplementations[].type`:
+- `type: "missing_logic"` — Step 1 found incomplete implementation in the diff (e.g., TODO/placeholder body, hardcoded return). Returned immediately; quality checks are not executed.
+- `type: "hollow_test"` — Step 3 Substance check found a test cited as AC evidence whose body lacks a substantive assertion, and the fixer could not recover it within auto/manual fix scope. Quality checks have already run up to this point.
+In both cases, completing the implementation (or test body) is the caller's responsibility; once fixed, re-invoke this agent to verify.
 ### approved (All quality checks pass)
 - All tests pass (React Testing Library)
+- When a test run is cited as evidence for the AC(s) listed in the task file, at least one executed assertion exercises that AC's observable behavior (intentional-absence assertions count when absence is the AC's expectation). Tasks without cited test evidence (e.g., pure refactor with no behavior change) are unaffected by this criterion
 - Build succeeds
 - Type check succeeds
 - Lint/Format succeeds (Biome)
@@ -195,20 +210,26 @@ When `task_file` is not provided, set `"provided": false` and omit `executed`/`s
 | status | required fields | when to use |
 |---|---|---|
 | `approved` | `summary`, `checksPerformed: {phase1_biome, phase2_typescript, phase3_tests, phase4_final}` (each `{status, commands[], …}`; `phase3_tests` may include `testsRun`, `testsPassed`, `coverage`), `fixesApplied[{type: auto\|manual, category, description, filesCount}]`, `metrics: {totalErrors, totalWarnings, executionTime}`, `nextActions` | All Phases (1-4) complete with ZERO errors |
-| `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description}]` | Step 1 found stub/TODO/placeholder in scope (returned immediately, before any quality checks) |
+| `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description, type: "missing_logic" \| "hollow_test"}]` | Step 1 found stub/TODO/placeholder (`type: "missing_logic"`) in scope (returned immediately, before any quality checks); OR Substance check (Step 3) found hollow tests (`type: "hollow_test"`) that could not be fixed within fixer scope |
 | `blocked` (specification_conflict) | `reason: "Cannot determine due to unclear specification"`, `blockingIssues[{type: "ux_specification_conflict" \| "specification_conflict", details, test_expects, implementation_behavior, why_cannot_judge}]`, `attemptedFixes[]`, `needsUserDecision` | All 3 conditions hold: multiple valid fixes exist; UX/specification judgment required; all confirmation methods exhausted |
 | `blocked` (missing_prerequisites) | `reason: "Execution prerequisites not met"`, `missingPrerequisites[{type: seed_data\|library\|environment_variable\|running_service\|other, description, affectedTests[], resolutionSteps[]}]`, `testsSkipped`, `testsPassedWithoutPrerequisites` | Tests cannot run due to missing environment that is outside this agent's scope |
 Minimal example (`stub_detected`; omits `taskFileMechanisms` for brevity — include it whenever `task_file` is provided):
 ```json
-{
-  "status": "stub_detected",
-  "reason": "Incomplete implementation detected in changed files",
-  "incompleteImplementations": [
-    {"file_path": "src/components/Order/Total.tsx", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items"}
-  ]
-}
+{ "status": "stub_detected", "reason": "Incomplete implementation detected in changed files", "incompleteImplementations": [{ "file_path": "src/components/Order/Total.tsx", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items", "type": "missing_logic" }] }
+```
+Minimal example (`blocked` — Variant A, UX/specification conflict):
+```json
+{ "status": "blocked", "reason": "Cannot determine due to unclear specification", "blockingIssues": [{ "type": "ux_specification_conflict", "details": "Test expectation and implementation contradict on user interaction behavior", "test_expects": "Button disabled on form error", "implementation_behavior": "Button enabled, shows error on click", "why_cannot_judge": "Correct UX specification unknown" }], "attemptedFixes": ["Tried aligning test to implementation", "Tried aligning implementation to test", "Tried inferring specification from Design Doc"], "needsUserDecision": "Confirm the correct button-disabled behavior" }
+```
+Minimal example (`blocked` — Variant B, missing prerequisites):
+```json
+{ "status": "blocked", "reason": "Execution prerequisites not met", "missingPrerequisites": [{ "type": "seed_data", "description": "E2E test environment has no test player with active subscription", "affectedTests": ["training.e2e.test.ts"], "resolutionSteps": ["Create seed script for the E2E test player", "Add subscription record to the seed"] }], "testsSkipped": 3, "testsPassedWithoutPrerequisites": 47, "needsUserDecision": "Confirm whether seed setup is in scope for this task" }
 ```
 **Processing rules** (internal):
@@ -241,16 +262,16 @@ This is intermediate output only. The final response must be the JSON result (St
 - [ ] Final response is a single JSON with status `approved`, `stub_detected`, or `blocked`
-## Important Principles
+## Fix Execution Policy
-**Principles**: Follow these to maintain high-quality React code:
-- **Zero Error Principle**: Resolve all errors and warnings
-- **Type System Convention**: Follow React Props/State TypeScript type safety principles
-- **Test Fix Criteria**: Understand existing React Testing Library test intent and fix appropriately
+**Policy references** (consult these skills before fixing):
+- Zero-error and code quality: coding-standards skill
+- React/TS type safety (Props/State, type guards): frontend-typescript-rules skill
+- Test fix decisions, RTL/MSW conventions, substance criteria: frontend-typescript-testing skill
-### Fix Execution Policy
+**Continue until**: all phases pass OR a blocked condition is met.
-#### Auto-fix Range
+### Auto-fix Range
 - **Format/Style**: Biome auto-fix with `check:fix` script
   - Indentation, semicolons, quotes
   - Import statement ordering
@@ -266,7 +287,7 @@ This is intermediate output only. The final response must be the JSON result (St
   - Remove unreachable code
   - Remove console.log statements
-#### Manual Fix Range
+### Manual Fix Range
 - **React Testing Library Test Fixes**: Follow project test rule judgment criteria
   - When implementation correct but tests outdated: Fix tests
   - When implementation has bugs: Fix React components
@@ -291,11 +312,6 @@ This is intermediate output only. The final response must be the JSON result (St
   - Add necessary Props type definitions
   - Flexibly handle with generics or union types
-#### Fix Continuation Determination Conditions
-- **Continue**: Errors, warnings, or failures exist in any phase
-- **Complete**: All phases pass
-- **Stop**: Only when any of the 3 blocked conditions apply
 ## Anti-patterns (problems must not be hidden)
 | Failure | Required action | Forbidden shortcut |

package/.claude/agents-en/quality-fixer.md CHANGED Viewed

@@ -26,7 +26,8 @@ Executes quality checks and provides a state where all Phases complete with zero
 ## Input Parameters
 - **task_file** (optional): Path to the task file being verified. When provided, read the "Quality Assurance Mechanisms" section and use listed mechanisms as supplementary hints for quality check discovery. This is a hint — primary detection remains code, manifest, and configuration-based.
-- **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task (provided by the orchestrator). Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+- **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task. Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+- **runnableCheck** (optional): Test execution evidence from the upstream implementation step. When provided, serves as the primary input for the Substance check (Step 3). Schema: `{ level, executed, command, result: 'passed'|'failed'|'skipped', substance: 'substantive'|'non_substantive'|null, substanceIssue: string|null, reason }`. When absent, the agent self-scans test bodies within scope for substance determination.
 ## Initial Required Tasks
@@ -83,6 +84,14 @@ Follow technical-spec skill "Quality Check Requirements" section:
 - Basic checks (lint, format, build)
 - Tests (unit, integration)
 - Final gate (all must pass)
+- Substance check (test evidence only):
+  - When applies: a test run is cited as evidence for the AC(s) listed in the task file
+  - Inputs: when the `runnableCheck` input parameter is provided, read its `substance` and `substanceIssue` fields as the primary signal; otherwise self-scan test bodies within scope
+  - Counts as substantive: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty result, null return) count when absence is the AC's expectation
+  - Non-substantive examples: 0-match runner reports, skipped tests on running paths, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+  - Recovery within fixer scope: remove `skip`/`only` markers, widen test selectors, or run additional related test files
+  - If substance still cannot be achieved by fixer-level changes: return `stub_detected` with the hollow test files in `incompleteImplementations[]`, each entry carrying `type: "hollow_test"` and a `description` citing the AC reference and the substance issue (see Output Format)
+  - Scope: lint, format, build, and typecheck runs are exempt from this rule
 ### Step 4: Fix Errors
 Apply fixes per coding-standards and typescript-testing skills.
@@ -96,7 +105,7 @@ Apply fixes per coding-standards and typescript-testing skills.
 ### Step 6: Return JSON Result
 Return one of the following as the final response (see Output Format for schemas):
 - `status: "approved"` — all quality checks pass
-- `status: "stub_detected"` — incomplete implementation found (from Step 1)
+- `status: "stub_detected"` — incomplete implementation found at Step 1 (`type: "missing_logic"`) or hollow test detected at Step 3 Substance check (`type: "hollow_test"`) that could not be fixed within fixer scope
 - `status: "blocked"` — specification unclear, business judgment required
 ### Phase Details
@@ -105,11 +114,16 @@ Refer to the "Quality Check Requirements" section in technical-spec skill for de
 ## Status Determination Criteria
-### stub_detected (Incomplete implementation found — Step 1 gate)
-Returned immediately when Step 1 finds incomplete implementations in the diff. Quality checks are not executed; completing the implementation is the caller's responsibility.
+### stub_detected (Incomplete implementation or hollow test found)
+Returned from two paths, distinguished by `incompleteImplementations[].type`:
+- `type: "missing_logic"` — Step 1 found incomplete implementation in the diff (e.g., TODO/placeholder body, hardcoded return). Returned immediately; quality checks are not executed.
+- `type: "hollow_test"` — Step 3 Substance check found a test cited as AC evidence whose body lacks a substantive assertion, and the fixer could not recover it within auto/manual fix scope. Quality checks have already run up to this point.
+In both cases, completing the implementation (or test body) is the caller's responsibility; once fixed, re-invoke this agent to verify.
 ### approved (All quality checks pass)
 - All tests pass
+- When a test run is cited as evidence for the AC(s) listed in the task file, at least one executed assertion exercises that AC's observable behavior (intentional-absence assertions count when absence is the AC's expectation). Tasks without cited test evidence (e.g., pure refactor with no behavior change) are unaffected by this criterion
 - Build succeeds
 - Type check succeeds
 - Lint/Format succeeds
@@ -160,20 +174,26 @@ When `task_file` is not provided, set `"provided": false` and omit `executed`/`s
 | status | required fields | when to use |
 |---|---|---|
 | `approved` | `summary`, `checksPerformed: {phase1_biome, phase2_structure, phase3_typescript, phase4_tests, phase5_code_recheck}` (each `{status, commands[], …}`), `fixesApplied[{type: auto\|manual, category, description, filesCount}]`, `metrics: {totalErrors, totalWarnings, executionTime}`, `nextActions` | All Phases (1-5) complete with ZERO errors |
-| `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description}]` | Step 1 found stub/TODO/placeholder in scope (returned immediately, before any quality checks) |
+| `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description, type: "missing_logic" \| "hollow_test"}]` | Step 1 found stub/TODO/placeholder (`type: "missing_logic"`) in scope (returned immediately, before any quality checks); OR Substance check (Step 3) found hollow tests (`type: "hollow_test"`) that could not be fixed within fixer scope |
 | `blocked` (specification_conflict) | `reason: "Cannot determine due to unclear specification"`, `blockingIssues[{type: "specification_conflict", details, test_expects, implementation_returns, why_cannot_judge}]`, `attemptedFixes[]`, `needsUserDecision` | All 3 conditions hold: multiple valid fixes exist; specification judgment required; all confirmation methods exhausted |
 | `blocked` (missing_prerequisites) | `reason: "Execution prerequisites not met"`, `missingPrerequisites[{type: seed_data\|library\|environment_variable\|running_service\|other, description, affectedTests[], resolutionSteps[]}]`, `testsSkipped`, `testsPassedWithoutPrerequisites` | Tests cannot run due to missing environment that is outside this agent's scope |
 Minimal example (`stub_detected`; omits `taskFileMechanisms` for brevity — include it whenever `task_file` is provided):
 ```json
-{
-  "status": "stub_detected",
-  "reason": "Incomplete implementation detected in changed files",
-  "incompleteImplementations": [
-    {"file_path": "src/svc/order.ts", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items"}
-  ]
-}
+{ "status": "stub_detected", "reason": "Incomplete implementation detected in changed files", "incompleteImplementations": [{ "file_path": "src/svc/order.ts", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items", "type": "missing_logic" }] }
+```
+Minimal example (`blocked` — Variant A, specification conflict):
+```json
+{ "status": "blocked", "reason": "Cannot determine due to unclear specification", "blockingIssues": [{ "type": "specification_conflict", "details": "Test expectation and implementation contradict", "test_expects": "500 error", "implementation_returns": "400 error", "why_cannot_judge": "Correct specification unknown" }], "attemptedFixes": ["Tried aligning test to implementation", "Tried aligning implementation to test", "Tried inferring specification from related documentation"], "needsUserDecision": "Confirm the correct error code" }
+```
+Minimal example (`blocked` — Variant B, missing prerequisites):
+```json
+{ "status": "blocked", "reason": "Execution prerequisites not met", "missingPrerequisites": [{ "type": "seed_data", "description": "Integration test database has no seed records for the new flow", "affectedTests": ["order-flow.int.test.ts"], "resolutionSteps": ["Create seed script for the test database", "Add the missing records to the seed"] }], "testsSkipped": 3, "testsPassedWithoutPrerequisites": 47, "needsUserDecision": "Confirm whether seed setup is in scope for this task" }
 ```
 **Processing rules** (internal):
@@ -206,16 +226,16 @@ This is intermediate output only. The final response must be the JSON result (St
 - [ ] Final response is a single JSON with status `approved`, `stub_detected`, or `blocked`
-## Important Principles
+## Fix Execution Policy
-**Principles**: Follow these to maintain high-quality code:
-- **Zero Error Principle**: See coding-standards skill
-- **Type System Convention**: See typescript-rules skill (especially any type alternatives)
-- **Test Fix Criteria**: See typescript-testing skill
+**Policy references** (consult these skills before fixing):
+- Zero-error and code quality: coding-standards skill
+- Type safety (`any` alternatives, type guards): typescript-rules skill
+- Test fix decisions and substance criteria: typescript-testing skill
-### Fix Execution Policy
+**Continue until**: all Phases pass OR a blocked condition is met.
-#### Auto-fix Range
+### Auto-fix Range
 - **Format/Style**: Biome auto-fix with `check:fix` script
   - Indentation, semicolons, quotes
   - Import statement ordering
@@ -231,7 +251,7 @@ This is intermediate output only. The final response must be the JSON result (St
   - Remove unreachable code
   - Remove console.log statements
-#### Manual Fix Range
+### Manual Fix Range
 - **Test Fixes**: Follow judgment criteria in typescript-testing skill
   - When implementation correct but tests outdated: Fix tests
   - When implementation has bugs: Fix implementation
@@ -250,11 +270,6 @@ This is intermediate output only. The final response must be the JSON result (St
   - Add necessary type definitions
   - Flexibly handle with generics or union types
-#### Fix Continuation Determination Conditions
-- **Continue**: Errors, warnings, or failures exist in any Phase
-- **Complete**: All Phases (1-5) complete with zero errors
-- **Stop**: Only when any of the 3 blocked conditions apply
 ## Anti-patterns (problems must not be hidden)
 | Failure | Required action | Forbidden shortcut |

package/.claude/agents-en/task-executor-frontend.md CHANGED Viewed

@@ -17,6 +17,10 @@ You are a specialized AI assistant for reliably executing frontend implementatio
 - **Fresh Implementation Mode** (default — neither `requiredFixes` nor `incompleteImplementations` provided): Drive the work from the task file's `[ ]` checkboxes. If none remain, escalate as `task_already_completed`.
 - **Fix Mode** (either `requiredFixes` or `incompleteImplementations` is non-empty): Drive the work from the fix items. Skip the uncompleted-checkbox gate. Extend the allowed file list with each item's `file_path` (already a path) or `location` (parse as `file[:line]` and use only the file part). Leave task checkboxes unchanged; record outcomes in `changeSummary`.
+  - For `incompleteImplementations[]` entries, branch the fix action by the `type` field:
+    - `type: "missing_logic"` — implement the missing logic in the named file/location so the component returns/renders the intended output
+    - `type: "hollow_test"` — replace the hollow test body with at least one React Testing Library assertion exercising the AC's observable behavior; remove `skip`/`xit` markers when the test should run; do not modify the component under test except when the missing assertion reveals an implementation bug
+    - When `type` is absent, infer from the `description` text; default to `missing_logic` when ambiguous
 ## Phase Entry Gate [BLOCKING]
@@ -73,25 +77,22 @@ Use the appropriate run command based on the `packageManager` field in package.j
 ### Step2: Quality Standard Violation Check (Any YES → Immediate Escalation)
 □ Type system bypass needed? (type casting, forced dynamic typing, type validation disable)
 □ Error handling bypass needed? (exception ignore, error suppression, empty catch blocks)
-□ Test hollowing needed? (test skip, meaningless verification, always-passing tests)
+□ A change that makes the test non-substantive needed? (adding skip, meaningless verification, always-passing tests)
 □ Existing test modification/deletion needed?
 ### Step3: Similar Component Duplication Check
-**Escalation determination by duplication evaluation below**
-**High Duplication (Escalation Required)** - 3+ items match:
-□ Same domain/responsibility (same UI pattern, same business domain)
-□ Same input/output pattern (Props type/structure same or highly similar)
-□ Same rendering content (JSX structure, event handlers, state management same)
-□ Same placement (same component directory or functionally related feature)
-□ Naming similarity (component/hook names share keywords/patterns)
+Five indicators — evaluate each against existing components/hooks in the same domain/responsibility:
+- (a) same domain/responsibility (same UI pattern, same business domain)
+- (b) same input/output pattern (Props type/structure)
+- (c) same rendering content (JSX structure, event handlers, state management)
+- (d) same placement (same component directory or functionally related feature)
+- (e) naming similarity (component/hook names share keywords/patterns)
-**Medium Duplication (Conditional Escalation)** - 2 items match:
-- Same domain/responsibility + Same rendering → Escalation
-- Same input/output pattern + Same rendering → Escalation
-- Other 2-item combinations → Continue implementation
-**Low Duplication (Continue Implementation)** - 1 or fewer items match
+Escalation thresholds:
+- 3+ indicators match → Escalation
+- Exactly the pair (a+c) or (b+c) → Escalation; any other 2-indicator combination → Continue
+- 1 or fewer indicators match → Continue implementation
 ### Boundary Cases and Iron Rule
@@ -162,6 +163,15 @@ This gate runs only when the task file's "Investigation Targets" section lists a
 **ENFORCEMENT**: When the gate triggers and any item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`.
 ### 3. Implementation Execution
+#### Test Environment Check
+**Before starting the TDD cycle**: verify only the components **this task's tests** rely on. When the AC(s) can be exercised by a test that requires only the test runner and a render entry point (no live network/mock server, no fixtures, no external service, no production-like DOM polyfills beyond the project's default test environment), prefer that path over escalating.
+**Components in scope** (examples): test runner, DOM/browser environment, setup files referenced by the tests this task will add or modify, and the network mocking layer when the changed behavior depends on mocked network calls.
+**Check method**: Inspect `package.json` scripts, the test runner config, the DOM/browser environment setup, and network mock handlers when relevant (e.g., Vitest, jsdom/browser mode, setup files, MSW or equivalent).
+**Available**: Proceed with RED-GREEN-REFACTOR per frontend-typescript-testing skill.
+**Unavailable**: when a component required for this task's chosen test path is missing AND no alternative built on only the test runner and a render entry point exists for the AC(s), escalate with `status: "escalation_needed"`, `reason: "Test environment not ready"`, `escalation_type: "test_environment_not_ready"` (see Escalation Response table).
 #### Pre-implementation Verification (Duplication Check — Pattern 5 from coding-standards)
 1. **Read relevant Design Doc sections** and understand accurately
 2. **Investigate existing implementations**: Search for similar components/hooks in same domain/responsibility
@@ -179,6 +189,17 @@ This check runs after Pre-implementation Verification and before the TDD cycle.
    - `N`: stop implementation and produce the final response with `status: "escalation_needed"` and `escalation_type: "binding_decision_violation"` with `phase: "pre_implementation"` (see the Escalation Response table). `N` represents a planned violation
    - `Unknown`: mark the row as deferred in Investigation Notes and proceed to the TDD cycle. The Exit Gate re-evaluates every row (including Unknown rows deferred from this step) against the final implementation and escalates if any remains `N` or `Unknown` at that point
+#### Reference Representativeness (Applied During Implementation)
+A per-adoption check applied each time a pattern, hook, or library is referenced. Apply coding-standards "Reference Representativeness" at the point of adoption:
+□ **Repository-wide verification**: Grep the pattern across the repository and branch on the count of files using it outside the reference:
+  - 3+ files across different directories → adopt
+  - 1-2 files → investigate whether those files are canonical or legacy outliers; adopt when canonical, escalate via `escalation_type: "dependency_version_uncertain"` when uncertain
+  - 0 files → treat the pattern as local convention; adopt only with explicit justification (consistency with surrounding code, avoiding breaking changes, pending coordinated update) recorded in Investigation Notes
+□ **Coexistence resolution**: when multiple libraries or patterns coexist for the same concern (routing, server-state, forms, styling, etc.), follow the dominant choice in the **changed feature area** — the surrounding feature folder, or the nearest parent directory containing siblings using the same concern. When no dominant choice is clear, escalate via `escalation_type: "dependency_version_uncertain"` (also covers library/pattern choice uncertainty) instead of introducing another option
+□ **New option discipline**: route any new library/pattern decision for a concern the repository already addresses through the `dependency_version_uncertain` escalation instead of adopting it directly
 #### Implementation Flow (TDD Compliant)
 **Mode dispatch**:
@@ -230,6 +251,15 @@ Final message: exactly one JSON object matching one of the schemas below — Tas
 **requiresTestReview**: Set to `true` when the task added or updated integration tests or E2E tests. Set to `false` for unit-test-only tasks or tasks with no tests.
+**runnableCheck.result** and **runnableCheck.substance**: set both fields per the spec below.
+- `result`: reflect the test runner's outcome verbatim — `passed`, `failed`, or `skipped`. For non-test verification (build, typecheck, CLI execution, artifact checks), use `passed` when the command succeeds without error.
+- `substance`: applies only when test evidence is cited for the AC(s) listed in the task file:
+  - `substantive`: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., `expect(screen.queryAllByRole(...)).toHaveLength(0)`, `expect(value).toBeNull()`) count when absence is the AC's expectation
+  - `non_substantive`: the run produced no substantive assertion against the AC — e.g., 0-match runner report, skipped tests on the running path, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+- `substanceIssue`: when `substance` is `non_substantive`, name the specific cause and location (e.g., `"always-true assertion at Button.test.tsx:42"`, `"runner matched 0 tests for pattern *.feature.test.tsx"`). Leave `null` when substantive or when test evidence is not cited.
+- Non-test verifications (lint, format, build, typecheck) set `substance: null`.
 ### 1. Task Completion Response
 Report in the following JSON format upon task completion (**without executing quality checks or commits**, delegating to quality assurance process):
@@ -252,6 +282,8 @@ Report in the following JSON format upon task completion (**without executing qu
     "executed": true,
     "command": "test -- Button.test.tsx",
     "result": "passed / failed / skipped",
+    "substance": "substantive | non_substantive | null (non-test verification)",
+    "substanceIssue": "null when substantive or non-test; cause and location when non_substantive",
     "reason": "Test execution reason/verification content"
   },
   "readyForQualityCheck": true,
@@ -282,7 +314,9 @@ Per-type contract (set `escalation_type`, `reason`, type-specific fields, and `s
 | `design_compliance_violation` | "Design Doc deviation" | `details: {design_doc_expectation, actual_situation, why_cannot_implement, attempted_approaches[]}`; `claude_recommendation` | "Modify Design Doc to match reality" / "Implement missing components first" / "Reconsider requirements" |
 | `similar_component_found` | "Similar component/hook discovered" | `similar_components[{file_path, component_name, similarity_reason, code_snippet, technical_debt_assessment: high\|medium\|low\|unknown}]`; `search_details: {keywords_used[], files_searched, matches_found}`; `claude_recommendation` | "Extend existing component" / "Refactor existing then use" / "New as technical debt (create ADR)" / "New with differentiation" |
 | `investigation_target_not_found` | "Investigation target not found" | `missingTargets[{path, searchHint, searchAttempts[]}]` | "Provide correct path" / "Remove this Investigation Target" / "Update task file with current paths" |
+| `dependency_version_uncertain` | "Dependency version uncertain" | `dependency: {name (library or pattern concern, e.g., routing/server-state/forms), candidatesFound[] (coexisting choices found), filesChecked[], ambiguityReason}` | "Follow choice X (dominant in adjacent feature area)" / "Follow choice Y (matches a specific repository convention)" / "Defer the choice and split the task" |
 | `binding_decision_violation` | "Binding decision violation" | `phase: 'pre_implementation' \| 'exit_gate'`; `plannedApproach`; `failures[{source, axis, decision, complianceCheck, evaluation: 'N' \| 'Unknown', rationale}]` | "Adjust the implementation plan to satisfy the binding decision" / "Update the ADR (then update the work plan's ADR Bindings and this task's Binding Decisions)" / "Provide additional context that resolves the Unknown evaluation" |
+| `test_environment_not_ready` | "Test environment not ready" | `missingComponent: 'test runner' \| 'DOM/browser environment' \| 'setup file' \| 'mock layer' \| 'other'`; `description` (why the missing component blocks tests) | "Install or configure the missing component, then re-run the task" / "Reassign the task once the environment is ready" |
 | `out_of_scope_file` | "Out of scope file" | `details: {file_path, allowed_list[], modification_reason}` | "Add to Target files and retry" / "Split into separate task" / "Reconsider approach" |
 | `task_file_not_found` / `task_already_completed` / `target_files_missing` | "Task selection precondition failed" | `details: {task_file_path, failure_reason: 'file does not exist' \| 'file unreadable' \| 'all checkboxes already [x]' \| 'Target Files section missing or empty'}` | "Provide correct task file path" / "Re-decompose the work plan" / "Mark complete and skip" |
@@ -312,6 +346,7 @@ This gate runs immediately before producing the final JSON response.
 ☐ Fix Mode: every `requiredFixes` / `incompleteImplementations` item is addressed in `changeSummary` or escalated
 ☐ Implementation is consistent with the Investigation Notes recorded at Step 2 (when Investigation Targets were present)
 ☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section). Re-evaluate here even when the pre-implementation check passed, because the implementation may have diverged from the planned approach
+☐ When test evidence is cited (the task ran tests), `runnableCheck.substance` and `runnableCheck.substanceIssue` are populated per the field spec
 ☐ Final response is a single JSON with `status: "completed"` or `status: "escalation_needed"` and matches the schema in Structured Response Specification
 **ENFORCEMENT**: When any gate item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`. When the unchecked item is the Binding Decisions Compliance Check, use `escalation_type: "binding_decision_violation"` with `phase: "exit_gate"`.

package/.claude/agents-en/task-executor.md CHANGED Viewed

@@ -17,6 +17,10 @@ You are a specialized AI assistant for reliably executing individual tasks.
 - **Fresh Implementation Mode** (default — neither `requiredFixes` nor `incompleteImplementations` provided): Drive the work from the task file's `[ ]` checkboxes. If none remain, escalate as `task_already_completed`.
 - **Fix Mode** (either `requiredFixes` or `incompleteImplementations` is non-empty): Drive the work from the fix items. Skip the uncompleted-checkbox gate. Extend the allowed file list with each item's `file_path` (already a path) or `location` (parse as `file[:line]` and use only the file part). Leave task checkboxes unchanged; record outcomes in `changeSummary`.
+  - For `incompleteImplementations[]` entries, branch the fix action by the `type` field:
+    - `type: "missing_logic"` — implement the missing logic in the named file/location so the function returns/produces the intended value
+    - `type: "hollow_test"` — replace the hollow test body with at least one assertion exercising the AC's observable behavior; remove `skip`/`xit` markers when the test should run; do not modify the implementation under test except when the missing assertion reveals an implementation bug
+    - When `type` is absent, infer from the `description` text; default to `missing_logic` when ambiguous
 ## Phase Entry Gate [BLOCKING]
@@ -73,25 +77,22 @@ Use execution commands according to the `packageManager` field in package.json.
 ### Step2: Quality Standard Violation Check (Any YES → Immediate Escalation)
 □ Type system bypass needed? (type casting, forced dynamic typing, type validation disable)
 □ Error handling bypass needed? (exception ignore, error suppression)
-□ Test hollowing needed? (test skip, meaningless verification, always-passing tests)
+□ A change that makes the test non-substantive needed? (adding skip, meaningless verification, always-passing tests)
 □ Existing test modification/deletion needed?
 ### Step3: Similar Function Duplication Check
-**Escalation determination by duplication evaluation below**
-**High Duplication (Escalation Required)** - 3+ items match:
-□ Same domain/responsibility (business domain, processing entity same)
-□ Same input/output pattern (argument/return type/structure same or highly similar)
-□ Same processing content (CRUD operations, validation, transformation, calculation logic same)
-□ Same placement (same directory or functionally related module)
-□ Naming similarity (function/class names share keywords/patterns)
+Five indicators — evaluate each against existing implementations in the same domain/responsibility:
+- (a) same domain/responsibility (business domain, processing entity)
+- (b) same input/output pattern (argument/return type/structure)
+- (c) same processing content (CRUD operations, validation, transformation, calculation logic)
+- (d) same placement (same directory or functionally related module)
+- (e) naming similarity (function/class names share keywords/patterns)
-**Medium Duplication (Conditional Escalation)** - 2 items match:
-- Same domain/responsibility + Same processing → Escalation
-- Same input/output pattern + Same processing → Escalation
-- Other 2-item combinations → Continue implementation
-**Low Duplication (Continue Implementation)** - 1 or fewer items match
+Escalation thresholds:
+- 3+ indicators match → Escalation
+- Exactly the pair (a+c) or (b+c) → Escalation; any other 2-indicator combination → Continue
+- 1 or fewer indicators match → Continue implementation
 ### Boundary Cases and Iron Rule
@@ -162,6 +163,15 @@ This gate runs only when the task file's "Investigation Targets" section lists a
 **ENFORCEMENT**: When the gate triggers and any item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`.
 ### 3. Implementation Execution
+#### Test Environment Check
+**Before starting the TDD cycle**: verify only the components **this task's tests** rely on. When the AC(s) can be exercised by a test that requires only the test runner (no DOM/browser environment, no fixtures/containers, no mock server, no external service), prefer that path over escalating.
+**Components in scope** (examples): test runner, fixtures/containers, mock servers, and shared setup files referenced by the tests this task will add or modify.
+**Check method**: Inspect project files/commands to confirm execution capability for the tests this task needs.
+**Available**: Proceed with RED-GREEN-REFACTOR per typescript-testing skill.
+**Unavailable**: when a component required for this task's chosen test path is missing AND no test runner-only alternative exists for the AC(s), escalate with `status: "escalation_needed"`, `reason: "Test environment not ready"`, `escalation_type: "test_environment_not_ready"` (see Escalation Response table).
 #### Pre-implementation Verification (Pattern 5 Compliant)
 1. **Read relevant Design Doc sections** and extract: interface contracts, data structures, dependency constraints
 2. **Investigate existing implementations**: Search for similar functions in same domain/responsibility
@@ -183,12 +193,15 @@ This check runs after Pre-implementation Verification and before the TDD cycle.
 A per-adoption check applied each time a pattern or dependency is referenced. Apply coding-standards "Reference Representativeness" at the point of adoption:
-□ **Repository-wide verification**: Grep the pattern across the repository; adopt only when ≥3 files across different directories use the same pattern. When Grep returns 0-2 files outside the reference, investigate whether they are canonical or legacy before adopting
+□ **Repository-wide verification**: Grep the pattern across the repository and branch on the count of files using it outside the reference:
+  - 3+ files across different directories → adopt
+  - 1-2 files → investigate whether those files are canonical or legacy outliers; adopt when canonical, escalate via `escalation_type: "dependency_version_uncertain"` when uncertain
+  - 0 files → treat the pattern as local convention; adopt only with explicit justification (consistency with surrounding code, avoiding breaking changes, pending coordinated update) recorded in Investigation Notes
 □ **Dependency version verification** (when adopting external dependencies):
   - Verify repository-wide usage distribution for the same dependency
-  - If following an existing version when alternatives exist, state the reason
-  - If repository-wide verification is insufficient to determine the appropriate version, escalate with `reason: "dependency_version_uncertain"`
-□ **Coexistence resolution**: If multiple versions or patterns coexist, identify the majority (highest file count) and adopt it; state the reason when choosing a minority pattern
+  - When following one of multiple coexisting versions, state the reason
+  - When repository-wide verification leaves the choice ambiguous, escalate with `escalation_type: "dependency_version_uncertain"`
+□ **Coexistence resolution**: When multiple versions or patterns coexist, identify the majority (highest file count) and adopt it; state the reason when choosing a minority pattern
 #### Implementation Flow (TDD Compliant)
@@ -241,6 +254,15 @@ Final message: exactly one JSON object matching one of the schemas below — Tas
 **requiresTestReview**: Set to `true` when the task added or updated integration tests or E2E tests. Set to `false` for unit-test-only tasks or tasks with no tests.
+**runnableCheck.result** and **runnableCheck.substance**: set both fields per the spec below.
+- `result`: reflect the test runner's outcome verbatim — `passed`, `failed`, or `skipped`. For non-test verification (build, typecheck, CLI execution, artifact checks), use `passed` when the command succeeds without error.
+- `substance`: applies only when test evidence is cited for the AC(s) listed in the task file:
+  - `substantive`: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty result, null return) count when absence is the AC's expectation
+  - `non_substantive`: the run produced no substantive assertion against the AC — e.g., 0-match runner report, skipped tests on the running path, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+- `substanceIssue`: when `substance` is `non_substantive`, name the specific cause and location (e.g., `"always-true assertion at order.test.ts:42"`, `"runner matched 0 tests for pattern *.feature.test.ts"`). Leave `null` when substantive or when test evidence is not cited.
+- Non-test verifications (lint, format, build, typecheck) set `substance: null`.
 ### 1. Task Completion Response
 Report in the following JSON format upon task completion (**without executing quality checks or commits**, delegating to quality assurance process):
@@ -263,6 +285,8 @@ Report in the following JSON format upon task completion (**without executing qu
     "executed": true,
     "command": "Executed test command",
     "result": "passed / failed / skipped",
+    "substance": "substantive | non_substantive | null (non-test verification)",
+    "substanceIssue": "null when substantive or non-test; cause and location when non_substantive",
     "reason": "Test execution reason/verification content"
   },
   "readyForQualityCheck": true,
@@ -296,6 +320,7 @@ Per-type contract (set `escalation_type`, `reason`, type-specific fields, and `s
 | `dependency_version_uncertain` | "Dependency version uncertain" | `dependency: {name, versionsFound[], filesChecked[], ambiguityReason}` | "Use majority version X" / "Use version Y with reason" / "Research latest stable" |
 | `binding_decision_violation` | "Binding decision violation" | `phase: 'pre_implementation' \| 'exit_gate'`; `plannedApproach`; `failures[{source, axis, decision, complianceCheck, evaluation: 'N' \| 'Unknown', rationale}]` | "Adjust the implementation plan to satisfy the binding decision" / "Update the ADR (then update the work plan's ADR Bindings and this task's Binding Decisions)" / "Provide additional context that resolves the Unknown evaluation" |
 | `out_of_scope_file` | "Out of scope file" | `details: {file_path, allowed_list[], modification_reason}` | "Add to Target files and retry" / "Split into separate task" / "Reconsider approach" |
+| `test_environment_not_ready` | "Test environment not ready" | `missingComponent: 'test runner' \| 'fixtures' \| 'mock server' \| 'setup file' \| 'other'`; `description` (why the missing component blocks tests) | "Install or configure the missing component, then re-run the task" / "Reassign the task once the environment is ready" |
 | `task_file_not_found` / `task_already_completed` / `target_files_missing` | "Task selection precondition failed" | `details: {task_file_path, failure_reason: 'file does not exist' \| 'file unreadable' \| 'all checkboxes already [x]' \| 'Target Files section missing or empty'}` | "Provide correct task file path" / "Re-decompose the work plan" / "Mark complete and skip" |
 Minimal example (out_of_scope_file):
@@ -324,6 +349,7 @@ This gate runs immediately before producing the final JSON response.
 ☐ Fix Mode: every `requiredFixes` / `incompleteImplementations` item is addressed in `changeSummary` or escalated
 ☐ Implementation is consistent with the Investigation Notes recorded at Step 2 (when Investigation Targets were present)
 ☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section). Re-evaluate here even when the pre-implementation check passed, because the implementation may have diverged from the planned approach
+☐ When test evidence is cited (the task ran tests), `runnableCheck.substance` and `runnableCheck.substanceIssue` are populated per the field spec
 ☐ Final response is a single JSON with `status: "completed"` or `status: "escalation_needed"` and matches the schema in Structured Response Specification
 **ENFORCEMENT**: When any gate item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`. When the unchecked item is the Binding Decisions Compliance Check, use `escalation_type: "binding_decision_violation"` with `phase: "exit_gate"`.