create-ai-project 1.22.1 → 1.23.1

Files changed (46)
  1. package/.claude/agents-en/code-reviewer.md +9 -53
  2. package/.claude/agents-en/code-verifier.md +3 -22
  3. package/.claude/agents-en/document-reviewer.md +14 -69
  4. package/.claude/agents-en/integration-test-reviewer.md +6 -0
  5. package/.claude/agents-en/quality-fixer-frontend.md +47 -31
  6. package/.claude/agents-en/quality-fixer.md +40 -25
  7. package/.claude/agents-en/task-decomposer.md +31 -0
  8. package/.claude/agents-en/task-executor-frontend.md +64 -15
  9. package/.claude/agents-en/task-executor.md +59 -19
  10. package/.claude/agents-en/technical-designer-frontend.md +32 -9
  11. package/.claude/agents-en/technical-designer.md +0 -9
  12. package/.claude/agents-en/ui-analyzer.md +313 -0
  13. package/.claude/agents-en/ui-spec-designer.md +3 -1
  14. package/.claude/agents-en/work-planner.md +26 -1
  15. package/.claude/agents-ja/code-reviewer.md +9 -53
  16. package/.claude/agents-ja/code-verifier.md +3 -22
  17. package/.claude/agents-ja/document-reviewer.md +14 -69
  18. package/.claude/agents-ja/integration-test-reviewer.md +6 -0
  19. package/.claude/agents-ja/quality-fixer-frontend.md +47 -31
  20. package/.claude/agents-ja/quality-fixer.md +40 -25
  21. package/.claude/agents-ja/task-decomposer.md +31 -0
  22. package/.claude/agents-ja/task-executor-frontend.md +66 -17
  23. package/.claude/agents-ja/task-executor.md +60 -20
  24. package/.claude/agents-ja/technical-designer-frontend.md +32 -9
  25. package/.claude/agents-ja/technical-designer.md +0 -9
  26. package/.claude/agents-ja/ui-analyzer.md +313 -0
  27. package/.claude/agents-ja/ui-spec-designer.md +3 -1
  28. package/.claude/agents-ja/work-planner.md +26 -1
  29. package/.claude/commands-en/build.md +9 -7
  30. package/.claude/commands-en/design.md +70 -44
  31. package/.claude/commands-en/front-build.md +9 -7
  32. package/.claude/commands-en/front-design.md +87 -58
  33. package/.claude/commands-ja/build.md +9 -7
  34. package/.claude/commands-ja/design.md +69 -43
  35. package/.claude/commands-ja/front-build.md +9 -7
  36. package/.claude/commands-ja/front-design.md +95 -64
  37. package/.claude/skills-en/documentation-criteria/references/design-template.md +1 -1
  38. package/.claude/skills-en/documentation-criteria/references/plan-template.md +16 -4
  39. package/.claude/skills-en/documentation-criteria/references/task-template.md +11 -1
  40. package/.claude/skills-en/subagents-orchestration-guide/SKILL.md +4 -2
  41. package/.claude/skills-ja/documentation-criteria/references/design-template.md +1 -1
  42. package/.claude/skills-ja/documentation-criteria/references/plan-template.md +16 -4
  43. package/.claude/skills-ja/documentation-criteria/references/task-template.md +11 -1
  44. package/.claude/skills-ja/subagents-orchestration-guide/SKILL.md +4 -2
  45. package/CHANGELOG.md +29 -0
  46. package/package.json +1 -1
@@ -96,11 +96,16 @@ For each function/method in implementation files, check against coding-standards
  #### 3-2. Error Handling
  - Grep for error handling patterns (try/catch, error returns, Result types — adapt to project language)
  - For each entry point: verify error cases are handled, not silently swallowed
- - Check error responses do not leak internal details
+ - Check that error responses redact internal details (stack traces, internal paths, PII)
 
  #### 3-3. Test Coverage for Acceptance Criteria
  - For each AC marked fulfilled: Glob/Grep for corresponding test cases
  - Record which ACs have test coverage and which do not
+ - **Substance verification per cited test**:
+ - When applies: a test is claimed as coverage for an AC marked fulfilled
+ - Counts as coverage: the test body executes at least one assertion that exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty list, null result) count when absence is the AC's expectation
+ - Non-substantive examples: `skip`/`xit` left on a test that should run, TODO-only or placeholder body, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+ - Action on non-substantive: record as `coverage_gap` with rationale citing the AC reference and the specific substance issue (file:line)
 
  #### Finding Classification
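Reviewer note: the substance rules added in this hunk could be approximated mechanically. The sketch below is a rough first-pass filter, assuming test sources are available as strings; `classifyTestSource`, `SubstanceIssue`, and the regex heuristics are hypothetical names and choices, not part of the package. It cannot tell whether an assertion actually exercises the AC's observable behavior, so a human or agent read of the test body is still needed.

```typescript
// Illustrative sketch only: names and regex heuristics are assumptions,
// not part of create-ai-project.
type SubstanceIssue = "skipped" | "todo_only" | "always_true" | "no_assertion";

interface SubstanceResult {
  substantive: boolean;
  issue: SubstanceIssue | null;
}

// The two always-true assertion shapes quoted in the hunk.
const ALWAYS_TRUE: RegExp[] = [
  /expect\(\s*true\s*\)\s*\.toBe\(\s*true\s*\)/,
  /expect\(\s*[\w.$]+\.length\s*\)\s*\.toBeGreaterThanOrEqual\(\s*0\s*\)/,
];

function classifyTestSource(source: string): SubstanceResult {
  // A leftover skip/xit marker means the assertions never execute.
  if (/\b(?:it|test|describe)\.skip\s*\(|\bxit\s*\(/.test(source)) {
    return { substantive: false, issue: "skipped" };
  }
  const expectLines = source.split("\n").filter((line) => line.includes("expect("));
  if (expectLines.length === 0) {
    const issue: SubstanceIssue = /\bTODO\b/.test(source) ? "todo_only" : "no_assertion";
    return { substantive: false, issue };
  }
  // One assertion that is not trivially true is enough; absence checks such as
  // toHaveLength(0) are not in ALWAYS_TRUE, so they count as substantive.
  const hasRealAssertion = expectLines.some(
    (line) => !ALWAYS_TRUE.some((pattern) => pattern.test(line)),
  );
  return hasRealAssertion
    ? { substantive: true, issue: null }
    : { substantive: false, issue: "always_true" };
}
```

A non-substantive result would then be recorded as a `coverage_gap` finding, with the returned `issue` feeding the rationale.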
 
@@ -201,7 +206,7 @@ summary.findingsByCategory.reliability: number (integer >= 0)
  summary.findingsByCategory.coverage_gap: number (integer >= 0)
  ```
 
- ### Example (concrete values, illustrative only)
+ ### Minimal Shape Example
 
  ```json
  {
@@ -220,25 +225,8 @@ summary.findingsByCategory.coverage_gap: number (integer >= 0)
  "suggestion": null
  }
  ],
- "identifierVerification": [
- {
- "identifier": "AUTH_TOKEN_TTL",
- "designDocValue": "3600",
- "codeValue": "1800",
- "location": "src/auth/config.ts:8",
- "match": false
- }
- ],
- "qualityFindings": [
- {
- "category": "reliability",
- "location": "src/auth/login.ts:55",
- "description": "Error from token signer is swallowed silently",
- "rationale": "When jwt.sign throws, the catch block returns null without logging; downstream sees auth failure indistinguishable from invalid credentials",
- "evidence_source": "Read confirmed empty catch at src/auth/login.ts:55-58",
- "suggestion": "Re-throw with context or log error then propagate to caller"
- }
- ],
+ "identifierVerification": [{"identifier": "AUTH_TOKEN_TTL", "designDocValue": "3600", "codeValue": "1800", "location": "src/auth/config.ts:8", "match": false}],
+ "qualityFindings": [{"category": "reliability", "location": "src/auth/login.ts:55", "description": "Error from token signer is swallowed silently", "rationale": "When jwt.sign throws, the catch block returns null without logging; downstream sees auth failure indistinguishable from invalid credentials", "evidence_source": "Read confirmed empty catch at src/auth/login.ts:55-58", "suggestion": "Re-throw with context or log error then propagate to caller"}],
  "summary": {
  "acsTotal": 12,
  "acsFulfilled": 10,
@@ -265,25 +253,6 @@ summary.findingsByCategory.coverage_gap: number (integer >= 0)
 
  Identifier mismatches automatically lower the verdict by one level (e.g., pass → needs-improvement) when any mismatch is found.
 
- ## Review Principles
-
- 1. **Maintain Objectivity**
- - Evaluate independent of implementation context
- - Use Design Doc as single source of truth
-
- 2. **Evidence-Based Judgment**
- - Every finding must cite specific file:line locations
- - Every status determination must include the tool name and result that produced it (e.g., "Grep found X at file:line", "Read confirmed function signature at file:line")
- - Low-confidence determinations must be explicitly noted
-
- 3. **Quantitative Assessment**
- - Quantify wherever possible
- - Eliminate subjective judgment
-
- 4. **Constructive Feedback**
- - Provide solutions, not just problems
- - Clarify priorities via category classification
-
  ## Completion Criteria
 
  - [ ] All acceptance criteria individually evaluated with confidence levels
@@ -311,16 +280,3 @@ Recommend higher-level review when:
  - Critical performance issues found
  - Implementation introduces in-scope elements absent from the Design Doc's Minimal Surface Alternatives section. The in-scope set is context-specific: for backend, persistent state, public-contract elements (exported types, API fields, function signatures, schema definitions), fields crossing module/service boundaries, behavioral modes/flags, or reusable abstractions; for frontend, persistent client/server state, public API props of exported reusable components, Context values, state lifted across ownership boundaries, behavioral modes/variants that change observable behavior, or reusable component splits (sub-components, custom hooks, or utilities for multi-parent use). Ordinary parent→child prop passes within one ownership boundary and local component state are out of scope.
 
- ## Special Considerations
-
- ### For Prototypes/MVPs
- - Prioritize functionality over completeness
- - Consider future extensibility
-
- ### For Refactoring
- - Maintain existing functionality as top priority
- - Quantify improvement degree
-
- ### For Emergency Fixes
- - Verify minimal implementation solves problem
- - Check technical debt documentation
@@ -184,7 +184,7 @@ coverage.unimplemented: string[] (documented specs not yet implemented)
  limitations: string[] (what could not be verified and why)
  ```
 
- Example (concrete values, illustrative only):
+ Minimal shape example:
 
  ```json
  {
@@ -196,11 +196,7 @@ Example (concrete values, illustrative only):
  "consistencyScore": 78,
  "status": "mostly_consistent"
  },
- "claimCoverage": {
- "sectionsAnalyzed": 9,
- "sectionsWithClaims": 8,
- "sectionsWithZeroClaims": ["Future Work"]
- },
+ "claimCoverage": { "sectionsAnalyzed": 9, "sectionsWithClaims": 8, "sectionsWithZeroClaims": ["Future Work"] },
  "discrepancies": [
  {
  "id": "D001",
@@ -227,11 +223,7 @@ Example (concrete values, illustrative only):
  "undocumentedDataOperations": ["sessions table SELECT (src/auth/repo.ts:42)"],
  "testBoundariesSectionPresent": true
  },
- "coverage": {
- "documented": ["login flow", "token refresh"],
- "undocumented": ["session deletion endpoint"],
- "unimplemented": ["MFA challenge response"]
- },
+ "coverage": { "documented": ["login flow", "token refresh"], "undocumented": ["session deletion endpoint"], "unimplemented": ["MFA challenge response"] },
  "limitations": ["Could not verify token refresh against running redis instance"]
  }
  ```
@@ -261,17 +253,6 @@ consistencyScore = (matchCount / verifiableClaimCount) * 100
 
  **Score stability rule**: If `verifiableClaimCount < 20`, the score is unreliable. Return to Step 1 and extract additional claims before finalizing. This prevents shallow verification from producing artificially high scores.
 
- ## Completion Criteria
-
- - [ ] Extracted claims section-by-section with per-section counts recorded
- - [ ] `verifiableClaimCount >= 20` (if not, re-extracted from under-covered sections)
- - [ ] Collected evidence from multiple sources for each claim
- - [ ] Classified each claim (match/drift/gap/conflict)
- - [ ] Performed reverse coverage: routes enumerated via Grep, test files enumerated via Glob, exports enumerated via Grep, data operations enumerated via Grep
- - [ ] Identified undocumented features from reverse coverage
- - [ ] Identified unimplemented specifications
- - [ ] Calculated consistency score
-
  ## Self-Validation [BLOCKING — before output]
 
  Run each item below before producing the final JSON. When any item is unsatisfied, return to the relevant Step and complete it before producing the JSON output.
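Reviewer note: the score formula in this hunk's header and the score stability rule fit in a few lines. This is a minimal sketch, assuming the two counts are already known; `computeConsistencyScore`, `ScoreResult`, and `MIN_VERIFIABLE_CLAIMS` are hypothetical names, and the handling of a zero claim count is an assumption the diff does not spell out.

```typescript
// Illustrative sketch only: names are assumptions; the formula and the
// threshold of 20 come from the diff above.
const MIN_VERIFIABLE_CLAIMS = 20;

interface ScoreResult {
  score: number | null; // null when no verifiable claims exist (assumed behavior)
  reliable: boolean; // false means: return to Step 1 and extract more claims
}

function computeConsistencyScore(matchCount: number, verifiableClaimCount: number): ScoreResult {
  if (verifiableClaimCount <= 0) {
    // Avoid dividing by zero; there is nothing to score yet.
    return { score: null, reliable: false };
  }
  const score = (matchCount / verifiableClaimCount) * 100;
  return { score, reliable: verifiableClaimCount >= MIN_VERIFIABLE_CLAIMS };
}
```

For instance, 15 matches out of 20 verifiable claims yields a reliable score, while 5 out of 10 is flagged as unreliable whatever the ratio.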
@@ -17,16 +17,6 @@ You are an AI assistant specialized in technical document review.
  - Apply project-context skill for project context
  - Apply typescript-rules skill for code example verification
 
- ## Responsibilities
-
- 1. Check consistency between documents
- 2. Verify compliance with rule files
- 3. Evaluate completeness and quality
- 4. Provide improvement suggestions
- 5. Determine approval status
- 6. **Verify sources of technical claims and cross-reference with latest information**
- 7. **Implementation Sample Standards Compliance**: MUST verify all implementation examples strictly comply with typescript-rules skill standards without exception
-
  ## Input Parameters
 
  - **mode**: Review perspective (optional)
@@ -44,17 +34,6 @@ You are an AI assistant specialized in technical document review.
  - When provided, use `focusAreas` as the canonical source for Fact Disposition coverage checks
  - When absent, mark focusArea completeness as unverifiable for this review
 
- ## Review Modes
-
- ### Composite Perspective Review (composite) - Recommended
- **Purpose**: Multi-angle verification in one execution
- **Parallel verification items**:
- 1. **Structural consistency**: Inter-section consistency, completeness of required elements
- 2. **Implementation consistency**: Code examples MUST strictly comply with typescript-rules skill standards, interface definition alignment
- 3. **Completeness**: Comprehensiveness from acceptance criteria to tasks, clarity of integration points
- 4. **Common ADR compliance**: Coverage of common technical areas, appropriateness of references
- 5. **Failure scenario review**: Coverage of scenarios where the design could fail
-
  ## Workflow
 
  ### Step 0: Input Context Analysis (MANDATORY)
@@ -67,6 +46,7 @@ You are an AI assistant specialized in technical document review.
 
  ### Step 1: Parameter Analysis
  - Confirm mode is `composite` or unspecified
+ - Both `composite` and unspecified select the **Comprehensive Review Mode** (Gate 1 below) and produce `review_mode: comprehensive`; use the Perspective-specific Mode only when the caller explicitly requests a single focus
  - Specialized verification based on doc_type
  - For DesignDoc: Verify "Applicable Standards" section exists with explicit/implicit classification
  - Missing or incomplete → `critical` issue; implicit standards without confirmation → `important` issue
@@ -97,6 +77,8 @@ For DesignDoc, additionally verify:
  - Consistency check: Detect contradictions between documents
  - Completeness check: Confirm depth and coverage of required elements
  - Rule compliance check: Compatibility with project rules
+ - Implementation sample compliance: Verify code examples comply with typescript-rules skill standards
+ - Common ADR compliance: Verify common technical areas are covered by appropriate ADR references
  - Feasibility check: Technical and resource perspectives
  - Assessment consistency check: Verify alignment between scale assessment and document requirements
  - Rationale verification: Design decision rationales must reference identified standards or existing patterns; unverifiable rationale → `important` issue
@@ -142,15 +124,16 @@ For each actionable item extracted in Step 0 (skip if `prior_context_count: 0`):
  3. Classify: `resolved` / `partially_resolved` / `unresolved`
  4. Record evidence (what changed or didn't)
 
- ### Step 5: Self-Validation (MANDATORY before output)
+ ### Step 5: Self-Validation [BLOCKING before output]
 
- Checklist:
- - [ ] Step 0 completed (prior_context_count recorded)
- - [ ] If prior_context_count > 0: Each item has resolution status
- - [ ] If prior_context_count > 0: `prior_context_check` object prepared
- - [ ] Output is valid JSON
+ Run each item below before producing the final JSON. When any item is unsatisfied, return to the relevant Step and complete it before output.
 
- Complete all items before proceeding to output.
+ - [ ] Step 0 completed (prior_context_count recorded)
+ - [ ] If prior_context_count > 0: each item has a resolution status and the `prior_context_check` object is prepared
+ - [ ] Gate 0 structural existence checks completed for the doc_type
+ - [ ] Gate 1 quality checks completed — including every conditional check that applied: Fact Disposition completeness when `codebase_analysis` is provided, Minimal Surface Alternatives when the design introduces in-scope elements, Verification Strategy quality when that section exists, code-verification integration when `code_verification` is provided
+ - [ ] Every issue carries `id`, `severity`, `category`, and a specific, actionable `suggestion`
+ - [ ] Output is valid JSON matching the Output Protocol schema
 
  ### Step 6: Return JSON Result
  - Use the JSON schema according to review mode (comprehensive or perspective-specific)
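Reviewer note: the new Step 5 item requiring every issue to carry `id`, `severity`, `category`, and `suggestion` could be spot-checked with a helper like the one below. It is a sketch under assumptions: `missingIssueFields` is a hypothetical name, the fields are assumed to be strings, and whether a suggestion is "specific, actionable" is not something this check can judge.

```typescript
// Illustrative sketch only: the field list comes from the Step 5 checklist;
// the helper name and the string-typed fields are assumptions.
const REQUIRED_ISSUE_FIELDS = ["id", "severity", "category", "suggestion"] as const;

function missingIssueFields(issue: Record<string, unknown>): string[] {
  // A field counts as missing when it is absent, not a string, or blank.
  return REQUIRED_ISSUE_FIELDS.filter((field) => {
    const value = issue[field];
    return typeof value !== "string" || value.trim() === "";
  });
}
```

An empty array means the presence check passes for that issue; any returned names point at the Step to revisit before output.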
@@ -201,7 +184,7 @@ Final message: exactly one JSON object matching the schema below (begins with `{
  {
  "id": "I001",
  "severity": "critical",
- "category": "implementation",
+ "category": "consistency",
  "location": "Section 3.2",
  "description": "FileUtil method mismatch",
  "suggestion": "Update document to reflect actual FileUtil usage"
@@ -266,32 +249,6 @@ Include in output when `prior_context_count > 0`:
  }
  ```
 
- ## Review Checklist (for Comprehensive Mode)
-
- - [ ] Match of requirements, terminology, numbers between documents
- - [ ] Completeness of required elements in each document
- - [ ] Compliance with project rules
- - [ ] Technical feasibility and reasonableness of estimates
- - [ ] Clarification of risks and countermeasures
- - [ ] Consistency with existing systems
- - [ ] Fulfillment of approval conditions
- - [ ] Verification of sources for technical claims and consistency with latest information
- - [ ] Failure scenario coverage
- - [ ] Complexity justification: If complexity_level is medium/high, complexity_rationale must specify (1) requirements/ACs necessitating the complexity, (2) constraints/risks it addresses
- - [ ] Gate 0 structural existence checks pass before quality review
- - [ ] Design decision rationales verified against identified standards/patterns
- - [ ] Code inspection evidence covers files relevant to design scope
- - [ ] Dependencies described as "existing" verified against codebase (Grep/Glob)
- - [ ] Field propagation map present when fields cross component boundaries
- - [ ] Data-related keywords present → data design content exists (schema references, Test Boundaries, or data model documentation; or explicitly marked N/A)
- - [ ] Code verification results (if provided) reconciled with document content
- - [ ] Verification Strategy present with concrete correctness definition and early verification point
- - [ ] Verification Strategy aligns with design_type and implementation approach
- - [ ] Output comparison defined when design replaces/modifies existing behavior (covers all transformation pipeline steps)
- - [ ] Fact Disposition Table covers every `codebase_analysis.focusAreas` entry with verbatim `fact_id` / `evidence` carry-through and rationale-disposition semantic alignment (when `codebase_analysis` is provided)
- - [ ] Cross-Layer Assumptions section present when `prior_layer_verification` shows unresolved contracts the design depends on
- - [ ] Minimal Surface Alternatives section covers every new in-scope element with the 5-step output; Step 4 rationale either names the smallest alternative as selected, or names the current requirement smaller alternatives fail to cover (when the design introduces any in-scope elements)
-
  ## Review Criteria (for Comprehensive Mode)
 
  ### Approved
@@ -353,21 +310,9 @@ Template storage locations follow documentation-criteria skill.
  - `[technology] deprecation`, `[technology] security vulnerability`
  - Check release notes of official repositories
 
- ## Important Notes
-
- ### Regarding ADR Status Updates
- **Important**: This agent only performs review and recommendation decisions. Actual status updates are made after the user's final decision.
-
- **Presentation of Review Results**:
- - Present decisions such as "Approved (recommendation for approval)" or "Rejected (recommendation for rejection)"
+ ### ADR Status Scope
 
- **ADR Status Recommendations by Verdict**:
- | Verdict | Recommended Status |
- |---------|-------------------|
- | Approved | Proposed → Accepted |
- | Approved with Conditions | Accepted (after conditions met) |
- | Needs Revision | Remains Proposed |
- | Rejected | Rejected (with documented reasons) |
+ For ADRs, verdict is advisory only; the caller or user decides status changes.
 
  ### Strict Adherence to Output Format
 
@@ -63,6 +63,7 @@ Verify the following for each test case:
  | Independence | Isolated state per test (reset in beforeEach) | Shared state modified across tests |
  | Reproducibility | Deterministic execution (mock time/random sources when needed) | Non-deterministic elements present |
  | Readability | Test name matches verification content | Name and content diverge |
+ | Substantive Assertion | At least one executed assertion observes the AC's behavior; intentional-absence assertions (e.g., `toHaveLength(0)`, `toBeNull()`) count when absence is the AC's expectation | TODO-only body, `skip`/`xit` left on a test that should run, always-true assertion (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`) |
 
  ### 4. Mock Boundary Check (Integration Tests Only)
 
@@ -197,6 +198,11 @@ When needs_revision decision, output fix instructions usable in subsequent proce
  - Verify execution timing: AFTER all components are implemented
  - Verify critical user journey coverage is COMPLETE
 
+ ### Hollow or Placeholder Assertion
+
+ **Issue**: The test reads as passing but does not verify the AC's observable behavior — always-true assertion, TODO-only body, or leftover `skip`/`xit` marker on a test that should run.
+ **Fix**: Replace with an assertion that observes the AC's behavior; remove `skip`/`xit` markers when the test should run. When the AC's expectation is genuine absence, use an explicit absence assertion (`queryAllBy*`+`toHaveLength(0)`, `toBeNull()`).
+
  ## Completion Criteria
 
  - [ ] All skeleton comments verified against implementation
@@ -25,7 +25,8 @@ Executes quality checks and provides a state where all checks complete with zero
  ## Input Parameters
 
  - **task_file** (optional): Path to the task file being verified. When provided, read the "Quality Assurance Mechanisms" section and use listed mechanisms as supplementary hints for quality check discovery. This is a hint — primary detection remains code, manifest, and configuration-based.
- - **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task (provided by the orchestrator). Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+ - **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task. Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+ - **runnableCheck** (optional): Test execution evidence from the upstream implementation step. When provided, serves as the primary input for the Substance check (Step 3). Schema: `{ level, executed, command, result: 'passed'|'failed'|'skipped', substance: 'substantive'|'non_substantive'|null, substanceIssue: string|null, reason }`. When absent, the agent self-scans test bodies within scope for substance determination.
 
  ## Initial Required Tasks
 
@@ -82,6 +83,14 @@ Follow frontend-technical-spec skill "Quality Check Requirements" section:
  - Basic checks (lint, format, build)
  - Tests (unit, integration, React Testing Library)
  - Final gate (all must pass)
+ - Substance check (test evidence only):
+ - When applies: a test run is cited as evidence for the AC(s) listed in the task file
+ - Inputs: when the `runnableCheck` input parameter is provided, read its `substance` and `substanceIssue` fields as the primary signal; otherwise self-scan test bodies within scope
+ - Counts as substantive: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., `expect(screen.queryAllByRole(...)).toHaveLength(0)`, `expect(value).toBeNull()`) count when absence is the AC's expectation
+ - Non-substantive examples: 0-match runner reports, skipped tests on running paths, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+ - Recovery within fixer scope: remove `skip`/`only` markers, widen test selectors, or run additional related test files
+ - If substance still cannot be achieved by fixer-level changes: return `stub_detected` with the hollow test files in `incompleteImplementations[]`, each entry carrying `type: "hollow_test"` and a `description` citing the AC reference and the substance issue (see Output Format)
+ - Scope: lint, format, build, and typecheck runs are exempt from this rule
 
  ### Step 4: Fix Errors
  Apply fixes per frontend-typescript-rules and frontend-typescript-testing skills.
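Reviewer note: the `runnableCheck` input and its role as primary signal for the Substance check can be modelled as a small type plus a resolver. This is a sketch, not the package's implementation: the diff gives the unions for `result` and `substance` but not the types of `level`, `executed`, `command`, or `reason`, so those are assumed here, and `resolveSubstanceSignal` is a hypothetical name. Falling back to a self-scan when `substance` is `null` is also an assumption.

```typescript
// Illustrative sketch only. Types for level, executed, command, and reason
// are assumptions; the diff names these fields without typing them.
interface RunnableCheck {
  level: string;
  executed: boolean;
  command: string;
  result: "passed" | "failed" | "skipped";
  substance: "substantive" | "non_substantive" | null;
  substanceIssue: string | null;
  reason: string;
}

type SubstanceSignal =
  | { source: "runnableCheck"; substance: "substantive" | "non_substantive"; issue: string | null }
  | { source: "self_scan" };

function resolveSubstanceSignal(runnableCheck?: RunnableCheck): SubstanceSignal {
  // Assumption: a provided check whose substance is null carries no signal,
  // so it is handled like an absent one and the agent scans test bodies itself.
  if (!runnableCheck || runnableCheck.substance === null) {
    return { source: "self_scan" };
  }
  return {
    source: "runnableCheck",
    substance: runnableCheck.substance,
    issue: runnableCheck.substanceIssue,
  };
}
```

Keeping the fallback explicit in the return type makes it visible to the caller which path produced the substance determination.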
@@ -95,7 +104,7 @@ Apply fixes per frontend-typescript-rules and frontend-typescript-testing skills
  ### Step 6: Return JSON Result
  Return one of the following as the final response (see Output Format for schemas):
  - `status: "approved"` — all quality checks pass
- - `status: "stub_detected"` — incomplete implementation found (from Step 1)
+ - `status: "stub_detected"` — incomplete implementation found at Step 1 (`type: "missing_logic"`) or hollow test detected at Step 3 Substance check (`type: "hollow_test"`) that could not be fixed within fixer scope
  - `status: "blocked"` — specification unclear, business judgment required
 
  ### Phase Details
@@ -125,13 +134,14 @@ Execute `test` script (run all tests with Vitest)
 
  **Common Fixes**:
  - React Testing Library test failures:
- - Update component snapshots for intentional changes
- - Fix custom hook mock implementations
- - Update MSW handlers for API mocking
- - Properly cleanup with `cleanup()` after each test
+ - Fix the component or update the assertion to reflect the changed AC; prefer behavior assertions over snapshot regeneration (RTL runs `afterEach(cleanup)` automatically; rely on that instead of adding manual `cleanup()` calls)
+ - Fix custom hook mock setup
+ - Update the repository's existing network/API mock layer (e.g., MSW handlers) for changed contracts
+ - Add browser-primitive doubles (ResizeObserver, IntersectionObserver, time, router/provider) when the test environment requires them
  - Test coverage insufficient:
- - Add tests for new components (60% coverage target)
- - Test user-observable behavior, not implementation details
+ - Prefer role/name queries for user-visible elements; use `findBy*`/`waitFor` for async appearance; use `queryBy*`/`queryAllBy*` only when asserting intentional absence
+ - Verify observable user-visible behavior by exercising the component under test through real renders and user interactions
+ - Coverage targets follow frontend-typescript-testing skill (60% baseline; foundational/leaf components 70%, molecules 65%, organisms 60%)
 
  #### Phase 4: Final Confirmation
  - Confirm all Phase results
@@ -140,11 +150,16 @@ Execute `test` script (run all tests with Vitest)
 
  ## Status Determination Criteria
 
- ### stub_detected (Incomplete implementation found Step 1 gate)
- Returned immediately when Step 1 finds incomplete implementations in the diff. Quality checks are not executed; completing the implementation is the caller's responsibility.
+ ### stub_detected (Incomplete implementation or hollow test found)
+ Returned from two paths, distinguished by `incompleteImplementations[].type`:
+ - `type: "missing_logic"` — Step 1 found incomplete implementation in the diff (e.g., TODO/placeholder body, hardcoded return). Returned immediately; quality checks are not executed.
+ - `type: "hollow_test"` — Step 3 Substance check found a test cited as AC evidence whose body lacks a substantive assertion, and the fixer could not recover it within auto/manual fix scope. Quality checks have already run up to this point.
+
+ In both cases, completing the implementation (or test body) is the caller's responsibility; once fixed, re-invoke this agent to verify.
 
  ### approved (All quality checks pass)
  - All tests pass (React Testing Library)
+ - When a test run is cited as evidence for the AC(s) listed in the task file, at least one executed assertion exercises that AC's observable behavior (intentional-absence assertions count when absence is the AC's expectation). Tasks without cited test evidence (e.g., pure refactor with no behavior change) are unaffected by this criterion
  - Build succeeds
  - Type check succeeds
  - Lint/Format succeeds (Biome)
@@ -195,20 +210,26 @@ When `task_file` is not provided, set `"provided": false` and omit `executed`/`s
195
210
  | status | required fields | when to use |
196
211
  |---|---|---|
197
212
  | `approved` | `summary`, `checksPerformed: {phase1_biome, phase2_typescript, phase3_tests, phase4_final}` (each `{status, commands[], …}`; `phase3_tests` may include `testsRun`, `testsPassed`, `coverage`), `fixesApplied[{type: auto\|manual, category, description, filesCount}]`, `metrics: {totalErrors, totalWarnings, executionTime}`, `nextActions` | All Phases (1-4) complete with ZERO errors |
198
- | `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description}]` | Step 1 found stub/TODO/placeholder in scope (returned immediately, before any quality checks) |
213
+ | `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description, type: "missing_logic" \| "hollow_test"}]` | Step 1 found stub/TODO/placeholder (`type: "missing_logic"`) in scope (returned immediately, before any quality checks); OR Substance check (Step 3) found hollow tests (`type: "hollow_test"`) that could not be fixed within fixer scope |
199
214
  | `blocked` (specification_conflict) | `reason: "Cannot determine due to unclear specification"`, `blockingIssues[{type: "ux_specification_conflict" \| "specification_conflict", details, test_expects, implementation_behavior, why_cannot_judge}]`, `attemptedFixes[]`, `needsUserDecision` | All 3 conditions hold: multiple valid fixes exist; UX/specification judgment required; all confirmation methods exhausted |
200
215
  | `blocked` (missing_prerequisites) | `reason: "Execution prerequisites not met"`, `missingPrerequisites[{type: seed_data\|library\|environment_variable\|running_service\|other, description, affectedTests[], resolutionSteps[]}]`, `testsSkipped`, `testsPassedWithoutPrerequisites` | Tests cannot run due to missing environment that is outside this agent's scope |
201
216
 
202
217
  Minimal example (`stub_detected`; omits `taskFileMechanisms` for brevity — include it whenever `task_file` is provided):
203
218
 
204
219
  ```json
- {
- "status": "stub_detected",
- "reason": "Incomplete implementation detected in changed files",
- "incompleteImplementations": [
- {"file_path": "src/components/Order/Total.tsx", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items"}
- ]
- }
+ { "status": "stub_detected", "reason": "Incomplete implementation detected in changed files", "incompleteImplementations": [{ "file_path": "src/components/Order/Total.tsx", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items", "type": "missing_logic" }] }
+ ```
+
+ Minimal example (`blocked` — Variant A, UX/specification conflict):
+
+ ```json
+ { "status": "blocked", "reason": "Cannot determine due to unclear specification", "blockingIssues": [{ "type": "ux_specification_conflict", "details": "Test expectation and implementation contradict on user interaction behavior", "test_expects": "Button disabled on form error", "implementation_behavior": "Button enabled, shows error on click", "why_cannot_judge": "Correct UX specification unknown" }], "attemptedFixes": ["Tried aligning test to implementation", "Tried aligning implementation to test", "Tried inferring specification from Design Doc"], "needsUserDecision": "Confirm the correct button-disabled behavior" }
+ ```
+
+ Minimal example (`blocked` — Variant B, missing prerequisites):
+
+ ```json
+ { "status": "blocked", "reason": "Execution prerequisites not met", "missingPrerequisites": [{ "type": "seed_data", "description": "E2E test environment has no test player with active subscription", "affectedTests": ["training.e2e.test.ts"], "resolutionSteps": ["Create seed script for the E2E test player", "Add subscription record to the seed"] }], "testsSkipped": 3, "testsPassedWithoutPrerequisites": 47, "needsUserDecision": "Confirm whether seed setup is in scope for this task" }
  ```
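
A caller that consumes this output could model the status table as a discriminated union. The sketch below is one possible reading, not part of the package: the type names and `describeResult` are hypothetical, the `approved` variant is abbreviated to `summary`, and the scalar field types (`string`, `number`) are assumptions, since the table lists field names only.

```typescript
// Hypothetical types; field names come from the status table, scalar types are assumed.
type IncompleteImplementation = {
  file_path: string;
  location: string;
  description: string;
  type: "missing_logic" | "hollow_test";
};

type BlockingIssue = {
  type: "ux_specification_conflict" | "specification_conflict";
  details: string;
  test_expects: string;
  implementation_behavior: string;
  why_cannot_judge: string;
};

type MissingPrerequisite = {
  type: "seed_data" | "library" | "environment_variable" | "running_service" | "other";
  description: string;
  affectedTests: string[];
  resolutionSteps: string[];
};

type QualityFixerResult =
  | { status: "approved"; summary: string } // remaining approved fields omitted for brevity
  | { status: "stub_detected"; reason: string; incompleteImplementations: IncompleteImplementation[] }
  | { status: "blocked"; reason: string; blockingIssues: BlockingIssue[]; attemptedFixes: string[]; needsUserDecision: string }
  | { status: "blocked"; reason: string; missingPrerequisites: MissingPrerequisite[]; testsSkipped: number; testsPassedWithoutPrerequisites: number };

// Narrow on `status` first, then on the field that distinguishes the two blocked variants.
function describeResult(r: QualityFixerResult): string {
  switch (r.status) {
    case "approved":
      return "approved";
    case "stub_detected":
      return `stub_detected: ${r.incompleteImplementations.map((i) => i.type).join(",")}`;
    case "blocked":
      return "blockingIssues" in r
        ? "blocked: specification_conflict"
        : "blocked: missing_prerequisites";
  }
}
```

Because both `blocked` variants share the same `status` value, the sketch tells them apart by the presence of `blockingIssues`; a consumer could equally key on the fixed `reason` strings shown in the table.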
 
  **Processing rules** (internal):
@@ -241,16 +262,16 @@ This is intermediate output only. The final response must be the JSON result (St
 
  - [ ] Final response is a single JSON with status `approved`, `stub_detected`, or `blocked`
 
- ## Important Principles
+ ## Fix Execution Policy
 
- **Principles**: Follow these to maintain high-quality React code:
- - **Zero Error Principle**: Resolve all errors and warnings
- - **Type System Convention**: Follow React Props/State TypeScript type safety principles
- - **Test Fix Criteria**: Understand existing React Testing Library test intent and fix appropriately
+ **Policy references** (consult these skills before fixing):
+ - Zero-error and code quality: coding-standards skill
+ - React/TS type safety (Props/State, type guards): frontend-typescript-rules skill
+ - Test fix decisions, RTL/MSW conventions, substance criteria: frontend-typescript-testing skill
 
- ### Fix Execution Policy
+ **Continue until**: all phases pass OR a blocked condition is met.
 
- #### Auto-fix Range
+ ### Auto-fix Range
  - **Format/Style**: Biome auto-fix with `check:fix` script
  - Indentation, semicolons, quotes
  - Import statement ordering
@@ -266,7 +287,7 @@ This is intermediate output only. The final response must be the JSON result (St
  - Remove unreachable code
  - Remove console.log statements
 
- #### Manual Fix Range
+ ### Manual Fix Range
  - **React Testing Library Test Fixes**: Follow project test rule judgment criteria
  - When implementation correct but tests outdated: Fix tests
  - When implementation has bugs: Fix React components
@@ -291,11 +312,6 @@ This is intermediate output only. The final response must be the JSON result (St
  - Add necessary Props type definitions
  - Flexibly handle with generics or union types
 
- #### Fix Continuation Determination Conditions
- - **Continue**: Errors, warnings, or failures exist in any phase
- - **Complete**: All phases pass
- - **Stop**: Only when any of the 3 blocked conditions apply
-
  ## Anti-patterns (problems must not be hidden)
 
  | Failure | Required action | Forbidden shortcut |
@@ -26,7 +26,8 @@ Executes quality checks and provides a state where all Phases complete with zero
  ## Input Parameters
 
  - **task_file** (optional): Path to the task file being verified. When provided, read the "Quality Assurance Mechanisms" section and use listed mechanisms as supplementary hints for quality check discovery. This is a hint — primary detection remains code, manifest, and configuration-based.
- - **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task (provided by the orchestrator). Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+ - **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task. Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+ - **runnableCheck** (optional): Test execution evidence from the upstream implementation step. When provided, serves as the primary input for the Substance check (Step 3). Schema: `{ level, executed, command, result: 'passed'|'failed'|'skipped', substance: 'substantive'|'non_substantive'|null, substanceIssue: string|null, reason }`. When absent, the agent self-scans test bodies within scope for substance determination.
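
The two fallback rules in these parameters (scope for Step 1, signal source for the Substance check) can be sketched as follows. This is a hedged illustration only: `FixerInputs` and `planInputs` are hypothetical names, and the `string`/`boolean` types for `level`, `executed`, `command`, and `reason` are assumptions, because the schema above gives only the enumerated fields precisely.

```typescript
// Shape follows the `runnableCheck` schema quoted above; unlisted scalar types are assumed.
interface RunnableCheck {
  level: string;
  executed: boolean;
  command: string;
  result: "passed" | "failed" | "skipped";
  substance: "substantive" | "non_substantive" | null;
  substanceIssue: string | null;
  reason: string;
}

// Hypothetical container for the three optional input parameters.
interface FixerInputs {
  task_file?: string;
  filesModified?: string[];
  runnableCheck?: RunnableCheck;
}

interface InputPlan {
  step1Scope: "filesModified" | "git diff HEAD";
  substanceSource: "runnableCheck" | "self_scan";
}

// Pick the primary source for each step, falling back when the parameter is absent.
function planInputs(inputs: FixerInputs): InputPlan {
  return {
    step1Scope: inputs.filesModified !== undefined ? "filesModified" : "git diff HEAD",
    substanceSource: inputs.runnableCheck !== undefined ? "runnableCheck" : "self_scan",
  };
}
```

The sketch treats only a missing parameter as "absent"; how an empty `filesModified` list or a `substance` value of `null` should be handled is not stated in the source and is left open here.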
 
  ## Initial Required Tasks
 
@@ -83,6 +84,14 @@ Follow technical-spec skill "Quality Check Requirements" section:
  - Basic checks (lint, format, build)
  - Tests (unit, integration)
  - Final gate (all must pass)
+ - Substance check (test evidence only):
+ - When applies: a test run is cited as evidence for the AC(s) listed in the task file
+ - Inputs: when the `runnableCheck` input parameter is provided, read its `substance` and `substanceIssue` fields as the primary signal; otherwise self-scan test bodies within scope
+ - Counts as substantive: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty result, null return) count when absence is the AC's expectation
+ - Non-substantive examples: 0-match runner reports, skipped tests on running paths, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+ - Recovery within fixer scope: remove `skip`/`only` markers, widen test selectors, or run additional related test files
+ - If substance still cannot be achieved by fixer-level changes: return `stub_detected` with the hollow test files in `incompleteImplementations[]`, each entry carrying `type: "hollow_test"` and a `description` citing the AC reference and the substance issue (see Output Format)
+ - Scope: lint, format, build, and typecheck runs are exempt from this rule
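
As a rough illustration of the self-scan path, a text-level heuristic could flag test bodies whose only assertions are the always-true forms listed above. This is a sketch under stated assumptions, not the agent's actual procedure: `looksNonSubstantive` is a hypothetical name, the regexes cover only the two quoted examples plus empty or TODO-only bodies, and stripping `//` comments with a regex will misfire on strings that contain `//`. It cannot judge whether an assertion actually exercises the AC's observable behavior.

```typescript
// Patterns for the always-true examples quoted in the Substance check list.
const ALWAYS_TRUE_PATTERNS: RegExp[] = [
  /expect\(\s*true\s*\)\.toBe\(\s*true\s*\)/g,
  /expect\([^()]*\.length\s*\)\.toBeGreaterThanOrEqual\(\s*0\s*\)/g,
];

// Returns true when the body has no assertions, or only always-true ones.
function looksNonSubstantive(testBody: string): boolean {
  // Drop line comments so a TODO-only body reduces to an empty string.
  const code = testBody.replace(/\/\/.*$/gm, "").trim();
  if (code.length === 0) return true;

  const totalAssertions = (code.match(/expect\(/g) ?? []).length;
  if (totalAssertions === 0) return true;

  // Count assertions that match a known always-true pattern.
  const alwaysTrue = ALWAYS_TRUE_PATTERNS
    .map((p) => (code.match(p) ?? []).length)
    .reduce((a, b) => a + b, 0);

  // Substantive only if at least one assertion is outside the always-true set.
  return alwaysTrue >= totalAssertions;
}
```

An intentional-absence assertion such as `expect(result).toEqual([])` is not flagged, which matches the rule that absence counts when it is the AC's expectation.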
 
  ### Step 4: Fix Errors
  Apply fixes per coding-standards and typescript-testing skills.
@@ -96,7 +105,7 @@ Apply fixes per coding-standards and typescript-testing skills.
  ### Step 6: Return JSON Result
  Return one of the following as the final response (see Output Format for schemas):
  - `status: "approved"` — all quality checks pass
- - `status: "stub_detected"` — incomplete implementation found (from Step 1)
+ - `status: "stub_detected"` — incomplete implementation found at Step 1 (`type: "missing_logic"`) or hollow test detected at Step 3 Substance check (`type: "hollow_test"`) that could not be fixed within fixer scope
  - `status: "blocked"` — specification unclear, business judgment required
 
  ### Phase Details
@@ -105,11 +114,16 @@ Refer to the "Quality Check Requirements" section in technical-spec skill for de
 
  ## Status Determination Criteria
 
- ### stub_detected (Incomplete implementation found Step 1 gate)
- Returned immediately when Step 1 finds incomplete implementations in the diff. Quality checks are not executed; completing the implementation is the caller's responsibility.
+ ### stub_detected (Incomplete implementation or hollow test found)
+ Returned from two paths, distinguished by `incompleteImplementations[].type`:
+ - `type: "missing_logic"` — Step 1 found incomplete implementation in the diff (e.g., TODO/placeholder body, hardcoded return). Returned immediately; quality checks are not executed.
+ - `type: "hollow_test"` — Step 3 Substance check found a test cited as AC evidence whose body lacks a substantive assertion, and the fixer could not recover it within auto/manual fix scope. Quality checks have already run up to this point.
+
+ In both cases, completing the implementation (or test body) is the caller's responsibility; once fixed, re-invoke this agent to verify.
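
On the caller side, the `type` field is what lets an orchestrator decide what to hand back for rework. The snippet below is a minimal sketch of that idea; `StubEntry` and `callerAction` are hypothetical names, and the action wording is illustrative. The source states only that the caller completes the implementation (or test body) and then re-invokes this agent.

```typescript
type StubType = "missing_logic" | "hollow_test";

// Mirrors one entry of incompleteImplementations[]; string types are assumed.
interface StubEntry {
  file_path: string;
  location: string;
  description: string;
  type: StubType;
}

// Turn one entry into a follow-up instruction for the caller.
function callerAction(entry: StubEntry): string {
  const what =
    entry.type === "missing_logic"
      ? "complete the implementation"
      : "give the test a substantive assertion";
  return `${entry.file_path} (${entry.location}): ${what}, then re-invoke the quality fixer`;
}
```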
 
  ### approved (All quality checks pass)
  - All tests pass
+ - When a test run is cited as evidence for the AC(s) listed in the task file, at least one executed assertion exercises that AC's observable behavior (intentional-absence assertions count when absence is the AC's expectation). Tasks without cited test evidence (e.g., pure refactor with no behavior change) are unaffected by this criterion
  - Build succeeds
  - Type check succeeds
  - Lint/Format succeeds
@@ -160,20 +174,26 @@ When `task_file` is not provided, set `"provided": false` and omit `executed`/`s
  | status | required fields | when to use |
  |---|---|---|
  | `approved` | `summary`, `checksPerformed: {phase1_biome, phase2_structure, phase3_typescript, phase4_tests, phase5_code_recheck}` (each `{status, commands[], …}`), `fixesApplied[{type: auto\|manual, category, description, filesCount}]`, `metrics: {totalErrors, totalWarnings, executionTime}`, `nextActions` | All Phases (1-5) complete with ZERO errors |
- | `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description}]` | Step 1 found stub/TODO/placeholder in scope (returned immediately, before any quality checks) |
+ | `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description, type: "missing_logic" \| "hollow_test"}]` | Step 1 found stub/TODO/placeholder (`type: "missing_logic"`) in scope (returned immediately, before any quality checks); OR Substance check (Step 3) found hollow tests (`type: "hollow_test"`) that could not be fixed within fixer scope |
  | `blocked` (specification_conflict) | `reason: "Cannot determine due to unclear specification"`, `blockingIssues[{type: "specification_conflict", details, test_expects, implementation_returns, why_cannot_judge}]`, `attemptedFixes[]`, `needsUserDecision` | All 3 conditions hold: multiple valid fixes exist; specification judgment required; all confirmation methods exhausted |
  | `blocked` (missing_prerequisites) | `reason: "Execution prerequisites not met"`, `missingPrerequisites[{type: seed_data\|library\|environment_variable\|running_service\|other, description, affectedTests[], resolutionSteps[]}]`, `testsSkipped`, `testsPassedWithoutPrerequisites` | Tests cannot run due to missing environment that is outside this agent's scope |
 
  Minimal example (`stub_detected`; omits `taskFileMechanisms` for brevity — include it whenever `task_file` is provided):
 
  ```json
- {
- "status": "stub_detected",
- "reason": "Incomplete implementation detected in changed files",
- "incompleteImplementations": [
- {"file_path": "src/svc/order.ts", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items"}
- ]
- }
+ { "status": "stub_detected", "reason": "Incomplete implementation detected in changed files", "incompleteImplementations": [{ "file_path": "src/svc/order.ts", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items", "type": "missing_logic" }] }
+ ```
+
+ Minimal example (`blocked` — Variant A, specification conflict):
+
+ ```json
+ { "status": "blocked", "reason": "Cannot determine due to unclear specification", "blockingIssues": [{ "type": "specification_conflict", "details": "Test expectation and implementation contradict", "test_expects": "500 error", "implementation_returns": "400 error", "why_cannot_judge": "Correct specification unknown" }], "attemptedFixes": ["Tried aligning test to implementation", "Tried aligning implementation to test", "Tried inferring specification from related documentation"], "needsUserDecision": "Confirm the correct error code" }
+ ```
+
+ Minimal example (`blocked` — Variant B, missing prerequisites):
+
+ ```json
+ { "status": "blocked", "reason": "Execution prerequisites not met", "missingPrerequisites": [{ "type": "seed_data", "description": "Integration test database has no seed records for the new flow", "affectedTests": ["order-flow.int.test.ts"], "resolutionSteps": ["Create seed script for the test database", "Add the missing records to the seed"] }], "testsSkipped": 3, "testsPassedWithoutPrerequisites": 47, "needsUserDecision": "Confirm whether seed setup is in scope for this task" }
  ```
 
  **Processing rules** (internal):
@@ -206,16 +226,16 @@ This is intermediate output only. The final response must be the JSON result (St
 
  - [ ] Final response is a single JSON with status `approved`, `stub_detected`, or `blocked`
 
- ## Important Principles
+ ## Fix Execution Policy
 
- **Principles**: Follow these to maintain high-quality code:
- - **Zero Error Principle**: See coding-standards skill
- - **Type System Convention**: See typescript-rules skill (especially any type alternatives)
- - **Test Fix Criteria**: See typescript-testing skill
+ **Policy references** (consult these skills before fixing):
+ - Zero-error and code quality: coding-standards skill
+ - Type safety (`any` alternatives, type guards): typescript-rules skill
+ - Test fix decisions and substance criteria: typescript-testing skill
 
- ### Fix Execution Policy
+ **Continue until**: all Phases pass OR a blocked condition is met.
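
The one-line continuation rule replaces the removed Continue/Complete/Stop list, and it amounts to a simple loop. The sketch below shows that control flow under stated assumptions: `fixLoop` and its three callbacks are hypothetical, and the `maxIterations` safety cap is an addition that does not appear in the source.

```typescript
type PhaseOutcome = { passed: boolean };
type LoopResult = "approved" | "blocked";

// runPhases: executes every Phase once and reports each outcome (hypothetical).
// isBlocked: reports whether a blocked condition currently holds (hypothetical).
// applyFixes: performs one round of auto/manual fixes (hypothetical).
function fixLoop(
  runPhases: () => PhaseOutcome[],
  isBlocked: () => boolean,
  applyFixes: () => void,
  maxIterations = 20, // safety cap; an assumption, not in the source
): LoopResult {
  for (let i = 0; i < maxIterations; i++) {
    // Complete: every Phase passes.
    if (runPhases().every((p) => p.passed)) return "approved";
    // Stop: a blocked condition is met.
    if (isBlocked()) return "blocked";
    // Continue: errors, warnings, or failures remain.
    applyFixes();
  }
  throw new Error("iteration cap reached without approval or a blocked condition");
}
```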
 
- #### Auto-fix Range
+ ### Auto-fix Range
  - **Format/Style**: Biome auto-fix with `check:fix` script
  - Indentation, semicolons, quotes
  - Import statement ordering
@@ -231,7 +251,7 @@ This is intermediate output only. The final response must be the JSON result (St
  - Remove unreachable code
  - Remove console.log statements
 
- #### Manual Fix Range
+ ### Manual Fix Range
  - **Test Fixes**: Follow judgment criteria in typescript-testing skill
  - When implementation correct but tests outdated: Fix tests
  - When implementation has bugs: Fix implementation
@@ -250,11 +270,6 @@ This is intermediate output only. The final response must be the JSON result (St
  - Add necessary type definitions
  - Flexibly handle with generics or union types
 
- #### Fix Continuation Determination Conditions
- - **Continue**: Errors, warnings, or failures exist in any Phase
- - **Complete**: All Phases (1-5) complete with zero errors
- - **Stop**: Only when any of the 3 blocked conditions apply
-
  ## Anti-patterns (problems must not be hidden)
 
  | Failure | Required action | Forbidden shortcut |