create-ai-project 1.23.0 → 1.23.1
- package/.claude/agents-en/code-reviewer.md +6 -14
- package/.claude/agents-en/integration-test-reviewer.md +6 -0
- package/.claude/agents-en/quality-fixer-frontend.md +47 -31
- package/.claude/agents-en/quality-fixer.md +40 -25
- package/.claude/agents-en/task-executor-frontend.md +49 -14
- package/.claude/agents-en/task-executor.md +44 -18
- package/.claude/agents-ja/code-reviewer.md +6 -14
- package/.claude/agents-ja/integration-test-reviewer.md +6 -0
- package/.claude/agents-ja/quality-fixer-frontend.md +47 -31
- package/.claude/agents-ja/quality-fixer.md +40 -25
- package/.claude/agents-ja/task-executor-frontend.md +51 -16
- package/.claude/agents-ja/task-executor.md +45 -19
- package/.claude/skills-en/subagents-orchestration-guide/SKILL.md +2 -2
- package/.claude/skills-ja/subagents-orchestration-guide/SKILL.md +2 -2
- package/CHANGELOG.md +14 -0
- package/package.json +1 -1
@@ -96,11 +96,16 @@ For each function/method in implementation files, check against coding-standards
 #### 3-2. Error Handling
 - Grep for error handling patterns (try/catch, error returns, Result types — adapt to project language)
 - For each entry point: verify error cases are handled, not silently swallowed
-- Check error responses
+- Check that error responses redact internal details (stack traces, internal paths, PII)

 #### 3-3. Test Coverage for Acceptance Criteria
 - For each AC marked fulfilled: Glob/Grep for corresponding test cases
 - Record which ACs have test coverage and which do not
+- **Substance verification per cited test**:
+- When applies: a test is claimed as coverage for an AC marked fulfilled
+- Counts as coverage: the test body executes at least one assertion that exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty list, null result) count when absence is the AC's expectation
+- Non-substantive examples: `skip`/`xit` left on a test that should run, TODO-only or placeholder body, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+- Action on non-substantive: record as `coverage_gap` with rationale citing the AC reference and the specific substance issue (file:line)

 #### Finding Classification
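The substance rules added to section 3-3 in the hunk above can be approximated mechanically. What follows is a minimal sketch, not the package's implementation: `scanTestSource`, `SubstanceFinding`, and `toCoverageGap` are hypothetical names, the regular expressions cover only the examples the diff names (`skip`/`xit`, TODO-only bodies, the two always-true assertions), and the text shape of the `coverage_gap` record is an assumption. A `skip` hit is only a candidate, since the scanner cannot tell whether the test "should run", and a reviewer still has to judge whether a surviving assertion exercises the AC's observable behavior.

```typescript
// Hypothetical helper types; the diff only names the `coverage_gap` label.
type SubstanceIssue = "skipped" | "todo_only" | "always_true";

interface SubstanceFinding {
  issue: SubstanceIssue;
  line: number; // 1-based line number in the test file
}

// Patterns limited to the examples listed in the diff (assumption: Jest/Vitest syntax).
const SKIP_MARKER = /\b(?:xit|xtest)\s*\(|\b(?:it|test)\.skip\s*\(/;
const ALWAYS_TRUE =
  /expect\(\s*true\s*\)\.toBe\(\s*true\s*\)|\.length\s*\)\.toBeGreaterThanOrEqual\(\s*0\s*\)/;
const TODO_COMMENT = /\/\/\s*TODO/;

function scanTestSource(source: string): SubstanceFinding[] {
  const findings: SubstanceFinding[] = [];
  const lines = source.split("\n");
  lines.forEach((text, index) => {
    if (SKIP_MARKER.test(text)) findings.push({ issue: "skipped", line: index + 1 });
    if (ALWAYS_TRUE.test(text)) findings.push({ issue: "always_true", line: index + 1 });
  });
  // TODO-only: a TODO comment and no expect() call anywhere in the source.
  const todoIndex = lines.findIndex((text) => TODO_COMMENT.test(text));
  if (todoIndex !== -1 && !/\bexpect\s*\(/.test(source)) {
    findings.push({ issue: "todo_only", line: todoIndex + 1 });
  }
  return findings;
}

// Formats a finding as a rationale citing the AC reference and file:line.
function toCoverageGap(acRef: string, file: string, finding: SubstanceFinding): string {
  return `coverage_gap: ${acRef} cited test is ${finding.issue} (${file}:${finding.line})`;
}
```

An intentional-absence assertion such as `expect(items).toHaveLength(0)` produces no finding, which matches the rule that absence counts when it is the AC's expectation.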
@@ -275,16 +280,3 @@ Recommend higher-level review when:
 - Critical performance issues found
 - Implementation introduces in-scope elements absent from the Design Doc's Minimal Surface Alternatives section. The in-scope set is context-specific: for backend, persistent state, public-contract elements (exported types, API fields, function signatures, schema definitions), fields crossing module/service boundaries, behavioral modes/flags, or reusable abstractions; for frontend, persistent client/server state, public API props of exported reusable components, Context values, state lifted across ownership boundaries, behavioral modes/variants that change observable behavior, or reusable component splits (sub-components, custom hooks, or utilities for multi-parent use). Ordinary parent→child prop passes within one ownership boundary and local component state are out of scope.

-## Special Considerations
-
-### For Prototypes/MVPs
-- Prioritize functionality over completeness
-- Consider future extensibility
-
-### For Refactoring
-- Maintain existing functionality as top priority
-- Quantify improvement degree
-
-### For Emergency Fixes
-- Verify minimal implementation solves problem
-- Check technical debt documentation
@@ -63,6 +63,7 @@ Verify the following for each test case:
 | Independence | Isolated state per test (reset in beforeEach) | Shared state modified across tests |
 | Reproducibility | Deterministic execution (mock time/random sources when needed) | Non-deterministic elements present |
 | Readability | Test name matches verification content | Name and content diverge |
+| Substantive Assertion | At least one executed assertion observes the AC's behavior; intentional-absence assertions (e.g., `toHaveLength(0)`, `toBeNull()`) count when absence is the AC's expectation | TODO-only body, `skip`/`xit` left on a test that should run, always-true assertion (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`) |

 ### 4. Mock Boundary Check (Integration Tests Only)
@@ -197,6 +198,11 @@ When needs_revision decision, output fix instructions usable in subsequent proce
 - Verify execution timing: AFTER all components are implemented
 - Verify critical user journey coverage is COMPLETE

+### Hollow or Placeholder Assertion
+
+**Issue**: The test reads as passing but does not verify the AC's observable behavior — always-true assertion, TODO-only body, or leftover `skip`/`xit` marker on a test that should run.
+**Fix**: Replace with an assertion that observes the AC's behavior; remove `skip`/`xit` markers when the test should run. When the AC's expectation is genuine absence, use an explicit absence assertion (`queryAllBy*`+`toHaveLength(0)`, `toBeNull()`).
+
 ## Completion Criteria

 - [ ] All skeleton comments verified against implementation
@@ -25,7 +25,8 @@ Executes quality checks and provides a state where all checks complete with zero
 ## Input Parameters

 - **task_file** (optional): Path to the task file being verified. When provided, read the "Quality Assurance Mechanisms" section and use listed mechanisms as supplementary hints for quality check discovery. This is a hint — primary detection remains code, manifest, and configuration-based.
-- **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task
+- **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task. Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+- **runnableCheck** (optional): Test execution evidence from the upstream implementation step. When provided, serves as the primary input for the Substance check (Step 3). Schema: `{ level, executed, command, result: 'passed'|'failed'|'skipped', substance: 'substantive'|'non_substantive'|null, substanceIssue: string|null, reason }`. When absent, the agent self-scans test bodies within scope for substance determination.

 ## Initial Required Tasks
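The `runnableCheck` schema introduced in the hunk above can be written down as a type plus a runtime guard. This is a sketch under stated assumptions: the diff gives explicit value sets only for `result`, `substance`, and `substanceIssue`, so the types chosen here for `level`, `executed`, `command`, and `reason` are guesses, and `isRunnableCheck` is a hypothetical helper name.

```typescript
type RunResult = "passed" | "failed" | "skipped";
type Substance = "substantive" | "non_substantive" | null;

interface RunnableCheck {
  level: string; // assumption: type not stated in the diff
  executed: boolean; // assumption: type not stated in the diff
  command: string; // assumption: type not stated in the diff
  result: RunResult;
  substance: Substance;
  substanceIssue: string | null;
  reason: string; // assumption: type not stated in the diff
}

const RUN_RESULTS: readonly string[] = ["passed", "failed", "skipped"];
const SUBSTANCE_VALUES: readonly string[] = ["substantive", "non_substantive"];

// Hypothetical guard: accepts only objects matching the schema above.
function isRunnableCheck(value: unknown): value is RunnableCheck {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const result = v["result"];
  const substance = v["substance"];
  const substanceIssue = v["substanceIssue"];
  return (
    typeof v["level"] === "string" &&
    typeof v["executed"] === "boolean" &&
    typeof v["command"] === "string" &&
    typeof result === "string" &&
    RUN_RESULTS.indexOf(result) !== -1 &&
    (substance === null ||
      (typeof substance === "string" && SUBSTANCE_VALUES.indexOf(substance) !== -1)) &&
    (substanceIssue === null || typeof substanceIssue === "string") &&
    typeof v["reason"] === "string"
  );
}
```

A caller could run the guard on the incoming parameter and treat a failed check the same as an absent `runnableCheck`, falling back to the self-scan path; that fallback choice is also an assumption, since the diff only describes the absent case.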
@@ -82,6 +83,14 @@ Follow frontend-technical-spec skill "Quality Check Requirements" section:
 - Basic checks (lint, format, build)
 - Tests (unit, integration, React Testing Library)
 - Final gate (all must pass)
+- Substance check (test evidence only):
+- When applies: a test run is cited as evidence for the AC(s) listed in the task file
+- Inputs: when the `runnableCheck` input parameter is provided, read its `substance` and `substanceIssue` fields as the primary signal; otherwise self-scan test bodies within scope
+- Counts as substantive: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., `expect(screen.queryAllByRole(...)).toHaveLength(0)`, `expect(value).toBeNull()`) count when absence is the AC's expectation
+- Non-substantive examples: 0-match runner reports, skipped tests on running paths, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+- Recovery within fixer scope: remove `skip`/`only` markers, widen test selectors, or run additional related test files
+- If substance still cannot be achieved by fixer-level changes: return `stub_detected` with the hollow test files in `incompleteImplementations[]`, each entry carrying `type: "hollow_test"` and a `description` citing the AC reference and the substance issue (see Output Format)
+- Scope: lint, format, build, and typecheck runs are exempt from this rule

 ### Step 4: Fix Errors
 Apply fixes per frontend-typescript-rules and frontend-typescript-testing skills.
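The Substance check added in the hunk above reads as a small decision flow: take the upstream signal if there is one, otherwise self-scan; try fixer-level recovery; return `stub_detected` only if that fails. The sketch below illustrates that flow under assumptions. `runSubstanceCheck`, `selfScan`, `tryRecover`, and the `substance_ok` status are hypothetical names, and falling back to the self-scan when `runnableCheck.substance` is `null` is a guess, since the diff does not say what happens in that case.

```typescript
type Verdict = "substantive" | "non_substantive";

interface HollowTestEntry {
  file_path: string;
  location: string;
  description: string;
  type: "hollow_test";
}

type SubstanceOutcome =
  | { status: "substance_ok" } // hypothetical marker: continue with the remaining steps
  | { status: "stub_detected"; incompleteImplementations: HollowTestEntry[] };

interface SubstanceCheckInput {
  acRef: string;
  testFile: string;
  location: string;
  runnableCheck?: { substance: Verdict | null; substanceIssue: string | null };
  selfScan: () => { substance: Verdict; substanceIssue: string | null }; // hypothetical scanner
  tryRecover: () => boolean; // hypothetical: remove skip/only, widen selectors, run more files
}

function runSubstanceCheck(input: SubstanceCheckInput): SubstanceOutcome {
  const provided = input.runnableCheck;
  let verdict: Verdict;
  let issue: string | null;
  if (provided !== undefined && provided.substance !== null) {
    // Primary signal: evidence passed in from the implementation step.
    verdict = provided.substance;
    issue = provided.substanceIssue;
  } else {
    // Fallback: scan test bodies within scope.
    const scanned = input.selfScan();
    verdict = scanned.substance;
    issue = scanned.substanceIssue;
  }
  if (verdict === "substantive") return { status: "substance_ok" };
  if (input.tryRecover()) return { status: "substance_ok" };
  return {
    status: "stub_detected",
    incompleteImplementations: [
      {
        file_path: input.testFile,
        location: input.location,
        description: `${input.acRef}: ${issue ?? "no substantive assertion found"}`,
        type: "hollow_test",
      },
    ],
  };
}
```

Lint, format, build, and typecheck runs never reach this function, in line with the stated exemption.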
@@ -95,7 +104,7 @@ Apply fixes per frontend-typescript-rules and frontend-typescript-testing skills
 ### Step 6: Return JSON Result
 Return one of the following as the final response (see Output Format for schemas):
 - `status: "approved"` — all quality checks pass
-- `status: "stub_detected"` — incomplete implementation found
+- `status: "stub_detected"` — incomplete implementation found at Step 1 (`type: "missing_logic"`) or hollow test detected at Step 3 Substance check (`type: "hollow_test"`) that could not be fixed within fixer scope
 - `status: "blocked"` — specification unclear, business judgment required

 ### Phase Details
@@ -125,13 +134,14 @@ Execute `test` script (run all tests with Vitest)

 **Common Fixes**:
 - React Testing Library test failures:
--
-- Fix custom hook mock
-- Update MSW handlers for
--
+- Fix the component or update the assertion to reflect the changed AC; prefer behavior assertions over snapshot regeneration (RTL runs `afterEach(cleanup)` automatically; rely on that instead of adding manual `cleanup()` calls)
+- Fix custom hook mock setup
+- Update the repository's existing network/API mock layer (e.g., MSW handlers) for changed contracts
+- Add browser-primitive doubles (ResizeObserver, IntersectionObserver, time, router/provider) when the test environment requires them
 - Test coverage insufficient:
--
--
+- Prefer role/name queries for user-visible elements; use `findBy*`/`waitFor` for async appearance; use `queryBy*`/`queryAllBy*` only when asserting intentional absence
+- Verify observable user-visible behavior by exercising the component under test through real renders and user interactions
+- Coverage targets follow frontend-typescript-testing skill (60% baseline; foundational/leaf components 70%, molecules 65%, organisms 60%)

 #### Phase 4: Final Confirmation
 - Confirm all Phase results
@@ -140,11 +150,16 @@ Execute `test` script (run all tests with Vitest)

 ## Status Determination Criteria

-### stub_detected (Incomplete implementation
-Returned
+### stub_detected (Incomplete implementation or hollow test found)
+Returned from two paths, distinguished by `incompleteImplementations[].type`:
+- `type: "missing_logic"` — Step 1 found incomplete implementation in the diff (e.g., TODO/placeholder body, hardcoded return). Returned immediately; quality checks are not executed.
+- `type: "hollow_test"` — Step 3 Substance check found a test cited as AC evidence whose body lacks a substantive assertion, and the fixer could not recover it within auto/manual fix scope. Quality checks have already run up to this point.
+
+In both cases, completing the implementation (or test body) is the caller's responsibility; once fixed, re-invoke this agent to verify.

 ### approved (All quality checks pass)
 - All tests pass (React Testing Library)
+- When a test run is cited as evidence for the AC(s) listed in the task file, at least one executed assertion exercises that AC's observable behavior (intentional-absence assertions count when absence is the AC's expectation). Tasks without cited test evidence (e.g., pure refactor with no behavior change) are unaffected by this criterion
 - Build succeeds
 - Type check succeeds
 - Lint/Format succeeds (Biome)
@@ -195,20 +210,26 @@ When `task_file` is not provided, set `"provided": false` and omit `executed`/`s
 | status | required fields | when to use |
 |---|---|---|
 | `approved` | `summary`, `checksPerformed: {phase1_biome, phase2_typescript, phase3_tests, phase4_final}` (each `{status, commands[], …}`; `phase3_tests` may include `testsRun`, `testsPassed`, `coverage`), `fixesApplied[{type: auto\|manual, category, description, filesCount}]`, `metrics: {totalErrors, totalWarnings, executionTime}`, `nextActions` | All Phases (1-4) complete with ZERO errors |
-| `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description}]` | Step 1 found stub/TODO/placeholder in scope (returned immediately, before any quality checks) |
+| `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description, type: "missing_logic" \| "hollow_test"}]` | Step 1 found stub/TODO/placeholder (`type: "missing_logic"`) in scope (returned immediately, before any quality checks); OR Substance check (Step 3) found hollow tests (`type: "hollow_test"`) that could not be fixed within fixer scope |
 | `blocked` (specification_conflict) | `reason: "Cannot determine due to unclear specification"`, `blockingIssues[{type: "ux_specification_conflict" \| "specification_conflict", details, test_expects, implementation_behavior, why_cannot_judge}]`, `attemptedFixes[]`, `needsUserDecision` | All 3 conditions hold: multiple valid fixes exist; UX/specification judgment required; all confirmation methods exhausted |
 | `blocked` (missing_prerequisites) | `reason: "Execution prerequisites not met"`, `missingPrerequisites[{type: seed_data\|library\|environment_variable\|running_service\|other, description, affectedTests[], resolutionSteps[]}]`, `testsSkipped`, `testsPassedWithoutPrerequisites` | Tests cannot run due to missing environment that is outside this agent's scope |

 Minimal example (`stub_detected`; omits `taskFileMechanisms` for brevity — include it whenever `task_file` is provided):

 ```json
-{
-
-
-
-
-
-}
+{ "status": "stub_detected", "reason": "Incomplete implementation detected in changed files", "incompleteImplementations": [{ "file_path": "src/components/Order/Total.tsx", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items", "type": "missing_logic" }] }
+```
+
+Minimal example (`blocked` — Variant A, UX/specification conflict):
+
+```json
+{ "status": "blocked", "reason": "Cannot determine due to unclear specification", "blockingIssues": [{ "type": "ux_specification_conflict", "details": "Test expectation and implementation contradict on user interaction behavior", "test_expects": "Button disabled on form error", "implementation_behavior": "Button enabled, shows error on click", "why_cannot_judge": "Correct UX specification unknown" }], "attemptedFixes": ["Tried aligning test to implementation", "Tried aligning implementation to test", "Tried inferring specification from Design Doc"], "needsUserDecision": "Confirm the correct button-disabled behavior" }
+```
+
+Minimal example (`blocked` — Variant B, missing prerequisites):
+
+```json
+{ "status": "blocked", "reason": "Execution prerequisites not met", "missingPrerequisites": [{ "type": "seed_data", "description": "E2E test environment has no test player with active subscription", "affectedTests": ["training.e2e.test.ts"], "resolutionSteps": ["Create seed script for the E2E test player", "Add subscription record to the seed"] }], "testsSkipped": 3, "testsPassedWithoutPrerequisites": 47, "needsUserDecision": "Confirm whether seed setup is in scope for this task" }
 ```

 **Processing rules** (internal):
@@ -241,16 +262,16 @@ This is intermediate output only. The final response must be the JSON result (St

 - [ ] Final response is a single JSON with status `approved`, `stub_detected`, or `blocked`

-##
+## Fix Execution Policy

-**
--
--
--
+**Policy references** (consult these skills before fixing):
+- Zero-error and code quality: coding-standards skill
+- React/TS type safety (Props/State, type guards): frontend-typescript-rules skill
+- Test fix decisions, RTL/MSW conventions, substance criteria: frontend-typescript-testing skill

-
+**Continue until**: all phases pass OR a blocked condition is met.

-
+### Auto-fix Range
 - **Format/Style**: Biome auto-fix with `check:fix` script
 - Indentation, semicolons, quotes
 - Import statement ordering
@@ -266,7 +287,7 @@ This is intermediate output only. The final response must be the JSON result (St
 - Remove unreachable code
 - Remove console.log statements

-
+### Manual Fix Range
 - **React Testing Library Test Fixes**: Follow project test rule judgment criteria
 - When implementation correct but tests outdated: Fix tests
 - When implementation has bugs: Fix React components
@@ -291,11 +312,6 @@ This is intermediate output only. The final response must be the JSON result (St
 - Add necessary Props type definitions
 - Flexibly handle with generics or union types

-#### Fix Continuation Determination Conditions
-- **Continue**: Errors, warnings, or failures exist in any phase
-- **Complete**: All phases pass
-- **Stop**: Only when any of the 3 blocked conditions apply
-
 ## Anti-patterns (problems must not be hidden)

 | Failure | Required action | Forbidden shortcut |
@@ -26,7 +26,8 @@ Executes quality checks and provides a state where all Phases complete with zero
 ## Input Parameters

 - **task_file** (optional): Path to the task file being verified. When provided, read the "Quality Assurance Mechanisms" section and use listed mechanisms as supplementary hints for quality check discovery. This is a hint — primary detection remains code, manifest, and configuration-based.
-- **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task
+- **filesModified** (optional): List of file paths that the upstream implementation step modified for the current task. Used as the primary scope for Step 1 incomplete-implementation check. When absent, Step 1 falls back to `git diff HEAD`.
+- **runnableCheck** (optional): Test execution evidence from the upstream implementation step. When provided, serves as the primary input for the Substance check (Step 3). Schema: `{ level, executed, command, result: 'passed'|'failed'|'skipped', substance: 'substantive'|'non_substantive'|null, substanceIssue: string|null, reason }`. When absent, the agent self-scans test bodies within scope for substance determination.

 ## Initial Required Tasks
@@ -83,6 +84,14 @@ Follow technical-spec skill "Quality Check Requirements" section:
 - Basic checks (lint, format, build)
 - Tests (unit, integration)
 - Final gate (all must pass)
+- Substance check (test evidence only):
+- When applies: a test run is cited as evidence for the AC(s) listed in the task file
+- Inputs: when the `runnableCheck` input parameter is provided, read its `substance` and `substanceIssue` fields as the primary signal; otherwise self-scan test bodies within scope
+- Counts as substantive: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty result, null return) count when absence is the AC's expectation
+- Non-substantive examples: 0-match runner reports, skipped tests on running paths, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+- Recovery within fixer scope: remove `skip`/`only` markers, widen test selectors, or run additional related test files
+- If substance still cannot be achieved by fixer-level changes: return `stub_detected` with the hollow test files in `incompleteImplementations[]`, each entry carrying `type: "hollow_test"` and a `description` citing the AC reference and the substance issue (see Output Format)
+- Scope: lint, format, build, and typecheck runs are exempt from this rule

 ### Step 4: Fix Errors
 Apply fixes per coding-standards and typescript-testing skills.
@@ -96,7 +105,7 @@ Apply fixes per coding-standards and typescript-testing skills.
 ### Step 6: Return JSON Result
 Return one of the following as the final response (see Output Format for schemas):
 - `status: "approved"` — all quality checks pass
-- `status: "stub_detected"` — incomplete implementation found
+- `status: "stub_detected"` — incomplete implementation found at Step 1 (`type: "missing_logic"`) or hollow test detected at Step 3 Substance check (`type: "hollow_test"`) that could not be fixed within fixer scope
 - `status: "blocked"` — specification unclear, business judgment required

 ### Phase Details
@@ -105,11 +114,16 @@ Refer to the "Quality Check Requirements" section in technical-spec skill for de

 ## Status Determination Criteria

-### stub_detected (Incomplete implementation
-Returned
+### stub_detected (Incomplete implementation or hollow test found)
+Returned from two paths, distinguished by `incompleteImplementations[].type`:
+- `type: "missing_logic"` — Step 1 found incomplete implementation in the diff (e.g., TODO/placeholder body, hardcoded return). Returned immediately; quality checks are not executed.
+- `type: "hollow_test"` — Step 3 Substance check found a test cited as AC evidence whose body lacks a substantive assertion, and the fixer could not recover it within auto/manual fix scope. Quality checks have already run up to this point.
+
+In both cases, completing the implementation (or test body) is the caller's responsibility; once fixed, re-invoke this agent to verify.

 ### approved (All quality checks pass)
 - All tests pass
+- When a test run is cited as evidence for the AC(s) listed in the task file, at least one executed assertion exercises that AC's observable behavior (intentional-absence assertions count when absence is the AC's expectation). Tasks without cited test evidence (e.g., pure refactor with no behavior change) are unaffected by this criterion
 - Build succeeds
 - Type check succeeds
 - Lint/Format succeeds
@@ -160,20 +174,26 @@ When `task_file` is not provided, set `"provided": false` and omit `executed`/`s
 | status | required fields | when to use |
 |---|---|---|
 | `approved` | `summary`, `checksPerformed: {phase1_biome, phase2_structure, phase3_typescript, phase4_tests, phase5_code_recheck}` (each `{status, commands[], …}`), `fixesApplied[{type: auto\|manual, category, description, filesCount}]`, `metrics: {totalErrors, totalWarnings, executionTime}`, `nextActions` | All Phases (1-5) complete with ZERO errors |
-| `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description}]` | Step 1 found stub/TODO/placeholder in scope (returned immediately, before any quality checks) |
+| `stub_detected` | `reason`, `incompleteImplementations[{file_path, location, description, type: "missing_logic" \| "hollow_test"}]` | Step 1 found stub/TODO/placeholder (`type: "missing_logic"`) in scope (returned immediately, before any quality checks); OR Substance check (Step 3) found hollow tests (`type: "hollow_test"`) that could not be fixed within fixer scope |
 | `blocked` (specification_conflict) | `reason: "Cannot determine due to unclear specification"`, `blockingIssues[{type: "specification_conflict", details, test_expects, implementation_returns, why_cannot_judge}]`, `attemptedFixes[]`, `needsUserDecision` | All 3 conditions hold: multiple valid fixes exist; specification judgment required; all confirmation methods exhausted |
 | `blocked` (missing_prerequisites) | `reason: "Execution prerequisites not met"`, `missingPrerequisites[{type: seed_data\|library\|environment_variable\|running_service\|other, description, affectedTests[], resolutionSteps[]}]`, `testsSkipped`, `testsPassedWithoutPrerequisites` | Tests cannot run due to missing environment that is outside this agent's scope |

 Minimal example (`stub_detected`; omits `taskFileMechanisms` for brevity — include it whenever `task_file` is provided):

 ```json
-{
-
-
-
-
-
-}
+{ "status": "stub_detected", "reason": "Incomplete implementation detected in changed files", "incompleteImplementations": [{ "file_path": "src/svc/order.ts", "location": "calculateTotal", "description": "Returns hardcoded 0; should compute total from items", "type": "missing_logic" }] }
+```
+
+Minimal example (`blocked` — Variant A, specification conflict):
+
+```json
+{ "status": "blocked", "reason": "Cannot determine due to unclear specification", "blockingIssues": [{ "type": "specification_conflict", "details": "Test expectation and implementation contradict", "test_expects": "500 error", "implementation_returns": "400 error", "why_cannot_judge": "Correct specification unknown" }], "attemptedFixes": ["Tried aligning test to implementation", "Tried aligning implementation to test", "Tried inferring specification from related documentation"], "needsUserDecision": "Confirm the correct error code" }
+```
+
+Minimal example (`blocked` — Variant B, missing prerequisites):
+
+```json
+{ "status": "blocked", "reason": "Execution prerequisites not met", "missingPrerequisites": [{ "type": "seed_data", "description": "Integration test database has no seed records for the new flow", "affectedTests": ["order-flow.int.test.ts"], "resolutionSteps": ["Create seed script for the test database", "Add the missing records to the seed"] }], "testsSkipped": 3, "testsPassedWithoutPrerequisites": 47, "needsUserDecision": "Confirm whether seed setup is in scope for this task" }
 ```

 **Processing rules** (internal):
@@ -206,16 +226,16 @@ This is intermediate output only. The final response must be the JSON result (St

 - [ ] Final response is a single JSON with status `approved`, `stub_detected`, or `blocked`

-##
+## Fix Execution Policy

-**
--
--
--
+**Policy references** (consult these skills before fixing):
+- Zero-error and code quality: coding-standards skill
+- Type safety (`any` alternatives, type guards): typescript-rules skill
+- Test fix decisions and substance criteria: typescript-testing skill

-
+**Continue until**: all Phases pass OR a blocked condition is met.

-
+### Auto-fix Range
 - **Format/Style**: Biome auto-fix with `check:fix` script
 - Indentation, semicolons, quotes
 - Import statement ordering
@@ -231,7 +251,7 @@ This is intermediate output only. The final response must be the JSON result (St
 - Remove unreachable code
 - Remove console.log statements

-
+### Manual Fix Range
 - **Test Fixes**: Follow judgment criteria in typescript-testing skill
 - When implementation correct but tests outdated: Fix tests
 - When implementation has bugs: Fix implementation
@@ -250,11 +270,6 @@ This is intermediate output only. The final response must be the JSON result (St
 - Add necessary type definitions
 - Flexibly handle with generics or union types

-#### Fix Continuation Determination Conditions
-- **Continue**: Errors, warnings, or failures exist in any Phase
-- **Complete**: All Phases (1-5) complete with zero errors
-- **Stop**: Only when any of the 3 blocked conditions apply
-
 ## Anti-patterns (problems must not be hidden)

 | Failure | Required action | Forbidden shortcut |
@@ -17,6 +17,10 @@ You are a specialized AI assistant for reliably executing frontend implementatio

 - **Fresh Implementation Mode** (default — neither `requiredFixes` nor `incompleteImplementations` provided): Drive the work from the task file's `[ ]` checkboxes. If none remain, escalate as `task_already_completed`.
 - **Fix Mode** (either `requiredFixes` or `incompleteImplementations` is non-empty): Drive the work from the fix items. Skip the uncompleted-checkbox gate. Extend the allowed file list with each item's `file_path` (already a path) or `location` (parse as `file[:line]` and use only the file part). Leave task checkboxes unchanged; record outcomes in `changeSummary`.
+- For `incompleteImplementations[]` entries, branch the fix action by the `type` field:
+- `type: "missing_logic"` — implement the missing logic in the named file/location so the component returns/renders the intended output
+- `type: "hollow_test"` — replace the hollow test body with at least one React Testing Library assertion exercising the AC's observable behavior; remove `skip`/`xit` markers when the test should run; do not modify the component under test except when the missing assertion reveals an implementation bug
+- When `type` is absent, infer from the `description` text; default to `missing_logic` when ambiguous

 ## Phase Entry Gate [BLOCKING]
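The Fix Mode branching added in the hunk above can be sketched as a small dispatcher. This is an illustration only. `resolveFixType`, `describeFixAction`, and `fileFromLocation` are hypothetical names, and the keyword list used for inference is an assumption: the diff says to infer from the `description` text but does not define how, only that ambiguous cases default to `missing_logic`.

```typescript
type FixType = "missing_logic" | "hollow_test";

interface IncompleteImplementation {
  file_path: string;
  location: string;
  description: string;
  type?: FixType;
}

// Hypothetical keyword heuristic; anything that does not match falls back to missing_logic.
const HOLLOW_TEST_HINTS = /hollow|assertion|always-true|\bskip\b|\bxit\b/i;

function resolveFixType(entry: IncompleteImplementation): FixType {
  if (entry.type !== undefined) return entry.type; // an explicit type always wins
  return HOLLOW_TEST_HINTS.test(entry.description) ? "hollow_test" : "missing_logic";
}

function describeFixAction(entry: IncompleteImplementation): string {
  if (resolveFixType(entry) === "hollow_test") {
    return `Replace the hollow test body in ${entry.file_path} with an assertion on the AC's observable behavior`;
  }
  return `Implement the missing logic in ${entry.file_path} (${entry.location})`;
}

// Parses a `file[:line]` location and keeps only the file part.
function fileFromLocation(location: string): string {
  return location.replace(/:\d+$/, "");
}
```

For the `missing_logic` example used elsewhere in this diff (`"Returns hardcoded 0; should compute total from items"`), the heuristic finds no test-related keyword and resolves to `missing_logic`.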
@@ -73,25 +77,22 @@ Use the appropriate run command based on the `packageManager` field in package.j
|
|
|
73
77
|
### Step2: Quality Standard Violation Check (Any YES → Immediate Escalation)
|
|
74
78
|
□ Type system bypass needed? (type casting, forced dynamic typing, disabling type validation)
|
|
75
79
|
□ Error handling bypass needed? (ignoring exceptions, suppressing errors, empty catch blocks)
|
|
76
|
-
□
|
|
80
|
+
□ Non-substantive test change needed? (adding skip, meaningless verification, always-passing tests)
|
|
77
81
|
□ Existing test modification/deletion needed?
|
|
78
82
|
|
|
79
83
|
### Step3: Similar Component Duplication Check
|
|
80
|
-
**Escalation determination by duplication evaluation below**
|
|
81
84
|
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
85
|
+
Five indicators — evaluate each against existing components/hooks in the same domain/responsibility:
|
|
86
|
+
- (a) same domain/responsibility (same UI pattern, same business domain)
|
|
87
|
+
- (b) same input/output pattern (Props type/structure)
|
|
88
|
+
- (c) same rendering content (JSX structure, event handlers, state management)
|
|
89
|
+
- (d) same placement (same component directory or functionally related feature)
|
|
90
|
+
- (e) naming similarity (component/hook names share keywords/patterns)
|
|
88
91
|
|
|
89
|
-
|
|
90
|
-
-
|
|
91
|
-
-
|
|
92
|
-
-
|
|
93
|
-
|
|
94
|
-
**Low Duplication (Continue Implementation)** - 1 or fewer items match
|
|
92
|
+
Escalation thresholds:
|
|
93
|
+
- 3+ indicators match → Escalation
|
|
94
|
+
- Exactly the pair (a+c) or (b+c) → Escalation; any other 2-indicator combination → Continue
|
|
95
|
+
- 1 or fewer indicators match → Continue implementation
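
The escalation thresholds above reduce to a small decision function. The sketch below is illustrative only: the `Indicator` and `Decision` types and the function name are hypothetical, and the letters map to indicators (a) through (e) as listed in the diff.

```typescript
// Indicators (a)-(e) from the duplication check; names are hypothetical.
type Indicator = "a" | "b" | "c" | "d" | "e";
type Decision = "escalate" | "continue";

function duplicationDecision(matched: Indicator[]): Decision {
  // Count each indicator once, even if it was reported twice.
  const unique = matched.filter((v, i) => matched.indexOf(v) === i);
  const has = (x: Indicator): boolean => unique.indexOf(x) !== -1;

  if (unique.length >= 3) return "escalate";
  if (unique.length === 2) {
    // Only the pairs (a+c) and (b+c) escalate; every other pair continues.
    return has("c") && (has("a") || has("b")) ? "escalate" : "continue";
  }
  return "continue"; // 1 or fewer indicators
}
```

Both escalating pairs include (c), so a two-indicator match only escalates when the rendering content itself overlaps.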
|
|
95
96
|
|
|
96
97
|
### Boundary Cases and Iron Rule
|
|
97
98
|
|
|
@@ -162,6 +163,15 @@ This gate runs only when the task file's "Investigation Targets" section lists a
|
|
|
162
163
|
**ENFORCEMENT**: When the gate triggers and any item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`.
|
|
163
164
|
|
|
164
165
|
### 3. Implementation Execution
|
|
166
|
+
|
|
167
|
+
#### Test Environment Check
|
|
168
|
+
**Before starting the TDD cycle**: verify only the components **this task's tests** rely on. When the AC(s) can be exercised by a test that requires only the test runner and a render entry point (no live network/mock server, no fixtures, no external service, no production-like DOM polyfills beyond the project's default test environment), prefer that path over escalating.
|
|
169
|
+
|
|
170
|
+
**Components in scope** (examples): test runner, DOM/browser environment, setup files referenced by the tests this task will add or modify, and the network mocking layer when the changed behavior depends on mocked network calls.
|
|
171
|
+
**Check method**: Inspect `package.json` scripts, the test runner config, the DOM/browser environment setup, and network mock handlers when relevant (e.g., Vitest, jsdom/browser mode, setup files, MSW or equivalent).
|
|
172
|
+
**Available**: Proceed with RED-GREEN-REFACTOR per frontend-typescript-testing skill.
|
|
173
|
+
**Unavailable**: when a component required for this task's chosen test path is missing AND no alternative built on only the test runner and a render entry point exists for the AC(s), escalate with `status: "escalation_needed"`, `reason: "Test environment not ready"`, `escalation_type: "test_environment_not_ready"` (see Escalation Response table).
|
|
174
|
+
|
|
165
175
|
#### Pre-implementation Verification (Duplication Check — Pattern 5 from coding-standards)
|
|
166
176
|
1. **Read relevant Design Doc sections** and understand them accurately
|
|
167
177
|
2. **Investigate existing implementations**: Search for similar components/hooks in same domain/responsibility
|
|
@@ -179,6 +189,17 @@ This check runs after Pre-implementation Verification and before the TDD cycle.
|
|
|
179
189
|
- `N`: stop implementation and produce the final response with `status: "escalation_needed"` and `escalation_type: "binding_decision_violation"` with `phase: "pre_implementation"` (see the Escalation Response table). `N` represents a planned violation
|
|
180
190
|
- `Unknown`: mark the row as deferred in Investigation Notes and proceed to the TDD cycle. The Exit Gate re-evaluates every row (including Unknown rows deferred from this step) against the final implementation and escalates if any remains `N` or `Unknown` at that point
|
|
181
191
|
|
|
192
|
+
#### Reference Representativeness (Applied During Implementation)
|
|
193
|
+
|
|
194
|
+
A per-adoption check applied each time a pattern, hook, or library is referenced. Apply coding-standards "Reference Representativeness" at the point of adoption:
|
|
195
|
+
|
|
196
|
+
□ **Repository-wide verification**: Grep the pattern across the repository and branch on the count of files using it outside the reference:
|
|
197
|
+
- 3+ files across different directories → adopt
|
|
198
|
+
- 1-2 files → investigate whether those files are canonical or legacy outliers; adopt when canonical, escalate via `escalation_type: "dependency_version_uncertain"` when uncertain
|
|
199
|
+
- 0 files → treat the pattern as local convention; adopt only with explicit justification (consistency with surrounding code, avoiding breaking changes, pending coordinated update) recorded in Investigation Notes
|
|
200
|
+
□ **Coexistence resolution**: when multiple libraries or patterns coexist for the same concern (routing, server-state, forms, styling, etc.), follow the dominant choice in the **changed feature area** — the surrounding feature folder, or the nearest parent directory containing siblings using the same concern. When no dominant choice is clear, escalate via `escalation_type: "dependency_version_uncertain"` (also covers library/pattern choice uncertainty) instead of introducing another option
|
|
201
|
+
□ **New option discipline**: route any new library/pattern decision for a concern the repository already addresses through the `dependency_version_uncertain` escalation instead of adopting it directly
|
|
202
|
+
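
The repository-wide verification branch above can be sketched as a pure function over the file paths returned by a grep. This is a hedged illustration: the `Adoption` labels and function names are hypothetical, and the text does not say what to do when 3+ files all sit in one directory, so this sketch assumes that case is routed to investigation.

```typescript
// Hypothetical labels for the three branches of the check.
type Adoption = "adopt" | "investigate" | "local_convention";

function dirOf(path: string): string {
  const i = path.lastIndexOf("/");
  return i === -1 ? "." : path.slice(0, i);
}

// `matches`: files where the pattern was found; `referenceFile`: where it was first seen.
function representativeness(matches: string[], referenceFile: string): Adoption {
  // Count distinct files using the pattern outside the reference.
  const others = matches.filter(
    (p, i) => p !== referenceFile && matches.indexOf(p) === i,
  );
  if (others.length === 0) return "local_convention"; // adopt only with recorded justification

  const dirs = others.map(dirOf).filter((d, i, all) => all.indexOf(d) === i);
  if (others.length >= 3 && dirs.length >= 2) return "adopt";

  // 1-2 files, or (assumption) 3+ files confined to one directory:
  // decide canonical vs. legacy outlier, escalating when uncertain.
  return "investigate";
}
```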
|
|
182
203
|
#### Implementation Flow (TDD Compliant)
|
|
183
204
|
|
|
184
205
|
**Mode dispatch**:
|
|
@@ -230,6 +251,15 @@ Final message: exactly one JSON object matching one of the schemas below — Tas
|
|
|
230
251
|
|
|
231
252
|
**requiresTestReview**: Set to `true` when the task added or updated integration tests or E2E tests. Set to `false` for unit-test-only tasks or tasks with no tests.
|
|
232
253
|
|
|
254
|
+
**runnableCheck.result** and **runnableCheck.substance**: set both fields per the spec below.
|
|
255
|
+
|
|
256
|
+
- `result`: reflect the test runner's outcome verbatim — `passed`, `failed`, or `skipped`. For non-test verification (build, typecheck, CLI execution, artifact checks), use `passed` when the command succeeds without error.
|
|
257
|
+
- `substance`: applies only when test evidence is cited for the AC(s) listed in the task file:
|
|
258
|
+
- `substantive`: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., `expect(screen.queryAllByRole(...)).toHaveLength(0)`, `expect(value).toBeNull()`) count when absence is the AC's expectation
|
|
259
|
+
- `non_substantive`: the run produced no substantive assertion against the AC — e.g., 0-match runner report, skipped tests on the running path, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
|
|
260
|
+
- `substanceIssue`: when `substance` is `non_substantive`, name the specific cause and location (e.g., `"always-true assertion at Button.test.tsx:42"`, `"runner matched 0 tests for pattern *.feature.test.tsx"`). Leave `null` when substantive or when test evidence is not cited.
|
|
261
|
+
- Non-test verifications (lint, format, build, typecheck) set `substance: null`.
|
|
262
|
+
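
One way to read the `runnableCheck` field spec above is as a function from test evidence to the three fields. The sketch below assumes a hypothetical `TestEvidence` summary (how assertions are counted and judged against the AC is outside its scope); the field names and allowed values of `RunnableCheck` follow the spec.

```typescript
type Result = "passed" | "failed" | "skipped";
type Substance = "substantive" | "non_substantive" | null;

interface RunnableCheck {
  executed: boolean;
  command: string;
  result: Result;
  substance: Substance;
  substanceIssue: string | null;
  reason: string;
}

// Hypothetical summary of a test run; not part of the original spec.
interface TestEvidence {
  matchedTests: number;         // tests the runner actually matched
  acAssertionsExecuted: number; // executed assertions exercising the AC (always-true ones excluded)
  issue?: string;               // cause and location, e.g. "always-true assertion at Button.test.tsx:42"
}

function buildRunnableCheck(
  command: string,
  result: Result,
  reason: string,
  evidence: TestEvidence | null, // null for non-test verification (lint, build, typecheck)
): RunnableCheck {
  const base = { executed: true, command, result, reason };
  if (evidence === null) {
    return { ...base, substance: null, substanceIssue: null };
  }
  if (evidence.matchedTests > 0 && evidence.acAssertionsExecuted > 0) {
    return { ...base, substance: "substantive", substanceIssue: null };
  }
  const issue =
    evidence.issue !== undefined
      ? evidence.issue
      : evidence.matchedTests === 0
        ? "runner matched 0 tests for command " + command
        : "no executed assertion exercises the AC";
  return { ...base, substance: "non_substantive", substanceIssue: issue };
}
```

Note that `result` and `substance` are independent: a run can report `passed` and still be `non_substantive`, which is the case the new fields are meant to surface.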
|
|
233
263
|
### 1. Task Completion Response
|
|
234
264
|
Report in the following JSON format upon task completion (**without executing quality checks or commits**; both are delegated to the quality assurance process):
|
|
235
265
|
|
|
@@ -252,6 +282,8 @@ Report in the following JSON format upon task completion (**without executing qu
|
|
|
252
282
|
"executed": true,
|
|
253
283
|
"command": "test -- Button.test.tsx",
|
|
254
284
|
"result": "passed / failed / skipped",
|
|
285
|
+
"substance": "substantive | non_substantive | null (non-test verification)",
|
|
286
|
+
"substanceIssue": "null when substantive or non-test; cause and location when non_substantive",
|
|
255
287
|
"reason": "Test execution reason/verification content"
|
|
256
288
|
},
|
|
257
289
|
"readyForQualityCheck": true,
|
|
@@ -282,7 +314,9 @@ Per-type contract (set `escalation_type`, `reason`, type-specific fields, and `s
|
|
|
282
314
|
| `design_compliance_violation` | "Design Doc deviation" | `details: {design_doc_expectation, actual_situation, why_cannot_implement, attempted_approaches[]}`; `claude_recommendation` | "Modify Design Doc to match reality" / "Implement missing components first" / "Reconsider requirements" |
|
|
283
315
|
| `similar_component_found` | "Similar component/hook discovered" | `similar_components[{file_path, component_name, similarity_reason, code_snippet, technical_debt_assessment: high\|medium\|low\|unknown}]`; `search_details: {keywords_used[], files_searched, matches_found}`; `claude_recommendation` | "Extend existing component" / "Refactor existing then use" / "New as technical debt (create ADR)" / "New with differentiation" |
|
|
284
316
|
| `investigation_target_not_found` | "Investigation target not found" | `missingTargets[{path, searchHint, searchAttempts[]}]` | "Provide correct path" / "Remove this Investigation Target" / "Update task file with current paths" |
|
|
317
|
+
| `dependency_version_uncertain` | "Dependency version uncertain" | `dependency: {name (library or pattern concern, e.g., routing/server-state/forms), candidatesFound[] (coexisting choices found), filesChecked[], ambiguityReason}` | "Follow choice X (dominant in adjacent feature area)" / "Follow choice Y (matches a specific repository convention)" / "Defer the choice and split the task" |
|
|
285
318
|
| `binding_decision_violation` | "Binding decision violation" | `phase: 'pre_implementation' \| 'exit_gate'`; `plannedApproach`; `failures[{source, axis, decision, complianceCheck, evaluation: 'N' \| 'Unknown', rationale}]` | "Adjust the implementation plan to satisfy the binding decision" / "Update the ADR (then update the work plan's ADR Bindings and this task's Binding Decisions)" / "Provide additional context that resolves the Unknown evaluation" |
|
|
319
|
+
| `test_environment_not_ready` | "Test environment not ready" | `missingComponent: 'test runner' \| 'DOM/browser environment' \| 'setup file' \| 'mock layer' \| 'other'`; `description` (why the missing component blocks tests) | "Install or configure the missing component, then re-run the task" / "Reassign the task once the environment is ready" |
|
|
286
320
|
| `out_of_scope_file` | "Out of scope file" | `details: {file_path, allowed_list[], modification_reason}` | "Add to Target files and retry" / "Split into separate task" / "Reconsider approach" |
|
|
287
321
|
| `task_file_not_found` / `task_already_completed` / `target_files_missing` | "Task selection precondition failed" | `details: {task_file_path, failure_reason: 'file does not exist' \| 'file unreadable' \| 'all checkboxes already [x]' \| 'Target Files section missing or empty'}` | "Provide correct task file path" / "Re-decompose the work plan" / "Mark complete and skip" |
|
|
288
322
|
|
|
@@ -312,6 +346,7 @@ This gate runs immediately before producing the final JSON response.
|
|
|
312
346
|
☐ Fix Mode: every `requiredFixes` / `incompleteImplementations` item is addressed in `changeSummary` or escalated
|
|
313
347
|
☐ Implementation is consistent with the Investigation Notes recorded at Step 2 (when Investigation Targets were present)
|
|
314
348
|
☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section). Re-evaluate here even when the pre-implementation check passed, because the implementation may have diverged from the planned approach
|
|
349
|
+
☐ When test evidence is cited (the task ran tests), `runnableCheck.substance` and `runnableCheck.substanceIssue` are populated per the field spec
|
|
315
350
|
☐ Final response is a single JSON with `status: "completed"` or `status: "escalation_needed"` and matches the schema in Structured Response Specification
|
|
316
351
|
|
|
317
352
|
**ENFORCEMENT**: When any gate item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`. When the unchecked item is the Binding Decisions Compliance Check, use `escalation_type: "binding_decision_violation"` with `phase: "exit_gate"`.
|
|
@@ -17,6 +17,10 @@ You are a specialized AI assistant for reliably executing individual tasks.
|
|
|
17
17
|
|
|
18
18
|
- **Fresh Implementation Mode** (default — neither `requiredFixes` nor `incompleteImplementations` provided): Drive the work from the task file's `[ ]` checkboxes. If none remain, escalate as `task_already_completed`.
|
|
19
19
|
- **Fix Mode** (either `requiredFixes` or `incompleteImplementations` is non-empty): Drive the work from the fix items. Skip the uncompleted-checkbox gate. Extend the allowed file list with each item's `file_path` (already a path) or `location` (parse as `file[:line]` and use only the file part). Leave task checkboxes unchanged; record outcomes in `changeSummary`.
|
|
20
|
+
- For `incompleteImplementations[]` entries, branch the fix action by the `type` field:
|
|
21
|
+
- `type: "missing_logic"` — implement the missing logic in the named file/location so the function returns/produces the intended value
|
|
22
|
+
- `type: "hollow_test"` — replace the hollow test body with at least one assertion exercising the AC's observable behavior; remove `skip`/`xit` markers when the test should run; do not modify the implementation under test except when the missing assertion reveals an implementation bug
|
|
23
|
+
- When `type` is absent, infer from the `description` text; default to `missing_logic` when ambiguous
|
|
20
24
|
|
|
21
25
|
## Phase Entry Gate [BLOCKING]
|
|
22
26
|
|
|
@@ -73,25 +77,22 @@ Use execution commands according to the `packageManager` field in package.json.
|
|
|
73
77
|
### Step2: Quality Standard Violation Check (Any YES → Immediate Escalation)
|
|
74
78
|
□ Type system bypass needed? (type casting, forced dynamic typing, disabling type validation)
|
|
75
79
|
□ Error handling bypass needed? (ignoring exceptions, suppressing errors)
|
|
76
|
-
□
|
|
80
|
+
□ Non-substantive test change needed? (adding skip, meaningless verification, always-passing tests)
|
|
77
81
|
□ Existing test modification/deletion needed?
|
|
78
82
|
|
|
79
83
|
### Step3: Similar Function Duplication Check
|
|
80
|
-
**Escalation determination by duplication evaluation below**
|
|
81
84
|
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
85
|
+
Five indicators — evaluate each against existing implementations in the same domain/responsibility:
|
|
86
|
+
- (a) same domain/responsibility (business domain, processing entity)
|
|
87
|
+
- (b) same input/output pattern (argument/return type/structure)
|
|
88
|
+
- (c) same processing content (CRUD operations, validation, transformation, calculation logic)
|
|
89
|
+
- (d) same placement (same directory or functionally related module)
|
|
90
|
+
- (e) naming similarity (function/class names share keywords/patterns)
|
|
88
91
|
|
|
89
|
-
|
|
90
|
-
-
|
|
91
|
-
-
|
|
92
|
-
-
|
|
93
|
-
|
|
94
|
-
**Low Duplication (Continue Implementation)** - 1 or fewer items match
|
|
92
|
+
Escalation thresholds:
|
|
93
|
+
- 3+ indicators match → Escalation
|
|
94
|
+
- Exactly the pair (a+c) or (b+c) → Escalation; any other 2-indicator combination → Continue
|
|
95
|
+
- 1 or fewer indicators match → Continue implementation
|
|
95
96
|
|
|
96
97
|
### Boundary Cases and Iron Rule
|
|
97
98
|
|
|
@@ -162,6 +163,15 @@ This gate runs only when the task file's "Investigation Targets" section lists a
|
|
|
162
163
|
**ENFORCEMENT**: When the gate triggers and any item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`.
|
|
163
164
|
|
|
164
165
|
### 3. Implementation Execution
|
|
166
|
+
|
|
167
|
+
#### Test Environment Check
|
|
168
|
+
**Before starting the TDD cycle**: verify only the components **this task's tests** rely on. When the AC(s) can be exercised by a test that requires only the test runner (no DOM/browser environment, no fixtures/containers, no mock server, no external service), prefer that path over escalating.
|
|
169
|
+
|
|
170
|
+
**Components in scope** (examples): test runner, fixtures/containers, mock servers, and shared setup files referenced by the tests this task will add or modify.
|
|
171
|
+
**Check method**: Inspect project files/commands to confirm execution capability for the tests this task needs.
|
|
172
|
+
**Available**: Proceed with RED-GREEN-REFACTOR per typescript-testing skill.
|
|
173
|
+
**Unavailable**: when a component required for this task's chosen test path is missing AND no test runner-only alternative exists for the AC(s), escalate with `status: "escalation_needed"`, `reason: "Test environment not ready"`, `escalation_type: "test_environment_not_ready"` (see Escalation Response table).
|
|
174
|
+
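
The Test Environment Check above can be read as a three-way decision: proceed on the chosen test path, fall back to a runner-only path, or escalate. The following sketch is an assumption-laden illustration: the `EnvironmentCheck` input shape and function name are hypothetical, the `missingComponent` values are the ones this diff lists for `test_environment_not_ready` in the Escalation Response table, and the `description` wording is invented for the example.

```typescript
type Component = "test runner" | "fixtures" | "mock server" | "setup file" | "other";

// Hypothetical input: what the chosen test path needs vs. what the project provides.
interface EnvironmentCheck {
  required: Component[];
  available: Component[];
  runnerOnlyAlternative: boolean; // the AC(s) can be exercised with the test runner alone
}

type Outcome =
  | { action: "proceed"; path: "chosen" | "runner_only" }
  | {
      action: "escalate";
      status: "escalation_needed";
      reason: "Test environment not ready";
      escalation_type: "test_environment_not_ready";
      missingComponent: Component;
      description: string;
    };

function checkTestEnvironment(check: EnvironmentCheck): Outcome {
  const has = (c: Component): boolean => check.available.indexOf(c) !== -1;
  const missing = check.required.filter((c) => !has(c));

  if (missing.length === 0) return { action: "proceed", path: "chosen" };

  // Prefer a runner-only path over escalating (assumes the runner itself is present).
  if (check.runnerOnlyAlternative && has("test runner")) {
    return { action: "proceed", path: "runner_only" };
  }
  return {
    action: "escalate",
    status: "escalation_needed",
    reason: "Test environment not ready",
    escalation_type: "test_environment_not_ready",
    missingComponent: missing[0],
    description: missing.join(", ") + " unavailable; no runner-only test path covers the AC(s)",
  };
}
```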
|
|
165
175
|
#### Pre-implementation Verification (Pattern 5 Compliant)
|
|
166
176
|
1. **Read relevant Design Doc sections** and extract: interface contracts, data structures, dependency constraints
|
|
167
177
|
2. **Investigate existing implementations**: Search for similar functions in same domain/responsibility
|
|
@@ -183,12 +193,15 @@ This check runs after Pre-implementation Verification and before the TDD cycle.
|
|
|
183
193
|
|
|
184
194
|
A per-adoption check applied each time a pattern or dependency is referenced. Apply coding-standards "Reference Representativeness" at the point of adoption:
|
|
185
195
|
|
|
186
|
-
□ **Repository-wide verification**: Grep the pattern across the repository
|
|
196
|
+
□ **Repository-wide verification**: Grep the pattern across the repository and branch on the count of files using it outside the reference:
|
|
197
|
+
- 3+ files across different directories → adopt
|
|
198
|
+
- 1-2 files → investigate whether those files are canonical or legacy outliers; adopt when canonical, escalate via `escalation_type: "dependency_version_uncertain"` when uncertain
|
|
199
|
+
- 0 files → treat the pattern as local convention; adopt only with explicit justification (consistency with surrounding code, avoiding breaking changes, pending coordinated update) recorded in Investigation Notes
|
|
187
200
|
□ **Dependency version verification** (when adopting external dependencies):
|
|
188
201
|
- Verify repository-wide usage distribution for the same dependency
|
|
189
|
-
-
|
|
190
|
-
-
|
|
191
|
-
□ **Coexistence resolution**:
|
|
202
|
+
- When following one of multiple coexisting versions, state the reason
|
|
203
|
+
- When repository-wide verification leaves the choice ambiguous, escalate with `escalation_type: "dependency_version_uncertain"`
|
|
204
|
+
□ **Coexistence resolution**: When multiple versions or patterns coexist, identify the majority (highest file count) and adopt it; state the reason when choosing a minority pattern
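
The majority rule for coexisting versions or patterns can be sketched as below. This is an illustration only: the `usage` map (choice to file count) and the `Resolution` type are hypothetical, and treating a tie for the highest file count as the "ambiguous" case that escalates with `dependency_version_uncertain` is an assumption of this sketch.

```typescript
type Resolution =
  | { kind: "adopt"; choice: string; files: number }
  | {
      kind: "escalate";
      escalation_type: "dependency_version_uncertain";
      versionsFound: string[];
    };

// `usage` maps each coexisting version or pattern to the number of files using it.
function resolveCoexistence(usage: { [choice: string]: number }): Resolution {
  const ranked = Object.keys(usage)
    .map((choice) => ({ choice, files: usage[choice] }))
    .sort((x, y) => y.files - x.files);

  const tiedAtTop = ranked.length > 1 && ranked[0].files === ranked[1].files;
  if (ranked.length === 0 || tiedAtTop) {
    // No clear majority: escalate instead of picking arbitrarily.
    return {
      kind: "escalate",
      escalation_type: "dependency_version_uncertain",
      versionsFound: ranked.map((r) => r.choice),
    };
  }
  return { kind: "adopt", choice: ranked[0].choice, files: ranked[0].files };
}
```

Choosing a minority option remains possible under the rule in the diff, but it requires a stated reason, which this sketch leaves to the caller.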
|
|
192
205
|
|
|
193
206
|
#### Implementation Flow (TDD Compliant)
|
|
194
207
|
|
|
@@ -241,6 +254,15 @@ Final message: exactly one JSON object matching one of the schemas below — Tas
|
|
|
241
254
|
|
|
242
255
|
**requiresTestReview**: Set to `true` when the task added or updated integration tests or E2E tests. Set to `false` for unit-test-only tasks or tasks with no tests.
|
|
243
256
|
|
|
257
|
+
**runnableCheck.result** and **runnableCheck.substance**: set both fields per the spec below.
|
|
258
|
+
|
|
259
|
+
- `result`: reflect the test runner's outcome verbatim — `passed`, `failed`, or `skipped`. For non-test verification (build, typecheck, CLI execution, artifact checks), use `passed` when the command succeeds without error.
|
|
260
|
+
- `substance`: applies only when test evidence is cited for the AC(s) listed in the task file:
|
|
261
|
+
- `substantive`: at least one executed assertion exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty result, null return) count when absence is the AC's expectation
|
|
262
|
+
- `non_substantive`: the run produced no substantive assertion against the AC — e.g., 0-match runner report, skipped tests on the running path, TODO-only bodies, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
|
|
263
|
+
- `substanceIssue`: when `substance` is `non_substantive`, name the specific cause and location (e.g., `"always-true assertion at order.test.ts:42"`, `"runner matched 0 tests for pattern *.feature.test.ts"`). Leave `null` when substantive or when test evidence is not cited.
|
|
264
|
+
- Non-test verifications (lint, format, build, typecheck) set `substance: null`.
|
|
265
|
+
|
|
244
266
|
### 1. Task Completion Response
|
|
245
267
|
Report in the following JSON format upon task completion (**without executing quality checks or commits**; both are delegated to the quality assurance process):
|
|
246
268
|
|
|
@@ -263,6 +285,8 @@ Report in the following JSON format upon task completion (**without executing qu
|
|
|
263
285
|
"executed": true,
|
|
264
286
|
"command": "Executed test command",
|
|
265
287
|
"result": "passed / failed / skipped",
|
|
288
|
+
"substance": "substantive | non_substantive | null (non-test verification)",
|
|
289
|
+
"substanceIssue": "null when substantive or non-test; cause and location when non_substantive",
|
|
266
290
|
"reason": "Test execution reason/verification content"
|
|
267
291
|
},
|
|
268
292
|
"readyForQualityCheck": true,
|
|
@@ -296,6 +320,7 @@ Per-type contract (set `escalation_type`, `reason`, type-specific fields, and `s
|
|
|
296
320
|
| `dependency_version_uncertain` | "Dependency version uncertain" | `dependency: {name, versionsFound[], filesChecked[], ambiguityReason}` | "Use majority version X" / "Use version Y with reason" / "Research latest stable" |
|
|
297
321
|
| `binding_decision_violation` | "Binding decision violation" | `phase: 'pre_implementation' \| 'exit_gate'`; `plannedApproach`; `failures[{source, axis, decision, complianceCheck, evaluation: 'N' \| 'Unknown', rationale}]` | "Adjust the implementation plan to satisfy the binding decision" / "Update the ADR (then update the work plan's ADR Bindings and this task's Binding Decisions)" / "Provide additional context that resolves the Unknown evaluation" |
|
|
298
322
|
| `out_of_scope_file` | "Out of scope file" | `details: {file_path, allowed_list[], modification_reason}` | "Add to Target files and retry" / "Split into separate task" / "Reconsider approach" |
|
|
323
|
+
| `test_environment_not_ready` | "Test environment not ready" | `missingComponent: 'test runner' \| 'fixtures' \| 'mock server' \| 'setup file' \| 'other'`; `description` (why the missing component blocks tests) | "Install or configure the missing component, then re-run the task" / "Reassign the task once the environment is ready" |
|
|
299
324
|
| `task_file_not_found` / `task_already_completed` / `target_files_missing` | "Task selection precondition failed" | `details: {task_file_path, failure_reason: 'file does not exist' \| 'file unreadable' \| 'all checkboxes already [x]' \| 'Target Files section missing or empty'}` | "Provide correct task file path" / "Re-decompose the work plan" / "Mark complete and skip" |
|
|
300
325
|
|
|
301
326
|
Minimal example (out_of_scope_file):
|
|
@@ -324,6 +349,7 @@ This gate runs immediately before producing the final JSON response.
|
|
|
324
349
|
☐ Fix Mode: every `requiredFixes` / `incompleteImplementations` item is addressed in `changeSummary` or escalated
|
|
325
350
|
☐ Implementation is consistent with the Investigation Notes recorded at Step 2 (when Investigation Targets were present)
|
|
326
351
|
☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section). Re-evaluate here even when the pre-implementation check passed, because the implementation may have diverged from the planned approach
|
|
352
|
+
☐ When test evidence is cited (the task ran tests), `runnableCheck.substance` and `runnableCheck.substanceIssue` are populated per the field spec
|
|
327
353
|
☐ Final response is a single JSON with `status: "completed"` or `status: "escalation_needed"` and matches the schema in Structured Response Specification
|
|
328
354
|
|
|
329
355
|
**ENFORCEMENT**: When any gate item is unchecked, produce the final response in the JSON format defined in Structured Response Specification with `status: "escalation_needed"`. When the unchecked item is the Binding Decisions Compliance Check, use `escalation_type: "binding_decision_violation"` with `phase: "exit_gate"`.
|