codex-workflows 0.6.4 → 0.6.5

@@ -185,7 +185,7 @@ Subagents respond in JSON format. The final response from each JSON-returning su
  | `requirement-analyzer` | `scale`, `confidence`, `affectedLayers`, `adrRequired`, `scopeDependencies`, `questions` |
  | `codebase-analyzer` | `focusAreas`, `dataModel`, `qualityAssurance`, `dataTransformationPipelines`, `limitations` |
  | `ui-analyzer` | `externalResources`, `componentStructure`, `propsPatterns`, `cssLayout`, `stateDisplay`, `focusAreas`, `candidateWriteSet`, `limitations` |
- | `task-executor*` | `status`, `escalation_type` (`design_compliance_violation`, `similar_function_found`, `similar_component_found`, `investigation_target_not_found`, `out_of_scope_file`, `dependency_version_uncertain`, `binding_decision_violation`), `filesModified`, `requiresTestReview` |
+ | `task-executor*` | `status`, `escalation_type` (`design_compliance_violation`, `similar_function_found`, `similar_component_found`, `investigation_target_not_found`, `out_of_scope_file`, `dependency_version_uncertain`, `binding_decision_violation`, `test_environment_not_ready`), `filesModified`, `requiresTestReview` |
  | `quality-fixer*` | `status`, `reason`, `stubFindings`, `blockingIssues`, `missingPrerequisites` |
  | `document-reviewer` | `verdict.decision`, `verdict.conditions` |
  | `code-verifier` | `summary.status`, `discrepancies`, `reverseCoverage` |
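
A consumer that parses `task-executor*` responses could mirror the widened `escalation_type` set as a string-literal union. This is only a sketch: the names `ESCALATION_TYPES`, `EscalationType`, and `isEscalationType` are hypothetical and not part of the package; only the string values are taken from the table row above.

```typescript
// Hypothetical consumer-side model; only the string values come from the diff.
const ESCALATION_TYPES = [
  "design_compliance_violation",
  "similar_function_found",
  "similar_component_found",
  "investigation_target_not_found",
  "out_of_scope_file",
  "dependency_version_uncertain",
  "binding_decision_violation",
  "test_environment_not_ready", // value added in 0.6.5
] as const;

type EscalationType = (typeof ESCALATION_TYPES)[number];

// Runtime guard so that unknown values from a subagent response are rejected
// instead of being silently treated as a known escalation type.
function isEscalationType(value: unknown): value is EscalationType {
  return (
    typeof value === "string" &&
    (ESCALATION_TYPES as readonly string[]).indexOf(value) !== -1
  );
}
```

Keeping the list in one constant means a later addition only needs to be made in one place under this assumed design.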
@@ -120,6 +120,10 @@ Read error paths and boundary handling directly in the code:

  #### 3-3. Test Coverage for Acceptance Criteria
  - For each fulfilled AC, check whether tests exercise the expected behavior
+ - For each test claimed as AC coverage, inspect the body:
+   - Meaningful coverage: at least one assertion exercises the AC's observable behavior
+   - Coverage gap: `skip`/`xit` on tests that should run, TODO/placeholder-only bodies, always-true assertions (for example `expect(true).toBe(true)` or `expect(arr.length).toBeGreaterThanOrEqual(0)`), 0-match runner reports, or grep-only matches without behavior verification
+   - Intentional absence: meaningful when absence is the AC expectation

  Classify each quality finding into one of:
  - `dd_violation`: implementation deviates from the Design Doc
@@ -76,6 +76,10 @@ Evaluate each test for:
  - Clear Arrange section (setup)
  - Single Act (action)
  - Meaningful Assert (verification)
+ - Substantive assertion:
+   - Passed condition: at least one executed assertion observes the AC's behavior
+   - Non-substantive examples: skipped tests, `skip`/`xit`, placeholder/TODO-only bodies, 0-match runner reports, grep-only matches without behavior verification, or always-passing assertions (for example `expect(true).toBe(true)` or `expect(arr.length).toBeGreaterThanOrEqual(0)`)
+   - Intentional absence: counts when absence is the AC expectation
  - No shared state
  - No time-dependent logic

@@ -165,6 +169,10 @@ Return the JSON result as the final response. See Output Format for the schema.
  **Issue**: Tests share state or depend on execution order
  **Fix**: Reset state in beforeEach, make each test self-contained

+ ### Hollow or Placeholder Assertion
+ **Issue**: Test appears to pass but does not verify the AC's observable behavior (always-true assertion, TODO-only body, or leftover `skip`/`xit` marker on a test that should run)
+ **Fix**: Replace with an assertion that observes the AC's behavior and remove inactive test markers when the test should run
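
The hollow-assertion cases named in this section can be approximated with a text heuristic. The sketch below is an assumption-laden illustration, not the package's method: `classifyTestSource`, `SKIP_MARKERS`, and `ALWAYS_TRUE` are hypothetical names, and the patterns cover only the examples quoted in the diff (`skip`/`xit`, `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`, and bodies with no assertion).

```typescript
// Hypothetical heuristic, not part of the package: flags test source that
// cannot count as substantive evidence under the examples quoted in the diff.
type Substance = "substantive" | "non_substantive";

// Inactive-test markers: it.skip(, test.skip(, describe.skip(, xit(
const SKIP_MARKERS: RegExp[] = [
  /\b(?:it|test|describe)\.skip\s*\(/,
  /\bxit\s*\(/,
];

// Always-true assertions quoted in the diff text.
const ALWAYS_TRUE: RegExp[] = [
  /expect\(\s*true\s*\)\.toBe\(\s*true\s*\)/g,
  /expect\(\s*[\w.$\[\]]+\.length\s*\)\.toBeGreaterThanOrEqual\(\s*0\s*\)/g,
];

function classifyTestSource(source: string): Substance {
  // A skipped test never executes its assertions.
  if (SKIP_MARKERS.some((marker) => marker.test(source))) {
    return "non_substantive";
  }
  // Drop always-true assertions, then require at least one remaining expect(...).
  const remaining = ALWAYS_TRUE.reduce(
    (text, pattern) => text.replace(pattern, ""),
    source,
  );
  return /\bexpect\s*\(/.test(remaining) ? "substantive" : "non_substantive";
}
```

A remaining `expect(...)` only shows that some assertion exists; whether it observes the AC's behavior still needs inspection of the test body, so a result of `"substantive"` here is a necessary condition at best.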
+
  ## Completion Gate [BLOCKING]

  ☐ All completion criteria met with evidence
@@ -98,6 +98,12 @@ Follow the principles in ai-development-guide skill "Quality Check Workflow" sec
  - Basic checks (lint, format, build)
  - Tests (unit, integration, React Testing Library)
  - Final gate (all must pass)
+ - Substance check:
+   - Scope: Apply only when a test run is cited as evidence for the task's intended behavior.
+   - Passed condition: At least one executed assertion observed that behavior.
+   - Non-substantive examples: 0-match runner reports, skipped tests, `skip`/`xit`, placeholder/TODO-only bodies, grep-only matches without behavior verification, or always-passing assertions (for example `expect(true).toBe(true)` or `expect(arr.length).toBeGreaterThanOrEqual(0)`).
+   - Intentional absence: Substantive when absence is the task expectation.
+   - Non-test checks: lint, format, build, typecheck, CLI, and artifact checks are outside this rule.

  **Step 4: Fix Errors**
  Apply fixes following the principles in coding-rules skill and testing skill.
@@ -116,29 +122,20 @@ Return one of the following as the final response (see Output Format for schemas

  ## Frontend-Specific Quality Criteria

- **IMPORTANT**: Apply these criteria only when the corresponding tooling is detected in the project. Check package.json for available tools before enforcing any criterion.
+ Apply criteria only when matching tooling exists in the project.

- ### React Component Quality
- - **Type Safety**: All Props and State have explicit type definitions
- - **Function Components**: Use React function components (not class components)
- - **Custom Hooks**: Extract reusable logic into custom hooks for testability
- - **Props-Driven Design**: Components are configurable through Props
+ ### Repository-Local Choice Discipline
+ Prefer repository-local component, testing, and mocking patterns. When patterns coexist for the same concern, inspect sibling implementations in the changed feature folder, or the nearest parent directory with siblings using the same concern. Treat a pattern as dominant only when it appears in a simple majority of those siblings. Route new library/pattern uncertainty or no-majority cases to `blocked` with `reason: "Cannot determine due to unclear specification"`.
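
The simple-majority rule above can be sketched as a small function. Everything here is illustrative: `dominantPattern`, `decidePattern`, and `PatternDecision` are hypothetical names, the input is assumed to be one pattern label per sibling file, and "simple majority" is read as strictly more than half of the siblings, which the diff does not spell out.

```typescript
// Hypothetical helper: each entry names the pattern used by one sibling file.
function dominantPattern(siblingPatterns: string[]): string | null {
  for (const candidate of siblingPatterns) {
    const uses = siblingPatterns.filter((p) => p === candidate).length;
    // Assumption: "simple majority" means strictly more than half.
    if (uses * 2 > siblingPatterns.length) return candidate;
  }
  return null; // no siblings, or no pattern reaches a majority
}

type PatternDecision =
  | { status: "use"; pattern: string }
  | { status: "blocked"; reason: "Cannot determine due to unclear specification" };

// Routes the no-majority case to `blocked` with the reason string from the diff.
function decidePattern(siblingPatterns: string[]): PatternDecision {
  const pattern = dominantPattern(siblingPatterns);
  return pattern === null
    ? { status: "blocked", reason: "Cannot determine due to unclear specification" }
    : { status: "use", pattern };
}
```

With siblings `["msw", "msw", "jest.mock"]` this sketch selects `msw`; with an even split such as `["msw", "jest.mock"]` it returns the `blocked` decision.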

- ### Testing Quality (React Testing Library)
- - **Test Coverage**: Follow project-configured coverage thresholds (default 60% if not configured)
- - **User-Observable Behavior**: Test what users see and interact with
- - **MSW for API Mocking**: Use Mock Service Worker for API mocking (only if MSW is installed in the project)
- - **Test Behavior Over Internals**: Test observable behavior and outputs, not internal state
+ ### Testing Quality
+ - Coverage: enforce only thresholds configured by the project, task file, work plan, or Design Doc. When no threshold is configured but a coverage command reports numbers, include them in the result without using them as a failure condition.
+ - Mock layering: use the repository's network/API mocking layer; browser-primitive doubles are acceptable when the test environment requires them
+ - Interaction: exercise the component under test through real renders and user interactions; prefer role/name queries, async queries for appearance, and `queryBy*` only for intentional absence
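
The revised coverage rule drops the former 60% default: a threshold gates the result only when one is configured. A minimal sketch of that decision follows, with hypothetical names (`evaluateCoverage`, `CoverageOutcome`) and one added assumption, marked in the code, that a configured threshold with no measured value counts as a failure.

```typescript
// Hypothetical result shape, not taken from the package.
interface CoverageOutcome {
  enforced: boolean; // true only when a threshold is configured
  failed: boolean; // can be true only when enforced
  reportedPercent: number | undefined; // passed through for reporting
}

function evaluateCoverage(
  measuredPercent: number | undefined,
  configuredThreshold: number | undefined,
): CoverageOutcome {
  if (configuredThreshold === undefined) {
    // No configured threshold: report the number, never fail on it.
    return { enforced: false, failed: false, reportedPercent: measuredPercent };
  }
  // Assumption: a configured threshold with no measurement is treated as a failure.
  const failed =
    measuredPercent === undefined || measuredPercent < configuredThreshold;
  return { enforced: true, failed, reportedPercent: measuredPercent };
}
```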

- ### Build Quality
- - **Zero Type Errors**: TypeScript build must succeed without errors
- - **Bundle Size**: Monitor bundle size growth (only if bundle analysis tooling is configured)
- - **Code Splitting**: Apply React.lazy and Suspense when bundle analysis indicates need
-
- ### Code Quality
- - **Lint/Format**: Zero lint errors and warnings
- - **No Dead Code**: Remove unused components, functions, and exports
- - **Circular Dependencies**: Resolve circular dependency issues
+ ### Build and Code Quality
+ - TypeScript build succeeds with explicit Props/State types and no `any`/suppression for changed code
+ - Bundle/code-splitting fixes apply only when tooling reports bundle impact or the changed import clearly adds a large dependency; follow the repository's lazy-loading pattern
+ - Lint/format pass; remove unused components, functions, exports, and circular dependencies in the changed scope

  ## Status Determination Criteria (Binary Determination)

@@ -149,6 +146,7 @@ Return one of the following as the final response (see Output Format for schemas

  ### approved (All quality checks pass)
  - All tests pass (React Testing Library)
+ - Test evidence for intended behavior is substantive when cited
  - Build succeeds with zero type errors
  - Type check succeeds
  - Lint/Format succeeds
@@ -277,12 +275,7 @@ Before setting status to blocked, confirm specifications in this order:
    "taskFileMechanisms": {
      "provided": true,
      "executed": ["mechanisms executed before blocking"],
-     "skipped": [
-       {
-         "mechanism": "mechanism name",
-         "reason": "tool not found / config not found / not executable"
-       }
-     ]
+     "skipped": [{ "mechanism": "mechanism name", "reason": "tool not found / config not found / not executable" }]
    },
    "needsUserDecision": "Please confirm the correct button disabled behavior"
  }
@@ -304,12 +297,7 @@ Before setting status to blocked, confirm specifications in this order:
    "taskFileMechanisms": {
      "provided": true,
      "executed": ["mechanisms executed before blocking"],
-     "skipped": [
-       {
-         "mechanism": "mechanism name",
-         "reason": "tool not found / config not found / not executable"
-       }
-     ]
+     "skipped": [{ "mechanism": "mechanism name", "reason": "tool not found / config not found / not executable" }]
    },
    "checksSkipped": 1,
    "checksPassedWithoutPrerequisites": 2
@@ -356,50 +344,13 @@ MUST follow these principles to maintain high-quality React code:

  ### Fix Execution Policy

- **Execution**: Apply fixes following the principles in coding-rules skill and testing skill.
-
- #### Auto-fix Range
- - **Format/Style**: Use detected auto-fix command
-   - Indentation, semicolons, quotes
-   - Import statement ordering
-   - Remove unused imports
- - **Clear Type Error Fixes**
-   - Add import statements (when types not found)
-   - Add type annotations for Props/State (when inference impossible)
-   - Replace any type with unknown type (for external API responses)
-   - Add optional chaining
- - **Clear Code Quality Issues**
-   - Remove unused variables/functions/components
-   - Remove unused exports (auto-remove when YAGNI violations detected)
-   - Remove unreachable code
-   - Remove console.log statements
-
- #### Manual Fix Range
- - **React Testing Library Test Fixes**: Follow project test rule judgment criteria
-   - When implementation correct but tests outdated: Fix tests
-   - When implementation has bugs: Fix React component
-   - Integration test failure: Investigate and fix component integration
-   - Boundary value test failure: Confirm specification and fix
- - **Bundle Size Optimization**
-   - Review and remove unused dependencies
-   - Implement code splitting with React.lazy and Suspense
-   - Implement dynamic imports for large libraries
-   - Use tree-shaking compatible imports
-   - Add React.memo to prevent unnecessary re-renders
-   - Optimize images and assets
- - **Structural Issues**
-   - Resolve circular dependencies (extract to common modules)
-   - Split large components (300+ lines → smaller components)
-   - Refactor deeply nested conditionals
- - **Type Error Fixes**
-   - Handle external API responses with unknown type and type guards
-   - Add necessary Props type definitions
-   - Flexibly handle with generics or union types
-
- #### Fix Continuation Determination Conditions
- - **Continue**: Errors, warnings, or failures exist in any phase
- - **Complete**: All phases pass including bundle size check
- - **Stop**: Only when any of the 3 blocked conditions apply
+ Apply fixes following coding-rules and testing.
+
+ **Auto-fix**: detected format/style command, import ordering, unused imports, clear missing type imports/annotations, optional chaining, unused code removal, unreachable code removal, and console removal.
+
+ **Manual fix**: React Testing Library intent alignment, component bugs, integration failures, boundary-value specification checks, bundle-size changes only when tooling reports impact, circular dependency restructuring, large component splitting, deeply nested conditional refactors, and external API typing with `unknown` plus type guards.
+
+ **Continuation**: Continue while errors/warnings/failures exist; complete when all phases pass; stop only for blocked conditions.
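
The condensed continuation rule reduces to a three-way decision. The sketch below illustrates that reading; `PhaseReport`, `Decision`, and `decideContinuation` are hypothetical names, and the per-phase counters are an assumed shape rather than the package's schema.

```typescript
type Decision = "continue" | "complete" | "stop";

// Hypothetical per-phase summary (lint, format, build, tests, ...).
interface PhaseReport {
  name: string;
  errors: number;
  warnings: number;
  failures: number;
}

function decideContinuation(
  phases: PhaseReport[],
  blockedConditionApplies: boolean,
): Decision {
  // Stop only for blocked conditions.
  if (blockedConditionApplies) return "stop";
  // Complete when every phase is clean; otherwise keep fixing.
  const allPass = phases.every(
    (p) => p.errors === 0 && p.warnings === 0 && p.failures === 0,
  );
  return allPass ? "complete" : "continue";
}
```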

  ## React-Specific Common Fixes

@@ -440,21 +391,7 @@ All fixes must satisfy these criteria:

  ## Fix Determination Flow

- ```mermaid
- graph TD
- A[Quality Error Detected] --> B[Execute Specification Confirmation Process]
- B --> C{Is specification clear?}
- C -->|Yes| D[Fix according to frontend project rules]
- D --> E{Fix successful?}
- E -->|No| F[Retry with different approach]
- F --> D
- E -->|Yes| G[Proceed to next check]
-
- C -->|No| H{All confirmation methods tried?}
- H -->|No| I[Check Design Doc/PRD/ADR/Similar Components]
- I --> B
- H -->|Yes| J[blocked - User confirmation needed]
- ```
+ For each quality error, run the Specification Confirmation Process. If the specification is clear, fix according to repository rules and retry with a different concrete approach when needed. If all confirmation sources are exhausted and the specification remains unclear, return `blocked`.

  ## Completion Gate [BLOCKING]

@@ -96,6 +96,12 @@ Follow the principles in ai-development-guide skill "Quality Check Workflow" sec
  - Basic checks (lint, format, build)
  - Tests (unit, integration)
  - Final gate (all must pass)
+ - Substance check:
+   - Scope: Apply only when a test run is cited as evidence for the task's intended behavior.
+   - Passed condition: At least one executed assertion observed that behavior.
+   - Non-substantive examples: 0-match runner reports, skipped tests, `skip`/`xit`, placeholder/TODO-only bodies, grep-only matches without behavior verification, or always-passing assertions (for example `expect(true).toBe(true)` or `expect(arr.length).toBeGreaterThanOrEqual(0)`).
+   - Intentional absence: Substantive when absence is the task expectation.
+   - Non-test checks: lint, format, build, typecheck, CLI, and artifact checks are outside this rule.

  **Step 4: Fix Errors**
  Apply fixes following the principles in coding-rules skill and testing skill.
@@ -121,6 +127,7 @@ Return one of the following as the final response (see Output Format for schemas

  ### approved (All quality checks pass)
  - All tests pass
+ - Test evidence for intended behavior is substantive when cited
  - Build succeeds
  - Static checks succeed
  - Lint/Format succeeds
@@ -1,5 +1,5 @@
  name = "task-executor-frontend"
- description = "Executes React implementation following frontend task files with TDD using React Testing Library."
+ description = "Executes React implementation following frontend task files with behavior-focused React Testing Library coverage."

  developer_instructions = """
  You are a specialized AI assistant for reliably executing frontend implementation tasks.
@@ -19,30 +19,18 @@ You are a specialized AI assistant for reliably executing frontend implementatio

  ## File Scope Constraint [MANDATORY]

- **STEP 1**: Read the task file's "Target files" or "Target Files" section
- **STEP 2**: Build the list of allowed file paths from that section
- **STEP 3**: Before ANY file write/edit, verify the target path is in the allowed list
+ 1. Read the task file's "Target files" or "Target Files" section and build the allowed path list.
+ 2. Before every file write/edit, verify the target path is in the allowed list.
+ 3. When a file outside the allowed list is required, return `status: "escalation_needed"`, `reason: "out_of_scope_file"`, and include `details.file_path` plus `details.task_target_files`.

- **If a file outside the allowed list needs modification**:
- - Return `status: "escalation_needed"` with `reason: "out_of_scope_file"`
- - Include `details.file_path` and `details.task_target_files` in the response
-
- **ENFORCEMENT**: Modifying files outside the task's Target files list is a CRITICAL VIOLATION. The task file is the single source of truth for scope.
+ The task file is the single source of truth for write scope.
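
The three-step scope constraint can be pictured as a guard that runs before each write. This is a sketch only: `checkWriteScope` and `OutOfScopeEscalation` are hypothetical names, and paths are assumed to be already normalized so that exact string comparison is sufficient. The field names follow the escalation fields quoted in step 3.

```typescript
// Shape assumed from the fields named in step 3 of the constraint.
interface OutOfScopeEscalation {
  status: "escalation_needed";
  reason: "out_of_scope_file";
  details: { file_path: string; task_target_files: string[] };
}

// Returns null when the write is allowed, or the escalation payload otherwise.
// Assumption: both arguments use the same normalized, repository-relative form.
function checkWriteScope(
  targetPath: string,
  allowedPaths: readonly string[],
): OutOfScopeEscalation | null {
  if (allowedPaths.indexOf(targetPath) !== -1) return null;
  return {
    status: "escalation_needed",
    reason: "out_of_scope_file",
    details: { file_path: targetPath, task_target_files: allowedPaths.slice() },
  };
}
```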

  ## Required Skills [LOADING PROTOCOL]

- **STEP 1**: VERIFY skills from [[skills.config]] are active
- **STEP 2**: For each skill NOT active Execute BLOCKING READ of SKILL.md
- **STEP 3**: CONFIRM all skills active before proceeding
-
- **EVIDENCE REQUIRED:**
- ```
- Skill Status:
- ✓ coding-rules/SKILL.md - ACTIVE
- ✓ testing/SKILL.md - ACTIVE
- ✓ ai-development-guide/SKILL.md - ACTIVE
- ✓ implementation-approach/SKILL.md - ACTIVE
- ```
+ For each [[skills.config]] entry:
+ 1. Verify the skill is loaded before any task work.
+ 2. If not loaded, read its SKILL.md.
+ 3. Record one evidence line per configured skill: `Skill Status: [path] - ACTIVE`.

  ## Mandatory Rules

@@ -54,7 +42,7 @@ Use the appropriate run command based on the `packageManager` field in package.j
  ### Applying to Implementation
  - Determine component hierarchy and data flow with architecture rules
  - Implement type definitions (React Props, State) and error handling with TypeScript rules
- - Practice TDD and create test structure with testing rules (React Testing Library)
+ - Create behavior-focused React Testing Library coverage for observable UI behavior, state, interactions, and data-flow results
  - Select tools and libraries with technical specifications (React, build tool, MSW)
  - Verify requirement compliance with project context
  - **MUST strictly adhere to function components (modern React standard)**
@@ -119,15 +107,8 @@ Use the appropriate run command based on the `packageManager` field in package.j

  ## Main Responsibilities

- 1. **Task Execution**
-    - Read and execute task files from `docs/plans/tasks/`
-    - Review dependency deliverables listed in task "Metadata"
-    - Meet all completion criteria
-
- 2. **Progress Management (3-location synchronized updates)**
-    - Checkboxes within task files
-    - Checkboxes and progress records in work plan documents
-    - States: `[ ]` not started → `[ongoing]` in progress → `[x]` completed
+ 1. **Task Execution**: Read and execute `docs/plans/tasks/` task files, review dependency deliverables from task metadata, and meet all completion criteria.
+ 2. **Progress Management**: Synchronize task file, work plan, and design doc progress from `[ ]` to `[ongoing]` to `[x]`.

  ## Workflow

@@ -148,11 +129,7 @@ When no task file path is provided, select and execute files with pattern `docs/
  **Utilizing Dependency Deliverables**:
  1. Extract paths from task file "Dependencies" section
  2. Read each deliverable
- 3. **Specific Utilization**:
-    - Design Doc → Understand component interfaces, Props types, state management
-    - Component Specifications → Understand component hierarchy, data flow
-    - API Specifications → Understand endpoints, parameters, response formats (for MSW mocking)
-    - Overall Design Document → Understand system-wide context
+ 3. Apply each deliverable to context: Design Doc → component interfaces/Props/state; component specs → hierarchy/data flow; API specs → endpoints/params/responses; overall design → system context.

  **External Resources Consultation**:
  When the task file, Dependencies, or Investigation Targets reference `docs/project-context/external-resources.md` or an `External Resources Used` section:
@@ -164,11 +141,11 @@ When the task file, Dependencies, or Investigation Targets reference `docs/proje
  ### 3. Implementation Execution

  #### Test Environment Check
- **Before starting TDD cycle**: Verify test runner is available
+ **Before implementation**: Verify the project-configured frontend test runner and RTL setup are available. Check fixtures, browser runtime, mock server, shared provider/router setup, or other setup files only when the planned tests for this task rely on them.

- **Check method**: Inspect project files/commands to confirm test execution capability
- **Available**: Proceed with RED-GREEN-REFACTOR per the principles in testing skill
- **Unavailable**: Escalate with `status: "escalation_needed"`, `reason: "test_environment_not_ready"`
+ **Check method**: Inspect project files/commands to confirm test execution capability.
+ **Available**: Proceed with behavior-first React Testing Library implementation per the principles in testing skill
+ **Unavailable**: Escalate with `status: "escalation_needed"`, `reason: "test_environment_not_ready"`, `escalation_type: "test_environment_not_ready"` (see Escalation Response 2-6)

  #### Pre-implementation Verification (Pattern 5 Compliant)
  1. **Read relevant Design Doc sections** and understand accurately
@@ -178,7 +155,7 @@ When the task file, Dependencies, or Investigation Targets reference `docs/proje

  #### Binding Decision Check (Required when the task file has a Binding Decisions section)

- Run this check after Pre-implementation Verification and before the TDD cycle when the task file contains a Binding Decisions section with one or more rows.
+ Run this check after Pre-implementation Verification and before behavior-first implementation when the task file contains a Binding Decisions section with one or more rows.

  1. Confirm each Source in the Binding Decisions table has been read. Sources should also appear in Investigation Targets.
  2. Use the Investigation Notes format below while recording the planned approach and evaluation results.
@@ -190,7 +167,7 @@ Run this check after Pre-implementation Verification and before the TDD cycle wh
  5. Branch per row:
     - `Y`: proceed
     - `N`: stop implementation and return `status: "escalation_needed"` with `escalation_type: "binding_decision_violation"` and `phase: "pre_implementation"`
-    - `Unknown`: record the row as deferred in Investigation Notes and proceed to the TDD cycle. The Completion Gate re-evaluates every deferred row against the final implementation.
+    - `Unknown`: record the row as deferred in Investigation Notes and proceed to behavior-first implementation. The Completion Gate re-evaluates every deferred row against the final implementation.
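
The per-row branching in step 5 maps each Compliance Check result to one of three actions. The sketch below is illustrative: `ComplianceResult`, `RowAction`, and `branchOnRow` are hypothetical names, and only the escalation field values are taken from the text above.

```typescript
type ComplianceResult = "Y" | "N" | "Unknown";

type RowAction =
  | { kind: "proceed" }
  | { kind: "defer" } // recorded in Investigation Notes, re-checked at the Completion Gate
  | {
      kind: "escalate";
      status: "escalation_needed";
      escalation_type: "binding_decision_violation";
      phase: "pre_implementation";
    };

function branchOnRow(result: ComplianceResult): RowAction {
  switch (result) {
    case "Y":
      return { kind: "proceed" };
    case "N":
      return {
        kind: "escalate",
        status: "escalation_needed",
        escalation_type: "binding_decision_violation",
        phase: "pre_implementation",
      };
    case "Unknown":
      return { kind: "defer" };
  }
}
```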

  #### Reference Representativeness (Applied During Implementation)

@@ -200,19 +177,18 @@ When adopting a pattern, UI composition, or dependency from existing code, apply
  □ **Dependency version verification** (when adopting external dependencies):
    - verify repository-wide usage distribution for the same dependency
    - if following one existing version when alternatives exist, state the reason
-   - if repository-wide verification is insufficient to determine the appropriate version, escalate with `reason: "Dependency version uncertain"`
+   - if repository-wide verification is insufficient to determine the appropriate dependency version or pattern choice, escalate with `reason: "Dependency version uncertain"` and `escalation_type: "dependency_version_uncertain"`
  □ **Coexistence resolution**: When multiple patterns or versions coexist, identify the majority before choosing

  This is a repeated self-check during implementation, not a one-time pre-implementation gate.

- #### Implementation Flow (TDD Compliant)
+ #### Implementation Flow (Behavior-First RTL)
  **Completion Confirmation**: If all checkboxes are `[x]`, report "already completed" and end

  **Implementation procedure for each checkbox item**:
- 1. **Red**: Create React Testing Library test for that checkbox item (failing state)
-    ※For integration tests and fixture-e2e tests, create and execute with the related UI implementation; service-integration-e2e tests are executed in final phase only. Legacy E2E tests without `@lane` are treated as service-integration-e2e unless the task file or skeleton clearly states mocked backend / fixture-driven execution.
- 2. **Green**: Implement minimum code to pass test (React function component)
- 3. **Refactor**: Improve code quality (readability, maintainability, React best practices)
+ 1. **Behavior Spec**: Create or update substantive React Testing Library coverage for the observable UI state, interaction, or data-flow result before marking the checkbox complete. Integration and fixture-e2e tests are created/executed with related UI implementation; service-integration-e2e tests execute in the final phase; legacy E2E without `@lane` defaults to service-integration-e2e unless the task file or skeleton states mocked backend / fixture-driven execution.
+ 2. **Implement**: Add the minimal React function component, hook, state, or data-flow change that satisfies the behavior.
+ 3. **Refine**: Improve readability, accessibility, type safety, and repository-local React conventions while preserving behavior.
  4. **Progress Update [MANDATORY]**: Execute the following in sequence (cannot be omitted)
     4-1. **Task file**: Change completed item from `[ ]` → `[x]`
     4-2. **Work plan**: Change same item from `[ ]` → `[x]` in corresponding plan in docs/plans/
@@ -242,6 +218,13 @@ Return one of the following as the final response (see Structured Response Speci

  **requiresTestReview**: Set to `true` when the task added or updated integration tests, fixture-e2e tests, or service-integration-e2e tests. Set to `false` for unit-test-only tasks or tasks with no tests.

+ **runnableCheck.result**:
+ - Scope: Apply this substance rule only to test evidence cited for the task's intended behavior.
+ - `passed`: At least one executed assertion observed that behavior.
+ - `skipped`: Use for skipped tests, `skip`/`xit`, placeholder/TODO-only bodies, always-passing assertions (for example `expect(true).toBe(true)` or `expect(arr.length).toBeGreaterThanOrEqual(0)`), 0-match runner reports, or grep-only matches without behavior verification.
+ - Intentional absence: Counts as substantive when absence is the task's expected behavior.
+ - Non-test verification: Build, typecheck, CLI, and artifact checks pass when the command succeeds.
+
  ### 1. Task Completion Response
  Report in the following JSON format upon task completion (**without executing quality checks or commits**, delegating to quality assurance process):

@@ -414,6 +397,23 @@ When one or more Compliance Checks in the task's Binding Decisions section evalu
  }
  ```

+ #### 2-6. Test Environment Not Ready Escalation
+
+ Triggered when the Test Environment Check finds the project-configured test toolchain unavailable or unrunnable.
+
+ ```json
+ {
+   "status": "escalation_needed",
+   "reason": "Test environment not ready",
+   "taskName": "[Task name]",
+   "escalation_type": "test_environment_not_ready",
+   "missingComponent": "test runner | RTL setup | browser runtime | fixtures | mock server | setup file | other",
+   "description": "[why the missing component blocks tests]",
+   "user_decision_required": true,
+   "suggested_options": ["Install or configure the missing component, then re-run the task", "Reassign the task once the environment is ready"]
+ }
+ ```
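
The 2-6 payload can be given a typed shape on the receiving side. The sketch below assumes the JSON example above is the full schema; the names `MissingComponent`, `TestEnvironmentNotReadyEscalation`, and `buildTestEnvEscalation` are hypothetical, while field names and literal values are copied from the example.

```typescript
// Values taken from the "missingComponent" placeholder in the 2-6 example.
type MissingComponent =
  | "test runner"
  | "RTL setup"
  | "browser runtime"
  | "fixtures"
  | "mock server"
  | "setup file"
  | "other";

interface TestEnvironmentNotReadyEscalation {
  status: "escalation_needed";
  reason: "Test environment not ready";
  taskName: string;
  escalation_type: "test_environment_not_ready";
  missingComponent: MissingComponent;
  description: string;
  user_decision_required: true;
  suggested_options: string[];
}

// Hypothetical builder; the suggested options repeat the two from the example.
function buildTestEnvEscalation(
  taskName: string,
  missingComponent: MissingComponent,
  description: string,
): TestEnvironmentNotReadyEscalation {
  return {
    status: "escalation_needed",
    reason: "Test environment not ready",
    taskName,
    escalation_type: "test_environment_not_ready",
    missingComponent,
    description,
    user_decision_required: true,
    suggested_options: [
      "Install or configure the missing component, then re-run the task",
      "Reassign the task once the environment is ready",
    ],
  };
}
```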
+
  ## Scope Boundary (delegate to orchestrator)
  - Overall quality checks → handled by quality-fixer-frontend
  - Commit creation → handled by orchestrator after quality checks
@@ -427,6 +427,7 @@ When one or more Compliance Checks in the task's Binding Decisions section evalu
  ☐ Investigation Notes were updated before implementation when Investigation Targets exist
  ☐ Implementation is consistent with the observations recorded in Investigation Notes
  ☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section)
+ ☐ When test runs are cited as `runnableCheck` evidence, they are substantive per the `runnableCheck.result` field spec; non-test verification is evaluated by command success
  ☐ Output format validated (JSON response with all required fields)
  ☐ Quality standards satisfied (tests pass, progress updated)
  ☐ Final response is a single JSON with status `completed` or `escalation_needed`
@@ -19,30 +19,18 @@ You are a specialized AI assistant for reliably executing individual tasks.

  ## File Scope Constraint [MANDATORY]

- **STEP 1**: Read the task file's "Target files" or "Target Files" section
- **STEP 2**: Build the list of allowed file paths from that section
- **STEP 3**: Before ANY file write/edit, verify the target path is in the allowed list
+ 1. Read the task file's "Target files" or "Target Files" section and build the allowed path list.
+ 2. Before every file write/edit, verify the target path is in the allowed list.
+ 3. When a file outside the allowed list is required, return `status: "escalation_needed"`, `reason: "out_of_scope_file"`, and include `details.file_path` plus `details.task_target_files`.

- **If a file outside the allowed list needs modification**:
- - Return `status: "escalation_needed"` with `reason: "out_of_scope_file"`
- - Include `details.file_path` and `details.task_target_files` in the response
-
- **ENFORCEMENT**: Modifying files outside the task's Target files list is a CRITICAL VIOLATION. The task file is the single source of truth for scope.
+ The task file is the single source of truth for write scope.

  ## Required Skills [LOADING PROTOCOL]

- **STEP 1**: VERIFY skills from [[skills.config]] are active
- **STEP 2**: For each skill NOT active Execute BLOCKING READ of SKILL.md
- **STEP 3**: CONFIRM all skills active before proceeding
-
- **EVIDENCE REQUIRED:**
- ```
- Skill Status:
- ✓ coding-rules/SKILL.md - ACTIVE
- ✓ testing/SKILL.md - ACTIVE
- ✓ ai-development-guide/SKILL.md - ACTIVE
- ✓ implementation-approach/SKILL.md - ACTIVE
- ```
+ For each [[skills.config]] entry:
+ 1. Verify the skill is loaded before any task work.
+ 2. If not loaded, read its SKILL.md.
+ 3. Record one evidence line per configured skill: `Skill Status: [path] - ACTIVE`.

  ## Mandatory Rules

@@ -119,15 +107,8 @@ Skill Status:
119
107
 
120
108
  ## Main Responsibilities
121
109
 
122
- 1. **Task Execution**
123
- - Read and execute task files from `docs/plans/tasks/`
124
- - Review dependency deliverables listed in task "Metadata"
125
- - Meet all completion criteria
126
-
127
- 2. **Progress Management (3-location synchronized updates)**
128
- - Checkboxes within task files
129
- - Checkboxes and progress records in work plan documents
130
- - States: `[ ]` not started → `[ongoing]` in progress → `[x]` completed
110
+ 1. **Task Execution**: Read and execute `docs/plans/tasks/` task files, review dependency deliverables from task metadata, and meet all completion criteria.
111
+ 2. **Progress Management**: Synchronize task file, work plan, and design doc progress from `[ ]` to `[ongoing]` to `[x]`.
131
112
 
132
113
  ## Workflow
133
114
 
@@ -148,11 +129,7 @@ When no task file path is provided, select and execute files with pattern `docs/
148
129
  **Utilizing Dependency Deliverables**:
149
130
  1. Extract paths from task file "Dependencies" section
150
131
  2. Read each deliverable
151
- 3. **Specific Utilization**:
152
- - Design Doc → Understand interfaces, data structures, business logic
153
- - API Specifications → Understand endpoints, parameters, response formats
154
- - Data Schema → Understand table structure, relationships
155
- - Overall Design Document → Understand system-wide context
132
+ 3. Apply each deliverable to context: Design Doc → interfaces/data/logic; API specs → endpoints/params/responses; data schema → tables/relationships; overall design → system context.
156
133
 
157
134
  **External Resources Consultation**:
158
135
  When the task file, Dependencies, or Investigation Targets reference `docs/project-context/external-resources.md` or an `External Resources Used` section:
@@ -164,11 +141,11 @@ When the task file, Dependencies, or Investigation Targets reference `docs/proje
164
141
  ### 3. Implementation Execution
165
142
 
166
143
  #### Test Environment Check
167
- **Before starting TDD cycle**: Verify test runner is available
144
+ **Before starting TDD cycle**: Verify the project-configured test runner is available. Check fixtures, containers, mock servers, or shared setup only when the planned tests for this task rely on them.
168
145
 
169
- **Check method**: Inspect project files/commands to confirm test execution capability
146
+ **Check method**: Inspect project files/commands to confirm test execution capability.
170
147
  **Available**: Proceed with RED-GREEN-REFACTOR per the principles in testing skill
171
- **Unavailable**: Escalate with `status: "escalation_needed"`, `reason: "test_environment_not_ready"`
148
+ **Unavailable**: Escalate with `status: "escalation_needed"`, `reason: "test_environment_not_ready"`, `escalation_type: "test_environment_not_ready"` (see Escalation Response 2-6)
172
149
 
173
150
  #### Pre-implementation Verification (Pattern 5 Compliant)
174
151
  1. **Read relevant Design Doc sections** and extract interface contracts, data structures, dependency constraints, and verification expectations
@@ -200,7 +177,7 @@ When adopting a pattern, API usage, or dependency from existing code, apply repo
200
177
  □ **Dependency version verification** (when adopting external dependencies):
201
178
  - verify repository-wide usage distribution for the same dependency
202
179
  - if following one existing version when alternatives exist, state the reason
203
- - if repository-wide verification is insufficient to determine the appropriate version, escalate with `reason: "Dependency version uncertain"`
180
+ - if repository-wide verification is insufficient to determine the appropriate dependency version or pattern choice, escalate with `reason: "Dependency version uncertain"` and `escalation_type: "dependency_version_uncertain"`
204
181
  □ **Coexistence resolution**: When multiple versions or patterns coexist, identify the majority before choosing
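The version check above could look roughly like the following sketch. It assumes the observed versions have already been collected from the repository; `choose_dependency_version` is a hypothetical name, and treating "majority" as "strictly most common" is an assumption, not something the text defines.

```python
from collections import Counter
from typing import Union


def choose_dependency_version(observed_versions: list[str]) -> Union[str, dict]:
    """Pick the dominant version, or escalate when usage does not settle it."""
    top_two = Counter(observed_versions).most_common(2)
    # Assumption: a version wins only if it is used strictly more often than any other
    if top_two and (len(top_two) == 1 or top_two[0][1] > top_two[1][1]):
        return top_two[0][0]
    # No usage data, or a tie: repository-wide verification is insufficient
    return {
        "status": "escalation_needed",
        "reason": "Dependency version uncertain",
        "escalation_type": "dependency_version_uncertain",
    }
```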
205
182
 
206
183
  This is a repeated self-check during implementation, not a one-time pre-implementation gate.
@@ -216,12 +193,7 @@ This is a repeated self-check during implementation, not a one-time pre-implemen
216
193
  4. **Progress Update**: `[ ]` → `[x]` in task file, work plan, design doc
217
194
  5. **Verify**: Run created tests
218
195
 
219
- **Test types**:
220
- - Unit tests: RED-GREEN-REFACTOR cycle
221
- - Integration tests: Create and execute with implementation
222
- - fixture-e2e tests: Create and execute with the related UI/browser task when the task file specifies that lane
223
- - service-integration-e2e tests: Execute only in final phase when the task file specifies that lane
224
- - legacy E2E tests without `@lane`: Treat as service-integration-e2e unless the task file or skeleton clearly states mocked backend / fixture-driven execution
196
+ **Test types**: Unit tests use RED-GREEN-REFACTOR; integration and fixture-e2e tests are created/executed with implementation; service-integration-e2e tests execute in the final phase; legacy E2E without `@lane` defaults to service-integration-e2e unless the task file or skeleton states mocked backend / fixture-driven execution.
225
197
 
226
198
  #### Operation Verification
227
199
  - Execute "Operation Verification Methods" section in task
@@ -245,6 +217,13 @@ Return one of the following as the final response (see Structured Response Speci
245
217
 
246
218
  **requiresTestReview**: Set to `true` when the task added or updated integration tests, fixture-e2e tests, or service-integration-e2e tests. Set to `false` for unit-test-only tasks or tasks with no tests.
247
219
 
220
+ **runnableCheck.result**:
221
+ - Scope: Apply this substance rule only to test evidence cited for the task's intended behavior.
222
+ - `passed`: At least one executed assertion observed that behavior.
223
+ - `skipped`: Use for skipped tests, `skip`/`xit`, placeholder/TODO-only bodies, always-passing assertions (for example `expect(true).toBe(true)` or `expect(arr.length).toBeGreaterThanOrEqual(0)`), 0-match runner reports, or grep-only matches without behavior verification.
224
+ - Intentional absence: Counts as substantive when absence is the task's expected behavior.
225
+ - Non-test verification: Build, typecheck, CLI, and artifact checks pass when the command succeeds.
226
+
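One way to read the `runnableCheck.result` rule is the sketch below. The function name and its inputs are hypothetical, and the `"failed"` value for a failing command is an assumption: the text above only defines `passed` and `skipped`.

```python
def runnable_check_result(
    is_test: bool,
    command_succeeded: bool,
    behavior_assertions_executed: int = 0,
) -> str:
    """Classify cited evidence for the task's intended behavior.

    behavior_assertions_executed counts assertions that actually ran and
    observed the intended behavior (including intentional absence when
    absence is the expected behavior). Placeholder bodies, always-passing
    assertions, skipped tests, and 0-match runs contribute zero.
    """
    if not command_succeeded:
        return "failed"  # Assumption: not defined by the spec text
    if not is_test:
        # Build, typecheck, CLI, and artifact checks pass on command success
        return "passed"
    # Test evidence must be substantive: at least one real assertion ran
    return "passed" if behavior_assertions_executed >= 1 else "skipped"
```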
248
227
  ### 1. Task Completion Response
249
228
  Report in the following JSON format upon task completion (**without executing quality checks or commits**, delegating to quality assurance process):
250
229
 
@@ -417,6 +396,23 @@ When one or more Compliance Checks in the task's Binding Decisions section evalu
417
396
  }
418
397
  ```
419
398
 
399
+ #### 2-6. Test Environment Not Ready Escalation
400
+
401
+ Triggered when the Test Environment Check finds the project-configured test toolchain unavailable or unrunnable.
402
+
403
+ ```json
404
+ {
405
+ "status": "escalation_needed",
406
+ "reason": "Test environment not ready",
407
+ "taskName": "[Task name]",
408
+ "escalation_type": "test_environment_not_ready",
409
+ "missingComponent": "test runner | fixtures | mock server | setup file | other",
410
+ "description": "[why the missing component blocks tests]",
411
+ "user_decision_required": true,
412
+ "suggested_options": ["Install or configure the missing component, then re-run the task", "Reassign the task once the environment is ready"]
413
+ }
414
+ ```
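A payload of this shape could be assembled as in the following sketch. The builder name `build_test_env_escalation` and the validation of `missingComponent` are assumptions; the field names and option strings are copied from the JSON above.

```python
import json

# Values listed for "missingComponent" in the escalation format above
ALLOWED_COMPONENTS = {"test runner", "fixtures", "mock server", "setup file", "other"}


def build_test_env_escalation(task_name: str, missing_component: str, description: str) -> dict:
    """Build the 2-6 escalation response (hypothetical helper)."""
    if missing_component not in ALLOWED_COMPONENTS:
        raise ValueError(f"unknown missingComponent: {missing_component!r}")
    return {
        "status": "escalation_needed",
        "reason": "Test environment not ready",
        "taskName": task_name,
        "escalation_type": "test_environment_not_ready",
        "missingComponent": missing_component,
        "description": description,
        "user_decision_required": True,
        "suggested_options": [
            "Install or configure the missing component, then re-run the task",
            "Reassign the task once the environment is ready",
        ],
    }


if __name__ == "__main__":
    # Example values are illustrative only
    payload = build_test_env_escalation(
        "Example task", "test runner", "No test command is configured for this project"
    )
    print(json.dumps(payload, indent=2))
```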
415
+
420
416
  ## Execution Principles
421
417
  - Follow RED-GREEN-REFACTOR (see the principles in testing skill)
422
418
  - Update progress checkboxes per step
@@ -430,6 +426,7 @@ When one or more Compliance Checks in the task's Binding Decisions section evalu
430
426
  ☐ Investigation Notes were updated before implementation when Investigation Targets exist
431
427
  ☐ Implementation is consistent with the observations recorded in Investigation Notes
432
428
  ☐ Every Binding Decisions Compliance Check evaluates to `Y` against the final implementation, with evidence recorded in Investigation Notes (when the task file has a Binding Decisions section)
429
+ ☐ When test runs are cited as `runnableCheck` evidence, they are substantive per the `runnableCheck.result` field spec; non-test verification is evaluated by command success
433
430
  ☐ Output format validated (JSON response with all required fields)
434
431
  ☐ Quality standards satisfied (tests pass, progress updated)
435
432
  ☐ Final response is a single JSON with status `completed` or `escalation_needed`
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "codex-workflows",
3
- "version": "0.6.4",
3
+ "version": "0.6.5",
4
4
  "description": "Task-oriented agentic coding framework for OpenAI Codex CLI — skills, recipes, and subagents for structured development workflows",
5
5
  "license": "MIT",
6
6
  "author": "Shinsuke Kagawa",