codex-workflows 0.2.1 → 0.2.2

@@ -109,11 +109,11 @@ Check Step 5 result:
109
109
 
110
110
  Spawn quality-fixer agent: "Final quality assurance for test files added in this workflow. Run all tests and verify coverage."
111
111
 
112
- **Expected output**: `approved` (true/false)
112
+ **Expected output**: `status` (`approved`/`blocked`)
113
113
 
114
114
  ### Step 8: Commit
115
115
 
116
- On `approved: true` from quality-fixer:
116
+ On `status: "approved"` from quality-fixer:
117
117
  - MUST commit test files with appropriate message
118
118
  ENFORCEMENT: Commits without quality-fixer approval are invalid.
119
119
 
@@ -80,7 +80,7 @@ For EACH task, YOU MUST:
80
80
  - `approved` -> Proceed to step 4
81
81
  - `readyForQualityCheck: true` -> Proceed to step 4
82
82
  4. **Spawn quality-fixer agent**: "Execute all quality checks and fixes"
83
- 5. **COMMIT on approval**: After `approved: true` from quality-fixer -> Execute git commit
83
+ 5. **COMMIT on approval**: After `status: "approved"` from quality-fixer -> Execute git commit
84
84
 
85
85
  **CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
86
86
  ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
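The commit gate changed in this hunk (from the boolean `approved: true` to `status: "approved"`) can be sketched in Python as follows. This is an illustration only, not code from the package; `should_commit` is a hypothetical helper name, and the sketch assumes the quality-fixer response arrives as a JSON string.

```python
import json


def should_commit(quality_fixer_response: str) -> bool:
    """Return True only when the quality-fixer payload reports status "approved"."""
    payload = json.loads(quality_fixer_response)
    # 0.2.2 drops the boolean `approved` field; only the `status` string is consulted.
    return isinstance(payload, dict) and payload.get("status") == "approved"
```

Under this sketch, a payload that still uses the old `approved: true` shape without a `status` field would not trigger a commit, which is the conservative reading of the new wording.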
@@ -74,7 +74,7 @@ Verify generated task files exist in docs/plans/tasks/.
74
74
  Each sub-agent responds in JSON format:
75
75
  - **task-executor-frontend**: status, filesModified, testsAdded, requiresTestReview, readyForQualityCheck
76
76
  - **integration-test-reviewer**: status (approved/needs_revision/blocked), requiredFixes
77
- - **quality-fixer-frontend**: status, checksPerformed, fixesApplied, approved
77
+ - **quality-fixer-frontend**: status, checksPerformed, fixesApplied
78
78
 
79
79
  ### Execution Flow for Each Task
80
80
 
@@ -88,7 +88,7 @@ For EACH task, YOU MUST:
88
88
  - `approved` -> Proceed to step 4
89
89
  - `readyForQualityCheck: true` -> Proceed to step 4
90
90
  4. **Spawn quality-fixer-frontend agent**: "Execute all frontend quality checks and fixes"
91
- 5. **COMMIT on approval**: After `approved: true` from quality-fixer-frontend -> Execute git commit. Use `changeSummary` for commit message.
91
+ 5. **COMMIT on approval**: After `status: "approved"` from quality-fixer-frontend -> Execute git commit. Use `changeSummary` for commit message.
92
92
 
93
93
  **CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
94
94
  ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
@@ -98,7 +98,7 @@ For EACH task, YOU MUST:
98
98
  - `approved` -> Proceed to step 4
99
99
  - `readyForQualityCheck: true` -> Proceed to step 4
100
100
  4. **Spawn quality-fixer agent** (layer-appropriate per routing table): "Execute all quality checks and fixes"
101
- 5. **COMMIT on approval**: After `approved: true` from quality-fixer -> Execute git commit
101
+ 5. **COMMIT on approval**: After `status: "approved"` from quality-fixer -> Execute git commit
102
102
 
103
103
  **CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
104
104
  ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
@@ -123,7 +123,7 @@ ENFORCEMENT: Sub-agent prompts missing the constraint suffix MUST be re-issued w
123
123
  1. Execute ONE task completely before starting next (each task goes through the full 4-step cycle individually, using the correct executor per filename pattern)
124
124
  2. Check executor status before quality-fixer (escalation check)
125
125
  3. Quality-fixer MUST run after each executor (no skipping)
126
- 4. Commit MUST execute when quality-fixer returns `approved: true` (do not defer to end)
126
+ 4. Commit MUST execute when quality-fixer returns `status: "approved"` (do not defer to end)
127
127
 
128
128
  ### Security Review (After All Tasks Complete)
129
129
 
@@ -106,7 +106,7 @@ After user grants "batch approval for entire implementation phase", enter autono
106
106
  - `approved` -> Proceed to step 3
107
107
  - Otherwise -> Proceed to step 3
108
108
  3. Spawn quality-fixer (or quality-fixer-frontend) agent: "Quality check and fixes"
109
- 4. git commit -> Execute on `approved: true`
109
+ 4. git commit -> Execute on `status: "approved"`
110
110
 
111
111
  ### Security Review (After All Tasks Complete)
112
112
 
@@ -20,7 +20,7 @@ Target: $ARGUMENTS
20
20
  **Execution Protocol**:
21
21
  1. **Spawn agents for all work** -- your role is to invoke sub-agents, pass data between them, and report results
22
22
  2. **Process one step at a time**: Execute steps sequentially within each unit (2 -> 3 -> 4 -> 5). Each step's output is the required input for the next step. Complete all steps for one unit before starting the next
23
- 3. **Pass `$STEP_N_OUTPUT` as-is** to sub-agents -- the orchestrator bridges data without processing or filtering it
23
+ 3. **Pass `$STEP_N_OUTPUT` as-is** to sub-agents -- the orchestrator bridges data without processing or filtering it, except for steps that explicitly define a deterministic transformation with an input schema, output schema, and mapping rules
24
24
 
25
25
  **Task Registration**: Register phases first, then steps within each phase as you enter it. Track status for each step.
26
26
 
@@ -44,7 +44,7 @@ Ask the user to confirm:
44
44
 
45
45
  ```
46
46
  Phase 1: PRD Generation
47
- Step 1: Scope Discovery (unified, single pass)
47
+ Step 1: Scope Discovery (unified, single pass -> group into PRD units -> human review)
48
48
  Step 2-5: Per-unit loop (Generation -> Verification -> Review -> Revision)
49
49
 
50
50
  Phase 2: Design Doc Generation (if requested)
@@ -67,17 +67,19 @@ Spawn scope-discoverer agent: "Discover functional scope targets in the codebase
67
67
  **Quality Gate**:
68
68
  - At least one unit discovered -> proceed
69
69
  - No units discovered -> ask user for hints
70
+ - `$STEP_1_OUTPUT.prdUnits` exists
71
+ - All `sourceUnits` across `prdUnits` (flattened, deduplicated) match the set of `discoveredUnits` IDs — no unit missing, no unit duplicated
70
72
 
71
- **[STOP — BLOCKING]** If human review enabled: Present discovered units to user for confirmation.
73
+ **[STOP — BLOCKING]** If human review enabled: Present `$STEP_1_OUTPUT.prdUnits` with their source unit mapping to user for confirmation.
72
74
  **CANNOT proceed until user explicitly confirms.**
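The Step 1 quality gate added above (every discovered unit appears in exactly one PRD unit's `sourceUnits`) could be checked mechanically along these lines. This is a hedged sketch: `check_prd_unit_coverage` is a hypothetical name, and it assumes each entry in `discoveredUnits` carries an `id` key, which the hunk implies ("`discoveredUnits` IDs") but does not show.

```python
from collections import Counter


def check_prd_unit_coverage(step_1_output: dict) -> list[str]:
    """Return a list of gate violations; an empty list means the gate passes."""
    problems = []
    discovered = step_1_output.get("discoveredUnits", [])
    prd_units = step_1_output.get("prdUnits")
    if not discovered:
        problems.append("no units discovered")
    if prd_units is None:
        problems.append("prdUnits missing")
        return problems

    # Assumption: each discovered unit exposes its identifier under "id".
    discovered_ids = {unit["id"] for unit in discovered}
    # Count how often each source unit is referenced across all PRD units.
    counts = Counter(sid for prd in prd_units for sid in prd.get("sourceUnits", []))

    for sid in sorted(discovered_ids - set(counts)):
        problems.append(f"{sid} is missing from prdUnits")
    for sid in sorted(set(counts) - discovered_ids):
        problems.append(f"{sid} is not a discovered unit")
    for sid in sorted(s for s, n in counts.items() if n > 1):
        problems.append(f"{sid} appears in more than one sourceUnits list")
    return problems
```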
73
75
 
74
76
  ### Step 2-5: Per-Unit Processing
75
77
 
76
- **FOR** each unit in `$STEP_1_OUTPUT.discoveredUnits` **(sequential, one unit at a time)**:
78
+ **FOR** each unit in `$STEP_1_OUTPUT.prdUnits` **(sequential, one unit at a time)**:
77
79
 
78
80
  #### Step 2: PRD Generation
79
81
 
80
- Spawn prd-creator agent: "Create reverse-engineered PRD for the following feature. Operation Mode: reverse-engineer. External Scope Provided: true. Feature: $UNIT_NAME. Description: $UNIT_DESCRIPTION. Related Files: $UNIT_RELATED_FILES. Entry Points: $UNIT_ENTRY_POINTS. Skip independent scope discovery. Use provided scope data. Create final version PRD based on code investigation within specified scope."
82
+ Spawn prd-creator agent: "Create reverse-engineered PRD for the following feature. Operation Mode: reverse-engineer. External Scope Provided: true. Feature: $PRD_UNIT_NAME. Description: $PRD_UNIT_DESCRIPTION. Related Files: $PRD_UNIT_COMBINED_RELATED_FILES. Entry Points: $PRD_UNIT_COMBINED_ENTRY_POINTS. Source Units: $PRD_UNIT_SOURCE_UNITS. Skip independent scope discovery. Use provided scope data. Create final version PRD based on code investigation within specified scope."
81
83
 
82
84
  **Store output as**: `$STEP_2_OUTPUT` (PRD path)
83
85
 
@@ -85,7 +87,7 @@ Spawn prd-creator agent: "Create reverse-engineered PRD for the following featur
85
87
 
86
88
  **Prerequisite**: $STEP_2_OUTPUT (PRD path from Step 2)
87
89
 
88
- Spawn code-verifier agent: "Verify consistency between PRD and code implementation. doc_type: prd. document_path: $STEP_2_OUTPUT. code_paths: $UNIT_RELATED_FILES. verbose: false."
90
+ Spawn code-verifier agent: "Verify consistency between PRD and code implementation. doc_type: prd. document_path: $STEP_2_OUTPUT. code_paths: $PRD_UNIT_COMBINED_RELATED_FILES. verbose: false."
89
91
 
90
92
  **Store output as**: `$STEP_3_OUTPUT`
91
93
 
@@ -130,11 +132,21 @@ ENFORCEMENT: Exceeding 2 revision cycles without flagging produces unreviewed ou
130
132
 
131
133
  ### Step 6: Design Doc Scope Mapping
132
134
 
133
- **No additional discovery required.** Use `$STEP_1_OUTPUT` (scope discovery results) directly.
135
+ **Step type**: Deterministic transformation step executed by the orchestrator.
134
136
 
135
- Each PRD unit from Phase 1 maps to one Design Doc unit (using technical-designer).
137
+ **No additional discovery required.** Use `$STEP_1_OUTPUT.discoveredUnits` (implementation-granularity units) for technical profiles. Use `$STEP_1_OUTPUT.prdUnits[].sourceUnits` to trace which discovered units belong to each PRD unit.
136
138
 
137
- Map `$STEP_1_OUTPUT` units to Design Doc generation targets, carrying forward:
139
+ **Default mapping rule**: Each PRD unit maps to exactly 1 Design Doc unit.
140
+
141
+ Only split one PRD unit into multiple Design Doc units when BOTH are true:
142
+ 1. The source units contain clearly separate technical boundaries with low shared-file overlap
143
+ 2. Separate Design Docs would improve verification clarity (different public interfaces, dependencies, or module groups)
144
+
145
+ If the split conditions are not clearly met, keep 1 PRD unit -> 1 Design Doc unit.
146
+
147
+ Transform `$STEP_1_OUTPUT` into `$STEP_6_OUTPUT` using only the mapping rules in this step.
148
+
149
+ Map PRD units to Design Doc generation targets by resolving each PRD unit's `sourceUnits` back to `$STEP_1_OUTPUT.discoveredUnits`, carrying forward:
138
150
  - `technicalProfile.primaryModules` -> Primary Files
139
151
  - `technicalProfile.publicInterfaces` -> Public Interfaces
140
152
  - `dependencies` -> Dependencies
@@ -142,6 +154,30 @@ Map `$STEP_1_OUTPUT` units to Design Doc generation targets, carrying forward:
142
154
 
143
155
  **Store output as**: `$STEP_6_OUTPUT`
144
156
 
157
+ `$STEP_6_OUTPUT` MUST be a JSON array of Design Doc generation targets in the following shape:
158
+
159
+ ```json
160
+ [
161
+ {
162
+ "unitId": "DD-001",
163
+ "parentPrdUnitId": "PRD-001",
164
+ "unitName": "Authentication",
165
+ "unitDescription": "Current implementation for sign-in and session management",
166
+ "sourceUnits": ["UNIT-001", "UNIT-002"],
167
+ "primaryModules": ["src/auth/service.ts", "src/auth/controller.ts"],
168
+ "publicInterfaces": ["AuthService.login()", "AuthController.handleLogin()"],
169
+ "dependencies": ["UNIT-003"],
170
+ "scopeBoundary": ["src/auth/*"],
171
+ "mappingRationale": "Default 1:1 mapping from PRD unit because technical scope is cohesive"
172
+ }
173
+ ]
174
+ ```
175
+
176
+ **Quality Gate**:
177
+ - Every PRD unit appears in at least one `$STEP_6_OUTPUT` item
178
+ - Every `$STEP_6_OUTPUT` item references only discovered units from its parent PRD unit
179
+ - `mappingRationale` explicitly states whether the mapping is default 1:1 or an intentional split
180
+
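To make the Step 6 transformation concrete, here is a minimal Python sketch of the default 1:1 mapping and the first two quality-gate checks. It is not taken from the package. The function names are hypothetical, discovered units are assumed to carry an `id` key, and two fields are filled by assumption because the hunk does not show their mapping rules: `scopeBoundary` is copied from `combinedRelatedFiles`, and `dependencies` excludes units that are already inside the target. The third gate item (whether `mappingRationale` states default versus split) needs judgment, so the sketch only checks that the field is non-empty.

```python
def _merge(lists):
    """Flatten lists, dropping duplicates while keeping first-seen order."""
    return list(dict.fromkeys(item for lst in lists for item in lst))


def default_design_doc_targets(step_1_output: dict) -> list[dict]:
    """Apply the default rule: one Design Doc target per PRD unit."""
    units = {u["id"]: u for u in step_1_output["discoveredUnits"]}
    targets = []
    for n, prd in enumerate(step_1_output["prdUnits"], start=1):
        sources = [units[sid] for sid in prd["sourceUnits"]]
        internal = set(prd["sourceUnits"])
        targets.append({
            "unitId": f"DD-{n:03d}",
            "parentPrdUnitId": prd["id"],
            "unitName": prd["name"],
            "unitDescription": prd["description"],
            "sourceUnits": list(prd["sourceUnits"]),
            "primaryModules": _merge(s["technicalProfile"]["primaryModules"] for s in sources),
            "publicInterfaces": _merge(s["technicalProfile"]["publicInterfaces"] for s in sources),
            # Assumption: dependencies on units inside the same target are dropped.
            "dependencies": [d for d in _merge(s["dependencies"] for s in sources)
                             if d not in internal],
            # Assumption: the scope boundary is taken from the combined related files.
            "scopeBoundary": list(prd["combinedRelatedFiles"]),
            "mappingRationale": "Default 1:1 mapping from PRD unit",
        })
    return targets


def check_step_6_gate(step_1_output: dict, step_6_output: list[dict]) -> list[str]:
    """Return gate violations; an empty list means the checks pass."""
    prd_sources = {p["id"]: set(p["sourceUnits"]) for p in step_1_output["prdUnits"]}
    covered = {t["parentPrdUnitId"] for t in step_6_output}
    problems = [f"PRD unit {pid} has no Design Doc target"
                for pid in sorted(set(prd_sources) - covered)]
    for target in step_6_output:
        allowed = prd_sources.get(target["parentPrdUnitId"], set())
        stray = sorted(set(target["sourceUnits"]) - allowed)
        if stray:
            problems.append(f"{target['unitId']} references units outside its PRD unit: {stray}")
        if not target.get("mappingRationale"):
            problems.append(f"{target['unitId']} has no mappingRationale")
    return problems
```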
145
181
  ### Step 7-10: Per-Unit Processing
146
182
 
147
183
  **FOR** each unit in `$STEP_6_OUTPUT` **(sequential, one unit at a time)**:
@@ -31,7 +31,7 @@ ENFORCEMENT: Skipping document-reviewer risks propagating inconsistencies to dow
31
31
  ```
32
32
  Target document -> [Stop: Confirm changes]
33
33
  |
34
- technical-designer / prd-creator (update mode)
34
+ technical-designer / technical-designer-frontend / prd-creator (update mode)
35
35
  |
36
36
  document-reviewer -> [Stop: Review approval]
37
37
  | (Design Doc only)
@@ -70,15 +70,20 @@ Check for existing documents in docs/design/, docs/prd/, docs/adr/.
70
70
  | Multiple candidates found | Present options to user |
71
71
  | No documents found | Report and end (suggest $recipe-design instead) |
72
72
 
73
- ### Step 2: Document Type Determination
73
+ ### Step 2: Document Type and Layer Determination
74
74
 
75
- Determine type from document path:
75
+ Determine type from document path, then determine the layer to select the correct update agent:
76
76
 
77
77
  | Path Pattern | Type | Update Agent | Notes |
78
78
  |-------------|------|--------------|-------|
79
- | `docs/design/*.md` | Design Doc | technical-designer | - |
79
+ | `docs/design/*.md` | Design Doc | technical-designer or technical-designer-frontend | See layer detection below |
80
80
  | `docs/prd/*.md` | PRD | prd-creator | - |
81
- | `docs/adr/*.md` | ADR | technical-designer | Minor changes: update existing file; Major changes: create new ADR file |
81
+ | `docs/adr/*.md` | ADR | technical-designer or technical-designer-frontend | See layer detection below |
82
+
83
+ **Layer detection** (for Design Doc and ADR):
84
+ Read the document and determine its layer from content signals:
85
+ - **Frontend** (-> technical-designer-frontend): Document title/scope mentions React, components, UI, frontend; or file contains component hierarchy, state management, UI interactions
86
+ - **Backend** (-> technical-designer): All other cases (API, data layer, business logic, infrastructure)
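The layer-detection rule above is written for an agent reading the document, not for a program. Purely as an illustration of the routing it describes, a crude keyword approximation might look like the following. The keyword list, the regex, and the name `select_update_agent` are all assumptions made for this sketch; the package itself specifies only "content signals".

```python
import re

# Assumption: a small keyword set standing in for the "content signals" in the rule.
_FRONTEND_SIGNALS = re.compile(
    r"\b(react|components?|ui|frontend|state management)\b", re.IGNORECASE
)


def select_update_agent(doc_path: str, doc_text: str) -> str:
    """Pick the update agent from the document path, then from its content."""
    if doc_path.startswith("docs/prd/"):
        return "prd-creator"
    # Design Docs and ADRs: frontend signals win, everything else is backend.
    if _FRONTEND_SIGNALS.search(doc_text):
        return "technical-designer-frontend"
    return "technical-designer"
```

A keyword match is far blunter than an agent's reading of scope, so a backend document that merely mentions a UI consumer would be misrouted by this sketch.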
82
87
 
83
88
  **ADR Update Guidance**:
84
89
  - **Minor changes** (clarification, typo fix, small scope adjustment): Update the existing ADR file
@@ -173,10 +173,10 @@ All agents MUST use this vocabulary consistently:
173
173
 
174
174
  ## Structured Response Specification
175
175
 
176
- Subagents respond in JSON format. Key fields for orchestrator decisions:
176
+ Subagents respond in JSON format. The final response from each JSON-returning subagent must be the JSON payload itself, with no trailing prose. Key fields for orchestrator decisions:
177
177
  - **requirement-analyzer**: scale, confidence, affectedLayers, adrRequired, scopeDependencies, questions
178
178
  - **task-executor**: status (escalation_needed/blocked/completed), testsAdded, requiresTestReview
179
- - **quality-fixer**: approved (true/false)
179
+ - **quality-fixer**: status (approved/blocked)
180
180
  - **document-reviewer**: verdict.decision (approved/approved_with_conditions/needs_revision/rejected)
181
181
  - **design-sync**: sync_status (CONFLICTS_FOUND/NO_CONFLICTS) — text format with [SUMMARY] block
182
182
  - **integration-test-reviewer**: status (approved/needs_revision/blocked), requiredFixes
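The new requirement that a JSON-returning subagent's final response be "the JSON payload itself, with no trailing prose" is easy to enforce on the orchestrator side. The sketch below is an assumption about how that might be done, not code from the package; `parse_final_response` is a hypothetical name. It relies on `json.loads` rejecting any text before or after the JSON value. It does not apply to design-sync, which the list above notes returns a text format.

```python
import json


def parse_final_response(text: str) -> dict:
    """Parse a subagent's final response, accepting only a single JSON object."""
    # json.loads raises json.JSONDecodeError (a ValueError) if prose surrounds the JSON.
    payload = json.loads(text.strip())
    if not isinstance(payload, dict):
        raise ValueError("final response must be a JSON object")
    return payload
```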
@@ -310,7 +310,7 @@ Stop autonomous execution and escalate to user in the following cases:
310
310
  - `approved`: Proceed to step 3
311
311
  - Otherwise: Proceed to step 3
312
312
  3. quality-fixer: Quality check and fixes
313
- 4. git commit (on `approved: true`)
313
+ 4. git commit (on `status: "approved"`)
314
314
 
315
315
  ## Main Orchestrator Roles
316
316
 
@@ -99,13 +99,13 @@ Each task uses the standard 4-step cycle with layer-appropriate agents:
99
99
  1. task-executor: Implementation
100
100
  2. Escalation check
101
101
  3. quality-fixer: Quality check and fixes
102
- 4. git commit (on approved: true)
102
+ 4. git commit (on status: "approved")
103
103
 
104
104
  ### frontend-task
105
105
  1. task-executor-frontend: Implementation
106
106
  2. Escalation check
107
107
  3. quality-fixer-frontend: Quality check and fixes
108
- 4. git commit (on approved: true)
108
+ 4. git commit (on status: "approved")
109
109
 
110
110
  ### integration-test-reviewer Placement
111
111
 
@@ -89,11 +89,14 @@ Verify against the Design Doc architecture:
89
89
  - No unnecessary duplicate implementations (Pattern 5 from ai-development-guide skill)
90
90
  - Existing codebase analysis section includes similar functionality investigation results
91
91
 
92
- ### 5. Calculate Compliance and Produce Report
92
+ ### 5. Calculate Compliance
93
93
  - Compliance rate = (fulfilled items + 0.5 x partially fulfilled items) / total AC items x 100
94
94
  - Compile all AC statuses, quality issues with specific locations
95
95
  - Determine verdict based on compliance rate
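The compliance-rate formula in this step can be written out directly. The function below is a small sketch with a hypothetical name; the verdict thresholds are not shown in this hunk, so they are left out.

```python
def compliance_rate(fulfilled: int, partially_fulfilled: int, total: int) -> float:
    """Compliance rate in percent; partially fulfilled items count as half."""
    if total <= 0:
        raise ValueError("total AC items must be positive")
    return (fulfilled + 0.5 * partially_fulfilled) / total * 100
```

For example, 7 fulfilled and 2 partially fulfilled items out of 10 acceptance criteria give (7 + 1) / 10 x 100 = 80%.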
96
96
 
97
+ ### 6. Return JSON Result
98
+ Return the JSON result as the final response. See Output Format for the schema.
99
+
97
100
  ## Output Format
98
101
 
99
102
  ```json
@@ -136,6 +139,13 @@ Verify against the Design Doc architecture:
136
139
  - Provide solutions, not just problems; quantify wherever possible
137
140
  - Acknowledge good implementations; present improvements as actionable items
138
141
 
142
+ ## Completion Criteria
143
+
144
+ - [ ] All acceptance criteria individually evaluated
145
+ - [ ] Compliance rate calculated
146
+ - [ ] Verdict determined
147
+ - [ ] Final response is the JSON output
148
+
139
149
  ### Escalation Criteria
140
150
  Recommend higher-level review when: Design Doc itself has deficiencies, security concerns discovered, or critical performance issues found.
141
151
 
@@ -136,6 +136,10 @@ For each claim with collected evidence:
136
136
  2. **Implementation Coverage**: What percentage of specs are implemented?
137
137
  3. List undocumented features and unimplemented specs
138
138
 
139
+ ### Step 6: Return JSON Result
140
+
141
+ Return the JSON result as the final response. See Output Format for the schema.
142
+
139
143
  ## Output Format
140
144
 
141
145
  **JSON format is mandatory.**
@@ -201,7 +205,7 @@ consistencyScore = (matchCount / verifiableClaimCount) * 100
201
205
  - [ ] Identified undocumented features in code
202
206
  - [ ] Identified unimplemented specifications
203
207
  - [ ] Calculated consistency score
204
- - [ ] Output in specified format
208
+ - [ ] Final response is the JSON output
205
209
 
206
210
  ## Output Self-Check
207
211
  - [ ] All findings are based on verification evidence (no modifications proposed)
@@ -127,13 +127,15 @@ Checklist:
127
127
  - [ ] If prior_context_count > 0: Each item has resolution status
128
128
  - [ ] If prior_context_count > 0: `prior_context_check` object prepared
129
129
  - [ ] Output is valid JSON
130
+ - [ ] Final response is the JSON output
130
131
 
131
132
  Complete all items before proceeding to output.
132
133
 
133
- ### Step 6: Review Result Report
134
- - Output results in JSON format according to perspective
134
+ ### Step 6: Return JSON Result
135
+ - Use the JSON schema according to review mode (comprehensive or perspective-specific)
135
136
  - Clearly classify problem importance
136
137
  - Include `prior_context_check` object if prior_context_count > 0
138
+ - Return the JSON result as the final response. See Output Format for the schema.
137
139
 
138
140
  ## Output Format
139
141
 
@@ -78,6 +78,9 @@ Evaluate each test for:
78
78
  - No shared state
79
79
  - No time-dependent logic
80
80
 
81
+ ### 4. Return JSON Result
82
+ Return the JSON result as the final response. See Output Format for the schema.
83
+
81
84
  ## Output Format
82
85
 
83
86
  ```json
@@ -137,6 +140,7 @@ Evaluate each test for:
137
140
  - [ ] No test interdependencies
138
141
  - [ ] Deterministic execution (no random/time dependency)
139
142
  - [ ] Test name matches verification content
143
+ - [ ] Final response is the JSON output
140
144
 
141
145
  ## Common Issues and Fixes
142
146
 
@@ -90,12 +90,15 @@ Information source priority:
90
90
  - Stopping at "~ is not configured" → without tracing why it's not configured
91
91
  - Stopping at technical element names → without tracing why that state occurred
92
92
 
93
- ### Step 4: Impact Scope Identification and Output
93
+ ### Step 4: Impact Scope Identification
94
94
 
95
95
  - Search for locations implemented with the same pattern (impactScope)
96
96
  - Determine recurrenceRisk: low (isolated) / medium (2 or fewer locations) / high (3+ locations or design_gap)
97
97
  - Disclose unexplored areas and investigation limitations
98
- - Output in JSON format
98
+
99
+ ### Step 5: Return JSON Result
100
+
101
+ Return the JSON result as the final response. See Output Format for the schema.
99
102
 
100
103
  ## Evidence Strength Classification
101
104
 
@@ -173,6 +176,7 @@ Information source priority:
173
176
  - [ ] Enumerated 2+ hypotheses with causal tracking, evidence collection, and causeCategory determination for each
174
177
  - [ ] Determined impactScope and recurrenceRisk
175
178
  - [ ] Documented unexplored areas and investigation limitations
179
+ - [ ] Final response is the JSON output
176
180
 
177
181
  ## Output Self-Check
178
182
  - [ ] Multiple hypotheses were evaluated (not just the first plausible one)
@@ -69,8 +69,13 @@ Apply fixes following the principles in coding-rules skill and testing skill.
69
69
  **Step 4: Repeat Until Approved**
70
70
  - Address all errors in each phase before proceeding to next phase
71
71
  - Error found → Fix immediately → Re-run checks
72
- - All pass → Return `approved: true`
73
- - Cannot determine spec → Return `blocked`
72
+ - All pass → proceed to Step 5
73
+ - Cannot determine spec → proceed to Step 5 with `blocked` status
74
+
75
+ **Step 5: Return JSON Result**
76
+ Return one of the following as the final response (see Output Format for schemas):
77
+ - `status: "approved"` — all quality checks pass
78
+ - `status: "blocked"` — specification unclear, business judgment required
74
79
 
75
80
  ## Frontend-Specific Quality Criteria
76
81
 
@@ -174,7 +179,6 @@ Before setting status to blocked, confirm specifications in this order:
174
179
  "totalWarnings": 0,
175
180
  "executionTime": "3m 30s"
176
181
  },
177
- "approved": true,
178
182
  "nextActions": "Ready to commit"
179
183
  }
180
184
  ```
@@ -200,11 +204,9 @@ Before setting status to blocked, confirm specifications in this order:
200
204
  }
201
205
  ```
202
206
 
203
- ### User Report (Mandatory)
204
-
205
- Summarize quality check results in an understandable way for users
207
+ ## Intermediate Progress Report
206
208
 
207
- ### Phase-by-phase Report (Detailed Information)
209
+ During execution, report progress between tool calls using this format:
208
210
 
209
211
  ```markdown
210
212
  Phase [Number]: [Phase Name]
@@ -222,6 +224,12 @@ Issues requiring fixes:
222
224
  Phase [Number] Complete! Proceeding to next phase.
223
225
  ```
224
226
 
227
+ This is intermediate output only. The final response must be the JSON result (Step 5).
228
+
229
+ ## Completion Criteria
230
+
231
+ - [ ] Final response is a single JSON with status `approved` or `blocked`
232
+
225
233
  ## Important Principles
226
234
 
227
235
  MUST follow these principles to maintain high-quality React code:
@@ -66,8 +66,13 @@ Apply fixes following the principles in coding-rules skill and testing skill.
66
66
  **Step 4: Repeat Until Approved**
67
67
  - Address all errors in each phase before proceeding to next phase
68
68
  - Error found → Fix immediately → Re-run checks
69
- - All pass → Return `approved: true`
70
- - Cannot determine spec → Return `blocked`
69
+ - All pass → proceed to Step 5
70
+ - Cannot determine spec → proceed to Step 5 with `blocked` status
71
+
72
+ **Step 5: Return JSON Result**
73
+ Return one of the following as the final response (see Output Format for schemas):
74
+ - `status: "approved"` — all quality checks pass
75
+ - `status: "blocked"` — specification unclear, business judgment required
71
76
 
72
77
  ## Status Determination Criteria (Binary Determination)
73
78
 
@@ -144,7 +149,6 @@ Apply fixes following the principles in coding-rules skill and testing skill.
144
149
  "totalWarnings": 0,
145
150
  "executionTime": "2m 15s"
146
151
  },
147
- "approved": true,
148
152
  "nextActions": "Ready to commit"
149
153
  }
150
154
  ```
@@ -170,11 +174,9 @@ Apply fixes following the principles in coding-rules skill and testing skill.
170
174
  }
171
175
  ```
172
176
 
173
- ### User Report (Mandatory)
174
-
175
- Summarize quality check results in an understandable way for users
177
+ ## Intermediate Progress Report
176
178
 
177
- ### Phase-by-phase Report (Detailed Information)
179
+ During execution, report progress between tool calls using this format:
178
180
 
179
181
  ```markdown
180
182
  Phase [Number]: [Phase Name]
@@ -192,6 +194,12 @@ Issues requiring fixes:
192
194
  Phase [Number] Complete! Proceeding to next phase.
193
195
  ```
194
196
 
197
+ This is intermediate output only. The final response must be the JSON result (Step 5).
198
+
199
+ ## Completion Criteria
200
+
201
+ - [ ] Final response is a single JSON with status `approved` or `blocked`
202
+
195
203
  ## Important Principles
196
204
 
197
205
  MUST follow these principles to maintain high-quality code:
@@ -112,6 +112,9 @@ Identify constraints, risks, and dependencies. Use web search to verify current
112
112
  ### 6. Formulate Questions
113
113
  Identify any ambiguities that affect scale determination (scopeDependencies) or require user confirmation before proceeding.
114
114
 
115
+ ### 7. Return JSON Result
116
+ Return the JSON result as the final response. See Output Format for the schema.
117
+
115
118
  ## Output Format
116
119
 
117
120
  **JSON format is mandatory.**
@@ -161,6 +164,7 @@ Identify any ambiguities that affect scale determination (scopeDependencies) or
161
164
  - [ ] Have I correctly determined ADR necessity?
162
165
  - [ ] Have I not overlooked technical risks?
163
166
  - [ ] Have I listed scopeDependencies for uncertain scale?
167
+ - [ ] Final response is the JSON output
164
168
 
165
169
  ## Completion Gate [BLOCKING]
166
170
 
@@ -65,6 +65,9 @@ From each skill:
65
65
  - Prioritize concrete procedures over abstract principles
66
66
  - Include checklists and actionable items
67
67
 
68
+ ### 4. Return JSON Result
69
+ Return the JSON result as the final response. See Output Format for the schema.
70
+
68
71
  ## Output Format
69
72
 
70
73
  Return structured JSON:
@@ -172,6 +175,12 @@ Return structured JSON:
172
175
  - MUST include enough context for standalone understanding
173
176
  - Prioritize actionable guidance over theory
174
177
 
178
+ ## Completion Criteria
179
+
180
+ - [ ] Task analysis completed with type, scale, and tags
181
+ - [ ] Relevant skills loaded and sections extracted
182
+ - [ ] Final response is the JSON output
183
+
175
184
  ## Completion Gate [BLOCKING]
176
185
 
177
186
  ☐ All completion criteria met with evidence
@@ -49,8 +49,8 @@ Skill Status:
49
49
 
50
50
  ## Output Scope
51
51
 
52
- This agent outputs **scope discovery results and evidence only**.
53
- Document generation is out of scope for this agent.
52
+ This agent outputs **scope discovery results, evidence, and PRD unit grouping**.
53
+ Document generation (PRD content, Design Doc content) is out of scope for this agent.
54
54
 
55
55
  ## Core Responsibilities
56
56
 
@@ -115,8 +115,10 @@ Explore the codebase from both user-value and technical perspectives simultaneou
115
115
  - Identify interface contracts
116
116
 
117
117
  4. **Synthesis into Functional Units**
118
- - Merge user-value groups and technical boundaries into functional units
118
+ - Combine user-value groups and technical boundaries into functional units
119
119
  - Each unit MUST represent a coherent feature with identifiable technical scope
120
+ - For each unit, identify its `valueProfile`: who uses it, what goal it serves, and what high-level capability it belongs to
121
+ - Also assign normalized grouping keys in `valueProfile.groupingKey` for persona, goal, and category; use short stable slugs (`kebab-case`) rather than free-form prose
120
122
  - Apply Granularity Criteria (see below)
121
123
 
122
124
  5. **Boundary Validation**
@@ -128,6 +130,16 @@ Explore the codebase from both user-value and technical perspectives simultaneou
128
130
  - Stop discovery when 3 consecutive new sources yield no new units
129
131
  - Mark discovery as saturated in output
130
132
 
133
+ 7. **PRD Unit Grouping** (execute only after steps 1-6 are fully complete)
134
+ - Using the finalized `discoveredUnits` and their `valueProfile` metadata, group units into PRD-appropriate units
135
+ - Grouping logic: units with the same `groupingKey.valueCategory` AND the same `groupingKey.userGoal` AND the same `groupingKey.targetPersona` belong to one PRD unit. If any of the three differs, the units become separate PRD units
136
+ - Free-text fields (`targetPersona`, `userGoal`, `valueCategory`) are explanatory only and MUST NOT be used as grouping keys
137
+ - Every discovered unit must appear in exactly one PRD unit's `sourceUnits`
138
+ - Output as `prdUnits` alongside `discoveredUnits` (see Output Format)
139
+
140
+ 8. **Return JSON Result**
141
+ - Return the JSON result as the final response. See Output Format for the schema.
142
+
131
143
  ## Granularity Criteria
132
144
 
133
145
  Each discovered unit MUST represent a Vertical Slice — a coherent functional unit that spans all relevant layers — and satisfy:
@@ -138,11 +150,13 @@ Each discovered unit MUST represent a Vertical Slice — a coherent functional u
138
150
  - Multiple independent user journeys within one unit
139
151
  - Multiple distinct data domains with no shared state
140
152
 
141
- **Merge signals** (units may be too granular):
153
+ **Cohesion signals** (units that may belong together):
142
154
  - Units share >50% of related files
143
155
  - One unit cannot function without the other
144
156
  - Combined scope is still under 10 files
145
157
 
158
+ Note: These signals are informational only during steps 1-6. Keep all discovered units separate and capture accurate value metadata (see `valueProfile` in Output Format). PRD-level grouping is performed in step 7 after discovery is complete, using normalized grouping keys rather than free-text descriptions.
159
+
146
160
  ## Confidence Assessment
147
161
 
148
162
  | Level | Triangulation Strength | Criteria |
@@ -174,6 +188,16 @@ Each discovered unit MUST represent a Vertical Slice — a coherent functional u
174
188
  "entryPoints": ["/path1", "/path2"],
175
189
  "relatedFiles": ["src/feature/*"],
176
190
  "dependencies": ["UNIT-002"],
191
+ "valueProfile": {
192
+ "targetPersona": "Who this feature serves (e.g., 'end user', 'admin', 'developer')",
193
+ "userGoal": "What the user is trying to accomplish with this feature",
194
+ "valueCategory": "High-level capability this belongs to (e.g., 'Authentication', 'Content Management', 'Reporting')",
195
+ "groupingKey": {
196
+ "targetPersona": "end-user",
197
+ "userGoal": "sign-in",
198
+ "valueCategory": "authentication"
199
+ }
200
+ },
177
201
  "technicalProfile": {
178
202
  "primaryModules": ["src/auth/service.ts", "src/auth/controller.ts"],
179
203
  "publicInterfaces": ["AuthService.login()", "AuthController.handleLogin()"],
@@ -196,6 +220,21 @@ Each discovered unit MUST represent a Vertical Slice — a coherent functional u
196
220
  "suggestedAction": "What to do"
197
221
  }
198
222
  ],
223
+ "prdUnits": [
224
+ {
225
+ "id": "PRD-001",
226
+ "name": "PRD unit name (user-value level)",
227
+ "description": "What this capability delivers to the user",
228
+ "groupingKey": {
229
+ "targetPersona": "end-user",
230
+ "userGoal": "sign-in",
231
+ "valueCategory": "authentication"
232
+ },
233
+ "sourceUnits": ["UNIT-001", "UNIT-003"],
234
+ "combinedRelatedFiles": ["src/feature-a/*", "src/feature-b/*"],
235
+ "combinedEntryPoints": ["/path1", "/path2", "/path3"]
236
+ }
237
+ ],
199
238
  "limitations": ["What could not be discovered and why"]
200
239
  }
201
240
  ```
@@ -209,11 +248,14 @@ Each discovered unit MUST represent a Vertical Slice — a coherent functional u
  - [ ] Mapped public interfaces
  - [ ] Analyzed dependency graph
  - [ ] Applied granularity criteria (split/merge as needed)
+ - [ ] Identified value profile (persona, goal, category) for each unit
  - [ ] Mapped discovered units to evidence sources
  - [ ] Assessed triangulation strength for each unit
  - [ ] Documented relationships between units
  - [ ] Reached saturation or documented why not
  - [ ] Listed uncertain areas and limitations
+ - [ ] Grouped discovered units into PRD units (step 7, after all discovery steps complete)
+ - [ ] Final response is the JSON output

  ## Output Self-Check
  - [ ] Output is limited to scope discovery (no PRD or Design Doc content generated)
@@ -101,6 +101,9 @@ Each finding must include a `rationale` field whose content depends on the categ
  | **hardening** | Why the current state is acceptable, and what improvement would add |
  | **policy** | Why this is not a technical vulnerability (what mitigates the technical risk) |

+ ### 6. Return JSON Result
+ Return the JSON result as the final response. See Output Format for the schema.
+
  ## Output Format

  ```json
@@ -155,6 +158,7 @@ Each finding must include a `rationale` field whose content depends on the categ
  - [ ] Each finding classified into confirmed_risk / defense_gap / hardening / policy
  - [ ] False positives excluded considering runtime environment and existing mitigations
  - [ ] Committed secrets checked (blocked status if found)
+ - [ ] Final response is the JSON output

  ## Completion Gate [BLOCKING]
 
@@ -111,12 +111,15 @@ Recommendation strategy based on confidence:
  - medium: Staged approach, verify with low-impact fixes before full implementation
  - low: Start with conservative mitigation, prioritize solutions that address multiple possible causes

- ### Step 5: Implementation Steps Creation and Output
+ ### Step 5: Implementation Steps Creation
  - Each step independently verifiable
  - Explicitly state dependencies between steps
  - Define completion conditions for each step
  - Include rollback procedures
- - Output structured report in JSON format
+
+ ### Step 6: Return JSON Result
+
+ Return the JSON result as the final response. See Output Format for the schema.

  ## Output Format
 
@@ -184,6 +187,7 @@ Recommendation strategy based on confidence:
  - [ ] Documented residual risks
  - [ ] Verified solutions align with project rules or best practices
  - [ ] Verified input consistency with user report
+ - [ ] Final response is the JSON output

  ## Output Self-Check
  - [ ] Solution addresses the user's reported symptoms (not just the technical conclusion)
@@ -184,6 +184,11 @@ Select and execute files with pattern `docs/plans/tasks/*-task-*.md` that have u
  Task complete when all checkbox items completed and operation verification complete.
  For research tasks, includes creating deliverable files specified in metadata "Provides" section.

+ ### 5. Return JSON Result
+ Return one of the following as the final response (see Structured Response Specification for schemas):
+ - `status: "completed"` — task fully implemented
+ - `status: "escalation_needed"` — design deviation or similar component discovered
+
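An orchestrator consuming this final response could check the two permitted `status` values before acting on it. The following is a minimal sketch under stated assumptions: only the `status` field comes from the text above, while the `TaskExecutorResponse` type and the `parseFinalResponse` helper are hypothetical names introduced for illustration, and all other fields are left unvalidated because their schema lives in the Structured Response Specification.

```typescript
// The two status values named in the "Return JSON Result" step.
type TaskExecutorStatus = "completed" | "escalation_needed";

// Hypothetical shape: only `status` is checked; other fields pass through untyped.
interface TaskExecutorResponse {
  status: TaskExecutorStatus;
  [field: string]: unknown;
}

// Parse a sub-agent's final response and reject anything that is not
// a single JSON object carrying one of the two permitted statuses.
function parseFinalResponse(raw: string): TaskExecutorResponse {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    throw new Error("Final response is not valid JSON: " + String(err));
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Final response must be a single JSON object");
  }
  const status = (parsed as Record<string, unknown>).status;
  if (status !== "completed" && status !== "escalation_needed") {
    throw new Error("Unexpected status: " + String(status));
  }
  return parsed as TaskExecutorResponse;
}
```

Rejecting unknown statuses up front keeps a malformed or truncated response from being mistaken for a completed task.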
  ## Research Task Deliverables

  Research/analysis tasks create deliverable files specified in metadata "Provides".
@@ -291,6 +296,10 @@ When discovering similar components/hooks during existing code investigation, es
  - Design Doc deviation → escalate to orchestrator immediately
  - Component patterns → use functional components exclusively (React standard)

+ ## Completion Criteria
+
+ - [ ] Final response is a single JSON with status `completed` or `escalation_needed`
+
  ## Completion Gate [BLOCKING]

  ☐ All completion criteria met with evidence
@@ -185,6 +185,11 @@ Select and execute files with pattern `docs/plans/tasks/*-task-*.md` that have u
  Task complete when all checkbox items completed and operation verification complete.
  For research tasks, includes creating deliverable files specified in metadata "Provides" section.

+ ### 5. Return JSON Result
+ Return one of the following as the final response (see Structured Response Specification for schemas):
+ - `status: "completed"` — task fully implemented
+ - `status: "escalation_needed"` — design deviation or similar function discovered
+
  ## Research Task Deliverables

  Research/analysis tasks create deliverable files specified in metadata "Provides".
@@ -293,6 +298,10 @@ When discovering similar functions during existing code investigation, escalate
  - Escalate when: design deviation, similar functions found, test environment missing
  - Stop after implementation and test creation — quality checks and commits are handled separately

+ ## Completion Criteria
+
+ - [ ] Final response is a single JSON with status `completed` or `escalation_needed`
+
  ## Completion Gate [BLOCKING]

  ☐ All completion criteria met with evidence
@@ -116,7 +116,11 @@ Classify each hypothesis by the following levels:
  - Example: "The implementation is wrong" → Was design_gap considered?
  - If inconsistent, explicitly note "Investigation focus may be misaligned with user report"

- **Conclusion**: Adopt unrefuted hypotheses as causes. When multiple causes exist, determine their relationship (independent/dependent/exclusive) and output in JSON format
+ **Conclusion**: Adopt unrefuted hypotheses as causes. When multiple causes exist, determine their relationship (independent/dependent/exclusive)
+
+ ### Step 7: Return JSON Result
+
+ Return the JSON result as the final response. See Output Format for the schema.
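The conclusion rule above (adopt unrefuted hypotheses as causes, and state a relationship when there is more than one) can be sketched as follows. This is an illustration only: the `Hypothesis` and `Conclusion` shapes, the `refuted` flag, and the `concludeCauses` helper are hypothetical and are not taken from the agent's actual Output Format schema.

```typescript
// The three relationship values named in the conclusion rule.
type CauseRelationship = "independent" | "dependent" | "exclusive";

// Hypothetical minimal hypothesis record; `refuted` stands in for the
// outcome of the verification steps.
interface Hypothesis {
  id: string;
  refuted: boolean;
}

// Hypothetical result shape: `relationship` is present only for multiple causes.
interface Conclusion {
  causes: string[];
  relationship?: CauseRelationship;
}

// Keep every unrefuted hypothesis as a cause; require an explicit
// relationship whenever more than one cause survives.
function concludeCauses(
  hypotheses: Hypothesis[],
  relationship?: CauseRelationship
): Conclusion {
  const causes = hypotheses.filter((h) => !h.refuted).map((h) => h.id);
  if (causes.length > 1) {
    if (relationship === undefined) {
      throw new Error("Multiple causes require a relationship");
    }
    return { causes, relationship };
  }
  return { causes };
}
```

The relationship is passed in rather than computed, since deciding whether causes are independent, dependent, or exclusive is a judgment the verifier makes from evidence.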

  ## Confidence Determination Criteria
 
@@ -205,6 +209,7 @@ Classify each hypothesis by the following levels:
  - [ ] Verified consistency with user report
  - [ ] Determined verification level for each hypothesis
  - [ ] Adopted unrefuted hypotheses as causes and determined relationship when multiple
+ - [ ] Final response is the JSON output

  ## Output Self-Check
  - [ ] Confidence levels reflect all discovered evidence, including official documentation
package/README.md CHANGED
@@ -88,7 +88,7 @@ Problem → investigator → verifier (ACH + Devil's Advocate) → solver → Ac
  ### Reverse Engineering

  ```
- Existing code → scope-discoverer → prd-creator → code-verifier → document-reviewer → Design Docs
+ Existing code → scope-discoverer (discoveredUnits + prdUnits) → prd-creator → code-verifier → document-reviewer → Design Docs
  ```

  ---
@@ -246,7 +246,7 @@ Codex spawns these as needed during recipe execution. Each agent runs in its own
  | `code-verifier` | Document-code consistency verification |
  | `security-reviewer` | Security compliance review after implementation |
  | `rule-advisor` | Skill selection via metacognitive analysis |
- | `scope-discoverer` | Codebase scope discovery for reverse docs |
+ | `scope-discoverer` | Codebase scope discovery for reverse docs, including PRD unit grouping |

  ### Diagnosis Agents
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "codex-workflows",
- "version": "0.2.1",
+ "version": "0.2.2",
  "description": "Task-oriented agentic coding framework for OpenAI Codex CLI — skills, recipes, and subagents for structured development workflows",
  "license": "MIT",
  "author": "Shinsuke Kagawa",