codex-workflows 0.4.9 → 0.4.10

@@ -96,6 +96,27 @@ Subagents CANNOT directly call other subagents — all coordination MUST flow th
 
  **ENFORCEMENT**: Direct subagent-to-subagent communication is PROHIBITED
 
+ ### Subagent Completion Discipline [MANDATORY]
+
+ The orchestrator owns subagent completion. Base waiting decisions on assigned responsibility and observed state, not on an expectation of quick completion. Multi-step search, review, verification, generation, implementation, and quality work can run for extended periods.
+
+ Use this contract:
+ - Wait for required subagent outputs with `wait_agent`
+ - Keep the current task assignment while the subagent remains `running`
+ - Treat missing intermediate output as a normal execution state while the subagent remains `running`
+ - Hold final artifact production until every required subagent output is available
+ - After repeated empty waits, run non-destructive diagnostics: re-check prompt, inputs, expected deliverable, and agent-task fit; send a focused follow-up when it would clarify the pending deliverable
+ - Resume waiting after diagnostics unless the user redirects the workflow or the orchestrator confirms a launch mistake
+
+ Treat the following as explicit contradictory evidence:
+ - The subagent returns a terminal status such as `approved`, `needs_revision`, `blocked`, `skipped`, `completed`, or `escalation_needed`
+ - The orchestrator verifies that it launched the wrong subagent or sent materially incorrect inputs
+ - A newer explicit user instruction changes or cancels the task
+
+ Close a running subagent only when the user redirects the workflow, the orchestrator corrects a launch mistake, or a newer user instruction supersedes the pending task.
+
+ **ENFORCEMENT**: Preserve subagent execution until completion, user redirection, or explicit correction of an orchestrator launch mistake. Speed-based early termination is a CRITICAL VIOLATION.
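Reviewer sketch (not part of the package): the completion contract added above can be read as a small decision function over the observed subagent state. The names `Observation`, `next_action`, and the threshold `EMPTY_WAITS_BEFORE_DIAGNOSTICS` are hypothetical; the source only says "repeated empty waits" and gives no number. The sketch does not call `wait_agent`, because its signature is not shown in this diff.

```python
from dataclasses import dataclass

# Terminal statuses named in the contract above.
TERMINAL_STATUSES = {
    "approved", "needs_revision", "blocked",
    "skipped", "completed", "escalation_needed",
}

# Assumption: the source says "repeated empty waits" without a number.
EMPTY_WAITS_BEFORE_DIAGNOSTICS = 3


@dataclass
class Observation:
    status: str                           # "running" or a terminal status
    empty_waits: int = 0                  # consecutive waits with no payload
    user_redirected: bool = False         # newer user instruction changes or cancels the task
    launch_mistake_confirmed: bool = False  # wrong subagent or materially wrong inputs


def next_action(obs: Observation) -> str:
    """Return the orchestrator's next step; elapsed time is never an input."""
    if obs.user_redirected or obs.launch_mistake_confirmed:
        return "close"
    if obs.status in TERMINAL_STATUSES:
        return "collect_output"
    if obs.empty_waits >= EMPTY_WAITS_BEFORE_DIAGNOSTICS:
        # Non-destructive diagnostics, then resume waiting.
        return "diagnose_then_wait"
    return "wait"
```

Note that no branch closes a subagent because it is slow: the only paths to `close` are user redirection and a confirmed launch mistake, which is how the enforcement line above reads.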
+
  ## How to Spawn Agents
 
  Spawn agents using natural language prompts. Provide clear context about what the agent should accomplish.
@@ -149,16 +170,9 @@ All agents MUST use this vocabulary consistently:
  | `blocked` | security-reviewer | Committed secrets or high-confidence exploitable risk | Halt workflow immediately, escalate to user (requires human intervention) |
  | `skipped` | All agents | Preconditions not met for this step | Report reason, proceed |
 
- **approved_with_conditions handling** (document agents):
- - Conditions MUST be listed explicitly in the agent's output
- - Orchestrator MUST append conditions to the document's "Undetermined Items" or "Open Items" section before proceeding
- - Orchestrator MUST pass conditions to the next phase's agent as context
- - Conditions do not block progression but MUST be resolved before implementation phase
-
- **approved_with_notes handling** (security-reviewer):
- - Notes are informational — they do NOT require resolution before proceeding
- - Orchestrator MUST include notes in the completion report for awareness
- - Do not apply approved_with_conditions handling (no resolution tracking)
+ Handling rules:
+ - `approved_with_conditions`: append the listed conditions to the document's open-items section, carry them into the next phase, and resolve them before implementation
+ - `approved_with_notes`: include the notes in the completion report for awareness
 
  **ENFORCEMENT**: Using any status value outside this vocabulary is a VIOLATION.
 
@@ -176,17 +190,20 @@ All agents MUST use this vocabulary consistently:
 
  ## Structured Response Specification
 
- Subagents respond in JSON format. The final response from each JSON-returning subagent must be the JSON payload itself, with no trailing prose. Key fields for orchestrator decisions:
- - **requirement-analyzer**: scale, confidence, affectedLayers, adrRequired, scopeDependencies, questions
- - **codebase-analyzer**: analysisScope, existingElements, dataModel, qualityAssurance, focusAreas, limitations
- - **task-executor**: status (escalation_needed/completed), escalation_type (design_compliance_violation/similar_function_found/similar_component_found/investigation_target_not_found/out_of_scope_file/test_environment_not_ready/dependency_version_uncertain), testsAdded, requiresTestReview
- - **quality-fixer**: Input: `task_file` (always pass the current task file path in orchestrated flows). Status (`stub_detected`/approved/blocked). `stub_detected` returns `stubFindings[]` and routes back to the task executor. For blocked responses, discriminate by `reason`: specification conflicts use `blockingIssues[]`; execution prerequisites use `missingPrerequisites[]`, and each item provides its own `resolutionSteps`
- - **document-reviewer**: verdict.decision (approved/approved_with_conditions/needs_revision/rejected)
- - **code-verifier**: summary.status, summary.consistencyScore, discrepancies, reverseCoverage
- - **design-sync**: sync_status (CONFLICTS_FOUND/NO_CONFLICTS) text format with [SUMMARY] block
- - **integration-test-reviewer**: status (approved/needs_revision/blocked), requiredFixes
- - **security-reviewer**: status (approved/approved_with_notes/needs_revision/blocked), findings, notes, requiredFixes
- - **acceptance-test-generator**: status, generatedFiles, `e2eAbsenceReason`
+ Subagents respond in JSON format. The final response from each JSON-returning subagent must be the JSON payload itself, with no trailing prose. Agent TOML files define the full schemas; the orchestrator only relies on these routing keys:
+
+ | Agent | Routing fields the orchestrator uses |
+ |-------|--------------------------------------|
+ | `requirement-analyzer` | `scale`, `confidence`, `affectedLayers`, `adrRequired`, `scopeDependencies`, `questions` |
+ | `codebase-analyzer` | `focusAreas`, `dataModel`, `qualityAssurance`, `dataTransformationPipelines`, `limitations` |
+ | `task-executor*` | `status`, `escalation_type`, `filesModified`, `requiresTestReview` |
+ | `quality-fixer*` | `status`, `reason`, `stubFindings`, `blockingIssues`, `missingPrerequisites` |
+ | `document-reviewer` | `verdict.decision`, `verdict.conditions` |
+ | `code-verifier` | `summary.status`, `discrepancies`, `reverseCoverage` |
+ | `design-sync` | `sync_status` |
+ | `integration-test-reviewer` | `status`, `requiredFixes` |
+ | `security-reviewer` | `status`, `findings`, `notes`, `requiredFixes` |
+ | `acceptance-test-generator` | `status`, `generatedFiles`, `e2eAbsenceReason` |
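Reviewer sketch (not part of the package): one way an orchestrator might reduce a subagent's JSON response to the routing fields in the table above. `ROUTING_FIELDS`, `get_path`, and `routing_view` are hypothetical names, and only three agents from the table are included for brevity. Dotted names such as `verdict.decision` are assumed to denote nested keys.

```python
import json
from typing import Any

# Subset of the routing table above; dotted names are treated as nested keys.
ROUTING_FIELDS = {
    "document-reviewer": ["verdict.decision", "verdict.conditions"],
    "code-verifier": ["summary.status", "discrepancies", "reverseCoverage"],
    "security-reviewer": ["status", "findings", "notes", "requiredFixes"],
}


def get_path(payload: Any, dotted: str) -> Any:
    """Follow a dotted key path; return None when any segment is missing."""
    current = payload
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def routing_view(agent: str, raw_response: str) -> dict:
    """Parse the response and keep only the fields used for routing."""
    # json.loads rejects trailing prose after the payload, which matches the
    # "JSON payload itself, with no trailing prose" requirement.
    payload = json.loads(raw_response)
    return {field: get_path(payload, field) for field in ROUTING_FIELDS[agent]}
```

Fields outside the table are simply ignored, so a change to an agent's full schema would only affect this view if it touched a routing key.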
 
  ## Handling Requirement Changes
 
@@ -215,51 +232,21 @@ Document generation agents (work-planner, technical-designer, prd-creator) can u
 
  ## Basic Flow for Work Planning
 
- When receiving new features or change requests, start with requirement-analyzer.
-
- ### Large Scale (6+ Files) - 13 Steps (backend) / 15 Steps (frontend/fullstack)
-
- 1. requirement-analyzer: Requirement analysis + Check existing PRD **[Stop]**
- 2. prd-creator: PRD creation
- 3. document-reviewer: PRD review **[Stop: PRD Approval]**
- 4. **(frontend/fullstack only)** Ask user for prototype code; ui-spec-designer: UI Spec creation
- 5. **(frontend/fullstack only)** document-reviewer: UI Spec review **[Stop: UI Spec Approval]**
- 6. technical-designer: ADR creation (if architecture/technology/data flow changes)
- 7. document-reviewer: ADR review (if ADR created) **[Stop: ADR Approval]**
- 8. codebase-analyzer: Codebase analysis (pass requirement-analyzer output and PRD path when available)
- 9. technical-designer: Design Doc creation
- 10. code-verifier: Design Doc verification against code
- 11. document-reviewer: Design Doc review with code verification evidence
- 12. design-sync: Consistency verification **[Stop: Design Doc Approval]**
- 13. acceptance-test-generator: Test skeleton generation, pass to work-planner
- 14. work-planner: Work plan creation **[Stop: Batch approval]**
- 15. task-decomposer: Autonomous execution to Completion report
-
- ### Medium Scale (3-5 Files) - 9 Steps (backend) / 11 Steps (frontend/fullstack)
-
- 1. requirement-analyzer: Requirement analysis **[Stop]**
- 2. codebase-analyzer: Codebase analysis
- 3. **(frontend/fullstack only)** Ask user for prototype code; ui-spec-designer: UI Spec creation
- 4. **(frontend/fullstack only)** document-reviewer: UI Spec review **[Stop: UI Spec Approval]**
- 5. technical-designer: Design Doc creation
- 6. code-verifier: Design Doc verification against code
- 7. document-reviewer: Design Doc review with code verification evidence
- 8. design-sync: Consistency verification **[Stop: Design Doc Approval]**
- 9. acceptance-test-generator: Test skeleton generation, pass to work-planner
- 10. work-planner: Work plan creation **[Stop: Batch approval]**
- 11. task-decomposer: Autonomous execution to Completion report
-
- ### Design Flow Data Passing
-
- - Pass requirement-analyzer output and original requirements to codebase-analyzer
- - Pass codebase-analyzer JSON to technical-designer or technical-designer-frontend as `Codebase Analysis`, including `dataTransformationPipelines` and `qualityAssurance` when present
- - Pass Design Doc path to code-verifier
- - Pass code-verifier JSON to document-reviewer as `code_verification`
-
- ### Small Scale (1-2 Files) - 2 Steps
-
- 1. Create simplified plan **[Stop: Batch approval]**
- 2. Direct implementation to Completion report
+ Always start with `requirement-analyzer`, then follow the minimum flow required by scale and affected layers.
+
+ | Scale | Required flow |
+ |-------|---------------|
+ | Large | `requirement-analyzer` **[Stop]** -> `prd-creator` -> `document-reviewer` **[Stop]** -> optional `ui-spec-designer` + `document-reviewer` **[Stop]** -> optional ADR + `document-reviewer` **[Stop]** -> `codebase-analyzer` -> `technical-designer*` -> `code-verifier` -> `document-reviewer` -> `design-sync` **[Stop]** -> `acceptance-test-generator` -> `work-planner` **[Stop]** -> `task-decomposer` |
+ | Medium | `requirement-analyzer` **[Stop]** -> `codebase-analyzer` -> optional `ui-spec-designer` + `document-reviewer` **[Stop]** -> `technical-designer*` -> `code-verifier` -> `document-reviewer` -> `design-sync` **[Stop]** -> `acceptance-test-generator` -> `work-planner` **[Stop]** -> `task-decomposer` |
+ | Small | `requirement-analyzer` **[Stop]** -> simplified plan **[Stop: Batch approval]** -> direct implementation |
+
+ Flow rules:
+ - Frontend and fullstack flows add UI Spec before Design Doc creation
+ - Create ADR only when architecture, technology, or data-flow changes require it
+ - Pass requirement-analyzer output and original requirements to `codebase-analyzer`
+ - Pass `codebase-analyzer` output to the designer as `Codebase Analysis`
+ - Pass Design Doc path to `code-verifier`, then pass `code_verification` to `document-reviewer`
+ - Fullstack layer sequencing is defined in `references/monorepo-flow.md`
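Reviewer sketch (not part of the package): the new scale table and flow rules above can be expanded into an ordered step list. The function `required_flow`, its parameters, and the lowercase scale strings are assumptions for illustration; the diff does not show the exact values `requirement-analyzer` returns for `scale`. Stop points and fullstack layer sequencing are omitted.

```python
def required_flow(scale: str, has_frontend: bool = False,
                  adr_required: bool = False) -> list[str]:
    """Expand the scale table into an ordered list of steps."""
    # Frontend and fullstack flows add UI Spec before Design Doc creation.
    ui_spec = ["ui-spec-designer", "document-reviewer"] if has_frontend else []
    design = ["technical-designer*", "code-verifier",
              "document-reviewer", "design-sync"]
    tail = ["acceptance-test-generator", "work-planner", "task-decomposer"]

    if scale == "large":
        # ADR only when architecture, technology, or data-flow changes require it.
        adr = ["ADR", "document-reviewer"] if adr_required else []
        return (["requirement-analyzer", "prd-creator", "document-reviewer"]
                + ui_spec + adr + ["codebase-analyzer"] + design + tail)
    if scale == "medium":
        return ["requirement-analyzer", "codebase-analyzer"] + ui_spec + design + tail
    if scale == "small":
        return ["requirement-analyzer", "simplified plan", "direct implementation"]
    raise ValueError(f"unknown scale: {scale!r}")
```

Under these assumptions the step counts line up with the headings removed in this hunk: a medium backend flow has 9 steps and a large frontend flow with an ADR has 15.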
 
  ## Autonomous Execution Mode
 
@@ -316,6 +303,13 @@ Stop autonomous execution and escalate to user in the following cases:
  3. **Work-planner update restriction violated**: Requirement changes after task-decomposer starts require overall redesign
  4. **User explicitly stops**: Direct stop instruction or interruption
 
+ Continue autonomous execution in the following situations:
+ - A subagent takes longer than expected
+ - `wait_agent` returns without a completion payload while the subagent remains `running`
+ - The orchestrator has partial context but is still waiting on a required subagent output
+
+ If repeated waits return the same `running` state, apply the completion-diagnostics contract above.
+
  Use the task loop defined in the autonomous execution diagram above. The canonical per-task cycle is:
  1. task-executor implementation
  2. escalation or integration-test-reviewer decision
@@ -338,49 +332,27 @@ Maximum retry count is 1 verification fix cycle. If any failed verifier still fa
  2. **Information Bridging**: Data conversion and transmission between subagents
  - Convert each subagent's output to next subagent's input format
  - **Always pass deliverables from previous process to next agent**
- - Extract necessary information from structured responses
- - Compose commit messages from changeSummary
+ - Extract the routing fields listed above
  - Explicitly integrate initial and additional requirements when requirements change
  3. **Quality Assurance and Commit Execution**: Execute git commit per the 4-step task cycle
  4. **Autonomous Execution Mode Management**: Start/stop autonomous execution after approval, escalation decisions
  5. **ADR Status Management**: Update ADR status after user decision (Accepted/Rejected)
 
- ### acceptance-test-generator to work-planner Bridge
-
- **Pass to acceptance-test-generator**:
- - Design Doc: [path]
- - UI Spec: [path] (if exists)
-
- **Orchestrator verification items**:
- - Verify integration test file path retrieval and existence
- - Verify E2E test file path retrieval and existence when `generatedFiles.e2e` is not null
- - Verify `e2eAbsenceReason` is present when `generatedFiles.e2e` is null
-
- **Pass to work-planner**:
- - Integration test file: [path] (create and execute simultaneously with each phase implementation)
- - E2E test file: [path] or `null` (execute only in final phase when present)
- - E2E absence reason: [value when E2E test file is null]
-
- **On error**: Escalate to user only when required outputs are missing without a valid absence reason
-
- ### Design Doc to Work Plan Verification Handoff
-
- When a Design Doc contains a Verification Strategy section, the orchestrator must carry forward:
- - Design Doc path
- - Verification Strategy details:
- - Correctness definition
- - Target comparison
- - Verification method
- - Observable success indicator
- - Verification timing
- - Early verification point (first target, success criteria, failure response)
-
- The resulting work plan must include this summary in its header so the plan remains self-sufficient for downstream task generation and execution planning.
- When the Design Doc includes an `Output Comparison` section, carry forward the comparison input, expected output fields or format, diff method, and transformation pipeline coverage as part of that summary.
-
- In addition, the orchestrator must preserve implementation-relevant technical requirements from each Design Doc so work-planner can create a Design-to-Plan Traceability table. Use the category values and normalization rules defined in the plan template, including protected no-change boundaries from sections such as `No Ripple Effect`.
-
- Work-planner maps each extracted item to a covering task or phase. If no covering task exists, the row is marked `gap` with justification. Justified gaps require user confirmation before plan approval.
+ ### Required Handoffs
+
+ | From | To | Required pass-through |
+ |------|----|-----------------------|
+ | `requirement-analyzer` | `codebase-analyzer` | requirement analysis JSON, original requirements, PRD path when available |
+ | `codebase-analyzer` | `technical-designer*` | `Codebase Analysis`, including `focusAreas`, `dataModel`, `qualityAssurance`, `dataTransformationPipelines`, `limitations` |
+ | `technical-designer*` | `code-verifier` | Design Doc path |
+ | `code-verifier` | `document-reviewer` | `code_verification` JSON |
+ | `acceptance-test-generator` | `work-planner` | integration test path, E2E path or `null`, `e2eAbsenceReason` when E2E is absent |
+ | Design Doc | `work-planner` | Verification Strategy summary, Output Comparison details, implementation-relevant technical requirements, protected no-change boundaries |
+
+ Handoff rules:
+ - Verify generated integration and E2E file paths exist before passing them onward
+ - Escalate only when required outputs are missing without a valid absence reason
+ - Require work-planner to map every carried-forward technical requirement to a covering task or a justified `gap`
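Reviewer sketch (not part of the package): the `acceptance-test-generator` to `work-planner` handoff rules above amount to a small validation step. `check_test_handoff` and its parameters are hypothetical; the existence check is injectable so the logic can be exercised without touching the filesystem, and the problem strings are illustrative only.

```python
import os
from typing import Callable, Optional


def check_test_handoff(
    integration_path: Optional[str],
    e2e_path: Optional[str],
    e2e_absence_reason: Optional[str],
    exists: Callable[[str], bool] = os.path.isfile,
) -> list[str]:
    """Return a list of problems; an empty list means the handoff may proceed."""
    problems = []
    # The integration test path is always required and must exist.
    if not integration_path or not exists(integration_path):
        problems.append("integration test file missing")
    if e2e_path is None:
        # A null E2E path is acceptable only with an absence reason.
        if not e2e_absence_reason:
            problems.append("E2E test absent without e2eAbsenceReason")
    elif not exists(e2e_path):
        problems.append("E2E test file missing")
    return problems
```

An orchestrator following the rules above would escalate only when this list is non-empty, since a missing E2E file with a stated absence reason is a valid state.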
 
  ## Important Constraints [MANDATORY]
 
package/README.md CHANGED
@@ -363,9 +363,9 @@ A: `$recipe-implement` is the universal entry point. It runs requirement-analyze
 
  A: Yes. Codex skills and subagents work alongside [MCP](https://developers.openai.com/codex/mcp) — skills operate at the instruction layer while MCP operates at the tool transport layer. You can add MCP servers to any agent's TOML configuration.
 
- **Q: What if a subagent gets stuck?**
+ **Q: What if a subagent seems stuck?**
 
- A: Subagents escalate to the user when they encounter design deviations, ambiguous requirements, or specification conflicts. The framework stops autonomous execution and presents the issue with options.
+ A: Long waits can be normal in this workflow because many subagents perform substantial multi-step work. The orchestrator keeps ownership of the pending deliverable, continues waiting for the required output, and uses diagnostics only to confirm missing inputs or restate the pending deliverable. User direction remains the boundary for interrupting that work.
 
  ---
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "codex-workflows",
- "version": "0.4.9",
+ "version": "0.4.10",
  "description": "Task-oriented agentic coding framework for OpenAI Codex CLI — skills, recipes, and subagents for structured development workflows",
  "license": "MIT",
  "author": "Shinsuke Kagawa",