codex-workflows 0.3.1 → 0.4.1

Files changed (28)
  1. package/.agents/skills/documentation-criteria/SKILL.md +25 -2
  2. package/.agents/skills/documentation-criteria/references/design-template.md +38 -0
  3. package/.agents/skills/documentation-criteria/references/plan-template.md +67 -10
  4. package/.agents/skills/documentation-criteria/references/task-template.md +8 -1
  5. package/.agents/skills/integration-e2e-testing/SKILL.md +4 -1
  6. package/.agents/skills/recipe-build/SKILL.md +1 -1
  7. package/.agents/skills/recipe-design/SKILL.md +22 -14
  8. package/.agents/skills/recipe-front-build/SKILL.md +1 -1
  9. package/.agents/skills/recipe-front-design/SKILL.md +7 -2
  10. package/.agents/skills/recipe-fullstack-build/SKILL.md +1 -1
  11. package/.agents/skills/recipe-fullstack-implement/SKILL.md +2 -0
  12. package/.agents/skills/recipe-implement/SKILL.md +2 -0
  13. package/.agents/skills/recipe-reverse-engineer/SKILL.md +2 -2
  14. package/.agents/skills/recipe-update-doc/SKILL.md +16 -5
  15. package/.agents/skills/subagents-orchestration-guide/SKILL.md +64 -41
  16. package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md +45 -21
  17. package/.agents/skills/testing/SKILL.md +31 -0
  18. package/.codex/agents/acceptance-test-generator.toml +9 -2
  19. package/.codex/agents/code-verifier.toml +11 -4
  20. package/.codex/agents/codebase-analyzer.toml +193 -0
  21. package/.codex/agents/document-reviewer.toml +9 -0
  22. package/.codex/agents/integration-test-reviewer.toml +8 -7
  23. package/.codex/agents/task-decomposer.toml +16 -0
  24. package/.codex/agents/technical-designer-frontend.toml +28 -0
  25. package/.codex/agents/technical-designer.toml +21 -0
  26. package/.codex/agents/work-planner.toml +18 -4
  27. package/README.md +11 -5
  28. package/package.json +1 -1
@@ -57,18 +57,19 @@ Key checks:
  ## Verification Process

  ### 1. Skeleton Comment Extraction
- Extract the following comment patterns from test file:
- - `// AC:` → Original acceptance criteria
- - `// Behavior:` → Trigger → Process → Observable Result
- - `// @category:` → Test classification
- - `// @dependency:` → Dependencies
- - `// Verification items:` → Expected verification items (if present)
+ Extract the following annotation patterns from the test file using the project's comment syntax:
+ - `AC:` → Original acceptance criteria
+ - `Behavior:` → Trigger → Process → Observable Result
+ - `@category:` → Test classification
+ - `@dependency:` → Dependencies
+ - `@real-dependency:` → Dependencies expected to stay real in integration coverage
+ - `Verification items:` → Expected verification items (if present)

  ### 2. Implementation Verification
  For each test case:
  1. Check if "observable result" from Behavior is asserted
  2. Check if all items in Verification items are covered by assertions
- 3. Verify mock boundaries match @dependency
+ 3. Verify mock boundaries match `@dependency` and `@real-dependency`
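
As a rough illustration of the annotation extraction in step 1, the sketch below scans a test file for the listed keys. It is not part of the package: the function name `extract_annotations` and the set of comment prefixes are assumptions made for this example, and a real project would substitute its own comment syntax.

```python
import re

# Annotation keys as listed in step 1 of the verification process.
ANNOTATION_KEYS = [
    "AC",
    "Behavior",
    "@category",
    "@dependency",
    "@real-dependency",
    "Verification items",
]

# Hypothetical comment prefixes; extend to match the project's languages.
COMMENT_PREFIXES = ("//", "#", "--")

_PATTERN = re.compile(
    r"^\s*(?:%s)\s*(%s):\s*(.*)$"
    % (
        "|".join(re.escape(prefix) for prefix in COMMENT_PREFIXES),
        "|".join(re.escape(key) for key in ANNOTATION_KEYS),
    )
)


def extract_annotations(source: str) -> dict[str, list[str]]:
    """Collect annotation values per key from comment lines in a test file."""
    found: dict[str, list[str]] = {key: [] for key in ANNOTATION_KEYS}
    for line in source.splitlines():
        match = _PATTERN.match(line)
        if match:
            found[match.group(1)].append(match.group(2).strip())
    return found
```

Keeping `@dependency` and `@real-dependency` as separate keys makes the mock-boundary check in step 2 a matter of comparing two lists against the mocks the test actually sets up.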

  ### 3. Quality Assessment
  Evaluate each test for:
@@ -51,6 +51,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
  - Understand dependencies between phases and tasks
  - Grasp completion criteria and quality standards
  - **Interface change detection and response**
+ - **Extract Verification Strategy from the work plan header**

  2. **Task Decomposition**
  - Decompose at 1 commit = 1 task granularity (logical change unit)
@@ -116,6 +117,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
  - Investigation Targets
  - Investigation Notes
  - Concrete implementation steps
+ - Operation Verification Methods
  - Completion criteria

  6. **Investigation Targets Determination**
@@ -128,6 +130,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
  | Integration/E2E test work | Test skeleton file, target implementation under test, existing fixture/auth/setup patterns |
  | E2E environment/setup work | Current environment config, startup scripts, seed/fixture scripts, auth flow references |
  | Bug fix/refactor | Affected code paths, failing tests, reproduction-related files |
+ | Behavior replacement/rewrite | Existing implementation being replaced, observable outputs, Verification Strategy section in the Design Doc |

  **Principles**:
  - Every task must include at least one Investigation Target
@@ -144,6 +147,18 @@ Decompose tasks based on implementation strategy patterns determined in implemen
  8. **Utilize Test Information**
  When test information (@category, @dependency, @complexity, etc.) is documented in the work plan, reflect that information in task files

+ ## Verification Strategy Propagation
+
+ Verification Strategy defines what correctness means at design time. L1/L2/L3 (from implementation-approach) define task-level verification depth at execution time. Use both.
+
+ When the work plan includes one or more Verification Strategy blocks:
+
+ 1. **Source preservation**: Keep each strategy tied to its source Design Doc or plan block. Preserve strategy identity and merge only when the work plan explicitly marks the strategy as shared.
+ 2. **Early verification task**: The task matching a strategy's "First verification target" MUST include that method and success criteria in Operation Verification Methods.
+ 3. **Per-task verification**: Each task's Operation Verification Methods MUST instantiate the relevant plan-level verification method for that task's specific files, interfaces, or behavior.
+ 4. **Failure handling**: Copy or adapt the relevant plan-level failure response so the executor knows whether to reassess, stop, or escalate.
+ 5. **Investigation coverage**: Include every resource required for verification, such as existing implementations for comparison, schema definitions, fixtures, contracts, or seed data.
+
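
The propagation rules above could be checked mechanically along these lines. This is a minimal sketch under assumed data shapes: the `Strategy` and `Task` classes, their field names, and `propagation_problems` are hypothetical and do not come from the package. It covers only rule 2 and the requirement that every task lists Operation Verification Methods.

```python
from dataclasses import dataclass, field


@dataclass
class Strategy:
    source: str                     # Design Doc or plan block the strategy came from
    method: str                     # plan-level verification method
    first_verification_target: str  # hypothetical: name of the task that verifies first


@dataclass
class Task:
    name: str
    operation_verification_methods: list[str] = field(default_factory=list)


def propagation_problems(strategies: list[Strategy], tasks: list[Task]) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    tasks_by_name = {task.name: task for task in tasks}

    # Every task needs at least one Operation Verification Method.
    for task in tasks:
        if not task.operation_verification_methods:
            problems.append(f"{task.name}: no Operation Verification Methods")

    # The first verification target must carry its strategy's method.
    for strategy in strategies:
        target = tasks_by_name.get(strategy.first_verification_target)
        if target is None:
            problems.append(f"{strategy.source}: first verification target has no task")
        elif strategy.method not in target.operation_verification_methods:
            problems.append(f"{target.name}: missing method from {strategy.source}")
    return problems
```

Because each `Strategy` keeps its own `source`, strategies from different Design Docs stay distinct, which mirrors the source-preservation rule.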
  ## Task File Template

  See task template in documentation-criteria skill for details.
@@ -243,6 +258,7 @@ Please execute decomposed tasks according to the order.
  - [ ] Impact scope and boundaries definition for each task
  - [ ] Appropriate granularity (1-5 files/task)
  - [ ] Investigation Targets specified for every task
+ - [ ] Operation Verification Methods specified for every task
  - [ ] Clear completion criteria setting
  - [ ] Overall design document creation
  - [ ] Implementation efficiency and rework prevention (pre-identification of common processing, clarification of impact scope)
@@ -96,6 +96,8 @@ Must be performed before Design Doc creation:
  - Clearly document similar component search results (found components or "none")
  - Include dependency existence verification results (verified existing / requires new creation / external dependency)
  - Record adopted decision (use existing/improvement proposal/new implementation) and rationale
+ - When Codebase Analysis input is provided, use it as the baseline evidence set and extend it only where gaps remain
+ - When frontend behavior depends on persistence, repositories, API-backed data contracts, or schema-shaped responses, complete the `Test Boundaries` section with a concrete verification strategy. When those concerns are outside the scope, mark the section explicitly as not applicable.

  ### Integration Points【Important】
  Document all integration points with existing components in a "## Integration Point Map" section.
@@ -144,6 +146,17 @@ Must be performed when creating Design Doc:
  - Which task first makes the entire UI operational
  - Verification level for each task (L1/L2/L3 defined in implementation-approach skill)

+ 3. **Verification Strategy Definition**
+ - Define what correctness means for this UI change and how it will be proven
+ - Use the Design Doc template fields directly
+ - Include at minimum: correctness definition, target comparison, verification method, observable success indicator, verification timing, and early verification point
+ - Use normalized verification timing values: `phase_1`, `per_phase`, `integration_phase`, or `final_phase`
+ - For low-risk or self-evident changes, a minimal form or explicit `N/A` with rationale is acceptable
+ - For new UI features, specify acceptance-criteria verification beyond unit tests
+ - For extensions, specify regression verification that proves existing behavior and UX expectations are preserved
+ - For refactors or rewrites, specify behavioral equivalence verification against the current UI behavior when applicable
+ - Define an early verification point: the first screen, state transition, or interaction that proves the approach works
+
  ### Change Impact Map【Required】
  Must be included when creating Design Doc:

@@ -203,6 +216,13 @@ When a UI Spec exists for the feature (`docs/ui-spec/{feature-name}-ui-spec.md`)
  - `reverse-engineer`: Document existing frontend architecture as-is

  - **Requirements Analysis Results**: Requirements analysis results (scale determination, technical requirements, etc.)
+ - **Codebase Analysis** (optional, from codebase-analyzer):
+ - Use as the primary source for Existing Codebase Analysis when provided
+ - `existingElements` informs implementation path mapping and inspection evidence
+ - `dataModel` informs API contract expectations and data-shape references
+ - `focusAreas` indicate components, hooks, or state paths that deserve deeper inspection
+ - `constraints` inform compatibility and UI behavior constraints
+ - Additional investigation should focus on areas the analysis did not fully resolve
  - **PRD**: PRD document (if exists)
  - **UI Spec**: UI Specification document (if exists, for frontend features)
  - **Documents to Create**: ADR, Design Doc, or both
@@ -256,6 +276,14 @@ Execute file output immediately. Final approval is managed by the orchestrator r
  - Cite information sources in "References" section with URLs
  - Especially confirm multiple reliable sources when introducing new technologies

+ ## Design Doc Completion Checklist
+
+ - [ ] Agreement Checklist completed and reflected in design
+ - [ ] Implementation approach selected with rationale
+ - [ ] Verification Strategy defined with correctness definition, target comparison, method, observable success indicator, timing, and early verification point
+ - [ ] Change Impact Map included
+ - [ ] Interface Change Impact Analysis included
+
  ## Implementation Sample Standards Compliance

  **MANDATORY**: All implementation samples in ADR and Design Docs MUST strictly comply with coding-rules skill standards without exception.
@@ -113,6 +113,8 @@ Must be performed before Design Doc creation:
  - Clearly document similar functionality search results (found implementations or "none")
  - Include dependency existence verification results (verified existing / requires new creation / external dependency)
  - Record adopted decision (use existing/improvement proposal/new implementation) and rationale
+ - When Codebase Analysis input is provided, use it as the baseline evidence set and extend it only where gaps remain
+ - When persistence, repositories, queries, migrations, or schema-bound behavior are part of scope, complete the `Test Boundaries` section with a concrete data layer verification strategy. When they are not part of scope, mark the section explicitly as not applicable.

  6. **Code Inspection Evidence**
  - Record all inspected files and key functions in "Code Inspection Evidence" section of Design Doc
@@ -178,6 +180,17 @@ Must be performed when creating Design Doc:
  - Which task first makes the whole system operational
  - Verification level for each task (L1/L2/L3 defined in implementation-approach skill)

+ 3. **Verification Strategy Definition**
+ - Define what correctness means for this change and how it will be proven
+ - Use the Design Doc template fields directly
+ - Include at minimum: correctness definition, target comparison, verification method, observable success indicator, verification timing, and early verification point
+ - Use normalized verification timing values: `phase_1`, `per_phase`, `integration_phase`, or `final_phase`
+ - For low-risk or self-evident changes, a minimal form or explicit `N/A` with rationale is acceptable
+ - For new features, specify acceptance-criteria verification beyond unit tests
+ - For extensions, specify regression verification that proves existing behavior is preserved
+ - For refactors or rewrites, specify behavioral equivalence verification against the current implementation when applicable
+ - Define an early verification point: the first target to validate before scaling the approach
+
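
To make the minimum contents of a Verification Strategy concrete, here is a small validation sketch. The dictionary keys, the `not_applicable` and `rationale` flags, and the function name are assumptions made for this example; the actual Design Doc template fields are not shown in this diff and may be named differently.

```python
# Hypothetical key names for the six minimum fields listed above.
REQUIRED_FIELDS = (
    "correctness_definition",
    "target_comparison",
    "verification_method",
    "observable_success_indicator",
    "verification_timing",
    "early_verification_point",
)

# Normalized timing values as given in the section above.
ALLOWED_TIMINGS = frozenset({"phase_1", "per_phase", "integration_phase", "final_phase"})


def validate_strategy(strategy: dict) -> list[str]:
    """Return a list of problems; an empty list means the strategy is acceptable."""
    # Low-risk changes may use an explicit N/A, but only with a rationale.
    if strategy.get("not_applicable"):
        return [] if strategy.get("rationale") else ["N/A requires a rationale"]

    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if not strategy.get(name)]
    timing = strategy.get("verification_timing")
    if timing and timing not in ALLOWED_TIMINGS:
        errors.append(f"unknown verification timing: {timing}")
    return errors
```

A check like this would fit the "Verification Strategy defined" item in the completion checklist, though the package itself expresses that item as a manual checkbox.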
  ### Change Impact Map【Required】
  Must be included when creating Design Doc:

@@ -233,6 +246,13 @@ Confirm and document conflicts with existing systems at each integration point t
  - `reverse-engineer`: Document existing architecture as-is

  - **Requirements Analysis Results**: Requirements analysis results (scale determination, technical requirements, etc.)
+ - **Codebase Analysis** (optional, from codebase-analyzer):
+ - Use as the primary source for Existing Codebase Analysis when provided
+ - `existingElements` informs implementation path mapping and inspection evidence
+ - `dataModel` informs schema references, data contracts, and persistence design
+ - `focusAreas` indicate areas requiring deeper design attention
+ - `constraints` inform design constraints, assumptions, and risk handling
+ - Additional investigation should focus on gaps or limitations that the analysis calls out
  - **PRD**: PRD document (if exists)
  - **Documents to Create**: ADR, Design Doc, or both
  - **Existing Architecture Information**:
@@ -331,6 +351,7 @@ Implementation sample creation checklist:
  - [ ] **Complexity assessment**: complexity_level set; if medium/high, complexity_rationale specifies (1) requirements/ACs, (2) constraints/risks
  - [ ] **Data representation decision documented** (when new structures introduced)
  - [ ] **Field propagation map included** (when fields cross boundaries)
+ - [ ] **Verification Strategy defined** (correctness definition, target comparison, verification method, observable success indicator, timing, early verification point)

  **Reverse-engineer mode only**:
  - [ ] Every architectural claim cites file:line evidence
@@ -61,6 +61,7 @@ Read the Design Doc(s), UI Spec, PRD, and ADR (if provided). Extract:
  - Acceptance criteria and implementation approach
  - Technical dependencies and implementation order
  - Integration points and their contracts
+ - Verification Strategy from each Design Doc: correctness definition, target comparison, verification method, observable success indicator, normalized verification timing, and early verification point

  ### 2. Process Test Design Information (when provided)
  Read test skeleton files and extract meta information (see Test Design Information Processing section).
@@ -69,11 +70,21 @@ Read test skeleton files and extract meta information (see Test Design Informati
  Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementation-first) otherwise. See Implementation Strategy Selection section.

  ### 4. Compose Phases
- Structure phases based on technical dependencies from Design Doc:
- - Place tasks with lowest dependencies in earlier phases
+
+ **Common rules (all approaches)**:
+ - Preserve Verification Strategies per Design Doc in the work plan header and keep each source document path. Merge strategies only when the Design Docs explicitly define a shared one
+ - Include Verification Strategy summaries in the work plan header so the plan is self-sufficient for downstream task generation
+ - Place tasks with the lowest dependencies in earlier phases
+ - Map normalized verification timing to phases as follows: `phase_1` -> earliest implementation phase, `per_phase` -> each relevant phase, `integration_phase` -> integration phase, `final_phase` -> final Quality Assurance phase
+ - Include verification tasks in the phase corresponding to the Verification Strategy timing
  - When test skeletons are provided, place integration test implementation based on `@dependency` metadata from test skeletons (see Test Design Information Processing > Step 2) and place E2E test execution in the final phase
  - When test skeletons are not provided, include test implementation tasks based on Design Doc acceptance criteria
- - Include quality assurance in final phase
+ - Final phase is always Quality Assurance
+
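
The timing-to-phase mapping in the common rules above can be sketched as a small lookup. The function name, its parameters, and the phase names in the usage example are hypothetical. Treating every implementation phase as "relevant" for `per_phase` is a simplifying assumption, since the rule leaves relevance to the planner.

```python
def phases_for_timing(
    timing: str,
    implementation_phases: list[str],
    integration_phase: str,
    qa_phase: str,
) -> list[str]:
    """Map a normalized verification timing value to the phases that host it."""
    if timing == "phase_1":
        # Earliest implementation phase.
        return [implementation_phases[0]]
    if timing == "per_phase":
        # Simplification: every implementation phase counts as relevant.
        return list(implementation_phases)
    if timing == "integration_phase":
        return [integration_phase]
    if timing == "final_phase":
        # The final phase is always Quality Assurance.
        return [qa_phase]
    raise ValueError(f"unknown verification timing: {timing}")
```

For example, with implementation phases `["Phase 1", "Phase 2"]`, a `per_phase` strategy would place a verification task in both, while `final_phase` would place it only in the Quality Assurance phase.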
+ **Phase structure**:
+ - Select the phase structure that matches the implementation approach from the Design Doc
+ - Use the plan template's vertical or horizontal option accordingly
+ - Remove every unused phase-structure option from the final work plan output

  ### 5. Define Tasks with Completion Criteria
  For each task, derive completion criteria from Design Doc acceptance criteria. Apply the 3-element completion definition (Implementation Complete, Quality Complete, Integration Complete).
@@ -236,7 +247,10 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia
  ## Quality Checklist

  - [ ] Design Doc(s) consistency verification
- - [ ] Phase composition based on technical dependencies
+ - [ ] Verification Strategies extracted from each Design Doc and included in the plan header without unintended merging
+ - [ ] Phase structure matches the implementation approach
+ - [ ] Early verification point placed in the earliest applicable phase
+ - [ ] Normalized verification timing mapped consistently to phases
  - [ ] All requirements converted to tasks
  - [ ] Quality assurance exists in final phase
  - [ ] Test skeleton file paths listed in corresponding phases (when provided)
package/README.md CHANGED
@@ -48,10 +48,11 @@ The framework runs a structured workflow — requirements → design → task de
  A single request becomes a structured development process:

  1. **Understand** the problem (scale, constraints, affected files)
- 2. **Design** the solution (ADR, Design Doc with acceptance criteria)
- 3. **Break it into tasks** (atomic, 1 commit each)
- 4. **Implement with tests** (TDD per task)
- 5. **Run quality checks** (lint, test, build — no failing checks)
+ 2. **Analyze the existing codebase** (dependencies, data layer, risk areas)
+ 3. **Design** the solution (ADR, Design Doc with acceptance criteria)
+ 4. **Break it into tasks** (atomic, 1 commit each)
+ 5. **Implement with tests** (TDD per task)
+ 6. **Run quality checks** (lint, test, build — no failing checks)

  Each step is handled by a specialized subagent in its own context, preventing context pollution and reducing error accumulation in long-running tasks:

@@ -62,9 +63,13 @@ requirement-analyzer → Scale determination (Small / Medium / Large)

  prd-creator → Product requirements (Large scale)

+ codebase-analyzer → Existing codebase facts + focus areas
+
  technical-designer → ADR + Design Doc with acceptance criteria

- document-reviewer → Quality gate
+ code-verifier → Design Doc vs existing code verification
+
+ document-reviewer → Quality gate with verification evidence

  acceptance-test-gen → Test skeletons from ACs

@@ -222,6 +227,7 @@ Codex spawns these as needed during recipe execution. Each agent runs in its own
  | `technical-designer` | ADR and Design Doc creation (backend) |
  | `technical-designer-frontend` | Frontend ADR and Design Doc creation (React) |
  | `ui-spec-designer` | UI Specification from PRD and optional prototype code |
+ | `codebase-analyzer` | Existing codebase analysis before Design Doc creation |
  | `work-planner` | Work plan creation from Design Docs |
  | `document-reviewer` | Document consistency and approval |
  | `design-sync` | Cross-document consistency verification |
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "codex-workflows",
- "version": "0.3.1",
+ "version": "0.4.1",
  "description": "Task-oriented agentic coding framework for OpenAI Codex CLI — skills, recipes, and subagents for structured development workflows",
  "license": "MIT",
  "author": "Shinsuke Kagawa",