codex-workflows 0.3.1 → 0.4.1
- package/.agents/skills/documentation-criteria/SKILL.md +25 -2
- package/.agents/skills/documentation-criteria/references/design-template.md +38 -0
- package/.agents/skills/documentation-criteria/references/plan-template.md +67 -10
- package/.agents/skills/documentation-criteria/references/task-template.md +8 -1
- package/.agents/skills/integration-e2e-testing/SKILL.md +4 -1
- package/.agents/skills/recipe-build/SKILL.md +1 -1
- package/.agents/skills/recipe-design/SKILL.md +22 -14
- package/.agents/skills/recipe-front-build/SKILL.md +1 -1
- package/.agents/skills/recipe-front-design/SKILL.md +7 -2
- package/.agents/skills/recipe-fullstack-build/SKILL.md +1 -1
- package/.agents/skills/recipe-fullstack-implement/SKILL.md +2 -0
- package/.agents/skills/recipe-implement/SKILL.md +2 -0
- package/.agents/skills/recipe-reverse-engineer/SKILL.md +2 -2
- package/.agents/skills/recipe-update-doc/SKILL.md +16 -5
- package/.agents/skills/subagents-orchestration-guide/SKILL.md +64 -41
- package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md +45 -21
- package/.agents/skills/testing/SKILL.md +31 -0
- package/.codex/agents/acceptance-test-generator.toml +9 -2
- package/.codex/agents/code-verifier.toml +11 -4
- package/.codex/agents/codebase-analyzer.toml +193 -0
- package/.codex/agents/document-reviewer.toml +9 -0
- package/.codex/agents/integration-test-reviewer.toml +8 -7
- package/.codex/agents/task-decomposer.toml +16 -0
- package/.codex/agents/technical-designer-frontend.toml +28 -0
- package/.codex/agents/technical-designer.toml +21 -0
- package/.codex/agents/work-planner.toml +18 -4
- package/README.md +11 -5
- package/package.json +1 -1
@@ -57,18 +57,19 @@ Key checks:
 ## Verification Process
 
 ### 1. Skeleton Comment Extraction
-Extract the following
--
--
--
--
--
+Extract the following annotation patterns from the test file using the project's comment syntax:
+- `AC:` → Original acceptance criteria
+- `Behavior:` → Trigger → Process → Observable Result
+- `@category:` → Test classification
+- `@dependency:` → Dependencies
+- `@real-dependency:` → Dependencies expected to stay real in integration coverage
+- `Verification items:` → Expected verification items (if present)
 
 ### 2. Implementation Verification
 For each test case:
 1. Check if "observable result" from Behavior is asserted
 2. Check if all items in Verification items are covered by assertions
-3. Verify mock boundaries match
+3. Verify mock boundaries match `@dependency` and `@real-dependency`
 
 ### 3. Quality Assessment
 Evaluate each test for:
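The skeleton comment extraction step in the hunk above could be approximated as follows. This is a minimal sketch, not the reviewer agent's actual logic: the function name `extract_annotations`, the set of comment prefixes, and the sample test skeleton are all assumptions made for illustration. Only the annotation labels come from the diff.

```python
import re
from collections import defaultdict

# Annotation labels listed in the added lines of the hunk.
LABELS = [
    "AC:",
    "Behavior:",
    "@category:",
    "@dependency:",
    "@real-dependency:",
    "Verification items:",
]

# Assumed comment prefixes; adapt this to the project's comment syntax.
COMMENT_PREFIX = re.compile(r"^\s*(?://|#|\*|/\*)\s*")


def extract_annotations(source: str) -> dict[str, list[str]]:
    """Collect the text following each known label on single comment lines."""
    found: dict[str, list[str]] = defaultdict(list)
    for line in source.splitlines():
        match = COMMENT_PREFIX.match(line)
        if not match:
            continue  # not a comment line
        body = line[match.end():].strip()
        for label in LABELS:
            if body.startswith(label):
                found[label].append(body[len(label):].strip())
                break
    return dict(found)


# Hypothetical test skeleton used only to exercise the sketch.
sample = """
// AC: User can reset a password
// Behavior: Submit form → Token issued → Confirmation shown
// @category: integration
// @dependency: MailService
// @real-dependency: Database
it.todo('resets password')
"""
annotations = extract_annotations(sample)
```

The sketch only reads values that sit on the same line as their label, so a multi-line `Verification items:` list would need extra handling.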
@@ -51,6 +51,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
 - Understand dependencies between phases and tasks
 - Grasp completion criteria and quality standards
 - **Interface change detection and response**
+- **Extract Verification Strategy from the work plan header**
 
 2. **Task Decomposition**
 - Decompose at 1 commit = 1 task granularity (logical change unit)
@@ -116,6 +117,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
 - Investigation Targets
 - Investigation Notes
 - Concrete implementation steps
+- Operation Verification Methods
 - Completion criteria
 
 6. **Investigation Targets Determination**
@@ -128,6 +130,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
 | Integration/E2E test work | Test skeleton file, target implementation under test, existing fixture/auth/setup patterns |
 | E2E environment/setup work | Current environment config, startup scripts, seed/fixture scripts, auth flow references |
 | Bug fix/refactor | Affected code paths, failing tests, reproduction-related files |
+| Behavior replacement/rewrite | Existing implementation being replaced, observable outputs, Verification Strategy section in the Design Doc |
 
 **Principles**:
 - Every task must include at least one Investigation Target
@@ -144,6 +147,18 @@ Decompose tasks based on implementation strategy patterns determined in implemen
 8. **Utilize Test Information**
 When test information (@category, @dependency, @complexity, etc.) is documented in the work plan, reflect that information in task files
 
+## Verification Strategy Propagation
+
+Verification Strategy defines what correctness means at design time. L1/L2/L3 (from implementation-approach) define task-level verification depth at execution time. Use both.
+
+When the work plan includes one or more Verification Strategy blocks:
+
+1. **Source preservation**: Keep each strategy tied to its source Design Doc or plan block. Preserve strategy identity and merge only when the work plan explicitly marks the strategy as shared.
+2. **Early verification task**: The task matching a strategy's "First verification target" MUST include that method and success criteria in Operation Verification Methods.
+3. **Per-task verification**: Each task's Operation Verification Methods MUST instantiate the relevant plan-level verification method for that task's specific files, interfaces, or behavior.
+4. **Failure handling**: Copy or adapt the relevant plan-level failure response so the executor knows whether to reassess, stop, or escalate.
+5. **Investigation coverage**: Include every resource required for verification, such as existing implementations for comparison, schema definitions, fixtures, contracts, or seed data.
+
 ## Task File Template
 
 See task template in documentation-criteria skill for details.
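The "Early verification task" rule from the propagation section above could be checked mechanically along these lines. This is a rough sketch under stated assumptions: the `Strategy` and `Task` records, their field names, the sample path, and the substring comparison are all hypothetical, and real task files are Markdown documents rather than Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Strategy:
    # Hypothetical stand-in for one Verification Strategy block in the work plan header.
    source_doc: str
    first_verification_target: str
    method: str
    success_criteria: str


@dataclass
class Task:
    # Hypothetical stand-in for one generated task file.
    name: str
    target: str
    operation_verification_methods: list[str] = field(default_factory=list)


def missing_early_verification(strategies: list[Strategy], tasks: list[Task]) -> list[str]:
    """Name tasks that match a first verification target but omit its method or criteria."""
    gaps: list[str] = []
    for strategy in strategies:
        for task in tasks:
            if task.target != strategy.first_verification_target:
                continue
            text = "\n".join(task.operation_verification_methods)
            # Crude substring check; a real check would need to tolerate adapted wording.
            if strategy.method not in text or strategy.success_criteria not in text:
                gaps.append(task.name)
    return gaps


strategies = [
    Strategy("docs/design/export.md", "CSV exporter", "Diff against legacy output", "Zero differing rows"),
]
tasks = [
    Task("task-01", "CSV exporter", ["Diff against legacy output; pass when: Zero differing rows"]),
    Task("task-02", "CSV exporter", ["Run unit tests"]),
    Task("task-03", "PDF exporter"),
]
gaps = missing_early_verification(strategies, tasks)
```

Keeping `source_doc` on each strategy mirrors the source preservation rule: strategies stay separate records unless the plan marks one as shared.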
@@ -243,6 +258,7 @@ Please execute decomposed tasks according to the order.
 - [ ] Impact scope and boundaries definition for each task
 - [ ] Appropriate granularity (1-5 files/task)
 - [ ] Investigation Targets specified for every task
+- [ ] Operation Verification Methods specified for every task
 - [ ] Clear completion criteria setting
 - [ ] Overall design document creation
 - [ ] Implementation efficiency and rework prevention (pre-identification of common processing, clarification of impact scope)
@@ -96,6 +96,8 @@ Must be performed before Design Doc creation:
 - Clearly document similar component search results (found components or "none")
 - Include dependency existence verification results (verified existing / requires new creation / external dependency)
 - Record adopted decision (use existing/improvement proposal/new implementation) and rationale
+- When Codebase Analysis input is provided, use it as the baseline evidence set and extend it only where gaps remain
+- When frontend behavior depends on persistence, repositories, API-backed data contracts, or schema-shaped responses, complete the `Test Boundaries` section with a concrete verification strategy. When those concerns are outside the scope, mark the section explicitly as not applicable.
 
 ### Integration Points【Important】
 Document all integration points with existing components in a "## Integration Point Map" section.
@@ -144,6 +146,17 @@ Must be performed when creating Design Doc:
 - Which task first makes the entire UI operational
 - Verification level for each task (L1/L2/L3 defined in implementation-approach skill)
 
+3. **Verification Strategy Definition**
+- Define what correctness means for this UI change and how it will be proven
+- Use the Design Doc template fields directly
+- Include at minimum: correctness definition, target comparison, verification method, observable success indicator, verification timing, and early verification point
+- Use normalized verification timing values: `phase_1`, `per_phase`, `integration_phase`, or `final_phase`
+- For low-risk or self-evident changes, a minimal form or explicit `N/A` with rationale is acceptable
+- For new UI features, specify acceptance-criteria verification beyond unit tests
+- For extensions, specify regression verification that proves existing behavior and UX expectations are preserved
+- For refactors or rewrites, specify behavioral equivalence verification against the current UI behavior when applicable
+- Define an early verification point: the first screen, state transition, or interaction that proves the approach works
+
 ### Change Impact Map【Required】
 Must be included when creating Design Doc:
 
@@ -203,6 +216,13 @@ When a UI Spec exists for the feature (`docs/ui-spec/{feature-name}-ui-spec.md`)
 - `reverse-engineer`: Document existing frontend architecture as-is
 
 - **Requirements Analysis Results**: Requirements analysis results (scale determination, technical requirements, etc.)
+- **Codebase Analysis** (optional, from codebase-analyzer):
+- Use as the primary source for Existing Codebase Analysis when provided
+- `existingElements` informs implementation path mapping and inspection evidence
+- `dataModel` informs API contract expectations and data-shape references
+- `focusAreas` indicate components, hooks, or state paths that deserve deeper inspection
+- `constraints` inform compatibility and UI behavior constraints
+- Additional investigation should focus on areas the analysis did not fully resolve
 - **PRD**: PRD document (if exists)
 - **UI Spec**: UI Specification document (if exists, for frontend features)
 - **Documents to Create**: ADR, Design Doc, or both
@@ -256,6 +276,14 @@ Execute file output immediately. Final approval is managed by the orchestrator r
 - Cite information sources in "References" section with URLs
 - Especially confirm multiple reliable sources when introducing new technologies
 
+## Design Doc Completion Checklist
+
+- [ ] Agreement Checklist completed and reflected in design
+- [ ] Implementation approach selected with rationale
+- [ ] Verification Strategy defined with correctness definition, target comparison, method, observable success indicator, timing, and early verification point
+- [ ] Change Impact Map included
+- [ ] Interface Change Impact Analysis included
+
 ## Implementation Sample Standards Compliance
 
 **MANDATORY**: All implementation samples in ADR and Design Docs MUST strictly comply with coding-rules skill standards without exception.
@@ -113,6 +113,8 @@ Must be performed before Design Doc creation:
 - Clearly document similar functionality search results (found implementations or "none")
 - Include dependency existence verification results (verified existing / requires new creation / external dependency)
 - Record adopted decision (use existing/improvement proposal/new implementation) and rationale
+- When Codebase Analysis input is provided, use it as the baseline evidence set and extend it only where gaps remain
+- When persistence, repositories, queries, migrations, or schema-bound behavior are part of scope, complete the `Test Boundaries` section with a concrete data layer verification strategy. When they are not part of scope, mark the section explicitly as not applicable.
 
 6. **Code Inspection Evidence**
 - Record all inspected files and key functions in "Code Inspection Evidence" section of Design Doc
@@ -178,6 +180,17 @@ Must be performed when creating Design Doc:
 - Which task first makes the whole system operational
 - Verification level for each task (L1/L2/L3 defined in implementation-approach skill)
 
+3. **Verification Strategy Definition**
+- Define what correctness means for this change and how it will be proven
+- Use the Design Doc template fields directly
+- Include at minimum: correctness definition, target comparison, verification method, observable success indicator, verification timing, and early verification point
+- Use normalized verification timing values: `phase_1`, `per_phase`, `integration_phase`, or `final_phase`
+- For low-risk or self-evident changes, a minimal form or explicit `N/A` with rationale is acceptable
+- For new features, specify acceptance-criteria verification beyond unit tests
+- For extensions, specify regression verification that proves existing behavior is preserved
+- For refactors or rewrites, specify behavioral equivalence verification against the current implementation when applicable
+- Define an early verification point: the first target to validate before scaling the approach
+
 ### Change Impact Map【Required】
 Must be included when creating Design Doc:
 
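The minimum contents of a Verification Strategy listed in the hunk above can be pictured as a small record with a completeness check. This is only an illustrative sketch: the class name, the snake_case field names, and the `validate` helper are assumptions, since the real field names live in the Design Doc template, which this diff does not show. The four timing values are taken from the added lines.

```python
from dataclasses import dataclass, fields

# Normalized timing values named in the added lines.
ALLOWED_TIMINGS = {"phase_1", "per_phase", "integration_phase", "final_phase"}


@dataclass
class VerificationStrategy:
    # Hypothetical field names mirroring the "include at minimum" list.
    correctness_definition: str
    target_comparison: str
    verification_method: str
    observable_success_indicator: str
    verification_timing: str
    early_verification_point: str


def validate(strategy: VerificationStrategy) -> list[str]:
    """Return a list of problems; an empty list means the minimum fields are present."""
    problems = [
        f"missing {f.name}"
        for f in fields(strategy)
        if not getattr(strategy, f.name).strip()
    ]
    if strategy.verification_timing not in ALLOWED_TIMINGS:
        problems.append(f"unknown timing {strategy.verification_timing!r}")
    return problems


# Example values are invented for illustration.
strategy = VerificationStrategy(
    correctness_definition="Export output matches the legacy exporter",
    target_comparison="Legacy exporter output for the same input",
    verification_method="Diff both outputs on a fixture set",
    observable_success_indicator="Zero differing rows",
    verification_timing="phase1",  # deliberately not a normalized value
    early_verification_point="First exported file",
)
problems = validate(strategy)
```

The sketch does not model the minimal form or the explicit `N/A` with rationale that the diff allows for low-risk changes.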
@@ -233,6 +246,13 @@ Confirm and document conflicts with existing systems at each integration point t
 - `reverse-engineer`: Document existing architecture as-is
 
 - **Requirements Analysis Results**: Requirements analysis results (scale determination, technical requirements, etc.)
+- **Codebase Analysis** (optional, from codebase-analyzer):
+- Use as the primary source for Existing Codebase Analysis when provided
+- `existingElements` informs implementation path mapping and inspection evidence
+- `dataModel` informs schema references, data contracts, and persistence design
+- `focusAreas` indicate areas requiring deeper design attention
+- `constraints` inform design constraints, assumptions, and risk handling
+- Additional investigation should focus on gaps or limitations that the analysis calls out
 - **PRD**: PRD document (if exists)
 - **Documents to Create**: ADR, Design Doc, or both
 - **Existing Architecture Information**:
@@ -331,6 +351,7 @@ Implementation sample creation checklist:
 - [ ] **Complexity assessment**: complexity_level set; if medium/high, complexity_rationale specifies (1) requirements/ACs, (2) constraints/risks
 - [ ] **Data representation decision documented** (when new structures introduced)
 - [ ] **Field propagation map included** (when fields cross boundaries)
+- [ ] **Verification Strategy defined** (correctness definition, target comparison, verification method, observable success indicator, timing, early verification point)
 
 **Reverse-engineer mode only**:
 - [ ] Every architectural claim cites file:line evidence
@@ -61,6 +61,7 @@ Read the Design Doc(s), UI Spec, PRD, and ADR (if provided). Extract:
 - Acceptance criteria and implementation approach
 - Technical dependencies and implementation order
 - Integration points and their contracts
+- Verification Strategy from each Design Doc: correctness definition, target comparison, verification method, observable success indicator, normalized verification timing, and early verification point
 
 ### 2. Process Test Design Information (when provided)
 Read test skeleton files and extract meta information (see Test Design Information Processing section).
@@ -69,11 +70,21 @@ Read test skeleton files and extract meta information (see Test Design Informati
 Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementation-first) otherwise. See Implementation Strategy Selection section.
 
 ### 4. Compose Phases
-
-
+
+**Common rules (all approaches)**:
+- Preserve Verification Strategies per Design Doc in the work plan header and keep each source document path. Merge strategies only when the Design Docs explicitly define a shared one
+- Include Verification Strategy summaries in the work plan header so the plan is self-sufficient for downstream task generation
+- Place tasks with the lowest dependencies in earlier phases
+- Map normalized verification timing to phases as follows: `phase_1` -> earliest implementation phase, `per_phase` -> each relevant phase, `integration_phase` -> integration phase, `final_phase` -> final Quality Assurance phase
+- Include verification tasks in the phase corresponding to the Verification Strategy timing
 - When test skeletons are provided, place integration test implementation based on `@dependency` metadata from test skeletons (see Test Design Information Processing > Step 2) and place E2E test execution in the final phase
 - When test skeletons are not provided, include test implementation tasks based on Design Doc acceptance criteria
--
+- Final phase is always Quality Assurance
+
+**Phase structure**:
+- Select the phase structure that matches the implementation approach from the Design Doc
+- Use the plan template's vertical or horizontal option accordingly
+- Remove every unused phase-structure option from the final work plan output
 
 ### 5. Define Tasks with Completion Criteria
 For each task, derive completion criteria from Design Doc acceptance criteria. Apply the 3-element completion definition (Implementation Complete, Quality Complete, Integration Complete).
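The timing-to-phase mapping in the common rules above can be written as a small lookup. This is a sketch under assumptions, not the planner's implementation: the function name, the phase names, and the reading of "each relevant phase" as every phase before the final one are illustrative choices. It also assumes the last entry in the phase list is the Quality Assurance phase, as the rules require.

```python
def phases_for_timing(timing: str, phases: list[str], integration: str) -> list[str]:
    """Map a normalized verification timing value to the phases that hold its tasks."""
    if not phases:
        raise ValueError("phases must not be empty")
    if timing == "phase_1":
        return [phases[0]]  # earliest implementation phase
    if timing == "per_phase":
        # Assumption: "each relevant phase" means every phase before final QA.
        return phases[:-1]
    if timing == "integration_phase":
        if integration not in phases:
            raise ValueError(f"unknown integration phase: {integration!r}")
        return [integration]
    if timing == "final_phase":
        return [phases[-1]]  # final Quality Assurance phase
    raise ValueError(f"unknown verification timing: {timing!r}")


# Hypothetical phase names for a four-phase plan.
plan = [
    "Phase 1: Foundation",
    "Phase 2: Core",
    "Phase 3: Integration",
    "Phase 4: Quality Assurance",
]
early = phases_for_timing("phase_1", plan, "Phase 3: Integration")
every = phases_for_timing("per_phase", plan, "Phase 3: Integration")
final = phases_for_timing("final_phase", plan, "Phase 3: Integration")
```

Rejecting unknown values instead of defaulting keeps a mistyped timing from silently landing its verification task in the wrong phase.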
@@ -236,7 +247,10 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia
 ## Quality Checklist
 
 - [ ] Design Doc(s) consistency verification
-- [ ]
+- [ ] Verification Strategies extracted from each Design Doc and included in the plan header without unintended merging
+- [ ] Phase structure matches the implementation approach
+- [ ] Early verification point placed in the earliest applicable phase
+- [ ] Normalized verification timing mapped consistently to phases
 - [ ] All requirements converted to tasks
 - [ ] Quality assurance exists in final phase
 - [ ] Test skeleton file paths listed in corresponding phases (when provided)
package/README.md CHANGED
@@ -48,10 +48,11 @@ The framework runs a structured workflow — requirements → design → task de
 A single request becomes a structured development process:
 
 1. **Understand** the problem (scale, constraints, affected files)
-2. **
-3. **
-4. **
-5. **
+2. **Analyze the existing codebase** (dependencies, data layer, risk areas)
+3. **Design** the solution (ADR, Design Doc with acceptance criteria)
+4. **Break it into tasks** (atomic, 1 commit each)
+5. **Implement with tests** (TDD per task)
+6. **Run quality checks** (lint, test, build — no failing checks)
 
 Each step is handled by a specialized subagent in its own context, preventing context pollution and reducing error accumulation in long-running tasks:
 
@@ -62,9 +63,13 @@ requirement-analyzer → Scale determination (Small / Medium / Large)
 ↓
 prd-creator → Product requirements (Large scale)
 ↓
+codebase-analyzer → Existing codebase facts + focus areas
+↓
 technical-designer → ADR + Design Doc with acceptance criteria
 ↓
-
+code-verifier → Design Doc vs existing code verification
+↓
+document-reviewer → Quality gate with verification evidence
 ↓
 acceptance-test-gen → Test skeletons from ACs
 ↓
@@ -222,6 +227,7 @@ Codex spawns these as needed during recipe execution. Each agent runs in its own
 | `technical-designer` | ADR and Design Doc creation (backend) |
 | `technical-designer-frontend` | Frontend ADR and Design Doc creation (React) |
 | `ui-spec-designer` | UI Specification from PRD and optional prototype code |
+| `codebase-analyzer` | Existing codebase analysis before Design Doc creation |
 | `work-planner` | Work plan creation from Design Docs |
 | `document-reviewer` | Document consistency and approval |
 | `design-sync` | Cross-document consistency verification |
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "codex-workflows",
-"version": "0.
+"version": "0.4.1",
 "description": "Task-oriented agentic coding framework for OpenAI Codex CLI — skills, recipes, and subagents for structured development workflows",
 "license": "MIT",
 "author": "Shinsuke Kagawa",