codex-workflows 0.4.11 → 0.6.0

Files changed (47)
  1. package/.agents/skills/coding-rules/references/typescript.md +1 -1
  2. package/.agents/skills/documentation-criteria/references/design-template.md +8 -0
  3. package/.agents/skills/documentation-criteria/references/plan-template.md +22 -3
  4. package/.agents/skills/documentation-criteria/references/task-template.md +1 -1
  5. package/.agents/skills/documentation-criteria/references/ui-spec-template.md +10 -0
  6. package/.agents/skills/external-resource-context/SKILL.md +99 -0
  7. package/.agents/skills/external-resource-context/agents/openai.yaml +7 -0
  8. package/.agents/skills/external-resource-context/references/api.md +20 -0
  9. package/.agents/skills/external-resource-context/references/backend.md +21 -0
  10. package/.agents/skills/external-resource-context/references/frontend.md +21 -0
  11. package/.agents/skills/external-resource-context/references/infra.md +21 -0
  12. package/.agents/skills/external-resource-context/references/template.md +72 -0
  13. package/.agents/skills/integration-e2e-testing/SKILL.md +34 -21
  14. package/.agents/skills/integration-e2e-testing/references/e2e-design.md +16 -10
  15. package/.agents/skills/recipe-add-integration-tests/SKILL.md +7 -0
  16. package/.agents/skills/recipe-build/SKILL.md +32 -5
  17. package/.agents/skills/recipe-front-adjust/SKILL.md +113 -0
  18. package/.agents/skills/recipe-front-adjust/agents/openai.yaml +7 -0
  19. package/.agents/skills/recipe-front-build/SKILL.md +32 -5
  20. package/.agents/skills/recipe-front-design/SKILL.md +28 -9
  21. package/.agents/skills/recipe-front-plan/SKILL.md +1 -1
  22. package/.agents/skills/recipe-front-review/SKILL.md +29 -11
  23. package/.agents/skills/recipe-fullstack-build/SKILL.md +32 -5
  24. package/.agents/skills/recipe-fullstack-implement/SKILL.md +13 -4
  25. package/.agents/skills/recipe-implement/SKILL.md +12 -4
  26. package/.agents/skills/recipe-plan/SKILL.md +5 -5
  27. package/.agents/skills/recipe-prepare-implementation/SKILL.md +162 -0
  28. package/.agents/skills/recipe-prepare-implementation/agents/openai.yaml +7 -0
  29. package/.agents/skills/recipe-review/SKILL.md +34 -6
  30. package/.agents/skills/subagents-orchestration-guide/SKILL.md +36 -34
  31. package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md +45 -48
  32. package/.agents/skills/task-analyzer/SKILL.md +3 -2
  33. package/.agents/skills/task-analyzer/references/skills-index.yaml +54 -7
  34. package/.agents/skills/testing/references/typescript.md +2 -3
  35. package/.codex/agents/acceptance-test-generator.toml +69 -31
  36. package/.codex/agents/quality-fixer-frontend.toml +5 -0
  37. package/.codex/agents/quality-fixer.toml +5 -0
  38. package/.codex/agents/task-decomposer.toml +27 -2
  39. package/.codex/agents/task-executor-frontend.toml +16 -11
  40. package/.codex/agents/task-executor.toml +19 -14
  41. package/.codex/agents/technical-designer-frontend.toml +25 -2
  42. package/.codex/agents/technical-designer.toml +13 -0
  43. package/.codex/agents/ui-analyzer.toml +307 -0
  44. package/.codex/agents/ui-spec-designer.toml +15 -0
  45. package/.codex/agents/work-planner.toml +54 -17
  46. package/README.md +54 -26
  47. package/package.json +1 -1
package/README.md CHANGED
@@ -4,9 +4,11 @@
  [![Agent Skills](https://img.shields.io/badge/Agent%20Skills-Spec%20Compliant-blue)](https://developers.openai.com/codex/skills/)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
- **Structured agentic coding workflows for [OpenAI Codex CLI](https://developers.openai.com/codex/cli)** — specialized AI coding agents plan, implement, test, and review changes with traceable docs, task-level commits, and quality gates.
+ **Structured workflows for [OpenAI Codex CLI](https://developers.openai.com/codex/cli).**
 
- Built on the [Agent Skills specification](https://developers.openai.com/codex/skills/) and [Codex subagents](https://developers.openai.com/codex/subagents). Designed for long-running tasks, large refactors, and reviewable changes.
+ They help when multi-step changes stop being easy to reason about, test, or review.
+
+ Built on the [Agent Skills specification](https://developers.openai.com/codex/skills/) and [Codex subagents](https://developers.openai.com/codex/subagents). This starts to matter when tasks get large: refactors, migrations, or anything that spans multiple files and needs to stay reviewable.
 
  ---
 
@@ -25,45 +27,49 @@ $recipe-implement Add user authentication with JWT
 
  `$` is Codex CLI's syntax for invoking a skill explicitly. Type `$recipe-` to see all available recipes via tab completion.
 
- Small changes stay lightweight. Larger tasks get structure: requirements design task decomposition TDD implementation quality gates.
-
- codex-workflows is the Codex-native counterpart of [Claude Code Workflows](https://github.com/shinpr/claude-code-workflows): same document-driven development style, adapted for Codex CLI, subagents, and GPT models.
+ Small changes stay lightweight. Larger tasks are broken into requirements, design, task decomposition, TDD implementation, and quality checks.
 
  ---
 
  ## Why codex-workflows?
 
- Codex is already strong at one-shot implementation. The problem starts when a change spans multiple files, needs design decisions to stay visible, or has to survive review, testing, and follow-up edits.
+ Codex works well for short, focused tasks. The problems start when a change spans multiple files, needs design decisions to stay visible, or has to survive review, testing, and follow-up edits.
 
- For larger tasks, explicit planning changes the job from raw generation into verification against a design, a task breakdown, and acceptance criteria. That matters because review loops are more reliable than first-shot generation once scope and ambiguity grow.
+ Many developers have seen the same pattern: things work at first, then drift. Context grows, assumptions accumulate, intermediate decisions disappear, and results become harder to trust.
 
- codex-workflows adds the missing structure around those jobs:
+ codex-workflows is built around those failure modes. Instead of asking Codex to "just implement it", it turns a request into a sequence of steps you can inspect and verify:
  - Traceable artifacts: PRD → Design Doc → Task → Commit
- - Built-in TDD and quality gates before code is ready to commit
+ - Built-in TDD and quality checks before code is ready to commit
  - Agent context separation for large refactors, migrations, and PR-sized changes
  - Diagnosis and reverse-engineering flows for bugs and legacy code
 
+ ## Background
+
+ The recipes, subagents, and quality checks in this repo were not designed top-down. Each piece was added in response to a concrete failure mode encountered during delivery work.
+
+ That is why the workflow separates requirements, design, verification, implementation, and quality checks instead of treating them as one long session.
+
  ## Not Designed For
 
- - One-shot toy scripts or vibe-coding sessions where speed matters more than traceability
- - Repositories that do not use tests, lint, builds, or reviewable commits
- - Teams that do not want design docs, task breakdowns, or explicit quality gates
+ - One-shot scripts or exploratory sessions where speed matters more than traceability
+ - Repositories without tests, lint, builds, or reviewable commits
+ - Teams that would rather skip design docs and quality checks entirely
 
  ---
 
  ## What It Does
 
- A single request becomes a structured development process. The framework chooses the level of ceremony based on scope:
+ Instead of forcing a fixed workflow, the framework adjusts how much structure it adds based on scope:
 
  | Scale | File Count | What Happens |
  |-------|------------|-------------|
  | Small | 1-2 | Simplified plan → direct implementation |
  | Medium | 3-5 | Design Doc → work plan → task execution |
- | Large | 6+ | PRD → ADR → Design Doc → test skeletons → work plan → autonomous execution |
+ | Large | 6+ | PRD → ADR → Design Doc → test skeletons → work plan → guided autonomous execution |
 
  For larger work, the path usually looks like this: understand the problem, analyze the codebase, design the change, break it into atomic tasks, implement with tests, and run quality checks before commit.
 
- Each step is handled by a specialized subagent in its own context, using context engineering to prevent context pollution and reduce error accumulation in long-running tasks:
+ Each step isolates one concern, so decisions can be checked before they carry into later stages. Specialized subagents run in their own contexts to reduce carry-over assumptions during changes that would otherwise require long sessions:
 
  ```
  User Request
@@ -88,7 +94,7 @@ task-decomposer → Atomic tasks (1 task = 1 commit)
 
  task-executor → TDD implementation per task
 
- quality-fixer → Lint, test, build no failing checks
+ quality-fixer → Lint, test, build; no failing checks
 
  Ready to commit
  ```
@@ -159,7 +165,8 @@ Invoke recipes with `$recipe-name` in Codex. Type `$recipe-` and use tab complet
  | `$recipe-task` | Single task with rule selection | Bug fixes, small changes |
  | `$recipe-design` | Requirements → ADR/Design Doc | Architecture planning |
  | `$recipe-plan` | Design Doc → test skeletons → work plan | Planning phase, including nullable E2E skeleton handling |
- | `$recipe-build` | Execute backend tasks autonomously | Resume backend implementation |
+ | `$recipe-prepare-implementation` | Verify work plan readiness and resolve prep gaps | Pre-build check that the plan is implementable |
+ | `$recipe-build` | Execute backend tasks with validation between steps | Resume backend implementation |
  | `$recipe-review` | Design Doc compliance and security validation with auto-fixes | Post-implementation check |
  | `$recipe-diagnose` | Problem investigation → failure-point verification → solution | Bug investigation |
  | `$recipe-reverse-engineer` | Generate PRD + Design Docs from existing code | Legacy system documentation |
@@ -171,6 +178,7 @@ Invoke recipes with `$recipe-name` in Codex. Type `$recipe-` and use tab complet
  | Recipe | What it does | When to use |
  |--------|-------------|-------------|
  | `$recipe-front-design` | Requirements → UI Spec → frontend Design Doc | Frontend architecture planning |
+ | `$recipe-front-adjust` | Implemented UI adjustment with external context and verification | Focused UI changes after implementation |
  | `$recipe-front-plan` | Frontend Design Doc → test skeletons → work plan | Frontend planning phase |
  | `$recipe-front-build` | Execute frontend tasks with RTL + quality checks | Resume frontend implementation |
  | `$recipe-front-review` | Frontend compliance and security validation with React-specific fixes | Frontend post-implementation check |
@@ -182,6 +190,16 @@ Invoke recipes with `$recipe-name` in Codex. Type `$recipe-` and use tab complet
  | `$recipe-fullstack-implement` | Full lifecycle with separate Design Docs per layer | Cross-layer features |
  | `$recipe-fullstack-build` | Execute tasks with layer-aware agent routing | Resume cross-layer implementation |
 
+ ### Working State
+
+ Recipes use `docs/plans/` as ephemeral working state for work plans, decomposed task files, prep tasks, review-fix tasks, and intermediate analysis files. Add it to your project's `.gitignore` unless your team intentionally wants to review those transient files:
+
+ ```gitignore
+ docs/plans/
+ ```
+
+ PRDs, ADRs, UI Specs, and Design Docs are durable project documents and are intended to be committed.
+
  ### Examples
 
  **Full feature development:**
@@ -208,7 +226,7 @@ $recipe-reverse-engineer src/auth module
 
  ## Foundational Skills
 
- These load automatically when the conversation context matches — no explicit invocation needed:
+ These are applied automatically based on context. You rarely need to think about them directly.
 
  | Skill | What it provides |
  |-------|-----------------|
@@ -218,8 +236,9 @@ These load automatically when the conversation context matches — no explicit i
  | `documentation-criteria` | Document creation rules and templates (PRD, ADR, Design Doc, Work Plan) |
  | `implementation-approach` | Strategy selection: vertical / horizontal / hybrid slicing |
  | `integration-e2e-testing` | Integration/E2E test design, value-based selection, review criteria |
+ | `external-resource-context` | Access methods for design sources, design systems, API schemas, and verification environments |
  | `task-analyzer` | Task analysis, scale estimation, skill selection |
- | `subagents-orchestration-guide` | Multi-agent coordination, workflow flows, autonomous execution |
+ | `subagents-orchestration-guide` | Multi-agent coordination, workflow flows, guided autonomous execution |
 
  Language-specific references are included for TypeScript/React projects (`coding-rules/references/typescript.md`, `testing/references/typescript.md`).
 
@@ -227,7 +246,7 @@ Language-specific references are included for TypeScript/React projects (`coding
 
  ## Subagents
 
- Codex spawns these as needed during recipe execution. Each agent runs in its own context with specialized instructions and skill configurations.
+ Codex spawns these as needed during recipe execution. You do not need to learn them first; recipes route work to the right agents automatically. Each agent runs in its own context with specialized instructions and skill configurations.
 
  ### Document Creation Agents
 
@@ -239,6 +258,7 @@ Codex spawns these as needed during recipe execution. Each agent runs in its own
  | `technical-designer-frontend` | Frontend ADR and Design Doc creation (React) |
  | `ui-spec-designer` | UI Specification from PRD and optional prototype code |
  | `codebase-analyzer` | Existing codebase analysis before Design Doc creation |
+ | `ui-analyzer` | UI facts from external resources (design tools, design-system docs, deployed UI) and frontend code |
  | `work-planner` | Work plan creation from Design Docs |
  | `document-reviewer` | Document consistency and approval |
  | `design-sync` | Cross-document consistency verification |
@@ -277,18 +297,18 @@ Codex spawns these as needed during recipe execution. Each agent runs in its own
 
  ## How It Works
 
- ### Autonomous Execution Mode
+ ### Guided Autonomous Execution Mode
 
- After work plan approval, the framework enters guided autonomous execution with escalation points:
+ After work plan approval, the framework executes task files with explicit validation points:
 
  1. **task-executor** implements each task with TDD
  2. **quality-fixer** first rejects incomplete task-scoped implementations, then runs lint, tests, and build before every commit
  3. Escalation pauses execution when design deviation or ambiguity is detected
- 4. Each task produces one commit rollback-friendly granularity
+ 4. Each task produces one commit for rollback-friendly granularity
 
  ### Context Separation
 
- Each subagent runs in a fresh context. This context-engineering pattern keeps long-running agentic coding tasks legible and reviewable:
+ Each subagent runs in a fresh context. This pattern keeps multi-step coding tasks legible and reviewable:
  - generation and verification happen in separate contexts, reducing author bias and carry-over assumptions
  - **document-reviewer** reviews without the author's bias
  - **investigator** collects evidence without confirmation bias
@@ -309,12 +329,15 @@ your-project/
  │ ├── documentation-criteria/
  │ ├── implementation-approach/
  │ ├── integration-e2e-testing/
+ │ ├── external-resource-context/
  │ ├── task-analyzer/
  │ ├── subagents-orchestration-guide/
  │ ├── recipe-implement/ # Recipes ($recipe-*)
  │ ├── recipe-design/
  │ ├── recipe-build/
+ │ ├── recipe-front-adjust/
  │ ├── recipe-plan/
+ │ ├── recipe-prepare-implementation/
  │ ├── recipe-review/
  │ ├── recipe-diagnose/
  │ ├── recipe-task/
@@ -324,8 +347,9 @@ your-project/
  ├── .codex/agents/ # Subagent TOML definitions
  │ ├── requirement-analyzer.toml
  │ ├── technical-designer.toml
+ │ ├── ui-analyzer.toml
  │ ├── task-executor.toml
- │ └── ... (23 agents total)
+ │ └── ... (25 agents total)
  └── docs/ # Created as you use the recipes
  ├── prd/
  ├── design/
@@ -361,7 +385,11 @@ A: `$recipe-implement` is the universal entry point. It runs requirement-analyze
 
  **Q: Does this work with MCP servers?**
 
- A: Yes. Codex skills and subagents work alongside [MCP](https://developers.openai.com/codex/mcp) — skills operate at the instruction layer while MCP operates at the tool transport layer. You can add MCP servers to any agent's TOML configuration.
+ A: Yes. Codex skills and subagents work alongside [MCP](https://developers.openai.com/codex/mcp) — skills operate at the instruction layer while MCP operates at the tool transport layer. Custom agents inherit parent `mcp_servers` when the agent TOML omits `mcp_servers`; add agent-local MCP config only for agent-specific servers or tool filtering.
+
+ **Q: How is this related to claude-code-workflows?**
+
+ A: [claude-code-workflows](https://github.com/shinpr/claude-code-workflows) is the Claude Code counterpart. The repositories share the same workflow philosophy, adapted to each tool's native extension points. They can coexist in the same project because Codex uses `.agents/skills/`, `.codex/agents/`, and `AGENTS.md`, while Claude Code uses its own `.claude/` files and `CLAUDE.md`.
 
  **Q: What if a subagent seems stuck?**
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "codex-workflows",
- "version": "0.4.11",
+ "version": "0.6.0",
  "description": "Task-oriented agentic coding framework for OpenAI Codex CLI — skills, recipes, and subagents for structured development workflows",
  "license": "MIT",
  "author": "Shinsuke Kagawa",