@dyzsasd/dev-loop 0.22.0 → 0.23.0

Files changed (36)
  1. package/README.md +30 -10
  2. package/dist/agentops.js +5 -68
  3. package/dist/cli.js +4 -0
  4. package/dist/db.js +0 -26
  5. package/dist/doctor.js +2 -2
  6. package/dist/install-claude-plugin.js +78 -0
  7. package/dist/mcp-merge.js +18 -19
  8. package/dist/mirrorstore.js +1 -1
  9. package/dist/plugin/.claude-plugin/marketplace.json +13 -0
  10. package/dist/plugin/.claude-plugin/plugin.json +11 -0
  11. package/dist/plugin/config/mcp.codex.toml.example +33 -0
  12. package/dist/plugin/config/mcp.example.json +15 -0
  13. package/dist/plugin/config/mcp.opencode.json.example +16 -0
  14. package/dist/plugin/config/projects.example.json +82 -0
  15. package/dist/plugin/hooks/hooks.json +16 -0
  16. package/dist/plugin/references/codex-integration.md +282 -0
  17. package/dist/plugin/references/config-schema.md +358 -0
  18. package/dist/plugin/references/conventions.md +2159 -0
  19. package/dist/plugin/skills/architect-agent/SKILL.md +231 -0
  20. package/dist/plugin/skills/communication-agent/SKILL.md +247 -0
  21. package/dist/plugin/skills/dev-agent/SKILL.md +373 -0
  22. package/dist/plugin/skills/init/SKILL.md +496 -0
  23. package/dist/plugin/skills/junior-dev-agent/SKILL.md +348 -0
  24. package/dist/plugin/skills/ops-agent/SKILL.md +219 -0
  25. package/dist/plugin/skills/pm-agent/SKILL.md +427 -0
  26. package/dist/plugin/skills/qa-agent/SKILL.md +299 -0
  27. package/dist/plugin/skills/reflect-agent/SKILL.md +271 -0
  28. package/dist/plugin/skills/senior-dev-agent/SKILL.md +353 -0
  29. package/dist/plugin/skills/sweep-agent/SKILL.md +180 -0
  30. package/dist/run-agents.js +373 -0
  31. package/dist/seed.js +4 -3
  32. package/dist/server.js +1 -1
  33. package/dist/shim.js +3 -4
  34. package/dist/tooldefs.js +3 -25
  35. package/package.json +5 -5
  36. package/dist/topicstore.js +0 -174
@@ -0,0 +1,282 @@
1
+ # dev-loop — Codex integration (optional)
2
+
3
+ A **companion-plugin** playbook: how the dev-loop agents may reach for **OpenAI Codex**
4
+ as an optional power tool — an *independent reviewer*, an *image generator*, and a
5
+ *second-engine rescue*. This file is the detailed how-to; the canonical rules live in
6
+ [`conventions.md` §24](conventions.md#24-codex--optional-power-tools), which every agent
7
+ reads. If a rule here conflicts with conventions, conventions wins.
8
+
9
+ > **Opt-in, and absent ⇒ 100% unchanged.** If the `codex` config block is absent **or**
10
+ > the `codex` CLI is not on `PATH`, every agent behaves exactly as it does today — no
11
+ > review call, no image step, no rescue, no new prompts. Codex is an **accelerant the
12
+ > loop may use, never a dependency it needs.** (Same opt-in philosophy as `backend`
13
+ > §18, `repos[]` §19, and `reports.sink` §23.)
14
+
15
+ ---
16
+
17
+ ## What Codex adds (and what it does NOT)
18
+
19
+ | Capability | Who uses it | Why Codex (not just the dev-loop agent itself) |
20
+ |---|---|---|
21
+ | **Independent review** | Dev (Step 5.5), Architect | A *second model* on the diff/codebase — catches what the author's own pass misses. Codex `/review` quality, on demand. |
22
+ | **Image generation** | PM (mockups), Dev (real UI assets) | The dev-loop agents **cannot generate images**; Codex has a native `image_generation` tool. This is the one capability the loop genuinely lacks. |
23
+ | **Delegate / rescue** | Dev (before a `fix-exhausted` block) | A different engine takes one pass at a stuck ticket before Dev gives up — cheap extra attempt, still gated. |
24
+
25
+ **Codex is advisory, never authoritative.** The dev-loop agent always owns the
26
+ decision, the gate, and the ship. Codex output is an input to the agent's existing
27
+ judgment — it never bypasses the firewall (§2), `mode` (§12), `autonomy` (§12a), the
28
+ ship gates (Dev §5/§5.5/§6/§6.5), or the security doctrine (§16). A Codex review does
29
+ **not** replace Dev's own self-review; it augments it.
30
+
31
+ ---
32
+
33
+ ## Prerequisites (operator-present, one-time)
34
+
35
+ 1. **Codex CLI** installed and authenticated:
36
+ ```bash
37
+ npm install -g @openai/codex # Node 18.18+
38
+ codex login # ChatGPT sign-in or API key
39
+ codex --version # sanity check
40
+ ```
41
+ Codex usage counts against your ChatGPT/Codex limits — see the Codex pricing docs.
42
+ 2. **codex-plugin-cc** installed in Claude Code (gives the `/codex:*` commands the
43
+ operator and the agents can invoke):
44
+ ```text
45
+ /plugin marketplace add openai/codex-plugin-cc
46
+ /plugin install codex@openai-codex
47
+ /reload-plugins
48
+ /codex:setup # verifies Codex is ready
49
+ ```
50
+ 3. **Verify the native tools** Codex will use (image generation is the load-bearing one):
51
+ ```bash
52
+ codex features list | grep -E 'image_generation'
53
+ # image_generation stable true
54
+ ```
55
+ 4. Add the `codex` block to the project in `projects.json` (below). Absent ⇒ off.
56
+
57
+ `/dev-loop:init` does **not** install Codex for you (it's a separate vendor CLI), but it
58
+ notes the option in its readiness checklist when a `codex` block is present.
59
+
60
+ ---
61
+
62
+ ## Config block
63
+
64
+ Add an optional `codex` object to a project in `projects.json` (full schema in
65
+ [`config-schema.md`](config-schema.md)):
66
+
67
+ ```jsonc
68
+ "codex": {
69
+ "enabled": true, // master switch. false / absent ⇒ codex is never invoked (today's behavior)
70
+ "review": true, // Dev Step 5.5 + Architect may run an independent codex review pass
71
+ "rescue": false, // Dev may delegate ONE rescue pass to codex before a fix-exhausted block
72
+ "imageGen": true, // PM/Dev may generate images via codex's image_generation tool
73
+ "assetsDir": "public/generated", // repo-relative dir where Dev commits generated production assets (multi-repo: per the ticket's repo:<name> tree)
74
+ "model": null, // optional: pin a codex model (e.g. "gpt-5.4-mini"); null ⇒ codex's own default / its config.toml
75
+ "effort": null // optional: reasoning effort (none|minimal|low|medium|high|xhigh); null ⇒ codex default
76
+ }
77
+ ```
78
+
79
+ Each sub-flag is independently gated: e.g. `review:true, imageGen:false` runs reviews
80
+ but never generates images. A missing sub-flag ⇒ that capability is **off**.
81
+
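The gating rule above can be sketched as a small shell check. This is an illustration only: the function name `codex_capability_on` and its argument layout are hypothetical, and it assumes the caller has already read the `codex.enabled` value and the relevant sub-flag out of `projects.json` by some other means.

```shell
# Hypothetical helper: decide whether one Codex capability may run.
#   $1 = value of codex.enabled ("true", "false", or "" when the block is absent)
#   $2 = value of the sub-flag, e.g. codex.review ("" when missing)
#   $3 = CLI name to look up on PATH (defaults to "codex")
codex_capability_on() {
  enabled="${1:-}"
  flag="${2:-}"
  cli="${3:-codex}"
  [ "$enabled" = "true" ] || return 1             # block absent or master switch off
  [ "$flag" = "true" ] || return 1                # a missing sub-flag means off
  command -v "$cli" >/dev/null 2>&1 || return 1   # CLI not on PATH means off
  return 0
}
```

A caller might wrap each Codex step as `if codex_capability_on "$ENABLED" "$REVIEW"; then …; fi`, so that any failed check falls through to the unchanged, Codex-free behavior.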
82
+ ---
83
+
84
+ ## Invocation forms — deterministic first
85
+
86
+ The dev-loop agents run **unattended on a loop**, so they prefer the **blocking,
87
+ parseable** Codex CLI forms over the plugin's `--background` + `/codex:status` polling
88
+ (which is operator-present ergonomics). Two ways to call Codex:
89
+
90
+ - **Programmatic (preferred in the loop):** `codex exec …` / `codex exec review …` —
91
+ runs to completion, prints to stdout, exits. Add `--json` for JSONL events when you
92
+ need to parse structured output, or `--output-last-message <file>` to capture just the
93
+ final message.
94
+ - **Plugin slash-commands (operator-present, or a single attended pass):** `/codex:review`,
95
+ `/codex:adversarial-review`, `/codex:rescue`, `/codex:status`, `/codex:result`,
96
+ `/codex:cancel`. Convenient when a human is driving; in a looped agent, drive
97
+ `codex exec` directly so the call is synchronous and self-contained.
98
+
99
+ Shared flags the loop always sets:
100
+ - `< /dev/null`: redirect stdin from the null device so Codex sees end-of-input at once. Without it `codex exec` prints *"Reading additional input
101
+ from stdin…"* and **waits**, hanging an unattended fire.
102
+ - `-C <dir>` (or `--cd`) — run in the target repo / assets dir (multi-repo: the ticket's
103
+ `repo:<name>` tree, §19).
104
+ - `--skip-git-repo-check` — only when the target dir is not a git repo (a scratch/mock dir).
105
+ - `-c model_reasoning_effort=<…>` / `-m <model>` — only when `codex.effort` / `codex.model`
106
+ are set; otherwise leave Codex on its own defaults / `config.toml`.
107
+
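One way to apply the shared flags consistently is a thin wrapper, sketched here under stated assumptions: `codex_exec_loop`, `CODEX_MODEL`, `CODEX_EFFORT`, and `CODEX_BIN` are hypothetical names (the environment variables stand in for `codex.model` / `codex.effort`), and only the flags listed above are used.

```shell
# Hypothetical wrapper: run `codex exec` the way the loop does.
#   $1 = target directory (passed to -C), $2 = prompt
#   CODEX_MODEL / CODEX_EFFORT: assumed env names mirroring codex.model / codex.effort
#   CODEX_BIN: assumed override for the binary, handy for a dry run (CODEX_BIN=echo)
codex_exec_loop() {
  dir="$1"
  prompt="$2"
  set -- exec -C "$dir"
  # Add the optional pins only when configured; otherwise Codex keeps its own defaults.
  if [ -n "${CODEX_MODEL:-}" ]; then set -- "$@" -m "$CODEX_MODEL"; fi
  if [ -n "${CODEX_EFFORT:-}" ]; then set -- "$@" -c "model_reasoning_effort=$CODEX_EFFORT"; fi
  # stdin comes from /dev/null so an unattended run cannot wait for input.
  "${CODEX_BIN:-codex}" "$@" "$prompt" < /dev/null
}
```

Setting `CODEX_BIN=echo` prints the assembled command line instead of running it, which is a cheap way to inspect what a fire would execute.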
108
+ ---
109
+
110
+ ## Capability 1 — Independent review (read-only)
111
+
112
+ **Where it plugs in:** Dev **Step 5.5 stage 2** ("code quality") already says *"if a
113
+ `code-review` skill/command is available, invoke it"* — Codex **is** that reviewer when
114
+ `codex.review` is on. Architect (Job 2) may likewise take a Codex second opinion on its
115
+ rotating dimension.
116
+
117
+ **Form (read-only — `codex review` / `codex exec review` never edit code):**
118
+ ```bash
119
+ # Review the working-tree diff (Dev, after green gates, before shipping):
120
+ codex exec review -C "$REPO" < /dev/null
121
+
122
+ # Review the branch vs a base ref:
123
+ codex exec review --base main -C "$REPO" < /dev/null # (or /codex:review --base main)
124
+
125
+ # Pressure-test a design decision (Architect / a risky Dev change):
126
+ # the plugin's steerable variant takes focus text:
127
+ /codex:adversarial-review challenge the caching + retry design for race conditions
128
+ ```
129
+
130
+ **How Dev treats the findings (unchanged gate semantics):**
131
+ - It is an **additional advisory pass**, not a replacement for Dev's own Step-5.5
132
+ self-review. Run *both*.
133
+ - **Critical / High** findings are blocking exactly like Dev's own (Step 5.5 stage 2):
134
+ fix this run, or if you can't, revert and **block** the ticket `Bail-shape:
135
+ fix-exhausted` with the findings. Medium/Low/nits are non-blocking — apply the cheap
136
+ ones, note the rest in the hand-off.
137
+ - **Codex disagreeing with Dev is signal, not gospel.** If Codex flags something Dev is
138
+ confident is a false positive, Dev may proceed but must say so in the hand-off (so the
139
+ owner can see the disagreement). Codex never gets a veto the gates don't already grant.
140
+ - **`dry-run` (§12):** a read-only review is safe to run and print even in `dry-run`
141
+ (it mutates nothing) — but no resulting code change is shipped, same as any dry-run.
142
+
143
+ ---
144
+
145
+ ## Capability 2 — Image generation (the capability the loop lacks)
146
+
147
+ Codex's native `image_generation` tool produces real raster images. **Verify it's
148
+ present:** `codex features list | grep image_generation` → `image_generation stable true`.
149
+
150
+ ### ⚠️ How the tool actually saves (verified — read this first)
151
+ `image_generation` does **not** save to a path you name in the prompt. It **always**
152
+ writes the PNG to:
153
+
154
+ ```
155
+ ~/.codex/generated_images/<session-id>/ig_<hash>.png
156
+ ```
157
+
158
+ …and it ignores the filename **and the pixel dimensions** you ask for (a "512×512
159
+ gear.png" request produced a `1254×1254` `ig_*.png`). Worse, Codex's own final message
160
+ will often claim *"saved to ./gear.png"* — that line is a **confabulation**; no such file
161
+ exists. So **never trust the model's reported path** — the agent must locate the real
162
+ generated file and **copy it out** to the target. Two verified recipes:
163
+
164
+ **Recipe A — agent-orchestrated (deterministic; preferred in the loop).** The dev-loop
165
+ agent runs Codex to generate, captures the **session id**, then copies the file itself —
166
+ no dependence on Codex's self-report, and **race-safe under concurrency** (scopes to the
167
+ one session dir):
168
+ ```bash
169
+ # 1) generate; capture the session id from --json (or the exec banner "session id: …")
170
+ SID=$(codex exec --json --sandbox workspace-write -C "$REPO" < /dev/null \
171
+ "Use your built-in image_generation tool to create <precise description: subject,
172
+ style, palette, background>. Use the tool directly; do not write code." \
173
+ | sed -n 's/.*"session_id":"\([^"]*\)".*/\1/p' | head -1)
174
+ # 2) copy the just-generated PNG out to the repo asset path:
175
+ SRC=$(ls -t "$HOME/.codex/generated_images/$SID/"*.png | head -1)
176
+ mkdir -p "$REPO/$ASSETS_DIR" && cp "$SRC" "$REPO/$ASSETS_DIR/<name>.png"
177
+ ```
178
+
179
+ **Recipe B — single-call (simpler; Codex copies it itself).** Tell Codex to generate
180
+ **and** `cp` the result, scoping the copy to **this** session's dir (don't take the newest file across
181
+ all of `generated_images`; that races other Codex runs):
182
+ ```bash
183
+ codex exec --sandbox workspace-write -C "$REPO" < /dev/null \
184
+ "Step 1: use your built-in image_generation tool to create <precise description>.
185
+ The tool saves the PNG under ~/.codex/generated_images/<this session id>/.
186
+ Step 2: copy that generated PNG (a shell cp is allowed) to $ASSETS_DIR/<name>.png in
187
+ the working directory. Step 3: print DONE and run 'ls -l $ASSETS_DIR/<name>.png'."
188
+ ```
189
+
190
+ Mechanics that bite (all verified):
191
+ - **`--sandbox workspace-write` is mandatory.** `codex exec` defaults to a **read-only**
192
+ sandbox and silently produces **no on-disk copy**. workspace-write permits the workdir
193
+ (+ `/tmp`, `$TMPDIR`); home is readable, so the `cp` from `~/.codex/generated_images`
194
+ into the workdir works.
195
+ - **`< /dev/null`** so the fire doesn't hang on *"Reading additional input from stdin…"*.
196
+ - Codex does not honor dimensions requested in the prompt. If you need an exact size, resize after the
197
+ copy (e.g. `sips`/`magick`) rather than asking Codex for it.
198
+
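Because the copy step depends on a correctly captured session id, a guarded version of it may be useful. The sketch below is not part of the package: `copy_generated_image` and the `CODEX_IMAGES_DIR` override are hypothetical, and the default directory is simply the path described above.

```shell
# Hypothetical helper: copy the newest PNG from ONE Codex session dir to a target path.
#   $1 = session id, $2 = destination file
#   CODEX_IMAGES_DIR: assumed override; defaults to the location described above
copy_generated_image() {
  sid="$1"
  dest="$2"
  base="${CODEX_IMAGES_DIR:-$HOME/.codex/generated_images}"
  # An empty id means the capture failed; stop rather than glob the wrong directory.
  if [ -z "$sid" ]; then echo "no session id captured" >&2; return 1; fi
  src=$(ls -t "$base/$sid"/*.png 2>/dev/null | head -n 1)
  if [ -z "$src" ]; then echo "no PNG under $base/$sid" >&2; return 1; fi
  mkdir -p "$(dirname "$dest")" && cp "$src" "$dest"
}
```

A non-zero return gives the agent a clear signal to report the failure instead of committing a reference to a file that was never written.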
199
+ ### 2a. Dev — production assets an acceptance criterion requires
200
+ When a ticket's ACs call for an image the code needs (an icon, an illustration, an
201
+ OpenGraph/social card, a placeholder, a favicon), Dev generates it **into the repo**
202
+ under `codex.assetsDir` during Step 4 (Recipe A/B above, `$REPO` = the ticket's
203
+ `repo:<name>` tree, §19), then it ships through the normal gates like any other file:
204
+ - The asset is a **repo artifact**: Dev stages **only** the generated file(s) + the code
205
+ that references them (staging discipline, §7), commits with the ticket id, and ships
206
+ per config (Step 6). It runs through Step 5 gates and Step 5.5 self-review.
207
+ - Coverage (§15): a generated static asset is treated like a docs/asset change — exempt
208
+ from a regression test (note it in the hand-off); the *code that uses* it still
209
+ follows §15.
210
+ - **`dry-run` (§12):** generate to a scratch path if useful for the preview, but make
211
+ **no** commit/push/deploy and don't write into the shipping tree — print what you'd do.
212
+
213
+ ### 2b. PM — mockups / wireframes to sharpen a Feature ticket
214
+ When a Feature is easier to specify with a picture, PM may generate a **mockup** and
215
+ attach/reference it on the ticket so Dev builds against a concrete visual. This is a
216
+ **spec aid, not a production asset** — keep it out of the shipping tree (copy it to a
217
+ scratch dir, then attach/reference it).
218
+ - Mark it clearly in the ticket as **"mockup — illustrative, not the production asset"**
219
+ so Dev treats it as direction, not a drop-in file.
220
+ - §16: **never** put real user data / PII / secrets in an image prompt. A mock is
221
+ synthetic by construction — use placeholder names/numbers.
222
+
223
+ ---
224
+
225
+ ## Capability 3 — Delegate / rescue (a second engine on a stuck ticket)
226
+
227
+ Before Dev blocks a ticket `Bail-shape: fix-exhausted` (§9 — Dev tried, couldn't make
228
+ the gates/self-review pass), and **only if `codex.rescue` is on**, Dev may hand the task
229
+ to Codex for **one** pass (a different model/engine often breaks a stall):
230
+
231
+ ```bash
232
+ /codex:rescue fix the failing <test/flow> with the smallest safe patch
233
+ # or programmatically, write-capable:
234
+ codex exec --sandbox workspace-write -C "$REPO" < /dev/null "<the stuck task, precisely stated>"
235
+ ```
236
+
237
+ Hard limits:
238
+ - **One rescue attempt per ticket per fire** — Codex is not a retry loop. If its patch
239
+ doesn't pass Dev's own Step-5 gates **and** Step-5.5 self-review, Dev discards it and
240
+ blocks `fix-exhausted` exactly as it would have. (This sits *inside* §9's "cap blind
241
+ retries at 2" — a rescue is the considered alternative, not a 3rd blind retry.)
242
+ - Codex shares the **same git checkout** (§7): after a rescue, Dev re-reads `git status`,
243
+ reviews the diff line-by-line (Step 5.5), and stages **only** this ticket's files —
244
+ never blind-commits whatever Codex left in the tree.
245
+ - **`dry-run` (§12):** no rescue (it writes code) — print that you *would* delegate.
246
+
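The "stage only this ticket's files" rule can be backed by a quick check of what the rescue left behind. The following is a minimal sketch, assuming the agent knows which paths the ticket should touch; `unexpected_changes` is a hypothetical name, and the parsing covers only the simple `git status --porcelain` line shape (two status characters, a space, then the path).

```shell
# Hypothetical helper: list changed paths that are NOT in this ticket's allowlist.
#   stdin = output of `git status --porcelain`
#   $@    = paths the ticket is expected to touch
# Renames ("old -> new") and quoted paths are not handled in this sketch.
unexpected_changes() {
  while IFS= read -r line; do
    path=$(printf '%s\n' "$line" | cut -c4-)   # drop the 2 status chars and the space
    found=no
    for allowed in "$@"; do
      if [ "$path" = "$allowed" ]; then found=yes; fi
    done
    if [ "$found" = no ]; then printf '%s\n' "$path"; fi
  done
}
```

For example, `git -C "$REPO" status --porcelain | unexpected_changes src/a.ts` prints anything besides `src/a.ts` that changed; non-empty output is a cue to surface the extra files rather than stage them.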
247
+ ---
248
+
249
+ ## Safety & boundaries (recap — conventions win)
250
+
251
+ - **Firewall (§2):** Codex never touches Linear. All ticket state stays with the
252
+ dev-loop agent through the configured backend (§18). Codex only ever touches **code /
253
+ files / a review of them**.
254
+ - **Same machine, same checkout (§7):** an image or rescue run mutates the working tree.
255
+ Stage only your ticket's files; if commits/files you didn't author appear, surface it
256
+ (§7) rather than building on them blindly.
257
+ - **`mode` (§12):** in `dry-run`, Codex makes **no** repo writes that ship — read-only
258
+ review may run and print; image/rescue are described, not committed.
259
+ - **`autonomy` (§12a):** Codex must never inject an **interactive** prompt into the loop.
260
+ Use the non-interactive `codex exec` forms with the approval policy `never` (the exec default) and
261
+ an explicit `--sandbox` — never a form that pauses for a human. A genuine
262
+ external-prerequisite (e.g. Codex not logged in) is reported as a fact and the agent
263
+ proceeds without Codex, exactly as if `codex.enabled` were false.
264
+ - **Security (§16):** never pass secrets/PII into a Codex prompt or image description;
265
+ treat Codex's stdout like any tool output (no raw secrets into tickets/reports). Codex
266
+ inherits your local Codex auth/config — no new credential lives in `projects.json`.
267
+ - **Determinism:** prefer `codex exec`/`codex exec review` (synchronous) in the loop; the
268
+ `--background` + `/codex:status`/`/codex:result` flow is for an attended operator.
269
+
270
+ ---
271
+
272
+ ## Quick reference
273
+
274
+ | Need | Command (loop form) |
275
+ |---|---|
276
+ | Review the diff | `codex exec review -C "$REPO" < /dev/null` |
277
+ | Review vs base branch | `codex exec review --base main -C "$REPO" < /dev/null` |
278
+ | Adversarial / steerable review | `/codex:adversarial-review <focus text>` |
279
+ | Generate an image (then copy out) | `codex exec --sandbox workspace-write -C "$REPO" < /dev/null "…image_generation… then cp the PNG from ~/.codex/generated_images/<session>/ to <assetsDir>/<f>.png"` — file lands in `~/.codex/generated_images/<session-id>/ig_*.png`, **not** the named path (see Capability 2) |
280
+ | Resize a generated asset (size isn't honored) | `sips -z <h> <w> <assetsDir>/<f>.png` (or `magick`) after the copy |
281
+ | Rescue a stuck ticket | `/codex:rescue <task>` or `codex exec --sandbox workspace-write -C "$REPO" < /dev/null "<task>"` |
282
+ | Check Codex is ready | `codex --version && codex login status && codex features list \| grep image_generation` |