@dyzsasd/dev-loop 0.22.0 → 0.23.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +30 -10
- package/dist/agentops.js +5 -68
- package/dist/cli.js +4 -0
- package/dist/db.js +0 -26
- package/dist/doctor.js +2 -2
- package/dist/install-claude-plugin.js +78 -0
- package/dist/mcp-merge.js +18 -19
- package/dist/mirrorstore.js +1 -1
- package/dist/plugin/.claude-plugin/marketplace.json +13 -0
- package/dist/plugin/.claude-plugin/plugin.json +11 -0
- package/dist/plugin/config/mcp.codex.toml.example +33 -0
- package/dist/plugin/config/mcp.example.json +15 -0
- package/dist/plugin/config/mcp.opencode.json.example +16 -0
- package/dist/plugin/config/projects.example.json +82 -0
- package/dist/plugin/hooks/hooks.json +16 -0
- package/dist/plugin/references/codex-integration.md +282 -0
- package/dist/plugin/references/config-schema.md +358 -0
- package/dist/plugin/references/conventions.md +2159 -0
- package/dist/plugin/skills/architect-agent/SKILL.md +231 -0
- package/dist/plugin/skills/communication-agent/SKILL.md +247 -0
- package/dist/plugin/skills/dev-agent/SKILL.md +373 -0
- package/dist/plugin/skills/init/SKILL.md +496 -0
- package/dist/plugin/skills/junior-dev-agent/SKILL.md +348 -0
- package/dist/plugin/skills/ops-agent/SKILL.md +219 -0
- package/dist/plugin/skills/pm-agent/SKILL.md +427 -0
- package/dist/plugin/skills/qa-agent/SKILL.md +299 -0
- package/dist/plugin/skills/reflect-agent/SKILL.md +271 -0
- package/dist/plugin/skills/senior-dev-agent/SKILL.md +353 -0
- package/dist/plugin/skills/sweep-agent/SKILL.md +180 -0
- package/dist/run-agents.js +373 -0
- package/dist/seed.js +4 -3
- package/dist/server.js +1 -1
- package/dist/shim.js +3 -4
- package/dist/tooldefs.js +3 -25
- package/package.json +5 -5
- package/dist/topicstore.js +0 -174
@@ -0,0 +1,282 @@
# dev-loop — Codex integration (optional)

A **companion-plugin** playbook: how the dev-loop agents may reach for **OpenAI Codex**
as an optional power tool — an *independent reviewer*, an *image generator*, and a
*second-engine rescue*. This file is the detailed how-to; the canonical rules live in
[`conventions.md` §24](conventions.md#24-codex--optional-power-tools), which every agent
reads. If a rule here conflicts with conventions, conventions wins.

> **Opt-in, and absent ⇒ 100% unchanged.** If the `codex` config block is absent **or**
> the `codex` CLI is not on `PATH`, every agent behaves exactly as it does today — no
> review call, no image step, no rescue, no new prompts. Codex is an **accelerant the
> loop may use, never a dependency it needs.** (Same opt-in philosophy as `backend`
> §18, `repos[]` §19, and `reports.sink` §23.)

---

## What Codex adds (and what it does NOT)

| Capability | Who uses it | Why Codex (not just the dev-loop agent itself) |
|---|---|---|
| **Independent review** | Dev (Step 5.5), Architect | A *second model* on the diff/codebase — catches what the author's own pass misses. Codex `/review` quality, on demand. |
| **Image generation** | PM (mockups), Dev (real UI assets) | The dev-loop agents **cannot generate images**; Codex has a native `image_generation` tool. This is the one capability the loop genuinely lacks. |
| **Delegate / rescue** | Dev (before a `fix-exhausted` block) | A different engine takes one pass at a stuck ticket before Dev gives up — cheap extra attempt, still gated. |

**Codex is advisory, never authoritative.** The dev-loop agent always owns the
decision, the gate, and the ship. Codex output is an input to the agent's existing
judgment — it never bypasses the firewall (§2), `mode` (§12), `autonomy` (§12a), the
ship gates (Dev §5/§5.5/§6/§6.5), or the security doctrine (§16). A Codex review does
**not** replace Dev's own self-review; it augments it.

---

## Prerequisites (operator-present, one-time)

1. **Codex CLI** installed and authenticated:
   ```bash
   npm install -g @openai/codex  # Node 18.18+
   codex login                   # ChatGPT sign-in or API key
   codex --version               # sanity check
   ```
   Codex usage counts against your ChatGPT/Codex limits — see the Codex pricing docs.
2. **codex-plugin-cc** installed in Claude Code (gives the `/codex:*` commands the
   operator and the agents can invoke):
   ```bash
   /plugin marketplace add openai/codex-plugin-cc
   /plugin install codex@openai-codex
   /reload-plugins
   /codex:setup                  # verifies Codex is ready
   ```
3. **Verify the native tools** Codex will use (image generation is the load-bearing one):
   ```bash
   codex features list | grep -E 'image_generation'
   # image_generation stable true
   ```
4. Add the `codex` block to the project in `projects.json` (below). Absent ⇒ off.

`/dev-loop:init` does **not** install Codex for you (it's a separate vendor CLI), but it
notes the option in its readiness checklist when a `codex` block is present.
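
The "absent ⇒ unchanged" rule can be sketched as a single readiness probe that an agent runs before considering any Codex step. This is only an illustration: `codex_ready` is a hypothetical helper name, and it assumes `codex login status` exits non-zero when the CLI is not authenticated.

```bash
# Hypothetical helper (not part of dev-loop or Codex): may this fire use Codex at all?
# Returns 0 when the CLI is on PATH and reports a logged-in state, 1 otherwise.
codex_ready() {
  # Not on PATH: behave exactly as if the `codex` block were absent.
  command -v codex >/dev/null 2>&1 || return 1
  # Assumption: `codex login status` exits non-zero when not logged in.
  codex login status >/dev/null 2>&1 < /dev/null || return 1
  return 0
}

# Usage (illustrative):
#   if codex_ready; then echo "codex: available"; else echo "codex: skipped"; fi
```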

---

## Config block

Add an optional `codex` object to a project in `projects.json` (full schema in
[`config-schema.md`](config-schema.md)):

```jsonc
"codex": {
  "enabled": true,                  // master switch. false / absent ⇒ codex is never invoked (today's behavior)
  "review": true,                   // Dev Step 5.5 + Architect may run an independent codex review pass
  "rescue": false,                  // Dev may delegate ONE rescue pass to codex before a fix-exhausted block
  "imageGen": true,                 // PM/Dev may generate images via codex's image_generation tool
  "assetsDir": "public/generated",  // repo-relative dir where Dev commits generated production assets (multi-repo: per the ticket's repo:<name> tree)
  "model": null,                    // optional: pin a codex model (e.g. "gpt-5.4-mini"); null ⇒ codex's own default / its config.toml
  "effort": null                    // optional: reasoning effort (none|minimal|low|medium|high|xhigh); null ⇒ codex default
}
```

Each sub-flag is independently gated: e.g. `review:true, imageGen:false` runs reviews
but never generates images. A missing sub-flag ⇒ that capability is **off**.
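
The gating rule above (master switch plus an independent sub-flag, with anything missing counting as off) might be expressed as a small predicate. This is a sketch under assumptions: `codex_capability_on` is a hypothetical name, and the two values are presumed to have been read from `projects.json` already by whatever JSON reader the agent uses.

```bash
# Hypothetical helper: is one Codex capability switched on?
#   $1 = value of codex.enabled
#   $2 = value of the sub-flag (review, rescue, or imageGen)
# Only the literal string "true" counts as on; empty, missing, "false", and "null" are off.
codex_capability_on() {
  [ "${1:-}" = "true" ] && [ "${2:-}" = "true" ]
}

# Usage (illustrative):
#   codex_capability_on "$ENABLED" "$REVIEW" && echo "review pass allowed"
```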

---

## Invocation forms — deterministic first

The dev-loop agents run **unattended on a loop**, so they prefer the **blocking,
parseable** Codex CLI forms over the plugin's `--background` + `/codex:status` polling
(which is operator-present ergonomics). Two ways to call Codex:

- **Programmatic (preferred in the loop):** `codex exec …` / `codex exec review …` —
  runs to completion, prints to stdout, exits. Add `--json` for JSONL events when you
  need to parse structured output, or `--output-last-message <file>` to capture just the
  final message.
- **Plugin slash-commands (operator-present, or a single attended pass):** `/codex:review`,
  `/codex:adversarial-review`, `/codex:rescue`, `/codex:status`, `/codex:result`,
  `/codex:cancel`. Convenient when a human is driving; in a looped agent, drive
  `codex exec` directly so the call is synchronous and self-contained.

Shared flags the loop always sets:
- `< /dev/null` — close stdin. Without it `codex exec` prints *"Reading additional input
  from stdin…"* and **waits**, hanging an unattended fire.
- `-C <dir>` (or `--cd`) — run in the target repo / assets dir (multi-repo: the ticket's
  `repo:<name>` tree, §19).
- `--skip-git-repo-check` — only when the target dir is not a git repo (a scratch/mock dir).
- `-c model_reasoning_effort=<…>` / `-m <model>` — only when `codex.effort` / `codex.model`
  are set; otherwise leave Codex on its own defaults / `config.toml`.
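
The last bullet (pass `-m` / `-c model_reasoning_effort=…` only when the config pins them) could be sketched as a tiny option builder. `codex_opts` is a hypothetical helper, and the sketch assumes the model and effort values contain no whitespace, so the result can be word-split safely.

```bash
# Hypothetical helper: print the optional model/effort arguments on one line.
#   $1 = codex.model, $2 = codex.effort; pass "" or "null" when the key is unset.
# Prints an empty line when neither is set, leaving Codex on its own defaults.
codex_opts() {
  _opts=""
  if [ -n "${1:-}" ] && [ "$1" != "null" ]; then
    _opts="-m $1"
  fi
  if [ -n "${2:-}" ] && [ "$2" != "null" ]; then
    _opts="${_opts:+$_opts }-c model_reasoning_effort=$2"
  fi
  printf '%s\n' "$_opts"
}

# Usage (illustrative; the unquoted expansion is deliberate so the flags split into words):
#   codex exec $(codex_opts "$MODEL" "$EFFORT") --sandbox workspace-write -C "$REPO" < /dev/null "<task>"
```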

---

## Capability 1 — Independent review (read-only)

**Where it plugs in:** Dev **Step 5.5 stage 2** ("code quality") already says *"if a
`code-review` skill/command is available, invoke it"* — Codex **is** that reviewer when
`codex.review` is on. Architect (Job 2) may likewise take a Codex second opinion on its
rotating dimension.

**Form (read-only — `codex review` / `codex exec review` never edit code):**
```bash
# Review the working-tree diff (Dev, after green gates, before shipping):
codex exec review -C "$REPO" < /dev/null

# Review the branch vs a base ref:
codex exec review --base main -C "$REPO" < /dev/null  # (or /codex:review --base main)

# Pressure-test a design decision (Architect / a risky Dev change):
# the plugin's steerable variant takes focus text:
/codex:adversarial-review challenge the caching + retry design for race conditions
```

**How Dev treats the findings (unchanged gate semantics):**
- It is an **additional advisory pass**, not a replacement for Dev's own Step-5.5
  self-review. Run *both*.
- **Critical / High** findings are blocking exactly like Dev's own (Step 5.5 stage 2):
  fix this run, or if you can't, revert and **block** the ticket
  `Bail-shape: fix-exhausted` with the findings. Medium/Low/nits are non-blocking — apply
  the cheap ones, note the rest in the hand-off.
- **Codex disagreeing with Dev is signal, not gospel.** If Codex flags something Dev is
  confident is a false positive, Dev may proceed but must say so in the hand-off (so the
  owner can see the disagreement). Codex never gets a veto the gates don't already grant.
- **`dry-run` (§12):** a read-only review is safe to run and print even in `dry-run`
  (it mutates nothing) — but no resulting code change is shipped, same as any dry-run.
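
The severity rule in the list above can be pictured as a one-line classifier. This is a sketch only: `finding_is_blocking` is a hypothetical name, and the exact label spelling in a Codex review is an assumption, so the input is lower-cased before matching.

```bash
# Hypothetical helper: does a finding's severity label block the ship?
#   $1 = severity label as printed by the review (spelling is an assumption)
# Returns 0 for Critical/High, 1 for everything else (Medium, Low, nits, unknown).
finding_is_blocking() {
  case "$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')" in
    critical|high) return 0 ;;  # fix this run, or revert and block fix-exhausted
    *)             return 1 ;;  # non-blocking: apply the cheap ones, note the rest
  esac
}

# Usage (illustrative):
#   finding_is_blocking "$SEVERITY" && echo "must fix before shipping"
```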

---

## Capability 2 — Image generation (the capability the loop lacks)

Codex's native `image_generation` tool produces real raster images. **Verify it's
present:** `codex features list | grep image_generation` → `image_generation stable true`.

### ⚠️ How the tool actually saves (verified — read this first)
`image_generation` does **not** save to a path you name in the prompt. It **always**
writes the PNG to:

```
~/.codex/generated_images/<session-id>/ig_<hash>.png
```

…and it ignores the filename **and the pixel dimensions** you ask for (a "512×512
gear.png" request produced a `1254×1254` `ig_*.png`). Worse, Codex's own final message
will often claim *"saved to ./gear.png"* — that line is a **confabulation**; no such file
exists. So **never trust the model's reported path** — the agent must locate the real
generated file and **copy it out** to the target. Two verified recipes:

**Recipe A — agent-orchestrated (deterministic; preferred in the loop).** The dev-loop
agent runs Codex to generate, captures the **session id**, then copies the file itself —
no dependence on Codex's self-report, and **race-safe under concurrency** (scopes to the
one session dir):
```bash
# 1) generate; capture the session id from --json (or the exec banner "session id: …")
SID=$(codex exec --json --sandbox workspace-write -C "$REPO" < /dev/null \
  "Use your built-in image_generation tool to create <precise description: subject,
   style, palette, background>. Use the tool directly; do not write code." \
  | sed -n 's/.*"session_id":"\([^"]*\)".*/\1/p' | head -1)
# 2) copy the just-generated PNG out to the repo asset path:
SRC=$(ls -t "$HOME/.codex/generated_images/$SID/"*.png | head -1)
mkdir -p "$REPO/$ASSETS_DIR" && cp "$SRC" "$REPO/$ASSETS_DIR/<name>.png"
```

**Recipe B — single-call (simpler; Codex copies it itself).** Tell Codex to generate
**and** `cp` the result, scoping the copy to **this** session's dir (don't pick the
newest file across all of `generated_images`; that races other Codex runs):
```bash
codex exec --sandbox workspace-write -C "$REPO" < /dev/null \
  "Step 1: use your built-in image_generation tool to create <precise description>.
   The tool saves the PNG under ~/.codex/generated_images/<this session id>/.
   Step 2: copy that generated PNG (a shell cp is allowed) to $ASSETS_DIR/<name>.png in
   the working directory. Step 3: print DONE and run 'ls -l $ASSETS_DIR/<name>.png'."
```

Mechanics that bite (all verified):
- **`--sandbox workspace-write` is mandatory.** `codex exec` defaults to a **read-only**
  sandbox and silently produces **no on-disk copy**. workspace-write permits the workdir
  (+ `/tmp`, `$TMPDIR`); home is readable, so the `cp` from `~/.codex/generated_images`
  into the workdir works.
- **`< /dev/null`** so the fire doesn't hang on *"Reading additional input from stdin…"*.
- Dimensions aren't honored by the prompt — if you need an exact size, resize after the
  copy (e.g. `sips`/`magick`) rather than asking Codex for it.
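
Recipe A's copy step assumes the session directory exists and holds a PNG. A slightly more defensive variant is sketched below: it selects the newest PNG without parsing `ls` output and fails loudly when nothing was generated. `copy_newest_png` is a hypothetical helper, not something Codex or dev-loop ships.

```bash
# Hypothetical helper: copy the newest PNG from one session directory.
#   $1 = session directory (e.g. "$HOME/.codex/generated_images/$SID")
#   $2 = destination file inside the repo
# Returns non-zero, and copies nothing, when the directory holds no PNG.
copy_newest_png() {
  _src_dir=$1
  _dest=$2
  _newest=""
  for _f in "$_src_dir"/*.png; do
    [ -e "$_f" ] || continue            # an unmatched glob stays literal; skip it
    if [ -z "$_newest" ] || [ "$_f" -nt "$_newest" ]; then
      _newest=$_f                       # keep the most recently modified file
    fi
  done
  if [ -z "$_newest" ]; then
    echo "no PNG found in $_src_dir" >&2
    return 1
  fi
  mkdir -p "$(dirname "$_dest")" && cp "$_newest" "$_dest"
}
```

A caller would still want to check that the captured session id is non-empty before building the source path, for example `[ -n "$SID" ] && copy_newest_png "$HOME/.codex/generated_images/$SID" "$REPO/$ASSETS_DIR/<name>.png"`.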

### 2a. Dev — production assets an acceptance criterion requires
When a ticket's ACs call for an image the code needs (an icon, an illustration, an
OpenGraph/social card, a placeholder, a favicon), Dev generates it **into the repo**
under `codex.assetsDir` during Step 4 (Recipe A/B above, `$REPO` = the ticket's
`repo:<name>` tree, §19), then it ships through the normal gates like any other file:
- The asset is a **repo artifact**: Dev stages **only** the generated file(s) + the code
  that references them (staging discipline, §7), commits with the ticket id, and ships
  per config (Step 6). It runs through Step 5 gates and Step 5.5 self-review.
- Coverage (§15): a generated static asset is treated like a docs/asset change — exempt
  from a regression test (note it in the hand-off); the *code that uses* it still
  follows §15.
- **`dry-run` (§12):** generate to a scratch path if useful for the preview, but make
  **no** commit/push/deploy and don't write into the shipping tree — print what you'd do.

### 2b. PM — mockups / wireframes to sharpen a Feature ticket
When a Feature is easier to specify with a picture, PM may generate a **mockup** and
attach/reference it on the ticket so Dev builds against a concrete visual. This is a
**spec aid, not a production asset** — keep it out of the shipping tree (copy it to a
scratch dir, then attach/reference it).
- Mark it clearly in the ticket as **"mockup — illustrative, not the production asset"**
  so Dev treats it as direction, not a drop-in file.
- §16: **never** put real user data / PII / secrets in an image prompt. A mock is
  synthetic by construction — use placeholder names/numbers.

---

## Capability 3 — Delegate / rescue (a second engine on a stuck ticket)

Before Dev blocks a ticket `Bail-shape: fix-exhausted` (§9 — Dev tried, couldn't make
the gates/self-review pass), and **only if `codex.rescue` is on**, Dev may hand the task
to Codex for **one** pass (a different model/engine often breaks a stall):

```bash
/codex:rescue fix the failing <test/flow> with the smallest safe patch
# or programmatically, write-capable:
codex exec --sandbox workspace-write -C "$REPO" < /dev/null "<the stuck task, precisely stated>"
```

Hard limits:
- **One rescue attempt per ticket per fire** — Codex is not a retry loop. If its patch
  doesn't pass Dev's own Step-5 gates **and** Step-5.5 self-review, Dev discards it and
  blocks `fix-exhausted` exactly as it would have. (This sits *inside* §9's "cap blind
  retries at 2" — a rescue is the considered alternative, not a 3rd blind retry.)
- Codex shares the **same git checkout** (§7): after a rescue, Dev re-reads `git status`,
  reviews the diff line-by-line (Step 5.5), and stages **only** this ticket's files —
  never blind-commits whatever Codex left in the tree.
- **`dry-run` (§12):** no rescue (it writes code) — print that you *would* delegate.
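
The "one rescue attempt per ticket per fire" limit could be enforced with a few lines of bookkeeping. This is a sketch, not dev-loop's actual mechanism: `RESCUED_TICKETS` and `rescue_allowed` are hypothetical names, the state lives only for the duration of one fire, and ticket ids are assumed to contain no spaces.

```bash
# Hypothetical per-fire state: space-separated ids of tickets already rescued.
RESCUED_TICKETS=""

# Hypothetical helper: returns 0 the first time a ticket id is seen in this fire
# and records it; returns 1 on any later call for the same id.
# Call it in the current shell (not inside $(...)), or the record is lost.
rescue_allowed() {
  case " $RESCUED_TICKETS " in
    *" $1 "*) return 1 ;;                # already attempted this fire
  esac
  RESCUED_TICKETS="$RESCUED_TICKETS $1"  # record the attempt before running it
  return 0
}

# Usage (illustrative):
#   rescue_allowed "$TICKET" && codex exec --sandbox workspace-write -C "$REPO" < /dev/null "<task>"
```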

---

## Safety & boundaries (recap — conventions win)

- **Firewall (§2):** Codex never touches Linear. All ticket state stays with the
  dev-loop agent through the configured backend (§18). Codex only ever touches **code /
  files / a review of them**.
- **Same machine, same checkout (§7):** an image or rescue run mutates the working tree.
  Stage only your ticket's files; if commits/files you didn't author appear, surface it
  (§7) rather than building on them blindly.
- **`mode` (§12):** in `dry-run`, Codex makes **no** repo writes that ship — read-only
  review may run and print; image/rescue are described, not committed.
- **`autonomy` (§12a):** Codex must never inject an **interactive** prompt into the loop.
  Use the non-interactive `codex exec` forms with `approval never` (the exec default) and
  an explicit `--sandbox` — never a form that pauses for a human. A genuine
  external-prerequisite (e.g. Codex not logged in) is reported as a fact and the agent
  proceeds without Codex, exactly as if `codex.enabled` were false.
- **Security (§16):** never pass secrets/PII into a Codex prompt or image description;
  treat Codex's stdout like any tool output (no raw secrets into tickets/reports). Codex
  inherits your local Codex auth/config — no new credential lives in `projects.json`.
- **Determinism:** prefer `codex exec`/`codex exec review` (synchronous) in the loop; the
  `--background` + `/codex:status`/`/codex:result` flow is for an attended operator.
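
The `mode` rule in the recap can be summarised as a small lookup. This is a sketch under assumptions: `codex_mode_policy` is a hypothetical helper, and any mode other than `dry-run` is treated as permitting the normal gated flow, which conventions §12 may refine.

```bash
# Hypothetical helper: what may a Codex capability do under the current mode?
#   $1 = mode (only "dry-run" is named in this document)
#   $2 = capability: review, image, or rescue
# Prints "run" when the step may execute, "describe" when it should only be printed.
codex_mode_policy() {
  if [ "${1:-}" != "dry-run" ]; then
    echo run                    # assumption: every non-dry-run mode uses the normal gates
    return 0
  fi
  case "${2:-}" in
    review) echo run ;;         # read-only: safe to run and print
    *)      echo describe ;;    # image/rescue: say what would happen, ship nothing
  esac
}

# Usage (illustrative):
#   [ "$(codex_mode_policy "$MODE" rescue)" = run ] || echo "dry-run: would delegate to Codex"
```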

---

## Quick reference

| Need | Command (loop form) |
|---|---|
| Review the diff | `codex exec review -C "$REPO" < /dev/null` |
| Review vs base branch | `codex exec review --base main -C "$REPO" < /dev/null` |
| Adversarial / steerable review | `/codex:adversarial-review <focus text>` |
| Generate an image (then copy out) | `codex exec --sandbox workspace-write -C "$REPO" < /dev/null "…image_generation… then cp the PNG from ~/.codex/generated_images/<session>/ to <assetsDir>/<f>.png"` — file lands in `~/.codex/generated_images/<session-id>/ig_*.png`, **not** the named path (see Capability 2) |
| Resize a generated asset (size isn't honored) | `sips -z <h> <w> <assetsDir>/<f>.png` (or `magick`) after the copy |
| Rescue a stuck ticket | `/codex:rescue <task>` or `codex exec --sandbox workspace-write -C "$REPO" < /dev/null "<task>"` |
| Check Codex is ready | `codex --version && codex login status && codex features list \| grep image_generation` |