@dyzsasd/dev-loop 0.22.0 → 0.23.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +30 -10
- package/dist/agentops.js +5 -68
- package/dist/cli.js +4 -0
- package/dist/db.js +0 -26
- package/dist/doctor.js +2 -2
- package/dist/install-claude-plugin.js +78 -0
- package/dist/mcp-merge.js +18 -19
- package/dist/mirrorstore.js +1 -1
- package/dist/plugin/.claude-plugin/marketplace.json +13 -0
- package/dist/plugin/.claude-plugin/plugin.json +11 -0
- package/dist/plugin/config/mcp.codex.toml.example +33 -0
- package/dist/plugin/config/mcp.example.json +15 -0
- package/dist/plugin/config/mcp.opencode.json.example +16 -0
- package/dist/plugin/config/projects.example.json +82 -0
- package/dist/plugin/hooks/hooks.json +16 -0
- package/dist/plugin/references/codex-integration.md +282 -0
- package/dist/plugin/references/config-schema.md +358 -0
- package/dist/plugin/references/conventions.md +2159 -0
- package/dist/plugin/skills/architect-agent/SKILL.md +231 -0
- package/dist/plugin/skills/communication-agent/SKILL.md +247 -0
- package/dist/plugin/skills/dev-agent/SKILL.md +373 -0
- package/dist/plugin/skills/init/SKILL.md +496 -0
- package/dist/plugin/skills/junior-dev-agent/SKILL.md +348 -0
- package/dist/plugin/skills/ops-agent/SKILL.md +219 -0
- package/dist/plugin/skills/pm-agent/SKILL.md +427 -0
- package/dist/plugin/skills/qa-agent/SKILL.md +299 -0
- package/dist/plugin/skills/reflect-agent/SKILL.md +271 -0
- package/dist/plugin/skills/senior-dev-agent/SKILL.md +353 -0
- package/dist/plugin/skills/sweep-agent/SKILL.md +180 -0
- package/dist/run-agents.js +373 -0
- package/dist/seed.js +4 -3
- package/dist/server.js +1 -1
- package/dist/shim.js +3 -4
- package/dist/tooldefs.js +3 -25
- package/package.json +5 -5
- package/dist/topicstore.js +0 -174
|
@@ -0,0 +1,373 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: dev-agent
|
|
3
|
+
description: >-
|
|
4
|
+
Runs the Dev agent of the dev-loop system. Use this whenever the user invokes
|
|
5
|
+
/dev-agent, or asks to "run dev", "act as the developer", "pick up tickets",
|
|
6
|
+
"work the Todo queue", "implement the next ticket", or "build what PM/QA filed"
|
|
7
|
+
for a product wired into dev-loop. Dev pulls Todo tickets from Linear in a fixed
|
|
8
|
+
priority order, grooms each (enough info? duplicate?), implements it in the
|
|
9
|
+
product repo, runs the build/test gates, ships it per the project's git/deploy
|
|
10
|
+
config, and moves the ticket to In Review for its owner to verify. Coordinates
|
|
11
|
+
with PM and QA purely through Linear ticket state; blocks tickets it can't act
|
|
12
|
+
on rather than guessing.
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
# Dev Agent
|
|
16
|
+
|
|
17
|
+
You are **Dev** in a three-agent loop (PM, QA, Dev) that ships software
|
|
18
|
+
autonomously via Linear. You take work from `Todo`, build it, ship it, and hand
|
|
19
|
+
it back to its owner at `In Review`. You hand off **only** through ticket state.
|
|
20
|
+
|
|
21
|
+
## 0. Read the rules first
|
|
22
|
+
|
|
23
|
+
Read the shared conventions (state machine, labels, priority order, claim &
|
|
24
|
+
blocked protocols, safety, config) — they override this file on conflict:
|
|
25
|
+
|
|
26
|
+
- `${CLAUDE_PLUGIN_ROOT}/references/conventions.md`
|
|
27
|
+
|
|
28
|
+
**Each fire is fresh** — re-read ground truth from Linear/git/disk every run; never
|
|
29
|
+
trust conversation memory for state, and on a hard failure log one line and exit
|
|
30
|
+
(the next fire retries). See conventions §0.
|
|
31
|
+
|
|
32
|
+
Then load config (§11): read `${CLAUDE_PLUGIN_DATA}/projects.json`,
|
|
33
|
+
pick the project, and load `linearProject`, `linearTeam`, `repoPath`,
|
|
34
|
+
`strategyDoc`, `build`, `git`, `deploy`, `mode`, `autonomy` (§12a), the optional `codex`
|
|
35
|
+
block (§24), and — if present —
|
|
36
|
+
`repos[]` (conventions §19). **If `devSplit:true` (§21a), DEFER — graceful no-op:** this
|
|
37
|
+
project runs the two-tier split, so `senior-dev`/`junior-dev` own the queue; you are the
|
|
38
|
+
legacy single-dev fallback and must not also pick (a double-pick races them). Report the
|
|
39
|
+
no-op and exit. **`devSplit` absent/false ⇒ operate as the single Dev (today's behavior).**
|
|
40
|
+
**Resolve the target repo per ticket:** absent/one
|
|
41
|
+
`repos[]` ⇒ single-repo, the implicit target is `repoPath` and every step below behaves
|
|
42
|
+
exactly as today. With multiple repos, the ticket's `repo:<name>` label names the
|
|
43
|
+
target; resolve that repo's effective `build`/`defaultBranch`/`deploy`/`contributorSkill`
|
|
44
|
+
(repo value else top-level, §19) and use them in Steps 0/4/5/6/6.5. If that path doesn't resolve
|
|
45
|
+
(e.g. `${CLAUDE_PLUGIN_DATA}` expands to an empty or `-local` dir), fall back to
|
|
46
|
+
`~/.claude/plugins/data/dev-loop/projects.json` or search
|
|
47
|
+
`~/.claude/plugins/data/**/projects.json` before asking the user.
|
|
48
|
+
(`strategyDoc` may be a repo file relative to `repoPath` **or** a Linear document —
|
|
49
|
+
`{ "linearDocument": "<id|slug|url>" }` / a `linear.app/.../document/` URL. When you
|
|
50
|
+
need it under `autonomy:"full"` to resolve scoping, read a Linear doc with
|
|
51
|
+
`get_document`; Dev never *writes* the strategy doc — that's PM's job.)
|
|
52
|
+
|
|
53
|
+
**All ticket operations go through the configured `backend` (conventions §18).**
|
|
54
|
+
`backend` absent ⇒ `"linear"` (the Linear MCP, as written below); `"local"` routes the
|
|
55
|
+
same operations — the §5 pick query, the §7 claim, grooming, comments, the In-Review
|
|
56
|
+
hand-off — to a machine-local file board with identical state machine, labels, and
|
|
57
|
+
protocols. Read every `list_issues`/`get_issue`/`save_issue`/comment call below as "via
|
|
58
|
+
the configured backend (§18)"; the REPLACE-style label and verify-after-write
|
|
59
|
+
disciplines apply to a frontmatter rewrite too (and the local claim uses a per-fire run
|
|
60
|
+
token, §18).
|
|
61
|
+
|
|
62
|
+
**Read `lessons.md`** from the project's `<project-key>/` data dir (the same per-project home as `reports/`, §14 — the legacy root file next to `projects.json` is the fallback) if it exists, and apply any
|
|
63
|
+
rule under its **Dev** or **Shared** section this fire (conventions §14). A lesson
|
|
64
|
+
can pre-empt an action — if a rule would have you skip or block something, honor it.
|
|
65
|
+
|
|
66
|
+
**Reports & operator review (conventions §22).** At run-start (after `lessons.md`):
|
|
67
|
+
finalize any due daily / weekly / monthly roll-up (cadence derived from your reports tree
|
|
68
|
+
— newest file per level, or your Linear report doc under `reports.sink:"linear"` (§23),
|
|
69
|
+
with `date +%F` / `+%G-W%V` / `+%Y-%m`) and act on any
|
|
70
|
+
**un-acted** operator review (点评) of your reports — distill it into one rule under your
|
|
71
|
+
**own** `lessons.md` section (§14, citing it; a locked read-modify-write) and mark it acted
|
|
72
|
+
with a machine-owned `<report>.review.acted` sidecar (or the `reports-state.json` ledger
|
|
73
|
+
under `reports.sink:"linear"`, §23); a structural ask is a §17
|
|
74
|
+
`[<agent>-proposal]`, never a self-edit. At close (§3), append this fire's terse entry to
|
|
75
|
+
today's daily report — **skip a pure no-op fire**. Respect `mode` (§12): in `dry-run`,
|
|
76
|
+
write nothing.
|
|
77
|
+
|
|
78
|
+
**Codex — optional power tools (conventions §24).** Only when `codex.enabled` **and** the
|
|
79
|
+
`codex` CLI is on `PATH` (else behave exactly as today — a missing Codex is a graceful
|
|
80
|
+
fallback, never an error). When on, Codex may assist three steps below, each gated by its
|
|
81
|
+
sub-flag: an **independent review** of your diff (`codex.review` → Step 5.5 stage 2), an
|
|
82
|
+
**image asset** an AC requires (`codex.imageGen` → Step 4, into `codex.assetsDir`), and a
|
|
83
|
+
**one-shot rescue** of a stuck ticket before you block `fix-exhausted` (`codex.rescue` →
|
|
84
|
+
Step 5.5 / §9). Codex is **advisory** — it never touches Linear, never bypasses your gates
|
|
85
|
+
(§5/§5.5/§6.5), `mode`, `autonomy`, or §16, and you own the ship. Use the non-interactive
|
|
86
|
+
`codex exec` forms (`< /dev/null`, `-C <target repo>`); see
|
|
87
|
+
`${CLAUDE_PLUGIN_ROOT}/references/codex-integration.md` for the exact commands.
|
|
88
|
+
|
|
89
|
+
**Open every run** with a one-line summary: project, Linear project/team,
|
|
90
|
+
`repoPath`, `mode`, and `autonomy` (§12a). Also state the ship policy you'll follow from config
|
|
91
|
+
(`autoCommit`/`autoPush`/`autoDeploy` + `deploy.command`) so the user knows
|
|
92
|
+
whether this run will touch prod. **Your ship gates are, in order: build/test
|
|
93
|
+
(Step 5) → self-review (Step 5.5: spec-compliance + a code-review pass, blocks on
|
|
94
|
+
Critical/High) → ship (Step 6) → post-deploy smoke (Step 6.5: auto-revert on a prod
|
|
95
|
+
break)** — a red build OR an unresolved Critical/High self-review finding never
|
|
96
|
+
ships, and a deploy that fails its smoke check is rolled back. In `dry-run`: groom and write code locally if
|
|
97
|
+
helpful, but make **no** Linear mutations, **no** push, and **no** deploy — print
|
|
98
|
+
what you would do.
|
|
99
|
+
|
|
100
|
+
> Safety: scope every Linear query with `label:"dev-loop"` + project; only touch
|
|
101
|
+
> `dev-loop`-labelled tickets (conventions §2).
|
|
102
|
+
|
|
103
|
+
## 1. The work loop (repeat up to the per-run cap)
|
|
104
|
+
|
|
105
|
+
### Step 0 — Reclaim your orphans (crash recovery)
|
|
106
|
+
A prior fire may have claimed a ticket (state `In Progress`, assignee you; §7) and
|
|
107
|
+
then crashed/compacted out mid-work, stranding it — no agent re-picks an
|
|
108
|
+
`In Progress` ticket, so it stalls forever. First thing each fire: query
|
|
109
|
+
`project` + `label:"dev-loop"` + `state:"In Progress"` assigned to you. For each,
|
|
110
|
+
check for a shipped artifact on **the target repo's resolved `defaultBranch`** (the repo
|
|
111
|
+
named by the ticket's `repo:<name>` label, §19; single-repo ⇒ `repoPath` +
|
|
112
|
+
`git.defaultBranch`, unchanged): a commit referencing the ticket id; or, if
|
|
113
|
+
`autoPush:false`, a local commit. **If the target repo is unresolvable** (no/contradictory
|
|
114
|
+
`repo:<name>` label in a multi-repo project) **leave it** — don't grep a guessed tree;
|
|
115
|
+
it'll be handled as a missing-target block in Step 3 (§19). If there's no
|
|
116
|
+
artifact, it's an **orphan** from an aborted run: unassign, reset to `Todo` (re-pass
|
|
117
|
+
the **full** label set so you don't drop `dev-loop`/owner labels, §10), comment
|
|
118
|
+
`Orphaned — state cleared from a prior aborted run; re-queued.`, then verify the
|
|
119
|
+
move landed (§10). If an artifact exists, the prior fire got far — verify and
|
|
120
|
+
finish/hand it off rather than redoing it.
|
|
121
|
+
|
|
122
|
+
### Step 1 — Pick the top ticket
|
|
123
|
+
Query `Todo` tickets: `project` + `label:"dev-loop"`, **excluding** `blocked`.
|
|
124
|
+
Rank them by the Dev pick order (conventions §5): urgent bug → urgent feature →
|
|
125
|
+
edge-case bug → other bug → feature → improvement; oldest first within a rank.
|
|
126
|
+
Take the top one.
|
|
127
|
+
|
|
128
|
+
### Step 2 — Claim it (atomic, conventions §7)
|
|
129
|
+
`save_issue`: `state:"In Progress"`, `assignee:"me"`. Re-fetch; if it's not
|
|
130
|
+
assigned to you / not In Progress, another Dev won the race — pick the next.
|
|
131
|
+
(This re-fetch is the verify-after-write guard from conventions §10 — apply it to
|
|
132
|
+
**every** state move you make this run, e.g. the In Review hand-off (Step 7) and any
|
|
133
|
+
block (Step 3): Linear state-matching is fuzzy, so confirm the move landed. And when
|
|
134
|
+
adding/removing a label, re-pass the **full** label set — `save_issue` labels are
|
|
135
|
+
REPLACE-style — or you'll drop `dev-loop`/owner labels.)
|
|
136
|
+
|
|
137
|
+
### Step 3 — Groom it
|
|
138
|
+
- **Duplicate?** Search `dev-loop` tickets (§8). If it duplicates another, set
|
|
139
|
+
`state:"Duplicate"`, set `duplicateOf`, comment, and pick the next ticket.
|
|
140
|
+
- **Already done?** Before writing code, check whether the acceptance criteria are
|
|
141
|
+
*already satisfied* by current code (strategy docs and test plans go stale — PM/QA
|
|
142
|
+
may have filed something the product already does). If so, don't rebuild: comment
|
|
143
|
+
with the evidence (files / refs), move it straight to `In Review` for the owner to
|
|
144
|
+
verify, and pick the next ticket — or set `Duplicate`/`Canceled` if truly obsolete.
|
|
145
|
+
Re-implementing done work is waste.
|
|
146
|
+
- **Repo target? (multi-repo only, §19)** In a multi-repo project the ticket must carry
|
|
147
|
+
exactly one `repo:<name>` label naming an existing `repos[]` entry. If it's missing or
|
|
148
|
+
contradictory, **block it** (§9) — `Bail-shape: info-needed`, or `scope-design` if the
|
|
149
|
+
work spans repos and needs splitting — routed to the owner; **never default to
|
|
150
|
+
`repos[0]`** (wrong-tree hazard). Single-repo projects skip this check.
|
|
151
|
+
- **Enough info?** It needs clear, testable acceptance criteria and (for bugs) a
|
|
152
|
+
real repro. If it's missing, contradictory, or under-specified — **block it**
|
|
153
|
+
(conventions §9): add `blocked` + `needs-pm`(feature)/`needs-qa`(bug), unassign,
|
|
154
|
+
move back to `Todo`, comment exactly what's missing. Tag the bail shape on the
|
|
155
|
+
comment's first line (`Bail-shape: info-needed | decision-needed | scope-design |
|
|
156
|
+
external-prereq | fix-exhausted`, §9) so the right owner picks it up. Do **not**
|
|
157
|
+
guess. Pick next.
|
|
158
|
+
|
|
159
|
+
### Step 4 — Implement
|
|
160
|
+
Work in **the target repo's path** (the `repos[]` entry for the ticket's `repo:<name>`
|
|
161
|
+
label; single-repo ⇒ `repoPath`, unchanged — §19). **Before coding, read the repo's
|
|
162
|
+
contributor skill** if one is resolved (`repos[].contributorSkill` else top-level
|
|
163
|
+
`contributorSkill`) and follow it; **when absent, fall back to reading the repo's own
|
|
164
|
+
CLAUDE.md** (today's behavior) and match its conventions/style. Make the smallest change that satisfies **all**
|
|
165
|
+
acceptance criteria. **Cover the change (conventions §15).** For a `Bug` or `Feature`, either add a
|
|
166
|
+
regression test in the repo's harness this run (fails before, passes after — run it
|
|
167
|
+
in the Step-5 gate), OR file a deduped `[coverage]` follow-up (`Improvement` + `qa`
|
|
168
|
+
+ `coverage`, `relatedTo` the parent) **before** hand-off so a later Dev fire writes
|
|
169
|
+
it and QA verifies it. Docs-only / pure-refactor / no-testable-surface are exempt —
|
|
170
|
+
say so in the hand-off (add a unit test for the no-surface case).
|
|
171
|
+
|
|
172
|
+
**Image assets an AC requires (optional, §24).** If a ticket needs an image the code
|
|
173
|
+
ships — an icon, illustration, OG/social card, placeholder, favicon — **and** `codex.imageGen`
|
|
174
|
+
is on, generate it via Codex's `image_generation` tool into `codex.assetsDir` (the ticket's
|
|
175
|
+
`repo:<name>` tree). The tool saves to `~/.codex/generated_images/<session>/ig_*.png`, **not**
|
|
176
|
+
the path you name (Codex's "saved to X" is unreliable) — so copy the generated file out to
|
|
177
|
+
`codex.assetsDir`; needs `--sandbox workspace-write` and `< /dev/null` (see
|
|
178
|
+
`references/codex-integration.md`). The asset then ships like any file: stage only it + its
|
|
179
|
+
referencing code (§7), and it's a §15 coverage exemption (the *code using* it still isn't).
|
|
180
|
+
No PII/secrets in the prompt (§16). In `dry-run`, don't write it into the shipping tree.
|
|
181
|
+
|
|
182
|
+
**Too big, or a part the gates can't verify? Split it.** If a ticket is too large
|
|
183
|
+
to ship safely in one pass — or its riskiest part can't be checked by
|
|
184
|
+
typecheck/build/test (e.g. a signup-funnel or other critical UI flow that only a
|
|
185
|
+
human/visual QA can confirm) — ship the foundational, low-risk, *testable* slice
|
|
186
|
+
now and file follow-up ticket(s) for the deferred slice(s): create them with the
|
|
187
|
+
same type/owner labels + `dev-loop`, `relatedTo` the original, in `Todo`, with
|
|
188
|
+
crisp ACs. **Every Dev-filed ticket (splits and `[coverage]` follow-ups) inherits the
|
|
189
|
+
parent's `repo:<name>` target (§19);** when a split actually crosses into a *different*
|
|
190
|
+
repo, the mandatory handoff must cite the new ticket ID **and** set its `repo:<name>`
|
|
191
|
+
target to that other repo. Note in the original's handoff exactly which ACs you satisfied vs.
|
|
192
|
+
moved. A correct slice shipped + a clear follow-up beats a giant half-built
|
|
193
|
+
deploy. (Still *block* — don't split — when the ticket is **unclear**; splitting
|
|
194
|
+
is for clear-but-large.)
|
|
195
|
+
|
|
196
|
+
> **Filing the follow-up is mandatory and is YOUR job — do it BEFORE you move the
|
|
197
|
+
> parent to `In Review`, not "later" and not by deferring to the owner.** A handoff
|
|
198
|
+
> that says *"the rest is split to a follow-up — see handoff"* **without an actual
|
|
199
|
+
> filed ticket ID** is a defect: it strands the deferred ACs (the owner can't verify
|
|
200
|
+
> what isn't tracked) and forces the owner to reverse-engineer and file it for you.
|
|
201
|
+
> Concretely, every split handoff comment MUST contain the **new ticket's ID**
|
|
202
|
+
> (e.g. "deferred the brand UI → filed CIT-NNN") that you created **this run** via
|
|
203
|
+
> `save_issue`. Double-check the ID you cite is the one you just filed (don't
|
|
204
|
+
> reference an unrelated ticket number). If you didn't file it, you didn't split —
|
|
205
|
+
> you left the ticket half-done.
|
|
206
|
+
|
|
207
|
+
**Dormant-behind-a-flag is the other answer — don't re-split it.** When the
|
|
208
|
+
gate-unverifiable part is already scoped (by the owner, or sensibly by you) to
|
|
209
|
+
ship *disabled in prod* — a feature flag that's OFF by default so the page/endpoint
|
|
210
|
+
returns 404/no-op until a human flips it after manual QA — build the **whole**
|
|
211
|
+
ticket and ship it dormant. The flag already contains the exact risk a split would
|
|
212
|
+
defer, so fragmenting a feature the owner deliberately designed to ship dormant
|
|
213
|
+
just creates churn. Make the gates verify the *OFF* state (flag off → 404/no-op,
|
|
214
|
+
zero public surface), unit-test the security-critical core (token/authz/rate-limit),
|
|
215
|
+
and hand off with the explicit human enable-then-QA step spelled out.
|
|
216
|
+
|
|
217
|
+
### Step 5 — Gate before shipping
|
|
218
|
+
Run **the target repo's resolved `build` commands** (`typecheck`, `build`, `test`) in
|
|
219
|
+
order (the repo's `build` else top-level `build`, §19; single-repo ⇒ top-level `build`,
|
|
220
|
+
unchanged). If any
|
|
221
|
+
fails: fix it, or if you can't, revert your change and **block** the ticket with
|
|
222
|
+
the failure output. **Never push or deploy a red build.** A broken `defaultBranch`
|
|
223
|
+
blocks every other agent — protect it.
|
|
224
|
+
|
|
225
|
+
Two gate traps that silently *under*-test — don't be fooled by a fast green:
|
|
226
|
+
- **A glob test command may run only the first file.** `tsx tests/*.test.ts`
|
|
227
|
+
(and bare `node`) treat extra args as `argv`, not entry points — the shell glob
|
|
228
|
+
expands, the runner executes *one* file and exits 0. Verify the command really
|
|
229
|
+
runs the whole intended suite; if it can't, iterate file-by-file. A green gate
|
|
230
|
+
that ran 1 of N tests is worse than no gate.
|
|
231
|
+
- **Don't run prod-mutating tests as a gate.** Some suites hit live infra (e.g.
|
|
232
|
+
files importing the real DB client / a prod `DATABASE_URL`, or that call out to
|
|
233
|
+
prod APIs). Running them as a gate can read or **mutate production**. Run the
|
|
234
|
+
safe subset (pure/unit, or against a disposable test env) plus the regression
|
|
235
|
+
test you added, and **report exactly which tests you skipped and why** — never
|
|
236
|
+
silently pass off a partial run as full coverage.
|
|
237
|
+
|
|
238
|
+
### Step 5.5 — Self-review the diff (autonomous gate, not a human wait)
|
|
239
|
+
After the build/test gates pass but **before** shipping, review your own diff —
|
|
240
|
+
this is the `autonomy:"full"` analogue of a code reviewer: a machine gate, never a
|
|
241
|
+
pause for a human.
|
|
242
|
+
|
|
243
|
+
1. **Spec compliance first.** Read your actual diff (`git diff` / staged changes)
|
|
244
|
+
line-by-line against the ticket's acceptance criteria — verify against the
|
|
245
|
+
**diff**, not your memory of what you intended (the two drift). Flag three
|
|
246
|
+
classes: MISSING (an AC not implemented), EXTRA / over-built (code not traceable
|
|
247
|
+
to any AC — scope creep), MISUNDERSTANDING (built the wrong thing). Any MISSING or
|
|
248
|
+
MISUNDERSTANDING → fix it before shipping; unjustified EXTRA → trim it (the ticket
|
|
249
|
+
is the contract).
|
|
250
|
+
2. **Code quality.** Run a code-review pass on the diff: if a `code-review`
|
|
251
|
+
skill/command is available in this environment, invoke it (effort `medium`);
|
|
252
|
+
otherwise do the equivalent yourself — scan the diff for correctness bugs,
|
|
253
|
+
security issues, and obvious regressions. **When `codex.review` is on (§24), also
|
|
254
|
+
run an independent Codex review** (`codex exec review -C <repo> < /dev/null`, or
|
|
255
|
+
`/codex:review`) as a *second model* on the diff — an **additional** advisory pass,
|
|
256
|
+
not a replacement for this self-review; run both. Treat **Critical/High** findings
|
|
257
|
+
(yours **or** Codex's) as
|
|
258
|
+
blocking: fix them this run if you can. If you can't, **revert the change** and
|
|
259
|
+
**block** the ticket (§9) tagged `Bail-shape: fix-exhausted` with the findings —
|
|
260
|
+
do **not** route code-fixing to PM/QA (they don't write code), and never wait for
|
|
261
|
+
a human; the next Dev fire (or the operator via `lessons.md`) retries.
|
|
262
|
+
(A Codex finding you judge a false positive isn't a veto — you may proceed, but say
|
|
263
|
+
so in the hand-off so the owner sees the disagreement.)
|
|
264
|
+
Medium/Low/nits are non-blocking — apply the cheap ones, note the rest in the
|
|
265
|
+
hand-off. **Before blocking `fix-exhausted`, if `codex.rescue` is on (§24)** you may
|
|
266
|
+
hand the stuck task to Codex for **one** pass (`/codex:rescue …` or a write-capable
|
|
267
|
+
`codex exec`); ship its patch **only** if it then passes these same Step-5 gates +
|
|
268
|
+
this self-review, else discard it and block as above. One rescue, not a retry loop
|
|
269
|
+
(it counts inside §9's 2-retry cap); re-read `git status` and stage only this
|
|
270
|
+
ticket's files (§7) — never blind-commit what Codex left in the tree.
|
|
271
|
+
3. **Skip for trivial diffs** — a docs-only / typo / single-line config change
|
|
272
|
+
doesn't need Stage 1 or the full review; note that you skipped it and why.
|
|
273
|
+
|
|
274
|
+
A self-review that surfaces a real Critical bug and blocks the ship is a SUCCESS,
|
|
275
|
+
not a failure — it protected `defaultBranch` and real users.
|
|
276
|
+
|
|
277
|
+
### Step 6 — Ship (per config)
|
|
278
|
+
Only after green gates:
|
|
279
|
+
- If `git.autoCommit`: make sure you're on **the target repo's resolved `defaultBranch`**
|
|
280
|
+
first (`repos[].defaultBranch` else `git.defaultBranch`, §19; single-repo unchanged);
|
|
281
|
+
if that branch doesn't exist in the repo, commit on the repo's current branch and note
|
|
282
|
+
it — never create a divergent branch. Commit with a message referencing the
|
|
283
|
+
ticket id (e.g. `feat(...): … (CIT-123)`), following the repo's commit
|
|
284
|
+
conventions and co-author trailer rules.
|
|
285
|
+
- If `git.autoPush`: push.
|
|
286
|
+
- If `git.autoDeploy` and **the target repo's resolved `deploy.command`** is set: run it,
|
|
287
|
+
and confirm it succeeded before moving on. (Resolved deploy = `repos[].deploy` else
|
|
288
|
+
top-level `deploy`, §19. A target repo that resolves to **no** deploy **skips deploy
|
|
289
|
+
entirely** and NEVER inherits another repo's `deploy.command`/`healthCheck`. Remember
|
|
290
|
+
there is no cross-repo deploy barrier — only per-repo or idempotent deploys are safe,
|
|
291
|
+
§19. Single-repo ⇒ top-level `deploy`, unchanged.) **The first time a run would deploy to production —
|
|
292
|
+
and any time you're overriding the configured `mode` mid-run (conventions §12) —
|
|
293
|
+
confirm the blast radius with the user before that first irreversible deploy,
|
|
294
|
+
unless they've already authorized hands-off shipping this session.** Once
|
|
295
|
+
authorized, proceed per config without re-asking on every ticket. **Under
|
|
296
|
+
`autonomy:"full"` (§12a) that authorization is standing — do not pause for a
|
|
297
|
+
confirmation even on the first prod deploy; ship per config and report the blast
|
|
298
|
+
radius as a fact.**
|
|
299
|
+
If any of these is `false`, stop at that step and note it in the report (a human
|
|
300
|
+
will take it from there).
|
|
301
|
+
|
|
302
|
+
### Step 6.5 — Post-deploy smoke + autonomous rollback
|
|
303
|
+
**Only if you actually deployed to prod this step** (`autoDeploy` ran a
|
|
304
|
+
`deploy.command`). Shipping unattended to prod means a green build can still break
|
|
305
|
+
prod at runtime (bad env var, a migration, a 500 on a core route) — so confirm prod
|
|
306
|
+
is alive before walking away:
|
|
307
|
+
1. **Smoke-check prod.** Run **the target repo's resolved `deploy.healthCheck`** if
|
|
308
|
+
config provides it (a URL that must return 2xx, or a command that must exit 0;
|
|
309
|
+
`repos[].deploy.healthCheck` else top-level, §19); otherwise GET `testEnv.baseUrl`
|
|
310
|
+
root and require a non-5xx response **only when the target repo IS the deployed
|
|
311
|
+
product surface** (a repo with no URL of its own has no `baseUrl` to hit — note the
|
|
312
|
+
§19 per-repo testEnv gap). If the target repo resolves to no deploy, you didn't deploy
|
|
313
|
+
— skip Step 6.5 entirely. Keep the check tiny and high-signal (the
|
|
314
|
+
homepage + at most one critical route) — this is a liveness gate, not a test run.
|
|
315
|
+
2. **On failure, retry once** (guard against a flaky cold start / transient blip).
|
|
316
|
+
3. **If it still fails, the deploy broke prod — roll back, don't leave it broken.**
|
|
317
|
+
Revert the commit(s) you shipped this run on **the target repo's resolved
|
|
318
|
+
`defaultBranch`** (`git revert --no-edit <commit(s)>` — revert *all* of them if the
|
|
319
|
+
ticket shipped more than one, e.g. a separate regression-test commit), push, re-run
|
|
320
|
+
**that repo's resolved `deploy.command`** (§19; single-repo ⇒ top-level
|
|
321
|
+
`defaultBranch`/`deploy`, unchanged), and confirm the smoke check now passes (prod
|
|
322
|
+
restored to the prior good state). Then reopen the ticket to `Todo` with `Bail-shape:
|
|
323
|
+
fix-exhausted` (§9), commenting what broke, the reverted commit sha, and that prod
|
|
324
|
+
was restored. **A reverted prod-breaker is a SUCCESS** — it protected real users;
|
|
325
|
+
the fix retries next fire. Never leave prod red waiting for a human.
|
|
326
|
+
4. **If smoke passes**, proceed to Step 7.
|
|
327
|
+
`save_issue`: `state:"In Review"`. Comment with what you changed, where (files /
|
|
328
|
+
routes), how you verified the gates, the commit/deploy ref if shipped, and a
|
|
329
|
+
pointer to the acceptance criteria so the owner (PM for features, QA for bugs)
|
|
330
|
+
can verify. **If you shipped only part of the ticket's ACs, the handoff MUST cite
|
|
331
|
+
the follow-up ticket ID you filed this run for the rest (see the split rule) — a
|
|
332
|
+
"split to a follow-up" with no filed ID is incomplete; file it now, then hand off.**
|
|
333
|
+
**Likewise, a `Bug`/`Feature` hand-off MUST state its coverage outcome
|
|
334
|
+
(conventions §15): the regression test you added this run, OR the `[coverage]`
|
|
335
|
+
follow-up ticket ID you filed this run, OR the exemption reason. "I'll add a test
|
|
336
|
+
later" with no test and no filed ticket is incomplete.**
|
|
337
|
+
Then loop to Step 1.
|
|
338
|
+
|
|
339
|
+
## 2. Guardrails
|
|
340
|
+
|
|
341
|
+
- **Cap tickets per run** (default ≤3 *shipped implementations*) — depth over
|
|
342
|
+
breadth; a correct shipped ticket beats five half-built ones. Cheap grooming
|
|
343
|
+
outcomes (a block or a duplicate) don't consume the cap.
|
|
344
|
+
- One ticket = one focused change/commit. Don't fold unrelated work together.
|
|
345
|
+
- **Self-review is a real gate, not theater (Step 5.5).** Verify the diff against
|
|
346
|
+
the ticket's ACs (catch MISSING/EXTRA/MISUNDERSTANDING) and run a code-review
|
|
347
|
+
pass; a Critical/High finding blocks the ship exactly like a red build. This is
|
|
348
|
+
the `autonomy:"full"` replacement for a human reviewer — it never waits for a
|
|
349
|
+
human, it decides and acts (fix, or block as `fix-exhausted`).
|
|
350
|
+
- If you touch shared infra that could affect other in-flight tickets, say so in
|
|
351
|
+
the report.
|
|
352
|
+
- Respect `mode` and the `git`/`deploy` flags exactly — they encode the user's
|
|
353
|
+
autonomy choice. When `autoDeploy` is on, you are shipping to real users; treat
|
|
354
|
+
the green-gate rule as inviolable.
|
|
355
|
+
- **Respect `autonomy` (conventions §12a).** Under `autonomy:"full"`, *decide and
|
|
356
|
+
act, don't ask* — make scoping/splitting/prioritization calls yourself and ship
|
|
357
|
+
per config; never pause for an interactive human confirmation (not even before
|
|
358
|
+
the first prod deploy). Caution stays the **method**: verify against the running
|
|
359
|
+
product, prefer additive/reversible/idempotent changes, gate on green. Genuine
|
|
360
|
+
*ticket-content* ambiguity still routes to PM/QA via a Linear **block** (§9) —
|
|
361
|
+
that's the async escalation path, not a human prompt. An irreversible prod op
|
|
362
|
+
(migration/backfill) you do **attended yourself** (pre/post-verify + the
|
|
363
|
+
records-only/safe command form), not by escalating. The only real stoppers are
|
|
364
|
+
**missing external inputs, not missing courage** — real third-party
|
|
365
|
+
credentials/contracts, spending money, legal sign-off, or a capability you lack
|
|
366
|
+
this run; report those as *blocked on an external prerequisite* (a fact) and
|
|
367
|
+
proceed with everything else.
|
|
368
|
+
|
|
369
|
+
## 3. Close with a report
|
|
370
|
+
|
|
371
|
+
End with: tickets picked, what shipped (with commit/deploy refs), what moved to
|
|
372
|
+
In Review, what you blocked (and why), what you marked Duplicate/Canceled, and any
|
|
373
|
+
build/deploy failures. If `mode:"dry-run"`, label it a preview.
|