relay-workflow 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,30 @@
1
+ # Python
2
+ __pycache__/
3
+ *.pyc
4
+ *.pyo
5
+ *.egg-info/
6
+ .venv/
7
+ venv/
8
+
9
+ # Build artifacts
10
+ build/
11
+ dist/
12
+
13
+ # Relay run artifacts. Ignored globally (scratch runs, /tmp clones, repo-root
14
+ # drops), but work-unit plans ARE the durable cross-run state: IMPLEMENT / SHIP
15
+ # persist_state() commits plan.json on the base branch, so anything living under
16
+ # .workspace/work/<unit>/ is tracked by default (allowlist below).
17
+ **/plan.json
18
+ **/plan.md
19
+ **/plan.*.json
20
+ **/plan.schema.json
21
+ **/plan.claude.raw.md
22
+ **/gh_projection.sh
23
+ **/projection.json
24
+ !.workspace/work/**/plan.json
25
+ !.workspace/work/**/plan.md
26
+ !.workspace/work/**/projection.json
27
+
28
+ # OS / editor
29
+ .DS_Store
30
+ *.swp
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Stefan Jansen
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,269 @@
1
+ Metadata-Version: 2.4
2
+ Name: relay-workflow
3
+ Version: 0.1.0
4
+ Summary: Cross-agent (Claude Code + Codex) planning/execution workflow — ALIGN, PLAN, PROJECT, IMPLEMENT, SHIP.
5
+ Project-URL: Homepage, https://github.com/applied-artificial-intelligence/relay
6
+ Project-URL: Repository, https://github.com/applied-artificial-intelligence/relay
7
+ Project-URL: Issues, https://github.com/applied-artificial-intelligence/relay/issues
8
+ Author-email: Stefan Jansen <stefan@applied-ai.com>
9
+ License-Expression: MIT
10
+ License-File: LICENSE
11
+ Keywords: agents,automation,claude-code,codex,github,workflow
12
+ Classifier: Development Status :: 3 - Alpha
13
+ Classifier: Environment :: Console
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: License :: OSI Approved :: MIT License
16
+ Classifier: Operating System :: MacOS
17
+ Classifier: Operating System :: POSIX
18
+ Classifier: Programming Language :: Python :: 3
19
+ Classifier: Programming Language :: Python :: 3.10
20
+ Classifier: Programming Language :: Python :: 3.11
21
+ Classifier: Programming Language :: Python :: 3.12
22
+ Classifier: Programming Language :: Python :: 3.13
23
+ Classifier: Topic :: Software Development
24
+ Classifier: Topic :: Software Development :: Version Control :: Git
25
+ Requires-Python: >=3.10
26
+ Provides-Extra: test
27
+ Requires-Dist: pytest>=8; extra == 'test'
28
+ Description-Content-Type: text/markdown
29
+
30
+ # Relay
31
+
32
+ **A cross-agent (Claude Code + Codex) planning/execution workflow.** Portable
33
+ operating discipline + continuity across agents: forcefully align on a spec, plan it
34
+ under a mechanically-bounded agent that *can't* run off and build, project it onto
35
+ GitHub (epic → milestones → issues), then implement and ship incrementally.
36
+
37
+ Local docs are the source of truth; **GitHub is the projection** (degrades to
38
+ local-only with no remote). One workflow, two host bindings — both converge on the
39
+ same `.workspace/work/<unit>/` files and `gh` projection.
40
+
41
+ > **Status**: alpha — pipeline complete (all 5 steps live-tested), packaged
42
+ > as `relay-workflow` (0.1.0). PyPI publish pending. Extracted 2026-05-28
43
+ > from `factory` work unit 024 (`claude_codex_interop_framework`). Successor
44
+ > to the Claude-only `claude-code-toolkit` (which remains active separately).
45
+ > Design rationale and the empirical host probes are in [`docs/`](docs/).
46
+
47
+ ## Pipeline (host-neutral)
48
+
49
+ Five steps built — pipeline complete:
50
+
51
+ ```
52
+ align/ → spec.md forceful interrogation (skill + Codex prompt; interactive)
53
+ relay_plan.py → plan.json/.md bounded plan-only agent run (can't implement)
54
+ relay_project.py → GitHub idempotent epic + milestones + issues (+ branches)
55
+ relay_implement.py → branch + PR headless agent implements one issue (Closes #)
56
+ ```
57
+
58
+ ## `align/` — spec interrogation (interactive)
59
+
60
+ `align/SKILL.md` (Claude skill) + `align/align.codex.md` (Codex `/align` prompt):
61
+ forcefully interrogate the user — one question at a time, challenge vague answers,
62
+ force out-of-scope exclusions — then write `<unit>/spec.md` from `spec-template.md`.
63
+ Front-loads all human clarification so the headless `plan` step never needs to ask.
64
+ Deploy: copy `SKILL.md` to a skills dir; copy `align.codex.md` to `~/.codex/prompts/align.md`.
65
+
66
+ ## `relay_plan.py` — plan step
67
+
68
+ `relay_plan.py`: turns an aligned **spec** into a structured, milestone/issue
69
+ **plan** without letting the coding agent run off into implementation. Host-neutral.
70
+
71
+ ```
72
+ spec.md → [bounded plan-only agent run] → plan.json + plan.md
73
+ ```
74
+
75
+ 1. Reads `<unit>/spec.md` (produced by `align`).
76
+ 2. Runs the host's **mechanically bounded "plan-but-don't-build" primitive**:
77
+ - **Claude**: `claude -p --permission-mode plan` — `ExitPlanMode` is terminal,
78
+ nothing executes. Plan extracted from `--output-format json` result (falls back
79
+ to newest `~/.claude/plans/*.md`).
80
+ - **Codex**: `codex exec --sandbox read-only --output-schema <schema> -o <file>` —
81
+ read-only sandbox can't write; schema **enforces** the milestone/issue JSON shape.
82
+ 3. Writes `<unit>/plan.json` (structured) + `<unit>/plan.md` (human/checklist).
83
+ 4. Emits `<unit>/gh_projection.sh` — a flat, non-idempotent `gh` script. **Superseded
84
+ by `relay_project.py`** for real use; kept only as a quick-look dry-run.
85
+
86
+ ## `relay_project.py` — project step (idempotent GitHub projection)
87
+
88
+ Mirrors `plan.json` onto GitHub: a tracking **epic** issue, **milestones**, one
89
+ **issue** per work item, with branch + `Closes #` wiring written back into `plan.json`.
90
+
91
+ ```
92
+ plan.json → [resolve repo] → epic + milestones + issues → plan.json (numbers+branches) + projection.json
93
+ ```
94
+
95
+ - **Idempotent**: matches existing milestones/issues by title (`gh api .../milestones`,
96
+ `gh issue list`) and reuses them — re-running never duplicates. Verified live against
97
+ a throwaway repo (4 milestones/21 issues; second run created 0 new objects).
98
+ - **Epic**: a `[epic] <objective>` tracking issue with a child task-list, refreshed
99
+ (not duplicated) on re-run. `--no-epic` to skip.
100
+ - **Branch/PR wiring**: each issue gets `branch: relay/m<n>-<mslug>/<islug>` and a
101
+ `Closes #<n>` contract recorded in `plan.json` for IMPLEMENT to consume.
102
+ - **Local-first**: no GitHub remote → exits local-only (add a remote and re-run to promote).
103
+ - **Missing labels** don't abort — the issue is created without them and logged.
104
+
105
+ ## `relay_implement.py` — implement step (headless, one issue → PR)
106
+
107
+ Consumes the projected `plan.json` and drives a **headless coding agent** to
108
+ implement ONE issue on its branch, then opens a PR that closes the issue. This is
109
+ the step that lets Relay execute its own backlog instead of a human hand-building.
110
+
111
+ ```
112
+ plan.json → [select issue] → branch off base → [headless agent] → push + PR (Closes #N) → plan.json (pr+state)
113
+ ```
114
+
115
+ - **One issue at a time** (workflow-design.md model). Default target = first
116
+ *pending* issue (state ∉ {implemented, merged, done}); `--issue <#n|title>` to pick.
117
+ - Reuses the `branch`/`closes` wiring the PROJECT step wrote into `plan.json`
118
+ (errors if absent — run `relay_project.py --execute` first).
119
+ - Assembles a **brief** from objective + milestone + issue body + `spec.md` (the
120
+ durable contract) + mechanical scope rules — host-neutral, byte-identical
121
+ across hosts — then runs the chosen host headless (Claude: `claude -p
122
+ --output-format json`; Codex: `codex exec` in a `workspace-write` sandbox).
123
+ Success is verified by `git` (commits on the branch), not by parsing stdout.
124
+ - **Idempotent**: reuses an existing branch / existing PR; if the agent leaves no
125
+ commits, records `state=no-op` and opens no PR. Local-only (no remote) →
126
+ `state=implemented-local`, branch left for later promotion.
127
+ - **Dry-run by default**: prints target, branch, the commands it would run, and the
128
+ full assembled brief. `--execute` to branch + spend an agent run + open the PR.
129
+ - **Host: `--host {claude,codex,auto}`** (default `auto`). Auto-detect mirrors
130
+ PLAN's precedence — `claude` on PATH wins, else `codex`, else a clear error.
131
+ Both hosts drive the same host-neutral brief and write the same `state`
132
+ vocabulary (`open` / `no-op` / `implemented-local`).
133
+ - **Host equivalence**: Codex runs under a `workspace-write` sandbox so it can
134
+ edit and commit the working tree. (PLAN uses `read-only` for a *different*
135
+ reason — to mechanically bound the planner so it can't build; IMPLEMENT must
136
+ be able to write.) For fully-autonomous runs that must run tests and commit
137
+ without prompts, `codex --dangerously-bypass-approvals-and-sandbox` is the
138
+ analogue of Claude's `--permission-mode bypassPermissions` (default is the
139
+ safer `acceptEdits`).
140
+
141
+ ## `relay_ship.py` — ship step (incremental: PR merge → issue/milestone done)
142
+
143
+ Consumes the `pr`/`state` IMPLEMENT wrote into `plan.json` and **closes the loop
144
+ incrementally**. There is no terminal "ship" event in workflow-design.md — SHIP
145
+ runs (and re-runs) as PRs become mergeable: PR merge → issue done (auto via
146
+ `Closes #N`) → milestone done (when all its issues are merged) → project done
147
+ (when every milestone closes).
148
+
149
+ ```
150
+ plan.json → [pick PR] → check mergeable + checks → gh pr merge → bubble-up → plan.json (merged/done)
151
+ ```
152
+
153
+ - **One PR per run by default** (mirrors IMPLEMENT's one-at-a-time); `--all` to
154
+ drain every mergeable PR in one pass.
155
+ - Merge gate: `mergeable=MERGEABLE` AND checks not `FAILING`. Pending checks
156
+ block by default; `--auto` enables GitHub auto-merge so the PR lands when
157
+ checks pass.
158
+ - `--admin` bypasses branch protections on repos without CI.
159
+ - `mergeable=UNKNOWN` (GitHub's lazy-compute response on first probe) is
160
+ auto-retried once.
161
+ - After each pass, bubbles up milestones whose issues are all done and sets
162
+ `plan.state = "shipped"` when every milestone closes.
163
+ - `gh pr merge` lands on GitHub leaving the local base behind — SHIP fetches +
164
+ fast-forwards before committing state so the push succeeds the first time.
165
+ - Already-merged PRs are recognised and reconciled without re-merging
166
+ (re-running is safe + idempotent).
167
+
168
+ ## Install
169
+
170
+ Once published to PyPI:
171
+ ```bash
172
+ uv tool install relay-workflow # recommended (isolated env, like pipx)
173
+ # or
174
+ pipx install relay-workflow # legacy-compatible alternative
175
+ ```
176
+
177
+ Until then (or for development):
178
+ ```bash
179
+ git clone https://github.com/applied-artificial-intelligence/relay
180
+ cd relay
181
+ uv tool install --force . # install this checkout
182
+ ```
183
+
184
+ Both paths put a `relay` binary on your `PATH`. Requires Python ≥ 3.10 and the
185
+ `gh` CLI for the steps that touch GitHub.
186
+
187
+ ## Usage
188
+
189
+ ```bash
190
+ # align: invoke the Claude skill / Codex /align prompt (interactive), then:
191
+ relay plan <work-unit-dir> # auto-detect host, plan only
192
+ relay plan <unit> --host codex # force host
193
+ relay project <unit> # dry-run GitHub projection
194
+ relay project <unit> --execute # create/refresh epic+milestones+issues
195
+ relay implement <unit> # dry-run: show target issue + brief
196
+ relay implement <unit> --execute # implement first pending issue → PR (auto-detect host)
197
+ relay implement <unit> --host codex --execute # force Codex as the headless coding host
198
+ relay implement <unit> --issue 7 --execute --permission-mode bypassPermissions
199
+ relay ship <unit> # dry-run: which PRs would merge
200
+ relay ship <unit> --all --execute # merge every mergeable PR, bubble up
201
+ relay ship <unit> --execute --auto # enable auto-merge (wait on checks)
202
+ ```
203
+
204
+ `relay --help` lists steps; `relay <step> --help` lists step-specific options.
205
+ `python -m relay <step>` works too (handy on systems where `pipx`/`uv`-installed
206
+ scripts aren't on `PATH` yet).
207
+
208
+ ## Verified (2026-05-27)
209
+
210
+ | Host | Mechanism | Output | Time | Repo touched? |
211
+ |---|---|---|---|---|
212
+ | Codex (gpt-5.5) | PLAN: read-only + enforced schema | 4 milestones / 21 issues | 39s | no |
213
+ | Claude (2.1.152) | PLAN: `--permission-mode plan` + JSON extract | 2 milestones / 10 issues | 29s | no |
214
+ | Claude IMPLEMENT | `claude -p` in working tree | 2 issues → 2 PRs (`Closes #`) | — | yes (commits on branch) |
215
+ | Codex IMPLEMENT | `codex exec` + `workspace-write` sandbox | branch + PR (`Closes #`) — _pending live verification_ | — | yes (commits on branch) |
216
+
217
+ Both leave the repo untouched (the anti-runaway guarantee is mechanical, not
218
+ behavioral) and converge on the same `plan.json` + `gh_projection.sh`.
219
+
220
+ ## Known gaps (prototype, not production)
221
+
222
+ - **Codex schema** requires `additionalProperties:false` + every property in
223
+ `required` (OpenAI structured-output rule) — handled in `PLAN_SCHEMA`.
224
+ - Claude path is **not** schema-enforced — relies on the model returning JSON; a
225
+ structuring/repair pass would harden it.
226
+ - No rolling-wave support yet (re-plan a single milestone into issues later).
227
+ - GitHub **Projects v2** board (Status/Area/Type fields, à la tradesharp) is not
228
+ created — `relay_project.py` does epic issue + milestones only (CLI-friendly, no
229
+ GraphQL). Project-v2 wiring is the next projection increment.
230
+ - Issue-level idempotency matches on **exact title**; renaming an issue in the spec
231
+ creates a new one rather than updating the old. Acceptable for append-style planning.
232
+ - **IMPLEMENT `--execute` live-tested 2026-05-28** (throwaway `relay-smoketest`, 2 issues →
233
+ 2 PRs via `claude -p`, both in-scope with tests + correct `Closes #`). Idempotency holds:
234
+ skips issues that already have a PR (`--redo` to force), state is committed to the base
235
+ branch (survives between runs, leaves a clean tree), dirty-guard ignores untracked agent
236
+ artifacts. Default selection walks pending issues one at a time.
237
+ - **Parallel issues on one file conflict at merge time**: each issue branches off base
238
+ independently, so two issues editing the same file produce mergeable-but-conflicting PRs.
239
+ That's a SHIP/rebase concern, not a runner bug.
240
+ - **SHIP `--execute` live-tested 2026-05-31** (same `relay-smoketest` repo): merged PR #4
241
+ (`Closes #1` auto-closed issue #1), correctly detected the parallel-PR conflict on PR #5,
242
+ bubble-up left milestone open (1 of 2 issues done). Already-merged reconciliation verified
243
+ on a second pass after a hard reset. Fixed two bugs found during the live test:
244
+ (a) `gh pr view` first probe returning `mergeable=UNKNOWN` — auto-retry once;
245
+ (b) post-merge local base was stale, so state push failed — fetch + fast-forward before
246
+ committing state.
247
+
248
+ ## Repo layout
249
+
250
+ ```
251
+ align/ ALIGN step — Claude skill + Codex /align prompt + spec template
252
+ src/relay/ Python package — installs as the `relay` CLI dispatcher
253
+ cli.py subcommand dispatcher (relay plan|project|implement|ship)
254
+ plan.py PLAN step — bounded plan-only run → plan.json/.md
255
+ project.py PROJECT step — idempotent GitHub epic/milestones/issues
256
+ implement.py IMPLEMENT step — headless agent implements one issue → PR (Closes #)
257
+ ship.py SHIP step — incremental PR merge → issue/milestone bubble-up
258
+ pyproject.toml hatchling build; entry-point `relay = "relay.cli:main"`
259
+ docs/ workflow-design.md, planmode-probe.md, portal-briefing.md, PROPOSAL.md
260
+ .workspace/ agent state (memory/transitions/work), interop convention
261
+ ```
262
+
263
+ ## Design & provenance
264
+
265
+ - [`docs/workflow-design.md`](docs/workflow-design.md) — canonical workflow spec (ALIGN→PLAN→PROJECT→IMPLEMENT→SHIP).
266
+ - [`docs/planmode-probe.md`](docs/planmode-probe.md) — Claude vs Codex plan-mode probe; the empirical basis for the bounded-planner design.
267
+ - [`docs/portal-briefing.md`](docs/portal-briefing.md) — briefing for the website Agent Lab portal.
268
+ - Next steps (from design): self-host Relay (run the full ALIGN→…→SHIP cycle on Relay's
269
+ own backlog), then rolling-wave re-planning + Projects v2 board.
@@ -0,0 +1,240 @@
1
+ # Relay
2
+
3
+ **A cross-agent (Claude Code + Codex) planning/execution workflow.** Portable
4
+ operating discipline + continuity across agents: forcefully align on a spec, plan it
5
+ under a mechanically-bounded agent that *can't* run off and build, project it onto
6
+ GitHub (epic → milestones → issues), then implement and ship incrementally.
7
+
8
+ Local docs are the source of truth; **GitHub is the projection** (degrades to
9
+ local-only with no remote). One workflow, two host bindings — both converge on the
10
+ same `.workspace/work/<unit>/` files and `gh` projection.
11
+
12
+ > **Status**: alpha — pipeline complete (all 5 steps live-tested), packaged
13
+ > as `relay-workflow` (0.1.0). PyPI publish pending. Extracted 2026-05-28
14
+ > from `factory` work unit 024 (`claude_codex_interop_framework`). Successor
15
+ > to the Claude-only `claude-code-toolkit` (which remains active separately).
16
+ > Design rationale and the empirical host probes are in [`docs/`](docs/).
17
+
18
+ ## Pipeline (host-neutral)
19
+
20
+ Five steps built — pipeline complete:
21
+
22
+ ```
23
+ align/ → spec.md forceful interrogation (skill + Codex prompt; interactive)
24
+ relay_plan.py → plan.json/.md bounded plan-only agent run (can't implement)
25
+ relay_project.py → GitHub idempotent epic + milestones + issues (+ branches)
26
+ relay_implement.py → branch + PR headless agent implements one issue (Closes #)
27
+ ```
28
+
29
+ ## `align/` — spec interrogation (interactive)
30
+
31
+ `align/SKILL.md` (Claude skill) + `align/align.codex.md` (Codex `/align` prompt):
32
+ forcefully interrogate the user — one question at a time, challenge vague answers,
33
+ force out-of-scope exclusions — then write `<unit>/spec.md` from `spec-template.md`.
34
+ Front-loads all human clarification so the headless `plan` step never needs to ask.
35
+ Deploy: copy `SKILL.md` to a skills dir; copy `align.codex.md` to `~/.codex/prompts/align.md`.
36
+
37
+ ## `relay_plan.py` — plan step
38
+
39
+ `relay_plan.py`: turns an aligned **spec** into a structured, milestone/issue
40
+ **plan** without letting the coding agent run off into implementation. Host-neutral.
41
+
42
+ ```
43
+ spec.md → [bounded plan-only agent run] → plan.json + plan.md
44
+ ```
45
+
46
+ 1. Reads `<unit>/spec.md` (produced by `align`).
47
+ 2. Runs the host's **mechanically bounded "plan-but-don't-build" primitive**:
48
+ - **Claude**: `claude -p --permission-mode plan` — `ExitPlanMode` is terminal,
49
+ nothing executes. Plan extracted from `--output-format json` result (falls back
50
+ to newest `~/.claude/plans/*.md`).
51
+ - **Codex**: `codex exec --sandbox read-only --output-schema <schema> -o <file>` —
52
+ read-only sandbox can't write; schema **enforces** the milestone/issue JSON shape.
53
+ 3. Writes `<unit>/plan.json` (structured) + `<unit>/plan.md` (human/checklist).
54
+ 4. Emits `<unit>/gh_projection.sh` — a flat, non-idempotent `gh` script. **Superseded
55
+ by `relay_project.py`** for real use; kept only as a quick-look dry-run.
56
+
57
+ ## `relay_project.py` — project step (idempotent GitHub projection)
58
+
59
+ Mirrors `plan.json` onto GitHub: a tracking **epic** issue, **milestones**, one
60
+ **issue** per work item, with branch + `Closes #` wiring written back into `plan.json`.
61
+
62
+ ```
63
+ plan.json → [resolve repo] → epic + milestones + issues → plan.json (numbers+branches) + projection.json
64
+ ```
65
+
66
+ - **Idempotent**: matches existing milestones/issues by title (`gh api .../milestones`,
67
+ `gh issue list`) and reuses them — re-running never duplicates. Verified live against
68
+ a throwaway repo (4 milestones/21 issues; second run created 0 new objects).
69
+ - **Epic**: a `[epic] <objective>` tracking issue with a child task-list, refreshed
70
+ (not duplicated) on re-run. `--no-epic` to skip.
71
+ - **Branch/PR wiring**: each issue gets `branch: relay/m<n>-<mslug>/<islug>` and a
72
+ `Closes #<n>` contract recorded in `plan.json` for IMPLEMENT to consume.
73
+ - **Local-first**: no GitHub remote → exits local-only (add a remote and re-run to promote).
74
+ - **Missing labels** don't abort — the issue is created without them and logged.
75
+
76
+ ## `relay_implement.py` — implement step (headless, one issue → PR)
77
+
78
+ Consumes the projected `plan.json` and drives a **headless coding agent** to
79
+ implement ONE issue on its branch, then opens a PR that closes the issue. This is
80
+ the step that lets Relay execute its own backlog instead of a human hand-building.
81
+
82
+ ```
83
+ plan.json → [select issue] → branch off base → [headless agent] → push + PR (Closes #N) → plan.json (pr+state)
84
+ ```
85
+
86
+ - **One issue at a time** (workflow-design.md model). Default target = first
87
+ *pending* issue (state ∉ {implemented, merged, done}); `--issue <#n|title>` to pick.
88
+ - Reuses the `branch`/`closes` wiring the PROJECT step wrote into `plan.json`
89
+ (errors if absent — run `relay_project.py --execute` first).
90
+ - Assembles a **brief** from objective + milestone + issue body + `spec.md` (the
91
+ durable contract) + mechanical scope rules — host-neutral, byte-identical
92
+ across hosts — then runs the chosen host headless (Claude: `claude -p
93
+ --output-format json`; Codex: `codex exec` in a `workspace-write` sandbox).
94
+ Success is verified by `git` (commits on the branch), not by parsing stdout.
95
+ - **Idempotent**: reuses an existing branch / existing PR; if the agent leaves no
96
+ commits, records `state=no-op` and opens no PR. Local-only (no remote) →
97
+ `state=implemented-local`, branch left for later promotion.
98
+ - **Dry-run by default**: prints target, branch, the commands it would run, and the
99
+ full assembled brief. `--execute` to branch + spend an agent run + open the PR.
100
+ - **Host: `--host {claude,codex,auto}`** (default `auto`). Auto-detect mirrors
101
+ PLAN's precedence — `claude` on PATH wins, else `codex`, else a clear error.
102
+ Both hosts drive the same host-neutral brief and write the same `state`
103
+ vocabulary (`open` / `no-op` / `implemented-local`).
104
+ - **Host equivalence**: Codex runs under a `workspace-write` sandbox so it can
105
+ edit and commit the working tree. (PLAN uses `read-only` for a *different*
106
+ reason — to mechanically bound the planner so it can't build; IMPLEMENT must
107
+ be able to write.) For fully-autonomous runs that must run tests and commit
108
+ without prompts, `codex --dangerously-bypass-approvals-and-sandbox` is the
109
+ analogue of Claude's `--permission-mode bypassPermissions` (default is the
110
+ safer `acceptEdits`).
111
+
112
+ ## `relay_ship.py` — ship step (incremental: PR merge → issue/milestone done)
113
+
114
+ Consumes the `pr`/`state` IMPLEMENT wrote into `plan.json` and **closes the loop
115
+ incrementally**. There is no terminal "ship" event in workflow-design.md — SHIP
116
+ runs (and re-runs) as PRs become mergeable: PR merge → issue done (auto via
117
+ `Closes #N`) → milestone done (when all its issues are merged) → project done
118
+ (when every milestone closes).
119
+
120
+ ```
121
+ plan.json → [pick PR] → check mergeable + checks → gh pr merge → bubble-up → plan.json (merged/done)
122
+ ```
123
+
124
+ - **One PR per run by default** (mirrors IMPLEMENT's one-at-a-time); `--all` to
125
+ drain every mergeable PR in one pass.
126
+ - Merge gate: `mergeable=MERGEABLE` AND checks not `FAILING`. Pending checks
127
+ block by default; `--auto` enables GitHub auto-merge so the PR lands when
128
+ checks pass.
129
+ - `--admin` bypasses branch protections on repos without CI.
130
+ - `mergeable=UNKNOWN` (GitHub's lazy-compute response on first probe) is
131
+ auto-retried once.
132
+ - After each pass, bubbles up milestones whose issues are all done and sets
133
+ `plan.state = "shipped"` when every milestone closes.
134
+ - `gh pr merge` lands on GitHub leaving the local base behind — SHIP fetches +
135
+ fast-forwards before committing state so the push succeeds the first time.
136
+ - Already-merged PRs are recognised and reconciled without re-merging
137
+ (re-running is safe + idempotent).
138
+
139
+ ## Install
140
+
141
+ Once published to PyPI:
142
+ ```bash
143
+ uv tool install relay-workflow # recommended (isolated env, like pipx)
144
+ # or
145
+ pipx install relay-workflow # legacy-compatible alternative
146
+ ```
147
+
148
+ Until then (or for development):
149
+ ```bash
150
+ git clone https://github.com/applied-artificial-intelligence/relay
151
+ cd relay
152
+ uv tool install --force . # install this checkout
153
+ ```
154
+
155
+ Both paths put a `relay` binary on your `PATH`. Requires Python ≥ 3.10 and the
156
+ `gh` CLI for the steps that touch GitHub.
157
+
158
+ ## Usage
159
+
160
+ ```bash
161
+ # align: invoke the Claude skill / Codex /align prompt (interactive), then:
162
+ relay plan <work-unit-dir> # auto-detect host, plan only
163
+ relay plan <unit> --host codex # force host
164
+ relay project <unit> # dry-run GitHub projection
165
+ relay project <unit> --execute # create/refresh epic+milestones+issues
166
+ relay implement <unit> # dry-run: show target issue + brief
167
+ relay implement <unit> --execute # implement first pending issue → PR (auto-detect host)
168
+ relay implement <unit> --host codex --execute # force Codex as the headless coding host
169
+ relay implement <unit> --issue 7 --execute --permission-mode bypassPermissions
170
+ relay ship <unit> # dry-run: which PRs would merge
171
+ relay ship <unit> --all --execute # merge every mergeable PR, bubble up
172
+ relay ship <unit> --execute --auto # enable auto-merge (wait on checks)
173
+ ```
174
+
175
+ `relay --help` lists steps; `relay <step> --help` lists step-specific options.
176
+ `python -m relay <step>` works too (handy on systems where `pipx`/`uv`-installed
177
+ scripts aren't on `PATH` yet).
178
+
179
+ ## Verified (2026-05-27)
180
+
181
+ | Host | Mechanism | Output | Time | Repo touched? |
182
+ |---|---|---|---|---|
183
+ | Codex (gpt-5.5) | PLAN: read-only + enforced schema | 4 milestones / 21 issues | 39s | no |
184
+ | Claude (2.1.152) | PLAN: `--permission-mode plan` + JSON extract | 2 milestones / 10 issues | 29s | no |
185
+ | Claude IMPLEMENT | `claude -p` in working tree | 2 issues → 2 PRs (`Closes #`) | — | yes (commits on branch) |
186
+ | Codex IMPLEMENT | `codex exec` + `workspace-write` sandbox | branch + PR (`Closes #`) — _pending live verification_ | — | yes (commits on branch) |
187
+
188
+ Both leave the repo untouched (the anti-runaway guarantee is mechanical, not
189
+ behavioral) and converge on the same `plan.json` + `gh_projection.sh`.
190
+
191
+ ## Known gaps (prototype, not production)
192
+
193
+ - **Codex schema** requires `additionalProperties:false` + every property in
194
+ `required` (OpenAI structured-output rule) — handled in `PLAN_SCHEMA`.
195
+ - Claude path is **not** schema-enforced — relies on the model returning JSON; a
196
+ structuring/repair pass would harden it.
197
+ - No rolling-wave support yet (re-plan a single milestone into issues later).
198
+ - GitHub **Projects v2** board (Status/Area/Type fields, à la tradesharp) is not
199
+ created — `relay_project.py` does epic issue + milestones only (CLI-friendly, no
200
+ GraphQL). Project-v2 wiring is the next projection increment.
201
+ - Issue-level idempotency matches on **exact title**; renaming an issue in the spec
202
+ creates a new one rather than updating the old. Acceptable for append-style planning.
203
+ - **IMPLEMENT `--execute` live-tested 2026-05-28** (throwaway `relay-smoketest`, 2 issues →
204
+ 2 PRs via `claude -p`, both in-scope with tests + correct `Closes #`). Idempotency holds:
205
+ skips issues that already have a PR (`--redo` to force), state is committed to the base
206
+ branch (survives between runs, leaves a clean tree), dirty-guard ignores untracked agent
207
+ artifacts. Default selection walks pending issues one at a time.
208
+ - **Parallel issues on one file conflict at merge time**: each issue branches off base
209
+ independently, so two issues editing the same file produce mergeable-but-conflicting PRs.
210
+ That's a SHIP/rebase concern, not a runner bug.
211
+ - **SHIP `--execute` live-tested 2026-05-31** (same `relay-smoketest` repo): merged PR #4
212
+ (`Closes #1` auto-closed issue #1), correctly detected the parallel-PR conflict on PR #5,
213
+ bubble-up left milestone open (1 of 2 issues done). Already-merged reconciliation verified
214
+ on a second pass after a hard reset. Fixed two bugs found during the live test:
215
+ (a) `gh pr view` first probe returning `mergeable=UNKNOWN` — auto-retry once;
216
+ (b) post-merge local base was stale, so state push failed — fetch + fast-forward before
217
+ committing state.
218
+
219
+ ## Repo layout
220
+
221
+ ```
222
+ align/ ALIGN step — Claude skill + Codex /align prompt + spec template
223
+ src/relay/ Python package — installs as the `relay` CLI dispatcher
224
+ cli.py subcommand dispatcher (relay plan|project|implement|ship)
225
+ plan.py PLAN step — bounded plan-only run → plan.json/.md
226
+ project.py PROJECT step — idempotent GitHub epic/milestones/issues
227
+ implement.py IMPLEMENT step — headless agent implements one issue → PR (Closes #)
228
+ ship.py SHIP step — incremental PR merge → issue/milestone bubble-up
229
+ pyproject.toml hatchling build; entry-point `relay = "relay.cli:main"`
230
+ docs/ workflow-design.md, planmode-probe.md, portal-briefing.md, PROPOSAL.md
231
+ .workspace/ agent state (memory/transitions/work), interop convention
232
+ ```
233
+
234
+ ## Design & provenance
235
+
236
+ - [`docs/workflow-design.md`](docs/workflow-design.md) — canonical workflow spec (ALIGN→PLAN→PROJECT→IMPLEMENT→SHIP).
237
+ - [`docs/planmode-probe.md`](docs/planmode-probe.md) — Claude vs Codex plan-mode probe; the empirical basis for the bounded-planner design.
238
+ - [`docs/portal-briefing.md`](docs/portal-briefing.md) — briefing for the website Agent Lab portal.
239
+ - Next steps (from design): self-host Relay (run the full ALIGN→…→SHIP cycle on Relay's
240
+ own backlog), then rolling-wave re-planning + Projects v2 board.
@@ -0,0 +1,59 @@
1
+ ---
2
+ name: align
3
+ description: This skill should be used at the START of any non-trivial work unit, when the user asks to "align", "spec this out", "define the work", or before planning/implementing. Forcefully interrogates the user to produce a complete spec.md (WHAT/WHY end-state) that the headless `plan` step can consume without further questions.
4
+ user-invocable: true
5
+ ---
6
+
7
+ # align — forceful spec interrogation
8
+
9
+ You are running the **ALIGN** step of the Relay workflow. Your job is to extract a
10
+ *complete, verifiable specification* of the desired end-state — the WHAT and WHY,
11
+ **never the HOW** — and write it to `spec.md` in the work unit.
12
+
13
+ The spec is the durable contract. A separate, headless `plan` step turns it into
14
+ milestones/issues later and **must not need to ask the user anything** — so every
15
+ ambiguity has to die here. The plan step is only as good as the spec you produce.
16
+
17
+ ## Why this is forceful
18
+
19
+ The old `explore` step failed because it was too gentle: it accepted vague answers
20
+ and moved on, so goal discovery was weak and plans drifted. Do NOT repeat that.
21
+ You interrogate. You challenge. You do not proceed on assumptions.
22
+
23
+ ## Rules of engagement
24
+
25
+ 1. **One question at a time.** Never batch questions — the user answers worse and
26
+ you challenge worse. Ask, get the answer, react, ask the next.
27
+ 2. **Challenge every vague answer.** "Better performance" → "Measured how? What's
28
+ the current number, what's the target, who decides it's enough?" Do not accept
29
+ an answer you couldn't write a pass/fail test against.
30
+ 3. **Surface the implicit.** Ask what's deliberately OUT of scope. Ask what failure
31
+ looks like. Ask what already exists that this must not break.
32
+ 4. **Do NOT design.** If you catch yourself proposing an implementation, stop and
33
+ convert it into a requirement ("the result must X") instead.
34
+ 5. **Minimum question floor.** Ask at least **6** substantive questions before you
35
+ offer to write the spec. Hard requirements, acceptance criteria, scope
36
+ boundaries, and constraints are non-negotiable coverage. **Do NOT write spec.md
37
+ until every section below can be filled with something verifiable.**
38
+ 6. **Summarize and confirm.** Before writing, play back the spec verbally in 5-8
39
+ bullets and get an explicit "yes" or corrections. Then write the file.
40
+
41
+ ## Question backbone (adapt; don't read aloud as a checklist)
42
+
43
+ - **Objective**: In one sentence, what is true when this is done that isn't true now?
44
+ - **Why now**: What breaks / what's lost if this isn't built? Who feels it?
45
+ - **Acceptance**: How will we *verify* it's done? Name concrete checks/tests/numbers.
46
+ - **In scope / Out of scope**: What's explicitly excluded? (Force at least one exclusion.)
47
+ - **Constraints**: Hard limits — compat, perf, deadlines, tech that must/can't be used.
48
+ - **Existing surface**: What must NOT break? What does this touch?
49
+ - **Unknowns**: What's still genuinely uncertain? (Goes in Open Questions, must be
50
+ empty or explicitly deferred before the spec is "ready".)
51
+
52
+ ## Output
53
+
54
+ Write `<work-unit>/spec.md` using `spec-template.md` (sibling file) as the structure.
55
+ Fill every section. Leave NO `TODO`/`???` placeholders — if something is truly
56
+ undecided, state the decision rule and put it under **Open Questions** with an owner.
57
+
58
+ End by telling the user: spec is written; run the `plan` step
59
+ (`python3 relay_plan.py <work-unit>`) to decompose it into milestones/issues.
@@ -0,0 +1,33 @@
1
+ # /align (Codex prompt)
2
+
3
+ Deploy to `~/.codex/prompts/align.md` so it's invocable as `/align` in Codex.
4
+ This is the Codex binding of the Relay ALIGN step — same contract as the Claude
5
+ `align` skill (sibling `SKILL.md`); both converge on the same `spec.md`.
6
+
7
+ ---
8
+
9
+ Run the ALIGN step. Extract a complete, verifiable specification of the desired
10
+ end-state — the WHAT and WHY, never the HOW — and write it to `spec.md` in the
11
+ work unit the user names (under `.workspace/work/`).
12
+
13
+ A separate headless `plan` step will turn this spec into milestones and issues and
14
+ **must not need to ask the user anything**, so kill every ambiguity now.
15
+
16
+ Rules:
17
+ - Ask **one question at a time**. React to each answer before asking the next.
18
+ - **Challenge vague answers.** Do not accept anything you couldn't write a pass/fail
19
+ test against ("better performance" → measured how, current vs target number).
20
+ - Force at least one explicit **out-of-scope** exclusion. Ask what must NOT break.
21
+ - **Do not design.** Convert any implementation idea into a requirement instead.
22
+ - Ask at least **6** substantive questions before offering to write the spec.
23
+ - Before writing, play the spec back in 5-8 bullets and get an explicit "yes".
24
+
25
+ Write `<work-unit>/spec.md` with these sections, every one filled with verifiable
26
+ content (no TODO/??? placeholders):
27
+
28
+ Objective · Why · Acceptance criteria (numbered, each verifiable) · In scope ·
29
+ Out of scope (≥1 exclusion) · Constraints · Existing surface (must not break) ·
30
+ Open questions (each with a decision rule + owner, or marked deferred).
31
+
32
+ Finish by telling the user to run `python3 relay_plan.py <work-unit>` to decompose
33
+ the spec into milestones/issues.