PyPI - relay-workflow - Versions diffs - 0.1.0__tar.gz - Mend

relay-workflow 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19) hide show

relay_workflow-0.1.0/.gitignore +30 -0
relay_workflow-0.1.0/LICENSE +21 -0
relay_workflow-0.1.0/PKG-INFO +269 -0
relay_workflow-0.1.0/README.md +240 -0
relay_workflow-0.1.0/align/SKILL.md +59 -0
relay_workflow-0.1.0/align/align.codex.md +33 -0
relay_workflow-0.1.0/align/spec-template.md +52 -0
relay_workflow-0.1.0/docs/PROPOSAL.md +257 -0
relay_workflow-0.1.0/docs/planmode-probe.md +134 -0
relay_workflow-0.1.0/docs/portal-briefing.md +124 -0
relay_workflow-0.1.0/docs/workflow-design.md +147 -0
relay_workflow-0.1.0/pyproject.toml +60 -0
relay_workflow-0.1.0/src/relay/__init__.py +2 -0
relay_workflow-0.1.0/src/relay/__main__.py +5 -0
relay_workflow-0.1.0/src/relay/cli.py +55 -0
relay_workflow-0.1.0/src/relay/implement.py +374 -0
relay_workflow-0.1.0/src/relay/plan.py +236 -0
relay_workflow-0.1.0/src/relay/project.py +227 -0
relay_workflow-0.1.0/src/relay/ship.py +304 -0

relay_workflow-0.1.0/.gitignore ADDED Viewed

@@ -0,0 +1,30 @@
+# Python
+__pycache__/
+*.pyc
+*.pyo
+*.egg-info/
+.venv/
+venv/
+# Build artifacts
+build/
+dist/
+# Relay run artifacts. Ignored globally (scratch runs, /tmp clones, repo-root
+# drops), but work-unit plans ARE the durable cross-run state: IMPLEMENT / SHIP
+# persist_state() commits plan.json on the base branch, so anything living under
+# .workspace/work/<unit>/ is tracked by default (allowlist below).
+**/plan.json
+**/plan.md
+**/plan.*.json
+**/plan.schema.json
+**/plan.claude.raw.md
+**/gh_projection.sh
+**/projection.json
+!.workspace/work/**/plan.json
+!.workspace/work/**/plan.md
+!.workspace/work/**/projection.json
+# OS / editor
+.DS_Store
+*.swp

relay_workflow-0.1.0/LICENSE ADDED Viewed

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Stefan Jansen
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

relay_workflow-0.1.0/PKG-INFO ADDED Viewed

@@ -0,0 +1,269 @@
+Metadata-Version: 2.4
+Name: relay-workflow
+Version: 0.1.0
+Summary: Cross-agent (Claude Code + Codex) planning/execution workflow — ALIGN, PLAN, PROJECT, IMPLEMENT, SHIP.
+Project-URL: Homepage, https://github.com/applied-artificial-intelligence/relay
+Project-URL: Repository, https://github.com/applied-artificial-intelligence/relay
+Project-URL: Issues, https://github.com/applied-artificial-intelligence/relay/issues
+Author-email: Stefan Jansen <stefan@applied-ai.com>
+License-Expression: MIT
+License-File: LICENSE
+Keywords: agents,automation,claude-code,codex,github,workflow
+Classifier: Development Status :: 3 - Alpha
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: MacOS
+Classifier: Operating System :: POSIX
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Software Development
+Classifier: Topic :: Software Development :: Version Control :: Git
+Requires-Python: >=3.10
+Provides-Extra: test
+Requires-Dist: pytest>=8; extra == 'test'
+Description-Content-Type: text/markdown
+# Relay
+**A cross-agent (Claude Code + Codex) planning/execution workflow.** Portable
+operating discipline + continuity across agents: forcefully align on a spec, plan it
+under a mechanically-bounded agent that *can't* run off and build, project it onto
+GitHub (epic → milestones → issues), then implement and ship incrementally.
+Local docs are the source of truth; **GitHub is the projection** (degrades to
+local-only with no remote). One workflow, two host bindings — both converge on the
+same `.workspace/work/<unit>/` files and `gh` projection.
+> **Status**: alpha — pipeline complete (all 5 steps live-tested), packaged
+> as `relay-workflow` (0.1.0). PyPI publish pending. Extracted 2026-05-28
+> from `factory` work unit 024 (`claude_codex_interop_framework`). Successor
+> to the Claude-only `claude-code-toolkit` (which remains active separately).
+> Design rationale and the empirical host probes are in [`docs/`](docs/).
+## Pipeline (host-neutral)
+Five steps built — pipeline complete:
+```
+align/    →  spec.md       forceful interrogation (skill + Codex prompt; interactive)
+relay_plan.py      →  plan.json/.md   bounded plan-only agent run (can't implement)
+relay_project.py   →  GitHub          idempotent epic + milestones + issues (+ branches)
+relay_implement.py →  branch + PR     headless agent implements one issue (Closes #)
+```
+## `align/` — spec interrogation (interactive)
+`align/SKILL.md` (Claude skill) + `align/align.codex.md` (Codex `/align` prompt):
+forcefully interrogate the user — one question at a time, challenge vague answers,
+force out-of-scope exclusions — then write `<unit>/spec.md` from `spec-template.md`.
+Front-loads all human clarification so the headless `plan` step never needs to ask.
+Deploy: copy `SKILL.md` to a skills dir; copy `align.codex.md` to `~/.codex/prompts/align.md`.
+## `relay_plan.py` — plan step
+`relay_plan.py`: turns an aligned **spec** into a structured, milestone/issue
+**plan** without letting the coding agent run off into implementation. Host-neutral.
+```
+spec.md  →  [bounded plan-only agent run]  →  plan.json + plan.md
+```
+1. Reads `<unit>/spec.md` (produced by `align`).
+2. Runs the host's **mechanically bounded "plan-but-don't-build" primitive**:
+   - **Claude**: `claude -p --permission-mode plan` — `ExitPlanMode` is terminal,
+     nothing executes. Plan extracted from `--output-format json` result (falls back
+     to newest `~/.claude/plans/*.md`).
+   - **Codex**: `codex exec --sandbox read-only --output-schema <schema> -o <file>` —
+     read-only sandbox can't write; schema **enforces** the milestone/issue JSON shape.
+3. Writes `<unit>/plan.json` (structured) + `<unit>/plan.md` (human/checklist).
+4. Emits `<unit>/gh_projection.sh` — a flat, non-idempotent `gh` script. **Superseded
+   by `relay_project.py`** for real use; kept only as a quick-look dry-run.
+## `relay_project.py` — project step (idempotent GitHub projection)
+Mirrors `plan.json` onto GitHub: a tracking **epic** issue, **milestones**, one
+**issue** per work item, with branch + `Closes #` wiring written back into `plan.json`.
+```
+plan.json  →  [resolve repo]  →  epic + milestones + issues  →  plan.json (numbers+branches) + projection.json
+```
+- **Idempotent**: matches existing milestones/issues by title (`gh api .../milestones`,
+  `gh issue list`) and reuses them — re-running never duplicates. Verified live against
+  a throwaway repo (4 milestones/21 issues; second run created 0 new objects).
+- **Epic**: a `[epic] <objective>` tracking issue with a child task-list, refreshed
+  (not duplicated) on re-run. `--no-epic` to skip.
+- **Branch/PR wiring**: each issue gets `branch: relay/m<n>-<mslug>/<islug>` and a
+  `Closes #<n>` contract recorded in `plan.json` for IMPLEMENT to consume.
+- **Local-first**: no GitHub remote → exits local-only (add a remote and re-run to promote).
+- **Missing labels** don't abort — the issue is created without them and logged.
+## `relay_implement.py` — implement step (headless, one issue → PR)
+Consumes the projected `plan.json` and drives a **headless coding agent** to
+implement ONE issue on its branch, then opens a PR that closes the issue. This is
+the step that lets Relay execute its own backlog instead of a human hand-building.
+```
+plan.json  →  [select issue]  →  branch off base  →  [headless agent]  →  push + PR (Closes #N)  →  plan.json (pr+state)
+```
+- **One issue at a time** (workflow-design.md model). Default target = first
+  *pending* issue (state ∉ {implemented, merged, done}); `--issue <#n|title>` to pick.
+- Reuses the `branch`/`closes` wiring the PROJECT step wrote into `plan.json`
+  (errors if absent — run `relay_project.py --execute` first).
+- Assembles a **brief** from objective + milestone + issue body + `spec.md` (the
+  durable contract) + mechanical scope rules — host-neutral, byte-identical
+  across hosts — then runs the chosen host headless (Claude: `claude -p
+  --output-format json`; Codex: `codex exec` in a `workspace-write` sandbox).
+  Success is verified by `git` (commits on the branch), not by parsing stdout.
+- **Idempotent**: reuses an existing branch / existing PR; if the agent leaves no
+  commits, records `state=no-op` and opens no PR. Local-only (no remote) →
+  `state=implemented-local`, branch left for later promotion.
+- **Dry-run by default**: prints target, branch, the commands it would run, and the
+  full assembled brief. `--execute` to branch + spend an agent run + open the PR.
+- **Host: `--host {claude,codex,auto}`** (default `auto`). Auto-detect mirrors
+  PLAN's precedence — `claude` on PATH wins, else `codex`, else a clear error.
+  Both hosts drive the same host-neutral brief and write the same `state`
+  vocabulary (`open` / `no-op` / `implemented-local`).
+- **Host equivalence**: Codex runs under a `workspace-write` sandbox so it can
+  edit and commit the working tree. (PLAN uses `read-only` for a *different*
+  reason — to mechanically bound the planner so it can't build; IMPLEMENT must
+  be able to write.) For fully-autonomous runs that must run tests and commit
+  without prompts, `codex --dangerously-bypass-approvals-and-sandbox` is the
+  analogue of Claude's `--permission-mode bypassPermissions` (default is the
+  safer `acceptEdits`).
+## `relay_ship.py` — ship step (incremental: PR merge → issue/milestone done)
+Consumes the `pr`/`state` IMPLEMENT wrote into `plan.json` and **closes the loop
+incrementally**. There is no terminal "ship" event in workflow-design.md — SHIP
+runs (and re-runs) as PRs become mergeable: PR merge → issue done (auto via
+`Closes #N`) → milestone done (when all its issues are merged) → project done
+(when every milestone closes).
+```
+plan.json  →  [pick PR]  →  check mergeable + checks  →  gh pr merge  →  bubble-up  →  plan.json (merged/done)
+```
+- **One PR per run by default** (mirrors IMPLEMENT's one-at-a-time); `--all` to
+  drain every mergeable PR in one pass.
+- Merge gate: `mergeable=MERGEABLE` AND checks not `FAILING`. Pending checks
+  block by default; `--auto` enables GitHub auto-merge so the PR lands when
+  checks pass.
+- `--admin` bypasses branch protections on repos without CI.
+- `mergeable=UNKNOWN` (GitHub's lazy-compute response on first probe) is
+  auto-retried once.
+- After each pass, bubbles up milestones whose issues are all done and sets
+  `plan.state = "shipped"` when every milestone closes.
+- `gh pr merge` lands on GitHub leaving the local base behind — SHIP fetches +
+  fast-forwards before committing state so the push succeeds the first time.
+- Already-merged PRs are recognised and reconciled without re-merging
+  (re-running is safe + idempotent).
+## Install
+Once published to PyPI:
+```bash
+uv tool install relay-workflow   # recommended (isolated env, like pipx)
+# or
+pipx install relay-workflow      # legacy-compatible alternative
+```
+Until then (or for development):
+```bash
+git clone https://github.com/applied-artificial-intelligence/relay
+cd relay
+uv tool install --force .        # install this checkout
+```
+Both paths put a `relay` binary on your `PATH`. Requires Python ≥ 3.10 and the
+`gh` CLI for the steps that touch GitHub.
+## Usage
+```bash
+# align: invoke the Claude skill / Codex /align prompt (interactive), then:
+relay plan <work-unit-dir>                            # auto-detect host, plan only
+relay plan <unit> --host codex                        # force host
+relay project <unit>                                  # dry-run GitHub projection
+relay project <unit> --execute                        # create/refresh epic+milestones+issues
+relay implement <unit>                                # dry-run: show target issue + brief
+relay implement <unit> --execute                      # implement first pending issue → PR (auto-detect host)
+relay implement <unit> --host codex --execute         # force Codex as the headless coding host
+relay implement <unit> --issue 7 --execute --permission-mode bypassPermissions
+relay ship <unit>                                     # dry-run: which PRs would merge
+relay ship <unit> --all --execute                     # merge every mergeable PR, bubble up
+relay ship <unit> --execute --auto                    # enable auto-merge (wait on checks)
+```
+`relay --help` lists steps; `relay <step> --help` lists step-specific options.
+`python -m relay <step>` works too (handy on systems where `pipx`/`uv`-installed
+scripts aren't on `PATH` yet).
+## Verified (2026-05-27)
+| Host | Mechanism | Output | Time | Repo touched? |
+|---|---|---|---|---|
+| Codex (gpt-5.5) | PLAN: read-only + enforced schema | 4 milestones / 21 issues | 39s | no |
+| Claude (2.1.152) | PLAN: `--permission-mode plan` + JSON extract | 2 milestones / 10 issues | 29s | no |
+| Claude IMPLEMENT | `claude -p` in working tree | 2 issues → 2 PRs (`Closes #`) | — | yes (commits on branch) |
+| Codex IMPLEMENT | `codex exec` + `workspace-write` sandbox | branch + PR (`Closes #`) — _pending live verification_ | — | yes (commits on branch) |
+Both leave the repo untouched (the anti-runaway guarantee is mechanical, not
+behavioral) and converge on the same `plan.json` + `gh_projection.sh`.
+## Known gaps (prototype, not production)
+- **Codex schema** requires `additionalProperties:false` + every property in
+  `required` (OpenAI structured-output rule) — handled in `PLAN_SCHEMA`.
+- Claude path is **not** schema-enforced — relies on the model returning JSON; a
+  structuring/repair pass would harden it.
+- No rolling-wave support yet (re-plan a single milestone into issues later).
+- GitHub **Projects v2** board (Status/Area/Type fields, à la tradesharp) is not
+  created — `relay_project.py` does epic issue + milestones only (CLI-friendly, no
+  GraphQL). Project-v2 wiring is the next projection increment.
+- Issue-level idempotency matches on **exact title**; renaming an issue in the spec
+  creates a new one rather than updating the old. Acceptable for append-style planning.
+- **IMPLEMENT `--execute` live-tested 2026-05-28** (throwaway `relay-smoketest`, 2 issues →
+  2 PRs via `claude -p`, both in-scope with tests + correct `Closes #`). Idempotency holds:
+  skips issues that already have a PR (`--redo` to force), state is committed to the base
+  branch (survives between runs, leaves a clean tree), dirty-guard ignores untracked agent
+  artifacts. Default selection walks pending issues one at a time.
+- **Parallel issues on one file conflict at merge time**: each issue branches off base
+  independently, so two issues editing the same file produce mergeable-but-conflicting PRs.
+  That's a SHIP/rebase concern, not a runner bug.
+- **SHIP `--execute` live-tested 2026-05-31** (same `relay-smoketest` repo): merged PR #4
+  (`Closes #1` auto-closed issue #1), correctly detected the parallel-PR conflict on PR #5,
+  bubble-up left milestone open (1 of 2 issues done). Already-merged reconciliation verified
+  on a second pass after a hard reset. Fixed two bugs found during the live test:
+  (a) `gh pr view` first probe returning `mergeable=UNKNOWN` — auto-retry once;
+  (b) post-merge local base was stale, so state push failed — fetch + fast-forward before
+  committing state.
+## Repo layout
+```
+align/             ALIGN step — Claude skill + Codex /align prompt + spec template
+src/relay/         Python package — installs as the `relay` CLI dispatcher
+  cli.py           subcommand dispatcher (relay plan|project|implement|ship)
+  plan.py          PLAN step  — bounded plan-only run → plan.json/.md
+  project.py       PROJECT step — idempotent GitHub epic/milestones/issues
+  implement.py     IMPLEMENT step — headless agent implements one issue → PR (Closes #)
+  ship.py          SHIP step — incremental PR merge → issue/milestone bubble-up
+pyproject.toml     hatchling build; entry-point `relay = "relay.cli:main"`
+docs/              workflow-design.md, planmode-probe.md, portal-briefing.md, PROPOSAL.md
+.workspace/        agent state (memory/transitions/work), interop convention
+```
+## Design & provenance
+- [`docs/workflow-design.md`](docs/workflow-design.md) — canonical workflow spec (ALIGN→PLAN→PROJECT→IMPLEMENT→SHIP).
+- [`docs/planmode-probe.md`](docs/planmode-probe.md) — Claude vs Codex plan-mode probe; the empirical basis for the bounded-planner design.
+- [`docs/portal-briefing.md`](docs/portal-briefing.md) — briefing for the website Agent Lab portal.
+- Next steps (from design): self-host Relay (run the full ALIGN→…→SHIP cycle on Relay's
+  own backlog), then rolling-wave re-planning + Projects v2 board.

relay_workflow-0.1.0/README.md ADDED Viewed

@@ -0,0 +1,240 @@
+# Relay
+**A cross-agent (Claude Code + Codex) planning/execution workflow.** Portable
+operating discipline + continuity across agents: forcefully align on a spec, plan it
+under a mechanically-bounded agent that *can't* run off and build, project it onto
+GitHub (epic → milestones → issues), then implement and ship incrementally.
+Local docs are the source of truth; **GitHub is the projection** (degrades to
+local-only with no remote). One workflow, two host bindings — both converge on the
+same `.workspace/work/<unit>/` files and `gh` projection.
+> **Status**: alpha — pipeline complete (all 5 steps live-tested), packaged
+> as `relay-workflow` (0.1.0). PyPI publish pending. Extracted 2026-05-28
+> from `factory` work unit 024 (`claude_codex_interop_framework`). Successor
+> to the Claude-only `claude-code-toolkit` (which remains active separately).
+> Design rationale and the empirical host probes are in [`docs/`](docs/).
+## Pipeline (host-neutral)
+Five steps built — pipeline complete:
+```
+align/    →  spec.md       forceful interrogation (skill + Codex prompt; interactive)
+relay_plan.py      →  plan.json/.md   bounded plan-only agent run (can't implement)
+relay_project.py   →  GitHub          idempotent epic + milestones + issues (+ branches)
+relay_implement.py →  branch + PR     headless agent implements one issue (Closes #)
+```
+## `align/` — spec interrogation (interactive)
+`align/SKILL.md` (Claude skill) + `align/align.codex.md` (Codex `/align` prompt):
+forcefully interrogate the user — one question at a time, challenge vague answers,
+force out-of-scope exclusions — then write `<unit>/spec.md` from `spec-template.md`.
+Front-loads all human clarification so the headless `plan` step never needs to ask.
+Deploy: copy `SKILL.md` to a skills dir; copy `align.codex.md` to `~/.codex/prompts/align.md`.
+## `relay_plan.py` — plan step
+`relay_plan.py`: turns an aligned **spec** into a structured, milestone/issue
+**plan** without letting the coding agent run off into implementation. Host-neutral.
+```
+spec.md  →  [bounded plan-only agent run]  →  plan.json + plan.md
+```
+1. Reads `<unit>/spec.md` (produced by `align`).
+2. Runs the host's **mechanically bounded "plan-but-don't-build" primitive**:
+   - **Claude**: `claude -p --permission-mode plan` — `ExitPlanMode` is terminal,
+     nothing executes. Plan extracted from `--output-format json` result (falls back
+     to newest `~/.claude/plans/*.md`).
+   - **Codex**: `codex exec --sandbox read-only --output-schema <schema> -o <file>` —
+     read-only sandbox can't write; schema **enforces** the milestone/issue JSON shape.
+3. Writes `<unit>/plan.json` (structured) + `<unit>/plan.md` (human/checklist).
+4. Emits `<unit>/gh_projection.sh` — a flat, non-idempotent `gh` script. **Superseded
+   by `relay_project.py`** for real use; kept only as a quick-look dry-run.
+## `relay_project.py` — project step (idempotent GitHub projection)
+Mirrors `plan.json` onto GitHub: a tracking **epic** issue, **milestones**, one
+**issue** per work item, with branch + `Closes #` wiring written back into `plan.json`.
+```
+plan.json  →  [resolve repo]  →  epic + milestones + issues  →  plan.json (numbers+branches) + projection.json
+```
+- **Idempotent**: matches existing milestones/issues by title (`gh api .../milestones`,
+  `gh issue list`) and reuses them — re-running never duplicates. Verified live against
+  a throwaway repo (4 milestones/21 issues; second run created 0 new objects).
+- **Epic**: a `[epic] <objective>` tracking issue with a child task-list, refreshed
+  (not duplicated) on re-run. `--no-epic` to skip.
+- **Branch/PR wiring**: each issue gets `branch: relay/m<n>-<mslug>/<islug>` and a
+  `Closes #<n>` contract recorded in `plan.json` for IMPLEMENT to consume.
+- **Local-first**: no GitHub remote → exits local-only (add a remote and re-run to promote).
+- **Missing labels** don't abort — the issue is created without them and logged.
+## `relay_implement.py` — implement step (headless, one issue → PR)
+Consumes the projected `plan.json` and drives a **headless coding agent** to
+implement ONE issue on its branch, then opens a PR that closes the issue. This is
+the step that lets Relay execute its own backlog instead of a human hand-building.
+```
+plan.json  →  [select issue]  →  branch off base  →  [headless agent]  →  push + PR (Closes #N)  →  plan.json (pr+state)
+```
+- **One issue at a time** (workflow-design.md model). Default target = first
+  *pending* issue (state ∉ {implemented, merged, done}); `--issue <#n|title>` to pick.
+- Reuses the `branch`/`closes` wiring the PROJECT step wrote into `plan.json`
+  (errors if absent — run `relay_project.py --execute` first).
+- Assembles a **brief** from objective + milestone + issue body + `spec.md` (the
+  durable contract) + mechanical scope rules — host-neutral, byte-identical
+  across hosts — then runs the chosen host headless (Claude: `claude -p
+  --output-format json`; Codex: `codex exec` in a `workspace-write` sandbox).
+  Success is verified by `git` (commits on the branch), not by parsing stdout.
+- **Idempotent**: reuses an existing branch / existing PR; if the agent leaves no
+  commits, records `state=no-op` and opens no PR. Local-only (no remote) →
+  `state=implemented-local`, branch left for later promotion.
+- **Dry-run by default**: prints target, branch, the commands it would run, and the
+  full assembled brief. `--execute` to branch + spend an agent run + open the PR.
+- **Host: `--host {claude,codex,auto}`** (default `auto`). Auto-detect mirrors
+  PLAN's precedence — `claude` on PATH wins, else `codex`, else a clear error.
+  Both hosts drive the same host-neutral brief and write the same `state`
+  vocabulary (`open` / `no-op` / `implemented-local`).
+- **Host equivalence**: Codex runs under a `workspace-write` sandbox so it can
+  edit and commit the working tree. (PLAN uses `read-only` for a *different*
+  reason — to mechanically bound the planner so it can't build; IMPLEMENT must
+  be able to write.) For fully-autonomous runs that must run tests and commit
+  without prompts, `codex --dangerously-bypass-approvals-and-sandbox` is the
+  analogue of Claude's `--permission-mode bypassPermissions` (default is the
+  safer `acceptEdits`).
+## `relay_ship.py` — ship step (incremental: PR merge → issue/milestone done)
+Consumes the `pr`/`state` IMPLEMENT wrote into `plan.json` and **closes the loop
+incrementally**. There is no terminal "ship" event in workflow-design.md — SHIP
+runs (and re-runs) as PRs become mergeable: PR merge → issue done (auto via
+`Closes #N`) → milestone done (when all its issues are merged) → project done
+(when every milestone closes).
+```
+plan.json  →  [pick PR]  →  check mergeable + checks  →  gh pr merge  →  bubble-up  →  plan.json (merged/done)
+```
+- **One PR per run by default** (mirrors IMPLEMENT's one-at-a-time); `--all` to
+  drain every mergeable PR in one pass.
+- Merge gate: `mergeable=MERGEABLE` AND checks not `FAILING`. Pending checks
+  block by default; `--auto` enables GitHub auto-merge so the PR lands when
+  checks pass.
+- `--admin` bypasses branch protections on repos without CI.
+- `mergeable=UNKNOWN` (GitHub's lazy-compute response on first probe) is
+  auto-retried once.
+- After each pass, bubbles up milestones whose issues are all done and sets
+  `plan.state = "shipped"` when every milestone closes.
+- `gh pr merge` lands on GitHub leaving the local base behind — SHIP fetches +
+  fast-forwards before committing state so the push succeeds the first time.
+- Already-merged PRs are recognised and reconciled without re-merging
+  (re-running is safe + idempotent).
+## Install
+Once published to PyPI:
+```bash
+uv tool install relay-workflow   # recommended (isolated env, like pipx)
+# or
+pipx install relay-workflow      # legacy-compatible alternative
+```
+Until then (or for development):
+```bash
+git clone https://github.com/applied-artificial-intelligence/relay
+cd relay
+uv tool install --force .        # install this checkout
+```
+Both paths put a `relay` binary on your `PATH`. Requires Python ≥ 3.10 and the
+`gh` CLI for the steps that touch GitHub.
+## Usage
+```bash
+# align: invoke the Claude skill / Codex /align prompt (interactive), then:
+relay plan <work-unit-dir>                            # auto-detect host, plan only
+relay plan <unit> --host codex                        # force host
+relay project <unit>                                  # dry-run GitHub projection
+relay project <unit> --execute                        # create/refresh epic+milestones+issues
+relay implement <unit>                                # dry-run: show target issue + brief
+relay implement <unit> --execute                      # implement first pending issue → PR (auto-detect host)
+relay implement <unit> --host codex --execute         # force Codex as the headless coding host
+relay implement <unit> --issue 7 --execute --permission-mode bypassPermissions
+relay ship <unit>                                     # dry-run: which PRs would merge
+relay ship <unit> --all --execute                     # merge every mergeable PR, bubble up
+relay ship <unit> --execute --auto                    # enable auto-merge (wait on checks)
+```
+`relay --help` lists steps; `relay <step> --help` lists step-specific options.
+`python -m relay <step>` works too (handy on systems where `pipx`/`uv`-installed
+scripts aren't on `PATH` yet).
+## Verified (2026-05-27)
+| Host | Mechanism | Output | Time | Repo touched? |
+|---|---|---|---|---|
+| Codex (gpt-5.5) | PLAN: read-only + enforced schema | 4 milestones / 21 issues | 39s | no |
+| Claude (2.1.152) | PLAN: `--permission-mode plan` + JSON extract | 2 milestones / 10 issues | 29s | no |
+| Claude IMPLEMENT | `claude -p` in working tree | 2 issues → 2 PRs (`Closes #`) | — | yes (commits on branch) |
+| Codex IMPLEMENT | `codex exec` + `workspace-write` sandbox | branch + PR (`Closes #`) — _pending live verification_ | — | yes (commits on branch) |
+Both leave the repo untouched (the anti-runaway guarantee is mechanical, not
+behavioral) and converge on the same `plan.json` + `gh_projection.sh`.
+## Known gaps (prototype, not production)
+- **Codex schema** requires `additionalProperties:false` + every property in
+  `required` (OpenAI structured-output rule) — handled in `PLAN_SCHEMA`.
+- Claude path is **not** schema-enforced — relies on the model returning JSON; a
+  structuring/repair pass would harden it.
+- No rolling-wave support yet (re-plan a single milestone into issues later).
+- GitHub **Projects v2** board (Status/Area/Type fields, à la tradesharp) is not
+  created — `relay_project.py` does epic issue + milestones only (CLI-friendly, no
+  GraphQL). Project-v2 wiring is the next projection increment.
+- Issue-level idempotency matches on **exact title**; renaming an issue in the spec
+  creates a new one rather than updating the old. Acceptable for append-style planning.
+- **IMPLEMENT `--execute` live-tested 2026-05-28** (throwaway `relay-smoketest`, 2 issues →
+  2 PRs via `claude -p`, both in-scope with tests + correct `Closes #`). Idempotency holds:
+  skips issues that already have a PR (`--redo` to force), state is committed to the base
+  branch (survives between runs, leaves a clean tree), dirty-guard ignores untracked agent
+  artifacts. Default selection walks pending issues one at a time.
+- **Parallel issues on one file conflict at merge time**: each issue branches off base
+  independently, so two issues editing the same file produce mergeable-but-conflicting PRs.
+  That's a SHIP/rebase concern, not a runner bug.
+- **SHIP `--execute` live-tested 2026-05-31** (same `relay-smoketest` repo): merged PR #4
+  (`Closes #1` auto-closed issue #1), correctly detected the parallel-PR conflict on PR #5,
+  bubble-up left milestone open (1 of 2 issues done). Already-merged reconciliation verified
+  on a second pass after a hard reset. Fixed two bugs found during the live test:
+  (a) `gh pr view` first probe returning `mergeable=UNKNOWN` — auto-retry once;
+  (b) post-merge local base was stale, so state push failed — fetch + fast-forward before
+  committing state.
+## Repo layout
+```
+align/             ALIGN step — Claude skill + Codex /align prompt + spec template
+src/relay/         Python package — installs as the `relay` CLI dispatcher
+  cli.py           subcommand dispatcher (relay plan|project|implement|ship)
+  plan.py          PLAN step  — bounded plan-only run → plan.json/.md
+  project.py       PROJECT step — idempotent GitHub epic/milestones/issues
+  implement.py     IMPLEMENT step — headless agent implements one issue → PR (Closes #)
+  ship.py          SHIP step — incremental PR merge → issue/milestone bubble-up
+pyproject.toml     hatchling build; entry-point `relay = "relay.cli:main"`
+docs/              workflow-design.md, planmode-probe.md, portal-briefing.md, PROPOSAL.md
+.workspace/        agent state (memory/transitions/work), interop convention
+```
+## Design & provenance
+- [`docs/workflow-design.md`](docs/workflow-design.md) — canonical workflow spec (ALIGN→PLAN→PROJECT→IMPLEMENT→SHIP).
+- [`docs/planmode-probe.md`](docs/planmode-probe.md) — Claude vs Codex plan-mode probe; the empirical basis for the bounded-planner design.
+- [`docs/portal-briefing.md`](docs/portal-briefing.md) — briefing for the website Agent Lab portal.
+- Next steps (from design): self-host Relay (run the full ALIGN→…→SHIP cycle on Relay's
+  own backlog), then rolling-wave re-planning + Projects v2 board.

relay_workflow-0.1.0/align/SKILL.md ADDED Viewed

@@ -0,0 +1,59 @@
+---
+name: align
+description: This skill should be used at the START of any non-trivial work unit, when the user asks to "align", "spec this out", "define the work", or before planning/implementing. Forcefully interrogates the user to produce a complete spec.md (WHAT/WHY end-state) that the headless `plan` step can consume without further questions.
+user-invocable: true
+---
+# align — forceful spec interrogation
+You are running the **ALIGN** step of the Relay workflow. Your job is to extract a
+*complete, verifiable specification* of the desired end-state — the WHAT and WHY,
+**never the HOW** — and write it to `spec.md` in the work unit.
+The spec is the durable contract. A separate, headless `plan` step turns it into
+milestones/issues later and **must not need to ask the user anything** — so every
+ambiguity has to die here. The plan step is only as good as the spec you produce.
+## Why this is forceful
+The old `explore` step failed because it was too gentle: it accepted vague answers
+and moved on, so goal discovery was weak and plans drifted. Do NOT repeat that.
+You interrogate. You challenge. You do not proceed on assumptions.
+## Rules of engagement
+1. **One question at a time.** Never batch questions — the user answers worse and
+   you challenge worse. Ask, get the answer, react, ask the next.
+2. **Challenge every vague answer.** "Better performance" → "Measured how? What's
+   the current number, what's the target, who decides it's enough?" Do not accept
+   an answer you couldn't write a pass/fail test against.
+3. **Surface the implicit.** Ask what's deliberately OUT of scope. Ask what failure
+   looks like. Ask what already exists that this must not break.
+4. **Do NOT design.** If you catch yourself proposing an implementation, stop and
+   convert it into a requirement ("the result must X") instead.
+5. **Minimum question floor.** Ask at least **6** substantive questions before you
+   offer to write the spec. Hard requirements, acceptance criteria, scope
+   boundaries, and constraints are non-negotiable coverage. **Do NOT write spec.md
+   until every section below can be filled with something verifiable.**
+6. **Summarize and confirm.** Before writing, play back the spec verbally in 5-8
+   bullets and get an explicit "yes" or corrections. Then write the file.
+## Question backbone (adapt; don't read aloud as a checklist)
+- **Objective**: In one sentence, what is true when this is done that isn't true now?
+- **Why now**: What breaks / what's lost if this isn't built? Who feels it?
+- **Acceptance**: How will we *verify* it's done? Name concrete checks/tests/numbers.
+- **In scope / Out of scope**: What's explicitly excluded? (Force at least one exclusion.)
+- **Constraints**: Hard limits — compat, perf, deadlines, tech that must/can't be used.
+- **Existing surface**: What must NOT break? What does this touch?
+- **Unknowns**: What's still genuinely uncertain? (Goes in Open Questions, must be
+  empty or explicitly deferred before the spec is "ready".)
+## Output
+Write `<work-unit>/spec.md` using `spec-template.md` (sibling file) as the structure.
+Fill every section. Leave NO `TODO`/`???` placeholders — if something is truly
+undecided, state the decision rule and put it under **Open Questions** with an owner.
+End by telling the user: spec is written; run the `plan` step
+(`python3 relay_plan.py <work-unit>`) to decompose it into milestones/issues.

relay_workflow-0.1.0/align/align.codex.md ADDED Viewed

@@ -0,0 +1,33 @@
+# /align (Codex prompt)
+Deploy to `~/.codex/prompts/align.md` so it's invocable as `/align` in Codex.
+This is the Codex binding of the Relay ALIGN step — same contract as the Claude
+`align` skill (sibling `SKILL.md`); both converge on the same `spec.md`.
+---
+Run the ALIGN step. Extract a complete, verifiable specification of the desired
+end-state — the WHAT and WHY, never the HOW — and write it to `spec.md` in the
+work unit the user names (under `.workspace/work/`).
+A separate headless `plan` step will turn this spec into milestones and issues and
+**must not need to ask the user anything**, so kill every ambiguity now.
+Rules:
+- Ask **one question at a time**. React to each answer before asking the next.
+- **Challenge vague answers.** Do not accept anything you couldn't write a pass/fail
+  test against ("better performance" → measured how, current vs target number).
+- Force at least one explicit **out-of-scope** exclusion. Ask what must NOT break.
+- **Do not design.** Convert any implementation idea into a requirement instead.
+- Ask at least **6** substantive questions before offering to write the spec.
+- Before writing, play the spec back in 5-8 bullets and get an explicit "yes".
+Write `<work-unit>/spec.md` with these sections, every one filled with verifiable
+content (no TODO/??? placeholders):
+Objective · Why · Acceptance criteria (numbered, each verifiable) · In scope ·
+Out of scope (≥1 exclusion) · Constraints · Existing surface (must not break) ·
+Open questions (each with a decision rule + owner, or marked deferred).
+Finish by telling the user to run `python3 relay_plan.py <work-unit>` to decompose
+the spec into milestones/issues.