npm - agent-conveyor - Versions diffs - 0.1.13 → 0.1.14 - Mend

agent-conveyor 0.1.13 → 0.1.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/README.md +51 -11
package/dist/cli/typescript-runtime.js +554 -12
package/dist/cli/typescript-runtime.js.map +1 -1
package/dist/runtime/app-autonomy.d.ts +1 -0
package/dist/runtime/app-autonomy.js +16 -0
package/dist/runtime/app-autonomy.js.map +1 -1
package/dist/runtime/manager-permissions.js +1 -1
package/dist/runtime/manager-permissions.js.map +1 -1
package/docs/manager-recipes.md +85 -0
package/package.json +1 -1
package/skills/manage-codex-workers/SKILL.md +76 -8

package/README.md CHANGED

@@ -139,8 +139,9 @@ release version that is not already on npm.
 For common manager setups, start with
 [`docs/manager-recipes.md`](docs/manager-recipes.md). It maps natural-language
 requests such as GoalBuddy conveyor runs, test coverage loops, UX polish loops,
-what-next nudging, and PR/CI/merge Ralph loops to concrete `manager-config`
-settings, permissions, evidence gates, cleanup behavior, and example
+what-next nudging, PR/CI/merge Ralph loops, and autonomous ship-it loops to
+concrete `manager-config` settings, permissions, evidence gates, cleanup
+behavior, and example
 manager/Dispatch/worker interactions. Use `conveyor manager-recipes --list`
 or `conveyor manager-recipes --show goalbuddy-conveyor --json` for a
 machine-readable setup preview.
@@ -178,6 +179,12 @@ are unavailable, open a separate Codex app worker manually and paste the
 completion or blocker report must go back through the generated
 `enqueue-notify-manager` command and a bounded Dispatch watch tick; a direct
 Codex app final answer is not a durable manager receipt.
+The live manager and worker sessions should also be readable as the primary
+operator transcript: after consuming an inbox item, the consuming session must
+print `CONVEYOR POLL`, `CONVEYOR RECEIVED`, `WORK`, `CONVEYOR SEND`, and
+`DISPATCH` sections while the turn is happening. SQLite/replay/status output is
+audit proof, not a replacement for the live session story. Idle polls may be a
+single `CONVEYOR IDLE` line.
 Dispatch is core infrastructure for supervised worker/manager pairs. The
 `pair` workflow starts a detached Dispatch watch process by default so worker
@@ -195,6 +202,9 @@ Use `conveyor qa-plan adversarial-triggers` to verify natural-language
 manager prompts activate Ralph-loop adversarial gates.
 Use `conveyor qa-plan goalbuddy-conveyor` when a broad request should become
 sequential GoalBuddy child boards with PR/CI/merge receipts.
+Use `conveyor qa-plan ship-it-loop` when a manager is allowed to push a branch,
+open a PR, monitor CI, resolve bounded conflicts, and merge only after explicit
+manager-owned merge evidence.
 Before cutting a manager loose, have it resolve the freeform setup request to a
 named recipe from `docs/manager-recipes.md` or an explicit `custom` setup, then
 show the saved mode, permissions, evidence gates, cleanup policy, and disallowed
@@ -377,7 +387,10 @@ tmux attach -t codex-live-test
   required to make an idle app thread poll autonomously. The worker handoff and
   worker heartbeat prompt also include the exact durable
   `enqueue-notify-manager` and one-iteration `dispatch --watch` commands that a
-  worker must run after completing or blocking on a consumed item. Those
+  worker must run after completing or blocking on a consumed item. Those prompts
+  require live session transcript blocks for consumed items, so the operator can
+  inspect the actual Codex app sessions and see the same flow the durable inbox
+  later proves. Those
   recommendations also include `wakeup_dispatch_command` and
   `delivery_receipt_commands` for
   app-thread wake recovery. Use them to record sent, skipped, and blocked wake
@@ -385,7 +398,11 @@ tmux attach -t codex-live-test
   completion. The recommendations include a `teardown_policy`: an idle poll is
   only a quiet interval, not a reason to delete or pause heartbeat automation;
   heartbeat teardown belongs to the manager/operator after terminal closeout or
-  explicit operator instruction.
+  explicit operator instruction. For same-thread Codex app visible-session
+  dogfood, prefer `--template app_visible_build_loop` or a custom adversarial
+  gate; reserve cleanup-gated templates such as `build_then_clear` for flows
+  that create a fresh worker context or can record a real cleanup receipt
+  between iterations.
   The optional
   Codex app thread metadata is normally supplied after a Codex app manager has
   used `create_thread` and `set_thread_title`; terminal-only users can omit it
@@ -456,7 +473,8 @@ tmux attach -t codex-live-test
   flags. Use `--interactive` only as a terminal fallback when a human is
   running `conveyor` directly.
   `--permit` grants taxonomy permissions such as `repo.open_pr`,
-  `verification.run_pytest`, `context.spawn_reviewer`,
+  `repo.push_branch`, `repo.monitor_ci`, `repo.resolve_conflicts`,
+  `repo.merge_green_pr`, `verification.run_pytest`, `context.spawn_reviewer`,
   `communication.notify_operator`, or `worker_session.compact`. Use `--tool`
   to record expected verification/context tools, `--epilogue` for required
   built-in finish steps (`run-tools`, `draft-pr`, `subagent-review`,
@@ -497,7 +515,13 @@ tmux attach -t codex-live-test
   [--json]` — Draft reviewed `criteria --add` commands from a worker response
   that separates must-have current-task criteria from deferred follow-ups. This
   helper is read-only: it resolves the task and prints suggestions, but does not
-  mutate acceptance criteria, events, or commands.
+  mutate acceptance criteria, events, or commands. If a proposed criterion
+  appears to describe manager closeout mechanics such as `finish-task`,
+  `--require-criteria-audit`, heartbeat teardown, or final manager reporting,
+  the helper emits a non-blocking warning and classifies that suggestion as
+  manager closeout proof. Keep that proof in the manager final report, audit,
+  replay, or epilogue evidence instead of accepted worker/task criteria unless
+  the task is explicitly Conveyor closeout QA.
   ```bash
   conveyor criteria-plan my-task --from-worker-response response.md --json
   ```
@@ -750,9 +774,9 @@ tmux attach -t codex-live-test
 - `transcript-show <task> [--role R] [--include-content]` — Show stored
   transcript segment metadata. Segment text is redacted unless
   `--include-content` is passed.
-- `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor>` — Print a
+- `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor|ship-it-loop>` — Print a
   repeatable manual QA checklist.
-- `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop> --receipt-output RECEIPT.json [--path DB]` —
+- `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop|ship-it-loop> --receipt-output RECEIPT.json [--path DB]` —
   Run a deterministic no-tmux QA harness and save a JSON receipt.
   `ralph-loop-guardrails` proves max-iteration cutoff, missing-evidence
   cutoff, fresh retry delivery after structured `adversarial_check` evidence,
@@ -772,6 +796,10 @@ tmux attach -t codex-live-test
   `build-clear-loop` proves the non-coverage `build_then_clear` template
   blocks before `build_passed` and `cleanup` receipts, still blocks after build
   evidence alone, and delivers only after both build and cleanup evidence exist.
+  `ship-it-loop` proves push, PR, and merge commands fail closed until their
+  permissions are granted, then proves the `ship_it_loop` lifecycle blocks
+  before branch, PR, CI, mergeability, manager decision, merge, post-merge, and
+  adversarial receipts exist.
 - `loop-triggers --list|--classify PROMPT [--json]` — List the controlled
   natural-language loop triggers or classify a manager/operator prompt before
   creating a loop policy or continuation gate. Approved trigger phrases include
@@ -785,11 +813,16 @@ tmux attach -t codex-live-test
   evidence blocks a manager continuation before worker delivery until matching
   satisfied criterion evidence exists. `ralph-loop-presets` remains as a
   compatibility alias for the current Ralph-loop QA flows. The built-in
+  `app_visible_build_loop` template requires `build_passed` plus structured
+  `adversarial_check` evidence, but no cleanup evidence, so visible Codex app
+  threads can continue without pretending that same-thread context was cleared.
+  The built-in
   `visual_diff_loop` template requires `reference_artifact`,
   `candidate_screenshot`, `visual_diff_report`, `diff_below_threshold`, and
   `adversarial_check` evidence before a manager-requested next visual pass can
-  reach the worker. Quality-oriented templates (`pr_ci_merge_loop`,
-  `test_coverage_loop`, and `visual_diff_loop`) also expose an
+  reach the worker. Quality-oriented templates (`app_visible_build_loop`,
+  `pr_ci_merge_loop`, `ship_it_loop`, `test_coverage_loop`, and
+  `visual_diff_loop`) also expose an
   `artifact_requirements["adversarial_check"]` object requiring
   `failure_mode`, `check`, and `result` fields.
 - `loop-status TASK --run RUN [--json]` — Summarize a Ralph-loop run for manager
@@ -812,7 +845,9 @@ thread tools are unavailable, create the binding anyway and paste the returned
 `worker_handoff` prompt into a manually opened worker session. The handoff
 requires a worker to report completion/blockers through
 `enqueue-notify-manager` plus a bounded Dispatch watch run before treating the
-manager as notified.
+manager as notified, and to print the live `CONVEYOR POLL` / `CONVEYOR
+RECEIVED` / `WORK` / `CONVEYOR SEND` / `DISPATCH` transcript in the worker
+session for any consumed item.
 - `enqueue-continue-iteration TASK --loop-run RUN --requested-iteration N` —
   Queue a manager-requested next loop pass for Dispatch. The command refuses
   same/current iteration requests before they become pending queue rows, while
@@ -824,6 +859,8 @@ manager as notified.
   artifact requirements, and recommended tools.
 - `loop-evidence add TASK --loop-run RUN --iteration N --evidence-type TYPE` —
   Record a run-qualified evidence receipt for a loop policy. Use
+  `loop-evidence build-passed TASK --loop-run RUN --iteration N` as the
+  friendly alias for the common `evidence_type=build_passed` receipt. Use
   `loop-evidence visual-diff` to compare PNG screenshots, write an optional
   diff/report artifact, and record `visual_diff_report` plus
   `diff_below_threshold` as satisfied only when the computed score is within
@@ -869,14 +906,17 @@ conveyor qa-plan dispatch-completion
 conveyor qa-plan ralph-loop
 conveyor qa-plan adversarial-triggers
 conveyor qa-plan goalbuddy-conveyor
+conveyor qa-plan ship-it-loop
 conveyor qa-run ralph-loop-guardrails --receipt-output /tmp/ralph-loop-guardrails-receipt.json --json
 conveyor qa-run generic-loop-template --receipt-output /tmp/generic-loop-template-receipt.json --json
 conveyor qa-run generic-loop-template-browser --receipt-output /tmp/generic-loop-template-browser-receipt.json --json
 conveyor qa-run test-coverage-loop --receipt-output /tmp/test-coverage-loop-receipt.json --json
 conveyor qa-run adversarial-triggers --receipt-output /tmp/adversarial-triggers-receipt.json --json
 conveyor qa-run build-clear-loop --receipt-output /tmp/build-clear-loop-receipt.json --json
+conveyor qa-run ship-it-loop --receipt-output /tmp/ship-it-loop-receipt.json --json
 conveyor loop-triggers --classify "Run this as an adversarially gated Ralph loop." --json
 conveyor loop-templates --list --json
+conveyor loop-templates --show ship_it_loop --json
 conveyor loop-templates --show visual_diff_loop --json
 conveyor loop-evidence visual-diff qa-task --loop-run "$RUN_ID" --iteration 1 --reference reference.png --candidate candidate.png --threshold 0.02 --report-output visual-diff.json --diff-output visual-diff.png
 conveyor ralph-loop-presets --list --json