npm - agent-conveyor - Versions diffs - 0.1.13 → 0.1.15 - Mend

agent-conveyor 0.1.13 → 0.1.15

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

package/README.md +132 -16
package/dist/cli/typescript-runtime.js +1305 -20
package/dist/cli/typescript-runtime.js.map +1 -1
package/dist/index.d.ts +2 -0
package/dist/index.js +1 -0
package/dist/index.js.map +1 -1
package/dist/runtime/app-autonomy.d.ts +1 -0
package/dist/runtime/app-autonomy.js +16 -0
package/dist/runtime/app-autonomy.js.map +1 -1
package/dist/runtime/campaigns.d.ts +197 -0
package/dist/runtime/campaigns.js +438 -0
package/dist/runtime/campaigns.js.map +1 -0
package/dist/runtime/manager-permissions.js +1 -1
package/dist/runtime/manager-permissions.js.map +1 -1
package/dist/state/schema-v23.js +87 -1
package/dist/state/schema-v23.js.map +1 -1
package/dist/state/sqlite-contract.d.ts +1 -1
package/dist/state/sqlite-contract.js +13 -1
package/dist/state/sqlite-contract.js.map +1 -1
package/docs/manager-recipes.md +182 -0
package/package.json +1 -1
package/skills/manage-codex-workers/SKILL.md +143 -8
package/skills/manage-codex-workers/agents/openai.yaml +2 -2

package/README.md CHANGED

@@ -139,8 +139,9 @@ release version that is not already on npm.
 For common manager setups, start with
 [`docs/manager-recipes.md`](docs/manager-recipes.md). It maps natural-language
 requests such as GoalBuddy conveyor runs, test coverage loops, UX polish loops,
-what-next nudging, and PR/CI/merge Ralph loops to concrete `manager-config`
-settings, permissions, evidence gates, cleanup behavior, and example
+what-next nudging, PR/CI/merge Ralph loops, and autonomous ship-it loops to
+concrete `manager-config` settings, permissions, evidence gates, cleanup
+behavior, and example
 manager/Dispatch/worker interactions. Use `conveyor manager-recipes --list`
 or `conveyor manager-recipes --show goalbuddy-conveyor --json` for a
 machine-readable setup preview.
@@ -178,6 +179,12 @@ are unavailable, open a separate Codex app worker manually and paste the
 completion or blocker report must go back through the generated
 `enqueue-notify-manager` command and a bounded Dispatch watch tick; a direct
 Codex app final answer is not a durable manager receipt.
+The live manager and worker sessions should also be readable as the primary
+operator transcript: after consuming an inbox item, the consuming session must
+print `CONVEYOR POLL`, `CONVEYOR RECEIVED`, `WORK`, `CONVEYOR SEND`, and
+`DISPATCH` sections while the turn is happening. SQLite/replay/status output is
+audit proof, not a replacement for the live session story. Idle polls may be a
+single `CONVEYOR IDLE` line.
 Dispatch is core infrastructure for supervised worker/manager pairs. The
 `pair` workflow starts a detached Dispatch watch process by default so worker
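
Reviewer note on the hunk above: the added README text names five transcript sections for a consumed inbox item and a single `CONVEYOR IDLE` line for an idle poll. The sketch below shows one way a session could render that shape. Only the section names come from the README; `renderTurn`, `TurnNotes`, and the two-line layout per section are hypothetical and are not part of the package.

```typescript
// Section names are taken from the README text in this diff.
// Everything else (function name, indentation, error text) is an assumption.
const SECTION_ORDER: string[] = [
  "CONVEYOR POLL",
  "CONVEYOR RECEIVED",
  "WORK",
  "CONVEYOR SEND",
  "DISPATCH",
];

type TurnNotes = { [section: string]: string };

function renderTurn(notes: TurnNotes | null): string {
  // An idle poll collapses to a single line.
  if (notes === null) {
    return "CONVEYOR IDLE";
  }
  const lines: string[] = [];
  for (const section of SECTION_ORDER) {
    const body = notes[section];
    if (body === undefined) {
      // A consumed item should show every section, so refuse partial output.
      throw new Error("missing transcript section: " + section);
    }
    lines.push(section);
    lines.push("  " + body);
  }
  return lines.join("\n");
}
```

Throwing on a missing section is a design choice for the sketch: it keeps a partially narrated turn from looking complete in the live session.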
@@ -195,6 +202,9 @@ Use `conveyor qa-plan adversarial-triggers` to verify natural-language
 manager prompts activate Ralph-loop adversarial gates.
 Use `conveyor qa-plan goalbuddy-conveyor` when a broad request should become
 sequential GoalBuddy child boards with PR/CI/merge receipts.
+Use `conveyor qa-plan ship-it-loop` when a manager is allowed to push a branch,
+open a PR, monitor CI, resolve bounded conflicts, and merge only after explicit
+manager-owned merge evidence.
 Before cutting a manager loose, have it resolve the freeform setup request to a
 named recipe from `docs/manager-recipes.md` or an explicit `custom` setup, then
 show the saved mode, permissions, evidence gates, cleanup policy, and disallowed
@@ -285,6 +295,64 @@ tmux attach -t codex-live-test
 ## Commands
+### Campaigns
+- `campaign create --name C --objective TEXT [--metadata-json JSON] [--json]` —
+  Create a campaign record for a multi-worker initiative.
+- `campaign add-slot --name C --slot-key K --role-label TEXT [--channel CH] [--session-id S] [--thread-id ID] [--thread-title TITLE] [--state planned|active|idle|blocked|archived] [--metadata-json JSON] [--json]` —
+  Add a named worker slot to a campaign. If `--session-id` is supplied, the
+  runtime verifies that it is a registered worker session.
+- `campaign attach-slot --name C --slot SLOT_ID [--session-id S] [--thread-id ID] [--thread-title TITLE] [--state planned|active|idle|blocked|archived] [--metadata-json JSON] [--json]` —
+  Attach or refresh worker-session and Codex app thread metadata for an existing
+  campaign slot. Supplying `--session-id` requires a registered worker session.
+- `campaign rotate-slot --name C --slot SLOT_ID --expected-thread-id OLD --thread-id NEW [--thread-title TITLE] [--session-id S] [--state planned|active|idle|blocked|archived] [--json]` —
+  Record a campaign-owned worker slot rotation. The command refuses to update
+  the slot unless `OLD` matches the slot's current Codex app thread id.
+- `campaign archive-slot --name C --slot SLOT_ID --expected-thread-id CURRENT [--json]` —
+  Mark a campaign worker slot archived only when the expected current thread id
+  matches the slot record.
+- `campaign brief --name C --channel CH --brief-json JSON [--json]` —
+  Upsert the structured brief for a channel.
+- `campaign assign --name C --slot SLOT_ID --title TEXT --instructions TEXT [--status queued|active|blocked|done|cancelled] [--metadata-json JSON] [--json]` —
+  Create a slot-scoped assignment.
+- `campaign asset --name C --slot SLOT_ID --asset-type image|video|hyperframes|copy|audio|other --title TEXT [--assignment ASSIGNMENT_ID] [--channel CH] [--status draft|needs_review|approved|rejected|published] [--prompt-summary TEXT] [--artifact-path PATH] [--metadata-json JSON] [--review-notes TEXT] [--json]` —
+  Record a structured creative asset receipt.
+- `campaign status --name C [--json]` —
+  Show campaign metadata, worker slots, channel briefs, assignment counts, and
+  asset receipt counts.
+- `campaign dashboard --name C [--json]` —
+  Show a manager-oriented campaign aggregate: worker slot lifecycle states,
+  blockers, approval counts, and the next recommended manager action.
+Creative Ops Campaign manager loop:
+```bash
+conveyor campaign create --name "$CAMPAIGN" \
+  --objective "Produce reviewable channel assets." --json
+conveyor campaign add-slot --name "$CAMPAIGN" --slot-key tiktok \
+  --role-label "TikTok worker" --channel tiktok \
+  --thread-id "$TIKTOK_THREAD_ID" --thread-title "TikTok Worker" \
+  --state active --json
+conveyor campaign brief --name "$CAMPAIGN" --channel tiktok \
+  --brief-json '{"format":"9:16","review_gate":"human approval before publish"}' --json
+conveyor campaign assign --name "$CAMPAIGN" --slot "$SLOT_ID" \
+  --title "Draft TikTok hooks" \
+  --instructions "Create reviewable draft copy only; do not publish." \
+  --status active --json
+conveyor campaign asset --name "$CAMPAIGN" --slot "$SLOT_ID" \
+  --assignment "$ASSIGNMENT_ID" --asset-type copy \
+  --title "TikTok hooks v1" --status needs_review \
+  --prompt-summary "Sanitized prompt summary only." --json
+conveyor campaign dashboard --name "$CAMPAIGN" --json
+conveyor dashboard --campaign "$CAMPAIGN" --ensure-dispatch
+```
+Use `campaign rotate-slot` or `campaign archive-slot` only with the exact
+current `--expected-thread-id` for that campaign slot. Public publishing,
+scheduling, posting, external account access, private phone content, raw audio,
+tokens, JWTs, keys, archives, and IPAs require explicit human approval or must
+stay out of receipts.
 ### Sessions and binding
 - `start-worker --name N [--cwd D] [--task "..."] [--sandbox SANDBOX] [--ask-for-approval ASK_FOR_APPROVAL] [--accept-trust] [--timeout-seconds N]` —
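
Reviewer note on the Campaigns hunk above: `campaign rotate-slot` and `campaign archive-slot` are described as refusing to act unless `--expected-thread-id` matches the slot's current Codex app thread id. That reads like a compare-and-set guard. The following is a minimal in-memory sketch of that idea, assuming a hypothetical `SlotRecord` shape; the real schema in `campaigns.js` is not shown in this README diff.

```typescript
// Hypothetical model of a campaign slot. The state names come from the
// README's --state option; the field names are assumptions.
interface SlotRecord {
  slotId: string;
  threadId: string;
  state: "planned" | "active" | "idle" | "blocked" | "archived";
}

interface GuardResult {
  ok: boolean;
  slot: SlotRecord; // unchanged when ok is false
  reason: string;   // empty when ok is true
}

const STALE_REASON = "expected thread id does not match the slot record";

function rotateSlot(slot: SlotRecord, expectedThreadId: string, newThreadId: string): GuardResult {
  // Refuse unless the caller's view of the slot is still current.
  if (slot.threadId !== expectedThreadId) {
    return { ok: false, slot: slot, reason: STALE_REASON };
  }
  const updated: SlotRecord = { slotId: slot.slotId, threadId: newThreadId, state: slot.state };
  return { ok: true, slot: updated, reason: "" };
}

function archiveSlot(slot: SlotRecord, expectedThreadId: string): GuardResult {
  // Same guard: a stale thread id must never archive the slot.
  if (slot.threadId !== expectedThreadId) {
    return { ok: false, slot: slot, reason: STALE_REASON };
  }
  const archived: SlotRecord = { slotId: slot.slotId, threadId: slot.threadId, state: "archived" };
  return { ok: true, slot: archived, reason: "" };
}
```

Returning the untouched record on refusal makes it easy for a caller to confirm that a stale request changed nothing.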
@@ -377,7 +445,10 @@ tmux attach -t codex-live-test
   required to make an idle app thread poll autonomously. The worker handoff and
   worker heartbeat prompt also include the exact durable
   `enqueue-notify-manager` and one-iteration `dispatch --watch` commands that a
-  worker must run after completing or blocking on a consumed item. Those
+  worker must run after completing or blocking on a consumed item. Those prompts
+  require live session transcript blocks for consumed items, so the operator can
+  inspect the actual Codex app sessions and see the same flow the durable inbox
+  later proves. Those
   recommendations also include `wakeup_dispatch_command` and
   `delivery_receipt_commands` for
   app-thread wake recovery. Use them to record sent, skipped, and blocked wake
@@ -385,7 +456,11 @@ tmux attach -t codex-live-test
   completion. The recommendations include a `teardown_policy`: an idle poll is
   only a quiet interval, not a reason to delete or pause heartbeat automation;
   heartbeat teardown belongs to the manager/operator after terminal closeout or
-  explicit operator instruction.
+  explicit operator instruction. For same-thread Codex app visible-session
+  dogfood, prefer `--template app_visible_build_loop` or a custom adversarial
+  gate; reserve cleanup-gated templates such as `build_then_clear` for flows
+  that create a fresh worker context or can record a real cleanup receipt
+  between iterations.
   The optional
   Codex app thread metadata is normally supplied after a Codex app manager has
   used `create_thread` and `set_thread_title`; terminal-only users can omit it
@@ -426,6 +501,18 @@ tmux attach -t codex-live-test
   heartbeats since the last command or inbox-consumption receipt, it recommends
   `stop_autopilot` so operators can quiesce blocked/no-progress loops instead
   of repeating idle pulses.
+- `app-worker-rotation-plan TASK --old-worker-thread-id ID [--require-handoff]
+  [--reason TEXT] [--json]` — Prepare a Codex app fresh-worker rotation. The
+  CLI verifies that `ID` exactly matches the active bound worker session before
+  emitting adapter-ready actions to create a replacement worker thread and
+  archive the old worker thread. Blocked plans contain no archive action.
+- `app-worker-rotation-record TASK --old-worker-thread-id OLD
+  --new-worker-thread-id NEW [--new-worker-thread-title TITLE]
+  --archive-status archived|blocked [--reason TEXT] [--json]` — Record the
+  result after the Codex app layer creates the replacement worker thread and
+  archives, or blocks on archiving, the old thread. The command re-checks active
+  binding ownership before updating the worker session to the new app thread id,
+  so a stale plan cannot archive or replace an unrelated thread.
 - `discover [QUERY] [--all] [--limit N]` / `search [QUERY]` — Search tasks,
   registered sessions, active bindings, and recent telemetry in one JSON result.
   Use this for conversational setup when a manager or Codex session needs to
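
Reviewer note on the hunk above: `app-worker-rotation-plan` is documented as emitting adapter-ready actions only when the supplied id exactly matches the active bound worker session, and blocked plans contain no archive action. A rough sketch of that rule follows. The action names `create_thread` and `set_thread_archived` appear elsewhere in this README; the `RotationPlan` shape and `planWorkerRotation` are hypothetical and may not match the CLI's JSON output.

```typescript
// Hypothetical plan shape; the CLI's actual --json fields are not shown here.
interface RotationPlan {
  status: "ready" | "blocked";
  actions: string[];
  reason: string;
}

function planWorkerRotation(activeBoundThreadId: string | null, oldWorkerThreadId: string): RotationPlan {
  // Block when there is no active binding or the ids differ in any way.
  if (activeBoundThreadId === null || activeBoundThreadId !== oldWorkerThreadId) {
    return {
      status: "blocked",
      actions: [], // a blocked plan carries no archive action
      reason: "old worker thread id does not match the active bound worker session",
    };
  }
  return {
    status: "ready",
    actions: ["create_thread", "set_thread_archived"],
    reason: "",
  };
}
```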
@@ -456,7 +543,8 @@ tmux attach -t codex-live-test
   flags. Use `--interactive` only as a terminal fallback when a human is
   running `conveyor` directly.
   `--permit` grants taxonomy permissions such as `repo.open_pr`,
-  `verification.run_pytest`, `context.spawn_reviewer`,
+  `repo.push_branch`, `repo.monitor_ci`, `repo.resolve_conflicts`,
+  `repo.merge_green_pr`, `verification.run_pytest`, `context.spawn_reviewer`,
   `communication.notify_operator`, or `worker_session.compact`. Use `--tool`
   to record expected verification/context tools, `--epilogue` for required
   built-in finish steps (`run-tools`, `draft-pr`, `subagent-review`,
@@ -497,7 +585,13 @@ tmux attach -t codex-live-test
   [--json]` — Draft reviewed `criteria --add` commands from a worker response
   that separates must-have current-task criteria from deferred follow-ups. This
   helper is read-only: it resolves the task and prints suggestions, but does not
-  mutate acceptance criteria, events, or commands.
+  mutate acceptance criteria, events, or commands. If a proposed criterion
+  appears to describe manager closeout mechanics such as `finish-task`,
+  `--require-criteria-audit`, heartbeat teardown, or final manager reporting,
+  the helper emits a non-blocking warning and classifies that suggestion as
+  manager closeout proof. Keep that proof in the manager final report, audit,
+  replay, or epilogue evidence instead of accepted worker/task criteria unless
+  the task is explicitly Conveyor closeout QA.
   ```bash
   conveyor criteria-plan my-task --from-worker-response response.md --json
   ```
@@ -563,6 +657,10 @@ tmux attach -t codex-live-test
   before/after sending the worker instruction. `--dry-run` still records the
   command in `commands`, `replay`, and `mutation-audit` with `dry_run: true`
   and `sent: false`.
+  In Codex app threads, remote `/compact` or `/clear` sent through
+  `send_message_to_thread` is prompt text, not an executable slash command. Use
+  `app-worker-rotation-plan` plus Codex app `create_thread` and
+  `set_thread_archived` when fresh context is required for app-native workers.
 - `bind --task T --worker W --manager M` — Create the task binding.
 - `unbind --task T` — End the active binding for a task.
 - `finish-task <task> [--reason R] [--require-criteria-audit]
@@ -592,7 +690,7 @@ tmux attach -t codex-live-test
 ### Observation
-- `dashboard [--task T] [--ensure-dispatch] [--dispatcher-id ID]
+- `dashboard [--task T] [--campaign C] [--ensure-dispatch] [--dispatcher-id ID]
   [--host 127.0.0.1] [--port 8797]` — Launch the
   local live supervision cockpit. The dashboard binds to loopback by default,
   uses the TypeScript backend to shell out to `conveyor` JSON commands, and
@@ -600,10 +698,12 @@ tmux attach -t codex-live-test
   a WebSocket PTY bridge. It includes browser bootstrap controls for creating a
   task, starting a worker/manager pair with `conveyor pair`, auto-attaching the
   terminals, attach/bind controls, and audited action receipts for cycle,
-  nudge, interrupt, finish, and export. With `--ensure-dispatch`, launch also
-  ensures a Dispatch watch process using the supplied `--dispatcher-id` when
-  provided, reusing only a fresh heartbeat from that same dispatcher id. Use
-  `--dry-run --json` to inspect the launch command.
+  nudge, interrupt, finish, and export. With `--campaign`, the observation rail
+  also shows campaign slot lifecycle, blockers, approval counts, and the next
+  manager action. With `--ensure-dispatch`, launch also ensures a Dispatch watch
+  process using the supplied `--dispatcher-id` when provided, reusing only a
+  fresh heartbeat from that same dispatcher id. Use `--dry-run --json` to
+  inspect the launch command.
 - `cycle <task> [--busy-wait-seconds N]` — One observation cycle. Idempotent. Runs `ingest`, computes
   worker state from the JSON event stream, captures the tmux pane as a shadow
   signal, writes a `manager_cycles` row, and returns a JSON dict the manager
@@ -750,9 +850,9 @@ tmux attach -t codex-live-test
 - `transcript-show <task> [--role R] [--include-content]` — Show stored
   transcript segment metadata. Segment text is redacted unless
   `--include-content` is passed.
-- `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor>` — Print a
+- `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor|ship-it-loop>` — Print a
   repeatable manual QA checklist.
-- `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop> --receipt-output RECEIPT.json [--path DB]` —
+- `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop|ship-it-loop> --receipt-output RECEIPT.json [--path DB]` —
   Run a deterministic no-tmux QA harness and save a JSON receipt.
   `ralph-loop-guardrails` proves max-iteration cutoff, missing-evidence
   cutoff, fresh retry delivery after structured `adversarial_check` evidence,
@@ -772,6 +872,10 @@ tmux attach -t codex-live-test
   `build-clear-loop` proves the non-coverage `build_then_clear` template
   blocks before `build_passed` and `cleanup` receipts, still blocks after build
   evidence alone, and delivers only after both build and cleanup evidence exist.
+  `ship-it-loop` proves push, PR, and merge commands fail closed until their
+  permissions are granted, then proves the `ship_it_loop` lifecycle blocks
+  before branch, PR, CI, mergeability, manager decision, merge, post-merge, and
+  adversarial receipts exist.
 - `loop-triggers --list|--classify PROMPT [--json]` — List the controlled
   natural-language loop triggers or classify a manager/operator prompt before
   creating a loop policy or continuation gate. Approved trigger phrases include
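
Reviewer note on the `ship-it-loop` text in the hunk above: the harness is described as proving two things, that push, PR, and merge commands fail closed without permission, and that the lifecycle blocks until a list of receipts exists. The sketch below models both checks under stated assumptions. The permission names come from the `--permit` list in this README, but the command-to-permission mapping, the receipt identifiers, and the choice to check all receipts at once are illustrative guesses, not the package's implementation.

```typescript
// Assumed mapping from a command to the permission that unlocks it.
const COMMAND_PERMISSION: { [command: string]: string } = {
  push: "repo.push_branch",
  open_pr: "repo.open_pr",
  merge: "repo.merge_green_pr",
};

// Receipt kinds listed in the README prose; the identifiers are hypothetical.
const REQUIRED_RECEIPTS: string[] = [
  "branch",
  "pr",
  "ci",
  "mergeability",
  "manager_decision",
  "merge",
  "post_merge",
  "adversarial_check",
];

function commandAllowed(command: string, granted: string[]): boolean {
  // Fail closed: unknown commands are refused, not waved through.
  if (!Object.prototype.hasOwnProperty.call(COMMAND_PERMISSION, command)) {
    return false;
  }
  return granted.indexOf(COMMAND_PERMISSION[command]) !== -1;
}

function missingReceipts(recorded: string[]): string[] {
  const missing: string[] = [];
  for (const name of REQUIRED_RECEIPTS) {
    if (recorded.indexOf(name) === -1) {
      missing.push(name);
    }
  }
  return missing;
}

function lifecycleBlocked(recorded: string[]): boolean {
  // The gate stays closed while any required receipt is absent.
  return missingReceipts(recorded).length > 0;
}
```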
@@ -785,11 +889,16 @@ tmux attach -t codex-live-test
   evidence blocks a manager continuation before worker delivery until matching
   satisfied criterion evidence exists. `ralph-loop-presets` remains as a
   compatibility alias for the current Ralph-loop QA flows. The built-in
+  `app_visible_build_loop` template requires `build_passed` plus structured
+  `adversarial_check` evidence, but no cleanup evidence, so visible Codex app
+  threads can continue without pretending that same-thread context was cleared.
+  The built-in
   `visual_diff_loop` template requires `reference_artifact`,
   `candidate_screenshot`, `visual_diff_report`, `diff_below_threshold`, and
   `adversarial_check` evidence before a manager-requested next visual pass can
-  reach the worker. Quality-oriented templates (`pr_ci_merge_loop`,
-  `test_coverage_loop`, and `visual_diff_loop`) also expose an
+  reach the worker. Quality-oriented templates (`app_visible_build_loop`,
+  `pr_ci_merge_loop`, `ship_it_loop`, `test_coverage_loop`, and
+  `visual_diff_loop`) also expose an
   `artifact_requirements["adversarial_check"]` object requiring
   `failure_mode`, `check`, and `result` fields.
 - `loop-status TASK --run RUN [--json]` — Summarize a Ralph-loop run for manager
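
Reviewer note on the hunk above: quality-oriented templates are said to expose `artifact_requirements["adversarial_check"]` requiring `failure_mode`, `check`, and `result` fields. A small validation sketch follows. The three field names come from the README; treating each as a non-empty string is an assumption made for illustration, and `missingAdversarialFields` is a hypothetical helper.

```typescript
// Field names are from the README; the non-empty-string rule is assumed.
const ADVERSARIAL_CHECK_FIELDS: string[] = ["failure_mode", "check", "result"];

function missingAdversarialFields(artifact: { [key: string]: unknown }): string[] {
  const missing: string[] = [];
  for (const field of ADVERSARIAL_CHECK_FIELDS) {
    const value = artifact[field];
    // Treat absent, non-string, and blank values as missing.
    if (typeof value !== "string" || value.trim() === "") {
      missing.push(field);
    }
  }
  return missing;
}

function isStructuredAdversarialCheck(artifact: { [key: string]: unknown }): boolean {
  return missingAdversarialFields(artifact).length === 0;
}
```

A manager could run a check like this before recording the evidence, so that a free-text "looks fine" note is not mistaken for structured `adversarial_check` evidence.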
@@ -812,7 +921,9 @@ thread tools are unavailable, create the binding anyway and paste the returned
 `worker_handoff` prompt into a manually opened worker session. The handoff
 requires a worker to report completion/blockers through
 `enqueue-notify-manager` plus a bounded Dispatch watch run before treating the
-manager as notified.
+manager as notified, and to print the live `CONVEYOR POLL` / `CONVEYOR
+RECEIVED` / `WORK` / `CONVEYOR SEND` / `DISPATCH` transcript in the worker
+session for any consumed item.
 - `enqueue-continue-iteration TASK --loop-run RUN --requested-iteration N` —
   Queue a manager-requested next loop pass for Dispatch. The command refuses
   same/current iteration requests before they become pending queue rows, while
@@ -824,6 +935,8 @@ manager as notified.
   artifact requirements, and recommended tools.
 - `loop-evidence add TASK --loop-run RUN --iteration N --evidence-type TYPE` —
   Record a run-qualified evidence receipt for a loop policy. Use
+  `loop-evidence build-passed TASK --loop-run RUN --iteration N` as the
+  friendly alias for the common `evidence_type=build_passed` receipt. Use
   `loop-evidence visual-diff` to compare PNG screenshots, write an optional
   diff/report artifact, and record `visual_diff_report` plus
   `diff_below_threshold` as satisfied only when the computed score is within
@@ -869,14 +982,17 @@ conveyor qa-plan dispatch-completion
 conveyor qa-plan ralph-loop
 conveyor qa-plan adversarial-triggers
 conveyor qa-plan goalbuddy-conveyor
+conveyor qa-plan ship-it-loop
 conveyor qa-run ralph-loop-guardrails --receipt-output /tmp/ralph-loop-guardrails-receipt.json --json
 conveyor qa-run generic-loop-template --receipt-output /tmp/generic-loop-template-receipt.json --json
 conveyor qa-run generic-loop-template-browser --receipt-output /tmp/generic-loop-template-browser-receipt.json --json
 conveyor qa-run test-coverage-loop --receipt-output /tmp/test-coverage-loop-receipt.json --json
 conveyor qa-run adversarial-triggers --receipt-output /tmp/adversarial-triggers-receipt.json --json
 conveyor qa-run build-clear-loop --receipt-output /tmp/build-clear-loop-receipt.json --json
+conveyor qa-run ship-it-loop --receipt-output /tmp/ship-it-loop-receipt.json --json
 conveyor loop-triggers --classify "Run this as an adversarially gated Ralph loop." --json
 conveyor loop-templates --list --json
+conveyor loop-templates --show ship_it_loop --json
 conveyor loop-templates --show visual_diff_loop --json
 conveyor loop-evidence visual-diff qa-task --loop-run "$RUN_ID" --iteration 1 --reference reference.png --candidate candidate.png --threshold 0.02 --report-output visual-diff.json --diff-output visual-diff.png
 conveyor ralph-loop-presets --list --json