agent-conveyor 0.1.13 → 0.1.15

package/README.md CHANGED
@@ -139,8 +139,9 @@ release version that is not already on npm.
139
139
  For common manager setups, start with
140
140
  [`docs/manager-recipes.md`](docs/manager-recipes.md). It maps natural-language
141
141
  requests such as GoalBuddy conveyor runs, test coverage loops, UX polish loops,
142
- what-next nudging, and PR/CI/merge Ralph loops to concrete `manager-config`
143
- settings, permissions, evidence gates, cleanup behavior, and example
142
+ what-next nudging, PR/CI/merge Ralph loops, and autonomous ship-it loops to
143
+ concrete `manager-config` settings, permissions, evidence gates, cleanup
144
+ behavior, and example
144
145
  manager/Dispatch/worker interactions. Use `conveyor manager-recipes --list`
145
146
  or `conveyor manager-recipes --show goalbuddy-conveyor --json` for a
146
147
  machine-readable setup preview.
@@ -178,6 +179,12 @@ are unavailable, open a separate Codex app worker manually and paste the
178
179
  completion or blocker report must go back through the generated
179
180
  `enqueue-notify-manager` command and a bounded Dispatch watch tick; a direct
180
181
  Codex app final answer is not a durable manager receipt.
182
+ The live manager and worker sessions should also be readable as the primary
183
+ operator transcript: after consuming an inbox item, the consuming session must
184
+ print `CONVEYOR POLL`, `CONVEYOR RECEIVED`, `WORK`, `CONVEYOR SEND`, and
185
+ `DISPATCH` sections while the turn is happening. SQLite/replay/status output is
186
+ audit proof, not a replacement for the live session story. Idle polls may be a
187
+ single `CONVEYOR IDLE` line.
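
The transcript rule above can be spot-checked against a captured session log. The following is a minimal sketch, not part of `conveyor`: `check_conveyor_transcript` is a hypothetical helper, and it assumes each section starts on a line that begins with its marker name, which the README does not specify.

```bash
# Hypothetical helper (not shipped with conveyor). Reads a captured session
# transcript on stdin and succeeds when it is either an idle poll or contains
# the five section markers in the documented order.
# Assumption: every section begins on a line that starts with its marker.
check_conveyor_transcript() {
  awk '
    BEGIN {
      n = split("CONVEYOR POLL,CONVEYOR RECEIVED,WORK,CONVEYOR SEND,DISPATCH", want, ",")
      next_idx = 1
    }
    # An idle poll may be a single CONVEYOR IDLE line.
    /^CONVEYOR IDLE/ { idle = 1 }
    # Advance only when the next expected marker starts the current line.
    next_idx <= n && index($0, want[next_idx]) == 1 { next_idx++ }
    END {
      ok = (next_idx > n) || (idle && next_idx == 1)
      exit (ok ? 0 : 1)
    }
  '
}
```

For example, `check_conveyor_transcript < session.log && echo "transcript complete"`, where `session.log` is whatever capture of the live session you already keep. This only checks marker order; it does not replace the SQLite/replay/status audit output.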
181
188
 
182
189
  Dispatch is core infrastructure for supervised worker/manager pairs. The
183
190
  `pair` workflow starts a detached Dispatch watch process by default so worker
@@ -195,6 +202,9 @@ Use `conveyor qa-plan adversarial-triggers` to verify natural-language
195
202
  manager prompts activate Ralph-loop adversarial gates.
196
203
  Use `conveyor qa-plan goalbuddy-conveyor` when a broad request should become
197
204
  sequential GoalBuddy child boards with PR/CI/merge receipts.
205
+ Use `conveyor qa-plan ship-it-loop` when a manager is allowed to push a branch,
206
+ open a PR, monitor CI, resolve bounded conflicts, and merge only after explicit
207
+ manager-owned merge evidence.
198
208
  Before cutting a manager loose, have it resolve the freeform setup request to a
199
209
  named recipe from `docs/manager-recipes.md` or an explicit `custom` setup, then
200
210
  show the saved mode, permissions, evidence gates, cleanup policy, and disallowed
@@ -285,6 +295,64 @@ tmux attach -t codex-live-test
285
295
 
286
296
  ## Commands
287
297
 
298
+ ### Campaigns
299
+
300
+ - `campaign create --name C --objective TEXT [--metadata-json JSON] [--json]` —
301
+ Create a campaign record for a multi-worker initiative.
302
+ - `campaign add-slot --name C --slot-key K --role-label TEXT [--channel CH] [--session-id S] [--thread-id ID] [--thread-title TITLE] [--state planned|active|idle|blocked|archived] [--metadata-json JSON] [--json]` —
303
+ Add a named worker slot to a campaign. If `--session-id` is supplied, the
304
+ runtime verifies that it is a registered worker session.
305
+ - `campaign attach-slot --name C --slot SLOT_ID [--session-id S] [--thread-id ID] [--thread-title TITLE] [--state planned|active|idle|blocked|archived] [--metadata-json JSON] [--json]` —
306
+ Attach or refresh worker-session and Codex app thread metadata for an existing
307
+ campaign slot. Supplying `--session-id` requires a registered worker session.
308
+ - `campaign rotate-slot --name C --slot SLOT_ID --expected-thread-id OLD --thread-id NEW [--thread-title TITLE] [--session-id S] [--state planned|active|idle|blocked|archived] [--json]` —
309
+ Record a campaign-owned worker slot rotation. The command refuses to update
310
+ the slot unless `OLD` matches the slot's current Codex app thread id.
311
+ - `campaign archive-slot --name C --slot SLOT_ID --expected-thread-id CURRENT [--json]` —
312
+ Mark a campaign worker slot archived only when the expected current thread id
313
+ matches the slot record.
314
+ - `campaign brief --name C --channel CH --brief-json JSON [--json]` —
315
+ Upsert the structured brief for a channel.
316
+ - `campaign assign --name C --slot SLOT_ID --title TEXT --instructions TEXT [--status queued|active|blocked|done|cancelled] [--metadata-json JSON] [--json]` —
317
+ Create a slot-scoped assignment.
318
+ - `campaign asset --name C --slot SLOT_ID --asset-type image|video|hyperframes|copy|audio|other --title TEXT [--assignment ASSIGNMENT_ID] [--channel CH] [--status draft|needs_review|approved|rejected|published] [--prompt-summary TEXT] [--artifact-path PATH] [--metadata-json JSON] [--review-notes TEXT] [--json]` —
319
+ Record a structured creative asset receipt.
320
+ - `campaign status --name C [--json]` —
321
+ Show campaign metadata, worker slots, channel briefs, assignment counts, and
322
+ asset receipt counts.
323
+ - `campaign dashboard --name C [--json]` —
324
+ Show a manager-oriented campaign aggregate: worker slot lifecycle states,
325
+ blockers, approval counts, and the next recommended manager action.
326
+
327
+ Creative Ops Campaign manager loop:
328
+
329
+ ```bash
330
+ conveyor campaign create --name "$CAMPAIGN" \
331
+ --objective "Produce reviewable channel assets." --json
332
+ conveyor campaign add-slot --name "$CAMPAIGN" --slot-key tiktok \
333
+ --role-label "TikTok worker" --channel tiktok \
334
+ --thread-id "$TIKTOK_THREAD_ID" --thread-title "TikTok Worker" \
335
+ --state active --json
336
+ conveyor campaign brief --name "$CAMPAIGN" --channel tiktok \
337
+ --brief-json '{"format":"9:16","review_gate":"human approval before publish"}' --json
338
+ conveyor campaign assign --name "$CAMPAIGN" --slot "$SLOT_ID" \
339
+ --title "Draft TikTok hooks" \
340
+ --instructions "Create reviewable draft copy only; do not publish." \
341
+ --status active --json
342
+ conveyor campaign asset --name "$CAMPAIGN" --slot "$SLOT_ID" \
343
+ --assignment "$ASSIGNMENT_ID" --asset-type copy \
344
+ --title "TikTok hooks v1" --status needs_review \
345
+ --prompt-summary "Sanitized prompt summary only." --json
346
+ conveyor campaign dashboard --name "$CAMPAIGN" --json
347
+ conveyor dashboard --campaign "$CAMPAIGN" --ensure-dispatch
348
+ ```
349
+
350
+ Use `campaign rotate-slot` or `campaign archive-slot` only with the exact
351
+ current `--expected-thread-id` for that campaign slot. Public publishing,
352
+ scheduling, posting, external account access, private phone content, raw audio,
353
+ tokens, JWTs, keys, archives, and IPAs require explicit human approval or must
354
+ stay out of receipts.
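
One way to make the `--expected-thread-id` requirement hard to skip is to build the rotation command through a small wrapper and review it before running it. The sketch below is an illustration only: `rotate_slot_cmd` is a hypothetical helper, it uses just the `campaign rotate-slot` flags listed above, and it assumes the values contain no double quotes. The CLI itself still performs the real comparison against the slot record.

```bash
# Hypothetical helper (not shipped with conveyor). Prints a rotate-slot
# invocation for review instead of executing it.
# $1 campaign name, $2 slot id, $3 current (expected) thread id, $4 new thread id
rotate_slot_cmd() {
  # Refuse to build a command when any value is missing or empty.
  if [ "$#" -ne 4 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ] || [ -z "$4" ]; then
    echo "rotate_slot_cmd: need campaign, slot, expected thread id, new thread id" >&2
    return 1
  fi
  # Assumption: values contain no double quotes, so simple quoting is enough.
  printf 'conveyor campaign rotate-slot --name "%s" --slot "%s" --expected-thread-id "%s" --thread-id "%s" --json\n' \
    "$1" "$2" "$3" "$4"
}
```

Calling `rotate_slot_cmd "$CAMPAIGN" "$SLOT_ID" "$OLD_THREAD_ID" "$NEW_THREAD_ID"` prints the command; `OLD_THREAD_ID` and `NEW_THREAD_ID` are placeholder variable names, not values the README defines.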
355
+
288
356
  ### Sessions and binding
289
357
 
290
358
  - `start-worker --name N [--cwd D] [--task "..."] [--sandbox SANDBOX] [--ask-for-approval ASK_FOR_APPROVAL] [--accept-trust] [--timeout-seconds N]` —
@@ -377,7 +445,10 @@ tmux attach -t codex-live-test
377
445
  required to make an idle app thread poll autonomously. The worker handoff and
378
446
  worker heartbeat prompt also include the exact durable
379
447
  `enqueue-notify-manager` and one-iteration `dispatch --watch` commands that a
380
- worker must run after completing or blocking on a consumed item. Those
448
+ worker must run after completing or blocking on a consumed item. Those prompts
449
+ require live session transcript blocks for consumed items, so the operator can
450
+ inspect the actual Codex app sessions and see the same flow the durable inbox
451
+ later proves. Those
381
452
  recommendations also include `wakeup_dispatch_command` and
382
453
  `delivery_receipt_commands` for
383
454
  app-thread wake recovery. Use them to record sent, skipped, and blocked wake
@@ -385,7 +456,11 @@ tmux attach -t codex-live-test
385
456
  completion. The recommendations include a `teardown_policy`: an idle poll is
386
457
  only a quiet interval, not a reason to delete or pause heartbeat automation;
387
458
  heartbeat teardown belongs to the manager/operator after terminal closeout or
388
- explicit operator instruction.
459
+ explicit operator instruction. For same-thread Codex app visible-session
460
+ dogfood, prefer `--template app_visible_build_loop` or a custom adversarial
461
+ gate; reserve cleanup-gated templates such as `build_then_clear` for flows
462
+ that create a fresh worker context or can record a real cleanup receipt
463
+ between iterations.
389
464
  The optional
390
465
  Codex app thread metadata is normally supplied after a Codex app manager has
391
466
  used `create_thread` and `set_thread_title`; terminal-only users can omit it
@@ -426,6 +501,18 @@ tmux attach -t codex-live-test
426
501
  heartbeats since the last command or inbox-consumption receipt, it recommends
427
502
  `stop_autopilot` so operators can quiesce blocked/no-progress loops instead
428
503
  of repeating idle pulses.
504
+ - `app-worker-rotation-plan TASK --old-worker-thread-id ID [--require-handoff]
505
+ [--reason TEXT] [--json]` — Prepare a Codex app fresh-worker rotation. The
506
+ CLI verifies that `ID` exactly matches the active bound worker session before
507
+ emitting adapter-ready actions to create a replacement worker thread and
508
+ archive the old worker thread. Blocked plans contain no archive action.
509
+ - `app-worker-rotation-record TASK --old-worker-thread-id OLD
510
+ --new-worker-thread-id NEW [--new-worker-thread-title TITLE]
511
+ --archive-status archived|blocked [--reason TEXT] [--json]` — Record the
512
+ result after the Codex app layer creates the replacement worker thread and
513
+ archives, or blocks on archiving, the old thread. The command re-checks active
514
+ binding ownership before updating the worker session to the new app thread id,
515
+ so a stale plan cannot archive or replace an unrelated thread.
429
516
  - `discover [QUERY] [--all] [--limit N]` / `search [QUERY]` — Search tasks,
430
517
  registered sessions, active bindings, and recent telemetry in one JSON result.
431
518
  Use this for conversational setup when a manager or Codex session needs to
@@ -456,7 +543,8 @@ tmux attach -t codex-live-test
456
543
  flags. Use `--interactive` only as a terminal fallback when a human is
457
544
  running `conveyor` directly.
458
545
  `--permit` grants taxonomy permissions such as `repo.open_pr`,
459
- `verification.run_pytest`, `context.spawn_reviewer`,
546
+ `repo.push_branch`, `repo.monitor_ci`, `repo.resolve_conflicts`,
547
+ `repo.merge_green_pr`, `verification.run_pytest`, `context.spawn_reviewer`,
460
548
  `communication.notify_operator`, or `worker_session.compact`. Use `--tool`
461
549
  to record expected verification/context tools, `--epilogue` for required
462
550
  built-in finish steps (`run-tools`, `draft-pr`, `subagent-review`,
@@ -497,7 +585,13 @@ tmux attach -t codex-live-test
497
585
  [--json]` — Draft reviewed `criteria --add` commands from a worker response
498
586
  that separates must-have current-task criteria from deferred follow-ups. This
499
587
  helper is read-only: it resolves the task and prints suggestions, but does not
500
- mutate acceptance criteria, events, or commands.
588
+ mutate acceptance criteria, events, or commands. If a proposed criterion
589
+ appears to describe manager closeout mechanics such as `finish-task`,
590
+ `--require-criteria-audit`, heartbeat teardown, or final manager reporting,
591
+ the helper emits a non-blocking warning and classifies that suggestion as
592
+ manager closeout proof. Keep that proof in the manager final report, audit,
593
+ replay, or epilogue evidence instead of accepted worker/task criteria unless
594
+ the task is explicitly Conveyor closeout QA.
501
595
  ```bash
502
596
  conveyor criteria-plan my-task --from-worker-response response.md --json
503
597
  ```
@@ -563,6 +657,10 @@ tmux attach -t codex-live-test
563
657
  before/after sending the worker instruction. `--dry-run` still records the
564
658
  command in `commands`, `replay`, and `mutation-audit` with `dry_run: true`
565
659
  and `sent: false`.
660
+ In Codex app threads, remote `/compact` or `/clear` sent through
661
+ `send_message_to_thread` is prompt text, not an executable slash command. Use
662
+ `app-worker-rotation-plan` plus Codex app `create_thread` and
663
+ `set_thread_archived` when fresh context is required for app-native workers.
566
664
  - `bind --task T --worker W --manager M` — Create the task binding.
567
665
  - `unbind --task T` — End the active binding for a task.
568
666
  - `finish-task <task> [--reason R] [--require-criteria-audit]
@@ -592,7 +690,7 @@ tmux attach -t codex-live-test
592
690
 
593
691
  ### Observation
594
692
 
595
- - `dashboard [--task T] [--ensure-dispatch] [--dispatcher-id ID]
693
+ - `dashboard [--task T] [--campaign C] [--ensure-dispatch] [--dispatcher-id ID]
596
694
  [--host 127.0.0.1] [--port 8797]` — Launch the
597
695
  local live supervision cockpit. The dashboard binds to loopback by default,
598
696
  uses the TypeScript backend to shell out to `conveyor` JSON commands, and
@@ -600,10 +698,12 @@ tmux attach -t codex-live-test
600
698
  a WebSocket PTY bridge. It includes browser bootstrap controls for creating a
601
699
  task, starting a worker/manager pair with `conveyor pair`, auto-attaching the
602
700
  terminals, attach/bind controls, and audited action receipts for cycle,
603
- nudge, interrupt, finish, and export. With `--ensure-dispatch`, launch also
604
- ensures a Dispatch watch process using the supplied `--dispatcher-id` when
605
- provided, reusing only a fresh heartbeat from that same dispatcher id. Use
606
- `--dry-run --json` to inspect the launch command.
701
+ nudge, interrupt, finish, and export. With `--campaign`, the observation rail
702
+ also shows campaign slot lifecycle, blockers, approval counts, and the next
703
+ manager action. With `--ensure-dispatch`, launch also ensures a Dispatch watch
704
+ process using the supplied `--dispatcher-id` when provided, reusing only a
705
+ fresh heartbeat from that same dispatcher id. Use `--dry-run --json` to
706
+ inspect the launch command.
607
707
  - `cycle <task> [--busy-wait-seconds N]` — One observation cycle. Idempotent. Runs `ingest`, computes
608
708
  worker state from the JSON event stream, captures the tmux pane as a shadow
609
709
  signal, writes a `manager_cycles` row, and returns a JSON dict the manager
@@ -750,9 +850,9 @@ tmux attach -t codex-live-test
750
850
  - `transcript-show <task> [--role R] [--include-content]` — Show stored
751
851
  transcript segment metadata. Segment text is redacted unless
752
852
  `--include-content` is passed.
753
- - `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor>` — Print a
853
+ - `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor|ship-it-loop>` — Print a
754
854
  repeatable manual QA checklist.
755
- - `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop> --receipt-output RECEIPT.json [--path DB]` —
855
+ - `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop|ship-it-loop> --receipt-output RECEIPT.json [--path DB]` —
756
856
  Run a deterministic no-tmux QA harness and save a JSON receipt.
757
857
  `ralph-loop-guardrails` proves max-iteration cutoff, missing-evidence
758
858
  cutoff, fresh retry delivery after structured `adversarial_check` evidence,
@@ -772,6 +872,10 @@ tmux attach -t codex-live-test
772
872
  `build-clear-loop` proves the non-coverage `build_then_clear` template
773
873
  blocks before `build_passed` and `cleanup` receipts, still blocks after build
774
874
  evidence alone, and delivers only after both build and cleanup evidence exist.
875
+ `ship-it-loop` proves push, PR, and merge commands fail closed until their
876
+ permissions are granted, then proves the `ship_it_loop` lifecycle blocks
877
+ before branch, PR, CI, mergeability, manager decision, merge, post-merge, and
878
+ adversarial receipts exist.
775
879
  - `loop-triggers --list|--classify PROMPT [--json]` — List the controlled
776
880
  natural-language loop triggers or classify a manager/operator prompt before
777
881
  creating a loop policy or continuation gate. Approved trigger phrases include
@@ -785,11 +889,16 @@ tmux attach -t codex-live-test
785
889
  evidence blocks a manager continuation before worker delivery until matching
786
890
  satisfied criterion evidence exists. `ralph-loop-presets` remains as a
787
891
  compatibility alias for the current Ralph-loop QA flows. The built-in
892
+ `app_visible_build_loop` template requires `build_passed` plus structured
893
+ `adversarial_check` evidence, but no cleanup evidence, so visible Codex app
894
+ threads can continue without pretending that same-thread context was cleared.
895
+ The built-in
788
896
  `visual_diff_loop` template requires `reference_artifact`,
789
897
  `candidate_screenshot`, `visual_diff_report`, `diff_below_threshold`, and
790
898
  `adversarial_check` evidence before a manager-requested next visual pass can
791
- reach the worker. Quality-oriented templates (`pr_ci_merge_loop`,
792
- `test_coverage_loop`, and `visual_diff_loop`) also expose an
899
+ reach the worker. Quality-oriented templates (`app_visible_build_loop`,
900
+ `pr_ci_merge_loop`, `ship_it_loop`, `test_coverage_loop`, and
901
+ `visual_diff_loop`) also expose an
793
902
  `artifact_requirements["adversarial_check"]` object requiring
794
903
  `failure_mode`, `check`, and `result` fields.
795
904
  - `loop-status TASK --run RUN [--json]` — Summarize a Ralph-loop run for manager
@@ -812,7 +921,9 @@ thread tools are unavailable, create the binding anyway and paste the returned
812
921
  `worker_handoff` prompt into a manually opened worker session. The handoff
813
922
  requires a worker to report completion/blockers through
814
923
  `enqueue-notify-manager` plus a bounded Dispatch watch run before treating the
815
- manager as notified.
924
+ manager as notified, and to print the live `CONVEYOR POLL` / `CONVEYOR
925
+ RECEIVED` / `WORK` / `CONVEYOR SEND` / `DISPATCH` transcript in the worker
926
+ session for any consumed item.
816
927
  - `enqueue-continue-iteration TASK --loop-run RUN --requested-iteration N` —
817
928
  Queue a manager-requested next loop pass for Dispatch. The command refuses
818
929
  same/current iteration requests before they become pending queue rows, while
@@ -824,6 +935,8 @@ manager as notified.
824
935
  artifact requirements, and recommended tools.
825
936
  - `loop-evidence add TASK --loop-run RUN --iteration N --evidence-type TYPE` —
826
937
  Record a run-qualified evidence receipt for a loop policy. Use
938
+ `loop-evidence build-passed TASK --loop-run RUN --iteration N` as the
939
+ friendly alias for the common `evidence_type=build_passed` receipt. Use
827
940
  `loop-evidence visual-diff` to compare PNG screenshots, write an optional
828
941
  diff/report artifact, and record `visual_diff_report` plus
829
942
  `diff_below_threshold` as satisfied only when the computed score is within
@@ -869,14 +982,17 @@ conveyor qa-plan dispatch-completion
869
982
  conveyor qa-plan ralph-loop
870
983
  conveyor qa-plan adversarial-triggers
871
984
  conveyor qa-plan goalbuddy-conveyor
985
+ conveyor qa-plan ship-it-loop
872
986
  conveyor qa-run ralph-loop-guardrails --receipt-output /tmp/ralph-loop-guardrails-receipt.json --json
873
987
  conveyor qa-run generic-loop-template --receipt-output /tmp/generic-loop-template-receipt.json --json
874
988
  conveyor qa-run generic-loop-template-browser --receipt-output /tmp/generic-loop-template-browser-receipt.json --json
875
989
  conveyor qa-run test-coverage-loop --receipt-output /tmp/test-coverage-loop-receipt.json --json
876
990
  conveyor qa-run adversarial-triggers --receipt-output /tmp/adversarial-triggers-receipt.json --json
877
991
  conveyor qa-run build-clear-loop --receipt-output /tmp/build-clear-loop-receipt.json --json
992
+ conveyor qa-run ship-it-loop --receipt-output /tmp/ship-it-loop-receipt.json --json
878
993
  conveyor loop-triggers --classify "Run this as an adversarially gated Ralph loop." --json
879
994
  conveyor loop-templates --list --json
995
+ conveyor loop-templates --show ship_it_loop --json
880
996
  conveyor loop-templates --show visual_diff_loop --json
881
997
  conveyor loop-evidence visual-diff qa-task --loop-run "$RUN_ID" --iteration 1 --reference reference.png --candidate candidate.png --threshold 0.02 --report-output visual-diff.json --diff-output visual-diff.png
882
998
  conveyor ralph-loop-presets --list --json