agent-conveyor 0.1.12 → 0.1.14
- package/README.md +74 -15
- package/dist/cli/typescript-runtime.js +598 -13
- package/dist/cli/typescript-runtime.js.map +1 -1
- package/dist/runtime/app-autonomy.d.ts +12 -0
- package/dist/runtime/app-autonomy.js +119 -0
- package/dist/runtime/app-autonomy.js.map +1 -1
- package/dist/runtime/manager-permissions.js +1 -1
- package/dist/runtime/manager-permissions.js.map +1 -1
- package/docs/manager-recipes.md +90 -0
- package/package.json +1 -1
- package/skills/manage-codex-workers/SKILL.md +93 -10
package/README.md
CHANGED
@@ -139,8 +139,9 @@ release version that is not already on npm.
 For common manager setups, start with
 [`docs/manager-recipes.md`](docs/manager-recipes.md). It maps natural-language
 requests such as GoalBuddy conveyor runs, test coverage loops, UX polish loops,
-what-next nudging,
-settings, permissions, evidence gates, cleanup
+what-next nudging, PR/CI/merge Ralph loops, and autonomous ship-it loops to
+concrete `manager-config` settings, permissions, evidence gates, cleanup
+behavior, and example
 manager/Dispatch/worker interactions. Use `conveyor manager-recipes --list`
 or `conveyor manager-recipes --show goalbuddy-conveyor --json` for a
 machine-readable setup preview.
@@ -174,7 +175,16 @@ thread identity through `--worker-codex-app-thread-id` and
 deliver the generated `worker_handoff` bootstrap prompt. The raw terminal
 `conveyor` CLI does not create Codex app threads by itself; if app thread tools
 are unavailable, open a separate Codex app worker manually and paste the
-`worker_handoff` prompt.
+`worker_handoff` prompt. After a worker consumes a manager instruction, its
+completion or blocker report must go back through the generated
+`enqueue-notify-manager` command and a bounded Dispatch watch tick; a direct
+Codex app final answer is not a durable manager receipt.
+The live manager and worker sessions should also be readable as the primary
+operator transcript: after consuming an inbox item, the consuming session must
+print `CONVEYOR POLL`, `CONVEYOR RECEIVED`, `WORK`, `CONVEYOR SEND`, and
+`DISPATCH` sections while the turn is happening. SQLite/replay/status output is
+audit proof, not a replacement for the live session story. Idle polls may be a
+single `CONVEYOR IDLE` line.
 
 Dispatch is core infrastructure for supervised worker/manager pairs. The
 `pair` workflow starts a detached Dispatch watch process by default so worker
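Reviewer note on the hunk above: the new transcript requirement lends itself to a mechanical check. The following is a minimal sketch only. `missingTranscriptSections` is a hypothetical helper, not part of agent-conveyor, and it assumes that each section label starts its own line and that the sections appear in the order the README lists them (the README does not state an ordering rule).

```typescript
// Hypothetical helper (not an agent-conveyor API): reports which of the
// README's required transcript sections are absent from a session log.
const REQUIRED_SECTIONS: readonly string[] = [
  "CONVEYOR POLL",
  "CONVEYOR RECEIVED",
  "WORK",
  "CONVEYOR SEND",
  "DISPATCH",
];

function missingTranscriptSections(transcript: string): string[] {
  const lines = transcript
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  // The README allows an idle poll to be a single CONVEYOR IDLE line.
  if (lines.length === 1 && lines[0].startsWith("CONVEYOR IDLE")) {
    return [];
  }

  const missing: string[] = [];
  let cursor = 0; // Assumption: sections must appear in the listed order.
  for (const section of REQUIRED_SECTIONS) {
    const index = lines.findIndex(
      (line, i) => i >= cursor && line.startsWith(section),
    );
    if (index === -1) {
      missing.push(section);
    } else {
      cursor = index + 1;
    }
  }
  return missing;
}
```

An empty result means the transcript carries every required section; anything else names the sections an operator would not be able to find in the live session.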
@@ -192,6 +202,9 @@ Use `conveyor qa-plan adversarial-triggers` to verify natural-language
 manager prompts activate Ralph-loop adversarial gates.
 Use `conveyor qa-plan goalbuddy-conveyor` when a broad request should become
 sequential GoalBuddy child boards with PR/CI/merge receipts.
+Use `conveyor qa-plan ship-it-loop` when a manager is allowed to push a branch,
+open a PR, monitor CI, resolve bounded conflicts, and merge only after explicit
+manager-owned merge evidence.
 Before cutting a manager loose, have it resolve the freeform setup request to a
 named recipe from `docs/manager-recipes.md` or an explicit `custom` setup, then
 show the saved mode, permissions, evidence gates, cleanup policy, and disallowed
@@ -371,14 +384,25 @@ tmux attach -t codex-live-test
 Codex app sessions, the JSON output also includes
 `heartbeat_recommendations` with role-specific poll prompts; Dispatch can
 deliver into those inboxes, but a heartbeat or operator wake-up is still
-required to make an idle app thread poll autonomously.
-also include
+required to make an idle app thread poll autonomously. The worker handoff and
+worker heartbeat prompt also include the exact durable
+`enqueue-notify-manager` and one-iteration `dispatch --watch` commands that a
+worker must run after completing or blocking on a consumed item. Those prompts
+require live session transcript blocks for consumed items, so the operator can
+inspect the actual Codex app sessions and see the same flow the durable inbox
+later proves. Those
+recommendations also include `wakeup_dispatch_command` and
+`delivery_receipt_commands` for
 app-thread wake recovery. Use them to record sent, skipped, and blocked wake
 outcomes after `app-wakeup-dispatch`; an app-thread send is not task
 completion. The recommendations include a `teardown_policy`: an idle poll is
 only a quiet interval, not a reason to delete or pause heartbeat automation;
 heartbeat teardown belongs to the manager/operator after terminal closeout or
-explicit operator instruction.
+explicit operator instruction. For same-thread Codex app visible-session
+dogfood, prefer `--template app_visible_build_loop` or a custom adversarial
+gate; reserve cleanup-gated templates such as `build_then_clear` for flows
+that create a fresh worker context or can record a real cleanup receipt
+between iterations.
 The optional
 Codex app thread metadata is normally supplied after a Codex app manager has
 used `create_thread` and `set_thread_title`; terminal-only users can omit it
@@ -405,7 +429,8 @@ tmux attach -t codex-live-test
 a matching `ready_to_send` action with `send_ready=true` and the same thread
 id; healthy and blocked roles must be recorded as `skipped` or `blocked`.
 - `app-autopilot start|stop|status TASK [--dispatcher-id ID]
-[--interval SECONDS] [--watch-iterations N] [--stale-after N]
+[--interval SECONDS] [--watch-iterations N] [--stale-after N]
+[--quiet-after N] [--json]` —
 Manage the pair-level app-native heartbeat policy for the active
 manager/worker binding. `start` and `stop` write telemetry receipts and emit
 the exact manager/worker Codex app heartbeat automation specs plus the
@@ -413,6 +438,11 @@ tmux attach -t codex-live-test
 thread tools, so create/pause those heartbeat automations from a Codex app
 operator session using the emitted specs; Conveyor remains the durable source
 of truth through Dispatch, inboxes, wake receipts, and app heartbeat status.
+`status` also reports `plan.quiescence`: when the loop is healthy, has no
+`next_actions`, and both roles have produced `--quiet-after` paired
+heartbeats since the last command or inbox-consumption receipt, it recommends
+`stop_autopilot` so operators can quiesce blocked/no-progress loops instead
+of repeating idle pulses.
 - `discover [QUERY] [--all] [--limit N]` / `search [QUERY]` — Search tasks,
 registered sessions, active bindings, and recent telemetry in one JSON result.
 Use this for conversational setup when a manager or Codex session needs to
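Reviewer note on the hunk above: the quiescence rule reads as a three-part conjunction. The sketch below restates it under stated assumptions. The `LoopSnapshot` shape, the `recommendQuiescence` function, and the `keep_running` label are all hypothetical; only `stop_autopilot`, `next_actions`, and `--quiet-after` come from the README, and the real `plan.quiescence` payload may look different.

```typescript
// Hypothetical shapes for illustration; not the actual agent-conveyor types.
interface LoopSnapshot {
  healthy: boolean;
  nextActions: string[];
  // Paired heartbeats per role since the last command or
  // inbox-consumption receipt.
  quietHeartbeats: { manager: number; worker: number };
}

type QuiescenceAdvice = "stop_autopilot" | "keep_running";

function recommendQuiescence(
  snapshot: LoopSnapshot,
  quietAfter: number,
): QuiescenceAdvice {
  // Both roles must have reached the --quiet-after threshold.
  const bothRolesQuiet =
    snapshot.quietHeartbeats.manager >= quietAfter &&
    snapshot.quietHeartbeats.worker >= quietAfter;

  // All three conditions from the README must hold together.
  const quiescent =
    snapshot.healthy && snapshot.nextActions.length === 0 && bothRolesQuiet;

  return quiescent ? "stop_autopilot" : "keep_running";
}
```

Note that a single pending entry in `nextActions`, or one role below the threshold, is enough to keep the loop running in this sketch.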
@@ -443,7 +473,8 @@ tmux attach -t codex-live-test
 flags. Use `--interactive` only as a terminal fallback when a human is
 running `conveyor` directly.
 `--permit` grants taxonomy permissions such as `repo.open_pr`,
-`
+`repo.push_branch`, `repo.monitor_ci`, `repo.resolve_conflicts`,
+`repo.merge_green_pr`, `verification.run_pytest`, `context.spawn_reviewer`,
 `communication.notify_operator`, or `worker_session.compact`. Use `--tool`
 to record expected verification/context tools, `--epilogue` for required
 built-in finish steps (`run-tools`, `draft-pr`, `subagent-review`,
@@ -484,7 +515,13 @@ tmux attach -t codex-live-test
 [--json]` — Draft reviewed `criteria --add` commands from a worker response
 that separates must-have current-task criteria from deferred follow-ups. This
 helper is read-only: it resolves the task and prints suggestions, but does not
-mutate acceptance criteria, events, or commands.
+mutate acceptance criteria, events, or commands. If a proposed criterion
+appears to describe manager closeout mechanics such as `finish-task`,
+`--require-criteria-audit`, heartbeat teardown, or final manager reporting,
+the helper emits a non-blocking warning and classifies that suggestion as
+manager closeout proof. Keep that proof in the manager final report, audit,
+replay, or epilogue evidence instead of accepted worker/task criteria unless
+the task is explicitly Conveyor closeout QA.
 ```bash
 conveyor criteria-plan my-task --from-worker-response response.md --json
 ```
@@ -640,7 +677,10 @@ tmux attach -t codex-live-test
 recovery; `--once` performs one pass.
 - `enqueue-notify-manager <task> --message "..." [--correlation-id C]
 [--required-permission P] [--idempotency-key K] [--json]` — Queue a `notify_manager` command row for
-Dispatch to claim and deliver to the bound manager.
+Dispatch to claim and deliver to the bound manager. Codex app/no-tmux
+workers must use this route for completion and blocker reports after
+consuming a manager instruction; direct app-thread final answers are local
+text, not manager inbox receipts.
 - `enqueue-nudge-worker <task> --message "..." [--correlation-id C]
 [--required-permission P] [--idempotency-key K] [--json]` — Queue a `nudge_worker` command row for
 Dispatch to claim and deliver to the bound worker. Use this dispatcher-backed
@@ -734,9 +774,9 @@ tmux attach -t codex-live-test
 - `transcript-show <task> [--role R] [--include-content]` — Show stored
 transcript segment metadata. Segment text is redacted unless
 `--include-content` is passed.
-- `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor>` — Print a
+- `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor|ship-it-loop>` — Print a
 repeatable manual QA checklist.
-- `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop> --receipt-output RECEIPT.json [--path DB]` —
+- `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop|ship-it-loop> --receipt-output RECEIPT.json [--path DB]` —
 Run a deterministic no-tmux QA harness and save a JSON receipt.
 `ralph-loop-guardrails` proves max-iteration cutoff, missing-evidence
 cutoff, fresh retry delivery after structured `adversarial_check` evidence,
@@ -756,6 +796,10 @@ tmux attach -t codex-live-test
 `build-clear-loop` proves the non-coverage `build_then_clear` template
 blocks before `build_passed` and `cleanup` receipts, still blocks after build
 evidence alone, and delivers only after both build and cleanup evidence exist.
+`ship-it-loop` proves push, PR, and merge commands fail closed until their
+permissions are granted, then proves the `ship_it_loop` lifecycle blocks
+before branch, PR, CI, mergeability, manager decision, merge, post-merge, and
+adversarial receipts exist.
 - `loop-triggers --list|--classify PROMPT [--json]` — List the controlled
 natural-language loop triggers or classify a manager/operator prompt before
 creating a loop policy or continuation gate. Approved trigger phrases include
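Reviewer note on the hunk above: "fail closed" means an action is refused unless its permission was explicitly granted. The sketch below illustrates that pattern only. The permission names come from the README's `--permit` list, but the action-to-permission mapping, the `ShipAction` type, and the `authorize` function are assumptions made for illustration and are not the package's implementation.

```typescript
// Hypothetical action names; only the permission strings appear in the README.
type ShipAction = "push" | "open_pr" | "merge";

// Assumed mapping from action to required permission.
const REQUIRED_PERMISSION: Record<ShipAction, string> = {
  push: "repo.push_branch",
  open_pr: "repo.open_pr",
  merge: "repo.merge_green_pr",
};

interface AuthorizationResult {
  allowed: boolean;
  missing?: string;
}

function authorize(
  action: ShipAction,
  granted: ReadonlySet<string>,
): AuthorizationResult {
  const needed = REQUIRED_PERMISSION[action];
  // Fail closed: anything not explicitly granted is denied.
  return granted.has(needed)
    ? { allowed: true }
    : { allowed: false, missing: needed };
}
```

With an empty grant set every action is denied, which is the default a fail-closed gate should have.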
@@ -769,11 +813,16 @@ tmux attach -t codex-live-test
 evidence blocks a manager continuation before worker delivery until matching
 satisfied criterion evidence exists. `ralph-loop-presets` remains as a
 compatibility alias for the current Ralph-loop QA flows. The built-in
+`app_visible_build_loop` template requires `build_passed` plus structured
+`adversarial_check` evidence, but no cleanup evidence, so visible Codex app
+threads can continue without pretending that same-thread context was cleared.
+The built-in
 `visual_diff_loop` template requires `reference_artifact`,
 `candidate_screenshot`, `visual_diff_report`, `diff_below_threshold`, and
 `adversarial_check` evidence before a manager-requested next visual pass can
-reach the worker. Quality-oriented templates (`
-`
+reach the worker. Quality-oriented templates (`app_visible_build_loop`,
+`pr_ci_merge_loop`, `ship_it_loop`, `test_coverage_loop`, and
+`visual_diff_loop`) also expose an
 `artifact_requirements["adversarial_check"]` object requiring
 `failure_mode`, `check`, and `result` fields.
 - `loop-status TASK --run RUN [--json]` — Summarize a Ralph-loop run for manager
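Reviewer note on the hunk above: the `app_visible_build_loop` gate can be pictured as a check over recorded evidence receipts. This is a minimal sketch assuming a hypothetical `EvidenceReceipt` shape and a hypothetical `gateAppVisibleBuildLoop` function. The evidence type names and the `failure_mode`, `check`, and `result` field names are taken from the README; everything else is illustrative.

```typescript
// Hypothetical receipt shape; the stored format in agent-conveyor may differ.
interface EvidenceReceipt {
  evidenceType: string;
  fields: Record<string, string>;
}

// Field names the README lists under artifact_requirements["adversarial_check"].
const ADVERSARIAL_FIELDS = ["failure_mode", "check", "result"] as const;

function gateAppVisibleBuildLoop(receipts: EvidenceReceipt[]): string[] {
  const problems: string[] = [];

  if (!receipts.some((r) => r.evidenceType === "build_passed")) {
    problems.push("missing build_passed");
  }

  const adversarial = receipts.find(
    (r) => r.evidenceType === "adversarial_check",
  );
  if (!adversarial) {
    problems.push("missing adversarial_check");
  } else {
    // The adversarial receipt must be structured, not just present.
    for (const field of ADVERSARIAL_FIELDS) {
      if (!adversarial.fields[field]) {
        problems.push(`adversarial_check missing ${field}`);
      }
    }
  }

  // No cleanup receipt is required for this template.
  return problems;
}
```

An empty result lets the next iteration through; a non-empty one lists what still blocks it. Unlike `build_then_clear`, no `cleanup` receipt is consulted.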
@@ -793,7 +842,12 @@ same-project `create_thread` worker plus `set_thread_title` before creating the
 binding, then pass the worker thread id/title into Conveyor. Use `fork_thread`
 only when the user explicitly asks to fork or resume this conversation. If app
 thread tools are unavailable, create the binding anyway and paste the returned
-`worker_handoff` prompt into a manually opened worker session.
+`worker_handoff` prompt into a manually opened worker session. The handoff
+requires a worker to report completion/blockers through
+`enqueue-notify-manager` plus a bounded Dispatch watch run before treating the
+manager as notified, and to print the live `CONVEYOR POLL` / `CONVEYOR
+RECEIVED` / `WORK` / `CONVEYOR SEND` / `DISPATCH` transcript in the worker
+session for any consumed item.
 - `enqueue-continue-iteration TASK --loop-run RUN --requested-iteration N` —
 Queue a manager-requested next loop pass for Dispatch. The command refuses
 same/current iteration requests before they become pending queue rows, while
@@ -805,6 +859,8 @@ thread tools are unavailable, create the binding anyway and paste the returned
 artifact requirements, and recommended tools.
 - `loop-evidence add TASK --loop-run RUN --iteration N --evidence-type TYPE` —
 Record a run-qualified evidence receipt for a loop policy. Use
+`loop-evidence build-passed TASK --loop-run RUN --iteration N` as the
+friendly alias for the common `evidence_type=build_passed` receipt. Use
 `loop-evidence visual-diff` to compare PNG screenshots, write an optional
 diff/report artifact, and record `visual_diff_report` plus
 `diff_below_threshold` as satisfied only when the computed score is within
@@ -850,14 +906,17 @@ conveyor qa-plan dispatch-completion
 conveyor qa-plan ralph-loop
 conveyor qa-plan adversarial-triggers
 conveyor qa-plan goalbuddy-conveyor
+conveyor qa-plan ship-it-loop
 conveyor qa-run ralph-loop-guardrails --receipt-output /tmp/ralph-loop-guardrails-receipt.json --json
 conveyor qa-run generic-loop-template --receipt-output /tmp/generic-loop-template-receipt.json --json
 conveyor qa-run generic-loop-template-browser --receipt-output /tmp/generic-loop-template-browser-receipt.json --json
 conveyor qa-run test-coverage-loop --receipt-output /tmp/test-coverage-loop-receipt.json --json
 conveyor qa-run adversarial-triggers --receipt-output /tmp/adversarial-triggers-receipt.json --json
 conveyor qa-run build-clear-loop --receipt-output /tmp/build-clear-loop-receipt.json --json
+conveyor qa-run ship-it-loop --receipt-output /tmp/ship-it-loop-receipt.json --json
 conveyor loop-triggers --classify "Run this as an adversarially gated Ralph loop." --json
 conveyor loop-templates --list --json
+conveyor loop-templates --show ship_it_loop --json
 conveyor loop-templates --show visual_diff_loop --json
 conveyor loop-evidence visual-diff qa-task --loop-run "$RUN_ID" --iteration 1 --reference reference.png --candidate candidate.png --threshold 0.02 --report-output visual-diff.json --diff-output visual-diff.png
 conveyor ralph-loop-presets --list --json