agent-conveyor 0.1.13 → 0.1.14
- package/README.md +51 -11
- package/dist/cli/typescript-runtime.js +554 -12
- package/dist/cli/typescript-runtime.js.map +1 -1
- package/dist/runtime/app-autonomy.d.ts +1 -0
- package/dist/runtime/app-autonomy.js +16 -0
- package/dist/runtime/app-autonomy.js.map +1 -1
- package/dist/runtime/manager-permissions.js +1 -1
- package/dist/runtime/manager-permissions.js.map +1 -1
- package/docs/manager-recipes.md +85 -0
- package/package.json +1 -1
- package/skills/manage-codex-workers/SKILL.md +76 -8
package/README.md
CHANGED
@@ -139,8 +139,9 @@ release version that is not already on npm.
 For common manager setups, start with
 [`docs/manager-recipes.md`](docs/manager-recipes.md). It maps natural-language
 requests such as GoalBuddy conveyor runs, test coverage loops, UX polish loops,
-what-next nudging,
-settings, permissions, evidence gates, cleanup
+what-next nudging, PR/CI/merge Ralph loops, and autonomous ship-it loops to
+concrete `manager-config` settings, permissions, evidence gates, cleanup
+behavior, and example
 manager/Dispatch/worker interactions. Use `conveyor manager-recipes --list`
 or `conveyor manager-recipes --show goalbuddy-conveyor --json` for a
 machine-readable setup preview.
@@ -178,6 +179,12 @@ are unavailable, open a separate Codex app worker manually and paste the
 completion or blocker report must go back through the generated
 `enqueue-notify-manager` command and a bounded Dispatch watch tick; a direct
 Codex app final answer is not a durable manager receipt.
+The live manager and worker sessions should also be readable as the primary
+operator transcript: after consuming an inbox item, the consuming session must
+print `CONVEYOR POLL`, `CONVEYOR RECEIVED`, `WORK`, `CONVEYOR SEND`, and
+`DISPATCH` sections while the turn is happening. SQLite/replay/status output is
+audit proof, not a replacement for the live session story. Idle polls may be a
+single `CONVEYOR IDLE` line.
 
 Dispatch is core infrastructure for supervised worker/manager pairs. The
 `pair` workflow starts a detached Dispatch watch process by default so worker
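
The transcript rule added in the hunk above can be pictured as an ordered-section check. This is only a rough sketch: `checkTurnTranscript` is a hypothetical helper, not part of conveyor, and it assumes that each section header starts its own line and that the sections appear in the order the README lists them (the README names the sections but does not spell out ordering or layout).

```typescript
// Section headers named in the README text, in the order they are listed there.
const REQUIRED_SECTIONS = [
  "CONVEYOR POLL",
  "CONVEYOR RECEIVED",
  "WORK",
  "CONVEYOR SEND",
  "DISPATCH",
] as const;

type TranscriptCheck =
  | { kind: "idle" }
  | { kind: "complete" }
  | { kind: "incomplete"; missing: string[] };

// Hypothetical helper: assumes every section header begins a line.
function checkTurnTranscript(transcript: string): TranscriptCheck {
  const lines = transcript.split("\n").map((line) => line.trim());

  // An idle poll may be a single CONVEYOR IDLE line.
  const nonEmpty = lines.filter((line) => line.length > 0);
  if (nonEmpty.length === 1 && (nonEmpty[0] ?? "").startsWith("CONVEYOR IDLE")) {
    return { kind: "idle" };
  }

  // Walk the headers in order; a header only counts if it appears
  // after the previously matched one.
  const missing: string[] = [];
  let cursor = 0;
  for (const section of REQUIRED_SECTIONS) {
    const index = lines.findIndex(
      (line, i) => i >= cursor && line.startsWith(section),
    );
    if (index === -1) {
      missing.push(section);
    } else {
      cursor = index + 1;
    }
  }
  return missing.length === 0
    ? { kind: "complete" }
    : { kind: "incomplete", missing };
}
```

A prefix match is deliberately loose (a line such as `WORKER ...` would also satisfy `WORK`), so a real check would likely need a stricter header format than this sketch assumes.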
@@ -195,6 +202,9 @@ Use `conveyor qa-plan adversarial-triggers` to verify natural-language
 manager prompts activate Ralph-loop adversarial gates.
 Use `conveyor qa-plan goalbuddy-conveyor` when a broad request should become
 sequential GoalBuddy child boards with PR/CI/merge receipts.
+Use `conveyor qa-plan ship-it-loop` when a manager is allowed to push a branch,
+open a PR, monitor CI, resolve bounded conflicts, and merge only after explicit
+manager-owned merge evidence.
 Before cutting a manager loose, have it resolve the freeform setup request to a
 named recipe from `docs/manager-recipes.md` or an explicit `custom` setup, then
 show the saved mode, permissions, evidence gates, cleanup policy, and disallowed
@@ -377,7 +387,10 @@ tmux attach -t codex-live-test
 required to make an idle app thread poll autonomously. The worker handoff and
 worker heartbeat prompt also include the exact durable
 `enqueue-notify-manager` and one-iteration `dispatch --watch` commands that a
-worker must run after completing or blocking on a consumed item. Those
+worker must run after completing or blocking on a consumed item. Those prompts
+require live session transcript blocks for consumed items, so the operator can
+inspect the actual Codex app sessions and see the same flow the durable inbox
+later proves. Those
 recommendations also include `wakeup_dispatch_command` and
 `delivery_receipt_commands` for
 app-thread wake recovery. Use them to record sent, skipped, and blocked wake
@@ -385,7 +398,11 @@ tmux attach -t codex-live-test
 completion. The recommendations include a `teardown_policy`: an idle poll is
 only a quiet interval, not a reason to delete or pause heartbeat automation;
 heartbeat teardown belongs to the manager/operator after terminal closeout or
-explicit operator instruction.
+explicit operator instruction. For same-thread Codex app visible-session
+dogfood, prefer `--template app_visible_build_loop` or a custom adversarial
+gate; reserve cleanup-gated templates such as `build_then_clear` for flows
+that create a fresh worker context or can record a real cleanup receipt
+between iterations.
 The optional
 Codex app thread metadata is normally supplied after a Codex app manager has
 used `create_thread` and `set_thread_title`; terminal-only users can omit it
@@ -456,7 +473,8 @@ tmux attach -t codex-live-test
 flags. Use `--interactive` only as a terminal fallback when a human is
 running `conveyor` directly.
 `--permit` grants taxonomy permissions such as `repo.open_pr`,
-`
+`repo.push_branch`, `repo.monitor_ci`, `repo.resolve_conflicts`,
+`repo.merge_green_pr`, `verification.run_pytest`, `context.spawn_reviewer`,
 `communication.notify_operator`, or `worker_session.compact`. Use `--tool`
 to record expected verification/context tools, `--epilogue` for required
 built-in finish steps (`run-tools`, `draft-pr`, `subagent-review`,
@@ -497,7 +515,13 @@ tmux attach -t codex-live-test
 [--json]` — Draft reviewed `criteria --add` commands from a worker response
 that separates must-have current-task criteria from deferred follow-ups. This
 helper is read-only: it resolves the task and prints suggestions, but does not
-mutate acceptance criteria, events, or commands.
+mutate acceptance criteria, events, or commands. If a proposed criterion
+appears to describe manager closeout mechanics such as `finish-task`,
+`--require-criteria-audit`, heartbeat teardown, or final manager reporting,
+the helper emits a non-blocking warning and classifies that suggestion as
+manager closeout proof. Keep that proof in the manager final report, audit,
+replay, or epilogue evidence instead of accepted worker/task criteria unless
+the task is explicitly Conveyor closeout QA.
 ```bash
 conveyor criteria-plan my-task --from-worker-response response.md --json
 ```
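
One way to picture the new closeout warning in `criteria-plan` is a classifier that tags a proposed criterion without rejecting it. The sketch below is an assumption-laden illustration: `classifyCriterion`, the `Suggestion` shape, and the plain substring matching are all hypothetical, and the marker phrases are only the examples quoted in the README text. How the real helper detects closeout mechanics is not shown in this diff.

```typescript
type Classification = "task_criterion" | "manager_closeout_proof";

interface Suggestion {
  text: string;
  classification: Classification;
  // Non-blocking: the suggestion is still returned alongside the warning.
  warning?: string;
}

// Hypothetical marker phrases, taken from the examples in the README text.
const CLOSEOUT_MARKERS = [
  "finish-task",
  "--require-criteria-audit",
  "heartbeat teardown",
  "final manager report",
];

// Hypothetical classifier: flags criteria that mention closeout mechanics.
function classifyCriterion(text: string): Suggestion {
  const lowered = text.toLowerCase();
  const hit = CLOSEOUT_MARKERS.find((marker) => lowered.includes(marker));
  if (hit === undefined) {
    return { text, classification: "task_criterion" };
  }
  return {
    text,
    classification: "manager_closeout_proof",
    warning: `mentions "${hit}"; keep this in manager closeout evidence, not task criteria`,
  };
}
```

Returning the suggestion together with a warning, instead of throwing, mirrors the "non-blocking" wording: the reviewer still sees the proposed criterion and decides where it belongs.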
@@ -750,9 +774,9 @@ tmux attach -t codex-live-test
 - `transcript-show <task> [--role R] [--include-content]` — Show stored
 transcript segment metadata. Segment text is redacted unless
 `--include-content` is passed.
-- `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor>` — Print a
+- `qa-plan <self-management|emergent-criteria|tmux-errors|dispatch-completion|ralph-loop|adversarial-triggers|goalbuddy-conveyor|ship-it-loop>` — Print a
 repeatable manual QA checklist.
-- `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop> --receipt-output RECEIPT.json [--path DB]` —
+- `qa-run <ralph-loop-guardrails|generic-loop-template|generic-loop-template-browser|test-coverage-loop|adversarial-triggers|build-clear-loop|ship-it-loop> --receipt-output RECEIPT.json [--path DB]` —
 Run a deterministic no-tmux QA harness and save a JSON receipt.
 `ralph-loop-guardrails` proves max-iteration cutoff, missing-evidence
 cutoff, fresh retry delivery after structured `adversarial_check` evidence,
@@ -772,6 +796,10 @@ tmux attach -t codex-live-test
 `build-clear-loop` proves the non-coverage `build_then_clear` template
 blocks before `build_passed` and `cleanup` receipts, still blocks after build
 evidence alone, and delivers only after both build and cleanup evidence exist.
+`ship-it-loop` proves push, PR, and merge commands fail closed until their
+permissions are granted, then proves the `ship_it_loop` lifecycle blocks
+before branch, PR, CI, mergeability, manager decision, merge, post-merge, and
+adversarial receipts exist.
 - `loop-triggers --list|--classify PROMPT [--json]` — List the controlled
 natural-language loop triggers or classify a manager/operator prompt before
 creating a loop policy or continuation gate. Approved trigger phrases include
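
The two properties that `ship-it-loop` is described as proving, fail-closed permissions and a receipt-gated lifecycle, can be sketched roughly as follows. Only the permission names (`repo.push_branch`, `repo.open_pr`, `repo.merge_green_pr`) come from the README. The action keys, the receipt labels, the strict stage ordering, and both function names are hypothetical stand-ins, since the diff does not show the real evidence type identifiers.

```typescript
// Permission names appear in the README; the action keys are hypothetical.
const REQUIRED_PERMISSION = new Map<string, string>([
  ["push", "repo.push_branch"],
  ["open_pr", "repo.open_pr"],
  ["merge", "repo.merge_green_pr"],
]);

// Fail closed: unknown actions and missing grants are both denied.
function isAllowed(action: string, granted: ReadonlySet<string>): boolean {
  const needed = REQUIRED_PERMISSION.get(action);
  return needed !== undefined && granted.has(needed);
}

// Hypothetical receipt labels, in the order the README lists the stages.
const LIFECYCLE_RECEIPTS = [
  "branch",
  "pr",
  "ci",
  "mergeability",
  "manager_decision",
  "merge",
  "post_merge",
  "adversarial",
] as const;

// Returns the first stage that has no receipt yet, or null when nothing blocks.
function firstBlockingStage(recorded: ReadonlySet<string>): string | null {
  for (const stage of LIFECYCLE_RECEIPTS) {
    if (!recorded.has(stage)) {
      return stage;
    }
  }
  return null;
}
```

Denying by default means a typo in an action name or a forgotten `--permit` grant results in a refusal, which is the behavior the README text calls "fail closed".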
@@ -785,11 +813,16 @@ tmux attach -t codex-live-test
 evidence blocks a manager continuation before worker delivery until matching
 satisfied criterion evidence exists. `ralph-loop-presets` remains as a
 compatibility alias for the current Ralph-loop QA flows. The built-in
+`app_visible_build_loop` template requires `build_passed` plus structured
+`adversarial_check` evidence, but no cleanup evidence, so visible Codex app
+threads can continue without pretending that same-thread context was cleared.
+The built-in
 `visual_diff_loop` template requires `reference_artifact`,
 `candidate_screenshot`, `visual_diff_report`, `diff_below_threshold`, and
 `adversarial_check` evidence before a manager-requested next visual pass can
-reach the worker. Quality-oriented templates (`
-`
+reach the worker. Quality-oriented templates (`app_visible_build_loop`,
+`pr_ci_merge_loop`, `ship_it_loop`, `test_coverage_loop`, and
+`visual_diff_loop`) also expose an
 `artifact_requirements["adversarial_check"]` object requiring
 `failure_mode`, `check`, and `result` fields.
 - `loop-status TASK --run RUN [--json]` — Summarize a Ralph-loop run for manager
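
The template rules in the hunk above amount to an evidence gate plus a shape check on `adversarial_check`. A minimal sketch follows, under stated assumptions: the per-template evidence lists contain only what the README text states (real templates may require more), `missingForNextIteration` and the `Evidence` shape are hypothetical, and treating the three required fields as non-empty strings is a guess on my part.

```typescript
// Evidence types per template, limited to what the README text states.
const REQUIRED_EVIDENCE = new Map<string, readonly string[]>([
  ["app_visible_build_loop", ["build_passed", "adversarial_check"]],
  [
    "visual_diff_loop",
    [
      "reference_artifact",
      "candidate_screenshot",
      "visual_diff_report",
      "diff_below_threshold",
      "adversarial_check",
    ],
  ],
]);

// Field names from artifact_requirements["adversarial_check"] in the README.
const ADVERSARIAL_FIELDS = ["failure_mode", "check", "result"] as const;

// Hypothetical shape: evidence type -> recorded payload.
type Evidence = Record<string, Record<string, unknown>>;

// Hypothetical gate: lists everything that still blocks the next iteration.
function missingForNextIteration(template: string, evidence: Evidence): string[] {
  const required = REQUIRED_EVIDENCE.get(template);
  if (required === undefined) {
    return [`unknown template: ${template}`];
  }

  // Evidence types with no recorded receipt at all.
  const problems = required.filter((type) => evidence[type] === undefined);

  // When adversarial_check exists, every required field must be filled in.
  const adversarial = evidence["adversarial_check"];
  if (adversarial !== undefined) {
    for (const field of ADVERSARIAL_FIELDS) {
      const value = adversarial[field];
      if (typeof value !== "string" || value.trim() === "") {
        problems.push(`adversarial_check.${field}`);
      }
    }
  }
  return problems;
}
```

An empty result means the sketched gate would let the manager-requested continuation through; anything else names the receipt or field that is still missing.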
@@ -812,7 +845,9 @@ thread tools are unavailable, create the binding anyway and paste the returned
 `worker_handoff` prompt into a manually opened worker session. The handoff
 requires a worker to report completion/blockers through
 `enqueue-notify-manager` plus a bounded Dispatch watch run before treating the
-manager as notified
+manager as notified, and to print the live `CONVEYOR POLL` / `CONVEYOR
+RECEIVED` / `WORK` / `CONVEYOR SEND` / `DISPATCH` transcript in the worker
+session for any consumed item.
 - `enqueue-continue-iteration TASK --loop-run RUN --requested-iteration N` —
 Queue a manager-requested next loop pass for Dispatch. The command refuses
 same/current iteration requests before they become pending queue rows, while
@@ -824,6 +859,8 @@ manager as notified.
 artifact requirements, and recommended tools.
 - `loop-evidence add TASK --loop-run RUN --iteration N --evidence-type TYPE` —
 Record a run-qualified evidence receipt for a loop policy. Use
+`loop-evidence build-passed TASK --loop-run RUN --iteration N` as the
+friendly alias for the common `evidence_type=build_passed` receipt. Use
 `loop-evidence visual-diff` to compare PNG screenshots, write an optional
 diff/report artifact, and record `visual_diff_report` plus
 `diff_below_threshold` as satisfied only when the computed score is within
@@ -869,14 +906,17 @@ conveyor qa-plan dispatch-completion
 conveyor qa-plan ralph-loop
 conveyor qa-plan adversarial-triggers
 conveyor qa-plan goalbuddy-conveyor
+conveyor qa-plan ship-it-loop
 conveyor qa-run ralph-loop-guardrails --receipt-output /tmp/ralph-loop-guardrails-receipt.json --json
 conveyor qa-run generic-loop-template --receipt-output /tmp/generic-loop-template-receipt.json --json
 conveyor qa-run generic-loop-template-browser --receipt-output /tmp/generic-loop-template-browser-receipt.json --json
 conveyor qa-run test-coverage-loop --receipt-output /tmp/test-coverage-loop-receipt.json --json
 conveyor qa-run adversarial-triggers --receipt-output /tmp/adversarial-triggers-receipt.json --json
 conveyor qa-run build-clear-loop --receipt-output /tmp/build-clear-loop-receipt.json --json
+conveyor qa-run ship-it-loop --receipt-output /tmp/ship-it-loop-receipt.json --json
 conveyor loop-triggers --classify "Run this as an adversarially gated Ralph loop." --json
 conveyor loop-templates --list --json
+conveyor loop-templates --show ship_it_loop --json
 conveyor loop-templates --show visual_diff_loop --json
 conveyor loop-evidence visual-diff qa-task --loop-run "$RUN_ID" --iteration 1 --reference reference.png --candidate candidate.png --threshold 0.02 --report-output visual-diff.json --diff-output visual-diff.png
 conveyor ralph-loop-presets --list --json