agent-conveyor 0.1.6 → 0.1.8

package/README.md CHANGED
@@ -131,6 +131,10 @@ successful JSON. Treat that warning as expected Node runtime noise when the
  command exits 0 and the JSON result reports `"ok": true`.
  Before publishing `agent-conveyor` to npm, use
  [`docs/package-release.md`](docs/package-release.md).
+ The preferred publish path is the manual GitHub Actions `publish.yml` workflow
+ with npm Trusted Publishing enabled for the `npm-production` environment. Use
+ `publish=false` for artifact review and `publish=true` only for an approved
+ release version that is not already on npm.
 
  For common manager setups, start with
  [`docs/manager-recipes.md`](docs/manager-recipes.md). It maps natural-language
@@ -144,6 +148,10 @@ For a package-facing overview of these modes, open
  [`docs/landing-page.html`](docs/landing-page.html) locally or host it as a
  static landing page. From the repo, `npm run docs:landing` serves it at
  `http://127.0.0.1:8765/`.
+ The GitHub Pages version lives at
+ [`neonwatty.github.io/agent-conveyor`](https://neonwatty.github.io/agent-conveyor/).
+ Use `node scripts/check-landing-page.mjs` for a docs-only desktop/mobile
+ screenshot gate; this does not run the full package release smoke.
 
  After install, the intended Codex app entry point is natural language. Open a
  new Codex app session in the target repo and say:
@@ -158,7 +166,15 @@ Require adversarial proof before another worker iteration.
  The installed skill should call the `conveyor` CLI, choose names, create the
  no-tmux binding with `create-disposable-binding`, point the worker at
  `worker-inbox`, and use `loop-status` plus telemetry receipts before reporting
- that the loop is ready.
+ that the loop is ready. When the manager is itself running in the Codex app and
+ thread tools are available, the skill should first call `create_thread` for a
+ fresh same-project worker, name it with `set_thread_title`, pass the returned
+ thread identity through `--worker-codex-app-thread-id` and
+ `--worker-codex-app-thread-title`, and use `send_message_to_thread` only to
+ deliver the generated `worker_handoff` bootstrap prompt. The raw terminal
+ `conveyor` CLI does not create Codex app threads by itself; if app thread tools
+ are unavailable, open a separate Codex app worker manually and paste the
+ `worker_handoff` prompt.
 
  Dispatch is core infrastructure for supervised worker/manager pairs. The
  `pair` workflow starts a detached Dispatch watch process by default so worker
@@ -309,16 +325,21 @@ tmux attach -t codex-live-test
  Use `--accept-trust` only for directories you intentionally trust; it retries
  Enter during startup discovery so fresh workspaces do not stall before
  registration.
- - `register-worker --name N [--pid P | --codex-session PATH] [--cwd D] [--tmux-session S]` —
+ - `register-worker --name N [--pid P | --codex-session PATH] [--cwd D] [--tmux-session S] [--codex-app-thread-id ID] [--codex-app-thread-title TITLE]` —
  Register an already-running Codex session as a worker. Rollout JSONL is
  auto-discovered from the pid via `lsof` unless `--codex-session` is given.
+ The optional Codex app thread flags are metadata supplied by the Codex app
+ skill/tool layer; they help humans identify the app thread but do not change
+ rollout ingest or Dispatch delivery.
  - `register-manager --name N ...` — Same arguments; tmux is not required.
  Both registration commands print a `communication` object. When
  `--tmux-session` is present, `communication.session_kind='tmux'`,
  `receive_style='push'`, and `delivery_mode='push'`; without tmux but with a
  Codex rollout identity, `session_kind='codex_app'`, `receive_style='pull'`,
  and `delivery_mode='pull_required'`, with the role-specific inbox polling
- command template.
+ command template. The generated command may include a local
+ `PATH=.../bin:$PATH conveyor` prefix; preserve that prefix when sending the
+ command to a Codex app thread.
  - `deregister <name>` — Mark a session gone. Refuses if the session is bound
  to an active task.
  - `sessions [--role worker|manager] [--state active|gone|all] [--include-legacy]
@@ -338,7 +359,7 @@ tmux attach -t codex-live-test
  registration, so managers can detect whether a worker or manager is
  tmux-push capable or must poll its mailbox.
  - `tasks [--create NAME --goal G --summary S]` — List or create tasks.
- - `create-disposable-binding TASK [--worker NAME] [--manager NAME] [--template TEMPLATE | --required-before-continue TYPE] [--adversarial]` —
+ - `create-disposable-binding TASK [--worker NAME] [--manager NAME] [--template TEMPLATE | --required-before-continue TYPE] [--adversarial] [--worker-codex-app-thread-id ID] [--worker-codex-app-thread-title TITLE] [--manager-codex-app-thread-id ID] [--manager-codex-app-thread-title TITLE]` —
  Create a no-tmux manager/worker binding for real Ralph-loop slices. The
  helper creates the task when missing, marks it managed, writes valid Codex
  rollout JSONL files, registers worker and manager sessions with
@@ -346,7 +367,18 @@ tmux attach -t codex-live-test
  custom Ralph-loop policy run, and prints replay commands for Dispatch,
  `loop-status`, per-session `communication` metadata, plus a `worker_handoff`
  prompt that tells Codex app workers to keep polling their worker inbox
- through the bounded loop.
+ through the bounded loop using the exact generated command. For pull-required
+ Codex app sessions, the JSON output also includes
+ `heartbeat_recommendations` with role-specific poll prompts; Dispatch can
+ deliver into those inboxes, but a heartbeat or operator wake-up is still
+ required to make an idle app thread poll autonomously. Those recommendations
+ include a `teardown_policy`: an idle poll is only a quiet interval, not a
+ reason to delete or pause heartbeat automation; heartbeat teardown belongs to
+ the manager/operator after terminal closeout or explicit operator instruction.
+ The optional
+ Codex app thread metadata is normally supplied after a Codex app manager has
+ used `create_thread` and `set_thread_title`; terminal-only users can omit it
+ and still use the manual no-tmux handoff.
  - `discover [QUERY] [--all] [--limit N]` / `search [QUERY]` — Search tasks,
  registered sessions, active bindings, and recent telemetry in one JSON result.
  Use this for conversational setup when a manager or Codex session needs to
@@ -407,8 +439,8 @@ tmux attach -t codex-live-test
  or manager inspection; `manager_config` is not a valid criteria source.
  To add a criterion and satisfy that same row after verification:
  ```bash
- criterion_id=$(conveyor criteria my-task --add --criterion "Targeted prompt tests pass" --source worker_proposed --status proposed | python3 -c 'import json,sys; print(json.load(sys.stdin)["affected_criterion"]["id"])')
- conveyor criteria my-task --satisfy "$criterion_id" --evidence-json '{"command":"python3 -m unittest tests.test_workerctl.ManagerBootstrapPromptTests -v","status":"pass"}'
+ criterion_id=$(conveyor criteria my-task --add --criterion "Targeted prompt tests pass" --source worker_proposed --status proposed | node -e 'const fs = require("fs"); console.log(JSON.parse(fs.readFileSync(0, "utf8")).affected_criterion.id)')
+ conveyor criteria my-task --satisfy "$criterion_id" --evidence-json '{"command":"npm test -- --runInBand","status":"pass"}'
  ```
  For mutation responses, treat `affected_criterion` as the authoritative
  receipt for the row changed by that command. When a manager applies multiple
@@ -713,7 +745,7 @@ tmux attach -t codex-live-test
  - `loop-status TASK --run RUN [--json]` — Summarize a Ralph-loop run for manager
  review: policy template, iteration bounds, command states, routed
  notifications, worker inbox backlog, evidence types, consumed-inbox
- telemetry, failure counts, and a recommendation.
+ and iteration-advanced telemetry, failure counts, and a recommendation.
 
  For real vertical slices, start with the Ralph loop operator guide in
  `docs/qa/ralph-loop-operator-guide.md`. It explains the controlled
@@ -722,7 +754,12 @@ required evidence, adversarial proof, `loop-status`, and telemetry review pass
  bar.
  Use `create-disposable-binding` when the manager and worker are Codex app or
  other no-tmux sessions and you want the same Dispatch rails without manual
- task/session/bind setup.
+ task/session/bind setup. In a Codex app manager session, prefer a fresh
+ same-project `create_thread` worker plus `set_thread_title` before creating the
+ binding, then pass the worker thread id/title into Conveyor. Use `fork_thread`
+ only when the user explicitly asks to fork or resume this conversation. If app
+ thread tools are unavailable, create the binding anyway and paste the returned
+ `worker_handoff` prompt into a manually opened worker session.
  - `enqueue-continue-iteration TASK --loop-run RUN --requested-iteration N` —
  Queue a manager-requested next loop pass for Dispatch. The command refuses
  same/current iteration requests before they become pending queue rows, while
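
Note on the hunk above: the added text describes two setup paths for `create-disposable-binding`, one that passes a worker thread id/title obtained from the Codex app thread tools and one that omits them. The following is a minimal sketch of how a caller might assemble the argument list for either path. The flag names are taken from the README text; the `bindingArgs` helper, the `ThreadIdentity` type, and the choice to always pass id and title as a pair are assumptions made for illustration, not part of the package.

```typescript
// Hypothetical helper: builds argv for `conveyor create-disposable-binding`.
// Only the flag names come from the README; everything else is illustrative.
type ThreadIdentity = { id: string; title: string };

type BindingOptions = {
  worker?: string;
  manager?: string;
  adversarial?: boolean;
  // Present only when Codex app thread tools were available (assumption:
  // id and title are supplied together).
  workerThread?: ThreadIdentity;
};

function bindingArgs(task: string, opts: BindingOptions = {}): string[] {
  const args: string[] = ["create-disposable-binding", task];
  if (opts.worker) args.push("--worker", opts.worker);
  if (opts.manager) args.push("--manager", opts.manager);
  if (opts.adversarial) args.push("--adversarial");
  if (opts.workerThread) {
    args.push(
      "--worker-codex-app-thread-id", opts.workerThread.id,
      "--worker-codex-app-thread-title", opts.workerThread.title,
    );
  }
  // Without workerThread the binding is still created; the operator then
  // pastes the returned worker_handoff prompt into a manually opened session.
  return args;
}

// App-assisted path: thread identity returned by the app's thread tools.
console.log(bindingArgs("my-task", {
  worker: "w1",
  workerThread: { id: "t-123", title: "Worker" },
}));

// Terminal-only path: no thread metadata.
console.log(bindingArgs("my-task"));
```

Returning an array rather than a joined string keeps quoting out of the picture if the result is handed to something like Node's `child_process.execFile`.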
@@ -980,16 +1017,27 @@ Current dispatch state:
  in `routed_notifications`, and threaded with `correlation_id`.
  - The session inbox is the same `routed_notifications` stream addressed by
  `target_session_id`: tmux push is optional transport. Codex app-based sessions
- should long-poll with `manager-inbox --consume-next --wait --json` or
- `worker-inbox --consume-next --wait --json`. For disposable Ralph loops, use
- the generated `worker_handoff` prompt so the worker keeps polling until no
- inbox item remains or the loop reaches `max_iterations`.
+ should long-poll with the returned `communication.poll_command`. For
+ disposable Ralph loops, use the generated `worker_handoff` prompt so the
+ worker keeps polling until no inbox item remains or the loop reaches
+ `max_iterations`. For no-tmux Codex app sessions, treat
+ `communication.requires_polling=true` as requiring a heartbeat/wake layer:
+ a delivered pull inbox item does not by itself wake an idle app thread. Do
+ not delete or pause heartbeats because an inbox poll is idle. A terminal
+ manager decision should be followed by `finish-task --require-criteria-audit`
+ or by an explicit blocker explaining why the task/binding still appears
+ active.
  - `register-worker`, `register-manager`, `sessions`, `discover`, and
  `create-disposable-binding --json` expose a `communication` block per
  session. Treat `session_kind='tmux'` plus `receive_style='push'` as direct
  tmux-delivery capable; treat `session_kind='codex_app'` plus
  `receive_style='pull'` as mailbox polling required for that worker or
  manager.
+ - App-assisted setup may also expose `codex_app_thread_id` and
+ `codex_app_thread_title` for sessions created or identified by Codex app
+ thread tools. Treat those fields as human/app navigation metadata; the
+ durable communication record is still `routed_notifications` plus inbox
+ consumption telemetry.
  - Template-backed `continue_iteration` deliveries include `loop_policy` in the
  inbox payload, with template name, current/max iteration, cleanup policy,
  required evidence, artifact requirements, and recommended tools. Codex
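
Note on the hunk above: it states how a caller should read the per-session `communication` block (push over tmux versus pull with a generated poll command, plus a heartbeat when `requires_polling` is true). The following is a minimal sketch of that decision. The field names and values are the ones quoted in the README text; the `Communication` and `DeliveryPlan` types, the `planDelivery` function, and the assumption that every pull session carries a `poll_command` are illustrative and not taken from the package.

```typescript
// Field names/values mirror the README text; the exact JSON schema is assumed.
type Communication = {
  session_kind: "tmux" | "codex_app";
  receive_style: "push" | "pull";
  delivery_mode: "push" | "pull_required";
  requires_polling?: boolean;
  poll_command?: string;
};

// Hypothetical result type describing what the caller should do next.
type DeliveryPlan =
  | { kind: "push" }
  | { kind: "poll"; command: string; needsHeartbeat: boolean };

function planDelivery(c: Communication): DeliveryPlan {
  // tmux plus push: Dispatch can deliver directly into the session.
  if (c.session_kind === "tmux" && c.receive_style === "push") {
    return { kind: "push" };
  }
  // Anything else is treated as mailbox polling (assumption: poll_command
  // is present for pull sessions; fail loudly if it is not).
  if (!c.poll_command) {
    throw new Error("pull session without a poll_command");
  }
  return {
    kind: "poll",
    // Passed through unchanged so any local PATH=... prefix is preserved.
    command: c.poll_command,
    // requires_polling=true means an idle app thread needs a heartbeat or
    // operator wake-up before it will poll.
    needsHeartbeat: c.requires_polling === true,
  };
}
```

The poll command is deliberately never rebuilt from parts, since the README asks callers to use the exact generated command.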
@@ -1001,6 +1049,10 @@ Current dispatch state:
  - Consuming a mailbox item records `dispatch_inbox_consumed` telemetry with the
  notification id, signal type, delivery mode, target session role, and poll
  count, so manager/worker dispatcher handoffs are visible in audit evidence.
+ When the item is `continue_iteration`, Conveyor also advances the run
+ metadata's durable `current_iteration` to the requested iteration and records
+ `ralph_loop_iteration_advanced` telemetry keyed to the run, notification,
+ command, and consuming session.
  - If `doctor-self --json` reports `workerctl_on_path=false` inside a Codex app
  session, run `conveyor ...` from the repository root or install the
  local wrapper with `scripts/install-local --write`. Its `inside_tmux` check
@@ -1112,9 +1164,8 @@ scripts/rc-check --with-live-smoke-repeat
  Underlying deterministic checks:
 
  ```bash
- python3 -m unittest discover -s tests -v
- scripts/check-resource-warnings
- python3 -m py_compile scripts/workerctl scripts/check-resource-warnings workerctl/*.py
+ scripts/check-resource-warnings -- npm test -- --runInBand
+ npm run build
  npm run migration:audit:final
  scripts/package-smoke
  scripts/release-check
@@ -1123,10 +1174,9 @@ scripts/release-check
  For local parallel experiments, prefer:
 
  ```bash
- scripts/run-unittests-isolated
+ npm test
  ```
 
- This gives the process a temporary `WORKERCTL_STATE_ROOT` and a test namespace.
  The standard CI job remains serial.
 
  GitHub Actions runs `scripts/rc-check --skip-live-smoke-repeat` and
@@ -1134,7 +1184,7 @@ GitHub Actions runs `scripts/rc-check --skip-live-smoke-repeat` and
  remains local/manual because hosted runners may not have `codex`.
  The ResourceWarning gate intentionally fails on any `ResourceWarning` text in
  test output so finalization-time resource warnings cannot be hidden by a zero
- `unittest` exit status.
+ test exit status.
 
  Live local smoke gate: