@jterrats/open-orchestra 1.0.8 → 1.0.10

Files changed (188)
  1. package/AGENTS.md +15 -2
  2. package/CLAUDE.md +16 -2
  3. package/dist/acceptance-criteria-quality.d.ts +12 -0
  4. package/dist/acceptance-criteria-quality.js +137 -0
  5. package/dist/acceptance-criteria-quality.js.map +1 -0
  6. package/dist/architecture-debt-inventory.d.ts +31 -0
  7. package/dist/architecture-debt-inventory.js +200 -0
  8. package/dist/architecture-debt-inventory.js.map +1 -0
  9. package/dist/architecture-debt-report.d.ts +2 -0
  10. package/dist/architecture-debt-report.js +28 -0
  11. package/dist/architecture-debt-report.js.map +1 -0
  12. package/dist/autonomous-phase-lifecycle.d.ts +5 -1
  13. package/dist/autonomous-phase-lifecycle.js +87 -17
  14. package/dist/autonomous-phase-lifecycle.js.map +1 -1
  15. package/dist/autonomous-run-store.js +2 -2
  16. package/dist/autonomous-run-store.js.map +1 -1
  17. package/dist/benchmark.js +2 -1
  18. package/dist/benchmark.js.map +1 -1
  19. package/dist/clarification.js +2 -1
  20. package/dist/clarification.js.map +1 -1
  21. package/dist/cli-payloads.d.ts +4 -0
  22. package/dist/cli-payloads.js +24 -0
  23. package/dist/cli-payloads.js.map +1 -0
  24. package/dist/collaboration-flows.d.ts +1 -2
  25. package/dist/command-manifest.js +18 -5
  26. package/dist/command-manifest.js.map +1 -1
  27. package/dist/command-routes-integrations.js +2 -1
  28. package/dist/command-routes-integrations.js.map +1 -1
  29. package/dist/command-routes.js +3 -1
  30. package/dist/command-routes.js.map +1 -1
  31. package/dist/command-utils.d.ts +1 -2
  32. package/dist/command-utils.js.map +1 -1
  33. package/dist/commands.d.ts +2 -2
  34. package/dist/commands.js +17 -4
  35. package/dist/commands.js.map +1 -1
  36. package/dist/constants.js +1 -0
  37. package/dist/constants.js.map +1 -1
  38. package/dist/cursor-canvas.js +21 -1
  39. package/dist/cursor-canvas.js.map +1 -1
  40. package/dist/cursor-mdc.d.ts +10 -0
  41. package/dist/cursor-mdc.js +37 -0
  42. package/dist/cursor-mdc.js.map +1 -0
  43. package/dist/fs-utils.js +2 -1
  44. package/dist/fs-utils.js.map +1 -1
  45. package/dist/generated-guidance-health.d.ts +33 -0
  46. package/dist/generated-guidance-health.js +125 -0
  47. package/dist/generated-guidance-health.js.map +1 -0
  48. package/dist/github.js +22 -7
  49. package/dist/github.js.map +1 -1
  50. package/dist/health-checks.js +2 -51
  51. package/dist/health-checks.js.map +1 -1
  52. package/dist/id-utils.d.ts +3 -0
  53. package/dist/id-utils.js +11 -0
  54. package/dist/id-utils.js.map +1 -0
  55. package/dist/instruction-blocks.js +20 -5
  56. package/dist/instruction-blocks.js.map +1 -1
  57. package/dist/mcp-oauth-proxy.js +5 -5
  58. package/dist/mcp-oauth-proxy.js.map +1 -1
  59. package/dist/memory-status.js +31 -5
  60. package/dist/memory-status.js.map +1 -1
  61. package/dist/memory.js +31 -7
  62. package/dist/memory.js.map +1 -1
  63. package/dist/metrics-commands.js +69 -17
  64. package/dist/metrics-commands.js.map +1 -1
  65. package/dist/notifications.js +12 -2
  66. package/dist/notifications.js.map +1 -1
  67. package/dist/phase-deterministic-output.d.ts +4 -0
  68. package/dist/phase-deterministic-output.js +62 -0
  69. package/dist/phase-deterministic-output.js.map +1 -0
  70. package/dist/phase-executor.js +9 -172
  71. package/dist/phase-executor.js.map +1 -1
  72. package/dist/phase-playbooks.js +17 -0
  73. package/dist/phase-playbooks.js.map +1 -1
  74. package/dist/provider-utils.js +11 -1
  75. package/dist/provider-utils.js.map +1 -1
  76. package/dist/qa-e2e-artifacts.d.ts +7 -0
  77. package/dist/qa-e2e-artifacts.js +225 -0
  78. package/dist/qa-e2e-artifacts.js.map +1 -0
  79. package/dist/quality-contracts.d.ts +83 -0
  80. package/dist/quality-contracts.js +463 -0
  81. package/dist/quality-contracts.js.map +1 -0
  82. package/dist/refresh-generated.js +81 -28
  83. package/dist/refresh-generated.js.map +1 -1
  84. package/dist/runtime-bootstrap.js +26 -2
  85. package/dist/runtime-bootstrap.js.map +1 -1
  86. package/dist/runtime-commands.d.ts +2 -0
  87. package/dist/runtime-commands.js +190 -1
  88. package/dist/runtime-commands.js.map +1 -1
  89. package/dist/runtime-context-manifest.d.ts +27 -0
  90. package/dist/runtime-context-manifest.js +151 -0
  91. package/dist/runtime-context-manifest.js.map +1 -0
  92. package/dist/runtime-execution-renderer.d.ts +3 -1
  93. package/dist/runtime-execution-renderer.js +7 -1
  94. package/dist/runtime-execution-renderer.js.map +1 -1
  95. package/dist/runtime-execution.d.ts +2 -1
  96. package/dist/runtime-execution.js +191 -2
  97. package/dist/runtime-execution.js.map +1 -1
  98. package/dist/runtime-guardrails.js +5 -1
  99. package/dist/runtime-guardrails.js.map +1 -1
  100. package/dist/runtime-lifecycle-watch-adapters.d.ts +4 -0
  101. package/dist/runtime-lifecycle-watch-adapters.js +87 -0
  102. package/dist/runtime-lifecycle-watch-adapters.js.map +1 -0
  103. package/dist/runtime-lifecycle-watch.d.ts +85 -0
  104. package/dist/runtime-lifecycle-watch.js +312 -0
  105. package/dist/runtime-lifecycle-watch.js.map +1 -0
  106. package/dist/runtime-parent-action-dispatch.d.ts +30 -0
  107. package/dist/runtime-parent-action-dispatch.js +114 -0
  108. package/dist/runtime-parent-action-dispatch.js.map +1 -0
  109. package/dist/runtime-parent-action-eligibility.d.ts +12 -0
  110. package/dist/runtime-parent-action-eligibility.js +131 -0
  111. package/dist/runtime-parent-action-eligibility.js.map +1 -0
  112. package/dist/runtime-parent-actions.d.ts +7 -2
  113. package/dist/runtime-parent-actions.js +145 -1
  114. package/dist/runtime-parent-actions.js.map +1 -1
  115. package/dist/runtime-spawn-bridge.js +21 -1
  116. package/dist/runtime-spawn-bridge.js.map +1 -1
  117. package/dist/skills-validation.js +1 -2
  118. package/dist/skills-validation.js.map +1 -1
  119. package/dist/sonar-commands.d.ts +1 -0
  120. package/dist/sonar-commands.js +36 -0
  121. package/dist/sonar-commands.js.map +1 -1
  122. package/dist/sonar-insights.d.ts +13 -0
  123. package/dist/sonar-insights.js +32 -53
  124. package/dist/sonar-insights.js.map +1 -1
  125. package/dist/sonar-payload-normalizers.d.ts +16 -0
  126. package/dist/sonar-payload-normalizers.js +67 -0
  127. package/dist/sonar-payload-normalizers.js.map +1 -0
  128. package/dist/sonar-preflight.d.ts +26 -0
  129. package/dist/sonar-preflight.js +111 -0
  130. package/dist/sonar-preflight.js.map +1 -0
  131. package/dist/sonar-redaction.d.ts +1 -0
  132. package/dist/sonar-redaction.js +13 -0
  133. package/dist/sonar-redaction.js.map +1 -0
  134. package/dist/subagent-protocol.js.map +1 -1
  135. package/dist/task-graph-commands.js +8 -1
  136. package/dist/task-graph-commands.js.map +1 -1
  137. package/dist/telemetry-redaction.js +31 -2
  138. package/dist/telemetry-redaction.js.map +1 -1
  139. package/dist/types/model-config.d.ts +6 -0
  140. package/dist/types/runtime.d.ts +49 -2
  141. package/dist/types/tasks.d.ts +12 -0
  142. package/dist/types.d.ts +1 -1
  143. package/dist/types.js.map +1 -1
  144. package/dist/web-api.js +8 -0
  145. package/dist/web-api.js.map +1 -1
  146. package/dist/web-artifacts.js +8 -3
  147. package/dist/web-artifacts.js.map +1 -1
  148. package/dist/web-console/assets/index-DA8Fs4r7.js +11 -0
  149. package/dist/web-console/index.html +1 -1
  150. package/dist/workflow-background-subagents.js +8 -4
  151. package/dist/workflow-background-subagents.js.map +1 -1
  152. package/dist/workflow-handoff-assessment.d.ts +3 -0
  153. package/dist/workflow-handoff-assessment.js +246 -0
  154. package/dist/workflow-handoff-assessment.js.map +1 -0
  155. package/dist/workflow-handoff-contract.d.ts +32 -0
  156. package/dist/workflow-handoff-contract.js +123 -0
  157. package/dist/workflow-handoff-contract.js.map +1 -0
  158. package/dist/workflow-phase-transition.d.ts +16 -0
  159. package/dist/workflow-phase-transition.js +76 -0
  160. package/dist/workflow-phase-transition.js.map +1 -0
  161. package/dist/workflow-run-commands.js +79 -18
  162. package/dist/workflow-run-commands.js.map +1 -1
  163. package/dist/workflow-services.js +62 -28
  164. package/dist/workflow-services.js.map +1 -1
  165. package/dist/workspace-init-artifacts.d.ts +9 -0
  166. package/dist/workspace-init-artifacts.js +28 -0
  167. package/dist/workspace-init-artifacts.js.map +1 -1
  168. package/dist/workspace-runtime-bootstrap.d.ts +3 -1
  169. package/dist/workspace-runtime-bootstrap.js +8 -3
  170. package/dist/workspace-runtime-bootstrap.js.map +1 -1
  171. package/dist/workspace.d.ts +5 -2
  172. package/dist/workspace.js +46 -16
  173. package/dist/workspace.js.map +1 -1
  174. package/docs/adoption-guide.md +12 -0
  175. package/docs/architecture-debt-inventory.md +25 -0
  176. package/docs/autonomous-workflow.md +10 -2
  177. package/docs/claude-adapter-qa-matrix.md +56 -0
  178. package/docs/e2e-test-batteries.md +34 -23
  179. package/docs/orchestra-mvp.md +21 -0
  180. package/docs/release-test-matrix.md +22 -0
  181. package/docs/runtime-adapters.md +155 -15
  182. package/docs/sonar-quality-gates.md +240 -11
  183. package/package.json +5 -1
  184. package/rules/delivery-quality-gates.mdc +8 -0
  185. package/rules/devops-tooling.mdc +1 -0
  186. package/rules/security-guardrails.mdc +3 -0
  187. package/rules/testing-discipline.mdc +9 -0
  188. package/dist/web-console/assets/index-CgSKcay8.js +0 -11
@@ -51,6 +51,15 @@ orchestra review --task STORY-001 --role qa --result approve --findings "Accepta
  orchestra release check --json
  ```
 
+ Do not approve workflow gates just because the CLI reached a pause. The
+ `po->architect` gate requires user-validated scope, assumptions, non-goals,
+ priority, acceptance criteria, and sizing context. The `qa->release` gate
+ requires real implementation evidence, exact validation commands, QA findings,
+ and BA/PO plus Architect review when business behavior or technical contracts
+ changed. If a generated handoff says `Acceptance Criteria: none`, use the
+ linked issue or task as the source of truth, record the gap, and block release
+ until the missing criteria/evidence are fixed or explicitly risk-accepted.
+
  ## Concepts
 
  Open Orchestra is a local control plane. The parent runtime remains the active
@@ -79,6 +88,9 @@ orchestra workflow gate-approve --run <run-id> --gate "po->architect" --approver
  orchestra workflow run --task TEAM-001 --resume <run-id>
  ```
 
+ Quote gate ids in shells so `po->architect` and `qa->release` are passed as
+ literal values rather than interpreted as output redirection.
+
  Local-only provider:
 
  ```bash
@@ -0,0 +1,25 @@
+ # Architecture Debt Inventory
+
+ Open Orchestra includes a report-only architecture debt inventory for spotting files that may need future refactoring.
+
+ Run it with:
+
+ ```sh
+ npm run architecture:inventory
+ ```
+
+ For machine-readable output:
+
+ ```sh
+ npm run build
+ node scripts/architecture-debt-inventory.js --json
+ ```
+
+ The inventory reports:
+
+ - Large files over the configured line threshold.
+ - Long functions over the configured function threshold.
+ - Command-facing modules that may contain orchestration logic.
+ - Module-boundary candidates that mix CLI, filesystem, workflow, and domain concerns.
+
+ This slice is intentionally warn-only. It does not fail CI yet because the current repository still needs threshold tuning and incremental refactor stories. Future enforcement slices can promote selected categories to CI failures once the baseline is reviewed.
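Reviewer note: the first inventory category above (large files over a line threshold, reported without failing CI) can be illustrated with a minimal shell sketch. This is not the package's implementation. The `LINE_THRESHOLD` variable, its default of 400, the `*.js` filter, and both function names are assumptions made only for illustration.

```sh
#!/bin/sh
# Hypothetical sketch of a warn-only "large file" report.
# LINE_THRESHOLD, list_large_files, and report_large_files are illustrative
# names; they are not part of Open Orchestra.

LINE_THRESHOLD="${LINE_THRESHOLD:-400}"

# Print "<lines> <path>" for every *.js file under $1 whose line count
# is strictly greater than LINE_THRESHOLD.
list_large_files() {
  find "$1" -type f -name '*.js' | while IFS= read -r file; do
    lines=$(wc -l < "$file" | tr -d ' ')
    if [ "$lines" -gt "$LINE_THRESHOLD" ]; then
      printf '%s %s\n' "$lines" "$file"
    fi
  done
}

# Warn-only wrapper: print the findings but always return 0, so the
# report can run in CI without failing the pipeline.
report_large_files() {
  list_large_files "$1"
  return 0
}
```

Under these assumptions, `LINE_THRESHOLD=300 report_large_files dist` would list the offending files and still exit successfully, which mirrors the warn-only behavior described above.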
@@ -372,8 +372,9 @@ Started autonomous workflow wfrun-... for task FEAT-001 [gates=phase]
  → po (product_owner) task=FEAT-001-po
  ✓ handoff → architect (...)
  ⏸ gate po→architect — review: .agent-workflow/approvals/FEAT-001-gate-po-to-architect-....md
- Approve: orchestra workflow run --task FEAT-001 --resume wfrun-...
+ Approve: orchestra workflow gate-approve --run wfrun-... --gate "po->architect" --approver <name> --rationale "<text>"
 
+ $ orchestra workflow gate-approve --run wfrun-... --gate "po->architect" --approver "product-owner" --rationale "User validated scope, assumptions, non-goals, priority, acceptance criteria, and sizing context"
  $ orchestra workflow run --task FEAT-001 --resume wfrun-...
  Resuming run wfrun-... from phase architect
  → architect (architect) task=FEAT-001-architect
@@ -384,10 +385,17 @@ Resuming run wfrun-... from phase architect
  → qa (qa) task=FEAT-001-qa
  ✓ handoff → release_manager (...)
  ⏸ gate qa→release — review: .agent-workflow/approvals/FEAT-001-gate-qa-to-release-....md
- Approve: orchestra workflow run --task FEAT-001 --resume wfrun-...
+ Approve: orchestra workflow gate-approve --run wfrun-... --gate "qa->release" --approver <name> --rationale "<text>"
 
+ $ orchestra workflow gate-approve --run wfrun-... --gate "qa->release" --approver "release-manager" --rationale "Implementation evidence, QA findings, BA/PO acceptance, and Architect review are recorded"
  $ orchestra workflow run --task FEAT-001 --resume wfrun-...
  Resuming run wfrun-... from phase release
  → release (release_manager) task=FEAT-001-release
  Workflow complete [run=wfrun-...]
  ```
+
+ Gate ids should be quoted in shells. If a generated handoff says
+ `Acceptance Criteria: none`, do not approve release from that handoff alone.
+ Use the linked issue or task as the acceptance source, record the missing
+ handoff detail as a review finding, and continue only after the gap is fixed or
+ the Product Owner explicitly accepts the risk.
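Reviewer note: the quoting advice in the hunk above follows from how POSIX shells tokenize `>`. The sketch below shows the difference using a stand-in command. `show_args` is a hypothetical helper used only to display what a command receives; it is not part of the `orchestra` CLI.

```sh
#!/bin/sh
# Shows why gate ids such as po->architect need quoting.
# show_args is a hypothetical stand-in for any command that takes a gate id.

# Print each received argument on its own line, wrapped in brackets.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

demo_dir=$(mktemp -d)

# Unquoted: the shell splits the text into the word `po-` and the
# redirection `>architect`, so the command sees only `po-` and its
# output is written to a file named `architect`.
( cd "$demo_dir" && show_args po->architect )
unquoted_result=$(cat "$demo_dir/architect")

# Quoted: the whole gate id reaches the command as one literal argument.
quoted_result=$(show_args "po->architect")

printf 'unquoted argument: %s\n' "$unquoted_result"   # [po-]
printf 'quoted argument:   %s\n' "$quoted_result"     # [po->architect]

rm -rf "$demo_dir"
```

The same parsing applies to `qa->release`: left unquoted, the command would receive `qa-` and a stray file named `release` would be created in the working directory.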
@@ -0,0 +1,56 @@
+ # Claude Adapter QA Matrix
+
+ This matrix covers the GH-422 child stories for Claude runtime adapter support.
+ It records only local contract evidence and documentation evidence. It does not
+ claim real Claude Code native execution or Anthropic/provider API execution.
+
+ ## Evidence Summary
+
+ | Story | Scope | Evidence | Status |
+ | --- | --- | --- | --- |
+ | #432 / `GH-432-CLAUDE-ADAPTER-CONTRACT` | Claude action eligibility, skip reasons, alias policy, non-regression docs | QA handoff, release handoff, `npm run build`, `node --test test/runtime-adapters.test.js` with 51 passing tests, `git diff --check` | Pass |
+ | #433 / `GH-433-CLAUDE-DISPATCH-BRIDGE` | Dispatch bridge boundary, spawned/active lifecycle recording, idempotency, fallback guidance | QA handoff, release handoff, `npm run build`, `node --test test/runtime-adapters.test.js` with 54 passing tests, `git diff --check` | Pass |
+ | #434 / `GH-434-CLAUDE-COMPLETION-RECONCILIATION` | Strict completion validation by task, phase, role, runtime, session, and expected artifact | Issue exists and remains open; no local QA handoff found for this slice | Pending |
+ | #435 / `GH-435-CLAUDE-GATE-PRESERVATION` | Safe workflow resume and human gate preservation regression coverage | Issue exists and remains open; no local QA handoff found for this slice | Deferred |
+ | #436 / `GH-436-CLAUDE-DOCS-QA-EVIDENCE` | Documentation, QA matrix, release evidence, support-level framing | This document, `docs/runtime-adapters.md`, GH-436 QA handoff | Pending |
+
+ ## Acceptance Criteria Matrix
+
+ | Story | Acceptance Criterion | Evidence Type | Evidence | Result | Status |
+ | --- | --- | --- | --- | --- | --- |
+ | #432 | Claude runtime action eligibility is defined for runtime, action kind, tool name, session status, safety state, and runtime filter. | Unit/contract | `.agent-workflow/handoffs/GH-432-CLAUDE-ADAPTER-CONTRACT-wfrun-1779347910261-ecc68e-qa-qa-runtime-handoff.md`; `node --test test/runtime-adapters.test.js` | QA reports dispatchable primary Claude action and checked eligibility fields. | Pass |
+ | #432 | Skip reasons are machine-readable and human-readable for queued, suspended, terminal, stale or unsafe, runtime mismatch, tool mismatch, unsupported runtime, unavailable native tool, and manual request. | Unit/CLI contract | GH-432 QA handoff; GH-432 developer evidence | QA reports stable reason codes and readable messages for required skipped conditions. | Pass |
+ | #432 | Tool alias policy documents primary `claude-code-agent` and whether `Task` is supported or deferred. | Documentation/unit | `docs/runtime-adapters.md`; GH-432 QA handoff | `claude-code-agent` is primary; `Task` is deferred/manual and skipped as `tool-mismatch`. | Pass |
+ | #432 | Tests cover dispatchable and non-dispatchable Claude actions without changing Codex or Cursor behavior. | Regression/unit | GH-432 QA handoff; `node --test test/runtime-adapters.test.js` with 51 passing tests | Focused runtime adapter tests passed; Codex/Cursor behavior reviewed in the same suite. | Pass |
+ | #432 | `docs/runtime-adapters.md` reflects only the tested support level. | Documentation review | GH-432 QA handoff; current runtime adapter docs | Docs avoid native Claude/provider API claims for GH-432. | Pass |
+ | #433 | Eligible `claude-agent-request` actions can be dispatched through a focused Claude bridge service or equivalent adapter boundary. | Code review/CLI unit | `.agent-workflow/handoffs/GH-433-CLAUDE-DISPATCH-BRIDGE-wfrun-1779349624692-f04dbc-qa-qa-runtime-handoff.md`; `node --test test/runtime-adapters.test.js` | QA reports dispatch result is `dispatched` through the adapter boundary. | Pass |
+ | #433 | Dispatch records spawned and active lifecycle states with a stable child identifier or deterministic fallback label. | CLI unit/event review | GH-433 QA handoff | QA reports one spawned event, one active heartbeat, and `claude-code-agent:<session>` fallback label coverage. | Pass |
+ | #433 | Repeated dispatch is idempotent and never creates duplicate lifecycle events for the same session. | CLI unit | GH-433 QA handoff | QA reports repeated dispatch keeps one spawned and one active event. | Pass |
+ | #433 | Unavailable or unsupported native tool paths return explicit fallback guidance and manual lifecycle commands. | CLI unit/code review | GH-433 QA handoff | QA reports skipped result includes prompt artifact, expected result artifact, and manual spawned command. | Pass |
+ | #433 | Tests cover successful dispatch, unavailable tool fallback, repeated dispatch idempotency, runtime mismatch, and guardrail rejection. | Automated tests | GH-433 QA handoff; `node --test test/runtime-adapters.test.js` with 54 passing tests | Required scenarios mapped to deterministic tests. | Pass |
+ | #434 | Completion validation checks task id, phase, role, runtime, session id, and expected result artifact path. | Planned unit/watch tests | GitHub issue #434 | No local implementation or QA evidence reviewed in this slice. | Pending |
+ | #434 | Wrong-task, wrong-role, wrong-runtime, wrong-session, missing, duplicate, and unsafe-path artifacts are rejected or skipped with explicit reasons. | Planned unit/watch tests | GitHub issue #434 | No local implementation or QA evidence reviewed in this slice. | Pending |
+ | #434 | `runtime watch` records completed exactly once for a valid spawned or active Claude session. | Planned watch tests | GitHub issue #434 | No local implementation or QA evidence reviewed in this slice. | Pending |
+ | #434 | Native immediate completion results follow the same validation rules when supported. | Planned contract tests | GitHub issue #434 | Native immediate completion is not claimed as supported by current evidence. | Pending |
+ | #434 | Tests cover artifact validation, duplicate completion prevention, timeout/stale behavior, and safe path handling. | Planned automated tests | GitHub issue #434 | No local implementation or QA evidence reviewed in this slice. | Pending |
+ | #435 | Verified completion resumes the paused run to the next safe phase when no human gate is pending. | Planned workflow tests | GitHub issue #435 | No local implementation or QA evidence reviewed in this slice. | Deferred |
+ | #435 | `po-to-architect`, `qa-to-release`, and configured human gates remain paused until explicit approval. | Planned workflow tests/manual review | GitHub issue #435 | Dedicated regression evidence is still required before release claim. | Deferred |
+ | #435 | Auto-dispatch never records gate approval or skips a gate. | Planned workflow tests | GitHub issue #435 | Dedicated regression evidence is still required before release claim. | Deferred |
+ | #435 | Tests cover `gates=none`, `gates=phase`, `gates=all`, multi-phase dispatch until idle, manual fallback recovery, and GH-421 spawn-state messaging. | Planned CLI/workflow tests | GitHub issue #435 | No local implementation or QA evidence reviewed in this slice. | Deferred |
+ | #435 | Existing Codex, Cursor, generic, VS Code, Windsurf, and OpenCode behavior is unchanged or covered by regression tests. | Planned regression tests | GitHub issue #435 | Broad cross-runtime regression evidence is still required. | Deferred |
+ | #436 | Runtime adapter docs document Claude dispatch support, alias policy, fallback behavior, manual recovery commands, guardrails, and gate preservation. | Documentation review | `docs/runtime-adapters.md` | Updated in this slice. | Pass |
+ | #436 | QA matrix maps each GH-422 child story acceptance criterion to unit, workflow, CLI, or manual evidence. | Documentation | This file | Matrix records Pass/Pending/Deferred by criterion and evidence type. | Pass |
+ | #436 | Release evidence includes exact commands, pass/fail results, unsupported CI/manual verification notes, and unresolved risks. | QA handoff/evidence | GH-436 handoff under `.agent-workflow/handoffs/` | Handoff records commands and recommended validations. | Pending |
+ | #436 | Documentation does not claim native Claude execution beyond tested behavior. | Documentation review | `docs/runtime-adapters.md`; this file | Docs frame support as parent-runtime contract plus manual/runtime-owned launch. | Pass |
+ | #436 | Product/release review records go/no-go based on evidence and known limitations. | Review artifact | Pending release review for GH-436 | Needs release/product review after documentation QA. | Pending |
+
+ ## Unsupported Or Deferred Claims
+
+ - No evidence in this slice proves that Orchestra can call Claude Code native
+ Agent/Subagent tools from CI or from a non-Claude parent runtime.
+ - No evidence proves direct Anthropic or provider API execution for runtime
+ delegation; runtime-native artifacts keep `directProviderApiAllowed=false`.
+ - #434 completion reconciliation hardening remains pending.
+ - #435 workflow resume and human gate preservation regression coverage remains
+ deferred until its implementation and QA pass.
+
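Reviewer note: the #433 criterion above, that repeated dispatch keeps exactly one spawned and one active event per session, describes an idempotent write. The sketch below illustrates that property only. The plain-text log format and the `record_event`, `dispatch`, and `count_events` names are assumptions and do not reflect how Open Orchestra stores lifecycle events.

```sh
#!/bin/sh
# Hypothetical sketch of idempotent lifecycle recording.
# The "<session> <state>" log format and all function names are
# illustrative assumptions, not the package's implementation.

# Append "<session> <state>" to the log unless that exact line exists.
record_event() {
  log="$1"
  session="$2"
  state="$3"
  touch "$log"
  if grep -Fqx -e "$session $state" "$log"; then
    return 0  # already recorded, so a repeated dispatch is a no-op
  fi
  printf '%s %s\n' "$session" "$state" >> "$log"
}

# A dispatch records one spawned and one active event for the session.
dispatch() {
  record_event "$1" "$2" spawned
  record_event "$1" "$2" active
}

# Print how many times a given session/state pair appears in the log.
count_events() {
  grep -Fxc -e "$2 $3" "$1"
}
```

Calling `dispatch` twice with the same log and a label such as `claude-code-agent:s1` leaves two lines in the log, one per state, which is the behavior the QA handoff is reported to check.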
@@ -21,35 +21,36 @@ entry points a user or CI runner actually executes.
 
  ## P0 Release-Blocking Batteries
 
- | Battery | Scope | Command | Minimum Assertions | Evidence |
- | --- | --- | --- | --- | --- |
- | Source quality | Static checks, build, unit tests, workflow validation, secret scan, security audit | `npm run precommit` | exit code 0, no leaks, no audit blockers, workflow valid | command log |
- | Local CLI onboarding | Current source CLI in `/tmp` workspaces | `ORCHESTRA_NODE_SCRIPT=$PWD/bin/orchestra.js npm run test:e2e:init` | `--version`, `init`, `status`, `validate`, first-use task, handoff, evidence, release readiness | stdout/stderr, JSON output, filesystem assertions |
- | Installed CLI onboarding | Installed or packaged CLI in `/tmp` workspaces | `npm run test:e2e:init` after installing the candidate package | same assertions as local CLI onboarding, proving the packaged binary matches source behavior | stdout/stderr, JSON output, filesystem assertions, package version |
- | Browser console | Web console task, cost, provider, delegation, recovery, evidence, workflow, accessibility, artifacts | `npm run test:e2e` | visible state, API persistence, evidence attachment, lifecycle transitions, responsive/keyboard behavior | Playwright report, screenshots/traces on failure |
- | Public site | Documentation/site navigation, docs catalog, architecture viewer, mobile fit | `npm run test:e2e` | navigation order, local docs catalog search, no raw GitHub redirect for docs, mobile content fit | Playwright report |
- | Runtime manual queue | Manual runtime delegation in a `/tmp` workspace | `npm run test:e2e:runtime` | two active sessions, third manual `spawn-request` materializes `queued`, artifact includes lifecycle commands, `runtime sessions` lists queued session | stdout/stderr, JSON output, artifact content |
- | Init refresh environments | Simulated Codex, Claude, Cursor, generic workspaces | `node --test e2e/init-refresh-environments.test.js` | missing runtime guidance files regenerate on `init --force`, user content is preserved, managed blocks are updated only inside managed ranges | filesystem diff assertions |
- | Workflow lifecycle CLI | CLI workflow run, gate, resume, QA failback, release readiness | `node --test e2e/workflow-lifecycle-cli.test.js` | task phases create handoffs, blocked QA routes back, routine gate resumes immediately, release readiness maps acceptance to evidence | JSON output, events, handoffs |
+ | Battery | Scope | Command | Minimum Assertions | Evidence |
+ | ------------------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------ |
+ | Source quality | Static checks, build, unit tests, workflow validation, secret scan, security audit | `npm run precommit` | exit code 0, no leaks, no audit blockers, workflow valid | command log |
+ | Local CLI onboarding | Current source CLI in `/tmp` workspaces | `ORCHESTRA_NODE_SCRIPT=$PWD/bin/orchestra.js npm run test:e2e:init` | `--version`, `init`, `status`, `validate`, first-use task, handoff, evidence, release readiness | stdout/stderr, JSON output, filesystem assertions |
+ | Installed CLI onboarding | Installed or packaged CLI in `/tmp` workspaces | `npm run test:e2e:init` after installing the candidate package | same assertions as local CLI onboarding, proving the packaged binary matches source behavior | stdout/stderr, JSON output, filesystem assertions, package version |
+ | Browser console | Web console task, cost, provider, delegation, recovery, evidence, workflow, accessibility, artifacts | `npm run test:e2e` | visible state, API persistence, evidence attachment, lifecycle transitions, responsive/keyboard behavior | Playwright report, screenshots/traces on failure |
+ | Public site | Documentation/site navigation, docs catalog, architecture viewer, mobile fit | `npm run test:e2e` | navigation order, local docs catalog search, no raw GitHub redirect for docs, mobile content fit | Playwright report |
+ | Runtime manual queue | Manual runtime delegation in a `/tmp` workspace | `npm run test:e2e:runtime` | two active sessions, third manual `spawn-request` materializes `queued`, artifact includes lifecycle commands, `runtime sessions` lists queued session | stdout/stderr, JSON output, artifact content |
+ | Init refresh environments | Simulated Codex, Claude, Cursor, generic workspaces | `node --test e2e/init-refresh-environments.test.js` | missing runtime guidance files regenerate on `init --force`, user content is preserved, managed blocks are updated only inside managed ranges | filesystem diff assertions |
+ | Workflow lifecycle CLI | CLI workflow run, gate, resume, QA failback, release readiness | `node --test e2e/workflow-lifecycle-cli.test.js` | task phases create handoffs, blocked QA routes back, routine gate resumes immediately, release readiness maps acceptance to evidence | JSON output, events, handoffs |
 
  ## P1 High-Risk Regression Batteries
 
- | Battery | Scope | Command | Minimum Assertions | Evidence |
- | --- | --- | --- | --- | --- |
- | Multi-squad runtime | Parallel squad delegation with queue and threshold policy | `node --test e2e/runtime-multi-squad.test.js` | independent sessions, non-blocking parent, queued sessions do not fall back to parent, completion order reconciles | JSON output, lifecycle events |
- | Acceptance evidence | CLI, API, browser, and deferred integration evidence | `node --test e2e/acceptance-evidence.test.js` | evidence maps to named acceptance criteria, deferred external validation requires owner and rationale | evidence artifacts |
- | Recovery and repair | Interrupted runs, stale locks, failed provider phases | `node --test e2e/recovery-cli.test.js` plus browser recovery coverage | recovery detects issue, repair requires confirmation, repaired state is observable | JSON output, before/after state |
- | Docs/site content source | Site content generated from docs and manifest | `npm run site:build && npm run test:e2e -- --grep docs` | docs render as human-friendly catalog, no markdown-only dead ends, search works | Playwright report |
- | Security-sensitive operations | File paths, shell execution, web writes, secrets, telemetry redaction | `node --test e2e/security-boundaries.test.js` | path traversal blocked, unsafe writes rejected, secret-like data redacted, no raw stack traces | command/API evidence |
+ | Battery | Scope | Command | Minimum Assertions | Evidence |
+ | ------------------------------ | --------------------------------------------------------------------- | --------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------ |
+ | Multi-squad runtime | Parallel squad delegation with queue and threshold policy | `node --test e2e/runtime-multi-squad.test.js` | independent sessions, non-blocking parent, queued sessions do not fall back to parent, completion order reconciles | JSON output, lifecycle events |
+ | Acceptance evidence | CLI, API, browser, and deferred integration evidence | `node --test e2e/acceptance-evidence.test.js` | evidence maps to named acceptance criteria, deferred external validation requires owner and rationale | evidence artifacts |
+ | Recovery and repair | Interrupted runs, stale locks, failed provider phases | `node --test e2e/recovery-cli.test.js` plus browser recovery coverage | recovery detects issue, repair requires confirmation, repaired state is observable | JSON output, before/after state |
+ | Docs/site content source | Site content generated from docs and manifest | `npm run site:build && npm run test:e2e -- --grep docs` | docs render as human-friendly catalog, no markdown-only dead ends, search works | Playwright report |
+ | Security-sensitive operations | File paths, shell execution, web writes, secrets, telemetry redaction | `node --test e2e/security-boundaries.test.js` | path traversal blocked, unsafe writes rejected, secret-like data redacted, no raw stack traces | command/API evidence |
+ | Ollama provider-backed runtime | Local OpenAI-compatible Ollama provider route in a `/tmp` workspace | `npm run test:e2e:runtime:ollama` | `model connect --provider ollama`, provider-backed developer phase, OpenAI-compatible request shape, provider provenance, no runtime subagent credentials in artifacts | stdout/stderr, JSON output, mock provider request, event log |
 
  ## P2 Extended Confidence Batteries
 
- | Battery | Scope | Command | Minimum Assertions | Evidence |
- | --- | --- | --- | --- | --- |
- | Tracker and GitHub sync | Issue import/export and close readiness | opt-in CI job with network credentials | labels, comments, close gate, release readiness, no secret exposure | sanitized logs |
- | Sonar quality loop | Local or remote Sonar import and release gate mapping | configured Sonar workflow or local compose job | insights imported, release readiness reflects quality gate, unavailable token is explicit | artifact import report |
- | Provider-backed delegation | OpenAI, Gemini, Ollama, Claude/Cursor runtime bridges | opt-in provider E2E | budget checks, rate-limit/backpressure, lifecycle events, no direct API when disallowed | redacted provider provenance |
- | Package release dry run | npm package contents and release check | `npm pack --dry-run --json && orchestra release check --json` | generated/private state excluded, version/tag policy valid, release readiness complete | package list, release report |
+ | Battery | Scope | Command | Minimum Assertions | Evidence |
+ | -------------------------- | ----------------------------------------------------- | ------------------------------------------------------------- | ----------------------------------------------------------------------------------------- | ---------------------------- |
+ | Tracker and GitHub sync | Issue import/export and close readiness | opt-in CI job with network credentials | labels, comments, close gate, release readiness, no secret exposure | sanitized logs |
51
+ | Sonar quality loop | Local or remote Sonar import and release gate mapping | configured Sonar workflow or local compose job | insights imported, release readiness reflects quality gate, unavailable token is explicit | artifact import report |
52
+ | Provider-backed delegation | OpenAI, Gemini, Ollama, Claude/Cursor runtime bridges | opt-in provider E2E | budget checks, rate-limit/backpressure, lifecycle events, no direct API when disallowed | redacted provider provenance |
53
+ | Package release dry run | npm package contents and release check | `npm pack --dry-run --json && orchestra release check --json` | generated/private state excluded, version/tag policy valid, release readiness complete | package list, release report |
53
54
 
54
55
  ## Required `/tmp` Fixture Patterns
55
56
 
@@ -83,6 +84,16 @@ the packaging/install path is wrong.
83
84
  5. Add focused security and acceptance-evidence E2E only where unit tests cannot
84
85
  prove the user-visible contract.
85
86
 
87
+ ## Opt-In Provider Runtime Batteries
88
+
89
+ Provider-backed runtime batteries are not part of default CI because they may
90
+ need local services or paid credentials. They must still be deterministic enough
91
+ to run on a developer machine. `npm run test:e2e:runtime:ollama` uses a local
92
+ OpenAI-compatible mock endpoint by default to prove the Ollama adapter contract,
93
+ workflow provenance, and no-secret behavior without requiring a real Ollama
94
+ daemon. A separate real-model smoke can be run with `ORCHESTRA_OLLAMA_SMOKE=1`
95
+ when validating a local model installation.
96
+
86
97
  ## Definition Of Done
87
98
 
88
99
  An E2E battery is complete only when it has:
@@ -109,6 +109,14 @@ when release readiness passes. If release readiness is blocked, closure
109
109
  requires `--accepted-risk <text>`. `--dry-run --json` prints planned commands
110
110
  without writing local tasks, comments, or issue state.
111
111
 
112
+ GitHub comment payloads are file-backed. Generated commands use
113
+ `gh issue comment --body-file <payload-file>` for new comments and
114
+ `gh api --input <payload-json-file>` for comment updates instead of embedding
115
+ multiline markdown after `--body` or `-f body=...`. Agents should keep this
116
+ pattern for any copied or derived command: write markdown or JSON payload bytes
117
+ to a temporary file, pass the file path as an argv value, and avoid logging the
118
+ payload contents when command execution fails.
119
+
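The file-backed pattern described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `buildIssueCommentCommand` and `describeFailure` are hypothetical helper names, and the temp directory prefix is an assumption. Only the `gh issue comment --body-file` flag comes from the text.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: write the markdown body to a private temp file and
// return the argv for `gh issue comment`, so the payload never appears on
// the command line.
function buildIssueCommentCommand(issueNumber, markdown) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "orchestra-comment-"));
  const payloadFile = path.join(dir, "comment.md");
  fs.writeFileSync(payloadFile, markdown, { encoding: "utf8", mode: 0o600 });
  return {
    payloadFile,
    argv: ["issue", "comment", String(issueNumber), "--body-file", payloadFile],
  };
}

// Hypothetical helper: describe a failed command using only the argv and
// exit code, never the payload contents.
function describeFailure(command, exitCode) {
  return `gh ${command.argv.join(" ")} exited with code ${exitCode}`;
}

const command = buildIssueCommentCommand(42, "## Status\n\n- ready for review\n");
console.log(command.argv.join(" "));
```

The returned `argv` could then be passed to `child_process.execFileSync("gh", command.argv)`, which avoids shell quoting of multiline markdown entirely.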
112
120
  The transport boundary is intentionally tracker-agnostic. The local CLI can
113
121
  execute `gh` because it is a child process with stable arguments. MCP-backed
114
122
  trackers such as GitHub MCP, Jira, Bitbucket, GitLab, or a custom work tracker
@@ -267,6 +275,19 @@ Release go/no-go:
267
275
  `macos-latest`, and `windows-latest`.
268
276
  - Confirm `Create release tag`, site publish, and npm publish workflows are
269
277
  successful for the intended release commit.
278
+ - Confirm the npm publish workflow authenticates with either a maintainer
279
+ `NPM_TOKEN` that has read-write access plus CI 2FA bypass for
280
+ `@jterrats/open-orchestra`, or npm Trusted Publishing/OIDC configured with
281
+ organization/user `jterratsdev`, repository `open-orchestra`, workflow
282
+ filename `publish-npm.yml`, and no environment. If npm returns `404 Not Found`
283
+ during publish for an existing scoped package, treat it as a token or scope
284
+ permission failure until proven otherwise; do not print token values in logs
285
+ or evidence. If `npm whoami` returns `E401 Unauthorized`, regenerate the
286
+ GitHub Actions `NPM_TOKEN` secret from the npm maintainer account before
287
+ rerunning publish. If `npm publish` returns `EOTP`, the token is valid but
288
+ cannot bypass 2FA; switch to a granular token with bypass 2FA enabled or run
289
+ `publish-npm.yml` with `auth_mode=trusted` after configuring npm Trusted
290
+ Publishing for the package.
270
291
 
271
292
  Support diagnostics:
272
293
 
@@ -35,6 +35,28 @@ If the remote tag already exists, do not move it. Use the next patch or
35
35
  prerelease version unless an explicit break-glass release decision documents why
36
36
  manual intervention is required.
37
37
 
38
+ The npm publish workflow supports two authentication modes:
39
+
40
+ - `auth_mode=token`: requires `NPM_TOKEN` from an npm account with read-write
41
+ access to `@jterrats/open-orchestra` and CI publish permission that can bypass
42
+ 2FA.
43
+ - `auth_mode=trusted`: uses npm Trusted Publishing/OIDC. Configure the package
44
+ in npm with organization/user `jterratsdev`, repository `open-orchestra`,
45
+ workflow filename `publish-npm.yml`, and no environment before running this
46
+ mode.
47
+
48
+ A `404 Not Found` response on `PUT
49
+ https://registry.npmjs.org/@jterrats%2fopen-orchestra` for an existing scoped
50
+ package usually means the token is missing, was generated by a user that is not
51
+ a package maintainer, or lacks publish permission for the scope. `EOTP` means
52
+ the token is accepted but cannot bypass 2FA for CI publishing; use a granular
53
+ token with bypass 2FA enabled or switch to Trusted Publishing. The workflow
54
+ validates `npm whoami` and scoped package access before token-based publishing
55
+ so permission failures are visible without exposing the token.
56
+ If the preflight fails at `npm whoami` with `E401 Unauthorized`, regenerate
57
+ `NPM_TOKEN` from the npm maintainer account and store the raw token value as a
58
+ GitHub Actions secret.
59
+
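The failure modes above can be summarized as a small lookup. This is an illustrative sketch only: `triagePublishFailure` is a hypothetical helper and not part of the publish workflow. The sample log lines in the usage are assumptions about how npm prints these errors.

```javascript
// Remediation text paraphrases the guidance above; the keys are the error
// strings named in the documentation.
const PUBLISH_REMEDIATION = {
  EOTP: "Token is accepted but cannot bypass 2FA: use a granular token with bypass 2FA enabled, or switch to auth_mode=trusted.",
  E401: "Regenerate NPM_TOKEN from the npm maintainer account and store the raw value as a GitHub Actions secret.",
  "404 Not Found": "Treat as a token or scope permission failure: check that the token exists, belongs to a package maintainer, and can publish to the scope.",
};

// Hypothetical helper: map npm output to the documented remediation.
// Returns null when none of the documented failure strings appear.
function triagePublishFailure(npmOutput) {
  for (const key of Object.keys(PUBLISH_REMEDIATION)) {
    if (npmOutput.includes(key)) return { failure: key, remediation: PUBLISH_REMEDIATION[key] };
  }
  return null;
}

// Illustrative log line; the exact npm output format is an assumption.
console.log(triagePublishFailure("npm error code EOTP"));
```

The helper only inspects error text, so it never needs access to the token value.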
38
60
  ## Required Environments
39
61
 
40
62
  - `ubuntu-latest` with Node `>=22` and npm.
@@ -28,8 +28,9 @@ packet:
28
28
  approvals stay inside Codex; Orchestra renders briefs and packets only.
29
29
  - `claude-cli`: use the current Claude Code session. Orchestra renders the
30
30
  packet and the Claude parent launches it with the native Agent/Subagent tool
31
- when available; `Task` is treated as a legacy alias if that is what the
32
- runtime exposes.
31
+ when available. The tested auto-dispatch eligibility contract recognizes
32
+ `claude-code-agent` as the primary tool; `Task` is documented as a deferred
33
+ legacy/manual alias and is skipped as `tool-mismatch` by auto-dispatch.
33
34
  - `cursor-cli`: use the current Cursor runtime. Orchestra renders the packet
34
35
  and the Cursor parent launches it as a Background Agent so the current chat
35
36
  remains usable while the child works.
@@ -161,7 +162,10 @@ orchestra runtime session --session STORY-001:claude-cli --action suspend --json
161
162
  orchestra runtime session --session STORY-001:claude-cli --action resume --json
162
163
  orchestra runtime session --session STORY-001:claude-cli --action cancel --json
163
164
  orchestra runtime spawn-request --task STORY-001 --role developer --runtime codex-cli --json
165
+ orchestra runtime parent-actions --task STORY-001 --json
166
+ orchestra runtime parent-actions --task STORY-001 --dispatch --until-idle --runtime codex-cli --timeout 5m --idle-timeout 10s --json
164
167
  orchestra runtime spawn-lifecycle --session STORY-001:manual:developer:codex-cli --status spawned --agent-id <runtime-agent-id> --json
168
+ orchestra runtime watch --task STORY-001 --once --json
165
169
  ```
166
170
 
167
171
  The matching local APIs are `GET /api/runtime/sessions`,
@@ -172,27 +176,131 @@ failed, or timed-out events so the parent runtime can reconcile claimed work,
172
176
  spawned agent ids, stale sessions, and handoff state without inventing a second
173
177
  source of truth.
174
178
 
175
- Spawn request JSON includes `parentRuntimeAction`, a structured instruction for
176
- the active parent runtime. Codex receives `kind=codex-spawn-agent` with
177
- `tool=spawn_agent`; Claude receives `kind=claude-agent-request` with
178
- `tool=claude-code-agent`; Cursor receives `kind=cursor-background-agent` with
179
- `tool=cursor-background-agent`. The action points to the prompt artifact,
180
- expected result artifact, ownership paths, allowed commands, and lifecycle
181
- commands. It does not include secrets or direct provider credentials.
179
+ Spawn request JSON and `runtime parent-actions` include `parentRuntimeAction`, a
180
+ structured instruction for the active parent runtime. Codex receives
181
+ `kind=codex-spawn-agent` with `tool=spawn_agent`; Claude receives
182
+ `kind=claude-agent-request` with `tool=claude-code-agent`; Cursor receives
183
+ `kind=cursor-background-agent` with `tool=cursor-background-agent`. The action
184
+ points to the prompt artifact, expected result artifact, ownership paths,
185
+ allowed commands, and lifecycle commands. It does not include secrets or direct
186
+ provider credentials.
187
+
188
+ Pending parent actions also include structured `eligibility` metadata. The
189
+ metadata records the checked runtime, action kind, tool name, session status,
190
+ runtime filter when supplied, and safety state. Dispatchable actions report
191
+ `status=dispatchable`; skipped actions include machine-readable reason codes
192
+ and operator-readable messages. Current skip codes are `already-dispatched`,
193
+ `queued`, `suspended`, `terminal`, `stale-or-unsafe`, `runtime-mismatch`,
194
+ `tool-mismatch`, `unsupported-runtime`, `unavailable-native-tool`, and
195
+ `manual-request`.
196
+
197
+ When `workflow run` pauses with a pending parent runtime action, parent agents
198
+ have two supported paths:
199
+
200
+ - Manual inspection: run `runtime parent-actions --task <id> --json`, inspect
201
+ each requested action, call the active runtime's native tool, then record
202
+ `runtime spawn-lifecycle` with the returned child id.
203
+ - Auto-dispatch: run
204
+ `runtime parent-actions --task <id> --dispatch --until-idle --runtime <runtime-id>`.
205
+ The dispatcher repeatedly inspects pending parent actions, dispatches only
206
+ safe actions for the active runtime, records spawned and active lifecycle
207
+ events with stable runtime child ids or deterministic fallback labels, applies
208
+ `runtime watch` completions when expected handoff artifacts appear, resumes
209
+ paused workflow runs, and continues across later phases until idle or timeout.
210
+
211
+ The auto-dispatch loop is bounded by `--timeout`, `--idle-timeout`, and
212
+ `--interval`, so polling always terminates. It skips queued actions, suspended
213
+ sessions, terminal sessions, stale or unsafe actions, runtime mismatches,
214
+ already-dispatched sessions, unsupported runtimes, unavailable native tools,
215
+ manual requests, and tool mismatches. Skipped actions include fallback guidance
216
+ with the prompt artifact, expected result artifact, and manual lifecycle
217
+ commands so a human parent runtime can safely continue without provider API
218
+ access. This keeps the boundary explicit: Orchestra emits auditable actions and
219
+ lifecycle commands; the active parent runtime executes native tools such as
220
+ Codex `spawn_agent`, and the dispatcher only consumes actions that are safe for
221
+ the runtime declared on the command line. For Claude, the tested dispatch
222
+ contract accepts `claude-agent-request` with `tool=claude-code-agent`, records
223
+ `spawned` and `active` lifecycle states with a deterministic
224
+ `claude-code-agent:<session>` label when no native child id is available, and
225
+ remains idempotent across repeated dispatch attempts. Orchestra does not call
226
+ Claude Code, Anthropic APIs, or another provider API.
227
+
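The bounding behavior described above can be sketched as a pure budget check. This is a sketch under stated assumptions, not the dispatcher's actual code: `createLoopBudget` and `checkBudget` are hypothetical names, and the 5-second spacing between simulated passes is an assumed `--interval`. The `5m` timeout and `10s` idle timeout mirror the example command earlier in this document.

```javascript
// Hypothetical budget tracker for a bounded dispatch loop.
function createLoopBudget({ timeoutMs, idleTimeoutMs }, startedAt) {
  return { timeoutMs, idleTimeoutMs, startedAt, lastProgressAt: startedAt };
}

// Call once per pass. `madeProgress` is true when the pass dispatched or
// reconciled at least one action. Returns "continue", "idle", or "timeout".
function checkBudget(budget, madeProgress, nowMs) {
  if (madeProgress) budget.lastProgressAt = nowMs;
  if (nowMs - budget.startedAt >= budget.timeoutMs) return "timeout";
  if (nowMs - budget.lastProgressAt >= budget.idleTimeoutMs) return "idle";
  return "continue";
}

// Simulated passes on a fake clock: progress on the first two passes, then none.
const budget = createLoopBudget({ timeoutMs: 5 * 60 * 1000, idleTimeoutMs: 10 * 1000 }, 0);
let nowMs = 0;
let outcome = "continue";
for (const madeProgress of [true, true, false, false, false]) {
  outcome = checkBudget(budget, madeProgress, nowMs);
  if (outcome !== "continue") break;
  nowMs += 5000; // assumed interval between passes
}
console.log(outcome);
```

A real loop would sleep for the interval between passes instead of advancing a fake clock. Keeping the budget check free of timers makes the termination guarantee easy to test.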
228
+ Runtime lifecycle watching is adapter-driven. Each inspected session reports a
229
+ `watcher` object with adapter id, detection mode, support level, fallback
230
+ behavior, and the reason a native callback is unavailable. `codex-cli`,
231
+ `claude-cli`, and `cursor-cli` currently reconcile completion through explicit
232
+ parent lifecycle events and then fall back to bounded artifact inspection.
233
+ `generic-runtime`, unknown runtime ids, and runtimes without declared callbacks
234
+ use the same artifact fallback directly. Event-driven callbacks should only be
235
+ used when the selected watcher adapter declares native support; otherwise
236
+ `runtime watch` performs bounded inspection of the expected handoff artifact.
237
+
238
+ ## Claude Adapter Support Level
239
+
240
+ Claude support is currently a parent-runtime dispatch and lifecycle contract,
241
+ not proof that Orchestra can invoke Claude Code or Anthropic APIs by itself.
242
+ The tested local behavior covers:
243
+
244
+ - Dispatch support: eligible `claude-agent-request` actions for `claude-cli`
245
+ with `tool=claude-code-agent` can be consumed by
246
+ `runtime parent-actions --dispatch --runtime claude-cli`. The dispatch path
247
+ records `spawned` and `active` lifecycle state with a stable child identifier
248
+ or deterministic `claude-code-agent:<session>` fallback label.
249
+ - Alias policy: `claude-code-agent` is the only auto-dispatchable Claude tool
250
+ name in the tested contract. `Task` is a legacy/manual alias and is skipped
251
+ as `tool-mismatch`; accepting it in auto-dispatch requires new tests and
252
+ documentation.
253
+ - Fallback behavior: skipped, unsupported, unsafe, stale, queued, suspended,
254
+ terminal, mismatched, or unavailable actions return structured eligibility
255
+ metadata, fallback guidance, prompt artifact, expected result artifact, and
256
+ manual lifecycle commands. Fallback never runs the phase in the parent agent
257
+ silently and never switches to direct provider APIs.
258
+ - Guardrails: dispatch is bounded by runtime guardrails, runtime filters,
259
+ session status, safety state, action kind, tool name, and stale-session
260
+ checks. It preserves `directProviderApiAllowed=false` for runtime-native
261
+ delegation artifacts.
262
+ - Completion reconciliation: current tested support relies on explicit
263
+ lifecycle events and bounded expected-artifact inspection. GH-434 tracks
264
+ stricter validation of task id, phase, role, runtime, session id, and safe
265
+ expected artifact path before a Claude session is marked complete.
266
+ - Gate preservation: auto-dispatch must not approve or skip human gates. GH-435
267
+ tracks the dedicated regression suite for safe workflow resume across
268
+ `gates=none`, `gates=phase`, `gates=all`, multi-phase dispatch, and manual
269
+ fallback recovery.
270
+
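The dispatch, alias, and idempotency rules in the list above can be illustrated with a short sketch. The action shape (`runtime`, `tool`, `sessionId`, `agentId`), the `skipped` status value, and the function names are assumptions made for illustration. The reason codes, tool names, and the `claude-code-agent:<session>` label format come from the text.

```javascript
// Hypothetical eligibility check covering only the runtime filter and tool name.
function checkClaudeEligibility(action, runtimeFilter) {
  if (runtimeFilter && action.runtime !== runtimeFilter) {
    return { status: "skipped", code: "runtime-mismatch" };
  }
  if (action.tool !== "claude-code-agent") {
    // Includes the legacy/manual `Task` alias.
    return { status: "skipped", code: "tool-mismatch" };
  }
  return { status: "dispatchable" };
}

// Use the native child id when one exists, else the deterministic fallback label.
function childLabel(action) {
  return action.agentId || `claude-code-agent:${action.sessionId}`;
}

// In-memory stand-in for recorded lifecycle state (assumption for the sketch).
const dispatchedSessions = new Map();

function dispatchOnce(action, runtimeFilter) {
  if (dispatchedSessions.has(action.sessionId)) {
    return { status: "skipped", code: "already-dispatched", agentId: dispatchedSessions.get(action.sessionId) };
  }
  const eligibility = checkClaudeEligibility(action, runtimeFilter);
  if (eligibility.status !== "dispatchable") return eligibility;
  const agentId = childLabel(action);
  dispatchedSessions.set(action.sessionId, agentId);
  return { status: "dispatchable", agentId, lifecycle: ["spawned", "active"] };
}

const action = { runtime: "claude-cli", tool: "claude-code-agent", sessionId: "STORY-001:claude-cli" };
console.log(dispatchOnce(action, "claude-cli"));
console.log(dispatchOnce(action, "claude-cli"));
```

Repeating the call for the same session reports `already-dispatched` with the original label instead of recording a second child, which is one way to keep dispatch idempotent.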
271
+ Manual recovery for a skipped or unavailable Claude action:
272
+
273
+ ```bash
274
+ orchestra runtime parent-actions --task <id> --json
275
+ orchestra runtime spawn-request --task <id> --role <role> --runtime claude-cli --json
276
+ orchestra runtime spawn-lifecycle --session <session-id> --status spawned --agent-id <claude-child-id-or-label> --json
277
+ orchestra runtime spawn-lifecycle --session <session-id> --status active --agent-id <claude-child-id-or-label> --json
278
+ orchestra runtime watch --task <id> --once --json
279
+ orchestra workflow run --task <id> --resume <run-id>
280
+ ```
281
+
282
+ Only run the lifecycle commands after the parent Claude Code session has
283
+ actually launched or taken ownership of the rendered prompt artifact. If no
284
+ child id is returned by the runtime, use a stable operator label and keep the
285
+ expected handoff artifact path from the spawn request.
182
286
 
183
287
  ## Native Background Agent Notes
184
288
 
185
289
  Claude Code and Cursor do not need Orchestra to call vendor APIs directly.
186
290
  They need a precise packet and lifecycle hooks:
187
291
 
188
- - Claude Code: render `runtime spawn-request`, then launch the packet from the
189
- parent Claude session with the native Agent/Subagent tool. If the local
190
- Claude runtime exposes `Task` as the tool name, treat it as the compatible
191
- legacy alias. Record the returned child id or role label through
192
- `runtime spawn-lifecycle`.
292
+ - Claude Code: render `runtime spawn-request`, then manually launch the packet
293
+ from the parent Claude session with the native Agent/Subagent tool. The
294
+ primary tested tool name is `claude-code-agent`. `Task` is deferred as a
295
+ legacy/manual alias and is not auto-dispatchable in GH-432; auto-dispatch
296
+ reports it as `tool-mismatch`. Record the returned child id or role label
297
+ through `runtime spawn-lifecycle`.
193
298
  - Codex: render `runtime spawn-request`, read `parentRuntimeAction`, and call
194
299
  the parent `spawn_agent` tool with the prompt artifact as the role-scoped
195
- assignment. Keep the child detached unless the parent is blocked.
300
+ assignment. In workflow auto-consumer mode, use
301
+ `runtime parent-actions --dispatch --until-idle --runtime codex-cli` to
302
+ discover and consume safe actions after the run pauses. Keep the child
303
+ detached unless the parent is blocked.
196
304
  - Cursor: render `runtime spawn-request`, then launch it as a Cursor Background
197
305
  Agent. Background work should stay detached from the current chat and report
198
306
  lifecycle state back to Orchestra before the workflow is resumed.
@@ -245,6 +353,22 @@ parent-agent fallback reason. `subagents` requires runtime-native support and
245
353
  fails fast if the runtime cannot satisfy it. `single-agent` forces the parent
246
354
  agent path and records that choice in phase provenance.
247
355
 
356
+ When no task or role executor is configured and the default executor is
357
+ `generic-runtime`, `auto` and strict `subagents` mode infer the active runtime
358
+ from `OPEN_ORCHESTRA_ACTIVE_RUNTIME`, known parent-runtime environment markers,
359
+ or managed runtime bootstrap files. Codex maps to `codex-cli`, Claude maps to
360
+ `claude-cli`, Cursor maps to `cursor-cli`, Windsurf maps to `windsurf-agent`,
361
+ and VS Code maps to `vscode-agent`.
362
+
363
+ Explicit selections always take precedence in this order: `--runtime`, task
364
+ override, role override, then `runtimePolicy.defaults.executor`. Automatic
365
+ inference never rewrites `.agent-workflow/config.json`; it only affects the
366
+ current planning decision. Set `workflow.phaseExecutionMode` to `single-agent`
367
+ or configure `runtimePolicy.defaults.executor` to override inference for
368
+ deterministic local or CI runs. If `OPEN_ORCHESTRA_ACTIVE_RUNTIME` names an
369
+ unknown runtime, workflow planning fails with an error that lists the supported
370
+ values and the same override options instead of requiring hidden config edits.
371
+
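The precedence rules above can be sketched as a single resolution function. This is a simplified illustration: `resolveExecutor` and its parameter names are hypothetical, `activeRuntime` is assumed to already hold a runtime id such as `codex-cli` (the value format of `OPEN_ORCHESTRA_ACTIVE_RUNTIME` is not specified here), and inference from environment markers, bootstrap files, and `workflow.phaseExecutionMode` is omitted.

```javascript
// Runtime ids named in the text above.
const KNOWN_RUNTIMES = ["codex-cli", "claude-cli", "cursor-cli", "windsurf-agent", "vscode-agent"];

// Hypothetical resolver: explicit selections first, then the configured
// default, then inference only when the default is `generic-runtime`.
function resolveExecutor({ cliRuntime, taskOverride, roleOverride, defaultExecutor = "generic-runtime", activeRuntime }) {
  const explicit = cliRuntime || taskOverride || roleOverride;
  if (explicit) return { executor: explicit, source: "explicit" };
  if (defaultExecutor !== "generic-runtime") return { executor: defaultExecutor, source: "default" };
  if (activeRuntime === undefined) return { executor: "generic-runtime", source: "default" };
  if (!KNOWN_RUNTIMES.includes(activeRuntime)) {
    throw new Error(`Unknown runtime "${activeRuntime}"; supported values: ${KNOWN_RUNTIMES.join(", ")}`);
  }
  // Inference affects only this decision; nothing is written back to config.
  return { executor: activeRuntime, source: "inferred" };
}

console.log(resolveExecutor({ activeRuntime: "claude-cli" }));
console.log(resolveExecutor({ cliRuntime: "codex-cli", activeRuntime: "claude-cli" }));
```

Returning the `source` alongside the executor is one way to make the planning decision auditable without touching `.agent-workflow/config.json`.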
248
372
  Subagent spawning is fully asynchronous by default. A spawn request returns the
249
373
  `sessionId`, request artifact, prompt artifact, expected result artifact, status,
250
374
  next lifecycle commands, and quality warnings, then the parent agent should
@@ -331,3 +455,19 @@ orchestra runtime sessions --task <id> --json
331
455
  orchestra runtime spawn-lifecycle --session <id> --status completed --agent-id <id> --json
332
456
  orchestra model providers --json
333
457
  ```
458
+
459
+ ## Ollama E2E
460
+
461
+ The Ollama adapter has an opt-in E2E battery that runs in a temporary workspace
462
+ and uses a local OpenAI-compatible endpoint controlled by the test:
463
+
464
+ ```bash
465
+ npm run test:e2e:runtime:ollama
466
+ ```
467
+
468
+ The test configures `model connect --provider ollama`, runs a developer phase
469
+ through provider-backed execution, validates the request body sent to
470
+ `/v1/chat/completions`, and checks model provenance events. It intentionally
471
+ does not require a real Ollama daemon, so default CI and local development do
472
+ not degrade when Ollama is unavailable. Use `ORCHESTRA_OLLAMA_SMOKE=1` for a
473
+ separate real-model smoke check.
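
A request-shape check of the kind described above might look like the following sketch. It is not the test's actual assertion code: `validateChatCompletionRequest` is a hypothetical name, the model id in the usage is a placeholder, and the check assumes plain string message content, which is the minimal OpenAI-compatible shape.

```javascript
// Returns a list of problems; an empty list means the body has the minimal
// chat completion shape (a model id plus role/content messages).
function validateChatCompletionRequest(body) {
  const problems = [];
  if (typeof body?.model !== "string" || body.model === "") {
    problems.push("model must be a non-empty string");
  }
  if (!Array.isArray(body?.messages) || body.messages.length === 0) {
    problems.push("messages must be a non-empty array");
    return problems;
  }
  body.messages.forEach((message, index) => {
    if (typeof message?.role !== "string") problems.push(`messages[${index}].role must be a string`);
    if (typeof message?.content !== "string") problems.push(`messages[${index}].content must be a string`);
  });
  return problems;
}

// Placeholder request body; the model id is hypothetical.
const sample = {
  model: "example-local-model",
  messages: [{ role: "user", content: "Summarize the task." }],
};
console.log(validateChatCompletionRequest(sample));
```

A mock endpoint can run this kind of check on each captured request body and fail the test when the list is non-empty.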