@openwop/openwop-conformance 1.29.0 → 1.33.0
- package/CHANGELOG.md +4 -0
- package/README.md +2 -2
- package/api/asyncapi.yaml +53 -0
- package/coverage.md +13 -0
- package/package.json +1 -1
- package/schemas/capabilities.schema.json +58 -1
- package/schemas/conversation-event.schema.json +50 -2
- package/schemas/conversation-turn.schema.json +35 -0
- package/schemas/registry-version-manifest.schema.json +49 -2
- package/schemas/run-event-payloads.schema.json +87 -2
- package/schemas/run-event.schema.json +8 -1
- package/src/lib/multi-agent-capabilities.ts +23 -4
- package/src/lib/multiPartyConversation.ts +121 -0
- package/src/scenarios/aiproviders-realtimevoice-shape.test.ts +120 -0
- package/src/scenarios/multi-party-conversation-behavioral.test.ts +137 -0
- package/src/scenarios/multi-party-conversation-shape.test.ts +206 -0
- package/src/scenarios/registry-declarative-kinds.test.ts +111 -0
- package/src/scenarios/voice-event-payloads-shape.test.ts +127 -0
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,9 @@
# `@openwop/openwop-conformance` Changelog

+## [1.30.0] — 2026-06-23 — RFC 0101 multi-party conversation behavioral leg
+
+Adds `multi-party-conversation-behavioral.test.ts` — the capability-gated behavioral leg the RFC 0101 §Conformance section promised but had not yet shipped (suite 354 → 356; this release also publishes the always-on `multi-party-conversation-shape.test.ts` that landed in openwop#738 without a version bump). The leg gates on `capabilities.multiPartyConversation.supported` (root-first, via `isMultiPartyConversationSupported()`) through `behaviorGate('openwop-multi-party-conversation', …)` and asserts the cross-field / runtime MUSTs JSON Schema cannot express: a 3-agent council opens and a roster-valid, attributed agent turn is accepted (positive), while (a) a `role:'agent'` turn missing `speakerId`, (b) a turn whose `speakerId` is NOT in the declared `participants` roster, and (c) an `open` whose roster exceeds the advertised `maxParticipants` are each rejected with `error.code === 'validation_error'`. RFC 0005 §E pins the rejection *code*, not the HTTP status, so the leg tolerates `400`/`422`. Because RFC 0101 mints **no normative client wire-route to open a multi-party conversation** (opening / turn order / rounds are non-normative host product policy), the driver initiates a council + submits turns via the conformance-only seam `POST /v1/host/sample/conversation/multi-party/{open,exchange}` (new entry in `host-sample-test-seams.md`), which routes through the host's production roster-membership + attribution enforcement and is self-contained (it does NOT require a full RFC 0005 conversation gate). New lib helper `src/lib/multiPartyConversation.ts`. Soft-skips on `404`/`405` (reference impl: the postgres example host; a host whose enforcement is bound to a product flow — e.g. openwop-app ADR 0040's advisory-board council — witnesses instead via its own host-side test + an `INTEROP-MATRIX.md` row, the RFC 0086 dual-staging). Minor bump (scenario add) per the major/minor rule; the leg is additive + capability-gated, so no existing host pass-count changes.
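The three rejection cases in the 1.30.0 entry can be read as a small host-side validator. The sketch below is only an illustration of those rules: the `Turn` shape, the string-array roster, and the names `validateOpen`, `validateTurn`, and `reject` are assumptions made for this example and are not taken from the package or from RFC 0101. Only the `validation_error` code, the `role:'agent'` / `speakerId` / `participants` / `maxParticipants` names, and the three cases come from the entry.

```typescript
// Illustrative shapes only (assumed); the normative definitions are the package's JSON Schemas.
interface Turn {
  role: string;
  speakerId?: string;
  content: string;
}

type ValidationResult =
  | { ok: true }
  | { ok: false; error: { code: "validation_error"; message: string } };

// Hypothetical helper: builds the rejection the changelog entry names by code.
function reject(message: string): ValidationResult {
  return { ok: false, error: { code: "validation_error", message } };
}

// Case (c): an open whose roster exceeds the advertised maxParticipants.
function validateOpen(participants: readonly string[], maxParticipants: number): ValidationResult {
  if (participants.length > maxParticipants) {
    return reject(`roster of ${participants.length} exceeds maxParticipants ${maxParticipants}`);
  }
  return { ok: true };
}

// Cases (a) and (b): an agent turn must be attributed to a roster member.
function validateTurn(turn: Turn, participants: readonly string[]): ValidationResult {
  if (turn.role !== "agent") return { ok: true }; // other roles are out of scope for this sketch
  if (!turn.speakerId) return reject("agent turn is missing speakerId");
  if (!participants.includes(turn.speakerId)) {
    return reject(`speakerId "${turn.speakerId}" is not in the participants roster`);
  }
  return { ok: true };
}
```

A conformance driver would compare only `error.code`, not the HTTP status, which matches the entry's note that both `400` and `422` are tolerated.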
+
## [1.29.0] — 2026-06-20 — RFC 0105 speech synthesis adapter

Adds three scenario files for **RFC 0105 — Speech Synthesis Adapter** (the capability-gated `aiProviders.speechSynthesis` TTS surface), suite 351 → 354. One always-on, server-free shape leg (`aiproviders-speechsynth-shape.test.ts`) asserts `capabilities.aiProviders.speechSynthesis` is declared as the string const `"supported"` (not an object), is NOT in `aiProviders.required` (absence ⇒ no TTS is a valid default), and that Ajv accepts the string-const form while rejecting the object form. Two capability-gated behavioral legs: `speech-synthesis-roundtrip.test.ts` (gated on `aiProviders.speechSynthesis === 'supported'` via `behaviorGate('openwop-speech-synthesis', …)`) asserts the `ctx.callSpeechSynthesizer` round-trip returns an `audio` object with EXACTLY ONE of `url` / `base64`, a non-empty `mimeType`, and a `voiceId` that echoes the input; `speech-synthesis-unadvertised.test.ts` (gated BY ABSENCE via `behaviorGate('openwop-speech-synthesis-unadvertised', !advertised)`) asserts a host NOT advertising the flag MUST reject the call with `speech_synthesis_unsupported` (never a 200 success, never a silent no-op — parallel to RFC 0091's `unsupported_modality`). Both behavioral legs soft-skip on 404 until a host wires `POST /v1/host/sample/ai/call-speech-synthesizer`. Minor bump (scenario add) per the major/minor rule; all legs additive + capability-gated, so no existing host pass-count changes.
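The round-trip assertions in the 1.29.0 entry (exactly one of `url` / `base64`, a non-empty `mimeType`, an echoed `voiceId`) amount to a short shape check. The following is a minimal sketch under assumptions: the flat `SynthesizedAudio` interface, the placement of `voiceId` on the same object, and the function name `checkSynthesizedAudio` are hypothetical and chosen for illustration; the actual response shape is whatever RFC 0105 and the scenario file define.

```typescript
// Assumed flat shape for illustration; the real response layout may differ.
interface SynthesizedAudio {
  url?: string;
  base64?: string;
  mimeType?: string;
  voiceId?: string;
}

// Returns a list of problems; an empty list means the three stated checks pass.
function checkSynthesizedAudio(audio: SynthesizedAudio, requestedVoiceId: string): string[] {
  const problems: string[] = [];
  // An empty string is treated as absent in this sketch (an assumption).
  const hasUrl = typeof audio.url === "string" && audio.url.length > 0;
  const hasBase64 = typeof audio.base64 === "string" && audio.base64.length > 0;
  // Equal flags mean both or neither are present, which violates "exactly one".
  if (hasUrl === hasBase64) problems.push("audio must carry exactly one of url / base64");
  if (!audio.mimeType) problems.push("mimeType must be a non-empty string");
  if (audio.voiceId !== requestedVoiceId) problems.push("voiceId must echo the requested voice");
  return problems;
}
```

Returning a list rather than throwing on the first failure lets a test report every violated rule in one run.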
package/README.md
CHANGED
@@ -92,7 +92,7 @@ Exit code is non-zero on any failed assertion.

## What's Covered

-The current suite has 354 scenario files under `src/scenarios/`. 2026-06-20 (RFC 0105 — speech synthesis adapter) added three: `aiproviders-speechsynth-shape.test.ts` (always-on server-free — the `aiProviders.speechSynthesis` `const "supported"` flag is declared, absent from `aiProviders.required`, and a string-const advertisement validates while the object form `{supported:true}` is rejected), `speech-synthesis-roundtrip.test.ts` (gated on `aiProviders.speechSynthesis === "supported"` via `behaviorGate('openwop-speech-synthesis', …)` — an advertising host's `callSpeechSynthesizer` round-trip via the `POST /v1/host/sample/ai/call-speech-synthesizer` seam returns `audio` with EXACTLY ONE of `url`/`base64`, a non-empty `mimeType`, and the echoed `voiceId`; soft-skips on 404), and `speech-synthesis-unadvertised.test.ts` (gated-by-absence via `behaviorGate('openwop-speech-synthesis-unadvertised', …)` — a host NOT advertising `speechSynthesis` MUST reject the call with `speech_synthesis_unsupported`, never a silent no-op). 2026-06-19 (RFC 0104 — portable HITL approver routing) added one: `interrupt-approver-routing.test.ts` (server-free Ajv2020 — the `interrupt.approverRouting` capability block shape, the additive optionality of `approverGroupRefs` / `approverRoleRefs` / `audience` on the `ApprovalData` schema, the closed `audience` object, and the §"Portable approver routing" `notifyTargets` reference rule that `audience` DEFAULTS to the resolved eligibility union when omitted and OVERRIDES it when present; the capability-gated leg asserts an advertising host's `interrupt.approverRouting` is honest — `refKinds` ⊆ {group, role}, `audience` boolean — and soft-skips when the host does not advertise the capability). 
2026-06-17 (RFC 0103 — localized content surface) added one: `localized-content-delivery.test.ts` (server-free Ajv2020 — the four content schemas + the §C `resolveSection` merge + §A capability coherence; the public test for `content-published-cache-no-draft` / `content-response-tenant-scoped` / `content-no-cross-tenant-enumeration`; the live legs gate on `capabilities.content.supported` and soft-skip without a `GET /v1/content/pages/{slug}` target). 2026-06-15 (RFC 0102 — A2UI agent-authored interface surfaces) added five: `a2ui-surface-shape.test.ts` (server-free Ajv2020 — the closed core `ui.a2ui-surface` payload validates, while an out-of-catalog component / extra script-bearing property / unenumerated `catalogVersion` / `action.target` outside `enum["resume","exchange"]` each fail; the structural half of `a2ui-action-confinement` and the enabling precondition for the render-side `a2ui-surface-no-code-exec` / `a2ui-surface-no-network-egress` reference-app probes), `a2ui-surface-degrades.test.ts` (the kind is optional/advertised, not a MUST-recognize universal kind, and an unadvertised kind is gated — N6 — never crashing the run), `a2ui-surface-version-refusal.test.ts` (the enumerated `catalogVersion` rejects an unadvertised version → `unknown_schema_version`, and the surface schema carries no external `$ref`), `a2ui-surface-replay.test.ts` (all `$ref`s internal so a stored surface `:fork`/replays deterministically; same-`correlationId` + divergent `type` → `envelope_correlation_conflict`), and `a2ui-untrusted-blocks-approval.test.ts` (a `meta.contentTrust:'untrusted'` surface is trust-gated and MUST NOT advance an approval interrupt — composition of `untrusted_content_blocks_approval`). The four behavioral legs soft-skip on 404 (host-pending; the `openwop-app` reference renderer is the render-side probe). 
2026-06-14 (RFCs 0099/0100 — external-event trigger ingestion + async/durable A2A tasks) added one new scenario (`trigger-ingestion.test.ts`) and extended `a2a-task-roundtrip.test.ts` with the RFC 0100 async subtests: `trigger-ingestion.test.ts` (RFC 0099 — always-on `TriggerEvent` / `TriggerSubscriptionRegistration` schema legs incl. the §F.1 per-source one-of, the `AttachmentRef.ref`-only rule + raw-URL rejection backing `trigger-ingestion-ssrf`, and the content-free `trigger.delivery.attempted` shape backing `trigger-ingestion-content-redaction`; plus a capability-gated behavioral leg on `triggerBridge.ingestion` driving the `POST /v1/host/sample/trigger-bridge/ingest` seam for SSRF refusal + header-redaction), and the `a2a-task-roundtrip.test.ts` additions (RFC 0100 — always-on `A2ATaskState` + `capabilities.a2a` shape legs incl. the lowercase-hyphen state enum, the `PushConfig` `url`-required + truncated-`tokenFingerprint` rule backing `a2a-push-egress-ssrf`, the no-inline-inputs `additionalProperties:false` SR-1 check; plus capability-gated durable-`tasks/get`-after-disconnect and push-SSRF behavioral legs on `a2a.durableTasks` / `a2a.pushNotifications` via the `/v1/host/sample/a2a/tasks/*` seam). 2026-06-13 (RFCs 0096/0097/0098 — reviewable learning, standing goals, agent-platform portability) added three always-on-plus-gated scenarios: `proposal-reviewable-learning.test.ts` (RFC 0096 — the `agents.proposals` shape + the `Proposal` round-trip incl. 
the dropped `rule` kind + the content-free `proposal.{created,activated}` events, plus a gated apply-without-scope→403 leg; backs `proposal-inert-until-applied` + `proposal-no-resynthesis`), `goal-standing-continuation.test.ts` (RFC 0097 — the `agents.goals` shape + the `Goal` round-trip + the content-free `goal.{evaluated,closed}` events, plus gated bounded-termination→422 + judge-only-completion legs; backs `goal-continuation-bounded` + `goal-completion-judge-only`), and `export-bundle-portability.test.ts` (RFC 0098 — the `portability` shape incl. the `import⇒dryRun` if/then + the `ExportBundle` round-trip rejecting every credential-named field + the content-free `import.applied` event, plus a gated literal-credential-import→422 leg; backs `export-bundle-no-credential-material`). 2026-06-11 (RFCs 0093/0094 — protocol hardening + wire-shape reconciliation) added five: `version-fold.test.ts` (the `version-negotiation.md` §`X-Force-Engine-Version` cross-version matrix through the previously-orphaned `conformance-version-fold` fixture — closes catalog gap F5; soft-skips when `Capabilities.testing.forceEngineVersionRange` is unadvertised), `stream-text-fixture.test.ts` (the `stream-modes.md` §`messages` fold through the deterministic `stream-text` mock provider + the previously-orphaned `conformance-stream-text` fixture — closes catalog gap F1), `i18n-negotiation.test.ts` (gated on `capabilities.i18n` via `behaviorGate('openwop-i18n', …)` — an unsupported or malformed `Accept-Language` never 400s, `Content-Language` reflects the locale actually used, and error `code` strings stay the canonical English tokens), `grpc-transport.test.ts` (gated on `capabilities.grpc` via `behaviorGate('openwop-grpc-transport', …)` — advertisement-shape only per `grpc-transport.md` §Field semantics: `service` MUST be `openwop.v1.Engine`, the `tls` enum, `grpcs?://` endpoint URIs, `supportedTransports` includes `grpc` when exposed, production claimants require `tls: "required"`; no gRPC 
dialing), and `webhook-tenant-isolation.test.ts` (RFC 0093 §A.3 — backs the new protocol-tier `webhook-cross-tenant-isolation` invariant; a two-tenant proof through the `/v1/host/sample/test/surface` seam plus black-box registration-surface scoping). `spec-corpus-validity.test.ts` also gained the RFC 0094 §A satisfiability probe: canonical `createRun` bodies MUST pass the composed request schema (closed via `unevaluatedProperties: false` at the composition site, never inside an `allOf` branch) and an undeclared property MUST fail. 2026-06-07 (RFCs 0090/0091/0092 — verifier turn + convergence, multimodal perception input, agent capability requirements) added six: the always-on, server-free shape probes `agent-verifier-shape.test.ts`, `aiproviders-input-shape.test.ts`, `agent-requires-capabilities-shape.test.ts`, plus the capability-gated **behavioral** legs `agent-capability-degraded-projection.test.ts` (RFC 0092 §B — the `degraded[]` projection on `GET /v1/agents`, black-box, non-vacuous via `OPENWOP_DEGRADED_CAPABILITY_AGENT_ID`), `callai-multimodal.test.ts` (RFC 0091 §A/§B — advertised modality accepted / unadvertised → `unsupported_modality`, via the `POST /v1/host/sample/ai/call` seam), and `verifier-gating.test.ts` (RFC 0090 §B — a `fail` verdict blocks commit, via the `POST /v1/host/sample/agents/verify-run` seam). The three behavioral legs soft-skip by default and hard-fail under `OPENWOP_REQUIRE_BEHAVIOR=true` — the Active→Accepted reference-host proof for each RFC. 
2026-06-02 (RFC 0082 §B — deployment channel resolve-and-pin, production-path coverage) added `agent-channel-dispatch.test.ts` (capability-gated on `agents.deployment.supported` + the seeded `conformance-agent-channel-dispatch` fixture + advertised `replay` mode via `behaviorGate('openwop-deployment-channel-dispatch', …)` — proves the §B pin from a REAL run graph, complementing `agent-deployment-lifecycle.test.ts` Leg 4's host-sample seam: a canonical `POST /v1/runs` of a node binding `agent.channel:"stable"` MUST record `resolvedChannel` + `resolvedAgentVersion` on `agent.invocation.started` (RFC 0077), a `:fork{mode:"replay"}` MUST re-read that recorded version, and the seam-guarded Leg 3 MOVES the channel then asserts a replay STILL carries the original pin — never re-resolving a moved channel; soft-skips by default, hard-fails under `OPENWOP_REQUIRE_BEHAVIOR=true` — the production-path proof of the §B contract). 2026-06-01 (RFC 0085 — `openwop-agent-platform` meta-profile, the Active→Accepted behavioral gate) added `agent-platform-aggregate-evidence.test.ts` (capability-gated on a host CLAIMING `openwop-agent-platform` in its live discovery `profiles[]` via `behaviorGate('openwop-agent-platform', …)` — the §C/§D honest-advertisement rule on the live `/.well-known/openwop`: the claim MUST satisfy the §B floor predicate (`isAgentPlatformPartial` → `partial`/`full`, never `none`), backed by the per-capability evidence not the profile string; `OPENWOP_AGENT_PLATFORM_TIER=full` forces the non-vacuous full bar — all governance terms + tenant installScope + all 16 §D terms; server-requiring, the always-on §B/§D derivation legs stay in `agent-platform-profile.test.ts` — the RFC 0085 → Accepted bar). 
2026-06-01 (RFC 0084 — budget, quota + cost policy, the Active→Accepted behavioral gate) added `budget-enforcement.test.ts` (capability-gated on `budget.supported` via `behaviorGate('openwop-budget-enforcement', …)` — the §C/§D enforcement via the new `POST /v1/host/sample/budget/run` seam + the test event-log seam: a `hard-cost-exhaust` run emits the strict-ordered `budget.reserved → budget.consumed → budget.threshold.crossed{percent} → budget.exhausted → cap.breached{kind:"budget-cost"} → run.failed{error:"budget_exhausted"}` chain; a `model-denied` run is refused `budget_model_denied` BEFORE the provider call (fail-closed); an `advisory` host emits the `budget.*` events without stopping; every `budget.*` payload content-free backing `budget-no-pricing-leak`; new lib helper `src/lib/budgetPolicy.ts`; soft-skips on 404 — the RFC 0084 → Accepted bar). 2026-06-01 (RFC 0080 — agent memory capability reconciliation, the Active→Accepted behavioral gate) added `memory-degraded-projection.test.ts` (capability-gated on `agents.manifestRuntime.supported` + `memory.supported` via `behaviorGate('openwop-memory-degraded', …)` — the §C degraded-projection iff-contract on the NORMATIVE `GET /v1/agents`: a degraded inventory entry MUST carry `memoryDegraded:true` + a non-empty, unique `degradedMemoryDimensions[]` from the closed §A-name enum, a non-degraded entry MUST NOT, the inventory is non-empty, and the degraded branch runs non-vacuously when `OPENWOP_DEGRADED_AGENT_ID` names a known-degraded agent; black-box, no POST seam — the RFC 0080 → Accepted bar). This batch also documents the two RFC 0068 conformance seams (`POST /v1/host/sample/memory/consolidate` + `.../commitment/fire`) in `host-sample-test-seams.md` (the 0068 gated scenarios shipped in 1.14.0). 
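The strict-ordered `budget.*` chain described for `budget-enforcement.test.ts` can be checked as an ordered scan over the run's event types. This is a rough sketch, not the scenario's actual logic: the function name `followsBudgetChain` is hypothetical, and the choice to let unrelated events interleave between chain members is an assumption. It also ignores the payload fields (`percent`, `kind`, `error`) that the real leg inspects.

```typescript
// Event types in the order the README line gives for a hard-cost-exhaust run.
const EXPECTED_CHAIN = [
  "budget.reserved",
  "budget.consumed",
  "budget.threshold.crossed",
  "budget.exhausted",
  "cap.breached",
  "run.failed",
] as const;

// True when every expected type appears in order.
// Assumption: other event types may appear in between and are skipped.
function followsBudgetChain(eventTypes: readonly string[]): boolean {
  let next = 0;
  for (const eventType of eventTypes) {
    if (next < EXPECTED_CHAIN.length && eventType === EXPECTED_CHAIN[next]) {
      next += 1;
    }
  }
  return next === EXPECTED_CHAIN.length;
}
```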
2026-06-01 (RFC 0034 — collector-side BYOK-canary inspection) added `otel-collector-canary-inspection.test.ts` (always-on server-free: stands up a real `OtelCollector`, POSTs synthetic OTLP/HTTP-JSON traces + metrics through its actual ingest path, and proves the new `findCanaryLeakage()` inspector catches a canary embedded in a span attribute / resource attribute / span name / metric data-point attribute while reporting ZERO hits on a redacted payload and never matching an empty canary — the non-vacuous proof that the conformance collector now inspects what the host's OTLP exporter ACTUALLY shipped over the wire, closing the `secret-leakage-otel-attribute` / `-debug-bundle-otel` collector-seam gap; the live capability-gated complement is the new collector-export describe block in `secret-leakage-otel-attribute.test.ts`). 2026-06-01 (RFC 0035 — sandbox wall-clock timeout, the 7th-of-8 graduation) added `sandbox-wasm-timeout.test.ts` (worker-driven server-free: `probeTimeout` in `wasm-sandbox-probe.ts` spawns a worker thread running the committed `misbehaving-timeout.wasm` + a main-thread kill-timer — the thread preemption a same-thread probe can't do — asserting `sandbox_timeout` with a well-behaved positive control; graduates `node-pack-sandbox-timeout` reference-impl→protocol, so 7 of 8 `node-pack-sandbox-*` invariants are now protocol-tier, only the JS-specific `no-eval` permanently exempt). 
2026-05-31 (audit-response black-box / graduation batch) added three more: `sandbox-wasm-isolation.test.ts` (RFC 0035 — drives the committed `fixtures/wasm-sandbox/*.wasm` through `wasm-sandbox-probe.ts`: escape/capability-gate via static `WebAssembly.Module.imports()`, an OOB-store memory trap, double-instantiate isolation; 10/10; graduates 6 `node-pack-sandbox-*` invariants reference-impl→protocol), `workspace-cross-tenant-isolation-blackbox.test.ts` (RFC 0059 — two-credential black-box on the normative §C `/v1/host/workspace/files` endpoints: owner A writes, a second-tenant credential fails closed; no seam), and `prompt-resolution-chain-event.test.ts` (RFC 0029 — reads the durable `agent.promptResolved.chain[]` precedence record via the normative `GET /v1/runs/{runId}/events/poll`; no seam) — each the production-path proof that graduates its surface into the `openwop-core-standard` floor. 2026-05-31 (RFC 0088 — the `openwop-core-standard` Core Standard Profile, the audit-response Core Candidate target) added `core-standard-profile.test.ts` (always-on server-free derivation probe: `isCoreStandard` derives the §B floor — `openwop-core` ∧ `openwop-interrupts` ∧ (`openwop-stream-sse` ∨ `openwop-stream-poll`) — a bare `openwop-core` host without interrupts is excluded, a host with no event transport fails, and the annex is absent from `deriveProfiles` because it composes rather than redefines). 
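The §B floor stated for `core-standard-profile.test.ts` is a plain boolean predicate over profile names. The sketch below restates it under assumptions: `meetsCoreStandardFloor` is a hypothetical name used here so it is not mistaken for the package's own `isCoreStandard`, whose signature is not shown in this diff, and the input is assumed to be a flat list of profile strings.

```typescript
// Hypothetical restatement of the floor:
// openwop-core AND openwop-interrupts AND (openwop-stream-sse OR openwop-stream-poll).
function meetsCoreStandardFloor(profiles: readonly string[]): boolean {
  const has = (name: string): boolean => profiles.includes(name);
  return (
    has("openwop-core") &&
    has("openwop-interrupts") &&
    (has("openwop-stream-sse") || has("openwop-stream-poll"))
  );
}
```

The two exclusions the README line names fall out directly: a bare `openwop-core` host without interrupts fails the second term, and a host with no event transport fails the third.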
2026-05-31 (RFC 0082 — agent deployment lifecycle, the Active→Accepted behavioral gate) added `agent-deployment-lifecycle.test.ts` (capability-gated on `agents.deployment.supported` via `behaviorGate('openwop-deployment-lifecycle', …)` — the §E promotion contract via the new `POST /v1/host/sample/agents/deployment-transition` seam + the test event-log seam across four legs: `promote` (authorize RFC 0049 → approvalGate RFC 0051 → eval-verify RFC 0081 → content-free `deployment.promoted` with a seven-state `toState` + `toVersion`, the record validating `agent-deployment.schema.json`), `unauthorized` (fail-closed — `allowed:false`, no `deployment.promoted`, the behavioral leg of `deployment-promotion-fail-closed`), `eval-gate-unmet` (`eval_gate_unmet` denial, §E-3), and `channel-pin` (the §B `resolvedAgentVersion` recorded-fact on `agent.invocation.started`); new lib helper `src/lib/agentDeployment.ts`; soft-skips on 404 — the RFC 0082 → Accepted bar). 2026-05-31 (RFC 0081 — agent evaluation, the Active→Accepted behavioral gate) added `agent-eval-run.test.ts` (capability-gated on `agents.evalSuite.supported` via `behaviorGate('openwop-eval-run', …)` — the §B `mode:"eval"` projection via the new `POST /v1/host/sample/agents/eval-run` seam + the test event-log seam: `eval.started`-first → one `eval.scored` per task → `eval.completed`-once ordering (count == `eval.completed.taskCount`), the content-free `eval.scored` legs (`score` ∈ 0..1) backing `eval-summary-no-content-leak`, and the NORMATIVE `GET /v1/runs/{runId}/eval-summary` schema-valid `EvalSummary` round-trip with `passedCount <= taskCount`; new lib helper `src/lib/agentEval.ts`; soft-skips on 404 — the RFC 0081 → Accepted bar). 
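The `eval.*` ordering rules listed for `agent-eval-run.test.ts` (started first, one scored event per task, completed exactly once with a matching `taskCount`, and `score` within 0..1) can be sketched as a single pass over the events. Everything structural here is assumed for illustration: the `EvalEvent` interface, the payload field placement, and the name `checkEvalEvents` are not taken from the package, and the input is assumed to be pre-filtered to `eval.*` events.

```typescript
// Assumed minimal event shape; the real payloads are defined by the package's schemas.
interface EvalEvent {
  type: string;
  payload?: { taskCount?: number; score?: number };
}

// Returns a list of problems; an empty list means the stated ordering rules hold.
function checkEvalEvents(events: readonly EvalEvent[]): string[] {
  const problems: string[] = [];
  if (events[0]?.type !== "eval.started") {
    problems.push("eval.started must be the first event");
  }
  const scored = events.filter((e) => e.type === "eval.scored");
  const completed = events.filter((e) => e.type === "eval.completed");
  if (completed.length !== 1) {
    problems.push(`expected exactly one eval.completed, saw ${completed.length}`);
  } else if (completed[0]?.payload?.taskCount !== scored.length) {
    problems.push("eval.scored count must equal eval.completed.taskCount");
  }
  for (const event of scored) {
    const score = event.payload?.score;
    if (typeof score !== "number" || score < 0 || score > 1) {
      problems.push("eval.scored.score must lie in 0..1");
    }
  }
  return problems;
}
```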
2026-05-31 (RFC 0083 — durable trigger bridge, the Active→Accepted behavioral gate) added `trigger-bridge-delivery.test.ts` (profile-gated on `openwop-trigger-bridge` derived from the live discovery doc — the §C delivery model via the `POST /v1/host/sample/trigger-bridge/deliver` seam + the test event-log seam: dedup→effectively-once `trigger.delivery.attempted{delivered}` (§C-1), retry-exhaustion→`{dead-lettered}` + `trigger.subscription.state.changed{toState:dead-lettered}` (§C-2 + RFC 0053), and the delivered run's `run.started.causationId` == the delivery id (§C / RFC 0040); both `trigger.*` events content-free; the always-on shape stays in `trigger-bridge-shape.test.ts`; new lib helper `src/lib/triggerBridge.ts`). 2026-05-31 (RFC 0087 — agent org-chart, the Active→Accepted behavioral gate) added two capability-gated behavioral scenarios (both gated on `agents.orgChart.supported`, black-box on the normative `/v1/agents/org-chart` surface — no new POST seam): `agent-org-chart-scoping.test.ts` (the `GET /v1/agents/org-chart` tree-shape — departments form an acyclic `parentDepartmentId` tree, members reference `host:<id>` roster entries — + the §D responsibility roll-up via `GET /v1/agents/org-chart/{departmentId}` with a deduped `responsibilities[]` union + the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID`) and `org-position-no-authority-escalation.test.ts` (the behavioral leg of the protocol-tier invariant — the live org-chart wire carries NO authority-bearing field on any member/department/responsibility-view object; the structural leg stays always-on in `agent-org-chart-shape.test.ts`, and the deeper RFC 0049/0051 authority-invariance legs stay reference-impl tier per the `agent-manifest-runtime` no-host-hook precedent). 
2026-05-31 (RFCs 0086 + 0077 — the Active→Accepted behavioral gate) added four capability-gated behavioral scenarios so a non-steward host can be mechanically certified non-vacuously under `OPENWOP_REQUIRE_BEHAVIOR=true`: `agent-roster-attribution.test.ts` (RFC 0086 §B/§C; gated on `agents.roster.supported` — the normative `GET /v1/agents/roster` read shape + `total==roster.length`, the §C `roster.run.initiated`-before-`agent.invocation.started` ordering, the content-free payload backing `roster-attribution-no-content`, the durable work-item `triggerSubscriptionId`, and the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ROSTER_ID`), `agent-live-invocation-bracket.test.ts` (RFC 0077 §E; gated on `agents.liveRuntime.supported` — `agent.invocation.started`-first / `agent.invocation.completed`-last bracket, matching `invocationId`, `source`/`outcome` closed enums, content-free), `agent-live-structured-output.test.ts` (RFC 0077 §B step 6; gated on `agents.liveRuntime.structuredOutput` — a result violating `handoff.returnSchemaRef` fails the invocation `outcome:"failed"` rather than shipping as completed), and `agent-live-allowlist-enforced.test.ts` (RFC 0077 §F-1 / RFC 0002 §A14; gated on `agents.liveRuntime.supported` — a tool outside `toolAllowlist` is not callable); all four drive the documented `POST /v1/host/sample/roster/fire` + `POST /v1/host/sample/agents/live-invoke` seams plus the test event-log seam and soft-skip on 404 (these are the RFC 0086 / 0077 Active→Accepted bars). 
2026-05-30 (RFC 0087 — agent org-chart, Draft -> Active) added `agent-org-chart-shape.test.ts` (always-on server-free: the `capabilities.agents.orgChart` shape + the `AgentOrgChart` round-trip + the non-`host:` member negative + the **§B structural non-authority guarantee** — the schema rejects a `scopes`/`canDispatch`/`permissions`/`authority` field on a member (`additionalProperties:false`), and a member's key set is exactly `{rosterId, departmentId, roleId, reportsTo}` — backing the protocol-tier `org-position-no-authority-escalation` invariant; no new RunEventType). 2026-05-30 (RFC 0086 — standing agent roster, Draft -> Active) added `agent-roster-shape.test.ts` (always-on server-free: the `capabilities.agents.roster` shape + the `AgentRosterEntry` round-trip + the `host:` `rosterId` + `agentRef` version-XOR-channel negatives + the content-free `roster.run.initiated` negatives backing the protocol-tier `roster-attribution-no-content` invariant + the additive `roster` inventory projection + RunEventType-enum membership). 2026-05-30 (RFC 0082 — agent deployment lifecycle, Draft -> Active) added `agent-deployment-shape.test.ts` (always-on server-free: the `capabilities.agents.deployment` shape + the `AgentDeployment` record round-trip + the `AgentRef` `channel` XOR `version` `not`-clause + the four `deployment.*` payloads + the content-free negatives backing the protocol-tier `deployment-event-no-content-leak` invariant). 2026-05-30 (RFC 0085 — `openwop-agent-platform` meta-profile, Draft -> Active) added `agent-platform-profile.test.ts` (always-on server-free derivation of the operational-annex `none`/`partial`/`full` status: all-floor ⇒ partial, missing-flag ⇒ none, the replay-OR-`nondeterminismPolicy.declared` term, floor+governance ⇒ full, missing-tenant-scope ⇒ partial-not-full per the honest-advertisement rule, eval/deploy/budget-are-advisory-not-hard-terms, + the `capabilities.nondeterminismPolicy.declared` shape). 
2026-05-30 (RFC 0084 — budget, quota + cost policy, Draft -> Active) added `budget-policy-shape.test.ts` (always-on server-free: `budget-policy.schema.json` round-trip + the §A orthogonality guard — a wall-time field is rejected (it's RFC 0058's `runTimeoutMs`) — + threshold/onExhaustion negatives + the four content-free `budget.{reserved,consumed,threshold.crossed,exhausted}` payloads + the four `cap.breached{budget-*}` kinds + RunEventType-enum membership + the no-pricing-property structural check backing the protocol-tier `budget-no-pricing-leak` invariant + the `capabilities.budget`/`limits.maxBudget*` shape). 2026-05-30 (RFC 0083 — durable trigger + channel bridge, Draft -> Active) added `trigger-bridge-shape.test.ts` (always-on server-free: `trigger-subscription.schema.json` round-trip + missing-`state`/out-of-enum-`source`/unknown-property negatives + the four-state vocab + the two content-free `trigger.{subscription.state.changed,delivery.attempted}` payloads incl. closed `state`/`outcome` enums + RunEventType-enum membership + the `triggerBridge`/`webhooks.durable` capability shape + the `openwop-trigger-bridge` profile derivation incl. the no-dead-letter-sink negative). 2026-05-30 (RFC 0079 — credential provenance + egress policy, Draft -> Active) added `egress-provenance-shape.test.ts` (always-on server-free: `credential-provenance.schema.json` round-trip + `audiences:[]`/missing-`credentialId`/unknown-property negatives + the no-secret-property structural check backing the protocol-tier `egress-decision-no-secret-leak` invariant + the content-free `egress.decided` record incl. the `decision` enum + RunEventType-enum membership + the `httpClient.egressPolicy` shape; the behavioral `egress-credential-audience-bound` confused-deputy MUST is reference-impl tier, deferred to a host). 
2026-05-30 (RFC 0078 — portable tool catalog, Draft -> Active) added `tool-descriptor-shape.test.ts` (always-on server-free: `tool-descriptor.schema.json` round-trip + the §C-1 `exec` ⇒ `host-extension` cross-field MUST (RFC 0069) + the `safetyTier`-required negative + `additionalProperties:false`, the `capabilities.toolCatalog` `supported`/`sources`/`sessionLifecycle` shape, and the two content-free `tool.session.{opened,closed}` payload $defs incl. the closed `outcome` enum + RunEventType-enum membership). 2026-05-30 (RFC 0080 — agent memory capability reconciliation, Draft -> Active) added `memory-capability-model-shape.test.ts` (always-on server-free: the additive `capabilities.memory.{writable,search,retention}` dimension shapes + malformed-instance negatives — `retention.ttl` non-boolean, out-of-enum `search.modes`, unknown property under `additionalProperties:false` — the `agent-inventory-response` `memoryDegraded`/`degradedMemoryDimensions` closed-enum fields, and the `openwop-memory` derivation surfacing for read/write + long-term hosts while withholding from `writable:false`). 2026-05-30 (RFC 0081 — agent evaluation, Draft -> Active) added `agent-eval-suite-shape.test.ts` (always-on server-free: the `capabilities.agents.evalSuite` shape + the `AgentEvalSuite`/`EvalSummary` schema round-trips + the three `eval.{started,scored,completed}` payloads + the content-free negatives — a task entry with a `taskOutput` body, a `safetyFinding` with an `excerpt` — backing the new `eval-summary-no-content-leak` SECURITY invariant). 
2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch` live-run audit) added `safefetch-live-audit.test.ts` (`behaviorGate('openwop-safefetch-live-audit', …)`, gated on `httpClient.safeFetch` + `toolHooks.prePostEvents`) — asserts the audit-when-both MUST against the **durable run event log** via the new `POST /v1/host/sample/http/safe-fetch-run` open seam + the test event-log seam, closing the seam-vs-production gap (a production `createSafeFetch()` with no audit hooks passes the inline `safefetch-behavior.test.ts` but FAILS this under `OPENWOP_REQUIRE_BEHAVIOR=true`); this is the RFC 0076 §B → Accepted bar; run seam soft-skips on 404 (host-pending). 2026-05-29 (RFC 0066 — `x-openwop-form` picker UX hints, Draft → Active) added `x-openwop-form-pack-manifest.test.ts` (always-on server-free: an annotated `configSchema` stays a valid 2020-12 schema + the advisory hints don't change what it accepts, each §A annotation matches the shape, an unknown `kind` validates for forward-compat, 3 negatives — missing/non-string `kind`, non-string `dependsOn`). 2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch`) added `safefetch-behavior.test.ts` (seam-gated: SSRF block / DNS-rebinding / `Connection: upgrade` refusal / tool-hooks audit-when-both, via `POST /v1/host/sample/http/safe-fetch`; advertisement contract stays in `http-client-ssrf.test.ts`). 2026-05-29 (RFC 0076 §A — pack `runtime.requires[]` install gate) added two: `runtime-requires-shape.test.ts` (server-free closed-vocabulary validation — the 8 tokens validate, a raw builtin name is rejected, empty-array≡omission, `uniqueItems`) + `runtime-requires-install-gate.test.ts` (seam-gated install-grant / install-refuse → `pack_runtime_requirement_unmet` / non-sandbox SHOULD-projection, soft-skip on 404 via `POST /v1/host/sample/packs/install-gate`). 
2026-05-29 (RFC 0047 — `host.oauth` authorization-code roundtrip) added `oauth-authorization-code-roundtrip.test.ts` — capability-gated on `capabilities.oauth.supported` + `grants` including `authorization_code`; drives the `POST /v1/host/sample/oauth/authorize-code-roundtrip` seam against the one canonical synthetic provider in `fixtures/oauth-providers/synthetic.json` (soft-skip on 404, Tier-2 host-pending), asserting a successful grant returns a credential REFERENCE (token persisted as a `host.credentials` entry) and that the authorization code / state / PKCE verifier / acquired access+refresh tokens never appear on any run-visible surface (RFC 0047 §C + §C.2 / `credential-payload-redaction`). Closes the RFC 0047 Tier-2 gap (capability-shape + redaction scenarios existed; the actual authorization-code dance was unexercised). 2026-05-26 (RFC 0070 — agent-manifest runtime) added `agent-manifest-runtime.test.ts`; 2026-05-26 (RFC 0071 — artifact-type + chat card packs) added six: `artifact-type-pack-manifest-validation.test.ts` + `artifact-schema-compile-bounded.test.ts` (server-free) + `artifact-type-pack-install.test.ts` + `artifact-type-store-without-render.test.ts` + `chat-card-pack-manifest-validation.test.ts` (server-free) + `chat-card-pack-execution.test.ts` (capability-gated, host-pending). 
2026-05-26 (RFCs 0067 / 0068 / 0069 — spec-gap Draft cohort) added five scenarios: `byok-auth-modes.test.ts` (RFC 0067; always-on schema-shape of `aiProviders.authModes` + a discovery-gated §B auth-mode-contract cross-field check), `memory-consolidation-shape.test.ts` (RFC 0068; always-on shape of `agents.memoryConsolidation`/`agents.commitments` + the `agent.memory.consolidated`/`commitment.fired` payload $defs), `memory-consolidation-idempotent.test.ts` + `commitment-fired.test.ts` (RFC 0068; capability-gated behavioral, soft-skip on the documented `/v1/host/sample/memory/consolidate` + `/commitment/fire` seams), and `exec-not-protocol-tier.test.ts` (RFC 0069; always-on server-free structural assertion that the protocol corpus defines no `core.*`/`openwop.*` exec-class primitive — backs the `exec-must-not-be-protocol-tier` SECURITY invariant). 2026-05-25 (RFC 0061 — stateful agent-loop lifecycle, executionModel.version 5) added four `agent-loop-*.test.ts` scenarios: `-version5-shape` (always-on; validates `executionModel.statefulResume`/`transcriptWindow` + the 1–5 version ceiling) plus `-iteration-monotonic` (gated on `version >= 5`; `runOrchestrator.decided.iteration` increments 1,2,3… exactly once per turn), `-workspace-snapshot` (gated additionally on `host.workspace.supported`; a turn-i workspace write is invisible to turn i, visible to turn i+1), and `-stateful-resume` (gated on `statefulResume`; a mid-loop suspend resumes at the same iteration without resetting the counter) — the three behavioral scenarios drive the documented agent-loop seam (`POST /v1/host/sample/agentloop/run`) and soft-skip until a host wires it. 
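The `-workspace-snapshot` rule noted above (a turn-i workspace write is invisible to turn i, visible to turn i+1) can be pictured with a small in-memory model. This is only a sketch: `SnapshotWorkspace`, `beginTurn`, and `endTurn` are hypothetical names invented for illustration, and they are not the host's workspace API or the `POST /v1/host/sample/agentloop/run` seam.

```typescript
// Hypothetical in-memory model of per-turn snapshot isolation.
// None of these names come from the openwop spec or reference host.
class SnapshotWorkspace {
  private committed = new Map<string, string>();
  private snapshot = new Map<string, string>();
  private pending = new Map<string, string>();

  // Freeze the committed state as the read view for the new turn.
  beginTurn(): void {
    this.snapshot = new Map(this.committed);
    this.pending = new Map();
  }

  // Reads only ever see the snapshot taken at turn start.
  read(path: string): string | undefined {
    return this.snapshot.get(path);
  }

  // Writes are buffered and stay invisible until the turn ends.
  write(path: string, content: string): void {
    this.pending.set(path, content);
  }

  // Publish buffered writes so the next turn's snapshot includes them.
  endTurn(): void {
    this.pending.forEach((content, path) => this.committed.set(path, content));
    this.pending = new Map();
  }
}

const ws = new SnapshotWorkspace();
ws.beginTurn();                                 // turn 1
ws.write("notes.md", "draft");
const seenInSameTurn = ws.read("notes.md");     // still the turn-1 snapshot
ws.endTurn();
ws.beginTurn();                                 // turn 2
const seenNextTurn = ws.read("notes.md");       // now includes the turn-1 write
```

Buffering writes until the end of the turn is one simple way to get the visibility rule; a real host may implement the same observable behavior differently.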
2026-05-25 (RFC 0059 — host.workspace M2, reference-host enforcement) added two `workspace-*.test.ts` scenarios: `-behavior` (capability-gated CRUD round-trip / `If-Match` 409 `workspace_conflict` / `workspace_too_large` / §D run-start snapshot, all via the real `/v1/host/workspace/files` §C endpoints) and `-cross-tenant-isolation` (WCT-1 — drives the documented `POST /v1/host/sample/workspace/op` seam to assert a file owned by one `{tenant, workspace}` is unreadable, on both `get` and `list`, under a different owner; backs the new `workspace-cross-tenant-isolation` SECURITY invariant). The in-memory reference host now advertises `capabilities.workspace.supported` and honors §C/§D/§E end-to-end. 2026-05-25 (RFC 0062 — memory.distillation "dreams") added five `distillation-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.memory.distillation` block + the additive `distillation` sub-object on `memory.compacted`) plus `-token-budget` (within budget `tokensUsed ≤ tokenBudget`; an un-meetable budget → `token_budget_exceeded` with no partial archive), `-stable-archive` (same sources + budget ⇒ byte-stable archive checksum), `-index-roundtrip` (gated additionally on `indexEmitted`; the `MEMORY-INDEX.json` workspace file is retrievable + `workspace.updated` fired), and `-secret-carryforward` (SR-1: a redacted source secret never appears in the archive) — the four behavioral scenarios drive the documented memory-distillation seam (`POST /v1/host/sample/memory/distill`) and soft-skip until a host wires it. 
2026-05-25 (RFC 0063 — core.subWorkflow.outputAttestation) added four `subrun-*.test.ts` scenarios: `-attestation-shape` (always-on; validates the `capabilities.agents.subRunAttestation` flag) plus `-checksum-stable` (the child output checksum is the byte-stable, key-order-invariant RFC 8785 JCS + SHA-256 digest), `-approval-gate` (`requireApproval` → `accept` merges, `reject` does not), and `-approval-fail-closed` (no `accept`/`edit-accept` → no merge; backs the deferred `subrun-merge-approval-fail-closed` invariant) — the three behavioral scenarios drive the documented sub-run attestation seam (`POST /v1/host/sample/subrun/attest`) and soft-skip until a host wires it. 2026-05-25 (RFC 0064 — host.toolHooks) added five `tool-hooks-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.toolHooks` block + the optional content-free fields on `agentToolCalled` / `agentToolReturned`) plus `-content-free` (gated on `prePostEvents`), `-authorization-fail-closed` (gated on `perToolAuthorization`), `-rate-limit` (gated on `perToolRateLimit`), and `-secret-redaction` (gated on `prePostEvents` + the SR-1 `argsHash` redaction rule) — the four behavioral scenarios drive the documented tool-hooks invoke seam (`POST /v1/host/sample/toolhooks/invoke`) and soft-skip until a host wires it. 2026-05-25 (RFC 0060 — host.heartbeat) added four `heartbeat-*.test.ts` scenarios: `-capability-shape` (always-on; validates the `capabilities.heartbeat` block) plus `-fires-once-per-tick`, `-idempotent-no-spam`, and `-runtime-bound` (gated on `capabilities.heartbeat.supported` + the host heartbeat tick seam; soft-skip until a host wires it). 
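To make the `-checksum-stable` property concrete (a byte-stable, key-order-invariant digest), here is a minimal sketch of canonicalize-then-hash. It is an approximation under stated assumptions: `canonicalize` sorts object keys and relies on `JSON.stringify` for scalars, and it has not been checked against the RFC 8785 test vectors, so treat it as an illustration of the idea rather than a conforming JCS implementation. The function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Simplified canonical form: sorted object keys, no insignificant whitespace.
// Assumption: this approximates RFC 8785 JCS; it is not a verified implementation.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(canonicalize).join(",") + "]";
  const keys = Object.keys(value).sort();
  return (
    "{" +
    keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k] as Json)).join(",") +
    "}"
  );
}

// SHA-256 over the UTF-8 bytes of the canonical form, hex-encoded.
function outputChecksum(value: Json): string {
  return createHash("sha256").update(canonicalize(value), "utf8").digest("hex");
}

// Same logical output, different key order.
const checksumA = outputChecksum({ result: "ok", items: [1, 2], meta: { b: true, a: null } });
const checksumB = outputChecksum({ meta: { a: null, b: true }, items: [1, 2], result: "ok" });
// A genuinely different output.
const checksumC = outputChecksum({ result: "ok", items: [2, 1], meta: { b: true, a: null } });
```

Key order does not affect the digest, while array order does, because arrays are ordered data and are serialized in place.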
2026-05-25 (RFC 0057 — memory write-attribution) added five `memory-attribution-*.test.ts` scenarios: `-shape` (always-on advertisement check on `capabilities.memory.attribution`), plus `-no-content`, `-tenant-scoped`, `-emits-on-write`, and `-replay-stable` (gated on `capabilities.memory.attribution.emitsWriteEvents`) verifying the content-free `memory.written` RunEvent, its two SECURITY invariants (`memory-attribution-no-content` + `memory-attribution-tenant-scoped`), and the §D replay rule that a `replay`-mode fork MUST NOT regenerate `memoryId`. 2026-05-25 (RFC 0025 §C point 1 — test-catalog isolation invariant; pairs with the 25 publish-error scenarios in `pack-registry-publish.test.ts`) added `pack-registry-isolation.test.ts` — capability-gated on `capabilities.packs.testMode.{supported, isolated}: true`; PUTs a disposable pack into `/v1/packs-test/{name}` and asserts the same `(name, version)` does NOT appear via `GET /v1/packs/{name}` — anchors the test-catalog isolation MUST in RFC 0025 §C. 2026-05-25 (RFC 0028 Tier-2 post-promotion T2 — read-side sister scenario for workspace-membership enforcement) added `prompt-read-workspace-membership-enforced.test.ts` — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` so read-only hosts that expose `?workspaceId=` are also probed); drives `GET /v1/prompts?workspaceId=<random-non-member>` and interprets the response: 4xx PASS (canonical envelope check on 403); 200 with empty `templates[]` PASS (correct null result for a nonexistent workspace); 200 with non-empty `templates[]` FAIL (cross-tenant leak); 200 without `templates[]` field SKIP (host doesn't expose workspace-scoped reads). Verifies SECURITY invariant `prompt-read-workspace-membership-enforced`. Same-day T1 strengthened `prompt-mutation-workspace-membership-enforced.test.ts` to pin `error === "workspace_membership_required"` when the host's refusal status is 403 (other refusal codes unconstrained). 
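The four-way interpretation of the `GET /v1/prompts?workspaceId=<random-non-member>` response can be written as a small decision function. This is a sketch of the rule as stated above, not the scenario's actual code; `classifyWorkspaceRead` and `ProbeResponse` are hypothetical names, and the handling of statuses the note does not enumerate (treated here as `skip`) is an assumption.

```typescript
type ProbeVerdict = "pass" | "fail" | "skip";

interface ProbeResponse {
  status: number;
  body?: { templates?: unknown };
}

// Classify the host's answer to a read against a random non-member workspace.
function classifyWorkspaceRead(res: ProbeResponse): ProbeVerdict {
  // Any 4xx is an acceptable refusal.
  if (res.status >= 400 && res.status < 500) return "pass";
  if (res.status === 200) {
    const templates = res.body?.templates;
    // No templates[] field: the host does not expose workspace-scoped reads.
    if (!Array.isArray(templates)) return "skip";
    // Empty list is a correct null result; a non-empty list is a cross-tenant leak.
    return templates.length === 0 ? "pass" : "fail";
  }
  // Assumption: statuses not covered by the note are not judged here.
  return "skip";
}

const refused = classifyWorkspaceRead({ status: 403 });
const emptyList = classifyWorkspaceRead({ status: 200, body: { templates: [] } });
const leaked = classifyWorkspaceRead({ status: 200, body: { templates: [{ id: "t1" }] } });
const unscoped = classifyWorkspaceRead({ status: 200, body: {} });
```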
2026-05-25 (RFC 0028 Tier-2 follow-up — workspace-membership enforcement on mutating prompt endpoints, filed in response to a self-disclosed adopter vulnerability) added `prompt-mutation-workspace-membership-enforced.test.ts` — capability-gated on `capabilities.prompts.mutableLibrary: true`; drives `POST /v1/prompts` with a cryptographically-random non-member `workspaceId` and asserts the host refuses (NOT a 2xx; any 4xx/5xx is acceptable — silent success is the failure mode). Verifies SECURITY invariant `prompt-mutation-workspace-membership-enforced`. 2026-05-22 (RFC 0034 §B follow-up — secret-leakage harness against the OTel + debug-bundle seams) added `secret-leakage-otel-attribute.test.ts` — gates on `capabilities.secrets.supported` + `capabilities.observability.testSeams.{otelScrape,debugBundleExport}` AND the `OPENWOP_CANARY_SECRET_VALUE` env (host operator + conformance runner agree on the canary). Drives the existing `openwop-smoke-byok-roundtrip` fixture end-to-end; scrapes both seams after run completion; hard-fails if the canary plaintext appears in any OTel span attribute or debug-bundle field. Verifies SECURITY invariants `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel`. 2026-05-22 (RFC 0041 Phase 4 — replay determinism under nondeterministic models) added three scenarios: `replay-divergence-at-refusal.test.ts` (advertisement-shape probe on `replayDeterminism.refusalDivergenceEmission` + 2 `it.todo` for the dual-direction refusal-divergence case), `replay-observable-sequence-determinism.test.ts` (capability-gated; behavioral assertion soft-skipped until a `conformance-phase4-nondet-tool` fixture ships), `replay-llm-cache-key-portable.test.ts` (intra-host reproducibility + non-recipe-field invariance + Phase 4 advertisement alignment — reuses the existing `POST /v1/host/sample/test/llm-cache-key` seam from the sibling `replay-llm-cache-key.test.ts`). 
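The canary check in `secret-leakage-otel-attribute.test.ts` amounts to searching every scraped field for the agreed plaintext. A minimal sketch of such a recursive scan follows; `containsCanary`, the span shapes, and the canary value are hypothetical and chosen for illustration, since the real seam payloads are not shown here.

```typescript
// Recursively look for the canary plaintext in any string value or object key.
function containsCanary(value: unknown, canary: string): boolean {
  if (typeof value === "string") return value.includes(canary);
  if (Array.isArray(value)) return value.some((item) => containsCanary(item, canary));
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    return Object.keys(record).some(
      (key) => key.includes(canary) || containsCanary(record[key], canary),
    );
  }
  return false; // numbers, booleans, null, undefined cannot carry the plaintext
}

// Hypothetical canary; in the scenario it comes from OPENWOP_CANARY_SECRET_VALUE.
const canary = "canary-7f3a";

// Hypothetical scraped spans: one redacted, one leaking the secret in an attribute.
const cleanSpan = {
  name: "ai.call",
  attributes: { "ai.model": "mock", "secret.ref": "[REDACTED:byok-key]" },
};
const leakySpan = {
  name: "http.request",
  attributes: { "http.header.authorization": "Bearer canary-7f3a" },
};

const cleanScrapeLeaks = containsCanary([cleanSpan], canary);
const leakyScrapeLeaks = containsCanary([cleanSpan, leakySpan], canary);
```

A substring match is deliberately blunt: it also catches the canary embedded inside a longer header or URL, which is the failure the invariant is meant to surface.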
2026-05-20 (RFC 0027 §A templateKinds-coverage follow-up — paired with `prompt-end-to-end-events.test.ts`) added `prompt-all-four-kinds-events.test.ts` exercising all four `PromptKind` values (`system`, `user`, `schema-hint`, `few-shot`) end-to-end through the reference workflow-engine sample's `local.sample.demo.mock-ai` dispatch path; capability-gated via `behaviorGate('prompts-supported', ...)`. Closes the credibility gap where the host advertised `templateKinds: ["system", "user", "few-shot", "schema-hint"]` but only the system+user pair was actually wired into dispatch. 2026-05-20 (RFCs 0030–0033 — envelope LLM-contract-hardening track) added 15 scenarios across four `Active` RFCs: `envelope-reasoning-shape.test.ts` (RFC 0030, always-on; asserts the OPTIONAL `reasoning` property on the three universal-kind schemas + the `schema.response` deliberate omission), `envelope-reasoning-secret-redaction.test.ts` (RFC 0030, capability-gated on `capabilities.envelopes.reasoning.supported` + `secrets.supported`; 5 `it.todo()` placeholders for SECURITY invariant `envelope-reasoning-secret-redaction`), `envelope-tier-one-subset-static.test.ts` (RFC 0030, always-on for load-bearing rules — no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere; gated on `tierOneSubsetCompliance: "strict"` for OpenAI-strict-only constraints), `envelope-variant-discriminator-static.test.ts` (RFC 0031, always-on; asserts no `oneOf` + every `anyOf` branch declares a single-string-enum discriminator in `required` on every `schemas/envelopes/*.schema.json`), `model-capability-substituted.test.ts` (RFC 0031, advertisement-shape probe on `capabilities.modelCapabilities.advertised[]` identifier pattern + 5 `it.todo()` placeholders for SECURITY invariant `model-capability-substituted-no-credential-disclosure`), `model-capability-insufficient.test.ts` (RFC 0031, 6 `it.todo()` placeholders for refusal + no-recursive-fallback), `node-module-required-capabilities-shape.test.ts` (RFC 0031 
SHOULD-tier authoring-convention; 4 `it.todo()` placeholders), and the six envelope-reliability events from RFC 0032 (`envelope-retry-attempted` carrying the shared advertisement-shape probe enforcing both MUST-tier events in `events[]` per RFC 0032 §C, plus `envelope-retry-exhausted`, `envelope-refusal-shape`, `envelope-truncated`, `envelope-nl-to-format-engaged`, `envelope-recovery-applied` — collectively 39 `it.todo()` placeholders covering retry/refusal/truncation/recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` and `envelope-recovery-no-content-leak`), plus RFC 0033's two scenarios (`envelope-completion-distinguishes-truncation.test.ts` + `envelope-truncation-cap-exhaustion.test.ts` — 12 `it.todo()` placeholders covering the truncation-vs-schema-violation retry-routing distinction + the DoS-bound assertion). Reference workflow-engine sample advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"` honestly (schemas accept the field; host doesn't yet inject the directive); the other three RFCs' capability blocks defer to reference-host emission code per the staged RFC 0027 §G precedent. 2026-05-20 (RFC 0028 §B Phase B — prompt-pack boot-time install) added `prompt-pack-install.test.ts` (capability-gated on `capabilities.prompts.endpointsSupported: true`; asserts a host that ran the boot-time pack loader surfaces ≥ 1 pack-source template under `GET /v1/prompts?source=pack` carrying the canonical `meta.source: "pack"` + `meta.packName` + `meta.packVersion` stamps; positively identifies the in-tree `vendor.openwop.prompt-sample` reference pack's `writer-system` template when present). Pairs with the new `host/promptPackLoader.ts` boot-time entry on the reference workflow-engine sample, which scans `examples/packs/*` plus `OPENWOP_PROMPT_PACKS_DIR` and calls `installPackTemplates()` for each `kind: "prompt"` pack found. 
2026-05-20 (RFC 0029 Phase C — prompt resolution chain wire shape) added three more scenarios: `prompt-resolution-chain-node-wins.test.ts` (capability-gated on `capabilities.prompts.supported: true`; asserts layer-1 node-config supersedes lower layers per `spec/v1/prompts.md` §"Resolution chain (normative)"), `prompt-resolution-chain-agent-intrinsic.test.ts` (additionally gated on `capabilities.prompts.agentBindings: true`; asserts agent intrinsic `systemPromptRef` wins over `promptOverrides` AND lower layers when the node has no layer-1 ref), `prompt-resolution-chain-fallback-cascade.test.ts` (asserts layer 3 workflow-defaults wins over layer 4 host-defaults; layer 4 host-defaults wins when 1-3 yield null; resolved is null when all four yield null but chain[] still lists every attempted layer). The scenarios drive the host's `POST /v1/host/sample/prompt/resolve` test seam (reference-host implementation deferred to follow-up slice per RFC 0021 staging precedent). 2026-05-20 (RFC 0027 Phase A — prompt templates wire shape) added three scenarios: `prompt-template-shape.test.ts` (always-on; Ajv compileability + positive/negative round-trip for PromptTemplate + PromptRef + PromptKind), `prompt-composed-secret-redaction.test.ts` (capability-gated on `capabilities.prompts.supported: true` + `observability: "full"`; asserts `[REDACTED:<secretId>]` markers in `prompt.composed` payloads for `source: "secret"` variable bindings per SECURITY/threat-model-secret-leakage.md §SR-1), `prompt-composed-trust-marker.test.ts` (same capability gates; asserts `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation per RFC 0020 §D). Paired with new `fixtures/prompt-templates/` sub-directory + per-fixture schema-validity describe block + future SECURITY invariants `prompt-composed-secret-redaction` and `prompt-composed-trust-marker` (lands alongside reference-host emission per RFC 0021 staging precedent). 
2026-05-18 (RFC 0022 `Draft` — runtime variable mapping) added four `it.todo()` placeholder scenarios covering the new mapping surfaces on `core.dispatch` (§A — `dispatch-input-mapping.test.ts`, `dispatch-output-mapping.test.ts`, `dispatch-cross-worker-handoff.test.ts`) and `core.subWorkflow` (§B — `subworkflow-input-mapping.test.ts`). Gated on `capabilities.agents.dispatchMapping` (dispatch trio) and `capabilities.subWorkflow.inputMapping` (subWorkflow). Promote to live assertions when RFC 0022 reaches `Active` + a reference host advertises the matching flags. 2026-05-17 (RFC 0003 §D handoff-schema enforcement, HV-1) added `agentPackHandoffSchemaValidation.test.ts` — verifies the host validates dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef` per RFC 0003 §D. Paired with the new `agent-pack-handoff-schema-enforcement` row in `SECURITY/invariants.yaml`. 2026-05-17 (AI Envelope gap-closure, DRAFT v1.x — `spec/v1/ai-envelope.md`) added 7 advertisement-shape scenarios with `it.todo()` behavioral placeholders gated on `capabilities.envelopeContracts.advertised: true`: `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`. Paired with the new `envelope-redaction-sr-1-carry-forward` row in `SECURITY/invariants.yaml`. 2026-05-17 (post-publish hardening, deep audit of `core.openwop.agents`) added `agents-run-tool-allowlist.test.ts` — server-free scenario locking in the `core.openwop.agents@1.0.1` safety-fix that closes `OPENWOP-AUDIT-2026-003` (function-typed `tool.handler` properties rejected at `validateTools()` with `INVALID_TOOL_DECLARATION`; tool-driven runs require `ctx.agentRuntime`; tool-less safe fallback preserved). Paired with the new `agents-run-no-raw-handler` row in `SECURITY/invariants.yaml`. 
Same-day post-publish hardening added `idempotency-key-determinism.test.ts` — server-free scenario locking in the `core.openwop.http@1.1.2` determinism safety-fix (default `composite` mode produces deterministic keys in `(runId, nodeId, payload)`; removed `uuid` mode rejects with `CONFIG_INVALID`; cross-impl vector test lets third-party reimplementations verify wire agreement). Paired with the new `idempotency-key-deterministic` row in `SECURITY/invariants.yaml`. 2026-05-17 (Phase 3 of RFC 0013) added three server-free scenarios exercising the reference workflow-chain expansion library (`conformance/src/lib/workflow-chain-expansion.ts`): `workflow-chain-expansion.test.ts` (parameter substitution + node id collision avoidance + edge rewriting + capability propagation + runtime-invariance contract), `workflow-chain-unresolvable-typeid.test.ts` (rejection with `chain_unresolvable_typeid` when a chain references an unknown typeId), and `workflow-chain-pack-signature-verification.test.ts` (Ed25519 verification recipe reuse from `node-packs.md §Signing`). Earlier that day (Phase 1) added `workflow-chain-pack-manifest-validation.test.ts` — server-free schema-validation scenario covering the new `workflow-chain-pack-manifest.schema.json` (positive sample + two negatives: kind/contents mismatch and invalid `chainId`). Closes RFC 0013 (`Workflow-chain packs`, `Draft`) Phases 1 + 3 alongside the new `spec/v1/workflow-chain-packs.md`, the `Capabilities.workflowChainPacks` block, and the registry build-index/conformance-check `kind` routing from Phase 2. Earlier that day, the suite added 27 `it.todo()` placeholder scenarios paired with RFCs 0014-0020 (host capability surfaces — fs, kvStorage, tableStorage, queueBus, sql/vector/search, blob/cache, mcp.serverMount). These promote to live assertions when each RFC reaches `Active` + the matching capability block lands in `schemas/capabilities.schema.json` + a reference host advertises the capability. 
Earlier additions include 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing SR-1 §D), `memory-compaction-event-emitted.test.ts` 
(canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). 2026-05-15 (gap-closure CF-3) added `interrupt-token-matrix.test.ts` (malformed / unknown / replay / cross-run-id paths on `GET|POST /v1/interrupts/{token}`). 2026-05-31 (RFC 0078 portable tool catalog + RFC 0079 credential provenance / egress policy — the Active→Accepted behavioral gate) added four: `tool-catalog-projection.test.ts` (capability-gated on `toolCatalog.supported` via `behaviorGate('openwop-tool-catalog', …)` — the NORMATIVE `GET /v1/tools` list with each `ToolDescriptor` schema-valid + `source`/`safetyTier` in the closed vocab + content-free, `GET /v1/tools/{toolId}` round-trip + unknown-id 404, 401-unauthenticated, and the §F-2 cross-principal non-disclosure; black-box, no POST seam), `tool-session-lifecycle.test.ts` (gated on `toolCatalog.sessionLifecycle` — the §D `tool.session.opened`-before / `tool.session.closed`-after bracket over the RFC 0064 call events via the `POST /v1/host/sample/tools/session-run` seam, one shared `sessionId`, content-free), `egress-audience-binding.test.ts` (KEYSTONE — gated on `httpClient.egressPolicy.supported`; the §C confused-deputy MUST via `POST /v1/host/sample/egress/decide`: an out-of-audience egress is denied/downgraded with the credential NOT attached, a provenance-unevaluable egress fails closed — the behavioral leg of `egress-credential-audience-bound`), and `egress-decision-content-free.test.ts` (the SR-1 canary — the credential value never surfaces in `egress.decided` and `reason` stays in the CLOSED vocabulary). 
The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
|
|
95
|
+
The current suite has 359 scenario files under `src/scenarios/`. 2026-06-23 (RFC 0107 — publishable declarative pack kinds) added one: `registry-declarative-kinds.test.ts` (always-on server-free — the registry version-manifest carries the `kind` discriminator + per-kind declarative payload, `runtime` is conditional on `kind` via `allOf if/then/else`, published artifact-type and connection manifests validate, and a declarative-with-`runtime` or node-without-`runtime` manifest is rejected). 2026-06-23 (RFC 0106 — real-time voice session profile, Phase 2) added one: `voice-event-payloads-shape.test.ts` (always-on server-free — all seven `voice.*` types are in the `RunEventType` enum, each has a payload `$def` via `typeIndex`, `voice.transcript` REQUIRES `contentTrust:"untrusted"` (the schema-enforced half of `voice-transcript-untrusted`), `voice.synthesis_chunk` is metadata-shaped (seq+mimeType required; bytes by `url`/`streamRef`), and the content-free events reject unknown properties). Phase 2 also lands the `voice.*` payload `$defs` in `run-event-payloads.schema.json` + the `asyncapi.yaml` messages + the four §F `SECURITY/invariants.yaml` rows (1 protocol-tier `voice-transcript-untrusted` + 3 reference-impl behavioral, graduating at `Active → Accepted`). The gated behavioral legs (`voice-transcription-streaming`/`-unadvertised`, `voice-synthesis-streaming` + the three behavioral invariant scenarios) land with the reference host at `Active → Accepted`. 
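The `kind`-conditional `runtime` rule exercised by `registry-declarative-kinds.test.ts` can be restated as a plain predicate. The sketch below is an assumption-laden paraphrase, not the schema: it takes `"node"` as the only runtime-bearing kind and treats every other kind as declarative, and the `VersionManifest` shape and function name are hypothetical.

```typescript
// Hypothetical, trimmed-down view of a registry version manifest.
interface VersionManifest {
  kind: string;
  runtime?: unknown;
}

// Returns null when the manifest satisfies the rule, else a reason string.
// Assumption: "node" requires runtime; any other kind is declarative and must omit it.
function runtimeRuleViolation(manifest: VersionManifest): string | null {
  const hasRuntime = manifest.runtime !== undefined;
  if (manifest.kind === "node") {
    return hasRuntime ? null : "node manifest requires runtime";
  }
  return hasRuntime ? "declarative manifest must not carry runtime" : null;
}

const nodeOk = runtimeRuleViolation({ kind: "node", runtime: { language: "js" } });
const nodeMissing = runtimeRuleViolation({ kind: "node" });
const declarativeOk = runtimeRuleViolation({ kind: "artifact-type" });
const declarativeBad = runtimeRuleViolation({ kind: "connection", runtime: {} });
```

In the schema itself this is expressed with `allOf` plus `if`/`then`/`else`; the predicate only mirrors the accept/reject outcomes described above.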
2026-06-23 (RFC 0106 — real-time voice session profile, Phase 1) added one: `aiproviders-realtimevoice-shape.test.ts` (always-on server-free — the `aiProviders.realtimeVoice` object advertisement: the four sub-flags `transcription`/`synthesis`/`turnDetection`/`bargeIn` are declared, `realtimeVoice` is absent from `aiProviders.required`, the `turnDetection`/`bargeIn ⇒ transcription` `dependentRequired` closure and the `synthesis ⇒ speechSynthesis` if/then closure hold via Ajv2020, and out-of-enum / boolean sub-flag values are rejected). The gated behavioral legs (`voice-transcription-streaming` / `-unadvertised`, `voice-synthesis-streaming`) and the four §F live-ingress invariant scenarios land in Phase 2 at the `Active → Accepted` cycle. 2026-06-22 (RFC 0101 — multi-party group conversation) added two — `multi-party-conversation-shape.test.ts` (always-on server-free — asserts the three additive RFC 0101 wire facts: a 3-agent council `conversation.opened` participant roster (instance ids) + a user opening turn validates, every `role: 'agent'` turn carrying a roster-instance `speakerId` validates while one OMITTING `speakerId` MUST FAIL schema validation (the `allOf`/`if role==='agent' then required:['speakerId']` conditional), a non-AgentRef participant item is rejected, the `capabilities.multiPartyConversation { supported, maxParticipants? 
}` block is declared + closed and rejects extras / `maxParticipants:1` / a missing `supported`, and the non-participant membership predicate is asserted server-free) and `multi-party-conversation-behavioral.test.ts` (capability-gated on `multiPartyConversation.supported` via `behaviorGate('openwop-multi-party-conversation', …)` — drives the conformance-only seam `POST /v1/host/sample/conversation/multi-party/{open,exchange}` (`host-sample-test-seams.md`, since RFC 0101 mints no normative client trigger to open a council): a 3-agent council opens and a roster-valid attributed agent turn is accepted, while a `role:'agent'` turn missing `speakerId`, a non-participant `speakerId`, and an over-`maxParticipants` open are each rejected with `error.code:'validation_error'` (status-tolerant 400/422 per RFC 0005 §E); soft-skips on 404/405 — reference impl: the postgres example host). 2026-06-20 (RFC 0105 — speech synthesis adapter) added three: `aiproviders-speechsynth-shape.test.ts` (always-on server-free — the `aiProviders.speechSynthesis` `const "supported"` flag is declared, absent from `aiProviders.required`, and a string-const advertisement validates while the object form `{supported:true}` is rejected), `speech-synthesis-roundtrip.test.ts` (gated on `aiProviders.speechSynthesis === "supported"` via `behaviorGate('openwop-speech-synthesis', …)` — an advertising host's `callSpeechSynthesizer` round-trip via the `POST /v1/host/sample/ai/call-speech-synthesizer` seam returns `audio` with EXACTLY ONE of `url`/`base64`, a non-empty `mimeType`, and the echoed `voiceId`; soft-skips on 404), and `speech-synthesis-unadvertised.test.ts` (gated-by-absence via `behaviorGate('openwop-speech-synthesis-unadvertised', …)` — a host NOT advertising `speechSynthesis` MUST reject the call with `speech_synthesis_unsupported`, never a silent no-op). 
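The three negative cases of the multi-party behavioral leg (missing `speakerId`, non-participant `speakerId`, roster over `maxParticipants`) reduce to two checks. The following is a sketch under assumptions: participants are simplified to instance-id strings rather than full AgentRef items, and `checkOpen`, `checkTurn`, and `isExpectedRejection` are hypothetical helpers, not functions from `src/lib/multiPartyConversation.ts`.

```typescript
type Rejected = { ok: false; code: "validation_error"; reason: string };
type CheckResult = { ok: true } | Rejected;

interface MultiPartyCapability { supported: boolean; maxParticipants?: number }
interface ConversationTurn { role: "user" | "agent"; speakerId?: string; text: string }

const rejectWith = (reason: string): Rejected => ({ ok: false, code: "validation_error", reason });

// An open is refused when the roster exceeds the advertised cap (if any).
function checkOpen(roster: string[], cap: MultiPartyCapability): CheckResult {
  if (cap.maxParticipants !== undefined && roster.length > cap.maxParticipants) {
    return rejectWith("roster exceeds maxParticipants");
  }
  return { ok: true };
}

// An agent turn must be attributed, and attributed to a roster member.
function checkTurn(turn: ConversationTurn, roster: string[]): CheckResult {
  if (turn.role !== "agent") return { ok: true };
  if (turn.speakerId === undefined) return rejectWith("agent turn missing speakerId");
  if (roster.indexOf(turn.speakerId) === -1) return rejectWith("speakerId not in roster");
  return { ok: true };
}

// The rejection code is pinned; the HTTP status may be 400 or 422.
function isExpectedRejection(status: number, code: string): boolean {
  return (status === 400 || status === 422) && code === "validation_error";
}

const roster = ["agent-a", "agent-b", "agent-c"];
const cap: MultiPartyCapability = { supported: true, maxParticipants: 3 };

const openOk = checkOpen(roster, cap);
const openTooBig = checkOpen(roster.concat("agent-d"), cap);
const turnOk = checkTurn({ role: "agent", speakerId: "agent-b", text: "hi" }, roster);
const turnUnattributed = checkTurn({ role: "agent", text: "hi" }, roster);
const turnOutsider = checkTurn({ role: "agent", speakerId: "agent-z", text: "hi" }, roster);
```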
2026-06-19 (RFC 0104 — portable HITL approver routing) added one: `interrupt-approver-routing.test.ts` (server-free Ajv2020 — the `interrupt.approverRouting` capability block shape, the additive optionality of `approverGroupRefs` / `approverRoleRefs` / `audience` on the `ApprovalData` schema, the closed `audience` object, and the §"Portable approver routing" `notifyTargets` reference rule that `audience` DEFAULTS to the resolved eligibility union when omitted and OVERRIDES it when present; the capability-gated leg asserts an advertising host's `interrupt.approverRouting` is honest — `refKinds` ⊆ {group, role}, `audience` boolean — and soft-skips when the host does not advertise the capability). 2026-06-17 (RFC 0103 — localized content surface) added one: `localized-content-delivery.test.ts` (server-free Ajv2020 — the four content schemas + the §C `resolveSection` merge + §A capability coherence; the public test for `content-published-cache-no-draft` / `content-response-tenant-scoped` / `content-no-cross-tenant-enumeration`; the live legs gate on `capabilities.content.supported` and soft-skip without a `GET /v1/content/pages/{slug}` target). 
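The `notifyTargets` reference rule for approver routing (default to the resolved eligibility union, override when `audience` is present) is easy to sketch. The shapes below are assumptions made for illustration: the real `audience` object is closed but its fields are not listed here, so `audience.principals` is hypothetical, as are the `Directory` lookup tables and function names.

```typescript
// Hypothetical routing fields; only the three property names come from the note above.
interface ApprovalRouting {
  approverGroupRefs?: string[];
  approverRoleRefs?: string[];
  audience?: { principals: string[] }; // assumed shape
}

// Hypothetical resolution tables: ref -> principal ids.
type Directory = Record<string, string[]>;

// Everyone eligible to approve: union of resolved groups and roles, de-duplicated.
function eligibilityUnion(r: ApprovalRouting, groups: Directory, roles: Directory): string[] {
  const out = new Set<string>();
  for (const ref of r.approverGroupRefs ?? []) {
    for (const p of groups[ref] ?? []) out.add(p);
  }
  for (const ref of r.approverRoleRefs ?? []) {
    for (const p of roles[ref] ?? []) out.add(p);
  }
  return Array.from(out).sort();
}

// audience overrides the union when present; otherwise the union is the default.
function notifyTargets(r: ApprovalRouting, groups: Directory, roles: Directory): string[] {
  if (r.audience !== undefined) return r.audience.principals.slice().sort();
  return eligibilityUnion(r, groups, roles);
}

const groups: Directory = { finance: ["ana", "bo"] };
const roles: Directory = { approver: ["bo", "cy"] };

const defaulted = notifyTargets(
  { approverGroupRefs: ["finance"], approverRoleRefs: ["approver"] }, groups, roles);
const overridden = notifyTargets(
  { approverGroupRefs: ["finance"], audience: { principals: ["dee"] } }, groups, roles);
```

Note that the override only changes who is notified in this sketch; who is eligible to approve is still the union.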
2026-06-15 (RFC 0102 — A2UI agent-authored interface surfaces) added five: `a2ui-surface-shape.test.ts` (server-free Ajv2020 — the closed core `ui.a2ui-surface` payload validates, while an out-of-catalog component / extra script-bearing property / unenumerated `catalogVersion` / `action.target` outside `enum["resume","exchange"]` each fail; the structural half of `a2ui-action-confinement` and the enabling precondition for the render-side `a2ui-surface-no-code-exec` / `a2ui-surface-no-network-egress` reference-app probes), `a2ui-surface-degrades.test.ts` (the kind is optional/advertised, not a MUST-recognize universal kind, and an unadvertised kind is gated — N6 — never crashing the run), `a2ui-surface-version-refusal.test.ts` (the enumerated `catalogVersion` rejects an unadvertised version → `unknown_schema_version`, and the surface schema carries no external `$ref`), `a2ui-surface-replay.test.ts` (all `$ref`s internal so a stored surface `:fork`/replays deterministically; same-`correlationId` + divergent `type` → `envelope_correlation_conflict`), and `a2ui-untrusted-blocks-approval.test.ts` (a `meta.contentTrust:'untrusted'` surface is trust-gated and MUST NOT advance an approval interrupt — composition of `untrusted_content_blocks_approval`). The four behavioral legs soft-skip on 404 (host-pending; the `openwop-app` reference renderer is the render-side probe). 2026-06-14 (RFCs 0099/0100 — external-event trigger ingestion + async/durable A2A tasks) added one new scenario (`trigger-ingestion.test.ts`) and extended `a2a-task-roundtrip.test.ts` with the RFC 0100 async subtests: `trigger-ingestion.test.ts` (RFC 0099 — always-on `TriggerEvent` / `TriggerSubscriptionRegistration` schema legs incl. 
the §F.1 per-source one-of, the `AttachmentRef.ref`-only rule + raw-URL rejection backing `trigger-ingestion-ssrf`, and the content-free `trigger.delivery.attempted` shape backing `trigger-ingestion-content-redaction`; plus a capability-gated behavioral leg on `triggerBridge.ingestion` driving the `POST /v1/host/sample/trigger-bridge/ingest` seam for SSRF refusal + header-redaction), and the `a2a-task-roundtrip.test.ts` additions (RFC 0100 — always-on `A2ATaskState` + `capabilities.a2a` shape legs incl. the lowercase-hyphen state enum, the `PushConfig` `url`-required + truncated-`tokenFingerprint` rule backing `a2a-push-egress-ssrf`, the no-inline-inputs `additionalProperties:false` SR-1 check; plus capability-gated durable-`tasks/get`-after-disconnect and push-SSRF behavioral legs on `a2a.durableTasks` / `a2a.pushNotifications` via the `/v1/host/sample/a2a/tasks/*` seam). 2026-06-13 (RFCs 0096/0097/0098 — reviewable learning, standing goals, agent-platform portability) added three always-on-plus-gated scenarios: `proposal-reviewable-learning.test.ts` (RFC 0096 — the `agents.proposals` shape + the `Proposal` round-trip incl. the dropped `rule` kind + the content-free `proposal.{created,activated}` events, plus a gated apply-without-scope→403 leg; backs `proposal-inert-until-applied` + `proposal-no-resynthesis`), `goal-standing-continuation.test.ts` (RFC 0097 — the `agents.goals` shape + the `Goal` round-trip + the content-free `goal.{evaluated,closed}` events, plus gated bounded-termination→422 + judge-only-completion legs; backs `goal-continuation-bounded` + `goal-completion-judge-only`), and `export-bundle-portability.test.ts` (RFC 0098 — the `portability` shape incl. the `import⇒dryRun` if/then + the `ExportBundle` round-trip rejecting every credential-named field + the content-free `import.applied` event, plus a gated literal-credential-import→422 leg; backs `export-bundle-no-credential-material`). 
2026-06-11 (RFCs 0093/0094 — protocol hardening + wire-shape reconciliation) added five: `version-fold.test.ts` (the `version-negotiation.md` §`X-Force-Engine-Version` cross-version matrix through the previously-orphaned `conformance-version-fold` fixture — closes catalog gap F5; soft-skips when `Capabilities.testing.forceEngineVersionRange` is unadvertised), `stream-text-fixture.test.ts` (the `stream-modes.md` §`messages` fold through the deterministic `stream-text` mock provider + the previously-orphaned `conformance-stream-text` fixture — closes catalog gap F1), `i18n-negotiation.test.ts` (gated on `capabilities.i18n` via `behaviorGate('openwop-i18n', …)` — an unsupported or malformed `Accept-Language` never 400s, `Content-Language` reflects the locale actually used, and error `code` strings stay the canonical English tokens), `grpc-transport.test.ts` (gated on `capabilities.grpc` via `behaviorGate('openwop-grpc-transport', …)` — advertisement-shape only per `grpc-transport.md` §Field semantics: `service` MUST be `openwop.v1.Engine`, the `tls` enum, `grpcs?://` endpoint URIs, `supportedTransports` includes `grpc` when exposed, production claimants require `tls: "required"`; no gRPC dialing), and `webhook-tenant-isolation.test.ts` (RFC 0093 §A.3 — backs the new protocol-tier `webhook-cross-tenant-isolation` invariant; a two-tenant proof through the `/v1/host/sample/test/surface` seam plus black-box registration-surface scoping). `spec-corpus-validity.test.ts` also gained the RFC 0094 §A satisfiability probe: canonical `createRun` bodies MUST pass the composed request schema (closed via `unevaluatedProperties: false` at the composition site, never inside an `allOf` branch) and an undeclared property MUST fail. 
2026-06-07 (RFCs 0090/0091/0092 — verifier turn + convergence, multimodal perception input, agent capability requirements) added six: the always-on, server-free shape probes `agent-verifier-shape.test.ts`, `aiproviders-input-shape.test.ts`, `agent-requires-capabilities-shape.test.ts`, plus the capability-gated **behavioral** legs `agent-capability-degraded-projection.test.ts` (RFC 0092 §B — the `degraded[]` projection on `GET /v1/agents`, black-box, non-vacuous via `OPENWOP_DEGRADED_CAPABILITY_AGENT_ID`), `callai-multimodal.test.ts` (RFC 0091 §A/§B — advertised modality accepted / unadvertised → `unsupported_modality`, via the `POST /v1/host/sample/ai/call` seam), and `verifier-gating.test.ts` (RFC 0090 §B — a `fail` verdict blocks commit, via the `POST /v1/host/sample/agents/verify-run` seam). The three behavioral legs soft-skip by default and hard-fail under `OPENWOP_REQUIRE_BEHAVIOR=true` — the Active→Accepted reference-host proof for each RFC. 2026-06-02 (RFC 0082 §B — deployment channel resolve-and-pin, production-path coverage) added `agent-channel-dispatch.test.ts` (capability-gated on `agents.deployment.supported` + the seeded `conformance-agent-channel-dispatch` fixture + advertised `replay` mode via `behaviorGate('openwop-deployment-channel-dispatch', …)` — proves the §B pin from a REAL run graph, complementing `agent-deployment-lifecycle.test.ts` Leg 4's host-sample seam: a canonical `POST /v1/runs` of a node binding `agent.channel:"stable"` MUST record `resolvedChannel` + `resolvedAgentVersion` on `agent.invocation.started` (RFC 0077), a `:fork{mode:"replay"}` MUST re-read that recorded version, and the seam-guarded Leg 3 MOVES the channel then asserts a replay STILL carries the original pin — never re-resolving a moved channel; soft-skips by default, hard-fails under `OPENWOP_REQUIRE_BEHAVIOR=true` — the production-path proof of the §B contract). 
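
The §B resolve-and-pin rule that `agent-channel-dispatch.test.ts` exercises can be sketched as two small functions. This is a minimal illustration, not the host's or the suite's implementation: `InvocationPin`, `pinForNewRun`, `pinForReplay`, and the version strings are hypothetical; only `resolvedChannel` and `resolvedAgentVersion` come from the text above.

```typescript
// Hypothetical sketch of resolve-and-pin. Only the two field names are taken
// from the coverage notes; everything else is illustrative.
interface InvocationPin {
  resolvedChannel: string;
  resolvedAgentVersion: string;
}

// Fresh run: read the channel's current target and record it as a fact.
function pinForNewRun(
  channels: ReadonlyMap<string, string>,
  channel: string,
): InvocationPin {
  const version = channels.get(channel);
  if (version === undefined) {
    throw new Error(`unknown channel: ${channel}`);
  }
  return { resolvedChannel: channel, resolvedAgentVersion: version };
}

// Replay: reuse the recorded pin and never consult the channel table again,
// so a channel that has since moved cannot change what the replay runs.
function pinForReplay(recorded: InvocationPin): InvocationPin {
  return { ...recorded };
}
```

Moving the channel after the first run and then replaying should still yield the original version, which is the property Leg 3 is described as asserting.
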
2026-06-01 (RFC 0085 — `openwop-agent-platform` meta-profile, the Active→Accepted behavioral gate) added `agent-platform-aggregate-evidence.test.ts` (capability-gated on a host CLAIMING `openwop-agent-platform` in its live discovery `profiles[]` via `behaviorGate('openwop-agent-platform', …)` — the §C/§D honest-advertisement rule on the live `/.well-known/openwop`: the claim MUST satisfy the §B floor predicate (`isAgentPlatformPartial` → `partial`/`full`, never `none`), backed by the per-capability evidence not the profile string; `OPENWOP_AGENT_PLATFORM_TIER=full` forces the non-vacuous full bar — all governance terms + tenant installScope + all 16 §D terms; server-requiring, the always-on §B/§D derivation legs stay in `agent-platform-profile.test.ts` — the RFC 0085 → Accepted bar). 2026-06-01 (RFC 0084 — budget, quota + cost policy, the Active→Accepted behavioral gate) added `budget-enforcement.test.ts` (capability-gated on `budget.supported` via `behaviorGate('openwop-budget-enforcement', …)` — the §C/§D enforcement via the new `POST /v1/host/sample/budget/run` seam + the test event-log seam: a `hard-cost-exhaust` run emits the strict-ordered `budget.reserved → budget.consumed → budget.threshold.crossed{percent} → budget.exhausted → cap.breached{kind:"budget-cost"} → run.failed{error:"budget_exhausted"}` chain; a `model-denied` run is refused `budget_model_denied` BEFORE the provider call (fail-closed); an `advisory` host emits the `budget.*` events without stopping; every `budget.*` payload content-free backing `budget-no-pricing-leak`; new lib helper `src/lib/budgetPolicy.ts`; soft-skips on 404 — the RFC 0084 → Accepted bar). 
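
A rough sketch of how the `hard-cost-exhaust` ordering could be checked against an event log. It assumes that "strict-ordered" means the six event types appear in this relative order while unrelated events may interleave; that reading, and the helper name `appearsInOrder`, are assumptions rather than the suite's actual assertion.

```typescript
// Event types in the order listed for the hard-cost-exhaust run.
const HARD_COST_EXHAUST_CHAIN: readonly string[] = [
  "budget.reserved",
  "budget.consumed",
  "budget.threshold.crossed",
  "budget.exhausted",
  "cap.breached",
  "run.failed",
];

// Hypothetical helper: true when every chain entry occurs in the log in the
// given relative order (a subsequence check; other events may interleave).
function appearsInOrder(
  eventTypes: readonly string[],
  chain: readonly string[],
): boolean {
  let next = 0;
  for (const type of eventTypes) {
    if (next < chain.length && type === chain[next]) {
      next += 1;
    }
  }
  return next === chain.length;
}
```

A stricter variant would also reject duplicates or require adjacency; the source text does not say which, so the sketch stays with the weaker check.
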
2026-06-01 (RFC 0080 — agent memory capability reconciliation, the Active→Accepted behavioral gate) added `memory-degraded-projection.test.ts` (capability-gated on `agents.manifestRuntime.supported` + `memory.supported` via `behaviorGate('openwop-memory-degraded', …)` — the §C degraded-projection iff-contract on the NORMATIVE `GET /v1/agents`: a degraded inventory entry MUST carry `memoryDegraded:true` + a non-empty, unique `degradedMemoryDimensions[]` from the closed §A-name enum, a non-degraded entry MUST NOT, the inventory is non-empty, and the degraded branch runs non-vacuously when `OPENWOP_DEGRADED_AGENT_ID` names a known-degraded agent; black-box, no POST seam — the RFC 0080 → Accepted bar). This batch also documents the two RFC 0068 conformance seams (`POST /v1/host/sample/memory/consolidate` + `.../commitment/fire`) in `host-sample-test-seams.md` (the 0068 gated scenarios shipped in 1.14.0). 2026-06-01 (RFC 0034 — collector-side BYOK-canary inspection) added `otel-collector-canary-inspection.test.ts` (always-on server-free: stands up a real `OtelCollector`, POSTs synthetic OTLP/HTTP-JSON traces + metrics through its actual ingest path, and proves the new `findCanaryLeakage()` inspector catches a canary embedded in a span attribute / resource attribute / span name / metric data-point attribute while reporting ZERO hits on a redacted payload and never matching an empty canary — the non-vacuous proof that the conformance collector now inspects what the host's OTLP exporter ACTUALLY shipped over the wire, closing the `secret-leakage-otel-attribute` / `-debug-bundle-otel` collector-seam gap; the live capability-gated complement is the new collector-export describe block in `secret-leakage-otel-attribute.test.ts`). 
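
The canary inspection described for `otel-collector-canary-inspection.test.ts` amounts to a recursive scan of the received payload. The sketch below is an illustration under assumptions: `scanForCanary` is a hypothetical stand-in, not the real `findCanaryLeakage()` (whose signature is not shown here), and it only inspects string values, not object keys.

```typescript
// Hypothetical scanner: returns the JSON-style paths of every string value
// that contains the canary. An empty canary never matches anything.
function scanForCanary(value: unknown, canary: string, path = "$"): string[] {
  if (canary.length === 0) return [];
  if (typeof value === "string") {
    return value.includes(canary) ? [path] : [];
  }
  const hits: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, index) => {
      hits.push(...scanForCanary(item, canary, `${path}[${index}]`));
    });
  } else if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      hits.push(...scanForCanary(child, canary, `${path}.${key}`));
    }
  }
  return hits;
}
```

Returning paths rather than a boolean makes a failure report point at the leaking attribute, and the early return on an empty canary avoids the vacuous "everything matches" case the text calls out.
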
2026-06-01 (RFC 0035 — sandbox wall-clock timeout, the 7th-of-8 graduation) added `sandbox-wasm-timeout.test.ts` (worker-driven server-free: `probeTimeout` in `wasm-sandbox-probe.ts` spawns a worker thread running the committed `misbehaving-timeout.wasm` + a main-thread kill-timer — the thread preemption a same-thread probe can't do — asserting `sandbox_timeout` with a well-behaved positive control; graduates `node-pack-sandbox-timeout` reference-impl→protocol, so 7 of 8 `node-pack-sandbox-*` invariants are now protocol-tier, only the JS-specific `no-eval` permanently exempt). 2026-05-31 (audit-response black-box / graduation batch) added three more: `sandbox-wasm-isolation.test.ts` (RFC 0035 — drives the committed `fixtures/wasm-sandbox/*.wasm` through `wasm-sandbox-probe.ts`: escape/capability-gate via static `WebAssembly.Module.imports()`, an OOB-store memory trap, double-instantiate isolation; 10/10; graduates 6 `node-pack-sandbox-*` invariants reference-impl→protocol), `workspace-cross-tenant-isolation-blackbox.test.ts` (RFC 0059 — two-credential black-box on the normative §C `/v1/host/workspace/files` endpoints: owner A writes, a second-tenant credential fails closed; no seam), and `prompt-resolution-chain-event.test.ts` (RFC 0029 — reads the durable `agent.promptResolved.chain[]` precedence record via the normative `GET /v1/runs/{runId}/events/poll`; no seam) — each the production-path proof that graduates its surface into the `openwop-core-standard` floor. 2026-05-31 (RFC 0088 — the `openwop-core-standard` Core Standard Profile, the audit-response Core Candidate target) added `core-standard-profile.test.ts` (always-on server-free derivation probe: `isCoreStandard` derives the §B floor — `openwop-core` ∧ `openwop-interrupts` ∧ (`openwop-stream-sse` ∨ `openwop-stream-poll`) — a bare `openwop-core` host without interrupts is excluded, a host with no event transport fails, and the annex is absent from `deriveProfiles` because it composes rather than redefines). 
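
The §B floor quoted above is a plain boolean predicate over a host's profile list. A minimal sketch, assuming the profiles arrive as an array of strings; `meetsCoreStandardFloor` is a hypothetical name and is not the suite's `isCoreStandard`:

```typescript
// Hypothetical predicate for the floor:
// openwop-core AND openwop-interrupts AND (openwop-stream-sse OR openwop-stream-poll)
function meetsCoreStandardFloor(profiles: readonly string[]): boolean {
  const has = (profile: string): boolean => profiles.includes(profile);
  return (
    has("openwop-core") &&
    has("openwop-interrupts") &&
    (has("openwop-stream-sse") || has("openwop-stream-poll"))
  );
}
```

The two negative cases named in the text, a bare `openwop-core` host and a host with no event transport, both fall out of the conjunction.
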
2026-05-31 (RFC 0082 — agent deployment lifecycle, the Active→Accepted behavioral gate) added `agent-deployment-lifecycle.test.ts` (capability-gated on `agents.deployment.supported` via `behaviorGate('openwop-deployment-lifecycle', …)` — the §E promotion contract via the new `POST /v1/host/sample/agents/deployment-transition` seam + the test event-log seam across four legs: `promote` (authorize RFC 0049 → approvalGate RFC 0051 → eval-verify RFC 0081 → content-free `deployment.promoted` with a seven-state `toState` + `toVersion`, the record validating `agent-deployment.schema.json`), `unauthorized` (fail-closed — `allowed:false`, no `deployment.promoted`, the behavioral leg of `deployment-promotion-fail-closed`), `eval-gate-unmet` (`eval_gate_unmet` denial, §E-3), and `channel-pin` (the §B `resolvedAgentVersion` recorded-fact on `agent.invocation.started`); new lib helper `src/lib/agentDeployment.ts`; soft-skips on 404 — the RFC 0082 → Accepted bar). 2026-05-31 (RFC 0081 — agent evaluation, the Active→Accepted behavioral gate) added `agent-eval-run.test.ts` (capability-gated on `agents.evalSuite.supported` via `behaviorGate('openwop-eval-run', …)` — the §B `mode:"eval"` projection via the new `POST /v1/host/sample/agents/eval-run` seam + the test event-log seam: `eval.started`-first → one `eval.scored` per task → `eval.completed`-once ordering (count == `eval.completed.taskCount`), the content-free `eval.scored` legs (`score` ∈ 0..1) backing `eval-summary-no-content-leak`, and the NORMATIVE `GET /v1/runs/{runId}/eval-summary` schema-valid `EvalSummary` round-trip with `passedCount <= taskCount`; new lib helper `src/lib/agentEval.ts`; soft-skips on 404 — the RFC 0081 → Accepted bar). 
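
The `eval.*` ordering rules for `agent-eval-run.test.ts` can be sketched as one checker over a flattened event list. The `EvalEvent` shape is a simplifying assumption (real events carry a payload envelope), and treating `eval.completed` as the last `eval.*` event is one reading of the ordering above.

```typescript
// Hypothetical flattened event shape; only `score` and `taskCount` are names
// taken from the coverage notes.
interface EvalEvent {
  type: string;
  score?: number;
  taskCount?: number;
}

// Returns a list of violations; an empty list means the log is well-ordered.
function checkEvalOrdering(events: readonly EvalEvent[]): string[] {
  const problems: string[] = [];
  const evalEvents = events.filter((e) => e.type.startsWith("eval."));
  const scored = evalEvents.filter((e) => e.type === "eval.scored");
  const completed = evalEvents.filter((e) => e.type === "eval.completed");

  if (evalEvents[0]?.type !== "eval.started") {
    problems.push("eval.started must come first");
  }
  if (completed.length !== 1) {
    problems.push("eval.completed must appear exactly once");
  }
  if (evalEvents[evalEvents.length - 1]?.type !== "eval.completed") {
    problems.push("eval.completed must come last");
  }
  const done = completed[0];
  if (done !== undefined && done.taskCount !== scored.length) {
    problems.push("eval.scored count must equal eval.completed.taskCount");
  }
  for (const e of scored) {
    if (typeof e.score !== "number" || e.score < 0 || e.score > 1) {
      problems.push("score must be within 0..1");
    }
  }
  return problems;
}
```
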
2026-05-31 (RFC 0083 — durable trigger bridge, the Active→Accepted behavioral gate) added `trigger-bridge-delivery.test.ts` (profile-gated on `openwop-trigger-bridge` derived from the live discovery doc — the §C delivery model via the `POST /v1/host/sample/trigger-bridge/deliver` seam + the test event-log seam: dedup→effectively-once `trigger.delivery.attempted{delivered}` (§C-1), retry-exhaustion→`{dead-lettered}` + `trigger.subscription.state.changed{toState:dead-lettered}` (§C-2 + RFC 0053), and the delivered run's `run.started.causationId` == the delivery id (§C / RFC 0040); both `trigger.*` events content-free; the always-on shape stays in `trigger-bridge-shape.test.ts`; new lib helper `src/lib/triggerBridge.ts`). 2026-05-31 (RFC 0087 — agent org-chart, the Active→Accepted behavioral gate) added two capability-gated behavioral scenarios (both gated on `agents.orgChart.supported`, black-box on the normative `/v1/agents/org-chart` surface — no new POST seam): `agent-org-chart-scoping.test.ts` (the `GET /v1/agents/org-chart` tree-shape — departments form an acyclic `parentDepartmentId` tree, members reference `host:<id>` roster entries — + the §D responsibility roll-up via `GET /v1/agents/org-chart/{departmentId}` with a deduped `responsibilities[]` union + the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID`) and `org-position-no-authority-escalation.test.ts` (the behavioral leg of the protocol-tier invariant — the live org-chart wire carries NO authority-bearing field on any member/department/responsibility-view object; the structural leg stays always-on in `agent-org-chart-shape.test.ts`, and the deeper RFC 0049/0051 authority-invariance legs stay reference-impl tier per the `agent-manifest-runtime` no-host-hook precedent). 
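
The acyclic `parentDepartmentId` requirement checked by `agent-org-chart-scoping.test.ts` reduces to walking each department's parent chain and failing on a repeat. A minimal sketch: the `Department` shape (in particular the `departmentId` key on a department object) and the function name are assumptions, and a parent id that names no known department is treated as a root here rather than as an error.

```typescript
// Hypothetical department shape; `parentDepartmentId` is the field named in
// the coverage notes, the rest is illustrative.
interface Department {
  departmentId: string;
  parentDepartmentId?: string;
}

// True when no department can reach itself by following parent links.
function hasAcyclicParents(departments: readonly Department[]): boolean {
  const parentOf = new Map<string, string | undefined>();
  for (const d of departments) {
    parentOf.set(d.departmentId, d.parentDepartmentId);
  }
  for (const d of departments) {
    const seen = new Set<string>();
    let current: string | undefined = d.departmentId;
    while (current !== undefined) {
      if (seen.has(current)) return false; // revisited a node: cycle
      seen.add(current);
      current = parentOf.get(current);
    }
  }
  return true;
}
```

This runs in quadratic time in the worst case, which seems acceptable for an org chart; a production check would likely also validate dangling parent references.
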
2026-05-31 (RFCs 0086 + 0077 — the Active→Accepted behavioral gate) added four capability-gated behavioral scenarios so a non-steward host can be mechanically certified non-vacuously under `OPENWOP_REQUIRE_BEHAVIOR=true`: `agent-roster-attribution.test.ts` (RFC 0086 §B/§C; gated on `agents.roster.supported` — the normative `GET /v1/agents/roster` read shape + `total==roster.length`, the §C `roster.run.initiated`-before-`agent.invocation.started` ordering, the content-free payload backing `roster-attribution-no-content`, the durable work-item `triggerSubscriptionId`, and the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ROSTER_ID`), `agent-live-invocation-bracket.test.ts` (RFC 0077 §E; gated on `agents.liveRuntime.supported` — `agent.invocation.started`-first / `agent.invocation.completed`-last bracket, matching `invocationId`, `source`/`outcome` closed enums, content-free), `agent-live-structured-output.test.ts` (RFC 0077 §B step 6; gated on `agents.liveRuntime.structuredOutput` — a result violating `handoff.returnSchemaRef` fails the invocation `outcome:"failed"` rather than shipping as completed), and `agent-live-allowlist-enforced.test.ts` (RFC 0077 §F-1 / RFC 0002 §A14; gated on `agents.liveRuntime.supported` — a tool outside `toolAllowlist` is not callable); all four drive the documented `POST /v1/host/sample/roster/fire` + `POST /v1/host/sample/agents/live-invoke` seams plus the test event-log seam and soft-skip on 404 (these are the RFC 0086 / 0077 Active→Accepted bars). 
2026-05-30 (RFC 0087 — agent org-chart, Draft -> Active) added `agent-org-chart-shape.test.ts` (always-on server-free: the `capabilities.agents.orgChart` shape + the `AgentOrgChart` round-trip + the non-`host:` member negative + the **§B structural non-authority guarantee** — the schema rejects a `scopes`/`canDispatch`/`permissions`/`authority` field on a member (`additionalProperties:false`), and a member's key set is exactly `{rosterId, departmentId, roleId, reportsTo}` — backing the protocol-tier `org-position-no-authority-escalation` invariant; no new RunEventType). 2026-05-30 (RFC 0086 — standing agent roster, Draft -> Active) added `agent-roster-shape.test.ts` (always-on server-free: the `capabilities.agents.roster` shape + the `AgentRosterEntry` round-trip + the `host:` `rosterId` + `agentRef` version-XOR-channel negatives + the content-free `roster.run.initiated` negatives backing the protocol-tier `roster-attribution-no-content` invariant + the additive `roster` inventory projection + RunEventType-enum membership). 2026-05-30 (RFC 0082 — agent deployment lifecycle, Draft -> Active) added `agent-deployment-shape.test.ts` (always-on server-free: the `capabilities.agents.deployment` shape + the `AgentDeployment` record round-trip + the `AgentRef` `channel` XOR `version` `not`-clause + the four `deployment.*` payloads + the content-free negatives backing the protocol-tier `deployment-event-no-content-leak` invariant). 2026-05-30 (RFC 0085 — `openwop-agent-platform` meta-profile, Draft -> Active) added `agent-platform-profile.test.ts` (always-on server-free derivation of the operational-annex `none`/`partial`/`full` status: all-floor ⇒ partial, missing-flag ⇒ none, the replay-OR-`nondeterminismPolicy.declared` term, floor+governance ⇒ full, missing-tenant-scope ⇒ partial-not-full per the honest-advertisement rule, eval/deploy/budget-are-advisory-not-hard-terms, + the `capabilities.nondeterminismPolicy.declared` shape). 
2026-05-30 (RFC 0084 — budget, quota + cost policy, Draft -> Active) added `budget-policy-shape.test.ts` (always-on server-free: `budget-policy.schema.json` round-trip + the §A orthogonality guard — a wall-time field is rejected (it's RFC 0058's `runTimeoutMs`) — + threshold/onExhaustion negatives + the four content-free `budget.{reserved,consumed,threshold.crossed,exhausted}` payloads + the four `cap.breached{budget-*}` kinds + RunEventType-enum membership + the no-pricing-property structural check backing the protocol-tier `budget-no-pricing-leak` invariant + the `capabilities.budget`/`limits.maxBudget*` shape). 2026-05-30 (RFC 0083 — durable trigger + channel bridge, Draft -> Active) added `trigger-bridge-shape.test.ts` (always-on server-free: `trigger-subscription.schema.json` round-trip + missing-`state`/out-of-enum-`source`/unknown-property negatives + the four-state vocab + the two content-free `trigger.{subscription.state.changed,delivery.attempted}` payloads incl. closed `state`/`outcome` enums + RunEventType-enum membership + the `triggerBridge`/`webhooks.durable` capability shape + the `openwop-trigger-bridge` profile derivation incl. the no-dead-letter-sink negative). 2026-05-30 (RFC 0079 — credential provenance + egress policy, Draft -> Active) added `egress-provenance-shape.test.ts` (always-on server-free: `credential-provenance.schema.json` round-trip + `audiences:[]`/missing-`credentialId`/unknown-property negatives + the no-secret-property structural check backing the protocol-tier `egress-decision-no-secret-leak` invariant + the content-free `egress.decided` record incl. the `decision` enum + RunEventType-enum membership + the `httpClient.egressPolicy` shape; the behavioral `egress-credential-audience-bound` confused-deputy MUST is reference-impl tier, deferred to a host). 
2026-05-30 (RFC 0078 — portable tool catalog, Draft -> Active) added `tool-descriptor-shape.test.ts` (always-on server-free: `tool-descriptor.schema.json` round-trip + the §C-1 `exec` ⇒ `host-extension` cross-field MUST (RFC 0069) + the `safetyTier`-required negative + `additionalProperties:false`, the `capabilities.toolCatalog` `supported`/`sources`/`sessionLifecycle` shape, and the two content-free `tool.session.{opened,closed}` payload $defs incl. the closed `outcome` enum + RunEventType-enum membership). 2026-05-30 (RFC 0080 — agent memory capability reconciliation, Draft -> Active) added `memory-capability-model-shape.test.ts` (always-on server-free: the additive `capabilities.memory.{writable,search,retention}` dimension shapes + malformed-instance negatives — `retention.ttl` non-boolean, out-of-enum `search.modes`, unknown property under `additionalProperties:false` — the `agent-inventory-response` `memoryDegraded`/`degradedMemoryDimensions` closed-enum fields, and the `openwop-memory` derivation surfacing for read/write + long-term hosts while withholding from `writable:false`). 2026-05-30 (RFC 0081 — agent evaluation, Draft -> Active) added `agent-eval-suite-shape.test.ts` (always-on server-free: the `capabilities.agents.evalSuite` shape + the `AgentEvalSuite`/`EvalSummary` schema round-trips + the three `eval.{started,scored,completed}` payloads + the content-free negatives — a task entry with a `taskOutput` body, a `safetyFinding` with an `excerpt` — backing the new `eval-summary-no-content-leak` SECURITY invariant). 
2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch` live-run audit) added `safefetch-live-audit.test.ts` (`behaviorGate('openwop-safefetch-live-audit', …)`, gated on `httpClient.safeFetch` + `toolHooks.prePostEvents`) — asserts the audit-when-both MUST against the **durable run event log** via the new `POST /v1/host/sample/http/safe-fetch-run` open seam + the test event-log seam, closing the seam-vs-production gap (a production `createSafeFetch()` with no audit hooks passes the inline `safefetch-behavior.test.ts` but FAILS this under `OPENWOP_REQUIRE_BEHAVIOR=true`); this is the RFC 0076 §B → Accepted bar; run seam soft-skips on 404 (host-pending). 2026-05-29 (RFC 0066 — `x-openwop-form` picker UX hints, Draft → Active) added `x-openwop-form-pack-manifest.test.ts` (always-on server-free: an annotated `configSchema` stays a valid 2020-12 schema + the advisory hints don't change what it accepts, each §A annotation matches the shape, an unknown `kind` validates for forward-compat, 3 negatives — missing/non-string `kind`, non-string `dependsOn`). 2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch`) added `safefetch-behavior.test.ts` (seam-gated: SSRF block / DNS-rebinding / `Connection: upgrade` refusal / tool-hooks audit-when-both, via `POST /v1/host/sample/http/safe-fetch`; advertisement contract stays in `http-client-ssrf.test.ts`). 2026-05-29 (RFC 0076 §A — pack `runtime.requires[]` install gate) added two: `runtime-requires-shape.test.ts` (server-free closed-vocabulary validation — the 8 tokens validate, a raw builtin name is rejected, empty-array≡omission, `uniqueItems`) + `runtime-requires-install-gate.test.ts` (seam-gated install-grant / install-refuse → `pack_runtime_requirement_unmet` / non-sandbox SHOULD-projection, soft-skip on 404 via `POST /v1/host/sample/packs/install-gate`). 
2026-05-29 (RFC 0047 — `host.oauth` authorization-code roundtrip) added `oauth-authorization-code-roundtrip.test.ts` — capability-gated on `capabilities.oauth.supported` + `grants` including `authorization_code`; drives the `POST /v1/host/sample/oauth/authorize-code-roundtrip` seam against the one canonical synthetic provider in `fixtures/oauth-providers/synthetic.json` (soft-skip on 404, Tier-2 host-pending), asserting a successful grant returns a credential REFERENCE (token persisted as a `host.credentials` entry) and that the authorization code / state / PKCE verifier / acquired access+refresh tokens never appear on any run-visible surface (RFC 0047 §C + §C.2 / `credential-payload-redaction`). Closes the RFC 0047 Tier-2 gap (capability-shape + redaction scenarios existed; the actual authorization-code dance was unexercised). 2026-05-26 (RFC 0070 — agent-manifest runtime) added `agent-manifest-runtime.test.ts`; 2026-05-26 (RFC 0071 — artifact-type + chat card packs) added six: `artifact-type-pack-manifest-validation.test.ts` + `artifact-schema-compile-bounded.test.ts` (server-free) + `artifact-type-pack-install.test.ts` + `artifact-type-store-without-render.test.ts` + `chat-card-pack-manifest-validation.test.ts` (server-free) + `chat-card-pack-execution.test.ts` (capability-gated, host-pending). 
2026-05-26 (RFCs 0067 / 0068 / 0069 — spec-gap Draft cohort) added five scenarios: `byok-auth-modes.test.ts` (RFC 0067; always-on schema-shape of `aiProviders.authModes` + a discovery-gated §B auth-mode-contract cross-field check), `memory-consolidation-shape.test.ts` (RFC 0068; always-on shape of `agents.memoryConsolidation`/`agents.commitments` + the `agent.memory.consolidated`/`commitment.fired` payload $defs), `memory-consolidation-idempotent.test.ts` + `commitment-fired.test.ts` (RFC 0068; capability-gated behavioral, soft-skip on the documented `/v1/host/sample/memory/consolidate` + `/commitment/fire` seams), and `exec-not-protocol-tier.test.ts` (RFC 0069; always-on server-free structural assertion that the protocol corpus defines no `core.*`/`openwop.*` exec-class primitive — backs the `exec-must-not-be-protocol-tier` SECURITY invariant). 2026-05-25 (RFC 0061 — stateful agent-loop lifecycle, executionModel.version 5) added four `agent-loop-*.test.ts` scenarios: `-version5-shape` (always-on; validates `executionModel.statefulResume`/`transcriptWindow` + the 1–5 version ceiling) plus `-iteration-monotonic` (gated on `version >= 5`; `runOrchestrator.decided.iteration` increments 1,2,3… exactly once per turn), `-workspace-snapshot` (gated additionally on `host.workspace.supported`; a turn-i workspace write is invisible to turn i, visible to turn i+1), and `-stateful-resume` (gated on `statefulResume`; a mid-loop suspend resumes at the same iteration without resetting the counter) — the three behavioral scenarios drive the documented agent-loop seam (`POST /v1/host/sample/agentloop/run`) and soft-skip until a host wires it. 
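
The `-iteration-monotonic` rule (the `runOrchestrator.decided.iteration` values read 1, 2, 3, … with exactly one increment per turn) can be expressed as a one-line check. The function name is hypothetical and the input is assumed to be the `iteration` values already extracted from the event log in order.

```typescript
// Hypothetical check: the i-th observed iteration (0-based) must equal i + 1,
// which rules out gaps, repeats, and a counter reset after a resume.
function isMonotonicFromOne(iterations: readonly number[]): boolean {
  return iterations.every((value, index) => value === index + 1);
}
```

Note that an empty list passes vacuously, so a real scenario would presumably also assert that at least one turn was observed.
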
2026-05-25 (RFC 0059 — host.workspace M2, reference-host enforcement) added two `workspace-*.test.ts` scenarios: `-behavior` (capability-gated CRUD round-trip / `If-Match` 409 `workspace_conflict` / `workspace_too_large` / §D run-start snapshot, all via the real `/v1/host/workspace/files` §C endpoints) and `-cross-tenant-isolation` (WCT-1 — drives the documented `POST /v1/host/sample/workspace/op` seam to assert a file owned by one `{tenant, workspace}` is unreadable, on both `get` and `list`, under a different owner; backs the new `workspace-cross-tenant-isolation` SECURITY invariant). The in-memory reference host now advertises `capabilities.workspace.supported` and honors §C/§D/§E end-to-end. 2026-05-25 (RFC 0062 — memory.distillation "dreams") added five `distillation-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.memory.distillation` block + the additive `distillation` sub-object on `memory.compacted`) plus `-token-budget` (within budget `tokensUsed ≤ tokenBudget`; an un-meetable budget → `token_budget_exceeded` with no partial archive), `-stable-archive` (same sources + budget ⇒ byte-stable archive checksum), `-index-roundtrip` (gated additionally on `indexEmitted`; the `MEMORY-INDEX.json` workspace file is retrievable + `workspace.updated` fired), and `-secret-carryforward` (SR-1: a redacted source secret never appears in the archive) — the four behavioral scenarios drive the documented memory-distillation seam (`POST /v1/host/sample/memory/distill`) and soft-skip until a host wires it. 
2026-05-25 (RFC 0063 — core.subWorkflow.outputAttestation) added four `subrun-*.test.ts` scenarios: `-attestation-shape` (always-on; validates the `capabilities.agents.subRunAttestation` flag) plus `-checksum-stable` (the child output checksum is the byte-stable, key-order-invariant RFC 8785 JCS + SHA-256 digest), `-approval-gate` (`requireApproval` → `accept` merges, `reject` does not), and `-approval-fail-closed` (no `accept`/`edit-accept` → no merge; backs the deferred `subrun-merge-approval-fail-closed` invariant) — the three behavioral scenarios drive the documented sub-run attestation seam (`POST /v1/host/sample/subrun/attest`) and soft-skip until a host wires it. 2026-05-25 (RFC 0064 — host.toolHooks) added five `tool-hooks-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.toolHooks` block + the optional content-free fields on `agentToolCalled` / `agentToolReturned`) plus `-content-free` (gated on `prePostEvents`), `-authorization-fail-closed` (gated on `perToolAuthorization`), `-rate-limit` (gated on `perToolRateLimit`), and `-secret-redaction` (gated on `prePostEvents` + the SR-1 `argsHash` redaction rule) — the four behavioral scenarios drive the documented tool-hooks invoke seam (`POST /v1/host/sample/toolhooks/invoke`) and soft-skip until a host wires it. 2026-05-25 (RFC 0060 — host.heartbeat) added four `heartbeat-*.test.ts` scenarios: `-capability-shape` (always-on; validates the `capabilities.heartbeat` block) plus `-fires-once-per-tick`, `-idempotent-no-spam`, and `-runtime-bound` (gated on `capabilities.heartbeat.supported` + the host heartbeat tick seam; soft-skip until a host wires it). 
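
For the `-checksum-stable` sub-run scenario above, the key-order invariance comes from the canonicalization half of RFC 8785. The sketch below covers only that half and leaves out the SHA-256 step to stay dependency-free. It is a simplified illustration: it relies on `JSON.stringify` for primitives and does not reject inputs RFC 8785 forbids, such as non-finite numbers.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Simplified JCS-style canonical form: object members sorted by key
// (UTF-16 code-unit order), no insignificant whitespace.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  const members = Object.entries(value)
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([key, child]) => JSON.stringify(key) + ":" + canonicalize(child));
  return "{" + members.join(",") + "}";
}
```

Two objects that differ only in member order canonicalize to the same string, so any digest computed over that string is stable across key reordering.
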
2026-05-25 (RFC 0057 — memory write-attribution) added five `memory-attribution-*.test.ts` scenarios: `-shape` (always-on advertisement check on `capabilities.memory.attribution`), plus `-no-content`, `-tenant-scoped`, `-emits-on-write`, and `-replay-stable` (gated on `capabilities.memory.attribution.emitsWriteEvents`) verifying the content-free `memory.written` RunEvent, its two SECURITY invariants (`memory-attribution-no-content` + `memory-attribution-tenant-scoped`), and the §D replay rule that a `replay`-mode fork MUST NOT regenerate `memoryId`. 2026-05-25 (RFC 0025 §C point 1 — test-catalog isolation invariant; pairs with the 25 publish-error scenarios in `pack-registry-publish.test.ts`) added `pack-registry-isolation.test.ts` — capability-gated on `capabilities.packs.testMode.{supported, isolated}: true`; PUTs a disposable pack into `/v1/packs-test/{name}` and asserts the same `(name, version)` does NOT appear via `GET /v1/packs/{name}` — anchors the test-catalog isolation MUST in RFC 0025 §C. 2026-05-25 (RFC 0028 Tier-2 post-promotion T2 — read-side sister scenario for workspace-membership enforcement) added `prompt-read-workspace-membership-enforced.test.ts` — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` so read-only hosts that expose `?workspaceId=` are also probed); drives `GET /v1/prompts?workspaceId=<random-non-member>` and interprets the response: 4xx PASS (canonical envelope check on 403); 200 with empty `templates[]` PASS (correct null result for a nonexistent workspace); 200 with non-empty `templates[]` FAIL (cross-tenant leak); 200 without `templates[]` field SKIP (host doesn't expose workspace-scoped reads). Verifies SECURITY invariant `prompt-read-workspace-membership-enforced`. Same-day T1 strengthened `prompt-mutation-workspace-membership-enforced.test.ts` to pin `error === "workspace_membership_required"` when the host's refusal status is 403 (other refusal codes unconstrained). 
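
The four-way interpretation that `prompt-read-workspace-membership-enforced.test.ts` applies to `GET /v1/prompts?workspaceId=<random-non-member>` can be written as a small classifier. This is a sketch under assumptions: the function and type names are hypothetical, a `templates` value that is present but not an array is treated like a missing field, and any status the text does not cover is reported as `"unspecified"` rather than guessed.

```typescript
type Verdict = "pass" | "fail" | "skip" | "unspecified";

// Hypothetical classifier for the read-side membership probe.
function classifyWorkspaceRead(status: number, body: unknown): Verdict {
  if (status >= 400 && status < 500) return "pass"; // host refused the read
  if (status === 200) {
    const templates = (body as { templates?: unknown } | null)?.templates;
    if (!Array.isArray(templates)) return "skip"; // no workspace-scoped reads
    return templates.length === 0 ? "pass" : "fail"; // non-empty: leak
  }
  return "unspecified"; // not covered by the description above
}
```

The additional envelope check on a 403 (and the `workspace_membership_required` pin on the mutation side) is outside this sketch.
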
2026-05-25 (RFC 0028 Tier-2 follow-up — workspace-membership enforcement on mutating prompt endpoints, filed in response to a self-disclosed adopter vulnerability) added `prompt-mutation-workspace-membership-enforced.test.ts` — capability-gated on `capabilities.prompts.mutableLibrary: true`; drives `POST /v1/prompts` with a cryptographically-random non-member `workspaceId` and asserts the host refuses (NOT a 2xx; any 4xx/5xx is acceptable — silent success is the failure mode). Verifies SECURITY invariant `prompt-mutation-workspace-membership-enforced`. 2026-05-22 (RFC 0034 §B follow-up — secret-leakage harness against the OTel + debug-bundle seams) added `secret-leakage-otel-attribute.test.ts` — gates on `capabilities.secrets.supported` + `capabilities.observability.testSeams.{otelScrape,debugBundleExport}` AND the `OPENWOP_CANARY_SECRET_VALUE` env (host operator + conformance runner agree on the canary). Drives the existing `openwop-smoke-byok-roundtrip` fixture end-to-end; scrapes both seams after run completion; hard-fails if the canary plaintext appears in any OTel span attribute or debug-bundle field. Verifies SECURITY invariants `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel`. 2026-05-22 (RFC 0041 Phase 4 — replay determinism under nondeterministic models) added three scenarios: `replay-divergence-at-refusal.test.ts` (advertisement-shape probe on `replayDeterminism.refusalDivergenceEmission` + 2 `it.todo` for the dual-direction refusal-divergence case), `replay-observable-sequence-determinism.test.ts` (capability-gated; behavioral assertion soft-skipped until a `conformance-phase4-nondet-tool` fixture ships), `replay-llm-cache-key-portable.test.ts` (intra-host reproducibility + non-recipe-field invariance + Phase 4 advertisement alignment — reuses the existing `POST /v1/host/sample/test/llm-cache-key` seam from the sibling `replay-llm-cache-key.test.ts`). 
2026-05-20 (RFC 0027 §A templateKinds-coverage follow-up — paired with `prompt-end-to-end-events.test.ts`) added `prompt-all-four-kinds-events.test.ts` exercising all four `PromptKind` values (`system`, `user`, `schema-hint`, `few-shot`) end-to-end through the reference workflow-engine sample's `local.sample.demo.mock-ai` dispatch path; capability-gated via `behaviorGate('prompts-supported', ...)`. Closes the credibility gap where the host advertised `templateKinds: ["system", "user", "few-shot", "schema-hint"]` but only the system+user pair was actually wired into dispatch. 2026-05-20 (RFCs 0030–0033 — envelope LLM-contract-hardening track) added 15 scenarios across four `Active` RFCs: `envelope-reasoning-shape.test.ts` (RFC 0030, always-on; asserts the OPTIONAL `reasoning` property on the three universal-kind schemas + the `schema.response` deliberate omission), `envelope-reasoning-secret-redaction.test.ts` (RFC 0030, capability-gated on `capabilities.envelopes.reasoning.supported` + `secrets.supported`; 5 `it.todo()` placeholders for SECURITY invariant `envelope-reasoning-secret-redaction`), `envelope-tier-one-subset-static.test.ts` (RFC 0030, always-on for load-bearing rules — no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere; gated on `tierOneSubsetCompliance: "strict"` for OpenAI-strict-only constraints), `envelope-variant-discriminator-static.test.ts` (RFC 0031, always-on; asserts no `oneOf` + every `anyOf` branch declares a single-string-enum discriminator in `required` on every `schemas/envelopes/*.schema.json`), `model-capability-substituted.test.ts` (RFC 0031, advertisement-shape probe on `capabilities.modelCapabilities.advertised[]` identifier pattern + 5 `it.todo()` placeholders for SECURITY invariant `model-capability-substituted-no-credential-disclosure`), `model-capability-insufficient.test.ts` (RFC 0031, 6 `it.todo()` placeholders for refusal + no-recursive-fallback), `node-module-required-capabilities-shape.test.ts` (RFC 0031 
SHOULD-tier authoring-convention; 4 `it.todo()` placeholders), and the six envelope-reliability events from RFC 0032 (`envelope-retry-attempted` carrying the shared advertisement-shape probe enforcing both MUST-tier events in `events[]` per RFC 0032 §C, plus `envelope-retry-exhausted`, `envelope-refusal-shape`, `envelope-truncated`, `envelope-nl-to-format-engaged`, `envelope-recovery-applied` — collectively 39 `it.todo()` placeholders covering retry/refusal/truncation/recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` and `envelope-recovery-no-content-leak`), plus RFC 0033's two scenarios (`envelope-completion-distinguishes-truncation.test.ts` + `envelope-truncation-cap-exhaustion.test.ts` — 12 `it.todo()` placeholders covering the truncation-vs-schema-violation retry-routing distinction + the DoS-bound assertion). Reference workflow-engine sample advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"` honestly (schemas accept the field; host doesn't yet inject the directive); the other three RFCs' capability blocks defer to reference-host emission code per the staged RFC 0027 §G precedent. 2026-05-20 (RFC 0028 §B Phase B — prompt-pack boot-time install) added `prompt-pack-install.test.ts` (capability-gated on `capabilities.prompts.endpointsSupported: true`; asserts a host that ran the boot-time pack loader surfaces ≥ 1 pack-source template under `GET /v1/prompts?source=pack` carrying the canonical `meta.source: "pack"` + `meta.packName` + `meta.packVersion` stamps; positively identifies the in-tree `vendor.openwop.prompt-sample` reference pack's `writer-system` template when present). Pairs with the new `host/promptPackLoader.ts` boot-time entry on the reference workflow-engine sample, which scans `examples/packs/*` plus `OPENWOP_PROMPT_PACKS_DIR` and calls `installPackTemplates()` for each `kind: "prompt"` pack found. 
2026-05-20 (RFC 0029 Phase C — prompt resolution chain wire shape) added three more scenarios: `prompt-resolution-chain-node-wins.test.ts` (capability-gated on `capabilities.prompts.supported: true`; asserts layer-1 node-config supersedes lower layers per `spec/v1/prompts.md` §"Resolution chain (normative)"), `prompt-resolution-chain-agent-intrinsic.test.ts` (additionally gated on `capabilities.prompts.agentBindings: true`; asserts agent intrinsic `systemPromptRef` wins over `promptOverrides` AND lower layers when the node has no layer-1 ref), `prompt-resolution-chain-fallback-cascade.test.ts` (asserts layer 3 workflow-defaults wins over layer 4 host-defaults; layer 4 host-defaults wins when 1-3 yield null; resolved is null when all four yield null but chain[] still lists every attempted layer). The scenarios drive the host's `POST /v1/host/sample/prompt/resolve` test seam (reference-host implementation deferred to follow-up slice per RFC 0021 staging precedent). 2026-05-20 (RFC 0027 Phase A — prompt templates wire shape) added three scenarios: `prompt-template-shape.test.ts` (always-on; Ajv compileability + positive/negative round-trip for PromptTemplate + PromptRef + PromptKind), `prompt-composed-secret-redaction.test.ts` (capability-gated on `capabilities.prompts.supported: true` + `observability: "full"`; asserts `[REDACTED:<secretId>]` markers in `prompt.composed` payloads for `source: "secret"` variable bindings per SECURITY/threat-model-secret-leakage.md §SR-1), `prompt-composed-trust-marker.test.ts` (same capability gates; asserts `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation per RFC 0020 §D). Paired with new `fixtures/prompt-templates/` sub-directory + per-fixture schema-validity describe block + future SECURITY invariants `prompt-composed-secret-redaction` and `prompt-composed-trust-marker` (lands alongside reference-host emission per RFC 0021 staging precedent). 
2026-05-18 (RFC 0022 `Draft` — runtime variable mapping) added four `it.todo()` placeholder scenarios covering the new mapping surfaces on `core.dispatch` (§A — `dispatch-input-mapping.test.ts`, `dispatch-output-mapping.test.ts`, `dispatch-cross-worker-handoff.test.ts`) and `core.subWorkflow` (§B — `subworkflow-input-mapping.test.ts`). Gated on `capabilities.agents.dispatchMapping` (dispatch trio) and `capabilities.subWorkflow.inputMapping` (subWorkflow). Promote to live assertions when RFC 0022 reaches `Active` + a reference host advertises the matching flags. 2026-05-17 (RFC 0003 §D handoff-schema enforcement, HV-1) added `agentPackHandoffSchemaValidation.test.ts` — verifies the host validates dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef` per RFC 0003 §D. Paired with the new `agent-pack-handoff-schema-enforcement` row in `SECURITY/invariants.yaml`. 2026-05-17 (AI Envelope gap-closure, DRAFT v1.x — `spec/v1/ai-envelope.md`) added 7 advertisement-shape scenarios with `it.todo()` behavioral placeholders gated on `capabilities.envelopeContracts.advertised: true`: `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`. Paired with the new `envelope-redaction-sr-1-carry-forward` row in `SECURITY/invariants.yaml`. 2026-05-17 (post-publish hardening, deep audit of `core.openwop.agents`) added `agents-run-tool-allowlist.test.ts` — server-free scenario locking in the `core.openwop.agents@1.0.1` safety-fix that closes `OPENWOP-AUDIT-2026-003` (function-typed `tool.handler` properties rejected at `validateTools()` with `INVALID_TOOL_DECLARATION`; tool-driven runs require `ctx.agentRuntime`; tool-less safe fallback preserved). Paired with the new `agents-run-no-raw-handler` row in `SECURITY/invariants.yaml`. 
Same-day post-publish hardening added `idempotency-key-determinism.test.ts` — server-free scenario locking in the `core.openwop.http@1.1.2` determinism safety-fix (default `composite` mode produces deterministic keys in `(runId, nodeId, payload)`; removed `uuid` mode rejects with `CONFIG_INVALID`; cross-impl vector test lets third-party reimplementations verify wire agreement). Paired with the new `idempotency-key-deterministic` row in `SECURITY/invariants.yaml`. 2026-05-17 (Phase 3 of RFC 0013) added three server-free scenarios exercising the reference workflow-chain expansion library (`conformance/src/lib/workflow-chain-expansion.ts`): `workflow-chain-expansion.test.ts` (parameter substitution + node id collision avoidance + edge rewriting + capability propagation + runtime-invariance contract), `workflow-chain-unresolvable-typeid.test.ts` (rejection with `chain_unresolvable_typeid` when a chain references an unknown typeId), and `workflow-chain-pack-signature-verification.test.ts` (Ed25519 verification recipe reuse from `node-packs.md §Signing`). Earlier that day (Phase 1) added `workflow-chain-pack-manifest-validation.test.ts` — server-free schema-validation scenario covering the new `workflow-chain-pack-manifest.schema.json` (positive sample + two negatives: kind/contents mismatch and invalid `chainId`). Closes RFC 0013 (`Workflow-chain packs`, `Draft`) Phases 1 + 3 alongside the new `spec/v1/workflow-chain-packs.md`, the `Capabilities.workflowChainPacks` block, and the registry build-index/conformance-check `kind` routing from Phase 2. Earlier that day, the suite added 27 `it.todo()` placeholder scenarios paired with RFCs 0014-0020 (host capability surfaces — fs, kvStorage, tableStorage, queueBus, sql/vector/search, blob/cache, mcp.serverMount). These promote to live assertions when each RFC reaches `Active` + the matching capability block lands in `schemas/capabilities.schema.json` + a reference host advertises the capability. 
Earlier additions include 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing SR-1 §D), `memory-compaction-event-emitted.test.ts` 
(canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). 2026-05-15 (gap-closure CF-3) added `interrupt-token-matrix.test.ts` (malformed / unknown / replay / cross-run-id paths on `GET|POST /v1/interrupts/{token}`). 2026-05-31 (RFC 0078 portable tool catalog + RFC 0079 credential provenance / egress policy — the Active→Accepted behavioral gate) added four: `tool-catalog-projection.test.ts` (capability-gated on `toolCatalog.supported` via `behaviorGate('openwop-tool-catalog', …)` — the NORMATIVE `GET /v1/tools` list with each `ToolDescriptor` schema-valid + `source`/`safetyTier` in the closed vocab + content-free, `GET /v1/tools/{toolId}` round-trip + unknown-id 404, 401-unauthenticated, and the §F-2 cross-principal non-disclosure; black-box, no POST seam), `tool-session-lifecycle.test.ts` (gated on `toolCatalog.sessionLifecycle` — the §D `tool.session.opened`-before / `tool.session.closed`-after bracket over the RFC 0064 call events via the `POST /v1/host/sample/tools/session-run` seam, one shared `sessionId`, content-free), `egress-audience-binding.test.ts` (KEYSTONE — gated on `httpClient.egressPolicy.supported`; the §C confused-deputy MUST via `POST /v1/host/sample/egress/decide`: an out-of-audience egress is denied/downgraded with the credential NOT attached, a provenance-unevaluable egress fails closed — the behavioral leg of `egress-credential-audience-bound`), and `egress-decision-content-free.test.ts` (the SR-1 canary — the credential value never surfaces in `egress.decided` and `reason` stays in the CLOSED vocabulary). 
The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
|
|
96
96
|
|
|
97
97
|
High-level coverage includes:
|
|
98
98
|
|
|
@@ -171,7 +171,7 @@ Server-required (added in 1.7.0):
|
|
|
171
171
|
| ------------- | ----------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
172
172
|
| **Redaction** | [`capabilities.md`](../spec/v1/capabilities.md) §"Secrets" + NFR-7 + §"aiProviders" | Vendor-neutral assertions that the server doesn't leak secret material. Three scenario groups: (a) discovery shape contract — `secrets` + `aiProviders` advertisements are well-formed regardless of `secrets.supported`; when `supported === true`, scopes MUST be non-empty + `resolution === 'host-managed'`; `byok ⊆ supported`. (b) bearer-token redaction — invalid Bearer canary in `Authorization` header is not echoed in the 401 response body. (c) credentialRef echo control — gated on `secrets.supported === true`; canary planted in `configurable.ai.credentialRef` MUST NOT appear in any RunEvent payload (poll-based capture; transport-agnostic). Uses runtime-built canary fixtures (`lib/canaries.ts`) that defeat static secret scanners. 6 scenarios. |
|
|
173
173
|
|
|
174
|
-
Current source tree:
|
|
174
|
+
Current source tree: 359 scenario files. Use [`coverage.md`](./coverage.md) for current grade/gap tracking.
|
|
175
175
|
|
|
176
176
|
## Remaining Gaps
|
|
177
177
|
|
package/api/asyncapi.yaml
CHANGED
|
@@ -642,6 +642,59 @@ components:
|
|
|
642
642
|
payload:
|
|
643
643
|
$ref: '#/components/schemas/RunEventDoc'
|
|
644
644
|
|
|
645
|
+
# ── Voice (RFC 0106) ─────────────────────────────────────────────────
|
|
646
|
+
# The voice.* turn-taking / barge-in taxonomy — the single canonical record of a
|
|
647
|
+
# live voice turn (ctx.callTranscriber resolves its Promise at turn_commit; these
|
|
648
|
+
# events ARE the streaming representation on the durable log). All RunEventDocs.
|
|
649
|
+
VoiceSpeechStart:
|
|
650
|
+
name: voice.speech_start
|
|
651
|
+
title: Inbound user speech onset detected
|
|
652
|
+
contentType: application/json
|
|
653
|
+
payload:
|
|
654
|
+
$ref: '#/components/schemas/RunEventDoc'
|
|
655
|
+
|
|
656
|
+
VoiceTranscript:
|
|
657
|
+
name: voice.transcript
|
|
658
|
+
title: Interim/final transcript part (untrusted ingress; carries contentTrust)
|
|
659
|
+
contentType: application/json
|
|
660
|
+
payload:
|
|
661
|
+
$ref: '#/components/schemas/RunEventDoc'
|
|
662
|
+
|
|
663
|
+
VoiceEndpointCandidate:
|
|
664
|
+
name: voice.endpoint_candidate
|
|
665
|
+
title: Likely end-of-turn boundary (semantic turn detection)
|
|
666
|
+
contentType: application/json
|
|
667
|
+
payload:
|
|
668
|
+
$ref: '#/components/schemas/RunEventDoc'
|
|
669
|
+
|
|
670
|
+
VoiceTurnCommit:
|
|
671
|
+
name: voice.turn_commit
|
|
672
|
+
title: User yielded the floor (callTranscriber Promise resolves here)
|
|
673
|
+
contentType: application/json
|
|
674
|
+
payload:
|
|
675
|
+
$ref: '#/components/schemas/RunEventDoc'
|
|
676
|
+
|
|
677
|
+
VoiceSynthesisChunk:
|
|
678
|
+
name: voice.synthesis_chunk
|
|
679
|
+
title: Clause-boundary streaming-synthesis chunk (metadata only)
|
|
680
|
+
contentType: application/json
|
|
681
|
+
payload:
|
|
682
|
+
$ref: '#/components/schemas/RunEventDoc'
|
|
683
|
+
|
|
684
|
+
VoiceBargeIn:
|
|
685
|
+
name: voice.barge_in
|
|
686
|
+
title: User speech overlapped active assistant playback
|
|
687
|
+
contentType: application/json
|
|
688
|
+
payload:
|
|
689
|
+
$ref: '#/components/schemas/RunEventDoc'
|
|
690
|
+
|
|
691
|
+
VoiceCancelled:
|
|
692
|
+
name: voice.cancelled
|
|
693
|
+
title: Downstream LLM/TTS work cancelled (barge-in or explicit)
|
|
694
|
+
contentType: application/json
|
|
695
|
+
payload:
|
|
696
|
+
$ref: '#/components/schemas/RunEventDoc'
|
|
697
|
+
|
|
645
698
|
# ── Schemas ────────────────────────────────────────────────────────────
|
|
646
699
|
schemas:
|
|
647
700
|
|
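A consumer of the durable event log may want to separate the seven `voice.*` messages added above from the rest of the run-event stream. The following is a minimal sketch, not the package's implementation: the event names are copied from the hunk, but the `RunEventLike` shape (a bare `type` field) and both helper functions are hypothetical names introduced for illustration.

```typescript
// The seven voice.* event names declared in the asyncapi.yaml hunk above.
const VOICE_EVENT_NAMES = [
  "voice.speech_start",
  "voice.transcript",
  "voice.endpoint_candidate",
  "voice.turn_commit",
  "voice.synthesis_chunk",
  "voice.barge_in",
  "voice.cancelled",
] as const;

type VoiceEventName = (typeof VOICE_EVENT_NAMES)[number];

// ASSUMPTION: a run event exposes its name on a `type` field. The real
// RunEventDoc schema is not shown in this diff and may differ.
interface RunEventLike {
  type: string;
}

// Hypothetical helper: true when the event belongs to the voice.* taxonomy.
function isVoiceEvent(e: RunEventLike): boolean {
  return (VOICE_EVENT_NAMES as readonly string[]).indexOf(e.type) !== -1;
}

// Hypothetical helper: index of the first voice.turn_commit in a log slice,
// or -1 when the user has not yet yielded the floor.
function firstTurnCommitIndex(events: RunEventLike[]): number {
  for (let i = 0; i < events.length; i++) {
    if (events[i].type === ("voice.turn_commit" as VoiceEventName)) return i;
  }
  return -1;
}
```

Per the comment block in the hunk, `voice.turn_commit` is the point at which the `ctx.callTranscriber` Promise resolves, so a reader that only needs the settled transcript could ignore everything before that index.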
package/coverage.md
CHANGED
|
@@ -4,6 +4,10 @@
|
|
|
4
4
|
|
|
5
5
|
> **Shape grade vs behavior grade.** Some optional-profile scenarios validate **capability shape** (the host's discovery advertisement is well-formed) without yet exercising **behavior** (the host actually implements the profile end-to-end). The "Current grade" column reflects shape; see §"Capability-gated scenarios: shape vs behavior" below for the dual-grade view and the `OPENWOP_REQUIRE_BEHAVIOR=true` strict-mode runner flag.
|
|
6
6
|
|
|
7
|
+
> **Run serial against a single stateful in-process host (`--no-file-parallelism`).** Several scenarios drive a host **test seam that holds global mutable state** — a projected run-event log plus a `POST .../test/reset` — most notably the envelope-reliability projections, the RFC 0102 a2ui correlation/approval seams, the RFC 0036 multi-region/cross-engine seams, and the RFC 0005 conversation lifecycle/replay legs. vitest runs test **files in parallel by default**, so a host that boots **one** in-process instance and runs the suite against it will have concurrent files' `reset` calls clobber each other's event logs mid-assertion — producing *phantom* failures (or, worse, phantom passes) that look like host bugs but are pure harness contamination. Run such hosts serially: use the `test:strict` script (`vitest run --no-file-parallelism`), or — if your harness boots the host **and** runs the suite in one process via a custom wrapper — pass `--no-file-parallelism` into that wrapper yourself; you cannot rely on the package script. (Precedent: the production-profile backpressure row below already required this for the same reason.) The steward INTEROP-MATRIX in-process measurements are taken serially on this basis.
|
|
8
|
+
|
|
9
|
+
> **`conversationPrimitive` (and the multi-agent capability flags) are read at the discovery ROOT (RFC 0073).** `lib/multi-agent-capabilities.ts` (`setMultiAgentCapabilities`) reads `body.conversationPrimitive` at the document root, not under a nested `.capabilities` object. A host that advertises the flag **only** under `capabilities.conversationPrimitive` resolves to `false`, and every conversation scenario then `skipIf`-skips — a **vacuous all-skip that looks like a pass**, not a real cert. Advertise at the root (a `capabilities.` mirror for legacy discovery readers is fine, but the root is the value the suite grades). This deprecates any pre-RFC-0073 guidance that pointed hosts at `capabilities.conversationPrimitive`.
|
|
10
|
+
|
|
7
11
|
> **Sibling-repo pointer convention (2026-06 monorepo split).** Reference-implementation files that used to live in this monorepo are cited repo-qualified as `<repo>:<path>`, where `<repo>` is a repository under `https://github.com/openwop` — e.g. `openwop-examples:examples/hosts/sqlite/test/audit-tamper.test.ts`, `openwop-app:backend/typescript/test/agent-dispatch-route.test.ts`, `openwop-registry:registry/scripts/verify-signatures.mjs`. Bare paths refer to this repository. (Same convention as `SECURITY/invariants.yaml`.)
|
|
8
12
|
|
|
9
13
|
---
|
|
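The root-versus-nested distinction in the `conversationPrimitive` note above is easy to get wrong, so a host author may want a quick self-check before running the suite. The sketch below illustrates the rule as the note describes it. It is not the code in `lib/multi-agent-capabilities.ts`, and the function names are hypothetical.

```typescript
// Narrow an unknown JSON value to a plain object.
function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Hypothetical mirror of the root-first read described in the note:
// only the value at the discovery document ROOT counts.
function readConversationPrimitive(body: unknown): boolean {
  return isRecord(body) && body["conversationPrimitive"] === true;
}

// Hypothetical lint: detects the misadvertisement the note warns about,
// where the flag exists only under a nested `capabilities` object.
function advertisesOnlyNested(body: unknown): boolean {
  if (!isRecord(body)) return false;
  const nested = body["capabilities"];
  return (
    body["conversationPrimitive"] !== true &&
    isRecord(nested) &&
    nested["conversationPrimitive"] === true
  );
}
```

A host whose discovery document makes `advertisesOnlyNested` return `true` would, per the note, see every conversation scenario skip rather than run.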
@@ -351,6 +355,15 @@ server-free or shape-probe assertions that run unconditionally.
|
|
|
351
355
|
| `conversationVsLegacySuspend.test.ts` | RFC 0005 (`conversation.exchanged` ≠ `clarification.requested`) | `capabilities.conversationPrimitive` |
|
|
352
356
|
| `conversationReplayDeterminism.test.ts` | RFC 0005 (replay-fork yields byte-equal conversation log) | `capabilities.conversationPrimitive` + replay-fork + fixture |
|
|
353
357
|
|
|
358
|
+
### Multi-party group conversation (RFC 0101)
|
|
359
|
+
|
|
360
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
361
|
+
| ---------------------------------------------- | -------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
|
|
362
|
+
| `multi-party-conversation-shape.test.ts` | RFC 0101 §Schema (participant roster + conditional `speakerId` + `multiPartyConversation` block) | always-on (server-free) |
|
|
363
|
+
| `multi-party-conversation-behavioral.test.ts` | RFC 0101 §Spec (roster membership + attribution + `maxParticipants`, behavioral) | `capabilities.multiPartyConversation.supported` (behaviorGate `openwop-multi-party-conversation`) + the `POST /v1/host/sample/conversation/multi-party/{open,exchange}` seam |
|
|
364
|
+
|
|
365
|
+
The behavioral leg drives the conformance-only seam (RFC 0101 mints no normative client trigger to open a council) and asserts the cross-field MUSTs JSON Schema cannot express: a roster-valid attributed agent turn is accepted, while a `role:'agent'` turn missing `speakerId`, a non-participant `speakerId`, and an over-`maxParticipants` open are each rejected with `error.code:'validation_error'` (status-tolerant `400`/`422` per RFC 0005 §E). Soft-skips on `404`/`405`. **This is the RFC 0101 → behavioral-conformance bar** (reference impl: the postgres example host; a product-flow-bound host — e.g. openwop-app ADR 0040's advisory-board council — witnesses instead via its host-side test + an `INTEROP-MATRIX.md` row, the RFC 0086 dual-staging).
|
|
366
|
+
|
|
354
367
|
### Dispatch / sub-workflow mapping (RFC 0022) + sub-run attestation (RFC 0063)
|
|
355
368
|
|
|
356
369
|
| Scenario file | Spec doc / RFC | Gating capability |
|
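The three rejections the behavioral leg asserts (missing `speakerId` on an agent turn, a non-participant `speakerId`, and an over-`maxParticipants` open) can be sketched as two pure checks. This is an illustrative sketch under stated assumptions, not the reference host's enforcement code: the `Verdict` type and the `checkOpen` / `checkTurn` names are hypothetical, and membership is enforced only for `role: 'agent'` turns, which is one reading of the schema text.

```typescript
interface AgentRef {
  agentId: string;
  name?: string;
}

interface Turn {
  role: string;
  speakerId?: string;
}

// Hypothetical result shape; the wire contract only pins error.code.
interface Verdict {
  ok: boolean;
  code?: "validation_error";
  reason?: string;
}

function reject(reason: string): Verdict {
  return { ok: false, code: "validation_error", reason };
}

// Reject an open whose roster exceeds the advertised ceiling.
// An absent ceiling or absent roster means no bound is applied.
function checkOpen(participants?: AgentRef[], maxParticipants?: number): Verdict {
  if (
    participants !== undefined &&
    maxParticipants !== undefined &&
    participants.length > maxParticipants
  ) {
    return reject("roster exceeds maxParticipants");
  }
  return { ok: true };
}

// Reject an agent turn that lacks attribution or names a non-member.
// ASSUMPTION: user/system turns are not checked against the roster.
function checkTurn(turn: Turn, participants?: AgentRef[]): Verdict {
  if (turn.role !== "agent") return { ok: true };
  if (turn.speakerId === undefined || turn.speakerId.length === 0) {
    return reject("agent turn is missing speakerId");
  }
  if (participants !== undefined) {
    const isMember = participants.some((p) => p.agentId === turn.speakerId);
    if (!isMember) return reject("speakerId is not in the participants roster");
  }
  return { ok: true };
}
```

Keeping the checks free of HTTP concerns matches the status-tolerant framing above: the caller can map a failed `Verdict` to `400` or `422` as it sees fit, since only `error.code` is pinned.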
package/package.json
CHANGED
|
package/schemas/capabilities.schema.json
CHANGED
|
@@ -566,6 +566,23 @@
|
|
|
566
566
|
"uniqueItems": true,
|
|
567
567
|
"description": "Optional v1 host-advertised opaque capability ids that NodeModules may declare in `NodeModule.requires`. Naming convention: dotted, domain-scoped (`chat.sendPrompt`, `canvas.write`, `secrets.byok`). Provider value shapes are documented per-capability alongside consumers, NOT in the protocol package — the protocol owns the *check*, not domain provider contracts. A client that submits a workflow whose nodes declare a `requires` entry SHOULD first verify the host advertises that capability; a host that lacks a capability MUST refuse to dispatch nodes that declare it, terminating the run with `RunSnapshot.error.code = 'capability_not_provided'`. See `capabilities.md` §\"Runtime capabilities\"."
|
|
568
568
|
},
|
|
569
|
+
"multiPartyConversation": {
|
|
570
|
+
"type": "object",
|
|
571
|
+
"description": "RFC 0101 — Multi-party group conversation (shared transcript + speaker attribution). When advertised with `supported: true`, the host honors the RFC 0101 normative contract: (1) it accepts an OPTIONAL `participants: AgentRef[]` roster on `conversation.opened` (`conversation-event.schema.json`); (2) it REQUIRES a `speakerId` (the roster INSTANCE id, `host:<id>` per RFC 0086) on every `role: 'agent'` turn; and (3) when a `participants` roster is present it MUST reject a turn whose `speakerId` is not a roster member. Absent block ⇒ no advertisement: the host runs single-agent conversations (RFC 0005) and treats `participants`/`speakerId` as opaque, unenforced fields. Advertising `supported: true` without honoring (1)+(2)+(3) is a dishonest wire claim (`OPENWOP_REQUIRE_BEHAVIOR=true` fails the gated conformance scenarios). Host product policy — turn-taking order, round count, synchronous vs. async rounds — stays NON-normative (RFC 0101 §Spec); only roster membership, agent-turn attribution, and this capability are normative. Requires `conversationPrimitive: true` (RFC 0005) — this block extends the single-agent conversation primitive, it does not replace it.",
|
|
572
|
+
"required": ["supported"],
|
|
573
|
+
"additionalProperties": false,
|
|
574
|
+
"properties": {
|
|
575
|
+
"supported": {
|
|
576
|
+
"type": "boolean",
|
|
577
|
+
"description": "RFC 0101. Host enforces the multi-party roster + speaker-attribution contract. When `false` or absent, RFC 0101 conformance scenarios soft-skip and the host applies no roster/attribution enforcement."
|
|
578
|
+
},
|
|
579
|
+
"maxParticipants": {
|
|
580
|
+
"type": "integer",
|
|
581
|
+
"minimum": 2,
|
|
582
|
+
"description": "RFC 0101. OPTIONAL host ceiling on the size of the `conversation.opened.participants` roster. A multi-party council has at least 2 participants. When advertised, the host MUST reject a `conversation.opened` whose `participants` array exceeds this count. Absent ⇒ the host does not bound the roster size on the wire."
|
|
583
|
+
}
|
|
584
|
+
}
|
|
585
|
+
},
|
|
569
586
|
"multiAgent": {
|
|
570
587
|
"type": "object",
|
|
571
588
|
"description": "RFC 0037 — Multi-agent execution model + handoff state machine. Hosts that advertise implement the supervisor→dispatch→harvest loop + the 4-state handoff state machine + the `core.workflowChain.event` emission contract per spec/v1/multi-agent-execution.md. Absent block = host implements RFCs 0006/0007/0022 individually with implementation flexibility on integration semantics; conformance scenarios gating on this flag soft-skip on absence.",
|
|
@@ -806,9 +823,49 @@
|
|
|
806
823
|
"minimum": 0,
|
|
807
824
|
"default": 262144,
|
|
808
825
|
"description": "RFC 0055 §C rule 2 — optional cap (bytes) on inline base64 in `media.*` envelope payloads. A `media.{image,audio,file}` asset above this size MUST be served by a tenant-scoped `url` reference rather than inlined (bounds event-log + replay-payload size). Default 256 KiB (262144) when absent. A host MAY set 0 to force URL references for all emitted media."
|
|
826
|
+
},
|
|
827
|
+
"realtimeVoice": {
|
|
828
|
+
"type": "object",
|
|
829
|
+
"additionalProperties": false,
|
|
830
|
+
"description": "RFC 0106. Optional real-time voice profile. Absent ⇒ no live voice (a call to `ctx.callTranscriber` MUST be rejected with `transcription_unsupported`, and `stream:true` on `ctx.callSpeechSynthesizer` with `streaming_unsupported`). `ctx.callTranscriber` resolves a `Promise` at `turn_commit` with the settled final transcript and emits the interim/final/endpoint signals as `voice.*` run-events on the durable event log (the single canonical record); the streaming synthesis arm resolves a `Promise` at completion and announces clause-boundary chunks as `voice.synthesis_chunk` metadata run-events (bytes by `streamRef`/`url`, not inlined past the host cap).",
|
|
831
|
+
"properties": {
|
|
832
|
+
"transcription": {
|
|
833
|
+
"const": "streaming",
|
|
834
|
+
"description": "When present (value MUST be `\"streaming\"`), the host exposes `ctx.callTranscriber(...)` (streaming STT). `turnDetection` and `bargeIn` require `transcription`."
|
|
835
|
+
},
|
|
836
|
+
"synthesis": {
|
|
837
|
+
"const": "streaming",
|
|
838
|
+
"description": "When present (value MUST be `\"streaming\"`), the RFC 0105 `ctx.callSpeechSynthesizer` request honors `stream: true` (chunked synthesis). REQUIRES `aiProviders.speechSynthesis: \"supported\"` (enforced by the if/then closure on `aiProviders`)."
|
|
839
|
+
},
|
|
840
|
+
"turnDetection": {
|
|
841
|
+
"enum": ["vad", "semantic"],
|
|
842
|
+
"description": "Endpointing sophistication. `vad` = silence-threshold endpointing only; `semantic` = a turn detector that emits `voice.endpoint_candidate` distinct from `voice.turn_commit`. Requires `transcription`."
|
|
843
|
+
},
|
|
844
|
+
"bargeIn": {
|
|
845
|
+
"const": "supported",
|
|
846
|
+
"description": "When present, the host MUST emit `voice.barge_in` on overlapping speech during playback and `voice.cancelled` when it actually cancels downstream work. Requires `transcription`."
|
|
847
|
+
}
|
|
848
|
+
},
|
|
849
|
+
"dependentRequired": {
|
|
850
|
+
"turnDetection": ["transcription"],
|
|
851
|
+
"bargeIn": ["transcription"]
|
|
852
|
+
}
|
|
809
853
|
}
|
|
810
854
|
},
|
|
811
|
-
"additionalProperties": false
|
|
855
|
+
"additionalProperties": false,
|
|
856
|
+
"allOf": [
|
|
857
|
+
{
|
|
858
|
+
"$comment": "RFC 0106 §A closure: realtimeVoice.synthesis (streaming TTS) requires the whole-file TTS surface aiProviders.speechSynthesis to be advertised.",
|
|
859
|
+
"if": {
|
|
860
|
+
"type": "object",
|
|
861
|
+
"required": ["realtimeVoice"],
|
|
862
|
+
"properties": {
|
|
863
|
+
"realtimeVoice": { "type": "object", "required": ["synthesis"] }
|
|
864
|
+
}
|
|
865
|
+
},
|
|
866
|
+
"then": { "required": ["speechSynthesis"] }
|
|
867
|
+
}
|
|
868
|
+
]
|
|
812
869
|
},
|
|
813
870
|
"testing": {
|
|
814
871
|
"type": "object",
|
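The `realtimeVoice` block above combines two kinds of cross-field rule: `dependentRequired` (both `turnDetection` and `bargeIn` need `transcription`) and the `if`/`then` closure (a `synthesis` entry needs `speechSynthesis` on the enclosing object). The sketch below restates those rules as a plain function, assuming the enclosing object is `aiProviders` as the `synthesis` description indicates. The interface and function names are hypothetical, and a real check would normally run the schema through a JSON Schema validator instead.

```typescript
interface RealtimeVoice {
  transcription?: "streaming";
  synthesis?: "streaming";
  turnDetection?: "vad" | "semantic";
  bargeIn?: "supported";
}

// ASSUMPTION: only the two fields relevant to the closure are modelled here.
interface AiProvidersLike {
  speechSynthesis?: unknown;
  realtimeVoice?: RealtimeVoice;
}

// Returns one message per violated cross-field rule; empty means consistent.
function realtimeVoiceViolations(ai: AiProvidersLike): string[] {
  const violations: string[] = [];
  const rv = ai.realtimeVoice;
  if (rv === undefined) return violations; // absent block: nothing to check

  // dependentRequired: turnDetection -> transcription
  if (rv.turnDetection !== undefined && rv.transcription === undefined) {
    violations.push("turnDetection requires transcription");
  }
  // dependentRequired: bargeIn -> transcription
  if (rv.bargeIn !== undefined && rv.transcription === undefined) {
    violations.push("bargeIn requires transcription");
  }
  // if/then closure: synthesis -> speechSynthesis present on the parent.
  // The schema's `then` only requires presence, so only presence is checked.
  if (rv.synthesis !== undefined && ai.speechSynthesis === undefined) {
    violations.push("synthesis requires speechSynthesis");
  }
  return violations;
}
```

Note that the `synthesis` description asks for `speechSynthesis: "supported"`, while the `then` clause shown in the hunk only requires the property to be present; the sketch follows the schema clause.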
|
package/schemas/conversation-event.schema.json
CHANGED
|
@@ -9,6 +9,16 @@
|
|
|
9
9
|
{ "$ref": "#/$defs/ConversationClosedPayload" }
|
|
10
10
|
],
|
|
11
11
|
"$defs": {
|
|
12
|
+
"AgentRef": {
|
|
13
|
+
"type": "object",
|
|
14
|
+
"description": "RFC 0101. Slim wire-shape projection of an agent's identity. Mirror of the relevant subset of `agent-ref.schema.json`; kept in sync (inlined for self-containment under per-file Ajv compile order, same convention as `ConversationTurn`). Used by `ConversationOpenedPayload.participants` — the cohort permitted to speak. `agentId` here is the roster INSTANCE id (`host:<id>` per RFC 0086) for council members, matching a turn's `speakerId`.",
|
|
15
|
+
"required": ["agentId"],
|
|
16
|
+
"properties": {
|
|
17
|
+
"agentId": { "type": "string", "minLength": 1, "maxLength": 256 },
|
|
18
|
+
"name": { "type": "string", "maxLength": 256 }
|
|
19
|
+
},
|
|
20
|
+
"additionalProperties": true
|
|
21
|
+
},
|
|
12
22
|
"ConversationTurn": {
|
|
13
23
|
"type": "object",
|
|
14
24
|
"description": "One turn in a multi-turn conversation. Mirror of `conversation-turn.schema.json`; kept in sync. Structural superset of `ConversationMessage` (RFC 0002 §G).",
|
|
@@ -35,8 +45,21 @@
|
|
|
35
45
|
},
|
|
36
46
|
"additionalProperties": true
|
|
37
47
|
},
|
|
38
|
-
"turnIndex": { "type": "integer", "minimum": 0 }
|
|
48
|
+
"turnIndex": { "type": "integer", "minimum": 0 },
|
|
49
|
+
"speakerId": {
|
|
50
|
+
"type": "string",
|
|
51
|
+
"minLength": 1,
|
|
52
|
+
"maxLength": 256,
|
|
53
|
+
"description": "RFC 0101. Roster INSTANCE id of this turn's speaker (`host:<id>` AgentRef per RFC 0086). REQUIRED when `role: 'agent'` (the `allOf`/`if` conditional below). Mirror of `conversation-turn.schema.json`'s `speakerId`; kept in sync."
|
|
54
|
+
}
|
|
39
55
|
},
|
|
56
|
+
"allOf": [
|
|
57
|
+
{
|
|
58
|
+
"$comment": "RFC 0101 — agent turns MUST carry speaker attribution. Mirror of `conversation-turn.schema.json`; kept in sync.",
|
|
59
|
+
"if": { "properties": { "role": { "const": "agent" } }, "required": ["role"] },
|
|
60
|
+
"then": { "required": ["speakerId"] }
|
|
61
|
+
}
|
|
62
|
+
],
|
|
40
63
|
"additionalProperties": true
|
|
41
64
|
},
|
|
42
65
|
"ConversationOpenedPayload": {
|
|
@@ -56,6 +79,11 @@
|
|
|
56
79
|
"minLength": 1,
|
|
57
80
|
"maxLength": 256
|
|
58
81
|
},
|
|
82
|
+
"participants": {
|
|
83
|
+
"type": "array",
|
|
84
|
+
"items": { "$ref": "#/$defs/AgentRef" },
|
|
85
|
+
"description": "RFC 0101. OPTIONAL. The agent cohort permitted to speak in this conversation (multi-party group conversation / advisory council). Each `agentId` is a roster INSTANCE id (`host:<id>` per RFC 0086). When present (and the host advertises `multiPartyConversation.supported`), a turn whose `speakerId` is not a member of this roster MUST be rejected. Disambiguates the singular `agentId` (the owner) from the full speaking cohort. Absent ⇒ no roster enforcement (back-compat: single-agent conversations are unaffected). MVP: declared at open-time only; mid-conversation roster mutation is out of scope (RFC 0101 §Resolved questions)."
|
|
86
|
+
},
|
|
59
87
|
"initialTurn": { "$ref": "#/$defs/ConversationTurn" },
|
|
60
88
|
"schema": {
|
|
61
89
|
"type": "object",
|
|
@@ -70,7 +98,27 @@
|
|
|
70
98
|
}
|
|
71
99
|
}
|
|
72
100
|
},
|
|
73
|
-
"additionalProperties": false
|
|
101
|
+
"additionalProperties": false,
|
|
102
|
+
"examples": [
|
|
103
|
+
{
|
|
104
|
+
"$comment": "RFC 0101 POSITIVE — a 3-agent advisory council opens with a participant roster (instance ids) and a user's opening turn.",
|
|
105
|
+
"conversationId": "run-abc:n1:0",
|
|
106
|
+
"agentId": "host:advisor-cfo",
|
|
107
|
+
"participants": [
|
|
108
|
+
{ "agentId": "host:advisor-cfo", "name": "CFO" },
|
|
109
|
+
{ "agentId": "host:advisor-cmo", "name": "CMO" },
|
|
110
|
+
{ "agentId": "host:advisor-cto", "name": "CTO" }
|
|
111
|
+
],
|
|
112
|
+
"initialTurn": {
|
|
113
|
+
"messageId": "run-abc:n1:0:user",
|
|
114
|
+
"from": "user",
|
|
115
|
+
"content": "Should we launch in Q3 or Q4?",
|
|
116
|
+
"ts": 1718900000000,
|
|
117
|
+
"role": "user",
|
|
118
|
+
"turnIndex": 0
|
|
119
|
+
}
|
|
120
|
+
}
|
|
121
|
+
]
|
|
74
122
|
},
|
|
75
123
|
"ConversationExchangedPayload": {
|
|
76
124
|
"type": "object",
|
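The `participants` description added above states a runtime rule that JSON Schema alone cannot express: when a roster is declared and the host advertises `multiPartyConversation.supported`, a turn whose `speakerId` is not a roster member must be rejected. A minimal sketch of that check follows. The names `AgentRefLike`, `TurnLike`, and `rosterAllowsTurn` are hypothetical and not part of the package. Treating `user`/`system` turns as exempt is an assumption drawn from the note that `speakerId` carries no normative meaning for those roles.

```typescript
// Hypothetical minimal shapes; only the fields this check reads are modeled.
interface AgentRefLike { agentId: string; name?: string }
interface TurnLike { role: string; speakerId?: string }

// Returns true when the turn passes the roster-membership rule.
// Assumption: `multiPartySupported` stands in for the host's advertised
// `multiPartyConversation.supported` capability flag.
function rosterAllowsTurn(
  participants: AgentRefLike[] | undefined,
  turn: TurnLike,
  multiPartySupported: boolean,
): boolean {
  // No roster declared, or capability not advertised: no roster enforcement.
  if (!multiPartySupported || participants === undefined) return true;
  // Assumption: only agent turns are checked against the roster.
  if (turn.role !== "agent") return true;
  // An unattributed agent turn cannot be a roster member.
  if (turn.speakerId === undefined) return false;
  return participants.some((p) => p.agentId === turn.speakerId);
}

// Roster taken from the schema's own 3-agent council example.
const council: AgentRefLike[] = [
  { agentId: "host:advisor-cfo", name: "CFO" },
  { agentId: "host:advisor-cmo", name: "CMO" },
  { agentId: "host:advisor-cto", name: "CTO" },
];
```

With this roster, a turn attributed to `host:advisor-cfo` passes, while a turn attributed to an id outside the roster does not. A conversation opened without `participants` is never rejected by this rule, which matches the back-compat note in the description.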
package/schemas/conversation-turn.schema.json
CHANGED
|
@@ -66,7 +66,42 @@
|
|
|
66
66
|
"type": "integer",
|
|
67
67
|
"minimum": 0,
|
|
68
68
|
"description": "0-based index within the conversation. `conversation.opened.initialTurn` carries `turnIndex: 0`; subsequent `conversation.exchanged` turns increment monotonically; `conversation.closed.finalTurn` carries the highest index."
|
|
69
|
+
},
|
|
70
|
+
"speakerId": {
|
|
71
|
+
"type": "string",
|
|
72
|
+
"minLength": 1,
|
|
73
|
+
"maxLength": 256,
|
|
74
|
+
"description": "RFC 0101. Stable speaker identity of this turn — the roster INSTANCE id of the agent that produced it (the `host:<id>` AgentRef agentId per `agent-roster-entry.schema.json` §A / RFC 0086), NOT the manifest/class `AgentRef.agentId`. REQUIRED when `role: 'agent'` (the conditional below) so that 'turn N was spoken by agent X' is an observable, replay-stable, cross-host-projectable fact in a multi-party transcript. When a `participants` roster is declared on `conversation.opened` (RFC 0101), `speakerId` MUST be a member of that roster; a turn whose `speakerId` is not a participant MUST be rejected (host-enforced, capability-gated on `multiPartyConversation.supported`). For `role: 'user'` / `role: 'system'` turns the field is OPTIONAL and carries no normative meaning. Additive-safe: pre-RFC-0101 `role: 'agent'` turns that omit it are accepted by hosts that do NOT advertise `multiPartyConversation`."
|
|
69
75
|
}
|
|
70
76
|
},
|
|
77
|
+
"allOf": [
|
|
78
|
+
{
|
|
79
|
+
"$comment": "RFC 0101 — agent turns MUST carry speaker attribution. Conditional (not unconditional `required`) so the field is only mandatory for `role: 'agent'`; `user`/`system` turns and pre-RFC-0101 producers on non-multi-party hosts are unaffected. Capability-gated: hosts enforce this only when advertising `multiPartyConversation.supported` (capabilities.schema.json).",
|
|
80
|
+
"if": { "properties": { "role": { "const": "agent" } }, "required": ["role"] },
|
|
81
|
+
"then": { "required": ["speakerId"] }
|
|
82
|
+
}
|
|
83
|
+
],
|
|
84
|
+
"examples": [
|
|
85
|
+
{
|
|
86
|
+
"$comment": "RFC 0101 POSITIVE — an agent turn in a multi-party council carrying speaker attribution (the roster instance id).",
|
|
87
|
+
"messageId": "council-q1:1:agent",
|
|
88
|
+
"from": "host:advisor-cfo",
|
|
89
|
+
"content": "From a cash-runway view I'd push the launch one quarter.",
|
|
90
|
+
"ts": 1718900000000,
|
|
91
|
+
"role": "agent",
|
|
92
|
+
"turnIndex": 1,
|
|
93
|
+
"speakerId": "host:advisor-cfo"
|
|
94
|
+
},
|
|
95
|
+
{
|
|
96
|
+
"$comment": "RFC 0101 POSITIVE — a user turn; speakerId is OPTIONAL for non-agent roles and omitted here.",
|
|
97
|
+
"messageId": "council-q1:0:user",
|
|
98
|
+
"from": "user",
|
|
99
|
+
"content": "Should we launch in Q3 or Q4?",
|
|
100
|
+
"ts": 1718900000000,
|
|
101
|
+
"role": "user",
|
|
102
|
+
"turnIndex": 0
|
|
103
|
+
}
|
|
104
|
+
],
|
|
105
|
+
"$comment": "RFC 0101 NEGATIVE (covered by the `allOf`/`if` conditional + conversation scenarios, NOT validatable as an `examples[]` entry which must be VALID): a `{ role: 'agent', ... }` turn that OMITS `speakerId` MUST FAIL validation; see conformance/src/scenarios/multi-party-conversation-shape.test.ts.",
|
|
71
106
|
"additionalProperties": true
|
|
72
107
|
}
|
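The `allOf`/`if`/`then` conditional in the hunk above makes `speakerId` mandatory only for `role: 'agent'` turns. The sketch below hand-rolls the same logic in TypeScript so the behaviour can be read without a schema validator. It is an illustration under stated assumptions, not the package's implementation: `speakerRuleErrors` is a hypothetical name, and only the `speakerId` constraints shown in the diff (string, 1 to 256 characters, required for agent turns) are mirrored.

```typescript
type TurnRecord = Record<string, unknown>;

// Returns human-readable violations of the speakerId rules; empty means valid.
function speakerRuleErrors(turn: TurnRecord): string[] {
  const errors: string[] = [];

  // Property constraints apply whenever `speakerId` is present.
  if ("speakerId" in turn) {
    const id = turn["speakerId"];
    if (typeof id !== "string") {
      errors.push("speakerId must be a string");
    } else {
      // JSON Schema measures string length in Unicode code points,
      // so spread the string rather than using UTF-16 `.length`.
      const length = [...id].length;
      if (length < 1) errors.push("speakerId is shorter than minLength 1");
      if (length > 256) errors.push("speakerId is longer than maxLength 256");
    }
  }

  // `if`: `role` is present and equals "agent".
  // `then`: `speakerId` is required.
  if (turn["role"] === "agent" && !("speakerId" in turn)) {
    errors.push("speakerId is required when role is 'agent'");
  }
  return errors;
}

// The negative case named in the schema's `$comment`: an agent turn
// that omits speakerId.
const unattributed: TurnRecord = {
  messageId: "council-q1:1:agent",
  from: "host:advisor-cfo",
  content: "From a cash-runway view I'd push the launch one quarter.",
  ts: 1718900000000,
  role: "agent",
  turnIndex: 1,
};
```

Passing `unattributed` yields a single error, and adding `speakerId: "host:advisor-cfo"` clears it. A `role: 'user'` turn without `speakerId` yields no errors, as in the second schema example. Capability gating is out of scope here: per the schema comment, hosts enforce the conditional only when they advertise `multiPartyConversation.supported`.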