@openwop/openwop-conformance 1.21.0 → 1.23.0
- package/CHANGELOG.md +43 -2
- package/README.md +61 -63
- package/api/asyncapi.yaml +54 -38
- package/api/openapi.yaml +34 -6
- package/coverage.md +381 -202
- package/fixtures/connection-packs/connection-pack-github.json +31 -0
- package/fixtures.md +120 -101
- package/package.json +1 -1
- package/schemas/README.md +1 -0
- package/schemas/capabilities.schema.json +49 -0
- package/schemas/connection-pack-manifest.schema.json +161 -0
- package/schemas/run-event-payloads.schema.json +6 -5
- package/schemas/run-event.schema.json +11 -2
- package/schemas/run-options.schema.json +1 -2
- package/schemas/run-snapshot.schema.json +2 -1
- package/schemas/suspend-request.schema.json +5 -0
- package/src/scenarios/connection-pack-manifest-valid.test.ts +122 -0
- package/src/scenarios/connection-pack-no-credential-material.test.ts +125 -0
- package/src/scenarios/connection-pack-reach-exclusive.test.ts +85 -0
- package/src/scenarios/connection-pack-write-reconsent.test.ts +91 -0
- package/src/scenarios/connection-provider-resolution.test.ts +153 -0
- package/src/scenarios/cross-host-traceparent-propagation.test.ts +3 -3
- package/src/scenarios/fixtures-valid.test.ts +34 -0
- package/src/scenarios/grpc-transport.test.ts +108 -0
- package/src/scenarios/i18n-negotiation.test.ts +181 -0
- package/src/scenarios/interrupt-token-matrix.test.ts +2 -2
- package/src/scenarios/media-url-inline-cap.test.ts +5 -3
- package/src/scenarios/spec-corpus-validity.test.ts +107 -0
- package/src/scenarios/stream-text-fixture.test.ts +212 -0
- package/src/scenarios/version-fold.test.ts +193 -0
- package/src/scenarios/wasm-pack-memory-cap.test.ts +4 -2
- package/src/scenarios/webhook-tenant-isolation.test.ts +184 -0
package/coverage.md
CHANGED
@@ -1,160 +1,164 @@
# OpenWOP Conformance Coverage Map
-
> **Status: Living document. Updated 2026-06-
+
> **Status: Living document. Updated 2026-06-11.** This map connects the current scenario files to the protocol surfaces they protect and records the remaining gaps from the protocol deep dive. Scenario names are source-of-truth file names under `conformance/src/scenarios/`.
> **Shape grade vs behavior grade.** Some optional-profile scenarios validate **capability shape** (the host's discovery advertisement is well-formed) without yet exercising **behavior** (the host actually implements the profile end-to-end). The "Current grade" column reflects shape; see §"Capability-gated scenarios: shape vs behavior" below for the dual-grade view and the `OPENWOP_REQUIRE_BEHAVIOR=true` strict-mode runner flag.
+
> **Sibling-repo pointer convention (2026-06 monorepo split).** Reference-implementation files that used to live in this monorepo are cited repo-qualified as `<repo>:<path>`, where `<repo>` is a repository under `https://github.com/openwop` — e.g. `openwop-examples:examples/hosts/sqlite/test/audit-tamper.test.ts`, `openwop-app:backend/typescript/test/agent-dispatch-route.test.ts`, `openwop-registry:registry/scripts/verify-signatures.mjs`. Bare paths refer to this repository. (Same convention as `SECURITY/invariants.yaml`.)
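The pointer convention above is mechanical enough to resolve in code. The following is a minimal sketch, not part of the package: `parsePointer`, `toGitHubUrl`, the `thisRepo` argument, and the `main` branch default are all illustrative assumptions. Only the `<repo>:<path>` shape and the `https://github.com/openwop` organization URL come from the note itself.

```typescript
// Sketch only: every name here is hypothetical. The note defines the
// `<repo>:<path>` shape and the organization URL, nothing else.
const ORG_URL = "https://github.com/openwop";

interface RepoPointer {
  /** Sibling repository name, or null when the pointer is a bare path. */
  repo: string | null;
  path: string;
}

/** Split a `<repo>:<path>` pointer; anything else is treated as a bare path. */
function parsePointer(pointer: string): RepoPointer {
  const colon = pointer.indexOf(":");
  const slash = pointer.indexOf("/");
  // Qualified only when a non-empty repo segment precedes the first slash,
  // so a colon that appears later inside a path is not mistaken for one.
  const qualified =
    colon > 0 &&
    colon < pointer.length - 1 &&
    (slash === -1 || colon < slash);
  return qualified
    ? { repo: pointer.slice(0, colon), path: pointer.slice(colon + 1) }
    : { repo: null, path: pointer };
}

/**
 * Build a browsable URL. `thisRepo` names the repository that bare paths
 * refer to, and `branch` is assumed to be "main"; both are caller-supplied
 * guesses rather than anything the coverage map states.
 */
function toGitHubUrl(pointer: string, thisRepo: string, branch = "main"): string {
  const { repo, path } = parsePointer(pointer);
  return `${ORG_URL}/${repo ?? thisRepo}/blob/${branch}/${path}`;
}
```

Under these assumptions, `parsePointer("openwop-registry:registry/scripts/verify-signatures.mjs")` yields the repo `openwop-registry` and the path `registry/scripts/verify-signatures.mjs`, while a bare path such as `scripts/verify-audit-checkpoints.mjs` comes back with `repo: null` and resolves against whatever `thisRepo` the caller passes.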
+
---
## Coverage by protocol surface
-
| Surface
-
-
| Discovery and capability handshake
-
| Auth and errors
-
| Run lifecycle
-
| Idempotency and retry
-
| Interrupts
-
| Streaming
-
| Replay and fork
-
| Capabilities and limits
-
| State channels and reducers
-
| Sub-workflows and dispatch
-
| Node packs and registry
-
| Secrets and redaction
-
| Observability and diagnostics
-
| Fixtures and corpus validity
-
| Run control — pause/resume
-
| Rate-limit envelope
-
| SSE longevity (out-of-fast-CI soak)
-
| Load profile (out-of-fast-CI throughput)
-
| Per-workflow `configurableSchema`
-
| Append-reducer ordering
-
| Webhook signature algorithms
-
| Audit-log integrity profile
-
| Multi-region idempotency capability
-
| Public hosted registry (`packs.openwop.dev`)
-
| Workflow-chain packs (RFC 0013 — `spec/v1/workflow-chain-packs.md`)
-
| AI Envelope (FINAL v1.1 — `spec/v1/ai-envelope.md`, RFC 0021)
-
| Envelope `reasoning` field + Tier-1 subset (RFC 0030 — `spec/v1/ai-envelope.md` §"Reasoning field (normative)", `spec/v1/structured-output-subset.md`)
-
| Envelope variant discrimination + model capabilities (RFC 0031 — `spec/v1/ai-envelope.md` §"Variant payload discrimination (normative)", `spec/v1/host-capabilities.md` §"Model-capability declarations", `spec/v1/node-packs.md` §"Model-capability declarations on NodeModules") | `envelope-variant-discriminator-static.test.ts`, `model-capability-substituted.test.ts`, `model-capability-insufficient.test.ts`, `node-module-required-capabilities-shape.test.ts`
-
| Multimodal envelope variants (RFC 0055 — `spec/v1/ai-envelope.md` §"Rendering hints" + §"Media reference payloads")
-
| Envelope-reliability run-event vocabulary (RFC 0032 — `spec/v1/ai-envelope.md` §"Envelope-reliability events" + line-448 scope clarification, `spec/v1/observability.md` §"Envelope-reliability events (RFC 0032)")
-
| Envelope-completion retry routing (RFC 0033 — `spec/v1/ai-envelope.md` §"Envelope-completion criteria", `spec/v1/observability.md` §"Envelope-completion retry routing (RFC 0033)")
-
| Multi-agent execution model + handoff state machine (RFC 0037 — `spec/v1/multi-agent-execution.md`, `version: 1`)
-
| Multi-agent confidence-floor escalation (RFC 0039 — `spec/v1/multi-agent-execution.md` §"Confidence escalation", `version: 2`)
-
| Sandbox execution contract (RFC 0035 — `spec/v1/host-capabilities.md` §"Sandbox execution contract")
-
| Multi-region idempotency + cross-engine append-ordering (RFC 0036 — `spec/v1/idempotency.md` §"`multiRegion` sub-block", `spec/v1/replay.md` §"Cross-region replay")
-
| Secret-leakage telemetry / debug-bundle export (RFC 0034 §B — `spec/v1/host-capabilities.md` §"OTel collector test seam")
-
| Experimental capability tier (RFC 0042 — `schemas/capabilities.schema.json` §`multiAgent.executionModel.tier`)
-
| Sandbox WASM-isolation behavioral graduation (RFC 0035 §B)
-
| Sandbox MVP behavioral close-out (RFC 0035 §B)
-
| RFC 0041 §B replay-divergence-at-refusal behavioral (`version: 4`)
-
| Agent-manifest runtime floor (RFC 0070 — `capabilities.agents.manifestRuntime`)
-
| Live manifest dispatch (RFC 0077 — `capabilities.agents.liveRuntime`)
-
| Agent evaluation (RFC 0081 — `capabilities.agents.evalSuite`)
-
| Memory capability model (RFC 0080 — `spec/v1/agent-memory.md` §"Memory capability model", `spec/v1/profiles.md` §`openwop-memory`)
-
| Portable tool catalog (RFC 0078 — `spec/v1/tool-catalog.md`)
-
| Credential provenance + egress policy (RFC 0079 — `spec/v1/host-capabilities.md` §"Credential provenance + egress policy")
-
| Durable trigger + channel bridge (RFC 0083 — `spec/v1/trigger-bridge.md`, `spec/v1/profiles.md` §`openwop-trigger-bridge`)
-
| Budget, quota + cost policy (RFC 0084 — `spec/v1/budget-policy.md`)
-
| `openwop-agent-platform` meta-profile (RFC 0085 — `spec/v1/agent-platform-profile.md`, operational annex)
-
| `openwop-core-standard` profile (RFC 0088 — `spec/v1/core-standard-profile.md`, operational annex)
-
| Agent deployment lifecycle (RFC 0082 — `capabilities.agents.deployment`)
-
-
+
| Surface | Scenario files | Current grade | Remaining gaps |
+
| --- | --- | --- | --- |
+
| Discovery and capability handshake | `discovery.test.ts` (incl. RFC 0011 auth-scoped subtests), `runtime-capabilities.test.ts`, `profileDerivation.test.ts`, `mcp-discoverability.test.ts` | A | `Capabilities-Etag` optional runtime shape covered; auth-scoped discovery covered under `openwop-discovery-auth-scoped` (RFC 0011); non-HTTP handoff remains host-advertised follow-up. |
+
| Auth and errors | `auth.test.ts`, `errors.test.ts`, `policies.test.ts`, `providerPolicyEnforcement.test.ts`, `auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts` | A− | Three auth-profile capability-shape + negative-case scenarios shipped under RFC 0010 (capability-gated). Remaining: live-IdP positive-path validation + opt-in mTLS scenario + richer scope matrix. |
+
| Run lifecycle | `runs-lifecycle.test.ts`, `failure-path.test.ts`, `cancellation.test.ts`, `eventOrdering.test.ts`, `restart-during-run.test.ts` | A | Restart-during-run scenario shipped; gated under `openwop-production` profile (RFC 0009). |
+
| Idempotency and retry | `idempotency.test.ts`, `idempotencyRetry.test.ts`, `highConcurrency.test.ts` | A- | Long retention proof beyond the fast CI window. |
+
| Interrupts | `interrupt-approval.test.ts`, `interrupt-clarification.test.ts`, `approval-payload.test.ts`, `interruptRace.test.ts`, `interrupt-quorum-resolution.test.ts`, `interrupt-external-event-correlation.test.ts`, `interrupt-auth-required-resume.test.ts`, `interrupt-parent-child-cascade.test.ts` | A− | All four optional profile scenarios landed 2026-05-10. Remaining: positive end-to-end run against a host that advertises every profile. |
+
| Streaming | `stream-modes.test.ts`, `stream-modes-buffer.test.ts`, `stream-modes-mixed.test.ts`, `streamReconnect.test.ts` | A | Browser/proxy timeout matrix and long-running stream soak. |
+
| Replay and fork | `replay-fork.test.ts`, `replay-fork-arbitrary.test.ts`, `replay-retention-expiry.test.ts`, `replayDeterminism.test.ts`, `staleClaim.test.ts` | A | Arbitrary-event fork shipped (`replay-fork-arbitrary.test.ts`). Retention-expiry envelope shipped (`replay-retention-expiry.test.ts`) — gated on `OPENWOP_TEST_EXPIRED_REPLAY_RUN_ID` because hosts don't standardize a force-expire endpoint. Retention/privacy/scoring semantics specified in `replay.md`. |
+
| Capabilities and limits | `cap-breach.test.ts`, `dispatchLoop.test.ts` | B+ | Clarification/schema/envelope cap-breach fixtures beyond node-execution cap. |
+
| State channels and reducers | `channel-ttl.test.ts` | B+ | Cross-adapter reducer consistency and conflict cases. |
+
| Sub-workflows and dispatch | `subworkflow.test.ts`, `multi-node-ordering.test.ts` | B+ | Parallel fan-out floors by scale tier, parent/child cancellation. |
+
| Node packs and registry | `pack-registry.test.ts`, `pack-registry-publish.test.ts`, `maliciousManifest.test.ts`, `wasm-pack-load.test.ts`, `wasm-pack-invoke-completed.test.ts`, `wasm-pack-invoke-suspended.test.ts`, `wasm-pack-replay-determinism.test.ts`, `wasm-pack-memory-cap.test.ts`, `wasm-pack-abi-version-rejection.test.ts` | A | RFC 0008 WASM ABI scenarios landed 2026-05-10; gated on `capabilities.nodePackRuntimes.wasm.supported`. Memory-cap positive path closed 2026-05-12 via `openwop-examples:examples/packs/rust-misbehaving-memory/` + in-memory loader's `memory.grow` probe + `capBreached.kind` schema extension. ABI-mismatch positive path closed 2026-05-12 via `openwop-examples:examples/packs/rust-misbehaving-abi/` (declares ABI 999) + new `capabilities.nodePackRuntimes.wasm.loadedPacks[]` discovery field (rejected packs omitted). Remaining: public hosted registry tarball-fetch + signature-verify roundtrip and host-side pack consumption proof. |
+
| Secrets and redaction | `redaction.test.ts`, `redactionAdversarial.test.ts`, `byok-roundtrip.test.ts` | A- | Cross-provider BYOK matrix and debug-bundle redaction under high volume. |
+
| Observability and diagnostics | `cost-attribution.test.ts`, `debugBundle.test.ts`, `otel-emission.test.ts`, `otel-trace-propagation.test.ts`, `otel-trace-propagation-subworkflow.test.ts`, `metric-emission.test.ts`, `otel-emission-grpc.test.ts` | A | OTLP collector accepts all three OTLP transports: HTTP-JSON (2026-05-11), HTTP-protobuf (2026-05-12), gRPC over h2c HTTP/2 (2026-05-12 — hand-rolled framing at `conformance/src/lib/grpc-framing.ts`). Zero new npm deps for any of them. Opt-in via `OPENWOP_OTEL_COLLECTOR=true` (+ `OPENWOP_OTEL_COLLECTOR_GRPC=true` for the gRPC variant). Hosts advertise supported transports via `capabilities.observability.otel.exportProtocols ⊆ {http/json, http/protobuf, grpc}`; conformance scenario gates on the array. Trace-context propagation closed 2026-05-13 with `otel-trace-propagation-subworkflow.test.ts` asserting parent + child spans share the inbound traceparent across the `core.subWorkflow` dispatch boundary. |
+
| Fixtures and corpus validity | `fixtures-valid.test.ts`, `fixtures-gating.test.ts`, `spec-corpus-validity.test.ts` | A | Keep fixture manifest synchronized as new optional profiles land. |
+
| Run control — pause/resume | `pause-resume.test.ts` | A− | Direct `pauseRun` / `resumeRun` route exercisers cover running → paused → resumed, terminal conflict, non-paused resume conflict, idempotent re-pause, and pause-during-suspend race. Remaining: explicit immediate-vs-drain-current-node policy assertion across hosts that advertise both drain policies. |
+
| Rate-limit envelope | `rate-limit-envelope.test.ts` | A− | Shape validation under deterministic 429 induction (CF-6 close-out 2026-05-15); Postgres reference host honors `OPENWOP_FORCE_RATE_LIMIT=true` to return canonical envelope (`rate_limited` + `Retry-After` + `details.scope: 'global'` + `details.retryAfterMs`) on every route. Live-verified: shape assertion passes against the seam. |
+
| SSE longevity (out-of-fast-CI soak) | `conformance/soak/sse-longevity.mjs` | A− | CF-10 close-out 2026-05-15. Documented soak runner exercises the SSE event stream for `OPENWOP_SOAK_DURATION_SECONDS` (default 300) creating runs at `OPENWOP_SOAK_RUN_INTERVAL_SECONDS` cadence; tracks reconnects, heartbeat cadence (min/max gaps), longest quiet period, total events. Emits a single JSON line on stdout for operator dashboards. NOT registered in `openwop:check`; deployer-invoked per `docs/PRODUCTION-RUNBOOK.md`. |
+
| Load profile (out-of-fast-CI throughput) | `conformance/soak/load-profile.mjs` | A− | OPS-2 close-out 2026-05-15. Throughput / latency runner across 5 canonical paths (`create` / `poll` / `sse` / `interrupt` / `webhook`). Configurable `OPENWOP_LOAD_SAMPLES` × `OPENWOP_LOAD_CONCURRENCY`. Emits per-path min/p50/p95/p99/max latency in milliseconds as a single JSON line. `interrupt` + `webhook` paths are honest-stubbed (require host-side test seams not yet stabilized in the reference). NOT registered in `openwop:check`. |
+
| Per-workflow `configurableSchema` | `configurable-schema.test.ts` | A− | Negative validation + positive accepted-overlay + `GET /v1/workflows/{id}` schema-surface assertion all covered (CF-7 close-out, 2026-05-15). Grade moves C+ → A−. |
+
| Append-reducer ordering | `append-ordering.test.ts` | B | Intra-engine sequence-order check; remaining: cross-engine ordering under a multi-engine fixture. |
+
| Webhook signature algorithms | `webhook-sig-algorithm.test.ts`, `webhook-signed-delivery.test.ts`, `webhook-negative.test.ts`, `webhook-receiver-adversarial.test.ts` | A | Forward direction: discovery shape + end-to-end signed delivery with HMAC verification (`webhook-signed-delivery`); negative paths (SSRF guard / validation / unknown-unregister via `webhook-negative`). Reverse direction (CF-5 close-out 2026-05-15): `webhook-receiver-adversarial.test.ts` covers 6 paths — positive control + tampered body + tampered HMAC + stale timestamp + replayed signature + wrong algorithm + malformed signature header. Reference receiver implementation at `conformance/src/lib/webhook-receiver.ts` mirrors the SDK's `verifyWebhookSignature` helper. |
+
| Audit-log integrity profile | `audit-log-integrity.test.ts` | A | Profile claim + `/v1/audit/verify` shape + checkpoint-signature verification; chain re-walk with `chainValid` and `checkpointsValid` bits. Tamper detection covered host-internally at `openwop-examples:examples/hosts/sqlite/test/audit-tamper.test.ts` (mutate-entry + forge-signature paths). CF-11 close-out 2026-05-15: cross-host checkpoint export via `openwop-examples:examples/hosts/postgres/src/audit-export.ts` + standalone verifier at `scripts/verify-audit-checkpoints.mjs`; `openwop-examples:examples/hosts/postgres/test/audit-checkpoint-export.test.ts` verifies 7 paths (positive + tampered-signature + non-monotonic-atSequence). The standalone verifier is additionally regression-guarded in `openwop:check` (step 7) against the committed `conformance/audit-export-samples/{valid,tampered}.json` bundles — accept-valid (exit 0) + reject-tampered (exit 1) — so a verifier regression is caught at the spec gate, not only in the postgres host's own test. |
+
| Multi-region idempotency capability | `multi-region-idempotency.test.ts` | C | Discovery enum coverage; remaining: cross-region partition simulation (requires multi-region harness). |
+
| Public hosted registry (`packs.openwop.dev`) | `registry-public.test.ts` | A− | Discovery, index, and per-pack manifest assertions against the public registry. Opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true` so default conformance runs don't depend on outbound `packs.openwop.dev` reachability. Remaining: tarball-fetch + signature-verify roundtrip. |
+
| Workflow-chain packs (RFC 0013 — `spec/v1/workflow-chain-packs.md`) | `workflow-chain-pack-manifest-validation.test.ts`, `workflow-chain-pack-signature-verification.test.ts`, `workflow-chain-expansion.test.ts`, `workflow-chain-unresolvable-typeid.test.ts`, `workflow-chain-host-expansion.test.ts` | A (4 server-free + 1 live-host) | RFC 0013 promoted Draft → Active → Accepted 2026-05-18 once the live-host gate (`workflow-chain-host-expansion.test.ts`) passed against the reference in-memory host. The 4 server-free scenarios exercise the pure-library algorithm; the live-host scenario exercises the host's HTTP wrapper (`POST /v1/host/sample/workflow-chain:expand`) under `OPENWOP_REQUIRE_BEHAVIOR=true`, gated on `capabilities.workflowChainPacks.supported: true`. 6 cases — discovery advertisement / 1-node positive / 2-node positive with edges / unknown-pack 404 / unknown-chain 404 / malformed-body 422. The matching spec doc `workflow-chain-packs.md` remains at `DRAFT v1.x` pending Phase B/C closure (parameter schema validation + cross-host expansion equivalence). |
+
| AI Envelope (FINAL v1.1 — `spec/v1/ai-envelope.md`, RFC 0021) | `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`, `ai-envelope-shape.test.ts` | A− (shape + ~84 live behavioral assertions across all 8 files via the envelope-accept seam) | DRAFT v1.x gap-closure landed 2026-05-17; RFC 0021 promotion to FINAL v1.1 landed 2026-05-18. Closes the long-standing gap where 8 v1 surfaces already referenced AI Envelopes (`Capabilities.supportedEnvelopes` + `schemaVersions` + the three per-turn limits; `host.aiEnvelope.generate`; `envelopeType` on workflow-chain packs; `profiles.md` derivation; `host-extensions.md` namespacing; `positioning.md`; reference host discovery) but the envelope's own wire shape, universal kinds, schema discipline, and Envelope Contract gate were never specified. The 7 `aiEnvelope.*` advertisement-shape probes (gated on `capabilities.envelopeContracts.advertised: true`) cover the host's CAPABILITY claim. All 8 files now also run behavioral assertions through the workflow-engine sample's env-gated `POST /v1/host/sample/envelope/accept` seam (drained 2026-05-19): schema-drift refusal under strict mode; correlationId-replay short-circuit + persisted `priorCorrelations` store survives process restart; BYOK redaction-carry-forward returning `redactedPayload` + `redactionCount`; contract-refusal mappings via the capability-toggle seam; trust-boundary propagation from MCP/A2A; cap-breached counters; universal-kind always-allowed; `ai-envelope-shape` end-to-end accept-pipeline. Path to `Accepted` (host-side): live downstream-projection coverage on a published host (OTel scrape + debug-bundle export currently soft-skip on hosts that don't expose those seams). |
+
| Envelope `reasoning` field + Tier-1 subset (RFC 0030 — `spec/v1/ai-envelope.md` §"Reasoning field (normative)", `spec/v1/structured-output-subset.md`) | `envelope-reasoning-shape.test.ts`, `envelope-reasoning-secret-redaction.test.ts`, `envelope-tier-one-subset-static.test.ts` | A (shape + load-bearing Tier-1 static checks always-on; 8 live behavioral assertions in `envelope-reasoning-secret-redaction` via the envelope-accept seam, including the OTel + debug-bundle downstream projections) | RFC 0030 promoted Draft → Active 2026-05-20. `envelope-reasoning-shape` (always-on) asserts the OPTIONAL `reasoning` property posture on the three universal-kind schemas + the `schema.response` deliberate omission + with/without backward-compat round-trips + `capabilities.envelopes.reasoning.{supported,promptDirective}` + `tierOneSubsetCompliance` advertisement shape. `envelope-tier-one-subset-static` enforces the load-bearing Tier-1 subset (no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere — Gemini silently drops these) on every universal-kind schema as always-on; the OpenAI-strict-only constraints (`minLength` / `maxLength` / `minItems` etc.) are checked only under host-advertised `tierOneSubsetCompliance: "strict"` to honor the universal-kind schemas' pre-RFC-0030 open-bag design. `envelope-reasoning-secret-redaction` (capability-gated on `reasoning.supported` + `secrets.supported`) carries 8 live behavioral assertions for SECURITY invariant `envelope-reasoning-secret-redaction`: BYOK canary substitution with the canonical `[REDACTED:<secretId>]` marker on `reasoning`; recursive walk across `reasoning` + sibling fields; passthrough when no canary matches; canary detection inside `clarification.request.reasoning`; downstream OTel-span scrape + debug-bundle export both confirm no canary plaintext leaks (soft-skip on hosts that don't expose those seams); non-routing-on-reasoning invariant (acceptor's routing decision MUST NOT depend on `reasoning` contents). Reference host advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"`. Path to `Accepted`: reference host injects the system-prompt directive instructing the model to populate `reasoning` (promotes `promptDirective` → `"advisory"`). |
+
| Envelope variant discrimination + model capabilities (RFC 0031 — `spec/v1/ai-envelope.md` §"Variant payload discrimination (normative)", `spec/v1/host-capabilities.md` §"Model-capability declarations", `spec/v1/node-packs.md` §"Model-capability declarations on NodeModules") | `envelope-variant-discriminator-static.test.ts`, `model-capability-substituted.test.ts`, `model-capability-insufficient.test.ts`, `node-module-required-capabilities-shape.test.ts` | B+ (discriminator-static + advertisement-shape always-on; 14 live behavioral assertions across substitution + insufficient + authoring-convention, capability-gated) | RFC 0031 promoted Draft → Active 2026-05-20. `envelope-variant-discriminator-static` (always-on) walks every `schemas/envelopes/*.schema.json` asserting no `oneOf` at any nesting depth (Gemini silently drops `oneOf`, producing looser-than-declared schemas — a silent correctness bug) AND every `anyOf` branch declares a single-string-`enum` discriminator in `required` per RFC 0031 §A. `model-capability-substituted` (capability-gated on `capabilities.modelCapabilities.supported` + `substitutionSupported`) carries advertisement-shape check on the `advertised: string[]` pattern (each identifier matches the spec-reserved set OR `^x-host-<host>-<key>$` per RFC 0031 §C) + 4 live behavioral assertions covering substitution emission + SECURITY invariant `model-capability-substituted-no-credential-disclosure`'s all-or-nothing `"[REDACTED]"` redaction option. `model-capability-insufficient` (capability-gated on `modelCapabilities.supported`) carries 6 live behavioral assertions covering refusal emission paths + the no-recursive-fallback constraint (RFC 0031 §"Unresolved questions" #3 — `fallbackAttempted: true` when the declared fallback itself fails; NO chaining). `node-module-required-capabilities-shape` (SHOULD-tier authoring convention check) carries 4 live assertions for the `core.ai.*` typeId-pattern recommendation. Path to `Accepted`: reference host implements `executor/modelCapabilityGate.ts` end-to-end + advertises `capabilities.modelCapabilities: { supported: true, advertised: [...], substitutionSupported: true }` (the live behavioral assertions soft-skip cleanly on hosts that haven't wired the executor yet). |
+
| Multimodal envelope variants (RFC 0055 — `spec/v1/ai-envelope.md` §"Rendering hints" + §"Media reference payloads") | `envelope-rendering-hint.test.ts`, `media-url-inline-cap.test.ts` | A (always-on schema shape across §B + §C; live media store→serve→tenant-scoping behavioral assertions against the reference-host seam, soft-skip offline) | **RFC 0055 promoted Draft → Active 2026-05-25.** §B: `envelope-rendering-hint` (always-on, server-free Ajv) asserts the optional `meta.rendering` hint on the `EnvelopeMeta` $def — well-formed hint validates; omitting it still validates (optionality); unknown `display` rejected by the closed enum; unknown property rejected (additionalProperties:false). Consumer-fallback rendering is exercised app-side (`chat/MessageRenderer.tsx`). §C: `media-url-inline-cap` compiles + round-trips the three `media.{image,audio,file}` payload schemas, asserts `aiProviders.maxInlineMediaBytes` shape, and (live, vs. the reference host's `POST /v1/host/sample/media/put` + public token-authed `GET /v1/host/sample/assets/{token}` seam) verifies a stored asset is served by a tenant-scoped URL and an unminted/guessed token does not resolve — the `media-asset-url-tenant-scoped` SECURITY invariant. §A capability vocabulary (`vision-input`/`audio-input`/`audio-output`/`image-output`) is registered in the open `modelCapabilities.advertised` prose registry; a host advertises a given identifier only when its model supports it (the reference mock model advertises none). Reference host advertises `media.*` in `supportedEnvelopes`/`schemaVersions` + `maxInlineMediaBytes` + serves assets. `Active → Accepted` awaits a non-steward host. **Honesty note (2026-06-11):** despite the A grade, `media-url-inline-cap.test.ts` still carries the `it.todo`-staging marker in its file header for the §C behavioral legs — the tenant-scoping leg runs live (soft-skip offline), but the over-cap inline→url replacement behavior (§C rule 2) remains unexercised end-to-end. |
+
| Envelope-reliability run-event vocabulary (RFC 0032 — `spec/v1/ai-envelope.md` §"Envelope-reliability events" + line-448 scope clarification, `spec/v1/observability.md` §"Envelope-reliability events (RFC 0032)") | `envelope-retry-attempted.test.ts`, `envelope-retry-exhausted.test.ts`, `envelope-refusal-shape.test.ts`, `envelope-truncated.test.ts`, `envelope-nl-to-format-engaged.test.ts`, `envelope-recovery-applied.test.ts` | B (1 shared advertisement-shape probe with MUST-events enforcement; 34 live behavioral assertions across the six events, all capability- + fixture-gated) | RFC 0032 promoted Draft → Active 2026-05-20. Carries the central `ai-envelope.md` line-448 scope clarification (per-kind routing events forbidden; cross-kind operational events permitted via RFC). `envelope-retry-attempted` carries the shared advertisement-shape probe: when `capabilities.envelopes.reliability.supported: true`, the host MUST list both `envelope.retry.exhausted` AND `envelope.refusal` in `events[]` (the two MUST-tier events per RFC 0032 §C); `maxRetryAttempts` MUST be in `[1, 16]`. The six scenarios collectively carry 34 live behavioral assertions (drained 2026-05-19 via the conformance `mock` provider + `POST /v1/host/sample/test/mock-ai/program` seam): retry on schema-violation + retry on truncation + retry-exhausted terminal failure + provider refusal (no-retry MUST per RFC 0032 §B.3 + RFC 0033 §D) + truncation cut-off + NL-to-Format escalation (Tam et al. mitigation per arXiv 2408.02442) + lenient-parsing recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` (BYOK + prompt-content redaction on `refusalText`) and `envelope-recovery-no-content-leak` (no pre-recovery substrings in the event payload). Path to `Accepted`: reference host implements `executor/envelopeReliability.ts` end-to-end + advertises `capabilities.envelopes.reliability: { supported: true, events: [...], maxRetryAttempts: <n> }` (the behavioral assertions already pass against the reference host's end-to-end emission path under `OPENWOP_ENVELOPE_RELIABILITY_END_TO_END=true`; the no-flag default still soft-skips). |
| Envelope-completion retry routing (RFC 0033 — `spec/v1/ai-envelope.md` §"Envelope-completion criteria", `spec/v1/observability.md` §"Envelope-completion retry routing (RFC 0033)") | `envelope-completion-distinguishes-truncation.test.ts`, `envelope-truncation-cap-exhaustion.test.ts` | B− (1 advertisement-shape probe on `completion.{distinguishesTruncation, truncationBudgetMultiplier}`; 9 live behavioral assertions across the two retry paths + the DoS-bound assertion) | RFC 0033 promoted Draft → Active 2026-05-20. Closes `spec/v1/ai-envelope.md` §"Open spec gaps" E5 (refusal-mode + retry-policy interaction). Reuses RFC 0032's event vocabulary; introduces NO new event types. `envelope-completion-distinguishes-truncation` (capability-gated on `completion.distinguishesTruncation: true`) carries 5 live behavioral assertions covering both retry paths — truncation MUST increase output budget (RECOMMENDED 2× per `truncationBudgetMultiplier`) WITHOUT a corrective fragment; schema-violation MUST add a corrective fragment WITHOUT a budget change. `envelope-truncation-cap-exhaustion` carries 4 live behavioral assertions covering the DoS-bound assertion (truncation retries count against `Capabilities.limits.schemaRounds`; exhaustion → `envelope.retry.exhausted { finalReason: "truncation" }` + `cap.breached { kind: "schema" }` + node fails with NEW error code `envelope_truncation_unrecoverable` per RFC 0033 §F). All 9 assertions are fixture- + capability-gated against the conformance `mock` provider via `POST /v1/host/sample/test/mock-ai/program`. Path to `Accepted`: reference host implements the truncation-vs-schema-violation retry-routing branch end-to-end (`executor/envelopeReliability.ts` + `stop_reason` inspection in `aiProviders/aiProvidersHost.ts`) + advertises `capabilities.envelopes.reliability.completion.distinguishesTruncation: true`. |
| Multi-agent execution model + handoff state machine (RFC 0037 — `spec/v1/multi-agent-execution.md`, `version: 1`) | `multi-agent-handoff-state-machine.test.ts` | B (1 advertisement-shape probe + 1 behavioral 4-event causation-chain assertion against the parent+child fixture pair) | RFC 0037 filed Draft → promoted Active 2026-05-21 after spec + schema + scenario landed atomically. Advertisement-shape probe asserts `capabilities.multiAgent.executionModel.{supported, version ∈ [1,4]}` when present. Behavioral assertion drives the `conformance-multi-agent-handoff` parent + `conformance-multi-agent-handoff-child` fixture pair: runs the supervisor → next-worker → child completed loop and asserts the 4 `core.workflowChain.event` records appear in the exact phase sequence `dispatch.began → dispatch.succeeded → child.completed → output.harvested` with each event's `causationId === prior.eventId` and `dispatch.began.causationId === runOrchestrator.decided.eventId`, plus `output.harvested.harvestedKeys === ['parentResult']` (proves the spec §"Transition events" table on real wire). Reference workflow-engine advertises + emits end-to-end when `OPENWOP_MULTI_AGENT_EXECUTION_MODEL=true`; the no-flag default soft-skips honestly. Path to `Accepted`: non-steward host advertises + the behavioral assertion passes against it. |
| Multi-agent confidence-floor escalation (RFC 0039 — `spec/v1/multi-agent-execution.md` §"Confidence escalation", `version: 2`) | `multi-agent-confidence-escalation.test.ts` | B (1 advertisement-shape probe on `confidenceEscalationFloor` + 1 behavioral assertion against the low-confidence fixture) | RFC 0039 filed Draft → promoted Active 2026-05-22 after the confidence-floor half landed end-to-end. Advertisement-shape probe asserts `capabilities.multiAgent.executionModel.confidenceEscalationFloor` (when present) is a number in `[0.5, 1.0]`; values below the spec floor are non-conformant. Behavioral assertion drives the `conformance-multi-agent-confidence-escalation` fixture (supervisor `mockDispatchPlan` carries one decision with `confidence: 0.3`) and asserts: parent reaches `waiting-clarification` (NOT `completed` because no dispatch fired); exactly ONE `core.workflowChain.confidence-escalated` event with `payload.confidence === 0.3`, `payload.floor ∈ [0.5, 1.0]`, `payload.escalationKind ∈ {clarify, escalate}`; causationId chains back to the `runOrchestrator.decided` event; ZERO `core.workflowChain.event` records (the load-bearing distinction from `version: 1` — confidence floor MUST fire BEFORE any dispatch.began). Reference workflow-engine advertises `version: 2` + `confidenceEscalationFloor: 0.5` when both `OPENWOP_MULTI_AGENT_EXECUTION_MODEL=true` AND `OPENWOP_MULTI_AGENT_EXECUTION_MODEL_PHASE_2=true` are set; floor tunable via `OPENWOP_MULTI_AGENT_CONFIDENCE_FLOOR`. Path to `Accepted`: non-steward host advertises `version: 2` + the behavioral assertion passes against it. Memory-lifecycle half of RFC 0039 (MAE-2/3) remains explicit follow-up: `crossChildMemoryConcurrency` capability field is schema-landed but the host's MemoryAdapter doesn't yet implement either contract. |
| Sandbox execution contract (RFC 0035 — `spec/v1/host-capabilities.md` §"Sandbox execution contract") | `sandbox-no-host-fs-escape.test.ts`, `sandbox-no-host-env-leak.test.ts`, `sandbox-no-network-escape.test.ts`, `sandbox-no-host-process-escape.test.ts`, `sandbox-memory-cap.test.ts`, `sandbox-timeout-cap.test.ts`, `sandbox-capability-gate-respected.test.ts`, `sandbox-no-cross-pack-mutation.test.ts` | A (advertisement-shape probes always-on; behavioral coverage CLOSED — 10/10 PASS via `sandbox-mvp-behavior.test.ts` (2026-05-22) + 10/10 PASS via the server-free `sandbox-wasm-isolation.test.ts` (2026-05-31) + `sandbox-wasm-timeout.test.ts` (2026-06-01); see the two rows below) | RFC 0035 promoted Draft → Active 2026-05-21. 8 scenarios, one per `node-pack-sandbox-*` invariant in `SECURITY/invariants.yaml`. The per-invariant files originally shipped behavioral placeholders (skipped `it` stubs with docstring expected-wire-shapes — NOT `expect(true).toBe(true)` vacuous passes); the behavioral close-out landed in the companion scenarios (`sandbox-mvp-behavior.test.ts`, `sandbox-wasm-isolation.test.ts`, `sandbox-wasm-timeout.test.ts` — all PASS), and each per-invariant file now carries a single `it.skip` pointer at the companion coverage to preserve the per-invariant structure. 7 of 8 `node-pack-sandbox-*` SECURITY rows have graduated `reference-impl` → `protocol` tier (`no-eval` stays exempt, JS-specific). Remaining for `Accepted`: a non-steward sandbox-executing host per RFC 0035 §"Acceptance criteria." |
| Multi-region idempotency + cross-engine append-ordering (RFC 0036 — `spec/v1/idempotency.md` §"`multiRegion` sub-block", `spec/v1/replay.md` §"Cross-region replay") | `multi-region-idempotency.test.ts`, `cross-engine-append-ordering.test.ts`, **`multi-region-idempotency-behavior.test.ts` (2026-05-22)**, **`cross-engine-append-behavior.test.ts` (2026-05-22)** | A (2 categorical-shape probes always-on + 1 granular `multiRegion` shape probe + 1 `crossEngineOrdering` shape probe + 6 multi-region behavioral assertions + 4 cross-engine Lamport-ordering behavioral assertions; all 10 behavioral assertions PASS against the reference workflow-engine when `OPENWOP_TEST_MULTI_REGION_SIMULATOR=true` + `OPENWOP_TEST_CROSS_ENGINE_HARNESS=true` are set) | RFC 0036 §B + §C behavioral close-out landed 2026-05-22 via the new workflow-engine test seams (`POST /v1/host/sample/test/multi-region/simulate-partition` + `POST/GET /v1/host/sample/test/cross-engine/{append,read,reset}`) — see `spec/v1/host-sample-test-seams.md` §6 + §7. The new `multi-region-idempotency-behavior.test.ts` exercises the canonical lex-min convergence rule + order-invariance + 400-on-mismatch; the new `cross-engine-append-behavior.test.ts` exercises Lamport-clock monotonicity + per-engine order preservation + read-determinism. Path to `Accepted`: non-steward host advertises matching capabilities + the behavioral assertions pass against it. |
| Secret-leakage telemetry / debug-bundle export (RFC 0034 §B — `spec/v1/host-capabilities.md` §"OTel collector test seam") | **`secret-leakage-otel-attribute.test.ts` (2026-05-22)**, **`otel-collector-canary-inspection.test.ts` (2026-06-01)** | A (host scrape-seam probes + collector-side over-the-wire inspection: `secret-leakage-otel-attribute.test.ts` scrapes the host seams AND — new — runs `OtelCollector.findCanaryLeakage()` against the live real OTLP export when the collector is active; `otel-collector-canary-inspection.test.ts` is the always-on, server-free proof that the inspector is non-vacuous) | Broadens the existing protocol-tier `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel` SECURITY invariants from envelope-acceptor-narrow (`envelope-reasoning-secret-redaction.test.ts`) to executor-side-broad. **Collector-seam gap CLOSED 2026-06-01:** `OtelCollector.findCanaryLeakage()` scans every captured span name/attribute/resource-attribute + metric data-point attribute for the BYOK canary, so the conformance collector now inspects what the host's OTLP exporter ACTUALLY shipped over the wire — a host can no longer redact in its scrape seam while leaking on the real export. The always-on scenario stands up a real collector, POSTs synthetic OTLP/HTTP-JSON through the actual ingest path, and proves the inspector catches a planted canary in each surface + reports zero on a redacted payload + never matches an empty canary. Residual is adoption-only (the live assertion soft-skips until a host exports OTLP to the collector). |
| Experimental capability tier (RFC 0042 — `schemas/capabilities.schema.json` §`multiAgent.executionModel.tier`) | **`experimental-tier-shape.test.ts` (2026-05-22)** | A (6 server-free + helper-routing assertions across §A schema discipline + §D experimentalGate routing; always-on for hosts that advertise tier='experimental' on any capability sub-block; helper-level behavioral probes for the `experimentalGate()` routing under both default + OPENWOP_REQUIRE_EXPERIMENTAL modes) | RFC 0042 (Draft) lands the audit's "Active RFC → carve-out" pattern. Schema diff lands on `multiAgent.executionModel` with optional `tier ∈ {stable, experimental}` + `experimentalUntil` (ISO-8601 sunset) + `if/then` conditional enforcing §B sunset MUST mechanically. New `experimentalGate()` helper in `conformance/src/lib/behavior-gate.ts` routes scenarios under default mode + `OPENWOP_REQUIRE_EXPERIMENTAL=true` strict-mode. |
| Sandbox WASM-isolation behavioral graduation (RFC 0035 §B) | **`sandbox-wasm-isolation.test.ts` (2026-05-31)** | A (10 always-on server-free assertions against the committed `fixtures/wasm-sandbox/*.wasm` via the suite-local `wasm-sandbox-probe.ts`: `misbehaving-{fs,env,network,process}` → `sandbox_escape_attempt`+`escapeKind` by static `WebAssembly.Module.imports()` gate, `misbehaving-memory` OOB store → `sandbox_memory_exceeded`, un-granted `openwop.*` → `sandbox_capability_denied`, mutable-global double-instantiate → isolated context; all 10 PASS) | **Graduates 6 `node-pack-sandbox-*` SECURITY rows reference-impl → protocol 2026-05-31** (fs-gated / no-env / network-gated / no-process / memory-cap / isolated-context). These hold by construction in any WASM sandbox — a forbidden op can only be a declared import refused before instantiation; the memory bound is engine-enforced. The reference host is `openwop-examples:examples/hosts/wasm-sandbox/` (#412). **`node-pack-sandbox-timeout` ALSO graduated `reference-impl → protocol` 2026-06-01** via the worker-driven `sandbox-wasm-timeout.test.ts` (`probeTimeout` spawns a worker + a main-thread kill-timer — the thread preemption a server-free probe can't do; 2/2 non-vacuous incl. a well-behaved positive control) — so **7 of 8** `node-pack-sandbox-*` invariants are now protocol-tier. `node-pack-sandbox-no-eval` is JS-specific + a permanent exemption. RFC 0035 `Active → Accepted` separately needs a **non-steward** sandbox-executing host. |
| Sandbox MVP behavioral close-out (RFC 0035 §B) | **`sandbox-mvp-behavior.test.ts` (2026-05-22)** | A (10 capability-gated behavioral assertions covering 7 of 8 §B failure-mode invariants — 5 escape kinds + timeout + memory-exceeded + cross-pack-mutation isolation + capability-gate-violation + 2 well-behaved baselines; all 10 PASS against the workflow-engine's node:vm-based sandbox MVP) | Companion to the existing 8 advertisement-shape sandbox scenarios (`sandbox-no-host-fs-escape.test.ts` et al.). Exercises the canonical 4-code error catalog at `spec/v1/host-capabilities.md` §"Error codes" (`sandbox_escape_attempt` + `sandbox_capability_denied` + `sandbox_memory_exceeded` + `sandbox_timeout`) with spec-mandated `details.{escapeKind, requestedCapability, requestedBytes}` populated. Wire-shape per `spec/v1/host-sample-test-seams.md §8`. Production adopters use wasmtime/nsjail behind the same HTTP test-seam contract. |
| RFC 0041 §B replay-divergence-at-refusal behavioral (`version: 4`) | `replay-divergence-at-refusal.test.ts` (advertisement-shape + behavioral; 3 assertions PASS against workflow-engine when the `multiAgent.executionModel.version: 4` advertisement is enabled) | A (was `it.todo` until 2026-05-23 when the executor wiring landed — see commits `1fce55a` + `bba3b4a`. Behavioral assertions cover both divergence directions: original=valid + replay=refusal AND original=refusal + replay=valid) | Closes Track #4 of the 2026-05-22 multi-agent behavioral-harness close-out. Reference workflow-engine emits `replay.divergedAtRefusal` event + fails run with `error.code: 'replay_diverged_at_refusal'` when source vs replay envelope kinds differ at the same nodeId. Gated on `OPENWOP_MULTI_AGENT_EXECUTION_MODEL_PHASE_4=true` AND `run.forkMode === 'replay'`. Path-to-Accepted for RFC 0041: non-steward host advertises `multiAgent.executionModel.version: 4` end-to-end. **RFC 0041 §C sibling:** `replay-observable-sequence-determinism.test.ts` is likewise now ACTIVE capability-gated behavioral (2026-06-01 — was an `it.skip` placeholder; the `conformance-phase4-nondet-tool` fixture having shipped, it drives a `mode:replay` fork of the nondet fixture and asserts observable event-log prefix byte-equivalence + nondeterministic-node observable-result caching, gated on `replayDeterminism.supported` + `version >= 4`; soft-skips against hosts that haven't wired the pure-replay observable-cache path). |
| Agent-manifest runtime floor (RFC 0070 — `capabilities.agents.manifestRuntime`) | `agent-manifest-runtime.test.ts` | B (capability-gated; lists ≥1 installed manifest agent + dispatches one with attributed `agent.reasoned`+`agent.decided` events, plus a §F sub-threshold-escalation assertion) | RFC 0070 filed Draft 2026-05-26. Gated on `capabilities.agents.manifestRuntime.supported` + the host dispatch seam (`POST /v1/host/sample/agents/{agentId}/dispatch`); soft-skips when either is absent. The reference **workflow-engine** host advertises `manifestRuntime: { supported: true, handoffValidation: true }`, loads pack `agents[]` (RFC 0003 `installAgents`) into an AgentRegistry at boot, and dispatches end-to-end (toolAllowlist-filtered per RFC 0002 §A14, handoff-validated per RFC 0003 §D, confidence-escalating per §F) — see `openwop-app:backend/typescript/test/agent-dispatch-route.test.ts` (6 HTTP assertions, incl. the normative inventory). **RFC 0072 (`Draft`):** the scenario's inventory leg now drives the NORMATIVE `GET /v1/agents` (§A) so it runs black-box against any conformant host; the dispatch leg stays on the sample seam (soft-skips off-steward) pending the executor-integration tier. RFC 0072 §C `peerDependenciesMeta` disposition + `degraded[]` are unit-tested in `openwop-app:backend/typescript/test/agent-loader.test.ts`. Path to `Active → Accepted` (RFC 0070): a non-steward host advertises `manifestRuntime` + serves `GET /v1/agents`. |
| Live manifest dispatch (RFC 0077 — `capabilities.agents.liveRuntime`) | `agent-live-runtime-shape.test.ts` | A (always-on, server-free shape probe) | RFC 0077 promoted Draft → Active 2026-05-29 (5 UQs resolved via MyndHyve T4 co-design). Always-on shape probe asserts `capabilities.agents.liveRuntime` (+ `supported`/`structuredOutput`/`confidenceEscalation`/`sources` sub-flags) is declared, the `agentInvocationStarted`/`agentInvocationCompleted` payloads validate conforming content-free records + reject malformed ones (`started` missing `source`; `completed` out-of-enum `outcome`), and both event names appear in the RunEventType enum. `liveRuntime` ⊃ `manifestRuntime`. **Behavioral scenarios deferred** per RFC 0077 §Conformance (reference host): the started→completed bracket ordering, `structuredOutput` enforcement, and `toolAllowlist` enforcement gate on `capabilities.agents.liveRuntime.supported` + a live-invoke seam and soft-skip until a host wires it. Path to `Accepted`: a non-steward host advertises `liveRuntime` + emits the invocation pair (net-new MyndHyve T4 work, queued behind §B). |
| Agent evaluation (RFC 0081 — `capabilities.agents.evalSuite`) | `agent-eval-suite-shape.test.ts` | A (always-on, server-free shape probe; doubles as the public test for `eval-summary-no-content-leak`) | RFC 0081 promoted Draft → Active 2026-05-30. Always-on shape probe asserts `capabilities.agents.evalSuite` (+ `supported`/`modes` sub-flags) is declared; the `AgentEvalSuite` + `EvalSummary` schemas compile + round-trip a conforming artifact and reject malformed ones (bad `suiteId` infix; `passScore` out of 0..1; out-of-range `aggregateScore`); the `eval.started`/`eval.scored`/`eval.completed` payloads validate content-free records + reject malformed ones; and all three event names appear in the RunEventType enum. The **content-free negatives** (an `EvalSummary` task entry carrying a `taskOutput` body; a `safetyFinding` carrying an `excerpt`) are the public test for protocol-tier SECURITY invariant `eval-summary-no-content-leak`. **Behavioral scenario authored + gated** (2026-05-31; see §"Capability-gated scenarios"): `agent-eval-run.test.ts` (the `eval.started`→per-task `eval.scored`→`eval.completed` ordering, the content-free `eval.scored` legs, and the NORMATIVE `GET /v1/runs/{runId}/eval-summary` schema-valid round-trip) gates on `capabilities.agents.evalSuite.supported` + the `POST /v1/host/sample/agents/eval-run` seam (`behaviorGate('openwop-eval-run', …)`) and soft-skips until a host wires the eval projection. Path to `Accepted`: a host advertises `evalSuite` + runs a golden/regression suite end-to-end (the `GET /v1/runs/{runId}/eval-summary` endpoint + SDK helper already landed). |
| Memory capability model (RFC 0080 — `spec/v1/agent-memory.md` §"Memory capability model", `spec/v1/profiles.md` §`openwop-memory`) | `memory-capability-model-shape.test.ts` | A (always-on, server-free shape probe) | RFC 0080 promoted Draft → Active 2026-05-30 (4 UQs resolved via MyndHyve review). Always-on shape probe asserts the additive `capabilities.memory.{writable,search,retention}` dimensions are declared (existing `supported`/`compaction`/`distillation`/`attribution` untouched), `memory.search`/`memory.retention` validate conforming instances + reject malformed ones (`retention.ttl` non-boolean; out-of-enum `search.modes`; unknown property under `additionalProperties:false`), `agent-inventory-response` declares `memoryDegraded` + the closed-enum `degradedMemoryDimensions` (the eight §A dimension names), and `deriveProfiles` surfaces `openwop-memory` for a read/write + long-term host while withholding it from a `writable:false` host. **Behavioral scenario authored** (2026-06-01; see §"Capability-gated scenarios"): `memory-degraded-projection.test.ts` (a live `GET /v1/agents` stamping `memoryDegraded` + the closed-enum `degradedMemoryDimensions` when an agent's `memoryShape` exceeds the host's reconciled model) gates on `agents.manifestRuntime` + `memory` via `behaviorGate('openwop-memory-degraded', …)` and soft-skips until a host computes the §C projection. Path to `Accepted`: a host computes the §C degraded projection + the scenario passes against it non-vacuously (MyndHyve `memory`). |
| Portable tool catalog (RFC 0078 — `spec/v1/tool-catalog.md`) | `tool-descriptor-shape.test.ts` | A (always-on, server-free shape probe) | RFC 0078 promoted Draft → Active 2026-05-30 (4 UQs resolved via MyndHyve review). Always-on shape probe asserts `tool-descriptor.schema.json` round-trips a conforming `ToolDescriptor` + rejects the malformed (`safetyTier`-required, `additionalProperties:false`), enforces the §C-1/§F-4 cross-field MUST (`safetyTier:"exec"` ⇒ `source:"host-extension"`, RFC 0069 — an `exec`+`node-pack` descriptor is rejected), asserts the `capabilities.toolCatalog` `supported`/`sources`/`sessionLifecycle` shape, and validates the two content-free `tool.session.{opened,closed}` payloads (incl. the closed `outcome` enum) + their RunEventType-enum membership. **Behavioral scenarios deferred** per RFC 0078 §Conformance: `tool-catalog-projection.test.ts` (the authorization-scoped `GET /v1/tools` + `404` non-disclosure) + `tool-session-lifecycle.test.ts` (the `tool.session.*` bracket ordering) gate on `capabilities.toolCatalog.supported` + `sessionLifecycle` and soft-skip until a reference host serves the catalog. Path to `Accepted`: a host projects ≥1 tool source at `GET /v1/tools` + the projection scenario passes. |
| Credential provenance + egress policy (RFC 0079 — `spec/v1/host-capabilities.md` §"Credential provenance + egress policy") | `egress-provenance-shape.test.ts` | A (always-on, server-free shape probe; doubles as the public test for `egress-decision-no-secret-leak`) | RFC 0079 promoted Draft → Active 2026-05-30 (4 UQs resolved via MyndHyve review). Always-on shape probe asserts `credential-provenance.schema.json` round-trips a conforming `CredentialProvenance` + rejects `audiences:[]` / missing `credentialId` / unknown property, the descriptor + `egress.decided` declare NO secret-value property (the content-free **`egress-decision-no-secret-leak`** protocol-tier invariant), the `egress.decided` payload validates a content-free record + enforces the `decision` enum + required `decision`/`destination`, and `capabilities.httpClient.egressPolicy` is declared. **The behavioral audience-binding MUST-NOT (`egress-credential-audience-bound`) is reference-impl tier** at Draft→Active — a credential bound to audience A on an egress to B must be `denied`/`downgraded` (never `allowed`-with-credential), fail-closed on unevaluable provenance — and lands in the gated `egress-audience-binding.test.ts` + `egress-decision-content-free.test.ts` (soft-skip until a host wires `egressPolicy` over `safeFetch`). Path to `Accepted`: a reference host enforces §C + the binding scenario passes → `egress-credential-audience-bound` graduates protocol-tier (RFC 0035 precedent). |
| Durable trigger + channel bridge (RFC 0083 — `spec/v1/trigger-bridge.md`, `spec/v1/profiles.md` §`openwop-trigger-bridge`) | `trigger-bridge-shape.test.ts` | A (always-on, server-free shape probe) | RFC 0083 promoted Draft → Active 2026-05-30 (5 UQs resolved via MyndHyve review). Always-on shape probe asserts `trigger-subscription.schema.json` round-trips a conforming `TriggerSubscription` + rejects missing-`state`/out-of-enum-`source`/unknown-property, the four-state vocab (`active`/`paused`/`failed`/`dead-lettered`) is stable, the two content-free `trigger.{subscription.state.changed,delivery.attempted}` payloads validate + enforce the `state`/`outcome` enums + RunEventType-enum membership, `capabilities.triggerBridge` + `webhooks.durable` are declared, and `deriveProfiles` surfaces `openwop-trigger-bridge` for bridge+sink+durable-source while withholding it with no dead-letter sink. **Behavioral scenario deferred** per RFC 0083 §Conformance: `trigger-bridge-delivery.test.ts` (dedup → retry → dead-letter → trigger→run causation) is profile-gated on `openwop-trigger-bridge` and soft-skips until a reference host wires durable delivery. Path to `Accepted`: a host wires the state machine + delivery loop + the scenario passes. |
| Budget, quota + cost policy (RFC 0084 — `spec/v1/budget-policy.md`) | `budget-policy-shape.test.ts` | A (always-on, server-free shape probe; doubles as the public test for `budget-no-pricing-leak`) | RFC 0084 promoted Draft → Active 2026-05-30 (5 UQs resolved via MyndHyve review). Always-on shape probe asserts `budget-policy.schema.json` round-trips a conforming `BudgetPolicy` + enforces the §A/§E orthogonality guard (a wall-time field is rejected — that's RFC 0058's `runTimeoutMs`) + threshold/onExhaustion negatives, the four content-free `budget.{reserved,consumed,threshold.crossed,exhausted}` payloads validate + enforce the `dimension`/`scope` enums, the four `cap.breached{budget-tokens,budget-cost,budget-tool-calls,budget-retries}` kinds + the four `budget.*` RunEventType-enum entries are present, the payloads declare no pricing/credential property (the **`budget-no-pricing-leak`** protocol-tier invariant), and `capabilities.budget` + `limits.maxBudget{Tokens,CostUsd}` are declared. **Behavioral scenario authored** (2026-06-01; see §"Capability-gated scenarios"): `budget-enforcement.test.ts` (accrue → threshold → exhaust → `cap.breached{budget-cost}` → `run.failed{budget_exhausted}`; `budget_model_denied`; advisory no-stop) gates on `budget.supported` via `behaviorGate('openwop-budget-enforcement', …)` + the `POST /v1/host/sample/budget/run` seam and soft-skips until a reference host wires accounting. Path to `Accepted`: a host wires the budget accumulator + the exhaustion stop + the scenario passes non-vacuously (MyndHyve `budget`). |
| `openwop-agent-platform` meta-profile (RFC 0085 — `spec/v1/agent-platform-profile.md`, operational annex) | `agent-platform-profile.test.ts` | A (always-on, server-free derivation probe) | RFC 0085 promoted Draft → Active 2026-05-30 (5 UQs resolved via MyndHyve review). Operational annex (NOT a closed `profiles.md` catalog predicate). Always-on derivation probe asserts `isAgentPlatformPartial`/`isAgentPlatformFull`/`agentPlatformStatus` derive `none`/`partial`/`full` correctly: all-floor ⇒ partial, a missing floor flag ⇒ none, the replay-OR-`nondeterminismPolicy.declared` term, floor+governance ⇒ full, a missing governance term (tenant `installScope`) ⇒ partial-not-full (the honest-advertisement rule), and that the eval/deploy/budget platform-plus tier is advisory (a full host without them is still full); plus `capabilities.nondeterminismPolicy.declared` is declared. **Live aggregate-evidence assertion authored** (2026-06-01; see §"Capability-gated scenarios"): `agent-platform-aggregate-evidence.test.ts` (gated on a host CLAIMING `openwop-agent-platform` in live `profiles[]` via `behaviorGate('openwop-agent-platform', …)`) reads the live discovery and asserts the §C/§D honest-advertisement rule — the claim MUST satisfy the §B floor predicate (`partial`/`full`, never `none`), backed by the per-capability evidence; `OPENWOP_AGENT_PLATFORM_TIER=full` forces the non-vacuous full-predicate bar. Path to `Accepted`: a host advertises `openwop-agent-platform` backed by the §B floor + passes the aggregate scenario claiming `full` (MyndHyve, after the memory batch surfaces the floor's `memory.supported`). |
| `openwop-core-standard` profile (RFC 0088 — `spec/v1/core-standard-profile.md`, operational annex) | `core-standard-profile.test.ts` | A (always-on, server-free derivation probe) | RFC 0088 (`Active` 2026-05-31). Operational annex (NOT a closed `profiles.md` catalog predicate) naming the stable **Core Standard Profile** — the floor of normative MUSTs with black-box production-path conformance. Always-on derivation probe asserts `isCoreStandard` derives the §B floor (`openwop-core` ∧ `openwop-interrupts` ∧ (`openwop-stream-sse` ∨ `openwop-stream-poll`)): core+interrupts+default-transport ⇒ core-standard; a bare `openwop-core` host without `clarification.request` ⇒ NOT core-standard (the floor is stricter than the v1 minimum); a host with no event transport (`supportedTransports:[]`) ⇒ fails; a non-1.x host ⇒ fails; and `openwop-core-standard` is absent from `deriveProfiles` (it composes, it does not redefine). The §C floor scenarios (runs-lifecycle / discovery / auth-401 / event-ordering / failure-path / idempotency / interrupts / webhook-negative / audit-log-verify) are all already black-box production-path. **Live aggregate-evidence assertion deferred** per RFC 0088 §C: a host claiming `openwop-core-standard` must pass every §C floor scenario black-box — already satisfied by MyndHyve + all four reference hosts. Path to `Accepted`: ≥1 host advertises the claim in its `conformance.md`/`INTEROP-MATRIX.md` row backed by the floor scenarios. |
| Agent deployment lifecycle (RFC 0082 — `capabilities.agents.deployment`) | `agent-deployment-shape.test.ts` | A (always-on, server-free shape probe; doubles as the public test for `deployment-event-no-content-leak`) | RFC 0082 promoted Draft → Active 2026-05-30. Always-on shape probe asserts `capabilities.agents.deployment` (+ `supported`/`channels`/`canary`/`rollback`/`states` sub-flags); the `AgentDeployment` record compiles + round-trips and rejects malformed ones (out-of-enum `state`; `canaryPercent` out of 0..100); the **`AgentRef` `channel` XOR `version`** rule (each alone + neither validate; both rejected by the `not` clause, §A); the four `deployment.*` payloads validate content-free records + reject malformed ones; `agent.invocation.started` carries the additive recorded-fact `resolvedAgentVersion`/`resolvedChannel` (§B channel pin); and all four event names appear in the RunEventType enum. The **content-free negatives** (a `deployment.promoted` carrying a `manifestBody`; a `deployment.state.changed` carrying a `prompt`) are the public test for protocol-tier SECURITY invariant `deployment-event-no-content-leak`. The behavioral `deployment-promotion-fail-closed` invariant is `reference-impl` tier until the behavioral scenario lands (then graduates to protocol; RFC 0035 precedent). **Behavioral scenario authored + gated** (2026-05-31; see §"Capability-gated scenarios"): `agent-deployment-lifecycle.test.ts` (the §E authz → approvalGate → eval-verify → `deployment.promoted` promotion + the record round-trip, the fail-closed denial, the `eval_gate_unmet` denial, and the §B `resolvedAgentVersion` channel pin) gates on `capabilities.agents.deployment.supported` + the `POST /v1/host/sample/agents/deployment-transition` seam (`behaviorGate('openwop-deployment-lifecycle', …)`) and soft-skips until a host wires it. When it passes against a host, the `deployment-promotion-fail-closed` invariant graduates reference-impl → protocol tier. Path to `Accepted`: a host implements the deployment store + canary router + the `POST /v1/agents/{agentId}/deployments` promotion contract (the endpoint + SDK helper already landed). |
| Connection packs (RFC 0095 — `capabilities.connections.packsSupported`) | `connection-pack-manifest-valid.test.ts`, `connection-pack-no-credential-material.test.ts`, `connection-pack-reach-exclusive.test.ts`, `connection-provider-resolution.test.ts`, `connection-pack-write-reconsent.test.ts` | A (3 always-on server-free schema probes) / B (2 capability-gated behavioral) | RFC 0095 promoted Draft → Active 2026-06-12. The three schema probes are always-on: the `connection-pack-github` fixture validates; every name on the §B.2 normative blocklist is schema-rejected at root/provider/auth depth with the sole `provider.auth.endpoints.token` exemption valid (the public test for the protocol-tier `connection-pack-no-credential-material` invariant); `reach` is exactly-one-of mcp/openapi/integration. The behavioral pair gates on `capabilities.connections.packsSupported` and drives the `POST /v1/host/sample/connection-packs/{install,resolve,consent-plan}` seams: §B.6 resolution (installed pack wins; unknown provider → `connection_provider_unresolved`; SemVer §11 prerelease vs built-in release → `connection_provider_conflict`) + §B.8 rejection isolation (one rejected pack never takes down the install path) + §B.4 write-re-consent (write scope groups are a separate consent step). Soft-skips when unadvertised / seam unwired; hard-fails under `OPENWOP_REQUIRE_BEHAVIOR=true`. |
|
|
67
|
+
| Standing agent roster (RFC 0086 — `capabilities.agents.roster`) | `agent-roster-shape.test.ts` | A (always-on, server-free shape probe; doubles as the public test for `roster-attribution-no-content`) | RFC 0086 promoted Draft → Active 2026-05-30. Always-on shape probe asserts `capabilities.agents.roster` (+ `supported`/`installScope`/`portfolioTriggerSources` sub-flags); the `AgentRosterEntry` record compiles + round-trips and rejects malformed ones (a non-`host:` `rosterId`; an `agentRef` carrying BOTH `version` and `channel` — the RFC 0082 §A XOR rule; a missing `rosterId`); the `roster.run.initiated` payload validates a content-free attribution record + requires its ids/persona/triggerSource; the `AgentInventoryEntry` carries the additive optional `roster` portfolio projection (§B); and `roster.run.initiated` appears in the RunEventType enum. The **content-free negatives** (a `roster.run.initiated` carrying a `body`; one carrying a `prompt`) are the public test for protocol-tier SECURITY invariant `roster-attribution-no-content`. **Behavioral scenarios deferred** per RFC 0086 §Conformance (reference host): a scheduled portfolio fire emitting `roster.run.initiated` before `agent.invocation.started`; the RFC 0083 work-item causation chain; the replay re-read; the cross-tenant `GET /v1/agents/roster/{id}` 404 — gate on `capabilities.agents.roster.supported` + the roster-store seam and soft-skip until a host wires it (the host-extension at `/v1/host/sample/roster` + board attribution, `openwop-app` #368 — formerly `apps/workflow-engine` — is the reference demonstration). Path to `Accepted`: a non-steward host advertises `agents.roster` + emits `roster.run.initiated`. |
|
|
68
|
+
| Agent org-chart (RFC 0087 — `capabilities.agents.orgChart`) | `agent-org-chart-shape.test.ts` | A (always-on, server-free shape probe; doubles as the public test for `org-position-no-authority-escalation`) | RFC 0087 promoted Draft → Active 2026-05-30. Always-on shape probe asserts `capabilities.agents.orgChart` (+ `supported`/`installScope`/`departmentNesting`/`responsibilityView` sub-flags); the `AgentOrgChart` record compiles + round-trips and rejects malformed ones (a non-`host:` member `rosterId`; a chart missing `members`). The **§B structural non-authority guarantee**: the schema **rejects** an authority-bearing field on a member (`scopes`/`canDispatch`/`permissions`/`authority` — every object is `additionalProperties:false`), and a conforming member's key set is exactly `{rosterId, departmentId, roleId, reportsTo}`. These are the public test for protocol-tier SECURITY invariant `org-position-no-authority-escalation` (an org edge confers no authority — position describes, never authorizes). NO new RunEventType (the org-chart is structure + a read, not an event surface). **Behavioral scenarios deferred** per RFC 0087 §Conformance (reference host): the live-dispatch refusal of a manager's tool over-reach; an RFC 0049 decision invariant to org position; the cross-tenant `GET /v1/agents/org-chart` 404; the §D responsibility roll-up over live roster portfolios — gate on `capabilities.agents.orgChart.supported` + the org-store seam and soft-skip until a host wires it (the host-extension at `/v1/host/sample/org-chart`, `openwop-app` #371 — formerly `apps/workflow-engine` — is the reference demonstration). Path to `Accepted`: a non-steward host advertises `agents.orgChart` + passes the behavioral non-authority scenario. |
|
|
66
69
|
|
|
67
70
|
---
|
|
68
71
|
|
|
69
72
|
## Capability-gated scenarios: shape vs behavior
|
|
70
73
|
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
| Scenario
|
|
74
|
-
|
|
75
|
-
| `audit-log-integrity.test.ts`
|
|
76
|
-
| `rate-limit-envelope.test.ts`
|
|
77
|
-
| `multi-region-idempotency.test.ts`
|
|
78
|
-
| `configurable-schema.test.ts`
|
|
79
|
-
| `webhook-sig-algorithm.test.ts`
|
|
80
|
-
| `pause-resume.test.ts`
|
|
81
|
-
| `append-ordering.test.ts`
|
|
82
|
-
| `otel-emission.test.ts`, `otel-emission-grpc.test.ts`
|
|
83
|
-
| `otel-trace-propagation.test.ts`, `otel-trace-propagation-subworkflow.test.ts`
|
|
84
|
-
| `wasm-pack-*.test.ts` (six scenarios)
|
|
85
|
-
| `production-backpressure.test.ts`, `production-retention-expiry.test.ts`, `restart-during-run.test.ts`, `staleClaim.test.ts`, `debug-bundle-truncation.test.ts`, `idempotency.test.ts`, `idempotencyRetry.test.ts` (seven scenarios)
|
|
86
|
-
| `auth-api-key-rotation.test.ts`
|
|
87
|
-
| `auth-oauth2-client-credentials.test.ts`
|
|
88
|
-
| `auth-oidc-user-bearer.test.ts`
|
|
89
|
-
| `auth-mtls.test.ts`
|
|
90
|
-
| `replay-retention-expiry.test.ts`
|
|
91
|
-
| `discovery.test.ts` — auth-scoped subtests (3 of them)
|
|
92
|
-
| `fs-path-traversal.test.ts`
|
|
93
|
-
| `credentials-capability-shape.test.ts`
|
|
94
|
-
| `credential-payload-redaction.test.ts`
|
|
95
|
-
| `oauth-capability-shape.test.ts`
|
|
96
|
-
| `oauth-connector-redaction.test.ts`
|
|
97
|
-
| `oauth-authorization-code-roundtrip.test.ts`
|
|
98
|
-
| `connector-manifest-validity.test.ts`
|
|
99
|
-
| `identity-owner-shape.test.ts`
|
|
100
|
-
| `cross-workspace-isolation.test.ts`
|
|
101
|
-
| `authorization-roles-shape.test.ts`
|
|
102
|
-
| `authorization-fail-closed.test.ts`
|
|
103
|
-
| `auth-saml-profile.test.ts`
|
|
104
|
-
| `auth-scim-profile.test.ts`
|
|
105
|
-
| `runtime-requires-shape.test.ts`
|
|
106
|
-
| `runtime-requires-install-gate.test.ts`
|
|
107
|
-
| `safefetch-behavior.test.ts`
|
|
108
|
-
| `safefetch-live-audit.test.ts`
|
|
109
|
-
| `agent-roster-attribution.test.ts`
|
|
110
|
-
| `agent-live-invocation-bracket.test.ts`
|
|
111
|
-
| `agent-live-structured-output.test.ts`
|
|
112
|
-
| `agent-live-allowlist-enforced.test.ts`
|
|
113
|
-
| `agent-org-chart-scoping.test.ts`
|
|
114
|
-
| `org-position-no-authority-escalation.test.ts`
|
|
115
|
-
| `trigger-bridge-delivery.test.ts`
|
|
116
|
-
| `agent-eval-run.test.ts`
|
|
117
|
-
| `agent-deployment-lifecycle.test.ts`
|
|
118
|
-
| `agent-channel-dispatch.test.ts`
|
|
119
|
-
| `tool-catalog-projection.test.ts`
|
|
120
|
-
| `tool-session-lifecycle.test.ts`
|
|
121
|
-
| `egress-audience-binding.test.ts`
|
|
122
|
-
| `egress-decision-content-free.test.ts`
|
|
123
|
-
| `memory-degraded-projection.test.ts`
|
|
124
|
-
| `budget-enforcement.test.ts`
|
|
125
|
-
| `agent-platform-aggregate-evidence.test.ts`
|
|
126
|
-
| `approval-gate-events.test.ts`
|
|
127
|
-
| `approval-gate-flow.test.ts`
|
|
128
|
-
| `scheduling-capability-shape.test.ts`
|
|
129
|
-
| `scheduling-cron-fires-once.test.ts`
|
|
130
|
-
| `deadletter-capability-shape.test.ts`
|
|
131
|
-
| `run-execution-bounds-shape.test.ts`
|
|
132
|
-
| `workspace-capability-shape.test.ts`
|
|
133
|
-
| `workspace-behavior.test.ts`
|
|
134
|
-
| `workspace-cross-tenant-isolation.test.ts`
|
|
135
|
-
| `workspace-cross-tenant-isolation-blackbox.test.ts`
|
|
136
|
-
| `deadletter-retry-exhaustion.test.ts`
|
|
137
|
-
| `artifact-type-pack-manifest-validation.test.ts`
|
|
138
|
-
| `artifact-schema-compile-bounded.test.ts`
|
|
139
|
-
| `artifact-type-pack-install.test.ts`
|
|
140
|
-
| `artifact-type-store-without-render.test.ts`
|
|
141
|
-
| `chat-card-pack-manifest-validation.test.ts`
|
|
142
|
-
| `x-openwop-form-pack-manifest.test.ts`
|
|
143
|
-
| `chat-card-pack-execution.test.ts`
|
|
144
|
-
| `kv-cross-tenant-isolation.test.ts`, `kv-atomic-increment.test.ts`, `kv-cas.test.ts` (three scenarios)
|
|
145
|
-
| `table-cross-tenant-isolation.test.ts`
|
|
146
|
-
| `queue-cross-tenant-isolation.test.ts`
|
|
147
|
-
| `blob-cross-tenant-isolation.test.ts`, `cache-cross-tenant-isolation.test.ts` (two scenarios)
|
|
148
|
-
| `sql-injection-rejection.test.ts`
|
|
149
|
-
| `mcp-server-tool-roundtrip.test.ts`, `mcp-server-resource-roundtrip.test.ts`, `mcp-server-prompt-roundtrip.test.ts`, `mcp-server-sampling-bridge.test.ts`, `mcp-server-elicitation-bridge.test.ts`, `mcp-server-untrusted-args.test.ts` (six scenarios) | `capabilities.mcp.serverMount` (RFC 0020, `mcp-integration.md` §"OpenWOP host as MCP server") + `SECURITY/invariants.yaml` `mcp-server-untrusted-args`
|
|
150
|
-
| `prompt-end-to-end-events.test.ts`, `prompt-resolution-chain-node-wins.test.ts`, `prompt-resolution-chain-fallback-cascade.test.ts` (three scenarios)
|
|
151
|
-
| `prompt-pack-install.test.ts`, `prompt-list-and-fetch.test.ts`, `prompt-render-deterministic.test.ts` (three scenarios)
|
|
152
|
-
| `prompt-mutable-lifecycle.test.ts`
|
|
153
|
-
| `prompt-mutation-workspace-membership-enforced.test.ts`
|
|
154
|
-
| `prompt-read-workspace-membership-enforced.test.ts`
|
|
155
|
-
| `prompt-resolution-chain-agent-intrinsic.test.ts`
|
|
156
|
-
| `prompt-resolution-chain-event.test.ts`
|
|
157
|
-
| `prompt-composed-secret-redaction.test.ts`, `prompt-composed-trust-marker.test.ts` (two scenarios)
|
|
74
|
+
The table below lists scenario groups, one row per group (the count is not stated here because it drifts). Each group validates an optional profile where the host's discovery advertisement is well-formed (shape grade) but no reference host yet implements the profile end-to-end (behavior grade `host-pending`). Default suite runs skip these with a warning; set `OPENWOP_REQUIRE_BEHAVIOR=true` to convert skips into hard failures.
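
The skip-versus-fail rule described above can be sketched as a small decision function. This is a minimal illustration, not the package's `behaviorGate` helper, whose signature is not shown in this document: `GateInput`, `gateOutcome`, and `requireBehaviorFrom` are hypothetical names, and treating a `404` from an optional seam as "unwired" is an assumption drawn from the "soft-skips on 404" wording in the table rows.

```typescript
// Hypothetical sketch of the soft-skip / hard-fail decision.
// None of these names come from the conformance package itself.
type GateOutcome = "run" | "skip" | "fail";

interface GateInput {
  // Host advertises the capability or profile in its discovery document.
  advertised: boolean;
  // HTTP status from probing the optional seam; null when the scenario
  // needs no seam (black-box on the normative path).
  seamStatus: number | null;
  // True when OPENWOP_REQUIRE_BEHAVIOR is set to "true".
  requireBehavior: boolean;
}

// Reads the flag from an env-like map so the sketch stays runtime-agnostic.
function requireBehaviorFrom(env: Record<string, string | undefined>): boolean {
  return env["OPENWOP_REQUIRE_BEHAVIOR"] === "true";
}

function gateOutcome(input: GateInput): GateOutcome {
  // Assumption: a 404 from the seam means the host has not wired it.
  const seamUnwired = input.seamStatus === 404;
  if (input.advertised && !seamUnwired) return "run";
  // Otherwise the scenario would be skipped; strict mode turns that into a failure.
  return input.requireBehavior ? "fail" : "skip";
}

const strict = requireBehaviorFrom({ OPENWOP_REQUIRE_BEHAVIOR: "true" });
console.log(gateOutcome({ advertised: true, seamStatus: 404, requireBehavior: false }));  // skip
console.log(gateOutcome({ advertised: true, seamStatus: 404, requireBehavior: strict })); // fail
console.log(gateOutcome({ advertised: true, seamStatus: 200, requireBehavior: strict })); // run
```

Individual scenarios may refine this rule (for example, some rows hard-fail only when every required flag is advertised), so treat the sketch as the default behavior and not as a per-scenario contract.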
|
|
75
|
+
|
|
76
|
+
| Scenario | Profile / capability | Shape grade | Behavior grade | Behavior-unlock dependency |
|
|
77
|
+
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
78
|
+
| `audit-log-integrity.test.ts` | `openwop-audit-log-integrity` (`auth-profiles.md`) | A− (discovery + verify endpoint shape) | `host-pending` | Track 1.1 — SQLite host implements hash-chain + signed checkpoints |
|
|
79
|
+
| `rate-limit-envelope.test.ts` | normative `429` envelope (`rest-endpoints.md`) | A− (deterministic via Postgres `OPENWOP_FORCE_RATE_LIMIT=true` seam, CF-6 close-out 2026-05-15) | host-claimed | — |
|
|
80
|
+
| `multi-region-idempotency.test.ts` | `capabilities.idempotency.crossRegion` (`idempotency.md`) | C (enum shape only) | `host-pending` | Multi-region host fixture; cross-region partition simulation |
|
|
81
|
+
| `configurable-schema.test.ts` | per-workflow `configurableSchema` (`run-options.md`) | A− (negative validation + positive accepted-overlay + manifest-surface assertion, CF-7 close-out 2026-05-15) | host-claimed | — |
|
|
82
|
+
| `webhook-sig-algorithm.test.ts` | `X-openwop-Signature-Algorithm: v1` (`webhooks.md`) | C+ (discovery shape) | `host-pending` | End-to-end signed delivery against a test receiver |
|
|
83
|
+
| `pause-resume.test.ts` | `pauseRun` / `resumeRun` lifecycle (`rest-endpoints.md`) | B (lifecycle + 409-on-non-paused) | partial | Pause-during-suspend race; immediate-vs-drain policy assertion |
|
|
84
|
+
| `append-ordering.test.ts` | `append` reducer ordering (`channels-and-reducers.md`) | B (intra-engine) | partial | Cross-engine multi-engine fixture |
|
|
85
|
+
| `otel-emission.test.ts`, `otel-emission-grpc.test.ts` | `openwop.*` OTel spans (`observability.md`) | A (OTLP/HTTP-JSON + OTLP/HTTP-protobuf + OTLP/gRPC all accepted; collector content-type-routes JSON/protobuf and the gRPC HTTP/2 path frame-decodes via `grpc-framing.ts`) | full | — |
|
|
86
|
+
| `otel-trace-propagation.test.ts`, `otel-trace-propagation-subworkflow.test.ts` | W3C trace-context propagation (`observability.md`) | A (trace continuity across `runs:fork` + interrupt resolve + `core.subWorkflow` parent→child boundary; closed 2026-05-13 — subWorkflow scenario asserts both parent + child spans share the inbound traceparent's traceId) | full | — |
|
|
87
|
+
| `wasm-pack-*.test.ts` (six scenarios) | `capabilities.nodePackRuntimes.wasm` (`RFCS/0008`) | A (load + invoke + replay + memory-cap positive-path + ABI-mismatch positive-path) | full | Memory-cap positive path closed 2026-05-12 via misbehaving-memory pack + `memory.grow` probe in loader. ABI-mismatch positive path closed 2026-05-12 via misbehaving-abi pack + `loadedPacks[]` discovery field. |
|
|
88
|
+
| `production-backpressure.test.ts`, `production-retention-expiry.test.ts`, `restart-during-run.test.ts`, `staleClaim.test.ts`, `debug-bundle-truncation.test.ts`, `idempotency.test.ts`, `idempotencyRetry.test.ts` (seven scenarios) | `openwop-production` (`production-profile.md`, RFC 0009) | A− (capability shape + 503 envelope under saturation + discovery-exemption; durable-restart + debug-bundle-truncation predicates exercised end-to-end; retention-expiry envelope soft-skipped pending RFC 0009 Q#1) | host-pass | Postgres reference host advertises `capabilities.production.supported: true` since 2026-05-11 and passes all 11 assertions across the 5 non-opt-in scenarios under `OPENWOP_REQUIRE_BEHAVIOR=true` with `--no-file-parallelism` (the backpressure scenario saturates the inflight cap; parallel file execution collides with `idempotencyRetry.test.ts`'s burst). RFC 0009 unresolved questions #1 (force-expire endpoint normation) + #3 (inflightCap vs probing) gate the path to A. |
|
|
89
|
+
| `auth-api-key-rotation.test.ts` | `openwop-auth-api-key-rotation` (`auth-profiles.md`, RFC 0010) | B (capability shape + secondary-key overlap when env-supplied + canary-redaction) | `host-pending` | Reference host advertises the profile + supplies `OPENWOP_TEST_SECONDARY_API_KEY` for the overlap check. |
|
|
90
|
+
| `auth-oauth2-client-credentials.test.ts` | `openwop-auth-oauth2-client-credentials` (`auth-profiles.md`, RFC 0010) | B (capability shape + malformed-JWT negative + harness-minted negatives gated on `OPENWOP_TEST_OAUTH_ISSUER_TRUSTED`) | `host-pending` | Reference host advertises the profile + trusts the conformance harness + (optional) supplies `OPENWOP_TEST_OAUTH_TOKEN` for positive-path. |
|
|
91
|
+
| `auth-oidc-user-bearer.test.ts` | `openwop-auth-oidc-user-bearer` (`auth-profiles.md`, RFC 0010) | B (capability shape + six harness-driven validation cases gated on `OPENWOP_TEST_OIDC_ISSUER_URL`) | `host-pending` | Reference host advertises the profile + is pre-configured to trust the issuer at `OPENWOP_TEST_OIDC_ISSUER_URL`. Synthetic OIDC issuer harness at `conformance/src/lib/oidc-issuer.ts` (RS256 + ES256 via node:crypto stdlib). |
|
|
92
|
+
| `auth-mtls.test.ts` | `openwop-auth-mtls` (`auth-profiles.md`, RFC 0010) | B (capability shape always; opt-in behavior assertions via `OPENWOP_TEST_MTLS=1` + cert paths; uses node:https.request for client-cert handshake — no new npm deps) | `host-pending` | Reference host advertises the profile + listens on HTTPS with mTLS enforcement; operator supplies `OPENWOP_TEST_MTLS_CLIENT_{CERT,KEY}_PATH` and (optionally) `OPENWOP_TEST_MTLS_CA_PATH`. |
|
|
93
|
+
| `replay-retention-expiry.test.ts` | `openwop-replay-fork` (`replay.md` §"Retention and garbage collection") | B (capability shape always; 410/422 envelope on expired-range fork gated on `OPENWOP_TEST_EXPIRED_REPLAY_RUN_ID`; details.{sourceRunId, fromSeq, retentionBoundary} soft-checks per spec SHOULD) | `host-pending` | Reference host advertises `replay.supported: true` + operator produces a known-expired run id (no standardized force-expire endpoint per RFC 0009 Q#1). |
|
|
94
|
+
| `discovery.test.ts` — auth-scoped subtests (3 of them) | `openwop-discovery-auth-scoped` (`capabilities-change-detection.md` §"Scoped capability views", RFC 0011) | B (capability shape + mode/endpointPath typing always; required-field-preservation in authenticated view always; authorization-oracle probe gated on `OPENWOP_TEST_UNAUTHORIZED_API_KEY`) | `host-pending` | Reference host advertises `capabilities.discovery.authScoped.supported: true` + serves an authenticated capability view that satisfies the base schema + a tenant-scoped key pair for the oracle probe. |
|
|
95
|
+
| `fs-path-traversal.test.ts` | `capabilities.fs` (RFC 0014, `host-fs-capability.md`) | A (advertisement shape + two path-escape probes asserting `path_outside_sandbox`) | host-pass (workflow-engine reference) | Reference host advertises `capabilities.fs.supported: true` with sandboxRoot under `<dataDir>/host-fs`. |
|
|
96
|
+
| `credentials-capability-shape.test.ts` | `capabilities.credentials` (RFC 0046, `host-capabilities.md` §host.credentials) | A (advertisement shape always — `supported` boolean; `scopes` ⊆ {user,workspace,tenant}; `rotation` ∈ {none,two-key-overlap}) | `host-pending` | Always runs; asserts the block is absent or well-formed. No host advertises `capabilities.credentials` yet (RFC 0046 `Draft`). |
|
|
97
|
+
| `credential-payload-redaction.test.ts` | `capabilities.credentials` (RFC 0046) + `SECURITY/invariants.yaml` `credential-payload-redaction` | A (advertisement shape always; redaction MUST-NOT via optional `POST /v1/host/sample/credentials/echo` seam — canary plaintext absent from all observable surfaces) | `host-pending` | Capability-gated on `credentials.supported`; behavioral probe soft-skips on 404 when the seam is unwired, mirroring `fs-path-traversal`. |
|
|
98
|
+
| `oauth-capability-shape.test.ts` | `capabilities.oauth` (RFC 0047, `host-capabilities.md` §host.oauth) | A (advertisement shape always — `supported` boolean; `grants` ⊆ {authorization_code,client_credentials,refresh_token}; every `providers[].id` non-empty) | `host-pending` | Always runs; asserts the block is absent or well-formed. No host advertises `capabilities.oauth` yet (RFC 0047 `Draft`). |
|
|
99
|
+
| `oauth-connector-redaction.test.ts` | `capabilities.oauth` (RFC 0047) + `SECURITY/invariants.yaml` `credential-payload-redaction` | A (advertisement shape always; token-material redaction via optional `POST /v1/host/sample/oauth/connector-echo` seam — canary token absent from all observable surfaces; `connector.authorized` carries the ref not the token) | `host-pending` | Capability-gated on `oauth.supported`; behavioral probe soft-skips on 404. Reuses the RFC 0046 redaction invariant (OAuth tokens are stored as host.credentials entries). |
|
|
100
|
+
| `oauth-authorization-code-roundtrip.test.ts` | `capabilities.oauth` (RFC 0047 §C) + `SECURITY/invariants.yaml` `credential-payload-redaction` | A (capability-gated on `oauth.supported` + `grants` ∋ `authorization_code`; behavioral roundtrip via optional `POST /v1/host/sample/oauth/authorize-code-roundtrip` seam against the one canonical synthetic provider in `fixtures/oauth-providers/synthetic.json` — returns a credential REFERENCE; the authorization code / state / PKCE verifier / acquired access+refresh tokens are absent from every observable surface; `connector.authorized` carries the ref not the token) | `host-pending` | Capability+grant-gated; behavioral probe soft-skips on 404. Closes the RFC 0047 Tier-2 gap — exercises the actual authorization-code dance (shape + redaction scenarios existed; the grant itself was unexercised). |
|
|
101
|
+
| `connector-manifest-validity.test.ts` | `node-pack-manifest.schema.json` §Connector (RFC 0045, `node-packs.md` §Connectors) | Server-free (schema validity of the `connector` block incl. both ConnectorAuth variants + positive/negative round-trip; §B action/trigger typeId-resolution semantics — `connector_action_unresolved` on an unknown typeId) | host-pass (server-free) | Always runs; no host needed. Behavioral idempotency-hint + rate-limit-honored scenarios deferred until a host advertises a connector. |
|
|
102
|
+
| `identity-owner-shape.test.ts` | `run-snapshot.schema.json` properties.owner (RFC 0048 §C, `auth.md` §Identity claims) | Server-free (owner triple schema validity: positive `{tenant}` + full triple; negative missing-tenant + unknown-prop) | host-pass (server-free) | Always runs; no host needed. |
|
|
103
|
+
| `cross-workspace-isolation.test.ts` | RFC 0048 §C/§D (`auth.md` §Identity claims, `rest-endpoints.md` `run_forbidden`) | A (owner-echo shape if a sample run is readable; §D isolation MUST-NOT via optional `POST /v1/host/sample/identity/cross-workspace-read` seam — cross-workspace read fails closed with `run_forbidden`/`not_found`) | `host-pending` | Behavioral probe soft-skips on 404; no host advertises run ownership yet (RFC 0048 `Draft`). |
|
|
104
|
+
| `authorization-roles-shape.test.ts` | `capabilities.authorization` (RFC 0049 §A, `auth.md` §"Role-based authorization") | A (advertisement shape always — `supported` boolean; `failClosed` const true; every `roles[].role` non-empty + `scopes` array) | `host-pending` | Always runs; asserts the block is absent or well-formed. |
|
|
105
|
+
| `authorization-fail-closed.test.ts` | `capabilities.authorization` (RFC 0049 §C) + `SECURITY/invariants.yaml` `authorization-fail-closed` | A (advertisement `failClosed===true` always; fail-closed MUST-NOT via optional `POST /v1/host/sample/authorization/decide` seam — an unseeded-role principal resolves `allowed:false`) | `host-pending` | Capability-gated on `authorization.supported`; behavioral probe soft-skips on 404. Scope-match + denial-audited scenarios deferred to a host. |
|
|
106
|
+
| `auth-saml-profile.test.ts` | `openwop-auth-saml` (RFC 0050, `auth-profiles.md` §`openwop-auth-saml`) | A+B (profile-advertisement shape always; **1-positive + 6-negative reference suite runs server-free** via the bundled synthetic IdP `conformance/src/lib/saml-idp.ts` — `alg:none`/unsigned/bad-sig/expired/not-yet-valid/wrapping; host-ACS validation opt-in via `OPENWOP_TEST_SAML_IDP_URL` + the `auth/saml/validate` seam) | host-pass (server-free reference) | Synthetic IdP bundled (`node:crypto`, no deps). Host-ACS pass is the remaining graduation gate. |
|
|
107
|
+
| `auth-scim-profile.test.ts` | `openwop-auth-scim` (RFC 0050, `auth-profiles.md` §`openwop-auth-scim`) | B (profile-advertisement shape always; SCIM user/group provisioning → principal/role roundtrip opt-in via `OPENWOP_TEST_SCIM_URL` + the `auth/scim/provision` seam) | `host-pending` | Behavior opt-in (operator-supplied SCIM endpoint); deactivate ⇒ subsequent-deny assertion deferred to a host. |
|
|
108
|
+
| `runtime-requires-shape.test.ts` | `node-pack-manifest.schema.json` `$defs/Runtime.requires` (RFC 0076 §A, `node-packs.md` §"Runtime platform requirements") | Server-free (closed-vocabulary validation: 8 tokens validate; raw builtin `node:dns/promises` rejected → `invalid_manifest`; empty-array≡omission; `uniqueItems`) | host-pass (server-free) | Always runs; no host needed. The install-time GATE behavior is in `runtime-requires-install-gate.test.ts`. |
|
|
109
|
+
| `runtime-requires-install-gate.test.ts` | RFC 0076 §A install gate (no capability flag; seam `POST /v1/host/sample/packs/install-gate`) | A (install-grant; install-refuse → `pack_runtime_requirement_unmet { unmet, manifest, advice? }`; non-sandbox SHOULD-projection — all via the optional seam) | `host-pending` | Seam-gated; soft-skips on 404. First adopter: MyndHyve's install-time gate against `core.openwop.http` declaring `["net.dns","net.outbound"]`. |
|
|
110
|
+
| `safefetch-behavior.test.ts` | `capabilities.httpClient.safeFetch` (RFC 0076 §B, `host-capabilities.md` §host.http) + `SECURITY/invariants.yaml` `http-client-ssrf-guard` | A (SSRF block + DNS-rebinding defeat + `Connection: upgrade` refusal + tool-hooks audit-when-`prePostEvents`, via the optional `POST /v1/host/sample/http/safe-fetch` seam) | in-memory ✅ | **5/5 PASS against the in-memory reference host** (2026-05-29, `OPENWOP_REQUIRE_BEHAVIOR=true`); seam-gated, soft-skips on 404 elsewhere. Reuses the existing `http-client-ssrf-guard` invariant (no new invariant). Advertisement contract in `http-client-ssrf.test.ts`. §B → Accepted still awaits `core.openwop.http@2.0.0` consumer + non-steward adoption. |
|
|
111
|
+
| `safefetch-live-audit.test.ts` | `capabilities.httpClient.safeFetch` + `capabilities.toolHooks.prePostEvents` (RFC 0076 §B, `host-capabilities.md` §host.http; RFC 0064 §B) | A (the audit-when-both MUST against the **durable run event log** — production `ctx.http.safeFetch` path — via the `POST /v1/host/sample/http/safe-fetch-run` open seam + the test event-log seam; asserts a `callId`-paired `agent.toolCalled` `transport:"http"` / `agent.toolReturned` was persisted) | `host-pending` | `behaviorGate('openwop-safefetch-live-audit', …)` — both-flags-advertised ⇒ FAIL under `OPENWOP_REQUIRE_BEHAVIOR=true` if the durable pair is absent (closes the seam-vs-production gap: a production `createSafeFetch()` with no audit hooks passes `safefetch-behavior.test.ts` but fails this). Asserts the pair on a guaranteed-**blocked** metadata URL (egress-independent floor — "every invocation" incl. refused, so no vacuous pass on an egress-blocked host) + best-effort public fetch for success-path coverage. Run seam soft-skips on 404 (host-pending). No new invariant. **This is the RFC 0076 §B → Accepted bar.** |
|
|
112
|
+
| `agent-roster-attribution.test.ts` | `capabilities.agents.roster.supported` (RFC 0086 §B/§C, `agent-roster.md`) | A (the normative `GET /v1/agents/roster` read shape + `total==roster.length`; the §C `roster.run.initiated`-before-`agent.invocation.started` ordering + the content-free payload backing `roster-attribution-no-content`; the durable work-item `triggerSubscriptionId` (RFC 0083); the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ROSTER_ID` — driven through `POST /v1/host/sample/roster/fire` + the test event-log seam) | `host-pending` | `behaviorGate('openwop-roster-attribution', …)`. Normative-read leg runs black-box on any roster host; the attribution/ordering legs are seam-gated and soft-skip on 404. **This is the RFC 0086 → Accepted bar.** First adopter: MyndHyve `agents.roster`. |
|
|
113
|
+
| `agent-live-invocation-bracket.test.ts` | `capabilities.agents.liveRuntime.supported` (RFC 0077 §E, `multi-agent-execution.md` §"Live manifest dispatch") | A (the §E bracket — `agent.invocation.started`-first / `agent.invocation.completed`-last, matching `invocationId`, `source`/`outcome` closed enums, both content-free — via `POST /v1/host/sample/agents/live-invoke` + the test event-log seam) | `host-pending` | `behaviorGate('openwop-live-invocation-bracket', …)`. Seam-gated; soft-skips on 404. Part of the RFC 0077 → Accepted bar. First adopter: MyndHyve `agents.liveRuntime`. |
|
|
114
|
+
| `agent-live-structured-output.test.ts` | `capabilities.agents.liveRuntime.structuredOutput` (RFC 0077 §B step 6) | A (a terminal result violating `handoff.returnSchemaRef` fails the invocation `outcome:"failed"` + `schemaValidated != true`, not a shipped completion — via the `forceInvalidResult` seam param) | `host-pending` | `behaviorGate('openwop-live-structured-output', …)`; gated on `liveRuntime.supported` + `structuredOutput`. Seam-gated; soft-skips on 404. |
|
|
115
|
+
| `agent-live-allowlist-enforced.test.ts` | `capabilities.agents.liveRuntime.supported` (RFC 0077 §F-1 / RFC 0002 §A14 `toolAllowlist`) | A (a tool outside the agent `toolAllowlist` is not callable — no `agent.toolCalled` for the disallowed tool — via the `attemptTool` seam param) | `host-pending` | `behaviorGate('openwop-live-allowlist-enforced', …)`. Seam-gated; soft-skips on 404. Part of the RFC 0077 → Accepted bar. |
|
|
116
|
+
| `agent-org-chart-scoping.test.ts` | `capabilities.agents.orgChart.supported` (RFC 0087 §A/§C/§D, `agent-org-chart.md`) | A (the normative `GET /v1/agents/org-chart` tree-shape — acyclic `parentDepartmentId` tree + `host:<id>` member rosterIds — + the §D `GET /v1/agents/org-chart/{departmentId}` responsibility roll-up with a deduped `responsibilities[]` union + `recursive=false` shape-stability + the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID`) | `host-pending` | `behaviorGate('openwop-org-chart-scoping', …)`. Black-box on the normative path (no POST seam); soft-skips on 404 / when the cross-tenant env var is unset. **Part of the RFC 0087 → Accepted bar.** First adopter: MyndHyve `agents.orgChart`. |
|
|
117
|
+
| `org-position-no-authority-escalation.test.ts` | `capabilities.agents.orgChart.supported` (RFC 0087 §B) + `SECURITY/invariants.yaml` `org-position-no-authority-escalation` | A (behavioral leg of the protocol-tier invariant — the live org-chart wire carries NO authority-bearing field (`scopes`/`canDispatch`/`permissions`/`authority`/`roleGrants`/`capabilities`) on any member / department / responsibility-view object, proving the host's projector strips position-as-authority at every install scope) | `host-pending` | `behaviorGate('openwop-org-position-no-authority', …)`. Black-box on the normative path; soft-skips on 404. The STRUCTURAL leg (schema rejects an authority field) stays always-on in `agent-org-chart-shape.test.ts`; the deeper RFC 0049/0051 authority-invariance legs stay reference-impl tier (a non-normative authz-decide hook would be required — the `agent-manifest-runtime` precedent). **Part of the RFC 0087 → Accepted bar.** |
|
|
118
|
+
| `trigger-bridge-delivery.test.ts` | `openwop-trigger-bridge` profile (RFC 0083 §C/§D, `trigger-bridge.md`; derived from discovery — bridge + dead-letter sink + durable source) | A (the §C delivery model via `POST /v1/host/sample/trigger-bridge/deliver` + the test event-log seam: dedup→effectively-once `trigger.delivery.attempted{delivered}` (§C-1), retry-exhaustion→`{dead-lettered}` + `trigger.subscription.state.changed{toState:dead-lettered}` (§C-2 + RFC 0053), and the delivered run's `run.started.causationId` == the delivery id (§C / RFC 0040); both `trigger.*` events content-free) | `host-pending` | `behaviorGate('openwop-trigger-bridge', …)`. Profile-gated; seam-gated; soft-skips on 404. Normative `GET /v1/trigger-subscriptions` read runs black-box. **This is the RFC 0083 → Accepted bar.** First adopter: MyndHyve `triggerBridge`. |
|
|
119
|
+
| `agent-eval-run.test.ts` | `capabilities.agents.evalSuite.supported` (RFC 0081 §B/§C, `agent-evaluation.md`) | A (the §C `mode:"eval"` projection via `POST /v1/host/sample/agents/eval-run` + the test event-log seam: `eval.started`-first → one `eval.scored` per task → `eval.completed`-once ordering (`count == eval.completed.taskCount`), every `eval.scored` content-free + `score` ∈ 0..1 backing `eval-summary-no-content-leak`; plus the NORMATIVE `GET /v1/runs/{runId}/eval-summary` returning a schema-valid `EvalSummary` with `passedCount <= taskCount` and no per-task output body) | `host-pending` | `behaviorGate('openwop-eval-run', …)`. Seam-gated; soft-skips on 404. The normative eval-summary read runs black-box on any `evalSuite` host. **This is the RFC 0081 → Accepted bar.** First adopter: MyndHyve `agents.evalSuite`. |
|
|
120
|
+
| `agent-deployment-lifecycle.test.ts` | `capabilities.agents.deployment.supported` (RFC 0082 §B/§E, `agent-deployment.md`) + `SECURITY/invariants.yaml` `deployment-promotion-fail-closed` | A (the §E promotion contract via `POST /v1/host/sample/agents/deployment-transition` + the test event-log seam: a `promote` runs authorize (RFC 0049) → approvalGate (RFC 0051) → eval-verify (RFC 0081) → content-free `deployment.promoted` with a seven-state `toState` + `toVersion`, the returned record validating `agent-deployment.schema.json`; an `unauthorized` principal fails closed — `allowed:false`, NO `deployment.promoted` (the behavioral leg of `deployment-promotion-fail-closed`); an `eval-gate-unmet` promote is denied `eval_gate_unmet` with NO `deployment.promoted` (§E-3); and a `channel-pin` run records `resolvedAgentVersion` on `agent.invocation.started` (§B)) | `host-pending` | `behaviorGate('openwop-deployment-lifecycle', …)`. Seam-gated; soft-skips on 404. The normative `GET /v1/agents/{agentId}/deployments` read runs black-box on any `deployment` host. **This is the RFC 0082 → Accepted bar** (the `deployment-promotion-fail-closed` invariant graduates reference-impl → protocol tier when this passes against a host). First adopter: MyndHyve `agents.deployment`. |
|
|
121
|
+
| `agent-channel-dispatch.test.ts` | `capabilities.agents.deployment.supported` (RFC 0082 §B, `agent-deployment.md` + `version-negotiation.md`) + the seeded `conformance-agent-channel-dispatch` fixture + advertised `replay` mode | A (the §B pin from a PRODUCTION run graph, complementing `agent-deployment-lifecycle.test.ts` Leg 4's host-sample seam: a canonical `POST /v1/runs` of a node binding `agent.channel:"stable"` records `resolvedChannel` + a concrete `resolvedAgentVersion` on `agent.invocation.started` (RFC 0077); a `:fork{mode:"replay"}` re-reads that recorded version; the seam-guarded Leg 3 MOVES the channel then asserts a replay STILL carries the original pin — never re-resolving a moved channel) | `host-pending` | `behaviorGate('openwop-deployment-channel-dispatch', …)`. Fixture+cap+replay-mode-gated; soft-skips by default, hard-fails under `OPENWOP_REQUIRE_BEHAVIOR=true`. **The production-path proof of the §B contract** (Leg 4 proves it through the seam; this proves it through a real run graph). First adopter: MyndHyve `agents.deployment`. |
|
|
122
|
+
| `tool-catalog-projection.test.ts` | `capabilities.toolCatalog.supported` (RFC 0078 §B/§F, `tool-catalog.md`) | A (the NORMATIVE `GET /v1/tools` list — each `ToolDescriptor` schema-valid + `source`/`safetyTier` ∈ the closed vocab + content-free; `GET /v1/tools/{toolId}` round-trip + unknown-id 404; auth-gated 401-unauthenticated; the §F-2 cross-principal non-disclosure via `OPENWOP_CROSS_PRINCIPAL_TOOL_ID` → 404) | `host-pending` | `behaviorGate('openwop-tool-catalog', …)`. Black-box on the normative path (no POST seam); soft-skips on 404 / when the cross-principal env var is unset. **This is part of the RFC 0078 → Accepted bar.** First adopter: MyndHyve `toolCatalog`. |
|
|
123
|
+
| `tool-session-lifecycle.test.ts` | `capabilities.toolCatalog.sessionLifecycle` (RFC 0078 §D, `tool-catalog.md`) | A (the §D bracket via `POST /v1/host/sample/tools/session-run` + the test event-log seam: `tool.session.opened` before the first RFC 0064 call event → `tool.session.closed` after the last, one shared `sessionId`, each carrying a `toolId`, `closed.outcome` ∈ {completed,failed,aborted,expired}, both content-free) | `host-pending` | `behaviorGate('openwop-tool-session-lifecycle', …)`. Seam-gated; soft-skips on 404. **Part of the RFC 0078 → Accepted bar.** First adopter: MyndHyve `toolCatalog`. |
|
|
124
|
+
| `egress-audience-binding.test.ts` | `capabilities.httpClient.egressPolicy.supported` (RFC 0079 §C, `host-capabilities.md`) + `SECURITY/invariants.yaml` `egress-credential-audience-bound` | A (KEYSTONE — the §C confused-deputy MUST via `POST /v1/host/sample/egress/decide`: an out-of-audience egress is `denied`/`downgraded` with `reason:"out-of-audience"` and the credential is NOT attached (`credentialAttached !== true`); a provenance-unevaluable egress fails closed `denied`+`reason:"provenance-unevaluable"`; decision/reason ∈ the closed enums) | `host-pending` | `behaviorGate('openwop-egress-audience-binding', …)`. Seam-gated; soft-skips on 404. **This is the RFC 0079 → Accepted bar** (the `egress-credential-audience-bound` invariant graduates reference-impl → protocol tier when this passes against a host). First adopter: MyndHyve `httpClient.egressPolicy`. |
|
|
125
|
+
| `egress-decision-content-free.test.ts` | `capabilities.httpClient.egressPolicy.supported` (RFC 0079 §F / SR-1) | A (the secret non-leak — a `canary` credential's sentinel never surfaces in the decision (`canaryLeaked !== true`), the `egress.decided` payload carries no forbidden content key, and `reason` stays in the CLOSED vocabulary so no blocked destination spills into a free-form field) | `host-pending` | `behaviorGate('openwop-egress-decision-content-free', …)`. Seam-gated; soft-skips on 404. **Part of the RFC 0079 → Accepted bar.** First adopter: MyndHyve `httpClient.egressPolicy`. |
|
|
126
|
+
| `memory-degraded-projection.test.ts` | `capabilities.agents.manifestRuntime.supported` + `capabilities.memory.supported` (RFC 0080 §C, `agent-memory.md`) | A (the §C iff-contract on the NORMATIVE `GET /v1/agents`: a degraded entry MUST carry `memoryDegraded:true` + a non-empty, unique `degradedMemoryDimensions[]` drawn from the closed §A-name enum [read/write/search/long-term-durability/compaction/attribution/replay-snapshot/retention]; a non-degraded entry MUST NOT carry a non-empty list; the inventory is non-empty; the degraded branch runs non-vacuously when `OPENWOP_DEGRADED_AGENT_ID` names a known-degraded agent) | `host-pending` | `behaviorGate('openwop-memory-degraded', …)`. Black-box on the normative path (no POST seam); soft-skips on 404 / when the host computes no degradation. **This is the RFC 0080 → Accepted bar.** First adopter: MyndHyve `memory`. |
|
|
127
|
+
| `budget-enforcement.test.ts` | `capabilities.budget.supported` (RFC 0084 §C/§D, `budget-policy.md`) + `SECURITY/invariants.yaml` `budget-no-pricing-leak` | A (the §C/§D enforcement via `POST /v1/host/sample/budget/run` + the test event-log seam: a `hard-cost-exhaust` run emits the strict-ordered `budget.reserved → budget.consumed → budget.threshold.crossed{percent} → budget.exhausted → cap.breached{kind:"budget-cost"} → run.failed{error:"budget_exhausted"}` chain; a `model-denied` run is refused `budget_model_denied` BEFORE the provider call (fail-closed); an `advisory` host emits the `budget.*` events without stopping; every `budget.*` payload content-free — no pricing/rate) | `host-pending` | `behaviorGate('openwop-budget-enforcement', …)`. Seam-gated; soft-skips on 404. **This is the RFC 0084 → Accepted bar.** First adopter: MyndHyve `budget`. |
|
|
128
|
+
| `agent-platform-aggregate-evidence.test.ts` | `openwop-agent-platform` claim — live discovery `profiles[]` includes it (RFC 0085 §C, `agent-platform-profile.md`) | A (the §C/§D honest-advertisement on live `/.well-known/openwop`: a host claiming `openwop-agent-platform` MUST satisfy the §B floor predicate (`isAgentPlatformPartial` → `partial`/`full`, never `none`), the claim backed by per-capability evidence not the profile string; `OPENWOP_AGENT_PLATFORM_TIER=full` forces the full-predicate bar — all governance terms + tenant installScope + all 16 §D terms) | `host-pending` | `behaviorGate('openwop-agent-platform', …)`. Black-box on the discovery doc (no POST seam); soft-skips until a host claims the profile. **This is the RFC 0085 → Accepted bar.** First adopter: MyndHyve (after the memory batch surfaces the floor's `memory.supported`). |
|
|
129
|
+
| `approval-gate-events.test.ts` | `approval.granted` / `.rejected` / `.overridden` (RFC 0051 §B, `interrupt-profiles.md` §approvalGate) | Server-free (event-payload schema validity: required fields incl. mandatory `overridden.reason`; additionalProperties:false negatives) | host-pass (server-free) | Always runs; no host needed. |
|
|
130
|
+
| `approval-gate-flow.test.ts` | `core.openwop.governance.approvalGate` (RFC 0051 §A) + `capabilities.authorization` (RFC 0049) | A (capability-gated on `authorization.supported`; unauthorized-principal-denied + override-audited via the `governance/approval-gate` seam) | `host-pending` | Behavioral probe soft-skips on 404. Grant/reject-loopback/quorum scenarios deferred until a governance host wires the seam. |
|
|
131
|
+
| `scheduling-capability-shape.test.ts` | `capabilities.scheduling` (RFC 0052 §A, `host-capabilities.md` §host.scheduling) | A (advertisement shape always — `supported` boolean; `cron`/`delayed`/`calendar` booleans; `maxFutureHorizon` ISO-8601 duration) | `host-pending` | Always runs; asserts the block is absent or well-formed. |
|
|
132
|
+
| `scheduling-cron-fires-once.test.ts` | `capabilities.scheduling` (RFC 0052 §B) | A (once-per-tick + missed-tick MUST-NOT via optional `POST /v1/host/sample/scheduling/tick` seam — single tick fires exactly one run; missed window never floods) | `host-pending` | Capability-gated on `scheduling.supported` + `cron`; soft-skips on 404. Delayed-horizon + calendar scenarios deferred. |
|
|
133
|
+
| `deadletter-capability-shape.test.ts` | `capabilities.deadLetter` (RFC 0053 §A, `host-capabilities.md` §host.deadLetter) | A (advertisement shape always — `supported` boolean; `retentionDays` integer ≥ 1) | `host-pending` | Always runs; asserts the block is absent or well-formed. |
|
|
134
|
+
| `run-execution-bounds-shape.test.ts` | `capabilities.limits.{maxRunDurationMs,maxLoopIterations}` + `cap.breached` kinds `run-duration` / `loop-iterations` (RFC 0058, `run-options.md` §Reserved keys) | A (advertisement shape always — `maxRunDurationMs` integer ≥ 1000; `maxLoopIterations` integer ≥ 1; run-duration breach gated on the `conformance-run-duration-breach` fixture) | `host-pending` | Shape always runs; the run-duration breach behavior soft-skips until a host enforces wall-clock timeouts (mirrors RFC 0052 scheduling). |
|
|
135
|
+
| `workspace-capability-shape.test.ts` | `capabilities.workspace` (RFC 0059 §A, `agent-workspace.md` §"Capability advertisement") + the four workspace operations `listWorkspaceFiles` / `getWorkspaceFile` / `putWorkspaceFile` / `deleteWorkspaceFile` (`/v1/host/workspace/files[/{path}]`) | A (advertisement shape always — `supported` boolean; `maxFileBytes`/`maxFiles`/`maxVersions` integers ≥ 1 when present) | in-memory ✅ | Always runs; asserts the block is absent or well-formed. The in-memory reference host advertises `capabilities.workspace.supported` (RFC 0059 M2). |
|
|
136
|
+
| `workspace-behavior.test.ts` | RFC 0059 §C/§D (`agent-workspace.md` §"§C — Endpoints" / §"§D — Run-time exposure") — `/v1/host/workspace/files[/{path}]` | B (capability-gated on `workspace.supported`: CRUD round-trip + `If-Match` 409 `workspace_conflict` with `details.currentVersion` + `workspace_too_large` over `maxFileBytes` + the run-start workspace snapshot on the run snapshot) | in-memory ✅ | Soft-skips when `workspace.supported` is unadvertised. |
|
|
137
|
+
| `workspace-cross-tenant-isolation.test.ts` | RFC 0059 §E WCT-1 (`agent-workspace.md` §"§E — Invariants") + `SECURITY/invariants.yaml` `workspace-cross-tenant-isolation` | B (capability-gated + seam-gated: a file owned by `{tenant, workspace}` MUST NOT be readable via `get`/`list` under a different owner — drives `POST /v1/host/sample/workspace/op`) | in-memory ✅ | Public test for the `workspace-cross-tenant-isolation` invariant; soft-skips when the seam is unwired (404). |
|
|
138
|
+
| `workspace-cross-tenant-isolation-blackbox.test.ts` | RFC 0059 §E WCT-1 — **BLACK-BOX production path** (`agent-workspace.md` §"§C — Endpoints" + §"§E — Invariants") + `SECURITY/invariants.yaml` `workspace-cross-tenant-isolation` | A− (drives the NORMATIVE §C `PUT`/`GET /v1/host/workspace/files` with TWO operator credentials — owner A writes a secret, a second-tenant credential fails closed on read (404/403, no content/existence leak) + can't enumerate it, owner A still reads it; **no `/v1/host/sample/*` seam**) | host-pending (multi-tenant) | The production-path proof that replaces the seam (audit item 3) and graduates RFC 0059 **into** the `openwop-core-standard` floor (RFC 0088 §D Lever-2 → floor). Soft-skips without `OPENWOP_TEST_TENANT_B_API_KEY` (a second-tenant credential the suite can't mint) or `workspace.supported`. First adopter: MyndHyve (Firestore path-scoped multi-tenant). |
|
|
139
|
+
| `deadletter-retry-exhaustion.test.ts` | `capabilities.deadLetter` (RFC 0053 §C) + `run.dead_lettered` event | A (retry-exhaustion → `run.dead_lettered` with `attempts` + dead-lettered run fork-eligible, via optional `POST /v1/host/sample/deadletter/exhaust` seam) | `host-pending` | Capability-gated on `deadLetter.supported`; soft-skips on 404. Retention-purge scenario deferred (needs clock seam). |
|
|
140
|
+
| `artifact-type-pack-manifest-validation.test.ts` | `artifact-type-pack-manifest.schema.json` (RFC 0071 Phase 1, `artifact-type-packs.md` §"Manifest format") | Server-free (positive + 5 negatives: mixed-kind / empty `artifactTypes` / bad `artifactTypeId` pattern / unknown `rendering.display` / non-conforming `exportFormats`) | host-pass (server-free) | Always runs; no host needed. |
|
|
141
|
+
| `artifact-schema-compile-bounded.test.ts` | RFC 0071 §"Bounded schema compilation" + `SECURITY/invariants.yaml` `artifact-schema-compile-bounded` | Server-free (contract-present in corpus + a reference finite bound rejects `$ref`-depth / keyword-count / size bombs, admits benign schemas) | host-pass (server-free floor) | Public test for the `artifact-schema-compile-bounded` invariant. Behavioral form (host rejects an over-bounds pack at registry `PUT` with `pack_validation_failed`) gated on `host.artifactTypes.supported`. |
|
|
142
|
+
| `artifact-type-pack-install.test.ts` | `host.artifactTypes` (RFC 0071 Phase 1, `host-capabilities.md` §host.artifactTypes) | A (install + produce → `artifact.created { registered: true }` after schema validation; schema-violating payload rejected, via the `POST /v1/host/sample/artifacttypes/{install,produce}` seam) | MyndHyve ✅ · in-memory ✅ | Capability-gated on `host.artifactTypes.supported`. PASS against MyndHyve `workflow-runtime-00396-cuj` (production) AND the in-memory reference host (store-only seam, RFC 0075 P2-1); soft-skips on hosts that don't advertise. |
|
|
143
|
+
| `artifact-type-store-without-render.test.ts` | `host.artifactTypes` (RFC 0071 §host.artifactTypes — store/render negotiation) | A (a `store:true,render:false` host stores the artifact + completes the run; never fails for lack of a renderer) | in-memory ✅ | **Exercised end-to-end (RFC 0075 P2-1):** the in-memory reference host advertises `host.artifactTypes { store:true, render:false }` and implements the produce seam — a registered, schema-valid artifact is stored and the run completes with `rendered:false`. MyndHyve (`render:true`) honestly soft-skips this path; the in-memory store-only host is the one that actually verifies the negotiation. |
|
|
144
|
+
| `chat-card-pack-manifest-validation.test.ts` | `chat-card-pack-manifest.schema.json` (RFC 0071 Phase 2, `chat-card-packs.md` §"Manifest format") | Server-free (positive + 5 negatives: mixed-kind / empty `cards` / bad `cardTypeId` / missing `prompt` / non-portable `inputs[].type`; + positive `vendor.*` extension + the full portable `inputs[].type` subset incl. `multiselect`/`file`, G9) | host-pass (server-free) | Always runs; no host needed. Behavioral execution lives in the sibling scenario below. |
|
|
145
|
+
| `x-openwop-form-pack-manifest.test.ts` | `node-packs.md` §"`x-openwop-form` UX hints" (RFC 0066) | Server-free (positive: an annotated `configSchema` stays a valid 2020-12 schema + advisory hints don't change what it accepts; each §A annotation matches the shape; forward-compat: unknown `kind` validates; 3 negatives: missing `kind` / non-string `kind` / non-string `dependsOn`) | host-pass (server-free) | Always runs; no host needed. `x-openwop-form` is a consumer-side rendering hint — hosts don't advertise it; renderer behavior is a reference-frontend concern, out of scope here. |
|
|
146
|
+
| `chat-card-pack-execution.test.ts` | `host.chat.cardPacks` (RFC 0071 Phase 2, `chat-card-packs.md` §"Card execution" / §"Trust boundary") + `SECURITY/invariants.yaml` `chat-card-input-trust-boundary` | A (output validated against the linked `outputArtifactType` → `artifact.created { registered: true }`; **card-input-derived prompt content propagates `contentTrust:"untrusted"`** — the R2 proof, via `POST /v1/host/sample/cardpacks/execute` seam) | MyndHyve ✅ | **Phase 2 `Accepted` 2026-05-27.** PASS against MyndHyve `workflow-runtime-00402-bey` (real `core.chat.cardExecute` → `ctx.aiEnvelope.generate`; `host.chat.cardPacks` + `host.aiEnvelope` advertised unconditionally on production, steward-curl-verified). Capability-gated on `host.chat.cardPacks.supported`; soft-skips on hosts that don't advertise. |
|
|
147
|
+
| `kv-cross-tenant-isolation.test.ts`, `kv-atomic-increment.test.ts`, `kv-cas.test.ts` (three scenarios) | `capabilities.kvStorage` (RFC 0015, `host-kv-storage-capability.md`) + `SECURITY/invariants.yaml` `kv-cross-tenant-isolation` | A (advertisement shape always; behavioral cross-tenant `set`/`get`, 50× concurrent atomic increment convergence, CAS matching/stale-expect) | host-pass via opt-in test seam | Reference host exposes `POST /v1/host/sample/test/surface` env-gated on `OPENWOP_TEST_SEAM_ENABLED=true`; hosts that don't expose the seam soft-skip the behavioral assertions and verify advertisement shape only. |
|
|
148
|
+
| `table-cross-tenant-isolation.test.ts` | `capabilities.tableStorage` (RFC 0016, `host-table-storage-capability.md`) | A (advertisement shape + behavioral cross-tenant insert/query proof) | host-pass via opt-in test seam | Same seam dependency as kv row. |
|
|
149
|
+
| `queue-cross-tenant-isolation.test.ts` | `capabilities.queueBus` (RFC 0017, `host-queue-bus-capability.md`) + `SECURITY/invariants.yaml` `queue-cross-tenant-isolation` | A (advertisement shape + behavioral cross-tenant publish/consume proof) | host-pass via opt-in test seam | Same seam dependency as kv row. |
|
|
150
|
+
| `blob-cross-tenant-isolation.test.ts`, `cache-cross-tenant-isolation.test.ts` (two scenarios) | `capabilities.blobStorage` + `capabilities.cache` (RFC 0019, `host-blob-cache-capability.md`) | A (advertisement shape + behavioral cross-tenant put/get isolation for both surfaces) | host-pass via opt-in test seam | Same seam dependency as kv row. |
|
|
151
|
+
| `sql-injection-rejection.test.ts` | `capabilities.sql` (RFC 0018, `host-sql-vector-search-capability.md`) + `SECURITY/invariants.yaml` `sql-parametric-only` | A (advertisement shape + parametric round-trip + injection-shape input bound as literal returns 0 rows) | host-pass via opt-in test seam | Same seam dependency as kv row. |
|
|
152
|
+
| `mcp-server-tool-roundtrip.test.ts`, `mcp-server-resource-roundtrip.test.ts`, `mcp-server-prompt-roundtrip.test.ts`, `mcp-server-sampling-bridge.test.ts`, `mcp-server-elicitation-bridge.test.ts`, `mcp-server-untrusted-args.test.ts` (six scenarios) | `capabilities.mcp.serverMount` (RFC 0020, `mcp-integration.md` §"OpenWOP host as MCP server") + `SECURITY/invariants.yaml` `mcp-server-untrusted-args` | A (advertisement shape + JSON-RPC tools/list+tools/call roundtrip, resources/list+read, prompts/list+get, sampling/elicitation bridge dispatch, Ajv2020 args validation rejecting with -32602 before workflow start) | host-pass via opt-in MCP server mount | Reference host exposes `POST /v1/host/sample/mcp` env-gated on `OPENWOP_MCP_SERVER_ENABLED=true`; hosts that don't expose the mount soft-skip the behavioral assertions and verify advertisement shape only. |
|
|
153
|
+
| `prompt-end-to-end-events.test.ts`, `prompt-resolution-chain-node-wins.test.ts`, `prompt-resolution-chain-fallback-cascade.test.ts` (three scenarios) | `prompts-supported` profile — gates on `capabilities.prompts.supported: true` (RFC 0027 + RFC 0029, `spec/v1/prompts.md`) | A (advertisement shape always + end-to-end resolve + emit during real workflow dispatch; resolution chain Layers 1, 3, 4 exercised) | host-pass (workflow-engine reference) | Reference host advertises `capabilities.prompts.supported: true` since RFC 0027 ref-impl landed; dispatch wiring in `bootstrap/nodes.ts` walks the resolution chain and emits `agent.promptResolved` + `prompt.composed` per spec/v1/prompts.md §"Composition + observability". |
|
|
154
|
+
| `prompt-pack-install.test.ts`, `prompt-list-and-fetch.test.ts`, `prompt-render-deterministic.test.ts` (three scenarios) | `prompts-endpoints` profile — gates on `capabilities.prompts.endpointsSupported: true` (RFC 0028 §A, `spec/v1/prompts.md` §"Discovery & distribution") | A (advertisement shape always + list/get/render contract + pack-source provenance stamps + ETag honoring when supported) | host-pass (workflow-engine reference) | Reference host serves the six `/v1/prompts*` routes via `routes/prompts.ts` against the in-memory `PromptStore`. Pack-install existence claim opt-in via `OPENWOP_TEST_PROMPT_PACK_INSTALLED=true` (the in-tree `vendor.openwop.prompt-sample` pack auto-installs via `promptPackLoader.ts`). |
|
|
155
|
+
| `prompt-mutable-lifecycle.test.ts` | `prompts-mutable` profile — gates on `capabilities.prompts.mutableLibrary: true` (RFC 0028 §C) | A (advertisement shape + CRUD lifecycle + pack/host source 403-on-mutation) | host-pass (workflow-engine reference) | Reference host advertises `mutableLibrary: true`; user-source templates accepted, pack + host-built-in templates return 403 on POST/PUT/DELETE. |
|
|
156
|
+
| `prompt-mutation-workspace-membership-enforced.test.ts` | `prompts-mutable` profile — gates on `capabilities.prompts.mutableLibrary: true` (RFC 0028 Tier-2 follow-up, post-promotion) + `SECURITY/invariants.yaml` `prompt-mutation-workspace-membership-enforced` | A (advertisement gate + cross-workspace write refusal — drives `POST /v1/prompts` with a random non-member `workspaceId`, asserts any 4xx/5xx; on 403 specifically, additionally pins canonical `error === "workspace_membership_required"` envelope per `rest-endpoints.md` §"Common error codes"; other refusal codes unconstrained) | capability-gated (no reference-host membership backend yet; soft-skips cleanly until a host wires the workspace-member resolver) | Filed 2026-05-25 in response to a MyndHyve self-disclosed Admin-SDK-bypasses-DB-rules vulnerability on revision `00207-vzq`. T1 canonicalization same-day (2026-05-25) added the 403-envelope check. Operator override via `OPENWOP_TEST_NONMEMBER_WORKSPACE_ID`. |
|
|
157
|
+
| `prompt-read-workspace-membership-enforced.test.ts` | `prompts-supported` profile — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` per MyndHyve relay Option B: read-only hosts that expose `?workspaceId=` reads are NOT exempt from the symmetric authz invariant) + `SECURITY/invariants.yaml` `prompt-read-workspace-membership-enforced` | A (advertisement gate + cross-workspace read refusal — drives `GET /v1/prompts?workspaceId=<random-non-member>`, interprets response: 4xx PASS with canonical envelope check on 403; 200 with empty `templates[]` PASS as correct null result; 200 with non-empty `templates[]` FAIL as cross-tenant leak; 200 without `templates[]` field SKIP via response-shape detection — host doesn't expose workspace-scoped reads) | capability-gated (no reference-host workspace-scoped read backend yet; soft-skips cleanly on the response-shape detection) | T2 sister scenario filed 2026-05-25 alongside T1; same threat model as the write scenario but probes the read path. Read paths are NOT exempt from cross-tenant authz — a `GET ?workspaceId=<not-mine>` that returns another workspace's templates is a data leak with the same blast radius as a cross-tenant write. Uses response-shape detection (rather than a new capability field) to self-skip hosts without workspace-scoped reads. Operator override via `OPENWOP_TEST_NONMEMBER_WORKSPACE_ID`. |
|
|
158
|
+
| `prompt-resolution-chain-agent-intrinsic.test.ts` | `prompts-agent-bindings` profile — gates on `capabilities.prompts.agentBindings: true` (RFC 0029 §A Layer 2) | A (advertisement shape + Layer 2 agent intrinsic / overrides / library-default precedence over Layers 3-4) | host-pass (workflow-engine reference) | Reference host advertises `agentBindings: true` so Layer 2 sub-layers (agent-intrinsic / agent-overrides / agent-library-default) walk per RFC 0029 §B. |
|
|
159
|
+
| `prompt-resolution-chain-event.test.ts` | RFC 0029 — **BLACK-BOX production path** (`spec/v1/prompts.md` §"Resolution chain"). Gates on `capabilities.prompts.supported` | A− (creates a run from `conformance-prompt-end-to-end`, reads the durable `agent.promptResolved` event via the NORMATIVE `GET /v1/runs/{runId}/events/poll`, and asserts the precedence record — non-empty `chain[]` of valid layers, ≤1 `applied:true` winner, `resolved` mirroring the applied `source` (else null), full-traversal shape; **no `/v1/host/sample/*` seam**) | host-pass (workflow-engine reference emits `agent.promptResolved` via `bootstrap/nodes.ts`) | The production-path proof that replaces the synchronous-resolver seam (audit item 3) and graduates RFC 0029 **into** the `openwop-core-standard` floor (RFC 0088 §D Lever-2 → floor). Soft-skips without `prompts.supported` or until a host emits the event (RFC 0029 emission staging). The three seam scenarios remain for hosts that haven't wired event emission. |
|
|
160
|
+
| `prompt-composed-secret-redaction.test.ts`, `prompt-composed-trust-marker.test.ts` (two scenarios) | `prompts-observability-full` profile — gates on `prompts.supported + observability: "full"` (RFC 0027 §E + RFC 0020 §D) + `SECURITY/invariants.yaml` `prompt-composed-secret-redaction` + `prompt-composed-trust-marker` | A (advertisement shape + `[REDACTED:<credentialRef>]` markers for secret-source bindings + `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation) | host-pass (workflow-engine reference) | Reference host advertises `observability: "full"` (sourced from `host/promptHostConfig.ts`). Composition pipeline in `host/promptCompose.ts` enforces SR-1 carry-forward + untrusted-content marker per `SECURITY/threat-model-secret-leakage.md` §SR-1. |
|
|
161
|
+
| `connection-provider-resolution.test.ts`, `connection-pack-write-reconsent.test.ts` | `capabilities.connections.packsSupported: true` (RFC 0095 §C) | A — always-on schema coverage lives in the three `connection-pack-*` probes (manifest-valid incl. the §C capability declaration; no-credential-material; reach-exclusive) | Gated — §B.6 install/resolve precedence (incl. the SemVer §11 prerelease conflict), §B.8 rejection isolation, §B.4 write-re-consent | A host advertising `connections.packsSupported` + the `POST /v1/host/sample/connection-packs/{install,resolve,consent-plan}` seams (`host-sample-test-seams.md` §10). Soft-skip when unadvertised / seam 404; strict under `OPENWOP_REQUIRE_BEHAVIOR=true`. |
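
The response interpretation in the `prompt-read-workspace-membership-enforced.test.ts` row above can be sketched as a small decision function. This is a minimal illustration, not the scenario's actual code: the names `ProbeResponse`, `Verdict`, and `classifyCrossWorkspaceRead` are hypothetical, and treating any status outside 4xx/200 as inconclusive is an assumption the table does not state.

```typescript
// Hypothetical types for illustration; the real scenario may model these differently.
type Verdict = "pass" | "fail" | "skip";

interface ProbeResponse {
  status: number;
  body: unknown;
}

// Classifies the result of GET /v1/prompts?workspaceId=<random-non-member>.
function classifyCrossWorkspaceRead(res: ProbeResponse): Verdict {
  if (res.status >= 400 && res.status < 500) {
    if (res.status === 403) {
      // On 403 specifically, the canonical error envelope is also checked.
      const err = (res.body as { error?: unknown } | null)?.error;
      return err === "workspace_membership_required" ? "pass" : "fail";
    }
    return "pass"; // any other 4xx counts as a refusal
  }
  if (res.status === 200) {
    const templates = (res.body as { templates?: unknown } | null)?.templates;
    // No templates[] field: the host does not expose workspace-scoped reads.
    if (!Array.isArray(templates)) return "skip";
    // Empty list is a correct null result; a non-empty list is a cross-tenant leak.
    return templates.length === 0 ? "pass" : "fail";
  }
  // Assumption: other statuses are treated as inconclusive in this sketch.
  return "skip";
}
```

Keeping the classification pure (status and body in, verdict out) makes the leak case easy to unit-test without a live host.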
|
|
158
162
|
|
|
159
163
|
Strict-mode runner usage:
|
|
160
164
|
|
|
@@ -162,7 +166,7 @@ Strict-mode runner usage:
|
|
|
162
166
|
OPENWOP_REQUIRE_BEHAVIOR=true npx vitest run
|
|
163
167
|
```
|
|
164
168
|
|
|
165
|
-
The flag is read at scenario startup via `conformance/src/lib/env.ts` → `loadEnv().requireBehavior`. Scenarios use the `behaviorGate(profileName, advertised)` helper from `conformance/src/lib/behavior-gate.ts` so the strict-mode failure message cites the relevant spec section. The wired-and-passing examples (host-pass behavior grade) include `audit-log-integrity.test.ts` (landed 2026-05-11), the 5 host-capability-surface families gated on the `OPENWOP_TEST_SEAM_ENABLED=true` test seam, and the prompt
|
|
169
|
+
The flag is read at scenario startup via `conformance/src/lib/env.ts` → `loadEnv().requireBehavior`. Scenarios use the `behaviorGate(profileName, advertised)` helper from `conformance/src/lib/behavior-gate.ts` so the strict-mode failure message cites the relevant spec section. The wired-and-passing examples (host-pass behavior grade) include `audit-log-integrity.test.ts` (landed 2026-05-11), the 5 host-capability-surface families gated on the `OPENWOP_TEST_SEAM_ENABLED=true` test seam, and the `prompt-*` family of 10 scenarios across the five `prompts-*` profile names (landed 2026-05-20). The remaining `host-pending` rows in the table above adopt the helper as their host-side profiles land (tracked in `docs/PROTOCOL-GAP-CLOSURE-PLAN.md` Phase-1 tracks T1.1 onward). Hosts that deliberately don't implement a profile can list it in `OPENWOP_OPTED_OUT_PROFILES=name1,name2,...` to skip even in strict mode, so minimal hosts can claim full strict-mode coverage without falsifying advertisements.
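
The gating behavior described above (soft-skip by default, hard-fail under `OPENWOP_REQUIRE_BEHAVIOR=true`, skip regardless for opted-out profiles) can be sketched roughly as follows. This is an illustrative approximation only: `sketchBehaviorGate`, `GateDecision`, and the explicit `env` parameter are hypothetical and do not reflect the real signature or internals of `behaviorGate` in `conformance/src/lib/behavior-gate.ts`. It also assumes that an advertised profile always runs, even if it appears in the opt-out list.

```typescript
// Hypothetical decision type; the real helper may throw or call test-runner skip APIs instead.
type GateDecision = "run" | "skip" | "fail";
type Env = Record<string, string | undefined>;

// Parses OPENWOP_OPTED_OUT_PROFILES=name1,name2,... into a set of profile names.
function parseOptedOut(env: Env): Set<string> {
  return new Set(
    (env.OPENWOP_OPTED_OUT_PROFILES ?? "")
      .split(",")
      .map((name) => name.trim())
      .filter((name) => name.length > 0),
  );
}

function sketchBehaviorGate(
  profileName: string,
  advertised: boolean,
  env: Env,
): GateDecision {
  // Assumption: an advertised profile is always exercised.
  if (advertised) return "run";
  // Opted-out profiles skip even in strict mode.
  if (parseOptedOut(env).has(profileName)) return "skip";
  // Strict mode turns a missing advertisement into a failure; otherwise soft-skip.
  return env.OPENWOP_REQUIRE_BEHAVIOR === "true" ? "fail" : "skip";
}
```

Passing the environment as a parameter rather than reading `process.env` directly is a choice made for this sketch so the three outcomes can be checked in isolation.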
|
|
166
170
|
|
|
167
171
|
---
|
|
168
172
|
|
|
@@ -175,66 +179,241 @@ Every OpenAPI operation should have:
|
|
|
175
179
|
3. At least one validation or conflict scenario where the operation accepts input.
|
|
176
180
|
4. A cited spec section in each assertion message.
|
|
177
181
|
|
|
178
|
-
| Operation ID
|
|
179
|
-
|
|
180
|
-
| `getCapabilities`
|
|
181
|
-
| `getOpenApiSpec`
|
|
182
|
-
| `getWorkflow`
|
|
183
|
-
| `createRun`
|
|
184
|
-
| `getRun`
|
|
185
|
-
| `getRunAncestry`
|
|
186
|
-
| `listAgents`
|
|
187
|
-
| `getAgent`
|
|
188
|
-
| `getEvalSummary`
|
|
189
|
-
| `listAgentDeployments`
|
|
190
|
-
| `transitionAgentDeployment`
|
|
191
|
-
| `listAgentRoster`
|
|
192
|
-
| `getAgentRosterEntry`
|
|
193
|
-
| `getAgentOrgChart`
|
|
194
|
-
| `getAgentOrgChartDepartment`
|
|
195
|
-
| `listTools`
|
|
196
|
-
| `getTool`
|
|
197
|
-
| `streamRunEvents`
|
|
198
|
-
| `pollRunEvents`
|
|
199
|
-
| `cancelRun`
|
|
200
|
-
| `bulkCancelRuns`
|
|
201
|
-
| `verifyAuditLog`
|
|
202
|
-
| `pauseRun`
|
|
203
|
-
| `resumeRun`
|
|
204
|
-
| `forkRun`
|
|
205
|
-
| `createAnnotation`
|
|
206
|
-
| `listAnnotations`
|
|
207
|
-
| `diffRun`
|
|
208
|
-
| `resolveInterruptByRun`
|
|
209
|
-
| `inspectInterruptByToken`
|
|
210
|
-
| `resolveInterruptByToken`
|
|
211
|
-
| `getArtifact`
|
|
212
|
-
| `registerWebhook`
|
|
213
|
-
| `unregisterWebhook`
|
|
214
|
-
| `listPromptTemplates`
|
|
215
|
-
| `createPromptTemplate`
|
|
216
|
-
| `getPromptTemplate`
|
|
217
|
-
| `updatePromptTemplate`
|
|
218
|
-
| `deletePromptTemplate`
|
|
219
|
-
| `renderPromptTemplate`
|
|
220
|
-
| `putTestPackTarball`, `getTestPackTarball`, `deleteTestPackVersion`, `getTestPackSignature` | `pack-registry-publish.test.ts` covers the 19-code publish error catalog through the RFC 0025 `/v1/packs-test/*` mirror namespace, gated on `capabilities.packs.testMode.supported: true` (RFC 0025 §A). 26 scenarios soft-skip when the advertisement is absent; when present, the suite exercises URL/scope, body-shape, tarball-extraction, manifest-contents, integrity, auth/conflict, unpublish-window, and signature-endpoint pairing. | Soft-skip on advertisement absence; behavioral on advertisement presence
|
|
182
|
+
| Operation ID | Positive coverage | Negative / auth / validation coverage | Gap |
|
|
183
|
+
| --- | --- | --- | --- |
|
|
184
|
+
| `getCapabilities` | `discovery.test.ts`, `runtime-capabilities.test.ts`, `profileDerivation.test.ts`, `mcp-discoverability.test.ts` | `discovery.test.ts` covers optional `Capabilities-Etag`; `spec-corpus-validity.test.ts` validates schema shape | Add scoped discovery scenario when a host advertises it. |
|
|
185
|
+
| `getOpenApiSpec` | `discovery.test.ts` | `spec-corpus-validity.test.ts` validates OpenAPI refs | Add unavailable/transient error scenario only if host can simulate it. |
|
|
186
|
+
| `getWorkflow` | `route-coverage.test.ts`; fixture-dependent lifecycle tests indirectly require seeded workflow IDs | `route-coverage.test.ts` covers unknown workflow `404`/`403` envelope | Good. |
|
|
187
|
+
| `createRun` | `runs-lifecycle.test.ts`, `identity-passthrough.test.ts`, `failure-path.test.ts`, fixture scenarios | `auth.test.ts`, `errors.test.ts`, `idempotency.test.ts`, `idempotencyRetry.test.ts` | Strong baseline; add per-field validation matrix. |
|
|
188
|
+
| `getRun` | Lifecycle, cancellation, interrupt, replay, and subworkflow tests poll snapshots | `failure-path.test.ts`, `errors.test.ts` | Add explicit unknown-run `404` scenario if not already covered through helper assertions. |
|
|
189
|
+
| `getRunAncestry` | `cross-host-ancestry-endpoint.test.ts`, `cross-host-causation-shape.test.ts` (RFC 0040 §C); capability-gated on `capabilities.multiAgent.executionModel.crossHostCausation.ancestryEndpointSupported` | Unadvertised-host 404 path + top-level `parent: null` shape covered | Add positive multi-hop traversal once a reference host implements end-to-end cross-host composition. |
|
|
190
|
+
| `listAgents` | `agent-manifest-runtime.test.ts` (RFC 0072 §A); capability-gated on `capabilities.agents.manifestRuntime.supported` | Normative `GET /v1/agents` inventory leg — lists ≥1 installed manifest agent; soft-skips (404) when unadvertised | Black-box across hosts; dispatch leg via sample seam pending the executor-integration tier. |
|
|
191
|
+
| `getAgent` | `agent-manifest-runtime.test.ts` (RFC 0072 §A) + `agent-dispatch-route.test.ts` (reference host) | One manifest agent's inventory entry + 404 for unknown | Covered against the workflow-engine reference host. |
|
|
192
|
+
| `getEvalSummary` | `agent-eval-suite-shape.test.ts` (RFC 0081 — schema/shape of the returned `EvalSummary`) + the now-authored behavioral `agent-eval-run.test.ts` (the live `GET /v1/runs/{runId}/eval-summary` schema-valid round-trip); capability-gated on `capabilities.agents.evalSuite.supported` | Wire shape + the content-free invariant covered always-on; the live round-trip is `agent-eval-run.test.ts`, `host-pending` until a host advertises `evalSuite` + wires the `POST /v1/host/sample/agents/eval-run` seam. | First adopter: MyndHyve `agents.evalSuite`. |
|
|
193
|
+
| `listAgentDeployments` | `agent-deployment-shape.test.ts` (RFC 0082 — the `AgentDeployment` record shape the array returns) + the now-authored behavioral `agent-deployment-lifecycle.test.ts` (the live `GET /v1/agents/{agentId}/deployments` black-box read); capability-gated on `capabilities.agents.deployment.supported` | Record shape covered always-on; the live list is `agent-deployment-lifecycle.test.ts`, `host-pending` until a host advertises `deployment` + wires the deployment-transition seam. | First adopter: MyndHyve `agents.deployment`. |
|
|
194
|
+
| `transitionAgentDeployment` | `agent-deployment-shape.test.ts` (RFC 0082 — the `agent-deployment-transition` request + `deployment.*` event shapes) + `deployment-event-no-content-leak` public test + the now-authored behavioral `agent-deployment-lifecycle.test.ts` (the live authz→gate→eval-verify→`deployment.promoted` + fail-closed + `eval_gate_unmet` legs via the seam); capability-gated on `capabilities.agents.deployment.supported` | Request/record/event shapes + content-free negatives covered always-on; the live promotion contract is `agent-deployment-lifecycle.test.ts`, `host-pending` until a host wires the deployment store. | First adopter: MyndHyve `agents.deployment`. |
|
|
195
|
+
| `listAgentRoster` | `agent-roster-shape.test.ts` (RFC 0086 §B — the `agent-roster-response` shape); capability-gated on `capabilities.agents.roster.supported` | Response shape covered always-on; the live `GET /v1/agents/roster` 200/404 + tenant-scoping is deferred to `Active → Accepted` (reference host serves the normative `/v1/agents/roster`, vs the sample-extension `/v1/host/sample/roster`). | Add the live path once a host serves the normative roster endpoint. |
|
|
196
|
+
| `getAgentRosterEntry` | `agent-roster-shape.test.ts` (RFC 0086 §B — the `agent-roster-entry` shape); capability-gated on `capabilities.agents.roster.supported` | Entry shape covered always-on; the live 200/404 + cross-tenant-404 is deferred to `Active → Accepted`. | Add the live path once a host serves the normative endpoint. |
|
|
197
|
+
| `getAgentOrgChart` | `agent-org-chart-shape.test.ts` (RFC 0087 §C — the `agent-org-chart` shape + the `org-position-no-authority-escalation` structural test); capability-gated on `capabilities.agents.orgChart.supported` | Chart shape + the no-authority structural guarantee covered always-on; the live `GET /v1/agents/org-chart` 200/404 + tenant-scoping is deferred to `Active → Accepted`. | Add the live path once a host serves the normative endpoint. |
|
|
198
|
+
| `getAgentOrgChartDepartment` | `agent-org-chart-shape.test.ts` (RFC 0087 §D — the `org-chart-responsibility-view` response shape); capability-gated on `capabilities.agents.orgChart.supported` | Roll-up response shape covered always-on; the live subtree + responsibility roll-up (incl. `?recursive=false`) is deferred to `Active → Accepted`. | Add the live path once a host computes the roll-up over real roster portfolios. |
|
|
199
|
+
| `listTools` | `tool-descriptor-shape.test.ts` (RFC 0078 §C — the `ToolDescriptor` wire shape + `source`/`safetyTier` vocab + the `exec ⇒ host-extension` cross-field MUST) + the authored behavioral `tool-catalog-projection.test.ts` (the live `GET /v1/tools` list — each descriptor schema-valid, content-free, auth-gated); capability-gated on `capabilities.toolCatalog.supported` | Descriptor shape covered always-on; the live list is `tool-catalog-projection.test.ts`, `host-pending` until a host advertises `toolCatalog` + serves the normative `/v1/tools` path. SDK `tools.*` methods deferred (the gated scenario drives the raw conformance driver, not the SDK). | First adopter: MyndHyve `toolCatalog`. |
|
|
200
|
+
| `getTool` | `tool-descriptor-shape.test.ts` (RFC 0078 §C — the single-`ToolDescriptor` shape) + the authored behavioral `tool-catalog-projection.test.ts` (the live `GET /v1/tools/{toolId}` round-trip + unknown-id 404 + §F-2 cross-principal non-disclosure); capability-gated on `capabilities.toolCatalog.supported` | Single-descriptor shape covered always-on; the live by-id read is `tool-catalog-projection.test.ts`, `host-pending` until a host serves the normative endpoint. | First adopter: MyndHyve `toolCatalog`. |
|
|
201
|
+
| `streamRunEvents` | `stream-modes.test.ts`, `stream-modes-buffer.test.ts`, `stream-modes-mixed.test.ts`, `streamReconnect.test.ts` | Unsupported mode and invalid buffer assertions | Add long-running proxy timeout soak outside fast CI. |
|
|
202
|
+
| `pollRunEvents` | `multi-node-ordering.test.ts`, `version-negotiation.test.ts`, redaction tests | Past-end and validation assertions | Good. Add malformed `lastSequence` if missing. |
|
|
203
|
+
| `cancelRun` | `cancellation.test.ts` | Unknown/terminal idempotency cases partial | Add explicit already-terminal cancel behavior. |
|
|
204
|
+
| `bulkCancelRuns` | `bulk-cancel.test.ts` (Phase B close-out) | `bulk-cancel.test.ts` covers per-id error envelopes (`not_found`, `forbidden`, `run_terminal`), oversized-array `400 validation_error`, and 100-id cap | Add multi-tenant scope-narrowing scenario when host advertises per-key scope. |
|
|
205
|
+
| `verifyAuditLog` | `audit-log-integrity.test.ts` covers `/v1/audit/verify` shape end-to-end against the `openwop-audit-log-integrity` profile | `audit-log-integrity.test.ts` covers chain-valid + tamper detection (host-internal) | Add cross-host checkpoint export so an out-of-band verifier can re-anchor against the same chain. |
|
|
206
|
+
| `pauseRun` | `pause-resume.test.ts` covers direct route behavior for running → paused, idempotent re-pause, terminal conflict, and pause-during-suspend race | Conflict and race paths covered with `details.runStatus`; endpoint is no longer coverage-missing | Add explicit immediate-vs-drain-current-node policy assertion when a host advertises both drain policies. |
|
|
207
|
+
| `resumeRun` | `pause-resume.test.ts` covers direct route behavior for paused → running and non-paused conflict | Conflict path covered with `details.runStatus`; endpoint is no longer coverage-missing | Good. |
|
|
208
|
+
| `forkRun` | `replay-fork.test.ts`, `replay-fork-arbitrary.test.ts`, `replay-retention-expiry.test.ts`, `replayDeterminism.test.ts` | Negative `fromSeq`, past-end, unknown source, invalid overlay | ✅ Closed — arbitrary-event fork shipped (`replay-fork-arbitrary.test.ts`) and retention-expired source shipped (`replay-retention-expiry.test.ts`, gated on `OPENWOP_TEST_EXPIRED_REPLAY_RUN_ID`). |
|
|
209
|
+
| `createAnnotation` | `feedback-record-and-list.test.ts`, `feedback-on-terminal-run.test.ts`, `feedback-correction-redaction.test.ts` (RFC 0056); gated on `capabilities.feedback.supported`, soft-skip on `501` | `feedback-unsupported-501.test.ts` (501 when unadvertised), `feedback-cross-tenant-isolation.test.ts`, `feedback-fork-not-copied.test.ts` | Capability-gated; full cross-tenant proof needs a multi-tenant auth seam (soft-skips, like `kv-cross-tenant-isolation`). |
|
|
210
|
+
| `listAnnotations` | `feedback-record-and-list.test.ts`, `feedback-cross-tenant-isolation.test.ts` (RFC 0056) | `feedback-correction-redaction.test.ts` (redacted listing), `feedback-fork-not-copied.test.ts` (fork list empty) | Gated on `capabilities.feedback.supported`. |
|
|
211
|
+
| `diffRun` | `run-diff.test.ts` (RFC 0054); soft-skips on 404 when the endpoint is unimplemented | Self-diff `divergedAtSeq: null`/empty (determinism floor), two-fixture divergence with `eventDiffs[0].seq === divergedAtSeq`, response-shape + `stateDiff` redaction-safety, `400` (missing `against`) + `404` (nonexistent `against`) | Add a bespoke deterministically-divergent fork fixture for `divergedAtSeq === N`-at-a-chosen-seq; full cross-principal `403` needs a multi-principal harness. |
|
|
212
|
+
| `resolveInterruptByRun` | `interrupt-approval.test.ts`, `interrupt-clarification.test.ts`, `approval-payload.test.ts`, `interruptRace.test.ts` | Invalid action, unknown node, race cases | Add auth-required and quorum profile scenarios. |
|
|
213
|
+
| `inspectInterruptByToken` | `interrupt-token-matrix.test.ts` (CF-3, 2026-05-15) covers malformed + unknown token paths | Negative paths covered | Add explicit expired-token case when a host advertises a TTL seam. |
|
|
214
|
+
| `resolveInterruptByToken` | `interrupt-token-matrix.test.ts` covers replay (already-resolved) + unknown token; `interrupt-external-event-correlation.test.ts` covers positive path | Replay path + unknown-token path covered with explicit assertions | Add wrong-action case once the host advertises a typed allowed-actions vocabulary in the interrupt manifest. |
|
|
215
|
+
| `getArtifact` | Indirect through approval payload fixtures | `route-coverage.test.ts` covers unknown artifact `404`/`403` envelope; `artifact-auth.test.ts` (CF-4 close-out 2026-05-15; SQLite host 401-before-404 stub landed 2026-05-19, closes the info-leak surface for every HTTP method) covers `401` unauthenticated path. Negative paths covered (401 + 405 non-GET + 404/403) | Add positive artifact-read scenario once a reference host implements `getArtifact` end-to-end. |
|
|
216
|
+
| `registerWebhook` | Webhook spec exists | `route-coverage.test.ts` covers invalid URL validation envelope | Add positive registration with a test receiver when harness support exists. |
|
|
217
|
+
| `unregisterWebhook` | Webhook spec exists | `route-coverage.test.ts` covers unknown subscription behavior | Add full register-then-unregister roundtrip with a test receiver. |
|
|
218
|
+
| `listPromptTemplates` | `prompt-template-shape.test.ts` + `prompt-list-and-fetch.test.ts` cover schema shape + advertisement contract + list/get contract for `capabilities.prompts.*` against the reference workflow-engine (RFC 0028 `Active` — endpoints live under `openwop-app:backend/typescript/src/routes/prompts.ts`) | Behavioral list + advertisement-shape covered | Add cross-host list-with-filter parity scenario when a second host advertises `endpointsSupported: true`. |
|
|
219
|
+
| `createPromptTemplate` | `prompt-mutable-lifecycle.test.ts` covers CRUD lifecycle against the reference workflow-engine (gated on `mutableLibrary: true`); user-source POST succeeds, pack + host-built-in templates return 403 | Positive create + readonly-source 403 path covered | Add explicit `409` duplicate-id scenario + auth/scope matrix scenarios. |
|
|
220
|
+
| `getPromptTemplate` | `prompt-list-and-fetch.test.ts` covers positive fetch + ambiguous-libraryId + ETag honoring when host advertises it | Positive fetch + 404 + ETag covered | Good — minor gap is the `400 ambiguous_template_id` cross-library disambiguation matrix. |
|
|
221
|
+
| `updatePromptTemplate` | `prompt-mutable-lifecycle.test.ts` covers positive update + non-monotonic-version conflict + pack-sourced-readonly 403 against the reference workflow-engine | Positive update + 403 readonly-source + 409 conflict covered | Add `501` not-mutable-library negative for hosts that advertise `mutableLibrary: false`. |
|
|
222
|
+
| `deletePromptTemplate` | `prompt-mutable-lifecycle.test.ts` covers positive delete + pack-sourced-readonly 403 against the reference workflow-engine | Positive delete + 403 readonly-source covered | Add `501` not-mutable-library negative + `404` unknown-template scenarios. |
|
|
223
|
+
| `renderPromptTemplate` | `prompt-render-deterministic.test.ts` exercises `POST /v1/prompts:render` end-to-end against the reference workflow-engine; deterministic-hash invariant verified across `:render` + `prompt.composed` event paths. `prompt-composed-secret-redaction.test.ts` + `prompt-composed-trust-marker.test.ts` exercise the shared compose pipeline via the `/v1/host/sample/prompt/compose` seam | Deterministic render + composition redaction + trust-marker invariants covered | Add `400 prompt_variable_unresolved` matrix for missing variables across all four PromptKinds. |
|
|
224
|
+
| `putTestPackTarball`, `getTestPackTarball`, `deleteTestPackVersion`, `getTestPackSignature` | `pack-registry-publish.test.ts` covers the 19-code publish error catalog through the RFC 0025 `/v1/packs-test/*` mirror namespace, gated on `capabilities.packs.testMode.supported: true` (RFC 0025 §A). 26 scenarios soft-skip when the advertisement is absent; when present, the suite exercises URL/scope, body-shape, tarball-extraction, manifest-contents, integrity, auth/conflict, unpublish-window, and signature-endpoint pairing. | Soft-skip on advertisement absence; behavioral on advertisement presence | Add real-tarball-builder fixtures so the manifest_mismatch / pack_integrity_failure / unsupported_runtime branches assert against a meaningful gzip+tar payload (currently soft-skipped with explanatory comments). |
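
Many rows above are "capability-gated" and "soft-skip" when the advertisement is absent. The gate can be sketched as below. This is a minimal illustration, not the suite's actual helper: `readCapability` and `gateOn` are hypothetical names, and the sketch assumes the discovery document is a plain JSON object that the dotted paths cited in the table resolve against.

```typescript
// Sketch only: `readCapability` and `gateOn` are hypothetical names,
// not APIs exported by the conformance package.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Walk a dotted path such as "capabilities.packs.testMode.supported".
// Returns undefined as soon as a segment is missing or not an object.
function readCapability(doc: Json, path: string): Json | undefined {
  let node: Json | undefined = doc;
  for (const key of path.split(".")) {
    if (node === null || typeof node !== "object" || Array.isArray(node)) {
      return undefined;
    }
    node = node[key];
  }
  return node;
}

type GateDecision = { run: true } | { run: false; reason: string };

// A scenario runs only when the flag is literally `true`; anything else
// (absent, false, wrong type) yields a soft-skip with a reason string.
function gateOn(doc: Json, path: string): GateDecision {
  return readCapability(doc, path) === true
    ? { run: true }
    : { run: false, reason: `soft-skip: ${path} not advertised` };
}
```

A scenario would call `gateOn(discoveryDoc, "capabilities.toolCatalog.supported")` before issuing any request and return early on `run: false`, so an unadvertised host is reported as skipped, not failed.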
|
|
221
225
|
|
|
222
226
|
---
|
|
223
227
|
|
|
224
228
|
## Gap closure plan
|
|
225
229
|
|
|
226
|
-
| Priority
|
|
227
|
-
|
|
228
|
-
|
|
|
229
|
-
|
|
|
230
|
-
| ✅ done
|
|
231
|
-
| P1
|
|
232
|
-
| ✅ done
|
|
233
|
-
|
|
|
234
|
-
| ~~P1~~
|
|
235
|
-
|
|
|
236
|
-
|
|
|
237
|
-
|
|
|
238
|
-
| P2
|
|
239
|
-
| ✅ Closed 2026-05-17 (HV-1)
|
|
240
|
-
| ✅ Closed 2026-05-19 (HVMAP-1/2) | RFC 0022 conformance fully landed across two cycles: 2026-05-18 promoted HVMAP-1a/1b/1c/2 happy paths from `it.todo()` to live behavioral tests against the Postgres reference host (advertising `capabilities.agents.dispatchMapping: true` + `capabilities.subWorkflow.inputMapping: true`); 2026-05-19 promoted all remaining negative-path cases — HVMAP-1a-null, HVMAP-1a-refusal, HVMAP-1b-failed, HVMAP-1b-cancelled, HVMAP-1c-override, HVMAP-2-unset, HVMAP-2-no-midrun-propagation, HVMAP-2-refusal — via the new capability-toggle seam (`conformance/src/lib/host-toggle.ts` + `POST /v1/host/sample/test/capability-toggle`) and 5 new fixtures (`-no-default` variants, `-per-worker-override`, `-deterministic-fail-child`, `-cancellable-child`). Republished as `@openwop/openwop-conformance@1.3.0`. | RFCS/0022-dispatch-input-output-mapping.md §A + §B + §C
|
|
230
|
+
| Priority | Work item | Target docs |
|
|
231
|
+
| -------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- |
|
|
232
|
+
| ✅ done | ~~Add production-profile scenarios for backpressure envelope, retry durability, stale-claim recovery, and debug-bundle truncation.~~ Shipped: `production-backpressure.test.ts`, `production-retention-expiry.test.ts`, `restart-during-run.test.ts`, `staleClaim.test.ts`, `debug-bundle-truncation.test.ts`, `idempotency.test.ts`/`idempotencyRetry.test.ts` (see the `openwop-production` row in §"Capability-gated scenarios"). | `production-profile.md`, `scale-profiles.md`, `storage-adapters.md`, `debug-bundle.md` |
|
|
233
|
+
| ✅ done | ~~Add auth-profile scenarios for API-key rotation and OAuth2 client-credentials where test issuer metadata is available.~~ Shipped under RFC 0010: `auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (capability-gated; see §"Capability-gated scenarios"). | `auth.md`, `auth-profiles.md` |
|
|
234
|
+
| ✅ done | Interrupt-profile scenarios for quorum, external-event, auth-required, and parent/child cascade — landed 2026-05-10. | `interrupt.md`, `interrupt-profiles.md` |
|
|
235
|
+
| P1 | Convert endpoint manifest into generated coverage evidence from `api/openapi.yaml` operation IDs. | `rest-endpoints.md` |
|
|
236
|
+
| ✅ done | MCP and A2A synthetic-peer roundtrip scenarios landed 2026-05-10 (`mcp-tool-roundtrip.test.ts`, `a2a-task-roundtrip.test.ts`); opt-in via `OPENWOP_MCP_FAKE_SERVER=true` / `OPENWOP_A2A_FAKE_PEER=true`. | `mcp-integration.md`, `a2a-integration.md` |
|
|
237
|
+
| ✅ done | ~~Add replay retention and fork-from-arbitrary-event coverage.~~ Shipped: `replay-fork-arbitrary.test.ts` + `replay-retention-expiry.test.ts` (see the "Replay and fork" surface row). | `replay.md` |
|
|
238
|
+
| ~~P1~~ | ~~Deterministic 429-induction harness so `rate-limit-envelope.test.ts` triggers reliably under CI (currently observational).~~ ✅ Closed 2026-05-15 (CF-6): Postgres reference host honors `OPENWOP_FORCE_RATE_LIMIT=true`. | `rest-endpoints.md` |
|
|
239
|
+
| ✅ done | ~~Add tamper-detection scenario for `audit-log-integrity.test.ts` — requires admin write access to the host's audit store.~~ Covered host-internally at `openwop-examples:examples/hosts/sqlite/test/audit-tamper.test.ts` (mutate-entry + forge-signature) + the CF-11 checkpoint-export verifier path (`scripts/verify-audit-checkpoints.mjs`, regression-guarded in `openwop:check` step 7) — see the "Audit-log integrity profile" surface row. | `auth-profiles.md` |
|
|
240
|
+
| ✅ done | ~~Cross-engine append-ordering scenario (multi-engine fixture).~~ Shipped 2026-05-22: `cross-engine-append-ordering.test.ts` + `cross-engine-append-behavior.test.ts` (Lamport-clock monotonicity via the RFC 0036 cross-engine harness seam — see the RFC 0036 surface row). | `channels-and-reducers.md` |
|
|
241
|
+
| ✅ done | ~~End-to-end webhook signed-delivery test exercising `X-openwop-Signature-Algorithm: v1`.~~ Shipped: `webhook-signed-delivery.test.ts` (forward direction) + `webhook-receiver-adversarial.test.ts` (reverse direction, CF-5) — see the "Webhook signature algorithms" surface row. | `webhooks.md` |
|
|
242
|
+
| P2 | Conformance scenarios that cite normative RFC docs (not just schemas) for the multi-agent surfaces. | RFCS/0002–0007 |
|
|
243
|
+
| ✅ Closed 2026-05-17 (HV-1) | `agentPackHandoffSchemaValidation.test.ts` verifies RFC 0003 §D — host MUST validate dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef`. Fixture `conformance-agent-pack-handoff-schema-validation` exercises 3 branches (valid-task / invalid-task / mock-return-violation). Capability-gated on `capabilities.agents.{supported,dispatch}`. | RFCS/0003-agent-packs.md §D |
|
|
244
|
+
| ✅ Closed 2026-05-19 (HVMAP-1/2) | RFC 0022 conformance fully landed across two cycles: 2026-05-18 promoted HVMAP-1a/1b/1c/2 happy paths from `it.todo()` to live behavioral tests against the Postgres reference host (advertising `capabilities.agents.dispatchMapping: true` + `capabilities.subWorkflow.inputMapping: true`); 2026-05-19 promoted all remaining negative-path cases — HVMAP-1a-null, HVMAP-1a-refusal, HVMAP-1b-failed, HVMAP-1b-cancelled, HVMAP-1c-override, HVMAP-2-unset, HVMAP-2-no-midrun-propagation, HVMAP-2-refusal — via the new capability-toggle seam (`conformance/src/lib/host-toggle.ts` + `POST /v1/host/sample/test/capability-toggle`) and 5 new fixtures (`-no-default` variants, `-per-worker-override`, `-deterministic-fail-child`, `-cancellable-child`). Republished as `@openwop/openwop-conformance@1.3.0`. | RFCS/0022-dispatch-input-output-mapping.md §A + §B + §C |
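
The open P1 item (generating coverage evidence from `api/openapi.yaml` operation IDs) could start from something like the sketch below. It assumes the YAML has already been parsed into a plain object by a loader of your choice, and that the set of covered operation IDs comes from elsewhere, for example the first column of the endpoint table. `listOperationIds` and `uncoveredOperations` are illustrative names, not existing tooling.

```typescript
// Sketch only: operates on an already-parsed OpenAPI document.
// YAML parsing is out of scope here and left to whatever loader the repo uses.
type PathItem = Record<string, unknown>;
interface OpenApiDoc {
  paths?: Record<string, PathItem>;
}

// Path items may also hold non-operation keys (`parameters`, `summary`, ...),
// so only the standard HTTP method keys are treated as operations.
const HTTP_METHODS = new Set([
  "get", "put", "post", "delete", "patch", "head", "options", "trace",
]);

function listOperationIds(doc: OpenApiDoc): string[] {
  const ids: string[] = [];
  for (const pathItem of Object.values(doc.paths ?? {})) {
    for (const [key, value] of Object.entries(pathItem)) {
      if (!HTTP_METHODS.has(key.toLowerCase())) continue;
      const operationId = (value as { operationId?: unknown } | null)?.operationId;
      if (typeof operationId === "string") ids.push(operationId);
    }
  }
  return ids.sort();
}

// Operation IDs declared in the spec that have no row in the coverage table.
function uncoveredOperations(doc: OpenApiDoc, covered: Set<string>): string[] {
  return listOperationIds(doc).filter((id) => !covered.has(id));
}
```

Failing CI when `uncoveredOperations` returns a non-empty list would turn the hand-maintained endpoint table into checked evidence.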
|
|
245
|
+
|
|
246
|
+
---
|
|
247
|
+
|
|
248
|
+
## Scenario inventory backfill (2026-06-11)
|
|
249
|
+
|
|
250
|
+
<!-- NOTE: this inventory should be machine-generated in future (diff
|
|
251
|
+
`ls conformance/src/scenarios/*.test.ts` against the file names cited in
|
|
252
|
+
this document and emit rows from each file's docblock header). It was
|
|
253
|
+
hand-compiled on 2026-06-11 from each file's docblock / behaviorGate /
|
|
254
|
+
capability-gate source; ~90 of the 330 scenario files had no row anywhere
|
|
255
|
+
in this document before this section. -->
|
|
256
|
+
|
|
257
|
+
Every scenario file below exists in `conformance/src/scenarios/` but was not
|
|
258
|
+
named anywhere else in this document. Columns: the spec doc / RFC the scenario
|
|
259
|
+
covers, and the capability (or seam/fixture) that gates it — "always-on" means
|
|
260
|
+
server-free or shape-probe assertions that run unconditionally.
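
The machine-generated version of this inventory suggested in the comment above could begin with a check like the following. It is a sketch under stated assumptions: the caller supplies the scenario file list (for instance from a directory listing) and the full text of this document, and `isCited` / `unlistedScenarios` are hypothetical helper names.

```typescript
// Sketch only: pure functions, so the file listing and the document text
// are passed in by the caller instead of being read from disk here.

// True when `name` appears in `doc` as a whole file name. The character before
// a match must not be a word character or "-", so that "fork.test.ts" is not
// counted as cited merely because "replay-fork.test.ts" is.
function isCited(name: string, doc: string): boolean {
  let from = 0;
  for (;;) {
    const at = doc.indexOf(name, from);
    if (at === -1) return false;
    const prev = doc.charAt(at - 1); // "" when the match is at position 0
    if (!/[\w-]/.test(prev)) return true;
    from = at + 1;
  }
}

// Scenario files that no row of the coverage document names.
function unlistedScenarios(scenarioFiles: string[], coverageDoc: string): string[] {
  return scenarioFiles
    .map((file) => file.split("/").pop() ?? file)
    .filter((name) => name.endsWith(".test.ts") && !isCited(name, coverageDoc))
    .sort();
}
```

An empty result would mean every scenario file has at least one row; any other result is the list of rows still to be written.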
|
|
261
|
+
|
|
262
|
+
### Agent identity, reasoning + orchestrator (Multi-Agent Shift)
|
|
263
|
+
|
|
264
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
265
|
+
| -------------------------------------- | -------------------------------------------------------------------------------------- | ------------------------------------------------------------------- |
|
|
266
|
+
| `agentMetadata.test.ts` | RFC 0002 (`RunSnapshot.agent` / `runOrchestrator` identity) | `capabilities.agents.supported` |
|
|
267
|
+
| `agentReasoningEvents.test.ts` | RFC 0002 (`agent.*` reasoning event family, `run-event-payloads.schema.json`) | `capabilities.agents.supported` |
|
|
268
|
+
| `agentReasoningStreaming.test.ts` | RFC 0024 (`agent.reasoning.delta` streaming) | `capabilities.agents.reasoning.streaming` |
|
|
269
|
+
| `agentConfidenceEscalation.test.ts` | RFC 0002 (CP-1 confidence escalation → `node.suspended {low-confidence}`) | `capabilities.agents.supported` + fixture |
|
|
270
|
+
| `agentMessageReducer.test.ts` | RFC 0002 + `channels-and-reducers.md` §`message` (append-only, `messageId`-idempotent) | `capabilities.agents.supported` |
|
|
271
|
+
| `orchestratorDispatch.test.ts` | RFC 0006 (supervisor → dispatch → next-worker event sequence) | `capabilities.agents.orchestrator` + `capabilities.agents.dispatch` |
|
|
272
|
+
| `orchestratorConservativePath.test.ts` | RFC 0006 (CP-1 conservative-path suspend) | `capabilities.agents.orchestrator` |
|
|
273
|
+
| `orchestratorTermination.test.ts` | RFC 0006 (CO-3 terminate decision) | `capabilities.agents.orchestrator` |
|
|
274
|
+
|
|
275
|
+
### Agent packs (RFC 0003)
|
|
276
|
+
|
|
277
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
278
|
+
| ----------------------------- | ------------------------------------------------------------------------------------- | ------------------------------- |
|
|
279
|
+
| `agentPackInstall.test.ts` | RFC 0003 (pack `agents[]` → AgentManifest registration, `agent-manifest.schema.json`) | `capabilities.agents.supported` |
|
|
280
|
+
| `agentPackExport.test.ts` | RFC 0003 (workspace agents → AgentManifest export round-trip) | `capabilities.agents.supported` |
|
|
281
|
+
| `agentPackProvenance.test.ts` | RFC 0003 (`sourceManifestId` provenance round-trip) | `capabilities.agents.supported` |
|
|
282
|
+
| `agentPackCatalog.test.ts` | RFC 0003 (`core.openwop.agents.{deep-research,react,supervisor}` catalog evidence) | `capabilities.agents.supported` |
|
|
283
|
+
|
|
284
|
+
### Agent memory: adapter (RFC 0004), compaction (RFC 0012), attribution (RFC 0057), distillation (RFC 0062), consolidation + commitments (RFC 0068)
|
|
285
|
+
|
|
286
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
287
|
+
| --------------------------------------------- | ----------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
|
|
288
|
+
| `agentMemoryRoundTrip.test.ts` | RFC 0004 (MemoryAdapter list/get, `memory-entry.schema.json`) | `capabilities.agents.memoryBackends` ∋ `long-term` + fixture |
|
|
289
|
+
| `agentMemoryTtlExpiry.test.ts` | RFC 0004 (`expiresAt` exclusion from list/get) | `capabilities.agents.memoryBackends` ∋ `long-term` + fixture |
|
|
290
|
+
| `agentMemoryCrossTenantIsolation.test.ts` | RFC 0004 CTI-1 + `SECURITY/invariants.yaml#agent-memory-cti-1` | `capabilities.agents.memoryBackends` ∋ `long-term` + fixture |
|
|
291
|
+
| `agentMemoryRedactionContract.test.ts` | RFC 0004 SR-1 + `SECURITY/invariants.yaml#agent-memory-sr-1-redaction` | `capabilities.agents.memoryBackends` ∋ `long-term` + fixture |
|
|
292
|
+
| `memory-compaction-event-emitted.test.ts` | RFC 0012 §B (`memory.compacted` payload shape) | `capabilities.memory.compaction.supported` |
|
|
293
|
+
| `memory-compaction-provenance-tag.test.ts` | RFC 0012 §C (`compacted-from:<id>` tag, SHOULD-tier) | `capabilities.memory.compaction.supported` |
|
|
294
|
+
| `memory-compaction-sr1-carry-forward.test.ts` | RFC 0012 §D + `SECURITY/invariants.yaml#memory-compaction-sr-1-carry-forward` | `capabilities.memory.compaction.supported` |
|
|
295
|
+
| `memory-attribution-shape.test.ts` | RFC 0057 §A (`capabilities.memory.attribution` advertisement) | always-on |
|
|
296
|
+
| `memory-attribution-emits-on-write.test.ts` | RFC 0057 §A/§B (`memory.written` emission) | `capabilities.memory.attribution.emitsWriteEvents` |
|
|
297
|
+
| `memory-attribution-no-content.test.ts` | RFC 0057 §C + `SECURITY/invariants.yaml#memory-attribution-no-content` | `capabilities.memory.attribution.emitsWriteEvents` |
|
|
298
|
+
| `memory-attribution-tenant-scoped.test.ts` | RFC 0057 §C + `SECURITY/invariants.yaml#memory-attribution-tenant-scoped` | `capabilities.memory.attribution.emitsWriteEvents` |
|
|
299
|
+
| `memory-attribution-replay-stable.test.ts` | RFC 0057 §D (no `memoryId` re-mint on replay) | `capabilities.memory.attribution.emitsWriteEvents` |
|
|
300
|
+
| `multi-agent-memory-lifecycle.test.ts` | RFC 0039 §B (MAE-2/3 memory lifecycle; behavioral stubs deferred) | `capabilities.memory.supported` + `multiAgent.executionModel.version >= 2` |
|
|
301
|
+
| `distillation-shape.test.ts` | RFC 0062 §A (`capabilities.memory.distillation` advertisement) | always-on |
|
|
302
|
+
| `distillation-token-budget.test.ts` | RFC 0062 §B (budget bound + atomic `token_budget_exceeded`) | `capabilities.memory.distillation.supported` + seam |
|
|
303
|
+
| `distillation-secret-carryforward.test.ts` | RFC 0062 §B(3) (SR-1 carry-forward through distillation) | `capabilities.memory.distillation.supported` + seam |
|
|
304
|
+
| `distillation-stable-archive.test.ts` | RFC 0062 §B(4) (byte-stable archive checksum) | `capabilities.memory.distillation.supported` + seam |
|
|
305
|
+
| `distillation-index-roundtrip.test.ts` | RFC 0062 §B(5) (`MEMORY-INDEX.json` via RFC 0059 workspace) | `capabilities.memory.distillation.supported` + `indexEmitted` + seam |
|
|
306
|
+
| `memory-consolidation-shape.test.ts` | RFC 0068 (`agent.memory.consolidated` + `commitment.fired` shapes) | always-on |
|
|
307
|
+
| `memory-consolidation-idempotent.test.ts` | RFC 0068 (idempotence + SR-1 carry-forward) | `capabilities.agents.memoryConsolidation.supported` + seam |
|
|
308
|
+
| `commitment-fired.test.ts` | RFC 0068 (fire-once + content-free commitment) | `capabilities.agents.commitments.supported` + seam |
|
|
309
|
+
|
|
310
|
+
### Agent loop (RFC 0061)
|
|
311
|
+
|
|
312
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
313
|
+
| ---------------------------------------- | -------------------------------------------------------------------- | ---------------------------------------------------------- |
|
|
314
|
+
| `agent-loop-version5-shape.test.ts` | RFC 0061 §A/§B (`statefulResume` + `transcriptWindow` advertisement) | always-on |
|
|
315
|
+
| `agent-loop-iteration-monotonic.test.ts` | RFC 0061 §B (1-based monotonic `iteration` counter) | `multiAgent.executionModel.version >= 5` + agent-loop seam |
|
|
316
|
+
| `agent-loop-workspace-snapshot.test.ts` | RFC 0061 §C (per-iteration snapshot immutability) | `version >= 5` + `host.workspace.supported` + seam |
|
|
317
|
+
| `agent-loop-stateful-resume.test.ts` | RFC 0061 §D (HITL resume at same iteration) | `executionModel.statefulResume` + seam |
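
The "1-based monotonic `iteration` counter" property in the table above can be illustrated with a small checker. This sketch assumes the iteration values have already been extracted from a run's event log in order, and it reads "monotonic" as "never decreases", which tolerates a resume at the same iteration. RFC 0061 may require something stricter; `checkIterations` is a hypothetical name.

```typescript
// Sketch only: returns null when the sequence is acceptable, otherwise a
// human-readable description of the first violation.
// Assumption: "monotonic" means non-decreasing; swap `<` for `<=` below
// if the normative text requires strictly increasing values.
function checkIterations(iterations: number[]): string | null {
  let previous: number | undefined;
  for (const value of iterations) {
    if (previous === undefined) {
      if (value !== 1) return `first iteration is ${value}, expected 1`;
    } else if (value < previous) {
      return `iteration went backwards: ${previous} -> ${value}`;
    }
    previous = value;
  }
  return null;
}
```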
|
|
318
|
+
|
|
319
|
+
### RFC 0090 / 0091 / 0092 (verifier, multimodal input, capability requirements)
|
|
320
|
+
|
|
321
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
322
|
+
| ---------------------------------------------- | ----------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- |
|
|
323
|
+
| `agent-verifier-shape.test.ts` | RFC 0090 (`agent.verified` shape; public test for `verifier-no-content-leak`) | always-on |
|
|
324
|
+
| `verifier-gating.test.ts` | RFC 0090 §A/§B (verifier turn + convergence gating, behavioral) | `multiAgent.executionModel.verifier.gating` (behaviorGate) |
|
|
325
|
+
| `aiproviders-input-shape.test.ts` | RFC 0091 (`capabilities.aiProviders.input` modalities block) | always-on |
|
|
326
|
+
| `callai-multimodal.test.ts` | RFC 0091 §A/§B (multimodal perception on callAI, behavioral) | `aiProviders.input.modalities` ∋ non-text (behaviorGate `openwop-callai-multimodal`) |
|
|
327
|
+
| `agent-requires-capabilities-shape.test.ts` | RFC 0092 (`AgentManifest.requiresCapabilities[]` shape) | always-on |
|
|
328
|
+
| `agent-capability-degraded-projection.test.ts` | RFC 0092 §B (degraded projection on `GET /v1/agents`, behavioral) | `capabilities.agents.manifestRuntime` (behaviorGate `openwop-agent-capability-degraded`) |
|
|
329
|
+
|
|
330
|
+
### Conversation primitive (RFC 0005)
|
|
331
|
+
|
|
332
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
333
|
+
| ------------------------------------------- | --------------------------------------------------------------- | ------------------------------------------------------------ |
|
|
334
|
+
| `conversationLifecycle.test.ts` | RFC 0005 (open → exchange → close lifecycle events) | `capabilities.conversationPrimitive` |
|
|
335
|
+
| `conversationCapabilityNegotiation.test.ts` | RFC 0005 (refusal of `core.conversationGate` when unadvertised) | negative gate on `capabilities.conversationPrimitive` |
|
|
336
|
+
| `conversationVsLegacySuspend.test.ts` | RFC 0005 (`conversation.exchanged` ≠ `clarification.requested`) | `capabilities.conversationPrimitive` |
|
|
337
|
+
| `conversationReplayDeterminism.test.ts` | RFC 0005 (replay-fork yields byte-equal conversation log) | `capabilities.conversationPrimitive` + replay-fork + fixture |
|
|
338
|
+
|
|
339
|
+
### Dispatch / sub-workflow mapping (RFC 0022) + sub-run attestation (RFC 0063)
|
|
340
|
+
|
|
341
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
342
|
+
| --------------------------------------- | -------------------------------------------------------------------------- | ---------------------------------------------- |
|
|
343
|
+
| `dispatch-input-mapping.test.ts` | RFC 0022 §A (HVMAP-1a `inputMapping` projection) | `capabilities.agents.dispatchMapping` |
|
|
344
|
+
| `dispatch-output-mapping.test.ts` | RFC 0022 §A (HVMAP-1b `outputMapping` harvest) | `capabilities.agents.dispatchMapping` |
|
|
345
|
+
| `dispatch-cross-worker-handoff.test.ts` | RFC 0022 §A + §D (HVMAP-1c sequential cross-worker visibility) | `capabilities.agents.dispatchMapping` |
|
|
346
|
+
| `subworkflow-input-mapping.test.ts` | RFC 0022 §B (HVMAP-2 child variable-bag seeding) | `capabilities.subWorkflow.inputMapping` |
|
|
347
|
+
| `subrun-attestation-shape.test.ts` | RFC 0063 §A (advertisement flag shape) | always-on |
|
|
348
|
+
| `subrun-checksum-stable.test.ts` | RFC 0063 §B (JCS + SHA-256 output checksum on `output.harvested`) | `capabilities.agents.subRunAttestation` + seam |
|
|
349
|
+
| `subrun-approval-gate.test.ts` | RFC 0063 §C (suspend-before-merge; accept merges, reject doesn't) | `capabilities.agents.subRunAttestation` + seam |
|
|
350
|
+
| `subrun-approval-fail-closed.test.ts` | RFC 0063 §C + `SECURITY/invariants.yaml#subrun-merge-approval-fail-closed` | `capabilities.agents.subRunAttestation` + seam |
|
|
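
The `subrun-checksum-stable.test.ts` row refers to a JCS + SHA-256 output checksum. As a rough illustration of why such a checksum is stable across key order, here is a simplified sketch. It is not a validated RFC 8785 implementation (for example, it does not reject non-finite numbers), and `canonicalize`, `outputChecksum`, and the `Json` type are hypothetical names that do not come from the package. How the checksum is attached to `output.harvested` is not shown.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Simplified canonical form in the spirit of RFC 8785 (JCS):
// object keys are sorted and no insignificant whitespace is emitted.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    // Primitives use ECMAScript JSON serialization.
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    // Arrays keep their element order.
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  // The default sort compares UTF-16 code units.
  const keys = Object.keys(value).sort();
  const members = keys.map(
    (key) => JSON.stringify(key) + ":" + canonicalize(value[key]),
  );
  return "{" + members.join(",") + "}";
}

// Hex-encoded SHA-256 over the UTF-8 bytes of the canonical form.
function outputChecksum(output: Json): string {
  return createHash("sha256")
    .update(canonicalize(output), "utf8")
    .digest("hex");
}
```

Because keys are sorted before hashing, `outputChecksum({ b: 1, a: 2 })` and `outputChecksum({ a: 2, b: 1 })` yield the same digest, which is the property a stable-checksum scenario would presumably rely on.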
351
|
+
|
|
352
|
+
### Tool hooks (RFC 0064)
|
|
353
|
+
|
|
354
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
355
|
+
| ---------------------------------------------- | ---------------------------------------------------------------------------------------------------- | --------------------------------------- |
|
|
356
|
+
| `tool-hooks-shape.test.ts` | RFC 0064 §A (`capabilities.toolHooks` advertisement) | always-on |
|
|
357
|
+
| `tool-hooks-content-free.test.ts` | RFC 0064 §B (`argsHash` + `status`/`durationMs`, content-free) | `toolHooks.prePostEvents` + seam |
|
|
358
|
+
| `tool-hooks-secret-redaction.test.ts` | RFC 0064 §B/§E (SR-1 on tool args before hashing) | `toolHooks.prePostEvents` + seam |
|
|
359
|
+
| `tool-hooks-authorization-fail-closed.test.ts` | RFC 0064 §C (per-tool RFC 0049 fail-closed; also listed under `authorization-fail-closed` invariant) | `toolHooks.perToolAuthorization` + seam |
|
|
360
|
+
| `tool-hooks-rate-limit.test.ts` | RFC 0064 §D (per-`(principal, tool)` token bucket → `rate_limited`) | `toolHooks.perToolRateLimit` + seam |
|
|
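
The `tool-hooks-rate-limit.test.ts` row describes a token bucket keyed per `(principal, tool)` that yields `rate_limited` when exhausted. The sketch below shows one conventional way such a bucket could work. The class name, constructor parameters, injected clock, and `Decision` shape are all assumptions made for illustration and are not taken from RFC 0064.

```typescript
type Decision =
  | { allowed: true }
  | { allowed: false; reason: "rate_limited" };

interface Bucket {
  tokens: number;
  updatedAt: number; // milliseconds
}

// Hypothetical limiter: one independent bucket per (principal, tool) pair.
class PerToolRateLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  constructor(
    capacity: number,
    refillPerSecond: number,
    now: () => number = () => Date.now(),
  ) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
  }

  tryConsume(principal: string, tool: string): Decision {
    // JSON-encoding the pair avoids key collisions between e.g. ("a:b", "c") and ("a", "b:c").
    const key = JSON.stringify([principal, tool]);
    const current = this.now();
    const bucket =
      this.buckets.get(key) ?? { tokens: this.capacity, updatedAt: current };

    // Refill in proportion to elapsed time, capped at capacity.
    const elapsedSeconds = (current - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(
      this.capacity,
      bucket.tokens + elapsedSeconds * this.refillPerSecond,
    );
    bucket.updatedAt = current;
    this.buckets.set(key, bucket);

    if (bucket.tokens < 1) {
      return { allowed: false, reason: "rate_limited" };
    }
    bucket.tokens -= 1;
    return { allowed: true };
  }
}
```

Injecting the clock keeps the sketch deterministic under test: with a capacity of 2 and a refill of 1 token per second, a third immediate call for the same pair is refused, while a different principal or tool is unaffected.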
361
|
+
|
|
362
|
+
### Heartbeat (RFC 0060)
|
|
363
|
+
|
|
364
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
365
|
+
| --------------------------------------- | ------------------------------------------------------------------ | --------------------------------- |
|
|
366
|
+
| `heartbeat-capability-shape.test.ts` | RFC 0060 §A (`capabilities.heartbeat` advertisement) | always-on |
|
|
367
|
+
| `heartbeat-fires-once-per-tick.test.ts` | RFC 0060 §B.1 (exactly-one `heartbeat.evaluated`; overlap skipped) | `heartbeat.supported` + tick seam |
|
|
368
|
+
| `heartbeat-runtime-bound.test.ts` | RFC 0060 §B.2 (`maxRuntimeMs` → `{status:"timeout"}`) | `heartbeat.supported` + tick seam |
|
|
369
|
+
| `heartbeat-idempotent-no-spam.test.ts` | RFC 0060 §B.5 (action on transition, not on tick) | `heartbeat.supported` + tick seam |
|
|
370
|
+
|
|
371
|
+
### Host storage-surface roundtrips (RFCs 0015–0019, beyond the cross-tenant-isolation rows above)
|
|
372
|
+
|
|
373
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
374
|
+
| ----------------------------------------- | -------------------------------------------------------- | ----------------------------------------------------------- |
|
|
375
|
+
| `kv-ttl-expiry.test.ts` | RFC 0015 (`host-kv-storage-capability.md` TTL semantics) | `capabilities.kvStorage` + `OPENWOP_TEST_SEAM_ENABLED` seam |
|
|
376
|
+
| `table-cursor-pagination.test.ts` | RFC 0016 (cursor pagination) | `capabilities.tableStorage` + seam |
|
|
377
|
+
| `table-schema-enforcement.test.ts` | RFC 0016 (table schema enforcement) | `capabilities.tableStorage` + seam |
|
|
378
|
+
| `queue-publish-consume-roundtrip.test.ts` | RFC 0017 (publish/consume roundtrip) | `capabilities.queueBus` + seam |
|
|
379
|
+
| `queue-ack-nack-dlq.test.ts` | RFC 0017 (ack/nack + DLQ semantics) | `capabilities.queueBus` + seam |
|
|
380
|
+
| `stream-subscribe-from-beginning.test.ts` | RFC 0017 (stream subscribe-from-beginning) | `capabilities.queueBus` + seam |
|
|
381
|
+
| `sql-transaction-atomicity.test.ts` | RFC 0018 (transaction atomicity) | `capabilities.sql` + seam |
|
|
382
|
+
| `search-bm25-roundtrip.test.ts` | RFC 0018 (BM25 search roundtrip) | `capabilities.searchIndex` + seam |
|
|
383
|
+
| `vector-knn-roundtrip.test.ts` | RFC 0018 (vector kNN roundtrip) | `capabilities.vectorStore` + seam |
|
|
384
|
+
| `blob-roundtrip.test.ts` | RFC 0019 (blob put/get roundtrip) | `capabilities.blobStorage` + seam |
|
|
385
|
+
| `blob-presign-expiry.test.ts` | RFC 0019 (presigned-URL expiry) | `capabilities.blobStorage` + seam |
|
|
386
|
+
| `cache-ttl-expiry.test.ts` | RFC 0019 (cache TTL expiry) | `capabilities.cache` + seam |
|
|
387
|
+
|
|
388
|
+
### Replay determinism + cross-host causation (RFC 0040 / 0041)
|
|
389
|
+
|
|
390
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
391
|
+
| -------------------------------------------- | ---------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
392
|
+
| `replay-llm-cache-key.test.ts` | `replay.md` §"LLM cache-key recipe" §A/§B (SHA-256 over RFC 8785 JCS) | server-free recipe vectors always-on; host leg via env-gated test seam |
|
|
393
|
+
| `replay-llm-cache-key-portable.test.ts` | RFC 0041 §E + `SECURITY/invariants.yaml#replay-llm-cache-key-portable` | `multiAgent.executionModel.version >= 4` + `replayDeterminism.llmCacheKeyRecipe: "spec-rfc-0041"` |
|
|
394
|
+
| `cross-host-traceparent-propagation.test.ts` | RFC 0040 §B (traceparent into outbound MCP/A2A) | `executionModel.version >= 3` + `crossHostCausation.supported`. **Honesty note:** the behavioral assertions are skipped placeholders (`it.skip` in code; the file header describes them as `it.todo`) pending the cross-host MCP/A2A peer harness — out of stable profile via RFC 0042 §B until RFC 0040 graduates. |
|
|
395
|
+
|
|
396
|
+
### Miscellaneous protocol surfaces
|
|
397
|
+
|
|
398
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
399
|
+
| -------------------------------------------- | --------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------- |
|
|
400
|
+
| `exec-not-protocol-tier.test.ts` | RFC 0069 + `SECURITY/invariants.yaml#exec-must-not-be-protocol-tier` (structural corpus assertion) | always-on, server-free |
|
|
401
|
+
| `idempotency-key-determinism.test.ts` | `idempotency.md` §"Idempotency-Key" + `SECURITY/invariants.yaml#idempotency-key-deterministic` | server-free; soft-skips when the pack source is not bundled |
|
|
402
|
+
| `agents-run-tool-allowlist.test.ts` | OPENWOP-AUDIT-2026-003 + `SECURITY/invariants.yaml#agents-run-no-raw-handler` (`core.openwop.agents.run` 1.0.1) | server-free pack-source; soft-skips when not bundled |
|
|
403
|
+
| `mcp-toolcall-redaction.test.ts` | `mcp-integration.md` MCP-1 + `SECURITY/invariants.yaml#mcp-toolcall-payload-redaction` | `capabilities.mcpClient.supported` |
|
|
404
|
+
| `provider-usage.test.ts` | RFC 0026 (`provider.usage` event) + `SECURITY/invariants.yaml#provider-usage-no-credential-leak` | `capabilities.providerUsage` |
|
|
405
|
+
| `byok-auth-modes.test.ts` | RFC 0067 (`capabilities.aiProviders.authModes` advertisement) | always-on shape; per-provider when advertised |
|
|
406
|
+
| `pack-registry-isolation.test.ts` | RFC 0025 §C.1 (test-mode pack never visible in `/v1/packs/*`) | `capabilities.packs.testMode.supported` (soft-skip) |
|
|
407
|
+
| `prompt-all-four-kinds-events.test.ts` | RFC 0027 §A (four-PromptKind dispatch coverage) | `capabilities.prompts.supported` |
|
|
408
|
+
| `feedback-capability-shape.test.ts` | RFC 0056 §A (`capabilities.feedback` advertisement) | always-on |
|
|
409
|
+
| `workflow-primary-output-annotation.test.ts` | RFC 0065 (`outputRole` closed enum on `WorkflowNode`) | always-on, server-free |
|
|
410
|
+
|
|
411
|
+
### Added on this branch (2026-06-11)
|
|
412
|
+
|
|
413
|
+
| Scenario file | Spec doc / RFC | Gating capability |
|
|
414
|
+
| ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------- |
|
|
415
|
+
| `version-fold.test.ts` | `spec/v1/version-negotiation.md` §`X-Force-Engine-Version` matrix (consumes the `conformance-version-fold.json` fixture — closes catalog gap F5) | ungated |
|
|
416
|
+
| `stream-text-fixture.test.ts` | `spec/v1/stream-modes.md` text-mode fold (consumes the `conformance-stream-text.json` fixture — closes catalog gap F1) | ungated |
|
|
417
|
+
| `i18n-negotiation.test.ts` | `spec/v1/i18n.md` (language negotiation) | `capabilities.i18n` |
|
|
418
|
+
| `grpc-transport.test.ts` | `spec/v1/grpc-transport.md` (gRPC transport profile) | `capabilities.grpc` (schema block added by RFC 0094 on this branch) |
|
|
419
|
+
| `webhook-tenant-isolation.test.ts` | RFC 0093 §A3 (`spec/v1/webhooks.md` delivery tenant scope; backs the protocol-tier `webhook-cross-tenant-isolation` invariant) | `capabilities.webhooks.supported` (two-tenant legs via the `host/sample/test/surface` seam, soft-skip on 404) |
|