@openwop/openwop-conformance 1.1.0 → 1.1.1

package/README.md CHANGED
@@ -92,7 +92,7 @@ Exit code is non-zero on any failed assertion.
  ## What's Covered
- The current suite has 103 scenario files under `src/scenarios/`. This includes 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), and the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H. The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
+ The current suite has 107 scenario files under `src/scenarios/`. This includes 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing 
SR-1 §D), `memory-compaction-event-emitted.test.ts` (canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
  High-level coverage includes:
@@ -171,7 +171,7 @@ Server-required (added in 1.7.0):
  |---|---|---|
  | **Redaction** | [`capabilities.md`](../spec/v1/capabilities.md) §"Secrets" + NFR-7 + §"aiProviders" | Vendor-neutral assertions that the server doesn't leak secret material. Three scenario groups: (a) discovery shape contract — `secrets` + `aiProviders` advertisements are well-formed regardless of `secrets.supported`; when `supported === true`, scopes MUST be non-empty + `resolution === 'host-managed'`; `byok ⊆ supported`. (b) bearer-token redaction — invalid Bearer canary in `Authorization` header is not echoed in the 401 response body. (c) credentialRef echo control — gated on `secrets.supported === true`; canary planted in `configurable.ai.credentialRef` MUST NOT appear in any RunEvent payload (poll-based capture; transport-agnostic). Uses runtime-built canary fixtures (`lib/canaries.ts`) that defeat static secret scanners. 6 scenarios. |
- Current source tree: 103 scenario files. Use [`coverage.md`](./coverage.md) for current grade/gap tracking.
+ Current source tree: 107 scenario files. Use [`coverage.md`](./coverage.md) for current grade/gap tracking.
  ## Remaining Gaps
package/coverage.md CHANGED
@@ -22,7 +22,7 @@
  | Sub-workflows and dispatch | `subworkflow.test.ts`, `multi-node-ordering.test.ts` | B+ | Parallel fan-out floors by scale tier, parent/child cancellation. |
  | Node packs and registry | `pack-registry.test.ts`, `pack-registry-publish.test.ts`, `maliciousManifest.test.ts`, `wasm-pack-load.test.ts`, `wasm-pack-invoke-completed.test.ts`, `wasm-pack-invoke-suspended.test.ts`, `wasm-pack-replay-determinism.test.ts`, `wasm-pack-memory-cap.test.ts`, `wasm-pack-abi-version-rejection.test.ts` | A | RFC 0008 WASM ABI scenarios landed 2026-05-10; gated on `capabilities.nodePackRuntimes.wasm.supported`. Memory-cap positive path closed 2026-05-12 via `examples/packs/rust-misbehaving-memory/` + in-memory loader's `memory.grow` probe + `capBreached.kind` schema extension. ABI-mismatch positive path closed 2026-05-12 via `examples/packs/rust-misbehaving-abi/` (declares ABI 999) + new `capabilities.nodePackRuntimes.wasm.loadedPacks[]` discovery field (rejected packs omitted). Remaining: hosted registry interoperability once `packs.openwop.dev` exists. |
  | Secrets and redaction | `redaction.test.ts`, `redactionAdversarial.test.ts`, `byok-roundtrip.test.ts` | A- | Cross-provider BYOK matrix and debug-bundle redaction under high volume. |
- | Observability and diagnostics | `cost-attribution.test.ts`, `debugBundle.test.ts`, `otel-emission.test.ts`, `otel-trace-propagation.test.ts`, `metric-emission.test.ts`, `otel-emission-grpc.test.ts` | A | OTLP collector accepts all three OTLP transports: HTTP-JSON (2026-05-11), HTTP-protobuf (2026-05-12), gRPC over h2c HTTP/2 (2026-05-12 — hand-rolled framing at `conformance/src/lib/grpc-framing.ts`). Zero new npm deps for any of them. Opt-in via `OPENWOP_OTEL_COLLECTOR=true` (+ `OPENWOP_OTEL_COLLECTOR_GRPC=true` for the gRPC variant). Hosts advertise supported transports via `capabilities.observability.otel.exportProtocols ⊆ {http/json, http/protobuf, grpc}`; conformance scenario gates on the array. |
+ | Observability and diagnostics | `cost-attribution.test.ts`, `debugBundle.test.ts`, `otel-emission.test.ts`, `otel-trace-propagation.test.ts`, `otel-trace-propagation-subworkflow.test.ts`, `metric-emission.test.ts`, `otel-emission-grpc.test.ts` | A | OTLP collector accepts all three OTLP transports: HTTP-JSON (2026-05-11), HTTP-protobuf (2026-05-12), gRPC over h2c HTTP/2 (2026-05-12 — hand-rolled framing at `conformance/src/lib/grpc-framing.ts`). Zero new npm deps for any of them. Opt-in via `OPENWOP_OTEL_COLLECTOR=true` (+ `OPENWOP_OTEL_COLLECTOR_GRPC=true` for the gRPC variant). Hosts advertise supported transports via `capabilities.observability.otel.exportProtocols ⊆ {http/json, http/protobuf, grpc}`; conformance scenario gates on the array. Trace-context propagation closed 2026-05-13 with `otel-trace-propagation-subworkflow.test.ts` asserting parent + child spans share the inbound traceparent across the `core.subWorkflow` dispatch boundary. |
  | Fixtures and corpus validity | `fixtures-valid.test.ts`, `fixtures-gating.test.ts`, `spec-corpus-validity.test.ts` | A | Keep fixture manifest synchronized as new optional profiles land. |
  | Run control — pause/resume | `pause-resume.test.ts` | B | Lifecycle + 409-on-non-paused covered; remaining: pause-during-suspend race, immediate-vs-drain-current-node policy assertion. |
  | Rate-limit envelope | `rate-limit-envelope.test.ts` | B− | Shape validation when 429 observed; remaining: deterministic 429-induction harness so the scenario reliably triggers under CI. |
@@ -48,8 +48,8 @@ Seventeen scenarios (or scenario groups) validate optional profiles where the ho
  | `webhook-sig-algorithm.test.ts` | `X-openwop-Signature-Algorithm: v1` (`webhooks.md`) | C+ (discovery shape) | `host-pending` | End-to-end signed delivery against a test receiver |
  | `pause-resume.test.ts` | `pauseRun` / `resumeRun` lifecycle (`rest-endpoints.md`) | B (lifecycle + 409-on-non-paused) | partial | Pause-during-suspend race; immediate-vs-drain policy assertion |
  | `append-ordering.test.ts` | `append` reducer ordering (`channels-and-reducers.md`) | B (intra-engine) | partial | Cross-engine multi-engine fixture |
- | `otel-emission.test.ts`, `otel-emission-grpc.test.ts` | `openwop.*` OTel spans (`observability.md`) | A (OTLP/HTTP-JSON + OTLP/HTTP-protobuf + OTLP/gRPC all accepted; collector content-type-routes JSON/protobuf and the gRPC HTTP/2 path frame-decodes via `grpc-framing.ts`) | full | Cross-host trace-context propagation across `core.subWorkflow` (filed under `otel-trace-propagation.test.ts` Remaining) |
- | `otel-trace-propagation.test.ts` | W3C trace-context propagation (`observability.md`) | B (trace continuity across `runs:fork` + interrupt resolve) | partial | Cross-host propagation across `core.subWorkflow` invocation |
+ | `otel-emission.test.ts`, `otel-emission-grpc.test.ts` | `openwop.*` OTel spans (`observability.md`) | A (OTLP/HTTP-JSON + OTLP/HTTP-protobuf + OTLP/gRPC all accepted; collector content-type-routes JSON/protobuf and the gRPC HTTP/2 path frame-decodes via `grpc-framing.ts`) | full | |
+ | `otel-trace-propagation.test.ts`, `otel-trace-propagation-subworkflow.test.ts` | W3C trace-context propagation (`observability.md`) | A (trace continuity across `runs:fork` + interrupt resolve + `core.subWorkflow` parent→child boundary; closed 2026-05-13 subWorkflow scenario asserts both parent + child spans share the inbound traceparent's traceId) | full | — |
  | `wasm-pack-*.test.ts` (six scenarios) | `capabilities.nodePackRuntimes.wasm` (`RFCS/0008`) | A (load + invoke + replay + memory-cap positive-path + ABI-mismatch positive-path) | full | Memory-cap positive path closed 2026-05-12 via misbehaving-memory pack + `memory.grow` probe in loader. ABI-mismatch positive path closed 2026-05-12 via misbehaving-abi pack + `loadedPacks[]` discovery field. |
  | `production-backpressure.test.ts`, `production-retention-expiry.test.ts`, `restart-during-run.test.ts`, `staleClaim.test.ts`, `debug-bundle-truncation.test.ts`, `idempotency.test.ts`, `idempotencyRetry.test.ts` (seven scenarios) | `openwop-production` (`production-profile.md`, RFC 0009) | A− (capability shape + 503 envelope under saturation + discovery-exemption; durable-restart + debug-bundle-truncation predicates exercised end-to-end; retention-expiry envelope soft-skipped pending RFC 0009 Q#1) | host-pass | Postgres reference host advertises `capabilities.production.supported: true` since 2026-05-11 and passes all 11 assertions across the 5 non-opt-in scenarios under `OPENWOP_REQUIRE_BEHAVIOR=true` with `--no-file-parallelism` (the backpressure scenario saturates the inflight cap; parallel file execution collides with `idempotencyRetry.test.ts`'s burst). RFC 0009 unresolved questions #1 (force-expire endpoint normation) + #3 (inflightCap vs probing) gate the path to A. |
  | `auth-api-key-rotation.test.ts` | `openwop-auth-api-key-rotation` (`auth-profiles.md`, RFC 0010) | B (capability shape + secondary-key overlap when env-supplied + canary-redaction) | `host-pending` | Reference host advertises the profile + supplies `OPENWOP_TEST_SECONDARY_API_KEY` for the overlap check. |
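
The trace-propagation rows above describe one check: parent and child spans share the traceId carried by the inbound `traceparent` header. A minimal sketch of that comparison follows, assuming the W3C Trace Context header layout (`version-traceId-parentId-flags`, lowercase hex). `parseTraceparent`, `sharesInboundTrace`, and the `SpanLike` shape are hypothetical names for illustration and are not taken from the suite.

```typescript
// W3C Trace Context `traceparent`: version(2)-traceId(32)-parentId(16)-flags(2), lowercase hex.
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

interface TraceParent {
  version: string;
  traceId: string;
  parentId: string;
  flags: string;
}

// Hypothetical minimal span shape; a real collector exposes more fields.
interface SpanLike {
  name: string;
  traceId: string;
}

function parseTraceparent(header: string): TraceParent | null {
  const m = TRACEPARENT_RE.exec(header.trim());
  if (m === null) return null;
  const [, version, traceId, parentId, flags] = m;
  // Version `ff` and all-zero ids are invalid per the Trace Context spec.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, flags };
}

// True only when at least one span was captured and every span carries the inbound traceId.
function sharesInboundTrace(inboundHeader: string, spans: SpanLike[]): boolean {
  const inbound = parseTraceparent(inboundHeader);
  if (inbound === null || spans.length === 0) return false;
  return spans.every((s) => s.traceId.toLowerCase() === inbound.traceId);
}
```

Returning `false` for an empty span list keeps the check from passing vacuously when the collector captured nothing.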
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@openwop/openwop-conformance",
-   "version": "1.1.0",
+   "version": "1.1.1",
    "description": "Production-ready black-box conformance suite for OpenWOP v1.0 compliant servers.",
    "repository": {
      "type": "git",
@@ -84,11 +84,11 @@
  },
  "nodePackRuntimes": {
    "type": "object",
-   "description": "Optional v1 advertisement of node-pack runtimes the host loads. See `node-packs.md` §runtime formats and RFC 0008 (WASM ABI). Hosts that don't load packs MAY omit this block entirely.",
+   "description": "Optional v1 advertisement of node-pack runtimes the host loads. See `node-packs.md` §runtime formats and RFC 0008 (WASM ABI). Hosts that don't load packs MAY omit this block entirely. NOTE: this block intentionally uses `additionalProperties: true` (here and on the nested runtime objects) so a future RFC may add a runtime type (e.g., `python-wasm`, `js-wasm`) without a breaking-change rev of this schema; the trade is that a strict-mode validator will accept arbitrary extra keys under these objects. A future RFC that promotes the open-set fields to first-class SHOULD tighten them to `additionalProperties: false` in the same RFC.",
    "properties": {
      "wasm": {
        "type": "object",
-       "description": "WASM core-module runtime per RFC 0008. Hosts that load `runtime.language: \"wasm\"` packs MUST advertise `supported: true` and at least one entry in `abiVersions[]`.",
+       "description": "WASM core-module runtime per RFC 0008. Hosts that load `runtime.language: \"wasm\"` packs MUST advertise `supported: true` and at least one entry in `abiVersions[]`. `additionalProperties: true` preserved for future RFC 0008 amendments (e.g., per-pack memory accounting fields).",
        "required": ["supported"],
        "properties": {
          "supported": {
@@ -125,7 +125,7 @@
  },
  "wasmComponent": {
    "type": "object",
-   "description": "WASM Component Model variant (WIT-defined interfaces). Reserved for hosts that load `runtime.language: \"wasm-component\"` packs.",
+   "description": "WASM Component Model variant (WIT-defined interfaces). Reserved for hosts that load `runtime.language: \"wasm-component\"` packs. `additionalProperties: true` preserved until the Component Model spec stabilises in the v1.x line; tighten in the RFC that promotes wasm-component to a first-class runtime alongside `wasm`.",
    "properties": {
      "supported": { "type": "boolean" }
    },
@@ -331,6 +331,55 @@
    },
    "additionalProperties": true
  },
+ "memory": {
+   "type": "object",
+   "description": "MemoryAdapter capability block per RFC 0004 + the optional compaction profile per RFC 0012. Hosts that don't wire any memory surface omit the entire block.",
+   "properties": {
+     "supported": {
+       "type": "boolean",
+       "description": "When `true`, host implements the four-operation MemoryAdapter contract (`list`, `get`, `put`, `delete`) per RFC 0004 §A."
+     },
+     "maxEntrySizeBytes": {
+       "type": "integer",
+       "minimum": 1,
+       "description": "Upper bound on `MemoryEntry.content` size. Hosts SHOULD reject `put` exceeding this with `validation_error`."
+     },
+     "ttlSupported": {
+       "type": "boolean",
+       "description": "When `true`, host honors `expiresAt` per RFC 0004 §E."
+     },
+     "compaction": {
+       "type": "object",
+       "description": "RFC 0012 Memory Compaction Profile (Accepted 2026-05-15). Hosts that distill many short-lived MemoryEntry rows into fewer long-lived ones MAY advertise here; advertising implies the SR-1 carry-forward invariant (`SECURITY/invariants.yaml` row `memory-compaction-sr-1-carry-forward`).",
+       "required": ["supported"],
+       "properties": {
+         "supported": {
+           "type": "boolean",
+           "description": "REQUIRED when the sub-block is present. When `true`, host performs compaction over `longTerm` memory and emits the `memory.compacted` event per `observability.md` §Canonical event vocabulary."
+         },
+         "trigger": {
+           "type": "string",
+           "enum": ["host-managed", "client-requested", "both"],
+           "description": "REQUIRED when `supported: true` per RFC 0012 §A (enforced via the `if/then` clause). `host-managed` runs on a host-internal schedule clients do not control. `client-requested` and `both` are reserved enum values; v1.x normates only `host-managed`."
+         },
+         "maxInputEntries": {
+           "type": "integer",
+           "minimum": 1,
+           "description": "Informational ceiling on how many source entries one compaction call collapses. Not wire-enforced."
+         },
+         "maxOutputBytes": {
+           "type": "integer",
+           "minimum": 0,
+           "description": "Informational ceiling on the distilled entry size. SHOULD be ≤ `memory.maxEntrySizeBytes`."
+         }
+       },
+       "additionalProperties": false,
+       "if": { "properties": { "supported": { "const": true } }, "required": ["supported"] },
+       "then": { "required": ["supported", "trigger"] }
+     }
+   },
+   "additionalProperties": true
+ },
  "conversationPrimitive": {
    "type": "boolean",
    "description": "Multi-Agent Shift Phase 4. When `true`, host advertises that it implements the `core.conversationGate` typeId AND honors the `conversation.start` / `conversation.exchange` / `conversation.close` suspend variants. Hosts that don't claim this fall back to `clarification.requested` interrupts for multi-turn user interjections."
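
The `compaction` sub-block in the hunk above combines three constraints: a closed key set (`additionalProperties: false`), a required `supported` flag, and an `if`/`then` clause that makes `trigger` mandatory once `supported` is `true`. The sketch below restates those rules by hand to show how they interact. `validateCompactionBlock` is a hypothetical helper, not part of the package, and a real consumer would more likely run a JSON Schema validator against the schema itself.

```typescript
// Hand-written restatement of the `capabilities.memory.compaction` constraints.
// Hypothetical helper for illustration; not the package's validator.
const COMPACTION_KEYS = new Set(['supported', 'trigger', 'maxInputEntries', 'maxOutputBytes']);
const COMPACTION_TRIGGERS: readonly unknown[] = ['host-managed', 'client-requested', 'both'];

function isIntegerAtLeast(value: unknown, min: number): boolean {
  return typeof value === 'number' && Number.isInteger(value) && value >= min;
}

function validateCompactionBlock(block: unknown): string[] {
  if (typeof block !== 'object' || block === null || Array.isArray(block)) {
    return ['compaction must be an object'];
  }
  const b = block as Record<string, unknown>;
  const errors: string[] = [];

  // additionalProperties: false
  for (const key of Object.keys(b)) {
    if (!COMPACTION_KEYS.has(key)) errors.push(`unexpected key: ${key}`);
  }
  // required: ["supported"]
  if (typeof b.supported !== 'boolean') errors.push('supported is required and must be a boolean');
  // enum on trigger, when present
  if (b.trigger !== undefined && !COMPACTION_TRIGGERS.includes(b.trigger)) {
    errors.push('trigger must be host-managed, client-requested, or both');
  }
  // if supported === true then trigger is required
  if (b.supported === true && b.trigger === undefined) {
    errors.push('trigger is required when supported is true');
  }
  if (b.maxInputEntries !== undefined && !isIntegerAtLeast(b.maxInputEntries, 1)) {
    errors.push('maxInputEntries must be an integer >= 1');
  }
  if (b.maxOutputBytes !== undefined && !isIntegerAtLeast(b.maxOutputBytes, 0)) {
    errors.push('maxOutputBytes must be an integer >= 0');
  }
  return errors;
}
```

Under these rules `{ "supported": false }` is valid on its own, while `{ "supported": true }` without a `trigger` is rejected.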
@@ -2,7 +2,7 @@
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://openwop.dev/spec/v1/run-event-payloads.schema.json",
  "title": "RunEventPayloads",
- "description": "Per-RunEventType payload schemas. The base RunEventDoc shape (run-event.schema.json) leaves `payload` permissive for forward-compat. This schema defines the canonical payload contract for each known RunEventType. Consumers MAY pin strict payload validation via `$defs.<typeId>` and `ajv.validate(schema.$defs[event.type], event.payload)`. Unknown event types MUST be tolerated (no $defs match → fold best-effort).\n\n47 variants from `run-event.schema.json#$defs.RunEventType` are covered, grouped into ~20 shape families with shared $defs. Naming convention: camelCase keys mirror dotted RunEventType names (e.g., `run.started` → `runStarted`).",
+ "description": "Per-RunEventType payload schemas. The base RunEventDoc shape (run-event.schema.json) leaves `payload` permissive for forward-compat. This schema defines the canonical payload contract for each known RunEventType. Consumers MAY pin strict payload validation via `$defs.<typeId>` and `ajv.validate(schema.$defs[event.type], event.payload)`. Unknown event types MUST be tolerated (no $defs match → fold best-effort).\n\n48 variants from `run-event.schema.json#$defs.RunEventType` are covered, grouped into ~20 shape families with shared $defs. Naming convention: camelCase keys mirror dotted RunEventType names (e.g., `run.started` → `runStarted`).",
  "type": "object",
  "$defs": {
    "_typeIndex": {
@@ -56,7 +56,8 @@
      "runOrchestrator.decided": { "$ref": "#/$defs/runOrchestratorDecided" },
      "conversation.opened": { "$ref": "#/$defs/conversationOpened" },
      "conversation.exchanged": { "$ref": "#/$defs/conversationExchanged" },
-     "conversation.closed": { "$ref": "#/$defs/conversationClosed" }
+     "conversation.closed": { "$ref": "#/$defs/conversationClosed" },
+     "memory.compacted": { "$ref": "#/$defs/memoryCompacted" }
    }
  },
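
The `_typeIndex` entries above pair each dotted RunEventType with a camelCase `$defs` key (`memory.compacted` maps to `memoryCompacted`). A small sketch of that naming convention follows, together with a lookup that tolerates unknown types as the schema description requires. `defsKeyFor` and `lookupPayloadSchema` are hypothetical helper names, and the `defs` argument stands in for the schema's `$defs` object.

```typescript
// Convert a dotted RunEventType into the camelCase `$defs` key:
// the first segment is kept as-is, later segments get an upper-cased first letter.
function defsKeyFor(eventType: string): string {
  return eventType
    .split('.')
    .map((segment, index) =>
      index === 0 ? segment : segment.charAt(0).toUpperCase() + segment.slice(1),
    )
    .join('');
}

// Look up the payload schema for an event type. Returns undefined for unknown
// types so the caller can fall back to best-effort handling instead of failing.
function lookupPayloadSchema(
  defs: Record<string, unknown>,
  eventType: string,
): unknown | undefined {
  const key = defsKeyFor(eventType);
  return Object.prototype.hasOwnProperty.call(defs, key) ? defs[key] : undefined;
}
```

Checking with `hasOwnProperty` avoids accidentally matching inherited object keys such as `constructor` when an unexpected event type arrives.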
@@ -658,6 +659,21 @@
        "turnCount": { "type": "integer", "minimum": 0, "description": "Total turns completed in this conversation (number of `conversation.exchanged` events emitted)." }
      },
      "additionalProperties": true
+   },
+
+   "memoryCompacted": {
+     "type": "object",
+     "description": "RFC 0012 (Memory Compaction Profile). Emitted by a host advertising `capabilities.memory.compaction.supported: true` each time a compaction run completes, regardless of trigger. `outputId` MUST be readable via `MemoryAdapter.get(memoryRef, outputId)` until normal TTL eviction. SR-1 carry-forward (RFC 0012 §D) applies: the host MUST route compacted entry content through the same BYOK redaction harness applied to a fresh `put`.",
+     "required": ["memoryRef", "outputId", "sourceCount", "trigger", "byteSize"],
+     "properties": {
+       "memoryRef": { "type": "string", "minLength": 1, "description": "Opaque MemoryAdapter handle per RFC 0004 §C. Bounds compaction to a single memory scope." },
+       "outputId": { "type": "string", "minLength": 1, "description": "Id of the new MemoryEntry created by compaction." },
+       "sourceIds": { "type": "array", "items": { "type": "string", "minLength": 1 }, "description": "Ids of entries collapsed into outputId. Hosts MAY omit when sourceCount > 100; the `compacted-from:<runId>` tag convention (RFC 0012 §C) becomes the authoritative provenance signal. When present, MUST be exhaustive within the array (no 'and N more' semantics)." },
+       "sourceCount": { "type": "integer", "minimum": 1, "description": "Total source entries collapsed, including any not enumerated in `sourceIds`." },
+       "trigger": { "type": "string", "enum": ["host-managed", "client-requested", "both"], "description": "Matches `capabilities.memory.compaction.trigger`; indicates which trigger path drove this run. v1.x normates only `host-managed`." },
+       "byteSize": { "type": "integer", "minimum": 0, "description": "Byte size of the resulting MemoryEntry.content." }
+     },
+     "additionalProperties": true
    }
  }
  }
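
The `memoryCompacted` definition above translates directly into a structural type guard. The following is a minimal sketch that checks only what the schema states (required fields, value ranges, the closed `trigger` enum, and the optional `sourceIds` array). `isMemoryCompactedPayload` and the interface name are hypothetical and do not come from the package.

```typescript
type CompactionTrigger = 'host-managed' | 'client-requested' | 'both';

interface MemoryCompactedPayload {
  memoryRef: string;
  outputId: string;
  sourceIds?: string[];
  sourceCount: number;
  trigger: CompactionTrigger;
  byteSize: number;
  [extra: string]: unknown; // the schema sets additionalProperties: true
}

const isNonEmptyString = (v: unknown): v is string => typeof v === 'string' && v.length > 0;

const isIntAtLeast = (v: unknown, min: number): v is number =>
  typeof v === 'number' && Number.isInteger(v) && v >= min;

function isMemoryCompactedPayload(p: unknown): p is MemoryCompactedPayload {
  if (typeof p !== 'object' || p === null || Array.isArray(p)) return false;
  const o = p as Record<string, unknown>;
  const triggers: unknown[] = ['host-managed', 'client-requested', 'both'];
  // sourceIds is optional; when present every element must be a non-empty string.
  const sourceIdsOk =
    o.sourceIds === undefined ||
    (Array.isArray(o.sourceIds) && o.sourceIds.every((id) => isNonEmptyString(id)));
  return (
    isNonEmptyString(o.memoryRef) &&
    isNonEmptyString(o.outputId) &&
    isIntAtLeast(o.sourceCount, 1) &&
    triggers.includes(o.trigger) &&
    isIntAtLeast(o.byteSize, 0) &&
    sourceIdsOk
  );
}
```

The guard deliberately ignores unknown keys, matching the permissive `additionalProperties: true` on the payload.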
@@ -109,7 +109,8 @@
        "runOrchestrator.decided",
        "conversation.opened",
        "conversation.exchanged",
-       "conversation.closed"
+       "conversation.closed",
+       "memory.compacted"
      ]
    }
  }
@@ -18,6 +18,18 @@
   * scenario exercises real behavior — useful for hosts that want to
   * claim full coverage in `INTEROP-MATRIX.md`.
   *
+  * - **Strict-mode opt-out (skip in strict mode too):** set
+  * `OPENWOP_OPTED_OUT_PROFILES=name1,name2,...` to declare that the
+  * host operator has deliberately chosen NOT to implement those
+  * profiles. In strict mode the gate skips them with a "honest
+  * opt-out" log line instead of failing — minimal hosts that
+  * advertise only what they implement can still go strict-mode
+  * green without falsifying capability claims. Conflict check:
+  * if a profile appears in BOTH `OPENWOP_OPTED_OUT_PROFILES` AND
+  * the host's discovery `capabilities.auth.profiles[]` (or
+  * equivalent), the gate logs a loud warning — opt-outs and
+  * advertisements are mutually exclusive.
+  *
   * Usage:
   *
  * ```ts
@@ -42,18 +54,45 @@ import { loadEnv } from './env.js';

  /**
   * Returns true if the scenario should proceed with assertions (advertised),
-  * false if the scenario should `return` early (default-mode skip). In strict
-  * mode (`OPENWOP_REQUIRE_BEHAVIOR=true`), throws if not advertised — so the
-  * caller never actually receives `false` in that mode.
+  * false if the scenario should `return` early (default-mode skip OR
+  * strict-mode honest opt-out). In strict mode (`OPENWOP_REQUIRE_BEHAVIOR=true`)
+  * with `profileName` NOT in `OPENWOP_OPTED_OUT_PROFILES`, throws so the
+  * caller never receives `false` in that combination.
+  *
+  * If the host BOTH advertises the profile AND the operator listed it in
+  * the opt-out env var, surface a warning (likely typo) and treat as
+  * advertised (proceed). Advertisement always wins over opt-out: opting
+  * out of a profile you actually implement is meaningless.
   */
  export function behaviorGate(profileName: string, advertised: boolean): boolean {
+   const env = loadEnv();
+   const optedOut = env.optedOutProfiles.has(profileName);
+
+   if (advertised && optedOut) {
+     // eslint-disable-next-line no-console
+     console.warn(
+       `[${profileName}] both ADVERTISED by the host AND listed in OPENWOP_OPTED_OUT_PROFILES — ` +
+         `opt-out is ignored. Remove from the env var to clear this warning.`,
+     );
+   }
+
    if (advertised) return true;

-   const env = loadEnv();
+   if (optedOut) {
+     // Honest opt-out: the operator declared the host does not implement
+     // this profile. Skip in BOTH default and strict mode.
+     // eslint-disable-next-line no-console
+     console.warn(
+       `[${profileName}] honest opt-out (OPENWOP_OPTED_OUT_PROFILES); skipping`,
+     );
+     return false;
+   }
+
    if (env.requireBehavior) {
      expect(
        advertised,
-       `OPENWOP_REQUIRE_BEHAVIOR=true: host MUST advertise the ${profileName} profile for this scenario to run. ` +
+       `OPENWOP_REQUIRE_BEHAVIOR=true: host MUST advertise the ${profileName} profile for this scenario to run, ` +
+         `or declare opt-out via OPENWOP_OPTED_OUT_PROFILES=${profileName}. ` +
          `See conformance/coverage.md §"Capability-gated scenarios".`,
      ).toBe(true);
      // expect.toBe(true) throws; we won't reach here.
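
The branch order in the revised `behaviorGate` can be read as a small decision table over three booleans. The sketch below restates it as a pure function so that the precedence (advertisement first, then opt-out, then strict mode) is easy to see and to test in isolation. `decideGate`, `GateInput`, and `GateDecision` are hypothetical names; the real function reads its inputs from `loadEnv()` and fails through `expect` rather than returning a value.

```typescript
type GateOutcome = 'proceed' | 'skip' | 'fail';

interface GateInput {
  advertised: boolean; // host advertises the profile in discovery
  optedOut: boolean;   // profile is listed in OPENWOP_OPTED_OUT_PROFILES
  strict: boolean;     // OPENWOP_REQUIRE_BEHAVIOR=true
}

interface GateDecision {
  outcome: GateOutcome;
  warnConflict: boolean; // advertised AND opted out: opt-out is ignored, warning logged
}

function decideGate({ advertised, optedOut, strict }: GateInput): GateDecision {
  // Advertisement always wins; a simultaneous opt-out only triggers a warning.
  if (advertised) return { outcome: 'proceed', warnConflict: optedOut };
  // Honest opt-out skips in both default and strict mode.
  if (optedOut) return { outcome: 'skip', warnConflict: false };
  // Not advertised and not opted out: strict mode fails, default mode skips.
  return { outcome: strict ? 'fail' : 'skip', warnConflict: false };
}
```

Only one of the eight input combinations produces `fail`: not advertised, not opted out, strict mode on.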
package/src/lib/env.ts CHANGED
@@ -16,6 +16,15 @@
   * Default is false — scenarios skip with a warning so default conformance
   * runs cover what the host has implemented. See `lib/behavior-gate.ts`
   * and `conformance/coverage.md` §"Capability-gated scenarios".
+  *
+  * OPENWOP_OPTED_OUT_PROFILES — comma-separated profile names the host
+  * operator has DELIBERATELY chosen not to implement. In strict mode
+  * these scenarios skip (logged as "honest opt-out") rather than
+  * failing — distinguishes "host doesn't claim this surface" (good)
+  * from "host claims but doesn't deliver" (bug). Lets honest minimal
+  * hosts go strict-mode green without falsifying capability claims.
+  * Example for SQLite:
+  * OPENWOP_OPTED_OUT_PROFILES=openwop-production,openwop-auth-mtls
   */

  export interface ConformanceEnv {
@@ -24,6 +33,15 @@ export interface ConformanceEnv {
    readonly implementationName: string;
    readonly implementationVersion: string;
    readonly requireBehavior: boolean;
+   /**
+    * Profiles the host operator has declared the host does NOT claim. Set
+    * via `OPENWOP_OPTED_OUT_PROFILES=name1,name2`. In strict mode, the
+    * behavior-gate honors this set as PASS-by-opt-out rather than failing
+    * the scenario. Never include a profile the host actually advertises —
+    * that's a typo, not an opt-out, and `behaviorGate` will surface a
+    * warning if it detects the conflict.
+    */
+   readonly optedOutProfiles: ReadonlySet<string>;
  }

  let cached: ConformanceEnv | null = null;
@@ -48,12 +66,21 @@ export function loadEnv(): ConformanceEnv {
    // Strip trailing slash so URL composition is consistent.
    const normalizedBase = baseUrl.replace(/\/$/, '');

+   const optedOutRaw = process.env.OPENWOP_OPTED_OUT_PROFILES?.trim() ?? '';
+   const optedOutProfiles = new Set(
+     optedOutRaw
+       .split(',')
+       .map((s) => s.trim())
+       .filter((s) => s.length > 0),
+   );
+
    cached = {
      baseUrl: normalizedBase,
      apiKey,
      implementationName: process.env.OPENWOP_IMPLEMENTATION_NAME?.trim() ?? 'unknown',
      implementationVersion: process.env.OPENWOP_IMPLEMENTATION_VERSION?.trim() ?? 'unknown',
      requireBehavior: process.env.OPENWOP_REQUIRE_BEHAVIOR === 'true',
+     optedOutProfiles,
    };
    return cached;
  }
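
The parsing added to `loadEnv` above is easiest to reason about as a standalone function: trim the raw value, split on commas, trim each name, and drop empty segments. The sketch below extracts that logic under a hypothetical name, `parseOptedOutProfiles`, and takes the raw string as a parameter instead of reading `process.env`, so it can be exercised without touching the environment.

```typescript
// Same split/trim/filter pipeline as the `loadEnv` hunk, isolated for testing.
// `parseOptedOutProfiles` is a hypothetical name, not an export of the package.
function parseOptedOutProfiles(raw: string | undefined): ReadonlySet<string> {
  return new Set(
    (raw?.trim() ?? '')
      .split(',')
      .map((name) => name.trim())
      .filter((name) => name.length > 0),
  );
}
```

Stray whitespace, trailing commas, and duplicate names are all harmless: `' openwop-production, openwop-auth-mtls,, '` yields a set of two names, and an unset or blank variable yields an empty set.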
@@ -32,15 +32,22 @@
   * probe at a real MCP server. Auto-detects the transport from the
   * server's `Content-Type` response header:
   * - `application/json` → single-JSON response, parsed as one
-  * JSON-RPC frame.
+  * JSON-RPC frame. Verified end-to-end against
+  * `@modelcontextprotocol/sdk@1.29.0` `enableJsonResponse: true`
+  * streamable-http servers (2026-05-12).
   * - `text/event-stream` → streamable-http+SSE; the probe reads
   * SSE frames until it finds one whose `data:` payload matches
-  * the JSON-RPC `id` we sent, then returns that frame.
+  * the JSON-RPC `id` we sent, then returns that frame. Verified
+  * end-to-end against `@modelcontextprotocol/sdk@1.29.0`
+  * streamable-http servers WITHOUT `enableJsonResponse`
+  * (2026-05-13).
   * The stdio transport (default for `modelcontextprotocol/servers`
-  * reference servers) is still out of scope — those run as a child
-  * process speaking JSON-RPC over stdin/stdout, no HTTP endpoint to
-  * point env vars at. Operators wanting interop evidence against
-  * stdio servers run them under a `mcp-bridge` HTTP adapter.
+  * reference servers) is HTTP-incompatible by design — those run
+  * as a child process speaking JSON-RPC over stdin/stdout, no HTTP
+  * endpoint. Operators collecting interop evidence against stdio
+  * servers run them under the documented HTTP-to-stdio bridge at
+  * `examples/mcp-stdio-bridge/` (verified end-to-end 2026-05-13;
+  * probe + bridge + `echo-stdio-server.mjs` round-trip passes 2/2).
   * Assertions stay shape-only: tools/list returns ≥1 tool, a
   * tools/call returns valid MCP content (a `result.content` array,
   * possibly `isError: true` — both are spec-conformant).
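
The `text/event-stream` branch described in the comment above scans SSE frames for the one whose `data:` payload carries the JSON-RPC `id` that was sent. A simplified sketch of that matching step follows. It works on an already-buffered response body rather than an incremental stream, and `findJsonRpcFrame` is a hypothetical helper name; the probe's actual reader is not shown in this diff.

```typescript
interface JsonRpcFrame {
  jsonrpc?: string;
  id?: number | string | null;
  result?: unknown;
  error?: unknown;
}

// Scan buffered SSE text for the event whose JSON payload has the given id.
function findJsonRpcFrame(sseText: string, id: number | string): JsonRpcFrame | null {
  // Normalize line endings, then split into events on blank lines.
  const events = sseText.replace(/\r\n?/g, '\n').split(/\n\n+/);
  for (const event of events) {
    // An event may carry several `data:` lines; they are joined with newlines.
    const data = event
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice('data:'.length).replace(/^ /, ''))
      .join('\n');
    if (data.length === 0) continue;
    try {
      const frame = JSON.parse(data) as JsonRpcFrame | null;
      if (frame !== null && typeof frame === 'object' && frame.id === id) return frame;
    } catch {
      // Non-JSON payloads (keep-alives, comments) are skipped.
    }
  }
  return null;
}
```

Notifications have no `id`, so they never match and the scan continues to the actual response frame.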
@@ -0,0 +1,121 @@
+ /**
+  * RFC 0012 §B — `memory.compacted` event emission shape.
+  *
+  * Verifies that hosts advertising
+  * `capabilities.memory.compaction.supported: true` emit a canonical
+  * `memory.compacted` event payload per `run-event-payloads.schema.json`
+  * §`memoryCompacted` whenever a compaction run completes.
+  *
+  * Required fields per the schema:
+  * - `memoryRef` (string, non-empty)
+  * - `outputId` (string, non-empty)
+  * - `sourceCount` (integer ≥ 1)
+  * - `trigger` (closed enum: `host-managed | client-requested | both`)
+  * - `byteSize` (integer ≥ 0)
+  *
+  * Optional:
+  * - `sourceIds` (array of non-empty strings; exhaustive within the
+  * array — no "and N more" semantics — when present)
+  *
+  * Gating identical to `memory-compaction-sr1-carry-forward.test.ts`:
+  * capability advertisement + test seam reachable.
+  *
+  * @see RFCS/0012-memory-compaction-profile.md §B
+  * @see schemas/run-event-payloads.schema.json §memoryCompacted
+  */
+
+ import { describe, it, expect } from 'vitest';
+ import { driver } from '../lib/driver.js';
+
+ const MEMORY_REF = 'mem_tenant:default_agent:conformance-rfc0012-event_longTerm';
+
+ interface MemoryCaps {
+   compaction?: { supported?: boolean };
+ }
+
+ async function isCompactionAdvertised(): Promise<boolean> {
+   const disco = await driver.get('/.well-known/openwop');
+   const memory = (disco.json as { capabilities?: { memory?: MemoryCaps } }).capabilities?.memory;
+   return memory?.compaction?.supported === true;
+ }
+
+ async function isTestSeamReachable(): Promise<boolean> {
+   const r = await driver.post('/v1/test/memory/compact', {});
+   return r.status !== 404;
+ }
+
+ describe('memory-compaction-event-emitted: canonical memory.compacted payload shape', () => {
+   it('compaction run returns a canonical memoryCompacted payload', async () => {
+     if (!(await isCompactionAdvertised())) {
+       // eslint-disable-next-line no-console
+       console.warn('[rfc0012-event] capabilities.memory.compaction.supported not advertised; skipping');
+       return;
+     }
+     if (!(await isTestSeamReachable())) {
+       // eslint-disable-next-line no-console
+       console.warn('[rfc0012-event] test seam unreachable; skipping');
+       return;
+     }
+
+     const seed = await driver.post('/v1/test/memory/seed', {
+       memoryRef: MEMORY_REF,
+       entries: [
+         { id: `event-src-${Date.now()}-a`, content: 'First memory entry.' },
+         { id: `event-src-${Date.now()}-b`, content: 'Second memory entry.' },
+         { id: `event-src-${Date.now()}-c`, content: 'Third memory entry.' },
+       ],
+     });
+     expect(seed.status).toBe(201);
+
+     const compactRes = await driver.post('/v1/test/memory/compact', {
+       memoryRef: MEMORY_REF,
+     });
+     expect(compactRes.status).toBe(200);
+
+     const event = compactRes.json as {
+       type?: string;
+       payload?: Record<string, unknown>;
+     };
+
+     expect(event.type, driver.describe(
81
+ 'observability.md §"Canonical run lifecycle event names"',
82
+ 'event MUST be type=memory.compacted',
83
+ )).toBe('memory.compacted');
84
+
85
+ const payload = event.payload ?? {};
86
+
87
+ expect(typeof payload.memoryRef, driver.describe(
88
+ 'RFC 0012 §B / run-event-payloads.schema.json §memoryCompacted',
89
+ 'memoryRef MUST be a non-empty string',
90
+ )).toBe('string');
91
+ expect((payload.memoryRef as string).length).toBeGreaterThan(0);
92
+
93
+ expect(typeof payload.outputId, driver.describe(
94
+ 'RFC 0012 §B',
95
+ 'outputId MUST be a string identifying the distilled entry',
96
+ )).toBe('string');
97
+ expect((payload.outputId as string).length).toBeGreaterThan(0);
98
+
99
+ expect(Number.isInteger(payload.sourceCount), driver.describe(
100
+ 'RFC 0012 §B',
101
+ 'sourceCount MUST be an integer',
102
+ )).toBe(true);
103
+ expect(payload.sourceCount as number).toBeGreaterThanOrEqual(1);
104
+
105
+ expect(['host-managed', 'client-requested', 'both']).toContain(payload.trigger);
106
+
107
+ expect(Number.isInteger(payload.byteSize), driver.describe(
108
+ 'RFC 0012 §B',
109
+ 'byteSize MUST be an integer',
110
+ )).toBe(true);
111
+ expect(payload.byteSize as number).toBeGreaterThanOrEqual(0);
112
+
113
+ if (payload.sourceIds !== undefined) {
114
+ expect(Array.isArray(payload.sourceIds)).toBe(true);
115
+ for (const id of payload.sourceIds as unknown[]) {
116
+ expect(typeof id, 'sourceIds entries MUST be strings').toBe('string');
117
+ expect((id as string).length).toBeGreaterThan(0);
118
+ }
119
+ }
120
+ });
121
+ });
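Reviewer note: the field rules listed in this scenario's header comment can be collected into one standalone checker, which makes the contract easier to read in isolation. This is a minimal sketch, assuming the payload arrives as an untyped record. `checkMemoryCompacted`, `isNonEmptyString`, and the returned list-of-field-names shape are hypothetical names for illustration, not part of the suite or of `run-event-payloads.schema.json`.

```typescript
// Hypothetical helper (not part of the conformance suite): returns the names of
// the fields that violate the rules in the scenario header, or [] when the
// payload is well-formed.
const TRIGGERS: string[] = ['host-managed', 'client-requested', 'both'];

function isNonEmptyString(v: unknown): v is string {
  return typeof v === 'string' && v.length > 0;
}

function checkMemoryCompacted(payload: Record<string, unknown>): string[] {
  const problems: string[] = [];

  if (!isNonEmptyString(payload.memoryRef)) problems.push('memoryRef');
  if (!isNonEmptyString(payload.outputId)) problems.push('outputId');

  // sourceCount: integer >= 1.
  const sourceCount = payload.sourceCount;
  if (typeof sourceCount !== 'number' || !Number.isInteger(sourceCount) || sourceCount < 1) {
    problems.push('sourceCount');
  }

  // trigger: closed enum.
  const trigger = payload.trigger;
  if (typeof trigger !== 'string' || TRIGGERS.indexOf(trigger) === -1) {
    problems.push('trigger');
  }

  // byteSize: integer >= 0.
  const byteSize = payload.byteSize;
  if (typeof byteSize !== 'number' || !Number.isInteger(byteSize) || byteSize < 0) {
    problems.push('byteSize');
  }

  // sourceIds is optional; when present every entry must be a non-empty string.
  const sourceIds = payload.sourceIds;
  if (sourceIds !== undefined) {
    if (!Array.isArray(sourceIds) || !sourceIds.every((id) => isNonEmptyString(id))) {
      problems.push('sourceIds');
    }
  }

  return problems;
}
```

Returning the offending field names rather than throwing keeps the sketch usable both inside an `expect(...)` call and in ad hoc host-side debugging.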
@@ -0,0 +1,116 @@
1
+ /**
2
+ * RFC 0012 §C — `compacted-from:<id>` provenance tag convention.
3
+ *
4
+ * The distilled entry SHOULD (not MUST) carry a tag of the form
5
+ * `compacted-from:<compactionRunId>` where `<compactionRunId>` is a
6
+ * host-issued opaque identifier. This lets `MemoryAdapter.list`
7
+ * consumers detect compacted entries without needing access to the
8
+ * `memory.compacted` event stream.
9
+ *
10
+ * SOFT ASSERTION: log-and-warn if absent, fail only if a present tag
11
+ * is malformed. Hosts with structurally-constrained tag-spaces (legacy
12
+ * tag-prefix discipline, fixed-vocabulary tagging) MAY omit this —
13
+ * the `memory.compacted` event itself remains the canonical provenance
14
+ * signal.
15
+ *
16
+ * Gating identical to the other RFC 0012 scenarios.
17
+ *
18
+ * @see RFCS/0012-memory-compaction-profile.md §C
19
+ */
20
+
21
+ import { describe, it, expect } from 'vitest';
22
+ import { driver } from '../lib/driver.js';
23
+
24
+ const MEMORY_REF = 'mem_tenant:default_agent:conformance-rfc0012-tag_longTerm';
25
+ const COMPACTED_FROM_RE = /^compacted-from:[^\s:][^\s]*$/;
26
+
27
+ interface MemoryCaps {
28
+ compaction?: { supported?: boolean };
29
+ }
30
+
31
+ interface MemoryListResponse {
32
+ entries?: Array<{ id?: string; tags?: string[] }>;
33
+ }
34
+
35
+ async function isCompactionAdvertised(): Promise<boolean> {
36
+ const disco = await driver.get('/.well-known/openwop');
37
+ const memory = (disco.json as { capabilities?: { memory?: MemoryCaps } }).capabilities?.memory;
38
+ return memory?.compaction?.supported === true;
39
+ }
40
+
41
+ async function isTestSeamReachable(): Promise<boolean> {
42
+ const r = await driver.post('/v1/test/memory/compact', {});
43
+ return r.status !== 404;
44
+ }
45
+
46
+ describe('memory-compaction-provenance-tag: compacted-from:<id> tag follows §C convention', () => {
47
+ it('compacted entry carries a well-formed compacted-from tag, OR omits it cleanly (no malformed tags)', async () => {
48
+ if (!(await isCompactionAdvertised())) {
49
+ // eslint-disable-next-line no-console
50
+ console.warn('[rfc0012-tag] capabilities.memory.compaction.supported not advertised; skipping');
51
+ return;
52
+ }
53
+ if (!(await isTestSeamReachable())) {
54
+ // eslint-disable-next-line no-console
55
+ console.warn('[rfc0012-tag] test seam unreachable; skipping');
56
+ return;
57
+ }
58
+
59
+ // Seed + compact.
60
+ const seedStamp = Date.now();
61
+ const seed = await driver.post('/v1/test/memory/seed', {
62
+ memoryRef: MEMORY_REF,
63
+ entries: [
64
+ { id: `tag-src-${seedStamp}-1`, content: 'Source content alpha.' },
65
+ { id: `tag-src-${seedStamp}-2`, content: 'Source content beta.' },
66
+ ],
67
+ });
68
+ expect(seed.status).toBe(201);
69
+
70
+ const compactRes = await driver.post('/v1/test/memory/compact', {
71
+ memoryRef: MEMORY_REF,
72
+ });
73
+ expect(compactRes.status).toBe(200);
74
+
75
+ const event = compactRes.json as { payload?: { outputId?: string } };
76
+ const outputId = event.payload?.outputId;
77
+ expect(typeof outputId).toBe('string');
78
+
79
+ // Resolve the entry via the wire MemoryAdapter list surface (no
80
+ // direct get-by-id wire endpoint; we filter list results).
81
+ // Hosts that don't expose memory:list on the wire skip — this is
82
+ // a `MemoryAdapter.list` surface check, which the canonical
83
+ // capabilities.memory.supported claim already covers.
84
+ const listRes = await driver.get(
85
+ `/v1/memory/${encodeURIComponent(MEMORY_REF)}?limit=50`,
86
+ );
87
+ if (listRes.status === 404) {
88
+ // eslint-disable-next-line no-console
89
+ console.warn('[rfc0012-tag] host does not expose memory:list at /v1/memory/{ref}; skipping tag inspection (canonical provenance signal remains the memory.compacted event itself)');
90
+ return;
91
+ }
92
+ expect(listRes.status, 'memory:list MUST return 200 when reachable').toBe(200);
93
+
94
+ const body = (listRes.json as MemoryListResponse) ?? {};
95
+ const entries = body.entries ?? [];
96
+ const output = entries.find((e) => e.id === outputId);
97
+ if (!output) {
98
+ // eslint-disable-next-line no-console
99
+ console.warn(`[rfc0012-tag] outputId ${outputId} not visible via memory:list; cannot inspect tags`);
100
+ return;
101
+ }
102
+ const tags = output.tags ?? [];
103
+
104
+ // RFC 0012 §C: SHOULD-tag, soft assertion.
105
+ const provenance = tags.find((t) => t.startsWith('compacted-from:'));
106
+ if (provenance === undefined) {
107
+ // eslint-disable-next-line no-console
108
+ console.warn('[rfc0012-tag] output entry has no compacted-from:<id> tag — RFC 0012 §C is SHOULD, not MUST; pass with warning');
109
+ return;
110
+ }
111
+ expect(provenance, driver.describe(
112
+ 'RFC 0012 §C',
113
+ 'compacted-from tag MUST match `compacted-from:<id>` shape (non-empty id, no whitespace) when present',
114
+ )).toMatch(COMPACTED_FROM_RE);
115
+ });
116
+ });
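Reviewer note: the soft-assertion logic above has three outcomes (tag absent, tag well-formed, tag malformed), and only the last one fails the scenario. The sketch below isolates that decision so the regex behaviour is easy to check by hand. It reuses the `COMPACTED_FROM_RE` pattern from the scenario; `classifyProvenance` and `TagVerdict` are hypothetical names introduced for illustration.

```typescript
// Same pattern as the scenario: prefix, then a non-empty id whose first
// character is neither whitespace nor ':' and which contains no whitespace.
const COMPACTED_FROM_RE = /^compacted-from:[^\s:][^\s]*$/;

type TagVerdict = 'absent' | 'well-formed' | 'malformed';

// Hypothetical helper: mirrors the scenario's "first tag with the prefix" lookup.
function classifyProvenance(tags: string[]): TagVerdict {
  const tag = tags.find((t) => t.startsWith('compacted-from:'));
  if (tag === undefined) return 'absent'; // SHOULD, not MUST: warn and pass
  return COMPACTED_FROM_RE.test(tag) ? 'well-formed' : 'malformed';
}

console.log(classifyProvenance(['topic:billing']));
console.log(classifyProvenance(['compacted-from:run_42']));
console.log(classifyProvenance(['compacted-from:']));
```

Only the `'malformed'` verdict corresponds to a failed `expect` in the scenario; `'absent'` maps to the log-and-warn path.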
@@ -0,0 +1,127 @@
1
+ /**
2
+ * RFC 0012 §D — SR-1 carry-forward through memory compaction.
3
+ *
4
+ * Verifies the load-bearing security claim of RFC 0012 (Memory
5
+ * Compaction Profile, `Active` 2026-05-13 — comment window closes
6
+ * 2026-05-20): when a host advertising
7
+ * `capabilities.memory.compaction.supported: true` produces a
8
+ * compacted `MemoryEntry`, the derived content MUST pass the
9
+ * same BYOK redaction harness as a fresh `put`. The fact that
10
+ * source entries were SR-1-compliant at original `put` time is
11
+ * NOT evidence to skip redaction on derived content — summarization
12
+ * models can introduce secret-shaped substrings (hallucinated
13
+ * tokens, format-leaks from in-context examples) not present in
14
+ * any source.
15
+ *
16
+ * Gating:
17
+ * - `capabilities.memory.compaction.supported` MUST be `true`.
18
+ * - Host MUST expose the test seam at `POST /v1/test/memory/{seed,
19
+ * compact}` — gated on the host's `OPENWOP_TEST_TRIGGER_COMPACTION`
20
+ * env var. Without it the scenario can't synchronously drive
21
+ * compaction (RFC 0012 normatively specifies only `trigger: 'host-managed'`).
22
+ * The seam itself is host-implementation-specific; the conformance
23
+ * suite skips when the seam isn't reachable.
24
+ *
25
+ * @see RFCS/0012-memory-compaction-profile.md §D
26
+ * @see SECURITY/invariants.yaml `memory-compaction-sr-1-carry-forward`
27
+ */
28
+
29
+ import { describe, it, expect } from 'vitest';
30
+ import { driver } from '../lib/driver.js';
31
+
32
+ const MEMORY_REF = 'mem_tenant:default_agent:conformance-rfc0012-sr1_longTerm';
33
+
34
+ interface MemoryCaps {
35
+ compaction?: { supported?: boolean };
36
+ }
37
+
38
+ async function isCompactionAdvertised(): Promise<boolean> {
39
+ const disco = await driver.get('/.well-known/openwop');
40
+ const memory = (disco.json as { capabilities?: { memory?: MemoryCaps } }).capabilities?.memory;
41
+ return memory?.compaction?.supported === true;
42
+ }
43
+
44
+ async function isTestSeamReachable(): Promise<boolean> {
45
+ // Probe the seam with an empty body — expects 400 if reachable
46
+ // (validation_error on missing memoryRef), 404 when disabled.
47
+ const r = await driver.post('/v1/test/memory/compact', {});
48
+ return r.status !== 404;
49
+ }
50
+
51
+ describe('memory-compaction-sr1-carry-forward: derived content passes the BYOK redaction harness', () => {
52
+ it('compacted MemoryEntry content MUST NOT carry source-side form-leak signatures', async () => {
53
+ if (!(await isCompactionAdvertised())) {
54
+ // eslint-disable-next-line no-console
55
+ console.warn('[rfc0012-sr1] capabilities.memory.compaction.supported not advertised; skipping');
56
+ return;
57
+ }
58
+ if (!(await isTestSeamReachable())) {
59
+ // eslint-disable-next-line no-console
60
+ console.warn('[rfc0012-sr1] test seam /v1/test/memory/compact unreachable; skipping (set host\'s OPENWOP_TEST_TRIGGER_COMPACTION=true)');
61
+ return;
62
+ }
63
+
64
+ // 1. Seed source entries containing:
65
+ // - The canonical `[BYOK:...]` form-leak signature (placeholder
66
+ // surfaces verbatim — should be caught by SR-1 carry-forward).
67
+ // - A non-canonical `<REDACTED:...>` marker that the host's
68
+ // redaction harness should re-canonicalize.
69
+ // - Plain, non-sensitive prose.
70
+ const seed = await driver.post('/v1/test/memory/seed', {
71
+ memoryRef: MEMORY_REF,
72
+ entries: [
73
+ { id: `sr1-src-${Date.now()}-1`, content: 'User confirmed: [BYOK:hk_live_canary_42]' },
74
+ { id: `sr1-src-${Date.now()}-2`, content: 'Resolved <REDACTED:db-prod-creds> outage.' },
75
+ { id: `sr1-src-${Date.now()}-3`, content: 'Customer asked about pricing tiers.' },
76
+ ],
77
+ });
78
+ expect(seed.status, 'seed endpoint MUST return 201 when reachable').toBe(201);
79
+
80
+ // 2. Drive compaction synchronously.
81
+ const compactRes = await driver.post('/v1/test/memory/compact', {
82
+ memoryRef: MEMORY_REF,
83
+ });
84
+ expect(compactRes.status, 'compact MUST return 200 with ≥2 source entries').toBe(200);
85
+
86
+ const event = compactRes.json as {
87
+ type?: string;
88
+ payload?: { outputId?: string; memoryRef?: string };
89
+ // Out-of-band field from the test seam carrying the persisted
90
+ // entry bytes; the wire-level `memory.compacted` event does NOT
91
+ // carry content. Required for SR-1 verification — the canonical
92
+ // event payload is shape-only and would pass this scenario
93
+ // trivially without it.
94
+ outputContent?: string;
95
+ };
96
+ expect(event.type, 'event payload MUST be type=memory.compacted').toBe('memory.compacted');
97
+
98
+ if (typeof event.outputContent !== 'string') {
99
+ // eslint-disable-next-line no-console
100
+ console.warn('[rfc0012-sr1] test seam did not return outputContent; the wire-level memory.compacted shape does not surface content so without a host-side seam we cannot verify §D end-to-end. Skipping.');
101
+ return;
102
+ }
103
+
104
+ // The load-bearing assertion: the PERSISTED entry content (what
105
+ // future MemoryAdapter.get / list consumers would see) MUST NOT
106
+ // carry source-side form-leak signatures. A host that skips its
107
+ // BYOK redaction pass on derived content fails here.
108
+ expect(event.outputContent.includes('[BYOK:hk_live_canary_42]'), driver.describe(
109
+ 'RFC 0012 §D',
110
+ 'derived MemoryEntry.content MUST NOT carry source-side [BYOK:...] form-leak signatures (SR-1 carry-forward)',
111
+ )).toBe(false);
112
+ expect(event.outputContent.includes('<REDACTED:db-prod-creds>'), driver.describe(
113
+ 'RFC 0012 §D',
114
+ 'derived MemoryEntry.content MUST NOT echo non-canonical <REDACTED:...> markers from sources',
115
+ )).toBe(false);
116
+
117
+ // Positive: the canonical `[REDACTED:...]` placeholder MUST be
118
+ // present where SR-1 carry-forward re-substituted a source-side
119
+ // leak. Pinning this prevents a host from "passing" by simply
120
+ // stripping source content rather than redacting it (which would
121
+ // also lose audit signal).
122
+ expect(event.outputContent, driver.describe(
123
+ 'RFC 0012 §D',
124
+ 'derived MemoryEntry.content MUST carry canonical [REDACTED:...] placeholders where source-side leaks were re-substituted',
125
+ )).toMatch(/\[REDACTED:[^\]]+\]/);
126
+ });
127
+ });
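Reviewer note: the three assertions above (no `[BYOK:...]` echo, no `<REDACTED:...>` echo, at least one canonical `[REDACTED:...]` placeholder) can be exercised against a toy redaction pass. This is a sketch under stated assumptions only: `recanonicalize` is a hypothetical stand-in for a host's BYOK redaction harness, and the `byok` placeholder label is invented for illustration. Neither is taken from RFC 0012.

```typescript
// Hypothetical stand-in for a host-side redaction pass over derived content.
function recanonicalize(content: string): string {
  return content
    // Replace verbatim BYOK form-leaks with a canonical placeholder.
    .replace(/\[BYOK:[^\]]+\]/g, '[REDACTED:byok]')
    // Rewrite non-canonical angle-bracket markers to the canonical bracket form.
    .replace(/<REDACTED:([^>]+)>/g, '[REDACTED:$1]');
}

// Mirrors the scenario's checks: no source-side signature may survive, and at
// least one canonical placeholder must remain (stripping alone does not pass).
function passesSr1Checks(output: string, leaks: string[]): boolean {
  const echoesLeak = leaks.some((leak) => output.indexOf(leak) !== -1);
  return !echoesLeak && /\[REDACTED:[^\]]+\]/.test(output);
}

const leaks = ['[BYOK:hk_live_canary_42]', '<REDACTED:db-prod-creds>'];
const raw = 'User confirmed: [BYOK:hk_live_canary_42]. Resolved <REDACTED:db-prod-creds> outage.';

console.log(passesSr1Checks(raw, leaks));
console.log(passesSr1Checks(recanonicalize(raw), leaks));
```

The positive placeholder check is what distinguishes redaction from deletion: content with the sensitive spans simply removed would pass the two negative checks but fail the third.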
@@ -2,18 +2,29 @@
2
2
  * Track 13: multi-region idempotency capability shape (idempotency.md v1.1).
3
3
  *
4
4
  * Verifies that hosts advertising the multi-region idempotency annex
5
- * surface a valid `capabilities.idempotency.crossRegion` value. The
6
- * end-to-end partition behavior cannot be exercised black-box; this
7
- * scenario validates the discovery-document shape so clients can rely
8
- * on the capability for routing decisions.
5
+ * surface a valid `capabilities.idempotency.crossRegion` value AND, when
6
+ * claiming `'best-effort'` or `'strict'`, expose the operator-tier
7
+ * metric names per `idempotency.md` §"Operator surface".
8
+ *
9
+ * The annex's partition-replay convergence rule cannot be exercised
10
+ * black-box (it requires multi-region host deployment under a real
11
+ * partition); the algorithm itself is verified in-process via the
12
+ * Postgres host's `multi-region-idempotency.test.ts` smoke against
13
+ * the canonical resolver. This scenario validates the discovery-
14
+ * document shape so clients can rely on the capability for routing
15
+ * decisions.
9
16
  *
10
17
  * @see spec/v1/idempotency.md §"Multi-region idempotency"
18
+ * @see examples/hosts/postgres/src/multi-region.ts (canonical resolver)
11
19
  */
12
20
 
13
21
  import { describe, it, expect } from 'vitest';
14
22
  import { driver } from '../lib/driver.js';
15
23
 
16
24
  const ALLOWED = new Set(['single-region', 'best-effort', 'strict']);
25
+ const REQUIRED_METRICS_WHEN_MULTI_REGION = [
26
+ 'openwop.idempotency.cross_region_conflicts_total',
27
+ ];
17
28
 
18
29
  interface IdempotencyCaps {
19
30
  supported?: boolean;
@@ -22,6 +33,10 @@ interface IdempotencyCaps {
22
33
  crossRegion?: string;
23
34
  }
24
35
 
36
+ interface ObservabilityCaps {
37
+ metrics?: { names?: string[] };
38
+ }
39
+
25
40
  describe('multi-region-idempotency: capability shape', () => {
26
41
  it('idempotency.crossRegion (when advertised) MUST be one of the closed enum', async () => {
27
42
  const disco = await driver.get('/.well-known/openwop');
@@ -49,4 +64,24 @@ describe('multi-region-idempotency: capability shape', () => {
49
64
  expect(idem.layer2RetentionSeconds).toBeGreaterThan(0);
50
65
  }
51
66
  });
67
+
68
+ it('multi-region hosts SHOULD expose the cross-region conflict counter per §"Operator surface"', async () => {
69
+ const disco = await driver.get('/.well-known/openwop');
70
+ const caps = (disco.json as { capabilities?: { idempotency?: IdempotencyCaps; observability?: ObservabilityCaps } })
71
+ .capabilities;
72
+ const crossRegion = caps?.idempotency?.crossRegion;
73
+
74
+ if (crossRegion !== 'best-effort' && crossRegion !== 'strict') {
75
+ // Single-region hosts have no conflicts to count — skip.
76
+ return;
77
+ }
78
+
79
+ const advertised = new Set(caps?.observability?.metrics?.names ?? []);
80
+ for (const name of REQUIRED_METRICS_WHEN_MULTI_REGION) {
81
+ expect(advertised.has(name), driver.describe(
82
+ 'idempotency.md §"Operator surface"',
83
+ `multi-region hosts SHOULD advertise metric "${name}" so operators can monitor conflict frequency`,
84
+ )).toBe(true);
85
+ }
86
+ });
52
87
  });
@@ -0,0 +1,139 @@
1
+ /**
2
+ * Track 11 close-out: cross-run trace-context propagation across
3
+ * `core.subWorkflow` invocation.
4
+ *
5
+ * `otel-trace-propagation.test.ts` verifies that a single run's spans
6
+ * inherit an inbound `traceparent`'s traceId. This scenario closes the
7
+ * remaining gap (`conformance/coverage.md` row 52: "Cross-host
8
+ * propagation across `core.subWorkflow` invocation"): when a parent
9
+ * run with a known inbound traceparent dispatches a child run via
10
+ * `core.subWorkflow`, the CHILD run's emitted spans MUST also share
11
+ * the parent's traceId — distributed traces stitch across the
12
+ * sub-workflow boundary without operator-side correlation hacks.
13
+ *
14
+ * Operator-tier value: in production deployments, a sub-workflow may
15
+ * execute on a different host instance (`core.subWorkflow` is a
16
+ * dispatch boundary, not necessarily an in-process call). The
17
+ * traceparent-propagation contract guarantees the operator's OTel
18
+ * backend can render parent + child as one trace tree even when
19
+ * they're on separate hosts.
20
+ *
21
+ * Skip conditions:
22
+ * - Collector disabled.
23
+ * - Host doesn't advertise `capabilities.observability`.
24
+ * - `conformance-subworkflow-parent` fixture not advertised (host
25
+ * doesn't implement `core.subWorkflow`).
26
+ *
27
+ * @see spec/v1/observability.md §"Trace context propagation"
28
+ * @see spec/v1/node-packs.md §`core.subWorkflow`
29
+ * @see conformance/coverage.md row 52
30
+ */
31
+
32
+ import { describe, it, expect } from 'vitest';
33
+ import { driver } from '../lib/driver.js';
34
+ import { pollUntilTerminal } from '../lib/polling.js';
35
+ import { isFixtureAdvertised } from '../lib/fixtures.js';
36
+ import { getCollector, waitForRunSpans } from '../lib/otel-collector.js';
37
+
38
+ const PARENT_FIXTURE = 'conformance-subworkflow-parent';
39
+
40
+ interface RunEvent {
41
+ type: string;
42
+ nodeId?: string;
43
+ payload?: { outputs?: { childRunId?: string } };
44
+ }
45
+
46
+ function makeTraceparent(): { header: string; traceId: string } {
47
+ // W3C format: 00-<32 hex traceId>-<16 hex spanId>-01.
48
+ // Use a distinct id from the parent-only scenario so collector
49
+ // matching is unambiguous when both scenarios run back-to-back.
50
+ const traceId = '7c3e51b9d2a04e6f8b1c0d2e3f4a5b6c';
51
+ const spanId = '00f067aa0ba902b7';
52
+ return { header: `00-${traceId}-${spanId}-01`, traceId };
53
+ }
54
+
55
+ async function isObservabilityAdvertised(): Promise<boolean> {
56
+ try {
57
+ const disco = await driver.get('/.well-known/openwop');
58
+ const caps = (disco.json as { capabilities?: { observability?: unknown } }).capabilities ?? {};
59
+ return caps.observability !== undefined;
60
+ } catch {
61
+ return false;
62
+ }
63
+ }
64
+
65
+ describe('otel-trace-propagation-subworkflow: traceparent threads parent → child via core.subWorkflow', () => {
66
+ it('child run spans inherit the parent run\'s inbound traceId', async () => {
67
+ if (!getCollector()) {
68
+ // eslint-disable-next-line no-console
69
+ console.warn('[otel-trace-propagation-subworkflow] collector not started; skipping');
70
+ return;
71
+ }
72
+ if (!isFixtureAdvertised(PARENT_FIXTURE)) {
73
+ // eslint-disable-next-line no-console
74
+ console.warn(`[otel-trace-propagation-subworkflow] ${PARENT_FIXTURE} not advertised; skipping`);
75
+ return;
76
+ }
77
+ if (!(await isObservabilityAdvertised())) {
78
+ // eslint-disable-next-line no-console
79
+ console.warn('[otel-trace-propagation-subworkflow] capabilities.observability not advertised; skipping');
80
+ return;
81
+ }
82
+
83
+ const collector = getCollector()!;
84
+ collector.reset();
85
+
86
+ const { header, traceId } = makeTraceparent();
87
+ const create = await driver.post(
88
+ '/v1/runs',
89
+ { workflowId: PARENT_FIXTURE },
90
+ { headers: { traceparent: header } },
91
+ );
92
+ expect(create.status).toBe(201);
93
+ const parentRunId = (create.json as { runId: string }).runId;
94
+
95
+ await pollUntilTerminal(parentRunId, { timeoutMs: 30_000 });
96
+
97
+ // Walk the parent's event log to discover the child run id.
98
+ const eventsRes = await driver.get(
99
+ `/v1/runs/${encodeURIComponent(parentRunId)}/events/poll?lastSequence=0&timeout=1`,
100
+ );
101
+ expect(eventsRes.status).toBe(200);
102
+ const events = (eventsRes.json as { events?: RunEvent[] } | undefined)?.events ?? [];
103
+
104
+ const subwfCompleted = events.find(
105
+ (e) => e.type === 'node.completed' && e.nodeId === 'subwf-call',
106
+ );
107
+ expect(subwfCompleted, driver.describe(
108
+ 'node-packs.md §core.subWorkflow',
109
+ 'parent event log MUST include node.completed for the subwf-call node',
110
+ )).toBeDefined();
111
+
112
+ const childRunId = subwfCompleted?.payload?.outputs?.childRunId;
113
+ expect(typeof childRunId, driver.describe(
114
+ 'node-packs.md §core.subWorkflow outputSchema',
115
+ 'subwf-call node.completed payload MUST carry outputs.childRunId',
116
+ )).toBe('string');
117
+
118
+ // Both parent + child spans MUST share the inbound traceId.
119
+ const parentSpans = await waitForRunSpans(parentRunId, { timeoutMs: 10_000, minCount: 1 });
120
+ const childSpans = await waitForRunSpans(childRunId!, { timeoutMs: 10_000, minCount: 1 });
121
+
122
+ expect(parentSpans.length, 'collector MUST receive ≥1 span for the parent run').toBeGreaterThan(0);
123
+ expect(childSpans.length, 'collector MUST receive ≥1 span for the child run').toBeGreaterThan(0);
124
+
125
+ const wantTrace = traceId.toLowerCase();
126
+
127
+ const parentMatching = parentSpans.filter((s) => s.traceId.toLowerCase() === wantTrace);
128
+ expect(parentMatching.length, driver.describe(
129
+ 'observability.md §"Trace context propagation"',
130
+ 'parent-run spans MUST share the inbound traceparent traceId',
131
+ )).toBeGreaterThan(0);
132
+
133
+ const childMatching = childSpans.filter((s) => s.traceId.toLowerCase() === wantTrace);
134
+ expect(childMatching.length, driver.describe(
135
+ 'observability.md §"Trace context propagation" + node-packs.md §core.subWorkflow',
136
+ 'child-run spans dispatched via core.subWorkflow MUST inherit the parent run\'s traceId so distributed traces stitch across the dispatch boundary',
137
+ )).toBeGreaterThan(0);
138
+ });
139
+ });
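Reviewer note: `makeTraceparent` above builds a W3C `traceparent` header from fixed ids, and the assertions compare only the trace id. For readers less familiar with the header layout, here is a minimal parsing sketch. `parseTraceparent` and `TraceContext` are hypothetical names; the sketch handles only the four-field `version-traceid-parentid-flags` layout and is not taken from the suite's collector helpers.

```typescript
interface TraceContext {
  version: string;
  traceId: string;   // 32 lowercase hex characters
  parentId: string;  // 16 lowercase hex characters
  sampled: boolean;  // lowest bit of the flags byte
}

const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical helper: returns null for anything that is not a usable header.
function parseTraceparent(header: string): TraceContext | null {
  const m = TRACEPARENT_RE.exec(header.trim());
  if (m === null) return null;
  const version = m[1];
  const traceId = m[2];
  const parentId = m[3];
  const flags = m[4];
  // Version ff and all-zero ids are invalid per the W3C Trace Context spec.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

const parsed = parseTraceparent('00-7c3e51b9d2a04e6f8b1c0d2e3f4a5b6c-00f067aa0ba902b7-01');
console.log(parsed);
```

A child run that inherits the trace keeps `traceId` and replaces `parentId` with its own span id, which is why the scenario matches on `traceId` alone.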
@@ -26,6 +26,7 @@
26
26
  */
27
27
 
28
28
  import { describe, it, expect } from 'vitest';
29
+ import { createHash, createPublicKey, verify as cryptoVerify } from 'node:crypto';
29
30
 
30
31
  const REGISTRY_BASE = 'https://packs.openwop.dev';
31
32
  const ENABLED = process.env.OPENWOP_TEST_PUBLIC_REGISTRY === 'true';
@@ -129,3 +130,93 @@ describe('registry-public: spec-canonical pack manifests resolve', () => {
129
130
  });
130
131
  }
131
132
  });
133
+
134
+ describe('registry-public: tarball + signature + Ed25519 verify roundtrip', () => {
135
+ // core.openwop.examples@1.0.0 is the canonical reference pack for this
136
+ // check: published since the registry MVP, signed with the
137
+ // `openwop-registry-root` key over the whole tarball (method='ed25519').
138
+ // The same recipe applies to any pack at the registry; this scenario
139
+ // exercises the worst-case full roundtrip so clients have a wire-level
140
+ // contract for verifying packs before installing them.
141
+ //
142
+ // What gets asserted, in order:
143
+ // 1. Version manifest, tarball, signature, and public key all 200.
144
+ // 2. SRI integrity in the manifest matches a fresh sha256 of the
145
+ // tarball bytes — protects against tarball tampering between
146
+ // publish + retrieval.
147
+ // 3. Detached Ed25519 signature verifies against the public key over
148
+ // the bytes the publisher signed (per signing.method).
149
+ const PACK_NAME = 'core.openwop.examples';
150
+ const PACK_VERSION = '1.0.0';
151
+
152
+ async function getBinary(path: string): Promise<{ status: number; bytes: Buffer }> {
153
+ const res = await fetch(`${REGISTRY_BASE}${path}`);
154
+ const ab = await res.arrayBuffer();
155
+ return { status: res.status, bytes: Buffer.from(ab) };
156
+ }
157
+
158
+ async function getText(path: string): Promise<{ status: number; body: string }> {
159
+ const res = await fetch(`${REGISTRY_BASE}${path}`);
160
+ const body = await res.text();
161
+ return { status: res.status, body };
162
+ }
163
+
164
+ it(`tarball + sig + public key all retrievable, SRI matches, Ed25519 verifies for ${PACK_NAME}@${PACK_VERSION}`, async () => {
165
+ if (!ENABLED) return;
166
+
167
+ // 1. Manifest (JSON).
168
+ const manifestRes = await get(`/v1/packs/${PACK_NAME}/-/${PACK_VERSION}.json`);
169
+ expect(manifestRes.status).toBe(200);
170
+ const manifest = manifestRes.json as {
171
+ signing?: { method?: string; keyId?: string; publicKeyUrl?: string };
172
+ integrity?: string;
173
+ };
174
+ expect(typeof manifest.signing?.method).toBe('string');
175
+ expect(typeof manifest.signing?.keyId).toBe('string');
176
+ expect(typeof manifest.integrity).toBe('string');
177
+
178
+ // 2. Tarball.
179
+ const tarball = await getBinary(`/v1/packs/${PACK_NAME}/-/${PACK_VERSION}.tgz`);
180
+ expect(tarball.status, 'tarball MUST be retrievable').toBe(200);
181
+ expect(tarball.bytes.byteLength, 'tarball MUST be non-empty').toBeGreaterThan(0);
182
+
183
+ // 3. Detached signature (64-byte raw Ed25519).
184
+ const sig = await getBinary(`/v1/packs/${PACK_NAME}/-/${PACK_VERSION}.sig`);
185
+ expect(sig.status, 'signature MUST be retrievable').toBe(200);
186
+ expect(sig.bytes.byteLength, 'Ed25519 detached signature MUST be 64 bytes').toBe(64);
187
+
188
+ // 4. Public key — fetch from the publisher-declared URL when present,
189
+ // else fall back to the canonical `/keys/<keyId>.pub` shape.
190
+ const keyUrl = manifest.signing!.publicKeyUrl ?? `/keys/${manifest.signing!.keyId}.pub`;
191
+ const keyRes = await getText(keyUrl);
192
+ expect(keyRes.status, `public key MUST be retrievable at ${keyUrl}`).toBe(200);
193
+ const publicKey = createPublicKey(keyRes.body);
194
+
195
+ // 5. SRI integrity check: `sha256-<base64>=` MUST match a fresh
196
+ // sha256 of the tarball bytes per registry-operations.md.
197
+ expect(manifest.integrity).toMatch(/^sha256-[A-Za-z0-9+/]+=*$/);
198
+ const expectedSri = `sha256-${createHash('sha256').update(tarball.bytes).digest('base64')}`;
199
+ expect(expectedSri, 'SRI integrity in manifest MUST match a fresh sha256 of the tarball').toBe(
200
+ manifest.integrity,
201
+ );
202
+
203
+ // 6. Ed25519 verification. Two canonical signing conventions per
204
+ // `node-packs.md` §"Signing recipe": `method=ed25519` signs the
205
+ // whole tarball; `method=manual` signs the pack.json bytes inside
206
+ // the tarball. core.openwop.examples uses `ed25519`.
207
+ const method = manifest.signing!.method;
208
+ expect(['ed25519', 'manual']).toContain(method);
209
+
210
+ // For `ed25519` the signed bytes are the tarball; for `manual` the
211
+ // signed bytes are pack.json extracted from the tarball. This
212
+ // scenario picks `core.openwop.examples` specifically because it's
213
+ // `method=ed25519` — the simpler path. Extending to `manual` would
214
+ // require the tarball extractor from registry/scripts/verify-
215
+ // signatures.mjs which is intentionally out of scope here.
216
+ if (method !== 'ed25519') return;
217
+
218
+ const verified = cryptoVerify(null, tarball.bytes, publicKey, sig.bytes);
219
+ expect(verified, `Ed25519 signature over ${PACK_NAME}@${PACK_VERSION}.tgz MUST verify against ${manifest.signing!.keyId}`)
220
+ .toBe(true);
221
+ });
222
+ });
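Reviewer note: steps 5 and 6 above (SRI comparison, then detached Ed25519 verification over the whole tarball) can be reproduced offline with `node:crypto`. The sketch below assumes a throwaway key pair and stand-in bytes rather than the real `openwop-registry-root` key and a downloaded `.tgz`; `sri` and `verifyPack` are hypothetical helper names.

```typescript
import { createHash, generateKeyPairSync, sign, verify } from 'node:crypto';

// Stand-in for the tarball bytes; a real check would use the downloaded .tgz.
const tarball = Buffer.from('example tarball bytes');

// Throwaway key pair for illustration only. In the scenario the public key is
// fetched from the registry, never generated locally.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');

// Publisher side: detached signature over the whole tarball. Ed25519 takes no
// separate digest algorithm, so the algorithm argument is null.
const signature = sign(null, tarball, privateKey);

// SRI string in the `sha256-<base64>` form compared in step 5.
function sri(bytes: Buffer): string {
  return `sha256-${createHash('sha256').update(bytes).digest('base64')}`;
}

// Consumer side: integrity first, then signature, both over the same bytes.
function verifyPack(bytes: Buffer, integrity: string, sig: Buffer): boolean {
  return sri(bytes) === integrity && verify(null, bytes, publicKey, sig);
}

const integrity = sri(tarball);
const tampered = Buffer.from('example tarball bytes!');

console.log(verifyPack(tarball, integrity, signature));
console.log(verifyPack(tampered, integrity, signature));
```

Checking integrity before the signature is a cheap early exit; the signature check is still what ties the bytes to the publisher's key.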
@@ -789,11 +789,14 @@ describe.skipIf(
789
789
  () => {
790
790
  // describe.skipIf still evaluates the body for test registration; defaults guard against null
791
791
  // dirname() when sources are missing under the published-tarball layout. it() blocks below are
792
- // skipped at run time, so the path values are never actually read.
792
+ // skipped at run time, so the path values are never actually read. The sentinel is
793
+ // intentionally an obviously-invalid path so a stack trace from any future code that DOES
794
+ // dereference it points the reader at this comment.
795
+ const UNUSED_IN_PUBLISHED_LAYOUT = '/__sdk_paths_unused_in_published_layout__';
793
796
  const sdkSources = {
794
- typescript: TYPESCRIPT_RUN_HELPERS_PATH ?? '.',
795
- python: PYTHON_TYPES_PATH ?? '.',
796
- go: GO_TYPES_PATH ?? '.',
797
+ typescript: TYPESCRIPT_RUN_HELPERS_PATH ?? UNUSED_IN_PUBLISHED_LAYOUT,
798
+ python: PYTHON_TYPES_PATH ?? UNUSED_IN_PUBLISHED_LAYOUT,
799
+ go: GO_TYPES_PATH ?? UNUSED_IN_PUBLISHED_LAYOUT,
797
800
  };
798
801
  const sdkReadmes = {
799
802
  typescript: pathResolve(dirname(sdkSources.typescript), '..', 'README.md'),
@@ -1145,8 +1148,10 @@ describe.skipIf(README_PATH === null)('spec-corpus: local Markdown links resolve
1145
1148
 
1146
1149
  const target = pathResolve(dirname(file), decoded);
1147
1150
  // Published-tarball layout: the conformance README references ../spec/v1/... and other paths
1148
- // that resolve OUTSIDE the package boundary. Repo layout has the full tree available.
1149
- if (LAYOUT === 'published' && !target.startsWith(repoRoot)) continue;
1151
+ // that resolve OUTSIDE the package boundary. Repo layout has the full tree available. The
1152
+ // negated `target === repoRoot || target.startsWith(repoRoot + '/')` check avoids treating a
1153
+ // sibling path as inside the repo when repoRoot=/foo/bar and target=/foo/barbaz.
1154
+ if (LAYOUT === 'published' && target !== repoRoot && !target.startsWith(repoRoot + '/')) continue;
1150
1155
  expect(
1151
1156
  existsSync(target),
1152
1157
  `${relFile} links to missing local target: ${link}`,