npm - @openwop/openwop-conformance - Versions diffs - 1.6.0 → 1.6.1 - Mend

@openwop/openwop-conformance 1.6.0 → 1.6.1

Files changed (20)

package/CHANGELOG.md +8 -0
package/README.md +2 -2
package/api/asyncapi.yaml +17 -1
package/api/openapi.yaml +66 -0
package/coverage.md +2 -0
package/package.json +1 -1
package/schemas/README.md +2 -0
package/schemas/annotation-create.schema.json +37 -0
package/schemas/annotation.schema.json +56 -0
package/schemas/capabilities.schema.json +24 -0
package/src/lib/feedback.ts +31 -0
package/src/scenarios/feedback-capability-shape.test.ts +35 -0
package/src/scenarios/feedback-correction-redaction.test.ts +35 -0
package/src/scenarios/feedback-cross-tenant-isolation.test.ts +37 -0
package/src/scenarios/feedback-fork-not-copied.test.ts +40 -0
package/src/scenarios/feedback-on-terminal-run.test.ts +32 -0
package/src/scenarios/feedback-record-and-list.test.ts +32 -0
package/src/scenarios/feedback-unsupported-501.test.ts +32 -0
package/src/scenarios/redaction.test.ts +4 -1
package/src/scenarios/spec-corpus-validity.test.ts +4 -1

package/CHANGELOG.md CHANGED

@@ -1,5 +1,13 @@
 # `@openwop/openwop-conformance` Changelog
+## [1.6.1] — 2026-05-25
+Patch — fixes a stale allowlist in `redaction.test.ts` that contradicted the same release's `capabilities.schema.json`. Reported by MyndHyve against the 1.6.0 cohort run.
+### Fixed
+- **`redaction.test.ts:103`** — the `secrets.scopes` member check hardcoded `['tenant', 'user', 'run']`, omitting `'workspace'`. The canonical `secrets.scopes` enum in `capabilities.schema.json` is `["tenant", "user", "run", "workspace"]` (`workspace` is the RFC 0046/0048 sub-tenant scope, additive). A host honestly advertising `secrets.scopes: ['workspace', …]` (e.g. MyndHyve `workflow-runtime`) wrongly failed the scenario. The allowlist now tracks the schema enum. No wire-shape change; the schema and RFC 0046 §A were already canonical — only the test was stale.
 ## [1.6.0] — 2026-05-25
 Minor bump per `PUBLISHING.md` §"Versioning alignment" — ships the conformance scenarios for the **MyndHyve protocol-extension cohort (RFCs 0045–0054)** so adopting hosts can pin the released suite, run it against their deployment, and report pass for `Draft → Active → Accepted` graduation (per `RFCS/0001-rfc-process.md` §"Promotion to Accepted"). All additive — every new scenario is capability-gated and soft-skips against a host that doesn't advertise the surface, so existing v1.0-only hosts pass unchanged.
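
Reviewer note: the 1.6.1 fix described in the hunk above (the allowlist "now tracks the schema enum") can be illustrated with a minimal sketch. This is not the code in `redaction.test.ts`. The schema fragment, `unknownScopes`, and the variable names are hypothetical; only the enum values and the stale three-element list come from the changelog entry.

```typescript
// Hypothetical, trimmed-down stand-in for the `secrets.scopes` fragment of
// capabilities.schema.json. Only the enum values are taken from the changelog.
const secretsScopesSchema = {
  type: "array",
  items: { enum: ["tenant", "user", "run", "workspace"] },
} as const;

// Derive the allowlist from the schema enum instead of hardcoding it, so the
// test cannot drift from the schema when a scope is added.
const allowedScopes = new Set<string>(secretsScopesSchema.items.enum);

/** Returns the advertised scopes that are not members of the schema enum. */
function unknownScopes(advertised: readonly string[]): string[] {
  return advertised.filter((scope) => !allowedScopes.has(scope));
}

// The pre-1.6.1 allowlist as quoted in the changelog entry.
const staleAllowlist = new Set<string>(["tenant", "user", "run"]);

// A host that advertises the `workspace` scope.
const advertised = ["workspace", "tenant"];

// Under the stale list, `workspace` is flagged; under the derived list it is not.
const rejectedByStale = advertised.filter((s) => !staleAllowlist.has(s));
const rejectedByDerived = unknownScopes(advertised);

console.log({ rejectedByStale, rejectedByDerived });
```

Deriving the set from the schema object keeps a single source of truth, which is the property the changelog entry says the test had lost.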

package/README.md CHANGED

@@ -93,7 +93,7 @@ Exit code is non-zero on any failed assertion.
 ## What's Covered
-The current suite has 231 scenario files under `src/scenarios/`. 2026-05-25 (RFC 0025 §C point 1 — test-catalog isolation invariant; pairs with the 25 publish-error scenarios in `pack-registry-publish.test.ts`) added `pack-registry-isolation.test.ts` — capability-gated on `capabilities.packs.testMode.{supported, isolated}: true`; PUTs a disposable pack into `/v1/packs-test/{name}` and asserts the same `(name, version)` does NOT appear via `GET /v1/packs/{name}` — anchors the test-catalog isolation MUST in RFC 0025 §C. 2026-05-25 (RFC 0028 Tier-2 post-promotion T2 — read-side sister scenario for workspace-membership enforcement) added `prompt-read-workspace-membership-enforced.test.ts` — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` so read-only hosts that expose `?workspaceId=` are also probed); drives `GET /v1/prompts?workspaceId=<random-non-member>` and interprets the response: 4xx PASS (canonical envelope check on 403); 200 with empty `templates[]` PASS (correct null result for a nonexistent workspace); 200 with non-empty `templates[]` FAIL (cross-tenant leak); 200 without `templates[]` field SKIP (host doesn't expose workspace-scoped reads). Verifies SECURITY invariant `prompt-read-workspace-membership-enforced`. Same-day T1 strengthened `prompt-mutation-workspace-membership-enforced.test.ts` to pin `error === "workspace_membership_required"` when the host's refusal status is 403 (other refusal codes unconstrained). 2026-05-25 (RFC 0028 Tier-2 follow-up — workspace-membership enforcement on mutating prompt endpoints, filed in response to a self-disclosed adopter vulnerability) added `prompt-mutation-workspace-membership-enforced.test.ts` — capability-gated on `capabilities.prompts.mutableLibrary: true`; drives `POST /v1/prompts` with a cryptographically-random non-member `workspaceId` and asserts the host refuses (NOT a 2xx; any 4xx/5xx is acceptable — silent success is the failure mode). 
Verifies SECURITY invariant `prompt-mutation-workspace-membership-enforced`. 2026-05-22 (RFC 0034 §B follow-up — secret-leakage harness against the OTel + debug-bundle seams) added `secret-leakage-otel-attribute.test.ts` — gates on `capabilities.secrets.supported` + `capabilities.observability.testSeams.{otelScrape,debugBundleExport}` AND the `OPENWOP_CANARY_SECRET_VALUE` env (host operator + conformance runner agree on the canary). Drives the existing `openwop-smoke-byok-roundtrip` fixture end-to-end; scrapes both seams after run completion; hard-fails if the canary plaintext appears in any OTel span attribute or debug-bundle field. Verifies SECURITY invariants `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel`. 2026-05-22 (RFC 0041 Phase 4 — replay determinism under nondeterministic models) added three scenarios: `replay-divergence-at-refusal.test.ts` (advertisement-shape probe on `replayDeterminism.refusalDivergenceEmission` + 2 `it.todo` for the dual-direction refusal-divergence case), `replay-observable-sequence-determinism.test.ts` (capability-gated; behavioral assertion soft-skipped until a `conformance-phase4-nondet-tool` fixture ships), `replay-llm-cache-key-portable.test.ts` (intra-host reproducibility + non-recipe-field invariance + Phase 4 advertisement alignment — reuses the existing `POST /v1/host/sample/test/llm-cache-key` seam from the sibling `replay-llm-cache-key.test.ts`). 2026-05-20 (RFC 0027 §A templateKinds-coverage follow-up — paired with `prompt-end-to-end-events.test.ts`) added `prompt-all-four-kinds-events.test.ts` exercising all four `PromptKind` values (`system`, `user`, `schema-hint`, `few-shot`) end-to-end through the reference workflow-engine sample's `local.sample.demo.mock-ai` dispatch path; capability-gated via `behaviorGate('prompts-supported', ...)`. 
Closes the credibility gap where the host advertised `templateKinds: ["system", "user", "few-shot", "schema-hint"]` but only the system+user pair was actually wired into dispatch. 2026-05-20 (RFCs 0030–0033 — envelope LLM-contract-hardening track) added 15 scenarios across four `Active` RFCs: `envelope-reasoning-shape.test.ts` (RFC 0030, always-on; asserts the OPTIONAL `reasoning` property on the three universal-kind schemas + the `schema.response` deliberate omission), `envelope-reasoning-secret-redaction.test.ts` (RFC 0030, capability-gated on `capabilities.envelopes.reasoning.supported` + `secrets.supported`; 5 `it.todo()` placeholders for SECURITY invariant `envelope-reasoning-secret-redaction`), `envelope-tier-one-subset-static.test.ts` (RFC 0030, always-on for load-bearing rules — no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere; gated on `tierOneSubsetCompliance: "strict"` for OpenAI-strict-only constraints), `envelope-variant-discriminator-static.test.ts` (RFC 0031, always-on; asserts no `oneOf` + every `anyOf` branch declares a single-string-enum discriminator in `required` on every `schemas/envelopes/*.schema.json`), `model-capability-substituted.test.ts` (RFC 0031, advertisement-shape probe on `capabilities.modelCapabilities.advertised[]` identifier pattern + 5 `it.todo()` placeholders for SECURITY invariant `model-capability-substituted-no-credential-disclosure`), `model-capability-insufficient.test.ts` (RFC 0031, 6 `it.todo()` placeholders for refusal + no-recursive-fallback), `node-module-required-capabilities-shape.test.ts` (RFC 0031 SHOULD-tier authoring-convention; 4 `it.todo()` placeholders), and the six envelope-reliability events from RFC 0032 (`envelope-retry-attempted` carrying the shared advertisement-shape probe enforcing both MUST-tier events in `events[]` per RFC 0032 §C, plus `envelope-retry-exhausted`, `envelope-refusal-shape`, `envelope-truncated`, `envelope-nl-to-format-engaged`, `envelope-recovery-applied` — 
collectively 39 `it.todo()` placeholders covering retry/refusal/truncation/recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` and `envelope-recovery-no-content-leak`), plus RFC 0033's two scenarios (`envelope-completion-distinguishes-truncation.test.ts` + `envelope-truncation-cap-exhaustion.test.ts` — 12 `it.todo()` placeholders covering the truncation-vs-schema-violation retry-routing distinction + the DoS-bound assertion). Reference workflow-engine sample advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"` honestly (schemas accept the field; host doesn't yet inject the directive); the other three RFCs' capability blocks defer to reference-host emission code per the staged RFC 0027 §G precedent. 2026-05-20 (RFC 0028 §B Phase B — prompt-pack boot-time install) added `prompt-pack-install.test.ts` (capability-gated on `capabilities.prompts.endpointsSupported: true`; asserts a host that ran the boot-time pack loader surfaces ≥ 1 pack-source template under `GET /v1/prompts?source=pack` carrying the canonical `meta.source: "pack"` + `meta.packName` + `meta.packVersion` stamps; positively identifies the in-tree `vendor.openwop.prompt-sample` reference pack's `writer-system` template when present). Pairs with the new `host/promptPackLoader.ts` boot-time entry on the reference workflow-engine sample, which scans `examples/packs/*` plus `OPENWOP_PROMPT_PACKS_DIR` and calls `installPackTemplates()` for each `kind: "prompt"` pack found. 
2026-05-20 (RFC 0029 Phase C — prompt resolution chain wire shape) added three more scenarios: `prompt-resolution-chain-node-wins.test.ts` (capability-gated on `capabilities.prompts.supported: true`; asserts layer-1 node-config supersedes lower layers per `spec/v1/prompts.md` §"Resolution chain (normative)"), `prompt-resolution-chain-agent-intrinsic.test.ts` (additionally gated on `capabilities.prompts.agentBindings: true`; asserts agent intrinsic `systemPromptRef` wins over `promptOverrides` AND lower layers when the node has no layer-1 ref), `prompt-resolution-chain-fallback-cascade.test.ts` (asserts layer 3 workflow-defaults wins over layer 4 host-defaults; layer 4 host-defaults wins when 1-3 yield null; resolved is null when all four yield null but chain[] still lists every attempted layer). The scenarios drive the host's `POST /v1/host/sample/prompt/resolve` test seam (reference-host implementation deferred to follow-up slice per RFC 0021 staging precedent). 2026-05-20 (RFC 0027 Phase A — prompt templates wire shape) added three scenarios: `prompt-template-shape.test.ts` (always-on; Ajv compileability + positive/negative round-trip for PromptTemplate + PromptRef + PromptKind), `prompt-composed-secret-redaction.test.ts` (capability-gated on `capabilities.prompts.supported: true` + `observability: "full"`; asserts `[REDACTED:<secretId>]` markers in `prompt.composed` payloads for `source: "secret"` variable bindings per SECURITY/threat-model-secret-leakage.md §SR-1), `prompt-composed-trust-marker.test.ts` (same capability gates; asserts `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation per RFC 0020 §D). Paired with new `fixtures/prompt-templates/` sub-directory + per-fixture schema-validity describe block + future SECURITY invariants `prompt-composed-secret-redaction` and `prompt-composed-trust-marker` (lands alongside reference-host emission per RFC 0021 staging precedent). 
2026-05-18 (RFC 0022 `Draft` — runtime variable mapping) added four `it.todo()` placeholder scenarios covering the new mapping surfaces on `core.dispatch` (§A — `dispatch-input-mapping.test.ts`, `dispatch-output-mapping.test.ts`, `dispatch-cross-worker-handoff.test.ts`) and `core.subWorkflow` (§B — `subworkflow-input-mapping.test.ts`). Gated on `capabilities.agents.dispatchMapping` (dispatch trio) and `capabilities.subWorkflow.inputMapping` (subWorkflow). Promote to live assertions when RFC 0022 reaches `Active` + a reference host advertises the matching flags. 2026-05-17 (RFC 0003 §D handoff-schema enforcement, HV-1) added `agentPackHandoffSchemaValidation.test.ts` — verifies the host validates dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef` per RFC 0003 §D. Paired with the new `agent-pack-handoff-schema-enforcement` row in `SECURITY/invariants.yaml`. 2026-05-17 (AI Envelope gap-closure, DRAFT v1.x — `spec/v1/ai-envelope.md`) added 7 advertisement-shape scenarios with `it.todo()` behavioral placeholders gated on `capabilities.envelopeContracts.advertised: true`: `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`. Paired with the new `envelope-redaction-sr-1-carry-forward` row in `SECURITY/invariants.yaml`. 2026-05-17 (post-publish hardening, deep audit of `core.openwop.agents`) added `agents-run-tool-allowlist.test.ts` — server-free scenario locking in the `core.openwop.agents@1.0.1` safety-fix that closes `OPENWOP-AUDIT-2026-003` (function-typed `tool.handler` properties rejected at `validateTools()` with `INVALID_TOOL_DECLARATION`; tool-driven runs require `ctx.agentRuntime`; tool-less safe fallback preserved). Paired with the new `agents-run-no-raw-handler` row in `SECURITY/invariants.yaml`. 
Same-day post-publish hardening added `idempotency-key-determinism.test.ts` — server-free scenario locking in the `core.openwop.http@1.1.2` determinism safety-fix (default `composite` mode produces deterministic keys in `(runId, nodeId, payload)`; removed `uuid` mode rejects with `CONFIG_INVALID`; cross-impl vector test lets third-party reimplementations verify wire agreement). Paired with the new `idempotency-key-deterministic` row in `SECURITY/invariants.yaml`. 2026-05-17 (Phase 3 of RFC 0013) added three server-free scenarios exercising the reference workflow-chain expansion library (`conformance/src/lib/workflow-chain-expansion.ts`): `workflow-chain-expansion.test.ts` (parameter substitution + node id collision avoidance + edge rewriting + capability propagation + runtime-invariance contract), `workflow-chain-unresolvable-typeid.test.ts` (rejection with `chain_unresolvable_typeid` when a chain references an unknown typeId), and `workflow-chain-pack-signature-verification.test.ts` (Ed25519 verification recipe reuse from `node-packs.md §Signing`). Earlier that day (Phase 1) added `workflow-chain-pack-manifest-validation.test.ts` — server-free schema-validation scenario covering the new `workflow-chain-pack-manifest.schema.json` (positive sample + two negatives: kind/contents mismatch and invalid `chainId`). Closes RFC 0013 (`Workflow-chain packs`, `Draft`) Phases 1 + 3 alongside the new `spec/v1/workflow-chain-packs.md`, the `Capabilities.workflowChainPacks` block, and the registry build-index/conformance-check `kind` routing from Phase 2. Earlier that day, the suite added 27 `it.todo()` placeholder scenarios paired with RFCs 0014-0020 (host capability surfaces — fs, kvStorage, tableStorage, queueBus, sql/vector/search, blob/cache, mcp.serverMount). These promote to live assertions when each RFC reaches `Active` + the matching capability block lands in `schemas/capabilities.schema.json` + a reference host advertises the capability. 
Earlier additions include 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing SR-1 §D), `memory-compaction-event-emitted.test.ts` 
(canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). 2026-05-15 (gap-closure CF-3) added `interrupt-token-matrix.test.ts` (malformed / unknown / replay / cross-run-id paths on `GET|POST /v1/interrupts/{token}`). The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
+The current suite has 238 scenario files under `src/scenarios/`. 2026-05-25 (RFC 0025 §C point 1 — test-catalog isolation invariant; pairs with the 25 publish-error scenarios in `pack-registry-publish.test.ts`) added `pack-registry-isolation.test.ts` — capability-gated on `capabilities.packs.testMode.{supported, isolated}: true`; PUTs a disposable pack into `/v1/packs-test/{name}` and asserts the same `(name, version)` does NOT appear via `GET /v1/packs/{name}` — anchors the test-catalog isolation MUST in RFC 0025 §C. 2026-05-25 (RFC 0028 Tier-2 post-promotion T2 — read-side sister scenario for workspace-membership enforcement) added `prompt-read-workspace-membership-enforced.test.ts` — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` so read-only hosts that expose `?workspaceId=` are also probed); drives `GET /v1/prompts?workspaceId=<random-non-member>` and interprets the response: 4xx PASS (canonical envelope check on 403); 200 with empty `templates[]` PASS (correct null result for a nonexistent workspace); 200 with non-empty `templates[]` FAIL (cross-tenant leak); 200 without `templates[]` field SKIP (host doesn't expose workspace-scoped reads). Verifies SECURITY invariant `prompt-read-workspace-membership-enforced`. Same-day T1 strengthened `prompt-mutation-workspace-membership-enforced.test.ts` to pin `error === "workspace_membership_required"` when the host's refusal status is 403 (other refusal codes unconstrained). 2026-05-25 (RFC 0028 Tier-2 follow-up — workspace-membership enforcement on mutating prompt endpoints, filed in response to a self-disclosed adopter vulnerability) added `prompt-mutation-workspace-membership-enforced.test.ts` — capability-gated on `capabilities.prompts.mutableLibrary: true`; drives `POST /v1/prompts` with a cryptographically-random non-member `workspaceId` and asserts the host refuses (NOT a 2xx; any 4xx/5xx is acceptable — silent success is the failure mode). 
Verifies SECURITY invariant `prompt-mutation-workspace-membership-enforced`. 2026-05-22 (RFC 0034 §B follow-up — secret-leakage harness against the OTel + debug-bundle seams) added `secret-leakage-otel-attribute.test.ts` — gates on `capabilities.secrets.supported` + `capabilities.observability.testSeams.{otelScrape,debugBundleExport}` AND the `OPENWOP_CANARY_SECRET_VALUE` env (host operator + conformance runner agree on the canary). Drives the existing `openwop-smoke-byok-roundtrip` fixture end-to-end; scrapes both seams after run completion; hard-fails if the canary plaintext appears in any OTel span attribute or debug-bundle field. Verifies SECURITY invariants `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel`. 2026-05-22 (RFC 0041 Phase 4 — replay determinism under nondeterministic models) added three scenarios: `replay-divergence-at-refusal.test.ts` (advertisement-shape probe on `replayDeterminism.refusalDivergenceEmission` + 2 `it.todo` for the dual-direction refusal-divergence case), `replay-observable-sequence-determinism.test.ts` (capability-gated; behavioral assertion soft-skipped until a `conformance-phase4-nondet-tool` fixture ships), `replay-llm-cache-key-portable.test.ts` (intra-host reproducibility + non-recipe-field invariance + Phase 4 advertisement alignment — reuses the existing `POST /v1/host/sample/test/llm-cache-key` seam from the sibling `replay-llm-cache-key.test.ts`). 2026-05-20 (RFC 0027 §A templateKinds-coverage follow-up — paired with `prompt-end-to-end-events.test.ts`) added `prompt-all-four-kinds-events.test.ts` exercising all four `PromptKind` values (`system`, `user`, `schema-hint`, `few-shot`) end-to-end through the reference workflow-engine sample's `local.sample.demo.mock-ai` dispatch path; capability-gated via `behaviorGate('prompts-supported', ...)`. 
Closes the credibility gap where the host advertised `templateKinds: ["system", "user", "few-shot", "schema-hint"]` but only the system+user pair was actually wired into dispatch. 2026-05-20 (RFCs 0030–0033 — envelope LLM-contract-hardening track) added 15 scenarios across four `Active` RFCs: `envelope-reasoning-shape.test.ts` (RFC 0030, always-on; asserts the OPTIONAL `reasoning` property on the three universal-kind schemas + the `schema.response` deliberate omission), `envelope-reasoning-secret-redaction.test.ts` (RFC 0030, capability-gated on `capabilities.envelopes.reasoning.supported` + `secrets.supported`; 5 `it.todo()` placeholders for SECURITY invariant `envelope-reasoning-secret-redaction`), `envelope-tier-one-subset-static.test.ts` (RFC 0030, always-on for load-bearing rules — no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere; gated on `tierOneSubsetCompliance: "strict"` for OpenAI-strict-only constraints), `envelope-variant-discriminator-static.test.ts` (RFC 0031, always-on; asserts no `oneOf` + every `anyOf` branch declares a single-string-enum discriminator in `required` on every `schemas/envelopes/*.schema.json`), `model-capability-substituted.test.ts` (RFC 0031, advertisement-shape probe on `capabilities.modelCapabilities.advertised[]` identifier pattern + 5 `it.todo()` placeholders for SECURITY invariant `model-capability-substituted-no-credential-disclosure`), `model-capability-insufficient.test.ts` (RFC 0031, 6 `it.todo()` placeholders for refusal + no-recursive-fallback), `node-module-required-capabilities-shape.test.ts` (RFC 0031 SHOULD-tier authoring-convention; 4 `it.todo()` placeholders), and the six envelope-reliability events from RFC 0032 (`envelope-retry-attempted` carrying the shared advertisement-shape probe enforcing both MUST-tier events in `events[]` per RFC 0032 §C, plus `envelope-retry-exhausted`, `envelope-refusal-shape`, `envelope-truncated`, `envelope-nl-to-format-engaged`, `envelope-recovery-applied` — 
collectively 39 `it.todo()` placeholders covering retry/refusal/truncation/recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` and `envelope-recovery-no-content-leak`), plus RFC 0033's two scenarios (`envelope-completion-distinguishes-truncation.test.ts` + `envelope-truncation-cap-exhaustion.test.ts` — 12 `it.todo()` placeholders covering the truncation-vs-schema-violation retry-routing distinction + the DoS-bound assertion). Reference workflow-engine sample advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"` honestly (schemas accept the field; host doesn't yet inject the directive); the other three RFCs' capability blocks defer to reference-host emission code per the staged RFC 0027 §G precedent. 2026-05-20 (RFC 0028 §B Phase B — prompt-pack boot-time install) added `prompt-pack-install.test.ts` (capability-gated on `capabilities.prompts.endpointsSupported: true`; asserts a host that ran the boot-time pack loader surfaces ≥ 1 pack-source template under `GET /v1/prompts?source=pack` carrying the canonical `meta.source: "pack"` + `meta.packName` + `meta.packVersion` stamps; positively identifies the in-tree `vendor.openwop.prompt-sample` reference pack's `writer-system` template when present). Pairs with the new `host/promptPackLoader.ts` boot-time entry on the reference workflow-engine sample, which scans `examples/packs/*` plus `OPENWOP_PROMPT_PACKS_DIR` and calls `installPackTemplates()` for each `kind: "prompt"` pack found. 
2026-05-20 (RFC 0029 Phase C — prompt resolution chain wire shape) added three more scenarios: `prompt-resolution-chain-node-wins.test.ts` (capability-gated on `capabilities.prompts.supported: true`; asserts layer-1 node-config supersedes lower layers per `spec/v1/prompts.md` §"Resolution chain (normative)"), `prompt-resolution-chain-agent-intrinsic.test.ts` (additionally gated on `capabilities.prompts.agentBindings: true`; asserts agent intrinsic `systemPromptRef` wins over `promptOverrides` AND lower layers when the node has no layer-1 ref), `prompt-resolution-chain-fallback-cascade.test.ts` (asserts layer 3 workflow-defaults wins over layer 4 host-defaults; layer 4 host-defaults wins when 1-3 yield null; resolved is null when all four yield null but chain[] still lists every attempted layer). The scenarios drive the host's `POST /v1/host/sample/prompt/resolve` test seam (reference-host implementation deferred to follow-up slice per RFC 0021 staging precedent). 2026-05-20 (RFC 0027 Phase A — prompt templates wire shape) added three scenarios: `prompt-template-shape.test.ts` (always-on; Ajv compileability + positive/negative round-trip for PromptTemplate + PromptRef + PromptKind), `prompt-composed-secret-redaction.test.ts` (capability-gated on `capabilities.prompts.supported: true` + `observability: "full"`; asserts `[REDACTED:<secretId>]` markers in `prompt.composed` payloads for `source: "secret"` variable bindings per SECURITY/threat-model-secret-leakage.md §SR-1), `prompt-composed-trust-marker.test.ts` (same capability gates; asserts `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation per RFC 0020 §D). Paired with new `fixtures/prompt-templates/` sub-directory + per-fixture schema-validity describe block + future SECURITY invariants `prompt-composed-secret-redaction` and `prompt-composed-trust-marker` (lands alongside reference-host emission per RFC 0021 staging precedent). 
2026-05-18 (RFC 0022 `Draft` — runtime variable mapping) added four `it.todo()` placeholder scenarios covering the new mapping surfaces on `core.dispatch` (§A — `dispatch-input-mapping.test.ts`, `dispatch-output-mapping.test.ts`, `dispatch-cross-worker-handoff.test.ts`) and `core.subWorkflow` (§B — `subworkflow-input-mapping.test.ts`). Gated on `capabilities.agents.dispatchMapping` (dispatch trio) and `capabilities.subWorkflow.inputMapping` (subWorkflow). Promote to live assertions when RFC 0022 reaches `Active` + a reference host advertises the matching flags. 2026-05-17 (RFC 0003 §D handoff-schema enforcement, HV-1) added `agentPackHandoffSchemaValidation.test.ts` — verifies the host validates dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef` per RFC 0003 §D. Paired with the new `agent-pack-handoff-schema-enforcement` row in `SECURITY/invariants.yaml`. 2026-05-17 (AI Envelope gap-closure, DRAFT v1.x — `spec/v1/ai-envelope.md`) added 7 advertisement-shape scenarios with `it.todo()` behavioral placeholders gated on `capabilities.envelopeContracts.advertised: true`: `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`. Paired with the new `envelope-redaction-sr-1-carry-forward` row in `SECURITY/invariants.yaml`. 2026-05-17 (post-publish hardening, deep audit of `core.openwop.agents`) added `agents-run-tool-allowlist.test.ts` — server-free scenario locking in the `core.openwop.agents@1.0.1` safety-fix that closes `OPENWOP-AUDIT-2026-003` (function-typed `tool.handler` properties rejected at `validateTools()` with `INVALID_TOOL_DECLARATION`; tool-driven runs require `ctx.agentRuntime`; tool-less safe fallback preserved). Paired with the new `agents-run-no-raw-handler` row in `SECURITY/invariants.yaml`. 
Same-day post-publish hardening added `idempotency-key-determinism.test.ts` — server-free scenario locking in the `core.openwop.http@1.1.2` determinism safety-fix (default `composite` mode produces deterministic keys in `(runId, nodeId, payload)`; removed `uuid` mode rejects with `CONFIG_INVALID`; cross-impl vector test lets third-party reimplementations verify wire agreement). Paired with the new `idempotency-key-deterministic` row in `SECURITY/invariants.yaml`. 2026-05-17 (Phase 3 of RFC 0013) added three server-free scenarios exercising the reference workflow-chain expansion library (`conformance/src/lib/workflow-chain-expansion.ts`): `workflow-chain-expansion.test.ts` (parameter substitution + node id collision avoidance + edge rewriting + capability propagation + runtime-invariance contract), `workflow-chain-unresolvable-typeid.test.ts` (rejection with `chain_unresolvable_typeid` when a chain references an unknown typeId), and `workflow-chain-pack-signature-verification.test.ts` (Ed25519 verification recipe reuse from `node-packs.md §Signing`). Earlier that day (Phase 1) added `workflow-chain-pack-manifest-validation.test.ts` — server-free schema-validation scenario covering the new `workflow-chain-pack-manifest.schema.json` (positive sample + two negatives: kind/contents mismatch and invalid `chainId`). Closes RFC 0013 (`Workflow-chain packs`, `Draft`) Phases 1 + 3 alongside the new `spec/v1/workflow-chain-packs.md`, the `Capabilities.workflowChainPacks` block, and the registry build-index/conformance-check `kind` routing from Phase 2. Earlier that day, the suite added 27 `it.todo()` placeholder scenarios paired with RFCs 0014-0020 (host capability surfaces — fs, kvStorage, tableStorage, queueBus, sql/vector/search, blob/cache, mcp.serverMount). These promote to live assertions when each RFC reaches `Active` + the matching capability block lands in `schemas/capabilities.schema.json` + a reference host advertises the capability. 
Earlier additions include 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing SR-1 §D), `memory-compaction-event-emitted.test.ts` 
(canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). 2026-05-15 (gap-closure CF-3) added `interrupt-token-matrix.test.ts` (malformed / unknown / replay / cross-run-id paths on `GET|POST /v1/interrupts/{token}`). The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
 High-level coverage includes:
@@ -172,7 +172,7 @@ Server-required (added in 1.7.0):
 |---|---|---|
 | **Redaction** | [`capabilities.md`](../spec/v1/capabilities.md) §"Secrets" + NFR-7 + §"aiProviders" | Vendor-neutral assertions that the server doesn't leak secret material. Three scenario groups: (a) discovery shape contract — `secrets` + `aiProviders` advertisements are well-formed regardless of `secrets.supported`; when `supported === true`, scopes MUST be non-empty + `resolution === 'host-managed'`; `byok ⊆ supported`. (b) bearer-token redaction — invalid Bearer canary in `Authorization` header is not echoed in the 401 response body. (c) credentialRef echo control — gated on `secrets.supported === true`; canary planted in `configurable.ai.credentialRef` MUST NOT appear in any RunEvent payload (poll-based capture; transport-agnostic). Uses runtime-built canary fixtures (`lib/canaries.ts`) that defeat static secret scanners. 6 scenarios. |
-Current source tree: 231 scenario files. Use [`coverage.md`](./coverage.md) for current grade/gap tracking.
+Current source tree: 238 scenario files. Use [`coverage.md`](./coverage.md) for current grade/gap tracking.
 ## Remaining Gaps

package/api/asyncapi.yaml CHANGED

@@ -79,6 +79,7 @@ channels:
       runCancelled:      { $ref: '#/components/messages/RunCancelled' }
       runPaused:         { $ref: '#/components/messages/RunPaused' }
       runResumed:        { $ref: '#/components/messages/RunResumed' }
+      runAnnotated:      { $ref: '#/components/messages/RunAnnotated' }
       nodeCompleted:     { $ref: '#/components/messages/NodeCompleted' }
       nodeFailed:        { $ref: '#/components/messages/NodeFailed' }
       nodeSkipped:       { $ref: '#/components/messages/NodeSkipped' }
@@ -153,7 +154,8 @@ channels:
       runId:
         $ref: '#/components/parameters/RunId'
     messages:
-      anyRunEvent: { $ref: '#/components/messages/AnyRunEvent' }
+      anyRunEvent:  { $ref: '#/components/messages/AnyRunEvent' }
+      runAnnotated: { $ref: '#/components/messages/RunAnnotated' }
 # ─────────────────────────────────────────────────────────────────────────────
 # OPERATIONS — consumer-side (receive)
@@ -310,6 +312,14 @@ components:
       payload:
         $ref: '#/components/schemas/RunEventDoc'
+    RunAnnotated:
+      name: run.annotated
+      title: Run annotated (RFC 0056)
+      summary: A non-blocking quality annotation was recorded for the run. Live notification ONLY — NOT a replayable run-event-log entry; its payload is an Annotation (not a RunEventDoc), so it is excluded from fork/replay (RFC 0056 §B/§D).
+      contentType: application/json
+      payload:
+        $ref: '#/components/schemas/Annotation'
     # ── Node-lifecycle ───────────────────────────────────────────────────
     NodeCompleted:
       name: node.completed
@@ -454,6 +464,12 @@ components:
     RunEventDoc:
       $ref: '../schemas/run-event.schema.json'
+    # RFC 0056. The run.annotated notification carries an Annotation —
+    # NOT a RunEventDoc — because annotations are a side-resource, not
+    # replayable run-event-log entries (RFC 0056 §B/§D).
+    Annotation:
+      $ref: '../schemas/annotation.schema.json'
     StateSnapshotPayload:
       # S1 closure (2026-04-27): reuse the canonical RunSnapshot
       # projection shape verbatim. Same type returned by

package/api/openapi.yaml CHANGED

@@ -361,6 +361,72 @@ paths:
         '403': { $ref: '#/components/responses/Forbidden' }
         '404': { $ref: '#/components/responses/NotFound' }
+  # ── Run feedback / annotations (RFC 0056) ────────────────────────────
+  # Gated on `capabilities.feedback.supported: true`. Annotations are a
+  # per-run side-resource (NOT replayable run-event-log entries); recording
+  # one also emits a live `run.annotated` SSE notification. Hosts without
+  # the advertised capability return `501 capability_not_provided`.
+  /v1/runs/{runId}/annotations:
+    post:
+      tags: [runs]
+      summary: Record a non-blocking quality annotation on a run (RFC 0056).
+      operationId: createAnnotation
+      parameters:
+        - $ref: '#/components/parameters/RunId'
+        - $ref: '#/components/parameters/IdempotencyKey'
+      requestBody:
+        required: true
+        content:
+          application/json:
+            schema:
+              $ref: '../schemas/annotation-create.schema.json'
+      responses:
+        '201':
+          description: Annotation recorded. Returns the persisted annotation.
+          content:
+            application/json:
+              schema:
+                $ref: '../schemas/annotation.schema.json'
+        '400': { $ref: '#/components/responses/ValidationError' }
+        '401': { $ref: '#/components/responses/Unauthenticated' }
+        '403': { $ref: '#/components/responses/Forbidden' }
+        '404': { $ref: '#/components/responses/NotFound' }
+        '501':
+          description: 'Host does not advertise capabilities.feedback.supported (RFC 0056).'
+          content:
+            application/json:
+              schema:
+                $ref: '../schemas/error-envelope.schema.json'
+    get:
+      tags: [runs]
+      summary: List the annotations recorded on a run (RFC 0056).
+      operationId: listAnnotations
+      parameters:
+        - $ref: '#/components/parameters/RunId'
+      responses:
+        '200':
+          description: Annotations for the run (tenant-scoped).
+          content:
+            application/json:
+              schema:
+                type: object
+                required: [annotations]
+                properties:
+                  annotations:
+                    type: array
+                    items:
+                      $ref: '../schemas/annotation.schema.json'
+                additionalProperties: false
+        '401': { $ref: '#/components/responses/Unauthenticated' }
+        '403': { $ref: '#/components/responses/Forbidden' }
+        '404': { $ref: '#/components/responses/NotFound' }
+        '501':
+          description: 'Host does not advertise capabilities.feedback.supported (RFC 0056).'
+          content:
+            application/json:
+              schema:
+                $ref: '../schemas/error-envelope.schema.json'
   /v1/runs:bulk-cancel:
     post:
       tags: [runs]
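
The `listAnnotations` 200 response above is an object with exactly one required key, `annotations`, holding an array, and `additionalProperties: false`. A minimal sketch of that envelope check follows. The function name `isAnnotationListBody` is hypothetical and not part of the package, and the sketch does not validate the individual items against `annotation.schema.json`.

```typescript
// Hypothetical helper: checks only the outer envelope of the
// `GET /v1/runs/{runId}/annotations` 200 body as declared in openapi.yaml
// (required: [annotations], additionalProperties: false).
function isAnnotationListBody(body: unknown): body is { annotations: unknown[] } {
  if (typeof body !== 'object' || body === null || Array.isArray(body)) return false;
  const record = body as Record<string, unknown>;
  const keys = Object.keys(record);
  // Exactly one key is allowed, and it must be the `annotations` array.
  return keys.length === 1 && keys[0] === 'annotations' && Array.isArray(record['annotations']);
}
```

An empty list (`{ annotations: [] }`) passes this check, which matches the fork scenario's expectation that a fresh fork lists zero annotations.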

package/coverage.md CHANGED

@@ -141,6 +141,8 @@ Every OpenAPI operation should have:
 | `pauseRun` | `pause-resume.test.ts` covers direct route behavior for running → paused, idempotent re-pause, terminal conflict, and pause-during-suspend race | Conflict and race paths covered with `details.runStatus`; endpoint is no longer coverage-missing | Add explicit immediate-vs-drain-current-node policy assertion when a host advertises both drain policies. |
 | `resumeRun` | `pause-resume.test.ts` covers direct route behavior for paused → running and non-paused conflict | Conflict path covered with `details.runStatus`; endpoint is no longer coverage-missing | Good. |
 | `forkRun` | `replay-fork.test.ts`, `replayDeterminism.test.ts` | Negative `fromSeq`, past-end, unknown source, invalid overlay | Add arbitrary-event fork and retention-expired source. |
+| `createAnnotation` | `feedback-record-and-list.test.ts`, `feedback-on-terminal-run.test.ts`, `feedback-correction-redaction.test.ts` (RFC 0056); gated on `capabilities.feedback.supported`, soft-skip on `501` | `feedback-unsupported-501.test.ts` (501 when unadvertised), `feedback-cross-tenant-isolation.test.ts`, `feedback-fork-not-copied.test.ts` | Capability-gated; full cross-tenant proof needs a multi-tenant auth seam (soft-skips, like `kv-cross-tenant-isolation`). |
+| `listAnnotations` | `feedback-record-and-list.test.ts`, `feedback-cross-tenant-isolation.test.ts` (RFC 0056) | `feedback-correction-redaction.test.ts` (redacted listing), `feedback-fork-not-copied.test.ts` (fork list empty) | Gated on `capabilities.feedback.supported`. |
 | `diffRun` | `run-diff.test.ts` (RFC 0054); soft-skips on 404 when the endpoint is unimplemented | Self-diff `divergedAtSeq: null`/empty (determinism floor), two-fixture divergence with `eventDiffs[0].seq === divergedAtSeq`, response-shape + `stateDiff` redaction-safety, `400` (missing `against`) + `404` (nonexistent `against`) | Add a bespoke deterministically-divergent fork fixture for `divergedAtSeq === N`-at-a-chosen-seq; full cross-principal `403` needs a multi-principal harness. |
 | `resolveInterruptByRun` | `interrupt-approval.test.ts`, `interrupt-clarification.test.ts`, `approval-payload.test.ts`, `interruptRace.test.ts` | Invalid action, unknown node, race cases | Add auth-required and quorum profile scenarios. |
 | `inspectInterruptByToken` | `interrupt-token-matrix.test.ts` (CF-3, 2026-05-15) covers malformed + unknown token paths | Negative paths covered | Add explicit expired-token case when a host advertises a TTL seam. |

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@openwop/openwop-conformance",
-  "version": "1.6.0",
+  "version": "1.6.1",
   "description": "Production-ready black-box conformance suite for OpenWOP v1.0 compliant servers.",
   "repository": {
     "type": "git",

package/schemas/README.md CHANGED

@@ -11,6 +11,8 @@
 | `envelopes/schema.request.schema.json` | `ai-envelope.md` §"Universal kinds" | FINAL v1.1 — LLM asks the engine for a kind's JSON Schema. Counts against `Capabilities.limits.schemaRounds`. |
 | `envelopes/schema.response.schema.json` | `ai-envelope.md` §"Universal kinds" | FINAL v1.1 — side-channel ack for `schema.request`. Never surfaces to users. |
 | `envelopes/error.schema.json` | `ai-envelope.md` §"Universal kinds" | FINAL v1.1 — LLM's deliberate error report. Distinct from `error-envelope.schema.json` (host HTTP errors). |
+| `annotation.schema.json` | `RFCS/0056` + `observability.md` | RFC 0056 (`Draft`) — a non-blocking human/agent quality signal (rating / correction / label / flag) attached to a run, event, or node. A side-resource (not a replayable run-event-log entry); response of `POST/GET /v1/runs/{runId}/annotations` + payload of the `run.annotated` SSE notification. |
+| `annotation-create.schema.json` | `RFCS/0056` | RFC 0056 (`Draft`) — request body for `POST /v1/runs/{runId}/annotations` (host assigns `annotationId`/`createdAt`/`actor`; binds `target.runId` to the path). |
 | `audit-verify-result.schema.json` | `auth-profiles.md` §`openwop-audit-log-integrity` | Response payload from `GET /v1/audit/verify` — chain-validity verdict + checkpoints + anomalies |
 | `capabilities.schema.json` | `capabilities.md` | `/.well-known/openwop` response — protocolVersion + supportedEnvelopes + schemaVersions + limits + optional v1 discovery surface |
 | `channel-written-payload.schema.json` | `channels-and-reducers.md` §Channel write event | Payload of the `channel.written` RunEvent — write input + reducer name |

package/schemas/annotation-create.schema.json ADDED

@@ -0,0 +1,37 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://openwop.dev/spec/v1/annotation-create.schema.json",
+  "title": "AnnotationCreate",
+  "description": "RFC 0056. Request body for `POST /v1/runs/{runId}/annotations`. The host assigns `annotationId` + `createdAt`, derives `actor.principalRef` from the authenticated principal, and binds `target.runId` to the path `runId` — so the client supplies only the target anchor (event/node), the signal, and an optional note. The response is a full `annotation.schema.json`.",
+  "type": "object",
+  "required": ["signal"],
+  "properties": {
+    "target": {
+      "type": "object",
+      "description": "Optional finer-grained anchor within the run. `runId` is taken from the path and MUST NOT be supplied here.",
+      "properties": {
+        "eventId": { "type": "string", "description": "Anchor the annotation to one RunEvent." },
+        "nodeId": { "type": "string", "description": "Anchor the annotation to one node." }
+      },
+      "additionalProperties": false
+    },
+    "signal": {
+      "type": "object",
+      "required": ["kind"],
+      "properties": {
+        "kind": { "type": "string", "enum": ["rating", "correction", "label", "flag"] },
+        "rating": { "type": "integer", "minimum": 1, "maximum": 5, "description": "Required iff `kind` is `rating`." },
+        "label": { "type": "string", "description": "Required iff `kind` is `label`." },
+        "correction": { "type": "string", "description": "Corrected text/value iff `kind` is `correction`. Untrusted user content." }
+      },
+      "additionalProperties": false,
+      "allOf": [
+        { "if": { "properties": { "kind": { "const": "rating" } } }, "then": { "required": ["rating"] } },
+        { "if": { "properties": { "kind": { "const": "label" } } }, "then": { "required": ["label"] } },
+        { "if": { "properties": { "kind": { "const": "correction" } } }, "then": { "required": ["correction"] } }
+      ]
+    },
+    "note": { "type": "string", "description": "Optional free-text reviewer note. Untrusted user content." }
+  },
+  "additionalProperties": false
+}

package/schemas/annotation.schema.json ADDED

@@ -0,0 +1,56 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://openwop.dev/spec/v1/annotation.schema.json",
+  "title": "Annotation",
+  "description": "RFC 0056. A non-blocking human/agent quality signal attached to a run, event, or node — a side-resource, NOT a replayable run-event-log entry. Recorded via `POST /v1/runs/{runId}/annotations`, listed via `GET`, and surfaced live via the `run.annotated` SSE notification. `signal.correction` and `note` are untrusted user content (SECURITY invariant `annotation-content-redaction`).",
+  "type": "object",
+  "required": ["annotationId", "target", "signal", "actor", "createdAt"],
+  "properties": {
+    "annotationId": {
+      "type": "string",
+      "minLength": 1,
+      "description": "Host-assigned unique identifier for this annotation."
+    },
+    "target": {
+      "type": "object",
+      "required": ["runId"],
+      "properties": {
+        "runId": { "type": "string", "minLength": 1 },
+        "eventId": { "type": "string", "description": "Optional — anchors the annotation to one RunEvent." },
+        "nodeId": { "type": "string", "description": "Optional — anchors the annotation to one node." }
+      },
+      "additionalProperties": false
+    },
+    "signal": {
+      "type": "object",
+      "required": ["kind"],
+      "properties": {
+        "kind": { "type": "string", "enum": ["rating", "correction", "label", "flag"] },
+        "rating": { "type": "integer", "minimum": 1, "maximum": 5, "description": "Required iff `kind` is `rating`." },
+        "label": { "type": "string", "description": "Required iff `kind` is `label`." },
+        "correction": { "type": "string", "description": "Corrected text/value iff `kind` is `correction`. Untrusted user content." }
+      },
+      "additionalProperties": false,
+      "allOf": [
+        { "if": { "properties": { "kind": { "const": "rating" } } }, "then": { "required": ["rating"] } },
+        { "if": { "properties": { "kind": { "const": "label" } } }, "then": { "required": ["label"] } },
+        { "if": { "properties": { "kind": { "const": "correction" } } }, "then": { "required": ["correction"] } }
+      ]
+    },
+    "actor": {
+      "type": "object",
+      "required": ["principalRef"],
+      "properties": {
+        "principalRef": {
+          "type": "string",
+          "minLength": 1,
+          "description": "Opaque principal identifier — a principal per RFC 0048 (Draft, referenced non-normatively) or an AgentRef per RFC 0002 when a supervisor agent annotates. String-typed so RFC 0056 does not depend on RFC 0048 reaching Accepted."
+        }
+      },
+      "additionalProperties": false
+    },
+    "note": { "type": "string", "description": "Optional free-text reviewer note. Untrusted user content." },
+    "createdAt": { "type": "string", "format": "date-time", "description": "ISO 8601 timestamp the host recorded the annotation." }
+  },
+  "additionalProperties": false
+}
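
The `signal` object in both annotation schemas pairs a `kind` enum with three `if`/`then` conditionals: `rating`, `label`, and `correction` each become required when `kind` names them, while `flag` needs no companion field. The sketch below hand-rolls those rules for illustration only. `checkSignal` and its error strings are hypothetical, and a real host or client would more likely run a JSON Schema validator against the published schema.

```typescript
// Hypothetical hand-rolled check of the `signal` rules. Only the field names,
// the enum values, and the 1..5 rating range come from the schema itself.
const KINDS: readonly string[] = ['rating', 'correction', 'label', 'flag'];
const ALLOWED_KEYS: readonly string[] = ['kind', 'rating', 'label', 'correction'];

/** Returns a list of violations; an empty list means the signal is acceptable. */
function checkSignal(signal: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const kind = signal['kind'];
  if (typeof kind !== 'string' || !KINDS.includes(kind)) {
    return ['kind: must be one of ' + KINDS.join(', ')];
  }
  // additionalProperties: false
  for (const key of Object.keys(signal)) {
    if (!ALLOWED_KEYS.includes(key)) errors.push(`${key}: additional property not allowed`);
  }
  // Property-level constraints apply whenever the property is present.
  const rating = signal['rating'];
  if (rating !== undefined) {
    const ok = typeof rating === 'number' && Number.isInteger(rating) && rating >= 1 && rating <= 5;
    if (!ok) errors.push('rating: must be an integer in 1..5');
  }
  for (const key of ['label', 'correction']) {
    if (signal[key] !== undefined && typeof signal[key] !== 'string') {
      errors.push(`${key}: must be a string`);
    }
  }
  // allOf conditionals: every kind except `flag` requires the same-named field.
  if (kind !== 'flag' && signal[kind] === undefined) {
    errors.push(`${kind}: required when kind is "${kind}"`);
  }
  return errors;
}
```

The last check relies on the fact that each non-`flag` kind shares its name with the field it requires, so one lookup covers all three conditionals.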

package/schemas/capabilities.schema.json CHANGED

@@ -414,6 +414,30 @@
       },
       "additionalProperties": false
     },
+    "feedback": {
+      "type": "object",
+      "description": "RFC 0056 (`Draft`). Non-blocking human/agent quality signals (rating / correction / label / flag) attached to a run, event, or node. Annotations are a per-run side-resource recorded via `POST /v1/runs/{runId}/annotations`, listed via `GET`, and surfaced live via the `run.annotated` SSE notification — they are NOT entries in the replayable run event log (see RFC 0056 §B/§D). Hosts that do not advertise `supported: true` return `501 capability_not_provided` on the annotation endpoints.",
+      "required": ["supported"],
+      "properties": {
+        "supported": {
+          "type": "boolean",
+          "description": "Host implements the RFC 0056 annotation side-store + endpoints + `run.annotated` notification."
+        },
+        "targets": {
+          "type": "array",
+          "items": { "type": "string", "enum": ["run", "event", "node"] },
+          "uniqueItems": true,
+          "description": "Which annotation-target granularities the host accepts. Absent = `run` only."
+        },
+        "signals": {
+          "type": "array",
+          "items": { "type": "string", "enum": ["rating", "correction", "label", "flag"] },
+          "uniqueItems": true,
+          "description": "Which signal kinds the host accepts. Absent = all four."
+        }
+      },
+      "additionalProperties": false
+    },
     "oauth": {
       "type": "object",
       "description": "RFC 0047 (`Draft`). Host performs OAuth 2.0 grants (authorization-code + refresh) on a user's behalf for connector nodes, stores the acquired token as a `host.credentials` (RFC 0046) entry, refreshes it transparently, and resolves it into the node sandbox as a bearer token. Token material NEVER crosses the wire (SECURITY invariant `credential-payload-redaction`). Distinct from `auth` host-authentication profiles (RFC 0010 = who is the caller; this = what third-party token a node holds).",

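The `feedback` capability block above gives its optional arrays defaults: an absent `targets` means `run` only, and an absent `signals` means all four kinds. A minimal sketch of how a client might resolve those defaults is shown here. `resolveFeedback` and `EffectiveFeedback` are hypothetical names, and silently dropping unknown array members is an assumption of this sketch, not something the schema prescribes.

```typescript
// Hypothetical resolver for the defaults documented on `capabilities.feedback`.
const ALL_TARGETS: readonly string[] = ['run', 'event', 'node'];
const ALL_SIGNALS: readonly string[] = ['rating', 'correction', 'label', 'flag'];

interface EffectiveFeedback {
  supported: boolean;
  targets: string[];
  signals: string[];
}

// Returns the known string members of `value`, or undefined when it is not an array.
function stringList(value: unknown, allowed: readonly string[]): string[] | undefined {
  if (!Array.isArray(value)) return undefined;
  return value.filter((v): v is string => typeof v === 'string' && allowed.includes(v));
}

function resolveFeedback(cap: Record<string, unknown> | null): EffectiveFeedback {
  return {
    supported: cap?.['supported'] === true,
    // Absent `targets` = `run` only (per the schema description).
    targets: stringList(cap?.['targets'], ALL_TARGETS) ?? ['run'],
    // Absent `signals` = all four kinds (per the schema description).
    signals: stringList(cap?.['signals'], ALL_SIGNALS) ?? [...ALL_SIGNALS],
  };
}
```

Passing `null` (the value `readFeedbackCap()` returns when the block is unadvertised) yields `supported: false` together with the documented defaults.
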
package/src/lib/feedback.ts ADDED

@@ -0,0 +1,31 @@
+/**
+ * Shared helpers for the RFC 0056 feedback/annotation conformance scenarios.
+ * Lives in lib/ (not a *.test.ts) so the scenarios can import it via the
+ * standard `../lib/feedback.js` path.
+ */
+import { driver } from './driver.js';
+import { isFixtureAdvertised } from './fixtures.js';
+interface DiscoveryDoc {
+  capabilities?: Record<string, unknown>;
+}
+/** Reads `capabilities.feedback` from discovery; null when unadvertised. */
+export async function readFeedbackCap(): Promise<Record<string, unknown> | null> {
+  const res = await driver.get('/.well-known/openwop');
+  const body = res.json as DiscoveryDoc | undefined;
+  const top = body?.capabilities as Record<string, unknown> | undefined;
+  const fb = top && typeof top === 'object' ? (top as Record<string, unknown>)['feedback'] : undefined;
+  return fb && typeof fb === 'object' ? (fb as Record<string, unknown>) : null;
+}
+const SEED_FIXTURE = 'conformance-a';
+/** Seeds a run via the basic `conformance-a` fixture; null (soft-skip) when
+ *  the fixture isn't advertised or creation fails. */
+export async function seedRun(tenantId: string): Promise<string | null> {
+  if (!isFixtureAdvertised(SEED_FIXTURE)) return null;
+  const r = await driver.post('/v1/runs', { workflowId: SEED_FIXTURE, tenantId, inputs: {} });
+  if (r.status !== 200 && r.status !== 201) return null;
+  return (r.json as { runId?: string } | undefined)?.runId ?? null;
+}

package/src/scenarios/feedback-capability-shape.test.ts ADDED

@@ -0,0 +1,35 @@
+/**
+ * feedback-capability-shape — RFC 0056 §A. The `capabilities.feedback`
+ * advertisement block is either absent or a well-formed object.
+ *
+ * Status: ACTIVE (advertisement-shape; always runs). Behavioral coverage
+ * lives in the sibling `feedback-*.test.ts` scenarios, gated on
+ * `capabilities.feedback.supported`.
+ *
+ * @see RFCS/0056-run-feedback-and-annotation-event.md §A
+ */
+import { describe, it, expect } from 'vitest';
+import { driver } from '../lib/driver.js';
+import { readFeedbackCap } from '../lib/feedback.js';
+describe('feedback-capability-shape: advertisement (RFC 0056 §A)', () => {
+  it('capabilities.feedback is absent or a well-formed object', async () => {
+    const cap = await readFeedbackCap();
+    if (cap === null) return; // not advertised — valid
+    expect(
+      typeof cap.supported,
+      driver.describe('capabilities.schema.json §feedback', 'capabilities.feedback.supported MUST be a boolean when present'),
+    ).toBe('boolean');
+    if (Array.isArray(cap.targets)) {
+      for (const t of cap.targets) {
+        expect(['run', 'event', 'node']).toContain(t);
+      }
+    }
+    if (Array.isArray(cap.signals)) {
+      for (const s of cap.signals) {
+        expect(['rating', 'correction', 'label', 'flag']).toContain(s);
+      }
+    }
+  });
+});

package/src/scenarios/feedback-correction-redaction.test.ts ADDED

@@ -0,0 +1,35 @@
+/**
+ * feedback-correction-redaction — RFC 0056 §E + SECURITY/invariants.yaml
+ * `annotation-content-redaction`. `signal.correction` and `note` are
+ * untrusted user content; secret-shaped material MUST be redacted under
+ * SR-1 before persistence/listing/export.
+ *
+ * @see RFCS/0056-run-feedback-and-annotation-event.md §E
+ * @see SECURITY/invariants.yaml — annotation-content-redaction
+ */
+import { describe, it, expect } from 'vitest';
+import { driver } from '../lib/driver.js';
+import { readFeedbackCap, seedRun } from '../lib/feedback.js';
+const CANARY = 'sk-canary-rfc0056-do-not-leak-abc123';
+describe('feedback-correction-redaction (RFC 0056 §E)', () => {
+  it('secret-shaped material in correction/note is redacted in the annotation list', async () => {
+    const cap = await readFeedbackCap();
+    if (cap?.supported !== true) return;
+    const runId = await seedRun('feedback-redact');
+    if (!runId) return;
+    const post = await driver.post(`/v1/runs/${runId}/annotations`, {
+      signal: { kind: 'correction', correction: `please use ${CANARY}` },
+      note: CANARY,
+    });
+    if (post.status === 501 || post.status === 404) return;
+    expect(post.status).toBe(201);
+    const list = await driver.get(`/v1/runs/${runId}/annotations`);
+    expect(
+      JSON.stringify(list.json ?? {}).includes(CANARY),
+      driver.describe('RFC 0056 §E', 'secret-shaped material MUST be redacted before persistence/listing (SR-1)'),
+    ).toBe(false);
+  });
+});

package/src/scenarios/feedback-cross-tenant-isolation.test.ts ADDED

@@ -0,0 +1,37 @@
+/**
+ * feedback-cross-tenant-isolation — RFC 0056 §E + SECURITY/invariants.yaml
+ * `annotation-cross-tenant-isolation`. A run's annotation list MUST contain
+ * only that run's annotations (mirrors CTI-1).
+ *
+ * The run-scoped check runs against any feedback host. The full cross-tenant
+ * proof (tenant B cannot read tenant A's run) needs a multi-tenant auth seam
+ * not yet standardized for this surface — that half soft-skips, mirroring
+ * `kv-cross-tenant-isolation`'s seam gate.
+ *
+ * @see RFCS/0056-run-feedback-and-annotation-event.md §E
+ * @see SECURITY/invariants.yaml — annotation-cross-tenant-isolation
+ */
+import { describe, it, expect } from 'vitest';
+import { driver } from '../lib/driver.js';
+import { readFeedbackCap, seedRun } from '../lib/feedback.js';
+describe('feedback-cross-tenant-isolation (RFC 0056 §E)', () => {
+  it('a run\'s annotation list contains only that run\'s annotations', async () => {
+    const cap = await readFeedbackCap();
+    if (cap?.supported !== true) return;
+    const runId = await seedRun('feedback-cti');
+    if (!runId) return;
+    const post = await driver.post(`/v1/runs/${runId}/annotations`, { signal: { kind: 'label', label: 'cti-probe' } });
+    if (post.status === 501 || post.status === 404) return;
+    expect(post.status).toBe(201);
+    const list = await driver.get(`/v1/runs/${runId}/annotations`);
+    const ann = (list.json as { annotations?: Array<{ target?: { runId?: string } }> } | undefined)?.annotations ?? [];
+    for (const a of ann) {
+      expect(
+        a.target?.runId,
+        driver.describe('RFC 0056 §E', 'an annotation list MUST contain only this run\'s annotations (CTI-1)'),
+      ).toBe(runId);
+    }
+  });
+});

package/src/scenarios/feedback-fork-not-copied.test.ts ADDED

@@ -0,0 +1,40 @@
+/**
+ * feedback-fork-not-copied — RFC 0056 §D. Annotations are a per-run
+ * side-store, NOT replayable event-log entries — so a fork of an annotated
+ * run starts with ZERO annotations. Gated on feedback + fork; soft-skips
+ * when either is unavailable.
+ *
+ * @see RFCS/0056-run-feedback-and-annotation-event.md §D
+ */
+import { describe, it, expect } from 'vitest';
+import { driver } from '../lib/driver.js';
+import { pollUntilTerminal } from '../lib/polling.js';
+import { readFeedbackCap, seedRun } from '../lib/feedback.js';
+describe('feedback-fork-not-copied (RFC 0056 §D)', () => {
+  it('a fork of an annotated run starts with zero annotations', async () => {
+    const cap = await readFeedbackCap();
+    if (cap?.supported !== true) return;
+    const runId = await seedRun('feedback-fork');
+    if (!runId) return;
+    const post = await driver.post(`/v1/runs/${runId}/annotations`, { signal: { kind: 'flag' } });
+    if (post.status === 501 || post.status === 404) return;
+    expect(post.status).toBe(201);
+    try {
+      await pollUntilTerminal(runId, { timeoutMs: 10_000 });
+    } catch {
+      return;
+    }
+    const fork = await driver.post(`/v1/runs/${runId}:fork`, { fromSeq: 0, mode: 'branch' });
+    if (fork.status !== 200 && fork.status !== 201) return; // fork unsupported — soft-skip
+    const forkId = (fork.json as { runId?: string } | undefined)?.runId;
+    if (!forkId) return;
+    const list = await driver.get(`/v1/runs/${forkId}/annotations`);
+    const ann = (list.json as { annotations?: unknown[] } | undefined)?.annotations ?? [];
+    expect(
+      ann.length,
+      driver.describe('RFC 0056 §D', 'annotations are a side-store and MUST NOT be copied into a fork'),
+    ).toBe(0);
+  });
+});

package/src/scenarios/feedback-on-terminal-run.test.ts ADDED

@@ -0,0 +1,32 @@
+/**
+ * feedback-on-terminal-run — RFC 0056 §C. An annotation on a COMPLETED run
+ * is accepted (proves feedback is non-blocking and post-hoc). Gated on
+ * `capabilities.feedback.supported`; soft-skips when a run can't be seeded.
+ *
+ * @see RFCS/0056-run-feedback-and-annotation-event.md §C
+ */
+import { describe, it, expect } from 'vitest';
+import { driver } from '../lib/driver.js';
+import { pollUntilTerminal } from '../lib/polling.js';
+import { readFeedbackCap, seedRun } from '../lib/feedback.js';
+describe('feedback-on-terminal-run (RFC 0056 §C)', () => {
+  it('annotating a terminal run is accepted', async () => {
+    const cap = await readFeedbackCap();
+    if (cap?.supported !== true) return;
+    const runId = await seedRun('feedback-terminal');
+    if (!runId) return;
+    try {
+      await pollUntilTerminal(runId, { timeoutMs: 10_000 });
+    } catch {
+      return; // run didn't reach terminal in time — soft-skip
+    }
+    const post = await driver.post(`/v1/runs/${runId}/annotations`, { signal: { kind: 'flag' }, note: 'post-hoc review' });
+    if (post.status === 501 || post.status === 404) return;
+    expect(
+      post.status,
+      driver.describe('RFC 0056 §C', 'a host MUST accept an annotation on a terminal run'),
+    ).toBe(201);
+  });
+});

package/src/scenarios/feedback-record-and-list.test.ts ADDED

@@ -0,0 +1,32 @@
+/**
+ * feedback-record-and-list — RFC 0056 §C. POST an annotation, then GET
+ * lists it back. Gated on `capabilities.feedback.supported` + the
+ * `conformance-a` seed fixture; soft-skips otherwise.
+ *
+ * @see RFCS/0056-run-feedback-and-annotation-event.md §C
+ */
+import { describe, it, expect } from 'vitest';
+import { driver } from '../lib/driver.js';
+import { readFeedbackCap, seedRun } from '../lib/feedback.js';
+describe('feedback-record-and-list (RFC 0056 §C)', () => {
+  it('POST an annotation then GET returns it', async () => {
+    const cap = await readFeedbackCap();
+    if (cap?.supported !== true) return;
+    const runId = await seedRun('feedback-rl');
+    if (!runId) return;
+    const post = await driver.post(`/v1/runs/${runId}/annotations`, { signal: { kind: 'rating', rating: 5 } });
+    if (post.status === 501 || post.status === 404) return;
+    expect(
+      post.status,
+      driver.describe('RFC 0056 §C', 'POST annotation returns 201 with the persisted annotation'),
+    ).toBe(201);
+    const created = post.json as { annotationId?: string };
+    expect(typeof created.annotationId).toBe('string');
+    const list = await driver.get(`/v1/runs/${runId}/annotations`);
+    expect(list.status).toBe(200);
+    const ann = (list.json as { annotations?: Array<{ annotationId?: string }> } | undefined)?.annotations ?? [];
+    expect(ann.some((a) => a.annotationId === created.annotationId)).toBe(true);
+  });
+});
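
The round trip this scenario asserts (POST returns 201 with an `annotationId`, and a subsequent GET lists that same id) can be illustrated with a small in-memory model. This is only a sketch: `InMemoryAnnotationStore`, `HttpResult`, and the `ann-N` id format are hypothetical names invented for illustration, not part of the package or of RFC 0056.

```typescript
// Hypothetical in-memory model of the two annotation endpoints.
// Nothing here is the OpenWOP host API; it only mirrors the shape the test checks.
interface Annotation {
  annotationId: string;
  runId: string;
  signal: { kind: string; rating?: number };
}

interface HttpResult<T> {
  status: number;
  json: T;
}

class InMemoryAnnotationStore {
  private readonly byRun = new Map<string, Annotation[]>();
  private nextId = 1;

  // Models POST /v1/runs/{runId}/annotations: persist, then echo the stored record.
  post(runId: string, body: { signal: Annotation['signal'] }): HttpResult<Annotation> {
    const annotation: Annotation = {
      annotationId: `ann-${this.nextId++}`, // assumed id format, for illustration only
      runId,
      signal: body.signal,
    };
    const existing = this.byRun.get(runId) ?? [];
    this.byRun.set(runId, [...existing, annotation]);
    return { status: 201, json: annotation };
  }

  // Models GET /v1/runs/{runId}/annotations: list everything recorded for the run.
  list(runId: string): HttpResult<{ annotations: Annotation[] }> {
    return { status: 200, json: { annotations: this.byRun.get(runId) ?? [] } };
  }
}

const store = new InMemoryAnnotationStore();
const created = store.post('run-1', { signal: { kind: 'rating', rating: 5 } });
const listed = store.list('run-1');

// The same membership check the scenario performs on the GET response.
const roundTripped = listed.json.annotations.some(
  (a) => a.annotationId === created.json.annotationId,
);
```

Keying the list by `runId` also means an annotation recorded on one run never appears in another run's listing, which is the property the membership check relies on.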

package/src/scenarios/feedback-unsupported-501.test.ts ADDED

@@ -0,0 +1,32 @@
+/**
+ * feedback-unsupported-501 — RFC 0056 §C. A host that does NOT advertise
+ * `capabilities.feedback.supported` MUST return `501 capability_not_provided`
+ * on the annotation endpoints (the honest signal, per `capabilities.md`) —
+ * not silently 404 the route.
+ *
+ * Soft-skips when the host advertises feedback (501 is N/A) or when the
+ * route is entirely absent (404/405 — host predates RFC 0056).
+ *
+ * @see RFCS/0056-run-feedback-and-annotation-event.md §C
+ */
+import { describe, it, expect } from 'vitest';
+import { driver } from '../lib/driver.js';
+import { readFeedbackCap } from '../lib/feedback.js';
+describe('feedback-unsupported-501 (RFC 0056 §C)', () => {
+  it('POST annotations returns 501 capability_not_provided when feedback is unadvertised', async () => {
+    const cap = await readFeedbackCap();
+    if (cap?.supported === true) return; // host supports feedback — 501 N/A
+    const res = await driver.post('/v1/runs/probe-run-rfc0056/annotations', {
+      signal: { kind: 'flag' },
+    });
+    if (res.status === 404 || res.status === 405) return; // route absent — host predates RFC 0056
+    expect(
+      res.status,
+      driver.describe('rest-endpoints.md / RFC 0056 §C', 'unadvertised feedback MUST return 501, not 404'),
+    ).toBe(501);
+    const code = (res.json as { error?: string } | undefined)?.error;
+    expect(code).toBe('capability_not_provided');
+  });
+});
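
The branching in this scenario can be summarised as a pure function from the advertised capability and the probe response to an outcome. The sketch below is illustrative only: `classifyUnsupportedProbe`, `Outcome`, and `ProbeResponse` are hypothetical names, and it mirrors the test as written, where a 404 or 405 soft-skips rather than fails, so the "501, not 404" requirement is effectively enforced only for other status codes.

```typescript
// Hypothetical helper mirroring the scenario's decision flow; not part of the package.
type Outcome = 'skip' | 'pass' | 'fail';

interface ProbeResponse {
  status: number;
  error?: string;
}

function classifyUnsupportedProbe(
  feedbackSupported: boolean | undefined,
  res: ProbeResponse,
): Outcome {
  // Host advertises feedback: the 501 contract does not apply.
  if (feedbackSupported === true) return 'skip';
  // Route entirely absent: treated as a host that predates RFC 0056.
  if (res.status === 404 || res.status === 405) return 'skip';
  // The honest signal: 501 with the capability_not_provided error code.
  if (res.status === 501 && res.error === 'capability_not_provided') return 'pass';
  return 'fail';
}

const honestHost = classifyUnsupportedProbe(false, {
  status: 501,
  error: 'capability_not_provided',
});
const legacyHost = classifyUnsupportedProbe(undefined, { status: 404 });
const misbehavingHost = classifyUnsupportedProbe(false, { status: 500 });
```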

package/src/scenarios/redaction.test.ts CHANGED

@@ -100,7 +100,10 @@ describe('redaction: /.well-known/openwop secrets+aiProviders shape contract', (
         'when secrets.supported is true, scopes MUST be non-empty',
       )).toBeGreaterThanOrEqual(1);
       for (const scope of scopes) {
-        expect(['tenant', 'user', 'run']).toContain(scope);
+        // Allowlist MUST track the `secrets.scopes` enum in capabilities.schema.json
+        // (`["tenant", "user", "run", "workspace"]`). `workspace` is the RFC 0046/0048
+        // sub-tenant scope — additive; hosts that advertise it (e.g. MyndHyve) are conformant.
+        expect(['tenant', 'user', 'run', 'workspace']).toContain(scope);
       }
       expect(s.resolution, driver.describe(
         'capabilities.md §"Secrets"',
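
The fix above keeps a literal allowlist and documents that it must track the schema enum. One way to avoid that kind of drift is to read the allowlist from the schema itself. The sketch below assumes a schema layout (`properties.secrets.properties.scopes.items.enum`) that may not match the real `capabilities.schema.json`; `capabilitiesSchemaFragment`, `allowedSecretScopes`, and `unknownScopes` are hypothetical names.

```typescript
// Hypothetical schema fragment; the real capabilities.schema.json layout may differ.
const capabilitiesSchemaFragment = {
  properties: {
    secrets: {
      properties: {
        scopes: {
          type: 'array',
          items: { enum: ['tenant', 'user', 'run', 'workspace'] },
        },
      },
    },
  },
} as const;

// Read the allowlist from the schema so the test cannot fall behind the enum.
function allowedSecretScopes(
  schema: typeof capabilitiesSchemaFragment,
): readonly string[] {
  return schema.properties.secrets.properties.scopes.items.enum;
}

// Return the advertised scopes that the schema enum does not permit.
function unknownScopes(
  advertised: readonly string[],
  allowed: readonly string[],
): string[] {
  return advertised.filter((scope) => !allowed.includes(scope));
}

const allowed = allowedSecretScopes(capabilitiesSchemaFragment);
// A host advertising `workspace` is accepted because the enum includes it.
const rejected = unknownScopes(['workspace', 'tenant'], allowed);
```

The trade-off is that a schema-derived allowlist no longer catches an accidental change to the enum itself, which a hardcoded list would flag.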

package/src/scenarios/spec-corpus-validity.test.ts CHANGED

@@ -1016,7 +1016,10 @@ describe('spec-corpus: AsyncAPI 3.1 spec is structurally valid', () => {
     const messageNames = extractAsyncApiMessageNames(raw);
     const runEventSchema = readJson(join(SCHEMAS_DIR, 'run-event.schema.json'));
     const runEventTypes = new Set(findRunEventTypeEnum(runEventSchema));
-    const syntheticMessageNames = new Set(['state.snapshot', 'ai.message.chunk', 'any']);
+    // `run.annotated` (RFC 0056) is a live SSE notification carrying an
+    // Annotation — NOT a RunEventDoc and deliberately NOT in the RunEventType
+    // enum (annotations are a side-resource, excluded from fork/replay).
+    const syntheticMessageNames = new Set(['state.snapshot', 'ai.message.chunk', 'any', 'run.annotated']);
     expect(messageNames.length, 'AsyncAPI MUST declare named SSE messages').toBeGreaterThan(0);