@openwop/openwop-conformance 1.11.0 → 1.12.0

package/CHANGELOG.md CHANGED
@@ -4,6 +4,20 @@
 
  _No unreleased changes._
 
+ ## [1.12.0] — 2026-05-31 — RFC 0087 org-chart + RFC 0083 trigger-bridge behavioral gates
+
+ Standalone conformance minor — a scenario addition published via the `openwop-conformance/v1.12.0` per-package tag (PUBLISHING.md §"CI automation"; only the `publish-conformance` job runs), NOT a coordinated spec-corpus release. `EXPECTED_CONFORMANCE_VERSION` advances to `1.12.0` in lockstep. All additive + capability/profile/seam-gated; existing v1.0-only hosts pass unchanged. Ships the three gated behavioral scenarios RFC 0087 + RFC 0083 each name in their §Conformance but deferred at `Draft → Active` — the steward prerequisite to graduating `agents.orgChart` (RFC 0087) and `triggerBridge` (RFC 0083) from `Active` to `Accepted` on a non-steward host (MyndHyve).
+
+ ### Added — RFC 0083 behavioral scenario
+
+ - **`trigger-bridge-delivery.test.ts`** (`behaviorGate('openwop-trigger-bridge', …)`, profile-gated — the `openwop-trigger-bridge` profile derived from the live discovery doc per RFC 0083 §D: the bridge advertised + a dead-letter sink + a durable source) — drives the §C delivery model via the new `POST /v1/host/sample/trigger-bridge/deliver` seam + the test event-log seam: dedup → effectively-once (≤1 `trigger.delivery.attempted{outcome:"delivered"}` per `dedupKey`, §C-1); retry-exhaustion → terminal `trigger.delivery.attempted{outcome:"dead-lettered"}` + `trigger.subscription.state.changed{toState:"dead-lettered"}` (§C-2 + RFC 0053); and a successful delivery whose resulting run's `run.started.causationId` == the delivery id (§C / RFC 0040). Both `trigger.*` events asserted content-free (SR-1). The normative `GET /v1/trigger-subscriptions` read runs black-box. **This is the RFC 0083 → Accepted bar.** New lib helper `src/lib/triggerBridge.ts`; new seam in `host-sample-test-seams.md` §"Open seams". No new schemas (`trigger-subscription.schema.json` + the two `trigger.*` payload $defs shipped at `Draft → Active`); no new SECURITY invariant.
+
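The §C-1 effectively-once rule above (at most one `trigger.delivery.attempted{outcome:"delivered"}` per `dedupKey`) can be checked over a captured event log roughly as follows. This is a minimal sketch, not the code in `src/lib/triggerBridge.ts`: the `RunEvent` envelope layout, the `findDuplicateDeliveries` helper, and the sample log are all hypothetical; only the event type, the `outcome` values, and the `dedupKey` name come from the changelog entry.

```typescript
// Hypothetical event envelope: only the event type string, `outcome`,
// and `dedupKey` are named in the changelog; the layout is an assumption.
interface RunEvent {
  type: string;
  payload: { outcome?: string; dedupKey?: string };
}

// Returns every dedupKey with more than one "delivered" attempt.
// An empty result means the log satisfies the effectively-once rule.
function findDuplicateDeliveries(events: RunEvent[]): string[] {
  // Prototype-free object so keys like "constructor" count correctly.
  const counts: Record<string, number> = Object.create(null);
  for (const ev of events) {
    if (ev.type !== "trigger.delivery.attempted") continue;
    if (ev.payload.outcome !== "delivered") continue;
    const key = ev.payload.dedupKey;
    if (key === undefined) continue;
    counts[key] = (counts[key] || 0) + 1;
  }
  return Object.keys(counts).filter((key) => counts[key] > 1);
}

// Illustrative log: "k1" is delivered twice, which violates the rule.
const sampleLog: RunEvent[] = [
  { type: "trigger.delivery.attempted", payload: { outcome: "delivered", dedupKey: "k1" } },
  { type: "trigger.delivery.attempted", payload: { outcome: "delivered", dedupKey: "k2" } },
  { type: "trigger.delivery.attempted", payload: { outcome: "dead-lettered", dedupKey: "k3" } },
  { type: "trigger.delivery.attempted", payload: { outcome: "delivered", dedupKey: "k1" } },
];

const duplicates = findDuplicateDeliveries(sampleLog); // ["k1"]
```

Counting only `delivered` outcomes keeps retries and dead-lettered attempts out of the tally, which matches the wording of the rule as quoted above.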
+ ### Added — RFC 0087 behavioral scenarios
+
+ - **`agent-org-chart-scoping.test.ts`** (`behaviorGate('openwop-org-chart-scoping', …)`, gated on `agents.orgChart.supported`) — black-box on the normative `/v1/agents/org-chart` surface (no POST seam): the `GET /v1/agents/org-chart` tree-shape (`{owner, departments, members}`; `parentDepartmentId` forms an acyclic tree; members reference `host:<id>` roster entries); the §D responsibility roll-up via `GET /v1/agents/org-chart/{departmentId}` (a deduped `responsibilities[]` union of the subtree members' RFC 0086 portfolios; `recursive=false` keeps the shape); and the RFC 0074 cross-tenant 404 via the new `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID` env var (the org-chart analog of `OPENWOP_CROSS_TENANT_ROSTER_ID`). **This is part of the RFC 0087 → Accepted bar.** New lib helper `src/lib/agentOrgChart.ts`.
+ - **`org-position-no-authority-escalation.test.ts`** (`behaviorGate('openwop-org-position-no-authority', …)`, gated on `agents.orgChart.supported`) — the BEHAVIORAL leg of the protocol-tier `org-position-no-authority-escalation` invariant: the live org-chart wire carries NO authority-bearing field (`scopes`/`canDispatch`/`permissions`/`authority`/`roleGrants`/`capabilities`) on any member / department / responsibility-view object, proving the host's projector strips position-as-authority at every install scope. The STRUCTURAL leg (the schema rejects an authority field, `additionalProperties:false`) remains always-on in `agent-org-chart-shape.test.ts`; the deeper RFC 0049 (authz-decision-invariant-to-position) + RFC 0051 (gate-not-satisfied-by-seniority) legs stay reference-impl tier because forcing them black-box needs a non-normative authz-decide hook (the `agent-manifest-runtime` confidence-escalation precedent). No new SECURITY invariant (`org-position-no-authority-escalation` already exists, exercised structurally always-on and now behaviorally here).
+ - No new schemas (`agent-org-chart.schema.json` + `org-chart-responsibility-view.schema.json` shipped at `Draft → Active`); no new POST seam (both scenarios are black-box reads + an env-var cross-tenant probe); `coverage.md` rows added.
+
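Two of the RFC 0087 checks listed above lend themselves to a short illustration: the `parentDepartmentId` links must form an acyclic tree, and no object on the org-chart wire may carry an authority-bearing field. The sketch below is an assumption-laden illustration, not the contents of `src/lib/agentOrgChart.ts`. The forbidden field names are quoted from the changelog; the `Department` shape (including a `departmentId` key on departments), the string-valued `owner`, and both helper names are hypothetical.

```typescript
// Field names quoted from the changelog entry above.
const AUTHORITY_FIELDS = [
  "scopes", "canDispatch", "permissions", "authority", "roleGrants", "capabilities",
];

// Hypothetical department shape; only `parentDepartmentId` is named in the source.
interface Department {
  departmentId: string;
  parentDepartmentId?: string;
}

// Walks any JSON-like value and returns the paths of forbidden keys.
function findAuthorityFields(value: unknown, path: string = "$"): string[] {
  let hits: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => {
      hits = hits.concat(findAuthorityFields(item, path + "[" + i + "]"));
    });
  } else if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    for (const key of Object.keys(obj)) {
      if (AUTHORITY_FIELDS.indexOf(key) !== -1) hits.push(path + "." + key);
      hits = hits.concat(findAuthorityFields(obj[key], path + "." + key));
    }
  }
  return hits;
}

// Follows parent links from every department; revisiting a node means a cycle.
function hasParentCycle(departments: Department[]): boolean {
  const parentOf: Record<string, string | undefined> = Object.create(null);
  for (const d of departments) parentOf[d.departmentId] = d.parentDepartmentId;
  for (const d of departments) {
    const seen: Record<string, boolean> = Object.create(null);
    let current: string | undefined = d.departmentId;
    while (current !== undefined) {
      if (seen[current]) return true;
      seen[current] = true;
      current = parentOf[current];
    }
  }
  return false;
}

// Illustrative payload with one deliberately leaked authority field.
const sampleChart = {
  owner: "host:owner",
  departments: [
    { departmentId: "root" },
    { departmentId: "eng", parentDepartmentId: "root" },
  ],
  members: [{ rosterId: "host:a", departmentId: "eng", canDispatch: true }],
};

const leaks = findAuthorityFields(sampleChart); // ["$.members[0].canDispatch"]
const cyclic = hasParentCycle(sampleChart.departments); // false
```

Scanning recursively rather than only at the top level reflects the entry's wording that the field must be absent on any member, department, or responsibility-view object.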
  ## [1.11.0] — 2026-05-31 — agent-platform graduation + safe-fetch + runtime-requires behavioral scenarios
 
  First independent conformance minor since `1.10.0` — a scenario addition published via the `openwop-conformance/v1.11.0` per-package tag (PUBLISHING.md §"CI automation"; only the `publish-conformance` job runs), NOT a coordinated spec-corpus release. `EXPECTED_CONFORMANCE_VERSION` advances to `1.11.0` in lockstep. All additive + capability/seam-gated; existing v1.0-only hosts pass unchanged. Net-new scenario files since the published `1.10.0`: the four RFC 0086/0077 graduation scenarios below, plus the RFC 0076 §B (`safefetch-behavior`, `safefetch-live-audit`) and §A (`runtime-requires-shape`, `runtime-requires-install-gate`) scenarios that had accumulated unreleased.
package/README.md CHANGED
@@ -93,7 +93,7 @@ Exit code is non-zero on any failed assertion.
 
  ## What's Covered
 
- The current suite has 305 scenario files under `src/scenarios/`. 2026-05-31 (RFCs 0086 + 0077 — the Active→Accepted behavioral gate) added four capability-gated behavioral scenarios so a non-steward host can be mechanically certified non-vacuously under `OPENWOP_REQUIRE_BEHAVIOR=true`: `agent-roster-attribution.test.ts` (RFC 0086 §B/§C; gated on `agents.roster.supported` — the normative `GET /v1/agents/roster` read shape + `total==roster.length`, the §C `roster.run.initiated`-before-`agent.invocation.started` ordering, the content-free payload backing `roster-attribution-no-content`, the durable work-item `triggerSubscriptionId`, and the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ROSTER_ID`), `agent-live-invocation-bracket.test.ts` (RFC 0077 §E; gated on `agents.liveRuntime.supported` — `agent.invocation.started`-first / `agent.invocation.completed`-last bracket, matching `invocationId`, `source`/`outcome` closed enums, content-free), `agent-live-structured-output.test.ts` (RFC 0077 §B step 6; gated on `agents.liveRuntime.structuredOutput` — a result violating `handoff.returnSchemaRef` fails the invocation `outcome:"failed"` rather than shipping as completed), and `agent-live-allowlist-enforced.test.ts` (RFC 0077 §F-1 / RFC 0002 §A14; gated on `agents.liveRuntime.supported` — a tool outside `toolAllowlist` is not callable); all four drive the documented `POST /v1/host/sample/roster/fire` + `POST /v1/host/sample/agents/live-invoke` seams plus the test event-log seam and soft-skip on 404 (these are the RFC 0086 / 0077 Active→Accepted bars). 
2026-05-30 (RFC 0087 — agent org-chart, Draft -> Active) added `agent-org-chart-shape.test.ts` (always-on server-free: the `capabilities.agents.orgChart` shape + the `AgentOrgChart` round-trip + the non-`host:` member negative + the **§B structural non-authority guarantee** — the schema rejects a `scopes`/`canDispatch`/`permissions`/`authority` field on a member (`additionalProperties:false`), and a member's key set is exactly `{rosterId, departmentId, roleId, reportsTo}` — backing the protocol-tier `org-position-no-authority-escalation` invariant; no new RunEventType). 2026-05-30 (RFC 0086 — standing agent roster, Draft -> Active) added `agent-roster-shape.test.ts` (always-on server-free: the `capabilities.agents.roster` shape + the `AgentRosterEntry` round-trip + the `host:` `rosterId` + `agentRef` version-XOR-channel negatives + the content-free `roster.run.initiated` negatives backing the protocol-tier `roster-attribution-no-content` invariant + the additive `roster` inventory projection + RunEventType-enum membership). 2026-05-30 (RFC 0082 — agent deployment lifecycle, Draft -> Active) added `agent-deployment-shape.test.ts` (always-on server-free: the `capabilities.agents.deployment` shape + the `AgentDeployment` record round-trip + the `AgentRef` `channel` XOR `version` `not`-clause + the four `deployment.*` payloads + the content-free negatives backing the protocol-tier `deployment-event-no-content-leak` invariant). 2026-05-30 (RFC 0085 — `openwop-agent-platform` meta-profile, Draft -> Active) added `agent-platform-profile.test.ts` (always-on server-free derivation of the operational-annex `none`/`partial`/`full` status: all-floor ⇒ partial, missing-flag ⇒ none, the replay-OR-`nondeterminismPolicy.declared` term, floor+governance ⇒ full, missing-tenant-scope ⇒ partial-not-full per the honest-advertisement rule, eval/deploy/budget-are-advisory-not-hard-terms, + the `capabilities.nondeterminismPolicy.declared` shape). 
2026-05-30 (RFC 0084 — budget, quota + cost policy, Draft -> Active) added `budget-policy-shape.test.ts` (always-on server-free: `budget-policy.schema.json` round-trip + the §A orthogonality guard — a wall-time field is rejected (it's RFC 0058's `runTimeoutMs`) — + threshold/onExhaustion negatives + the four content-free `budget.{reserved,consumed,threshold.crossed,exhausted}` payloads + the four `cap.breached{budget-*}` kinds + RunEventType-enum membership + the no-pricing-property structural check backing the protocol-tier `budget-no-pricing-leak` invariant + the `capabilities.budget`/`limits.maxBudget*` shape). 2026-05-30 (RFC 0083 — durable trigger + channel bridge, Draft -> Active) added `trigger-bridge-shape.test.ts` (always-on server-free: `trigger-subscription.schema.json` round-trip + missing-`state`/out-of-enum-`source`/unknown-property negatives + the four-state vocab + the two content-free `trigger.{subscription.state.changed,delivery.attempted}` payloads incl. closed `state`/`outcome` enums + RunEventType-enum membership + the `triggerBridge`/`webhooks.durable` capability shape + the `openwop-trigger-bridge` profile derivation incl. the no-dead-letter-sink negative). 2026-05-30 (RFC 0079 — credential provenance + egress policy, Draft -> Active) added `egress-provenance-shape.test.ts` (always-on server-free: `credential-provenance.schema.json` round-trip + `audiences:[]`/missing-`credentialId`/unknown-property negatives + the no-secret-property structural check backing the protocol-tier `egress-decision-no-secret-leak` invariant + the content-free `egress.decided` record incl. the `decision` enum + RunEventType-enum membership + the `httpClient.egressPolicy` shape; the behavioral `egress-credential-audience-bound` confused-deputy MUST is reference-impl tier, deferred to a host). 
2026-05-30 (RFC 0078 — portable tool catalog, Draft -> Active) added `tool-descriptor-shape.test.ts` (always-on server-free: `tool-descriptor.schema.json` round-trip + the §C-1 `exec` ⇒ `host-extension` cross-field MUST (RFC 0069) + the `safetyTier`-required negative + `additionalProperties:false`, the `capabilities.toolCatalog` `supported`/`sources`/`sessionLifecycle` shape, and the two content-free `tool.session.{opened,closed}` payload $defs incl. the closed `outcome` enum + RunEventType-enum membership). 2026-05-30 (RFC 0080 — agent memory capability reconciliation, Draft -> Active) added `memory-capability-model-shape.test.ts` (always-on server-free: the additive `capabilities.memory.{writable,search,retention}` dimension shapes + malformed-instance negatives — `retention.ttl` non-boolean, out-of-enum `search.modes`, unknown property under `additionalProperties:false` — the `agent-inventory-response` `memoryDegraded`/`degradedMemoryDimensions` closed-enum fields, and the `openwop-memory` derivation surfacing for read/write + long-term hosts while withholding from `writable:false`). 2026-05-30 (RFC 0081 — agent evaluation, Draft -> Active) added `agent-eval-suite-shape.test.ts` (always-on server-free: the `capabilities.agents.evalSuite` shape + the `AgentEvalSuite`/`EvalSummary` schema round-trips + the three `eval.{started,scored,completed}` payloads + the content-free negatives — a task entry with a `taskOutput` body, a `safetyFinding` with an `excerpt` — backing the new `eval-summary-no-content-leak` SECURITY invariant). 
2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch` live-run audit) added `safefetch-live-audit.test.ts` (`behaviorGate('openwop-safefetch-live-audit', …)`, gated on `httpClient.safeFetch` + `toolHooks.prePostEvents`) — asserts the audit-when-both MUST against the **durable run event log** via the new `POST /v1/host/sample/http/safe-fetch-run` open seam + the test event-log seam, closing the seam-vs-production gap (a production `createSafeFetch()` with no audit hooks passes the inline `safefetch-behavior.test.ts` but FAILS this under `OPENWOP_REQUIRE_BEHAVIOR=true`); this is the RFC 0076 §B → Accepted bar; run seam soft-skips on 404 (host-pending). 2026-05-29 (RFC 0066 — `x-openwop-form` picker UX hints, Draft → Active) added `x-openwop-form-pack-manifest.test.ts` (always-on server-free: an annotated `configSchema` stays a valid 2020-12 schema + the advisory hints don't change what it accepts, each §A annotation matches the shape, an unknown `kind` validates for forward-compat, 3 negatives — missing/non-string `kind`, non-string `dependsOn`). 2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch`) added `safefetch-behavior.test.ts` (seam-gated: SSRF block / DNS-rebinding / `Connection: upgrade` refusal / tool-hooks audit-when-both, via `POST /v1/host/sample/http/safe-fetch`; advertisement contract stays in `http-client-ssrf.test.ts`). 2026-05-29 (RFC 0076 §A — pack `runtime.requires[]` install gate) added two: `runtime-requires-shape.test.ts` (server-free closed-vocabulary validation — the 8 tokens validate, a raw builtin name is rejected, empty-array≡omission, `uniqueItems`) + `runtime-requires-install-gate.test.ts` (seam-gated install-grant / install-refuse → `pack_runtime_requirement_unmet` / non-sandbox SHOULD-projection, soft-skip on 404 via `POST /v1/host/sample/packs/install-gate`). 
2026-05-29 (RFC 0047 — `host.oauth` authorization-code roundtrip) added `oauth-authorization-code-roundtrip.test.ts` — capability-gated on `capabilities.oauth.supported` + `grants` including `authorization_code`; drives the `POST /v1/host/sample/oauth/authorize-code-roundtrip` seam against the one canonical synthetic provider in `fixtures/oauth-providers/synthetic.json` (soft-skip on 404, Tier-2 host-pending), asserting a successful grant returns a credential REFERENCE (token persisted as a `host.credentials` entry) and that the authorization code / state / PKCE verifier / acquired access+refresh tokens never appear on any run-visible surface (RFC 0047 §C + §C.2 / `credential-payload-redaction`). Closes the RFC 0047 Tier-2 gap (capability-shape + redaction scenarios existed; the actual authorization-code dance was unexercised). 2026-05-26 (RFC 0070 — agent-manifest runtime) added `agent-manifest-runtime.test.ts`; 2026-05-26 (RFC 0071 — artifact-type + chat card packs) added six: `artifact-type-pack-manifest-validation.test.ts` + `artifact-schema-compile-bounded.test.ts` (server-free) + `artifact-type-pack-install.test.ts` + `artifact-type-store-without-render.test.ts` + `chat-card-pack-manifest-validation.test.ts` (server-free) + `chat-card-pack-execution.test.ts` (capability-gated, host-pending). 
2026-05-26 (RFCs 0067 / 0068 / 0069 — spec-gap Draft cohort) added five scenarios: `byok-auth-modes.test.ts` (RFC 0067; always-on schema-shape of `aiProviders.authModes` + a discovery-gated §B auth-mode-contract cross-field check), `memory-consolidation-shape.test.ts` (RFC 0068; always-on shape of `agents.memoryConsolidation`/`agents.commitments` + the `agent.memory.consolidated`/`commitment.fired` payload $defs), `memory-consolidation-idempotent.test.ts` + `commitment-fired.test.ts` (RFC 0068; capability-gated behavioral, soft-skip on the documented `/v1/host/sample/memory/consolidate` + `/commitment/fire` seams), and `exec-not-protocol-tier.test.ts` (RFC 0069; always-on server-free structural assertion that the protocol corpus defines no `core.*`/`openwop.*` exec-class primitive — backs the `exec-must-not-be-protocol-tier` SECURITY invariant). 2026-05-25 (RFC 0061 — stateful agent-loop lifecycle, executionModel.version 5) added four `agent-loop-*.test.ts` scenarios: `-version5-shape` (always-on; validates `executionModel.statefulResume`/`transcriptWindow` + the 1–5 version ceiling) plus `-iteration-monotonic` (gated on `version >= 5`; `runOrchestrator.decided.iteration` increments 1,2,3… exactly once per turn), `-workspace-snapshot` (gated additionally on `host.workspace.supported`; a turn-i workspace write is invisible to turn i, visible to turn i+1), and `-stateful-resume` (gated on `statefulResume`; a mid-loop suspend resumes at the same iteration without resetting the counter) — the three behavioral scenarios drive the documented agent-loop seam (`POST /v1/host/sample/agentloop/run`) and soft-skip until a host wires it. 
2026-05-25 (RFC 0059 — host.workspace M2, reference-host enforcement) added two `workspace-*.test.ts` scenarios: `-behavior` (capability-gated CRUD round-trip / `If-Match` 409 `workspace_conflict` / `workspace_too_large` / §D run-start snapshot, all via the real `/v1/host/workspace/files` §C endpoints) and `-cross-tenant-isolation` (WCT-1 — drives the documented `POST /v1/host/sample/workspace/op` seam to assert a file owned by one `{tenant, workspace}` is unreadable, on both `get` and `list`, under a different owner; backs the new `workspace-cross-tenant-isolation` SECURITY invariant). The in-memory reference host now advertises `capabilities.workspace.supported` and honors §C/§D/§E end-to-end. 2026-05-25 (RFC 0062 — memory.distillation "dreams") added five `distillation-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.memory.distillation` block + the additive `distillation` sub-object on `memory.compacted`) plus `-token-budget` (within budget `tokensUsed ≤ tokenBudget`; an un-meetable budget → `token_budget_exceeded` with no partial archive), `-stable-archive` (same sources + budget ⇒ byte-stable archive checksum), `-index-roundtrip` (gated additionally on `indexEmitted`; the `MEMORY-INDEX.json` workspace file is retrievable + `workspace.updated` fired), and `-secret-carryforward` (SR-1: a redacted source secret never appears in the archive) — the four behavioral scenarios drive the documented memory-distillation seam (`POST /v1/host/sample/memory/distill`) and soft-skip until a host wires it. 
2026-05-25 (RFC 0063 — core.subWorkflow.outputAttestation) added four `subrun-*.test.ts` scenarios: `-attestation-shape` (always-on; validates the `capabilities.agents.subRunAttestation` flag) plus `-checksum-stable` (the child output checksum is the byte-stable, key-order-invariant RFC 8785 JCS + SHA-256 digest), `-approval-gate` (`requireApproval` → `accept` merges, `reject` does not), and `-approval-fail-closed` (no `accept`/`edit-accept` → no merge; backs the deferred `subrun-merge-approval-fail-closed` invariant) — the three behavioral scenarios drive the documented sub-run attestation seam (`POST /v1/host/sample/subrun/attest`) and soft-skip until a host wires it. 2026-05-25 (RFC 0064 — host.toolHooks) added five `tool-hooks-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.toolHooks` block + the optional content-free fields on `agentToolCalled` / `agentToolReturned`) plus `-content-free` (gated on `prePostEvents`), `-authorization-fail-closed` (gated on `perToolAuthorization`), `-rate-limit` (gated on `perToolRateLimit`), and `-secret-redaction` (gated on `prePostEvents` + the SR-1 `argsHash` redaction rule) — the four behavioral scenarios drive the documented tool-hooks invoke seam (`POST /v1/host/sample/toolhooks/invoke`) and soft-skip until a host wires it. 2026-05-25 (RFC 0060 — host.heartbeat) added four `heartbeat-*.test.ts` scenarios: `-capability-shape` (always-on; validates the `capabilities.heartbeat` block) plus `-fires-once-per-tick`, `-idempotent-no-spam`, and `-runtime-bound` (gated on `capabilities.heartbeat.supported` + the host heartbeat tick seam; soft-skip until a host wires it). 
2026-05-25 (RFC 0057 — memory write-attribution) added five `memory-attribution-*.test.ts` scenarios: `-shape` (always-on advertisement check on `capabilities.memory.attribution`), plus `-no-content`, `-tenant-scoped`, `-emits-on-write`, and `-replay-stable` (gated on `capabilities.memory.attribution.emitsWriteEvents`) verifying the content-free `memory.written` RunEvent, its two SECURITY invariants (`memory-attribution-no-content` + `memory-attribution-tenant-scoped`), and the §D replay rule that a `replay`-mode fork MUST NOT regenerate `memoryId`. 2026-05-25 (RFC 0025 §C point 1 — test-catalog isolation invariant; pairs with the 25 publish-error scenarios in `pack-registry-publish.test.ts`) added `pack-registry-isolation.test.ts` — capability-gated on `capabilities.packs.testMode.{supported, isolated}: true`; PUTs a disposable pack into `/v1/packs-test/{name}` and asserts the same `(name, version)` does NOT appear via `GET /v1/packs/{name}` — anchors the test-catalog isolation MUST in RFC 0025 §C. 2026-05-25 (RFC 0028 Tier-2 post-promotion T2 — read-side sister scenario for workspace-membership enforcement) added `prompt-read-workspace-membership-enforced.test.ts` — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` so read-only hosts that expose `?workspaceId=` are also probed); drives `GET /v1/prompts?workspaceId=<random-non-member>` and interprets the response: 4xx PASS (canonical envelope check on 403); 200 with empty `templates[]` PASS (correct null result for a nonexistent workspace); 200 with non-empty `templates[]` FAIL (cross-tenant leak); 200 without `templates[]` field SKIP (host doesn't expose workspace-scoped reads). Verifies SECURITY invariant `prompt-read-workspace-membership-enforced`. Same-day T1 strengthened `prompt-mutation-workspace-membership-enforced.test.ts` to pin `error === "workspace_membership_required"` when the host's refusal status is 403 (other refusal codes unconstrained). 
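The response interpretation described above for `prompt-read-workspace-membership-enforced.test.ts` amounts to a small decision table. The following is a hedged sketch of that table only; `classifyNonMemberRead` and the `Verdict` type are hypothetical names, the canonical envelope check on 403 is not modeled, and statuses the text does not mention are reported as `unspecified` rather than guessed.

```typescript
// "unspecified" covers statuses the README text does not describe.
type Verdict = "pass" | "fail" | "skip" | "unspecified";

// Classifies a response to GET /v1/prompts?workspaceId=<random-non-member>.
function classifyNonMemberRead(status: number, body: unknown): Verdict {
  // Any 4xx refusal passes.
  if (status >= 400 && status < 500) return "pass";
  if (status === 200) {
    const templates =
      body !== null && typeof body === "object"
        ? (body as Record<string, unknown>)["templates"]
        : undefined;
    // No `templates[]` field: the host does not expose workspace-scoped reads.
    if (!Array.isArray(templates)) return "skip";
    // Empty list is a correct null result; a non-empty list is a cross-tenant leak.
    return templates.length === 0 ? "pass" : "fail";
  }
  return "unspecified";
}

const refused = classifyNonMemberRead(403, { error: "workspace_membership_required" }); // "pass"
const leaked = classifyNonMemberRead(200, { templates: [{ id: "t1" }] }); // "fail"
```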
2026-05-25 (RFC 0028 Tier-2 follow-up — workspace-membership enforcement on mutating prompt endpoints, filed in response to a self-disclosed adopter vulnerability) added `prompt-mutation-workspace-membership-enforced.test.ts` — capability-gated on `capabilities.prompts.mutableLibrary: true`; drives `POST /v1/prompts` with a cryptographically-random non-member `workspaceId` and asserts the host refuses (NOT a 2xx; any 4xx/5xx is acceptable — silent success is the failure mode). Verifies SECURITY invariant `prompt-mutation-workspace-membership-enforced`. 2026-05-22 (RFC 0034 §B follow-up — secret-leakage harness against the OTel + debug-bundle seams) added `secret-leakage-otel-attribute.test.ts` — gates on `capabilities.secrets.supported` + `capabilities.observability.testSeams.{otelScrape,debugBundleExport}` AND the `OPENWOP_CANARY_SECRET_VALUE` env (host operator + conformance runner agree on the canary). Drives the existing `openwop-smoke-byok-roundtrip` fixture end-to-end; scrapes both seams after run completion; hard-fails if the canary plaintext appears in any OTel span attribute or debug-bundle field. Verifies SECURITY invariants `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel`. 2026-05-22 (RFC 0041 Phase 4 — replay determinism under nondeterministic models) added three scenarios: `replay-divergence-at-refusal.test.ts` (advertisement-shape probe on `replayDeterminism.refusalDivergenceEmission` + 2 `it.todo` for the dual-direction refusal-divergence case), `replay-observable-sequence-determinism.test.ts` (capability-gated; behavioral assertion soft-skipped until a `conformance-phase4-nondet-tool` fixture ships), `replay-llm-cache-key-portable.test.ts` (intra-host reproducibility + non-recipe-field invariance + Phase 4 advertisement alignment — reuses the existing `POST /v1/host/sample/test/llm-cache-key` seam from the sibling `replay-llm-cache-key.test.ts`). 
2026-05-20 (RFC 0027 §A templateKinds-coverage follow-up — paired with `prompt-end-to-end-events.test.ts`) added `prompt-all-four-kinds-events.test.ts` exercising all four `PromptKind` values (`system`, `user`, `schema-hint`, `few-shot`) end-to-end through the reference workflow-engine sample's `local.sample.demo.mock-ai` dispatch path; capability-gated via `behaviorGate('prompts-supported', ...)`. Closes the credibility gap where the host advertised `templateKinds: ["system", "user", "few-shot", "schema-hint"]` but only the system+user pair was actually wired into dispatch. 2026-05-20 (RFCs 0030–0033 — envelope LLM-contract-hardening track) added 15 scenarios across four `Active` RFCs: `envelope-reasoning-shape.test.ts` (RFC 0030, always-on; asserts the OPTIONAL `reasoning` property on the three universal-kind schemas + the `schema.response` deliberate omission), `envelope-reasoning-secret-redaction.test.ts` (RFC 0030, capability-gated on `capabilities.envelopes.reasoning.supported` + `secrets.supported`; 5 `it.todo()` placeholders for SECURITY invariant `envelope-reasoning-secret-redaction`), `envelope-tier-one-subset-static.test.ts` (RFC 0030, always-on for load-bearing rules — no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere; gated on `tierOneSubsetCompliance: "strict"` for OpenAI-strict-only constraints), `envelope-variant-discriminator-static.test.ts` (RFC 0031, always-on; asserts no `oneOf` + every `anyOf` branch declares a single-string-enum discriminator in `required` on every `schemas/envelopes/*.schema.json`), `model-capability-substituted.test.ts` (RFC 0031, advertisement-shape probe on `capabilities.modelCapabilities.advertised[]` identifier pattern + 5 `it.todo()` placeholders for SECURITY invariant `model-capability-substituted-no-credential-disclosure`), `model-capability-insufficient.test.ts` (RFC 0031, 6 `it.todo()` placeholders for refusal + no-recursive-fallback), `node-module-required-capabilities-shape.test.ts` (RFC 0031 
SHOULD-tier authoring-convention; 4 `it.todo()` placeholders), and the six envelope-reliability events from RFC 0032 (`envelope-retry-attempted` carrying the shared advertisement-shape probe enforcing both MUST-tier events in `events[]` per RFC 0032 §C, plus `envelope-retry-exhausted`, `envelope-refusal-shape`, `envelope-truncated`, `envelope-nl-to-format-engaged`, `envelope-recovery-applied` — collectively 39 `it.todo()` placeholders covering retry/refusal/truncation/recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` and `envelope-recovery-no-content-leak`), plus RFC 0033's two scenarios (`envelope-completion-distinguishes-truncation.test.ts` + `envelope-truncation-cap-exhaustion.test.ts` — 12 `it.todo()` placeholders covering the truncation-vs-schema-violation retry-routing distinction + the DoS-bound assertion). Reference workflow-engine sample advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"` honestly (schemas accept the field; host doesn't yet inject the directive); the other three RFCs' capability blocks defer to reference-host emission code per the staged RFC 0027 §G precedent. 2026-05-20 (RFC 0028 §B Phase B — prompt-pack boot-time install) added `prompt-pack-install.test.ts` (capability-gated on `capabilities.prompts.endpointsSupported: true`; asserts a host that ran the boot-time pack loader surfaces ≥ 1 pack-source template under `GET /v1/prompts?source=pack` carrying the canonical `meta.source: "pack"` + `meta.packName` + `meta.packVersion` stamps; positively identifies the in-tree `vendor.openwop.prompt-sample` reference pack's `writer-system` template when present). Pairs with the new `host/promptPackLoader.ts` boot-time entry on the reference workflow-engine sample, which scans `examples/packs/*` plus `OPENWOP_PROMPT_PACKS_DIR` and calls `installPackTemplates()` for each `kind: "prompt"` pack found. 
2026-05-20 (RFC 0029 Phase C — prompt resolution chain wire shape) added three more scenarios: `prompt-resolution-chain-node-wins.test.ts` (capability-gated on `capabilities.prompts.supported: true`; asserts layer-1 node-config supersedes lower layers per `spec/v1/prompts.md` §"Resolution chain (normative)"), `prompt-resolution-chain-agent-intrinsic.test.ts` (additionally gated on `capabilities.prompts.agentBindings: true`; asserts agent intrinsic `systemPromptRef` wins over `promptOverrides` AND lower layers when the node has no layer-1 ref), `prompt-resolution-chain-fallback-cascade.test.ts` (asserts layer 3 workflow-defaults wins over layer 4 host-defaults; layer 4 host-defaults wins when 1-3 yield null; resolved is null when all four yield null but chain[] still lists every attempted layer). The scenarios drive the host's `POST /v1/host/sample/prompt/resolve` test seam (reference-host implementation deferred to follow-up slice per RFC 0021 staging precedent). 2026-05-20 (RFC 0027 Phase A — prompt templates wire shape) added three scenarios: `prompt-template-shape.test.ts` (always-on; Ajv compileability + positive/negative round-trip for PromptTemplate + PromptRef + PromptKind), `prompt-composed-secret-redaction.test.ts` (capability-gated on `capabilities.prompts.supported: true` + `observability: "full"`; asserts `[REDACTED:<secretId>]` markers in `prompt.composed` payloads for `source: "secret"` variable bindings per SECURITY/threat-model-secret-leakage.md §SR-1), `prompt-composed-trust-marker.test.ts` (same capability gates; asserts `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation per RFC 0020 §D). Paired with new `fixtures/prompt-templates/` sub-directory + per-fixture schema-validity describe block + future SECURITY invariants `prompt-composed-secret-redaction` and `prompt-composed-trust-marker` (lands alongside reference-host emission per RFC 0021 staging precedent). 
2026-05-18 (RFC 0022 `Draft` — runtime variable mapping) added four `it.todo()` placeholder scenarios covering the new mapping surfaces on `core.dispatch` (§A — `dispatch-input-mapping.test.ts`, `dispatch-output-mapping.test.ts`, `dispatch-cross-worker-handoff.test.ts`) and `core.subWorkflow` (§B — `subworkflow-input-mapping.test.ts`). Gated on `capabilities.agents.dispatchMapping` (dispatch trio) and `capabilities.subWorkflow.inputMapping` (subWorkflow). Promote to live assertions when RFC 0022 reaches `Active` + a reference host advertises the matching flags. 2026-05-17 (RFC 0003 §D handoff-schema enforcement, HV-1) added `agentPackHandoffSchemaValidation.test.ts` — verifies the host validates dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef` per RFC 0003 §D. Paired with the new `agent-pack-handoff-schema-enforcement` row in `SECURITY/invariants.yaml`. 2026-05-17 (AI Envelope gap-closure, DRAFT v1.x — `spec/v1/ai-envelope.md`) added 7 advertisement-shape scenarios with `it.todo()` behavioral placeholders gated on `capabilities.envelopeContracts.advertised: true`: `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`. Paired with the new `envelope-redaction-sr-1-carry-forward` row in `SECURITY/invariants.yaml`. 2026-05-17 (post-publish hardening, deep audit of `core.openwop.agents`) added `agents-run-tool-allowlist.test.ts` — server-free scenario locking in the `core.openwop.agents@1.0.1` safety-fix that closes `OPENWOP-AUDIT-2026-003` (function-typed `tool.handler` properties rejected at `validateTools()` with `INVALID_TOOL_DECLARATION`; tool-driven runs require `ctx.agentRuntime`; tool-less safe fallback preserved). Paired with the new `agents-run-no-raw-handler` row in `SECURITY/invariants.yaml`. 
Same-day post-publish hardening added `idempotency-key-determinism.test.ts` — server-free scenario locking in the `core.openwop.http@1.1.2` determinism safety-fix (default `composite` mode produces deterministic keys in `(runId, nodeId, payload)`; removed `uuid` mode rejects with `CONFIG_INVALID`; cross-impl vector test lets third-party reimplementations verify wire agreement). Paired with the new `idempotency-key-deterministic` row in `SECURITY/invariants.yaml`. 2026-05-17 (Phase 3 of RFC 0013) added three server-free scenarios exercising the reference workflow-chain expansion library (`conformance/src/lib/workflow-chain-expansion.ts`): `workflow-chain-expansion.test.ts` (parameter substitution + node id collision avoidance + edge rewriting + capability propagation + runtime-invariance contract), `workflow-chain-unresolvable-typeid.test.ts` (rejection with `chain_unresolvable_typeid` when a chain references an unknown typeId), and `workflow-chain-pack-signature-verification.test.ts` (Ed25519 verification recipe reuse from `node-packs.md §Signing`). Earlier that day (Phase 1) added `workflow-chain-pack-manifest-validation.test.ts` — server-free schema-validation scenario covering the new `workflow-chain-pack-manifest.schema.json` (positive sample + two negatives: kind/contents mismatch and invalid `chainId`). Closes RFC 0013 (`Workflow-chain packs`, `Draft`) Phases 1 + 3 alongside the new `spec/v1/workflow-chain-packs.md`, the `Capabilities.workflowChainPacks` block, and the registry build-index/conformance-check `kind` routing from Phase 2. Earlier that day, the suite added 27 `it.todo()` placeholder scenarios paired with RFCs 0014-0020 (host capability surfaces — fs, kvStorage, tableStorage, queueBus, sql/vector/search, blob/cache, mcp.serverMount). These promote to live assertions when each RFC reaches `Active` + the matching capability block lands in `schemas/capabilities.schema.json` + a reference host advertises the capability. 
Earlier additions include 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing SR-1 §D), `memory-compaction-event-emitted.test.ts` 
(canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). 2026-05-15 (gap-closure CF-3) added `interrupt-token-matrix.test.ts` (malformed / unknown / replay / cross-run-id paths on `GET|POST /v1/interrupts/{token}`). The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
96
+ The current suite has 308 scenario files under `src/scenarios/`. 2026-05-31 (RFC 0083 — durable trigger bridge, the Active→Accepted behavioral gate) added `trigger-bridge-delivery.test.ts` (profile-gated on `openwop-trigger-bridge` derived from the live discovery doc — the §C delivery model via the `POST /v1/host/sample/trigger-bridge/deliver` seam + the test event-log seam: dedup→effectively-once `trigger.delivery.attempted{delivered}` (§C-1), retry-exhaustion→`{dead-lettered}` + `trigger.subscription.state.changed{toState:dead-lettered}` (§C-2 + RFC 0053), and the delivered run's `run.started.causationId` == the delivery id (§C / RFC 0040); both `trigger.*` events content-free; the always-on shape stays in `trigger-bridge-shape.test.ts`; new lib helper `src/lib/triggerBridge.ts`). 2026-05-31 (RFC 0087 — agent org-chart, the Active→Accepted behavioral gate) added two capability-gated behavioral scenarios (both gated on `agents.orgChart.supported`, black-box on the normative `/v1/agents/org-chart` surface — no new POST seam): `agent-org-chart-scoping.test.ts` (the `GET /v1/agents/org-chart` tree-shape — departments form an acyclic `parentDepartmentId` tree, members reference `host:<id>` roster entries — + the §D responsibility roll-up via `GET /v1/agents/org-chart/{departmentId}` with a deduped `responsibilities[]` union + the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID`) and `org-position-no-authority-escalation.test.ts` (the behavioral leg of the protocol-tier invariant — the live org-chart wire carries NO authority-bearing field on any member/department/responsibility-view object; the structural leg stays always-on in `agent-org-chart-shape.test.ts`, and the deeper RFC 0049/0051 authority-invariance legs stay reference-impl tier per the `agent-manifest-runtime` no-host-hook precedent). 
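The tree-shape property described for `agent-org-chart-scoping.test.ts` (departments form an acyclic `parentDepartmentId` tree) can be illustrated with a small check. This is a minimal sketch, not the scenario's actual code: the `Department` interface, its `id` field name, and the function name are assumptions made for illustration, and treating a dangling parent reference as a failure is likewise an assumption.

```typescript
// Hypothetical shape: only `parentDepartmentId` is named in the notes above;
// the `id` field name is assumed for illustration.
interface Department {
  id: string;
  parentDepartmentId?: string | null;
}

// Returns true when following parentDepartmentId links from every department
// reaches a root without revisiting a node. A parent id that is not present
// in the chart is treated as a failure (an assumption of this sketch).
function isAcyclicDepartmentTree(departments: Department[]): boolean {
  const byId = new Map<string, Department>();
  for (const d of departments) byId.set(d.id, d);

  for (const start of departments) {
    const seen = new Set<string>();
    let current: Department | undefined = start;
    while (current) {
      if (seen.has(current.id)) return false; // walked into a cycle
      seen.add(current.id);
      const parentId = current.parentDepartmentId;
      if (parentId == null) break; // reached a root department
      current = byId.get(parentId);
      if (!current) return false; // parent not in the chart
    }
  }
  return true;
}
```

The walk is quadratic in the worst case, which is acceptable for a conformance assertion over a small chart.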
2026-05-31 (RFCs 0086 + 0077 — the Active→Accepted behavioral gate) added four capability-gated behavioral scenarios so a non-steward host can be mechanically certified non-vacuously under `OPENWOP_REQUIRE_BEHAVIOR=true`: `agent-roster-attribution.test.ts` (RFC 0086 §B/§C; gated on `agents.roster.supported` — the normative `GET /v1/agents/roster` read shape + `total==roster.length`, the §C `roster.run.initiated`-before-`agent.invocation.started` ordering, the content-free payload backing `roster-attribution-no-content`, the durable work-item `triggerSubscriptionId`, and the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ROSTER_ID`), `agent-live-invocation-bracket.test.ts` (RFC 0077 §E; gated on `agents.liveRuntime.supported` — `agent.invocation.started`-first / `agent.invocation.completed`-last bracket, matching `invocationId`, `source`/`outcome` closed enums, content-free), `agent-live-structured-output.test.ts` (RFC 0077 §B step 6; gated on `agents.liveRuntime.structuredOutput` — a result violating `handoff.returnSchemaRef` fails the invocation `outcome:"failed"` rather than shipping as completed), and `agent-live-allowlist-enforced.test.ts` (RFC 0077 §F-1 / RFC 0002 §A14; gated on `agents.liveRuntime.supported` — a tool outside `toolAllowlist` is not callable); all four drive the documented `POST /v1/host/sample/roster/fire` + `POST /v1/host/sample/agents/live-invoke` seams plus the test event-log seam and soft-skip on 404 (these are the RFC 0086 / 0077 Active→Accepted bars). 
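The bracket property named for `agent-live-invocation-bracket.test.ts` (`agent.invocation.started` first, `agent.invocation.completed` last, matching `invocationId`) could be checked roughly as follows. This is a sketch under assumptions: the `RunEvent` envelope shape (`type` plus `payload.invocationId`) and the function name are hypothetical, and the input is assumed to be the already-filtered event slice for one invocation.

```typescript
// Hypothetical event envelope; only the event type names and `invocationId`
// come from the notes above.
interface RunEvent {
  type: string;
  payload: { invocationId?: string };
}

// Returns a list of human-readable problems; an empty list means the slice
// is bracketed by started/completed events that share one invocationId.
function checkInvocationBracket(events: RunEvent[]): string[] {
  if (events.length < 2) {
    return ["expected at least a started and a completed event"];
  }
  const problems: string[] = [];
  const first = events[0];
  const last = events[events.length - 1];

  if (first.type !== "agent.invocation.started") {
    problems.push("first event is not agent.invocation.started");
  }
  if (last.type !== "agent.invocation.completed") {
    problems.push("last event is not agent.invocation.completed");
  }
  const startId = first.payload.invocationId;
  if (startId === undefined || startId !== last.payload.invocationId) {
    problems.push("invocationId does not match across the bracket");
  }
  return problems;
}
```

Returning a problem list rather than a boolean keeps the failure message specific when more than one rule is violated.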
2026-05-30 (RFC 0087 — agent org-chart, Draft -> Active) added `agent-org-chart-shape.test.ts` (always-on server-free: the `capabilities.agents.orgChart` shape + the `AgentOrgChart` round-trip + the non-`host:` member negative + the **§B structural non-authority guarantee** — the schema rejects a `scopes`/`canDispatch`/`permissions`/`authority` field on a member (`additionalProperties:false`), and a member's key set is exactly `{rosterId, departmentId, roleId, reportsTo}` — backing the protocol-tier `org-position-no-authority-escalation` invariant; no new RunEventType). 2026-05-30 (RFC 0086 — standing agent roster, Draft -> Active) added `agent-roster-shape.test.ts` (always-on server-free: the `capabilities.agents.roster` shape + the `AgentRosterEntry` round-trip + the `host:` `rosterId` + `agentRef` version-XOR-channel negatives + the content-free `roster.run.initiated` negatives backing the protocol-tier `roster-attribution-no-content` invariant + the additive `roster` inventory projection + RunEventType-enum membership). 2026-05-30 (RFC 0082 — agent deployment lifecycle, Draft -> Active) added `agent-deployment-shape.test.ts` (always-on server-free: the `capabilities.agents.deployment` shape + the `AgentDeployment` record round-trip + the `AgentRef` `channel` XOR `version` `not`-clause + the four `deployment.*` payloads + the content-free negatives backing the protocol-tier `deployment-event-no-content-leak` invariant). 2026-05-30 (RFC 0085 — `openwop-agent-platform` meta-profile, Draft -> Active) added `agent-platform-profile.test.ts` (always-on server-free derivation of the operational-annex `none`/`partial`/`full` status: all-floor ⇒ partial, missing-flag ⇒ none, the replay-OR-`nondeterminismPolicy.declared` term, floor+governance ⇒ full, missing-tenant-scope ⇒ partial-not-full per the honest-advertisement rule, eval/deploy/budget-are-advisory-not-hard-terms, + the `capabilities.nondeterminismPolicy.declared` shape). 
2026-05-30 (RFC 0084 — budget, quota + cost policy, Draft -> Active) added `budget-policy-shape.test.ts` (always-on server-free: `budget-policy.schema.json` round-trip + the §A orthogonality guard — a wall-time field is rejected (it's RFC 0058's `runTimeoutMs`) — + threshold/onExhaustion negatives + the four content-free `budget.{reserved,consumed,threshold.crossed,exhausted}` payloads + the four `cap.breached{budget-*}` kinds + RunEventType-enum membership + the no-pricing-property structural check backing the protocol-tier `budget-no-pricing-leak` invariant + the `capabilities.budget`/`limits.maxBudget*` shape). 2026-05-30 (RFC 0083 — durable trigger + channel bridge, Draft -> Active) added `trigger-bridge-shape.test.ts` (always-on server-free: `trigger-subscription.schema.json` round-trip + missing-`state`/out-of-enum-`source`/unknown-property negatives + the four-state vocab + the two content-free `trigger.{subscription.state.changed,delivery.attempted}` payloads incl. closed `state`/`outcome` enums + RunEventType-enum membership + the `triggerBridge`/`webhooks.durable` capability shape + the `openwop-trigger-bridge` profile derivation incl. the no-dead-letter-sink negative). 2026-05-30 (RFC 0079 — credential provenance + egress policy, Draft -> Active) added `egress-provenance-shape.test.ts` (always-on server-free: `credential-provenance.schema.json` round-trip + `audiences:[]`/missing-`credentialId`/unknown-property negatives + the no-secret-property structural check backing the protocol-tier `egress-decision-no-secret-leak` invariant + the content-free `egress.decided` record incl. the `decision` enum + RunEventType-enum membership + the `httpClient.egressPolicy` shape; the behavioral `egress-credential-audience-bound` confused-deputy MUST is reference-impl tier, deferred to a host). 
2026-05-30 (RFC 0078 — portable tool catalog, Draft -> Active) added `tool-descriptor-shape.test.ts` (always-on server-free: `tool-descriptor.schema.json` round-trip + the §C-1 `exec` ⇒ `host-extension` cross-field MUST (RFC 0069) + the `safetyTier`-required negative + `additionalProperties:false`, the `capabilities.toolCatalog` `supported`/`sources`/`sessionLifecycle` shape, and the two content-free `tool.session.{opened,closed}` payload $defs incl. the closed `outcome` enum + RunEventType-enum membership). 2026-05-30 (RFC 0080 — agent memory capability reconciliation, Draft -> Active) added `memory-capability-model-shape.test.ts` (always-on server-free: the additive `capabilities.memory.{writable,search,retention}` dimension shapes + malformed-instance negatives — `retention.ttl` non-boolean, out-of-enum `search.modes`, unknown property under `additionalProperties:false` — the `agent-inventory-response` `memoryDegraded`/`degradedMemoryDimensions` closed-enum fields, and the `openwop-memory` derivation surfacing for read/write + long-term hosts while withholding from `writable:false`). 2026-05-30 (RFC 0081 — agent evaluation, Draft -> Active) added `agent-eval-suite-shape.test.ts` (always-on server-free: the `capabilities.agents.evalSuite` shape + the `AgentEvalSuite`/`EvalSummary` schema round-trips + the three `eval.{started,scored,completed}` payloads + the content-free negatives — a task entry with a `taskOutput` body, a `safetyFinding` with an `excerpt` — backing the new `eval-summary-no-content-leak` SECURITY invariant). 
2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch` live-run audit) added `safefetch-live-audit.test.ts` (`behaviorGate('openwop-safefetch-live-audit', …)`, gated on `httpClient.safeFetch` + `toolHooks.prePostEvents`) — asserts the audit-when-both MUST against the **durable run event log** via the new `POST /v1/host/sample/http/safe-fetch-run` open seam + the test event-log seam, closing the seam-vs-production gap (a production `createSafeFetch()` with no audit hooks passes the inline `safefetch-behavior.test.ts` but FAILS this under `OPENWOP_REQUIRE_BEHAVIOR=true`); this is the RFC 0076 §B → Accepted bar; run seam soft-skips on 404 (host-pending). 2026-05-29 (RFC 0066 — `x-openwop-form` picker UX hints, Draft → Active) added `x-openwop-form-pack-manifest.test.ts` (always-on server-free: an annotated `configSchema` stays a valid 2020-12 schema + the advisory hints don't change what it accepts, each §A annotation matches the shape, an unknown `kind` validates for forward-compat, 3 negatives — missing/non-string `kind`, non-string `dependsOn`). 2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch`) added `safefetch-behavior.test.ts` (seam-gated: SSRF block / DNS-rebinding / `Connection: upgrade` refusal / tool-hooks audit-when-both, via `POST /v1/host/sample/http/safe-fetch`; advertisement contract stays in `http-client-ssrf.test.ts`). 2026-05-29 (RFC 0076 §A — pack `runtime.requires[]` install gate) added two: `runtime-requires-shape.test.ts` (server-free closed-vocabulary validation — the 8 tokens validate, a raw builtin name is rejected, empty-array≡omission, `uniqueItems`) + `runtime-requires-install-gate.test.ts` (seam-gated install-grant / install-refuse → `pack_runtime_requirement_unmet` / non-sandbox SHOULD-projection, soft-skip on 404 via `POST /v1/host/sample/packs/install-gate`). 
2026-05-29 (RFC 0047 — `host.oauth` authorization-code roundtrip) added `oauth-authorization-code-roundtrip.test.ts` — capability-gated on `capabilities.oauth.supported` + `grants` including `authorization_code`; drives the `POST /v1/host/sample/oauth/authorize-code-roundtrip` seam against the one canonical synthetic provider in `fixtures/oauth-providers/synthetic.json` (soft-skip on 404, Tier-2 host-pending), asserting a successful grant returns a credential REFERENCE (token persisted as a `host.credentials` entry) and that the authorization code / state / PKCE verifier / acquired access+refresh tokens never appear on any run-visible surface (RFC 0047 §C + §C.2 / `credential-payload-redaction`). Closes the RFC 0047 Tier-2 gap (capability-shape + redaction scenarios existed; the actual authorization-code dance was unexercised). 2026-05-26 (RFC 0070 — agent-manifest runtime) added `agent-manifest-runtime.test.ts`; 2026-05-26 (RFC 0071 — artifact-type + chat card packs) added six: `artifact-type-pack-manifest-validation.test.ts` + `artifact-schema-compile-bounded.test.ts` (server-free) + `artifact-type-pack-install.test.ts` + `artifact-type-store-without-render.test.ts` + `chat-card-pack-manifest-validation.test.ts` (server-free) + `chat-card-pack-execution.test.ts` (capability-gated, host-pending). 
2026-05-26 (RFCs 0067 / 0068 / 0069 — spec-gap Draft cohort) added five scenarios: `byok-auth-modes.test.ts` (RFC 0067; always-on schema-shape of `aiProviders.authModes` + a discovery-gated §B auth-mode-contract cross-field check), `memory-consolidation-shape.test.ts` (RFC 0068; always-on shape of `agents.memoryConsolidation`/`agents.commitments` + the `agent.memory.consolidated`/`commitment.fired` payload $defs), `memory-consolidation-idempotent.test.ts` + `commitment-fired.test.ts` (RFC 0068; capability-gated behavioral, soft-skip on the documented `/v1/host/sample/memory/consolidate` + `/commitment/fire` seams), and `exec-not-protocol-tier.test.ts` (RFC 0069; always-on server-free structural assertion that the protocol corpus defines no `core.*`/`openwop.*` exec-class primitive — backs the `exec-must-not-be-protocol-tier` SECURITY invariant). 2026-05-25 (RFC 0061 — stateful agent-loop lifecycle, executionModel.version 5) added four `agent-loop-*.test.ts` scenarios: `-version5-shape` (always-on; validates `executionModel.statefulResume`/`transcriptWindow` + the 1–5 version ceiling) plus `-iteration-monotonic` (gated on `version >= 5`; `runOrchestrator.decided.iteration` increments 1,2,3… exactly once per turn), `-workspace-snapshot` (gated additionally on `host.workspace.supported`; a turn-i workspace write is invisible to turn i, visible to turn i+1), and `-stateful-resume` (gated on `statefulResume`; a mid-loop suspend resumes at the same iteration without resetting the counter) — the three behavioral scenarios drive the documented agent-loop seam (`POST /v1/host/sample/agentloop/run`) and soft-skip until a host wires it. 
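The `-iteration-monotonic` rule (`runOrchestrator.decided.iteration` increments 1,2,3… exactly once per turn) reduces to a one-line predicate over the observed iteration values. The sketch below assumes the values have already been extracted from the event log in emission order; the function name is hypothetical.

```typescript
// True when the observed iteration values are exactly 1, 2, 3, ... with no
// gaps, repeats, or resets. An empty list passes vacuously, so a caller would
// typically also assert that at least one turn was observed.
function isMonotonicIterationSequence(iterations: number[]): boolean {
  return iterations.every((value, index) => value === index + 1);
}
```

A repeated value (for example `1, 1, 2`) fails the check, which is how a counter reset after a mid-loop suspend would show up.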
2026-05-25 (RFC 0059 — host.workspace M2, reference-host enforcement) added two `workspace-*.test.ts` scenarios: `-behavior` (capability-gated CRUD round-trip / `If-Match` 409 `workspace_conflict` / `workspace_too_large` / §D run-start snapshot, all via the real `/v1/host/workspace/files` §C endpoints) and `-cross-tenant-isolation` (WCT-1 — drives the documented `POST /v1/host/sample/workspace/op` seam to assert a file owned by one `{tenant, workspace}` is unreadable, on both `get` and `list`, under a different owner; backs the new `workspace-cross-tenant-isolation` SECURITY invariant). The in-memory reference host now advertises `capabilities.workspace.supported` and honors §C/§D/§E end-to-end. 2026-05-25 (RFC 0062 — memory.distillation "dreams") added five `distillation-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.memory.distillation` block + the additive `distillation` sub-object on `memory.compacted`) plus `-token-budget` (within budget `tokensUsed ≤ tokenBudget`; an un-meetable budget → `token_budget_exceeded` with no partial archive), `-stable-archive` (same sources + budget ⇒ byte-stable archive checksum), `-index-roundtrip` (gated additionally on `indexEmitted`; the `MEMORY-INDEX.json` workspace file is retrievable + `workspace.updated` fired), and `-secret-carryforward` (SR-1: a redacted source secret never appears in the archive) — the four behavioral scenarios drive the documented memory-distillation seam (`POST /v1/host/sample/memory/distill`) and soft-skip until a host wires it. 
2026-05-25 (RFC 0063 — core.subWorkflow.outputAttestation) added four `subrun-*.test.ts` scenarios: `-attestation-shape` (always-on; validates the `capabilities.agents.subRunAttestation` flag) plus `-checksum-stable` (the child output checksum is the byte-stable, key-order-invariant RFC 8785 JCS + SHA-256 digest), `-approval-gate` (`requireApproval` → `accept` merges, `reject` does not), and `-approval-fail-closed` (no `accept`/`edit-accept` → no merge; backs the deferred `subrun-merge-approval-fail-closed` invariant) — the three behavioral scenarios drive the documented sub-run attestation seam (`POST /v1/host/sample/subrun/attest`) and soft-skip until a host wires it. 2026-05-25 (RFC 0064 — host.toolHooks) added five `tool-hooks-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.toolHooks` block + the optional content-free fields on `agentToolCalled` / `agentToolReturned`) plus `-content-free` (gated on `prePostEvents`), `-authorization-fail-closed` (gated on `perToolAuthorization`), `-rate-limit` (gated on `perToolRateLimit`), and `-secret-redaction` (gated on `prePostEvents` + the SR-1 `argsHash` redaction rule) — the four behavioral scenarios drive the documented tool-hooks invoke seam (`POST /v1/host/sample/toolhooks/invoke`) and soft-skip until a host wires it. 2026-05-25 (RFC 0060 — host.heartbeat) added four `heartbeat-*.test.ts` scenarios: `-capability-shape` (always-on; validates the `capabilities.heartbeat` block) plus `-fires-once-per-tick`, `-idempotent-no-spam`, and `-runtime-bound` (gated on `capabilities.heartbeat.supported` + the host heartbeat tick seam; soft-skip until a host wires it). 
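The "key-order-invariant" part of the `-checksum-stable` scenario rests on canonicalizing the child output before hashing. The sketch below shows only the key-sorting idea for plain JSON values; it is a simplified illustration and not a complete RFC 8785 implementation, and the `Json` type and `canonicalize` name are assumptions. In the flow described above, the resulting string's UTF-8 bytes would then be hashed with SHA-256.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serializes a JSON value with object keys in sorted order and no
// insignificant whitespace, so two objects that differ only in key order
// produce the same string.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value); // primitives: defer to JSON.stringify
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  const obj = value; // narrowed to the object form
  const keys = Object.keys(obj).sort(); // default sort compares UTF-16 code units
  const members = keys.map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k]));
  return "{" + members.join(",") + "}";
}
```

Array order is preserved because it is significant in JSON; only object member order is normalized.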
2026-05-25 (RFC 0057 — memory write-attribution) added five `memory-attribution-*.test.ts` scenarios: `-shape` (always-on advertisement check on `capabilities.memory.attribution`), plus `-no-content`, `-tenant-scoped`, `-emits-on-write`, and `-replay-stable` (gated on `capabilities.memory.attribution.emitsWriteEvents`) verifying the content-free `memory.written` RunEvent, its two SECURITY invariants (`memory-attribution-no-content` + `memory-attribution-tenant-scoped`), and the §D replay rule that a `replay`-mode fork MUST NOT regenerate `memoryId`. 2026-05-25 (RFC 0025 §C point 1 — test-catalog isolation invariant; pairs with the 25 publish-error scenarios in `pack-registry-publish.test.ts`) added `pack-registry-isolation.test.ts` — capability-gated on `capabilities.packs.testMode.{supported, isolated}: true`; PUTs a disposable pack into `/v1/packs-test/{name}` and asserts the same `(name, version)` does NOT appear via `GET /v1/packs/{name}` — anchors the test-catalog isolation MUST in RFC 0025 §C. 2026-05-25 (RFC 0028 Tier-2 post-promotion T2 — read-side sister scenario for workspace-membership enforcement) added `prompt-read-workspace-membership-enforced.test.ts` — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` so read-only hosts that expose `?workspaceId=` are also probed); drives `GET /v1/prompts?workspaceId=<random-non-member>` and interprets the response: 4xx PASS (canonical envelope check on 403); 200 with empty `templates[]` PASS (correct null result for a nonexistent workspace); 200 with non-empty `templates[]` FAIL (cross-tenant leak); 200 without `templates[]` field SKIP (host doesn't expose workspace-scoped reads). Verifies SECURITY invariant `prompt-read-workspace-membership-enforced`. Same-day T1 strengthened `prompt-mutation-workspace-membership-enforced.test.ts` to pin `error === "workspace_membership_required"` when the host's refusal status is 403 (other refusal codes unconstrained). 
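The response interpretation listed for `prompt-read-workspace-membership-enforced.test.ts` (4xx PASS, 200 with empty `templates[]` PASS, 200 with non-empty `templates[]` FAIL, 200 without `templates[]` SKIP) can be written as a small decision function. This is a sketch only: the function and type names are hypothetical, the canonical envelope check on 403 is not modeled, and statuses the notes do not mention are returned as `"unclassified"` rather than guessed.

```typescript
type Verdict = "pass" | "fail" | "skip" | "unclassified";

// Maps a GET /v1/prompts?workspaceId=<random-non-member> response onto the
// four outcomes described above.
function classifyWorkspaceRead(status: number, body: unknown): Verdict {
  if (status >= 400 && status < 500) return "pass"; // host refused the read
  if (status === 200) {
    const templates = (body as { templates?: unknown } | null | undefined)?.templates;
    // A missing (or non-array, by assumption) field means the host does not
    // expose workspace-scoped reads.
    if (!Array.isArray(templates)) return "skip";
    return templates.length === 0 ? "pass" : "fail"; // non-empty is a leak
  }
  return "unclassified"; // not covered by the notes above
}
```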
2026-05-25 (RFC 0028 Tier-2 follow-up — workspace-membership enforcement on mutating prompt endpoints, filed in response to a self-disclosed adopter vulnerability) added `prompt-mutation-workspace-membership-enforced.test.ts` — capability-gated on `capabilities.prompts.mutableLibrary: true`; drives `POST /v1/prompts` with a cryptographically-random non-member `workspaceId` and asserts the host refuses (NOT a 2xx; any 4xx/5xx is acceptable — silent success is the failure mode). Verifies SECURITY invariant `prompt-mutation-workspace-membership-enforced`. 2026-05-22 (RFC 0034 §B follow-up — secret-leakage harness against the OTel + debug-bundle seams) added `secret-leakage-otel-attribute.test.ts` — gates on `capabilities.secrets.supported` + `capabilities.observability.testSeams.{otelScrape,debugBundleExport}` AND the `OPENWOP_CANARY_SECRET_VALUE` env (host operator + conformance runner agree on the canary). Drives the existing `openwop-smoke-byok-roundtrip` fixture end-to-end; scrapes both seams after run completion; hard-fails if the canary plaintext appears in any OTel span attribute or debug-bundle field. Verifies SECURITY invariants `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel`. 2026-05-22 (RFC 0041 Phase 4 — replay determinism under nondeterministic models) added three scenarios: `replay-divergence-at-refusal.test.ts` (advertisement-shape probe on `replayDeterminism.refusalDivergenceEmission` + 2 `it.todo` for the dual-direction refusal-divergence case), `replay-observable-sequence-determinism.test.ts` (capability-gated; behavioral assertion soft-skipped until a `conformance-phase4-nondet-tool` fixture ships), `replay-llm-cache-key-portable.test.ts` (intra-host reproducibility + non-recipe-field invariance + Phase 4 advertisement alignment — reuses the existing `POST /v1/host/sample/test/llm-cache-key` seam from the sibling `replay-llm-cache-key.test.ts`). 
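The hard-fail rule in `secret-leakage-otel-attribute.test.ts` (the canary plaintext must not appear in any OTel span attribute or debug-bundle field) amounts to a recursive substring search over whatever JSON the two seams return. The following is a minimal sketch assuming the scraped data has already been parsed into plain JSON values; the function name is hypothetical and the real seam response shapes are not modeled.

```typescript
// Recursively searches strings, array items, object keys, and object values
// for the canary plaintext. The canary is assumed to be non-empty; an empty
// string is rejected because it would match every string.
function containsCanary(value: unknown, canary: string): boolean {
  if (canary.length === 0) throw new Error("canary must be non-empty");
  if (typeof value === "string") return value.includes(canary);
  if (Array.isArray(value)) {
    return value.some((item) => containsCanary(item, canary));
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    return Object.keys(obj).some(
      (key) => key.includes(canary) || containsCanary(obj[key], canary),
    );
  }
  return false; // numbers, booleans, null, undefined cannot carry the canary
}
```

A scenario would typically assert `containsCanary(scraped, process.env.OPENWOP_CANARY_SECRET_VALUE)` is false for both seams; checking keys as well as values guards against a secret being used as an attribute name.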
2026-05-20 (RFC 0027 §A templateKinds-coverage follow-up — paired with `prompt-end-to-end-events.test.ts`) added `prompt-all-four-kinds-events.test.ts` exercising all four `PromptKind` values (`system`, `user`, `schema-hint`, `few-shot`) end-to-end through the reference workflow-engine sample's `local.sample.demo.mock-ai` dispatch path; capability-gated via `behaviorGate('prompts-supported', ...)`. Closes the credibility gap where the host advertised `templateKinds: ["system", "user", "few-shot", "schema-hint"]` but only the system+user pair was actually wired into dispatch. 2026-05-20 (RFCs 0030–0033 — envelope LLM-contract-hardening track) added 15 scenarios across four `Active` RFCs: `envelope-reasoning-shape.test.ts` (RFC 0030, always-on; asserts the OPTIONAL `reasoning` property on the three universal-kind schemas + the `schema.response` deliberate omission), `envelope-reasoning-secret-redaction.test.ts` (RFC 0030, capability-gated on `capabilities.envelopes.reasoning.supported` + `secrets.supported`; 5 `it.todo()` placeholders for SECURITY invariant `envelope-reasoning-secret-redaction`), `envelope-tier-one-subset-static.test.ts` (RFC 0030, always-on for load-bearing rules — no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere; gated on `tierOneSubsetCompliance: "strict"` for OpenAI-strict-only constraints), `envelope-variant-discriminator-static.test.ts` (RFC 0031, always-on; asserts no `oneOf` + every `anyOf` branch declares a single-string-enum discriminator in `required` on every `schemas/envelopes/*.schema.json`), `model-capability-substituted.test.ts` (RFC 0031, advertisement-shape probe on `capabilities.modelCapabilities.advertised[]` identifier pattern + 5 `it.todo()` placeholders for SECURITY invariant `model-capability-substituted-no-credential-disclosure`), `model-capability-insufficient.test.ts` (RFC 0031, 6 `it.todo()` placeholders for refusal + no-recursive-fallback), `node-module-required-capabilities-shape.test.ts` (RFC 0031 
SHOULD-tier authoring-convention; 4 `it.todo()` placeholders), and the six envelope-reliability events from RFC 0032 (`envelope-retry-attempted` carrying the shared advertisement-shape probe enforcing both MUST-tier events in `events[]` per RFC 0032 §C, plus `envelope-retry-exhausted`, `envelope-refusal-shape`, `envelope-truncated`, `envelope-nl-to-format-engaged`, `envelope-recovery-applied` — collectively 39 `it.todo()` placeholders covering retry/refusal/truncation/recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` and `envelope-recovery-no-content-leak`), plus RFC 0033's two scenarios (`envelope-completion-distinguishes-truncation.test.ts` + `envelope-truncation-cap-exhaustion.test.ts` — 12 `it.todo()` placeholders covering the truncation-vs-schema-violation retry-routing distinction + the DoS-bound assertion). Reference workflow-engine sample advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"` honestly (schemas accept the field; host doesn't yet inject the directive); the other three RFCs' capability blocks defer to reference-host emission code per the staged RFC 0027 §G precedent. 2026-05-20 (RFC 0028 §B Phase B — prompt-pack boot-time install) added `prompt-pack-install.test.ts` (capability-gated on `capabilities.prompts.endpointsSupported: true`; asserts a host that ran the boot-time pack loader surfaces ≥ 1 pack-source template under `GET /v1/prompts?source=pack` carrying the canonical `meta.source: "pack"` + `meta.packName` + `meta.packVersion` stamps; positively identifies the in-tree `vendor.openwop.prompt-sample` reference pack's `writer-system` template when present). Pairs with the new `host/promptPackLoader.ts` boot-time entry on the reference workflow-engine sample, which scans `examples/packs/*` plus `OPENWOP_PROMPT_PACKS_DIR` and calls `installPackTemplates()` for each `kind: "prompt"` pack found. 
2026-05-20 (RFC 0029 Phase C — prompt resolution chain wire shape) added three more scenarios: `prompt-resolution-chain-node-wins.test.ts` (capability-gated on `capabilities.prompts.supported: true`; asserts layer-1 node-config supersedes lower layers per `spec/v1/prompts.md` §"Resolution chain (normative)"), `prompt-resolution-chain-agent-intrinsic.test.ts` (additionally gated on `capabilities.prompts.agentBindings: true`; asserts agent intrinsic `systemPromptRef` wins over `promptOverrides` AND lower layers when the node has no layer-1 ref), `prompt-resolution-chain-fallback-cascade.test.ts` (asserts layer 3 workflow-defaults wins over layer 4 host-defaults; layer 4 host-defaults wins when 1-3 yield null; resolved is null when all four yield null but chain[] still lists every attempted layer). The scenarios drive the host's `POST /v1/host/sample/prompt/resolve` test seam (reference-host implementation deferred to follow-up slice per RFC 0021 staging precedent). 2026-05-20 (RFC 0027 Phase A — prompt templates wire shape) added three scenarios: `prompt-template-shape.test.ts` (always-on; Ajv compileability + positive/negative round-trip for PromptTemplate + PromptRef + PromptKind), `prompt-composed-secret-redaction.test.ts` (capability-gated on `capabilities.prompts.supported: true` + `observability: "full"`; asserts `[REDACTED:<secretId>]` markers in `prompt.composed` payloads for `source: "secret"` variable bindings per SECURITY/threat-model-secret-leakage.md §SR-1), `prompt-composed-trust-marker.test.ts` (same capability gates; asserts `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation per RFC 0020 §D). Paired with new `fixtures/prompt-templates/` sub-directory + per-fixture schema-validity describe block + future SECURITY invariants `prompt-composed-secret-redaction` and `prompt-composed-trust-marker` (lands alongside reference-host emission per RFC 0021 staging precedent). 
2026-05-18 (RFC 0022 `Draft` — runtime variable mapping) added four `it.todo()` placeholder scenarios covering the new mapping surfaces on `core.dispatch` (§A — `dispatch-input-mapping.test.ts`, `dispatch-output-mapping.test.ts`, `dispatch-cross-worker-handoff.test.ts`) and `core.subWorkflow` (§B — `subworkflow-input-mapping.test.ts`). Gated on `capabilities.agents.dispatchMapping` (dispatch trio) and `capabilities.subWorkflow.inputMapping` (subWorkflow). Promote to live assertions when RFC 0022 reaches `Active` + a reference host advertises the matching flags. 2026-05-17 (RFC 0003 §D handoff-schema enforcement, HV-1) added `agentPackHandoffSchemaValidation.test.ts` — verifies the host validates dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef` per RFC 0003 §D. Paired with the new `agent-pack-handoff-schema-enforcement` row in `SECURITY/invariants.yaml`. 2026-05-17 (AI Envelope gap-closure, DRAFT v1.x — `spec/v1/ai-envelope.md`) added 7 advertisement-shape scenarios with `it.todo()` behavioral placeholders gated on `capabilities.envelopeContracts.advertised: true`: `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`. Paired with the new `envelope-redaction-sr-1-carry-forward` row in `SECURITY/invariants.yaml`. 2026-05-17 (post-publish hardening, deep audit of `core.openwop.agents`) added `agents-run-tool-allowlist.test.ts` — server-free scenario locking in the `core.openwop.agents@1.0.1` safety-fix that closes `OPENWOP-AUDIT-2026-003` (function-typed `tool.handler` properties rejected at `validateTools()` with `INVALID_TOOL_DECLARATION`; tool-driven runs require `ctx.agentRuntime`; tool-less safe fallback preserved). Paired with the new `agents-run-no-raw-handler` row in `SECURITY/invariants.yaml`. 
Same-day post-publish hardening added `idempotency-key-determinism.test.ts` — server-free scenario locking in the `core.openwop.http@1.1.2` determinism safety-fix (default `composite` mode produces deterministic keys in `(runId, nodeId, payload)`; removed `uuid` mode rejects with `CONFIG_INVALID`; cross-impl vector test lets third-party reimplementations verify wire agreement). Paired with the new `idempotency-key-deterministic` row in `SECURITY/invariants.yaml`. 2026-05-17 (Phase 3 of RFC 0013) added three server-free scenarios exercising the reference workflow-chain expansion library (`conformance/src/lib/workflow-chain-expansion.ts`): `workflow-chain-expansion.test.ts` (parameter substitution + node id collision avoidance + edge rewriting + capability propagation + runtime-invariance contract), `workflow-chain-unresolvable-typeid.test.ts` (rejection with `chain_unresolvable_typeid` when a chain references an unknown typeId), and `workflow-chain-pack-signature-verification.test.ts` (Ed25519 verification recipe reuse from `node-packs.md §Signing`). Earlier that day (Phase 1) added `workflow-chain-pack-manifest-validation.test.ts` — server-free schema-validation scenario covering the new `workflow-chain-pack-manifest.schema.json` (positive sample + two negatives: kind/contents mismatch and invalid `chainId`). Closes RFC 0013 (`Workflow-chain packs`, `Draft`) Phases 1 + 3 alongside the new `spec/v1/workflow-chain-packs.md`, the `Capabilities.workflowChainPacks` block, and the registry build-index/conformance-check `kind` routing from Phase 2. Earlier that day, the suite added 27 `it.todo()` placeholder scenarios paired with RFCs 0014-0020 (host capability surfaces — fs, kvStorage, tableStorage, queueBus, sql/vector/search, blob/cache, mcp.serverMount). These promote to live assertions when each RFC reaches `Active` + the matching capability block lands in `schemas/capabilities.schema.json` + a reference host advertises the capability. 
Earlier additions include 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing SR-1 §D), `memory-compaction-event-emitted.test.ts` 
(canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). 2026-05-15 (gap-closure CF-3) added `interrupt-token-matrix.test.ts` (malformed / unknown / replay / cross-run-id paths on `GET|POST /v1/interrupts/{token}`). The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
 
  High-level coverage includes:
 
@@ -172,7 +172,7 @@ Server-required (added in 1.7.0):
  |---|---|---|
  | **Redaction** | [`capabilities.md`](../spec/v1/capabilities.md) §"Secrets" + NFR-7 + §"aiProviders" | Vendor-neutral assertions that the server doesn't leak secret material. Three scenario groups: (a) discovery shape contract — `secrets` + `aiProviders` advertisements are well-formed regardless of `secrets.supported`; when `supported === true`, scopes MUST be non-empty + `resolution === 'host-managed'`; `byok ⊆ supported`. (b) bearer-token redaction — invalid Bearer canary in `Authorization` header is not echoed in the 401 response body. (c) credentialRef echo control — gated on `secrets.supported === true`; canary planted in `configurable.ai.credentialRef` MUST NOT appear in any RunEvent payload (poll-based capture; transport-agnostic). Uses runtime-built canary fixtures (`lib/canaries.ts`) that defeat static secret scanners. 6 scenarios. |
 
- Current source tree: 305 scenario files. Use [`coverage.md`](./coverage.md) for current grade/gap tracking.
+ Current source tree: 308 scenario files. Use [`coverage.md`](./coverage.md) for current grade/gap tracking.
 
  ## Remaining Gaps
 
package/coverage.md CHANGED
@@ -66,7 +66,7 @@
 
  ## Capability-gated scenarios: shape vs behavior
 
- Twenty-eight scenario groups validate optional profiles where the host's discovery advertisement is well-formed (shape grade) but no reference host yet implements the profile end-to-end (behavior grade is `host-pending`). Default suite runs skip these with a warning; set `OPENWOP_REQUIRE_BEHAVIOR=true` to convert skips into hard failures.
+ Thirty-one scenario groups validate optional profiles where the host's discovery advertisement is well-formed (shape grade) but no reference host yet implements the profile end-to-end (behavior grade is `host-pending`). Default suite runs skip these with a warning; set `OPENWOP_REQUIRE_BEHAVIOR=true` to convert skips into hard failures.
 
  | Scenario | Profile / capability | Shape grade | Behavior grade | Behavior-unlock dependency |
  |---|---|---|---|---|
@@ -108,6 +108,9 @@ Twenty-eight scenario groups validate optional profiles where the host's discove
  | `agent-live-invocation-bracket.test.ts` | `capabilities.agents.liveRuntime.supported` (RFC 0077 §E, `multi-agent-execution.md` §"Live manifest dispatch") | A (the §E bracket — `agent.invocation.started`-first / `agent.invocation.completed`-last, matching `invocationId`, `source`/`outcome` closed enums, both content-free — via `POST /v1/host/sample/agents/live-invoke` + the test event-log seam) | `host-pending` | `behaviorGate('openwop-live-invocation-bracket', …)`. Seam-gated; soft-skips on 404. Part of the RFC 0077 → Accepted bar. First adopter: MyndHyve `agents.liveRuntime`. |
  | `agent-live-structured-output.test.ts` | `capabilities.agents.liveRuntime.structuredOutput` (RFC 0077 §B step 6) | A (a terminal result violating `handoff.returnSchemaRef` fails the invocation `outcome:"failed"` + `schemaValidated != true`, not a shipped completion — via the `forceInvalidResult` seam param) | `host-pending` | `behaviorGate('openwop-live-structured-output', …)`; gated on `liveRuntime.supported` + `structuredOutput`. Seam-gated; soft-skips on 404. |
  | `agent-live-allowlist-enforced.test.ts` | `capabilities.agents.liveRuntime.supported` (RFC 0077 §F-1 / RFC 0002 §A14 `toolAllowlist`) | A (a tool outside the agent `toolAllowlist` is not callable — no `agent.toolCalled` for the disallowed tool — via the `attemptTool` seam param) | `host-pending` | `behaviorGate('openwop-live-allowlist-enforced', …)`. Seam-gated; soft-skips on 404. Part of the RFC 0077 → Accepted bar. |
+ | `agent-org-chart-scoping.test.ts` | `capabilities.agents.orgChart.supported` (RFC 0087 §A/§C/§D, `agent-org-chart.md`) | A (the normative `GET /v1/agents/org-chart` tree-shape — acyclic `parentDepartmentId` tree + `host:<id>` member rosterIds — + the §D `GET /v1/agents/org-chart/{departmentId}` responsibility roll-up with a deduped `responsibilities[]` union + `recursive=false` shape-stability + the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID`) | `host-pending` | `behaviorGate('openwop-org-chart-scoping', …)`. Black-box on the normative path (no POST seam); soft-skips on 404 / when the cross-tenant env var is unset. **Part of the RFC 0087 → Accepted bar.** First adopter: MyndHyve `agents.orgChart`. |
+ | `org-position-no-authority-escalation.test.ts` | `capabilities.agents.orgChart.supported` (RFC 0087 §B) + `SECURITY/invariants.yaml` `org-position-no-authority-escalation` | A (behavioral leg of the protocol-tier invariant — the live org-chart wire carries NO authority-bearing field (`scopes`/`canDispatch`/`permissions`/`authority`/`roleGrants`/`capabilities`) on any member / department / responsibility-view object, proving the host's projector strips position-as-authority at every install scope) | `host-pending` | `behaviorGate('openwop-org-position-no-authority', …)`. Black-box on the normative path; soft-skips on 404. The STRUCTURAL leg (schema rejects an authority field) stays always-on in `agent-org-chart-shape.test.ts`; the deeper RFC 0049/0051 authority-invariance legs stay reference-impl tier (a non-normative authz-decide hook would be required — the `agent-manifest-runtime` precedent). **Part of the RFC 0087 → Accepted bar.** |
+ | `trigger-bridge-delivery.test.ts` | `openwop-trigger-bridge` profile (RFC 0083 §C/§D, `trigger-bridge.md`; derived from discovery — bridge + dead-letter sink + durable source) | A (the §C delivery model via `POST /v1/host/sample/trigger-bridge/deliver` + the test event-log seam: dedup→effectively-once `trigger.delivery.attempted{delivered}` (§C-1), retry-exhaustion→`{dead-lettered}` + `trigger.subscription.state.changed{toState:dead-lettered}` (§C-2 + RFC 0053), and the delivered run's `run.started.causationId` == the delivery id (§C / RFC 0040); both `trigger.*` events content-free) | `host-pending` | `behaviorGate('openwop-trigger-bridge', …)`. Profile-gated; seam-gated; soft-skips on 404. Normative `GET /v1/trigger-subscriptions` read runs black-box. **This is the RFC 0083 → Accepted bar.** First adopter: MyndHyve `triggerBridge`. |
  | `approval-gate-events.test.ts` | `approval.granted` / `.rejected` / `.overridden` (RFC 0051 §B, `interrupt-profiles.md` §approvalGate) | Server-free (event-payload schema validity: required fields incl. mandatory `overridden.reason`; additionalProperties:false negatives) | host-pass (server-free) | Always runs; no host needed. |
  | `approval-gate-flow.test.ts` | `core.openwop.governance.approvalGate` (RFC 0051 §A) + `capabilities.authorization` (RFC 0049) | A (capability-gated on `authorization.supported`; unauthorized-principal-denied + override-audited via the `governance/approval-gate` seam) | `host-pending` | Behavioral probe soft-skips on 404. Grant/reject-loopback/quorum scenarios deferred until a governance host wires the seam. |
  | `scheduling-capability-shape.test.ts` | `capabilities.scheduling` (RFC 0052 §A, `host-capabilities.md` §host.scheduling) | A (advertisement shape always — `supported` boolean; `cron`/`delayed`/`calendar` booleans; `maxFutureHorizon` ISO-8601 duration) | `host-pending` | Always runs; asserts the block is absent or well-formed. |
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@openwop/openwop-conformance",
-   "version": "1.11.0",
+   "version": "1.12.0",
    "description": "Production-ready black-box conformance suite for OpenWOP v1.0 compliant servers.",
    "repository": {
      "type": "git",
@@ -0,0 +1,82 @@
+ /**
+  * Shared helpers for the RFC 0087 `agents.orgChart` conformance scenarios.
+  * Lives in lib/ (not a `*.test.ts`) so scenarios import it via
+  * `../lib/agentOrgChart.js`.
+  *
+  * The org-chart is structure + a read (like the RFC 0072 inventory), not an
+  * event surface — so these helpers wrap the two NORMATIVE reads
+  * (`GET /v1/agents/org-chart` + `GET /v1/agents/org-chart/{departmentId}`),
+  * exercised black-box against any conformant host. Tenant scoping (RFC 0074)
+  * is probed with the `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID` env var (a
+  * department id outside the caller's owner triple), the org-chart analog of the
+  * roster scenario's `OPENWOP_CROSS_TENANT_ROSTER_ID`.
+  *
+  * @see RFCS/0087-agent-org-chart.md
+  * @see spec/v1/agent-org-chart.md
+  */
+ import { driver } from './driver.js';
+ import { readCapabilityFamily } from './discovery-capabilities.js';
+
+ /** Reads `agents.orgChart` from discovery (root-first per RFC 0073); null when
+  * unadvertised. */
+ export async function readOrgChartCap(): Promise<Record<string, unknown> | null> {
+   const agents = await readCapabilityFamily<{ orgChart?: unknown }>('agents');
+   const oc = agents?.orgChart;
+   return oc && typeof oc === 'object' ? (oc as Record<string, unknown>) : null;
+ }
+
+ export interface OrgDepartment {
+   departmentId?: string;
+   parentDepartmentId?: string | null;
+   [k: string]: unknown;
+ }
+
+ export interface OrgMember {
+   rosterId?: string;
+   departmentId?: string;
+   roleId?: string;
+   reportsTo?: string | null;
+   [k: string]: unknown;
+ }
+
+ export interface OrgChart {
+   owner?: { tenantId?: string; workspaceId?: string };
+   departments?: OrgDepartment[];
+   members?: OrgMember[];
+ }
+
+ export interface ResponsibilityView {
+   department?: { departmentId?: string; [k: string]: unknown };
+   members?: OrgMember[];
+   responsibilities?: string[];
+ }
+
+ /** GET the NORMATIVE org-chart (RFC 0087 §A `GET /v1/agents/org-chart`);
+  * null when the host doesn't serve it (404/405/501). */
+ export async function getOrgChart(): Promise<OrgChart | null> {
+   const res = await driver.get('/v1/agents/org-chart');
+   if (res.status === 404 || res.status === 405 || res.status === 501) return null;
+   return (res.json as OrgChart | undefined) ?? {};
+ }
+
+ /** GET a department's §D responsibility roll-up. `recursive` defaults to the
+  * host default (true) when undefined. Returns `{ status, view }` so a caller
+  * can distinguish a cross-tenant 404 from a served view. */
+ export async function getDepartmentView(
+   departmentId: string,
+   recursive?: boolean,
+ ): Promise<{ status: number; view: ResponsibilityView | undefined }> {
+   const qs = recursive === undefined ? '' : `?recursive=${recursive ? 'true' : 'false'}`;
+   const res = await driver.get(`/v1/agents/org-chart/${encodeURIComponent(departmentId)}${qs}`);
+   return { status: res.status, view: res.json as ResponsibilityView | undefined };
+ }
+
+ /** The descriptive key set a member object is allowed to carry on the wire
+  * (RFC 0087 §A). Anything outside this — in particular an authority-bearing
+  * field — is a §B `org-position-no-authority-escalation` violation. */
+ export const MEMBER_DESCRIPTIVE_KEYS = new Set(['rosterId', 'departmentId', 'roleId', 'reportsTo']);
+
+ /** Authority-bearing field names that MUST NEVER appear on an org-chart wire
+  * object (member / department / responsibility view) — position confers no
+  * authority (RFC 0087 §B). */
+ export const AUTHORITY_FIELDS = ['scopes', 'canDispatch', 'permissions', 'authority', 'roleGrants', 'capabilities'];
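The helper file exports `MEMBER_DESCRIPTIVE_KEYS`, while the two org-chart scenarios in this diff import only `AUTHORITY_FIELDS`. As a minimal sketch of how the allow-list could be applied to a member object, assuming the key set shown in the export: `extraMemberKeys` and `DESCRIPTIVE_KEYS` are hypothetical names for illustration and are not part of the package.

```typescript
// Hypothetical helper, not part of @openwop/openwop-conformance.
// The set mirrors MEMBER_DESCRIPTIVE_KEYS as it appears in the diff.
const DESCRIPTIVE_KEYS: ReadonlySet<string> = new Set([
  'rosterId',
  'departmentId',
  'roleId',
  'reportsTo',
]);

// Returns the keys on a member object that fall outside the descriptive set,
// sorted so the result is stable across hosts.
function extraMemberKeys(member: Record<string, unknown>): string[] {
  return Object.keys(member)
    .filter((k) => !DESCRIPTIVE_KEYS.has(k))
    .sort();
}

const cleanMember = extraMemberKeys({ rosterId: 'host:alice', departmentId: 'eng', reportsTo: null });
const leakyMember = extraMemberKeys({ rosterId: 'host:bob', scopes: ['runs:write'], canDispatch: true });
console.log(cleanMember, leakyMember);
```

An allow-list check of this kind is stricter than the deny-list scan over `AUTHORITY_FIELDS`: it would also flag host-specific extension keys, which may or may not be what a given scenario wants.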
@@ -0,0 +1,74 @@
+ /**
+  * Shared helpers for the RFC 0083 `triggerBridge` conformance scenario.
+  * Lives in lib/ (not a `*.test.ts`) so scenarios import it via
+  * `../lib/triggerBridge.js`.
+  *
+  * Two surfaces:
+  *   - the NORMATIVE read (`GET /v1/trigger-subscriptions[/{subscriptionId}]`,
+  *     RFC 0083 §A), exercised black-box; and
+  *   - the host-sample delivery seam (`POST /v1/host/sample/trigger-bridge/deliver`),
+  *     used to drive the §C delivery model (dedup → retry → dead-letter →
+  *     causation) so the two `trigger.*` events can be asserted against the test
+  *     event-log seam. The seam is OPTIONAL — scenarios soft-skip on 404/405
+  *     (reference durable-delivery is deferred per RFC 0083 §Conformance).
+  *
+  * Gating uses the `openwop-trigger-bridge` PROFILE derived from the live
+  * discovery doc (the bridge + a dead-letter sink + a durable source, §D), not a
+  * bare capability flag.
+  *
+  * @see RFCS/0083-durable-trigger-and-channel-bridge-profile.md
+  * @see spec/v1/trigger-bridge.md
+  * @see spec/v1/profiles.md (§openwop-trigger-bridge)
+  */
+ import { driver } from './driver.js';
+ import { deriveProfiles, type DiscoveryPayload } from './profiles.js';
+
+ /** True when the live host's discovery derives the `openwop-trigger-bridge`
+  * profile (RFC 0083 §D predicate: bridge advertised + dead-letter sink + a
+  * durable source). */
+ export async function isTriggerBridgeProfileAdvertised(): Promise<boolean> {
+   const disco = await driver.get('/.well-known/openwop');
+   if (disco.status !== 200 || !disco.json) return false;
+   return deriveProfiles(disco.json as DiscoveryPayload).includes('openwop-trigger-bridge');
+ }
+
+ export interface TriggerSubscription {
+   subscriptionId?: string;
+   source?: string;
+   state?: string;
+   [k: string]: unknown;
+ }
+
+ /** GET the NORMATIVE subscription read surface (RFC 0083 §A
+  * `GET /v1/trigger-subscriptions`); null when not served (404/405/501). */
+ export async function listTriggerSubscriptions(): Promise<{ subscriptions?: TriggerSubscription[] } | null> {
+   const res = await driver.get('/v1/trigger-subscriptions');
+   if (res.status === 404 || res.status === 405 || res.status === 501) return null;
+   return (res.json as { subscriptions?: TriggerSubscription[] } | undefined) ?? {};
+ }
+
+ export interface DeliveryResult {
+   runId?: string;
+   subscriptionId?: string;
+   outcome?: string;
+   deliveredCount?: number;
+ }
+
+ /**
+  * Drive one delivery through the host-sample bridge seam. `scenario`:
+  *   - `dedup` — deliver the same `dedupKey` twice; effectively-once (§C-1).
+  *   - `exhaust` — exhaust the retry policy → `dead-lettered` (§C-2 + RFC 0053).
+  *   - `deliver` — a single successful delivery whose run's `run.started`
+  *     carries the delivery `causationId` (§C / RFC 0040).
+  * Returns null when the seam is unwired (404/405).
+  */
+ export async function driveDelivery(
+   body: { scenario: 'dedup' | 'exhaust' | 'deliver'; dedupKey?: string; source?: string },
+ ): Promise<DeliveryResult | null> {
+   const res = await driver.post('/v1/host/sample/trigger-bridge/deliver', body);
+   if (res.status === 404 || res.status === 405) return null;
+   return (res.json as DeliveryResult | undefined) ?? {};
+ }
+
+ export const SUBSCRIPTION_STATES = ['active', 'paused', 'failed', 'dead-lettered'];
+ export const DELIVERY_OUTCOMES = ['delivered', 'retrying', 'dead-lettered'];
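The effectively-once property that the `dedup` scenario targets (§C-1) comes down to counting `delivered` attempts per `dedupKey`. The sketch below shows that count over an in-memory event list. The event shape (`type`, `payload.dedupKey`, `payload.outcome`) and both function names are assumptions for illustration; the real payload is defined by the `trigger.*` $defs, which this diff does not show.

```typescript
// Assumed event shape, for illustration only.
interface AttemptEvent {
  type: string;
  payload: { dedupKey?: string; outcome?: string };
}

// Count `delivered` attempts per dedupKey, ignoring other events and outcomes.
function deliveredCountByKey(events: AttemptEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    if (e.type !== 'trigger.delivery.attempted') continue;
    const key = e.payload.dedupKey;
    if (e.payload.outcome !== 'delivered' || typeof key !== 'string') continue;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Keys that were delivered more than once, i.e. effectively-once violations.
function keysDeliveredMoreThanOnce(events: AttemptEvent[]): string[] {
  const out: string[] = [];
  deliveredCountByKey(events).forEach((n, key) => {
    if (n > 1) out.push(key);
  });
  return out;
}

const sampleLog: AttemptEvent[] = [
  { type: 'trigger.delivery.attempted', payload: { dedupKey: 'k1', outcome: 'delivered' } },
  { type: 'trigger.delivery.attempted', payload: { dedupKey: 'k1', outcome: 'retrying' } },
  { type: 'trigger.delivery.attempted', payload: { dedupKey: 'k2', outcome: 'delivered' } },
  { type: 'trigger.delivery.attempted', payload: { dedupKey: 'k2', outcome: 'delivered' } },
  { type: 'run.started', payload: {} },
];
console.log(keysDeliveredMoreThanOnce(sampleLog));
```

In the sample log, `k1` has one `delivered` attempt plus a retry and passes, while `k2` has two `delivered` attempts and is reported.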
@@ -0,0 +1,137 @@
+ /**
+  * Agent org-chart — normative read, responsibility roll-up + tenant scoping
+  * (RFC 0087 §A/§C/§D) — behavioral.
+  *
+  * Gated on `capabilities.agents.orgChart.supported` (root-first per RFC 0073).
+  * Soft-skips when unadvertised (default) / hard-fails under
+  * `OPENWOP_REQUIRE_BEHAVIOR=true` via `behaviorGate`. The always-on wire-shape
+  * coverage lives in `agent-org-chart-shape.test.ts`; this asserts host
+  * BEHAVIOR against the live `/v1/agents/org-chart` surface:
+  *
+  *   1. NORMATIVE read — `GET /v1/agents/org-chart` returns the
+  *      `agent-org-chart.schema.json` shape `{ owner, departments, members }`;
+  *      departments form a tree (every `parentDepartmentId` resolves; no cycle);
+  *      members reference roster entries (`host:<id>` rosterId) and the
+  *      `reportsTo` graph is acyclic. Black-box on any org-chart host.
+  *   2. §D RESPONSIBILITY ROLL-UP — `GET /v1/agents/org-chart/{departmentId}`
+  *      returns `{ department, members, responsibilities }` where
+  *      `responsibilities` is a deduped `string[]` (the union of the subtree
+  *      members' RFC 0086 portfolios); `recursive=false` scopes to direct
+  *      members without changing the response shape.
+  *   3. TENANT SCOPING (§C / RFC 0074) — a `GET /v1/agents/org-chart/{id}` for a
+  *      department outside the caller's owner triple 404s (probed only when
+  *      `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID` is supplied; soft-skip
+  *      otherwise — the org-chart analog of the roster scoping env var).
+  *
+  * Spec references:
+  *   - https://github.com/openwop/openwop/blob/main/spec/v1/agent-org-chart.md
+  *   - https://github.com/openwop/openwop/blob/main/RFCS/0087-agent-org-chart.md
+  */
+
+ import { describe, it, expect } from 'vitest';
+ import { driver } from '../lib/driver.js';
+ import { behaviorGate } from '../lib/behavior-gate.js';
+ import { readOrgChartCap, getOrgChart, getDepartmentView } from '../lib/agentOrgChart.js';
+
+ const ROSTER_ID_RE = /^host:[a-z0-9][a-z0-9._-]*$/;
+
+ describe('agent-org-chart-scoping (RFC 0087 §A/§C/§D)', () => {
+   it('serves the normative org-chart + responsibility roll-up, tree-shaped and tenant-scoped', async () => {
+     const cap = await readOrgChartCap();
+     if (!behaviorGate('openwop-org-chart-scoping', cap?.supported === true)) return;
+
+     const installScope = typeof cap?.installScope === 'string' ? cap.installScope : 'tenant';
+     expect(
+       installScope === 'host' || installScope === 'tenant',
+       driver.describe('RFC 0087 §E / RFC 0074', "agents.orgChart.installScope (when present) MUST be 'host' or 'tenant'"),
+     ).toBe(true);
+
+     // ---- Leg 1: normative read (black-box) -------------------------------
+     const chart = await getOrgChart();
+     if (chart === null) return; // advertised but read not served yet — soft-skip
+     const departments = chart.departments ?? [];
+     const members = chart.members ?? [];
+     expect(
+       Array.isArray(departments) && Array.isArray(members),
+       driver.describe('agent-org-chart.schema.json', 'GET /v1/agents/org-chart MUST return departments[] + members[]'),
+     ).toBe(true);
+
+     const deptIds = new Set(departments.map((d) => d.departmentId).filter((x): x is string => typeof x === 'string'));
+     for (const d of departments) {
+       const parent = d.parentDepartmentId;
+       if (parent !== undefined && parent !== null) {
+         expect(
+           deptIds.has(parent),
+           driver.describe('agent-org-chart.md §A', 'every parentDepartmentId MUST resolve to a department in the chart (a tree)'),
+         ).toBe(true);
+       }
+     }
+     // Department tree is acyclic: walk parents from each node; revisiting a node
+     // on the same walk means the parent graph contains a cycle. The walk ends
+     // because `seen` can only grow to the number of distinct ids reachable.
+     for (const d of departments) {
+       const seen = new Set<string>();
+       let cur: string | null | undefined = d.departmentId;
+       let cyclic = false;
+       while (typeof cur === 'string') {
+         if (seen.has(cur)) {
+           cyclic = true;
+           break;
+         }
+         seen.add(cur);
+         cur = departments.find((x) => x.departmentId === cur)?.parentDepartmentId ?? null;
+       }
+       expect(
+         !cyclic,
+         driver.describe('agent-org-chart.md §A', 'the department parent graph MUST be acyclic'),
+       ).toBe(true);
+     }
+     for (const m of members) {
+       expect(
+         typeof m.rosterId === 'string' && ROSTER_ID_RE.test(m.rosterId),
+         driver.describe('agent-org-chart.md §A', 'each member MUST reference a roster entry (host:<id> rosterId)'),
+       ).toBe(true);
+       if (typeof m.departmentId === 'string') {
+         expect(
+           deptIds.size === 0 || deptIds.has(m.departmentId),
+           driver.describe('agent-org-chart.md §A', "a member's departmentId MUST be a department in the chart"),
+         ).toBe(true);
+       }
+     }
+
+     // ---- Leg 2: §D responsibility roll-up --------------------------------
+     const probeDeptId = departments[0]?.departmentId;
+     if (typeof probeDeptId === 'string') {
+       const { status, view } = await getDepartmentView(probeDeptId);
+       if (status === 200 && view) {
+         expect(
+           Array.isArray(view.responsibilities),
+           driver.describe('agent-org-chart.md §D', 'the responsibility view MUST carry a responsibilities[] roll-up'),
+         ).toBe(true);
+         const r = view.responsibilities ?? [];
+         expect(
+           r.length === new Set(r).size,
+           driver.describe('agent-org-chart.md §D', 'responsibilities MUST be a deduped union (no duplicate workflow ids)'),
+         ).toBe(true);
+         expect(
+           r.every((w) => typeof w === 'string'),
+           driver.describe('org-chart-responsibility-view.schema.json', 'responsibilities[] entries MUST be workflow-id strings'),
+         ).toBe(true);
+         // recursive=false MUST keep the response shape (a subset roll-up).
+         const direct = await getDepartmentView(probeDeptId, false);
+         if (direct.status === 200 && direct.view) {
+           expect(
+             Array.isArray(direct.view.responsibilities),
+             driver.describe('agent-org-chart.md §D', 'recursive=false MUST return the same shape, scoped to direct members'),
+           ).toBe(true);
+         }
+       }
+     }
+
+     // ---- Leg 3: tenant scoping (RFC 0074) --------------------------------
+     const crossTenantDept = process.env.OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID;
+     if (typeof crossTenantDept === 'string' && crossTenantDept.length > 0) {
+       const probe = await getDepartmentView(crossTenantDept);
+       expect(
+         probe.status === 404,
+         driver.describe('agent-org-chart.md §C / RFC 0074', 'GET /v1/agents/org-chart/{id} for a cross-tenant department MUST 404 (no cross-tenant disclosure)'),
+       ).toBe(true);
+     }
+   });
+ });
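The header comment of this scenario lists an acyclic `reportsTo` graph among the Leg 1 properties, but the scenario body in this diff walks only the department parent graph. A minimal sketch of what a `reportsTo` walk could look like follows. It assumes that `reportsTo` holds the `rosterId` of another member, which the diff does not confirm; `ReportingMember` and `hasReportsToCycle` are hypothetical names.

```typescript
// Assumed shape: reportsTo is the rosterId of the member's manager, or null.
interface ReportingMember {
  rosterId: string;
  reportsTo?: string | null;
}

// True when following reportsTo links from any member revisits a member.
function hasReportsToCycle(members: ReportingMember[]): boolean {
  const managerOf = new Map<string, string | null>();
  for (const m of members) managerOf.set(m.rosterId, m.reportsTo ?? null);

  for (const m of members) {
    const seen = new Set<string>();
    let cur: string | null = m.rosterId;
    while (cur !== null) {
      if (seen.has(cur)) return true; // revisited on the same walk: a cycle
      seen.add(cur);
      cur = managerOf.get(cur) ?? null; // unknown manager ends the walk
    }
  }
  return false;
}

const treeShaped = hasReportsToCycle([
  { rosterId: 'host:ceo', reportsTo: null },
  { rosterId: 'host:lead', reportsTo: 'host:ceo' },
  { rosterId: 'host:dev', reportsTo: 'host:lead' },
]);
const looped = hasReportsToCycle([
  { rosterId: 'host:a', reportsTo: 'host:b' },
  { rosterId: 'host:b', reportsTo: 'host:a' },
]);
console.log(treeShaped, looped);
```

The walk is quadratic in the worst case, which is acceptable for roster-sized inputs; a single pass with a shared visited set would be the choice for larger charts.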
@@ -0,0 +1,78 @@
+ /**
+  * Org position confers no authority — the §B invariant, behavioral leg
+  * (RFC 0087 §B) — the protocol-tier `org-position-no-authority-escalation`.
+  *
+  * The STRUCTURAL leg (the `agent-org-chart.schema.json` is `additionalProperties:
+  * false` and rejects an authority-bearing field on a member) is always-on /
+  * server-free in `agent-org-chart-shape.test.ts`. This scenario is the
+  * BEHAVIORAL leg, gated on `capabilities.agents.orgChart.supported`: it proves
+  * against the LIVE host that the org-chart projector strips position-as-authority
+  * — no member, department, or responsibility-view object served on the wire
+  * carries an authority-bearing field (`scopes` / `canDispatch` / `permissions` /
+  * `authority` / `roleGrants` / `capabilities`), at every install scope. An org
+  * edge is an *ownership + reporting* record, never an authority grant.
+  *
+  * Soft-skips when unadvertised (default) / hard-fails under
+  * `OPENWOP_REQUIRE_BEHAVIOR=true`.
+  *
+  * The deeper authority-invariance legs — a manager agent cannot dispatch a
+  * report's tools (RFC 0002 §A14), an RFC 0049 authorization decision is
+  * invariant to org position, an RFC 0051 approval gate is not satisfied by org
+  * seniority — require a non-normative host authorization-decide hook to force
+  * black-box; a conformant host need not expose one, so (mirroring the RFC 0070
+  * `agent-manifest-runtime` confidence-escalation note) they stay reference-impl
+  * tier and are NOT asserted here. The wire-projection proof below is the
+  * load-bearing, hook-free behavioral guarantee.
+  *
+  * Spec references:
+  *   - https://github.com/openwop/openwop/blob/main/spec/v1/agent-org-chart.md (§B)
+  *   - https://github.com/openwop/openwop/blob/main/RFCS/0087-agent-org-chart.md (§B)
+  *   - https://github.com/openwop/openwop/blob/main/SECURITY/invariants.yaml (org-position-no-authority-escalation)
+  */
+
+ import { describe, it, expect } from 'vitest';
+ import { driver } from '../lib/driver.js';
+ import { behaviorGate } from '../lib/behavior-gate.js';
+ import { readOrgChartCap, getOrgChart, getDepartmentView, AUTHORITY_FIELDS } from '../lib/agentOrgChart.js';
+
+ /** Assert an org-chart wire object carries no authority-bearing field. */
+ function expectNoAuthority(obj: Record<string, unknown> | undefined, where: string): void {
+   if (!obj || typeof obj !== 'object') return;
+   for (const f of AUTHORITY_FIELDS) {
+     expect(
+       !(f in obj),
+       driver.describe('RFC 0087 §B / org-position-no-authority-escalation', `${where} MUST NOT carry the authority field "${f}" — org position confers no authority`),
+     ).toBe(true);
+   }
+ }
+
+ describe('org-position-no-authority-escalation (RFC 0087 §B, behavioral)', () => {
+   it('the live org-chart wire carries no authority-bearing field on any member/department/view', async () => {
+     const cap = await readOrgChartCap();
+     if (!behaviorGate('openwop-org-position-no-authority', cap?.supported === true)) return;
+
+     const chart = await getOrgChart();
+     if (chart === null) return; // advertised but read not served yet — soft-skip
+
+     for (const m of chart.members ?? []) {
+       expectNoAuthority(m as Record<string, unknown>, 'an org-chart member');
+     }
+     for (const d of chart.departments ?? []) {
+       expectNoAuthority(d as Record<string, unknown>, 'an org-chart department');
+     }
+
+     // The §D responsibility roll-up is a portfolio union (workflow ids), never an
+     // authority grant — assert its members + the view object are authority-free too.
+     const probeDeptId = (chart.departments ?? [])[0]?.departmentId;
+     if (typeof probeDeptId === 'string') {
+       const { status, view } = await getDepartmentView(probeDeptId);
+       if (status === 200 && view) {
+         expectNoAuthority(view as unknown as Record<string, unknown>, 'the responsibility view');
+         expectNoAuthority(view.department as Record<string, unknown> | undefined, "the responsibility view's department");
+         for (const m of view.members ?? []) {
+           expectNoAuthority(m as Record<string, unknown>, 'a responsibility-view member');
+         }
+       }
+     }
+   });
+ });
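`expectNoAuthority` checks the top-level keys of each wire object. For a host that nests extension objects inside a member or department, a recursive scan is one possible stricter variant. The sketch below is not part of the package: `findAuthorityPaths` is a hypothetical name, the field list is copied from `AUTHORITY_FIELDS`, and whether nested occurrences count as §B violations is an assumption the diff does not settle.

```typescript
// Field names copied from AUTHORITY_FIELDS in the diff.
const AUTHORITY_KEYS = ['scopes', 'canDispatch', 'permissions', 'authority', 'roleGrants', 'capabilities'];

// Hypothetical helper: return a JSONPath-like location for every
// authority-bearing key found at any depth of a decoded JSON value.
function findAuthorityPaths(value: unknown, path: string = '$'): string[] {
  const hits: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => {
      hits.push(...findAuthorityPaths(item, `${path}[${i}]`));
    });
    return hits;
  }
  if (value === null || typeof value !== 'object') return hits;
  const obj = value as Record<string, unknown>;
  for (const key of Object.keys(obj)) {
    const here = `${path}.${key}`;
    if (AUTHORITY_KEYS.indexOf(key) !== -1) hits.push(here);
    hits.push(...findAuthorityPaths(obj[key], here));
  }
  return hits;
}

const nestedHits = findAuthorityPaths({
  members: [{ rosterId: 'host:a', meta: { scopes: [] } }],
  authority: 'x',
});
console.log(nestedHits);
```

Returning paths instead of a boolean keeps the failure message specific, in the same spirit as the `where` argument the scenario passes to `expectNoAuthority`.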
@@ -0,0 +1,126 @@
+ /**
+  * Durable trigger bridge — delivery model (RFC 0083 §C) — behavioral.
+  *
+  * Profile-gated on `openwop-trigger-bridge` (derived from the live discovery
+  * doc per RFC 0083 §D: the bridge advertised + a dead-letter sink + a durable
+  * source). Soft-skips when the profile isn't derived (default) / hard-fails
+  * under `OPENWOP_REQUIRE_BEHAVIOR=true`. The always-on wire-shape coverage
+  * lives in `trigger-bridge-shape.test.ts`; this asserts host BEHAVIOR via the
+  * `POST /v1/host/sample/trigger-bridge/deliver` seam + the test event-log seam:
+  *
+  *   1. DEDUP (§C-1) — the same `dedupKey` delivered twice is effectively-once:
+  *      exactly one `trigger.delivery.attempted { outcome:"delivered" }` for that
+  *      key (at-least-once collapses to once within the retention window).
+  *   2. RETRY → DEAD-LETTER (§C-2 + RFC 0053) — an exhausted retry policy lands a
+  *      terminal `trigger.delivery.attempted { outcome:"dead-lettered" }` and a
+  *      `trigger.subscription.state.changed { toState:"dead-lettered" }`; both
+  *      content-free (SR-1: ids/states/counters only).
+  *   3. CAUSATION (§C / RFC 0040) — a successful delivery's resulting run carries
+  *      `run.started.causationId` == the delivery id (trigger → run is resolvable
+  *      via `/ancestry`).
+  *
+  * The whole scenario soft-skips when the event-log or delivery seam is absent; each leg also soft-skips independently when its drive yields no run or subscription id.
+  *
+  * Spec references:
+  *   - https://github.com/openwop/openwop/blob/main/spec/v1/trigger-bridge.md (§C)
+  *   - https://github.com/openwop/openwop/blob/main/RFCS/0083-durable-trigger-and-channel-bridge-profile.md
+  *   - https://github.com/openwop/openwop/blob/main/spec/v1/profiles.md (§openwop-trigger-bridge)
+  */
+
+ import { describe, it, expect } from 'vitest';
+ import { driver } from '../lib/driver.js';
+ import { behaviorGate } from '../lib/behavior-gate.js';
+ import {
+   isTriggerBridgeProfileAdvertised,
+   driveDelivery,
+   DELIVERY_OUTCOMES,
+   SUBSCRIPTION_STATES,
+ } from '../lib/triggerBridge.js';
+ import { queryTestEvents, isEventLogSeamAvailable, resetTestSeam } from '../lib/event-log-query.js';
+
+ const CONTENT_FREE_FORBIDDEN = ['body', 'headers', 'payload', 'secret', 'credentials', 'token', 'apiKey'];
+
+ function expectContentFree(payload: Record<string, unknown>, where: string): void {
+   for (const f of CONTENT_FREE_FORBIDDEN) {
+     expect(
+       !(f in payload),
+       driver.describe('RFC 0083 §C (SR-1)', `${where} MUST be content-free (no ${f})`),
+     ).toBe(true);
+   }
+ }
+
+ describe('trigger-bridge-delivery (RFC 0083 §C)', () => {
+   it('de-dups by dedupKey, retries to dead-letter, and links delivery→run causation', async () => {
+     if (!behaviorGate('openwop-trigger-bridge', await isTriggerBridgeProfileAdvertised())) return;
+     if (!(await isEventLogSeamAvailable())) return; // event-log seam absent — soft-skip
+
+     // ---- Leg 1: dedup → effectively-once (§C-1) ---------------------------
+     const dedup = await driveDelivery({ scenario: 'dedup', dedupKey: 'conformance-dedup-key', source: 'queue' });
+     if (dedup === null) return; // delivery seam unwired — soft-skip the whole behavioral suite
+     if (dedup.runId || dedup.subscriptionId) {
+       const subId = dedup.subscriptionId;
+       const q = await queryTestEvents(dedup.runId ?? '__dedup__', { type: 'trigger.delivery.attempted' });
+       if (q.ok) {
+         const deliveredForKey = q.events.filter(
+           (e) => e.payload.dedupKey === 'conformance-dedup-key' && e.payload.outcome === 'delivered',
+         );
+         // Effectively-once: a repeated dedupKey MUST NOT produce two 'delivered' attempts.
+         expect(
+           deliveredForKey.length <= 1,
+           driver.describe('trigger-bridge.md §C-1', 'a repeated dedupKey MUST be effectively-once (≤1 delivered attempt)'),
+         ).toBe(true);
+         for (const e of q.events) {
+           expect(
+             typeof e.payload.outcome === 'string' && DELIVERY_OUTCOMES.includes(e.payload.outcome as string),
+             driver.describe('run-event-payloads.schema.json#triggerDeliveryAttempted', 'outcome MUST be delivered|retrying|dead-lettered'),
+           ).toBe(true);
+           expectContentFree(e.payload, 'trigger.delivery.attempted');
+         }
+       }
+       void subId;
+     }
+
+     // ---- Leg 2: retry → dead-letter (§C-2 + RFC 0053) --------------------
+     const exhaust = await driveDelivery({ scenario: 'exhaust', source: 'webhook' });
+     if (exhaust && (exhaust.runId || exhaust.subscriptionId)) {
+       const key = exhaust.runId ?? '__exhaust__';
+       const dq = await queryTestEvents(key, { type: 'trigger.delivery.attempted' });
+       if (dq.ok && dq.events.length > 0) {
+         const terminal = [...dq.events].sort((a, b) => a.sequence - b.sequence)[dq.events.length - 1]!;
+         expect(
+           terminal.payload.outcome === 'dead-lettered',
+           driver.describe('trigger-bridge.md §C-2', 'an exhausted retry policy MUST terminate in a dead-lettered delivery'),
+         ).toBe(true);
+       }
+       const sq = await queryTestEvents(key, { type: 'trigger.subscription.state.changed' });
+       if (sq.ok && sq.events.length > 0) {
+         const toDeadLetter = sq.events.some((e) => e.payload.toState === 'dead-lettered');
+         expect(
+           toDeadLetter,
+           driver.describe('trigger-bridge.md §B', 'the subscription MUST transition to dead-lettered on exhaustion'),
+         ).toBe(true);
+         for (const e of sq.events) {
+           expect(
+             typeof e.payload.toState === 'string' && SUBSCRIPTION_STATES.includes(e.payload.toState as string),
+             driver.describe('trigger-bridge.md §B', 'toState MUST be in the four-state vocabulary'),
+           ).toBe(true);
+           expectContentFree(e.payload, 'trigger.subscription.state.changed');
+         }
+       }
+     }
+
+     // ---- Leg 3: delivery → run causation (§C / RFC 0040) -----------------
+     const delivered = await driveDelivery({ scenario: 'deliver', source: 'schedule' });
+     if (delivered?.runId) {
+       const rq = await queryTestEvents(delivered.runId, { type: 'run.started' });
+       if (rq.ok && rq.events[0]) {
+         expect(
+           typeof rq.events[0].causationId === 'string' && (rq.events[0].causationId as string).length > 0,
+           driver.describe('trigger-bridge.md §C / RFC 0040', 'the delivered run.started MUST carry the delivery causationId (resolvable via /ancestry)'),
+         ).toBe(true);
+       }
+     }
+
+     await resetTestSeam();
+   });
+ });
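
The two event-log checks that Legs 1 and 2 rely on (at most one 'delivered' attempt per dedupKey, and a terminal attempt chosen by highest sequence) can be sketched in isolation as below. This is a minimal illustration under stated assumptions, not the package's helper code: the `DeliveryEvent` interface, the `deliveredCount` and `terminalEvent` functions, and the `sample` data are hypothetical names introduced here. Only the three outcome values and the `sequence`, `dedupKey`, and `outcome` fields are taken from the test above.

```typescript
// Hypothetical event shape for illustration; the real seam's response type
// is not shown in this diff.
interface DeliveryEvent {
  sequence: number;
  payload: { dedupKey?: string; outcome: 'delivered' | 'retrying' | 'dead-lettered' };
}

// Count 'delivered' attempts for one dedupKey. Effectively-once delivery
// means this count is at most 1 for any key.
function deliveredCount(events: DeliveryEvent[], dedupKey: string): number {
  return events.filter(
    (e) => e.payload.dedupKey === dedupKey && e.payload.outcome === 'delivered',
  ).length;
}

// Pick the event with the highest sequence. The input is copied before
// sorting so the caller's array keeps its original order.
function terminalEvent(events: DeliveryEvent[]): DeliveryEvent | undefined {
  const ordered = [...events].sort((a, b) => a.sequence - b.sequence);
  return ordered[ordered.length - 1];
}

// Made-up sample log, deliberately out of sequence order.
const sample: DeliveryEvent[] = [
  { sequence: 2, payload: { dedupKey: 'k1', outcome: 'retrying' } },
  { sequence: 1, payload: { dedupKey: 'k1', outcome: 'delivered' } },
  { sequence: 3, payload: { dedupKey: 'k2', outcome: 'dead-lettered' } },
];

const effectivelyOnce = deliveredCount(sample, 'k1') <= 1;
const terminalOutcome = terminalEvent(sample)?.payload.outcome;
```

Copying before sorting matters only when the same event array is read again later; `Array.prototype.sort` otherwise reorders it in place.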