@openwop/openwop-conformance 1.23.0 → 1.24.0
- package/README.md +2 -2
- package/api/asyncapi.yaml +54 -0
- package/coverage.md +8 -0
- package/package.json +1 -1
- package/schemas/README.md +3 -0
- package/schemas/capabilities.schema.json +78 -0
- package/schemas/export-bundle.schema.json +66 -0
- package/schemas/goal.schema.json +104 -0
- package/schemas/proposal.schema.json +84 -0
- package/schemas/run-event-payloads.schema.json +80 -2
- package/schemas/run-event.schema.json +6 -1
- package/src/scenarios/export-bundle-portability.test.ts +120 -0
- package/src/scenarios/goal-standing-continuation.test.ts +139 -0
- package/src/scenarios/proposal-reviewable-learning.test.ts +129 -0
package/README.md
CHANGED
@@ -92,7 +92,7 @@ Exit code is non-zero on any failed assertion.

## What's Covered

-
The current suite has 340 scenario files under `src/scenarios/`. 2026-06-11 (RFCs 0093/0094 — protocol hardening + wire-shape reconciliation) added five: `version-fold.test.ts` (the `version-negotiation.md` §`X-Force-Engine-Version` cross-version matrix through the previously-orphaned `conformance-version-fold` fixture — closes catalog gap F5; soft-skips when `Capabilities.testing.forceEngineVersionRange` is unadvertised), `stream-text-fixture.test.ts` (the `stream-modes.md` §`messages` fold through the deterministic `stream-text` mock provider + the previously-orphaned `conformance-stream-text` fixture — closes catalog gap F1), `i18n-negotiation.test.ts` (gated on `capabilities.i18n` via `behaviorGate('openwop-i18n', …)` — an unsupported or malformed `Accept-Language` never 400s, `Content-Language` reflects the locale actually used, and error `code` strings stay the canonical English tokens), `grpc-transport.test.ts` (gated on `capabilities.grpc` via `behaviorGate('openwop-grpc-transport', …)` — advertisement-shape only per `grpc-transport.md` §Field semantics: `service` MUST be `openwop.v1.Engine`, the `tls` enum, `grpcs?://` endpoint URIs, `supportedTransports` includes `grpc` when exposed, production claimants require `tls: "required"`; no gRPC dialing), and `webhook-tenant-isolation.test.ts` (RFC 0093 §A.3 — backs the new protocol-tier `webhook-cross-tenant-isolation` invariant; a two-tenant proof through the `/v1/host/sample/test/surface` seam plus black-box registration-surface scoping). `spec-corpus-validity.test.ts` also gained the RFC 0094 §A satisfiability probe: canonical `createRun` bodies MUST pass the composed request schema (closed via `unevaluatedProperties: false` at the composition site, never inside an `allOf` branch) and an undeclared property MUST fail. 
2026-06-07 (RFCs 0090/0091/0092 — verifier turn + convergence, multimodal perception input, agent capability requirements) added six: the always-on, server-free shape probes `agent-verifier-shape.test.ts`, `aiproviders-input-shape.test.ts`, `agent-requires-capabilities-shape.test.ts`, plus the capability-gated **behavioral** legs `agent-capability-degraded-projection.test.ts` (RFC 0092 §B — the `degraded[]` projection on `GET /v1/agents`, black-box, non-vacuous via `OPENWOP_DEGRADED_CAPABILITY_AGENT_ID`), `callai-multimodal.test.ts` (RFC 0091 §A/§B — advertised modality accepted / unadvertised → `unsupported_modality`, via the `POST /v1/host/sample/ai/call` seam), and `verifier-gating.test.ts` (RFC 0090 §B — a `fail` verdict blocks commit, via the `POST /v1/host/sample/agents/verify-run` seam). The three behavioral legs soft-skip by default and hard-fail under `OPENWOP_REQUIRE_BEHAVIOR=true` — the Active→Accepted reference-host proof for each RFC. 2026-06-02 (RFC 0082 §B — deployment channel resolve-and-pin, production-path coverage) added `agent-channel-dispatch.test.ts` (capability-gated on `agents.deployment.supported` + the seeded `conformance-agent-channel-dispatch` fixture + advertised `replay` mode via `behaviorGate('openwop-deployment-channel-dispatch', …)` — proves the §B pin from a REAL run graph, complementing `agent-deployment-lifecycle.test.ts` Leg 4's host-sample seam: a canonical `POST /v1/runs` of a node binding `agent.channel:"stable"` MUST record `resolvedChannel` + `resolvedAgentVersion` on `agent.invocation.started` (RFC 0077), a `:fork{mode:"replay"}` MUST re-read that recorded version, and the seam-guarded Leg 3 MOVES the channel then asserts a replay STILL carries the original pin — never re-resolving a moved channel; soft-skips by default, hard-fails under `OPENWOP_REQUIRE_BEHAVIOR=true` — the production-path proof of the §B contract). 
2026-06-01 (RFC 0085 — `openwop-agent-platform` meta-profile, the Active→Accepted behavioral gate) added `agent-platform-aggregate-evidence.test.ts` (capability-gated on a host CLAIMING `openwop-agent-platform` in its live discovery `profiles[]` via `behaviorGate('openwop-agent-platform', …)` — the §C/§D honest-advertisement rule on the live `/.well-known/openwop`: the claim MUST satisfy the §B floor predicate (`isAgentPlatformPartial` → `partial`/`full`, never `none`), backed by the per-capability evidence not the profile string; `OPENWOP_AGENT_PLATFORM_TIER=full` forces the non-vacuous full bar — all governance terms + tenant installScope + all 16 §D terms; server-requiring, the always-on §B/§D derivation legs stay in `agent-platform-profile.test.ts` — the RFC 0085 → Accepted bar). 2026-06-01 (RFC 0084 — budget, quota + cost policy, the Active→Accepted behavioral gate) added `budget-enforcement.test.ts` (capability-gated on `budget.supported` via `behaviorGate('openwop-budget-enforcement', …)` — the §C/§D enforcement via the new `POST /v1/host/sample/budget/run` seam + the test event-log seam: a `hard-cost-exhaust` run emits the strict-ordered `budget.reserved → budget.consumed → budget.threshold.crossed{percent} → budget.exhausted → cap.breached{kind:"budget-cost"} → run.failed{error:"budget_exhausted"}` chain; a `model-denied` run is refused `budget_model_denied` BEFORE the provider call (fail-closed); an `advisory` host emits the `budget.*` events without stopping; every `budget.*` payload content-free backing `budget-no-pricing-leak`; new lib helper `src/lib/budgetPolicy.ts`; soft-skips on 404 — the RFC 0084 → Accepted bar). 
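The strict-ordered `hard-cost-exhaust` chain described for `budget-enforcement.test.ts` can be illustrated with a small ordering check over event type names. This is a sketch under stated assumptions, not the suite's code: the helper name `firstOccurrencesInOrder` is hypothetical, the event log is reduced to a plain array of type strings, and unrelated or repeated events are assumed to be allowed between the chain members.

```typescript
// Event types of the hard-cost-exhaust chain, in the order the README lists them.
const HARD_COST_EXHAUST_CHAIN: readonly string[] = [
  "budget.reserved",
  "budget.consumed",
  "budget.threshold.crossed",
  "budget.exhausted",
  "cap.breached",
  "run.failed",
];

// Hypothetical helper: true when every chain member occurs in the log and the
// first occurrence of each one comes after the first occurrence of its predecessor.
// Assumption: other events may be interleaved and chain members may repeat.
function firstOccurrencesInOrder(
  eventTypes: readonly string[],
  chain: readonly string[],
): boolean {
  let previous = -1;
  for (let i = 0; i < chain.length; i++) {
    const at = eventTypes.indexOf(chain[i] as string);
    if (at === -1 || at <= previous) {
      return false; // missing event, or it first appears too early
    }
    previous = at;
  }
  return true;
}
```

A log that emits `budget.exhausted` before `budget.threshold.crossed`, or that never reaches `run.failed`, is rejected by this check.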
2026-06-01 (RFC 0080 — agent memory capability reconciliation, the Active→Accepted behavioral gate) added `memory-degraded-projection.test.ts` (capability-gated on `agents.manifestRuntime.supported` + `memory.supported` via `behaviorGate('openwop-memory-degraded', …)` — the §C degraded-projection iff-contract on the NORMATIVE `GET /v1/agents`: a degraded inventory entry MUST carry `memoryDegraded:true` + a non-empty, unique `degradedMemoryDimensions[]` from the closed §A-name enum, a non-degraded entry MUST NOT, the inventory is non-empty, and the degraded branch runs non-vacuously when `OPENWOP_DEGRADED_AGENT_ID` names a known-degraded agent; black-box, no POST seam — the RFC 0080 → Accepted bar). This batch also documents the two RFC 0068 conformance seams (`POST /v1/host/sample/memory/consolidate` + `.../commitment/fire`) in `host-sample-test-seams.md` (the 0068 gated scenarios shipped in 1.14.0). 2026-06-01 (RFC 0034 — collector-side BYOK-canary inspection) added `otel-collector-canary-inspection.test.ts` (always-on server-free: stands up a real `OtelCollector`, POSTs synthetic OTLP/HTTP-JSON traces + metrics through its actual ingest path, and proves the new `findCanaryLeakage()` inspector catches a canary embedded in a span attribute / resource attribute / span name / metric data-point attribute while reporting ZERO hits on a redacted payload and never matching an empty canary — the non-vacuous proof that the conformance collector now inspects what the host's OTLP exporter ACTUALLY shipped over the wire, closing the `secret-leakage-otel-attribute` / `-debug-bundle-otel` collector-seam gap; the live capability-gated complement is the new collector-export describe block in `secret-leakage-otel-attribute.test.ts`). 
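The canary inspection described for `otel-collector-canary-inspection.test.ts` amounts to a recursive scan of an ingested payload for the canary string, with the rule that an empty canary never matches. The sketch below is an assumption-laden illustration: `findCanaryPaths` is a hypothetical name, not the suite's `findCanaryLeakage()`, whose signature and return shape are not shown in this README, and the payload is treated as arbitrary parsed JSON rather than a typed OTLP document.

```typescript
// Hypothetical scanner: returns the JSON-style path of every string value
// that contains the canary. It inspects values only, not object keys.
function findCanaryPaths(payload: unknown, canary: string): string[] {
  const hits: string[] = [];
  if (canary.length === 0) {
    return hits; // an empty canary must never match anything
  }

  function visit(value: unknown, path: string): void {
    if (typeof value === "string") {
      if (value.indexOf(canary) !== -1) {
        hits.push(path);
      }
    } else if (Array.isArray(value)) {
      value.forEach((item: unknown, index: number) => visit(item, path + "[" + index + "]"));
    } else if (typeof value === "object" && value !== null) {
      const record = value as Record<string, unknown>;
      Object.keys(record).forEach((key) => visit(record[key], path + "." + key));
    }
  }

  visit(payload, "$");
  return hits;
}
```

Returning paths rather than a boolean lets a failing assertion say where the leak was found (span name, span attribute, resource attribute, or metric data point).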
2026-06-01 (RFC 0035 — sandbox wall-clock timeout, the 7th-of-8 graduation) added `sandbox-wasm-timeout.test.ts` (worker-driven server-free: `probeTimeout` in `wasm-sandbox-probe.ts` spawns a worker thread running the committed `misbehaving-timeout.wasm` + a main-thread kill-timer — the thread preemption a same-thread probe can't do — asserting `sandbox_timeout` with a well-behaved positive control; graduates `node-pack-sandbox-timeout` reference-impl→protocol, so 7 of 8 `node-pack-sandbox-*` invariants are now protocol-tier, only the JS-specific `no-eval` permanently exempt). 2026-05-31 (audit-response black-box / graduation batch) added three more: `sandbox-wasm-isolation.test.ts` (RFC 0035 — drives the committed `fixtures/wasm-sandbox/*.wasm` through `wasm-sandbox-probe.ts`: escape/capability-gate via static `WebAssembly.Module.imports()`, an OOB-store memory trap, double-instantiate isolation; 10/10; graduates 6 `node-pack-sandbox-*` invariants reference-impl→protocol), `workspace-cross-tenant-isolation-blackbox.test.ts` (RFC 0059 — two-credential black-box on the normative §C `/v1/host/workspace/files` endpoints: owner A writes, a second-tenant credential fails closed; no seam), and `prompt-resolution-chain-event.test.ts` (RFC 0029 — reads the durable `agent.promptResolved.chain[]` precedence record via the normative `GET /v1/runs/{runId}/events/poll`; no seam) — each the production-path proof that graduates its surface into the `openwop-core-standard` floor. 2026-05-31 (RFC 0088 — the `openwop-core-standard` Core Standard Profile, the audit-response Core Candidate target) added `core-standard-profile.test.ts` (always-on server-free derivation probe: `isCoreStandard` derives the §B floor — `openwop-core` ∧ `openwop-interrupts` ∧ (`openwop-stream-sse` ∨ `openwop-stream-poll`) — a bare `openwop-core` host without interrupts is excluded, a host with no event transport fails, and the annex is absent from `deriveProfiles` because it composes rather than redefines). 
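The §B floor that `core-standard-profile.test.ts` derives is a boolean predicate over a host's profile set. A minimal sketch follows, assuming the profiles are available as a list of strings; `isCoreStandardSketch` is an illustrative re-implementation, and the suite's actual `isCoreStandard` may take a different input shape.

```typescript
// Hypothetical re-implementation of the floor stated in the README:
// openwop-core AND openwop-interrupts AND (openwop-stream-sse OR openwop-stream-poll).
function isCoreStandardSketch(profiles: readonly string[]): boolean {
  const has = (name: string): boolean => profiles.indexOf(name) !== -1;
  const hasEventTransport = has("openwop-stream-sse") || has("openwop-stream-poll");
  return has("openwop-core") && has("openwop-interrupts") && hasEventTransport;
}
```

Under this reading, a bare `openwop-core` host and a host with interrupts but no event transport both fall below the floor, matching the two exclusions the scenario asserts.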
2026-05-31 (RFC 0082 — agent deployment lifecycle, the Active→Accepted behavioral gate) added `agent-deployment-lifecycle.test.ts` (capability-gated on `agents.deployment.supported` via `behaviorGate('openwop-deployment-lifecycle', …)` — the §E promotion contract via the new `POST /v1/host/sample/agents/deployment-transition` seam + the test event-log seam across four legs: `promote` (authorize RFC 0049 → approvalGate RFC 0051 → eval-verify RFC 0081 → content-free `deployment.promoted` with a seven-state `toState` + `toVersion`, the record validating `agent-deployment.schema.json`), `unauthorized` (fail-closed — `allowed:false`, no `deployment.promoted`, the behavioral leg of `deployment-promotion-fail-closed`), `eval-gate-unmet` (`eval_gate_unmet` denial, §E-3), and `channel-pin` (the §B `resolvedAgentVersion` recorded-fact on `agent.invocation.started`); new lib helper `src/lib/agentDeployment.ts`; soft-skips on 404 — the RFC 0082 → Accepted bar). 2026-05-31 (RFC 0081 — agent evaluation, the Active→Accepted behavioral gate) added `agent-eval-run.test.ts` (capability-gated on `agents.evalSuite.supported` via `behaviorGate('openwop-eval-run', …)` — the §B `mode:"eval"` projection via the new `POST /v1/host/sample/agents/eval-run` seam + the test event-log seam: `eval.started`-first → one `eval.scored` per task → `eval.completed`-once ordering (count == `eval.completed.taskCount`), the content-free `eval.scored` legs (`score` ∈ 0..1) backing `eval-summary-no-content-leak`, and the NORMATIVE `GET /v1/runs/{runId}/eval-summary` schema-valid `EvalSummary` round-trip with `passedCount <= taskCount`; new lib helper `src/lib/agentEval.ts`; soft-skips on 404 — the RFC 0081 → Accepted bar). 
2026-05-31 (RFC 0083 — durable trigger bridge, the Active→Accepted behavioral gate) added `trigger-bridge-delivery.test.ts` (profile-gated on `openwop-trigger-bridge` derived from the live discovery doc — the §C delivery model via the `POST /v1/host/sample/trigger-bridge/deliver` seam + the test event-log seam: dedup→effectively-once `trigger.delivery.attempted{delivered}` (§C-1), retry-exhaustion→`{dead-lettered}` + `trigger.subscription.state.changed{toState:dead-lettered}` (§C-2 + RFC 0053), and the delivered run's `run.started.causationId` == the delivery id (§C / RFC 0040); both `trigger.*` events content-free; the always-on shape stays in `trigger-bridge-shape.test.ts`; new lib helper `src/lib/triggerBridge.ts`). 2026-05-31 (RFC 0087 — agent org-chart, the Active→Accepted behavioral gate) added two capability-gated behavioral scenarios (both gated on `agents.orgChart.supported`, black-box on the normative `/v1/agents/org-chart` surface — no new POST seam): `agent-org-chart-scoping.test.ts` (the `GET /v1/agents/org-chart` tree-shape — departments form an acyclic `parentDepartmentId` tree, members reference `host:<id>` roster entries — + the §D responsibility roll-up via `GET /v1/agents/org-chart/{departmentId}` with a deduped `responsibilities[]` union + the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID`) and `org-position-no-authority-escalation.test.ts` (the behavioral leg of the protocol-tier invariant — the live org-chart wire carries NO authority-bearing field on any member/department/responsibility-view object; the structural leg stays always-on in `agent-org-chart-shape.test.ts`, and the deeper RFC 0049/0051 authority-invariance legs stay reference-impl tier per the `agent-manifest-runtime` no-host-hook precedent). 
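The tree-shape requirement in `agent-org-chart-scoping.test.ts`, that departments form an acyclic `parentDepartmentId` tree, can be checked by walking each department's parent chain and failing on a revisit. The sketch below assumes each department object carries its own id in a `departmentId` field (the README names only `parentDepartmentId`); the interface and function names are hypothetical.

```typescript
// Assumed minimal department shape; only parentDepartmentId is named in the README.
interface DepartmentSketch {
  departmentId: string;
  parentDepartmentId?: string;
}

// Hypothetical check: follow parent links from every department and report
// a cycle if any id is seen twice on one walk. A parent id that does not
// resolve to a known department simply ends the walk here.
function isAcyclicDepartmentTree(departments: readonly DepartmentSketch[]): boolean {
  const parentOf: { [id: string]: string | undefined } = Object.create(null);
  departments.forEach((d) => {
    parentOf[d.departmentId] = d.parentDepartmentId;
  });

  return departments.every((start) => {
    const seen: { [id: string]: boolean } = Object.create(null);
    let current: string | undefined = start.departmentId;
    while (current !== undefined) {
      if (seen[current]) {
        return false; // revisited an id: the parent chain loops
      }
      seen[current] = true;
      current = parentOf[current];
    }
    return true;
  });
}
```

The walk is quadratic in the worst case, which is acceptable for an org-chart sized input; a production check might memoize departments already proven to reach a root.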
2026-05-31 (RFCs 0086 + 0077 — the Active→Accepted behavioral gate) added four capability-gated behavioral scenarios so a non-steward host can be mechanically certified non-vacuously under `OPENWOP_REQUIRE_BEHAVIOR=true`: `agent-roster-attribution.test.ts` (RFC 0086 §B/§C; gated on `agents.roster.supported` — the normative `GET /v1/agents/roster` read shape + `total==roster.length`, the §C `roster.run.initiated`-before-`agent.invocation.started` ordering, the content-free payload backing `roster-attribution-no-content`, the durable work-item `triggerSubscriptionId`, and the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ROSTER_ID`), `agent-live-invocation-bracket.test.ts` (RFC 0077 §E; gated on `agents.liveRuntime.supported` — `agent.invocation.started`-first / `agent.invocation.completed`-last bracket, matching `invocationId`, `source`/`outcome` closed enums, content-free), `agent-live-structured-output.test.ts` (RFC 0077 §B step 6; gated on `agents.liveRuntime.structuredOutput` — a result violating `handoff.returnSchemaRef` fails the invocation `outcome:"failed"` rather than shipping as completed), and `agent-live-allowlist-enforced.test.ts` (RFC 0077 §F-1 / RFC 0002 §A14; gated on `agents.liveRuntime.supported` — a tool outside `toolAllowlist` is not callable); all four drive the documented `POST /v1/host/sample/roster/fire` + `POST /v1/host/sample/agents/live-invoke` seams plus the test event-log seam and soft-skip on 404 (these are the RFC 0086 / 0077 Active→Accepted bars). 
2026-05-30 (RFC 0087 — agent org-chart, Draft -> Active) added `agent-org-chart-shape.test.ts` (always-on server-free: the `capabilities.agents.orgChart` shape + the `AgentOrgChart` round-trip + the non-`host:` member negative + the **§B structural non-authority guarantee** — the schema rejects a `scopes`/`canDispatch`/`permissions`/`authority` field on a member (`additionalProperties:false`), and a member's key set is exactly `{rosterId, departmentId, roleId, reportsTo}` — backing the protocol-tier `org-position-no-authority-escalation` invariant; no new RunEventType). 2026-05-30 (RFC 0086 — standing agent roster, Draft -> Active) added `agent-roster-shape.test.ts` (always-on server-free: the `capabilities.agents.roster` shape + the `AgentRosterEntry` round-trip + the `host:` `rosterId` + `agentRef` version-XOR-channel negatives + the content-free `roster.run.initiated` negatives backing the protocol-tier `roster-attribution-no-content` invariant + the additive `roster` inventory projection + RunEventType-enum membership). 2026-05-30 (RFC 0082 — agent deployment lifecycle, Draft -> Active) added `agent-deployment-shape.test.ts` (always-on server-free: the `capabilities.agents.deployment` shape + the `AgentDeployment` record round-trip + the `AgentRef` `channel` XOR `version` `not`-clause + the four `deployment.*` payloads + the content-free negatives backing the protocol-tier `deployment-event-no-content-leak` invariant). 2026-05-30 (RFC 0085 — `openwop-agent-platform` meta-profile, Draft -> Active) added `agent-platform-profile.test.ts` (always-on server-free derivation of the operational-annex `none`/`partial`/`full` status: all-floor ⇒ partial, missing-flag ⇒ none, the replay-OR-`nondeterminismPolicy.declared` term, floor+governance ⇒ full, missing-tenant-scope ⇒ partial-not-full per the honest-advertisement rule, eval/deploy/budget-are-advisory-not-hard-terms, + the `capabilities.nondeterminismPolicy.declared` shape). 
2026-05-30 (RFC 0084 — budget, quota + cost policy, Draft -> Active) added `budget-policy-shape.test.ts` (always-on server-free: `budget-policy.schema.json` round-trip + the §A orthogonality guard — a wall-time field is rejected (it's RFC 0058's `runTimeoutMs`) — + threshold/onExhaustion negatives + the four content-free `budget.{reserved,consumed,threshold.crossed,exhausted}` payloads + the four `cap.breached{budget-*}` kinds + RunEventType-enum membership + the no-pricing-property structural check backing the protocol-tier `budget-no-pricing-leak` invariant + the `capabilities.budget`/`limits.maxBudget*` shape). 2026-05-30 (RFC 0083 — durable trigger + channel bridge, Draft -> Active) added `trigger-bridge-shape.test.ts` (always-on server-free: `trigger-subscription.schema.json` round-trip + missing-`state`/out-of-enum-`source`/unknown-property negatives + the four-state vocab + the two content-free `trigger.{subscription.state.changed,delivery.attempted}` payloads incl. closed `state`/`outcome` enums + RunEventType-enum membership + the `triggerBridge`/`webhooks.durable` capability shape + the `openwop-trigger-bridge` profile derivation incl. the no-dead-letter-sink negative). 2026-05-30 (RFC 0079 — credential provenance + egress policy, Draft -> Active) added `egress-provenance-shape.test.ts` (always-on server-free: `credential-provenance.schema.json` round-trip + `audiences:[]`/missing-`credentialId`/unknown-property negatives + the no-secret-property structural check backing the protocol-tier `egress-decision-no-secret-leak` invariant + the content-free `egress.decided` record incl. the `decision` enum + RunEventType-enum membership + the `httpClient.egressPolicy` shape; the behavioral `egress-credential-audience-bound` confused-deputy MUST is reference-impl tier, deferred to a host). 
2026-05-30 (RFC 0078 — portable tool catalog, Draft -> Active) added `tool-descriptor-shape.test.ts` (always-on server-free: `tool-descriptor.schema.json` round-trip + the §C-1 `exec` ⇒ `host-extension` cross-field MUST (RFC 0069) + the `safetyTier`-required negative + `additionalProperties:false`, the `capabilities.toolCatalog` `supported`/`sources`/`sessionLifecycle` shape, and the two content-free `tool.session.{opened,closed}` payload $defs incl. the closed `outcome` enum + RunEventType-enum membership). 2026-05-30 (RFC 0080 — agent memory capability reconciliation, Draft -> Active) added `memory-capability-model-shape.test.ts` (always-on server-free: the additive `capabilities.memory.{writable,search,retention}` dimension shapes + malformed-instance negatives — `retention.ttl` non-boolean, out-of-enum `search.modes`, unknown property under `additionalProperties:false` — the `agent-inventory-response` `memoryDegraded`/`degradedMemoryDimensions` closed-enum fields, and the `openwop-memory` derivation surfacing for read/write + long-term hosts while withholding from `writable:false`). 2026-05-30 (RFC 0081 — agent evaluation, Draft -> Active) added `agent-eval-suite-shape.test.ts` (always-on server-free: the `capabilities.agents.evalSuite` shape + the `AgentEvalSuite`/`EvalSummary` schema round-trips + the three `eval.{started,scored,completed}` payloads + the content-free negatives — a task entry with a `taskOutput` body, a `safetyFinding` with an `excerpt` — backing the new `eval-summary-no-content-leak` SECURITY invariant). 
2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch` live-run audit) added `safefetch-live-audit.test.ts` (`behaviorGate('openwop-safefetch-live-audit', …)`, gated on `httpClient.safeFetch` + `toolHooks.prePostEvents`) — asserts the audit-when-both MUST against the **durable run event log** via the new `POST /v1/host/sample/http/safe-fetch-run` open seam + the test event-log seam, closing the seam-vs-production gap (a production `createSafeFetch()` with no audit hooks passes the inline `safefetch-behavior.test.ts` but FAILS this under `OPENWOP_REQUIRE_BEHAVIOR=true`); this is the RFC 0076 §B → Accepted bar; run seam soft-skips on 404 (host-pending). 2026-05-29 (RFC 0066 — `x-openwop-form` picker UX hints, Draft → Active) added `x-openwop-form-pack-manifest.test.ts` (always-on server-free: an annotated `configSchema` stays a valid 2020-12 schema + the advisory hints don't change what it accepts, each §A annotation matches the shape, an unknown `kind` validates for forward-compat, 3 negatives — missing/non-string `kind`, non-string `dependsOn`). 2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch`) added `safefetch-behavior.test.ts` (seam-gated: SSRF block / DNS-rebinding / `Connection: upgrade` refusal / tool-hooks audit-when-both, via `POST /v1/host/sample/http/safe-fetch`; advertisement contract stays in `http-client-ssrf.test.ts`). 2026-05-29 (RFC 0076 §A — pack `runtime.requires[]` install gate) added two: `runtime-requires-shape.test.ts` (server-free closed-vocabulary validation — the 8 tokens validate, a raw builtin name is rejected, empty-array≡omission, `uniqueItems`) + `runtime-requires-install-gate.test.ts` (seam-gated install-grant / install-refuse → `pack_runtime_requirement_unmet` / non-sandbox SHOULD-projection, soft-skip on 404 via `POST /v1/host/sample/packs/install-gate`). 
2026-05-29 (RFC 0047 — `host.oauth` authorization-code roundtrip) added `oauth-authorization-code-roundtrip.test.ts` — capability-gated on `capabilities.oauth.supported` + `grants` including `authorization_code`; drives the `POST /v1/host/sample/oauth/authorize-code-roundtrip` seam against the one canonical synthetic provider in `fixtures/oauth-providers/synthetic.json` (soft-skip on 404, Tier-2 host-pending), asserting a successful grant returns a credential REFERENCE (token persisted as a `host.credentials` entry) and that the authorization code / state / PKCE verifier / acquired access+refresh tokens never appear on any run-visible surface (RFC 0047 §C + §C.2 / `credential-payload-redaction`). Closes the RFC 0047 Tier-2 gap (capability-shape + redaction scenarios existed; the actual authorization-code dance was unexercised). 2026-05-26 (RFC 0070 — agent-manifest runtime) added `agent-manifest-runtime.test.ts`; 2026-05-26 (RFC 0071 — artifact-type + chat card packs) added six: `artifact-type-pack-manifest-validation.test.ts` + `artifact-schema-compile-bounded.test.ts` (server-free) + `artifact-type-pack-install.test.ts` + `artifact-type-store-without-render.test.ts` + `chat-card-pack-manifest-validation.test.ts` (server-free) + `chat-card-pack-execution.test.ts` (capability-gated, host-pending). 
2026-05-26 (RFCs 0067 / 0068 / 0069 — spec-gap Draft cohort) added five scenarios: `byok-auth-modes.test.ts` (RFC 0067; always-on schema-shape of `aiProviders.authModes` + a discovery-gated §B auth-mode-contract cross-field check), `memory-consolidation-shape.test.ts` (RFC 0068; always-on shape of `agents.memoryConsolidation`/`agents.commitments` + the `agent.memory.consolidated`/`commitment.fired` payload $defs), `memory-consolidation-idempotent.test.ts` + `commitment-fired.test.ts` (RFC 0068; capability-gated behavioral, soft-skip on the documented `/v1/host/sample/memory/consolidate` + `/commitment/fire` seams), and `exec-not-protocol-tier.test.ts` (RFC 0069; always-on server-free structural assertion that the protocol corpus defines no `core.*`/`openwop.*` exec-class primitive — backs the `exec-must-not-be-protocol-tier` SECURITY invariant). 2026-05-25 (RFC 0061 — stateful agent-loop lifecycle, executionModel.version 5) added four `agent-loop-*.test.ts` scenarios: `-version5-shape` (always-on; validates `executionModel.statefulResume`/`transcriptWindow` + the 1–5 version ceiling) plus `-iteration-monotonic` (gated on `version >= 5`; `runOrchestrator.decided.iteration` increments 1,2,3… exactly once per turn), `-workspace-snapshot` (gated additionally on `host.workspace.supported`; a turn-i workspace write is invisible to turn i, visible to turn i+1), and `-stateful-resume` (gated on `statefulResume`; a mid-loop suspend resumes at the same iteration without resetting the counter) — the three behavioral scenarios drive the documented agent-loop seam (`POST /v1/host/sample/agentloop/run`) and soft-skip until a host wires it. 
2026-05-25 (RFC 0059 — host.workspace M2, reference-host enforcement) added two `workspace-*.test.ts` scenarios: `-behavior` (capability-gated CRUD round-trip / `If-Match` 409 `workspace_conflict` / `workspace_too_large` / §D run-start snapshot, all via the real `/v1/host/workspace/files` §C endpoints) and `-cross-tenant-isolation` (WCT-1 — drives the documented `POST /v1/host/sample/workspace/op` seam to assert a file owned by one `{tenant, workspace}` is unreadable, on both `get` and `list`, under a different owner; backs the new `workspace-cross-tenant-isolation` SECURITY invariant). The in-memory reference host now advertises `capabilities.workspace.supported` and honors §C/§D/§E end-to-end. 2026-05-25 (RFC 0062 — memory.distillation "dreams") added five `distillation-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.memory.distillation` block + the additive `distillation` sub-object on `memory.compacted`) plus `-token-budget` (within budget `tokensUsed ≤ tokenBudget`; an un-meetable budget → `token_budget_exceeded` with no partial archive), `-stable-archive` (same sources + budget ⇒ byte-stable archive checksum), `-index-roundtrip` (gated additionally on `indexEmitted`; the `MEMORY-INDEX.json` workspace file is retrievable + `workspace.updated` fired), and `-secret-carryforward` (SR-1: a redacted source secret never appears in the archive) — the four behavioral scenarios drive the documented memory-distillation seam (`POST /v1/host/sample/memory/distill`) and soft-skip until a host wires it. 
2026-05-25 (RFC 0063 — core.subWorkflow.outputAttestation) added four `subrun-*.test.ts` scenarios: `-attestation-shape` (always-on; validates the `capabilities.agents.subRunAttestation` flag) plus `-checksum-stable` (the child output checksum is the byte-stable, key-order-invariant RFC 8785 JCS + SHA-256 digest), `-approval-gate` (`requireApproval` → `accept` merges, `reject` does not), and `-approval-fail-closed` (no `accept`/`edit-accept` → no merge; backs the deferred `subrun-merge-approval-fail-closed` invariant) — the three behavioral scenarios drive the documented sub-run attestation seam (`POST /v1/host/sample/subrun/attest`) and soft-skip until a host wires it. 2026-05-25 (RFC 0064 — host.toolHooks) added five `tool-hooks-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.toolHooks` block + the optional content-free fields on `agentToolCalled` / `agentToolReturned`) plus `-content-free` (gated on `prePostEvents`), `-authorization-fail-closed` (gated on `perToolAuthorization`), `-rate-limit` (gated on `perToolRateLimit`), and `-secret-redaction` (gated on `prePostEvents` + the SR-1 `argsHash` redaction rule) — the four behavioral scenarios drive the documented tool-hooks invoke seam (`POST /v1/host/sample/toolhooks/invoke`) and soft-skip until a host wires it. 2026-05-25 (RFC 0060 — host.heartbeat) added four `heartbeat-*.test.ts` scenarios: `-capability-shape` (always-on; validates the `capabilities.heartbeat` block) plus `-fires-once-per-tick`, `-idempotent-no-spam`, and `-runtime-bound` (gated on `capabilities.heartbeat.supported` + the host heartbeat tick seam; soft-skip until a host wires it). 
2026-05-25 (RFC 0057 — memory write-attribution) added five `memory-attribution-*.test.ts` scenarios: `-shape` (always-on advertisement check on `capabilities.memory.attribution`), plus `-no-content`, `-tenant-scoped`, `-emits-on-write`, and `-replay-stable` (gated on `capabilities.memory.attribution.emitsWriteEvents`) verifying the content-free `memory.written` RunEvent, its two SECURITY invariants (`memory-attribution-no-content` + `memory-attribution-tenant-scoped`), and the §D replay rule that a `replay`-mode fork MUST NOT regenerate `memoryId`. 2026-05-25 (RFC 0025 §C point 1 — test-catalog isolation invariant; pairs with the 25 publish-error scenarios in `pack-registry-publish.test.ts`) added `pack-registry-isolation.test.ts` — capability-gated on `capabilities.packs.testMode.{supported, isolated}: true`; PUTs a disposable pack into `/v1/packs-test/{name}` and asserts the same `(name, version)` does NOT appear via `GET /v1/packs/{name}` — anchors the test-catalog isolation MUST in RFC 0025 §C. 2026-05-25 (RFC 0028 Tier-2 post-promotion T2 — read-side sister scenario for workspace-membership enforcement) added `prompt-read-workspace-membership-enforced.test.ts` — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` so read-only hosts that expose `?workspaceId=` are also probed); drives `GET /v1/prompts?workspaceId=<random-non-member>` and interprets the response: 4xx PASS (canonical envelope check on 403); 200 with empty `templates[]` PASS (correct null result for a nonexistent workspace); 200 with non-empty `templates[]` FAIL (cross-tenant leak); 200 without `templates[]` field SKIP (host doesn't expose workspace-scoped reads). Verifies SECURITY invariant `prompt-read-workspace-membership-enforced`. Same-day T1 strengthened `prompt-mutation-workspace-membership-enforced.test.ts` to pin `error === "workspace_membership_required"` when the host's refusal status is 403 (other refusal codes unconstrained). 
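The response interpretation that `prompt-read-workspace-membership-enforced.test.ts` applies to `GET /v1/prompts?workspaceId=<random-non-member>` is a four-way classification. A sketch of that decision table is given here, assuming the caller has already parsed the status code and JSON body; `classifyPromptRead` and `ProbeVerdict` are hypothetical names, the canonical envelope check on 403 is left out, and the handling of statuses the README does not mention is an assumption.

```typescript
type ProbeVerdict = "pass" | "fail" | "skip";

// Hypothetical decision table for the read-side membership probe.
function classifyPromptRead(status: number, body: unknown): ProbeVerdict {
  if (status >= 400 && status < 500) {
    return "pass"; // host refused the non-member read
  }
  if (status !== 200) {
    return "skip"; // assumption: statuses outside 4xx/200 are not covered by the README
  }
  const templates = (body as { templates?: unknown } | null | undefined)?.templates;
  if (!Array.isArray(templates)) {
    return "skip"; // no templates[] field: host does not expose workspace-scoped reads
  }
  // Empty list is the correct null result; a non-empty list is a cross-tenant leak.
  return templates.length === 0 ? "pass" : "fail";
}
```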
2026-05-25 (RFC 0028 Tier-2 follow-up — workspace-membership enforcement on mutating prompt endpoints, filed in response to a self-disclosed adopter vulnerability) added `prompt-mutation-workspace-membership-enforced.test.ts` — capability-gated on `capabilities.prompts.mutableLibrary: true`; drives `POST /v1/prompts` with a cryptographically-random non-member `workspaceId` and asserts the host refuses (NOT a 2xx; any 4xx/5xx is acceptable — silent success is the failure mode). Verifies SECURITY invariant `prompt-mutation-workspace-membership-enforced`. 2026-05-22 (RFC 0034 §B follow-up — secret-leakage harness against the OTel + debug-bundle seams) added `secret-leakage-otel-attribute.test.ts` — gates on `capabilities.secrets.supported` + `capabilities.observability.testSeams.{otelScrape,debugBundleExport}` AND the `OPENWOP_CANARY_SECRET_VALUE` env (host operator + conformance runner agree on the canary). Drives the existing `openwop-smoke-byok-roundtrip` fixture end-to-end; scrapes both seams after run completion; hard-fails if the canary plaintext appears in any OTel span attribute or debug-bundle field. Verifies SECURITY invariants `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel`. 2026-05-22 (RFC 0041 Phase 4 — replay determinism under nondeterministic models) added three scenarios: `replay-divergence-at-refusal.test.ts` (advertisement-shape probe on `replayDeterminism.refusalDivergenceEmission` + 2 `it.todo` for the dual-direction refusal-divergence case), `replay-observable-sequence-determinism.test.ts` (capability-gated; behavioral assertion soft-skipped until a `conformance-phase4-nondet-tool` fixture ships), `replay-llm-cache-key-portable.test.ts` (intra-host reproducibility + non-recipe-field invariance + Phase 4 advertisement alignment — reuses the existing `POST /v1/host/sample/test/llm-cache-key` seam from the sibling `replay-llm-cache-key.test.ts`). 
2026-05-20 (RFC 0027 §A templateKinds-coverage follow-up — paired with `prompt-end-to-end-events.test.ts`) added `prompt-all-four-kinds-events.test.ts` exercising all four `PromptKind` values (`system`, `user`, `schema-hint`, `few-shot`) end-to-end through the reference workflow-engine sample's `local.sample.demo.mock-ai` dispatch path; capability-gated via `behaviorGate('prompts-supported', ...)`. Closes the credibility gap where the host advertised `templateKinds: ["system", "user", "few-shot", "schema-hint"]` but only the system+user pair was actually wired into dispatch. 2026-05-20 (RFCs 0030–0033 — envelope LLM-contract-hardening track) added 15 scenarios across four `Active` RFCs: `envelope-reasoning-shape.test.ts` (RFC 0030, always-on; asserts the OPTIONAL `reasoning` property on the three universal-kind schemas + the `schema.response` deliberate omission), `envelope-reasoning-secret-redaction.test.ts` (RFC 0030, capability-gated on `capabilities.envelopes.reasoning.supported` + `secrets.supported`; 5 `it.todo()` placeholders for SECURITY invariant `envelope-reasoning-secret-redaction`), `envelope-tier-one-subset-static.test.ts` (RFC 0030, always-on for load-bearing rules — no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere; gated on `tierOneSubsetCompliance: "strict"` for OpenAI-strict-only constraints), `envelope-variant-discriminator-static.test.ts` (RFC 0031, always-on; asserts no `oneOf` + every `anyOf` branch declares a single-string-enum discriminator in `required` on every `schemas/envelopes/*.schema.json`), `model-capability-substituted.test.ts` (RFC 0031, advertisement-shape probe on `capabilities.modelCapabilities.advertised[]` identifier pattern + 5 `it.todo()` placeholders for SECURITY invariant `model-capability-substituted-no-credential-disclosure`), `model-capability-insufficient.test.ts` (RFC 0031, 6 `it.todo()` placeholders for refusal + no-recursive-fallback), `node-module-required-capabilities-shape.test.ts` (RFC 0031 
SHOULD-tier authoring-convention; 4 `it.todo()` placeholders), and the six envelope-reliability events from RFC 0032 (`envelope-retry-attempted` carrying the shared advertisement-shape probe enforcing both MUST-tier events in `events[]` per RFC 0032 §C, plus `envelope-retry-exhausted`, `envelope-refusal-shape`, `envelope-truncated`, `envelope-nl-to-format-engaged`, `envelope-recovery-applied` — collectively 39 `it.todo()` placeholders covering retry/refusal/truncation/recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` and `envelope-recovery-no-content-leak`), plus RFC 0033's two scenarios (`envelope-completion-distinguishes-truncation.test.ts` + `envelope-truncation-cap-exhaustion.test.ts` — 12 `it.todo()` placeholders covering the truncation-vs-schema-violation retry-routing distinction + the DoS-bound assertion). Reference workflow-engine sample advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"` honestly (schemas accept the field; host doesn't yet inject the directive); the other three RFCs' capability blocks defer to reference-host emission code per the staged RFC 0027 §G precedent. 2026-05-20 (RFC 0028 §B Phase B — prompt-pack boot-time install) added `prompt-pack-install.test.ts` (capability-gated on `capabilities.prompts.endpointsSupported: true`; asserts a host that ran the boot-time pack loader surfaces ≥ 1 pack-source template under `GET /v1/prompts?source=pack` carrying the canonical `meta.source: "pack"` + `meta.packName` + `meta.packVersion` stamps; positively identifies the in-tree `vendor.openwop.prompt-sample` reference pack's `writer-system` template when present). Pairs with the new `host/promptPackLoader.ts` boot-time entry on the reference workflow-engine sample, which scans `examples/packs/*` plus `OPENWOP_PROMPT_PACKS_DIR` and calls `installPackTemplates()` for each `kind: "prompt"` pack found. 
2026-05-20 (RFC 0029 Phase C — prompt resolution chain wire shape) added three more scenarios: `prompt-resolution-chain-node-wins.test.ts` (capability-gated on `capabilities.prompts.supported: true`; asserts layer-1 node-config supersedes lower layers per `spec/v1/prompts.md` §"Resolution chain (normative)"), `prompt-resolution-chain-agent-intrinsic.test.ts` (additionally gated on `capabilities.prompts.agentBindings: true`; asserts agent intrinsic `systemPromptRef` wins over `promptOverrides` AND lower layers when the node has no layer-1 ref), `prompt-resolution-chain-fallback-cascade.test.ts` (asserts layer 3 workflow-defaults wins over layer 4 host-defaults; layer 4 host-defaults wins when 1-3 yield null; resolved is null when all four yield null but chain[] still lists every attempted layer). The scenarios drive the host's `POST /v1/host/sample/prompt/resolve` test seam (reference-host implementation deferred to follow-up slice per RFC 0021 staging precedent). 2026-05-20 (RFC 0027 Phase A — prompt templates wire shape) added three scenarios: `prompt-template-shape.test.ts` (always-on; Ajv compileability + positive/negative round-trip for PromptTemplate + PromptRef + PromptKind), `prompt-composed-secret-redaction.test.ts` (capability-gated on `capabilities.prompts.supported: true` + `observability: "full"`; asserts `[REDACTED:<secretId>]` markers in `prompt.composed` payloads for `source: "secret"` variable bindings per SECURITY/threat-model-secret-leakage.md §SR-1), `prompt-composed-trust-marker.test.ts` (same capability gates; asserts `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation per RFC 0020 §D). Paired with new `fixtures/prompt-templates/` sub-directory + per-fixture schema-validity describe block + future SECURITY invariants `prompt-composed-secret-redaction` and `prompt-composed-trust-marker` (lands alongside reference-host emission per RFC 0021 staging precedent). 
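The four-layer precedence exercised by the three `prompt-resolution-chain-*` scenarios can be illustrated with a minimal sketch. The layer labels, the `resolvePrompt` function, and the shape of `chain[]` entries below are assumptions made for illustration; the normative definition is in `spec/v1/prompts.md` §"Resolution chain (normative)". The sketch also assumes resolution stops at the first layer that yields a reference, so `chain[]` records only the layers that were attempted.

```typescript
// Hypothetical layer labels, ordered from highest to lowest precedence.
type LayerName = "node-config" | "agent" | "workflow-defaults" | "host-defaults";

interface ChainEntry {
  layer: LayerName;
  ref: string | null; // null when the layer yielded nothing
}

interface Resolution {
  resolved: string | null;
  chain: ChainEntry[];
}

const ORDER: readonly LayerName[] = [
  "node-config",
  "agent",
  "workflow-defaults",
  "host-defaults",
];

function resolvePrompt(layers: Partial<Record<LayerName, string | null>>): Resolution {
  const chain: ChainEntry[] = [];
  for (const layer of ORDER) {
    const ref = layers[layer] ?? null;
    chain.push({ layer, ref }); // every attempted layer is recorded
    if (ref !== null) return { resolved: ref, chain };
  }
  // All four layers yielded null: nothing resolved, but the chain is still complete.
  return { resolved: null, chain };
}
```

With no layer supplying a reference, `resolvePrompt({})` returns `resolved: null` and a four-entry `chain`, mirroring the fallback-cascade case described above.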
2026-05-18 (RFC 0022 `Draft` — runtime variable mapping) added four `it.todo()` placeholder scenarios covering the new mapping surfaces on `core.dispatch` (§A — `dispatch-input-mapping.test.ts`, `dispatch-output-mapping.test.ts`, `dispatch-cross-worker-handoff.test.ts`) and `core.subWorkflow` (§B — `subworkflow-input-mapping.test.ts`). Gated on `capabilities.agents.dispatchMapping` (dispatch trio) and `capabilities.subWorkflow.inputMapping` (subWorkflow). Promote to live assertions when RFC 0022 reaches `Active` + a reference host advertises the matching flags. 2026-05-17 (RFC 0003 §D handoff-schema enforcement, HV-1) added `agentPackHandoffSchemaValidation.test.ts` — verifies the host validates dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef` per RFC 0003 §D. Paired with the new `agent-pack-handoff-schema-enforcement` row in `SECURITY/invariants.yaml`. 2026-05-17 (AI Envelope gap-closure, DRAFT v1.x — `spec/v1/ai-envelope.md`) added 7 advertisement-shape scenarios with `it.todo()` behavioral placeholders gated on `capabilities.envelopeContracts.advertised: true`: `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`. Paired with the new `envelope-redaction-sr-1-carry-forward` row in `SECURITY/invariants.yaml`. 2026-05-17 (post-publish hardening, deep audit of `core.openwop.agents`) added `agents-run-tool-allowlist.test.ts` — server-free scenario locking in the `core.openwop.agents@1.0.1` safety-fix that closes `OPENWOP-AUDIT-2026-003` (function-typed `tool.handler` properties rejected at `validateTools()` with `INVALID_TOOL_DECLARATION`; tool-driven runs require `ctx.agentRuntime`; tool-less safe fallback preserved). Paired with the new `agents-run-no-raw-handler` row in `SECURITY/invariants.yaml`. 
Same-day post-publish hardening added `idempotency-key-determinism.test.ts` — server-free scenario locking in the `core.openwop.http@1.1.2` determinism safety-fix (default `composite` mode produces deterministic keys in `(runId, nodeId, payload)`; removed `uuid` mode rejects with `CONFIG_INVALID`; cross-impl vector test lets third-party reimplementations verify wire agreement). Paired with the new `idempotency-key-deterministic` row in `SECURITY/invariants.yaml`. 2026-05-17 (Phase 3 of RFC 0013) added three server-free scenarios exercising the reference workflow-chain expansion library (`conformance/src/lib/workflow-chain-expansion.ts`): `workflow-chain-expansion.test.ts` (parameter substitution + node id collision avoidance + edge rewriting + capability propagation + runtime-invariance contract), `workflow-chain-unresolvable-typeid.test.ts` (rejection with `chain_unresolvable_typeid` when a chain references an unknown typeId), and `workflow-chain-pack-signature-verification.test.ts` (Ed25519 verification recipe reuse from `node-packs.md §Signing`). Earlier that day (Phase 1) added `workflow-chain-pack-manifest-validation.test.ts` — server-free schema-validation scenario covering the new `workflow-chain-pack-manifest.schema.json` (positive sample + two negatives: kind/contents mismatch and invalid `chainId`). Closes RFC 0013 (`Workflow-chain packs`, `Draft`) Phases 1 + 3 alongside the new `spec/v1/workflow-chain-packs.md`, the `Capabilities.workflowChainPacks` block, and the registry build-index/conformance-check `kind` routing from Phase 2. Earlier that day, the suite added 27 `it.todo()` placeholder scenarios paired with RFCs 0014-0020 (host capability surfaces — fs, kvStorage, tableStorage, queueBus, sql/vector/search, blob/cache, mcp.serverMount). These promote to live assertions when each RFC reaches `Active` + the matching capability block lands in `schemas/capabilities.schema.json` + a reference host advertises the capability. 
Earlier additions include 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing SR-1 §D), `memory-compaction-event-emitted.test.ts` 
(canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). 2026-05-15 (gap-closure CF-3) added `interrupt-token-matrix.test.ts` (malformed / unknown / replay / cross-run-id paths on `GET|POST /v1/interrupts/{token}`). 2026-05-31 (RFC 0078 portable tool catalog + RFC 0079 credential provenance / egress policy — the Active→Accepted behavioral gate) added four: `tool-catalog-projection.test.ts` (capability-gated on `toolCatalog.supported` via `behaviorGate('openwop-tool-catalog', …)` — the NORMATIVE `GET /v1/tools` list with each `ToolDescriptor` schema-valid + `source`/`safetyTier` in the closed vocab + content-free, `GET /v1/tools/{toolId}` round-trip + unknown-id 404, 401-unauthenticated, and the §F-2 cross-principal non-disclosure; black-box, no POST seam), `tool-session-lifecycle.test.ts` (gated on `toolCatalog.sessionLifecycle` — the §D `tool.session.opened`-before / `tool.session.closed`-after bracket over the RFC 0064 call events via the `POST /v1/host/sample/tools/session-run` seam, one shared `sessionId`, content-free), `egress-audience-binding.test.ts` (KEYSTONE — gated on `httpClient.egressPolicy.supported`; the §C confused-deputy MUST via `POST /v1/host/sample/egress/decide`: an out-of-audience egress is denied/downgraded with the credential NOT attached, a provenance-unevaluable egress fails closed — the behavioral leg of `egress-credential-audience-bound`), and `egress-decision-content-free.test.ts` (the SR-1 canary — the credential value never surfaces in `egress.decided` and `reason` stays in the CLOSED vocabulary). 
The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.
|
|
95
|
+
The current suite has 343 scenario files under `src/scenarios/`. 2026-06-13 (RFCs 0096/0097/0098 — reviewable learning, standing goals, agent-platform portability) added three always-on-plus-gated scenarios: `proposal-reviewable-learning.test.ts` (RFC 0096 — the `agents.proposals` shape + the `Proposal` round-trip incl. the dropped `rule` kind + the content-free `proposal.{created,activated}` events, plus a gated apply-without-scope→403 leg; backs `proposal-inert-until-applied` + `proposal-no-resynthesis`), `goal-standing-continuation.test.ts` (RFC 0097 — the `agents.goals` shape + the `Goal` round-trip + the content-free `goal.{evaluated,closed}` events, plus gated bounded-termination→422 + judge-only-completion legs; backs `goal-continuation-bounded` + `goal-completion-judge-only`), and `export-bundle-portability.test.ts` (RFC 0098 — the `portability` shape incl. the `import⇒dryRun` if/then + the `ExportBundle` round-trip rejecting every credential-named field + the content-free `import.applied` event, plus a gated literal-credential-import→422 leg; backs `export-bundle-no-credential-material`). 
2026-06-11 (RFCs 0093/0094 — protocol hardening + wire-shape reconciliation) added five: `version-fold.test.ts` (the `version-negotiation.md` §`X-Force-Engine-Version` cross-version matrix through the previously-orphaned `conformance-version-fold` fixture — closes catalog gap F5; soft-skips when `Capabilities.testing.forceEngineVersionRange` is unadvertised), `stream-text-fixture.test.ts` (the `stream-modes.md` §`messages` fold through the deterministic `stream-text` mock provider + the previously-orphaned `conformance-stream-text` fixture — closes catalog gap F1), `i18n-negotiation.test.ts` (gated on `capabilities.i18n` via `behaviorGate('openwop-i18n', …)` — an unsupported or malformed `Accept-Language` never 400s, `Content-Language` reflects the locale actually used, and error `code` strings stay the canonical English tokens), `grpc-transport.test.ts` (gated on `capabilities.grpc` via `behaviorGate('openwop-grpc-transport', …)` — advertisement-shape only per `grpc-transport.md` §Field semantics: `service` MUST be `openwop.v1.Engine`, the `tls` enum, `grpcs?://` endpoint URIs, `supportedTransports` includes `grpc` when exposed, production claimants require `tls: "required"`; no gRPC dialing), and `webhook-tenant-isolation.test.ts` (RFC 0093 §A.3 — backs the new protocol-tier `webhook-cross-tenant-isolation` invariant; a two-tenant proof through the `/v1/host/sample/test/surface` seam plus black-box registration-surface scoping). `spec-corpus-validity.test.ts` also gained the RFC 0094 §A satisfiability probe: canonical `createRun` bodies MUST pass the composed request schema (closed via `unevaluatedProperties: false` at the composition site, never inside an `allOf` branch) and an undeclared property MUST fail. 
2026-06-07 (RFCs 0090/0091/0092 — verifier turn + convergence, multimodal perception input, agent capability requirements) added six: the always-on, server-free shape probes `agent-verifier-shape.test.ts`, `aiproviders-input-shape.test.ts`, `agent-requires-capabilities-shape.test.ts`, plus the capability-gated **behavioral** legs `agent-capability-degraded-projection.test.ts` (RFC 0092 §B — the `degraded[]` projection on `GET /v1/agents`, black-box, non-vacuous via `OPENWOP_DEGRADED_CAPABILITY_AGENT_ID`), `callai-multimodal.test.ts` (RFC 0091 §A/§B — advertised modality accepted / unadvertised → `unsupported_modality`, via the `POST /v1/host/sample/ai/call` seam), and `verifier-gating.test.ts` (RFC 0090 §B — a `fail` verdict blocks commit, via the `POST /v1/host/sample/agents/verify-run` seam). The three behavioral legs soft-skip by default and hard-fail under `OPENWOP_REQUIRE_BEHAVIOR=true` — the Active→Accepted reference-host proof for each RFC. 2026-06-02 (RFC 0082 §B — deployment channel resolve-and-pin, production-path coverage) added `agent-channel-dispatch.test.ts` (capability-gated on `agents.deployment.supported` + the seeded `conformance-agent-channel-dispatch` fixture + advertised `replay` mode via `behaviorGate('openwop-deployment-channel-dispatch', …)` — proves the §B pin from a REAL run graph, complementing `agent-deployment-lifecycle.test.ts` Leg 4's host-sample seam: a canonical `POST /v1/runs` of a node binding `agent.channel:"stable"` MUST record `resolvedChannel` + `resolvedAgentVersion` on `agent.invocation.started` (RFC 0077), a `:fork{mode:"replay"}` MUST re-read that recorded version, and the seam-guarded Leg 3 MOVES the channel then asserts a replay STILL carries the original pin — never re-resolving a moved channel; soft-skips by default, hard-fails under `OPENWOP_REQUIRE_BEHAVIOR=true` — the production-path proof of the §B contract). 
2026-06-01 (RFC 0085 — `openwop-agent-platform` meta-profile, the Active→Accepted behavioral gate) added `agent-platform-aggregate-evidence.test.ts` (capability-gated on a host CLAIMING `openwop-agent-platform` in its live discovery `profiles[]` via `behaviorGate('openwop-agent-platform', …)` — the §C/§D honest-advertisement rule on the live `/.well-known/openwop`: the claim MUST satisfy the §B floor predicate (`isAgentPlatformPartial` → `partial`/`full`, never `none`), backed by the per-capability evidence not the profile string; `OPENWOP_AGENT_PLATFORM_TIER=full` forces the non-vacuous full bar — all governance terms + tenant installScope + all 16 §D terms; server-requiring, the always-on §B/§D derivation legs stay in `agent-platform-profile.test.ts` — the RFC 0085 → Accepted bar). 2026-06-01 (RFC 0084 — budget, quota + cost policy, the Active→Accepted behavioral gate) added `budget-enforcement.test.ts` (capability-gated on `budget.supported` via `behaviorGate('openwop-budget-enforcement', …)` — the §C/§D enforcement via the new `POST /v1/host/sample/budget/run` seam + the test event-log seam: a `hard-cost-exhaust` run emits the strict-ordered `budget.reserved → budget.consumed → budget.threshold.crossed{percent} → budget.exhausted → cap.breached{kind:"budget-cost"} → run.failed{error:"budget_exhausted"}` chain; a `model-denied` run is refused `budget_model_denied` BEFORE the provider call (fail-closed); an `advisory` host emits the `budget.*` events without stopping; every `budget.*` payload content-free backing `budget-no-pricing-leak`; new lib helper `src/lib/budgetPolicy.ts`; soft-skips on 404 — the RFC 0084 → Accepted bar). 
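The strict-ordered `hard-cost-exhaust` chain described for `budget-enforcement.test.ts` lends itself to a simple ordering check over the event types read from the event log. The sketch below is hypothetical (`appearsInOrder` and the constant are not suite identifiers) and assumes that unrelated events may be interleaved between the chain members; a host or scenario that requires strict adjacency would need a tighter check.

```typescript
// Event types in the order the README lists them for a hard-cost-exhaust run.
const HARD_COST_EXHAUST_CHAIN: readonly string[] = [
  "budget.reserved",
  "budget.consumed",
  "budget.threshold.crossed",
  "budget.exhausted",
  "cap.breached",
  "run.failed",
];

// Returns true when every expected type occurs in the log, in the expected order.
// Assumption: other events may appear in between (subsequence match).
function appearsInOrder(log: readonly string[], expected: readonly string[]): boolean {
  let next = 0;
  for (const type of log) {
    if (next < expected.length && type === expected[next]) next++;
  }
  return next === expected.length;
}
```

A log that starts with `run.started` and then emits the six types in order satisfies the check; swapping `budget.reserved` and `budget.consumed` does not.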
2026-06-01 (RFC 0080 — agent memory capability reconciliation, the Active→Accepted behavioral gate) added `memory-degraded-projection.test.ts` (capability-gated on `agents.manifestRuntime.supported` + `memory.supported` via `behaviorGate('openwop-memory-degraded', …)` — the §C degraded-projection iff-contract on the NORMATIVE `GET /v1/agents`: a degraded inventory entry MUST carry `memoryDegraded:true` + a non-empty, unique `degradedMemoryDimensions[]` from the closed §A-name enum, a non-degraded entry MUST NOT, the inventory is non-empty, and the degraded branch runs non-vacuously when `OPENWOP_DEGRADED_AGENT_ID` names a known-degraded agent; black-box, no POST seam — the RFC 0080 → Accepted bar). This batch also documents the two RFC 0068 conformance seams (`POST /v1/host/sample/memory/consolidate` + `.../commitment/fire`) in `host-sample-test-seams.md` (the 0068 gated scenarios shipped in 1.14.0). 2026-06-01 (RFC 0034 — collector-side BYOK-canary inspection) added `otel-collector-canary-inspection.test.ts` (always-on server-free: stands up a real `OtelCollector`, POSTs synthetic OTLP/HTTP-JSON traces + metrics through its actual ingest path, and proves the new `findCanaryLeakage()` inspector catches a canary embedded in a span attribute / resource attribute / span name / metric data-point attribute while reporting ZERO hits on a redacted payload and never matching an empty canary — the non-vacuous proof that the conformance collector now inspects what the host's OTLP exporter ACTUALLY shipped over the wire, closing the `secret-leakage-otel-attribute` / `-debug-bundle-otel` collector-seam gap; the live capability-gated complement is the new collector-export describe block in `secret-leakage-otel-attribute.test.ts`). 
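The canary inspection idea behind `otel-collector-canary-inspection.test.ts` can be sketched as a recursive scan over a decoded JSON payload. This is not the suite's `findCanaryLeakage()`; `findCanaryHits` is a hypothetical stand-in, and the path format it reports is an assumption. It does reproduce the two properties called out above: a canary embedded anywhere in a string value is reported, and an empty canary never matches.

```typescript
// Hypothetical scanner: returns the JSON paths of string values containing the canary.
function findCanaryHits(value: unknown, canary: string, path: string = "$"): string[] {
  // An empty canary would match every string, so it is defined to match nothing.
  if (canary.length === 0) return [];

  const hits: string[] = [];
  if (typeof value === "string") {
    if (value.includes(canary)) hits.push(path);
  } else if (Array.isArray(value)) {
    value.forEach((item, i) => {
      hits.push(...findCanaryHits(item, canary, `${path}[${i}]`));
    });
  } else if (value !== null && typeof value === "object") {
    for (const [key, item] of Object.entries(value as Record<string, unknown>)) {
      hits.push(...findCanaryHits(item, canary, `${path}.${key}`));
    }
  }
  return hits;
}
```

Scanning `{ spans: [{ name: "call sk-canary-123" }] }` for `"sk-canary-123"` reports the single path `$.spans[0].name`; the same payload with the value replaced by a redaction marker reports none.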
2026-06-01 (RFC 0035 — sandbox wall-clock timeout, the 7th-of-8 graduation) added `sandbox-wasm-timeout.test.ts` (worker-driven server-free: `probeTimeout` in `wasm-sandbox-probe.ts` spawns a worker thread running the committed `misbehaving-timeout.wasm` + a main-thread kill-timer — the thread preemption a same-thread probe can't do — asserting `sandbox_timeout` with a well-behaved positive control; graduates `node-pack-sandbox-timeout` reference-impl→protocol, so 7 of 8 `node-pack-sandbox-*` invariants are now protocol-tier, only the JS-specific `no-eval` permanently exempt). 2026-05-31 (audit-response black-box / graduation batch) added three more: `sandbox-wasm-isolation.test.ts` (RFC 0035 — drives the committed `fixtures/wasm-sandbox/*.wasm` through `wasm-sandbox-probe.ts`: escape/capability-gate via static `WebAssembly.Module.imports()`, an OOB-store memory trap, double-instantiate isolation; 10/10; graduates 6 `node-pack-sandbox-*` invariants reference-impl→protocol), `workspace-cross-tenant-isolation-blackbox.test.ts` (RFC 0059 — two-credential black-box on the normative §C `/v1/host/workspace/files` endpoints: owner A writes, a second-tenant credential fails closed; no seam), and `prompt-resolution-chain-event.test.ts` (RFC 0029 — reads the durable `agent.promptResolved.chain[]` precedence record via the normative `GET /v1/runs/{runId}/events/poll`; no seam) — each the production-path proof that graduates its surface into the `openwop-core-standard` floor. 2026-05-31 (RFC 0088 — the `openwop-core-standard` Core Standard Profile, the audit-response Core Candidate target) added `core-standard-profile.test.ts` (always-on server-free derivation probe: `isCoreStandard` derives the §B floor — `openwop-core` ∧ `openwop-interrupts` ∧ (`openwop-stream-sse` ∨ `openwop-stream-poll`) — a bare `openwop-core` host without interrupts is excluded, a host with no event transport fails, and the annex is absent from `deriveProfiles` because it composes rather than redefines). 
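The §B floor that `core-standard-profile.test.ts` probes is a plain boolean predicate. The sketch below restates it over a list of profile names; it is not the suite's `isCoreStandard` (which may derive the terms from capabilities instead of profile strings), and `isCoreStandardSketch` is a hypothetical name.

```typescript
// openwop-core AND openwop-interrupts AND (openwop-stream-sse OR openwop-stream-poll)
function isCoreStandardSketch(profiles: readonly string[]): boolean {
  const has = (name: string): boolean => profiles.includes(name);
  return (
    has("openwop-core") &&
    has("openwop-interrupts") &&
    (has("openwop-stream-sse") || has("openwop-stream-poll"))
  );
}
```

A bare `["openwop-core"]` host is excluded, and so is a host with core and interrupts but neither event transport.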
2026-05-31 (RFC 0082 — agent deployment lifecycle, the Active→Accepted behavioral gate) added `agent-deployment-lifecycle.test.ts` (capability-gated on `agents.deployment.supported` via `behaviorGate('openwop-deployment-lifecycle', …)` — the §E promotion contract via the new `POST /v1/host/sample/agents/deployment-transition` seam + the test event-log seam across four legs: `promote` (authorize RFC 0049 → approvalGate RFC 0051 → eval-verify RFC 0081 → content-free `deployment.promoted` with a seven-state `toState` + `toVersion`, the record validating `agent-deployment.schema.json`), `unauthorized` (fail-closed — `allowed:false`, no `deployment.promoted`, the behavioral leg of `deployment-promotion-fail-closed`), `eval-gate-unmet` (`eval_gate_unmet` denial, §E-3), and `channel-pin` (the §B `resolvedAgentVersion` recorded-fact on `agent.invocation.started`); new lib helper `src/lib/agentDeployment.ts`; soft-skips on 404 — the RFC 0082 → Accepted bar). 2026-05-31 (RFC 0081 — agent evaluation, the Active→Accepted behavioral gate) added `agent-eval-run.test.ts` (capability-gated on `agents.evalSuite.supported` via `behaviorGate('openwop-eval-run', …)` — the §B `mode:"eval"` projection via the new `POST /v1/host/sample/agents/eval-run` seam + the test event-log seam: `eval.started`-first → one `eval.scored` per task → `eval.completed`-once ordering (count == `eval.completed.taskCount`), the content-free `eval.scored` legs (`score` ∈ 0..1) backing `eval-summary-no-content-leak`, and the NORMATIVE `GET /v1/runs/{runId}/eval-summary` schema-valid `EvalSummary` round-trip with `passedCount <= taskCount`; new lib helper `src/lib/agentEval.ts`; soft-skips on 404 — the RFC 0081 → Accepted bar). 
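The `eval.*` ordering that `agent-eval-run.test.ts` asserts (`eval.started` first, one `eval.scored` per task, `eval.completed` exactly once with a matching `taskCount`) can be sketched as a check that returns a list of problems. `LoggedEvent` and `checkEvalOrdering` are hypothetical names, and the payload is reduced to the single `taskCount` field mentioned above.

```typescript
interface LoggedEvent {
  type: string;
  payload?: { taskCount?: number };
}

// Returns an empty array when the eval.* events satisfy the ordering described above.
function checkEvalOrdering(log: readonly LoggedEvent[]): string[] {
  const evalEvents = log.filter((e) => e.type.startsWith("eval."));
  const problems: string[] = [];

  const first = evalEvents[0];
  if (first === undefined || first.type !== "eval.started") {
    problems.push("eval.started must be the first eval.* event");
  }

  let completedAt = -1;
  let completedCount = 0;
  let scoredCount = 0;
  let lastScoredAt = -1;
  evalEvents.forEach((e, i) => {
    if (e.type === "eval.completed") {
      completedCount++;
      completedAt = i;
    } else if (e.type === "eval.scored") {
      scoredCount++;
      lastScoredAt = i;
    }
  });

  if (completedCount !== 1) {
    problems.push("expected exactly one eval.completed");
    return problems;
  }
  if (lastScoredAt > completedAt) {
    problems.push("eval.scored must not follow eval.completed");
  }
  const taskCount = evalEvents[completedAt]?.payload?.taskCount;
  if (scoredCount !== taskCount) {
    problems.push("eval.scored count must equal eval.completed.taskCount");
  }
  return problems;
}
```

Non-`eval.*` events such as `run.started` are ignored by the filter, so they can precede `eval.started` in the log without tripping the check.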
2026-05-31 (RFC 0083 — durable trigger bridge, the Active→Accepted behavioral gate) added `trigger-bridge-delivery.test.ts` (profile-gated on `openwop-trigger-bridge` derived from the live discovery doc — the §C delivery model via the `POST /v1/host/sample/trigger-bridge/deliver` seam + the test event-log seam: dedup→effectively-once `trigger.delivery.attempted{delivered}` (§C-1), retry-exhaustion→`{dead-lettered}` + `trigger.subscription.state.changed{toState:dead-lettered}` (§C-2 + RFC 0053), and the delivered run's `run.started.causationId` == the delivery id (§C / RFC 0040); both `trigger.*` events content-free; the always-on shape stays in `trigger-bridge-shape.test.ts`; new lib helper `src/lib/triggerBridge.ts`). 2026-05-31 (RFC 0087 — agent org-chart, the Active→Accepted behavioral gate) added two capability-gated behavioral scenarios (both gated on `agents.orgChart.supported`, black-box on the normative `/v1/agents/org-chart` surface — no new POST seam): `agent-org-chart-scoping.test.ts` (the `GET /v1/agents/org-chart` tree-shape — departments form an acyclic `parentDepartmentId` tree, members reference `host:<id>` roster entries — + the §D responsibility roll-up via `GET /v1/agents/org-chart/{departmentId}` with a deduped `responsibilities[]` union + the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ORG_CHART_DEPARTMENT_ID`) and `org-position-no-authority-escalation.test.ts` (the behavioral leg of the protocol-tier invariant — the live org-chart wire carries NO authority-bearing field on any member/department/responsibility-view object; the structural leg stays always-on in `agent-org-chart-shape.test.ts`, and the deeper RFC 0049/0051 authority-invariance legs stay reference-impl tier per the `agent-manifest-runtime` no-host-hook precedent). 
2026-05-31 (RFCs 0086 + 0077 — the Active→Accepted behavioral gate) added four capability-gated behavioral scenarios so a non-steward host can be mechanically certified non-vacuously under `OPENWOP_REQUIRE_BEHAVIOR=true`: `agent-roster-attribution.test.ts` (RFC 0086 §B/§C; gated on `agents.roster.supported` — the normative `GET /v1/agents/roster` read shape + `total==roster.length`, the §C `roster.run.initiated`-before-`agent.invocation.started` ordering, the content-free payload backing `roster-attribution-no-content`, the durable work-item `triggerSubscriptionId`, and the RFC 0074 cross-tenant 404 via `OPENWOP_CROSS_TENANT_ROSTER_ID`), `agent-live-invocation-bracket.test.ts` (RFC 0077 §E; gated on `agents.liveRuntime.supported` — `agent.invocation.started`-first / `agent.invocation.completed`-last bracket, matching `invocationId`, `source`/`outcome` closed enums, content-free), `agent-live-structured-output.test.ts` (RFC 0077 §B step 6; gated on `agents.liveRuntime.structuredOutput` — a result violating `handoff.returnSchemaRef` fails the invocation `outcome:"failed"` rather than shipping as completed), and `agent-live-allowlist-enforced.test.ts` (RFC 0077 §F-1 / RFC 0002 §A14; gated on `agents.liveRuntime.supported` — a tool outside `toolAllowlist` is not callable); all four drive the documented `POST /v1/host/sample/roster/fire` + `POST /v1/host/sample/agents/live-invoke` seams plus the test event-log seam and soft-skip on 404 (these are the RFC 0086 / 0077 Active→Accepted bars). 
2026-05-30 (RFC 0087 — agent org-chart, Draft -> Active) added `agent-org-chart-shape.test.ts` (always-on server-free: the `capabilities.agents.orgChart` shape + the `AgentOrgChart` round-trip + the non-`host:` member negative + the **§B structural non-authority guarantee** — the schema rejects a `scopes`/`canDispatch`/`permissions`/`authority` field on a member (`additionalProperties:false`), and a member's key set is exactly `{rosterId, departmentId, roleId, reportsTo}` — backing the protocol-tier `org-position-no-authority-escalation` invariant; no new RunEventType). 2026-05-30 (RFC 0086 — standing agent roster, Draft -> Active) added `agent-roster-shape.test.ts` (always-on server-free: the `capabilities.agents.roster` shape + the `AgentRosterEntry` round-trip + the `host:` `rosterId` + `agentRef` version-XOR-channel negatives + the content-free `roster.run.initiated` negatives backing the protocol-tier `roster-attribution-no-content` invariant + the additive `roster` inventory projection + RunEventType-enum membership). 2026-05-30 (RFC 0082 — agent deployment lifecycle, Draft -> Active) added `agent-deployment-shape.test.ts` (always-on server-free: the `capabilities.agents.deployment` shape + the `AgentDeployment` record round-trip + the `AgentRef` `channel` XOR `version` `not`-clause + the four `deployment.*` payloads + the content-free negatives backing the protocol-tier `deployment-event-no-content-leak` invariant). 2026-05-30 (RFC 0085 — `openwop-agent-platform` meta-profile, Draft -> Active) added `agent-platform-profile.test.ts` (always-on server-free derivation of the operational-annex `none`/`partial`/`full` status: all-floor ⇒ partial, missing-flag ⇒ none, the replay-OR-`nondeterminismPolicy.declared` term, floor+governance ⇒ full, missing-tenant-scope ⇒ partial-not-full per the honest-advertisement rule, eval/deploy/budget-are-advisory-not-hard-terms, + the `capabilities.nondeterminismPolicy.declared` shape). 
2026-05-30 (RFC 0084 — budget, quota + cost policy, Draft -> Active) added `budget-policy-shape.test.ts` (always-on server-free: `budget-policy.schema.json` round-trip + the §A orthogonality guard — a wall-time field is rejected (it's RFC 0058's `runTimeoutMs`) — + threshold/onExhaustion negatives + the four content-free `budget.{reserved,consumed,threshold.crossed,exhausted}` payloads + the four `cap.breached{budget-*}` kinds + RunEventType-enum membership + the no-pricing-property structural check backing the protocol-tier `budget-no-pricing-leak` invariant + the `capabilities.budget`/`limits.maxBudget*` shape). 2026-05-30 (RFC 0083 — durable trigger + channel bridge, Draft -> Active) added `trigger-bridge-shape.test.ts` (always-on server-free: `trigger-subscription.schema.json` round-trip + missing-`state`/out-of-enum-`source`/unknown-property negatives + the four-state vocab + the two content-free `trigger.{subscription.state.changed,delivery.attempted}` payloads incl. closed `state`/`outcome` enums + RunEventType-enum membership + the `triggerBridge`/`webhooks.durable` capability shape + the `openwop-trigger-bridge` profile derivation incl. the no-dead-letter-sink negative). 2026-05-30 (RFC 0079 — credential provenance + egress policy, Draft -> Active) added `egress-provenance-shape.test.ts` (always-on server-free: `credential-provenance.schema.json` round-trip + `audiences:[]`/missing-`credentialId`/unknown-property negatives + the no-secret-property structural check backing the protocol-tier `egress-decision-no-secret-leak` invariant + the content-free `egress.decided` record incl. the `decision` enum + RunEventType-enum membership + the `httpClient.egressPolicy` shape; the behavioral `egress-credential-audience-bound` confused-deputy MUST is reference-impl tier, deferred to a host). 
2026-05-30 (RFC 0078 — portable tool catalog, Draft -> Active) added `tool-descriptor-shape.test.ts` (always-on server-free: `tool-descriptor.schema.json` round-trip + the §C-1 `exec` ⇒ `host-extension` cross-field MUST (RFC 0069) + the `safetyTier`-required negative + `additionalProperties:false`, the `capabilities.toolCatalog` `supported`/`sources`/`sessionLifecycle` shape, and the two content-free `tool.session.{opened,closed}` payload $defs incl. the closed `outcome` enum + RunEventType-enum membership). 2026-05-30 (RFC 0080 — agent memory capability reconciliation, Draft -> Active) added `memory-capability-model-shape.test.ts` (always-on server-free: the additive `capabilities.memory.{writable,search,retention}` dimension shapes + malformed-instance negatives — `retention.ttl` non-boolean, out-of-enum `search.modes`, unknown property under `additionalProperties:false` — the `agent-inventory-response` `memoryDegraded`/`degradedMemoryDimensions` closed-enum fields, and the `openwop-memory` derivation surfacing for read/write + long-term hosts while withholding from `writable:false`). 2026-05-30 (RFC 0081 — agent evaluation, Draft -> Active) added `agent-eval-suite-shape.test.ts` (always-on server-free: the `capabilities.agents.evalSuite` shape + the `AgentEvalSuite`/`EvalSummary` schema round-trips + the three `eval.{started,scored,completed}` payloads + the content-free negatives — a task entry with a `taskOutput` body, a `safetyFinding` with an `excerpt` — backing the new `eval-summary-no-content-leak` SECURITY invariant). 
2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch` live-run audit) added `safefetch-live-audit.test.ts` (`behaviorGate('openwop-safefetch-live-audit', …)`, gated on `httpClient.safeFetch` + `toolHooks.prePostEvents`) — asserts the audit-when-both MUST against the **durable run event log** via the new `POST /v1/host/sample/http/safe-fetch-run` open seam + the test event-log seam, closing the seam-vs-production gap (a production `createSafeFetch()` with no audit hooks passes the inline `safefetch-behavior.test.ts` but FAILS this under `OPENWOP_REQUIRE_BEHAVIOR=true`); this is the RFC 0076 §B → Accepted bar; run seam soft-skips on 404 (host-pending). 2026-05-29 (RFC 0066 — `x-openwop-form` picker UX hints, Draft → Active) added `x-openwop-form-pack-manifest.test.ts` (always-on server-free: an annotated `configSchema` stays a valid 2020-12 schema + the advisory hints don't change what it accepts, each §A annotation matches the shape, an unknown `kind` validates for forward-compat, 3 negatives — missing/non-string `kind`, non-string `dependsOn`). 2026-05-29 (RFC 0076 §B — `ctx.http.safeFetch`) added `safefetch-behavior.test.ts` (seam-gated: SSRF block / DNS-rebinding / `Connection: upgrade` refusal / tool-hooks audit-when-both, via `POST /v1/host/sample/http/safe-fetch`; advertisement contract stays in `http-client-ssrf.test.ts`). 2026-05-29 (RFC 0076 §A — pack `runtime.requires[]` install gate) added two: `runtime-requires-shape.test.ts` (server-free closed-vocabulary validation — the 8 tokens validate, a raw builtin name is rejected, empty-array≡omission, `uniqueItems`) + `runtime-requires-install-gate.test.ts` (seam-gated install-grant / install-refuse → `pack_runtime_requirement_unmet` / non-sandbox SHOULD-projection, soft-skip on 404 via `POST /v1/host/sample/packs/install-gate`). 
2026-05-29 (RFC 0047 — `host.oauth` authorization-code roundtrip) added `oauth-authorization-code-roundtrip.test.ts` — capability-gated on `capabilities.oauth.supported` + `grants` including `authorization_code`; drives the `POST /v1/host/sample/oauth/authorize-code-roundtrip` seam against the one canonical synthetic provider in `fixtures/oauth-providers/synthetic.json` (soft-skip on 404, Tier-2 host-pending), asserting a successful grant returns a credential REFERENCE (token persisted as a `host.credentials` entry) and that the authorization code / state / PKCE verifier / acquired access+refresh tokens never appear on any run-visible surface (RFC 0047 §C + §C.2 / `credential-payload-redaction`). Closes the RFC 0047 Tier-2 gap (capability-shape + redaction scenarios existed; the actual authorization-code dance was unexercised). 2026-05-26 (RFC 0070 — agent-manifest runtime) added `agent-manifest-runtime.test.ts`; 2026-05-26 (RFC 0071 — artifact-type + chat card packs) added six: `artifact-type-pack-manifest-validation.test.ts` + `artifact-schema-compile-bounded.test.ts` (server-free) + `artifact-type-pack-install.test.ts` + `artifact-type-store-without-render.test.ts` + `chat-card-pack-manifest-validation.test.ts` (server-free) + `chat-card-pack-execution.test.ts` (capability-gated, host-pending). 
2026-05-26 (RFCs 0067 / 0068 / 0069 — spec-gap Draft cohort) added five scenarios: `byok-auth-modes.test.ts` (RFC 0067; always-on schema-shape of `aiProviders.authModes` + a discovery-gated §B auth-mode-contract cross-field check), `memory-consolidation-shape.test.ts` (RFC 0068; always-on shape of `agents.memoryConsolidation`/`agents.commitments` + the `agent.memory.consolidated`/`commitment.fired` payload $defs), `memory-consolidation-idempotent.test.ts` + `commitment-fired.test.ts` (RFC 0068; capability-gated behavioral, soft-skip on the documented `/v1/host/sample/memory/consolidate` + `/commitment/fire` seams), and `exec-not-protocol-tier.test.ts` (RFC 0069; always-on server-free structural assertion that the protocol corpus defines no `core.*`/`openwop.*` exec-class primitive — backs the `exec-must-not-be-protocol-tier` SECURITY invariant). 2026-05-25 (RFC 0061 — stateful agent-loop lifecycle, executionModel.version 5) added four `agent-loop-*.test.ts` scenarios: `-version5-shape` (always-on; validates `executionModel.statefulResume`/`transcriptWindow` + the 1–5 version ceiling) plus `-iteration-monotonic` (gated on `version >= 5`; `runOrchestrator.decided.iteration` increments 1,2,3… exactly once per turn), `-workspace-snapshot` (gated additionally on `host.workspace.supported`; a turn-i workspace write is invisible to turn i, visible to turn i+1), and `-stateful-resume` (gated on `statefulResume`; a mid-loop suspend resumes at the same iteration without resetting the counter) — the three behavioral scenarios drive the documented agent-loop seam (`POST /v1/host/sample/agentloop/run`) and soft-skip until a host wires it. 
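The `-iteration-monotonic` rule described above (`runOrchestrator.decided.iteration` increments 1, 2, 3… exactly once per turn) can be illustrated with a small check. This is a minimal sketch, not the suite's actual assertion: the `DecidedEvent` shape and the `iterationsAreMonotonic` helper are hypothetical names, and only the `iteration` field is taken from the text.

```typescript
// Hypothetical event shape: only `iteration` comes from the README text.
interface DecidedEvent {
  iteration: number;
}

// Returns true when the iterations read exactly 1, 2, 3, ... in order,
// meaning each turn advanced the counter once with no gap, repeat, or reset.
// An empty list is treated as trivially valid (an assumption of this sketch).
function iterationsAreMonotonic(events: DecidedEvent[]): boolean {
  return events.every((event, index) => event.iteration === index + 1);
}
```

A sequence such as 1, 2, 2 or 1, 3 fails the check, which matches the "exactly once per turn" wording.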
2026-05-25 (RFC 0059 — host.workspace M2, reference-host enforcement) added two `workspace-*.test.ts` scenarios: `-behavior` (capability-gated CRUD round-trip / `If-Match` 409 `workspace_conflict` / `workspace_too_large` / §D run-start snapshot, all via the real `/v1/host/workspace/files` §C endpoints) and `-cross-tenant-isolation` (WCT-1 — drives the documented `POST /v1/host/sample/workspace/op` seam to assert a file owned by one `{tenant, workspace}` is unreadable, on both `get` and `list`, under a different owner; backs the new `workspace-cross-tenant-isolation` SECURITY invariant). The in-memory reference host now advertises `capabilities.workspace.supported` and honors §C/§D/§E end-to-end. 2026-05-25 (RFC 0062 — memory.distillation "dreams") added five `distillation-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.memory.distillation` block + the additive `distillation` sub-object on `memory.compacted`) plus `-token-budget` (within budget `tokensUsed ≤ tokenBudget`; an un-meetable budget → `token_budget_exceeded` with no partial archive), `-stable-archive` (same sources + budget ⇒ byte-stable archive checksum), `-index-roundtrip` (gated additionally on `indexEmitted`; the `MEMORY-INDEX.json` workspace file is retrievable + `workspace.updated` fired), and `-secret-carryforward` (SR-1: a redacted source secret never appears in the archive) — the four behavioral scenarios drive the documented memory-distillation seam (`POST /v1/host/sample/memory/distill`) and soft-skip until a host wires it. 
2026-05-25 (RFC 0063 — core.subWorkflow.outputAttestation) added four `subrun-*.test.ts` scenarios: `-attestation-shape` (always-on; validates the `capabilities.agents.subRunAttestation` flag) plus `-checksum-stable` (the child output checksum is the byte-stable, key-order-invariant RFC 8785 JCS + SHA-256 digest), `-approval-gate` (`requireApproval` → `accept` merges, `reject` does not), and `-approval-fail-closed` (no `accept`/`edit-accept` → no merge; backs the deferred `subrun-merge-approval-fail-closed` invariant) — the three behavioral scenarios drive the documented sub-run attestation seam (`POST /v1/host/sample/subrun/attest`) and soft-skip until a host wires it. 2026-05-25 (RFC 0064 — host.toolHooks) added five `tool-hooks-*.test.ts` scenarios: `-shape` (always-on; validates the `capabilities.toolHooks` block + the optional content-free fields on `agentToolCalled` / `agentToolReturned`) plus `-content-free` (gated on `prePostEvents`), `-authorization-fail-closed` (gated on `perToolAuthorization`), `-rate-limit` (gated on `perToolRateLimit`), and `-secret-redaction` (gated on `prePostEvents` + the SR-1 `argsHash` redaction rule) — the four behavioral scenarios drive the documented tool-hooks invoke seam (`POST /v1/host/sample/toolhooks/invoke`) and soft-skip until a host wires it. 2026-05-25 (RFC 0060 — host.heartbeat) added four `heartbeat-*.test.ts` scenarios: `-capability-shape` (always-on; validates the `capabilities.heartbeat` block) plus `-fires-once-per-tick`, `-idempotent-no-spam`, and `-runtime-bound` (gated on `capabilities.heartbeat.supported` + the host heartbeat tick seam; soft-skip until a host wires it). 
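The `-checksum-stable` property described above (a byte-stable, key-order-invariant digest of the child output) can be sketched as follows. This is an illustration under stated assumptions, not the suite's implementation: `canonicalize` and `outputChecksum` are hypothetical names, and the canonicalization is a simplified approximation of RFC 8785 JCS (sorted keys plus `JSON.stringify` for primitives) that has not been validated against the RFC's test vectors and does not reject non-finite numbers.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Simplified canonical form: object keys are sorted by UTF-16 code units
// and values are serialized recursively with no insignificant whitespace.
// NOTE: an approximation of RFC 8785, not a verified implementation.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return "{" + entries.map(([k, v]) => JSON.stringify(k) + ":" + canonicalize(v)).join(",") + "}";
}

// SHA-256 over the canonical bytes, hex-encoded (the encoding is an assumption).
function outputChecksum(value: Json): string {
  return createHash("sha256").update(canonicalize(value), "utf8").digest("hex");
}
```

Because keys are sorted before hashing, `outputChecksum({ a: 1, b: [1, 2] })` and `outputChecksum({ b: [1, 2], a: 1 })` yield the same digest, which is the key-order invariance the scenario name refers to.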
2026-05-25 (RFC 0057 — memory write-attribution) added five `memory-attribution-*.test.ts` scenarios: `-shape` (always-on advertisement check on `capabilities.memory.attribution`), plus `-no-content`, `-tenant-scoped`, `-emits-on-write`, and `-replay-stable` (gated on `capabilities.memory.attribution.emitsWriteEvents`) verifying the content-free `memory.written` RunEvent, its two SECURITY invariants (`memory-attribution-no-content` + `memory-attribution-tenant-scoped`), and the §D replay rule that a `replay`-mode fork MUST NOT regenerate `memoryId`. 2026-05-25 (RFC 0025 §C point 1 — test-catalog isolation invariant; pairs with the 25 publish-error scenarios in `pack-registry-publish.test.ts`) added `pack-registry-isolation.test.ts` — capability-gated on `capabilities.packs.testMode.{supported, isolated}: true`; PUTs a disposable pack into `/v1/packs-test/{name}` and asserts the same `(name, version)` does NOT appear via `GET /v1/packs/{name}` — anchors the test-catalog isolation MUST in RFC 0025 §C. 2026-05-25 (RFC 0028 Tier-2 post-promotion T2 — read-side sister scenario for workspace-membership enforcement) added `prompt-read-workspace-membership-enforced.test.ts` — gates on `capabilities.prompts.supported: true` (broader than `mutableLibrary` so read-only hosts that expose `?workspaceId=` are also probed); drives `GET /v1/prompts?workspaceId=<random-non-member>` and interprets the response: 4xx PASS (canonical envelope check on 403); 200 with empty `templates[]` PASS (correct null result for a nonexistent workspace); 200 with non-empty `templates[]` FAIL (cross-tenant leak); 200 without `templates[]` field SKIP (host doesn't expose workspace-scoped reads). Verifies SECURITY invariant `prompt-read-workspace-membership-enforced`. Same-day T1 strengthened `prompt-mutation-workspace-membership-enforced.test.ts` to pin `error === "workspace_membership_required"` when the host's refusal status is 403 (other refusal codes unconstrained). 
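The response interpretation for `GET /v1/prompts?workspaceId=<random-non-member>` described above can be written as a small decision function. This is a hedged sketch rather than the scenario's real code: `classifyPromptRead`, `Verdict`, and `PromptListBody` are hypothetical names, the canonical-envelope check on 403 is not modeled, and the `"unspecified"` result covers statuses the text does not mention.

```typescript
type Verdict = "pass" | "fail" | "skip" | "unspecified";

// Hypothetical response body shape: only `templates` is named in the text.
interface PromptListBody {
  templates?: unknown;
}

function classifyPromptRead(status: number, body: PromptListBody | null): Verdict {
  // Any 4xx refusal passes.
  if (status >= 400 && status < 500) return "pass";
  if (status === 200) {
    const templates = body?.templates;
    // No `templates[]` field: the host does not expose workspace-scoped reads.
    // (Treating a non-array value the same way is an assumption of this sketch.)
    if (!Array.isArray(templates)) return "skip";
    // Empty list is the correct null result; a non-empty list is a cross-tenant leak.
    return templates.length === 0 ? "pass" : "fail";
  }
  // Other statuses are not covered by the README text.
  return "unspecified";
}
```

For example, `classifyPromptRead(200, { templates: [{}] })` returns `"fail"`, while `classifyPromptRead(403, null)` returns `"pass"`.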
2026-05-25 (RFC 0028 Tier-2 follow-up — workspace-membership enforcement on mutating prompt endpoints, filed in response to a self-disclosed adopter vulnerability) added `prompt-mutation-workspace-membership-enforced.test.ts` — capability-gated on `capabilities.prompts.mutableLibrary: true`; drives `POST /v1/prompts` with a cryptographically-random non-member `workspaceId` and asserts the host refuses (NOT a 2xx; any 4xx/5xx is acceptable — silent success is the failure mode). Verifies SECURITY invariant `prompt-mutation-workspace-membership-enforced`. 2026-05-22 (RFC 0034 §B follow-up — secret-leakage harness against the OTel + debug-bundle seams) added `secret-leakage-otel-attribute.test.ts` — gates on `capabilities.secrets.supported` + `capabilities.observability.testSeams.{otelScrape,debugBundleExport}` AND the `OPENWOP_CANARY_SECRET_VALUE` env (host operator + conformance runner agree on the canary). Drives the existing `openwop-smoke-byok-roundtrip` fixture end-to-end; scrapes both seams after run completion; hard-fails if the canary plaintext appears in any OTel span attribute or debug-bundle field. Verifies SECURITY invariants `secret-leakage-otel-attribute` + `secret-leakage-debug-bundle-otel`. 2026-05-22 (RFC 0041 Phase 4 — replay determinism under nondeterministic models) added three scenarios: `replay-divergence-at-refusal.test.ts` (advertisement-shape probe on `replayDeterminism.refusalDivergenceEmission` + 2 `it.todo` for the dual-direction refusal-divergence case), `replay-observable-sequence-determinism.test.ts` (capability-gated; behavioral assertion soft-skipped until a `conformance-phase4-nondet-tool` fixture ships), `replay-llm-cache-key-portable.test.ts` (intra-host reproducibility + non-recipe-field invariance + Phase 4 advertisement alignment — reuses the existing `POST /v1/host/sample/test/llm-cache-key` seam from the sibling `replay-llm-cache-key.test.ts`). 
2026-05-20 (RFC 0027 §A templateKinds-coverage follow-up — paired with `prompt-end-to-end-events.test.ts`) added `prompt-all-four-kinds-events.test.ts` exercising all four `PromptKind` values (`system`, `user`, `schema-hint`, `few-shot`) end-to-end through the reference workflow-engine sample's `local.sample.demo.mock-ai` dispatch path; capability-gated via `behaviorGate('prompts-supported', ...)`. Closes the credibility gap where the host advertised `templateKinds: ["system", "user", "few-shot", "schema-hint"]` but only the system+user pair was actually wired into dispatch. 2026-05-20 (RFCs 0030–0033 — envelope LLM-contract-hardening track) added 15 scenarios across four `Active` RFCs: `envelope-reasoning-shape.test.ts` (RFC 0030, always-on; asserts the OPTIONAL `reasoning` property on the three universal-kind schemas + the `schema.response` deliberate omission), `envelope-reasoning-secret-redaction.test.ts` (RFC 0030, capability-gated on `capabilities.envelopes.reasoning.supported` + `secrets.supported`; 5 `it.todo()` placeholders for SECURITY invariant `envelope-reasoning-secret-redaction`), `envelope-tier-one-subset-static.test.ts` (RFC 0030, always-on for load-bearing rules — no `oneOf` / `allOf` / `not` / `prefixItems` / `propertyNames` anywhere; gated on `tierOneSubsetCompliance: "strict"` for OpenAI-strict-only constraints), `envelope-variant-discriminator-static.test.ts` (RFC 0031, always-on; asserts no `oneOf` + every `anyOf` branch declares a single-string-enum discriminator in `required` on every `schemas/envelopes/*.schema.json`), `model-capability-substituted.test.ts` (RFC 0031, advertisement-shape probe on `capabilities.modelCapabilities.advertised[]` identifier pattern + 5 `it.todo()` placeholders for SECURITY invariant `model-capability-substituted-no-credential-disclosure`), `model-capability-insufficient.test.ts` (RFC 0031, 6 `it.todo()` placeholders for refusal + no-recursive-fallback), `node-module-required-capabilities-shape.test.ts` (RFC 0031 
SHOULD-tier authoring-convention; 4 `it.todo()` placeholders), and the six envelope-reliability events from RFC 0032 (`envelope-retry-attempted` carrying the shared advertisement-shape probe enforcing both MUST-tier events in `events[]` per RFC 0032 §C, plus `envelope-retry-exhausted`, `envelope-refusal-shape`, `envelope-truncated`, `envelope-nl-to-format-engaged`, `envelope-recovery-applied` — collectively 39 `it.todo()` placeholders covering retry/refusal/truncation/recovery + SECURITY invariants `envelope-refusal-no-prompt-leak` and `envelope-recovery-no-content-leak`), plus RFC 0033's two scenarios (`envelope-completion-distinguishes-truncation.test.ts` + `envelope-truncation-cap-exhaustion.test.ts` — 12 `it.todo()` placeholders covering the truncation-vs-schema-violation retry-routing distinction + the DoS-bound assertion). Reference workflow-engine sample advertises `capabilities.envelopes.reasoning: { supported: true, promptDirective: "off" }` + `tierOneSubsetCompliance: "warn"` honestly (schemas accept the field; host doesn't yet inject the directive); the other three RFCs' capability blocks defer to reference-host emission code per the staged RFC 0027 §G precedent. 2026-05-20 (RFC 0028 §B Phase B — prompt-pack boot-time install) added `prompt-pack-install.test.ts` (capability-gated on `capabilities.prompts.endpointsSupported: true`; asserts a host that ran the boot-time pack loader surfaces ≥ 1 pack-source template under `GET /v1/prompts?source=pack` carrying the canonical `meta.source: "pack"` + `meta.packName` + `meta.packVersion` stamps; positively identifies the in-tree `vendor.openwop.prompt-sample` reference pack's `writer-system` template when present). Pairs with the new `host/promptPackLoader.ts` boot-time entry on the reference workflow-engine sample, which scans `examples/packs/*` plus `OPENWOP_PROMPT_PACKS_DIR` and calls `installPackTemplates()` for each `kind: "prompt"` pack found. 
2026-05-20 (RFC 0029 Phase C — prompt resolution chain wire shape) added three more scenarios: `prompt-resolution-chain-node-wins.test.ts` (capability-gated on `capabilities.prompts.supported: true`; asserts layer-1 node-config supersedes lower layers per `spec/v1/prompts.md` §"Resolution chain (normative)"), `prompt-resolution-chain-agent-intrinsic.test.ts` (additionally gated on `capabilities.prompts.agentBindings: true`; asserts agent intrinsic `systemPromptRef` wins over `promptOverrides` AND lower layers when the node has no layer-1 ref), `prompt-resolution-chain-fallback-cascade.test.ts` (asserts layer 3 workflow-defaults wins over layer 4 host-defaults; layer 4 host-defaults wins when 1-3 yield null; resolved is null when all four yield null but chain[] still lists every attempted layer). The scenarios drive the host's `POST /v1/host/sample/prompt/resolve` test seam (reference-host implementation deferred to follow-up slice per RFC 0021 staging precedent). 2026-05-20 (RFC 0027 Phase A — prompt templates wire shape) added three scenarios: `prompt-template-shape.test.ts` (always-on; Ajv compileability + positive/negative round-trip for PromptTemplate + PromptRef + PromptKind), `prompt-composed-secret-redaction.test.ts` (capability-gated on `capabilities.prompts.supported: true` + `observability: "full"`; asserts `[REDACTED:<secretId>]` markers in `prompt.composed` payloads for `source: "secret"` variable bindings per SECURITY/threat-model-secret-leakage.md §SR-1), `prompt-composed-trust-marker.test.ts` (same capability gates; asserts `<UNTRUSTED>...</UNTRUSTED>` wrapping + `contentTrust: "untrusted"` propagation per RFC 0020 §D). Paired with new `fixtures/prompt-templates/` sub-directory + per-fixture schema-validity describe block + future SECURITY invariants `prompt-composed-secret-redaction` and `prompt-composed-trust-marker` (lands alongside reference-host emission per RFC 0021 staging precedent). 
2026-05-18 (RFC 0022 `Draft` — runtime variable mapping) added four `it.todo()` placeholder scenarios covering the new mapping surfaces on `core.dispatch` (§A — `dispatch-input-mapping.test.ts`, `dispatch-output-mapping.test.ts`, `dispatch-cross-worker-handoff.test.ts`) and `core.subWorkflow` (§B — `subworkflow-input-mapping.test.ts`). Gated on `capabilities.agents.dispatchMapping` (dispatch trio) and `capabilities.subWorkflow.inputMapping` (subWorkflow). Promote to live assertions when RFC 0022 reaches `Active` + a reference host advertises the matching flags. 2026-05-17 (RFC 0003 §D handoff-schema enforcement, HV-1) added `agentPackHandoffSchemaValidation.test.ts` — verifies the host validates dispatch payloads against `handoff.taskSchemaRef` AND return payloads against `handoff.returnSchemaRef` per RFC 0003 §D. Paired with the new `agent-pack-handoff-schema-enforcement` row in `SECURITY/invariants.yaml`. 2026-05-17 (AI Envelope gap-closure, DRAFT v1.x — `spec/v1/ai-envelope.md`) added 7 advertisement-shape scenarios with `it.todo()` behavioral placeholders gated on `capabilities.envelopeContracts.advertised: true`: `aiEnvelope.universalKinds.test.ts`, `aiEnvelope.schemaDrift.test.ts`, `aiEnvelope.correlationReplay.test.ts`, `aiEnvelope.contractRefusal.test.ts`, `aiEnvelope.trustBoundaryPropagation.test.ts`, `aiEnvelope.redaction.test.ts`, `aiEnvelope.capBreached.test.ts`. Paired with the new `envelope-redaction-sr-1-carry-forward` row in `SECURITY/invariants.yaml`. 2026-05-17 (post-publish hardening, deep audit of `core.openwop.agents`) added `agents-run-tool-allowlist.test.ts` — server-free scenario locking in the `core.openwop.agents@1.0.1` safety-fix that closes `OPENWOP-AUDIT-2026-003` (function-typed `tool.handler` properties rejected at `validateTools()` with `INVALID_TOOL_DECLARATION`; tool-driven runs require `ctx.agentRuntime`; tool-less safe fallback preserved). Paired with the new `agents-run-no-raw-handler` row in `SECURITY/invariants.yaml`. 
Same-day post-publish hardening added `idempotency-key-determinism.test.ts` — server-free scenario locking in the `core.openwop.http@1.1.2` determinism safety-fix (default `composite` mode produces deterministic keys in `(runId, nodeId, payload)`; removed `uuid` mode rejects with `CONFIG_INVALID`; cross-impl vector test lets third-party reimplementations verify wire agreement). Paired with the new `idempotency-key-deterministic` row in `SECURITY/invariants.yaml`. 2026-05-17 (Phase 3 of RFC 0013) added three server-free scenarios exercising the reference workflow-chain expansion library (`conformance/src/lib/workflow-chain-expansion.ts`): `workflow-chain-expansion.test.ts` (parameter substitution + node id collision avoidance + edge rewriting + capability propagation + runtime-invariance contract), `workflow-chain-unresolvable-typeid.test.ts` (rejection with `chain_unresolvable_typeid` when a chain references an unknown typeId), and `workflow-chain-pack-signature-verification.test.ts` (Ed25519 verification recipe reuse from `node-packs.md §Signing`). Earlier that day (Phase 1) added `workflow-chain-pack-manifest-validation.test.ts` — server-free schema-validation scenario covering the new `workflow-chain-pack-manifest.schema.json` (positive sample + two negatives: kind/contents mismatch and invalid `chainId`). Closes RFC 0013 (`Workflow-chain packs`, `Draft`) Phases 1 + 3 alongside the new `spec/v1/workflow-chain-packs.md`, the `Capabilities.workflowChainPacks` block, and the registry build-index/conformance-check `kind` routing from Phase 2. Earlier that day, the suite added 27 `it.todo()` placeholder scenarios paired with RFCs 0014-0020 (host capability surfaces — fs, kvStorage, tableStorage, queueBus, sql/vector/search, blob/cache, mcp.serverMount). These promote to live assertions when each RFC reaches `Active` + the matching capability block lands in `schemas/capabilities.schema.json` + a reference host advertises the capability. 
Earlier additions include 18 Multi-Agent Shift scenarios (Phases 1-5) added 2026-05-10, the `registry-public.test.ts` public-registry healthcheck added 2026-05-11 (opt-in via `OPENWOP_TEST_PUBLIC_REGISTRY=true`), the `replay-llm-cache-key.test.ts` placeholder added 2026-05-11 (three `it.todo()` cases for the cross-host LLM cache-key recipe per `replay.md` §"LLM cache-key recipe"), the two `production-*.test.ts` scenarios added 2026-05-11 for the `openwop-production` profile per RFC 0009 (`production-backpressure.test.ts`, `production-retention-expiry.test.ts`), the four `auth-*.test.ts` scenarios added 2026-05-11/12 for the production-auth profiles per RFC 0010 (`auth-api-key-rotation.test.ts`, `auth-oauth2-client-credentials.test.ts`, `auth-oidc-user-bearer.test.ts`, `auth-mtls.test.ts` (opt-in via `OPENWOP_TEST_MTLS=1`)), `replay-retention-expiry.test.ts` added 2026-05-12 (capability shape + 410/422 envelope per `replay.md` §"Retention and garbage collection"), `bulk-cancel.test.ts` added 2026-05-12 (Phase B close-out of R1 — `POST /v1/runs:bulk-cancel`), the two Phase H launch-blocker advertisement-contract scenarios added 2026-05-12 (`mcp-toolcall-redaction.test.ts` for the MCP-1 invariant per `host-capabilities.md §host.mcp` + `threat-model-prompt-injection.md §UNTRUSTED`, and `http-client-ssrf.test.ts` for the SSRF + body-size cap advertisement contract on `capabilities.httpClient`), the `wasm-pack-abi-version-rejection.test.ts` Track 7 scenario added 2026-05-12 for the ABI-mismatch positive path via the `vendor.openwop.misbehaving-abi` pack per RFC 0008 §H, the `otel-trace-propagation-subworkflow.test.ts` Track 11 close-out added 2026-05-13 (parent + child run spans share the inbound traceparent's traceId across the `core.subWorkflow` dispatch boundary), and the three RFC 0012 (Memory Compaction Profile, `Active`) scenarios added 2026-05-13/14: `memory-compaction-sr1-carry-forward.test.ts` (load-bearing SR-1 §D), `memory-compaction-event-emitted.test.ts` 
(canonical §B payload shape), and `memory-compaction-provenance-tag.test.ts` (soft assertion on §C `compacted-from:<id>` convention). All three gate on `capabilities.memory.compaction.supported` + the host's test seam at `/v1/test/memory/{seed,compact}` (Postgres reference host enables both via `OPENWOP_MEMORY_COMPACTION=true OPENWOP_TEST_TRIGGER_COMPACTION=true`). 2026-05-15 (gap-closure CF-3) added `interrupt-token-matrix.test.ts` (malformed / unknown / replay / cross-run-id paths on `GET|POST /v1/interrupts/{token}`). 2026-05-31 (RFC 0078 portable tool catalog + RFC 0079 credential provenance / egress policy — the Active→Accepted behavioral gate) added four: `tool-catalog-projection.test.ts` (capability-gated on `toolCatalog.supported` via `behaviorGate('openwop-tool-catalog', …)` — the NORMATIVE `GET /v1/tools` list with each `ToolDescriptor` schema-valid + `source`/`safetyTier` in the closed vocab + content-free, `GET /v1/tools/{toolId}` round-trip + unknown-id 404, 401-unauthenticated, and the §F-2 cross-principal non-disclosure; black-box, no POST seam), `tool-session-lifecycle.test.ts` (gated on `toolCatalog.sessionLifecycle` — the §D `tool.session.opened`-before / `tool.session.closed`-after bracket over the RFC 0064 call events via the `POST /v1/host/sample/tools/session-run` seam, one shared `sessionId`, content-free), `egress-audience-binding.test.ts` (KEYSTONE — gated on `httpClient.egressPolicy.supported`; the §C confused-deputy MUST via `POST /v1/host/sample/egress/decide`: an out-of-audience egress is denied/downgraded with the credential NOT attached, a provenance-unevaluable egress fails closed — the behavioral leg of `egress-credential-audience-bound`), and `egress-decision-content-free.test.ts` (the SR-1 canary — the credential value never surfaces in `egress.decided` and `reason` stays in the CLOSED vocabulary). 
The maintained scenario-to-spec map lives in [`coverage.md`](./coverage.md); this README keeps the operator quickstart and the historical scenario notes below.

 High-level coverage includes:

@@ -171,7 +171,7 @@ Server-required (added in 1.7.0):
| ------------- | ----------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Redaction** | [`capabilities.md`](../spec/v1/capabilities.md) §"Secrets" + NFR-7 + §"aiProviders" | Vendor-neutral assertions that the server doesn't leak secret material. Three scenario groups: (a) discovery shape contract — `secrets` + `aiProviders` advertisements are well-formed regardless of `secrets.supported`; when `supported === true`, scopes MUST be non-empty + `resolution === 'host-managed'`; `byok ⊆ supported`. (b) bearer-token redaction — invalid Bearer canary in `Authorization` header is not echoed in the 401 response body. (c) credentialRef echo control — gated on `secrets.supported === true`; canary planted in `configurable.ai.credentialRef` MUST NOT appear in any RunEvent payload (poll-based capture; transport-agnostic). Uses runtime-built canary fixtures (`lib/canaries.ts`) that defeat static secret scanners. 6 scenarios. |

-Current source tree:
+Current source tree: 343 scenario files. Use [`coverage.md`](./coverage.md) for current grade/gap tracking.

 ## Remaining Gaps

package/api/asyncapi.yaml
CHANGED
@@ -126,6 +126,11 @@ channels:
       deploymentRolledBack: { $ref: '#/components/messages/DeploymentRolledBack' }
       deploymentCanaryAdjusted: { $ref: '#/components/messages/DeploymentCanaryAdjusted' }
       deploymentStateChanged: { $ref: '#/components/messages/DeploymentStateChanged' }
+      proposalCreated: { $ref: '#/components/messages/ProposalCreated' }
+      proposalActivated: { $ref: '#/components/messages/ProposalActivated' }
+      goalEvaluated: { $ref: '#/components/messages/GoalEvaluated' }
+      goalClosed: { $ref: '#/components/messages/GoalClosed' }
+      importApplied: { $ref: '#/components/messages/ImportApplied' }

   runEventsValues:
     address: /runs/{runId}/events
@@ -375,6 +380,47 @@ components:
|
|
|
375
380
|
payload:
|
|
376
381
|
$ref: '#/components/schemas/DeploymentStateChangedPayload'
|
|
377
382
|
|
|
383
|
+
# ── Reviewable learning (RFC 0096) — content-free proposal lifecycle ──
|
|
384
|
+
ProposalCreated:
|
|
385
|
+
name: proposal.created
|
|
386
|
+
title: Proposal created (RFC 0096)
|
|
387
|
+
summary: The host synthesized a reviewable-learning draft. Content-free — ids/kind/refs only, never the artifact body or rationale (proposal-inert-until-applied). Emitted only when capabilities.agents.proposals is advertised.
|
|
388
|
+
contentType: application/json
|
|
389
|
+
payload:
|
|
390
|
+
$ref: '#/components/schemas/ProposalCreatedPayload'
|
|
391
|
+
ProposalActivated:
|
|
392
|
+
name: proposal.activated
|
|
393
|
+
title: Proposal activated (RFC 0096)
|
|
394
|
+
summary: A proposal was applied (RFC 0051/0049-gated). Content-free; the installed artifact byte-matches the last-persisted draft (proposal-no-resynthesis).
|
|
395
|
+
contentType: application/json
|
|
396
|
+
payload:
|
|
397
|
+
$ref: '#/components/schemas/ProposalActivatedPayload'
|
|
398
|
+
|
|
399
|
+
# ── Standing goals (RFC 0097) — content-free judge/continuation events ─
|
|
400
|
+
GoalEvaluated:
|
|
401
|
+
name: goal.evaluated
|
|
402
|
+
title: Goal evaluated (RFC 0097)
|
|
403
|
+
summary: A judge check ran against a standing goal. Content-free — no objective text; the verdict is recorded (not recomputed on replay).
|
|
404
|
+
contentType: application/json
|
|
405
|
+
payload:
|
|
406
|
+
$ref: '#/components/schemas/GoalEvaluatedPayload'
|
|
407
|
+
GoalClosed:
|
|
408
|
+
name: goal.closed
|
|
409
|
+
title: Goal closed (RFC 0097)
|
|
410
|
+
summary: A standing goal stopped continuation (satisfied / escalated / abandoned / bound-exceeded). Content-free.
|
|
411
|
+
contentType: application/json
|
|
412
|
+
payload:
|
|
413
|
+
$ref: '#/components/schemas/GoalClosedPayload'
|
|
414
|
+
|
|
415
|
+
# ── Portability (RFC 0098) — content-free import event ───────────────
|
|
416
|
+
ImportApplied:
|
|
417
|
+
name: import.applied
|
|
418
|
+
title: Import applied (RFC 0098)
|
|
419
|
+
summary: An estate import was applied. Content-free — counts + refs only, never item payloads or secret values (export-bundle-no-credential-material).
|
|
420
|
+
contentType: application/json
|
|
421
|
+
payload:
|
|
422
|
+
$ref: '#/components/schemas/ImportAppliedPayload'
|
|
423
|
+
|
|
378
424
|
# ── Run-lifecycle ────────────────────────────────────────────────────
|
|
379
425
|
RunStarted:
|
|
380
426
|
name: run.started
|
@@ -614,6 +660,14 @@ components:
     DeploymentRolledBackPayload: { $ref: '../schemas/run-event-payloads.schema.json#/$defs/deploymentRolledBack' }
     DeploymentCanaryAdjustedPayload: { $ref: '../schemas/run-event-payloads.schema.json#/$defs/deploymentCanaryAdjusted' }
     DeploymentStateChangedPayload: { $ref: '../schemas/run-event-payloads.schema.json#/$defs/deploymentStateChanged' }
+    # RFC 0096 — reviewable-learning proposal event payloads.
+    ProposalCreatedPayload: { $ref: '../schemas/run-event-payloads.schema.json#/$defs/proposalCreated' }
+    ProposalActivatedPayload: { $ref: '../schemas/run-event-payloads.schema.json#/$defs/proposalActivated' }
+    # RFC 0097 — standing-goal event payloads.
+    GoalEvaluatedPayload: { $ref: '../schemas/run-event-payloads.schema.json#/$defs/goalEvaluated' }
+    GoalClosedPayload: { $ref: '../schemas/run-event-payloads.schema.json#/$defs/goalClosed' }
+    # RFC 0098 — portability import event payload.
+    ImportAppliedPayload: { $ref: '../schemas/run-event-payloads.schema.json#/$defs/importApplied' }

     # RFC 0056. The run.annotated notification carries an Annotation —
     # NOT a RunEventDoc — because annotations are a side-resource, not
package/coverage.md
CHANGED
@@ -417,3 +417,11 @@ server-free or shape-probe assertions that run unconditionally.
 | `i18n-negotiation.test.ts` | `spec/v1/i18n.md` (language negotiation) | `capabilities.i18n` |
 | `grpc-transport.test.ts` | `spec/v1/grpc-transport.md` (gRPC transport profile) | `capabilities.grpc` (schema block added by RFC 0094 on this branch) |
 | `webhook-tenant-isolation.test.ts` | RFC 0093 §A3 (`spec/v1/webhooks.md` delivery tenant scope; backs the protocol-tier `webhook-cross-tenant-isolation` invariant) | `capabilities.webhooks.supported` (two-tenant legs via the `host/sample/test/surface` seam, soft-skip on 404) |
+
+### Added on this branch (2026-06-13 — RFCs 0096/0097/0098 `Draft → Active`)
+
+| Scenario file | Spec doc / RFC | Gating capability |
+| -------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
+| `proposal-reviewable-learning.test.ts` | `spec/v1/agent-memory.md` §"Reviewable learning" (RFC 0096); backs `proposal-inert-until-applied` + `proposal-no-resynthesis` | always-on schema legs; behavioral apply-403 leg gated on `capabilities.agents.proposals` (soft-skip on 404) |
+| `goal-standing-continuation.test.ts` | `spec/v1/agent-runtime.md` §"Standing goals" (RFC 0097); backs `goal-continuation-bounded` + `goal-completion-judge-only` | always-on schema legs; behavioral bounds-422 + judge-only legs gated on `capabilities.agents.goals` (soft-skip on 404) |
+| `export-bundle-portability.test.ts` | `spec/v1/portability.md` (RFC 0098); backs `export-bundle-no-credential-material` | always-on schema legs; behavioral literal-credential-import-422 leg gated on `capabilities.portability.import` (soft-skip) |
package/package.json
CHANGED
package/schemas/README.md
CHANGED
@@ -10,6 +10,9 @@
 | `agent-eval-suite.schema.json` | `agent-evaluation.md` (RFC 0081) | Portable agent evaluation suite — tasks + golden/rubric `expected` + deterministic fixtures + allowed model classes + pass/fail thresholds, pack-distributed via `evalSuiteRef` |
 | `agent-manifest.schema.json` | `node-packs.md` + agent-pack RFCs | Agent manifest entries distributed alongside node-pack manifests |
 | `eval-summary.schema.json` | `agent-evaluation.md` (RFC 0081) | The content-free eval-run scorecard — aggregate + per-task scores/cost/latency/safety-findings + regression delta; served by `GET /v1/runs/{runId}/eval-summary` (SECURITY invariant `eval-summary-no-content-leak`) |
+| `proposal.schema.json` | `agent-memory.md` §"Reviewable learning" (RFC 0096) | Reviewable-learning proposal — an INERT, draft-state reusable artifact (agent-pack/workflow-chain-pack/prompt-template/automation) synthesized from run traces; MUST NOT influence any run until `applied`; activation delegated to RFC 0051/0049 (SECURITY invariants `proposal-inert-until-applied`, `proposal-no-resynthesis`) |
+| `goal.schema.json` | `agent-runtime.md` §"Standing goals" (RFC 0097) | Standing goal — a durable objective with judge-based (RFC 0090) completion + bounded (RFC 0058) continuation; completion is the judge's verdict, never client-set (SECURITY invariants `goal-continuation-bounded`, `goal-completion-judge-only`) |
+| `export-bundle.schema.json` | `portability.md` (RFC 0098) | Portable agent-platform export bundle — a tenant's reusable estate (agents/packs/templates/connection-refs/schedules/roster/org-chart) for cross-host migration; carries NO credential values, only refs (SECURITY invariant `export-bundle-no-credential-material`) |
 | `agent-ref.schema.json` | `agent-memory.md` + agent-identity RFC | Multi-Agent Shift Phase 1 — slim runtime AgentRef projection carried on `RunSnapshot.agent` / `runOrchestrator`, `WorkflowNode.agent?`, and `agent.*` event payloads |
 | `agent-roster-entry.schema.json` | `agent-roster.md` (RFC 0086) | Standing agent INSTANCE — a named, tenant-scoped `host:<id>` agent (the "digital-twin employee") that references a manifest/deployment (`agentRef`) and owns a `workflows[]` portfolio; the discovery shape behind `GET /v1/agents/roster` + the `roster` inventory projection |
 | `agent-org-chart.schema.json` | `agent-org-chart.md` (RFC 0087) | Tenant-scoped, DESCRIPTIVE grouping of roster members into departments + roles with acyclic `reportsTo` edges; carries NO authority field (every object `additionalProperties:false`) per the `org-position-no-authority-escalation` invariant; the discovery shape behind `GET /v1/agents/org-chart` |
package/schemas/capabilities.schema.json
CHANGED
@@ -1119,6 +1119,54 @@
           "type": "boolean",
           "default": false,
           "description": "RFC 0063 (`Active`). When `true`, host honors the optional `outputAttestation` block on `core.subWorkflow`: computes a content checksum (RFC 8785 JCS + SHA-256, the `replay.md` recipe) over a child's harvested outputs and surfaces it as the additive optional `attestation` object on the existing `core.workflowChain.event { phase: 'output.harvested' }` (RFC 0037) BEFORE applying `outputMapping`; when the config sets `requireApproval: true`, suspends the parent via an `approval` interrupt (RFC 0051) before merge and fails closed (no `accept`/`edit-accept` ⇒ no merge). Reuses RFC 0051's `approval` kind + RFC 0049 scopes for `principalScope` — no new interrupt kind, event type, or error code. Hosts that omit / `false` this flag treat `outputAttestation` as inert (blind merge, today's behavior)."
+        },
+        "proposals": {
+          "type": "object",
+          "description": "RFC 0096 (`Active`). Reviewable learning — the host synthesizes reusable artifacts (skills/packs/templates/automations) from run/tool traces as INERT, reviewable drafts that MUST NOT influence the resolution, planning, or execution of any run until an authorized principal activates them. A host advertising this serves the `/v1/host/sample/proposals` surface (promotable to `/v1/proposals`) and emits the content-free `proposal.created` / `proposal.activated` events. Activation is delegated to RFC 0051 approval-gate or RFC 0049 RBAC — no new authorization path. On `apply` the installed artifact MUST byte-match the last-persisted `artifact` (no silent re-synthesis). Hosts that omit this block do not synthesize proposals; the conformance scenarios skip cleanly.",
+          "additionalProperties": false,
+          "required": ["artifactKinds", "activation"],
+          "properties": {
+            "artifactKinds": {
+              "type": "array",
+              "items": { "type": "string", "enum": ["agent-pack", "workflow-chain-pack", "prompt-template", "automation"] },
+              "uniqueItems": true,
+              "description": "Which reusable artifact kinds the host can propose. `agent-pack` (RFC 0003), `workflow-chain-pack` (RFC 0013), `prompt-template` (RFC 0027), `automation` (RFC 0052 scheduled job)."
+            },
+            "duplicationDetection": {
+              "type": "boolean",
+              "default": false,
+              "description": "When `true`, the host populates `Proposal.duplicateOf` with an existing artifact ref the proposal restates/overlaps (the 'Curator' duplication signal)."
+            },
+            "activation": {
+              "type": "string",
+              "enum": ["approval-gate", "direct-rbac"],
+              "description": "`approval-gate`: `apply` MUST drive an RFC 0051 gate (role/scope/quorum, audited override) and MUST NOT install unless granted/overridden. `direct-rbac`: `apply` requires only the RFC 0049 scope the host advertises for activation."
+            }
+          }
+        },
+        "goals": {
+          "type": "object",
+          "description": "RFC 0097 (`Active`). Standing goals — a durable objective with explicit completion criteria, evaluated by a host-side judge (RFC 0090 verifier or host evaluator), that keeps an agent working across turns/runs until the judge is satisfied, a declared RFC 0058 bound is crossed, or the agent escalates (RFC 0044). A host advertising this serves `/v1/host/sample/goals` (promotable to `/v1/goals`) and emits the content-free `goal.evaluated` / `goal.closed` events. Completion MUST be the judge's verdict — a client MUST NOT set `state: satisfied` directly. Continuation MUST be bounded. Hosts that omit this block do not run standing goals; the conformance scenarios skip cleanly.",
+          "additionalProperties": false,
+          "required": ["judge", "continuation"],
+          "properties": {
+            "judge": {
+              "type": "string",
+              "enum": ["verifier", "host"],
+              "description": "`verifier`: completion is an RFC 0090 verifier verdict. `host`: an opaque host evaluator."
+            },
+            "continuation": {
+              "type": "array",
+              "items": { "type": "string", "enum": ["schedule", "commitment", "heartbeat", "manual"] },
+              "uniqueItems": true,
+              "description": "How a goal re-engages work between judge checks. `schedule` (RFC 0052), `commitment` (RFC 0068), `heartbeat` (RFC 0060), `manual`."
+            },
+            "requiresBounds": {
+              "type": "boolean",
+              "default": true,
+              "description": "When `true` (default), a goal MUST declare valid RFC 0058 `bounds` before it may activate; a `POST /goals` without bounds returns 422. The host MUST stop continuation and set `state: bound-exceeded` when any declared bound (iteration count / accumulated cost / wall-clock deadline) is crossed."
+            }
+          }
         }
       },
       "additionalProperties": true
@@ -1979,6 +2027,36 @@
         }
       },
       "additionalProperties": false
+    },
+    "portability": {
+      "type": "object",
+      "description": "RFC 0098 (`Active`). Export/import of a tenant's reusable estate (agents 0070, packs 0003/0013, prompt templates 0027, connection *refs* 0045/0095, schedules 0052, roster/org-chart 0086/0087). An export bundle carries NO credential values — only refs to be re-bound at the destination (RFC 0046/0079). Import maps the estate onto the destination's RFC 0048 identity, MUST offer a no-write dry-run plan, MUST be idempotent, and is gated by an RFC 0049 scope. A host advertising this serves `/v1/host/sample/{export,import}` (promotable to `/v1/{export,import}`) and emits the content-free `import.applied` event. Hosts that omit this block neither export nor import; the conformance scenarios skip cleanly.",
+      "additionalProperties": false,
+      "properties": {
+        "export": {
+          "type": "boolean",
+          "default": false,
+          "description": "Host can emit an export bundle for the caller's tenant/workspace via `GET /export`."
+        },
+        "import": {
+          "type": "boolean",
+          "default": false,
+          "description": "Host can import an export bundle via `POST /import`. When `true`, `dryRun` MUST also be `true` (a no-write plan preview is mandatory)."
+        },
+        "kinds": {
+          "type": "array",
+          "items": { "type": "string", "enum": ["agent", "pack", "prompt-template", "connection-ref", "schedule", "roster", "org-chart"] },
+          "uniqueItems": true,
+          "description": "Estate kinds this host can export/import."
+        },
+        "dryRun": {
+          "type": "boolean",
+          "default": true,
+          "description": "Import supports a no-write plan preview (`POST /import?dryRun=true`). MUST be true if `import` is true."
+        }
+      },
+      "if": { "properties": { "import": { "const": true } }, "required": ["import"] },
+      "then": { "properties": { "dryRun": { "const": true } }, "required": ["dryRun"] }
     }
   },
   "additionalProperties": true
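The `if`/`then` pair in the `portability` block makes `import: true` valid only when `dryRun` is also present and `true`. A JSON Schema `default` is an annotation and does not satisfy `required`, so omitting `dryRun` fails. The TypeScript sketch below mirrors that one rule by hand. The `PortabilityBlock` type and the `checkPortabilityAdvertisement` helper are hypothetical names used for illustration and are not part of the package.

```typescript
// Hypothetical view of the `portability` capability block (illustration only).
interface PortabilityBlock {
  export?: boolean;
  import?: boolean;
  kinds?: string[];
  dryRun?: boolean;
}

// Mirrors the schema's if/then: `import: true` requires an explicit `dryRun: true`.
// Returns a list of problems; an empty list means this one rule is satisfied.
function checkPortabilityAdvertisement(block: PortabilityBlock): string[] {
  const problems: string[] = [];
  if (block.import === true && block.dryRun !== true) {
    problems.push("import is true, so dryRun must be present and true");
  }
  return problems;
}

// An import-capable host that forgot to advertise dryRun is flagged.
console.log(checkPortabilityAdvertisement({ import: true }).length); // 1
// An export-only host is unaffected by the rule.
console.log(checkPortabilityAdvertisement({ export: true }).length); // 0
```

A full check would still run the advertisement through a JSON Schema validator; this sketch only isolates the conditional so the intent of the `if`/`then` keywords is easy to see.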
package/schemas/export-bundle.schema.json
ADDED
@@ -0,0 +1,66 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://openwop.dev/spec/v1/export-bundle.schema.json",
+  "title": "ExportBundle",
+  "description": "RFC 0098 §B. A portable agent-platform export bundle — a tenant's reusable estate (agents, packs, prompt templates, connection refs, schedules, roster/org-chart) composed for migration between openwop hosts (or import from an external platform via an adapter). The bundle MUST NOT contain credential VALUES: `connection-ref` items carry only refs/provider ids per RFC 0046/0079; the importer reports unbound refs in `secretsToRebind` and re-binds secrets at the destination. All imported entities are re-owned to the caller's RFC 0048 identity; `source.originPrincipal` is informational only and grants no access. Gated on the `portability` capability.",
+  "type": "object",
+  "additionalProperties": false,
+  "required": ["bundleVersion", "source", "items"],
+  "properties": {
+    "bundleVersion": {
+      "const": "1",
+      "description": "Bundle schema version. Importers MUST reject a bundleVersion they do not understand."
+    },
+    "source": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["origin"],
+      "description": "Provenance of the bundle. Informational only — never a source of authority at the destination.",
+      "properties": {
+        "origin": {
+          "type": "string",
+          "minLength": 1,
+          "description": "Origin host base URL or adapter id (e.g. 'adapter:openclaw')."
+        },
+        "exportedAt": { "type": "string", "format": "date-time" },
+        "originPrincipal": {
+          "type": ["string", "null"],
+          "default": null,
+          "description": "Opaque source identity, informational only. MUST NOT grant any access at the destination."
+        }
+      }
+    },
+    "items": {
+      "type": "array",
+      "description": "The estate, in arbitrary order; `dependsOn` edges define apply order (topological; a cycle is a 422).",
+      "items": {
+        "type": "object",
+        "additionalProperties": false,
+        "required": ["kind", "ref", "payload"],
+        "properties": {
+          "kind": {
+            "type": "string",
+            "enum": ["agent", "pack", "prompt-template", "connection-ref", "schedule", "roster", "org-chart"],
+            "description": "Estate kind. `agent` (RFC 0070), `pack` (RFC 0003/0013), `prompt-template` (RFC 0027), `connection-ref` (RFC 0045/0095 — refs only), `schedule` (RFC 0052), `roster` (RFC 0086), `org-chart` (RFC 0087)."
+          },
+          "ref": {
+            "type": "string",
+            "minLength": 1,
+            "description": "Stable id within the bundle, used as the target of `dependsOn` edges."
+          },
+          "dependsOn": {
+            "type": "array",
+            "items": { "type": "string" },
+            "default": [],
+            "description": "Refs of other items that MUST be applied before this one (topological order)."
+          },
+          "payload": {
+            "type": "object",
+            "description": "The kind's existing schema (RFC 0070 manifest, 0003/0013 pack, 0027 template, 0045/0095 connection *ref*, 0052 job, 0086 roster, 0087 org-chart). Deliberately open here because each kind validates against its own schema. For `connection-ref`, the payload MUST carry only refs/provider ids — a literal credential value is rejected (422).",
+            "$comment": "Intentionally open object — validated against the kind's own schema; connection-ref payloads MUST NOT contain credential values (export-bundle-no-credential-material)."
+          }
+        }
+      }
+    }
+  }
+}
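The `items` description says that `dependsOn` edges define a topological apply order and that a cycle is a 422. One way an importer might compute that order is Kahn's algorithm, sketched below in TypeScript. `BundleItem`, `applyOrder`, and the decision to throw on an unknown ref are assumptions made for illustration; the schema does not say how a dangling `dependsOn` entry is handled, and refs are assumed to be unique within the bundle.

```typescript
// Hypothetical minimal view of a bundle item; `payload` is omitted for brevity.
interface BundleItem {
  kind: string;
  ref: string;
  dependsOn?: string[];
}

// Kahn's algorithm: returns refs so that every dependency precedes its dependents.
// Throws on a cycle (the case the schema maps to a 422) or on an unknown ref (assumption).
function applyOrder(items: BundleItem[]): string[] {
  const unmet = new Map<string, number>(); // ref -> count of unapplied dependencies
  const dependents = new Map<string, string[]>(); // ref -> refs that depend on it
  for (const item of items) {
    unmet.set(item.ref, 0);
    dependents.set(item.ref, []);
  }
  for (const item of items) {
    for (const dep of item.dependsOn ?? []) {
      const list = dependents.get(dep);
      if (list === undefined) throw new Error(`unknown dependsOn ref: ${dep}`);
      list.push(item.ref);
      unmet.set(item.ref, (unmet.get(item.ref) ?? 0) + 1);
    }
  }
  const ready = items.filter((i) => unmet.get(i.ref) === 0).map((i) => i.ref);
  const order: string[] = [];
  while (ready.length > 0) {
    const ref = ready.shift() as string;
    order.push(ref);
    for (const next of dependents.get(ref) ?? []) {
      const left = (unmet.get(next) ?? 0) - 1;
      unmet.set(next, left);
      if (left === 0) ready.push(next);
    }
  }
  // Any item never reaching zero unmet dependencies sits on a cycle.
  if (order.length !== items.length) throw new Error("dependsOn cycle");
  return order;
}

const exampleOrder = applyOrder([
  { kind: "agent", ref: "a1", dependsOn: ["c1"] },
  { kind: "connection-ref", ref: "c1" },
  { kind: "schedule", ref: "s1", dependsOn: ["a1"] },
]);
console.log(exampleOrder.join(",")); // c1,a1,s1
```

Computing the order before any write also fits the mandatory dry-run plan: the same function can produce the plan preview and the real apply sequence.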
package/schemas/goal.schema.json
ADDED
@@ -0,0 +1,104 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://openwop.dev/spec/v1/goal.schema.json",
+  "title": "Goal",
+  "description": "RFC 0097 §B. A standing goal — a durable objective with explicit completion criteria, evaluated by a host-side judge (RFC 0090 verifier verdict or an opaque host evaluator), that keeps an agent working across turns and runs until the judge is satisfied, a declared RFC 0058 bound is crossed, or the agent escalates (RFC 0044). Completion MUST be the judge's verdict — a client MUST NOT set `state: satisfied` directly. Continuation MUST be bounded. `objective` and any verdict payload are SR-1 redaction-safe. Gated on the `agents.goals` capability.",
+  "type": "object",
+  "additionalProperties": false,
+  "required": ["id", "objective", "state", "completion", "continuation", "bounds", "owner", "createdAt"],
+  "properties": {
+    "id": {
+      "type": "string",
+      "minLength": 1,
+      "description": "Host-assigned identifier."
+    },
+    "objective": {
+      "type": "string",
+      "description": "The standing objective. SR-1 redaction-safe — no secrets/PII."
+    },
+    "state": {
+      "type": "string",
+      "enum": ["active", "satisfied", "escalated", "abandoned", "bound-exceeded"],
+      "description": "Lifecycle state. `satisfied` is set ONLY by the host on a judge verdict, never by a client. `bound-exceeded` is set when a declared bound is crossed; `escalated` when an RFC 0044 confidence-escalation interrupt fires against the goal; `abandoned` on explicit abandon."
+    },
+    "completion": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["check"],
+      "description": "How the goal's completion is judged.",
+      "properties": {
+        "check": {
+          "type": "string",
+          "enum": ["verifier", "host"],
+          "description": "`verifier`: completion is an RFC 0090 verifier verdict. `host`: an opaque host evaluator."
+        },
+        "verifierRef": {
+          "type": ["string", "null"],
+          "default": null,
+          "description": "RFC 0090 verifier id when `check = verifier`."
+        },
+        "lastVerdict": {
+          "type": ["object", "null"],
+          "default": null,
+          "additionalProperties": false,
+          "description": "Most recent judge verdict. Content-free re: objective text. Non-deterministic judge output — MUST be persisted (carried in the `goal.evaluated` event / checkpoint) and never recomputed on replay/fork.",
+          "properties": {
+            "satisfied": { "type": "boolean" },
+            "confidence": { "type": "number", "minimum": 0, "maximum": 1 },
+            "runId": { "type": "string", "description": "The run this verdict evaluated (RFC 0040 causation-compatible)." }
+          }
+        }
+      }
+    },
+    "continuation": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["mode"],
+      "description": "How the goal re-engages work between judge checks.",
+      "properties": {
+        "mode": {
+          "type": "string",
+          "enum": ["schedule", "commitment", "heartbeat", "manual"],
+          "description": "`schedule` (RFC 0052), `commitment` (RFC 0068), `heartbeat` (RFC 0060), `manual`."
+        },
+        "armRef": {
+          "type": ["string", "null"],
+          "default": null,
+          "description": "The RFC 0052 job / RFC 0068 commitment / RFC 0060 heartbeat that re-engages work for this goal."
+        }
+      }
+    },
+    "bounds": {
+      "type": "object",
+      "additionalProperties": false,
+      "description": "RFC 0058 execution bounds. The host MUST stop continuation and set `state: bound-exceeded` when any declared bound is crossed. When `agents.goals.requiresBounds` is true, at least one bound MUST be present before the goal may activate.",
+      "properties": {
+        "maxLoopIterations": { "type": "integer", "minimum": 1, "description": "RFC 0058. Max contributing iterations before the goal is bound-exceeded." },
+        "runTimeoutMs": { "type": "integer", "minimum": 0, "description": "RFC 0058. Wall-clock deadline (ms) for the standing goal." },
+        "maxCostUsd": { "type": "number", "minimum": 0, "description": "RFC 0084. Accumulated cost ceiling across contributing runs." }
+      }
+    },
+    "progress": {
+      "type": "object",
+      "additionalProperties": false,
+      "description": "Continuation progress so far.",
+      "properties": {
+        "iterations": { "type": "integer", "minimum": 0 },
+        "contributingRunIds": { "type": "array", "items": { "type": "string" }, "description": "RFC 0040 causation-compatible." }
+      }
+    },
+    "owner": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["tenant"],
+      "description": "RFC 0048 identity triple. Redaction-safe — `principal` is an opaque id, never PII or credential material.",
+      "properties": {
+        "tenant": { "type": "string", "minLength": 1 },
+        "workspace": { "type": "string", "minLength": 1 },
+        "principal": { "type": "string", "minLength": 1 }
+      }
+    },
+    "createdAt": { "type": "string", "format": "date-time" },
+    "updatedAt": { "type": "string", "format": "date-time" }
+  }
+}
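The `bounds` description carries two rules: at least one bound must be declared when `agents.goals.requiresBounds` is true, and continuation stops with `state: bound-exceeded` once any declared bound is crossed. The TypeScript sketch below shows one possible reading. `GoalUsage`, `hasDeclaredBound`, and `crossedBound` are hypothetical names, and treating "crossed" as strictly greater than the limit is an assumption; the schema text does not pin down the comparison.

```typescript
// Subset of the Goal schema's `bounds` object.
interface GoalBounds {
  maxLoopIterations?: number;
  runTimeoutMs?: number;
  maxCostUsd?: number;
}

// Hypothetical usage counters. The Goal schema itself only records `progress.iterations`;
// elapsed time and accumulated cost are assumed to be tracked by the host.
interface GoalUsage {
  iterations: number;
  elapsedMs: number;
  costUsd: number;
}

type BoundName = "maxLoopIterations" | "runTimeoutMs" | "maxCostUsd";

// True when at least one bound is declared (the `requiresBounds` precondition).
function hasDeclaredBound(bounds: GoalBounds): boolean {
  return (
    bounds.maxLoopIterations !== undefined ||
    bounds.runTimeoutMs !== undefined ||
    bounds.maxCostUsd !== undefined
  );
}

// Returns the first declared bound that usage strictly exceeds, or null if none is crossed.
// Strict ">" is an assumption for this sketch.
function crossedBound(bounds: GoalBounds, usage: GoalUsage): BoundName | null {
  if (bounds.maxLoopIterations !== undefined && usage.iterations > bounds.maxLoopIterations) {
    return "maxLoopIterations";
  }
  if (bounds.runTimeoutMs !== undefined && usage.elapsedMs > bounds.runTimeoutMs) {
    return "runTimeoutMs";
  }
  if (bounds.maxCostUsd !== undefined && usage.costUsd > bounds.maxCostUsd) {
    return "maxCostUsd";
  }
  return null;
}

// A goal limited to 3 iterations is bound-exceeded on the 4th.
console.log(crossedBound({ maxLoopIterations: 3 }, { iterations: 4, elapsedMs: 0, costUsd: 0 })); // maxLoopIterations
```

A host following this shape would reject creation with 422 when `hasDeclaredBound` is false and would check `crossedBound` before each continuation step.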
package/schemas/proposal.schema.json
ADDED
@@ -0,0 +1,84 @@
+{
+  "$schema": "https://json-schema.org/draft/2020-12/schema",
+  "$id": "https://openwop.dev/spec/v1/proposal.schema.json",
+  "title": "Proposal",
+  "description": "RFC 0096 §B. A reviewable-learning proposal — an INERT, draft-state reusable artifact (skill/pack/template/automation) the host synthesized from run/tool traces, that MUST NOT influence the resolution, planning, or execution of any run while in any state other than `applied`. Activation is delegated to RFC 0051 approval-gate or RFC 0049 RBAC (no new authorization path). On `apply` the installed artifact MUST byte-match the last-persisted `artifact` (no silent re-synthesis). `rationale` and `provenance.sourceRunIds` are SR-1 redaction-safe (no secrets/PII). Gated on the `agents.proposals` capability.",
+  "type": "object",
+  "additionalProperties": false,
+  "required": ["id", "kind", "state", "artifact", "provenance", "owner", "createdAt"],
+  "properties": {
+    "id": {
+      "type": "string",
+      "minLength": 1,
+      "description": "Host-assigned identifier, stable across revisions."
+    },
+    "kind": {
+      "type": "string",
+      "enum": ["agent-pack", "workflow-chain-pack", "prompt-template", "automation"],
+      "description": "The reusable artifact kind. `agent-pack` (RFC 0003), `workflow-chain-pack` (RFC 0013), `prompt-template` (RFC 0027), `automation` (RFC 0052 scheduled job)."
+    },
+    "state": {
+      "type": "string",
+      "enum": ["draft", "revised", "applied", "rejected", "archived"],
+      "description": "Lifecycle state. Only `applied` may influence a run. `draft`/`revised` are inert; `rejected`/`archived` are terminal-inert."
+    },
+    "title": {
+      "type": "string",
+      "description": "Human-facing label. SR-1 redaction-safe."
+    },
+    "rationale": {
+      "type": "string",
+      "description": "Plain-language justification. SR-1 redaction-safe — no secrets/PII."
+    },
+    "artifact": {
+      "type": "object",
+      "description": "The proposed artifact, shaped by `kind` (an RFC 0003 pack manifest, an RFC 0013 chain pack, an RFC 0027 template, an RFC 0052 job spec). Inert until applied. Deliberately open (`additionalProperties` not constrained here) because the payload conforms to the kind's own existing schema, validated by the host at `apply` (malformed-for-kind ⇒ 422).",
+      "$comment": "Intentionally open object — the artifact body is validated against the kind's own schema, not this envelope."
+    },
+    "provenance": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["sourceRunIds"],
+      "description": "Where the proposal came from.",
+      "properties": {
+        "sourceRunIds": {
+          "type": "array",
+          "items": { "type": "string", "minLength": 1 },
+          "description": "Runs whose traces produced this proposal (RFC 0040 causation-compatible). SR-1 redaction-safe."
+        },
+        "synthesizerModel": {
+          "type": "string",
+          "description": "Opaque identifier of the model/process that synthesized the draft."
+        }
+      }
+    },
+    "duplicateOf": {
+      "type": ["string", "null"],
+      "default": null,
+      "description": "Existing artifact ref this proposal restates/overlaps, when `agents.proposals.duplicationDetection` is on. `null` when unknown or detection is off."
+    },
+    "owner": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["tenant"],
+      "description": "RFC 0048 identity triple that owns this proposal. Redaction-safe — `principal` is an opaque id, never PII or credential material. Single-tenant hosts populate at least `tenant`.",
+      "properties": {
+        "tenant": { "type": "string", "minLength": 1, "description": "Top-level isolation boundary." },
+        "workspace": { "type": "string", "minLength": 1, "description": "Optional sub-tenant within the tenant (RFC 0048 workspace)." },
+        "principal": { "type": "string", "minLength": 1, "description": "Acting identity (user or agent) — opaque id, never PII." }
+      }
+    },
+    "activation": {
+      "type": ["object", "null"],
+      "default": null,
+      "description": "Populated only when `state: applied`: the RFC 0051 approval reference (if `activation = approval-gate`) and the resulting installed artifact ref.",
+      "additionalProperties": false,
+      "properties": {
+        "approvalId": { "type": ["string", "null"], "description": "RFC 0051 approval/gate id when activation routed through an approval gate." },
+        "installedArtifactRef": { "type": "string", "description": "Ref of the artifact installed on `apply` (RFC 0043 install path)." }
+      }
+    },
+    "createdAt": { "type": "string", "format": "date-time" },
+    "updatedAt": { "type": "string", "format": "date-time" }
+  }
+}
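Two rules in the Proposal schema lend themselves to a small illustration: only the `applied` state may influence a run, and on `apply` the installed artifact must byte-match the last-persisted `artifact`. The TypeScript sketch below is one possible reading. `ProposalSummary`, `mayInfluenceRun`, `resolvableProposals`, and `byteMatches` are hypothetical names, and how a host serializes an artifact to bytes is left open here because the schema does not specify it.

```typescript
// The five lifecycle states from the Proposal schema's `state` enum.
type ProposalState = "draft" | "revised" | "applied" | "rejected" | "archived";

// Hypothetical slim view of a proposal; the real object carries more fields.
interface ProposalSummary {
  id: string;
  state: ProposalState;
}

// Only `applied` proposals may influence a run; every other state is inert.
function mayInfluenceRun(state: ProposalState): boolean {
  return state === "applied";
}

// Filters a proposal list down to the entries a resolver is allowed to consider.
function resolvableProposals(all: ProposalSummary[]): ProposalSummary[] {
  return all.filter((p) => mayInfluenceRun(p.state));
}

// Byte-for-byte comparison of two serialized artifacts (the no-resynthesis check).
function byteMatches(persisted: Uint8Array, installed: Uint8Array): boolean {
  if (persisted.length !== installed.length) return false;
  return persisted.every((byte, i) => byte === installed[i]);
}

const visible = resolvableProposals([
  { id: "p1", state: "draft" },
  { id: "p2", state: "applied" },
  { id: "p3", state: "rejected" },
]);
console.log(visible.map((p) => p.id).join(",")); // p2
```

Keeping the inertness check in one predicate makes it harder for a resolver or planner path to pick up a `draft` or `revised` proposal by accident.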
package/schemas/run-event-payloads.schema.json
CHANGED
@@ -2,7 +2,7 @@
   "$schema": "https://json-schema.org/draft/2020-12/schema",
   "$id": "https://openwop.dev/spec/v1/run-event-payloads.schema.json",
   "title": "RunEventPayloads",
-  "description": "Per-RunEventType payload schemas. The base RunEventDoc shape (run-event.schema.json) leaves `payload` permissive for forward-compat. This schema defines the canonical payload contract for each known RunEventType. Consumers MAY pin strict payload validation via `$defs.<typeId>` and `ajv.validate(schema.$defs[event.type], event.payload)`. Unknown event types MUST be tolerated (no $defs match → fold best-effort).\n\
+  "description": "Per-RunEventType payload schemas. The base RunEventDoc shape (run-event.schema.json) leaves `payload` permissive for forward-compat. This schema defines the canonical payload contract for each known RunEventType. Consumers MAY pin strict payload validation via `$defs.<typeId>` and `ajv.validate(schema.$defs[event.type], event.payload)`. Unknown event types MUST be tolerated (no $defs match → fold best-effort).\n\n100 variants from `run-event.schema.json#$defs.RunEventType` are covered, grouped into ~20 shape families with shared $defs. Naming convention: camelCase keys mirror dotted RunEventType names (e.g., `run.started` → `runStarted`).",
   "type": "object",
   "$defs": {
     "_typeIndex": {
@@ -104,7 +104,12 @@
         "budget.reserved": { "$ref": "#/$defs/budgetReserved" },
         "budget.consumed": { "$ref": "#/$defs/budgetConsumed" },
         "budget.threshold.crossed": { "$ref": "#/$defs/budgetThresholdCrossed" },
-        "budget.exhausted": { "$ref": "#/$defs/budgetExhausted" }
+        "budget.exhausted": { "$ref": "#/$defs/budgetExhausted" },
+        "proposal.created": { "$ref": "#/$defs/proposalCreated" },
+        "proposal.activated": { "$ref": "#/$defs/proposalActivated" },
+        "goal.evaluated": { "$ref": "#/$defs/goalEvaluated" },
+        "goal.closed": { "$ref": "#/$defs/goalClosed" },
+        "import.applied": { "$ref": "#/$defs/importApplied" }
       }
     },

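The schema description states the naming convention (camelCase `$defs` keys mirror dotted RunEventType names), and the `_typeIndex` entries show it in use, for example `budget.threshold.crossed` pointing at `budgetThresholdCrossed`. A consumer that wants to look up a payload schema from a dotted event type could apply the convention as in the TypeScript sketch below. `defKeyFor` and `payloadSchemaFor` are hypothetical helper names, and `_typeIndex` remains the authoritative mapping if the two ever disagree.

```typescript
// Maps a dotted RunEventType to its camelCase `$defs` key, per the naming convention.
function defKeyFor(eventType: string): string {
  return eventType
    .split(".")
    .map((part, i) => (i === 0 ? part : part.charAt(0).toUpperCase() + part.slice(1)))
    .join("");
}

// Looks up the payload schema for an event type in a `$defs` object.
// Returns undefined for unknown types, which consumers are expected to tolerate.
function payloadSchemaFor(defs: Record<string, unknown>, eventType: string): unknown {
  const key = defKeyFor(eventType);
  return Object.prototype.hasOwnProperty.call(defs, key) ? defs[key] : undefined;
}

console.log(defKeyFor("run.started")); // runStarted
console.log(defKeyFor("budget.threshold.crossed")); // budgetThresholdCrossed
```

Returning `undefined` instead of throwing keeps the "unknown event types MUST be tolerated" rule intact: the caller can skip strict validation and fold the event on a best-effort basis.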
@@ -122,6 +127,79 @@
       }
     },

+    "proposalCreated": {
+      "description": "RFC 0096 §D. Emitted when the host synthesizes a reviewable-learning draft. Content-free: ids / kind / content-free references only — NEVER the artifact body or the rationale text (those live behind the authed read). Redaction-safe (SECURITY invariant `proposal-inert-until-applied` covers the behavior; SR-1 covers the payload). MUST NOT be emitted unless `capabilities.agents.proposals` is advertised.",
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["proposalId", "kind"],
+      "properties": {
+        "proposalId": { "type": "string", "minLength": 1, "description": "The created proposal's stable id." },
+        "kind": { "type": "string", "enum": ["agent-pack", "workflow-chain-pack", "prompt-template", "automation"], "description": "The proposed artifact kind." },
+        "sourceRunIds": { "type": "array", "items": { "type": "string" }, "description": "Runs whose traces produced the draft (RFC 0040 causation-compatible). Ids only — no trace content." },
+        "duplicateOf": { "type": ["string", "null"], "description": "Existing artifact ref the proposal restates/overlaps, when duplication detection is on; else null." }
+      }
+    },
+
+    "proposalActivated": {
+      "description": "RFC 0096 §D. Emitted on a successful `apply`. Content-free: ids / content-free references only — NEVER the installed artifact body. MUST NOT be emitted unless `capabilities.agents.proposals` is advertised.",
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["proposalId", "kind", "installedArtifactRef"],
+      "properties": {
+        "proposalId": { "type": "string", "minLength": 1 },
+        "kind": { "type": "string", "enum": ["agent-pack", "workflow-chain-pack", "prompt-template", "automation"] },
+        "approvalId": { "type": ["string", "null"], "description": "RFC 0051 approval id when activation routed through an approval gate; null for direct-rbac." },
+        "installedArtifactRef": { "type": "string", "minLength": 1, "description": "Ref of the artifact installed on apply (RFC 0043)." }
+      }
+    },
+
+    "goalEvaluated": {
+      "description": "RFC 0097 §D. Emitted after each judge check on a standing goal. Content-free: NO objective text. The verdict (`satisfied`/`confidence`) is non-deterministic judge output — it is RECORDED here and MUST NOT be recomputed on replay/fork (`replay.md`). MUST NOT be emitted unless `capabilities.agents.goals` is advertised.",
|
|
158
|
+
"type": "object",
|
|
159
|
+
"additionalProperties": false,
|
|
160
|
+
"required": ["goalId", "satisfied", "runId", "iterations"],
|
|
161
|
+
"properties": {
|
|
162
|
+
"goalId": { "type": "string", "minLength": 1 },
|
|
163
|
+
"satisfied": { "type": "boolean", "description": "The judge's verdict for this check." },
|
|
164
|
+
"confidence": { "type": "number", "minimum": 0, "maximum": 1, "description": "Judge confidence in [0,1]." },
|
|
165
|
+
"runId": { "type": "string", "minLength": 1, "description": "The run this verdict evaluated (RFC 0040 causation-compatible)." },
|
|
166
|
+
"iterations": { "type": "integer", "minimum": 0, "description": "Contributing iterations so far." }
|
|
167
|
+
}
|
|
168
|
+
},
|
|
169
|
+
|
|
170
|
+
"goalClosed": {
|
|
171
|
+
"description": "RFC 0097 §D. Emitted when a standing goal stops continuation. Content-free. MUST NOT be emitted unless `capabilities.agents.goals` is advertised.",
|
|
172
|
+
"type": "object",
|
|
173
|
+
"additionalProperties": false,
|
|
174
|
+
"required": ["goalId", "finalState"],
|
|
175
|
+
"properties": {
|
|
176
|
+
"goalId": { "type": "string", "minLength": 1 },
|
|
177
|
+
"finalState": { "type": "string", "enum": ["satisfied", "escalated", "abandoned", "bound-exceeded"], "description": "Terminal state the goal closed in." }
|
|
178
|
+
}
|
|
179
|
+
},
|
|
180
|
+
|
|
181
|
+
"importApplied": {
|
|
182
|
+
"description": "RFC 0098 §D. Emitted when an estate import is applied. Content-free: counts + ref-only — NO item payloads, NO secret values (SECURITY invariant `export-bundle-no-credential-material`). MUST NOT be emitted unless `capabilities.portability` is advertised.",
|
|
183
|
+
"type": "object",
|
|
184
|
+
"additionalProperties": false,
|
|
185
|
+
"required": ["bundleOrigin", "counts"],
|
|
186
|
+
"properties": {
|
|
187
|
+
"bundleOrigin": { "type": "string", "minLength": 1, "description": "The bundle's `source.origin` — informational only." },
|
|
188
|
+
"counts": {
|
|
189
|
+
"type": "object",
|
|
190
|
+
"additionalProperties": false,
|
|
191
|
+
"description": "Per-action item tallies.",
|
|
192
|
+
"properties": {
|
|
193
|
+
"created": { "type": "integer", "minimum": 0 },
|
|
194
|
+
"updated": { "type": "integer", "minimum": 0 },
|
|
195
|
+
"skipped": { "type": "integer", "minimum": 0 },
|
|
196
|
+
"failed": { "type": "integer", "minimum": 0 }
|
|
197
|
+
}
|
|
198
|
+
},
|
|
199
|
+
"secretsToRebind": { "type": "array", "items": { "type": "string" }, "description": "Provider/ref ids whose secrets must be re-bound at the destination. Refs only — never secret values." }
|
|
200
|
+
}
|
|
201
|
+
},
|
|
202
|
+
|
|
125
203
|
"connectorAuthorized": {
|
|
126
204
|
"description": "RFC 0047 — emitted when the host acquires (or re-authorizes) a third-party OAuth token on a user's behalf via the `host.oauth` flow. Redaction-safe: carries the credential REFERENCE (RFC 0046), never token material. MUST NOT be emitted unless `capabilities.oauth.supported: true`.",
|
|
127
205
|
"type": "object",
|
|
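The schema description above suggests looking payload schemas up by event type, but the `$defs` keys are camelCase while event types are dotted. The following is a minimal consumer-side sketch of that lookup, assuming the `_typeIndex` map from this hunk is available as a plain object. `resolvePayloadDef` and the inlined `typeIndex` sample are hypothetical names for illustration, not part of the package. Unknown types resolve to `undefined` so a caller can fold best-effort instead of failing.

```typescript
// Hypothetical helper (not part of the package): map a dotted RunEventType
// to the name of its payload schema under $defs.
type TypeIndex = Record<string, { $ref: string }>;

// Entries copied from the `_typeIndex` lines shown in this diff.
const typeIndex: TypeIndex = {
  'budget.exhausted': { $ref: '#/$defs/budgetExhausted' },
  'proposal.created': { $ref: '#/$defs/proposalCreated' },
  'proposal.activated': { $ref: '#/$defs/proposalActivated' },
  'goal.evaluated': { $ref: '#/$defs/goalEvaluated' },
  'goal.closed': { $ref: '#/$defs/goalClosed' },
  'import.applied': { $ref: '#/$defs/importApplied' },
};

const DEFS_PREFIX = '#/$defs/';

function resolvePayloadDef(index: TypeIndex, eventType: string): string | undefined {
  // Own-property check so names like "toString" are never treated as event types.
  const known = Object.prototype.hasOwnProperty.call(index, eventType);
  const entry = known ? index[eventType] : undefined;
  if (entry === undefined || !entry.$ref.startsWith(DEFS_PREFIX)) {
    return undefined; // unknown type: tolerate, skip strict payload validation
  }
  return entry.$ref.slice(DEFS_PREFIX.length);
}
```

A caller would pass the returned key to whatever validator it uses and skip strict validation when the result is `undefined`.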
package/schemas/run-event.schema.json
CHANGED
|
@@ -166,7 +166,12 @@
|
|
|
166
166
|
"core.workflowChain.confidence-escalated",
|
|
167
167
|
"connector.authorized",
|
|
168
168
|
"connector.auth_expired",
|
|
169
|
-
"authorization.decided"
|
|
169
|
+
"authorization.decided",
|
|
170
|
+
"proposal.created",
|
|
171
|
+
"proposal.activated",
|
|
172
|
+
"goal.evaluated",
|
|
173
|
+
"goal.closed",
|
|
174
|
+
"import.applied"
|
|
170
175
|
]
|
|
171
176
|
}
|
|
172
177
|
}
|
|
package/src/scenarios/export-bundle-portability.test.ts
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Agent-platform portability — export bundle + tenant import (RFC 0098;
|
|
3
|
+
* `portability.md`). Public test for the protocol-tier SECURITY invariant
|
|
4
|
+
* `export-bundle-no-credential-material`.
|
|
5
|
+
*
|
|
6
|
+
* Two layers:
|
|
7
|
+
*
|
|
8
|
+
* A. Always-on, server-free schema legs — the capability block (incl. the
|
|
9
|
+
* `import ⇒ dryRun` if/then), the `export-bundle.schema.json` shape (no
|
|
10
|
+
* credential-named field admitted), and the content-free `import.applied`
|
|
11
|
+
* event payload.
|
|
12
|
+
*
|
|
13
|
+
* B. Capability-gated behavioral legs — on a host advertising
|
|
14
|
+
* `capabilities.portability` that exposes the `/v1/host/sample/import`
|
|
15
|
+
* seam: a bundle carrying a literal credential value is rejected (422),
|
|
16
|
+
* and a dry-run import makes zero writes. Hosts without the seam soft-skip
|
|
17
|
+
* (404); unadvertised hosts skip via the behavior gate.
|
|
18
|
+
*
|
|
19
|
+
* @see spec/v1/portability.md
|
|
20
|
+
* @see SECURITY/invariants.yaml id: export-bundle-no-credential-material
|
|
21
|
+
* @see RFCS/0098-agent-platform-portability-export-bundle-and-import.md
|
|
22
|
+
*/
|
|
23
|
+
|
|
24
|
+
import { describe, it, expect } from 'vitest';
|
|
25
|
+
import { readFileSync } from 'node:fs';
|
|
26
|
+
import { join } from 'node:path';
|
|
27
|
+
import Ajv2020 from 'ajv/dist/2020.js';
|
|
28
|
+
import addFormats from 'ajv-formats';
|
|
29
|
+
import { SCHEMAS_DIR } from '../lib/paths.js';
|
|
30
|
+
import { driver } from '../lib/driver.js';
|
|
31
|
+
import { behaviorGate } from '../lib/behavior-gate.js';
|
|
32
|
+
import { readCapabilityFamily } from '../lib/discovery-capabilities.js';
|
|
33
|
+
|
|
34
|
+
const why = (specRef: string, requirement: string): string => `${specRef} — ${requirement}`;
|
|
35
|
+
function loadSchema(name: string): Record<string, unknown> {
|
|
36
|
+
return JSON.parse(readFileSync(join(SCHEMAS_DIR, name), 'utf8')) as Record<string, unknown>;
|
|
37
|
+
}
|
|
38
|
+
|
|
39
|
+
const CRED_NAMES = ['clientSecret', 'client_secret', 'apiKey', 'api_key', 'accessToken', 'refreshToken', 'password', 'privateKey'] as const;
|
|
40
|
+
|
|
41
|
+
describe('export-bundle-portability: capability advertisement (RFC 0098 §A, server-free)', () => {
|
|
42
|
+
const caps = loadSchema('capabilities.schema.json');
|
|
43
|
+
const portability = (caps.properties as Record<string, { properties?: Record<string, unknown>; if?: unknown; then?: unknown }>).portability;
|
|
44
|
+
|
|
45
|
+
it('capabilities schema declares portability with its sub-flags + the import⇒dryRun if/then', () => {
|
|
46
|
+
expect(portability, why('capabilities.md §portability', 'portability MUST be declared')).toBeDefined();
|
|
47
|
+
for (const flag of ['export', 'import', 'kinds', 'dryRun']) {
|
|
48
|
+
expect(portability?.properties?.[flag], why('RFC 0098 §A', `portability.${flag} MUST be declared`)).toBeDefined();
|
|
49
|
+
}
|
|
50
|
+
expect(portability?.if, why('RFC 0098 §A', 'a JSON-Schema if/then MUST enforce dryRun:true when import:true')).toBeDefined();
|
|
51
|
+
expect(portability?.then, why('RFC 0098 §A', 'the then-branch MUST require dryRun')).toBeDefined();
|
|
52
|
+
});
|
|
53
|
+
});
|
|
54
|
+
|
|
55
|
+
describe('export-bundle-portability: ExportBundle shape (RFC 0098 §B, server-free)', () => {
|
|
56
|
+
const ajv = new Ajv2020({ strict: false, allErrors: true });
|
|
57
|
+
addFormats(ajv);
|
|
58
|
+
const validate = ajv.compile(loadSchema('export-bundle.schema.json'));
|
|
59
|
+
|
|
60
|
+
const good = {
|
|
61
|
+
bundleVersion: '1',
|
|
62
|
+
source: { origin: 'https://host-a.example', exportedAt: '2026-06-13T00:00:00Z' },
|
|
63
|
+
items: [
|
|
64
|
+
{ kind: 'prompt-template', ref: 'tpl-1', payload: { templateId: 'welcome', version: '1.0.0' } },
|
|
65
|
+
{ kind: 'connection-ref', ref: 'conn-1', dependsOn: ['tpl-1'], payload: { provider: 'slack', credentialRef: 'cred:abc' } },
|
|
66
|
+
],
|
|
67
|
+
};
|
|
68
|
+
|
|
69
|
+
it('validates a conforming bundle with refs only', () => {
|
|
70
|
+
expect(validate(good), why('RFC 0098 §B', `a conforming bundle MUST validate. Errors: ${JSON.stringify(validate.errors)}`)).toBe(true);
|
|
71
|
+
});
|
|
72
|
+
|
|
73
|
+
it('rejects a wrong bundleVersion, an unknown kind, and a missing item ref', () => {
|
|
74
|
+
expect(validate({ ...good, bundleVersion: '2' }), why('RFC 0098 §B', 'an unsupported bundleVersion MUST be rejected')).toBe(false);
|
|
75
|
+
expect(validate({ ...good, items: [{ kind: 'mystery', ref: 'x', payload: {} }] }), why('RFC 0098 §B', 'an unknown item kind MUST be rejected')).toBe(false);
|
|
76
|
+
expect(validate({ ...good, items: [{ kind: 'agent', payload: {} }] }), why('RFC 0098 §B', 'an item without a ref MUST be rejected')).toBe(false);
|
|
77
|
+
});
|
|
78
|
+
|
|
79
|
+
it('the bundle envelope admits no credential-named field at the root or source level (additionalProperties:false)', () => {
|
|
80
|
+
for (const name of CRED_NAMES) {
|
|
81
|
+
expect(validate({ ...good, [name]: 'xxx' }), why('SECURITY invariant export-bundle-no-credential-material', `a "${name}" field at the bundle root MUST NOT validate`)).toBe(false);
|
|
82
|
+
expect(validate({ ...good, source: { ...good.source, [name]: 'xxx' } }), why('SECURITY invariant export-bundle-no-credential-material', `a "${name}" field under source MUST NOT validate`)).toBe(false);
|
|
83
|
+
}
|
|
84
|
+
});
|
|
85
|
+
});
|
|
86
|
+
|
|
87
|
+
describe('export-bundle-portability: content-free event (RFC 0098 §D, server-free)', () => {
|
|
88
|
+
const runEvent = loadSchema('run-event.schema.json');
|
|
89
|
+
const payloads = loadSchema('run-event-payloads.schema.json');
|
|
90
|
+
const ajv = new Ajv2020({ strict: false, allErrors: true });
|
|
91
|
+
addFormats(ajv);
|
|
92
|
+
ajv.addSchema(payloads, 'payloads');
|
|
93
|
+
|
|
94
|
+
it('import.applied is in the RunEventType enum and is content-free (counts + refs only)', () => {
|
|
95
|
+
const en = (runEvent.$defs as Record<string, { enum?: string[] }>).RunEventType?.enum ?? [];
|
|
96
|
+
expect(en).toContain('import.applied');
|
|
97
|
+
const applied = ajv.getSchema('payloads#/$defs/importApplied')!;
|
|
98
|
+
expect(applied({ bundleOrigin: 'https://host-a.example', counts: { created: 2, skipped: 1 }, secretsToRebind: ['anthropic'] }), why('RFC 0098 §D', 'a content-free import.applied MUST validate')).toBe(true);
|
|
99
|
+
expect(applied({ bundleOrigin: 'h', counts: { created: 1 }, items: [{ payload: {} }] }), why('SECURITY invariant export-bundle-no-credential-material', 'import.applied MUST NOT carry item payloads')).toBe(false);
|
|
100
|
+
});
|
|
101
|
+
});
|
|
102
|
+
|
|
103
|
+
describe('export-bundle-portability: behavioral (RFC 0098 §E, capability-gated)', () => {
|
|
104
|
+
it('importing a bundle with a literal credential value is rejected (422)', async () => {
|
|
105
|
+
const portability = await readCapabilityFamily<{ import?: boolean }>('portability');
|
|
106
|
+
if (!behaviorGate('portability', portability !== undefined && portability.import === true)) return;
|
|
107
|
+
|
|
108
|
+
const leaky = {
|
|
109
|
+
bundleVersion: '1',
|
|
110
|
+
source: { origin: 'adapter:conformance' },
|
|
111
|
+
items: [{ kind: 'connection-ref', ref: 'c1', payload: { provider: 'anthropic', apiKey: 'sk-conformance-canary' } }],
|
|
112
|
+
};
|
|
113
|
+
const res = await driver.post('/v1/host/sample/import?dryRun=true', { bundle: leaky });
|
|
114
|
+
if (res.status === 404 || res.status === 403) return; // seam unwired — soft-skip
|
|
115
|
+
expect(
|
|
116
|
+
res.status,
|
|
117
|
+
driver.describe('portability.md §Invariants clause 1', 'a bundle carrying a literal credential value MUST be rejected (422)'),
|
|
118
|
+
).toBe(422);
|
|
119
|
+
});
|
|
120
|
+
});
|
|
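The schema legs above only prove that credential-named fields are refused at the bundle root and under `source`, while the behavioral leg expects a host to reject an `apiKey` nested inside an item payload. How a host detects that is not shown in this diff. The sketch below assumes a purely name-based recursive scan over the same `CRED_NAMES` list; `findCredentialFields` is a hypothetical helper, and a real host may well inspect values too.

```typescript
// Field names mirror CRED_NAMES from the test file; the walker is hypothetical.
const CRED_NAMES = ['clientSecret', 'client_secret', 'apiKey', 'api_key', 'accessToken', 'refreshToken', 'password', 'privateKey'] as const;
const CRED_SET: ReadonlySet<string> = new Set<string>(CRED_NAMES);

// Returns the JSON-path-like location of every credential-named key in `value`.
function findCredentialFields(value: unknown, path: string = '$'): string[] {
  const hits: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => {
      hits.push(...findCredentialFields(item, `${path}[${i}]`));
    });
  } else if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    for (const key of Object.keys(obj)) {
      const here = `${path}.${key}`;
      if (CRED_SET.has(key)) hits.push(here);
      hits.push(...findCredentialFields(obj[key], here));
    }
  }
  return hits;
}

// Same shape as the `leaky` bundle in the behavioral leg.
const leakyBundle = {
  bundleVersion: '1',
  source: { origin: 'adapter:conformance' },
  items: [{ kind: 'connection-ref', ref: 'c1', payload: { provider: 'anthropic', apiKey: 'sk-conformance-canary' } }],
};

// Same shape as the conforming bundle: `credentialRef` is a reference, not a secret.
const refOnlyBundle = {
  bundleVersion: '1',
  source: { origin: 'https://host-a.example' },
  items: [{ kind: 'connection-ref', ref: 'conn-1', payload: { provider: 'slack', credentialRef: 'cred:abc' } }],
};

const leakyFindings = findCredentialFields(leakyBundle);
const cleanFindings = findCredentialFields(refOnlyBundle);
```

Under this assumption an import handler would answer 422 whenever the findings list is non-empty, before any write is attempted.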
package/src/scenarios/goal-standing-continuation.test.ts
|
@@ -0,0 +1,139 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Standing goals — judge-based completion + bounded continuation (RFC 0097;
|
|
3
|
+
* `agent-runtime.md` §"Standing goals"). Public tests for the protocol-tier
|
|
4
|
+
* SECURITY invariants `goal-continuation-bounded` and
|
|
5
|
+
* `goal-completion-judge-only`.
|
|
6
|
+
*
|
|
7
|
+
* Two layers:
|
|
8
|
+
*
|
|
9
|
+
* A. Always-on, server-free schema legs — the capability block, the
|
|
10
|
+
* `goal.schema.json` shape, and the content-free `goal.evaluated` /
|
|
11
|
+
* `goal.closed` event payloads.
|
|
12
|
+
*
|
|
13
|
+
* B. Capability-gated behavioral legs — on a host advertising
|
|
14
|
+
* `capabilities.agents.goals` that exposes the `/v1/host/sample/goals`
|
|
15
|
+
* seam: bounded termination (a never-satisfied goal halts at
|
|
16
|
+
* `maxLoopIterations` with `state: bound-exceeded`) and judge-only
|
|
17
|
+
* completion (a client-supplied `state: satisfied` is refused). Hosts
|
|
18
|
+
* without the seam soft-skip (404); unadvertised hosts skip via the gate.
|
|
19
|
+
*
|
|
20
|
+
* @see spec/v1/agent-runtime.md §"Standing goals"
|
|
21
|
+
* @see SECURITY/invariants.yaml id: goal-continuation-bounded, goal-completion-judge-only
|
|
22
|
+
* @see RFCS/0097-standing-goals-and-judge-based-continuation.md
|
|
23
|
+
*/
|
|
24
|
+
|
|
25
|
+
import { describe, it, expect } from 'vitest';
|
|
26
|
+
import { readFileSync } from 'node:fs';
|
|
27
|
+
import { join } from 'node:path';
|
|
28
|
+
import Ajv2020 from 'ajv/dist/2020.js';
|
|
29
|
+
import addFormats from 'ajv-formats';
|
|
30
|
+
import { SCHEMAS_DIR } from '../lib/paths.js';
|
|
31
|
+
import { driver } from '../lib/driver.js';
|
|
32
|
+
import { behaviorGate } from '../lib/behavior-gate.js';
|
|
33
|
+
import { readCapabilityFamily } from '../lib/discovery-capabilities.js';
|
|
34
|
+
|
|
35
|
+
const why = (specRef: string, requirement: string): string => `${specRef} — ${requirement}`;
|
|
36
|
+
function loadSchema(name: string): Record<string, unknown> {
|
|
37
|
+
return JSON.parse(readFileSync(join(SCHEMAS_DIR, name), 'utf8')) as Record<string, unknown>;
|
|
38
|
+
}
|
|
39
|
+
|
|
40
|
+
describe('goal-standing-continuation: capability advertisement (RFC 0097 §A, server-free)', () => {
|
|
41
|
+
it('capabilities schema declares agents.goals with its required sub-flags', () => {
|
|
42
|
+
const caps = loadSchema('capabilities.schema.json');
|
|
43
|
+
const agents = (caps.properties as Record<string, { properties?: Record<string, { properties?: Record<string, unknown>; required?: string[] }> }>).agents;
|
|
44
|
+
const goals = agents?.properties?.goals;
|
|
45
|
+
expect(goals, why('capabilities.md §agents', 'agents.goals MUST be declared')).toBeDefined();
|
|
46
|
+
for (const flag of ['judge', 'continuation', 'requiresBounds']) {
|
|
47
|
+
expect(goals?.properties?.[flag], why('RFC 0097 §A', `agents.goals.${flag} MUST be declared`)).toBeDefined();
|
|
48
|
+
}
|
|
49
|
+
expect(goals?.required, why('RFC 0097 §A', 'judge + continuation MUST be required')).toEqual(
|
|
50
|
+
expect.arrayContaining(['judge', 'continuation']),
|
|
51
|
+
);
|
|
52
|
+
});
|
|
53
|
+
});
|
|
54
|
+
|
|
55
|
+
describe('goal-standing-continuation: Goal shape (RFC 0097 §B, server-free)', () => {
|
|
56
|
+
const ajv = new Ajv2020({ strict: false, allErrors: true });
|
|
57
|
+
addFormats(ajv);
|
|
58
|
+
const validate = ajv.compile(loadSchema('goal.schema.json'));
|
|
59
|
+
|
|
60
|
+
const good = {
|
|
61
|
+
id: 'goal-1',
|
|
62
|
+
objective: 'ship the release checklist',
|
|
63
|
+
state: 'active',
|
|
64
|
+
completion: { check: 'verifier', verifierRef: 'vf-1' },
|
|
65
|
+
continuation: { mode: 'schedule', armRef: 'job-1' },
|
|
66
|
+
bounds: { maxLoopIterations: 7 },
|
|
67
|
+
owner: { tenant: 'acme' },
|
|
68
|
+
createdAt: '2026-06-13T00:00:00Z',
|
|
69
|
+
};
|
|
70
|
+
|
|
71
|
+
it('validates a conforming active goal', () => {
|
|
72
|
+
expect(validate(good), why('RFC 0097 §B', `a conforming goal MUST validate. Errors: ${JSON.stringify(validate.errors)}`)).toBe(true);
|
|
73
|
+
});
|
|
74
|
+
|
|
75
|
+
it('rejects an unknown state, an unknown judge, and a bad continuation mode', () => {
|
|
76
|
+
expect(validate({ ...good, state: 'done' }), why('RFC 0097 §B', 'a state outside the lifecycle enum MUST be rejected')).toBe(false);
|
|
77
|
+
expect(validate({ ...good, completion: { check: 'vibes' } }), why('RFC 0097 §B', 'judge check outside {verifier,host} MUST be rejected')).toBe(false);
|
|
78
|
+
expect(validate({ ...good, continuation: { mode: 'whenever' } }), why('RFC 0097 §B', 'continuation mode outside the enum MUST be rejected')).toBe(false);
|
|
79
|
+
});
|
|
80
|
+
});
|
|
81
|
+
|
|
82
|
+
describe('goal-standing-continuation: content-free events (RFC 0097 §D, server-free)', () => {
|
|
83
|
+
const runEvent = loadSchema('run-event.schema.json');
|
|
84
|
+
const payloads = loadSchema('run-event-payloads.schema.json');
|
|
85
|
+
const ajv = new Ajv2020({ strict: false, allErrors: true });
|
|
86
|
+
addFormats(ajv);
|
|
87
|
+
ajv.addSchema(payloads, 'payloads');
|
|
88
|
+
|
|
89
|
+
it('goal.evaluated and goal.closed are in the RunEventType enum', () => {
|
|
90
|
+
const en = (runEvent.$defs as Record<string, { enum?: string[] }>).RunEventType?.enum ?? [];
|
|
91
|
+
expect(en).toContain('goal.evaluated');
|
|
92
|
+
expect(en).toContain('goal.closed');
|
|
93
|
+
});
|
|
94
|
+
|
|
95
|
+
it('goal.evaluated is content-free — an objective text field is rejected; goal.closed requires a terminal finalState', () => {
|
|
96
|
+
const evaluated = ajv.getSchema('payloads#/$defs/goalEvaluated')!;
|
|
97
|
+
expect(evaluated({ goalId: 'g1', satisfied: false, confidence: 0.4, runId: 'r1', iterations: 2 }), why('RFC 0097 §D', 'a content-free goal.evaluated MUST validate')).toBe(true);
|
|
98
|
+
expect(evaluated({ goalId: 'g1', satisfied: false, runId: 'r1', iterations: 2, objective: 'ship it' }), why('RFC 0097 §D', 'goal.evaluated MUST NOT carry objective text')).toBe(false);
|
|
99
|
+
const closed = ajv.getSchema('payloads#/$defs/goalClosed')!;
|
|
100
|
+
expect(closed({ goalId: 'g1', finalState: 'bound-exceeded' }), why('RFC 0097 §D', 'goal.closed with a terminal finalState MUST validate')).toBe(true);
|
|
101
|
+
expect(closed({ goalId: 'g1', finalState: 'active' }), why('RFC 0097 §D', 'goal.closed MUST NOT use the non-terminal `active` state')).toBe(false);
|
|
102
|
+
});
|
|
103
|
+
});
|
|
104
|
+
|
|
105
|
+
describe('goal-standing-continuation: behavioral (RFC 0097 §E, capability-gated)', () => {
|
|
106
|
+
it('a goal cannot be created without bounds when requiresBounds is advertised (422)', async () => {
|
|
107
|
+
const agents = await readCapabilityFamily<{ goals?: { requiresBounds?: boolean } }>('agents');
|
|
108
|
+
if (!behaviorGate('agents.goals', agents?.goals !== undefined)) return;
|
|
109
|
+
if (agents?.goals?.requiresBounds === false) return; // host opted out of mandatory bounds
|
|
110
|
+
|
|
111
|
+
const res = await driver.post('/v1/host/sample/goals', {
|
|
112
|
+
objective: 'unbounded work',
|
|
113
|
+
completion: { check: 'host' },
|
|
114
|
+
continuation: { mode: 'manual' },
|
|
115
|
+
});
|
|
116
|
+
if (res.status === 404 || res.status === 403) return; // seam unwired — soft-skip
|
|
117
|
+
expect(
|
|
118
|
+
res.status,
|
|
119
|
+
driver.describe('agent-runtime.md §"Standing goals" clause 2', 'POST /goals without RFC 0058 bounds MUST be rejected (422) when requiresBounds is advertised'),
|
|
120
|
+
).toBe(422);
|
|
121
|
+
});
|
|
122
|
+
|
|
123
|
+
it('a client MUST NOT set state: satisfied directly', async () => {
|
|
124
|
+
const agents = await readCapabilityFamily<{ goals?: unknown }>('agents');
|
|
125
|
+
if (!behaviorGate('agents.goals', agents?.goals !== undefined)) return;
|
|
126
|
+
|
|
127
|
+
const list = await driver.get('/v1/host/sample/goals?state=active');
|
|
128
|
+
if (list.status === 404 || list.status === 403) return;
|
|
129
|
+
const goals = (list.json as { goals?: Array<{ id: string }> })?.goals ?? [];
|
|
130
|
+
if (goals.length === 0) return;
|
|
131
|
+
|
|
132
|
+
const res = await driver.post(`/v1/host/sample/goals/${goals[0]!.id}`, { state: 'satisfied' });
|
|
133
|
+
if (res.status === 404) return;
|
|
134
|
+
expect(
|
|
135
|
+
res.status >= 400,
|
|
136
|
+
driver.describe('agent-runtime.md §"Standing goals" clause 1', 'a client-supplied state: satisfied MUST be refused — completion is the judge\'s verdict'),
|
|
137
|
+
).toBe(true);
|
|
138
|
+
});
|
|
139
|
+
});
|
|
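The header comment of this test file describes bounded termination: a goal that is never satisfied halts at `maxLoopIterations` and closes as `bound-exceeded`. A minimal sketch of that loop follows, under stated assumptions: `runGoal`, the `judge` callback, and the synthetic `run-N` ids are hypothetical, and whether a real host counts an iteration before or after the judge check is not specified in this diff.

```typescript
// Hypothetical in-memory model of a standing goal's continuation loop.
type FinalState = 'satisfied' | 'bound-exceeded';

interface Verdict {
  satisfied: boolean;
}

// Mirrors the required fields of the goalEvaluated payload (no objective text).
interface GoalEvaluated {
  goalId: string;
  satisfied: boolean;
  runId: string;
  iterations: number;
}

interface GoalOutcome {
  finalState: FinalState;
  events: GoalEvaluated[];
}

function runGoal(
  goalId: string,
  maxLoopIterations: number,
  judge: (iteration: number) => Verdict,
): GoalOutcome {
  const events: GoalEvaluated[] = [];
  for (let i = 1; i <= maxLoopIterations; i++) {
    // Only the judge decides completion; the verdict is recorded, not recomputed.
    const verdict = judge(i);
    events.push({ goalId, satisfied: verdict.satisfied, runId: `run-${i}`, iterations: i });
    if (verdict.satisfied) {
      return { finalState: 'satisfied', events };
    }
  }
  // The bound was reached without a satisfied verdict: stop continuing.
  return { finalState: 'bound-exceeded', events };
}

const neverSatisfied = runGoal('g1', 7, () => ({ satisfied: false }));
const satisfiedOnThird = runGoal('g2', 7, (i) => ({ satisfied: i === 3 }));
```

The loop has no path that sets `satisfied` other than the judge's verdict, which is the property the second behavioral test probes from the outside.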
package/src/scenarios/proposal-reviewable-learning.test.ts
|
@@ -0,0 +1,129 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Reviewable learning — the proposal lifecycle (RFC 0096; `agent-memory.md`
|
|
3
|
+
* §"Reviewable learning"). Public tests for the protocol-tier SECURITY
|
|
4
|
+
* invariants `proposal-inert-until-applied` and `proposal-no-resynthesis`.
|
|
5
|
+
*
|
|
6
|
+
* Two layers:
|
|
7
|
+
*
|
|
8
|
+
* A. Always-on, server-free schema legs — the capability block, the
|
|
9
|
+
* `proposal.schema.json` shape (incl. the dropped `rule` kind), and the
|
|
10
|
+
* content-free `proposal.created` / `proposal.activated` event payloads.
|
|
11
|
+
*
|
|
12
|
+
* B. Capability-gated behavioral legs — on a host advertising
|
|
13
|
+
* `capabilities.agents.proposals` that exposes the
|
|
14
|
+
* `/v1/host/sample/proposals` seam: inertness (a draft proposal does not
|
|
15
|
+
* influence a run), gated activation (`apply` without scope → 403), and
|
|
16
|
+
* no-re-synthesis (installed artifact byte-matches the last-persisted
|
|
17
|
+
* `artifact`). Hosts without the seam soft-skip (404); unadvertised
|
|
18
|
+
* hosts skip via the behavior gate.
|
|
19
|
+
*
|
|
20
|
+
* @see spec/v1/agent-memory.md §"Reviewable learning"
|
|
21
|
+
* @see SECURITY/invariants.yaml id: proposal-inert-until-applied, proposal-no-resynthesis
|
|
22
|
+
* @see RFCS/0096-reviewable-learning-skill-proposal-lifecycle.md
|
|
23
|
+
*/
|
|
24
|
+
|
|
25
|
+
import { describe, it, expect } from 'vitest';
|
|
26
|
+
import { readFileSync } from 'node:fs';
|
|
27
|
+
import { join } from 'node:path';
|
|
28
|
+
import Ajv2020 from 'ajv/dist/2020.js';
|
|
29
|
+
import addFormats from 'ajv-formats';
|
|
30
|
+
import { SCHEMAS_DIR } from '../lib/paths.js';
|
|
31
|
+
import { driver } from '../lib/driver.js';
|
|
32
|
+
import { behaviorGate } from '../lib/behavior-gate.js';
|
|
33
|
+
import { readCapabilityFamily } from '../lib/discovery-capabilities.js';
|
|
34
|
+
|
|
35
|
+
const why = (specRef: string, requirement: string): string => `${specRef} — ${requirement}`;
|
|
36
|
+
function loadSchema(name: string): Record<string, unknown> {
|
|
37
|
+
return JSON.parse(readFileSync(join(SCHEMAS_DIR, name), 'utf8')) as Record<string, unknown>;
|
|
38
|
+
}
|
|
39
|
+
|
|
40
|
+
describe('proposal-reviewable-learning: capability advertisement (RFC 0096 §A, server-free)', () => {
|
|
41
|
+
const caps = loadSchema('capabilities.schema.json');
|
|
42
|
+
const agents = (caps.properties as Record<string, { properties?: Record<string, { properties?: Record<string, unknown>; required?: string[] }> }>).agents;
|
|
43
|
+
|
|
44
|
+
it('capabilities schema declares agents.proposals with its required sub-flags', () => {
|
|
45
|
+
const proposals = agents?.properties?.proposals;
|
|
46
|
+
expect(proposals, why('capabilities.md §agents', 'agents.proposals MUST be declared')).toBeDefined();
|
|
47
|
+
for (const flag of ['artifactKinds', 'duplicationDetection', 'activation']) {
|
|
48
|
+
expect(proposals?.properties?.[flag], why('RFC 0096 §A', `agents.proposals.${flag} MUST be declared`)).toBeDefined();
|
|
49
|
+
}
|
|
50
|
+
expect(proposals?.required, why('RFC 0096 §A', 'artifactKinds + activation MUST be required')).toEqual(
|
|
51
|
+
expect.arrayContaining(['artifactKinds', 'activation']),
|
|
52
|
+
);
|
|
53
|
+
});
|
|
54
|
+
|
|
55
|
+
it('the `rule` artifact kind is NOT in the enum (dropped — no defining RFC)', () => {
|
|
56
|
+
const kinds = (agents?.properties?.proposals as { properties?: Record<string, { items?: { enum?: string[] } }> })?.properties?.artifactKinds?.items?.enum ?? [];
|
|
57
|
+
expect(kinds, why('RFC 0096 §A', 'artifactKinds enumerates the four kinds with a defining RFC')).toEqual(
|
|
58
|
+
expect.arrayContaining(['agent-pack', 'workflow-chain-pack', 'prompt-template', 'automation']),
|
|
59
|
+
);
|
|
60
|
+
expect(kinds, why('RFC 0096 §A', '`rule` MUST NOT appear — it has no defining RFC, so its artifact is unvalidatable')).not.toContain('rule');
|
|
61
|
+
});
|
|
62
|
+
});
|
|
63
|
+
|
|
64
|
+
describe('proposal-reviewable-learning: Proposal shape (RFC 0096 §B, server-free)', () => {
|
|
65
|
+
const ajv = new Ajv2020({ strict: false, allErrors: true });
|
|
66
|
+
addFormats(ajv);
|
|
67
|
+
const validate = ajv.compile(loadSchema('proposal.schema.json'));
|
|
68
|
+
|
|
69
|
+
const good = {
|
|
70
|
+
id: 'prop-1',
|
|
71
|
+
kind: 'workflow-chain-pack',
|
|
72
|
+
state: 'draft',
|
|
73
|
+
artifact: { name: 'weekly-digest', version: '1.0.0' },
|
|
74
|
+
provenance: { sourceRunIds: ['run-a', 'run-b'] },
|
|
75
|
+
owner: { tenant: 'acme' },
|
|
76
|
+
createdAt: '2026-06-13T00:00:00Z',
|
|
77
|
+
};
|
|
78
|
+
|
|
79
|
+
it('validates a conforming draft proposal', () => {
|
|
80
|
+
expect(validate(good), why('RFC 0096 §B', `a conforming proposal MUST validate. Errors: ${JSON.stringify(validate.errors)}`)).toBe(true);
|
|
81
|
+
});
|
|
82
|
+
|
|
83
|
+
it('rejects an unknown state and the dropped `rule` kind', () => {
|
|
84
|
+
expect(validate({ ...good, state: 'live' }), why('RFC 0096 §B', 'a state outside the lifecycle enum MUST be rejected')).toBe(false);
|
|
85
|
+
expect(validate({ ...good, kind: 'rule' }), why('RFC 0096 §B', '`kind: "rule"` MUST be rejected (dropped from the enum)')).toBe(false);
|
|
86
|
+
expect(validate({ ...good, owner: { workspace: 'x' } }), why('RFC 0048', 'owner without tenant MUST be rejected')).toBe(false);
|
|
87
|
+
});
|
|
88
|
+
});
|
|
89
|
+
|
|
90
|
+
describe('proposal-reviewable-learning: content-free events (RFC 0096 §D, server-free)', () => {
|
|
91
|
+
const runEvent = loadSchema('run-event.schema.json');
|
|
92
|
+
const payloads = loadSchema('run-event-payloads.schema.json');
|
|
93
|
+
const ajv = new Ajv2020({ strict: false, allErrors: true });
|
|
94
|
+
addFormats(ajv);
|
|
95
|
+
ajv.addSchema(payloads, 'payloads');
|
|
96
|
+
|
|
97
|
+
it('proposal.created and proposal.activated are in the RunEventType enum', () => {
|
|
98
|
+
const en = (runEvent.$defs as Record<string, { enum?: string[] }>).RunEventType?.enum ?? [];
|
|
99
|
+
expect(en).toContain('proposal.created');
|
|
100
|
+
expect(en).toContain('proposal.activated');
|
|
101
|
+
});
|
|
102
|
+
|
|
103
|
+
it('proposal.created is content-free — an artifact body and rationale text are rejected', () => {
|
|
104
|
+
const created = ajv.getSchema('payloads#/$defs/proposalCreated')!;
|
|
105
|
+
expect(created({ proposalId: 'p1', kind: 'agent-pack', sourceRunIds: ['r1'], duplicateOf: null }), why('RFC 0096 §D', 'a content-free proposal.created MUST validate')).toBe(true);
|
|
106
|
+
expect(created({ proposalId: 'p1', kind: 'agent-pack', artifact: { x: 1 } }), why('SECURITY invariant proposal-inert-until-applied', 'proposal.created MUST NOT carry the artifact body')).toBe(false);
|
|
107
|
+
expect(created({ proposalId: 'p1', kind: 'agent-pack', rationale: 'because…' }), why('RFC 0096 §D', 'proposal.created MUST NOT carry rationale text')).toBe(false);
|
|
108
|
+
});
|
|
109
|
+
});
|
|
110
|
+
|
|
111
|
+
describe('proposal-reviewable-learning: behavioral (RFC 0096 §E, capability-gated)', () => {
|
|
112
|
+
it('apply without the activation scope is denied (403) and installs nothing', async () => {
|
|
113
|
+
const agents = await readCapabilityFamily<{ proposals?: { activation?: string } }>('agents');
|
|
114
|
+
if (!behaviorGate('agents.proposals', agents?.proposals !== undefined)) return;
|
|
115
|
+
|
|
116
|
+
const list = await driver.get('/v1/host/sample/proposals?state=draft');
|
|
117
|
+
if (list.status === 404 || list.status === 403) return; // seam unwired — soft-skip
|
|
118
|
+
const proposals = (list.json as { proposals?: Array<{ id: string }> })?.proposals ?? [];
|
|
119
|
+
if (proposals.length === 0) return; // nothing to act on — soft-skip
|
|
120
|
+
|
|
121
|
+
// Apply with no auth/scope — MUST be refused.
|
|
122
|
+
const res = await driver.post(`/v1/host/sample/proposals/${proposals[0]!.id}/apply`, {});
|
|
123
|
+
if (res.status === 404) return;
|
|
124
|
+
expect(
|
|
125
|
+
res.status,
|
|
126
|
+
driver.describe('agent-memory.md §"Reviewable learning" clause 2', 'apply without the activation scope MUST be denied (403)'),
|
|
127
|
+
).toBe(403);
|
|
128
|
+
});
|
|
129
|
+
});
|
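The behavioral leg above expects `apply` without the activation scope to return 403 and install nothing, and the file header adds that an installed artifact should byte-match the last-persisted one. The sketch below models both properties in memory. Everything in it is an assumption for illustration: `applyProposal`, the `proposals:apply` scope name, and the two maps are hypothetical and do not come from the package or the RFC text.

```typescript
// Hypothetical scope name; the real activation scope is not shown in this diff.
const ACTIVATION_SCOPE = 'proposals:apply';

interface StoredProposal {
  id: string;
  artifactJson: string; // the last-persisted artifact, kept as serialized bytes
}

interface ApplyResult {
  status: number;
}

function applyProposal(
  store: ReadonlyMap<string, StoredProposal>,
  installed: Map<string, string>,
  id: string,
  scopes: ReadonlySet<string>,
): ApplyResult {
  const proposal = store.get(id);
  if (proposal === undefined) return { status: 404 };
  // Gate first: a refused apply must leave `installed` untouched.
  if (!scopes.has(ACTIVATION_SCOPE)) return { status: 403 };
  // Install the persisted bytes verbatim; nothing is regenerated at apply time.
  installed.set(id, proposal.artifactJson);
  return { status: 200 };
}

const store = new Map<string, StoredProposal>([
  ['prop-1', { id: 'prop-1', artifactJson: '{"name":"weekly-digest","version":"1.0.0"}' }],
]);
const installed = new Map<string, string>();

const denied = applyProposal(store, installed, 'prop-1', new Set<string>());
const installedAfterDenied = installed.size;
const allowed = applyProposal(store, installed, 'prop-1', new Set<string>([ACTIVATION_SCOPE]));
```

Checking the scope before touching `installed` is what keeps a draft inert: a denied call has no observable side effect.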