npm - copilot-tap-extension - Versions diffs - 2.0.8 → 2.0.9 - Mend

copilot-tap-extension 2.0.8 → 2.0.9

Files changed (54)

package/README.md +2 -1
package/SOUL.md +51 -0
package/bin/install.mjs +2 -1
package/dist/copilot-instructions.md +5 -0
package/dist/extension.mjs +361 -20
package/dist/version.json +1 -1
package/docs/adr/0001-persistent-config-default-ownership.md +33 -0
package/docs/adr/0002-local-provider-gateway-runtime-security.md +36 -0
package/docs/adr/0003-emitter-delivery-lifecycle.md +68 -0
package/docs/adr/0004-persistent-config-canonical-streams.md +86 -0
package/docs/adr/0005-provider-sdk-push-and-dynamic-tools.md +48 -0
package/docs/adr/0006-command-emitter-cwd-workspace-boundary.md +46 -0
package/docs/adr/0007-runtime-session-workspace-context.md +62 -0
package/docs/evals.md +41 -0
package/docs/evolution-of-tap-icon.html +989 -0
package/docs/providers.md +242 -0
package/docs/recipes/adaptive-agent.md +303 -0
package/docs/recipes/agent-brainstorm/100-extension-ideas.md +288 -0
package/docs/recipes/agent-brainstorm/deep-ideas.md +216 -0
package/docs/recipes/ambient-guardian.md +314 -0
package/docs/recipes/browser-bridge.md +162 -0
package/docs/recipes/codex-goals-for-tap-goal.md +136 -0
package/docs/recipes/copilot-sdk-canvas.md +147 -0
package/docs/recipes/deferred-cognition.md +310 -0
package/docs/recipes/provider-integration-patterns.md +93 -0
package/docs/recipes/provider-interface-advanced.md +1364 -0
package/docs/recipes/provider-interface-core-profile.md +568 -0
package/docs/recipes/tap-control-plane-roadmap.md +60 -0
package/docs/recipes/universal-tool-gateway.md +202 -0
package/docs/reference.md +229 -0
package/docs/use-cases.md +348 -0
package/package.json +4 -1
package/providers/detour/README.md +84 -0
package/providers/detour/bridge.js +219 -0
package/providers/detour/index.mjs +322 -0
package/providers/detour/package-lock.json +577 -0
package/providers/detour/package.json +19 -0
package/providers/detour/scripts/build.mjs +31 -0
package/providers/detour/src/bridge.js +256 -0
package/providers/detour/src/contracts.js +40 -0
package/providers/detour/src/inspector.js +260 -0
package/providers/detour/src/inspector.test.mjs +53 -0
package/providers/detour/src/panel.js +465 -0
package/providers/detour/src/provider-core.js +233 -0
package/providers/detour/src/provider-core.test.mjs +185 -0
package/providers/detour/src/react-context-core.js +143 -0
package/providers/detour/src/react-context.js +44 -0
package/providers/detour/src/react-context.test.mjs +41 -0
package/providers/templates/README.md +23 -0
package/providers/templates/ci-review-provider.mjs +46 -0
package/providers/templates/detour-workflow-provider.mjs +41 -0
package/providers/templates/jira-github-provider.mjs +42 -0
package/providers/templates/provider-utils.mjs +45 -0
package/providers/templates/sast-triage-provider.mjs +51 -0

package/docs/adr/0002-local-provider-gateway-runtime-security.md ADDED

@@ -0,0 +1,36 @@
+# ADR 0002: Local provider gateway token and runtime shutdown boundary
+## Status
+Accepted
+## Context
+No `docs/adr/0000-template.md` exists, so this ADR follows the existing ADR style.
+The provider gateway lets local external processes register tools with a Copilot session over WebSocket. That boundary needs safe defaults for network exposure, token discovery, token lifetime, and shutdown cleanup. ADR 0001 covers persisted emitter ownership defaults and does not cover provider gateway security or runtime lifecycle. The extension entrypoint should not own provider protocol details; it delegates session lifecycle handling to the runtime facade.
+Providers may be launched from the active Copilot environment, from a sibling terminal, or via the provider SDK. Sibling processes cannot reliably inherit `TAP_PROVIDER_TOKEN`, so the gateway needs a local discovery path without turning the token into durable configuration.
+## Decision
+- The provider WebSocket gateway binds to loopback by default: `127.0.0.1:9400`. Any non-loopback host must be an explicit runtime override.
+- On gateway start, generate a fresh provider token for the running gateway instance.
+- Publish the token in both supported discovery locations:
+  - `TAP_PROVIDER_TOKEN` in the gateway process environment.
+  - `<COPILOT_HOME or ~/.copilot>/extensions/tap/.provider-token` for sibling local providers and SDK auto-discovery.
+- Create the token directory with restrictive permissions (`0700`) and write the token file with restrictive permissions (`0600`), including a best-effort chmod after write.
+- Treat the token file as runtime state, not config: remove it and clear `TAP_PROVIDER_TOKEN` when the gateway stops.
+- On session shutdown, the runtime facade owns provider lifecycle coordination. Entrypoints delegate shutdown listener registration to the cached runtime, which de-duplicates the effective `session.shutdown` handler across extension reloads and logs cleanup failures instead of allowing fire-and-forget rejections. The runtime sends `session.lifecycle` with `state: "shutdown.pending"` and a runtime-owned deadline (currently 10 seconds). Stop accepting new gateway connections immediately, but keep existing provider sockets open until they send `goodbye`, all sockets drain, or the deadline expires. After the deadline, close remaining provider sockets.
+- Runtime session-shutdown cleanup uses the shutdown-specific emitter wait path from ADR 0003 before reporting cleanup complete; ordinary user/tool stop requests remain non-blocking stop requests.
+- Bound-provider protocol failures that can represent an in-flight tool call's terminal response must fail deterministically. If a malformed message, oversized message, invalid `tool.result`, or syntactically valid `tool.result` with an unknown call id cannot be correlated while provider calls are pending, reject exactly one pending call with the protocol/validation error. If multiple calls are pending and correlation is impossible, disconnect the provider and reject all pending calls with `DISCONNECTED`. Unknown-id `tool.result` messages are protocol errors and are not delivered to normal tool-result callbacks.
+## Consequences
+- Local providers can connect from sibling terminals without manually copying environment variables, while the gateway remains limited to loopback by default.
+- Token exposure is scoped to the current OS user profile and gateway runtime. The token is still bearer auth, so users must not share it or expose the token file to untrusted processes.
+- Gateway stop acts as token revocation by deleting the token file and clearing the environment value.
+- Providers get a bounded cleanup window during session shutdown, avoiding abrupt termination when they respond promptly and preventing indefinite shutdown hangs when they do not.
+- Repeated extension reloads do not accumulate active shutdown cleanup handlers against the cached runtime, and rejected async shutdown cleanup is visible in stderr/session diagnostics.
+- Invalid or uncorrelatable terminal provider behavior cannot leave Copilot-facing tool promises pending indefinitely, including tools without declared timeouts. Some parse/payload errors remain non-fatal when no calls are pending, but become fail-fast while ambiguous in-flight calls could otherwise be orphaned.
+- Future changes to provider token discovery, host binding, token persistence, runtime shutdown ownership/deadlines/listener ownership, or pending-call fail-fast semantics should update or supersede this ADR.
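
The pending-call fail-fast rule in ADR 0002 can be sketched as a small decision function. This is a minimal illustration, not the package's code: `failUncorrelatable`, the `pending` map shape, and the returned `action` labels are all hypothetical, and it assumes that "reject exactly one pending call" applies when a single call is in flight.

```javascript
// Hypothetical sketch of the ADR 0002 fail-fast rule; names are illustrative,
// not the extension's actual API.
// `pending` maps call id -> { reject(error) } for in-flight provider tool calls.
function failUncorrelatable(pending, protocolError) {
  if (pending.size === 0) {
    // Nothing can be orphaned, so the error stays non-fatal.
    return { action: "ignored", rejected: [] };
  }
  if (pending.size === 1) {
    // Assumption: with one call in flight, it is the only possible owner of
    // the bad message, so it receives the protocol/validation error.
    const [[callId, call]] = pending;
    pending.delete(callId);
    call.reject(protocolError);
    return { action: "rejected-one", rejected: [callId] };
  }
  // Several calls are in flight and correlation is impossible:
  // disconnect and reject every pending call with DISCONNECTED.
  const rejected = [...pending.keys()];
  for (const call of pending.values()) {
    call.reject(new Error("DISCONNECTED"));
  }
  pending.clear();
  return { action: "disconnected", rejected };
}
```

Either branch leaves the `pending` map empty, which is the property the ADR cares about: no Copilot-facing tool promise stays pending after an uncorrelatable terminal message.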

package/docs/adr/0003-emitter-delivery-lifecycle.md ADDED

@@ -0,0 +1,68 @@
+# ADR 0003: Emitter delivery lifecycle is retryable and transactional
+## Status
+Accepted
+## Context
+No `docs/adr/0000-template.md` exists, so this ADR follows the existing ADR style.
+Persistent emitter auto-start, stream session-injector policy, notification delivery, and supervisor persistence happen across separate modules. A persistent emitter may produce output during session startup, before the Copilot session is attached to the cached runtime, and it may route `surface` or `inject` outcomes through the stream's session injector immediately.
+The lifecycle contract needs to avoid these failure modes:
+- auto-started persistent emitters must not silently suppress `surface`/`inject` outcomes when no explicit stream injector is configured;
+- queued notification delivery must not discard a batch solely because the session is not attached yet;
+- a failed persistent start must not report failure while leaving a newly-started emitter running outside durable config;
+- startup surface logs must not disappear when emitters produce `surface` output before the Copilot session object is attached to the cached runtime;
+- session shutdown cleanup must not report completion before child processes close, except through a bounded timeout path;
+- idle PromptEmitters auto-started during session startup must not remain `WAITING` forever solely because the initial `session.idle` transition happened before the runtime's session activity bridge was attached.
+## Decision
+- Config bootstrap preserves explicit `subscribe: false` on emitter definitions.
+- When a persistent stream already has a configured session injector, auto-start does not overwrite that injector during emitter start; the persisted stream policy remains authoritative.
+- When no explicit stream injector is configured for the emitter stream, bootstrap allows the emitter's normal subscription default to apply so `surface`/`inject` filter outcomes have an enabled delivery path.
+- Notification dispatch removes a batch from the queue only for an attempted send, and requeues that batch at the front when delivery fails because the session is not attached. Retry is delayed and single-flight to avoid tight retry loops while preserving notification ordering. The retry queue is bounded in memory: new updates are dropped when the queue is full, and retry requeues preserve the failed batch at the front while dropping any overflow from the tail. Drops are reported through session diagnostics.
+- Session end/shutdown advances the notification dispatch generation, cancels pending retry timers, and clears queued-but-unsent notifications so stale background updates from one Copilot session cannot be injected into a later session.
+- Session timeline logs emitted before initial attach are queued in bounded memory by the session port and replayed after attach. This covers startup `surface` delivery races without changing post-attach logging behavior.
+- Supervisor start is transactional around newly-started emitters: if post-start persistence fails, the supervisor requests a bounded stop-and-wait for the new emitter, removes/restores the in-memory emitter entry after the stop settles, restores the prior session-injector state, and best-effort restores the previous persistent emitter config before surfacing the failure. If the bounded rollback wait times out or stop fails, the runtime emitter entry and current injector state remain visible for manual cleanup while durable config is still restored best-effort.
+- Bootstrap restoration of an emitter that already exists in persistent config is
+  a runtime-only start path. When bootstrap does not request any durable
+  mutation, supervisor start skips config rewrites and persistence rollback so a
+  read-only or temporarily unwritable config file cannot by itself prevent
+  auto-start recovery. User-initiated persistent starts and persistent injector
+  updates continue to use the transactional persistence behavior above.
+- Scheduled emitter iterations must clear `inFlight` in a `finally` path and convert unexpected thrown/rejected iteration failures into deterministic failed iteration results so unhandled rejections cannot strand an emitter in `RUNNING`.
+- Ordinary emitter `stop()` remains a request/transition operation for tool compatibility. Shutdown uses a separate wait path that requests stop, waits for child `close`/in-flight completion, and returns per-emitter outcomes (`stopped`, `timedOut`, or `failed`) instead of discarding timeout/rejection details. Hook cleanup summaries report those outcomes rather than claiming unconditional success.
+- After session listeners are attached, the runtime synthesizes one initial idle lifecycle nudge by marking the session port idle and calling the supervisor's existing `onSessionIdle()` path. Later real activity events clear scheduled idle work through the normal session-activity transition. This gives persistent idle PromptEmitters auto-started during `onSessionStart` a deterministic first scheduling path even when the SDK does not replay an already-observed `session.idle` event to late listeners.
+- CommandEmitter event delivery is resolved through the shared stream delivery policy seam, preserving the existing EventFilter + SessionInjector matrix:
+  - `drop` stores nothing and increments the dropped-line count.
+  - `keep` stores the event and surfaces it only when the stream SessionInjector is enabled with `delivery: "all"`.
+  - `surface` stores the event and surfaces it only when the stream SessionInjector is enabled with `delivery: "surface"` or `delivery: "all"`.
+  - `inject` stores the event and enqueues session injection when the stream SessionInjector is enabled with `delivery: "important"`, `"all"`, `"surface"`, or `"inject"`; it surfaces only for enabled `delivery: "surface"` or `"all"`.
+  - Nullish SessionInjector delivery continues to default to `surface`; disabled SessionInjectors and `keep`/`drop`/unknown non-null delivery modes do not proactively surface or inject.
+- System notifications emitted by the line router continue to use the same SessionInjector injection decision for enqueueing, but they are not timeline-surfaced by that path.
+- `handlePromptResult()` remains append-only compatibility code; this decision does not introduce PromptEmitter assistant-response capture.
+## Consequences
+- Persistent emitters with `inject` or `surface` event-filter outcomes can deliver immediately after auto-start without requiring a duplicate stream entry.
+- User-configured persistent stream injector policy remains the source of truth when both a stream definition and an emitter definition exist.
+- Session startup races defer notification delivery instead of losing background events.
+- Startup surface events can appear after attach instead of being silently suppressed by a temporarily detached session port.
+- Startup idle PromptEmitters can run after attach without waiting for a second idle transition; if real session activity follows attach, the existing activity bridge cancels the pending idle timer.
+- Retried notifications retain FIFO order relative to later queued updates that remain inside the bounded retry queue; deterministic overflow drops prefer preserving older/retried work over newer tail entries.
+- Session lifecycle clearing prevents queued notification retries from crossing session boundaries.
+- A caller that sees persistent emitter start fail should not also have a hidden running emitter to clean up; if rollback cannot confirm settlement before the bounded timeout, the emitter remains visible in runtime state for manual cleanup.
+- Persistent emitter auto-start from config can succeed even when the existing
+  config file cannot be rewritten, provided no durable state change was
+  requested by bootstrap.
+- Shutdown cleanup can wait for real process closure without changing the public meaning of user/tool stop requests, and session summaries can distinguish emitters that stopped, timed out, or failed during cleanup.
+- The shared delivery seam centralizes CommandEmitter delivery decisions without changing EventFilter outcomes, SessionInjector authority, notification retry behavior, or PromptEmitter capture semantics.
+- Future changes to auto-start subscription defaults, bootstrap persistence
+  writes, notification/log replay policy, attach-time idle nudging, scheduled
+  iteration failure handling, notification retry bounds/session clearing,
+  shutdown wait reporting behavior, or supervisor start rollback semantics
+  should update or supersede this ADR.
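
The EventFilter + SessionInjector matrix from ADR 0003 reads naturally as a lookup table. The sketch below is one possible encoding, assuming an injector object with `enabled` and `delivery` fields; `resolveCommandDelivery` and the result shape are hypothetical names, not the shared delivery seam's real interface.

```javascript
// Illustrative only: function and field names are assumptions, not the
// package's API. Encodes the CommandEmitter delivery matrix from ADR 0003.
const SURFACES_FOR = {
  keep: ["all"],
  surface: ["surface", "all"],
  inject: ["surface", "all"],
};
const INJECTS_FOR = {
  keep: [],
  surface: [],
  inject: ["important", "all", "surface", "inject"],
};

function resolveCommandDelivery(outcome, injector) {
  if (outcome === "drop") {
    // Nothing is stored; the caller increments its dropped-line count.
    return { store: false, surface: false, inject: false, dropped: true };
  }
  if (!(outcome in SURFACES_FOR)) {
    // Sketch choice: the ADR defines only four filter outcomes.
    throw new TypeError(`unknown filter outcome: ${outcome}`);
  }
  const enabled = Boolean(injector && injector.enabled);
  // Nullish delivery defaults to "surface".
  const delivery = injector?.delivery ?? "surface";
  return {
    store: true,
    surface: enabled && SURFACES_FOR[outcome].includes(delivery),
    inject: enabled && INJECTS_FOR[outcome].includes(delivery),
    dropped: false,
  };
}
```

For example, `resolveCommandDelivery("inject", { enabled: true, delivery: "important" })` stores the event and enqueues an injection without surfacing it, while a disabled injector only stores.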

package/docs/adr/0004-persistent-config-canonical-streams.md ADDED

@@ -0,0 +1,86 @@
+# ADR 0004: Persistent config stream injector and alias semantics
+## Status
+Accepted
+## Context
+No `docs/adr/0000-template.md` exists, so this ADR follows the existing ADR style.
+ADR 0001 records that persisted emitter definitions default to user-owned,
+persistent ownership semantics. Persistent stream definitions have the same
+on-disk/user-authored character, but session-injector normalization defaulted
+missing ownership and lifespan to model-owned, temporary values while runtime
+bootstrap applied persisted streams as user-owned, persistent definitions.
+Emitter config also has two names for the destination EventStream:
+- `stream` is the documented config field.
+- `channel` is a legacy alias still accepted by older tools and config files.
+Keeping both fields in normalized/serialized config lets a stale `channel`
+silently override an edited `stream` in runtime paths that consume only
+`channel`.
+## Decision
+- Persisted stream `sessionInjector` entries default to:
+  - `ownership: "userOwned"`
+  - `lifespan: "persistent"`
+- Explicit compatibility fields are still honored:
+  - `ownership` or legacy `managedBy`
+  - `lifespan` or legacy `scope`
+- `stream` is the canonical persisted emitter destination field.
+- `channel` remains accepted as an input alias for backwards compatibility.
+- When both `stream` and `channel` are present and conflict, `stream` wins.
+- Normalization and serialization drop the legacy `channel` alias after resolving
+  the canonical `stream`, preventing stale aliases from being persisted again.
+- Runtime emitter normalization and configured-emitter projection prefer
+  `stream` over `channel` so user-authored config and runtime routing agree.
+- Config migration persistence is best-effort: loading a readable config should
+  succeed even if saving the canonical migrated form fails. The store should skip
+  the migration save when the parsed on-disk JSON is already canonically equal.
+- Config loading is transactional. The store builds candidate cwd/path/config
+  state in locals and commits it only after read, parse, and migration all
+  succeed, or after the config search completes successfully with no file found.
+  If a load fails, the previous runtime config remains active and subsequent
+  saves are refused until a later load succeeds.
+- Persisted stream entries must include an explicit, non-blank string `name`.
+  `name: "main"` remains valid, but missing, blank, non-string, or otherwise
+  non-normalizable names are config validation errors and are never defaulted to
+  `main`.
+- A persisted stream entry with only metadata such as `name` and `description`
+  does not define durable SessionInjector policy. Applying such an entry keeps
+  the runtime injector on the non-protected default
+  (`modelOwned`/`temporary`, disabled) instead of synthesizing a
+  `userOwned`/`persistent` injector. Only an explicit `sessionInjector` or
+  legacy `subscription` object receives the persisted stream injector defaults
+  above.
+- Bootstrap auto-start of emitter definitions already present in config is a
+  runtime restoration path, not a durable config update. It must not require a
+  config rewrite when no persisted emitter or stream policy is being changed.
+## Consequences
+- Hand-authored stream injector config aligns with runtime persistent semantics
+  and the user-owned defaults documented for durable workflows.
+- Existing channel-only config remains valid and is migrated to `stream`.
+- Editing `stream` is deterministic even if an old `channel` field remains.
+- Read-only or temporarily unwritable config files no longer prevent the
+  extension from using an otherwise valid config; users receive a warning when
+  canonical migration persistence fails.
+- Malformed or unreadable config files cannot replace the last known-good
+  runtime config and cannot be overwritten later by an empty or partially loaded
+  state via `save()`.
+- Malformed persisted stream entries fail config normalization instead of
+  silently enabling or modifying the default `main` stream.
+- Bare persisted stream entries preserve stream metadata without turning normal
+  unforced SessionInjector updates into protected user-owned mutations.
+- Auto-start restoration can recover already-persisted emitters from a
+  read-only config file, while user-initiated durable emitter starts and stream
+  injector changes still persist and roll back on save failure.
+- Future changes to persistent config defaults, stream-name validation, emitter
+  destination aliases, transactional load/save safety, bootstrap restoration
+  writes, or migration write-failure behavior should update or supersede this
+  ADR.
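
The alias and default rules in ADR 0004 can be illustrated with two small normalizers. Both helpers are hypothetical (`normalizeEmitterDestination`, `normalizeStreamEntry`), and the sketch assumes the legacy `managedBy`/`scope` fields carry the same values as `ownership`/`lifespan`; the package's real normalization may validate names more strictly.

```javascript
// Hypothetical helpers; the real normalizers in the package may differ.

// `stream` is canonical; `channel` is only an input alias.
function normalizeEmitterDestination(emitter) {
  const { channel, stream, ...rest } = emitter;
  // `stream` wins on conflict; `channel` is used only when `stream` is absent.
  const resolved = stream ?? channel;
  // The legacy alias is never copied to the output, so it cannot be re-persisted.
  return resolved === undefined ? rest : { ...rest, stream: resolved };
}

function normalizeStreamEntry(entry) {
  // Names are never defaulted to "main": invalid names are validation errors.
  if (typeof entry.name !== "string" || entry.name.trim() === "") {
    throw new TypeError("stream entry requires a non-blank string name");
  }
  const normalized = { name: entry.name };
  if (entry.description !== undefined) normalized.description = entry.description;
  // Only an explicit policy object receives the persisted defaults.
  const policy = entry.sessionInjector ?? entry.subscription;
  if (policy && typeof policy === "object") {
    normalized.sessionInjector = {
      ...policy,
      // Assumption: legacy fields use the same value vocabulary.
      ownership: policy.ownership ?? policy.managedBy ?? "userOwned",
      lifespan: policy.lifespan ?? policy.scope ?? "persistent",
    };
  }
  return normalized;
}
```

A metadata-only entry such as `{ name: "builds", description: "CI output" }` comes back without a `sessionInjector` key, which matches the ADR's rule that bare entries do not synthesize a `userOwned`/`persistent` injector.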

package/docs/adr/0005-provider-sdk-push-and-dynamic-tools.md ADDED

@@ -0,0 +1,48 @@
+# ADR 0005: Bound-provider SDK push and dynamic tools
+## Status
+Accepted
+## Context
+No `docs/adr/0000-template.md` exists, so this ADR follows the existing ADR style.
+The provider SDK exposes `push`, `surface`, `keep`, and `updateTools`. Detour already calls `provider.push()`. Before this decision, the gateway accepted only `auth`, `hello`, `tool.result`, and `goodbye` from providers after binding, so SDK pushes and dynamic tool updates were rejected as unknown message types.
+The full provider-interface design contains broader advanced-provider features such as hooks updates, context updates, stream queries, subscriptions, all-session binding, pairing auth, revisions, and update acknowledgments. Those features are intentionally outside this follow-up.
+## Decision
+- A provider must still authenticate and send `hello` for exactly one active session before using the new messages.
+- While Bound, a provider may send `push` with:
+  - `level`: `keep`, `surface`, or `inject`
+  - `event`: non-empty text
+  - optional `stream`, defaulting in the SDK to the provider name
+  - optional `sessionId`, which must match the session selected in `hello`
+  - optional object `metadata`
+- Explicit push stream names use the same canonical EventStream identifier rules as stream tools. The gateway rejects non-normalizable stream names instead of falling back to `main`; accepted pushes use the canonical stream name consistently for storage, notifications, and return values. If a push omits `stream`, the runtime uses the canonical provider name/id when possible and otherwise falls back to `main`.
+- Push delivery is immediate and session-bound:
+  - `keep` appends to the EventStream only.
+  - `surface` appends and logs to the Copilot timeline.
+  - `inject` appends, logs, and enqueues a session injection through the existing retrying notification dispatcher.
+- Provider push delivery uses the shared stream delivery policy seam in provider-authoritative mode. The provider-selected `level` remains the complete delivery policy for that push: provider `inject` is not gated by the destination stream's SessionInjector, and provider `keep`/`surface` semantics are unchanged.
+- Provider push surfacing is best-effort: timeline logging failures must not create unhandled promise rejections or prevent the already-appended stream event from remaining stored.
+- While Bound, a provider may send `tools.update` with a complete replacement `tools` array using the same validation and 100-tool cap as `hello.tools`.
+- `tools.update` is accepted only for the bound session. A supplied `sessionId` must match the session selected in `hello`.
+- Successful `tools.update` replaces the provider's registry entry, updates the connection's active tool definitions, and schedules the same debounced session tool refresh used by provider connect/disconnect.
+- Rejected `tools.update` messages leave the previous provider tool list active.
+- Existing in-flight tool calls are not cancelled when a successful update removes their tool definition; they continue to their result, timeout, cancellation, or disconnect outcome.
+- `hello.ack` may include `sessionId` so SDK providers can observe the bound session.
+- This minimal contract does not add hooks updates, context updates, stream queries, subscriptions, all-session binding, pairing auth, per-update revisions, or success acknowledgments.
+## Consequences
+- SDK providers can use `provider.push()`, `provider.surface()`, `provider.keep()`, and `provider.updateTools()` without being rejected by the gateway.
+- Detour page messages can reach the Copilot session instead of failing as unknown provider messages.
+- The shared delivery seam records the provider delivery matrix alongside CommandEmitter delivery policy while preserving this ADR's existing `keep`/`surface`/`inject` behavior.
+- Invalid provider-selected stream names fail closed instead of silently misrouting events into `main`.
+- Dynamic tools remain simple and deterministic: each update replaces the provider's complete tool list and reuses the existing reload path.
+- Providers do not receive a success ack for `tools.update`; they should treat absence of `error` as success in the minimal profile.
+- The gateway remains single-session-bound for external providers, preserving the current security boundary.
+- Future changes to provider push semantics, dynamic tool update acknowledgments/revisions, multi-session binding, pairing auth, or advanced provider capabilities should update or supersede this ADR.
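
The `push` message contract in ADR 0005 can be sketched as a validator that also returns the provider-authoritative delivery plan. The function name, error strings, and result shape here are assumptions for illustration; only the field names (`level`, `event`, `sessionId`, `metadata`) come from the ADR, and stream-name canonicalization is left out.

```javascript
// Sketch only; the helper name, error strings, and result shape are assumptions.
const PUSH_LEVELS = ["keep", "surface", "inject"];

function validatePush(message, boundSessionId) {
  const { level, event, sessionId, metadata } = message;
  if (!PUSH_LEVELS.includes(level)) {
    return { ok: false, error: "invalid level" };
  }
  if (typeof event !== "string" || event.length === 0) {
    return { ok: false, error: "event must be non-empty text" };
  }
  // An optional sessionId must match the session selected in `hello`.
  if (sessionId !== undefined && sessionId !== boundSessionId) {
    return { ok: false, error: "sessionId does not match bound session" };
  }
  // Assumption: arrays and null are not accepted as metadata objects.
  const badMetadata =
    metadata !== undefined &&
    (metadata === null || typeof metadata !== "object" || Array.isArray(metadata));
  if (badMetadata) {
    return { ok: false, error: "metadata must be an object" };
  }
  // Provider-authoritative mode: the level alone decides delivery.
  return {
    ok: true,
    append: true,
    logToTimeline: level !== "keep",
    enqueueInjection: level === "inject",
  };
}
```

Note that the plan never consults the destination stream's SessionInjector, which mirrors the ADR's statement that provider `inject` is not gated by it.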

package/docs/adr/0006-command-emitter-cwd-workspace-boundary.md ADDED

@@ -0,0 +1,46 @@
+# ADR 0006: Command-emitter cwd stays within the session workspace
+## Status
+Accepted
+## Context
+No `docs/adr/0000-template.md` exists, so this ADR follows the existing ADR style.
+Command emitters accept an optional `cwd` field so a watcher can run from a
+subdirectory such as `services/api`. Before this decision, `cwd` resolution used
+`path.resolve(sessionCwd, requestedCwd)`, which also accepted absolute paths and
+relative traversal that escaped the Copilot session working directory.
+That behavior made a model-supplied emitter request capable of changing the
+process working directory outside the workspace boundary selected for the
+session. ADR 0001 and ADR 0004 cover persistent ownership/config defaults, and
+ADR 0002 covers provider gateway security; they do not define command-emitter
+working-directory boundaries.
+## Decision
+- Command-emitter `cwd` is interpreted as a path relative to the session cwd.
+- Omitted or blank `cwd` uses the session cwd.
+- `cwd: "."` is allowed and resolves to the session cwd.
+- Subdirectories under the session cwd are allowed.
+- Absolute `cwd` values are rejected.
+- Relative paths that resolve outside the session cwd, including `..` traversal,
+  are rejected.
+- This is API hardening for emitter configuration. It is not a shell sandbox:
+  the spawned command still runs with the user's normal OS permissions, and the
+  command itself may access files or change directories according to those
+  permissions.
+## Consequences
+- Tool callers and persisted configs must express command-emitter working
+  directories as workspace-relative paths.
+- Existing configs that used absolute paths or traversal outside the session cwd
+  will fail validation during auto-start until rewritten as in-workspace
+  relative paths.
+- The session cwd remains the durable boundary for command-emitter placement,
+  reducing accidental or model-initiated execution from unrelated directories.
+- Future changes to command-emitter cwd resolution, workspace-boundary behavior,
+  or stronger process sandboxing should update or supersede this ADR.
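
The cwd rules in ADR 0006 can be illustrated with plain segment walking. This is a simplified sketch under stated assumptions: `resolveEmitterCwd` is a hypothetical name, the result is joined with `/` regardless of platform, backslashes are treated as separators everywhere, and any `..` that climbs above the session cwd is rejected even if later segments would re-enter it. Node code would more typically build on the `path` module, and neither approach accounts for symlinks.

```javascript
// Simplified illustration of the ADR 0006 rules; not the package's helper.
function resolveEmitterCwd(sessionCwd, requestedCwd) {
  // Omitted or blank cwd uses the session cwd.
  if (requestedCwd == null || requestedCwd.trim() === "") return sessionCwd;
  // Reject rooted paths ("/x", "\\x") and Windows drive prefixes ("C:").
  if (/^(?:[\\\/]|[A-Za-z]:)/.test(requestedCwd)) {
    throw new Error("cwd must be relative to the session cwd");
  }
  const kept = [];
  for (const segment of requestedCwd.split(/[\\\/]+/)) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      // Climbing above the session cwd is an escape attempt.
      if (kept.length === 0) throw new Error("cwd escapes the session cwd");
      kept.pop();
    } else {
      kept.push(segment);
    }
  }
  return kept.length === 0 ? sessionCwd : `${sessionCwd}/${kept.join("/")}`;
}
```

With a session cwd of `/ws`, the request `services/api` yields `/ws/services/api`, `.` yields `/ws`, and both `../other` and `/etc` throw. As the ADR notes, this constrains where the command starts, not what the command can do once running.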

package/docs/adr/0007-runtime-session-workspace-context.md ADDED

@@ -0,0 +1,62 @@
+# ADR 0007: Runtime session workspace context boundary
+## Status
+Accepted
+## Context
+No `docs/adr/0000-template.md` exists, so this ADR follows the existing ADR
+style.
+Runtime services previously shared the active session workspace through loose
+getter/setter cwd closures. That made the mutable cwd boundary easy to
+thread through services but left ownership split across runtime subsystem
+bootstrap, config loading, emitter service fallback behavior, and supervisor cwd
+validation.
+ADR 0004 requires persistent config loading to be transactional: failed loads
+must not replace the last-known-good config state. ADR 0006 requires
+command-emitter `cwd` values to remain relative to the session workspace and to
+be validated by the existing workspace-boundary rules. Those decisions remain
+the source of truth for config safety and emitter cwd semantics; this ADR records
+where the runtime owns the shared session/workspace context.
+## Decision
+- Introduce a runtime-owned session/workspace context for active session
+  identity metadata, current session/base cwd, current config cwd, and emitter
+  cwd resolution.
+- Runtime subsystem construction creates or receives this context once and then
+  hands out narrow capability views:
+  - config bootstrap receives config cwd resolution/commit capabilities;
+  - emitter services and supervisor receive emitter cwd resolution capabilities.
+- Config bootstrap resolves the candidate cwd before loading config, then
+  commits the runtime context's base/config cwd only after `configStore.load()`
+  succeeds or completes the no-config-found path. This keeps runtime cwd
+  ownership aligned with ADR 0004 last-known-good load semantics.
+- Emitter cwd resolution remains delegated to the existing path validation
+  helpers, preserving ADR 0006 behavior:
+  - omitted, blank, or `.` emitter `cwd` uses the session cwd;
+  - subdirectories under the session cwd are allowed;
+  - absolute paths and traversal outside the session cwd are rejected;
+  - this remains cwd placement hardening, not a shell sandbox.
+- Emitter start fallback preserves the prior nullish-only base cwd behavior:
+  omitted/null base cwd options use the current session cwd, while explicitly
+  supplied base cwd values are normalized as supplied.
+- Public tool names, provider protocol behavior, emitter delivery semantics,
+  persistence defaults, and config canonicalization rules are unchanged.
+## Consequences
+- The mutable session/workspace boundary has one owner instead of ad hoc closure
+  plumbing across services.
+- Dependency injection remains capability-specific: consumers receive config or
+  emitter workspace capabilities rather than a broad runtime object.
+- Failed config loads cannot advance the runtime context's config/base cwd ahead
+  of the last-known-good config load state.
+- Command-emitter cwd validation remains centralized on the existing helper and
+  keeps the ADR 0006 workspace-relative contract.
+- Future changes to runtime session/workspace ownership, config cwd commit
+  timing, base cwd fallback semantics, or emitter cwd validation ownership
+  should update or supersede this ADR.
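
The ownership model in ADR 0007 can be sketched as a single context object that hands out narrow capability views, with the cwd committed only after a successful load. Everything here is hypothetical: `createWorkspaceContext`, the view method names, and a synchronous `load(cwd)` signature are assumptions made to keep the example short.

```javascript
// Hypothetical shape of a runtime-owned workspace context with narrow views.
function createWorkspaceContext(initialCwd) {
  let baseCwd = initialCwd;
  let configCwd = initialCwd;
  return {
    // Config bootstrap sees only resolve/commit capabilities.
    configView: {
      resolveCandidate: (requested) => requested ?? baseCwd,
      commit: (cwd) => {
        baseCwd = cwd;
        configCwd = cwd;
      },
    },
    // Emitter services and the supervisor see only the current session cwd.
    emitterView: {
      sessionCwd: () => baseCwd,
    },
    // Read-only snapshot for diagnostics in this sketch.
    snapshot: () => ({ baseCwd, configCwd }),
  };
}

// Assumption: a synchronous store whose load(cwd) throws on failure and
// returns a config (possibly empty) on success or when no file is found.
function bootstrapConfig(configView, configStore, requestedCwd) {
  const candidate = configView.resolveCandidate(requestedCwd);
  // If load() throws, commit is never reached and the prior cwd stays active.
  const config = configStore.load(candidate);
  configView.commit(candidate);
  return config;
}
```

Because `commit` runs only after `load` returns, a failed load cannot advance the context's cwd past the last-known-good state, which is the alignment with ADR 0004 that this ADR describes.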

package/docs/evals.md ADDED

@@ -0,0 +1,41 @@
+# Evals
+Testing infrastructure for copilot-tap-extension.
+## Quick validation
+```bash
+npm run check              # syntax check
+npm run evals:smoke        # smoke test
+npm run evals:validate-modes  # interactive vs prompt-mode gap
+```
+## How the eval runner works
+`evals/run.mjs` starts one ACP server, creates fresh SDK sessions, and mounts the shared runtime from `src/tap-runtime.mjs` directly into those sessions. This means `smoke` and `run` exercise the same EventStream/EventEmitter logic as the extension without depending on `.github/extensions` being discovered in a headless session.
+The runner writes prompt, response, error, and full event-transcript artifacts under `evals/results/...`.
+## Supported paths
+The reliable supported paths are:
+1. **Interactive foreground Copilot sessions**
+2. **ACP/SDK sessions that mount the shared runtime directly**
+Do **not** treat headless prompt-mode or other non-interactive repo-extension loading as reliable. Use `validate-modes` to prove that distinction.
+## Extension-loader evals
+The real repo-scoped extension loader is validated separately. `npm run evals:validate-modes` probes `copilot -p` with the actual `.github/extensions` entrypoint, then compares with the same prompt in an interactive session.
+For interactive executor evals:
+```bash
+node evals/run.mjs prepare-interactive --case E001
+# run the printed prompt inside an interactive `copilot` session
+# then run the printed /share command
+node evals/run.mjs judge-interactive --run-dir "<printed-run-dir>"
+```
+This keeps the executor in a foreground Copilot session where the extension can attach, uses `/share <path>` to persist the transcript, and runs a tool-free ACP judge against the transcript plus config snapshots. If you reuse one session for multiple cases, run `/clear` before each subsequent case.