
copilot-tap-extension 2.0.7 → 2.0.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (58)

package/README.md +4 -1
package/SOUL.md +51 -0
package/bin/install.mjs +7 -1
package/dist/copilot-instructions.md +15 -0
package/dist/extension.mjs +823 -29
package/dist/skills/tap-goal/SKILL.md +13 -2
package/dist/skills/tap-loop/SKILL.md +6 -0
package/dist/skills/tap-monitor/SKILL.md +19 -3
package/dist/skills/tap-orchestrate/SKILL.md +81 -0
package/dist/version.json +1 -1
package/docs/adr/0001-persistent-config-default-ownership.md +33 -0
package/docs/adr/0002-local-provider-gateway-runtime-security.md +36 -0
package/docs/adr/0003-emitter-delivery-lifecycle.md +68 -0
package/docs/adr/0004-persistent-config-canonical-streams.md +86 -0
package/docs/adr/0005-provider-sdk-push-and-dynamic-tools.md +48 -0
package/docs/adr/0006-command-emitter-cwd-workspace-boundary.md +46 -0
package/docs/adr/0007-runtime-session-workspace-context.md +62 -0
package/docs/evals.md +41 -0
package/docs/evolution-of-tap-icon.html +989 -0
package/docs/providers.md +242 -0
package/docs/recipes/adaptive-agent.md +303 -0
package/docs/recipes/agent-brainstorm/100-extension-ideas.md +288 -0
package/docs/recipes/agent-brainstorm/deep-ideas.md +216 -0
package/docs/recipes/ambient-guardian.md +314 -0
package/docs/recipes/browser-bridge.md +162 -0
package/docs/recipes/codex-goals-for-tap-goal.md +136 -0
package/docs/recipes/copilot-sdk-canvas.md +147 -0
package/docs/recipes/deferred-cognition.md +310 -0
package/docs/recipes/provider-integration-patterns.md +93 -0
package/docs/recipes/provider-interface-advanced.md +1364 -0
package/docs/recipes/provider-interface-core-profile.md +568 -0
package/docs/recipes/tap-control-plane-roadmap.md +60 -0
package/docs/recipes/universal-tool-gateway.md +202 -0
package/docs/reference.md +229 -0
package/docs/use-cases.md +348 -0
package/package.json +4 -1
package/providers/detour/README.md +84 -0
package/providers/detour/bridge.js +219 -0
package/providers/detour/index.mjs +322 -0
package/providers/detour/package-lock.json +577 -0
package/providers/detour/package.json +19 -0
package/providers/detour/scripts/build.mjs +31 -0
package/providers/detour/src/bridge.js +256 -0
package/providers/detour/src/contracts.js +40 -0
package/providers/detour/src/inspector.js +260 -0
package/providers/detour/src/inspector.test.mjs +53 -0
package/providers/detour/src/panel.js +465 -0
package/providers/detour/src/provider-core.js +233 -0
package/providers/detour/src/provider-core.test.mjs +185 -0
package/providers/detour/src/react-context-core.js +143 -0
package/providers/detour/src/react-context.js +44 -0
package/providers/detour/src/react-context.test.mjs +41 -0
package/providers/templates/README.md +23 -0
package/providers/templates/ci-review-provider.mjs +46 -0
package/providers/templates/detour-workflow-provider.mjs +41 -0
package/providers/templates/jira-github-provider.mjs +42 -0
package/providers/templates/provider-utils.mjs +45 -0
package/providers/templates/sast-triage-provider.mjs +51 -0

package/docs/recipes/copilot-sdk-canvas.md ADDED

@@ -0,0 +1,147 @@
+# Copilot SDK canvas surfaces
+These notes reflect the canvas API surface found in the local Copilot CLI 1.0.61 SDK. The canvas APIs are marked experimental in the SDK types, so treat exact names and runtime behavior as subject to change.
+## What a canvas is
+A canvas is an extension-owned UI surface declared through the Copilot SDK:
+```js
+import { joinSession, createCanvas } from "@github/copilot-sdk/extension";
+await joinSession({
+  canvases: [
+    createCanvas({
+      id: "tap-dashboard",
+      displayName: "Tap Dashboard",
+      description: "Inspect live tap streams and emitter state.",
+      open: async (ctx) => ({
+        title: "Tap Dashboard",
+        status: `Instance ${ctx.instanceId}`,
+        url: "http://127.0.0.1:5173/",
+      }),
+    }),
+  ],
+});
+```
+The extension process is the live provider for that canvas. The runtime routes `canvas.open`, `canvas.close`, and `canvas.action.invoke` requests back into the SDK process, and the SDK dispatches them to the handlers bound by `createCanvas`.
+Use canvases when a workflow needs a persistent visual panel, inspector, or control surface instead of plain text tool output. Examples that fit tap well: stream dashboards, emitter graphs, PR-review status boards, browser-debug panels, and interactive incident timelines.
+## Declaration shape
+`createCanvas(options)` returns a `Canvas` that can be passed in `joinSession({ canvases })`.
+| Field | Required | Notes |
+| --- | --- | --- |
+| `id` | yes | Provider-local canvas id, unique within the declaring extension connection. |
+| `displayName` | yes | Human-readable name shown in discovery/UI chrome. |
+| `description` | yes | Short single-sentence description shown to the agent. |
+| `inputSchema` | no | JSON Schema for the `open` input payload. |
+| `actions` | no | Agent/host-invocable actions for an open instance. |
+| `open(ctx)` | yes | Called when the host or agent opens/focuses an instance. |
+| `onClose(ctx)` | no | Called when an instance is closed; use it to release resources. |
+Action names are unique within the canvas and must not start with `canvas.` because that prefix is reserved for lifecycle verbs.
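The two naming rules above can be checked before a canvas is declared. The following is a minimal sketch, assuming actions are plain objects with a `name` field as in the declaration table; `findActionNameProblems` is a hypothetical helper and not part of the Copilot SDK.

```javascript
// Hypothetical pre-flight check; the SDK may enforce these rules differently.
function findActionNameProblems(actions = []) {
  const problems = [];
  const seen = new Set();
  for (const { name } of actions) {
    if (typeof name !== "string" || name.length === 0) {
      problems.push("action is missing a name");
      continue;
    }
    // "canvas." is reserved for lifecycle verbs such as canvas.open.
    if (name.startsWith("canvas.")) {
      problems.push(`"${name}" uses the reserved "canvas." prefix`);
    }
    // Names must be unique within a single canvas.
    if (seen.has(name)) {
      problems.push(`"${name}" is declared more than once`);
    }
    seen.add(name);
  }
  return problems;
}
```

Running this on the `actions` array before calling `createCanvas` gives an early, readable error list instead of relying on a runtime rejection.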
+## Open lifecycle
+The host or agent opens a canvas with a `canvasId`, a stable caller-supplied `instanceId`, and an optional `extensionId` that disambiguates when multiple providers declare the same `canvasId`. The SDK then calls:
+```js
+open: async ({ sessionId, extensionId, canvasId, instanceId, input, host }) => {
+  return {
+    title: "Rendered title",
+    status: "ready",
+    url: "http://127.0.0.1:49152/",
+  };
+}
+```
+The response may include:
+| Field | Meaning |
+| --- | --- |
+| `url` | Web renderer URL for the host to render. Optional for native canvases. |
+| `title` | Title shown in host chrome. |
+| `status` | Provider-supplied status text shown in host chrome. |
+Re-opening the same `instanceId` is idempotent and focuses/reuses the existing panel. Open snapshots and `session.canvas.opened` events include `reopen: true` when the notification represents such a reopen.
+The CLI canvas scaffold uses a practical default for web canvases: start a loopback HTTP server on port `0`, let the OS choose a free ephemeral port, keep one server per `instanceId`, and close that server from `onClose` so ports do not leak.
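The port-`0` pattern described above can be sketched with Node's built-in `node:http` module. This is an illustration under assumptions, not the scaffold's actual code: `openRenderer`, `closeRenderer`, and the `servers` map are hypothetical names, and the response body is a placeholder. The dynamic `import()` is used only so the snippet runs as either CommonJS or ESM.

```javascript
// Hypothetical registry: instanceId -> Promise<http.Server>.
const servers = new Map();

async function startServer(instanceId) {
  const { createServer } = await import("node:http");
  const server = createServer((req, res) => {
    res.writeHead(200, { "content-type": "text/plain; charset=utf-8" });
    res.end(`canvas instance ${instanceId}\n`);
  });
  // Port 0 lets the OS pick a free ephemeral port; bind to loopback only.
  await new Promise((resolve, reject) => {
    server.once("error", reject);
    server.listen(0, "127.0.0.1", resolve);
  });
  return server;
}

// Idempotent: re-opening the same instanceId reuses the existing server.
async function openRenderer(instanceId) {
  if (!servers.has(instanceId)) {
    const pending = startServer(instanceId);
    servers.set(instanceId, pending);
    pending.catch(() => servers.delete(instanceId)); // do not cache a failed start
  }
  const server = await servers.get(instanceId);
  return `http://127.0.0.1:${server.address().port}/`;
}

// Release the port when the instance closes.
async function closeRenderer(instanceId) {
  const pending = servers.get(instanceId);
  if (!pending) return;
  servers.delete(instanceId);
  const server = await pending;
  await new Promise((resolve) => server.close(resolve));
}
```

In a canvas declaration, `open` would return `{ url: await openRenderer(ctx.instanceId) }` and `onClose` would call `closeRenderer(ctx.instanceId)`, assuming the close context carries the same `instanceId`.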
+## Actions
+Canvas actions let the agent or host interact with an already-open instance:
+```js
+createCanvas({
+  id: "tap-dashboard",
+  displayName: "Tap Dashboard",
+  description: "Inspect live tap streams and emitter state.",
+  actions: [
+    {
+      name: "refresh_stream",
+      description: "Refresh the stream snapshot shown in the canvas.",
+      inputSchema: {
+        type: "object",
+        properties: {
+          limit: { type: "integer", minimum: 1, maximum: 100 },
+        },
+      },
+      handler: async ({ instanceId, input }) => {
+        return { ok: true, instanceId, limit: input?.limit ?? 25 };
+      },
+    },
+  ],
+  open: async () => ({ title: "Tap Dashboard", url: "http://127.0.0.1:5173/" }),
+});
+```
+At the model/tool layer, actions are discovered through `list_canvas_capabilities` and invoked through `invoke_canvas_action`. At the SDK RPC layer, renderer-capable clients can call `session.rpc.canvas.invokeAction(...)`.
+## Renderer and RPC APIs
+The SDK exposes an experimental `session.rpc.canvas` API for SDK clients that can render or coordinate canvases:
+| Method | Purpose |
+| --- | --- |
+| `list()` | List canvas declarations available in the session. |
+| `listOpen()` | List currently open canvas instances. |
+| `open({ canvasId, instanceId, input, extensionId? })` | Open or focus an instance. |
+| `close({ instanceId })` | Close an open instance. |
+| `invokeAction({ instanceId, actionName, input })` | Invoke an action on an open instance. |
+Set `requestCanvasRenderer: true` only for SDK session clients that can display canvases. That opt-in surfaces the model tools `list_canvas_capabilities`, `open_canvas`, and `invoke_canvas_action`. Extension canvases generated by `extensions_manage({ kind: "canvas" })` only declare `canvases`; they do not set `requestCanvasRenderer`.
+For resumed SDK sessions, `openCanvases` can be supplied with the prior open-instance snapshot so the runtime can rehydrate canvas state without re-opening everything manually.
+## Events and capability signals
+Canvas-related session events are transient (`ephemeral: true`):
+| Event | Key payload |
+| --- | --- |
+| `capabilities.changed` | `data.ui.canvases` indicates whether canvas rendering is supported. |
+| `session.canvas.registry_changed` | Current canvas declarations: ids, display names, descriptions, input schemas, and actions. |
+| `session.canvas.opened` | Open instance snapshot: `instanceId`, `canvasId`, `extensionId`, `url`, `title`, `status`, `input`, `reopen`, and `availability`. |
+Open instances report `availability: "ready"` while the provider connection is live and `"stale"` if the provider has gone away. In stale/unavailable cases, action routing can fail until the agent re-opens the canvas or the provider reconnects.
+## Tap integration notes
+Tap's external provider protocol is intentionally not the Copilot SDK. External tap providers can register tools, update tools, and push/keep/surface/inject events, but they cannot declare Copilot SDK canvases over the WebSocket provider protocol today.
+To add a canvas-backed tap experience, implement the canvas in the tap extension layer with `createCanvas`, or add an explicit gateway/protocol extension that maps provider UI declarations into SDK canvases. Keep provider-side browser or local services behind loopback URLs and pass only the URL/title/status through the canvas `open` response.
+Tap now includes a built-in example: the `tap-diagnostics` canvas. It is declared by the extension and can be opened with `tap_open_diagnostics_canvas`. The canvas serves a loopback-only renderer and streams bounded diagnostics snapshots over SSE.
+## Best practices
+1. Keep renderer servers on loopback (`127.0.0.1`) and use ephemeral ports unless the port is user-configured.
+2. Key per-instance resources by `instanceId` and release them in `onClose`.
+3. Validate `open` and action inputs with JSON Schema; avoid broad casts or unstructured payloads.
+4. Keep action names stable, descriptive, and free of the reserved `canvas.` prefix.
+5. Treat all canvas APIs as experimental and guard optional host capabilities such as `host?.capabilities?.canvases`.
+6. Use `session.log()` for extension diagnostics; do not write diagnostics to stdout because stdout is reserved for JSON-RPC.

package/docs/recipes/deferred-cognition.md ADDED

@@ -0,0 +1,310 @@
+# Recipe: Deferred Cognition — The AI Schedules Work for Its Future Self
+## The insight
+You're investigating a problem. You hit a wall — the data you need doesn't exist yet (logs rotate at midnight, a deploy hasn't finished, a test suite is running). Today you'd set a calendar reminder, context-switch, and come back later having forgotten half of what you figured out.
+Deferred cognition means the AI captures its current investigation state — hypothesis, evidence gathered, what was ruled out, what to check next — and creates a persistent PromptEmitter that fires on the next session start. When you open Copilot tomorrow, it picks up mid-thought.
+This isn't session resumption (which replays the conversation). It's the AI **deliberately planning future work**, with a specific prompt about what to do when it wakes up.
+## Why skills can't do this
+1. A skill can't write itself. Deferred cognition generates a prompt at runtime based on the current investigation state.
+2. A skill can't schedule itself. The persistent PromptEmitter fires automatically on next session start.
+3. A skill doesn't carry state. The deferred prompt includes specific hypothesis, file paths, line numbers, evidence collected — all from the current session.
+4. A skill can't span sessions. This bridges the gap between "what I know now" and "what I need to do later."
+## Architecture
+```
+Session N (today)
+    │
+    ▼
+You say: "check the rotated logs tomorrow and continue"
+    │
+    ▼
+tap captures current investigation state:
+  • Files examined (from onPostToolUse history)
+  • Hypothesis (from assistant.message history)
+  • What was ruled out (from conversation)
+  • What to check next (from your instruction)
+    │
+    ▼
+Creates persistent PromptEmitter:
+  name: "deferred-investigation-auth-timeout"
+  schedule: oneTime (fires once on next session start)
+  autoStart: true
+  prompt: <generated investigation prompt>
+    │
+    ▼
+Saved to tap.config.json
+═══════════════════════════════════════════════
+Session N+1 (tomorrow)
+    │
+    ▼
+onSessionStart → persistent emitters auto-start
+    │
+    ▼
+PromptEmitter fires immediately:
+  "Continue investigating the auth-timeout issue from yesterday.
+   State from previous session:
+   - Hypothesis: Connection pool exhaustion in auth-service
+   - Evidence: Error rate correlates with batch job at 23:45
+   - Checked: src/auth/pool.ts (pool size is 10, seems low)
+   - Ruled out: DNS resolution (dig shows normal latency)
+   - Next step: Check the rotated logs at /var/log/auth/
+     for connection refused errors between 23:45-00:15
+   - Also check: Has the batch job's connection count changed
+     since PR #289 merged last Tuesday?
+   Read the relevant files, check the logs, and report
+   what you find."
+    │
+    ▼
+Copilot picks up the investigation mid-thought.
+The PromptEmitter is oneTime → auto-removes after firing.
+```
+## Components
+### 1. State capture tool
+A new tap tool that captures the current investigation state:
+```js
+{
+  name: "tap_defer",
+  description:
+    "Defer work to a future session. Captures the current investigation " +
+    "state and creates a persistent prompt that fires on next session start. " +
+    "Use when the user wants to continue later, wait for something, or " +
+    "schedule future work.",
+  parameters: {
+    type: "object",
+    properties: {
+      what: {
+        type: "string",
+        description:
+          "What to do in the future session — the specific task or check"
+      },
+      when: {
+        type: "string",
+        description:
+          "When to fire: 'next-session' (default), or a delay like '6h', '1d'"
+      }
+    },
+    required: ["what"]
+  },
+  handler: async ({ what, when }, invocation) => {
+    // Gather context from the current session
+    const sessionHistory = await session.getMessages();
+    const recentFiles = extractRecentFiles(sessionHistory);
+    const hypothesis = extractHypothesis(sessionHistory);
+    const ruledOut = extractRuledOut(sessionHistory);
+    // Generate the deferred prompt
+    const deferredPrompt = await generateDeferredPrompt({
+      task: what,
+      files: recentFiles,
+      hypothesis,
+      ruledOut,
+      sessionTranscriptSummary: summarizeRecent(sessionHistory, 20)
+    });
+    // Create a persistent one-time emitter. `when` is only echoed in the reply here; timed deferral is phase 5.
+    const emitterName = `deferred-${normalizeName(what).slice(0, 30)}`;
+    await supervisor.start({
+      name: emitterName,
+      prompt: deferredPrompt,
+      scope: "persistent",
+      runSchedule: "oneTime",
+      autoStart: true,
+      managedBy: "modelOwned"
+    });
+    return `Deferred to ${when || "next session"}: "${what}"\n` +
+      `Emitter '${emitterName}' will fire on next session start ` +
+      `with full investigation context.`;
+  }
+}
+```
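The handler above calls `normalizeName` without defining it. A minimal sketch follows, assuming emitter names should be lowercase, hyphen-separated slugs; that format, the `"task"` fallback, and the `deferredEmitterName` wrapper are assumptions made for illustration.

```javascript
// Hypothetical helper: the recipe references normalizeName() but does not define it.
function normalizeName(text) {
  return String(text)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse every other run of characters into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

// Builds the emitter name the handler uses, trimming any hyphen left by the cut.
function deferredEmitterName(what, maxLength = 30) {
  const slug = normalizeName(what).slice(0, maxLength).replace(/-+$/, "");
  return `deferred-${slug || "task"}`;
}
```

Trimming after `slice` matters because cutting at a fixed length can leave a dangling hyphen at the end of the name.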
+### 2. Prompt generation
+The deferred prompt distills the current session into a self-contained brief. This version fills a fixed template; the phased plan later moves generation into a PromptEmitter:
+```js
+async function generateDeferredPrompt({ task, files = [], hypothesis, ruledOut = [], sessionTranscriptSummary }) {
+  // Build only the state bullets that have content, so missing fields
+  // do not leave blank lines in the prompt.
+  const state = [
+    hypothesis ? `- Hypothesis: ${hypothesis}` : "",
+    files.length ? `- Files examined: ${files.join(", ")}` : "",
+    ruledOut.length ? `- Ruled out: ${ruledOut.join("; ")}` : "",
+  ].filter(Boolean).join("\n");
+  return `Continue this investigation from a previous session.
+## Task
+${task}
+## State from previous session
+${state}
+## Previous session summary
+${sessionTranscriptSummary}
+## Instructions
+Pick up where this left off. Read the relevant files, perform the
+check described in the task, and report findings. If the hypothesis
+is confirmed, suggest a fix. If not, update the hypothesis.`;
+}
+```
+### 3. Auto-cleanup
+The emitter is `oneTime` with `autoStart: true`, so it cleans up after itself:
+- The PromptEmitter runs once on session start.
+- Copilot processes the prompt and responds.
+- The emitter marks itself completed and removes itself from the persistent config.
+- No manual cleanup is needed.
+### 4. Deferred chain (optional)
+If the future session also hits a wall, the AI can defer again:
+```
+Session N: "check logs tomorrow" → deferred
+Session N+1: checks logs, finds a clue, needs metric data that's being aggregated
+  → "check the hourly aggregation after it runs at 6am" → deferred again
+Session N+2: checks aggregation, finds root cause, suggests fix
+```
+Each deferral carries forward the accumulated context from all previous sessions. The investigation builds across days without losing state.
+## Use cases
+### "Wait for the deploy to finish"
+```
+> I need to check if the new config works in production.
+> The deploy won't finish for another 20 minutes and
+> I need to switch to something else.
+Copilot calls tap_defer({
+  what: "Check if auth-service in production is using the new
+         Redis connection pool settings from PR #312. Verify
+         by checking the pod env vars and the connection count
+         in the Redis dashboard.",
+  when: "next-session"
+})
+> Deferred to next session. I'll check the deploy results
+> when you come back.
+```
+### "Monitor overnight"
+```
+> The memory leak reproduces after ~8 hours of uptime.
+> The service was restarted at 4pm. Defer checking the
+> heap stats to tomorrow morning.
+Copilot calls tap_defer({
+  what: "Check heap memory stats for auth-service. It was
+         restarted at 4pm yesterday. If uptime is now ~16h,
+         check if RSS has grown linearly. Compare to the
+         baseline of 512MB at startup. Check /debug/pprof/heap
+         if available.",
+  when: "next-session"
+})
+```
+### "Continue this research"
+```
+> I'm evaluating whether to migrate from Express to Fastify.
+> I've compared routing and middleware so far. Defer the
+> rest — I need to check plugin ecosystem and benchmarks.
+Copilot calls tap_defer({
+  what: "Continue the Express→Fastify migration evaluation.
+         Already compared: routing (Fastify wins on speed),
+         middleware (Express has more ecosystem).
+         Still need to evaluate: plugin compatibility for
+         our 12 Express middlewares, benchmark with our
+         actual API routes, and migration effort estimate.",
+  when: "next-session"
+})
+```
+## Protocol
+### Deferred emitter in tap.config.json
+```json
+{
+  "emitters": [
+    {
+      "name": "deferred-check-auth-deploy",
+      "prompt": "Continue investigating the auth-timeout issue...",
+      "runSchedule": "oneTime",
+      "autoStart": true,
+      "ownership": "modelOwned",
+      "lifespan": "persistent",
+      "metadata": {
+        "deferredAt": "2026-04-26T14:30:00Z",
+        "deferredFrom": "session-abc123",
+        "task": "Check if auth-service is using new Redis pool settings"
+      }
+    }
+  ]
+}
+```
+### Lifecycle
+```
+tap_defer() called
+    │
+    ▼
+Generate prompt from session state
+    │
+    ▼
+Write to tap.config.json as persistent oneTime emitter
+    │
+    ▼
+Session ends normally
+    │
+    ═══ time passes ═══
+    │
+    ▼
+New session starts → onSessionStart loads persistent config
+    │
+    ▼
+Deferred emitter fires (session.send with the prompt)
+    │
+    ▼
+Copilot processes the deferred work
+    │
+    ▼
+Emitter completes → removed from persistent config
+```
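The session-start half of this lifecycle can be sketched as a pure function over the config object. This is an illustration under assumptions: `fireDeferredEmitters` is a hypothetical name, `send` stands in for `session.send`, and the field names are taken from the `tap.config.json` example above.

```javascript
// Hypothetical session-start step: fire each deferred emitter once and drop it.
function fireDeferredEmitters(config, send) {
  const remaining = [];
  for (const emitter of config.emitters ?? []) {
    const isDeferred = emitter.runSchedule === "oneTime" && emitter.autoStart === true;
    if (isDeferred) {
      send(emitter.prompt); // fires once and is not kept
    } else {
      remaining.push(emitter); // every other emitter stays in the config
    }
  }
  // Return a new config; the caller decides when to write it back to disk.
  return { ...config, emitters: remaining };
}
```

Returning a new object instead of mutating `config` keeps the removal step easy to test and lets the caller persist it only after the prompt was actually delivered.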
+## Phased delivery
+| Phase | Scope |
+|---|---|
+| **1. tap_defer tool** | Capture task + basic context, create persistent oneTime emitter |
+| **2. State extraction** | Use session.getMessages() to extract hypothesis, files, ruled-out from conversation |
+| **3. Prompt generation** | PromptEmitter that distills session into a rich deferred prompt |
+| **4. Deferred chains** | Carry forward accumulated context across multiple deferrals |
+| **5. Scheduled deferral** | Support `when: "6h"` with delayed autoStart (timed persistent emitter) |
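Phase 5 needs a way to turn a `when` string into a delay. One possible sketch is below, assuming the only accepted forms are `next-session` and a whole number followed by a unit; the `m` (minutes) unit and the `parseWhen` name are assumptions, since the recipe only mentions `6h` and `1d`.

```javascript
// Hypothetical unit table; only "h" and "d" appear in the recipe itself.
const UNIT_MS = { m: 60_000, h: 3_600_000, d: 86_400_000 };

// Returns null for "next-session", otherwise the delay in milliseconds.
function parseWhen(when = "next-session") {
  if (when === "next-session") return null;
  const match = /^(\d+)([mhd])$/.exec(when);
  if (!match) throw new Error(`Unsupported "when" value: ${when}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}
```

Rejecting unknown strings, rather than silently falling back to `next-session`, avoids firing a deferral at a time the user did not ask for.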
+## Open questions
+- **Context budget** — how much session history to include in the deferred prompt? Too little loses context, too much wastes tokens.
+- **Relevance decay** — a deferral from 2 weeks ago may no longer be relevant. Auto-expire?
+- **Multiple deferrals** — what if you defer 5 things? Queue them? Fire all at session start?
+- **User confirmation** — should deferred prompts fire silently or announce themselves? "You deferred 2 tasks from yesterday. Running them now."
+- **Cross-repo** — deferred work is stored in tap.config.json which is repo-scoped. What about repo-independent deferrals?

package/docs/recipes/provider-integration-patterns.md ADDED

@@ -0,0 +1,93 @@
+# Provider integration patterns
+These recipes turn external workflow systems into tap-operable signals and tools.
+They are intentionally provider-shaped: each integration exposes a small local
+service or WebSocket provider, then tap uses EventStreams, EventFilters, goals,
+and diagnostics to keep the agent connected to the workflow.
+## Shared shape
+1. Provider authenticates to the external system.
+2. Provider registers focused tools through the tap provider gateway.
+3. Provider exposes or emits normalized events with stable fields.
+4. tap filters events into `keep`, `surface`, or `inject`.
+5. High-value events can start or steer a `/tap-goal` or `/tap-orchestrate`
+   workflow.
+Prefer structured JSON lines for provider output:
+```json
+{"type":"ci.failure","repo":"owner/name","runUrl":"...","branch":"main","severity":"high"}
+```
+This makes EventFilter rules stable and auditable.
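A provider can produce such lines with a small serializer. The sketch below assumes the five fields from the sample line; `toEventLine` is a hypothetical helper, not one of the shipped template utilities.

```javascript
// Hypothetical helper: serialize one provider event as a single JSON line.
function toEventLine({ type, repo, runUrl, branch, severity }) {
  if (!type) throw new Error("event.type is required");
  // Rebuilding the object fixes the key order regardless of how the
  // caller ordered its fields; undefined fields are omitted by JSON.stringify.
  return JSON.stringify({ type, repo, runUrl, branch, severity });
}
```

A provider would write `toEventLine(event) + "\n"` to its output so each event occupies exactly one line.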
+Dependency-free template providers live in `providers/templates/`:
+- `ci-review-provider.mjs`
+- `jira-github-provider.mjs`
+- `sast-triage-provider.mjs`
+- `detour-workflow-provider.mjs`
+## Jira + GitHub
+Inspired by the Codex Jira/GitHub automation pattern:
+- Jira label or automation rule triggers a provider event.
+- Provider tools:
+  - `jira_get_issue`
+  - `jira_transition_issue`
+  - `jira_post_comment`
+  - `github_create_pr`
+- A goal completes only when the EventStream ledger includes:
+  - Jira issue key
+  - branch name
+  - commit SHA
+  - PR URL
+  - Jira status transition
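The completion rule above amounts to a checklist over the ledger. A minimal sketch follows, assuming the ledger is an array of event objects and that the five items map to the field names shown; both the shape and the names are hypothetical.

```javascript
// Assumed field names for the five required pieces of evidence.
const REQUIRED_EVIDENCE = ["issueKey", "branch", "commitSha", "prUrl", "statusTransition"];

// Returns the evidence fields that no ledger entry has supplied yet.
function missingEvidence(ledgerEntries) {
  const seen = new Set();
  for (const entry of ledgerEntries) {
    for (const [field, value] of Object.entries(entry)) {
      if (value) seen.add(field); // ignore empty or null values
    }
  }
  return REQUIRED_EVIDENCE.filter((field) => !seen.has(field));
}
```

The goal would be treated as complete only when `missingEvidence(ledger)` returns an empty array.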
+## CI auto-fix
+Inspired by the Codex GitHub Actions auto-fix pattern:
+- CommandEmitter or provider watches failed workflow runs.
+- Failure events surface or inject with run URL and failing job.
+- A repair goal uses the failing output as the verification surface.
+- Completion requires a successful verification command and a traceable branch
+  or PR.
+## Code review
+Inspired by the Codex SDK code-review pattern:
+- Provider or skill runs a structured review command.
+- Findings use stable fields:
+  - title
+  - body
+  - confidence score
+  - priority
+  - file path
+  - line range
+- P0/P1 findings should inject; P2/P3 findings should surface or keep.
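The priority rule can be written as a small routing function. This sketch assumes priorities arrive as the strings `P0` to `P3`; sending P2 to `surface` and P3 to `keep` is one possible split of the "surface or keep" guidance, not something the recipe fixes.

```javascript
// Hypothetical mapping from review priority to a tap delivery action.
function deliveryForPriority(priority) {
  switch (priority) {
    case "P0":
    case "P1":
      return "inject"; // high-priority findings reach the agent directly
    case "P2":
      return "surface"; // assumption: P2 is shown but not injected
    case "P3":
      return "keep"; // assumption: P3 is only recorded
    default:
      throw new Error(`Unknown priority: ${priority}`);
  }
}
```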
+## SAST triage
+Inspired by the GitLab security-quality pattern:
+- Ingest SAST JSON as structured provider events.
+- Deduplicate by `(CWE, sink/function, file:line)`.
+- Rank by exploitability and business risk.
+- Process one finding per goal iteration.
+- Completion requires either a validated patch or an explicit blocked reason.
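The deduplicate-then-rank steps can be sketched as follows. The field names (`cwe`, `sink`, `file`, `line`, `exploitability`, `businessRisk`) and the product used as a score are assumptions; real SAST reports will need an adapter into this shape.

```javascript
// Hypothetical triage step: dedupe by (CWE, sink, file:line), then rank.
function dedupeAndRank(findings) {
  const byKey = new Map();
  for (const finding of findings) {
    const key = `${finding.cwe}|${finding.sink}|${finding.file}:${finding.line}`;
    if (!byKey.has(key)) byKey.set(key, finding); // keep the first occurrence
  }
  // Assumed score: higher exploitability and business risk rank first.
  const score = (f) => f.exploitability * f.businessRisk;
  return [...byKey.values()].sort((a, b) => score(b) - score(a));
}
```

Each goal iteration would then take the first element of the ranked list, which matches the one-finding-per-iteration rule above.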
+## Browser / Detour workflows
+Use Detour for browser-page instrumentation and tap for agent-side orchestration:
+- Detour injects browser bridge code.
+- Provider exposes a local API for page events.
+- CommandEmitter polls the provider and normalizes events.
+- tap goals or monitors react to stable event types.
+Do not mutate Detour source for tap-specific workflows; use injectable scripts
+and provider-side adapters.