npm - copilot-tap-extension - Versions diffs - 2.0.8 → 2.0.9 - Mend

copilot-tap-extension 2.0.8 → 2.0.9

Files changed (54)

package/README.md +2 -1
package/SOUL.md +51 -0
package/bin/install.mjs +2 -1
package/dist/copilot-instructions.md +5 -0
package/dist/extension.mjs +361 -20
package/dist/version.json +1 -1
package/docs/adr/0001-persistent-config-default-ownership.md +33 -0
package/docs/adr/0002-local-provider-gateway-runtime-security.md +36 -0
package/docs/adr/0003-emitter-delivery-lifecycle.md +68 -0
package/docs/adr/0004-persistent-config-canonical-streams.md +86 -0
package/docs/adr/0005-provider-sdk-push-and-dynamic-tools.md +48 -0
package/docs/adr/0006-command-emitter-cwd-workspace-boundary.md +46 -0
package/docs/adr/0007-runtime-session-workspace-context.md +62 -0
package/docs/evals.md +41 -0
package/docs/evolution-of-tap-icon.html +989 -0
package/docs/providers.md +242 -0
package/docs/recipes/adaptive-agent.md +303 -0
package/docs/recipes/agent-brainstorm/100-extension-ideas.md +288 -0
package/docs/recipes/agent-brainstorm/deep-ideas.md +216 -0
package/docs/recipes/ambient-guardian.md +314 -0
package/docs/recipes/browser-bridge.md +162 -0
package/docs/recipes/codex-goals-for-tap-goal.md +136 -0
package/docs/recipes/copilot-sdk-canvas.md +147 -0
package/docs/recipes/deferred-cognition.md +310 -0
package/docs/recipes/provider-integration-patterns.md +93 -0
package/docs/recipes/provider-interface-advanced.md +1364 -0
package/docs/recipes/provider-interface-core-profile.md +568 -0
package/docs/recipes/tap-control-plane-roadmap.md +60 -0
package/docs/recipes/universal-tool-gateway.md +202 -0
package/docs/reference.md +229 -0
package/docs/use-cases.md +348 -0
package/package.json +4 -1
package/providers/detour/README.md +84 -0
package/providers/detour/bridge.js +219 -0
package/providers/detour/index.mjs +322 -0
package/providers/detour/package-lock.json +577 -0
package/providers/detour/package.json +19 -0
package/providers/detour/scripts/build.mjs +31 -0
package/providers/detour/src/bridge.js +256 -0
package/providers/detour/src/contracts.js +40 -0
package/providers/detour/src/inspector.js +260 -0
package/providers/detour/src/inspector.test.mjs +53 -0
package/providers/detour/src/panel.js +465 -0
package/providers/detour/src/provider-core.js +233 -0
package/providers/detour/src/provider-core.test.mjs +185 -0
package/providers/detour/src/react-context-core.js +143 -0
package/providers/detour/src/react-context.js +44 -0
package/providers/detour/src/react-context.test.mjs +41 -0
package/providers/templates/README.md +23 -0
package/providers/templates/ci-review-provider.mjs +46 -0
package/providers/templates/detour-workflow-provider.mjs +41 -0
package/providers/templates/jira-github-provider.mjs +42 -0
package/providers/templates/provider-utils.mjs +45 -0
package/providers/templates/sast-triage-provider.mjs +51 -0

package/docs/recipes/agent-brainstorm/100-extension-ideas.md ADDED

@@ -0,0 +1,288 @@
+# 100 Extension Ideas — Consolidated Strategy Report
+Research + 10 parallel domain agents × 10 ideas each = 100 scenarios, distilled into actionable insights.
+## Platform Capabilities (Foundation)
+| Capability | Description |
+|---|---|
+| CommandEmitter | Spawns a shell process, captures stdout line-by-line, routes lines through EventFilter |
+| PromptEmitter | Re-runs an AI prompt on a timer, on session idle, or one-time |
+| EventFilter | Regex pipeline: drop / keep / surface / inject — first match wins |
+| SessionInjector | Proactively pushes events into the active conversation without you asking |
+| Tool Registration | Expose any function as a Copilot tool call (AI can invoke it) |
+| Slash Commands | Register /your-command handlers with full parameter parsing |
+| Persistent Config | Read/write tap.config.json to save state between sessions |
+| Child Processes | Spawn subprocesses (API calls, CLI tools, linters, scanners) as tool results |
+| Hot-swap Filters | Change EventFilter rules while an emitter is running — no restart |
+| Lifespan Control | Emitters can be temporary (session) or persistent (auto-restarts) |
+### Core pattern
+```
+Background process → EventFilter → EventStream → SessionInjector → Your conversation
+```
+Everything builds on this: any background signal → filtered → injected as AI context. All 100 ideas follow this pattern.
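
The "first match wins" behavior listed for EventFilter can be sketched as follows. This is only an illustration: the rule shape (`{ match, outcome }`), the `classify` helper, and the `keep` fallback for unmatched lines are assumptions, not the tap API.

```javascript
// Hypothetical rule shape for illustration; not taken from the tap API.
const rules = [
  { match: /^DEBUG/, outcome: "drop" },
  { match: /\b(FAIL|ERROR)\b/, outcome: "inject" },
  { match: /\bWARN\b/, outcome: "surface" },
];

// Return the outcome of the first rule whose regex matches the line.
// The "keep" fallback for unmatched lines is an assumption.
function classify(line, ruleList, fallback = "keep") {
  for (const rule of ruleList) {
    if (rule.match.test(line)) return rule.outcome;
  }
  return fallback;
}

const lines = [
  "DEBUG retrying after ERROR", // matches rule 1 first, so rule 2 never runs
  "ERROR build failed",
  "WARN slow test",
  "compiled 42 files",
];
const outcomes = lines.map((line) => classify(line, rules));
```

Rule order matters here: the first sample line contains `ERROR`, but the earlier `DEBUG` rule decides its outcome.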
+---
+## The 100 Ideas by Domain
+### Domain 1 — DevOps & CI/CD
+| # | Name | Pitch |
+|---|---|---|
+| 1 | PipelinePulse | Streams CI logs and injects AI-summarized failure root causes the moment a build breaks |
+| 2 | DriftSentinel | Diffs live infra state against Terraform/Pulumi and alerts when configuration drifts |
+| 3 | CanaryWhisperer | Watches canary metrics and auto-recommends promote/rollback with statistical confidence |
+| 4 | SecretSweeper | Intercepts git push/docker build and scans staged diffs for leaked secrets pre-flight |
+| 5 | BlastRadius | Before PR merge, maps every downstream service/pipeline affected and scores risk 0-10 |
+| 6 | RollbackOracle | On deployment degradation, identifies the culprit commit and offers one-click rollback |
+| 7 | CostGatekeeper | Injects estimated cloud cost deltas into every terraform plan, blocks over-budget applies |
+| 8 | ChaosCompanion | Suggests and orchestrates chaos experiments based on live topology; auto-terminates on SLO breach |
+| 9 | GitOpsDiff | Compares GitOps desired-state vs. actual cluster state and explains sync failures in plain English |
+| 10 | DeployDiary | Auto-generates structured deployment journal entries per release, committed back to repo |
+### Domain 2 — Security & Compliance
+| # | Name | Pitch |
+|---|---|---|
+| 11 | SecretSentry | Wraps every shell command and redacts/blocks accidental secret leaks before they hit stdout |
+| 12 | CVEWatch | Monitors your lockfiles against the GitHub Advisory API and injects CVE alerts on discovery |
+| 13 | DriftDetect | Diffs live cloud IAM/security-group state against a compliance baseline in real time |
+| 14 | AuditLens | Streams GitHub audit log events; lets you ask natural-language forensic questions |
+| 15 | ThreatModel | Auto-generates a living STRIDE threat model from code changes and architecture files |
+| 16 | LicenseCop | Scans transitive dependencies for license conflicts against your policy, blocks violations |
+| 17 | ZeroTrustPosture | Evaluates repo zero-trust hygiene (OIDC, branch protections, signed commits) and scores it |
+| 18 | SOC2Tracker | Maps engineering activities to SOC2 Trust Service Criteria and tracks evidence gaps |
+| 19 | SASTSurface | Runs incremental SAST on changed files and injects HIGH/CRITICAL findings as you save |
+| 20 | SBOMPulse | Generates/monitors your SBOM and alerts when component checksums change unexpectedly |
+### Domain 3 — Developer Productivity & Code Quality
+| # | Name | Pitch |
+|---|---|---|
+| 21 | FlowGuard | Detects context-switching frequency and nudges you back to your original task |
+| 22 | DebtRadar | Surfaces tech debt hotspots in files you're actively editing, ranked by age and churn |
+| 23 | CommitCoach | Scores staged commit messages and rewrites weak ones on demand via /commit |
+| 24 | CoverageWatch | Alerts in real time when your edits drop test coverage below a configured threshold |
+| 25 | BundleWatcher | Monitors JS bundle size on every build and flags regressions with the culprit import |
+| 26 | RefactorPulse | Identifies long functions and high cyclomatic complexity in your current branch diff |
+| 27 | DepFreshness | Warns when you import a dependency with a known CVE or major update, as you type |
+| 28 | GhostCode | Detects dead exports and unused functions, offers to schedule a removal PR |
+| 29 | LintDrift | Tracks linting error trends and flags when violations are trending upward |
+| 30 | PRPulse | Injects live PR review comments into your editor session to eliminate the GitHub UI round-trip |
+### Domain 4 — Team Collaboration & Communication
+| # | Name | Pitch |
+|---|---|---|
+| 31 | StandupBot | Collects async standup updates from git/PR activity and synthesizes a team digest |
+| 32 | PRNudge | Monitors stale PRs awaiting review and escalates with context injected into session |
+| 33 | WarRoom | Spins up a structured incident channel with auto-assigned roles and live timeline |
+| 34 | DepAlert | Detects when upstream teams merge breaking changes and notifies preemptively |
+| 35 | OnboardPilot | Guides new engineers through codebase onboarding with contextual hints from git activity |
+| 36 | VelocityWatch | Tracks sprint health in real time, surfaces blocked/scope-crept issues before retros |
+| 37 | MergeMediator | When two PRs conflict, summarizes both and proposes a resolution strategy |
+| 38 | KnowledgePulse | Captures implicit knowledge from PR review comments into a searchable knowledge base |
+| 39 | OncallEscalate | Monitors on-call rotation and auto-escalates unacknowledged alerts with runbook links |
+| 40 | AsyncBridge | Summarizes Slack/Teams threads relevant to your current branch and injects key decisions |
+### Domain 5 — Data Engineering & Analytics
+| # | Name | Pitch |
+|---|---|---|
+| 41 | PipelineWatch | Monitors dbt/Airflow DAG runs and injects failure context + upstream lineage |
+| 42 | SchemaDrift | Alerts when upstream table schemas change in ways that will break your models |
+| 43 | QuerySheriff | Flags query performance regressions as you write SQL, before they hit production |
+| 44 | FreshnessCop | Monitors data freshness SLAs and alerts when tables go stale |
+| 45 | SparkPulse | Streams Spark executor metrics (shuffle spills, GC pressure, skew) live into session |
+| 46 | CostSentinel | Watches Snowflake/BigQuery billing and fires alerts when queries exceed cost budgets |
+| 47 | DeadTableDetector | Finds tables unused for N days and proposes deprecation |
+| 48 | CDCStreamHealth | Monitors Debezium/Kafka CDC lag, offset drift, and consumer group failures in real time |
+| 49 | QualityGate | Runs Great Expectations/dbt tests and explains failures in plain English |
+| 50 | AnomalyRadar | Detects statistical anomalies in key business metrics, surfaces them proactively |
+### Domain 6 — Cloud Infrastructure & Kubernetes
+| # | Name | Pitch |
+|---|---|---|
+| 51 | crashloop-cop | Auto-detects CrashLoopBackOff pods and injects root-cause summaries into session |
+| 52 | quota-watch | Warns when namespace resource quotas near exhaustion before deployments fail |
+| 53 | helm-drift | Detects when live cluster state has drifted from the Helm release manifest |
+| 54 | terra-narrator | Pipes terraform plan output through Copilot and returns a plain-English risk summary |
+| 55 | argo-pulse | Monitors ArgoCD sync status and injects degraded/failed application context automatically |
+| 56 | node-scaler-log | Narrates Cluster Autoscaler decisions in real time, explaining why nodes are added/removed |
+| 57 | cost-anomaly | Detects cloud spend spikes and correlates them with recent Kubernetes workload changes |
+| 58 | netpol-audit | Audits NetworkPolicy coverage and flags pods with unrestricted ingress/egress |
+| 59 | crd-deprecation-sentinel | Scans installed CRDs against the K8s version removal list before cluster upgrades |
+| 60 | failover-coach | Guides cross-region failover runbook execution step-by-step with live state validation |
+### Domain 7 — Testing & QA Automation
+| # | Name | Pitch |
+|---|---|---|
+| 61 | FlakeHunter | Detects flaky tests by watching CI run history for non-deterministic failures |
+| 62 | CoverageGuard | Blocks coverage regressions in real time by watching lcov/istanbul diffs on save |
+| 63 | MutantWhisperer | Runs mutation testing on changed files and injects surviving mutants as test-writing prompts |
+| 64 | SplitAdvisor | Analyzes test suite timing and recommends optimal CI matrix parallelization splits |
+| 65 | PixelSentry | Watches visual regression reports (Percy/Playwright) and injects diff summaries |
+| 66 | BlastRadius (Test) | Maps which E2E tests historically fail when a given file changes |
+| 67 | LoadLens | Streams k6/Locust load test metrics and injects threshold breach alerts live |
+| 68 | SeedVault | Manages test fixtures by generating and versioning realistic seed data on demand |
+| 69 | PropBot | Suggests property-based test cases by analyzing function signatures and boundary gaps |
+| 70 | SnapDrift | Monitors snapshot test churn and flags blind jest -u updates that mask regressions |
+### Domain 8 — Documentation & Knowledge Management
+| # | Name | Pitch |
+|---|---|---|
+| 71 | ADRscribe | Watches commits and drafts Architecture Decision Records when structural changes are detected |
+| 72 | StaleDoc | Monitors source changes and flags documentation that hasn't been updated in sync |
+| 73 | Postmortem Pilot | Assembles structured postmortem drafts from incident logs, alert history, and git blame |
+| 74 | ChangelogCraft | Generates human-readable changelogs from PR titles, labels, and linked issues since last tag |
+| 75 | GlossaryGrow | Watches code/docs for undefined jargon and incrementally evolves a team glossary |
+| 76 | RunbookBot | Auto-generates runbooks by observing commands engineers actually run during incidents |
+| 77 | READMEscore | Scores README health across all org repos and surfaces the worst offenders |
+| 78 | DiagramDrift | Detects when Mermaid/PlantUML architecture diagrams diverge from actual codebase structure |
+| 79 | OnboardOracle | Builds personalized onboarding knowledge paths based on what files a new engineer has touched |
+| 80 | CommentCoverage | Tracks code comment coverage and enforces doc standards on public APIs |
+### Domain 9 — AI/ML & Data Science Workflows
+| # | Name | Pitch |
+|---|---|---|
+| 81 | TrainWatch | Monitors training runs and injects loss/metric anomalies into session live |
+| 82 | DriftSentry (ML) | Watches production inference logs for input feature distribution drift from training baselines |
+| 83 | HyperPilot | Injects past MLflow/W&B experiment results to turn Copilot into a Bayesian HP advisor |
+| 84 | GPUWatch | Streams GPU utilization, VRAM, and thermal throttle events with remediation hints |
+| 85 | EvalGate | Runs evaluation harness after every code change and injects metric deltas before commit |
+| 86 | DataFresh | Monitors feature store staleness and warns when you reference outdated datasets |
+| 87 | PromptAB | Runs A/B evaluations of prompt variants against a judge LLM and streams ranked results |
+| 88 | FineTuneWatch | Tails fine-tuning job logs (OpenAI/Vertex/SageMaker) and surfaces completion + cost |
+| 89 | ModelCardGen | Auto-generates HuggingFace-compatible model cards from eval results and training config |
+| 90 | ExperimentNarrator | Injects a natural-language narrative of recent experiments so you can ask "what have I tried?" |
+### Domain 10 — Enterprise Governance & Org Management
+| # | Name | Pitch |
+|---|---|---|
+| 91 | PolicyGuard | Enforces org-wide policy-as-code rules (via OPA) before every commit or PR |
+| 92 | RepoScore | Gives every repo a live health score across security, docs, coverage, and dependency freshness |
+| 93 | OwnerDrift | Detects when CODEOWNERS entries no longer match actual committers and suggests corrections |
+| 94 | LicenseWarden | Audits all transitive dependencies for license compatibility against org-approved list |
+| 95 | BranchShield | Audits branch protection rules org-wide and alerts/auto-remediates regressions |
+| 96 | InnerSource Pulse | Tracks cross-team contributions to shared libraries and surfaces engagement trends |
+| 97 | APIDeprecator | Tracks deprecated internal APIs and alerts consuming teams before sunset dates |
+| 98 | CostCenter | Attributes cloud spend and Actions minutes to teams using repo metadata and CODEOWNERS |
+| 99 | FitnessFn | Runs architectural fitness functions as executable guardrails, blocking layering violations |
+| 100 | SLAWatch | Monitors open incidents and PRs against SLA commitments and injects breach alerts live |
+---
+## Top 20 Recommendations
+Ranked by impact × buildability × novelty.
+### Tier 1 — Build First (High Impact, Directly Buildable)
+| Rank | Extension | Why It Wins |
+|---|---|---|
+| 1 | PipelinePulse | Solves the #1 developer pain: watching CI. Pure CommandEmitter + EventFilter. Near-zero infra. |
+| 2 | DriftSentinel | IaC drift is universal. Periodic terraform plan + filter is the exact tap pattern. |
+| 3 | PRPulse | Eliminates the editor-GitHub UI round-trip. Uses gh pr view CommandEmitter. |
+| 4 | SASTSurface | SAST in seconds, not CI minutes. Uses semgrep --json + EventFilter. |
+| 5 | FlakeHunter | Flaky tests erode team trust. CI log tailing is exactly what tap does best. |
+### Tier 2 — High Leverage, Moderate Complexity
+| Rank | Extension | Why It Wins |
+|---|---|---|
+| 6 | CanaryWhisperer | Prometheus/Datadog polling + promote/rollback slash commands. |
+| 7 | SecretSentry | Intercepts every shell command; pre-git-push defense. High security ROI. |
+| 8 | CVEWatch | GitHub Advisory API polling + lockfile scanning. Nearly every team needs this. |
+| 9 | BlastRadius | Pre-merge risk scoring changes merge culture. Reads CODEOWNERS + GitHub API. |
+| 10 | CoverageWatch | Test coverage regression is invisible until CI fails. Live feedback closes the loop. |
+### Tier 3 — Strategic & Differentiated
+| Rank | Extension | Why It Wins |
+|---|---|---|
+| 11 | ExperimentNarrator | Unique to ML teams; session-start context injection from MLflow/W&B. |
+| 12 | ADRscribe | ADRs never get written. Triggering on commit patterns is elegant. |
+| 13 | DeployDiary | Compliance artifact generated automatically. Huge for regulated industries. |
+| 14 | FitnessFn | Running architectural fitness functions in CI is a proven practice; through Copilot they become conversational. |
+| 15 | ThreatModel | Threat models rarely stay in sync with code. A living model closes a major security gap. |
+| 16 | StandupBot | Async standup from git activity eliminates synchronous ceremonies. |
+| 17 | ChangelogCraft | Writing release changelogs is a chore few people enjoy. Fully automatable. |
+| 18 | GPUWatch | nvidia-smi dmon as a CommandEmitter is trivial; diagnostic value is enormous. |
+| 19 | CostGatekeeper | Infracost integration as a filter on terraform apply is a clear budget win. |
+| 20 | RepoScore | Org-wide health scoring via GitHub API is a platform engineering multiplier. |
+---
+## The 5 Core Extension Archetypes
+```
+1. WATCHER    — CommandEmitter + EventFilter + SessionInjector
+               (PipelinePulse, CVEWatch, GPUWatch, FlakeHunter)
+               Pattern: tail a process → filter → inject
+2. SCHEDULER  — PromptEmitter on timer/idle + SessionInjector
+               (DriftSentinel, StandupBot, AnomalyRadar, HyperPilot)
+               Pattern: poll on schedule → AI synthesizes → inject
+3. GATEKEEPER — Tool wraps a command, blocks/modifies output
+               (SecretSentry, CostGatekeeper, LicenseCop, PolicyGuard)
+               Pattern: intercept action → check → allow/block/annotate
+4. ADVISOR    — Slash command + multi-tool AI analysis
+               (BlastRadius, CanaryWhisperer, ThreatModel, terra-narrator)
+               Pattern: /command → gather data → AI reasons → report
+5. RECORDER   — Slash command + writes artifacts back to repo
+               (DeployDiary, ADRscribe, ChangelogCraft, ModelCardGen)
+               Pattern: /command → gather context → AI writes → git commit
+```
+| Archetype | SDK Features Used | Complexity |
+|---|---|---|
+| Watcher | CommandEmitter, EventFilter, SessionInjector | Low |
+| Scheduler | PromptEmitter (interval/idle), SessionInjector | Low |
+| Gatekeeper | Tool registration, child_process, inject | Medium |
+| Advisor | Slash command, multiple tools, external APIs | Medium |
+| Recorder | Slash command, tools, fs.writeFile, git | Higher |
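
The Gatekeeper pattern (intercept action, check, then allow, block, or annotate) might look roughly like the sketch below. The check functions, the verdict names, and the choice to let the most severe verdict win are all assumptions made for illustration; none of them come from the tap SDK.

```javascript
// Severity order is an assumption: block outranks annotate, which outranks allow.
const SEVERITY = { allow: 0, annotate: 1, block: 2 };

// Hypothetical checks: each returns { verdict, reason } or null to pass.
const checks = [
  (cmd) =>
    /\bgit\s+push\b.*--force\b/.test(cmd)
      ? { verdict: "block", reason: "force push" }
      : null,
  (cmd) =>
    /\bterraform\s+apply\b/.test(cmd)
      ? { verdict: "annotate", reason: "review plan cost first" }
      : null,
];

// Run every check and keep the most severe verdict plus all reasons.
function gate(command, checkList) {
  const decision = { verdict: "allow", reasons: [] };
  for (const check of checkList) {
    const result = check(command);
    if (!result) continue;
    decision.reasons.push(result.reason);
    if (SEVERITY[result.verdict] > SEVERITY[decision.verdict]) {
      decision.verdict = result.verdict;
    }
  }
  return decision;
}

const forcePush = gate("git push origin main --force", checks);
const apply = gate("terraform apply -auto-approve", checks);
const harmless = gate("ls -la", checks);
```

Running all checks rather than stopping at the first hit keeps every reason available for the annotation, at the cost of a little extra work per command.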
+---
+## Recommended Build Roadmap
+### Phase 1 — Extend ※ tap (Immediate)
+- **PipelinePulse** — gh run watch CommandEmitter with AI failure diagnosis
+- **PRPulse** — gh pr view --comments + new-comment EventFilter
+- **CoverageWatch** — Jest/Vitest coverage CommandEmitter with threshold config
+### Phase 2 — New Extensions (using tap as template)
+- **DriftSentinel** — terraform plan -json PromptEmitter for IaC drift
+- **SASTSurface** — semgrep --json on git diff with HIGH/CRITICAL filter
+- **CVEWatch** — GitHub Advisory API PromptEmitter on idle
+### Phase 3 — Advanced Extensions (Strategic)
+- **CanaryWhisperer** — Prometheus/Datadog polling + promote/rollback slash commands
+- **ADRscribe** — Commit-triggered ADR drafting with git-pattern detection
+- **ExperimentNarrator** — MLflow/W&B session-start context injection
+- **DeployDiary** — Multi-source deployment record generation and git commit
+---
+## Key Insight
+> The terminal is ambient. Copilot is conversational. The gap between them — background signals that require your attention — is exactly where extensions live.
+> Every great extension answers one question: "What would you want to already know when you start your next conversation?"
+The tap pattern — **watch → filter → inject** — is the universal primitive. All 100 ideas are variations of filling that gap in specific domains. The best ones work silently until something needs your attention, then interrupt precisely and contextually.

package/docs/recipes/agent-brainstorm/deep-ideas.md ADDED

@@ -0,0 +1,216 @@
+# Beyond Watch-Filter-Inject — Deep Ideas
+The 100 ideas catalog covers the obvious pattern: run a command, filter output, inject into session. That's the "hello world" of tap. These ideas explore what becomes possible when you think harder about what the platform uniquely enables.
+---
+## What makes tap different from every other tool?
+1. **It lives inside the AI reasoning loop.** It doesn't just show you data — it changes how the AI thinks by injecting context before the model reasons.
+2. **It intercepts tool calls.** `onPreToolUse` / `onPostToolUse` hooks can modify, enhance, gate, or redirect ANY tool call the agent makes.
+3. **It has AI-calling-AI.** PromptEmitters are scheduled AI invocations — not shell commands, but reasoning on a timer.
+4. **It accumulates state across turns.** EventStreams are rolling memory that persists within and across sessions.
+5. **It can register/unregister tools at runtime.** The agent's capabilities change based on what's happening.
+The 100 ideas treat tap as a pipe. These ideas treat it as a **cognitive layer**.
+---
+## Idea 1: Reflexive Self-Improvement
+**The extension watches the agent and makes it better.**
+The agent makes tool calls, writes code, runs tests. An extension watches all of this via `onPostToolUse` and `assistant.message` events. It builds a model of what works and what doesn't:
+- Which tool calls fail and why
+- Which code patterns lead to test failures
+- Which instructions the user has to repeat
+Then it injects learned corrections into session context: "When editing Python in this repo, always run black after edits — you've been corrected 3 times." The agent gets better over sessions without anyone writing new instructions.
+**Why this is novel:** The extension IS the agent's long-term memory and learning system. Not a static instructions file — a living, adapting context layer.
+**Core mechanism:** `onPostToolUse` hook → track outcomes → EventStream as learning journal → `onSessionStart` injects accumulated lessons.
+---
+## Idea 2: Attention Budget
+**Not everything deserves to interrupt the AI. The extension decides what does.**
+Current model: EventFilter is a set of static regex rules. New model: a dynamic attention budget. The extension tracks:
+- What the user is currently working on (from recent tool calls and messages)
+- How important each incoming event is (not just regex — semantic relevance)
+- How many interruptions have happened recently (fatigue modeling)
+A PromptEmitter periodically reviews the EventStream and re-ranks what should be injected vs. kept vs. dropped. The filter rules themselves are AI-generated and hot-swapped based on context.
+**Why this is novel:** The EventFilter becomes intelligent. It doesn't just pattern-match — it reasons about relevance. "You're debugging a CSS issue, so suppress the CI failure for the backend service, but surface the Playwright screenshot regression."
+**Core mechanism:** PromptEmitter (idle schedule) reads recent conversation + EventStream → generates new EventFilter rules → `tap_set_event_filter` hot-swaps them.
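
The fatigue-modeling part of this idea could be approximated without any AI call, as in the sketch below. It assumes each event already carries a `relevance` score between 0 and 1 (for example, assigned by the periodic PromptEmitter review). The `createAttentionBudget` helper, the thresholds, and the sliding window are hypothetical.

```javascript
// Hypothetical budget: at most `maxInterrupts` injections per `windowMs`.
function createAttentionBudget({ maxInterrupts, windowMs }) {
  const injectedAt = []; // timestamps (ms) of recent injections

  return {
    // `event.relevance` in [0, 1] is assumed to be computed elsewhere.
    decide(event, now) {
      // Forget injections that have aged out of the window.
      while (injectedAt.length && now - injectedAt[0] >= windowMs) {
        injectedAt.shift();
      }
      // Fatigue raises the bar: each recent interruption demands more relevance.
      const threshold = 0.5 + 0.5 * (injectedAt.length / maxInterrupts);
      if (injectedAt.length < maxInterrupts && event.relevance >= threshold) {
        injectedAt.push(now);
        return "inject";
      }
      // Below the bar: retain moderately relevant events, discard the rest.
      return event.relevance >= 0.3 ? "keep" : "drop";
    },
  };
}

const budget = createAttentionBudget({ maxInterrupts: 2, windowMs: 60000 });
const decisions = [
  budget.decide({ relevance: 0.6 }, 0),
  budget.decide({ relevance: 0.6 }, 1000), // same score, but the bar has risen
  budget.decide({ relevance: 0.9 }, 2000),
  budget.decide({ relevance: 1.0 }, 3000), // budget exhausted
  budget.decide({ relevance: 0.1 }, 4000),
  budget.decide({ relevance: 0.6 }, 70000), // window has rolled over
];
```

The second call shows the intended effect: an event that would have interrupted a fresh session is merely kept once an interruption has just happened.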
+---
+## Idea 3: Tool Interception Layer
+**Every tool call passes through an enhancement/gating layer.**
+`onPreToolUse` and `onPostToolUse` are the most underexplored hooks. They let you:
+- **Enhance:** Before `edit` runs, inject linting context. After `edit` runs, auto-run the formatter.
+- **Gate:** Before `shell(rm -rf)` runs, check whether the target path is in a protected directory.
+- **Augment:** After `grep` returns results, automatically add file summaries.
+- **Redirect:** Before an API call, check if there's a cached result in the EventStream.
+- **Record:** Log every tool call into an EventStream for replay, audit, or debugging.
+This turns tap into a **middleware layer** for the agent's actions. Think Express.js middleware but for AI tool calls.
+**Why this is novel:** You're not adding tools — you're modifying all existing tools. One extension can change the behavior of every tool in the system.
+**Example:** An extension that intercepts every `edit` tool call, runs the edit, then immediately spawns `eslint --fix` on the file. The agent never produces unlinted code. Zero changes to the agent's instructions needed.
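
The middleware idea can be sketched independently of the real hook signatures. Everything here is hypothetical: the `{ tool, args }` call shape, the `{ deny }` verdict, the synchronous `runToolCall` wrapper, and the stand-in executor. The actual `onPreToolUse` / `onPostToolUse` hooks may be asynchronous and shaped differently.

```javascript
// Hypothetical shapes: a tool call is { tool, args }. Pre hooks may return
// { deny: reason } to gate the call; post hooks may return a replacement result.
function runToolCall(call, { pre = [], post = [], execute }) {
  for (const hook of pre) {
    const verdict = hook(call);
    if (verdict && verdict.deny) return { ok: false, error: verdict.deny };
  }
  let result = execute(call);
  for (const hook of post) {
    const replaced = hook(call, result);
    if (replaced !== undefined) result = replaced;
  }
  return { ok: true, result };
}

const auditLog = [];
const hooks = {
  pre: [
    // Gate: refuse recursive deletes under a protected path.
    (call) =>
      call.tool === "shell" && /^rm\s+-rf\s+\/etc\b/.test(call.args.command)
        ? { deny: "protected directory" }
        : undefined,
  ],
  post: [
    // Record: log every completed call (returns undefined, so the result is unchanged).
    (call) => {
      auditLog.push(call.tool);
    },
    // Augment: tag edit results as formatted (stand-in for spawning a formatter).
    (call, result) =>
      call.tool === "edit" ? { ...result, formatted: true } : undefined,
  ],
  // Stand-in executor; a real one would dispatch to the actual tool.
  execute: (call) => ({ output: `ran ${call.tool}` }),
};

const denied = runToolCall({ tool: "shell", args: { command: "rm -rf /etc/nginx" } }, hooks);
const edited = runToolCall({ tool: "edit", args: { path: "src/app.js" } }, hooks);
```

In this sketch a denied call never reaches the executor or the post hooks, so it does not appear in `auditLog`. Whether gated calls should also be recorded is a design choice the source leaves open.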
+---
+## Idea 4: Multi-Agent Debate Protocol
+**Multiple PromptEmitters that cross-check each other's work.**
+Instead of one AI doing everything, set up competing perspectives:
+- **Builder emitter:** "Implement this feature"
+- **Critic emitter:** On idle, reviews what Builder did and injects objections
+- **Security emitter:** On every file edit, checks for vulnerabilities
+- **Simplicity emitter:** On idle, asks "can this be simpler?"
+They don't talk to each other directly — they all inject into the same session. The main agent synthesizes their perspectives. It's a **council pattern** where the user gets the benefit of multiple viewpoints without managing multiple sessions.
+**Why this is novel:** AI-to-AI coordination through shared EventStreams. Each emitter specializes and challenges the others. The quality of output improves because no single perspective dominates.
+**Core mechanism:** Multiple PromptEmitters on idle schedule, each with a different system prompt and focus area. EventStreams provide shared context.
+---
+## Idea 5: Capability Discovery via Universal Tool Gateway
+**The agent's abilities change based on what's running on your machine.**
+Combine the universal tool gateway with session context injection:
+- Docker is running → container management tools appear, agent knows it can deploy locally
+- Postgres is running → database query tools appear, agent knows the schema
+- Browser has a React app open (via Detour bridge) → React component tools appear
+- kubectl is configured → Kubernetes tools appear
+- Nothing is running → agent works with just files and git
+The agent's instructions dynamically update: "You currently have access to: local Postgres (myapp_dev), Docker (3 containers running), and the React app at localhost:3000."
+**Why this is novel:** The agent is context-aware about the developer's environment at runtime. Not "what tools exist" but "what tools are available right now." It can suggest actions based on what's actually possible.
+**Core mechanism:** Bridge server polls or receives hellos from providers → tap calls `session.registerTools()` AND injects updated capability description into session context via `onUserPromptSubmitted` hook.
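
One way to picture the bookkeeping behind this mechanism is sketched below. It computes which tools to add or remove when the set of running providers changes and builds the capability sentence to inject. The provider record shape and the helper names (`activeTools`, `diffTools`, `describeCapabilities`) are assumptions, and the calls into `session.registerTools()` are deliberately left out.

```javascript
// Hypothetical provider records; `running` would come from polling or hellos.
const providers = [
  { label: "local Postgres (myapp_dev)", tools: ["db_query"], running: true },
  { label: "Docker (3 containers running)", tools: ["docker_ps", "docker_logs"], running: true },
  { label: "kubectl", tools: ["k8s_get"], running: false },
];

// Tools offered by providers that are currently running.
function activeTools(list) {
  return list.filter((p) => p.running).flatMap((p) => p.tools);
}

// Which tool names to add and remove, given the previous and current sets.
function diffTools(previous, current) {
  const prev = new Set(previous);
  const next = new Set(current);
  return {
    register: current.filter((t) => !prev.has(t)),
    unregister: previous.filter((t) => !next.has(t)),
  };
}

// Capability sentence to inject into session context.
function describeCapabilities(list) {
  const active = list.filter((p) => p.running).map((p) => p.label);
  return active.length
    ? `You currently have access to: ${active.join(", ")}.`
    : "You currently have access to: files and git only.";
}

const previouslyRegistered = ["db_query", "k8s_get"];
const plan = diffTools(previouslyRegistered, activeTools(providers));
const summary = describeCapabilities(providers);
```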
+---
+## Idea 6: Session Handoff
+**Pass work between sessions with full context.**
+You're working on a feature in Session A. You need to context-switch to debug a production issue. Instead of losing context:
+1. The extension serializes Session A's EventStreams, current state, and a PromptEmitter-generated summary into a handoff artifact.
+2. Session B starts with the production issue. The extension notes this is a different context.
+3. When you return to feature work, Session C starts. The extension detects the topic match and injects the Session A handoff artifact as context.
+**Why this is novel:** Extensions bridge the gap between ephemeral sessions. Your work doesn't evaporate when you switch contexts. The extension is your **working memory across sessions**.
+**Core mechanism:** `onSessionEnd` hook serializes state → persistent config + file artifact. `onSessionStart` hook checks for relevant handoff artifacts → injects as additionalContext.
+---
+## Idea 7: Workflow Recorder → Replay
+**Watch what a human does, then replay it as an automated workflow.**
+The extension records every meaningful action during a session: which files were edited, what commands were run, what tools were called, in what order. It builds a **workflow graph**.
+Later, you say "do what I did last time when deploying" and the extension replays the workflow — but through the AI, so it adapts to the current context (different branch, different files, different state).
+**Why this is novel:** It's not a shell script recording. It's capturing intent at the tool-call level and replaying it through an AI that can handle variations. Brittle automation becomes adaptive automation.
+**Core mechanism:** `onPostToolUse` hook records tool calls into EventStream → `onSessionEnd` distills into workflow template → PromptEmitter can replay by injecting the template as instructions.
+---
+## Idea 8: Semantic Event Correlation
+**Events from different streams that are related find each other.**
+You have three emitters running: CI watcher, error log tailer, and PR comment monitor. They produce events independently. But:
+- CI fails at 2:03pm
+- Error logs spike at 2:03pm
+- A PR comment at 1:55pm said "this might break staging"
+A correlation engine (PromptEmitter on idle) reads across all EventStreams and connects these: "CI failure correlates with error spike; both likely caused by PR #247 (commenter warned about this)."
+**Why this is novel:** Individual emitters are dumb pipes. The correlation layer creates intelligence by reading across streams. This is how humans reason about incidents — connecting signals from different sources.
+**Core mechanism:** PromptEmitter (idle) reads history from all streams → reasons about correlations → injects synthesis as a single high-signal event.
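
Before handing anything to a PromptEmitter, a cheap time-window grouping step could narrow down which events are worth reasoning about together. The sketch below is an assumption, not part of the described mechanism. The `correlate` helper, the event shape (`{ stream, at, text }`, with `at` in minutes since midnight), and the 10-minute window are illustrative.

```javascript
// Group events whose timestamps are within `windowMin` of the previous event,
// then keep only groups that span more than one stream.
function correlate(events, windowMin) {
  const sorted = [...events].sort((a, b) => a.at - b.at);
  const groups = [];
  for (const event of sorted) {
    const current = groups[groups.length - 1];
    if (current && event.at - current[current.length - 1].at <= windowMin) {
      current.push(event);
    } else {
      groups.push([event]);
    }
  }
  return groups.filter((g) => new Set(g.map((e) => e.stream)).size > 1);
}

// Timestamps are minutes since midnight (13:55 = 835, 14:03 = 843, 16:00 = 960).
const events = [
  { stream: "ci", at: 843, text: "CI failed" },
  { stream: "errors", at: 843, text: "error log spike" },
  { stream: "pr-comments", at: 835, text: "this might break staging" },
  { stream: "ci", at: 960, text: "CI passed" },
];
const correlated = correlate(events, 10);
```

With the sample data, the PR comment, the CI failure, and the error spike land in one group, while the later CI run stands alone and is filtered out. The grouped events could then be passed to the idle PromptEmitter for the actual causal reasoning.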
+---
+## Idea 9: Progressive Disclosure of Complexity
+**The extension starts simple and grows capabilities as you need them.**
+New user gets: one tool, one emitter pattern, minimal instructions. As the extension observes what you do, it progressively:
+- Suggests new emitters based on your patterns ("I notice you check CI manually every 10 minutes — want me to watch it?")
+- Offers to persist useful temporary emitters
+- Proposes EventFilter refinements based on what you ignore vs. react to
+- Surfaces advanced features only when they'd help
+**Why this is novel:** Most tools dump every feature on you on day one. This extension **teaches itself to you** by observing what you need. The onboarding IS the product.
+**Core mechanism:** PromptEmitter (idle, low frequency) reviews session patterns → compares against known recipes → suggests next capability via `session.send()`.
+---
+## Idea 10: Extension as API Gateway
+**Any external API becomes a tool without writing an extension.**
+User says: "I want to be able to query our Jira board." Instead of writing a Jira extension:
+1. tap registers a generic `tap_api` tool
+2. User provides the API spec (OpenAPI/Swagger URL or a few example curl commands)
+3. tap generates typed tools from the spec at runtime via `session.registerTools()`
+4. The agent can now query Jira, create tickets, update status
+The API definition lives in `tap.config.json`. Add a new API? Add an entry to config. No code.
+**Why this is novel:** It collapses the "extension per service" model into "config per service." The 100 ideas list has CVEWatch, PipelineWatch, SLAWatch — they're all "call an API and filter results." With a generic API gateway, you configure them instead of coding them.
+**Core mechanism:** Config-driven API definitions → `onSessionStart` generates Tool objects with handlers that make HTTP calls → `session.registerTools()`.
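
A rough sketch of the config-to-tool step follows. The config entry shape, the `buildTools` helper, and the placeholder host `jira.example.com` are assumptions; the real `tap.config.json` schema and the Tool object expected by `session.registerTools()` may differ. The HTTP client is injected so the sketch runs without network access.

```javascript
// Hypothetical config entries; the real tap.config.json schema may differ.
const apiConfig = [
  {
    name: "jira_get_issue",
    method: "GET",
    url: "https://jira.example.com/rest/api/2/issue/{key}", // placeholder host
    params: ["key"],
  },
];

// Build one tool object per config entry. `httpClient` is injected; in
// practice it could wrap fetch and return a promise.
function buildTools(config, httpClient) {
  return config.map((entry) => ({
    name: entry.name,
    handler: (args) => {
      const missing = entry.params.filter((p) => !(p in args));
      if (missing.length) {
        throw new Error(`missing params: ${missing.join(", ")}`);
      }
      // Substitute {param} placeholders, URL-encoding each value.
      const url = entry.url.replace(/\{(\w+)\}/g, (_, p) =>
        encodeURIComponent(args[p])
      );
      return httpClient({ method: entry.method, url });
    },
  }));
}

// Fake client that records requests instead of sending them.
const requests = [];
const tools = buildTools(apiConfig, (req) => {
  requests.push(req);
  return { status: 200 };
});
const response = tools[0].handler({ key: "TAP-1" });
```

Adding another API under this design means appending an entry to `apiConfig`; `buildTools` itself stays unchanged.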
+---
+## The Meta-Pattern
+The 100 ideas are all instances of: **run a thing → filter output → show it to the AI.**
+The deeper ideas are instances of: **observe the system (including the AI itself) → reason about what matters → change the AI's behavior.**
+The shift is from **tap as a pipe** to **tap as a cognitive layer:**
+| Surface level | Deep level |
+|---|---|
+| Watch CI logs | Watch the agent's own failures and learn from them |
+| Filter by regex | Filter by semantic relevance to current task |
+| Inject events | Change what tools exist and how they behave |
+| One emitter per service | One gateway that discovers services at runtime |
+| Static instructions | Instructions that evolve based on observed patterns |
+| Single session | Context that flows across sessions |
+| One AI perspective | Multiple AI perspectives that debate |
+The universal tool gateway was the first example of this shift. These ideas continue it.