copilot-tap-extension 2.0.8 → 2.0.9
- package/README.md +2 -1
- package/SOUL.md +51 -0
- package/bin/install.mjs +2 -1
- package/dist/copilot-instructions.md +5 -0
- package/dist/extension.mjs +361 -20
- package/dist/version.json +1 -1
- package/docs/adr/0001-persistent-config-default-ownership.md +33 -0
- package/docs/adr/0002-local-provider-gateway-runtime-security.md +36 -0
- package/docs/adr/0003-emitter-delivery-lifecycle.md +68 -0
- package/docs/adr/0004-persistent-config-canonical-streams.md +86 -0
- package/docs/adr/0005-provider-sdk-push-and-dynamic-tools.md +48 -0
- package/docs/adr/0006-command-emitter-cwd-workspace-boundary.md +46 -0
- package/docs/adr/0007-runtime-session-workspace-context.md +62 -0
- package/docs/evals.md +41 -0
- package/docs/evolution-of-tap-icon.html +989 -0
- package/docs/providers.md +242 -0
- package/docs/recipes/adaptive-agent.md +303 -0
- package/docs/recipes/agent-brainstorm/100-extension-ideas.md +288 -0
- package/docs/recipes/agent-brainstorm/deep-ideas.md +216 -0
- package/docs/recipes/ambient-guardian.md +314 -0
- package/docs/recipes/browser-bridge.md +162 -0
- package/docs/recipes/codex-goals-for-tap-goal.md +136 -0
- package/docs/recipes/copilot-sdk-canvas.md +147 -0
- package/docs/recipes/deferred-cognition.md +310 -0
- package/docs/recipes/provider-integration-patterns.md +93 -0
- package/docs/recipes/provider-interface-advanced.md +1364 -0
- package/docs/recipes/provider-interface-core-profile.md +568 -0
- package/docs/recipes/tap-control-plane-roadmap.md +60 -0
- package/docs/recipes/universal-tool-gateway.md +202 -0
- package/docs/reference.md +229 -0
- package/docs/use-cases.md +348 -0
- package/package.json +4 -1
- package/providers/detour/README.md +84 -0
- package/providers/detour/bridge.js +219 -0
- package/providers/detour/index.mjs +322 -0
- package/providers/detour/package-lock.json +577 -0
- package/providers/detour/package.json +19 -0
- package/providers/detour/scripts/build.mjs +31 -0
- package/providers/detour/src/bridge.js +256 -0
- package/providers/detour/src/contracts.js +40 -0
- package/providers/detour/src/inspector.js +260 -0
- package/providers/detour/src/inspector.test.mjs +53 -0
- package/providers/detour/src/panel.js +465 -0
- package/providers/detour/src/provider-core.js +233 -0
- package/providers/detour/src/provider-core.test.mjs +185 -0
- package/providers/detour/src/react-context-core.js +143 -0
- package/providers/detour/src/react-context.js +44 -0
- package/providers/detour/src/react-context.test.mjs +41 -0
- package/providers/templates/README.md +23 -0
- package/providers/templates/ci-review-provider.mjs +46 -0
- package/providers/templates/detour-workflow-provider.mjs +41 -0
- package/providers/templates/jira-github-provider.mjs +42 -0
- package/providers/templates/provider-utils.mjs +45 -0
- package/providers/templates/sast-triage-provider.mjs +51 -0
|
@@ -0,0 +1,314 @@
|
|
|
1
|
+
# Recipe: Ambient Guardian — Continuous Background Intelligence
|
|
2
|
+
|
|
3
|
+
## The insight
|
|
4
|
+
|
|
5
|
+
Skills fire when invoked. tap fires when something happens. The gap between these two is **time** — the 30 seconds between a teammate's force-push and your next `git push` that will conflict. The 90 seconds between a deploy and the error spike it causes. The silent period while CI is failing and you're still writing code that depends on it passing.
|
|
6
|
+
|
|
7
|
+
The Ambient Guardian is a pattern where tap maintains a continuous awareness of your environment and interrupts **only when something needs your attention right now**. Not a dashboard. Not a notification system. A runtime that understands what you're doing and correlates it with what's happening around you.
|
|
8
|
+
|
|
9
|
+
## Architecture
|
|
10
|
+
|
|
11
|
+
```
┌──────────────────────────────────────────────────────────┐
│  Copilot CLI session                                     │
│                                                          │
│  ┌─────────────────────────────────────────────────────┐ │
│  │  Ambient Guardian (tap extension layer)             │ │
│  │                                                     │ │
│  │  onPreToolUse ──► gate actions against live state   │ │
│  │  onPostToolUse ──► track what you're working on     │ │
│  │  transform callbacks ──► rewrite rules per context  │ │
│  └────────┬───────────┬───────────┬────────────────────┘ │
│           │           │           │                      │
│  ┌────────▼──┐ ┌──────▼───┐ ┌─────▼─────┐                │
│  │ Emitter:  │ │ Emitter: │ │ Emitter:  │                │
│  │ git state │ │ CI watch │ │ env probe │                │
│  │ (30s poll)│ │ (gh api) │ │ (custom)  │                │
│  └────────┬──┘ └──────┬───┘ └─────┬─────┘                │
│           │           │           │                      │
│  ┌────────▼───────────▼───────────▼──────┐               │
│  │  Correlation PromptEmitter (idle)     │               │
│  │  Reads all streams, finds patterns,   │               │
│  │  decides what to surface              │               │
│  └───────────────────────────────────────┘               │
└──────────────────────────────────────────────────────────┘
```
|
|
36
|
+
|
|
37
|
+
## Why skills can't do this
|
|
38
|
+
|
|
39
|
+
A skill can check CI status when you ask. But:
|
|
40
|
+
|
|
41
|
+
1. You don't ask until you're about to push — by then you've built on a broken foundation for 20 minutes.
|
|
42
|
+
2. A skill can't correlate a deploy, an error spike, and a PR comment that happened 90 seconds apart. It sees one thing at a time.
|
|
43
|
+
3. A skill can't physically block a `git push` mid-execution. `onPreToolUse` can.
|
|
44
|
+
4. A skill can't rewrite the system prompt to say "be conservative, production is degraded." Transform callbacks can.
|
|
45
|
+
|
|
46
|
+
The value is in what happens **between user messages** — the silence when nobody is asking questions but the world is changing.
|
|
47
|
+
|
|
48
|
+
## Components
|
|
49
|
+
|
|
50
|
+
### 1. Environment emitters (the eyes)
|
|
51
|
+
|
|
52
|
+
Three CommandEmitters running continuously:
|
|
53
|
+
|
|
54
|
+
**Git state watcher** — polls every 30 seconds:
|
|
55
|
+
```bash
# tr strips the padding that BSD/macOS wc adds, so matches like dirty=0 stay exact
git fetch --quiet 2>/dev/null; \
echo "branch=$(git branch --show-current)"; \
echo "ahead=$(git rev-list --count @{u}..HEAD 2>/dev/null || echo 0)"; \
echo "behind=$(git rev-list --count HEAD..@{u} 2>/dev/null || echo 0)"; \
echo "dirty=$(git status --porcelain | wc -l | tr -d ' ')"; \
echo "conflicts=$(git diff --name-only --diff-filter=U | wc -l | tr -d ' ')"
```
|
|
63
|
+
|
|
64
|
+
**CI watcher** — polls GitHub Actions:
|
|
65
|
+
```bash
|
|
66
|
+
gh run list --branch "$(git branch --show-current)" --limit 3 --json status,conclusion,name,createdAt
|
|
67
|
+
```
|
|
68
|
+
|
|
69
|
+
**Deploy/infra probe** — customizable per project (Kubernetes, AWS, Vercel, etc.):
|
|
70
|
+
```bash
|
|
71
|
+
kubectl get pods -l app=myservice --no-headers | awk '{print $1, $3, $4, $5}'
|
|
72
|
+
```
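
The git state watcher prints `key=value` lines, while the gating code later in this recipe reads fields such as `gitState.behind`. A minimal sketch of the parsing step in between, assuming the emitter output arrives as one string per poll; `parseGitState` and `COUNTER_KEYS` are hypothetical names, not part of tap:

```js
// Hypothetical helper: turn the git watcher's key=value lines into an object.
const COUNTER_KEYS = new Set(["ahead", "behind", "dirty", "conflicts"]);

function parseGitState(output) {
  const state = {};
  for (const line of output.split("\n")) {
    const idx = line.indexOf("=");
    if (idx === -1) continue; // skip blank or malformed lines
    const key = line.slice(0, idx).trim();
    const raw = line.slice(idx + 1).trim(); // tolerate padded counts
    // Known counters become numbers; everything else (the branch name) stays a string.
    state[key] = COUNTER_KEYS.has(key) ? Number(raw) : raw;
  }
  return state;
}

const sampleOutput =
  "branch=feature/auth\nahead=1\nbehind=2\ndirty=       0\nconflicts=0";
const gitState = parseGitState(sampleOutput);
console.log(gitState.branch, gitState.behind);
```

Restricting numeric conversion to known counter keys keeps a branch that happens to be named `123` from turning into a number.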
|
|
73
|
+
|
|
74
|
+
### 2. EventFilter rules (noise control)
|
|
75
|
+
|
|
76
|
+
```json
|
|
77
|
+
[
|
|
78
|
+
{ "match": "behind=0", "outcome": "drop" },
|
|
79
|
+
{ "match": "dirty=0", "outcome": "drop" },
|
|
80
|
+
{ "match": "conflicts=[1-9]", "outcome": "inject" },
|
|
81
|
+
{ "match": "behind=[1-9]", "outcome": "surface" },
|
|
82
|
+
{ "match": "conclusion.*failure", "outcome": "inject" },
|
|
83
|
+
{ "match": "CrashLoopBackOff", "outcome": "inject" },
|
|
84
|
+
{ "match": ".*", "outcome": "keep" }
|
|
85
|
+
]
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
Most polls produce nothing interesting → dropped. Only real signals break through.
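
How the rules are evaluated is not spelled out here. The sketch below assumes three things that may not match tap's real EventFilter: each output line is classified as its own event, `match` is a JavaScript regular expression, and the first matching rule wins. `classify` is a hypothetical name, and only a subset of the rules above is used:

```js
// A subset of the rules above, evaluated top to bottom (assumption: first match wins).
const rules = [
  { match: "behind=0", outcome: "drop" },
  { match: "dirty=0", outcome: "drop" },
  { match: "conflicts=[1-9]", outcome: "inject" },
  { match: "behind=[1-9]", outcome: "surface" },
  { match: ".*", outcome: "keep" },
];

// Hypothetical helper: return the outcome of the first rule whose regex matches.
function classify(eventText, ruleList) {
  for (const rule of ruleList) {
    if (new RegExp(rule.match).test(eventText)) return rule.outcome;
  }
  return "keep"; // no rule matched at all
}

console.log(classify("behind=3", rules));
```

Under first-match-wins, a single multi-line event containing both `behind=0` and `conflicts=2` would be dropped by the first rule, which is why this sketch classifies one line at a time.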
|
|
89
|
+
|
|
90
|
+
### 3. Correlation engine (the brain)
|
|
91
|
+
|
|
92
|
+
A PromptEmitter on idle schedule that reads across all streams:
|
|
93
|
+
|
|
94
|
+
```
|
|
95
|
+
prompt: |
|
|
96
|
+
You are a background correlation engine. Read the recent events
|
|
97
|
+
from all streams and look for patterns:
|
|
98
|
+
- Did something change in one stream that explains an event in another?
|
|
99
|
+
- Is there a time correlation between events across streams?
|
|
100
|
+
- Is the developer's current work going to collide with something
|
|
101
|
+
that just happened?
|
|
102
|
+
|
|
103
|
+
Only report if you find a genuine correlation. Say nothing if
|
|
104
|
+
everything looks normal. Be terse — one sentence max.
|
|
105
|
+
|
|
106
|
+
Stream history:
|
|
107
|
+
{{git_stream_last_10}}
|
|
108
|
+
{{ci_stream_last_10}}
|
|
109
|
+
{{deploy_stream_last_10}}
|
|
110
|
+
```
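
The `{{..._stream_last_10}}` placeholders imply a rendering step before the prompt is sent. A minimal sketch of that step, assuming each stream's history is available as an array of strings; `renderPrompt` and the shape of `histories` are hypothetical, and tap's actual templating may differ:

```js
// Hypothetical helper: fill {{<name>_stream_last_<N>}} with the last N events of a stream.
function renderPrompt(template, histories) {
  return template.replace(/\{\{(\w+?)_stream_last_(\d+)\}\}/g, (_, name, count) => {
    const events = histories[name] ?? [];
    const n = Number(count);
    const recent = n > 0 ? events.slice(-n) : [];
    // An explicit marker makes an empty stream visible to the model.
    return recent.length > 0 ? recent.join("\n") : "(no events)";
  });
}

const rendered = renderPrompt(
  "Stream history:\n{{git_stream_last_2}}\n{{ci_stream_last_10}}",
  { git: ["behind=0", "behind=1", "behind=2"] }
);
console.log(rendered);
```

Capping each stream at its last N events is also one way to keep the correlation prompt cheap.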
|
|
111
|
+
|
|
112
|
+
### 4. Action gating (onPreToolUse)
|
|
113
|
+
|
|
114
|
+
Before tool calls execute, the guardian checks live state:
|
|
115
|
+
|
|
116
|
+
```js
// `streams` and `isGitPush` are not defined in this recipe; treat them as placeholders.
onPreToolUse: async ({ toolName, toolArgs }) => {
  // Block push if CI is failing
  if (toolName === "shell" && isGitPush(toolArgs.command)) {
    const ciState = streams.latest("ci-watch");
    if (ciState?.includes("failure")) {
      return {
        permissionDecision: "deny",
        permissionDecisionReason:
          "CI is currently failing on this branch. Fix the failing " +
          "tests before pushing, or the failure will block the PR."
      };
    }
  }

  // Warn before editing files that have upstream changes
  if (toolName === "edit") {
    const gitState = streams.latest("git-watch");
    if (gitState?.behind > 0) {
      return {
        additionalContext:
          `Warning: your branch is ${gitState.behind} commits behind ` +
          `origin. The file you're editing may have upstream changes. ` +
          `Consider pulling first.`
      };
    }
  }
}
```
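
The gate above calls `isGitPush`, which this recipe does not define. One possible, deliberately naive version is sketched below. It is an assumption, not tap's implementation, and it ignores shell quoting, `sudo` or environment-variable prefixes, and git aliases:

```js
// Global git options that consume the following token as their value.
const OPTIONS_WITH_VALUE = new Set(["-C", "-c"]);

// Hypothetical helper: does any segment of a shell command line invoke `git push`?
function isGitPush(command) {
  // Split on common shell separators so `cd repo && git push` is caught.
  const segments = String(command).split(/&&|\|\||;|\|/);
  return segments.some((segment) => {
    const tokens = segment.trim().split(/\s+/);
    if (tokens[0] !== "git") return false;
    // Skip global options that appear before the subcommand.
    let i = 1;
    while (i < tokens.length && tokens[i].startsWith("-")) {
      i += OPTIONS_WITH_VALUE.has(tokens[i]) ? 2 : 1;
    }
    return tokens[i] === "push";
  });
}

console.log(isGitPush("cd repo && git push origin main"));
```

A stricter gate would likely parse the command with a real shell tokenizer; string matching like this is easy to bypass and is only meant to show where the check sits.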
|
|
145
|
+
|
|
146
|
+
### 5. Context-adaptive system prompt (transform callbacks)
|
|
147
|
+
|
|
148
|
+
```js
registerTransformCallbacks(new Map([
  ["code_change_rules", (current) => {
    const branch = streams.latest("git-watch")?.branch;
    const ciStatus = streams.latest("ci-watch")?.status;
    const deploying = streams.latest("deploy-watch")?.deploying;

    const additions = [];

    if (branch === "main" || branch === "master") {
      additions.push(
        "You are on the production branch. Require explicit user " +
        "confirmation before any file write. Suggest a feature branch."
      );
    }

    if (ciStatus === "failure") {
      additions.push(
        "CI is currently failing. Prioritize fixing tests over new features."
      );
    }

    if (deploying) {
      additions.push(
        "A production deploy is in progress. Do not suggest database " +
        "migrations or infrastructure changes until it completes."
      );
    }

    return additions.length > 0
      ? current + "\n\n" + additions.join("\n")
      : current;
  }]
]));
```
|
|
183
|
+
|
|
184
|
+
## Example scenarios
|
|
185
|
+
|
|
186
|
+
### Scenario A: The silent conflict
|
|
187
|
+
|
|
188
|
+
```
|
|
189
|
+
You're writing code on feature/auth (10 minutes in)
|
|
190
|
+
│
|
|
191
|
+
▼
|
|
192
|
+
Git emitter detects: branch is now 2 commits behind origin
|
|
193
|
+
│
|
|
194
|
+
▼
|
|
195
|
+
EventFilter: behind=[1-9] → surface
|
|
196
|
+
│
|
|
197
|
+
▼
|
|
198
|
+
Timeline shows: "※ tap: feature/auth is 2 commits behind origin"
|
|
199
|
+
│
|
|
200
|
+
▼
|
|
201
|
+
You keep working (it's just a surface, not an inject)
|
|
202
|
+
│
|
|
203
|
+
▼
|
|
204
|
+
5 minutes later, you ask Copilot to edit src/auth.ts
|
|
205
|
+
│
|
|
206
|
+
▼
|
|
207
|
+
onPreToolUse fires → checks git state → one of the upstream
|
|
208
|
+
commits touched src/auth.ts
|
|
209
|
+
│
|
|
210
|
+
▼
|
|
211
|
+
Copilot receives: "Warning: src/auth.ts was modified in an upstream
|
|
212
|
+
commit (abc123 by Alice, 7 min ago). Your edit may conflict.
|
|
213
|
+
Consider pulling first."
|
|
214
|
+
│
|
|
215
|
+
▼
|
|
216
|
+
You pull, resolve cleanly, then continue. Saved 20 minutes of
|
|
217
|
+
merge conflict debugging.
|
|
218
|
+
```
|
|
219
|
+
|
|
220
|
+
### Scenario B: The cascading failure
|
|
221
|
+
|
|
222
|
+
```
|
|
223
|
+
3 events arrive over 90 seconds:
|
|
224
|
+
|
|
225
|
+
2:01pm — deploy emitter: "v2.4.2 deployed to prod"
|
|
226
|
+
2:02pm — CI emitter: "staging pipeline failed: connection refused"
|
|
227
|
+
2:03pm — deploy emitter: "pod auth-service restart count: 4"
|
|
228
|
+
│
|
|
229
|
+
▼
|
|
230
|
+
Correlation PromptEmitter (idle) reads all streams:
|
|
231
|
+
│
|
|
232
|
+
▼
|
|
233
|
+
Injects: "Deploy v2.4.2 is causing auth-service crash loops
|
|
234
|
+
(4 restarts in 2 min). Staging CI is failing with connection
|
|
235
|
+
refused — likely same root cause. Consider rolling back."
|
|
236
|
+
│
|
|
237
|
+
▼
|
|
238
|
+
Meanwhile, transform callback has already added to system prompt:
|
|
239
|
+
"Production is degraded. Do not suggest changes to auth-service
|
|
240
|
+
configuration. Prioritize investigation and rollback."
|
|
241
|
+
│
|
|
242
|
+
▼
|
|
243
|
+
You say: "rollback" — Copilot already knows the context,
|
|
244
|
+
runs the rollback command immediately.
|
|
245
|
+
```
|
|
246
|
+
|
|
247
|
+
### Scenario C: The preemptive gate
|
|
248
|
+
|
|
249
|
+
```
|
|
250
|
+
You ask Copilot: "push my changes"
|
|
251
|
+
│
|
|
252
|
+
▼
|
|
253
|
+
onPreToolUse fires for shell(git push)
|
|
254
|
+
│
|
|
255
|
+
▼
|
|
256
|
+
Guardian checks:
|
|
257
|
+
✗ CI status: failure (test/auth.spec.ts)
|
|
258
|
+
✗ Uncommitted files: 2 files not in this branch's scope
|
|
259
|
+
✓ Branch: feature/auth (not main)
|
|
260
|
+
✓ No deploy in progress
|
|
261
|
+
│
|
|
262
|
+
▼
|
|
263
|
+
Returns: permissionDecision: "deny"
|
|
264
|
+
reason: "CI is failing on test/auth.spec.ts (your branch).
|
|
265
|
+
Also, you have 2 uncommitted files (config.json,
|
|
266
|
+
.env.local) that aren't related to this PR.
|
|
267
|
+
Fix the test first, then stash or commit the
|
|
268
|
+
unrelated files."
|
|
269
|
+
│
|
|
270
|
+
▼
|
|
271
|
+
Copilot: "I can't push right now — CI is failing and you
|
|
272
|
+
have unrelated uncommitted files. Want me to fix the
|
|
273
|
+
failing test first?"
|
|
274
|
+
```
|
|
275
|
+
|
|
276
|
+
## Configuration
|
|
277
|
+
|
|
278
|
+
In `tap.config.json`:
|
|
279
|
+
|
|
280
|
+
```json
|
|
281
|
+
{
|
|
282
|
+
"guardian": {
|
|
283
|
+
"emitters": {
|
|
284
|
+
"git": { "every": "30s", "enabled": true },
|
|
285
|
+
"ci": { "every": "60s", "enabled": true },
|
|
286
|
+
"deploy": { "command": "kubectl get pods ...", "every": "60s", "enabled": false }
|
|
287
|
+
},
|
|
288
|
+
"correlation": { "schedule": "idle", "enabled": true },
|
|
289
|
+
"gating": {
|
|
290
|
+
"blockPushOnCIFailure": true,
|
|
291
|
+
"warnOnUpstreamChanges": true,
|
|
292
|
+
"blockMainBranchWrites": true
|
|
293
|
+
}
|
|
294
|
+
}
|
|
295
|
+
}
|
|
296
|
+
```
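
The `every` and `schedule` values above are short strings such as `"30s"`, `"60s"`, and `"idle"`. A small sketch of how they might be normalized to milliseconds follows. `parseInterval` is a hypothetical name, and support for an `h` suffix is an assumption that goes beyond the values shown in this recipe:

```js
// Milliseconds per unit suffix ("h" is an assumption; the recipe only shows "s").
const UNIT_MS = { s: 1000, m: 60 * 1000, h: 60 * 60 * 1000 };

// Hypothetical helper: convert "30s" style values to milliseconds.
function parseInterval(value) {
  if (value === "idle") return null; // idle schedules have no fixed period
  const match = /^(\d+)([smh])$/.exec(value);
  if (!match) throw new Error(`Unsupported interval: ${value}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

console.log(parseInterval("30s"), parseInterval("idle"));
```

Rejecting unknown formats up front surfaces a typo in `tap.config.json` at load time instead of as an emitter that silently never fires.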
|
|
297
|
+
|
|
298
|
+
## Phased delivery
|
|
299
|
+
|
|
300
|
+
| Phase | Scope |
|
|
301
|
+
|---|---|
|
|
302
|
+
| **1. Git + CI emitters** | Two CommandEmitters with EventFilter rules, surface/inject thresholds |
|
|
303
|
+
| **2. onPreToolUse gating** | Block push on CI failure, warn on upstream conflicts |
|
|
304
|
+
| **3. Transform callbacks** | Context-adaptive system prompt based on branch/CI/deploy state |
|
|
305
|
+
| **4. Correlation engine** | PromptEmitter that reads across streams and synthesizes |
|
|
306
|
+
| **5. Configuration** | Per-project guardian config in tap.config.json |
|
|
307
|
+
|
|
308
|
+
## Open questions
|
|
309
|
+
|
|
310
|
+
- **Polling frequency** — 30s for git, 60s for CI? Configurable per project?
|
|
311
|
+
- **Gate strictness** — deny vs. warn? Should the user be able to override gates?
|
|
312
|
+
- **Correlation prompt** — how to keep it cheap (token-wise) while effective?
|
|
313
|
+
- **Multi-repo** — does the guardian follow you across repos, or reset per project?
|
|
314
|
+
- **Override mechanism** — `--force` style escape hatch for gates?
|
|
@@ -0,0 +1,162 @@
|
|
|
1
|
+
# Recipe: Browser Bridge — Copilot CLI ↔ Live Web Pages
|
|
2
|
+
|
|
3
|
+
Connect Copilot CLI to any browser tab via a local WebSocket relay and [Detour](https://chromewebstore.google.com/detail/detour/cinkplogkjggmgdkaflhlemcdhchninp) (a Chrome extension that injects scripts into pages).
|
|
4
|
+
|
|
5
|
+
## How it works
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
Copilot CLI (※ tap) ◄─ws─► Bridge Server ◄─ws─► Injected JS (via Detour)
|
|
9
|
+
ws://localhost:9400 running in page MAIN world
|
|
10
|
+
```
|
|
11
|
+
|
|
12
|
+
1. A standalone **bridge server** runs locally on a WebSocket port.
|
|
13
|
+
2. **Detour injects a client script** into target pages — no changes to Detour needed. Detour already runs arbitrary JS in the MAIN world and bypasses CSP. The bridge client is just another script it injects.
|
|
14
|
+
3. **tap dynamically registers tools** when the bridge connects via `session.registerTools()` — Copilot sees browser tools appear and disappear as the bridge connects/disconnects.
|
|
15
|
+
4. **Push events** (console, annotations) flow from the browser through a tap emitter into the Copilot session.
|
|
16
|
+
|
|
17
|
+
## Architecture
|
|
18
|
+
|
|
19
|
+
### Bridge server (standalone)
|
|
20
|
+
|
|
21
|
+
A minimal Node.js WebSocket relay. Clients self-identify as `agent` (Copilot) or `browser` (injected page script). The bridge routes messages between them.
|
|
22
|
+
|
|
23
|
+
```
|
|
24
|
+
npx copilot-bridge
|
|
25
|
+
# or
|
|
26
|
+
node bridge/server.mjs
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
Zero knowledge of tap or Detour — it just relays JSON.
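
The routing rule can be sketched without any WebSocket code. The version below assumes at most one client per role and treats messages as opaque strings; `createRelay` is a hypothetical name, and a real server would call `register` after the `hello` handshake and `route` on every incoming frame:

```js
// Hypothetical relay core: forwards raw JSON between the "agent" and "browser" roles.
function createRelay() {
  const peers = new Map(); // role -> send function

  return {
    register(role, send) {
      peers.set(role, send);
    },
    unregister(role) {
      peers.delete(role);
    },
    // Forward a raw message to the opposite role without inspecting its payload.
    route(fromRole, rawJson) {
      const target = fromRole === "agent" ? "browser" : "agent";
      const send = peers.get(target);
      if (!send) return false; // the other side is not connected
      send(rawJson);
      return true;
    },
  };
}

const relay = createRelay();
relay.register("browser", (msg) => console.log("to browser:", msg));
relay.route("agent", JSON.stringify({ type: "request", id: "r1", action: "screenshot", params: {} }));
```

Keeping the relay unaware of message contents is what lets the same bridge serve new actions without changes.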
|
|
30
|
+
|
|
31
|
+
### Injected script (via Detour)
|
|
32
|
+
|
|
33
|
+
A self-contained JS file hosted locally or on a CDN. Added to Detour as a script injection rule on target pages. It:
|
|
34
|
+
|
|
35
|
+
- Connects to `ws://localhost:9400`
|
|
36
|
+
- Identifies as `browser`
|
|
37
|
+
- Handles action requests (screenshot, DOM query, JS exec)
|
|
38
|
+
- Pushes events (console, annotations) to the bridge
|
|
39
|
+
|
|
40
|
+
### tap integration — dynamic tool registration
|
|
41
|
+
|
|
42
|
+
The Copilot SDK supports `session.registerTools()` at runtime (`CopilotSession.registerTools`). tap doesn't need to predefine browser tools — it registers them when the bridge connects and removes them when it disconnects.
|
|
43
|
+
|
|
44
|
+
```js
|
|
45
|
+
// When bridge connects and browser is available:
|
|
46
|
+
session.registerTools([
|
|
47
|
+
...existingTapTools,
|
|
48
|
+
{
|
|
49
|
+
name: "browser_screenshot",
|
|
50
|
+
description: "Capture the visible browser viewport as a PNG screenshot",
|
|
51
|
+
handler: async () => bridge.request("screenshot")
|
|
52
|
+
},
|
|
53
|
+
{
|
|
54
|
+
name: "browser_exec",
|
|
55
|
+
description: "Execute JS in the page MAIN world, return result",
|
|
56
|
+
parameters: { type: "object", properties: { js: { type: "string" } }, required: ["js"] },
|
|
57
|
+
handler: async ({ js }) => bridge.request("js.exec", { js })
|
|
58
|
+
}
|
|
59
|
+
]);
|
|
60
|
+
|
|
61
|
+
// When bridge disconnects — re-register without browser tools:
|
|
62
|
+
session.registerTools(existingTapTools);
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
This pattern generalizes: any WebSocket-connected service can surface tools into Copilot at runtime — not just the browser bridge. The bridge announces what actions the connected browser supports, and tap materializes them as tools.
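
One way to picture "materializing" announced actions as tools is a plain mapping from action names to tool definitions. The sketch below is an assumption throughout: `toolsFromActions` is a hypothetical name, and the `browser_` prefix with dots replaced by underscores is an illustrative naming scheme, not the one tap uses:

```js
// Hypothetical helper: build tool definitions from the action names a bridge announces.
function toolsFromActions(actions, bridge) {
  return actions.map((action) => ({
    // Illustrative naming: "dom.query" becomes "browser_dom_query".
    name: "browser_" + action.replace(/\./g, "_"),
    description: `Run the "${action}" action in the connected browser`,
    handler: async (params = {}) => bridge.request(action, params),
  }));
}

// A stand-in bridge that records requests instead of sending them.
const recordedRequests = [];
const fakeBridge = {
  request: async (action, params) => {
    recordedRequests.push({ action, params });
    return { ok: true };
  },
};

const browserTools = toolsFromActions(["screenshot", "dom.query"], fakeBridge);
console.log(browserTools.map((tool) => tool.name));
```

The resulting array could then be combined with the existing tap tools and passed to `session.registerTools()`, as in the snippet earlier in this section.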
|
|
66
|
+
|
|
67
|
+
## Protocol
|
|
68
|
+
|
|
69
|
+
JSON over WebSocket. Request/response with correlation IDs.
|
|
70
|
+
|
|
71
|
+
### Handshake
|
|
72
|
+
|
|
73
|
+
```json
|
|
74
|
+
{ "type": "hello", "role": "agent", "name": "copilot-tap" }
|
|
75
|
+
{ "type": "hello", "role": "browser", "name": "detour-bridge-client" }
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
### Request → Response
|
|
79
|
+
|
|
80
|
+
```json
|
|
81
|
+
// agent sends
|
|
82
|
+
{ "type": "request", "id": "r1", "action": "screenshot", "params": {} }
|
|
83
|
+
|
|
84
|
+
// browser responds
|
|
85
|
+
{ "type": "response", "id": "r1", "data": { "image": "data:image/png;base64,..." } }
|
|
86
|
+
```
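
The correlation ID is what lets the agent match a response to the request that caused it. A minimal sketch of the agent-side bookkeeping, assuming messages travel as JSON strings through a caller-supplied `send` function; `createRequester` is a hypothetical name, and timeouts and error responses are left out:

```js
// Hypothetical agent-side helper: pairs responses with pending requests by id.
function createRequester(send) {
  const pending = new Map(); // id -> resolve function
  let nextId = 1;

  return {
    request(action, params = {}) {
      const id = `r${nextId++}`;
      // The executor runs synchronously, so the entry exists before send() is called.
      const promise = new Promise((resolve) => pending.set(id, resolve));
      send(JSON.stringify({ type: "request", id, action, params }));
      return promise;
    },
    // Returns true when the message settled a pending request.
    handleMessage(raw) {
      const msg = JSON.parse(raw);
      if (msg.type !== "response" || !pending.has(msg.id)) return false;
      pending.get(msg.id)(msg.data);
      pending.delete(msg.id);
      return true;
    },
    pendingCount: () => pending.size,
  };
}

const outbox = [];
const requester = createRequester((raw) => outbox.push(raw));
requester.request("screenshot").then((data) => console.log("got", Object.keys(data)));
requester.handleMessage(JSON.stringify({ type: "response", id: "r1", data: { image: "..." } }));
```

Unsolicited `push` messages carry no `id`, so `handleMessage` returns `false` for them and they can be handed to a separate path such as a tap emitter.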
|
|
87
|
+
|
|
88
|
+
### Push (browser → agent, unsolicited)
|
|
89
|
+
|
|
90
|
+
```json
|
|
91
|
+
{ "type": "push", "action": "comment", "data": { "text": "Fix this button", "selector": "#submit-btn", "url": "https://..." } }
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
## Actions
|
|
95
|
+
|
|
96
|
+
| Action | Direction | What it does |
|
|
97
|
+
|---|---|---|
|
|
98
|
+
| `screenshot` | agent → browser | `html2canvas` or Canvas API capture of viewport |
|
|
99
|
+
| `dom.query` | agent → browser | `querySelector` → outerHTML, textContent, attributes |
|
|
100
|
+
| `dom.react` | agent → browser | React fiber walk → component name, file, line, props |
|
|
101
|
+
| `js.exec` | agent → browser | Run arbitrary JS in page context, return result |
|
|
102
|
+
| `page.info` | agent → browser | URL, title, meta, `document.readyState` |
|
|
103
|
+
| `comment` | browser → agent | User annotation from page → Copilot session |
|
|
104
|
+
| `console` | browser → agent | Intercepted `console.*` calls → tap emitter |
|
|
105
|
+
| `navigate` | agent → browser | `window.location.href = url` |
|
|
106
|
+
|
|
107
|
+
## Use cases
|
|
108
|
+
|
|
109
|
+
### Get a screenshot into Copilot
|
|
110
|
+
|
|
111
|
+
```
|
|
112
|
+
> take a screenshot of the current page
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
tap calls `tap_browser_screenshot` → bridge → injected script captures viewport → base64 flows back → Copilot sees the image.
|
|
116
|
+
|
|
117
|
+
### React component context (like react-grab)
|
|
118
|
+
|
|
119
|
+
```
|
|
120
|
+
> what React component renders the sidebar?
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
tap calls `tap_browser_query` with a selector or `tap_browser_react_context` → walks React fiber tree → returns component name, source file, line number, props → Copilot has full context without searching the codebase.
|
|
124
|
+
|
|
125
|
+
### Live console monitoring
|
|
126
|
+
|
|
127
|
+
A tap CommandEmitter connects to the bridge and streams `console` push events. EventFilter drops noise, injects errors:
|
|
128
|
+
|
|
129
|
+
```json
|
|
130
|
+
[
  { "match": "error|warn|uncaught", "outcome": "inject" },
  { "match": ".*", "outcome": "keep" }
]
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
### Page annotations → Copilot
|
|
135
|
+
|
|
136
|
+
The user selects an element on the page and types a comment. The injected script pushes it to the bridge → tap injects it into the Copilot session. This works like react-grab, but the context goes straight into the conversation instead of the clipboard.
|
|
137
|
+
|
|
138
|
+
### Copilot drives the browser
|
|
139
|
+
|
|
140
|
+
```
|
|
141
|
+
> click the submit button and tell me what happens
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
tap calls `tap_browser_exec` with `document.querySelector('#submit').click()` → injected script runs it → returns result or captures DOM changes.
|
|
145
|
+
|
|
146
|
+
## Phased delivery
|
|
147
|
+
|
|
148
|
+
| Phase | Scope |
|
|
149
|
+
|---|---|
|
|
150
|
+
| **1. Prove the round-trip** | Bridge server + injected client script + `screenshot` action + one tap tool |
|
|
151
|
+
| **2. DOM + React context** | `dom.query`, `dom.react`, `page.info` actions and tap tools |
|
|
152
|
+
| **3. Bidirectional** | `js.exec`, `comment` push, `console` push, `navigate` |
|
|
153
|
+
| **4. Polish** | Auto-reconnect, multi-tab targeting, annotation overlay UI, error handling |
|
|
154
|
+
|
|
155
|
+
## Open questions
|
|
156
|
+
|
|
157
|
+
- **Bridge as npm package?** `npx copilot-bridge` or should tap auto-start it?
|
|
158
|
+
- **Multi-tab** — target active tab by default, allow tab ID targeting?
|
|
159
|
+
- **Screenshot method** — `html2canvas` (full fidelity) vs Canvas API (faster)?
|
|
160
|
+
- **Security** — localhost-only binding, optional shared secret?
|
|
161
|
+
- **Image delivery** — base64 inline vs write to temp file and return path?
|
|
162
|
+
- **React context** — bundle react-grab extraction logic or write lightweight version?
|
|
@@ -0,0 +1,136 @@
|
|
|
1
|
+
# Codex Goals lessons for `/tap-goal`
|
|
2
|
+
|
|
3
|
+
This recipe records the design lessons borrowed from OpenAI's
|
|
4
|
+
[Using Goals in Codex](https://developers.openai.com/cookbook/examples/codex/using_goals_in_codex)
|
|
5
|
+
guide and maps them to ※ tap's `/tap-goal` skill.
|
|
6
|
+
|
|
7
|
+
## Core lesson
|
|
8
|
+
|
|
9
|
+
A goal is a **completion contract** attached to the current thread:
|
|
10
|
+
|
|
11
|
+
```text
|
|
12
|
+
work -> check evidence -> continue, complete, or stop blocked
|
|
13
|
+
```
|
|
14
|
+
|
|
15
|
+
It is not open-ended background autonomy. The objective persists, but evidence
|
|
16
|
+
decides whether work is complete.
|
|
17
|
+
|
|
18
|
+
## Strong goal contract
|
|
19
|
+
|
|
20
|
+
Before starting a goal loop, make these fields explicit:
|
|
21
|
+
|
|
22
|
+
| Field | Purpose |
|
|
23
|
+
| --- | --- |
|
|
24
|
+
| Outcome | Desired end state |
|
|
25
|
+
| Verification surface | Test, benchmark, command output, artifact, source material, or report that proves completion |
|
|
26
|
+
| Constraints | What must not regress |
|
|
27
|
+
| Boundaries | Files, tools, data, repositories, or resources in scope |
|
|
28
|
+
| Iteration policy | How to choose the next experiment/action after each attempt |
|
|
29
|
+
| Blocked stop condition | When to stop, what to report, and what would unlock progress |
|
|
30
|
+
|
|
31
|
+
Weak:
|
|
32
|
+
|
|
33
|
+
```text
|
|
34
|
+
/tap-goal improve performance
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
Strong:
|
|
38
|
+
|
|
39
|
+
```text
|
|
40
|
+
/tap-goal Reduce p95 checkout latency below 120 ms, verified by the checkout benchmark,
|
|
41
|
+
while keeping the correctness suite green. Use only checkout service files,
|
|
42
|
+
benchmark fixtures, and related tests. Between iterations, record what changed,
|
|
43
|
+
what the benchmark showed, and the next best experiment. If blocked, stop with
|
|
44
|
+
attempted paths, evidence, blocker, and next input needed.
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
## Runtime mapping in tap
|
|
48
|
+
|
|
49
|
+
Codex Goals continue at safe idle boundaries. Tap supports that as:
|
|
50
|
+
|
|
51
|
+
```json
|
|
52
|
+
{ "prompt": "...", "every": "idle", "maxRuns": 50 }
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
Copilot CLI autopilot can keep a session continuously busy, so `/tap-goal` also
|
|
56
|
+
supports timed autopilot-compatible goals:
|
|
57
|
+
|
|
58
|
+
```json
|
|
59
|
+
{ "prompt": "...", "everySchedule": ["2m", "5m", "10m"], "maxRuns": 50 }
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
Timed prompt sends that are deferred because the session is busy do not consume
|
|
63
|
+
the real iteration budget.
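
The budget rule can be sketched as a counter that only advances when a prompt is actually sent. This is an illustration of the rule as stated, not tap's scheduler; `createGoalBudget` and the three return values are hypothetical:

```js
// Hypothetical budget tracker: deferred sends do not count against maxRuns.
function createGoalBudget(maxRuns) {
  let runs = 0;

  return {
    // Called on each timer tick; `busy` says whether the session is mid-turn.
    tick(busy) {
      if (runs >= maxRuns) return "budget-limited";
      if (busy) return "deferred"; // try again on the next tick, budget untouched
      runs += 1;
      return "sent";
    },
    runs: () => runs,
  };
}

const budget = createGoalBudget(2);
console.log(budget.tick(true), budget.tick(false), budget.runs());
```

Without this distinction, a long autopilot turn could burn through `maxRuns` while the goal prompt was never delivered.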
|
|
64
|
+
|
|
65
|
+
## Evidence audit before completion
|
|
66
|
+
|
|
67
|
+
Before a goal stops as complete, the prompt must record:
|
|
68
|
+
|
|
69
|
+
```text
|
|
70
|
+
GOAL COMPLETE
|
|
71
|
+
Verification surface checked: <specific evidence>
|
|
72
|
+
Result observed: <what it showed>
|
|
73
|
+
Constraints checked: <what did not regress>
|
|
74
|
+
Conclusion: complete
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
If the verification surface cannot be checked, the goal is blocked, not
|
|
78
|
+
complete.
|
|
79
|
+
|
|
80
|
+
## Iteration ledger
|
|
81
|
+
|
|
82
|
+
Each iteration should post a structured EventStream note with `tap_post`:
|
|
83
|
+
|
|
84
|
+
```text
|
|
85
|
+
ITERATION RECORD
|
|
86
|
+
Iteration: <runs> of <maxRuns>
|
|
87
|
+
Action taken: <smallest useful action>
|
|
88
|
+
Evidence checked: <test/output/artifact/result>
|
|
89
|
+
Status: progressing | complete | blocked | budget-limited
|
|
90
|
+
Next best action: <next step>
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
This makes the EventStream an audit trail rather than only a notification log.
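
A small formatter keeps every record in the same shape, which is what makes the stream usable as an audit trail. The sketch below only builds the text; `formatIterationRecord` and its field names are hypothetical, and how the result is passed to `tap_post` is not shown because that call's signature is not given here:

```js
const ITERATION_STATUSES = ["progressing", "complete", "blocked", "budget-limited"];

// Hypothetical helper: render one iteration as the structured note shown above.
function formatIterationRecord({ runs, maxRuns, action, evidence, status, next }) {
  if (!ITERATION_STATUSES.includes(status)) {
    throw new Error(`Unknown status: ${status}`);
  }
  return [
    "ITERATION RECORD",
    `Iteration: ${runs} of ${maxRuns}`,
    `Action taken: ${action}`,
    `Evidence checked: ${evidence}`,
    `Status: ${status}`,
    `Next best action: ${next}`,
  ].join("\n");
}

const record = formatIterationRecord({
  runs: 3,
  maxRuns: 50,
  action: "cached the cart lookup",
  evidence: "checkout benchmark output",
  status: "progressing",
  next: "profile the payment call",
});
console.log(record);
```

Validating `status` against the four allowed values stops a free-form status from slipping into the ledger.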
|
|
94
|
+
|
|
95
|
+
## Research and reproduction goals
|
|
96
|
+
|
|
97
|
+
For research goals, maintain a claim ledger:
|
|
98
|
+
|
|
99
|
+
```text
|
|
100
|
+
Claim: <specific claim>
|
|
101
|
+
Route: <how it was tested>
|
|
102
|
+
Evidence surface: <what was checked>
|
|
103
|
+
Status: confirmed | approximate-support | blocked | uncertain
|
|
104
|
+
Remaining uncertainty: <what is missing>
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
The final output should preserve epistemic levels instead of flattening partial
|
|
108
|
+
support into success.
|
|
109
|
+
|
|
110
|
+
## Figure lessons from the Codex guide
|
|
111
|
+
|
|
112
|
+
The guide's figures reinforce these workflow rules:
|
|
113
|
+
|
|
114
|
+
1. A goal turns a one-turn exchange into an evidence-checked continuation loop.
|
|
115
|
+
2. Goal state is thread-scoped and includes durable state, continuation,
|
|
116
|
+
controls, and evidence checks.
|
|
117
|
+
3. Continuation is gated: active goal, idle thread, and no queued user input.
|
|
118
|
+
4. Strong goals visibly name end state, verification surface, and constraints.
|
|
119
|
+
5. Research goals decompose source claims into evidence channels before status.
|
|
120
|
+
6. Final research output preserves confirmed, approximate, blocked, and
|
|
121
|
+
uncertain support levels.
|
|
122
|
+
7. The UI example shows goal mode as an explicit command/input affordance rather
|
|
123
|
+
than hidden background work.
|
|
124
|
+
|
|
125
|
+
## Budget handling
|
|
126
|
+
|
|
127
|
+
`maxRuns` is a safety budget. Reaching it means "budget-limited handoff," not
|
|
128
|
+
"goal complete." The final budget-limited iteration should post:
|
|
129
|
+
|
|
130
|
+
```text
|
|
131
|
+
BUDGET LIMITED
|
|
132
|
+
Progress: <what was achieved>
|
|
133
|
+
Evidence gathered: <what is known>
|
|
134
|
+
Remaining work: <what is not done>
|
|
135
|
+
Recommended next goal/budget: <next invocation>
|
|
136
|
+
```
|