browser-automation-skill 0.71.0
- package/LICENSE +21 -0
- package/README.md +144 -0
- package/SECURITY.md +39 -0
- package/SKILL.md +206 -0
- package/bin/cli.mjs +55 -0
- package/install.sh +143 -0
- package/package.json +54 -0
- package/references/adapter-candidates.md +40 -0
- package/references/browser-mcp-cheatsheet.md +132 -0
- package/references/browser-stats-cheatsheet.md +155 -0
- package/references/chrome-devtools-mcp-cheatsheet.md +232 -0
- package/references/midscene-integration.md +359 -0
- package/references/obscura-cheatsheet.md +103 -0
- package/references/playwright-cli-cheatsheet.md +64 -0
- package/references/playwright-lib-cheatsheet.md +90 -0
- package/references/recipes/add-a-tool-adapter.md +134 -0
- package/references/recipes/agent-workflows/README.md +37 -0
- package/references/recipes/agent-workflows/cache-driven-bulk-operation.md +110 -0
- package/references/recipes/agent-workflows/flow-record-and-replay.md +102 -0
- package/references/recipes/agent-workflows/incremental-pattern-discovery.md +125 -0
- package/references/recipes/agent-workflows/login-then-scrape.md +100 -0
- package/references/recipes/anti-patterns-tool-extension.md +182 -0
- package/references/recipes/body-bytes-not-body.md +139 -0
- package/references/recipes/cache-write-security.md +210 -0
- package/references/recipes/fingerprint-rescue.md +154 -0
- package/references/recipes/model-routing.md +143 -0
- package/references/recipes/path-security.md +138 -0
- package/references/recipes/privacy-canary.md +96 -0
- package/references/recipes/visual-rescue-hook.md +182 -0
- package/references/stats-prices.json +42 -0
- package/references/stats-schema.json +77 -0
- package/references/tool-versions.md +8 -0
- package/scripts/browser-add-site.sh +113 -0
- package/scripts/browser-assert.sh +106 -0
- package/scripts/browser-audit.sh +68 -0
- package/scripts/browser-baseline.sh +135 -0
- package/scripts/browser-click.sh +100 -0
- package/scripts/browser-creds-add.sh +254 -0
- package/scripts/browser-creds-list.sh +67 -0
- package/scripts/browser-creds-migrate.sh +122 -0
- package/scripts/browser-creds-remove.sh +69 -0
- package/scripts/browser-creds-rotate-totp.sh +109 -0
- package/scripts/browser-creds-show.sh +82 -0
- package/scripts/browser-creds-totp.sh +94 -0
- package/scripts/browser-do.sh +630 -0
- package/scripts/browser-doctor.sh +365 -0
- package/scripts/browser-drag.sh +90 -0
- package/scripts/browser-extract.sh +192 -0
- package/scripts/browser-fill.sh +142 -0
- package/scripts/browser-flow.sh +316 -0
- package/scripts/browser-history.sh +187 -0
- package/scripts/browser-hover.sh +92 -0
- package/scripts/browser-inspect.sh +188 -0
- package/scripts/browser-list-sessions.sh +78 -0
- package/scripts/browser-list-sites.sh +42 -0
- package/scripts/browser-login.sh +279 -0
- package/scripts/browser-mcp.sh +65 -0
- package/scripts/browser-migrate.sh +195 -0
- package/scripts/browser-open.sh +134 -0
- package/scripts/browser-press.sh +80 -0
- package/scripts/browser-remove-session.sh +72 -0
- package/scripts/browser-remove-site.sh +68 -0
- package/scripts/browser-replay.sh +206 -0
- package/scripts/browser-route.sh +174 -0
- package/scripts/browser-select.sh +122 -0
- package/scripts/browser-show-session.sh +57 -0
- package/scripts/browser-show-site.sh +37 -0
- package/scripts/browser-snapshot.sh +176 -0
- package/scripts/browser-stats.sh +522 -0
- package/scripts/browser-tab-close.sh +112 -0
- package/scripts/browser-tab-list.sh +70 -0
- package/scripts/browser-tab-switch.sh +111 -0
- package/scripts/browser-upload.sh +132 -0
- package/scripts/browser-use.sh +60 -0
- package/scripts/browser-vlm.sh +707 -0
- package/scripts/browser-wait.sh +97 -0
- package/scripts/install-git-hooks.sh +16 -0
- package/scripts/lib/capture.sh +356 -0
- package/scripts/lib/common.sh +262 -0
- package/scripts/lib/credential.sh +237 -0
- package/scripts/lib/fingerprint-rescue.js +123 -0
- package/scripts/lib/flow.sh +448 -0
- package/scripts/lib/flow_record.sh +210 -0
- package/scripts/lib/mask.sh +49 -0
- package/scripts/lib/memory.sh +427 -0
- package/scripts/lib/migrate.sh +390 -0
- package/scripts/lib/migrators/README.md +23 -0
- package/scripts/lib/migrators/memory/v1_to_v2.sh +15 -0
- package/scripts/lib/migrators/recent_urls/README.md +13 -0
- package/scripts/lib/migrators/stats/README.md +24 -0
- package/scripts/lib/node/chrome-devtools-bridge.mjs +1812 -0
- package/scripts/lib/node/mcp-server.mjs +531 -0
- package/scripts/lib/node/mcp-tools.json +68 -0
- package/scripts/lib/node/playwright-driver.mjs +1104 -0
- package/scripts/lib/node/totp-core.mjs +52 -0
- package/scripts/lib/node/totp.mjs +52 -0
- package/scripts/lib/node/url-pattern-cluster.mjs +102 -0
- package/scripts/lib/node/url-pattern-resolver.mjs +77 -0
- package/scripts/lib/output.sh +79 -0
- package/scripts/lib/router.sh +342 -0
- package/scripts/lib/sanitize.sh +107 -0
- package/scripts/lib/secret/keychain.sh +91 -0
- package/scripts/lib/secret/libsecret.sh +74 -0
- package/scripts/lib/secret/plaintext.sh +75 -0
- package/scripts/lib/secret_backend_select.sh +57 -0
- package/scripts/lib/session.sh +153 -0
- package/scripts/lib/site.sh +126 -0
- package/scripts/lib/stats.sh +419 -0
- package/scripts/lib/tool/.gitkeep +0 -0
- package/scripts/lib/tool/chrome-devtools-mcp.sh +349 -0
- package/scripts/lib/tool/obscura.sh +249 -0
- package/scripts/lib/tool/playwright-cli.sh +155 -0
- package/scripts/lib/tool/playwright-lib.sh +106 -0
- package/scripts/lib/verb_helpers.sh +222 -0
- package/scripts/lib/visual-rescue-default.sh +145 -0
- package/scripts/regenerate-docs.sh +99 -0
- package/uninstall.sh +51 -0
|
@@ -0,0 +1,182 @@
|
|
|
1
|
+
# Anti-patterns: tool extension
|
|
2
|
+
|
|
3
|
+
When adding or modifying a tool adapter, these are the temptations to resist. Each is a real pattern an early contributor will reach for, with a clear WRONG / RIGHT shape and the SOLID-principle reasoning.
|
|
4
|
+
|
|
5
|
+
## AP-1: Don't add adapter-specific checks to `browser-doctor.sh`
|
|
6
|
+
|
|
7
|
+
```bash
|
|
8
|
+
# WRONG — editing core to teach it about a new tool
|
|
9
|
+
# scripts/browser-doctor.sh
|
|
10
|
+
if ! command -v puppeteer >/dev/null 2>&1; then
|
|
11
|
+
warn "puppeteer not on PATH"
|
|
12
|
+
problems=$((problems+1))
|
|
13
|
+
fi
|
|
14
|
+
```
|
|
15
|
+
|
|
16
|
+
```bash
|
|
17
|
+
# RIGHT — adapter declares its own check; doctor aggregates
|
|
18
|
+
# scripts/lib/tool/puppeteer.sh
|
|
19
|
+
tool_doctor_check() {
|
|
20
|
+
if command -v puppeteer >/dev/null 2>&1; then
|
|
21
|
+
printf '{"ok":true,"binary":"puppeteer","version":"%s"}\n' "$(puppeteer --version)"
|
|
22
|
+
else
|
|
23
|
+
cat <<'EOF'
|
|
24
|
+
{ "ok": false, "binary": "puppeteer", "error": "not on PATH",
|
|
25
|
+
"install_hint": "npm i -g puppeteer" }
|
|
26
|
+
EOF
|
|
27
|
+
fi
|
|
28
|
+
}
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
**SOLID:** SRP — `browser-doctor.sh` knows about framework-level state; `lib/tool/<tool>.sh` knows about its own binary. OCP — adding `puppeteer.sh` should never force an edit to a file in `scripts/` outside `lib/tool/`.
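A minimal sketch of the aggregation side, assuming the doctor simply sources each adapter in its own subshell and calls its `tool_doctor_check`. The function `doctor_aggregate` and the demo adapters `alpha` and `beta` are hypothetical names for illustration; the package's real `browser-doctor.sh` may work differently.

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not the package's browser-doctor.sh.
# doctor_aggregate and the demo adapters (alpha, beta) are invented for illustration.

doctor_aggregate() {
  local tool_dir="$1" adapter out failed=0
  for adapter in "${tool_dir}"/*.sh; do
    [ -f "${adapter}" ] || continue
    # One subshell per adapter: its functions and globals vanish afterwards.
    out="$( . "${adapter}" && tool_doctor_check )" \
      || out='{"ok":false,"error":"check failed to run"}'
    printf '%s\t%s\n' "$(basename "${adapter}" .sh)" "${out}"
    # Naive string match keeps the sketch dependency-free; real code would parse JSON.
    case "${out}" in
      *'"ok":true'*) ;;
      *) failed=1 ;;
    esac
  done
  return "${failed}"
}

# Demo: two throwaway adapters, one healthy and one not.
demo_dir="$(mktemp -d)"
cat > "${demo_dir}/alpha.sh" <<'EOF'
tool_doctor_check() { printf '{"ok":true,"binary":"alpha"}\n'; }
EOF
cat > "${demo_dir}/beta.sh" <<'EOF'
tool_doctor_check() { printf '{"ok":false,"binary":"beta","error":"not on PATH"}\n'; }
EOF

if report="$(doctor_aggregate "${demo_dir}")"; then doctor_rc=0; else doctor_rc=$?; fi
printf '%s\n' "${report}"
rm -rf "${demo_dir}"
```

Adding a third adapter file changes the report without touching `doctor_aggregate`, which is the OCP property the rule is protecting.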
|
|
32
|
+
|
|
33
|
+
## AP-2: Don't cross-call between adapters
|
|
34
|
+
|
|
35
|
+
```bash
|
|
36
|
+
# WRONG — adapter reaching into another adapter
|
|
37
|
+
# scripts/lib/tool/obscura.sh
|
|
38
|
+
tool_inspect() {
|
|
39
|
+
source "${LIB_TOOL_DIR}/playwright-cli.sh"
|
|
40
|
+
tool_inspect "$@"
|
|
41
|
+
}
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
```bash
|
|
45
|
+
# RIGHT — shared logic factors into a helper module
|
|
46
|
+
# scripts/lib/inspect_helpers.sh (NEW shared lib — sibling to lib/tool/)
|
|
47
|
+
inspect_collect_console() { :; }
|
|
48
|
+
|
|
49
|
+
# Both adapters source the helper; neither sources the other.
|
|
50
|
+
# scripts/lib/tool/obscura.sh
|
|
51
|
+
source "${BROWSER_SKILL_LIB}/inspect_helpers.sh"
|
|
52
|
+
tool_inspect() { inspect_collect_console "$@"; }
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
**SOLID:** SRP + DIP — adapters are leaves. If two adapters need the same behavior, that behavior is a *shared concern* and lives in `scripts/lib/` (sibling to `lib/tool/`). Without this rule, the dependency graph stops being a tree.
|
|
56
|
+
|
|
57
|
+
## AP-3: Don't declare routing precedence in an adapter (the Z-hybrid line)
|
|
58
|
+
|
|
59
|
+
```bash
|
|
60
|
+
# WRONG — adapter trying to claim "I'm the default for verb=audit"
|
|
61
|
+
# scripts/lib/tool/puppeteer.sh
|
|
62
|
+
tool_default_routes() {
|
|
63
|
+
cat <<'EOF'
|
|
64
|
+
{ "audit": { "priority": 100 } }
|
|
65
|
+
EOF
|
|
66
|
+
}
|
|
67
|
+
```
|
|
68
|
+
|
|
69
|
+
```bash
|
|
70
|
+
# RIGHT — adapter declares only what it CAN do; router decides who WINS
|
|
71
|
+
# scripts/lib/tool/puppeteer.sh
|
|
72
|
+
tool_capabilities() {
|
|
73
|
+
cat <<'EOF'
|
|
74
|
+
{ "verbs": { "audit": { "flags": ["--lighthouse"] } } }
|
|
75
|
+
EOF
|
|
76
|
+
}
|
|
77
|
+
|
|
78
|
+
# scripts/lib/router.sh (one rule, one place — Path B in the recipe)
|
|
79
|
+
rule_audit_verb() {
|
|
80
|
+
[ "${1:-}" = "audit" ] || return 1
|
|
81
|
+
printf 'puppeteer\taudit verb (Path B promotion)\n'
|
|
82
|
+
}
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
**Why:** Routing precedence among peers is a global decision — when two adapters both claim audit, somebody has to break the tie, and centralizing that choice in `router.sh` lets a reviewer see the conflict in one diff. Decentralizing scatters precedence across N adapter files.
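One way to picture "one rule, one place" is an ordered rule list where the first matching rule wins. This is a sketch under assumptions: `route_pick`, `ROUTER_RULES`, `rule_default`, and the choice of `playwright-cli` as the fallback are hypothetical; only `rule_audit_verb` and its tab-separated output shape come from the example above.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of central routing precedence; not the real router.sh.

rule_audit_verb() {
  [ "${1:-}" = "audit" ] || return 1
  printf 'puppeteer\taudit verb (Path B promotion)\n'
}

# Assumed fallback rule: always matches, so it must come last.
rule_default() {
  printf 'playwright-cli\tdefault fallback\n'
}

# Precedence lives in exactly one place: the order of this list.
ROUTER_RULES=(rule_audit_verb rule_default)

route_pick() {
  local verb="$1" rule
  for rule in "${ROUTER_RULES[@]}"; do
    # A rule prints "tool<TAB>reason" and returns 0 when it claims the verb.
    if "${rule}" "${verb}"; then
      return 0
    fi
  done
  return 1
}

route_pick audit
route_pick click
```

A tie between two adapters then shows up as a reordering of `ROUTER_RULES` in a single diff, rather than as competing priority numbers spread across adapter files.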
|
|
86
|
+
|
|
87
|
+
## AP-4: Don't make a tool default in the same PR that adds it
|
|
88
|
+
|
|
89
|
+
```diff
|
|
90
|
+
# WRONG — single PR introduces tool AND promotes it
|
|
91
|
+
+ scripts/lib/tool/puppeteer.sh (NEW)
|
|
92
|
+
+ tests/puppeteer_adapter.bats (NEW)
|
|
93
|
+
+ tests/stubs/puppeteer (NEW)
|
|
94
|
+
+ tests/fixtures/puppeteer/ (NEW)
|
|
95
|
+
+ references/puppeteer-cheatsheet.md (NEW)
|
|
96
|
+
~ scripts/lib/router.sh (EDIT — adds rule_audit_verb -> puppeteer)
|
|
97
|
+
~ tests/router.bats (EDIT)
|
|
98
|
+
~ references/routing-heuristics.md (EDIT)
|
|
99
|
+
~ CHANGELOG.md (EDIT)
|
|
100
|
+
```
|
|
101
|
+
|
|
102
|
+
```diff
|
|
103
|
+
# RIGHT — two PRs, separated by a soak window
|
|
104
|
+
# PR #1 (Path A — ship dark, opt-in via --tool=puppeteer):
|
|
105
|
+
+ scripts/lib/tool/puppeteer.sh (NEW)
|
|
106
|
+
+ tests/puppeteer_adapter.bats (NEW)
|
|
107
|
+
+ tests/stubs/puppeteer (NEW)
|
|
108
|
+
+ tests/fixtures/puppeteer/ (NEW)
|
|
109
|
+
+ references/puppeteer-cheatsheet.md (NEW)
|
|
110
|
+
~ CHANGELOG.md (EDIT — [adapter] added, opt-in)
|
|
111
|
+
|
|
112
|
+
# (one week later, after using --tool=puppeteer in real workflows)
|
|
113
|
+
# PR #2 (Path B — promote to default):
|
|
114
|
+
~ scripts/lib/router.sh (EDIT)
|
|
115
|
+
~ tests/router.bats (EDIT)
|
|
116
|
+
~ references/routing-heuristics.md (EDIT)
|
|
117
|
+
~ CHANGELOG.md (EDIT — [adapter] promoted)
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
**Why:** Process. A smaller PR is easier to revert. The "ship dark, then promote" pattern is the same hygiene any production system uses for feature flags.
|
|
121
|
+
|
|
122
|
+
## AP-5: Don't hand-edit autogenerated files
|
|
123
|
+
|
|
124
|
+
```bash
|
|
125
|
+
# WRONG — manually adding a row to references/tool-versions.md
|
|
126
|
+
$ vim references/tool-versions.md # adds a "puppeteer" row by hand
|
|
127
|
+
$ git add references/tool-versions.md && git commit
|
|
128
|
+
# tests/lint.sh::ensure_docs_in_sync will fail in CI:
|
|
129
|
+
# "tool-versions.md is stale; run scripts/regenerate-docs.sh"
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
```bash
|
|
133
|
+
# RIGHT — let the generator do it; commit the generator's output
|
|
134
|
+
$ vim scripts/lib/tool/puppeteer.sh # implement tool_metadata + tool_doctor_check
|
|
135
|
+
$ scripts/regenerate-docs.sh # autogen edits the marked sections
|
|
136
|
+
$ git add scripts/lib/tool/puppeteer.sh references/tool-versions.md SKILL.md
|
|
137
|
+
$ git commit
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
**Why:** DRY + drift prevention. The adapter's `tool_metadata()` is the single source of truth; any hand-edit to a generated file creates a second source that will go stale.
|
|
141
|
+
|
|
142
|
+
## AP-6: Don't pollute the parent-shell namespace from adapter file scope
|
|
143
|
+
|
|
144
|
+
```bash
|
|
145
|
+
# WRONG — readonly globals at adapter file scope without a namespace prefix
|
|
146
|
+
# scripts/lib/tool/puppeteer.sh
|
|
147
|
+
readonly TIMEOUT=30
|
|
148
|
+
readonly DEFAULT_VIEWPORT='1280x800'
|
|
149
|
+
# ... tool_open uses $TIMEOUT
|
|
150
|
+
```
|
|
151
|
+
|
|
152
|
+
```bash
|
|
153
|
+
# RIGHT — namespace the adapter's globals so they cannot collide
|
|
154
|
+
# scripts/lib/tool/puppeteer.sh
|
|
155
|
+
readonly _BROWSER_TOOL_PUPPETEER_TIMEOUT=30
|
|
156
|
+
readonly _BROWSER_TOOL_PUPPETEER_DEFAULT_VIEWPORT='1280x800'
|
|
157
|
+
# ... tool_open uses $_BROWSER_TOOL_PUPPETEER_TIMEOUT
|
|
158
|
+
```
|
|
159
|
+
|
|
160
|
+
**Why:** Encapsulation. The current loading model only sources one adapter per parent shell, so collision is unlikely today — but a future verb that consults two adapters in the same shell would clobber `TIMEOUT`. Prefix-namespacing makes globals private without ceremony.
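The collision can be reproduced in a few lines. The sketch below uses throwaway files with invented names and sources them in subshells, so nothing leaks into the calling shell; it assumes bash, where re-declaring a `readonly` variable fails with a non-zero status.

```bash
#!/usr/bin/env bash
# Hypothetical demo: file names and values are invented for illustration.
demo_dir="$(mktemp -d)"
printf 'readonly TIMEOUT=30\n' > "${demo_dir}/first.sh"
printf 'readonly TIMEOUT=60\n' > "${demo_dir}/second.sh"
printf 'readonly _BROWSER_TOOL_FIRST_TIMEOUT=30\n'  > "${demo_dir}/first_ns.sh"
printf 'readonly _BROWSER_TOOL_SECOND_TIMEOUT=60\n' > "${demo_dir}/second_ns.sh"

# Two adapters with bare names in one shell: the second readonly fails.
if ( . "${demo_dir}/first.sh" && . "${demo_dir}/second.sh" ) 2>/dev/null; then
  bare_result='ok'
else
  bare_result='collision'
fi

# Same two adapters with prefixed names: both load cleanly.
if ( . "${demo_dir}/first_ns.sh" && . "${demo_dir}/second_ns.sh" ) 2>/dev/null; then
  prefixed_result='ok'
else
  prefixed_result='collision'
fi

printf 'bare names: %s\nprefixed names: %s\n' "${bare_result}" "${prefixed_result}"
rm -rf "${demo_dir}"
```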
|
|
161
|
+
|
|
162
|
+
## AP-7: Don't accept secrets in argv
|
|
163
|
+
|
|
164
|
+
Accept secrets via `--secret-stdin` only; lint and `tests/argv_leak.bats` enforce this. A password on the command line is visible to `ps`, the shell history, and any tracing tool that sees argv. Always pipe secrets via stdin.
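A minimal sketch of what the accepting side could look like. Only the `--secret-stdin` flag comes from this document; `parse_secret_args`, the global `secret`, and the rejected flag spellings (`--password`, `--secret`) are hypothetical and stand in for whatever argv shapes the real lint rejects.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: parse_secret_args and the rejected flag names are
# illustrative; only --secret-stdin comes from the surrounding text.
parse_secret_args() {
  local arg use_stdin=0
  secret=''
  for arg in "$@"; do
    case "${arg}" in
      --secret-stdin) use_stdin=1 ;;
      --password|--password=*|--secret|--secret=*)
        # Never echo the offending argument: it may contain the secret itself.
        printf 'refused: pass secrets via --secret-stdin\n' >&2
        return 2
        ;;
    esac
  done
  if [ "${use_stdin}" -eq 1 ]; then
    # Read one line; accept a final line without a trailing newline.
    IFS= read -r secret || [ -n "${secret}" ]
  fi
}

# The value travels on stdin, so it never appears in argv or `ps` output.
parse_secret_args --secret-stdin <<'EOF'
example-value-not-a-real-secret
EOF
[ -n "${secret}" ] && printf 'secret received via stdin\n'
```

The confirmation line deliberately reports neither the value nor its length.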
|
|
165
|
+
|
|
166
|
+
## AP-8: Don't run network calls at adapter file-source time
|
|
167
|
+
|
|
168
|
+
Sourcing must be cheap and pure. `command -v <binary>` is fine; running the binary at source time is not. The framework sources adapters dozens of times per CI run (once per subshell aggregation in doctor + lint). A network call at file-source time multiplies that cost.
|
|
169
|
+
|
|
170
|
+
## AP-9: Don't test only the happy path
|
|
171
|
+
|
|
172
|
+
Every `tests/<tool>_adapter.bats` must cover:
|
|
173
|
+
- Declaration-presence of all 11 required functions.
|
|
174
|
+
- `tool_capabilities()` returns valid JSON.
|
|
175
|
+
- Unsupported-op exit-41 path (at least one verb that returns 41).
|
|
176
|
+
- At least one happy path via the stub binary.
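The declaration-presence item can be sketched in plain bash, outside bats. The helper `adapter_missing_functions` is a hypothetical name, and the function names passed to it are only the ones this document happens to mention; the full list of 11 required functions is not reproduced here.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a declaration-presence check in plain bash.
adapter_missing_functions() {
  local adapter="$1"; shift
  (
    # Subshell: the adapter's definitions do not leak into the caller.
    . "${adapter}" || exit 2
    missing=0
    for fn in "$@"; do
      declare -F "${fn}" >/dev/null || { printf 'missing: %s\n' "${fn}"; missing=1; }
    done
    exit "${missing}"
  )
}

# Demo adapter that declares two of the three functions we ask about.
demo_adapter="$(mktemp)"
cat > "${demo_adapter}" <<'EOF'
tool_capabilities() { printf '{"verbs":{}}\n'; }
tool_doctor_check() { printf '{"ok":true}\n'; }
EOF

missing_report="$(adapter_missing_functions "${demo_adapter}" \
  tool_capabilities tool_doctor_check tool_metadata)"
printf '%s\n' "${missing_report}"
rm -f "${demo_adapter}"
```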
|
|
177
|
+
|
|
178
|
+
## See also
|
|
179
|
+
|
|
180
|
+
- [add-a-tool-adapter recipe](add-a-tool-adapter.md)
|
|
181
|
+
- [extension model spec §9](../../docs/superpowers/specs/2026-04-30-tool-adapter-extension-model-design.md)
|
|
182
|
+
- [token-efficient adapter output spec](../../docs/superpowers/specs/2026-05-01-token-efficient-adapter-output-design.md)
|
|
@@ -0,0 +1,139 @@
|
|
|
1
|
+
# Recipe: `body_bytes`, not `body`, in replies
|
|
2
|
+
|
|
3
|
+
When a verb ingests caller-supplied content (HTTP body, large blob, multi-line text), ship the **byte length** in the reply, not the content itself. This avoids re-emitting agent-supplied data into stdout, logs, terminal captures, and the Claude transcript.
|
|
4
|
+
|
|
5
|
+
## When to use this recipe
|
|
6
|
+
|
|
7
|
+
Use this whenever a verb takes content via `--content`, `--body`, `--data`, `--*-stdin`, or any flag that ingests bytes the agent typed and the verb forwards downstream. Already shipped:
|
|
8
|
+
|
|
9
|
+
- `scripts/browser-route.sh` + `scripts/lib/node/chrome-devtools-bridge.mjs` (Phase 6 part 7-ii) — `route fulfill --body` / `--body-stdin`. Reply has `body_bytes`, not `body`.
|
|
10
|
+
- `scripts/lib/node/chrome-devtools-bridge.mjs::runStatefulViaDaemon` (fill case, line ~432) — defensively scrubs `text` from the reply before emitting (related: privacy-canary handles the stronger secrets case).
|
|
11
|
+
|
|
12
|
+
Phase 7 capture-pipeline candidates: any sanitizer-output reply that shows the agent how much got redacted.
|
|
13
|
+
|
|
14
|
+
Do NOT use this recipe for:
|
|
15
|
+
- Replies whose **shape requires** the content (e.g. `extract --selector` returns the matched element's textContent — that's the whole point of the verb). The contract is *return what was extracted*, not *return how many bytes were extracted*.
|
|
16
|
+
- Secret-bytes (passwords, tokens). Those want **zero** reflection — see `privacy-canary.md`. Even `body_bytes` could leak something via length-side-channel for short secrets.
|
|
17
|
+
|
|
18
|
+
## The pattern
|
|
19
|
+
|
|
20
|
+
```javascript
|
|
21
|
+
// WRONG — echo the body in the reply
|
|
22
|
+
return {
|
|
23
|
+
verb: 'route',
|
|
24
|
+
action: 'fulfill',
|
|
25
|
+
pattern: msg.pattern,
|
|
26
|
+
status: msg.status,
|
|
27
|
+
body: msg.body, // <-- agent-supplied content, now in stdout
|
|
28
|
+
rule_count: routeRules.length,
|
|
29
|
+
};
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
```javascript
|
|
33
|
+
// RIGHT — ship a length contract
|
|
34
|
+
return {
|
|
35
|
+
verb: 'route',
|
|
36
|
+
action: 'fulfill',
|
|
37
|
+
pattern: msg.pattern,
|
|
38
|
+
fulfill_status: msg.status,
|
|
39
|
+
body_bytes: Buffer.byteLength(msg.body, 'utf8'),
|
|
40
|
+
rule_count: routeRules.length,
|
|
41
|
+
};
|
|
42
|
+
// body itself stays in the daemon's routeRules; never re-emitted.
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
Source of truth: `scripts/lib/node/chrome-devtools-bridge.mjs::case 'route'` (the fulfill branch).
|
|
46
|
+
|
|
47
|
+
## Why the agent doesn't need the body back
|
|
48
|
+
|
|
49
|
+
The agent **just sent** the body. They have it. The reply's job is to confirm:
|
|
50
|
+
- That the request was accepted (`status: 'ok'`).
|
|
51
|
+
- That it landed where intended (`pattern`, `rule_count`).
|
|
52
|
+
- That the bytes the daemon received match what they sent (`body_bytes`).
|
|
53
|
+
|
|
54
|
+
A length match is sufficient evidence that nothing got truncated by transport. If the agent suspects encoding corruption, they can `printf '%s' BODY | wc -c` and compare. They never needed the body echoed.
|
|
55
|
+
|
|
56
|
+
## Why echoing is actively bad
|
|
57
|
+
|
|
58
|
+
1. **Stdout is the Claude transcript.** Every echoed byte is a token the model rereads on the next turn. A 50KB JSON mock body bloats context for zero gain.
|
|
59
|
+
2. **Logs persist.** If the user pipes the verb to `tee log.txt` or runs under `script(1)`, the body lands on disk in plain text. The daemon's in-memory store is a deliberate scoping decision; echoing undoes it.
|
|
60
|
+
3. **Terminal-recording tools capture stdout.** Asciinema, screen recordings of demos, even `tmux` capture-pane all see whatever lands on stdout.
|
|
61
|
+
4. **Convention sets expectations.** Once one verb echoes content, the next maintainer assumes that's the contract and adds another. Establishing "we ship lengths, not content" as the norm prevents the drift.
|
|
62
|
+
|
|
63
|
+
## Why `Buffer.byteLength`, not `body.length`
|
|
64
|
+
|
|
65
|
+
```javascript
|
|
66
|
+
// WRONG — JS string length is code units, not bytes
|
|
67
|
+
body_bytes: msg.body.length
|
|
68
|
+
// '🔒'.length === 2 (UTF-16 surrogate pair)
|
|
69
|
+
// '🔒' as utf-8 is 4 bytes
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
```javascript
|
|
73
|
+
// RIGHT — count bytes, not code units
|
|
74
|
+
body_bytes: Buffer.byteLength(msg.body, 'utf8')
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
If the agent's `printf | wc -c` says 4 and the reply's `body_bytes` says 2, that looks like data loss when it's just a counting-units mismatch. Always count in the same unit the agent will count in (bytes, not chars).
|
|
78
|
+
|
|
79
|
+
Bash equivalent (for strings the bash verb script measures):
|
|
80
|
+
|
|
81
|
+
```bash
|
|
82
|
+
# WRONG — only correct for ASCII
|
|
83
|
+
body_bytes="${#body_inline}"
|
|
84
|
+
|
|
85
|
+
# RIGHT — measure bytes via wc
|
|
86
|
+
body_bytes="$(printf '%s' "${body_inline}" | wc -c | tr -d ' ')"
|
|
87
|
+
```
|
|
88
|
+
|
|
89
|
+
`scripts/browser-route.sh:107` uses `${#body_inline}` only for the dry-run / inline-body case where the bash side already knows the bytes won't surprise; the daemon-side authoritative count uses `Buffer.byteLength`. For new verbs, prefer `wc -c` bash-side.
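For new verbs, the `wc -c` idiom can be wrapped in a small helper. `byte_len` is a hypothetical name; the technique is the same `printf | wc -c` pipeline shown above. The sample string is built from octal escapes so the demo does not depend on the encoding of the script file.

```bash
#!/usr/bin/env bash
# Hypothetical helper name; the pipeline is the one this recipe recommends.
byte_len() {
  # wc -c counts bytes regardless of locale; tr strips the padding some wc builds emit.
  printf '%s' "$1" | wc -c | tr -d ' '
}

lock="$(printf '\360\237\224\222')"   # U+1F512 as its four UTF-8 bytes
printf 'lock bytes=%s\n' "$(byte_len "${lock}")"
printf 'ascii bytes=%s\n' "$(byte_len 'hello')"
```

By contrast, `${#lock}` reports a character count that depends on the shell's locale, which is why it only agrees with the byte count for ASCII input.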
|
|
90
|
+
|
|
91
|
+
## Defense in depth: also scrub upstream
|
|
92
|
+
|
|
93
|
+
The bridge daemon's `route` reply is one layer. The fill verb's similar concern (`scripts/lib/node/chrome-devtools-bridge.mjs:432`) shows the defensive-scrub layered idiom:
|
|
94
|
+
|
|
95
|
+
```javascript
|
|
96
|
+
const reply = await ipcCall({ verb: 'fill', ref, text });
|
|
97
|
+
// Defensive: scrub any echoed text from the reply before emitting.
|
|
98
|
+
if (reply && typeof reply === 'object') delete reply.text;
|
|
99
|
+
emitReply(reply);
|
|
100
|
+
```
|
|
101
|
+
|
|
102
|
+
Even if the daemon child accidentally puts `text` into the reply, the bridge strips it on the way out. **Two layers** because either layer can be edited carelessly; both wrong at the same time is the regression that ships.
|
|
103
|
+
|
|
104
|
+
## What about echoing for `--dry-run`?
|
|
105
|
+
|
|
106
|
+
Dry-run is the **one** justified case for surfacing the body — the agent asked "what would happen?" without committing. Even there, ship `body_bytes` plus an excerpt with explicit truncation, never the full body:
|
|
107
|
+
|
|
108
|
+
```bash
|
|
109
|
+
# dry-run summary
|
|
110
|
+
emit_summary verb=route ... \
|
|
111
|
+
fulfill_status="${status_code}" \
|
|
112
|
+
body_bytes="${body_bytes}" \
|
|
113
|
+
body_excerpt="$(printf '%s' "${body_inline}" | head -c 80)" \
|
|
114
|
+
body_truncated="$([ "${body_bytes}" -gt 80 ] && echo true || echo false)"
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
`browser-route.sh` doesn't ship the excerpt today. Add it in a follow-up if dry-run UX feedback asks for it.
|
|
118
|
+
|
|
119
|
+
## Checklist for any new content-ingesting verb
|
|
120
|
+
|
|
121
|
+
```
|
|
122
|
+
1. Reply object has `<thing>_bytes` (length), not `<thing>` (content).
|
|
123
|
+
2. Daemon-child stores the content in the slot it was meant for; reply
|
|
124
|
+
surface is purely confirmation.
|
|
125
|
+
3. Use Buffer.byteLength (Node) or wc -c (bash) — never .length on strings.
|
|
126
|
+
4. If the verb routes through a bridge, add a defensive `delete reply.<thing>`
|
|
127
|
+
on the way out (see fill verb precedent).
|
|
128
|
+
5. Test that asserts the byte-length contract: roundtrip a body with a known
|
|
129
|
+
non-ASCII character; confirm body_bytes matches `printf | wc -c`.
|
|
130
|
+
6. NEVER echo the content even in error replies — error UX is the message
|
|
131
|
+
string, not the offending payload.
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
## See also
|
|
135
|
+
|
|
136
|
+
- `scripts/lib/node/chrome-devtools-bridge.mjs::case 'route'` — fulfill branch.
|
|
137
|
+
- `scripts/lib/node/chrome-devtools-bridge.mjs:432` — fill defensive scrub.
|
|
138
|
+
- [Privacy canary recipe](privacy-canary.md) — stronger discipline for credential bytes.
|
|
139
|
+
- [Path security recipe](path-security.md) — sister pattern for filesystem inputs.
|
|
@@ -0,0 +1,210 @@
|
|
|
1
|
+
# Recipe: Cache-write security
|
|
2
|
+
|
|
3
|
+
A discipline for any verb that writes learned state to disk based on caller-supplied bytes. Codifies the contract that Phase 11 part 1's memory cache (`scripts/lib/memory.sh` + `scripts/browser-do.sh`) ships with: verbs that turn agent input into persistent cache state are a different security shape from verbs that read state, and they need defenses the read side doesn't.
|
|
4
|
+
|
|
5
|
+
## When to use this recipe
|
|
6
|
+
|
|
7
|
+
Use this **whenever you add a verb that persists caller-supplied selectors, intents, URL patterns, or other free-text agent input as cache/memory state read by future invocations**. Examples already shipped:
|
|
8
|
+
|
|
9
|
+
- `scripts/browser-do.sh record` — Phase 11 part 1-ii; writes `(intent, selector, url_pattern)` triples to `~/.browser-skill/memory/<site>/`.
|
|
10
|
+
- `scripts/browser-do.sh --intent` (cache-hit success path) — Phase 11 part 1-ii/iii; bumps `success_count` + records `pattern → archetype` mapping on dispatch success.
|
|
11
|
+
- `scripts/browser-do.sh --intent` (cache-hit failure path, exit 11/13) — Phase 11 part 1-iii; increments `fail_count` toward H1 disable threshold.
|
|
12
|
+
|
|
13
|
+
Do NOT use this recipe for:
|
|
14
|
+
- Read-side verbs (lookup, list, show) — they don't persist new state; the privacy invariant is "don't echo what's already on disk", which is `privacy-canary.md`'s domain.
|
|
15
|
+
- Captures (Phase 7) — they persist *observed* page state, not cache mappings; sanitization is `sanitize.sh`'s job, not the cache-write contract.
|
|
16
|
+
- Sites profile / credentials / sessions (Phase 1–4) — these store explicit user-registered material; the user typed it deliberately. Cache writes happen *incidentally* during agent execution, which is what makes the security shape different.
|
|
17
|
+
|
|
18
|
+
## The five rules
|
|
19
|
+
|
|
20
|
+
### Rule 1 — Whitelist the cache-write surface
|
|
21
|
+
|
|
22
|
+
Never accept a caller-supplied **verb name** that you'll dispatch (or store) without enumerating allowed targets in code.
|
|
23
|
+
|
|
24
|
+
```bash
|
|
25
|
+
# WRONG: accept any verb name the caller supplies
|
|
26
|
+
case "${arg_verb}" in
|
|
27
|
+
*) bash "${SCRIPT_DIR}/browser-${arg_verb}.sh" --selector "${cached}" ;;
|
|
28
|
+
esac
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
```bash
|
|
32
|
+
# RIGHT: explicit constant whitelist; reject everything else
|
|
33
|
+
readonly DO_VERB_WHITELIST=(click) # v1: only click takes --selector
|
|
34
|
+
_verb_in_whitelist() {
|
|
35
|
+
local needle="$1" v
|
|
36
|
+
for v in "${DO_VERB_WHITELIST[@]}"; do
|
|
37
|
+
[ "${v}" = "${needle}" ] && return 0
|
|
38
|
+
done
|
|
39
|
+
return 1
|
|
40
|
+
}
|
|
41
|
+
_verb_in_whitelist "${arg_verb}" \
|
|
42
|
+
|| die "${EXIT_USAGE_ERROR}" "browser-do: --verb '${arg_verb}' not in whitelist (allowed: ${DO_VERB_WHITELIST[*]})"
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
Why: a caller-supplied verb name dispatching to `bash scripts/browser-${arg_verb}.sh` lets a typo silently route to the wrong verb (`fil` → file-not-found vs `fill` actually firing); worse, lets a hostile prompt trick the agent into invoking credential-handling verbs (`creds-show`, `extract`, `audit`) under the cache-hit fast path. Whitelist is constant; reviewers can grep it; new verbs join the whitelist by an explicit code change with reviewable rationale.
|
|
46
|
+
|
|
47
|
+
Reference: `scripts/browser-do.sh::DO_VERB_WHITELIST` + `tests/browser-do.bats::4`.
|
|
48
|
+
|
|
49
|
+
### Rule 2 — Refuse cache writes containing credential sentinels
|
|
50
|
+
|
|
51
|
+
Cache writes carry caller-supplied free-text (intent phrases, selectors). If an agent accidentally inlines credential bytes into those args, they hit disk in the cache and survive across sessions. Refuse with a fast-fail.
|
|
52
|
+
|
|
53
|
+
```bash
|
|
54
|
+
# WRONG: store whatever the caller passed
|
|
55
|
+
memory_record "${site}" "${arch}" "${intent}" "${selector}"
|
|
56
|
+
# Caller wrote intent="type password 'mySecret123'" → "mySecret123" persists.
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
```bash
|
|
60
|
+
# RIGHT: refuse on sentinel-shaped content
|
|
61
|
+
readonly CANARY_SENTINEL='PASSWORD-CANARY'
|
|
62
|
+
_canary_check() {
|
|
63
|
+
local field="$1" value="$2"
|
|
64
|
+
if printf '%s' "${value}" | grep -qF -- "${CANARY_SENTINEL}"; then
|
|
65
|
+
die "${EXIT_BLOCKLIST_REJECTED}" "browser-do: refused — ${field} contains canary sentinel '${CANARY_SENTINEL}'"
|
|
66
|
+
fi
|
|
67
|
+
}
|
|
68
|
+
_canary_check "intent" "${arg_intent}"
|
|
69
|
+
_canary_check "selector" "${arg_selector}"
|
|
70
|
+
memory_record "${site}" "${arch}" "${arg_intent}" "${arg_selector}"
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
This is **not a real secret detector** — entropy scanning, real password-format detection, and the broader regex-zoo of credential patterns are out of scope for this recipe. The sentinel:
|
|
74
|
+
|
|
75
|
+
- Lets bats inject `PASSWORD-CANARY` into intent or selector, assert exit `EXIT_BLOCKLIST_REJECTED (28)`, AND assert the cache file is untouched on disk. That's the **regression** safety net.
|
|
76
|
+
- Forces the production code to have a refusal codepath at all, instead of unconditionally writing whatever shows up.
|
|
77
|
+
|
|
78
|
+
Real entropy/format-based detection is a future hardening pass. Document it as such in the recipe-doc + plan-doc.
|
|
79
|
+
|
|
80
|
+
Reference: `scripts/browser-do.sh::_canary_check` + `tests/browser-do.bats::24,25`.
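The regression check described above (inject the sentinel, expect exit 28, expect the cache file untouched) can be sketched without bats. This is an illustration under assumptions: `record_guarded` and the tab-separated cache line are hypothetical stand-ins for `memory_record` and the real on-disk format.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: record_guarded and the TSV cache format are invented.
CANARY_SENTINEL='PASSWORD-CANARY'
EXIT_BLOCKLIST_REJECTED=28

record_guarded() {
  local cache_file="$1" intent="$2" selector="$3" value
  # Check every caller-supplied field before any byte reaches disk.
  for value in "${intent}" "${selector}"; do
    case "${value}" in
      *"${CANARY_SENTINEL}"*) return "${EXIT_BLOCKLIST_REJECTED}" ;;
    esac
  done
  printf '%s\t%s\n' "${intent}" "${selector}" >> "${cache_file}"
}

cache_file="$(mktemp)"
record_guarded "${cache_file}" 'click delete' 'button.delete'
before="$(cksum < "${cache_file}")"

# Inject the sentinel and capture the exit code without aborting the script.
if record_guarded "${cache_file}" 'type PASSWORD-CANARY' 'input#pw'; then
  canary_rc=0
else
  canary_rc=$?
fi
after="$(cksum < "${cache_file}")"

printf 'exit=%s unchanged=%s\n' "${canary_rc}" "$([ "${before}" = "${after}" ] && echo yes || echo no)"
rm -f "${cache_file}"
```

Comparing checksums before and after is what separates "refused" from "refused but wrote anyway".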
|
|
81
|
+
|
|
82
|
+
### Rule 3 — Cache writes are best-effort; never taint the action's exit code
|
|
83
|
+
|
|
84
|
+
The action (clicking, filling, navigating) is the user's actual intent. The cache-write is an *opportunistic side effect* that improves future runs. **Cache freshness < action correctness.**
|
|
85
|
+
|
|
86
|
+
```bash
|
|
87
|
+
# WRONG: cache failure surfaces as the verb's exit code
|
|
88
|
+
memory_record "${site}" "${arch}" "${intent}" "${selector}" \
|
|
89
|
+
|| die "${EXIT_GENERIC_ERROR}" "cache write failed"
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
```bash
|
|
93
|
+
# RIGHT: log and forge ahead
|
|
94
|
+
if ! memory_record "${site}" "${arch}" "${intent}" "${selector}" 2>/dev/null; then
|
|
95
|
+
warn "browser-do: cache success_count update failed (best-effort; action exit unchanged)"
|
|
96
|
+
fi
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
Why: a disk-full or perms-bug while writing the cache must not retroactively turn a successful click into an error the agent has to handle. The agent did the right thing; if the skill couldn't remember, that's the skill's problem to log, not the agent's problem to debug. The `warn:` line is observable to the user/reviewer; the action's exit code (and downstream agent decisions) stays correct.
|
|
100
|
+
|
|
101
|
+
Reference: `scripts/browser-do.sh` cache-hit-success branch + post-dispatch failure-recording branch (both `warn:`-only on cache failure).
|
|
102
|
+
|
|
103
|
+
### Rule 4 — Self-heal failure-counting needs an exit-code whitelist
|
|
104
|
+
|
|
105
|
+
If you wire failure counting into a verb (e.g. "fail 4 times → mark cached selector disabled"), **only specific exit codes drive the counter.** Counting any non-zero exit poisons the cache when the failure was environmental.
|
|
106
|
+
|
|
107
|
+
```bash
|
|
108
|
+
# WRONG: count any failure as a selector-fitness signal
|
|
109
|
+
if [ "${dispatch_rc}" -ne 0 ]; then
|
|
110
|
+
memory_record_failure "${site}" "${arch}" "${intent}"
|
|
111
|
+
fi
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
```bash
|
|
115
|
+
# RIGHT: whitelist the exit codes that genuinely indicate a selector miss
|
|
116
|
+
elif [ "${dispatch_rc}" -eq "${EXIT_EMPTY_RESULT}" ] || [ "${dispatch_rc}" -eq "${EXIT_ASSERTION_FAILED}" ]; then
|
|
117
|
+
# 11 = element not found at selector; 13 = assertion failed (expected element absent).
|
|
118
|
+
# 30 (network), 42 (tool crash), 43 (timeout) are environmental — they would poison
|
|
119
|
+
# the cache if we counted them.
|
|
120
|
+
if ! memory_record_failure "${site}" "${arch}" "${intent}" 2>/dev/null; then
|
|
121
|
+
warn "browser-do: cache fail_count update failed (best-effort)"
|
|
122
|
+
fi
|
|
123
|
+
fi
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
Why: a flaky network or a one-off tool crash shouldn't push a working selector toward disable. The cache is supposed to remember "this selector reliably finds the element"; only the kind of failure that would change that conclusion (the selector returning nothing; the assertion not matching) qualifies as evidence the cached value is stale.
|
|
127
|
+
|
|
128
|
+
Pick the whitelist deliberately:
|
|
129
|
+
- **In:** Exit codes that mean "the cached value was tried and its referent wasn't there." `EXIT_EMPTY_RESULT (11)` and `EXIT_ASSERTION_FAILED (13)` for Phase 11 part 1.
|
|
130
|
+
- **Out:** Network errors, tool crashes, timeouts, session expiry, usage errors. Document the cutoff inline so future readers see the rule, not just the list.
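
One way to keep the cutoff visible in code is to put the whitelist behind a single named predicate. This is a minimal sketch, assuming the numeric values 11 and 13 quoted above. The function name `is_selector_miss_rc` is hypothetical and is not taken from `scripts/browser-do.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: exit-code values come from the text above (11, 13);
# the function name is a hypothetical illustration.
EXIT_EMPTY_RESULT=11
EXIT_ASSERTION_FAILED=13

# Returns 0 when the exit code is evidence that the cached selector is stale.
# Everything else (network, tool crash, timeout, session expiry, usage error)
# is treated as environmental and must not touch fail_count.
is_selector_miss_rc() {
  case "$1" in
    "${EXIT_EMPTY_RESULT}"|"${EXIT_ASSERTION_FAILED}") return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller would then write `if is_selector_miss_rc "${dispatch_rc}"; then ...`, so widening the whitelist means touching one commented function rather than every call site.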
|
|
131
|
+
|
|
132
|
+
Reference: `scripts/browser-do.sh` post-dispatch elif branch + `tests/browser-do.bats::29,30,31` (covers in-whitelist 11+13 + out-of-whitelist 30).
|
|
133
|
+
|
|
134
|
+
### Rule 5 — Lock the cache schema; don't store action-type
|
|
135
|
+
|
|
136
|
+
Cache schema is **forever** (or schema-version-bump-forever). Frozen v1 shapes can only grow new fields, not change existing ones. You will be tempted to add a `verb` field per cached interaction so the cache can dispatch any verb on hit. *Resist it.*
|
|
137
|
+
|
|
138
|
+
```
|
|
139
|
+
WRONG — cache stores (intent, selector, verb)
|
|
140
|
+
{
|
|
141
|
+
"intent": "click delete",
|
|
142
|
+
"selector": "button.delete",
|
|
143
|
+
"verb": "click" // <-- couples cache to verb-set; schema-bump on every new verb
|
|
144
|
+
}
|
|
145
|
+
```
|
|
146
|
+
|
|
147
|
+
```
|
|
148
|
+
RIGHT — cache stores (intent, selector); caller specifies verb per call
|
|
149
|
+
{
|
|
150
|
+
"intent": "click delete",
|
|
151
|
+
"selector": "button.delete"
|
|
152
|
+
}
|
|
153
|
+
# Caller: browser-do --verb click --intent "click delete"
|
|
154
|
+
# Same selector can serve hover, fill (with --text), etc. — orthogonal axis.
|
|
155
|
+
```
|
|
156
|
+
|
|
157
|
+
Why: storing verb-type in the cache forces a schema bump every time you add a new dispatchable verb. The cache becomes brittle to the verb set. Moving the verb axis out (caller passes `--verb` per call) keeps the cache stable across verb-set evolution, AND lets the same selector serve multiple actions naturally (the same `button.confirm` selector is clickable AND hoverable; storing it once is correct).
|
|
158
|
+
|
|
159
|
+
Corollary: **don't store literal URLs**. Store *patterns*. URLs change every visit (`/devices/123` vs `/devices/124`); patterns generalize (`/devices/:id`). Storing the literal couples the cache to one entity instance; storing the pattern lets the cache hit across the whole archetype.
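
The literal-to-pattern step can be sketched as follows. This is an illustration under stated assumptions: `url_path_to_pattern` is a hypothetical helper, and the rule "an all-digit path segment becomes `:id`" is a simplification chosen for the example, not necessarily how the archetype matcher in `scripts/lib/memory.sh` works.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper; the "digits become :id" rule is an
# illustrative assumption, not the project's actual matcher.
url_path_to_pattern() {
  # Loop (":a" ... "ta") so that consecutive numeric segments such as
  # /a/1/2 are all rewritten, not just the first one.
  printf '%s\n' "$1" |
    sed -E -e ':a' -e 's#/[0-9]+(/|$)#/:id\1#' -e 'ta'
}
```

With this helper, `/devices/123` and `/devices/124` both normalize to `/devices/:id`, so one cache entry serves every device page.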
|
|
160
|
+
|
|
161
|
+
Reference: archetype JSON shape in `scripts/lib/memory.sh::memory_save_archetype` + design doc 2026-05-08-phase-11-memory-design.md §3 M1.
|
|
162
|
+
|
|
163
|
+
## What to test
|
|
164
|
+
|
|
165
|
+
Each rule needs at least one bats case proving the contract holds:
|
|
166
|
+
|
|
167
|
+
```
|
|
168
|
+
1. Whitelist enforcement: --verb ghost → exit EXIT_USAGE_ERROR
|
|
169
|
+
2. Canary refusal (intent): intent='PASSWORD-CANARY ...' → exit 28; cache untouched
|
|
170
|
+
3. Canary refusal (selector): selector='input[name=PASSWORD-CANARY]' → exit 28; cache untouched
|
|
171
|
+
4. Best-effort write semantics: (harder — needs a forced cache-write failure
|
|
172
|
+
to prove the action's exit code stays unchanged.
|
|
173
|
+
May be deferred to integration testing.)
|
|
174
|
+
5. Self-heal in-whitelist: dispatched verb exits 11 → fail_count++
|
|
175
|
+
6. Self-heal in-whitelist: dispatched verb exits 13 → fail_count++
|
|
176
|
+
7. Self-heal out-of-whitelist: dispatched verb exits 30 → fail_count UNCHANGED
|
|
177
|
+
8. Schema stability: new field added → existing fixtures still parse
|
|
178
|
+
(round-trip test in lib bats)
|
|
179
|
+
```
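
Item 4 in the list above is marked as harder, but the shape of the check can be sketched with stubs. Every name below (`fake_cache_write`, `dispatch_action`, `run_with_best_effort_cache`) is a hypothetical stand-in rather than code from `scripts/browser-do.sh`. The sketch only shows the property under test: a failing cache write produces a warning and leaves the action's exit code alone.

```shell
#!/usr/bin/env bash
# Sketch only: all names are hypothetical stand-ins for illustration.
warn() { printf 'warn: %s\n' "$*" >&2; }

# Stub that simulates a cache write that always fails
# (for example, an unwritable memory directory).
fake_cache_write() { return 1; }

# Stand-in for the dispatched action; exits with the code it is given.
dispatch_action() { return "$1"; }

run_with_best_effort_cache() {
  local dispatch_rc=0
  dispatch_action "$1" || dispatch_rc=$?
  if [ "${dispatch_rc}" -eq 0 ]; then
    # A failed cache write only warns; it never changes the exit code.
    fake_cache_write || warn "cache write failed (best-effort)"
  fi
  return "${dispatch_rc}"
}
```

A bats case built on the same idea would swap the real cache writer for a failing stub, run the verb, and assert that the status equals the dispatched action's status.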
|
|
180
|
+
|
|
181
|
+
Sample placement (already shipped): `tests/browser-do.bats::4,12,13,29,30,31,32` (test cases 1, 2, 3, 5, 6, 7 from the list above, plus end-to-end). `tests/memory.bats::2,13` (Rule 5 round-trip + self-heal D2 reset).
|
|
182
|
+
|
|
183
|
+
## Why a per-recipe contract beats per-PR vigilance
|
|
184
|
+
|
|
185
|
+
Phase 11 part 1 shipped over three PRs (1-i lib + 1-ii verb + 1-iii self-heal). Each PR had a plan-doc that locked decisions for that scope; the cumulative cache-write contract spans all three. **Without this recipe, a future PR adding a new cache-writing verb would have to re-derive the same five rules** by reading three plan-docs in sequence + the design doc. Recipes turn cumulative knowledge into a single grep-able artifact.
|
|
186
|
+
|
|
187
|
+
Concretely: if a PR adds `browser-do --verb fill` after `fill` gains `--selector` adapter plumbing, the reviewer should be able to ask "did this PR honor the cache-write contract?" and check this file's five rules + the test placements without having to reconstruct the rationale from three months of git log.
|
|
188
|
+
|
|
189
|
+
## Don't
|
|
190
|
+
|
|
191
|
+
- **Don't** add fields that store user-typed values verbatim in the cache. Selectors are CSS strings (structurally bounded); intents are short natural-language phrases (no value bytes). If you need to cache a typed value (e.g. "remember the username for this site"), it's not a memory-cache concern — it's the **credentials backend** (Phase 4), with its own security envelope.
|
|
192
|
+
- **Don't** widen the self-heal exit-code whitelist without explicit rationale. Adding code 22 ("session expired") looks reasonable but couples the cache to a different subsystem's failure mode; the cached selector is fine — the *session* needs renewing.
|
|
193
|
+
- **Don't** cache across sites. Per-site memory is the boundary (design doc §12). A selector that works on `prod-app` may have a homonym on `staging` that means something different.
|
|
194
|
+
- **Don't** auto-sanitize / strip sentinel bytes silently. Refuse with `EXIT_BLOCKLIST_REJECTED`. The agent's call site needs to know the cache write didn't happen so it doesn't assume a future hit; silent strip would create a "successful write that wrote nothing" failure mode that's worse than refusal.
|
|
195
|
+
- **Don't** add `press` to the cache-dispatch whitelist via `--focus-selector` or similar target-flag retrofit. **`press` is structurally outside cache scope** — chrome-devtools-mcp's bridge `case 'press':` is target-less by design ("Stateless w.r.t. refMap — acts on the focused element or page"; `lib/node/chrome-devtools-bridge.mjs:1098`). The cache-friendly composition is: agent calls `browser-do --verb click --intent "focus input"` to land focus on the right element, then invokes `browser-press --key Enter` directly (no cache; relies on focus state from prior click). This composition uses the cache where it adds value (target resolution) and leaves press as a no-op-for-cache stateless keyboard event. Documented in selector-mode-select plan-doc as decision SS5 + selector-mode plumbing HANDOFF row.
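
The refuse-rather-than-strip bullet above can be sketched as a guard in front of the write. Assumptions, all labeled in the code as well: `EXIT_BLOCKLIST_REJECTED` is taken to be 28 (the code the canary test cases expect), the sentinel is the `PASSWORD-CANARY` string from those cases, and `guarded_cache_write` and `CACHE_FILE` are hypothetical names rather than the project's real writer.

```shell
#!/usr/bin/env bash
# Sketch only. Assumptions: EXIT_BLOCKLIST_REJECTED=28, the sentinel string
# PASSWORD-CANARY, and the names guarded_cache_write / CACHE_FILE are all
# illustrative, not the project's actual implementation.
EXIT_BLOCKLIST_REJECTED=28
CACHE_FILE="${CACHE_FILE:-${TMPDIR:-/tmp}/cache-sketch.$$}"

guarded_cache_write() {
  local intent="$1" selector="$2" field
  for field in "${intent}" "${selector}"; do
    case "${field}" in
      *PASSWORD-CANARY*)
        # Refuse loudly so the caller knows nothing was written.
        echo "error: sentinel found in cache write; refusing" >&2
        return "${EXIT_BLOCKLIST_REJECTED}"
        ;;
    esac
  done
  printf '%s\t%s\n' "${intent}" "${selector}" >> "${CACHE_FILE}"
}
```

Because the guard returns before the `printf`, a rejected call leaves the cache file byte-for-byte unchanged, which is the "cache untouched" property the canary cases assert.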
|
|
196
|
+
|
|
197
|
+
## Codified deferrals
|
|
198
|
+
|
|
199
|
+
| Verb / surface | Status | Why |
|
|
200
|
+
|---|---|---|
|
|
201
|
+
| `press` cache dispatch | **deferred (option c — compose-with-click+press)** | Bridge designed target-less. Cache-friendly composition: cache `click "focus input"` then invoke `press` directly. Recommended over `--focus-selector` retrofit; no IPC schema bump needed. |
|
|
202
|
+
| `hover`/`select` on playwright-lib | deferred (no demand) | Hover + select route exclusively to chrome-devtools-mcp per `lib/router.sh::rule_{hover,select}_default`. If routing ever expands, mirror PR #105's pattern (`runHover`/`runSelect` + IPC `case 'hover':`/`case 'select':` selector branch). |
|
|
203
|
+
|
|
204
|
+
## See also
|
|
205
|
+
|
|
206
|
+
- [Privacy canary recipe](privacy-canary.md) — sister pattern for read-side verbs that ingest secrets.
|
|
207
|
+
- [Path security recipe](path-security.md) — how `~/.browser-skill/memory/` enforces 0700 dir + 0600 file modes (mirrored from the captures pipeline).
|
|
208
|
+
- [Anti-patterns: tool extension](anti-patterns-tool-extension.md) — AP-7 (secrets-via-stdin), the broader pattern this recipe extends to cache-write surfaces.
|
|
209
|
+
- Design doc: `docs/superpowers/specs/2026-05-08-phase-11-memory-design.md` §6 (memory + recipes integration), §12 (cross-site boundary), §3 H1 (self-heal threshold).
|
|
210
|
+
- Phase 11 part 1 plan-docs: `2026-05-10-phase-11-part-1-{i,ii,iii}-*.md` — per-PR scope decisions that compose into this recipe.
|