browser-automation-skill 0.71.0

Files changed (117)
  1. package/LICENSE +21 -0
  2. package/README.md +144 -0
  3. package/SECURITY.md +39 -0
  4. package/SKILL.md +206 -0
  5. package/bin/cli.mjs +55 -0
  6. package/install.sh +143 -0
  7. package/package.json +54 -0
  8. package/references/adapter-candidates.md +40 -0
  9. package/references/browser-mcp-cheatsheet.md +132 -0
  10. package/references/browser-stats-cheatsheet.md +155 -0
  11. package/references/chrome-devtools-mcp-cheatsheet.md +232 -0
  12. package/references/midscene-integration.md +359 -0
  13. package/references/obscura-cheatsheet.md +103 -0
  14. package/references/playwright-cli-cheatsheet.md +64 -0
  15. package/references/playwright-lib-cheatsheet.md +90 -0
  16. package/references/recipes/add-a-tool-adapter.md +134 -0
  17. package/references/recipes/agent-workflows/README.md +37 -0
  18. package/references/recipes/agent-workflows/cache-driven-bulk-operation.md +110 -0
  19. package/references/recipes/agent-workflows/flow-record-and-replay.md +102 -0
  20. package/references/recipes/agent-workflows/incremental-pattern-discovery.md +125 -0
  21. package/references/recipes/agent-workflows/login-then-scrape.md +100 -0
  22. package/references/recipes/anti-patterns-tool-extension.md +182 -0
  23. package/references/recipes/body-bytes-not-body.md +139 -0
  24. package/references/recipes/cache-write-security.md +210 -0
  25. package/references/recipes/fingerprint-rescue.md +154 -0
  26. package/references/recipes/model-routing.md +143 -0
  27. package/references/recipes/path-security.md +138 -0
  28. package/references/recipes/privacy-canary.md +96 -0
  29. package/references/recipes/visual-rescue-hook.md +182 -0
  30. package/references/stats-prices.json +42 -0
  31. package/references/stats-schema.json +77 -0
  32. package/references/tool-versions.md +8 -0
  33. package/scripts/browser-add-site.sh +113 -0
  34. package/scripts/browser-assert.sh +106 -0
  35. package/scripts/browser-audit.sh +68 -0
  36. package/scripts/browser-baseline.sh +135 -0
  37. package/scripts/browser-click.sh +100 -0
  38. package/scripts/browser-creds-add.sh +254 -0
  39. package/scripts/browser-creds-list.sh +67 -0
  40. package/scripts/browser-creds-migrate.sh +122 -0
  41. package/scripts/browser-creds-remove.sh +69 -0
  42. package/scripts/browser-creds-rotate-totp.sh +109 -0
  43. package/scripts/browser-creds-show.sh +82 -0
  44. package/scripts/browser-creds-totp.sh +94 -0
  45. package/scripts/browser-do.sh +630 -0
  46. package/scripts/browser-doctor.sh +365 -0
  47. package/scripts/browser-drag.sh +90 -0
  48. package/scripts/browser-extract.sh +192 -0
  49. package/scripts/browser-fill.sh +142 -0
  50. package/scripts/browser-flow.sh +316 -0
  51. package/scripts/browser-history.sh +187 -0
  52. package/scripts/browser-hover.sh +92 -0
  53. package/scripts/browser-inspect.sh +188 -0
  54. package/scripts/browser-list-sessions.sh +78 -0
  55. package/scripts/browser-list-sites.sh +42 -0
  56. package/scripts/browser-login.sh +279 -0
  57. package/scripts/browser-mcp.sh +65 -0
  58. package/scripts/browser-migrate.sh +195 -0
  59. package/scripts/browser-open.sh +134 -0
  60. package/scripts/browser-press.sh +80 -0
  61. package/scripts/browser-remove-session.sh +72 -0
  62. package/scripts/browser-remove-site.sh +68 -0
  63. package/scripts/browser-replay.sh +206 -0
  64. package/scripts/browser-route.sh +174 -0
  65. package/scripts/browser-select.sh +122 -0
  66. package/scripts/browser-show-session.sh +57 -0
  67. package/scripts/browser-show-site.sh +37 -0
  68. package/scripts/browser-snapshot.sh +176 -0
  69. package/scripts/browser-stats.sh +522 -0
  70. package/scripts/browser-tab-close.sh +112 -0
  71. package/scripts/browser-tab-list.sh +70 -0
  72. package/scripts/browser-tab-switch.sh +111 -0
  73. package/scripts/browser-upload.sh +132 -0
  74. package/scripts/browser-use.sh +60 -0
  75. package/scripts/browser-vlm.sh +707 -0
  76. package/scripts/browser-wait.sh +97 -0
  77. package/scripts/install-git-hooks.sh +16 -0
  78. package/scripts/lib/capture.sh +356 -0
  79. package/scripts/lib/common.sh +262 -0
  80. package/scripts/lib/credential.sh +237 -0
  81. package/scripts/lib/fingerprint-rescue.js +123 -0
  82. package/scripts/lib/flow.sh +448 -0
  83. package/scripts/lib/flow_record.sh +210 -0
  84. package/scripts/lib/mask.sh +49 -0
  85. package/scripts/lib/memory.sh +427 -0
  86. package/scripts/lib/migrate.sh +390 -0
  87. package/scripts/lib/migrators/README.md +23 -0
  88. package/scripts/lib/migrators/memory/v1_to_v2.sh +15 -0
  89. package/scripts/lib/migrators/recent_urls/README.md +13 -0
  90. package/scripts/lib/migrators/stats/README.md +24 -0
  91. package/scripts/lib/node/chrome-devtools-bridge.mjs +1812 -0
  92. package/scripts/lib/node/mcp-server.mjs +531 -0
  93. package/scripts/lib/node/mcp-tools.json +68 -0
  94. package/scripts/lib/node/playwright-driver.mjs +1104 -0
  95. package/scripts/lib/node/totp-core.mjs +52 -0
  96. package/scripts/lib/node/totp.mjs +52 -0
  97. package/scripts/lib/node/url-pattern-cluster.mjs +102 -0
  98. package/scripts/lib/node/url-pattern-resolver.mjs +77 -0
  99. package/scripts/lib/output.sh +79 -0
  100. package/scripts/lib/router.sh +342 -0
  101. package/scripts/lib/sanitize.sh +107 -0
  102. package/scripts/lib/secret/keychain.sh +91 -0
  103. package/scripts/lib/secret/libsecret.sh +74 -0
  104. package/scripts/lib/secret/plaintext.sh +75 -0
  105. package/scripts/lib/secret_backend_select.sh +57 -0
  106. package/scripts/lib/session.sh +153 -0
  107. package/scripts/lib/site.sh +126 -0
  108. package/scripts/lib/stats.sh +419 -0
  109. package/scripts/lib/tool/.gitkeep +0 -0
  110. package/scripts/lib/tool/chrome-devtools-mcp.sh +349 -0
  111. package/scripts/lib/tool/obscura.sh +249 -0
  112. package/scripts/lib/tool/playwright-cli.sh +155 -0
  113. package/scripts/lib/tool/playwright-lib.sh +106 -0
  114. package/scripts/lib/verb_helpers.sh +222 -0
  115. package/scripts/lib/visual-rescue-default.sh +145 -0
  116. package/scripts/regenerate-docs.sh +99 -0
  117. package/uninstall.sh +51 -0
@@ -0,0 +1,182 @@
1
+ # Anti-patterns: tool extension
2
+
3
+ When adding or modifying a tool adapter, these are the temptations to resist. Each is a real pattern an early contributor will reach for, with a clear WRONG / RIGHT shape and the SOLID-principle reasoning.
4
+
5
+ ## AP-1: Don't add adapter-specific checks to `browser-doctor.sh`
6
+
7
+ ```bash
8
+ # WRONG — editing core to teach it about a new tool
9
+ # scripts/browser-doctor.sh
10
+ if ! command -v puppeteer >/dev/null 2>&1; then
11
+ warn "puppeteer not on PATH"
12
+ problems=$((problems+1))
13
+ fi
14
+ ```
15
+
16
+ ```bash
17
+ # RIGHT — adapter declares its own check; doctor aggregates
18
+ # scripts/lib/tool/puppeteer.sh
19
+ tool_doctor_check() {
20
+ if command -v puppeteer >/dev/null 2>&1; then
21
+ printf '{"ok":true,"binary":"puppeteer","version":"%s"}\n' "$(puppeteer --version)"
22
+ else
23
+ cat <<'EOF'
24
+ { "ok": false, "binary": "puppeteer", "error": "not on PATH",
25
+ "install_hint": "npm i -g puppeteer" }
26
+ EOF
27
+ fi
28
+ }
29
+ ```
30
+
31
+ **SOLID:** SRP — `browser-doctor.sh` knows about framework-level state; `lib/tool/<tool>.sh` knows about its own binary. OCP — adding `puppeteer.sh` should never force an edit to a file in `scripts/` outside `lib/tool/`.
32
+
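A minimal sketch of the aggregation side of AP-1, assuming the doctor sources each adapter in its own subshell and counts the ones whose `tool_doctor_check` does not report `"ok": true`. The name `doctor_aggregate_tools` and the grep-based JSON check are illustrative assumptions, not the actual `browser-doctor.sh` implementation:

```bash
# Hypothetical aggregator (not the real browser-doctor.sh): run every adapter's
# tool_doctor_check and print how many adapters reported a problem.
doctor_aggregate_tools() {
  local dir="$1" f problems=0
  for f in "${dir}"/*.sh; do
    [ -f "${f}" ] || continue
    # The subshell keeps one adapter's functions and globals out of the next.
    if ! ( . "${f}" && tool_doctor_check ) | grep -q '"ok":[[:space:]]*true'; then
      problems=$((problems + 1))
    fi
  done
  printf '%s\n' "${problems}"
}
```

With this shape, adding a new adapter file changes the doctor's output without any edit to the doctor itself, which is the OCP property the rule is after.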
33
+ ## AP-2: Don't cross-call between adapters
34
+
35
+ ```bash
36
+ # WRONG — adapter reaching into another adapter
37
+ # scripts/lib/tool/obscura.sh
38
+ tool_inspect() {
39
+ source "${LIB_TOOL_DIR}/playwright-cli.sh"
40
+ tool_inspect "$@"
41
+ }
42
+ ```
43
+
44
+ ```bash
45
+ # RIGHT — shared logic factors into a helper module
46
+ # scripts/lib/inspect_helpers.sh (NEW shared lib — sibling to lib/tool/)
47
+ inspect_collect_console() { :; }
48
+
49
+ # Both adapters source the helper; neither sources the other.
50
+ # scripts/lib/tool/obscura.sh
51
+ source "${BROWSER_SKILL_LIB}/inspect_helpers.sh"
52
+ tool_inspect() { inspect_collect_console "$@"; }
53
+ ```
54
+
55
+ **SOLID:** SRP + DIP — adapters are leaves. If two adapters need the same behavior, that behavior is a *shared concern* and lives in `scripts/lib/` (sibling to `lib/tool/`). Without this rule, the dependency graph stops being a tree.
56
+
57
+ ## AP-3: Don't declare routing precedence in an adapter (the Z-hybrid line)
58
+
59
+ ```bash
60
+ # WRONG — adapter trying to claim "I'm the default for verb=audit"
61
+ # scripts/lib/tool/puppeteer.sh
62
+ tool_default_routes() {
63
+ cat <<'EOF'
64
+ { "audit": { "priority": 100 } }
65
+ EOF
66
+ }
67
+ ```
68
+
69
+ ```bash
70
+ # RIGHT — adapter declares only what it CAN do; router decides who WINS
71
+ # scripts/lib/tool/puppeteer.sh
72
+ tool_capabilities() {
73
+ cat <<'EOF'
74
+ { "verbs": { "audit": { "flags": ["--lighthouse"] } } }
75
+ EOF
76
+ }
77
+
78
+ # scripts/lib/router.sh (one rule, one place — Path B in the recipe)
79
+ rule_audit_verb() {
80
+ [ "${1:-}" = "audit" ] || return 1
81
+ printf 'puppeteer\taudit verb (Path B promotion)\n'
82
+ }
83
+ ```
84
+
85
+ **Why:** Routing precedence among peers is a global decision — when two adapters both claim audit, somebody has to break the tie, and centralizing that choice in `router.sh` lets a reviewer see the conflict in one diff. Decentralizing scatters precedence across N adapter files.
86
+
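To make "one rule, one place" concrete, here is a sketch of how a router could resolve a verb when rules are tried in a fixed order and the first match wins. `rule_audit_verb` is copied from the RIGHT example; `rule_fallback`, `route_pick_tool`, and the choice of `playwright-cli` as the fallback are assumptions made for illustration, not the contents of `router.sh`:

```bash
# From the RIGHT example above: the promoted rule for the audit verb.
rule_audit_verb() {
  [ "${1:-}" = "audit" ] || return 1
  printf 'puppeteer\taudit verb (Path B promotion)\n'
}

# Hypothetical fallback rule; which adapter is the real default is an assumption.
rule_fallback() {
  printf 'playwright-cli\tfallback default\n'
}

# Hypothetical resolver: rules run in a fixed order and the first match wins,
# so precedence is readable top to bottom in this one list.
route_pick_tool() {
  local rule
  for rule in rule_audit_verb rule_fallback; do
    if "${rule}" "$@"; then
      return 0
    fi
  done
  return 1
}
```

Because the ordering lives in a single list, a PR that changes which adapter wins for a verb shows up as a diff in that list rather than in several adapter files.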
87
+ ## AP-4: Don't make a tool default in the same PR that adds it
88
+
89
+ ```diff
90
+ # WRONG — single PR introduces tool AND promotes it
91
+ + scripts/lib/tool/puppeteer.sh (NEW)
92
+ + tests/puppeteer_adapter.bats (NEW)
93
+ + tests/stubs/puppeteer (NEW)
94
+ + tests/fixtures/puppeteer/ (NEW)
95
+ + references/puppeteer-cheatsheet.md (NEW)
96
+ ~ scripts/lib/router.sh (EDIT — adds rule_audit_verb -> puppeteer)
97
+ ~ tests/router.bats (EDIT)
98
+ ~ references/routing-heuristics.md (EDIT)
99
+ ~ CHANGELOG.md (EDIT)
100
+ ```
101
+
102
+ ```diff
103
+ # RIGHT — two PRs, separated by a soak window
104
+ # PR #1 (Path A — ship dark, opt-in via --tool=puppeteer):
105
+ + scripts/lib/tool/puppeteer.sh (NEW)
106
+ + tests/puppeteer_adapter.bats (NEW)
107
+ + tests/stubs/puppeteer (NEW)
108
+ + tests/fixtures/puppeteer/ (NEW)
109
+ + references/puppeteer-cheatsheet.md (NEW)
110
+ ~ CHANGELOG.md (EDIT — [adapter] added, opt-in)
111
+
112
+ # (one week later, after using --tool=puppeteer in real workflows)
113
+ # PR #2 (Path B — promote to default):
114
+ ~ scripts/lib/router.sh (EDIT)
115
+ ~ tests/router.bats (EDIT)
116
+ ~ references/routing-heuristics.md (EDIT)
117
+ ~ CHANGELOG.md (EDIT — [adapter] promoted)
118
+ ```
119
+
120
+ **Why:** Process. A smaller PR is easier to revert. The "ship dark, then promote" pattern is the same hygiene any production system uses for feature flags.
121
+
122
+ ## AP-5: Don't hand-edit autogenerated files
123
+
124
+ ```bash
125
+ # WRONG — manually adding a row to references/tool-versions.md
126
+ $ vim references/tool-versions.md # adds a "puppeteer" row by hand
127
+ $ git add references/tool-versions.md && git commit
128
+ # tests/lint.sh::ensure_docs_in_sync will fail in CI:
129
+ # "tool-versions.md is stale; run scripts/regenerate-docs.sh"
130
+ ```
131
+
132
+ ```bash
133
+ # RIGHT — let the generator do it; commit the generator's output
134
+ $ vim scripts/lib/tool/puppeteer.sh # implement tool_metadata + tool_doctor_check
135
+ $ scripts/regenerate-docs.sh # autogen edits the marked sections
136
+ $ git add scripts/lib/tool/puppeteer.sh references/tool-versions.md SKILL.md
137
+ $ git commit
138
+ ```
139
+
140
+ **Why:** DRY + drift-prevention. The adapter's `tool_metadata()` is the single source of truth; any hand-edit to a generated file creates a second source that will go stale.
141
+
142
+ ## AP-6: Don't pollute the parent-shell namespace from adapter file scope
143
+
144
+ ```bash
145
+ # WRONG — readonly globals at adapter file scope without a namespace prefix
146
+ # scripts/lib/tool/puppeteer.sh
147
+ readonly TIMEOUT=30
148
+ readonly DEFAULT_VIEWPORT='1280x800'
149
+ # ... tool_open uses $TIMEOUT
150
+ ```
151
+
152
+ ```bash
153
+ # RIGHT — namespace the adapter's globals so they cannot collide
154
+ # scripts/lib/tool/puppeteer.sh
155
+ readonly _BROWSER_TOOL_PUPPETEER_TIMEOUT=30
156
+ readonly _BROWSER_TOOL_PUPPETEER_DEFAULT_VIEWPORT='1280x800'
157
+ # ... tool_open uses $_BROWSER_TOOL_PUPPETEER_TIMEOUT
158
+ ```
159
+
160
+ **Why:** Encapsulation. The current loading model only sources one adapter per parent shell, so collision is unlikely today — but a future verb that consults two adapters in the same shell would clobber `TIMEOUT`. Prefix-namespacing makes globals private without ceremony.
161
+
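The collision can be reproduced in isolation. The sketch below uses a subshell as a stand-in for a parent shell that sources two adapters; the function names are hypothetical. Note that with `readonly`, bash rejects the second declaration instead of silently overwriting the first, so the collision surfaces as a sourcing error:

```bash
# Stand-in for a parent shell that sources two adapters with unprefixed globals.
# The second readonly fails because TIMEOUT is already read-only.
source_two_unprefixed() {
  (
    readonly TIMEOUT=30   # hypothetical adapter A
    readonly TIMEOUT=60   # hypothetical adapter B
  ) 2>/dev/null
}

# Same scenario with per-adapter prefixes: both declarations succeed.
source_two_prefixed() {
  (
    readonly _BROWSER_TOOL_PUPPETEER_TIMEOUT=30
    readonly _BROWSER_TOOL_OBSCURA_TIMEOUT=60
    printf '%s %s\n' "${_BROWSER_TOOL_PUPPETEER_TIMEOUT}" "${_BROWSER_TOOL_OBSCURA_TIMEOUT}"
  )
}
```

`source_two_unprefixed` returns non-zero, while `source_two_prefixed` prints both values.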
162
+ ## AP-7: Don't accept secrets in argv
163
+
164
+ Accept secrets via `--secret-stdin` only; lint and `tests/argv_leak.bats` enforce this. A password on the command line is visible to `ps`, the shell history, and any tracing tool that sees argv. Always pipe secrets via stdin.
165
+
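A rough sketch of what stdin-only secret handling could look like. Only `--secret-stdin` comes from this recipe; the function name `read_secret_arg`, the `SECRET_VALUE` variable, and the refused flag spellings (`--secret`, `--password`) are assumptions for illustration:

```bash
# Hypothetical argument handling (not the skill's real parser): refuse secrets
# passed in argv and read the value from stdin when --secret-stdin is present.
SECRET_VALUE=''
read_secret_arg() {
  local arg want_stdin=0
  for arg in "$@"; do
    case "${arg}" in
      --secret-stdin) want_stdin=1 ;;
      --secret|--secret=*|--password|--password=*)
        printf 'refused: pass secrets on stdin with --secret-stdin\n' >&2
        return 2 ;;
    esac
  done
  [ "${want_stdin}" -eq 1 ] || return 0
  # Read one line; keep the value even when stdin lacks a trailing newline.
  IFS= read -r SECRET_VALUE || [ -n "${SECRET_VALUE}" ]
}
```

The value then never appears in the process's argument list, so it stays out of `ps` output and shell history.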
166
+ ## AP-8: Don't run network calls at adapter file-source time
167
+
168
+ Sourcing must be cheap and pure. `command -v <binary>` is fine; running the binary at source time is not. The framework sources adapters dozens of times per CI run (once per subshell aggregation in doctor + lint). A network call at file-source time multiplies that cost.
169
+
170
+ ## AP-9: Don't test only the happy path
171
+
172
+ Every `tests/<tool>_adapter.bats` must cover:
173
+ - Declaration-presence of all 11 required functions.
174
+ - `tool_capabilities()` returns valid JSON.
175
+ - Unsupported-op exit-41 path (at least one verb that returns 41).
176
+ - At least one happy path via the stub binary.
177
+
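The declaration-presence check in the list above can be sketched in plain shell. The helper below is hypothetical and takes the expected function names as arguments, since the full list of 11 required functions is not reproduced in this recipe:

```bash
# Hypothetical helper: print each expected function name that the adapter file
# does not define. Sourcing happens in a subshell so nothing leaks to the caller.
# Caveat: command -v also matches executables on PATH with the same name.
adapter_missing_functions() {
  local file="$1" fn
  shift
  (
    . "${file}" || exit 1
    for fn in "$@"; do
      command -v "${fn}" >/dev/null 2>&1 || printf '%s\n' "${fn}"
    done
  )
}
```

A bats case could then assert that the output is empty for the adapter under test.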
178
+ ## See also
179
+
180
+ - [add-a-tool-adapter recipe](add-a-tool-adapter.md)
181
+ - [extension model spec §9](../../docs/superpowers/specs/2026-04-30-tool-adapter-extension-model-design.md)
182
+ - [token-efficient adapter output spec](../../docs/superpowers/specs/2026-05-01-token-efficient-adapter-output-design.md)
@@ -0,0 +1,139 @@
1
+ # Recipe: `body_bytes`, not `body`, in replies
2
+
3
+ When a verb ingests caller-supplied content (HTTP body, large blob, multi-line text), ship the **byte length** in the reply, not the content itself. This avoids re-emitting agent-supplied data into stdout, logs, terminal captures, and the Claude transcript.
4
+
5
+ ## When to use this recipe
6
+
7
+ Use this whenever a verb takes content via `--content`, `--body`, `--data`, `--*-stdin`, or any flag that ingests bytes the agent typed and the verb forwards downstream. Already shipped:
8
+
9
+ - `scripts/browser-route.sh` + `scripts/lib/node/chrome-devtools-bridge.mjs` (Phase 6 part 7-ii) — `route fulfill --body` / `--body-stdin`. Reply has `body_bytes`, not `body`.
10
+ - `scripts/lib/node/chrome-devtools-bridge.mjs::runStatefulViaDaemon` (fill case, line ~432) — defensively scrubs `text` from the reply before emitting (related: privacy-canary handles the stronger secrets case).
11
+
12
+ Phase 7 capture-pipeline candidates: any sanitizer-output reply that shows the agent how much got redacted.
13
+
14
+ Do NOT use this recipe for:
15
+ - Replies whose **shape requires** the content (e.g. `extract --selector` returns the matched element's textContent — that's the whole point of the verb). The contract is *return what was extracted*, not *return how many bytes were extracted*.
16
+ - Secret-bytes (passwords, tokens). Those want **zero** reflection — see `privacy-canary.md`. Even `body_bytes` could leak something via length-side-channel for short secrets.
17
+
18
+ ## The pattern
19
+
20
+ ```javascript
21
+ // WRONG — echo the body in the reply
22
+ return {
23
+ verb: 'route',
24
+ action: 'fulfill',
25
+ pattern: msg.pattern,
26
+ status: msg.status,
27
+ body: msg.body, // <-- agent-supplied content, now in stdout
28
+ rule_count: routeRules.length,
29
+ };
30
+ ```
31
+
32
+ ```javascript
33
+ // RIGHT — ship a length contract
34
+ return {
35
+ verb: 'route',
36
+ action: 'fulfill',
37
+ pattern: msg.pattern,
38
+ fulfill_status: msg.status,
39
+ body_bytes: Buffer.byteLength(msg.body, 'utf8'),
40
+ rule_count: routeRules.length,
41
+ };
42
+ // body itself stays in the daemon's routeRules; never re-emitted.
43
+ ```
44
+
45
+ Source of truth: `scripts/lib/node/chrome-devtools-bridge.mjs::case 'route'` (the fulfill branch).
46
+
47
+ ## Why the agent doesn't need the body back
48
+
49
+ The agent **just sent** the body. They have it. The reply's job is to confirm:
50
+ - That the request was accepted (`status: 'ok'`).
51
+ - That it landed where intended (`pattern`, `rule_count`).
52
+ - That the bytes the daemon received match what they sent (`body_bytes`).
53
+
54
+ A length match is sufficient evidence that nothing got truncated by transport. If the agent suspects encoding corruption, they can `printf '%s' BODY | wc -c` and compare. They never needed the body echoed.
55
+
56
+ ## Why echoing is actively bad
57
+
58
+ 1. **Stdout is the Claude transcript.** Every echoed byte is a token the model rereads on the next turn. A 50KB JSON mock body bloats context for zero gain.
59
+ 2. **Logs persist.** If the user pipes the verb to `tee log.txt` or runs under `script(1)`, the body lands on disk in plain text. The daemon's in-memory store is a deliberate scoping decision; echoing undoes it.
60
+ 3. **Terminal-recording tools capture stdout.** Asciinema, screen recordings of demos, even `tmux` capture-pane all see whatever lands on stdout.
61
+ 4. **Convention sets expectations.** Once one verb echoes content, the next maintainer assumes that's the contract and adds another. Establishing "we ship lengths, not content" as the norm prevents the drift.
62
+
63
+ ## Why `Buffer.byteLength`, not `body.length`
64
+
65
+ ```javascript
66
+ // WRONG — JS string length is code units, not bytes
67
+ body_bytes: msg.body.length
68
+ // '🔒'.length === 2 (UTF-16 surrogate pair)
69
+ // '🔒' as utf-8 is 4 bytes
70
+ ```
71
+
72
+ ```javascript
73
+ // RIGHT — count bytes, not code units
74
+ body_bytes: Buffer.byteLength(msg.body, 'utf8')
75
+ ```
76
+
77
+ If the agent's `printf | wc -c` says 4 and the reply's `body_bytes` says 2, that looks like data loss when it's just a counting-units mismatch. Always count in the same unit the agent will count in (bytes, not chars).
78
+
79
+ Bash equivalent (for strings the bash verb script measures):
80
+
81
+ ```bash
82
+ # WRONG — only correct for ASCII
83
+ body_bytes="${#body_inline}"
84
+
85
+ # RIGHT — measure bytes via wc
86
+ body_bytes="$(printf '%s' "${body_inline}" | wc -c | tr -d ' ')"
87
+ ```
88
+
89
+ `scripts/browser-route.sh:107` uses `${#body_inline}` only for the dry-run / inline-body case where the bash side already knows the bytes won't surprise; the daemon-side authoritative count uses `Buffer.byteLength`. For new verbs, prefer `wc -c` bash-side.
90
+
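As a quick check of the byte-counting claim, the sketch below measures a body containing the lock emoji with `wc -c`. The helper name `byte_len` and the sample body are illustrative only:

```bash
# Count bytes the same way the agent would with `printf | wc -c`.
byte_len() {
  printf '%s' "$1" | wc -c | tr -d ' '
}

# U+1F512 (the lock emoji) written as its four UTF-8 bytes in octal, so the
# example does not depend on the encoding of this file.
lock="$(printf '\360\237\224\222')"
body="{\"msg\":\"${lock}\"}"

printf 'lock bytes: %s\n' "$(byte_len "${lock}")"   # 4
printf 'body bytes: %s\n' "$(byte_len "${body}")"   # 14
```

The body is 10 ASCII bytes plus the 4-byte emoji, so a reply built with `Buffer.byteLength(msg.body, 'utf8')` should report the same 14.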
91
+ ## Defense in depth: also scrub upstream
92
+
93
+ The bridge daemon's `route` reply is one layer. The fill verb's similar concern (`scripts/lib/node/chrome-devtools-bridge.mjs:432`) shows the defensive-scrub layered idiom:
94
+
95
+ ```javascript
96
+ const reply = await ipcCall({ verb: 'fill', ref, text });
97
+ // Defensive: scrub any echoed text from the reply before emitting.
98
+ if (reply && typeof reply === 'object') delete reply.text;
99
+ emitReply(reply);
100
+ ```
101
+
102
+ Even if the daemon child accidentally puts `text` into the reply, the bridge strips it on the way out. **Two layers** because either layer can be edited carelessly; a leak ships only if both are wrong at the same time.
103
+
104
+ ## What about echoing for `--dry-run`?
105
+
106
+ Dry-run is the **one** justified case for surfacing the body — the agent asked "what would happen?" without committing. Even there, ship `body_bytes` plus an excerpt with explicit truncation, never the full body:
107
+
108
+ ```bash
109
+ # dry-run summary
110
+ emit_summary verb=route ... \
111
+ fulfill_status="${status_code}" \
112
+ body_bytes="${body_bytes}" \
113
+ body_excerpt="$(printf '%s' "${body_inline}" | head -c 80)" \
114
+ body_truncated="$([ "${body_bytes}" -gt 80 ] && echo true || echo false)"
115
+ ```
116
+
117
+ `browser-route.sh` doesn't ship the excerpt today. Add it in a follow-up if dry-run UX feedback asks for it.
118
+
119
+ ## Checklist for any new content-ingesting verb
120
+
121
+ ```
122
+ 1. Reply object has `<thing>_bytes` (length), not `<thing>` (content).
123
+ 2. Daemon-child stores the content in the slot it was meant for; reply
124
+ surface is purely confirmation.
125
+ 3. Use Buffer.byteLength (Node) or wc -c (bash) — never .length on strings.
126
+ 4. If the verb routes through a bridge, add a defensive `delete reply.<thing>`
127
+ on the way out (see fill verb precedent).
128
+ 5. Test that asserts the byte-length contract: roundtrip a body with a known
129
+ non-ASCII character; confirm body_bytes matches `printf | wc -c`.
130
+ 6. NEVER echo the content even in error replies — error UX is the message
131
+ string, not the offending payload.
132
+ ```
133
+
134
+ ## See also
135
+
136
+ - `scripts/lib/node/chrome-devtools-bridge.mjs::case 'route'` — fulfill branch.
137
+ - `scripts/lib/node/chrome-devtools-bridge.mjs:432` — fill defensive scrub.
138
+ - [Privacy canary recipe](privacy-canary.md) — stronger discipline for credential bytes.
139
+ - [Path security recipe](path-security.md) — sister pattern for filesystem inputs.
@@ -0,0 +1,210 @@
1
+ # Recipe: Cache-write security
2
+
3
+ A discipline for any verb that writes learned state to disk based on caller-supplied bytes. Codifies the contract that Phase 11 part 1's memory cache (`scripts/lib/memory.sh` + `scripts/browser-do.sh`) ships with: verbs that turn agent input into persistent cache state are a different security shape from verbs that read state, and they need defenses the read side doesn't.
4
+
5
+ ## When to use this recipe
6
+
7
+ Use this **whenever you add a verb that persists caller-supplied selectors, intents, URL patterns, or other free-text agent input as cache/memory state read by future invocations**. Examples already shipped:
8
+
9
+ - `scripts/browser-do.sh record` — Phase 11 part 1-ii; writes `(intent, selector, url_pattern)` triples to `~/.browser-skill/memory/<site>/`.
10
+ - `scripts/browser-do.sh --intent` (cache-hit success path) — Phase 11 part 1-ii/iii; bumps `success_count` + records `pattern → archetype` mapping on dispatch success.
11
+ - `scripts/browser-do.sh --intent` (cache-hit failure path, exit 11/13) — Phase 11 part 1-iii; increments `fail_count` toward H1 disable threshold.
12
+
13
+ Do NOT use this recipe for:
14
+ - Read-side verbs (lookup, list, show) — they don't persist new state; the privacy invariant is "don't echo what's already on disk", which is `privacy-canary.md`'s domain.
15
+ - Captures (Phase 7) — they persist *observed* page state, not cache mappings; sanitization is `sanitize.sh`'s job, not the cache-write contract.
16
+ - Sites profile / credentials / sessions (Phase 1–4) — these store explicit user-registered material; the user typed it deliberately. Cache writes happen *incidentally* during agent execution, which is what makes the security shape different.
17
+
18
+ ## The five rules
19
+
20
+ ### Rule 1 — Whitelist the cache-write surface
21
+
22
+ Never accept a caller-supplied **verb name** that you'll dispatch (or store) without enumerating allowed targets in code.
23
+
24
+ ```
25
+ # WRONG — accept any verb name the caller supplies
26
+ case "${arg_verb}" in
27
+ *) bash "${SCRIPT_DIR}/browser-${arg_verb}.sh" --selector "${cached}" ;;
28
+ esac
29
+ ```
30
+
31
+ ```
32
+ # RIGHT — explicit constant whitelist; reject everything else
33
+ readonly DO_VERB_WHITELIST=(click) # v1: only click takes --selector
34
+ _verb_in_whitelist() {
35
+ local needle="$1" v
36
+ for v in "${DO_VERB_WHITELIST[@]}"; do
37
+ [ "${v}" = "${needle}" ] && return 0
38
+ done
39
+ return 1
40
+ }
41
+ _verb_in_whitelist "${arg_verb}" \
42
+ || die "${EXIT_USAGE_ERROR}" "browser-do: --verb '${arg_verb}' not in whitelist (allowed: ${DO_VERB_WHITELIST[*]})"
43
+ ```
44
+
45
+ Why: a caller-supplied verb name dispatching to `bash scripts/browser-${arg_verb}.sh` lets a typo silently route to the wrong verb (`fil` → file-not-found vs `fill` actually firing); worse, lets a hostile prompt trick the agent into invoking credential-handling verbs (`creds-show`, `extract`, `audit`) under the cache-hit fast path. Whitelist is constant; reviewers can grep it; new verbs join the whitelist by an explicit code change with reviewable rationale.
46
+
47
+ Reference: `scripts/browser-do.sh::DO_VERB_WHITELIST` + `tests/browser-do.bats::4`.
48
+
49
+ ### Rule 2 — Refuse cache writes containing credential sentinels
50
+
51
+ Cache writes carry caller-supplied free-text (intent phrases, selectors). If an agent accidentally inlines credential bytes into those args, they hit disk in the cache and survive across sessions. Refuse with a fast-fail.
52
+
53
+ ```
54
+ # WRONG — store whatever the caller passed
55
+ memory_record "${site}" "${arch}" "${intent}" "${selector}"
56
+ # Caller wrote intent="type password 'mySecret123'" → "mySecret123" persists.
57
+ ```
58
+
59
+ ```
60
+ # RIGHT — refuse on sentinel-shaped content
61
+ readonly CANARY_SENTINEL='PASSWORD-CANARY'
62
+ _canary_check() {
63
+ local field="$1" value="$2"
64
+ if printf '%s' "${value}" | grep -qF -- "${CANARY_SENTINEL}"; then
65
+ die "${EXIT_BLOCKLIST_REJECTED}" "browser-do: refused — ${field} contains canary sentinel '${CANARY_SENTINEL}'"
66
+ fi
67
+ }
68
+ _canary_check "intent" "${arg_intent}"
69
+ _canary_check "selector" "${arg_selector}"
70
+ memory_record "${site}" "${arch}" "${arg_intent}" "${arg_selector}"
71
+ ```
72
+
73
+ This is **not a real secret detector** — entropy scanning, real password-format detection, and the broader regex-zoo of credential patterns are out of scope for this recipe. The sentinel:
74
+
75
+ - Lets bats inject `PASSWORD-CANARY` into intent or selector, assert exit `EXIT_BLOCKLIST_REJECTED (28)`, AND assert the cache file is untouched on disk. That's the **regression** safety net.
76
+ - Forces the production code to have a refusal codepath at all, instead of unconditionally writing whatever shows up.
77
+
78
+ Real entropy/format-based detection is a future hardening pass. Document it as such in the recipe-doc + plan-doc.
79
+
80
+ Reference: `scripts/browser-do.sh::_canary_check` + `tests/browser-do.bats::24,25`.
81
+
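The property the bats cases assert (refusal exit code plus an untouched cache file) follows from checking every field before any write happens. A self-contained sketch, assuming a hypothetical `guarded_record` that appends a tab-separated line; the real `memory_record` stores a different on-disk shape:

```bash
# Values as quoted in this recipe.
CANARY_SENTINEL='PASSWORD-CANARY'
EXIT_BLOCKLIST_REJECTED=28

# Hypothetical stand-in for memory_record: check every field first, then append.
# Returning before the write is what keeps the cache file untouched on refusal.
guarded_record() {
  local cache_file="$1" intent="$2" selector="$3" value
  for value in "${intent}" "${selector}"; do
    case "${value}" in
      *"${CANARY_SENTINEL}"*) return "${EXIT_BLOCKLIST_REJECTED}" ;;
    esac
  done
  printf '%s\t%s\n' "${intent}" "${selector}" >> "${cache_file}"
}
```

A test can checksum the cache file before and after a refused call and require that the two checksums match.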
82
+ ### Rule 3 — Cache writes are best-effort; never taint the action's exit code
83
+
84
+ The action (clicking, filling, navigating) is the user's actual intent. The cache-write is an *opportunistic side effect* that improves future runs. **Cache freshness < action correctness.**
85
+
86
+ ```
87
+ # WRONG — cache failure surfaces as the verb's exit code
88
+ memory_record "${site}" "${arch}" "${intent}" "${selector}" \
89
+ || die "${EXIT_GENERIC_ERROR}" "cache write failed"
90
+ ```
91
+
92
+ ```
93
+ # RIGHT — log and forge ahead
94
+ if ! memory_record "${site}" "${arch}" "${intent}" "${selector}" 2>/dev/null; then
95
+ warn "browser-do: cache success_count update failed (best-effort; action exit unchanged)"
96
+ fi
97
+ ```
98
+
99
+ Why: a disk-full or perms-bug while writing the cache must not retroactively turn a successful click into an error the agent has to handle. The agent did the right thing; if the skill couldn't remember, that's the skill's problem to log, not the agent's problem to debug. The `warn:` line is observable to the user/reviewer; the action's exit code (and downstream agent decisions) stays correct.
100
+
101
+ Reference: `scripts/browser-do.sh` cache-hit-success branch + post-dispatch failure-recording branch (both `warn:`-only on cache failure).
102
+
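One way to force the cache-write failure that this rule describes is to point the write at a path that cannot exist. The wrapper below is a hypothetical sketch, not the `browser-do.sh` code path; it shows the action's exit code passing through unchanged whether or not the cache write succeeds:

```bash
# Hypothetical wrapper: run the action, attempt the cache write only on success,
# and always return the action's own exit code.
run_with_best_effort_cache() {
  local cache_file="$1" rc=0
  shift
  "$@" || rc=$?
  if [ "${rc}" -eq 0 ]; then
    # A failed append (missing directory, full disk) is logged, never returned.
    if ! { printf 'ok\n' >> "${cache_file}"; } 2>/dev/null; then
      printf 'warn: cache write failed (best-effort; action exit unchanged)\n' >&2
    fi
  fi
  return "${rc}"
}
```

Calling it with an unwritable cache path and a succeeding action still returns 0, which may be a workable basis for the deferred best-effort test.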
103
+ ### Rule 4 — Self-heal failure-counting needs an exit-code whitelist
104
+
105
+ If you wire failure counting into a verb (e.g. "fail 4 times → mark cached selector disabled"), **only specific exit codes drive the counter.** Counting any non-zero exit poisons the cache when the failure was environmental.
106
+
107
+ ```
108
+ # WRONG — count any failure as a selector-fitness signal
109
+ if [ "${dispatch_rc}" -ne 0 ]; then
110
+ memory_record_failure "${site}" "${arch}" "${intent}"
111
+ fi
112
+ ```
113
+
114
+ ```
115
+ # RIGHT — whitelist the exit codes that genuinely indicate a selector miss
116
+ elif [ "${dispatch_rc}" -eq "${EXIT_EMPTY_RESULT}" ] || [ "${dispatch_rc}" -eq "${EXIT_ASSERTION_FAILED}" ]; then
117
+ # 11 = element not found at selector; 13 = assertion failed (expected element absent).
118
+ # 30 (network), 42 (tool crash), 43 (timeout) are environmental — they would poison
119
+ # the cache if we counted them.
120
+ if ! memory_record_failure "${site}" "${arch}" "${intent}" 2>/dev/null; then
121
+ warn "browser-do: cache fail_count update failed (best-effort)"
122
+ fi
123
+ fi
124
+ ```
125
+
126
+ Why: a flaky network or a one-off tool crash shouldn't push a working selector toward disable. The cache is supposed to remember "this selector reliably finds the element"; only the kind of failure that would change that conclusion (the selector returning nothing; the assertion not matching) qualifies as evidence the cached value is stale.
127
+
128
+ Pick the whitelist deliberately:
129
+ - **In:** Exit codes that mean "the cached value was tried and its referent wasn't there." `EXIT_EMPTY_RESULT (11)` and `EXIT_ASSERTION_FAILED (13)` for Phase 11 part 1.
130
+ - **Out:** Network errors, tool crashes, timeouts, session expiry, usage errors. Document the cutoff inline so future readers see the rule, not just the list.
131
+
132
+ Reference: `scripts/browser-do.sh` post-dispatch elif branch + `tests/browser-do.bats::29,30,31` (covers in-whitelist 11+13 + out-of-whitelist 30).
133
+
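The in/out decision can also be factored into a small predicate so the cutoff is testable on its own. This is a sketch under the assumption that the exit-code values are the ones quoted in this recipe; `is_selector_miss` is a hypothetical name:

```bash
# Exit-code values as quoted in this recipe.
EXIT_EMPTY_RESULT=11
EXIT_ASSERTION_FAILED=13

# Hypothetical predicate: succeeds only for exit codes that count as evidence
# that the cached selector is stale.
is_selector_miss() {
  case "$1" in
    "${EXIT_EMPTY_RESULT}"|"${EXIT_ASSERTION_FAILED}") return 0 ;;
    *) return 1 ;;   # 0, network (30), tool crash (42), timeout (43), anything else
  esac
}
```

A caller would guard the failure counter with `is_selector_miss "${dispatch_rc}"`, so any exit code not listed is treated as environmental by default.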
134
+ ### Rule 5 — Lock the cache schema; don't store action-type
135
+
136
+ Cache schema is **forever** (or schema-version-bump-forever). Frozen v1 shapes can only grow new fields, not change existing ones. The temptation to add a `verb` field per cached interaction so the cache can dispatch any verb on hit — *resist it*.
137
+
138
+ ```
139
+ WRONG — cache stores (intent, selector, verb)
140
+ {
141
+ "intent": "click delete",
142
+ "selector": "button.delete",
143
+ "verb": "click" // <-- couples cache to verb-set; schema-bump on every new verb
144
+ }
145
+ ```
146
+
147
+ ```
148
+ RIGHT — cache stores (intent, selector); caller specifies verb per call
149
+ {
150
+ "intent": "click delete",
151
+ "selector": "button.delete"
152
+ }
153
+ # Caller: browser-do --verb click --intent "click delete"
154
+ # Same selector can serve hover, fill (with --text), etc. — orthogonal axis.
155
+ ```
156
+
157
+ Why: storing verb-type in the cache forces a schema bump every time you add a new dispatchable verb. The cache becomes brittle to the verb set. Moving the verb axis out (caller passes `--verb` per call) keeps the cache stable across verb-set evolution, AND lets the same selector serve multiple actions naturally (the same `button.confirm` selector is clickable AND hoverable; storing it once is correct).
158
+
159
+ Corollary: **don't store literal URLs**. Store *patterns*. URLs change every visit (`/devices/123` vs `/devices/124`); patterns generalize (`/devices/:id`). Storing the literal couples the cache to one entity instance; storing the pattern lets the cache hit across the whole archetype.
160
+
161
+ Reference: archetype JSON shape in `scripts/lib/memory.sh::memory_save_archetype` + design doc 2026-05-08-phase-11-memory-design.md §3 M1.
162
+
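To illustrate the literal-versus-pattern distinction in the corollary, here is a deliberately simplified sketch that replaces all-digit path segments with `:id`. The function name and the digits-only rule are assumptions; the skill's own pattern resolver may apply different heuristics:

```bash
# Hypothetical simplification: replace every all-digit path segment with :id.
# The skill's own resolver lives elsewhere and may use different rules.
url_path_to_pattern() {
  printf '%s\n' "$1" | awk -F/ '
    BEGIN { OFS = "/" }
    {
      for (i = 1; i <= NF; i++) if ($i ~ /^[0-9]+$/) $i = ":id"
      print
    }'
}
```

Both `/devices/123` and `/devices/124` map to `/devices/:id`, so a cache keyed on the pattern hits for either entity.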
163
+ ## What to test
164
+
165
+ Each rule needs at least one bats case proving the contract holds:
166
+
167
+ ```
168
+ 1. Whitelist enforcement: --verb ghost → exit EXIT_USAGE_ERROR
169
+ 2. Canary refusal (intent): intent='PASSWORD-CANARY ...' → exit 28; cache untouched
170
+ 3. Canary refusal (selector): selector='input[name=PASSWORD-CANARY]' → exit 28; cache untouched
171
+ 4. Best-effort write semantics: (harder — needs a forced cache-write failure
172
+ to prove the action's exit code stays unchanged.
173
+ May be deferred to integration testing.)
174
+ 5. Self-heal in-whitelist: dispatched verb exits 11 → fail_count++
175
+ 6. Self-heal in-whitelist: dispatched verb exits 13 → fail_count++
176
+ 7. Self-heal out-of-whitelist: dispatched verb exits 30 → fail_count UNCHANGED
177
+ 8. Schema stability: new field added → existing fixtures still parse
178
+ (round-trip test in lib bats)
179
+ ```
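
Cases 5 to 7 above pin down a small decision rule, which can be sketched as a pure function. `next_fail_count` is a hypothetical name; the only facts taken from this recipe are that exit codes 11 and 13 are in the self-heal whitelist and exit code 30 is not. Resetting the counter on success is deliberately not modeled here.

```shell
#!/bin/sh
# Hypothetical helper: compute the fail_count after a dispatched verb exits.
# $1 = exit code of the dispatched verb, $2 = current fail_count
next_fail_count() {
  case "$1" in
    11|13) printf '%s\n' "$(( $2 + 1 ))" ;;  # in-whitelist: count the failure
    *)     printf '%s\n' "$2" ;;             # anything else: leave unchanged
  esac
}

next_fail_count 11 0   # prints: 1
next_fail_count 13 2   # prints: 3
next_fail_count 30 2   # prints: 2
```

Keeping the rule in one `case` arm makes the whitelist visible at a glance, so widening it shows up as an explicit diff.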
180
+
181
+ Sample placement (already shipped): `tests/browser-do.bats::4,12,13,29,30,31,32` (cases 1, 2, 3, 5, 6, 7 from the list above, plus end-to-end). `tests/memory.bats::2,13` (Rule 5 schema round-trip + self-heal D2 reset).
182
+
183
+ ## Why a per-recipe contract beats per-PR vigilance
184
+
185
+ Phase 11 part 1 shipped over three PRs (1-i lib + 1-ii verb + 1-iii self-heal). Each PR had a plan-doc that locked decisions for that scope; the cumulative cache-write contract spans all three. **Without this recipe, a future PR adding a new cache-writing verb would have to re-derive the same five rules** by reading three plan-docs in sequence + the design doc. Recipes turn cumulative knowledge into a single grep-able artifact.
186
+
187
+ Concretely: if a PR adds `browser-do --verb fill` after `fill` gains `--selector` adapter plumbing, the reviewer should be able to ask "did this PR honor the cache-write contract?" and check this file's five rules + the test placements without having to reconstruct the rationale from three months of git log.
188
+
189
+ ## Don't
190
+
191
+ - **Don't** add fields that store user-typed values verbatim in the cache. Selectors are CSS strings (structurally bounded); intents are short natural-language phrases (no value bytes). If you need to cache a typed value (e.g. "remember the username for this site"), it's not a memory-cache concern — it's the **credentials backend** (Phase 4), with its own security envelope.
192
+ - **Don't** widen the self-heal exit-code whitelist without explicit rationale. Adding code 22 ("session expired") looks reasonable but couples the cache to a different subsystem's failure mode; the cached selector is fine — the *session* needs renewing.
193
+ - **Don't** cache across sites. Per-site memory is the boundary (design doc §12). A selector that works on `prod-app` may have a homonym on `staging` that means something different.
194
+ - **Don't** auto-sanitize / strip sentinel bytes silently. Refuse with `EXIT_BLOCKLIST_REJECTED`. The agent's call site needs to know the cache write didn't happen so it doesn't assume a future hit; silent strip would create a "successful write that wrote nothing" failure mode that's worse than refusal.
195
+ - **Don't** add `press` to the cache-dispatch whitelist via `--focus-selector` or similar target-flag retrofit. **`press` is structurally outside cache scope** — chrome-devtools-mcp's bridge `case 'press':` is target-less by design ("Stateless w.r.t. refMap — acts on the focused element or page"; `lib/node/chrome-devtools-bridge.mjs:1098`). The cache-friendly composition is: agent calls `browser-do --verb click --intent "focus input"` to land focus on the right element, then invokes `browser-press --key Enter` directly (no cache; relies on focus state from prior click). This composition uses the cache where it adds value (target resolution) and leaves press as a no-op-for-cache stateless keyboard event. Documented in selector-mode-select plan-doc as decision SS5 + selector-mode plumbing HANDOFF row.
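
The refuse-don't-strip rule from the list above can be sketched as a guard that runs before any cache write. Assumptions, stated loudly: `guard_cache_write` is a hypothetical name, the sentinel is the `PASSWORD-CANARY` string used in the test list, and `EXIT_BLOCKLIST_REJECTED` is taken to equal 28 because the canary-refusal cases expect exit 28. The recipe does not confirm that mapping.

```shell
#!/bin/sh
EXIT_BLOCKLIST_REJECTED=28   # assumption: matches the "exit 28" canary cases
CANARY="PASSWORD-CANARY"     # sentinel string from the test list

# Hypothetical guard: $1 = intent, $2 = selector.
# Returns 0 if the write may proceed, EXIT_BLOCKLIST_REJECTED if either
# field contains the sentinel. It never rewrites or strips the input.
guard_cache_write() {
  for gcw_field in "$1" "$2"; do
    case "$gcw_field" in
      *"$CANARY"*) return "$EXIT_BLOCKLIST_REJECTED" ;;
    esac
  done
  return 0
}

if guard_cache_write "click delete" "button.delete"; then
  echo "write allowed"
fi

rc=0
guard_cache_write "type it" "input[name=PASSWORD-CANARY]" || rc=$?
echo "refused with exit $rc"
```

Each field is checked on its own rather than concatenated, so a sentinel cannot appear by accident across the intent/selector boundary. The caller sees a non-zero status and knows no cache entry exists.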
196
+
197
+ ## Codified deferrals
198
+
199
+ | Verb / surface | Status | Why |
200
+ |---|---|---|
201
+ | `press` cache dispatch | **deferred (option c — compose-with-click+press)** | Bridge designed target-less. Cache-friendly composition: cache `click "focus input"` then invoke `press` directly. Recommended over `--focus-selector` retrofit; no IPC schema bump needed. |
202
+ | `hover`/`select` on playwright-lib | deferred (no demand) | Hover + select route exclusively to chrome-devtools-mcp per `lib/router.sh::rule_{hover,select}_default`. If routing ever expands, mirror PR #105's pattern (`runHover`/`runSelect` + IPC `case 'hover':`/`case 'select':` selector branch). |
203
+
204
+ ## See also
205
+
206
+ - [Privacy canary recipe](privacy-canary.md) — sister pattern for read-side verbs that ingest secrets.
207
+ - [Path security recipe](path-security.md) — how `~/.browser-skill/memory/` enforces 0700 dir + 0600 file modes (mirrored from the captures pipeline).
208
+ - [Anti-patterns: tool extension](anti-patterns-tool-extension.md) — AP-7 (secrets-via-stdin), the broader pattern this recipe extends to cache-write surfaces.
209
+ - Design doc: `docs/superpowers/specs/2026-05-08-phase-11-memory-design.md` §6 (memory + recipes integration), §12 (cross-site boundary), §3 H1 (self-heal threshold).
210
+ - Phase 11 part 1 plan-docs: `2026-05-10-phase-11-part-1-{i,ii,iii}-*.md` — per-PR scope decisions that compose into this recipe.