opensteer 0.8.13 → 0.8.15

@@ -1,17 +1,6 @@
  # Opensteer SDK Reference
 
- This covers all task paths (DOM, request capture, browser admin). For task triage, see SKILL.md.
-
- Use the SDK when the workflow should become reusable TypeScript code in the repository.
-
- ## Sections
-
- - [Construction](#construction)
- - [DOM Automation And Extraction](#dom-automation-and-extraction)
- - [Browser Admin](#browser-admin)
- - [Request Capture, Plans, And Recipes](#request-capture-plans-and-recipes)
- - [Common Methods](#common-methods)
- - [Rules](#rules)
+ Use the SDK when the result should become reusable TypeScript in the repository.
 
  ## Construction
 
@@ -19,143 +8,127 @@ Use the SDK when the workflow should become reusable TypeScript code in the repo
  import { Opensteer } from "opensteer";
 
  const opensteer = new Opensteer({
- workspace: "github-sync",
+ workspace: "demo",
  rootDir: process.cwd(),
- browser: "persistent",
- launch: {
- headless: false,
- },
- context: {
- locale: "en-US",
- },
  });
  ```
 
- `workspace` creates a repo-local persistent root under `.opensteer/workspaces/<id>`.
- Omitting `workspace` creates a temporary root.
- `browser` can be `persistent`, `temporary`, or `{ mode: "attach", endpoint?, freshTab? }`.
- `opensteer.browser.status()`, `clone()`, `reset()`, and `delete()` manage the persistent workspace browser.
- `close()` shuts the current session and, for persistent workspaces, closes the live browser process.
- `disconnect()` detaches local runtime handles and leaves the workspace/browser files intact.
- Network history is SQLite-backed and auto-persisted. Records are written to SQLite as they are captured during `goto()`, `click()`, and other actions. The database initializes on first use.
- Generic workspace and browser helpers do not require SQLite capability unless they touch network history persistence.
- The current public SDK does not expose `Opensteer.attach()`, cloud session helpers, or the ABP engine.
+ Key options:
 
- ## DOM Automation And Extraction
+ - `workspace`: persistent repo-local browser state
+ - `rootDir`: where `.opensteer` lives
+ - `browser`: local browser mode or attach config
+ - `provider`: cloud config when applicable
 
- Opensteer uses a two-phase workflow: **explore** with the CLI, then **replay** with the SDK.
+ ## DOM Automation
 
- ### Phase 1 Exploration (one-time, via CLI or setup script)
+ Explore with the CLI first. Use the snapshot `html` output to find `c="N"` element numbers.
 
- Run `opensteer snapshot action --workspace demo` from the CLI first. Read the `html` field in the JSON output — it is a clean filtered DOM with `c="N"` attributes. Use those counter numbers as the `element` parameter below. The SDK also exposes `snapshot()`, but this guide keeps discovery in the CLI so the DOM HTML is easy to inspect from the terminal.
+ Persist a DOM action target during exploration:
 
  ```ts
- import { Opensteer } from "opensteer";
-
- const opensteer = new Opensteer({
- workspace: "demo",
- rootDir: process.cwd(),
- });
-
- await opensteer.open("https://example.com");
-
- // element numbers come from c="N" values in the snapshot html field
  await opensteer.click({
  element: 3,
- description: "primary button", // caches the element path
+ persist: "primary button",
  });
 
  await opensteer.input({
  element: 7,
- description: "search input", // caches the element path
+ persist: "search input",
  text: "laptop",
  pressEnter: true,
  });
+ ```
+
+ Replay it later without element numbers:
 
- await opensteer.extract({
- description: "page summary",
+ ```ts
+ await opensteer.click({ persist: "primary button" });
+ await opensteer.input({ persist: "search input", text: "laptop", pressEnter: true });
+ ```
+
+ Persist works for extraction too:
+
+ ```ts
+ const summary = await opensteer.extract({
+ persist: "page summary",
  schema: {
  title: { selector: "title" },
  url: { source: "current_url" },
  },
  });
-
- await opensteer.close();
  ```
 
- ### Phase 2 — Deterministic replay (the actual reusable script)
+ Rules:
 
- Use `description` alone for everything resolves from cached descriptors:
+ - Use `persist` for reusable DOM targets and extraction descriptors.
+ - Use `selector` only as a low-level escape hatch.
+
+ ## Network Discovery
 
  ```ts
- const opensteer = new Opensteer({
- workspace: "demo",
- rootDir: process.cwd(),
+ await opensteer.goto("https://example.com/search", {
+ captureNetwork: "search",
  });
 
- await opensteer.open("https://example.com");
-
- await opensteer.click({ description: "primary button" });
- await opensteer.input({ description: "search input", text: "laptop", pressEnter: true });
- const data = await opensteer.extract({ description: "page summary" });
+ const records = await opensteer.network.query({
+ capture: "search",
+ limit: 20,
+ });
 
- await opensteer.close();
+ const detail = await opensteer.network.detail(records.records[0]!.recordId);
+ const replay = await opensteer.network.replay(records.records[0]!.recordId, {
+ query: { keyword: "headphones" },
+ });
  ```
 
- DOM rules:
+ Use:
 
- - Deterministic scripts use `description` for all interactions and extractions — no snapshots, no selectors.
- - `element + description` persists a DOM action descriptor. Bare `description` replays it later.
- - `description + schema` writes or updates a persisted extraction descriptor. Bare `description` replays it later.
- - Use `element` targets only during the exploration phase with a fresh snapshot from the CLI.
- - Keep DOM data collection in `extract()`, not `evaluate()` or raw page DOM parsing, when the result can be expressed as structured fields.
- - CSS selectors exist as a low-level escape hatch but are not recommended for reusable scripts.
+ - `network.query()` to shortlist requests
+ - `network.detail()` to inspect one request deeply
+ - `network.replay()` to confirm the transport and response shape
 
- Supported extraction field shapes:
+ ## Browser State
 
- - `{ element: N }` — requires a prior CLI snapshot; use during exploration only
- - `{ element: N, attribute: "href" }`
- - `{ selector: ".price" }`
- - `{ selector: "img.hero", attribute: "src" }`
- - `{ source: "current_url" }`
+ ```ts
+ const cookies = await opensteer.cookies("example.com");
+ const localStorage = await opensteer.storage("example.com", "local");
+ const sessionStorage = await opensteer.storage("example.com", "session");
+ const browserState = await opensteer.state("example.com");
+ ```
 
- For arrays, provide 1-2 representative objects. The extractor auto-generalizes from these templates to find ALL matching rows on the page:
+ `cookies()` returns a small cookie-jar helper:
 
  ```ts
- const results = await opensteer.extract({
- description: "search results",
- schema: {
- items: [
- { name: { element: 13 }, price: { element: 14 } },
- { name: { element: 22 }, price: { element: 23 } },
- ],
- },
- });
- // results.items contains ALL matching rows on the page, not just the 2 templates
+ cookies.has("session");
+ cookies.get("session");
+ cookies.getAll();
+ cookies.serialize();
  ```
 
- Do not use `prompt` or semantic placeholder values such as `"string"` in the current public SDK. The extractor expects explicit schema objects, arrays, and field descriptors.
-
- ### What extract() Returns
+ ## Session-Aware Fetch
 
- `extract()` returns a plain JSON object matching your schema shape:
+ `fetch()` is the main replay primitive for final code.
 
  ```ts
- // Flat schema:
- { title: "Search Results", url: "https://..." }
-
- // Array schema (auto-generalized from 1-2 templates):
- {
- items: [
- { name: "Apple AirPods Max", price: "$549.99" },
- { name: "Apple AirPods Pro", price: "$249.99" },
- { name: "Apple AirPods 4", price: "$129.99" },
- // ... ALL matching rows
- ]
- }
+ const response = await opensteer.fetch("https://api.example.com/search", {
+ query: {
+ keyword: "laptop",
+ count: 24,
+ },
+ });
+
+ const data = await response.json();
  ```
 
- Use `extract()` for structured data. Do NOT use `evaluate()` or raw DOM parsing when `extract()` can express the result.
+ If exploration showed a required transport:
+
+ ```ts
+ const response = await opensteer.fetch("https://api.example.com/search", {
+ query: { keyword: "laptop" },
+ transport: "matched-tls",
+ });
+ ```
 
  ## Browser Admin
 
@@ -170,244 +143,15 @@ if (!status.live) {
  }
  ```
 
- `browser.clone()` is only for persistent workspace browsers.
- Clone before `open()` when the workflow needs local authenticated browser state.
- `browser.reset()` clears cloned browser state but keeps the workspace.
- `browser.delete()` removes workspace browser files.
-
- ## Request Capture, Plans, And Recipes
-
- For the complete pipeline with mandatory phases, see [Request Plan Pipeline](request-workflow.md). This section covers SDK method signatures and examples for reusable scripts.
-
- The deliverable of a request capture workflow is a persisted request plan tested via `request()`. `rawRequest()` is a diagnostic probe — not the deliverable.
-
- ### Capture, Probe, and Infer
-
- ```ts
- await opensteer.open();
- await opensteer.goto({
- url: "https://example.com/app",
- captureNetwork: "page-load",
- });
-
- await opensteer.click({
- selector: "button.load-products",
- description: "load products",
- captureNetwork: "products-load",
- });
-
- const records = await opensteer.queryNetwork({
- capture: "products-load",
- includeBodies: true,
- limit: 20,
- });
-
- // DIAGNOSTIC ONLY — probe transport to determine portability.
- // Do NOT return this data as the final answer. Proceed to inferRequestPlan.
- const response = await opensteer.rawRequest({
- transport: "direct-http",
- url: "https://example.com/api/products",
- method: "POST",
- body: {
- json: { page: 1 },
- },
- });
-
- // Infer plan with the transport you proved works
- await opensteer.inferRequestPlan({
- recordId: records.records[0]!.recordId,
- key: "products.search",
- version: "v1",
- transport: "direct-http",
- });
-
- // Replay the plan — this is the deliverable
- await opensteer.request("products.search", {
- query: { q: "laptop" },
- });
- ```
-
- ### Read and Fix Plans
-
- After inferring a plan, read it to validate auth and annotate parameters:
-
- ```ts
- // Read the inferred plan
- const plan = await opensteer.getRequestPlan({
- key: "products.search",
- version: "v1",
- });
-
- // Inspect plan.payload.auth — if auth.strategy is set but the API is public,
- // rewrite the plan with auth removed
- await opensteer.writeRequestPlan({
- key: "products.search",
- version: "v1",
- tags: ["products", "search"],
- provenance: {
- source: "manual",
- notes: "Auth removed — API is public. Parameters annotated.",
- },
- payload: {
- ...plan.payload,
- auth: undefined, // Remove spurious auth classification
- parameters: [
- { name: "q", in: "query", required: true, description: "Search keyword" },
- { name: "count", in: "query", defaultValue: "24", description: "Results per page" },
- { name: "offset", in: "query", defaultValue: "0", description: "Pagination offset" },
- ],
- },
- });
- ```
+ ## Recommended Rules
 
- ### Auth Recipes
+ - Explore with the CLI first, then write reusable SDK code.
+ - Use `captureNetwork` on the real browser actions that trigger the traffic.
+ - Let `replay` tell you the required transport instead of guessing.
+ - Keep the final artifact as code, not as shell commands or giant logs.
 
- When an API genuinely requires auth, create an auth recipe that acquires tokens automatically:
-
- ```ts
- // Write an auth recipe that acquires a bearer token
- await opensteer.writeAuthRecipe({
- key: "example.auth",
- version: "v1",
- payload: {
- description: "Acquire guest bearer token for example.com API",
- steps: [
- {
- kind: "directRequest",
- request: {
- url: "https://example.com/api/oauth/token",
- transport: "direct-http",
- method: "POST",
- body: { json: { grant_type: "client_credentials" } },
- },
- capture: {
- bodyJsonPointer: { pointer: "/access_token", saveAs: "token" },
- },
- },
- ],
- outputs: {
- headers: { Authorization: "Bearer {{token}}" },
- },
- },
- });
-
- // Bind the auth recipe to a plan by rewriting the plan with auth.recipe
- const plan = await opensteer.getRequestPlan({ key: "products.search" });
- await opensteer.writeRequestPlan({
- key: "products.search",
- version: "v1",
- payload: {
- ...plan.payload,
- auth: {
- strategy: "bearer-token",
- recipe: { key: "example.auth", version: "v1" },
- failurePolicy: { on: "status", status: "401", action: "recover" },
- },
- },
- });
-
- // Now request.execute automatically runs the auth recipe on 401
- const result = await opensteer.request("products.search", {
- query: { q: "laptop" },
- });
- ```
+ ## What Not To Do
 
- **Recipe step types:** `directRequest`, `sessionRequest`, `request`, `readCookie`, `readStorage`, `evaluate`, `waitForNetwork`, `waitForCookie`, `goto`, `solveCaptcha`, `hook`. Each step can `capture` values; `outputs` maps captured variables to `headers`, `query`, `params`, or `body` overrides.
-
- ### Rules
-
- - `captureNetwork` is supported on `goto()`, `click()`, `scroll()`, `input()`, and `hover()`. It is NOT supported on `open()`. Use `open()` then `goto({ url, captureNetwork })` to name initial navigation capture.
- - Query by capture first, then query all traffic to catch async requests that fire after page load.
- - Probe discovered APIs with `rawRequest()` using `direct-http` first, then `context-http`. `rawRequest()` is diagnostic — always proceed to `inferRequestPlan`.
- - Persistence is automatic; use `tagNetwork()` when you want to label a slice of already-persisted history for later lookup.
- - Use recipes when replay needs deterministic setup work. Use auth recipes when the setup is specifically auth-related. They live in separate registries.
-
- ### Input Shapes
-
- - `rawRequest` `headers` MUST be an array: `[{ name: "Authorization", value: "Bearer ..." }]`. NOT `{ Authorization: "Bearer ..." }`.
- - `rawRequest` `body` MUST be one of: `{ json: { ... } }`, `{ text: "..." }`, or `{ base64: "..." }`. NOT a raw string or object.
- - `rawRequest()` may populate parsed JSON on `data`. If it does not, decode `response.body.data` with `Buffer.from(..., "base64").toString("utf8")`.
- - Recipe step `request` fields accept `{ key: value }` objects for headers (unlike `rawRequest`).
-
- ### Common Errors
-
- | Error | Cause | Fix |
- | ------------------------------------------------------------ | ----------------------------------------------------- | ----------------------------------------------------------------- |
- | `"captureNetwork is not allowed"` | Used `captureNetwork` on `open()` | Move to `goto({ url, captureNetwork })` |
- | `"must be array"` on `rawRequest` | Headers passed as `{key: value}` | Use `[{name, value}]` array |
- | `"must match exactly one supported shape"` | Body passed as raw string | Wrap in `{json: {...}}` or `{text: "..."}` |
- | `"Specify exactly one of element, selector, or description"` | `scroll()` called without a target | Add `selector: "body"` or a `description` |
- | `"registry record already exists"` | `inferRequestPlan` called twice with same key+version | Catch the error or use a new version |
- | `"no stored extraction descriptor"` | `extract()` called with `description` but no `schema` | Always provide `schema` unless a descriptor was previously stored |
-
- ## Common Methods
-
- Session and page control:
-
- - `new Opensteer({ workspace?, rootDir?, browser?, launch?, context? })`
- - `open(url | { url?, workspace?, browser?, launch?, context? })`
- - `goto(url | { url, captureNetwork? })`
- - `listPages()`
- - `newPage({ url?, openerPageRef? })`
- - `activatePage({ pageRef })`
- - `closePage({ pageRef })`
- - `waitForPage({ openerPageRef?, urlIncludes?, timeoutMs? })`
-
- Interaction and extraction:
-
- - `click({ element | selector | description, captureNetwork? })`
- - `hover({ element | selector | description, captureNetwork? })`
- - `input({ element | selector | description, text, pressEnter?, captureNetwork? })`
- - `scroll({ element | selector | description, direction, amount, captureNetwork? })`
- - `extract({ description, schema? })`
-
- Inspection and evaluation:
-
- - `evaluate(script | { script, pageRef?, args? })`
- - `evaluateJson({ script, pageRef?, args? })`
- - `waitForNetwork({ ...filters, pageRef?, includeBodies?, timeoutMs? })`
- - `waitForResponse({ ...filters, pageRef?, includeBodies?, timeoutMs? })`
- - `queryNetwork({ ...filters, includeBodies?, limit? })`
- - `tagNetwork({ tag, ...filters })`
- - `clearNetwork({ tag? })`
-
- Request capture and replay:
-
- - `rawRequest({ transport?, pageRef?, url, method?, headers?, body?, followRedirects? })`
- - `inferRequestPlan({ recordId, key, version, transport? })`
- - `writeRequestPlan({ key, version, payload, tags?, provenance?, freshness? })`
- - `getRequestPlan({ key, version? })`
- - `listRequestPlans({ key? })`
- - `request(key, { path?, query?, headers?, body? })`
- - `writeRecipe({ key, version, payload, tags?, provenance? })`
- - `getRecipe({ key, version? })`
- - `listRecipes({ key? })`
- - `runRecipe({ key, version?, input? })`
- - `writeAuthRecipe({ key, version, payload, tags?, provenance? })`
- - `getAuthRecipe({ key, version? })`
- - `listAuthRecipes({ key? })`
- - `runAuthRecipe({ key, version?, input? })`
-
- Browser helpers:
-
- - `discoverLocalCdpBrowsers({ timeoutMs? })`
- - `inspectCdpEndpoint({ endpoint, headers?, timeoutMs? })`
- - `browser.status()`
- - `browser.clone({ sourceUserDataDir, sourceProfileDirectory? })`
- - `browser.reset()`
- - `browser.delete()`
-
- Lifecycle:
-
- - `close()`
- - `disconnect()`
-
- ## Rules
-
- - Wrap long-running browser ownership in `try/finally` and call `close()`.
- - Use `captureNetwork` on actions that trigger requests you may inspect later.
- - Use `description` for all interactions and extractions in deterministic scripts.
- - Use `description` plus `schema` to persist an extraction descriptor. Bare `description` replays it.
- - Use `element` targets only during CLI exploration with a fresh snapshot. Deterministic scripts use `description`.
- - The SDK does expose `snapshot()`, but this workflow keeps element discovery in the CLI with `snapshot action`.
- - Prefer Opensteer methods over raw Playwright so browser, extraction, and replay semantics stay consistent.
+ - Do not build new abstractions on top of simple `fetch()` code unless the task really needs them.
+ - Do not bypass Opensteer with raw Playwright when Opensteer already captured the request.
+ - Do not dump giant raw response blobs into logs or prompts when the filtered previews already show the useful structure.
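
Reviewer note: the hunks above swap the `description` option for `persist` in the `click()`, `input()`, and `extract()` examples. As a rough illustration of that rename, here is a minimal sketch of a hypothetical helper that rewrites an old-style target object. `toPersistTarget`, `LegacyTarget`, and `PersistTarget` are illustrative names and not part of the Opensteer SDK, and the sketch assumes the change is a straight key rename, which the diff suggests but does not confirm.

```typescript
// Hypothetical migration helper; not part of the Opensteer SDK.
// Assumption: `description` (0.8.13 docs) maps one-to-one onto `persist` (0.8.15 docs).
interface LegacyTarget {
  element?: number;
  description?: string;
  text?: string;
  pressEnter?: boolean;
}

interface PersistTarget {
  element?: number;
  persist?: string;
  text?: string;
  pressEnter?: boolean;
}

function toPersistTarget(target: LegacyTarget): PersistTarget {
  // Split off the legacy key and keep every other field untouched.
  const { description, ...rest } = target;
  // Only add `persist` when the legacy object actually carried a description.
  return description === undefined ? rest : { ...rest, persist: description };
}

// Example: the old-style click target from the removed lines of the diff.
const migrated = toPersistTarget({ element: 3, description: "primary button" });
console.log(migrated);
```

A helper like this would only cover the option shape; whether persisted descriptors written under 0.8.13 are still readable under 0.8.15 is not something this diff shows.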