opensteer 0.8.12 → 0.8.14

@@ -1,400 +1,153 @@
1
- # Request Plan Pipeline
1
+ # Opensteer Request Workflow
2
2
 
3
- If you haven't decided whether this workflow applies, see the task triage in SKILL.md.
3
+ Use this workflow when the task is to understand, validate, or reuse a site API.
4
4
 
5
- ## The Deliverable
5
+ The deliverable is working TypeScript that uses `session.fetch()` or other SDK primitives. The deliverable is not a registry artifact or a raw replay dump.
6
6
 
7
- The deliverable is a **persisted request plan** that works via `request.execute`. `rawRequest()` is a diagnostic probe — its output is never the final answer. You are not done until `request.execute` returns valid data from a stored plan.
7
+ ## Core Rules
8
8
 
9
- ## Critical Rules
9
+ 1. Capture real browser traffic instead of guessing request shapes.
10
+ 2. Use the filtered summaries first. Only drill into details when needed.
11
+ 3. Let `replay` tell you which transport works.
12
+ 4. Keep the final artifact as code, not as shell history.
10
13
 
11
- 1. Action capture is opt-in. `goto()`, `click()`, `input()`, `scroll()`, and `hover()` only persist network records when you pass `captureNetwork`.
12
- 2. `queryNetwork()` always reads from the persisted store. There is no `source` parameter. Do NOT pass `source: "saved"` or `source: "live"`.
13
- 3. `tagNetwork()` labels already-persisted records. It does NOT save anything. Use it to organize captures for later lookup by tag.
14
- 4. `clearNetwork()` permanently removes records with tombstoning. Cleared records cannot be resurrected by late-arriving browser events.
15
- 5. `waitForNetwork()` watches for NEW records only. It snapshots existing records and polls for ones that were not present at the start. It does NOT return historical matches.
14
+ ## Workflow
16
15
 
17
- ## Transport Selection
16
+ ### 1. Capture
18
17
 
19
- - `direct-http`: the request is replayable without a browser.
20
- - `context-http`: browser session state matters, but the request does not need to execute inside page JavaScript.
21
- - `page-http`: the request must execute inside the live page JavaScript world.
22
- - `session-http`: use a stored request plan that still depends on a live browser session.
23
-
24
- When in doubt, start with browser-backed capture first. Opensteer treats browser-backed replay as a first-class path, not a fallback.
25
-
26
- ## Pipeline Phases
27
-
28
- Work through each phase in order. Do NOT skip phases. Each phase has exit criteria — verify them before proceeding.
29
-
30
- ### Phase 1: Capture
31
-
32
- Trigger the real browser action that causes the request. Name the capture.
33
-
34
- ```bash
35
- opensteer open https://example.com/app --workspace demo
36
- opensteer run page.goto --workspace demo \
37
- --input-json '{"url":"https://example.com/app","captureNetwork":"page-load"}'
38
- ```
39
-
40
- For interactions that trigger API calls (search, filter, load-more):
41
-
42
- ```bash
43
- opensteer run dom.click --workspace demo \
44
- --input-json '{"target":{"kind":"description","description":"load products"},"captureNetwork":"products-load"}'
45
- ```
46
-
47
- **EXIT CRITERIA:** You have at least one named capture. If `queryNetwork` returns empty after capture, see Error Recovery.
48
-
49
- ### Phase 2: Discover
50
-
51
- Query captured traffic to isolate the API calls worth replaying. Ignore static assets, analytics, and third-party scripts.
52
-
53
- ```bash
54
- opensteer run network.query --workspace demo \
55
- --input-json '{"capture":"products-load","includeBodies":true,"limit":20}'
56
- ```
57
-
58
- Examine the results. Look for first-party JSON APIs — requests returning `application/json` with data relevant to the task.
59
-
60
- If the first query is too broad, filter by hostname, path, or method:
61
-
62
- ```bash
63
- opensteer run network.query --workspace demo \
64
- --input-json '{"capture":"products-load","hostname":"api.example.com","method":"GET","includeBodies":true,"limit":10}'
65
- ```
66
-
67
- **EXIT CRITERIA:** You have identified at least one candidate API URL with its method, recordId, and response shape.
68
-
69
- ### Phase 3: Probe (Diagnostic Only)
70
-
71
- `rawRequest()` is a diagnostic tool. Use it to determine:
72
- 1. Which transport works (`direct-http` vs `context-http`)
73
- 2. Whether the API returns the expected data shape
74
- 3. Whether auth headers are actually required
75
-
76
- `rawRequest()` output is NOT the deliverable. Do NOT return rawRequest results to the user as the final answer. Always proceed to Phase 4.
77
-
78
- Test portability — try `direct-http` first:
79
-
80
- ```bash
81
- opensteer run request.raw --workspace demo \
82
- --input-json '{"transport":"direct-http","url":"https://api.example.com/products","method":"GET"}'
83
- ```
84
-
85
- If `direct-http` returns 200, the API is portable. If it fails (403/401), try `context-http`:
86
-
87
- ```bash
88
- opensteer run request.raw --workspace demo \
89
- --input-json '{"transport":"context-http","url":"https://api.example.com/products","method":"GET"}'
90
- ```
91
-
92
- **EXIT CRITERIA:** You know which transport works and have a successful response. Note the `recordId` from the probe response — you will use it in Phase 4.
93
-
94
- ### Phase 4: Infer Plan
95
-
96
- Create a request plan from the probed record. Pass the `transport` you proved works.
97
-
98
- ```bash
99
- opensteer run request-plan.infer --workspace demo \
100
- --input-json '{"recordId":"<recordId-from-phase-3>","key":"products.search","version":"v1","transport":"direct-http"}'
101
- ```
102
-
103
- If you proved `direct-http` works, always pass `transport: "direct-http"` so the plan is portable.
104
-
105
- If `inferRequestPlan` throws "registry record already exists", bump the version (e.g., `v2`).
106
-
107
- **EXIT CRITERIA:** Plan is persisted. You can verify with `request-plan.get`.
108
-
109
- ### Phase 5: Validate Auth Classification
110
-
111
- `inferRequestPlan` records auth metadata by observing headers on the captured request. This is **often wrong**. If the browser sent an `Authorization` header, the plan records `auth.strategy: "bearer-token"` even if the API works without auth.
112
-
113
- **MANDATORY VALIDATION:**
114
-
115
- 1. Read the inferred plan:
18
+ Trigger the real browser action that causes the request.
116
19
 
117
20
  ```bash
118
- opensteer run request-plan.get --workspace demo \
119
- --input-json '{"key":"products.search","version":"v1"}'
21
+ opensteer open https://example.com --workspace demo
22
+ opensteer goto https://example.com/search --workspace demo --capture-network page-load
23
+ opensteer input 5 laptop --workspace demo --persist "search input" --capture-network search
24
+ opensteer click 7 --workspace demo --persist "search button" --capture-network search
120
25
  ```
121
26
 
122
- 2. Check the `auth` field in `payload`. If `auth` is absent or `undefined`, auth is not detected — skip to Phase 6.
123
-
124
- 3. If `auth.strategy` is set, test whether the API actually needs it. Run a raw request to the same URL with NO auth headers:
27
+ Use meaningful capture labels. The next step filters traffic by label with `--capture`, so a descriptive label makes the right records easy to find.
125
28
 
126
- ```bash
127
- opensteer run request.raw --workspace demo \
128
- --input-json '{"transport":"direct-http","url":"<the-api-url>","method":"GET"}'
129
- ```
29
+ ### 2. Discover
130
30
 
131
- 4. If it returns 200 without auth headers, auth is **spurious** — the browser attached a token the API doesn't enforce. Rewrite the plan with corrected auth:
31
+ Scan the captured traffic.
132
32
 
133
33
  ```bash
134
- opensteer run request-plan.write --workspace demo \
135
- --input-json '{
136
- "key":"products.search",
137
- "version":"v1",
138
- "tags":["products","search"],
139
- "provenance":{"source":"manual","notes":"Auth removed — API is public, bearer token was incidental."},
140
- "payload":{
141
- ...existing payload with auth field removed...
142
- }
143
- }'
34
+ opensteer network query --workspace demo --capture search
35
+ opensteer network query --workspace demo --capture search --hostname api.example.com
36
+ opensteer network query --workspace demo --capture search --url search --limit 20
144
37
  ```
145
38
 
146
- Copy the full existing `payload` from `request-plan.get`, remove or null out the `auth` field, and write it back.
147
-
148
- 5. If the no-auth probe returns 401/403, auth IS required. Keep the auth classification and proceed. You will create an auth recipe in Phase 8 after testing the plan.
149
-
150
- **EXIT CRITERIA:** The plan's `auth` field accurately reflects whether auth is required.
39
+ Look for first-party JSON requests that actually carry the data you want. Ignore static assets, analytics, and third-party noise.
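
The triage described above can be sketched as a small filter over captured records. This is only an illustration: the `CapturedRecord` shape, its field names, and the `firstPartyJson` helper are assumptions made for the example, not the record format Opensteer actually returns.

```ts
// Hypothetical record shape for illustration only; the fields Opensteer
// really exposes on a captured record may be named differently.
interface CapturedRecord {
  id: string;
  url: string;
  method: string;
  status: number;
  contentType?: string;
}

// True when `host` is the site itself or one of its subdomains.
function isFirstParty(host: string, site: string): boolean {
  return host === site || host.endsWith(`.${site}`);
}

// Keep successful first-party responses whose content type is JSON.
function firstPartyJson(records: CapturedRecord[], site: string): CapturedRecord[] {
  return records.filter((record) => {
    let host: string;
    try {
      host = new URL(record.url).hostname;
    } catch {
      return false; // skip records whose URL cannot be parsed
    }
    const type = (record.contentType ?? "").toLowerCase();
    const isJson = type.includes("application/json") || type.includes("+json");
    const isSuccess = record.status >= 200 && record.status < 300;
    return isFirstParty(host, site) && isJson && isSuccess;
  });
}
```

A suffix match on the hostname is a rough first-party test. Sites that serve their API from a different registrable domain would need an explicit allow-list instead.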
151
40
 
152
- ### Phase 6: Annotate Parameters
41
+ ### 3. Inspect
153
42
 
154
- `inferRequestPlan` dumps all query and body parameters into `defaultQuery`/`defaultBody` without distinguishing variable from fixed.
155
-
156
- 1. Read the plan with `request-plan.get`.
157
-
158
- 2. Examine each parameter in `defaultQuery`:
159
- - **Variable:** values that change per invocation — search terms, page numbers, offsets, dates, user-specific IDs
160
- - **Fixed:** values constant for this API — site keys, platform identifiers, API versions, channel strings
161
-
162
- 3. Rewrite the plan with the `parameters` field annotating variable inputs:
43
+ Inspect the most promising record deeply.
163
44
 
164
45
  ```bash
165
- opensteer run request-plan.write --workspace demo \
166
- --input-json '{
167
- "key":"products.search",
168
- "version":"v1",
169
- "payload":{
170
- ...existing payload...,
171
- "parameters":[
172
- {"name":"keyword","in":"query","required":true,"description":"Search term"},
173
- {"name":"count","in":"query","defaultValue":"24","description":"Results per page"},
174
- {"name":"offset","in":"query","defaultValue":"0","description":"Pagination offset"}
175
- ]
176
- }
177
- }'
46
+ opensteer network detail rec_123 --workspace demo
178
47
  ```
179
48
 
180
- Variable params remain in `defaultQuery` as initial values. The `parameters` field documents which ones a caller should override via `request.execute` input.
49
+ Read:
181
50
 
182
- **EXIT CRITERIA:** The plan's `parameters` field lists all variable inputs with descriptions.
51
+ - URL and method
52
+ - request headers
53
+ - cookies sent
54
+ - request body preview
55
+ - response headers
56
+ - response body preview
57
+ - GraphQL metadata when present
58
+ - redirect chains when present
183
59
 
184
- ### Phase 7: Test Plan
60
+ ### 4. Test
185
61
 
186
- Execute the plan through `request.execute`, NOT `rawRequest`:
62
+ Replay the captured request.
187
63
 
188
64
  ```bash
189
- opensteer run request.execute --workspace demo \
190
- --input-json '{"key":"products.search","version":"v1","query":{"keyword":"laptop","count":"10"}}'
65
+ opensteer replay rec_123 --workspace demo
66
+ opensteer replay rec_123 --workspace demo --query keyword=headphones --query count=10
67
+ opensteer replay rec_123 --workspace demo --variables '{"keyword":"headphones"}'
191
68
  ```
192
69
 
193
- **GATE:**
194
- - If `request.execute` returns valid data → proceed to Phase 9 (Done).
195
- - If `request.execute` returns 401/403 and Phase 5 confirmed auth is required → proceed to Phase 8 (Auth Recipe).
196
- - If `request.execute` fails with another error → see Error Recovery.
70
+ Use the working transport that `replay` discovers as input to your final SDK code.
197
71
 
198
- ### Phase 8: Auth Recipe (Conditional)
72
+ ### 5. Trace Dependencies
199
73
 
200
- Enter this phase ONLY if Phase 5 confirmed auth is genuinely required AND Phase 7 failed with 401/403.
201
-
202
- #### Step 8a: Discover Auth Endpoint
203
-
204
- Search captured traffic for OAuth, token, or login endpoints:
74
+ If replay fails or returns `401`/`403`, inspect the surrounding state.
205
75
 
206
76
  ```bash
207
- opensteer run network.query --workspace demo \
208
- --input-json '{"path":"/oauth","includeBodies":true,"limit":10}'
209
- opensteer run network.query --workspace demo \
210
- --input-json '{"path":"/token","includeBodies":true,"limit":10}'
211
- opensteer run network.query --workspace demo \
212
- --input-json '{"path":"/auth","includeBodies":true,"limit":10}'
77
+ opensteer network query --workspace demo --before rec_123 --limit 50
78
+ opensteer cookies example.com --workspace demo
79
+ opensteer storage example.com --workspace demo
80
+ opensteer state example.com --workspace demo
213
81
  ```
214
82
 
215
- Examine responses to find the endpoint that returns an access token.
216
-
217
- #### Step 8b: Probe Auth Endpoint
83
+ Use these to answer:
218
84
 
219
- Test the auth endpoint with `request.raw`:
85
+ - which cookies matter
86
+ - which tokens live in storage
87
+ - whether hidden fields or globals provide CSRF values or nonces
88
+ - which earlier requests set the relevant state
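
One way to answer "which cookies matter" is to resend the request with one cookie removed at a time. The sketch below assumes a caller-supplied `send` callback that issues the request with a given cookie set and resolves to the HTTP status. That callback and the `findRequiredCookies` name are hypothetical and not part of the Opensteer SDK.

```ts
type Cookies = Record<string, string>;

// Hypothetical callback: sends the request with exactly these cookies and
// resolves to the HTTP status code. How it sends is up to the caller.
type SendWithCookies = (cookies: Cookies) => Promise<number>;

function isSuccess(status: number): boolean {
  return status >= 200 && status < 300;
}

// Drop one cookie at a time; a cookie is "required" when the request stops
// succeeding without it.
async function findRequiredCookies(all: Cookies, send: SendWithCookies): Promise<string[]> {
  const baseline = await send(all);
  if (!isSuccess(baseline)) {
    throw new Error(`Baseline request failed with status ${baseline}`);
  }

  const required: string[] = [];
  for (const name of Object.keys(all)) {
    // Build a copy of the cookie set without the cookie under test.
    const rest: Cookies = {};
    for (const [key, value] of Object.entries(all)) {
      if (key !== name) rest[key] = value;
    }
    if (!isSuccess(await send(rest))) {
      required.push(name);
    }
  }
  return required;
}
```

Removing cookies one at a time will miss cases where either of two cookies is sufficient on its own, so treat the result as a starting point for the final code rather than a proof.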
220
89
 
221
- ```bash
222
- opensteer run request.raw --workspace demo \
223
- --input-json '{
224
- "transport":"direct-http",
225
- "url":"https://example.com/api/oauth/token",
226
- "method":"POST",
227
- "body":{"json":{"grant_type":"client_credentials"}}
228
- }'
229
- ```
230
-
231
- Verify it returns a token. Note the response shape (e.g., `{ "access_token": "..." }`).
90
+ ### 6. Write Code
232
91
 
233
- #### Step 8c: Create Auth Recipe
92
+ Translate what worked into plain TypeScript.
234
93
 
235
- Write an auth recipe that acquires a fresh token and maps it to request headers:
236
-
237
- ```bash
238
- opensteer run auth-recipe.write --workspace demo \
239
- --input-json '{
240
- "key":"example.auth",
241
- "version":"v1",
242
- "payload":{
243
- "description":"Acquire bearer token for example.com API",
244
- "steps":[
245
- {
246
- "kind":"directRequest",
247
- "request":{
248
- "url":"https://example.com/api/oauth/token",
249
- "transport":"direct-http",
250
- "method":"POST",
251
- "body":{"json":{"grant_type":"client_credentials"}}
252
- },
253
- "capture":{
254
- "bodyJsonPointer":{"pointer":"/access_token","saveAs":"token"}
255
- }
256
- }
257
- ],
258
- "outputs":{
259
- "headers":{"Authorization":"Bearer {{token}}"}
260
- }
261
- }
262
- }'
263
- ```
94
+ ```ts
95
+ import { Opensteer } from "opensteer";
264
96
 
265
- **Recipe step types you can use:**
266
- - `directRequest` — HTTP request outside the browser (portable, no session needed)
267
- - `sessionRequest` — HTTP request using browser session state (cookies, etc.)
268
- - `request` — generic request step
269
- - `readCookie` — read a browser cookie value, `saveAs` a variable
270
- - `readStorage` — read localStorage/sessionStorage, `saveAs` a variable
271
- - `evaluate` — run JavaScript in the page, `saveAs` a variable
272
- - `waitForNetwork` — wait for a network request matching filters
273
- - `waitForCookie` — wait for a cookie to appear
274
- - `goto` — navigate to a URL (e.g., trigger a login page)
275
- - `solveCaptcha` — solve a CAPTCHA challenge
97
+ const opensteer = new Opensteer({
98
+ workspace: "demo",
99
+ rootDir: process.cwd(),
100
+ });
276
101
 
277
- Each step can have a `capture` field to extract values from the response. The `outputs` field maps captured variables to `headers`, `query`, `params`, or `body` overrides applied to the request plan at execution time.
102
+ async function ensureSession() {
103
+ const cookies = await opensteer.cookies("example.com");
104
+ if (cookies.has("session")) {
105
+ return;
106
+ }
107
+ await opensteer.goto("https://example.com");
108
+ }
278
109
 
279
- #### Step 8d: Bind Auth Recipe to Plan
110
+ export async function searchProducts(keyword: string) {
111
+ await ensureSession();
280
112
 
281
- Update the plan to reference the auth recipe:
113
+ const response = await opensteer.fetch("https://api.example.com/search", {
114
+ query: {
115
+ keyword,
116
+ count: 24,
117
+ },
118
+ });
282
119
 
283
- ```bash
284
- opensteer run request-plan.get --workspace demo \
285
- --input-json '{"key":"products.search","version":"v1"}'
120
+ return response.json();
121
+ }
286
122
  ```
287
123
 
288
- Read the current payload, then write it back with the `auth.recipe` binding:
124
+ If exploration showed a required transport, carry it into `fetch()`:
289
125
 
290
- ```bash
291
- opensteer run request-plan.write --workspace demo \
292
- --input-json '{
293
- "key":"products.search",
294
- "version":"v1",
295
- "payload":{
296
- ...existing payload...,
297
- "auth":{
298
- "strategy":"bearer-token",
299
- "recipe":{"key":"example.auth","version":"v1"},
300
- "failurePolicy":{"on":"status","status":"401","action":"recover"}
301
- }
302
- }
303
- }'
126
+ ```ts
127
+ const response = await opensteer.fetch("https://api.example.com/search", {
128
+ query: { keyword: "laptop" },
129
+ transport: "matched-tls",
130
+ });
304
131
  ```
305
132
 
306
- The `failurePolicy` tells the plan to automatically re-run the auth recipe when a 401 is received.
307
-
308
- #### Step 8e: Test Authenticated Plan
309
-
310
- ```bash
311
- opensteer run request.execute --workspace demo \
312
- --input-json '{"key":"products.search","version":"v1","query":{"keyword":"laptop"}}'
313
- ```
314
-
315
- The auth recipe fires automatically before the request. If it still fails, inspect the token response shape and fix the recipe.
316
-
317
- **EXIT CRITERIA:** `request.execute` returns valid data with the auth recipe attached.
318
-
319
- ### Phase 9: Done
320
-
321
- Close the browser session:
322
-
323
- ```bash
324
- opensteer close --workspace demo
325
- ```
326
-
327
- The plan persists in the workspace registry and (if cloud mode) in Convex. Future callers can replay it with `request.execute` without opening a browser.
328
-
329
- ## Error Recovery
330
-
331
- ### `request.execute` returns 400 Bad Request
332
-
333
- 1. Read the plan: `request-plan.get`
334
- 2. Compare the plan's `defaultQuery`, `defaultBody`, and `defaultHeaders` against the original captured request from Phase 2
335
- 3. Identify the discrepancy — missing required parameter, wrong content-type, malformed body
336
- 4. Fix the plan with `request-plan.write`
337
- 5. Re-test with `request.execute`
338
- 6. If still failing after 2 fix attempts, use `request.raw` to isolate which specific parameter causes the 400 — remove params one at a time
339
-
340
- ### `request.execute` returns 401/403 Unauthorized
341
-
342
- 1. Was auth classification validated in Phase 5? If not, go back to Phase 5.
343
- 2. If auth is confirmed needed, enter Phase 8 (Auth Recipe).
344
- 3. If an auth recipe exists but fails, inspect the token response — the token may have expired, the scope may be wrong, or the grant type may differ.
345
-
346
- ### `request.execute` returns 404 Not Found
347
-
348
- 1. The API path may have changed since capture. Re-capture traffic (Phase 1) and re-discover (Phase 2).
349
- 2. Check if the URL uses path parameters that were hardcoded during inference. These may need to be templated.
350
-
351
- ### `request.execute` returns 500 Server Error
352
-
353
- This is the API server's problem, not a plan problem. Retry once. If persistent, document and report to the user.
354
-
355
- ### `inferRequestPlan` throws "registry record already exists"
356
-
357
- The key+version combination is already registered. Bump the version string (e.g., `v1` → `v2`).
358
-
359
- ### `queryNetwork()` returns empty records
360
-
361
- - Verify `captureNetwork` was set on the action that triggered the request (not on `open()`).
362
- - Re-trigger the action with `captureNetwork`. Records are auto-persisted on actions that opt in.
363
- - Broaden filters: try removing `tag` and querying by `hostname` or `path` instead.
364
- - Check that the request actually fired — some SPAs lazy-load data or use WebSocket instead of HTTP.
365
-
366
- ### `rawRequest()` returns non-200
367
-
368
- This is diagnostic information. Use it to decide transport and debug, not as a final answer.
369
- - If `direct-http` fails with 403/401: the API requires session state. Try `context-http`.
370
- - If `context-http` fails: the API may require specific cookies or tokens. Check for auth endpoints in captured traffic.
371
- - If the response body is empty: decode `response.body.data` with `Buffer.from(data, "base64").toString("utf8")` — the parsed `data` field may not be populated.
133
+ ## Common Cases
372
134
 
373
- ### `waitForNetwork()` times out
135
+ ### GraphQL
374
136
 
375
- - `waitForNetwork()` only matches records that appear AFTER the call starts. If the request already fired, use `queryNetwork()` instead.
376
- - Ensure the triggering action happens AFTER calling `waitForNetwork()`.
137
+ - `network query` should surface the operation name next to the URL.
138
+ - `network detail` should show operation type, operation name, and variables.
139
+ - `replay --variables '{...}'` is the fastest way to test new inputs.
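
When the replayed operation moves into code, the same idea can be expressed as a helper that copies the captured GraphQL body and overrides selected variables. The body shape (`query`, `operationName`, `variables`) follows the usual GraphQL-over-HTTP convention. The `withVariables` helper is hypothetical, and the source does not say whether `replay --variables` merges with or replaces the captured variables, so the merge here is an assumption.

```ts
// Conventional GraphQL-over-HTTP POST body.
interface GraphQLRequestBody {
  query: string;
  operationName?: string;
  variables?: Record<string, unknown>;
}

// Return a copy of the captured body with selected variables overridden.
// The captured body is left untouched so it can be reused for other inputs.
function withVariables(
  captured: GraphQLRequestBody,
  overrides: Record<string, unknown>,
): GraphQLRequestBody {
  return {
    ...captured,
    variables: { ...(captured.variables ?? {}), ...overrides },
  };
}
```

Keeping the captured `query` string and `operationName` verbatim avoids drifting from what the site's own client sends. Only the variables change between calls.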
377
140
 
378
- ## Input Formats
141
+ ### Redirect or Auth Chains
379
142
 
380
- `rawRequest` and recipe steps expect specific shapes:
143
+ Start with `network detail` on the failing request. If it shows redirects or challenge notes, inspect earlier records with `--before`.
381
144
 
382
- - `headers`: array of `[{ name, value }]`, not `{ key: value }`. Exception: recipe step `request` fields accept `{ key: value }` objects.
383
- - `body`: one of `{ json: { ... } }`, `{ text: "..." }`, or `{ base64: "..." }`. Not raw strings.
384
- - `request.execute` input includes `key` inside the JSON object. The SDK convenience wrapper `opensteer.request(key, input)` adds that for you.
145
+ ### Hidden Form Tokens
385
146
 
386
- ## Practical Guidance
147
+ Use `state example.com --workspace demo` when the request depends on hidden fields or globals that do not show up cleanly in cookies or storage alone.
387
148
 
388
- Mandatory steps:
389
- - MUST use `goto({ url, captureNetwork })` to name navigation capture. `captureNetwork` is NOT supported on `open()`.
390
- - MUST query by capture first, then query all traffic to catch async requests.
391
- - MUST probe every discovered first-party API with transport tests. Do NOT just log URLs.
392
- - The deliverable is a persisted plan. `rawRequest()` output is never the final answer.
149
+ ## What Not To Do
393
150
 
394
- Common mistakes:
395
- - Stopping at `rawRequest` output and returning it to the user. Always proceed to `inferRequestPlan` and `request.execute`.
396
- - Trusting inferred auth metadata without validation. Always run Phase 5.
397
- - Passing headers as `{key: value}` to `rawRequest`. MUST use `[{name, value}]` arrays.
398
- - Passing body as a raw string to `rawRequest`. MUST wrap in `{json: {...}}`, `{text: "..."}`, or `{base64: "..."}`.
399
- - Skipping parameter annotation. Variable params should be documented in the plan's `parameters` field.
400
- - Not closing the browser after completing the pipeline. Always run `opensteer close` when done.
151
+ - Do not stop at `network query` when the user asked for reusable code.
152
+ - Do not bypass Opensteer with raw Playwright when Opensteer already captured the request.
153
+ - Do not dump giant raw response blobs into the prompt when the filtered summaries already show the useful shape.
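
If a filtered summary is not available and only a parsed response body is at hand, a compact description of its shape is usually enough context. The following is a minimal sketch of such a reducer. `shapeOf` is a hypothetical helper, not an Opensteer function, and it assumes arrays are homogeneous enough that the first element is representative.

```ts
// Replace every leaf value with its type name and keep only the first
// element of each array, down to `maxDepth` levels of nesting.
function shapeOf(value: unknown, maxDepth = 3): unknown {
  if (value === null) return "null";
  if (Array.isArray(value)) {
    if (value.length === 0) return [];
    return maxDepth <= 0 ? "array" : [shapeOf(value[0], maxDepth - 1)];
  }
  if (typeof value === "object") {
    if (maxDepth <= 0) return "object";
    const out: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      out[key] = shapeOf(child, maxDepth - 1);
    }
    return out;
  }
  return typeof value; // "string", "number", "boolean", ...
}
```

For a body like `{ items: [{ id: 1, name: "a" }], total: 1 }`, the result keeps the keys and nesting but none of the data, which is typically what is needed to write the final typed code.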