npm - opensteer - Versions diffs - 0.8.2 → 0.8.4 - Mend

opensteer 0.8.2 → 0.8.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19) hide show

package/README.md +7 -6
package/dist/{chunk-X3G6QSCF.js → chunk-C7GWMSTV.js} +1700 -577
package/dist/chunk-C7GWMSTV.js.map +1 -0
package/dist/cli/bin.cjs +6115 -4757
package/dist/cli/bin.cjs.map +1 -1
package/dist/cli/bin.js +320 -52
package/dist/cli/bin.js.map +1 -1
package/dist/index.cjs +1751 -614
package/dist/index.cjs.map +1 -1
package/dist/index.d.cts +484 -539
package/dist/index.d.ts +484 -539
package/dist/index.js +30 -26
package/dist/index.js.map +1 -1
package/package.json +5 -4
package/skills/opensteer/SKILL.md +66 -13
package/skills/opensteer/references/cli-reference.md +115 -20
package/skills/opensteer/references/request-workflow.md +35 -15
package/skills/opensteer/references/sdk-reference.md +85 -21
package/dist/chunk-X3G6QSCF.js.map +0 -1

package/skills/opensteer/references/cli-reference.md CHANGED Viewed

@@ -5,32 +5,101 @@ Use the CLI when you need a fast JSON-first loop against a repo-local workspace
 ## Sections
 - [Quickstart](#quickstart)
+- [Snapshot Output — What To Read](#snapshot-output--what-to-read)
+- [End-to-End Example](#end-to-end-example)
 - [Browser Lifecycle And Profile Cloning](#browser-lifecycle-and-profile-cloning)
 - [Browser Modes](#browser-modes)
 - [Advanced Semantic Operations](#advanced-semantic-operations)
-- [Extraction Schema Examples](#extraction-schema-examples)
+- [Extraction Schema And Array Auto-Generalization](#extraction-schema-and-array-auto-generalization)
 ## Quickstart
 ```bash
-opensteer browser status --workspace demo
 opensteer open https://example.com --workspace demo
 opensteer snapshot action --workspace demo
-opensteer click --workspace demo --element 3
-opensteer input --workspace demo --selector "input[type=search]" --text "search term" --press-enter true
+# Read the "html" field in the JSON output. Find elements by their c="N" attributes.
+# Example: <input c="5" placeholder="Search"> means element 5 is the search input.
+# Example: <button c="7">Search</button> means element 7 is the search button.
+# Act on elements AND persist their paths with human-readable descriptions
+opensteer run dom.input --workspace demo \
+  --input-json '{"target":{"kind":"element","element":5},"text":"search term","persistAsDescription":"search input"}'
+opensteer run dom.click --workspace demo \
+  --input-json '{"target":{"kind":"element","element":7},"persistAsDescription":"search button"}'
+# Re-snapshot after navigation, then persist an extraction descriptor
 opensteer snapshot extraction --workspace demo
 opensteer extract --workspace demo \
   --description "page summary" \
-  --schema-json '{"title":{"selector":"title"},"url":{"source":"current_url"}}'
+  --schema-json '{"title":{"element":3},"url":{"source":"current_url"}}'
+# Replay later with just descriptions — no snapshot needed
+opensteer click --workspace demo --description "search button"
 opensteer extract --workspace demo --description "page summary"
 opensteer close --workspace demo
 ```
 - Stateful CLI commands currently require `--workspace <id>`.
-- With a workspace, browser mode defaults to `persistent`.
-- Use `snapshot action` before `--element <n>` targets.
-- `extract --description --schema-json ...` writes or updates a persisted extraction descriptor.
-- `extract --description ...` replays the stored extraction payload with no schema.
+- Use `snapshot action` to discover page elements during exploration.
+- Use `opensteer run dom.*` with `persistAsDescription` to cache element paths under descriptions.
+- Replay cached actions with `--description` alone — no snapshot needed.
+- `extract --description --schema-json` writes a persisted extraction descriptor.
+- `extract --description` replays the stored extraction.
+## Snapshot Output — What To Read
+`snapshot action` and `snapshot extraction` both return JSON:
+```json
+{
+  "url": "https://example.com/search?q=airpods",
+  "title": "Search Results",
+  "mode": "extraction",
+  "html": "<span c=\"12\">$549.99</span>\n<a c=\"15\" href=\"/p/product-1\">\n  <div c=\"16\">Apple AirPods Max</div>\n</a>\n<a c=\"18\" href=\"/b/apple\">Apple</a>...",
+  "counters": [{"element":12,"tagName":"SPAN","pathHint":"span",...}, ...]
+}
+```
+**Read the `html` field.** It is a clean, filtered DOM. Hidden elements, scripts, and styles are already removed. Every element has a `c="N"` attribute.
+- `c="N"` in the HTML = `element: N` in commands and extraction schemas
+- `snapshot action` keeps interactive elements (buttons, inputs, links) for clicking/typing
+- `snapshot extraction` keeps all visible content (text, prices, titles) for data extraction
+- Do NOT parse the `counters` array to find elements — it is verbose metadata. Read the HTML string, find the `c="N"` values, and use those numbers.
+## End-to-End Example
+Goal: go to a site, search for a product, extract all results.
+```bash
+# 1. Open the page
+opensteer open https://example.com --workspace demo
+# 2. Snapshot to discover elements
+opensteer snapshot action --workspace demo
+# Read html field. Find: <input c="5" placeholder="Search"> and <button c="7">Search</button>
+# 3. Type search term and persist the element path
+opensteer run dom.input --workspace demo \
+  --input-json '{"target":{"kind":"element","element":5},"text":"airpods","pressEnter":true,"persistAsDescription":"search input"}'
+# 4. Re-snapshot the results page (always re-snapshot after navigation!)
+opensteer snapshot extraction --workspace demo
+# Read html field. Find product items with their c="N" values:
+# <div c="13">Apple AirPods Max</div> <span c="14">$549.99</span>
+# <div c="22">Apple AirPods Pro</div> <span c="23">$249.99</span>
+# 5. Extract all results — array auto-generalizes from template rows
+opensteer extract --workspace demo \
+  --description "search results" \
+  --schema-json '{"items":[{"name":{"element":13},"price":{"element":14}},{"name":{"element":22},"price":{"element":23}}]}'
+# Returns ALL matching rows on the page, not just the 2 templates.
+# 6. Close
+opensteer close --workspace demo
+```
+Total: 6 commands.
 ## Browser Lifecycle And Profile Cloning
@@ -94,6 +163,9 @@ opensteer run network.query --workspace demo \
 opensteer run request-plan.infer --workspace demo \
   --input-json '{"recordId":"rec_123","key":"products.search","version":"v1"}'
+opensteer run request-plan.infer --workspace demo \
+  --input-json '{"recordId":"rec_123","key":"products.search.portable","version":"v1","transport":"direct-http"}'
 opensteer run request.execute --workspace demo \
   --input-json '{"key":"products.search","query":{"q":"laptop"}}'
 ```
@@ -102,13 +174,11 @@ opensteer run request.execute --workspace demo \
 - Use `run page.goto` when you need `networkTag` on navigation. The short `goto` form only parses the URL positional.
 - Use `run dom.click` / `run dom.input` / `run dom.hover` / `run dom.scroll` when you need `persistAsDescription`.
-## Extraction Schema Examples
+## Extraction Schema And Array Auto-Generalization
-```bash
-opensteer snapshot extraction --workspace demo
-```
+Always run `snapshot extraction` before building a schema — you need the `c="N"` counter values from the HTML.
-Explicit field bindings:
+Flat field bindings:
 ```bash
 opensteer extract --workspace demo \
@@ -120,14 +190,39 @@ opensteer extract --workspace demo \
   --schema-json '{"url":{"selector":"a.primary","attribute":"href"},"pageUrl":{"source":"current_url"}}'
 ```
-Arrays with representative rows:
+Array extraction with auto-generalization:
 ```bash
 opensteer extract --workspace demo \
-  --description "items" \
-  --schema-json '{"items":[{"title":{"selector":"#products li:nth-child(1) .title"},"price":{"selector":"#products li:nth-child(1) .price"}},{"title":{"selector":"#products li:nth-child(2) .title"},"price":{"selector":"#products li:nth-child(2) .price"}}]}'
+  --description "product list" \
+  --schema-json '{"items":[{"name":{"element":13},"price":{"element":14}},{"name":{"element":22},"price":{"element":23}}]}'
+```
+The extractor uses the 1-2 representative rows as **templates**, then automatically finds **ALL matching rows** on the page:
+```json
+{
+  "items": [
+    {"name": "Apple AirPods Max", "price": "$549.99"},
+    {"name": "Apple AirPods Pro", "price": "$249.99"},
+    {"name": "Apple AirPods 4", "price": "$129.99"},
+    ...
+  ]
+}
 ```
-- Build the exact JSON object you want. The extractor does not accept semantic placeholders like `"string"` or prompt-style schemas.
-- Use `element` fields only with counters from a fresh snapshot in the same live session.
-- For arrays, include one or more representative objects. Add multiple examples when repeated rows have structural variants.
+Rules:
+- Build the exact JSON shape you want. The extractor does not accept `"string"` or prompt-style schemas.
+- Each leaf must be `{ element: N }`, `{ selector: "..." }`, optional `attribute`, or `{ source: "current_url" }`.
+- Use `element` fields during CLI exploration with a fresh snapshot. Deterministic scripts use `description`.
+- For arrays, provide 1-2 representative objects. Add 2 when repeated rows have structural variants.
+- Nested arrays are not supported.
+## What NOT To Do
+- Do NOT use `page.evaluate()` to scrape DOM data. Use `extract()` with element-based schemas.
+- Do NOT parse the `counters` array to find elements. Read the `html` string and find `c="N"` values.
+- Do NOT use CSS selectors in reusable scripts. Use `description` from cached descriptors.
+- Do NOT write loops to enumerate list items. Use array extraction with 1-2 template rows.
+- Do NOT skip re-snapshot after navigation. Always re-snapshot before targeting new elements.

package/skills/opensteer/references/request-workflow.md CHANGED Viewed

@@ -19,10 +19,10 @@ Use this workflow when the deliverable is a custom API, a replayable request pla
 2. Tag the important navigation or interactions with `networkTag`.
 3. Inspect the captured traffic and isolate the relevant records.
 4. Save useful captures to the workspace if they need to survive later analysis.
-5. Prove the request with `rawRequest()`.
-6. Promote the winning record into a request plan.
+5. Probe the request with `rawRequest()` — try `direct-http` first, then `context-http`.
+6. Infer a request plan from the probed record — pass `transport` if you proved portability.
 7. Add recipes or auth recipes if replay needs deterministic setup.
-8. Replay the plan from code.
+8. Replay the plan from code — works immediately, no extra steps.
 This workflow should carry equal weight with DOM automation. Use it whenever the browser page is only the launcher for the real target request.
@@ -47,7 +47,6 @@ await opensteer.goto({
 });
 await opensteer.click({
-  selector: "button.load-products",
   description: "load products",
   networkTag: "products-load",
 });
@@ -63,26 +62,29 @@ const records = await opensteer.queryNetwork({
 });
 ```
-3. Test the request directly.
+3. Probe the request — try `direct-http` first to test portability.
 ```ts
 const response = await opensteer.rawRequest({
-  transport: "context-http",
+  transport: "direct-http",
   url: "https://example.com/api/products",
   method: "POST",
   body: {
     json: { page: 1 },
   },
 });
+// If direct-http returns 200, the API is portable (no browser needed).
+// If it fails, try context-http — the API needs browser session state.
 ```
-4. Promote a captured record into a request plan.
+4. Infer a request plan — pass `transport` if you proved portability.
 ```ts
 await opensteer.inferRequestPlan({
-  recordId: records.records[0]!.id,
+  recordId: response.recordId,
   key: "products.search",
   version: "v1",
+  transport: "direct-http", // use the transport you proved works
 });
 ```
@@ -117,17 +119,18 @@ await opensteer.runAuthRecipe({
 ## CLI Equivalents
 ```bash
-opensteer open --workspace demo
+opensteer open https://example.com/app --workspace demo
 opensteer run page.goto --workspace demo \
   --input-json '{"url":"https://example.com/app","networkTag":"page-load"}'
-opensteer run dom.click --workspace demo \
-  --input-json '{"target":{"kind":"selector","selector":"button.load-products"},"persistAsDescription":"load products","networkTag":"products-load"}'
+opensteer click --workspace demo --description "load products"
+  # or with networkTag: opensteer run dom.click --workspace demo \
+  #   --input-json '{"target":{"kind":"description","description":"load products"},"networkTag":"products-load"}'
 opensteer run network.query --workspace demo \
-  --input-json '{"tag":"products-load","includeBodies":true,"limit":20}'
+  --input-json '{"source":"saved","tag":"products-load","includeBodies":true,"limit":20}'
 opensteer run request.raw --workspace demo \
-  --input-json '{"transport":"context-http","url":"https://example.com/api/products","method":"POST","body":{"json":{"page":1}}}'
+  --input-json '{"transport":"direct-http","url":"https://example.com/api/products","method":"POST","body":{"json":{"page":1}}}'
 opensteer run request-plan.infer --workspace demo \
-  --input-json '{"recordId":"rec_123","key":"products.search","version":"v1"}'
+  --input-json '{"recordId":"rec_123","key":"products.search","version":"v1","transport":"direct-http"}'
 opensteer run request.execute --workspace demo \
   --input-json '{"key":"products.search","query":{"q":"laptop"}}'
 ```
@@ -152,6 +155,22 @@ const context = await opensteer.rawRequest({
 If `direct-http` returns 200, the API is portable and does not need a browser for future calls. If only `context-http` works, the API depends on browser session state.
+After proving portability, infer the plan with an explicit transport override:
+```ts
+await opensteer.inferRequestPlan({
+  recordId: records.records[0]!.id,
+  key: "products.search.portable",
+  version: "v1",
+  transport: "direct-http",
+});
+```
+```bash
+opensteer run request-plan.infer --workspace demo \
+  --input-json '{"recordId":"rec_123","key":"products.search.portable","version":"v1","transport":"direct-http"}'
+```
 ## Auth Token Acquisition
 When you discover an auth endpoint, acquire a token and use it to probe for data APIs that may be behind auth:
@@ -211,7 +230,8 @@ Common mistakes:
 Additional guidance:
 - Capture the browser action first if authentication, cookies, or minted tokens may matter.
-- Prefer `direct-http` only after proving the request no longer depends on live browser state.
+- Probe with `direct-http` first. If it works, pass `transport: "direct-http"` to `inferRequestPlan` so the plan is portable. If it fails, fall back to `context-http`.
 - `inferRequestPlan()` throws if the key+version already exists. Catch the error or bump the version.
+- Inferred plans are immediately usable — `request.execute` works right after inference.
 - Use recipes when the request plan needs deterministic setup work. Use auth recipes when the setup is specifically auth refresh or login state.
 - Stay in the DOM workflow only when the rendered page itself is the deliverable. Move here when the request is the durable artifact.

package/skills/opensteer/references/sdk-reference.md CHANGED Viewed

@@ -39,6 +39,12 @@ const opensteer = new Opensteer({
 ## DOM Automation And Extraction
+Opensteer uses a two-phase workflow: **explore** with the CLI, then **replay** with the SDK.
+### Phase 1 — Exploration (one-time, via CLI or setup script)
+Run `opensteer snapshot action --workspace demo` from the CLI first. Read the `html` field in the JSON output — it is a clean filtered DOM with `c="N"` attributes. Use those counter numbers as the `element` parameter below. The SDK also exposes `snapshot()`, but this guide keeps discovery in the CLI so the DOM HTML is easy to inspect from the terminal.
 ```ts
 import { Opensteer } from "opensteer";
@@ -48,21 +54,21 @@ const opensteer = new Opensteer({
 });
 await opensteer.open("https://example.com");
-await opensteer.snapshot("action");
+// element numbers come from c="N" values in the snapshot html field
 await opensteer.click({
-  selector: "button.primary",
-  description: "primary button",
+  element: 3,
+  description: "primary button", // caches the element path
 });
 await opensteer.input({
-  selector: "input[type=search]",
-  description: "search input",
+  element: 7,
+  description: "search input", // caches the element path
   text: "laptop",
   pressEnter: true,
 });
-const data = await opensteer.extract({
+await opensteer.extract({
   description: "page summary",
   schema: {
     title: { selector: "title" },
@@ -70,32 +76,83 @@ const data = await opensteer.extract({
   },
 });
-const replayed = await opensteer.extract({
-  description: "page summary",
+await opensteer.close();
+```
+### Phase 2 — Deterministic replay (the actual reusable script)
+Use `description` alone for everything — resolves from cached descriptors:
+```ts
+const opensteer = new Opensteer({
+  workspace: "demo",
+  rootDir: process.cwd(),
 });
+await opensteer.open("https://example.com");
+await opensteer.click({ description: "primary button" });
+await opensteer.input({ description: "search input", text: "laptop", pressEnter: true });
+const data = await opensteer.extract({ description: "page summary" });
+await opensteer.close();
 ```
 DOM rules:
-- Use `snapshot("action")` before counter-based interactions.
-- Use `snapshot("extraction")` to inspect the page structure and design the output object.
-- Treat snapshots as planning artifacts. `extract()` reads current page state and replays persisted extraction descriptors from deterministic, snapshot-backed payloads.
-- `selector + description` or `element + description` persists a DOM action descriptor. Bare `description` replays it later.
+- Deterministic scripts use `description` for all interactions and extractions — no snapshots, no selectors.
+- `element + description` persists a DOM action descriptor. Bare `description` replays it later.
 - `description + schema` writes or updates a persisted extraction descriptor. Bare `description` replays it later.
+- Use `element` targets only during the exploration phase with a fresh snapshot from the CLI.
 - Keep DOM data collection in `extract()`, not `evaluate()` or raw page DOM parsing, when the result can be expressed as structured fields.
+- CSS selectors exist as a low-level escape hatch but are not recommended for reusable scripts.
-Supported extraction field shapes in the current public SDK:
+Supported extraction field shapes:
-- `{ element: N }`
+- `{ element: N }` — requires a prior CLI snapshot; use during exploration only
 - `{ element: N, attribute: "href" }`
 - `{ selector: ".price" }`
 - `{ selector: "img.hero", attribute: "src" }`
 - `{ source: "current_url" }`
-For arrays, include one or more representative objects in the schema. Add multiple examples when repeated rows have structural variants.
+For arrays, provide 1-2 representative objects. The extractor auto-generalizes from these templates to find ALL matching rows on the page:
+```ts
+const results = await opensteer.extract({
+  description: "search results",
+  schema: {
+    items: [
+      { name: { element: 13 }, price: { element: 14 } },
+      { name: { element: 22 }, price: { element: 23 } },
+    ],
+  },
+});
+// results.items contains ALL matching rows on the page, not just the 2 templates
+```
 Do not use `prompt` or semantic placeholder values such as `"string"` in the current public SDK. The extractor expects explicit schema objects, arrays, and field descriptors.
+### What extract() Returns
+`extract()` returns a plain JSON object matching your schema shape:
+```ts
+// Flat schema:
+{ title: "Search Results", url: "https://..." }
+// Array schema (auto-generalized from 1-2 templates):
+{
+  items: [
+    { name: "Apple AirPods Max", price: "$549.99" },
+    { name: "Apple AirPods Pro", price: "$249.99" },
+    { name: "Apple AirPods 4", price: "$129.99" },
+    // ... ALL matching rows
+  ]
+}
+```
+Use `extract()` for structured data. Do NOT use `evaluate()` or raw DOM parsing when `extract()` can express the result.
 ## Browser Admin
 ```ts
@@ -150,6 +207,13 @@ await opensteer.inferRequestPlan({
   version: "v1",
 });
+await opensteer.inferRequestPlan({
+  recordId: records.records[0]!.id,
+  key: "products.search.portable",
+  version: "v1",
+  transport: "direct-http",
+});
 await opensteer.saveNetwork({
   tag: "products-load",
 });
@@ -199,7 +263,6 @@ Session and page control:
 Interaction and extraction:
-- `snapshot("action" | "extraction")`
 - `click({ element | selector | description, networkTag? })`
 - `hover({ element | selector | description, networkTag? })`
 - `input({ element | selector | description, text, pressEnter?, networkTag? })`
@@ -219,8 +282,8 @@ Inspection and evaluation:
 Request capture and replay:
 - `rawRequest({ transport?, pageRef?, url, method?, headers?, body?, followRedirects? })`
-- `inferRequestPlan({ recordId, key, version, lifecycle? })`
-- `writeRequestPlan({ key, version, payload, lifecycle?, tags?, provenance?, freshness? })`
+- `inferRequestPlan({ recordId, key, version, transport? })`
+- `writeRequestPlan({ key, version, payload, tags?, provenance?, freshness? })`
 - `getRequestPlan({ key, version? })`
 - `listRequestPlans({ key? })`
 - `request(key, { path?, query?, headers?, body? })`
@@ -251,7 +314,8 @@ Lifecycle:
 - Wrap long-running browser ownership in `try/finally` and call `close()`.
 - Use `networkTag` on actions that trigger requests you may inspect later.
-- Use `description` when the interaction should be replayable across sessions.
-- Use `description` plus `schema` when an extraction should be replayable across sessions.
-- Use `element` targets only with a fresh snapshot from the same live session.
+- Use `description` for all interactions and extractions in deterministic scripts.
+- Use `description` plus `schema` to persist an extraction descriptor. Bare `description` replays it.
+- Use `element` targets only during CLI exploration with a fresh snapshot. Deterministic scripts use `description`.
+- The SDK does expose `snapshot()`, but this workflow keeps element discovery in the CLI with `snapshot action`.
 - Prefer Opensteer methods over raw Playwright so browser, extraction, and replay semantics stay consistent.