opensteer 0.4.10 → 0.4.12

package/CHANGELOG.md CHANGED
@@ -36,6 +36,11 @@
    timeout/stale-target cases more accurately.
  - Cloud action failures now accept optional structured failure details and map
    them to `OpensteerActionError` when available.
+ - Docs: refreshed README and getting-started guidance to match current SDK/CLI
+   behavior and env vars.
+ - Docs: added CLI reference and docs index.
+ - OSS community docs: expanded `CONTRIBUTING.md` and added `SECURITY.md` +
+   `SUPPORT.md`.

  ## 0.1.0

package/README.md CHANGED
@@ -1,186 +1,142 @@
  # Opensteer

- Lean browser automation SDK for coding agents and script replay.
+ Open-source browser automation SDK for coding agents and deterministic replay.

- `opensteer` provides descriptor-aware actions (`click`, `dblclick`,
- `rightclick`, `hover`, `input`, `select`, `scroll`, `extract`,
- `extractFromPlan`, `uploadFile`), observation (`snapshot`, `state`,
- `screenshot`), navigation (`goto`), and convenience methods for tabs, cookies,
- keyboard, element info, and wait.
+ Opensteer combines descriptor-aware actions, resilient selector persistence,
+ clean HTML snapshots, and first-class local or cloud runtime support.

- For anything not covered, use raw Playwright via `opensteer.page` and
- `opensteer.context`.
+ ## Requirements
+
+ - Node.js `>=20`
+ - A browser environment supported by Playwright
+ - API key for your selected model provider if you use LLM resolve/extract

  ## Install

  ```bash
  # npm
- npm install opensteer playwright
+ npm install opensteer
  # pnpm
- pnpm add opensteer playwright
+ pnpm add opensteer
  ```

- ## CLI Session Routing
-
- OpenSteer CLI now separates runtime routing from selector caching:
-
- - Runtime routing: `--session` or `OPENSTEER_SESSION`
- - Selector cache namespace: `--name` or `OPENSTEER_NAME` (used on `open`)
-
- If neither `--session` nor `OPENSTEER_SESSION` is set:
-
- - In an interactive terminal, OpenSteer creates/reuses a terminal-scoped default session.
- - In non-interactive environments (agents/CI), it fails fast unless you set
-   `OPENSTEER_SESSION` or `OPENSTEER_CLIENT_ID`.
-
- Example:
+ If your environment skips Playwright browser downloads, run:

  ```bash
- export OPENSTEER_SESSION=agent-a
- opensteer open https://example.com --name product-scraper
- opensteer snapshot
- opensteer click 3
- opensteer status
+ npx playwright install chromium
  ```

- `opensteer status` reports `resolvedSession`, `sessionSource`, `resolvedName`, and `nameSource`.
-
- ## Quickstart
+ ## Quickstart (SDK)

  ```ts
  import { Opensteer } from "opensteer";

- const opensteer = new Opensteer({ name: "my-scraper" }); // defaults to model: 'gpt-5.1'
+ const opensteer = new Opensteer({ name: "my-scraper" });
  await opensteer.launch({ headless: false });

- await opensteer.goto("https://example.com");
- const html = await opensteer.snapshot();
-
- await opensteer.click({ description: "login-button" });
- await opensteer.input({ description: "email", text: "user@example.com" });
- await opensteer.page.keyboard.press("Enter");
+ try {
+   await opensteer.goto("https://example.com");
+   const html = await opensteer.snapshot();
+   console.log(html.slice(0, 500));

- await opensteer.close();
+   await opensteer.click({ description: "main call to action", element: 3 });
+ } finally {
+   await opensteer.close();
+ }
  ```

- ## Core Model
-
- - `opensteer.page`: raw Playwright `Page`
- - `opensteer.context`: raw Playwright `BrowserContext`
- - Opensteer methods: descriptor-aware operations that can persist selectors
- - Selector storage: `.opensteer/selectors/<namespace>`
-
- ## Resolution Chain
-
- For actions like `click`/`input`/`hover`/`select`/`scroll`:
-
- 1. Use persisted path for `description` (if present)
- 2. Use `element` counter from snapshot
- 3. Use explicit CSS `selector`
- 4. Use built-in LLM resolution (`description` required)
- 5. Throw
-
- When steps 2-4 resolve and `description` is provided, the path is persisted.
-
- ## Smart Post-Action Wait
-
- Mutating actions (`click`, `input`, `select`, `scroll`, etc.) include a
- best-effort post-action wait so delayed visual updates are usually settled
- before the method resolves.
-
- You can disable or tune this per call:
+ ## CUA Agent

  ```ts
- await opensteer.click({ description: "Save button", wait: false });
+ import { Opensteer } from "opensteer";

- await opensteer.click({
-   description: "Save button",
-   wait: { timeout: 9000, settleMs: 900, includeNetwork: true, networkQuietMs: 400 },
+ const opensteer = new Opensteer({
+   model: "openai/computer-use-preview",
  });
- ```

- ## Action Failure Diagnostics
+ await opensteer.launch();

- Descriptor-aware interaction methods (`click`, `dblclick`, `rightclick`,
- `hover`, `input`, `select`, `scroll`, `uploadFile`) throw
- `OpensteerActionError` when an interaction cannot be completed.
+ const agent = opensteer.agent({
+   mode: "cua",
+ });

- The error includes structured failure metadata for agent/tooling decisions:
+ const result = await agent.execute({
+   instruction: "Go to Hacker News and open the top story.",
+   maxSteps: 20,
+   highlightCursor: true,
+ });

- - `error.failure.code` (`ActionFailureCode`)
- - `error.failure.message`
- - `error.failure.retryable`
- - `error.failure.classificationSource`
- - `error.failure.details` (for blocker and observation details when available)
+ console.log(result.message);
+ await opensteer.close();
+ ```

- ```ts
- import { Opensteer, OpensteerActionError } from "opensteer";
+ Supported CUA providers in V1: `openai`, `anthropic`, `google`.

- try {
-   await opensteer.click({ description: "Save button" });
- } catch (err) {
-   if (err instanceof OpensteerActionError) {
-     console.error(err.failure.code); // e.g. BLOCKED_BY_INTERCEPTOR
-     console.error(err.failure.message);
-     console.error(err.failure.classificationSource);
-   }
-   throw err;
- }
- ```
+ ## Quickstart (CLI)

- ## Snapshot Modes
+ Opensteer CLI separates runtime routing from selector namespace routing.

- ```ts
- await opensteer.snapshot(); // action mode (default)
- await opensteer.snapshot({ mode: "extraction" });
- await opensteer.snapshot({ mode: "clickable" });
- await opensteer.snapshot({ mode: "scrollable" });
- await opensteer.snapshot({ mode: "full" });
- ```
+ - Runtime routing: `--session` or `OPENSTEER_SESSION`
+ - Selector namespace: `--name` or `OPENSTEER_NAME` (used by `open`)

- ## Two Usage Patterns
+ ```bash
+ opensteer open https://example.com --session agent-a --name product-scraper
+ opensteer snapshot --session agent-a
+ opensteer click 3 --session agent-a
+ opensteer status --session agent-a
+ opensteer close --session agent-a
+ ```

- ### Explore (coding agent, no API key required)
+ In non-interactive environments, set `OPENSTEER_SESSION` or
+ `OPENSTEER_CLIENT_ID` explicitly.

- Use `snapshot()` + `element` counters while exploring in real time, then persist
- stable descriptions for replay.
+ ## Resolution and Replay Model

- ### Run (script replay / built-in LLM)
+ For descriptor-aware actions (`click`, `input`, `hover`, `select`, `scroll`):

- Opensteer uses built-in LLM resolve/extract by default. You can override the
- default model with top-level `model` or `OPENSTEER_MODEL`.
+ 1. Reuse persisted path for `description`
+ 2. Use `element` counter from snapshot
+ 3. Use explicit CSS `selector`
+ 4. Use built-in LLM resolution (`description` required)
+ 5. Throw actionable error

- ```ts
- const opensteer = new Opensteer({
-   name: "run-mode",
-   model: "gpt-5-mini",
- });
- ```
+ When steps 2-4 succeed and `description` is present, Opensteer persists the
+ path for deterministic replay in `.opensteer/selectors/<namespace>`.

- ## Mode Selection
+ ## Cloud Mode

  Opensteer defaults to local mode.

- - `OPENSTEER_MODE=local` runs local Playwright.
- - `OPENSTEER_MODE=cloud` enables cloud mode (requires `OPENSTEER_API_KEY`).
- - `cloud: true` in constructor config always enables cloud mode.
- - Opensteer auto-loads `.env` files from your `storage.rootDir` (default:
-   `process.cwd()`) using this order: `.env.<NODE_ENV>.local`, `.env.local`
-   (skipped when `NODE_ENV=test`), `.env.<NODE_ENV>`, `.env`.
- - Existing `process.env` values are never overwritten by `.env` values.
- - Set `OPENSTEER_DISABLE_DOTENV_AUTOLOAD=true` to disable auto-loading.
+ - `OPENSTEER_MODE=local|cloud`
+ - `OPENSTEER_API_KEY` or `cloud.apiKey` required in cloud mode
+ - `OPENSTEER_BASE_URL` or `cloud.baseUrl` to override the default cloud host
+ - `OPENSTEER_AUTH_SCHEME` or `cloud.authScheme` for auth header mode
+   (`api-key` or `bearer`)
+ - `cloud: true` or a `cloud` options object overrides `OPENSTEER_MODE`

- Cloud mode is fail-fast: it does not automatically fall back to local mode.
+ `.env` files are auto-loaded from `storage.rootDir` (default `process.cwd()`)
+ in this order: `.env.<NODE_ENV>.local`, `.env.local` (except in test),
+ `.env.<NODE_ENV>`, `.env`. Existing `process.env` values are not overwritten.
+ Set `OPENSTEER_DISABLE_DOTENV_AUTOLOAD=true` to disable.

  ## Docs

- - `docs/getting-started.md`
- - `docs/api-reference.md`
- - `docs/cloud-integration.md`
- - `docs/html-cleaning.md`
- - `docs/selectors.md`
- - `docs/live-web-tests.md`
+ - [Getting Started](docs/getting-started.md)
+ - [API Reference](docs/api-reference.md)
+ - [CLI Reference](docs/cli-reference.md)
+ - [Cloud Integration](docs/cloud-integration.md)
+ - [Selectors and Storage](docs/selectors.md)
+ - [HTML Cleaning and Snapshot Modes](docs/html-cleaning.md)
+ - [Live Web Validation Suite](docs/live-web-tests.md)
+
+ ## Community
+
+ - [Contributing Guide](CONTRIBUTING.md)
+ - [Code of Conduct](CODE_OF_CONDUCT.md)
+ - [Security Policy](SECURITY.md)
+ - [Support](SUPPORT.md)
+ - [Changelog](CHANGELOG.md)

  ## License

- MIT
+ [MIT](LICENSE)
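
Reviewer note on the README's "Resolution and Replay Model" section: the five-step order it lists can be sketched as below. This is a minimal illustration of the documented order only, not Opensteer's implementation. The names `Target`, `Resolved`, `Resolvers`, and `resolveTarget` are hypothetical, the in-memory `Map` stands in for the `.opensteer/selectors/<namespace>` storage, and the resolvers are synchronous stubs.

```ts
// Hypothetical shapes for illustration; none of these names come from the package.
type Target = { description?: string; element?: number; selector?: string };
type Resolved = {
  path: string;
  source: "persisted" | "element" | "selector" | "llm";
};
type Resolvers = {
  // Maps a snapshot element counter to a path (stubbed in the usage below).
  elementPath: (counter: number) => string | undefined;
  // Stands in for built-in LLM resolution; needs a description.
  llmPath: (description: string) => string | undefined;
};

function resolveTarget(
  target: Target,
  store: Map<string, string>,
  resolvers: Resolvers,
): Resolved {
  const { description, element, selector } = target;

  // Step 1: reuse the persisted path for this description, if one exists.
  const persisted =
    description !== undefined ? store.get(description) : undefined;
  if (persisted !== undefined) return { path: persisted, source: "persisted" };

  let resolved: Resolved | undefined;

  // Step 2: element counter from a snapshot.
  if (element !== undefined) {
    const path = resolvers.elementPath(element);
    if (path !== undefined) resolved = { path, source: "element" };
  }

  // Step 3: explicit CSS selector.
  if (!resolved && selector !== undefined) {
    resolved = { path: selector, source: "selector" };
  }

  // Step 4: LLM resolution, which requires a description.
  if (!resolved && description !== undefined) {
    const path = resolvers.llmPath(description);
    if (path !== undefined) resolved = { path, source: "llm" };
  }

  // Step 5: nothing matched, so fail with an actionable message.
  if (!resolved) {
    throw new Error(
      "Could not resolve target: provide element, selector, or description",
    );
  }

  // When steps 2-4 succeed and a description is present, persist the path.
  if (description !== undefined) store.set(description, resolved.path);
  return resolved;
}

// Usage: the first call resolves via the element counter and persists the path;
// the second call replays from the store without an element counter.
const store = new Map<string, string>();
const resolvers: Resolvers = {
  elementPath: (n) => (n === 3 ? "#cta" : undefined),
  llmPath: () => undefined,
};
const first = resolveTarget(
  { description: "main call to action", element: 3 },
  store,
  resolvers,
);
const second = resolveTarget(
  { description: "main call to action" },
  store,
  resolvers,
);
console.log(first.source, second.source);
```

The second call is what the README calls deterministic replay: once a description has a stored path, later runs need neither the element counter nor an LLM call.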
package/bin/opensteer.mjs CHANGED
@@ -781,6 +781,7 @@ Environment:
  OPENSTEER_MODE             Runtime routing: "local" (default) or "cloud"
  OPENSTEER_API_KEY          Required when cloud mode is selected
  OPENSTEER_BASE_URL         Override cloud control-plane base URL
+ OPENSTEER_AUTH_SCHEME      Cloud auth scheme: api-key (default) or bearer
  OPENSTEER_REMOTE_ANNOUNCE  Cloud session announcement policy: always (default), off, tty
  `)
  }
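
Reviewer note on the help text in this hunk: the variables and defaults it documents can be read as a small resolution routine. The sketch below is an assumption-laden illustration, not code from `opensteer.mjs`. `resolveRuntimeConfig`, `Env`, and `RuntimeConfig` are hypothetical names, and the diff does not say how the real CLI treats unrecognized values, so rejecting them here is a guess.

```ts
// Hypothetical helper; it only mirrors the defaults stated in the help text.
type Env = Record<string, string | undefined>;

type RuntimeConfig =
  | { mode: "local" }
  | {
      mode: "cloud";
      apiKey: string;
      authScheme: "api-key" | "bearer";
      baseUrl: string | undefined;
    };

function resolveRuntimeConfig(env: Env): RuntimeConfig {
  // OPENSTEER_MODE: "local" (default) or "cloud".
  const mode = env.OPENSTEER_MODE ?? "local";
  if (mode === "local") return { mode: "local" };
  if (mode !== "cloud") {
    // Assumption: unknown modes are rejected rather than silently ignored.
    throw new Error(`Unsupported OPENSTEER_MODE: ${mode}`);
  }

  // OPENSTEER_API_KEY: required when cloud mode is selected.
  const apiKey = env.OPENSTEER_API_KEY;
  if (!apiKey) {
    throw new Error("OPENSTEER_API_KEY is required when cloud mode is selected");
  }

  // OPENSTEER_AUTH_SCHEME: api-key (default) or bearer.
  const authScheme = env.OPENSTEER_AUTH_SCHEME ?? "api-key";
  if (authScheme !== "api-key" && authScheme !== "bearer") {
    throw new Error(`Unsupported OPENSTEER_AUTH_SCHEME: ${authScheme}`);
  }

  // OPENSTEER_BASE_URL: optional override for the cloud control-plane URL.
  return { mode: "cloud", apiKey, authScheme, baseUrl: env.OPENSTEER_BASE_URL };
}

// Usage with placeholder values (the key is not a real credential).
const local = resolveRuntimeConfig({});
const cloud = resolveRuntimeConfig({
  OPENSTEER_MODE: "cloud",
  OPENSTEER_API_KEY: "example-key",
});
console.log(local.mode, cloud.mode);
```

Passing a plain object instead of reading `process.env` directly keeps the sketch testable. The README in this diff also says constructor options such as `cloud: true` override `OPENSTEER_MODE`, which this env-only sketch does not model.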