opensteer 0.10.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (38)
  1. opensteer-0.10.0/LICENSE +21 -0
  2. opensteer-0.10.0/PKG-INFO +76 -0
  3. opensteer-0.10.0/README.md +63 -0
  4. opensteer-0.10.0/opensteer/__init__.py +13 -0
  5. opensteer-0.10.0/opensteer/agent_skills/SKILL.md +80 -0
  6. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/connection.md +46 -0
  7. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/cookies.md +3 -0
  8. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/cross-origin-iframes.md +3 -0
  9. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/dialogs.md +64 -0
  10. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/downloads.md +3 -0
  11. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/drag-and-drop.md +3 -0
  12. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/dropdowns.md +3 -0
  13. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/iframes.md +3 -0
  14. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/network-requests.md +3 -0
  15. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/print-as-pdf.md +3 -0
  16. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/screenshots.md +17 -0
  17. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/scrolling.md +3 -0
  18. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/shadow-dom.md +3 -0
  19. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/tabs.md +69 -0
  20. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/uploads.md +1 -0
  21. opensteer-0.10.0/opensteer/agent_skills/interaction-skills/viewport.md +3 -0
  22. opensteer-0.10.0/opensteer/daemon.py +865 -0
  23. opensteer-0.10.0/opensteer/env.py +17 -0
  24. opensteer-0.10.0/opensteer/errors.py +202 -0
  25. opensteer-0.10.0/opensteer/helpers.py +409 -0
  26. opensteer-0.10.0/opensteer/http_client.py +110 -0
  27. opensteer-0.10.0/opensteer/paths.py +54 -0
  28. opensteer-0.10.0/opensteer/run.py +65 -0
  29. opensteer-0.10.0/opensteer/runtime.py +480 -0
  30. opensteer-0.10.0/opensteer/skill_installer.py +35 -0
  31. opensteer-0.10.0/opensteer.egg-info/PKG-INFO +76 -0
  32. opensteer-0.10.0/opensteer.egg-info/SOURCES.txt +36 -0
  33. opensteer-0.10.0/opensteer.egg-info/dependency_links.txt +1 -0
  34. opensteer-0.10.0/opensteer.egg-info/entry_points.txt +2 -0
  35. opensteer-0.10.0/opensteer.egg-info/requires.txt +4 -0
  36. opensteer-0.10.0/opensteer.egg-info/top_level.txt +1 -0
  37. opensteer-0.10.0/pyproject.toml +51 -0
  38. opensteer-0.10.0/setup.cfg +4 -0
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Opensteer
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,76 @@
1
+ Metadata-Version: 2.4
2
+ Name: opensteer
3
+ Version: 0.10.0
4
+ Summary: Global browser-control runtime and agent skills for Opensteer.
5
+ Requires-Python: >=3.11
6
+ Description-Content-Type: text/markdown
7
+ License-File: LICENSE
8
+ Requires-Dist: cdp-use==1.4.5
9
+ Requires-Dist: fetch-use==0.4.0
10
+ Requires-Dist: pillow==12.2.0
11
+ Requires-Dist: websockets==16.0
12
+ Dynamic: license-file
13
+
14
+ # Opensteer
15
+
16
+ Opensteer is a global browser-control runtime for agents and harness packs.
17
+
18
+ It provides:
19
+
20
+ - CDP browser helpers
21
+ - Local browser attach
22
+ - Named browser sessions
23
+ - Opensteer Cloud browser attach
24
+ - Profile helper functions
25
+ - Generic browser interaction skills
26
+
27
+ Harness packs provide domain-specific tools, selectors, workflows, databases, setup docs, and agent skills.
28
+
29
+ ## Install
30
+
31
+ ```bash
32
+ uv tool install opensteer
33
+ opensteer skills install
34
+ opensteer --setup
35
+ ```
36
+
37
+ Or:
38
+
39
+ ```bash
40
+ curl -fsSL https://opensteer.com/install.sh | sh
41
+ ```
42
+
43
+ ## Use
44
+
45
+ ```bash
46
+ opensteer -c "print(page_info())"
47
+ ```
48
+
49
+ ```python
50
+ from opensteer.helpers import goto_url, js, click_at_xy, type_text, wait_for_load
51
+ ```
52
+
53
+ Named browser sessions are routed with `OPENSTEER_NAME`:
54
+
55
+ ```bash
56
+ OPENSTEER_NAME=linkedin opensteer -c "new_tab('https://linkedin.com')"
57
+ ```
58
+
59
+ ## Harness Packs
60
+
61
+ Harness packs depend on the installed `opensteer` package. They should not mutate Opensteer runtime files.
62
+
63
+ A pack can import helpers directly:
64
+
65
+ ```python
66
+ from opensteer.helpers import goto_url, js, click_at_xy
67
+ ```
68
+
69
+ Or provide a small local shim:
70
+
71
+ ```python
72
+ # actions/helpers.py
73
+ from opensteer.helpers import *
74
+ ```
75
+
76
+ Opensteer stays generic. Pack-specific browser workflows, selectors, API clients, and tools live in the harness pack.
@@ -0,0 +1,63 @@
1
+ # Opensteer
2
+
3
+ Opensteer is a global browser-control runtime for agents and harness packs.
4
+
5
+ It provides:
6
+
7
+ - CDP browser helpers
8
+ - Local browser attach
9
+ - Named browser sessions
10
+ - Opensteer Cloud browser attach
11
+ - Profile helper functions
12
+ - Generic browser interaction skills
13
+
14
+ Harness packs provide domain-specific tools, selectors, workflows, databases, setup docs, and agent skills.
15
+
16
+ ## Install
17
+
18
+ ```bash
19
+ uv tool install opensteer
20
+ opensteer skills install
21
+ opensteer --setup
22
+ ```
23
+
24
+ Or:
25
+
26
+ ```bash
27
+ curl -fsSL https://opensteer.com/install.sh | sh
28
+ ```
29
+
30
+ ## Use
31
+
32
+ ```bash
33
+ opensteer -c "print(page_info())"
34
+ ```
35
+
36
+ ```python
37
+ from opensteer.helpers import goto_url, js, click_at_xy, type_text, wait_for_load
38
+ ```
39
+
40
+ Named browser sessions are routed with `OPENSTEER_NAME`:
41
+
42
+ ```bash
43
+ OPENSTEER_NAME=linkedin opensteer -c "new_tab('https://linkedin.com')"
44
+ ```
45
+
46
+ ## Harness Packs
47
+
48
+ Harness packs depend on the installed `opensteer` package. They should not mutate Opensteer runtime files.
49
+
50
+ A pack can import helpers directly:
51
+
52
+ ```python
53
+ from opensteer.helpers import goto_url, js, click_at_xy
54
+ ```
55
+
56
+ Or provide a small local shim:
57
+
58
+ ```python
59
+ # actions/helpers.py
60
+ from opensteer.helpers import *
61
+ ```
62
+
63
+ Opensteer stays generic. Pack-specific browser workflows, selectors, API clients, and tools live in the harness pack.
@@ -0,0 +1,13 @@
1
+ """Opensteer browser-control runtime."""
2
+
3
+ __all__ = ["__version__"]
4
+
5
+ try:
6
+     from importlib.metadata import PackageNotFoundError, version
7
+
8
+     try:
9
+         __version__ = version("opensteer")
10
+     except PackageNotFoundError:
11
+         __version__ = "0.0.0"
12
+ except Exception:
13
+     __version__ = "0.0.0"
@@ -0,0 +1,80 @@
1
+ ---
2
+ name: opensteer
3
+ description: Direct browser control via CDP. Use when the user wants to automate, inspect, scrape, test, or interact with web pages.
4
+ ---
5
+
6
+ # Opensteer
7
+
8
+ Opensteer is a global browser-control runtime. It exposes small CDP primitives to agents and harness packs.
9
+
10
+ ## Usage
11
+
12
+ ```bash
13
+ opensteer -c "print(page_info())"
14
+ ```
15
+
16
+ Python snippets run with helpers pre-imported:
17
+
18
+ ```bash
19
+ opensteer -c "new_tab('https://docs.opensteer.com'); wait_for_load(); print(page_info())"
20
+ ```
21
+
22
+ For harness code, import helpers from the installed package:
23
+
24
+ ```python
25
+ from opensteer.helpers import goto_url, js, click_at_xy, type_text, wait_for_load
26
+ ```
27
+
28
+ ## Named Sessions
29
+
30
+ Use `OPENSTEER_NAME` to route commands to a named browser session:
31
+
32
+ ```bash
33
+ OPENSTEER_NAME=linkedin opensteer -c "print(page_info())"
34
+ ```
35
+
36
+ Remote browser sessions can be started from a Python snippet when `OPENSTEER_API_KEY` is set:
37
+
38
+ ```bash
39
+ opensteer -c "start_remote_daemon('linkedin', profileId='bp_...')"
40
+ OPENSTEER_NAME=linkedin opensteer -c "new_tab('https://example.com')"
41
+ ```
42
+
43
+ ## Interaction Skills
44
+
45
+ Generic browser mechanics live in `interaction-skills/`:
46
+
47
+ - connection.md
48
+ - cookies.md
49
+ - cross-origin-iframes.md
50
+ - dialogs.md
51
+ - downloads.md
52
+ - drag-and-drop.md
53
+ - dropdowns.md
54
+ - iframes.md
55
+ - network-requests.md
56
+ - print-as-pdf.md
57
+ - screenshots.md
58
+ - scrolling.md
59
+ - shadow-dom.md
60
+ - tabs.md
61
+ - uploads.md
62
+ - viewport.md
63
+
64
+ Use them when a page mechanic is tricky. Put domain-specific selectors, workflows, APIs, local databases, and task tools in the harness pack.
65
+
66
+ ## What Works Well
67
+
68
+ - Start with screenshots for visible state: `capture_screenshot()`.
69
+ - Prefer coordinate clicks for visible targets: `click_at_xy(x, y)`.
70
+ - Use `js(...)` for DOM inspection, extraction, and page API discovery.
71
+ - Use `cdp("Domain.method", ...)` for raw CDP operations not covered by helpers.
72
+ - After navigation, call `wait_for_load()`.
73
+ - Navigate the current Opensteer-owned tab by default; use `list_tabs()` and `switch_tab(target_id)` only when the user explicitly asks for a specific existing tab.
74
+ - If a page redirects to login, stop and ask the user. Do not type credentials from screenshots.
75
+
76
+ ## Boundaries
77
+
78
+ Opensteer owns generic browser primitives, local attach, remote attach, cloud browser attach, and generic interaction skills.
79
+
80
+ Harness packs own domain-specific tools, selectors, workflows, storage, setup docs, and agent skills.
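A harness script that shells out to the CLI can set `OPENSTEER_NAME` per call rather than exporting it globally. The sketch below only builds the argument list and environment for such a call. `session_command` is a hypothetical helper, not part of the `opensteer` package, and it assumes the `opensteer -c "<snippet>"` invocation shown in SKILL.md.

```python
import os


def session_command(name, snippet, base_env=None):
    """Build argv and env for running a snippet against a named session.

    Hypothetical helper for illustration; not part of the opensteer package.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OPENSTEER_NAME"] = name  # routes the command to the named session
    argv = ["opensteer", "-c", snippet]
    return argv, env


argv, env = session_command("linkedin", "print(page_info())", base_env={})
print(argv)
print(env["OPENSTEER_NAME"])
```

In real use, leave `base_env` unset so the child process inherits `PATH`, then pass the result to `subprocess.run(argv, env=env, check=True)`.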
@@ -0,0 +1,46 @@
1
+ # Connection & Tab Visibility
2
+
3
+ ## The omnibox popup problem
4
+
5
+ When Chrome opens fresh, the only CDP `type: "page"` targets are often `chrome://inspect` and `chrome://omnibox-popup.top-chrome/` (a 1px invisible viewport). Opensteer avoids taking over arbitrary browser pages by restoring only tabs already owned by the current `OPENSTEER_NAME`; if none are still live, it creates a fresh `about:blank` tab and controls that.
6
+
7
+ For Opensteer-managed cloud CDP grants, visible targets are already scoped by the runtime proxy; if exactly one controller-owned page exists, the daemon attaches to that page instead of creating a duplicate blank tab.
8
+
9
+ If the user asks you to control a specific existing tab, use `list_tabs()` to find it and `switch_tab(target_id)` to explicitly attach to it. If you still end up on an invisible tab, `switch_tab()` calls `Target.activateTarget` to bring the selected tab to front.
10
+
11
+ ## Startup sequence
12
+
13
+ 1. Check if a daemon is already running with `daemon_alive()`
14
+ 2. If stale sockets exist but daemon is dead, clean them up
15
+ 3. Let the daemon attach to the current Opensteer-owned tab or create a fresh `about:blank`
16
+ 4. Navigate the owned tab with `goto_url()`
17
+ 5. Only use `list_tabs()` and `switch_tab(target_id)` when the user explicitly wants an existing tab
18
+
19
+ ```python
20
+ if not daemon_alive():
21
+     import os
22
+     from opensteer.paths import session_paths
23
+     for f in session_paths("default")[:2]:
24
+         if os.path.exists(f):
25
+             os.unlink(f)
26
+     ensure_daemon()
27
+
28
+ goto_url("https://example.com")
29
+ ```
30
+
31
+ ## Bringing Chrome to front
32
+
33
+ If Chrome is behind other windows or on another desktop:
34
+
35
+ ```python
36
+ import subprocess
37
+ subprocess.run(["osascript", "-e", 'tell application "Google Chrome" to activate'])
38
+ ```
39
+
40
+ ## Navigating
41
+
42
+ Prefer navigating the current Opensteer-owned tab. Tabs created via CDP's `Target.createTarget` are visible but may open behind the active tab.
43
+
44
+ ```python
45
+ goto_url("https://example.com")
46
+ ```
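Step 2 of the startup sequence (cleaning up stale files left by a dead daemon) can be sketched without a browser. The helper below is hypothetical and uses throwaway files as stand-ins; the real paths would come from `session_paths(...)`, whose return value this diff shows only through the `[:2]` slice.

```python
import os
import tempfile


def remove_stale_files(paths):
    """Delete each path that exists and return the ones actually removed."""
    removed = []
    for path in paths:
        try:
            os.unlink(path)
        except FileNotFoundError:
            continue  # already gone, nothing to clean up
        removed.append(path)
    return removed


# Demo: temporary files stand in for a dead daemon's socket and pid file.
with tempfile.TemporaryDirectory() as tmp:
    sock = os.path.join(tmp, "demo.sock")
    pid = os.path.join(tmp, "demo.pid")
    open(sock, "w").close()  # only the socket file exists
    removed = remove_stale_files([sock, pid])
    still_there = os.path.exists(sock)

print([os.path.basename(p) for p in removed])  # ['demo.sock']
```

Catching `FileNotFoundError` instead of checking `os.path.exists()` first avoids a race if another process removes the file between the check and the unlink.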
@@ -0,0 +1,3 @@
1
+ # Cookies
2
+
3
+ Document how to get cookies, save cookies, and set cookies without confusing browser state with page state.
@@ -0,0 +1,3 @@
1
+ # Cross-Origin Iframes
2
+
3
+ Focus on `iframe_target(...)`, target attachment, and when compositor-level coordinate clicks are lower-friction than cross-target DOM work.
@@ -0,0 +1,64 @@
1
+ # Dialogs
2
+
3
+ Browser dialogs (`alert`, `confirm`, `prompt`, `beforeunload`) freeze the JS thread. Two approaches depending on timing.
4
+
5
+ ## Detection
6
+
7
+ `page_info()` auto-surfaces any open dialog: if one is pending it returns `{"dialog": {"type", "message", ...}}` instead of the usual viewport dict (because the page's JS is frozen anyway). So if you call `page_info()` after an action and see a `dialog` key, handle it before doing anything else.
8
+
9
+ ## Reactive: dismiss via CDP (preferred)
10
+
11
+ Works even when JS is frozen. Handles all dialog types including `beforeunload`.
12
+
13
+ ```python
14
+ # Dismiss the dialog: call one of these, not both
15
+ cdp("Page.handleJavaScriptDialog", accept=True) # accept / click OK
16
+ cdp("Page.handleJavaScriptDialog", accept=False) # cancel / click Cancel
17
+
18
+ # Read what the dialog said (from buffered CDP events)
19
+ events = drain_events()
20
+ for e in events:
21
+     if e["method"] == "Page.javascriptDialogOpening":
22
+         print(e["params"]["type"]) # "alert", "confirm", "prompt", "beforeunload"
23
+         print(e["params"]["message"]) # the dialog text
24
+ ```
25
+
26
+ Undetectable by antibot — no JS injected into the page.
27
+
28
+ ## Proactive: stub via JS
29
+
30
+ Prevents dialogs from ever appearing. Good when you expect multiple `alert()`/`confirm()` calls in sequence.
31
+
32
+ ```python
33
+ js("""
34
+ window.__dialogs__=[];
35
+ window.alert=m=>window.__dialogs__.push(String(m));
36
+ window.confirm=m=>{window.__dialogs__.push(String(m));return true;};
37
+ window.prompt=(m,d)=>{window.__dialogs__.push(String(m));return d||'';};
38
+ """)
39
+ # ... do actions that trigger dialogs ...
40
+ msgs = js("window.__dialogs__||[]")
41
+ ```
42
+
43
+ Tradeoffs:
44
+ - Stubs are lost on page navigation -- must re-run the snippet
45
+ - `confirm()` always returns `true` (auto-approves)
46
+ - Detectable by antibot (`window.alert.toString()` reveals non-native code)
47
+ - Does NOT handle `beforeunload`
48
+
49
+ ## beforeunload specifically
50
+
51
+ Fires when navigating away from a page with unsaved changes (forms, editors, upload pages). The page freezes until the user clicks Leave/Stay.
52
+
53
+ ```python
54
+ # Option A: dismiss after navigating (CDP-level, safe)
55
+ goto_url("https://new-url.com")
56
+ try:
57
+     cdp("Page.handleJavaScriptDialog", accept=True) # click "Leave"
58
+ except Exception:
59
+     pass # no dialog is the normal case
60
+
61
+ # Option B: prevent before navigating (JS injection, detectable)
62
+ js("window.onbeforeunload=null")
63
+ goto_url("https://new-url.com")
64
+ ```
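The event scan in the reactive approach can be factored into a small function and tried offline. This is a sketch, assuming events are dicts with the `method` and `params` keys used in dialogs.md. `dialog_openings` is a hypothetical name, and the sample events are hand-written rather than captured from `drain_events()`.

```python
def dialog_openings(events):
    """Return (type, message) for each Page.javascriptDialogOpening event."""
    found = []
    for event in events:
        if event.get("method") != "Page.javascriptDialogOpening":
            continue
        params = event.get("params", {})
        found.append((params.get("type"), params.get("message")))
    return found


# Hand-written sample events; real ones would come from drain_events().
sample = [
    {"method": "Page.frameNavigated", "params": {}},
    {
        "method": "Page.javascriptDialogOpening",
        "params": {"type": "confirm", "message": "Delete item?"},
    },
]
print(dialog_openings(sample))  # [('confirm', 'Delete item?')]
```

Using `.get()` keeps the scan from raising on events that lack a `params` payload.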
@@ -0,0 +1,3 @@
1
+ # Downloads
2
+
3
+ Separate browser-triggered downloads from direct `http_get(...)` fetches, and document the minimal signals that prove a download actually started.
@@ -0,0 +1,3 @@
1
+ # Drag And Drop
2
+
3
+ Focus on when drag-and-drop can be driven with low-level input events versus when the site really expects a file upload or DOM-specific drag sequence.
@@ -0,0 +1,3 @@
1
+ # Dropdowns
2
+
3
+ Split dropdowns into native selects, custom overlays, searchable comboboxes, and virtualized menus, and always re-measure after opening because option geometry often appears late.
@@ -0,0 +1,3 @@
1
+ # Iframes
2
+
3
+ Cover same-origin iframe traversal through `contentDocument` / `contentWindow`, and keep the frame-local versus page-coordinate warning explicit for clicks.
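The frame-local versus page-coordinate warning comes down to one addition. A rect measured inside an iframe's document is relative to that iframe's viewport, so the iframe's own position in the parent must be added before clicking. The sketch below assumes no CSS transforms and no nested frames. `frame_to_page` and its parameters are hypothetical names, and the inset arguments stand for the iframe's border plus padding.

```python
def frame_to_page(local_x, local_y, frame_left, frame_top, inset_left=0, inset_top=0):
    """Convert a point inside an iframe to top-level viewport coordinates.

    frame_left/frame_top: the iframe element's rect origin in the parent page.
    inset_left/inset_top: the iframe's border plus padding, if any.
    """
    return (frame_left + inset_left + local_x, frame_top + inset_top + local_y)


# A button at (40, 25) inside an iframe positioned at (300, 120) in the page.
print(frame_to_page(40, 25, frame_left=300, frame_top=120))  # (340, 145)
```

For nested frames the same offset would be applied once per level, from the innermost frame outward.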
@@ -0,0 +1,3 @@
1
+ # Network Requests
2
+
3
+ Document how to watch or infer network activity when page state is ambiguous, especially for submit flows, downloads, and SPA actions that succeed without obvious DOM changes.
@@ -0,0 +1,3 @@
1
+ # Print As PDF
2
+
3
+ Cover both direct PDF generation via CDP and sites that only expose a visible "Print" button which must be clicked before handling the browser print flow.
@@ -0,0 +1,17 @@
1
+ # Screenshots
2
+
3
+ `capture_screenshot()` writes a PNG of the current viewport. The file is in **device pixels** — on a 2× display a 2296×1143 CSS viewport produces a 4592×2286 PNG.
4
+
5
+ That matters for two reasons:
6
+
7
+ 1. **Click coordinates are CSS pixels.** A position read off the image must be divided by `devicePixelRatio` before it is passed to `click_at_xy()`. Either view the screenshot in a tool that reports CSS coordinates, or measure positions on the image and convert them using `js("window.devicePixelRatio")`.
8
+
9
+ 2. **Some LLMs reject images > 2000 px per side.** Long sessions on 2× displays will eventually hit this. Pass `max_dim=1800` to downscale the file before it gets into the conversation:
10
+
11
+ ```python
12
+ capture_screenshot("/tmp/shot.png", max_dim=1800)
13
+ ```
14
+
15
+ The downscale only happens when the image actually exceeds `max_dim`, so it's safe to leave on for every shot.
16
+
17
+ Use full-page screenshots (`full=True`) only when you need to see content below the fold — they are much larger and slower than viewport-only.
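The pixel arithmetic described in screenshots.md can be checked with two small functions. This is a sketch under assumptions: both names are hypothetical, and `downscale_factor` assumes `max_dim` scales both sides uniformly to fit the longer side, which this diff does not confirm for `capture_screenshot()`.

```python
def image_to_css(x_img, y_img, device_pixel_ratio, scale=1.0):
    """Map a point measured on the saved PNG back to CSS pixels.

    scale is the downscale factor applied to the file (1.0 if none).
    """
    if device_pixel_ratio <= 0 or scale <= 0:
        raise ValueError("device_pixel_ratio and scale must be positive")
    # Undo the downscale first to get device pixels, then divide by the DPR.
    return (x_img / scale / device_pixel_ratio, y_img / scale / device_pixel_ratio)


def downscale_factor(width, height, max_dim):
    """Factor that fits the longer side within max_dim; never upscales."""
    return min(1.0, max_dim / max(width, height))


# The 2x example from the text: a 4592x2286 PNG of a 2296x1143 CSS viewport.
print(image_to_css(4592, 2286, device_pixel_ratio=2))  # (2296.0, 1143.0)
print(downscale_factor(1000, 500, max_dim=1800))       # 1.0
```

When a shot has been downscaled, pass the same factor as `scale` so that coordinates read off the smaller file still land on the right CSS pixel.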
@@ -0,0 +1,3 @@
1
+ # Scrolling
2
+
3
+ Separate page scroll, nested containers, virtualized lists, and dropdown menus, and identify which element is actually consuming wheel events before scrolling.
@@ -0,0 +1,3 @@
1
+ # Shadow DOM
2
+
3
+ Focus on recursive `shadowRoot` traversal, and note when coordinate clicking is simpler than piercing deeply nested component trees.
@@ -0,0 +1,69 @@
1
+ # Tabs
2
+
3
+ Use **CDP for control**, **UI automation for user-visible order**.
4
+
5
+ ## Pure CDP (portable: macOS / Linux / Windows)
6
+
7
+ ```python
8
+ tabs = list_tabs() # includes chrome:// pages too
9
+ real_tabs = list_tabs(include_chrome=False)
10
+ tid = new_tab("https://example.com") # create + attach
11
+ switch_tab(tid) # attach harness to tab
12
+ cdp("Target.activateTarget", targetId=tid) # show it in Chrome
13
+ print(current_tab())
14
+ print(page_info())
15
+ ```
16
+
17
+ What CDP is good at:
18
+ - attach to a tab
19
+ - open a tab
20
+ - activate a known target
21
+ - inspect URL/title/viewport
22
+ - capture the attached tab's screenshot even if another tab is visibly frontmost
23
+
24
+ What CDP is bad at:
25
+ - matching the **left-to-right tab strip order** the user sees
26
+ - telling whether the attached target is an omnibox popup / internal page without URL filtering
27
+
28
+ ## Visible order (platform UI)
29
+
30
+ ### macOS
31
+
32
+ ```applescript
33
+ tell application "Google Chrome"
34
+ set out to {}
35
+ set i to 1
36
+ repeat with t in every tab of front window
37
+ set end of out to {tab_index:i, tab_title:(title of t), tab_url:(URL of t)}
38
+ set i to i + 1
39
+ end repeat
40
+ return out
41
+ end tell
42
+ ```
43
+
44
+ ```applescript
45
+ tell application "Google Chrome"
46
+ set active tab index of front window to 2
47
+ activate
48
+ end tell
49
+ ```
50
+
51
+ ### Linux
52
+
53
+ No AppleScript. Same split still applies:
54
+ - use CDP for `new_tab`, attach, inspect, activate known targets
55
+ - use window-manager / browser UI automation when the user means visible order
56
+
57
+ Typical tools:
58
+ - `xdotool`
59
+ - `wmctrl`
60
+ - desktop-environment scripting (`gdbus`, KWin, GNOME Shell extensions, etc.)
61
+
62
+ ## Rules that held up in practice
63
+
64
+ - `switch_tab()` is **not enough** if the user expects Chrome to visibly change.
65
+ - `Target.activateTarget` is the CDP-side "show this tab".
66
+ - `list_tabs()` includes `chrome://newtab/` by default; ask for `include_chrome=False` when you want only real pages.
67
+ - `chrome://omnibox-popup.top-chrome/` can appear as a fake page target; ignore it for user-facing tab lists.
68
+ - If a page has `w=0 h=0`, you may be attached to the wrong target or a non-window surface.
69
+ - For dynamic UIs, re-read element rects after opening dropdowns / modals before coordinate-clicking.
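The URL filtering mentioned in the rules above can be sketched as a plain list filter. This is not the implementation behind `list_tabs(include_chrome=False)`. It assumes each target is a dict with `type` and `url` keys, as in CDP's `Target.TargetInfo`, and `user_facing_tabs` is a hypothetical name.

```python
def user_facing_tabs(targets):
    """Keep page targets whose URL is not an internal chrome:// page."""
    return [
        t for t in targets
        if t.get("type") == "page" and not t.get("url", "").startswith("chrome://")
    ]


# Hand-written sample shaped like CDP Target.TargetInfo entries.
sample = [
    {"targetId": "A1", "type": "page", "url": "chrome://omnibox-popup.top-chrome/"},
    {"targetId": "B2", "type": "page", "url": "https://example.com/"},
    {"targetId": "C3", "type": "page", "url": "chrome://newtab/"},
    {"targetId": "D4", "type": "service_worker", "url": "https://example.com/sw.js"},
]
print([t["targetId"] for t in user_facing_tabs(sample)])  # ['B2']
```

The filter removes both the omnibox popup and `chrome://newtab/`, the two internal targets the rules call out.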
@@ -0,0 +1,3 @@
1
+ # Viewport
2
+
3
+ Cover how viewport size changes affect layout, coordinate clicks, and any workflow that depends on stable geometry.