PyPI - opensteer - Versions diffs - 0.10.0__tar.gz - Mend

opensteer 0.10.0__tar.gz


Files changed (38)

opensteer-0.10.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Opensteer
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

opensteer-0.10.0/PKG-INFO ADDED

@@ -0,0 +1,76 @@
+Metadata-Version: 2.4
+Name: opensteer
+Version: 0.10.0
+Summary: Global browser-control runtime and agent skills for Opensteer.
+Requires-Python: >=3.11
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: cdp-use==1.4.5
+Requires-Dist: fetch-use==0.4.0
+Requires-Dist: pillow==12.2.0
+Requires-Dist: websockets==16.0
+Dynamic: license-file
+# Opensteer
+Opensteer is a global browser-control runtime for agents and harness packs.
+It provides:
+- CDP browser helpers
+- Local browser attach
+- Named browser sessions
+- Opensteer Cloud browser attach
+- Profile helper functions
+- Generic browser interaction skills
+Harness packs provide domain-specific tools, selectors, workflows, databases, setup docs, and agent skills.
+## Install
+```bash
+uv tool install opensteer
+opensteer skills install
+opensteer --setup
+```
+Or:
+```bash
+curl -fsSL https://opensteer.com/install.sh | sh
+```
+## Use
+```bash
+opensteer -c "print(page_info())"
+```
+```python
+from opensteer.helpers import goto_url, js, click_at_xy, type_text, wait_for_load
+```
+Named browser sessions are routed with `OPENSTEER_NAME`:
+```bash
+OPENSTEER_NAME=linkedin opensteer -c "new_tab('https://linkedin.com')"
+```
+## Harness Packs
+Harness packs depend on the installed `opensteer` package. They should not mutate Opensteer runtime files.
+A pack can import helpers directly:
+```python
+from opensteer.helpers import goto_url, js, click_at_xy
+```
+Or provide a small local shim:
+```python
+# actions/helpers.py
+from opensteer.helpers import *
+```
+Opensteer stays generic. Pack-specific browser workflows, selectors, API clients, and tools live in the harness pack.

opensteer-0.10.0/README.md ADDED

@@ -0,0 +1,63 @@
+# Opensteer
+Opensteer is a global browser-control runtime for agents and harness packs.
+It provides:
+- CDP browser helpers
+- Local browser attach
+- Named browser sessions
+- Opensteer Cloud browser attach
+- Profile helper functions
+- Generic browser interaction skills
+Harness packs provide domain-specific tools, selectors, workflows, databases, setup docs, and agent skills.
+## Install
+```bash
+uv tool install opensteer
+opensteer skills install
+opensteer --setup
+```
+Or:
+```bash
+curl -fsSL https://opensteer.com/install.sh | sh
+```
+## Use
+```bash
+opensteer -c "print(page_info())"
+```
+```python
+from opensteer.helpers import goto_url, js, click_at_xy, type_text, wait_for_load
+```
+Named browser sessions are routed with `OPENSTEER_NAME`:
+```bash
+OPENSTEER_NAME=linkedin opensteer -c "new_tab('https://linkedin.com')"
+```
+## Harness Packs
+Harness packs depend on the installed `opensteer` package. They should not mutate Opensteer runtime files.
+A pack can import helpers directly:
+```python
+from opensteer.helpers import goto_url, js, click_at_xy
+```
+Or provide a small local shim:
+```python
+# actions/helpers.py
+from opensteer.helpers import *
+```
+Opensteer stays generic. Pack-specific browser workflows, selectors, API clients, and tools live in the harness pack.

opensteer-0.10.0/opensteer/__init__.py ADDED

@@ -0,0 +1,13 @@
+"""Opensteer browser-control runtime."""
+__all__ = ["__version__"]
+try:
+    from importlib.metadata import PackageNotFoundError, version
+    try:
+        __version__ = version("opensteer")
+    except PackageNotFoundError:
+        __version__ = "0.0.0"
+except Exception:
+    __version__ = "0.0.0"

opensteer-0.10.0/opensteer/agent_skills/SKILL.md ADDED

@@ -0,0 +1,80 @@
+---
+name: opensteer
+description: Direct browser control via CDP. Use when the user wants to automate, inspect, scrape, test, or interact with web pages.
+---
+# Opensteer
+Opensteer is a global browser-control runtime. It exposes small CDP primitives to agents and harness packs.
+## Usage
+```bash
+opensteer -c "print(page_info())"
+```
+Python snippets run with helpers pre-imported:
+```bash
+opensteer -c "new_tab('https://docs.opensteer.com'); wait_for_load(); print(page_info())"
+```
+For harness code, import helpers from the installed package:
+```python
+from opensteer.helpers import goto_url, js, click_at_xy, type_text, wait_for_load
+```
+## Named Sessions
+Use `OPENSTEER_NAME` to route commands to a named browser session:
+```bash
+OPENSTEER_NAME=linkedin opensteer -c "print(page_info())"
+```
+Remote browser sessions can be started from a Python snippet when `OPENSTEER_API_KEY` is set:
+```bash
+opensteer -c "start_remote_daemon('linkedin', profileId='bp_...')"
+OPENSTEER_NAME=linkedin opensteer -c "new_tab('https://example.com')"
+```
+## Interaction Skills
+Generic browser mechanics live in `interaction-skills/`:
+- connection.md
+- cookies.md
+- cross-origin-iframes.md
+- dialogs.md
+- downloads.md
+- drag-and-drop.md
+- dropdowns.md
+- iframes.md
+- network-requests.md
+- print-as-pdf.md
+- screenshots.md
+- scrolling.md
+- shadow-dom.md
+- tabs.md
+- uploads.md
+- viewport.md
+Use them when a page mechanic is tricky. Put domain-specific selectors, workflows, APIs, local databases, and task tools in the harness pack.
+## What Works Well
+- Start with screenshots for visible state: `capture_screenshot()`.
+- Prefer coordinate clicks for visible targets: `click_at_xy(x, y)`.
+- Use `js(...)` for DOM inspection, extraction, and page API discovery.
+- Use `cdp("Domain.method", ...)` for raw CDP operations not covered by helpers.
+- After navigation, call `wait_for_load()`.
+- Navigate the current Opensteer-owned tab by default; use `list_tabs()` and `switch_tab(target_id)` only when the user explicitly asks for a specific existing tab.
+- If a page redirects to login, stop and ask the user. Do not type credentials from screenshots.
+## Boundaries
+Opensteer owns generic browser primitives, local attach, remote attach, cloud browser attach, and generic interaction skills.
+Harness packs own domain-specific tools, selectors, workflows, storage, setup docs, and agent skills.
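
Note (illustrative, not part of the packaged file): the `OPENSTEER_NAME` routing described under "Named Sessions" can also be driven from Python by setting the variable in the child process environment. The sketch below only builds the command and environment; `opensteer_command` is a hypothetical helper name, and the `subprocess.run` call is left commented out because it assumes the `opensteer` CLI is installed.

```python
import os
from typing import Optional


def opensteer_command(snippet: str, session_name: Optional[str] = None):
    """Build (argv, env) for the `opensteer -c "<snippet>"` form shown in SKILL.md.

    `opensteer_command` is a hypothetical helper, not an Opensteer API.
    """
    env = dict(os.environ)  # copy, so the parent environment is not mutated
    if session_name is not None:
        # OPENSTEER_NAME selects the named browser session, per SKILL.md.
        env["OPENSTEER_NAME"] = session_name
    return ["opensteer", "-c", snippet], env


argv, env = opensteer_command("print(page_info())", session_name="linkedin")
# import subprocess
# subprocess.run(argv, env=env, check=True)  # run only where opensteer is installed
```

Passing the name through `env` keeps the routing explicit per call instead of relying on whatever the parent shell exported.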

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/connection.md ADDED

@@ -0,0 +1,46 @@
+# Connection & Tab Visibility
+## The omnibox popup problem
+When Chrome opens fresh, the only CDP `type: "page"` targets are often `chrome://inspect` and `chrome://omnibox-popup.top-chrome/` (a 1px invisible viewport). Opensteer avoids taking over arbitrary browser pages by restoring only tabs already owned by the current `OPENSTEER_NAME`; if none are still live, it creates a fresh `about:blank` tab and controls that.
+For Opensteer-managed cloud CDP grants, visible targets are already scoped by the runtime proxy; if exactly one controller-owned page exists, the daemon attaches to that page instead of creating a duplicate blank tab.
+If the user asks you to control a specific existing tab, use `list_tabs()` to find it and `switch_tab(target_id)` to explicitly attach to it. If you still end up on an invisible tab, `switch_tab()` calls `Target.activateTarget` to bring the selected tab to the front.
+## Startup sequence
+1. Check if a daemon is already running with `daemon_alive()`
+2. If stale sockets exist but daemon is dead, clean them up
+3. Let the daemon attach to the current Opensteer-owned tab or create a fresh `about:blank`
+4. Navigate the owned tab with `goto_url()`
+5. Only use `list_tabs()` and `switch_tab(target_id)` when the user explicitly wants an existing tab
+```python
+if not daemon_alive():
+    import os
+    from opensteer.paths import session_paths
+    for f in session_paths("default")[:2]:
+        if os.path.exists(f):
+            os.unlink(f)
+    ensure_daemon()
+goto_url("https://example.com")
+```
+## Bringing Chrome to front
+If Chrome is behind other windows or on another desktop:
+```python
+import subprocess
+subprocess.run(["osascript", "-e", 'tell application "Google Chrome" to activate'])
+```
+## Navigating
+Prefer navigating the current Opensteer-owned tab. Tabs created via CDP's `Target.createTarget` are visible but may open behind the active tab.
+```python
+goto_url("https://example.com")
+```

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/cookies.md ADDED

@@ -0,0 +1,3 @@
+# Cookies
+Document how to get cookies, save cookies, and set cookies without confusing browser state with page state.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/cross-origin-iframes.md ADDED

@@ -0,0 +1,3 @@
+# Cross-Origin Iframes
+Focus on `iframe_target(...)`, target attachment, and when compositor-level coordinate clicks are lower-friction than cross-target DOM work.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/dialogs.md ADDED

@@ -0,0 +1,64 @@
+# Dialogs
+Browser dialogs (`alert`, `confirm`, `prompt`, `beforeunload`) freeze the JS thread. There are two approaches, depending on timing.
+## Detection
+`page_info()` auto-surfaces any open dialog: if one is pending it returns `{"dialog": {"type", "message", ...}}` instead of the usual viewport dict (because the page's JS is frozen anyway). So if you call `page_info()` after an action and see a `dialog` key, handle it before doing anything else.
+## Reactive: dismiss via CDP (preferred)
+Works even when JS is frozen. Handles all dialog types including `beforeunload`.
+```python
+# Dismiss the pending dialog: call ONE of these, not both (only one dialog is open)
+cdp("Page.handleJavaScriptDialog", accept=True)   # accept / click OK
+# cdp("Page.handleJavaScriptDialog", accept=False)  # or: cancel / click Cancel
+# Read what the dialog said (from buffered CDP events)
+events = drain_events()
+for e in events:
+    if e["method"] == "Page.javascriptDialogOpening":
+        print(e["params"]["type"])     # "alert", "confirm", "prompt", "beforeunload"
+        print(e["params"]["message"])  # the dialog text
+```
+Undetectable by antibot — no JS injected into the page.
+## Proactive: stub via JS
+Prevents dialogs from ever appearing. Good when you expect multiple `alert()`/`confirm()` calls in sequence.
+```python
+js("""
+window.__dialogs__=[];
+window.alert=m=>window.__dialogs__.push(String(m));
+window.confirm=m=>{window.__dialogs__.push(String(m));return true;};
+window.prompt=(m,d)=>{window.__dialogs__.push(String(m));return d||'';};
+""")
+# ... do actions that trigger dialogs ...
+msgs = js("window.__dialogs__||[]")
+```
+Tradeoffs:
+- Stubs are lost on page navigation -- must re-run the snippet
+- `confirm()` always returns `true` (auto-approves)
+- Detectable by antibot (`window.alert.toString()` reveals non-native code)
+- Does NOT handle `beforeunload`
+## beforeunload specifically
+Fires when navigating away from a page with unsaved changes (forms, editors, upload pages). The page freezes until the user clicks Leave/Stay.
+```python
+# Option A: dismiss after navigating (CDP-level, safe)
+goto_url("https://new-url.com")
+try:
+    cdp("Page.handleJavaScriptDialog", accept=True)  # click "Leave"
+except Exception:
+    pass  # no dialog — normal
+# Option B: prevent before navigating (JS injection, detectable)
+js("window.onbeforeunload=null")
+goto_url("https://new-url.com")
+```
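
Note (illustrative, not part of the packaged file): the event loop in the "Reactive" section can be factored into a small pure function that is easy to check without a browser. The sketch assumes `drain_events()` returns a list of dicts with `method` and `params` keys, as the snippet in dialogs.md suggests; `dialog_events` is a hypothetical name, and the sample events are made up for the example.

```python
def dialog_events(events):
    """Return (type, message) pairs for each Page.javascriptDialogOpening event.

    Assumes each event is a dict with "method" and "params" keys.
    """
    found = []
    for event in events:
        if event.get("method") == "Page.javascriptDialogOpening":
            params = event.get("params", {})
            found.append((params.get("type"), params.get("message")))
    return found


# Hand-written sample standing in for the output of drain_events().
sample = [
    {"method": "Page.loadEventFired", "params": {"timestamp": 1.0}},
    {
        "method": "Page.javascriptDialogOpening",
        "params": {"type": "confirm", "message": "Discard changes?"},
    },
]
pending = dialog_events(sample)
```

Using `.get()` rather than direct indexing keeps the helper from raising on events that lack a `params` payload.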

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/downloads.md ADDED

@@ -0,0 +1,3 @@
+# Downloads
+Separate browser-triggered downloads from direct `http_get(...)` fetches, and document the minimal signals that prove a download actually started.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/drag-and-drop.md ADDED

@@ -0,0 +1,3 @@
+# Drag And Drop
+Focus on when drag-and-drop can be driven with low-level input events versus when the site really expects a file upload or DOM-specific drag sequence.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/dropdowns.md ADDED

@@ -0,0 +1,3 @@
+# Dropdowns
+Split dropdowns into native selects, custom overlays, searchable comboboxes, and virtualized menus, and always re-measure after opening because option geometry often appears late.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/iframes.md ADDED

@@ -0,0 +1,3 @@
+# Iframes
+Cover same-origin iframe traversal through `contentDocument` / `contentWindow`, and keep the frame-local versus page-coordinate warning explicit for clicks.
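
Note (illustrative, not part of the packaged file): the frame-local versus page-coordinate warning comes down to adding the iframe's offset in the parent page before clicking. The sketch below assumes the iframe's `left` and `top` were read from `getBoundingClientRect()` in the parent document, that all values are CSS pixels, and that the iframe has no border or padding (otherwise those insets must be added as well). `frame_to_page_xy` is a hypothetical helper name.

```python
def frame_to_page_xy(iframe_left, iframe_top, local_x, local_y, inset_x=0.0, inset_y=0.0):
    """Convert a point inside an iframe to coordinates in the parent page.

    inset_x / inset_y stand for the iframe's border plus padding, if any.
    """
    page_x = iframe_left + inset_x + local_x
    page_y = iframe_top + inset_y + local_y
    return page_x, page_y


# A button at (20, 30) inside an iframe whose top-left corner sits at (100, 250).
click_x, click_y = frame_to_page_xy(100, 250, 20, 30)
```

The resulting pair is what a page-level coordinate click would need; passing the frame-local values directly would land the click 100 px left of and 250 px above the target.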

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/network-requests.md ADDED

@@ -0,0 +1,3 @@
+# Network Requests
+Document how to watch or infer network activity when page state is ambiguous, especially for submit flows, downloads, and SPA actions that succeed without obvious DOM changes.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/print-as-pdf.md ADDED

@@ -0,0 +1,3 @@
+# Print As PDF
+Cover both direct PDF generation via CDP and sites that only expose a visible "Print" button which must be clicked before handling the browser print flow.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/screenshots.md ADDED

@@ -0,0 +1,17 @@
+# Screenshots
+`capture_screenshot()` writes a PNG of the current viewport. The file is in **device pixels** — on a 2× display a 2296×1143 CSS viewport produces a 4592×2286 PNG.
+That matters for two reasons:
+1. **Click coordinates are CSS pixels.** Don't read a target off the image and pass it to `click_at_xy()` without first dividing by `devicePixelRatio`. The simplest workflow is to take the screenshot and either view it in a tool that shows CSS coordinates, or measure positions in the image and convert them using `js("window.devicePixelRatio")`.
+2. **Some LLMs reject images > 2000 px per side.** Long sessions on 2× displays will eventually hit this. Pass `max_dim=1800` to downscale the file before it gets into the conversation:
+```python
+capture_screenshot("/tmp/shot.png", max_dim=1800)
+```
+The downscale only happens when the image actually exceeds `max_dim`, so it's safe to leave on for every shot.
+Use full-page screenshots (`full=True`) only when you need to see content below the fold — they are much larger and slower than viewport-only.
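
Note (illustrative, not part of the packaged file): the device-pixel to CSS-pixel conversion described above is a single division, with one extra factor if the PNG was downscaled. The sketch assumes that `max_dim` scales the longest side down to `max_dim` while preserving aspect ratio; the package diff does not show how `capture_screenshot()` implements this, so `downscale_factor` is a guess at that behavior. Both function names are hypothetical.

```python
def downscale_factor(width, height, max_dim):
    """Scale applied to an image so its longest side fits within max_dim.

    Assumption: images at or under max_dim are left untouched (factor 1.0).
    """
    longest = max(width, height)
    return 1.0 if longest <= max_dim else max_dim / longest


def image_to_css_xy(img_x, img_y, device_pixel_ratio, downscale=1.0):
    """Map a point measured on the PNG back to CSS pixels for click_at_xy()."""
    scale = device_pixel_ratio * downscale
    return img_x / scale, img_y / scale


# Undownscaled shot on a 2x display: image point (400, 200) is CSS (200, 100).
css_x, css_y = image_to_css_xy(400, 200, device_pixel_ratio=2.0)

# The 4592x2286 example from screenshots.md, downscaled with max_dim=1800.
factor = downscale_factor(4592, 2286, 1800)
```

In a live session the ratio would come from `js("window.devicePixelRatio")` rather than a hard-coded value.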

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/scrolling.md ADDED

@@ -0,0 +1,3 @@
+# Scrolling
+Separate page scroll, nested containers, virtualized lists, and dropdown menus, and identify which element is actually consuming wheel events before scrolling.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/shadow-dom.md ADDED

@@ -0,0 +1,3 @@
+# Shadow DOM
+Focus on recursive `shadowRoot` traversal, and note when coordinate clicking is simpler than piercing deeply nested component trees.

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/tabs.md ADDED

@@ -0,0 +1,69 @@
+# Tabs
+Use **CDP for control**, **UI automation for user-visible order**.
+## Pure CDP (portable: macOS / Linux / Windows)
+```python
+tabs = list_tabs()                    # includes chrome:// pages too
+real_tabs = list_tabs(include_chrome=False)
+tid = new_tab("https://example.com")  # create + attach
+switch_tab(tid)                       # attach harness to tab
+cdp("Target.activateTarget", targetId=tid)  # show it in Chrome
+print(current_tab())
+print(page_info())
+```
+What CDP is good at:
+- attach to a tab
+- open a tab
+- activate a known target
+- inspect URL/title/viewport
+- capture the attached tab's screenshot even if another tab is visibly frontmost
+What CDP is bad at:
+- matching the **left-to-right tab strip order** the user sees
+- telling whether the attached target is an omnibox popup / internal page without URL filtering
+## Visible order (platform UI)
+### macOS
+```applescript
+tell application "Google Chrome"
+  set out to {}
+  set i to 1
+  repeat with t in every tab of front window
+    set end of out to {tab_index:i, tab_title:(title of t), tab_url:(URL of t)}
+    set i to i + 1
+  end repeat
+  return out
+end tell
+```
+```applescript
+tell application "Google Chrome"
+  set active tab index of front window to 2
+  activate
+end tell
+```
+### Linux
+No AppleScript. Same split still applies:
+- use CDP for `new_tab`, attach, inspect, activate known targets
+- use window-manager / browser UI automation when the user means visible order
+Typical tools:
+- `xdotool`
+- `wmctrl`
+- desktop-environment scripting (`gdbus`, KWin, GNOME Shell extensions, etc.)
+## Rules that held up in practice
+- `switch_tab()` is **not enough** if the user expects Chrome to visibly change.
+- `Target.activateTarget` is the CDP-side "show this tab".
+- `list_tabs()` includes `chrome://newtab/` by default; ask for `include_chrome=False` when you want only real pages.
+- `chrome://omnibox-popup.top-chrome/` can appear as a fake page target; ignore it for user-facing tab lists.
+- If a page has `w=0 h=0`, you may be attached to the wrong target or a non-window surface.
+- For dynamic UIs, re-read element rects after opening dropdowns / modals before coordinate-clicking.
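
Note (illustrative, not part of the packaged file): the URL filtering mentioned in the rules above can be sketched as a pure function over a tab list. It assumes each tab is a dict with a `url` key, which matches CDP target info but is not confirmed for the return value of `list_tabs()`; `user_facing_tabs` is a hypothetical name, and the sample data is made up.

```python
def user_facing_tabs(tabs, internal_prefixes=("chrome://",)):
    """Drop internal pages (e.g. chrome://omnibox-popup.top-chrome/) from a tab list.

    Assumes each tab is a dict with a "url" key; tabs without one are kept.
    """
    return [
        tab for tab in tabs
        if not tab.get("url", "").startswith(internal_prefixes)
    ]


# Hand-written sample standing in for the output of list_tabs().
sample_tabs = [
    {"targetId": "A1", "url": "chrome://omnibox-popup.top-chrome/"},
    {"targetId": "B2", "url": "https://example.com/"},
    {"targetId": "C3", "url": "chrome://newtab/"},
]
visible = user_facing_tabs(sample_tabs)
```

Where `list_tabs(include_chrome=False)` is available it already covers this case; the explicit filter is mainly useful when working from raw CDP target lists.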

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/uploads.md ADDED

@@ -0,0 +1 @@
+# Uploads

opensteer-0.10.0/opensteer/agent_skills/interaction-skills/viewport.md ADDED

@@ -0,0 +1,3 @@
+# Viewport
+Cover how viewport size changes affect layout, coordinate clicks, and any workflow that depends on stable geometry.