browser-ctl 0.2.0__tar.gz → 0.2.1__tar.gz

This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: browser-ctl
- Version: 0.2.0
+ Version: 0.2.1
  Summary: Control your browser from the command line via a Chrome extension + WebSocket bridge
  Author-email: geb <853934146@qq.com>
  License-Expression: MIT
@@ -47,8 +47,9 @@ Dynamic: license-file
  pip install browser-ctl
 
  bctl go https://github.com
- bctl click "a.search-button"
- bctl type "input[name=q]" "browser-ctl"
+ bctl snapshot # List interactive elements → e0, e1, e2, …
+ bctl click e3 # Click by ref — no CSS selector needed
+ bctl type e5 "browser-ctl" # Type into element by ref
  bctl press Enter
  bctl screenshot results.png
  ```
@@ -74,11 +75,12 @@ Tools like [browser-use](https://github.com/browser-use/browser-use), [Playwrigh
 
  browser-ctl is purpose-built for AI agent workflows:
 
+ - **Snapshot-first workflow** — `bctl snapshot` lists interactive elements as `e0`, `e1`, … then operate by ref (`bctl click e3`) — no CSS selector guessing
  - **Tool-calling ready** — every command is a single shell call returning structured JSON, perfect for function-calling / tool-use patterns
  - **Built-in AI skill** — ships with `SKILL.md` that teaches AI agents (Cursor, OpenCode, etc.) the full command set and best practices
  - **Real browser = real access** — your LLM can operate on authenticated pages (Gmail, Jira, internal tools) without credential management
- - **Deterministic output** — JSON responses with CSS-selector-based queries, no vision model needed for most tasks
- - **Minimal token cost** — `bctl select "a.link" -l 5` returns structured data in one call vs multi-step screenshot → vision → parse loops
+ - **Deterministic output** — JSON responses with element refs or CSS selectors, no vision model needed for most tasks
+ - **Minimal token cost** — `bctl snapshot` + `bctl click e5` vs multi-step screenshot → vision → parse loops
 
  ```bash
  # Install the AI skill for Cursor IDE in one command
@@ -21,8 +21,9 @@
  pip install browser-ctl
 
  bctl go https://github.com
- bctl click "a.search-button"
- bctl type "input[name=q]" "browser-ctl"
+ bctl snapshot # List interactive elements → e0, e1, e2, …
+ bctl click e3 # Click by ref — no CSS selector needed
+ bctl type e5 "browser-ctl" # Type into element by ref
  bctl press Enter
  bctl screenshot results.png
  ```
@@ -48,11 +49,12 @@ Tools like [browser-use](https://github.com/browser-use/browser-use), [Playwrigh
 
  browser-ctl is purpose-built for AI agent workflows:
 
+ - **Snapshot-first workflow** — `bctl snapshot` lists interactive elements as `e0`, `e1`, … then operate by ref (`bctl click e3`) — no CSS selector guessing
  - **Tool-calling ready** — every command is a single shell call returning structured JSON, perfect for function-calling / tool-use patterns
  - **Built-in AI skill** — ships with `SKILL.md` that teaches AI agents (Cursor, OpenCode, etc.) the full command set and best practices
  - **Real browser = real access** — your LLM can operate on authenticated pages (Gmail, Jira, internal tools) without credential management
- - **Deterministic output** — JSON responses with CSS-selector-based queries, no vision model needed for most tasks
- - **Minimal token cost** — `bctl select "a.link" -l 5` returns structured data in one call vs multi-step screenshot → vision → parse loops
+ - **Deterministic output** — JSON responses with element refs or CSS selectors, no vision model needed for most tasks
+ - **Minimal token cost** — `bctl snapshot` + `bctl click e5` vs multi-step screenshot → vision → parse loops
 
  ```bash
  # Install the AI skill for Cursor IDE in one command
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: browser-ctl
- Version: 0.2.0
+ Version: 0.2.1
  Summary: Control your browser from the command line via a Chrome extension + WebSocket bridge
  Author-email: geb <853934146@qq.com>
  License-Expression: MIT
@@ -47,8 +47,9 @@ Dynamic: license-file
  pip install browser-ctl
 
  bctl go https://github.com
- bctl click "a.search-button"
- bctl type "input[name=q]" "browser-ctl"
+ bctl snapshot # List interactive elements → e0, e1, e2, …
+ bctl click e3 # Click by ref — no CSS selector needed
+ bctl type e5 "browser-ctl" # Type into element by ref
  bctl press Enter
  bctl screenshot results.png
  ```
@@ -74,11 +75,12 @@ Tools like [browser-use](https://github.com/browser-use/browser-use), [Playwrigh
 
  browser-ctl is purpose-built for AI agent workflows:
 
+ - **Snapshot-first workflow** — `bctl snapshot` lists interactive elements as `e0`, `e1`, … then operate by ref (`bctl click e3`) — no CSS selector guessing
  - **Tool-calling ready** — every command is a single shell call returning structured JSON, perfect for function-calling / tool-use patterns
  - **Built-in AI skill** — ships with `SKILL.md` that teaches AI agents (Cursor, OpenCode, etc.) the full command set and best practices
  - **Real browser = real access** — your LLM can operate on authenticated pages (Gmail, Jira, internal tools) without credential management
- - **Deterministic output** — JSON responses with CSS-selector-based queries, no vision model needed for most tasks
- - **Minimal token cost** — `bctl select "a.link" -l 5` returns structured data in one call vs multi-step screenshot → vision → parse loops
+ - **Deterministic output** — JSON responses with element refs or CSS selectors, no vision model needed for most tasks
+ - **Minimal token cost** — `bctl snapshot` + `bctl click e5` vs multi-step screenshot → vision → parse loops
 
  ```bash
  # Install the AI skill for Cursor IDE in one command
@@ -1,6 +1,6 @@
  [project]
  name = "browser-ctl"
- version = "0.2.0"
+ version = "0.2.1"
  description = "Control your browser from the command line via a Chrome extension + WebSocket bridge"
  readme = "README.md"
  license = "MIT"
File without changes
File without changes
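
The README hunks in this diff switch the quick-start from CSS selectors to a snapshot-first flow (`bctl snapshot`, then `bctl click e3` or `bctl type e5 "browser-ctl"`). A minimal Python sketch of driving that flow from a script might look like the following. It is not part of the package: the helper names `bctl` and `run_steps` and the `runner` parameter are hypothetical, and the only assumption about the CLI is the README's statement that each command returns structured JSON on stdout. The JSON schema is not assumed.

```python
import json
import subprocess
from typing import Callable, Iterable, Sequence

# A runner executes an argv list and returns the command's stdout as text.
Runner = Callable[[Sequence[str]], str]


def _subprocess_runner(argv: Sequence[str]) -> str:
    """Run argv as a child process and return its stdout."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return result.stdout


def bctl(*args: str, runner: Runner = _subprocess_runner):
    """Run `bctl <args...>` and parse its stdout as JSON.

    Assumption: stdout holds exactly one JSON document. Nothing is
    assumed about the keys or structure of that document.
    """
    return json.loads(runner(["bctl", *args]))


def run_steps(steps: Iterable[Sequence[str]], runner: Runner = _subprocess_runner) -> list:
    """Run several bctl commands in order and collect the parsed responses."""
    return [bctl(*step, runner=runner) for step in steps]


# Example (requires browser-ctl to be installed and connected); the refs
# e3 and e5 are the ones used in the README and depend on the page:
#
#   run_steps([
#       ("go", "https://github.com"),
#       ("snapshot",),
#       ("click", "e3"),
#       ("type", "e5", "browser-ctl"),
#       ("press", "Enter"),
#   ])
```

Passing the runner as a parameter keeps the wrapper testable without a browser attached; in practice a caller would inspect the `snapshot` response before choosing which ref to click.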