@qqbrowser/openclaw-qbot 0.10.15 → 0.10.16

@@ -1,5 +1,5 @@
1
1
  {
2
- "version": "0.10.15",
3
- "commit": "7187441ef98deb503e2d1dd34633df3768e5f7d7",
4
- "builtAt": "2026-06-03T02:08:52.803Z"
2
+ "version": "0.10.16",
3
+ "commit": "8594507507cc7488894cbaa90de3930c118bbd74",
4
+ "builtAt": "2026-06-03T06:31:42.639Z"
5
5
  }
@@ -1 +1 @@
1
- cb1d57e5a97f17ed3449f82242b3ed117eebf7623673aa68bbdbdbc7b85208d7
1
+ f7a15c502412644edbd46618c15a2859b3267e15f4ae1e98b9bbf7424c5c4014
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@qqbrowser/openclaw-qbot",
3
- "version": "0.10.15",
3
+ "version": "0.10.16",
4
4
  "description": "Multi-channel AI gateway with extensible messaging integrations",
5
5
  "keywords": [],
6
6
  "homepage": "https://github.com/openclaw/openclaw#readme",
@@ -42,7 +42,7 @@ Convert raw `qqbrowser-skill` task recordings into reusable, parameterized repla
42
42
  | `browser_download_file` / `browser_download_url` | — |
43
43
  | `browser_eval_content_js` | `script` |
44
44
  | `browser_find_and_act` | `value`, `actionValue`, `name`, `nth` |
45
- | `loop` | `count`, nested `steps` |
45
+ | `loop` | `count`, `variable`, `start`, `on_error`, nested `steps` |
46
46
 
47
47
  ### ❌ Excluded (filter out from recording)
48
48
 
@@ -102,7 +102,30 @@ Use `loop` when recording shows **2+ repetitions** of the same action sequence d
102
102
  }
103
103
  ```
104
104
 
105
- Fields: `count`(string/number, supports `{{param}}`), `variable`(loop var name), `start`(default 1, 1-based for `nth`).
105
+ Fields: `count`(string/number, supports `{{param}}`), `variable`(loop var name), `start`(default 1, 1-based for `nth`), `on_error`(optional, see below).
106
+
107
+ ### `on_error` — Dynamic List Handling
108
+
109
+ When the list length is **unknown or dynamic** (e.g., "claim all free games", "process all search results"), use `on_error: "break"`:
110
+
111
+ ```json
112
+ {
113
+ "action": "loop",
114
+ "params": { "count": "100", "variable": "i", "start": 1, "on_error": "break" },
115
+ "description": "处理所有列表项(自动检测结束)",
116
+ "steps": [ ... ]
117
+ }
118
+ ```
119
+
120
+ | `on_error` | Behavior |
121
+ |---|---|
122
+ | `"fail"` (default) | Sub-step failure → entire replay fails |
123
+ | `"break"` | Sub-step failure → **exit loop gracefully**, replay continues as success |
124
+
125
+ **When to use `on_error: "break"`:**
126
+ - List length is unknown at script generation time
127
+ - Task is "process ALL items" (not a fixed count)
128
+ - Set `count` to a large upper bound (e.g., 100), loop breaks when `find_and_act` can't find the Nth element
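
The `on_error` semantics described above can be illustrated with a small sketch. This is not the skill's actual replay engine; `run_loop`, `StepFailed`, and `claim_item` are hypothetical names used only to show how `"fail"` and `"break"` differ when a sub-step cannot find the Nth element.

```python
from typing import Callable


class StepFailed(Exception):
    """Hypothetical error raised by a sub-step, e.g. the Nth element is missing."""


def run_loop(count: int, start: int, on_error: str, body: Callable[[int], None]) -> int:
    """Run body(i) for i = start .. start+count-1; return completed iterations."""
    completed = 0
    for i in range(start, start + count):
        try:
            body(i)
        except StepFailed:
            if on_error == "break":
                break  # exit the loop gracefully; the replay continues as success
            raise  # "fail" (default): the error propagates and the replay fails
        completed += 1
    return completed


def claim_item(i: int) -> None:
    # Assumption for illustration: the page lists only 3 items.
    if i > 3:
        raise StepFailed(f"item {i} not found")


print(run_loop(count=100, start=1, on_error="break", body=claim_item))  # → 3
```

With `on_error="fail"` the same call would raise `StepFailed` at the fourth iteration, which corresponds to the whole replay failing.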
106
129
 
107
130
  ### Key Rules for Loop Steps
108
131
 
@@ -129,6 +152,7 @@ Fields: `count`(string/number, supports `{{param}}`), `variable`(loop var name),
129
152
  | Need detail page content | `loop` (navigate in/out per item) |
130
153
  | Pagination + extract | `loop` |
131
154
  | Fixed different form fields | Linear steps |
155
+ | Process ALL items (unknown count) | `loop` with `on_error: "break"` + large `count` |
132
156
 
133
157
  **`go_back` vs `tab_close` + `tab_switch`:**
134
158
 
@@ -15,57 +15,117 @@ permissions:
15
15
  ```bash
16
16
  # Linux / macOS
17
17
  pipx install qqbrowser-skill && qqbrowser-skill install
18
-
19
18
  # Windows
20
19
  pip install qqbrowser-skill && qqbrowser-skill install
21
20
  ```
22
21
 
23
22
  ## Key Concepts
24
23
 
25
- - **Element Index**: Encoded string like `2_sfli_qp0u` (format: `highlightIndex_attrHash_xpathHash`). Generated by `browser_snapshot`, used to target elements for interaction. Features a 4-level degradation matching mechanism for cross-snapshot stability.
26
- - **Snapshot**: Returns page content with indexed elements. Re-snapshot is required after any DOM change (navigation, form submission, AJAX loading).
27
- - **Task Recording**: All browser tasks MUST be wrapped with `task_begin`/`task_end` to enable replay.
24
+ - **Element Index**: Encoded string like `2_sfli_qp0u` (`highlightIndex_attrHash_xpathHash`). Generated by `browser_snapshot`, used to target elements.
25
+ - **Snapshot**: Returns page content with indexed elements. Re-snapshot after any DOM change.
26
+ - **Task Recording**: All manual browser tasks MUST be wrapped with `task_begin`/`task_end`.
27
+
28
+ ---
29
+
30
+ ## ⚠️ Task Execution Priority (MANDATORY)
31
+
32
+ > **CRITICAL: You MUST run `playbook_list` BEFORE any `task_begin` or `browser_go_to_url`. NEVER start manual automation without first checking for existing playbooks.**
33
+
34
+ ```
35
+ User task → playbook_list (MANDATORY) → Match? → YES → browser_replay
36
+ → NO → Manual automation
37
+ ```
38
+
39
+ ### Step 0: Check Playbooks (DO NOT SKIP)
40
+
41
+ ```bash
42
+ qqbrowser-skill playbook_list
43
+ ```
44
+
45
+ 1. Match returned playbooks against user's task by `name`, `description`, keywords, target URL
46
+ 2. If matched (even partially): `browser_replay --script <path> --variables '{...}'`
47
+ 3. Only if NO match: proceed to manual automation
48
+
49
+ ### Step 1: Composite Task — Multi-Playbook Orchestration
50
+
51
+ For tasks spanning multiple domains (e.g., "从知乎获取文章然后发到小红书", i.e., "fetch articles from Zhihu, then post them to Xiaohongshu"):
52
+
53
+ ```
54
+ Composite task → Decompose into sub-tasks → Match EACH against playbook_list
55
+ → Execute pipeline: playbook₁ → AI mediation → playbook₂ → ...
56
+ ```
57
+
58
+ **Rules:**
59
+ 1. Break request into atomic browser operations, match each independently
60
+ 2. AI mediates BETWEEN playbooks (summarize, rewrite, transform format) — no browser commands needed
61
+ 3. Each playbook runs independently — never chain inside a single `task_begin`/`task_end`
62
+ 4. If a sub-task has no playbook, fall back to manual for that sub-task only
63
+ 5. Report progress after each stage
64
+
65
+ **Example:**
66
+ ```bash
67
+ # Sub-task A: playbook exists
68
+ qqbrowser-skill browser_replay --script ~/.qqbrowser-skill/playbooks/zhihu-article-extractor.json \
69
+ --variables '{"topic": "人工智能", "count": "3"}'
70
+ # AI mediation: transform extracted content to target platform style
71
+ # Sub-task B: playbook exists
72
+ qqbrowser-skill browser_replay --script ~/.qqbrowser-skill/playbooks/xiaohongshu-publish.json \
73
+ --variables '{"notes_title": "AI生成标题", "notes_content": "转换后内容"}'
74
+ ```
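
When the pipeline above is driven from a script rather than typed by hand, the `--variables` JSON is easiest to get right by serializing it programmatically. The sketch below only builds the argument list and executes nothing; `build_replay_argv` is a hypothetical helper, and only the `--script` and `--variables` flags come from the documentation.

```python
import json
import os


def build_replay_argv(script_path, variables):
    """Return the argv list for one `browser_replay` call (nothing is executed)."""
    return [
        "qqbrowser-skill", "browser_replay",
        # "~" is not expanded without a shell, so expand it here.
        "--script", os.path.expanduser(script_path),
        # ensure_ascii=False keeps non-ASCII values readable in the argument.
        "--variables", json.dumps(variables, ensure_ascii=False),
    ]


argv = build_replay_argv(
    "~/.qqbrowser-skill/playbooks/xiaohongshu-publish.json",
    {"notes_title": "AI生成标题", "notes_content": "转换后内容"},
)
print(argv)
```

Passing such a list to `subprocess.run(argv)` avoids shell quoting problems that hand-written single-quoted JSON can run into when values contain quotes.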
75
+
76
+ ### Step 2: Fallback to Manual Automation
77
+
78
+ Only when: no matching playbook / replay failed after retry / task requires dynamic AI decisions.
79
+
80
+ ### Step 3: Save as Playbook (explicit user request only)
28
81
 
29
- ## Core Workflow
82
+ Trigger words: "save this", "make reusable", "保存为脚本", "下次还要用"
83
+ → Load `qqbrowser-playbook` skill → `task_latest` → generate → save
84
+
85
+ ---
86
+
87
+ ## Core Workflow (Manual Automation)
30
88
 
31
89
  ```bash
32
- qqbrowser-skill task_begin --description "描述任务" # 1. Start recording (MANDATORY)
33
- qqbrowser-skill browser_go_to_url --url <url> # 2. Navigate
34
- qqbrowser-skill browser_snapshot # 3. Get element indices
35
- # ... interact using indices ... # 4. Perform actions
36
- qqbrowser-skill browser_snapshot # 5. Re-snapshot after DOM changes
37
- qqbrowser-skill task_end # 6. End recording (MANDATORY)
90
+ qqbrowser-skill task_begin --description "描述任务"
91
+ qqbrowser-skill browser_go_to_url --url <url>
92
+ qqbrowser-skill browser_snapshot # Get element indices
93
+ # ... interact using indices ...
94
+ qqbrowser-skill browser_snapshot # Re-snapshot after DOM changes
95
+ qqbrowser-skill task_end
38
96
  ```
39
97
 
98
+ ---
99
+
40
100
  ## Command Reference
41
101
 
42
102
  ### Navigation
43
103
  ```bash
44
- browser_go_to_url --url <url> # Navigate to URL
45
- browser_go_back # Go back
46
- browser_wait --seconds <n> # Wait (default 3s)
104
+ browser_go_to_url --url <url>
105
+ browser_go_back
106
+ browser_wait --seconds <n> # Default 3s
47
107
  ```
48
108
 
49
109
  ### Snapshot & Screenshot
50
110
  ```bash
51
- browser_snapshot # Element indices mode (for interaction)
52
- browser_snapshot --markdown # Clean Markdown mode (for reading, no indices)
53
- browser_screenshot [--full] [--annotate] # Screenshot (returns .webp temp path)
111
+ browser_snapshot # Element indices (for interaction)
112
+ browser_snapshot --markdown # Markdown (for reading)
113
+ browser_screenshot [--full] [--annotate]
54
114
  ```
55
115
 
56
116
  ### Click & Input
57
117
  ```bash
58
- browser_click_element --index <id> # Click
59
- browser_dblclick_element --index <id> # Double-click
60
- browser_focus_element --index <id> # Focus
61
- browser_input_text --index <id> --text "<content>" # Clear + input text
118
+ browser_click_element --index <id>
119
+ browser_dblclick_element --index <id>
120
+ browser_focus_element --index <id>
121
+ browser_input_text --index <id> --text "<content>"
62
122
  ```
63
123
 
64
124
  ### Scroll
65
125
  ```bash
66
- browser_scroll_down [--amount <px>] # Scroll down (default one page)
67
- browser_scroll_up [--amount <px>] # Scroll up
68
- browser_scroll_to_text --text "<text>" # Scroll to text
126
+ browser_scroll_down [--amount <px>]
127
+ browser_scroll_up [--amount <px>]
128
+ browser_scroll_to_text --text "<text>"
69
129
  browser_scroll_to_top / browser_scroll_to_bottom
70
130
  browser_scroll_by --direction <dir> --pixels <n> [--index <id>]
71
131
  browser_scroll_into_view --index <id>
@@ -73,9 +133,9 @@ browser_scroll_into_view --index <id>
73
133
 
74
134
  ### Keyboard
75
135
  ```bash
76
- browser_keypress --key <key> # Press key (Enter, Tab, etc.)
77
- browser_keyboard_op --action type --text "<content>" # Type text
78
- browser_keyboard_op --action inserttext --text "<content>" # Insert without key events
136
+ browser_keypress --key <key>
137
+ browser_keyboard_op --action type --text "<content>"
138
+ browser_keyboard_op --action inserttext --text "<content>"
79
139
  browser_keydown --key <key> / browser_keyup --key <key>
80
140
  ```
81
141
 
@@ -89,8 +149,8 @@ browser_check_op --index <id> --value / --no-value
89
149
  ### Find and Act (Semantic Locators)
90
150
  ```bash
91
151
  browser_find_and_act --by <role|text|label|placeholder|testid|css> --value "<v>" --action <click|fill|type> [--actionValue "<v>"] [--name "<n>"] [--nth <n>]
92
- # --nth: select the Nth matching element (0-based). Essential for loop patterns in lists.
93
152
  ```
153
+ > `--nth`: 0-based index for list iteration. Use `by: "css"` + `--nth` for loops.
94
154
 
95
155
  ### Get Information & State
96
156
  ```bash
@@ -98,50 +158,26 @@ browser_get_info --type <text|url|title|html|value|attr|count|box|styles|list_se
98
158
  browser_check_state --state <visible|enabled|checked> --index <id>
99
159
  ```
100
160
 
101
- > **💡 Key Usage**: `browser_get_info` returns **only the requested info** (no page state or interactive elements), making it lightweight and token-efficient.
102
- >
103
- > **🔑 `list_selector` — Auto-detect CSS selector for list iteration (PREFERRED for loop tasks):**
104
- > ```bash
105
- > # From snapshot you see a list of articles:
106
- > # [3_abc1_def2]<a 人工智能的未来/>
107
- > # [4_xyz3_uvw4]<a 深度学习入门/>
108
- > # [5_mno5_pqr6]<a 大模型时代/>
109
- >
110
- > # Pick ANY one element from the list and run:
111
- > browser_get_info --type list_selector --index "3_abc1_def2"
112
- > # → {"success":true, "selector":".ContentItem h2 a", "count":10,
113
- > # "samples":["人工智能的未来","深度学习入门","大模型时代","...","..."],
114
- > # "strategy":"parent_level_1"}
115
- >
116
- > # Now use this selector directly in find_and_act for loop iteration:
117
- > browser_find_and_act --by css --value ".ContentItem h2 a" --action click --nth 1
118
- > ```
119
- > **This is the RECOMMENDED way to find CSS selectors for list operations.** It replaces the manual workflow of `get_info --type html` → analyze → construct selector → probe verify. One command does it all.
120
- >
121
- > If `list_selector` returns `success: false`, fall back to manual inspection:
122
- > ```bash
123
- > browser_get_info --type html --index "5_abc1_def2"
124
- > # → returns element HTML, use it to construct a selector manually
125
- > ```
161
+ > **`list_selector`**: Auto-detect CSS selector for list iteration. Pass any list item's index; it returns `{"selector": "...", "count": N, "samples": [...]}`. Use the returned selector in `find_and_act --by css --value <selector> --nth N`.
126
162
 
127
163
  ### JavaScript Evaluation
128
164
  ```bash
129
- browser_eval_content_js --script "<js_code>" # Evaluate JS, return result
130
- browser_eval_content_js --script "<base64>" --base64 # Base64-encoded script
165
+ browser_eval_content_js --script "<js_code>"
166
+ browser_eval_content_js --script "<base64>" --base64
131
167
  ```
132
168
 
133
169
  ### Download
134
170
  ```bash
135
- browser_download_file --index <id> # Download by clicking element
136
- browser_download_url # Download from URL
171
+ browser_download_file --index <id>
172
+ browser_download_url
137
173
  ```
138
174
 
139
175
  ### Tab Management
140
176
  ```bash
141
- browser_tab_open --url <url> # Open new tab
142
- browser_tab_list # List tabs
143
- browser_tab_switch --tabId <n> # Switch tab
144
- browser_tab_close --tabId <n> # Close tab
177
+ browser_tab_open --url <url>
178
+ browser_tab_list
179
+ browser_tab_switch --tabId <n>
180
+ browser_tab_close --tabId <n>
145
181
  ```
146
182
 
147
183
  ### Dialog
@@ -149,267 +185,143 @@ browser_tab_close --tabId <n> # Close tab
149
185
  browser_dialog --action <accept|dismiss> [--text "<input>"]
150
186
  ```
151
187
 
152
- ### Replay
188
+ ### Replay & Playbook
153
189
  ```bash
154
- browser_replay --script <path> # Replay from file
155
- browser_replay --script_content '<json>' # Replay from JSON string
156
- browser_replay --script <path> --variables '{"key":"value"}' # With params
190
+ playbook_list # List available playbooks
191
+ browser_replay --script <path> [--variables '{"key":"value"}']
192
+ browser_replay --script_content '<json>'
193
+ ```
194
+ > ⚠️ `browser_replay` can take **up to 10 minutes**. Wait patiently — do NOT interrupt.
195
+
196
+ **`browser_replay` Output Format:**
197
+ ```json
198
+ {
199
+ "success": true,
200
+ "total_steps": 5,
201
+ "completed_steps": 5,
202
+ "step_results": [
203
+ {"index": 0, "action": "browser_go_to_url", "description": "...", "success": true, "result": "Success! Navigated to ..."},
204
+ {"index": 1, "action": "browser_eval_content_js", "description": "提取数据", "success": true, "result": "{\"title\":\"...\",\"content\":\"...\"}"}
205
+ ],
206
+ "duration_ms": 12345,
207
+ "summary": "Replay completed successfully: 5/5 steps in 12345ms."
208
+ }
157
209
  ```
158
210
 
159
- > **⚠️ LONG EXECUTION TIME**: `browser_replay` executes ALL steps in the playbook sequentially (navigation, waiting, data extraction, etc.). **This can take up to 10 minutes** depending on the number of steps, network conditions, and configured delays/timeouts.
160
- >
161
- > **You MUST wait patiently** for the command to complete; do NOT assume it has hung or timed out. Do NOT interrupt or retry unless you receive an explicit error message.
162
- > Typical durations: simple playbooks (5-10 steps) ~30s–1min, complex playbooks with loops (20+ steps) ~3–10min.
211
+ **How to use the output (especially for composite tasks):**
212
+ - `success`: Check overall success/failure. If `false`, check `failed_step` for error details.
213
+ - `step_results[N].result`: Contains the return value of each step. **For `browser_eval_content_js` steps, this is the extracted data (usually a JSON string)**; parse it for use in subsequent AI processing or as the next playbook's variables.
214
+ - For composite pipelines: find all `eval_content_js` steps in `step_results`, parse their `result` field to get structured data for AI mediation.
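
The bullets above can be sketched as a small parser. It assumes the output has the shape shown in the documented sample; `extract_eval_results` is a hypothetical helper, and the exact structure of `failed_step` is not specified in the source, so it is only echoed in the error message.

```python
import json


def extract_eval_results(replay_output: str) -> list:
    """Collect parsed results of successful `browser_eval_content_js` steps."""
    report = json.loads(replay_output)
    if not report.get("success"):
        # The shape of `failed_step` is an assumption; it is reported as-is.
        raise RuntimeError(f"replay failed: {report.get('failed_step')}")
    extracted = []
    for step in report.get("step_results", []):
        if step.get("action") != "browser_eval_content_js" or not step.get("success"):
            continue
        raw = step.get("result", "")
        try:
            extracted.append(json.loads(raw))  # usually a JSON string
        except (TypeError, json.JSONDecodeError):
            extracted.append(raw)  # keep non-JSON results as plain text
    return extracted


sample = json.dumps({
    "success": True,
    "total_steps": 2,
    "completed_steps": 2,
    "step_results": [
        {"index": 0, "action": "browser_go_to_url", "success": True,
         "result": "Success! Navigated to ..."},
        {"index": 1, "action": "browser_eval_content_js", "success": True,
         "result": json.dumps({"title": "T", "content": "C"})},
    ],
})
print(extract_eval_results(sample))  # → [{'title': 'T', 'content': 'C'}]
```

The returned list is what an AI mediation step would transform before filling the next playbook's variables.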
163
215
 
164
216
  ### Task Recording
165
217
  ```bash
166
- task_begin --description "<description>" # Start recording (MUST be first)
167
- task_end # Stop recording → then auto-generate playbook (see below)
168
- task_latest # Get most recent recording (for playbook generation)
218
+ task_begin --description "<desc>"
219
+ task_end
220
+ task_latest # Get most recent recording
169
221
  ```
170
-
171
- > **IMPORTANT**: After `task_end`, do NOT just show the raw recording path. The raw recording (`~/.qqbrowser-skill/records/task_*.json`) is NOT a ready-to-use replay script. You MUST proceed to generate a proper playbook — see "Generating Replay Scripts / Playbooks" section.
222
+ > After `task_end`, raw recordings are NOT replay-ready. Must generate proper playbook.
172
223
 
173
224
  ### Utility
174
225
  ```bash
175
- browser_done --success --text "<msg>" # Mark task complete
176
- status # Check skill status
177
- list # List available skills
226
+ browser_done --success --text "<msg>"
227
+ status
228
+ list
178
229
  ```
179
230
 
180
231
  ---
181
232
 
182
- ## ⚠️ MANDATORY: Replay-Friendly Execution Rules
183
-
184
- **All operations MUST prioritize replayability. Follow these rules to ensure recorded steps can be replayed with parameterized variables.**
233
+ ## Replay-Friendly Execution Rules
185
234
 
186
235
  ### Rule 1: Task Classification
187
236
 
188
- Before executing, classify the task:
189
-
190
237
  | Category | Replayable? | Strategy |
191
238
  |----------|-------------|----------|
192
- | **A: Fixed-path** (navigate, click, fill forms) | ✅ | Standard commands, all recorded |
193
- | **B: Data extraction** | ⚠️ Partial | Use `browser_eval_content_js` (replayable), NOT `snapshot --markdown` + AI summarization |
194
- | **C: Content understanding** (summarize, compare) | ❌ | Separate from replayable steps, mark with `"requires_ai": true` |
195
- | **D: Dynamic iteration** (pagination, load more) | ⚠️ | Use fixed number of operations based on predictable calculation |
239
+ | **A: Fixed-path** (navigate, click, fill) | ✅ | Standard commands |
240
+ | **B: Data extraction** | ⚠️ | Use `browser_eval_content_js`, NOT `snapshot --markdown` |
241
+ | **C: Content understanding** | ❌ | Mark `"requires_ai": true` |
242
+ | **D: Dynamic iteration** | ⚠️ | Fixed operation count |
196
243
 
197
244
  ### Rule 2: Data Extraction — Always Prefer JS
198
245
 
199
246
  ```bash
200
- # ✅ Replayable: JS extraction
201
- browser_eval_content_js --script "JSON.stringify(Array.from(document.querySelectorAll('.item')).slice(0,10).map(el=>({title:el.querySelector('.title')?.textContent?.trim()})))" --base64
202
-
203
- # ❌ Non-replayable: AI reading
247
+ # ✅ Replayable
248
+ browser_eval_content_js --script "JSON.stringify(Array.from(document.querySelectorAll('.item')).slice(0,10).map(el=>({title:el.querySelector('.title')?.textContent?.trim()})))"
249
+ # ❌ Non-replayable
204
250
  browser_snapshot --markdown # then AI summarizes
205
251
  ```
206
252
 
207
- **Exception**: Tasks explicitly requiring AI understanding (e.g., "总结文章要点") — note in `task_begin` description: "需要AI在线参与,不可纯回放".
253
+ ### Rule 3: Analyze First, Execute Once
208
254
 
209
- ### Rule 3: Analyze First, Execute Once (No Trial-and-Error Recording)
255
+ Do NOT trial-and-error during recorded tasks. Use `browser_snapshot` to analyze, then write ONE definitive script. If it fails, `task_end` → discard → re-record.
210
256
 
211
- **CRITICAL**: Do NOT use trial-and-error during a recorded task. Every command between `task_begin` and `task_end` is recorded — failed attempts pollute the recording and make it unreplayable.
212
-
213
- **Correct workflow for data extraction:**
214
- ```bash
215
- task_begin --description "..."
216
- browser_go_to_url --url "..."
217
- browser_wait --seconds 3
218
- browser_snapshot # ← AI analyzes page structure here (not recorded in replay)
219
- # AI identifies the correct, stable selector BEFORE writing the JS script
220
- browser_eval_content_js --script "/* one correct script */" # ← Only this gets replayed
221
- task_end
222
- ```
223
-
224
- **PROHIBITED patterns:**
225
- - ❌ Running multiple `browser_eval_content_js` with different selectors hoping one works
226
- - ❌ Using `browser_get_info` to "check" results mid-recording then adjusting
227
- - ❌ Hardcoding element index strings (e.g., `[id='25_tg4y_yb9z']`) inside JS scripts — index is for command params only, not JS selectors
228
- - ❌ Using regex to match page text content as a selector strategy (fragile, locale-dependent)
229
-
230
- **Rule**: Use `browser_snapshot` to understand the DOM, then write **one definitive** JS extraction script. If the first script fails, call `task_end` to discard, fix the script, then re-record with `task_begin`.
257
+ **PROHIBITED:**
258
+ - ❌ Multiple `eval_content_js` with different selectors
259
+ - ❌ Hardcoding element index in JS scripts
260
+ - ❌ Using regex on page text as selector strategy
231
261
 
232
262
  ### Rule 4: JS Selector Priority
233
263
 
234
- | Priority | Type | Example |
235
- |----------|------|---------|
236
- | 1 | `id` | `#rank-list` |
237
- | 2 | `data-*` | `[data-testid="item"]` |
238
- | 3 | ARIA | `[role="listitem"]` |
239
- | 4 | Semantic class | `.article-title` |
240
- | 5 | Structural path | `main > ul > li` |
241
- | ❌ | Dynamic/hash class | `.css-1a2b3c` — NEVER use |
264
+ `id` > `data-*` > ARIA > semantic class > structural path. ❌ NEVER use dynamic/hash classes (`.css-1a2b3c`).
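
One way to apply this priority order mechanically is to rank candidate selectors before choosing one. The sketch below is a rough heuristic under stated assumptions: `rank_selector` and the regular expressions are illustrative and not part of the skill, and the hash-class pattern only covers classes shaped like the `.css-1a2b3c` example.

```python
import re

# Lower rank = preferred. Patterns are illustrative heuristics only.
RULES = [
    (1, "id", re.compile(r"^#[\w-]+")),
    (2, "data-*", re.compile(r"\[data-[\w-]+")),
    (3, "ARIA", re.compile(r"\[(role|aria-[\w-]+)")),
    (4, "semantic class", re.compile(r"\.[A-Za-z][\w-]*")),
]
# Assumption: a dynamic/hash class looks like ".css-" plus alphanumerics with a digit.
HASH_CLASS = re.compile(r"\.css-[0-9a-z]*\d[0-9a-z]*")


def rank_selector(selector):
    """Return (rank, label), or None if the selector uses a hash-like class."""
    if HASH_CLASS.search(selector):
        return None  # never use dynamic/hash classes
    for rank, label, pattern in RULES:
        if pattern.search(selector):
            return rank, label
    return 5, "structural path"


for candidate in ["#rank-list", '[data-testid="item"]', '[role="listitem"]',
                  ".article-title", "main > ul > li", ".css-1a2b3c"]:
    print(candidate, rank_selector(candidate))
```

A real page may need more patterns (for example other CSS-in-JS prefixes), so the result is best treated as a hint rather than a verdict.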
242
265
 
243
- ### Rule 5: Repeating Patterns & Pagination
266
+ ### Rule 5: Loop Recording
244
267
 
245
- **Identify repeating patterns during recording**. If you find yourself doing the same sequence of actions N times (e.g., click item → read → go back), this is a **loop pattern**. Record it correctly:
268
+ Record **at least 2 iterations** using `browser_find_and_act --by css --nth`:
246
269
 
247
- **Recording strategy for loop tasks:**
248
270
  ```bash
249
- task_begin --description "搜索{{topic}}并提取前{{count}}篇文章内容"
250
- browser_go_to_url --url "https://example.com/search?q=AI"
251
- browser_wait --seconds 3
252
- browser_snapshot # AI analyzes list structure
253
-
254
- # ★ STEP 0: DISCOVER LIST SELECTOR (MANDATORY before any list iteration)
255
- # Use list_selector to auto-detect the CSS selector from any list item's snapshot index:
271
+ # MANDATORY: Discover selector first
256
272
  browser_get_info --type list_selector --index "3_abc1_def2"
257
- # {"success":true, "selector":".result-item h2 a", "count":10, "samples":[...]}
258
- # Then record the probe with the discovered selector for playbook generation:
273
+ # Record probe for playbook generator:
259
274
  browser_eval_content_js --script "JSON.stringify({__list_probe__: true, selector: '.result-item h2 a', count: document.querySelectorAll('.result-item h2 a').length, samples: Array.from(document.querySelectorAll('.result-item h2 a')).slice(0,3).map(e=>e.textContent.trim())})"
260
- # The __list_probe__ marker helps the playbook generator locate this data.
261
-
262
- # Record ONE complete iteration as the pattern:
275
+ # Iteration 1:
263
276
  browser_find_and_act --by css --value ".result-item h2 a" --action click --nth 1
264
- browser_wait --seconds 2
265
- browser_eval_content_js --script "JSON.stringify({title: document.querySelector('h1')?.textContent, content: document.querySelector('.content')?.textContent?.substring(0,2000)})"
266
- browser_go_back
267
- browser_wait --seconds 1
268
-
269
- # Record second iteration to confirm the pattern:
277
+ # ... extract, go back ...
278
+ # Iteration 2:
270
279
  browser_find_and_act --by css --value ".result-item h2 a" --action click --nth 2
271
- browser_wait --seconds 2
272
- browser_eval_content_js --script "JSON.stringify({title: document.querySelector('h1')?.textContent, content: document.querySelector('.content')?.textContent?.substring(0,2000)})"
273
- browser_go_back
274
-
275
- task_end
280
+ # ... extract, go back ...
276
281
  ```
277
282
 
278
- **Key points:**
279
- - **★ MUST discover and probe list selector** before iterating — first use `browser_get_info --type list_selector --index <id>` to auto-detect the CSS selector, then use `browser_eval_content_js` with `__list_probe__: true` to record it. **NEVER guess CSS selectors.**
280
- - Record **at least 2 iterations** so the playbook generator can identify the loop pattern
281
- - Use `browser_find_and_act` with `--nth` (not `browser_click_element` with hardcoded index) for list items — this enables loop parameterization
282
- - The playbook generator will automatically convert repetitions into a `loop` structure with `{{count}}` controlling iterations
283
+ **Key:** Use `list_selector` to discover CSS selector (NEVER guess). Use `find_and_act --nth` (not `click_element` with hardcoded index).
283
284
 
284
- **How to find the CSS selector (MANDATORY — use `list_selector`):**
285
- ```
286
- ★ PREFERRED (one command, auto-detect):
287
- 1. browser_snapshot → AI sees list elements: [3_abc1_def2]<a 人工智能的未来/> ...
288
- 2. browser_get_info --type list_selector --index "3_abc1_def2"
289
- → returns: {"success":true, "selector":".ContentItem h2 a", "count":10, "samples":[...]}
290
- 3. Use the returned selector → proceed with __list_probe__ and find_and_act
291
-
292
- FALLBACK (if list_selector returns success=false):
293
- 1. browser_get_info --type html --index "3_abc1_def2"
294
- → returns: <a class="ContentItem-title" href="/p/123456">人工智能的未来</a>
295
- 2. AI reads the HTML → identifies class, parent structure, etc.
296
- 3. AI constructs CSS selector → ".ContentItem-title" or "a[data-za-detail-view-element_name='Title']"
297
- 4. browser_eval_content_js → run __list_probe__ to verify the selector
298
- ```
299
- **NEVER guess CSS selectors.** Always use `list_selector` first, then fall back to `html` inspection if needed.
300
-
301
- **For simple pagination** (next page button):
302
- ```bash
303
- browser_eval_content_js --script "/* extract page 1 */"
304
- browser_find_and_act --by text --value "下一页" --action click
305
- browser_wait --seconds 2
306
- browser_eval_content_js --script "/* extract page 2 */"
307
- ```
285
+ ### Rule 6: Multi-Tab Recording
308
286
 
309
- **DO NOT** use AI judgment loops ("check if enough, scroll more").
310
-
311
- ### Rule 6: Multi-Tab Recording Strategy
312
-
313
- When a task involves links that open in new tabs (e.g., `target="_blank"`), record using **explicit tab management commands** instead of relying on `browser_go_back`:
314
-
315
- ```bash
316
- task_begin --description "搜索并提取文章(新Tab场景)"
317
- browser_go_to_url --url "https://example.com/search?q=AI"
318
- browser_wait --seconds 3
319
- browser_snapshot # AI analyzes list structure
320
-
321
- # ★ DISCOVER + PROBE LIST SELECTOR (same as Rule 5 — MANDATORY before list iteration):
322
- browser_get_info --type list_selector --index "3_abc1_def2" # auto-detect selector
323
- browser_eval_content_js --script "JSON.stringify({__list_probe__: true, selector: '.result-item h2 a', count: document.querySelectorAll('.result-item h2 a').length, samples: Array.from(document.querySelectorAll('.result-item h2 a')).slice(0,3).map(e=>e.textContent.trim())})"
324
-
325
- # Click link that opens in new tab:
326
- browser_find_and_act --by css --value ".result-item h2 a" --action click --nth 1
327
- browser_tab_list # AI checks: new tab appeared?
328
- browser_tab_switch --tabId <new_tab_id> # Switch to the new tab
329
- browser_wait --seconds 2
330
- browser_eval_content_js --script "..." # Extract content in detail page
331
- browser_tab_close --tabId <current_tab_id> # Close detail tab
332
- browser_tab_switch --tabId <origin_tab_id> # Switch back to list page
333
- browser_wait --seconds 1
334
-
335
- # Repeat for second item...
336
- browser_find_and_act --by css --value ".result-item h2 a" --action click --nth 2
337
- # ... same pattern ...
338
-
339
- task_end
340
- ```
341
-
342
- **Key points:**
343
- - Use `browser_tab_list` to observe tab changes (this is AI-only, filtered from playbook)
344
- - Record physical `tabId` during recording — the **playbook generator** will convert them to semantic references (`"origin"`, `"latest"`, `"current"`)
345
- - **DO NOT use `browser_go_back`** when the previous action opened a new tab — `go_back` navigates within the current tab's history, it cannot close tabs or switch between them
346
- - Record **at least 2 iterations** of the tab pattern (same as loop recording) so the playbook generator can detect and convert to `loop` + tab management
347
-
348
- **How to detect new tab during recording:**
349
- 1. After clicking a link, call `browser_tab_list`
350
- 2. If a new tab appeared (tab count increased), use `browser_tab_switch` to go to it
351
- 3. If no new tab (same count), the link navigated in-place → use normal flow with `browser_go_back`
287
+ When links open new tabs:
288
+ 1. After click, `browser_tab_list` to detect new tab
289
+ 2. `browser_tab_switch` → extract → `browser_tab_close` → `browser_tab_switch` back
290
+ 3. Do NOT use `browser_go_back` for cross-tab navigation
291
+ 4. Record 2+ iterations for loop detection
352
292
 
353
293
  ### Rule 7: Fixed vs Variable Parameters
354
294
 
355
- | Command | Fixed | Variable (→ `{{param}}`) |
356
- |---------|-------|--------------------------|
357
- | `browser_go_to_url` | Base URL | Query params, path segments |
358
- | `browser_click_element` | `--index` | — (never parameterize) |
359
- | `browser_input_text` | `--index` | `--text` |
360
- | `browser_select_dropdown_option` | `--index` | `--text` |
361
- | `browser_keyboard_op` | `--action` | `--text` |
362
- | `browser_eval_content_js` | Script structure | Count, selector, keyword in script |
363
- | `browser_find_and_act` | `--by`, `--action` | `--value`, `--actionValue`, `--name` |
364
- | `browser_dialog` | `--action` | `--text` |
365
- | `browser_wait` / `browser_scroll_*` | All params | — |
366
-
367
- ### Rule 8: Non-Replayable Commands (AI-Decision Only)
295
+ | Fixed (never parameterize) | Variable (→ `{{param}}`) |
296
+ |---|---|
297
+ | `index`, `action`, base URLs, `settings` | `text`, query params, `actionValue`, count in scripts |
368
298
 
369
- These commands help AI decide but produce **no replayable action** — they are skipped during replay:
299
+ ### Rule 8: Non-Replayable Commands (filtered during replay)
370
300
 
371
- `browser_snapshot`, `browser_snapshot --markdown`, `browser_screenshot`, `browser_get_info`, `browser_check_state`, `browser_get_dropdown_options`, `browser_tab_list`
372
-
373
- **Principle**: AI uses these to decide; the resulting **actions** (click, input, select) are what gets recorded and replayed.
301
+ `browser_snapshot`, `browser_screenshot`, `browser_get_info`, `browser_check_state`, `browser_get_dropdown_options`, `browser_tab_list`
374
302
 
375
303
  ### Rule 9: Prefer `browser_find_and_act` for Dynamic Content
376
304
 
377
- For elements in dynamic lists (search results, feeds) where index stability is uncertain:
378
-
379
305
  ```bash
380
- # ✅ Preferred: semantic locator (stable across page changes)
381
- browser_find_and_act --by text --value "{{target_text}}" --action click
382
-
383
- # ⚠️ Acceptable: index-based (for stable page structures only)
306
+ # ✅ Stable across page changes
307
+ browser_find_and_act --by text --value "{{target}}" --action click
308
+ # ⚠️ Only for stable structures
384
309
  browser_click_element --index 15_abc1_def2
385
310
  ```
386
311
 
387
312
  ---
388
313
 
389
- ## Generating Replay Scripts / Playbooks
314
+ ## Generating Playbooks
390
315
 
391
- ### Auto-Generate Trigger
316
+ After `task_end`, if task is replayable:
392
317
 
393
- After every `task_end`, if the task description indicates it should be replayable (e.g., contains "可回放", "replay", or is a standard operational task without AI summarization), the AI **MUST automatically**:
394
-
395
- 1. Load the `qqbrowser-playbook` skill
396
- 2. Run `task_latest` to get the raw recording
397
- 3. Generate a clean playbook (filtering out AI-only commands, parameterizing user data)
318
+ 1. Load `qqbrowser-playbook` skill
319
+ 2. `task_latest` → get raw recording
320
+ 3. Generate clean playbook (filter AI-only commands, parameterize user data)
398
321
  4. Save to `~/.qqbrowser-skill/playbooks/<name>.json`
399
322
 
400
- **DO NOT** simply tell the user to run `browser_replay --script <raw_recording_path>`. Raw recordings contain trial-and-error steps and AI-only commands — they are NOT replay-ready.
401
-
402
- ### Explicit Trigger
403
-
404
- When the user explicitly asks to "save as replay script" / "generate playbook" / "保存成回放脚本" / "生成回放脚本":
405
-
406
- Same workflow as above.
407
-
408
- ### Rules
409
-
410
- 1. **MUST** load the `qqbrowser-playbook` skill for format specification
411
- 2. **MUST** use `task_latest` to get recorded steps — do NOT fabricate
412
- 3. **MUST** filter out all AI-only commands (`browser_snapshot`, `browser_screenshot`, `browser_get_info`, `browser_check_state`, etc.)
413
- 4. **MUST** save to `~/.qqbrowser-skill/playbooks/<name>.json` (macOS/Linux) or `%LOCALAPPDATA%/qqbrowser-skill/playbooks/<name>.json` (Windows)
414
- 5. **MUST** use the exact JSON format from the `qqbrowser-playbook` skill specification
415
- 6. **MUST NOT** output the raw recording path as the "replay script" — it is source material, not a ready playbook
323
+ **Rules:**
324
+ - MUST base on `task_latest` — never fabricate steps
325
+ - MUST filter excluded commands
326
+ - MUST use `qqbrowser-playbook` skill JSON format
327
+ - Raw recordings are NOT replay-ready; always generate a proper playbook