@qqbrowser/openclaw-qbot 0.10.13 → 0.10.15

Files changed (180)
  1. package/dist/build-info.json +3 -3
  2. package/dist/canvas-host/a2ui/.bundle.hash +1 -1
  3. package/dist/canvas-host/a2ui/a2ui.bundle.js +6 -6
  4. package/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js +1 -0
  5. package/node_modules/@aws-sdk/client-bedrock-runtime/dist-es/models/enums.js +1 -0
  6. package/node_modules/@aws-sdk/client-bedrock-runtime/package.json +13 -13
  7. package/node_modules/@aws-sdk/core/package.json +6 -6
  8. package/node_modules/@aws-sdk/credential-provider-env/package.json +5 -5
  9. package/node_modules/@aws-sdk/credential-provider-http/package.json +7 -7
  10. package/node_modules/@aws-sdk/credential-provider-ini/package.json +13 -13
  11. package/node_modules/@aws-sdk/credential-provider-login/package.json +6 -6
  12. package/node_modules/@aws-sdk/credential-provider-node/dist-cjs/index.js +12 -1
  13. package/node_modules/@aws-sdk/credential-provider-node/dist-es/runtime/memoize-chain.js +12 -1
  14. package/node_modules/@aws-sdk/credential-provider-node/package.json +11 -11
  15. package/node_modules/@aws-sdk/credential-provider-process/package.json +5 -5
  16. package/node_modules/@aws-sdk/credential-provider-sso/package.json +7 -7
  17. package/node_modules/@aws-sdk/credential-provider-web-identity/package.json +6 -6
  18. package/node_modules/@aws-sdk/eventstream-handler-node/package.json +4 -4
  19. package/node_modules/@aws-sdk/middleware-eventstream/package.json +4 -4
  20. package/node_modules/@aws-sdk/middleware-websocket/package.json +7 -7
  21. package/node_modules/@aws-sdk/nested-clients/dist-cjs/submodules/cognito-identity/index.js +1 -1
  22. package/node_modules/@aws-sdk/nested-clients/dist-cjs/submodules/signin/index.js +1 -1
  23. package/node_modules/@aws-sdk/nested-clients/dist-cjs/submodules/sso/index.js +1 -1
  24. package/node_modules/@aws-sdk/nested-clients/dist-cjs/submodules/sso-oidc/index.js +1 -1
  25. package/node_modules/@aws-sdk/nested-clients/dist-cjs/submodules/sts/index.js +1 -1
  26. package/node_modules/@aws-sdk/nested-clients/package.json +8 -8
  27. package/node_modules/@aws-sdk/signature-v4-multi-region/package.json +4 -4
  28. package/node_modules/@aws-sdk/token-providers/package.json +6 -6
  29. package/node_modules/@aws-sdk/types/package.json +2 -2
  30. package/node_modules/@aws-sdk/xml-builder/package.json +2 -2
  31. package/node_modules/@clack/core/dist/index.mjs +7 -7
  32. package/node_modules/@clack/core/package.json +2 -1
  33. package/node_modules/@clack/prompts/dist/index.mjs +38 -38
  34. package/node_modules/@clack/prompts/package.json +2 -2
  35. package/node_modules/@nodable/entities/package.json +1 -1
  36. package/node_modules/@nodable/entities/src/EntityDecoder.js +1 -1
  37. package/node_modules/@nodable/entities/src/entities.js +0 -18
  38. package/node_modules/@slack/bolt/dist/App.js +8 -12
  39. package/node_modules/@slack/bolt/dist/context/index.js +7 -7
  40. package/node_modules/@slack/bolt/dist/index.js +16 -16
  41. package/node_modules/@slack/bolt/dist/receivers/AwsLambdaReceiver.js +2 -0
  42. package/node_modules/@slack/bolt/dist/receivers/ExpressReceiver.js +8 -1
  43. package/node_modules/@slack/bolt/dist/receivers/HTTPReceiver.js +4 -2
  44. package/node_modules/@slack/bolt/dist/receivers/SocketModeReceiver.js +1 -1
  45. package/node_modules/@slack/bolt/dist/receivers/verify-request.js +3 -0
  46. package/node_modules/@slack/bolt/dist/receivers/verify-signing-secret.js +12 -0
  47. package/node_modules/@slack/bolt/dist/types/actions/index.js +1 -1
  48. package/node_modules/@slack/bolt/dist/types/index.js +3 -3
  49. package/node_modules/@slack/bolt/package.json +4 -4
  50. package/node_modules/@smithy/core/dist-cjs/index.js +3 -4
  51. package/node_modules/@smithy/core/dist-cjs/submodules/client/index.js +3 -11
  52. package/node_modules/@smithy/core/dist-cjs/submodules/config/index.browser.js +2 -2
  53. package/node_modules/@smithy/core/dist-cjs/submodules/config/index.js +2 -2
  54. package/node_modules/@smithy/core/dist-cjs/submodules/config/index.native.js +2 -2
  55. package/node_modules/@smithy/core/dist-cjs/submodules/endpoints/index.browser.js +9 -40
  56. package/node_modules/@smithy/core/dist-cjs/submodules/endpoints/index.js +9 -40
  57. package/node_modules/@smithy/core/dist-cjs/submodules/protocols/index.js +12 -142
  58. package/node_modules/@smithy/core/dist-cjs/submodules/retry/index.browser.js +39 -22
  59. package/node_modules/@smithy/core/dist-cjs/submodules/retry/index.js +39 -22
  60. package/node_modules/@smithy/core/dist-cjs/submodules/schema/index.js +7 -9
  61. package/node_modules/@smithy/core/dist-cjs/submodules/serde/index.browser.js +2 -2
  62. package/node_modules/@smithy/core/dist-cjs/submodules/serde/index.js +2 -2
  63. package/node_modules/@smithy/core/dist-cjs/submodules/serde/index.native.js +2 -2
  64. package/node_modules/@smithy/core/dist-cjs/submodules/transport/index.js +184 -0
  65. package/node_modules/@smithy/core/dist-es/index.js +6 -6
  66. package/node_modules/@smithy/core/dist-es/submodules/client/index.js +2 -2
  67. package/node_modules/@smithy/core/dist-es/submodules/config/config-resolver/regionConfig/checkRegion.js +1 -1
  68. package/node_modules/@smithy/core/dist-es/submodules/endpoints/index.browser.js +2 -2
  69. package/node_modules/@smithy/core/dist-es/submodules/endpoints/index.js +2 -2
  70. package/node_modules/@smithy/core/dist-es/submodules/endpoints/middleware-endpoint/adaptors/toEndpointV1.js +1 -1
  71. package/node_modules/@smithy/core/dist-es/submodules/endpoints/middleware-endpoint/resolveEndpointConfig.js +1 -1
  72. package/node_modules/@smithy/core/dist-es/submodules/endpoints/util-endpoints/lib/index.js +1 -1
  73. package/node_modules/@smithy/core/dist-es/submodules/protocols/HttpBindingProtocol.js +1 -1
  74. package/node_modules/@smithy/core/dist-es/submodules/protocols/HttpProtocol.js +1 -2
  75. package/node_modules/@smithy/core/dist-es/submodules/protocols/RpcProtocol.js +1 -1
  76. package/node_modules/@smithy/core/dist-es/submodules/protocols/index.js +5 -5
  77. package/node_modules/@smithy/core/dist-es/submodules/protocols/middleware-content-length/contentLengthMiddleware.js +1 -1
  78. package/node_modules/@smithy/core/dist-es/submodules/protocols/requestBuilder.js +1 -1
  79. package/node_modules/@smithy/core/dist-es/submodules/retry/middleware-retry/configurations.js +19 -6
  80. package/node_modules/@smithy/core/dist-es/submodules/retry/middleware-retry/retryMiddleware.js +4 -5
  81. package/node_modules/@smithy/core/dist-es/submodules/retry/service-error-classification/constants.js +1 -1
  82. package/node_modules/@smithy/core/dist-es/submodules/retry/util-retry/ConfiguredRetryStrategy.js +4 -5
  83. package/node_modules/@smithy/core/dist-es/submodules/retry/util-retry/DefaultRetryToken.js +3 -0
  84. package/node_modules/@smithy/core/dist-es/submodules/retry/util-retry/StandardRetryStrategy.js +9 -5
  85. package/node_modules/@smithy/core/dist-es/submodules/schema/middleware/schemaDeserializationMiddleware.js +3 -4
  86. package/node_modules/@smithy/core/dist-es/submodules/schema/middleware/schemaSerializationMiddleware.js +1 -2
  87. package/node_modules/@smithy/core/dist-es/submodules/serde/middleware-serde/deserializerMiddleware.js +1 -1
  88. package/node_modules/@smithy/core/dist-es/submodules/transport/index.js +9 -0
  89. package/node_modules/@smithy/core/dist-es/submodules/{protocols/url-parser → transport}/parseUrl.js +1 -1
  90. package/node_modules/@smithy/core/dist-es/submodules/{endpoints → transport}/toEndpointV1.js +1 -1
  91. package/node_modules/@smithy/core/package.json +20 -11
  92. package/node_modules/@smithy/core/transport.js +5 -0
  93. package/node_modules/@smithy/credential-provider-imds/dist-cjs/index.js +14 -13
  94. package/node_modules/@smithy/credential-provider-imds/dist-es/fromContainerMetadata.js +14 -13
  95. package/node_modules/@smithy/credential-provider-imds/package.json +3 -3
  96. package/node_modules/@smithy/fetch-http-handler/package.json +4 -4
  97. package/node_modules/@smithy/node-http-handler/package.json +4 -4
  98. package/node_modules/@smithy/signature-v4/package.json +3 -3
  99. package/node_modules/@smithy/types/package.json +1 -1
  100. package/node_modules/eventsource-parser/dist/index.cjs +21 -10
  101. package/node_modules/eventsource-parser/dist/index.d.cts +33 -10
  102. package/node_modules/eventsource-parser/dist/index.js +21 -10
  103. package/node_modules/eventsource-parser/dist/stream.cjs +4 -3
  104. package/node_modules/eventsource-parser/dist/stream.d.cts +16 -3
  105. package/node_modules/eventsource-parser/dist/stream.js +4 -3
  106. package/node_modules/eventsource-parser/package.json +8 -8
  107. package/node_modules/hasown/package.json +4 -5
  108. package/node_modules/lru-cache/package.json +1 -1
  109. package/node_modules/protobufjs/dist/light/protobuf.js +7 -5
  110. package/node_modules/protobufjs/dist/light/protobuf.min.js +3 -3
  111. package/node_modules/protobufjs/dist/minimal/protobuf.js +3 -3
  112. package/node_modules/protobufjs/dist/minimal/protobuf.min.js +3 -3
  113. package/node_modules/protobufjs/dist/protobuf.js +7 -5
  114. package/node_modules/protobufjs/dist/protobuf.min.js +3 -3
  115. package/node_modules/protobufjs/package.json +1 -1
  116. package/node_modules/protobufjs/src/converter.js +4 -2
  117. package/node_modules/protobufjs/src/roots.js +1 -1
  118. package/node_modules/thread-stream/.claude/settings.local.json +15 -0
  119. package/node_modules/thread-stream/CLAUDE.md +64 -0
  120. package/node_modules/thread-stream/index.js +41 -13
  121. package/node_modules/thread-stream/lib/indexes.js +3 -1
  122. package/node_modules/thread-stream/lib/worker.js +20 -8
  123. package/node_modules/thread-stream/package.json +1 -1
  124. package/node_modules/undici/lib/global.js +10 -1
  125. package/node_modules/undici/package.json +1 -1
  126. package/package.json +1 -1
  127. package/skills/qqbrowser-playbook/SKILL.md +262 -234
  128. package/skills/qqbrowser-skill/SKILL.md +330 -234
  129. package/node_modules/@smithy/core/dist-cjs/getSmithyContext.js +0 -6
  130. package/node_modules/@smithy/core/dist-cjs/middleware-http-auth-scheme/getHttpAuthSchemeEndpointRuleSetPlugin.js +0 -21
  131. package/node_modules/@smithy/core/dist-cjs/middleware-http-auth-scheme/getHttpAuthSchemePlugin.js +0 -21
  132. package/node_modules/@smithy/core/dist-cjs/middleware-http-auth-scheme/httpAuthSchemeMiddleware.js +0 -46
  133. package/node_modules/@smithy/core/dist-cjs/middleware-http-auth-scheme/index.js +0 -6
  134. package/node_modules/@smithy/core/dist-cjs/middleware-http-auth-scheme/resolveAuthOptions.js +0 -24
  135. package/node_modules/@smithy/core/dist-cjs/middleware-http-signing/getHttpSigningMiddleware.js +0 -19
  136. package/node_modules/@smithy/core/dist-cjs/middleware-http-signing/httpSigningMiddleware.js +0 -27
  137. package/node_modules/@smithy/core/dist-cjs/middleware-http-signing/index.js +0 -5
  138. package/node_modules/@smithy/core/dist-cjs/normalizeProvider.js +0 -10
  139. package/node_modules/@smithy/core/dist-cjs/pagination/createPaginator.js +0 -44
  140. package/node_modules/@smithy/core/dist-cjs/request-builder/requestBuilder.js +0 -5
  141. package/node_modules/@smithy/core/dist-cjs/setFeature.js +0 -14
  142. package/node_modules/@smithy/core/dist-cjs/util-identity-and-auth/DefaultIdentityProviderConfig.js +0 -18
  143. package/node_modules/@smithy/core/dist-cjs/util-identity-and-auth/httpAuthSchemes/httpApiKeyAuth.js +0 -38
  144. package/node_modules/@smithy/core/dist-cjs/util-identity-and-auth/httpAuthSchemes/httpBearerAuth.js +0 -15
  145. package/node_modules/@smithy/core/dist-cjs/util-identity-and-auth/httpAuthSchemes/index.js +0 -6
  146. package/node_modules/@smithy/core/dist-cjs/util-identity-and-auth/httpAuthSchemes/noAuth.js +0 -9
  147. package/node_modules/@smithy/core/dist-cjs/util-identity-and-auth/index.js +0 -6
  148. package/node_modules/@smithy/core/dist-cjs/util-identity-and-auth/memoizeIdentityProvider.js +0 -61
  149. package/node_modules/@smithy/core/dist-es/request-builder/requestBuilder.js +0 -1
  150. package/node_modules/@smithy/core/dist-es/submodules/client/util-middleware/getSmithyContext.js +0 -2
  151. package/node_modules/@smithy/core/dist-es/submodules/event-streams/eventstream-codec/TestVectors.fixture.js +0 -146
  152. package/node_modules/@smithy/core/dist-es/submodules/event-streams/eventstream-codec/vectorTypes.fixture.js +0 -1
  153. package/node_modules/@smithy/credential-provider-imds/dist-es/remoteProvider/index.js +0 -2
  154. package/node_modules/@smithy/node-http-handler/dist-es/readable.mock.js +0 -21
  155. package/node_modules/@smithy/node-http-handler/dist-es/server.mock.js +0 -88
  156. package/node_modules/@smithy/node-http-handler/dist-es/stream-collector/readable.mock.js +0 -21
  157. package/node_modules/@smithy/signature-v4/dist-es/suite.fixture.js +0 -399
  158. /package/node_modules/@smithy/core/dist-es/{middleware-http-auth-scheme → legacy-root-exports/middleware-http-auth-scheme}/getHttpAuthSchemeEndpointRuleSetPlugin.js +0 -0
  159. /package/node_modules/@smithy/core/dist-es/{middleware-http-auth-scheme → legacy-root-exports/middleware-http-auth-scheme}/getHttpAuthSchemePlugin.js +0 -0
  160. /package/node_modules/@smithy/core/dist-es/{middleware-http-auth-scheme → legacy-root-exports/middleware-http-auth-scheme}/httpAuthSchemeMiddleware.js +0 -0
  161. /package/node_modules/@smithy/core/dist-es/{middleware-http-auth-scheme → legacy-root-exports/middleware-http-auth-scheme}/index.js +0 -0
  162. /package/node_modules/@smithy/core/dist-es/{middleware-http-auth-scheme → legacy-root-exports/middleware-http-auth-scheme}/resolveAuthOptions.js +0 -0
  163. /package/node_modules/@smithy/core/dist-es/{middleware-http-signing → legacy-root-exports/middleware-http-signing}/getHttpSigningMiddleware.js +0 -0
  164. /package/node_modules/@smithy/core/dist-es/{middleware-http-signing → legacy-root-exports/middleware-http-signing}/httpSigningMiddleware.js +0 -0
  165. /package/node_modules/@smithy/core/dist-es/{middleware-http-signing → legacy-root-exports/middleware-http-signing}/index.js +0 -0
  166. /package/node_modules/@smithy/core/dist-es/{pagination → legacy-root-exports/pagination}/createPaginator.js +0 -0
  167. /package/node_modules/@smithy/core/dist-es/{util-identity-and-auth → legacy-root-exports/util-identity-and-auth}/DefaultIdentityProviderConfig.js +0 -0
  168. /package/node_modules/@smithy/core/dist-es/{util-identity-and-auth → legacy-root-exports/util-identity-and-auth}/httpAuthSchemes/httpApiKeyAuth.js +0 -0
  169. /package/node_modules/@smithy/core/dist-es/{util-identity-and-auth → legacy-root-exports/util-identity-and-auth}/httpAuthSchemes/httpBearerAuth.js +0 -0
  170. /package/node_modules/@smithy/core/dist-es/{util-identity-and-auth → legacy-root-exports/util-identity-and-auth}/httpAuthSchemes/index.js +0 -0
  171. /package/node_modules/@smithy/core/dist-es/{util-identity-and-auth → legacy-root-exports/util-identity-and-auth}/httpAuthSchemes/noAuth.js +0 -0
  172. /package/node_modules/@smithy/core/dist-es/{util-identity-and-auth → legacy-root-exports/util-identity-and-auth}/index.js +0 -0
  173. /package/node_modules/@smithy/core/dist-es/{util-identity-and-auth → legacy-root-exports/util-identity-and-auth}/memoizeIdentityProvider.js +0 -0
  174. /package/node_modules/@smithy/core/dist-es/{getSmithyContext.js → submodules/transport/getSmithyContext.js} +0 -0
  175. /package/node_modules/@smithy/core/dist-es/submodules/{protocols/protocol-http → transport}/httpRequest.js +0 -0
  176. /package/node_modules/@smithy/core/dist-es/submodules/{protocols/protocol-http → transport}/httpResponse.js +0 -0
  177. /package/node_modules/@smithy/core/dist-es/submodules/{endpoints/util-endpoints/lib → transport}/isValidHostLabel.js +0 -0
  178. /package/node_modules/@smithy/core/dist-es/submodules/{protocols/protocol-http → transport}/isValidHostname.js +0 -0
  179. /package/node_modules/@smithy/core/dist-es/submodules/{client/util-middleware → transport}/normalizeProvider.js +0 -0
  180. /package/node_modules/@smithy/core/dist-es/submodules/{protocols/querystring-parser → transport}/parseQueryString.js +0 -0
@@ -10,310 +10,406 @@ permissions:
 
  # qqbrowser-skill
 
- ## Platform Support
+ ## Installation
 
- - **Linux x86_64**: Supported
- - **Windows**: Supported
- - **macOS**: Supported
- - Other Linux architectures (ARM, etc.) are not supported.
-
- ## Security
+ ```bash
+ # Linux / macOS
+ pipx install qqbrowser-skill && qqbrowser-skill install
 
- ### Permissions
+ # Windows
+ pip install qqbrowser-skill && qqbrowser-skill install
+ ```
 
- This skill requires the following permissions to function properly:
+ ## Key Concepts
 
- | Permission | Scope | Purpose |
- | ---------------------------- | -------------------------- | ------------------------------------------------------------------ |
- | **Network Access** | Outbound HTTP/HTTPS | Required for browser navigation, page loading, and web interaction |
- | **File System (Read/Write)** | Temporary directories only | Required for saving screenshots (`.webp`) and downloaded files |
+ - **Element Index**: Encoded string like `2_sfli_qp0u` (format: `highlightIndex_attrHash_xpathHash`). Generated by `browser_snapshot`, used to target elements for interaction. Features a 4-level degradation matching mechanism for cross-snapshot stability.
+ - **Snapshot**: Returns page content with indexed elements. Re-snapshot is required after any DOM change (navigation, form submission, AJAX loading).
+ - **Task Recording**: All browser tasks MUST be wrapped with `task_begin`/`task_end` to enable replay.
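Reviewer note on the added "Element Index" bullet: the stated format `highlightIndex_attrHash_xpathHash` can be illustrated with a small parser. This is only a sketch. The `ElementIndex` type and `parse_element_index` function are hypothetical names, and the code assumes that neither hash contains an underscore and that the first field is a decimal integer, which the diff does not confirm.

```python
from typing import NamedTuple


class ElementIndex(NamedTuple):
    """Hypothetical container for the three parts of an encoded element index."""
    highlight_index: int
    attr_hash: str
    xpath_hash: str


def parse_element_index(encoded: str) -> ElementIndex:
    """Split 'highlightIndex_attrHash_xpathHash' into its three fields.

    Assumption: exactly two underscores and a decimal first field.
    """
    parts = encoded.split("_")
    if len(parts) != 3 or not parts[0].isdecimal():
        raise ValueError(f"not an encoded element index: {encoded!r}")
    return ElementIndex(int(parts[0]), parts[1], parts[2])


# Example index taken from the documentation text in the diff.
print(parse_element_index("2_sfli_qp0u"))
```

Because indices are generated by `browser_snapshot`, a parser like this would only be useful for logging or validation; the CLI itself takes the encoded string unchanged.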
 
- ### QQBrowser Binary
+ ## Core Workflow
 
- The `qqbrowser-skill install` command downloads the QQ Browser package from official Tencent distribution channels via HTTPS:
+ ```bash
+ qqbrowser-skill task_begin --description "描述任务" # 1. Start recording (MANDATORY)
+ qqbrowser-skill browser_go_to_url --url <url> # 2. Navigate
+ qqbrowser-skill browser_snapshot # 3. Get element indices
+ # ... interact using indices ... # 4. Perform actions
+ qqbrowser-skill browser_snapshot # 5. Re-snapshot after DOM changes
+ qqbrowser-skill task_end # 6. End recording (MANDATORY)
+ ```
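Reviewer note on the added "Core Workflow" block: the six-step ordering can be sketched as a function that assembles the command lines without running them. The function name `workflow_commands` is hypothetical, and the element index in the example is a placeholder, since real indices only exist after a snapshot.

```python
import shlex


def workflow_commands(description: str, url: str, actions: list[list[str]]) -> list[list[str]]:
    """Return the argv lists for one recorded task, in the documented order."""
    cli = "qqbrowser-skill"
    steps = [
        [cli, "task_begin", "--description", description],  # 1. start recording
        [cli, "browser_go_to_url", "--url", url],           # 2. navigate
        [cli, "browser_snapshot"],                          # 3. get element indices
    ]
    steps += [[cli, *action] for action in actions]         # 4. interact
    steps.append([cli, "browser_snapshot"])                 # 5. re-snapshot
    steps.append([cli, "task_end"])                         # 6. end recording
    return steps


# Only prints the commands; nothing is executed.
for argv in workflow_commands(
    "Fill out example form",
    "https://example.com/form",
    [["browser_click_element", "--index", "3_klm5_nop6"]],  # placeholder index
):
    print(shlex.join(argv))
```

Keeping `task_begin` and `task_end` inside the builder makes it hard to emit an unrecorded task, which matches the "MANDATORY" wording in the new text.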
 
- - **Base URL**: `https://dldir1v6.qq.com/invc/tt/QB/Public/`
- - `dldir1v6.qq.com` is Tencent's official software distribution CDN.
- - All downloads are performed over **HTTPS** to ensure transport-level security.
+ ## Command Reference
 
- ### File Storage
+ ### Navigation
+ ```bash
+ browser_go_to_url --url <url> # Navigate to URL
+ browser_go_back # Go back
+ browser_wait --seconds <n> # Wait (default 3s)
+ ```
 
- - **Screenshots**: Saved to the system's temporary directory (e.g., `/tmp/` on Linux) and returned as file paths.
- - **Downloaded files**: Saved to the system's temporary directory or user-specified path via `browser_download_file` / `browser_download_url`.
- - This skill does **not** access or modify files outside of its designated directories.
+ ### Snapshot & Screenshot
+ ```bash
+ browser_snapshot # Element indices mode (for interaction)
+ browser_snapshot --markdown # Clean Markdown mode (for reading, no indices)
+ browser_screenshot [--full] [--annotate] # Screenshot (returns .webp temp path)
+ ```
 
- ## Note:
+ ### Click & Input
+ ```bash
+ browser_click_element --index <id> # Click
+ browser_dblclick_element --index <id> # Double-click
+ browser_focus_element --index <id> # Focus
+ browser_input_text --index <id> --text "<content>" # Clear + input text
+ ```
 
- Each command will return a snapshot of the current page after execution, including the encoded index of elements (format: `index_attrHash_xpathHash`, e.g. `2_sfli_qp0u`).
- Please call the standalone qqbrowser-skill browser_snapshot command only when necessary to avoid unnecessary token consumption.
+ ### Scroll
+ ```bash
+ browser_scroll_down [--amount <px>] # Scroll down (default one page)
+ browser_scroll_up [--amount <px>] # Scroll up
+ browser_scroll_to_text --text "<text>" # Scroll to text
+ browser_scroll_to_top / browser_scroll_to_bottom
+ browser_scroll_by --direction <dir> --pixels <n> [--index <id>]
+ browser_scroll_into_view --index <id>
+ ```
 
- ## ⚠️ MANDATORY: Task Recording Required
+ ### Keyboard
+ ```bash
+ browser_keypress --key <key> # Press key (Enter, Tab, etc.)
+ browser_keyboard_op --action type --text "<content>" # Type text
+ browser_keyboard_op --action inserttext --text "<content>" # Insert without key events
+ browser_keydown --key <key> / browser_keyup --key <key>
+ ```
 
- **EVERY browser automation task MUST be wrapped with `task_begin` and `task_end`.** This is NOT optional.
+ ### Dropdown & Checkbox
+ ```bash
+ browser_get_dropdown_options --index <id>
+ browser_select_dropdown_option --index <id> --text "<option>"
+ browser_check_op --index <id> --value / --no-value
+ ```
 
+ ### Find and Act (Semantic Locators)
  ```bash
- # ALWAYS start with task_begin BEFORE any browser commands
- qqbrowser-skill task_begin --description "描述你要做的任务"
+ browser_find_and_act --by <role|text|label|placeholder|testid|css> --value "<v>" --action <click|fill|type> [--actionValue "<v>"] [--name "<n>"] [--nth <n>]
+ # --nth: select the Nth matching element (0-based). Essential for loop patterns in lists.
+ ```
 
- # ... your browser commands here ...
+ ### Get Information & State
+ ```bash
+ browser_get_info --type <text|url|title|html|value|attr|count|box|styles|list_selector> [--index <id>] [--attribute <name>]
+ browser_check_state --state <visible|enabled|checked> --index <id>
+ ```
 
- # ALWAYS end with task_end AFTER all browser commands are done
- qqbrowser-skill task_end
+ > **💡 Key Usage**: `browser_get_info` returns **only the requested info** (no page state or interactive elements), making it lightweight and token-efficient.
+ >
+ > **🔑 `list_selector` — Auto-detect CSS selector for list iteration (PREFERRED for loop tasks):**
+ > ```bash
+ > # From snapshot you see a list of articles:
+ > # [3_abc1_def2]<a 人工智能的未来/>
+ > # [4_xyz3_uvw4]<a 深度学习入门/>
+ > # [5_mno5_pqr6]<a 大模型时代/>
+ >
+ > # Pick ANY one element from the list and run:
+ > browser_get_info --type list_selector --index "3_abc1_def2"
+ > # → {"success":true, "selector":".ContentItem h2 a", "count":10,
+ > # "samples":["人工智能的未来","深度学习入门","大模型时代","...","..."],
+ > # "strategy":"parent_level_1"}
+ >
+ > # Now use this selector directly in find_and_act for loop iteration:
+ > browser_find_and_act --by css --value ".ContentItem h2 a" --action click --nth 1
+ > ```
+ > **This is the RECOMMENDED way to find CSS selectors for list operations.** It replaces the manual workflow of `get_info --type html` → analyze → construct selector → probe verify. One command does it all.
+ >
+ > If `list_selector` returns `success: false`, fall back to manual inspection:
+ > ```bash
+ > browser_get_info --type html --index "5_abc1_def2"
+ > # → returns element HTML, use it to construct a selector manually
+ > ```
+
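Reviewer note on the added `list_selector` guidance: the loop pattern it describes (take `selector` and `count` from the result, then step `--nth` from 0) might look like the sketch below. The JSON is a trimmed copy of the sample shown in the diff. The function name `click_commands` is hypothetical, and the code only builds command strings; it does not handle re-navigation between clicks.

```python
import json
import shlex

# Sample result in the shape shown in the added documentation text.
result_json = (
    '{"success": true, "selector": ".ContentItem h2 a", '
    '"count": 10, "strategy": "parent_level_1"}'
)


def click_commands(result_text: str) -> list[str]:
    """Build one find_and_act click command per matched list item (0-based --nth)."""
    result = json.loads(result_text)
    if not result.get("success"):
        # The documentation says to fall back to `--type html` inspection here.
        return []
    return [
        shlex.join([
            "qqbrowser-skill", "browser_find_and_act",
            "--by", "css", "--value", result["selector"],
            "--action", "click", "--nth", str(n),
        ])
        for n in range(result["count"])
    ]


commands = click_commands(result_json)
print(len(commands))
print(commands[0])
```

`shlex.join` quotes the selector because it contains spaces, so the printed lines can be pasted into a POSIX shell as-is.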
+ ### JavaScript Evaluation
+ ```bash
+ browser_eval_content_js --script "<js_code>" # Evaluate JS, return result
+ browser_eval_content_js --script "<base64>" --base64 # Base64-encoded script
  ```
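Reviewer note on the `--base64` form: the removed documentation paired the script `document.title` with the encoded value `ZG9jdW1lbnQudGl0bGU=`, which is consistent with standard Base64 over the UTF-8 bytes of the script. Assuming that encoding is what the CLI expects, the `<base64>` placeholder could be produced like this:

```python
import base64

script = "document.title"

# Standard Base64 (RFC 4648 alphabet, with padding) of the UTF-8 script text.
encoded = base64.b64encode(script.encode("utf-8")).decode("ascii")

print(encoded)  # ZG9jdW1lbnQudGl0bGU=
```

Encoding the script avoids shell-quoting problems when the JavaScript itself contains quotes or newlines.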
 
- **Rules:**
+ ### Download
+ ```bash
+ browser_download_file --index <id> # Download by clicking element
+ browser_download_url # Download from URL
+ ```
 
- - `task_begin` MUST be the **first** command you execute (before `browser_go_to_url` or any other browser command)
- - `task_end` MUST be the **last** command you execute (after the task is complete)
- - There are NO exceptions to this rule — even for simple tasks like data extraction or page reading
- - Failure to use `task_begin`/`task_end` means the task cannot be replayed or saved as a playbook later
+ ### Tab Management
+ ```bash
+ browser_tab_open --url <url> # Open new tab
+ browser_tab_list # List tabs
+ browser_tab_switch --tabId <n> # Switch tab
+ browser_tab_close --tabId <n> # Close tab
+ ```
 
- ## Core Workflow
+ ### Dialog
+ ```bash
+ browser_dialog --action <accept|dismiss> [--text "<input>"]
+ ```
 
- Every browser automation follows this pattern:
+ ### Replay
+ ```bash
+ browser_replay --script <path> # Replay from file
+ browser_replay --script_content '<json>' # Replay from JSON string
+ browser_replay --script <path> --variables '{"key":"value"}' # With params
+ ```
 
- 1. **Record start**: `qqbrowser-skill task_begin --description "task description"`
- 2. **Navigate**: `qqbrowser-skill browser_go_to_url --url <url>`
- 3. **Snapshot**: `qqbrowser-skill browser_snapshot` (get indexed element refs)
- 4. **Interact**: Use element index to click, fill, select
- 5. **Re-snapshot**: After navigation or DOM changes, get fresh refs
- 6. **Record end**: `qqbrowser-skill task_end`
+ > **⚠️ LONG EXECUTION TIME**: `browser_replay` executes ALL steps in the playbook sequentially (navigation, waiting, data extraction, etc.). **This can take up to 10 minutes** depending on the number of steps, network conditions, and configured delays/timeouts.
+ >
+ > **You MUST wait patiently** for the command to complete — do NOT assume it has hung or timed out. Do NOT interrupt or retry unless you receive an explicit error message.
+ > Typical durations: simple playbooks (5-10 steps) ~30s–1min, complex playbooks with loops (20+ steps) ~3–10min.
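Reviewer note on the added replay warning: a caller that wraps `browser_replay` would need a generous timeout and a safely serialized `--variables` value. The sketch below assumes the flags shown in the diff. `REPLAY_TIMEOUT_SECONDS`, `replay_argv`, and `run_replay` are hypothetical names, and the 15-minute ceiling is an arbitrary margin above the quoted 10 minutes, not a documented value.

```python
import json
import shlex
import subprocess

# Hypothetical ceiling, chosen to sit above the "up to 10 minutes" in the note.
REPLAY_TIMEOUT_SECONDS = 15 * 60


def replay_argv(script_path: str, variables: dict) -> list[str]:
    """Build the argv for a file-based replay, serializing variables as JSON."""
    argv = ["qqbrowser-skill", "browser_replay", "--script", script_path]
    if variables:
        argv += ["--variables", json.dumps(variables)]
    return argv


def run_replay(script_path: str, variables: dict) -> subprocess.CompletedProcess:
    """Run the replay and wait for it; raises TimeoutExpired past the ceiling."""
    return subprocess.run(
        replay_argv(script_path, variables),
        capture_output=True,
        text=True,
        timeout=REPLAY_TIMEOUT_SECONDS,
    )


# Only prints the command; run_replay() is not called here.
print(shlex.join(replay_argv("/path/to/record.json", {"key": "value"})))
```

Passing an argv list instead of a shell string means the JSON needs no manual quoting.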
 
+ ### Task Recording
  ```bash
- qqbrowser-skill task_begin --description "Fill out example form"
- qqbrowser-skill browser_go_to_url --url https://example.com/form
- qqbrowser-skill browser_snapshot
- # Output includes element indices: [1_abc1_def2] input "email", [2_xyz3_uvw4] input "password", [3_klm5_nop6] button "Submit"
-
- qqbrowser-skill browser_input_text --index 1_abc1_def2 --text "user@example.com"
- qqbrowser-skill browser_input_text --index 2_xyz3_uvw4 --text "password123"
- qqbrowser-skill browser_click_element --index 3_klm5_nop6
- qqbrowser-skill browser_wait --seconds 2
- qqbrowser-skill browser_snapshot # Check result
- qqbrowser-skill task_end
+ task_begin --description "<description>" # Start recording (MUST be first)
+ task_end # Stop recording → then auto-generate playbook (see below)
+ task_latest # Get most recent recording (for playbook generation)
  ```
 
- ## Essential Commands
+ > **IMPORTANT**: After `task_end`, do NOT just show the raw recording path. The raw recording (`~/.qqbrowser-skill/records/task_*.json`) is NOT a ready-to-use replay script. You MUST proceed to generate a proper playbook — see "Generating Replay Scripts / Playbooks" section.
 
+ ### Utility
  ```bash
- # Navigation
- qqbrowser-skill browser_go_to_url --url <url> # Navigate to URL
- qqbrowser-skill browser_go_back # Go back
- qqbrowser-skill browser_wait --seconds 3 # Wait for page load (default 3s)
-
- # Snapshot & Screenshot
- qqbrowser-skill browser_snapshot # Default mode: page content with element indices (for interaction)
- qqbrowser-skill browser_snapshot --markdown # Markdown mode: clean Markdown of the page (for reading/extraction, no indices)
- qqbrowser-skill browser_screenshot # Take screenshot (returns temp file path of .webp image)
- qqbrowser-skill browser_screenshot --full # Full-page screenshot (returns temp file path)
- qqbrowser-skill browser_screenshot --annotate # Annotated screenshot with element labels (returns temp file path)
-
- # Click & Input (use indices from snapshot, index is an encoded string like "2_sfli_qp0u")
- qqbrowser-skill browser_click_element --index 2_sfli_qp0u # Click element
- qqbrowser-skill browser_dblclick_element --index 2_sfli_qp0u # Double-click element
- qqbrowser-skill browser_focus_element --index 2_sfli_qp0u # Focus element
- qqbrowser-skill browser_input_text --index 3_abc1_def2 --text "hello" # Input text into element
-
- # Scroll
- qqbrowser-skill browser_scroll_down # Scroll down one page
- qqbrowser-skill browser_scroll_down --amount 300 # Scroll down 300px
- qqbrowser-skill browser_scroll_up # Scroll up one page
- qqbrowser-skill browser_scroll_up --amount 300 # Scroll up 300px
- qqbrowser-skill browser_scroll_to_text --text "Section 3" # Scroll to text
- qqbrowser-skill browser_scroll_to_top # Scroll to top
- qqbrowser-skill browser_scroll_to_bottom # Scroll to bottom
- qqbrowser-skill browser_scroll_by --direction down --pixels 500 # Scroll page by direction
- qqbrowser-skill browser_scroll_by --direction right --pixels 200 --index 3_klm5_nop6 # Scroll element by direction
- qqbrowser-skill browser_scroll_into_view --index 5_xyz3_uvw4 # Scroll element into view
-
- # Keyboard
- qqbrowser-skill browser_keypress --key Enter # Press a key
- qqbrowser-skill browser_keyboard_op --action type --text "hello" # Type text
- qqbrowser-skill browser_keyboard_op --action inserttext --text "hello" # Insert text without key events
- qqbrowser-skill browser_keydown --key Shift # Hold down a key
- qqbrowser-skill browser_keyup --key Shift # Release a key
-
- # Dropdown
- qqbrowser-skill browser_get_dropdown_options --index 2_abc1_def2 # Get dropdown options
- qqbrowser-skill browser_select_dropdown_option --index 2_abc1_def2 --text "Option A" # Select option
-
- # Checkbox
- qqbrowser-skill browser_check_op --index 4_klm5_nop6 --value # Check checkbox
- qqbrowser-skill browser_check_op --index 4_klm5_nop6 --no-value # Uncheck checkbox
-
- # Get Information
- qqbrowser-skill browser_get_info --type text --index 1_abc1_def2 # Get element text
- qqbrowser-skill browser_get_info --type url # Get current URL
- qqbrowser-skill browser_get_info --type title # Get page title
- qqbrowser-skill browser_get_info --type html --index 1_abc1_def2 # Get element HTML
- qqbrowser-skill browser_get_info --type value --index 1_abc1_def2 # Get element value
- qqbrowser-skill browser_get_info --type attr --index 1_abc1_def2 --attribute href # Get attribute
- qqbrowser-skill browser_get_info --type count # Get element count
- qqbrowser-skill browser_get_info --type box --index 1_abc1_def2 # Get bounding box
- qqbrowser-skill browser_get_info --type styles --index 1_abc1_def2 # Get computed styles
- qqbrowser-skill browser_check_state --state visible --index 1_abc1_def2 # Check visibility
- qqbrowser-skill browser_check_state --state enabled --index 1_abc1_def2 # Check if enabled
- qqbrowser-skill browser_check_state --state checked --index 1_abc1_def2 # Check if checked
-
- # Find and Act (semantic locators)
- qqbrowser-skill browser_find_and_act --by role --value button --action click --name "Submit"
- qqbrowser-skill browser_find_and_act --by text --value "Sign In" --action click
- qqbrowser-skill browser_find_and_act --by label --value "Email" --action fill --actionValue "user@test.com"
- qqbrowser-skill browser_find_and_act --by placeholder --value "Search" --action type --actionValue "query"
- qqbrowser-skill browser_find_and_act --by testid --value "submit-btn" --action click
-
- # Download
- qqbrowser-skill browser_download_file --index 5_xyz3_uvw4 # Download file by clicking element
- qqbrowser-skill browser_download_url # Download from URL
-
- # Tab Management
- qqbrowser-skill browser_tab_open --url <url> # Open URL in new tab
- qqbrowser-skill browser_tab_list # List open tabs
- qqbrowser-skill browser_tab_switch --tabId 2 # Switch to tab
- qqbrowser-skill browser_tab_close --tabId 2 # Close tab
-
- # Dialog
- qqbrowser-skill browser_dialog --action accept # Accept dialog
- qqbrowser-skill browser_dialog --action dismiss # Dismiss dialog
- qqbrowser-skill browser_dialog --action accept --text "input text" # Accept prompt with text
-
- # JavaScript Evaluation
- qqbrowser-skill browser_eval_content_js --script "document.title" # Evaluate JS and return result
- qqbrowser-skill browser_eval_content_js --script "ZG9jdW1lbnQudGl0bGU=" --base64 # Evaluate base64-encoded JS script
-
- # Replay (execute a recorded automation script)
- qqbrowser-skill browser_replay --script /path/to/record.json # Replay from file
- qqbrowser-skill browser_replay --script_content '{"version":"1.0",...}' # Replay from JSON string
- qqbrowser-skill browser_replay --script /path/to/record.json --variables '{"user":"test@example.com","pass":"123"}' # With parameterized variables
-
- # Task Completion
- qqbrowser-skill browser_done --success --text "Task completed" # Mark task as done
- qqbrowser-skill browser_done --text "Still in progress" # Mark task as incomplete
192
-
193
- # Task Recording
194
- qqbrowser-skill task_begin --description "description of what you are doing" # Start recording commands
195
- qqbrowser-skill task_end # Stop recording and save
196
- qqbrowser-skill task_latest # Get the most recent task recording
197
-
198
- # Help
199
- qqbrowser-skill list # List all available skills
200
- qqbrowser-skill <skill_name> --help # Show help for a specific skill
201
-
202
- # Skill Check
203
- qqbrowser-skill status # Check skill status
175
+ browser_done --success --text "<msg>" # Mark task complete
176
+ status # Check skill status
177
+ list # List available skills
204
178
  ```
205
179
 
206
- ## Common Patterns
180
+ ---
181
+
182
+ ## ⚠️ MANDATORY: Replay-Friendly Execution Rules
183
+
184
+ **All operations MUST prioritize replayability. Follow these rules to ensure recorded steps can be replayed with parameterized variables.**
185
+
186
+ ### Rule 1: Task Classification
207
187
 
208
- ### Form Submission
188
+ Before executing, classify the task:
189
+
190
+ | Category | Replayable? | Strategy |
191
+ |----------|-------------|----------|
192
+ | **A: Fixed-path** (navigate, click, fill forms) | ✅ | Standard commands, all recorded |
193
+ | **B: Data extraction** | ⚠️ Partial | Use `browser_eval_content_js` (replayable), NOT `browser_snapshot --markdown` + AI summarization |
194
+ | **C: Content understanding** (summarize, compare) | ❌ | Separate from replayable steps, mark with `"requires_ai": true` |
195
+ | **D: Dynamic iteration** (pagination, load more) | ⚠️ | Use a fixed number of operations derived from a predictable calculation |
196
+
197
+ ### Rule 2: Data Extraction — Always Prefer JS
209
198
 
210
199
  ```bash
211
- qqbrowser-skill browser_go_to_url --url https://example.com/signup
212
- qqbrowser-skill browser_snapshot
213
- qqbrowser-skill browser_input_text --index 1_abc1_def2 --text "Jane Doe"
214
- qqbrowser-skill browser_input_text --index 2_xyz3_uvw4 --text "jane@example.com"
215
- qqbrowser-skill browser_select_dropdown_option --index 3_klm5_nop6 --text "California"
216
- qqbrowser-skill browser_check_op --index 4_pqr7_stu8 --value
217
- qqbrowser-skill browser_click_element --index 5_vwx9_yza0
218
- qqbrowser-skill browser_wait --seconds 2
219
- qqbrowser-skill browser_snapshot # Verify result
200
+ # Replayable: JS extraction
201
+ browser_eval_content_js --script "JSON.stringify(Array.from(document.querySelectorAll('.item')).slice(0,10).map(el=>({title:el.querySelector('.title')?.textContent?.trim()})))" --base64
202
+
203
+ # Non-replayable: AI reading
204
+ browser_snapshot --markdown # then AI summarizes
220
205
  ```
221
206
 
222
- ### Data Extraction
207
+ **Exception**: Tasks explicitly requiring AI understanding (e.g., "summarize the article's key points"): note in the `task_begin` description: "requires live AI participation; not purely replayable".
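The Rule 2 example passes `--base64`, and the earlier command reference pairs that flag with an already-encoded script (`ZG9jdW1lbnQudGl0bGU=` for `document.title`). A minimal Node.js sketch of that encoding step follows, assuming the flag expects standard base64 of the UTF-8 script text. The helper names `encodeScript` and `decodeScript` are hypothetical and not part of the skill.

```javascript
// Sketch only: assumes --base64 expects standard base64 of the UTF-8 script text.
// encodeScript / decodeScript are hypothetical helper names.
function encodeScript(script) {
  return Buffer.from(script, "utf8").toString("base64");
}

function decodeScript(encoded) {
  return Buffer.from(encoded, "base64").toString("utf8");
}

// The extraction script from the Rule 2 example, split only for readability.
const extraction =
  "JSON.stringify(Array.from(document.querySelectorAll('.item')).slice(0,10)" +
  ".map(el=>({title:el.querySelector('.title')?.textContent?.trim()})))";

console.log(encodeScript("document.title")); // ZG9jdW1lbnQudGl0bGU=
console.log(`browser_eval_content_js --script "${encodeScript(extraction)}" --base64`);
```

Encoding the script also sidesteps shell-quoting problems when the JS contains both single and double quotes.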
223
208
 
224
- Choose the right `browser_snapshot` mode based on intent:
209
+ ### Rule 3: Analyze First, Execute Once (No Trial-and-Error Recording)
225
210
 
226
- - **`browser_snapshot --markdown`**: preferred for **reading/summarizing** page content (articles, docs, search results, product pages). Returns clean Markdown with ads/nav/scripts stripped. Token-efficient, but contains **no element indices**, so it cannot be used to drive clicks or inputs.
227
- - **`browser_snapshot`** (default): required for **interacting** with the page. Returns indexed elements for use with `browser_click_element`, `browser_input_text`, etc.
211
+ **CRITICAL**: Do NOT use trial-and-error during a recorded task. Every command between `task_begin` and `task_end` is recorded; failed attempts pollute the recording and make it unreplayable.
228
212
 
213
+ **Correct workflow for data extraction:**
229
214
  ```bash
230
- qqbrowser-skill browser_go_to_url --url https://example.com/products
231
- qqbrowser-skill browser_snapshot --markdown # Read the whole page as clean markdown
215
+ task_begin --description "..."
216
+ browser_go_to_url --url "..."
217
+ browser_wait --seconds 3
218
+ browser_snapshot # ← AI analyzes page structure here (not recorded in replay)
219
+ # AI identifies the correct, stable selector BEFORE writing the JS script
220
+ browser_eval_content_js --script "/* one correct script */" # ← Only this gets replayed
221
+ task_end
222
+ ```
223
+
224
+ **PROHIBITED patterns:**
225
+ - ❌ Running multiple `browser_eval_content_js` with different selectors hoping one works
226
+ - ❌ Using `browser_get_info` to "check" results mid-recording then adjusting
227
+ - ❌ Hardcoding element index strings (e.g., `[id='25_tg4y_yb9z']`) inside JS scripts — index is for command params only, not JS selectors
228
+ - ❌ Using regex to match page text content as a selector strategy (fragile, locale-dependent)
229
+
230
+ **Rule**: Use `browser_snapshot` to understand the DOM, then write **one definitive** JS extraction script. If the first script fails, call `task_end` to discard, fix the script, then re-record with `task_begin`.
231
+
232
+ ### Rule 4: JS Selector Priority
232
233
 
233
- # Use default snapshot + get_info when you need to interact or read a specific element
234
- qqbrowser-skill browser_snapshot
235
- qqbrowser-skill browser_get_info --type text --index 5_abc1_def2 # Get a specific element's text
234
+ | Priority | Type | Example |
235
+ |----------|------|---------|
236
+ | 1 | `id` | `#rank-list` |
237
+ | 2 | `data-*` | `[data-testid="item"]` |
238
+ | 3 | ARIA | `[role="listitem"]` |
239
+ | 4 | Semantic class | `.article-title` |
240
+ | 5 | Structural path | `main > ul > li` |
241
+ | ❌ | Dynamic/hash class | `.css-1a2b3c` — NEVER use |
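The priority table can be read as a simple ranking over candidate selectors. The sketch below is one hedged way to apply it. The `type` labels, the `pickSelector` name, and the regular expression used to spot build-generated class names are all assumptions made for illustration, not rules taken from the skill.

```javascript
// Sketch only: candidate shape { type, selector } and all names here are hypothetical.
const PRIORITY = ["id", "data", "aria", "class", "path"];

// Heuristic (assumption): class names such as `.css-1a2b3c` are build-generated.
const DYNAMIC_CLASS = /\.(css|sc|jss)-[a-z0-9]{4,}/i;

function pickSelector(candidates) {
  const usable = candidates.filter(
    (c) => PRIORITY.includes(c.type) && !DYNAMIC_CLASS.test(c.selector)
  );
  // Lower index in PRIORITY means a more stable selector type.
  usable.sort((a, b) => PRIORITY.indexOf(a.type) - PRIORITY.indexOf(b.type));
  return usable.length > 0 ? usable[0].selector : null;
}

const chosen = pickSelector([
  { type: "class", selector: ".css-1a2b3c" },
  { type: "path", selector: "main > ul > li" },
  { type: "data", selector: '[data-testid="item"]' },
]);
console.log(chosen);
```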
242
+
243
+ ### Rule 5: Repeating Patterns & Pagination
244
+
245
+ **Identify repeating patterns during recording**. If you find yourself doing the same sequence of actions N times (e.g., click item → read → go back), this is a **loop pattern**. Record it correctly:
246
+
247
+ **Recording strategy for loop tasks:**
248
+ ```bash
249
+ task_begin --description "Search for {{topic}} and extract the content of the first {{count}} articles"
250
+ browser_go_to_url --url "https://example.com/search?q=AI"
251
+ browser_wait --seconds 3
252
+ browser_snapshot # AI analyzes list structure
253
+
254
+ # ★ STEP 0: DISCOVER LIST SELECTOR (MANDATORY before any list iteration)
255
+ # Use list_selector to auto-detect the CSS selector from any list item's snapshot index:
256
+ browser_get_info --type list_selector --index "3_abc1_def2"
257
+ # → {"success":true, "selector":".result-item h2 a", "count":10, "samples":[...]}
258
+ # Then record the probe with the discovered selector for playbook generation:
259
+ browser_eval_content_js --script "JSON.stringify({__list_probe__: true, selector: '.result-item h2 a', count: document.querySelectorAll('.result-item h2 a').length, samples: Array.from(document.querySelectorAll('.result-item h2 a')).slice(0,3).map(e=>e.textContent.trim())})"
260
+ # The __list_probe__ marker helps the playbook generator locate this data.
261
+
262
+ # Record ONE complete iteration as the pattern:
263
+ browser_find_and_act --by css --value ".result-item h2 a" --action click --nth 1
264
+ browser_wait --seconds 2
265
+ browser_eval_content_js --script "JSON.stringify({title: document.querySelector('h1')?.textContent, content: document.querySelector('.content')?.textContent?.substring(0,2000)})"
266
+ browser_go_back
267
+ browser_wait --seconds 1
268
+
269
+ # Record second iteration to confirm the pattern:
270
+ browser_find_and_act --by css --value ".result-item h2 a" --action click --nth 2
271
+ browser_wait --seconds 2
272
+ browser_eval_content_js --script "JSON.stringify({title: document.querySelector('h1')?.textContent, content: document.querySelector('.content')?.textContent?.substring(0,2000)})"
273
+ browser_go_back
274
+
275
+ task_end
236
276
  ```
237
277
 
238
- ### Infinite Scroll Pages
278
+ **Key points:**
279
+ - **★ MUST discover and probe list selector** before iterating — first use `browser_get_info --type list_selector --index <id>` to auto-detect the CSS selector, then use `browser_eval_content_js` with `__list_probe__: true` to record it. **NEVER guess CSS selectors.**
280
+ - Record **at least 2 iterations** so the playbook generator can identify the loop pattern
281
+ - Use `browser_find_and_act` with `--nth` (not `browser_click_element` with hardcoded index) for list items — this enables loop parameterization
282
+ - The playbook generator will automatically convert repetitions into a `loop` structure with `{{count}}` controlling iterations
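Because the probe script repeats the selector three times, it can help to generate it from the value that `list_selector` returned rather than retyping it. The following is a rough sketch under that assumption. `buildListProbe` is a hypothetical helper, and the stub `document` exists only so the generated script can be exercised outside a browser.

```javascript
// Sketch only: buildListProbe is a hypothetical helper, not part of the skill.
function buildListProbe(selector) {
  const sel = JSON.stringify(selector); // quoted and escaped for embedding in JS
  return (
    `JSON.stringify({__list_probe__: true, selector: ${sel}, ` +
    `count: document.querySelectorAll(${sel}).length, ` +
    `samples: Array.from(document.querySelectorAll(${sel})).slice(0,3)` +
    `.map(e=>e.textContent.trim())})`
  );
}

// Stub standing in for the page DOM so the script can run under Node.js.
const stubDocument = {
  querySelectorAll: () => [{ textContent: " First " }, { textContent: "Second" }],
};

const probeScript = buildListProbe(".result-item h2 a");
const probeResult = JSON.parse(
  new Function("document", "return " + probeScript)(stubDocument)
);
console.log(probeResult);
```

The generated script uses double quotes around the selector, so it would need shell escaping, or base64 encoding, before being passed to `--script`.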
283
+
284
+ **How to find the CSS selector (MANDATORY — use `list_selector`):**
285
+ ```
286
+ ★ PREFERRED (one command, auto-detect):
287
+ 1. browser_snapshot → AI sees list elements: [3_abc1_def2]<a 人工智能的未来/> ...
288
+ 2. browser_get_info --type list_selector --index "3_abc1_def2"
289
+ → returns: {"success":true, "selector":".ContentItem h2 a", "count":10, "samples":[...]}
290
+ 3. Use the returned selector → proceed with __list_probe__ and find_and_act
291
+
292
+ FALLBACK (if list_selector returns success=false):
293
+ 1. browser_get_info --type html --index "3_abc1_def2"
294
+ → returns: <a class="ContentItem-title" href="/p/123456">人工智能的未来</a>
295
+ 2. AI reads the HTML → identifies class, parent structure, etc.
296
+ 3. AI constructs CSS selector → ".ContentItem-title" or "a[data-za-detail-view-element_name='Title']"
297
+ 4. browser_eval_content_js → run __list_probe__ to verify the selector
298
+ ```
299
+ **NEVER guess CSS selectors.** Always use `list_selector` first, then fall back to `html` inspection if needed.
239
300
 
301
+ **For simple pagination** (next page button):
240
302
  ```bash
241
- qqbrowser-skill browser_go_to_url --url https://example.com/feed
242
- qqbrowser-skill browser_scroll_to_bottom # Trigger lazy loading
243
- qqbrowser-skill browser_wait --seconds 2 # Wait for content
244
- qqbrowser-skill browser_snapshot # Get updated content
303
+ browser_eval_content_js --script "/* extract page 1 */"
304
+ browser_find_and_act --by text --value "下一页" --action click   # "下一页" is the page's "Next page" button text
305
+ browser_wait --seconds 2
306
+ browser_eval_content_js --script "/* extract page 2 */"
245
307
  ```
246
308
 
247
- ## Element Index Lifecycle (Important)
309
+ **DO NOT** use AI judgment loops ("check if enough, scroll more").
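Instead of an AI judgment loop, the number of "next page" clicks can be computed up front, in line with the fixed-count strategy for category D in Rule 1. A minimal sketch, assuming the page size is known (for example from the `count` reported by the list probe). The function name `plannedNextClicks` is hypothetical.

```javascript
// Sketch only: plannedNextClicks is a hypothetical helper.
// Returns how many "next page" clicks are needed to see targetCount items.
function plannedNextClicks(targetCount, itemsPerPage) {
  if (!Number.isInteger(targetCount) || targetCount < 1) {
    throw new RangeError("targetCount must be a positive integer");
  }
  if (!Number.isInteger(itemsPerPage) || itemsPerPage < 1) {
    throw new RangeError("itemsPerPage must be a positive integer");
  }
  // Pages needed, minus the first page that is already loaded.
  return Math.ceil(targetCount / itemsPerPage) - 1;
}

console.log(plannedNextClicks(25, 10)); // 2
```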
248
310
 
249
- Element indices are encoded strings (e.g. `2_sfli_qp0u`) that contain hash information for cross-snapshot matching. They are generated by each snapshot and may change when the page DOM changes. Always re-snapshot after:
311
+ ### Rule 6: Multi-Tab Recording Strategy
250
312
 
251
- - Clicking links or buttons that navigate
252
- - Form submissions
253
- - Dynamic content loading (dropdowns, modals, AJAX)
313
+ When a task involves links that open in new tabs (e.g., `target="_blank"`), record using **explicit tab management commands** instead of relying on `browser_go_back`:
254
314
 
255
315
  ```bash
256
- qqbrowser-skill browser_click_element --index 5_abc1_def2 # May navigate to new page
257
- qqbrowser-skill browser_snapshot # MUST re-snapshot
258
- qqbrowser-skill browser_click_element --index 1_xyz3_uvw4 # Use new indices
316
+ task_begin --description "Search and extract articles (new-tab scenario)"
317
+ browser_go_to_url --url "https://example.com/search?q=AI"
318
+ browser_wait --seconds 3
319
+ browser_snapshot # AI analyzes list structure
320
+
321
+ # ★ DISCOVER + PROBE LIST SELECTOR (same as Rule 5 — MANDATORY before list iteration):
322
+ browser_get_info --type list_selector --index "3_abc1_def2" # auto-detect selector
323
+ browser_eval_content_js --script "JSON.stringify({__list_probe__: true, selector: '.result-item h2 a', count: document.querySelectorAll('.result-item h2 a').length, samples: Array.from(document.querySelectorAll('.result-item h2 a')).slice(0,3).map(e=>e.textContent.trim())})"
324
+
325
+ # Click link that opens in new tab:
326
+ browser_find_and_act --by css --value ".result-item h2 a" --action click --nth 1
327
+ browser_tab_list # AI checks: new tab appeared?
328
+ browser_tab_switch --tabId <new_tab_id> # Switch to the new tab
329
+ browser_wait --seconds 2
330
+ browser_eval_content_js --script "..." # Extract content in detail page
331
+ browser_tab_close --tabId <current_tab_id> # Close detail tab
332
+ browser_tab_switch --tabId <origin_tab_id> # Switch back to list page
333
+ browser_wait --seconds 1
334
+
335
+ # Repeat for second item...
336
+ browser_find_and_act --by css --value ".result-item h2 a" --action click --nth 2
337
+ # ... same pattern ...
338
+
339
+ task_end
259
340
  ```
260
341
 
261
- ## Evaluation Report
342
+ **Key points:**
343
+ - Use `browser_tab_list` to observe tab changes (this is AI-only, filtered from playbook)
344
+ - Record physical `tabId` during recording — the **playbook generator** will convert them to semantic references (`"origin"`, `"latest"`, `"current"`)
345
+ - **DO NOT use `browser_go_back`** when the previous action opened a new tab: `go_back` navigates within the current tab's history; it cannot close tabs or switch between them
346
+ - Record **at least 2 iterations** of the tab pattern (same as loop recording) so the playbook generator can detect and convert to `loop` + tab management
262
347
 
263
- See the full skill evaluation report: [QQBrowserSkillReport](https://bak.res.qq.com/nav/qqbrowser_skills/QQBrowserSkillReport.html)
348
+ **How to detect new tab during recording:**
349
+ 1. After clicking a link, call `browser_tab_list`
350
+ 2. If a new tab appeared (tab count increased), use `browser_tab_switch` to go to it
351
+ 3. If no new tab (same count), the link navigated in-place → use normal flow with `browser_go_back`
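The three detection steps amount to comparing the tab list before and after the click. The sketch below illustrates that comparison. The output format of `browser_tab_list` is not shown in the source, so the array-of-objects shape with a `tabId` field is an assumption, as is the helper name `findNewTabId`.

```javascript
// Sketch only: assumes browser_tab_list output has been parsed into
// an array of objects with a `tabId` field; that shape is an assumption.
function findNewTabId(tabsBefore, tabsAfter) {
  const known = new Set(tabsBefore.map((tab) => tab.tabId));
  const added = tabsAfter.filter((tab) => !known.has(tab.tabId));
  // null means the link navigated in place, so browser_go_back applies.
  return added.length > 0 ? added[added.length - 1].tabId : null;
}

const before = [{ tabId: 1 }];
const afterClick = [{ tabId: 1 }, { tabId: 2 }];
console.log(findNewTabId(before, afterClick)); // 2
console.log(findNewTabId(before, before)); // null
```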
264
352
 
265
- ## Task Recording (Important for AI Agents)
353
+ ### Rule 7: Fixed vs Variable Parameters
266
354
 
267
- **IMPORTANT**: When using qqbrowser-skill to complete any automation task, you MUST wrap your commands with `task_begin` and `task_end`. This records all browser operations performed during the task, enabling replay and reuse later.
355
+ | Command | Fixed | Variable (→ `{{param}}`) |
356
+ |---------|-------|--------------------------|
357
+ | `browser_go_to_url` | Base URL | Query params, path segments |
358
+ | `browser_click_element` | `--index` | — (never parameterize) |
359
+ | `browser_input_text` | `--index` | `--text` |
360
+ | `browser_select_dropdown_option` | `--index` | `--text` |
361
+ | `browser_keyboard_op` | `--action` | `--text` |
362
+ | `browser_eval_content_js` | Script structure | Count, selector, keyword in script |
363
+ | `browser_find_and_act` | `--by`, `--action` | `--value`, `--actionValue`, `--name` |
364
+ | `browser_dialog` | `--action` | `--text` |
365
+ | `browser_wait` / `browser_scroll_*` | All params | — |
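To make the fixed/variable split concrete, here is a hedged sketch of how `{{param}}` placeholders in a step's params might be filled from a variables object like the one passed to `browser_replay --variables`. The actual substitution logic of the replay engine is not shown in the source, so `applyVariables` is purely illustrative.

```javascript
// Sketch only: applyVariables is hypothetical; the real replay engine may differ.
function applyVariables(params, variables) {
  const resolved = {};
  for (const [key, value] of Object.entries(params)) {
    resolved[key] =
      typeof value === "string"
        ? value.replace(/\{\{(\w+)\}\}/g, (match, name) =>
            Object.prototype.hasOwnProperty.call(variables, name)
              ? String(variables[name])
              : match // leave unknown placeholders untouched
          )
        : value;
  }
  return resolved;
}

const step = { by: "text", value: "{{target_text}}", action: "click" };
console.log(applyVariables(step, { target_text: "Sign In" }));
```

Fixed parameters such as `by` and `action` pass through unchanged because they contain no placeholders.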
268
366
 
269
- ### Workflow
367
+ ### Rule 8: Non-Replayable Commands (AI-Decision Only)
270
368
 
271
- 1. **Before starting any browser automation task**, call `task_begin` with a description of what you're about to do
272
- 2. Execute your browser commands as normal
273
- 3. **After completing the task**, call `task_end` to save the recording
369
+ These commands help AI decide but produce **no replayable action**; they are skipped during replay:
274
370
 
275
- ### Commands
371
+ `browser_snapshot`, `browser_snapshot --markdown`, `browser_screenshot`, `browser_get_info`, `browser_check_state`, `browser_get_dropdown_options`, `browser_tab_list`
276
372
 
277
- ```bash
278
- # Start recording — call this BEFORE any browser commands for the task
279
- qqbrowser-skill task_begin --description "Automatically open Xiaohongshu and publish a note"
373
+ **Principle**: AI uses these to decide; the resulting **actions** (click, input, select) are what gets recorded and replayed.
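The principle above can be sketched as a filter over a recording. This assumes the recording is a JSON object with a `steps` array whose entries carry `action` and `params` fields, as the earlier recording-output description states, and that `action` holds the command name. The constant and function names are hypothetical.

```javascript
// Sketch only: assumes recording = { steps: [{ action, params }, ...] }
// and that `action` is the command name. Names here are hypothetical.
const AI_ONLY_ACTIONS = new Set([
  "browser_snapshot", // also covers `browser_snapshot --markdown`
  "browser_screenshot",
  "browser_get_info",
  "browser_check_state",
  "browser_get_dropdown_options",
  "browser_tab_list",
]);

function replayableSteps(recording) {
  return recording.steps.filter((step) => !AI_ONLY_ACTIONS.has(step.action));
}

const recording = {
  steps: [
    { action: "browser_go_to_url", params: { url: "https://example.com" } },
    { action: "browser_snapshot", params: {} },
    { action: "browser_click_element", params: { index: "5_abc1_def2" } },
  ],
};
console.log(replayableSteps(recording).map((step) => step.action));
```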
374
+
375
+ ### Rule 9: Prefer `browser_find_and_act` for Dynamic Content
280
376
 
281
- # ... execute browser commands normally ...
282
- qqbrowser-skill browser_go_to_url --url https://www.xiaohongshu.com
283
- qqbrowser-skill browser_snapshot
284
- qqbrowser-skill browser_click_element --index 5_abc1_def2
285
- # ... more commands ...
377
+ For elements in dynamic lists (search results, feeds) where index stability is uncertain:
286
378
 
287
- # Stop recording — call this AFTER all browser commands are done
288
- qqbrowser-skill task_end
379
+ ```bash
380
+ # ✅ Preferred: semantic locator (stable across page changes)
381
+ browser_find_and_act --by text --value "{{target_text}}" --action click
289
382
 
290
- # Retrieve the most recent completed task recording
291
- qqbrowser-skill task_latest
383
+ # ⚠️ Acceptable: index-based (for stable page structures only)
384
+ browser_click_element --index 15_abc1_def2
292
385
  ```
293
386
 
294
- ### Output
387
+ ---
388
+
389
+ ## Generating Replay Scripts / Playbooks
390
+
391
+ ### Auto-Generate Trigger
295
392
 
296
- The recorded commands are saved as a JSON file in `~/.qqbrowser-skill/records/` directory. The output format is compatible with the `browser_replay` command, containing a `steps` array where each step has `action` and `params` fields.
393
+ After every `task_end`, if the task description indicates it should be replayable (e.g., contains "可回放" ("replayable"), "replay", or is a standard operational task without AI summarization), the AI **MUST automatically**:
297
394
 
298
- ### Notes
395
+ 1. Load the `qqbrowser-playbook` skill
396
+ 2. Run `task_latest` to get the raw recording
397
+ 3. Generate a clean playbook (filtering out AI-only commands, parameterizing user data)
398
+ 4. Save to `~/.qqbrowser-skill/playbooks/<name>.json`
299
399
 
300
- - `task_begin` and `task_end` do not interact with the browser; they only manage the recording state
301
- - If `task_begin` is called while a previous recording is still active (not ended), the previous recording is discarded
302
- - All browser skill commands executed between `task_begin` and `task_end` are automatically recorded
303
- - The recorded file can be used as a reference for generating parameterized replay scripts
304
- - Use `task_latest` to retrieve the content of the most recently completed recording (useful for reviewing or processing the recorded steps)
400
+ **DO NOT** simply tell the user to run `browser_replay --script <raw_recording_path>`. Raw recordings contain trial-and-error steps and AI-only commands; they are NOT replay-ready.
305
401
 
306
- ### Generating Replay Scripts / Playbooks
402
+ ### Explicit Trigger
307
403
 
308
- **CRITICAL**: When the user asks to "save as replay script", "generate playbook", "保存成回放脚本", "生成回放脚本", or any similar request to create a reusable automation script:
404
+ When the user explicitly asks to "save as replay script" / "generate playbook" / "保存成回放脚本" / "生成回放脚本":
309
405
 
310
- 1. You **MUST** load and follow the `qqbrowser-playbook` skill instructions
311
- 2. You **MUST** use `task_latest` to get the recorded steps — do NOT fabricate steps
312
- 3. You **MUST** save the file to `~/.qqbrowser-skill/playbooks/<name>.json` (macOS/Linux) or `%LOCALAPPDATA%/qqbrowser-skill/playbooks/<name>.json` (Windows)
313
- 4. You **MUST** use the exact JSON format specified in the `qqbrowser-playbook` skill (with `metadata`, `params`, `settings`, `steps` structure)
406
+ Same workflow as above.
314
407
 
315
- **DO NOT**:
408
+ ### Rules
316
409
 
317
- - Save to any other directory (e.g., `~/.qbotclaw/workspace/` or project directories)
318
- - Use a flat/simplified JSON format; the format has specific nested structure requirements
319
- - Skip loading the playbook skill; it contains the authoritative format specification
410
+ 1. **MUST** load the `qqbrowser-playbook` skill for format specification
411
+ 2. **MUST** use `task_latest` to get recorded steps; do NOT fabricate
412
+ 3. **MUST** filter out all AI-only commands (`browser_snapshot`, `browser_screenshot`, `browser_get_info`, `browser_check_state`, etc.)
413
+ 4. **MUST** save to `~/.qqbrowser-skill/playbooks/<name>.json` (macOS/Linux) or `%LOCALAPPDATA%/qqbrowser-skill/playbooks/<name>.json` (Windows)
414
+ 5. **MUST** use the exact JSON format from the `qqbrowser-playbook` skill specification
415
+ 6. **MUST NOT** output the raw recording path as the "replay script" — it is source material, not a ready playbook
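The save location in rule 4 depends on the platform. A small sketch of resolving it follows, with the platform, environment, and home directory passed in explicitly so the logic is easy to check. The function name `playbookPath` is hypothetical, and forward slashes are kept on Windows to mirror how the path is written above.

```javascript
// Sketch only: playbookPath is a hypothetical helper that mirrors rule 4.
function playbookPath(name, platform, env, homeDir) {
  if (platform === "win32") {
    // %LOCALAPPDATA%/qqbrowser-skill/playbooks/<name>.json
    return `${env.LOCALAPPDATA}/qqbrowser-skill/playbooks/${name}.json`;
  }
  // ~/.qqbrowser-skill/playbooks/<name>.json on macOS/Linux
  return `${homeDir}/.qqbrowser-skill/playbooks/${name}.json`;
}

console.log(playbookPath("demo", "linux", {}, "/home/user"));
console.log(
  playbookPath("demo", "win32", { LOCALAPPDATA: "C:/Users/user/AppData/Local" }, "")
);
```

In a real Node.js script the arguments would typically come from `process.platform`, `process.env`, and `os.homedir()`.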