@yuzc-001/grasp 0.6.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (78)
  1. package/LICENSE +21 -0
  2. package/README.md +327 -0
  3. package/README.zh-CN.md +324 -0
  4. package/examples/README.md +31 -0
  5. package/examples/claude-desktop.json +8 -0
  6. package/examples/codex-config.toml +4 -0
  7. package/grasp.skill +0 -0
  8. package/index.js +87 -0
  9. package/package.json +48 -0
  10. package/scripts/grasp_openclaw_ctl.sh +122 -0
  11. package/scripts/run-search-benchmark.mjs +287 -0
  12. package/scripts/update-star-history.mjs +274 -0
  13. package/skill/SKILL.md +61 -0
  14. package/skill/references/tools.md +306 -0
  15. package/src/cli/auto-configure.js +116 -0
  16. package/src/cli/cmd-connect.js +148 -0
  17. package/src/cli/cmd-explain.js +42 -0
  18. package/src/cli/cmd-logs.js +55 -0
  19. package/src/cli/cmd-status.js +119 -0
  20. package/src/cli/config.js +27 -0
  21. package/src/cli/detect-chrome.js +58 -0
  22. package/src/grasp/handoff/events.js +67 -0
  23. package/src/grasp/handoff/persist.js +48 -0
  24. package/src/grasp/handoff/state.js +28 -0
  25. package/src/grasp/page/capture.js +34 -0
  26. package/src/grasp/page/state.js +273 -0
  27. package/src/grasp/verify/evidence.js +40 -0
  28. package/src/grasp/verify/pipeline.js +52 -0
  29. package/src/layer1-bridge/chrome.js +416 -0
  30. package/src/layer1-bridge/webmcp.js +143 -0
  31. package/src/layer2-perception/hints.js +284 -0
  32. package/src/layer3-action/actions.js +400 -0
  33. package/src/runtime/browser-instance.js +65 -0
  34. package/src/runtime/truth/model.js +94 -0
  35. package/src/runtime/truth/snapshot.js +51 -0
  36. package/src/server/affordances.js +47 -0
  37. package/src/server/audit.js +122 -0
  38. package/src/server/boss-fast-path.js +164 -0
  39. package/src/server/boundary-guard.js +53 -0
  40. package/src/server/content.js +97 -0
  41. package/src/server/continuity.js +256 -0
  42. package/src/server/engine-selection.js +29 -0
  43. package/src/server/entry-orchestrator.js +115 -0
  44. package/src/server/error-codes.js +7 -0
  45. package/src/server/explain-share-card.js +113 -0
  46. package/src/server/fast-path-router.js +134 -0
  47. package/src/server/form-runtime.js +602 -0
  48. package/src/server/form-tasks.js +254 -0
  49. package/src/server/gateway-response.js +62 -0
  50. package/src/server/index.js +22 -0
  51. package/src/server/observe.js +52 -0
  52. package/src/server/page-projection.js +31 -0
  53. package/src/server/page-state.js +27 -0
  54. package/src/server/postconditions.js +128 -0
  55. package/src/server/prompt-assembly.js +148 -0
  56. package/src/server/responses.js +44 -0
  57. package/src/server/route-boundary.js +174 -0
  58. package/src/server/route-policy.js +168 -0
  59. package/src/server/runtime-confirmation.js +87 -0
  60. package/src/server/runtime-status.js +7 -0
  61. package/src/server/share-artifacts.js +284 -0
  62. package/src/server/state.js +132 -0
  63. package/src/server/structured-extraction.js +131 -0
  64. package/src/server/surface-prompts.js +166 -0
  65. package/src/server/task-frame.js +11 -0
  66. package/src/server/tasks/search-task.js +321 -0
  67. package/src/server/tools.actions.js +1361 -0
  68. package/src/server/tools.form.js +526 -0
  69. package/src/server/tools.gateway.js +757 -0
  70. package/src/server/tools.handoff.js +210 -0
  71. package/src/server/tools.js +20 -0
  72. package/src/server/tools.legacy.js +983 -0
  73. package/src/server/tools.strategy.js +250 -0
  74. package/src/server/tools.task-surface.js +66 -0
  75. package/src/server/tools.workspace.js +873 -0
  76. package/src/server/workspace-runtime.js +1138 -0
  77. package/src/server/workspace-tasks.js +735 -0
  78. package/start-chrome.bat +84 -0
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Yuzc-001
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,327 @@
1
+ # Grasp
2
+
3
+ [English](./README.md) · [简体中文](./README.zh-CN.md) · [GitHub](https://github.com/Yuzc-001/grasp) · [Issues](https://github.com/Yuzc-001/grasp/issues)
4
+
5
+ [![Version](https://img.shields.io/badge/version-v0.6.6-0B1738?style=flat-square)](./CHANGELOG.md)
6
+ [![License](https://img.shields.io/badge/license-MIT-23C993?style=flat-square)](./LICENSE)
7
+ [![Validated](https://img.shields.io/badge/validated-Claude%20Code%20%7C%20Codex%20%7C%20Cursor-5B6CFF?style=flat-square)](./README.md#quickstart)
8
+ > **Grasp is a route-aware AI Browser Runtime for agents. One URL, one best path.**
9
+
10
+ Grasp runs locally, keeps a dedicated `chrome-grasp` profile, and gives agents a persistent, human-visible, recoverable web runtime instead of disposable tabs and one-off scripts. That dedicated profile is Grasp's runtime boundary, not "whatever local browser window the user happens to have open right now." The product promise in `v0.6.6` is simple: given a URL and an intent, Grasp should choose the best path first and keep that decision explainable. It requires confirmed runtime context before page-changing actions and continues on the same runtime path. It also surfaces the active route boundary directly in high-level tool responses, refuses high-level form/workspace actions when the current surface boundary does not match, and attaches a route/surface-aware prompt package that agents can actually execute against.
11
+
12
+ - Current package release: `v0.6.6`
13
+ - Start here: [Browser Runtime Landing](./docs/browser-runtime-landing.html)
14
+ - Public docs for the runtime surface: [docs/README.md](./docs/README.md)
15
+ - Release notes: [CHANGELOG.md](./CHANGELOG.md)
16
+
17
+ ---
18
+
19
+ ## Where the moat comes from
20
+
21
+ Anyone can open a page. Very few systems can keep real web work continuous, verifiable, and recoverable.
22
+
23
+ Grasp compounds around the parts that are hard to fake:
24
+
25
+ - `Continuity`: tasks survive login state, checkpoint pages, and context switching instead of restarting from scratch
26
+ - `Verification`: actions are checked against actual page changes instead of being treated as success by default
27
+ - `Recovery`: humans can step in and agents can resume in the same browser context with evidence
28
+
29
+ That is why Grasp is not just a browser automation wrapper. Over time, that is how a browser runtime becomes the operating layer agents rely on for real web work.
30
+
31
+ ## Route by Evidence
32
+
33
+ Users should not need to remember whether this URL belongs on a public reader, a live authenticated session, a workspace flow, a real form flow, or a handoff path.
34
+
35
+ That route choice is the product.
36
+
37
+ Public modes:
38
+
39
+ - `public_read`
40
+ - `live_session`
41
+ - `workspace_runtime`
42
+ - `form_runtime`
43
+ - `handoff`
44
+
45
+ Provider choice stays internal. Users and agents should reason about modes and evidence, not about which package or adapter happens to run underneath.
46
+
47
+ ## Proof of the runtime
48
+
49
+ ```text
50
+ entry(url, intent)
51
+ inspect()
52
+ request_handoff(...)
53
+ mark_handoff_done()
54
+ resume_after_handoff()
55
+ continue()
56
+ ```
57
+
58
+ If the same task can survive a human step, return to the same browser context, and continue from evidence instead of replaying from scratch, the product has crossed from browser wrapper into runtime.
59
+
60
+ What it does not claim:
61
+
62
+ - universal CAPTCHA bypass
63
+ - guaranteed full autonomy on every gated site
64
+ - evidence-free recovery
65
+ - that any one workflow defines the whole product
66
+
67
+ ---
68
+
69
+ ## Quickstart
70
+
71
+ ### 1. Bootstrap Grasp locally
72
+
73
+ ```bash
74
+ npx -y @yuzc-001/grasp
75
+ ```
76
+
77
+ This detects Chrome, launches the dedicated `chrome-grasp` profile, and helps you connect your AI client.
78
+
79
+ By default this connects Grasp's own CDP runtime. Unless you explicitly point it at a different CDP endpoint, it is not claiming control over an arbitrary browser session the user is currently viewing.
80
+
81
+ If you already have the CLI installed, `grasp connect` does the same local bootstrap step.
82
+
83
+ Bootstrap also establishes the remote-debugging/CDP connection Grasp needs. In the normal local path, users do not need to prepare that separately.
84
+
85
+ ### 2. Connect your client
86
+
87
+ Claude Code:
88
+
89
+ ```bash
90
+ claude mcp add grasp -- npx -y @yuzc-001/grasp
91
+ ```
92
+
93
+ Claude Desktop / Cursor:
94
+
95
+ ```json
96
+ {
97
+ "mcpServers": {
98
+ "grasp": {
99
+ "command": "npx",
100
+ "args": ["-y", "@yuzc-001/grasp"]
101
+ }
102
+ }
103
+ }
104
+ ```
105
+
106
+ Codex CLI:
107
+
108
+ ```toml
109
+ [mcp_servers.grasp]
110
+ type = "stdio"
111
+ command = "npx"
112
+ args = ["-y", "@yuzc-001/grasp"]
113
+ ```
114
+
115
+ ### 3. Get your first win
116
+
117
+ Tell your AI to:
118
+
119
+ 1. call `get_status`
120
+ 2. use `entry` on a real page with an intent such as `extract` or `workspace`
121
+ 3. call `inspect`, then `extract`, `extract_structured`, or `continue`
122
+ 4. call `explain_route` or run `grasp explain`
123
+
124
+ The first win is not just that Grasp opens a page. It is that the agent can choose a route, explain why, and stay inside the same runtime when the task gets real.
125
+
126
+ Reference: [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
127
+ Manual smoke playbook: [docs/reference/smoke-paths.md](./docs/reference/smoke-paths.md)
128
+
129
+ ---
130
+
131
+ ## Runtime Workflows
132
+
133
+ ### Real browsing first
134
+
135
+ Start from the real page and the real session whenever possible. Grasp should read and act on the current browser state before falling back to heavier observation or search-like shortcuts.
136
+
137
+ ### Public read
138
+
139
+ Use `entry(url, intent="extract")` -> `inspect` -> `extract` when the page is public and already readable.
140
+
141
+ What you get:
142
+
143
+ - route decision
144
+ - current page status
145
+ - readable content
146
+ - a suggested next action
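
The public-read chain above can be sketched as the MCP `tools/call` requests a client would send. This is a minimal sketch, not Grasp's own client code: the `toolCall` helper and the example URL are hypothetical, and it assumes `entry` takes `url` and `intent` as named arguments (as the `entry(url, intent="extract")` notation suggests) while `inspect` and `extract` are called without arguments. The actual schemas are in docs/reference/mcp-tools.md.

```javascript
// Hypothetical helper: wraps a tool name and arguments in an MCP
// JSON-RPC 2.0 "tools/call" request envelope.
function toolCall(id, name, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Public-read chain: entry -> inspect -> extract.
// The argument names `url` and `intent` are assumed from the README notation.
const publicReadRequests = [
  toolCall(1, "entry", { url: "https://example.com/article", intent: "extract" }),
  toolCall(2, "inspect"),
  toolCall(3, "extract"),
];

for (const request of publicReadRequests) {
  console.log(JSON.stringify(request));
}
```

In practice an MCP client library would build these envelopes for you; the point is only the order of the calls and that all three stay on the same runtime.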
147
+
148
+ ### Structured extraction
149
+
150
+ Use `extract_structured(fields=[...])` when you want the current page converted into a field-based record while staying on the same runtime path.
151
+
152
+ What you get:
153
+
154
+ - field-based `record` output
155
+ - `missing_fields` when the page does not expose a requested value clearly enough
156
+ - field evidence with the matched label and extraction strategy
157
+ - JSON export, plus optional Markdown export
158
+
159
+ Use `extract_batch(urls=[...], fields=[...])` when you want the same structured extraction contract applied across multiple URLs in sequence on the same runtime.
160
+
161
+ What you get:
162
+
163
+ - one structured `record` per visited URL
164
+ - exported `CSV` and `JSON` artifacts, plus optional Markdown bundle
165
+ - per-URL status when a page stays gated or needs handoff instead of pretending the scrape succeeded
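
A caller might triage batch output along the lines below. This is a sketch under stated assumptions: the result shape (an array of entries with `url`, `status`, `record`, and `missing_fields`) and the status values `ok` and `gated` are invented for illustration. Only the names `record` and `missing_fields` come from the description above.

```javascript
// Hypothetical batch result, shaped only for illustration.
const batchResult = [
  { url: "https://example.com/a", status: "ok", record: { title: "A", price: "10" }, missing_fields: [] },
  { url: "https://example.com/b", status: "ok", record: { title: "B" }, missing_fields: ["price"] },
  { url: "https://example.com/c", status: "gated", record: null, missing_fields: [] },
];

// Split entries into complete records, partial records, and pages that
// never produced a record (for example, gated pages that need handoff).
function triage(entries) {
  const complete = [];
  const partial = [];
  const blocked = [];
  for (const entry of entries) {
    if (!entry.record) blocked.push(entry.url);
    else if (entry.missing_fields.length > 0) partial.push(entry.url);
    else complete.push(entry.url);
  }
  return { complete, partial, blocked };
}

const summary = triage(batchResult);
console.log(summary);
```

Keeping blocked and partial pages in separate buckets mirrors the stated intent that a gated page should be reported as such rather than counted as a successful scrape.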
166
+
167
+ ### Share layer
168
+
169
+ Use `share_page(format="markdown" | "screenshot" | "pdf")` when the result needs to be forwarded to someone else without sending them the original inaccessible page link.
170
+
171
+ What you get:
172
+
173
+ - a shareable artifact written locally
174
+ - a clean share document generated from the current page projection instead of the raw page chrome
175
+ - the same runtime explanation path, so the artifact can still be traced back to the page and route that produced it
176
+
177
+ Use `explain_share_card()` when you want the human-facing share layout explained before exporting it. This uses a Pretext-backed text layout estimate when available, so the share layer can reason about title and summary density without touching the current page DOM.
178
+
179
+ ### Fast-path adapters
180
+
181
+ Site-specific fast reads no longer need to live inside the core router. `v0.6.3` keeps the built-in BOSS path as an adapter and lets you extend the same mechanism locally.
182
+
183
+ What is supported:
184
+
185
+ - drop `.js` adapters into `~/.grasp/site-adapters`
186
+ - or point `GRASP_SITE_ADAPTER_DIR` at a different adapter directory
187
+ - use a lightweight `.skill` file as a manifest with `entry:` or `adapter:` pointing at a `.js` adapter
188
+
189
+ A `.js` adapter only needs two capabilities:
190
+
191
+ - `matches(url)` or `match(url)`
192
+ - `read(page)`
193
+
194
+ The `.skill` file is only a local manifest that points at the adapter entry. It is not a separate runtime layer.
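
A minimal adapter sketch under the two-capability contract above. Everything beyond the names `matches` and `read` is an assumption: the host `jobs.example.com` is made up, `page` is assumed to expose Puppeteer/Playwright-style `url()` and `title()` methods, and the returned object shape is illustrative. How the module should export these functions is not stated in this README, so the sketch stops at a plain object.

```javascript
// Hypothetical adapter for a made-up site, jobs.example.com.
const exampleAdapter = {
  // Return true only for URLs this adapter knows how to read.
  matches(url) {
    try {
      return new URL(url).hostname === "jobs.example.com";
    } catch {
      return false; // not a parseable URL
    }
  },

  // `page` is assumed to expose Puppeteer/Playwright-style url() and title().
  // The returned shape is an illustration, not a documented contract.
  async read(page) {
    return { url: page.url(), title: await page.title() };
  },
};

// Quick check with a stand-in page object instead of a real browser page.
const fakePage = {
  url: () => "https://jobs.example.com/role/1",
  title: async () => "Example role",
};
exampleAdapter.read(fakePage).then((result) => console.log(result));
```

Matching on the parsed hostname rather than a substring avoids accidentally claiming URLs that merely mention the site in a path or query string.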
195
+
196
+ ### Live session
197
+
198
+ Use `entry(url, intent="act")` or `entry(url, intent="workspace")` when the task depends on the current browser session.
199
+
200
+ `entry` can now surface route evidence such as:
201
+
202
+ - selected mode
203
+ - confidence
204
+ - fallback chain
205
+ - whether a human is required
206
+
207
+ ### Handoff and resume
208
+
209
+ When a human step is required, keep the workflow continuous instead of pretending it is fully autonomous:
210
+
211
+ 1. `entry` or `continue` shows the page is gated
212
+ 2. `request_handoff` records the required human step
213
+ 3. `mark_handoff_done` marks the step complete
214
+ 4. `resume_after_handoff` reacquires the page with continuation evidence
215
+ 5. `continue` decides what should happen next
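
The numbered steps above can be read as a fixed sequence of tool calls wrapped around one human action. The sketch below models only that ordering. `nextHandoffTool` is a hypothetical helper and does not reflect how Grasp tracks handoff state internally.

```javascript
// Tool order taken from the numbered steps above, starting once
// `entry` or `continue` has reported a gated page.
const HANDOFF_SEQUENCE = [
  "request_handoff",      // record the required human step
  "mark_handoff_done",    // called after the human finishes
  "resume_after_handoff", // reacquire the page with continuation evidence
  "continue",             // decide what happens next
];

// Given the last tool that was called, return the next one in the
// sequence, or null when the sequence is finished or the tool is unknown.
function nextHandoffTool(lastTool) {
  if (lastTool === null) return HANDOFF_SEQUENCE[0];
  const index = HANDOFF_SEQUENCE.indexOf(lastTool);
  if (index === -1 || index === HANDOFF_SEQUENCE.length - 1) return null;
  return HANDOFF_SEQUENCE[index + 1];
}

let tool = nextHandoffTool(null);
while (tool !== null) {
  console.log(tool);
  tool = nextHandoffTool(tool);
}
```

An agent loop would pause between `request_handoff` and `mark_handoff_done` until the human step has actually happened; the helper only says which call comes next.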
216
+
217
+ Runtime story: [docs/product/browser-runtime-for-agents.md](./docs/product/browser-runtime-for-agents.md)
218
+
219
+ ---
220
+
221
+ ## Product Model
222
+
223
+ ### How the layers fit
224
+
225
+ The product is the route-aware Agent Web Runtime itself. `npx -y @yuzc-001/grasp` / `grasp connect` bootstrap it locally, MCP tools expose the public runtime surface, and the skill is the recommended task-facing layer on top of the same runtime.
226
+
227
+ For the canonical delivery-surface mapping, see [Browser Runtime for Agents](./docs/product/browser-runtime-for-agents.md).
228
+
229
+ ### Modes, not providers
230
+
231
+ Grasp keeps a single agent-facing interface. The core promise is not a collection of site integrations; it is that any real webpage can be entered, routed, and worked through the same task model.
232
+
233
+ The public surface should expose modes, not provider names:
234
+
235
+ - `public_read`
236
+ - `live_session`
237
+ - `workspace_runtime`
238
+ - `form_runtime`
239
+ - `handoff`
240
+
241
+ Provider and adapter choice stays internal. In this slice, `Runtime Engine` remains first-class and `Data Engine` remains a thin read seam for public-web extraction without claiming a fully delivered separate backend.
242
+
243
+ ---
244
+
245
+ ## Real Forms
246
+
247
+ When the page is a real form, use the specialized form surface:
248
+
249
+ `form_inspect` -> `fill_form` / `set_option` / `set_date` -> `verify_form` -> `safe_submit`
250
+
251
+ The default behavior is conservative:
252
+
253
+ - `fill_form` only writes safe fields
254
+ - `review` and `sensitive` fields stay visible so you can inspect them explicitly
255
+ - `safe_submit` starts with preview, so you can check blockers before any real submit
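
The conservative default can be pictured as a filter over field risk levels. The sketch below assumes a hypothetical field list in which each field carries a `risk` of `safe`, `review`, or `sensitive`; the real output of `form_inspect` may be shaped differently, and `planFill` is not a Grasp function.

```javascript
// Hypothetical inspected fields; the `name`/`risk` shape is assumed.
const inspectedFields = [
  { name: "nickname", risk: "safe" },
  { name: "bio", risk: "safe" },
  { name: "email", risk: "review" },
  { name: "id_number", risk: "sensitive" },
];

// Split fields into those an automatic fill may write and those that
// should stay visible for explicit human inspection.
function planFill(fields) {
  const writable = fields.filter((f) => f.risk === "safe").map((f) => f.name);
  const needsInspection = fields.filter((f) => f.risk !== "safe").map((f) => f.name);
  return { writable, needsInspection };
}

console.log(planFill(inspectedFields));
```

Treating anything that is not explicitly `safe` as needing inspection keeps the default on the cautious side if a new risk level is ever introduced.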
256
+
257
+ Form surface reference: [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
258
+
259
+ ---
260
+
261
+ ## Authenticated Workspaces
262
+
263
+ Use `workspace_inspect` to inspect a dynamic authenticated workspace and let it suggest the
264
+ next step. A typical loop is `workspace_inspect -> select_live_item -> workspace_inspect ->
265
+ draft_action -> workspace_inspect -> execute_action -> verify_outcome`. By default Grasp drafts
266
+ first, requires explicit confirmation for irreversible actions, and verifies that the workspace
267
+ really moved to the next state.
268
+
269
+ Workspace surface reference: [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
270
+
271
+ These workspace flows are examples of the browser runtime in use. BOSS is one example, and the same runtime direction also covers surfaces such as WeChat Official Accounts and Xiaohongshu without collapsing the whole product into any one workflow.
272
+
273
+ ### Basic parallel task state
274
+
275
+ Grasp does not promise a large scheduler today, but it is moving toward handling more than one task/session context without collapsing everything into one active browser assumption.
276
+
277
+ ---
278
+
279
+ ## Advanced Runtime Primitives
280
+
281
+ The runtime surface is the public default. The lower-level runtime is still available when you need tighter control.
282
+
283
+ Common advanced primitives:
284
+
285
+ - navigation and state: `navigate`, `get_status`, `get_page_summary`
286
+ - visible runtime tabs: `list_visible_tabs`, `select_visible_tab`
287
+ - interaction map: `get_hint_map`
288
+ - verified actions: `click`, `type`, `hover`, `press_key`, `scroll`
289
+ - observation: `watch_element`
290
+ - session strategy and handoff helpers: `preheat_session`, `navigate_with_strategy`, `session_trust_preflight`, `suggest_handoff`, `request_handoff_from_checkpoint`, `request_handoff`, `mark_handoff_in_progress`, `mark_handoff_done`, `resume_after_handoff`, `clear_handoff`
291
+
292
+ Full reference: [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
293
+
294
+ ---
295
+
296
+ ## CLI
297
+
298
+ | Command | Description |
299
+ |:---|:---|
300
+ | `grasp` / `grasp connect` | Set up the local browser runtime |
301
+ | `grasp status` | Show connection state, current tab, and recent activity |
302
+ | `grasp explain` | Explain the latest route decision |
303
+ | `grasp logs` | View audit log (`~/.grasp/audit.log`) |
304
+ | `grasp logs --lines 20` | Show the last 20 log lines |
305
+ | `grasp logs --follow` | Stream the audit log |
306
+
307
+ ## Docs
308
+
309
+ - [docs/README.md](./docs/README.md)
310
+ - [Browser Runtime Story](./docs/product/browser-runtime-for-agents.md)
311
+ - [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
312
+ - [docs/reference/smoke-paths.md](./docs/reference/smoke-paths.md)
313
+
314
+ ## Releases
315
+
316
+ - [CHANGELOG.md](./CHANGELOG.md)
318
+ - [docs/release-notes-v0.6.0.md](./docs/release-notes-v0.6.0.md)
319
+ - [docs/release-notes-v0.55.0.md](./docs/release-notes-v0.55.0.md)
320
+
321
+ ## License
322
+
323
+ MIT — see [LICENSE](./LICENSE).
324
+
325
+ ## Star History
326
+
327
+ [![Star History Chart](./star-history.svg)](https://www.star-history.com/#Yuzc-001/grasp&Date)
package/README.zh-CN.md ADDED
@@ -0,0 +1,324 @@
1
+ # Grasp
2
+
3
+ [English](./README.md) · [简体中文](./README.zh-CN.md) · [GitHub](https://github.com/Yuzc-001/grasp) · [Issues](https://github.com/Yuzc-001/grasp/issues)
4
+
5
+ [![Version](https://img.shields.io/badge/version-v0.6.6-0B1738?style=flat-square)](./CHANGELOG.md)
6
+ [![License](https://img.shields.io/badge/license-MIT-23C993?style=flat-square)](./LICENSE)
7
+ [![Validated](https://img.shields.io/badge/validated-Claude%20Code%20%7C%20Codex%20%7C%20Cursor-5B6CFF?style=flat-square)](./README.zh-CN.md#快速开始)
8
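+ > **Grasp is an AI browser runtime that chooses its route first. One URL, one best path.**
9
+
10
+ Grasp runs entirely locally and uses a dedicated `chrome-grasp` browser profile directory, giving agents a persistent, visible, recoverable web runtime instead of disposable tabs and single-site scripts. That dedicated profile is Grasp's runtime boundary; it is not "whatever local browser window you happen to be using right now." The core promise of `v0.6.6` is simple: given a URL and a task intent, Grasp first picks the most suitable path, keeps that decision explainable, confirms the runtime boundary before changing the page, and continues along the same runtime path. It also echoes the boundary of the current surface directly in high-level tool responses, refuses high-level form / workspace actions when the surface does not match, and attaches a prompt package assembled dynamically per route/surface to the response metadata.
11
+
12
+ - Current package version: `v0.6.6`
13
+ - Start with the landing page: [docs/browser-runtime-landing.html](./docs/browser-runtime-landing.html)
14
+ - Public docs entry point: [docs/README.md](./docs/README.md)
15
+ - Release notes: [CHANGELOG.md](./CHANGELOG.md)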
16
+
17
+ ---
18
+
19
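+ ## Where the moat comes from
20
+
21
+ Opening a web page is not hard. The hard part is keeping real web tasks continuous, verifiable, and recoverable.
22
+
23
+ Grasp builds up its capabilities around the three points that are hardest to fake:
24
+
25
+ - `Continuity`: tasks keep going across login state, checkpoints, and context switches instead of restarting from scratch
26
+ - `Verifiability`: actions are judged by real page state changes instead of being assumed to have succeeded
27
+ - `Handoff recovery`: a human can take over midway, and the agent can return to the same browser context with evidence and keep going
28
+
29
+ This is why Grasp is more than a browser automation wrapper. In the long run, this is the path by which a browser runtime grows into the operating layer for agents on the real web.
30
+
31
+ ## Route by Evidence
32
+
33
+ Users and agents should not have to remember which provider a given URL should go through. That is the product's job.
34
+
35
+ Only modes are exposed publicly, not providers: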
36
+
37
+ - `public_read`
38
+ - `live_session`
39
+ - `workspace_runtime`
40
+ - `form_runtime`
41
+ - `handoff`
42
+
43
+ Provider choice stays internal. What users see should be the path, the evidence, the risk, and the fallback.
44
+
45
+ ## Proof that the runtime holds
46
+
47
+ ```text
48
+ entry(url, intent)
49
+ inspect()
50
+ request_handoff(...)
51
+ mark_handoff_done()
52
+ resume_after_handoff()
53
+ continue()
54
+ ```
55
+
56
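+ If the same task can get past a human step, return to the same browser context, and continue from evidence instead of replaying from the start, it is no longer a browser wrapper but a runtime.
57
+
58
+ What it does not promise:
59
+
60
+ - universal CAPTCHA bypass
61
+ - fully automatic completion on every site with heavy risk controls
62
+ - judging a recovery successful without page evidence
63
+ - that any single workflow equals the whole product
64
+
65
+ ---
66
+
67
+ ## Quickstart
68
+
69
+ ### 1. Start Grasp locally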
70
+
71
+ ```bash
72
+ npx -y @yuzc-001/grasp
73
+ ```
74
+
75
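+ This detects Chrome, launches the dedicated `chrome-grasp` browser profile directory, and helps you connect the runtime to your AI client.
76
+
77
+ What gets connected here is Grasp's own CDP runtime. Unless you explicitly point it at a different CDP endpoint, it is not "whatever browser session the user is currently looking at."
78
+
79
+ If you already have the CLI installed, `grasp connect` performs the same local startup step.
80
+
81
+ Bootstrap also establishes the remote debugging / CDP connection Grasp needs; in the normal local path, users do not need to prepare this layer manually.
82
+
83
+ ### 2. Connect your client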
84
+
85
+ Claude Code:
86
+
87
+ ```bash
88
+ claude mcp add grasp -- npx -y @yuzc-001/grasp
89
+ ```
90
+
91
+ Claude Desktop / Cursor:
92
+
93
+ ```json
94
+ {
95
+ "mcpServers": {
96
+ "grasp": {
97
+ "command": "npx",
98
+ "args": ["-y", "@yuzc-001/grasp"]
99
+ }
100
+ }
101
+ }
102
+ ```
103
+
104
+ Codex CLI:
105
+
106
+ ```toml
107
+ [mcp_servers.grasp]
108
+ type = "stdio"
109
+ command = "npx"
110
+ args = ["-y", "@yuzc-001/grasp"]
111
+ ```
112
+
113
+ ### 3. Get your first real win
114
+
115
+ Have your AI do these four steps first:
116
+
117
+ 1. call `get_status`
118
+ 2. call `entry` on a real page with an `intent`
119
+ 3. call `inspect`, then go on to `extract`, `extract_structured`, or `continue`
120
+ 4. call `explain_route` or run `grasp explain`
121
+
122
+ The first win is not just that it can open a web page. It is that the agent can already choose a route, explain why, and stay in the same runtime to keep going when the task gets real.
123
+
124
+ Tool reference: [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
125
+ Manual smoke paths: [docs/reference/smoke-paths.md](./docs/reference/smoke-paths.md)
126
+
127
+ ---
128
+
129
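+ ## Runtime Workflows
130
+
131
+ ### Real browsing first
132
+
133
+ Whenever the real page and the real session are reachable, read and act from the current browser state first, rather than falling back to a heavier observation chain or a search-style substitute path.
134
+
135
+ ### Public read
136
+
137
+ When the page is already public and readable, use `entry(url, intent="extract")` -> `inspect` -> `extract`.
138
+
139
+ What you get:
140
+
141
+ - route decision
142
+ - current page status
143
+ - readable content
144
+ - a suggested next action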
145
+
146
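+ ### Structured extraction
147
+
148
+ Use `extract_structured(fields=[...])` when you want the current page turned directly into a field-based record while staying on the same runtime path.
149
+
150
+ What you get:
151
+
152
+ - a field-based `record`
153
+ - `missing_fields` for values the page did not clearly provide
154
+ - evidence of the matched label and extraction strategy for each matched field
155
+ - JSON export, plus optional Markdown export
156
+
157
+ Use `extract_batch(urls=[...], fields=[...])` when you want the same structured extraction run over a set of URLs in sequence.
158
+
159
+ What you get:
160
+
161
+ - one structured `record` per URL
162
+ - exported `CSV` and `JSON` artifacts, plus an optional Markdown summary
163
+ - the real status kept for blocked pages, instead of passing a failure off as a successful scrape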
164
+
165
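+ ### Share layer
166
+
167
+ Use `share_page(format="markdown" | "screenshot" | "pdf")` when the result needs to be forwarded to someone else and the original page link itself is not suitable for sharing directly.
168
+
169
+ What you get:
170
+
171
+ - a locally written shareable artifact
172
+ - a clean share document generated from the current page projection, instead of handing over the whole page with its original chrome
173
+ - traceability consistent with the runtime, leading back to the page and route explanation at the time
174
+
175
+ Use `explain_share_card()` when you want to understand how the share card will be laid out before exporting. When available, this layer uses Pretext for text layout estimation, so it can explain title and summary density without touching the current page DOM.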
176
+
177
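+ ### Fast-path site adapters
178
+
179
+ Site-specific fast-read logic no longer needs to stay hard-coded in the core router. In `v0.6.3` the built-in BOSS path has been folded into an adapter, and you can extend the same mechanism locally.
180
+
181
+ Currently supported:
182
+
183
+ - drop a `.js` adapter directly into `~/.grasp/site-adapters`
184
+ - or point `GRASP_SITE_ADAPTER_DIR` at a different adapter directory
185
+ - use a lightweight `.skill` file as an entry manifest, with `entry:` or `adapter:` pointing at the corresponding `.js` adapter
186
+
187
+ A `.js` adapter needs only two things:
188
+
189
+ - `matches(url)` or `match(url)`
190
+ - `read(page)`
191
+
192
+ Here the `.skill` file is only a local entry manifest, not a new runtime layer.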
193
+
194
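+ ### Live session
195
+
196
+ When the task depends on the current login state, a real workspace, or a form flow, use `entry(url, intent="act" | "workspace" | "submit")` to determine the route first.
197
+
198
+ `entry` now returns evidence such as:
199
+
200
+ - which mode was selected
201
+ - the confidence level
202
+ - the fallback chain
203
+ - whether a human handoff is needed
204
+
205
+ ### Handoff and resume
206
+
207
+ When a flow needs a person to step in, do not pretend the system is fully automatic; make the human step part of a continuous workflow:
208
+
209
+ 1. `entry` or `continue` finds that the page is blocked
210
+ 2. `request_handoff` records the human step
211
+ 3. `mark_handoff_done` marks the human step complete
212
+ 4. `resume_after_handoff` reattaches to the page with continuity evidence
213
+ 5. `continue` decides whether to proceed, wait, or hand off again
214
+
215
+ Runtime notes: [docs/product/browser-runtime-for-agents.md](./docs/product/browser-runtime-for-agents.md)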
216
+
217
+ ---
218
+
219
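+ ## Product Model
220
+
221
+ ### How the three layers relate
222
+
223
+ The product itself is the route-aware Agent Web Runtime. `npx -y @yuzc-001/grasp` / `grasp connect` start it locally, the MCP tools are its public runtime interface, and the skill is the recommended task layer built on the same runtime.
224
+
225
+ The CLI, MCP, and the skill are all just delivery surfaces of the same runtime, not independent product definitions.
226
+
227
+ ### Modes, not providers
228
+
229
+ Grasp keeps a single interface for agents. Its core promise is not "many adapters for many websites" but "any real web page can enter the same routing and task model."
230
+
231
+ What is public is the mode, not the provider name: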
232
+
233
+ - `public_read`
234
+ - `live_session`
235
+ - `workspace_runtime`
236
+ - `form_runtime`
237
+ - `handoff`
238
+
239
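+ Provider and adapter choice stays internal. In this slice, `Runtime Engine` remains a first-class capability, and `Data Engine` remains only a thin read side for public web reading, without being overstated as a fully delivered separate backend.
240
+
241
+ ---
242
+
243
+ ## Real Forms
244
+
245
+ When the page is a real form, prefer the dedicated form runtime surface:
246
+
247
+ `form_inspect` -> `fill_form` / `set_option` / `set_date` -> `verify_form` -> `safe_submit`
248
+
249
+ The default behavior is conservative:
250
+
251
+ - `fill_form` only writes `safe` fields
252
+ - `review` and `sensitive` fields are left visible so they can be inspected explicitly
253
+ - `safe_submit` goes through preview first by default, so you can check blockers before deciding whether to really submit
254
+
255
+ Form surface reference: [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)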
256
+
257
+ ---
258
+
259
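+ ## Authenticated Workspaces
260
+
261
+ When the current page is a dynamic authenticated workspace, start with `workspace_inspect` to see the current state and the suggested next step.
262
+ A typical loop is `workspace_inspect -> select_live_item -> workspace_inspect -> draft_action ->
263
+ workspace_inspect -> execute_action -> verify_outcome`. By default Grasp drafts the content first,
264
+ requires explicit confirmation for irreversible actions, and verifies that the workspace really moved to the next state.
265
+
266
+ Workspace surface reference: [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
267
+
268
+ These workspace flows are only examples of this browser runtime. BOSS is one example, and WeChat Official Accounts and Xiaohongshu are examples of the same kind, but none of them defines the product boundary.
269
+
270
+ ### Basic multi-task state
271
+
272
+ Grasp does not currently promise a complex scheduler, but it will keep moving toward holding multiple task/session contexts at once, rather than squeezing every flow into a single active-browser assumption.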
273
+
274
+ ---
275
+
276
+ ## Advanced Runtime Primitives
277
+
278
+ The high-level runtime surface is the default entry point; the lower-level runtime primitives remain available when you need finer-grained control.
279
+
280
+ Common advanced primitives:
281
+
282
+ - navigation and state: `navigate`, `get_status`, `get_page_summary`
283
+ - visible runtime tabs: `list_visible_tabs`, `select_visible_tab`
284
+ - interaction map: `get_hint_map`
285
+ - verifiable actions: `click`, `type`, `hover`, `press_key`, `scroll`
286
+ - observation: `watch_element`
287
+ - session strategy and handoff helpers: `preheat_session`, `navigate_with_strategy`, `session_trust_preflight`, `suggest_handoff`, `request_handoff_from_checkpoint`, `request_handoff`, `mark_handoff_in_progress`, `mark_handoff_done`, `resume_after_handoff`, `clear_handoff`
288
+
289
+ Full reference: [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
290
+
291
+ ---
292
+
293
+ ## CLI
294
+
295
+ | Command | Description |
296
+ |:---|:---|
297
+ | `grasp` / `grasp connect` | Initialize the local browser runtime |
298
+ | `grasp status` | Show connection state, current tab, and recent activity |
299
+ | `grasp explain` | Explain the most recent route decision |
300
+ | `grasp logs` | View the audit log (`~/.grasp/audit.log`) |
301
+ | `grasp logs --lines 20` | Show the last 20 log lines |
302
+ | `grasp logs --follow` | Follow the log in real time |
303
+
304
+ ## Docs
305
+
306
+ - [docs/README.md](./docs/README.md)
307
+ - [Browser runtime notes](./docs/product/browser-runtime-for-agents.md)
308
+ - [docs/reference/mcp-tools.md](./docs/reference/mcp-tools.md)
309
+ - [docs/reference/smoke-paths.md](./docs/reference/smoke-paths.md)
310
+
311
+ ## Releases
312
+
313
+ - [CHANGELOG.md](./CHANGELOG.md)
315
+ - [docs/release-notes-v0.6.0.md](./docs/release-notes-v0.6.0.md)
316
+ - [docs/release-notes-v0.55.0.md](./docs/release-notes-v0.55.0.md)
317
+
318
+ ## License
319
+
320
+ MIT: see [LICENSE](./LICENSE).
321
+
322
+ ## Star History
323
+
324
+ [![Star History Chart](./star-history.svg)](https://www.star-history.com/#Yuzc-001/grasp&Date)