@predicatelabs/sdk 0.99.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (302)
  1. package/LICENSE +24 -0
  2. package/README.md +252 -0
  3. package/dist/actions.d.ts +185 -0
  4. package/dist/actions.d.ts.map +1 -0
  5. package/dist/actions.js +1120 -0
  6. package/dist/actions.js.map +1 -0
  7. package/dist/agent-runtime.d.ts +352 -0
  8. package/dist/agent-runtime.d.ts.map +1 -0
  9. package/dist/agent-runtime.js +1170 -0
  10. package/dist/agent-runtime.js.map +1 -0
  11. package/dist/agent.d.ts +164 -0
  12. package/dist/agent.d.ts.map +1 -0
  13. package/dist/agent.js +408 -0
  14. package/dist/agent.js.map +1 -0
  15. package/dist/asserts/expect.d.ts +159 -0
  16. package/dist/asserts/expect.d.ts.map +1 -0
  17. package/dist/asserts/expect.js +547 -0
  18. package/dist/asserts/expect.js.map +1 -0
  19. package/dist/asserts/index.d.ts +58 -0
  20. package/dist/asserts/index.d.ts.map +1 -0
  21. package/dist/asserts/index.js +70 -0
  22. package/dist/asserts/index.js.map +1 -0
  23. package/dist/asserts/query.d.ts +199 -0
  24. package/dist/asserts/query.d.ts.map +1 -0
  25. package/dist/asserts/query.js +288 -0
  26. package/dist/asserts/query.js.map +1 -0
  27. package/dist/backends/actions.d.ts +119 -0
  28. package/dist/backends/actions.d.ts.map +1 -0
  29. package/dist/backends/actions.js +291 -0
  30. package/dist/backends/actions.js.map +1 -0
  31. package/dist/backends/browser-use-adapter.d.ts +131 -0
  32. package/dist/backends/browser-use-adapter.d.ts.map +1 -0
  33. package/dist/backends/browser-use-adapter.js +219 -0
  34. package/dist/backends/browser-use-adapter.js.map +1 -0
  35. package/dist/backends/cdp-backend.d.ts +66 -0
  36. package/dist/backends/cdp-backend.d.ts.map +1 -0
  37. package/dist/backends/cdp-backend.js +273 -0
  38. package/dist/backends/cdp-backend.js.map +1 -0
  39. package/dist/backends/index.d.ts +80 -0
  40. package/dist/backends/index.d.ts.map +1 -0
  41. package/dist/backends/index.js +101 -0
  42. package/dist/backends/index.js.map +1 -0
  43. package/dist/backends/protocol.d.ts +156 -0
  44. package/dist/backends/protocol.d.ts.map +1 -0
  45. package/dist/backends/protocol.js +16 -0
  46. package/dist/backends/protocol.js.map +1 -0
  47. package/dist/backends/sentience-context.d.ts +143 -0
  48. package/dist/backends/sentience-context.d.ts.map +1 -0
  49. package/dist/backends/sentience-context.js +359 -0
  50. package/dist/backends/sentience-context.js.map +1 -0
  51. package/dist/backends/snapshot.d.ts +188 -0
  52. package/dist/backends/snapshot.d.ts.map +1 -0
  53. package/dist/backends/snapshot.js +360 -0
  54. package/dist/backends/snapshot.js.map +1 -0
  55. package/dist/browser.d.ts +154 -0
  56. package/dist/browser.d.ts.map +1 -0
  57. package/dist/browser.js +920 -0
  58. package/dist/browser.js.map +1 -0
  59. package/dist/canonicalization.d.ts +126 -0
  60. package/dist/canonicalization.d.ts.map +1 -0
  61. package/dist/canonicalization.js +161 -0
  62. package/dist/canonicalization.js.map +1 -0
  63. package/dist/captcha/strategies.d.ts +12 -0
  64. package/dist/captcha/strategies.d.ts.map +1 -0
  65. package/dist/captcha/strategies.js +43 -0
  66. package/dist/captcha/strategies.js.map +1 -0
  67. package/dist/captcha/types.d.ts +45 -0
  68. package/dist/captcha/types.d.ts.map +1 -0
  69. package/dist/captcha/types.js +12 -0
  70. package/dist/captcha/types.js.map +1 -0
  71. package/dist/cli.d.ts +5 -0
  72. package/dist/cli.d.ts.map +1 -0
  73. package/dist/cli.js +422 -0
  74. package/dist/cli.js.map +1 -0
  75. package/dist/conversational-agent.d.ts +123 -0
  76. package/dist/conversational-agent.d.ts.map +1 -0
  77. package/dist/conversational-agent.js +341 -0
  78. package/dist/conversational-agent.js.map +1 -0
  79. package/dist/cursor-policy.d.ts +41 -0
  80. package/dist/cursor-policy.d.ts.map +1 -0
  81. package/dist/cursor-policy.js +81 -0
  82. package/dist/cursor-policy.js.map +1 -0
  83. package/dist/debugger.d.ts +28 -0
  84. package/dist/debugger.d.ts.map +1 -0
  85. package/dist/debugger.js +107 -0
  86. package/dist/debugger.js.map +1 -0
  87. package/dist/expect.d.ts +16 -0
  88. package/dist/expect.d.ts.map +1 -0
  89. package/dist/expect.js +67 -0
  90. package/dist/expect.js.map +1 -0
  91. package/dist/failure-artifacts.d.ts +95 -0
  92. package/dist/failure-artifacts.d.ts.map +1 -0
  93. package/dist/failure-artifacts.js +805 -0
  94. package/dist/failure-artifacts.js.map +1 -0
  95. package/dist/generator.d.ts +16 -0
  96. package/dist/generator.d.ts.map +1 -0
  97. package/dist/generator.js +205 -0
  98. package/dist/generator.js.map +1 -0
  99. package/dist/index.d.ts +37 -0
  100. package/dist/index.d.ts.map +1 -0
  101. package/dist/index.js +160 -0
  102. package/dist/index.js.map +1 -0
  103. package/dist/inspector.d.ts +13 -0
  104. package/dist/inspector.d.ts.map +1 -0
  105. package/dist/inspector.js +153 -0
  106. package/dist/inspector.js.map +1 -0
  107. package/dist/llm-provider.d.ts +144 -0
  108. package/dist/llm-provider.d.ts.map +1 -0
  109. package/dist/llm-provider.js +460 -0
  110. package/dist/llm-provider.js.map +1 -0
  111. package/dist/ordinal.d.ts +90 -0
  112. package/dist/ordinal.d.ts.map +1 -0
  113. package/dist/ordinal.js +249 -0
  114. package/dist/ordinal.js.map +1 -0
  115. package/dist/overlay.d.ts +63 -0
  116. package/dist/overlay.d.ts.map +1 -0
  117. package/dist/overlay.js +102 -0
  118. package/dist/overlay.js.map +1 -0
  119. package/dist/protocols/browser-protocol.d.ts +79 -0
  120. package/dist/protocols/browser-protocol.d.ts.map +1 -0
  121. package/dist/protocols/browser-protocol.js +9 -0
  122. package/dist/protocols/browser-protocol.js.map +1 -0
  123. package/dist/query.d.ts +66 -0
  124. package/dist/query.d.ts.map +1 -0
  125. package/dist/query.js +482 -0
  126. package/dist/query.js.map +1 -0
  127. package/dist/read.d.ts +47 -0
  128. package/dist/read.d.ts.map +1 -0
  129. package/dist/read.js +128 -0
  130. package/dist/read.js.map +1 -0
  131. package/dist/recorder.d.ts +44 -0
  132. package/dist/recorder.d.ts.map +1 -0
  133. package/dist/recorder.js +262 -0
  134. package/dist/recorder.js.map +1 -0
  135. package/dist/runtime-agent.d.ts +72 -0
  136. package/dist/runtime-agent.d.ts.map +1 -0
  137. package/dist/runtime-agent.js +357 -0
  138. package/dist/runtime-agent.js.map +1 -0
  139. package/dist/screenshot.d.ts +17 -0
  140. package/dist/screenshot.d.ts.map +1 -0
  141. package/dist/screenshot.js +40 -0
  142. package/dist/screenshot.js.map +1 -0
  143. package/dist/snapshot-diff.d.ts +23 -0
  144. package/dist/snapshot-diff.d.ts.map +1 -0
  145. package/dist/snapshot-diff.js +119 -0
  146. package/dist/snapshot-diff.js.map +1 -0
  147. package/dist/snapshot.d.ts +47 -0
  148. package/dist/snapshot.d.ts.map +1 -0
  149. package/dist/snapshot.js +358 -0
  150. package/dist/snapshot.js.map +1 -0
  151. package/dist/textSearch.d.ts +64 -0
  152. package/dist/textSearch.d.ts.map +1 -0
  153. package/dist/textSearch.js +113 -0
  154. package/dist/textSearch.js.map +1 -0
  155. package/dist/tools/context.d.ts +18 -0
  156. package/dist/tools/context.d.ts.map +1 -0
  157. package/dist/tools/context.js +40 -0
  158. package/dist/tools/context.js.map +1 -0
  159. package/dist/tools/defaults.d.ts +5 -0
  160. package/dist/tools/defaults.d.ts.map +1 -0
  161. package/dist/tools/defaults.js +368 -0
  162. package/dist/tools/defaults.js.map +1 -0
  163. package/dist/tools/filesystem.d.ts +12 -0
  164. package/dist/tools/filesystem.d.ts.map +1 -0
  165. package/dist/tools/filesystem.js +137 -0
  166. package/dist/tools/filesystem.js.map +1 -0
  167. package/dist/tools/index.d.ts +5 -0
  168. package/dist/tools/index.d.ts.map +1 -0
  169. package/dist/tools/index.js +15 -0
  170. package/dist/tools/index.js.map +1 -0
  171. package/dist/tools/registry.d.ts +38 -0
  172. package/dist/tools/registry.d.ts.map +1 -0
  173. package/dist/tools/registry.js +100 -0
  174. package/dist/tools/registry.js.map +1 -0
  175. package/dist/tracing/cloud-sink.d.ts +189 -0
  176. package/dist/tracing/cloud-sink.d.ts.map +1 -0
  177. package/dist/tracing/cloud-sink.js +1067 -0
  178. package/dist/tracing/cloud-sink.js.map +1 -0
  179. package/dist/tracing/index-schema.d.ts +231 -0
  180. package/dist/tracing/index-schema.d.ts.map +1 -0
  181. package/dist/tracing/index-schema.js +235 -0
  182. package/dist/tracing/index-schema.js.map +1 -0
  183. package/dist/tracing/index.d.ts +12 -0
  184. package/dist/tracing/index.d.ts.map +1 -0
  185. package/dist/tracing/index.js +28 -0
  186. package/dist/tracing/index.js.map +1 -0
  187. package/dist/tracing/indexer.d.ts +20 -0
  188. package/dist/tracing/indexer.d.ts.map +1 -0
  189. package/dist/tracing/indexer.js +347 -0
  190. package/dist/tracing/indexer.js.map +1 -0
  191. package/dist/tracing/jsonl-sink.d.ts +51 -0
  192. package/dist/tracing/jsonl-sink.d.ts.map +1 -0
  193. package/dist/tracing/jsonl-sink.js +329 -0
  194. package/dist/tracing/jsonl-sink.js.map +1 -0
  195. package/dist/tracing/sink.d.ts +25 -0
  196. package/dist/tracing/sink.d.ts.map +1 -0
  197. package/dist/tracing/sink.js +15 -0
  198. package/dist/tracing/sink.js.map +1 -0
  199. package/dist/tracing/tracer-factory.d.ts +102 -0
  200. package/dist/tracing/tracer-factory.d.ts.map +1 -0
  201. package/dist/tracing/tracer-factory.js +375 -0
  202. package/dist/tracing/tracer-factory.js.map +1 -0
  203. package/dist/tracing/tracer.d.ts +140 -0
  204. package/dist/tracing/tracer.d.ts.map +1 -0
  205. package/dist/tracing/tracer.js +336 -0
  206. package/dist/tracing/tracer.js.map +1 -0
  207. package/dist/tracing/types.d.ts +203 -0
  208. package/dist/tracing/types.d.ts.map +1 -0
  209. package/dist/tracing/types.js +8 -0
  210. package/dist/tracing/types.js.map +1 -0
  211. package/dist/types.d.ts +422 -0
  212. package/dist/types.d.ts.map +1 -0
  213. package/dist/types.js +6 -0
  214. package/dist/types.js.map +1 -0
  215. package/dist/utils/action-executor.d.ts +25 -0
  216. package/dist/utils/action-executor.d.ts.map +1 -0
  217. package/dist/utils/action-executor.js +121 -0
  218. package/dist/utils/action-executor.js.map +1 -0
  219. package/dist/utils/browser-evaluator.d.ts +76 -0
  220. package/dist/utils/browser-evaluator.d.ts.map +1 -0
  221. package/dist/utils/browser-evaluator.js +130 -0
  222. package/dist/utils/browser-evaluator.js.map +1 -0
  223. package/dist/utils/browser.d.ts +30 -0
  224. package/dist/utils/browser.d.ts.map +1 -0
  225. package/dist/utils/browser.js +75 -0
  226. package/dist/utils/browser.js.map +1 -0
  227. package/dist/utils/element-filter.d.ts +76 -0
  228. package/dist/utils/element-filter.d.ts.map +1 -0
  229. package/dist/utils/element-filter.js +195 -0
  230. package/dist/utils/element-filter.js.map +1 -0
  231. package/dist/utils/grid-utils.d.ts +37 -0
  232. package/dist/utils/grid-utils.d.ts.map +1 -0
  233. package/dist/utils/grid-utils.js +283 -0
  234. package/dist/utils/grid-utils.js.map +1 -0
  235. package/dist/utils/llm-interaction-handler.d.ts +41 -0
  236. package/dist/utils/llm-interaction-handler.d.ts.map +1 -0
  237. package/dist/utils/llm-interaction-handler.js +171 -0
  238. package/dist/utils/llm-interaction-handler.js.map +1 -0
  239. package/dist/utils/llm-response-builder.d.ts +56 -0
  240. package/dist/utils/llm-response-builder.d.ts.map +1 -0
  241. package/dist/utils/llm-response-builder.js +130 -0
  242. package/dist/utils/llm-response-builder.js.map +1 -0
  243. package/dist/utils/selector-utils.d.ts +12 -0
  244. package/dist/utils/selector-utils.d.ts.map +1 -0
  245. package/dist/utils/selector-utils.js +32 -0
  246. package/dist/utils/selector-utils.js.map +1 -0
  247. package/dist/utils/snapshot-event-builder.d.ts +28 -0
  248. package/dist/utils/snapshot-event-builder.d.ts.map +1 -0
  249. package/dist/utils/snapshot-event-builder.js +88 -0
  250. package/dist/utils/snapshot-event-builder.js.map +1 -0
  251. package/dist/utils/snapshot-processor.d.ts +27 -0
  252. package/dist/utils/snapshot-processor.d.ts.map +1 -0
  253. package/dist/utils/snapshot-processor.js +47 -0
  254. package/dist/utils/snapshot-processor.js.map +1 -0
  255. package/dist/utils/trace-event-builder.d.ts +122 -0
  256. package/dist/utils/trace-event-builder.d.ts.map +1 -0
  257. package/dist/utils/trace-event-builder.js +365 -0
  258. package/dist/utils/trace-event-builder.js.map +1 -0
  259. package/dist/utils/trace-file-manager.d.ts +70 -0
  260. package/dist/utils/trace-file-manager.d.ts.map +1 -0
  261. package/dist/utils/trace-file-manager.js +194 -0
  262. package/dist/utils/trace-file-manager.js.map +1 -0
  263. package/dist/utils/zod.d.ts +5 -0
  264. package/dist/utils/zod.d.ts.map +1 -0
  265. package/dist/utils/zod.js +80 -0
  266. package/dist/utils/zod.js.map +1 -0
  267. package/dist/utils.d.ts +8 -0
  268. package/dist/utils.d.ts.map +1 -0
  269. package/dist/utils.js +13 -0
  270. package/dist/utils.js.map +1 -0
  271. package/dist/verification.d.ts +194 -0
  272. package/dist/verification.d.ts.map +1 -0
  273. package/dist/verification.js +530 -0
  274. package/dist/verification.js.map +1 -0
  275. package/dist/vision-executor.d.ts +18 -0
  276. package/dist/vision-executor.d.ts.map +1 -0
  277. package/dist/vision-executor.js +60 -0
  278. package/dist/vision-executor.js.map +1 -0
  279. package/dist/visual-agent.d.ts +120 -0
  280. package/dist/visual-agent.d.ts.map +1 -0
  281. package/dist/visual-agent.js +796 -0
  282. package/dist/visual-agent.js.map +1 -0
  283. package/dist/wait.d.ts +35 -0
  284. package/dist/wait.d.ts.map +1 -0
  285. package/dist/wait.js +76 -0
  286. package/dist/wait.js.map +1 -0
  287. package/package.json +94 -0
  288. package/spec/README.md +72 -0
  289. package/spec/SNAPSHOT_V1.md +208 -0
  290. package/spec/sdk-types.md +259 -0
  291. package/spec/snapshot.schema.json +148 -0
  292. package/src/extension/background.js +104 -0
  293. package/src/extension/content.js +162 -0
  294. package/src/extension/injected_api.js +1399 -0
  295. package/src/extension/manifest.json +36 -0
  296. package/src/extension/pkg/README.md +1340 -0
  297. package/src/extension/pkg/package.json +15 -0
  298. package/src/extension/pkg/sentience_core.d.ts +51 -0
  299. package/src/extension/pkg/sentience_core.js +371 -0
  300. package/src/extension/pkg/sentience_core_bg.wasm +0 -0
  301. package/src/extension/pkg/sentience_core_bg.wasm.d.ts +10 -0
  302. package/src/extension/release.json +116 -0
@@ -0,0 +1,1340 @@
1
+ # Sentience Chrome Extension - Complete Documentation
2
+
3
+ **A Rust/WASM-powered Chrome extension for extracting geometric layouts, visual cues, and importance scores from web pages.**
4
+
5
+ Perfect for AI agents, automation scripts, visual grounding, and accessibility tools.
6
+
7
+ ---
8
+
9
+ ## Table of Contents
10
+
11
+ 1. [Overview](#overview)
12
+ 2. [Quick Start](#quick-start)
13
+ 3. [Developer Quick Reference](#developer-quick-reference)
14
+ 4. [Installation](#installation)
15
+ 5. [User API](#user-api)
16
+ 6. [Usage Examples](#usage-examples)
17
+ 7. [Screenshot Feature](#screenshot-feature)
18
+ 8. [Bounding Box Visualization](#bounding-box-visualization)
19
+ 9. [Filtering & Ranking](#filtering--ranking)
20
+ 10. [Architecture](#architecture)
21
+ 11. [Implementation Details](#implementation-details)
22
+ 12. [API Reference](#api-reference)
23
+ 13. [Performance](#performance)
24
+ 14. [Troubleshooting](#troubleshooting)
25
+ 15. [Contributing](#contributing)
26
+
27
+ ---
28
+
29
+ ## Overview
30
+
31
+ ### What It Does
32
+
33
+ Extracts a **geometry map** of any webpage:
34
+ - **Element positions** (bounding boxes)
35
+ - **Semantic roles** (button, link, textbox, etc.)
36
+ - **Importance scores** (AI-optimized ranking)
37
+ - **Visual cues** (colors, primary actions, clickability)
38
+ - **Screenshots** (Base64 PNG/JPEG)
39
+
40
+ ### Why Use It?
41
+
42
+ ✅ **AI Agent Navigation** - Provide structured page data to LLMs
43
+ ✅ **Visual Grounding** - Combine screenshots with element positions
44
+ ✅ **Accessibility Auditing** - Find low-importance or inaccessible elements
45
+ ✅ **Form Detection** - Extract all input fields automatically
46
+ ✅ **Primary CTA Detection** - Find the main call-to-action button
47
+ ✅ **Automation** - Programmatically interact with web pages
48
+
49
+ ### Technology Stack
50
+
51
+ - **Frontend**: JavaScript (DOM extraction)
52
+ - **Backend**: Rust/WASM (analysis & ranking)
53
+ - **Size**: ~50-150 KB WASM binary
54
+ - **Performance**: 100-300ms typical response
55
+
56
+ ---
57
+
58
+ ## Quick Start
59
+
60
+ ### 5-Second Test
61
+
62
+ ```bash
63
+ # 1. Install dependencies
64
+ npm install
65
+
66
+ # 2. Build extension (WASM + JavaScript)
67
+ npm run build
68
+
69
+ # 3. Load extension in Chrome
70
+ # chrome://extensions → Enable Developer mode → Load unpacked → Select this directory
71
+
72
+ # 4. Open any webpage, then in DevTools Console:
73
+ ```
74
+
75
+ ```javascript
76
+ const result = await window.sentience.snapshot({ limit: 5 });
77
+ console.log(result.elements);
78
+ ```
79
+
80
+ **That's it!** You now have the top 5 most important elements with positions, roles, and scores.
81
+
82
+ ---
83
+
84
+ ## Developer Quick Reference
85
+
86
+ ### 📦 **Build Commands**
87
+
88
+ | Command | What it does |
89
+ |---------|-------------|
90
+ | `npm install` | Install dependencies (first time only) |
91
+ | `npm run build` | **Full build** (WASM + JavaScript bundles) |
92
+ | `npm run build:wasm` | Build only WASM (Rust → pkg/) |
93
+ | `npm run build:bundle` | Build only JavaScript (src/ → dist/) |
94
+
95
+ ### ✅ **Code Quality**
96
+
97
+ | Command | What it does |
98
+ |---------|-------------|
99
+ | `npm run lint` | Check code quality with ESLint |
100
+ | `npm run lint:fix` | Auto-fix linting issues |
101
+ | `npm run format` | Format code with Prettier |
102
+ | `npm run format:check` | Check code formatting |
103
+
104
+ ### 🧪 **Testing**
105
+
106
+ | Command | What it does |
107
+ |---------|-------------|
108
+ | `npm test` | Run all tests with coverage |
109
+ | `npm run test:watch` | Run tests in watch mode |
110
+ | `npm run test:coverage` | Generate coverage report |
111
+
112
+ ### 📁 **Project Structure**
113
+
114
+ ```
115
+ sentience-chrome/
116
+ ├── src/ # Modular source code
117
+ │ ├── background/ # Service worker with WASM
118
+ │ ├── content/ # Message bridge
119
+ │ └── injected/ # Main API (7 modules)
120
+ ├── tests/ # Unit tests (76 tests)
121
+ ├── dist/ # Bundled output (gitignored)
122
+ ├── pkg/ # WASM artifacts (gitignored)
123
+ └── docs/ # Documentation
124
+ ```
125
+
126
+ ### 🔄 **Development Workflow**
127
+
128
+ ```bash
129
+ # 1. Make changes to src/injected/utils.js (for example)
130
+ # 2. Rebuild JavaScript (fast)
131
+ npm run build:bundle
132
+
133
+ # 3. Reload extension in Chrome
134
+ # Click refresh icon in chrome://extensions/
135
+ ```
136
+
137
+ ### 📚 **Documentation**
138
+
139
+ - **[CONTRIBUTING.md](CONTRIBUTING.md)** - Developer guide
140
+ - **[docs/RESTRUCTURING_PROGRESS.md](docs/RESTRUCTURING_PROGRESS.md)** - Architecture overview
141
+ - **[docs/RESTRUCTURING_ASSESSMENT.md](docs/RESTRUCTURING_ASSESSMENT.md)** - Planning document
142
+
143
+ ---
144
+
145
+ ## Installation
146
+
147
+ ### Prerequisites
148
+
149
+ - **Rust** (for WASM compilation)
150
+ - **wasm-pack** (`cargo install wasm-pack`)
151
+ - **Chrome/Chromium** browser
152
+
153
+ ### Build Steps
154
+
155
+ **Option 1: Use the build script (Recommended)**
156
+
157
+ ```bash
158
+ # 1. Navigate to directory
159
+ cd /path/to/sentience-chrome/Claude  # adjust to your local checkout
160
+
161
+ # 2. Run build script
162
+ ./build.sh
163
+
164
+ # The script will:
165
+ # - Check for wasm-pack
166
+ # - Build the WASM module
167
+ # - Show file sizes
168
+ # - Display next steps
169
+ ```
170
+
171
+ **Option 2: Manual build**
172
+
173
+ ```bash
174
+ # 1. Navigate to directory
175
+ cd /path/to/sentience-chrome/Claude  # adjust to your local checkout
176
+
177
+ # 2. Build WASM module
178
+ wasm-pack build --target web
179
+
180
+ # 3. Verify output
181
+ ls pkg/
182
+ # Should see: sentience_core.js, sentience_core_bg.wasm
183
+ ```
184
+
185
+ **Load in Chrome:**
186
+
187
+ 1. Open `chrome://extensions`
188
+ 2. Enable "Developer mode" (top-right toggle)
189
+ 3. Click "Load unpacked"
190
+ 4. Select the `Claude/` directory
191
+ 5. Test on any webpage in DevTools Console:
192
+ ```javascript
193
+ await window.sentience.snapshot()
194
+ ```
195
+
196
+ ### File Structure
197
+
198
+ ```
199
+ Claude/
200
+ ├── src/
201
+ │ └── lib.rs # Rust/WASM logic
202
+ ├── pkg/ # Generated by wasm-pack
203
+ │ ├── sentience_core.js
204
+ │ └── sentience_core_bg.wasm
205
+ ├── content.js # JavaScript entry point
206
+ ├── background.js # Screenshot capture
207
+ ├── manifest.json # Chrome extension config
208
+ ├── Cargo.toml # Rust dependencies
209
+ ├── build.sh # Build script (executable)
210
+ ├── README.md # This file
211
+ ├── IMPLEMENTATION_SUMMARY.md # Technical implementation details
212
+ └── prompt.md # Original task instructions
213
+ ```
214
+
215
+ ---
216
+
217
+ ## User API
218
+
219
+ ### Core Functions
220
+
221
+ The extension provides two main functions:
222
+
223
+ 1. **`window.sentience.snapshot(options?)`** - Extract page geometry and elements
224
+ 2. **`window.sentience.findTextRect(options)`** - Find exact pixel coordinates of text
225
+
226
+ ### snapshot() - Geometry Extraction
227
+
228
+ ```javascript
229
+ window.sentience.snapshot(options?)
230
+ ```
231
+
232
+ **Capabilities:**
233
+ - Get geometry map
234
+ - Capture screenshot
235
+ - Filter by role/size/z-index
236
+ - Limit to top N elements
237
+
238
+ ### Complete Options
239
+
240
+ ```typescript
241
+ await window.sentience.snapshot({
242
+ // Screenshot
243
+ screenshot?: boolean | {
244
+ format: 'png' | 'jpeg',
245
+ quality: number // 0-100, JPEG only
246
+ },
247
+
248
+ // Filtering
249
+ limit?: number,
250
+ filter?: {
251
+ min_area?: number,
252
+ allowed_roles?: string[],
253
+ min_z_index?: number
254
+ }
255
+ })
256
+ ```
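To make the `filter` and `limit` options concrete, here is a minimal sketch of how they might behave on an element list. The real filtering happens inside the extension and may differ. Assumptions: `min_area` is compared against `bbox.width * bbox.height`, `allowed_roles` is an inclusion list, and `limit` keeps the highest-`importance` elements. `applySnapshotFilter` and `sampleElements` are hypothetical names, not part of the extension API.

```javascript
// Hypothetical client-side mirror of the snapshot() filter options.
// Assumptions: area = bbox.width * bbox.height; limit keeps the
// highest-importance elements.
function applySnapshotFilter(elements, options = {}) {
  const { limit, filter = {} } = options;
  const { min_area, allowed_roles, min_z_index } = filter;

  const kept = elements.filter((el) => {
    if (min_area !== undefined && el.bbox.width * el.bbox.height < min_area) return false;
    if (allowed_roles !== undefined && !allowed_roles.includes(el.role)) return false;
    if (min_z_index !== undefined && el.z_index < min_z_index) return false;
    return true;
  });

  // filter() returned a new array, so sorting in place does not touch the input.
  kept.sort((a, b) => b.importance - a.importance);
  return limit !== undefined ? kept.slice(0, limit) : kept;
}

// Made-up sample data in the documented response shape.
const sampleElements = [
  { id: 1, role: 'button', importance: 900, bbox: { x: 0, y: 0, width: 100, height: 40 }, z_index: 1 },
  { id: 2, role: 'link', importance: 500, bbox: { x: 0, y: 50, width: 80, height: 20 }, z_index: 0 },
  { id: 3, role: 'button', importance: 300, bbox: { x: 0, y: 80, width: 10, height: 10 }, z_index: 0 },
];

const filtered = applySnapshotFilter(sampleElements, {
  limit: 5,
  filter: { min_area: 500, allowed_roles: ['button'] },
});
console.log(filtered.map((el) => el.id)); // only element 1 passes both filters
```

A local mirror like this can be handy for re-filtering a snapshot that has already been fetched, without calling the extension again.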
257
+
258
+ ### Response Format
259
+
260
+ ```typescript
261
+ {
262
+ status: "success",
263
+ timestamp: string,
264
+ url: string,
265
+ viewport: { width: number, height: number },
266
+
267
+ // Geometry data
268
+ elements: [{
269
+ id: number,
270
+ role: string,
271
+ importance: number,
272
+ visual_cues: {
273
+ is_primary: boolean,
274
+ background_color_name: string | null,
275
+ is_clickable: boolean
276
+ },
277
+ bbox: { x: number, y: number, width: number, height: number },
278
+ z_index: number
279
+ }],
280
+
281
+ // Screenshot (optional)
282
+ screenshot?: string, // Base64 data URL
283
+ screenshot_format?: 'png' | 'jpeg',
284
+ screenshot_error?: string
285
+ }
286
+ ```
287
+
288
+ ### findTextRect() - Text Location Finder
289
+
290
+ Find exact pixel coordinates of any text on the page using the DOM Range API. Perfect for highlighting specific words, clicking on text, or text-based navigation **without Vision Models**.
291
+
292
+ ```javascript
293
+ window.sentience.findTextRect(options)
294
+ ```
295
+
296
+ **Parameters:**
297
+ ```typescript
298
+ {
299
+ text: string, // Required: Text to find
300
+ containerElement?: Element, // Optional: Search within (default: document.body)
301
+ caseSensitive?: boolean, // Optional: Case-sensitive search (default: false)
302
+ wholeWord?: boolean, // Optional: Match whole words only (default: false)
303
+ maxResults?: number // Optional: Limit results (default: 10)
304
+ }
305
+ ```
306
+
307
+ **Returns:**
308
+ ```typescript
309
+ {
310
+ status: "success" | "error",
311
+ query: string, // The search text
312
+ case_sensitive: boolean,
313
+ whole_word: boolean,
314
+ matches: number, // Total matches found
315
+ results: [{
316
+ text: string, // Actual matched text
317
+ rect: { // Absolute coordinates (with scroll)
318
+ x: number,
319
+ y: number,
320
+ width: number,
321
+ height: number,
322
+ left: number,
323
+ top: number,
324
+ right: number,
325
+ bottom: number
326
+ },
327
+ viewport_rect: { // Viewport-relative coordinates
328
+ x: number,
329
+ y: number,
330
+ width: number,
331
+ height: number
332
+ },
333
+ context: { // Surrounding text
334
+ before: string, // 20 chars before
335
+ after: string // 20 chars after
336
+ },
337
+ in_viewport: boolean // Is it currently visible?
338
+ }],
339
+ viewport: {
340
+ width: number,
341
+ height: number,
342
+ scroll_x: number,
343
+ scroll_y: number
344
+ },
345
+ error?: string // Error message if status is "error"
346
+ }
347
+ ```
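The relationship between `rect` and `viewport_rect` can be sketched as follows, assuming the absolute rectangle is simply the viewport-relative one shifted by the scroll offsets reported in `viewport.scroll_x` and `viewport.scroll_y`. `toViewportRect` and `isInViewport` are hypothetical helpers for illustration, and the "any overlap" definition of visibility is also an assumption about how `in_viewport` might be computed.

```javascript
// Convert an absolute (page) rectangle to viewport-relative coordinates.
// Assumption: absolute = viewport-relative + scroll offset.
function toViewportRect(rect, viewport) {
  return {
    x: rect.x - viewport.scroll_x,
    y: rect.y - viewport.scroll_y,
    width: rect.width,
    height: rect.height,
  };
}

// True when the rectangle overlaps the visible area at all (assumed definition).
function isInViewport(viewportRect, viewport) {
  return (
    viewportRect.x < viewport.width &&
    viewportRect.x + viewportRect.width > 0 &&
    viewportRect.y < viewport.height &&
    viewportRect.y + viewportRect.height > 0
  );
}

// Made-up values: page scrolled 1500px down, match at absolute y = 1600.
const viewport = { width: 1280, height: 720, scroll_x: 0, scroll_y: 1500 };
const absoluteRect = { x: 100, y: 1600, width: 80, height: 18 };
const relativeRect = toViewportRect(absoluteRect, viewport);

console.log(relativeRect.y); // 100
console.log(isInViewport(relativeRect, viewport)); // true
```

Use the absolute `rect` for positioning overlays with `position: absolute`, and `viewport_rect` for APIs such as `document.elementFromPoint`, which take viewport coordinates.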
348
+
349
+ **Usage Examples:**
350
+
351
+ ```javascript
352
+ // Example 1: Find "Add to Cart" text
353
+ const result = await window.sentience.findTextRect({
354
+ text: "Add to Cart"
355
+ });
356
+
357
+ if (result.status === "success") {
358
+ console.log(`Found ${result.matches} occurrences`);
359
+ result.results.forEach((match, i) => {
360
+ console.log(`${i+1}. At (${match.rect.x}, ${match.rect.y})`);
361
+ console.log(` Context: "${match.context.before}${match.text}${match.context.after}"`);
362
+ });
363
+ }
364
+
365
+ // Example 2: Highlight all matches
366
+ const priceMatches = await window.sentience.findTextRect({
367
+ text: "price",
368
+ caseSensitive: false,
369
+ maxResults: 20
370
+ });
371
+
372
+ priceMatches.results.forEach(match => {
373
+ const highlight = document.createElement('div');
374
+ highlight.style.cssText = `
375
+ position: absolute;
376
+ left: ${match.rect.x}px;
377
+ top: ${match.rect.y}px;
378
+ width: ${match.rect.width}px;
379
+ height: ${match.rect.height}px;
380
+ background: yellow;
381
+ opacity: 0.5;
382
+ pointer-events: none;
383
+ z-index: 9999;
384
+ `;
385
+ document.body.appendChild(highlight);
386
+ });
387
+
388
+ // Example 3: Click on specific text (not button!)
389
+ const termsResult = await window.sentience.findTextRect({
390
+ text: "Terms of Service",
391
+ wholeWord: true
392
+ });
393
+
394
+ if (termsResult.matches > 0) {
395
+ const first = termsResult.results[0];
396
+ // Click the center of the text
397
+ const centerX = first.viewport_rect.x + first.viewport_rect.width / 2;
398
+ const centerY = first.viewport_rect.y + first.viewport_rect.height / 2;
399
+
400
+ document.elementFromPoint(centerX, centerY)?.click();
401
+ }
402
+
403
+ // Example 4: Find text only in header
404
+ const header = document.querySelector('header');
405
+ const loginResult = await window.sentience.findTextRect({
406
+ text: "Login",
407
+ containerElement: header
408
+ });
409
+
410
+ // Example 5: Scroll to first match
411
+ const contactResult = await window.sentience.findTextRect({
412
+ text: "Contact Us"
413
+ });
414
+
415
+ if (contactResult.matches > 0) {
416
+ const first = contactResult.results[0];
417
+ window.scrollTo({
418
+ top: first.rect.y - 100, // Offset for header
419
+ behavior: 'smooth'
420
+ });
421
+ }
422
+ ```
423
+
424
+ **Use Cases:**
425
+ - 🎯 **Text-based clicking** - Click on text that's not in a button
426
+ - 🖍️ **Text highlighting** - Draw bounding boxes around specific words
427
+ - 📍 **Text navigation** - Scroll to specific content
428
+ - ♿ **Accessibility** - Find and highlight important text
429
+ - 🤖 **AI Agents** - Locate text without vision models
430
+ - 🔍 **Search results** - Find and highlight search terms
431
+
432
+ **Features:**
433
+ - ✅ Pixel-perfect coordinates using DOM Range API
434
+ - ✅ Filters invisible/hidden text automatically
435
+ - ✅ Returns both absolute and viewport-relative coordinates
436
+ - ✅ Provides context for ambiguous matches
437
+ - ✅ Handles multiple occurrences
438
+ - ✅ Performance-safe with result limits
439
+ - ✅ Works with case-insensitive and whole-word matching
440
+
441
+ ---
442
+
443
+ ## Usage Examples
444
+
445
+ ### Example 1: Basic Geometry Map
446
+
447
+ ```javascript
448
+ const result = await window.sentience.snapshot();
449
+ console.log(`Found ${result.elements.length} elements`);
450
+ console.log('Top element:', result.elements[0]);
451
+ ```
452
+
453
+ ### Example 2: Top 10 Most Important
454
+
455
+ ```javascript
456
+ const top10 = await window.sentience.snapshot({ limit: 10 });
457
+ top10.elements.forEach((el, i) => {
458
+ console.log(`${i+1}. [${el.role}] Score: ${el.importance}, Position: (${el.bbox.x}, ${el.bbox.y})`);
459
+ });
460
+ ```
461
+
462
+ ### Example 3: Filter for Buttons Only
463
+
464
+ ```javascript
465
+ const buttons = await window.sentience.snapshot({
466
+ filter: { allowed_roles: ['button', 'submit'] }
467
+ });
468
+ console.log(`Found ${buttons.elements.length} buttons`);
469
+ ```
470
+
471
+ ### Example 4: Find Primary Action
472
+
473
+ ```javascript
474
+ const result = await window.sentience.snapshot();
475
+ const cta = result.elements.find(el =>
476
+ el.visual_cues.is_primary && el.role === 'button'
477
+ );
478
+ console.log('Primary CTA:', cta);
479
+ ```
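On pages where no button is flagged as primary, the lookup above returns `undefined`. One possible fallback, sketched here under the assumption that a higher `importance` means a more prominent element, is to pick the highest-scoring button instead. `findPrimaryCta` is a hypothetical helper, not part of the extension API.

```javascript
// Hypothetical helper: prefer a button flagged is_primary; otherwise fall
// back to the highest-importance button; return null if there are no buttons.
function findPrimaryCta(elements) {
  const buttons = elements.filter((el) => el.role === 'button');
  const primary = buttons.find((el) => el.visual_cues?.is_primary);
  if (primary) return primary;
  if (buttons.length === 0) return null;
  return buttons.reduce((best, el) => (el.importance > best.importance ? el : best));
}

// Made-up sample data: no element is flagged as primary.
const candidates = [
  { id: 1, role: 'link', importance: 950, visual_cues: { is_primary: false } },
  { id: 2, role: 'button', importance: 400, visual_cues: { is_primary: false } },
  { id: 3, role: 'button', importance: 700, visual_cues: { is_primary: false } },
];

console.log(findPrimaryCta(candidates).id); // 3 (highest-importance button)
```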
480
+
481
+ ### Example 5: AI Agent Prompt
482
+
483
+ ```javascript
484
+ const top5 = await window.sentience.snapshot({ limit: 5 });
485
+
486
+ const prompt = `
487
+ Available actions:
488
+ ${top5.elements.map((el, i) =>
489
+ `${i+1}. ${el.role} at (${el.bbox.x}, ${el.bbox.y})`
490
+ ).join('\n')}
491
+
492
+ Which action should I take to search for products?
493
+ `;
494
+
495
+ // Send to LLM API
496
+ ```
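The prompt above numbers each element, so the model's answer has to be mapped back to an element before acting on it. A minimal sketch of that round trip follows, assuming the model is asked to reply with the action number. `buildActionPrompt` and `resolveChoice` are hypothetical helpers, and the reply format is an assumption; a production agent would want stricter output parsing.

```javascript
// Build a numbered action list from snapshot elements (same layout as above).
function buildActionPrompt(elements, goal) {
  const lines = elements.map(
    (el, i) => `${i + 1}. ${el.role} at (${el.bbox.x}, ${el.bbox.y})`
  );
  return `Available actions:\n${lines.join('\n')}\n\n${goal}\nReply with the number of the action.`;
}

// Map a free-text reply back to an element using the first integer found.
// Returns null when there is no number or it is out of range.
function resolveChoice(replyText, elements) {
  const match = /\d+/.exec(replyText);
  if (!match) return null;
  const index = Number(match[0]) - 1;
  return index >= 0 && index < elements.length ? elements[index] : null;
}

// Made-up sample data in the documented element shape.
const actions = [
  { role: 'button', bbox: { x: 10, y: 20, width: 80, height: 30 } },
  { role: 'searchbox', bbox: { x: 200, y: 20, width: 300, height: 30 } },
];

console.log(buildActionPrompt(actions, 'Which action should I take to search for products?'));
console.log(resolveChoice('I would pick 2.', actions).role); // searchbox
```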
497
+
498
+ ### Example 6: Form Field Detection
499
+
500
+ ```javascript
501
+ const inputs = await window.sentience.snapshot({
502
+ filter: {
503
+ allowed_roles: ['textbox', 'searchbox', 'checkbox', 'radio']
504
+ }
505
+ });
506
+
507
+ console.log('Form schema:', inputs.elements.map(el => ({
508
+ role: el.role,
509
+ position: el.bbox,
510
+ importance: el.importance
511
+ })));
512
+ ```
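Snapshot elements are ranked by importance, which is not necessarily the order in which a user would fill in a form. A rough way to recover reading order from `bbox` alone is sketched below: elements are grouped into rows by their top edge, then ordered left to right within each row. `sortReadingOrder` and the `rowHeight` bucket size are hypothetical; two fields that straddle a bucket boundary can land in different rows, so treat this as a heuristic.

```javascript
// Sort elements top-to-bottom, then left-to-right.
// Elements are grouped into rows by bucketing bbox.y (assumed heuristic).
function sortReadingOrder(elements, rowHeight = 20) {
  const row = (el) => Math.floor(el.bbox.y / rowHeight);
  // Copy first so the caller's array keeps its importance ordering.
  return [...elements].sort((a, b) => row(a) - row(b) || a.bbox.x - b.bbox.x);
}

// Made-up sample data: two fields on one row, one field below.
const fields = [
  { id: 1, role: 'textbox', bbox: { x: 300, y: 102, width: 200, height: 30 } },
  { id: 2, role: 'textbox', bbox: { x: 20, y: 100, width: 200, height: 30 } },
  { id: 3, role: 'checkbox', bbox: { x: 20, y: 160, width: 16, height: 16 } },
];

console.log(sortReadingOrder(fields).map((el) => el.id)); // [ 2, 1, 3 ]
```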
513
+
514
+ ---
515
+
516
+ ## Screenshot Feature
517
+
518
+ ### Basic Screenshot
519
+
520
+ ```javascript
521
+ const result = await window.sentience.snapshot({
522
+ screenshot: true
523
+ });
524
+
525
+ console.log('Screenshot:', result.screenshot); // Base64 data URL
526
+ console.log('Format:', result.screenshot_format); // "png"
527
+ ```
528
+
529
+ ### JPEG for Smaller Size
530
+
531
+ ```javascript
532
+ const result = await window.sentience.snapshot({
533
+ screenshot: {
534
+ format: 'jpeg',
535
+ quality: 80 // Recommended: 70-85
536
+ }
537
+ });
538
+ ```
539
+
540
+ ### Screenshot Format
541
+
542
+ **Important:** The screenshot is a **Base64 data URL string**, not a file path.
543
+
544
+ ```
545
+ "data:image/png;base64,..."
546
+ ```
547
+
548
+ ### Display Screenshot
549
+
550
+ ```javascript
551
+ const result = await window.sentience.snapshot({ screenshot: true });
552
+
553
+ const img = document.createElement('img');
554
+ img.src = result.screenshot;
555
+ document.body.appendChild(img);
556
+ ```
557
+
558
+ ### Download Screenshot
559
+
560
+ ```javascript
561
+ const result = await window.sentience.snapshot({ screenshot: true });
562
+
563
+ const a = document.createElement('a');
564
+ a.href = result.screenshot;
565
+ a.download = 'screenshot.png';
566
+ a.click();
567
+ ```
568
+
569
+ ### Annotated Screenshot
570
+
571
+ ```javascript
572
+ const result = await window.sentience.snapshot({
573
+ screenshot: true,
574
+ limit: 10
575
+ });
576
+
577
+ const img = new Image();
578
+ img.onload = () => {
579
+ const canvas = document.createElement('canvas');
580
+ canvas.width = result.viewport.width;
581
+ canvas.height = result.viewport.height;
582
+
583
+ const ctx = canvas.getContext('2d');
584
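+ ctx.drawImage(img, 0, 0, canvas.width, canvas.height); // scale to the viewport size in case the capture is at devicePixelRatio resolution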
585
+
586
+ // Draw bounding boxes
587
+ result.elements.forEach((el, i) => {
588
+ ctx.strokeStyle = el.visual_cues.is_primary ? 'red' : 'blue';
589
+ ctx.lineWidth = 3;
590
+ ctx.strokeRect(el.bbox.x, el.bbox.y, el.bbox.width, el.bbox.height);
591
+
592
+ // Label
593
+ ctx.fillStyle = 'red';
594
+ ctx.font = '14px Arial';
595
+ ctx.fillText(`${i+1}: ${el.role}`, el.bbox.x, el.bbox.y - 5);
596
+ });
597
+
598
+ document.body.appendChild(canvas);
599
+ };
600
+ img.src = result.screenshot;
601
+ ```
602
+
603
+ ### Send to AI Vision API
604
+
605
+ ```javascript
606
+ const result = await window.sentience.snapshot({
607
+ screenshot: { format: 'jpeg', quality: 80 },
608
+ limit: 10
609
+ });
610
+
611
+ // Raw Base64 without the data-URL prefix, for APIs that take the payload separately (unused in the OpenAI request below)
612
+ const base64Only = result.screenshot.split(',')[1];
613
+
614
+ // Send to a vision model; this example uses the OpenAI chat completions endpoint, which accepts the full data URL
615
+ await fetch('https://api.openai.com/v1/chat/completions', {
616
+ method: 'POST',
617
+ headers: {
618
+ 'Authorization': `Bearer ${API_KEY}`,
619
+ 'Content-Type': 'application/json'
620
+ },
621
+ body: JSON.stringify({
622
+ model: 'gpt-4-vision-preview',
623
+ messages: [{
624
+ role: 'user',
625
+ content: [
626
+ { type: 'text', text: 'What should I click?' },
627
+ { type: 'image_url', image_url: { url: result.screenshot } }
628
+ ]
629
+ }]
630
+ })
631
+ });
632
+ ```
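The request above sends only the image. One way to let the model refer to elements by id is to include a compact text listing of `result.elements` in the text part of the message. The sketch below is an illustration only: `describeElements` is a hypothetical helper, and the sample objects stand in for real snapshot output using the `SmartElement` fields documented under API Reference.

```javascript
// Hypothetical helper: turn snapshot elements into one text line each,
// suitable for the text part of a vision prompt.
function describeElements(elements) {
  return elements
    .map((el) => {
      const { x, y, width, height } = el.bbox;
      const text = el.text ? ` "${el.text}"` : '';
      const primary = el.visual_cues.is_primary ? ' [primary]' : '';
      return `#${el.id} ${el.role}${text} at (${x},${y}) ${width}x${height}${primary}`;
    })
    .join('\n');
}

// Sample data standing in for result.elements.
const sampleElements = [
  { id: 3, role: 'searchbox', text: null,
    bbox: { x: 120, y: 16, width: 480, height: 36 },
    visual_cues: { is_primary: false } },
  { id: 7, role: 'button', text: 'Sign in',
    bbox: { x: 900, y: 12, width: 96, height: 40 },
    visual_cues: { is_primary: true } },
];

const listing = describeElements(sampleElements);
console.log(listing);
```

The model can then answer with an element id, which maps back to an entry in `result.elements`.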
633
+
634
+ ### File Size Reference
635
+
636
+ | Format | Quality | Typical Size | Use Case |
637
+ |--------|---------|--------------|----------|
638
+ | PNG | N/A | 1-2 MB | Archival, exact pixels |
639
+ | JPEG | 90 | 300-800 KB | High quality uploads |
640
+ | JPEG | 80 | 150-400 KB | **Recommended** |
641
+ | JPEG | 70 | 80-200 KB | Bandwidth-limited |
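Base64 encodes every 3 bytes as 4 characters, so the data URL string is roughly a third longer than the image it carries. The table does not say whether its sizes are measured before or after encoding, so it can be useful to check the decoded size directly. A minimal sketch, assuming standard padded Base64 (the helper name `decodedByteLength` is hypothetical):

```javascript
// Hypothetical helper: number of bytes a padded Base64 string decodes to.
function decodedByteLength(base64) {
  const padding = base64.endsWith('==') ? 2 : base64.endsWith('=') ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

const payload = 'aGVsbG8='; // Base64 for the 5-byte string "hello"
console.log(decodedByteLength(payload)); // 5
```

With a real snapshot, the input would be the part of `result.screenshot` after the comma.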
642
+
643
+ ---
644
+
645
+ ## Bounding Box Visualization
646
+
647
+ ### Simple Overlay
648
+
649
+ ```javascript
650
+ const result = await window.sentience.snapshot({ limit: 10 });
651
+
652
+ result.elements.forEach((el, i) => {
653
+ const box = document.createElement('div');
654
+ box.style.cssText = `
655
+ position: absolute;
656
+ left: ${el.bbox.x}px;
657
+ top: ${el.bbox.y}px;
658
+ width: ${el.bbox.width}px;
659
+ height: ${el.bbox.height}px;
660
+ border: ${el.visual_cues.is_primary ? '3px solid red' : '2px solid blue'};
661
+ pointer-events: none;
662
+ z-index: 9999;
663
+ box-sizing: border-box;
664
+ `;
665
+
666
+ // Add label
667
+ const label = document.createElement('div');
668
+ label.style.cssText = `
669
+ position: absolute;
670
+ top: -22px;
671
+ background: ${el.visual_cues.is_primary ? 'red' : 'blue'};
672
+ color: white;
673
+ padding: 2px 6px;
674
+ font-size: 12px;
675
+ `;
676
+ label.textContent = `${i+1}: ${el.role}`;
677
+ box.appendChild(label);
678
+
679
+ document.body.appendChild(box);
680
+ });
681
+ ```
682
+
683
+ ### Canvas Overlay (Better Performance)
684
+
685
+ ```javascript
686
+ const result = await window.sentience.snapshot({ limit: 20 });
687
+
688
+ const canvas = document.createElement('canvas');
689
+ canvas.width = window.innerWidth;
690
+ canvas.height = window.innerHeight;
691
+ canvas.style.cssText = `
692
+ position: fixed;
693
+ top: 0;
694
+ left: 0;
695
+ pointer-events: none;
696
+ z-index: 9999;
697
+ `;
698
+ document.body.appendChild(canvas);
699
+
700
+ const ctx = canvas.getContext('2d');
701
+
702
+ result.elements.forEach((el, i) => {
703
+ ctx.strokeStyle = el.visual_cues.is_primary ? '#ff0000' : '#0066ff';
704
+ ctx.lineWidth = el.visual_cues.is_primary ? 3 : 2;
705
+ ctx.strokeRect(el.bbox.x, el.bbox.y, el.bbox.width, el.bbox.height);
706
+
707
+ ctx.fillStyle = '#0066ff';
708
+ ctx.font = '14px Arial';
709
+ ctx.fillText(`${i+1}: ${el.role}`, el.bbox.x, el.bbox.y - 5);
710
+ });
711
+ ```
712
+
713
+ ### Reusable Helper Function
714
+
715
+ ```javascript
716
+ // Add to content.js or run in console
717
+ window.sentience.visualize = async function(options = {}) {
718
+ const {
719
+ limit = 10,
720
+ filter = null,
721
+ highlightPrimary = true
722
+ } = options;
723
+
724
+ const result = await window.sentience.snapshot({ limit, filter });
725
+
726
+ // Clear previous
727
+ document.querySelectorAll('.sentience-box').forEach(el => el.remove());
728
+
729
+ result.elements.forEach((el, i) => {
730
+ const box = document.createElement('div');
731
+ box.className = 'sentience-box';
732
+ box.style.cssText = `
733
+ position: absolute;
734
+ left: ${el.bbox.x}px;
735
+ top: ${el.bbox.y}px;
736
+ width: ${el.bbox.width}px;
737
+ height: ${el.bbox.height}px;
738
+ border: ${el.visual_cues.is_primary && highlightPrimary ? '3px solid red' : '2px solid blue'};
739
+ pointer-events: none;
740
+ z-index: 9998;
741
+ box-sizing: border-box;
742
+ `;
743
+ document.body.appendChild(box);
744
+ });
745
+
746
+ console.log(`✅ Visualized ${result.elements.length} elements`);
747
+ return result;
748
+ };
749
+
750
+ // Usage
751
+ await window.sentience.visualize();
752
+ await window.sentience.visualize({ limit: 20 });
753
+ await window.sentience.visualize({ filter: { allowed_roles: ['button'] } });
754
+ ```
755
+
756
+ ---
757
+
758
+ ## Filtering & Ranking
759
+
760
+ ### Importance Scoring (6 Metrics)
761
+
762
+ Elements are ranked by:
763
+
764
+ 1. **Role Priority** (1000 for text inputs, 500 for buttons, checkboxes, and radios, 100 for links, 10 for everything else)
765
+ 2. **Area Score** (larger elements score higher, capped at 200)
766
+ 3. **Center Bias** (penalizes footer/sidebar elements)
767
+ 4. **Z-Index Bonus** (modals/overlays get priority)
768
+ 5. **ARIA Label Bonus** (+200 for explicit labels)
769
+ 6. **Visual Prominence** (+200 for `is_primary` elements)
770
+
771
+ **Score Range:** -300 to ~1800
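To make the arithmetic concrete, here is a worked sketch of how the six metrics could add up. The weights follow the `calculate_importance` listing under Implementation Details. How the center penalty is computed is not specified in this document, so it is passed in as a plain number (`centerPenalty`, a hypothetical input), and the function and field names are illustrative only.

```javascript
// Role weights taken from the calculate_importance listing in this README.
const ROLE_PRIORITY = {
  textbox: 1000, searchbox: 1000,
  button: 500, checkbox: 500, radio: 500,
  link: 100,
};

// Illustrative scoring function; centerPenalty is supplied by the caller
// because its exact formula is an unknown here.
function importance({ role, width, height, centerPenalty, zIndex, hasAriaLabel, isPrimary }) {
  let score = ROLE_PRIORITY[role] ?? 10;                          // 1. role priority
  score += Math.min(Math.floor(Math.sqrt(width * height)), 200); // 2. area, capped at 200
  score -= centerPenalty;                                         // 3. center bias
  if (zIndex > 0) score += Math.min(zIndex, 100) * 2;             // 4. z-index bonus
  if (hasAriaLabel) score += 200;                                 // 5. ARIA label bonus
  if (isPrimary) score += 200;                                    // 6. visual prominence
  return score;
}

// A labeled, primary 120x40 button slightly off-center:
// 500 + floor(sqrt(4800)) = 69, minus 50, plus 200, plus 200.
const buttonScore = importance({
  role: 'button', width: 120, height: 40, centerPenalty: 50,
  zIndex: 0, hasAriaLabel: true, isPrimary: true,
});
console.log(buttonScore); // 919
```

A large, centered, labeled, primary text input inside a high z-index overlay reaches 1000 + 200 + 200 + 200 + 200 = 1800, which matches the upper end of the stated range.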
772
+
773
+ ### Filter Options
774
+
775
+ #### Limit to Top N
776
+
777
+ ```javascript
778
+ { limit: 10 } // Returns top 10 most important
779
+ ```
780
+
781
+ #### Filter by Element Size
782
+
783
+ ```javascript
784
+ {
785
+ filter: {
786
+ min_area: 500 // Minimum 500 pixels² (e.g., 20×25)
787
+ }
788
+ }
789
+ ```
790
+
791
+ #### Filter by Role
792
+
793
+ ```javascript
794
+ {
795
+ filter: {
796
+ allowed_roles: ['button', 'link', 'textbox']
797
+ }
798
+ }
799
+ ```
800
+
801
+ **Available Roles:**
802
+ - `button`, `submit` - Buttons
803
+ - `link` - Hyperlinks
804
+ - `textbox`, `searchbox` - Text inputs
805
+ - `checkbox`, `radio` - Form controls
806
+ - `combobox` - Dropdowns
807
+ - `generic` - Other elements
808
+
809
+ #### Filter by Z-Index
810
+
811
+ ```javascript
812
+ {
813
+ filter: {
814
+ min_z_index: 100 // Only modals/overlays
815
+ }
816
+ }
817
+ ```
818
+
819
+ #### Combined Filters
820
+
821
+ ```javascript
822
+ {
823
+ limit: 20,
824
+ filter: {
825
+ allowed_roles: ['button', 'textbox'],
826
+ min_area: 100,
827
+ min_z_index: 0
828
+ }
829
+ }
830
+ ```
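The filter fields can be read as a single predicate applied to each element. The sketch below assumes that the conditions are combined with AND and that thresholds are inclusive; neither detail is confirmed by this document. `matchesFilter` is a hypothetical name, and the same logic can be applied client-side to an existing `result.elements` array.

```javascript
// Hypothetical predicate mirroring the documented filter fields.
// Assumption: all provided conditions must hold (logical AND).
function matchesFilter(el, filter = {}) {
  const { min_area, allowed_roles, min_z_index } = filter;
  if (min_area !== undefined && el.bbox.width * el.bbox.height < min_area) return false;
  if (allowed_roles !== undefined && !allowed_roles.includes(el.role)) return false;
  if (min_z_index !== undefined && el.z_index < min_z_index) return false;
  return true;
}

// Sample data standing in for snapshot elements.
const candidates = [
  { id: 1, role: 'button',  bbox: { width: 50, height: 20 }, z_index: 0 },
  { id: 2, role: 'link',    bbox: { width: 90, height: 20 }, z_index: 0 },
  { id: 3, role: 'textbox', bbox: { width: 8,  height: 8  }, z_index: 0 },
];

const combined = { allowed_roles: ['button', 'textbox'], min_area: 100, min_z_index: 0 };
const keptIds = candidates.filter((el) => matchesFilter(el, combined)).map((el) => el.id);
console.log(keptIds); // [ 1 ]
```

Here the link fails the role check and the 8×8 textbox fails the area check, so only the button remains.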
831
+
832
+ ### Visual Cues
833
+
834
+ #### is_primary Detection
835
+
836
+ An element is marked as primary if:
837
+ 1. **Actionable**: Has clickable role (button/link/input)
838
+ 2. **NOT Decorative**: Not image/presentation
839
+ 3. **Visually Prominent**:
840
+ - Size > 1% of viewport, OR
841
+ - Bold text + primary color (blue/green/orange/red), OR
842
+ - Large font (≥18px) + primary color
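The three conditions above can be sketched as a single boolean expression. The input field names (`isClickable`, `isBold`, `fontSizePx`, `colorName`) and the list of decorative roles are hypothetical, chosen only to illustrate the rule; the extension's real data structures may differ.

```javascript
// Colors the rule above calls "primary".
const PRIMARY_COLORS = ['blue', 'green', 'orange', 'red'];
// Assumed spelling of the decorative roles mentioned above.
const DECORATIVE_ROLES = ['image', 'presentation'];

// Hypothetical check: actionable AND not decorative AND visually prominent.
function isPrimary(el, viewport) {
  const decorative = DECORATIVE_ROLES.includes(el.role);
  const areaRatio = (el.width * el.height) / (viewport.width * viewport.height);
  const primaryColor = PRIMARY_COLORS.includes(el.colorName);
  const prominent =
    areaRatio > 0.01 ||
    (el.isBold && primaryColor) ||
    (el.fontSizePx >= 18 && primaryColor);
  return el.isClickable && !decorative && prominent;
}

const viewport = { width: 1280, height: 720 }; // 1% of the area is 9216 px²

// Large button: 200 * 50 = 10000 px², which exceeds 1% of the viewport.
console.log(isPrimary(
  { role: 'button', isClickable: true, width: 200, height: 50,
    isBold: false, fontSizePx: 14, colorName: 'gray' }, viewport)); // true

// Small gray link with regular text: not prominent.
console.log(isPrimary(
  { role: 'link', isClickable: true, width: 80, height: 30,
    isBold: false, fontSizePx: 14, colorName: 'gray' }, viewport)); // false
```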
843
+
844
+ #### Color Detection
845
+
846
+ 32-color palette using Euclidean distance in RGB space:
847
+ - **Basic**: black, white, gray
848
+ - **Primary**: red, blue, green, yellow
849
+ - **Secondary**: orange, purple, pink, brown
850
+ - **Extended**: gold, salmon, skyblue, khaki, etc.
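Nearest-color matching by Euclidean distance can be sketched as follows. The palette here is a small illustrative subset that uses the standard CSS values for these names; the extension's actual 32-color table is not reproduced in this document and may use different values. Comparing squared distances gives the same ordering as comparing true distances, so the square root is skipped.

```javascript
// Illustrative subset only; RGB values are the standard CSS named colors.
const PALETTE = {
  black: [0, 0, 0], white: [255, 255, 255], gray: [128, 128, 128],
  red: [255, 0, 0], blue: [0, 0, 255], green: [0, 128, 0], yellow: [255, 255, 0],
  orange: [255, 165, 0], purple: [128, 0, 128], pink: [255, 192, 203], brown: [165, 42, 42],
};

// Return the palette name with the smallest squared RGB distance to (r, g, b).
function nearestColorName(r, g, b) {
  let bestName = null;
  let bestDistance = Infinity;
  for (const [name, [pr, pg, pb]] of Object.entries(PALETTE)) {
    const distance = (r - pr) ** 2 + (g - pg) ** 2 + (b - pb) ** 2;
    if (distance < bestDistance) {
      bestDistance = distance;
      bestName = name;
    }
  }
  return bestName;
}

console.log(nearestColorName(0, 123, 255)); // blue
console.log(nearestColorName(255, 160, 0)); // orange
```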
851
+
852
+ ---
853
+
854
+ ## Architecture
855
+
856
+ ### Data Flow
857
+
858
+ ```
859
+ User calls window.sentience.snapshot()
860
+
861
+ content.js extracts raw DOM data
862
+ - getBoundingClientRect() for positions
863
+ - getComputedStyle() for colors/fonts
864
+ - getAttribute() for roles/attributes
865
+
866
+ WASM (lib.rs) analyzes
867
+ 1. infer_role() - Detect semantic roles
868
+ 2. extract_visual_cues() - Colors, prominence
869
+ 3. calculate_importance() - 6-metric scoring
870
+ 4. apply_filters() - Smart selection
871
+
872
+ Return sorted, filtered JSON
873
+ ```
874
+
875
+ ### Component Breakdown
876
+
877
+ **content.js (JavaScript Layer)**
878
+ - DOM access (WASM can't access DOM)
879
+ - Element registry management
880
+ - WASM initialization
881
+ - API exposure
882
+
883
+ **lib.rs (Rust/WASM Layer)**
884
+ - Element analysis
885
+ - Filtering and ranking
886
+ - Smart selection algorithm
887
+ - No dependencies (small binary)
888
+
889
+ **background.js (Service Worker)**
890
+ - Screenshot capture via `chrome.tabs.captureVisibleTab`
891
+ - Message passing to content script
892
+
893
+ ### Bridge Pattern
894
+
895
+ **JavaScript → WASM (Data Flow)**
896
+ ```
897
+ Raw DOM data (JsValue)
898
+ → analyze_page()
899
+ → Vec<RawElement>
900
+ → infer_role(), extract_visual_cues(), calculate_importance()
901
+ → Vec<SmartElement>
902
+ → JsValue
903
+ ```
904
+
905
+ **WASM → JavaScript (Action Bridge)**
906
+ ```
907
+ click_element_bridge(id)
908
+ → js_click_element(id)
909
+ → window.sentience_registry[id].click()
910
+ ```
911
+
912
+ ---
913
+
914
+ ## Implementation Details
915
+
916
+ ### Visual Cues Extraction
917
+
918
+ **RGB to Hex Conversion:**
919
+ ```rust
920
+ fn rgb_to_hex(rgb_string: &str) -> Option<String> {
921
+ // Parse "rgb(0, 123, 255)" → "#007bff"
922
+ // Manual string parsing (no regex dependency)
923
+ }
924
+ ```
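The Rust function body is elided above. For illustration, the same conversion can be sketched in JavaScript with manual string parsing and no regex. This is an assumption about the behavior, not a port of the actual implementation. Returning `null` plays the role of `None`.

```javascript
// Sketch: convert "rgb(0, 123, 255)" to "#007bff"; returns null on bad input.
// An alpha component in "rgba(...)" is ignored (an assumption of this sketch).
function rgbToHex(rgbString) {
  const open = rgbString.indexOf('(');
  const close = rgbString.indexOf(')');
  if (open === -1 || close === -1) return null;

  const parts = rgbString.slice(open + 1, close).split(',').map((p) => Number(p.trim()));
  if (parts.length < 3) return null;

  const channels = parts.slice(0, 3);
  if (channels.some((c) => !Number.isInteger(c) || c < 0 || c > 255)) return null;

  return '#' + channels.map((c) => c.toString(16).padStart(2, '0')).join('');
}

console.log(rgbToHex('rgb(0, 123, 255)')); // #007bff
console.log(rgbToHex('transparent'));      // null
```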
925
+
926
+ **Nearest Color Name:**
927
+ ```rust
928
+ fn find_nearest_color_name(hex: &str) -> Option<String> {
929
+ // Euclidean distance to 32-color palette
930
+ // Returns closest named color
931
+ }
932
+ ```
933
+
934
+ **Primary Action Detection:**
935
+ ```rust
936
+ fn extract_visual_cues(raw: &RawElement, role: &str) -> VisualCues {
937
+ // Check: actionable + not decorative + visually prominent
938
+ let is_primary = is_actionable && !is_decorative && is_visually_prominent;
939
+ }
940
+ ```
941
+
942
+ ### Importance Calculation
943
+
944
+ ```rust
945
+ fn calculate_importance(raw: &RawElement, role: &str, cues: &VisualCues) -> i32 {
946
+ let mut score = 0;
947
+
948
+ // Role priority
949
+ score += match role {
950
+ "textbox" | "searchbox" => 1000,
951
+ "button" | "checkbox" | "radio" => 500,
952
+ "link" => 100,
953
+ _ => 10,
954
+ };
955
+
956
+ // Area bonus
957
+ let area = raw.rect.width * raw.rect.height;
958
+ score += (area.sqrt() as i32).min(200);
959
+
960
+ // Center bias
961
+ let dist_from_center = calculate_manhattan_distance();
962
+ score -= dist_from_center as i32;
963
+
964
+ // Z-index bonus
965
+ if z_index > 0 {
966
+ score += (z_index.min(100)) * 2;
967
+ }
968
+
969
+ // ARIA label bonus
970
+ if raw.attributes.aria_label.is_some() {
971
+ score += 200;
972
+ }
973
+
974
+ // Visual prominence
975
+ if cues.is_primary {
976
+ score += 200;
977
+ }
978
+
979
+ score
980
+ }
981
+ ```
982
+
983
+ ### Smart Selection
984
+
985
+ ```rust
986
+ fn apply_filters(elements: &mut Vec<SmartElement>, options: &AnalysisOptions) {
987
+ // Stage 1: Attribute filters
988
+ apply_attribute_filters(elements, &options.filter);
989
+
990
+ // Stage 2: Smart selection
991
+ if let Some(limit) = options.limit {
992
+ // Truncate to top N
993
+ elements.truncate(limit);
994
+
995
+ // Re-sort by Y-position (reading order for LLMs)
996
+ elements.sort_by(|a, b| a.bbox.y.partial_cmp(&b.bbox.y).unwrap_or(std::cmp::Ordering::Equal)); // Treat incomparable (NaN) positions as equal instead of panicking
997
+ }
998
+ }
999
+ ```
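The same two-stage selection can be sketched in JavaScript. The Rust listing truncates without showing a sort, so this sketch assumes the list is ranked by importance (highest first) before truncation. `selectTopN` is a hypothetical name.

```javascript
// Sketch: keep the N most important elements, then return them in
// top-to-bottom reading order. The input array is not modified.
function selectTopN(elements, limit) {
  const ranked = [...elements].sort((a, b) => b.importance - a.importance);
  const top = ranked.slice(0, limit);
  return top.sort((a, b) => a.bbox.y - b.bbox.y);
}

// Sample data standing in for analyzed elements.
const analyzed = [
  { id: 1, importance: 900,  bbox: { y: 500 } },
  { id: 2, importance: 1200, bbox: { y: 300 } },
  { id: 3, importance: 100,  bbox: { y: 10 } },
  { id: 4, importance: 600,  bbox: { y: 50 } },
];

const selectedIds = selectTopN(analyzed, 3).map((el) => el.id);
console.log(selectedIds); // [ 4, 2, 1 ]
```

Element 3 sits highest on the page but has the lowest score, so it is dropped before the reading-order sort.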
1000
+
1001
+ ### Build Optimizations
1002
+
1003
+ **Cargo.toml:**
1004
+ ```toml
1005
+ [profile.release]
1006
+ opt-level = "z" # Optimize for size
1007
+ lto = true # Link-time optimization
1008
+ codegen-units = 1 # Better optimization
1009
+ panic = "abort" # Smaller panic handler
1010
+ strip = true # Strip debug symbols
1011
+ ```
1012
+
1013
+ **Result:** ~50-150 KB WASM binary
1014
+
1015
+ ---
1016
+
1017
+ ## API Reference
1018
+
1019
+ ### snapshot(options?)
1020
+
1021
+ **Parameters:**
1022
+ ```typescript
1023
+ {
1024
+ screenshot?: boolean | {
1025
+ format?: 'png' | 'jpeg',
1026
+ quality?: number // 0-100
1027
+ },
1028
+ limit?: number,
1029
+ filter?: {
1030
+ min_area?: number,
1031
+ allowed_roles?: string[],
1032
+ min_z_index?: number
1033
+ }
1034
+ }
1035
+ ```
1036
+
1037
+ **Returns:** `Promise<GeometryMap>`
1038
+
1039
+ ```typescript
1040
+ {
1041
+ status: "success" | "error",
1042
+ timestamp: string,
1043
+ url: string,
1044
+ viewport: { width, height },
1045
+ elements: SmartElement[],
1046
+ screenshot?: string,
1047
+ screenshot_format?: 'png' | 'jpeg',
1048
+ screenshot_error?: string
1049
+ }
1050
+ ```
1051
+
1052
+ ### SmartElement
1053
+
1054
+ ```typescript
1055
+ {
1056
+ id: number, // Registry index
1057
+ role: string, // Semantic role
1058
+ text: string | null, // Text content
1059
+ importance: number, // Score (-300 to 1800)
1060
+ visual_cues: {
1061
+ is_primary: boolean,
1062
+ background_color_name: string | null,
1063
+ is_clickable: boolean
1064
+ },
1065
+ bbox: {
1066
+ x: number,
1067
+ y: number,
1068
+ width: number,
1069
+ height: number
1070
+ },
1071
+ z_index: number
1072
+ }
1073
+ ```
1074
+
1075
+ ---
1076
+
1077
+ ## Performance
1078
+
1079
+ ### Timing Breakdown
1080
+
1081
+ | Operation | Time |
1082
+ |-----------|------|
1083
+ | Geometry extraction | 100-300ms |
1084
+ | + PNG screenshot | +50-100ms |
1085
+ | + JPEG screenshot | +30-80ms |
1086
+ | Total | 150-400ms |
1087
+
1088
+ ### Memory Usage
1089
+
1090
+ | Component | Size |
1091
+ |-----------|------|
1092
+ | WASM Binary | 50-150 KB |
1093
+ | Registry (1000 elements) | ~100 KB |
1094
+ | Raw Data | ~200 KB |
1095
+ | Smart Elements | ~100 KB |
1096
+ | Total Peak | ~500 KB |
1097
+
1098
+ ### Optimization Tips
1099
+
1100
+ 1. Use `limit` to reduce processing time
1101
+ 2. Filter with `allowed_roles` to skip irrelevant elements
1102
+ 3. Use JPEG for smaller screenshots
1103
+ 4. Cache results if analyzing same page multiple times
1104
+
1105
+ ---
1106
+
1107
+ ## Troubleshooting
1108
+
1109
+ ### "WASM not ready" Error
1110
+
1111
+ **Solution:** Wait for WASM to load
1112
+ ```javascript
1113
+ setTimeout(async () => {
1114
+ const result = await window.sentience.snapshot();
1115
+ console.log(result);
1116
+ }, 1000);
1117
+ ```
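A fixed one-second delay can be too short or needlessly long. An alternative is to retry until the call succeeds. The sketch below assumes that `snapshot()` throws or rejects while WASM is still loading, which this document does not confirm; `retry` is a hypothetical helper, and the demo uses a stand-in function so that it runs outside the browser.

```javascript
// Hypothetical helper: call fn until it succeeds or attempts run out.
async function retry(fn, { attempts = 10, delayMs = 200 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Wait before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Stand-in for a snapshot call that fails twice before succeeding.
let calls = 0;
const flakySnapshot = async () => {
  calls += 1;
  if (calls < 3) throw new Error('WASM not ready');
  return { status: 'success' };
};

retry(flakySnapshot, { attempts: 5, delayMs: 10 })
  .then((result) => console.log(result.status, 'after', calls, 'calls'));

// In the browser, usage would look like:
//   const result = await retry(() => window.sentience.snapshot());
```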
1118
+
1119
+ ### Empty Results
1120
+
1121
+ **Check:**
1122
+ ```javascript
1123
+ const result = await window.sentience.snapshot();
1124
+ console.log('Elements:', result.elements.length);
1125
+ console.log('Registry:', window.sentience_registry.length);
1126
+ ```
1127
+
1128
+ ### Screenshot Fails
1129
+
1130
+ **Causes:**
1131
+ - Extension doesn't have tab permissions
1132
+ - Page is a restricted URL (chrome://, file://)
1133
+ - Tab is not active
1134
+
1135
+ **Solution:**
1136
+ ```javascript
1137
+ const result = await window.sentience.snapshot({ screenshot: true });
1138
+ if (result.screenshot_error) {
1139
+ console.error('Screenshot error:', result.screenshot_error);
1140
+ }
1141
+ ```
1142
+
1143
+ ### Screenshot is Black
1144
+
1145
+ **Cause:** Page hasn't rendered
1146
+
1147
+ **Solution:** Add delay
1148
+ ```javascript
1149
+ await new Promise(r => setTimeout(r, 500));
1150
+ const result = await window.sentience.snapshot({ screenshot: true });
1151
+ ```
1152
+
1153
+ ### Large Screenshots
1154
+
1155
+ **Solution:** Use JPEG with lower quality
1156
+ ```javascript
1157
+ {
1158
+ screenshot: {
1159
+ format: 'jpeg',
1160
+ quality: 60
1161
+ }
1162
+ }
1163
+ ```
1164
+
1165
+ ---
1166
+
1167
+ ## Summary
1168
+
1169
+ ### What Users Can Do
1170
+
1171
+ 1. ✅ Call `window.sentience.snapshot()` to get geometry map
1172
+ 2. ✅ Add `{ screenshot: true }` for visual snapshot
1173
+ 3. ✅ Add `{ limit: N }` to get top N elements
1174
+ 4. ✅ Add `{ filter: {...} }` to filter by role/size/z-index
1175
+ 5. ✅ Visualize with bounding boxes
1176
+ 6. ✅ Send to AI vision APIs
1177
+
1178
+ ### What Users Get
1179
+
1180
+ - **Screenshot**: Base64-encoded PNG/JPEG
1181
+ - **Page info**: URL, viewport width and height
1182
+ - **Elements**: Ranked by importance; when `limit` is set, the top N are re-sorted by Y-position
1183
+ - **Positions**: Bounding boxes (x, y, width, height)
1184
+ - **Roles**: Semantic (button/link/input)
1185
+ - **Scores**: -300 to 1800 importance ranking
1186
+ - **Visual Cues**: is_primary, colors, clickability
1187
+
1188
+ ### One Function, All Features
1189
+
1190
+ ```javascript
1191
+ await window.sentience.snapshot({
1192
+ screenshot: { format: 'jpeg', quality: 80 },
1193
+ limit: 20,
1194
+ filter: {
1195
+ allowed_roles: ['button', 'textbox'],
1196
+ min_area: 100
1197
+ }
1198
+ })
1199
+ ```
1200
+
1201
+ **That's it!** 🎉
1202
+
1203
+ ---
1204
+
1205
+ ## License
1206
+
1207
+ MIT License - See LICENSE file for details.
1208
+
1209
+ ## Contributing
1210
+
1211
+ Contributions welcome! Please open issues or pull requests.
1212
+
1213
+ ## Authors
1214
+
1215
+ - Built with Rust, WASM, and Chrome Extension APIs
1216
+ - Ported from geometry-service architecture
1217
+
1218
+ ---
1219
+
1220
+ **Questions?** Open an issue on GitHub.
1221
+
1222
+ ## Calling the Chrome Extension from Python
1223
+ > Suppose we have Python code running on the same machine as the Chrome browser. Can the Python code invoke the `snapshot` function in `window.sentience` of the Chrome extension?
1224
+
1225
+ You are asking if Python code running locally can invoke the `snapshot` function exposed by the Chrome Extension.
1226
+
1227
+ The answer is **Yes**. Python can call any JavaScript function available in the page context through browser automation libraries such as Playwright or Selenium. Because your extension injects `window.sentience` into the "Main World" (the same context the page uses), your Python script can reach it via Playwright's `page.evaluate()`.
1228
+
1229
+ ### How to do it (Python Example)
1230
+
1231
+ Using **Playwright** (recommended for modern agents):
1232
+
1233
+ ```python
1234
+ from playwright.sync_api import sync_playwright
1235
+ import json
1236
+
1237
+ def run_agent():
1238
+ with sync_playwright() as p:
1239
+ # 1. Launch Chrome with your Extension loaded
1240
+ # Note: Use launch_persistent_context to keep extension state/settings
1241
+ browser = p.chromium.launch_persistent_context(
1242
+ user_data_dir="/tmp/chrome-profile",
1243
+ headless=False, # Extensions generally need a headed browser; headless support varies by Chromium version
1244
+ args=[
1245
+ "--disable-extensions-except=./path/to/your/extension",
1246
+ "--load-extension=./path/to/your/extension"
1247
+ ]
1248
+ )
1249
+
1250
+ page = browser.pages[0]
1251
+ page.goto("https://www.amazon.com")
1252
+
1253
+ # 2. Invoke your API
1254
+ # Python waits for the Promise to resolve and gets the JSON back
1255
+ print("📸 Taking snapshot...")
1256
+ result = page.evaluate("window.sentience.snapshot({ limit: 50 })")
1257
+
1258
+ if result.get("status") == "success":
1259
+ elements = result.get("elements", [])
1260
+ print(f"✅ Found {len(elements)} interactive elements")
1261
+
1262
+ # Example: Find Search Box
1263
+ search_box = next((el for el in elements if el['role'] == 'searchbox'), None)
1264
+
1265
+ if search_box:
1266
+ print(f"🔍 Clicking Search Box ID: {search_box['id']}")
1267
+ # 3. Execute Action via your API
1268
+ page.evaluate(f"window.sentience.click({search_box['id']})")
1269
+
1270
+ # Or standard Playwright typing
1271
+ page.keyboard.type("Gaming Laptop")
1272
+ page.keyboard.press("Enter")
1273
+ else:
1274
+ print(f"❌ Error: {result.get('error')}")
1275
+
1276
+ browser.close()
1277
+
1278
+ if __name__ == "__main__":
1279
+ run_agent()
1280
+
1281
+ ```
1282
+
1283
+ ### Why this architecture is powerful
1284
+
1285
+ 1. **No Remote Round Trip:** The Python script talks to the browser locally over the Chrome DevTools Protocol, so no HTTP requests go to your server.
1286
+ 2. **Shared State:** The Python script can see the `window.sentience` object just like a developer typing in the console.
1287
+ 3. **Hybrid Control:** You can mix your `sentience.snapshot()` (for vision) with standard Playwright commands (like `page.keyboard.type()` or `page.wait_for_load_state()`) for a robust agent.
1288
+ ---
1289
+
1290
+ ## Contributing
1291
+
1292
+ We welcome contributions! This extension has been fully restructured with modern tooling and modular architecture.
1293
+
1294
+ ### 🚀 **Quick Start for Contributors**
1295
+
1296
+ 1. **Clone and install:**
1297
+ ```bash
1298
+ git clone https://github.com/YOUR_ORG/sentience-chrome.git
1299
+ cd sentience-chrome
1300
+ npm install
1301
+ ```
1302
+
1303
+ 2. **Make changes** to the modular source in `src/`
1304
+
1305
+ 3. **Test your changes:**
1306
+ ```bash
1307
+ npm run lint # Check code quality
1308
+ npm test # Run tests
1309
+ npm run build # Build extension
1310
+ ```
1311
+
1312
+ 4. **Submit a PR** - CI will automatically check linting, tests, and builds
1313
+
1314
+ ### 📚 **Developer Resources**
1315
+
1316
+ - **[CONTRIBUTING.md](CONTRIBUTING.md)** - Complete developer guide
1317
+ - **[docs/RESTRUCTURING_PROGRESS.md](docs/RESTRUCTURING_PROGRESS.md)** - Architecture details
1318
+ - **Modular codebase** - 7 focused modules instead of monolith
1319
+ - **76 tests** with 80% pass rate
1320
+ - **Automated CI/CD** - GitHub Actions for quality checks
1321
+
1322
+ ### 🎯 **Key Features for Contributors**
1323
+
1324
+ - ✅ **ESLint + Prettier** - Automated code quality
1325
+ - ✅ **Jest testing** - Unit tests with coverage reports
1326
+ - ✅ **Rollup bundling** - Optimized builds
1327
+ - ✅ **CI/CD pipelines** - Automated checks on every PR
1328
+ - ✅ **Zero breaking changes** - SDK compatibility preserved
1329
+
1330
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
1331
+
1332
+ ---
1333
+
1334
+ ## License
1335
+
1336
+ See [LICENSE](LICENSE) file.
1337
+
1338
+ ---
1339
+
1340
+ **Built with ❤️ for AI agents and web automation**