@predicatelabs/sdk 0.99.9
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +24 -0
- package/README.md +252 -0
- package/dist/actions.d.ts +185 -0
- package/dist/actions.d.ts.map +1 -0
- package/dist/actions.js +1120 -0
- package/dist/actions.js.map +1 -0
- package/dist/agent-runtime.d.ts +352 -0
- package/dist/agent-runtime.d.ts.map +1 -0
- package/dist/agent-runtime.js +1170 -0
- package/dist/agent-runtime.js.map +1 -0
- package/dist/agent.d.ts +164 -0
- package/dist/agent.d.ts.map +1 -0
- package/dist/agent.js +408 -0
- package/dist/agent.js.map +1 -0
- package/dist/asserts/expect.d.ts +159 -0
- package/dist/asserts/expect.d.ts.map +1 -0
- package/dist/asserts/expect.js +547 -0
- package/dist/asserts/expect.js.map +1 -0
- package/dist/asserts/index.d.ts +58 -0
- package/dist/asserts/index.d.ts.map +1 -0
- package/dist/asserts/index.js +70 -0
- package/dist/asserts/index.js.map +1 -0
- package/dist/asserts/query.d.ts +199 -0
- package/dist/asserts/query.d.ts.map +1 -0
- package/dist/asserts/query.js +288 -0
- package/dist/asserts/query.js.map +1 -0
- package/dist/backends/actions.d.ts +119 -0
- package/dist/backends/actions.d.ts.map +1 -0
- package/dist/backends/actions.js +291 -0
- package/dist/backends/actions.js.map +1 -0
- package/dist/backends/browser-use-adapter.d.ts +131 -0
- package/dist/backends/browser-use-adapter.d.ts.map +1 -0
- package/dist/backends/browser-use-adapter.js +219 -0
- package/dist/backends/browser-use-adapter.js.map +1 -0
- package/dist/backends/cdp-backend.d.ts +66 -0
- package/dist/backends/cdp-backend.d.ts.map +1 -0
- package/dist/backends/cdp-backend.js +273 -0
- package/dist/backends/cdp-backend.js.map +1 -0
- package/dist/backends/index.d.ts +80 -0
- package/dist/backends/index.d.ts.map +1 -0
- package/dist/backends/index.js +101 -0
- package/dist/backends/index.js.map +1 -0
- package/dist/backends/protocol.d.ts +156 -0
- package/dist/backends/protocol.d.ts.map +1 -0
- package/dist/backends/protocol.js +16 -0
- package/dist/backends/protocol.js.map +1 -0
- package/dist/backends/sentience-context.d.ts +143 -0
- package/dist/backends/sentience-context.d.ts.map +1 -0
- package/dist/backends/sentience-context.js +359 -0
- package/dist/backends/sentience-context.js.map +1 -0
- package/dist/backends/snapshot.d.ts +188 -0
- package/dist/backends/snapshot.d.ts.map +1 -0
- package/dist/backends/snapshot.js +360 -0
- package/dist/backends/snapshot.js.map +1 -0
- package/dist/browser.d.ts +154 -0
- package/dist/browser.d.ts.map +1 -0
- package/dist/browser.js +920 -0
- package/dist/browser.js.map +1 -0
- package/dist/canonicalization.d.ts +126 -0
- package/dist/canonicalization.d.ts.map +1 -0
- package/dist/canonicalization.js +161 -0
- package/dist/canonicalization.js.map +1 -0
- package/dist/captcha/strategies.d.ts +12 -0
- package/dist/captcha/strategies.d.ts.map +1 -0
- package/dist/captcha/strategies.js +43 -0
- package/dist/captcha/strategies.js.map +1 -0
- package/dist/captcha/types.d.ts +45 -0
- package/dist/captcha/types.d.ts.map +1 -0
- package/dist/captcha/types.js +12 -0
- package/dist/captcha/types.js.map +1 -0
- package/dist/cli.d.ts +5 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/cli.js +422 -0
- package/dist/cli.js.map +1 -0
- package/dist/conversational-agent.d.ts +123 -0
- package/dist/conversational-agent.d.ts.map +1 -0
- package/dist/conversational-agent.js +341 -0
- package/dist/conversational-agent.js.map +1 -0
- package/dist/cursor-policy.d.ts +41 -0
- package/dist/cursor-policy.d.ts.map +1 -0
- package/dist/cursor-policy.js +81 -0
- package/dist/cursor-policy.js.map +1 -0
- package/dist/debugger.d.ts +28 -0
- package/dist/debugger.d.ts.map +1 -0
- package/dist/debugger.js +107 -0
- package/dist/debugger.js.map +1 -0
- package/dist/expect.d.ts +16 -0
- package/dist/expect.d.ts.map +1 -0
- package/dist/expect.js +67 -0
- package/dist/expect.js.map +1 -0
- package/dist/failure-artifacts.d.ts +95 -0
- package/dist/failure-artifacts.d.ts.map +1 -0
- package/dist/failure-artifacts.js +805 -0
- package/dist/failure-artifacts.js.map +1 -0
- package/dist/generator.d.ts +16 -0
- package/dist/generator.d.ts.map +1 -0
- package/dist/generator.js +205 -0
- package/dist/generator.js.map +1 -0
- package/dist/index.d.ts +37 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +160 -0
- package/dist/index.js.map +1 -0
- package/dist/inspector.d.ts +13 -0
- package/dist/inspector.d.ts.map +1 -0
- package/dist/inspector.js +153 -0
- package/dist/inspector.js.map +1 -0
- package/dist/llm-provider.d.ts +144 -0
- package/dist/llm-provider.d.ts.map +1 -0
- package/dist/llm-provider.js +460 -0
- package/dist/llm-provider.js.map +1 -0
- package/dist/ordinal.d.ts +90 -0
- package/dist/ordinal.d.ts.map +1 -0
- package/dist/ordinal.js +249 -0
- package/dist/ordinal.js.map +1 -0
- package/dist/overlay.d.ts +63 -0
- package/dist/overlay.d.ts.map +1 -0
- package/dist/overlay.js +102 -0
- package/dist/overlay.js.map +1 -0
- package/dist/protocols/browser-protocol.d.ts +79 -0
- package/dist/protocols/browser-protocol.d.ts.map +1 -0
- package/dist/protocols/browser-protocol.js +9 -0
- package/dist/protocols/browser-protocol.js.map +1 -0
- package/dist/query.d.ts +66 -0
- package/dist/query.d.ts.map +1 -0
- package/dist/query.js +482 -0
- package/dist/query.js.map +1 -0
- package/dist/read.d.ts +47 -0
- package/dist/read.d.ts.map +1 -0
- package/dist/read.js +128 -0
- package/dist/read.js.map +1 -0
- package/dist/recorder.d.ts +44 -0
- package/dist/recorder.d.ts.map +1 -0
- package/dist/recorder.js +262 -0
- package/dist/recorder.js.map +1 -0
- package/dist/runtime-agent.d.ts +72 -0
- package/dist/runtime-agent.d.ts.map +1 -0
- package/dist/runtime-agent.js +357 -0
- package/dist/runtime-agent.js.map +1 -0
- package/dist/screenshot.d.ts +17 -0
- package/dist/screenshot.d.ts.map +1 -0
- package/dist/screenshot.js +40 -0
- package/dist/screenshot.js.map +1 -0
- package/dist/snapshot-diff.d.ts +23 -0
- package/dist/snapshot-diff.d.ts.map +1 -0
- package/dist/snapshot-diff.js +119 -0
- package/dist/snapshot-diff.js.map +1 -0
- package/dist/snapshot.d.ts +47 -0
- package/dist/snapshot.d.ts.map +1 -0
- package/dist/snapshot.js +358 -0
- package/dist/snapshot.js.map +1 -0
- package/dist/textSearch.d.ts +64 -0
- package/dist/textSearch.d.ts.map +1 -0
- package/dist/textSearch.js +113 -0
- package/dist/textSearch.js.map +1 -0
- package/dist/tools/context.d.ts +18 -0
- package/dist/tools/context.d.ts.map +1 -0
- package/dist/tools/context.js +40 -0
- package/dist/tools/context.js.map +1 -0
- package/dist/tools/defaults.d.ts +5 -0
- package/dist/tools/defaults.d.ts.map +1 -0
- package/dist/tools/defaults.js +368 -0
- package/dist/tools/defaults.js.map +1 -0
- package/dist/tools/filesystem.d.ts +12 -0
- package/dist/tools/filesystem.d.ts.map +1 -0
- package/dist/tools/filesystem.js +137 -0
- package/dist/tools/filesystem.js.map +1 -0
- package/dist/tools/index.d.ts +5 -0
- package/dist/tools/index.d.ts.map +1 -0
- package/dist/tools/index.js +15 -0
- package/dist/tools/index.js.map +1 -0
- package/dist/tools/registry.d.ts +38 -0
- package/dist/tools/registry.d.ts.map +1 -0
- package/dist/tools/registry.js +100 -0
- package/dist/tools/registry.js.map +1 -0
- package/dist/tracing/cloud-sink.d.ts +189 -0
- package/dist/tracing/cloud-sink.d.ts.map +1 -0
- package/dist/tracing/cloud-sink.js +1067 -0
- package/dist/tracing/cloud-sink.js.map +1 -0
- package/dist/tracing/index-schema.d.ts +231 -0
- package/dist/tracing/index-schema.d.ts.map +1 -0
- package/dist/tracing/index-schema.js +235 -0
- package/dist/tracing/index-schema.js.map +1 -0
- package/dist/tracing/index.d.ts +12 -0
- package/dist/tracing/index.d.ts.map +1 -0
- package/dist/tracing/index.js +28 -0
- package/dist/tracing/index.js.map +1 -0
- package/dist/tracing/indexer.d.ts +20 -0
- package/dist/tracing/indexer.d.ts.map +1 -0
- package/dist/tracing/indexer.js +347 -0
- package/dist/tracing/indexer.js.map +1 -0
- package/dist/tracing/jsonl-sink.d.ts +51 -0
- package/dist/tracing/jsonl-sink.d.ts.map +1 -0
- package/dist/tracing/jsonl-sink.js +329 -0
- package/dist/tracing/jsonl-sink.js.map +1 -0
- package/dist/tracing/sink.d.ts +25 -0
- package/dist/tracing/sink.d.ts.map +1 -0
- package/dist/tracing/sink.js +15 -0
- package/dist/tracing/sink.js.map +1 -0
- package/dist/tracing/tracer-factory.d.ts +102 -0
- package/dist/tracing/tracer-factory.d.ts.map +1 -0
- package/dist/tracing/tracer-factory.js +375 -0
- package/dist/tracing/tracer-factory.js.map +1 -0
- package/dist/tracing/tracer.d.ts +140 -0
- package/dist/tracing/tracer.d.ts.map +1 -0
- package/dist/tracing/tracer.js +336 -0
- package/dist/tracing/tracer.js.map +1 -0
- package/dist/tracing/types.d.ts +203 -0
- package/dist/tracing/types.d.ts.map +1 -0
- package/dist/tracing/types.js +8 -0
- package/dist/tracing/types.js.map +1 -0
- package/dist/types.d.ts +422 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +6 -0
- package/dist/types.js.map +1 -0
- package/dist/utils/action-executor.d.ts +25 -0
- package/dist/utils/action-executor.d.ts.map +1 -0
- package/dist/utils/action-executor.js +121 -0
- package/dist/utils/action-executor.js.map +1 -0
- package/dist/utils/browser-evaluator.d.ts +76 -0
- package/dist/utils/browser-evaluator.d.ts.map +1 -0
- package/dist/utils/browser-evaluator.js +130 -0
- package/dist/utils/browser-evaluator.js.map +1 -0
- package/dist/utils/browser.d.ts +30 -0
- package/dist/utils/browser.d.ts.map +1 -0
- package/dist/utils/browser.js +75 -0
- package/dist/utils/browser.js.map +1 -0
- package/dist/utils/element-filter.d.ts +76 -0
- package/dist/utils/element-filter.d.ts.map +1 -0
- package/dist/utils/element-filter.js +195 -0
- package/dist/utils/element-filter.js.map +1 -0
- package/dist/utils/grid-utils.d.ts +37 -0
- package/dist/utils/grid-utils.d.ts.map +1 -0
- package/dist/utils/grid-utils.js +283 -0
- package/dist/utils/grid-utils.js.map +1 -0
- package/dist/utils/llm-interaction-handler.d.ts +41 -0
- package/dist/utils/llm-interaction-handler.d.ts.map +1 -0
- package/dist/utils/llm-interaction-handler.js +171 -0
- package/dist/utils/llm-interaction-handler.js.map +1 -0
- package/dist/utils/llm-response-builder.d.ts +56 -0
- package/dist/utils/llm-response-builder.d.ts.map +1 -0
- package/dist/utils/llm-response-builder.js +130 -0
- package/dist/utils/llm-response-builder.js.map +1 -0
- package/dist/utils/selector-utils.d.ts +12 -0
- package/dist/utils/selector-utils.d.ts.map +1 -0
- package/dist/utils/selector-utils.js +32 -0
- package/dist/utils/selector-utils.js.map +1 -0
- package/dist/utils/snapshot-event-builder.d.ts +28 -0
- package/dist/utils/snapshot-event-builder.d.ts.map +1 -0
- package/dist/utils/snapshot-event-builder.js +88 -0
- package/dist/utils/snapshot-event-builder.js.map +1 -0
- package/dist/utils/snapshot-processor.d.ts +27 -0
- package/dist/utils/snapshot-processor.d.ts.map +1 -0
- package/dist/utils/snapshot-processor.js +47 -0
- package/dist/utils/snapshot-processor.js.map +1 -0
- package/dist/utils/trace-event-builder.d.ts +122 -0
- package/dist/utils/trace-event-builder.d.ts.map +1 -0
- package/dist/utils/trace-event-builder.js +365 -0
- package/dist/utils/trace-event-builder.js.map +1 -0
- package/dist/utils/trace-file-manager.d.ts +70 -0
- package/dist/utils/trace-file-manager.d.ts.map +1 -0
- package/dist/utils/trace-file-manager.js +194 -0
- package/dist/utils/trace-file-manager.js.map +1 -0
- package/dist/utils/zod.d.ts +5 -0
- package/dist/utils/zod.d.ts.map +1 -0
- package/dist/utils/zod.js +80 -0
- package/dist/utils/zod.js.map +1 -0
- package/dist/utils.d.ts +8 -0
- package/dist/utils.d.ts.map +1 -0
- package/dist/utils.js +13 -0
- package/dist/utils.js.map +1 -0
- package/dist/verification.d.ts +194 -0
- package/dist/verification.d.ts.map +1 -0
- package/dist/verification.js +530 -0
- package/dist/verification.js.map +1 -0
- package/dist/vision-executor.d.ts +18 -0
- package/dist/vision-executor.d.ts.map +1 -0
- package/dist/vision-executor.js +60 -0
- package/dist/vision-executor.js.map +1 -0
- package/dist/visual-agent.d.ts +120 -0
- package/dist/visual-agent.d.ts.map +1 -0
- package/dist/visual-agent.js +796 -0
- package/dist/visual-agent.js.map +1 -0
- package/dist/wait.d.ts +35 -0
- package/dist/wait.d.ts.map +1 -0
- package/dist/wait.js +76 -0
- package/dist/wait.js.map +1 -0
- package/package.json +94 -0
- package/spec/README.md +72 -0
- package/spec/SNAPSHOT_V1.md +208 -0
- package/spec/sdk-types.md +259 -0
- package/spec/snapshot.schema.json +148 -0
- package/src/extension/background.js +104 -0
- package/src/extension/content.js +162 -0
- package/src/extension/injected_api.js +1399 -0
- package/src/extension/manifest.json +36 -0
- package/src/extension/pkg/README.md +1340 -0
- package/src/extension/pkg/package.json +15 -0
- package/src/extension/pkg/sentience_core.d.ts +51 -0
- package/src/extension/pkg/sentience_core.js +371 -0
- package/src/extension/pkg/sentience_core_bg.wasm +0 -0
- package/src/extension/pkg/sentience_core_bg.wasm.d.ts +10 -0
- package/src/extension/release.json +116 -0
|
@@ -0,0 +1,1340 @@
|
|
|
1
|
+
# Sentience Chrome Extension - Complete Documentation
|
|
2
|
+
|
|
3
|
+
**A Rust/WASM-powered Chrome extension for extracting geometric layouts, visual cues, and importance scores from web pages.**
|
|
4
|
+
|
|
5
|
+
Perfect for AI agents, automation scripts, visual grounding, and accessibility tools.
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## Table of Contents
|
|
10
|
+
|
|
11
|
+
1. [Overview](#overview)
|
|
12
|
+
2. [Quick Start](#quick-start)
|
|
13
|
+
3. [Developer Quick Reference](#developer-quick-reference)
|
|
14
|
+
4. [Installation](#installation)
|
|
15
|
+
5. [User API](#user-api)
|
|
16
|
+
6. [Usage Examples](#usage-examples)
|
|
17
|
+
7. [Screenshot Feature](#screenshot-feature)
|
|
18
|
+
8. [Bounding Box Visualization](#bounding-box-visualization)
|
|
19
|
+
9. [Filtering & Ranking](#filtering--ranking)
|
|
20
|
+
10. [Architecture](#architecture)
|
|
21
|
+
11. [Implementation Details](#implementation-details)
|
|
22
|
+
12. [API Reference](#api-reference)
|
|
23
|
+
13. [Performance](#performance)
|
|
24
|
+
14. [Troubleshooting](#troubleshooting)
|
|
25
|
+
15. [Contributing](#contributing)
|
|
26
|
+
|
|
27
|
+
---
|
|
28
|
+
|
|
29
|
+
## Overview
|
|
30
|
+
|
|
31
|
+
### What It Does
|
|
32
|
+
|
|
33
|
+
Extracts a **geometry map** of any webpage:
|
|
34
|
+
- **Element positions** (bounding boxes)
|
|
35
|
+
- **Semantic roles** (button, link, textbox, etc.)
|
|
36
|
+
- **Importance scores** (AI-optimized ranking)
|
|
37
|
+
- **Visual cues** (colors, primary actions, clickability)
|
|
38
|
+
- **Screenshots** (Base64 PNG/JPEG)
|
|
39
|
+
|
|
40
|
+
### Why Use It?
|
|
41
|
+
|
|
42
|
+
✅ **AI Agent Navigation** - Provide structured page data to LLMs
|
|
43
|
+
✅ **Visual Grounding** - Combine screenshots with element positions
|
|
44
|
+
✅ **Accessibility Auditing** - Find low-importance or inaccessible elements
|
|
45
|
+
✅ **Form Detection** - Extract all input fields automatically
|
|
46
|
+
✅ **Primary CTA Detection** - Find the main call-to-action button
|
|
47
|
+
✅ **Automation** - Programmatically interact with web pages
|
|
48
|
+
|
|
49
|
+
### Technology Stack
|
|
50
|
+
|
|
51
|
+
- **Frontend**: JavaScript (DOM extraction)
|
|
52
|
+
- **Backend**: Rust/WASM (analysis & ranking)
|
|
53
|
+
- **Size**: ~50-150 KB WASM binary
|
|
54
|
+
- **Performance**: 100-300ms typical response
|
|
55
|
+
|
|
56
|
+
---
|
|
57
|
+
|
|
58
|
+
## Quick Start
|
|
59
|
+
|
|
60
|
+
### 5-Second Test
|
|
61
|
+
|
|
62
|
+
```bash
|
|
63
|
+
# 1. Install dependencies
|
|
64
|
+
npm install
|
|
65
|
+
|
|
66
|
+
# 2. Build extension (WASM + JavaScript)
|
|
67
|
+
npm run build
|
|
68
|
+
|
|
69
|
+
# 3. Load extension in Chrome
|
|
70
|
+
# chrome://extensions → Enable Developer mode → Load unpacked → Select this directory
|
|
71
|
+
|
|
72
|
+
# 4. Open any webpage, then in DevTools Console:
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
```javascript
|
|
76
|
+
const result = await window.sentience.snapshot({ limit: 5 });
|
|
77
|
+
console.log(result.elements);
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
**That's it!** You now have the top 5 most important elements with positions, roles, and scores.
|
|
81
|
+
|
|
82
|
+
---
|
|
83
|
+
|
|
84
|
+
## Developer Quick Reference
|
|
85
|
+
|
|
86
|
+
### 📦 **Build Commands**
|
|
87
|
+
|
|
88
|
+
| Command | What it does |
|
|
89
|
+
|---------|-------------|
|
|
90
|
+
| `npm install` | Install dependencies (first time only) |
|
|
91
|
+
| `npm run build` | **Full build** (WASM + JavaScript bundles) |
|
|
92
|
+
| `npm run build:wasm` | Build only WASM (Rust → pkg/) |
|
|
93
|
+
| `npm run build:bundle` | Build only JavaScript (src/ → dist/) |
|
|
94
|
+
|
|
95
|
+
### ✅ **Code Quality**
|
|
96
|
+
|
|
97
|
+
| Command | What it does |
|
|
98
|
+
|---------|-------------|
|
|
99
|
+
| `npm run lint` | Check code quality with ESLint |
|
|
100
|
+
| `npm run lint:fix` | Auto-fix linting issues |
|
|
101
|
+
| `npm run format` | Format code with Prettier |
|
|
102
|
+
| `npm run format:check` | Check code formatting |
|
|
103
|
+
|
|
104
|
+
### 🧪 **Testing**
|
|
105
|
+
|
|
106
|
+
| Command | What it does |
|
|
107
|
+
|---------|-------------|
|
|
108
|
+
| `npm test` | Run all tests with coverage |
|
|
109
|
+
| `npm run test:watch` | Run tests in watch mode |
|
|
110
|
+
| `npm run test:coverage` | Generate coverage report |
|
|
111
|
+
|
|
112
|
+
### 📁 **Project Structure**
|
|
113
|
+
|
|
114
|
+
```
|
|
115
|
+
sentience-chrome/
|
|
116
|
+
├── src/ # Modular source code
|
|
117
|
+
│ ├── background/ # Service worker with WASM
|
|
118
|
+
│ ├── content/ # Message bridge
|
|
119
|
+
│ └── injected/ # Main API (7 modules)
|
|
120
|
+
├── tests/ # Unit tests (76 tests)
|
|
121
|
+
├── dist/ # Bundled output (gitignored)
|
|
122
|
+
├── pkg/ # WASM artifacts (gitignored)
|
|
123
|
+
└── docs/ # Documentation
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
### 🔄 **Development Workflow**
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
# 1. Make changes to src/injected/utils.js (for example)
|
|
130
|
+
# 2. Rebuild JavaScript (fast)
|
|
131
|
+
npm run build:bundle
|
|
132
|
+
|
|
133
|
+
# 3. Reload extension in Chrome
|
|
134
|
+
# Click refresh icon in chrome://extensions/
|
|
135
|
+
```
|
|
136
|
+
|
|
137
|
+
### 📚 **Documentation**
|
|
138
|
+
|
|
139
|
+
- **[CONTRIBUTING.md](CONTRIBUTING.md)** - Developer guide
|
|
140
|
+
- **[docs/RESTRUCTURING_PROGRESS.md](docs/RESTRUCTURING_PROGRESS.md)** - Architecture overview
|
|
141
|
+
- **[docs/RESTRUCTURING_ASSESSMENT.md](docs/RESTRUCTURING_ASSESSMENT.md)** - Planning document
|
|
142
|
+
|
|
143
|
+
---
|
|
144
|
+
|
|
145
|
+
## Installation
|
|
146
|
+
|
|
147
|
+
### Prerequisites
|
|
148
|
+
|
|
149
|
+
- **Rust** (for WASM compilation)
|
|
150
|
+
- **wasm-pack** (`cargo install wasm-pack`)
|
|
151
|
+
- **Chrome/Chromium** browser
|
|
152
|
+
|
|
153
|
+
### Build Steps
|
|
154
|
+
|
|
155
|
+
**Option 1: Use the build script (Recommended)**
|
|
156
|
+
|
|
157
|
+
```bash
|
|
158
|
+
# 1. Navigate to directory
|
|
159
|
+
cd path/to/sentience-chrome/Claude  # replace with the location of your local checkout
|
|
160
|
+
|
|
161
|
+
# 2. Run build script
|
|
162
|
+
./build.sh
|
|
163
|
+
|
|
164
|
+
# The script will:
|
|
165
|
+
# - Check for wasm-pack
|
|
166
|
+
# - Build the WASM module
|
|
167
|
+
# - Show file sizes
|
|
168
|
+
# - Display next steps
|
|
169
|
+
```
|
|
170
|
+
|
|
171
|
+
**Option 2: Manual build**
|
|
172
|
+
|
|
173
|
+
```bash
|
|
174
|
+
# 1. Navigate to directory
|
|
175
|
+
cd path/to/sentience-chrome/Claude  # replace with the location of your local checkout
|
|
176
|
+
|
|
177
|
+
# 2. Build WASM module
|
|
178
|
+
wasm-pack build --target web
|
|
179
|
+
|
|
180
|
+
# 3. Verify output
|
|
181
|
+
ls pkg/
|
|
182
|
+
# Should see: sentience_core.js, sentience_core_bg.wasm
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
**Load in Chrome:**
|
|
186
|
+
|
|
187
|
+
1. Open `chrome://extensions`
|
|
188
|
+
2. Enable "Developer mode" (top-right toggle)
|
|
189
|
+
3. Click "Load unpacked"
|
|
190
|
+
4. Select the `Claude/` directory
|
|
191
|
+
5. Test on any webpage in DevTools Console:
|
|
192
|
+
```javascript
|
|
193
|
+
await window.sentience.snapshot()
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
### File Structure
|
|
197
|
+
|
|
198
|
+
```
|
|
199
|
+
Claude/
|
|
200
|
+
├── src/
|
|
201
|
+
│ └── lib.rs # Rust/WASM logic
|
|
202
|
+
├── pkg/ # Generated by wasm-pack
|
|
203
|
+
│ ├── sentience_core.js
|
|
204
|
+
│ └── sentience_core_bg.wasm
|
|
205
|
+
├── content.js # JavaScript entry point
|
|
206
|
+
├── background.js # Screenshot capture
|
|
207
|
+
├── manifest.json # Chrome extension config
|
|
208
|
+
├── Cargo.toml # Rust dependencies
|
|
209
|
+
├── build.sh # Build script (executable)
|
|
210
|
+
├── README.md # This file
|
|
211
|
+
├── IMPLEMENTATION_SUMMARY.md # Technical implementation details
|
|
212
|
+
└── prompt.md # Original task instructions
|
|
213
|
+
```
|
|
214
|
+
|
|
215
|
+
---
|
|
216
|
+
|
|
217
|
+
## User API
|
|
218
|
+
|
|
219
|
+
### Core Functions
|
|
220
|
+
|
|
221
|
+
The extension provides two main functions:
|
|
222
|
+
|
|
223
|
+
1. **`window.sentience.snapshot(options?)`** - Extract page geometry and elements
|
|
224
|
+
2. **`window.sentience.findTextRect(options)`** - Find exact pixel coordinates of text
|
|
225
|
+
|
|
226
|
+
### snapshot() - Geometry Extraction
|
|
227
|
+
|
|
228
|
+
```javascript
|
|
229
|
+
window.sentience.snapshot(options?)
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
**Capabilities:**
|
|
233
|
+
- Get geometry map
|
|
234
|
+
- Capture screenshot
|
|
235
|
+
- Filter by role/size/z-index
|
|
236
|
+
- Limit to top N elements
|
|
237
|
+
|
|
238
|
+
### Complete Options
|
|
239
|
+
|
|
240
|
+
```typescript
|
|
241
|
+
await window.sentience.snapshot({
|
|
242
|
+
// Screenshot
|
|
243
|
+
screenshot?: boolean | {
|
|
244
|
+
format: 'png' | 'jpeg',
|
|
245
|
+
quality: number // 0-100, JPEG only
|
|
246
|
+
},
|
|
247
|
+
|
|
248
|
+
// Filtering
|
|
249
|
+
limit?: number,
|
|
250
|
+
filter?: {
|
|
251
|
+
min_area?: number,
|
|
252
|
+
allowed_roles?: string[],
|
|
253
|
+
min_z_index?: number
|
|
254
|
+
}
|
|
255
|
+
})
|
|
256
|
+
```
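To make the `limit` and `filter` options concrete, here is a minimal sketch that applies the same three criteria to an already-fetched `elements` array in plain JavaScript. It is an illustration only: the extension's own filtering may behave differently, and the semantics used here (inclusive thresholds, area computed as `bbox.width * bbox.height`, ranking by descending `importance`) are assumptions, as is the helper name `filterElements`.

```javascript
// Hypothetical helper: mimics the documented filter options on the client side.
// Assumptions (not confirmed by the README): thresholds are inclusive, area is
// bbox.width * bbox.height, and results are ranked by descending importance.
function filterElements(elements, { limit, filter = {} } = {}) {
  const { min_area, allowed_roles, min_z_index } = filter;

  const kept = elements.filter((el) => {
    const area = el.bbox.width * el.bbox.height;
    if (min_area !== undefined && area < min_area) return false;
    if (allowed_roles !== undefined && !allowed_roles.includes(el.role)) return false;
    if (min_z_index !== undefined && el.z_index < min_z_index) return false;
    return true;
  });

  // filter() returned a new array, so sorting it does not mutate the input.
  kept.sort((a, b) => b.importance - a.importance);
  return limit === undefined ? kept : kept.slice(0, limit);
}

// Sample data in the documented element shape (values are made up).
const sampleElements = [
  { id: 1, role: 'button', importance: 900, bbox: { x: 0, y: 0, width: 100, height: 40 }, z_index: 1 },
  { id: 2, role: 'link', importance: 500, bbox: { x: 0, y: 50, width: 80, height: 20 }, z_index: 0 },
  { id: 3, role: 'button', importance: 300, bbox: { x: 0, y: 80, width: 10, height: 10 }, z_index: 5 },
];

const bigButtons = filterElements(sampleElements, {
  limit: 5,
  filter: { allowed_roles: ['button'], min_area: 1000 },
});
console.log(bigButtons.map((el) => el.id)); // [ 1 ]
```

A helper like this can be handy for narrowing a snapshot further without requesting a new one.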
|
|
257
|
+
|
|
258
|
+
### Response Format
|
|
259
|
+
|
|
260
|
+
```typescript
|
|
261
|
+
{
|
|
262
|
+
status: "success",
|
|
263
|
+
timestamp: string,
|
|
264
|
+
url: string,
|
|
265
|
+
viewport: { width: number, height: number },
|
|
266
|
+
|
|
267
|
+
// Geometry data
|
|
268
|
+
elements: [{
|
|
269
|
+
id: number,
|
|
270
|
+
role: string,
|
|
271
|
+
importance: number,
|
|
272
|
+
visual_cues: {
|
|
273
|
+
is_primary: boolean,
|
|
274
|
+
background_color_name: string | null,
|
|
275
|
+
is_clickable: boolean
|
|
276
|
+
},
|
|
277
|
+
bbox: { x: number, y: number, width: number, height: number },
|
|
278
|
+
z_index: number
|
|
279
|
+
}],
|
|
280
|
+
|
|
281
|
+
// Screenshot (optional)
|
|
282
|
+
screenshot?: string, // Base64 data URL
|
|
283
|
+
screenshot_format?: 'png' | 'jpeg',
|
|
284
|
+
screenshot_error?: string
|
|
285
|
+
}
|
|
286
|
+
```
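Agents that act on a snapshot often need a single point per element, or need to know which element sits on top at a given point. The sketch below works purely on the documented response shape. It assumes that `bbox.x`/`bbox.y` denote the top-left corner and that a higher `z_index` means "drawn on top"; the README states neither explicitly, and the helper names are hypothetical.

```javascript
// Hypothetical helpers operating on the documented `elements` shape.
// Assumption: bbox.x / bbox.y are the top-left corner of the box.
function bboxCenter(bbox) {
  return { x: bbox.x + bbox.width / 2, y: bbox.y + bbox.height / 2 };
}

function bboxContains(bbox, x, y) {
  return (
    x >= bbox.x && x <= bbox.x + bbox.width &&
    y >= bbox.y && y <= bbox.y + bbox.height
  );
}

// Returns the element with the highest z_index whose box contains the point,
// or null when no element contains it.
function topElementAt(elements, x, y) {
  let best = null;
  for (const el of elements) {
    if (!bboxContains(el.bbox, x, y)) continue;
    if (best === null || el.z_index > best.z_index) best = el;
  }
  return best;
}

// Made-up sample data: a button layered over a larger link.
const elements = [
  { id: 1, role: 'link', bbox: { x: 0, y: 0, width: 200, height: 100 }, z_index: 0 },
  { id: 2, role: 'button', bbox: { x: 50, y: 20, width: 100, height: 40 }, z_index: 10 },
];

console.log(bboxCenter(elements[1].bbox));        // { x: 100, y: 40 }
console.log(topElementAt(elements, 100, 40).id);  // 2
console.log(topElementAt(elements, 10, 90).id);   // 1
```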
|
|
287
|
+
|
|
288
|
+
### findTextRect() - Text Location Finder
|
|
289
|
+
|
|
290
|
+
Find exact pixel coordinates of any text on the page using the DOM Range API. Perfect for highlighting specific words, clicking on text, or text-based navigation **without Vision Models**.
|
|
291
|
+
|
|
292
|
+
```javascript
|
|
293
|
+
window.sentience.findTextRect(options)
|
|
294
|
+
```
|
|
295
|
+
|
|
296
|
+
**Parameters:**
|
|
297
|
+
```typescript
|
|
298
|
+
{
|
|
299
|
+
text: string, // Required: Text to find
|
|
300
|
+
containerElement?: Element, // Optional: Search within (default: document.body)
|
|
301
|
+
caseSensitive?: boolean, // Optional: Case-sensitive search (default: false)
|
|
302
|
+
wholeWord?: boolean, // Optional: Match whole words only (default: false)
|
|
303
|
+
maxResults?: number // Optional: Limit results (default: 10)
|
|
304
|
+
}
|
|
305
|
+
```
|
|
306
|
+
|
|
307
|
+
**Returns:**
|
|
308
|
+
```typescript
|
|
309
|
+
{
|
|
310
|
+
status: "success" | "error",
|
|
311
|
+
query: string, // The search text
|
|
312
|
+
case_sensitive: boolean,
|
|
313
|
+
whole_word: boolean,
|
|
314
|
+
matches: number, // Total matches found
|
|
315
|
+
results: [{
|
|
316
|
+
text: string, // Actual matched text
|
|
317
|
+
rect: { // Absolute page coordinates (viewport position plus scroll offset)
|
|
318
|
+
x: number,
|
|
319
|
+
y: number,
|
|
320
|
+
width: number,
|
|
321
|
+
height: number,
|
|
322
|
+
left: number,
|
|
323
|
+
top: number,
|
|
324
|
+
right: number,
|
|
325
|
+
bottom: number
|
|
326
|
+
},
|
|
327
|
+
viewport_rect: { // Viewport-relative coordinates
|
|
328
|
+
x: number,
|
|
329
|
+
y: number,
|
|
330
|
+
width: number,
|
|
331
|
+
height: number
|
|
332
|
+
},
|
|
333
|
+
context: { // Surrounding text
|
|
334
|
+
before: string, // 20 chars before
|
|
335
|
+
after: string // 20 chars after
|
|
336
|
+
},
|
|
337
|
+
in_viewport: boolean // Is it currently visible?
|
|
338
|
+
}],
|
|
339
|
+
viewport: {
|
|
340
|
+
width: number,
|
|
341
|
+
height: number,
|
|
342
|
+
scroll_x: number,
|
|
343
|
+
scroll_y: number
|
|
344
|
+
},
|
|
345
|
+
error?: string // Error message if status is "error"
|
|
346
|
+
}
|
|
347
|
+
```
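The two coordinate systems in each result should differ only by the scroll offset, if `rect` is indeed `viewport_rect` shifted by `viewport.scroll_x`/`viewport.scroll_y` as the comments suggest. A small sketch under that assumption follows. `toViewportRect` and `isInViewport` are hypothetical names, and the visibility rule used here (any overlap with the viewport) may not match how the extension computes `in_viewport`.

```javascript
// Assumption: rect (absolute) = viewport_rect + scroll offset.
function toViewportRect(rect, viewport) {
  return {
    x: rect.x - viewport.scroll_x,
    y: rect.y - viewport.scroll_y,
    width: rect.width,
    height: rect.height,
  };
}

// Treats a rect as visible when it overlaps the viewport at all.
function isInViewport(viewportRect, viewport) {
  return (
    viewportRect.x < viewport.width &&
    viewportRect.x + viewportRect.width > 0 &&
    viewportRect.y < viewport.height &&
    viewportRect.y + viewportRect.height > 0
  );
}

// Made-up values: the page is scrolled down by 1000px.
const viewport = { width: 1280, height: 720, scroll_x: 0, scroll_y: 1000 };
const rect = { x: 100, y: 1200, width: 80, height: 20 };

const viewportRect = toViewportRect(rect, viewport);
console.log(viewportRect);                         // { x: 100, y: 200, width: 80, height: 20 }
console.log(isInViewport(viewportRect, viewport)); // true
```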
|
|
348
|
+
|
|
349
|
+
**Usage Examples:**
|
|
350
|
+
|
|
351
|
+
```javascript
|
|
352
|
+
// Example 1: Find "Add to Cart" text
|
|
353
|
+
const result = await window.sentience.findTextRect({
|
|
354
|
+
text: "Add to Cart"
|
|
355
|
+
});
|
|
356
|
+
|
|
357
|
+
if (result.status === "success") {
|
|
358
|
+
console.log(`Found ${result.matches} occurrences`);
|
|
359
|
+
result.results.forEach((match, i) => {
|
|
360
|
+
console.log(`${i+1}. At (${match.rect.x}, ${match.rect.y})`);
|
|
361
|
+
console.log(` Context: "${match.context.before}${match.text}${match.context.after}"`);
|
|
362
|
+
});
|
|
363
|
+
}
|
|
364
|
+
|
|
365
|
+
// Example 2: Highlight all matches
|
|
366
|
+
const priceResult = await window.sentience.findTextRect({
|
|
367
|
+
text: "price",
|
|
368
|
+
caseSensitive: false,
|
|
369
|
+
maxResults: 20
|
|
370
|
+
});
|
|
371
|
+
|
|
372
|
+
priceResult.results.forEach(match => {
|
|
373
|
+
const highlight = document.createElement('div');
|
|
374
|
+
highlight.style.cssText = `
|
|
375
|
+
position: absolute;
|
|
376
|
+
left: ${match.rect.x}px;
|
|
377
|
+
top: ${match.rect.y}px;
|
|
378
|
+
width: ${match.rect.width}px;
|
|
379
|
+
height: ${match.rect.height}px;
|
|
380
|
+
background: yellow;
|
|
381
|
+
opacity: 0.5;
|
|
382
|
+
pointer-events: none;
|
|
383
|
+
z-index: 9999;
|
|
384
|
+
`;
|
|
385
|
+
document.body.appendChild(highlight);
|
|
386
|
+
});
|
|
387
|
+
|
|
388
|
+
// Example 3: Click on specific text (not button!)
|
|
389
|
+
const termsResult = await window.sentience.findTextRect({
|
|
390
|
+
text: "Terms of Service",
|
|
391
|
+
wholeWord: true
|
|
392
|
+
});
|
|
393
|
+
|
|
394
|
+
if (termsResult.matches > 0) {
|
|
395
|
+
const first = termsResult.results[0];
|
|
396
|
+
// Click the center of the text
|
|
397
|
+
const centerX = first.viewport_rect.x + first.viewport_rect.width / 2;
|
|
398
|
+
const centerY = first.viewport_rect.y + first.viewport_rect.height / 2;
|
|
399
|
+
|
|
400
|
+
document.elementFromPoint(centerX, centerY)?.click();
|
|
401
|
+
}
|
|
402
|
+
|
|
403
|
+
// Example 4: Find text only in header
|
|
404
|
+
const header = document.querySelector('header');
|
|
405
|
+
const loginResult = await window.sentience.findTextRect({
|
|
406
|
+
text: "Login",
|
|
407
|
+
containerElement: header
|
|
408
|
+
});
|
|
409
|
+
|
|
410
|
+
// Example 5: Scroll to first match
|
|
411
|
+
const contactResult = await window.sentience.findTextRect({
|
|
412
|
+
text: "Contact Us"
|
|
413
|
+
});
|
|
414
|
+
|
|
415
|
+
if (contactResult.matches > 0) {
|
|
416
|
+
const first = contactResult.results[0];
|
|
417
|
+
window.scrollTo({
|
|
418
|
+
top: first.rect.y - 100, // Offset for header
|
|
419
|
+
behavior: 'smooth'
|
|
420
|
+
});
|
|
421
|
+
}
|
|
422
|
+
```
|
|
423
|
+
|
|
424
|
+
**Use Cases:**
|
|
425
|
+
- 🎯 **Text-based clicking** - Click on text that's not in a button
|
|
426
|
+
- 🖍️ **Text highlighting** - Draw bounding boxes around specific words
|
|
427
|
+
- 📍 **Text navigation** - Scroll to specific content
|
|
428
|
+
- ♿ **Accessibility** - Find and highlight important text
|
|
429
|
+
- 🤖 **AI Agents** - Locate text without vision models
|
|
430
|
+
- 🔍 **Search results** - Find and highlight search terms
|
|
431
|
+
|
|
432
|
+
**Features:**
|
|
433
|
+
- ✅ Pixel-perfect coordinates using DOM Range API
|
|
434
|
+
- ✅ Filters invisible/hidden text automatically
|
|
435
|
+
- ✅ Returns both absolute and viewport-relative coordinates
|
|
436
|
+
- ✅ Provides context for ambiguous matches
|
|
437
|
+
- ✅ Handles multiple occurrences
|
|
438
|
+
- ✅ Performance-safe with result limits
|
|
439
|
+
- ✅ Works with case-insensitive and whole-word matching
|
|
440
|
+
|
|
441
|
+
---
|
|
442
|
+
|
|
443
|
+
## Usage Examples
|
|
444
|
+
|
|
445
|
+
### Example 1: Basic Geometry Map
|
|
446
|
+
|
|
447
|
+
```javascript
|
|
448
|
+
const result = await window.sentience.snapshot();
|
|
449
|
+
console.log(`Found ${result.elements.length} elements`);
|
|
450
|
+
console.log('Top element:', result.elements[0]);
|
|
451
|
+
```
|
|
452
|
+
|
|
453
|
+
### Example 2: Top 10 Most Important
|
|
454
|
+
|
|
455
|
+
```javascript
|
|
456
|
+
const top10 = await window.sentience.snapshot({ limit: 10 });
|
|
457
|
+
top10.elements.forEach((el, i) => {
|
|
458
|
+
console.log(`${i+1}. [${el.role}] Score: ${el.importance}, Position: (${el.bbox.x}, ${el.bbox.y})`);
|
|
459
|
+
});
|
|
460
|
+
```
|
|
461
|
+
|
|
462
|
+
### Example 3: Filter for Buttons Only
|
|
463
|
+
|
|
464
|
+
```javascript
|
|
465
|
+
const buttons = await window.sentience.snapshot({
|
|
466
|
+
filter: { allowed_roles: ['button', 'submit'] }
|
|
467
|
+
});
|
|
468
|
+
console.log(`Found ${buttons.elements.length} buttons`);
|
|
469
|
+
```
|
|
470
|
+
|
|
471
|
+
### Example 4: Find Primary Action
|
|
472
|
+
|
|
473
|
+
```javascript
|
|
474
|
+
const result = await window.sentience.snapshot();
|
|
475
|
+
const cta = result.elements.find(el =>
|
|
476
|
+
el.visual_cues.is_primary && el.role === 'button'
|
|
477
|
+
);
|
|
478
|
+
console.log('Primary CTA:', cta);
|
|
479
|
+
```
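If no element is flagged `is_primary`, the lookup above yields `undefined`. One possible fallback, sketched here with a hypothetical `findPrimaryAction` helper, is to take the highest-scoring clickable element instead. Whether that heuristic is sensible depends on the page; it is not something the extension itself is documented to do.

```javascript
// Hypothetical helper: prefer a button flagged as primary, otherwise fall back
// to the clickable element with the highest importance score.
function findPrimaryAction(elements) {
  const primary = elements.find(
    (el) => el.visual_cues.is_primary && el.role === 'button'
  );
  if (primary) return primary;

  const clickable = elements.filter((el) => el.visual_cues.is_clickable);
  if (clickable.length === 0) return null;

  return clickable.reduce((best, el) => (el.importance > best.importance ? el : best));
}

// Made-up sample data in the documented element shape.
const withPrimary = [
  { id: 1, role: 'link', importance: 800, visual_cues: { is_primary: false, is_clickable: true } },
  { id: 2, role: 'button', importance: 600, visual_cues: { is_primary: true, is_clickable: true } },
];
const withoutPrimary = [
  { id: 3, role: 'link', importance: 400, visual_cues: { is_primary: false, is_clickable: true } },
  { id: 4, role: 'button', importance: 700, visual_cues: { is_primary: false, is_clickable: true } },
  { id: 5, role: 'heading', importance: 900, visual_cues: { is_primary: false, is_clickable: false } },
];

console.log(findPrimaryAction(withPrimary).id);    // 2
console.log(findPrimaryAction(withoutPrimary).id); // 4
console.log(findPrimaryAction([]));                // null
```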
|
|
480
|
+
|
|
481
|
+
### Example 5: AI Agent Prompt
|
|
482
|
+
|
|
483
|
+
```javascript
|
|
484
|
+
const top5 = await window.sentience.snapshot({ limit: 5 });
|
|
485
|
+
|
|
486
|
+
const prompt = `
|
|
487
|
+
Available actions:
|
|
488
|
+
${top5.elements.map((el, i) =>
|
|
489
|
+
`${i+1}. ${el.role} at (${el.bbox.x}, ${el.bbox.y})`
|
|
490
|
+
).join('\n')}
|
|
491
|
+
|
|
492
|
+
Which action should I take to search for products?
|
|
493
|
+
`;
|
|
494
|
+
|
|
495
|
+
// Send to LLM API
|
|
496
|
+
```
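The prompt numbers the actions starting from 1, so the model's reply has to be mapped back to an element. A minimal sketch, assuming the reply contains the chosen number somewhere in its text; `resolveChoice` is a hypothetical helper, and real replies may well need stricter parsing.

```javascript
// Hypothetical helper: maps a reply such as "Action 2" back to the element
// that was listed under that 1-based number. Returns null if no valid number
// is found.
function resolveChoice(reply, elements) {
  const match = /\d+/.exec(reply);
  if (!match) return null;

  const index = Number(match[0]) - 1; // the prompt is 1-based
  return index >= 0 && index < elements.length ? elements[index] : null;
}

// Made-up sample data in the documented element shape.
const listed = [
  { id: 11, role: 'searchbox', bbox: { x: 300, y: 20, width: 400, height: 32 } },
  { id: 12, role: 'button', bbox: { x: 710, y: 20, width: 60, height: 32 } },
];

console.log(resolveChoice('I would pick action 1.', listed).id); // 11
console.log(resolveChoice('none of them', listed));              // null
console.log(resolveChoice('7', listed));                         // null
```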
|
|
497
|
+
|
|
498
|
+
### Example 6: Form Field Detection
|
|
499
|
+
|
|
500
|
+
```javascript
|
|
501
|
+
const inputs = await window.sentience.snapshot({
|
|
502
|
+
filter: {
|
|
503
|
+
allowed_roles: ['textbox', 'searchbox', 'checkbox', 'radio']
|
|
504
|
+
}
|
|
505
|
+
});
|
|
506
|
+
|
|
507
|
+
console.log('Form schema:', inputs.elements.map(el => ({
|
|
508
|
+
role: el.role,
|
|
509
|
+
position: el.bbox,
|
|
510
|
+
importance: el.importance
|
|
511
|
+
})));
|
|
512
|
+
```
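Snapshot results appear to be ranked by importance, which is not necessarily the order in which a form should be filled in. The sketch below re-sorts fields top-to-bottom and then left-to-right using only `bbox`. The helper name and the `rowTolerance` parameter are assumptions: a field joins the current row when its `y` is within that many pixels of the row's first field.

```javascript
// Hypothetical helper: orders elements in reading order based on their bbox.
function inReadingOrder(elements, rowTolerance = 8) {
  // Sort a copy by vertical position so the input array is left untouched.
  const byY = [...elements].sort((a, b) => a.bbox.y - b.bbox.y);

  // Group into rows: an element joins the current row if its y is close
  // enough to the y of the first element in that row.
  const rows = [];
  for (const el of byY) {
    const row = rows[rows.length - 1];
    if (row && el.bbox.y - row[0].bbox.y <= rowTolerance) {
      row.push(el);
    } else {
      rows.push([el]);
    }
  }

  // Within each row, order left to right.
  return rows.flatMap((row) => row.sort((a, b) => a.bbox.x - b.bbox.x));
}

// Made-up form: two fields side by side, one below.
const fields = [
  { id: 'last', role: 'textbox', bbox: { x: 200, y: 102, width: 150, height: 30 } },
  { id: 'email', role: 'textbox', bbox: { x: 20, y: 160, width: 330, height: 30 } },
  { id: 'first', role: 'textbox', bbox: { x: 20, y: 100, width: 150, height: 30 } },
];

console.log(inReadingOrder(fields).map((el) => el.id)); // [ 'first', 'last', 'email' ]
```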
|
|
513
|
+
|
|
514
|
+
---
|
|
515
|
+
|
|
516
|
+
## Screenshot Feature
|
|
517
|
+
|
|
518
|
+
### Basic Screenshot
|
|
519
|
+
|
|
520
|
+
```javascript
|
|
521
|
+
const result = await window.sentience.snapshot({
|
|
522
|
+
screenshot: true
|
|
523
|
+
});
|
|
524
|
+
|
|
525
|
+
console.log('Screenshot:', result.screenshot); // Base64 data URL
|
|
526
|
+
console.log('Format:', result.screenshot_format); // "png"
|
|
527
|
+
```
|
|
528
|
+
|
|
529
|
+
### JPEG for Smaller Size
|
|
530
|
+
|
|
531
|
+
```javascript
|
|
532
|
+
const result = await window.sentience.snapshot({
|
|
533
|
+
screenshot: {
|
|
534
|
+
format: 'jpeg',
|
|
535
|
+
quality: 80 // Recommended: 70-85
|
|
536
|
+
}
|
|
537
|
+
});
|
|
538
|
+
```
|
|
539
|
+
|
|
540
|
+
### Screenshot Format
|
|
541
|
+
|
|
542
|
+
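**Important:** The screenshot is returned as a **Base64 data URL string**, not a file path. It can be used directly as an image `src`, but it must be decoded before being written to disk.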
|
|
543
|
+
|
|
544
|
+
```
|
|
545
|
+
"..."
|
|
546
|
+
```
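Some consumers want the MIME type and the raw Base64 payload separately rather than the whole data URL. The sketch below splits the standard `data:<mime>;base64,<payload>` form; `parseDataUrl` is a hypothetical helper name, not something the extension provides:

```javascript
// Hypothetical helper: splits a Base64 data URL into its MIME type and payload.
// Returns null when the input is not a Base64 data URL (for example a file path).
function parseDataUrl(dataUrl) {
  const prefix = 'data:';
  const marker = ';base64,';
  const markerAt = dataUrl.indexOf(marker);
  if (!dataUrl.startsWith(prefix) || markerAt === -1) return null;
  return {
    mimeType: dataUrl.slice(prefix.length, markerAt),
    base64: dataUrl.slice(markerAt + marker.length),
  };
}
```

Returning `null` instead of throwing makes it easy to guard against code that mistakenly passes a file path.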
|
|
547
|
+
|
|
548
|
+
### Display Screenshot
|
|
549
|
+
|
|
550
|
+
```javascript
|
|
551
|
+
const result = await window.sentience.snapshot({ screenshot: true });
|
|
552
|
+
|
|
553
|
+
const img = document.createElement('img');
|
|
554
|
+
img.src = result.screenshot;
|
|
555
|
+
document.body.appendChild(img);
|
|
556
|
+
```
|
|
557
|
+
|
|
558
|
+
### Download Screenshot
|
|
559
|
+
|
|
560
|
+
```javascript
|
|
561
|
+
const result = await window.sentience.snapshot({ screenshot: true });
|
|
562
|
+
|
|
563
|
+
const a = document.createElement('a');
|
|
564
|
+
a.href = result.screenshot;
|
|
565
|
+
a.download = 'screenshot.png';
|
|
566
|
+
a.click();
|
|
567
|
+
```
|
|
568
|
+
|
|
569
|
+
### Annotated Screenshot
|
|
570
|
+
|
|
571
|
+
```javascript
|
|
572
|
+
const result = await window.sentience.snapshot({
|
|
573
|
+
screenshot: true,
|
|
574
|
+
limit: 10
|
|
575
|
+
});
|
|
576
|
+
|
|
577
|
+
const img = new Image();
|
|
578
|
+
img.onload = () => {
|
|
579
|
+
const canvas = document.createElement('canvas');
|
|
580
|
+
canvas.width = result.viewport.width;
|
|
581
|
+
canvas.height = result.viewport.height;
|
|
582
|
+
|
|
583
|
+
const ctx = canvas.getContext('2d');
|
|
584
|
+
ctx.drawImage(img, 0, 0);
|
|
585
|
+
|
|
586
|
+
// Draw bounding boxes
|
|
587
|
+
result.elements.forEach((el, i) => {
|
|
588
|
+
ctx.strokeStyle = el.visual_cues.is_primary ? 'red' : 'blue';
|
|
589
|
+
ctx.lineWidth = 3;
|
|
590
|
+
ctx.strokeRect(el.bbox.x, el.bbox.y, el.bbox.width, el.bbox.height);
|
|
591
|
+
|
|
592
|
+
// Label
|
|
593
|
+
ctx.fillStyle = 'red';
|
|
594
|
+
ctx.font = '14px Arial';
|
|
595
|
+
ctx.fillText(`${i+1}: ${el.role}`, el.bbox.x, el.bbox.y - 5);
|
|
596
|
+
});
|
|
597
|
+
|
|
598
|
+
document.body.appendChild(canvas);
|
|
599
|
+
};
|
|
600
|
+
img.src = result.screenshot;
|
|
601
|
+
```
|
|
602
|
+
|
|
603
|
+
### Send to AI Vision API
|
|
604
|
+
|
|
605
|
+
```javascript
|
|
606
|
+
const result = await window.sentience.snapshot({
|
|
607
|
+
screenshot: { format: 'jpeg', quality: 80 },
|
|
608
|
+
limit: 10
|
|
609
|
+
});
|
|
610
|
+
|
|
611
|
+
// Raw Base64 payload without the data-URL prefix, for APIs that expect it (unused in the request below)
|
|
612
|
+
const base64Only = result.screenshot.split(',')[1];
|
|
613
|
+
|
|
614
|
+
// Send to a vision-capable model (the OpenAI chat completions format is shown; other providers use different request shapes)
|
|
615
|
+
await fetch('https://api.openai.com/v1/chat/completions', {
|
|
616
|
+
method: 'POST',
|
|
617
|
+
headers: {
|
|
618
|
+
'Authorization': `Bearer ${API_KEY}`,
|
|
619
|
+
'Content-Type': 'application/json'
|
|
620
|
+
},
|
|
621
|
+
body: JSON.stringify({
|
|
622
|
+
model: 'gpt-4-vision-preview',
|
|
623
|
+
messages: [{
|
|
624
|
+
role: 'user',
|
|
625
|
+
content: [
|
|
626
|
+
{ type: 'text', text: 'What should I click?' },
|
|
627
|
+
{ type: 'image_url', image_url: { url: result.screenshot } }
|
|
628
|
+
]
|
|
629
|
+
}]
|
|
630
|
+
})
|
|
631
|
+
});
|
|
632
|
+
```
|
|
633
|
+
|
|
634
|
+
### File Size Reference
|
|
635
|
+
|
|
636
|
+
| Format | Quality | Typical Size | Use Case |
|
|
637
|
+
|--------|---------|--------------|----------|
|
|
638
|
+
| PNG | N/A | 1-2 MB | Archival, exact pixels |
|
|
639
|
+
| JPEG | 90 | 300-800 KB | High quality uploads |
|
|
640
|
+
| JPEG | 80 | 150-400 KB | **Recommended** |
|
|
641
|
+
| JPEG | 70 | 80-200 KB | Bandwidth-limited |
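The sizes above are typical values, and actual sizes depend on the page. To measure a particular screenshot, the decoded byte count can be derived from the Base64 payload length, since Base64 encodes every 3 bytes as 4 characters. A small sketch, with `estimateDecodedBytes` as a hypothetical helper name:

```javascript
// Hypothetical helper: decoded size in bytes of a Base64 data URL.
// Every 4 Base64 characters carry 3 bytes; trailing '=' padding carries none.
function estimateDecodedBytes(dataUrl) {
  const base64 = dataUrl.slice(dataUrl.indexOf(',') + 1);
  const padding = base64.endsWith('==') ? 2 : base64.endsWith('=') ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}
```

The data URL string is therefore about one third larger than the image it encodes, which matters when the string itself is sent over the network.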
|
|
642
|
+
|
|
643
|
+
---
|
|
644
|
+
|
|
645
|
+
## Bounding Box Visualization
|
|
646
|
+
|
|
647
|
+
### Simple Overlay
|
|
648
|
+
|
|
649
|
+
```javascript
|
|
650
|
+
const result = await window.sentience.snapshot({ limit: 10 });
|
|
651
|
+
|
|
652
|
+
result.elements.forEach((el, i) => {
|
|
653
|
+
const box = document.createElement('div');
|
|
654
|
+
box.style.cssText = `
|
|
655
|
+
position: absolute;
|
|
656
|
+
left: ${el.bbox.x}px;
|
|
657
|
+
top: ${el.bbox.y}px;
|
|
658
|
+
width: ${el.bbox.width}px;
|
|
659
|
+
height: ${el.bbox.height}px;
|
|
660
|
+
border: ${el.visual_cues.is_primary ? '3px solid red' : '2px solid blue'};
|
|
661
|
+
pointer-events: none;
|
|
662
|
+
z-index: 9999;
|
|
663
|
+
box-sizing: border-box;
|
|
664
|
+
`;
|
|
665
|
+
|
|
666
|
+
// Add label
|
|
667
|
+
const label = document.createElement('div');
|
|
668
|
+
label.style.cssText = `
|
|
669
|
+
position: absolute;
|
|
670
|
+
top: -22px;
|
|
671
|
+
background: ${el.visual_cues.is_primary ? 'red' : 'blue'};
|
|
672
|
+
color: white;
|
|
673
|
+
padding: 2px 6px;
|
|
674
|
+
font-size: 12px;
|
|
675
|
+
`;
|
|
676
|
+
label.textContent = `${i+1}: ${el.role}`;
|
|
677
|
+
box.appendChild(label);
|
|
678
|
+
|
|
679
|
+
document.body.appendChild(box);
|
|
680
|
+
});
|
|
681
|
+
```
|
|
682
|
+
|
|
683
|
+
### Canvas Overlay (Better Performance)
|
|
684
|
+
|
|
685
|
+
```javascript
|
|
686
|
+
const result = await window.sentience.snapshot({ limit: 20 });
|
|
687
|
+
|
|
688
|
+
const canvas = document.createElement('canvas');
|
|
689
|
+
canvas.width = window.innerWidth;
|
|
690
|
+
canvas.height = window.innerHeight;
|
|
691
|
+
canvas.style.cssText = `
|
|
692
|
+
position: fixed;
|
|
693
|
+
top: 0;
|
|
694
|
+
left: 0;
|
|
695
|
+
pointer-events: none;
|
|
696
|
+
z-index: 9999;
|
|
697
|
+
`;
|
|
698
|
+
document.body.appendChild(canvas);
|
|
699
|
+
|
|
700
|
+
const ctx = canvas.getContext('2d');
|
|
701
|
+
|
|
702
|
+
result.elements.forEach((el, i) => {
|
|
703
|
+
ctx.strokeStyle = el.visual_cues.is_primary ? '#ff0000' : '#0066ff';
|
|
704
|
+
ctx.lineWidth = el.visual_cues.is_primary ? 3 : 2;
|
|
705
|
+
ctx.strokeRect(el.bbox.x, el.bbox.y, el.bbox.width, el.bbox.height);
|
|
706
|
+
|
|
707
|
+
ctx.fillStyle = '#0066ff';
|
|
708
|
+
ctx.font = '14px Arial';
|
|
709
|
+
ctx.fillText(`${i+1}: ${el.role}`, el.bbox.x, el.bbox.y - 5);
|
|
710
|
+
});
|
|
711
|
+
```
|
|
712
|
+
|
|
713
|
+
### Reusable Helper Function
|
|
714
|
+
|
|
715
|
+
```javascript
|
|
716
|
+
// Add to content.js or run in console
|
|
717
|
+
window.sentience.visualize = async function(options = {}) {
|
|
718
|
+
const {
|
|
719
|
+
limit = 10,
|
|
720
|
+
filter = null,
|
|
721
|
+
highlightPrimary = true
|
|
722
|
+
} = options;
|
|
723
|
+
|
|
724
|
+
const result = await window.sentience.snapshot({ limit, filter });
|
|
725
|
+
|
|
726
|
+
// Clear previous
|
|
727
|
+
document.querySelectorAll('.sentience-box').forEach(el => el.remove());
|
|
728
|
+
|
|
729
|
+
result.elements.forEach((el, i) => {
|
|
730
|
+
const box = document.createElement('div');
|
|
731
|
+
box.className = 'sentience-box';
|
|
732
|
+
box.style.cssText = `
|
|
733
|
+
position: absolute;
|
|
734
|
+
left: ${el.bbox.x}px;
|
|
735
|
+
top: ${el.bbox.y}px;
|
|
736
|
+
width: ${el.bbox.width}px;
|
|
737
|
+
height: ${el.bbox.height}px;
|
|
738
|
+
border: ${el.visual_cues.is_primary && highlightPrimary ? '3px solid red' : '2px solid blue'};
|
|
739
|
+
pointer-events: none;
|
|
740
|
+
z-index: 9998;
|
|
741
|
+
box-sizing: border-box;
|
|
742
|
+
`;
|
|
743
|
+
document.body.appendChild(box);
|
|
744
|
+
});
|
|
745
|
+
|
|
746
|
+
console.log(`✅ Visualized ${result.elements.length} elements`);
|
|
747
|
+
return result;
|
|
748
|
+
};
|
|
749
|
+
|
|
750
|
+
// Usage
|
|
751
|
+
await window.sentience.visualize();
|
|
752
|
+
await window.sentience.visualize({ limit: 20 });
|
|
753
|
+
await window.sentience.visualize({ filter: { allowed_roles: ['button'] } });
|
|
754
|
+
```
|
|
755
|
+
|
|
756
|
+
---
|
|
757
|
+
|
|
758
|
+
## Filtering & Ranking
|
|
759
|
+
|
|
760
|
+
### Importance Scoring (6 Metrics)
|
|
761
|
+
|
|
762
|
+
Elements are ranked by:
|
|
763
|
+
|
|
764
|
+
1. **Role Priority** (1000 for text inputs, 500 for buttons, checkboxes, and radios, 100 for links, 10 for everything else)
|
|
765
|
+
2. **Area Score** (larger elements score higher, capped at 200)
|
|
766
|
+
3. **Center Bias** (penalizes footer/sidebar elements)
|
|
767
|
+
4. **Z-Index Bonus** (modals/overlays get priority, up to +200)
|
|
768
|
+
5. **ARIA Label Bonus** (+200 for explicit labels)
|
|
769
|
+
6. **Visual Prominence** (+200 for `is_primary` elements)
|
|
770
|
+
|
|
771
|
+
**Score Range:** -300 to ~1800
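The upper end of the range can be checked by adding up the largest value each metric can contribute. The figures below are taken from the list above and from the Rust sketch under Importance Calculation (where the z-index bonus is capped at 200); the center bias is at best zero because it only subtracts:

```javascript
// Largest contribution of each metric, as documented in this README.
const maxBonuses = {
  rolePriority: 1000,    // textbox / searchbox
  areaScore: 200,        // capped at 200
  centerBias: 0,         // a penalty only, so its best case is 0
  zIndexBonus: 200,      // min(z_index, 100) * 2
  ariaLabelBonus: 200,
  visualProminence: 200, // is_primary
};

const maxScore = Object.values(maxBonuses).reduce((sum, v) => sum + v, 0);
console.log(maxScore); // 1800
```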
|
|
772
|
+
|
|
773
|
+
### Filter Options
|
|
774
|
+
|
|
775
|
+
#### Limit to Top N
|
|
776
|
+
|
|
777
|
+
```javascript
|
|
778
|
+
{ limit: 10 } // Returns top 10 most important
|
|
779
|
+
```
|
|
780
|
+
|
|
781
|
+
#### Filter by Element Size
|
|
782
|
+
|
|
783
|
+
```javascript
|
|
784
|
+
{
|
|
785
|
+
filter: {
|
|
786
|
+
min_area: 500 // Minimum 500 pixels² (e.g., 20×25)
|
|
787
|
+
}
|
|
788
|
+
}
|
|
789
|
+
```
|
|
790
|
+
|
|
791
|
+
#### Filter by Role
|
|
792
|
+
|
|
793
|
+
```javascript
|
|
794
|
+
{
|
|
795
|
+
filter: {
|
|
796
|
+
allowed_roles: ['button', 'link', 'textbox']
|
|
797
|
+
}
|
|
798
|
+
}
|
|
799
|
+
```
|
|
800
|
+
|
|
801
|
+
**Available Roles:**
|
|
802
|
+
- `button`, `submit` - Buttons
|
|
803
|
+
- `link` - Hyperlinks
|
|
804
|
+
- `textbox`, `searchbox` - Text inputs
|
|
805
|
+
- `checkbox`, `radio` - Form controls
|
|
806
|
+
- `combobox` - Dropdowns
|
|
807
|
+
- `generic` - Other elements
|
|
808
|
+
|
|
809
|
+
#### Filter by Z-Index
|
|
810
|
+
|
|
811
|
+
```javascript
|
|
812
|
+
{
|
|
813
|
+
filter: {
|
|
814
|
+
min_z_index: 100 // Only modals/overlays
|
|
815
|
+
}
|
|
816
|
+
}
|
|
817
|
+
```
|
|
818
|
+
|
|
819
|
+
#### Combined Filters
|
|
820
|
+
|
|
821
|
+
```javascript
|
|
822
|
+
{
|
|
823
|
+
limit: 20,
|
|
824
|
+
filter: {
|
|
825
|
+
allowed_roles: ['button', 'textbox'],
|
|
826
|
+
min_area: 100,
|
|
827
|
+
min_z_index: 0
|
|
828
|
+
}
|
|
829
|
+
}
|
|
830
|
+
```
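The same criteria can be re-applied on the client, for example to narrow a cached snapshot without calling `snapshot()` again. This is a minimal sketch that assumes the filter fields combine with logical AND and that area means `bbox.width * bbox.height`; the README does not spell out either detail, and `matchesFilter` is a hypothetical helper:

```javascript
// Hypothetical client-side check mirroring the documented filter fields.
// Assumption: all provided criteria must hold (logical AND).
function matchesFilter(el, filter = {}) {
  const { allowed_roles, min_area, min_z_index } = filter;
  if (allowed_roles && !allowed_roles.includes(el.role)) return false;
  if (min_area != null && el.bbox.width * el.bbox.height < min_area) return false;
  if (min_z_index != null && el.z_index < min_z_index) return false;
  return true;
}
```

With a snapshot in hand, `result.elements.filter((el) => matchesFilter(el, { allowed_roles: ['button'] }))` would keep only buttons.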
|
|
831
|
+
|
|
832
|
+
### Visual Cues
|
|
833
|
+
|
|
834
|
+
#### is_primary Detection
|
|
835
|
+
|
|
836
|
+
An element is marked as primary only if all three conditions hold:
|
|
837
|
+
1. **Actionable**: Has clickable role (button/link/input)
|
|
838
|
+
2. **NOT Decorative**: Not image/presentation
|
|
839
|
+
3. **Visually Prominent**:
|
|
840
|
+
- Size > 1% of viewport, OR
|
|
841
|
+
- Bold text + primary color (blue/green/orange/red), OR
|
|
842
|
+
- Large font (≥18px) + primary color
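The three conditions can be read as a single predicate. The sketch below is an illustration only: the field names `colorName`, `fontWeight`, and `fontSizePx`, the role sets, and the bold threshold of 700 are assumptions, not the extension's actual data model:

```javascript
// Illustrative only: role sets and field names are assumptions.
const ACTIONABLE_ROLES = new Set(['button', 'submit', 'link', 'textbox', 'searchbox']);
const DECORATIVE_ROLES = new Set(['image', 'presentation']);
const PRIMARY_COLORS = new Set(['blue', 'green', 'orange', 'red']);

function isPrimary(el, viewport) {
  // Conditions 1 and 2: actionable and not decorative.
  if (!ACTIONABLE_ROLES.has(el.role) || DECORATIVE_ROLES.has(el.role)) return false;

  // Condition 3: visually prominent by size, or by color plus bold/large text.
  const areaRatio = (el.bbox.width * el.bbox.height) / (viewport.width * viewport.height);
  const hasPrimaryColor = PRIMARY_COLORS.has(el.colorName);
  const isBold = el.fontWeight >= 700; // assumed threshold for "bold"
  const isLarge = el.fontSizePx >= 18;
  return areaRatio > 0.01 || (hasPrimaryColor && (isBold || isLarge));
}
```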
|
|
843
|
+
|
|
844
|
+
#### Color Detection
|
|
845
|
+
|
|
846
|
+
Each color is mapped to the nearest entry in a 32-color palette, using Euclidean distance in RGB space:
|
|
847
|
+
- **Basic**: black, white, gray
|
|
848
|
+
- **Primary**: red, blue, green, yellow
|
|
849
|
+
- **Secondary**: orange, purple, pink, brown
|
|
850
|
+
- **Extended**: gold, salmon, skyblue, khaki, etc.
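Nearest-color matching by Euclidean distance can be sketched in a few lines. The palette below is a small subset that uses the standard CSS values for these names; the extension's actual 32 entries and their RGB values are not listed in this README, so treat them as assumptions:

```javascript
// Subset palette with standard CSS named-color values (an assumption about
// what the extension's real palette contains).
const PALETTE = {
  black: [0, 0, 0],
  white: [255, 255, 255],
  gray: [128, 128, 128],
  red: [255, 0, 0],
  blue: [0, 0, 255],
  green: [0, 128, 0],
  yellow: [255, 255, 0],
  orange: [255, 165, 0],
};

function nearestColorName(r, g, b) {
  let best = null;
  let bestDist = Infinity;
  for (const [name, [pr, pg, pb]] of Object.entries(PALETTE)) {
    // Squared distance preserves ordering, so the square root is unnecessary.
    const dist = (r - pr) ** 2 + (g - pg) ** 2 + (b - pb) ** 2;
    if (dist < bestDist) {
      bestDist = dist;
      best = name;
    }
  }
  return best;
}
```

For example, `nearestColorName(0, 123, 255)` resolves to `'blue'` with this palette.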
|
|
851
|
+
|
|
852
|
+
---
|
|
853
|
+
|
|
854
|
+
## Architecture
|
|
855
|
+
|
|
856
|
+
### Data Flow
|
|
857
|
+
|
|
858
|
+
```
|
|
859
|
+
User calls window.sentience.snapshot()
|
|
860
|
+
↓
|
|
861
|
+
content.js extracts raw DOM data
|
|
862
|
+
- getBoundingClientRect() for positions
|
|
863
|
+
- getComputedStyle() for colors/fonts
|
|
864
|
+
- getAttribute() for roles/attributes
|
|
865
|
+
↓
|
|
866
|
+
WASM (lib.rs) analyzes
|
|
867
|
+
1. infer_role() - Detect semantic roles
|
|
868
|
+
2. extract_visual_cues() - Colors, prominence
|
|
869
|
+
3. calculate_importance() - 6-metric scoring
|
|
870
|
+
4. apply_filters() - Smart selection
|
|
871
|
+
↓
|
|
872
|
+
Return sorted, filtered JSON
|
|
873
|
+
```
|
|
874
|
+
|
|
875
|
+
### Component Breakdown
|
|
876
|
+
|
|
877
|
+
**content.js (JavaScript Layer)**
|
|
878
|
+
- DOM access (WASM cannot touch the DOM directly, so all DOM reads happen here)
|
|
879
|
+
- Element registry management
|
|
880
|
+
- WASM initialization
|
|
881
|
+
- API exposure
|
|
882
|
+
|
|
883
|
+
**lib.rs (Rust/WASM Layer)**
|
|
884
|
+
- Element analysis
|
|
885
|
+
- Filtering and ranking
|
|
886
|
+
- Smart selection algorithm
|
|
887
|
+
- No dependencies (small binary)
|
|
888
|
+
|
|
889
|
+
**background.js (Service Worker)**
|
|
890
|
+
- Screenshot capture via `chrome.tabs.captureVisibleTab`
|
|
891
|
+
- Message passing to content script
|
|
892
|
+
|
|
893
|
+
### Bridge Pattern
|
|
894
|
+
|
|
895
|
+
**JavaScript → WASM (Data Flow)**
|
|
896
|
+
```
|
|
897
|
+
Raw DOM data (JsValue)
|
|
898
|
+
→ analyze_page()
|
|
899
|
+
→ Vec<RawElement>
|
|
900
|
+
→ infer_role(), extract_visual_cues(), calculate_importance()
|
|
901
|
+
→ Vec<SmartElement>
|
|
902
|
+
→ JsValue
|
|
903
|
+
```
|
|
904
|
+
|
|
905
|
+
**WASM → JavaScript (Action Bridge)**
|
|
906
|
+
```
|
|
907
|
+
click_element_bridge(id)
|
|
908
|
+
→ js_click_element(id)
|
|
909
|
+
→ window.sentience_registry[id].click()
|
|
910
|
+
```
|
|
911
|
+
|
|
912
|
+
---
|
|
913
|
+
|
|
914
|
+
## Implementation Details
|
|
915
|
+
|
|
916
|
+
### Visual Cues Extraction
|
|
917
|
+
|
|
918
|
+
**RGB to Hex Conversion:**
|
|
919
|
+
```rust
|
|
920
|
+
fn rgb_to_hex(rgb_string: &str) -> Option<String> {
|
|
921
|
+
// Parse "rgb(0, 123, 255)" → "#007bff"
|
|
922
|
+
// Manual string parsing (no regex dependency)
|
|
923
|
+
}
|
|
924
|
+
```
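The Rust stub above only describes the conversion. For illustration, the same manual parsing (no regular expressions) looks like this in JavaScript; it is a sketch of the technique, not a port of the actual `lib.rs` code, and `rgbToHex` is a hypothetical name:

```javascript
// Sketch: converts "rgb(0, 123, 255)" to "#007bff" by manual string parsing.
// Returns null for anything that does not contain three 0-255 integer channels.
function rgbToHex(rgbString) {
  const open = rgbString.indexOf('(');
  const close = rgbString.indexOf(')');
  if (open === -1 || close <= open) return null;

  const parts = rgbString.slice(open + 1, close).split(',').map((p) => p.trim());
  if (parts.length < 3) return null;

  // Only the first three parts are used, so an rgba() alpha value is ignored.
  const channels = parts.slice(0, 3).map((p) => (p === '' ? NaN : Number(p)));
  if (channels.some((c) => !Number.isInteger(c) || c < 0 || c > 255)) return null;

  return '#' + channels.map((c) => c.toString(16).padStart(2, '0')).join('');
}
```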
|
|
925
|
+
|
|
926
|
+
**Nearest Color Name:**
|
|
927
|
+
```rust
|
|
928
|
+
fn find_nearest_color_name(hex: &str) -> Option<String> {
|
|
929
|
+
// Euclidean distance to 32-color palette
|
|
930
|
+
// Returns closest named color
|
|
931
|
+
}
|
|
932
|
+
```
|
|
933
|
+
|
|
934
|
+
**Primary Action Detection:**
|
|
935
|
+
```rust
|
|
936
|
+
fn extract_visual_cues(raw: &RawElement, role: &str) -> VisualCues {
|
|
937
|
+
// Check: actionable + not decorative + visually prominent
|
|
938
|
+
let is_primary = is_actionable && !is_decorative && is_visually_prominent;
|
|
939
|
+
}
|
|
940
|
+
```
|
|
941
|
+
|
|
942
|
+
### Importance Calculation
|
|
943
|
+
|
|
944
|
+
```rust
|
|
945
|
+
fn calculate_importance(raw: &RawElement, role: &str, cues: &VisualCues) -> i32 {
|
|
946
|
+
let mut score = 0;
|
|
947
|
+
|
|
948
|
+
// Role priority
|
|
949
|
+
score += match role {
|
|
950
|
+
"textbox" | "searchbox" => 1000,
|
|
951
|
+
"button" | "checkbox" | "radio" => 500,
|
|
952
|
+
"link" => 100,
|
|
953
|
+
_ => 10,
|
|
954
|
+
};
|
|
955
|
+
|
|
956
|
+
// Area bonus
|
|
957
|
+
let area = raw.rect.width * raw.rect.height;
|
|
958
|
+
score += (area.sqrt() as i32).min(200);
|
|
959
|
+
|
|
960
|
+
// Center bias
|
|
961
|
+
let dist_from_center = calculate_manhattan_distance();
|
|
962
|
+
score -= dist_from_center as i32;
|
|
963
|
+
|
|
964
|
+
// Z-index bonus
|
|
965
|
+
if z_index > 0 {
|
|
966
|
+
score += (z_index.min(100)) * 2;
|
|
967
|
+
}
|
|
968
|
+
|
|
969
|
+
// ARIA label bonus
|
|
970
|
+
if raw.attributes.aria_label.is_some() {
|
|
971
|
+
score += 200;
|
|
972
|
+
}
|
|
973
|
+
|
|
974
|
+
// Visual prominence
|
|
975
|
+
if cues.is_primary {
|
|
976
|
+
score += 200;
|
|
977
|
+
}
|
|
978
|
+
|
|
979
|
+
score
|
|
980
|
+
}
|
|
981
|
+
```
|
|
982
|
+
|
|
983
|
+
### Smart Selection
|
|
984
|
+
|
|
985
|
+
```rust
|
|
986
|
+
fn apply_filters(elements: &mut Vec<SmartElement>, options: &AnalysisOptions) {
|
|
987
|
+
// Stage 1: Attribute filters
|
|
988
|
+
apply_attribute_filters(elements, &options.filter);
|
|
989
|
+
|
|
990
|
+
// Stage 2: Smart selection
|
|
991
|
+
if let Some(limit) = options.limit {
|
|
992
|
+
// Truncate to top N
|
|
993
|
+
elements.truncate(limit);
|
|
994
|
+
|
|
995
|
+
// Re-sort by Y-position (reading order for LLMs)
|
|
996
|
+
elements.sort_by(|a, b| a.bbox.y.partial_cmp(&b.bbox.y).unwrap_or(std::cmp::Ordering::Equal)); // treat NaN as equal instead of panicking
|
|
997
|
+
}
|
|
998
|
+
}
|
|
999
|
+
```
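The effect of the two-step selection is easiest to see on a small list. The JavaScript sketch below assumes the elements are ranked by descending importance before truncation (the Rust snippet above does not show that sort) and then restores top-to-bottom reading order; `selectTopN` is a hypothetical name:

```javascript
// Sketch of "truncate to top N, then re-sort by Y".
// Assumption: ranking by importance happens before the truncation.
function selectTopN(elements, limit) {
  const ranked = [...elements].sort((a, b) => b.importance - a.importance);
  const top = ranked.slice(0, limit);
  // Reading order: smaller y (higher on the page) comes first.
  return top.sort((a, b) => a.bbox.y - b.bbox.y);
}
```

The returned list is therefore ordered by position, not by score, so callers that need the single most important element should compare `importance` values rather than take the first entry.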
|
|
1000
|
+
|
|
1001
|
+
### Build Optimizations
|
|
1002
|
+
|
|
1003
|
+
**Cargo.toml:**
|
|
1004
|
+
```toml
|
|
1005
|
+
[profile.release]
|
|
1006
|
+
opt-level = "z" # Optimize for size
|
|
1007
|
+
lto = true # Link-time optimization
|
|
1008
|
+
codegen-units = 1 # Better optimization
|
|
1009
|
+
panic = "abort" # Smaller panic handler
|
|
1010
|
+
strip = true # Strip debug symbols
|
|
1011
|
+
```
|
|
1012
|
+
|
|
1013
|
+
**Result:** ~50-150 KB WASM binary
|
|
1014
|
+
|
|
1015
|
+
---
|
|
1016
|
+
|
|
1017
|
+
## API Reference
|
|
1018
|
+
|
|
1019
|
+
### snapshot(options?)
|
|
1020
|
+
|
|
1021
|
+
**Parameters:**
|
|
1022
|
+
```typescript
|
|
1023
|
+
{
|
|
1024
|
+
screenshot?: boolean | {
|
|
1025
|
+
format?: 'png' | 'jpeg',
|
|
1026
|
+
quality?: number // 0-100
|
|
1027
|
+
},
|
|
1028
|
+
limit?: number,
|
|
1029
|
+
filter?: {
|
|
1030
|
+
min_area?: number,
|
|
1031
|
+
allowed_roles?: string[],
|
|
1032
|
+
min_z_index?: number
|
|
1033
|
+
}
|
|
1034
|
+
}
|
|
1035
|
+
```
|
|
1036
|
+
|
|
1037
|
+
**Returns:** `Promise<GeometryMap>`
|
|
1038
|
+
|
|
1039
|
+
```typescript
|
|
1040
|
+
{
|
|
1041
|
+
status: "success" | "error",
|
|
1042
|
+
timestamp: string,
|
|
1043
|
+
url: string,
|
|
1044
|
+
viewport: { width: number, height: number },
|
|
1045
|
+
elements: SmartElement[],
|
|
1046
|
+
screenshot?: string,
|
|
1047
|
+
screenshot_format?: 'png' | 'jpeg',
|
|
1048
|
+
screenshot_error?: string
|
|
1049
|
+
}
|
|
1050
|
+
```
|
|
1051
|
+
|
|
1052
|
+
### SmartElement
|
|
1053
|
+
|
|
1054
|
+
```typescript
|
|
1055
|
+
{
|
|
1056
|
+
id: number, // Registry index
|
|
1057
|
+
role: string, // Semantic role
|
|
1058
|
+
text: string | null, // Text content
|
|
1059
|
+
importance: number, // Score (-300 to 1800)
|
|
1060
|
+
visual_cues: {
|
|
1061
|
+
is_primary: boolean,
|
|
1062
|
+
background_color_name: string | null,
|
|
1063
|
+
is_clickable: boolean
|
|
1064
|
+
},
|
|
1065
|
+
bbox: {
|
|
1066
|
+
x: number,
|
|
1067
|
+
y: number,
|
|
1068
|
+
width: number,
|
|
1069
|
+
height: number
|
|
1070
|
+
},
|
|
1071
|
+
z_index: number
|
|
1072
|
+
}
|
|
1073
|
+
```
|
|
1074
|
+
|
|
1075
|
+
---
|
|
1076
|
+
|
|
1077
|
+
## Performance
|
|
1078
|
+
|
|
1079
|
+
### Timing Breakdown
|
|
1080
|
+
|
|
1081
|
+
| Operation | Time |
|
|
1082
|
+
|-----------|------|
|
|
1083
|
+
| Geometry extraction | 100-300ms |
|
|
1084
|
+
| + PNG screenshot | +50-100ms |
|
|
1085
|
+
| + JPEG screenshot | +30-80ms |
|
|
1086
|
+
| Total | 150-400ms |
|
|
1087
|
+
|
|
1088
|
+
### Memory Usage
|
|
1089
|
+
|
|
1090
|
+
| Component | Size |
|
|
1091
|
+
|-----------|------|
|
|
1092
|
+
| WASM Binary | 50-150 KB |
|
|
1093
|
+
| Registry (1000 elements) | ~100 KB |
|
|
1094
|
+
| Raw Data | ~200 KB |
|
|
1095
|
+
| Smart Elements | ~100 KB |
|
|
1096
|
+
| Total Peak | ~500 KB |
|
|
1097
|
+
|
|
1098
|
+
### Optimization Tips
|
|
1099
|
+
|
|
1100
|
+
1. Use `limit` to reduce processing time
|
|
1101
|
+
2. Filter with `allowed_roles` to skip irrelevant elements
|
|
1102
|
+
3. Use JPEG for smaller screenshots
|
|
1103
|
+
4. Cache results if analyzing same page multiple times
|
|
1104
|
+
|
|
1105
|
+
---
|
|
1106
|
+
|
|
1107
|
+
## Troubleshooting
|
|
1108
|
+
|
|
1109
|
+
### "WASM not ready" Error
|
|
1110
|
+
|
|
1111
|
+
**Solution:** Wait for WASM to load
|
|
1112
|
+
```javascript
|
|
1113
|
+
setTimeout(async () => {
|
|
1114
|
+
const result = await window.sentience.snapshot();
|
|
1115
|
+
console.log(result);
|
|
1116
|
+
}, 1000);
|
|
1117
|
+
```
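A fixed one-second delay can be too short on slow pages and wasteful on fast ones. An alternative is to retry the call until it succeeds. The sketch below assumes that `snapshot()` rejects (or throws) while WASM is still loading, which this README does not confirm; `retry` is a hypothetical helper:

```javascript
// Hypothetical helper: retries an async operation with a fixed delay.
// Assumption: the operation throws or rejects until WASM is ready.
async function retry(fn, { attempts = 10, delayMs = 200 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

In the page this would be used as `const result = await retry(() => window.sentience.snapshot());`.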
|
|
1118
|
+
|
|
1119
|
+
### Empty Results
|
|
1120
|
+
|
|
1121
|
+
**Check:**
|
|
1122
|
+
```javascript
|
|
1123
|
+
const result = await window.sentience.snapshot();
|
|
1124
|
+
console.log('Elements:', result.elements.length);
|
|
1125
|
+
console.log('Registry:', window.sentience_registry.length);
|
|
1126
|
+
```
|
|
1127
|
+
|
|
1128
|
+
### Screenshot Fails
|
|
1129
|
+
|
|
1130
|
+
**Causes:**
|
|
1131
|
+
- Extension doesn't have tab permissions
|
|
1132
|
+
- Page is a restricted URL (`chrome://`, `file://`)
|
|
1133
|
+
- Tab is not active
|
|
1134
|
+
|
|
1135
|
+
**Solution:**
|
|
1136
|
+
```javascript
|
|
1137
|
+
const result = await window.sentience.snapshot({ screenshot: true });
|
|
1138
|
+
if (result.screenshot_error) {
|
|
1139
|
+
console.error('Screenshot error:', result.screenshot_error);
|
|
1140
|
+
}
|
|
1141
|
+
```
|
|
1142
|
+
|
|
1143
|
+
### Screenshot is Black
|
|
1144
|
+
|
|
1145
|
+
**Cause:** Page hasn't rendered
|
|
1146
|
+
|
|
1147
|
+
**Solution:** Add delay
|
|
1148
|
+
```javascript
|
|
1149
|
+
await new Promise(r => setTimeout(r, 500));
|
|
1150
|
+
const result = await window.sentience.snapshot({ screenshot: true });
|
|
1151
|
+
```
|
|
1152
|
+
|
|
1153
|
+
### Large Screenshots
|
|
1154
|
+
|
|
1155
|
+
**Solution:** Use JPEG with lower quality
|
|
1156
|
+
```javascript
|
|
1157
|
+
{
|
|
1158
|
+
screenshot: {
|
|
1159
|
+
format: 'jpeg',
|
|
1160
|
+
quality: 60
|
|
1161
|
+
}
|
|
1162
|
+
}
|
|
1163
|
+
```
|
|
1164
|
+
|
|
1165
|
+
---
|
|
1166
|
+
|
|
1167
|
+
## Summary
|
|
1168
|
+
|
|
1169
|
+
### What Users Can Do
|
|
1170
|
+
|
|
1171
|
+
1. ✅ Call `window.sentience.snapshot()` to get a geometry map
|
|
1172
|
+
2. ✅ Add `{ screenshot: true }` for a visual snapshot
|
|
1173
|
+
3. ✅ Add `{ limit: N }` to get top N elements
|
|
1174
|
+
4. ✅ Add `{ filter: {...} }` to filter by role/size/z-index
|
|
1175
|
+
5. ✅ Visualize with bounding boxes
|
|
1176
|
+
6. ✅ Send to AI vision APIs
|
|
1177
|
+
|
|
1178
|
+
### What Users Get
|
|
1179
|
+
|
|
1180
|
+
- **Screenshot**: Base64-encoded PNG/JPEG
|
|
1181
|
+
- **Page info**: URL, viewport width and height
|
|
1182
|
+
- **Elements**: Ranked by importance; when `limit` is set, the top N are re-sorted by Y-position (reading order)
|
|
1183
|
+
- **Positions**: Bounding boxes (x, y, width, height)
|
|
1184
|
+
- **Roles**: Semantic (button/link/input)
|
|
1185
|
+
- **Scores**: -300 to 1800 importance ranking
|
|
1186
|
+
- **Visual Cues**: is_primary, colors, clickability
|
|
1187
|
+
|
|
1188
|
+
### One Function, All Features
|
|
1189
|
+
|
|
1190
|
+
```javascript
|
|
1191
|
+
await window.sentience.snapshot({
|
|
1192
|
+
screenshot: { format: 'jpeg', quality: 80 },
|
|
1193
|
+
limit: 20,
|
|
1194
|
+
filter: {
|
|
1195
|
+
allowed_roles: ['button', 'textbox'],
|
|
1196
|
+
min_area: 100
|
|
1197
|
+
}
|
|
1198
|
+
})
|
|
1199
|
+
```
|
|
1200
|
+
|
|
1201
|
+
**That's it!** 🎉
|
|
1202
|
+
|
|
1203
|
+
---
|
|
1204
|
+
|
|
1205
|
+
## License
|
|
1206
|
+
|
|
1207
|
+
MIT License - See LICENSE file for details.
|
|
1208
|
+
|
|
1209
|
+
## Contributing
|
|
1210
|
+
|
|
1211
|
+
Contributions welcome! Please open issues or pull requests.
|
|
1212
|
+
|
|
1213
|
+
## Authors
|
|
1214
|
+
|
|
1215
|
+
- Built with Rust, WASM, and Chrome Extension APIs
|
|
1216
|
+
- Ported from geometry-service architecture
|
|
1217
|
+
|
|
1218
|
+
---
|
|
1219
|
+
|
|
1220
|
+
**Questions?** Open an issue on GitHub.
|
|
1221
|
+
|
|
1222
|
+
## Using the Chrome Extension from Python
|
|
1223
|
+
> Suppose we have Python code running on the same machine as the Chrome browser. Can the Python code invoke the `snapshot` function in `window.sentience` of the Chrome extension?
|
|
1224
|
+
|
|
1225
|
+
You are asking if Python code running locally can invoke the `snapshot` function exposed by the Chrome Extension.
|
|
1226
|
+
|
|
1227
|
+
The answer is **Yes**. Python can execute any JavaScript function available in the page context through browser automation libraries such as Playwright or Selenium (Puppeteer offers the same capability from Node.js). Because your extension injects `window.sentience` into the "Main World" (the same context the page uses), your Python script can call it directly, for example via Playwright's `page.evaluate()`.
|
|
1228
|
+
|
|
1229
|
+
### How to do it (Python Example)
|
|
1230
|
+
|
|
1231
|
+
Using **Playwright** (recommended for modern agents):
|
|
1232
|
+
|
|
1233
|
+
```python
from playwright.sync_api import sync_playwright


def run_agent():
    with sync_playwright() as p:
        # 1. Launch Chrome with your extension loaded.
        # Note: use launch_persistent_context to keep extension state/settings.
        browser = p.chromium.launch_persistent_context(
            user_data_dir="/tmp/chrome-profile",
            headless=False,  # Extensions generally need a headed browser
            args=[
                "--disable-extensions-except=./path/to/your/extension",
                "--load-extension=./path/to/your/extension",
            ],
        )

        page = browser.pages[0]
        page.goto("https://www.amazon.com")

        # 2. Invoke your API.
        # Python waits for the Promise to resolve and gets the JSON back.
        print("📸 Taking snapshot...")
        result = page.evaluate("window.sentience.snapshot({ limit: 50 })")

        if result.get("status") == "success":
            elements = result.get("elements", [])
            print(f"✅ Found {len(elements)} interactive elements")

            # Example: find the search box
            search_box = next((el for el in elements if el["role"] == "searchbox"), None)

            if search_box:
                print(f"🔍 Clicking Search Box ID: {search_box['id']}")
                # 3. Execute the action via your API, passing the id as an argument
                page.evaluate("id => window.sentience.click(id)", search_box["id"])

                # Or use standard Playwright typing
                page.keyboard.type("Gaming Laptop")
                page.keyboard.press("Enter")
        else:
            print(f"❌ Error: {result.get('error')}")

        browser.close()


if __name__ == "__main__":
    run_agent()
```
|
|
1282
|
+
|
|
1283
|
+
### Why this architecture is powerful
|
|
1284
|
+
|
|
1285
|
+
1. **No Remote Round Trip:** The Python script reaches the extension through the local Chrome DevTools Protocol connection, so no HTTP requests go to your server.
|
|
1286
|
+
2. **Shared State:** The Python script can see the `window.sentience` object just like a developer typing in the console.
|
|
1287
|
+
3. **Hybrid Control:** You can mix your `sentience.snapshot()` (for vision) with standard Playwright commands (like `page.keyboard.type()` or `page.wait_for_load_state()`) for a robust agent.
|
|
1288
|
+
---
|
|
1289
|
+
|
|
1290
|
+
## Contributing
|
|
1291
|
+
|
|
1292
|
+
We welcome contributions! This extension has been fully restructured with modern tooling and modular architecture.
|
|
1293
|
+
|
|
1294
|
+
### 🚀 **Quick Start for Contributors**
|
|
1295
|
+
|
|
1296
|
+
1. **Clone and install:**
|
|
1297
|
+
```bash
|
|
1298
|
+
git clone https://github.com/YOUR_ORG/sentience-chrome.git
|
|
1299
|
+
cd sentience-chrome
|
|
1300
|
+
npm install
|
|
1301
|
+
```
|
|
1302
|
+
|
|
1303
|
+
2. **Make changes** to the modular source in `src/`
|
|
1304
|
+
|
|
1305
|
+
3. **Test your changes:**
|
|
1306
|
+
```bash
|
|
1307
|
+
npm run lint # Check code quality
|
|
1308
|
+
npm test # Run tests
|
|
1309
|
+
npm run build # Build extension
|
|
1310
|
+
```
|
|
1311
|
+
|
|
1312
|
+
4. **Submit a PR** - CI will automatically check linting, tests, and builds
|
|
1313
|
+
|
|
1314
|
+
### 📚 **Developer Resources**
|
|
1315
|
+
|
|
1316
|
+
- **[CONTRIBUTING.md](CONTRIBUTING.md)** - Complete developer guide
|
|
1317
|
+
- **[docs/RESTRUCTURING_PROGRESS.md](docs/RESTRUCTURING_PROGRESS.md)** - Architecture details
|
|
1318
|
+
- **Modular codebase** - 7 focused modules instead of monolith
|
|
1319
|
+
- **76 tests** with 80% pass rate
|
|
1320
|
+
- **Automated CI/CD** - GitHub Actions for quality checks
|
|
1321
|
+
|
|
1322
|
+
### 🎯 **Key Features for Contributors**
|
|
1323
|
+
|
|
1324
|
+
- ✅ **ESLint + Prettier** - Automated code quality
|
|
1325
|
+
- ✅ **Jest testing** - Unit tests with coverage reports
|
|
1326
|
+
- ✅ **Rollup bundling** - Optimized builds
|
|
1327
|
+
- ✅ **CI/CD pipelines** - Automated checks on every PR
|
|
1328
|
+
- ✅ **Zero breaking changes** - SDK compatibility preserved
|
|
1329
|
+
|
|
1330
|
+
See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
|
|
1331
|
+
|
|
1332
|
+
---
|
|
1333
|
+
|
|
1334
|
+
## License
|
|
1335
|
+
|
|
1336
|
+
See [LICENSE](LICENSE) file.
|
|
1337
|
+
|
|
1338
|
+
---
|
|
1339
|
+
|
|
1340
|
+
**Built with ❤️ for AI agents and web automation**
|