sentienceapi 0.90.17__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of sentienceapi might be problematic.

Files changed (50)
  1. sentience/__init__.py +153 -0
  2. sentience/_extension_loader.py +40 -0
  3. sentience/actions.py +837 -0
  4. sentience/agent.py +1246 -0
  5. sentience/agent_config.py +43 -0
  6. sentience/async_api.py +101 -0
  7. sentience/base_agent.py +194 -0
  8. sentience/browser.py +1037 -0
  9. sentience/cli.py +130 -0
  10. sentience/cloud_tracing.py +382 -0
  11. sentience/conversational_agent.py +509 -0
  12. sentience/expect.py +188 -0
  13. sentience/extension/background.js +233 -0
  14. sentience/extension/content.js +298 -0
  15. sentience/extension/injected_api.js +1473 -0
  16. sentience/extension/manifest.json +36 -0
  17. sentience/extension/pkg/sentience_core.d.ts +51 -0
  18. sentience/extension/pkg/sentience_core.js +529 -0
  19. sentience/extension/pkg/sentience_core_bg.wasm +0 -0
  20. sentience/extension/pkg/sentience_core_bg.wasm.d.ts +10 -0
  21. sentience/extension/release.json +115 -0
  22. sentience/extension/test-content.js +4 -0
  23. sentience/formatting.py +59 -0
  24. sentience/generator.py +202 -0
  25. sentience/inspector.py +365 -0
  26. sentience/llm_provider.py +637 -0
  27. sentience/models.py +412 -0
  28. sentience/overlay.py +222 -0
  29. sentience/query.py +303 -0
  30. sentience/read.py +185 -0
  31. sentience/recorder.py +589 -0
  32. sentience/schemas/trace_v1.json +216 -0
  33. sentience/screenshot.py +100 -0
  34. sentience/snapshot.py +516 -0
  35. sentience/text_search.py +290 -0
  36. sentience/trace_indexing/__init__.py +27 -0
  37. sentience/trace_indexing/index_schema.py +111 -0
  38. sentience/trace_indexing/indexer.py +357 -0
  39. sentience/tracer_factory.py +211 -0
  40. sentience/tracing.py +285 -0
  41. sentience/utils.py +296 -0
  42. sentience/wait.py +137 -0
  43. sentienceapi-0.90.17.dist-info/METADATA +917 -0
  44. sentienceapi-0.90.17.dist-info/RECORD +50 -0
  45. sentienceapi-0.90.17.dist-info/WHEEL +5 -0
  46. sentienceapi-0.90.17.dist-info/entry_points.txt +2 -0
  47. sentienceapi-0.90.17.dist-info/licenses/LICENSE +24 -0
  48. sentienceapi-0.90.17.dist-info/licenses/LICENSE-APACHE +201 -0
  49. sentienceapi-0.90.17.dist-info/licenses/LICENSE-MIT +21 -0
  50. sentienceapi-0.90.17.dist-info/top_level.txt +1 -0
@@ -0,0 +1,917 @@
1
+ Metadata-Version: 2.4
2
+ Name: sentienceapi
3
+ Version: 0.90.17
4
+ Summary: Python SDK for Sentience AI Agent Browser Automation
5
+ Author: Sentience Team
6
+ License: MIT OR Apache-2.0
7
+ Project-URL: Homepage, https://github.com/SentienceAPI/sentience-python
8
+ Project-URL: Repository, https://github.com/SentienceAPI/sentience-python
9
+ Project-URL: Issues, https://github.com/SentienceAPI/sentience-python/issues
10
+ Keywords: browser-automation,playwright,ai-agent,web-automation,sentience
11
+ Classifier: Development Status :: 4 - Beta
12
+ Classifier: Intended Audience :: Developers
13
+ Classifier: License :: OSI Approved :: MIT License
14
+ Classifier: License :: OSI Approved :: Apache Software License
15
+ Classifier: Programming Language :: Python :: 3
16
+ Classifier: Programming Language :: Python :: 3.11
17
+ Requires-Python: >=3.11
18
+ Description-Content-Type: text/markdown
19
+ License-File: LICENSE
20
+ License-File: LICENSE-APACHE
21
+ License-File: LICENSE-MIT
22
+ Requires-Dist: playwright>=1.40.0
23
+ Requires-Dist: pydantic>=2.0.0
24
+ Requires-Dist: jsonschema>=4.0.0
25
+ Requires-Dist: requests>=2.31.0
26
+ Requires-Dist: httpx>=0.25.0
27
+ Requires-Dist: playwright-stealth>=1.0.6
28
+ Requires-Dist: markdownify>=0.11.6
29
+ Provides-Extra: dev
30
+ Requires-Dist: pytest>=7.0.0; extra == "dev"
31
+ Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
32
+ Dynamic: license-file
33
+
34
+ # Sentience Python SDK
35
+
36
+ **Semantic geometry grounding for deterministic, debuggable AI web agents with time-travel traces.**
37
+
38
+ ## 📦 Installation
39
+
40
+ ```bash
41
+ # Install from PyPI
42
+ pip install sentienceapi
43
+
44
+ # Install Playwright browsers (required)
45
+ playwright install chromium
46
+
47
+ # For LLM Agent features (optional)
48
+ pip install openai # For OpenAI models
49
+ pip install anthropic # For Claude models
50
+ pip install transformers torch # For local LLMs
51
+ ```
52
+
53
+ **For local development:**
54
+ ```bash
55
+ pip install -e .
56
+ ```
57
+
58
+ ## 🚀 Quick Start: Choose Your Abstraction Level
59
+
60
+ Sentience SDK offers **three abstraction levels** - use what fits your needs:
61
+
62
+ <details>
63
+ <summary><b>🎯 Level 3: Natural Language (Easiest)</b> - For non-technical users</summary>
64
+
65
+ ```python
+ from sentience import SentienceBrowser, ConversationalAgent
+ from sentience.llm_provider import OpenAIProvider
+
+ browser = SentienceBrowser()
+ llm = OpenAIProvider(api_key="your-key", model="gpt-4o")
+ agent = ConversationalAgent(browser, llm)
+
+ with browser:
+     response = agent.execute("Search for magic mouse on google.com")
+     print(response)
+     # → "I searched for 'magic mouse' and found several results.
+     # The top result is from amazon.com selling Magic Mouse 2 for $79."
78
+ ```
79
+
80
+ **Best for:** End users, chatbots, no-code platforms
81
+ **Code required:** 3-5 lines
82
+ **Technical knowledge:** None
83
+
84
+ </details>
85
+
86
+ <details>
87
+ <summary><b>⚙️ Level 2: Technical Commands (Recommended)</b> - For AI developers</summary>
88
+
89
+ ```python
+ from sentience import SentienceBrowser, SentienceAgent
+ from sentience.llm_provider import OpenAIProvider
+
+ browser = SentienceBrowser()
+ llm = OpenAIProvider(api_key="your-key", model="gpt-4o")
+ agent = SentienceAgent(browser, llm)
+
+ with browser:
+     browser.page.goto("https://google.com")
+     agent.act("Click the search box")
+     agent.act("Type 'magic mouse' into the search field")
+     agent.act("Press Enter key")
102
+ ```
103
+
104
+ **Best for:** Building AI agents, automation scripts
105
+ **Code required:** 10-15 lines
106
+ **Technical knowledge:** Medium (Python basics)
107
+
108
+ </details>
109
+
110
+ <details>
111
+ <summary><b>🔧 Level 1: Direct SDK (Most Control)</b> - For production automation</summary>
112
+
113
+ ```python
+ from sentience import SentienceBrowser, snapshot, find, click
+
+ with SentienceBrowser(headless=False) as browser:
+     browser.page.goto("https://example.com")
+
+     # Take snapshot - captures all interactive elements
+     snap = snapshot(browser)
+     print(f"Found {len(snap.elements)} elements")
+
+     # Find and click a link using semantic selectors
+     link = find(snap, "role=link text~'More information'")
+     if link:
+         result = click(browser, link.id)
+         print(f"Click success: {result.success}")
128
+ ```
129
+
130
+ **Best for:** Maximum control, performance-critical apps
131
+ **Code required:** 20-50 lines
132
+ **Technical knowledge:** High (SDK API, selectors)
133
+
134
+ </details>
135
+
136
+ ---
137
+
138
+ <details>
139
+ <summary><h2>💼 Real-World Example: Amazon Shopping Bot</h2></summary>
140
+
141
+ This example demonstrates navigating Amazon, finding products, and adding items to cart:
142
+
143
+ ```python
+ from sentience import SentienceBrowser, snapshot, find, click
+ import time
+
+ with SentienceBrowser(headless=False) as browser:
+     # Navigate to Amazon Best Sellers
+     browser.goto("https://www.amazon.com/gp/bestsellers/", wait_until="domcontentloaded")
+     time.sleep(2)  # Wait for dynamic content
+
+     # Take snapshot and find products
+     snap = snapshot(browser)
+     print(f"Found {len(snap.elements)} elements")
+
+     # Find first product in viewport using spatial filtering
+     products = [
+         el for el in snap.elements
+         if el.role == "link"
+         and el.visual_cues.is_clickable
+         and el.in_viewport
+         and not el.is_occluded
+         and el.bbox.y < 600  # First row
+     ]
+
+     if products:
+         # Sort by position (top to bottom, then left to right)
+         products.sort(key=lambda e: (e.bbox.y, e.bbox.x))
+         first_product = products[0]
+
+         print(f"Clicking: {first_product.text}")
+         result = click(browser, first_product.id)
+
+         # Wait for product page
+         browser.page.wait_for_load_state("networkidle")
+         time.sleep(2)
+
+         # Find and click "Add to Cart" button
+         product_snap = snapshot(browser)
+         add_to_cart = find(product_snap, "role=button text~'add to cart'")
+
+         if add_to_cart:
+             cart_result = click(browser, add_to_cart.id)
+             print(f"Added to cart: {cart_result.success}")
185
+ ```
186
+
187
+ **📖 See the complete tutorial:** [Amazon Shopping Guide](../docs/AMAZON_SHOPPING_GUIDE.md)
188
+
189
+ </details>
190
+
191
+ ---
192
+
193
+ ## 📚 Core Features
194
+
195
+ <details>
196
+ <summary><h3>🌐 Browser Control</h3></summary>
197
+
198
+ - **`SentienceBrowser`** - Playwright browser with Sentience extension pre-loaded
199
+ - **`browser.goto(url)`** - Navigate with automatic extension readiness checks
200
+ - Automatic bot evasion and stealth mode
201
+ - Configurable headless/headed mode
202
+
203
+ </details>
204
+
205
+ <details>
206
+ <summary><h3>📸 Snapshot - Intelligent Page Analysis</h3></summary>
207
+
208
+ **`snapshot(browser, options=SnapshotOptions(...))`** - Capture page state with AI-ranked elements. `SnapshotOptions` accepts `screenshot`, `show_overlay`, `limit`, and `goal`; a bare `snapshot(browser)` takes no screenshot and shows no overlay.
209
+
210
+ Features:
211
+ - Returns semantic elements with roles, text, importance scores, and bounding boxes
212
+ - Optional screenshot capture (PNG/JPEG) - set `screenshot=True`
213
+ - Optional visual overlay to see what elements are detected - set `show_overlay=True`
214
+ - Pydantic models for type safety
215
+ - Optional ML reranking when `goal` is provided
216
+ - **`snapshot.save(filepath)`** - Export to JSON
217
+
218
+ **Example:**
219
+ ```python
+ from sentience import snapshot, SnapshotOptions
+
+ # Basic snapshot with defaults (no screenshot, no overlay)
+ snap = snapshot(browser)
+
+ # With screenshot and overlay
+ snap = snapshot(browser, SnapshotOptions(
+     screenshot=True,
+     show_overlay=True,
+     limit=100,
+     goal="Click the login button"  # Optional: enables ML reranking
+ ))
+
+ # Access structured data
+ print(f"URL: {snap.url}")
+ print(f"Viewport: {snap.viewport.width}x{snap.viewport.height}")
+ print(f"Elements: {len(snap.elements)}")
+
+ # Iterate over elements
+ for element in snap.elements:
+     print(f"{element.role}: {element.text} (importance: {element.importance})")
+
+     # Check ML reranking metadata (when goal is provided)
+     if element.rerank_index is not None:
+         print(f" ML rank: {element.rerank_index} (confidence: {element.ml_probability:.2%})")
245
+ ```
246
+
247
+ </details>
248
+
249
+ <details>
250
+ <summary><h3>🔍 Query Engine - Semantic Element Selection</h3></summary>
251
+
252
+ - **`query(snapshot, selector)`** - Find all matching elements
253
+ - **`find(snapshot, selector)`** - Find single best match (by importance)
254
+ - Powerful query DSL with multiple operators
255
+
256
+ **Query Examples:**
257
+ ```python
258
+ # Find by role and text
259
+ button = find(snap, "role=button text='Sign in'")
260
+
261
+ # Substring match (case-insensitive)
262
+ link = find(snap, "role=link text~'more info'")
263
+
264
+ # Spatial filtering
265
+ top_left = find(snap, "bbox.x<=100 bbox.y<=200")
266
+
267
+ # Multiple conditions (AND logic)
268
+ primary_btn = find(snap, "role=button clickable=true visible=true importance>800")
269
+
270
+ # Prefix/suffix matching
271
+ starts_with = find(snap, "text^='Add'")
272
+ ends_with = find(snap, "text$='Cart'")
273
+
274
+ # Numeric comparisons
275
+ important = query(snap, "importance>=700")
276
+ first_row = query(snap, "bbox.y<600")
277
+ ```
278
+
279
+ **📖 [Complete Query DSL Guide](docs/QUERY_DSL.md)** - All operators, fields, and advanced patterns
280
+
281
+ </details>
282
+
283
+ <details>
284
+ <summary><h3>👆 Actions - Interact with Elements</h3></summary>
285
+
286
+ - **`click(browser, element_id)`** - Click element by ID
287
+ - **`click_rect(browser, rect)`** - Click at center of rectangle (coordinate-based)
288
+ - **`type_text(browser, element_id, text)`** - Type into input fields
289
+ - **`press(browser, key)`** - Press keyboard keys (Enter, Escape, Tab, etc.)
290
+
291
+ All actions return `ActionResult` with success status, timing, and outcome:
292
+
293
+ ```python
294
+ result = click(browser, element.id)
295
+
296
+ print(f"Success: {result.success}")
297
+ print(f"Outcome: {result.outcome}") # "navigated", "dom_updated", "error"
298
+ print(f"Duration: {result.duration_ms}ms")
299
+ print(f"URL changed: {result.url_changed}")
300
+ ```
301
+
302
+ **Coordinate-based clicking:**
303
+ ```python
+ from sentience import click_rect
+
+ # Click at center of rectangle (x, y, width, height)
+ click_rect(browser, {"x": 100, "y": 200, "w": 50, "h": 30})
+
+ # With visual highlight (default: red border for 2 seconds)
+ click_rect(browser, {"x": 100, "y": 200, "w": 50, "h": 30}, highlight=True, highlight_duration=2.0)
+
+ # Using element's bounding box
+ snap = snapshot(browser)
+ element = find(snap, "role=button")
+ if element:
+     click_rect(browser, {
+         "x": element.bbox.x,
+         "y": element.bbox.y,
+         "w": element.bbox.width,
+         "h": element.bbox.height
+     })
322
+ ```
323
+
324
+ </details>
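As a rough illustration of what "click at center of rectangle" means for the `{"x", "y", "w", "h"}` dictionaries above, the sketch below computes the center point. `rect_center` is a hypothetical helper written for this note, not part of the SDK, and the SDK's own rounding may differ.

```python
def rect_center(rect: dict) -> tuple[float, float]:
    """Return the (x, y) center of a rect given as {"x", "y", "w", "h"}."""
    return (rect["x"] + rect["w"] / 2, rect["y"] + rect["h"] / 2)


center = rect_center({"x": 100, "y": 200, "w": 50, "h": 30})
print(center)
```

For the rectangle used in the examples above, the center lies 25 pixels right and 15 pixels below the top-left corner.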
325
+
326
+ <details>
327
+ <summary><h3>⏱️ Wait & Assertions</h3></summary>
328
+
329
+ - **`wait_for(browser, selector, timeout=5.0, interval=None, use_api=None)`** - Wait for element to appear
330
+ - **`expect(browser, selector)`** - Assertion helper with fluent API
331
+
332
+ **Examples:**
333
+ ```python
+ # Wait for element (auto-detects optimal interval based on API usage)
+ result = wait_for(browser, "role=button text='Submit'", timeout=10.0)
+ if result.found:
+     print(f"Found after {result.duration_ms}ms")
+
+ # Use local extension with fast polling (0.25s interval)
+ result = wait_for(browser, "role=button", timeout=5.0, use_api=False)
+
+ # Use remote API with network-friendly polling (1.5s interval)
+ result = wait_for(browser, "role=button", timeout=5.0, use_api=True)
+
+ # Custom interval override
+ result = wait_for(browser, "role=button", timeout=5.0, interval=0.5, use_api=False)
+
+ # Semantic wait conditions
+ wait_for(browser, "clickable=true", timeout=5.0)  # Wait for clickable element
+ wait_for(browser, "importance>100", timeout=5.0)  # Wait for important element
+ wait_for(browser, "role=link visible=true", timeout=5.0)  # Wait for visible link
+
+ # Assertions
+ expect(browser, "role=button text='Submit'").to_exist(timeout=5.0)
+ expect(browser, "role=heading").to_be_visible()
+ expect(browser, "role=button").to_have_text("Submit")
+ expect(browser, "role=link").to_have_count(10)
358
+ ```
359
+
360
+ </details>
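The polling behaviour described above (a timeout, an optional interval, and a default interval that depends on `use_api`) can be pictured with a small standalone loop. This is only a sketch under assumptions: `poll_until` and `default_interval` are hypothetical names, the 0.25 s and 1.5 s values are taken from the comments in the examples, and the SDK's real `wait_for` may be implemented differently.

```python
import time
from typing import Callable, Optional


def default_interval(use_api: bool) -> float:
    # Intervals taken from the comments in the examples above.
    return 1.5 if use_api else 0.25


def poll_until(check: Callable[[], bool], timeout: float = 5.0,
               interval: Optional[float] = None, use_api: bool = False) -> dict:
    """Call check() repeatedly until it returns True or the timeout would be exceeded."""
    step = interval if interval is not None else default_interval(use_api)
    start = time.monotonic()
    while True:
        if check():
            found = True
            break
        # Stop if sleeping one more interval would run past the timeout.
        if time.monotonic() - start + step > timeout:
            found = False
            break
        time.sleep(step)
    return {"found": found, "duration_ms": int((time.monotonic() - start) * 1000)}


# Stand-in condition: becomes true on the third check.
calls = {"n": 0}


def ready() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


result = poll_until(ready, timeout=1.0, interval=0.01)
print(result["found"], calls["n"])
```

An explicit `interval` wins over the `use_api` default, mirroring the "custom interval override" example.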
361
+
362
+ <details>
363
+ <summary><h3>🎨 Visual Overlay - Debug Element Detection</h3></summary>
364
+
365
+ - **`show_overlay(browser, elements, target_element_id=None)`** - Display visual overlay highlighting elements
366
+ - **`clear_overlay(browser)`** - Clear overlay manually
367
+
368
+ Show color-coded borders around detected elements to debug, validate, and understand what Sentience sees:
369
+
370
+ ```python
371
+ from sentience import show_overlay, clear_overlay
372
+
373
+ # Take snapshot once
374
+ snap = snapshot(browser)
375
+
376
+ # Show overlay anytime without re-snapshotting
377
+ show_overlay(browser, snap) # Auto-clears after 5 seconds
378
+
379
+ # Highlight specific target element in red
380
+ button = find(snap, "role=button text~'Submit'")
381
+ show_overlay(browser, snap, target_element_id=button.id)
382
+
383
+ # Clear manually before 5 seconds
384
+ import time
385
+ time.sleep(2)
386
+ clear_overlay(browser)
387
+ ```
388
+
389
+ **Color Coding:**
390
+ - 🔴 Red: Target element
391
+ - 🔵 Blue: Primary elements (`is_primary=true`)
392
+ - 🟢 Green: Regular interactive elements
393
+
394
+ **Visual Indicators:**
395
+ - Border thickness/opacity scales with importance
396
+ - Semi-transparent fill
397
+ - Importance badges
398
+ - Star icons for primary elements
399
+ - Auto-clear after 5 seconds
400
+
401
+ </details>
402
+
403
+ <details>
404
+ <summary><h3>📄 Content Reading</h3></summary>
405
+
406
+ **`read(browser, format="text|markdown|raw")`** - Extract page content
407
+ - `format="text"` - Plain text extraction
408
+ - `format="markdown"` - High-quality markdown conversion (uses markdownify)
409
+ - `format="raw"` - Cleaned HTML (default)
410
+
411
+ **Example:**
412
+ ```python
413
+ from sentience import read
414
+
415
+ # Get markdown content
416
+ result = read(browser, format="markdown")
417
+ print(result["content"]) # Markdown text
418
+
419
+ # Get plain text
420
+ result = read(browser, format="text")
421
+ print(result["content"]) # Plain text
422
+ ```
423
+
424
+ </details>
425
+
426
+ <details>
427
+ <summary><h3>📷 Screenshots</h3></summary>
428
+
429
+ **`screenshot(browser, format="png|jpeg", quality=80)`** - Standalone screenshot capture
430
+ - Returns base64-encoded data URL
431
+ - PNG or JPEG format
432
+ - Quality control for JPEG (1-100)
433
+
434
+ **Example:**
435
+ ```python
+ from sentience import screenshot
+ import base64
+
+ # Capture PNG screenshot
+ data_url = screenshot(browser, format="png")
+
+ # Save to file
+ image_data = base64.b64decode(data_url.split(",")[1])
+ with open("screenshot.png", "wb") as f:
+     f.write(image_data)
+
+ # JPEG with quality control (smaller file size)
+ data_url = screenshot(browser, format="jpeg", quality=85)
449
+ ```
450
+
451
+ </details>
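Because `screenshot()` is described as returning a base64-encoded data URL, a slightly more defensive decoder than `split(",")[1]` may be useful. The sketch below assumes the string has the usual `data:<mime>;base64,<payload>` shape; `decode_data_url` is a hypothetical helper, and a tiny stand-in payload replaces a real screenshot.

```python
import base64


def decode_data_url(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Tiny stand-in payload instead of a real screenshot.
sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
mime, raw = decode_data_url(sample)
print(mime, len(raw))
```

The returned MIME type can then be used to pick the file extension when saving.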
452
+
453
+ <details>
454
+ <summary><h3>🔎 Text Search - Find Elements by Visible Text</h3></summary>
455
+
456
+ **`find_text_rect(browser, text, case_sensitive=False, whole_word=False, max_results=10)`** - Find text on page and get exact pixel coordinates
457
+
458
+ Find buttons, links, or any UI elements by their visible text without needing element IDs or CSS selectors. Returns exact pixel coordinates for each match.
459
+
460
+ **Example:**
461
+ ```python
+ from sentience import SentienceBrowser, find_text_rect, click_rect
+
+ with SentienceBrowser() as browser:
+     browser.page.goto("https://example.com")
+
+     # Find "Sign In" button
+     result = find_text_rect(browser, "Sign In")
+     if result.status == "success" and result.results:
+         first_match = result.results[0]
+         print(f"Found at: ({first_match.rect.x}, {first_match.rect.y})")
+         print(f"In viewport: {first_match.in_viewport}")
+
+         # Click on the found text
+         if first_match.in_viewport:
+             click_rect(browser, {
+                 "x": first_match.rect.x,
+                 "y": first_match.rect.y,
+                 "w": first_match.rect.width,
+                 "h": first_match.rect.height
+             })
482
+ ```
483
+
484
+ **Advanced Options:**
485
+ ```python
+ # Case-sensitive search
+ result = find_text_rect(browser, "LOGIN", case_sensitive=True)
+
+ # Whole word only (won't match "log" as part of "login")
+ result = find_text_rect(browser, "log", whole_word=True)
+
+ # Find multiple matches
+ result = find_text_rect(browser, "Buy", max_results=10)
+ for match in result.results:
+     if match.in_viewport:
+         print(f"Found '{match.text}' at ({match.rect.x}, {match.rect.y})")
+         print(f"Context: ...{match.context.before}[{match.text}]{match.context.after}...")
498
+ ```
499
+
500
+ **Returns:** `TextRectSearchResult` with:
501
+ - **`status`**: "success" or "error"
502
+ - **`results`**: List of `TextMatch` objects with:
503
+   - `text` - The matched text
504
+   - `rect` - Absolute coordinates (with scroll offset)
505
+   - `viewport_rect` - Viewport-relative coordinates
506
+   - `context` - Surrounding text (before/after)
507
+   - `in_viewport` - Whether visible in current viewport
508
+
509
+ **Use Cases:**
510
+ - Find buttons/links by visible text without CSS selectors
511
+ - Get exact pixel coordinates for click automation
512
+ - Verify text visibility and position on page
513
+ - Search dynamic content that changes frequently
514
+
515
+ **Note:** Does not consume API credits (runs locally in browser)
516
+
517
+ **See example:** `examples/find_text_demo.py`
518
+
519
+ </details>
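To make the `case_sensitive`, `whole_word`, and `max_results` options concrete, here is a plain-Python sketch of the matching rules applied to a string rather than a page. `find_text_spans` is a hypothetical function for illustration; the extension's in-browser search is not shown in this README and may behave differently at word boundaries.

```python
import re


def find_text_spans(haystack: str, needle: str, case_sensitive: bool = False,
                    whole_word: bool = False, max_results: int = 10) -> list[tuple[int, int]]:
    """Return (start, end) offsets of needle in haystack under the given options."""
    pattern = re.escape(needle)
    if whole_word:
        # \b keeps "log" from matching inside "login".
        pattern = rf"\b{pattern}\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    spans = [m.span() for m in re.finditer(pattern, haystack, flags)]
    return spans[:max_results]


page_text = "Log in or use loginButton to LOGIN"
print(find_text_spans(page_text, "log"))
print(find_text_spans(page_text, "log", whole_word=True))
print(find_text_spans(page_text, "LOGIN", case_sensitive=True))
```

The first call finds three matches, the whole-word call only the leading "Log", and the case-sensitive call only the final "LOGIN".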
520
+
521
+ ---
522
+
523
+ ## 🔄 Async API
524
+
525
+ For asyncio contexts (FastAPI, async frameworks):
526
+
527
+ ```python
+ import asyncio
+
+ from sentience.async_api import AsyncSentienceBrowser, snapshot_async, click_async, find
+
+ async def main():
+     async with AsyncSentienceBrowser() as browser:
+         await browser.goto("https://example.com")
+         snap = await snapshot_async(browser)
+         button = find(snap, "role=button")
+         if button:
+             await click_async(browser, button.id)
+
+ asyncio.run(main())
539
+ ```
540
+
541
+ **See example:** `examples/async_api_demo.py`
542
+
543
+ ---
544
+
545
+ ## 📋 Reference
546
+
547
+ <details>
548
+ <summary><h3>Element Properties</h3></summary>
549
+
550
+ Elements returned by `snapshot()` have the following properties:
551
+
552
+ ```python
553
+ element.id # Unique identifier for interactions
554
+ element.role # ARIA role (button, link, textbox, heading, etc.)
555
+ element.text # Visible text content
556
+ element.importance # AI importance score (0-1000)
557
+ element.bbox # Bounding box (x, y, width, height)
558
+ element.visual_cues # Visual analysis (is_primary, is_clickable, background_color)
559
+ element.in_viewport # Is element visible in current viewport?
560
+ element.is_occluded # Is element covered by other elements?
561
+ element.z_index # CSS stacking order
562
+ ```
563
+
564
+ </details>
565
+
566
+ <details>
567
+ <summary><h3>Query DSL Reference</h3></summary>
568
+
569
+ ### Basic Operators
570
+
571
+ | Operator | Description | Example |
572
+ |----------|-------------|---------|
573
+ | `=` | Exact match | `role=button` |
574
+ | `!=` | Exclusion | `role!=link` |
575
+ | `~` | Substring (case-insensitive) | `text~'sign in'` |
576
+ | `^=` | Prefix match | `text^='Add'` |
577
+ | `$=` | Suffix match | `text$='Cart'` |
578
+ | `>`, `>=` | Greater than | `importance>500` |
579
+ | `<`, `<=` | Less than | `bbox.y<600` |
580
+
581
+ ### Supported Fields
582
+
583
+ - **Role**: `role=button|link|textbox|heading|...`
584
+ - **Text**: `text`, `text~`, `text^=`, `text$=`
585
+ - **Visibility**: `clickable=true|false`, `visible=true|false`
586
+ - **Importance**: `importance`, `importance>=N`, `importance<N`
587
+ - **Position**: `bbox.x`, `bbox.y`, `bbox.width`, `bbox.height`
588
+ - **Layering**: `z_index`
589
+
590
+ </details>
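The operator table above can be read as a small grammar: each space-separated term is `field`, operator, value, and all terms must match. The following toy evaluator is a sketch of that reading only. It works on plain dictionaries, `toy_query` and `toy_find` are hypothetical names, and the SDK's actual parser (quoting rules, the `visible` field, error handling) is not documented here and may differ.

```python
import re
from typing import Any, Optional

# field, operator, then a quoted or bare value; longer operators are listed first.
TERM = re.compile(
    r"(?P<field>[\w.]+)(?P<op>!=|>=|<=|\^=|\$=|=|~|>|<)(?P<value>'[^']*'|\S+)"
)


def parse_selector(selector: str) -> list[tuple[str, str, str]]:
    """Split a selector into (field, operator, value) terms; quotes are stripped."""
    return [(m["field"], m["op"], m["value"].strip("'")) for m in TERM.finditer(selector)]


def lookup(element: dict, field: str) -> Any:
    """Resolve dotted fields such as bbox.y against nested dicts."""
    value: Any = element
    for part in field.split("."):
        value = value[part]
    return value


def term_matches(element: dict, field: str, op: str, raw: str) -> bool:
    actual = lookup(element, field)
    if op in (">", ">=", "<", "<="):
        a, b = float(actual), float(raw)
        return {">": a > b, ">=": a >= b, "<": a < b, "<=": a <= b}[op]
    # Booleans are compared as "true"/"false", as written in selectors.
    text = str(actual).lower() if isinstance(actual, bool) else str(actual)
    if op == "=":
        return text == raw
    if op == "!=":
        return text != raw
    if op == "~":
        return raw.lower() in text.lower()
    if op == "^=":
        return text.startswith(raw)
    return text.endswith(raw)  # op == "$="


def toy_query(elements: list[dict], selector: str) -> list[dict]:
    """All elements for which every term matches (AND logic)."""
    terms = parse_selector(selector)
    return [el for el in elements if all(term_matches(el, *t) for t in terms)]


def toy_find(elements: list[dict], selector: str) -> Optional[dict]:
    """Single best match, chosen by highest importance."""
    matches = toy_query(elements, selector)
    return max(matches, key=lambda el: el["importance"]) if matches else None


elements = [
    {"role": "button", "text": "Add to Cart", "importance": 900, "clickable": True, "bbox": {"x": 40, "y": 120}},
    {"role": "link", "text": "More information", "importance": 300, "clickable": True, "bbox": {"x": 10, "y": 700}},
    {"role": "button", "text": "Cancel", "importance": 400, "clickable": True, "bbox": {"x": 200, "y": 120}},
]
print(toy_find(elements, "role=button clickable=true")["text"])
print([el["text"] for el in toy_query(elements, "text$='Cart' bbox.y<600")])
```

Choosing the highest-importance match in `toy_find` follows the "single best match (by importance)" description of `find()`.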
591
+
592
+ ---
593
+
594
+ ## ⚙️ Configuration
595
+
596
+ <details>
597
+ <summary><h3>Viewport Size</h3></summary>
598
+
599
+ Default viewport is **1280x800** pixels. You can customize it using Playwright's API:
600
+
601
+ ```python
+ with SentienceBrowser(headless=False) as browser:
+     # Set custom viewport before navigating
+     browser.page.set_viewport_size({"width": 1920, "height": 1080})
+
+     browser.goto("https://example.com")
607
+ ```
608
+
609
+ </details>
610
+
611
+ <details>
612
+ <summary><h3>Headless Mode</h3></summary>
613
+
614
+ ```python
615
+ # Headed mode (default in dev, shows browser window)
616
+ browser = SentienceBrowser(headless=False)
617
+
618
+ # Headless mode (default in CI environments)
619
+ browser = SentienceBrowser(headless=True)
620
+
621
+ # Auto-detect based on environment
622
+ browser = SentienceBrowser() # headless=True if CI=true, else False
623
+ ```
624
+
625
+ </details>
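The auto-detect rule in the last comment ("headless=True if CI=true, else False") can be sketched as a small resolver. `resolve_headless` is a hypothetical function for illustration; how the SDK actually reads the `CI` variable (case, whitespace, other truthy values) is an assumption here.

```python
import os
from typing import Mapping, Optional


def resolve_headless(headless: Optional[bool] = None,
                     env: Mapping[str, str] = os.environ) -> bool:
    """An explicit value wins; otherwise run headless only when CI=true."""
    if headless is not None:
        return headless
    return env.get("CI", "").strip().lower() == "true"


print(resolve_headless(env={"CI": "true"}))
print(resolve_headless(env={}))
print(resolve_headless(False, env={"CI": "true"}))
```

Passing `env` explicitly keeps the rule testable without touching the real environment.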
626
+
627
+ <details>
628
+ <summary><h3>🌍 Residential Proxy Support</h3></summary>
629
+
630
+ Use residential proxies to route traffic and protect your IP address. Supports HTTP, HTTPS, and SOCKS5 with automatic SSL certificate handling:
631
+
632
+ ```python
+ from sentience import SentienceBrowser, SentienceAgent
+ from sentience.llm_provider import OpenAIProvider
+
+ # Method 1: Direct configuration
+ browser = SentienceBrowser(proxy="http://user:pass@proxy.example.com:8080")
+
+ # Method 2: Environment variable
+ # export SENTIENCE_PROXY="http://user:pass@proxy.example.com:8080"
+ browser = SentienceBrowser()
+
+ # Works with agents
+ llm = OpenAIProvider(api_key="your-key", model="gpt-4o")
+ agent = SentienceAgent(browser, llm)
+
+ with browser:
+     browser.page.goto("https://example.com")
+     agent.act("Search for products")
+     # All traffic routed through proxy with WebRTC leak protection
648
+ ```
649
+
650
+ **Features:**
651
+ - HTTP, HTTPS, SOCKS5 proxy support
652
+ - Username/password authentication
653
+ - Automatic self-signed SSL certificate handling
654
+ - WebRTC IP leak protection (automatic)
655
+
656
+ See `examples/residential_proxy_agent.py` for complete examples.
657
+
658
+ </details>
659
+
660
+ <details>
661
+ <summary><h3>🔐 Authentication Session Injection</h3></summary>
662
+
663
+ Inject pre-recorded authentication sessions (cookies + localStorage) to start your agent already logged in, bypassing login screens, 2FA, and CAPTCHAs. This saves tokens and reduces costs by eliminating login steps.
664
+
665
+ ```python
666
+ # Workflow 1: Inject pre-recorded session from file
667
+ from sentience import SentienceBrowser, save_storage_state
668
+
669
+ # Save session after manual login
670
+ browser = SentienceBrowser()
671
+ browser.start()
672
+ browser.goto("https://example.com")
673
+ # ... log in manually ...
674
+ save_storage_state(browser.context, "auth.json")
675
+
676
+ # Use saved session in future runs
677
+ browser = SentienceBrowser(storage_state="auth.json")
678
+ browser.start()
679
+ # Agent starts already logged in!
680
+
681
+ # Workflow 2: Persistent sessions (cookies persist across runs)
682
+ browser = SentienceBrowser(user_data_dir="./chrome_profile")
683
+ browser.start()
684
+ # First run: Log in
685
+ # Second run: Already logged in (cookies persist automatically)
686
+ ```
687
+
688
+ **Benefits:**
689
+ - Bypass login screens and CAPTCHAs with valid sessions
690
+ - Save 5-10 agent steps and hundreds of tokens per run
691
+ - Maintain stateful sessions for accessing authenticated pages
692
+ - Act as authenticated users (e.g., "Go to my Orders page")
693
+
694
+ See `examples/auth_injection_agent.py` for complete examples.
695
+
696
+ </details>
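Saved sessions go stale, so it can help to check a state file before reusing it. The sketch below assumes `auth.json` follows Playwright's storage-state layout (a `cookies` list whose entries carry an `expires` Unix timestamp, with `-1` for session cookies); this README does not confirm what `save_storage_state` writes, and `expired_cookies` is a hypothetical helper.

```python
import json
import time
from typing import Optional


def expired_cookies(state: dict, now: Optional[float] = None) -> list[str]:
    """Return names of cookies whose expiry time has passed.

    Session cookies (expires == -1 in Playwright's format) are skipped.
    """
    now = time.time() if now is None else now
    return [
        c["name"] for c in state.get("cookies", [])
        if c.get("expires", -1) != -1 and c["expires"] <= now
    ]


# Inline stand-in for the contents of auth.json.
state = json.loads("""
{"cookies": [
  {"name": "session", "value": "abc", "domain": "example.com", "path": "/", "expires": -1},
  {"name": "token", "value": "xyz", "domain": "example.com", "path": "/", "expires": 1000}
 ],
 "origins": []}
""")
stale = expired_cookies(state, now=2000)
print(stale)
```

If the list is non-empty, logging in again and re-saving the state is likely safer than starting the agent with a partially expired session.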
697
+
698
+ ---
699
+
700
+ ## 💡 Best Practices
701
+
702
+ <details>
703
+ <summary>Click to expand best practices</summary>
704
+
705
+ ### 1. Wait for Dynamic Content
706
+ ```python
707
+ browser.goto("https://example.com", wait_until="domcontentloaded")
708
+ time.sleep(1) # Extra buffer for AJAX/animations
709
+ ```
710
+
711
+ ### 2. Use Multiple Strategies for Finding Elements
712
+ ```python
+ # Try exact match first
+ btn = find(snap, "role=button text='Add to Cart'")
+
+ # Fallback to fuzzy match
+ if not btn:
+     btn = find(snap, "role=button text~'cart'")
719
+ ```
720
+
721
+ ### 3. Check Element Visibility Before Clicking
722
+ ```python
+ if element.in_viewport and not element.is_occluded:
+     click(browser, element.id)
725
+ ```
726
+
727
+ ### 4. Handle Navigation
728
+ ```python
+ result = click(browser, link_id)
+ if result.url_changed:
+     browser.page.wait_for_load_state("networkidle")
732
+ ```
733
+
734
+ ### 5. Use Screenshots Sparingly
735
+ ```python
736
+ # Fast - no screenshot (only element data)
737
+ snap = snapshot(browser)
738
+
739
+ # Slower - with screenshot (for debugging/verification)
740
+ snap = snapshot(browser, SnapshotOptions(screenshot=True))
741
+ ```
742
+
743
+ </details>
744
+
745
+ ---
746
+
747
+ ## 🛠️ Troubleshooting
748
+
749
+ <details>
750
+ <summary>Click to expand common issues and solutions</summary>
751
+
752
+ ### "Extension failed to load"
753
+ **Solution:** Build the extension first:
754
+ ```bash
755
+ cd sentience-chrome
756
+ ./build.sh
757
+ ```
758
+
759
+ ### "Element not found"
760
+ **Solutions:**
761
+ - Ensure page is loaded: `browser.page.wait_for_load_state("networkidle")`
762
+ - Use `wait_for()`: `wait_for(browser, "role=button", timeout=10)`
763
+ - Debug elements: `print([el.text for el in snap.elements])`
764
+
765
+ ### Button not clickable
766
+ **Solutions:**
767
+ - Check visibility: `element.in_viewport and not element.is_occluded`
768
+ - Scroll to element: `browser.page.evaluate(f"window.sentience_registry[{element.id}].scrollIntoView()")`
769
+
770
+ </details>
771
+
772
+ ---
773
+
774
+ ## 🔬 Advanced Features (v0.12.0+)
775
+
776
+ <details>
777
+ <summary><h3>📊 Agent Tracing & Debugging</h3></summary>
778
+
779
+ The SDK now includes built-in tracing infrastructure for debugging and analyzing agent behavior:
780
+
781
+ ```python
+ from sentience import SentienceBrowser, SentienceAgent
+ from sentience.llm_provider import OpenAIProvider
+ from sentience.tracing import Tracer, JsonlTraceSink
+ from sentience.agent_config import AgentConfig
+
+ # Create tracer to record agent execution
+ tracer = Tracer(
+     run_id="my-agent-run-123",
+     sink=JsonlTraceSink("trace.jsonl")
+ )
+
+ # Configure agent behavior
+ config = AgentConfig(
+     snapshot_limit=50,
+     temperature=0.0,
+     max_retries=1,
+     capture_screenshots=True
+ )
+
+ browser = SentienceBrowser()
+ llm = OpenAIProvider(api_key="your-key", model="gpt-4o")
+
+ # Pass tracer and config to agent
+ agent = SentienceAgent(browser, llm, tracer=tracer, config=config)
+
+ with browser:
+     browser.page.goto("https://example.com")
+
+     # All actions are automatically traced
+     agent.act("Click the sign in button")
+     agent.act("Type 'user@example.com' into email field")
+
+ # Trace events saved to trace.jsonl
+ # Events: step_start, snapshot, llm_query, action, step_end, error
816
+ ```
817
+
818
+ **Trace Events Captured:**
819
+ - `step_start` - Agent begins executing a goal
820
+ - `snapshot` - Page state captured
821
+ - `llm_query` - LLM decision made (includes tokens, model, response)
822
+ - `action` - Action executed (click, type, press)
823
+ - `step_end` - Step completed successfully
824
+ - `error` - Error occurred during execution
825
+
826
+ **Use Cases:**
827
+ - Debug why agent failed or got stuck
828
+ - Analyze token usage and costs
829
+ - Replay agent sessions
830
+ - Train custom models from successful runs
831
+ - Monitor production agents
832
+
833
+ </details>
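A JSONL trace is convenient to post-process because each line is an independent JSON object. The sketch below counts events by type. It assumes each record stores its event name under a `"type"` key, which is a guess: the real field names are defined by the package's `trace_v1.json` schema, not shown in this README. `count_events` is a hypothetical helper, and an in-memory sample stands in for `trace.jsonl`.

```python
import io
import json
from collections import Counter


def count_events(lines) -> Counter:
    """Count trace events by type, skipping blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line:
            # "type" is an assumed field name, see the note above.
            counts[json.loads(line)["type"]] += 1
    return counts


sample = io.StringIO(
    '{"type": "step_start"}\n'
    '{"type": "snapshot"}\n'
    '{"type": "llm_query"}\n'
    '{"type": "action"}\n'
    '{"type": "step_end"}\n'
)
counts = count_events(sample)
print(dict(counts))
```

With a real trace, an open file handle can be passed in place of the `StringIO` object, since both iterate line by line.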
834
+
835
+ <details>
836
+ <summary><h3>🧰 Snapshot Utilities</h3></summary>
837
+
838
+ New utility functions for working with snapshots:
839
+
840
+ ```python
841
+ from sentience import snapshot
842
+ from sentience.utils import compute_snapshot_digests, canonical_snapshot_strict
843
+ from sentience.formatting import format_snapshot_for_llm
844
+
845
+ snap = snapshot(browser)
846
+
847
+ # Compute snapshot fingerprints (detect page changes)
848
+ digests = compute_snapshot_digests(snap.elements)
849
+ print(f"Strict digest: {digests['strict']}") # Changes when text changes
850
+ print(f"Loose digest: {digests['loose']}") # Only changes when layout changes
851
+
852
+ # Format snapshot for LLM prompts
853
+ llm_context = format_snapshot_for_llm(snap, limit=50)
854
+ print(llm_context)
855
+ # Output: [1] <button> "Sign In" {PRIMARY,CLICKABLE} @ (100,50) (Imp:10)
856
+ ```
857
+
858
+ </details>
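The difference between the strict and loose digests (one reacts to text changes, the other only to layout changes) can be illustrated with a hash over a canonical JSON form. This is a sketch of the idea only: `digest` is a hypothetical function, the choice of SHA-256 and of `role` plus `bbox` as the "layout" fields are assumptions, and `compute_snapshot_digests` may canonicalize differently.

```python
import hashlib
import json


def digest(elements: list[dict], include_text: bool) -> str:
    """SHA-256 over a canonical JSON form of the elements."""
    canonical = [
        {
            "role": el["role"],
            "bbox": el["bbox"],
            # Text is only part of the strict fingerprint.
            **({"text": el["text"]} if include_text else {}),
        }
        for el in elements
    ]
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


before = [{"role": "button", "text": "Sign In", "bbox": {"x": 100, "y": 50, "width": 80, "height": 30}}]
after = [{"role": "button", "text": "Signing in...", "bbox": {"x": 100, "y": 50, "width": 80, "height": 30}}]

strict_changed = digest(before, include_text=True) != digest(after, include_text=True)
loose_changed = digest(before, include_text=False) != digest(after, include_text=False)
print(strict_changed, loose_changed)
```

Here only the button label changed, so the strict fingerprint differs while the loose one stays the same.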
859
+
860
+ ---
861
+
862
+ ## 📖 Documentation
863
+
864
+ - **📖 [Amazon Shopping Guide](../docs/AMAZON_SHOPPING_GUIDE.md)** - Complete tutorial with real-world example
865
+ - **📖 [Query DSL Guide](docs/QUERY_DSL.md)** - Advanced query patterns and operators
866
+ - **📄 [API Contract](../spec/SNAPSHOT_V1.md)** - Snapshot API specification
867
+ - **📄 [Type Definitions](../spec/sdk-types.md)** - TypeScript/Python type definitions
868
+
869
+ ---
870
+
871
+ ## 💻 Examples & Testing
872
+
873
+ <details>
874
+ <summary><h3>Examples</h3></summary>
875
+
876
+ See the `examples/` directory for complete working examples:
877
+
878
+ - **`hello.py`** - Extension bridge verification
879
+ - **`basic_agent.py`** - Basic snapshot and element inspection
880
+ - **`query_demo.py`** - Query engine demonstrations
881
+ - **`wait_and_click.py`** - Waiting for elements and performing actions
882
+ - **`read_markdown.py`** - Content extraction and markdown conversion
883
+
884
+ </details>
885
+
886
+ <details>
887
+ <summary><h3>Testing</h3></summary>
888
+
889
+ ```bash
890
+ # Run all tests
891
+ pytest tests/
892
+
893
+ # Run specific test file
894
+ pytest tests/test_snapshot.py
895
+
896
+ # Run with verbose output
897
+ pytest -v tests/
898
+ ```
899
+
900
+ </details>
901
+
902
+ ---
903
+
904
+ ## License & Commercial Use
905
+
906
+ ### Open Source SDK
907
+ The Sentience SDK is dual-licensed under the [MIT License](./LICENSE-MIT) or [Apache 2.0](./LICENSE-APACHE), at your option. You are free to use, modify, and distribute this SDK in your own projects (including commercial ones) under the terms of whichever license you choose.
908
+
909
+ ### Commercial Platform
910
+ While the SDK is open source, the **Sentience Cloud Platform** (API, Hosting, Sentience Studio) is a commercial service.
911
+
912
+ **We offer Commercial Licenses for:**
913
+ * **High-Volume Production:** Usage beyond the free tier limits.
914
+ * **SLA & Support:** Guaranteed uptime and dedicated engineering support.
915
+ * **On-Premise / Self-Hosted Gateway:** If you need to run the Sentience Gateway (Rust+ONNX) in your own VPC for compliance (e.g., banking/healthcare), you need an Enterprise License.
916
+
917
+ [Contact Us](mailto:support@sentienceapi.com) for Enterprise inquiries.