playwright-genie 1.0.0 → 2.0.0

package/README.md CHANGED
@@ -1,435 +1,697 @@
1
- # playwright-nlp-locator
1
+ # playwright-genie
2
2
 
3
- > 🎯 Find Playwright locators using natural language descriptions powered by LLM
3
+ > Find and interact with Playwright elements using natural language powered by any LLM
4
4
 
5
- [![npm version](https://img.shields.io/npm/v/playwright-nlp-locator.svg)](https://www.npmjs.com/package/playwright-nlp-locator)
5
+ [![npm version](https://img.shields.io/npm/v/playwright-genie.svg)](https://www.npmjs.com/package/playwright-genie)
6
6
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
7
7
 
8
- **playwright-nlp-locator** enables you to write browser automation scripts using plain English instead of manually inspecting elements and writing complex selectors. Simply describe the element you want to interact with, and the library will find the best Playwright locator for it.
8
+ **playwright-genie** lets you write Playwright tests in plain English. No more hunting for selectors: just describe the element and the genie finds it.
9
9
 
10
- ## Features
10
+ ## Features
11
11
 
12
- - 🗣️ **Natural Language Descriptions** - Describe elements in plain English
13
- - 🎯 **Smart Locator Generation** - Returns optimal locators (role-based > text-based > CSS)
14
- - 📊 **Confidence Scores** - Know how confident the match is
15
- - 🔄 **Alternative Suggestions** - Get multiple locator options
16
- - 🛠️ **TypeScript Support** - Full type definitions included
17
- - 🤖 **LLM-Powered** - Uses Abacus.AI APIs for intelligent matching
18
- - 📦 **Production Ready** - Error handling, edge cases, and best practices
12
+ - **Natural Language**: describe elements in plain English, no selectors needed
13
+ - **40+ Playwright Actions**: click, fill, check, hover, drag, wait, screenshot and more
14
+ - **Any LLM Provider**: OpenAI, Claude, Ollama, Azure, or any OpenAI-compatible API
15
+ - **Adaptive Page Analysis** — automatically switches between ARIA, hybrid, and DOM-only modes based on page accessibility quality
16
+ - **DOM Tree Pipeline** — generates XPath and CSS selectors in-browser for pages with poor accessibility
17
+ - **Smart Caching** — two-tier cache (memory + disk `.locator-cache.json`) to minimize LLM calls
18
+ - **Fallback Locator Chains**: the LLM returns multiple locator strategies; if the primary fails, fallbacks are tried automatically
19
+ - **Action-Aware** — `fill('username')` targets the input, not the label
20
+ - **Auto-Retry** — stale cached locators are automatically invalidated and re-resolved
21
+ - **Batch Resolution** — `prefetch()` resolves multiple queries in a single LLM call
22
+ - **iframe Support** — automatically detects and resolves elements inside iframes
23
+ - **TypeScript Support** — full type definitions included
24
+ - **Single Page Object** — one `createSmartLocator(page)` works across all navigations
19
25
 
20
- ## 📦 Installation
26
+ ## Installation
21
27
 
22
28
  ```bash
23
- npm install playwright-nlp-locator playwright
29
+ npm install playwright-genie
24
30
  ```
25
31
 
26
32
  ### Prerequisites
27
33
 
28
- 1. **Playwright** - The library works with Playwright
29
- 2. **Python 3** - Required for LLM API calls
30
- 3. **Abacus.AI Python SDK** - Install with `pip install abacusai`
31
- 4. **Abacus.AI API Key** - Set the `ABACUS_API_KEY` environment variable
34
+ - **Node.js** >= 18
35
+ - **Playwright** >= 1.40
36
+ - An **LLM API key** (OpenAI, Anthropic, or any OpenAI-compatible provider)
32
37
 
33
- ```bash
34
- pip install abacusai
35
- export ABACUS_API_KEY="your-api-key-here"
38
+ ## Setup
39
+
40
+ Create a `.env` file in your project root:
41
+
42
+ ```env
43
+ # Option 1: OpenAI
44
+ LLM_API_KEY=sk-your-openai-key
45
+ LLM_MODEL=gpt-4o-mini
46
+
47
+ # Option 2: Anthropic (via OpenAI-compatible proxy)
48
+ LLM_API_KEY=your-anthropic-key
49
+ LLM_BASE_URL=https://your-proxy.com/v1
50
+ LLM_MODEL=claude-sonnet-4-20250514
51
+
52
+ # Option 3: Ollama (local, free)
53
+ LLM_API_KEY=ollama
54
+ LLM_BASE_URL=http://localhost:11434/v1
55
+ LLM_MODEL=llama3
56
+
57
+ # Option 4: Azure OpenAI
58
+ LLM_API_KEY=your-azure-key
59
+ LLM_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment
60
+ LLM_MODEL=gpt-4o-mini
61
+ ```
62
+
63
+ Also supports `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `ROUTELLM_API_KEY` as fallbacks.
64
+
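The README says `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `ROUTELLM_API_KEY` are accepted as fallbacks, but it does not state the precedence. The sketch below shows one plausible resolution order. `resolveLlmConfig` is a hypothetical helper written for illustration, not an export of playwright-genie, and the order in which the fallbacks are checked is an assumption.

```js
// Hypothetical helper: NOT part of playwright-genie's public API.
// Assumption: LLM_API_KEY wins, then the documented fallbacks in the order listed.
function resolveLlmConfig(env) {
  const apiKey =
    env.LLM_API_KEY ||
    env.OPENAI_API_KEY ||
    env.ANTHROPIC_API_KEY ||
    env.ROUTELLM_API_KEY ||
    null;

  return {
    apiKey,
    // null means "not set"; how the library picks a default is not shown in the README
    baseUrl: env.LLM_BASE_URL || null,
    model: env.LLM_MODEL || null,
  };
}

// Usage with a plain object standing in for process.env
const config = resolveLlmConfig({ OPENAI_API_KEY: 'sk-example', LLM_MODEL: 'gpt-4o-mini' });
console.log(config.apiKey); // 'sk-example'
```

Passing the environment in as an argument, rather than reading `process.env` directly, keeps the sketch easy to test.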
65
+ ## Quick Start
66
+
67
+ ### With Playwright Test
68
+
69
+ ```js
70
+ import { test } from '@playwright/test';
71
+ import { createSmartLocator } from 'playwright-genie';
72
+
73
+ test('login flow', async ({ page }) => {
74
+ const smart = createSmartLocator(page);
75
+
76
+ await page.goto('https://myapp.com/login');
77
+ await smart.fill('username', 'admin');
78
+ await smart.fill('password', 'secret123');
79
+ await smart.click('login button');
80
+ await smart.waitForVisible('welcome heading');
81
+ });
82
+ ```
83
+
84
+ ### Standalone Script
85
+
86
+ ```js
87
+ import { chromium } from 'playwright';
88
+ import { createSmartLocator } from 'playwright-genie';
89
+
90
+ const browser = await chromium.launch();
91
+ const page = await browser.newPage();
92
+ const smart = createSmartLocator(page);
93
+
94
+ await page.goto('https://myapp.com');
95
+ await smart.click('sign in link');
96
+ await smart.fill('email field', 'user@example.com');
97
+ await smart.fill('password field', 'secret');
98
+ await smart.click('submit button');
99
+
100
+ await browser.close();
101
+ ```
102
+
103
+ ## How It Works
104
+
105
+ When you call `smart.click('login button')`, the library:
106
+
107
+ 1. **Collects page structure** — gathers the ARIA accessibility tree, interactive elements, special attributes (`data-testid`, `placeholder`, `aria-label`), and a full DOM tree with XPath/CSS selectors
108
+ 2. **Evaluates ARIA quality** — scores the page as `good`, `sparse`, or `none` based on how many named interactive elements the ARIA tree contains
109
+ 3. **Builds an adaptive payload** — selects the best strategy:
110
+ - **`aria` mode** — rich ARIA tree with good accessibility; uses ARIA snapshot + special elements
111
+ - **`hybrid` mode** — sparse ARIA; combines the ARIA tree with DOM tree nodes for better coverage
112
+ - **`dom` mode** — no useful ARIA; sends DOM tree with XPaths and CSS selectors generated in-browser
113
+ 4. **Queries the LLM** — sends the payload with your natural language query; the LLM returns a Playwright locator string (e.g., `getByRole('button', { name: 'Login' })`) along with fallback locators
114
+ 5. **Validates and caches** — verifies the locator resolves to a real element, caches it to memory and disk, and returns a `SmartAction` for interaction
115
+ 6. **Auto-recovers** — if a cached locator goes stale, it's invalidated and re-resolved; if the primary locator fails, fallback locators are tried automatically
116
+
117
+ ## API Reference
118
+
119
+ ### `createSmartLocator(page, options?)`
120
+
121
+ Creates a smart locator instance bound to a Playwright page. Works across navigations — no need to recreate it.
122
+
123
+ ```js
124
+ const smart = createSmartLocator(page, { verbose: true });
125
+ ```
126
+
127
+ **Options:**
128
+
129
+ | Option | Type | Default | Description |
130
+ |--------|------|---------|-------------|
131
+ | `verbose` | `boolean` | `false` | Log resolved locators to console |
132
+ | `debug` | `boolean` | `false` | Enable detailed debug logging |
133
+ | `model` | `string` | env var | Override LLM model |
134
+ | `temperature` | `number` | `0` | LLM temperature |
135
+ | `maxTokens` | `number` | `1024` | Max response tokens |
136
+ | `actionTimeout` | `number` | `10000` | Timeout for actions in ms |
137
+
138
+ ---
139
+
140
+ ### Interaction Actions
141
+
142
+ ```js
143
+ await smart.click('login button');
144
+ await smart.click('submit', { force: true });
145
+
146
+ await smart.dblclick('editable cell');
147
+
148
+ await smart.fill('username', 'Admin');
149
+ await smart.fill('email field', 'a@b.com', { timeout: 5000 });
150
+
151
+ await smart.type('search box', 'hello');
152
+
153
+ await smart.pressSequentially('otp input', '123456', { delay: 100 });
154
+
155
+ await smart.press('search box', 'Enter');
156
+
157
+ await smart.clear('email field');
158
+
159
+ await smart.hover('profile menu');
160
+
161
+ await smart.focus('first input');
162
+
163
+ await smart.tap('mobile menu icon');
164
+
165
+ await smart.select('country dropdown', 'India');
166
+
167
+ await smart.selectText('paragraph content');
36
168
  ```
37
169
 
38
- ## 🚀 Quick Start
170
+ ---
171
+
172
+ ### Checkbox & Radio
39
173
 
40
- ```javascript
41
- const { chromium } = require('playwright');
42
- const { findLocator, createNLPHelper } = require('playwright-nlp-locator');
174
+ ```js
175
+ await smart.check('remember me checkbox');
43
176
 
44
- async function main() {
45
- const browser = await chromium.launch();
46
- const page = await browser.newPage();
47
- await page.goto('https://example.com');
177
+ await smart.uncheck('newsletter opt-in');
48
178
 
49
- // Option 1: Get the locator string
50
- const result = await findLocator(page, 'Login Button');
51
- console.log(result.locator); // e.g., "page.getByRole('button', { name: 'Login' })"
52
- console.log(result.confidence); // e.g., 0.95
179
+ await smart.setChecked('terms checkbox', true);
180
+ ```
53
181
 
54
- // Option 2: Use the NLP Helper for direct interaction
55
- const nlp = createNLPHelper(page);
56
- await nlp.click('Submit button');
57
- await nlp.type('Email input field', 'user@example.com');
58
- await nlp.type('Password field', 'secret123');
182
+ ---
59
183
 
60
- await browser.close();
61
- }
184
+ ### File Upload
62
185
 
63
- main();
186
+ ```js
187
+ await smart.setInputFiles('file upload', '/path/to/file.pdf');
188
+ await smart.setInputFiles('avatar input', ['/img1.png', '/img2.png']);
64
189
  ```
65
190
 
66
- ## 📖 API Reference
191
+ ---
67
192
 
68
- ### `findLocator(page, description, options?)`
193
+ ### Drag & Drop
69
194
 
70
- Find a Playwright locator using natural language description.
195
+ ```js
196
+ const { source, target } = await smart.dragTo('card item', 'drop zone');
197
+ ```
198
+
199
+ ---
200
+
201
+ ### Wait Actions
71
202
 
72
- **Parameters:**
73
- - `page` - Playwright Page object
74
- - `description` - Natural language description of the element
75
- - `options` - Optional configuration object
203
+ ```js
204
+ await smart.waitForVisible('success toast');
205
+ await smart.waitForVisible('modal', 10000);
76
206
 
77
- **Returns:** `Promise<FindLocatorResult>`
207
+ await smart.waitForHidden('loading spinner');
78
208
 
79
- ```javascript
80
- const result = await findLocator(page, 'email input field');
209
+ await smart.waitForAttached('dynamic table');
81
210
 
82
- // Result object:
83
- {
84
- found: true,
85
- locator: "page.getByPlaceholder('Enter your email')",
86
- locatorType: 'placeholder',
87
- confidence: 0.92,
88
- explanation: 'Input field with placeholder text indicating email entry',
89
- alternatives: [
90
- { locator: "page.locator('#email')", confidence: 0.85 }
91
- ],
92
- elementIndex: 3
93
- }
211
+ await smart.waitForDetached('old modal');
212
+
213
+ await smart.waitFor('element', { state: 'visible', timeout: 5000 });
94
214
  ```
95
215
 
96
- ### `getLocator(page, description, options?)`
216
+ ---
97
217
 
98
- Get an actual Playwright Locator object ready for interaction.
218
+ ### State Queries
99
219
 
100
- ```javascript
101
- const { locator, result } = await getLocator(page, 'Submit button');
220
+ ```js
221
+ const visible = await smart.isVisible('error message');
222
+ const hidden = await smart.isHidden('loading spinner');
223
+ const enabled = await smart.isEnabled('submit button');
224
+ const disabled = await smart.isDisabled('locked field');
225
+ const checked = await smart.isChecked('terms checkbox');
226
+ const editable = await smart.isEditable('readonly field');
227
+ const found = await smart.exists('optional element');
228
+ ```
102
229
 
103
- // Use the locator directly
104
- await locator.click();
105
- await locator.fill('text');
106
- const text = await locator.textContent();
230
+ ---
231
+
232
+ ### Content Retrieval
233
+
234
+ ```js
235
+ const text = await smart.getText('welcome heading');
236
+ const inner = await smart.getInnerText('article body');
237
+ const html = await smart.getInnerHTML('rich content area');
238
+ const value = await smart.getInputValue('email field');
239
+ const attr = await smart.getAttribute('link', 'href');
240
+ const box = await smart.getBoundingBox('hero image');
241
+ const num = await smart.count('list items');
107
242
  ```
108
243
 
109
- ### `findAllLocators(page, description, options?)`
244
+ ---
245
+
246
+ ### Scroll & Visual
110
247
 
111
- Find multiple matching elements for a description.
248
+ ```js
249
+ await smart.scrollIntoView('footer section');
112
250
 
113
- ```javascript
114
- const matches = await findAllLocators(page, 'navigation links');
251
+ const buffer = await smart.screenshot('chart area', { path: 'chart.png' });
115
252
 
116
- // Returns array of matches:
117
- [
118
- { locator: "page.getByRole('link', { name: 'Home' })", confidence: 0.90, type: 'role' },
119
- { locator: "page.getByRole('link', { name: 'About' })", confidence: 0.88, type: 'role' }
120
- ]
253
+ await smart.highlight('target element');
121
254
  ```
122
255
 
123
- ### `createNLPHelper(page, options?)`
256
+ ---
257
+
258
+ ### `smart.locate()` — Resolve Once, Act Many Times
259
+
260
+ When you need multiple actions on the same element, use `locate()` to resolve the locator once:
124
261
 
125
- Create a helper object with natural language methods.
262
+ ```js
263
+ const el = await smart.locate('username');
264
+ await el.clear();
265
+ await el.fill('NewAdmin');
266
+ await el.press('Tab');
267
+ console.log(await el.inputValue());
268
+ console.log(await el.isEnabled());
126
269
 
127
- ```javascript
128
- const nlp = createNLPHelper(page);
270
+ // Access the raw Playwright locator
271
+ const loc = el.rawLocator;
129
272
 
130
- // Available methods:
131
- await nlp.click('Login button');
132
- await nlp.type('Username field', 'john_doe');
133
- await nlp.select('Country dropdown', 'USA');
134
- await nlp.hover('Profile menu');
135
- await nlp.waitFor('Loading spinner to disappear');
136
- const text = await nlp.getText('Welcome message');
137
- const exists = await nlp.exists('Error message');
273
+ // SmartAction has 40+ methods matching Playwright's Locator API
274
+ await el.click();
275
+ await el.hover();
276
+ await el.screenshot({ path: 'element.png' });
277
+ await el.waitForVisible();
278
+ await el.evaluate((node) => node.style.border = '2px solid red');
138
279
  ```
139
280
 
140
- ## ⚙️ Configuration Options
281
+ ---
141
282
 
142
- ```javascript
143
- const options = {
144
- // Maximum elements to analyze (default: 100)
145
- maxElements: 100,
146
-
147
- // Include hidden elements (default: false)
148
- includeHiddenElements: false,
149
-
150
- // Minimum confidence to accept (default: 0.5)
151
- confidenceThreshold: 0.5,
152
-
153
- // LLM model to use (default: 'claude-3-5-sonnet')
154
- model: 'claude-3-5-sonnet',
155
-
156
- // Locator preference order
157
- preferredLocatorOrder: ['role', 'text', 'testId', 'css', 'xpath']
158
- };
283
+ ### `smart.prefetch()` — Batch Resolve
159
284
 
160
- const result = await findLocator(page, 'Submit button', options);
285
+ Pre-resolve multiple locators in a single LLM call to save time and cost:
286
+
287
+ ```js
288
+ await smart.prefetch('username', 'password', 'login button');
289
+
290
+ // These now hit the cache — no LLM calls
291
+ await smart.fill('username', 'Admin');
292
+ await smart.fill('password', 'secret');
293
+ await smart.click('login button');
161
294
  ```
162
295
 
163
- ## 🎯 Best Practices for Descriptions
296
+ ---
297
+
298
+ ### Cache Management
299
+
300
+ ```js
301
+ smart.clearCache(); // clear in-memory cache only
302
+ smart.clearAllCaches(); // clear both memory + disk (.locator-cache.json)
303
+ ```
164
304
 
165
- ### Good Descriptions
305
+ Use `clearCache()` after SPA navigations where the DOM changes significantly (e.g., after login) to force fresh page analysis.
166
306
 
167
- ```javascript
168
- // Be specific about the element type
169
- await nlp.click('Login button');
170
- await nlp.type('Email input field', 'user@example.com');
171
- await nlp.click('Submit form button');
307
+ ## Low-Level API
172
308
 
173
- // Include context when needed
174
- await nlp.click('Add to cart button for the first product');
175
- await nlp.type('Search box in the header', 'shoes');
309
+ For advanced use cases, you can use the lower-level exports directly.
176
310
 
177
- // Use visual or functional descriptions
178
- await nlp.click('Red cancel button');
179
- await nlp.click('Hamburger menu icon');
180
- await nlp.click('Close modal X button');
311
+ ### `findLocator(page, query, options?)`
312
+
313
+ Resolves a single natural language query to a Playwright locator string without performing any action.
314
+
315
+ ```js
316
+ import { findLocator } from 'playwright-genie';
317
+
318
+ const result = await findLocator(page, 'submit button');
319
+ console.log(result);
320
+ // {
321
+ // found: true,
322
+ // strategy: 'role',
323
+ // locatorString: "getByRole('button', { name: 'Submit' })",
324
+ // confidence: 0.95,
325
+ // fallbackLocators: ["getByTestId('submit-btn')", "locator('#submit')"],
326
+ // isInFrame: false,
327
+ // frameSelector: null
328
+ // }
329
+ ```
330
+
331
+ ### `findAllMatches(page, query, options?)`
332
+
333
+ Returns an array of all matching locator results for a query.
334
+
335
+ ```js
336
+ import { findAllMatches } from 'playwright-genie';
337
+
338
+ const matches = await findAllMatches(page, 'navigation link');
339
+ ```
340
+
341
+ ### `getPageStructure(page, forceRefresh?)`
342
+
343
+ Returns the full page structure used for LLM resolution. Useful for debugging or building custom pipelines.
344
+
345
+ ```js
346
+ import { getPageStructure } from 'playwright-genie';
347
+
348
+ const structure = await getPageStructure(page);
349
+ console.log(structure.mainFrame.ariaQuality); // 'good' | 'sparse' | 'none'
350
+ console.log(structure.mainFrame.ariaTree); // ARIA snapshot (YAML string)
351
+ console.log(structure.mainFrame.domTree); // Array of DOM nodes with XPath/CSS
352
+ console.log(structure.mainFrame.interactiveElements); // Interactive element metadata
353
+ console.log(structure.mainFrame.specialElements); // Elements with data-testid, etc.
354
+ console.log(structure.frames); // iframe structures
355
+ ```
356
+
357
+ ### `resolveLocator(page, query, options?)`
358
+
359
+ Low-level resolver that checks memory cache → disk cache → LLM. Returns the raw result object without creating a Playwright locator.
360
+
361
+ ```js
362
+ import { resolveLocator } from 'playwright-genie';
363
+
364
+ const result = await resolveLocator(page, 'login button', { action: 'click' });
365
+ // result.source is 'memory', 'disk', or 'llm'
181
366
  ```
182
367
 
183
- ### Avoid Vague Descriptions
368
+ ### `getLocator(page, query, options?)`
184
369
 
185
- ```javascript
186
- // Too vague - could match multiple elements
187
- await nlp.click('button');
188
- await nlp.type('input', 'text');
370
+ Resolves a query and returns both the Playwright `Locator` object and the result metadata. Handles stale cache invalidation and fallback chains.
189
371
 
190
- // Better alternatives:
191
- await nlp.click('primary submit button');
192
- await nlp.type('username input', 'text');
372
+ ```js
373
+ import { getLocator } from 'playwright-genie';
374
+
375
+ const { locator, result } = await getLocator(page, 'email input', { action: 'fill' });
376
+ await locator.fill('user@example.com');
193
377
  ```
194
378
 
195
- ## 📚 Complete Examples
379
+ ### `clearCache()` / `clearAllCaches()`
196
380
 
197
- ### Example 1: Login Form Automation
381
+ Module-level cache clearing functions.
198
382
 
199
- ```javascript
200
- const { chromium } = require('playwright');
201
- const { createNLPHelper } = require('playwright-nlp-locator');
383
+ ```js
384
+ import { clearCache, clearAllCaches } from 'playwright-genie';
202
385
 
203
- async function loginAutomation() {
204
- const browser = await chromium.launch();
205
- const page = await browser.newPage();
206
-
386
+ clearCache(); // memory only
387
+ clearAllCaches(); // memory + disk
388
+ ```
389
+
390
+ ## Adaptive Page Analysis
391
+
392
+ The library automatically adapts to the accessibility quality of each page:
393
+
394
+ | ARIA Quality | Criteria | Mode | What Gets Sent to LLM |
395
+ |---|---|---|---|
396
+ | **`good`** | ARIA tree has 3+ named interactive elements, 20+ lines | `aria` | Trimmed ARIA tree + special elements + interactive elements |
397
+ | **`sparse`** | ARIA tree exists but fewer named elements than the page has | `hybrid` | ARIA tree + DOM tree nodes (XPath/CSS) + interactive elements |
398
+ | **`none`** | ARIA tree has < 5 lines or is missing | `dom` | DOM tree with XPaths and CSS selectors + interactive elements |
399
+
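The table above can be read as a small decision function. The following is a minimal sketch under stated assumptions: `classifyAriaQuality` and `selectMode` are hypothetical names, the thresholds are copied from the table, and the `sparse` case is simplified to "anything that is neither `none` nor `good`" because the README does not spell out how the named-element comparison is made.

```js
// Hypothetical sketch of the quality-to-mode mapping; not the library's actual code.
const MODE_BY_QUALITY = { good: 'aria', sparse: 'hybrid', none: 'dom' };

function classifyAriaQuality(ariaTree, namedInteractiveCount) {
  // Count non-empty lines of the ARIA snapshot (a YAML string per the README)
  const lineCount = ariaTree
    ? ariaTree.split('\n').filter((line) => line.trim() !== '').length
    : 0;

  if (lineCount < 5) return 'none'; // missing or near-empty tree
  if (namedInteractiveCount >= 3 && lineCount >= 20) return 'good';
  return 'sparse'; // assumption: everything in between
}

function selectMode(ariaTree, namedInteractiveCount) {
  return MODE_BY_QUALITY[classifyAriaQuality(ariaTree, namedInteractiveCount)];
}

// Usage: a synthetic 25-line tree with four named interactive elements
const richTree = Array.from({ length: 25 }, (_, i) => `- button "B${i}"`).join('\n');
console.log(selectMode(richTree, 4)); // 'aria'
console.log(selectMode('', 0)); // 'dom'
```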
400
+ ### DOM Tree Pipeline
401
+
402
+ For pages with poor or no accessibility markup, the library walks the DOM in-browser and:
403
+
404
+ - Traverses up to **300 visible nodes** (headings, links, buttons, inputs, landmarks, etc.)
405
+ - Generates **XPath** for each node (e.g., `//*[@id="login"]`, `//form/div[2]/input[1]`)
406
+ - Generates **unique CSS selectors** (e.g., `#login`, `[data-testid="submit"]`, `button.primary`)
407
+ - Extracts text content, ARIA labels, placeholders, `data-testid` attributes, and parent context
408
+ - Filters nodes by **relevance scoring** against your query before sending to the LLM
409
+
410
+ This means the library works on any page — not just accessible ones.
411
+
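The README does not describe how relevance scoring is computed. As an illustration only, the sketch below uses simple token overlap between the query and each node's text-like fields. `tokenize`, `scoreNode`, `topNodes`, and the node field names (`text`, `ariaLabel`, `placeholder`, `testId`) are all assumptions made for this example.

```js
// Hypothetical relevance filter; the library's real scoring may differ.
function tokenize(value) {
  return value.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Score = fraction of query tokens that appear in the node's text-like fields
function scoreNode(node, query) {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return 0;

  const fields = [node.text, node.ariaLabel, node.placeholder, node.testId];
  const nodeTokens = new Set(tokenize(fields.filter(Boolean).join(' ')));

  let hits = 0;
  for (const token of queryTokens) {
    if (nodeTokens.has(token)) hits += 1;
  }
  return hits / queryTokens.length;
}

// Keep only nodes with some overlap, best matches first, capped at `limit`
function topNodes(nodes, query, limit) {
  return nodes
    .map((node) => ({ node, score: scoreNode(node, query) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.node);
}

const nodes = [
  { text: 'Login', testId: 'login-btn' },
  { text: 'Footer' },
];
console.log(topNodes(nodes, 'login button', 5).length); // 1
```

Filtering before the LLM call keeps the payload small, which is presumably why the pipeline scores nodes against the query at all.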
412
+ ## Caching
413
+
414
+ **playwright-genie** uses a two-level cache to minimize LLM calls:
415
+
416
+ 1. **Memory cache** — instant lookups within the same test run
417
+ 2. **Disk cache** (`.locator-cache.json`) — persists across runs
418
+
419
+ Cache keys are scoped by **URL pathname + action + query**, so `fill('username')` on `/login` won't collide with `click('username')` on `/dashboard`.
420
+
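To make the scoping concrete, here is a minimal sketch of how such a key could be built. `buildCacheKey`, the `::` separator, and the query normalization are assumptions for illustration; the README only states that the key combines URL pathname, action, and query.

```js
// Hypothetical key builder; the actual key format used by the library is not documented.
function buildCacheKey(pageUrl, action, query) {
  // Only the pathname is used, so query strings and hashes do not fragment the cache
  const { pathname } = new URL(pageUrl);
  // Assumption: queries are trimmed and lowercased before being used in the key
  const normalizedQuery = query.trim().toLowerCase();
  return `${pathname}::${action}::${normalizedQuery}`;
}

console.log(buildCacheKey('https://myapp.com/login?next=1', 'fill', 'username'));
// '/login::fill::username'
console.log(buildCacheKey('https://myapp.com/dashboard', 'click', 'username'));
// '/dashboard::click::username'
```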
421
+ If a cached locator becomes stale (element no longer exists), the library:
422
+ 1. Tries **fallback locators** returned by the LLM
423
+ 2. If all fallbacks fail, **invalidates the cache** and re-queries the LLM with fresh page structure
424
+
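The recovery order described above (primary locator, then fallbacks, then a fresh LLM query) can be sketched as a loop over candidate locator strings. `firstWorkingLocator` is a hypothetical helper, and the `resolves` predicate is injected so the sketch runs without a browser; in a real test it might check something like whether a Playwright locator's `count()` is greater than zero.

```js
// Hypothetical sketch of a fallback chain; not the library's implementation.
// `resolves` is an async predicate supplied by the caller.
async function firstWorkingLocator(candidates, resolves) {
  for (const candidate of candidates) {
    if (await resolves(candidate)) return candidate;
  }
  // null signals the caller to invalidate the cache entry and re-query the LLM
  return null;
}

// Usage with a stub: pretend only the test-id locator still matches an element
const stillOnPage = new Set(["getByTestId('submit-btn')"]);
firstWorkingLocator(
  ["getByRole('button', { name: 'Submit' })", "getByTestId('submit-btn')", "locator('#submit')"],
  async (candidate) => stillOnPage.has(candidate)
).then((winner) => console.log(winner)); // getByTestId('submit-btn')
```

Trying candidates sequentially rather than in parallel preserves the priority order the LLM returned.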
425
+ Set `LOCATOR_CACHE_FILE` env var to customize the cache file path.
426
+
427
+ ## LLM Provider Configuration
428
+
429
+ | Provider | `LLM_API_KEY` | `LLM_BASE_URL` | `LLM_MODEL` |
430
+ |----------|---------------|-----------------|-------------|
431
+ | OpenAI | `sk-...` | *(default)* | `gpt-4o-mini` |
432
+ | Anthropic | `sk-ant-...` | proxy URL | `claude-sonnet-4-20250514` |
433
+ | Ollama | `ollama` | `http://localhost:11434/v1` | `llama3` |
434
+ | Azure OpenAI | Azure key | deployment URL | `gpt-4o-mini` |
435
+ | RouteLLM | key | proxy URL | model name |
436
+
437
+ ## Complete Examples
438
+
439
+ ### Login Flow
440
+
441
+ ```js
442
+ import { test, expect } from '@playwright/test';
443
+ import { createSmartLocator } from 'playwright-genie';
444
+
445
+ test('complete login flow', async ({ page }) => {
446
+ const smart = createSmartLocator(page);
207
447
  await page.goto('https://myapp.com/login');
208
-
209
- const nlp = createNLPHelper(page);
210
-
211
- // Fill login form using natural language
212
- await nlp.type('Username or email input', 'john@example.com');
213
- await nlp.type('Password field', 'secretPassword123');
214
-
215
- // Check "Remember me" if it exists
216
- if (await nlp.exists('Remember me checkbox')) {
217
- await nlp.click('Remember me checkbox');
448
+
449
+ await smart.fill('username', 'admin');
450
+ await smart.fill('password', 'secret123');
451
+
452
+ if (await smart.exists('remember me checkbox')) {
453
+ await smart.check('remember me checkbox');
218
454
  }
219
-
220
- // Submit the form
221
- await nlp.click('Sign in button');
222
-
223
- // Wait for dashboard to load
224
- await nlp.waitFor('Dashboard header');
225
-
226
- console.log('Login successful!');
227
-
228
- await browser.close();
229
- }
230
- ```
231
-
232
- ### Example 2: E-commerce Shopping Flow
233
-
234
- ```javascript
235
- async function shoppingFlow() {
236
- const browser = await chromium.launch();
237
- const page = await browser.newPage();
238
- const nlp = createNLPHelper(page);
239
-
455
+
456
+ await smart.click('sign in button');
457
+ await smart.waitForVisible('dashboard heading');
458
+
459
+ const welcome = await smart.getText('welcome message');
460
+ expect(welcome).toContain('admin');
461
+ });
462
+ ```
463
+
464
+ ### E-commerce Flow
465
+
466
+ ```js
467
+ test('shopping flow', async ({ page }) => {
468
+ const smart = createSmartLocator(page);
240
469
  await page.goto('https://shop.example.com');
241
-
242
- // Search for a product
243
- await nlp.type('Search bar', 'wireless headphones');
244
- await nlp.click('Search button');
245
-
246
- // Filter results
247
- await nlp.click('Price filter dropdown');
248
- await nlp.click('Under $100 option');
249
-
250
- // Select a product
251
- await nlp.click('First product in the list');
252
-
253
- // Add to cart
254
- await nlp.select('Size selector', 'Medium');
255
- await nlp.click('Add to cart button');
256
-
257
- // Proceed to checkout
258
- await nlp.click('Shopping cart icon');
259
- await nlp.click('Proceed to checkout button');
260
-
261
- // Fill shipping info
262
- await nlp.type('First name field', 'John');
263
- await nlp.type('Last name field', 'Doe');
264
- await nlp.type('Address line 1', '123 Main St');
265
- await nlp.type('City input', 'New York');
266
- await nlp.select('State dropdown', 'NY');
267
- await nlp.type('Zip code', '10001');
268
-
269
- await nlp.click('Continue to payment button');
270
-
271
- await browser.close();
272
- }
273
- ```
274
-
275
- ### Example 3: Form Validation Testing
276
-
277
- ```javascript
278
- const { findLocator, getLocator } = require('playwright-nlp-locator');
279
-
280
- async function testFormValidation() {
281
- const browser = await chromium.launch();
282
- const page = await browser.newPage();
283
-
284
- await page.goto('https://myapp.com/signup');
285
-
286
- // Submit empty form to trigger validation
287
- const { locator: submitBtn } = await getLocator(page, 'Submit button');
288
- await submitBtn.click();
289
-
290
- // Check for validation errors
291
- const emailError = await findLocator(page, 'Email validation error message');
292
- if (emailError.found && emailError.confidence > 0.7) {
293
- console.log('Email validation working:', emailError.locator);
294
- }
295
-
296
- const passwordError = await findLocator(page, 'Password error message');
297
- if (passwordError.found) {
298
- const { locator } = await getLocator(page, 'Password error');
299
- const errorText = await locator.textContent();
300
- console.log('Password error:', errorText);
301
- }
302
-
303
- await browser.close();
304
- }
470
+
471
+ await smart.fill('search bar', 'wireless headphones');
472
+ await smart.press('search bar', 'Enter');
473
+ await smart.waitForVisible('product list');
474
+
475
+ await smart.click('first product card');
476
+ await smart.select('size dropdown', 'Medium');
477
+ await smart.click('add to cart button');
478
+
479
+ await smart.waitForVisible('cart badge');
480
+ const count = await smart.getText('cart badge');
481
+ expect(count).toBe('1');
482
+ });
305
483
  ```
306
484
 
307
- ### Example 4: Dynamic Content Handling
485
+ ### SPA Navigation with Cache Clearing
308
486
 
309
- ```javascript
310
- async function handleDynamicContent() {
311
- const browser = await chromium.launch();
312
- const page = await browser.newPage();
313
- const nlp = createNLPHelper(page);
314
-
487
+ ```js
488
+ test('SPA login and navigate', async ({ page }) => {
489
+ const smart = createSmartLocator(page);
490
+ await page.goto('https://spa-app.com/login');
491
+
492
+ await smart.fill('username', 'admin');
493
+ await smart.fill('password', 'secret');
494
+ await smart.click('login button');
495
+
496
+ // After SPA navigation, clear cache to force fresh page analysis
497
+ await page.waitForURL('**/dashboard');
498
+ smart.clearCache();
499
+
500
+ await smart.click('settings tab');
501
+ await smart.waitForVisible('settings panel');
502
+ });
503
+ ```
504
+
505
+ ### Batch Pre-fetch for Performance
506
+
507
+ ```js
508
+ test('prefetch for faster tests', async ({ page }) => {
509
+ const smart = createSmartLocator(page);
510
+ await page.goto('https://myapp.com/form');
511
+
512
+ // Resolve all locators in one LLM call
513
+ await smart.prefetch(
514
+ 'first name input',
515
+ 'last name input',
516
+ 'email field',
517
+ 'phone number',
518
+ 'submit button'
519
+ );
520
+
521
+ // All cached — zero LLM calls from here
522
+ await smart.fill('first name input', 'John');
523
+ await smart.fill('last name input', 'Doe');
524
+ await smart.fill('email field', 'john@example.com');
525
+ await smart.fill('phone number', '555-0123');
526
+ await smart.click('submit button');
527
+ });
528
+ ```
529
+
530
+ ### Dynamic Content & Modals
531
+
532
+ ```js
533
+ test('handle dynamic content', async ({ page }) => {
534
+ const smart = createSmartLocator(page);
315
535
  await page.goto('https://app.example.com');
316
-
317
- // Wait for dynamic content to load
318
- await nlp.waitFor('Content loaded indicator');
319
-
320
- // Handle modals
321
- if (await nlp.exists('Cookie consent popup')) {
322
- await nlp.click('Accept cookies button');
323
- }
324
-
325
- if (await nlp.exists('Newsletter subscription modal')) {
326
- await nlp.click('Close modal button');
327
- }
328
-
329
- // Interact with lazy-loaded content
330
- await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
331
- await nlp.waitFor('Load more button');
332
- await nlp.click('Load more button');
333
-
334
- await browser.close();
335
- }
336
- ```
337
-
338
- ## 🔧 Locator Types
339
-
340
- The library returns different types of locators based on element attributes:
341
-
342
- | Type | Example | Priority |
343
- |------|---------|----------|
344
- | Role | `page.getByRole('button', { name: 'Submit' })` | 1 (Highest) |
345
- | Text | `page.getByText('Click here')` | 2 |
346
- | Label | `page.getByLabel('Email address')` | 3 |
347
- | Placeholder | `page.getByPlaceholder('Enter email')` | 4 |
348
- | TestId | `page.getByTestId('submit-btn')` | 5 |
349
- | CSS | `page.locator('button.primary')` | 6 |
350
- | XPath | `page.locator('//button[@id="submit"]')` | 7 (Lowest) |
351
-
352
- ## 🐛 Error Handling
353
-
354
- ```javascript
355
- const { findLocator, getLocator } = require('playwright-nlp-locator');
356
-
357
- // Method 1: Check result
358
- const result = await findLocator(page, 'Missing element');
359
- if (!result.found) {
360
- console.log('Element not found:', result.error);
361
- }
362
-
363
- if (result.confidence < 0.5) {
364
- console.log('Low confidence match - may be incorrect');
365
- }
366
-
367
- // Method 2: Try-catch with getLocator
368
- try {
369
- const { locator } = await getLocator(page, 'Element description');
370
- await locator.click();
371
- } catch (error) {
372
- console.error('Failed to find element:', error.message);
373
- }
374
-
375
- // Method 3: Fallback locators
376
- const result = await findLocator(page, 'Submit button');
377
- if (result.confidence < 0.7 && result.alternatives?.length > 0) {
378
- console.log('Using alternative locator:', result.alternatives[0].locator);
379
- }
380
- ```
381
-
382
- ## 🔒 Security Notes
383
-
384
- - The library sends page DOM structure to the LLM API for analysis
385
- - Sensitive data in element values may be visible to the API
386
- - Consider using this in development/testing environments
387
- - For production, ensure compliance with your data policies
388
-
389
- ## 📄 TypeScript Usage
390
-
391
- ```typescript
392
- import { chromium, Page } from 'playwright';
393
- import {
394
- findLocator,
395
- getLocator,
396
- createNLPHelper,
397
- FindLocatorResult,
398
- NLPHelper,
399
- NLPLocatorConfig
400
- } from 'playwright-nlp-locator';
401
-
402
- async function typedExample(): Promise<void> {
403
- const browser = await chromium.launch();
404
- const page: Page = await browser.newPage();
405
-
406
- const config: NLPLocatorConfig = {
407
- confidenceThreshold: 0.7,
408
- maxElements: 50
409
- };
410
-
411
- const result: FindLocatorResult = await findLocator(page, 'Login button', config);
412
-
413
- if (result.found && result.confidence >= 0.7) {
414
- console.log(`Found: ${result.locator}`);
536
+
537
+ if (await smart.exists('cookie consent popup')) {
538
+ await smart.click('accept cookies button');
539
+ await smart.waitForHidden('cookie consent popup');
415
540
  }
416
-
417
- const nlp: NLPHelper = createNLPHelper(page, config);
418
- await nlp.click('Submit button');
419
-
420
- await browser.close();
421
- }
541
+
542
+ await smart.scrollIntoView('footer section');
543
+ await smart.waitForVisible('load more button');
544
+ await smart.click('load more button');
545
+ await smart.waitForHidden('loading spinner');
546
+ });
422
547
  ```
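
The check, click, wait-for-hidden pattern in the example above can be wrapped in a small helper. This is a minimal sketch, not part of the package: `dismissIfPresent` is a hypothetical name, and it assumes only that the object returned by `createSmartLocator` exposes the `exists`, `click`, and `waitForHidden` methods used above.

```javascript
// Hypothetical helper (not exported by playwright-genie).
// Dismisses a popup only when it is present and reports whether it acted.
async function dismissIfPresent(smart, popupDescription, closeButtonDescription) {
  // Skip quietly when the popup is not on the page.
  if (!(await smart.exists(popupDescription))) {
    return false;
  }
  await smart.click(closeButtonDescription);
  // Wait until the popup is gone so later steps do not race with it.
  await smart.waitForHidden(popupDescription);
  return true;
}
```

A call such as `await dismissIfPresent(smart, 'cookie consent popup', 'accept cookies button')` would then replace the inline `if` block.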
423
548
 
424
- ## 🤝 Contributing
549
+ ### Using Low-Level API for Debugging
425
550
 
426
- Contributions are welcome! Please feel free to submit a Pull Request.
551
+ ```js
552
+ import { getPageStructure, findLocator } from 'playwright-genie';
553
+
554
+ test('debug locator resolution', async ({ page }) => {
555
+ await page.goto('https://myapp.com');
556
+
557
+ // Inspect page analysis
558
+ const structure = await getPageStructure(page);
559
+ console.log('ARIA quality:', structure.mainFrame.ariaQuality);
560
+ console.log('DOM nodes:', structure.mainFrame.domTree.length);
561
+ console.log('Interactive elements:', structure.mainFrame.interactiveElements.length);
562
+
563
+ // See what the LLM resolves without acting
564
+ const result = await findLocator(page, 'submit button');
565
+ console.log('Strategy:', result.strategy);
566
+ console.log('Locator:', result.locatorString);
567
+ console.log('Fallbacks:', result.fallbackLocators);
568
+ });
569
+ ```
570
+
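
The `locatorString` and `fallbackLocators` fields logged above suggest a try-primary-then-fallback flow. The sketch below illustrates that idea only. It assumes `fallbackLocators` is an array of locator strings, which the README does not state, and `firstWorkingLocator` and `probe` are hypothetical names.

```javascript
// Hypothetical helper (not exported by playwright-genie).
// Returns the first candidate that the caller-supplied async `probe` accepts,
// or null when every candidate is rejected.
// Assumption: result.fallbackLocators is an array of strings (may be missing).
async function firstWorkingLocator(result, probe) {
  const candidates = [result.locatorString, ...(result.fallbackLocators ?? [])];
  for (const candidate of candidates) {
    // Skip empty entries, then ask the probe whether this candidate is usable.
    if (candidate && (await probe(candidate))) {
      return candidate;
    }
  }
  return null;
}
```

The probe is left to the caller on purpose, because the exact format of the locator strings is not specified here.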
571
+ ## Exports
572
+
573
+ ```js
574
+ import {
575
+ createSmartLocator, // Main entry — creates smart locator with 40+ action methods
576
+ findLocator, // Resolve a query to a locator string (no action)
577
+ findAllMatches, // Get all matching locator results
578
+ getPageStructure, // Get the full page structure (ARIA + DOM + interactive elements)
579
+ getLocator, // Resolve + validate + create Playwright Locator object
580
+ resolveLocator, // Low-level: cache lookup → LLM resolution
581
+ clearCache, // Clear in-memory cache
582
+ clearAllCaches, // Clear memory + disk cache
583
+ SmartAction, // Class wrapping a Playwright Locator with 40+ methods
584
+ chatCompletion, // Direct LLM call (for custom pipelines)
585
+ getConfig, // Get current LLM configuration
586
+ loadDiskCache, // Load disk cache manually
587
+ invalidateDiskEntry, // Invalidate a specific disk cache entry
588
+ } from 'playwright-genie';
589
+ ```
590
+
591
+ ## Best Practices
592
+
593
+ **Be specific about element type:**
594
+ ```js
595
+ await smart.click('login button'); // good
596
+ await smart.click('button'); // too vague
597
+ ```
598
+
599
+ **Include context when needed:**
600
+ ```js
601
+ await smart.click('delete button in first row');
602
+ await smart.fill('search box in header', 'shoes');
603
+ ```
604
+
605
+ **Use action-appropriate descriptions:**
606
+ ```js
607
+ await smart.fill('username', 'Admin'); // finds the input
608
+ await smart.click('username label'); // finds the label
609
+ ```
610
+
611
+ **Reuse the same instance across navigations:**
612
+ ```js
613
+ const smart = createSmartLocator(page);
614
+ await page.goto('/login');
615
+ await smart.fill('username', 'Admin');
616
+ await smart.click('login button');
617
+ // navigated to /dashboard — same smart object works
618
+ await smart.click('settings tab');
619
+ ```
620
+
621
+ **Use locate() for multiple actions on same element:**
622
+ ```js
623
+ const el = await smart.locate('search box');
624
+ await el.fill('query');
625
+ await el.press('Enter');
626
+ // 1 LLM call instead of 2
627
+ ```
427
628
 
428
- ## 📝 License
629
+ **Use prefetch() for forms and multi-element pages:**
630
+ ```js
631
+ await smart.prefetch('name', 'email', 'password', 'submit');
632
+ // 1 LLM call instead of 4
633
+ ```
634
+
635
+ **Clear cache after SPA navigation:**
636
+ ```js
637
+ await page.waitForURL('**/dashboard');
638
+ smart.clearCache();
639
+ ```
640
+
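
The caching advice above (reuse a resolved element, clear the cache after navigation) can be pictured with a tiny in-memory cache. This is an illustrative sketch, not the package's implementation: `createCachedResolver` is a hypothetical name, and `resolve` stands in for the expensive LLM-backed lookup.

```javascript
// Illustrative only: memoizes resolutions by description.
// `resolve` is a stand-in for an LLM-backed lookup, supplied by the caller.
function createCachedResolver(resolve) {
  const cache = new Map();
  return {
    // Resolve once per description; later calls reuse the stored result.
    async locate(description) {
      if (!cache.has(description)) {
        cache.set(description, await resolve(description));
      }
      return cache.get(description);
    },
    // Drop every entry, e.g. after the page content has changed.
    clearCache() {
      cache.clear();
    },
  };
}
```

With a cache like this, repeated lookups of the same description cost one resolution, and a stale entry survives until `clearCache()` is called. This is why clearing after an SPA route change matters.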
641
+ ## TypeScript
642
+
643
+ ```ts
644
+ import { test } from '@playwright/test';
645
+ import { createSmartLocator, SmartAction, SmartLocator } from 'playwright-genie';
646
+
647
+ test('typed example', async ({ page }) => {
648
+ const smart: SmartLocator = createSmartLocator(page);
649
+
650
+ const el: SmartAction = await smart.locate('username');
651
+ await el.fill('Admin');
652
+
653
+ const visible: boolean = await smart.isVisible('dashboard');
654
+ const text: string | null = await smart.getText('heading');
655
+ });
656
+ ```
657
+
658
+ ## Debug Mode
429
659
 
430
- MIT License - see the [LICENSE](LICENSE) file for details.
660
+ ```bash
661
+ LLM_LOCATOR_DEBUG=true npx playwright test
662
+ ```
663
+
664
+ This logs:
665
+ - Payload mode selected (`aria`/`hybrid`/`dom`) and payload size
666
+ - LLM queries and responses
667
+ - Cache hits/misses (memory and disk)
668
+ - Stale cache invalidations and fallback attempts
669
+ - Page structure collection timing
670
+
671
+ ## Environment Variables
672
+
673
+ | Variable | Description |
674
+ |---|---|
675
+ | `LLM_API_KEY` | API key for LLM provider |
676
+ | `LLM_BASE_URL` | Base URL for OpenAI-compatible API |
677
+ | `LLM_MODEL` | Model name (e.g., `gpt-4o-mini`) |
678
+ | `OPENAI_API_KEY` | Fallback API key |
679
+ | `ANTHROPIC_API_KEY` | Fallback API key |
680
+ | `ROUTELLM_API_KEY` | Fallback API key |
681
+ | `LOCATOR_CACHE_FILE` | Custom path for disk cache file |
682
+ | `LLM_LOCATOR_DEBUG` | Set to `true` to enable debug logging |
683
+
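
The table lists one primary key variable and three fallbacks but does not give their precedence. The sketch below shows one plausible resolution order. It assumes `LLM_API_KEY` wins and the fallbacks are tried in table order, which the package may not do, and `resolveApiKey` is a hypothetical name.

```javascript
// Assumed precedence (not confirmed by the README): LLM_API_KEY first,
// then the fallback variables in the order the table lists them.
const API_KEY_VARS = [
  'LLM_API_KEY',
  'OPENAI_API_KEY',
  'ANTHROPIC_API_KEY',
  'ROUTELLM_API_KEY',
];

// Hypothetical helper: returns { name, value } for the first non-blank key,
// or null when none of the variables is set.
function resolveApiKey(env = process.env) {
  for (const name of API_KEY_VARS) {
    const value = env[name];
    if (typeof value === 'string' && value.trim() !== '') {
      return { name, value };
    }
  }
  return null;
}
```

Returning the variable name alongside the value makes it easy to log which key was picked up without printing the secret itself.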
684
+ ## Security Notes
685
+
686
+ - The library sends the page's accessibility tree and/or DOM structure to your configured LLM API
687
+ - Sensitive data visible in the DOM may be sent to the API
688
+ - Use environment variables for API keys — never hardcode them
689
+ - For sensitive environments, use a local LLM (e.g., Ollama)
690
+
691
+ ## Contributing
692
+
693
+ Contributions are welcome! Please feel free to submit a Pull Request.
431
694
 
432
- ## 🙏 Acknowledgments
695
+ ## License
433
696
 
434
- - [Playwright](https://playwright.dev/) - The underlying browser automation framework
435
- - [Abacus.AI](https://abacus.ai/) - LLM API powering the natural language understanding
697
+ MIT License — see the [LICENSE](LICENSE) file for details.