assuremind 1.1.2 → 1.2.0
- package/CONTRIBUTING.md +13 -5
- package/README.md +89 -1
- package/dist/cli/index.js +2055 -410
- package/dist/cli/index.js.map +1 -1
- package/dist/index.d.mts +151 -12
- package/dist/index.d.ts +151 -12
- package/dist/index.js +49 -2
- package/dist/index.js.map +1 -1
- package/dist/index.mjs +49 -2
- package/dist/index.mjs.map +1 -1
- package/docs/CLI-REFERENCE.md +104 -0
- package/docs/GETTING-STARTED.md +64 -3
- package/docs/STUDIO.md +186 -0
- package/package.json +1 -1
- package/ui/dist/assets/index-DTtYd1hD.js +837 -0
- package/ui/dist/assets/index-lOAh29q9.css +1 -0
- package/ui/dist/assuremind-logo.png +0 -0
- package/ui/dist/favicon.svg +8 -36
- package/ui/dist/index.html +2 -2
- package/ui/dist/assets/index-By2Hw5l2.css +0 -1
- package/ui/dist/assets/index-DaQ-JHje.js +0 -819
package/docs/CLI-REFERENCE.md
CHANGED
|
@@ -327,3 +327,107 @@ MCP (Model Context Protocol) settings are configured in `autotest.config.json` (
|
|
|
327
327
|
| `mcp.idleTimeout` | `30000` | Auto-disconnect MCP browser after idle period in ms |
|
|
328
328
|
|
|
329
329
|
> **Note:** MCP is used only during code generation. Test execution (`npx assuremind run`) is never affected by MCP settings.
|
|
330
|
+
|
|
331
|
+
---
|
|
332
|
+
|
|
333
|
+
## RAG Configuration
|
|
334
|
+
|
|
335
|
+
RAG (Retrieval-Augmented Generation) is **ON by default — zero setup required**. The AI automatically learns from every test run, retrieving similar past steps and healing fixes to improve accuracy over time. No database, no external service — uses local TF-IDF embeddings and file-based JSON storage.
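To make "local TF-IDF embeddings" concrete, the sketch below shows one common way to score the similarity of two plain-English instructions with TF-IDF weights and cosine similarity. It is illustrative only: the function names (`tokenize`, `buildIdf`, `embed`, `cosine`), the tokenizer, and the smoothed IDF formula are assumptions for this example, not Assuremind's actual implementation.

```typescript
// Illustrative TF-IDF + cosine similarity; not Assuremind's internals.
type Vector = Map<string, number>;

// Lowercase and split into alphanumeric terms.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Smoothed inverse document frequency over a corpus of instructions.
function buildIdf(docs: string[]): Map<string, number> {
  const docFreq = new Map<string, number>();
  docs.forEach((doc) => {
    new Set(tokenize(doc)).forEach((term) => {
      docFreq.set(term, (docFreq.get(term) ?? 0) + 1);
    });
  });
  const idf = new Map<string, number>();
  docFreq.forEach((df, term) => {
    idf.set(term, Math.log((1 + docs.length) / (1 + df)) + 1);
  });
  return idf;
}

// Term frequency times IDF; terms missing from the vocabulary are skipped.
function embed(text: string, idf: Map<string, number>): Vector {
  const vec: Vector = new Map();
  tokenize(text).forEach((term) => {
    const weight = idf.get(term);
    if (weight !== undefined) vec.set(term, (vec.get(term) ?? 0) + weight);
  });
  return vec;
}

// Cosine similarity in [0, 1] for non-negative weights; 0 if either vector is empty.
function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((w, term) => {
    normA += w * w;
    dot += w * (b.get(term) ?? 0);
  });
  b.forEach((w) => {
    normB += w * w;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}
```

A score produced this way is the kind of value the similarity thresholds in the settings table would be compared against; everything runs in-process, with no network call.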
|
|
336
|
+
|
|
337
|
+
**Most users never need to change these settings.** They exist for power-user scenarios like debugging flaky tests, resetting memory after a major app redesign, or disabling RAG for deterministic CI runs.
|
|
338
|
+
|
|
339
|
+
Settings are configured in `autotest.config.json` (or via Settings → RAG Memory in Studio).
|
|
340
|
+
|
|
341
|
+
| Setting | Default | Description |
|
|
342
|
+
|---------|---------|-------------|
|
|
343
|
+
| `rag.enabled` | `true` | Master switch for semantic memory from past runs |
|
|
344
|
+
| `rag.codeCorpus.enabled` | `true` | Remember instruction-to-code mappings from successful runs |
|
|
345
|
+
| `rag.codeCorpus.maxEntries` | `500` | Maximum entries in the code corpus |
|
|
346
|
+
| `rag.codeCorpus.similarityThreshold` | `0.65` | Minimum similarity score to include as AI prompt example |
|
|
347
|
+
| `rag.codeCorpus.directUseThreshold` | `0.90` | Minimum similarity score to use directly as fuzzy cache hit ($0) |
|
|
348
|
+
| `rag.healingCorpus.enabled` | `true` | Remember past healing fixes for smarter self-repair |
|
|
349
|
+
| `rag.healingCorpus.maxEntries` | `300` | Maximum entries in the healing corpus |
|
|
350
|
+
| `rag.healingCorpus.similarityThreshold` | `0.60` | Minimum similarity score for healing retrieval |
|
|
351
|
+
| `rag.errorCatalog.enabled` | `true` | Track recurring error patterns per URL |
|
|
352
|
+
| `rag.errorCatalog.maxEntries` | `200` | Maximum entries in the error catalog |
|
|
353
|
+
| `rag.embedder` | `'tfidf'` | Embedding strategy: `tfidf` (local, free, offline) |
|
|
354
|
+
|
|
355
|
+
> **Note:** RAG data is stored in `results/.rag/` as JSON files. It persists across runs and is fully Git-friendly. Delete the directory to reset all memory.
|
|
356
|
+
|
|
357
|
+
### When to change RAG settings
|
|
358
|
+
|
|
359
|
+
| Scenario | What to do |
|
|
360
|
+
|----------|------------|
|
|
361
|
+
| Debugging a flaky test | Turn OFF `rag.codeCorpus.enabled` — forces fresh AI generation instead of reusing a possibly stale mapping |
|
|
362
|
+
| Healing keeps suggesting a bad fix | Turn OFF `rag.healingCorpus.enabled` so the bad past fix no longer influences healing |
|
|
363
|
+
| Major app redesign (UI overhaul) | Set `rag.enabled: false` — old memory from the previous UI is now misleading |
|
|
364
|
+
| Want deterministic CI runs | Disable RAG in CI config, keep ON for local development |
|
|
365
|
+
| Error warnings are outdated | Turn OFF `rag.errorCatalog.enabled` — stops AI from avoiding selectors that are fine now |
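For the "deterministic CI runs" scenario, one possible approach is to derive the `rag` block from the environment. The sketch below is an assumption-laden illustration: `ragSettingsFor` and `RAG_DEFAULTS` are hypothetical names, the defaults are copied from the settings table, and the check relies on the `CI=true` variable that GitHub Actions sets.

```typescript
// Shape of the `rag` block, mirroring the settings table.
interface RagSettings {
  enabled: boolean;
  codeCorpus: {
    enabled: boolean;
    maxEntries: number;
    similarityThreshold: number;
    directUseThreshold: number;
  };
  healingCorpus: { enabled: boolean; maxEntries: number; similarityThreshold: number };
  errorCatalog: { enabled: boolean; maxEntries: number };
  embedder: 'tfidf';
}

// Documented defaults.
const RAG_DEFAULTS: RagSettings = {
  enabled: true,
  codeCorpus: { enabled: true, maxEntries: 500, similarityThreshold: 0.65, directUseThreshold: 0.9 },
  healingCorpus: { enabled: true, maxEntries: 300, similarityThreshold: 0.6 },
  errorCatalog: { enabled: true, maxEntries: 200 },
  embedder: 'tfidf',
};

// Hypothetical helper: switch the master flag off when running in CI.
function ragSettingsFor(env: Record<string, string | undefined>): RagSettings {
  const inCi = env.CI === 'true';
  return { ...RAG_DEFAULTS, enabled: !inCi };
}
```

In a TypeScript config file, the result of `ragSettingsFor(process.env)` could be assigned to the `rag` key, which keeps RAG on for local development and off in CI.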
|
|
366
|
+
|
|
367
|
+
---
|
|
368
|
+
|
|
369
|
+
## Recorder API
|
|
370
|
+
|
|
371
|
+
The Test Recorder is available through the Studio UI (Test Editor → Record button) and also exposes REST endpoints for programmatic access:
|
|
372
|
+
|
|
373
|
+
| Endpoint | Method | Description |
|
|
374
|
+
|----------|--------|-------------|
|
|
375
|
+
| `/api/recorder/start` | POST | Launch a headed browser and begin recording. Body: `{ url?: string }` |
|
|
376
|
+
| `/api/recorder/stop` | POST | Stop recording and return all captured actions |
|
|
377
|
+
| `/api/recorder/status` | GET | Check if a recording session is active |
|
|
378
|
+
|
|
379
|
+
Recorded actions are broadcast in real time via WebSocket (`recorder:action` event). When recording stops, a `recorder:stopped` event is sent.
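A minimal sketch of driving the start and stop endpoints programmatically follows. Several things here are assumptions rather than documented behavior: that request bodies are JSON, that responses are JSON, and the helper names `recordSession` and `FetchLike`. The fetch implementation is passed in so the flow can be exercised with a stub.

```typescript
// Minimal fetch-like signature so the sketch can run against a stub.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Start a recording, run `interact`, then always stop and return the stop payload.
async function recordSession(
  baseUrl: string, // Studio address (assumption: same host and port as the Studio UI)
  fetchImpl: FetchLike,
  interact: () => Promise<void>, // whatever should happen while recording is active
  startUrl?: string,
): Promise<unknown> {
  const post = async (path: string, body?: unknown): Promise<unknown> => {
    const init =
      body === undefined
        ? { method: 'POST' }
        : {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(body),
          };
    const res = await fetchImpl(baseUrl + path, init);
    if (!res.ok) throw new Error(`POST ${path} failed with HTTP ${res.status}`);
    return res.json();
  };

  await post('/api/recorder/start', startUrl ? { url: startUrl } : {});
  let captured: unknown;
  try {
    await interact();
  } finally {
    // Stop even if `interact` throws, so the headed browser is not left open.
    captured = await post('/api/recorder/stop');
  }
  return captured;
}
```

With Node 18 or later, the global `fetch` should be usable as `fetchImpl`. The shape of the returned actions is not specified in this reference, so the sketch leaves it as `unknown`.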
|
|
380
|
+
|
|
381
|
+
### Soft Assertions
|
|
382
|
+
|
|
383
|
+
The recorder supports **soft assertions** via `Ctrl+Shift+Click` — these use Playwright's `expect.soft()` so the test continues executing even when the assertion fails. All failures are collected and reported at the end.
|
|
384
|
+
|
|
385
|
+
For plain-English steps, prefix with "Soft" to generate soft assertion code:
|
|
386
|
+
- `Soft verify "Dashboard" text is visible` → `await expect.soft(page.getByText('Dashboard')).toBeVisible();`
|
|
387
|
+
- `Soft check the URL contains "/dashboard"` → `await expect.soft(page).toHaveURL(/dashboard/);`
|
|
388
|
+
|
|
389
|
+
You can also toggle any assertion step between hard and soft using the **Hard/Soft** badge in the Test Editor.
|
|
390
|
+
|
|
391
|
+
### Recorder locator strategies
|
|
392
|
+
|
|
393
|
+
The recorder resolves locators against Playwright's live accessibility tree using 6 strategies in priority order, followed by a CSS selector as a last-resort fallback:
|
|
394
|
+
|
|
395
|
+
| Priority | Strategy | Example |
|
|
396
|
+
|----------|----------|---------|
|
|
397
|
+
| 1 | `data-testid` | `page.getByTestId('login-btn')` |
|
|
398
|
+
| 2 | `getByRole()` exact + level | `page.getByRole('heading', { name: 'Dashboard', level: 1 })` |
|
|
399
|
+
| 3 | `getByRole()` exact | `page.getByRole('button', { name: 'Login' })` |
|
|
400
|
+
| 4 | `getByLabel()` | `page.getByLabel('Email')` |
|
|
401
|
+
| 5 | `getByPlaceholder()` | `page.getByPlaceholder('Enter email')` |
|
|
402
|
+
| 6 | `getByText()` | `page.getByText('Welcome')` |
|
|
403
|
+
| 7 | CSS fallback | `page.locator('#login-btn')` |
|
|
404
|
+
|
|
405
|
+
Each candidate is verified with `count() === 1` — only uniquely-matching locators are used.
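The "first uniquely-matching candidate wins" rule can be sketched as below. This is an illustration, not the recorder's source: `Countable`, `Candidate`, and `firstUniqueCandidate` are hypothetical names, and `Countable` is just the structural subset of a Playwright `Locator` (its `count()` method) that the check needs.

```typescript
// Structural subset of a Playwright Locator: only count() is required.
interface Countable {
  count(): Promise<number>;
}

interface Candidate<T extends Countable> {
  strategy: string; // label such as 'data-testid' or 'getByRole'
  locator: T;
}

// Walk the candidates in priority order and return the first that matches exactly one element.
async function firstUniqueCandidate<T extends Countable>(
  candidates: Candidate<T>[],
): Promise<Candidate<T> | undefined> {
  for (const candidate of candidates) {
    if ((await candidate.locator.count()) === 1) return candidate;
  }
  return undefined; // nothing matched uniquely
}
```

With real Playwright objects, the candidate list could be built from calls such as `page.getByTestId(...)`, `page.getByRole(...)`, and `page.getByLabel(...)`, ordered as in the table above.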
|
|
406
|
+
|
|
407
|
+
### Iframe support
|
|
408
|
+
|
|
409
|
+
The recorder automatically handles **same-origin iframes** — essential for enterprise apps like SAP and Salesforce:
|
|
410
|
+
|
|
411
|
+
- The capture script is injected into **all frames**, not just the main page
|
|
412
|
+
- Elements inside iframes produce `frameLocator()` chains:
|
|
413
|
+
```typescript
|
|
414
|
+
// Element inside an iframe with id="content-frame"
|
|
415
|
+
await page.frameLocator('#content-frame').getByRole('button', { name: 'Submit' }).click();
|
|
416
|
+
```
|
|
417
|
+
- The iframe selector is computed from the iframe's `id`, `name`, `data-testid`, or `src` attribute
|
|
418
|
+
- Dynamically loaded iframes are detected and instrumented automatically
|
|
419
|
+
- **Limitation:** Cross-origin iframes cannot be instrumented due to browser security restrictions
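As a rough sketch of how an iframe selector might be derived from those attributes, consider the helper below. It assumes the attributes are tried in the order listed (`id`, `name`, `data-testid`, `src`); the real priority and escaping rules are not documented here, and `IframeAttrs` and `iframeSelector` are hypothetical names.

```typescript
interface IframeAttrs {
  id?: string;
  name?: string;
  testId?: string; // value of data-testid
  src?: string;
}

// Build an attribute selector, escaping quotes and backslashes in the value.
function attrSelector(name: string, value: string): string {
  const escaped = value.replace(/["\\]/g, '\\$&');
  return `iframe[${name}="${escaped}"]`;
}

// Assumed priority: id, then name, then data-testid, then src.
function iframeSelector(attrs: IframeAttrs): string | undefined {
  if (attrs.id) {
    // Simple ids can use the short #id form; anything else falls back to an attribute selector.
    return /^[A-Za-z_][\w-]*$/.test(attrs.id) ? `#${attrs.id}` : attrSelector('id', attrs.id);
  }
  if (attrs.name) return attrSelector('name', attrs.name);
  if (attrs.testId) return attrSelector('data-testid', attrs.testId);
  if (attrs.src) return attrSelector('src', attrs.src);
  return undefined; // no usable attribute
}
```

The returned string is the kind of value that would be passed to `page.frameLocator(...)`, as in the `#content-frame` example above.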
|
|
420
|
+
|
|
421
|
+
---
|
|
422
|
+
|
|
423
|
+
### CI/CD tips
|
|
424
|
+
|
|
425
|
+
```yaml
|
|
426
|
+
# Cache RAG memory between CI runs for persistent learning
|
|
427
|
+
- name: Cache RAG memory
|
|
428
|
+
uses: actions/cache@v4
|
|
429
|
+
with:
|
|
430
|
+
path: results/.rag
|
|
431
|
+
key: rag-memory-${{ github.ref }}
|
|
432
|
+
restore-keys: rag-memory-
|
|
433
|
+
```
|
package/docs/GETTING-STARTED.md
CHANGED
|
@@ -128,6 +128,13 @@ export default defineConfig({
|
|
|
128
128
|
actThenScript: false, // Two-phase generation (higher accuracy, slower)
|
|
129
129
|
proactiveHealing: false, // Pre-run selector validation
|
|
130
130
|
},
|
|
131
|
+
rag: {
|
|
132
|
+
enabled: true, // AI learns from past runs (semantic memory)
|
|
133
|
+
codeCorpus: { enabled: true, maxEntries: 500, similarityThreshold: 0.65, directUseThreshold: 0.90 },
|
|
134
|
+
healingCorpus: { enabled: true, maxEntries: 300, similarityThreshold: 0.60 },
|
|
135
|
+
errorCatalog: { enabled: true, maxEntries: 200 },
|
|
136
|
+
embedder: 'tfidf', // local, free, offline — no API calls
|
|
137
|
+
},
|
|
131
138
|
});
|
|
132
139
|
```
|
|
133
140
|
|
|
@@ -159,7 +166,24 @@ Your browser opens at `http://localhost:4400`. From there:
|
|
|
159
166
|
|
|
160
167
|
See [STUDIO.md](./STUDIO.md) for a full Studio walkthrough.
|
|
161
168
|
|
|
162
|
-
### Option B —
|
|
169
|
+
### Option B — Record a test (fastest, zero AI)
|
|
170
|
+
|
|
171
|
+
Start the Studio and use the built-in **Test Recorder** to create tests by clicking through your app:
|
|
172
|
+
|
|
173
|
+
1. Click **Test Editor** → select a suite and case (or create new ones)
|
|
174
|
+
2. Click the red **Record** button in the step editor
|
|
175
|
+
3. A headed Chromium browser opens your app — interact naturally:
|
|
176
|
+
- Click buttons, fill forms, navigate pages
|
|
177
|
+
- **Shift+Click** any element to assert it's visible (hard assertion — test stops on failure)
|
|
178
|
+
- **Ctrl+Shift+Click** any element for a **soft assertion** (test continues, failures collected at end)
|
|
179
|
+
- **Ctrl+Shift+U** to assert the current URL
|
|
180
|
+
- **Ctrl+Shift+T** to assert the page title
|
|
181
|
+
4. Click **Stop Recording** — all actions become steps with pre-generated Playwright code
|
|
182
|
+
5. Click **Run** to execute immediately
|
|
183
|
+
|
|
184
|
+
**No AI calls, no API keys needed** — the recorder resolves locators against Playwright's accessibility tree in real time.
|
|
185
|
+
|
|
186
|
+
### Option C — CLI generate command
|
|
163
187
|
|
|
164
188
|
Generate a full test suite from a user story in one command:
|
|
165
189
|
|
|
@@ -171,7 +195,7 @@ npx assuremind generate \
|
|
|
171
195
|
|
|
172
196
|
This creates the suite file structure under `tests/login-tests/` and generates Playwright code for every step.
|
|
173
197
|
|
|
174
|
-
### Option
|
|
198
|
+
### Option D — Write JSON directly
|
|
175
199
|
|
|
176
200
|
Create `tests/login-tests/suite.json`:
|
|
177
201
|
|
|
@@ -372,14 +396,51 @@ Or manage variables in the Studio → **Variables** page.
|
|
|
372
396
|
|
|
373
397
|
---
|
|
374
398
|
|
|
375
|
-
## 12.
|
|
399
|
+
## 12. RAG Memory — AI Gets Smarter Over Time
|
|
400
|
+
|
|
401
|
+
RAG (Retrieval-Augmented Generation) is **ON by default — zero setup required**. The AI automatically learns from every test run:
|
|
402
|
+
|
|
403
|
+
- **Run 1** — memory is empty, AI generates code normally
|
|
404
|
+
- **Run 2+** — similar instructions are retrieved from past runs instead of making API calls (free + faster)
|
|
405
|
+
- **Run 10+** — most common steps served from memory at zero cost, self-healing resolves issues on the first attempt
|
|
406
|
+
|
|
407
|
+
### What happens automatically
|
|
408
|
+
|
|
409
|
+
| Event | What RAG does |
|
|
410
|
+
|-------|---------------|
|
|
411
|
+
| Step passes | Remembers the instruction → code mapping |
|
|
412
|
+
| Step fails | Records the error pattern for that URL |
|
|
413
|
+
| Healing fixes a step | Remembers the error → fix pair |
|
|
414
|
+
| Next similar step | Retrieves past code instead of calling AI ($0) |
|
|
415
|
+
| Next similar error | Uses proven past fix in the healing prompt |
|
|
416
|
+
|
|
417
|
+
### Storage
|
|
418
|
+
|
|
419
|
+
RAG data lives in `results/.rag/` as plain JSON files. To share memory across your team, commit it to Git. To reset, delete the folder.
|
|
420
|
+
|
|
421
|
+
### When to change RAG settings
|
|
422
|
+
|
|
423
|
+
Most users **never need to touch RAG settings**. The Settings → RAG Memory card exists for edge cases:
|
|
424
|
+
|
|
425
|
+
| Scenario | Action |
|
|
426
|
+
|----------|--------|
|
|
427
|
+
| Debugging a flaky test | Turn OFF Code Corpus — forces fresh AI generation |
|
|
428
|
+
| Healing keeps suggesting a bad fix | Turn OFF Healing Corpus |
|
|
429
|
+
| Major app redesign | Turn OFF RAG entirely — old memory is misleading |
|
|
430
|
+
| Want deterministic CI runs | Disable RAG in CI, keep ON locally |
|
|
431
|
+
|
|
432
|
+
---
|
|
433
|
+
|
|
434
|
+
## 13. Next Steps
|
|
376
435
|
|
|
377
436
|
| Topic | Where to look |
|
|
378
437
|
|-------|---------------|
|
|
438
|
+
| Test Recorder | [Studio Guide → Test Recorder](./STUDIO.md#test-recorder) |
|
|
379
439
|
| All CLI flags | [CLI Reference](./CLI-REFERENCE.md) |
|
|
380
440
|
| Studio UI walkthrough | [Studio Guide](./STUDIO.md) |
|
|
381
441
|
| Build the package from source | [CONTRIBUTING.md](../CONTRIBUTING.md) |
|
|
382
442
|
| Config options | `autotest.config.ts` comments |
|
|
383
443
|
| MCP integration | Settings page → MCP Integration |
|
|
444
|
+
| RAG memory | Settings page → RAG Memory |
|
|
384
445
|
| All supported AI providers | `.env.example` |
|
|
385
446
|
| Quick daily reference | `ASSUREMIND.md` in your project root |
|
package/docs/STUDIO.md
CHANGED
|
@@ -144,6 +144,113 @@ If all three are unchecked, Lighthouse is skipped for that case. A warning is sh
|
|
|
144
144
|
- Reference variables with double-braces: `Enter "{{ADMIN_EMAIL}}" in the email field`
|
|
145
145
|
- Include assertions: `Verify that the success banner is visible`
|
|
146
146
|
|
|
147
|
+
### Test Recorder
|
|
148
|
+
|
|
149
|
+
The Test Recorder lets you create tests by interacting with your application in a real browser — **no coding, no AI, no cost**.
|
|
150
|
+
|
|
151
|
+
#### Recording a test
|
|
152
|
+
|
|
153
|
+
1. Open a test case in the editor
|
|
154
|
+
2. Click the red **Record** button (between "Add Step" and "Generate All")
|
|
155
|
+
3. A headed Chromium browser opens your app's base URL
|
|
156
|
+
4. Interact with the page naturally — click buttons, fill forms, navigate
|
|
157
|
+
5. **Iframes are handled automatically** — the recorder injects into all frames and generates correct `frameLocator()` code
|
|
158
|
+
6. Each action appears in the live preview panel in real time
|
|
159
|
+
7. Click **Stop Recording** when done
|
|
160
|
+
|
|
161
|
+
Every recorded action becomes a test step with **pre-generated Playwright code** — no AI generation needed.
|
|
162
|
+
|
|
163
|
+
#### Assertion shortcuts
|
|
164
|
+
|
|
165
|
+
While recording, use these keyboard shortcuts to add assertions:
|
|
166
|
+
|
|
167
|
+
| Shortcut | Assertion Type | Behavior | Example Output |
|
|
168
|
+
|----------|---------------|----------|----------------|
|
|
169
|
+
| **Shift+Click** | Hard assertion — element visible | Test **stops** on failure | `await expect(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();` |
|
|
170
|
+
| **Ctrl+Shift+Click** | Soft assertion — element visible | Test **continues**, failures collected at end | `await expect.soft(page.getByRole('heading', { name: 'Dashboard' })).toBeVisible();` |
|
|
171
|
+
| **Ctrl+Shift+U** | Current URL matches | Hard assertion | `await expect(page).toHaveURL(/dashboard/);` |
|
|
172
|
+
| **Ctrl+Shift+T** | Page title matches | Hard assertion | `await expect(page).toHaveTitle(/Dashboard/);` |
|
|
173
|
+
|
|
174
|
+
Assertion steps show a green badge in the live preview. Soft assertions show a blue **Soft** badge in the step editor.
|
|
175
|
+
|
|
176
|
+
#### Hard vs Soft Assertions
|
|
177
|
+
|
|
178
|
+
| Type | Keyword | Behavior | When to use |
|
|
179
|
+
|------|---------|----------|-------------|
|
|
180
|
+
| **Hard** | `Verify ...` | Test stops immediately on failure | Critical checks — login worked, page loaded |
|
|
181
|
+
| **Soft** | `Soft verify ...` | Test continues, all failures reported at end | Multiple checks on one page — verify 5 fields are correct |
|
|
182
|
+
|
|
183
|
+
**Three ways to use soft assertions:**
|
|
184
|
+
|
|
185
|
+
1. **Recorder** — Ctrl+Shift+Click instead of Shift+Click
|
|
186
|
+
2. **Plain English** — Start instruction with "Soft verify..." or "Soft check..." or "Soft assert..."
|
|
187
|
+
3. **Step toggle** — Click the **Hard/Soft** badge on any assertion step in the editor to switch
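The plain-English prefix rule and the Hard/Soft toggle can be pictured with the small sketch below. It is an assumption about how such a toggle could work on generated code, not the Studio's implementation; `isSoftInstruction` and `toggleAssertion` are hypothetical names.

```typescript
// Matches instructions that start with "Soft verify", "Soft check", or "Soft assert".
const SOFT_PREFIX = /^\s*soft\s+(verify|check|assert)\b/i;

function isSoftInstruction(instruction: string): boolean {
  return SOFT_PREFIX.test(instruction);
}

// Flip the first assertion in a code line between expect(...) and expect.soft(...).
function toggleAssertion(code: string): string {
  if (code.indexOf('expect.soft(') !== -1) {
    return code.replace('expect.soft(', 'expect(');
  }
  return code.replace(/\bexpect\(/, 'expect.soft(');
}
```

For example, toggling `await expect(page).toHaveURL(/dashboard/);` yields the `expect.soft(...)` form, and toggling again restores the hard assertion.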
|
|
188
|
+
|
|
189
|
+
#### How locators are resolved
|
|
190
|
+
|
|
191
|
+
The recorder does **not** use simple CSS selectors or XPath. Instead, it:
|
|
192
|
+
|
|
193
|
+
1. Captures raw element metadata from the DOM (tag, attributes, text, ARIA properties)
|
|
194
|
+
2. Sends the metadata to Node.js where Playwright's accessibility tree is queried
|
|
195
|
+
3. Tries **6 locator strategies** in priority order, then a CSS fallback, verifying each with `count() === 1`:
|
|
196
|
+
- `data-testid` attribute
|
|
197
|
+
- `getByRole()` with exact name + heading level
|
|
198
|
+
- `getByRole()` with exact name
|
|
199
|
+
- `getByLabel()`
|
|
200
|
+
- `getByPlaceholder()`
|
|
201
|
+
- `getByText()`
|
|
202
|
+
- CSS fallback (last resort)
|
|
203
|
+
4. The first strategy that uniquely identifies the element is used
|
|
204
|
+
|
|
205
|
+
This favors resilient, user-facing locators, comparable to those in carefully hand-written Playwright tests.
|
|
206
|
+
|
|
207
|
+
#### What makes it stand out vs other recorders
|
|
208
|
+
|
|
209
|
+
| Feature | Selenium IDE | Playwright Codegen | Assuremind Recorder |
|
|
210
|
+
|---------|-------------|-------------------|---------------------|
|
|
211
|
+
| Locator quality | CSS/XPath | Good | Best — 6 strategies, verified against live page |
|
|
212
|
+
| Accessibility tree | No | Partial | Full — every locator checked via Playwright API |
|
|
213
|
+
| **Iframe support** | Partial | Manual | **Auto** — detects iframes, generates `frameLocator()` code |
|
|
214
|
+
| Assertions | Manual | Manual | Shift+Click (hard), Ctrl+Shift+Click (soft), URL & title shortcuts |
|
|
215
|
+
| Plain-English steps | No | No | Yes — human-readable instructions auto-generated |
|
|
216
|
+
| Self-healing after | No | No | Yes — 5-level AI healing cascade |
|
|
217
|
+
| RAG memory | No | No | Yes — recorded steps feed the learning loop |
|
|
218
|
+
| Cost | Free | Free | Free |
|
|
219
|
+
|
|
220
|
+
#### Biggest pain points solved
|
|
221
|
+
|
|
222
|
+
| Pain Point | How the Recorder Solves It |
|
|
223
|
+
|-----------|---------------------------|
|
|
224
|
+
| Writing tests is slow | Record a full test in 30 seconds |
|
|
225
|
+
| Selectors break constantly | Locators verified against Playwright's accessibility tree in real time |
|
|
226
|
+
| AI costs money | Recording + code generation = $0, zero AI calls |
|
|
227
|
+
| Non-technical testers can't write tests | Anyone who can click a browser can create tests |
|
|
228
|
+
| Assertions are hard to write | Shift+Click (hard), Ctrl+Shift+Click (soft), Ctrl+Shift+U for URL, Ctrl+Shift+T for title |
|
|
229
|
+
| Recorded tests are fragile | 6-strategy locator resolution + post-run 5-level self-healing |
|
|
230
|
+
| Apps use iframes (SAP, Salesforce) | Auto-detects iframe context, generates `frameLocator()` chains |
|
|
231
|
+
|
|
232
|
+
#### Iframe support
|
|
233
|
+
|
|
234
|
+
The recorder automatically handles **same-origin iframes** — common in enterprise apps like SAP, Salesforce, and embedded widgets:
|
|
235
|
+
|
|
236
|
+
- The capture script is injected into **all frames** (main page + every iframe), not just the top-level page
|
|
237
|
+
- When an element is inside an iframe, the recorder detects the frame context and computes a selector for the iframe element (using `id`, `name`, `data-testid`, or `src`)
|
|
238
|
+
- Locators are resolved against the correct frame using `page.frameLocator('...')` — producing code like:
|
|
239
|
+
```typescript
|
|
240
|
+
await page.frameLocator('#content-frame').getByRole('button', { name: 'Submit' }).click();
|
|
241
|
+
```
|
|
242
|
+
- Dynamically added iframes are detected via `frameattached` events and injected automatically
|
|
243
|
+
- The recording banner only appears in the main frame — iframes display cleanly
|
|
244
|
+
- **Limitation:** Cross-origin iframes (different domain) cannot be instrumented due to browser security. The main frame and same-origin iframes work fully.
|
|
245
|
+
|
|
246
|
+
#### Technical details
|
|
247
|
+
|
|
248
|
+
- The recorder uses `context.exposeBinding()` for browser↔Node.js communication that survives page navigations — works across all frames
|
|
249
|
+
- Input values are captured on blur (when the user leaves the field), not on keystrokes — no duplicate steps
|
|
250
|
+
- The capture script is re-injected on every page load via `page.on('load')` and into all child frames via `page.on('frameattached')` + `page.on('framenavigated')`
|
|
251
|
+
- Recorded steps use the `recorder` strategy tag, distinguishing them from AI-generated code
|
|
252
|
+
- All recorded steps are immediately usable — no "Generate Code" step required
|
|
253
|
+
|
|
147
254
|
### Generating Code
|
|
148
255
|
|
|
149
256
|
After adding steps, click **Generate Code** (or **Generate** on an individual step) to have the AI write the Playwright code.
|
|
@@ -355,6 +462,11 @@ Edit `autotest.config.ts` from the browser:
|
|
|
355
462
|
| MCP headless | Run MCP browser without visible window |
|
|
356
463
|
| MCP act-then-script | Two-phase generation: execute via MCP first, convert to code |
|
|
357
464
|
| MCP proactive healing | Validate selectors against live page before test runs |
|
|
465
|
+
| RAG enabled | Master switch for semantic memory from past runs (ON by default) |
|
|
466
|
+
| RAG Code Corpus | Remember instruction-to-code mappings from successful runs |
|
|
467
|
+
| RAG Healing Corpus | Remember past healing fixes for smarter self-repair |
|
|
468
|
+
| RAG Error Catalog | Track recurring error patterns to avoid known-bad selectors |
|
|
469
|
+
| RAG Embedder | Similarity engine: TF-IDF (local, free, offline) |
|
|
358
470
|
|
|
359
471
|
Changes are saved immediately to both `autotest.config.json` and `autotest.config.ts`.
|
|
360
472
|
|
|
@@ -390,6 +502,78 @@ If MCP fails at any point (browser crash, timeout, network issue), generation si
|
|
|
390
502
|
|
|
391
503
|
---
|
|
392
504
|
|
|
505
|
+
## RAG Memory (Retrieval-Augmented Generation)
|
|
506
|
+
|
|
507
|
+
RAG gives the AI **semantic memory** from past test runs. Every successful step, every healing fix, and every recurring error is indexed so the AI can retrieve similar experiences and generate better code over time.
|
|
508
|
+
|
|
509
|
+
### Three Corpora
|
|
510
|
+
|
|
511
|
+
| Corpus | What it stores | When it's used |
|
|
512
|
+
|--------|---------------|----------------|
|
|
513
|
+
| **Code Corpus** | Instruction-to-code mappings from successful runs | During generation — similar past steps retrieved as examples (score 0.65-0.90) or used directly as fuzzy cache hits (score >= 0.90) |
|
|
514
|
+
| **Healing Corpus** | Past healing events (instruction + error + fix) | During self-healing — proven past fixes injected into Level 2 repair prompt |
|
|
515
|
+
| **Error Catalog** | Recurring error patterns per URL with fix history | During generation — AI warned about known-bad selectors/patterns to avoid |
|
|
516
|
+
|
|
517
|
+
### How It Works
|
|
518
|
+
|
|
519
|
+
1. **Step passes** → the instruction + code are ingested into the Code Corpus.
|
|
520
|
+
2. **Step fails** → the error pattern is recorded in the Error Catalog.
|
|
521
|
+
3. **Healing succeeds** → the fix is ingested into the Healing Corpus, and the Error Catalog is updated with the fix.
|
|
522
|
+
4. **Next generation** → the SmartRouter queries the Code Corpus. High-confidence matches (>= 90%) are used directly ($0, instant). Lower matches (65-90%) are passed as examples in the AI prompt.
|
|
523
|
+
5. **Next healing** → Level 2 queries the Healing Corpus for similar past fixes and appends them to the repair instruction.
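The threshold logic in step 4 can be summarized as a small decision function. The sketch below is a simplified illustration under stated assumptions: `routeByScore` and `Thresholds` are hypothetical names, the default values come from the documented `similarityThreshold` (0.65) and `directUseThreshold` (0.90), and both bounds are treated as inclusive minimums.

```typescript
type Route = 'direct' | 'example' | 'generate';

interface Thresholds {
  similarity: number; // minimum score to include a past step as a prompt example
  directUse: number; // minimum score to reuse stored code without calling the AI
}

// Decide what to do with the best Code Corpus match for an instruction.
function routeByScore(
  score: number,
  thresholds: Thresholds = { similarity: 0.65, directUse: 0.9 },
): Route {
  if (score >= thresholds.directUse) return 'direct'; // fuzzy cache hit, no AI call
  if (score >= thresholds.similarity) return 'example'; // passed to the AI as an example
  return 'generate'; // nothing similar enough, generate from scratch
}
```

Raising `directUse` makes direct reuse rarer and AI generation more frequent; lowering `similarity` lets weaker matches into the prompt.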
|
|
524
|
+
|
|
525
|
+
### Storage
|
|
526
|
+
|
|
527
|
+
All RAG data is stored as JSON files in `results/.rag/`:
|
|
528
|
+
- `idf-vocab.json` — TF-IDF learned vocabulary
|
|
529
|
+
- `code-corpus.json` — instruction → code mappings
|
|
530
|
+
- `healing-corpus.json` — failure → fix mappings
|
|
531
|
+
- `error-catalog.json` — recurring error patterns
|
|
532
|
+
|
|
533
|
+
### Consumer Experience
|
|
534
|
+
|
|
535
|
+
**RAG is ON by default — zero setup required.** It works automatically from the very first run.
|
|
536
|
+
|
|
537
|
+
- **Run 1** — memory is empty, AI generates code normally
|
|
538
|
+
- **Run 2+** — RAG kicks in silently: similar instructions are retrieved instead of making API calls (free + faster), healing uses proven past fixes
|
|
539
|
+
- **Run 10+** — most common steps are served from RAG memory at zero cost
|
|
540
|
+
|
|
541
|
+
RAG data is stored in `results/.rag/` as plain JSON — fully Git-friendly. Commit it to share memory across your team, or cache it in CI to persist between runs.
|
|
542
|
+
|
|
543
|
+
### FAQ
|
|
544
|
+
|
|
545
|
+
| Question | Answer |
|
|
546
|
+
|----------|--------|
|
|
547
|
+
| Do I need to configure anything? | No. RAG is ON by default with zero setup. |
|
|
548
|
+
| Does it cost anything? | No. TF-IDF embedder runs locally. RAG hits replace paid AI calls. |
|
|
549
|
+
| Does it slow down tests? | No. RAG lookup is <1ms. It speeds up generation. |
|
|
550
|
+
| Works in CI/CD? | Yes. Cache `results/.rag/` between runs to persist memory. |
|
|
551
|
+
| Share memory across team? | Commit `results/.rag/` to Git or use a CI cache step. |
|
|
552
|
+
| How to reset? | Delete the `results/.rag/` folder. |
|
|
553
|
+
|
|
554
|
+
### When to Use the Settings Card
|
|
555
|
+
|
|
556
|
+
Most users never need to touch RAG settings. The **Settings → RAG Memory** card exists for power-user scenarios:
|
|
557
|
+
|
|
558
|
+
| Scenario | Action |
|
|
559
|
+
|----------|--------|
|
|
560
|
+
| Debugging a flaky test | Turn OFF Code Corpus — forces fresh AI generation instead of reusing a possibly stale mapping |
|
|
561
|
+
| Healing keeps suggesting a bad fix | Turn OFF Healing Corpus — clears the influence of a bad past fix |
|
|
562
|
+
| Major app redesign (UI overhaul) | Turn OFF RAG entirely — old memory from the previous UI is now misleading |
|
|
563
|
+
| Error warnings are outdated | Turn OFF Error Catalog — stops avoiding selectors that are fine now |
|
|
564
|
+
| Want deterministic CI runs | Disable RAG in CI config, keep ON for local development |
|
|
565
|
+
|
|
566
|
+
### Settings Reference
|
|
567
|
+
|
|
568
|
+
Go to **Settings → RAG Memory** to configure:
|
|
569
|
+
- **Enable RAG** — Master switch (ON by default)
|
|
570
|
+
- **Code Corpus** — Toggle instruction-to-code memory
|
|
571
|
+
- **Healing Corpus** — Toggle healing fix memory
|
|
572
|
+
- **Error Catalog** — Toggle error pattern tracking
|
|
573
|
+
- **Embedder** — TF-IDF (local, free, offline) — no API calls needed
|
|
574
|
+
|
|
575
|
+
---
|
|
576
|
+
|
|
393
577
|
## WebSocket Live Updates
|
|
394
578
|
|
|
395
579
|
The Studio uses a WebSocket connection (`ws://localhost:<port>/ws`) to receive live updates:
|
|
@@ -398,6 +582,8 @@ The Studio uses a WebSocket connection (`ws://localhost:<port>/ws`) to receive l
|
|
|
398
582
|
|-------|-------------|
|
|
399
583
|
| `run:complete` | A run has finished — results available |
|
|
400
584
|
| `run:error` | A run failed to start |
|
|
585
|
+
| `recorder:action` | A new action was recorded (click, fill, navigate, assert) |
|
|
586
|
+
| `recorder:stopped` | Recording session ended |
|
|
401
587
|
|
|
402
588
|
The connection indicator in the top bar shows:
|
|
403
589
|
- **Green dot** — connected
|