@thierrynakoa/fire-flow 12.2.1 → 12.2.2
- package/README.md +247 -12
- package/TROUBLESHOOTING.md +103 -0
- package/commands/fire-verify-uat.md +177 -9
- package/package.json +2 -1
- package/skills-library/specialists/quality/browser-use-expert.md +210 -0
- package/tools/uat-runner.py +179 -0
package/README.md
CHANGED
|
@@ -33,6 +33,16 @@ AUTOPSY → INTENT → CLARIFY → VISION → REBUILD → COMPARISON
|
|
|
33
33
|
/fire-phoenix --source ./app --dry-run
|
|
34
34
|
```
|
|
35
35
|
|
|
36
|
+
### Got a Messy Project Sitting in a Folder?
|
|
37
|
+
|
|
38
|
+
We all have them — that side project you built in a weekend, the freelance app that grew out of control, the "it works so don't touch it" codebase. Instead of rewriting from scratch manually or abandoning it, point Phoenix at it:
|
|
39
|
+
|
|
40
|
+
```bash
|
|
41
|
+
/fire-phoenix --source ./that-old-project
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
Phoenix will tell you exactly what's wrong, what's worth keeping, and what should be rebuilt. Your original code is never touched — the clean version goes to a new folder. **This is the fastest way to turn technical debt into production-quality code.**
|
|
45
|
+
|
|
36
46
|
---
|
|
37
47
|
|
|
38
48
|
## What Does It Do?
|
|
@@ -148,15 +158,197 @@ npx @thierrynakoa/fire-flow --uninstall
|
|
|
148
158
|
|
|
149
159
|
---
|
|
150
160
|
|
|
151
|
-
##
|
|
161
|
+
## Slash Commands Not Appearing?
|
|
162
|
+
|
|
163
|
+
If you've installed Dominion Flow but the `/fire-` commands don't show up when you type `/` in Claude Code, follow these steps:
|
|
164
|
+
|
|
165
|
+
### Option A — Ask Claude to Fix It (Easiest)
|
|
166
|
+
|
|
167
|
+
Open Claude Code and paste this message:
|
|
168
|
+
|
|
169
|
+
> *"My Dominion Flow plugin is installed but the /fire- slash commands aren't appearing. Please check my ~/.claude/settings.json, register the plugin under enabledPlugins, and refresh my commands."*
|
|
170
|
+
|
|
171
|
+
Claude will inspect your configuration, add the missing plugin entry, and reload the commands automatically.
|
|
172
|
+
|
|
173
|
+
### Option B — Fix settings.json Manually
|
|
174
|
+
|
|
175
|
+
1. Open your global Claude Code settings file:
|
|
176
|
+
|
|
177
|
+
**Mac / Linux:**
|
|
178
|
+
```bash
|
|
179
|
+
code ~/.claude/settings.json
|
|
180
|
+
```
|
|
181
|
+
|
|
182
|
+
**Windows:**
|
|
183
|
+
```bat
|
|
184
|
+
code %USERPROFILE%\.claude\settings.json
|
|
185
|
+
```
|
|
186
|
+
|
|
187
|
+
*(Replace `code` with `notepad`, `nano`, or any text editor you prefer)*
|
|
188
|
+
|
|
189
|
+
2. Find the `"enabledPlugins"` section. If it doesn't exist, add it. Then add the Dominion Flow entry:
|
|
190
|
+
|
|
191
|
+
```json
|
|
192
|
+
{
|
|
193
|
+
"enabledPlugins": {
|
|
194
|
+
"dominion-flow@local": true
|
|
195
|
+
}
|
|
196
|
+
}
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
If you already have other plugins listed, just add the `"dominion-flow@local": true` line inside the existing block (don't forget the comma on the line above it).
|
|
200
|
+
|
|
201
|
+
3. Save the file and **restart Claude Code completely** (close and reopen — not just a new conversation).
|
|
202
|
+
|
|
203
|
+
4. Type `/fire-0-orient` to confirm the commands are working.
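
The manual edit in Option B, step 2 can also be scripted. The following is a minimal sketch, not part of Dominion Flow: the function name `enable_plugin` is hypothetical, and it assumes `settings.json` is plain JSON with the `enabledPlugins` object shown above. It keeps every existing setting and plugin entry and only adds the `dominion-flow@local` key.

```python
import json
from pathlib import Path


def enable_plugin(settings_path, plugin_key="dominion-flow@local"):
    """Add plugin_key under "enabledPlugins" without touching other settings.

    Hypothetical helper for illustration; not shipped with the plugin.
    """
    path = Path(settings_path)
    # Start from the existing settings if the file is there, else from an empty object.
    settings = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    # Create the section when it is missing, then switch the plugin on.
    settings.setdefault("enabledPlugins", {})[plugin_key] = True
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

A typical call would be `enable_plugin(Path.home() / ".claude" / "settings.json")`. Because the JSON is parsed and re-serialized, the missing-comma mistake mentioned in step 2 cannot occur. Copy the file somewhere safe first, and restart Claude Code afterwards as described in step 3.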
|
|
152
204
|
|
|
153
|
-
|
|
205
|
+
### Option C — Reinstall the Plugin
|
|
206
|
+
|
|
207
|
+
If the above steps don't work, reinstall from scratch:
|
|
208
|
+
|
|
209
|
+
```bash
|
|
210
|
+
npx @thierrynakoa/fire-flow --update
|
|
211
|
+
```
|
|
212
|
+
|
|
213
|
+
Or if you cloned from GitHub:
|
|
214
|
+
|
|
215
|
+
```bash
|
|
216
|
+
claude plugin uninstall dominion-flow
|
|
217
|
+
claude install-plugin ./fire-flow
|
|
218
|
+
```
|
|
219
|
+
|
|
220
|
+
Then restart Claude Code.
|
|
221
|
+
|
|
222
|
+
### Still Not Working?
|
|
223
|
+
|
|
224
|
+
See [TROUBLESHOOTING.md](./TROUBLESHOOTING.md) for additional diagnostics, or open an issue at [github.com/ThierryN/fire-flow/issues](https://github.com/ThierryN/fire-flow/issues).
|
|
225
|
+
|
|
226
|
+
---
|
|
227
|
+
|
|
228
|
+
## Highly Recommended: Install the Playwright Plugin
|
|
229
|
+
|
|
230
|
+
Dominion Flow includes built-in E2E (end-to-end) browser testing through its `/fire-test` and `/fire-verify-uat` commands. For these to work, Claude needs the **Playwright plugin** — which gives Claude the ability to open a real browser, click buttons, fill forms, take screenshots, and verify that your application actually works the way a real user would experience it.
|
|
231
|
+
|
|
232
|
+
**This is the single most impactful add-on you can install.** Without it, Dominion Flow can still verify your code quality and logic — but with it, Claude can launch your app in a browser and prove it works visually.
|
|
233
|
+
|
|
234
|
+
### Install Playwright Plugin (One Command)
|
|
235
|
+
|
|
236
|
+
```bash
|
|
237
|
+
claude plugin install playwright@claude-plugins-official
|
|
238
|
+
```
|
|
239
|
+
|
|
240
|
+
That's it. Restart Claude Code and the Playwright tools are available immediately.
|
|
241
|
+
|
|
242
|
+
### What This Unlocks
|
|
243
|
+
|
|
244
|
+
| Capability | What Claude Can Do |
|
|
245
|
+
|------------|-------------------|
|
|
246
|
+
| **Visual verification** | Open your app in a browser and confirm UI renders correctly |
|
|
247
|
+
| **Form testing** | Fill out forms, submit them, and verify responses |
|
|
248
|
+
| **Click-through flows** | Test multi-step user journeys (signup → login → dashboard) |
|
|
249
|
+
| **Screenshot comparison** | Capture before/after screenshots during Phoenix Rebuild |
|
|
250
|
+
| **Error detection** | Catch console errors, broken links, and missing elements |
|
|
251
|
+
| **Responsive testing** | Test at different viewport sizes (mobile, tablet, desktop) |
|
|
252
|
+
|
|
253
|
+
### How Dominion Flow Uses Playwright
|
|
254
|
+
|
|
255
|
+
Once installed, Playwright is automatically used by:
|
|
256
|
+
|
|
257
|
+
- **`/fire-verify-uat`** — Runs user acceptance tests against your running application
|
|
258
|
+
- **`/fire-test`** — Includes E2E test generation and execution
|
|
259
|
+
- **`/fire-4-verify`** — The 70-point WARRIOR checklist awards up to 10 points for E2E coverage
|
|
260
|
+
- **`/fire-phoenix`** — Phase 6 (COMPARISON) uses screenshots to visually compare source vs rebuild
|
|
261
|
+
|
|
262
|
+
### Verify Playwright Is Working
|
|
263
|
+
|
|
264
|
+
After installing, ask Claude:
|
|
265
|
+
|
|
266
|
+
> *"Take a screenshot of http://localhost:3000"*
|
|
267
|
+
|
|
268
|
+
If Claude opens a browser and returns a screenshot, Playwright is working. If your app isn't running yet, you can test with any public URL:
|
|
269
|
+
|
|
270
|
+
> *"Take a screenshot of https://example.com"*
|
|
271
|
+
|
|
272
|
+
### Don't Have Playwright Yet? Dominion Flow Still Works
|
|
273
|
+
|
|
274
|
+
All core commands (`/fire-1a-new` through `/fire-6-resume`, `/fire-autonomous`, `/fire-phoenix`) work without Playwright. E2E testing features will simply be skipped, and the verifier will redistribute those points to other categories. But you'll get significantly more value from Dominion Flow with Playwright installed.
|
|
275
|
+
|
|
276
|
+
---
|
|
277
|
+
|
|
278
|
+
## Highly Recommended: Install browser-use (AI Browser Automation)
|
|
279
|
+
|
|
280
|
+
**browser-use** is an open-source AI browser automation library that lets Claude control a real browser using natural language. Instead of writing test scripts, you describe what to test in plain English and the AI navigates, clicks, fills forms, and verifies results autonomously.
|
|
281
|
+
|
|
282
|
+
Dominion Flow integrates browser-use into `/fire-verify-uat` — enabling **fully autonomous User Acceptance Testing**. No more manually clicking through every flow. Claude generates the test flows, browser-use executes them in a headless browser, and results come back as structured pass/fail reports.
|
|
283
|
+
|
|
284
|
+
### Install browser-use (3 Commands)
|
|
285
|
+
|
|
286
|
+
```bash
|
|
287
|
+
pip install browser-use
|
|
288
|
+
```
|
|
289
|
+
|
|
290
|
+
```bash
|
|
291
|
+
uvx browser-use install
|
|
292
|
+
```
|
|
293
|
+
|
|
294
|
+
```bash
|
|
295
|
+
python -c "from browser_use import Agent; print('Setup OK')"
|
|
296
|
+
```
|
|
297
|
+
|
|
298
|
+
**Note:** You also need your `ANTHROPIC_API_KEY` set as an environment variable (the same key you use for Claude Code).
|
|
299
|
+
|
|
300
|
+
### What This Unlocks
|
|
301
|
+
|
|
302
|
+
| Without browser-use | With browser-use |
|
|
303
|
+
|--------------------|-----------------|
|
|
304
|
+
| You manually click through every UAT flow | AI navigates the browser and tests flows automatically |
|
|
305
|
+
| Testing takes 10-30 minutes per phase | Testing takes 1-3 minutes per phase |
|
|
306
|
+
| You might miss edge cases when tired | AI tests exactly what was specified, every time |
|
|
307
|
+
| Can't test at 2 AM when you're asleep | `/fire-autonomous` includes UAT without human intervention |
|
|
308
|
+
|
|
309
|
+
### How Dominion Flow Uses It
|
|
310
|
+
|
|
311
|
+
When you run `/fire-verify-uat`, you'll be asked to choose a mode:
|
|
312
|
+
|
|
313
|
+
- **Autonomous** — browser-use AI runs all flows. Just sit back and watch the results.
|
|
314
|
+
- **Manual** — the current guided testing mode. You click, Claude watches.
|
|
315
|
+
|
|
316
|
+
Both modes include automatic parallel diagnosis when flows fail — 3 debug agents investigate the root cause simultaneously.
|
|
317
|
+
|
|
318
|
+
### browser-use vs Playwright — Both Work Together
|
|
319
|
+
|
|
320
|
+
| Tool | Role in Dominion Flow |
|
|
321
|
+
|------|----------------------|
|
|
322
|
+
| **Playwright** | Deterministic E2E regression tests, CI/CD, cross-browser matrix, visual snapshots |
|
|
323
|
+
| **browser-use** | AI-driven UAT flows, exploratory testing, natural language test specs |
|
|
324
|
+
|
|
325
|
+
They complement each other. Playwright catches regressions. browser-use explores like a real user would.
|
|
326
|
+
|
|
327
|
+
---
|
|
328
|
+
|
|
329
|
+
## Highly Recommended: Persistent Memory with Docker + Qdrant + Ollama
|
|
330
|
+
|
|
331
|
+
Without persistent memory, Claude starts every session from scratch — it forgets your codebase, past decisions, what worked, and what didn't. **With Qdrant and Ollama installed, Claude remembers everything across sessions.** It can recall past debugging sessions, reuse solutions it discovered weeks ago, and pick up exactly where it left off with full context.
|
|
332
|
+
|
|
333
|
+
This is the difference between an AI that helps you for 30 minutes and one that becomes a long-term collaborator on your project.
|
|
334
|
+
|
|
335
|
+
**What you get:**
|
|
336
|
+
|
|
337
|
+
| Without Memory | With Qdrant + Ollama |
|
|
338
|
+
|----------------|---------------------|
|
|
339
|
+
| Claude forgets everything when you close the session | Claude recalls past sessions, decisions, and patterns |
|
|
340
|
+
| You re-explain your project every time | Claude already knows your codebase and conventions |
|
|
341
|
+
| Same mistakes get repeated across sessions | Lessons learned persist — Claude gets smarter over time |
|
|
342
|
+
| Skills and patterns are discovered but lost | Auto-extracted skills are stored and reused |
|
|
343
|
+
| Handoffs rely on text files only | Handoffs are backed by vector search across all history |
|
|
344
|
+
|
|
345
|
+
**The setup takes about 10 minutes and runs entirely on your local machine — no cloud services, no data leaving your computer.**
|
|
154
346
|
|
|
155
347
|
---
|
|
156
348
|
|
|
157
349
|
### Step 1 — Install Docker Desktop
|
|
158
350
|
|
|
159
|
-
Docker Desktop is required to run Qdrant. Install it before anything else.
|
|
351
|
+
Docker Desktop is required to run Qdrant (the vector database that stores Claude's memory). Install it before anything else.
|
|
160
352
|
|
|
161
353
|
**Windows (PC):**
|
|
162
354
|
|
|
@@ -193,7 +385,7 @@ Docker Desktop is required to run Qdrant. Install it before anything else.
|
|
|
193
385
|
|
|
194
386
|
### Step 2 — Run Qdrant (Vector Database)
|
|
195
387
|
|
|
196
|
-
Qdrant
|
|
388
|
+
Qdrant is the brain behind Claude's long-term memory. Every session handoff, debugging insight, skill discovery, and architectural decision gets stored as a searchable vector. When Claude starts a new session, it automatically searches this memory for relevant context — giving it continuity that no other AI coding tool provides.
|
|
197
389
|
|
|
198
390
|
```bash
|
|
199
391
|
docker pull qdrant/qdrant
|
|
@@ -239,19 +431,24 @@ claude mcp add qdrant -s user -- cmd /c uvx mcp-server-qdrant \
|
|
|
239
431
|
|
|
240
432
|
### Step 4 — Install Ollama (Local Embeddings)
|
|
241
433
|
|
|
242
|
-
Ollama
|
|
434
|
+
Ollama is the engine that converts your code, decisions, and session context into vectors that Qdrant can search. It runs entirely on your machine — no API keys, no cloud calls, no cost per query. Once installed, it works silently in the background.
|
|
243
435
|
|
|
244
436
|
1. Download and install from [ollama.com](https://ollama.com)
|
|
245
437
|
2. Pull the embedding model:
|
|
246
438
|
```bash
|
|
247
439
|
ollama pull nomic-embed-text
|
|
248
440
|
```
|
|
441
|
+
3. Ollama starts automatically after install. Verify it's running:
|
|
442
|
+
```bash
|
|
443
|
+
ollama list
|
|
444
|
+
```
|
|
445
|
+
You should see `nomic-embed-text` in the output.
|
|
249
446
|
|
|
250
447
|
---
|
|
251
448
|
|
|
252
|
-
### Step 5 — Run Docker Hub MCP Server (hub-mcp)
|
|
449
|
+
### Step 5 (Optional) — Run Docker Hub MCP Server (hub-mcp)
|
|
253
450
|
|
|
254
|
-
hub-mcp lets Claude search Docker Hub, browse images and tags, and pull images by just asking.
|
|
451
|
+
hub-mcp lets Claude search Docker Hub, browse images and tags, and pull images by just asking. This is optional — it's useful if you work with Docker images regularly.
|
|
255
452
|
|
|
256
453
|
```bash
|
|
257
454
|
docker pull docker/hub-mcp
|
|
@@ -326,8 +523,39 @@ Claude will plan, build, and verify every phase without you having to type each
|
|
|
326
523
|
|
|
327
524
|
## How Does It Compare?
|
|
328
525
|
|
|
526
|
+
There are several open-source orchestration tools for Claude Code. Here's an honest, side-by-side comparison across 14 features:
|
|
527
|
+
|
|
329
528
|

|
|
330
529
|
|
|
530
|
+
### Why Dominion Flow Wins
|
|
531
|
+
|
|
532
|
+
Most orchestration plugins do one or two things well. Dominion Flow is the only one that covers the **entire development lifecycle** — from initial project setup to production verification — in a single, free, open-source package.
|
|
533
|
+
|
|
534
|
+
**Features no other free orchestration tool offers:**
|
|
535
|
+
|
|
536
|
+
| Exclusive to Dominion Flow | What It Means for You |
|
|
537
|
+
|---------------------------|----------------------|
|
|
538
|
+
| **Phoenix Rebuild** | Take any messy "vibe coded" project and rebuild it clean. No other tool does this. |
|
|
539
|
+
| **478+ skills library** | The largest collection of reusable patterns for AI-assisted development. The next closest has 60. |
|
|
540
|
+
| **Playwright E2E testing** | Claude can open a real browser and prove your app works. No other orchestration tool integrates this. |
|
|
541
|
+
| **Tiered verification (3-tier + 70-point)** | Fast gate catches obvious issues in 30 seconds. Full checklist catches everything else. Other tools have basic checks at best. |
|
|
542
|
+
| **6-type circuit breaker** | Classifies WHY Claude is stuck (stall, spin, degrade, blocked, thrash, drift) and takes different action for each. Others just retry or stop. |
|
|
543
|
+
| **GoF design patterns for AI** | All 22 Gang of Four patterns mapped to AI agent architecture. Educational and practical. |
|
|
544
|
+
| **Learncoding mode** | Walk through any codebase step-by-step to understand it before changing it. A learning tool built into a production tool. |
|
|
545
|
+
| **46 commands in 8 tiers** | Most tools have 5-10 commands. Dominion Flow covers planning, execution, verification, debugging, security, analytics, milestones, and learning. |
|
|
546
|
+
|
|
547
|
+
**Where others are strong too** (credit where it's due):
|
|
548
|
+
|
|
549
|
+
- **claude-mem** — Best-in-class session memory if memory is all you need
|
|
550
|
+
- **GSD** — Solid plan-execute-verify pipeline with goal-backward verification
|
|
551
|
+
- **everything-cc** — Good skills system with built-in security (AgentShield)
|
|
552
|
+
- **Ruflo** — Strong swarm-based parallel execution
|
|
553
|
+
- **Composio** — Powerful multi-agent orchestration with 30+ agents and CI self-healing
|
|
554
|
+
|
|
555
|
+
**The difference:** Those tools specialize. Dominion Flow integrates. You get structured workflow, memory, verification, security, debugging, skills, E2E testing, autonomous mode, Phoenix Rebuild, and a learning system — all working together in one coherent pipeline. And it's completely free.
|
|
556
|
+
|
|
557
|
+
---
|
|
558
|
+
|
|
331
559
|
## Key Features
|
|
332
560
|
|
|
333
561
|
| Feature | What It Does |
|
|
@@ -408,14 +636,21 @@ This is where students and followers ask questions, share projects, and stay up
|
|
|
408
636
|
|
|
409
637
|
---
|
|
410
638
|
|
|
411
|
-
##
|
|
639
|
+
## Star This Repo and Share It
|
|
640
|
+
|
|
641
|
+
If Dominion Flow has helped you build faster, catch bugs earlier, or clean up a messy project — **please star this repo.** It takes one click and makes a real difference:
|
|
642
|
+
|
|
643
|
+
**[github.com/ThierryN/fire-flow](https://github.com/ThierryN/fire-flow)** — Click the star button at the top right.
|
|
644
|
+
|
|
645
|
+
Stars help other developers discover this project. The more people using Dominion Flow, the better the skills library gets, the more patterns get shared, and the stronger the community becomes.
|
|
412
646
|
|
|
413
|
-
|
|
647
|
+
**Know someone who could use this?**
|
|
414
648
|
|
|
415
|
-
-
|
|
416
|
-
-
|
|
649
|
+
- A friend struggling with a messy codebase? Send them the Phoenix Rebuild section
|
|
650
|
+
- A colleague learning Claude Code? This gives them structure from day one
|
|
651
|
+
- A developer drowning in technical debt? `/fire-phoenix` was built for exactly that
|
|
417
652
|
|
|
418
|
-
|
|
653
|
+
Share the link: `https://github.com/ThierryN/fire-flow`
|
|
419
654
|
|
|
420
655
|
---
|
|
421
656
|
|
package/TROUBLESHOOTING.md
CHANGED
|
@@ -261,4 +261,107 @@ Common failure modes and their fixes.
|
|
|
261
261
|
|
|
262
262
|
---
|
|
263
263
|
|
|
264
|
+
## 11. Slash commands not appearing after install
|
|
265
|
+
|
|
266
|
+
**Symptoms:**
|
|
267
|
+
- You installed Dominion Flow but typing `/` doesn't show any `/fire-` commands
|
|
268
|
+
- Claude doesn't recognize `/fire-0-orient` or any other Dominion Flow command
|
|
269
|
+
- The plugin files exist on disk but Claude Code doesn't see them
|
|
270
|
+
|
|
271
|
+
**Likely Cause:**
|
|
272
|
+
- The plugin is not registered in your global `~/.claude/settings.json` under `enabledPlugins`
|
|
273
|
+
- Claude Code was not restarted after installation
|
|
274
|
+
- The plugin directory path is incorrect or the files were extracted to the wrong location
|
|
275
|
+
- Stale command cache from a previous installation
|
|
276
|
+
|
|
277
|
+
**Fix Steps:**
|
|
278
|
+
|
|
279
|
+
**Option A — Ask Claude to fix it:**
|
|
280
|
+
|
|
281
|
+
Open Claude Code and paste this message:
|
|
282
|
+
|
|
283
|
+
> "My Dominion Flow plugin is installed but the /fire- slash commands aren't appearing. Please check my ~/.claude/settings.json, register the plugin under enabledPlugins, and refresh my commands."
|
|
284
|
+
|
|
285
|
+
Claude will inspect your configuration, add the missing entry, and reload everything.
|
|
286
|
+
|
|
287
|
+
**Option B — Fix manually:**
|
|
288
|
+
|
|
289
|
+
1. Open your settings file:
|
|
290
|
+
- **Mac/Linux:** `~/.claude/settings.json`
|
|
291
|
+
- **Windows:** `%USERPROFILE%\.claude\settings.json`
|
|
292
|
+
|
|
293
|
+
2. Find or create the `"enabledPlugins"` section and add:
|
|
294
|
+
```json
|
|
295
|
+
{
|
|
296
|
+
"enabledPlugins": {
|
|
297
|
+
"dominion-flow@local": true
|
|
298
|
+
}
|
|
299
|
+
}
|
|
300
|
+
```
|
|
301
|
+
|
|
302
|
+
3. Verify the plugin files exist at the expected location:
|
|
303
|
+
```bash
|
|
304
|
+
ls ~/.claude/plugins/dominion-flow/plugin.json
|
|
305
|
+
```
|
|
306
|
+
If the file doesn't exist, the plugin wasn't installed correctly — reinstall with `npx @thierrynakoa/fire-flow` or `claude install-plugin ./fire-flow`.
|
|
307
|
+
|
|
308
|
+
4. **Restart Claude Code completely** — close the terminal and reopen. A new conversation in the same terminal is not enough.
|
|
309
|
+
|
|
310
|
+
5. Verify: type `/fire-0-orient` — you should see the orientation output.
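
The two file checks in Option B (plugin registered, plugin manifest present) can be combined into one diagnostic. This is a rough sketch under stated assumptions: `diagnose_install` is a hypothetical name, and the paths are the ones given in the steps above (`~/.claude/settings.json` and `~/.claude/plugins/dominion-flow/plugin.json`). It only reports problems and changes nothing.

```python
import json
from pathlib import Path


def diagnose_install(home, plugin_key="dominion-flow@local"):
    """Return a list of human-readable problems; an empty list means both checks passed.

    Hypothetical helper for illustration; paths follow the fix steps above.
    """
    claude_dir = Path(home) / ".claude"
    problems = []

    # Check 1: the plugin is registered under enabledPlugins.
    settings_file = claude_dir / "settings.json"
    try:
        settings = json.loads(settings_file.read_text(encoding="utf-8"))
    except FileNotFoundError:
        settings = None
        problems.append(f"{settings_file} does not exist")
    except json.JSONDecodeError as exc:
        settings = None
        problems.append(f"{settings_file} is not valid JSON: {exc}")
    if settings is not None and settings.get("enabledPlugins", {}).get(plugin_key) is not True:
        problems.append(f"{plugin_key} is not enabled under enabledPlugins")

    # Check 2: the plugin manifest exists where step 3 expects it.
    manifest = claude_dir / "plugins" / "dominion-flow" / "plugin.json"
    if not manifest.is_file():
        problems.append(f"{manifest} is missing; the plugin may need reinstalling")
    return problems
```

Calling `diagnose_install(Path.home())` and printing each entry shows which of the likely causes applies before you decide between Option B and Option C.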
|
|
311
|
+
|
|
312
|
+
**Option C — Reinstall:**
|
|
313
|
+
|
|
314
|
+
```bash
|
|
315
|
+
npx @thierrynakoa/fire-flow --update
|
|
316
|
+
```
|
|
317
|
+
|
|
318
|
+
Or for git-cloned installs:
|
|
319
|
+
```bash
|
|
320
|
+
claude plugin uninstall dominion-flow
|
|
321
|
+
claude install-plugin ./fire-flow
|
|
322
|
+
```
|
|
323
|
+
|
|
324
|
+
**Prevention:**
|
|
325
|
+
- Always restart Claude Code after installing or updating any plugin
|
|
326
|
+
- After installation, immediately test with `/fire-0-orient` before starting work
|
|
327
|
+
- If you update Claude Code itself, re-verify your plugins are still registered — major updates can sometimes reset plugin state
|
|
328
|
+
- Keep your `settings.json` backed up so you can restore plugin registrations quickly
|
|
329
|
+
|
|
330
|
+
---
|
|
331
|
+
|
|
332
|
+
## 12. Hooks not running or outdated
|
|
333
|
+
|
|
334
|
+
**Symptoms:**
|
|
335
|
+
- Session hooks (auto-update check, memory injection, compact restoration) don't fire
|
|
336
|
+
- Hook errors appear at session start
|
|
337
|
+
- Hooks reference files that no longer exist after an update
|
|
338
|
+
|
|
339
|
+
**Likely Cause:**
|
|
340
|
+
- Hooks were registered with absolute paths that changed after a plugin update or reinstall
|
|
341
|
+
- The `hooks.json` file in the plugin references scripts that weren't included in the update
|
|
342
|
+
- Global `settings.json` hooks conflict with plugin hooks
|
|
343
|
+
|
|
344
|
+
**Fix Steps:**
|
|
345
|
+
|
|
346
|
+
1. Ask Claude to refresh your hooks:
|
|
347
|
+
|
|
348
|
+
> "Please check my Dominion Flow hooks configuration and update any stale paths or missing hook scripts."
|
|
349
|
+
|
|
350
|
+
2. Or manually check your plugin's hooks file:
|
|
351
|
+
```bash
|
|
352
|
+
cat ~/.claude/plugins/dominion-flow/hooks/hooks.json
|
|
353
|
+
```
|
|
354
|
+
Verify every `"command"` path points to a file that actually exists.
|
|
355
|
+
|
|
356
|
+
3. If hooks reference missing files, reinstall:
|
|
357
|
+
```bash
|
|
358
|
+
npx @thierrynakoa/fire-flow --update
|
|
359
|
+
```
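
Step 2 above ("verify every `"command"` path points to a file that actually exists") can be automated. The sketch below is illustrative only. It assumes nothing about the layout of `hooks.json` beyond the `"command"` keys mentioned above, so it searches the whole document for them; the function names are hypothetical. It treats any token containing `/` as a path, which is a heuristic for POSIX-style paths, and it reports paths whose environment variables are not set in the current shell as missing.

```python
import json
import os
import shlex
from pathlib import Path


def iter_commands(node):
    """Yield every string stored under a "command" key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "command" and isinstance(value, str):
                yield value
            else:
                yield from iter_commands(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_commands(item)


def missing_hook_paths(hooks_file):
    """Return path-like tokens from hook commands that do not exist on disk."""
    data = json.loads(Path(hooks_file).read_text(encoding="utf-8"))
    missing = []
    for command in iter_commands(data):
        for token in shlex.split(command):
            # Heuristic: only tokens containing "/" are treated as file paths.
            if "/" in token:
                expanded = os.path.expanduser(os.path.expandvars(token))
                if not os.path.exists(expanded):
                    missing.append(expanded)
    return missing
```

An empty list from `missing_hook_paths(Path.home() / ".claude/plugins/dominion-flow/hooks/hooks.json")` suggests the paths are intact; any entries it returns are candidates for the reinstall in step 3.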
|
|
360
|
+
|
|
361
|
+
**Prevention:**
|
|
362
|
+
- After every plugin update, restart Claude Code and verify hooks fire correctly
|
|
363
|
+
- Don't manually edit hook paths unless you know the exact file structure
|
|
364
|
+
|
|
365
|
+
---
|
|
366
|
+
|
|
264
367
|
*If your issue isn't listed here, check the references directory for detailed protocol documentation, or run `/fire-debug` to systematically investigate.*
|
package/commands/fire-verify-uat.md
CHANGED
|
@@ -1,10 +1,10 @@
|
|
|
1
1
|
---
|
|
2
|
-
description:
|
|
2
|
+
description: User Acceptance Testing with autonomous browser-use AI mode or manual guided mode, plus automatic parallel diagnosis on failures
|
|
3
3
|
---
|
|
4
4
|
|
|
5
5
|
# /fire-verify-uat
|
|
6
6
|
|
|
7
|
-
>
|
|
7
|
+
> UAT testing with autonomous AI browser mode or manual guided mode, plus automatic parallel diagnosis when flows fail.
|
|
8
8
|
|
|
9
9
|
---
|
|
10
10
|
|
|
@@ -48,7 +48,166 @@ Critical Flows to Test:
|
|
|
48
48
|
|
|
49
49
|
Present to user: "Ready to start UAT? I'll guide you through each flow."
|
|
50
50
|
|
|
51
|
-
### Step
|
|
51
|
+
### Step 2.5: Select Testing Mode
|
|
52
|
+
|
|
53
|
+
```
|
|
54
|
+
+---------------------------------------------------------------+
|
|
55
|
+
| UAT MODE SELECTION |
|
|
56
|
+
+---------------------------------------------------------------+
|
|
57
|
+
| |
|
|
58
|
+
| A) AUTONOMOUS — browser-use AI runs all flows automatically |
|
|
59
|
+
| - No clicking required, AI navigates the browser |
|
|
60
|
+
| - Requires: Python 3.11+, browser-use, ANTHROPIC_API_KEY |
|
|
61
|
+
| - Best for: form flows, page navigation, standard UI |
|
|
62
|
+
| |
|
|
63
|
+
| B) MANUAL — you click through each flow (current behavior) |
|
|
64
|
+
| - Always works, no extra setup needed |
|
|
65
|
+
| - Best for: OAuth popups, file uploads, visual layout |
|
|
66
|
+
| |
|
|
67
|
+
+---------------------------------------------------------------+
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
Use AskUserQuestion with header "UAT Mode" and options:
|
|
71
|
+
- "Autonomous (Recommended)" — browser-use AI runs all flows
|
|
72
|
+
- "Manual" — guided human testing
|
|
73
|
+
|
|
74
|
+
#### If user selects Autonomous:
|
|
75
|
+
|
|
76
|
+
Run prerequisite checks via Bash:
|
|
77
|
+
|
|
78
|
+
```bash
|
|
79
|
+
# Check Python 3.11+
|
|
80
|
+
python --version 2>&1
|
|
81
|
+
|
|
82
|
+
# Check browser-use installed
|
|
83
|
+
python -c "import browser_use; print('OK')" 2>&1
|
|
84
|
+
|
|
85
|
+
# Check API key
|
|
86
|
+
python -c "import os; print('KEY_OK' if os.environ.get('ANTHROPIC_API_KEY') else 'KEY_MISSING')" 2>&1
|
|
87
|
+
|
|
88
|
+
# Check app is running (use the app's URL from PROJECT.md or default localhost:3000)
|
|
89
|
+
curl -s -o /dev/null -w "%{http_code}" http://localhost:3000 2>&1
|
|
90
|
+
```
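
For reference, the four shell checks above can be expressed as one function that returns a result per prerequisite. This is a sketch, not the plugin's own logic: `check_prerequisites` and its result keys are hypothetical names, and treating any HTTP status below 500 as "app running" is an assumption made here for illustration.

```python
import importlib.util
import sys
import urllib.error
import urllib.request


def check_prerequisites(env, app_url=None, timeout=3.0):
    """Mirror the prerequisite checks: Python version, browser-use, API key, app reachable.

    Hypothetical helper; pass os.environ as env. The app check is skipped when app_url is None.
    """
    results = {
        "python_3_11_plus": sys.version_info >= (3, 11),
        # find_spec locates the package without importing it.
        "browser_use_installed": importlib.util.find_spec("browser_use") is not None,
        "api_key_set": bool(env.get("ANTHROPIC_API_KEY")),
    }
    if app_url is not None:
        try:
            with urllib.request.urlopen(app_url, timeout=timeout) as response:
                results["app_running"] = response.status < 500
        except urllib.error.HTTPError as exc:
            # The server answered, so the app is up unless it reports a server error.
            results["app_running"] = exc.code < 500
        except OSError:
            # Connection refused, DNS failure, or timeout.
            results["app_running"] = False
    return results
```

With `check_prerequisites(os.environ, "http://localhost:3000")`, any `False` value corresponds to the "Missing: [what failed]" line in the fallback box.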
|
|
91
|
+
|
|
92
|
+
If any check fails:
|
|
93
|
+
```
|
|
94
|
+
+---------------------------------------------------------------+
|
|
95
|
+
| SETUP ISSUE — falling back to Manual mode |
|
|
96
|
+
+---------------------------------------------------------------+
|
|
97
|
+
| |
|
|
98
|
+
| Missing: [what failed] |
|
|
99
|
+
| |
|
|
100
|
+
| To fix: |
|
|
101
|
+
| pip install browser-use |
|
|
102
|
+
| uvx browser-use install |
|
|
103
|
+
| export ANTHROPIC_API_KEY="sk-ant-..." |
|
|
104
|
+
| |
|
|
105
|
+
| Proceeding with Manual mode instead. |
|
|
106
|
+
+---------------------------------------------------------------+
|
|
107
|
+
```
|
|
108
|
+
Fall back to Step 3 (Manual).
|
|
109
|
+
|
|
110
|
+
If all checks pass: proceed to Step 2.6 (Autonomous Execution). Skip Step 3.
|
|
111
|
+
|
|
112
|
+
#### If user selects Manual:
|
|
113
|
+
|
|
114
|
+
Proceed to Step 3 (Guided Testing) as normal.
|
|
115
|
+
|
|
116
|
+
### Step 2.6: Autonomous UAT Execution
|
|
117
|
+
|
|
118
|
+
**Only runs if mode = Autonomous and all checks passed.**
|
|
119
|
+
|
|
120
|
+
Load skill: `@skills-library/specialists/quality/browser-use-expert.md`
|
|
121
|
+
|
|
122
|
+
Locate the UAT runner script:
|
|
123
|
+
```bash
|
|
124
|
+
RUNNER="$HOME/.claude/plugins/fire-flow/tools/uat-runner.py"
|
|
125
|
+
# Fallback: check if running from cloned repo
|
|
126
|
+
if [ ! -f "$RUNNER" ]; then
|
|
127
|
+
RUNNER="$HOME/.claude/plugins/dominion-flow/tools/uat-runner.py"
|
|
128
|
+
fi
|
|
129
|
+
```
|
|
130
|
+
|
|
131
|
+
For each test flow generated in Step 2:
|
|
132
|
+
|
|
133
|
+
```
|
|
134
|
+
+---------------------------------------------------------------+
|
|
135
|
+
| AUTONOMOUS TEST: Flow N of M |
|
|
136
|
+
+---------------------------------------------------------------+
|
|
137
|
+
| Flow: [flow description] |
|
|
138
|
+
| Mode: browser-use AI agent (headless Chromium) |
|
|
139
|
+
| Running... (this may take 30-90 seconds per flow) |
|
|
140
|
+
+---------------------------------------------------------------+
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
**Generate the task string** from the flow's must-have truth. Include explicit PASS/FAIL criteria:
|
|
144
|
+
|
|
145
|
+
```
|
|
146
|
+
"Go to http://localhost:{port}/{path}.
|
|
147
|
+
[Specific steps derived from the flow]
|
|
148
|
+
|
|
149
|
+
PASS CRITERIA (all must be true):
|
|
150
|
+
1. [Expected outcome 1]
|
|
151
|
+
2. [Expected outcome 2]
|
|
152
|
+
|
|
153
|
+
FAIL if:
|
|
154
|
+
- [Failure condition 1]
|
|
155
|
+
- [Failure condition 2]
|
|
156
|
+
|
|
157
|
+
Report your verdict as PASS or FAIL with a one-sentence summary."
|
|
158
|
+
```
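
The template above is easy to assemble programmatically. The sketch below shows one way to do it; `build_task` and its parameters are hypothetical names introduced for illustration, and the section labels are copied from the template.

```python
def build_task(url, steps, pass_criteria, fail_conditions):
    """Assemble a task string with explicit PASS/FAIL criteria, following the template above."""
    lines = [f"Go to {url}."]
    lines += steps  # the specific steps derived from the flow
    lines += ["", "PASS CRITERIA (all must be true):"]
    lines += [f"{number}. {criterion}" for number, criterion in enumerate(pass_criteria, start=1)]
    lines += ["", "FAIL if:"]
    lines += [f"- {condition}" for condition in fail_conditions]
    lines += ["", "Report your verdict as PASS or FAIL with a one-sentence summary."]
    return "\n".join(lines)
```

Keeping the criteria as separate lists makes it harder to produce a task without an explicit verdict request, which is what the runner's PASS/FAIL status depends on.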
|
|
159
|
+
|
|
160
|
+
**If the flow requires credentials**, add the `--credentials` flag:
|
|
161
|
+
```bash
|
|
162
|
+
python "$RUNNER" \
|
|
163
|
+
--task "[generated task string]" \
|
|
164
|
+
--output /tmp/uat-flow-N.json \
|
|
165
|
+
--max-steps 20 \
|
|
166
|
+
--credentials '{"localhost": {"x_email": "[test email]", "x_password": "[test password]"}}'
|
|
167
|
+
```
|
|
168
|
+
|
|
169
|
+
**If no credentials needed:**
|
|
170
|
+
```bash
|
|
171
|
+
python "$RUNNER" \
|
|
172
|
+
--task "[generated task string]" \
|
|
173
|
+
--output /tmp/uat-flow-N.json \
|
|
174
|
+
--max-steps 20
|
|
175
|
+
```
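
When the runner is launched from Python rather than a shell, the two invocations above collapse into one argument list. This is a sketch only: `build_runner_command` is a hypothetical helper, and the flags are the ones shown in the commands above (`--task`, `--output`, `--max-steps`, `--credentials`).

```python
import json
import sys


def build_runner_command(runner, task, output, max_steps=20, credentials=None):
    """Build the argv list for uat-runner.py; --credentials is added only when needed."""
    command = [
        sys.executable, runner,
        "--task", task,
        "--output", output,
        "--max-steps", str(max_steps),
    ]
    if credentials:
        # Serialized as JSON, matching the shape shown in the shell example.
        command += ["--credentials", json.dumps(credentials)]
    return command
```

Passing the resulting list to `subprocess.run(command)` avoids shell quoting problems with multi-line task strings. Note that credentials passed on the command line may be visible to other processes on the same machine, so test accounts are the safer choice.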
|
|
176
|
+
|
|
177
|
+
**Parse the result:**
|
|
178
|
+
```bash
|
|
179
|
+
cat /tmp/uat-flow-N.json
|
|
180
|
+
```
|
|
181
|
+
|
|
182
|
+
Read the JSON and record:
|
|
183
|
+
- `status` = PASS, FAIL, or ERROR
|
|
184
|
+
- `summary` = what happened
|
|
185
|
+
- `errors` = any error messages
|
|
186
|
+
|
|
187
|
+
**After all flows complete, display results:**
|
|
188
|
+
|
|
189
|
+
```
|
|
190
|
+
+---------------------------------------------------------------+
|
|
191
|
+
| AUTONOMOUS UAT RESULTS |
|
|
192
|
+
+---------------------------------------------------------------+
|
|
193
|
+
|
|
194
|
+
| # | Flow | Status | Summary |
|
|
195
|
+
|---|-------------------------------|--------|--------------------|
|
|
196
|
+
| 1 | User registration | PASS | Redirected to /dashboard |
|
|
197
|
+
| 2 | User login | FAIL | Login button not found |
|
|
198
|
+
| 3 | Dashboard loads | PASS | All widgets rendered |
|
|
199
|
+
|
|
200
|
+
Passed: 2/3 | Failed: 1/3 | Errors: 0/3
|
|
201
|
+
```
|
|
202
|
+
|
|
203
|
+
**Route results:**
|
|
204
|
+
- PASS flows → record and continue
|
|
205
|
+
- FAIL flows → proceed to Step 4 (Automatic Diagnosis) for each failure
|
|
206
|
+
- ERROR flows → display: "Infrastructure error on flow N. Retest manually?" and offer to run that flow in Manual mode (Step 3)
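
The summary line and the routing rules above can be sketched as follows. The `status` field comes from the runner's JSON output described earlier; the function name and the route labels (`record`, `diagnose`, `manual_retest`) are hypothetical and chosen here for illustration. Treating an unrecognized or missing status as ERROR is also an assumption.

```python
from collections import Counter

# Route labels are illustrative; they map to the three rules listed above.
ROUTES = {
    "PASS": "record",          # record and continue
    "FAIL": "diagnose",        # Step 4 (Automatic Diagnosis)
    "ERROR": "manual_retest",  # offer to rerun the flow in Manual mode (Step 3)
}


def route_results(results):
    """Return the summary line and one route per flow result."""
    # Anything that is not PASS or FAIL is counted as an infrastructure ERROR.
    statuses = [r.get("status") if r.get("status") in ROUTES else "ERROR" for r in results]
    counts = Counter(statuses)
    total = len(results)
    summary = (
        f"Passed: {counts['PASS']}/{total} | "
        f"Failed: {counts['FAIL']}/{total} | "
        f"Errors: {counts['ERROR']}/{total}"
    )
    return summary, [ROUTES[status] for status in statuses]
```

For the three example flows in the results table, this yields the same summary line shown there and routes only the failed login flow to diagnosis.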
|
|
207
|
+
|
|
208
|
+
After all flows processed, proceed to Step 6 (Generate UAT Report).
|
|
209
|
+
|
|
210
|
+
### Step 3: Guided Testing (Manual Mode Only)
|
|
52
211
|
|
|
53
212
|
For each flow:
|
|
54
213
|
1. Present exact steps to test
|
|
@@ -106,12 +265,16 @@ For each failure with diagnosis:
|
|
|
106
265
|
```markdown
|
|
107
266
|
## UAT Report: Phase XX
|
|
108
267
|
|
|
268
|
+
### Session Info
|
|
269
|
+
- **Mode:** [autonomous / manual]
|
|
270
|
+
- **Tester:** [browser-use AI / human]
|
|
271
|
+
|
|
109
272
|
### Results
|
|
110
|
-
| Flow | Status | Notes |
|
|
111
|
-
|
|
112
|
-
| Registration | PASS | |
|
|
113
|
-
| Login | PASS | |
|
|
114
|
-
| Dashboard | FAIL -> PASS | Fixed: missing redirect |
|
|
273
|
+
| Flow | Status | Mode | Notes |
|
|
274
|
+
|------|--------|------|-------|
|
|
275
|
+
| Registration | PASS | autonomous | Redirected to /dashboard |
|
|
276
|
+
| Login | PASS | autonomous | |
|
|
277
|
+
| Dashboard | FAIL -> PASS | manual | Fixed: missing redirect |
|
|
115
278
|
|
|
116
279
|
### Verdict: [PASS / CONDITIONAL PASS / FAIL]
|
|
117
280
|
- Blocking issues: [N]
|
|
@@ -131,10 +294,12 @@ For each failure with diagnosis:
|
|
|
131
294
|
|
|
132
295
|
## Success Criteria
|
|
133
296
|
|
|
297
|
+
- [ ] Testing mode selected (autonomous or manual)
|
|
298
|
+
- [ ] If autonomous: prerequisites verified (Python, browser-use, API key, app running)
|
|
134
299
|
- [ ] All critical flows tested
|
|
135
300
|
- [ ] Failures diagnosed with parallel agents
|
|
136
301
|
- [ ] Fixes committed and retested
|
|
137
|
-
- [ ] UAT report generated
|
|
302
|
+
- [ ] UAT report generated with mode column
|
|
138
303
|
- [ ] Clear verdict and next action
|
|
139
304
|
|
|
140
305
|
---
|
|
@@ -144,3 +309,6 @@ For each failure with diagnosis:
|
|
|
144
309
|
- **Template:** `@templates/UAT.md`
|
|
145
310
|
- **Debugging pattern:** CLAUDE.md parallel debugging section
|
|
146
311
|
- **Verification:** `@references/verification-patterns.md`
|
|
312
|
+
- **Skill:** `@skills-library/specialists/quality/browser-use-expert.md`
|
|
313
|
+
- **Runner:** `@tools/uat-runner.py`
|
|
314
|
+
- **Library:** [browser-use](https://github.com/browser-use/browser-use) — MIT License
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@thierrynakoa/fire-flow",
|
|
3
|
-
"version": "12.2.
|
|
3
|
+
"version": "12.2.2",
|
|
4
4
|
"description": "Dominion Flow — the most comprehensive orchestration platform for Claude Code. 46 commands, 15 agents, 478+ skills. Plan → Execute → Verify → Handoff with parallel execution, session memory, circuit breaker safety, Phoenix Rebuild, and learning mode.",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"claude-code",
|
|
@@ -36,6 +36,7 @@
|
|
|
36
36
|
"skills-library/",
|
|
37
37
|
"templates/",
|
|
38
38
|
"references/",
|
|
39
|
+
"tools/",
|
|
39
40
|
"workflows/",
|
|
40
41
|
"plugin.json",
|
|
41
42
|
".claude-plugin/",
|
package/skills-library/specialists/quality/browser-use-expert.md
|
@@ -0,0 +1,210 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: browser-use-expert
|
|
3
|
+
description: Use when running autonomous AI-driven UAT flows, exploratory browser testing, or scripting multi-step browser interactions via browser-use Python library. Invoke for AI browser automation, credential-masked testing, structured test output from browser sessions.
|
|
4
|
+
license: MIT
|
|
5
|
+
metadata:
|
|
6
|
+
  author: Dominion Flow
|
|
7
|
+
  version: "1.0.0"
|
|
8
|
+
  domain: quality
|
|
9
|
+
  triggers: browser-use, autonomous UAT, AI browser testing, browser automation, exploratory testing
|
|
10
|
+
  role: specialist
|
|
11
|
+
  scope: testing
|
|
12
|
+
  output-format: json
|
|
13
|
+
  related-skills: playwright-expert, test-master, debugging-wizard
|
|
14
|
+
---
|
|
15
|
+
|
|
16
|
+
# browser-use Expert
|
|
17
|
+
|
|
18
|
+
Senior QA automation engineer specializing in AI-driven browser testing using the browser-use Python library. Complements Playwright (deterministic E2E) with AI-driven exploratory testing where the agent decides HOW to test based on natural language instructions.
|
|
19
|
+
|
|
20
|
+
## Role: Complementary to Playwright
|
|
21
|
+
|
|
22
|
+
| Tool | Best For |
|
|
23
|
+
|------|----------|
|
|
24
|
+
| **Playwright** | Regression tests, CI/CD, deterministic flows, cross-browser matrix |
|
|
25
|
+
| **browser-use** | UAT exploratory flows, natural language test specs, credential-masked testing, flows too complex to script |
|
|
26
|
+
|
|
27
|
+
Never replace Playwright E2E with browser-use. Use both.
|
|
28
|
+
|
|
29
|
+
## Setup Requirements
|
|
30
|
+
|
|
31
|
+
```bash
|
|
32
|
+
# 1. Python 3.11+ required
|
|
33
|
+
python --version
|
|
34
|
+
|
|
35
|
+
# 2. Install browser-use
|
|
36
|
+
pip install browser-use
|
|
37
|
+
|
|
38
|
+
# 3. Install Chromium (one-time)
|
|
39
|
+
uvx browser-use install
|
|
40
|
+
|
|
41
|
+
# 4. Anthropic API key must be set
|
|
42
|
+
# Windows: System Environment Variables
|
|
43
|
+
# Mac/Linux: export ANTHROPIC_API_KEY="sk-ant-..."
|
|
44
|
+
|
|
45
|
+
# 5. Verify setup
|
|
46
|
+
python -c "from browser_use import Agent; from browser_use.llm import ChatAnthropic; print('Setup OK')"
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
## The UAT Runner Script
|
|
50
|
+
|
|
51
|
+
Dominion Flow ships a thin runner: `~/.claude/plugins/fire-flow/tools/uat-runner.py`
|
|
52
|
+
|
|
53
|
+
Call it from Bash inside fire-verify-uat:
|
|
54
|
+
|
|
55
|
+
```bash
|
|
56
|
+
python ~/.claude/plugins/fire-flow/tools/uat-runner.py \
|
|
57
|
+
--task "Go to http://localhost:3000. Test user registration..." \
|
|
58
|
+
--output /tmp/uat-result.json \
|
|
59
|
+
--max-steps 20
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
### Output Format
|
|
63
|
+
|
|
64
|
+
```json
|
|
65
|
+
{
|
|
66
|
+
"status": "PASS",
|
|
67
|
+
"summary": "User registered successfully, redirected to /dashboard",
|
|
68
|
+
"steps_taken": 8,
|
|
69
|
+
"final_url": "http://localhost:3000/dashboard",
|
|
70
|
+
"errors": [],
|
|
71
|
+
"screenshots": ["/tmp/.../step_0.png", "/tmp/.../step_1.png"]
|
|
72
|
+
}
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
Status values: `"PASS"`, `"FAIL"`, `"ERROR"` (infrastructure failure, not test failure)
|
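A caller may want to check the result file before trusting it. The sketch below is one way to do that, assuming the keys shown in the example output above; the helper name `load_uat_result` and the two constants are hypothetical and not part of the runner.

```python
import json
from pathlib import Path

VALID_STATUSES = {"PASS", "FAIL", "ERROR"}
REQUIRED_KEYS = {"status", "summary", "steps_taken", "final_url", "errors"}


def load_uat_result(path):
    """Load a runner result file and reject anything that looks malformed."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"result is missing keys: {sorted(missing)}")
    if data["status"] not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {data['status']!r}")
    return data
```

`screenshots` is left out of the required keys here on purpose, so that a result without screenshots still loads.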
|
76
|
+
|
|
77
|
+
## Canonical Usage Pattern
|
|
78
|
+
|
|
79
|
+
```python
|
|
80
|
+
import asyncio
|
|
81
|
+
from browser_use import Agent
|
|
82
|
+
from browser_use.llm import ChatAnthropic
|
|
83
|
+
from browser_use.browser.profile import BrowserProfile
|
|
84
|
+
from pydantic import BaseModel
|
|
85
|
+
|
|
86
|
+
class UATResult(BaseModel):
|
|
87
|
+
    status: str  # "PASS" or "FAIL"
|
|
88
|
+
    summary: str  # 1-2 sentences of what happened
|
|
89
|
+
    steps_taken: int
|
|
90
|
+
    final_url: str
|
|
91
|
+
    errors: list[str]
|
|
92
|
+
|
|
93
|
+
async def run_uat_flow(task: str, sensitive_data: dict = None):
|
|
94
|
+
    llm = ChatAnthropic(model='claude-sonnet-4-5', temperature=0.0)
|
|
95
|
+
    profile = BrowserProfile(headless=True)
|
|
96
|
+
|
|
97
|
+
    agent = Agent(
|
|
98
|
+
        task=task,
|
|
99
|
+
        llm=llm,
|
|
100
|
+
        browser_profile=profile,
|
|
101
|
+
        sensitive_data=sensitive_data,
|
|
102
|
+
        output_model_schema=UATResult,
|
|
103
|
+
    )
|
|
104
|
+
|
|
105
|
+
    history = await agent.run(max_steps=20)
|
|
106
|
+
    return history.final_result()
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
## Credential Masking Pattern
|
|
110
|
+
|
|
111
|
+
NEVER put real passwords in the task string. Use `sensitive_data`:
|
|
112
|
+
|
|
113
|
+
```python
|
|
114
|
+
# In task string, use placeholder names
|
|
115
|
+
task = "Login with x_email and x_password, then verify dashboard loads"
|
|
116
|
+
|
|
117
|
+
# In sensitive_data, map placeholders to real values
|
|
118
|
+
sensitive_data = {
|
|
119
|
+
    'localhost': {
|
|
120
|
+
        'x_email': 'testuser@example.com',
|
|
121
|
+
        'x_password': 'RealPassword123!'
|
|
122
|
+
    }
|
|
123
|
+
}
|
|
124
|
+
|
|
125
|
+
# The LLM sees "x_email" and "x_password" — never the actual values
|
|
126
|
+
```
|
|
127
|
+
|
|
128
|
+
When calling via CLI:
|
|
129
|
+
```bash
|
|
130
|
+
--credentials '{"localhost": {"x_email": "test@example.com", "x_password": "Pass123!"}}'
|
|
131
|
+
```
|
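When the runner is launched from Python rather than a shell, passing the arguments as a list sidesteps shell quoting problems with the JSON in `--credentials`. The sketch below only builds that argument list. The flags are the ones shown above; the function name `build_runner_args` and the default `runner` path are assumptions for illustration.

```python
import json


def build_runner_args(task, output, credentials=None, max_steps=20,
                      runner="uat-runner.py"):
    """Build an argv list for the UAT runner without any shell quoting."""
    args = [
        "python", runner,
        "--task", task,
        "--output", output,
        "--max-steps", str(max_steps),
    ]
    if credentials:
        # json.dumps produces exactly the string the runner parses with json.loads
        args += ["--credentials", json.dumps(credentials)]
    return args
```

The list can then be handed to `subprocess.run(args)` without `shell=True`. Command-line arguments can be visible to other local processes, so this approach is best kept to throwaway test accounts.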
|
132
|
+
|
|
133
|
+
## Writing Effective UAT Task Strings
|
|
134
|
+
|
|
135
|
+
Good task strings are explicit about WHAT to verify:
|
|
136
|
+
|
|
137
|
+
```
|
|
138
|
+
# GOOD — includes explicit verification criteria
|
|
139
|
+
"Go to http://localhost:3000/register.
|
|
140
|
+
Fill in the email field with: x_email
|
|
141
|
+
Fill in the password field with: x_password
|
|
142
|
+
Click the Register button.
|
|
143
|
+
|
|
144
|
+
PASS CRITERIA (all must be true):
|
|
145
|
+
1. The browser URL changes to /dashboard
|
|
146
|
+
2. The page displays a welcome message
|
|
147
|
+
|
|
148
|
+
FAIL if any error message appears or URL does not change."
|
|
149
|
+
|
|
150
|
+
# BAD — no verification criteria
|
|
151
|
+
"Go to localhost:3000 and register"
|
|
152
|
+
```
|
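Task strings in the "GOOD" shape above can also be assembled programmatically, so that every flow carries explicit criteria. The following is a rough sketch under that assumption; `build_task` and its parameters are hypothetical names, not something Dominion Flow ships.

```python
def build_task(url, steps, pass_criteria, fail_conditions=()):
    """Compose a UAT task string that always includes explicit pass criteria."""
    if not pass_criteria:
        raise ValueError("at least one pass criterion is required")
    lines = [f"Go to {url}."]
    lines.extend(steps)
    lines.append("")
    lines.append("PASS CRITERIA (all must be true):")
    lines.extend(f"{n}. {c}" for n, c in enumerate(pass_criteria, start=1))
    if fail_conditions:
        lines.append("")
        lines.append("FAIL if " + " or ".join(fail_conditions) + ".")
    return "\n".join(lines)
```

Refusing to build a task without pass criteria makes the "BAD" form above impossible to produce by accident.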
|
153
|
+
|
|
154
|
+
## Multi-Flow Pattern
|
|
155
|
+
|
|
156
|
+
Reuse the same browser session for sequential flows:
|
|
157
|
+
|
|
158
|
+
```python
|
|
159
|
+
profile = BrowserProfile(keep_alive=True, headless=True)
|
|
160
|
+
agent = Agent(task=task1, llm=llm, browser_profile=profile)
|
|
161
|
+
await agent.run()
|
|
162
|
+
|
|
163
|
+
agent.add_new_task("Now test the login flow: logout, go to /login, fill credentials")
|
|
164
|
+
await agent.run()
|
|
165
|
+
```
|
|
166
|
+
|
|
167
|
+
## When to Prefer Manual UAT Instead
|
|
168
|
+
|
|
169
|
+
- OAuth popup windows from external providers
|
|
170
|
+
- File uploads or downloads (OS file dialogs)
|
|
171
|
+
- Visual design or pixel-level layout verification
|
|
172
|
+
- Testing on specific mobile device form factors
|
|
173
|
+
- Flows requiring CAPTCHA interaction
|
|
174
|
+
|
|
175
|
+
## Must Do
|
|
176
|
+
|
|
177
|
+
- Always use `headless=True` in UAT runner
|
|
178
|
+
- Always provide explicit PASS/FAIL criteria in task strings
|
|
179
|
+
- Always use `sensitive_data` for credentials
|
|
180
|
+
- Set `temperature=0.0` for deterministic testing
|
|
181
|
+
- Set `max_steps` to prevent runaway sessions (15-25 typical)
|
|
182
|
+
- Check that the dev server is running before starting tests
|
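The last item, checking that the dev server is running, can be done with a plain TCP connection attempt before any agent is started. This is a minimal sketch using only the standard library; the helper name `is_server_up` is hypothetical, and the default port 3000 is taken from the examples in this document.

```python
import socket


def is_server_up(host="localhost", port=3000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False
```

This only confirms that a port is open, not that the application behind it is healthy.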
|
183
|
+
|
|
184
|
+
## Must Not Do
|
|
185
|
+
|
|
186
|
+
- Do not use browser-use for regression tests that have Playwright coverage
|
|
187
|
+
- Do not run without `max_steps` set
|
|
188
|
+
- Do not embed real passwords in task strings
|
|
189
|
+
- Do not spawn multiple browser-use agents against the same localhost app
|
|
190
|
+
- Do not use as a replacement for unit or integration tests
|
|
191
|
+
|
|
192
|
+
## Error Handling
|
|
193
|
+
|
|
194
|
+
```bash
|
|
195
|
+
# Check exit code: 0 = ran (check JSON), 1 = setup error
|
|
196
|
+
python uat-runner.py --task "..." --output /tmp/result.json
|
|
197
|
+
if [ $? -ne 0 ]; then
|
|
198
|
+
echo "Setup error — check stderr for details"
|
|
199
|
+
fi
|
|
200
|
+
|
|
201
|
+
# Parse JSON status
|
|
202
|
+
python -c "import json; r=json.load(open('/tmp/result.json')); print(r['status'])"
|
|
203
|
+
```
|
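The same two-level check (exit code first, then the JSON status) can be expressed in Python. The sketch below assumes the exit-code convention described above; the function name `interpret_run` and the `SETUP_ERROR` label are hypothetical.

```python
import json
from pathlib import Path


def interpret_run(exit_code, output_path):
    """Combine the runner's exit code and JSON file into one outcome label."""
    if exit_code != 0:
        # Non-zero exit: the runner stopped before writing a result file
        return "SETUP_ERROR"
    path = Path(output_path)
    if not path.exists():
        return "ERROR"
    result = json.loads(path.read_text(encoding="utf-8"))
    return result.get("status", "ERROR")
```

Checking the exit code before reading the file avoids parsing a stale result left over from an earlier run.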
|
204
|
+
|
|
205
|
+
## Windows Notes
|
|
206
|
+
|
|
207
|
+
- Use `python` (not `python3`) unless aliased
|
|
208
|
+
- `ANTHROPIC_API_KEY` must be in system environment or `.env` file
|
|
209
|
+
- Headless mode works without a display server
|
|
210
|
+
- Screenshot paths use forward slashes in Python even on Windows
|
package/tools/uat-runner.py
|
@@ -0,0 +1,179 @@
|
|
|
1
|
+
#!/usr/bin/env python3
|
|
2
|
+
"""
|
|
3
|
+
Dominion Flow UAT Runner — browser-use integration
|
|
4
|
+
Called by fire-verify-uat autonomous mode via Bash.
|
|
5
|
+
|
|
6
|
+
Usage:
|
|
7
|
+
python uat-runner.py --task "..." --output /tmp/result.json [--max-steps 20] [--credentials '{}']
|
|
8
|
+
|
|
9
|
+
Exit codes:
|
|
10
|
+
0 = script ran successfully, check JSON for PASS/FAIL/ERROR
|
|
11
|
+
1 = setup error (missing dependency, missing API key, etc.)
|
|
12
|
+
"""
|
|
13
|
+
|
|
14
|
+
import argparse
|
|
15
|
+
import asyncio
|
|
16
|
+
import json
|
|
17
|
+
import os
|
|
18
|
+
import sys
|
|
19
|
+
from pathlib import Path
|
|
20
|
+
|
|
21
|
+
|
|
22
|
+
def check_setup():
|
|
23
|
+
    """Verify all prerequisites before importing browser_use."""
|
|
24
|
+
    errors = []
|
|
25
|
+
|
|
26
|
+
    if sys.version_info < (3, 11):
|
|
27
|
+
        errors.append(f"Python 3.11+ required (got {sys.version})")
|
|
28
|
+
|
|
29
|
+
    if not os.environ.get("ANTHROPIC_API_KEY"):
|
|
30
|
+
        env_file = Path(".env")
|
|
31
|
+
        if env_file.exists():
|
|
32
|
+
            for line in env_file.read_text().splitlines():
|
|
33
|
+
                if line.startswith("ANTHROPIC_API_KEY="):
|
|
34
|
+
                    os.environ["ANTHROPIC_API_KEY"] = line.split("=", 1)[1].strip().strip('"')
|
|
35
|
+
                    break
|
|
36
|
+
|
|
37
|
+
    if not os.environ.get("ANTHROPIC_API_KEY"):
|
|
38
|
+
        errors.append("ANTHROPIC_API_KEY not found in environment or .env file")
|
|
39
|
+
|
|
40
|
+
    try:
|
|
41
|
+
        import browser_use  # noqa: F401
|
|
42
|
+
    except ImportError:
|
|
43
|
+
        errors.append("browser_use not installed. Run: pip install browser-use")
|
|
44
|
+
|
|
45
|
+
    return errors
|
|
46
|
+
|
|
47
|
+
|
|
48
|
+
def get_result_model():
|
|
49
|
+
    from pydantic import BaseModel
|
|
50
|
+
|
|
51
|
+
    class UATFlowResult(BaseModel):
|
|
52
|
+
        status: str
|
|
53
|
+
        summary: str
|
|
54
|
+
        steps_taken: int
|
|
55
|
+
        final_url: str
|
|
56
|
+
        errors: list[str]
|
|
57
|
+
|
|
58
|
+
    return UATFlowResult
|
|
59
|
+
|
|
60
|
+
|
|
61
|
+
async def run_uat(task, max_steps, sensitive_data):
|
|
62
|
+
    """Run one UAT flow via browser-use agent."""
|
|
63
|
+
    from browser_use import Agent
|
|
64
|
+
    from browser_use.llm import ChatAnthropic
|
|
65
|
+
    from browser_use.browser.profile import BrowserProfile
|
|
66
|
+
|
|
67
|
+
    UATFlowResult = get_result_model()
|
|
68
|
+
|
|
69
|
+
    llm = ChatAnthropic(
|
|
70
|
+
        model=os.environ.get("BROWSER_USE_MODEL", "claude-sonnet-4-5"),
|
|
71
|
+
        temperature=0.0,
|
|
72
|
+
    )
|
|
73
|
+
|
|
74
|
+
    profile = BrowserProfile(headless=True)
|
|
75
|
+
|
|
76
|
+
    agent = Agent(
|
|
77
|
+
        task=task,
|
|
78
|
+
        llm=llm,
|
|
79
|
+
        browser_profile=profile,
|
|
80
|
+
        sensitive_data=sensitive_data,
|
|
81
|
+
        output_model_schema=UATFlowResult,
|
|
82
|
+
        generate_gif=False,
|
|
83
|
+
    )
|
|
84
|
+
|
|
85
|
+
    history = await agent.run(max_steps=max_steps)
|
|
86
|
+
|
|
87
|
+
    raw = history.final_result()
|
|
88
|
+
    if raw:
|
|
89
|
+
        try:
|
|
90
|
+
            parsed = UATFlowResult.model_validate_json(raw)
|
|
91
|
+
            result = parsed.model_dump()
|
|
92
|
+
        except Exception as e:
|
|
93
|
+
            result = {
|
|
94
|
+
                "status": "FAIL",
|
|
95
|
+
                "summary": f"Agent returned unstructured output: {str(raw)[:500]}",
|
|
96
|
+
                "steps_taken": max_steps,
|
|
97
|
+
                "final_url": "",
|
|
98
|
+
                "errors": [f"Schema parse error: {e}"],
|
|
99
|
+
            }
|
|
100
|
+
    else:
|
|
101
|
+
        result = {
|
|
102
|
+
            "status": "FAIL",
|
|
103
|
+
            "summary": "Agent completed but returned no result.",
|
|
104
|
+
            "steps_taken": max_steps,
|
|
105
|
+
            "final_url": "",
|
|
106
|
+
            "errors": ["No final result returned by agent"],
|
|
107
|
+
        }
|
|
108
|
+
|
|
109
|
+
    # Attach screenshot paths if available
|
|
110
|
+
    try:
|
|
111
|
+
        screenshots = [str(p) for item in history.history if hasattr(item, 'state') and hasattr(item.state, 'screenshot') and item.state.screenshot for p in [item.state.screenshot]]
|
|
112
|
+
        result["screenshots"] = screenshots
|
|
113
|
+
    except Exception:
|
|
114
|
+
        result["screenshots"] = []
|
|
115
|
+
|
|
116
|
+
    return result
|
|
117
|
+
|
|
118
|
+
|
|
119
|
+
def main():
|
|
120
|
+
    parser = argparse.ArgumentParser(description="Dominion Flow UAT Runner (browser-use)")
|
|
121
|
+
    parser.add_argument("--task", required=True, help="Plain-English UAT task description")
|
|
122
|
+
    parser.add_argument("--output", required=True, help="Path for JSON output file")
|
|
123
|
+
    parser.add_argument("--max-steps", type=int, default=20, help="Max browser steps (default: 20)")
|
|
124
|
+
    parser.add_argument("--credentials", default=None, help='JSON string: {"domain": {"key": "value"}}')
|
|
125
|
+
    args = parser.parse_args()
|
|
126
|
+
|
|
127
|
+
    # 1. Setup checks
|
|
128
|
+
    errors = check_setup()
|
|
129
|
+
    if errors:
|
|
130
|
+
        print("SETUP ERRORS:", file=sys.stderr)
|
|
131
|
+
        for e in errors:
|
|
132
|
+
            print(f" - {e}", file=sys.stderr)
|
|
133
|
+
        print("\nFix: pip install browser-use && uvx browser-use install", file=sys.stderr)
|
|
134
|
+
        sys.exit(1)
|
|
135
|
+
|
|
136
|
+
    # 2. Parse credentials
|
|
137
|
+
    sensitive_data = None
|
|
138
|
+
    if args.credentials:
|
|
139
|
+
        try:
|
|
140
|
+
            sensitive_data = json.loads(args.credentials)
|
|
141
|
+
        except json.JSONDecodeError as e:
|
|
142
|
+
            print(f"Invalid --credentials JSON: {e}", file=sys.stderr)
|
|
143
|
+
            sys.exit(1)
|
|
144
|
+
|
|
145
|
+
    # 3. Run agent
|
|
146
|
+
    try:
|
|
147
|
+
        result = asyncio.run(run_uat(
|
|
148
|
+
            task=args.task,
|
|
149
|
+
            max_steps=args.max_steps,
|
|
150
|
+
            sensitive_data=sensitive_data,
|
|
151
|
+
        ))
|
|
152
|
+
    except KeyboardInterrupt:
|
|
153
|
+
        print("UAT run interrupted.", file=sys.stderr)
|
|
154
|
+
        sys.exit(1)
|
|
155
|
+
    except Exception as e:
|
|
156
|
+
        result = {
|
|
157
|
+
            "status": "ERROR",
|
|
158
|
+
            "summary": f"Infrastructure error: {str(e)}",
|
|
159
|
+
            "steps_taken": 0,
|
|
160
|
+
            "final_url": "",
|
|
161
|
+
            "errors": [str(e)],
|
|
162
|
+
            "screenshots": [],
|
|
163
|
+
        }
|
|
164
|
+
|
|
165
|
+
    # 4. Write JSON output
|
|
166
|
+
    output_path = Path(args.output)
|
|
167
|
+
    output_path.parent.mkdir(parents=True, exist_ok=True)
|
|
168
|
+
    output_path.write_text(json.dumps(result, indent=2))
|
|
169
|
+
|
|
170
|
+
    # 5. Print summary to stdout for Claude to read
|
|
171
|
+
    print(f"STATUS: {result['status']}")
|
|
172
|
+
    print(f"SUMMARY: {result['summary']}")
|
|
173
|
+
    if result.get("errors"):
|
|
174
|
+
        print(f"ERRORS: {'; '.join(result['errors'])}")
|
|
175
|
+
    print(f"OUTPUT: {args.output}")
|
|
176
|
+
|
|
177
|
+
|
|
178
|
+
if __name__ == "__main__":
|
|
179
|
+
    main()
|
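`check_setup` above reads `ANTHROPIC_API_KEY` from `.env` with a simple prefix match that strips double quotes only. If `.env` files in a project also use single quotes, comments, or an `export ` prefix, a slightly more tolerant parser could look like the sketch below. It is an illustration, not part of the shipped runner, and the helper name `read_env_value` is hypothetical.

```python
def read_env_value(text, key):
    """Return the value assigned to key in .env-style text, or None."""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            value = value.strip()
            # Drop one pair of matching surrounding quotes, single or double
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
                value = value[1:-1]
            return value
    return None
```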