@muggleai/works 4.3.0 → 4.5.0
- package/README.md +31 -13
- package/dist/{chunk-23NOSJFH.js → chunk-MNCBJEPQ.js} +588 -36
- package/dist/cli.js +1 -1
- package/dist/index.js +1 -1
- package/dist/plugin/.claude-plugin/plugin.json +1 -1
- package/dist/plugin/.cursor-plugin/plugin.json +1 -1
- package/dist/plugin/README.md +1 -0
- package/dist/plugin/skills/do/open-prs.md +32 -65
- package/dist/plugin/skills/muggle/SKILL.md +15 -15
- package/dist/plugin/skills/muggle-pr-visual-walkthrough/SKILL.md +181 -0
- package/dist/plugin/skills/muggle-test/SKILL.md +97 -137
- package/dist/plugin/skills/muggle-test-feature-local/SKILL.md +94 -27
- package/dist/plugin/skills/muggle-test-import/SKILL.md +135 -40
- package/dist/plugin/skills/muggle-test-regenerate-missing/SKILL.md +196 -0
- package/dist/plugin/skills/muggle-test-regenerate-missing/evals/evals.json +58 -0
- package/dist/plugin/skills/muggle-test-regenerate-missing/evals/trigger-eval.json +22 -0
- package/dist/release-manifest.json +7 -0
- package/package.json +7 -7
- package/plugin/.claude-plugin/plugin.json +1 -1
- package/plugin/.cursor-plugin/plugin.json +1 -1
- package/plugin/README.md +1 -0
- package/plugin/skills/do/open-prs.md +32 -65
- package/plugin/skills/muggle/SKILL.md +15 -15
- package/plugin/skills/muggle-pr-visual-walkthrough/SKILL.md +181 -0
- package/plugin/skills/muggle-test/SKILL.md +97 -137
- package/plugin/skills/muggle-test-feature-local/SKILL.md +94 -27
- package/plugin/skills/muggle-test-import/SKILL.md +135 -40
- package/plugin/skills/muggle-test-regenerate-missing/SKILL.md +196 -0
- package/plugin/skills/muggle-test-regenerate-missing/evals/evals.json +58 -0
- package/plugin/skills/muggle-test-regenerate-missing/evals/trigger-eval.json +22 -0
@@ -7,6 +7,15 @@ description: "Run change-driven E2E acceptance testing using Muggle AI — detec
 
 A router skill that detects code changes, resolves impacted test cases, executes them locally or remotely, publishes results to the Muggle AI dashboard, and posts E2E acceptance summaries to the PR. The user can invoke this at any moment, in any state.
 
+## UX Guidelines — Minimize Typing
+
+**Every selection-based question MUST use the `AskQuestion` tool** (or the platform's equivalent structured selection tool). Never ask the user to "reply with a number" in a plain text message — always present clickable options.
+
+- **Selections** (project, use case, test case, mode, approval): Use `AskQuestion` with labeled options the user can click.
+- **Multi-select** (use cases, test cases): Use `AskQuestion` with `allow_multiple: true`.
+- **Free-text inputs** (URLs, descriptions): Only use plain text prompts when there is no finite set of options. Even then, offer a detected/default value when possible.
+- **Batch related questions**: If two questions are independent, present them together in a single `AskQuestion` call rather than asking sequentially.
+
 ## Step 1: Confirm Scope of Work (Always First)
 
 Parse the user's query and explicitly confirm their expectation. There are exactly two modes:
@@ -27,23 +36,13 @@ Signs the user wants this: mentions "preview", "staging", "deployed", "preview U
 
 ### Confirming
 
-If the user's intent is clear, state back what you understood and
-
-
-──────────────────────────────────────────────────────────────
-1. Yes, proceed
-2. No, switch to [the other mode]
-──────────────────────────────────────────────────────────────
-```
+If the user's intent is clear, state back what you understood and use `AskQuestion` to confirm:
+- Option 1: "Yes, proceed"
+- Option 2: "Switch to [the other mode]"
 
-If ambiguous,
-
-
-──────────────────────────────────────────────────────────────
-1. Local — launch browser on your machine against localhost
-2. Remote — Muggle cloud tests against a preview/staging URL
-──────────────────────────────────────────────────────────────
-```
+If ambiguous, use `AskQuestion` to let the user choose:
+- Option 1: "Local — launch browser on your machine against localhost"
+- Option 2: "Remote — Muggle cloud tests against a preview/staging URL"
 
 Only proceed after the user selects an option.
 
@@ -75,26 +74,21 @@ If auth fails repeatedly, suggest: `muggle logout && muggle login` from terminal
 
 ## Step 4: Select Project (User Must Choose)
 
+A **project** is where all your test results, use cases, and test scripts are grouped on the Muggle AI dashboard. Pick the project that matches what you're working on.
+
 1. Call `muggle-remote-project-list`
-2.
+2. Use `AskQuestion` to present all projects as clickable options. Include the project URL in each label so the user can identify the right one. Always include a "Create new project" option at the end.
 
-
-
-
-
-2. Tanka Testing https://www.tanka.ai
-3. Staging muggleTestV0 https://staging.muggle-ai.com/muggleTestV0
-4. [Create new project]
-──────────────────────────────────────────────────────────────
-```
+Example labels:
+- "MUGGLE AI STAGING 1 — https://staging.muggle-ai.com/"
+- "Tanka Testing — https://www.tanka.ai"
+- "Create new project"
 
-
+Prompt: "Pick the project to group this test run into:"
 
 3. **Wait for the user to explicitly choose** — do NOT auto-select based on repo name or URL matching
 4. **If user chooses "Create new project"**:
-- Ask for `projectName`
-- Ask for `description`
-- Ask for the production/preview URL
+- Ask for `projectName`, `description`, and the production/preview URL
 - Call `muggle-remote-project-create`
 
 Store the `projectId` only after user confirms.
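The label rule in Step 4 (project URL in every label, "Create new project" always last) can be sketched as below. This is only an illustration: the `name` and `url` keys and the helper `build_project_options` are hypothetical, since the response shape of `muggle-remote-project-list` is not shown in this diff.

```python
from typing import Dict, List

CREATE_NEW_LABEL = "Create new project"


def build_project_options(projects: List[Dict[str, str]]) -> List[str]:
    """Return one clickable label per project, with the create option last.

    `projects` is assumed to be a list of dicts with hypothetical `name`
    and `url` keys; adapt this to the real project-list response.
    """
    labels = []
    for project in projects:
        name = project.get("name", "").strip()
        url = project.get("url", "").strip()
        # \u2014 is the dash used in the example labels above.
        # The URL lets the user tell similarly named projects apart.
        labels.append(f"{name} \u2014 {url}" if url else name)
    # The skill requires this option to always be present, at the end.
    labels.append(CREATE_NEW_LABEL)
    return labels
```

The returned list would then be passed as the options of a single `AskQuestion` call; the selection itself still has to come from the user.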
@@ -104,21 +98,11 @@ Store the `projectId` only after user confirms.
 ### 5a: List existing use cases
 Call `muggle-remote-use-case-list` with the project ID.
 
-### 5b: Present ALL use cases
+### 5b: Present ALL use cases for user selection
 
-
-Available Use Cases for [Project Name]:
-──────────────────────────────────────────────────────────────────────────
-1. Sign up for Muggle Test account
-2. Access existing account via login
-3. Manually Add a Use Case
-4. View Generated Test Script After Test Run
-5. Generate comprehensive UX testing reports
-6. [Create new use case]
-──────────────────────────────────────────────────────────────────────────
-```
+Use `AskQuestion` with `allow_multiple: true` to present all use cases as clickable options. Always include a "Create new use case" option at the end.
 
-
+Prompt: "Which use case(s) do you want to test?"
 
 ### 5c: Wait for explicit user selection
 
@@ -133,7 +117,7 @@ The user MUST explicitly tell you which use case(s) to use.
 1. Ask the user to describe the use case in plain English
 2. Call `muggle-remote-use-case-create-from-prompts`:
 - `projectId`: The project ID
-- `
+- `instructions`: A plain array of strings, one per use case — e.g. `["<user's description>"]`
 3. Present the created use case and confirm it's correct
 
 ## Step 6: Select Test Case (User Must Choose)
@@ -143,19 +127,11 @@ For the selected use case(s):
 ### 6a: List existing test cases
 Call `muggle-remote-test-case-list-by-use-case` with each use case ID.
 
-### 6b: Present ALL test cases
+### 6b: Present ALL test cases for user selection
 
-
-Available Test Cases for "[Use Case Name]":
-──────────────────────────────────────────────────────────────────────────
-1. E2E: Login with valid credentials
-2. E2E: Login with invalid password
-3. E2E: Login with expired session
-4. [Generate new test case]
-──────────────────────────────────────────────────────────────────────────
-```
+Use `AskQuestion` with `allow_multiple: true` to present all test cases as clickable options. Always include a "Generate new test case" option at the end.
 
-
+Prompt: "Which test case(s) do you want to run?"
 
 ### 6c: Wait for explicit user selection
 
@@ -170,40 +146,34 @@ Available Test Cases for "[Use Case Name]":
 
 ### 6e: Confirm final selection
 
-
+Use `AskQuestion` to confirm: "You selected [N] test case(s): [list titles]. Ready to proceed?"
+- Option 1: "Yes, run them"
+- Option 2: "No, let me re-select"
 
 Wait for user confirmation before moving to execution.
 
 ## Step 7A: Execute — Local Mode
 
-###
+### Pre-flight questions (batch where possible)
 
 **Question 1 — Local URL:**
-> "Your local app should be running. What's the URL? (e.g., http://localhost:3000)"
 
-
+Try to auto-detect the dev server URL by checking running terminals or common ports (e.g., `lsof -iTCP -sTCP:LISTEN -nP | grep -E ':(3000|3001|4200|5173|8080)'`). If a likely URL is found, present it as a clickable default via `AskQuestion`:
+- Option 1: "http://localhost:3000" (or whatever was detected)
+- Option 2: "Other — let me type a URL"
 
-
-```
-I'll launch the Muggle Electron browser to run [N] test case(s).
-──────────────────────────────────────────────────────────────
-1. Yes, launch it
-2. No, cancel
-──────────────────────────────────────────────────────────────
-```
+If nothing detected, ask as free text: "Your local app should be running. What's the URL? (e.g., http://localhost:3000)"
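As a portable alternative to the `lsof` one-liner quoted in Question 1, the port check can be sketched with plain TCP probes. This is an assumption-laden illustration, not the skill's implementation: the helper name `detect_local_urls` is hypothetical, the port list is copied from the `lsof` example, and an open port only suggests (it does not prove) that an HTTP dev server is listening. A server bound only to IPv6 would be missed by this IPv4 probe.

```python
import socket
from typing import Iterable, List

# Ports taken from the lsof example above; extend as needed.
COMMON_DEV_PORTS = (3000, 3001, 4200, 5173, 8080)


def detect_local_urls(ports: Iterable[int] = COMMON_DEV_PORTS,
                      host: str = "127.0.0.1",
                      timeout: float = 0.2) -> List[str]:
    """Return http://localhost:<port> for every port that accepts a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising.
            if sock.connect_ex((host, port)) == 0:
                found.append(f"http://localhost:{port}")
    return found
```

Any URL found this way is only a candidate default: it would be offered as Option 1, with "Other" still available, so the user keeps the final say.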
 
-
+**Question 2 — Electron launch + window visibility (ask together):**
 
-
-
-
-
-
-
-──────────────────────────────────────────────────────────────
-```
+After getting the URL, use a single `AskQuestion` call with two questions:
+
+1. "Ready to launch the Muggle Electron browser for [N] test case(s)?"
+- "Yes, launch it (visible — I want to watch)"
+- "Yes, launch it (headless — run in background)"
+- "No, cancel"
 
-
+If user cancels, stop and ask what they want to do instead.
 
 ### Run sequentially
 
@@ -213,8 +183,8 @@ For each test case:
 2. Call `muggle-local-execute-test-generation`:
 - `testCase`: Full test case object from step 1
 - `localUrl`: User's local URL (from Question 1)
-- `approveElectronAppLaunch`: `true` (only if user
-- `showUi`: `true` if user chose "visible", `false` if "headless" (from Question
+- `approveElectronAppLaunch`: `true` (only if user approved in Question 2)
+- `showUi`: `true` if user chose "visible", `false` if "headless" (from Question 2)
 3. Store the returned `runId`
 
 If a generation fails, log it and continue to the next. Do not abort the batch.
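The "log it and continue" rule for the sequential run can be sketched as a loop that collects failures instead of raising. This is a minimal illustration under stated assumptions: `execute` stands in for the `muggle-local-execute-test-generation` call and is assumed to return a run ID or raise on failure, and the `id` key on each test case is hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_batch(test_cases: List[Dict[str, Any]],
              execute: Callable[[Dict[str, Any]], str]
              ) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Run test cases one at a time; record failures instead of aborting.

    Returns (run IDs keyed by test case ID, list of (test case ID, error)).
    """
    run_ids: Dict[str, str] = {}
    failures: List[Tuple[str, str]] = []
    for case in test_cases:
        case_id = case["id"]  # hypothetical key name
        try:
            run_ids[case_id] = execute(case)
        except Exception as exc:
            # Keep going; the failure is reported after the batch finishes.
            failures.append((case_id, str(exc)))
    return run_ids, failures
```

Returning the failures alongside the run IDs keeps them available for the final report, so no test case is silently dropped.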
@@ -309,73 +279,62 @@ open "https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/runs
 Tell the user:
 > "I've opened the Muggle AI dashboard in your browser — you can see the test results, step-by-step screenshots, and action scripts there."
 
-## Step 9: Post
+## Step 9: Offer to Post Visual Walkthrough to PR
 
-After reporting results,
+After reporting results, ask the user if they want to attach a **visual walkthrough** — a markdown block with per-test-case dashboard links and step-by-step screenshots — to the current branch's open PR. The rendering and posting workflow lives in the shared `muggle-pr-visual-walkthrough` skill; this step gathers the required input and hands off.
 
-### 9a:
+### 9a: Gather per-step screenshots (required input for the shared skill)
 
-
-gh pr view --json number,url,title 2>/dev/null
-```
+The shared skill takes an **`E2eReport` JSON** that includes per-step screenshot URLs. You already have `projectId`, `testCaseId`, `runId`, `viewUrl`, and `status` from earlier steps — you still need the step-level data.
 
-
-- If no PR exists → ask:
-```
-No open PR found for this branch.
-──────────────────────────────────────────────────────────────
-1. Create PR with E2E acceptance results
-2. Skip posting to PR
-──────────────────────────────────────────────────────────────
-```
-- If 1: create PR with E2E acceptance results in the body (use `gh pr create`)
-- If 2: skip this step
+For each published run from Step 7A:
 
-
+1. Call `muggle-remote-test-script-get` with the `testScriptId` returned by `muggle-local-publish-test-script`.
+2. Extract from the response: `steps[].operation.action` (description) and `steps[].operation.screenshotUrl` (cloud URL).
+3. Build a `steps` array: `[{ stepIndex: 0, action: "...", screenshotUrl: "..." }, ...]`.
+4. If the run failed, also capture `failureStepIndex`, `error`, and the local `artifactsDir` from `muggle-local-run-result-get`.
 
-
+Assemble the report:
 
-```
-
-
-
+```json
+{
+  "projectId": "<projectId>",
+  "tests": [
+    {
+      "name": "<test case title>",
+      "testCaseId": "<id>",
+      "testScriptId": "<id>",
+      "runId": "<id>",
+      "viewUrl": "<publish response viewUrl>",
+      "status": "passed",
+      "steps": [{ "stepIndex": 0, "action": "...", "screenshotUrl": "..." }]
+    }
+  ]
+}
+```
 
-
-|-----------|--------|---------|
-| [Login with valid creds](https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/scripts?modal=script-details&testCaseId={testCaseId}) | ✅ PASSED | 8 steps, 12.3s |
-| [Login with invalid creds](https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/scripts?modal=script-details&testCaseId={testCaseId}) | ✅ PASSED | 6 steps, 9.1s |
-| [Checkout flow](https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/scripts?modal=script-details&testCaseId={testCaseId}) | ❌ FAILED | Step 7: "Click checkout button" — element not found |
+See the shared skill for the full schema (including the failed-test shape with `failureStepIndex` and `error`).
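The extraction in 9a (steps 2 and 3) and the report assembly can be sketched as follows. The field paths `steps[].operation.action` and `steps[].operation.screenshotUrl` and the report keys come from the text above; everything else is an assumption made for illustration: the helper names, the hypothetical `testScript` key used to carry the fetched script, and deriving `stepIndex` from list position. The failed-test fields are left out because their exact shape is defined in the shared skill, not here.

```python
from typing import Any, Dict, List


def build_steps(test_script: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map steps[].operation.{action, screenshotUrl} to the report's step shape."""
    steps = []
    for index, step in enumerate(test_script.get("steps", [])):
        operation = step.get("operation", {})
        steps.append({
            "stepIndex": index,  # assumed to follow list order, starting at 0
            "action": operation.get("action", ""),
            "screenshotUrl": operation.get("screenshotUrl", ""),
        })
    return steps


def build_report(project_id: str, tests: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble the report; each entry in `tests` carries its fetched script."""
    return {
        "projectId": project_id,
        "tests": [
            {
                "name": t["name"],
                "testCaseId": t["testCaseId"],
                "testScriptId": t["testScriptId"],
                "runId": t["runId"],
                "viewUrl": t["viewUrl"],
                "status": t["status"],
                # "testScript" is a hypothetical key holding the API response.
                "steps": build_steps(t["testScript"]),
            }
            for t in tests
        ],
    }
```

Serializing the result with `json.dumps` would give a document in the same shape as the JSON example above.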
 
-
-<summary>Failed test details</summary>
+### 9b: Ask the user
 
-
-- **Failed at**: Step 7 — "Click checkout button"
-- **Error**: Element not found
-- **Local artifacts**: `~/.muggle-ai/sessions/{runId}/`
-- **Screenshots**: `~/.muggle-ai/sessions/{runId}/screenshots/`
+Use `AskQuestion`:
 
-
+> "Post a visual walkthrough of these results to the PR? Reviewers can click each test case to see step-by-step screenshots on the Muggle AI dashboard."
 
-
-
-```
+- Option 1: "Yes, post to PR"
+- Option 2: "Skip"
 
-### 9c:
-
-If PR already exists — add as a comment:
-```bash
-gh pr comment {pr-number} --body "$(cat <<'EOF'
-{the E2E acceptance comment body from 9b}
-EOF
-)"
-```
+### 9c: Invoke the shared skill in Mode A
 
-If
+If the user chooses "Yes, post to PR", invoke the `muggle-pr-visual-walkthrough` skill via the `Skill` tool. With the `E2eReport` already in context, the skill will:
 
-
+1. Call `muggle build-pr-section` to render the markdown block (fit-vs-overflow automatic)
+2. Find the PR via `gh pr view`
+3. Post `body` as a `gh pr comment`
+4. Post the overflow `comment` as a second comment (only if the CLI emitted one)
+5. Confirm the PR URL to the user
 
-
+This skill always uses **Mode A** (post to an existing PR); `muggle-do` is the only caller that uses Mode B. Do not attempt to render the walkthrough markdown yourself — delegate to the shared skill.
 
 ## Tool Reference
 
@@ -397,20 +356,21 @@ If creating a new PR — include the E2E acceptance section in the PR body along
 | Results | `muggle-local-run-result-get` | Local |
 | Results | `muggle-remote-wf-get-ts-gen-latest-run` | Remote |
 | Publish | `muggle-local-publish-test-script` | Local |
+| Per-step screenshots (for walkthrough) | `muggle-remote-test-script-get` | Both |
 | Browser | `open` (shell command) | Both |
-| PR | `
+| PR walkthrough | `muggle-pr-visual-walkthrough` (shared skill) | Both |
 
 ## Guardrails
 
 - **Always confirm intent first** — never assume local vs remote without asking
-- **User MUST select project** — present
-- **User MUST select use case(s)** — present
-- **User MUST select test case(s)** — present
-- **
+- **User MUST select project** — present clickable options via `AskQuestion`, wait for explicit choice, never auto-select
+- **User MUST select use case(s)** — present clickable options via `AskQuestion`, wait for explicit choice, never auto-select based on git changes or heuristics
+- **User MUST select test case(s)** — present clickable options via `AskQuestion`, wait for explicit choice, never auto-select
+- **Use `AskQuestion` for every selection** — never ask the user to type a number; always present clickable options
+- **Batch related questions** — combine Electron approval + visibility into one question; auto-detect localhost URL when possible
 - **Never launch Electron without explicit user approval** (`approveElectronAppLaunch`)
 - **Never silently drop test cases** — log failures and continue, then report them
 - **Never guess the URL** — always ask the user for localhost or preview URL
 - **Always publish before opening browser** — the dashboard needs the published data to show results
-- **
-- **Always check for PR before posting** — don't create a PR comment if there's no PR (ask user first)
+- **Delegate PR posting to `muggle-pr-visual-walkthrough`** — never inline the walkthrough markdown or call `gh pr comment` directly from this skill; ask the user and hand off
 - **Can be invoked at any state** — if the user already has a project or use cases set up, skip to the relevant step rather than re-doing everything
@@ -5,7 +5,7 @@ description: Run a real-browser end-to-end (E2E) acceptance test against localho
 
 # Muggle Test Feature Local
 
-**Goal:** Run or generate an end-to-end test against a **local URL** using Muggle
+**Goal:** Run or generate an end-to-end test against a **local URL** using Muggle's Electron browser.
 
 | Scope | MCP tools |
 | :---- | :-------- |
@@ -15,12 +15,19 @@ description: Run a real-browser end-to-end (E2E) acceptance test against localho
 
 The local URL only changes where the browser opens; it does not change the remote project or test definitions.
 
+## UX Guidelines — Minimize Typing
+
+**Every selection-based question MUST use the `AskQuestion` tool** (or the platform's equivalent structured selection tool). Never ask the user to "reply with a number" in a plain text message — always present clickable options.
+
+- **Selections** (project, use case, test case, script, approval): Use `AskQuestion` with labeled options the user can click.
+- **Free-text inputs** (URLs, descriptions): Only use plain text prompts when there is no finite set of options. Even then, offer a detected/default value when possible.
+
 ## Workflow
 
 ### 1. Auth
 
 - `muggle-remote-auth-status`
-- If not signed in: `muggle-remote-auth-login` then `muggle-remote-auth-poll`
+- If not signed in: `muggle-remote-auth-login` then `muggle-remote-auth-poll`
 Do not skip or assume auth.
 
 ### 2. Targets (user must confirm)
@@ -31,41 +38,45 @@ Ask the user to pick **project**, **use case**, and **test case** (do not infer)
 - `muggle-remote-use-case-list` (with `projectId`)
 - `muggle-remote-test-case-list-by-use-case` (with `useCaseId`)
 
-**Selection UI (mandatory):**
+**Selection UI (mandatory):** Every selection MUST use `AskQuestion` with clickable options. Never ask the user to "reply with the number" in plain text.
 
-**
+**Project selection context:** A **project** groups all your test results, use cases, and test scripts on the Muggle AI dashboard. Include the project URL in each option label so the user can identify the right one.
 
-
-2. **Create new …** — user creates a new entity instead of picking an existing one. Label per step: **Create new project**, **Create new use case**, or **Create new test case**.
+Prompt for projects: "Pick the project to group this test into:"
 
 **Relevance-first filtering (mandatory for project, use case, and test case lists):**
 
 - Do **not** dump the full list by default.
-- Rank items by semantic relevance to the user
-- Show only the **top 3
--
+- Rank items by semantic relevance to the user's stated goal (title first, then description / user story / acceptance criteria).
+- Show only the **top 3-5** most relevant options via `AskQuestion`, plus these fixed tail options:
+- **"Show full list"** — present the complete list in a new `AskQuestion` call. **Skip this option** if the API returned zero rows.
+- **"Create new ..."** — never omitted. Label per step: "Create new project", "Create new use case", or "Create new test case".
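The option-list shape described by the relevance-first rules can be sketched as below. Note the loud assumption: plain keyword overlap is used here only as a stand-in for "semantic relevance", which the skill leaves to the agent's judgment. The helper name `build_options` and the `title` / `description` keys are hypothetical.

```python
import re
from typing import Dict, List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric word set used for the overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_options(goal: str, items: List[Dict[str, str]], entity: str,
                  top_n: int = 5) -> List[str]:
    """Return the top matches followed by the fixed tail options."""
    goal_tokens = _tokens(goal)

    def score(item: Dict[str, str]) -> int:
        # Title matches count double, mirroring "title first, then description".
        title_hits = len(goal_tokens & _tokens(item.get("title", "")))
        desc_hits = len(goal_tokens & _tokens(item.get("description", "")))
        return 2 * title_hits + desc_hits

    # sorted() is stable, so ties keep the order the API returned.
    ranked = sorted(items, key=score, reverse=True)
    options = [item["title"] for item in ranked[:top_n]]
    if items:  # skip "Show full list" when the API returned zero rows
        options.append("Show full list")
    options.append(f"Create new {entity}")  # never omitted
    return options
```

With zero rows the result collapses to the single "Create new ..." option, which matches the rule that this option is never omitted.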
 
 **Create new — tools and flow (use these MCP tools; preview before persist):**
 
 - **Project — Create new project:** Collect `projectName`, `description`, and `url` (may be the local app URL, e.g. `http://localhost:3999`). Call `muggle-remote-project-create`. Use the returned `projectId` and continue.
-- **Use case — Create new use case:** User provides a natural-language instruction (or you reuse their testing goal).
-1. `muggle-remote-use-case-prompt-preview` with `projectId`, `instruction` — show preview; get confirmation
-2. `muggle-remote-use-case-create-from-prompts` with `projectId
-- **Test case — Create new test case** (requires a chosen `useCaseId`): User provides an instruction describing what to test.
-1. `muggle-remote-test-case-generate-from-prompt` with `projectId`, `useCaseId`, `instruction` — **preview only** (server test-case prompt preview); show the returned draft(s); get confirmation
-2. Persist the accepted draft with `muggle-remote-test-case-create`, mapping preview fields into the required properties (`title`, `description`, `goal`, `expectedResult`, `url`, etc.). Then continue from
+- **Use case — Create new use case:** User provides a natural-language instruction (or you reuse their testing goal).
+1. `muggle-remote-use-case-prompt-preview` with `projectId`, `instruction` — show preview; get confirmation via `AskQuestion`.
+2. `muggle-remote-use-case-create-from-prompts` with `projectId` and `instructions: ["<the user's natural-language instruction>"]` — persist. Use the created use case id and continue to test-case selection.
+- **Test case — Create new test case** (requires a chosen `useCaseId`): User provides an instruction describing what to test.
+1. `muggle-remote-test-case-generate-from-prompt` with `projectId`, `useCaseId`, `instruction` — **preview only** (server test-case prompt preview); show the returned draft(s); get confirmation via `AskQuestion`.
+2. Persist the accepted draft with `muggle-remote-test-case-create`, mapping preview fields into the required properties (`title`, `description`, `goal`, `expectedResult`, `url`, etc.). Then continue from **section 4** with that `testCaseId`.
 
 ### 3. Local URL
 
--
--
+Try to auto-detect the dev server URL by checking running terminals or common ports (e.g., `lsof -iTCP -sTCP:LISTEN -nP | grep -E ':(3000|3001|4200|5173|8080)'`). If a likely URL is found, present it as a clickable default via `AskQuestion`:
+- Option 1: "http://localhost:3000" (or whatever was detected)
+- Option 2: "Other — let me type a URL"
+
+If nothing detected, ask as free text: "Your local app should be running. What's the URL? (e.g., http://localhost:3000)"
+
+Remind them: local URL is only the execution target, not tied to cloud project config.
 
 ### 4. Existing scripts vs new generation
 
 `muggle-remote-test-script-list` with `testCaseId`.
 
-- **If any replayable/succeeded scripts exist:**
-Show: name, id, created/updated, step count. Include **`Generate new script`** as the **last** numbered option (e.g. last number) so it is selectable by number too.
+- **If any replayable/succeeded scripts exist:** use `AskQuestion` to present them as clickable options. Show: name, created/updated, step count per option. Include **"Generate new script"** as the last option.
 - **If none:** go straight to generation (no need to ask replay vs generate).
 
 ### 5. Load data for the chosen path
@@ -77,8 +88,8 @@ Ask the user to pick **project**, **use case**, and **test case** (do not infer)
 
 **Replay**
 
-1. `muggle-remote-test-script-get`
-2. `muggle-remote-action-script-get` with that id
+1. `muggle-remote-test-script-get` — note `actionScriptId`
+2. `muggle-remote-action-script-get` with that id — full `actionScript`
 **Use the API response as-is.** Do not edit, shorten, or rebuild `actionScript`; replay needs full `label` paths for element lookup.
 3. `muggle-local-execute-replay` (after approval in step 6) with `testScript`, `actionScript`, `localUrl`, `approveElectronAppLaunch: true` (optional: `showUi: true`, **`timeoutMs`** — see below)
 
@@ -88,17 +99,22 @@ The MCP client often uses a **default wait of 300000 ms (5 minutes)** for `muggl
 
 - **Always pass `timeoutMs`** for flows that may be long — for example **`600000` (10 min)** or **`900000` (15 min)** — unless the user explicitly wants a short cap.
 - If the tool reports **`Electron execution timed out after 300000ms`** (or similar) **but** Electron logs show the run still progressing (steps, screenshots, LLM calls), treat it as **orchestration timeout**, not an Electron app defect: **increase `timeoutMs` and retry** (after user re-approves if your policy requires it).
-- **Test case design:** Preconditions like
+- **Test case design:** Preconditions like "a test run has already completed" on an **empty account** can force many steps (sign-up, new project, crawl). Prefer an account/project that **already has** the needed state, or narrow the test goal so generation does not try to create a full project from scratch unless that is intentional.
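The "increase `timeoutMs` and retry" guidance can be sketched as an escalating retry that only continues when the run was still making progress. This is a hedged illustration, not documented tool behavior: `execute` and `still_progressing` are hypothetical callables (the first is assumed to raise `TimeoutError` on an orchestration timeout, the second stands for whatever check of the Electron logs is available), the two timeout values are the examples quoted above, and any user re-approval step is omitted.

```python
from typing import Any, Callable, Optional, Sequence

# Values from the guidance above: 10 minutes, then 15 minutes.
DEFAULT_TIMEOUTS_MS = (600_000, 900_000)


def run_with_timeouts(execute: Callable[[int], Any],
                      still_progressing: Callable[[], bool],
                      timeouts_ms: Sequence[int] = DEFAULT_TIMEOUTS_MS) -> Any:
    """Retry with a larger timeoutMs only when the run timed out while progressing."""
    last_error: Optional[TimeoutError] = None
    for timeout_ms in timeouts_ms:
        try:
            return execute(timeout_ms)
        except TimeoutError as exc:
            last_error = exc
            if not still_progressing():
                break  # a stalled run is not an orchestration timeout
    if last_error is None:
        raise ValueError("timeouts_ms must not be empty")
    raise last_error
```

Errors other than `TimeoutError` propagate immediately, which keeps real failures (such as the exit-code cases discussed in this skill) out of the retry path.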
 
 ### Interpreting `failed` / non-zero Electron exit
 
 - **`Electron execution timed out after 300000ms`:** Orchestration wait too short — see **`timeoutMs`** above.
-- **Exit code 26** (and messages like **LLM failed to generate / replay action script**): Often corresponds to a completed exploration whose **outcome was goal not achievable** (`goal_not_achievable`, summary with `halt`) — e.g. verifying
+- **Exit code 26** (and messages like **LLM failed to generate / replay action script**): Often corresponds to a completed exploration whose **outcome was goal not achievable** (`goal_not_achievable`, summary with `halt`) — e.g. verifying "view script after a successful run" when **no run or script exists yet** in the UI. Use `muggle-local-run-result-get` and read the **summary / structured summary**; do not assume an Electron crash. **Fix:** choose a **project that already has** completed runs and scripts, or **change the test case** so preconditions match what localhost can satisfy (e.g. include steps to create and run a test first, or assert only empty-state UI when no runs exist).
 
 ### 6. Approval before any local execution
 
-
-
+Use `AskQuestion` to get explicit approval before launching Electron. State: replay vs generation, test case name, URL.
+
+- "Yes, launch Electron (visible — I want to watch)"
+- "Yes, launch Electron (headless — run in background)"
+- "No, cancel"
+
+Only call local execute tools with `approveElectronAppLaunch: true` after the user selects a "Yes" option. Map visible to `showUi: true`, headless to `showUi: false`.
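The mapping from the approval options to tool arguments can be sketched as a small lookup. The option strings and the `approveElectronAppLaunch` / `showUi` parameter names come from the text above; the helper `launch_args` is hypothetical, and returning `None` for anything other than a "Yes" option is an assumption about how a caller would represent "do not launch".

```python
from typing import Dict, Optional

# Option labels as listed in section 6 (\u2014 is the dash used in the labels).
VISIBLE = "Yes, launch Electron (visible \u2014 I want to watch)"
HEADLESS = "Yes, launch Electron (headless \u2014 run in background)"
CANCEL = "No, cancel"


def launch_args(choice: str) -> Optional[Dict[str, bool]]:
    """Translate the selected option into execute-tool arguments, or None to stop."""
    if choice == VISIBLE:
        return {"approveElectronAppLaunch": True, "showUi": True}
    if choice == HEADLESS:
        return {"approveElectronAppLaunch": True, "showUi": False}
    # Anything else, including "No, cancel", means no approval was given.
    return None
```

Treating every unrecognized answer as a refusal keeps the default on the safe side: Electron is only launched after an explicit "Yes".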
 
 ### 7. After successful generation only
 
@@ -110,10 +126,61 @@ Only then call local execute tools with `approveElectronAppLaunch: true`.
|
|
|
110
126
|
- `muggle-local-run-result-get` with the run id from execute.
|
|
111
127
|
- Include: status, duration, pass/fail summary, per-step summary, artifact/screenshot paths, errors if failed, and script view URL when publishing ran.
|
|
112
128
|
|
|
129
|
+
### 9. Offer to post a visual walkthrough to the PR
|
|
130
|
+
|
|
131
|
+
After reporting results, gather the required input and hand off to the shared **`muggle-pr-visual-walkthrough`** skill, which renders the walkthrough via `muggle build-pr-section` and posts it to the current branch's open PR.
|
|
132
|
+
|
|
133
|
+
#### 9a: Gather per-step screenshots
|
|
134
|
+
|
|
135
|
+
The shared skill takes an **`E2eReport` JSON** that includes per-step screenshot URLs. After step 7 has called `muggle-local-publish-test-script` and you have the `testScriptId`:
|
|
136
|
+
|
|
137
|
+
1. Call `muggle-remote-test-script-get` with the `testScriptId`.
|
|
138
|
+
2. Extract per step: `steps[].operation.action` and `steps[].operation.screenshotUrl`.
|
|
139
|
+
3. Build the `steps` array: `[{ stepIndex: 0, action: "...", screenshotUrl: "..." }, ...]`.
|
|
140
|
+
4. If the run failed, capture `failureStepIndex`, `error`, and the local `artifactsDir` from the run result in step 8.
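
Steps 2 and 3 above amount to a small projection over the script's steps. The sketch below assumes the response from `muggle-remote-test-script-get` exposes `steps[].operation.action` and `steps[].operation.screenshotUrl` as plain strings. The `ScriptStep` and `ReportStep` interfaces and the `buildReportSteps` helper are hypothetical names introduced for illustration.

```typescript
// Assumed subset of one step in the test script response.
interface ScriptStep {
  operation: {
    action: string;
    screenshotUrl: string;
  };
}

// One entry of the `steps` array in the E2eReport.
interface ReportStep {
  stepIndex: number;
  action: string;
  screenshotUrl: string;
}

// Projects script steps into report steps, numbering them from 0
// to match the `stepIndex: 0` example in the skill text.
function buildReportSteps(steps: ScriptStep[]): ReportStep[] {
  return steps.map((step, index) => ({
    stepIndex: index,
    action: step.operation.action,
    screenshotUrl: step.operation.screenshotUrl,
  }));
}
```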
|
|
141
|
+
|
|
142
|
+
Assemble the `E2eReport`:
|
|
143
|
+
|
|
144
|
+
```json
|
|
145
|
+
{
|
|
146
|
+
"projectId": "<projectId from step 2>",
|
|
147
|
+
"tests": [
|
|
148
|
+
{
|
|
149
|
+
"name": "<test case title>",
|
|
150
|
+
"testCaseId": "<id>",
|
|
151
|
+
"testScriptId": "<id from publish>",
|
|
152
|
+
"runId": "<runId from execute>",
|
|
153
|
+
"viewUrl": "<viewUrl from publish>",
|
|
154
|
+
"status": "passed",
|
|
155
|
+
"steps": [{ "stepIndex": 0, "action": "...", "screenshotUrl": "..." }]
|
|
156
|
+
}
|
|
157
|
+
]
|
|
158
|
+
}
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
See the `muggle-pr-visual-walkthrough` skill for the full schema including the failed-test shape.
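
For the passed case, assembling the report can be sketched as a typed function. The field names mirror the JSON example above. Everything else is an assumption: `RunInputs` and `assemblePassedReport` are hypothetical, and the failed-test shape is left out because it is defined in the `muggle-pr-visual-walkthrough` skill rather than here.

```typescript
interface ReportStep {
  stepIndex: number;
  action: string;
  screenshotUrl: string;
}

// Passed-test shape only, mirroring the JSON example.
interface PassedTest {
  name: string;
  testCaseId: string;
  testScriptId: string;
  runId: string;
  viewUrl: string;
  status: "passed";
  steps: ReportStep[];
}

interface E2eReport {
  projectId: string;
  tests: PassedTest[];
}

// Hypothetical bundle of values collected in the earlier steps.
interface RunInputs {
  projectId: string;      // from step 2
  testCaseTitle: string;
  testCaseId: string;
  testScriptId: string;   // from publish
  viewUrl: string;        // from publish
  runId: string;          // from execute
  steps: ReportStep[];
}

// Builds a single-test report for a run that passed.
function assemblePassedReport(inputs: RunInputs): E2eReport {
  return {
    projectId: inputs.projectId,
    tests: [
      {
        name: inputs.testCaseTitle,
        testCaseId: inputs.testCaseId,
        testScriptId: inputs.testScriptId,
        runId: inputs.runId,
        viewUrl: inputs.viewUrl,
        status: "passed",
        steps: inputs.steps,
      },
    ],
  };
}
```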
|
|
162
|
+
|
|
163
|
+
#### 9b: Ask the user
|
|
164
|
+
|
|
165
|
+
Use `AskQuestion`:
|
|
166
|
+
|
|
167
|
+
> "Post a visual walkthrough of this run to the PR? Reviewers can click the test case to see step-by-step screenshots on the Muggle AI dashboard."
|
|
168
|
+
|
|
169
|
+
- Option 1: "Yes, post to PR"
|
|
170
|
+
- Option 2: "Skip"
|
|
171
|
+
|
|
172
|
+
#### 9c: Invoke the shared skill in Mode A
|
|
173
|
+
|
|
174
|
+
If the user chooses "Yes, post to PR", invoke the `muggle-pr-visual-walkthrough` skill via the `Skill` tool. With the `E2eReport` in context, the skill renders the markdown block via the CLI, finds the PR via `gh pr view`, posts `body` as a comment, posts the overflow `comment` only if the CLI emitted one, and confirms the PR URL to the user.
|
|
175
|
+
|
|
176
|
+
Always use **Mode A** (post to existing PR) from this skill. Never hand-write the walkthrough markdown or call `gh pr comment` directly — delegate to `muggle-pr-visual-walkthrough`.
|
|
177
|
+
|
|
113
178
|
## Non-negotiables
|
|
114
179
|
|
|
115
|
-
- No silent auth skip; no launching Electron without approval
|
|
180
|
+
- No silent auth skip; no launching Electron without approval via `AskQuestion`.
|
|
116
181
|
- If replayable scripts exist, do not default to generation without user choice.
|
|
117
182
|
- No hiding failures: surface errors and artifact paths.
|
|
118
183
|
- Replay: never hand-built or simplified `actionScript` — only from `muggle-remote-action-script-get`.
|
|
119
|
-
-
|
|
184
|
+
- Use `AskQuestion` for every selection — project, use case, test case, script, and approval. Never ask the user to type a number.
|
|
185
|
+
- Project, use case, and test case selection lists must always include "Create new ...". Include "Show full list" whenever the API returned at least one row for that step; omit "Show full list" when the list is empty (offer "Create new ..." only). For creates, use preview tools (`muggle-remote-use-case-prompt-preview`, `muggle-remote-test-case-generate-from-prompt`) before persisting.
|
|
186
|
+
- PR posting is always optional and always delegated to the `muggle-pr-visual-walkthrough` skill — never inline the walkthrough markdown or call `gh pr comment` directly from this skill.
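
The selection-list rule above ("Create new ..." always, "Show full list" only when the API returned at least one row) can be sketched as a small helper. This is an illustration under stated assumptions: `buildSelectionOptions`, the `noun` argument, and the `previewLimit` cap on how many rows are shown up front are hypothetical and do not come from the skill text.

```typescript
// Builds the option labels for a project, use case, or test case picker.
// rowLabels: labels for rows returned by the API (may be empty).
// noun: what is being selected, used in the "Create new ..." label.
// previewLimit: assumed cap on rows shown before "Show full list".
function buildSelectionOptions(
  rowLabels: string[],
  noun: string,
  previewLimit: number = 5,
): string[] {
  const options = rowLabels.slice(0, previewLimit);
  // Offer the full list only when there is at least one row to show.
  if (rowLabels.length > 0) {
    options.push("Show full list");
  }
  // The create option is always present, even for an empty list.
  options.push(`Create new ${noun}`);
  return options;
}
```

With an empty list the helper returns only the create option, which matches the empty-list case in the rule.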
|