@muggleai/works 4.2.2 → 4.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +45 -37
- package/dist/{chunk-BZJXQZ5Q.js → chunk-PMI2DI3V.js} +524 -173
- package/dist/cli.js +1 -1
- package/dist/index.js +1 -1
- package/dist/plugin/.claude-plugin/plugin.json +4 -4
- package/dist/plugin/.cursor-plugin/plugin.json +3 -3
- package/dist/plugin/README.md +7 -5
- package/dist/plugin/scripts/ensure-electron-app.sh +3 -3
- package/dist/plugin/skills/do/e2e-acceptance.md +161 -0
- package/dist/plugin/skills/do/open-prs.md +86 -16
- package/dist/plugin/skills/muggle/SKILL.md +15 -13
- package/dist/plugin/skills/muggle-do/SKILL.md +6 -6
- package/dist/plugin/skills/muggle-test/SKILL.md +380 -0
- package/dist/plugin/skills/muggle-test-feature-local/SKILL.md +44 -27
- package/dist/plugin/skills/muggle-test-import/SKILL.md +272 -0
- package/dist/plugin/skills/muggle-upgrade/SKILL.md +1 -1
- package/dist/plugin/skills/optimize-descriptions/SKILL.md +8 -8
- package/package.json +15 -12
- package/plugin/.claude-plugin/plugin.json +4 -4
- package/plugin/.cursor-plugin/plugin.json +3 -3
- package/plugin/README.md +7 -5
- package/plugin/scripts/ensure-electron-app.sh +3 -3
- package/plugin/skills/do/e2e-acceptance.md +161 -0
- package/plugin/skills/do/open-prs.md +86 -16
- package/plugin/skills/muggle/SKILL.md +15 -13
- package/plugin/skills/muggle-do/SKILL.md +6 -6
- package/plugin/skills/muggle-test/SKILL.md +380 -0
- package/plugin/skills/muggle-test-feature-local/SKILL.md +44 -27
- package/plugin/skills/muggle-test-import/SKILL.md +272 -0
- package/plugin/skills/muggle-upgrade/SKILL.md +1 -1
- package/plugin/skills/optimize-descriptions/SKILL.md +8 -8
- package/dist/plugin/skills/do/qa.md +0 -89
- package/plugin/skills/do/qa.md +0 -89
@@ -0,0 +1,380 @@
+---
+name: muggle-test
+description: "Run change-driven E2E acceptance testing using Muggle AI — detects local code changes, maps them to use cases, and generates test scripts either locally (real browser on localhost) or remotely (cloud execution on a preview/staging URL). Publishes results to Muggle dashboard, opens them in the browser, and posts E2E acceptance summaries with screenshots to the PR. Use this skill whenever the user wants to test their changes, run E2E acceptance tests on recent work, validate what they've been working on, or check if their code changes broke anything. Triggers on: 'test my changes', 'run tests on my changes', 'acceptance test my work', 'check my changes', 'validate my changes', 'test before I push', 'make sure my changes work', 'regression test my changes', 'test on preview', 'test on staging'. This is the go-to skill for change-driven E2E acceptance testing — it handles everything from change detection to test execution to result reporting."
+---
+
+# Muggle Test — Change-Driven E2E Acceptance Router
+
+A router skill that detects code changes, resolves impacted test cases, executes them locally or remotely, publishes results to the Muggle AI dashboard, and posts E2E acceptance summaries to the PR. The user can invoke this at any moment, in any state.
+
+## UX Guidelines — Minimize Typing
+
+**Every selection-based question MUST use the `AskQuestion` tool** (or the platform's equivalent structured selection tool). Never ask the user to "reply with a number" in a plain text message — always present clickable options.
+
+- **Selections** (project, use case, test case, mode, approval): Use `AskQuestion` with labeled options the user can click.
+- **Multi-select** (use cases, test cases): Use `AskQuestion` with `allow_multiple: true`.
+- **Free-text inputs** (URLs, descriptions): Only use plain text prompts when there is no finite set of options. Even then, offer a detected/default value when possible.
+- **Batch related questions**: If two questions are independent, present them together in a single `AskQuestion` call rather than asking sequentially.
+
+## Step 1: Confirm Scope of Work (Always First)
+
+Parse the user's query and explicitly confirm their expectation. There are exactly two modes:
+
+### Mode A: Local Test Generation
+> Test impacted use cases/test cases against **localhost** using the Electron browser.
+>
+> Execution tool: `muggle-local-execute-test-generation`
+
+Signals that the user wants this mode: mentions of "localhost", "local", "my machine", "dev server", "my changes locally", or just "test my changes" in a repo context.
+
+### Mode B: Remote Test Generation
+> Ask Muggle's cloud to generate test scripts against a **preview/staging URL**.
+>
+> Execution tool: `muggle-remote-workflow-start-test-script-generation`
+
+Signals that the user wants this mode: mentions of "preview", "staging", "deployed", "preview URL", "test on preview", "test the deployment", or a non-localhost URL.
+
+### Confirming
+
+If the user's intent is clear, state back what you understood and use `AskQuestion` to confirm:
+- Option 1: "Yes, proceed"
+- Option 2: "Switch to [the other mode]"
+
+If ambiguous, use `AskQuestion` to let the user choose:
+- Option 1: "Local — launch browser on your machine against localhost"
+- Option 2: "Remote — Muggle cloud tests against a preview/staging URL"
+
+Only proceed after the user selects an option.
+
+## Step 2: Detect Local Changes
+
+Analyze the working directory to understand what changed.
+
+1. Run `git status` and `git diff --stat` for an overview
+2. Run `git diff` (or `git diff --cached` if staged) to read the actual diffs
+3. Identify impacted feature areas:
+   - Changed UI components, pages, routes
+   - Modified API endpoints or data flows
+   - Updated form fields, validation, user interactions
+4. Produce a concise **change summary** — a list of impacted features
+
+Present:
+> "Here's what changed: [list]. I'll scope E2E acceptance testing to these areas."
+
+If no changes are detected (clean tree), tell the user and ask what they want to test.
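The change-detection routine described above can be sketched as a small POSIX-shell helper. The path-to-feature-area mapping below is a hypothetical example heuristic, not part of the package:

```bash
#!/bin/sh
# Classify a changed path into a feature area (illustrative rules only).
classify_change() {
  case "$1" in
    *components/*|*pages/*|*routes/*) echo "UI: $1" ;;
    *api/*|*server/*)                 echo "API: $1" ;;
    *forms/*|*validation*)            echo "Forms: $1" ;;
    *)                                echo "Other: $1" ;;
  esac
}

# Read changed paths (one per line) on stdin and emit a deduplicated summary.
change_summary() {
  while IFS= read -r path; do
    [ -n "$path" ] && classify_change "$path"
  done | sort -u
}
```

A possible usage, feeding it both staged and unstaged changes: `{ git diff --name-only; git diff --cached --name-only; } | change_summary`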
+
+## Step 3: Authenticate
+
+1. Call `muggle-remote-auth-status`
+2. If authenticated and not expired → proceed
+3. If not authenticated or expired → call `muggle-remote-auth-login`
+4. If login is pending → call `muggle-remote-auth-poll`
+
+If auth fails repeatedly, suggest running `muggle logout && muggle login` from the terminal.
+
+## Step 4: Select Project (User Must Choose)
+
+A **project** is where all your test results, use cases, and test scripts are grouped on the Muggle AI dashboard. Pick the project that matches what you're working on.
+
+1. Call `muggle-remote-project-list`
+2. Use `AskQuestion` to present all projects as clickable options. Include the project URL in each label so the user can identify the right one. Always include a "Create new project" option at the end.
+
+   Example labels:
+   - "MUGGLE AI STAGING 1 — https://staging.muggle-ai.com/"
+   - "Tanka Testing — https://www.tanka.ai"
+   - "Create new project"
+
+   Prompt: "Pick the project to group this test run into:"
+
+3. **Wait for the user to explicitly choose** — do NOT auto-select based on repo name or URL matching
+4. **If the user chooses "Create new project"**:
+   - Ask for `projectName`, `description`, and the production/preview URL
+   - Call `muggle-remote-project-create`
+
+Store the `projectId` only after the user confirms.
+
+## Step 5: Select Use Case (User Must Choose)
+
+### 5a: List existing use cases
+Call `muggle-remote-use-case-list` with the project ID.
+
+### 5b: Present ALL use cases for user selection
+
+Use `AskQuestion` with `allow_multiple: true` to present all use cases as clickable options. Always include a "Create new use case" option at the end.
+
+Prompt: "Which use case(s) do you want to test?"
+
+### 5c: Wait for explicit user selection
+
+**CRITICAL: Do NOT auto-select use cases** based on:
+- Git changes analysis
+- Use case title/description matching
+- Any heuristic or inference
+
+The user MUST explicitly tell you which use case(s) to use.
+
+### 5d: If the user chooses "Create new use case"
+1. Ask the user to describe the use case in plain English
+2. Call `muggle-remote-use-case-create-from-prompts`:
+   - `projectId`: The project ID
+   - `prompts`: Array of `{ instruction: "..." }` with the user's description
+3. Present the created use case and confirm it's correct
+
+## Step 6: Select Test Case (User Must Choose)
+
+For the selected use case(s):
+
+### 6a: List existing test cases
+Call `muggle-remote-test-case-list-by-use-case` with each use case ID.
+
+### 6b: Present ALL test cases for user selection
+
+Use `AskQuestion` with `allow_multiple: true` to present all test cases as clickable options. Always include a "Generate new test case" option at the end.
+
+Prompt: "Which test case(s) do you want to run?"
+
+### 6c: Wait for explicit user selection
+
+**CRITICAL: Do NOT auto-select test cases** — the user MUST explicitly choose which test case(s) to execute.
+
+### 6d: If the user chooses "Generate new test case"
+1. Ask the user to describe what they want to test in plain English
+2. Call `muggle-remote-test-case-generate-from-prompt`:
+   - `projectId`, `useCaseId`, `instruction` (the user's description)
+3. Present the generated test case(s) for review
+4. Call `muggle-remote-test-case-create` to save the ones the user approves
+
+### 6e: Confirm final selection
+
+Use `AskQuestion` to confirm: "You selected [N] test case(s): [list titles]. Ready to proceed?"
+- Option 1: "Yes, run them"
+- Option 2: "No, let me re-select"
+
+Wait for user confirmation before moving to execution.
+
+## Step 7A: Execute — Local Mode
+
+### Pre-flight questions (batch where possible)
+
+**Question 1 — Local URL:**
+
+Try to auto-detect the dev server URL by checking running terminals or common ports (e.g., `lsof -iTCP -sTCP:LISTEN -nP | grep -E ':(3000|3001|4200|5173|8080)'`). If a likely URL is found, present it as a clickable default via `AskQuestion`:
+- Option 1: "http://localhost:3000" (or whatever was detected)
+- Option 2: "Other — let me type a URL"
+
+If nothing is detected, ask as free text: "Your local app should be running. What's the URL? (e.g., http://localhost:3000)"
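The auto-detection step can be sketched as a filter over the `lsof` output shown above; the port list mirrors the command in the text and is an assumption about common dev-server ports:

```bash
#!/bin/sh
# Read `lsof -iTCP -sTCP:LISTEN -nP` output on stdin and guess a dev-server URL.
guess_local_url() {
  port=$(grep -oE ':(3000|3001|4200|5173|8080)' | head -n 1 | tr -d ':')
  if [ -n "$port" ]; then
    echo "http://localhost:$port"
  fi
}
```

Possible usage: `lsof -iTCP -sTCP:LISTEN -nP | guess_local_url` — an empty result means nothing was detected and the skill falls back to the free-text prompt.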
+
+**Question 2 — Electron launch + window visibility (ask together):**
+
+After getting the URL, use a single `AskQuestion` call that covers both launch approval and window visibility:
+
+"Ready to launch the Muggle Electron browser for [N] test case(s)?"
+- "Yes, launch it (visible — I want to watch)"
+- "Yes, launch it (headless — run in background)"
+- "No, cancel"
+
+If the user cancels, stop and ask what they want to do instead.
+
+### Run sequentially
+
+For each test case:
+
+1. Call `muggle-remote-test-case-get` to fetch full details
+2. Call `muggle-local-execute-test-generation`:
+   - `testCase`: Full test case object from step 1
+   - `localUrl`: The user's local URL (from Question 1)
+   - `approveElectronAppLaunch`: `true` (only if the user approved in Question 2)
+   - `showUi`: `true` if the user chose "visible", `false` if "headless" (from Question 2)
+3. Store the returned `runId`
+
+If a generation fails, log it and continue to the next. Do not abort the batch.
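The continue-on-failure loop above can be sketched generically; `run_all` takes a command to run per test case and is an illustrative helper, not a Muggle tool:

```bash
#!/bin/sh
# Run a command once per test case, sequentially; log failures and keep going.
# Prints the number of failures instead of aborting the batch.
run_all() {
  cmd=$1; shift
  fails=0
  for tc in "$@"; do
    if ! "$cmd" "$tc"; then
      fails=$((fails + 1))
      echo "FAILED: $tc" >&2
    fi
  done
  echo "$fails"
}
```

The design choice matches the guardrail below: a failed item is surfaced on stderr and counted, but the remaining test cases still run.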
+
+### Collect results
+
+For each `runId`, call `muggle-local-run-result-get`. Extract: status, duration, step count, `artifactsDir`.
+
+### Publish each run to the cloud
+
+For each completed run, call `muggle-local-publish-test-script`:
+- `runId`: The local run ID
+- `cloudTestCaseId`: The cloud test case ID
+
+This returns:
+- `viewUrl`: Direct link to view this test run on the Muggle AI dashboard
+- `testScriptId`, `actionScriptId`, `workflowRuntimeId`
+
+Store every `viewUrl` — these are used in the next steps.
+
+### Report summary
+
+```
+Test Case                  Status   Duration  Steps  View on Muggle
+─────────────────────────────────────────────────────────────────────────
+Login with valid creds     PASSED   12.3s     8      https://www.muggle-ai.com/...
+Login with invalid creds   PASSED   9.1s      6      https://www.muggle-ai.com/...
+Checkout flow              FAILED   15.7s     12     https://www.muggle-ai.com/...
+─────────────────────────────────────────────────────────────────────────
+Total: 3 tests | 2 passed | 1 failed | 37.1s
+```
+
+For failures: show which step failed, the local screenshot path, and a suggestion.
+
+## Step 7B: Execute — Remote Mode
+
+### Ask for target URL
+
+> "What's the preview/staging URL to test against?"
+
+### Trigger remote workflows
+
+For each test case:
+
+1. Call `muggle-remote-test-case-get` to fetch full details
+2. Call `muggle-remote-workflow-start-test-script-generation`:
+   - `projectId`: The project ID
+   - `useCaseId`: The use case ID
+   - `testCaseId`: The test case ID
+   - `name`: `"muggle-test: {test case title}"`
+   - `url`: The preview/staging URL
+   - `goal`: From the test case
+   - `precondition`: From the test case (use `"None"` if empty)
+   - `instructions`: From the test case
+   - `expectedResult`: From the test case
+3. Store the returned workflow runtime ID
+
+### Monitor and report
+
+For each workflow, call `muggle-remote-wf-get-ts-gen-latest-run` with the runtime ID.
+
+```
+Test Case                  Workflow Status   Runtime ID
+────────────────────────────────────────────────────────
+Login with valid creds     RUNNING           rt-abc123
+Login with invalid creds   COMPLETED         rt-def456
+Checkout flow              QUEUED            rt-ghi789
+```
+
+## Step 8: Open Results in Browser
+
+After execution and publishing are complete, open the Muggle AI dashboard so the user can visually inspect results and screenshots.
+
+### Mode A (Local) — open each published viewUrl
+
+For each published run's `viewUrl`:
+```bash
+open "https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/scripts?modal=script-details&testCaseId={testCaseId}"
+```
+
+If there are many runs (>3), open just the project-level runs page instead of individual tabs:
+```bash
+open "https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/runs"
+```
+
+### Mode B (Remote) — open the project runs page
+
+```bash
+open "https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/runs"
+```
+
+Tell the user:
+> "I've opened the Muggle AI dashboard in your browser — you can see the test results, step-by-step screenshots, and action scripts there."
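The `open` command used above is macOS-specific. A portable wrapper can be sketched as follows (assuming `xdg-open` is available on Linux); it prints the command it would run so it can be dry-run tested:

```bash
#!/bin/sh
# Pick the platform's URL opener; print the command (dry run) rather than exec.
open_dashboard() {
  url=$1
  case "$(uname)" in
    Darwin) opener=open ;;      # macOS
    *)      opener=xdg-open ;;  # assume a freedesktop-style Linux desktop
  esac
  echo "$opener $url"
}
```

To actually launch the browser, a caller could run the printed command, e.g. `$(open_dashboard "$viewUrl")`.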
+
+## Step 9: Post E2E Acceptance Results to PR
+
+After reporting results, check whether there's an open PR for the current branch and attach the E2E acceptance summary.
+
+### 9a: Find the PR
+
+```bash
+gh pr view --json number,url,title 2>/dev/null
+```
+
+- If a PR exists → post results as a comment
+- If no PR exists → use `AskQuestion`:
+  - "Create PR with E2E acceptance results"
+  - "Skip posting to PR"
+
+### 9b: Build the E2E acceptance comment body
+
+Construct a markdown comment with the full E2E acceptance breakdown. The format links each test case to its detail page on the Muggle AI dashboard, so PR reviewers can click through to see step-by-step screenshots and action scripts.
+
+```markdown
+## 🧪 Muggle AI — E2E Acceptance Results
+
+**X passed / Y failed** | [View all on Muggle AI](https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/runs)
+
+| Test Case | Status | Details |
+|-----------|--------|---------|
+| [Login with valid creds](https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/scripts?modal=script-details&testCaseId={testCaseId}) | ✅ PASSED | 8 steps, 12.3s |
+| [Login with invalid creds](https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/scripts?modal=script-details&testCaseId={testCaseId}) | ✅ PASSED | 6 steps, 9.1s |
+| [Checkout flow](https://www.muggle-ai.com/muggleTestV0/dashboard/projects/{projectId}/scripts?modal=script-details&testCaseId={testCaseId}) | ❌ FAILED | Step 7: "Click checkout button" — element not found |
+
+<details>
+<summary>Failed test details</summary>
+
+### Checkout flow
+- **Failed at**: Step 7 — "Click checkout button"
+- **Error**: Element not found
+- **Local artifacts**: `~/.muggle-ai/sessions/{runId}/`
+- **Screenshots**: `~/.muggle-ai/sessions/{runId}/screenshots/`
+
+</details>
+
+---
+*Generated by [Muggle AI](https://www.muggle-ai.com) — change-driven E2E acceptance testing*
+```
+
+### 9c: Post to the PR
+
+If a PR already exists — add the results as a comment:
+```bash
+gh pr comment {pr-number} --body "$(cat <<'EOF'
+{the E2E acceptance comment body from 9b}
+EOF
+)"
+```
+
+If creating a new PR — include the E2E acceptance section in the PR body alongside the usual summary/changes sections.
+
+### 9d: Confirm to user
+
+> "E2E acceptance results posted to PR #{number}. Reviewers can click the test case links to see step-by-step screenshots on the Muggle AI dashboard."
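The summary line of the 9b comment body can be assembled programmatically. This sketch builds only the header; the function name `summary_header` is illustrative, not part of the package:

```bash
#!/bin/sh
# Build the PR comment's summary header from pass/fail counts and a runs URL.
summary_header() {
  passed=$1
  failed=$2
  runs_url=$3
  printf '**%s passed / %s failed** | [View all on Muggle AI](%s)\n' \
    "$passed" "$failed" "$runs_url"
}
```

A caller might pipe the full assembled body into `gh pr comment {pr-number} --body-file -` as an alternative to the heredoc shown above.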
+
+## Tool Reference
+
+| Phase | Tool | Mode |
+|:------|:-----|:-----|
+| Auth | `muggle-remote-auth-status` | Both |
+| Auth | `muggle-remote-auth-login` | Both |
+| Auth | `muggle-remote-auth-poll` | Both |
+| Project | `muggle-remote-project-list` | Both |
+| Project | `muggle-remote-project-create` | Both |
+| Use Case | `muggle-remote-use-case-list` | Both |
+| Use Case | `muggle-remote-use-case-create-from-prompts` | Both |
+| Test Case | `muggle-remote-test-case-list-by-use-case` | Both |
+| Test Case | `muggle-remote-test-case-generate-from-prompt` | Both |
+| Test Case | `muggle-remote-test-case-create` | Both |
+| Test Case | `muggle-remote-test-case-get` | Both |
+| Execute | `muggle-local-execute-test-generation` | Local |
+| Execute | `muggle-remote-workflow-start-test-script-generation` | Remote |
+| Results | `muggle-local-run-result-get` | Local |
+| Results | `muggle-remote-wf-get-ts-gen-latest-run` | Remote |
+| Publish | `muggle-local-publish-test-script` | Local |
+| Browser | `open` (shell command) | Both |
+| PR | `gh pr view`, `gh pr comment`, `gh pr create` | Both |
+
+## Guardrails
+
+- **Always confirm intent first** — never assume local vs remote without asking
+- **User MUST select project** — present clickable options via `AskQuestion`, wait for explicit choice, never auto-select
+- **User MUST select use case(s)** — present clickable options via `AskQuestion`, wait for explicit choice, never auto-select based on git changes or heuristics
+- **User MUST select test case(s)** — present clickable options via `AskQuestion`, wait for explicit choice, never auto-select
+- **Use `AskQuestion` for every selection** — never ask the user to type a number; always present clickable options
+- **Batch related questions** — combine Electron approval + visibility into one question; auto-detect the localhost URL when possible
+- **Never launch Electron without explicit user approval** (`approveElectronAppLaunch`)
+- **Never silently drop test cases** — log failures and continue, then report them
+- **Never guess the URL** — always ask the user for the localhost or preview URL
+- **Always publish before opening the browser** — the dashboard needs the published data to show results
+- **Use the correct dashboard URL format** — `modal=script-details` (not `modal=details`)
+- **Always check for a PR before posting** — don't create a PR comment if there's no PR (ask the user first)
+- **Can be invoked in any state** — if the user already has a project or use cases set up, skip to the relevant step rather than re-doing everything
@@ -1,11 +1,11 @@
 ---
 name: muggle-test-feature-local
-description: Run a real-browser
+description: Run a real-browser end-to-end (E2E) acceptance test against localhost to verify a feature works correctly — signup flows, checkout, form validation, UI interactions, or any user-facing behavior. Launches a browser that executes test steps and captures screenshots. Use this skill whenever the user asks to test, validate, or verify their web app, UI changes, user flows, or frontend behavior on localhost or a dev server — even if they don't mention 'muggle' or 'E2E' explicitly.
 ---
 
 # Muggle Test Feature Local
 
-**Goal:** Run or generate an end-to-end test against a **local URL** using Muggle
+**Goal:** Run or generate an end-to-end test against a **local URL** using Muggle's Electron browser.
 
 | Scope | MCP tools |
 | :---- | :-------- |
@@ -15,12 +15,19 @@ description: Run a real-browser QA test against localhost to verify a feature wo
 
 The local URL only changes where the browser opens; it does not change the remote project or test definitions.
 
+## UX Guidelines — Minimize Typing
+
+**Every selection-based question MUST use the `AskQuestion` tool** (or the platform's equivalent structured selection tool). Never ask the user to "reply with a number" in a plain text message — always present clickable options.
+
+- **Selections** (project, use case, test case, script, approval): Use `AskQuestion` with labeled options the user can click.
+- **Free-text inputs** (URLs, descriptions): Only use plain text prompts when there is no finite set of options. Even then, offer a detected/default value when possible.
+
 ## Workflow
 
 ### 1. Auth
 
 - `muggle-remote-auth-status`
-- If not signed in: `muggle-remote-auth-login` then `muggle-remote-auth-poll`
+- If not signed in: `muggle-remote-auth-login` then `muggle-remote-auth-poll`
 Do not skip or assume auth.
 
 ### 2. Targets (user must confirm)
@@ -31,41 +38,45 @@ Ask the user to pick **project**, **use case**, and **test case** (do not infer)
 - `muggle-remote-use-case-list` (with `projectId`)
 - `muggle-remote-test-case-list-by-use-case` (with `useCaseId`)
 
-**Selection UI (mandatory):**
+**Selection UI (mandatory):** Every selection MUST use `AskQuestion` with clickable options. Never ask the user to "reply with the number" in plain text.
 
-**
+**Project selection context:** A **project** groups all your test results, use cases, and test scripts on the Muggle AI dashboard. Include the project URL in each option label so the user can identify the right one.
 
-
-2. **Create new …** — user creates a new entity instead of picking an existing one. Label per step: **Create new project**, **Create new use case**, or **Create new test case**.
+Prompt for projects: "Pick the project to group this test into:"
 
 **Relevance-first filtering (mandatory for project, use case, and test case lists):**
 
 - Do **not** dump the full list by default.
-- Rank items by semantic relevance to the user
-- Show only the **top 3
-
+- Rank items by semantic relevance to the user's stated goal (title first, then description / user story / acceptance criteria).
+- Show only the **top 3-5** most relevant options via `AskQuestion`, plus these fixed tail options:
+  - **"Show full list"** — present the complete list in a new `AskQuestion` call. **Skip this option** if the API returned zero rows.
+  - **"Create new ..."** — never omitted. Label per step: "Create new project", "Create new use case", or "Create new test case".
 
 **Create new — tools and flow (use these MCP tools; preview before persist):**
 
 - **Project — Create new project:** Collect `projectName`, `description`, and `url` (may be the local app URL, e.g. `http://localhost:3999`). Call `muggle-remote-project-create`. Use the returned `projectId` and continue.
-- **Use case — Create new use case:** User provides a natural-language instruction (or you reuse their testing goal).
-  1. `muggle-remote-use-case-prompt-preview` with `projectId`, `instruction` — show preview; get confirmation
+- **Use case — Create new use case:** User provides a natural-language instruction (or you reuse their testing goal).
+  1. `muggle-remote-use-case-prompt-preview` with `projectId`, `instruction` — show preview; get confirmation via `AskQuestion`.
   2. `muggle-remote-use-case-create-from-prompts` with `projectId`, `prompts: [{ instruction }]` — persist. Use the created use case id and continue to test-case selection.
-- **Test case — Create new test case** (requires a chosen `useCaseId`): User provides an instruction describing what to test.
-  1. `muggle-remote-test-case-generate-from-prompt` with `projectId`, `useCaseId`, `instruction` — **preview only** (server test-case prompt preview); show the returned draft(s); get confirmation
-  2. Persist the accepted draft with `muggle-remote-test-case-create`, mapping preview fields into the required properties (`title`, `description`, `goal`, `expectedResult`, `url`, etc.). Then continue from
+- **Test case — Create new test case** (requires a chosen `useCaseId`): User provides an instruction describing what to test.
+  1. `muggle-remote-test-case-generate-from-prompt` with `projectId`, `useCaseId`, `instruction` — **preview only** (server test-case prompt preview); show the returned draft(s); get confirmation via `AskQuestion`.
+  2. Persist the accepted draft with `muggle-remote-test-case-create`, mapping preview fields into the required properties (`title`, `description`, `goal`, `expectedResult`, `url`, etc.). Then continue from **section 4** with that `testCaseId`.
 
 ### 3. Local URL
 
-
-
+Try to auto-detect the dev server URL by checking running terminals or common ports (e.g., `lsof -iTCP -sTCP:LISTEN -nP | grep -E ':(3000|3001|4200|5173|8080)'`). If a likely URL is found, present it as a clickable default via `AskQuestion`:
+- Option 1: "http://localhost:3000" (or whatever was detected)
+- Option 2: "Other — let me type a URL"
+
+If nothing detected, ask as free text: "Your local app should be running. What's the URL? (e.g., http://localhost:3000)"
+
+Remind them: local URL is only the execution target, not tied to cloud project config.
 
 ### 4. Existing scripts vs new generation
 
 `muggle-remote-test-script-list` with `testCaseId`.
 
-- **If any replayable/succeeded scripts exist:**
-  Show: name, id, created/updated, step count. Include **`Generate new script`** as the **last** numbered option (e.g. last number) so it is selectable by number too.
+- **If any replayable/succeeded scripts exist:** use `AskQuestion` to present them as clickable options. Show: name, created/updated, step count per option. Include **"Generate new script"** as the last option.
 - **If none:** go straight to generation (no need to ask replay vs generate).
 
 ### 5. Load data for the chosen path
@@ -77,8 +88,8 @@ Ask the user to pick **project**, **use case**, and **test case** (do not infer)
 
 **Replay**
 
-1. `muggle-remote-test-script-get`
-2. `muggle-remote-action-script-get` with that id
+1. `muggle-remote-test-script-get` — note `actionScriptId`
+2. `muggle-remote-action-script-get` with that id — full `actionScript`
 **Use the API response as-is.** Do not edit, shorten, or rebuild `actionScript`; replay needs full `label` paths for element lookup.
 3. `muggle-local-execute-replay` (after approval in step 6) with `testScript`, `actionScript`, `localUrl`, `approveElectronAppLaunch: true` (optional: `showUi: true`, **`timeoutMs`** — see below)
 
@@ -88,17 +99,22 @@ The MCP client often uses a **default wait of 300000 ms (5 minutes)** for `muggl
 
 - **Always pass `timeoutMs`** for flows that may be long — for example **`600000` (10 min)** or **`900000` (15 min)** — unless the user explicitly wants a short cap.
 - If the tool reports **`Electron execution timed out after 300000ms`** (or similar) **but** Electron logs show the run still progressing (steps, screenshots, LLM calls), treat it as **orchestration timeout**, not an Electron app defect: **increase `timeoutMs` and retry** (after user re-approves if your policy requires it).
-- **Test case design:** Preconditions like
+- **Test case design:** Preconditions like "a test run has already completed" on an **empty account** can force many steps (sign-up, new project, crawl). Prefer an account/project that **already has** the needed state, or narrow the test goal so generation does not try to create a full project from scratch unless that is intentional.
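The `timeoutMs` escalation advice can be sketched as a tiny helper; the 5 → 10 → 15 minute ladder mirrors the example values in the text and is illustrative only:

```bash
#!/bin/sh
# Pick the next timeoutMs after an orchestration timeout (values from the text).
next_timeout_ms() {
  case "$1" in
    300000) echo 600000 ;;   # 5 min  -> 10 min
    600000) echo 900000 ;;   # 10 min -> 15 min
    *)      echo "$1" ;;     # already at the cap; do not grow further
  esac
}
```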
 
 ### Interpreting `failed` / non-zero Electron exit
 
 - **`Electron execution timed out after 300000ms`:** Orchestration wait too short — see **`timeoutMs`** above.
-- **Exit code 26** (and messages like **LLM failed to generate / replay action script**): Often corresponds to a completed exploration whose **outcome was goal not achievable** (`goal_not_achievable`, summary with `halt`) — e.g. verifying
+- **Exit code 26** (and messages like **LLM failed to generate / replay action script**): Often corresponds to a completed exploration whose **outcome was goal not achievable** (`goal_not_achievable`, summary with `halt`) — e.g. verifying "view script after a successful run" when **no run or script exists yet** in the UI. Use `muggle-local-run-result-get` and read the **summary / structured summary**; do not assume an Electron crash. **Fix:** choose a **project that already has** completed runs and scripts, or **change the test case** so preconditions match what localhost can satisfy (e.g. include steps to create and run a test first, or assert only empty-state UI when no runs exist).
 
 ### 6. Approval before any local execution
 
-
-
+Use `AskQuestion` to get explicit approval before launching Electron. State: replay vs generation, test case name, URL.
+
+- "Yes, launch Electron (visible — I want to watch)"
+- "Yes, launch Electron (headless — run in background)"
+- "No, cancel"
+
+Only call local execute tools with `approveElectronAppLaunch: true` after the user selects a "Yes" option. Map visible to `showUi: true`, headless to `showUi: false`.
 
 ### 7. After successful generation only
 
@@ -112,8 +128,9 @@ Only then call local execute tools with `approveElectronAppLaunch: true`.
 
 ## Non-negotiables
 
-- No silent auth skip; no launching Electron without approval
+- No silent auth skip; no launching Electron without approval via `AskQuestion`.
 - If replayable scripts exist, do not default to generation without user choice.
 - No hiding failures: surface errors and artifact paths.
 - Replay: never hand-built or simplified `actionScript` — only from `muggle-remote-action-script-get`.
-
-
+- Use `AskQuestion` for every selection — project, use case, test case, script, and approval. Never ask the user to type a number.
+- Project, use case, and test case selection lists must always include "Create new ...". Include "Show full list" whenever the API returned at least one row for that step; omit "Show full list" when the list is empty (offer "Create new ..." only). For creates, use preview tools (`muggle-remote-use-case-prompt-preview`, `muggle-remote-test-case-generate-from-prompt`) before persisting.