@pzy560117/opentest 0.1.10 → 0.1.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (45)

package/assets/manifest.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "version": "0.1.10",
+  "version": "0.1.11",
   "languages": [
     {
       "id": "en",
@@ -60,25 +60,36 @@
     "localized": [
       "opentest/SKILL.md",
       "opentest/references/acceptance-evidence.md",
+      "opentest/references/api-testing.md",
       "opentest/references/codex-harness-coverage-heuristics.md",
       "opentest/references/command-routing.md",
       "opentest/references/complete-testing-workflow.md",
+      "opentest/references/desktop-gui-testing.md",
       "opentest/references/lifecycle.md",
       "opentest/references/matrix-format.md",
       "opentest/references/opentest-driven-development.md",
       "opentest/references/quality-gate.md",
+      "opentest/references/test-asset-layout.md",
+      "opentest/references/test-surfaces.md",
+      "opentest/references/web-browser-testing.md",
       "opentest/templates/acceptance-template.md",
+      "opentest/templates/api-acceptance-template.md",
       "opentest/templates/archive-layout.md",
+      "opentest/templates/desktop-gui-acceptance-template.md",
       "opentest/templates/fixtures-template.md",
       "opentest/templates/matrix-template.md",
       "opentest/templates/plan-template.md",
       "opentest/templates/report-template.md",
+      "opentest/templates/web-acceptance-template.md",
       "opentest-accept/SKILL.md",
+      "opentest-api/SKILL.md",
       "opentest-archive/SKILL.md",
       "opentest-author/SKILL.md",
       "opentest-heal/SKILL.md",
       "opentest-plan/SKILL.md",
       "opentest-run/SKILL.md",
+      "opentest-desktop-gui/SKILL.md",
+      "opentest-web-browser/SKILL.md",
       "opentest-verify/SKILL.md"
     ],
     "shared": [

package/assets/skills/opentest/references/api-testing.md ADDED

@@ -0,0 +1,77 @@
+# API Testing
+Use this reference for `api` execution-surface rows.
+## Default Architecture
+Use the repository's existing API test framework first. If the project has no clear API test command, default to:
+```text
+pytest
+  -> httpx or requests client
+  -> pytest fixtures for seed/teardown
+  -> jsonschema or existing Pydantic/DTO models for contract assertions
+  -> optional DB/storage/log read-back
+  -> pytest report or JUnit XML
+```
+Use Docker Compose, testcontainers, or the repository's local service runner when dependencies need isolation. Mock or stub third-party APIs by default; live external services require an explicit requirement and recorded risk.
+Place durable API assets under `tests/api/` from `opentest/references/test-asset-layout.md`: clients in `tests/api/clients/`, fixtures in `tests/api/fixtures/`, schemas in `tests/api/schemas/`, and repeatable entry through `scripts/opentest-run-api.ps1` or the repository's equivalent command.
+## Evidence Layers
+| Layer | Proves | Typical command/tool |
+| --- | --- | --- |
+| contract | status code, headers, response fields, schema, error shape | project contract tests, `pytest`, OpenAPI-based checks |
+| integration | API handler, service, database/storage, queue/log side effects | project integration command, `pytest`, local services |
+| smoke | base URL and critical endpoints are alive | project smoke command, small `pytest`/curl script |
+| security-review | auth, authorization, sensitive field exposure, injection risk | targeted tests plus review notes |
+## Required API Cases
+For API changes, include applicable matrix rows for:
+- happy path: expected status, payload, headers, and business state
+- validation failure: invalid, empty, boundary, malformed, unsupported fields
+- auth and permission: unauthenticated, expired token, wrong role, object-level authorization
+- not found and stale state: missing resource, deleted resource, stale version
+- conflict and idempotency: duplicate create, repeated submit, retry with idempotency key, concurrent update
+- rate limit or throttling when applicable
+- pagination, filtering, sorting, and empty result when list endpoints are changed
+- data consistency: response, DB/storage, emitted event, queue message, file, or log
+- teardown/cleanup: created resources removed or isolated fixture namespace reset
+## Contract Source
+Prefer contract sources in this order:
+1. OpenAPI/protobuf/schema file committed in the repository.
+2. Existing request/response DTO, Pydantic model, serializer, or typed client.
+3. Requirement/design document with explicit fields and errors.
+4. Handwritten schema in the acceptance case.
+Do not infer contract solely from current implementation behavior when it conflicts with requirements.
+## Blocking Rules
+Record `blocked` when any required prerequisite is missing:
+- base URL or service start command
+- auth token, role, or test user
+- fixture seed/teardown path
+- dependency service, database, queue, or mock server
+- stable contract source or expected schema
+- deterministic read-after-write surface
+Do not mark API acceptance as PASS from a 2xx response alone. Write operations require read-after-write evidence from a trustworthy surface such as API read endpoint, DB/storage record, queue/event/log, or another project-owned state surface.
+## Matrix Requirements
+`api` rows must include:
+- `Execution surface`: `api`
+- `Acceptance mode`: `n/a`
+- `Evidence layer`: `contract`, `integration`, `smoke`, or `security-review`
+- `Framework/command`: project API command, `python -m pytest tests/api -v`, curl/httpie script, Postman/Newman when already used, or contract tool
+- `Required evidence`: request/response record, status code, payload/schema assertion, auth/permission assertion when applicable, read-after-write/data consistency, and cleanup/teardown proof
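
Editor's sketch, not part of the published package: the rule above that a 2xx response alone is not a PASS can be illustrated with a small, offline read-after-write check. Everything named here (`ItemHandler`, the `/items` routes, `accept_create_item`, `run_sketch`) is hypothetical and exists only for illustration. The sketch uses the Python standard library instead of `pytest` with `httpx` so that it runs without extra dependencies; a real suite would more likely follow the default architecture listed in the reference.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

STORE = {}  # in-memory stand-in for the service's database (hypothetical)


class ItemHandler(BaseHTTPRequestHandler):
    """Hypothetical /items service; exists only so the sketch can run offline."""

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        item = json.loads(self.rfile.read(length))
        item_id = str(len(STORE) + 1)
        STORE[item_id] = item
        self._send_json(201, {"id": item_id, **item})

    def do_GET(self):
        item_id = self.path.rsplit("/", 1)[-1]
        if item_id in STORE:
            self._send_json(200, {"id": item_id, **STORE[item_id]})
        else:
            self._send_json(404, {"error": "not_found"})

    def do_DELETE(self):
        STORE.pop(self.path.rsplit("/", 1)[-1], None)
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep test output quiet


def call(host, port, method, path, payload=None):
    """Send one request and return (status, parsed JSON body or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        body = json.dumps(payload) if payload is not None else None
        headers = {"Content-Type": "application/json"} if body else {}
        conn.request(method, path, body=body, headers=headers)
        resp = conn.getresponse()
        raw = resp.read()
        return resp.status, (json.loads(raw) if raw else None)
    finally:
        conn.close()


def accept_create_item(host, port):
    # Write: status code and a handwritten field contract.
    status, created = call(host, port, "POST", "/items", {"name": "widget"})
    assert status == 201
    assert set(created) == {"id", "name"}
    # Read-after-write: the created value must be readable from a second surface.
    status, fetched = call(host, port, "GET", "/items/" + created["id"])
    assert status == 200 and fetched["name"] == "widget"
    # Teardown proof: delete, then confirm the resource is gone.
    status, _ = call(host, port, "DELETE", "/items/" + created["id"])
    assert status == 204
    status, _ = call(host, port, "GET", "/items/" + created["id"])
    assert status == 404
    return "PASS"


def run_sketch():
    server = ThreadingHTTPServer(("127.0.0.1", 0), ItemHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return accept_create_item(host, port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_sketch()` starts the throwaway server, runs the write, read-back, and cleanup assertions, and returns `"PASS"` only if all of them hold.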

package/assets/skills/opentest/references/codex-harness-coverage-heuristics.md CHANGED

@@ -13,6 +13,9 @@ This reference extracts short rules from the local Codex Harness knowledge base,
 - `tdd-workflow`
 - `e2e-runner`
 - `browser-e2e-testing`
+- `android-midscene-pytest`
+- `opentest-desktop-gui`
+- `opentest-api`
 - `verification-loop`
 - `code-reviewer`
 - `speckit-checklist`
@@ -28,6 +31,19 @@ This reference extracts short rules from the local Codex Harness knowledge base,
 | Permissions, payments, security, data writes, cross-page loops | high-risk acceptance or E2E evidence |
 | Copy, configuration, small non-behavioral changes | targeted review or light evidence |
+## Execution Surface Selection
+Choose the execution surface separately from the evidence layer:
+| Surface | Default route |
+| --- | --- |
+| `web-browser` | Chrome DevTools MCP, Playwright CLI, or browser acceptance |
+| `android-app` | `android-midscene-pytest` when available; `python -m pytest tests_py -v` drives Midscene Android through ADB |
+| `desktop-gui` | `opentest-desktop-gui`; project GUI automation first, `@midscene/computer` for visual/native/RDP GUI flows, or scripted manual GUI acceptance when automation is unavailable |
+| `api` | `opentest-api`; project API/integration command first, otherwise `pytest` with `httpx`/`requests`, schema checks, fixtures, read-after-write, and cleanup/teardown |
+Do not classify code checks such as unit, component, integration, contract, smoke, or security review as execution surfaces. Use them as evidence layers.
 ## Frontend Acceptance Dimensions
 Frontend or real workflow acceptance may choose applicable items from the following dimensions. Full coverage is not required every time:
@@ -74,7 +90,7 @@ The OpenTest plan phase checks these questions by default. If applicable, add th
 | ACC-001 | User sees success feedback after save | medium | UI acceptance | pending |
 ```
-Add coverage dimension, command, evidence path, and blocker reason columns only when risk or change type requires them.
+Add execution surface, evidence layer, command/tool, evidence path, and blocker reason columns when the matrix drives acceptance execution or risk requires them.
 ## Quality Gate Heuristics

package/assets/skills/opentest/references/complete-testing-workflow.md CHANGED

@@ -14,6 +14,27 @@ plan -> matrix -> fixtures -> tests -> run -> accept -> smoke -> pre-push -> ver
 Unit, component, integration, contract, E2E, smoke, and browser acceptance are evidence layers. They describe how to prove a requirement after or during implementation; they do not decide what the requirement is. If code does not exist yet, keep the acceptance case and mark code-dependent evidence as pending or blocked with a reason.
+## Execution Surfaces
+Every matrix row must name both an execution surface and an evidence layer. The execution surface is where the requirement is exercised; the evidence layer is how the result is proven.
+Before authoring tests, select the fixed asset layout from `opentest/references/test-asset-layout.md`. Option B, the standard framework skeleton, is the default; one-off scripts are allowed only for explicitly non-durable acceptance or blocked investigation.
+Primary execution surfaces are:
+- `web-browser`: browser-rendered pages and web apps
+- `android-app`: Android APK/app GUI on emulator or device
+- `desktop-gui`: native desktop GUI, Electron, Tauri, or similar app UI
+- `api`: HTTP API, RPC, backend workflow, contract, or service endpoint
+Do not use unit, component, integration, contract, smoke, or security review as the execution surface. Those are evidence layers or run gates. If an Android GUI requirement is present, route acceptance through the `android-midscene-pytest` skill when available and require pytest/Midscene/screenshot/logcat evidence. If native desktop GUI behavior is present, route acceptance through `opentest-desktop-gui` and require project GUI automation or `@midscene/computer` evidence plus screenshots, GUI action logs, window/app metadata, and deterministic read-back. If API behavior is present, route acceptance through `opentest-api` and require contract, status code, payload/schema, auth/permission, read-after-write, and cleanup/teardown evidence when applicable.
+For `web-browser`, choose an acceptance mode from `opentest/references/web-browser-testing.md`. MCP and Playwright CLI are immediate acceptance routes; durable regression requires a committed repeatable test such as `@playwright/test`.
+For `desktop-gui`, use `opentest/references/desktop-gui-testing.md`. Electron/Tauri DOM-verifiable flows can stay in `web-browser`; native shell, tray, file picker, menu, OS dialog, installer, updater, RDP, and multi-window behavior stay in `desktop-gui`.
+For `api`, use `opentest/references/api-testing.md`. Project API/integration commands are preferred; without them, use `pytest` with `httpx` or `requests`, schema checks, fixtures, and deterministic read-back.
 ## Test Data
 Create `docs/opentest/fixtures/` for changes that touch data, files, roles, permissions, APIs, or stateful workflows.

package/assets/skills/opentest/references/desktop-gui-testing.md ADDED

@@ -0,0 +1,52 @@
+# Desktop GUI Testing
+Use this reference for `desktop-gui` execution-surface rows.
+Durable desktop GUI assets follow `opentest/references/test-asset-layout.md`: scripts under `tests/desktop/scripts/`, Midscene assets under `tests/desktop/midscene/`, metadata captures under `tests/desktop/metadata/`, and repeatable entry through `scripts/opentest-run-desktop.ps1` or the project GUI command.
+## Tool Routes
+| Route | Use when | Required evidence |
+| --- | --- | --- |
+| project GUI automation | The repository already has a repeatable desktop automation command | command, report/log, screenshot or recording, post-action read-back |
+| `@midscene/computer` | Native desktop controls, weak selectors, visual workflows, multi-window flows, or Windows RDP need AI visual assistance | Midscene/computer run log, screenshots, model env status, window/app metadata, deterministic read-back |
+| accessibility/window metadata | Native controls expose stable accessibility tree, title, process, window handle, or menu state | metadata dump, action log, expected state assertion |
+| scripted manual GUI acceptance | No reliable automation exists and the acceptance is one-off | exact steps, screenshots, window/app metadata, observed result, blocker/risk note |
+## Midscene Desktop Route
+`@midscene/computer` is the Midscene desktop automation package. It can control local Windows, macOS, and Linux desktops, and can control remote Windows desktops through RDP when configured.
+Use it as an OpenTest visual automation layer, not as the whole quality gate:
+```text
+desktop-gui matrix row
+  -> project launch / environment check
+  -> @midscene/computer or project GUI automation
+  -> screenshots + GUI action log + window/app metadata
+  -> deterministic read/assert changed result
+```
+For Electron or Tauri, first decide whether the requirement is DOM-verifiable. DOM-verifiable flows belong to `web-browser`; shell, tray, file picker, native menu, OS dialog, installer, updater, and multi-window behavior belong to `desktop-gui`.
+## Blocking Rules
+Record `blocked` when any required prerequisite is missing:
+- model credentials for Midscene visual automation
+- desktop access, display, or RDP session
+- app launch command or target process/window identity
+- stable fixture data or reset/teardown path
+- deterministic result surface after writes
+Do not mark `desktop-gui` acceptance as PASS from an AI visual assertion alone. After create/update/delete/save actions, re-read a trustworthy result surface: reopened window state, file/config value, app storage/API, accessibility metadata, process/window metadata, or visible persisted value after restart.
+## Matrix Requirements
+`desktop-gui` rows must include:
+- `Execution surface`: `desktop-gui`
+- `Acceptance mode`: `n/a`
+- `Evidence layer`: `gui-acceptance`, `visual-acceptance`, `integration`, or `smoke`
+- `Framework/command`: project GUI command, `@midscene/computer`, accessibility/window metadata script, or scripted manual GUI route
+- `Required evidence`: screenshots or recording, GUI action log, window/app metadata, deterministic read-back, and blocked prerequisites when unavailable
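
Editor's sketch, not part of the published package: the blocking rules and the "no PASS from an AI visual assertion alone" rule above could be combined into one status decision. The prerequisite keys and the function name `desktop_gui_status` are hypothetical, and the sketch assumes the `@midscene/computer` route (which is why model credentials are treated as required).

```python
# Hypothetical prerequisite keys mirroring the Blocking Rules list.
REQUIRED_PREREQUISITES = [
    "model_credentials",
    "desktop_access",
    "launch_command",
    "fixture_reset_path",
    "read_back_surface",
]


def desktop_gui_status(prerequisites, visual_assertion_ok, read_back_ok):
    """Return (status, missing) for one desktop-gui acceptance row.

    prerequisites: dict mapping prerequisite key -> truthy when available.
    visual_assertion_ok: result of the AI visual assertion.
    read_back_ok: result of the deterministic read-back after the write.
    """
    missing = [key for key in REQUIRED_PREREQUISITES if not prerequisites.get(key)]
    if missing:
        # A missing prerequisite blocks the row; it is neither PASS nor FAIL.
        return "blocked", missing
    if visual_assertion_ok and read_back_ok:
        return "PASS", []
    # A visual assertion without a successful read-back is not enough.
    return "FAIL", []
```

For example, a run where the visual assertion succeeds but the reopened window does not show the saved value yields `("FAIL", [])`, and a run without an RDP session yields `("blocked", ["desktop_access"])`.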

package/assets/skills/opentest/references/matrix-format.md CHANGED

@@ -2,11 +2,15 @@
 ## Minimal Columns
-| ID | Requirement source | Intent | Trigger/Input | Expected behavior | Risk | Evidence layer | Required evidence | Status |
-| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| ID | Requirement source | Intent | Execution surface | Acceptance mode | Trigger/Input | Expected behavior | Risk | Evidence layer | Required evidence | Status |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
 `Requirement source` is mandatory. Use a requirement ID, design section, user story, business rule, risk note, issue, or explicit user request. Do not use a function name, component path, or existing test file as the source of acceptance.
+`Execution surface` is mandatory and should be one of `web-browser`, `android-app`, `desktop-gui`, or `api`.
+`Acceptance mode` is mandatory for `web-browser`: `instant-acceptance`, `durable-regression`, or `visual-ai-assist`.
 ## Optional Columns
 - Coverage
@@ -28,4 +32,6 @@ Evidence layers describe how a requirement will be proven. They do not create or
 - `e2e`: cross-page flows, login, permissions, critical business loops.
 - `smoke`: key pages or happy paths do not crash.
 - `browser-acceptance`: real browser interaction, feedback location, responsive and visual state.
+- `visual-acceptance`: visual GUI behavior on Android app or desktop GUI surfaces.
+- `gui-acceptance`: desktop GUI behavior, window state, dialogs, menus, and native controls.
 - `security-review`: permissions, sensitive information, authorization bypass, duplicate submit, injection risk.
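
Editor's sketch, not part of the published package: the new mandatory columns could be checked mechanically before a matrix drives acceptance execution. The function name `validate_row` and the dict-per-row representation are assumptions made for illustration; the allowed values are taken from the added lines above.

```python
SURFACES = {"web-browser", "android-app", "desktop-gui", "api"}
WEB_MODES = {"instant-acceptance", "durable-regression", "visual-ai-assist"}
# Code checks are evidence layers, so they must not appear as a surface.
LAYERS_NOT_SURFACES = {
    "unit", "component", "integration", "contract", "smoke", "security-review",
}


def validate_row(row):
    """Return a list of problems for one matrix row given as a dict of column -> value."""
    problems = []
    if not row.get("Requirement source"):
        problems.append("Requirement source is mandatory")
    surface = row.get("Execution surface")
    if surface in LAYERS_NOT_SURFACES:
        problems.append(surface + " is an evidence layer, not an execution surface")
    elif surface not in SURFACES:
        problems.append("Execution surface must be one of " + ", ".join(sorted(SURFACES)))
    if surface == "web-browser" and row.get("Acceptance mode") not in WEB_MODES:
        problems.append("Acceptance mode is mandatory for web-browser rows")
    if not row.get("Evidence layer"):
        problems.append("Evidence layer is mandatory")
    return problems
```

A row such as `{"Requirement source": "REQ-001", "Execution surface": "api", "Acceptance mode": "n/a", "Evidence layer": "contract"}` produces an empty list, while a row that names `integration` as its surface is reported.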

package/assets/skills/opentest/references/test-asset-layout.md ADDED

@@ -0,0 +1,64 @@
+# Test Asset Layout
+OpenTest default test assets use Option B: a standard framework skeleton. Do not decide directories ad hoc during `author`.
+## Default Layout
+```text
+tests/
+  api/
+    conftest.py
+    clients/
+    fixtures/
+    schemas/
+    test_contract_*.py
+    test_permissions_*.py
+    test_crud_*.py
+  web/
+    playwright/
+    midscene/
+  android/
+    tests_py/
+    midscene/
+  desktop/
+    scripts/
+    midscene/
+    metadata/
+docs/opentest/
+  matrix.md
+  fixtures/
+  acceptance/
+  runs/
+  reports/
+scripts/
+  opentest-run-api.ps1
+  opentest-run-web.ps1
+  opentest-run-android.ps1
+  opentest-run-desktop.ps1
+```
+Use the closest existing project directories when they already exist, but keep the same logical slots: surface tests under `tests/<surface>/`, evidence under `docs/opentest/`, and repeatable entry scripts under `scripts/opentest-run-*.ps1` or the repository's equivalent command.
+## Shape Rules
+- The default is not a large QA platform. It is a stable skeleton for repeatable tests, fixtures, reports, and run entry points.
+- Do not create one-off scripts as the durable path. One-off scripts may be used only for `instant-acceptance` or blocked investigation evidence.
+- Do not create a separate top-level QA project unless the user explicitly chooses a team-scale template.
+- `author` creates or updates assets only inside the chosen layout.
+- `run` invokes fixed entry commands, not newly invented paths.
+- `accept` records evidence under `docs/opentest/acceptance/` and run artifacts under `docs/opentest/runs/` or `docs/opentest/reports/`.
+## Surface Mapping
+| Surface | Durable test location | Default run entry |
+| --- | --- | --- |
+| `api` | `tests/api/` | `scripts/opentest-run-api.ps1` or `python -m pytest tests/api -v` |
+| `web-browser` | `tests/web/playwright/` and `tests/web/midscene/` | `scripts/opentest-run-web.ps1` or project E2E command |
+| `android-app` | `tests/android/tests_py/` and `tests/android/midscene/` | `scripts/opentest-run-android.ps1` or `python -m pytest tests_py -v` in existing Android harness |
+| `desktop-gui` | `tests/desktop/scripts/`, `tests/desktop/midscene/`, `tests/desktop/metadata/` | `scripts/opentest-run-desktop.ps1` or project GUI command |
+## When To Use A Lighter Shape
+Use a lighter script-only shape only when the matrix row is explicitly one-off, exploratory, or blocked. Mark it as not durable regression and record the reason in `gap/blocker`.
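
Editor's sketch, not part of the published package: the default layout above could be scaffolded with a few lines of Python. The function name `scaffold` is hypothetical; the directory list is copied from the layout tree, and existing directories are left untouched, in line with the instruction to reuse project directories that already exist. Files such as `conftest.py` and the `opentest-run-*.ps1` entry scripts are deliberately not generated here.

```python
from pathlib import Path

# Directory slots copied from the Default Layout tree.
LAYOUT_DIRS = [
    "tests/api/clients",
    "tests/api/fixtures",
    "tests/api/schemas",
    "tests/web/playwright",
    "tests/web/midscene",
    "tests/android/tests_py",
    "tests/android/midscene",
    "tests/desktop/scripts",
    "tests/desktop/midscene",
    "tests/desktop/metadata",
    "docs/opentest/fixtures",
    "docs/opentest/acceptance",
    "docs/opentest/runs",
    "docs/opentest/reports",
    "scripts",
]


def scaffold(root):
    """Create any missing layout directories under root; return the ones created."""
    root = Path(root)
    created = []
    for rel in LAYOUT_DIRS:
        target = root / rel
        if not target.exists():
            target.mkdir(parents=True)  # parents=True also creates tests/, docs/, ...
            created.append(rel)
    return created
```

Running `scaffold(".")` twice in the same repository creates the skeleton on the first call and returns an empty list on the second.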

package/assets/skills/opentest/references/test-surfaces.md ADDED

@@ -0,0 +1,101 @@
+# Test Surfaces
+OpenTest classifies acceptance execution by surface and evidence layer:
+- `Execution surface` is where the requirement is exercised from the user's or caller's point of view.
+- `Evidence layer` is how the requirement is proven, for example unit, integration, contract, e2e, smoke, or security review.
+- Test assets use the fixed layout in `opentest/references/test-asset-layout.md`; do not invent directories during authoring.
+Keep the primary execution surface to one of these four values:
+| Surface | Use when | Default acceptance path | Required artifacts |
+| --- | --- | --- | --- |
+| `web-browser` | Browser-rendered web pages, web apps, admin consoles, SaaS, dashboards | Use `opentest-web-browser`: Playwright MCP first, Playwright CLI fallback, `@playwright/test` for durable regression, Midscene only for visual assist | screenshots, snapshots, post-submit assertions, console/network notes, trace/report when durable |
+| `android-app` | Android APK or Android app GUI on emulator/device | Use the `android-midscene-pytest` skill when available: pytest orchestrates, `@midscene/android` executes visual automation, ADB/emulator controls device | pytest report, Midscene HTML report, screenshots, logcat, device/app metadata |
+| `desktop-gui` | Native desktop GUI or Electron/Tauri/Windows/macOS/Linux app UI | Use `opentest-desktop-gui`: project GUI automation first; `@midscene/computer` for visual desktop automation, weak selectors, native controls, multi-window flows, or RDP; scripted manual GUI acceptance only when automation is unavailable | screenshots or recording, GUI action log, window/app metadata, deterministic read-back, failure capture |
+| `api` | HTTP API, RPC, backend workflow, contract, service endpoint | Use `opentest-api`: project API/integration command first; otherwise `pytest` with `httpx` or `requests`, schema checks, fixtures, and read-after-write evidence | request/response records, status codes, payload/schema assertions, auth/permission results, data consistency, cleanup/teardown, logs |
+Do not invent a fifth primary surface. Code checks such as `unit`, `component`, `integration`, `contract`, `smoke`, and `security-review` are evidence layers or run gates, not primary surfaces.
+## Surface vs Evidence Layer
+Examples:
+| Requirement | Execution surface | Evidence layer |
+| --- | --- | --- |
+| User submits a web form and sees the saved item | `web-browser` | `browser-acceptance` + `integration` |
+| Android user creates a task in the app | `android-app` | `e2e` + `visual-acceptance` |
+| Desktop app opens a settings dialog and saves a preference | `desktop-gui` | `gui-acceptance` + `integration` |
+| Client creates an entity through REST API | `api` | `contract` + `integration` |
+## Android App Surface
+For Android GUI work, route through `android-midscene-pytest` when installed:
+```text
+python -m pytest tests_py -v
+  -> npm/Vitest wrapper
+  -> @midscene/android
+  -> ADB + Android emulator/device
+  -> screenshots + logcat + Midscene HTML report
+```
+Route selection:
+- Stable automation, demos, and repeatable reports: use `pytest -> npm/Vitest -> @midscene/android -> ADB`.
+- One-off natural-language exploration: Midscene YAML runner is optional, but it must not replace the pytest entry.
+- Agent-controlled Android: Midscene MCP is optional only when separately configured with `MIDSCENE_MCP_ANDROID_MODE=true`; do not write global MCP config automatically.
+- Pure Python stack: evaluate `midscene-python` only when the user explicitly asks.
+Layered run:
+- User-facing entry is `python -m pytest tests_py -v`.
+- pytest should check ADB, prepare emulator/device, install APK, run ADB smoke, and collect evidence before Midscene.
+- Run `npm run test:android` only when model environment variables are complete.
+- Run `npm run test:android` directly only to debug the Midscene layer.
+If Midscene model credentials, ADB, emulator/device, APK path, or package name are missing, record `blocked` with the exact missing prerequisite. Do not mark Android GUI acceptance as pass from a static screenshot alone.
+Failure evidence should include any available `midscene_run/log/ai-call.log`, `midscene_run/log/agent.log`, `midscene_run/log/android-device.log`, and `midscene_run/report/*.html`.
+## Web Browser Surface
+For `web-browser`, read `opentest/references/web-browser-testing.md` or use `opentest-web-browser`.
+Set `Acceptance mode`:
+- `instant-acceptance`: Playwright MCP first, Playwright CLI fallback.
+- `durable-regression`: `@playwright/test` or the repository's existing E2E framework.
+- `visual-ai-assist`: Midscene for weak selectors, canvas, cross-frame UI, or visual matching.
+Do not treat MCP or Playwright CLI evidence as durable regression by itself.
+## Desktop GUI Surface
+For `desktop-gui`, read `opentest/references/desktop-gui-testing.md` or use `opentest-desktop-gui`.
+Route selection:
+- Prefer explicit project GUI automation when the repository already provides a repeatable command.
+- Use `@midscene/computer` for native controls, visual workflows, weak selectors, multi-window flows, or Windows RDP that needs AI visual assistance.
+- For Electron or Tauri, use `web-browser` when the requirement is DOM-verifiable; keep native shell, tray, file picker, menu, OS dialog, installer, updater, and multi-window behavior under `desktop-gui`.
+- Scripted manual GUI acceptance is a fallback for one-off evidence, not durable regression.
+Do not record `desktop-gui` as PASS from an AI visual assertion alone. Save/create/update/delete flows must include screenshots or recording, GUI action log, window/app metadata, and a deterministic read/assert changed result after reopening, restarting, or reading a trusted app/file/config state.
+## API Surface
+For `api`, read `opentest/references/api-testing.md` or use `opentest-api`.
+Route selection:
+- Prefer explicit project API, integration, contract, or smoke commands.
+- If no project command exists, default to `python -m pytest tests/api -v` with `httpx` or `requests`, `jsonschema` or existing Pydantic/DTO models, and pytest fixtures for seed/teardown.
+- Use OpenAPI, protobuf, schema files, DTOs, serializers, typed clients, or requirement docs as contract sources.
+- Mock or stub third-party APIs unless the requirement explicitly needs live external services.
+Do not record `api` as PASS from a 2xx response alone. API writes must include request/response records, payload/schema assertions, auth/permission checks when applicable, read-after-write/data consistency, and cleanup or teardown proof.
+## Matrix Rule
+Every matrix row must include both `Execution surface` and `Evidence layer`. `web-browser` rows must also include `Acceptance mode`. If a requirement needs more than one surface or mode, split it into separate rows or state the primary surface and add secondary evidence in `Required evidence`.

package/assets/skills/opentest/references/web-browser-testing.md ADDED

@@ -0,0 +1,40 @@
+# Web Browser Testing
+Use this reference for `web-browser` execution-surface rows.
+Durable web assets follow `opentest/references/test-asset-layout.md`: Playwright tests under `tests/web/playwright/`, Midscene visual assists under `tests/web/midscene/`, and repeatable entry through `scripts/opentest-run-web.ps1` or the repository's existing E2E command.
+## Acceptance Modes
+| Mode | Use when | Default tool | Required evidence |
+| --- | --- | --- | --- |
+| `instant-acceptance` | Prove the current change in a real browser now | Playwright MCP first, Playwright CLI fallback | snapshots, action steps, post-submit assertion, screenshot, console/network notes |
+| `durable-regression` | The workflow must run repeatedly in CI or future releases | `@playwright/test` or existing E2E framework | committed test file, deterministic locators/assertions, command, report/trace path |
+| `visual-ai-assist` | Selectors cannot reliably prove the UI state | Midscene plus Playwright or project browser driver | Midscene report, screenshot, and deterministic read/assert result |
+## Tool Rules
+- Playwright MCP and Playwright CLI are immediate acceptance tools. They are useful for live exploration and proof, but they are not durable regression by themselves.
+- `@playwright/test` or the project's existing E2E framework is the default durable regression path.
+- Midscene is a supplemental AI visual UI automation layer for weak selectors, canvas, cross-frame UI, visual matching, or natural-language exploration.
+- Do not record `visual-ai-assist` as PASS from an AI assertion alone. Re-read a trustworthy result surface after writes.
+## Required Web Write Chain
+```text
+open -> snapshot -> fill/input -> click(submit/confirm) -> snapshot -> read/assert changed result -> screenshot -> PASS/FAIL
+```
+PASS must name the changed value and where it was read back: page, list, detail view, API response, storage record, or logs.
+## Matrix Fields
+For `web-browser`, include:
+- `Execution surface`: `web-browser`
+- `Acceptance mode`: `instant-acceptance`, `durable-regression`, or `visual-ai-assist`
+- `Evidence layer`: `browser-acceptance`, `e2e`, `visual-acceptance`, `integration`, or `smoke`
+- `Framework/command`: MCP, Playwright CLI, `npx playwright test`, project E2E command, or Midscene route
+- `Required evidence`: snapshots, screenshot, post-submit assertion, report/trace, console/network notes, or Midscene report
+If a feature needs both immediate acceptance and durable regression, split it into two rows.
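
Editor's sketch, not part of the published package: the required web write chain and the rule that PASS must name the changed value and its read-back location could be checked against a recorded action log. The action names, the `web_write_verdict` function, and the idea of a flat list of action strings are all assumptions for illustration; the reference itself does not define a log format.

```python
# Simplified step names for: open -> snapshot -> fill/input -> click(submit/confirm)
# -> snapshot -> read/assert changed result -> screenshot.
REQUIRED_CHAIN = ["open", "snapshot", "fill", "click", "snapshot", "assert", "screenshot"]
READ_BACK_SURFACES = {
    "page", "list", "detail view", "API response", "storage record", "logs",
}


def follows_chain(actions):
    """True when REQUIRED_CHAIN appears in order (extra actions in between are allowed)."""
    remaining = iter(actions)
    # `step in remaining` advances the iterator, so order is enforced.
    return all(step in remaining for step in REQUIRED_CHAIN)


def web_write_verdict(actions, changed_value, read_back_from):
    """Return PASS only when the chain was followed and the read-back is named."""
    if not follows_chain(actions):
        return "FAIL"
    if not changed_value or read_back_from not in READ_BACK_SURFACES:
        return "FAIL"
    return "PASS"
```

A log that skips the second snapshot, or a verdict that does not say where the changed value was read back, is reported as `"FAIL"` under this sketch.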

package/assets/skills/opentest/templates/acceptance-template.md CHANGED

@@ -5,7 +5,9 @@
 - intent:
 - context:
 - actor:
-- execution surface:
+- execution surface: web-browser | android-app | desktop-gui | api
+- acceptance mode:
+- evidence layer:
 - trigger/input:
 - expected feedback location:
 - status: pending

package/assets/skills/opentest/templates/api-acceptance-template.md ADDED

@@ -0,0 +1,44 @@
+# API Acceptance
+## ACC-API-001
+- execution surface: api
+- acceptance mode: n/a
+- tool route: project API command | pytest + httpx/requests | curl/httpie | Postman/Newman | contract tool
+- evidence layer: contract | integration | smoke | security-review
+- base URL:
+- auth/role:
+- fixture/seed:
+- teardown:
+- status: pending
+### Request
+- method:
+- path:
+- headers:
+- query:
+- body:
+### Expected Response
+- status code:
+- headers:
+- schema/source:
+- payload assertions:
+- error contract:
+### Read-Back Contract
+- API read endpoint:
+- DB/storage/log/event assertion:
+- idempotency/retry assertion:
+- cleanup assertion:
+### Evidence
+- status:
+- request/response record:
+- report path:
+- artifacts:
+- blockers:

package/assets/skills/opentest/templates/desktop-gui-acceptance-template.md ADDED

@@ -0,0 +1,43 @@
+# Desktop GUI Acceptance
+## ACC-Desktop-001
+- execution surface: desktop-gui
+- acceptance mode: n/a
+- tool route: project GUI automation | @midscene/computer | accessibility/window metadata | scripted manual GUI acceptance
+- evidence layer: gui-acceptance | visual-acceptance | integration | smoke
+- target app/window:
+- launch command:
+- fixture/reset:
+- status: pending
+### Environment
+- OS/display/RDP:
+- model env status:
+- app version/build:
+- target process/window metadata:
+### Steps
+1.
+### Expected Outcome
+-
+### Read-Back Contract
+- persisted result surface:
+- reopen/restart check:
+- accessibility/window metadata assertion:
+- file/config/app-state assertion:
+### Evidence
+- status:
+- screenshots/recording:
+- GUI action log:
+- window/app metadata:
+- Midscene/computer report or log:
+- blockers:

package/assets/skills/opentest/templates/matrix-template.md CHANGED

@@ -1,13 +1,14 @@
 # Acceptance-to-Test Matrix
-| ID | Requirement source | Intent | Coverage dimension | Trigger/Input | Expected behavior | Risk | Evidence layer | Framework/command | Required evidence | Gap/blocker | Status |
-| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| ACC-001 | REQ-001 / user story | create succeeds | create | valid fixture entity | entity is created and visible in list/detail | high | integration + acceptance | `python -m pytest` + real workflow | create evidence, UI/API/DB assertion, fixture used | none | pending |
-| ACC-002 | REQ-001 / design list view | read/list/detail succeeds | read/list/detail | seeded entity | list, search/filter, and detail show correct data | high | integration + acceptance | `python -m pytest` + real workflow | read/list/detail evidence and data consistency check | none | pending |
-| ACC-003 | REQ-002 / edit workflow | update succeeds | update | edited fixture entity | updated values persist after read back | high | integration + acceptance | `python -m pytest` + real workflow | update evidence and read-back assertion | none | pending |
-| ACC-004 | REQ-003 / delete rule | delete succeeds safely | delete | existing fixture entity | confirm/cancel behavior works; deleted item disappears after confirmation | high | integration + acceptance | `python -m pytest` + real workflow | delete evidence, cancel evidence, post-delete read check | none | pending |
-| ACC-005 | Risk: invalid and unauthorized input | failure and boundary paths are handled | failure/boundary | invalid, empty, duplicate, unauthorized, or stale fixture data | clear feedback without corrupting data | high | unit/integration/acceptance | `python -m pytest` + acceptance | validation, permission, duplicate, stale-state evidence | none | pending |
-| ACC-006 | Business rule: data consistency | data consistency holds | data consistency | create/update/delete flow | UI, API, database/storage, files, and logs agree | high | integration | project command or `python -m pytest` | consistency evidence across surfaces | none | pending |
-| ACC-007 | User workflow: full CRUD | end-to-end CRUD flow works | end-to-end CRUD | create -> list -> detail -> update -> read back -> delete | full user workflow completes and leaves clean state | high | E2E/acceptance | browser/API workflow | full-chain steps, screenshots/logs, cleanup evidence | none | pending |
-| ACC-008 | Quality gate requirement | smoke gate passes | smoke | app starts and core entry points open | core route/API/CRUD happy path does not crash | high | smoke | project smoke command or targeted workflow | smoke_report path | none | pending |
-| ACC-009 | Delivery gate requirement | pre-push gate passes | pre-push | staged change before push | format/lint/type/unit/integration/smoke/diff checks pass or block push | high | pre-push | project command sequence | pre_push_report path | none | pending |
+| ID | Requirement source | Intent | Execution surface | Acceptance mode | Coverage dimension | Trigger/Input | Expected behavior | Risk | Evidence layer | Framework/command | Required evidence | Gap/blocker | Status |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| ACC-001 | REQ-001 / user story | create succeeds in web UI | web-browser | instant-acceptance | create | valid fixture entity | entity is created and visible in list/detail | high | browser-acceptance + integration | Playwright MCP / Playwright CLI + project command | create evidence, UI/API/DB assertion, fixture used | none | pending |
+| ACC-002 | REQ-WEB-REG-001 / regression need | web CRUD has durable regression | web-browser | durable-regression | read/list/detail + end-to-end CRUD | committed E2E test | create -> list -> detail -> update -> read back -> delete is repeatable | high | e2e | `npx playwright test` or project E2E command | test file, trace/report, stable locator assertions | none | pending |
+| ACC-003 | REQ-WEB-VIS-001 / visual risk | visual web state is verified | web-browser | visual-ai-assist | visual behavior | weak selector or canvas UI | Midscene helps locate UI and deterministic read-back proves result | medium | visual-acceptance | Midscene + Playwright route | Midscene report, screenshot, read/assert result | none | pending |
+| ACC-004 | REQ-ANDROID-001 / mobile flow | create succeeds in Android app | android-app | n/a | create | APK installed on emulator/device | task is created and visible after app refresh | high | visual-acceptance + e2e | `android-midscene-pytest` / `python -m pytest tests_py -v` | pytest report, ADB smoke, Midscene HTML report, screenshot, logcat, `midscene_run` logs | missing ADB/APK/device/package/model env if unavailable | pending |
+| ACC-005 | REQ-DESKTOP-001 / settings workflow | desktop setting saves | desktop-gui | n/a | update | desktop app window is open | setting persists after save and reopen | high | gui-acceptance + integration | `opentest-desktop-gui` / project GUI automation / `@midscene/computer` | screenshot or recording, GUI action log, app/window metadata, deterministic read-back, Midscene/computer report when used | missing display/RDP/app launch/window identity/model env/result surface if unavailable | pending |
+| ACC-006 | REQ-API-001 / contract | API create succeeds | api | n/a | create | valid request payload | response status and payload confirm created entity | high | contract + integration | `opentest-api` / project API command / `python -m pytest tests/api -v` | request/response record, status code, payload/schema assertion, read-after-write, data consistency, cleanup/teardown | missing base URL/auth/fixture/seed/teardown/dependency/schema/result surface if unavailable | pending |
+| ACC-007 | Risk: invalid and unauthorized input | failure and boundary paths are handled | api | n/a | failure/boundary | invalid, empty, duplicate, unauthorized, expired, forbidden, or stale fixture data | clear error contract without corrupting data | high | contract + security-review | `opentest-api` / project command / `python -m pytest tests/api -v` | validation, auth/permission, duplicate/idempotency, stale-state, error schema, sensitive-field evidence | none | pending |
+| ACC-007A | Risk: list contract drift | API list/search behavior is stable | api | n/a | list/filter/sort/pagination | seeded collection and query params | pagination, filtering, sorting, and empty result follow contract | medium | contract + integration | `opentest-api` / project API command | request/response record, schema assertion, pagination metadata, fixture isolation | not applicable if no list endpoint changed | pending |
+| ACC-008 | Quality gate requirement | smoke gate passes | web-browser | instant-acceptance | smoke | app starts and core entry points open | core route/API/CRUD happy path does not crash | high | smoke | project smoke command or targeted workflow | smoke_report path | none | pending |
+| ACC-009 | Delivery gate requirement | pre-push gate passes | api | n/a | pre-push | staged change before push | format/lint/type/unit/integration/smoke/diff checks pass or block push | high | pre-push | project command sequence | pre_push_report path | none | pending |
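The new matrix header adds two columns, `Execution surface` and `Acceptance mode`, so every row now has 14 cells. A minimal sketch of how a project might check a filled-in matrix against that shape is shown below. The function names `parse_matrix` and `check_row` are hypothetical and not part of the package. The allowed values are an assumption inferred only from the example rows in this template, and the package may accept others.

```python
from typing import Dict, List

# Header columns copied from the 0.1.11 matrix template (14 columns).
COLUMNS = [
    "ID", "Requirement source", "Intent", "Execution surface", "Acceptance mode",
    "Coverage dimension", "Trigger/Input", "Expected behavior", "Risk",
    "Evidence layer", "Framework/command", "Required evidence", "Gap/blocker", "Status",
]

# Assumption: these sets are inferred from the template's example rows only.
SURFACES = {"web-browser", "android-app", "desktop-gui", "api"}
MODES = {"instant-acceptance", "durable-regression", "visual-ai-assist", "n/a"}


def parse_matrix(markdown: str) -> List[Dict[str, str]]:
    """Parse a pipe table into row dicts, skipping header and separator rows.

    Simplification: cells are split on every "|", so a cell that itself
    contains a pipe character is not supported.
    """
    rows = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        is_separator = all(cell and set(cell) <= set("-:") for cell in cells)
        if cells == COLUMNS or is_separator:
            continue
        if len(cells) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}: {line}")
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def check_row(row: Dict[str, str]) -> List[str]:
    """Return a list of problems for one row; an empty list means the row passed."""
    problems = []
    if row["Execution surface"] not in SURFACES:
        problems.append(f"{row['ID']}: unknown execution surface {row['Execution surface']!r}")
    if row["Acceptance mode"] not in MODES:
        problems.append(f"{row['ID']}: unknown acceptance mode {row['Acceptance mode']!r}")
    return problems


# Sample input: header, separator, and the ACC-008 row from the template.
SAMPLE = """
| ID | Requirement source | Intent | Execution surface | Acceptance mode | Coverage dimension | Trigger/Input | Expected behavior | Risk | Evidence layer | Framework/command | Required evidence | Gap/blocker | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ACC-008 | Quality gate requirement | smoke gate passes | web-browser | instant-acceptance | smoke | app starts and core entry points open | core route/API/CRUD happy path does not crash | high | smoke | project smoke command or targeted workflow | smoke_report path | none | pending |
"""

if __name__ == "__main__":
    for row in parse_matrix(SAMPLE):
        for problem in check_row(row):
            print(problem)
```

A row still written in the 0.1.10 layout (12 cells) raises `ValueError` here, which is one way to catch matrices that were not migrated to the new header.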

package/assets/skills/opentest/templates/plan-template.md CHANGED

@@ -24,5 +24,5 @@
 ## Evidence Plan
-| Evidence layer | Applicable scenario | Command or execution surface | Artifact path | Status |
-| --- | --- | --- | --- | --- |
+| Execution surface | Acceptance mode | Evidence layer | Applicable scenario | Command/tool | Artifact path | Status |
+| --- | --- | --- | --- | --- | --- | --- |