npm - @pzy560117/opentest - Versions diffs - 0.1.9 → 0.1.11 - Mend

@pzy560117/opentest 0.1.9 → 0.1.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (47)

package/assets/skills/opentest/templates/matrix-template.md CHANGED

@@ -1,13 +1,14 @@
 # Acceptance-to-Test Matrix
-| ID | Intent | Coverage dimension | Trigger/Input | Expected behavior | Risk | Evidence layer | Framework/command | Required evidence | Gap/blocker | Status |
-| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
-| ACC-001 | create succeeds | create | valid fixture entity | entity is created and visible in list/detail | high | integration + acceptance | `python -m pytest` + real workflow | create evidence, UI/API/DB assertion, fixture used | none | pending |
-| ACC-002 | read/list/detail succeeds | read/list/detail | seeded entity | list, search/filter, and detail show correct data | high | integration + acceptance | `python -m pytest` + real workflow | read/list/detail evidence and data consistency check | none | pending |
-| ACC-003 | update succeeds | update | edited fixture entity | updated values persist after read back | high | integration + acceptance | `python -m pytest` + real workflow | update evidence and read-back assertion | none | pending |
-| ACC-004 | delete succeeds safely | delete | existing fixture entity | confirm/cancel behavior works; deleted item disappears after confirmation | high | integration + acceptance | `python -m pytest` + real workflow | delete evidence, cancel evidence, post-delete read check | none | pending |
-| ACC-005 | failure and boundary paths are handled | failure/boundary | invalid, empty, duplicate, unauthorized, or stale fixture data | clear feedback without corrupting data | high | unit/integration/acceptance | `python -m pytest` + acceptance | validation, permission, duplicate, stale-state evidence | none | pending |
-| ACC-006 | data consistency holds | data consistency | create/update/delete flow | UI, API, database/storage, files, and logs agree | high | integration | project command or `python -m pytest` | consistency evidence across surfaces | none | pending |
-| ACC-007 | end-to-end CRUD flow works | end-to-end CRUD | create -> list -> detail -> update -> read back -> delete | full user workflow completes and leaves clean state | high | E2E/acceptance | browser/API workflow | full-chain steps, screenshots/logs, cleanup evidence | none | pending |
-| ACC-008 | smoke gate passes | smoke | app starts and core entry points open | core route/API/CRUD happy path does not crash | high | smoke | project smoke command or targeted workflow | smoke_report path | none | pending |
-| ACC-009 | pre-push gate passes | pre-push | staged change before push | format/lint/type/unit/integration/smoke/diff checks pass or block push | high | pre-push | project command sequence | pre_push_report path | none | pending |
+| ID | Requirement source | Intent | Execution surface | Acceptance mode | Coverage dimension | Trigger/Input | Expected behavior | Risk | Evidence layer | Framework/command | Required evidence | Gap/blocker | Status |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| ACC-001 | REQ-001 / user story | create succeeds in web UI | web-browser | instant-acceptance | create | valid fixture entity | entity is created and visible in list/detail | high | browser-acceptance + integration | Playwright MCP / Playwright CLI + project command | create evidence, UI/API/DB assertion, fixture used | none | pending |
+| ACC-002 | REQ-WEB-REG-001 / regression need | web CRUD has durable regression | web-browser | durable-regression | read/list/detail + end-to-end CRUD | committed E2E test | create -> list -> detail -> update -> read back -> delete is repeatable | high | e2e | `npx playwright test` or project E2E command | test file, trace/report, stable locator assertions | none | pending |
+| ACC-003 | REQ-WEB-VIS-001 / visual risk | visual web state is verified | web-browser | visual-ai-assist | visual behavior | weak selector or canvas UI | Midscene helps locate UI and deterministic read-back proves result | medium | visual-acceptance | Midscene + Playwright route | Midscene report, screenshot, read/assert result | none | pending |
+| ACC-004 | REQ-ANDROID-001 / mobile flow | create succeeds in Android app | android-app | n/a | create | APK installed on emulator/device | task is created and visible after app refresh | high | visual-acceptance + e2e | `android-midscene-pytest` / `python -m pytest tests_py -v` | pytest report, ADB smoke, Midscene HTML report, screenshot, logcat, `midscene_run` logs | missing ADB/APK/device/package/model env if unavailable | pending |
+| ACC-005 | REQ-DESKTOP-001 / settings workflow | desktop setting saves | desktop-gui | n/a | update | desktop app window is open | setting persists after save and reopen | high | gui-acceptance + integration | `opentest-desktop-gui` / project GUI automation / `@midscene/computer` | screenshot or recording, GUI action log, app/window metadata, deterministic read-back, Midscene/computer report when used | missing display/RDP/app launch/window identity/model env/result surface if unavailable | pending |
+| ACC-006 | REQ-API-001 / contract | API create succeeds | api | n/a | create | valid request payload | response status and payload confirm created entity | high | contract + integration | `opentest-api` / project API command / `python -m pytest tests/api -v` | request/response record, status code, payload/schema assertion, read-after-write, data consistency, cleanup/teardown | missing base URL/auth/fixture/seed/teardown/dependency/schema/result surface if unavailable | pending |
+| ACC-007 | Risk: invalid and unauthorized input | failure and boundary paths are handled | api | n/a | failure/boundary | invalid, empty, duplicate, unauthorized, expired, forbidden, or stale fixture data | clear error contract without corrupting data | high | contract + security-review | `opentest-api` / project command / `python -m pytest tests/api -v` | validation, auth/permission, duplicate/idempotency, stale-state, error schema, sensitive-field evidence | none | pending |
+| ACC-007A | Risk: list contract drift | API list/search behavior is stable | api | n/a | list/filter/sort/pagination | seeded collection and query params | pagination, filtering, sorting, and empty result follow contract | medium | contract + integration | `opentest-api` / project API command | request/response record, schema assertion, pagination metadata, fixture isolation | not applicable if no list endpoint changed | pending |
+| ACC-008 | Quality gate requirement | smoke gate passes | web-browser | instant-acceptance | smoke | app starts and core entry points open | core route/API/CRUD happy path does not crash | high | smoke | project smoke command or targeted workflow | smoke_report path | none | pending |
+| ACC-009 | Delivery gate requirement | pre-push gate passes | api | n/a | pre-push | staged change before push | format/lint/type/unit/integration/smoke/diff checks pass or block push | high | pre-push | project command sequence | pre_push_report path | none | pending |

package/assets/skills/opentest/templates/plan-template.md CHANGED

@@ -24,5 +24,5 @@
 ## Evidence Plan
-| Evidence layer | Applicable scenario | Command or execution surface | Artifact path | Status |
-| --- | --- | --- | --- | --- |
+| Execution surface | Acceptance mode | Evidence layer | Applicable scenario | Command/tool | Artifact path | Status |
+| --- | --- | --- | --- | --- | --- | --- |

package/assets/skills/opentest/templates/web-acceptance-template.md ADDED

@@ -0,0 +1,27 @@
+# Web Browser Acceptance
+- ACC ID:
+- execution surface: web-browser
+- acceptance mode: instant-acceptance | durable-regression | visual-ai-assist
+- tool route: Playwright MCP | Playwright CLI | @playwright/test | Midscene
+- page/route:
+- actor/role:
+- fixture:
+## Steps
+1. open:
+2. snapshot:
+3. fill/input:
+4. submit/confirm:
+5. snapshot after submit:
+6. read/assert changed result:
+7. screenshot/report:
+## Evidence
+- status:
+- changed result asserted:
+- artifact paths:
+- console/network notes:
+- blocked reason:

package/assets/skills/opentest-accept/SKILL.md CHANGED

@@ -5,19 +5,28 @@ description: "OpenTest phase 4: execute natural language, MCP, or real workflow
 # OpenTest Accept
-Execute required acceptance items and write PASS, FAIL, or blocked evidence back to cases and the matrix.
+Write PASS, FAIL, or blocked evidence to cases and matrix.
 ## Required references
 - `opentest/references/acceptance-evidence.md`
 - `opentest/references/complete-testing-workflow.md`
+- `opentest/references/test-surfaces.md`
+- `opentest/references/web-browser-testing.md`
+- `opentest/references/desktop-gui-testing.md`
+- `opentest/references/api-testing.md`
 - `opentest/templates/acceptance-template.md`
+- `opentest/templates/desktop-gui-acceptance-template.md`
+- `opentest/templates/api-acceptance-template.md`
 ## Steps
 1. Read the matrix, fixtures, and `docs/opentest/acceptance/`.
-2. For frontend interactions, prefer Chrome DevTools MCP and observe the real rendered UI.
-3. For API/backend workflows, use project commands or direct API checks.
-4. For CRUD/data changes, execute the full chain from the workflow reference.
-5. Record feedback location/shape, artifacts, blocked evidence, and matching ACC IDs.
-6. Update acceptance records and run `bash "$OPENTEST_GUARD" accept --apply`.
+2. Select the acceptance tool from the matrix execution surface.
+3. For `web-browser`, use `opentest-web-browser`: Playwright MCP/CLI, `@playwright/test` for durable regression, Midscene only for visual assist.
+4. For `android-app`, use `android-midscene-pytest`: `python -m pytest tests_py -v`, ADB smoke, Midscene HTML, logcat, and `midscene_run`; block missing prerequisites.
+5. For `desktop-gui`, use `opentest-desktop-gui`: project GUI automation or `@midscene/computer`, screenshot/recording, metadata, and read-back.
+6. For `api`, use `opentest-api`: project API command or `pytest` with `httpx`/`requests`, schema checks, fixtures, read-after-write, and cleanup/teardown.
+7. For CRUD/data changes, execute the full chain from the workflow reference.
+8. Record feedback location/shape, artifacts, blocked evidence, and matching ACC IDs.
+9. Update acceptance records and run `bash "$OPENTEST_GUARD" accept --apply`.

package/assets/skills/opentest-api/SKILL.md ADDED

@@ -0,0 +1,25 @@
+---
+name: opentest-api
+description: "OpenTest API execution surface adapter. Use when planning, authoring, running, or accepting HTTP API, RPC, backend workflow, contract, auth, data consistency, idempotency, and integration evidence."
+---
+# OpenTest API
+Use this adapter for `api` matrix rows.
+## Required references
+- `opentest/references/api-testing.md`
+- `opentest/templates/api-acceptance-template.md`
+## Route
+1. Prefer the repository's existing API/integration test command when it is explicit and repeatable.
+2. If no project command exists, default to `pytest` with `httpx` or `requests`, schema checks via `jsonschema` or existing Pydantic/DTO models, and fixtures for seed/teardown.
+3. Use OpenAPI, protobuf, or existing contract docs as the contract source when present; otherwise write the expected status, headers, response schema, and business payload in the acceptance case.
+4. Mock or stub third-party APIs unless the requirement explicitly needs a live external dependency.
+5. Mark blocked when base URL, auth token, fixture data, seed/teardown, dependency service, or stable read-back surface is missing.
+## Evidence Contract
+PASS requires request/response records, status code, response payload or schema assertion, auth/permission result when applicable, data consistency/read-after-write evidence, and cleanup or teardown proof. API smoke only proves liveness; it does not replace contract, boundary, permission, or data consistency evidence for high-risk changes.
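
The evidence contract above can be sketched as a small acceptance helper. This is only an illustration under stated assumptions: `FakeTaskApi` is a hypothetical in-memory stand-in for a real `httpx` or `requests` client, and the task resource, field names, and status codes are invented for the example. The point it shows is that a 2xx on create is never enough on its own; PASS also needs a schema check, a read-after-write, and cleanup proof.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FakeTaskApi:
    """Hypothetical in-memory stand-in for an HTTP client (assumption, not part of OpenTest)."""

    _store: dict = field(default_factory=dict)
    _next_id: int = 1

    def create(self, payload: dict) -> tuple[int, dict]:
        if not payload.get("title"):
            return 422, {"error": "title is required"}
        task = {"id": self._next_id, "title": payload["title"]}
        self._store[task["id"]] = task
        self._next_id += 1
        return 201, task

    def get(self, task_id: int) -> tuple[int, dict]:
        if task_id not in self._store:
            return 404, {"error": "not found"}
        return 200, self._store[task_id]

    def delete(self, task_id: int) -> int:
        return 204 if self._store.pop(task_id, None) else 404


def accept_create(api: FakeTaskApi, payload: dict) -> dict:
    """Collect request/response, schema, read-after-write, and cleanup evidence."""
    status, body = api.create(payload)
    evidence = {"request": payload, "status": status, "response": body}
    if status != 201:
        evidence["result"] = "FAIL"
        return evidence
    # Schema assertion: required fields with the expected types.
    evidence["schema_ok"] = isinstance(body.get("id"), int) and isinstance(body.get("title"), str)
    # Read-after-write: query the entity back instead of trusting the create response.
    read_status, read_body = api.get(body["id"])
    evidence["read_after_write_ok"] = read_status == 200 and read_body == body
    # Cleanup proof: delete the entity, then confirm it is gone.
    api.delete(body["id"])
    evidence["cleanup_ok"] = api.get(body["id"])[0] == 404
    checks = ("schema_ok", "read_after_write_ok", "cleanup_ok")
    evidence["result"] = "PASS" if all(evidence[c] for c in checks) else "FAIL"
    return evidence
```

In a real project the fake client would be replaced by the repository's own API client, and the cleanup step would normally live in a pytest fixture's teardown.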

package/assets/skills/opentest-author/SKILL.md CHANGED

@@ -11,14 +11,16 @@ Turn the matrix into executable tests, fixtures, seed/teardown notes, and accept
 - `opentest/references/opentest-driven-development.md`
 - `opentest/references/complete-testing-workflow.md`
+- `opentest/references/test-asset-layout.md`
 - `opentest/templates/fixtures-template.md`
 - `opentest/templates/acceptance-template.md`
 ## Steps
 1. Read `matrix` and `fixtures` from `.opentest.yaml`.
-2. For code evidence, use the project framework; if none exists, Default to pytest with tests under `tests/` runnable by `python -m pytest`.
-3. Create/update fixtures, seed, teardown, users, roles, entities, files/images, and assertion surfaces.
-4. For CRUD/data changes, author the full acceptance flow: create -> list -> detail -> update -> read back -> delete -> confirm absence -> teardown.
-5. Record any gap/blocker with reason and risk.
-6. Write `.opentest.yaml` fields: `fixtures`, `acceptance`, then run `bash "$OPENTEST_GUARD" author --apply`.
+2. Preserve each row's requirement source and expected behavior. Do not rewrite acceptance cases around current implementation names, component internals, or existing test files.
+3. Place assets in the fixed layout from `test-asset-layout.md`; default to pytest under `tests/` when no project framework exists. Missing implementation means evidence stays pending.
+4. Create/update fixtures, seed, teardown, users, roles, entities, files/images, and assertion surfaces.
+5. For CRUD/data changes, author the full acceptance flow: create -> list -> detail -> update -> read back -> delete -> confirm absence -> teardown.
+6. Record any gap/blocker with reason and risk.
+7. Write `.opentest.yaml` fields: `fixtures`, `acceptance`, then run `bash "$OPENTEST_GUARD" author --apply`.

package/assets/skills/opentest-desktop-gui/SKILL.md ADDED

@@ -0,0 +1,24 @@
+---
+name: opentest-desktop-gui
+description: "OpenTest desktop GUI execution surface adapter. Use when planning, authoring, running, or accepting native desktop, Electron, Tauri, Windows, macOS, Linux, or RDP workflows and deciding between project GUI automation, @midscene/computer, accessibility metadata, screenshots, and scripted manual acceptance."
+---
+# OpenTest Desktop GUI
+Use this adapter for `desktop-gui` matrix rows.
+## Required references
+- `opentest/references/desktop-gui-testing.md`
+- `opentest/templates/desktop-gui-acceptance-template.md`
+## Route
+1. Prefer the repository's existing GUI automation command when it is explicit and repeatable.
+2. Use `@midscene/computer` for visual desktop automation, weak selectors, native controls, cross-window flows, or remote Windows RDP when deterministic project automation is unavailable.
+3. For Electron or Tauri apps that expose a browser context, prefer `web-browser` or Playwright routes for DOM-verifiable flows, and use `desktop-gui` only for native shell, tray, file picker, menu, OS dialog, or multi-window behavior.
+4. If model credentials, desktop access, display/RDP availability, app launch command, or stable result surface is missing, record `blocked` instead of PASS.
+## Evidence Contract
+PASS requires the executed steps, screenshot or screen recording, window/app metadata, GUI action log, and a deterministic read-back result after writes. `@midscene/computer` assertions are visual assistance; they must not replace persisted-result checks such as reopening a settings screen, reading a file/config value, querying an app state surface, or confirming the expected window state after restart.
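
A minimal sketch of the "deterministic read-back" idea from the evidence contract above, assuming the desktop app persists settings to a JSON file. The file name, the `save_setting` helper, and the result dictionary shape are hypothetical; `save_setting` merely stands in for whatever the GUI automation triggers when the user clicks save.

```python
import json
import tempfile
from pathlib import Path


def save_setting(config_path: Path, key: str, value) -> None:
    """Hypothetical stand-in for the GUI save action that writes app settings to disk."""
    data = json.loads(config_path.read_text()) if config_path.exists() else {}
    data[key] = value
    config_path.write_text(json.dumps(data))


def read_back(config_path: Path, key: str, expected) -> dict:
    """Re-read the persisted value from disk instead of trusting what the screen showed."""
    if not config_path.exists():
        # A missing result surface is recorded as blocked, never as PASS.
        return {"status": "blocked", "reason": f"result surface missing: {config_path}"}
    actual = json.loads(config_path.read_text()).get(key)
    status = "PASS" if actual == expected else "FAIL"
    return {"status": status, "key": key, "expected": expected, "actual": actual}


def demo() -> list:
    """Walk through blocked, PASS, and FAIL outcomes using a temporary settings file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "settings.json"
        before = read_back(path, "theme", "dark")["status"]
        save_setting(path, "theme", "dark")
        match = read_back(path, "theme", "dark")["status"]
        mismatch = read_back(path, "theme", "light")["status"]
    return [before, match, mismatch]
```

A visual assertion from `@midscene/computer` could accompany this check as supporting evidence, but the persisted value is what decides the status.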

package/assets/skills/opentest-plan/SKILL.md CHANGED

@@ -1,27 +1,33 @@
 ---
 name: opentest-plan
-description: "OpenTest phase 1: analyze the change, risks, project facts, and create a test strategy plus acceptance-to-test matrix."
+description: "OpenTest phase 1: create test strategy and acceptance matrix."
 ---
 # OpenTest Plan
-Create `docs/opentest/plans/`, `docs/opentest/matrices/`, and a fixtures plan before implementation.
+Create plan, matrix, and fixtures before implementation.
 ## Required references
 - `opentest/references/codex-harness-coverage-heuristics.md`
 - `opentest/references/matrix-format.md`
 - `opentest/references/complete-testing-workflow.md`
+- `opentest/references/test-asset-layout.md`
+- `opentest/references/test-surfaces.md`
+- `opentest/references/web-browser-testing.md`
+- `opentest/references/desktop-gui-testing.md`
+- `opentest/references/api-testing.md`
 ## Steps
-1. Read project rules, requirements/design/diff, existing commands, and detection output.
-2. Classify risk, applicable coverage, test data, and whether code tests use an existing framework or default `pytest`.
-3. Apply the CRUD baseline and test data requirements from the workflow reference for data-writing/API/form/file/stateful changes.
-4. Produce a matrix with coverage dimension, framework/command, required evidence, gap/blocker, and status.
-5. Write `.opentest.yaml` fields: `plan`, `matrix`, `fixtures`.
-6. Update handoff if present, then run `bash "$OPENTEST_GUARD" plan --apply`.
+1. Read rules, requirements/diff, commands, and detection output.
+2. Treat requirements and risks as sources; inspect current code only for project facts.
+3. Apply CRUD baseline and test data rules for data-writing/API/form/file/stateful changes.
+4. Classify execution surface and evidence layer separately: `web-browser`, `android-app`, `desktop-gui`, or `api`; use web modes, `opentest-desktop-gui`, and `opentest-api` where applicable.
+5. Produce a requirement-first matrix with source, behavior, surface, mode, evidence, command, gap/blocker, status, and fixed layout.
+6. Write `.opentest.yaml` fields: `plan`, `matrix`, `fixtures`.
+7. Update handoff if present, then run `bash "$OPENTEST_GUARD" plan --apply`.
 ## Gate
-Every applicable behavior, failure path, boundary, and risk surface needs evidence or a written gap/blocker. CRUD baseline and test data are default requirements unless the matrix says why they are not applicable.
+Every behavior, failure path, boundary, and risk needs evidence or gap/blocker. Every row cites a source and includes surface plus evidence layer; web rows include acceptance mode. Do not use unit/component/integration/contract/smoke as the execution surface. Do not drop or narrow acceptance because current code has no matching file.
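
The gate above reads naturally as a per-row check. The sketch below is an assumption-laden illustration, not the guard script's actual logic: the dictionary keys (`source`, `surface`, `mode`, `evidence_layer`) are hypothetical names for the matrix columns, and the sets only contain values quoted in the diff.

```python
EXECUTION_SURFACES = {"web-browser", "android-app", "desktop-gui", "api"}
WEB_MODES = {"instant-acceptance", "durable-regression", "visual-ai-assist"}
# Evidence layers that the gate forbids as execution surfaces.
LAYERS_NOT_SURFACES = {"unit", "component", "integration", "contract", "smoke"}


def gate_errors(row: dict) -> list:
    """Return the gate violations for one matrix row; an empty list means the row passes."""
    errors = []
    if not row.get("source"):
        errors.append("missing requirement source")
    surface = row.get("surface", "")
    if surface in LAYERS_NOT_SURFACES:
        errors.append(f"'{surface}' is an evidence layer, not an execution surface")
    elif surface not in EXECUTION_SURFACES:
        errors.append(f"unknown execution surface: {surface!r}")
    if not row.get("evidence_layer"):
        errors.append("missing evidence layer")
    # Only web rows carry an acceptance mode; other surfaces use n/a.
    if surface == "web-browser" and row.get("mode") not in WEB_MODES:
        errors.append("web-browser row needs an acceptance mode")
    return errors
```

For example, a row with `surface` set to `integration` and no `source` would come back with two errors, which mirrors the "do not use unit/component/integration/contract/smoke as the execution surface" rule.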

package/assets/skills/opentest-run/SKILL.md CHANGED

@@ -5,12 +5,17 @@ description: "OpenTest phase 3: run project verification commands in targeted, f
 # OpenTest Run
-Run matrix-driven commands and write evidence reports under `docs/opentest/runs/`.
+Run matrix-driven commands and write reports under `docs/opentest/runs/`.
 ## Required references
 - `opentest/references/command-routing.md`
 - `opentest/references/complete-testing-workflow.md`
+- `opentest/references/test-asset-layout.md`
+- `opentest/references/test-surfaces.md`
+- `opentest/references/web-browser-testing.md`
+- `opentest/references/desktop-gui-testing.md`
+- `opentest/references/api-testing.md`
 ## Modes
@@ -22,10 +27,11 @@ Run matrix-driven commands and write evidence reports under `docs/opentest/runs/
 ## Steps
-1. Read `run_mode`, matrix, fixtures, and required evidence.
-2. Prefer explicit project commands; otherwise use `python -m pytest` for code-level tests.
-3. For coverage, prefer `python -m pytest --cov=. --cov-report=term-missing`.
-4. Smoke evidence is required unless the matrix says not applicable.
-5. For `pre-push`, run or record format/check, lint, type, unit, targeted integration, smoke, and `git diff --check`.
-6. Write `run_report`, plus `coverage_report`, `smoke_report`, or `pre_push_report` when produced/required.
-7. Run `bash "$OPENTEST_GUARD" run --apply`.
+1. Read `run_mode`, matrix, fixtures, required evidence, and fixed asset layout.
+2. Choose by matrix execution surface: MCP/CLI or `npx playwright test` for `web-browser`; `python -m pytest tests_py -v` for `android-app`, with `npm run test:android` only when model env is ready or debugging Midscene; `opentest-desktop-gui` for `desktop-gui`; `opentest-api` or `python -m pytest tests/api -v` for `api`.
+3. Prefer explicit project commands; otherwise use `python -m pytest` for code-level tests.
+4. For coverage, prefer `python -m pytest --cov=. --cov-report=term-missing`.
+5. Smoke evidence is required unless the matrix says not applicable.
+6. For `pre-push`, run or record format/check, lint, type, unit, targeted integration, smoke, and `git diff --check`.
+7. Write `run_report`, `coverage_report`, `smoke_report`, or `pre_push_report` when required.
+8. Run `bash "$OPENTEST_GUARD" run --apply`.
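
Steps 2 and 3 above amount to a lookup with a project-command override. A possible sketch follows; the function name and the `None` return for adapter-owned surfaces are assumptions, while the default commands are the ones quoted in the steps.

```python
from typing import Optional, Sequence

# Default commands quoted from the steps above; an explicit project command takes priority.
DEFAULT_COMMANDS = {
    "web-browser": ["npx", "playwright", "test"],
    "android-app": ["python", "-m", "pytest", "tests_py", "-v"],
    "api": ["python", "-m", "pytest", "tests/api", "-v"],
}


def choose_command(surface: str, project_command: Optional[Sequence[str]] = None) -> Optional[list]:
    """Return the argv for a matrix row, or None when an adapter skill owns the run."""
    if project_command:
        return list(project_command)
    if surface in DEFAULT_COMMANDS:
        return list(DEFAULT_COMMANDS[surface])
    if surface == "desktop-gui":
        # The steps route this surface to the opentest-desktop-gui adapter, not a fixed command.
        return None
    raise ValueError(f"unknown execution surface: {surface!r}")
```

Returning a fresh list each time keeps callers from mutating the shared defaults when they append extra flags.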

package/assets/skills/opentest-web-browser/SKILL.md ADDED

@@ -0,0 +1,26 @@
+---
+name: opentest-web-browser
+description: "OpenTest web-browser execution surface adapter. Use when planning, authoring, running, or accepting browser-rendered web workflows and deciding between Playwright MCP, Playwright CLI, @playwright/test, and Midscene."
+---
+# OpenTest Web Browser
+Use this adapter for `web-browser` matrix rows.
+## Required references
+- `opentest/references/web-browser-testing.md`
+- `opentest/references/complete-testing-workflow.md`
+- `opentest/templates/web-acceptance-template.md`
+## Steps
+1. Decide `acceptance_mode`: `instant-acceptance`, `durable-regression`, or `visual-ai-assist`.
+2. For `instant-acceptance`, use Playwright MCP first; if browser MCP is unavailable or unstable, use Playwright CLI.
+3. For `durable-regression`, write or run `@playwright/test` or the repository's existing E2E framework.
+4. Use Midscene only for visual, weak-selector, canvas, cross-frame, or other UI surfaces selectors cannot prove well.
+5. For writes, run the full chain: open -> snapshot -> fill/input -> submit -> snapshot -> read/assert changed result -> screenshot -> PASS/FAIL.
+## Gate
+MCP or Playwright CLI evidence proves this acceptance run only. Do not count it as durable regression unless the repository has a committed repeatable test, command, and report path. Midscene must not replace read-after-write assertions.
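
The mode-to-tool routing and the durable-regression gate above could be expressed roughly as follows. The function names and the evidence keys (`committed_test`, `command`, `report_path`) are hypothetical; only the mode names and tool names come from the skill text.

```python
def web_tool_route(mode: str, mcp_available: bool = True) -> str:
    """Map an acceptance mode to the tool route described in the steps."""
    if mode == "instant-acceptance":
        # Playwright MCP first; fall back to the CLI when MCP is unavailable or unstable.
        return "Playwright MCP" if mcp_available else "Playwright CLI"
    if mode == "durable-regression":
        return "@playwright/test or existing E2E framework"
    if mode == "visual-ai-assist":
        return "Midscene"
    raise ValueError(f"unknown acceptance mode: {mode!r}")


def counts_as_durable_regression(evidence: dict) -> bool:
    """Require a committed repeatable test, a command, and a report path together."""
    return all(evidence.get(key) for key in ("committed_test", "command", "report_path"))
```

Under this sketch, evidence from an MCP session alone never satisfies `counts_as_durable_regression`, which matches the gate's statement that MCP or CLI evidence proves only the current acceptance run.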

package/assets/skills-zh/opentest/references/api-testing.md ADDED

@@ -0,0 +1,77 @@
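+# API Testing
+Used for matrix rows whose execution surface is `api`.
+## Default architecture
+Prefer the repository's existing API test framework. When the project has no explicit API test command, default to:
+```text
+pytest
+  -> httpx or requests client
+  -> pytest fixtures manage seed/teardown
+  -> jsonschema or the project's existing Pydantic/DTO models for contract assertions
+  -> optional DB/storage/log read-back
+  -> pytest report or JUnit XML
+```
+When dependencies need isolation, use Docker Compose, testcontainers, or the repository's existing way of starting local services. Third-party APIs are mocked/stubbed by default; hitting a live external service requires an explicit requirement and a recorded risk note.
+Stable API assets go under `tests/api/` as defined in `opentest/references/test-asset-layout.md`: clients in `tests/api/clients/`, fixtures in `tests/api/fixtures/`, schemas in `tests/api/schemas/`; the repeatable entry point is `scripts/opentest-run-api.ps1` or the repository's equivalent command.
+## Evidence layers
+| Layer | What it proves | Common commands/tools |
+| --- | --- | --- |
+| contract | status codes, headers, response fields, schema, error shape | project contract tests, `pytest`, OpenAPI checks |
+| integration | API handler, service, database/storage, queue/log side effects | project integration command, `pytest`, local services |
+| smoke | base URL and key endpoints are alive | project smoke command, small `pytest`/curl script |
+| security-review | authentication, authorization, sensitive-field leakage, injection risk | targeted tests + review record |
+## Required API cases
+API changes should be written into the matrix as applicable:
+- Happy path: expected status code, payload, headers, and business state
+- Validation failure: invalid, empty, boundary, malformed, unsupported fields
+- Authentication and permissions: not logged in, expired token, wrong role, object-level authorization
+- Not found and stale state: nonexistent resource, deleted resource, stale version
+- Conflict and idempotency: duplicate create, duplicate submit, retry with idempotency key, concurrent modification
+- Rate limiting or throttling, when applicable
+- Pagination, filtering, sorting, and empty results when a list endpoint changes
+- Data consistency: response, DB/storage, events, queue messages, files, or logs
+- teardown/cleanup: created resources are deleted, or the isolated fixture namespace is reset
+## Contract sources
+Contract source priority:
+1. OpenAPI/protobuf/schema files committed to the repository.
+2. Existing request/response DTOs, Pydantic models, serializers, or typed clients.
+3. Requirement/design documents that explicitly state fields and errors.
+4. A hand-written schema in the acceptance case.
+When current implementation behavior conflicts with the requirements, do not derive the contract from the implementation alone.
+## Blocking rules
+Record `blocked` when any required prerequisite is missing:
+- base URL or service start command
+- auth token, role, or test user
+- fixture seed/teardown path
+- dependency service, database, queue, or mock server
+- stable contract source or expected schema
+- deterministic read-after-write result surface
+Do not mark API acceptance as PASS on a 2xx response alone. Write operations must include read-after-write evidence from a trusted result surface, such as an API query endpoint, a DB/storage record, a queue/event/log, or another project-owned state surface.
+## Matrix requirements
+`api` rows must include:
+- `Execution surface`: `api`
+- `Acceptance mode`: `n/a`
+- `Evidence layer`: `contract`, `integration`, `smoke`, or `security-review`
+- `Framework/command`: project API command, `python -m pytest tests/api -v`, curl/httpie script, Postman/Newman when the project already uses it, or a contract tool
+- `Required evidence`: request/response record, status code, payload/schema assertion, auth/permission assertion when applicable, read-after-write/data consistency, and cleanup/teardown proof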

package/assets/skills-zh/opentest/references/codex-harness-coverage-heuristics.md CHANGED

@@ -13,6 +13,9 @@ OpenTest should select applicable coverage based on change type, risk, and project facts.
 - `tdd-workflow`
 - `e2e-runner`
 - `browser-e2e-testing`
+- `android-midscene-pytest`
+- `opentest-desktop-gui`
+- `opentest-api`
 - `verification-loop`
 - `code-reviewer`
 - `speckit-checklist`
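@@ -28,6 +31,19 @@ OpenTest should select applicable coverage based on change type, risk, and project facts.
 | Permissions, payments, security, data writes, cross-page closed-loop flows | high-risk acceptance or E2E evidence |
 | Copy, configuration, small changes with no behavior change | targeted review or lightweight evidence |
+## Execution surface selection
+Execution surface and evidence layer must be chosen separately:
+| Execution surface | Default route |
+| --- | --- |
+| `web-browser` | Chrome DevTools MCP, Playwright CLI, or browser acceptance |
+| `android-app` | Use `android-midscene-pytest` when installed; `python -m pytest tests_py -v` drives Midscene Android over ADB |
+| `desktop-gui` | `opentest-desktop-gui`; prefer project GUI automation, use `@midscene/computer` for visual/native/RDP GUI flows, and fall back to scripted manual GUI acceptance only when no automation exists |
+| `api` | `opentest-api`; prefer the project API/integration command, otherwise `pytest` + `httpx`/`requests`, schema validation, fixtures, read-after-write, and cleanup/teardown |
+Do not classify code checks such as unit, component, integration, contract, smoke, or security review as execution surfaces; they can only serve as evidence layers.
 ## Frontend acceptance dimensions
 Frontend or real-workflow acceptance can pick applicable items from the following dimensions; covering all of them every time is not required:
@@ -74,7 +90,7 @@ By default, the OpenTest plan phase checks the following questions; applicable ones enter the matrix, non-applicable ones
 | ACC-001 | User sees success feedback after saving | medium | UI acceptance | pending |
 ```
-Add coverage dimension, command, evidence path, and blocker reason columns only when the risk or change type requires them.
+Add execution surface, evidence layer, command/tool, evidence path, and blocker reason columns when the matrix needs to drive acceptance execution or the risk requires it.
 ## Quality gate heuristics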

package/assets/skills-zh/opentest/references/complete-testing-workflow.md CHANGED

@@ -8,6 +8,33 @@
 plan -> matrix -> fixtures -> tests -> run -> accept -> smoke -> pre-push -> verify -> archive
 ```
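+## Requirement-first acceptance
+`plan` and `author` happen before implementation and must turn requirements into an acceptance contract. The only valid sources are requirements, design notes, user flows, business rules, and risk boundaries. Current code is used only to discover execution facts, such as commands, existing frameworks, routes, fixtures, and reusable helpers.
+unit, component, integration, contract, E2E, smoke, and browser acceptance are all evidence layers. They describe how a requirement is proven during or after implementation; they cannot decide what the requirement is. If the code does not exist yet, keep the acceptance case and mark code-dependent evidence as pending or blocked, with the reason.
+## Execution surfaces
+Every matrix row must state both an execution surface and an evidence layer. The execution surface says where the requirement is actually exercised; the evidence layer says how the result is proven.
+Before writing tests, first choose the fixed asset directory according to `opentest/references/test-asset-layout.md`. Default to the standard test framework skeleton of Option B; one-off scripts may be used only for explicitly non-durable acceptance or for investigating a blocked item.
+The only primary execution surfaces are:
+- `web-browser`: browser-rendered pages and web apps
+- `android-app`: Android APK/app GUI on an emulator or physical device
+- `desktop-gui`: native desktop GUI, Electron, Tauri, or similar app UI
+- `api`: HTTP API, RPC, backend workflow, contract, or service endpoint
+Do not treat unit, component, integration, contract, smoke, or security review as execution surfaces; they are evidence layers or run gates. If there is an Android GUI requirement, run acceptance with `android-midscene-pytest` when it is installed, and require pytest/Midscene/screenshot/logcat evidence. If there is native desktop GUI behavior, run acceptance with `opentest-desktop-gui`, and require project GUI automation or `@midscene/computer` evidence, plus screenshots, a GUI action log, window/app metadata, and deterministic read-back. If there is API behavior, run acceptance with `opentest-api`, and require contract, status code, payload/schema, auth/permission, read-after-write, and cleanup/teardown evidence as applicable.
+`web-browser` must choose an acceptance mode according to `opentest/references/web-browser-testing.md`. MCP and Playwright CLI are live acceptance paths; durable regression requires a committed, repeatable test, such as `@playwright/test`.
+`desktop-gui` uses `opentest/references/desktop-gui-testing.md`. DOM-verifiable Electron/Tauri flows can stay in `web-browser`; native shell, tray, file picker, menu, system dialog, installer, updater, RDP, and multi-window behavior stays in `desktop-gui`.
+`api` uses `opentest/references/api-testing.md`. Prefer the project API/integration command; when no project command exists, use `pytest` + `httpx` or `requests`, schema validation, fixtures, and deterministic read-back.
 ## Test data
 When a change involves data, files, roles, permissions, APIs, or stateful flows, create `docs/opentest/fixtures/`.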

package/assets/skills-zh/opentest/references/desktop-gui-testing.md ADDED

@@ -0,0 +1,52 @@
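+# Desktop GUI Testing
+Used for matrix rows whose execution surface is `desktop-gui`.
+Stable desktop GUI assets follow `opentest/references/test-asset-layout.md`: scripts in `tests/desktop/scripts/`, Midscene assets in `tests/desktop/midscene/`, metadata capture in `tests/desktop/metadata/`; the repeatable entry point is `scripts/opentest-run-desktop.ps1` or the project GUI command.
+## Tool routes
+| Route | When to use | Required evidence |
+| --- | --- | --- |
+| Project GUI automation | The repository already has a repeatable desktop automation command | command, report/log, screenshot or screen recording, read-back after the action |
+| `@midscene/computer` | Native desktop controls, weak selectors, visual flows, multi-window flows, or Windows RDP need AI visual assistance | Midscene/computer run log, screenshots, model environment status, window/app metadata, deterministic read-back |
+| Accessibility/window metadata | Native controls expose a stable accessibility tree, title, process, window handle, or menu state | metadata dump, action log, expected-state assertion |
+| Scripted manual GUI acceptance | No reliable automation exists and the acceptance is one-off | exact steps, screenshots, window/app metadata, observed result, blocker/risk note |
+## Midscene desktop route
+`@midscene/computer` is Midscene's desktop automation package; it can control local Windows, macOS, and Linux desktops and, once configured, remote Windows desktops over RDP.
+In OpenTest, treat it as the visual automation layer, not the entire quality gate:
+```text
+desktop-gui matrix row
+  -> project launch / environment check
+  -> @midscene/computer or project GUI automation
+  -> screenshot + GUI action log + window/app metadata
+  -> deterministic read/assert changed result
+```
+For Electron or Tauri, first determine whether the requirement can be verified through the DOM. DOM-verifiable flows belong to `web-browser`; native shell, tray, file picker, native menu, system dialog, installer, updater, and multi-window behavior belongs to `desktop-gui`.
+## Blocking rules
+Record `blocked` when any required prerequisite is missing:
+- model credentials for Midscene visual automation
+- desktop access, display, or RDP session
+- app launch command or target process/window identity
+- stable fixture data or reset/teardown path
+- deterministic result surface after writes
+Do not mark `desktop-gui` acceptance as PASS on AI visual assertions alone. After a create/update/delete/save, re-read a trusted result surface: reopened window state, file/config value, app storage/API, accessibility metadata, process/window metadata, or a persisted value still visible after restart.
+## Matrix requirements
+`desktop-gui` rows must include:
+- `Execution surface`: `desktop-gui`
+- `Acceptance mode`: `n/a`
+- `Evidence layer`: `gui-acceptance`, `visual-acceptance`, `integration`, or `smoke`
+- `Framework/command`: project GUI command, `@midscene/computer`, accessibility/window metadata script, or the scripted manual GUI route
+- `Required evidence`: screenshot or screen recording, GUI action log, window/app metadata, deterministic read-back, and blocked prerequisites when unavailable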

package/assets/skills-zh/opentest/references/matrix-format.md CHANGED

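@@ -2,8 +2,14 @@
 ## Minimum columns
-| ID | Intent | Trigger/Input | Expected behavior | Risk | Evidence layer | Required evidence | Status |
-| --- | --- | --- | --- | --- | --- | --- | --- |
+| ID | Requirement source | Intent | Execution surface | Acceptance mode | Trigger/Input | Expected behavior | Risk | Evidence layer | Required evidence | Status |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
+`Requirement source` is required. It can be a requirement ID, design section, user story, business rule, risk note, issue, or explicit user request. Do not use function names, component paths, or existing test files as the acceptance source.
+`Execution surface` is required and should be one of `web-browser`, `android-app`, `desktop-gui`, or `api`.
+`web-browser` rows must fill in `Acceptance mode`: `instant-acceptance`, `durable-regression`, or `visual-ai-assist`.
 ## Optional columns
@@ -17,6 +23,8 @@
 ## Evidence layers
+Evidence layers only describe how a requirement is proven; they cannot create or limit the requirement itself.
 - `unit`: pure functions, validation rules, state computation.
 - `component`: form feedback, button states, local UI state.
 - `integration`: module collaboration, API client, state management, mock server.
@@ -24,4 +32,6 @@
 - `e2e`: cross-page flows, login, permissions, key business closed loops.
 - `smoke`: key pages or the main path do not crash.
 - `browser-acceptance`: real browser interaction, feedback placement, responsive and visual states.
+- `visual-acceptance`: visual GUI behavior on the Android app or desktop GUI execution surface.
+- `gui-acceptance`: desktop GUI behavior, window state, dialogs, menus, and native controls.
 - `security-review`: permissions, sensitive information, privilege escalation, duplicate submission, injection risk.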

package/assets/skills-zh/opentest/references/opentest-driven-development.md CHANGED

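@@ -16,6 +16,14 @@ OpenTest-driven development is not another name for traditional TDD. It places TDD inside a larger
 These scenarios must enter the acceptance-to-test matrix first, and only then is the evidence layer decided.
+## Requirement-first contract
+OpenTest's `plan` and `author` phases produce the requirement acceptance contract before implementation. Acceptance cases must come from requirements, design notes, user flows, business rules, risk boundaries, and expected interaction feedback.
+Current code may be read to discover project facts, such as the existing test framework, commands, routes, fixtures, or reusable helpers. Current code cannot decide whether a requirement needs acceptance, what the user-visible behavior should be, or whether a requirement can be dropped.
+Evidence layers such as `unit`, `component`, `integration`, `contract`, and `e2e` only describe how a requirement is proven during or after implementation; they are not requirement sources. If the code does not exist yet, the matrix must still record the required behavior and mark implementation-dependent evidence as pending.
 ## Recommended order
 ```text
@@ -42,6 +50,8 @@ OpenTest-driven development is not another name for traditional TDD. It places TDD inside a larger
 ## Output requirements
+- Every matrix row must cite a requirement source, such as a requirement ID, design section, user story, business rule, risk note, or explicit user request.
+- Acceptance wording must be implementation-independent: describe user-observable behavior and business outcomes, not current function names, component internals, or existing test files.
 - Every required scenario must have an evidence layer and an execution surface.
 - blocked must never be treated as pass; the reason and recovery path must be written down.
 - A high-risk scenario with missing evidence fails by default, unless the user explicitly accepts the risk and states the reason.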