@rolepod/uiproof 0.4.0

Files changed (24)

package/.agents/plugins/marketplace.json +22 -0
package/.claude-plugin/marketplace.json +34 -0
package/.claude-plugin/plugin.json +26 -0
package/.codex-plugin/plugin.json +45 -0
package/.cursor/mcp.json +9 -0
package/.cursor-plugin/marketplace.json +32 -0
package/.cursor-plugin/plugin.json +22 -0
package/CHANGELOG.md +310 -0
package/LICENSE +21 -0
package/README.md +170 -0
package/THIRD_PARTY.md +104 -0
package/UPSTREAM_TRACKING.md +58 -0
package/dist/bin/rolepod-uiproof.d.ts +1 -0
package/dist/bin/rolepod-uiproof.js +2530 -0
package/dist/bin/rolepod-uiproof.js.map +1 -0
package/dist/index.d.ts +918 -0
package/dist/index.js +2303 -0
package/dist/index.js.map +1 -0
package/dist/schemas/tools.json +51 -0
package/package.json +85 -0
package/skills/audit-a11y/SKILL.md +76 -0
package/skills/scaffold-e2e/SKILL.md +85 -0
package/skills/verify-ui/SKILL.md +134 -0
package/skills/visual-diff/SKILL.md +73 -0

package/README.md ADDED

@@ -0,0 +1,170 @@
+# rolepod-uiproof
+**rolepod-uiproof gives Claude Code, Cursor, Codex CLI, and Gemini CLI a real browser/mobile driver — so the AI can actually click through your UI, audit accessibility, diff screenshots, and scaffold e2e tests instead of guessing.**
+One MCP server, one tool surface, four skills you invoke from chat. Web is production-ready via Playwright; iOS and Android use Appium (same client as alumnium — needs a local Appium daemon + simulator/emulator, or a real device). No internal LLM — your Lead agent drives every action.
+## What it helps with
+- **Verify a UI change in seconds.** `/verify-ui` opens a real browser, runs your steps, checks your assertions, saves a screenshot + replay bundle.
+- **Catch a11y regressions before merge.** `/audit-a11y` runs axe-core against WCAG-A / AA / AAA and returns issues grouped by severity, with WCAG references and fix links.
+- **Lock down the visual contract.** `/visual-diff` captures a screenshot and compares against a named baseline under `./.rolepod-uiproof/baselines/`. First call seeds; subsequent calls diff.
+- **Turn an interactive verify run into a real test file.** `/scaffold-e2e` transcribes a replay bundle into Playwright Test, Vitest+Playwright, or pytest+selenium.
+- **Reproduce + minimize a bug deterministically.** `/verify-ui` with `mode: "reproduce"` runs ddmin step-elimination to find the shortest still-reproducing sequence.
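The `reproduce` bullet above mentions ddmin step-elimination. The sketch below is not part of the package; it is a simplified, complement-only variant of delta debugging, written to show the general idea of shrinking a step list while a failure still reproduces. The `Step` type, `minimizeSteps`, and the toy oracle are hypothetical names, and the real tool's step format and oracle may differ.

```typescript
// Hypothetical: a step is just a label here; the real step shape is not shown in the README.
type Step = string;

// Simplified ddmin (assumption: complement-only variant).
// Repeatedly try removing one chunk of steps; keep the removal if the bug still reproduces.
function minimizeSteps(
  steps: Step[],
  reproduces: (candidate: Step[]) => boolean,
): Step[] {
  let current = steps.slice();
  let chunks = 2;
  while (current.length >= 2) {
    const size = Math.ceil(current.length / chunks);
    let reduced = false;
    for (let start = 0; start < current.length; start += size) {
      // Candidate = current sequence with one chunk dropped.
      const candidate = current
        .slice(0, start)
        .concat(current.slice(start + size));
      if (candidate.length > 0 && reproduces(candidate)) {
        current = candidate;
        chunks = Math.max(chunks - 1, 2);
        reduced = true;
        break;
      }
    }
    if (!reduced) {
      if (size === 1) break; // single-step granularity reached, nothing more to drop
      chunks = Math.min(chunks * 2, current.length); // retry with finer chunks
    }
  }
  return current;
}

// Toy oracle: the bug shows up whenever "login" happens before "click-buy".
const recorded: Step[] = ["open", "login", "noise1", "click-buy", "noise2"];
const bugReproduces = (candidate: Step[]): boolean => {
  const loginAt = candidate.indexOf("login");
  return loginAt !== -1 && candidate.indexOf("click-buy") > loginAt;
};
const minimal = minimizeSteps(recorded, bugReproduces);
console.log(minimal);
```

Each oracle call corresponds to re-running the candidate sequence, so in practice the number of calls matters more than the loop itself.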
+## The four skills
+| Skill | Wraps | What it does |
+|---|---|---|
+| `/verify-ui` | `rolepod_verify_ui_flow` | Drive a session through steps, evaluate assertions, save evidence + replay bundle. `mode: assert` (default) or `reproduce` with optional ddmin minimization. |
+| `/audit-a11y` | `rolepod_audit_a11y` | axe-core audit at WCAG-A / AA / AAA. `scope: "page"` or `scope: { ref }`. Markdown or JSON report. |
+| `/visual-diff` | `rolepod_visual_diff` | Pixel diff against a named baseline. Auto-seeds on first call. Configurable threshold + pixelmatch sensitivity. |
+| `/scaffold-e2e` | `rolepod_scaffold_e2e` | Generate a runnable test file from a scenario + optional replay bundle. Three target frameworks. |
+Every skill is **single-backend** (D-024) — it calls the rolepod-uiproof server and only the rolepod-uiproof server. If the server is unavailable, the skill fails with a clear diagnostic. Multi-backend routing belongs in the parent [`rolepod`](https://github.com/nuttaruj/rolepod) plugin's phase skills, not here.
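The `/visual-diff` row above describes a seed-then-compare flow with a configurable threshold. The sketch below is not part of the package; it illustrates that flow with an in-memory baseline store and a naive per-channel comparison. It does not reproduce pixelmatch's algorithm, and `RgbaFrame`, `compareToBaseline`, and both default tolerances are assumptions chosen for illustration.

```typescript
// Hypothetical shape: a decoded RGBA image (4 bytes per pixel).
interface RgbaFrame {
  width: number;
  height: number;
  data: Uint8Array;
}

interface DiffOutcome {
  seeded: boolean;        // true when no baseline existed and this shot became the baseline
  mismatchRatio: number;  // fraction of pixels that differ beyond the tolerance
  passed: boolean;
}

// In-memory stand-in for a baselines directory, keyed by baseline name.
const baselineStore = new Map<string, RgbaFrame>();

function compareToBaseline(
  baselineName: string,
  shot: RgbaFrame,
  maxMismatchRatio = 0.01, // assumed default, not the tool's
  channelTolerance = 16,   // assumed default, not the tool's
): DiffOutcome {
  const baseline = baselineStore.get(baselineName);
  if (baseline === undefined) {
    // First call for this name: seed the baseline instead of diffing.
    baselineStore.set(baselineName, shot);
    return { seeded: true, mismatchRatio: 0, passed: true };
  }
  if (baseline.width !== shot.width || baseline.height !== shot.height) {
    // Treat a size change as a full mismatch.
    return { seeded: false, mismatchRatio: 1, passed: false };
  }
  const pixels = shot.width * shot.height;
  let mismatched = 0;
  for (let p = 0; p < pixels; p++) {
    const offset = p * 4;
    for (let c = 0; c < 4; c++) {
      if (Math.abs(baseline.data[offset + c] - shot.data[offset + c]) > channelTolerance) {
        mismatched++;
        break; // count each pixel at most once
      }
    }
  }
  const mismatchRatio = pixels === 0 ? 0 : mismatched / pixels;
  return { seeded: false, mismatchRatio, passed: mismatchRatio <= maxMismatchRatio };
}

// Helper for the example: a frame where every byte has the same value.
function solidFrame(width: number, height: number, value: number): RgbaFrame {
  return { width, height, data: new Uint8Array(width * height * 4).fill(value) };
}

console.log(compareToBaseline("checkout-page", solidFrame(2, 2, 0))); // seeds
console.log(compareToBaseline("checkout-page", solidFrame(2, 2, 0))); // diffs against the seed
```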
+## Install
+Pick your CLI. All install paths share the same MCP server (`@rolepod/uiproof` on npm) and the same skill set.
+### Claude Code (recommended)
+```bash
+# Install
+claude plugin marketplace add nuttaruj/rolepod-uiproof
+claude plugin install rolepod-uiproof@rolepod-uiproof
+# Update
+claude plugin marketplace update rolepod-uiproof
+claude plugin install rolepod-uiproof@rolepod-uiproof
+# Uninstall
+claude plugin uninstall rolepod-uiproof@rolepod-uiproof
+claude plugin marketplace remove rolepod-uiproof
+```
+The plugin registers the four skills (`/verify-ui`, `/audit-a11y`, `/visual-diff`, `/scaffold-e2e`) and spawns the MCP server (`npx -y @rolepod/uiproof`) on session start.
+### Cursor IDE
+Cursor's plugin marketplace is enterprise-only (Free / Pro plans cannot install marketplace plugins). For everyone else, drop the workspace MCP config:
+```bash
+# Per project — copy from this repo, or run:
+mkdir -p .cursor
+curl -fsSL https://raw.githubusercontent.com/nuttaruj/rolepod-uiproof/main/.cursor/mcp.json -o .cursor/mcp.json
+# Or global (across every project)
+mkdir -p ~/.cursor
+curl -fsSL https://raw.githubusercontent.com/nuttaruj/rolepod-uiproof/main/.cursor/mcp.json -o ~/.cursor/mcp.json
+```
+Then **fully restart Cursor** — MCP servers load only at startup. Verify under **Settings → MCP**.
+Skills are not auto-registered under Cursor (Cursor has no unified plugin format that bundles skills and MCP together). The MCP tools are still available; invoke them by name in chat (`Use rolepod_verify_ui_flow to …`).
+> **Teams / Enterprise:** add `https://github.com/nuttaruj/rolepod-uiproof` as a team marketplace under **Settings → Plugins** for one-click install with skills auto-registered.
+### Codex CLI
+```bash
+# Install
+codex plugin marketplace add nuttaruj/rolepod-uiproof
+codex plugin install rolepod-uiproof@rolepod-uiproof
+# Update
+codex plugin marketplace upgrade rolepod-uiproof
+codex plugin install rolepod-uiproof@rolepod-uiproof
+```
+Codex reads the plugin from `.agents/plugins/marketplace.json` + `.codex-plugin/plugin.json` in this repo. Skills install to `~/.codex/skills/` (Codex's plugin loader handles registration).
+### Gemini CLI
+Not yet shipped. The Gemini extension format is not yet stable enough to commit to; we plan to add `gemini-extension.json` in v0.4. Track [issue #N](https://github.com/nuttaruj/rolepod-uiproof/issues) if you need it.
+### Direct npm (any MCP-aware tool)
+Use this when your tool reads a standard `mcpServers` config (most non-CLI MCP clients):
+```json
+{
+  "mcpServers": {
+    "rolepod-uiproof": {
+      "command": "npx",
+      "args": ["-y", "@rolepod/uiproof"]
+    }
+  }
+}
+```
+15 MCP tools (`rolepod_browser_*` + `rolepod_verify_ui_flow` + 4 composites) will appear in your client. Skills are not surfaced via this path — call the tools by name.
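If a client already has an `mcpServers` config, the entry above has to be merged in rather than pasted over the file. The sketch below is not part of the package; it shows one way to do that merge on a parsed config object. `McpConfig` and `withUiproofServer` are hypothetical names, and reading or writing the actual file is left out because the file location depends on the client.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Returns a new config with the rolepod-uiproof entry added,
// leaving any other configured servers untouched.
function withUiproofServer(existing: Partial<McpConfig>): McpConfig {
  return {
    ...existing,
    mcpServers: {
      ...(existing.mcpServers ?? {}),
      "rolepod-uiproof": { command: "npx", args: ["-y", "@rolepod/uiproof"] },
    },
  };
}

// Example: a config that already lists one (made-up) server.
const mergedConfig = withUiproofServer({
  mcpServers: { "some-other-server": { command: "node", args: ["server.js"] } },
});
console.log(JSON.stringify(mergedConfig, null, 2));
```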
+## Quick start
+After install, in your Claude Code / Cursor / Codex session:
+```
+/verify-ui https://example.com
+  steps: []
+  expect: text_visible "Example Domain", text_visible "Learn more"
+```
+Returns a `run_id`, `passed: true`, and a path under `./.rolepod-uiproof/artifacts/verify_<run_id>/`:
+```
+.rolepod-uiproof/artifacts/verify_20260524T101512_a1b2c3d4/
+├── final.png            screenshot at end of run
+└── replay.json          replay bundle — re-runnable via `npx rolepod-uiproof replay …`
+```
+Convert that to a Playwright Test file:
+```
+/scaffold-e2e from .rolepod-uiproof/artifacts/verify_…/replay.json using playwright-test
+```
+## Verify your setup
+```bash
+npx rolepod-uiproof doctor
+```
+```
+✓ Node ≥20                       24.14.0
+✓ Playwright Chromium installed  ~/Library/Caches/ms-playwright
+✓ webdriverio (mobile client, v0.3)
+• Appium server (roadmap v0.3)   Not reachable at http://127.0.0.1:4723/status
+✓ Xcode (iOS, roadmap v0.3)      /Applications/Xcode.app
+• Android SDK (roadmap v0.3)     Set ANDROID_HOME — needed only for Android
+• SeleniumEngine (roadmap v0.4)  Not implemented — deferred to v0.4
+✓ Artifact root writable
+```
+`✓` = ready · `•` = optional / deferred · `✗` = blocker.
+## What's inside
+- **15 MCP tools** — 10 atomic browser/mobile primitives (`browser_open`, `_close`, `_snapshot`, `_click`, `_type`, `_key`, `_scroll`, `_wait_for`, `_screenshot`, `_navigate`) + 5 composites (`verify_ui_flow`, `audit_a11y`, `visual_diff`, `scaffold_e2e`, `extract_ui_state`). All prefixed `rolepod_*` to namespace away from other MCP servers.
+- **2 engines behind one interface** — `PlaywrightEngine` for web (Chromium / Firefox / WebKit), `AppiumEngine` for iOS XCUITest + Android UIAutomator2. The Lead sees one unified `A11yNode` shape regardless of platform.
+- **Stable refs with explicit invalidation (D-010)** — every state-changing call invalidates prior refs; the engine returns a structured `stale_ref` error if you try to reuse one. No silent locator drift.
+- **Replay bundles** — every `/verify-ui` run writes a JSON replay you can re-run later with `npx rolepod-uiproof replay <bundle.json>`, agent-free.
+- **No internal LLM (D-004)** — your Lead agent makes every decision. We don't double-bill you for inference.
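The stable-refs bullet above (D-010) describes refs that are invalidated by every state-changing call and a structured `stale_ref` error on reuse. The sketch below is not part of the package; it shows one way such behavior could be modeled with a generation counter. `RefRegistry`, the `g<generation>-e<index>` ref format, and the result shape are assumptions; only the `stale_ref` error name comes from the README.

```typescript
// Result of resolving a ref: either the target, or a structured stale_ref error.
type RefLookup =
  | { ok: true; target: string }
  | { ok: false; error: "stale_ref"; ref: string };

class RefRegistry {
  private generation = 0;
  private targets = new Map<string, string>();

  // Snapshot: hand out fresh refs for the elements currently on screen.
  issue(elements: string[]): string[] {
    this.targets.clear();
    return elements.map((element, index) => {
      const ref = `g${this.generation}-e${index}`; // hypothetical ref format
      this.targets.set(ref, element);
      return ref;
    });
  }

  // Every state-changing action (click, type, navigate, ...) calls this.
  invalidate(): void {
    this.generation++;
    this.targets.clear();
  }

  // Reusing a ref from before the last invalidation yields stale_ref, never a silent re-match.
  resolve(ref: string): RefLookup {
    const target = this.targets.get(ref);
    return target === undefined
      ? { ok: false, error: "stale_ref", ref }
      : { ok: true, target };
  }
}

const registry = new RefRegistry();
const issuedRefs = registry.issue(["Submit button", "Email field"]);
console.log(registry.resolve(issuedRefs[0] ?? "")); // resolves while the snapshot is current
registry.invalidate();                              // e.g. after a click
console.log(registry.resolve(issuedRefs[0] ?? "")); // now reports stale_ref
```

Failing loudly here forces the caller to take a new snapshot, which is the trade the README describes as "no silent locator drift".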
+## Use with parent rolepod
+If you also use [`rolepod`](https://github.com/nuttaruj/rolepod) (the markdown plugin), its `check-work`, `debug-issue`, and `review-code` skills auto-route to `/verify-ui`, `/audit-a11y`, and `/visual-diff` when the rolepod-uiproof server is present. If the server is absent, nothing breaks: the parent falls back to Playwright MCP, Chrome DevTools MCP, or manual verification.
+The two are **independent**: install rolepod-uiproof standalone and get a complete experience via slash commands, or install both together and let parent's phase router pick the right backend automatically.
+## Docs
+- [docs/sessions.md](docs/sessions.md) — session lifecycle, stale-ref semantics, multi-session
+- [docs/artifacts.md](docs/artifacts.md) — `.rolepod-uiproof/` layout, run_id convention, replay bundle format
+- [docs/recipes/](docs/recipes/) — `verify-a-checkout-flow`, `audit-a11y-during-review`, `visual-baseline-workflow`
+- [CHANGELOG.md](CHANGELOG.md) — release history with per-version "Not yet verified" notes mapped to milestones
+- [CONTRIBUTING.md](CONTRIBUTING.md), [SECURITY.md](SECURITY.md), [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md)
+---
+MIT licensed; see [LICENSE](LICENSE) and [THIRD_PARTY.md](THIRD_PARTY.md). Mobile AT normalizers are alumnium-inspired ([UPSTREAM_TRACKING.md](UPSTREAM_TRACKING.md)). Feedback and runtime reports for the Cursor, Codex, and Gemini install paths are especially welcome via [issues](https://github.com/nuttaruj/rolepod-uiproof/issues).

package/THIRD_PARTY.md ADDED

@@ -0,0 +1,104 @@
+# Third-Party Notices
+rolepod-uiproof depends on and (in later milestones) incorporates code from the
+following third-party projects. Each entry states its license (MIT, Apache-2.0,
+MPL-2.0, or ISC); all are compatible with this project's MIT license (see `LICENSE`).
+---
+## alumnium
+- **Project:** [alumnium-hq/alumnium](https://github.com/alumnium-hq/alumnium)
+- **License:** MIT
+- **Used for:** Driver abstraction and accessibility-tree extractors for
+  Chromium (web), XCUITest (iOS), and UIAutomator2 (Android).
+- **Relationship:** Design reference, not a dependency. alumnium is not
+  depended on as an npm package, and as of v0.3 no alumnium source is
+  copied (see the note below). The LLM-driven `Alumni` class, LangChain
+  bindings, and OpenAI integration are out of scope. See
+  `UPSTREAM_TRACKING.md` for the rationale, the upstream commit
+  referenced, and the cherry-pick policy.
+### Forked files
+> **Note (v0.3):** After surveying alumnium during scaffolding we chose
+> an **inspired-by** reimplementation rather than a verbatim fork. See
+> [`UPSTREAM_TRACKING.md`](UPSTREAM_TRACKING.md) for the reasoning,
+> the upstream commit SHA referenced, and the quarterly cherry-pick
+> policy.
+>
+> The Chromium AT path uses Playwright 1.60's built-in
+> `page.ariaSnapshot({mode:'ai'})` directly. The mobile AT extractors
+> (`src/engine/a11y/xcuitest.ts`, `uiautomator2.ts`) are
+> alumnium-inspired original code parsing Appium's XML page source via
+> `fast-xml-parser`.
+>
+> Should literal alumnium source be copied in a future revision, each
+> file will carry this header:
+>
+> ```
+> /*
+>  * Originally from alumnium-hq/alumnium (MIT License).
+>  * Source commit: <SHA>
+>  * Modified for rolepod-uiproof.
+>  */
+> ```
+### Upstream MIT notice
+```
+MIT License
+Copyright (c) alumnium-hq contributors
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+```
+---
+## Runtime npm dependencies
+The following npm packages are direct runtime dependencies. Each retains its
+own license; this section is acknowledgement only.
+- `@modelcontextprotocol/sdk` — MIT — MCP protocol implementation.
+- `playwright` — Apache-2.0 — Web automation engine for the `web` platform.
+- `zod` — MIT — Tool input/output schema validation.
+- `js-yaml` — MIT — Parses Playwright's `ariaSnapshot({mode:'ai'})` YAML
+  output into the unified `A11yNode` tree.
+- `@axe-core/playwright` — MPL-2.0 — Powers the `rolepod_audit_a11y`
+  composite. axe-core is licensed under MPL-2.0 (weak copyleft); using it
+  as an unmodified runtime dependency is compatible with this project's
+  MIT license. We do not modify axe-core source.
+- `pixelmatch` — ISC — Pixel-level image comparison for
+  `rolepod_visual_diff`.
+- `pngjs` — MIT — PNG encode/decode for baseline + diff images in
+  `rolepod_visual_diff`.
+- `fast-xml-parser` — MIT — Parses Appium's XML page source in the
+  mobile AT normalizers (`xcuitest.ts`, `uiautomator2.ts`).
+## Optional npm dependencies
+- `webdriverio` — MIT — Loaded lazily by `AppiumEngine` when a mobile
+  session is requested. Web-only installs skip it via npm
+  `optionalDependencies`.
+## Build-time-only dependencies
+- `zod-to-json-schema` — ISC — Used by `npm run build:schemas` to emit
+  `dist/schemas/tools.json`. Not shipped at runtime.

package/UPSTREAM_TRACKING.md ADDED

@@ -0,0 +1,58 @@
+# Upstream tracking
+Records the upstream sources that have informed rolepod-uiproof's design and
+implementation, so that future cherry-picks or audits can locate them.
+## alumnium-hq/alumnium (MIT)
+- **Repo:** https://github.com/alumnium-hq/alumnium
+- **Commit referenced:** `94dea1e6916c3fb8e38fc229a7c7c85aa6230d52`
+- **Date referenced:** 2026-05-24
+- **Used as:** Design reference for mobile accessibility-tree shape and
+  the XCUITest / UIAutomator2 XML → unified tree mapping.
+### Status
+The original engine-layer design specified a *verbatim fork* of
+alumnium's `packages/typescript/src/drivers/` and
+`packages/typescript/src/accessibility/`. After surveying the source
+during v0.3 scaffolding we instead chose an **inspired-by**
+reimplementation:
+- alumnium uses bun-style `.ts` import extensions throughout; our
+  Node + tsup setup uses `.js` resolution. Mass-renaming imports would
+  have been the bulk of any literal fork effort.
+- alumnium pulls four runtime XML deps (`domhandler`, `htmlparser2`,
+  `dom-serializer`, `xml-formatter`) plus its internal `alwaysly`
+  helper. `fast-xml-parser` (MIT, single dep) covers our needs.
+- The accessibility-tree types in alumnium serve their LLM `Alumni`
+  class; ours serve the unified `A11yNode` schema in
+  `src/schema/tools.ts`, so the field set differs anyway.
+### What we keep from alumnium
+- The overall shape of the XCUITest and UIAutomator2 tree extractors —
+  walk the Appium XML page source, assign stable refs, map native
+  attributes (`name`, `label`, `value`, `content-desc`, `text`,
+  `resource-id`, etc.) into a normalized accessibility shape.
+- The decision to use Appium's `getPageSource` as the AT entry point
+  for mobile (alumnium proved this is workable).
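The extractor shape described in the list above (walk the page-source tree, assign refs, map native attributes into one normalized node) can be sketched as follows. This is not the package's code: `RawElement` stands in for already-parsed XML, `SketchNode` is a hypothetical simplification of the `A11yNode` schema (whose real fields are not shown here), and the attribute precedence order is an assumption.

```typescript
// Hypothetical stand-in for one element of the parsed Appium XML page source.
interface RawElement {
  tag: string;
  attrs: Record<string, string>;
  children: RawElement[];
}

// Hypothetical normalized node; the real A11yNode field set may differ.
interface SketchNode {
  ref: string;
  role: string;
  name: string;
  value?: string;
  children: SketchNode[];
}

function normalizeTree(root: RawElement): SketchNode {
  let counter = 0;
  const visit = (element: RawElement): SketchNode => {
    // Refs are assigned in pre-order, so a parent always gets a lower number than its children.
    const ref = `e${counter++}`;
    const attrs = element.attrs;
    // Assumed precedence: iOS-style label/name first, then Android content-desc/text/resource-id.
    const name =
      attrs["label"] ||
      attrs["name"] ||
      attrs["content-desc"] ||
      attrs["text"] ||
      attrs["resource-id"] ||
      "";
    const node: SketchNode = {
      ref,
      role: element.tag,
      name,
      children: element.children.map(visit),
    };
    if (attrs["value"] !== undefined) node.value = attrs["value"];
    return node;
  };
  return visit(root);
}

// Example input mixing an Android-style and an iOS-style element for illustration only.
const normalized = normalizeTree({
  tag: "android.widget.FrameLayout",
  attrs: {},
  children: [
    {
      tag: "android.widget.Button",
      attrs: { "content-desc": "Buy now", "resource-id": "com.shop:id/buy" },
      children: [],
    },
    {
      tag: "XCUIElementTypeTextField",
      attrs: { label: "Email", value: "a@b.c" },
      children: [],
    },
  ],
});
console.log(JSON.stringify(normalized, null, 2));
```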
+### What we DO NOT keep
+- The `Alumni` / LLM-driven action loop (incompatible with our
+  Lead-driven D-004 design).
+- The `Xml` namespace + 4 XML parsing deps.
+- The `pythonic*` polyfills.
+- The CLI / MCP wrappers (we have our own).
+- Literal source files.
+### Quarterly cherry-pick policy
+Each quarter, review alumnium's commit log between the SHA above and
+their `HEAD`. Cherry-pick *behavioral* fixes that apply (a UIAutomator2
+attribute we missed, an XCUITest edge case, etc.). We do **not** commit
+to staying current on every bug fix; we commit to staying *correct*.
+When alumnium fixes a behavioral bug we share, update this file with
+the new SHA + date + a one-line note on what changed.

package/dist/bin/rolepod-uiproof.d.ts ADDED

@@ -0,0 +1 @@
+#!/usr/bin/env node