bun-uia 1.0.0

Files changed (3)
  1. package/README.md +77 -0
  2. package/index.ts +1 -0
  3. package/package.json +54 -0
package/README.md ADDED
@@ -0,0 +1,77 @@
1
+ # bun-uia
2
+
3
+ **Playwright for Windows desktop apps.** Query the live UI Automation accessibility tree by name and role, invoke controls, type, wait for elements, and serialize a window to JSON for an LLM agent — from Bun, with **zero native dependencies**. No node-gyp, no prebuild matrix, no Appium server, no .NET.
4
+
5
+ ```ts
6
+ import { ControlType, uia } from 'bun-uia';
7
+
8
+ const app = await uia.launch(['notepad.exe'], { className: 'Notepad' });
9
+ const edit = await app.waitFor({ controlType: ControlType.Document });
10
+ edit.focus().type('nothing native compiles, and it just works');
11
+ console.log(edit.text()); // → nothing native compiles, and it just works
12
+ ```
13
+
14
+ ```ts
15
+ // Drive Calculator to 5 + 3 = 8 by name — survives DPI/theme/layout shifts that break pixel scripts:
16
+ const calc = await uia.launch(['cmd', '/c', 'start', 'calc'], { title: 'Calculator' });
17
+ for (const name of ['Five', 'Plus', 'Three', 'Equals']) calc.find({ controlType: ControlType.Button, name })?.invoke();
18
+ console.log(calc.find({ automationId: 'CalculatorResults' })?.name); // → "Display is 8"
19
+ ```
20
+
21
+ `bun add bun-uia` is the entire install story.
22
+
23
+ ## Why this exists
24
+
25
+ The Windows desktop-automation packages on npm are a field of native-addon pain, paywalls, and abandoned daemons. The weekly download counts below were verified against `api.npmjs.org` for the week of 2026-06-05 to 2026-06-11.
26
+
27
+ | Tool | Weekly dl | Install / runtime | The catch |
28
+ | --- | --- | --- | --- |
29
+ | `@nut-tree-fork/nut-js` | 32,360 | libnut N-API addon (cmake-js) | Fork of a **paywalled** original — *"all of my packages around nut.js will cease to exist publicly on npm … only available through the private … registry, which requires an active subscription."* Pixel/image-match, **no a11y tree**. |
30
+ | `appium-windows-driver` | 30,749 | Appium server **+ a separate WinAppDriver.exe** | *"WinAppDriver server has not been maintained by Microsoft for years … Developer mode must be enabled."* Two daemons + a W3C HTTP hop per element read. |
31
+ | `@jitsi/robotjs` / `robotjs` | 15,333 / 11,375 | node-gyp / prebuild matrix | *"No prebuilt binaries found … node-gyp rebuild"* C++ compile fallback — the #1 documented install failure. Blind pixel + keystroke, **no element model**. |
32
+ | `uiohook-napi` (input hooks) | 21,965 | N-API addon | Healthy — but global `SetWindowsHookEx` hooks run on a foreign thread and can assert/segfault (node-addon-api #903). |
33
+ | `@bright-fish/node-ui-automation` | 33 | NAPI/COM native addon | The only real npm UIA wrapper — **dead since 2022**. |
34
+ | NodeRT `windows.ui.uiautomation` | 15 | NodeRT native addon | Dead 2022 **and wrong namespace** (projects WinRT, not the Win32 `IUIAutomation`). |
35
+ | FlaUI / pywinauto / AutoIt | n/a | .NET / Python / bespoke EXE | A foreign runtime to install and ship. |
36
+
37
+ **There is no zero-install, typed, in-process `IUIAutomation` client for Node or Bun.** bun-uia is a few kilobytes of TypeScript over `bun:ffi` — the runtime's own FFI, not a third-party N-API addon that rots against each Node minor (*"PLEASE ARCHIVE THIS REPO"* — node-ffi-napi #269). It **can't be paywalled** (no compiled binary to gate behind a subscription registry), has **no build step** (no node-gyp, no ABI matrix, no MSVC/Python), and talks to UIA **in-process** (no WinAppDriver.exe, no Appium daemon, no `127.0.0.1:4723` round-trip, no Developer Mode).
38
+
39
+ ## What you can do
40
+
41
+ - **Find controls semantically** — by name, role, or automationId, not a fragile `(x, y)`. Exact scalars compile to a **server-side** UIA condition (the target app filters in-process); regex/substring filter client-side.
42
+ - **Act** — `invoke()`, `click()`, `setValue()`, `type()`, `toggle()`, `expand()`, `select()`, `setRangeValue()`, window `close()`/`setVisualState()`. Each pattern is proven against a real control.
43
+ - **`waitFor`** — Playwright-class auto-retry for flaky native UIs. No other Windows-desktop npm tool has it. Timeouts quote the selector, the window, and the nearest candidates.
44
+ - **Read & assert** — `value`, `text()`, `isEnabled`, `boundingRectangle`, `toggleState`. Read state back through the tree to assert — pixel tools can't.
45
+ - **Serialize the tree to JSON** for an LLM agent (`uia.tree`), with a compact, token-efficient agent profile.
46
+ - **Screenshot** any window via PrintWindow (works even on a locked session).
47
+ - **MSAA fallback** (`uia.msaaTree`) for legacy / owner-draw windows.
48
+ - **Crash-safe input observation** via `GetAsyncKeyState` polling — no foreign-thread hook, no message-pump assert.
49
+
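The auto-retry idea behind `waitFor` can be sketched as a plain polling loop. This is a minimal illustration, not bun-uia's implementation: `pollUntil`, `PollOptions`, `timeoutMs`, `intervalMs`, and `describe` are hypothetical names, and the probe stands in for any synchronous lookup that returns nothing until the element exists.

```ts
// Hypothetical sketch of auto-retry polling; not bun-uia's actual waitFor.
interface PollOptions {
  timeoutMs?: number; // give up after this long
  intervalMs?: number; // pause between attempts
  describe?: string; // quoted in the timeout error to aid debugging
}

async function pollUntil<T>(probe: () => T | null | undefined, options: PollOptions = {}): Promise<T> {
  const { timeoutMs = 5000, intervalMs = 100, describe = 'condition' } = options;
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = probe();
    // Any non-nullish result counts as "found".
    if (result !== null && result !== undefined) return result;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms waiting for ${describe}`);
    }
    // Yield to the event loop before the next attempt.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Putting the selector description into the error message is what makes a timeout actionable: the failure names what was being waited for instead of reporting a bare timeout.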
50
+ ## For AI agents
51
+
52
+ Frontier computer-use agents ground actions in **screenshots**, and the literature calls that approach fragile and expensive. Microsoft **UFO2** (arXiv 2504.14603) fuses the **UI Automation tree first, vision second**, to fix *"fragile screenshot-based interaction"*; OmniParser exists because VLMs can't reliably locate clickable elements from a bitmap; and **OSWorld-Human** (arXiv 2506.16042) reports a11y-tree builds taking **3–26 seconds** and *"thousands more tokens per step."*
53
+
54
+ bun-uia is exactly that UIA-first substrate — served **fast and in-process**. `uia.tree(app, { agentProfile: true })` walks a window's subtree in **one cached round-trip** and emits ground-truth `{ role, name, automationId, bounds, children }` an agent acts on without pixel-counting. The measured build time below beats the OSWorld 3–26 s reference by **two-to-three orders of magnitude**. `uia.execute(app, actions)` runs a JSON action list; `AGENT_TOOLS` is a ready LLM tool schema. Honest limit: UIA can't see owner-draw/canvas/games, so this **complements** a vision agent rather than replacing screenshots.
55
+
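To make the token argument concrete, here is a small sketch of how a `{ role, name, automationId, bounds, children }` tree might be trimmed before it is handed to a model. It is an assumption-laden illustration, not what `uia.tree` does: the `UiNode` type, the `compact` helper, and the sample data (including the `num5Button` id) are all hypothetical.

```ts
// Hypothetical node shape; field names mirror the README, the types are assumed.
interface UiNode {
  role: string;
  name?: string;
  automationId?: string;
  bounds?: { x: number; y: number; width: number; height: number };
  children?: UiNode[];
}

// Drop empty strings and empty child arrays so the JSON carries fewer tokens.
function compact(node: UiNode): UiNode {
  const out: UiNode = { role: node.role };
  if (node.name) out.name = node.name;
  if (node.automationId) out.automationId = node.automationId;
  if (node.bounds) out.bounds = node.bounds;
  const children = (node.children ?? []).map(compact);
  if (children.length > 0) out.children = children;
  return out;
}

// Made-up sample tree, for illustration only.
const sample: UiNode = {
  role: 'Window',
  name: 'Calculator',
  automationId: '',
  children: [
    { role: 'Button', name: 'Five', automationId: 'num5Button', children: [] },
    { role: 'Text', name: '', automationId: 'CalculatorResults', children: [] },
  ],
};

const json = JSON.stringify(compact(sample));
console.log(json);
```

Every field that survives is something an agent can act on by name or id; the empty ones only cost tokens.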
56
+ ## Benchmarks
57
+
58
+ Measured on Windows 11, Bun 1.4, by `bun run example/benchmark.ts` (run it to reproduce):
59
+
60
+ | operation | result |
61
+ | --- | --- |
62
+ | single property read (cross-process) | ~55 µs |
63
+ | naive subtree walk (65 nodes) | ~44 ms |
64
+ | **cached subtree walk** (one round-trip) | **~37 ms** (1.2× faster; the gap widens with tree size) |
65
+ | agent-grounding tree build | ~9 ms, ~2.7k tokens |
66
+ | **vs OSWorld a11y-tree build (3–26 s)** | **~345–2987× faster** |
67
+
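The speedup row is simple division, and it can be checked directly. The sketch below uses the rounded ~9 ms figure from the table, which gives slightly lower ratios than the quoted 345–2987×; those quoted values would be consistent with an unrounded build time near 8.7 ms, which is an inference from the numbers rather than something the benchmark output states.

```ts
// Speedup of a build taking `fastMs` relative to a reference taking `referenceMs`.
function speedup(referenceMs: number, fastMs: number): number {
  return referenceMs / fastMs;
}

const buildMs = 9; // rounded "~9 ms" figure from the table
const low = speedup(3_000, buildMs); // against the 3 s end of the OSWorld range
const high = speedup(26_000, buildMs); // against the 26 s end

console.log(Math.round(low), Math.round(high)); // → 333 2889
console.log(Math.log10(low).toFixed(1), Math.log10(high).toFixed(1)); // → 2.5 3.5
```

In orders of magnitude that is roughly 2.5 to 3.5, in line with the "two-to-three orders" claim made earlier.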
68
+ ## Requirements & honest scoping
69
+
70
+ - **Windows 10/11, Bun ≥ 1.1.** Windows-only and Bun-only. That is a deliberate trade-off: nut.js, robotjs, and uiohook are genuinely cross-platform; this is not.
71
+ - **UIA-tree based.** Apps with no accessibility tree (games, canvas/WebGL, custom-draw) get MSAA + screenshots + coordinate `click()`, not vision matching — a complement to screenshot tools, not a replacement.
72
+ - **Synthetic input (`type`/`sendKeys`/`click`) needs an unlocked, interactive desktop.** UIA queries, `invoke`, `setValue`, and `screenshot` work on a locked session; prefer them.
73
+ - **Selectors are client-side for regex/substring** (exact scalars are server-side). **UIA events are on the roadmap**; until then, poll with `waitFor`. `scrollIntoView` is implemented but not yet proven against a real list.
74
+
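The server-side versus client-side split for selectors can be pictured as a partition step. The sketch below is one plausible way to do it and is not taken from bun-uia's source: `Selector`, `splitSelector`, and `matchesPatterns` are hypothetical names, and only string-valued properties are modelled.

```ts
// Hypothetical selector shape: each field may be an exact string or a RegExp.
type Matcher = string | RegExp;
interface Selector {
  name?: Matcher;
  automationId?: Matcher;
  className?: Matcher;
}

interface SplitSelector {
  exact: Record<string, string>; // candidates for a provider-side property condition
  patterns: Record<string, RegExp>; // must be tested client-side after enumeration
}

function splitSelector(selector: Selector): SplitSelector {
  const split: SplitSelector = { exact: {}, patterns: {} };
  for (const [key, value] of Object.entries(selector)) {
    if (typeof value === 'string') split.exact[key] = value;
    else if (value instanceof RegExp) split.patterns[key] = value;
  }
  return split;
}

// Client-side pass: keep only candidates whose properties match every pattern.
function matchesPatterns(candidate: Record<string, string>, patterns: Record<string, RegExp>): boolean {
  return Object.entries(patterns).every(([key, pattern]) => pattern.test(candidate[key] ?? ''));
}

const { exact, patterns } = splitSelector({ automationId: 'CalculatorResults', name: /^Display is/ });
console.log(exact); // { automationId: 'CalculatorResults' }
console.log(matchesPatterns({ name: 'Display is 8' }, patterns)); // → true
```

The practical consequence is that exact fields narrow the search before results cross the process boundary, while a regex only filters what has already been enumerated, so pairing a regex with at least one exact field keeps the candidate set small.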
75
+ Read [`AI.md`](https://github.com/ObscuritySRL/bun-win32/blob/main/packages/uia/AI.md) — it is the complete surface; an agent should not need the source.
76
+
77
+ MIT.
package/index.ts ADDED
@@ -0,0 +1 @@
1
+ export * from '@bun-win32/uia';
package/package.json ADDED
@@ -0,0 +1,54 @@
1
+ {
2
+ "author": "Stev Peifer <stev.p@outlook.com>",
3
+ "bugs": {
4
+ "url": "https://github.com/ObscuritySRL/bun-win32/issues"
5
+ },
6
+ "dependencies": {
7
+ "@bun-win32/uia": "1.0.0"
8
+ },
9
+ "description": "Playwright for Windows desktop apps — drive and test native Windows GUIs from Bun by querying the UI Automation tree, invoking controls by name, typing, and asserting. Zero native dependencies, no node-gyp, no server.",
10
+ "devDependencies": {
11
+ "@types/bun": "latest"
12
+ },
13
+ "exports": {
14
+ ".": "./index.ts"
15
+ },
16
+ "license": "MIT",
17
+ "module": "index.ts",
18
+ "name": "bun-uia",
19
+ "peerDependencies": {
20
+ "typescript": "^5"
21
+ },
22
+ "private": false,
23
+ "homepage": "https://github.com/ObscuritySRL/bun-win32#readme",
24
+ "repository": {
25
+ "type": "git",
26
+ "url": "git://github.com/ObscuritySRL/bun-win32.git",
27
+ "directory": "packages/bun-uia"
28
+ },
29
+ "type": "module",
30
+ "version": "1.0.0",
31
+ "main": "./index.ts",
32
+ "keywords": [
33
+ "accessibility",
34
+ "automation",
35
+ "bun",
36
+ "desktop",
37
+ "e2e",
38
+ "ffi",
39
+ "playwright",
40
+ "testing",
41
+ "typescript",
42
+ "uiautomation",
43
+ "win32",
44
+ "windows"
45
+ ],
46
+ "files": [
47
+ "README.md",
48
+ "index.ts"
49
+ ],
50
+ "sideEffects": false,
51
+ "engines": {
52
+ "bun": ">=1.1.0"
53
+ }
54
+ }