appium-desktop-driver 1.2.0 → 1.4.1

package/README.md CHANGED
@@ -1,609 +1,648 @@
1
- NovaWindows Driver
2
- ===================
3
-
4
- NovaWindows Driver is a custom Appium driver designed to tackle the limitations of existing Windows automation solutions like WinAppDriver. NovaWindows Driver supports testing Universal Windows Platform (UWP), Windows Forms (WinForms), Windows Presentation Foundation (WPF), and Classic Windows (Win32) apps on Windows 10 PCs. Built to improve performance and reliability for traditional desktop applications, it offers:
5
-
6
- Faster XPath locator performance — Reduces element lookup times, even in complex UIs.
7
- RawView element support — Access elements typically hidden from the default ControlView/ContentView.
8
- Enhanced text input handling — Solves keyboard layout issues while improving input speed.
9
- Platform-specific commands — Supports direct window manipulation, advanced UI interactions, and more.
10
- It’s designed to handle real-world scenarios where traditional drivers fall short — from tricky dropdowns to missing elements and unreliable clicks — making it an ideal choice for automating legacy Windows apps.
11
-
12
- > **Note**
13
- >
14
- > This driver is built for Appium 2/3 and is not compatible with Appium 1. To install
15
- > the driver, simply run:
16
- > `appium driver install --source=npm appium-novawindows-driver`
17
-
18
-
19
- ## Usage
20
-
21
- Besides the standard Appium requirements, NovaWindows Driver adds the following prerequisites:
22
-
23
- - Appium Windows Driver only supports Windows 10 and later as the host.
24
-
25
- > **Note**
26
- >
27
- > The driver currently uses a PowerShell session as a back-end, and
28
- > should not require Developer Mode to be on, or any other software.
29
- > There's a plan to update to a better, .NET-based backend for improved
30
- > reliability and better code and error management, as well as supporting
31
- > more features, that are currently not possible using PowerShell alone.
32
- > It is unlikely for the prerequisites to change, as this is one of the
33
- > main goals of NovaWindows driver – seamless setup on any PC.
34
-
35
- NovaWindows Driver supports the following capabilities:
36
-
37
- Capability Name | Description
38
- --- | ---
39
- platformName | Must be set to `Windows` (case-insensitive).
40
- automationName | Must be set to `DesktopDriver` (case-insensitive).
41
- smoothPointerMove | CSS-like easing function (including valid Bezier curve). This controls the smooth movement of the mouse for `delayBeforeClick` ms. Example: `ease-in`, `cubic-bezier(0.42, 0, 0.58, 1)`.
42
- delayBeforeClick | Time in milliseconds before a click is performed.
43
- delayAfterClick | Time in milliseconds after a click is performed.
44
- appTopLevelWindow | The handle of an existing application top-level window to attach to. It can be a number or string (not necessarily hexadecimal). Example: `12345`, `0x12345`.
45
- shouldCloseApp | Whether to close the window of the application in test after the session finishes. Default is `true`.
46
- appArguments | Optional string of arguments to pass to the app on launch.
47
- appWorkingDir | Optional working directory path for the application.
48
- prerun | An object containing either `script` or `command` key. The value of each key must be a valid PowerShell script or command to be executed prior to the WinAppDriver session startup. See [Power Shell commands execution](#power-shell-commands-execution) for more details. Example: `{script: 'Get-Process outlook -ErrorAction SilentlyContinue'}`
49
- postrun | An object containing either `script` or `command` key. The value of each key must be a valid PowerShell script or command to be executed after WinAppDriver session is stopped. See [Power Shell commands execution](#power-shell-commands-execution) for more details.
50
- isolatedScriptExecution | Whether PowerShell scripts are executed in an isolated session. Default is `false`.
51
- appEnvironment | Optional object of custom environment variables to inject into the PowerShell session. The variables are only available for the lifetime of the session and do not affect the system environment. Example: `{"MY_VAR": "hello", "API_URL": "http://localhost:3000"}`.
52
- returnAllWindowHandles | When `true`, `getWindowHandles()` returns all top-level windows on the desktop (UIA root children) instead of only the windows belonging to the launched app. Useful for switching to arbitrary system windows. Default is `false`.
53
- ms:waitForAppLaunch | Time in seconds to wait for the app window to appear after launch. Default is `0` (falls back to 10 000 ms internal timeout).
54
- ms:windowSwitchRetries | Maximum number of retry attempts in `setWindow()` when the target window is not yet visible. Must be a non-negative integer. Default is `20`.
55
- ms:windowSwitchInterval | Sleep duration in milliseconds between each retry in `setWindow()`. Must be a non-negative integer. Default is `500`.
56
-
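As a rough illustration of the capability table above, the following Python sketch assembles a capabilities dictionary and checks the two retry-related values, which the table says must be non-negative integers. The helper name `build_capabilities` is hypothetical, the `app` key is taken from the README's own example rather than the table, and any vendor prefixing your client requires is left out of the sketch.

```python
def build_capabilities(app, switch_retries=20, switch_interval_ms=500):
    """Return a capabilities dict using the names listed in the table.

    Hypothetical helper for illustration only; it is not part of the driver.
    """
    checks = (
        ("ms:windowSwitchRetries", switch_retries),
        ("ms:windowSwitchInterval", switch_interval_ms),
    )
    for name, value in checks:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    return {
        "platformName": "Windows",
        "automationName": "DesktopDriver",
        "app": app,
        "ms:windowSwitchRetries": switch_retries,
        "ms:windowSwitchInterval": switch_interval_ms,
    }


caps = build_capabilities(r"C:\Windows\System32\notepad.exe", switch_retries=5)
print(caps["ms:windowSwitchRetries"], caps["ms:windowSwitchInterval"])
```

Validating on the client side is only a convenience here; the driver remains the authority on which values it accepts.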
57
- Please note that more capabilities will be added as the development of this driver progresses. Since it is still in its early stages, some features may be missing or subject to change. If you need a specific capability or encounter any issues, please feel free to open an issue.
58
-
59
- ## Example
60
-
61
- ```python
62
- # Python3 + PyTest
63
- import pytest
64
-
65
- from appium import webdriver
66
- from appium.options.windows import WindowsOptions
67
-
68
- def generate_options():
69
- uwp_options = WindowsOptions()
70
- # How to get the app ID for Universal Windows Apps (UWP):
71
- # https://www.securitylearningacademy.com/mod/book/view.php?id=13829&chapterid=678
72
- uwp_options.app = 'Microsoft.WindowsCalculator_8wekyb3d8bbwe!App'
73
- uwp_options.automation_name = 'NovaWindows'
74
-
75
- classic_options = WindowsOptions()
76
- classic_options.app = 'C:\\Windows\\System32\\notepad.exe'
77
- classic_options.automation_name = 'NovaWindows'
78
-
79
- use_existing_app_options = WindowsOptions()
80
- # Active window handles could be retrieved from any compatible UI inspector app:
81
- # https://docs.microsoft.com/en-us/windows/win32/winauto/inspect-objects
82
- # or https://accessibilityinsights.io/.
83
- # Also, it is possible to use the corresponding WinApi calls for this purpose:
84
- # https://referencesource.microsoft.com/#System/services/monitoring/system/diagnosticts/ProcessManager.cs,db7ac68b7cb40db1
85
- #
86
- # This capability could be used to create a workaround for UWP apps startup:
87
- # https://github.com/microsoft/WinAppDriver/blob/master/Samples/C%23/StickyNotesTest/StickyNotesSession.cs
88
- use_existing_app_options.app_top_level_window = hex(12345)
89
- use_existing_app_options.automation_name = 'NovaWindows'
90
-
91
- return [uwp_options, classic_options, use_existing_app_options]
92
-
93
-
94
- @pytest.fixture(params=generate_options())
95
- def driver(request):
96
- drv = webdriver.Remote('http://127.0.0.1:4723', options=request.param)
97
- yield drv
98
- drv.quit()
99
-
100
-
101
- def test_app_source_could_be_retrieved(driver):
102
- assert len(driver.page_source) > 0
103
- ```
104
-
105
-
106
- ## Power Shell commands execution
107
-
108
- Just like in Appium Windows Driver (version 1.15.0 and above) there is a possibility to
109
- run custom Power Shell scripts from your client code. This feature is potentially insecure
110
- and thus needs to be explicitly enabled when executing the server by providing `power_shell`
111
- key to the list of enabled insecure features. Refer to [Appium Security document](https://github.com/appium/appium/blob/master/docs/en/writing-running-appium/security.md) for more details.
112
- It is possible to either execute a single Power Shell command or a whole script
113
- and get its stdout in response. If the script execution returns non-zero exit code then an exception
114
- is going to be thrown. The exception message will contain the actual stderr. Unlike Appium Windows Driver,
115
- there is no difference if you paste the script with `command` or `script` argument. For ease of use, you can pass the script as a string when executing a PowerShell command directly via the driver. Note: This shorthand does not work when using the prerun or postrun capabilities, which require full object syntax.
116
- Here's an example code of how to control the Notepad process:
117
-
118
- ```java
119
- // java
120
- String psScript =
121
- "$sig = '[DllImport(\"user32.dll\")] public static extern bool ShowWindowAsync(IntPtr hWnd, int nCmdShow);'\n" +
122
- "Add-Type -MemberDefinition $sig -name NativeMethods -namespace Win32\n" +
123
- "Start-Process Notepad\n" +
124
- "$hwnd = @(Get-Process Notepad)[0].MainWindowHandle\n" +
125
- "[Win32.NativeMethods]::ShowWindowAsync($hwnd, 2)\n" +
126
- "[Win32.NativeMethods]::ShowWindowAsync($hwnd, 4)\n" +
127
- "Stop-Process -Name Notepad";
128
- driver.executeScript("powerShell", psScript);
129
- ```
130
-
131
- Another example, which demonstrates how to use the command output:
132
-
133
- ```python
134
- # python
135
- cmd = 'Get-Process outlook -ErrorAction SilentlyContinue'
136
- proc_info = driver.execute_script('powerShell', cmd)
137
- if proc_info:
138
- print('Outlook is running')
139
- else:
140
- print('Outlook is not running')
141
- ```
142
-
143
- > **Note**
144
- >
145
- > NovaWindows Driver runs on a single PowerShell session,
146
- > therefore you may share variables between executed PowerShell
147
- > scripts. Unless the PowerShell session exits or crashes for some
148
- > reason, you should be able to reuse the variables that you create.
149
-
150
-
151
- ## Element Location
152
-
153
- Appium Windows Driver supports the same location strategies [the WinAppDriver supports](https://github.com/microsoft/WinAppDriver/blob/master/Docs/AuthoringTestScripts.md#supported-locators-to-find-ui-elements), but also includes Windows UIAutomation conditions:
154
-
155
- Name | Description | Example
156
- --- | --- | ---
157
- accessibility id | This strategy is AutomationId attribute in inspect.exe | AppNameTitle
158
- class name | This strategy is ClassName attribute in inspect.exe | TextBlock
159
- id | This strategy is RuntimeId (decimal) attribute in inspect.exe | 42.333896.3.1
160
- name | This strategy is Name attribute in inspect.exe | Calculator
161
- tag name | This strategy is LocalizedControlType (upper camel case) attribute in inspect.exe since Appium Windows Driver 2.1.1 | Text
162
- xpath | This strategy allows to create custom XPath queries on any attribute exposed by inspect.exe. Only XPath 1.0 is supported | (//Button)[2]
163
- windows uiautomation | This strategy allows to create custom Windows UIAutomation conditions on any attribute exposed by inspect.exe. Both C# and PowerShell syntax is supported | new PropertyCondition(AutomationElement.HelpTextProperty, "Info")
164
-
165
- ## Platform-Specific Extensions
166
-
167
- Besides the standard W3C APIs, the driver provides the custom command extensions below to execute platform-specific scenarios. Use the following source code examples to invoke them from your client code:
168
-
169
- > **Note**
170
- >
171
- > In most cases, commands implemented in NovaWindows driver can be used
172
- > more intuitively by just passing the element as the second argument and the value
173
- > (if such is needed) as the third argument and so on. For example:
174
- > `driver.executeScript("windows: setValue", element, "valueToSet")` or
175
- > `driver.executeScript("windows: invoke", element)`. Commands that are created
176
- > as fallbacks to Appium Windows Driver should work as is. Open an issue if some
177
- > command that you need is missing or is not behaving as it should.
178
-
179
- ```java
180
- // Java 11+
181
- var result = driver.executeScript("windows: <methodName>", Map.of(
182
- "arg1", "value1",
183
- "arg2", "value2"
184
- // you may add more pairs if needed or skip providing the map completely
185
- // if all arguments are defined as optional
186
- ));
187
- ```
188
-
189
- ```js
190
- // WebdriverIO
191
- const result = await driver.executeScript('windows: <methodName>', [{
192
- arg1: "value1",
193
- arg2: "value2",
194
- }]);
195
- ```
196
-
197
- ```python
198
- # Python
199
- result = driver.execute_script('windows: <methodName>', {
200
- 'arg1': 'value1',
201
- 'arg2': 'value2',
202
- })
203
- ```
204
-
205
- ```ruby
206
- # Ruby
207
- result = @driver.execute_script 'windows: <methodName>', {
208
- arg1: 'value1',
209
- arg2: 'value2',
210
- }
211
- ```
212
-
213
- ```csharp
214
- // Dotnet
215
- object result = driver.ExecuteScript("windows: <methodName>", new Dictionary<string, object>() {
216
- {"arg1", "value1"},
217
- {"arg2", "value2"}
218
- });
219
- ```
220
-
221
- ### windows: click
222
-
223
- This is a shortcut for a single mouse click gesture.
224
-
225
- #### Arguments
226
-
227
- Name | Type | Required | Description | Example
228
- --- | --- | --- | --- | ---
229
- elementId | string | no | Hexadecimal identifier of the element to click on. If this parameter is missing then given coordinates will be parsed as absolute ones. Otherwise they are parsed as relative to the top left corner of this element. | 123e4567-e89b-12d3-a456-426614174000
230
- x | number | no | Integer horizontal coordinate of the click point. Both x and y coordinates must be provided or none of them if elementId is present. In such case the gesture will be performed at the center point of the given element. The screen scale (if customized) is **not** taken into consideration while calculating the coordinate. The coordinate is always calculated for the [virtual screen](https://learn.microsoft.com/en-us/windows/win32/gdi/the-virtual-screen). | 100
231
- y | number | no | Integer vertical coordinate of the click point. Both x and y coordinates must be provided or none of them if elementId is present. In such case the gesture will be performed at the center point of the given element. The screen scale (if customized) is **not** taken into consideration while calculating the coordinate. The coordinate is always calculated for the [virtual screen](https://learn.microsoft.com/en-us/windows/win32/gdi/the-virtual-screen). | 100
232
- button | string | no | Name of the mouse button to be clicked. An exception is thrown if an unknown button name is provided. Supported button names are: left, middle, right, back, forward. The default value is `left` | right
233
- modifierKeys | string[] or string | no | List of possible keys or a single key name to depress while the click is being performed. Supported key names are: Shift, Ctrl, Alt, Win. For example, in order to keep Ctrl+Alt depressed while clicking, provide the value of ['ctrl', 'alt'] | win
234
- durationMs | number | no | The number of milliseconds to wait between pressing and releasing the mouse button. By default no delay is applied, which simulates a regular click. | 500
235
- times | number | no | How many times the click must be performed. One by default. | 2
236
- interClickDelayMs | number | no | Duration of the pause between each click gesture. Only makes sense if `times` is greater than one. 100ms by default. | 10
237
-
238
- ### windows: scroll
239
-
240
- This is a shortcut for a mouse wheel scroll gesture. The API is a thin wrapper over the [SendInput](https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-sendinput#:~:text=The%20SendInput%20function%20inserts%20the,or%20other%20calls%20to%20SendInput.)
241
- WinApi call. It emulates the mouse cursor movement and/or horizontal/vertical rotation of the mouse wheel.
242
- Thus make sure the target control is ready to receive mouse wheel events (e.g. is focused) before invoking it.
243
-
244
- #### Arguments
245
-
246
- Name | Type | Required | Description | Example
247
- --- | --- | --- | --- | ---
248
- elementId | string | no | Same as in [windows: click](#windows-click) | 123e4567-e89b-12d3-a456-426614174000
249
- x | number | no | Same as in [windows: click](#windows-click) | 100
250
- y | number | no | Same as in [windows: click](#windows-click) | 100
251
- deltaX | number | no | The amount of horizontal wheel movement measured in wheel clicks. A positive value indicates that the wheel was rotated to the right; a negative value indicates that the wheel was rotated to the left. Either this value or deltaY must be provided, but not both. | -5
252
- deltaY | number | no | The amount of vertical wheel movement measured in wheel clicks. A positive value indicates that the wheel was rotated forward, away from the user; a negative value indicates that the wheel was rotated backward, toward the user. Either this value or deltaX must be provided, but not both. | 5
253
- modifierKeys | string[] or string | no | Same as in [windows: click](#windows-click) | win
254
-
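The table above states that either `deltaX` or `deltaY` must be provided, but not both. A minimal Python sketch of that rule follows; the helper name `scroll_args` is hypothetical and only builds the argument dictionary, leaving the actual call to your client.

```python
def scroll_args(delta_x=None, delta_y=None, element_id=None):
    """Build the argument dict for 'windows: scroll' (illustrative helper)."""
    # Exactly one of the two deltas may be set, per the arguments table.
    if (delta_x is None) == (delta_y is None):
        raise ValueError("provide exactly one of deltaX or deltaY")
    args = {"deltaX": delta_x} if delta_x is not None else {"deltaY": delta_y}
    if element_id is not None:
        args["elementId"] = element_id
    return args


# With a live session this dict would be passed as:
#   driver.execute_script('windows: scroll', scroll_args(delta_y=5))
print(scroll_args(delta_y=5))
```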
255
- ### windows: hover
256
-
257
- This is a shortcut for a hover gesture.
258
-
259
- #### Arguments
260
-
261
- Name | Type | Required | Description | Example
262
- --- | --- | --- | --- | ---
263
- startElementId | string | no | Same as in [windows: click](#windows-click) | 123e4567-e89b-12d3-a456-426614174000
264
- startX | number | no | Same as in [windows: click](#windows-click) | 100
265
- startY | number | no | Same as in [windows: click](#windows-click) | 100
266
- endElementId | string | no | Same as in [windows: click](#windows-click) | 123e4567-e89b-12d3-a456-426614174000
267
- endX | number | no | Same as in [windows: click](#windows-click) | 100
268
- endY | number | no | Same as in [windows: click](#windows-click) | 100
269
- modifierKeys | string[] or string | no | Same as in [windows: click](#windows-click) | win
270
- durationMs | number | no | The number of milliseconds between moving the cursor from the starting to the ending hover point. 500ms by default. | 700
271
-
272
- ### windows: keys
273
-
274
- This is a shortcut for a customized keyboard input. Selenium keys should also work as modifier keys, unless forceUnicode option is set to true.
275
-
276
- #### Arguments
277
-
278
- Name | Type | Required | Description | Example
279
- --- | --- | --- | --- | ---
280
- actions | KeyAction[] or KeyAction | yes | One or more [KeyAction](#keyaction) dictionaries | ```json [{"virtualKeyCode": 0x10, "down": true}, {"text": "appium likes you"}, {"virtualKeyCode": 0x10, "down": false}]```
281
- forceUnicode | boolean | no | Forces the characters to be sent as unicode characters. Note that they won't work in keyboard shortcut combinations, but it makes them keyboard-layout independent. | true
282
-
283
- ##### KeyAction
284
-
285
- Name | Type | Required | Description | Example
286
- --- | --- | --- | --- | ---
287
- pause | number | no | Allows to set a delay in milliseconds between key input series. Either this property or `text` or `virtualKeyCode` must be provided. | 100
288
- text | string | no | Non-empty string of Unicode text to type (surrogate characters like smileys are not supported). Either this property or `pause` or `virtualKeyCode` must be provided. | Привіт Світ!
289
- virtualKeyCode | number | no | Valid virtual key code. The list of supported key codes is available at [Virtual-Key Codes](https://learn.microsoft.com/en-us/windows/win32/inputdev/virtual-key-codes) page. Either this property or `pause` or `text` must be provided. | 0x10
290
- down | boolean | no | This property only makes sense in combination with `virtualKeyCode`. If set to `true` then the corresponding key will be depressed, `false` - released. By default the key is just pressed once. Do not forget to release depressed keys in your automated tests. | true
291
-
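To make the KeyAction rules above concrete, here is a small Python sketch that builds an `actions` list holding Shift down while text is typed and then releasing it. The helper name `typed_with_shift` is hypothetical; the value `0x10` is the Shift virtual-key code used in the table's own example.

```python
VK_SHIFT = 0x10  # virtual-key code for Shift, as in the table's example


def typed_with_shift(text, pause_ms=None):
    """Return KeyAction dicts: Shift down, optional pause, text, Shift up."""
    if not text:
        raise ValueError("text must be a non-empty string")
    actions = [{"virtualKeyCode": VK_SHIFT, "down": True}]
    if pause_ms is not None:
        # Each action carries exactly one of pause / text / virtualKeyCode.
        actions.append({"pause": pause_ms})
    actions.append({"text": text})
    # Always release the key that was pressed at the start.
    actions.append({"virtualKeyCode": VK_SHIFT, "down": False})
    return actions


# With a live session:
#   driver.execute_script('windows: keys', {'actions': typed_with_shift('appium')})
print(typed_with_shift("appium likes you", pause_ms=100))
```

Building the list in one place makes it harder to forget the final release action.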
292
- ### windows: setClipboard
293
-
294
- Sets Windows clipboard content to the given text or a PNG image.
295
-
296
- #### Arguments
297
-
298
- Name | Type | Required | Description | Example
299
- --- | --- | --- | --- | ---
300
- b64Content | string | yes | Base64-encoded content of the clipboard to be set | `QXBwaXVt`
301
- contentType | 'plaintext' or 'image' | no | Set to 'plaintext' in order to set the given text to the clipboard (the default value). Set to 'image' if `b64Content` contains a base64-encoded payload of a PNG image. | image
302
-
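Since `b64Content` must be Base64-encoded, a short Python sketch of preparing a plaintext payload may help. The helper name `clipboard_args` is hypothetical, and encoding the text as UTF-8 before Base64 is an assumption of this example; the README does not state which text encoding the driver expects.

```python
import base64


def clipboard_args(text):
    """Build arguments for 'windows: setClipboard' with a plaintext payload."""
    # Assumption: the text is UTF-8 encoded before being Base64-encoded.
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return {"b64Content": encoded, "contentType": "plaintext"}


# With a live session:
#   driver.execute_script('windows: setClipboard', clipboard_args('Appium'))
print(clipboard_args("Appium")["b64Content"])  # prints QXBwaXVt
```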
303
- ### windows: getClipboard
304
-
305
- Retrieves Windows clipboard content.
306
-
307
- #### Arguments
308
-
309
- Name | Type | Required | Description | Example
310
- --- | --- | --- | --- | ---
311
- contentType | 'plaintext' or 'image' | no | Set to 'plaintext' in order to retrieve the clipboard content as text (the default value). Set to 'image' to retrieve a base64-encoded payload of a PNG image. | image
312
-
313
- #### Returns
314
-
315
- Base-64 encoded content of the Windows clipboard.
316
-
317
- ### windows: pushCacheRequest
318
-
319
- This is an asynchronous function that sends cache requests based on specific conditions. This is useful for revealing RawView elements in the element tree. Note that cached elements aren't supported by NovaWindows driver yet.
320
-
321
- #### Arguments
322
-
323
- Name | Type | Required | Description | Example
324
- --- | --- | --- | --- | ---
325
- treeFilter | string | yes | Defines the filter that is applied when walking the automation tree. You can use any UI Automation conditions. For simplicity, you can omit the namespace and/or the Condition word at the end. | `RawView`
326
- treeScope | string | no | Defines the scope of the automation tree to be cached. It determines how far to search for elements, such as just the element itself, its children, descendants or the entire subtree. | `SubTree`
327
- automationElementMode | string | no | Specifies the mode of automation element (e.g., None, Full). Determines whether the UI element is fully cached or only partially cached. | `Full`
328
-
329
- ### windows: invoke
330
-
331
- Invokes a UI element pattern, simulating an interaction like clicking or activating the element.
332
-
333
- #### Arguments
334
-
335
- Position | Type | Description | Example
336
- | --- | --- | --- | --- |
337
- 1 | `Element` | The UI element on which the `InvokePattern` is called to simulate activation. | `element`
338
-
339
- ### windows: expand
340
-
341
- Expands a UI element that supports the `ExpandPattern`, typically used for elements that can be expanded (like trees or combo boxes).
342
-
343
- #### Arguments
344
-
345
- Position | Type | Description | Example
346
- | --- | --- | --- | --- |
347
- 1 | `Element` | The UI element on which the `ExpandPattern` is called to expand the element. | `element`
348
-
349
- ### windows: collapse
350
-
351
- Collapses a UI element that supports the `CollapsePattern`, typically used for collapsible elements (like tree nodes).
352
-
353
- #### Arguments
354
-
355
- Position | Type | Description | Example
356
- | --- | --- | --- | --- |
357
- 1 | `Element` | The UI element on which the `CollapsePattern` is called to collapse the element. | `element`
358
-
359
- ### windows: scrollIntoView
360
-
361
- Scrolls the UI element into view using the `ScrollItemPattern`, ensuring that the element is visible within its container.
362
-
363
- #### Arguments
364
-
365
- Position | Type | Description | Example
366
- | --- | --- | --- | --- |
367
- 1 | `Element` | The UI element on which the `ScrollItemPattern` is called to bring the element into view. | `element`
368
-
369
- ### windows: isMultiple
370
-
371
- Checks if a UI element supports multiple selection using the `SelectionPattern`. Returns `true` if the element supports multiple selections, otherwise `false`.
372
-
373
- #### Arguments
374
-
375
- Position | Type | Description | Example
376
- | --- | --- | --- | --- |
377
- 1 | `Element` | The UI element to check for multiple selection support. | `element`
378
-
379
- #### Returns
380
-
381
- - `boolean`: `true` if the element supports multiple selection, otherwise `false`.
382
-
383
- ### windows: selectedItem
384
-
385
- Gets the selected item from a UI element that supports the `SelectionPattern`.
386
-
387
- #### Arguments
388
-
389
- Position | Type | Description | Example
390
- | --- | --- | --- | --- |
391
- 1 | `Element` | The UI element from which to retrieve the selected item. | `element`
392
-
393
- #### Returns
394
-
395
- - `Element`: The selected item of the element as an Appium element.
396
-
397
- ### windows: allSelectedItems
398
-
399
- Gets all selected items from a UI element that supports the `SelectionPattern`, useful for lists or combo boxes.
400
-
401
- #### Arguments
402
-
403
- Position | Type | Description | Example
404
- | --- | --- | --- | --- |
405
- 1 | `Element` | The UI element from which to retrieve all selected items. | `element`
406
-
407
- #### Returns
408
-
409
- - `Element[]`: An array of selected items as Appium elements.
410
-
411
- ### windows: addToSelection
412
-
413
- Adds an element to the current selection on a UI element that supports the `SelectionPattern`.
414
-
415
- #### Arguments
416
-
417
- Position | Type | Description | Example
418
- | --- | --- | --- | --- |
419
- 1 | `Element` | The UI element to add to the selection. | `element`
420
-
421
- ### windows: removeFromSelection
422
-
423
- Removes an element from the current selection on a UI element that supports the `SelectionPattern`.
424
-
425
- #### Arguments
426
-
427
- Position | Type | Description | Example
428
- | --- | --- | --- | --- |
429
- 1 | `Element` | The UI element to remove from the selection. | `element`
430
-
431
- ### windows: select
432
-
433
- Selects a UI element using the `SelectionPattern`, simulating the action of choosing the element in a selection context.
434
-
435
- #### Arguments
436
-
437
- Position | Type | Description | Example
438
- | --- | --- | --- | --- |
439
- 1 | `Element` | The UI element to select. | `element`
440
-
441
- ### windows: toggle
442
-
443
- Toggles a UI element’s state using the `TogglePattern`, typically used for elements like checkboxes or radio buttons.
444
-
445
- #### Arguments
446
-
447
- Position | Type | Description | Example
448
- | --- | --- | --- | --- |
449
- 1 | `Element` | The UI element to toggle. | `element`
450
-
451
- ### windows: setValue
452
-
453
- Sets the value of a UI element using the `ValuePattern` (for elements like text boxes, sliders, etc.).
454
-
455
- #### Arguments
456
-
457
- Position | Type | Description | Example
458
- | --- | --- | --- | --- |
459
- 1 | `Element` | The UI element whose value will be set. | `element`
460
- 2 | `string` | The value to be set on the element. | `"new value"`
461
-
462
- ### windows: getValue
463
-
464
- Gets the current value of a UI element that supports the `ValuePattern` (e.g., a text box).
465
-
466
- #### Arguments
467
-
468
- Position | Type | Description | Example
469
- | --- | --- | --- | --- |
470
- 1 | `Element` | The UI element from which to retrieve the value. | `element`
471
-
472
- ### windows: maximize
473
-
474
- Maximizes a window or UI element using the `WindowPattern`.
475
-
476
- #### Arguments
477
-
478
- Position | Type | Description | Example
479
- | --- | --- | --- | --- |
480
- 1 | `Element` | The window or UI element to maximize. | `element`
481
-
482
- ### windows: minimize
483
-
484
- Minimizes a window or UI element using the `WindowPattern`.
485
-
486
- #### Arguments
487
-
488
- Position | Type | Description | Example
489
- | --- | --- | --- | --- |
490
- 1 | `Element` | The window or UI element to minimize. | `element`
491
-
492
- ### windows: restore
493
-
494
- Restores a window or UI element to its normal state (if it was maximized or minimized) using the `WindowPattern`.
495
-
496
- #### Arguments
497
-
498
- Position | Type | Description | Example
499
- | --- | --- | --- | --- |
500
- 1 | `Element` | The window or UI element to restore. | `element`
501
-
502
- ### windows: close
503
-
504
- Closes a window or UI element using the `WindowPattern`.
505
-
506
- #### Arguments
507
-
508
- Position | Type | Description | Example
509
- | --- | --- | --- | --- |
510
- 1 | `Element` | The window or UI element to close. | `element`
511
-
512
- ### windows: setFocus
513
-
514
- Sets focus to the specified UI element using UIAutomationElement's `SetFocus` method.
515
-
516
- #### Arguments
517
-
518
- Position | Type | Description | Example
519
- | --- | --- | --- | --- |
520
- 1 | `Element` | The UI element to set focus on. | `element`
521
-
522
- ### windows: startRecordingScreen
523
-
524
- Starts screen recording using the **bundled ffmpeg** included with the driver. There is no system PATH fallback: if the bundle is not present (e.g. driver was not installed via npm with dependencies), screen recording is not available and the driver reports a clear error.
525
-
526
- ### windows: stopRecordingScreen
527
-
528
- Stops the current screen recording and returns the video (base64 or uploads to a remote path if specified).
529
-
530
- ### windows: deleteFile
531
-
532
- Deletes a file on the Windows machine. Uses PowerShell `Remove-Item -Path ... -Force`. Paths containing `[`, `]`, or `?` use `-LiteralPath` for correct interpretation.
533
-
534
- #### Arguments
535
-
536
- Name | Type | Required | Description | Example
537
- --- | --- | --- | --- | ---
538
- path | string | yes | Absolute or relative path to the file to delete. | `C:\Temp\file.txt`
539
-
540
- ### windows: deleteFolder
541
-
542
- Deletes a folder on the Windows machine. Uses PowerShell `Remove-Item -Path ... -Force` with optional `-Recurse`. Paths containing `[`, `]`, or `?` use `-LiteralPath`.
543
-
544
- #### Arguments
545
-
546
- Name | Type | Required | Description | Example
547
- --- | --- | --- | --- | ---
548
- path | string | yes | Absolute or relative path to the folder to delete. | `C:\Temp\MyFolder`
549
- recursive | boolean | no | If true (default), delete contents recursively. If false, only remove the folder when empty. | `true`
550
-
551
- ### windows: launchApp
552
-
553
- Re-launches the application configured in the `app` session capability. The app path or App User Model ID (AUMID) must have been set when the session was created. Typically used to reopen an app after it has been closed with `windows: closeApp`.
554
-
555
- This command takes no arguments.
556
-
557
- #### Example
558
-
559
- ```javascript
560
- // Re-launch the app set in the session capability
561
- await driver.executeScript('windows: launchApp', []);
562
- ```
563
-
564
- ### windows: closeApp
565
-
566
- Closes the current root application window by sending a close command via the Windows UI Automation WindowPattern. Clears the root element reference in the session afterward. Throws a `NoSuchWindowError` if no active window is found.
567
-
568
- This command takes no arguments.
569
-
570
- #### Example
571
-
572
- ```javascript
573
- // Close the current app window
574
- await driver.executeScript('windows: closeApp', []);
575
- ```
576
-
577
- ### windows: clickAndDrag
578
-
579
- Performs a click-and-drag: move to the start position, press the mouse button, move to the end position over the given duration, then release. Start and end can be specified by element (center or offset) or by screen coordinates. Uses the same Windows input APIs as other pointer actions.
580
-
581
- #### Arguments
582
-
583
- Name | Type | Required | Description | Example
584
- --- | --- | --- | --- | ---
585
- startElementId | string | no* | Element ID for drag start. Use *or* startX/startY. | `1.2.3.4.5`
586
- startX | number | no* | X coordinate for drag start (with startY). | `100`
587
- startY | number | no* | Y coordinate for drag start (with startX). | `200`
588
- endElementId | string | no* | Element ID for drag end. Use *or* endX/endY. | `1.2.3.4.6`
589
- endX | number | no* | X coordinate for drag end (with endY). | `300`
590
- endY | number | no* | Y coordinate for drag end (with endX). | `400`
591
- modifierKeys | string or string[] | no | Keys to hold during drag: `shift`, `ctrl`, `alt`, `win`. | `["ctrl"]`
592
- durationMs | number | no | Duration of the move from start to end (default: 500). | `300`
593
- button | string | no | Mouse button: `left` (default), `middle`, `right`, `back`, `forward`. | `left`
594
-
595
- \* Provide either startElementId or both startX and startY; and either endElementId or both endX and endY.
596
-
597
- ## Development
598
-
599
- it is recommended to use Matt Bierner's [Comment tagged templates](https://marketplace.visualstudio.com/items?itemName=bierner.comment-tagged-templates)
600
- Visual Studio Code plugin so it highlights the powershell and C code used throughout the project.
601
-
602
- ```bash
603
- # Checkout the current repository and run
604
- npm install
605
- # Run linting to check for code quality
606
- npm run lint
607
- # Transpile TypeScript files to build the project
608
- npm run build
609
- ```
1
+ Appium Desktop Driver
2
+ ===================
3
+
4
+ Appium Desktop Driver is a custom Appium driver designed to tackle the limitations of existing Windows automation solutions like WinAppDriver. Appium Desktop Driver supports testing Universal Windows Platform (UWP), Windows Forms (WinForms), Windows Presentation Foundation (WPF), and Classic Windows (Win32) apps on Windows 10 PCs. Built to improve performance and reliability for traditional desktop applications, it offers:
5
+
6
+ Faster XPath locator performance — Reduces element lookup times, even in complex UIs.
7
+ RawView element support — Access elements typically hidden from the default ControlView/ContentView.
8
+ Enhanced text input handling — Solves keyboard layout issues while improving input speed.
9
+ Platform-specific commands — Supports direct window manipulation, advanced UI interactions, and more.
10
+ It’s designed to handle real-world scenarios where traditional drivers fall short — from tricky dropdowns to missing elements and unreliable clicks — making it an ideal choice for automating legacy Windows apps.
11
+
12
+ > **Note**
13
+ >
14
+ > This driver is built for Appium 2/3 and is not compatible with Appium 1. To install
15
+ > the driver, simply run:
16
+ > `appium driver install --source=npm appium-desktop-driver`
17
+
18
+
19
+ ## Usage
20
+
21
+ Besides the standard Appium requirements, Appium Desktop Driver adds the following prerequisites:
22
+
23
+ - Appium Windows Driver only supports Windows 10 and later as the host.
24
+
25
+ > **Note**
26
+ >
27
+ > The driver currently uses a PowerShell session as a back-end, and
28
+ > should not require Developer Mode to be on, or any other software.
29
+ > There's a plan to update to a better, .NET-based backend for improved
30
+ > reliability and better code and error management, as well as supporting
31
+ > more features, that are currently not possible using PowerShell alone.
32
+ > It is unlikely for the prerequisites to change, as this is one of the
33
+ > main goals of Appium Desktop driver – seamless setup on any PC.
34
+
35
+ Appium Desktop Driver supports the following capabilities:
36
+
37
+ Capability Name | Description
38
+ --- | ---
39
+ platformName | Must be set to `Windows` (case-insensitive).
40
+ automationName | Must be set to `DesktopDriver` (case-insensitive).
41
+ smoothPointerMove | CSS-like easing function (including valid Bezier curve). This controls the smooth movement of the mouse for `delayBeforeClick` ms. Example: `ease-in`, `cubic-bezier(0.42, 0, 0.58, 1)`.
42
+ delayBeforeClick | Time in milliseconds before a click is performed.
43
+ delayAfterClick | Time in milliseconds after a click is performed.
44
+ appTopLevelWindow | The handle of an existing application top-level window to attach to. It can be a number or string (not necessarily hexadecimal). Example: `12345`, `0x12345`.
45
+ shouldCloseApp | Whether to close the window of the application in test after the session finishes. Default is `true`.
46
+ appArguments | Optional string of arguments to pass to the app on launch.
47
+ appWorkingDir | Optional working directory path for the application.
48
+ prerun | An object containing either `script` or `command` key. The value of each key must be a valid PowerShell script or command to be executed prior to the WinAppDriver session startup. See [Power Shell commands execution](#power-shell-commands-execution) for more details. Example: `{script: 'Get-Process outlook -ErrorAction SilentlyContinue'}`
49
+ postrun | An object containing either `script` or `command` key. The value of each key must be a valid PowerShell script or command to be executed after WinAppDriver session is stopped. See [Power Shell commands execution](#power-shell-commands-execution) for more details.
50
+ isolatedScriptExecution | Whether PowerShell scripts are executed in an isolated session. Default is `false`.
51
+ appEnvironment | Optional object of custom environment variables to inject into the PowerShell session. The variables are only available for the lifetime of the session and do not affect the system environment. Example: `{"MY_VAR": "hello", "API_URL": "http://localhost:3000"}`.
52
+ returnAllWindowHandles | When `true`, `getWindowHandles()` returns all top-level windows on the desktop (UIA root children) instead of only the windows belonging to the launched app. Useful for switching to arbitrary system windows. Default is `false`.
53
+ ms:waitForAppLaunch | Time in seconds to wait for the app window to appear after launch. Default is `0` (falls back to 10 000 ms internal timeout).
54
+ ms:windowSwitchRetries | Maximum number of retry attempts in `setWindow()` when the target window is not yet visible. Must be a non-negative integer. Default is `20`.
55
+ ms:windowSwitchInterval | Sleep duration in milliseconds between each retry in `setWindow()`. Must be a non-negative integer. Default is `500`.
56
+
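The capability table above can be turned into a raw capabilities dictionary. The following is a minimal sketch, not the driver's own code: `build_capabilities` and `W3C_STANDARD_CAPS` are hypothetical names, and the sketch assumes the usual Appium 2 convention that non-standard capabilities without a vendor prefix are sent with the `appium:` prefix (most Appium clients add it automatically when an options class is used).

```python
# Hypothetical helper for illustration; it is not part of the driver or of any Appium client.
W3C_STANDARD_CAPS = {"platformName", "browserName", "browserVersion"}


def build_capabilities(app, extra=None):
    """Return a capabilities dict with vendor prefixes applied."""
    caps = {
        "platformName": "Windows",
        "automationName": "DesktopDriver",
        "app": app,
    }
    caps.update(extra or {})
    prefixed = {}
    for name, value in caps.items():
        # Standard names and already-prefixed names (e.g. ms:waitForAppLaunch) stay as-is.
        if name in W3C_STANDARD_CAPS or ":" in name:
            prefixed[name] = value
        else:
            prefixed[f"appium:{name}"] = value
    return prefixed


caps = build_capabilities(
    "C:\\Windows\\System32\\notepad.exe",
    {
        "shouldCloseApp": False,
        "ms:waitForAppLaunch": 5,
        # prerun/postrun require the full object syntax.
        "prerun": {"script": "Get-Process outlook -ErrorAction SilentlyContinue"},
    },
)
```

The resulting dictionary could be loaded into `WindowsOptions` or sent as `alwaysMatch` capabilities, depending on the client in use.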
57
+ Please note that more capabilities will be added as the development of this driver progresses. Since it is still in its early stages, some features may be missing or subject to change. If you need a specific capability or encounter any issues, please feel free to open an issue.
58
+
59
+ ## Example
60
+
61
+ ```python
62
+ # Python3 + PyTest
63
+ import pytest
64
+
65
+ from appium import webdriver
66
+ from appium.options.windows import WindowsOptions
67
+
68
+ def generate_options():
69
+ uwp_options = WindowsOptions()
70
+ # How to get the app ID for Universal Windows Apps (UWP):
71
+ # https://www.securitylearningacademy.com/mod/book/view.php?id=13829&chapterid=678
72
+ uwp_options.app = 'Microsoft.WindowsCalculator_8wekyb3d8bbwe!App'
73
+ uwp_options.automation_name = 'DesktopDriver'
74
+
75
+ classic_options = WindowsOptions()
76
+ classic_options.app = 'C:\\Windows\\System32\\notepad.exe'
77
+ classic_options.automation_name = 'DesktopDriver'
78
+
79
+ use_existing_app_options = WindowsOptions()
80
+ # Active window handles could be retrieved from any compatible UI inspector app:
81
+ # https://docs.microsoft.com/en-us/windows/win32/winauto/inspect-objects
82
+ # or https://accessibilityinsights.io/.
83
+ # Also, it is possible to use the corresponding WinApi calls for this purpose:
84
+ # https://referencesource.microsoft.com/#System/services/monitoring/system/diagnosticts/ProcessManager.cs,db7ac68b7cb40db1
85
+ #
86
+ # This capability could be used to create a workaround for UWP apps startup:
87
+ # https://github.com/microsoft/WinAppDriver/blob/master/Samples/C%23/StickyNotesTest/StickyNotesSession.cs
88
+ use_existing_app_options.app_top_level_window = hex(12345)
89
+ use_existing_app_options.automation_name = 'DesktopDriver'
90
+
91
+ return [uwp_options, classic_options, use_existing_app_options]
92
+
93
+
94
+ @pytest.fixture(params=generate_options())
95
+ def driver(request):
96
+ drv = webdriver.Remote('http://127.0.0.1:4723', options=request.param)
97
+ yield drv
98
+ drv.quit()
99
+
100
+
101
+ def test_app_source_could_be_retrieved(driver):
102
+ assert len(driver.page_source) > 0
103
+ ```
104
+
105
+
106
+ ## Power Shell commands execution
107
+
108
+ Just like in Appium Windows Driver (version 1.15.0 and above) there is a possibility to
109
+ run custom Power Shell scripts from your client code. This feature is potentially insecure
110
+ and thus needs to be explicitly enabled when executing the server by adding the `power_shell`
111
+ key to the list of enabled insecure features. Refer to [Appium Security document](https://github.com/appium/appium/blob/master/docs/en/writing-running-appium/security.md) for more details.
112
+ It is possible to either execute a single Power Shell command or a whole script
113
+ and get its stdout in response. If the script execution returns a non-zero exit code then an exception
114
+ is going to be thrown. The exception message will contain the actual stderr. Unlike Appium Windows Driver,
115
+ it makes no difference whether you pass the script with the `command` or the `script` argument. For ease of use, you can pass the script as a plain string when executing a PowerShell command directly via the driver. Note: this shorthand does not work with the `prerun` or `postrun` capabilities, which require the full object syntax.
116
+ Here is an example of how to control the Notepad process:
117
+
118
+ ```java
119
+ // java
120
+ String psScript =
121
+ "$sig = '[DllImport(\"user32.dll\")] public static extern bool ShowWindowAsync(IntPtr hWnd, int nCmdShow);'\n" +
122
+ "Add-Type -MemberDefinition $sig -name NativeMethods -namespace Win32\n" +
123
+ "Start-Process Notepad\n" +
124
+ "$hwnd = @(Get-Process Notepad)[0].MainWindowHandle\n" +
125
+ "[Win32.NativeMethods]::ShowWindowAsync($hwnd, 2)\n" +
126
+ "[Win32.NativeMethods]::ShowWindowAsync($hwnd, 4)\n" +
127
+ "Stop-Process -Name Notepad";
128
+ driver.executeScript("powerShell", psScript);
129
+ ```
130
+
131
+ Another example, which demonstrates how to use the command output:
132
+
133
+ ```python
134
+ # python
135
+ cmd = 'Get-Process outlook -ErrorAction SilentlyContinue'
136
+ proc_info = driver.execute_script('powerShell', cmd)
137
+ if proc_info:
138
+ print('Outlook is running')
139
+ else:
140
+ print('Outlook is not running')
141
+ ```
142
+
143
+ > **Note**
144
+ >
145
+ > Appium Desktop Driver runs on a single PowerShell session,
146
+ > therefore you may share variables between executed PowerShell
147
+ > scripts. Unless the PowerShell session exits or crashes for some
148
+ > reason, you should be able to reuse the variables that you create.
149
+
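Because the PowerShell session is shared, a variable set in one call should be readable in a later one. The sketch below illustrates that idea under stated assumptions: `count_notepad_processes` is a hypothetical helper, `driver` is any object exposing `execute_script` (such as an Appium Python client session with the `power_shell` insecure feature enabled), and the exact shape of the returned stdout is assumed to be a string.

```python
def count_notepad_processes(driver):
    """Store a variable in the shared PowerShell session, then read it back."""
    # First call: keep the process list in a session variable.
    driver.execute_script(
        'powerShell',
        '$procs = @(Get-Process notepad -ErrorAction SilentlyContinue)',
    )
    # Second call: the variable created above is still available.
    out = driver.execute_script('powerShell', '$procs.Count')
    # Assumption: stdout comes back as text, possibly with trailing whitespace.
    return int(str(out).strip() or 0)
```

If the PowerShell session exits or crashes between the two calls, `$procs` would no longer exist, so long-lived variables are best re-created defensively.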
150
+
151
+ ## Element Location
152
+
153
+ Appium Windows Driver supports the same location strategies [the WinAppDriver supports](https://github.com/microsoft/WinAppDriver/blob/master/Docs/AuthoringTestScripts.md#supported-locators-to-find-ui-elements), but also includes Windows UIAutomation conditions:
154
+
155
+ Name | Description | Example
156
+ --- | --- | ---
157
+ accessibility id | This strategy is AutomationId attribute in inspect.exe | AppNameTitle
158
+ class name | This strategy is ClassName attribute in inspect.exe | TextBlock
159
+ id | This strategy is RuntimeId (decimal) attribute in inspect.exe | 42.333896.3.1
160
+ name | This strategy is Name attribute in inspect.exe | Calculator
161
+ tag name | This strategy is LocalizedControlType (upper camel case) attribute in inspect.exe since Appium Windows Driver 2.1.1 | Text
162
+ xpath | This strategy allows creating custom XPath queries on any attribute exposed by inspect.exe. Only XPath 1.0 is supported | (//Button)[2]
163
+ windows uiautomation | This strategy allows creating custom Windows UIAutomation conditions on any attribute exposed by inspect.exe. Both C# and PowerShell syntax are supported | new PropertyCondition(AutomationElement.HelpTextProperty, "Info")
164
+
165
+ ## Platform-Specific Extensions
166
+
167
+ Besides the standard W3C APIs, the driver provides the custom command extensions below to execute platform-specific scenarios. Use the following source code examples to invoke them from your client code:
168
+
169
+ > **Note**
170
+ >
171
+ > In most cases, commands implemented in Appium Desktop driver can be used
172
+ > more intuitively by just passing the element as the second argument and the value
173
+ > (if one is needed) as the third argument, and so on. For example:
174
+ > `driver.executeScript("windows: setValue", element, "valueToSet")` or
175
+ > `driver.executeScript("windows: invoke", element)`. Commands that are created
176
+ > as fallbacks to Appium Windows Driver should work as is. Open an issue if some
177
+ > command that you need is missing or is not behaving as it should.
178
+
179
+ ```java
180
+ // Java 11+
181
+ var result = driver.executeScript("windows: <methodName>", Map.of(
182
+ "arg1", "value1",
183
+ "arg2", "value2"
184
+ // you may add more pairs if needed or skip providing the map completely
185
+ // if all arguments are defined as optional
186
+ ));
187
+ ```
188
+
189
+ ```js
190
+ // WebdriverIO
191
+ const result = await driver.executeScript('windows: <methodName>', [{
192
+ arg1: "value1",
193
+ arg2: "value2",
194
+ }]);
195
+ ```
196
+
197
+ ```python
198
+ # Python
199
+ result = driver.execute_script('windows: <methodName>', {
200
+ 'arg1': 'value1',
201
+ 'arg2': 'value2',
202
+ })
203
+ ```
204
+
205
+ ```ruby
206
+ # Ruby
207
+ result = @driver.execute_script 'windows: <methodName>', {
208
+ arg1: 'value1',
209
+ arg2: 'value2',
210
+ }
211
+ ```
212
+
213
+ ```csharp
214
+ // Dotnet
215
+ object result = driver.ExecuteScript("windows: <methodName>", new Dictionary<string, object>() {
216
+ {"arg1", "value1"},
217
+ {"arg2", "value2"}
218
+ });
219
+ ```
220
+
221
+ ### windows: click
222
+
223
+ This is a shortcut for a single mouse click gesture.
224
+
225
+ #### Arguments
226
+
227
+ Name | Type | Required | Description | Example
228
+ --- | --- | --- | --- | ---
229
+ elementId | string | no | Hexadecimal identifier of the element to click on. If this parameter is missing then given coordinates will be parsed as absolute ones. Otherwise they are parsed as relative to the top left corner of this element. | 123e4567-e89b-12d3-a456-426614174000
230
+ x | number | no | Integer horizontal coordinate of the click point. Both x and y coordinates must be provided or none of them if elementId is present. In such case the gesture will be performed at the center point of the given element. The screen scale (if customized) is **not** taken into consideration while calculating the coordinate. The coordinate is always calculated for the [virtual screen](https://learn.microsoft.com/en-us/windows/win32/gdi/the-virtual-screen). | 100
231
+ y | number | no | Integer vertical coordinate of the click point. Both x and y coordinates must be provided or none of them if elementId is present. In such case the gesture will be performed at the center point of the given element. The screen scale (if customized) is **not** taken into consideration while calculating the coordinate. The coordinate is always calculated for the [virtual screen](https://learn.microsoft.com/en-us/windows/win32/gdi/the-virtual-screen). | 100
232
+ button | string | no | Name of the mouse button to be clicked. An exception is thrown if an unknown button name is provided. Supported button names are: left, middle, right, back, forward. The default value is `left` | right
233
+ modifierKeys | string[] or string | no | List of possible keys or a single key name to depress while the click is being performed. Supported key names are: Shift, Ctrl, Alt, Win. For example, in order to keep Ctrl+Alt depressed while clicking, provide the value of ['ctrl', 'alt'] | win
234
+ durationMs | number | no | The number of milliseconds to wait between pressing and releasing the mouse button. By default no delay is applied, which simulates a regular click. | 500
235
+ times | number | no | How many times the click must be performed. One by default. | 2
236
+ interClickDelayMs | number | no | Duration of the pause between each click gesture. Only makes sense if `times` is greater than one. 100ms by default. | 10
237
+
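The argument rules in the table above can be checked on the client before the command is sent. This is a sketch under assumptions: `click_args` and `SUPPORTED_BUTTONS` are hypothetical names, and the checks only mirror the table; the driver's own validation may differ.

```python
SUPPORTED_BUTTONS = {"left", "middle", "right", "back", "forward"}


def click_args(element_id=None, x=None, y=None, button="left", times=1):
    """Build the argument map for 'windows: click' with basic client-side checks."""
    if (x is None) != (y is None):
        raise ValueError("x and y must be provided together")
    if element_id is None and x is None:
        raise ValueError("provide elementId, x/y coordinates, or both")
    if button not in SUPPORTED_BUTTONS:
        raise ValueError(f"unsupported button: {button}")
    args = {"button": button, "times": times}
    if element_id is not None:
        # With elementId set, x/y are relative to the element's top left corner.
        args["elementId"] = element_id
    if x is not None:
        args["x"] = x
        args["y"] = y
    return args

# Usage (requires a live session):
# driver.execute_script('windows: click', click_args(element_id=element.id, button='right'))
```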
238
+ ### windows: scroll
239
+
240
+ This is a shortcut for a mouse wheel scroll gesture. The API is a thin wrapper over the [SendInput](https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-sendinput#:~:text=The%20SendInput%20function%20inserts%20the,or%20other%20calls%20to%20SendInput.)
241
+ WinApi call. It emulates the mouse cursor movement and/or horizontal/vertical rotation of the mouse wheel.
242
+ Thus make sure the target control is ready to receive mouse wheel events (e.g. is focused) before invoking it.
243
+
244
+ #### Arguments
245
+
246
+ Name | Type | Required | Description | Example
247
+ --- | --- | --- | --- | ---
248
+ elementId | string | no | Same as in [windows: click](#windows-click) | 123e4567-e89b-12d3-a456-426614174000
249
+ x | number | no | Same as in [windows: click](#windows-click) | 100
250
+ y | number | no | Same as in [windows: click](#windows-click) | 100
251
+ deltaX | number | no | The amount of horizontal wheel movement measured in wheel clicks. A positive value indicates that the wheel was rotated to the right; a negative value indicates that the wheel was rotated to the left. Either this value or deltaY must be provided, but not both. | -5
252
+ deltaY | number | no | The amount of vertical wheel movement measured in wheel clicks. A positive value indicates that the wheel was rotated forward, away from the user; a negative value indicates that the wheel was rotated backward, toward the user. Either this value or deltaX must be provided, but not both. | 5
253
+ modifierKeys | string[] or string | no | Same as in [windows: click](#windows-click) | win
254
+
255
+ ### windows: hover
256
+
257
+ This is a shortcut for a hover gesture.
258
+
259
+ #### Arguments
260
+
261
+ Name | Type | Required | Description | Example
262
+ --- | --- | --- | --- | ---
263
+ startElementId | string | no | Same as in [windows: click](#windows-click) | 123e4567-e89b-12d3-a456-426614174000
264
+ startX | number | no | Same as in [windows: click](#windows-click) | 100
265
+ startY | number | no | Same as in [windows: click](#windows-click) | 100
266
+ endElementId | string | no | Same as in [windows: click](#windows-click) | 123e4567-e89b-12d3-a456-426614174000
267
+ endX | number | no | Same as in [windows: click](#windows-click) | 100
268
+ endY | number | no | Same as in [windows: click](#windows-click) | 100
269
+ modifierKeys | string[] or string | no | Same as in [windows: click](#windows-click) | win
270
+ durationMs | number | no | The number of milliseconds between moving the cursor from the starting to the ending hover point. 500ms by default. | 700
271
+
272
+ ### windows: keys
273
+
274
+ This is a shortcut for a customized keyboard input. Selenium keys should also work as modifier keys, unless forceUnicode option is set to true.
275
+
276
+ #### Arguments
277
+
278
+ Name | Type | Required | Description | Example
279
+ --- | --- | --- | --- | ---
280
+ actions | KeyAction[] or KeyAction | yes | One or more [KeyAction](#keyaction) dictionaries | ```json [{"virtualKeyCode": 0x10, "down": true}, {"text": "appium likes you"}, {"virtualKeyCode": 0x10, "down": false}]```
281
+ forceUnicode | boolean | no | Forces the characters to be sent as unicode characters. Note that they won't work in keyboard shortcut combinations, but it makes them keyboard-layout independent. | true
282
+
283
+ ##### KeyAction
284
+
285
+ Name | Type | Required | Description | Example
286
+ --- | --- | --- | --- | ---
287
+ pause | number | no | Sets a delay in milliseconds between key input series. Either this property or `text` or `virtualKeyCode` must be provided. | 100
288
+ text | string | no | Non-empty string of Unicode text to type (surrogate characters like smileys are not supported). Either this property or `pause` or `virtualKeyCode` must be provided. | Привіт Світ!
289
+ virtualKeyCode | number | no | Valid virtual key code. The list of supported key codes is available at [Virtual-Key Codes](https://learn.microsoft.com/en-us/windows/win32/inputdev/virtual-key-codes) page. Either this property or `pause` or `text` must be provided. | 0x10
290
+ down | boolean | no | This property only makes sense in combination with `virtualKeyCode`. If set to `true` then the corresponding key will be depressed, `false` - released. By default the key is just pressed once. Do not forget to release depressed keys in your automated tests! | true
291
+
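A KeyAction series that holds a modifier while typing can be assembled as follows. This is an illustrative sketch: `shifted_text` is a hypothetical helper, and `0x10` is the Shift virtual key code from the Microsoft Virtual-Key Codes page referenced above.

```python
VK_SHIFT = 0x10  # Shift, per the Virtual-Key Codes reference


def shifted_text(text, pause_ms=0):
    """Return KeyAction dictionaries that type `text` while Shift is held."""
    if not text:
        raise ValueError("text must be a non-empty string")
    actions = [{"virtualKeyCode": VK_SHIFT, "down": True}]
    if pause_ms:
        actions.append({"pause": pause_ms})
    actions.append({"text": text})
    # Always release the key that was depressed at the start.
    actions.append({"virtualKeyCode": VK_SHIFT, "down": False})
    return actions

# Usage (requires a live session):
# driver.execute_script('windows: keys', {'actions': shifted_text('appium likes you')})
```

Pairing every `down: True` entry with a matching `down: False` entry in the same series keeps a failed test from leaving a modifier key stuck.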
292
+ ### windows: setClipboard
293
+
294
+ Sets Windows clipboard content to the given text or a PNG image.
295
+
296
+ #### Arguments
297
+
298
+ Name | Type | Required | Description | Example
299
+ --- | --- | --- | --- | ---
300
+ b64Content | string | yes | Base64-encoded content of the clipboard to be set | `QXBwaXVt`
301
+ contentType | 'plaintext' or 'image' | no | Set to 'plaintext' in order to set the given text to the clipboard (the default value). Set to 'image' if `b64Content` contains a base64-encoded payload of a PNG image. | image
302
+
303
+ ### windows: getClipboard
304
+
305
+ Retrieves Windows clipboard content.
306
+
307
+ #### Arguments
308
+
309
+ Name | Type | Required | Description | Example
310
+ --- | --- | --- | --- | ---
311
+ contentType | 'plaintext' or 'image' | no | Set to 'plaintext' to retrieve the clipboard text (the default value). Set to 'image' to retrieve a base64-encoded payload of a PNG image. | image
312
+
313
+ #### Returns
314
+
315
+ Base64-encoded content of the Windows clipboard.
316
+
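Both clipboard commands exchange base64 payloads, so text has to be encoded before `windows: setClipboard` and decoded after `windows: getClipboard`. The sketch below uses hypothetical helper names and assumes UTF-8 as the text encoding, which the README does not state.

```python
import base64


def encode_clipboard_text(text):
    """Encode text for the b64Content argument (UTF-8 is an assumption)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_clipboard_text(b64_content):
    """Decode the payload returned by 'windows: getClipboard' (UTF-8 assumed)."""
    return base64.b64decode(b64_content).decode("utf-8")


def set_clipboard_args(text):
    """Build the argument map for 'windows: setClipboard' with plain text."""
    return {"b64Content": encode_clipboard_text(text), "contentType": "plaintext"}

# Usage (requires a live session):
# driver.execute_script('windows: setClipboard', set_clipboard_args('Appium'))
# text = decode_clipboard_text(driver.execute_script('windows: getClipboard', {}))
```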
317
+ ### windows: pushCacheRequest
318
+
319
+ This is an asynchronous function that sends cache requests based on specific conditions. This is useful for revealing RawView elements in the element tree. Note that cached elements aren't supported by Appium Desktop driver yet.
320
+
321
+ #### Arguments
322
+
323
+ Name | Type | Required | Description | Example
324
+ --- | --- | --- | --- | ---
325
+ treeFilter | string | yes | Defines the filter that is applied when walking the automation tree. You can use any UI Automation conditions. For simplicity, you can omit the namespace and/or the Condition word at the end. | `RawView`
326
+ treeScope | string | no | Defines the scope of the automation tree to be cached. It determines how far to search for elements, such as just the element itself, its children, descendants or the entire subtree. | `SubTree`
327
+ automationElementMode | string | no | Specifies the mode of automation element (e.g., None, Full). Determines whether the UI element is fully cached or only partially cached. | `Full`
328
+
329
+ ### windows: invoke
330
+
331
+ Invokes a UI element pattern, simulating an interaction like clicking or activating the element.
332
+
333
+ #### Arguments
334
+
335
+ Position | Type | Description | Example
336
+ | --- | --- | --- | --- |
337
+ 1 | `Element` | The UI element on which the `InvokePattern` is called to simulate activation. | `element`
338
+
339
+ ### windows: expand
340
+
341
+ Expands a UI element that supports the `ExpandCollapsePattern`, typically used for elements that can be expanded (like trees or combo boxes).
342
+
343
+ #### Arguments
344
+
345
+ Position | Type | Description | Example
346
+ | --- | --- | --- | --- |
347
+ 1 | `Element` | The UI element on which the `ExpandCollapsePattern` is called to expand the element. | `element`
348
+
349
+ ### windows: collapse
350
+
351
+ Collapses a UI element that supports the `ExpandCollapsePattern`, typically used for collapsible elements (like tree nodes).
352
+
353
+ #### Arguments
354
+
355
+ Position | Type | Description | Example
356
+ | --- | --- | --- | --- |
357
+ 1 | `Element` | The UI element on which the `ExpandCollapsePattern` is called to collapse the element. | `element`
358
+
359
+ ### windows: scrollIntoView
360
+
361
+ Scrolls the UI element into view using the `ScrollItemPattern`, ensuring that the element is visible within its container.
362
+
363
+ #### Arguments
364
+
365
+ Position | Type | Description | Example
366
+ | --- | --- | --- | --- |
367
+ 1 | `Element` | The UI element on which the `ScrollItemPattern` is called to bring the element into view. | `element`
368
+
369
+ ### windows: isMultiple
370
+
371
+ Checks if a UI element supports multiple selection using the `SelectionPattern`. Returns `true` if the element supports multiple selections, otherwise `false`.
372
+
373
+ #### Arguments
374
+
375
+ Position | Type | Description | Example
376
+ | --- | --- | --- | --- |
377
+ 1 | `Element` | The UI element to check for multiple selection support. | `element`
378
+
379
+ #### Returns
380
+
381
+ - `boolean`: `true` if the element supports multiple selection, otherwise `false`.
382
+
383
+ ### windows: selectedItem
384
+
385
+ Gets the selected item from a UI element that supports the `SelectionPattern`.
386
+
387
+ #### Arguments
388
+
389
+ Position | Type | Description | Example
390
+ | --- | --- | --- | --- |
391
+ 1 | `Element` | The UI element from which to retrieve the selected item. | `element`
392
+
393
+ #### Returns
394
+
395
+ - `Element`: The selected item of the element as an Appium element.
396
+
397
+ ### windows: allSelectedItems
398
+
399
+ Gets all selected items from a UI element that supports the `SelectionPattern`, useful for lists or combo boxes.
400
+
401
+ #### Arguments
402
+
403
+ Position | Type | Description | Example
404
+ | --- | --- | --- | --- |
405
+ 1 | `Element` | The UI element from which to retrieve all selected items. | `element`
406
+
407
+ #### Returns
408
+
409
+ - `Element[]`: An array of selected items as Appium elements.
410
+
411
+ ### windows: addToSelection
412
+
413
+ Adds an element to the current selection on a UI element that supports the `SelectionPattern`.
414
+
415
+ #### Arguments
416
+
417
+ Position | Type | Description | Example
418
+ | --- | --- | --- | --- |
419
+ 1 | `Element` | The UI element to add to the selection. | `element`
420
+
421
+ ### windows: removeFromSelection
422
+
423
+ Removes an element from the current selection on a UI element that supports the `SelectionPattern`.
424
+
425
+ #### Arguments
426
+
427
+ Position | Type | Description | Example
428
+ | --- | --- | --- | --- |
429
+ 1 | `Element` | The UI element to remove from the selection. | `element`
430
+
431
+ ### windows: select
432
+
433
+ Selects a UI element using the `SelectionPattern`, simulating the action of choosing the element in a selection context.
434
+
435
+ #### Arguments
436
+
437
+ Position | Type | Description | Example
438
+ | --- | --- | --- | --- |
439
+ 1 | `Element` | The UI element to select. | `element`
440
+
441
+ ### windows: toggle
442
+
443
+ Toggles a UI element’s state using the `TogglePattern`, typically used for elements like checkboxes or radio buttons.
444
+
445
+ #### Arguments
446
+
447
+ Position | Type | Description | Example
448
+ | --- | --- | --- | --- |
449
+ 1 | `Element` | The UI element to toggle. | `element`
450
+
451
+ ### windows: setValue
452
+
453
+ Sets the value of a UI element using the `ValuePattern` (for elements like text boxes, sliders, etc.).
454
+
455
+ #### Arguments
456
+
457
+ Position | Type | Description | Example
458
+ | --- | --- | --- | --- |
459
+ 1 | `Element` | The UI element whose value will be set. | `element`
460
+ 2 | `string` | The value to be set on the element. | `"new value"`
461
+
462
+ ### windows: getValue
463
+
464
+ Gets the current value of a UI element that supports the `ValuePattern` (e.g., a text box).
465
+
466
+ #### Arguments
467
+
468
+ Position | Type | Description | Example
469
+ | --- | --- | --- | --- |
470
+ 1 | `Element` | The UI element from which to retrieve the value. | `element`
471
+
472
+ ### windows: maximize
473
+
474
+ Maximizes a window or UI element using the `WindowPattern`.
475
+
476
+ #### Arguments
477
+
478
+ Position | Type | Description | Example
479
+ | --- | --- | --- | --- |
480
+ 1 | `Element` | The window or UI element to maximize. | `element`
481
+
482
+ ### windows: minimize
483
+
484
+ Minimizes a window or UI element using the `WindowPattern`.
485
+
486
+ #### Arguments
487
+
488
+ Position | Type | Description | Example
489
+ | --- | --- | --- | --- |
490
+ 1 | `Element` | The window or UI element to minimize. | `element`
491
+
492
+ ### windows: restore
493
+
494
+ Restores a window or UI element to its normal state (if it was maximized or minimized) using the `WindowPattern`.
495
+
496
+ #### Arguments
497
+
498
+ Position | Type | Description | Example
499
+ | --- | --- | --- | --- |
500
+ 1 | `Element` | The window or UI element to restore. | `element`
501
+
502
+ ### windows: close
503
+
504
+ Closes a window or UI element using the `WindowPattern`.
505
+
506
+ #### Arguments
507
+
508
+ Position | Type | Description | Example
509
+ | --- | --- | --- | --- |
510
+ 1 | `Element` | The window or UI element to close. | `element`
511
+
512
+ ### windows: setFocus
513
+
514
+ Sets focus to the specified UI element using UIAutomationElement's `SetFocus` method.
515
+
516
+ #### Arguments
517
+
518
+ Position | Type | Description | Example
519
+ | --- | --- | --- | --- |
520
+ 1 | `Element` | The UI element to set focus on. | `element`
521
+
522
+ ### windows: startRecordingScreen
523
+
524
+ Starts screen recording using the **bundled ffmpeg** included with the driver. There is no system PATH fallback: if the bundle is not present (e.g. driver was not installed via npm with dependencies), screen recording is not available and the driver reports a clear error.
525
+
526
+ ### windows: stopRecordingScreen
527
+
528
+ Stops the current screen recording and returns the video as a base64-encoded string, or uploads it to a remote path if one is specified.
529
+
530
+ ### windows: deleteFile
531
+
532
+ Deletes a file on the Windows machine. Uses PowerShell `Remove-Item -Path ... -Force`. Paths containing `[`, `]`, or `?` use `-LiteralPath` for correct interpretation.
533
+
534
+ #### Arguments
535
+
536
+ Name | Type | Required | Description | Example
537
+ --- | --- | --- | --- | ---
538
+ path | string | yes | Absolute or relative path to the file to delete. | `C:\Temp\file.txt`
539
+
540
+ ### windows: deleteFolder
541
+
542
+ Deletes a folder on the Windows machine. Uses PowerShell `Remove-Item -Path ... -Force` with optional `-Recurse`. Paths containing `[`, `]`, or `?` use `-LiteralPath`.
543
+
544
+ #### Arguments
545
+
546
+ Name | Type | Required | Description | Example
547
+ --- | --- | --- | --- | ---
548
+ path | string | yes | Absolute or relative path to the folder to delete. | `C:\Temp\MyFolder`
549
+ recursive | boolean | no | If true (default), delete contents recursively. If false, only remove the folder when empty. | `true`
550
+
551
+ ### windows: launchApp
552
+
553
+ Re-launches the application configured in the `app` session capability. The app path or App User Model ID (AUMID) must have been set when the session was created. Typically used to reopen an app after it has been closed with `windows: closeApp`.
554
+
555
+ This command takes no arguments.
556
+
557
+ #### Example
558
+
559
+ ```javascript
560
+ // Re-launch the app set in the session capability
561
+ await driver.executeScript('windows: launchApp', []);
562
+ ```
563
+
564
+ ### windows: closeApp
565
+
566
+ Closes the current root application window by sending a close command via the Windows UI Automation WindowPattern. Clears the root element reference in the session afterward. Throws a `NoSuchWindowError` if no active window is found.
567
+
568
+ This command takes no arguments.
569
+
570
+ #### Example
571
+
572
+ ```javascript
573
+ // Close the current app window
574
+ await driver.executeScript('windows: closeApp', []);
575
+ ```
576
+
577
+ ### windows: clickAndDrag
578
+
579
+ Performs a click-and-drag: move to the start position, press the mouse button, move to the end position over the given duration, then release. Start and end can be specified by element (center or offset) or by screen coordinates. Uses the same Windows input APIs as other pointer actions.
580
+
581
+ #### Arguments
582
+
583
+ Name | Type | Required | Description | Example
584
+ --- | --- | --- | --- | ---
585
+ startElementId | string | no* | Element ID for drag start. Use *or* startX/startY. | `1.2.3.4.5`
586
+ startX | number | no* | X coordinate for drag start (with startY). | `100`
587
+ startY | number | no* | Y coordinate for drag start (with startX). | `200`
588
+ endElementId | string | no* | Element ID for drag end. Use *or* endX/endY. | `1.2.3.4.6`
589
+ endX | number | no* | X coordinate for drag end (with endY). | `300`
590
+ endY | number | no* | Y coordinate for drag end (with endX). | `400`
591
+ modifierKeys | string or string[] | no | Keys to hold during drag: `shift`, `ctrl`, `alt`, `win`. | `["ctrl"]`
592
+ durationMs | number | no | Duration of the move from start to end (default: 500). | `300`
593
+ button | string | no | Mouse button: `left` (default), `middle`, `right`, `back`, `forward`. | `left`
594
+
595
+ \* Provide either startElementId or both startX and startY; and either endElementId or both endX and endY.
596
+
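The either/or rule in the footnote is easy to get wrong when arguments are assembled dynamically. Below is a minimal sketch of a client-side check. `resolvePoint` and `buildClickAndDragArgs` are hypothetical helpers, not part of the driver. Rejecting a call that mixes an element ID with coordinates is a conservative reading of the footnote; the source does not say how the driver itself treats that case.

```javascript
// Hypothetical helper: validates one end of the drag ("start" or "end")
// and returns the matching argument names from the table above.
function resolvePoint(prefix, { elementId, x, y } = {}) {
  const hasElement = typeof elementId === 'string';
  const hasAnyCoord = x !== undefined || y !== undefined;
  const hasBothCoords = typeof x === 'number' && typeof y === 'number';

  if (hasElement && hasAnyCoord) {
    // Conservative assumption: do not mix an element ID with coordinates.
    throw new Error(`${prefix}: use an element ID or x/y, not both`);
  }
  if (!hasElement && !hasBothCoords) {
    throw new Error(`${prefix}: provide an element ID or both x and y`);
  }
  return hasElement
    ? { [`${prefix}ElementId`]: elementId }
    : { [`${prefix}X`]: x, [`${prefix}Y`]: y };
}

// Hypothetical helper: merges both ends with optional settings such as
// durationMs, button, or modifierKeys.
function buildClickAndDragArgs(start, end, options = {}) {
  return {
    ...resolvePoint('start', start),
    ...resolvePoint('end', end),
    ...options,
  };
}

// Usage with an existing WebdriverIO session (assumed, not run here):
// await driver.executeScript('windows: clickAndDrag', [
//   buildClickAndDragArgs({ elementId: '1.2.3.4.5' }, { x: 300, y: 400 },
//     { durationMs: 300, modifierKeys: ['ctrl'] }),
// ]);
```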
597
+ ### windows: getMonitors
598
+
599
+ Returns information about all connected display monitors, including their screen coordinates in the [virtual screen](https://learn.microsoft.com/en-us/windows/win32/gdi/the-virtual-screen) coordinate space, working area, device name, and which monitor is the primary display.
600
+
601
+ This command takes no arguments.
602
+
603
+ #### Returns
604
+
605
+ An array of monitor objects, one per connected display:
606
+
607
+ Name | Type | Description
608
+ --- | --- | ---
609
+ index | number | Zero-based index of the monitor in the `AllScreens` array.
610
+ deviceName | string | System device name, e.g. `\\.\DISPLAY1`.
611
+ primary | boolean | `true` if this is the primary display.
612
+ bounds | object | Full monitor rectangle: `{ x, y, width, height }` in virtual screen coordinates.
613
+ workingArea | object | Usable area excluding taskbars and docked toolbars: `{ x, y, width, height }`.
614
+
615
+ #### Example
616
+
617
+ ```javascript
618
+ // WebdriverIO — move app window to the secondary monitor
619
+ const monitors = await driver.executeScript('windows: getMonitors', []);
620
+ const secondary = monitors.find(m => !m.primary);
621
+ if (secondary) {
622
+ await driver.setWindowRect(secondary.bounds.x, secondary.bounds.y, null, null);
623
+ }
624
+ ```
625
+
626
+ ```python
627
+ # Python — click at the center of the secondary monitor
628
+ monitors = driver.execute_script('windows: getMonitors', {})
629
+ secondary = next((m for m in monitors if not m['primary']), None)
630
+ if secondary:
631
+ cx = secondary['bounds']['x'] + secondary['bounds']['width'] // 2
632
+ cy = secondary['bounds']['y'] + secondary['bounds']['height'] // 2
633
+ driver.execute_script('windows: click', {'x': cx, 'y': cy})
634
+ ```
635
+
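Because `bounds` is expressed in virtual screen coordinates, a monitor placed to the left of or above the primary display has negative `x` or `y` values. The sketch below shows one way to find which monitor contains a given point. `findMonitorAt` is a hypothetical helper, and it assumes only the `bounds` shape documented in the Returns table.

```javascript
// Hypothetical helper: returns the monitor whose bounds contain the point
// (x, y) in virtual screen coordinates, or null if none does.
function findMonitorAt(monitors, x, y) {
  const match = monitors.find(({ bounds }) =>
    x >= bounds.x && x < bounds.x + bounds.width &&
    y >= bounds.y && y < bounds.y + bounds.height);
  return match ?? null;
}

// Usage with an existing WebdriverIO session (assumed, not run here):
// const monitors = await driver.executeScript('windows: getMonitors', []);
// const target = findMonitorAt(monitors, -200, 300);
// if (target) console.log(`Point is on ${target.deviceName}`);
```

The right and bottom edges are treated as exclusive so that a point on the shared border of two adjacent monitors matches only one of them.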
636
+ ## Development
637
+
638
+ It is recommended to use Matt Bierner's [Comment tagged templates](https://marketplace.visualstudio.com/items?itemName=bierner.comment-tagged-templates)
638
+ Visual Studio Code extension, which highlights the PowerShell and C code used throughout the project.
640
+
641
+ ```bash
642
+ # Check out the current repository, then install dependencies
643
+ npm install
644
+ # Run linting to check for code quality
645
+ npm run lint
646
+ # Transpile TypeScript files to build the project
647
+ npm run build
648
+ ```