@aspiresys/visor 1.3.4 → 1.4.0

package/readme.md CHANGED
@@ -1,799 +1,790 @@
1
- # Visor
2
-
3
- Desktop Visual Automation Framework for Node.js and TypeScript.
4
-
5
- Visor is a visual desktop automation framework that combines:
6
-
7
- * OpenCV image matching
8
- * OCR text recognition
9
- * Mouse & keyboard automation
10
- * Desktop application automation
11
-
12
- Visor is designed for automating desktop workflows using visual interactions instead of traditional DOM/browser automation.
13
-
14
- ---
15
-
16
- # Features
17
-
18
- * OpenCV-based image matching
19
- * Multi-scale image matching
20
- * OCR automation using Tesseract
21
- * OCR occurrence indexing (beta)
22
- * Region OCR support
23
- * Automatic display scaling detection
24
- * Mouse automation
25
- * Region-based mouse automation
26
- * Region OCR support
27
- * Region-based mouse automation
28
- * Target offset support
29
- * Keyboard automation
30
- * Drag & drop support
31
- * Screenshot capture
32
- * Desktop application automation
33
- * OCR text searching
34
- * Wait APIs
35
- * Multi-image matching
36
- * Config-driven initialization
37
- * High-DPI display scaling support
38
-
39
- ---
40
-
41
- ## What's New in 1.3.x
42
-
43
- * Region class API
44
- * Region image matching
45
- * Region OCR search
46
- * Region interaction methods
47
- * Improved Inspector workflow
48
- * Target offset support
49
-
50
- ---
51
-
52
- # Installation
53
-
54
- ```bash
55
- npm install @aspiresys/visor
56
- ```
57
-
58
- ---
59
-
60
- # Requirements
61
-
62
- * Windows
63
- * Node.js 18+
64
- * TypeScript
65
-
66
- ---
67
-
68
- # Visor Inspector
69
-
70
- Visor includes an optional desktop Inspector tool for:
71
-
72
- * Capturing templates
73
- * Testing image matches
74
- * Measuring screen coordinates
75
- * Validating confidence thresholds
76
-
77
- Run:
78
-
79
- ```bash
80
- npx visor-inspector
81
- ```
82
-
83
- ---
84
-
85
- # Quick Start
86
-
87
- ```ts
88
- import { visor, Region } from "@aspiresys/visor";
89
-
90
- async function main() {
91
-
92
- visor.loadConfig({
93
- imagePath: "./images",
94
- debug: true
95
- });
96
-
97
- await visor.openApp("notepad");
98
-
99
- await visor.wait("notepad.png");
100
-
101
- await visor.click("notepad.png");
102
-
103
- await visor.type(
104
- "Hello from Visor"
105
- );
106
- }
107
-
108
- main();
109
- ```
110
-
111
- ---
112
-
113
- # Configuration
114
-
115
- ```ts
116
- visor.loadConfig({
117
- imagePath: "./images",
118
- debug: true
119
- });
120
- ```
121
-
122
- ---
123
-
124
- ## Configuration Options
125
-
126
- | Option | Description |
127
- | ------------ | ---------------------------------------- |
128
- | scaleFactor | Optional manual display scaling override |
129
- | imagePath | Default image directory |
130
- | ssOutputPath | Screenshot output directory |
131
- | debug | Enable debug logging |
132
-
133
- ---
134
-
135
- # Display Scaling
136
-
137
- Visor automatically detects Windows display scaling and adjusts mouse coordinates accordingly.
138
-
139
- Common scaling values:
140
-
141
- | Scaling | Value |
142
- | ------- | ----- |
143
- | 100% | 1.0 |
144
- | 125% | 1.25 |
145
- | 150% | 1.5 |
146
- | 175% | 1.75 |
147
- | 200% | 2.0 |
148
-
149
- Manual override is still supported:
150
-
151
- ```ts
152
- visor.loadConfig({
153
- scaleFactor: 1.5
154
- });
155
- ```
156
-
157
- ---
158
-
159
- # Multi-Scale Image Matching
160
-
161
- Visor automatically performs multi-scale template matching to support:
162
-
163
- * Different Windows scaling settings
164
- * Different screen resolutions
165
- * High-DPI displays
166
- * Cross-machine execution
167
-
168
- By default Visor evaluates templates across multiple scale levels and automatically selects the best match.
169
-
170
- Supported environments include:
171
-
172
- * 50% scaling
173
- * 75% scaling
174
- * 100% scaling
175
- * 125% scaling
176
- * 150% scaling
177
- * 175% scaling
178
- * 200% scaling
179
-
180
- This significantly improves image matching reliability when automation is executed across different machines.
181
-
182
- ---
183
-
184
- # Visual Automation APIs
185
-
186
- ## Click Image
187
-
188
- ```ts
189
- await visor.click("save.png");
190
- ```
191
-
192
- ---
193
-
194
- ## Find Image
195
-
196
- ```ts
197
- const region = await visor.find("icon.png");
198
- ```
199
-
200
- ---
201
-
202
- ## Region-Based Automation
203
-
204
- Regions can be obtained from:
205
-
206
- * visor.find()
207
- * visor.findAll()
208
- * visor.findText()
209
- * Visor Inspector
210
-
211
- ### Move To Region
212
-
213
- ```ts
214
- const region =
215
- await visor.find(
216
- "save.png"
217
- );
218
-
219
- await visor.moveToRegion(
220
- region
221
- );
222
- ```
223
-
224
- ### Click Region
225
-
226
- ```ts
227
- const region =
228
- await visor.find(
229
- "save.png"
230
- );
231
-
232
- await visor.clickRegion(
233
- region
234
- );
235
- ```
236
-
237
- ### Double Click Region
238
-
239
- ```ts
240
- await visor.doubleClickRegion(new Region(
241
- 100,
242
- 200,
243
- 150,
244
- 50
245
- ));
246
- ```
247
-
248
- ### Right Click Region
249
-
250
- ```ts
251
- await visor.rightClickRegion(new Region(
252
- 100,
253
- 200,
254
- 150,
255
- 50
256
- ));
257
- ```
258
-
259
- Display scaling is automatically applied when using region-based APIs.
260
-
261
- ---
262
-
263
- # Region Object API
264
-
265
- Regions returned by Visor are first-class objects that provide built-in automation methods.
266
-
267
- Regions can be obtained from:
268
-
269
- * visor.find()
270
- * visor.findAll()
271
- * visor.findText()
272
- * Visor Inspector
273
-
274
- Example:
275
-
276
- ```ts
277
- const dialog = await visor.find("dialog.png");
278
-
279
- const save = await dialog.find("save.png");
280
-
281
- await save.click();
282
- ```
283
-
284
- ---
285
-
286
- ```md
287
- ## Region.find()
288
-
289
- Search for an image within the current region.
290
- ```
291
- ```ts
292
- const dialog = await visor.find("dialog.png");
293
-
294
- const save = await dialog.find("save.png");
295
- ```
296
-
297
- ---
298
-
299
- ```md
300
- ## Region.findAll()
301
-
302
- Find all image matches within the current region.
303
- ```
304
- ```ts
305
- const dialog = await visor.find("dialog.png");
306
-
307
- const buttons = await dialog.findAll("button.png");
308
- ```
309
-
310
- ---
311
-
312
- ```md
313
- ## Region.exists()
314
-
315
- Check whether an image exists within the current region.
316
- ```
317
- ```ts
318
- const dialog = await visor.find("dialog.png");
319
-
320
- const exists = await dialog.exists("save.png");
321
- ```
322
-
323
- ---
324
-
325
- ```md
326
- ## Region.findText()
327
-
328
- Search for text within the current region.
329
- ```
330
- ```ts
331
- const dialog = await visor.find("dialog.png");
332
-
333
- const submit = await dialog.findText("Submit");
334
- ```
335
-
336
- ---
337
-
338
- ```md
339
- ## Region.existsText()
340
-
341
- Check whether text exists within the current region.
342
- ```
343
- ```ts
344
- const dialog = await visor.find("dialog.png");
345
-
346
- const exists = await dialog.existsText("Success");
347
- ```
348
-
349
- ---
350
-
351
- ```md
352
- ## Region.readText()
353
-
354
- Extract OCR text from the current region.
355
- ```
356
- ```ts
357
- const dialog = await visor.find("dialog.png");
358
-
359
- const result = await dialog.readText();
360
-
361
- console.log(result.text);
362
- ```
363
-
364
- ---
365
-
366
- ```md
367
- ## Region.click()
368
- ```
369
- ```ts
370
- const save = await visor.find("save.png");
371
-
372
- await save.click();
373
- ```
374
-
375
- ---
376
-
377
- ```md
378
- ## Region.doubleClick()
379
- ```
380
- ```ts
381
- await save.doubleClick();
382
- ```
383
- ---
384
-
385
- ```md
386
- ## Region.rightClick()
387
- ```
388
- ```ts
389
- await save.rightClick();
390
- ```
391
- ---
392
-
393
- ```md
394
- ## Region.move()
395
- ```
396
- ```ts
397
- await save.move();
398
- ```
399
- ---
400
-
401
- ## Check Image Exists
402
-
403
- ```ts
404
- const exists = await visor.exists("login.png");
405
- ```
406
-
407
- ---
408
-
409
- ## Wait For Image
410
-
411
- ```ts
412
- await visor.wait("save.png");
413
-
414
- await visor.wait("save.png", {
415
- confidence: 0.9,
416
- timeout: 10000
417
- });
418
- ```
419
-
420
- ---
421
-
422
- ## Wait For Multiple Images
423
-
424
- ```ts
425
- await visor.waitAny([
426
- "light-theme.png",
427
- "dark-theme.png"
428
- ]);
429
- ```
430
-
431
- ---
432
-
433
- ## Click Multiple Theme Variants
434
-
435
- ```ts
436
- await visor.clickAny([
437
- "send-light.png",
438
- "send-dark.png"
439
- ]);
440
- ```
441
-
442
- ---
443
-
444
- ## Drag & Drop
445
-
446
- ```ts
447
- await visor.dragDrop(
448
- "source.png",
449
- "target.png"
450
- );
451
- ```
452
-
453
- ---
454
-
455
- ## Hover
456
-
457
- ```ts
458
- await visor.hover("menu.png");
459
- ```
460
-
461
- ---
462
-
463
- ## Target Offsets
464
-
465
- Target offsets allow mouse actions to be performed relative to the center of a matched image.
466
-
467
- Useful for:
468
-
469
- * Dropdown arrows
470
- * Adjacent controls
471
- * Dynamic layouts
472
- * Composite UI elements
473
-
474
- ### Click With Offset
475
-
476
- ```ts
477
- await visor.click(
478
- "search.png",
479
- 0.8,
480
- {
481
- x: 50,
482
- y: 0
483
- }
484
- );
485
- ```
486
-
487
- ### Hover With Offset
488
-
489
- ```ts
490
- await visor.hover(
491
- "menu.png",
492
- 0.8,
493
- {
494
- x: -20,
495
- y: 10
496
- }
497
- );
498
- ```
499
-
500
- Offsets are applied relative to the center of the matched region before display scaling adjustments are performed.
501
-
502
- ---
503
-
504
- # OCR Automation
505
-
506
- Visor includes OCR automation powered by Tesseract.js.
507
-
508
- OCR supports:
509
-
510
- * Full-screen OCR
511
- * Region OCR
512
- * Text search
513
- * Text clicking
514
- * Text waiting
515
- * OCR occurrence indexing
516
-
517
- ---
518
-
519
- ## Read Screen
520
-
521
- ```ts
522
- const result = await visor.readScreen();
523
-
524
- console.log(result.text);
525
- ```
526
-
527
- ---
528
-
529
- ## Read Region
530
-
531
- ```ts
532
- const result =
533
- await visor.readRegion(new Region(
534
- 100,
535
- 100,
536
- 500,
537
- 300
538
- ));
539
-
540
- console.log(result.text);
541
- ```
542
-
543
- ---
544
-
545
- ## Find Text
546
-
547
- ```ts
548
- const region = await visor.findText("Submit");
549
- ```
550
-
551
- ---
552
-
553
- ## Click Text
554
-
555
- ```ts
556
- await visor.clickText("Login");
557
- ```
558
-
559
- ---
560
-
561
- ## Wait For Text
562
-
563
- ```ts
564
- await visor.waitText("Success");
565
- ```
566
-
567
- ---
568
-
569
- # OCR Occurrence Indexing
570
-
571
- When the same text appears multiple times on screen, Visor allows selecting a specific occurrence.
572
-
573
- ```ts
574
- await visor.clickText("Inbox", 0);
575
- await visor.clickText("Inbox", 1);
576
- await visor.clickText("Inbox", 2);
577
- ```
578
-
579
- OCR elements are processed from:
580
-
581
- ```text
582
- Top → Bottom
583
- Left → Right
584
- ```
585
-
586
- This improves automation stability when multiple matching text elements exist on screen.
587
-
588
- ---
589
-
590
- # OCR Optimizations
591
-
592
- Visor includes:
593
-
594
- * Shared OCR worker reuse
595
- * OCR preprocessing
596
- * Grayscale normalization
597
- * Image sharpening
598
- * Confidence filtering
599
- * OCR occurrence indexing
600
-
601
- Benefits:
602
-
603
- * Faster OCR execution
604
- * Improved OCR accuracy
605
- * Lower memory usage
606
- * Improved framework stability
607
-
608
- ---
609
-
610
- # Mouse Automation
611
-
612
- ## Move Mouse
613
-
614
- ```ts
615
- await visor.moveMouse(
616
- 500,
617
- 300
618
- );
619
- ```
620
-
621
- ---
622
-
623
- ### Move To Inspector Region
624
-
625
- ```ts
626
- await visor.moveToRegion(new Region(
627
- 90,
628
- 61,
629
- 138,
630
- 69
631
- ));
632
- ```
633
-
634
- Region coordinates can be copied directly from Visor Inspector match results.
635
-
636
- ---
637
-
638
- ## Scroll Down
639
-
640
- ```ts
641
- await visor.scrollDown(1000);
642
- ```
643
-
644
- ---
645
-
646
- ## Scroll Up
647
-
648
- ```ts
649
- await visor.scrollUp(1000);
650
- ```
651
-
652
- ---
653
-
654
- ## Mouse Position
655
-
656
- ```ts
657
- const pos = await visor.getMousePosition();
658
- ```
659
-
660
- ---
661
-
662
- # Keyboard Automation
663
-
664
- ## Type Text
665
-
666
- ```ts
667
- await visor.type(
668
- "Hello World"
669
- );
670
- ```
671
-
672
- ---
673
-
674
- ## Press Keys
675
-
676
- ```ts
677
- await visor.press(
678
- visor.Key.LeftControl,
679
- visor.Key.S
680
- );
681
- ```
682
-
683
- ---
684
-
685
- # Screenshot Automation
686
-
687
- ```ts
688
- await visor.captureScreenshot(
689
- "./screenshots/home.png"
690
- );
691
- ```
692
-
693
- ---
694
-
695
- # Desktop Application Automation
696
-
697
- ## Open Application
698
-
699
- ```ts
700
- await visor.openApp("notepad");
701
- ```
702
-
703
- ---
704
-
705
- ## Close Application
706
-
707
- ```ts
708
- await visor.closeApp("notepad.exe");
709
- ```
710
-
711
- ---
712
-
713
- # Confidence Thresholds
714
-
715
- Supported range:
716
-
717
- ```text
718
- 0.0 - 1.0
719
- ```
720
-
721
- Recommended values:
722
-
723
- | Confidence | Usage |
724
- | ---------- | --------------- |
725
- | 0.7 | Dynamic UI |
726
- | 0.8 | General usage |
727
- | 0.9 | Strict matching |
728
-
729
- ---
730
-
731
- # Performance Improvements
732
-
733
- Visor includes:
734
-
735
- * Shared OCR worker reuse
736
- * Multi-scale image matching
737
- * OCR preprocessing pipeline
738
- * Automatic display scaling detection
739
-
740
- These improvements increase reliability across varying display configurations and reduce OCR initialization overhead.
741
-
742
- ---
743
-
744
- # Troubleshooting
745
-
746
- ## Image Not Found
747
-
748
- Possible causes:
749
-
750
- * Incorrect image path
751
- * Low confidence threshold
752
- * Theme mismatch
753
- * Poor template quality
754
-
755
- ---
756
-
757
- ## OCR Not Detecting Text
758
-
759
- Possible causes:
760
-
761
- * Small fonts
762
- * Low contrast text
763
- * Blurry UI elements
764
-
765
- ---
766
-
767
- ## Mouse Clicking Incorrect Position
768
-
769
- Visor automatically detects Windows display scaling.
770
-
771
- If required, manually override:
772
-
773
- ```ts
774
- visor.loadConfig({
775
- scaleFactor: 1.5
776
- });
777
- ```
778
-
779
- ---
780
-
781
- # Roadmap
782
-
783
- * Match visualization overlay
784
- * Inspector coordinate picker
785
- * Multi-monitor support improvements
786
- * Parallel image matching
787
- * Advanced OCR tuning
788
- * Electron recorder
789
- * AI-assisted automation
790
-
791
- ---
792
-
793
- # Tech Stack
794
-
795
- * OpenCV
796
- * Tesseract.js
797
- * screenshot-desktop
798
- * sharp
799
- * nut.js
1
+ # Visor
2
+
3
+ Desktop Visual Automation Framework for Node.js and TypeScript.
4
+
5
+ Visor is a visual desktop automation framework that combines:
6
+
7
+ - OpenCV image matching
8
+ - OCR text recognition
9
+ - Mouse & keyboard automation
10
+ - Desktop application automation
11
+
12
+ Visor is designed for automating desktop workflows using visual interactions instead of traditional DOM/browser automation.
13
+
14
+ ---
15
+
16
+ # Features
17
+
18
+ - OpenCV-based image matching
19
+ - Multi-scale image matching
20
+ - OCR automation using Tesseract
21
+ - OCR occurrence indexing (beta)
22
+ - Region OCR support
23
+ - Automatic display scaling detection
24
+ - Mouse automation
25
+ - Region-based mouse automation
28
+ - Target offset support
29
+ - Keyboard automation
30
+ - Drag & drop support
31
+ - Screenshot capture
32
+ - Desktop application automation
33
+ - OCR text searching
34
+ - Wait APIs
35
+ - Multi-image matching
36
+ - Config-driven initialization
37
+ - High-DPI display scaling support
38
+
39
+ ---
40
+
41
+ ## What's New in 1.4.x
42
+
43
+ - Automatic resolution-aware template matching
44
+ - Template metadata support (.properties.json)
45
+ - visor.version()
46
+ - Region.capture()
47
+ - Region.waitAnyImg()
48
+ - Region.clickAny()
49
+ - Improved DPI-aware matching
50
+ - Faster image matching using predicted scaling
51
+
52
+ ---
53
+
54
+ # Installation
55
+
56
+ ```bash
57
+ npm install @aspiresys/visor
58
+ ```
59
+
60
+ ---
61
+
62
+ # Requirements
63
+
64
+ - Windows
65
+ - Node.js 18+
66
+ - TypeScript
67
+
68
+ ---
69
+
70
+ # Visor Inspector
71
+
72
+ Visor includes an optional desktop Inspector tool for:
73
+
74
+ - Capturing templates
75
+ - Testing image matches
76
+ - Measuring screen coordinates
77
+ - Validating confidence thresholds
78
+
79
+ Run:
80
+
81
+ ```bash
82
+ npx visor-inspector
83
+ ```
84
+
85
+ ---
86
+
87
+ # Template Metadata
88
+
89
+ Visor Inspector automatically creates a
90
+ .properties.json file alongside captured templates.
91
+
92
+ Example:
93
+
94
+ - save.png
95
+ - save.properties.json
96
+
97
+ Metadata includes:
98
+
99
+ - Captured resolution
100
+ - Display scaling factor
101
+ - Capture environment
102
+
103
+ Visor uses this metadata to predict the
104
+ correct image scale during automation,
105
+ significantly improving matching speed
106
+ and reliability across machines.
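
The scale prediction described above can be sketched as follows. This is a minimal illustration, not Visor's implementation: the `TemplateMetadata` shape and the `capturedScaleFactor` field name are hypothetical, and the real `.properties.json` layout may differ. The sketch assumes the predicted resize factor is the ratio between the current display scaling and the scaling at capture time.

```ts
// Hypothetical metadata shape; the actual .properties.json fields may differ.
interface TemplateMetadata {
  capturedScaleFactor: number; // e.g. 1.25 for a template captured at 125% scaling
}

// Predict the resize factor to try first on the current display.
function predictTemplateScale(
  meta: TemplateMetadata,
  currentScaleFactor: number
): number {
  if (meta.capturedScaleFactor <= 0 || currentScaleFactor <= 0) {
    throw new RangeError('scale factors must be positive');
  }
  return currentScaleFactor / meta.capturedScaleFactor;
}

// Order candidate scales so the one closest to the prediction is tried first.
function orderScalesByPrediction(scales: number[], predicted: number): number[] {
  return scales
    .slice()
    .sort((a, b) => Math.abs(a - predicted) - Math.abs(b - predicted));
}

const predicted = predictTemplateScale({ capturedScaleFactor: 1.0 }, 1.5);
console.log(orderScalesByPrediction([0.5, 0.75, 1, 1.25, 1.5], predicted));
```

Trying the predicted scale first means a full multi-scale search is only needed when the prediction misses.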
107
+
108
+ ---
109
+
110
+ # Quick Start
111
+
112
+ ```ts
113
+ import { visor, Region } from '@aspiresys/visor';
114
+
115
+ async function main() {
116
+ visor.loadConfig({
117
+ imagePath: './images',
118
+ debug: true,
119
+ });
120
+
121
+ await visor.openApp('notepad');
122
+
123
+ await visor.wait('notepad.png');
124
+
125
+ await visor.click('notepad.png');
126
+
127
+ await visor.type('Hello from Visor');
128
+ }
129
+
130
+ main();
131
+ ```
132
+
133
+ ---
134
+
135
+ # Configuration
136
+
137
+ ```ts
138
+ visor.loadConfig({
139
+ imagePath: './images',
140
+ debug: true,
141
+ });
142
+ ```
143
+
144
+ ---
145
+
146
+ ## Configuration Options
147
+
148
+ | Option | Description |
149
+ | ------------ | ---------------------------------------- |
150
+ | scaleFactor | Optional manual display scaling override |
151
+ | imagePath | Default image directory |
152
+ | ssOutputPath | Screenshot output directory |
153
+ | debug | Enable debug logging |
154
+
155
+ ---
156
+
157
+ # Display Scaling
158
+
159
+ Visor automatically detects Windows display scaling and adjusts mouse coordinates accordingly.
160
+
161
+ Common scaling values:
162
+
163
+ | Scaling | Value |
164
+ | ------- | ----- |
165
+ | 100% | 1.0 |
166
+ | 125% | 1.25 |
167
+ | 150% | 1.5 |
168
+ | 175% | 1.75 |
169
+ | 200% | 2.0 |
170
+
171
+ Manual override is still supported:
172
+
173
+ ```ts
174
+ visor.loadConfig({
175
+ scaleFactor: 1.5,
176
+ });
177
+ ```
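
To make the scaling table concrete, here is a small sketch of how a scaling percentage maps to a scale factor and how a coordinate might be adjusted. The helper names are hypothetical and not part of the Visor API, and the direction of the adjustment (dividing physical screenshot pixels by the factor to get logical mouse coordinates) is an assumption.

```ts
interface Point {
  x: number;
  y: number;
}

// Convert a Windows scaling percentage (e.g. 150) into a scale factor (1.5).
function scaleFactorFromPercent(percent: number): number {
  if (percent <= 0) {
    throw new RangeError('percent must be positive');
  }
  return percent / 100;
}

// Assumption: matches are located in physical pixels and the mouse expects
// logical coordinates, so each coordinate is divided by the scale factor.
function toLogicalPoint(physical: Point, scaleFactor: number): Point {
  return {
    x: Math.round(physical.x / scaleFactor),
    y: Math.round(physical.y / scaleFactor),
  };
}

const factor = scaleFactorFromPercent(150);
console.log(toLogicalPoint({ x: 300, y: 150 }, factor));
```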
178
+
179
+ ---
180
+
181
+ # Multi-Scale Image Matching
182
+
183
+ Visor automatically performs multi-scale template matching to support:
184
+
185
+ - Different Windows scaling settings
186
+ - Different screen resolutions
187
+ - High-DPI displays
188
+ - Cross-machine execution
189
+
190
+ By default Visor evaluates templates across multiple scale levels and automatically selects the best match.
191
+
192
+ Supported environments include:
193
+
194
+ - 50% scaling
195
+ - 75% scaling
196
+ - 100% scaling
197
+ - 125% scaling
198
+ - 150% scaling
199
+ - 175% scaling
200
+ - 200% scaling
201
+
202
+ This significantly improves image matching reliability when automation is executed across different machines.
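
The "evaluate several scales, keep the best" idea can be sketched without any image library. In this illustration the per-scale confidences are supplied as plain data; the `ScaleMatch` type, the helper names, and the assumption that the evaluated levels are 50% to 200% in 25% steps are hypothetical and not taken from Visor's source.

```ts
interface ScaleMatch {
  scale: number; // resize factor applied to the template
  confidence: number; // match score in the range 0.0 to 1.0
}

// Assumed scale levels: 50% to 200% in 25% steps.
function defaultScales(): number[] {
  const scales: number[] = [];
  for (let percent = 50; percent <= 200; percent += 25) {
    scales.push(percent / 100);
  }
  return scales;
}

// Return the highest-confidence candidate at or above the threshold, or null.
function selectBestMatch(
  candidates: ScaleMatch[],
  minConfidence: number
): ScaleMatch | null {
  let best: ScaleMatch | null = null;
  for (const candidate of candidates) {
    if (candidate.confidence < minConfidence) continue;
    if (best === null || candidate.confidence > best.confidence) {
      best = candidate;
    }
  }
  return best;
}

console.log(defaultScales());
console.log(
  selectBestMatch(
    [
      { scale: 1, confidence: 0.7 },
      { scale: 1.25, confidence: 0.93 },
      { scale: 1.5, confidence: 0.85 },
    ],
    0.8
  )
);
```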
203
+
204
+ ---
205
+
206
+ # Visual Automation APIs
207
+
208
+ ## Click Image
209
+
210
+ ```ts
211
+ await visor.click('save.png');
212
+ ```
213
+
214
+ ---
215
+
216
+ ## Find Image
217
+
218
+ ```ts
219
+ const region = await visor.find('icon.png');
220
+ ```
221
+
222
+ ---
223
+
224
+ ## Region-Based Automation
225
+
226
+ Regions can be obtained from:
227
+
228
+ - visor.find()
229
+ - visor.findAll()
230
+ - visor.findText()
231
+ - Visor Inspector
232
+
233
+ ### Move To Region
234
+
235
+ ```ts
236
+ const region = await visor.find('save.png');
237
+
238
+ await visor.moveToRegion(region);
239
+ ```
240
+
241
+ ### Click Region
242
+
243
+ ```ts
244
+ const region = await visor.find('save.png');
245
+
246
+ await visor.clickRegion(region);
247
+ ```
248
+
249
+ ### Double Click Region
250
+
251
+ ```ts
252
+ await visor.doubleClickRegion(new Region(100, 200, 150, 50));
253
+ ```
254
+
255
+ ### Right Click Region
256
+
257
+ ```ts
258
+ await visor.rightClickRegion(new Region(100, 200, 150, 50));
259
+ ```
260
+
261
+ Display scaling is automatically applied when using region-based APIs.
262
+
263
+ ---
264
+
265
+ # Region Object API
266
+
267
+ Regions returned by Visor are first-class objects that provide built-in automation methods.
268
+
269
+ Regions can be obtained from:
270
+
271
+ - visor.find()
272
+ - visor.findAll()
273
+ - visor.findText()
274
+ - Visor Inspector
275
+
276
+ Example:
277
+
278
+ ```ts
279
+ const dialog = await visor.find('dialog.png');
280
+
281
+ const save = await dialog.find('save.png');
282
+
283
+ await save.click();
284
+ ```
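
Searching inside a region implies that a match found relative to the parent must be translated back to screen coordinates before it can be clicked. The following is a rough sketch of that translation under stated assumptions: `Rect` is a hypothetical stand-in for Visor's `Region`, whose internals are not documented here.

```ts
// Hypothetical rectangle type; not Visor's Region class.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Translate a match located relative to a parent region into screen coordinates.
function toScreenRect(parent: Rect, local: Rect): Rect {
  return {
    x: parent.x + local.x,
    y: parent.y + local.y,
    width: local.width,
    height: local.height,
  };
}

// Check that one rectangle lies fully inside another.
function contains(outer: Rect, inner: Rect): boolean {
  return (
    inner.x >= outer.x &&
    inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height
  );
}

const dialogRect: Rect = { x: 100, y: 200, width: 400, height: 300 };
const saveRect = toScreenRect(dialogRect, { x: 20, y: 30, width: 80, height: 24 });
console.log(saveRect, contains(dialogRect, saveRect));
```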
285
+
286
+ ---
287
+
288
+ ## Region.find()
290
+
291
+ Search for an image within the current region.
293
+
294
+ ```ts
295
+ const dialog = await visor.find('dialog.png');
296
+
297
+ const save = await dialog.find('save.png');
298
+ ```
299
+
300
+ ---
301
+
302
+ ## Region.findAll()
304
+
305
+ Find all image matches within the current region.
307
+
308
+ ```ts
309
+ const dialog = await visor.find('dialog.png');
310
+
311
+ const buttons = await dialog.findAll('button.png');
312
+ ```
313
+
314
+ ---
315
+
316
+ ## Region.exists()
318
+
319
+ Check whether an image exists within the current region.
321
+
322
+ ```ts
323
+ const dialog = await visor.find('dialog.png');
324
+
325
+ const exists = await dialog.exists('save.png');
326
+ ```
327
+
328
+ ---
329
+
330
+ ## Region.findText()
332
+
333
+ Search for text within the current region.
335
+
336
+ ```ts
337
+ const dialog = await visor.find('dialog.png');
338
+
339
+ const submit = await dialog.findText('Submit');
340
+ ```
341
+
342
+ ---
343
+
344
+ ## Region.existsText()
346
+
347
+ Check whether text exists within the current region.
349
+
350
+ ```ts
351
+ const dialog = await visor.find('dialog.png');
352
+
353
+ const exists = await dialog.existsText('Success');
354
+ ```
355
+
356
+ ---
357
+
358
+ ## Region.readText()
360
+
361
+ Extract OCR text from the current region.
363
+
364
+ ```ts
365
+ const dialog = await visor.find('dialog.png');
366
+
367
+ const result = await dialog.readText();
368
+
369
+ console.log(result.text);
370
+ ```
371
+
372
+ ---
373
+
374
+ ## Region.click()
377
+
378
+ ```ts
379
+ const save = await visor.find('save.png');
380
+
381
+ await save.click();
382
+ ```
383
+
384
+ ---
385
+
386
+ ## Region.doubleClick()
389
+
390
+ ```ts
391
+ await save.doubleClick();
392
+ ```
393
+
394
+ ---
395
+
396
+ ## Region.rightClick()
399
+
400
+ ```ts
401
+ await save.rightClick();
402
+ ```
403
+
404
+ ---
405
+
406
+ ## Region.move()
409
+
410
+ ```ts
411
+ await save.move();
412
+ ```
413
+
414
+ ---
415
+
416
+ ## Check Image Exists
417
+
418
+ ```ts
419
+ const exists = await visor.exists('login.png');
420
+ ```
421
+
422
+ ---
423
+
424
+ ## Wait For Image
425
+
426
+ ```ts
427
+ await visor.wait('save.png');
428
+
429
+ await visor.wait('save.png', {
430
+ confidence: 0.9,
431
+ timeout: 10000,
432
+ });
433
+ ```
434
+
435
+ ---
436
+
437
+ ## Wait For Multiple Images
438
+
439
+ ```ts
440
+ await visor.waitAny(['light-theme.png', 'dark-theme.png']);
441
+ ```
442
+
443
+ ---
444
+
445
+ ## Click Multiple Theme Variants
446
+
447
+ ```ts
448
+ await visor.clickAny(['send-light.png', 'send-dark.png']);
449
+ ```
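
The "any of several templates" pattern used by `waitAny` and `clickAny` can be illustrated with a small selection helper. This is a sketch under assumptions: it tries templates in list order and returns the first one that meets the threshold, whereas Visor may choose differently (for example, the highest-scoring one). The `Probe` type and `firstMatching` name are hypothetical.

```ts
// Hypothetical probe: returns a match confidence for a template, or null if not found.
type Probe = (template: string) => number | null;

// Return the first template (in list order) whose score meets the threshold.
function firstMatching(
  templates: string[],
  probe: Probe,
  minConfidence: number
): string | null {
  for (const template of templates) {
    const score = probe(template);
    if (score !== null && score >= minConfidence) {
      return template;
    }
  }
  return null;
}

// Simulated screen where only the dark-theme button is visible.
const darkThemeProbe: Probe = (template) =>
  template === 'send-dark.png' ? 0.9 : null;

console.log(firstMatching(['send-light.png', 'send-dark.png'], darkThemeProbe, 0.8));
```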
450
+
451
+ ---
452
+
453
+ ## Drag & Drop
454
+
455
+ ```ts
456
+ await visor.dragDrop('source.png', 'target.png');
457
+ ```
458
+
459
+ ---
460
+
461
+ ## Hover
462
+
463
+ ```ts
464
+ await visor.hover('menu.png');
465
+ ```
466
+
467
+ ---
468
+
469
+ ## Target Offsets
470
+
471
+ Target offsets allow mouse actions to be performed relative to the center of a matched image.
472
+
473
+ Useful for:
474
+
475
+ - Dropdown arrows
476
+ - Adjacent controls
477
+ - Dynamic layouts
478
+ - Composite UI elements
479
+
480
+ ### Click With Offset
481
+
482
+ ```ts
483
+ await visor.click('search.png', 0.8, {
484
+ x: 50,
485
+ y: 0,
486
+ });
487
+ ```
488
+
489
+ ### Hover With Offset
490
+
491
+ ```ts
492
+ await visor.hover('menu.png', 0.8, {
493
+ x: -20,
494
+ y: 10,
495
+ });
496
+ ```
497
+
498
+ Offsets are applied relative to the center of the matched region before display scaling adjustments are performed.
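
The order of operations stated above (center, then offset, then scaling) can be sketched as a pure function. The `Rect` and `Offset` types and the `targetPoint` name are hypothetical, and dividing by the scale factor as the scaling adjustment is an assumption about direction.

```ts
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Offset {
  x: number;
  y: number;
}

// Compute the point to click: region center plus offset, then scaling.
function targetPoint(match: Rect, offset: Offset, scaleFactor: number): Offset {
  // 1. Center of the matched region, 2. apply the offset.
  const cx = match.x + match.width / 2 + offset.x;
  const cy = match.y + match.height / 2 + offset.y;
  // 3. Assumed scaling adjustment: physical pixels to logical coordinates.
  return {
    x: Math.round(cx / scaleFactor),
    y: Math.round(cy / scaleFactor),
  };
}

const searchBox: Rect = { x: 100, y: 200, width: 150, height: 50 };
console.log(targetPoint(searchBox, { x: 50, y: 0 }, 1.5));
```

Because the offset is added before scaling, offsets are expressed in the same pixel space as the captured template.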
499
+
500
+ ---
501
+
502
+ # OCR Automation
503
+
504
+ Visor includes OCR automation powered by Tesseract.js.
505
+
506
+ OCR supports:
507
+
508
+ - Full-screen OCR
509
+ - Region OCR
510
+ - Text search
511
+ - Text clicking
512
+ - Text waiting
513
+ - OCR occurrence indexing
514
+
515
+ ---
516
+
517
+ ## Read Screen
518
+
519
+ ```ts
520
+ const result = await visor.readScreen();
521
+
522
+ console.log(result.text);
523
+ ```
524
+
525
+ ---
526
+
527
+ ## Read Region
528
+
529
+ ```ts
530
+ const result = await visor.readRegion(new Region(100, 100, 500, 300));
531
+
532
+ console.log(result.text);
533
+ ```
534
+
535
+ ---
536
+
537
+ ## Find Text
538
+
539
+ ```ts
540
+ const region = await visor.findText('Submit');
541
+ ```
542
+
543
+ ---
544
+
545
+ ## Click Text
546
+
547
+ ```ts
548
+ await visor.clickText('Login');
549
+ ```
550
+
551
+ ---
552
+
553
+ ## Wait For Text
554
+
555
+ ```ts
556
+ await visor.waitText('Success');
557
+ ```
558
+
559
+ ---
560
+
561
+ # OCR Occurrence Indexing
562
+
563
+ When the same text appears multiple times on screen, Visor allows selecting a specific occurrence.
564
+
565
+ ```ts
566
+ await visor.clickText('Inbox', 0);
567
+ await visor.clickText('Inbox', 1);
568
+ await visor.clickText('Inbox', 2);
569
+ ```
570
+
571
+ OCR elements are processed from:
572
+
573
+ ```text
574
+ Top → Bottom
575
+ Left → Right
576
+ ```
577
+
578
+ This improves automation stability when multiple matching text elements exist on screen.
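
Occurrence indexing can be pictured as sorting OCR results in reading order and picking the n-th match. The sketch below is illustrative only: the `OcrWord` shape and helper names are hypothetical, and it compares exact `y` values, whereas a real implementation would likely group words into rows with some tolerance.

```ts
// Hypothetical OCR result shape: recognized text plus its top-left position.
interface OcrWord {
  text: string;
  x: number;
  y: number;
}

// Sort top to bottom, then left to right.
function sortReadingOrder(words: OcrWord[]): OcrWord[] {
  return words.slice().sort((a, b) => a.y - b.y || a.x - b.x);
}

// Return the occurrence at the given zero-based index, or undefined if absent.
function nthOccurrence(
  words: OcrWord[],
  text: string,
  index: number
): OcrWord | undefined {
  const matches = sortReadingOrder(words).filter((word) => word.text === text);
  return matches[index];
}

const words: OcrWord[] = [
  { text: 'Inbox', x: 300, y: 50 },
  { text: 'Inbox', x: 20, y: 400 },
  { text: 'Sent', x: 20, y: 100 },
  { text: 'Inbox', x: 20, y: 50 },
];

console.log(nthOccurrence(words, 'Inbox', 1));
```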
579
+
580
+ ---
581
+
582
+ # OCR Optimizations
583
+
584
+ Visor includes:
585
+
586
+ - Shared OCR worker reuse
587
+ - OCR preprocessing
588
+ - Grayscale normalization
589
+ - Image sharpening
590
+ - Confidence filtering
591
+ - OCR occurrence indexing
592
+
593
+ Benefits:
594
+
595
+ - Faster OCR execution
596
+ - Improved OCR accuracy
597
+ - Lower memory usage
598
+ - Improved framework stability
599
+
600
+ ---
601
+
602
+ # Mouse Automation
603
+
604
+ ## Move Mouse
605
+
606
+ ```ts
607
+ await visor.moveMouse(500, 300);
608
+ ```
609
+
610
+ ---
611
+
612
+ ### Move To Inspector Region
613
+
614
+ ```ts
615
+ await visor.moveToRegion(new Region(90, 61, 138, 69));
616
+ ```
617
+
618
+ Region coordinates can be copied directly from Visor Inspector match results.
619
+
620
+ ---
621
+
622
+ ## Scroll Down
623
+
624
+ ```ts
625
+ await visor.scrollDown(1000);
626
+ ```
627
+
628
+ ---
629
+
630
+ ## Scroll Up
631
+
632
+ ```ts
633
+ await visor.scrollUp(1000);
634
+ ```
635
+
636
+ ---
637
+
638
+ ## Mouse Position
639
+
640
+ ```ts
641
+ const pos = await visor.getMousePosition();
642
+ ```
643
+
644
+ ---
645
+
646
+ # Keyboard Automation
647
+
648
+ ## Type Text
649
+
650
+ ```ts
651
+ await visor.type('Hello World');
652
+ ```
653
+
654
+ ---
655
+
656
+ ## Press Keys
657
+
658
+ ```ts
659
+ await visor.press(visor.Key.LeftControl, visor.Key.S);
660
+ ```
661
+
662
+ ---
663
+
664
+ # Screenshot Automation
665
+
666
+ ```ts
667
+ await visor.captureScreenshot('./screenshots/home.png');
668
+ ```
669
+
670
+ ---
671
+
672
+ # Desktop Application Automation
673
+
674
+ ## Open Application
675
+
676
+ ```ts
677
+ await visor.openApp('notepad');
678
+ ```
679
+
680
+ ---
681
+
682
+ ## Close Application
683
+
684
+ ```ts
685
+ await visor.closeApp('notepad.exe');
686
+ ```
687
+
688
+ ---
689
+
690
+ # Confidence Thresholds
691
+
692
+ Supported range:
693
+
694
+ ```text
695
+ 0.0 - 1.0
696
+ ```
697
+
698
+ Recommended values:
699
+
700
+ | Confidence | Usage |
701
+ | ---------- | --------------- |
702
+ | 0.7 | Dynamic UI |
703
+ | 0.8 | General usage |
704
+ | 0.9 | Strict matching |
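
As a small illustration of how a threshold in this range is applied, the sketch below validates the value and keeps only matches at or above it. The helper names are hypothetical and not part of the Visor API.

```ts
// Reject thresholds outside the supported 0.0 to 1.0 range (including NaN).
function assertConfidence(value: number): number {
  if (!(value >= 0 && value <= 1)) {
    throw new RangeError('confidence must be between 0.0 and 1.0');
  }
  return value;
}

// Keep only the matches whose confidence meets the threshold.
function filterByConfidence<T extends { confidence: number }>(
  matches: T[],
  threshold: number
): T[] {
  const min = assertConfidence(threshold);
  return matches.filter((match) => match.confidence >= min);
}

const candidates = [
  { name: 'a', confidence: 0.75 },
  { name: 'b', confidence: 0.85 },
  { name: 'c', confidence: 0.95 },
];

console.log(filterByConfidence(candidates, 0.8).map((m) => m.name));
```

A higher threshold trades missed matches for fewer false positives, which is why stricter values suit static UI and looser values suit dynamic UI.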
705
+
706
+ ---
707
+
708
+ # Performance Improvements
709
+
710
+ Visor includes:
711
+
712
+ - Shared OCR worker reuse
713
+ - Multi-scale image matching
714
+ - OCR preprocessing pipeline
715
+ - Automatic display scaling detection
716
+
717
+ These improvements increase reliability across varying display configurations and reduce OCR initialization overhead.
718
+
719
+ ---
720
+
721
+ # Troubleshooting
722
+
723
+ ## Image Not Found
724
+
725
+ Possible causes:
726
+
727
+ - Incorrect image path
728
+ - Low confidence threshold
729
+ - Theme mismatch
730
+ - Poor template quality
731
+
732
+ ---
733
+
734
+ ## OCR Not Detecting Text
735
+
736
+ Possible causes:
737
+
738
+ - Small fonts
739
+ - Low contrast text
740
+ - Blurry UI elements
741
+
742
+ ---
743
+
744
+ ## Mouse Clicking Incorrect Position
745
+
746
+ Visor automatically detects Windows display scaling.
747
+
748
+ If required, manually override:
749
+
750
+ ```ts
751
+ visor.loadConfig({
752
+ scaleFactor: 1.5,
753
+ });
754
+ ```
755
+
756
+ ---
757
+
758
+ # Roadmap
759
+
760
+ - Match visualization overlay
761
+ - Inspector coordinate picker
762
+ - Multi-monitor support improvements
763
+ - Parallel image matching
764
+ - Advanced OCR tuning
765
+ - Electron recorder
766
+ - AI-assisted automation
767
+
768
+ ---
769
+
770
+ # Tech Stack
771
+
772
+ - OpenCV
773
+ - Tesseract.js
774
+ - screenshot-desktop
775
+ - sharp
776
+ - nut.js
777
+
778
+ # Why Visor?
779
+
780
+ Unlike Selenium or Playwright,
781
+ Visor automates desktop applications
782
+ using image recognition and OCR.
783
+
784
+ Works with:
785
+
786
+ - Native Windows applications
787
+ - Citrix environments
788
+ - Remote desktops
789
+ - Thick-client applications
790
+ - Legacy systems