@aspiresys/visor 1.0.0

package/readme.md ADDED
@@ -0,0 +1,566 @@
+ # Visor
+
+ Desktop Visual Automation Framework for Node.js and TypeScript.
+
+ Visor is a visual desktop automation framework that combines:
+
+ - OpenCV image matching
+ - OCR text recognition
+ - Mouse & keyboard automation
+ - Desktop application automation
+
+ Visor is designed for automating desktop workflows using visual interactions instead of traditional DOM/browser automation.
+
+ ---
+
+ # Features
+
+ - OpenCV-based image matching
+ - OCR automation using Tesseract
+ - Mouse automation
+ - Keyboard automation
+ - Drag & drop support
+ - Multi-theme image handling
+ - Screenshot capture
+ - Desktop application automation
+ - OCR text searching
+ - Wait APIs
+ - Multi-image matching
+ - Config-driven initialization
+ - High-DPI display scaling support
+
+ ---
+
+ # Installation
+
+ ## Install from npm
+
+ ```bash
+ npm install @aspiresys/visor
+ ```
+
+ ---
+
+ ## Local Development
+
+ ```bash
+ npm install
+ npm run build
+ ```
+
+ ---
+
+ # Requirements
+
+ Visor currently supports:
+
+ - Windows
+ - Node.js 18+
+ - TypeScript
+
+ ---
+
+ # Quick Start
+
+ ```ts
+ import { visor } from "@aspiresys/visor";
+
+ async function main() {
+
+   visor.loadConfig({
+     scaleFactor: 1.5,      // 150% display scaling
+     imagePath: "./images", // default directory for template images
+     debug: true            // enable debug logging
+   });
+
+   await visor.openApp("notepad");
+
+   await visor.wait("notepad.png");
+
+   await visor.click("notepad.png");
+
+   await visor.type(
+     "Hello from Visor"
+   );
+
+ }
+
+ main().catch(console.error);
+ ```
+
+ ---
+
+ # Configuration
+
+ Visor is configured centrally through a single `loadConfig` call.
+
+ ```ts
+ visor.loadConfig({
+   scaleFactor: 1.5,
+   imagePath: "./images",
+   debug: true
+ });
+ ```
+
+ ---
+
+ ## Configuration Options
+
+ | Option | Description |
+ |---|---|
+ | `scaleFactor` | Display scaling factor (for example, `1.5` for 150% scaling) |
+ | `imagePath` | Default image directory |
+ | `debug` | Enables debug logging |
+
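The three options in the table can be described with a small TypeScript type. The sketch below is illustrative only: `VisorConfigSketch`, `defaultConfig`, and `resolveConfig` are hypothetical names, and the default values are assumptions rather than documented Visor defaults.

```ts
// Hypothetical shape of the options listed in the table above.
interface VisorConfigSketch {
  scaleFactor: number; // display scaling factor, e.g. 1.5 for 150%
  imagePath: string;   // default directory for template images
  debug: boolean;      // enables debug logging
}

// Assumed defaults, for illustration only.
const defaultConfig: VisorConfigSketch = {
  scaleFactor: 1.0,
  imagePath: "./images",
  debug: false,
};

// Merge user options over the defaults and reject an unusable scale factor.
function resolveConfig(options: Partial<VisorConfigSketch>): VisorConfigSketch {
  const config: VisorConfigSketch = { ...defaultConfig, ...options };
  if (!Number.isFinite(config.scaleFactor) || config.scaleFactor <= 0) {
    throw new RangeError(
      `scaleFactor must be a positive number, got ${config.scaleFactor}`
    );
  }
  return config;
}

const resolved = resolveConfig({ scaleFactor: 1.5, debug: true });
console.log(resolved);
```

Validating the scale factor up front is a design choice of this sketch: a bad value would otherwise only show up later as misplaced clicks or failed matches.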
+ ---
+
+ # Display Scaling
+
+ Display scaling is critical for desktop automation: the scale factor should match the display scaling configured in Windows.
+
+ Common values:
+
+ | Windows display scaling | `scaleFactor` value |
+ |---|---|
+ | 100% | `1.0` |
+ | 125% | `1.25` |
+ | 150% | `1.5` |
+ | 200% | `2.0` |
+
+ Example:
+
+ ```ts
+ visor.setScaleFactor(1.5);
+ ```
+
+ Incorrect scaling values may cause:
+
+ - failed image matching
+ - incorrect mouse positioning
+ - OCR region issues
+
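To see why a wrong value shifts clicks, it helps to look at the arithmetic between physical screenshot pixels and logical (scaled) coordinates. This is a generic sketch, not Visor's internal implementation: `Point`, `scaleFactorFromPercent`, `toLogical`, and `toPhysical` are hypothetical helpers, and which coordinate space Visor's mouse APIs expect is an assumption.

```ts
interface Point {
  x: number;
  y: number;
}

// Convert a Windows display scaling percentage (e.g. 150) to a scale factor (1.5).
function scaleFactorFromPercent(percent: number): number {
  if (!Number.isFinite(percent) || percent <= 0) {
    throw new RangeError(`Scaling percent must be positive, got ${percent}`);
  }
  return percent / 100;
}

// Physical pixel coordinates -> logical (scaled) coordinates.
function toLogical(p: Point, scaleFactor: number): Point {
  return { x: Math.round(p.x / scaleFactor), y: Math.round(p.y / scaleFactor) };
}

// Logical (scaled) coordinates -> physical pixel coordinates.
function toPhysical(p: Point, scaleFactor: number): Point {
  return { x: Math.round(p.x * scaleFactor), y: Math.round(p.y * scaleFactor) };
}

const factor = scaleFactorFromPercent(150);
const matchCentre: Point = { x: 900, y: 450 }; // located in a physical-pixel screenshot

console.log(toLogical(matchCentre, factor)); // { x: 600, y: 300 }
console.log(toLogical(matchCentre, 1.0));    // { x: 900, y: 450 }, off by 300 px horizontally
```

With the factor left at `1.0` on a 150% display, the same match would be clicked 300 pixels to the right and 150 pixels below the intended target in this example.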
+ ---
+
+ # Visual Automation APIs
+
+ ## Click Image
+
+ ```ts
+ await visor.click("save.png");
+ ```
+
+ ---
+
+ ## Find Image
+
+ ```ts
+ const region =
+   await visor.find("icon.png");
+ ```
+
+ ---
+
+ ## Check Image Exists
+
+ ```ts
+ const exists =
+   await visor.exists("login.png");
+ ```
+
+ ---
+
+ ## Wait For Image
+
+ ```ts
+ await visor.wait("loading-complete.png");
+ ```
+
+ ---
+
+ ## Wait For Multiple Images
+
+ ```ts
+ await visor.waitAny([
+   "home-light.png",
+   "home-dark.png"
+ ]);
+ ```
+
+ ---
+
+ ## Click Multiple Theme Variants
+
+ ```ts
+ await visor.clickAny([
+   "send-light.png",
+   "send-dark.png"
+ ]);
+ ```
+
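When many elements have light and dark templates, the variant lists can be generated rather than typed out. The helper below is a hypothetical convenience, not part of Visor; it assumes templates follow the `<name>-<theme>.png` naming used in the examples above.

```ts
// Build the list of theme-specific template names for one UI element.
// Assumes the "<name>-<theme>.png" naming convention from the examples above.
function themeVariants(
  baseName: string,
  themes: readonly string[] = ["light", "dark"]
): string[] {
  return themes.map((theme) => `${baseName}-${theme}.png`);
}

console.log(themeVariants("send")); // [ 'send-light.png', 'send-dark.png' ]
console.log(themeVariants("home", ["light", "dark", "contrast"]));
```

The resulting array has the same shape as the arguments passed to `visor.waitAny` and `visor.clickAny` above, for example `await visor.clickAny(themeVariants("send"))`.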
+ ---
+
+ ## Drag & Drop
+
+ ```ts
+ await visor.dragDrop(
+   "file.png",
+   "folder.png"
+ );
+ ```
+
+ ---
+
+ ## Hover
+
+ ```ts
+ await visor.hover("menu.png");
+ ```
+
+ ---
+
+ # OCR Automation
+
+ Visor includes OCR automation using Tesseract.js.
+
+ OCR supports:
+
+ - full screen text detection
+ - region-based OCR
+ - text matching
+ - OCR click automation
+
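Text matching and OCR click automation come down to finding a recognized word and turning its bounding box into a click point. The sketch below shows that idea on plain data. `OcrWord`, `BoundingBox`, and `findWordCentre` are hypothetical names loosely modelled on word-level OCR output (text, a 0 to 100 confidence, and a box), and the default minimum confidence of 60 is an assumption.

```ts
interface BoundingBox {
  x0: number; // left
  y0: number; // top
  x1: number; // right
  y1: number; // bottom
}

interface OcrWord {
  text: string;
  confidence: number; // assumed 0-100 scale
  bbox: BoundingBox;
}

// Return the centre of the first confident word that equals the target
// (case-insensitive), or null when no such word was recognized.
function findWordCentre(
  words: readonly OcrWord[],
  target: string,
  minConfidence = 60
): { x: number; y: number } | null {
  const wanted = target.trim().toLowerCase();
  const hit = words.find(
    (w) => w.confidence >= minConfidence && w.text.trim().toLowerCase() === wanted
  );
  if (!hit) return null;
  return {
    x: Math.round((hit.bbox.x0 + hit.bbox.x1) / 2),
    y: Math.round((hit.bbox.y0 + hit.bbox.y1) / 2),
  };
}

const words: OcrWord[] = [
  { text: "Cancel", confidence: 91, bbox: { x0: 100, y0: 400, x1: 160, y1: 420 } },
  { text: "Submit", confidence: 88, bbox: { x0: 200, y0: 400, x1: 260, y1: 420 } },
  { text: "Subm1t", confidence: 35, bbox: { x0: 300, y0: 400, x1: 360, y1: 420 } },
];

console.log(findWordCentre(words, "submit")); // { x: 230, y: 410 }
console.log(findWordCentre(words, "Login"));  // null
```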
+ ---
+
+ ## Read Screen
+
+ ```ts
+ const result =
+   await visor.readScreen();
+
+ console.log(result.text);
+ ```
+
+ ---
+
+ ## Find Text
+
+ ```ts
+ const region =
+   await visor.findText("Submit");
+ ```
+
+ ---
+
+ ## Click Text
+
+ ```ts
+ await visor.clickText("Login");
+ ```
+
+ ---
+
+ ## Wait For Text
+
+ ```ts
+ await visor.waitText("Success");
+ ```
+
+ ---
+
+ # OCR Optimizations
+
+ Visor includes:
+
+ - shared OCR worker reuse
+ - OCR preprocessing
+ - confidence filtering
+ - grayscale normalization
+ - sharpened OCR processing
+
+ These optimizations improve:
+
+ - OCR speed
+ - OCR accuracy
+ - framework stability
+
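How Visor implements these steps is not documented here. As an illustration of what grayscale normalization means, the sketch below converts RGBA pixels to grayscale with the common Rec. 601 luma weights and then stretches the result to the full 0 to 255 range. `toNormalizedGrayscale` is a hypothetical function written in plain TypeScript, not a Visor or sharp API.

```ts
// Convert RGBA pixel data to 8-bit grayscale, then stretch contrast so the
// darkest pixel becomes 0 and the brightest becomes 255.
function toNormalizedGrayscale(rgba: Uint8Array): Uint8Array {
  const pixelCount = Math.floor(rgba.length / 4);
  const gray = new Uint8Array(pixelCount);
  let min = 255;
  let max = 0;

  for (let i = 0; i < pixelCount; i++) {
    const r = rgba[i * 4];
    const g = rgba[i * 4 + 1];
    const b = rgba[i * 4 + 2];
    // Rec. 601 luma weights; the alpha channel is ignored.
    const luma = Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    gray[i] = luma;
    if (luma < min) min = luma;
    if (luma > max) max = luma;
  }

  const range = max - min;
  if (range === 0) return gray; // flat image, nothing to stretch

  for (let i = 0; i < pixelCount; i++) {
    gray[i] = Math.round(((gray[i] - min) / range) * 255);
  }
  return gray;
}

// Three low-contrast gray pixels (values 50, 100, 150), fully opaque.
const lowContrast = new Uint8Array([
  50, 50, 50, 255,
  100, 100, 100, 255,
  150, 150, 150, 255,
]);

console.log(Array.from(toNormalizedGrayscale(lowContrast))); // [ 0, 128, 255 ]
```

Stretching contrast this way makes faint text stand out more against its background, which is one reason preprocessing tends to help OCR accuracy.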
+ ---
+
+ # Mouse Automation
+
+ ## Move Mouse
+
+ ```ts
+ await visor.moveMouse(
+   500,
+   300
+ );
+ ```
+
+ ---
+
+ ## Scroll Down
+
+ ```ts
+ await visor.scrollDown(1000);
+ ```
+
+ ---
+
+ ## Scroll Up
+
+ ```ts
+ await visor.scrollUp(1000);
+ ```
+
+ ---
+
+ ## Mouse Position
+
+ ```ts
+ const pos =
+   await visor.getMousePosition();
+ ```
+
+ ---
+
+ # Keyboard Automation
+
+ ## Type Text
+
+ ```ts
+ await visor.type(
+   "Hello World"
+ );
+ ```
+
+ ---
+
+ ## Press Keys
+
+ ```ts
+ await visor.press(
+   visor.Key.LeftControl,
+   visor.Key.S
+ );
+ ```
+
+ ---
+
+ # Screenshot Automation
+
+ ## Capture Screenshot
+
+ ```ts
+ await visor.captureScreenshot(
+   "./screenshots/home.png"
+ );
+ ```
+
+ ---
+
+ # Desktop Application Automation
+
+ ## Open Application
+
+ ```ts
+ await visor.openApp("notepad");
+ ```
+
+ ---
+
+ ## Close Application
+
+ ```ts
+ await visor.closeApp("notepad.exe");
+ ```
+
+ ---
+
+ # Teams Automation Example
+
+ ```ts
+ import { visor } from "@aspiresys/visor";
+
+ async function teamsDemo() {
+
+   visor.loadConfig({
+     scaleFactor: 1.5,
+     imagePath: "./images",
+     debug: true
+   });
+
+   await visor.openApp(
+     "ms-teams.exe"
+   );
+
+   // Wait for the main window in either theme
+   await visor.waitAny([
+     "teams-light.png",
+     "teams-dark.png"
+   ]);
+
+   // Open the chat view in either theme
+   await visor.clickAny([
+     "chat-light.png",
+     "chat-dark.png"
+   ]);
+
+   await visor.clickText("Search");
+
+   await visor.type("John");
+
+ }
+
+ teamsDemo().catch(console.error);
+ ```
+
+ ---
+
+ # Confidence Thresholds
+
+ Most image matching APIs accept a confidence threshold.
+
+ Accepted range:
+
+ ```txt
+ 0.0 to 1.0
+ ```
+
+ Recommended values:
+
+ | Confidence | Usage |
+ |---|---|
+ | `0.7` | Dynamic UI |
+ | `0.8` | Standard usage |
+ | `0.9+` | Strict matching |
+
+ A lower threshold tolerates more visual variation but increases the risk of false positives.
+
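The trade-off can be illustrated with a few made-up match scores. The sketch below is not Visor's matching code; `MatchCandidate` and `acceptedMatches` are hypothetical, and the scores are invented for the example.

```ts
interface MatchCandidate {
  label: string;
  score: number; // match score in the range 0.0 to 1.0
}

// Keep only candidates whose score meets the threshold, best match first.
function acceptedMatches(
  candidates: readonly MatchCandidate[],
  confidence: number
): MatchCandidate[] {
  if (confidence < 0 || confidence > 1) {
    throw new RangeError(`confidence must be between 0.0 and 1.0, got ${confidence}`);
  }
  return candidates
    .filter((c) => c.score >= confidence)
    .sort((a, b) => b.score - a.score);
}

// Invented scores for a "Save" template matched against a toolbar.
const candidates: MatchCandidate[] = [
  { label: "Save button", score: 0.93 },
  { label: "Save As button", score: 0.84 },
  { label: "unrelated icon", score: 0.71 },
];

for (const confidence of [0.7, 0.8, 0.9]) {
  const labels = acceptedMatches(candidates, confidence).map((c) => c.label);
  console.log(confidence, labels);
}
```

In this example the threshold `0.7` accepts all three candidates, including the unrelated icon, `0.8` accepts the two Save buttons, and `0.9` accepts only the exact target.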
+ ---
+
+ # Performance Notes
+
+ ## OCR
+
+ OCR operations are computationally expensive.
+
+ OCR speed depends on:
+
+ - display resolution
+ - text density
+ - screen complexity
+ - hardware performance
+
+ ---
+
+ ## Electron Applications
+
+ Electron-based applications may need extra time to stabilize after launch before they can be automated reliably.
+
+ ---
+
+ # Debug Logging
+
+ Enable debug logs:
+
+ ```ts
+ visor.setDebug(true);
+ ```
+
+ ---
+
+ # Best Practices
+
+ ## Use Stable Images
+
+ Prefer:
+
+ - high-contrast images
+ - unique UI elements
+ - properly cropped templates
+
+ Avoid:
+
+ - blurry screenshots
+ - partially hidden elements
+ - scaled screenshots
+
+ ---
+
+ ## Use Proper Scaling
+
+ Always configure:
+
+ ```ts
+ visor.setScaleFactor(...);
+ ```
+
+ before automation begins.
+
+ ---
+
+ ## Prefer Wait APIs
+
+ Prefer:
+
+ ```ts
+ await visor.wait(...);
+ ```
+
+ instead of fixed-duration sleeps.
+
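The reason is that a condition-based wait returns as soon as the UI is ready, while a fixed sleep always costs its full duration and can still be too short. How `visor.wait` works internally is not documented here; the sketch below shows generic polling with a timeout. `waitUntil`, `WaitOptions`, and `sleep` are hypothetical helpers, and the default timeout and interval are assumptions.

```ts
interface WaitOptions {
  timeoutMs?: number;  // give up after this long
  intervalMs?: number; // pause between checks
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Poll a condition until it holds, or throw once the timeout has elapsed.
async function waitUntil(
  condition: () => Promise<boolean> | boolean,
  options: WaitOptions = {}
): Promise<void> {
  const { timeoutMs = 10_000, intervalMs = 250 } = options;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    if (await condition()) return;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}

// Demo: the condition becomes true on the third check.
async function demo(): Promise<number> {
  let polls = 0;
  await waitUntil(() => ++polls >= 3, { timeoutMs: 2000, intervalMs: 10 });
  return polls;
}

demo().then((polls) => console.log(`ready after ${polls} polls`)); // ready after 3 polls
```

The same helper could wrap any check that is not covered by a built-in wait, for example `waitUntil(() => visor.exists("login.png"))`, assuming `exists` resolves to a boolean.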
+ ---
+
+ # Troubleshooting
+
+ ## Image Not Found
+
+ Possible causes:
+
+ - incorrect scaling
+ - wrong image path
+ - confidence threshold set too high
+ - theme mismatch
+
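A wrong image path is the quickest cause to rule out. The sketch below checks which template files are missing from an image directory before automation starts. `missingTemplates` is a hypothetical helper, and it assumes that image names are resolved relative to the configured `imagePath`.

```ts
import { existsSync } from "node:fs";
import { resolve } from "node:path";

// Return the template names that cannot be found under the image directory.
function missingTemplates(imageDir: string, names: readonly string[]): string[] {
  return names.filter((name) => !existsSync(resolve(imageDir, name)));
}

const missing = missingTemplates("./images", ["login.png", "save.png"]);
if (missing.length > 0) {
  console.warn(`Missing templates in ./images: ${missing.join(", ")}`);
}
```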
+ ---
+
+ ## OCR Not Detecting Text
+
+ Possible causes:
+
+ - blurry UI
+ - small font size
+ - low-contrast text
+
+ ---
+
+ ## Mouse Clicking Incorrect Position
+
+ Possible causes:
+
+ - incorrect scale factor
+ - Windows DPI scaling mismatch
+
+ ---
+
+ # Roadmap
+
+ Planned features:
+
+ - Linux support
+ - macOS support
+ - Region caching
+ - Parallel image matching
+ - Advanced OCR tuning
+ - Electron recorder
+ - AI-assisted automation
+
+ ---
+
+ # Tech Stack
+
+ Visor uses:
+
+ - OpenCV
+ - Tesseract.js
+ - screenshot-desktop
+ - sharp
+ - nut.js
+
+ ---