rust-rpa 0.2.2-beta.1 → 0.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -174,305 +174,143 @@ async function main() {
 main();
 ```
-## API
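+## API Documentation
+For the complete TypeScript type definitions, see the `index.d.ts` file.
 ### Window class
 Window management class.
-#### Static methods
-**`Window.all(): Window[]`**
-Returns the list of all visible windows.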
-```javascript
-const windows = Window.all();
-```
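-#### Instance methods
-- `id(): number` - Get the window ID
-- `pid(): number` - Get the window's process ID
-- `parentId(): number` - Get the parent process ID of the process that owns the window
-- `title(): string` - Get the window title
-- `appName(): string` - Get the application name
-- `x(): number` - Get the window X coordinate
-- `y(): number` - Get the window Y coordinate
-- `width(): number` - Get the window width
-- `height(): number` - Get the window height
-- `getSize(): WindowSize` - Get the window size (width and height)
-- `isMinimized(): boolean` - Whether the window is minimized
-- `isMaximized(): boolean` - Whether the window is maximized
-- `isFocused(): boolean` - Whether the window is focused
-- `bringToFront(): Promise<void>` - Bring the window to the front (activate and raise it)
-- `maximize(): Promise<void>` - Maximize the window
-- `minimize(): Promise<void>` - Minimize the window
-- `currentMonitor(): Monitor` - Get the monitor the window is on
-- `currentMonitorId(): number` - Get the ID of the monitor the window is on
-- `getBounds(): WindowBounds` - Get the window bounds (position and size)
-- `setBounds(bounds): Promise<void>` - Set the window bounds (position and size)
-- `toJSON(): WindowToJson` - Convert to a JSON-serializable object
-- `captureImage(options?: CaptureImageOptions): Promise<ImageData>` - Capture an image of the window; the optional `options.size` sets a target width and height and scales the image to fit; with `options.from` set to `'screen'`, the full screen of the monitor the window is on is captured; `options.region` restricts the capture to a region in logical pixels; `options.autoSize` defaults to `true` and scales the image to logical pixel dimensions under Windows high-DPI scaling
-- `findIcon(template, options?): Promise<MatchResult | null>` - Find a template icon in a screenshot of the window (captures with autoSize=true, then runs template matching); returns a MatchResult if found, otherwise null
-- `recognizeText(): Promise<TextRecognitionResult[]>` - Recognize text in a screenshot of the window (captures with autoSize=true, then runs OCR)
-- `findText(text, options?): Promise<TextRecognitionResult | null>` - Find the given text in a screenshot of the window; returns its position if found, otherwise null (captures with autoSize=true, then runs OCR); `options.regions` can list search regions to improve performance; `options.notText` excludes matches whose line also contains the given text
-- `waitText(text, options?): Promise<TextRecognitionResult>` - Wait for the given text to appear; throws on timeout; `options.timeout` sets the wait time (default 3000 ms), `options.regions` can restrict the search regions, and `options.notText` excludes matches whose line also contains the given text
-- `waitIcon(template, options?): Promise<MatchResult>` - Wait for the given icon to appear; throws on timeout; `options.timeout` sets the wait time (default 3000 ms), `options.regions` can restrict the search regions, and `options.threshold` sets the match threshold (default 0.8)
-- `clickText(text, options?): Promise<ClickResult>` - Click the given text: waits for the text to appear, then clicks its center; returns the click coordinates `{x, y, relX, relY}`; `options.notText` excludes matches whose line also contains the given text
-- `clickIcon(template, options?): Promise<ClickResult>` - Click the given icon: waits for the icon to appear, then clicks its center; returns the click coordinates `{x, y, relX, relY}`; `options.threshold` sets the match threshold (default 0.8)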
+**Static methods:**
+- `Window.all(): Window[]` - Get all visible windows
+**Instance methods:**
+- `id(): number` - Window ID
+- `pid(): number` - Process ID
+- `parentId(): number` - Parent process ID
+- `title(): string` - Window title
+- `appName(): string` - Application name
+- `x(), y(), width(), height()` - Position and size
+- `getSize(): { width, height }` - Window size
+- `isMinimized(), isMaximized(), isFocused()` - Window state
+- `bringToFront()` - Bring the window to the front
+- `maximize(), minimize()` - Maximize / minimize
+- `currentMonitor(): Monitor` - Get the monitor the window is on
+- `currentMonitorId(): number` - Monitor ID
+- `getBounds(): { x, y, width, height }` - Window bounds
+- `setBounds(bounds)` - Set the window bounds
+- `toJSON()` - Convert to a JSON object
+- `captureImage(options?)` - Capture a screenshot
+- `recognizeText(options?)` - Recognize text via OCR
+- `findText(text, options?)` - Find text
+- `findIcon(template, options?)` - Find an icon
+- `waitText(text, options?)` - Wait for text to appear
+- `waitIcon(template, options?)` - Wait for an icon to appear
+- `clickText(text, options?)` - Click text
+- `clickIcon(template, options?)` - Click an icon
+**Window-specific options:**
+- `from?: 'window' | 'screen'` - Screenshot source (`'window'`, the default, captures the window; `'screen'` captures the full screen)
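 ### Monitor class
 Monitor management class.
-#### Static methods
-**`Monitor.all(): Monitor[]`**
-Returns the list of all monitors.
-```javascript
-const monitors = Monitor.all();
-const primary = monitors.find(m => m.isPrimary());
-```
-#### Instance methods
+**Static methods:**
+- `Monitor.all(): Monitor[]` - Get all monitors
-- `id(): number` - Get the monitor ID
-- `name(): string` - Get the monitor name
-- `x(): number` - Get the monitor X coordinate
-- `y(): number` - Get the monitor Y coordinate
-- `width(): number` - Get the monitor width
-- `height(): number` - Get the monitor height
-- `scaleFactor(): number` - Get the scale factor
+**Instance methods:**
+- `id(): number` - Monitor ID
+- `name(): string` - Monitor name
+- `x(), y(), width(), height()` - Position and size
+- `scaleFactor(): number` - Scale factor
 - `isPrimary(): boolean` - Whether this is the primary monitor
-- `captureImage(options?: CaptureImageOptions): Promise<ImageData>` - Capture an image of the monitor; the optional `options.size` sets a target width and height and scales the image to fit; `options.autoSize` defaults to `true` and scales the image to logical pixel dimensions under Windows high-DPI scaling
-- `findIcon(template, options?): Promise<MatchResult | null>` - Find a template icon in a screenshot of the monitor (captures with autoSize=true, then runs template matching); returns a MatchResult if found, otherwise null
-- `recognizeText(): Promise<TextRecognitionResult[]>` - Recognize text in a screenshot of the monitor (captures with autoSize=true, then runs OCR)
-- `findText(text, options?): Promise<TextRecognitionResult | null>` - Find the given text in a screenshot of the monitor; returns its position if found, otherwise null (captures with autoSize=true, then runs OCR); `options.regions` can list search regions to improve performance; `options.notText` excludes matches whose line also contains the given text
-- `waitText(text, options?): Promise<TextRecognitionResult>` - Wait for the given text to appear; throws on timeout; `options.timeout` sets the wait time (default 3000 ms), `options.regions` can restrict the search regions, and `options.notText` excludes matches whose line also contains the given text
-- `waitIcon(template, options?): Promise<MatchResult>` - Wait for the given icon to appear; throws on timeout; `options.timeout` sets the wait time (default 3000 ms), `options.regions` can restrict the search regions, and `options.threshold` sets the match threshold (default 0.8)
-- `clickText(text, options?): Promise<ClickResult>` - Click the given text: waits for the text to appear, then clicks its center; returns the click coordinates `{x, y, relX, relY}`; `options.notText` excludes matches whose line also contains the given text
-- `clickIcon(template, options?): Promise<ClickResult>` - Click the given icon: waits for the icon to appear, then clicks its center; returns the click coordinates `{x, y, relX, relY}`; `options.threshold` sets the match threshold (default 0.8)
+- `captureImage(options?)` - Capture a screenshot
+- `recognizeText(options?)` - Recognize text via OCR
+- `findText(text, options?)` - Find text
+- `findIcon(template, options?)` - Find an icon
+- `waitText(text, options?)` - Wait for text to appear
+- `waitIcon(template, options?)` - Wait for an icon to appear
+- `clickText(text, options?)` - Click text
+- `clickIcon(template, options?)` - Click an icon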
 ### ImageData class
-Image data class, used for screenshots and image processing.
-#### Static factory methods
-- `ImageData.fromFile(filePath): Promise<ImageData>` - Load an image from a file
-- `ImageData.fromBuffer(buffer): Promise<ImageData>` - Decode an image from a Buffer
-#### Properties
-- `width: number` - Image width (read-only)
-- `height: number` - Image height (read-only)
-#### Methods
-**Format conversion and saving**
+Image processing class.
-- `toPng(): Promise<Buffer>` - Convert to PNG format
-- `toJpeg(): Promise<Buffer>` - Convert to JPEG format
-- `toBmp(): Promise<Buffer>` - Convert to BMP format
-- `toFile(filePath): Promise<void>` - Save to a file (format detected from the extension: .png/.jpg/.jpeg/.bmp)
+**Static methods:**
+- `ImageData.fromFile(filePath)` - Load from a file
+- `ImageData.fromBuffer(buffer)` - Decode from a Buffer
-**Image processing**
-- `crop(x, y, width, height): Promise<ImageData>` - Crop the image
-- `resize(width, height): Promise<ImageData>` - Resize the image
-- `grayscale(): Promise<ImageData>` - Convert to grayscale
-- `findIcon(template, options?): Promise<MatchResult | null>` - Find a template icon; returns a MatchResult if found, otherwise null
-- `recognizeText(options?): Promise<TextRecognitionResult[]>` - Recognize text in the image; `options.regions` can list the regions to recognize
-- `findText(text, options?): Promise<TextRecognitionResult | null>` - Find the given text; `options.regions` can list search regions; `options.notText` excludes matches whose line also contains the given text
-**Data access**
+**Properties:**
+- `width: number` - Image width
+- `height: number` - Image height
+**Methods:**
+- `toPng(), toJpeg(), toBmp()` - Format conversion
+- `toFile(filePath)` - Save to a file
+- `crop(x, y, width, height)` - Crop
+- `resize(width, height)` - Resize
+- `grayscale()` - Convert to grayscale
 - `getRawData(): Buffer` - Get the raw pixel data
-- `metadata(): ImageMetadata` - Get image metadata
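-**findIcon parameters:**
-- `template: ImageData` - Template image (the icon to look for)
-- `options?: MatchOptions` - Optional match options object
-  - `threshold?: number` - Match threshold (0.0-1.0), default 0.8
-  - `regions?: Array<{x, y, width, height}>` - Optional list of search regions, empty by default (search the whole image)
-    - Regions may extend past the image bounds; they are clipped to their intersection with the image before searching
-**MatchOptions notes:**
-1. **threshold (match threshold)**
-   - Range: 0.0 - 1.0
-   - Default: 0.8
-   - Note: the similarity score must be greater than this threshold to count as a match
-   - Tip: 0.8 is a good starting point; adjust it based on actual results
-2. **regions (search regions)**
-   - Format: `[{x: number, y: number, width: number, height: number}]`
-   - Default: empty array (search the whole image)
-   - Note: restricts the search to the given regions, which can improve speed and accuracy
-   - Example: `[{x: 0, y: 0, width: 100, height: 100}]`
-**Usage tips:**
-- The default parameters work in most cases: `findIcon(template)`
-- Set a threshold when you need to adjust sensitivity: `findIcon(template, { threshold: 0.9 })`
-- Searching within specific regions can improve performance: `findIcon(template, { threshold: 0.8, regions: [...] })`
-- **Handling the return value**: returns `null` when nothing is found, so check the result before using it
-  ```javascript
-  const result = await image.findIcon(template);
-  if (result) {
-    console.log(`Icon found: (${result.x}, ${result.y})`);
-  } else {
-    console.log('Icon not found');
-  }
-  ```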
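-**MatchResult type:**
-```typescript
-interface MatchResult {
-  found: boolean;      // whether a match was found
-  score: number;       // similarity score (0.0-1.0)
-  x: number;           // x of the match position (logical screen coordinates; can be passed directly to Mouse.moveTo)
-  y: number;           // y of the match position (logical screen coordinates; can be passed directly to Mouse.moveTo)
-  width: number;       // template width
-  height: number;      // template height
-  relX: number;        // x relative to the top-left corner of the current window/screen
-  relY: number;        // y relative to the top-left corner of the current window/screen
-}
-```
-**ImageMetadata type:**
-```typescript
-interface ImageMetadata {
-  width: number;           // image width
-  height: number;          // image height
-  channels: number;        // number of channels (always 4, RGBA)
-  bytesPerPixel: number;   // bytes per pixel (always 4)
-  dataSize: number;        // total data size in bytes
-}
-```
-**captureImage options (Monitor / Window):**
-```typescript
-// Capture region (logical pixels); only applies to Window.captureImage
-interface CaptureRegion {
-  x: number;      // x of the region's top-left corner (relative to the top-left of the capture canvas)
-  y: number;      // y of the region's top-left corner
-  width: number;  // region width
-  height: number; // region height
-}
-interface CaptureImageOptions {
-  size?: { width: number; height: number };  // optional; scale to this width and height after capturing
-  // The following only apply to Window.captureImage:
-  from?: 'window' | 'screen';  // 'window' (default) captures the window; 'screen' captures the full screen of the monitor the window is on (combine with getBounds and image.crop to cut out the window)
-  region?: CaptureRegion | null;  // capture only this region in logical pixels; omit to capture the full image
-  autoSize?: boolean | null;     // defaults to true; when Windows DPI scaling is active, scales the image to logical pixel width and height (consistent with getSize/getBounds)
-}
-```
+- `metadata()` - Get image metadata
+- `recognizeText(options?)` - Recognize text via OCR
+- `findText(text, options?)` - Find text
+- `findIcon(template, options?)` - Template matching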
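 ### Mouse class
-Mouse control class; all methods are static.
+Mouse control class (static methods).
-#### Static methods
-- `Mouse.moveTo(x, y, duration?): Promise<void>` - Move the mouse to the given coordinates. **duration** is optional: the duration of the move animation in seconds, default 0 (instant move). If set > 0, the mouse moves to the target with a smooth animation, and mouseMove events fire during the animation
-- `Mouse.moveRel(dx, dy, duration?): Promise<void>` - Move the mouse relative to its current position. **dx** is the X offset (positive moves right, negative moves left); **dy** is the Y offset (positive moves down, negative moves up); **duration** is optional: the duration of the move animation in seconds, default 0
-- `Mouse.click(button?): Promise<void>` - Click the mouse. **button** is optional; use the enum `MouseButton.Left` / `MouseButton.Right` or a string such as `'left'` / `'right'`; defaults to the left button
-- `Mouse.doubleClick(button?): Promise<void>` - Double-click the mouse; button as above
-- `Mouse.down(button?): Promise<void>` - Press a mouse button; button as above
-- `Mouse.up(button?): Promise<void>` - Release a mouse button; button as above
-- **MouseButton** enum: `MouseButton.Left` | `Right` | `Middle` | `Back` | `Forward`, used for the button parameter above
-- `Mouse.scroll(deltaX?, deltaY?): Promise<void>` - Scroll the mouse wheel. **Direction**: positive `deltaY` scrolls up, negative scrolls down; positive `deltaX` scrolls right, negative scrolls left. For example, `scroll(0, 3)` scrolls up and `scroll(0, -3)` scrolls down
-- `Mouse.position(): Promise<{ x: number, y: number }>` - Get the current mouse position
+- `Mouse.moveTo(x, y, duration?)` - Move to coordinates (duration is the animation time in seconds)
+- `Mouse.moveRel(dx, dy, duration?)` - Relative move
+- `Mouse.click(button?)` - Click (`'left'`, `'right'`, `'middle'`)
+- `Mouse.doubleClick(button?)` - Double-click
+- `Mouse.down(button?), Mouse.up(button?)` - Press / release
+- `Mouse.scroll(deltaX?, deltaY?)` - Scroll (positive Y scrolls up, negative scrolls down)
+- `Mouse.position()` - Get the current position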
 ### Keyboard class
-Keyboard control class; all methods are static.
-#### Static methods
+Keyboard control class (static methods).
-- `Keyboard.typeText(text): Promise<void>` - Type text
-- `Keyboard.click(key): Promise<void>` - Press and release a key
-- `Keyboard.down(key): Promise<void>` - Press a key
-- `Keyboard.up(key): Promise<void>` - Release a key
-- `Keyboard.sequence(keys): Promise<void>` - Key sequence (e.g. `[Key.Ctrl, Key.C]`)
+- `Keyboard.typeText(text)` - Type text
+- `Keyboard.click(key)` - Press and release a key
+- `Keyboard.down(key), Keyboard.up(key)` - Press / release
+- `Keyboard.sequence(keys)` - Key sequence (e.g. `[Key.Ctrl, Key.C]`)
-**key parameter**: accepts the **Key** enum (e.g. `Key.Enter`, `Key.Ctrl`, `Key.F1`) or a string (e.g. `'enter'`, `'ctrl'`). **Key** provides letters A–Z, digits Num0–Num9, function keys F1–F12, modifier keys Control/Ctrl/Shift/Alt/Meta/Command/Win, control keys Enter/Escape/Tab/Space/Backspace/Delete/Insert, arrow keys Left/Right/Up/Down, Home/End/PageUp/PageDown, numpad keys Numpad0–Numpad9, NumpadAdd/NumpadEnter, and so on.
+**Key enum:** `Key.Enter`, `Key.Ctrl`, `Key.F1`, `Key.A`, and so on
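 ### Clipboard class
-Clipboard control class; all methods are static.
+Clipboard control class (static methods).
-#### Static methods
-- `Clipboard.readText(): Promise<string>` - Read the text content of the clipboard
-- `Clipboard.writeText(text): Promise<void>` - Write text to the clipboard
-- `Clipboard.writeImage(source): Promise<void>` - Write an image to the clipboard; `source` may be a file path (string), a base64 string, a data URI, or a Buffer
-- `Clipboard.writeFile(paths): Promise<void>` - Write file paths to the clipboard; pasting in Explorer/Finder yields the files (`paths` may be a single path string or an array of paths)
-- `Clipboard.paste(): Promise<void>` - Perform a paste (using the Cmd/Ctrl+V shortcut)
-- `Clipboard.pasteText(text): Promise<void>` - Write text to the clipboard and paste it, then automatically restore the original clipboard contents
-- `Clipboard.pasteImage(source): Promise<void>` - Write an image to the clipboard and paste it, then automatically restore the original clipboard contents (`source` as in `writeImage`)
-- `Clipboard.pasteFile(paths): Promise<void>` - Write file paths to the clipboard and paste them; the original clipboard contents are not restored
+- `Clipboard.readText()` - Read clipboard text
+- `Clipboard.writeText(text)` - Write text
+- `Clipboard.writeImage(source)` - Write an image (accepts a file path, base64, or a Buffer)
+- `Clipboard.writeFile(paths)` - Write file paths
+- `Clipboard.paste()` - Perform a paste
+- `Clipboard.pasteText(text)` - Write and paste (clipboard restored automatically)
+- `Clipboard.pasteImage(source)` - Write an image and paste it (restored automatically)
+- `Clipboard.pasteFile(paths)` - Write file paths and paste them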
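 ### Permission class
-Permission checking utility class; all methods are static.
+Permission checking class (static methods).
-#### Static methods
+- `Permission.checkAccessibility(prompt?)` - Check the accessibility permission
+- `Permission.checkScreenCapture(prompt?)` - Check the screen recording permission
-- `Permission.checkAccessibility(prompt?): boolean` - Check the accessibility permission (required for mouse, keyboard, and window operations)
-- `Permission.checkScreenCapture(prompt?): boolean` - Check the screen recording permission (required for screenshots)
+**Note:** `prompt` defaults to `true`; when permission is missing, the system authorization dialog is shown automatically (macOS only)
 ### pause function
-Helper function that pauses/waits for a given time.
-#### Function signature
-- `pause(ms: number): Promise<void>` - Wait for the given number of milliseconds
+- `pause(ms)` - Wait for the given number of milliseconds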
 ```javascript
 const { pause } = require('rust-rpa');
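-// Wait 1 second
-await pause(1000);
-// Add a delay between operations
-await Mouse.click('left');
-await pause(500);  // wait 500 ms
-await Keyboard.typeText('Hello');
-```
-**Parameters:**
-- `ms`: wait time in milliseconds
----
-**Permission class parameters:**
-- `prompt` (optional, default `true`): whether to show the system authorization dialog automatically when permission is missing (takes effect on macOS only)
-**Platform differences:**
-| Platform | checkAccessibility | checkScreenCapture |
-|------|--------------------|--------------------|
-| macOS | Calls `AXIsProcessTrustedWithOptions`; with prompt=true, shows the accessibility authorization dialog | Calls `CGRequestScreenCaptureAccess`; with prompt=true, shows the screen recording authorization dialog |
-| Windows | Checks whether the process is running as administrator (administrator implies permission) | Checks whether the process is running as administrator |
-```javascript
-const { Permission } = require('rust-rpa');
-// Check and prompt for authorization (recommended at application startup)
-const hasAccessibility = Permission.checkAccessibility();
-const hasScreenCapture = Permission.checkScreenCapture();
-console.log('Accessibility permission:', hasAccessibility);
-console.log('Screen recording permission:', hasScreenCapture);
-// Check only, without showing a dialog
-const check = Permission.checkAccessibility(false);
+await pause(1000);  // wait 1 second
 ```
 ## macOS permission requirements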
@@ -716,48 +554,20 @@ if (targetWindow) {
   // clickText also supports the notText parameter
   const clickResult = await targetWindow.clickText('发送', { notText: '草稿', timeout: 5000 });
   console.log(`Click position: (${clickResult.x}, ${clickResult.y})`);
-}
-```
-**TextRecognitionResult type:**
-```typescript
-interface TextRecognitionResult {
-  text: string;         // recognized text
-  x: number;            // x of the text block's top-left corner (logical screen coordinates; can be passed directly to Mouse.moveTo)
-  y: number;            // y of the text block's top-left corner (logical screen coordinates; can be passed directly to Mouse.moveTo)
-  width: number;        // text block width (pixels)
-  height: number;       // text block height (pixels)
-  confidence: number;   // confidence (0.0-1.0)
-  relX: number;         // x relative to the top-left corner of the current window/screen
-  relY: number;         // y relative to the top-left corner of the current window/screen
-}
-```
-**ClickResult type:**
-```typescript
-interface ClickResult {
-  x: number;            // x of the click (screen coordinates)
-  y: number;            // y of the click (screen coordinates)
-  relX: number;         // x relative to the top-left corner of the current window/screen
-  relY: number;         // y relative to the top-left corner of the current window/screen
-}
-```
-**FindTextOptions type:**
-```typescript
-interface FindTextOptions {
-  regions?: Array<{ x: number, y: number, width: number, height: number }>;  // list of search regions
-  notText?: string;     // text to exclude; a match is dropped if its line also contains this text
-}
-```
-**WaitOptions type:**
-```typescript
-interface WaitOptions {
-  regions?: Array<{ x: number, y: number, width: number, height: number }>;  // list of search regions
-  timeout?: number;     // wait timeout in milliseconds, default 3000
-  threshold?: number;   // match threshold (waitIcon/clickIcon only), default 0.8
-  notText?: string;     // text to exclude (waitText/clickText only)
+  // Use the from parameter to choose the screenshot source ('window', the default, captures the window; 'screen' captures the full screen of the monitor the window is on)
+  // Use case: when the window is covered by other windows, capturing from the screen gives more complete content
+  const screenResult = await targetWindow.findText('提示', { from: 'screen' });
+  if (screenResult) {
+    console.log(`Found "提示" in the screen capture: (${screenResult.x}, ${screenResult.y})`);
+  }
+  // recognizeText and waitText also support the from parameter
+  const screenTexts = await targetWindow.recognizeText({ from: 'screen' });
+  console.log(`Recognized ${screenTexts.length} text items in the screen capture`);
+  const waitResult = await targetWindow.waitText('加载完成', { from: 'screen', timeout: 10000 });
+  console.log(`Text appeared: (${waitResult.x}, ${waitResult.y})`);
 }
 ```
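
The examples in this hunk read `x`/`y` from the returned results. According to the type comments that this diff removes from the README, `x`/`y` are logical screen coordinates, `relX`/`relY` are relative to the top-left corner of the window or screen, and `clickText` clicks the center of the matched text. A minimal sketch of that arithmetic, assuming hypothetical local types `Box`, `Origin`, and `Point` and a hypothetical helper `centerOf` (none of these are part of rust-rpa, and the rounding is an assumption):

```typescript
// Hypothetical local types; they only mirror the field names described in the diff.
interface Box { x: number; y: number; width: number; height: number }
interface Origin { x: number; y: number }
interface Point { x: number; y: number; relX: number; relY: number }

// Compute the center of a matched box in screen coordinates, plus the same
// point relative to the capture origin (window or monitor top-left).
// Rounding to whole pixels is an assumption, not documented library behavior.
function centerOf(box: Box, origin: Origin): Point {
  const x = Math.round(box.x + box.width / 2);
  const y = Math.round(box.y + box.height / 2);
  return { x, y, relX: x - origin.x, relY: y - origin.y };
}

// A text block at (300, 220), 80x20, inside a window whose top-left is (100, 100).
const p = centerOf({ x: 300, y: 220, width: 80, height: 20 }, { x: 100, y: 100 });
console.log(p.x, p.y, p.relX, p.relY);
```

With `from: 'screen'`, the origin would presumably be the monitor's top-left corner rather than the window's; the diff does not state this explicitly.
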
@@ -771,51 +581,42 @@ interface WaitOptions {
 | macOS | ARM64 (Apple Silicon) | ✅ Supported | Requires accessibility permission |
 | Linux | All | ❌ Not supported | Not planned for the current version |
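-## Performance
-- Window enumeration typically completes in < 100 ms
-- Asynchronous operations do not block the Node.js event loop
-- Minimal memory overhead
 ## Development
 For building from source and development guidelines, see [dev.md](./dev.md).
-## Roadmap
-- [x] Window enumeration and information retrieval (based on XCap)
-- [x] Mouse and keyboard automation (Mouse/Keyboard classes)
-- [x] Screenshots (Monitor/Window capture)
-- [x] Multi-monitor support
-- [x] Image format conversion (PNG/JPEG/BMP)
-- [x] Image cropping
-- [x] Image resizing (high-quality Lanczos3 algorithm)
-- [x] Image recognition (template matching)
-- [x] Image grayscale conversion
-- [x] Image data access (raw pixels, metadata)
-- [x] Saving images to a file
-- [x] Clipboard operations
-- [x] Window operations (bring to front, move, resize, parent process lookup)
-- [x] Window maximize/minimize
-- [x] OCR text recognition (based on PP-OCRv5_mobile)
-- [x] Text position lookup (findText)
-- [x] Waiting for text and icons to appear (waitText/waitIcon)
-- [x] Clicking text and icons (clickText/clickIcon)
-- [ ] Process management
 ## Changelog
+### 0.2.3
+#### Fixes
+- **OCR `recognizeText` multi-region fix**: fixed incorrect coordinate conversion in `recognizeText` when multiple `regions` are specified
 ### 0.2.2
-- **findText|waitText|clickText** now match text fuzzily (for example, o and 0), so that inconsistent OCR output no longer prevents finding or clicking
 #### New features
+- **New `from` parameter for Window text and icon operations**: the `findText`, `waitText`, `clickText`, `recognizeText`, `findIcon`, `waitIcon`, and `clickIcon` methods gain a `from` option that selects the screenshot source
+  - `from: 'window'` (default) - capture the current window
+  - `from: 'screen'` - capture the full screen of the monitor the window is on
+  - Use case: when the window is covered by other windows, capturing from the screen gives more complete content
+  - Examples: `window.findText('提示', { from: 'screen' })`, `window.waitIcon(template, { from: 'screen' })`
+#### Type definition updates
+- **Separate Options types**: the `Window` and `Monitor` classes each get their own Options types, which makes the API clearer
+  - `WindowMatchOptions` / `MonitorMatchOptions` / `MatchOptionsJs` - used for `findIcon`
+  - `WindowFindTextOptions` / `MonitorFindTextOptions` / `FindTextOptions` - used for `findText`
+  - `WindowRecognizeTextOptions` / `MonitorRecognizeTextOptions` / `RecognizeTextOptions` - used for `recognizeText`
+  - `WindowWaitOptions` / `MonitorWaitOptions` / `WaitOptions` - used for `waitText` / `waitIcon` / `clickText` / `clickIcon`
+  - Only the Window-specific types include the `from` parameter; the Monitor types do not (a Monitor is already the screen)
+#### Other improvements
+- **findText|waitText|clickText** now match text fuzzily (for example, o and 0), so that inconsistent OCR output no longer prevents finding or clicking
 - **Window.maximize()**: maximize the window
-  - Windows: uses the `ShowWindow(hwnd, SW_MAXIMIZE)` Win32 API
-  - macOS: uses AppleScript to click the "缩放" or "Zoom" item in the Window menu
 - **Window.minimize()**: minimize the window
-  - Windows: uses the `ShowWindow(hwnd, SW_MINIMIZE)` Win32 API
-  - macOS: uses AppleScript to click the "最小化" or "Minimize" item in the Window menu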
 ### 0.2.1
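
The 0.2.3 changelog entry mentions a coordinate conversion fix for `recognizeText` with multiple `regions`, and the removed README text notes that regions extending past the image are clipped to their intersection with it. The package's native code is not part of this diff, so the following is only a sketch of how such a conversion can work, assuming each region is cropped and processed separately and results are shifted back by the region's offset. `Region`, `clipRegion`, and `toImageCoords` are hypothetical names:

```typescript
// Hypothetical region type, mirroring the {x, y, width, height} shape used in the README.
interface Region { x: number; y: number; width: number; height: number }

// Intersect a region with the image bounds; returns null when they do not overlap.
function clipRegion(r: Region, imgW: number, imgH: number): Region | null {
  const x0 = Math.max(0, r.x);
  const y0 = Math.max(0, r.y);
  const x1 = Math.min(imgW, r.x + r.width);
  const y1 = Math.min(imgH, r.y + r.height);
  if (x1 <= x0 || y1 <= y0) return null;
  return { x: x0, y: y0, width: x1 - x0, height: y1 - y0 };
}

// Map a box found inside a cropped region back to full-image coordinates
// by adding that region's own offset (not the offset of any other region).
function toImageCoords(local: Region, region: Region): Region {
  return { ...local, x: local.x + region.x, y: local.y + region.y };
}

const regions: Region[] = [
  { x: -10, y: 0, width: 110, height: 50 },
  { x: 400, y: 300, width: 300, height: 300 },
];
const clipped = regions.map((r) => clipRegion(r, 640, 480));
// A box at (5, 8) inside the second clipped region maps to (405, 308).
const mapped = toImageCoords({ x: 5, y: 8, width: 30, height: 12 }, clipped[1]!);
console.log(clipped, mapped);
```

The point of the sketch is that each result must be offset by the region it came from; applying one region's offset to every result is the kind of mistake a multi-region bug could stem from, though the changelog does not say what the actual cause was.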

package/index.d.ts CHANGED

@@ -27,26 +27,110 @@ export interface ImageMetadata {
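  /** Total data size in bytes */
  dataSize: number
 }
-/** Match options */
+/** Match options (used by ImageData) */
 export interface MatchOptionsJs {
  /** Match threshold (0.0-1.0); scores below this are treated as not found; default 0.8 */
  threshold?: number
  /** List of match regions; if empty, the whole image is matched */
  regions?: Array<MatchRegion>
 }
-/** Text search options */
+/** Match options (Window only) */
+export interface WindowMatchOptions {
+  /** Match threshold (0.0-1.0); scores below this are treated as not found; default 0.8 */
+  threshold?: number
+  /** List of match regions; if empty, the whole image is matched */
+  regions?: Array<MatchRegion>
+  /**
+   * Image source:
+   * - `'window'` (default): capture the current window
+   * - `'screen'`: capture the full screen of the monitor the window is on
+   */
+  from?: 'window' | 'screen'
+}
+/** Match options (Monitor only) */
+export interface MonitorMatchOptions {
+  /** Match threshold (0.0-1.0); scores below this are treated as not found; default 0.8 */
+  threshold?: number
+  /** List of match regions; if empty, the whole image is matched */
+  regions?: Array<MatchRegion>
+}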
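+/** Text search options (used by ImageData) */
 export interface FindTextOptions {
  /** List of search regions; if empty, the whole image is searched */
  regions?: Array<MatchRegion>
  /** Text to exclude; a match is dropped if its line also contains this text */
  notText?: string
 }
-/** Text recognition options */
+/** Text search options (Window only) */
+export interface WindowFindTextOptions {
+  /** List of search regions; if empty, the whole image is searched */
+  regions?: Array<MatchRegion>
+  /** Text to exclude; a match is dropped if its line also contains this text */
+  notText?: string
+  /**
+   * Image source:
+   * - `'window'` (default): capture the current window
+   * - `'screen'`: capture the full screen of the monitor the window is on
+   */
+  from?: 'window' | 'screen'
+}
+/** Text search options (Monitor only) */
+export interface MonitorFindTextOptions {
+  /** List of search regions; if empty, the whole image is searched */
+  regions?: Array<MatchRegion>
+  /** Text to exclude; a match is dropped if its line also contains this text */
+  notText?: string
+}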
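+/** Text recognition options (used by ImageData) */
 export interface RecognizeTextOptions {
  /** List of recognition regions; if empty, the whole image is recognized */
  regions?: Array<MatchRegion>
 }
-/** Wait options */
+/** Text recognition options (Window only) */
+export interface WindowRecognizeTextOptions {
+  /** List of recognition regions; if empty, the whole image is recognized */
+  regions?: Array<MatchRegion>
+  /**
+   * Image source:
+   * - `'window'` (default): capture the current window
+   * - `'screen'`: capture the full screen of the monitor the window is on
+   */
+  from?: 'window' | 'screen'
+}
+/** Text recognition options (Monitor only) */
+export interface MonitorRecognizeTextOptions {
+  /** List of recognition regions; if empty, the whole image is recognized */
+  regions?: Array<MatchRegion>
+}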
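+/** Wait options (Window only) */
+export interface WindowWaitOptions {
+  /** List of search regions; if empty, the whole image is searched */
+  regions?: Array<MatchRegion>
+  /** Wait timeout in milliseconds, default 3000 */
+  timeout?: number
+  /** Match threshold (0.0-1.0); scores below this are treated as not found; default 0.8 (waitIcon/clickIcon only) */
+  threshold?: number
+  /** Text to exclude; a match is dropped if its line also contains this text (waitText/clickText only) */
+  notText?: string
+  /**
+   * Image source:
+   * - `'window'` (default): capture the current window
+   * - `'screen'`: capture the full screen of the monitor the window is on
+   */
+  from?: 'window' | 'screen'
+}
+/** Wait options (Monitor only) */
+export interface MonitorWaitOptions {
+  /** List of search regions; if empty, the whole image is searched */
+  regions?: Array<MatchRegion>
+  /** Wait timeout in milliseconds, default 3000 */
+  timeout?: number
+  /** Match threshold (0.0-1.0); scores below this are treated as not found; default 0.8 (waitIcon/clickIcon only) */
+  threshold?: number
+  /** Text to exclude; a match is dropped if its line also contains this text (waitText/clickText only) */
+  notText?: string
+}
+/** Wait options (kept for backward compatibility; same as WindowWaitOptions) */
 export interface WaitOptions {
  /** List of search regions; if empty, the whole image is searched */
  regions?: Array<MatchRegion>
@@ -56,6 +140,12 @@ export interface WaitOptions {
  threshold?: number
  /** Text to exclude; a match is dropped if its line also contains this text (waitText/clickText only) */
  notText?: string
+  /**
+   * Image source:
+   * - `'window'` (default): capture the current window
+   * - `'screen'`: capture the full screen of the monitor the window is on
+   */
+  from?: 'window' | 'screen'
 }
 /** Icon match result */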
 export interface MatchResult {
@@ -570,7 +660,7 @@ export declare class Monitor {
    * }
    * ```
    */
-  findIcon(template: ImageData, options?: MatchOptionsJs | undefined | null): Promise<MatchResult | null>
+  findIcon(template: ImageData, options?: MonitorMatchOptions | undefined | null): Promise<MatchResult | null>
   /**
   * Recognize text in a screenshot of the monitor (async)
    *
@@ -587,7 +677,7 @@ export declare class Monitor {
    * }
    * ```
    */
-  recognizeText(): Promise<Array<TextRecognitionResult>>
+  recognizeText(options?: MonitorRecognizeTextOptions | null): Promise<Array<TextRecognitionResult>>
   /**
   * Find the given text in a screenshot of the monitor (async)
    *
@@ -614,7 +704,7 @@ export declare class Monitor {
    * });
    * ```
    */
-  findText(text: string, options?: FindTextOptions | null): Promise<TextRecognitionResult | null>
+  findText(text: string, options?: MonitorFindTextOptions | null): Promise<TextRecognitionResult | null>
   /**
   * Wait for the given text to appear (async)
    *
@@ -650,7 +740,7 @@ export declare class Monitor {
    * });
    * ```
    */
-  waitText(text: string, options?: WaitOptions | null): Promise<TextRecognitionResult>
+  waitText(text: string, options?: MonitorWaitOptions | null): Promise<TextRecognitionResult>
   /**
   * Wait for the given icon to appear (async)
    *
@@ -687,7 +777,7 @@ export declare class Monitor {
    * });
    * ```
    */
-  waitIcon(template: ImageData, options?: WaitOptions | null): Promise<MatchResult>
+  waitIcon(template: ImageData, options?: MonitorWaitOptions | null): Promise<MatchResult>
   /**
   * Click the given text (async)
    *
@@ -714,7 +804,7 @@ export declare class Monitor {
    * });
    * ```
    */
-  clickText(text: string, options?: WaitOptions | null): Promise<ClickResult>
+  clickText(text: string, options?: MonitorWaitOptions | null): Promise<ClickResult>
   /**
   * Click the given icon (async)
    *
@@ -742,7 +832,7 @@ export declare class Monitor {
    * });
    * ```
    */
-  clickIcon(template: ImageData, options?: WaitOptions | null): Promise<ClickResult>
+  clickIcon(template: ImageData, options?: MonitorWaitOptions | null): Promise<ClickResult>
 }
 /**
  * Window class - wraps XCap's Window object
@@ -969,7 +1059,7 @@ export declare class Window {
    * }
    * ```
    */
-  findIcon(template: ImageData, options?: MatchOptionsJs | undefined | null): Promise<MatchResult | null>
+  findIcon(template: ImageData, options?: WindowMatchOptions | undefined | null): Promise<MatchResult | null>
   /**
   * Recognize text in the window (async)
    *
@@ -992,7 +1082,7 @@ export declare class Window {
    * }
    * ```
    */
-  recognizeText(): Promise<Array<TextRecognitionResult>>
+  recognizeText(options?: WindowRecognizeTextOptions | null): Promise<Array<TextRecognitionResult>>
   /**
   * Find the given text in the window (async)
    *
@@ -1025,7 +1115,7 @@ export declare class Window {
    * });
    * ```
    */
-  findText(text: string, options?: FindTextOptions | null): Promise<TextRecognitionResult | null>
+  findText(text: string, options?: WindowFindTextOptions | null): Promise<TextRecognitionResult | null>
   /**
   * Wait for the given text to appear (async)
    *
@@ -1062,7 +1152,7 @@ export declare class Window {
    * });
    * ```
    */
-  waitText(text: string, options?: WaitOptions | null): Promise<TextRecognitionResult>
+  waitText(text: string, options?: WindowWaitOptions | null): Promise<TextRecognitionResult>
   /**
   * Wait for the given icon to appear (async)
    *
@@ -1100,7 +1190,7 @@ export declare class Window {
    * });
    * ```
    */
-  waitIcon(template: ImageData, options?: WaitOptions | null): Promise<MatchResult>
+  waitIcon(template: ImageData, options?: WindowWaitOptions | null): Promise<MatchResult>
   /**
   * Click the given text (async)
    *
@@ -1128,7 +1218,7 @@ export declare class Window {
    * });
    * ```
    */
-  clickText(text: string, options?: WaitOptions | null): Promise<ClickResult>
+  clickText(text: string, options?: WindowWaitOptions | null): Promise<ClickResult>
   /**
   * Click the given icon (async)
    *
@@ -1157,7 +1247,7 @@ export declare class Window {
    * });
    * ```
    */
-  clickIcon(template: ImageData, options?: WaitOptions | null): Promise<ClickResult>
+  clickIcon(template: ImageData, options?: WindowWaitOptions | null): Promise<ClickResult>
 }
 /** Clipboard class - clipboard control */
 export declare class Clipboard {
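
The hunks above give `Window` and `Monitor` separate option types, and only the Window variants carry `from`. Code that reuses one options object for both kinds of call may want to drop `from` before handing it to a Monitor method. A minimal sketch, assuming local mirrors of the interfaces (the real declarations repeat the fields rather than using `extends`, and the `MatchRegion` shape is assumed from the README's `{x, y, width, height}`); `toMonitorOptions` is a hypothetical helper, not part of the package:

```typescript
// Local mirrors of the declarations shown in the diff; MatchRegion's shape is an assumption.
interface MatchRegion { x: number; y: number; width: number; height: number }
interface MonitorFindTextOptions { regions?: MatchRegion[]; notText?: string }
interface WindowFindTextOptions extends MonitorFindTextOptions { from?: 'window' | 'screen' }

// Hypothetical helper: strip the Window-only `from` field so the remaining
// options match the Monitor-specific type exactly.
function toMonitorOptions(opts: WindowFindTextOptions): MonitorFindTextOptions {
  const { from: _from, ...rest } = opts;
  return rest;
}

const windowOpts: WindowFindTextOptions = { notText: 'Draft', from: 'screen' };
const monitorOpts = toMonitorOptions(windowOpts);
console.log(monitorOpts);
```

Whether the native binding ignores or rejects an unexpected `from` on a Monitor call is not stated in the diff, which is why the sketch removes the field rather than relying on either behavior.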

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "rust-rpa",
-  "version": "0.2.2-beta.1",
+  "version": "0.2.3",
   "description": "Rust-based RPA automation library for Node.js",
   "type": "commonjs",
   "main": "index.js",
@@ -55,8 +55,6 @@
   },
   "devDependencies": {
     "@napi-rs/cli": "^3.5.1",
-    "node-screenshots": "^0.2.8",
-    "sharp": "^0.34.5",
     "vitest": "^4.0.18"
   },
   "files": [
@@ -69,9 +67,9 @@
     "commander": "^14.0.3"
   },
   "optionalDependencies": {
-    "@alibot/rust-rpa-win32-x64-msvc": "0.2.2-beta.1",
-    "@alibot/rust-rpa-win32-ia32-msvc": "0.2.2-beta.1",
-    "@alibot/rust-rpa-darwin-x64": "0.2.2-beta.1",
-    "@alibot/rust-rpa-darwin-arm64": "0.2.2-beta.1"
+    "@alibot/rust-rpa-win32-x64-msvc": "0.2.3",
+    "@alibot/rust-rpa-win32-ia32-msvc": "0.2.3",
+    "@alibot/rust-rpa-darwin-x64": "0.2.3",
+    "@alibot/rust-rpa-darwin-arm64": "0.2.3"
   }
 }
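
The `optionalDependencies` block lists one prebuilt native package per platform and architecture, all pinned to the same version as the main package. The loader in `index.js` is not part of this diff, so the lookup below is only an illustration of how such names relate to a platform/architecture pair; `NATIVE_PACKAGES` and `nativePackageFor` are hypothetical names:

```typescript
// Package names copied from the optionalDependencies block above,
// keyed by "<platform>-<arch>" in the style of Node's process.platform and process.arch.
const NATIVE_PACKAGES: Record<string, string> = {
  'win32-x64': '@alibot/rust-rpa-win32-x64-msvc',
  'win32-ia32': '@alibot/rust-rpa-win32-ia32-msvc',
  'darwin-x64': '@alibot/rust-rpa-darwin-x64',
  'darwin-arm64': '@alibot/rust-rpa-darwin-arm64',
};

// Hypothetical helper: return the matching native package name,
// or null when no prebuilt binary is listed (for example, on Linux).
function nativePackageFor(platform: string, arch: string): string | null {
  return NATIVE_PACKAGES[`${platform}-${arch}`] ?? null;
}

console.log(nativePackageFor('darwin', 'arm64'));
console.log(nativePackageFor('linux', 'x64'));
```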