@ui-tars-test/cli 0.3.0
- package/README.md +231 -0
- package/bin/index.js +13 -0
- package/dist/337.js +1396 -0
- package/dist/337.js.LICENSE.txt +14 -0
- package/dist/337.js.map +1 -0
- package/dist/337.mjs +1395 -0
- package/dist/337.mjs.LICENSE.txt +14 -0
- package/dist/337.mjs.map +1 -0
- package/dist/597.js +19 -0
- package/dist/597.mjs +18 -0
- package/dist/663.js +98 -0
- package/dist/663.js.map +1 -0
- package/dist/663.mjs +97 -0
- package/dist/663.mjs.map +1 -0
- package/dist/760.js +2957 -0
- package/dist/760.js.map +1 -0
- package/dist/760.mjs +2956 -0
- package/dist/760.mjs.map +1 -0
- package/dist/940.js +1013 -0
- package/dist/940.js.map +1 -0
- package/dist/940.mjs +1011 -0
- package/dist/940.mjs.map +1 -0
- package/dist/955.js +317 -0
- package/dist/955.js.map +1 -0
- package/dist/955.mjs +317 -0
- package/dist/955.mjs.map +1 -0
- package/dist/bundle/index.js +299060 -0
- package/dist/cli/commands.js +75 -0
- package/dist/cli/commands.js.map +1 -0
- package/dist/cli/commands.mjs +41 -0
- package/dist/cli/commands.mjs.map +1 -0
- package/dist/cli/start.js +447 -0
- package/dist/cli/start.js.map +1 -0
- package/dist/cli/start.mjs +396 -0
- package/dist/cli/start.mjs.map +1 -0
- package/dist/gui-agent-macos +0 -0
- package/dist/index.js +14 -0
- package/dist/index.js.LICENSE.txt +471 -0
- package/dist/index.js.map +1 -0
- package/dist/index.mjs +8 -0
- package/dist/index.mjs.LICENSE.txt +471 -0
- package/dist/index.mjs.map +1 -0
- package/dist/src/cli/commands.d.ts +2 -0
- package/dist/src/cli/commands.d.ts.map +1 -0
- package/dist/src/cli/start.d.ts +11 -0
- package/dist/src/cli/start.d.ts.map +1 -0
- package/dist/src/index.d.ts +2 -0
- package/dist/src/index.d.ts.map +1 -0
- package/package.json +59 -0
package/README.md
ADDED
@@ -0,0 +1,231 @@

# @ui-tars-test/cli

CLI for GUI Agent - A powerful automation tool for desktop, web, and mobile applications.

## Installation

### Global Installation
```bash
npm install -g @ui-tars-test/cli
```

### Use via npx (without installation)
```bash
npx @ui-tars-test/cli run [options]
```

### Local Installation
```bash
npm install @ui-tars-test/cli
```

## Usage

### Basic Usage

```bash
gui-agent run
```

This will start an interactive prompt where you can:
1. Configure your VLM model settings (provider, base URL, API key, model name)
2. Select the target operator (computer, browser, or android)
3. Enter your automation instruction

### Available Commands

#### `gui-agent run`
Run GUI Agent automation with optional parameters.

#### `gui-agent reset`
Reset stored configuration (API keys, model settings, etc.).
```bash
gui-agent reset                 # Reset default configuration file
gui-agent reset -c custom.json  # Reset specific configuration file
```

### Command Line Options

```bash
gui-agent run [options]
```

#### Options:
- `-p, --presets <url>` - Load model configuration from a remote YAML preset file
- `-t, --target <target>` - Specify the target operator:
  - `computer` - Desktop automation (default)
  - `browser` - Web browser automation
  - `android` - Android mobile automation
- `-q, --query <query>` - Provide the automation instruction directly via command line
- `-c, --config <path>` - Path to a custom configuration file (default: `~/.gui-agent-cli.json`)
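
To make the defaults above concrete, here is a minimal sketch of how the four `run` options could be parsed into a typed object. It is not taken from the package source: `RunOptions`, `isTarget`, and `parseRunArgs` are hypothetical names, and the sketch assumes every flag is followed by exactly one value.

```typescript
// Hypothetical sketch: names and parsing strategy are illustrative only.
type Target = "computer" | "browser" | "android";

interface RunOptions {
  presets?: string; // -p, --presets <url>
  target: Target; // -t, --target <target>
  query?: string; // -q, --query <query>
  config: string; // -c, --config <path>
}

const TARGETS: readonly Target[] = ["computer", "browser", "android"];

function isTarget(value: string): value is Target {
  return (TARGETS as readonly string[]).indexOf(value) !== -1;
}

function parseRunArgs(argv: string[]): RunOptions {
  // Defaults as documented in the option list.
  const options: RunOptions = {
    target: "computer",
    config: "~/.gui-agent-cli.json",
  };
  // Assumption: arguments always come in flag/value pairs.
  for (let i = 0; i < argv.length; i += 2) {
    const flag = argv[i];
    const value = argv[i + 1];
    if (value === undefined) throw new Error(`Missing value for ${flag}`);
    switch (flag) {
      case "-p":
      case "--presets":
        options.presets = value;
        break;
      case "-t":
      case "--target":
        if (!isTarget(value)) throw new Error(`Unknown target: ${value}`);
        options.target = value;
        break;
      case "-q":
      case "--query":
        options.query = value;
        break;
      case "-c":
      case "--config":
        options.config = value;
        break;
      default:
        throw new Error(`Unknown option: ${flag}`);
    }
  }
  return options;
}
```

For example, `parseRunArgs(["-t", "browser", "-q", "Search for docs"])` would keep the default config path and set only the target and query.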

### Examples

#### Computer Automation
```bash
gui-agent run -t computer -q "Open Chrome browser and navigate to github.com"
```

#### Android Mobile Automation
Make sure your Android device is connected and USB debugging is enabled:

```bash
gui-agent run -t android -q "Open WhatsApp and send a message to John"
```

#### Browser Automation
```bash
gui-agent run -t browser -q "Search for 'GUI Agent automation' on Google"
```

#### Using Remote Presets
```bash
gui-agent run -p "https://example.com/config.yaml" -q "Automate the login process"
```

## Configuration

### Model Configuration

The CLI requires VLM (Vision Language Model) configuration. You can provide this via:

1. **Interactive setup** - When you first run the CLI, it will prompt for:
   - Model provider (volcengine, anthropic, openai, lm-studio, deepseek, ollama)
   - Model base URL
   - API key
   - Model name

2. **Configuration file** - Settings are saved to `~/.gui-agent-cli.json`:
   ```json
   {
     "provider": "openai",
     "baseURL": "https://api.openai.com/v1",
     "apiKey": "your-api-key",
     "model": "gpt-4-vision-preview",
     "useResponsesApi": false
   }
   ```

3. **Remote presets** - Load configuration from a YAML file:
   ```yaml
   vlmBaseUrl: "https://api.openai.com/v1"
   vlmApiKey: "your-api-key"
   vlmModelName: "gpt-4-vision-preview"
   useResponsesApi: false
   ```
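
The preset file and the saved configuration file use different key names (`vlmBaseUrl` versus `baseURL`, and so on). The sketch below shows one plausible way to translate an already-parsed preset into the saved shape. The key-to-key mapping, the `presetToConfig` name, and the fallback to the `openai` provider are assumptions; the README does not describe how the CLI actually performs this step.

```typescript
// Shape of the saved configuration file, copied from the JSON example.
interface CliConfig {
  provider: string;
  baseURL: string;
  apiKey: string;
  model: string;
  useResponsesApi: boolean;
}

// Shape of a parsed YAML preset, copied from the YAML example.
interface Preset {
  vlmBaseUrl: string;
  vlmApiKey: string;
  vlmModelName: string;
  useResponsesApi?: boolean;
}

// Assumed mapping: each vlm* key is copied to its closest config counterpart.
// The preset example carries no provider, so the caller supplies one
// (defaulting here to "openai", which the README marks as the default).
function presetToConfig(preset: Preset, provider = "openai"): CliConfig {
  return {
    provider,
    baseURL: preset.vlmBaseUrl,
    apiKey: preset.vlmApiKey,
    model: preset.vlmModelName,
    useResponsesApi: preset.useResponsesApi ?? false,
  };
}
```

Parsing the YAML text itself is left out on purpose, since it would require a third-party parser.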

#### Supported Providers
- **volcengine** - VolcEngine (ByteDance) models
- **anthropic** - Anthropic Claude models
- **openai** - OpenAI models (default)
- **lm-studio** - LM Studio local models
- **deepseek** - DeepSeek models
- **ollama** - Ollama local models
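
A configuration loader could check the `provider` field against this list before using it. The following is a rough sketch with hypothetical names (`PROVIDERS`, `isProvider`, `resolveProvider`); it assumes that an omitted provider falls back to `openai`, which the list marks as the default.

```typescript
// Provider identifiers as listed in the README.
const PROVIDERS = [
  "volcengine",
  "anthropic",
  "openai",
  "lm-studio",
  "deepseek",
  "ollama",
] as const;

type Provider = (typeof PROVIDERS)[number];

const DEFAULT_PROVIDER: Provider = "openai";

// Type guard: narrows an arbitrary string to a known provider identifier.
function isProvider(value: string): value is Provider {
  return (PROVIDERS as readonly string[]).indexOf(value) !== -1;
}

// Assumption: a missing or empty provider means "use the default".
function resolveProvider(value?: string): Provider {
  if (value === undefined || value === "") return DEFAULT_PROVIDER;
  if (!isProvider(value)) {
    throw new Error(`Unsupported provider: ${value}`);
  }
  return value;
}
```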

## Operators

### Desktop Automation (nut-js)
- Automates desktop applications
- Uses computer vision to identify UI elements
- Supports mouse and keyboard actions
- Works with Windows, macOS, and Linux

### Android Automation (adb)
- Controls Android devices via ADB
- Requires USB debugging enabled
- Can automate mobile apps and system UI
- Supports touch gestures and device interactions
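
Before running with `-t android`, it can help to confirm that ADB sees an authorized device. The sketch below parses the text printed by `adb devices` (a header line followed by one `serial<TAB>state` line per device) and keeps the devices whose state is `device`. The function names are hypothetical, and nothing here reflects how the CLI itself detects devices.

```typescript
interface AdbDevice {
  serial: string;
  state: string; // e.g. "device", "offline", "unauthorized"
}

// Parses the text output of `adb devices` into serial/state pairs.
function parseAdbDevices(output: string): AdbDevice[] {
  const lines = output.split(/\r?\n/).map((line) => line.trim());
  const header = lines.findIndex((line) =>
    line.startsWith("List of devices attached"),
  );
  if (header === -1) return []; // unexpected output: report no devices
  return lines
    .slice(header + 1)
    .filter((line) => line.length > 0)
    .map((line) => {
      const parts = line.split(/\s+/);
      return { serial: parts[0] ?? "", state: parts[1] ?? "unknown" };
    });
}

// A device is usable only once USB debugging has been authorized,
// which ADB reports as the "device" state.
function readyDevices(output: string): AdbDevice[] {
  return parseAdbDevices(output).filter((d) => d.state === "device");
}
```

An `unauthorized` entry usually means the debugging prompt on the phone has not been accepted yet.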

## Configuration Management

### Reset Configuration
To clear all stored configuration and start fresh:
```bash
gui-agent reset
```

This will remove the configuration file (`~/.gui-agent-cli.json`) and the CLI will prompt you to configure settings again on the next run.
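
The behaviour described above amounts to deleting one file and treating its absence as "not configured". A minimal sketch of that idea, assuming Node.js and using hypothetical function names (`resetConfig`, `needsSetup`) that are not taken from the package:

```typescript
import { existsSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { homedir, tmpdir } from "node:os";
import { join } from "node:path";

// Default location documented in this README.
const DEFAULT_CONFIG_PATH = join(homedir(), ".gui-agent-cli.json");

// Returns true when a file was removed, false when there was nothing to reset.
function resetConfig(configPath: string = DEFAULT_CONFIG_PATH): boolean {
  if (!existsSync(configPath)) return false;
  rmSync(configPath);
  return true;
}

// Assumption: a missing file is what triggers the interactive prompt.
function needsSetup(configPath: string = DEFAULT_CONFIG_PATH): boolean {
  return !existsSync(configPath);
}

// Usage: reset a throwaway config file in the temp directory.
const demoPath = join(mkdtempSync(join(tmpdir(), "gui-agent-")), "config.json");
writeFileSync(demoPath, "{}");
console.log(resetConfig(demoPath)); // true
console.log(needsSetup(demoPath)); // true
```

Passing an explicit path mirrors `gui-agent reset -c <path>`, while omitting it targets the default file.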

### Custom Configuration File
You can specify a custom configuration file location:
```bash
gui-agent run -c /path/to/custom-config.json
```

To reset a specific configuration file:
```bash
gui-agent reset -c /path/to/custom-config.json
```

## Development

### Building the CLI
```bash
npm run build
```

### Development Mode
```bash
npm run dev
```

### Running Tests
```bash
npm test
```

## License

Apache-2.0

## Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.