@ui-tars-test/cli 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (49)
  1. package/README.md +231 -0
  2. package/bin/index.js +13 -0
  3. package/dist/337.js +1396 -0
  4. package/dist/337.js.LICENSE.txt +14 -0
  5. package/dist/337.js.map +1 -0
  6. package/dist/337.mjs +1395 -0
  7. package/dist/337.mjs.LICENSE.txt +14 -0
  8. package/dist/337.mjs.map +1 -0
  9. package/dist/597.js +19 -0
  10. package/dist/597.mjs +18 -0
  11. package/dist/663.js +98 -0
  12. package/dist/663.js.map +1 -0
  13. package/dist/663.mjs +97 -0
  14. package/dist/663.mjs.map +1 -0
  15. package/dist/760.js +2957 -0
  16. package/dist/760.js.map +1 -0
  17. package/dist/760.mjs +2956 -0
  18. package/dist/760.mjs.map +1 -0
  19. package/dist/940.js +1013 -0
  20. package/dist/940.js.map +1 -0
  21. package/dist/940.mjs +1011 -0
  22. package/dist/940.mjs.map +1 -0
  23. package/dist/955.js +317 -0
  24. package/dist/955.js.map +1 -0
  25. package/dist/955.mjs +317 -0
  26. package/dist/955.mjs.map +1 -0
  27. package/dist/bundle/index.js +299060 -0
  28. package/dist/cli/commands.js +75 -0
  29. package/dist/cli/commands.js.map +1 -0
  30. package/dist/cli/commands.mjs +41 -0
  31. package/dist/cli/commands.mjs.map +1 -0
  32. package/dist/cli/start.js +447 -0
  33. package/dist/cli/start.js.map +1 -0
  34. package/dist/cli/start.mjs +396 -0
  35. package/dist/cli/start.mjs.map +1 -0
  36. package/dist/gui-agent-macos +0 -0
  37. package/dist/index.js +14 -0
  38. package/dist/index.js.LICENSE.txt +471 -0
  39. package/dist/index.js.map +1 -0
  40. package/dist/index.mjs +8 -0
  41. package/dist/index.mjs.LICENSE.txt +471 -0
  42. package/dist/index.mjs.map +1 -0
  43. package/dist/src/cli/commands.d.ts +2 -0
  44. package/dist/src/cli/commands.d.ts.map +1 -0
  45. package/dist/src/cli/start.d.ts +11 -0
  46. package/dist/src/cli/start.d.ts.map +1 -0
  47. package/dist/src/index.d.ts +2 -0
  48. package/dist/src/index.d.ts.map +1 -0
  49. package/package.json +59 -0
package/README.md ADDED
@@ -0,0 +1,231 @@
1
+ # @ui-tars-test/cli
2
+
3
+ CLI for GUI Agent - A powerful automation tool for desktop, web, and mobile applications.
4
+
5
+ ## Installation
6
+
7
+ ### Global Installation
8
+ ```bash
9
+ npm install -g @ui-tars-test/cli
10
+ ```
11
+
12
+ ### Use via npx (without installation)
13
+ ```bash
14
+ npx @ui-tars-test/cli run [options]
15
+ ```
16
+
17
+ ### Local Installation
18
+ ```bash
19
+ npm install @ui-tars-test/cli
20
+ ```
21
+
22
+ ## Usage
23
+
24
+ ### Basic Usage
25
+
26
+ ```bash
27
+ gui-agent run
28
+ ```
29
+
30
+ This will start an interactive prompt where you can:
31
+ 1. Configure your VLM model settings (provider, base URL, API key, model name)
32
+ 2. Select the target operator (computer, browser, or android)
33
+ 3. Enter your automation instruction
34
+
35
+ ### Available Commands
36
+
37
+ #### `gui-agent run`
38
+ Run GUI Agent automation with optional parameters.
39
+
40
+ #### `gui-agent reset`
41
+ Reset stored configuration (API keys, model settings, etc.).
42
+ ```bash
43
+ gui-agent reset # Reset default configuration file
44
+ gui-agent reset -c custom.json # Reset specific configuration file
45
+ ```
46
+
47
+ ### Command Line Options
48
+
49
+ ```bash
50
+ gui-agent run [options]
51
+ ```
52
+
53
+ #### Options:
54
+ - `-p, --presets <url>` - Load model configuration from a remote YAML preset file
55
+ - `-t, --target <target>` - Specify the target operator:
56
+   - `computer` - Desktop automation (default)
57
+   - `browser` - Web browser automation
58
+   - `android` - Android mobile automation
59
+ - `-q, --query <query>` - Provide the automation instruction directly via command line
60
+ - `-c, --config <path>` - Path to a custom configuration file (default: `~/.gui-agent-cli.json`)
61
+
62
+ ### Examples
63
+
64
+ #### Computer Automation
65
+ ```bash
66
+ gui-agent run -t computer -q "Open Chrome browser and navigate to github.com"
67
+ ```
68
+
69
+ #### Android Mobile Automation
70
+ Make sure your Android device is connected over USB and has USB debugging enabled:
71
+
72
+ ```bash
73
+ gui-agent run -t android -q "Open WhatsApp and send a message to John"
74
+ ```
75
+
76
+ #### Browser Automation
77
+ ```bash
78
+ gui-agent run -t browser -q "Search for 'GUI Agent automation' on Google"
79
+ ```
80
+
81
+ #### Using Remote Presets
82
+ ```bash
83
+ gui-agent run -p "https://example.com/config.yaml" -q "Automate the login process"
84
+ ```
85
+
86
+ ## Configuration
87
+
88
+ ### Model Configuration
89
+
90
+ The CLI requires VLM (Vision Language Model) configuration. You can provide this via:
91
+
92
+ 1. **Interactive setup** - When you first run the CLI, it will prompt for:
93
+    - Model provider (volcengine, anthropic, openai, lm-studio, deepseek, ollama)
94
+    - Model base URL
95
+    - API key
96
+    - Model name
97
+
98
+ 2. **Configuration file** - Settings are saved to `~/.gui-agent-cli.json`:
99
+    ```json
100
+    {
101
+      "provider": "openai",
102
+      "baseURL": "https://api.openai.com/v1",
103
+      "apiKey": "your-api-key",
104
+      "model": "gpt-4-vision-preview",
105
+      "useResponsesApi": false
106
+    }
107
+    ```
108
+
109
+ 3. **Remote presets** - Load configuration from a YAML file:
110
+    ```yaml
111
+    vlmBaseUrl: "https://api.openai.com/v1"
112
+    vlmApiKey: "your-api-key"
113
+    vlmModelName: "gpt-4-vision-preview"
114
+    useResponsesApi: false
115
+    ```
116
+
117
+ #### Supported Providers
118
+ - **volcengine** - VolcEngine (ByteDance) models
119
+ - **anthropic** - Anthropic Claude models
120
+ - **openai** - OpenAI models (default)
121
+ - **lm-studio** - LM Studio local models
122
+ - **deepseek** - DeepSeek models
123
+ - **ollama** - Ollama local models
124
+
125
+ ## Operators
126
+
127
+ ### Computer Automation (nut-js)
128
+
129
+ #### Using Remote Presets
130
+ ```bash
131
+ gui-agent start -p "https://example.com/config.yaml" -q "Automate the login process"
132
+ ```
133
+
134
+ ## Configuration
135
+
136
+ ### Model Configuration
137
+
138
+ The CLI requires VLM (Vision Language Model) configuration. You can provide this via:
139
+
140
+ 1. **Interactive setup** - When you first run the CLI, it will prompt for:
141
+ - Model provider (volcengine, anthropic, openai, lm-studio, deepseek, ollama)
142
+ - Model base URL
143
+ - API key
144
+ - Model name
145
+
146
+ 2. **Configuration file** - Settings are saved to `~/.gui-agent-cli.json`:
147
+ ```json
148
+ {
149
+ "provider": "openai",
150
+ "baseURL": "https://api.openai.com/v1",
151
+ "apiKey": "your-api-key",
152
+ "model": "gpt-4-vision-preview",
153
+ "useResponsesApi": false
154
+ }
155
+ ```
156
+
157
+ 3. **Remote presets** - Load configuration from a YAML file:
158
+ ```yaml
159
+ vlmBaseUrl: "https://api.openai.com/v1"
160
+ vlmApiKey: "your-api-key"
161
+ vlmModelName: "gpt-4-vision-preview"
162
+ useResponsesApi: false
163
+ ```
164
+
165
+ #### Supported Providers
166
+ - **volcengine** - VolcEngine (ByteDance) models
167
+ - **anthropic** - Anthropic Claude models
168
+ - **openai** - OpenAI models (default)
169
+ - **lm-studio** - LM Studio local models
170
+ - **deepseek** - DeepSeek models
171
+ - **ollama** - Ollama local models
172
+
173
+ ## Operators
174
+
175
+ ### Desktop Automation (nut-js)
176
+ - Automates desktop applications
177
+ - Uses computer vision to identify UI elements
178
+ - Supports mouse and keyboard actions
179
+ - Works with Windows, macOS, and Linux
180
+
181
+ ### Android Automation (adb)
182
+ - Controls Android devices via ADB
183
+ - Requires USB debugging enabled
184
+ - Can automate mobile apps and system UI
185
+ - Supports touch gestures and device interactions
186
+
187
+ ## Configuration Management
188
+
189
+ ### Reset Configuration
190
+ To clear all stored configuration and start fresh:
191
+ ```bash
192
+ gui-agent reset
193
+ ```
194
+
195
+ This removes the configuration file (`~/.gui-agent-cli.json`); the CLI then prompts you to configure settings again on the next run.
196
+
197
+ ### Custom Configuration File
198
+ You can specify a custom configuration file location:
199
+ ```bash
200
+ gui-agent run -c /path/to/custom-config.json
201
+ ```
202
+
203
+ To reset a specific configuration file:
204
+ ```bash
205
+ gui-agent reset -c /path/to/custom-config.json
206
+ ```
207
+
208
+ ## Development
209
+
210
+ ### Building the CLI
211
+ ```bash
212
+ npm run build
213
+ ```
214
+
215
+ ### Development Mode
216
+ ```bash
217
+ npm run dev
218
+ ```
219
+
220
+ ### Running Tests
221
+ ```bash
222
+ npm test
223
+ ```
224
+
225
+ ## License
226
+
227
+ Apache-2.0
228
+
229
+ ## Contributing
230
+
231
+ Contributions are welcome! Please read our contributing guidelines and submit pull requests to our repository.
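Note on the README's configuration section: it shows the same settings in two shapes, the saved JSON file (`baseURL`, `apiKey`, `model`) and the YAML preset (`vlmBaseUrl`, `vlmApiKey`, `vlmModelName`). The sketch below illustrates how preset keys might be translated into the config-file shape. It assumes a one-to-one key correspondence that the README implies but does not state, and `presetToConfig` is a hypothetical helper, not a function from this package. The preset example has no provider key, so the provider is taken as a parameter here.

```javascript
// Hypothetical helper: NOT part of @ui-tars-test/cli.
// Assumes preset keys map one-to-one onto the config-file keys shown in the README.
function presetToConfig(preset, provider = 'openai') {
  const required = ['vlmBaseUrl', 'vlmApiKey', 'vlmModelName'];
  const missing = required.filter((key) => !preset[key]);
  if (missing.length > 0) {
    throw new Error(`Preset is missing: ${missing.join(', ')}`);
  }
  return {
    provider,
    baseURL: preset.vlmBaseUrl,
    apiKey: preset.vlmApiKey,
    model: preset.vlmModelName,
    // Treat an absent flag as false, matching the README examples.
    useResponsesApi: Boolean(preset.useResponsesApi),
  };
}
```

Failing fast on missing keys keeps a half-filled preset from being written to disk as if it were a complete configuration.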
package/bin/index.js ADDED
@@ -0,0 +1,13 @@
1
+ #!/usr/bin/env node
2
+
3
+ function main() {
4
+   try {
5
+     const { run } = require('../dist/cli/commands');
6
+     run();
7
+   } catch (err) {
8
+     console.error(err);
9
+     process.exit(1);
10
+   }
11
+ }
12
+
13
+ main();
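Note on the launcher: the synchronous `try`/`catch` covers errors thrown while loading `../dist/cli/commands` and during the synchronous part of `run()`. The diff does not show whether `run()` returns a promise; if it does, a rejection would bypass that `catch`. The sketch below shows one way to handle both cases. It is an illustration under that assumption, not the package's code: `launch`, `loadRun`, and `onFailure` are hypothetical names, and `run` is injected so the example is self-contained.

```javascript
// Hypothetical variant of the launcher; not the published bin/index.js.
// `loadRun` stands in for `() => require('../dist/cli/commands').run`.
async function launch(loadRun, onFailure) {
  try {
    const run = loadRun();
    // `await` handles both a plain return value and a returned promise.
    await run();
    return 0;
  } catch (err) {
    onFailure(err);
    return 1;
  }
}
```

In the launcher this could be wired up as `launch(() => require('../dist/cli/commands').run, console.error).then((code) => { process.exitCode = code; });`, which sets the exit code without cutting off pending output.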