yiyan-browser-agent 1.3.3 → 1.4.0

package/README.md CHANGED
@@ -1,145 +1,228 @@
1
- # doubao-browser-agent
1
+ # yiyan-browser-agent
2
2
 
3
- NPM package for interacting with Doubao (豆包) web version via Playwright.
3
+ NPM package for interacting with Yiyan (文心一言) web version via Playwright. No API key required.
4
4
 
5
5
  ## Features
6
6
 
7
- - Automated browser interaction with Doubao web version
8
- - Automatic Chrome profile management for login persistence
9
- - Retry mechanism for network failures
10
- - CLI and Node.js API support
11
- - TypeScript support with full type definitions
7
+ - 🤖 Automated browser interaction with Yiyan web version
8
+ - 🔐 Login persistence via Playwright session
9
+ - 🌍 Cross-platform: Windows / Ubuntu / macOS
10
+ - 📦 Auto-download Chromium (~150MB)
11
+ - 🛠️ CLI and Node.js API support
12
+ - 📝 TypeScript support with full type definitions
12
13
 
13
- ## Prerequisites
14
+ ## Installation
14
15
 
15
- 1. **Chrome browser** installed on your system
16
- 2. **Logged into Doubao** (https://www.doubao.com/chat/) in your Chrome browser at least once
16
+ ```bash
17
+ npm install yiyan-browser-agent
18
+ ```
17
19
 
18
- The package uses your Chrome profile to maintain login state, so you need to login manually first.
20
+ Chromium will be automatically downloaded after installation.
19
21
 
20
- ## Installation
22
+ ### Ubuntu Additional Step
21
23
 
22
24
  ```bash
23
- npm install doubao-browser-agent
25
+ npx playwright install-deps chromium
24
26
  ```
25
27
 
26
- ## Node.js API Usage
28
+ This installs required system libraries for Chromium on Ubuntu.
29
+
30
+ ## Quick Start
31
+
32
+ ### CLI Usage
33
+
34
+ ```bash
35
+ # First time: login to Yiyan
36
+ yiyan-browser-agent login
37
+
38
+ # Ask a question (headless mode)
39
+ yiyan-browser-agent ask "Weather in Jining"
40
+
41
+ # With verbose output
42
+ yiyan-browser-agent ask "What is TypeScript?" --verbose
43
+
44
+ # Headful mode (visible browser window)
45
+ yiyan-browser-agent ask "Explain Promise" --headful
46
+
47
+ # Check login status
48
+ yiyan-browser-agent status
49
+
50
+ # Clear saved session
51
+ yiyan-browser-agent reset
52
+
53
+ # Debug mode (output DOM info for troubleshooting)
54
+ yiyan-browser-agent debug
55
+ ```
56
+
57
+ ### Node.js API
27
58
 
28
59
  ```typescript
29
- import { DoubaoAgent } from 'doubao-browser-agent';
60
+ import { YiyanAgent } from 'yiyan-browser-agent';
30
61
 
31
62
  // Create agent instance
32
- const agent = new DoubaoAgent({
63
+ const agent = new YiyanAgent({
33
64
  timeout: 120000, // Timeout in milliseconds (default: 120000)
34
- retryCount: 3, // Retry count (default: 3)
35
65
  });
36
66
 
37
- // Send a question and get answer
38
- try {
39
- const answer = await agent.ask('What is TypeScript?');
40
- console.log('Answer:', answer);
41
- } catch (error) {
42
- console.error('Error:', error.message);
43
- }
67
+ // Login first time
68
+ await agent.login();
69
+
70
+ // Ask a question
71
+ const answer = await agent.ask('What is TypeScript?');
72
+ console.log('Answer:', answer);
73
+
74
+ // Headful mode (visible browser)
75
+ const answer2 = await agent.ask('Explain Promise', true);
44
76
 
45
77
  // Check login status
46
78
  const status = agent.status();
47
79
  console.log('Logged in:', status.loggedIn);
48
- console.log('Profile path:', status.profilePath);
80
+ console.log('Session path:', status.sessionPath);
49
81
 
50
- // Clear saved profile (if needed)
82
+ // Clear saved session
51
83
  await agent.reset();
52
- ```
53
-
54
- ## CLI Usage
55
84
 
56
- ```bash
57
- # Send a question
58
- doubao-agent ask "What is TypeScript?"
85
+ // Debug mode (for troubleshooting selectors)
86
+ await agent.debug();
87
+ ```
59
88
 
60
- # With options
61
- doubao-agent ask "Explain Promise" --timeout 60000 --retry 5
89
+ ## CLI Commands
62
90
 
63
- # Check login status
64
- doubao-agent status
91
+ | Command | Description |
92
+ |---------|-------------|
93
+ | `login` | Open browser for manual login (required first time) |
94
+ | `ask "question"` | Send question and get answer |
95
+ | `status` | Check login status |
96
+ | `reset` | Clear saved session |
97
+ | `debug` | Debug mode: output DOM info for troubleshooting |
65
98
 
66
- # Clear saved profile
67
- doubao-agent reset
99
+ ## CLI Options
68
100
 
69
- # Show help
70
- doubao-agent --help
71
- ```
101
+ | Option | Description |
102
+ |--------|-------------|
103
+ | `--timeout <ms>` | Timeout in milliseconds (default: 120000) |
104
+ | `--headful` | Show browser window (for debugging/captcha) |
105
+ | `--verbose` | Show detailed logs |
106
+ | `--help` | Show help message |
72
107
 
73
108
  ## API Documentation
74
109
 
75
- ### `DoubaoAgent`
110
+ ### `YiyanAgent`
76
111
 
77
- Main class for interacting with Doubao.
112
+ Main class for interacting with Yiyan.
78
113
 
79
114
  #### Constructor
80
115
 
81
116
  ```typescript
82
- new DoubaoAgent(options?: DoubaoAgentOptions)
117
+ new YiyanAgent(options?: YiyanAgentOptions)
83
118
  ```
84
119
 
85
120
  **Options:**
86
121
  - `timeout?: number` - Timeout in milliseconds (default: 120000)
87
- - `retryCount?: number` - Number of retry attempts (default: 3)
88
- - `profileDir?: string` - Custom directory for storing Chrome profile copy
89
- - `chromePath?: string` - Custom Chrome executable path
122
+ - `profileDir?: string` - Custom session directory
123
+ - `verbose?: boolean` - Enable verbose logging (default: false)
90
124
 
91
125
  #### Methods
92
126
 
93
- ##### `ask(question: string): Promise<string>`
127
+ ##### `login(): Promise<void>`
128
+
129
+ Open browser window for manual login. Login state is saved automatically.
94
130
 
95
- Send a question to Doubao and return the answer.
131
+ ##### `ask(question: string, headful?: boolean): Promise<string>`
96
132
 
97
- ##### `status(): { loggedIn: boolean; profilePath: string }`
133
+ Send question and return answer.
134
+ - `headful: false` (default) - Headless mode
135
+ - `headful: true` - Visible browser window
98
136
 
99
- Check the login status (whether profile exists).
137
+ ##### `status(): { loggedIn: boolean; sessionPath: string }`
138
+
139
+ Check login status.
100
140
 
101
141
  ##### `reset(): Promise<void>`
102
142
 
103
- Clear the saved profile copy.
143
+ Clear saved session.
104
144
 
105
- ### Error Types
145
+ ##### `debug(): Promise<void>`
106
146
 
107
- The package throws `DoubaoAgentError` with the following error types:
147
+ Start browser in debug mode, output DOM information for troubleshooting selectors.
148
+
149
+ ### Error Types
108
150
 
109
151
  | Type | Description |
110
152
  |------|-------------|
111
- | `BROWSER_LAUNCH` | Failed to launch Chrome browser |
112
- | `PROFILE_COPY` | Failed to copy Chrome profile |
153
+ | `BROWSER_LAUNCH` | Failed to launch Chromium |
113
154
  | `TIMEOUT` | Timeout while waiting for response |
114
155
  | `NETWORK` | Network or connection error |
115
-
116
- ```typescript
117
- import { DoubaoAgentError } from 'doubao-browser-agent';
118
-
119
- try {
120
- const answer = await agent.ask('Hello');
121
- } catch (error) {
122
- if (error instanceof DoubaoAgentError) {
123
- console.log('Error type:', error.type);
124
- console.log('Error message:', error.message);
125
- }
126
- }
127
- ```
156
+ | `CAPTCHA` | Captcha detected (use --headful) |
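
Reviewer note: 1.4.0 drops the `DoubaoAgentError` handling example without a replacement, so the README no longer shows how a caller might react to each error type. The sketch below pairs the four types in this table with the remedies given in the README's Troubleshooting section. `AgentErrorType` and `remedyFor` are hypothetical names used for illustration, not confirmed exports of the package, and the `NETWORK` remedy is an assumption because the README does not give one.

```typescript
// Hypothetical helper, not part of yiyan-browser-agent.
// The union mirrors the "Type" column of the README's error table.
type AgentErrorType = "BROWSER_LAUNCH" | "TIMEOUT" | "NETWORK" | "CAPTCHA";

// Returns the command (or advice) the README's Troubleshooting section
// associates with each error type.
function remedyFor(type: AgentErrorType): string {
  switch (type) {
    case "BROWSER_LAUNCH":
      // README: install Chromium and its system libraries (Ubuntu).
      return "npx playwright install chromium && npx playwright install-deps chromium";
    case "TIMEOUT":
      // README: rerun in headful mode with verbose logs to see what happens.
      return 'yiyan-browser-agent ask "question" --headful --verbose';
    case "CAPTCHA":
      // README: headful mode lets the user solve the captcha manually.
      return 'yiyan-browser-agent ask "question" --headful';
    case "NETWORK":
      // Assumption: the README lists no specific remedy for network errors.
      return "check the network connection and retry";
  }
}

console.log(remedyFor("CAPTCHA"));
```

If the package does expose the error type on thrown errors, a caller could pass that value to a helper like this inside a `catch` block; the diff alone does not confirm that shape.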
128
157
 
129
158
  ## How It Works
130
159
 
131
- 1. On first run, the package copies your Chrome profile to a temporary directory
132
- 2. Launches Chrome in headless mode with the copied profile
133
- 3. Navigates to Doubao chat page
134
- 4. Sends your question and waits for response
135
- 5. Extracts the response and closes the browser
160
+ 1. Uses Playwright's built-in Chromium (auto-download)
161
+ 2. Launches browser with persistent session context
162
+ 3. Navigates to Yiyan chat page
163
+ 4. Sends question and waits for response
164
+ 5. Extracts response using multiple strategies
165
+ 6. Closes browser (session saved for next use)
166
+
167
+ ## Session Storage
168
+
169
+ Session data is stored in:
170
+ - Windows: `C:\Users\<user>\.yiyan-browser-agent\session`
171
+ - Linux/macOS: `~/.yiyan-browser-agent/session`
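
Reviewer note: the two locations above differ only in the home directory and the path separator. A minimal sketch of that rule follows, assuming the directory is always `<home>/.yiyan-browser-agent/session`. `sessionDirFor` is a hypothetical helper, and how the package actually resolves the path is not shown in this diff.

```typescript
// Hypothetical helper, not part of yiyan-browser-agent.
// Rebuilds the session directory documented in the README from a home
// directory and a platform name.
type Platform = "win32" | "linux" | "darwin";

function sessionDirFor(home: string, platform: Platform): string {
  // Windows paths use backslashes; Linux and macOS use forward slashes.
  const separator = platform === "win32" ? "\\" : "/";
  return [home, ".yiyan-browser-agent", "session"].join(separator);
}

console.log(sessionDirFor("/home/alice", "linux"));
console.log(sessionDirFor("C:\\Users\\alice", "win32"));
```

In a Node.js program the two arguments would typically come from `os.homedir()` and `process.platform`.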
136
172
 
137
173
  ## Supported Platforms
138
174
 
139
- - Windows (Chrome paths: `C:/Program Files/Google/Chrome/Application/chrome.exe`)
140
- - macOS (Chrome paths: `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`)
141
- - Linux (Chrome paths: `/usr/bin/google-chrome`, `/usr/bin/chrome`)
175
+ | Platform | Status | Notes |
176
+ |----------|--------|-------|
177
+ | Windows | ✅ | Direct use |
178
+ | Ubuntu | ✅ | Run `npx playwright install-deps chromium` first |
179
+ | macOS | ✅ | Direct use |
180
+
181
+ ## Troubleshooting
182
+
183
+ ### Ubuntu: Browser doesn't start
184
+
185
+ ```bash
186
+ npx playwright install chromium
187
+ npx playwright install-deps chromium
188
+ yiyan-browser-agent login --verbose
189
+ ```
190
+
191
+ ### Timeout in headless mode
192
+
193
+ ```bash
194
+ # Use headful mode to see what's happening
195
+ yiyan-browser-agent ask "question" --headful --verbose
196
+ ```
197
+
198
+ ### Captcha detected
199
+
200
+ ```bash
201
+ # Use headful mode to manually solve captcha
202
+ yiyan-browser-agent ask "question" --headful
203
+ ```
204
+
205
+ ### Selectors not working
206
+
207
+ ```bash
208
+ # Use debug mode to see DOM structure
209
+ yiyan-browser-agent debug
210
+ ```
211
+
212
+ ## Comparison with deepseek-browser-agent
213
+
214
+ This package follows similar architecture to [deepseek-browser-agent](https://github.com/Omar-Azam/deepseek-browser-agent):
215
+
216
+ | Feature | yiyan-browser-agent | deepseek-browser-agent |
217
+ |---------|---------------------|------------------------|
218
+ | Browser | Playwright Chromium | Playwright Chromium |
219
+ | Login | Persistent session | Persistent session |
220
+ | Platform | Win/Ubuntu/macOS | Linux |
142
221
 
143
222
  ## License
144
223
 
145
- MIT
224
+ MIT
225
+
226
+ ## Repository
227
+
228
+ https://github.com/picha/yiyan-browser-agent
package/dist/cli.d.ts CHANGED
@@ -4,7 +4,7 @@ import { C as CliOutput } from './types-BhQ78DYf.js';
4
4
  declare function printHelp(): void;
5
5
  /** Parsed CLI arguments */
6
6
  interface ParsedArgs {
7
- command: 'ask' | 'status' | 'reset' | 'login' | 'help' | null;
7
+ command: 'ask' | 'status' | 'reset' | 'login' | 'debug' | 'help' | null;
8
8
  question?: string;
9
9
  timeout?: number;
10
10
  headful?: boolean;
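
Reviewer note: the only change in this hunk is the new `'debug'` member of the `command` union, matching the `debug` command added to the README. The sketch below shows one way an argument list could be mapped onto that shape. It mirrors only the four fields visible in the hunk, since the real interface may declare more. `ParsedArgsSketch` and `parseArgsSketch` are hypothetical names, and the package's actual parsing logic is not part of this diff.

```typescript
// Hypothetical sketch, not the package's parser.
type Command = "ask" | "status" | "reset" | "login" | "debug" | "help";

// Mirrors only the ParsedArgs fields visible in the hunk above.
interface ParsedArgsSketch {
  command: Command | null;
  question?: string;
  timeout?: number;
  headful?: boolean;
}

const COMMANDS: readonly string[] = ["ask", "status", "reset", "login", "debug", "help"];

function parseArgsSketch(argv: string[]): ParsedArgsSketch {
  const result: ParsedArgsSketch = { command: null };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--help") {
      result.command = "help";
    } else if (arg === "--headful") {
      result.headful = true;
    } else if (arg === "--timeout") {
      // The next token is the timeout in milliseconds; ignore it if not numeric.
      const ms = Number(argv[++i]);
      if (Number.isFinite(ms)) result.timeout = ms;
    } else if (result.command === null && COMMANDS.includes(arg)) {
      // The first recognized word becomes the command.
      result.command = arg as Command;
    } else if (result.command === "ask" && result.question === undefined && !arg.startsWith("--")) {
      // The first non-flag token after "ask" is the question.
      result.question = arg;
    }
    // Other flags (for example --verbose) are ignored here because no
    // matching field is visible in the hunk.
  }
  return result;
}

console.log(parseArgsSketch(["ask", "What is TypeScript?", "--headful"]));
```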