yiyan-browser-agent 1.3.3 → 1.3.4

This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.
Files changed (2)
  1. package/README.md +143 -78
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -1,145 +1,210 @@
1
- # doubao-browser-agent
1
+ # yiyan-browser-agent
2
2
 
3
- NPM package for interacting with Doubao (豆包) web version via Playwright.
3
+ NPM package for interacting with Yiyan (文心一言) web version via Playwright. No API key required.
4
4
 
5
5
  ## Features
6
6
 
7
- - Automated browser interaction with Doubao web version
8
- - Automatic Chrome profile management for login persistence
9
- - Retry mechanism for network failures
10
- - CLI and Node.js API support
11
- - TypeScript support with full type definitions
7
+ - 🤖 Automated browser interaction with Yiyan web version
8
+ - 🔐 Login persistence via Playwright session
9
+ - 🌍 Cross-platform: Windows / Ubuntu / macOS
10
+ - 📦 Auto-download Chromium (~150MB)
11
+ - 🛠️ CLI and Node.js API support
12
+ - 📝 TypeScript support with full type definitions
12
13
 
13
- ## Prerequisites
14
+ ## Installation
14
15
 
15
- 1. **Chrome browser** installed on your system
16
- 2. **Logged into Doubao** (https://www.doubao.com/chat/) in your Chrome browser at least once
16
+ ```bash
17
+ npm install yiyan-browser-agent
18
+ ```
17
19
 
18
- The package uses your Chrome profile to maintain login state, so you need to login manually first.
20
+ Chromium will be automatically downloaded after installation.
19
21
 
20
- ## Installation
22
+ ### Ubuntu Additional Step
21
23
 
22
24
  ```bash
23
- npm install doubao-browser-agent
25
+ npx playwright install-deps chromium
24
26
  ```
25
27
 
26
- ## Node.js API Usage
28
+ This installs required system libraries for Chromium on Ubuntu.
29
+
30
+ ## Quick Start
31
+
32
+ ### CLI Usage
33
+
34
+ ```bash
35
+ # First time: login to Yiyan
36
+ yiyan-browser-agent login
37
+
38
+ # Ask a question (headless mode)
39
+ yiyan-browser-agent ask "济宁天气情况"
40
+
41
+ # With verbose output
42
+ yiyan-browser-agent ask "什么是 TypeScript?" --verbose
43
+
44
+ # Headful mode (visible browser window)
45
+ yiyan-browser-agent ask "解释 Promise" --headful
46
+
47
+ # Check login status
48
+ yiyan-browser-agent status
49
+
50
+ # Clear saved session
51
+ yiyan-browser-agent reset
52
+ ```
53
+
54
+ ### Node.js API
27
55
 
28
56
  ```typescript
29
- import { DoubaoAgent } from 'doubao-browser-agent';
57
+ import { YiyanAgent } from 'yiyan-browser-agent';
30
58
 
31
59
  // Create agent instance
32
- const agent = new DoubaoAgent({
60
+ const agent = new YiyanAgent({
33
61
  timeout: 120000, // Timeout in milliseconds (default: 120000)
34
- retryCount: 3, // Retry count (default: 3)
35
62
  });
36
63
 
37
- // Send a question and get answer
38
- try {
39
- const answer = await agent.ask('What is TypeScript?');
40
- console.log('Answer:', answer);
41
- } catch (error) {
42
- console.error('Error:', error.message);
43
- }
64
+ // Login first time
65
+ await agent.login();
66
+
67
+ // Ask a question
68
+ const answer = await agent.ask('What is TypeScript?');
69
+ console.log('Answer:', answer);
70
+
71
+ // Headful mode (visible browser)
72
+ const answer2 = await agent.ask('Explain Promise', true);
44
73
 
45
74
  // Check login status
46
75
  const status = agent.status();
47
76
  console.log('Logged in:', status.loggedIn);
48
- console.log('Profile path:', status.profilePath);
77
+ console.log('Session path:', status.sessionPath);
49
78
 
50
- // Clear saved profile (if needed)
79
+ // Clear saved session
51
80
  await agent.reset();
52
81
  ```
53
82
 
54
- ## CLI Usage
55
-
56
- ```bash
57
- # Send a question
58
- doubao-agent ask "What is TypeScript?"
59
-
60
- # With options
61
- doubao-agent ask "Explain Promise" --timeout 60000 --retry 5
83
+ ## CLI Commands
62
84
 
63
- # Check login status
64
- doubao-agent status
85
+ | Command | Description |
86
+ |---------|-------------|
87
+ | `login` | Open browser for manual login (required first time) |
88
+ | `ask "question"` | Send question and get answer |
89
+ | `status` | Check login status |
90
+ | `reset` | Clear saved session |
65
91
 
66
- # Clear saved profile
67
- doubao-agent reset
92
+ ## CLI Options
68
93
 
69
- # Show help
70
- doubao-agent --help
71
- ```
94
+ | Option | Description |
95
+ |--------|-------------|
96
+ | `--timeout <ms>` | Timeout in milliseconds (default: 120000) |
97
+ | `--headful` | Show browser window (for debugging/captcha) |
98
+ | `--verbose` | Show detailed logs |
99
+ | `--help` | Show help message |
72
100
 
73
101
  ## API Documentation
74
102
 
75
- ### `DoubaoAgent`
103
+ ### `YiyanAgent`
76
104
 
77
- Main class for interacting with Doubao.
105
+ Main class for interacting with Yiyan.
78
106
 
79
107
  #### Constructor
80
108
 
81
109
  ```typescript
82
- new DoubaoAgent(options?: DoubaoAgentOptions)
110
+ new YiyanAgent(options?: YiyanAgentOptions)
83
111
  ```
84
112
 
85
113
  **Options:**
86
114
  - `timeout?: number` - Timeout in milliseconds (default: 120000)
87
- - `retryCount?: number` - Number of retry attempts (default: 3)
88
- - `profileDir?: string` - Custom directory for storing Chrome profile copy
89
- - `chromePath?: string` - Custom Chrome executable path
115
+ - `profileDir?: string` - Custom session directory
116
+ - `verbose?: boolean` - Enable verbose logging (default: false)
90
117
 
91
118
  #### Methods
92
119
 
93
- ##### `ask(question: string): Promise<string>`
120
+ ##### `login(): Promise<void>`
121
+
122
+ Open browser window for manual login. Login state is saved automatically.
123
+
124
+ ##### `ask(question: string, headful?: boolean): Promise<string>`
94
125
 
95
- Send a question to Doubao and return the answer.
126
+ Send question and return answer.
127
+ - `headful: false` (default) - Headless mode
128
+ - `headful: true` - Visible browser window
96
129
 
97
- ##### `status(): { loggedIn: boolean; profilePath: string }`
130
+ ##### `status(): { loggedIn: boolean; sessionPath: string }`
98
131
 
99
- Check the login status (whether profile exists).
132
+ Check login status.
100
133
 
101
134
  ##### `reset(): Promise<void>`
102
135
 
103
- Clear the saved profile copy.
136
+ Clear saved session.
104
137
 
105
138
  ### Error Types
106
139
 
107
- The package throws `DoubaoAgentError` with the following error types:
108
-
109
140
  | Type | Description |
110
141
  |------|-------------|
111
- | `BROWSER_LAUNCH` | Failed to launch Chrome browser |
112
- | `PROFILE_COPY` | Failed to copy Chrome profile |
142
+ | `BROWSER_LAUNCH` | Failed to launch Chromium |
113
143
  | `TIMEOUT` | Timeout while waiting for response |
114
144
  | `NETWORK` | Network or connection error |
115
-
116
- ```typescript
117
- import { DoubaoAgentError } from 'doubao-browser-agent';
118
-
119
- try {
120
- const answer = await agent.ask('Hello');
121
- } catch (error) {
122
- if (error instanceof DoubaoAgentError) {
123
- console.log('Error type:', error.type);
124
- console.log('Error message:', error.message);
125
- }
126
- }
127
- ```
145
+ | `CAPTCHA` | Captcha detected (use --headful) |
128
146
 
129
147
  ## How It Works
130
148
 
131
- 1. On first run, the package copies your Chrome profile to a temporary directory
132
- 2. Launches Chrome in headless mode with the copied profile
133
- 3. Navigates to Doubao chat page
134
- 4. Sends your question and waits for response
135
- 5. Extracts the response and closes the browser
149
+ 1. Uses Playwright's built-in Chromium (auto-download)
150
+ 2. Launches browser with persistent session context
151
+ 3. Navigates to Yiyan chat page
152
+ 4. Sends question and waits for response
153
+ 5. Extracts response using multiple strategies
154
+ 6. Closes browser (session saved for next use)
155
+
156
+ ## Session Storage
157
+
158
+ Session data is stored in:
159
+ - Windows: `C:\Users\<user>\.yiyan-browser-agent\session`
160
+ - Linux/macOS: `~/.yiyan-browser-agent/session`
136
161
 
137
162
  ## Supported Platforms
138
163
 
139
- - Windows (Chrome paths: `C:/Program Files/Google/Chrome/Application/chrome.exe`)
140
- - macOS (Chrome paths: `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`)
141
- - Linux (Chrome paths: `/usr/bin/google-chrome`, `/usr/bin/chrome`)
164
+ | Platform | Status | Notes |
165
+ |----------|--------|-------|
166
+ | Windows | ✅ | Direct use |
167
+ | Ubuntu | ✅ | Run `npx playwright install-deps chromium` first |
168
+ | macOS | ✅ | Direct use |
169
+
170
+ ## Troubleshooting
171
+
172
+ ### Ubuntu: Browser doesn't start
173
+
174
+ ```bash
175
+ npx playwright install chromium
176
+ npx playwright install-deps chromium
177
+ yiyan-browser-agent login --verbose
178
+ ```
179
+
180
+ ### Timeout in headless mode
181
+
182
+ ```bash
183
+ # Use headful mode to see what's happening
184
+ yiyan-browser-agent ask "question" --headful --verbose
185
+ ```
186
+
187
+ ### Captcha detected
188
+
189
+ ```bash
190
+ # Use headful mode to manually solve captcha
191
+ yiyan-browser-agent ask "question" --headful
192
+ ```
193
+
194
+ ## Comparison with deepseek-browser-agent
195
+
196
+ This package follows similar architecture to [deepseek-browser-agent](https://github.com/Omar-Azam/deepseek-browser-agent):
197
+
198
+ | Feature | yiyan-browser-agent | deepseek-browser-agent |
199
+ |---------|---------------------|------------------------|
200
+ | Browser | Playwright Chromium | Playwright Chromium |
201
+ | Login | Persistent session | Persistent session |
202
+ | Platform | Win/Ubuntu/macOS | Linux |
142
203
 
143
204
  ## License
144
205
 
145
- MIT
206
+ MIT
207
+
208
+ ## Repository
209
+
210
+ https://github.com/picha/yiyan-browser-agent
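Note on the README change above: the removed lines included a `try`/`catch` example that read `error.type` from `DoubaoAgentError`, while the added lines introduce a `CAPTCHA` error type and the `ask(question, headful?)` signature without an error-handling example. The sketch below shows one way a caller might combine the two: try headless first, then retry once in a visible window when a captcha is reported. It is only an illustration. The `AskCapable` interface, `isCaptchaError`, and `askWithCaptchaFallback` are hypothetical names that are not part of the package, and it assumes that thrown errors still carry a string `type` field matching the "Error Types" table, which the 1.3.4 README does not confirm.

```typescript
// Minimal shape of the agent, taken from the README signature
// `ask(question: string, headful?: boolean): Promise<string>`.
interface AskCapable {
  ask(question: string, headful?: boolean): Promise<string>;
}

// ASSUMPTION: errors expose a string `type` field whose values match the
// "Error Types" table (e.g. 'CAPTCHA'). The new README does not name the
// error class, so this checks the field structurally instead of by class.
function isCaptchaError(error: unknown): boolean {
  return (
    typeof error === 'object' &&
    error !== null &&
    (error as { type?: unknown }).type === 'CAPTCHA'
  );
}

// Hypothetical helper (not part of the package): ask in headless mode and,
// only if a captcha is reported, retry once with a visible browser window.
async function askWithCaptchaFallback(
  agent: AskCapable,
  question: string
): Promise<string> {
  try {
    return await agent.ask(question, false);
  } catch (error) {
    if (isCaptchaError(error)) {
      return agent.ask(question, true);
    }
    // Any other error (timeout, network, launch failure) is passed through.
    throw error;
  }
}
```

If the assumption holds, a call such as `await askWithCaptchaFallback(new YiyanAgent(), 'What is TypeScript?')` would mirror the README's troubleshooting advice to rerun with `--headful` when a captcha is detected. The headful retry still needs someone to solve the captcha in the opened window.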
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "yiyan-browser-agent",
3
- "version": "1.3.3",
3
+ "version": "1.3.4",
4
4
  "description": "NPM package for interacting with Yiyan (文心一言) web version via Playwright",
5
5
  "type": "module",
6
6
  "bin": {