@elizaos/plugin-browser 1.0.0-alpha.26
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +384 -0
- package/dist/index.js +1174 -0
- package/dist/index.js.map +1 -0
- package/package.json +62 -0
- package/scripts/postinstall.js +70 -0
- package/tsup.config.ts +22 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Shaw Walters, aka Moon aka @lalalune

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,384 @@
# @elizaos/plugin-browser

Core Node.js plugin for Eliza OS that provides essential services and actions for file operations, media processing, and cloud integrations.

## Overview

The Node plugin serves as a foundational component of Eliza OS, bridging core Node.js capabilities with the Eliza ecosystem. It provides crucial services for file operations, media processing, speech synthesis, and cloud integrations, enabling both local and cloud-based functionality for Eliza agents.

## Features

- **AWS S3 Integration**: File upload and management with AWS S3
- **Browser Automation**: Web scraping and content extraction with Playwright
- **Image Processing**: Image description and analysis capabilities
- **PDF Processing**: PDF text extraction and parsing
- **Speech Synthesis**: Text-to-speech using ElevenLabs and VITS
- **Transcription**: Speech-to-text using various providers (OpenAI, Deepgram, Local)
- **Video Processing**: YouTube video download and transcription
- **LLaMA Integration**: Local LLM support with LLaMA models

## Installation

```bash
npx elizaos plugin add @elizaos/plugin-browser
```

## Configuration

The plugin requires various environment variables depending on which services you plan to use:

### Core Settings

```env
OPENAI_API_KEY=your_openai_api_key
```

### Voice Settings (Optional)

```env
ELEVENLABS_API_KEY=your_elevenlabs_api_key
ELEVENLABS_MODEL_ID=eleven_monolingual_v1
ELEVENLABS_VOICE_ID=your_voice_id
ELEVENLABS_VOICE_STABILITY=0.5
ELEVENLABS_VOICE_SIMILARITY_BOOST=0.75
ELEVENLABS_OPTIMIZE_STREAMING_LATENCY=0
ELEVENLABS_OUTPUT_FORMAT=pcm_16000
VITS_VOICE=en_US-hfc_female-medium
```

### AWS Settings (Optional)

```env
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret_key
AWS_REGION=your_aws_region
AWS_S3_BUCKET=your_s3_bucket
AWS_S3_UPLOAD_PATH=your_upload_path
AWS_S3_ENDPOINT=an_alternative_endpoint
AWS_S3_SSL_ENABLED=boolean(true|false)
AWS_S3_FORCE_PATH_STYLE=boolean(true|false)
```
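
The two `boolean(true|false)` settings reach the process as strings, so they need explicit parsing before use. The sketch below shows one way to do that. The names `parseBoolEnv`, `readS3Flags`, and `S3Flags` are hypothetical and not part of the plugin, and the fallback values (`true` for SSL, `false` for path style) are assumptions chosen for illustration only.

```typescript
// Hypothetical helper (not part of the plugin): parse "true"/"false" strings.
function parseBoolEnv(value: string | undefined, fallback: boolean): boolean {
    if (value === undefined) return fallback;
    const normalized = value.trim().toLowerCase();
    if (normalized === "true") return true;
    if (normalized === "false") return false;
    // Unrecognized values fall back instead of being coerced silently.
    return fallback;
}

// Assumed shape, for illustration only.
interface S3Flags {
    sslEnabled: boolean;
    forcePathStyle: boolean;
}

// Reads the two boolean AWS settings from an environment-like object.
function readS3Flags(env: Record<string, string | undefined>): S3Flags {
    return {
        sslEnabled: parseBoolEnv(env.AWS_S3_SSL_ENABLED, true),
        forcePathStyle: parseBoolEnv(env.AWS_S3_FORCE_PATH_STYLE, false),
    };
}
```

In a Node.js process this could be called as `readS3Flags(process.env)`. Comparing against the literal strings avoids the common mistake of treating any non-empty value, including `"false"`, as truthy.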

## Usage

```typescript
import { createNodePlugin } from "@elizaos/plugin-node";

// Initialize the plugin
const nodePlugin = createNodePlugin();

// Register with Eliza OS
elizaos.registerPlugin(nodePlugin);
```

## Services

### AwsS3Service

Handles file uploads and management with AWS S3.

### BrowserService

Provides web scraping and content extraction capabilities using Playwright.

### ImageDescriptionService

Processes and analyzes images to generate descriptions. Supports multiple providers:

- Local processing using Florence model
- OpenAI Vision API
- Google Gemini

Configuration:

```env
# For OpenAI Vision
OPENAI_API_KEY=your_openai_api_key

# For Google Gemini
GOOGLE_GENERATIVE_AI_API_KEY=your_google_api_key
```

Provider selection:

- If `imageVisionModelProvider` is set to `google/openai`, it will use this one.
- Else if `model` is set to `google/openai`, it will use this one.
- Default if nothing is set is OpenAI.
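
The precedence described above can be sketched as a small function. This is an illustration of the stated order, not the plugin's actual code. It reads `google/openai` as "either `google` or `openai`", which is an interpretation of the README. The names `VisionProvider`, `isVisionProvider`, and `resolveVisionProvider`, and the plain string arguments, are assumptions made for the example.

```typescript
// Assumed provider identifiers, based on reading "google/openai" as two options.
type VisionProvider = "google" | "openai";

function isVisionProvider(value: string | undefined): value is VisionProvider {
    return value === "google" || value === "openai";
}

// Hypothetical resolver that mirrors the three rules in order.
function resolveVisionProvider(
    imageVisionModelProvider?: string,
    model?: string,
): VisionProvider {
    // 1. An explicit vision provider setting wins.
    if (isVisionProvider(imageVisionModelProvider)) return imageVisionModelProvider;
    // 2. Otherwise fall back to the general model setting.
    if (isVisionProvider(model)) return model;
    // 3. Default when nothing usable is set.
    return "openai";
}
```

For example, `resolveVisionProvider(undefined, "google")` selects Google because the second rule applies, while `resolveVisionProvider("openai", "google")` selects OpenAI because the first rule takes priority.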

The service automatically handles different image formats, including GIFs (first frame extraction).

Features by provider:

**Local (Florence):**

- Basic image captioning
- Local processing without API calls

**OpenAI Vision:**

- Detailed image descriptions
- Text detection
- Object recognition

**Google Gemini 1.5:**

- High-quality image understanding
- Detailed descriptions with natural language
- Multi-modal context understanding
- Support for complex scenes and content

The provider can be configured through the runtime settings, allowing easy switching between providers based on your needs.

### LlamaService

Provides local LLM capabilities using LLaMA models.

### PdfService

Extracts and processes text content from PDF files.

### SpeechService

Handles text-to-speech conversion using ElevenLabs and VITS.

### TranscriptionService

Converts speech to text using various providers.

### VideoService

Processes video content, including YouTube video downloads and transcription.

## Actions

### describeImage

Analyzes and generates descriptions for images.

```typescript
// Example usage
const result = await runtime.executeAction("DESCRIBE_IMAGE", {
    imageUrl: "path/to/image.jpg",
});
```

## Dependencies

The plugin requires several peer dependencies:

- `onnxruntime-node`: 1.20.1
- `whatwg-url`: 7.1.0

And trusted dependencies:

- `onnxruntime-node`: 1.20.1
- `sharp`: 0.33.5

## Safety & Security

### File Operations

- **Path Sanitization**: All file paths are sanitized to prevent directory traversal attacks
- **File Size Limits**: Enforced limits on upload sizes
- **Type Checking**: Strict file type validation
- **Temporary File Cleanup**: Automatic cleanup of temporary files
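
The README does not show how path sanitization is done. One common approach, sketched below under that caveat, is to resolve the user-supplied path against a base directory and reject any result that falls outside it. The function name `resolveInside` is hypothetical, and only Node's built-in `path` module is used.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve userPath inside baseDir and reject
// anything that escapes it (for example via ".." segments).
function resolveInside(baseDir: string, userPath: string): string {
    const base = path.resolve(baseDir);
    const target = path.resolve(base, userPath);
    const relative = path.relative(base, target);
    const escapes =
        relative === ".." ||
        relative.startsWith(".." + path.sep) ||
        path.isAbsolute(relative); // different drive on Windows
    if (escapes) {
        throw new Error(`Path escapes base directory: ${userPath}`);
    }
    return target;
}
```

Checking the relative path, instead of testing whether the target string starts with the base string, avoids accepting sibling directories such as `/tmp/uploads-evil` for a base of `/tmp/uploads`. This sketch does not follow symbolic links, so a stricter check would also compare real paths.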

### API Keys & Credentials

- **Environment Isolation**: Sensitive credentials are isolated in environment variables
- **Access Scoping**: Services are initialized with minimum required permissions
- **Key Rotation**: Support for credential rotation without service interruption

### Media Processing

- **Resource Limits**: Memory and CPU usage limits for media processing
- **Timeout Controls**: Automatic termination of long-running processes
- **Format Validation**: Strict media format validation before processing
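
As an illustration of the timeout idea listed above, the sketch below races a task against a timer. It is a generic pattern and not taken from the plugin. The helper name `withTimeout` is hypothetical.

```typescript
// Hypothetical helper: reject if `task` does not settle within `ms` milliseconds.
async function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    });
    try {
        // Whichever promise settles first decides the outcome.
        return await Promise.race([task, timeout]);
    } finally {
        // Always clear the timer so it cannot keep the process alive.
        if (timer !== undefined) clearTimeout(timer);
    }
}
```

Note that this only stops waiting for the result. It does not stop the underlying work, so terminating a real media job would additionally require something like an abort signal or killing the child process that runs it.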

## Troubleshooting

### Common Issues

1. **Service Initialization Failures**

   ```bash
   Error: Service initialization failed
   ```

   - Verify environment variables are properly set
   - Check service dependencies are installed
   - Ensure sufficient system permissions

2. **Media Processing Errors**

   ```bash
   Error: Failed to process media file
   ```

   - Verify file format is supported
   - Check available system memory
   - Ensure ffmpeg is properly installed

3. **AWS S3 Connection Issues**

   ```bash
   Error: AWS credentials not configured
   ```

   - Verify AWS credentials are set
   - Check S3 bucket permissions
   - Ensure correct region configuration

### Debug Mode

Enable debug logging for detailed troubleshooting:

```typescript
process.env.DEBUG = "eliza:plugin-node:*";
```

### System Requirements

- Node.js 16.x or higher
- FFmpeg for media processing
- Minimum 4GB RAM recommended
- CUDA-compatible GPU (optional, for ML features)

### Performance Optimization

1. **Cache Management**

   - Regular cleanup of `content_cache` directory
   - Implement cache size limits
   - Monitor disk usage

2. **Memory Usage**

   - Configure max buffer sizes
   - Implement streaming for large files
   - Monitor memory consumption

3. **Concurrent Operations**
   - Adjust queue size limits
   - Configure worker threads
   - Monitor process pool
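
The cache size limit from the Cache Management item above could be enforced with an oldest-first eviction pass. The sketch below only decides which files to remove and leaves the file system calls to the caller. The names `CacheEntry` and `selectEvictions` are hypothetical, and nothing here is taken from the plugin's code.

```typescript
// Assumed description of one cached file, for illustration only.
interface CacheEntry {
    path: string;
    sizeBytes: number;
    modifiedMs: number; // last-modified time in milliseconds since the epoch
}

// Returns the entries to delete, oldest first, so that what remains fits maxBytes.
function selectEvictions(entries: CacheEntry[], maxBytes: number): CacheEntry[] {
    let total = entries.reduce((sum, entry) => sum + entry.sizeBytes, 0);
    // Copy before sorting so the caller's array is left untouched.
    const oldestFirst = [...entries].sort((a, b) => a.modifiedMs - b.modifiedMs);
    const evict: CacheEntry[] = [];
    for (const entry of oldestFirst) {
        if (total <= maxBytes) break;
        evict.push(entry);
        total -= entry.sizeBytes;
    }
    return evict;
}
```

A periodic cleanup job could build the entry list from the `content_cache` directory with `fs.readdir` and `fs.stat`, then pass each returned path to `fs.unlink`. Keeping the selection logic pure makes it easy to test without touching the disk.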

## Support

For issues and feature requests, please:

1. Check the troubleshooting guide above
2. Review existing GitHub issues
3. Submit a new issue with:
   - System information
   - Error logs
   - Steps to reproduce

## Future Enhancements

1. **File Operations**

   - Enhanced streaming capabilities
   - Advanced compression options
   - Batch file processing
   - File type detection
   - Metadata management
   - Version control integration

2. **Media Processing**

   - Additional video formats
   - Advanced image processing
   - Audio enhancement tools
   - Real-time processing
   - Quality optimization
   - Format conversion

3. **Cloud Integration**

   - Multi-cloud support
   - Advanced caching
   - CDN optimization
   - Auto-scaling features
   - Cost optimization
   - Backup automation

4. **Speech Services**

   - Additional voice models
   - Language expansion
   - Emotion detection
   - Voice cloning
   - Real-time synthesis
   - Custom voice training

5. **Browser Automation**

   - Headless optimization
   - Parallel processing
   - Session management
   - Cookie handling
   - Proxy support
   - Resource optimization

6. **Security Features**

   - Enhanced encryption
   - Access control
   - Audit logging
   - Threat detection
   - Rate limiting
   - Compliance tools

7. **Performance Optimization**

   - Memory management
   - CPU utilization
   - Concurrent operations
   - Resource pooling
   - Cache strategies
   - Load balancing

8. **Developer Tools**
   - Enhanced debugging
   - Testing framework
   - Documentation generator
   - CLI improvements
   - Monitoring tools
   - Integration templates

We welcome community feedback and contributions to help prioritize these enhancements.

## Contributing

Contributions are welcome! Please see the [CONTRIBUTING.md](CONTRIBUTING.md) file for more information.

## Credits

This plugin integrates with and builds upon several key technologies:

- [Node.js](https://nodejs.org/) - The core runtime environment
- [FFmpeg](https://ffmpeg.org/) - Media processing capabilities
- [ElevenLabs](https://elevenlabs.io/) - Voice synthesis
- [OpenAI](https://openai.com/) - Transcription and AI services
- [AWS S3](https://aws.amazon.com/s3/) - Cloud storage
- [Playwright](https://playwright.dev/) - Browser automation
- [LLaMA](https://github.com/facebookresearch/llama) - Local language models
- [VITS](https://github.com/jaywalnut310/vits) - Voice synthesis
- [Deepgram](https://deepgram.com/) - Speech recognition
- [Sharp](https://sharp.pixelplumbing.com/) - Image processing

Special thanks to:

- The Node.js community and all the open-source contributors who make these integrations possible.
- The Eliza community for their contributions and feedback.

For more information about Node.js capabilities:

- [Node.js Documentation](https://nodejs.org/en/docs/)
- [Node.js Developer Portal](https://nodejs.org/en/about/)
- [Node.js GitHub Repository](https://github.com/nodejs/node)

## License

This plugin is part of the Eliza project. See the main project repository for license information.