@elizaos/plugin-local-ai 1.0.0-beta.48 → 1.0.0-beta.49
- package/README.md +81 -59
- package/dist/index.js +502 -1019
- package/dist/index.js.map +1 -1
- package/package.json +5 -4
package/README.md
CHANGED
@@ -12,46 +12,62 @@ Add the plugin to your character configuration:
 
 ## Configuration
 
-The plugin
-
-```json
-"settings": {
-"USE_LOCAL_AI": true,
-"USE_STUDIOLM_TEXT_MODELS": false,
-
-"STUDIOLM_SERVER_URL": "http://localhost:1234",
-"STUDIOLM_SMALL_MODEL": "lmstudio-community/deepseek-r1-distill-qwen-1.5b",
-"STUDIOLM_MEDIUM_MODEL": "deepseek-r1-distill-qwen-7b",
-"STUDIOLM_EMBEDDING_MODEL": false
-}
-```
+The plugin is configured using environment variables (typically set in a `.env` file or via your deployment settings):
 
 Or in `.env` file:
 
 ```env
-#
-
-USE_STUDIOLM_TEXT_MODELS=false
-
-# StudioLM Configuration
-STUDIOLM_SERVER_URL=http://localhost:1234
-STUDIOLM_SMALL_MODEL=lmstudio-community/deepseek-r1-distill-qwen-1.5b
-STUDIOLM_MEDIUM_MODEL=deepseek-r1-distill-qwen-7b
-STUDIOLM_EMBEDDING_MODEL=false
-```
+# Optional: Specify a custom directory for models (GGUF files)
+# MODELS_DIR=/path/to/your/models
 
-
+# Optional: Specify a custom directory for caching other components (tokenizers, etc.)
+# CACHE_DIR=/path/to/your/cache
 
-
+# Optional: Specify filenames for the text generation and embedding models within the models directory
+# LOCAL_SMALL_MODEL=my-custom-small-model.gguf
+# LOCAL_LARGE_MODEL=my-custom-large-model.gguf
+# LOCAL_EMBEDDING_MODEL=my-custom-embedding-model.gguf
 
-
+# Optional: Fallback dimension size for embeddings if generation fails. Defaults to the model's default (e.g., 384).
+# LOCAL_EMBEDDING_DIMENSIONS=384
+```
 
-
+### Configuration Options
 
-- `
-- `
-- `
-- `
+- `MODELS_DIR` (Optional): Specifies a custom directory for storing model files (GGUF format). If not set, defaults to `~/.eliza/models`.
+- `CACHE_DIR` (Optional): Specifies a custom directory for caching other components like tokenizers. If not set, defaults to `~/.eliza/cache`.
+- `LOCAL_SMALL_MODEL` (Optional): Specifies the filename for the small text generation model (e.g., `DeepHermes-3-Llama-3-3B-Preview-q4.gguf`) located in the models directory.
+- `LOCAL_LARGE_MODEL` (Optional): Specifies the filename for the large text generation model (e.g., `DeepHermes-3-Llama-3-8B-q4.gguf`) located in the models directory.
+- `LOCAL_EMBEDDING_MODEL` (Optional): Specifies the filename for the text embedding model (e.g., `bge-small-en-v1.5.Q4_K_M.gguf`) located in the models directory.
+- `LOCAL_EMBEDDING_DIMENSIONS` (Optional): Defines the expected dimension size for text embeddings. This is primarily used as a fallback dimension if the embedding model fails to generate an embedding. If not set, it defaults to the embedding model's native dimension size (e.g., 384 for `bge-small-en-v1.5.Q4_K_M.gguf`).
+
+## Prerequisites
+
+### FFmpeg for Audio Transcription
+
+The audio transcription feature (`ModelType.TRANSCRIPTION`) relies on **FFmpeg** to process audio files. If FFmpeg is not installed or not found in your system's PATH, transcription will fail.
+
+**Installation:**
+
+- **macOS (using Homebrew):**
+```bash
+brew install ffmpeg
+```
+- **Linux (Debian/Ubuntu):**
+```bash
+sudo apt-get update && sudo apt-get install ffmpeg
+```
+- **Linux (Fedora):**
+```bash
+sudo dnf install ffmpeg
+```
+- **Windows (using Chocolatey):**
+```bash
+choco install ffmpeg
+```
+Alternatively, download FFmpeg from the [official FFmpeg website](https://ffmpeg.org/download.html) and add the `bin` directory (containing `ffmpeg.exe`) to your system's PATH environment variable.
+
+After installation, ensure that the `ffmpeg` command is accessible from your terminal. You may need to restart your terminal or your application for the changes to take effect.
 
 ## Features
 
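Reviewer note on the hunk above: the new README documents defaults for `MODELS_DIR` (`~/.eliza/models`) and `CACHE_DIR` (`~/.eliza/cache`) and describes the model variables as filenames inside the models directory. A minimal sketch of how those documented rules could be resolved is shown below. The function name `resolveLocalAiPaths` and the `LocalAiPaths` shape are hypothetical and are not taken from the plugin's source; only the variable names and default directories come from the README text.

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical result shape; the plugin's real internal config type is not shown in this diff.
interface LocalAiPaths {
  modelsDir: string;
  cacheDir: string;
  smallModelPath: string | undefined;
  embeddingDimensions: number | undefined;
}

// Resolve settings from an env-like object using the defaults the README documents.
function resolveLocalAiPaths(
  env: Record<string, string | undefined>,
  homeDir: string = os.homedir()
): LocalAiPaths {
  // Documented defaults: ~/.eliza/models and ~/.eliza/cache.
  const modelsDir = env.MODELS_DIR || path.join(homeDir, '.eliza', 'models');
  const cacheDir = env.CACHE_DIR || path.join(homeDir, '.eliza', 'cache');

  // Model variables are filenames located in the models directory.
  // When unset, this sketch returns undefined and leaves the choice of a default model to the caller.
  const smallModelPath = env.LOCAL_SMALL_MODEL
    ? path.join(modelsDir, env.LOCAL_SMALL_MODEL)
    : undefined;

  // LOCAL_EMBEDDING_DIMENSIONS is optional; undefined means "use the model's native size".
  const parsed = Number.parseInt(env.LOCAL_EMBEDDING_DIMENSIONS ?? '', 10);
  const embeddingDimensions = Number.isNaN(parsed) ? undefined : parsed;

  return { modelsDir, cacheDir, smallModelPath, embeddingDimensions };
}
```

Passing `process.env` as the first argument would apply the same rules to a loaded `.env` file. The sketch covers only `LOCAL_SMALL_MODEL`; the large and embedding model filenames would presumably follow the same pattern.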
@@ -59,55 +75,61 @@ The plugin provides these model classes:
 
 - `TEXT_SMALL`: Fast, efficient text generation using smaller models
 - `TEXT_LARGE`: More capable text generation using larger models
+- `TEXT_EMBEDDING`: Generates text embeddings locally.
 - `IMAGE_DESCRIPTION`: Local image analysis using Florence-2 vision model
 - `TEXT_TO_SPEECH`: Local text-to-speech synthesis
 - `TRANSCRIPTION`: Local audio transcription using Whisper
 
-###
+### Text Generation
 
 ```typescript
-
-
-'
-
+// Using small model
+const smallResponse = await runtime.useModel(ModelType.TEXT_SMALL, {
+prompt: 'Generate a short response',
+stopSequences: [],
+});
+
+// Using large model
+const largeResponse = await runtime.useModel(ModelType.TEXT_LARGE, {
+prompt: 'Generate a detailed response',
+stopSequences: [],
+});
 ```
 
 ### Text-to-Speech
 
+This plugin uses the [`transformers.js`](https://huggingface.co/docs/transformers.js) library for Text-to-Speech synthesis, running directly in the Node.js environment without external Python dependencies for this feature.
+
 ```typescript
 const audioStream = await runtime.useModel(ModelType.TEXT_TO_SPEECH, 'Text to convert to speech');
 ```
 
-
+**Current Implementation Details:**
 
-
-
-
+- **Model:** By default, it uses the [`Xenova/speecht5_tts`](https://huggingface.co/Xenova/speecht5_tts) model (ONNX format), which is optimized for `transformers.js`.
+- **Engine:** `@huggingface/transformers` library.
+- **Speaker:** It uses a default speaker embedding for `SpeechT5`. The specific voice cannot be configured through environment variables currently.
+- **Caching:** The ONNX model files and the default speaker embedding will be automatically downloaded and cached by `transformers.js` (typically in `~/.cache/huggingface/hub` or as configured by `transformers.js` environment variables) on first use.
 
-### Text
+### Text Embedding
 
 ```typescript
-
-
-context: 'Generate a short response',
-stopSequences: [],
-});
-
-// Using large model
-const largeResponse = await runtime.useModel(ModelType.TEXT_LARGE, {
-context: 'Generate a detailed response',
-stopSequences: [],
+const embedding = await runtime.useModel(ModelType.TEXT_EMBEDDING, {
+text: 'Text to get embedding for',
 });
 ```
 
-
+### Image Analysis
 
-
+```typescript
+const { title, description } = await runtime.useModel(
+ModelType.IMAGE_DESCRIPTION,
+'https://example.com/image.jpg'
+);
+```
 
-
-- Supports chat completion API similar to OpenAI
-- Configure with `USE_STUDIOLM_TEXT_MODELS=true`
-- Supports both small and medium-sized models
-- Optional embedding model support
+### Audio Transcription
 
-
+```typescript
+const transcription = await runtime.useModel(ModelType.TRANSCRIPTION, audioBuffer);
+```
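Reviewer note on the hunk above: the new Prerequisites section states that `ModelType.TRANSCRIPTION` fails when FFmpeg is not on the PATH, so a caller may want to check for it before invoking transcription. The following is a minimal sketch of such a preflight check using Node's built-in `child_process` module. The helper name `isFfmpegAvailable` is hypothetical; nothing in this diff shows whether the plugin performs a similar check internally.

```typescript
import { spawnSync } from 'node:child_process';

// Returns true when `<command> -version` can be launched and exits with status 0.
// Hypothetical helper for illustration; not part of the plugin's documented API.
function isFfmpegAvailable(command: string = 'ffmpeg'): boolean {
  const result = spawnSync(command, ['-version'], { stdio: 'ignore' });
  // `result.error` is set when the binary cannot be spawned (for example ENOENT when it is not on PATH).
  return result.error === undefined && result.status === 0;
}

// Example use before requesting a transcription.
if (!isFfmpegAvailable()) {
  console.warn('FFmpeg was not found on PATH; ModelType.TRANSCRIPTION calls are expected to fail.');
}
```

Running the check once at startup, rather than before every call, keeps the cost of spawning a process out of the request path.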