aia 0.9.16 → 0.9.17
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.version +1 -1
- data/CHANGELOG.md +50 -0
- data/README.md +77 -0
- data/docs/faq.md +83 -1
- data/docs/guides/local-models.md +304 -0
- data/docs/guides/models.md +157 -0
- data/lib/aia/chat_processor_service.rb +20 -5
- data/lib/aia/directives/models.rb +135 -5
- data/lib/aia/ruby_llm_adapter.rb +139 -9
- data/lib/aia/session.rb +27 -16
- data/lib/extensions/ruby_llm/provider_fix.rb +34 -0
- data/mkdocs.yml +1 -0
- metadata +31 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ae5c7e9497837763b36226f8fc44aeb7dffe692c964e450938ff83aea043c10a
+  data.tar.gz: b839a219cb3f6e5b34d9aa02ff7f3ced77ff039816b6b2a3d9ef79076d32c302
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7df6fe4bbf0fd2ce33b9827390340d3cf1f8fd5ad5b4f52ceb343769dee6d699ab87c5dd2686472cd480bf5ac1e20b612f5c69e70b72eab68735560601423d7c
+  data.tar.gz: 9fa99822319f6e3ff213f62ba0c26121c045e4fa96035596df46b4d723df72cecb19c1abb98f760e6b435941a3cc9b869f6401da771452c3d226930b4f4ea104
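For reference, checksums.yaml covers the two archives inside the published .gem (a .gem file is a plain tar containing metadata.gz, data.tar.gz, and checksums.yaml.gz). A minimal verification sketch, assuming the gem has been fetched with `gem fetch aia -v 0.9.17` and unpacked with `tar xf aia-0.9.17.gem`; the file names come from the RubyGems package format, not from this diff:

```ruby
# Hypothetical verification sketch (not part of the gem): compares the SHA256
# entries from checksums.yaml.gz against the unpacked archives.
require 'digest'
require 'yaml'
require 'zlib'

expected = nil
Zlib::GzipReader.open('checksums.yaml.gz') { |gz| expected = YAML.safe_load(gz.read) }

%w[metadata.gz data.tar.gz].each do |file|
  actual = Digest::SHA256.file(file).hexdigest
  status = actual == expected['SHA256'][file] ? 'OK' : 'MISMATCH'
  puts "#{file}: #{status}"
end
```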
data/.version
CHANGED
@@ -1 +1 @@
-0.9.16
+0.9.17
data/CHANGELOG.md
CHANGED
@@ -1,6 +1,56 @@
 # Changelog
 ## [Unreleased]
 
+### [0.9.17] 2025-10-04
+
+#### New Features
+- **NEW FEATURE**: Enhanced local model support with comprehensive validation and error handling
+- **NEW FEATURE**: Added `lms/` prefix support for LM Studio models with automatic validation against loaded models
+- **NEW FEATURE**: Enhanced `//models` directive to auto-detect and display local providers (Ollama and LM Studio)
+- **NEW FEATURE**: Added model name prefix display in error messages for LM Studio (`lms/` prefix)
+
+#### Improvements
+- **ENHANCEMENT**: Improved LM Studio integration with model validation against `/v1/models` endpoint
+- **ENHANCEMENT**: Enhanced error messages showing exact model names with correct prefixes when validation fails
+- **ENHANCEMENT**: Added environment variable support for custom LM Studio API base (`LMS_API_BASE`)
+- **ENHANCEMENT**: Improved `//models` directive output formatting for local models with size and modified date for Ollama
+- **ENHANCEMENT**: Enhanced multi-model support to seamlessly mix local and cloud models
+
+#### Documentation
+- **DOCUMENTATION**: Added comprehensive local model documentation to README.md
+- **DOCUMENTATION**: Created new docs/guides/local-models.md guide covering Ollama and LM Studio setup, usage, and troubleshooting
+- **DOCUMENTATION**: Updated docs/guides/models.md with local provider sections including comparison table and workflow examples
+- **DOCUMENTATION**: Enhanced docs/faq.md with 5 new FAQ entries covering local model usage, differences, and error handling
+
+#### Technical Changes
+- Enhanced RubyLLMAdapter with LM Studio model validation (lib/aia/ruby_llm_adapter.rb)
+- Updated models directive to query local provider endpoints (lib/aia/directives/models.rb)
+- Added provider_fix extension for RubyLLM compatibility (lib/extensions/ruby_llm/provider_fix.rb)
+- Added comprehensive test coverage with 22 new tests for local providers
+- Updated dependencies: ruby_llm, webmock, crack, rexml
+- Bumped Ruby bundler version to 2.7.2
+
+#### Bug Fixes
+- **BUG FIX**: Fixed missing `lms/` prefix in LM Studio model listings
+- **BUG FIX**: Fixed model validation error messages to show usable model names with correct prefixes
+- **BUG FIX**: Fixed Ollama endpoint to use native API (removed incorrect `/v1` suffix)
+
+#### Usage Examples
+```bash
+# Use LM Studio with validation
+aia --model lms/qwen/qwen3-coder-30b my_prompt
+
+# Use Ollama
+aia --model ollama/llama3.2 --chat
+
+# Mix local and cloud models
+aia --model ollama/llama3.2,gpt-4o-mini,claude-3-sonnet my_prompt
+
+# List available local models
+aia --model ollama/llama3.2 --chat
+> //models
+```
+
 ### [0.9.16] 2025-09-26
 
 #### New Features
data/README.md
CHANGED
@@ -563,6 +563,83 @@ Model Details:
 - **Error Handling**: Invalid models are reported but don't prevent valid models from working
 - **Batch Mode Support**: Multi-model responses are properly formatted in output files
 
+### Local Model Support
+
+AIA supports running local AI models through Ollama and LM Studio, providing privacy, offline capability, and cost savings.
+
+#### Ollama Integration
+
+[Ollama](https://ollama.ai) runs AI models locally on your machine.
+
+```bash
+# Install Ollama (macOS)
+brew install ollama
+
+# Pull a model
+ollama pull llama3.2
+
+# Use with AIA - prefix model name with 'ollama/'
+aia --model ollama/llama3.2 my_prompt
+
+# In chat mode
+aia --chat --model ollama/llama3.2
+
+# Combine with cloud models
+aia --model ollama/llama3.2,gpt-4o-mini --consensus my_prompt
+```
+
+**Environment Variables:**
+```bash
+# Optional: Set custom Ollama API endpoint
+export OLLAMA_API_BASE=http://localhost:11434
+```
+
+#### LM Studio Integration
+
+[LM Studio](https://lmstudio.ai) provides a desktop application for running local models with an OpenAI-compatible API.
+
+```bash
+# 1. Install LM Studio from lmstudio.ai
+# 2. Download and load a model in LM Studio
+# 3. Start the local server in LM Studio
+
+# Use with AIA - prefix model name with 'lms/'
+aia --model lms/qwen/qwen3-coder-30b my_prompt
+
+# In chat mode
+aia --chat --model lms/your-model-name
+
+# Mix local and cloud models
+aia --model lms/local-model,gpt-4o-mini my_prompt
+```
+
+**Environment Variables:**
+```bash
+# Optional: Set custom LM Studio API endpoint (default: http://localhost:1234/v1)
+export LMS_API_BASE=http://localhost:1234/v1
+```
+
+#### Listing Local Models
+
+The `//models` directive automatically detects local providers and queries their endpoints:
+
+```bash
+# In a prompt file or chat session
+//models
+
+# Output will show:
+# - Ollama models from http://localhost:11434/api/tags
+# - LM Studio models from http://localhost:1234/v1/models
+# - Cloud models from RubyLLM database
+```
+
+**Benefits of Local Models:**
+- 🔒 **Privacy**: No data sent to external servers
+- 💰 **Cost**: Zero API costs after initial setup
+- 🚀 **Speed**: No network latency
+- 📡 **Offline**: Works without internet connection
+- 🔧 **Control**: Full control over model and parameters
+
 ### Shell Integration
 
 AIA automatically processes shell patterns in prompts:
data/docs/faq.md
CHANGED
@@ -23,7 +23,89 @@ export ANTHROPIC_API_KEY="your_key_here"
 ```
 
 ### Q: Can I use AIA without internet access?
-**A:** Yes
+**A:** Yes! AIA supports two local model providers for complete offline operation:
+
+1. **Ollama**: Run open-source models locally
+   ```bash
+   # Install and use Ollama
+   brew install ollama
+   ollama pull llama3.2
+   aia --model ollama/llama3.2 --chat
+   ```
+
+2. **LM Studio**: GUI-based local model runner
+   ```bash
+   # Download from https://lmstudio.ai
+   # Load a model and start local server
+   aia --model lms/your-model-name --chat
+   ```
+
+Both options provide full AI functionality without internet connection, perfect for:
+- 🔒 Private/sensitive data processing
+- ✈️ Offline/travel use
+- 💰 Zero API costs
+- 🏠 Air-gapped environments
+
+### Q: How do I list available local models?
+**A:** Use the `//models` directive in a chat session or prompt:
+
+```bash
+# Start chat with any local model
+aia --model ollama/llama3.2 --chat
+
+# In the chat session
+> //models
+
+# Output shows:
+# - Ollama models from local installation
+# - LM Studio models currently loaded
+# - Cloud models from RubyLLM database
+```
+
+For Ollama specifically: `ollama list`
+For LM Studio: Check the Models tab in the LM Studio GUI
+
+### Q: What's the difference between Ollama and LM Studio?
+**A:**
+- **Ollama**: Command-line focused, quick model switching, multiple models available
+- **LM Studio**: GUI application, visual model management, one model at a time
+
+Choose **Ollama** if you prefer CLI tools and automation.
+Choose **LM Studio** if you want a visual interface and easier model discovery.
+
+Both work great with AIA!
+
+### Q: Can I mix local and cloud models?
+**A:** Absolutely! This is a powerful feature:
+
+```bash
+# Compare local vs cloud responses
+aia --model ollama/llama3.2,gpt-4o-mini my_prompt
+
+# Get consensus across local and cloud models
+aia --model ollama/mistral,lms/qwen-coder,claude-3-sonnet --consensus decision
+
+# Use local for drafts, cloud for refinement
+aia --model ollama/llama3.2 --out_file draft.md initial_analysis
+aia --model gpt-4 --include draft.md final_report
+```
+
+### Q: Why does my lms/ model show an error?
+**A:** Common causes:
+
+1. **Model not loaded in LM Studio**: Load a model first
+2. **Wrong model name**: AIA validates against available models and shows the exact names to use
+3. **Server not running**: Start the local server in LM Studio
+4. **Wrong prefix**: Always use `lms/` prefix with full model name
+
+If you get an error, AIA will show you the exact model names to use:
+```
+❌ 'wrong-name' is not a valid LM Studio model.
+
+Available LM Studio models:
+  - lms/qwen/qwen3-coder-30b
+  - lms/llama-3.2-3b-instruct
+```
 
 ## Basic Usage
 
data/docs/guides/local-models.md
ADDED
@@ -0,0 +1,304 @@
+# Local Models Guide
+
+Complete guide to using Ollama and LM Studio with AIA for local AI processing.
+
+## Why Use Local Models?
+
+### Benefits
+
+- 🔒 **Privacy**: All processing happens on your machine
+- 💰 **Cost**: No API fees
+- 🚀 **Speed**: No network latency
+- 📡 **Offline**: Works without internet
+- 🔧 **Control**: Choose exact model and parameters
+- 📦 **Unlimited**: No rate limits or quotas
+
+### Use Cases
+
+- Processing confidential business data
+- Working with personal information
+- Development and testing
+- High-volume batch processing
+- Air-gapped environments
+- Learning and experimentation
+
+## Ollama Setup
+
+### Installation
+
+```bash
+# macOS
+brew install ollama
+
+# Linux
+curl -fsSL https://ollama.ai/install.sh | sh
+
+# Windows
+# Download installer from https://ollama.ai
+```
+
+### Model Management
+
+```bash
+# List available models
+ollama list
+
+# Pull new models
+ollama pull llama3.2
+ollama pull mistral
+ollama pull codellama
+
+# Remove models
+ollama rm model-name
+
+# Show model info
+ollama show llama3.2
+```
+
+### Using with AIA
+
+```bash
+# Basic usage - prefix with 'ollama/'
+aia --model ollama/llama3.2 my_prompt
+
+# Chat mode
+aia --chat --model ollama/mistral
+
+# Batch processing
+for file in *.md; do
+  aia --model ollama/llama3.2 summarize "$file"
+done
+```
+
+### Recommended Ollama Models
+
+#### General Purpose
+- `llama3.2` - Versatile, good quality
+- `llama3.2:70b` - Higher quality, slower
+- `mistral` - Fast, efficient
+
+#### Code
+- `qwen2.5-coder` - Excellent for code
+- `codellama` - Code-focused
+- `deepseek-coder` - Programming tasks
+
+#### Specialized
+- `mixtral` - High performance
+- `phi3` - Small, efficient
+- `gemma2` - Google's open model
+
+## LM Studio Setup
+
+### Installation
+
+1. Download from https://lmstudio.ai
+2. Install the application
+3. Launch LM Studio
+
+### Model Management
+
+1. Click "🔍 Search" tab
+2. Browse or search for models
+3. Click download button
+4. Wait for download to complete
+
+### Starting Local Server
+
+1. Click "💻 Local Server" tab
+2. Select loaded model from dropdown
+3. Click "Start Server"
+4. Note the endpoint (default: http://localhost:1234/v1)
+
+### Using with AIA
+
+```bash
+# Prefix model name with 'lms/'
+aia --model lms/qwen/qwen3-coder-30b my_prompt
+
+# Chat mode
+aia --chat --model lms/llama-3.2-3b-instruct
+
+# AIA validates model names
+# Error shows available models if name is wrong
+```
+
+### Popular LM Studio Models
+
+- `lmsys/vicuna-7b` - Conversation
+- `TheBloke/Llama-2-7B-Chat-GGUF` - Chat
+- `TheBloke/CodeLlama-7B-GGUF` - Code
+- `qwen/qwen3-coder-30b` - Advanced coding
+
+## Configuration
+
+### Environment Variables
+
+```bash
+# Ollama custom endpoint
+export OLLAMA_API_BASE=http://localhost:11434
+
+# LM Studio custom endpoint
+export LMS_API_BASE=http://localhost:1234/v1
+```
+
+### Config File
+
+```yaml
+# ~/.aia/config.yml
+model: ollama/llama3.2
+
+# Or for LM Studio
+model: lms/qwen/qwen3-coder-30b
+```
+
+### In Prompts
+
+```
+//config model = ollama/mistral
+//config temperature = 0.7
+
+Your prompt here...
+```
+
+## Listing Models
+
+### In Chat Session
+
+```bash
+aia --model ollama/llama3.2 --chat
+> //models
+```
+
+**Ollama Output:**
+```
+Local LLM Models:
+
+Ollama Models (http://localhost:11434):
+------------------------------------------------------------
+- ollama/llama3.2:latest (size: 2.0 GB, modified: 2024-10-01)
+- ollama/mistral:latest (size: 4.1 GB, modified: 2024-09-28)
+
+2 Ollama model(s) available
+```
+
+**LM Studio Output:**
+```
+Local LLM Models:
+
+LM Studio Models (http://localhost:1234/v1):
+------------------------------------------------------------
+- lms/qwen/qwen3-coder-30b
+- lms/llama-3.2-3b-instruct
+
+2 LM Studio model(s) available
+```
+
+## Advanced Usage
+
+### Mixed Local/Cloud Models
+
+```bash
+# Compare local and cloud responses
+aia --model ollama/llama3.2,gpt-4o-mini,claude-3-sonnet analysis_prompt
+
+# Get consensus
+aia --model ollama/llama3.2,ollama/mistral,gpt-4 --consensus decision_prompt
+```
+
+### Local-First Workflow
+
+```bash
+# 1. Process with local model (private)
+aia --model ollama/llama3.2 --out_file draft.md sensitive_data.txt
+
+# 2. Review and sanitize draft.md manually
+
+# 3. Polish with cloud model
+aia --model gpt-4 --include draft.md final_output
+```
+
+### Cost Optimization
+
+```bash
+# Bulk tasks with local model
+for i in {1..1000}; do
+  aia --model ollama/mistral --out_file "result_$i.md" process "input_$i.txt"
+done
+
+# No API costs!
+```
+
+## Troubleshooting
+
+### Ollama Issues
+
+**Problem:** "Cannot connect to Ollama"
+```bash
+# Check if Ollama is running
+ollama list
+
+# Start Ollama service (if needed)
+ollama serve
+```
+
+**Problem:** "Model not found"
+```bash
+# List installed models
+ollama list
+
+# Pull missing model
+ollama pull llama3.2
+```
+
+### LM Studio Issues
+
+**Problem:** "Cannot connect to LM Studio"
+1. Ensure LM Studio is running
+2. Check local server is started
+3. Verify endpoint in settings
+
+**Problem:** "Model validation failed"
+- Check exact model name in LM Studio
+- Ensure model is loaded (not just downloaded)
+- Use full model path with `lms/` prefix
+
+**Problem:** "Model not listed"
+1. Load model in LM Studio
+2. Start local server
+3. Run `//models` directive
+
+### Performance Issues
+
+**Slow responses:**
+- Use smaller models (7B instead of 70B)
+- Reduce max_tokens
+- Check system resources (CPU/RAM/GPU)
+
+**High memory usage:**
+- Close other applications
+- Use quantized models (Q4, Q5)
+- Try smaller model variants
+
+## Best Practices
+
+### Security
+✅ Keep local models for sensitive data
+✅ Use cloud models for general tasks
+✅ Review outputs before sharing externally
+
+### Performance
+✅ Use appropriate model size for task
+✅ Leverage GPU if available
+✅ Cache common responses
+
+### Cost Management
+✅ Use local models for development/testing
+✅ Use local models for high-volume processing
+✅ Reserve cloud models for critical tasks
+
+## Related Documentation
+
+- [Models Guide](models.md)
+- [Configuration](../configuration.md)
+- [Chat Mode](chat.md)
+- [CLI Reference](../cli-reference.md)
data/docs/guides/models.md
CHANGED
@@ -373,6 +373,163 @@ premium_models:
 - **Llama 2**: Open-source general purpose
 - **Mixtral**: High-performance open model
 
+## Local Model Providers
+
+### Ollama
+
+[Ollama](https://ollama.ai) enables running open-source AI models locally.
+
+#### Setup
+
+```bash
+# Install Ollama
+brew install ollama # macOS
+# or download from https://ollama.ai
+
+# Pull models
+ollama pull llama3.2
+ollama pull mistral
+ollama pull qwen2.5-coder
+
+# List available models
+ollama list
+```
+
+#### Usage with AIA
+
+```bash
+# Use Ollama model (prefix with 'ollama/')
+aia --model ollama/llama3.2 my_prompt
+
+# Chat mode
+aia --chat --model ollama/mistral
+
+# List Ollama models from AIA
+aia --model ollama/llama3.2 --chat
+> //models
+
+# Combine with cloud models for comparison
+aia --model ollama/llama3.2,gpt-4o-mini,claude-3-sonnet my_prompt
+```
+
+#### Configuration
+
+```yaml
+# ~/.aia/config.yml
+model: ollama/llama3.2
+
+# Optional: Custom Ollama endpoint
+# Set via environment variable
+export OLLAMA_API_BASE=http://custom-host:11434
+```
+
+#### Popular Ollama Models
+
+- **llama3.2**: Latest Llama model, good general purpose
+- **llama3.2:70b**: Larger version, better quality
+- **mistral**: Fast and efficient
+- **mixtral**: High-performance mixture of experts
+- **qwen2.5-coder**: Specialized for code
+- **codellama**: Code-focused model
+
+### LM Studio
+
+[LM Studio](https://lmstudio.ai) provides a GUI for running local models with OpenAI-compatible API.
+
+#### Setup
+
+1. Download LM Studio from https://lmstudio.ai
+2. Install and launch the application
+3. Browse and download models within LM Studio
+4. Start the local server:
+   - Click "Local Server" tab
+   - Click "Start Server"
+   - Default endpoint: http://localhost:1234/v1
+
+#### Usage with AIA
+
+```bash
+# Use LM Studio model (prefix with 'lms/')
+aia --model lms/qwen/qwen3-coder-30b my_prompt
+
+# Chat mode
+aia --chat --model lms/llama-3.2-3b-instruct
+
+# List LM Studio models from AIA
+aia --model lms/any-loaded-model --chat
+> //models
+
+# Model validation
+# AIA validates model names against LM Studio's loaded models
+# If you specify an invalid model, you'll see:
+# ❌ 'model-name' is not a valid LM Studio model.
+#
+# Available LM Studio models:
+#   - lms/qwen/qwen3-coder-30b
+#   - lms/llama-3.2-3b-instruct
+```
+
+#### Configuration
+
+```yaml
+# ~/.aia/config.yml
+model: lms/qwen/qwen3-coder-30b
+
+# Optional: Custom LM Studio endpoint
+# Set via environment variable
+export LMS_API_BASE=http://localhost:1234/v1
+```
+
+#### Tips for LM Studio
+
+- Use the model name exactly as shown in LM Studio
+- Prefix all model names with `lms/`
+- Ensure the local server is running before use
+- LM Studio supports one model at a time (unlike Ollama)
+
+### Comparison: Ollama vs LM Studio
+
+| Feature | Ollama | LM Studio |
+|---------|--------|-----------|
+| **Interface** | Command-line | GUI + CLI |
+| **Model Management** | Via CLI (`ollama pull`) | GUI download |
+| **API Compatibility** | Custom + OpenAI-like | OpenAI-compatible |
+| **Multiple Models** | Yes (switch quickly) | One at a time |
+| **Platform** | macOS, Linux, Windows | macOS, Windows |
+| **Model Format** | GGUF, custom | GGUF |
+| **Best For** | CLI users, automation | GUI users, experimentation |
+
+### Local + Cloud Model Workflows
+
+#### Privacy-First Workflow
+```bash
+# Use local model for sensitive data
+aia --model ollama/llama3.2 --out_file draft.md process_private_data.txt
+
+# Use cloud model for final polish (on sanitized data)
+aia --model gpt-4 --include draft.md refine_output
+```
+
+#### Cost-Optimization Workflow
+```bash
+# Bulk processing with local model (free)
+for file in *.txt; do
+  aia --model ollama/mistral --out_file "${file%.txt}_summary.md" summarize "$file"
+done
+
+# Final review with premium cloud model
+aia --model gpt-4 --include *_summary.md final_report
+```
+
+#### Consensus with Mixed Models
+```bash
+# Get consensus from local and cloud models
+aia --model ollama/llama3.2,ollama/mistral,gpt-4o-mini --consensus decision_prompt
+
+# Or individual responses to compare
+aia --model ollama/llama3.2,lms/qwen-coder,claude-3-sonnet --no-consensus code_review.py
+```
+
 ## Troubleshooting Models
 
 ### Common Issues
data/lib/aia/chat_processor_service.rb
CHANGED
@@ -28,25 +28,37 @@ module AIA
        result = send_to_client(prompt)
      end
 
+      # Debug output to understand what we're receiving
+      puts "[DEBUG ChatProcessor] Result class: #{result.class}" if AIA.config.debug
+      puts "[DEBUG ChatProcessor] Result inspect: #{result.inspect[0..500]}..." if AIA.config.debug
+
      # Preserve token information if available for metrics
      if result.is_a?(String)
+        puts "[DEBUG ChatProcessor] Processing as String" if AIA.config.debug
        { content: result, metrics: nil }
      elsif result.respond_to?(:multi_model?) && result.multi_model?
+        puts "[DEBUG ChatProcessor] Processing as multi-model response" if AIA.config.debug
        # Handle multi-model response with metrics
        {
          content: result.content,
          metrics: nil, # Individual model metrics handled separately
          multi_metrics: result.metrics_list
        }
-
+      elsif result.respond_to?(:content)
+        puts "[DEBUG ChatProcessor] Processing as standard response with content method" if AIA.config.debug
+        # Standard response object with content method
        {
          content: result.content,
          metrics: {
-            input_tokens: result.input_tokens,
-            output_tokens: result.output_tokens,
-            model_id: result.model_id
+            input_tokens: result.respond_to?(:input_tokens) ? result.input_tokens : nil,
+            output_tokens: result.respond_to?(:output_tokens) ? result.output_tokens : nil,
+            model_id: result.respond_to?(:model_id) ? result.model_id : nil
          }
        }
+      else
+        puts "[DEBUG ChatProcessor] Processing as fallback (unexpected type)" if AIA.config.debug
+        # Fallback for unexpected response types
+        { content: result.to_s, metrics: nil }
      end
    end
 
@@ -56,7 +68,10 @@
    def send_to_client(conversation)
      maybe_change_model
 
-      AIA.client.chat(conversation)
+      puts "[DEBUG ChatProcessor] Sending conversation to client: #{conversation.inspect[0..500]}..." if AIA.config.debug
+      result = AIA.client.chat(conversation)
+      puts "[DEBUG ChatProcessor] Client returned: #{result.class} - #{result.inspect[0..500]}..." if AIA.config.debug
+      result
    end
 
 
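The hunk above replaces direct access to token fields with `respond_to?` guards, so error strings and other non-standard return values no longer raise NoMethodError when metrics are collected. A standalone sketch of that pattern, runnable outside aia; the `Response` struct and sample values are hypothetical, not part of the gem:

```ruby
# Guarded-metrics pattern: probe for each field before reading it.
Response = Struct.new(:content, :input_tokens, :output_tokens, :model_id)

def normalize(result)
  if result.is_a?(String)
    { content: result, metrics: nil }
  elsif result.respond_to?(:content)
    {
      content: result.content,
      metrics: {
        input_tokens:  result.respond_to?(:input_tokens)  ? result.input_tokens  : nil,
        output_tokens: result.respond_to?(:output_tokens) ? result.output_tokens : nil,
        model_id:      result.respond_to?(:model_id)      ? result.model_id      : nil
      }
    }
  else
    { content: result.to_s, metrics: nil }
  end
end

p normalize("plain error text")                        # => content only, no metrics
p normalize(Response.new("hi", 12, 34, "gpt-4o-mini")) # => content plus token metrics
```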
data/lib/aia/directives/models.rb
CHANGED
@@ -30,9 +30,141 @@ module AIA
    end
 
    def self.available_models(args = nil, context_manager = nil)
+      # Check if we're using a local provider
+      current_models = AIA.config.model
+      current_models = [current_models] if current_models.is_a?(String)
+
+      using_local_provider = current_models.any? { |m| m.start_with?('ollama/', 'lms/') }
+
+      if using_local_provider
+        show_local_models(current_models, args)
+      else
+        show_rubyllm_models(args)
+      end
+
+      ""
+    end
+
+    def self.show_local_models(current_models, args)
+      require 'net/http'
+      require 'json'
+
+      puts "\nLocal LLM Models:"
+      puts
+
+      current_models.each do |model_spec|
+        if model_spec.start_with?('ollama/')
+          # Ollama uses its native API, not /v1
+          api_base = ENV.fetch('OLLAMA_API_BASE', 'http://localhost:11434')
+          # Remove /v1 suffix if present
+          api_base = api_base.gsub(%r{/v1/?$}, '')
+          show_ollama_models(api_base, args)
+        elsif model_spec.start_with?('lms/')
+          api_base = ENV.fetch('LMS_API_BASE', 'http://localhost:1234')
+          show_lms_models(api_base, args)
+        end
+      end
+    end
+
+    def self.show_ollama_models(api_base, args)
+      begin
+        uri = URI("#{api_base}/api/tags")
+        response = Net::HTTP.get_response(uri)
+
+        unless response.is_a?(Net::HTTPSuccess)
+          puts "❌ Cannot connect to Ollama at #{api_base}"
+          return
+        end
+
+        data = JSON.parse(response.body)
+        models = data['models'] || []
+
+        if models.empty?
+          puts "No Ollama models found"
+          return
+        end
+
+        puts "Ollama Models (#{api_base}):"
+        puts "-" * 60
+
+        counter = 0
+        models.each do |model|
+          name = model['name']
+          size = model['size'] ? format_bytes(model['size']) : 'unknown'
+          modified = model['modified_at'] ? Time.parse(model['modified_at']).strftime('%Y-%m-%d') : 'unknown'
+
+          entry = "- ollama/#{name} (size: #{size}, modified: #{modified})"
+
+          # Apply query filter if provided
+          if args.nil? || args.empty? || args.any? { |q| entry.downcase.include?(q.downcase) }
+            puts entry
+            counter += 1
+          end
+        end
+
+        puts
+        puts "#{counter} Ollama model(s) available"
+        puts
+      rescue StandardError => e
+        puts "❌ Error fetching Ollama models: #{e.message}"
+      end
+    end
+
+    def self.show_lms_models(api_base, args)
+      begin
+        uri = URI("#{api_base.gsub(%r{/v1/?$}, '')}/v1/models")
+        response = Net::HTTP.get_response(uri)
+
+        unless response.is_a?(Net::HTTPSuccess)
+          puts "❌ Cannot connect to LM Studio at #{api_base}"
+          return
+        end
+
+        data = JSON.parse(response.body)
+        models = data['data'] || []
+
+        if models.empty?
+          puts "No LM Studio models found"
+          return
+        end
+
+        puts "LM Studio Models (#{api_base}):"
+        puts "-" * 60
+
+        counter = 0
+        models.each do |model|
+          name = model['id']
+          entry = "- lms/#{name}"
+
+          # Apply query filter if provided
+          if args.nil? || args.empty? || args.any? { |q| entry.downcase.include?(q.downcase) }
+            puts entry
+            counter += 1
+          end
+        end
+
+        puts
+        puts "#{counter} LM Studio model(s) available"
+        puts
+      rescue StandardError => e
+        puts "❌ Error fetching LM Studio models: #{e.message}"
+      end
+    end
+
+    def self.format_bytes(bytes)
+      units = ['B', 'KB', 'MB', 'GB', 'TB']
+      return "0 B" if bytes.zero?
+
+      exp = (Math.log(bytes) / Math.log(1024)).to_i
+      exp = [exp, units.length - 1].min
+
+      "%.1f %s" % [bytes.to_f / (1024 ** exp), units[exp]]
+    end
+
+    def self.show_rubyllm_models(args)
      query = args
 
-      if 1 == query.size
+      if query && 1 == query.size
        query = query.first.split(',')
      end
 
@@ -42,8 +174,8 @@ module AIA
      puts header + ':'
      puts
 
-      q1 = query.select { |q| q.include?('_to_') }
-      q2 = query.reject { |q| q.include?('_to_') }
+      q1 = query ? query.select { |q| q.include?('_to_') } : []
+      q2 = query ? query.reject { |q| q.include?('_to_') } : []
 
      counter = 0
 
@@ -75,8 +207,6 @@ module AIA
      puts if counter > 0
      puts "#{counter} LLMs matching your query"
      puts
-
-      ""
    end
 
    def self.help(args = nil, context_manager = nil)
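The `//models` directive now renders Ollama entries with a human-readable size via the `format_bytes` helper added above. A quick standalone check of that helper; the logic is copied from the diff so it can be run outside aia, and the byte counts are illustrative:

```ruby
# Standalone check of the byte-formatting helper from the diff.
def format_bytes(bytes)
  units = ['B', 'KB', 'MB', 'GB', 'TB']
  return "0 B" if bytes.zero?

  exp = (Math.log(bytes) / Math.log(1024)).to_i
  exp = [exp, units.length - 1].min

  "%.1f %s" % [bytes.to_f / (1024 ** exp), units[exp]]
end

puts format_bytes(2_147_483_648) # => "2.0 GB" (matches the sample //models output)
puts format_bytes(1_500_000)     # => "1.4 MB"
puts format_bytes(512)           # => "512.0 B"
```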
data/lib/aia/ruby_llm_adapter.rb
CHANGED
@@ -1,6 +1,7 @@
 # lib/aia/ruby_llm_adapter.rb
 
 require 'async'
+require_relative '../extensions/ruby_llm/provider_fix'
 
 module AIA
  class RubyLLMAdapter
@@ -101,8 +102,13 @@ module AIA
      elsif model_name.start_with?('lms/')
        # For LM Studio models (OpenAI-compatible), create a custom context with the right API base
        actual_model = model_name.sub('lms/', '')
+        lms_api_base = ENV.fetch('LMS_API_BASE', 'http://localhost:1234/v1')
+
+        # Validate model exists in LM Studio
+        validate_lms_model!(actual_model, lms_api_base)
+
        custom_config = RubyLLM.config.dup
-        custom_config.openai_api_base =
+        custom_config.openai_api_base = lms_api_base
        custom_config.openai_api_key = 'dummy' # Local servers don't need a real API key
        context = RubyLLM::Context.new(custom_config)
        chat = context.chat(model: actual_model, provider: 'openai', assume_model_exists: true)
@@ -237,33 +243,55 @@ module AIA
 
 
    def chat(prompt)
-
+      puts "[DEBUG RubyLLMAdapter.chat] Received prompt class: #{prompt.class}" if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter.chat] Prompt inspect: #{prompt.inspect[0..500]}..." if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter.chat] Models: #{@models.inspect}" if AIA.config.debug
+
+      result = if @models.size == 1
        # Single model - use the original behavior
        single_model_chat(prompt, @models.first)
      else
        # Multiple models - use concurrent processing
        multi_model_chat(prompt)
      end
+
+      puts "[DEBUG RubyLLMAdapter.chat] Returning result class: #{result.class}" if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter.chat] Result inspect: #{result.inspect[0..500]}..." if AIA.config.debug
+      result
    end
 
    def single_model_chat(prompt, model_name)
+      puts "[DEBUG single_model_chat] Model name: #{model_name}" if AIA.config.debug
      chat_instance = @chats[model_name]
+      puts "[DEBUG single_model_chat] Chat instance: #{chat_instance.class}" if AIA.config.debug
+
      modes = chat_instance.model.modalities
+      puts "[DEBUG single_model_chat] Modalities: #{modes.inspect}" if AIA.config.debug
 
      # TODO: Need to consider how to handle multi-mode models
-      if modes.text_to_text?
+      result = if modes.text_to_text?
+        puts "[DEBUG single_model_chat] Using text_to_text_single" if AIA.config.debug
        text_to_text_single(prompt, model_name)
      elsif modes.image_to_text?
+        puts "[DEBUG single_model_chat] Using image_to_text_single" if AIA.config.debug
        image_to_text_single(prompt, model_name)
      elsif modes.text_to_image?
+        puts "[DEBUG single_model_chat] Using text_to_image_single" if AIA.config.debug
        text_to_image_single(prompt, model_name)
      elsif modes.text_to_audio?
+        puts "[DEBUG single_model_chat] Using text_to_audio_single" if AIA.config.debug
        text_to_audio_single(prompt, model_name)
      elsif modes.audio_to_text?
+        puts "[DEBUG single_model_chat] Using audio_to_text_single" if AIA.config.debug
        audio_to_text_single(prompt, model_name)
      else
+        puts "[DEBUG single_model_chat] No matching modality!" if AIA.config.debug
        # TODO: what else can be done?
+        "Error: No matching modality for model #{model_name}"
      end
+
+      puts "[DEBUG single_model_chat] Result class: #{result.class}" if AIA.config.debug
+      result
    end
 
    def multi_model_chat(prompt)
@@ -440,7 +468,7 @@ module AIA
 
 
    # Clear the chat context/history
-    # Needed for the //clear
+    # Needed for the //clear and //restore directives
    def clear_context
      @chats.each do |model_name, chat|
        # Option 1: Directly clear the messages array in the current chat object
@@ -455,16 +483,65 @@ module AIA
      # This ensures any shared state is reset
      RubyLLM.instance_variable_set(:@chat, nil) if RubyLLM.instance_variable_defined?(:@chat)
 
-      # Option 3:
-
+      # Option 3: Try to create fresh chat instances, but don't exit on failure
+      # This is safer for use in directives like //restore
+      old_chats = @chats
+      @chats = {} # First clear the chats hash
 
      begin
        @models.each do |model_name|
-
+          # Try to recreate each chat, but if it fails, keep the old one
+          begin
+            # Check if this is a local provider model and handle it specially
+            if model_name.start_with?('ollama/')
+              actual_model = model_name.sub('ollama/', '')
+              @chats[model_name] = RubyLLM.chat(model: actual_model, provider: 'ollama', assume_model_exists: true)
+            elsif model_name.start_with?('osaurus/')
+              actual_model = model_name.sub('osaurus/', '')
+              custom_config = RubyLLM.config.dup
+              custom_config.openai_api_base = ENV.fetch('OSAURUS_API_BASE', 'http://localhost:11434/v1')
+              custom_config.openai_api_key = 'dummy'
+              context = RubyLLM::Context.new(custom_config)
+              @chats[model_name] = context.chat(model: actual_model, provider: 'openai', assume_model_exists: true)
+            elsif model_name.start_with?('lms/')
+              actual_model = model_name.sub('lms/', '')
+              lms_api_base = ENV.fetch('LMS_API_BASE', 'http://localhost:1234/v1')
+
+              # Validate model exists in LM Studio
+              validate_lms_model!(actual_model, lms_api_base)
+
+              custom_config = RubyLLM.config.dup
+              custom_config.openai_api_base = lms_api_base
+              custom_config.openai_api_key = 'dummy'
+              context = RubyLLM::Context.new(custom_config)
+              @chats[model_name] = context.chat(model: actual_model, provider: 'openai', assume_model_exists: true)
+            else
+              @chats[model_name] = RubyLLM.chat(model: model_name)
+            end
+
+            # Re-add tools if they were previously loaded
+            if @tools && !@tools.empty? && @chats[model_name].model&.supports_functions?
+              @chats[model_name].with_tools(*@tools)
+            end
+          rescue StandardError => e
+            # If we can't create a new chat, keep the old one but clear its context
+            warn "Warning: Could not recreate chat for #{model_name}: #{e.message}. Keeping existing instance."
+            @chats[model_name] = old_chats[model_name]
+            # Clear the old chat's messages if possible
+            if @chats[model_name] && @chats[model_name].instance_variable_defined?(:@messages)
+              @chats[model_name].instance_variable_set(:@messages, [])
+            end
+          end
        end
      rescue StandardError => e
-
-
+        # If something went terribly wrong, restore the old chats but clear their contexts
+        warn "Warning: Error during context clearing: #{e.message}. Attempting to recover."
+        @chats = old_chats
+        @chats.each_value do |chat|
+          if chat.instance_variable_defined?(:@messages)
+            chat.instance_variable_set(:@messages, [])
+          end
+        end
      end
 
      # Option 4: Call official clear_history method if it exists
@@ -523,6 +600,44 @@ module AIA
    end
 
 
+    def validate_lms_model!(model_name, api_base)
+      require 'net/http'
+      require 'json'
+
+      # Build the /v1/models endpoint URL
+      uri = URI("#{api_base.gsub(%r{/v1/?$}, '')}/v1/models")
+
+      begin
+        response = Net::HTTP.get_response(uri)
+
+        unless response.is_a?(Net::HTTPSuccess)
+          raise "Cannot connect to LM Studio at #{api_base}. Is LM Studio running?"
+        end
+
+        data = JSON.parse(response.body)
+        available_models = data['data']&.map { |m| m['id'] } || []
+
+        unless available_models.include?(model_name)
+          error_msg = "❌ '#{model_name}' is not a valid LM Studio model.\n\n"
+          if available_models.empty?
+            error_msg += "No models are currently loaded in LM Studio.\n"
+            error_msg += "Please load a model in LM Studio first."
+          else
+            error_msg += "Available LM Studio models:\n"
+            available_models.each { |m| error_msg += "  - lms/#{m}\n" }
+          end
+          raise error_msg
+        end
+      rescue JSON::ParserError => e
+        raise "Invalid response from LM Studio at #{api_base}: #{e.message}"
+      rescue StandardError => e
+        # Re-raise our custom error messages, wrap others
+        raise if e.message.start_with?('❌')
+        raise "Error connecting to LM Studio: #{e.message}"
+      end
+    end
+
+
    def extract_models_config
      models_config = AIA.config.model
 
@@ -556,15 +671,30 @@ module AIA
    def text_to_text_single(prompt, model_name)
      chat_instance = @chats[model_name]
      text_prompt = extract_text_prompt(prompt)
+
+      puts "[DEBUG RubyLLMAdapter] Sending to model #{model_name}: #{text_prompt[0..100]}..." if AIA.config.debug
+
      response = if AIA.config.context_files.empty?
        chat_instance.ask(text_prompt)
      else
        chat_instance.ask(text_prompt, with: AIA.config.context_files)
      end
 
+      # Debug output to understand the response structure
+      puts "[DEBUG RubyLLMAdapter] Response class: #{response.class}" if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter] Response inspect: #{response.inspect[0..500]}..." if AIA.config.debug
+
+      if response.respond_to?(:content)
+        puts "[DEBUG RubyLLMAdapter] Response content: #{response.content[0..200]}..." if AIA.config.debug
+      else
+        puts "[DEBUG RubyLLMAdapter] Response (no content method): #{response.to_s[0..200]}..." if AIA.config.debug
+      end
+
      # Return the full response object to preserve token information
      response
    rescue StandardError => e
+      puts "[DEBUG RubyLLMAdapter] Error in text_to_text_single: #{e.class} - #{e.message}" if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter] Backtrace: #{e.backtrace[0..5].join("\n")}" if AIA.config.debug
      e.message
    end
 
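The `validate_lms_model!` method added above builds its error listing from LM Studio's OpenAI-compatible `GET /v1/models` endpoint. A standalone sketch of the same lookup, assuming an LM Studio server on the default endpoint (override via `LMS_API_BASE`); this is an illustration, not code shipped in the gem:

```ruby
# List the models a local LM Studio server reports, with the lms/ prefix aia expects.
require 'net/http'
require 'json'
require 'uri'

api_base = ENV.fetch('LMS_API_BASE', 'http://localhost:1234/v1')
uri      = URI("#{api_base.gsub(%r{/v1/?$}, '')}/v1/models")

response = Net::HTTP.get_response(uri)
abort "Cannot connect to LM Studio at #{api_base}" unless response.is_a?(Net::HTTPSuccess)

ids = JSON.parse(response.body).fetch('data', []).map { |m| m['id'] }
if ids.empty?
  puts "No models are currently loaded in LM Studio."
else
  puts "Models usable with aia (note the lms/ prefix):"
  ids.each { |id| puts "  - lms/#{id}" }
end
```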
data/lib/aia/session.rb
CHANGED
@@ -418,23 +418,23 @@ module AIA
 
    def handle_clear_directive
      # The directive processor has called context_manager.clear_context
-      # but we need
+      # but we need to also clear the LLM client's context
 
      # First, clear the context manager's context
      @context_manager.clear_context(keep_system_prompt: true)
 
      # Second, try clearing the client's context
      if AIA.config.client && AIA.config.client.respond_to?(:clear_context)
-
+        begin
+          AIA.config.client.clear_context
+        rescue => e
+          STDERR.puts "Warning: Error clearing client context: #{e.message}"
+          # Continue anyway - the context manager has been cleared which is the main goal
+        end
      end
 
-      #
-      #
-      begin
-        AIA.config.client = AIA::RubyLLMAdapter.new
-      rescue => e
-        STDERR.puts "Error reinitializing client: #{e.message}"
-      end
+      # Note: We intentionally do NOT reinitialize the client here
+      # as that could cause termination if model initialization fails
 
      @ui_presenter.display_info("Chat context cleared.")
      nil
@@ -448,16 +448,27 @@ module AIA
    def handle_restore_directive(directive_output)
      # If the restore was successful, we also need to refresh the client's context
      if directive_output.start_with?("Context restored")
-        #
+        # Clear the client's context without reinitializing the entire adapter
+        # This avoids the risk of exiting if model initialization fails
        if AIA.config.client && AIA.config.client.respond_to?(:clear_context)
-
+          begin
+            AIA.config.client.clear_context
+          rescue => e
+            STDERR.puts "Warning: Error clearing client context after restore: #{e.message}"
+            # Continue anyway - the context manager has been restored which is the main goal
+          end
        end
 
-        #
-
-
-
-
+        # Rebuild the conversation in the LLM client from the restored context
+        # This ensures the LLM's internal state matches what we restored
+        if AIA.config.client && @context_manager
+          begin
+            restored_context = @context_manager.get_context
+            # The client's context has been cleared, so we can safely continue
+            # The next interaction will use the restored context from context_manager
+          rescue => e
+            STDERR.puts "Warning: Error syncing restored context: #{e.message}"
+          end
        end
      end
 
data/lib/extensions/ruby_llm/provider_fix.rb
ADDED
@@ -0,0 +1,34 @@
+# lib/extensions/ruby_llm/provider_fix.rb
+#
+# Monkey patch to fix LM Studio compatibility with RubyLLM Provider
+# LM Studio sometimes returns response.body as a String that fails JSON parsing
+# This causes "String does not have #dig method" errors in parse_error
+
+module RubyLLM
+  class Provider
+    # Override the parse_error method to handle String responses from LM Studio
+    def parse_error(response)
+      return if response.body.empty?
+
+      body = try_parse_json(response.body)
+
+      # Be more explicit about type checking to prevent String#dig errors
+      case body
+      when Hash
+        # Only call dig if we're certain it's a Hash
+        body.dig('error', 'message')
+      when Array
+        # Only call dig on array elements if they're Hashes
+        body.filter_map do |part|
+          part.is_a?(Hash) ? part.dig('error', 'message') : part.to_s
+        end.join('. ')
+      else
+        # For Strings or any other type, convert to string
+        body.to_s
+      end
+    rescue StandardError => e
+      # Fallback in case anything goes wrong
+      "Error parsing response: #{e.message}"
+    end
+  end
+end
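The patch above only calls `#dig` once the parsed body is known to be a Hash (or a Hash element inside an Array), falling back to `to_s` for bare strings. A standalone illustration of that branching; `extract_message` is a hypothetical stand-in and does not reproduce RubyLLM's `try_parse_json` or its response objects:

```ruby
# Same type-checked branching as provider_fix.rb, on hand-written sample bodies.
def extract_message(body)
  case body
  when Hash
    body.dig('error', 'message')
  when Array
    body.filter_map { |part| part.is_a?(Hash) ? part.dig('error', 'message') : part.to_s }.join('. ')
  else
    body.to_s # LM Studio sometimes returns a bare String here
  end
end

puts extract_message({ 'error' => { 'message' => 'model not loaded' } })          # => model not loaded
puts extract_message([{ 'error' => { 'message' => 'bad request' } }, 'raw text']) # => bad request. raw text
puts extract_message('502 Bad Gateway from local server')                         # => passed through unchanged
```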
data/mkdocs.yml
CHANGED
@@ -151,6 +151,7 @@ nav:
     - Getting Started: guides/getting-started.md
     - Chat Mode: guides/chat.md
     - Working with Models: guides/models.md
+    - Local Models: guides/local-models.md
     - Available Models: guides/available-models.md
     - Image Generation: guides/image-generation.md
     - Tools Integration: guides/tools.md
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: aia
 version: !ruby/object:Gem::Version
-  version: 0.9.16
+  version: 0.9.17
 platform: ruby
 authors:
 - Dewayne VanHoozer
@@ -289,6 +289,20 @@ dependencies:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
+- !ruby/object:Gem::Dependency
+  name: simplecov_lcov_formatter
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 - !ruby/object:Gem::Dependency
   name: tocer
   requirement: !ruby/object:Gem::Requirement
@@ -303,6 +317,20 @@ dependencies:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
+- !ruby/object:Gem::Dependency
+  name: webmock
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description: 'AIA is a revolutionary CLI console application that brings multi-model
   AI capabilities to your command line, supporting 20+ providers including OpenAI,
   Anthropic, and Google. Run multiple AI models simultaneously for comparison, get
@@ -354,6 +382,7 @@ files:
 - docs/guides/getting-started.md
 - docs/guides/image-generation.md
 - docs/guides/index.md
+- docs/guides/local-models.md
 - docs/guides/models.md
 - docs/guides/tools.md
 - docs/index.md
@@ -416,6 +445,7 @@ files:
 - lib/extensions/openstruct_merge.rb
 - lib/extensions/ruby_llm/.irbrc
 - lib/extensions/ruby_llm/modalities.rb
+- lib/extensions/ruby_llm/provider_fix.rb
 - lib/refinements/string.rb
 - main.just
 - mcp_servers/README.md