aia 0.9.16 → 0.9.17
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.version +1 -1
- data/CHANGELOG.md +50 -0
- data/README.md +77 -0
- data/docs/faq.md +83 -1
- data/docs/guides/local-models.md +304 -0
- data/docs/guides/models.md +157 -0
- data/lib/aia/chat_processor_service.rb +20 -5
- data/lib/aia/directives/models.rb +135 -5
- data/lib/aia/ruby_llm_adapter.rb +139 -9
- data/lib/aia/session.rb +27 -16
- data/lib/extensions/ruby_llm/provider_fix.rb +34 -0
- data/mkdocs.yml +1 -0
- metadata +31 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ae5c7e9497837763b36226f8fc44aeb7dffe692c964e450938ff83aea043c10a
+  data.tar.gz: b839a219cb3f6e5b34d9aa02ff7f3ced77ff039816b6b2a3d9ef79076d32c302
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7df6fe4bbf0fd2ce33b9827390340d3cf1f8fd5ad5b4f52ceb343769dee6d699ab87c5dd2686472cd480bf5ac1e20b612f5c69e70b72eab68735560601423d7c
+  data.tar.gz: 9fa99822319f6e3ff213f62ba0c26121c045e4fa96035596df46b4d723df72cecb19c1abb98f760e6b435941a3cc9b869f6401da771452c3d226930b4f4ea104
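For reference, checksums.yaml covers the two archives inside the published .gem (a .gem file is a plain tar containing metadata.gz, data.tar.gz, and checksums.yaml.gz). A minimal verification sketch, assuming the gem has been fetched with `gem fetch aia -v 0.9.17` and unpacked with `tar xf aia-0.9.17.gem`; the file names come from the RubyGems package format, not from this diff:

```ruby
# Hypothetical verification sketch (not part of the gem): compares the SHA256
# entries from checksums.yaml.gz against the unpacked archives.
require 'digest'
require 'yaml'
require 'zlib'

expected = nil
Zlib::GzipReader.open('checksums.yaml.gz') { |gz| expected = YAML.safe_load(gz.read) }

%w[metadata.gz data.tar.gz].each do |file|
  actual = Digest::SHA256.file(file).hexdigest
  status = actual == expected['SHA256'][file] ? 'OK' : 'MISMATCH'
  puts "#{file}: #{status}"
end
```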
data/.version
CHANGED
@@ -1 +1 @@
-0.9.16
+0.9.17
data/CHANGELOG.md
CHANGED
@@ -1,6 +1,56 @@
 # Changelog
 ## [Unreleased]
 
+### [0.9.17] 2025-10-04
+
+#### New Features
+- **NEW FEATURE**: Enhanced local model support with comprehensive validation and error handling
+- **NEW FEATURE**: Added `lms/` prefix support for LM Studio models with automatic validation against loaded models
+- **NEW FEATURE**: Enhanced `//models` directive to auto-detect and display local providers (Ollama and LM Studio)
+- **NEW FEATURE**: Added model name prefix display in error messages for LM Studio (`lms/` prefix)
+
+#### Improvements
+- **ENHANCEMENT**: Improved LM Studio integration with model validation against `/v1/models` endpoint
+- **ENHANCEMENT**: Enhanced error messages showing exact model names with correct prefixes when validation fails
+- **ENHANCEMENT**: Added environment variable support for custom LM Studio API base (`LMS_API_BASE`)
+- **ENHANCEMENT**: Improved `//models` directive output formatting for local models with size and modified date for Ollama
+- **ENHANCEMENT**: Enhanced multi-model support to seamlessly mix local and cloud models
+
+#### Documentation
+- **DOCUMENTATION**: Added comprehensive local model documentation to README.md
+- **DOCUMENTATION**: Created new docs/guides/local-models.md guide covering Ollama and LM Studio setup, usage, and troubleshooting
+- **DOCUMENTATION**: Updated docs/guides/models.md with local provider sections including comparison table and workflow examples
+- **DOCUMENTATION**: Enhanced docs/faq.md with 5 new FAQ entries covering local model usage, differences, and error handling
+
+#### Technical Changes
+- Enhanced RubyLLMAdapter with LM Studio model validation (lib/aia/ruby_llm_adapter.rb)
+- Updated models directive to query local provider endpoints (lib/aia/directives/models.rb)
+- Added provider_fix extension for RubyLLM compatibility (lib/extensions/ruby_llm/provider_fix.rb)
+- Added comprehensive test coverage with 22 new tests for local providers
+- Updated dependencies: ruby_llm, webmock, crack, rexml
+- Bumped Ruby bundler version to 2.7.2
+
+#### Bug Fixes
+- **BUG FIX**: Fixed missing `lms/` prefix in LM Studio model listings
+- **BUG FIX**: Fixed model validation error messages to show usable model names with correct prefixes
+- **BUG FIX**: Fixed Ollama endpoint to use native API (removed incorrect `/v1` suffix)
+
+#### Usage Examples
+```bash
+# Use LM Studio with validation
+aia --model lms/qwen/qwen3-coder-30b my_prompt
+
+# Use Ollama
+aia --model ollama/llama3.2 --chat
+
+# Mix local and cloud models
+aia --model ollama/llama3.2,gpt-4o-mini,claude-3-sonnet my_prompt
+
+# List available local models
+aia --model ollama/llama3.2 --chat
+> //models
+```
+
 ### [0.9.16] 2025-09-26
 
 #### New Features
data/README.md
CHANGED
@@ -563,6 +563,83 @@ Model Details:
 - **Error Handling**: Invalid models are reported but don't prevent valid models from working
 - **Batch Mode Support**: Multi-model responses are properly formatted in output files
 
+### Local Model Support
+
+AIA supports running local AI models through Ollama and LM Studio, providing privacy, offline capability, and cost savings.
+
+#### Ollama Integration
+
+[Ollama](https://ollama.ai) runs AI models locally on your machine.
+
+```bash
+# Install Ollama (macOS)
+brew install ollama
+
+# Pull a model
+ollama pull llama3.2
+
+# Use with AIA - prefix model name with 'ollama/'
+aia --model ollama/llama3.2 my_prompt
+
+# In chat mode
+aia --chat --model ollama/llama3.2
+
+# Combine with cloud models
+aia --model ollama/llama3.2,gpt-4o-mini --consensus my_prompt
+```
+
+**Environment Variables:**
+```bash
+# Optional: Set custom Ollama API endpoint
+export OLLAMA_API_BASE=http://localhost:11434
+```
+
+#### LM Studio Integration
+
+[LM Studio](https://lmstudio.ai) provides a desktop application for running local models with an OpenAI-compatible API.
+
+```bash
+# 1. Install LM Studio from lmstudio.ai
+# 2. Download and load a model in LM Studio
+# 3. Start the local server in LM Studio
+
+# Use with AIA - prefix model name with 'lms/'
+aia --model lms/qwen/qwen3-coder-30b my_prompt
+
+# In chat mode
+aia --chat --model lms/your-model-name
+
+# Mix local and cloud models
+aia --model lms/local-model,gpt-4o-mini my_prompt
+```
+
+**Environment Variables:**
+```bash
+# Optional: Set custom LM Studio API endpoint (default: http://localhost:1234/v1)
+export LMS_API_BASE=http://localhost:1234/v1
+```
+
+#### Listing Local Models
+
+The `//models` directive automatically detects local providers and queries their endpoints:
+
+```bash
+# In a prompt file or chat session
+//models
+
+# Output will show:
+# - Ollama models from http://localhost:11434/api/tags
+# - LM Studio models from http://localhost:1234/v1/models
+# - Cloud models from RubyLLM database
+```
+
+**Benefits of Local Models:**
+- 🔒 **Privacy**: No data sent to external servers
+- 💰 **Cost**: Zero API costs after initial setup
+- 🚀 **Speed**: No network latency
+- 📡 **Offline**: Works without internet connection
+- 🔧 **Control**: Full control over model and parameters
+
 ### Shell Integration
 
 AIA automatically processes shell patterns in prompts:
data/docs/faq.md
CHANGED
@@ -23,7 +23,89 @@ export ANTHROPIC_API_KEY="your_key_here"
 ```
 
 ### Q: Can I use AIA without internet access?
-**A:** Yes
+**A:** Yes! AIA supports two local model providers for complete offline operation:
+
+1. **Ollama**: Run open-source models locally
+   ```bash
+   # Install and use Ollama
+   brew install ollama
+   ollama pull llama3.2
+   aia --model ollama/llama3.2 --chat
+   ```
+
+2. **LM Studio**: GUI-based local model runner
+   ```bash
+   # Download from https://lmstudio.ai
+   # Load a model and start local server
+   aia --model lms/your-model-name --chat
+   ```
+
+Both options provide full AI functionality without internet connection, perfect for:
+- 🔒 Private/sensitive data processing
+- ✈️ Offline/travel use
+- 💰 Zero API costs
+- 🏠 Air-gapped environments
+
+### Q: How do I list available local models?
+**A:** Use the `//models` directive in a chat session or prompt:
+
+```bash
+# Start chat with any local model
+aia --model ollama/llama3.2 --chat
+
+# In the chat session
+> //models
+
+# Output shows:
+# - Ollama models from local installation
+# - LM Studio models currently loaded
+# - Cloud models from RubyLLM database
+```
+
+For Ollama specifically: `ollama list`
+For LM Studio: Check the Models tab in the LM Studio GUI
+
+### Q: What's the difference between Ollama and LM Studio?
+**A:**
+- **Ollama**: Command-line focused, quick model switching, multiple models available
+- **LM Studio**: GUI application, visual model management, one model at a time
+
+Choose **Ollama** if you prefer CLI tools and automation.
+Choose **LM Studio** if you want a visual interface and easier model discovery.
+
+Both work great with AIA!
+
+### Q: Can I mix local and cloud models?
+**A:** Absolutely! This is a powerful feature:
+
+```bash
+# Compare local vs cloud responses
+aia --model ollama/llama3.2,gpt-4o-mini my_prompt
+
+# Get consensus across local and cloud models
+aia --model ollama/mistral,lms/qwen-coder,claude-3-sonnet --consensus decision
+
+# Use local for drafts, cloud for refinement
+aia --model ollama/llama3.2 --out_file draft.md initial_analysis
+aia --model gpt-4 --include draft.md final_report
+```
+
+### Q: Why does my lms/ model show an error?
+**A:** Common causes:
+
+1. **Model not loaded in LM Studio**: Load a model first
+2. **Wrong model name**: AIA validates against available models and shows the exact names to use
+3. **Server not running**: Start the local server in LM Studio
+4. **Wrong prefix**: Always use `lms/` prefix with full model name
+
+If you get an error, AIA will show you the exact model names to use:
+```
+❌ 'wrong-name' is not a valid LM Studio model.
+
+Available LM Studio models:
+  - lms/qwen/qwen3-coder-30b
+  - lms/llama-3.2-3b-instruct
+```
 
 ## Basic Usage
 
data/docs/guides/local-models.md
ADDED
@@ -0,0 +1,304 @@
+# Local Models Guide
+
+Complete guide to using Ollama and LM Studio with AIA for local AI processing.
+
+## Why Use Local Models?
+
+### Benefits
+
+- 🔒 **Privacy**: All processing happens on your machine
+- 💰 **Cost**: No API fees
+- 🚀 **Speed**: No network latency
+- 📡 **Offline**: Works without internet
+- 🔧 **Control**: Choose exact model and parameters
+- 📦 **Unlimited**: No rate limits or quotas
+
+### Use Cases
+
+- Processing confidential business data
+- Working with personal information
+- Development and testing
+- High-volume batch processing
+- Air-gapped environments
+- Learning and experimentation
+
+## Ollama Setup
+
+### Installation
+
+```bash
+# macOS
+brew install ollama
+
+# Linux
+curl -fsSL https://ollama.ai/install.sh | sh
+
+# Windows
+# Download installer from https://ollama.ai
+```
+
+### Model Management
+
+```bash
+# List available models
+ollama list
+
+# Pull new models
+ollama pull llama3.2
+ollama pull mistral
+ollama pull codellama
+
+# Remove models
+ollama rm model-name
+
+# Show model info
+ollama show llama3.2
+```
+
+### Using with AIA
+
+```bash
+# Basic usage - prefix with 'ollama/'
+aia --model ollama/llama3.2 my_prompt
+
+# Chat mode
+aia --chat --model ollama/mistral
+
+# Batch processing
+for file in *.md; do
+  aia --model ollama/llama3.2 summarize "$file"
+done
+```
+
+### Recommended Ollama Models
+
+#### General Purpose
+- `llama3.2` - Versatile, good quality
+- `llama3.2:70b` - Higher quality, slower
+- `mistral` - Fast, efficient
+
+#### Code
+- `qwen2.5-coder` - Excellent for code
+- `codellama` - Code-focused
+- `deepseek-coder` - Programming tasks
+
+#### Specialized
+- `mixtral` - High performance
+- `phi3` - Small, efficient
+- `gemma2` - Google's open model
+
+## LM Studio Setup
+
+### Installation
+
+1. Download from https://lmstudio.ai
+2. Install the application
+3. Launch LM Studio
+
+### Model Management
+
+1. Click "🔍 Search" tab
+2. Browse or search for models
+3. Click download button
+4. Wait for download to complete
+
+### Starting Local Server
+
+1. Click "💻 Local Server" tab
+2. Select loaded model from dropdown
+3. Click "Start Server"
+4. Note the endpoint (default: http://localhost:1234/v1)
+
+### Using with AIA
+
+```bash
+# Prefix model name with 'lms/'
+aia --model lms/qwen/qwen3-coder-30b my_prompt
+
+# Chat mode
+aia --chat --model lms/llama-3.2-3b-instruct
+
+# AIA validates model names
+# Error shows available models if name is wrong
+```
+
+### Popular LM Studio Models
+
+- `lmsys/vicuna-7b` - Conversation
+- `TheBloke/Llama-2-7B-Chat-GGUF` - Chat
+- `TheBloke/CodeLlama-7B-GGUF` - Code
+- `qwen/qwen3-coder-30b` - Advanced coding
+
+## Configuration
+
+### Environment Variables
+
+```bash
+# Ollama custom endpoint
+export OLLAMA_API_BASE=http://localhost:11434
+
+# LM Studio custom endpoint
+export LMS_API_BASE=http://localhost:1234/v1
+```
+
+### Config File
+
+```yaml
+# ~/.aia/config.yml
+model: ollama/llama3.2
+
+# Or for LM Studio
+model: lms/qwen/qwen3-coder-30b
+```
+
+### In Prompts
+
+```
+//config model = ollama/mistral
+//config temperature = 0.7
+
+Your prompt here...
+```
+
+## Listing Models
+
+### In Chat Session
+
+```bash
+aia --model ollama/llama3.2 --chat
+> //models
+```
+
+**Ollama Output:**
+```
+Local LLM Models:
+
+Ollama Models (http://localhost:11434):
+------------------------------------------------------------
+- ollama/llama3.2:latest (size: 2.0 GB, modified: 2024-10-01)
+- ollama/mistral:latest (size: 4.1 GB, modified: 2024-09-28)
+
+2 Ollama model(s) available
+```
+
+**LM Studio Output:**
+```
+Local LLM Models:
+
+LM Studio Models (http://localhost:1234/v1):
+------------------------------------------------------------
+- lms/qwen/qwen3-coder-30b
+- lms/llama-3.2-3b-instruct
+
+2 LM Studio model(s) available
+```
+
+## Advanced Usage
+
+### Mixed Local/Cloud Models
+
+```bash
+# Compare local and cloud responses
+aia --model ollama/llama3.2,gpt-4o-mini,claude-3-sonnet analysis_prompt
+
+# Get consensus
+aia --model ollama/llama3.2,ollama/mistral,gpt-4 --consensus decision_prompt
+```
+
+### Local-First Workflow
+
+```bash
+# 1. Process with local model (private)
+aia --model ollama/llama3.2 --out_file draft.md sensitive_data.txt
+
+# 2. Review and sanitize draft.md manually
+
+# 3. Polish with cloud model
+aia --model gpt-4 --include draft.md final_output
+```
+
+### Cost Optimization
+
+```bash
+# Bulk tasks with local model
+for i in {1..1000}; do
+  aia --model ollama/mistral --out_file "result_$i.md" process "input_$i.txt"
+done
+
+# No API costs!
+```
+
+## Troubleshooting
+
+### Ollama Issues
+
+**Problem:** "Cannot connect to Ollama"
+```bash
+# Check if Ollama is running
+ollama list
+
+# Start Ollama service (if needed)
+ollama serve
+```
+
+**Problem:** "Model not found"
+```bash
+# List installed models
+ollama list
+
+# Pull missing model
+ollama pull llama3.2
+```
+
+### LM Studio Issues
+
+**Problem:** "Cannot connect to LM Studio"
+1. Ensure LM Studio is running
+2. Check local server is started
+3. Verify endpoint in settings
+
+**Problem:** "Model validation failed"
+- Check exact model name in LM Studio
+- Ensure model is loaded (not just downloaded)
+- Use full model path with `lms/` prefix
+
+**Problem:** "Model not listed"
+1. Load model in LM Studio
+2. Start local server
+3. Run `//models` directive
+
+### Performance Issues
+
+**Slow responses:**
+- Use smaller models (7B instead of 70B)
+- Reduce max_tokens
+- Check system resources (CPU/RAM/GPU)
+
+**High memory usage:**
+- Close other applications
+- Use quantized models (Q4, Q5)
+- Try smaller model variants
+
+## Best Practices
+
+### Security
+✅ Keep local models for sensitive data
+✅ Use cloud models for general tasks
+✅ Review outputs before sharing externally
+
+### Performance
+✅ Use appropriate model size for task
+✅ Leverage GPU if available
+✅ Cache common responses
+
+### Cost Management
+✅ Use local models for development/testing
+✅ Use local models for high-volume processing
+✅ Reserve cloud models for critical tasks
+
+## Related Documentation
+
+- [Models Guide](models.md)
+- [Configuration](../configuration.md)
+- [Chat Mode](chat.md)
+- [CLI Reference](../cli-reference.md)
data/docs/guides/models.md
CHANGED
@@ -373,6 +373,163 @@ premium_models:
 - **Llama 2**: Open-source general purpose
 - **Mixtral**: High-performance open model
 
+## Local Model Providers
+
+### Ollama
+
+[Ollama](https://ollama.ai) enables running open-source AI models locally.
+
+#### Setup
+
+```bash
+# Install Ollama
+brew install ollama # macOS
+# or download from https://ollama.ai
+
+# Pull models
+ollama pull llama3.2
+ollama pull mistral
+ollama pull qwen2.5-coder
+
+# List available models
+ollama list
+```
+
+#### Usage with AIA
+
+```bash
+# Use Ollama model (prefix with 'ollama/')
+aia --model ollama/llama3.2 my_prompt
+
+# Chat mode
+aia --chat --model ollama/mistral
+
+# List Ollama models from AIA
+aia --model ollama/llama3.2 --chat
+> //models
+
+# Combine with cloud models for comparison
+aia --model ollama/llama3.2,gpt-4o-mini,claude-3-sonnet my_prompt
+```
+
+#### Configuration
+
+```yaml
+# ~/.aia/config.yml
+model: ollama/llama3.2
+
+# Optional: Custom Ollama endpoint
+# Set via environment variable
+export OLLAMA_API_BASE=http://custom-host:11434
+```
+
+#### Popular Ollama Models
+
+- **llama3.2**: Latest Llama model, good general purpose
+- **llama3.2:70b**: Larger version, better quality
+- **mistral**: Fast and efficient
+- **mixtral**: High-performance mixture of experts
+- **qwen2.5-coder**: Specialized for code
+- **codellama**: Code-focused model
+
+### LM Studio
+
+[LM Studio](https://lmstudio.ai) provides a GUI for running local models with OpenAI-compatible API.
+
+#### Setup
+
+1. Download LM Studio from https://lmstudio.ai
+2. Install and launch the application
+3. Browse and download models within LM Studio
+4. Start the local server:
+   - Click "Local Server" tab
+   - Click "Start Server"
+   - Default endpoint: http://localhost:1234/v1
+
+#### Usage with AIA
+
+```bash
+# Use LM Studio model (prefix with 'lms/')
+aia --model lms/qwen/qwen3-coder-30b my_prompt
+
+# Chat mode
+aia --chat --model lms/llama-3.2-3b-instruct
+
+# List LM Studio models from AIA
+aia --model lms/any-loaded-model --chat
+> //models
+
+# Model validation
+# AIA validates model names against LM Studio's loaded models
+# If you specify an invalid model, you'll see:
+# ❌ 'model-name' is not a valid LM Studio model.
+#
+# Available LM Studio models:
+#   - lms/qwen/qwen3-coder-30b
+#   - lms/llama-3.2-3b-instruct
+```
+
+#### Configuration
+
+```yaml
+# ~/.aia/config.yml
+model: lms/qwen/qwen3-coder-30b
+
+# Optional: Custom LM Studio endpoint
+# Set via environment variable
+export LMS_API_BASE=http://localhost:1234/v1
+```
+
+#### Tips for LM Studio
+
+- Use the model name exactly as shown in LM Studio
+- Prefix all model names with `lms/`
+- Ensure the local server is running before use
+- LM Studio supports one model at a time (unlike Ollama)
+
+### Comparison: Ollama vs LM Studio
+
+| Feature | Ollama | LM Studio |
+|---------|--------|-----------|
+| **Interface** | Command-line | GUI + CLI |
+| **Model Management** | Via CLI (`ollama pull`) | GUI download |
+| **API Compatibility** | Custom + OpenAI-like | OpenAI-compatible |
+| **Multiple Models** | Yes (switch quickly) | One at a time |
+| **Platform** | macOS, Linux, Windows | macOS, Windows |
+| **Model Format** | GGUF, custom | GGUF |
+| **Best For** | CLI users, automation | GUI users, experimentation |
+
+### Local + Cloud Model Workflows
+
+#### Privacy-First Workflow
+```bash
+# Use local model for sensitive data
+aia --model ollama/llama3.2 --out_file draft.md process_private_data.txt
+
+# Use cloud model for final polish (on sanitized data)
+aia --model gpt-4 --include draft.md refine_output
+```
+
+#### Cost-Optimization Workflow
+```bash
+# Bulk processing with local model (free)
+for file in *.txt; do
+  aia --model ollama/mistral --out_file "${file%.txt}_summary.md" summarize "$file"
+done
+
+# Final review with premium cloud model
+aia --model gpt-4 --include *_summary.md final_report
+```
+
+#### Consensus with Mixed Models
+```bash
+# Get consensus from local and cloud models
+aia --model ollama/llama3.2,ollama/mistral,gpt-4o-mini --consensus decision_prompt
+
+# Or individual responses to compare
+aia --model ollama/llama3.2,lms/qwen-coder,claude-3-sonnet --no-consensus code_review.py
+```
+
 ## Troubleshooting Models
 
 ### Common Issues
data/lib/aia/chat_processor_service.rb
CHANGED
@@ -28,25 +28,37 @@ module AIA
        result = send_to_client(prompt)
      end
 
+      # Debug output to understand what we're receiving
+      puts "[DEBUG ChatProcessor] Result class: #{result.class}" if AIA.config.debug
+      puts "[DEBUG ChatProcessor] Result inspect: #{result.inspect[0..500]}..." if AIA.config.debug
+
      # Preserve token information if available for metrics
      if result.is_a?(String)
+        puts "[DEBUG ChatProcessor] Processing as String" if AIA.config.debug
        { content: result, metrics: nil }
      elsif result.respond_to?(:multi_model?) && result.multi_model?
+        puts "[DEBUG ChatProcessor] Processing as multi-model response" if AIA.config.debug
        # Handle multi-model response with metrics
        {
          content: result.content,
          metrics: nil, # Individual model metrics handled separately
          multi_metrics: result.metrics_list
        }
-
+      elsif result.respond_to?(:content)
+        puts "[DEBUG ChatProcessor] Processing as standard response with content method" if AIA.config.debug
+        # Standard response object with content method
        {
          content: result.content,
          metrics: {
-            input_tokens: result.input_tokens,
-            output_tokens: result.output_tokens,
-            model_id: result.model_id
+            input_tokens: result.respond_to?(:input_tokens) ? result.input_tokens : nil,
+            output_tokens: result.respond_to?(:output_tokens) ? result.output_tokens : nil,
+            model_id: result.respond_to?(:model_id) ? result.model_id : nil
          }
        }
+      else
+        puts "[DEBUG ChatProcessor] Processing as fallback (unexpected type)" if AIA.config.debug
+        # Fallback for unexpected response types
+        { content: result.to_s, metrics: nil }
      end
    end
 
@@ -56,7 +68,10 @@
    def send_to_client(conversation)
      maybe_change_model
 
-      AIA.client.chat(conversation)
+      puts "[DEBUG ChatProcessor] Sending conversation to client: #{conversation.inspect[0..500]}..." if AIA.config.debug
+      result = AIA.client.chat(conversation)
+      puts "[DEBUG ChatProcessor] Client returned: #{result.class} - #{result.inspect[0..500]}..." if AIA.config.debug
+      result
    end
 
 
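The hunk above replaces direct access to token fields with `respond_to?` guards, so error strings and other non-standard return values no longer raise NoMethodError when metrics are collected. A standalone sketch of that pattern, runnable outside aia; the `Response` struct and sample values are hypothetical, not part of the gem:

```ruby
# Guarded-metrics pattern: probe for each field before reading it.
Response = Struct.new(:content, :input_tokens, :output_tokens, :model_id)

def normalize(result)
  if result.is_a?(String)
    { content: result, metrics: nil }
  elsif result.respond_to?(:content)
    {
      content: result.content,
      metrics: {
        input_tokens:  result.respond_to?(:input_tokens)  ? result.input_tokens  : nil,
        output_tokens: result.respond_to?(:output_tokens) ? result.output_tokens : nil,
        model_id:      result.respond_to?(:model_id)      ? result.model_id      : nil
      }
    }
  else
    { content: result.to_s, metrics: nil }
  end
end

p normalize("plain error text")                        # => content only, no metrics
p normalize(Response.new("hi", 12, 34, "gpt-4o-mini")) # => content plus token metrics
```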
data/lib/aia/directives/models.rb
CHANGED
@@ -30,9 +30,141 @@ module AIA
    end
 
    def self.available_models(args = nil, context_manager = nil)
+      # Check if we're using a local provider
+      current_models = AIA.config.model
+      current_models = [current_models] if current_models.is_a?(String)
+
+      using_local_provider = current_models.any? { |m| m.start_with?('ollama/', 'lms/') }
+
+      if using_local_provider
+        show_local_models(current_models, args)
+      else
+        show_rubyllm_models(args)
+      end
+
+      ""
+    end
+
+    def self.show_local_models(current_models, args)
+      require 'net/http'
+      require 'json'
+
+      puts "\nLocal LLM Models:"
+      puts
+
+      current_models.each do |model_spec|
+        if model_spec.start_with?('ollama/')
+          # Ollama uses its native API, not /v1
+          api_base = ENV.fetch('OLLAMA_API_BASE', 'http://localhost:11434')
+          # Remove /v1 suffix if present
+          api_base = api_base.gsub(%r{/v1/?$}, '')
+          show_ollama_models(api_base, args)
+        elsif model_spec.start_with?('lms/')
+          api_base = ENV.fetch('LMS_API_BASE', 'http://localhost:1234')
+          show_lms_models(api_base, args)
+        end
+      end
+    end
+
+    def self.show_ollama_models(api_base, args)
+      begin
+        uri = URI("#{api_base}/api/tags")
+        response = Net::HTTP.get_response(uri)
+
+        unless response.is_a?(Net::HTTPSuccess)
+          puts "❌ Cannot connect to Ollama at #{api_base}"
+          return
+        end
+
+        data = JSON.parse(response.body)
+        models = data['models'] || []
+
+        if models.empty?
+          puts "No Ollama models found"
+          return
+        end
+
+        puts "Ollama Models (#{api_base}):"
+        puts "-" * 60
+
+        counter = 0
+        models.each do |model|
+          name = model['name']
+          size = model['size'] ? format_bytes(model['size']) : 'unknown'
+          modified = model['modified_at'] ? Time.parse(model['modified_at']).strftime('%Y-%m-%d') : 'unknown'
+
+          entry = "- ollama/#{name} (size: #{size}, modified: #{modified})"
+
+          # Apply query filter if provided
+          if args.nil? || args.empty? || args.any? { |q| entry.downcase.include?(q.downcase) }
+            puts entry
+            counter += 1
+          end
+        end
+
+        puts
+        puts "#{counter} Ollama model(s) available"
+        puts
+      rescue StandardError => e
+        puts "❌ Error fetching Ollama models: #{e.message}"
+      end
+    end
+
+    def self.show_lms_models(api_base, args)
+      begin
+        uri = URI("#{api_base.gsub(%r{/v1/?$}, '')}/v1/models")
+        response = Net::HTTP.get_response(uri)
+
+        unless response.is_a?(Net::HTTPSuccess)
+          puts "❌ Cannot connect to LM Studio at #{api_base}"
+          return
+        end
+
+        data = JSON.parse(response.body)
+        models = data['data'] || []
+
+        if models.empty?
+          puts "No LM Studio models found"
+          return
+        end
+
+        puts "LM Studio Models (#{api_base}):"
+        puts "-" * 60
+
+        counter = 0
+        models.each do |model|
+          name = model['id']
+          entry = "- lms/#{name}"
+
+          # Apply query filter if provided
+          if args.nil? || args.empty? || args.any? { |q| entry.downcase.include?(q.downcase) }
+            puts entry
+            counter += 1
+          end
+        end
+
+        puts
+        puts "#{counter} LM Studio model(s) available"
+        puts
+      rescue StandardError => e
+        puts "❌ Error fetching LM Studio models: #{e.message}"
+      end
+    end
+
+    def self.format_bytes(bytes)
+      units = ['B', 'KB', 'MB', 'GB', 'TB']
+      return "0 B" if bytes.zero?
+
+      exp = (Math.log(bytes) / Math.log(1024)).to_i
+      exp = [exp, units.length - 1].min
+
+      "%.1f %s" % [bytes.to_f / (1024 ** exp), units[exp]]
+    end
+
+    def self.show_rubyllm_models(args)
      query = args
 
-      if 1 == query.size
+      if query && 1 == query.size
        query = query.first.split(',')
      end
 
@@ -42,8 +174,8 @@ module AIA
      puts header + ':'
      puts
 
-      q1 = query.select { |q| q.include?('_to_') }
-      q2 = query.reject { |q| q.include?('_to_') }
+      q1 = query ? query.select { |q| q.include?('_to_') } : []
+      q2 = query ? query.reject { |q| q.include?('_to_') } : []
 
      counter = 0
 
@@ -75,8 +207,6 @@ module AIA
      puts if counter > 0
      puts "#{counter} LLMs matching your query"
      puts
-
-      ""
    end
 
    def self.help(args = nil, context_manager = nil)
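The `//models` directive now renders Ollama entries with a human-readable size via the `format_bytes` helper added above. A quick standalone check of that helper; the logic is copied from the diff so it can be run outside aia, and the byte counts are illustrative:

```ruby
# Standalone check of the byte-formatting helper from the diff.
def format_bytes(bytes)
  units = ['B', 'KB', 'MB', 'GB', 'TB']
  return "0 B" if bytes.zero?

  exp = (Math.log(bytes) / Math.log(1024)).to_i
  exp = [exp, units.length - 1].min

  "%.1f %s" % [bytes.to_f / (1024 ** exp), units[exp]]
end

puts format_bytes(2_147_483_648) # => "2.0 GB" (matches the sample //models output)
puts format_bytes(1_500_000)     # => "1.4 MB"
puts format_bytes(512)           # => "512.0 B"
```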
data/lib/aia/ruby_llm_adapter.rb
CHANGED
@@ -1,6 +1,7 @@
 # lib/aia/ruby_llm_adapter.rb
 
 require 'async'
+require_relative '../extensions/ruby_llm/provider_fix'
 
 module AIA
  class RubyLLMAdapter
@@ -101,8 +102,13 @@ module AIA
      elsif model_name.start_with?('lms/')
        # For LM Studio models (OpenAI-compatible), create a custom context with the right API base
        actual_model = model_name.sub('lms/', '')
+        lms_api_base = ENV.fetch('LMS_API_BASE', 'http://localhost:1234/v1')
+
+        # Validate model exists in LM Studio
+        validate_lms_model!(actual_model, lms_api_base)
+
        custom_config = RubyLLM.config.dup
-        custom_config.openai_api_base =
+        custom_config.openai_api_base = lms_api_base
        custom_config.openai_api_key = 'dummy' # Local servers don't need a real API key
        context = RubyLLM::Context.new(custom_config)
        chat = context.chat(model: actual_model, provider: 'openai', assume_model_exists: true)
@@ -237,33 +243,55 @@ module AIA
 
 
    def chat(prompt)
-
+      puts "[DEBUG RubyLLMAdapter.chat] Received prompt class: #{prompt.class}" if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter.chat] Prompt inspect: #{prompt.inspect[0..500]}..." if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter.chat] Models: #{@models.inspect}" if AIA.config.debug
+
+      result = if @models.size == 1
        # Single model - use the original behavior
        single_model_chat(prompt, @models.first)
      else
        # Multiple models - use concurrent processing
        multi_model_chat(prompt)
      end
+
+      puts "[DEBUG RubyLLMAdapter.chat] Returning result class: #{result.class}" if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter.chat] Result inspect: #{result.inspect[0..500]}..." if AIA.config.debug
+      result
    end
 
    def single_model_chat(prompt, model_name)
+      puts "[DEBUG single_model_chat] Model name: #{model_name}" if AIA.config.debug
      chat_instance = @chats[model_name]
+      puts "[DEBUG single_model_chat] Chat instance: #{chat_instance.class}" if AIA.config.debug
+
      modes = chat_instance.model.modalities
+      puts "[DEBUG single_model_chat] Modalities: #{modes.inspect}" if AIA.config.debug
 
      # TODO: Need to consider how to handle multi-mode models
-      if modes.text_to_text?
+      result = if modes.text_to_text?
+        puts "[DEBUG single_model_chat] Using text_to_text_single" if AIA.config.debug
        text_to_text_single(prompt, model_name)
      elsif modes.image_to_text?
+        puts "[DEBUG single_model_chat] Using image_to_text_single" if AIA.config.debug
        image_to_text_single(prompt, model_name)
      elsif modes.text_to_image?
+        puts "[DEBUG single_model_chat] Using text_to_image_single" if AIA.config.debug
        text_to_image_single(prompt, model_name)
      elsif modes.text_to_audio?
+        puts "[DEBUG single_model_chat] Using text_to_audio_single" if AIA.config.debug
        text_to_audio_single(prompt, model_name)
      elsif modes.audio_to_text?
+        puts "[DEBUG single_model_chat] Using audio_to_text_single" if AIA.config.debug
        audio_to_text_single(prompt, model_name)
      else
+        puts "[DEBUG single_model_chat] No matching modality!" if AIA.config.debug
        # TODO: what else can be done?
+        "Error: No matching modality for model #{model_name}"
      end
+
+      puts "[DEBUG single_model_chat] Result class: #{result.class}" if AIA.config.debug
+      result
    end
 
    def multi_model_chat(prompt)
@@ -440,7 +468,7 @@ module AIA
 
 
    # Clear the chat context/history
-    # Needed for the //clear
+    # Needed for the //clear and //restore directives
    def clear_context
      @chats.each do |model_name, chat|
        # Option 1: Directly clear the messages array in the current chat object
@@ -455,16 +483,65 @@ module AIA
      # This ensures any shared state is reset
      RubyLLM.instance_variable_set(:@chat, nil) if RubyLLM.instance_variable_defined?(:@chat)
 
-      # Option 3:
-
+      # Option 3: Try to create fresh chat instances, but don't exit on failure
+      # This is safer for use in directives like //restore
+      old_chats = @chats
+      @chats = {} # First clear the chats hash
 
      begin
        @models.each do |model_name|
-
+          # Try to recreate each chat, but if it fails, keep the old one
+          begin
+            # Check if this is a local provider model and handle it specially
+            if model_name.start_with?('ollama/')
+              actual_model = model_name.sub('ollama/', '')
+              @chats[model_name] = RubyLLM.chat(model: actual_model, provider: 'ollama', assume_model_exists: true)
+            elsif model_name.start_with?('osaurus/')
+              actual_model = model_name.sub('osaurus/', '')
+              custom_config = RubyLLM.config.dup
+              custom_config.openai_api_base = ENV.fetch('OSAURUS_API_BASE', 'http://localhost:11434/v1')
+              custom_config.openai_api_key = 'dummy'
+              context = RubyLLM::Context.new(custom_config)
+              @chats[model_name] = context.chat(model: actual_model, provider: 'openai', assume_model_exists: true)
+            elsif model_name.start_with?('lms/')
+              actual_model = model_name.sub('lms/', '')
+              lms_api_base = ENV.fetch('LMS_API_BASE', 'http://localhost:1234/v1')
+
+              # Validate model exists in LM Studio
+              validate_lms_model!(actual_model, lms_api_base)
+
+              custom_config = RubyLLM.config.dup
+              custom_config.openai_api_base = lms_api_base
+              custom_config.openai_api_key = 'dummy'
+              context = RubyLLM::Context.new(custom_config)
+              @chats[model_name] = context.chat(model: actual_model, provider: 'openai', assume_model_exists: true)
+            else
+              @chats[model_name] = RubyLLM.chat(model: model_name)
+            end
+
+            # Re-add tools if they were previously loaded
+            if @tools && !@tools.empty? && @chats[model_name].model&.supports_functions?
+              @chats[model_name].with_tools(*@tools)
+            end
+          rescue StandardError => e
+            # If we can't create a new chat, keep the old one but clear its context
+            warn "Warning: Could not recreate chat for #{model_name}: #{e.message}. Keeping existing instance."
+            @chats[model_name] = old_chats[model_name]
+            # Clear the old chat's messages if possible
+            if @chats[model_name] && @chats[model_name].instance_variable_defined?(:@messages)
+              @chats[model_name].instance_variable_set(:@messages, [])
+            end
+          end
        end
      rescue StandardError => e
-
-
+        # If something went terribly wrong, restore the old chats but clear their contexts
+        warn "Warning: Error during context clearing: #{e.message}. Attempting to recover."
+        @chats = old_chats
+        @chats.each_value do |chat|
+          if chat.instance_variable_defined?(:@messages)
+            chat.instance_variable_set(:@messages, [])
+          end
+        end
      end
 
      # Option 4: Call official clear_history method if it exists
@@ -523,6 +600,44 @@ module AIA
    end
 
 
+    def validate_lms_model!(model_name, api_base)
+      require 'net/http'
+      require 'json'
+
+      # Build the /v1/models endpoint URL
+      uri = URI("#{api_base.gsub(%r{/v1/?$}, '')}/v1/models")
+
+      begin
+        response = Net::HTTP.get_response(uri)
+
+        unless response.is_a?(Net::HTTPSuccess)
+          raise "Cannot connect to LM Studio at #{api_base}. Is LM Studio running?"
+        end
+
+        data = JSON.parse(response.body)
+        available_models = data['data']&.map { |m| m['id'] } || []
+
+        unless available_models.include?(model_name)
+          error_msg = "❌ '#{model_name}' is not a valid LM Studio model.\n\n"
+          if available_models.empty?
+            error_msg += "No models are currently loaded in LM Studio.\n"
+            error_msg += "Please load a model in LM Studio first."
+          else
+            error_msg += "Available LM Studio models:\n"
+            available_models.each { |m| error_msg += "  - lms/#{m}\n" }
+          end
+          raise error_msg
+        end
+      rescue JSON::ParserError => e
+        raise "Invalid response from LM Studio at #{api_base}: #{e.message}"
+      rescue StandardError => e
+        # Re-raise our custom error messages, wrap others
+        raise if e.message.start_with?('❌')
+        raise "Error connecting to LM Studio: #{e.message}"
+      end
+    end
+
+
    def extract_models_config
      models_config = AIA.config.model
 
@@ -556,15 +671,30 @@ module AIA
    def text_to_text_single(prompt, model_name)
      chat_instance = @chats[model_name]
      text_prompt = extract_text_prompt(prompt)
+
+      puts "[DEBUG RubyLLMAdapter] Sending to model #{model_name}: #{text_prompt[0..100]}..." if AIA.config.debug
+
      response = if AIA.config.context_files.empty?
        chat_instance.ask(text_prompt)
      else
        chat_instance.ask(text_prompt, with: AIA.config.context_files)
      end
 
+      # Debug output to understand the response structure
+      puts "[DEBUG RubyLLMAdapter] Response class: #{response.class}" if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter] Response inspect: #{response.inspect[0..500]}..." if AIA.config.debug
+
+      if response.respond_to?(:content)
+        puts "[DEBUG RubyLLMAdapter] Response content: #{response.content[0..200]}..." if AIA.config.debug
+      else
+        puts "[DEBUG RubyLLMAdapter] Response (no content method): #{response.to_s[0..200]}..." if AIA.config.debug
+      end
+
      # Return the full response object to preserve token information
      response
    rescue StandardError => e
+      puts "[DEBUG RubyLLMAdapter] Error in text_to_text_single: #{e.class} - #{e.message}" if AIA.config.debug
+      puts "[DEBUG RubyLLMAdapter] Backtrace: #{e.backtrace[0..5].join("\n")}" if AIA.config.debug
      e.message
    end
 
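The `validate_lms_model!` method added above builds its error listing from LM Studio's OpenAI-compatible `GET /v1/models` endpoint. A standalone sketch of the same lookup, assuming an LM Studio server on the default endpoint (override via `LMS_API_BASE`); this is an illustration, not code shipped in the gem:

```ruby
# List the models a local LM Studio server reports, with the lms/ prefix aia expects.
require 'net/http'
require 'json'
require 'uri'

api_base = ENV.fetch('LMS_API_BASE', 'http://localhost:1234/v1')
uri      = URI("#{api_base.gsub(%r{/v1/?$}, '')}/v1/models")

response = Net::HTTP.get_response(uri)
abort "Cannot connect to LM Studio at #{api_base}" unless response.is_a?(Net::HTTPSuccess)

ids = JSON.parse(response.body).fetch('data', []).map { |m| m['id'] }
if ids.empty?
  puts "No models are currently loaded in LM Studio."
else
  puts "Models usable with aia (note the lms/ prefix):"
  ids.each { |id| puts "  - lms/#{id}" }
end
```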
data/lib/aia/session.rb
CHANGED
@@ -418,23 +418,23 @@ module AIA
 
    def handle_clear_directive
      # The directive processor has called context_manager.clear_context
-      # but we need
+      # but we need to also clear the LLM client's context
 
      # First, clear the context manager's context
      @context_manager.clear_context(keep_system_prompt: true)
 
      # Second, try clearing the client's context
      if AIA.config.client && AIA.config.client.respond_to?(:clear_context)
-
+        begin
+          AIA.config.client.clear_context
+        rescue => e
+          STDERR.puts "Warning: Error clearing client context: #{e.message}"
+          # Continue anyway - the context manager has been cleared which is the main goal
+        end
      end
 
-      #
-      #
-      begin
-        AIA.config.client = AIA::RubyLLMAdapter.new
-      rescue => e
-        STDERR.puts "Error reinitializing client: #{e.message}"
-      end
+      # Note: We intentionally do NOT reinitialize the client here
+      # as that could cause termination if model initialization fails
 
      @ui_presenter.display_info("Chat context cleared.")
      nil
@@ -448,16 +448,27 @@ module AIA
    def handle_restore_directive(directive_output)
      # If the restore was successful, we also need to refresh the client's context
      if directive_output.start_with?("Context restored")
-        #
+        # Clear the client's context without reinitializing the entire adapter
+        # This avoids the risk of exiting if model initialization fails
        if AIA.config.client && AIA.config.client.respond_to?(:clear_context)
-
+          begin
+            AIA.config.client.clear_context
+          rescue => e
+            STDERR.puts "Warning: Error clearing client context after restore: #{e.message}"
+            # Continue anyway - the context manager has been restored which is the main goal
+          end
        end
 
-        #
-
-
-
-
+        # Rebuild the conversation in the LLM client from the restored context
+        # This ensures the LLM's internal state matches what we restored
+        if AIA.config.client && @context_manager
+          begin
+            restored_context = @context_manager.get_context
+            # The client's context has been cleared, so we can safely continue
+            # The next interaction will use the restored context from context_manager
+          rescue => e
+            STDERR.puts "Warning: Error syncing restored context: #{e.message}"
+          end
        end
      end
 
data/lib/extensions/ruby_llm/provider_fix.rb
ADDED
@@ -0,0 +1,34 @@
+# lib/extensions/ruby_llm/provider_fix.rb
+#
+# Monkey patch to fix LM Studio compatibility with RubyLLM Provider
+# LM Studio sometimes returns response.body as a String that fails JSON parsing
+# This causes "String does not have #dig method" errors in parse_error
+
+module RubyLLM
+  class Provider
+    # Override the parse_error method to handle String responses from LM Studio
+    def parse_error(response)
+      return if response.body.empty?
+
+      body = try_parse_json(response.body)
+
+      # Be more explicit about type checking to prevent String#dig errors
+      case body
+      when Hash
+        # Only call dig if we're certain it's a Hash
+        body.dig('error', 'message')
+      when Array
+        # Only call dig on array elements if they're Hashes
+        body.filter_map do |part|
+          part.is_a?(Hash) ? part.dig('error', 'message') : part.to_s
+        end.join('. ')
+      else
+        # For Strings or any other type, convert to string
+        body.to_s
+      end
+    rescue StandardError => e
+      # Fallback in case anything goes wrong
+      "Error parsing response: #{e.message}"
+    end
+  end
+end
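The patch above only calls `#dig` once the parsed body is known to be a Hash (or a Hash element inside an Array), falling back to `to_s` for bare strings. A standalone illustration of that branching; `extract_message` is a hypothetical stand-in and does not reproduce RubyLLM's `try_parse_json` or its response objects:

```ruby
# Same type-checked branching as provider_fix.rb, on hand-written sample bodies.
def extract_message(body)
  case body
  when Hash
    body.dig('error', 'message')
  when Array
    body.filter_map { |part| part.is_a?(Hash) ? part.dig('error', 'message') : part.to_s }.join('. ')
  else
    body.to_s # LM Studio sometimes returns a bare String here
  end
end

puts extract_message({ 'error' => { 'message' => 'model not loaded' } })          # => model not loaded
puts extract_message([{ 'error' => { 'message' => 'bad request' } }, 'raw text']) # => bad request. raw text
puts extract_message('502 Bad Gateway from local server')                         # => passed through unchanged
```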
data/mkdocs.yml
CHANGED
@@ -151,6 +151,7 @@ nav:
     - Getting Started: guides/getting-started.md
     - Chat Mode: guides/chat.md
     - Working with Models: guides/models.md
+    - Local Models: guides/local-models.md
     - Available Models: guides/available-models.md
     - Image Generation: guides/image-generation.md
     - Tools Integration: guides/tools.md
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: aia
 version: !ruby/object:Gem::Version
-  version: 0.9.16
+  version: 0.9.17
 platform: ruby
 authors:
 - Dewayne VanHoozer
@@ -289,6 +289,20 @@ dependencies:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
+- !ruby/object:Gem::Dependency
+  name: simplecov_lcov_formatter
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 - !ruby/object:Gem::Dependency
   name: tocer
   requirement: !ruby/object:Gem::Requirement
@@ -303,6 +317,20 @@ dependencies:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
+- !ruby/object:Gem::Dependency
+  name: webmock
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description: 'AIA is a revolutionary CLI console application that brings multi-model
   AI capabilities to your command line, supporting 20+ providers including OpenAI,
   Anthropic, and Google. Run multiple AI models simultaneously for comparison, get
@@ -354,6 +382,7 @@ files:
 - docs/guides/getting-started.md
 - docs/guides/image-generation.md
 - docs/guides/index.md
+- docs/guides/local-models.md
 - docs/guides/models.md
 - docs/guides/tools.md
 - docs/index.md
@@ -416,6 +445,7 @@ files:
 - lib/extensions/openstruct_merge.rb
 - lib/extensions/ruby_llm/.irbrc
 - lib/extensions/ruby_llm/modalities.rb
+- lib/extensions/ruby_llm/provider_fix.rb
 - lib/refinements/string.rb
 - main.just
 - mcp_servers/README.md