ask-llm-providers 0.1.5 → 0.1.6
- checksums.yaml +4 -4
- data/lib/ask/llm/version.rb +1 -1
- data/lib/ask/skills/providers.model_select/SKILL.md +164 -0
- metadata +2 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: cf20ba02f9ecd52cdbc18f93ff8a22f75e9d33d845dd80ab7880d35454cd27c0
+  data.tar.gz: 982162bc42de9c71dfe83dabdd40b69a95b618ead9a056ffcafcaee8c77a5bd7
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f4a361c8bf519e443793c31c4f21dc4f1a5872b4b64f1f7483499f22328a4db091fa6b4e287c0e56ece26777a5ea00e6a3a099c428def8a1cd91a5f61d88a74b
+  data.tar.gz: 93d0092ed08a53a12be9e822e7f8b4dc889695c39cd3f82f6167422110c8318c0e923b7f1cd1b00aa7104d5fb9f947490a43217ae35af703c570e7a5d8eb918a
data/lib/ask/llm/version.rb
CHANGED
data/lib/ask/skills/providers.model_select/SKILL.md
ADDED
@@ -0,0 +1,164 @@
---
name: providers.model_select
description: How to select the right LLM model for a task — balancing cost, capability, latency, and context window
---

Use this skill when choosing an LLM model for a specific task. The ask-rb
ecosystem uses `Ask::ModelCatalog` to resolve models and check capabilities.

## Step 1: Classify the Task

Determine what kind of task you're solving:

| Task Type | Examples | Key Requirements |
|-----------|----------|-----------------|
| **Simple chat** | Q&A, summarization, translation | Speed, low cost |
| **Code generation** | Write functions, review PRs | Strong coding, large context |
| **Reasoning/analysis** | Debugging, architecture, planning | Deep reasoning, structured output |
| **Structured extraction** | Parse logs, extract data | JSON mode, function calling |
| **Vision/multimodal** | Screenshot analysis, document OCR | Image input support |
| **Long document** | Analyze 100+ page docs | Large context window (200K+) |
| **Embeddings** | Semantic search, RAG | Embedding model, dimensions |
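
The classification table can be captured as a small lookup so that later steps can check requirements programmatically. This is a minimal sketch in plain Ruby, not part of the ask-rb API: the `TASK_REQUIREMENTS` constant, the `requirements_for` helper, and the task-type symbols are hypothetical names chosen for illustration. The capability symbols mirror the ones this skill passes to `supports?`.

```ruby
# Hypothetical lookup (not part of ask-rb): task type => requirements to check.
TASK_REQUIREMENTS = {
  simple_chat:           { capabilities: [], min_context: nil, note: "speed, low cost" },
  code_generation:       { capabilities: [], min_context: nil, note: "strong coding, large context" },
  reasoning:             { capabilities: %i[reasoning structured_output], min_context: nil, note: "deep reasoning" },
  structured_extraction: { capabilities: %i[structured_output function_calling], min_context: nil, note: "JSON mode, function calling" },
  vision:                { capabilities: %i[vision], min_context: nil, note: "image input support" },
  long_document:         { capabilities: [], min_context: 200_000, note: "large context window" },
  embeddings:            { capabilities: [], min_context: nil, note: "embedding model, dimensions" }
}.freeze

# Look up the requirements for a task type, failing loudly on unknown types.
def requirements_for(task_type)
  TASK_REQUIREMENTS.fetch(task_type) do
    raise ArgumentError, "unknown task type: #{task_type.inspect}"
  end
end

puts requirements_for(:structured_extraction)[:capabilities].inspect
```

Raising on an unknown task type is a deliberate choice in this sketch: it surfaces a misclassified task instead of falling through to a default model.
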

## Step 2: Query the Model Catalog

Access available models through the catalog:

```ruby
# List all models
Ask::ModelCatalog.all

# Filter by capability
Ask::ModelCatalog.chat_models
Ask::ModelCatalog.by_provider("openai")
Ask::ModelCatalog.by_family("gpt")
Ask::ModelCatalog.embedding_models
```

Find a specific model by ID:

```ruby
model = Ask::ModelCatalog.find("gpt-4o")
model.context_window                 # => 128000
model.max_output_tokens              # => 16384
model.supports?(:function_calling)   # => true
model.capabilities                   # => ["function_calling", "structured_output", "reasoning", "vision"]
model.modalities                     # => { input: ["text", "image"], output: ["text"] }
```

If the catalog doesn't have the model you need, refresh:

```ruby
Ask::ModelCatalog.refresh!
```
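
Catalog results can be narrowed with ordinary `Enumerable` filtering. The sketch below is a guess at how that might look, not the library's own method: `FakeModel` is a stand-in that only mirrors the `supports?`, `capabilities`, and `context_window` readers shown above, the `candidates` helper is hypothetical, and the model IDs and numbers are made up. With the real gem you would presumably pass something like `Ask::ModelCatalog.chat_models` instead of the hand-built array.

```ruby
# Stand-in for a catalog entry (assumption: real entries respond to these readers).
FakeModel = Struct.new(:id, :context_window, :capabilities, keyword_init: true) do
  def supports?(capability)
    capabilities.include?(capability.to_s)
  end
end

# Keep models that support every required capability and have enough context.
def candidates(models, required: [], min_context: 0)
  models.select do |model|
    model.context_window >= min_context &&
      required.all? { |capability| model.supports?(capability) }
  end
end

models = [
  FakeModel.new(id: "model-a", context_window: 128_000, capabilities: %w[function_calling vision]),
  FakeModel.new(id: "model-b", context_window: 16_000,  capabilities: %w[function_calling])
]

picked = candidates(models, required: [:function_calling], min_context: 32_000)
puts picked.map(&:id).inspect
```
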

## Step 3: Evaluate Cost vs Capability

Use pricing data from the catalog:

```ruby
model = Ask::ModelCatalog.find("gpt-4o")
pricing = model.pricing.dig(:text_tokens, :standard)
pricing[:input_per_million]    # $ per 1M input tokens
pricing[:output_per_million]   # $ per 1M output tokens
```

**Cost comparison (approximate):**

| Tier | Models | Cost/M tokens (in) | Best For |
|------|--------|-------------------|----------|
| **Frontier** | GPT-4o, Claude 4 Sonnet, Gemini 2.5 Pro | $3-15 | Complex reasoning, code generation |
| **Fast/Cheap** | GPT-4o-mini, Claude 4 Haiku, Gemini 2.5 Flash | $0.15-1.00 | Simple chat, extraction, classification |
| **Reasoning** | o3, o4-mini, DeepSeek R1 | $2-10 | Deep analysis, math, multi-step tasks |
| **Specialized** | Embedding, image, audio models | Varies | Non-chat tasks |
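
Per-million-token prices turn into a per-request estimate with simple arithmetic. The sketch below assumes the two price fields behave like the `input_per_million` and `output_per_million` values shown above; the `estimate_cost` helper is hypothetical, and the prices in the example are illustrative placeholders rather than quoted rates.

```ruby
# Estimate the dollar cost of one request from per-million-token prices.
def estimate_cost(input_tokens:, output_tokens:, input_per_million:, output_per_million:)
  (input_tokens * input_per_million + output_tokens * output_per_million) / 1_000_000.0
end

# Placeholder prices: $3 per 1M input tokens, $15 per 1M output tokens.
cost = estimate_cost(
  input_tokens: 20_000,
  output_tokens: 1_000,
  input_per_million: 3.0,
  output_per_million: 15.0
)
puts format("$%.4f per request", cost)
```

Multiplying the result by the expected request volume gives a rough monthly figure, which is often what separates the Frontier tier from the Fast/Cheap tier in practice.
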

## Step 4: Match Capabilities to Task Requirements

Check if a model supports the features you need:

```ruby
model.supports?(:function_calling)   # For tool use
model.supports?(:structured_output)  # For JSON mode
model.supports?(:vision)             # For image analysis
model.supports?(:reasoning)          # For complex reasoning
```

**Capability requirements by task:**

| Need | Check | Fallback |
|------|-------|----------|
| Tool calling | `supports?(:function_calling)` | Use text instruction instead |
| JSON output | `supports?(:structured_output)` | Prompt-engineering |
| Image processing | `modalities[:input].include?("image")` | Describe image in text |
| Audio processing | `modalities[:input].include?("audio")` | Transcribe first |
| Deep reasoning | `supports?(:reasoning)` | Chain-of-thought prompting |
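
The check-then-fallback pattern in the table can be written as one pass over the needs of a task. This is a sketch under assumptions: `StubModel` stands in for a catalog entry and only implements `supports?`, while `FALLBACKS` and `plan_for` are hypothetical names. It covers only the rows that use `supports?`; the modality rows would need a similar check against `modalities[:input]`.

```ruby
# Fallbacks copied from the table above, keyed by capability symbol.
FALLBACKS = {
  function_calling:  "use text instruction instead",
  structured_output: "prompt-engineering",
  reasoning:         "chain-of-thought prompting"
}.freeze

# Stand-in for a catalog entry (assumption: real entries respond to supports?).
StubModel = Struct.new(:capabilities) do
  def supports?(capability)
    capabilities.include?(capability.to_s)
  end
end

# For each need, report :native support or the fallback strategy to use.
def plan_for(model, needs)
  needs.to_h do |need|
    [need, model.supports?(need) ? :native : FALLBACKS.fetch(need, "no fallback listed")]
  end
end

plan = plan_for(StubModel.new(%w[function_calling]), %i[function_calling reasoning])
puts plan.inspect
```
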

## Step 5: Consider Context Window Requirements

Choose context window based on your input size:

```ruby
model.context_window  # total tokens the model can process
```

**Guidelines:**
- **8K-16K** — Simple Q&A, short conversations
- **32K-64K** — Code review, medium documents, multi-turn conversations
- **100K-200K** — Large codebases, long documents, RAG with many chunks
- **1M-2M** — Gemini 2.5 Pro, Gemini 2.0 Flash for massive documents

Be aware that large context windows increase latency and cost even if you don't
use them all.
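
A quick fit check before sending a request helps pick the smallest adequate window. The sketch below is an approximation under stated assumptions: the ratio of roughly four characters per token is a common heuristic for English text, not an exact tokenizer, and both helper names are hypothetical. The window and output sizes reuse the `gpt-4o` figures shown in Step 2.

```ruby
# Rough heuristic (assumption): about 4 characters per token for English text.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# The input plus the space reserved for the reply must fit in the window.
def fits_context?(input_tokens, context_window:, reserved_output:)
  input_tokens + reserved_output <= context_window
end

prompt = "Summarize the attached report in three bullet points."
puts estimate_tokens(prompt)
puts fits_context?(100_000, context_window: 128_000, reserved_output: 16_384)
puts fits_context?(120_000, context_window: 128_000, reserved_output: 16_384)
```

Reserving output space matters because the context window covers both the prompt and the generated reply, so an input that nearly fills the window leaves no room for an answer.
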

## Step 6: Pick the Right Embedding Model

For RAG and semantic search:

```ruby
Ask::ModelCatalog.embedding_models
```

**Recommendations:**
- **General purpose**: `text-embedding-3-large` (256-3072 dims)
- **Best accuracy**: `text-embedding-3-large` with 3072 dimensions
- **Fast/Cheap**: `text-embedding-3-small` (512 dimensions)
- **Multilingual**: `text-embedding-3-small` (supports 100+ languages)
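
Dimension count drives index size, which is one concrete way to weigh the recommendations above. The sketch assumes vectors are stored as 32-bit floats (4 bytes per dimension) with no compression or index overhead; the `index_bytes` helper is a hypothetical name, and the one-million-vector corpus is only an example.

```ruby
BYTES_PER_FLOAT32 = 4 # assumption: uncompressed float32 storage

# Raw storage needed for a set of embedding vectors, in bytes.
def index_bytes(dimensions:, vectors:)
  dimensions * BYTES_PER_FLOAT32 * vectors
end

[3072, 512].each do |dims|
  gigabytes = index_bytes(dimensions: dims, vectors: 1_000_000) / 1e9
  puts format("%4d dims: %.3f GB per million vectors", dims, gigabytes)
end
```
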

## Decision Tree

```
Task Type?
├── Simple chat / extraction
│   └── Fast model (GPT-4o-mini, Claude 4 Haiku)
│       → Cheapest adequate model
├── Code generation / review
│   └── Frontier model (GPT-4o, Claude 4 Sonnet)
│       → Needs function calling + max capability
├── Deep reasoning / debugging
│   └── Reasoning model (o4-mini, DeepSeek R1, o3)
│       → Needs chain-of-thought + analysis
├── Long document analysis
│   └── Large context (Gemini 2.5 Pro 1M, GPT-4o)
│       → Needs context window > input size
├── Multimodal (image/video)
│   └── Vision-capable (GPT-4o, Claude 4 Sonnet, Gemini 2.5)
│       → Check modalities[:input] includes image
├── Embeddings / RAG
│   └── text-embedding-3-large / small
│       → Not a chat model
└── Audio / Voice
    └── GPT-4o-audio, Gemini Audio
        → Check modalities[:output] includes audio
```
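
The same tree can be expressed as a `case` expression that returns a tier, which is then used to filter catalog results. This is only one possible encoding: the `recommended_tier` helper and all task-type and tier symbols are hypothetical names, not identifiers from ask-rb.

```ruby
# Map a task type to the model tier suggested by the decision tree.
def recommended_tier(task_type)
  case task_type
  when :simple_chat, :extraction       then :fast
  when :code_generation, :code_review  then :frontier
  when :reasoning, :debugging          then :reasoning
  when :long_document                  then :large_context
  when :multimodal                     then :vision
  when :embeddings                     then :embedding
  when :audio                          then :audio
  else
    raise ArgumentError, "unknown task type: #{task_type.inspect}"
  end
end

puts recommended_tier(:code_review)
```
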

## Provider Selection

Consider provider reliability and features:

| Provider | Strengths | Weaknesses |
|----------|-----------|------------|
| **OpenAI** | Best tool calling, broad model range | Higher cost for frontier |
| **Anthropic** | Excellent code, long context | Slower for simple tasks |
| **Google Gemini** | Massive context (1M+), fast | Fewer integration tools |
| **DeepSeek** | Cheap reasoning, open weights | Limited ecosystem |
| **Ollama** | Local, free, private | Slow, no hosted offerings |
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: ask-llm-providers
 version: !ruby/object:Gem::Version
-  version: 0.1.
+  version: 0.1.6
 platform: ruby
 authors:
 - Kaka Ruto
@@ -189,6 +189,7 @@ files:
 - lib/ask/provider/openai.rb
 - lib/ask/provider/opencode.rb
 - lib/ask/provider/opencode_go.rb
+- lib/ask/skills/providers.model_select/SKILL.md
 homepage: https://github.com/ask-rb/ask-llm-providers
 licenses:
 - MIT