ask-llm-providers 0.1.5 → 0.1.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 98e73dfa923583db079237386caab024766c6613b202623b73952d3233ddf3ae
-   data.tar.gz: 0aeef8d5eface2a50ffa4365c1d8e53fb47de3d0be54eb88c4ba53dea6fecfff
+   metadata.gz: cf20ba02f9ecd52cdbc18f93ff8a22f75e9d33d845dd80ab7880d35454cd27c0
+   data.tar.gz: 982162bc42de9c71dfe83dabdd40b69a95b618ead9a056ffcafcaee8c77a5bd7
  SHA512:
-   metadata.gz: 94dd687fc071961fe2495e61788861d676470b42e7352ff5f2d9c53455fe090d9d93c1b122a7b1d769f347d81b7059b44fffb18235fa7c66910b36b193b9f890
-   data.tar.gz: 88d1dbcb9f4fb92d8a447352c725c6e27e89bbc29f46f7d83e96856c9288ec18bcbda236b04e6e7eb85df901187e3f5fcef56c464f5b101b3a12a53534dc2f20
+   metadata.gz: f4a361c8bf519e443793c31c4f21dc4f1a5872b4b64f1f7483499f22328a4db091fa6b4e287c0e56ece26777a5ea00e6a3a099c428def8a1cd91a5f61d88a74b
+   data.tar.gz: 93d0092ed08a53a12be9e822e7f8b4dc889695c39cd3f82f6167422110c8318c0e923b7f1cd1b00aa7104d5fb9f947490a43217ae35af703c570e7a5d8eb918a
@@ -2,6 +2,6 @@

  module Ask
    module LLM
-     VERSION = "0.1.5"
+     VERSION = "0.1.6"
    end
  end
@@ -0,0 +1,164 @@
+ ---
+ name: providers.model_select
+ description: How to select the right LLM model for a task by balancing cost, capability, latency, and context window
+ ---
+
+ Use this skill when choosing an LLM model for a specific task. The ask-rb
+ ecosystem uses `Ask::ModelCatalog` to resolve models and check capabilities.
+
+ ## Step 1: Classify the Task
+
+ Determine what kind of task you're solving:
+
+ | Task Type | Examples | Key Requirements |
+ |-----------|----------|-----------------|
+ | **Simple chat** | Q&A, summarization, translation | Speed, low cost |
+ | **Code generation** | Write functions, review PRs | Strong coding, large context |
+ | **Reasoning/analysis** | Debugging, architecture, planning | Deep reasoning, structured output |
+ | **Structured extraction** | Parse logs, extract data | JSON mode, function calling |
+ | **Vision/multimodal** | Screenshot analysis, document OCR | Image input support |
+ | **Long document** | Analyze 100+ page docs | Large context window (200K+) |
+ | **Embeddings** | Semantic search, RAG | Embedding model, dimensions |
+
+ ## Step 2: Query the Model Catalog
+
+ Access available models through the catalog:
+
+ ```ruby
+ # List all models
+ Ask::ModelCatalog.all
+
+ # Filter by type, provider, or family
+ Ask::ModelCatalog.chat_models
+ Ask::ModelCatalog.by_provider("openai")
+ Ask::ModelCatalog.by_family("gpt")
+ Ask::ModelCatalog.embedding_models
+ ```
+
+ Find a specific model by ID:
+
+ ```ruby
+ model = Ask::ModelCatalog.find("gpt-4o")
+ model.context_window # => 128000
+ model.max_output_tokens # => 16384
+ model.supports?(:function_calling) # => true
+ model.capabilities # => ["function_calling", "structured_output", "reasoning", "vision"]
+ model.modalities # => { input: ["text", "image"], output: ["text"] }
+ ```
+
+ If the catalog doesn't list the model you need, refresh it:
+
+ ```ruby
+ Ask::ModelCatalog.refresh!
+ ```
+
+ ## Step 3: Evaluate Cost vs Capability
+
+ Use pricing data from the catalog:
+
+ ```ruby
+ model = Ask::ModelCatalog.find("gpt-4o")
+ pricing = model.pricing.dig(:text_tokens, :standard)
+ pricing[:input_per_million] # $ per 1M input tokens
+ pricing[:output_per_million] # $ per 1M output tokens
+ ```
+
+ **Cost comparison (approximate):**
+
+ | Tier | Models | Input cost per 1M tokens | Best For |
+ |------|--------|--------------------------|----------|
+ | **Frontier** | GPT-4o, Claude 4 Sonnet, Gemini 2.5 Pro | $3-15 | Complex reasoning, code generation |
+ | **Fast/Cheap** | GPT-4o-mini, Claude 4 Haiku, Gemini 2.5 Flash | $0.15-1.00 | Simple chat, extraction, classification |
+ | **Reasoning** | o3, o4-mini, DeepSeek R1 | $2-10 | Deep analysis, math, multi-step tasks |
+ | **Specialized** | Embedding, image, audio models | Varies | Non-chat tasks |
+
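As a rough illustration of the pricing lookup above, the sketch below estimates the dollar cost of a single request from per-million-token prices. The `estimate_cost` helper and the sample prices are hypothetical and not part of `Ask::ModelCatalog`; in practice the hash would presumably come from `model.pricing.dig(:text_tokens, :standard)`.

```ruby
# Hypothetical helper: estimate the cost of one request in dollars.
# `pricing` mirrors the hash shape shown above
# (:input_per_million and :output_per_million).
def estimate_cost(pricing, input_tokens:, output_tokens:)
  input_cost  = input_tokens  * pricing[:input_per_million]  / 1_000_000.0
  output_cost = output_tokens * pricing[:output_per_million] / 1_000_000.0
  (input_cost + output_cost).round(6)
end

# Sample prices for illustration only; real values come from the catalog.
sample_pricing = { input_per_million: 2.50, output_per_million: 10.00 }

cost = estimate_cost(sample_pricing, input_tokens: 12_000, output_tokens: 800)
puts format("estimated cost: $%.4f", cost)
```

Running the same estimate against a cheaper tier makes the trade-off concrete before committing to a frontier model.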
+ ## Step 4: Match Capabilities to Task Requirements
+
+ Check if a model supports the features you need:
+
+ ```ruby
+ model.supports?(:function_calling) # For tool use
+ model.supports?(:structured_output) # For JSON mode
+ model.supports?(:vision) # For image analysis
+ model.supports?(:reasoning) # For complex reasoning
+ ```
+
+ **Capability requirements by task:**
+
+ | Need | Check | Fallback |
+ |------|-------|----------|
+ | Tool calling | `supports?(:function_calling)` | Describe the tools in the prompt text |
+ | JSON output | `supports?(:structured_output)` | Prompt engineering |
+ | Image processing | `modalities[:input].include?("image")` | Describe image in text |
+ | Audio processing | `modalities[:input].include?("audio")` | Transcribe first |
+ | Deep reasoning | `supports?(:reasoning)` | Chain-of-thought prompting |
+
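The checks in the table can be combined into one pass that reports which requirements need a fallback. This is a minimal sketch: `StubModel` is a stand-in for whatever `Ask::ModelCatalog.find` returns, and `missing_requirements` is a hypothetical helper, not part of the gem.

```ruby
# Stand-in for a catalog entry. It assumes the real object exposes
# capabilities as strings and modalities as a hash, as shown in Step 2.
StubModel = Struct.new(:id, :capabilities, :modalities) do
  def supports?(capability)
    capabilities.include?(capability.to_s)
  end
end

# Hypothetical helper: list the requirements the model cannot meet natively.
def missing_requirements(model, capabilities: [], input_modalities: [])
  missing  = capabilities.reject { |c| model.supports?(c) }
  missing += input_modalities.reject { |m| model.modalities[:input].include?(m) }
  missing
end

model = StubModel.new(
  "example-model",
  ["function_calling", "structured_output"],
  { input: ["text"], output: ["text"] }
)

gaps = missing_requirements(model,
                            capabilities: [:function_calling, :reasoning],
                            input_modalities: ["image"])
puts "fall back for: #{gaps.inspect}"
```

An empty result means the model covers the task as-is; otherwise each entry points at a row in the fallback column.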
+ ## Step 5: Consider Context Window Requirements
+
+ Choose a context window based on your input size:
+
+ ```ruby
+ model.context_window # total tokens the model can process
+ ```
+
+ **Guidelines:**
+ - **8K-16K**: Simple Q&A, short conversations
+ - **32K-64K**: Code review, medium documents, multi-turn conversations
+ - **100K-200K**: Large codebases, long documents, RAG with many chunks
+ - **1M-2M**: Massive documents (Gemini 2.5 Pro, Gemini 2.0 Flash)
+
+ Latency and cost grow with the number of tokens you actually send, so filling
+ a large context window is slow and expensive. Send only what the task needs.
+
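A quick fit check can rule out models before any request is sent. The sketch below assumes `context_window` counts input and output tokens together, and it uses a rough four-characters-per-token heuristic in place of a real tokenizer; both helpers are hypothetical.

```ruby
# Hypothetical helper: does the prompt fit while leaving room for the reply?
# Assumes the context window covers input and output tokens combined.
def fits_context?(context_window:, input_tokens:, reserved_output_tokens:)
  input_tokens + reserved_output_tokens <= context_window
end

# Rough estimate (about 4 characters per token for English text).
# This is a heuristic, not a tokenizer.
def rough_token_count(text)
  (text.length / 4.0).ceil
end

document = "word " * 100_000 # 500,000 characters
tokens   = rough_token_count(document)

fits_128k = fits_context?(context_window: 128_000,
                          input_tokens: tokens,
                          reserved_output_tokens: 4_096)
fits_1m   = fits_context?(context_window: 1_000_000,
                          input_tokens: tokens,
                          reserved_output_tokens: 4_096)

puts "tokens=#{tokens} fits 128K? #{fits_128k} fits 1M? #{fits_1m}"
```

Reserving output tokens up front avoids picking a model whose window is consumed entirely by the prompt.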
+ ## Step 6: Pick the Right Embedding Model
+
+ For RAG and semantic search:
+
+ ```ruby
+ Ask::ModelCatalog.embedding_models
+ ```
+
+ **Recommendations:**
+ - **General purpose**: `text-embedding-3-large` (256-3072 dims)
+ - **Best accuracy**: `text-embedding-3-large` with 3072 dimensions
+ - **Fast/Cheap**: `text-embedding-3-small` (1536 dimensions by default; can be shortened)
+ - **Multilingual**: `text-embedding-3-small` (supports 100+ languages)
+
+ ## Decision Tree
+
+ ```
+ Task Type?
+ ├── Simple chat / extraction
+ │   └── Fast model (GPT-4o-mini, Claude 4 Haiku)
+ │       → Cheapest adequate model
+ ├── Code generation / review
+ │   └── Frontier model (GPT-4o, Claude 4 Sonnet)
+ │       → Needs function calling + max capability
+ ├── Deep reasoning / debugging
+ │   └── Reasoning model (o4-mini, DeepSeek R1, o3)
+ │       → Needs chain-of-thought + analysis
+ ├── Long document analysis
+ │   └── Large context (Gemini 2.5 Pro 1M, GPT-4o)
+ │       → Needs context window > input size
+ ├── Multimodal (image/video)
+ │   └── Vision-capable (GPT-4o, Claude 4 Sonnet, Gemini 2.5)
+ │       → Check modalities[:input] includes image
+ ├── Embeddings / RAG
+ │   └── text-embedding-3-large / small
+ │       → Not a chat model
+ └── Audio / Voice
+     └── GPT-4o-audio, Gemini Audio
+         → Check modalities[:output] includes audio
+ ```
+
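The tree above can also be expressed as a small lookup, which is convenient when the task type is already known in code. The `recommend_tier` method and its tier symbols are illustrative names chosen for this sketch, not part of the gem.

```ruby
# Hypothetical mapping of the decision tree to a tier symbol.
# Task-type and tier names are illustrative only.
def recommend_tier(task_type)
  case task_type
  when :simple_chat, :extraction      then :fast
  when :code_generation, :code_review then :frontier
  when :reasoning, :debugging         then :reasoning
  when :long_document                 then :large_context
  when :multimodal                    then :vision
  when :embeddings                    then :embedding
  when :audio                         then :audio
  else
    raise ArgumentError, "unknown task type: #{task_type.inspect}"
  end
end

puts recommend_tier(:code_review)
```

Raising on unknown task types keeps a typo from silently selecting a default tier.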
+ ## Provider Selection
+
+ Consider provider reliability and features:
+
+ | Provider | Strengths | Weaknesses |
+ |----------|-----------|------------|
+ | **OpenAI** | Best tool calling, broad model range | Higher cost for frontier |
+ | **Anthropic** | Excellent code, long context | Slower for simple tasks |
+ | **Google Gemini** | Massive context (1M+), fast | Fewer integration tools |
+ | **DeepSeek** | Cheap reasoning, open weights | Limited ecosystem |
+ | **Ollama** | Local, free, private | Slow, no hosted offerings |
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: ask-llm-providers
  version: !ruby/object:Gem::Version
-   version: 0.1.5
+   version: 0.1.6
  platform: ruby
  authors:
  - Kaka Ruto
@@ -189,6 +189,7 @@ files:
  - lib/ask/provider/openai.rb
  - lib/ask/provider/opencode.rb
  - lib/ask/provider/opencode_go.rb
+ - lib/ask/skills/providers.model_select/SKILL.md
  homepage: https://github.com/ask-rb/ask-llm-providers
  licenses:
  - MIT