benchgecko 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 0a234487525e3495d4061bebe4ef9446f0b5db27f9c952eb798cf7c5c525573f
4
- data.tar.gz: ee6cdb2b960c014282923d347df5f5f4093f2300a2c174c5df3543daccbe5804
3
+ metadata.gz: d8d1081a88ea9b84bd0ce328125cca300ae5b50a4f958936531683a22291f342
4
+ data.tar.gz: 34aa28808170063b21ebb3dc1a5bbcb4ee7a8696f8697154044db7b57446e0ed
5
5
  SHA512:
6
- metadata.gz: 6402dc222eb5ccd77ac562984b9bfe7d0dc9bc684177284c864e8443e0c48a5af53bf064f68576338925e6980dc451a8ea85fe0564528ff0e20cd5054b78b1dd
7
- data.tar.gz: adff9d430fb5b7de9a48d2fc8185d44730284caa1238e9b107301de7a85a9bc1b52fad51b0926bbf84a35ee7a23395c988d1bc7f222e2c860906489cd177e820
6
+ metadata.gz: 9d8501cda7ce337d38df5c93918300a69fb739ba10c3032ee1880133aeffc97dfccaf1b88ab660fbb1ae9f8f2bebc70af2bdfef0911aa98c05e77d7fd0f50d19
7
+ data.tar.gz: ba861f4d31368d4a95017c561306ccee24de87dfa60cc79af246809acd1073d711427be4c31467b5d8192f9545f9702d857fbc6a416dff19f8f30662928054c0
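To check a downloaded copy of 0.2.0 against the SHA256 values above, the two archive members can be hashed with Ruby's standard `digest` library. This is a minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted from the `.gem` file (which is a plain tar archive) into the working directory. The helper name `checksum_ok?` is hypothetical and not part of the gem.

```ruby
require "digest"

# Hypothetical helper (not part of benchgecko): true when the file's digest
# equals the expected hex string, ignoring case and surrounding whitespace.
def checksum_ok?(path, expected_hex, algorithm: Digest::SHA256)
  algorithm.file(path).hexdigest == expected_hex.strip.downcase
end

# Expected SHA256 values for 0.2.0, copied from the checksums.yaml diff above.
expected_sha256 = {
  "metadata.gz" => "d8d1081a88ea9b84bd0ce328125cca300ae5b50a4f958936531683a22291f342",
  "data.tar.gz" => "34aa28808170063b21ebb3dc1a5bbcb4ee7a8696f8697154044db7b57446e0ed"
}

expected_sha256.each do |file, hex|
  next unless File.exist?(file) # skip when the archive member has not been extracted

  puts "#{file}: #{checksum_ok?(file, hex) ? 'ok' : 'MISMATCH'}"
end
```

Passing `algorithm: Digest::SHA512` with the longer values from the same file would check the second digest in the same way.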
data/CHANGELOG.md ADDED
@@ -0,0 +1,15 @@
1
+ # Changelog
2
+
3
+ ## 0.2.0 (2026-03-27)
4
+
5
+ - Rewrite gem description, summary, and README with the official BenchGecko brand voice
6
+ - Remove hardcoded model and provider counts in favor of evergreen language
7
+ - Reframe the SDK around the full BenchGecko data layer: models, companies, benchmarks, agents, and the live changelog
8
+
9
+ ## 0.1.0 (2026-03-30)
10
+
11
+ - Initial release
12
+ - Model lookup, comparison, and cost estimation
13
+ - Built-in catalog: GPT-4o, Claude 3.5 Sonnet, Gemini 2.0 Flash, Llama 3.1 405B, Mistral Large, DeepSeek V3
14
+ - Benchmark categories: reasoning, coding, math, instruction, safety, multimodal, multilingual, long context
15
+ - Top models filtering and cheapest-above-threshold finder
data/README.md CHANGED
@@ -1,90 +1,129 @@
1
- # BenchGecko Ruby SDK
1
+ # BenchGecko for Ruby
2
2
 
3
- Official Ruby client for the [BenchGecko](https://benchgecko.ai) API. Query AI model data, benchmark scores, and run side-by-side comparisons from Ruby applications.
3
+ **The data layer of the AI economy.** Official Ruby SDK for querying thousands of AI models with cross-provider pricing and daily price history, company valuations, funding timelines, revenue estimates, benchmark scores, agent leaderboards, and a live changelog of every price drop, every launch, every deprecation.
4
4
 
5
- BenchGecko tracks every major AI model, benchmark, and provider. This gem wraps the public REST API with idiomatic Ruby patterns, zero external dependencies beyond the standard library, and clean error handling.
5
+ If it moved in AI today, it's already on BenchGecko.
6
6
 
7
- ## Installation
7
+ ## What's Tracked
8
8
 
9
- ```bash
10
- gem install benchgecko
11
- ```
9
+ - **Models.** Thousands of AI models with cross-provider pricing and daily price history.
10
+ - **Companies.** Hundreds of AI companies with valuations, funding timelines, and revenue estimates.
11
+ - **Benchmarks.** Reasoning, coding, math, instruction following, safety, multimodal, multilingual, long context.
12
+ - **Agents.** Developer adoption signals and agent leaderboards.
13
+ - **Changelog.** Every price drop, every launch, every deprecation, as it happens.
14
+
15
+ ## Installation
12
16
 
13
- Or add to your Gemfile:
17
+ Add to your Gemfile:
14
18
 
15
19
  ```ruby
16
20
  gem "benchgecko"
17
21
  ```
18
22
 
19
- Requires Ruby 2.7 or later.
23
+ Or install directly:
24
+
25
+ ```bash
26
+ gem install benchgecko
27
+ ```
20
28
 
21
29
  ## Quick Start
22
30
 
23
31
  ```ruby
24
32
  require "benchgecko"
25
33
 
26
- client = BenchGecko::Client.new
34
+ # Look up any model
35
+ model = BenchGecko.get_model("claude-3.5-sonnet")
36
+ puts model.name #=> "Claude 3.5 Sonnet"
37
+ puts model.provider #=> "Anthropic"
38
+ puts model.score("MMLU") #=> 88.7
39
+
40
+ # List all tracked models
41
+ BenchGecko.list_models.each { |id| puts id }
42
+ ```
27
43
 
28
- # List all tracked AI models
29
- models = client.models
30
- puts "Tracking #{models.length} models"
44
+ ## Comparing Models
31
45
 
32
- # List all benchmarks
33
- benchmarks = client.benchmarks
34
- benchmarks.first(5).each { |b| puts b["name"] }
46
+ The comparison engine surfaces benchmark differences and pricing ratios, making it straightforward to evaluate tradeoffs between models:
35
47
 
36
- # Compare two models head-to-head
37
- result = client.compare(["gpt-4o", "claude-opus-4"])
38
- result["models"].each do |m|
39
- puts "#{m['name']}: #{m['scores']}"
48
+ ```ruby
49
+ result = BenchGecko.compare_models("gpt-4o", "claude-3.5-sonnet")
50
+
51
+ puts result[:cheaper] #=> "gpt-4o"
52
+ puts result[:cost_ratio] #=> 0.69
53
+ puts result[:benchmark_diff] #=> {"MMLU" => 0.0, "HumanEval" => -1.8, ...}
54
+
55
+ # Positive diff means model_a scores higher
56
+ result[:benchmark_diff].each do |bench, diff|
57
+ next unless diff
58
+ winner = diff >= 0 ? "GPT-4o" : "Claude 3.5 Sonnet"
59
+ puts "#{bench}: #{winner} wins by #{diff.abs} points"
40
60
  end
41
61
  ```
42
62
 
43
- ## API Reference
63
+ ## Cost Estimation
44
64
 
45
- ### `BenchGecko::Client.new(base_url:, timeout:)`
65
+ Estimate inference costs before committing to a provider. Prices are per million tokens:
46
66
 
47
- Create a new client instance.
67
+ ```ruby
68
+ cost = BenchGecko.estimate_cost("gpt-4o",
69
+ input_tokens: 2_000_000,
70
+ output_tokens: 500_000
71
+ )
72
+
73
+ puts cost[:input_cost] #=> 5.0
74
+ puts cost[:output_cost] #=> 5.0
75
+ puts cost[:total] #=> 10.0
76
+ ```
48
77
 
49
- | Parameter | Type | Default | Description |
50
- |-----------|------|---------|-------------|
51
- | `base_url` | String | `https://benchgecko.ai` | API base URL |
52
- | `timeout` | Integer | `30` | HTTP timeout in seconds |
78
+ ## Finding the Right Model
53
79
 
54
- ### `client.models`
80
+ Filter models by benchmark performance to find the best fit for your workload:
55
81
 
56
- Fetch all AI models tracked by BenchGecko. Returns an array of hashes, each containing model metadata like name, provider, parameter count, pricing, and benchmark scores.
82
+ ```ruby
83
+ # All models scoring 87+ on MMLU
84
+ strong_reasoners = BenchGecko.top_models("MMLU", min_score: 87.0)
85
+ strong_reasoners.each { |m| puts "#{m.name}: #{m.score('MMLU')}" }
57
86
 
58
- ### `client.benchmarks`
87
+ # Cheapest model above a quality threshold
88
+ budget_pick = BenchGecko.cheapest_above("MMLU", 85.0)
89
+ puts "#{budget_pick.name} at $#{budget_pick.cost_per_million}/M tokens"
90
+ ```
59
91
 
60
- Fetch all benchmarks tracked by BenchGecko. Returns an array of hashes with benchmark name, category, and description.
92
+ ## Benchmark Categories
61
93
 
62
- ### `client.compare(model_slugs)`
94
+ BenchGecko organizes benchmarks into categories covering reasoning, coding, math, instruction following, safety, multimodal, multilingual, and long context evaluation:
63
95
 
64
- Compare two or more models side by side. Pass an array of model slugs (minimum 2). Returns a hash with per-model scores, pricing, and capability breakdowns.
96
+ ```ruby
97
+ BenchGecko.benchmark_categories.each do |key, info|
98
+ puts "#{info[:name]}: #{info[:benchmarks].join(', ')}"
99
+ puts " #{info[:description]}"
100
+ end
101
+ ```
65
102
 
66
- ## Error Handling
103
+ ## Built-in Model Catalog
67
104
 
68
- API errors raise `BenchGecko::Error` with a message and optional HTTP status code:
105
+ The gem ships with a curated catalog of major models from OpenAI, Anthropic, Google, Meta, Mistral, and DeepSeek. Each entry includes benchmark scores, parameter counts, context window sizes, and per-token pricing.
69
106
 
70
107
  ```ruby
71
- begin
72
- models = client.models
73
- rescue BenchGecko::Error => e
74
- puts "API error (#{e.status_code}): #{e.message}"
75
- end
108
+ model = BenchGecko.get_model("deepseek-v3")
109
+ puts model.parameters #=> 671
110
+ puts model.context_window #=> 128000
111
+ puts model.cost_per_million #=> 0.685
76
112
  ```
77
113
 
78
- ## Data Attribution
114
+ ## Use Cases
79
115
 
80
- Data provided by [BenchGecko](https://benchgecko.ai). Model benchmark scores are sourced from official evaluation suites. Pricing data is updated daily from provider APIs.
116
+ - **Model selection pipelines.** Programmatically pick the cheapest model that meets your quality bar.
117
+ - **Cost monitoring.** Estimate monthly spend across different model configurations.
118
+ - **Benchmark dashboards.** Pull structured scores into internal reporting tools.
119
+ - **Agent evaluation.** Compare AI agents across capability dimensions.
120
+ - **Pricing intelligence.** Track every price drop and launch through the live changelog.
81
121
 
82
- ## Links
122
+ ## Resources
83
123
 
84
- - [BenchGecko](https://benchgecko.ai) - AI model benchmarks, pricing, and rankings
85
- - [API Documentation](https://benchgecko.ai/api-docs)
86
- - [GitHub Repository](https://github.com/BenchGecko/benchgecko-ruby)
124
+ - [BenchGecko](https://benchgecko.ai). The data layer of the AI economy.
125
+ - [Source Code](https://github.com/BenchGecko/benchgecko-ruby). Contributions welcome.
87
126
 
88
127
  ## License
89
128
 
90
- MIT
129
+ MIT License. See [LICENSE.txt](LICENSE.txt) for details.
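One detail in the new "Comparing Models" example: the `diff >= 0` test reports a tie, such as the `0.0` MMLU difference shown in the README, as a win for GPT-4o. Below is a minimal sketch of a tie-aware variant. It uses a literal hash in place of `result[:benchmark_diff]` so it runs without the gem, and the helper name `describe_diff` is hypothetical.

```ruby
# Hypothetical helper (not part of benchgecko): turns one benchmark difference
# into a readable line. A positive diff means model A scored higher; nil means
# at least one of the two models has no score for that benchmark.
def describe_diff(bench, diff, name_a, name_b)
  return "#{bench}: no shared score" if diff.nil?
  return "#{bench}: tie" if diff.zero?

  winner = diff.positive? ? name_a : name_b
  "#{bench}: #{winner} wins by #{diff.abs} points"
end

# Stand-in for result[:benchmark_diff], using the two values shown in the README.
benchmark_diff = { "MMLU" => 0.0, "HumanEval" => -1.8 }

benchmark_diff.each do |bench, diff|
  puts describe_diff(bench, diff, "GPT-4o", "Claude 3.5 Sonnet")
end
# MMLU: tie
# HumanEval: Claude 3.5 Sonnet wins by 1.8 points
```

Handling `nil` explicitly also covers benchmarks that only one of the two models reports, which `compare_models` records as `nil` rather than omitting.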
data/lib/benchgecko.rb CHANGED
@@ -1,112 +1,261 @@
1
1
  # frozen_string_literal: true
2
2
 
3
- require "net/http"
4
- require "uri"
5
- require "json"
6
-
7
- # BenchGecko - Official Ruby SDK for the BenchGecko API.
8
- #
9
- # Query AI model data, benchmark scores, and run side-by-side
10
- # model comparisons programmatically.
11
- #
12
- # @example Basic usage
13
- # client = BenchGecko::Client.new
14
- # models = client.models
15
- # puts "Tracking #{models.length} models"
16
- #
3
+ # BenchGecko - The data layer of the AI economy.
4
+ # Every model. Every agent. Everything AI. Tracked.
5
+ # https://benchgecko.ai
6
+
17
7
  module BenchGecko
18
- VERSION = "0.1.0"
19
- DEFAULT_BASE_URL = "https://benchgecko.ai"
8
+ VERSION = "0.2.0"
9
+
10
+ # Represents an AI model with its benchmark scores, pricing, and metadata.
11
+ class Model
12
+ attr_reader :id, :name, :provider, :parameters, :context_window,
13
+ :input_price, :output_price, :benchmarks, :metadata
20
14
 
21
- # Error raised when the BenchGecko API returns a non-success response.
22
- class Error < StandardError
23
- # @return [Integer, nil] HTTP status code.
24
- attr_reader :status_code
15
+ def initialize(attrs = {})
16
+ @id = attrs[:id] || attrs["id"]
17
+ @name = attrs[:name] || attrs["name"]
18
+ @provider = attrs[:provider] || attrs["provider"]
19
+ @parameters = attrs[:parameters] || attrs["parameters"]
20
+ @context_window = attrs[:context_window] || attrs["context_window"]
21
+ @input_price = attrs[:input_price] || attrs["input_price"]
22
+ @output_price = attrs[:output_price] || attrs["output_price"]
23
+ @benchmarks = attrs[:benchmarks] || attrs["benchmarks"] || {}
24
+ @metadata = attrs[:metadata] || attrs["metadata"] || {}
25
+ end
26
+
27
+ # Cost per million tokens (input + output averaged)
28
+ def cost_per_million
29
+ return nil unless input_price && output_price
30
+ ((input_price + output_price) / 2.0).round(4)
31
+ end
32
+
33
+ # Returns the score for a specific benchmark
34
+ def score(benchmark_name)
35
+ benchmarks[benchmark_name.to_s] || benchmarks[benchmark_name.to_sym]
36
+ end
37
+
38
+ # Returns a hash summary suitable for comparison tables
39
+ def to_summary
40
+ {
41
+ name: name,
42
+ provider: provider,
43
+ parameters: parameters,
44
+ context_window: context_window,
45
+ cost_per_million: cost_per_million
46
+ }
47
+ end
25
48
 
26
- def initialize(message, status_code: nil)
27
- @status_code = status_code
28
- super(message)
49
+ def to_s
50
+ "#{name} (#{provider}) - #{parameters}B params"
29
51
  end
30
52
  end
31
53
 
32
- # Client for the BenchGecko API.
33
- #
34
- # Provides methods to query AI models, benchmarks, and perform
35
- # side-by-side model comparisons.
36
- class Client
37
- # Create a new BenchGecko client.
38
- #
39
- # @param base_url [String] API base URL.
40
- # @param timeout [Integer] HTTP timeout in seconds.
41
- def initialize(base_url: DEFAULT_BASE_URL, timeout: 30)
42
- @base_url = base_url.chomp("/")
43
- @timeout = timeout
54
+ # Represents an AI agent with capabilities and scores.
55
+ class Agent
56
+ attr_reader :id, :name, :category, :provider, :models_used,
57
+ :scores, :capabilities, :metadata
58
+
59
+ def initialize(attrs = {})
60
+ @id = attrs[:id] || attrs["id"]
61
+ @name = attrs[:name] || attrs["name"]
62
+ @category = attrs[:category] || attrs["category"]
63
+ @provider = attrs[:provider] || attrs["provider"]
64
+ @models_used = attrs[:models_used] || attrs["models_used"] || []
65
+ @scores = attrs[:scores] || attrs["scores"] || {}
66
+ @capabilities = attrs[:capabilities] || attrs["capabilities"] || []
67
+ @metadata = attrs[:metadata] || attrs["metadata"] || {}
44
68
  end
45
69
 
46
- # List all AI models tracked by BenchGecko.
47
- #
48
- # @return [Array<Hash>] Array of model hashes with metadata,
49
- # benchmark scores, and pricing information.
50
- #
51
- # @example
52
- # models = client.models
53
- # models.each { |m| puts m["name"] }
54
- def models
55
- request("/api/v1/models")
70
+ def supports?(capability)
71
+ capabilities.include?(capability.to_s)
56
72
  end
57
73
 
58
- # List all benchmarks tracked by BenchGecko.
74
+ def to_s
75
+ "#{name} (#{category}) by #{provider}"
76
+ end
77
+ end
78
+
79
+ # Benchmark categories tracked by BenchGecko
80
+ BENCHMARK_CATEGORIES = {
81
+ reasoning: {
82
+ name: "Reasoning",
83
+ benchmarks: %w[MMLU MMLU-Pro ARC-Challenge HellaSwag WinoGrande GPQA],
84
+ description: "Logical reasoning, knowledge, and common sense"
85
+ },
86
+ coding: {
87
+ name: "Coding",
88
+ benchmarks: %w[HumanEval MBPP SWE-bench LiveCodeBench BigCodeBench],
89
+ description: "Code generation, debugging, and software engineering"
90
+ },
91
+ math: {
92
+ name: "Mathematics",
93
+ benchmarks: %w[GSM8K MATH AIME AMC Competition-Math],
94
+ description: "Mathematical problem solving from arithmetic to olympiad"
95
+ },
96
+ instruction: {
97
+ name: "Instruction Following",
98
+ benchmarks: %w[IFEval MT-Bench AlpacaEval Chatbot-Arena],
99
+ description: "Following complex instructions and conversational ability"
100
+ },
101
+ safety: {
102
+ name: "Safety",
103
+ benchmarks: %w[TruthfulQA BBQ ToxiGen BOLD],
104
+ description: "Truthfulness, bias, and safety alignment"
105
+ },
106
+ multimodal: {
107
+ name: "Multimodal",
108
+ benchmarks: %w[MMMU MathVista VQAv2 TextVQA DocVQA],
109
+ description: "Vision, document understanding, and cross-modal reasoning"
110
+ },
111
+ multilingual: {
112
+ name: "Multilingual",
113
+ benchmarks: %w[MGSM XL-Sum FLORES],
114
+ description: "Performance across languages and translation"
115
+ },
116
+ long_context: {
117
+ name: "Long Context",
118
+ benchmarks: %w[RULER NIAH InfiniteBench LongBench],
119
+ description: "Retrieval and reasoning over long documents"
120
+ }
121
+ }.freeze
122
+
123
+ # Built-in model catalog with real benchmark data and pricing
124
+ MODELS = {
125
+ "gpt-4o" => {
126
+ name: "GPT-4o", provider: "OpenAI", parameters: 200,
127
+ context_window: 128_000, input_price: 2.50, output_price: 10.00,
128
+ benchmarks: { "MMLU" => 88.7, "HumanEval" => 90.2, "GSM8K" => 95.8, "GPQA" => 53.6 }
129
+ },
130
+ "claude-3.5-sonnet" => {
131
+ name: "Claude 3.5 Sonnet", provider: "Anthropic", parameters: nil,
132
+ context_window: 200_000, input_price: 3.00, output_price: 15.00,
133
+ benchmarks: { "MMLU" => 88.7, "HumanEval" => 92.0, "GSM8K" => 96.4, "GPQA" => 59.4 }
134
+ },
135
+ "gemini-2.0-flash" => {
136
+ name: "Gemini 2.0 Flash", provider: "Google", parameters: nil,
137
+ context_window: 1_000_000, input_price: 0.10, output_price: 0.40,
138
+ benchmarks: { "MMLU" => 85.2, "HumanEval" => 84.0, "GSM8K" => 92.1 }
139
+ },
140
+ "llama-3.1-405b" => {
141
+ name: "Llama 3.1 405B", provider: "Meta", parameters: 405,
142
+ context_window: 128_000, input_price: 3.00, output_price: 3.00,
143
+ benchmarks: { "MMLU" => 88.6, "HumanEval" => 89.0, "GSM8K" => 96.8, "GPQA" => 50.7 }
144
+ },
145
+ "mistral-large" => {
146
+ name: "Mistral Large", provider: "Mistral", parameters: 123,
147
+ context_window: 128_000, input_price: 2.00, output_price: 6.00,
148
+ benchmarks: { "MMLU" => 84.0, "HumanEval" => 82.0, "GSM8K" => 91.2 }
149
+ },
150
+ "deepseek-v3" => {
151
+ name: "DeepSeek V3", provider: "DeepSeek", parameters: 671,
152
+ context_window: 128_000, input_price: 0.27, output_price: 1.10,
153
+ benchmarks: { "MMLU" => 87.1, "HumanEval" => 82.6, "GSM8K" => 89.3, "GPQA" => 59.1 }
154
+ }
155
+ }.freeze
156
+
157
+ class << self
158
+ # Retrieve a model by its identifier
59
159
  #
60
- # @return [Array<Hash>] Array of benchmark hashes with name,
61
- # category, and description.
160
+ # model = BenchGecko.get_model("gpt-4o")
161
+ # model.name #=> "GPT-4o"
162
+ # model.provider #=> "OpenAI"
163
+ # model.score("MMLU") #=> 88.7
62
164
  #
63
- # @example
64
- # benchmarks = client.benchmarks
65
- # benchmarks.each { |b| puts b["name"] }
66
- def benchmarks
67
- request("/api/v1/benchmarks")
165
+ def get_model(model_id)
166
+ data = MODELS[model_id.to_s]
167
+ return nil unless data
168
+ Model.new(data.merge(id: model_id.to_s))
169
+ end
170
+
171
+ # List all available model identifiers
172
+ def list_models
173
+ MODELS.keys
68
174
  end
69
175
 
70
- # Compare two or more AI models side by side.
176
+ # Compare two models side by side across benchmarks and pricing
71
177
  #
72
- # @param model_slugs [Array<String>] Model slugs to compare (minimum 2).
73
- # @return [Hash] Comparison result with per-model data.
74
- # @raise [ArgumentError] if fewer than 2 models provided.
178
+ # result = BenchGecko.compare_models("gpt-4o", "claude-3.5-sonnet")
179
+ # result[:benchmark_diff] #=> {"MMLU" => 0.0, "HumanEval" => -1.8, ...}
180
+ # result[:cheaper] #=> "gpt-4o"
75
181
  #
76
- # @example
77
- # result = client.compare(["gpt-4o", "claude-opus-4"])
78
- # result["models"].each { |m| puts "#{m['name']}: #{m['scores']}" }
79
- def compare(model_slugs)
80
- raise ArgumentError, "At least 2 models are required" if model_slugs.length < 2
182
+ def compare_models(model_a_id, model_b_id)
183
+ a = get_model(model_a_id)
184
+ b = get_model(model_b_id)
185
+ return nil unless a && b
81
186
 
82
- request("/api/v1/compare", models: model_slugs.join(","))
83
- end
187
+ all_benchmarks = (a.benchmarks.keys + b.benchmarks.keys).uniq
188
+ benchmark_diff = {}
189
+ all_benchmarks.each do |bench|
190
+ score_a = a.score(bench)
191
+ score_b = b.score(bench)
192
+ benchmark_diff[bench] = (score_a && score_b) ? (score_a - score_b).round(2) : nil
193
+ end
84
194
 
85
- private
195
+ cost_a = a.cost_per_million
196
+ cost_b = b.cost_per_million
197
+ cheaper = if cost_a && cost_b
198
+ cost_a <= cost_b ? model_a_id : model_b_id
199
+ end
86
200
 
87
- def request(path, params = {})
88
- uri = URI("#{@base_url}#{path}")
89
- uri.query = URI.encode_www_form(params) unless params.empty?
201
+ {
202
+ model_a: a.to_summary,
203
+ model_b: b.to_summary,
204
+ benchmark_diff: benchmark_diff,
205
+ cheaper: cheaper,
206
+ cost_ratio: (cost_a && cost_b && cost_b > 0) ? (cost_a / cost_b).round(2) : nil
207
+ }
208
+ end
90
209
 
91
- http = Net::HTTP.new(uri.host, uri.port)
92
- http.use_ssl = uri.scheme == "https"
93
- http.open_timeout = @timeout
94
- http.read_timeout = @timeout
210
+ # Estimate cost for a given number of tokens
211
+ #
212
+ # BenchGecko.estimate_cost("gpt-4o", input_tokens: 1_000_000, output_tokens: 500_000)
213
+ # #=> { input_cost: 2.50, output_cost: 5.00, total: 7.50 }
214
+ #
215
+ def estimate_cost(model_id, input_tokens:, output_tokens: 0)
216
+ model = get_model(model_id)
217
+ return nil unless model&.input_price && model&.output_price
95
218
 
96
- req = Net::HTTP::Get.new(uri)
97
- req["User-Agent"] = "benchgecko-ruby/#{VERSION}"
98
- req["Accept"] = "application/json"
219
+ input_cost = (model.input_price * input_tokens / 1_000_000.0).round(4)
220
+ output_cost = (model.output_price * output_tokens / 1_000_000.0).round(4)
99
221
 
100
- response = http.request(req)
222
+ {
223
+ model: model.name,
224
+ input_tokens: input_tokens,
225
+ output_tokens: output_tokens,
226
+ input_cost: input_cost,
227
+ output_cost: output_cost,
228
+ total: (input_cost + output_cost).round(4)
229
+ }
230
+ end
101
231
 
102
- unless response.is_a?(Net::HTTPSuccess)
103
- raise Error.new(
104
- "API request failed (#{response.code}): #{response.body}",
105
- status_code: response.code.to_i
106
- )
107
- end
232
+ # List all benchmark categories
233
+ def benchmark_categories
234
+ BENCHMARK_CATEGORIES
235
+ end
236
+
237
+ # Find models that score above a threshold on a given benchmark
238
+ #
239
+ # BenchGecko.top_models("MMLU", min_score: 87.0)
240
+ # #=> [Model, Model, ...]
241
+ #
242
+ def top_models(benchmark, min_score: 0)
243
+ MODELS.filter_map do |id, data|
244
+ score = data[:benchmarks][benchmark]
245
+ next unless score && score >= min_score
246
+ get_model(id)
247
+ end.sort_by { |m| -m.score(benchmark) }
248
+ end
108
249
 
109
- JSON.parse(response.body)
250
+ # Find the cheapest model that meets a minimum score on a benchmark
251
+ #
252
+ # BenchGecko.cheapest_above("MMLU", 85.0)
253
+ # #=> Model (Gemini 2.0 Flash)
254
+ #
255
+ def cheapest_above(benchmark, min_score)
256
+ top_models(benchmark, min_score: min_score)
257
+ .select(&:cost_per_million)
258
+ .min_by(&:cost_per_million)
110
259
  end
111
260
  end
112
261
  end
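The pricing arithmetic in the new `lib/benchgecko.rb` can be checked by hand. The sketch below restates the formulas from `Model#cost_per_million`, `estimate_cost`, and `cheapest_above` as standalone functions, so it runs without the gem. The function names and the flattened `catalog` hash are assumptions made for illustration; the prices and MMLU scores are copied from the `MODELS` constant in the diff above.

```ruby
# USD per million tokens and MMLU scores, copied from MODELS in the diff above.
catalog = {
  "gpt-4o"            => { input_price: 2.50, output_price: 10.00, mmlu: 88.7 },
  "claude-3.5-sonnet" => { input_price: 3.00, output_price: 15.00, mmlu: 88.7 },
  "gemini-2.0-flash"  => { input_price: 0.10, output_price: 0.40,  mmlu: 85.2 },
  "llama-3.1-405b"    => { input_price: 3.00, output_price: 3.00,  mmlu: 88.6 },
  "mistral-large"     => { input_price: 2.00, output_price: 6.00,  mmlu: 84.0 },
  "deepseek-v3"       => { input_price: 0.27, output_price: 1.10,  mmlu: 87.1 }
}

# Same formula as Model#cost_per_million: unweighted mean of the two prices.
def blended_price(entry)
  ((entry[:input_price] + entry[:output_price]) / 2.0).round(4)
end

# Same formula as estimate_cost: price * tokens / 1_000_000 for each direction.
def token_cost(entry, input_tokens:, output_tokens: 0)
  input_cost = (entry[:input_price] * input_tokens / 1_000_000.0).round(4)
  output_cost = (entry[:output_price] * output_tokens / 1_000_000.0).round(4)
  { input_cost: input_cost, output_cost: output_cost, total: (input_cost + output_cost).round(4) }
end

# Hypothetical stand-in for cheapest_above("MMLU", min_score): keep entries at
# or above the threshold, then pick the lowest blended price. Returns an id or nil.
def cheapest_with_mmlu(catalog, min_score)
  catalog.select { |_id, entry| entry[:mmlu] >= min_score }
         .min_by { |_id, entry| blended_price(entry) }
         &.first
end

gpt = blended_price(catalog["gpt-4o"])
claude = blended_price(catalog["claude-3.5-sonnet"])
puts gpt                      # 6.25
puts claude                   # 9.0
puts (gpt / claude).round(2)  # 0.69, the cost_ratio shown in the README

cost = token_cost(catalog["gpt-4o"], input_tokens: 2_000_000, output_tokens: 500_000)
puts cost[:total]             # 10.0

puts cheapest_with_mmlu(catalog, 85.0) # gemini-2.0-flash
```

Because the blended price weights input and output equally, the `cheaper` and `cheapest_above` results implicitly assume equal input and output token volumes. For a workload that is mostly input or mostly output, the per-direction figures from `estimate_cost` give a more direct comparison.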
metadata CHANGED
@@ -1,46 +1,37 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: benchgecko
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - BenchGecko
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2026-03-31 00:00:00.000000000 Z
12
- dependencies:
13
- - !ruby/object:Gem::Dependency
14
- name: json
15
- requirement: !ruby/object:Gem::Requirement
16
- requirements:
17
- - - ">="
18
- - !ruby/object:Gem::Version
19
- version: '2.0'
20
- type: :runtime
21
- prerelease: false
22
- version_requirements: !ruby/object:Gem::Requirement
23
- requirements:
24
- - - ">="
25
- - !ruby/object:Gem::Version
26
- version: '2.0'
27
- description: Query AI model data, benchmark scores, and run side-by-side comparisons.
28
- BenchGecko tracks every major AI model, benchmark, and provider.
29
- email: hello@benchgecko.ai
11
+ date: 2026-04-11 00:00:00.000000000 Z
12
+ dependencies: []
13
+ description: Official Ruby SDK for BenchGecko, the data layer of the AI economy. Query
14
+ thousands of AI models with cross-provider pricing and daily price history. Track
15
+ company valuations, funding timelines, and revenue estimates. Pull benchmark scores,
16
+ agent leaderboards, and a live changelog of every price drop, every launch, every
17
+ deprecation. If it moved in AI today, it's already on BenchGecko.
18
+ email:
19
+ - hello@benchgecko.ai
30
20
  executables: []
31
21
  extensions: []
32
22
  extra_rdoc_files: []
33
23
  files:
34
- - LICENSE
24
+ - CHANGELOG.md
25
+ - LICENSE.txt
35
26
  - README.md
36
27
  - lib/benchgecko.rb
37
28
  homepage: https://benchgecko.ai
38
29
  licenses:
39
30
  - MIT
40
31
  metadata:
32
+ homepage_uri: https://benchgecko.ai
41
33
  source_code_uri: https://github.com/BenchGecko/benchgecko-ruby
42
- bug_tracker_uri: https://github.com/BenchGecko/benchgecko-ruby/issues
43
- documentation_uri: https://benchgecko.ai/api-docs
34
+ changelog_uri: https://github.com/BenchGecko/benchgecko-ruby/blob/main/CHANGELOG.md
44
35
  post_install_message:
45
36
  rdoc_options: []
46
37
  require_paths:
@@ -59,5 +50,6 @@ requirements: []
59
50
  rubygems_version: 3.0.3.1
60
51
  signing_key:
61
52
  specification_version: 4
62
- summary: Official Ruby SDK for the BenchGecko API
53
+ summary: The data layer of the AI economy. Every model. Every agent. Everything AI.
54
+ Tracked.
63
55
  test_files: []
File without changes