RubyGems - llm_capabilities - Versions diffs - 0.2.0 - Mend

llm_capabilities 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

checksums.yaml +7 -0
data/CHANGELOG.md +14 -0
data/LICENSE.txt +21 -0
data/README.md +155 -0
data/lib/llm_capabilities/cache.rb +82 -0
data/lib/llm_capabilities/configuration.rb +28 -0
data/lib/llm_capabilities/detector.rb +83 -0
data/lib/llm_capabilities/model_index.rb +170 -0
data/lib/llm_capabilities/version.rb +6 -0
data/lib/llm_capabilities.rb +77 -0
metadata +51 -0

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: d3046a9f7990b66895e6ee190eee74fff899b2c989e2719ab024cb0e8d3f5986
+  data.tar.gz: 377a8fd162deb086f6cbc1d60c2b6a3799b502b5f330cc1f5219f44cff7018ed
+SHA512:
+  metadata.gz: 69888f933e3a33071b6d647ca735b5af9b1dd3cc5e2233960c44cb52855569fd33df38a24d389f8eb52c14c3cfc593fa750cad264b8620b898f73d4b6289612d
+  data.tar.gz: cbd50993c32cfb0d1ce7e3a8e96d6f374585f1bdceb84b283cd81436d9dc43e7297be4865d813c6bbec726f9d34cb6291572c3b5551547bbe193cf661826bcc3

data/CHANGELOG.md ADDED

@@ -0,0 +1,14 @@
+# Changelog
+## 0.2.0
+- Generalize API from `supports_schema?(model, thinking:)` to `supports?(model, capability, context: {})`
+- Add 4-tier resolution hierarchy: empirical cache, OpenRouter model index, RubyLLM registry, provider heuristic
+- Add `ModelIndex` class for unauthenticated OpenRouter `/api/v1/models` integration
+- Add 18 validated capability symbols (`KNOWN_CAPABILITIES`) with `UnknownCapabilityError`
+- Add freeform context hash for cache precision (e.g., `{thinking: true}`)
+- Remove `sorbet-runtime` dependency (zero runtime dependencies)
+## 0.1.0
+- Initial release: 3-tier structured output detection (cache, RubyLLM, provider heuristic)

data/LICENSE.txt ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Sorcerous Machine, LLC
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

data/README.md ADDED

@@ -0,0 +1,155 @@
+# llm_capabilities
+[![CI](https://github.com/SorcerousMachine/LLMCapabilities/actions/workflows/ci.yml/badge.svg)](https://github.com/SorcerousMachine/LLMCapabilities/actions/workflows/ci.yml)
+[![Gem Version](https://img.shields.io/gem/v/llm_capabilities)](https://rubygems.org/gems/llm_capabilities)
+[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE.txt)
+4-tier capability detection for LLM models. Zero runtime dependencies.
+Answers the question: *does this model support this capability?* — using a layered resolution hierarchy that combines empirical observations, live model indexes, and static heuristics.
+## Resolution Hierarchy
+Each query walks four tiers in order. The first non-nil result wins.
+| Tier | Source | What it knows |
+|------|--------|---------------|
+| 1 | **Empirical cache** | Observed results from actual API calls, with optional context (e.g., `{thinking: true}`) |
+| 2 | **OpenRouter model index** | Per-model capability data from OpenRouter's public API, cached locally for 24 hours |
+| 3 | **RubyLLM model registry** | Soft dependency — used automatically when the `ruby_llm` gem is loaded |
+| 4 | **Provider heuristic** | Static fallback mapping providers to known capabilities |
+Tier 1 is the most authoritative because it reflects what you've actually observed. Tiers 2-4 provide progressively coarser defaults.
+## Quick Start
+```ruby
+gem "llm_capabilities"
+```
+```ruby
+require "llm_capabilities"
+# Query a capability (walks all 4 tiers automatically)
+LLMCapabilities.supports?("openai/o4-mini", :structured_output)
+# => true
+LLMCapabilities.supports?("deepseek/deepseek-r1", :vision)
+# => false
+# Query with context (only matches tier 1 cache entries with the same context)
+LLMCapabilities.supports?("anthropic/claude-haiku-4.5", :structured_output, context: { thinking: true })
+# => false (if you've recorded that specific combination as unsupported)
+```
+Model identifiers use the `"provider/model"` format (e.g., `"openai/gpt-4o"`, `"anthropic/claude-sonnet-4.5"`).
+## Recording Empirical Results
+The real power is in tier 1: recording what you've actually observed from API calls.
+```ruby
+# After a successful structured output call
+LLMCapabilities.record("openai/o4-mini", :structured_output, supported: true)
+# After discovering a model doesn't support a capability in a specific context
+LLMCapabilities.record(
+  "anthropic/claude-haiku-4.5",
+  :structured_output,
+  context: { thinking: true },
+  supported: false
+)
+# Look up a cached result directly (nil if not recorded)
+LLMCapabilities.lookup("openai/o4-mini", :structured_output)
+# => true
+# Cache management
+LLMCapabilities.size   # => 2
+LLMCapabilities.clear! # wipes all cached entries
+```
+Cache entries are persisted to disk as JSON and survive process restarts. Entries expire after 30 days by default.
+## Configuration
+```ruby
+LLMCapabilities.configure do |config|
+  # File paths for persistent storage
+  config.cache_path = ".llm_capabilities_cache.json"  # default
+  config.index_path = ".llm_capabilities_index.json"   # default
+  # Cache entry lifetime (seconds)
+  config.max_age = 2_592_000  # 30 days, default
+  # OpenRouter index refresh interval (seconds)
+  config.index_ttl = 86_400   # 24 hours, default
+  # Override which providers support which capabilities
+  config.provider_capabilities = {
+    structured_output: %w[openai google anthropic deepseek],
+    function_calling:  %w[openai google anthropic deepseek],
+    vision:            %w[openai google anthropic],
+    streaming:         %w[openai google anthropic deepseek]
+  }
+end
+```
+## Known Capabilities
+The gem recognizes 18 capability symbols. Passing anything else to `supports?` raises `LLMCapabilities::UnknownCapabilityError`.
+| Capability | Description |
+|------------|-------------|
+| `:structured_output` | JSON schema-constrained output |
+| `:function_calling` | Tool/function calling |
+| `:vision` | Image input processing |
+| `:streaming` | Streaming response support |
+| `:json_mode` | JSON output mode (less strict than structured output) |
+| `:reasoning` | Extended thinking / chain-of-thought |
+| `:image_generation` | Image output generation |
+| `:speech_generation` | Audio/speech output |
+| `:transcription` | Audio-to-text conversion |
+| `:translation` | Language translation |
+| `:citations` | Source citation support |
+| `:predicted_outputs` | Predicted/cached output optimization |
+| `:distillation` | Model distillation support |
+| `:fine_tuning` | Fine-tuning API support |
+| `:batch` | Batch API processing |
+| `:realtime` | Real-time / WebSocket API |
+| `:caching` | Prompt caching |
+| `:moderation` | Content moderation |
+## Default Provider Capabilities
+Tier 4 uses these static mappings as a last resort when no better data is available:
+| Capability | Providers |
+|------------|-----------|
+| `:structured_output` | openai, google, anthropic, deepseek |
+| `:function_calling` | openai, google, anthropic, deepseek |
+| `:vision` | openai, google, anthropic |
+| `:streaming` | openai, google, anthropic, deepseek |
+These can be overridden via `config.provider_capabilities`.
+## OpenRouter API Usage
+Tier 2 fetches model capability data from [OpenRouter's](https://openrouter.ai) unauthenticated public endpoint (`GET /api/v1/models`). This data is cached locally on disk with a 24-hour TTL. No API key is required. No data is sent to OpenRouter — it is a read-only GET request.
+If the fetch fails (network error, timeout, non-200 response), tier 2 is silently skipped and resolution falls through to tiers 3 and 4. The gem never blocks on network failure.
+See OpenRouter's [terms of service](https://openrouter.ai/terms) for their data usage policies.
+## RubyLLM Integration
+Tier 3 activates automatically when the [`ruby_llm`](https://github.com/crmne/ruby_llm) gem is loaded in your process. No configuration required — if `RubyLLM` is defined, the gem queries its model registry for capability data. If `ruby_llm` is not present, tier 3 is silently skipped.
+## Requirements
+- Ruby >= 3.2
+- Zero runtime dependencies (stdlib only: `json`, `net/http`, `fileutils`)
+## License
+MIT. See [LICENSE.txt](LICENSE.txt).

data/lib/llm_capabilities/cache.rb ADDED

@@ -0,0 +1,82 @@
+# frozen_string_literal: true
+# typed: true
+require "json"
+require "fileutils"
+module LLMCapabilities
+  class Cache
+    def initialize(path: Configuration::DEFAULT_CACHE_PATH, max_age: Configuration::DEFAULT_MAX_AGE)
+      @path = path
+      @max_age = max_age
+      @entries = nil
+    end
+    def lookup(model, capability, context: {})
+      load_cache!
+      entry = @entries[cache_key(model, capability, context)]
+      return nil unless entry.is_a?(Hash)
+      if @max_age && entry["recorded_at"]
+        elapsed = Time.now.to_i - entry["recorded_at"]
+        return nil if elapsed > @max_age
+      end
+      entry["supported"]
+    end
+    def record(model, capability, supported:, context: {})
+      load_cache!
+      @entries[cache_key(model, capability, context)] = {
+        "supported" => supported,
+        "recorded_at" => Time.now.to_i
+      }
+      persist!
+    end
+    def clear!
+      @entries = {}
+      persist!
+    end
+    def size
+      load_cache!
+      @entries.length
+    end
+    private
+    def cache_key(model, capability, context)
+      base = "#{model}:#{capability}"
+      return base if context.empty?
+      pairs = context.sort_by { |k, _| k.to_s }.map { |k, v| "#{k}=#{v}" }.join(",")
+      "#{base}:#{pairs}"
+    end
+    def load_cache!
+      return unless @entries.nil?
+      @entries = if File.exist?(@path)
+        File.open(@path, File::RDONLY) do |f|
+          f.flock(File::LOCK_SH)
+          JSON.parse(f.read)
+        end
+      else
+        {}
+      end
+    rescue JSON::ParserError
+      @entries = {}
+    end
+    def persist!
+      dir = File.dirname(@path)
+      FileUtils.mkdir_p(dir) unless Dir.exist?(dir)
+      File.open(@path, File::CREAT | File::WRONLY | File::TRUNC) do |f|
+        f.flock(File::LOCK_EX)
+        f.write(JSON.pretty_generate(@entries))
+      end
+    end
+  end
+end
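
The key scheme in `Cache#cache_key` above can be tried in isolation. The following is a minimal standalone sketch that restates that method outside the class (the top-level `cache_key` function and the sample context values are illustrative only, not part of the gem's public API). It shows that the key does not depend on the order in which context pairs are given.

```ruby
# Standalone restatement of the key scheme used by Cache#cache_key.
def cache_key(model, capability, context = {})
  base = "#{model}:#{capability}"
  return base if context.empty?

  # Sort by stringified key so hash insertion order does not affect the key.
  pairs = context.sort_by { |k, _| k.to_s }.map { |k, v| "#{k}=#{v}" }.join(",")
  "#{base}:#{pairs}"
end

a = cache_key("openai/o4-mini", :structured_output, { thinking: true, stream: false })
b = cache_key("openai/o4-mini", :structured_output, { stream: false, thinking: true })

puts a       # openai/o4-mini:structured_output:stream=false,thinking=true
puts a == b  # true
```

Because keys and values are interpolated into a string, `{ thinking: true }` and `{ "thinking" => "true" }` produce the same key under this scheme.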

data/lib/llm_capabilities/configuration.rb ADDED

@@ -0,0 +1,28 @@
+# frozen_string_literal: true
+# typed: true
+module LLMCapabilities
+  class Configuration
+    DEFAULT_CACHE_PATH = ".llm_capabilities_cache.json"
+    DEFAULT_INDEX_PATH = ".llm_capabilities_index.json"
+    DEFAULT_INDEX_TTL = 86_400 # 24 hours in seconds
+    DEFAULT_MAX_AGE = 2_592_000 # 30 days in seconds
+    DEFAULT_PROVIDER_CAPABILITIES = {
+      structured_output: %w[openai google anthropic deepseek],
+      function_calling: %w[openai google anthropic deepseek],
+      vision: %w[openai google anthropic],
+      streaming: %w[openai google anthropic deepseek]
+    }.freeze
+    attr_accessor :cache_path, :index_path, :index_ttl, :provider_capabilities, :max_age
+    def initialize
+      @cache_path = DEFAULT_CACHE_PATH
+      @index_path = DEFAULT_INDEX_PATH
+      @index_ttl = DEFAULT_INDEX_TTL
+      @provider_capabilities = DEFAULT_PROVIDER_CAPABILITIES.transform_values(&:dup)
+      @max_age = DEFAULT_MAX_AGE
+    end
+  end
+end
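
`Configuration#initialize` copies the defaults with `transform_values(&:dup)` rather than assigning the frozen constant directly. A small sketch of the likely reason, using a local `defaults` hash and a made-up provider name as stand-ins: the outer hash is frozen but its arrays are not, so without the per-array `dup` a mutation through one configuration would leak into the shared defaults.

```ruby
# Stand-in for DEFAULT_PROVIDER_CAPABILITIES: frozen hash, unfrozen arrays.
defaults = {
  vision: %w[openai google anthropic]
}.freeze

# New unfrozen hash whose arrays are independent copies.
per_instance = defaults.transform_values(&:dup)
per_instance[:vision] << "example_provider" # hypothetical provider name

puts defaults[:vision].length      # 3
puts per_instance[:vision].length  # 4
```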

data/lib/llm_capabilities/detector.rb ADDED

@@ -0,0 +1,83 @@
+# frozen_string_literal: true
+# typed: true
+module LLMCapabilities
+  class Detector
+    # Capability vocabulary derived from RubyLLM's model capability strings.
+    # See: https://github.com/crmne/ruby_llm
+    KNOWN_CAPABILITIES = %i[
+      streaming function_calling structured_output predicted_outputs
+      distillation fine_tuning batch realtime image_generation
+      speech_generation transcription translation citations
+      reasoning caching moderation json_mode vision
+    ].freeze
+    def initialize(cache:, provider_capabilities: Configuration::DEFAULT_PROVIDER_CAPABILITIES, model_index: nil, ruby_llm_enabled: nil)
+      @cache = cache
+      @provider_capabilities = provider_capabilities
+      @model_index = model_index
+      @ruby_llm_enabled = ruby_llm_enabled
+    end
+    def supports?(model, capability, context: {})
+      validate_capability!(capability)
+      # Tier 1: Empirical cache (most authoritative, with full context)
+      cached = @cache.lookup(model, capability, context: context)
+      return cached unless cached.nil?
+      # Tier 2: OpenRouter model index (base capability only)
+      if @model_index
+        index_result = @model_index.lookup(model, capability)
+        return index_result unless index_result.nil?
+      end
+      # Tier 3: RubyLLM model registry (base capability only)
+      ruby_llm_result = query_ruby_llm(model, capability)
+      return ruby_llm_result unless ruby_llm_result.nil?
+      # Tier 4: Provider-level heuristic (base capability only)
+      provider_supports?(model, capability)
+    end
+    def provider_supports?(model, capability)
+      provider = model.include?("/") ? model.split("/", 2).first : nil
+      return false unless provider
+      providers = @provider_capabilities[capability]
+      return false unless providers
+      providers.include?(provider)
+    end
+    private
+    def validate_capability!(capability)
+      return if KNOWN_CAPABILITIES.include?(capability)
+      raise UnknownCapabilityError,
+        "Unknown capability: #{capability.inspect}. Known capabilities: #{KNOWN_CAPABILITIES.join(", ")}"
+    end
+    def ruby_llm_available?
+      if @ruby_llm_enabled.nil?
+        @ruby_llm_enabled = defined?(RubyLLM) ? true : false
+      end
+      @ruby_llm_enabled
+    end
+    def query_ruby_llm(model, capability)
+      return nil unless ruby_llm_available?
+      model_info = RubyLLM.models.find(model)
+      return nil unless model_info
+      method_name = :"#{capability}?"
+      if model_info.respond_to?(method_name)
+        model_info.public_send(method_name)
+      end
+    rescue
+      nil
+    end
+  end
+end
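
The control flow of `Detector#supports?` above is a "first non-nil answer wins" walk. The sketch below reduces it to that pattern, with lambdas as hypothetical stand-ins for the four tiers (the `resolve` helper and the lambda names are not part of the gem). An explicit `false` ends the walk just as `true` does, which is why the checks use `nil?` rather than `||`.

```ruby
# Each tier is a callable returning true, false, or nil (no opinion).
def resolve(tiers, fallback)
  tiers.each do |tier|
    result = tier.call
    return result unless result.nil?
  end
  # The last tier always answers true or false.
  fallback.call
end

cache_miss    = -> { nil }   # tier 1: nothing recorded
index_says_no = -> { false } # tier 2: explicit false stops the walk
registry      = -> { true }  # tier 3: not reached in the first call
heuristic     = -> { true }  # tier 4: static fallback

first  = resolve([cache_miss, index_says_no, registry], heuristic)
second = resolve([cache_miss, -> { nil }, -> { nil }], heuristic)

puts first.inspect  # false
puts second.inspect # true
```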

data/lib/llm_capabilities/model_index.rb ADDED

@@ -0,0 +1,170 @@
+# frozen_string_literal: true
+# typed: true
+require "json"
+require "fileutils"
+require "net/http"
+require "uri"
+module LLMCapabilities
+  class ModelIndex
+    OPENROUTER_MODELS_URL = "https://openrouter.ai/api/v1/models"
+    # Mapping from OpenRouter field values to gem capability symbols.
+    # NOTE: Name divergences are intentional and subtle:
+    #   - OpenRouter "structured_outputs" (plural) -> gem :structured_output (singular)
+    #   - OpenRouter "tools" -> gem :function_calling
+    #   - OpenRouter "response_format" -> gem :json_mode
+    #   - OpenRouter "reasoning" -> gem :reasoning
+    PARAMETER_MAPPING = {
+      "structured_outputs" => :structured_output,
+      "tools" => :function_calling,
+      "reasoning" => :reasoning,
+      "response_format" => :json_mode
+    }.freeze
+    def initialize(path:, ttl:)
+      @path = path
+      @ttl = ttl
+      @index = nil
+    end
+    def lookup(model, capability)
+      load_index!
+      model_caps = @index[model]
+      return nil unless model_caps
+      model_caps[capability]
+    end
+    private
+    def load_index!
+      return unless @index.nil?
+      if File.exist?(@path) && !stale?
+        load_from_disk
+        return unless @index.nil?
+      end
+      fetch_and_cache!
+    rescue => _e
+      @index ||= {}
+    end
+    def stale?
+      mtime = File.mtime(@path)
+      (Time.now - mtime) > @ttl
+    end
+    def load_from_disk
+      File.open(@path, File::RDONLY) do |f|
+        f.flock(File::LOCK_SH)
+        raw = JSON.parse(f.read)
+        @index = deserialize(raw)
+      end
+    rescue JSON::ParserError
+      @index = nil
+    end
+    def fetch_and_cache!
+      uri = URI.parse(OPENROUTER_MODELS_URL)
+      response = Net::HTTP.get_response(uri)
+      unless response.is_a?(Net::HTTPSuccess)
+        @index ||= {}
+        return
+      end
+      body = response.body
+      unless body
+        @index ||= {}
+        return
+      end
+      raw_data = JSON.parse(body)
+      @index = normalize(raw_data)
+      persist!
+    rescue => _e
+      @index ||= {}
+    end
+    def normalize(raw_data)
+      result = {}
+      models = raw_data["data"]
+      return result unless models.is_a?(Array)
+      models.each do |model_entry|
+        next unless model_entry.is_a?(Hash)
+        id = model_entry["id"]
+        next unless id.is_a?(String)
+        caps = map_capabilities(model_entry)
+        result[id] = caps unless caps.empty?
+      end
+      result
+    end
+    def map_capabilities(model_entry)
+      caps = {}
+      # Map supported_parameters to capabilities
+      # NOTE: OpenRouter uses different names than the gem:
+      #   "structured_outputs" (plural) -> :structured_output (singular)
+      #   "tools" -> :function_calling
+      #   "response_format" -> :json_mode
+      params = model_entry["supported_parameters"]
+      if params.is_a?(Array)
+        params.each do |param|
+          cap = PARAMETER_MAPPING[param]
+          caps[cap] = true if cap
+        end
+      end
+      # Map architecture modalities to capabilities
+      arch = model_entry["architecture"]
+      if arch.is_a?(Hash)
+        input_mods = arch["input_modalities"]
+        if input_mods.is_a?(Array) && input_mods.include?("image")
+          caps[:vision] = true
+        end
+        output_mods = arch["output_modalities"]
+        if output_mods.is_a?(Array) && output_mods.include?("image")
+          caps[:image_generation] = true
+        end
+      end
+      caps
+    end
+    def persist!
+      dir = File.dirname(@path)
+      FileUtils.mkdir_p(dir) unless Dir.exist?(dir)
+      serialized = serialize(@index)
+      File.open(@path, File::CREAT | File::WRONLY | File::TRUNC) do |f|
+        f.flock(File::LOCK_EX)
+        f.write(JSON.pretty_generate(serialized))
+      end
+    end
+    def serialize(index)
+      index.transform_values { |caps| caps.transform_keys(&:to_s) }
+    end
+    def deserialize(raw)
+      result = {}
+      raw.each do |model_id, caps|
+        next unless caps.is_a?(Hash)
+        result[model_id] = caps.each_with_object({}) do |(k, v), acc|
+          acc[k.to_sym] = v if v == true || v == false
+        end
+      end
+      result
+    end
+  end
+end
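
To make the mapping in `ModelIndex#map_capabilities` concrete, here is a standalone sketch that restates it as a top-level function and applies it to a hand-written entry. The entry is hypothetical: it only mimics the fields the code above reads (`supported_parameters`, `architecture`), and its id and values are made up, not taken from OpenRouter.

```ruby
# Restatement of the mapping logic, outside the class for illustration.
def map_capabilities(entry)
  mapping = {
    "structured_outputs" => :structured_output,
    "tools" => :function_calling,
    "reasoning" => :reasoning,
    "response_format" => :json_mode
  }
  caps = {}

  params = entry["supported_parameters"]
  if params.is_a?(Array)
    params.each do |param|
      cap = mapping[param]
      caps[cap] = true if cap # unmapped parameters are ignored
    end
  end

  arch = entry["architecture"]
  if arch.is_a?(Hash)
    inputs = arch["input_modalities"]
    outputs = arch["output_modalities"]
    caps[:vision] = true if inputs.is_a?(Array) && inputs.include?("image")
    caps[:image_generation] = true if outputs.is_a?(Array) && outputs.include?("image")
  end
  caps
end

# Hypothetical entry; field values are invented for the example.
entry = {
  "id" => "example/model",
  "supported_parameters" => ["tools", "temperature", "structured_outputs"],
  "architecture" => {
    "input_modalities" => ["text", "image"],
    "output_modalities" => ["text"]
  }
}

caps = map_capabilities(entry)
puts caps.keys.inspect
puts caps[:streaming].inspect # nil: no opinion, so a later tier decides
```

As written, the mapping only ever stores `true`, so a lookup against a freshly fetched index yields either `true` or `nil`; a capability that is absent from a known model is left for tiers 3 and 4 to resolve rather than reported as unsupported.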

data/lib/llm_capabilities/version.rb ADDED

@@ -0,0 +1,6 @@
+# frozen_string_literal: true
+# typed: true
+module LLMCapabilities
+  VERSION = "0.2.0"
+end

data/lib/llm_capabilities.rb ADDED

@@ -0,0 +1,77 @@
+# frozen_string_literal: true
+# typed: true
+require_relative "llm_capabilities/version"
+require_relative "llm_capabilities/configuration"
+require_relative "llm_capabilities/cache"
+require_relative "llm_capabilities/model_index"
+require_relative "llm_capabilities/detector"
+module LLMCapabilities
+  class Error < StandardError; end
+  class UnknownCapabilityError < Error; end
+  class << self
+    def configuration
+      @configuration ||= Configuration.new
+    end
+    def configure(&block)
+      block.call(configuration)
+      @cache = nil
+      @model_index = nil
+      @detector = nil
+    end
+    def supports?(model, capability, context: {})
+      detector.supports?(model, capability, context: context)
+    end
+    def record(model, capability, supported:, context: {})
+      cache.record(model, capability, supported: supported, context: context)
+    end
+    def lookup(model, capability, context: {})
+      cache.lookup(model, capability, context: context)
+    end
+    def clear!
+      cache.clear!
+    end
+    def size
+      cache.size
+    end
+    def reset!
+      @configuration = nil
+      @cache = nil
+      @model_index = nil
+      @detector = nil
+    end
+    private
+    def cache
+      @cache ||= Cache.new(
+        path: configuration.cache_path,
+        max_age: configuration.max_age
+      )
+    end
+    def model_index
+      @model_index ||= ModelIndex.new(
+        path: configuration.index_path,
+        ttl: configuration.index_ttl
+      )
+    end
+    def detector
+      @detector ||= Detector.new(
+        cache: cache,
+        provider_capabilities: configuration.provider_capabilities,
+        model_index: model_index
+      )
+    end
+  end
+end

metadata ADDED

@@ -0,0 +1,51 @@
+--- !ruby/object:Gem::Specification
+name: llm_capabilities
+version: !ruby/object:Gem::Version
+  version: 0.2.0
+platform: ruby
+authors:
+- Alex
+bindir: bin
+cert_chain: []
+date: 1980-01-02 00:00:00.000000000 Z
+dependencies: []
+description: Detects whether LLM models support specific capabilities via empirical
+  cache, OpenRouter model index, RubyLLM model registry, and provider-level heuristics.
+  Zero runtime dependencies.
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- CHANGELOG.md
+- LICENSE.txt
+- README.md
+- lib/llm_capabilities.rb
+- lib/llm_capabilities/cache.rb
+- lib/llm_capabilities/configuration.rb
+- lib/llm_capabilities/detector.rb
+- lib/llm_capabilities/model_index.rb
+- lib/llm_capabilities/version.rb
+homepage: https://github.com/SorcerousMachine/LLMCapabilities
+licenses:
+- MIT
+metadata:
+  source_code_uri: https://github.com/SorcerousMachine/LLMCapabilities
+  changelog_uri: https://github.com/SorcerousMachine/LLMCapabilities/blob/main/CHANGELOG.md
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '3.2'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubygems_version: 4.0.4
+specification_version: 4
+summary: 4-tier capability detection for LLM models
+test_files: []