llm_ruby 0.1.0 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/README.md +70 -36
- data/lib/llm/clients/anthropic/response.rb +48 -0
- data/lib/llm/clients/anthropic.rb +113 -0
- data/lib/llm/clients/gemini/request.rb +75 -0
- data/lib/llm/clients/gemini/response.rb +61 -0
- data/lib/llm/clients/gemini.rb +102 -0
- data/lib/llm/clients/open_ai/response.rb +45 -32
- data/lib/llm/clients/open_ai.rb +86 -82
- data/lib/llm/info.rb +261 -89
- data/lib/llm/response.rb +9 -1
- data/lib/llm/schema.rb +75 -0
- data/lib/llm/stop_reason.rb +8 -5
- data/lib/llm.rb +9 -2
- metadata +12 -13
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 0de9e01920a472c7a04f9f794747090e3bf279b403d4efa206b1db7c2b006987
+  data.tar.gz: bd77b3a4c5540a82d4a6f1b93213a8b3da60cc9342e581335adeac6fe93c999f
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bc4b0263dfeaf1db4dded66667ea13e65dfc035d27579a2b032a59ae63f29fd588630d749d492d281322269fbc9d3909f8648f464d250baa99ff5b2afd582193
+  data.tar.gz: a4c5713df43e27127afa0b3cbf044ce8d805235f73abd05b3f79dc9fc7a814c193caac9192392a8670523cf74be96da2b52dc648d965116e8e9bbc3f8a2aaaaa
data/README.md
CHANGED
@@ -1,6 +1,12 @@
 # LLMRuby
 
-
+[](https://badge.fury.io/rb/llm_ruby)
+
+[](https://opensource.org/licenses/MIT)
+
+
+
+LLMRuby is a Ruby gem that provides a consistent interface for interacting with multiple Large Language Model (LLM) APIs. Most OpenAI, Anthropic and Gemini models are currently supported.
 
 ## Installation
 
@@ -12,14 +18,14 @@ gem 'llm_ruby'
 
 And then execute:
 
-```
-$ bundle install
+```shell
+bundle install
 ```
 
 Or install it yourself as:
 
-```
-$ gem install llm_ruby
+```shell
+gem install llm_ruby
 ```
 
 ## Usage
@@ -27,7 +33,7 @@ $ gem install llm_ruby
 ### Basic Usage
 
 ```ruby
-require '
+require 'llm_ruby'
 
 # Initialize an LLM instance
 llm = LLM.from_string!("gpt-4")
@@ -46,10 +52,10 @@ puts response.content
 LLMRuby supports streaming responses:
 
 ```ruby
-require '
+require 'llm_ruby'
 
 # Initialize an LLM instance
-llm = LLM.from_string!("gpt-
+llm = LLM.from_string!("gpt-4o")
 
 # Create a client
 client = llm.client
@@ -87,7 +93,7 @@ Here is an example of how to use the response object:
 
 ```ruby
 # Initialize an LLM instance
-llm = LLM.from_string!("gpt-
+llm = LLM.from_string!("gpt-4o")
 
 # Create a client
 client = llm.client
@@ -101,37 +107,69 @@ puts "Raw response: #{response.raw_response}"
 puts "Stop reason: #{response.stop_reason}"
 ```
 
-
 ## Available Models
 
 LLMRuby supports various OpenAI models, including GPT-3.5 and GPT-4 variants. You can see the full list of supported models in the `KNOWN_MODELS` constant:
 
-
-
-
-
-| gpt-3.5-turbo
-| gpt-3.5-turbo-
-| gpt-
-| gpt-
-| gpt-4
-| gpt-4-
-| gpt-4-
-| gpt-4-
-| gpt-4-
-| gpt-4-
-| gpt-4o
-| gpt-4o-mini
-| gpt-4o-2024-
-| gpt-4o-2024-
-
+### OpenAI Models
+
+| Canonical Name | Display Name |
+|----------------------------|--------------------------------------|
+| gpt-3.5-turbo | GPT-3.5 Turbo |
+| gpt-3.5-turbo-0125 | GPT-3.5 Turbo 0125 |
+| gpt-3.5-turbo-16k | GPT-3.5 Turbo 16K |
+| gpt-3.5-turbo-1106 | GPT-3.5 Turbo 1106 |
+| gpt-4 | GPT-4 |
+| gpt-4-1106-preview | GPT-4 Turbo 1106 |
+| gpt-4-turbo-2024-04-09 | GPT-4 Turbo 2024-04-09 |
+| gpt-4-0125-preview | GPT-4 Turbo 0125 |
+| gpt-4-turbo-preview | GPT-4 Turbo |
+| gpt-4-0613 | GPT-4 0613 |
+| gpt-4o | GPT-4o |
+| gpt-4o-mini | GPT-4o Mini |
+| gpt-4o-mini-2024-07-18 | GPT-4o Mini 2024-07-18 |
+| gpt-4o-2024-05-13 | GPT-4o 2024-05-13 |
+| gpt-4o-2024-08-06 | GPT-4o 2024-08-06 |
+| gpt-4o-2024-11-20 | GPT-4o 2024-11-20 |
+| chatgpt-4o-latest | ChatGPT 4o Latest |
+| o1 | o1 |
+| o1-2024-12-17 | o1 2024-12-17 |
+| o1-preview | o1 Preview |
+| o1-preview-2024-09-12 | o1 Preview 2024-09-12 |
+| o1-mini | o1 Mini |
+| o1-mini-2024-09-12 | o1 Mini 2024-09-12 |
+| o3-mini | o3 Mini |
+| o3-mini-2025-01-31 | o3 Mini 2025-01-31 |
+
+### Anthropic Models
+
+| Canonical Name | Display Name |
+|----------------------------|--------------------------------------|
+| claude-3-5-sonnet-20241022 | Claude 3.5 Sonnet 2024-10-22 |
+| claude-3-5-haiku-20241022 | Claude 3.5 Haiku 2024-10-22 |
+| claude-3-5-sonnet-20240620 | Claude 3.5 Sonnet 2024-06-20 |
+| claude-3-opus-20240229 | Claude 3.5 Opus 2024-02-29 |
+| claude-3-sonnet-20240229 | Claude 3.5 Sonnet 2024-02-29 |
+| claude-3-haiku-20240307 | Claude 3.5 Opus 2024-03-07 |
+
+### Google Models
+
+| Canonical Name | Display Name |
+|--------------------------------------|------------------------------------------|
+| gemini-2.0-flash | Gemini 2.0 Flash |
+| gemini-2.0-flash-lite-preview-02-05 | Gemini 2.0 Flash Lite Preview 02-05 |
+| gemini-1.5-flash | Gemini 1.5 Flash |
+| gemini-1.5-pro | Gemini 1.5 Pro |
+| gemini-1.5-flash-8b | Gemini 1.5 Flash 8B |
 
 ## Configuration
 
-Set your OpenAI API key as an environment variable:
+Set your OpenAI, Anthropic or Google API key as an environment variable:
 
-```
+```shell
 export OPENAI_API_KEY=your_api_key_here
+export ANTHROPIC_API_KEY=your_api_key_here
+export GEMINI_API_KEY=your_api_key_here
 ```
 
 ## Development
@@ -142,12 +180,8 @@ To install this gem onto your local machine, run `bundle exec rake install`.
 
 ## Contributing
 
-Bug reports and pull requests are welcome
+Bug reports and pull requests are welcome.
 
 ## License
 
 The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
-
-## Acknowledgements
-
-This gem is developed and maintained by [Context](https://context.ai).
data/lib/llm/clients/anthropic/response.rb
ADDED
@@ -0,0 +1,48 @@
+# frozen_string_literal: true
+
+class LLM
+  module Clients
+    class Anthropic
+      class Response
+        def initialize(raw_response)
+          @raw_response = raw_response
+        end
+
+        def to_normalized_response
+          LLM::Response.new(
+            content: content,
+            raw_response: parsed_response,
+            stop_reason: normalize_stop_reason
+          )
+        end
+
+        def self.normalize_stop_reason(stop_reason)
+          case stop_reason
+          when "end_turn"
+            LLM::StopReason::STOP
+          when "stop_sequence"
+            LLM::StopReason::STOP_SEQUENCE
+          when "max_tokens"
+            LLM::StopReason::MAX_TOKENS_REACHED
+          else
+            LLM::StopReason::OTHER
+          end
+        end
+
+        private
+
+        def content
+          parsed_response.dig("content", 0, "text")
+        end
+
+        def normalize_stop_reason
+          self.class.normalize_stop_reason(parsed_response["stop_reason"])
+        end
+
+        def parsed_response
+          @raw_response.parsed_response
+        end
+      end
+    end
+  end
+end
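
The class-level `normalize_stop_reason` above maps Anthropic's stop values onto the gem's shared constants; a quick sketch of the expected translations, taken directly from the `case` statement:

```ruby
LLM::Clients::Anthropic::Response.normalize_stop_reason("end_turn")      # => LLM::StopReason::STOP
LLM::Clients::Anthropic::Response.normalize_stop_reason("stop_sequence") # => LLM::StopReason::STOP_SEQUENCE
LLM::Clients::Anthropic::Response.normalize_stop_reason("max_tokens")    # => LLM::StopReason::MAX_TOKENS_REACHED
LLM::Clients::Anthropic::Response.normalize_stop_reason("anything_else") # => LLM::StopReason::OTHER
```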
data/lib/llm/clients/anthropic.rb
ADDED
@@ -0,0 +1,113 @@
+# frozen_string_literal: true
+
+require "httparty"
+
+class LLM
+  module Clients
+    class Anthropic
+      include HTTParty
+      base_uri "https://api.anthropic.com"
+
+      def initialize(llm:)
+        @llm = llm
+      end
+
+      def chat(messages, options = {})
+        request = payload(messages, options)
+
+        return chat_streaming(request, options[:on_message], options[:on_complete]) if options[:stream]
+
+        resp = post_url("/v1/messages", body: request.to_json)
+
+        Response.new(resp).to_normalized_response
+      end
+
+      private
+
+      def chat_streaming(request, on_message, on_complete)
+        buffer = +""
+        chunks = []
+        output_data = {}
+
+        wrapped_on_complete = lambda { |stop_reason|
+          output_data[:stop_reason] = stop_reason
+          on_complete&.call(stop_reason)
+        }
+
+        request[:stream] = true
+
+        proc = handle_event_stream(buffer, chunks, on_message_proc: on_message, on_complete_proc: wrapped_on_complete)
+
+        _resp = post_url_streaming("/v1/messages", body: request.to_json, &proc)
+
+        LLM::Response.new(
+          content: buffer,
+          raw_response: chunks,
+          stop_reason: Response.normalize_stop_reason(output_data[:stop_reason])
+        )
+      end
+
+      def handle_event_stream(buffer, chunks, on_message_proc:, on_complete_proc:)
+        each_json_chunk do |type, chunk|
+          chunks << chunk
+          case type
+          when "content_block_delta"
+            new_content = chunk.dig("delta", "text")
+            buffer << new_content
+            on_message_proc&.call(new_content)
+          when "message_delta"
+            finish_reason = chunk.dig("delta", "stop_reason")
+            on_complete_proc&.call(finish_reason)
+          else
+            next
+          end
+        end
+      end
+
+      def each_json_chunk
+        parser = EventStreamParser::Parser.new
+
+        proc do |chunk|
+          # TODO: Add error handling.
+
+          parser.feed(chunk) do |type, data|
+            yield(type, JSON.parse(data))
+          end
+        end
+      end
+
+      def payload(messages, options = {})
+        {
+          system: combined_system_messages(messages),
+          messages: messages.filter { |m| m[:role].to_sym != :system },
+          model: @llm.canonical_name,
+          max_tokens: options[:max_output_tokens] || @llm.default_params[:max_output_tokens],
+          temperature: options[:temperature],
+          top_p: options[:top_p],
+          top_k: options[:top_k],
+          stream: options[:stream]
+        }.compact
+      end
+
+      def combined_system_messages(messages)
+        messages.filter { |m| m[:role].to_sym == :system }.map { |m| m[:content] }.join('\n\n')
+      end
+
+      def post_url(url, body:)
+        self.class.post(url, body: body, headers: default_headers)
+      end
+
+      def post_url_streaming(url, **kwargs, &block)
+        self.class.post(url, **kwargs.merge(headers: default_headers, stream_body: true), &block)
+      end
+
+      def default_headers
+        {
+          "anthropic-version" => "2023-06-01",
+          "x-api-key" => ENV["ANTHROPIC_API_KEY"],
+          "Content-Type" => "application/json"
+        }
+      end
+    end
+  end
+end
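
In the client above, `payload` folds system-role messages into Anthropic's top-level `system` field and strips them from `messages`, while `.compact` drops any options left nil; note the join string is the single-quoted literal `'\n\n'`, so multiple system messages are separated by a literal backslash-n sequence rather than blank lines. A sketch with a made-up conversation (the model name is hypothetical; the field values follow the code above):

```ruby
client = LLM::Clients::Anthropic.new(llm: LLM.from_string!("claude-3-5-haiku-20241022"))

messages = [
  {role: :system, content: "You are terse."},
  {role: :user, content: "Name one prime number."}
]

# payload(messages, max_output_tokens: 256) builds roughly:
# {
#   system: "You are terse.",
#   messages: [{role: :user, content: "Name one prime number."}],
#   model: "claude-3-5-haiku-20241022",  # @llm.canonical_name
#   max_tokens: 256
# }
# temperature, top_p, top_k and stream are nil here, so .compact removes them.
response = client.chat(messages, max_output_tokens: 256)
puts response.content
```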
data/lib/llm/clients/gemini/request.rb
ADDED
@@ -0,0 +1,75 @@
+# frozen_string_literal: true
+
+class LLM
+  module Clients
+    class Gemini
+      class Request
+        def initialize(messages, options)
+          @messages = messages
+          @options = options
+        end
+
+        def model_for_url
+          "models/#{model}"
+        end
+
+        def params
+          generation_config = {}
+          if options[:response_format]
+            generation_config = {
+              responseMimeType: "application/json",
+              responseSchema: options[:response_format]&.gemini_response_format
+            }
+          end
+
+          {
+            systemInstruction: normalized_prompt,
+            contents: normalized_messages,
+            generationConfig: generation_config
+          }
+        end
+
+        private
+
+        attr_reader :messages, :options
+
+        def model
+          options[:model]
+        end
+
+        def normalized_messages
+          user_visible_messages
+            .map(&method(:message_to_gemini_message))
+        end
+
+        def message_to_gemini_message(message)
+          {
+            role: ROLES_MAP[message[:role]],
+            parts: [{text: message[:content]}]
+          }
+        end
+
+        def normalized_prompt
+          return nil if system_messages.empty?
+
+          system_messages
+            .map { |message| message[:content] }
+            .join("\n\n")
+        end
+
+        def system_messages
+          messages.filter { |message| message[:role] == :system }
+        end
+
+        def user_visible_messages
+          messages.filter { |message| message[:role] != :system }
+        end
+
+        ROLES_MAP = {
+          assistant: :model,
+          user: :user
+        }.freeze
+      end
+    end
+  end
+end
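
A sketch of the request body `Request#params` produces for a short conversation with no `response_format` option; the field names and the assistant-to-`:model` role mapping come straight from the code above, while the conversation itself is made up:

```ruby
req = LLM::Clients::Gemini::Request.new(
  [
    {role: :system, content: "Answer in one word."},
    {role: :user, content: "Capital of France?"},
    {role: :assistant, content: "Paris"},
    {role: :user, content: "And of Italy?"}
  ],
  {}
)

req.params
# => {
#      systemInstruction: "Answer in one word.",
#      contents: [
#        {role: :user, parts: [{text: "Capital of France?"}]},
#        {role: :model, parts: [{text: "Paris"}]},
#        {role: :user, parts: [{text: "And of Italy?"}]}
#      ],
#      generationConfig: {}
#    }
```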
data/lib/llm/clients/gemini/response.rb
ADDED
@@ -0,0 +1,61 @@
+# frozen_string_literal: true
+
+class LLM
+  module Clients
+    class Gemini
+      class Response
+        def initialize(raw_response)
+          @raw_response = raw_response
+        end
+
+        def to_normalized_response
+          LLM::Response.new(
+            content: content,
+            raw_response: parsed_response,
+            stop_reason: translated_stop_reason,
+            structured_output: structured_output
+          )
+        end
+
+        def self.normalize_stop_reason(stop_reason)
+          case stop_reason
+          when "STOP"
+            LLM::StopReason::STOP
+          when "MAX_TOKENS"
+            LLM::StopReason::MAX_TOKENS
+          when "SAFETY"
+            LLM::StopReason::SAFETY
+          else
+            LLM::StopReason::OTHER
+          end
+        end
+
+        private
+
+        attr_reader :raw_response
+
+        def content
+          parsed_response.dig("candidates", 0, "content", "parts", 0, "text")
+        end
+
+        def stop_reason
+          parsed_response.dig("candidates", 0, "finishReason")
+        end
+
+        def translated_stop_reason
+          self.class.normalize_stop_reason(stop_reason)
+        end
+
+        def parsed_response
+          raw_response.parsed_response
+        end
+
+        def structured_output
+          @structured_output ||= JSON.parse(parsed_response.dig("candidates", 0, "content", "parts", 0, "text"))
+        rescue JSON::ParserError
+          nil
+        end
+      end
+    end
+  end
+end
data/lib/llm/clients/gemini.rb
ADDED
@@ -0,0 +1,102 @@
+# frozen_string_literal: true
+
+require "httparty"
+require "event_stream_parser"
+
+class LLM
+  module Clients
+    class Gemini
+      include HTTParty
+      base_uri "https://generativelanguage.googleapis.com"
+
+      def initialize(llm:)
+        @llm = llm
+      end
+
+      def chat(messages, options = {})
+        req = Request.new(messages, options)
+
+        return chat_streaming(req, options[:on_message], options[:on_complete]) if options[:stream]
+
+        resp = post_url(
+          "/v1beta/models/#{llm.canonical_name}:generateContent",
+          body: req.params.to_json
+        )
+
+        Response.new(resp).to_normalized_response
+      end
+
+      private
+
+      attr_reader :llm
+
+      def chat_streaming(request, on_message, on_complete)
+        buffer = +""
+        chunks = []
+        output_data = {}
+
+        wrapped_on_complete = lambda { |stop_reason|
+          output_data[:stop_reason] = stop_reason
+          on_complete&.call(stop_reason)
+        }
+
+        proc = handle_event_stream(buffer, chunks, on_message_proc: on_message, on_complete_proc: wrapped_on_complete)
+
+        _resp = post_url_streaming(
+          "/v1beta/models/#{llm.canonical_name}:streamGenerateContent?alt=sse",
+          body: request.params.to_json,
+          &proc
+        )
+
+        LLM::Response.new(
+          content: buffer,
+          raw_response: chunks,
+          stop_reason: Response.normalize_stop_reason(output_data[:stop_reason])
+        )
+      end
+
+      def handle_event_stream(buffer, chunks, on_message_proc:, on_complete_proc:)
+        each_json_chunk do |_type, chunk|
+          chunks << chunk
+
+          new_content = chunk.dig("candidates", 0, "content", "parts", 0, "text")
+
+          unless new_content.nil?
+            on_message_proc&.call(new_content)
+            buffer << new_content
+          end
+
+          stop_reason = chunk.dig("candidates", 0, "finishReason")
+          on_complete_proc&.call(stop_reason) unless stop_reason.nil?
+        end
+      end
+
+      def each_json_chunk
+        parser = EventStreamParser::Parser.new
+
+        proc do |chunk|
+          # TODO: Add error handling.
+
+          parser.feed(chunk) do |type, data|
+            yield(type, JSON.parse(data))
+          end
+        end
+      end
+
+      def post_url(url, **kwargs)
+        self.class.post(url, **kwargs.merge(headers: default_headers))
+      end
+
+      def post_url_streaming(url, **kwargs, &block)
+        self.class.post(url, **kwargs.merge(headers: default_headers, stream_body: true), &block)
+      end
+
+      def default_headers
+        {
+          "x-goog-api-key" => ENV["GEMINI_API_KEY"],
+          "Content-Type" => "application/json"
+        }
+      end
+    end
+  end
+end
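
A streaming sketch against the new Gemini client through the public interface the README describes; the model string comes from the Google table above, and the `stream`/`on_message`/`on_complete` option names come from `chat` and `chat_streaming` in this file:

```ruby
require 'llm_ruby'

llm = LLM.from_string!("gemini-1.5-flash")
client = llm.client

response = client.chat(
  [{role: :user, content: "Stream a haiku about Ruby."}],
  stream: true,
  on_message: ->(chunk) { print chunk },               # called with each text delta
  on_complete: ->(reason) { puts "\ndone: #{reason}" } # called with the finishReason
)

puts response.content # the full buffered text once streaming ends
```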
data/lib/llm/clients/open_ai/response.rb
CHANGED
@@ -1,42 +1,55 @@
 # frozen_string_literal: true
 
-class LLM
-
-
-
+class LLM
+  module Clients
+    class OpenAI
+      class Response
+        def initialize(raw_response)
+          @raw_response = raw_response
+        end
 
-
-
-
-
-
-
-
+        def to_normalized_response
+          LLM::Response.new(
+            content: content,
+            raw_response: parsed_response,
+            stop_reason: normalize_stop_reason,
+            structured_output: structured_output
+          )
+        end
 
-
-
-
-
-
-
-
-
-
-
-
-
+        def self.normalize_stop_reason(stop_reason)
+          case stop_reason
+          when "stop"
+            LLM::StopReason::STOP
+          when "safety"
+            LLM::StopReason::SAFETY
+          when "max_tokens"
+            LLM::StopReason::MAX_TOKENS_REACHED
+          else
+            LLM::StopReason::OTHER
+          end
+        end
 
-
+        private
 
-
-
-
+        def content
+          parsed_response.dig("choices", 0, "message", "content")
+        end
 
-
-
-
+        def normalize_stop_reason
+          self.class.normalize_stop_reason(parsed_response.dig("choices", 0, "finish_reason"))
+        end
 
-
-
+        def parsed_response
+          @raw_response.parsed_response
+        end
+
+        def structured_output
+          @structured_output ||= JSON.parse(parsed_response.dig("choices", 0, "message", "content"))
+        rescue JSON::ParserError
+          nil
+        end
+      end
+    end
   end
 end
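
The reworked OpenAI response class now also attempts to parse the message content as JSON, returning nil on failure, and passes the result to `LLM::Response` as `structured_output`. A sketch of the normalized fields a caller would see, assuming `LLM::Response` exposes a reader for the new attribute (the prompt is made up):

```ruby
llm = LLM.from_string!("gpt-4o")
client = llm.client

response = client.chat([{role: :user, content: "Reply with the JSON object {\"ok\": true}"}])

response.content           # raw text from choices[0].message.content
response.stop_reason       # e.g. LLM::StopReason::STOP when finish_reason is "stop"
response.structured_output # parsed Hash when the content is valid JSON, otherwise nil
```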