raix 0.3.2 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 29e66d0046995dca8f9d093ccfd719e54905f94c62d5ee4e78588e1b40b0dadd
-  data.tar.gz: 7b1d4544576891124e209130d339be55fe35573e05dcb05474c526655e850390
+  metadata.gz: 318057e8ece37b63c06884a61c37dc1ef15f38cee2c05d5305bcaf6d3697420e
+  data.tar.gz: 250229d71a808203689b87cea7c1ce8dd31b085c92b54dce392787299cd420d6
 SHA512:
-  metadata.gz: a8989d6e4c4422054a2e452637ffc3205a7242a060f32c86fcf5c72ff1050f50d7bc6b20a9e4eb147398cdbc461c6fe80cf1f1426e0401fe944ce484ae70d4f9
-  data.tar.gz: 364ba4571799b8a7eea4abae9f4a905a32121303f62e22e66bff9f4663831b6311efa3eb9238f4eaa326456c2db467d7de0277000e42a11db710435c3e482679
+  metadata.gz: dc51d8fab907f8ffa5e95df2ef308ee3cd5fc443f46803e8a6f184be80e12719be2d25aeb51349a7912b35df55866db66cdec532c0258bc66a97aeabbc4017d0
+  data.tar.gz: 7b393143a5da05ba75ac11a8a77e31b0e61c1e7e9cb564fc6c986a4f63e2f98a24d43056f613e8e4685baab85251c9e819bcf99fcdc22ee7428b50eb9895d4e7
data/.rubocop.yml CHANGED
@@ -11,7 +11,7 @@ Style/StringLiteralsInInterpolation:
   EnforcedStyle: double_quotes
 
 Layout/LineLength:
-  Max: 120
+  Max: 180
 
 Metrics/BlockLength:
   Enabled: false
data/CHANGELOG.md CHANGED
@@ -8,3 +8,9 @@
 - adds `ChatCompletion` module
 - adds `PromptDeclarations` module
 - adds `FunctionDispatch` module
+
+## [0.3.2] - 2024-06-29
+- adds support for streaming
+
+## [0.4.0] - 2024-10-18
+- adds support for Anthropic-style prompt caching
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    raix (0.3.1)
+    raix (0.3.2)
       activesupport (>= 6.0)
       open_router (~> 0.2)
 
@@ -198,6 +198,7 @@ GEM
 
 PLATFORMS
   arm64-darwin-21
+  arm64-darwin-22
   x86_64-linux
 
 DEPENDENCIES
data/README.md CHANGED
@@ -42,6 +42,30 @@ transcript << { role: "user", content: "What is the meaning of life?" }
 
 One of the advantages of OpenRouter and the reason that it is used by default by this library is that it handles mapping message formats from the OpenAI standard to whatever other model you're wanting to use (Anthropic, Cohere, etc.)
 
+### Prompt Caching
+
+Raix supports [Anthropic-style prompt caching](https://openrouter.ai/docs/prompt-caching#anthropic-claude) when using Anthropic's Claude family of models. You can specify a `cache_at` parameter when doing a chat completion. If the character count of a particular message's content exceeds the `cache_at` value, that message is sent to Anthropic as a multipart message with a cache control "breakpoint" set to "ephemeral".
+
+Note that Anthropic imposes a limit of four breakpoints, and the cache expires within five minutes. It is therefore recommended to reserve cache breakpoints for large bodies of text, such as character cards, CSV data, RAG data, book chapters, etc. Raix does not enforce a limit on the number of breakpoints, which means you might get an error if you try to cache too many messages.
+
+```ruby
+>> my_class.chat_completion(params: { cache_at: 1000 })
+=> {
+  "messages": [
+    {
+      "role": "system",
+      "content": [
+        {
+          "type": "text",
+          "text": "HUGE TEXT BODY LONGER THAN 1000 CHARACTERS",
+          "cache_control": {
+            "type": "ephemeral"
+          }
+        }
+      ]
+    },
+```
+
 ### Use of Tools/Functions
 
 The second (optional) module that you can add to your Ruby classes after `ChatCompletion` is `FunctionDispatch`. It lets you declare and implement functions to be called at the AI's discretion as part of a chat completion "loop" in a declarative, Rails-like "DSL" fashion.
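
For context, a minimal sketch of how the new `cache_at` option might be used from a class that includes `Raix::ChatCompletion`. The class name, model id, and prompt text are hypothetical; only the `cache_at`, `model`, and `transcript` accessors and the `chat_completion` call come from the gem itself.

```ruby
# Hypothetical example; assumes Raix.configuration already has an OpenRouter client.
class DocumentSummarizer
  include Raix::ChatCompletion

  def initialize(huge_document)
    self.model = "anthropic/claude-3.5-sonnet" # illustrative OpenRouter model id
    self.cache_at = 1000                       # cache messages longer than 1000 characters

    transcript << { system: huge_document }    # large body of text worth caching
    transcript << { user: "Summarize the document above." }
  end

  def summarize
    chat_completion
  end
end
```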
@@ -216,6 +240,18 @@ If bundler is not being used to manage dependencies, install the gem by executin
 
     $ gem install raix
 
+If you are using the default OpenRouter API, Raix expects `Raix.configuration.openrouter_client` to be initialized with an OpenRouter API client instance.
+
+You can add an initializer to your application's `config/initializers` directory:
+
+```ruby
+# config/initializers/raix.rb
+Raix.configure do |config|
+  config.openrouter_client = OpenRouter::Client.new
+end
+```
+
+You will also need to configure the OpenRouter API access token as per the instructions here: https://github.com/OlympiaAI/open_router?tab=readme-ov-file#quickstart
 
 ## Development
 
data/lib/raix/chat_completion.rb CHANGED
@@ -2,6 +2,7 @@
 
 require "active_support/concern"
 require "active_support/core_ext/object/blank"
+require "raix/message_adapters/base"
 require "open_router"
 require "openai"
 
@@ -17,9 +18,9 @@ module Raix
   module ChatCompletion
     extend ActiveSupport::Concern
 
-    attr_accessor :frequency_penalty, :logit_bias, :logprobs, :loop, :min_p, :model, :presence_penalty,
-                  :repetition_penalty, :response_format, :stream, :temperature, :max_tokens, :seed, :stop, :top_a,
-                  :top_k, :top_logprobs, :top_p, :tools, :tool_choice, :provider
+    attr_accessor :cache_at, :frequency_penalty, :logit_bias, :logprobs, :loop, :min_p, :model, :presence_penalty,
+                  :repetition_penalty, :response_format, :stream, :temperature, :max_completion_tokens,
+                  :max_tokens, :seed, :stop, :top_a, :top_k, :top_logprobs, :top_p, :tools, :tool_choice, :provider
 
     # This method performs chat completion based on the provided transcript and parameters.
     #
@@ -30,16 +31,12 @@ module Raix
     # @option params [Boolean] :raw (false) Whether to return the raw response or dig the text content.
     # @return [String|Hash] The completed chat response.
     def chat_completion(params: {}, loop: false, json: false, raw: false, openai: false)
-      messages = transcript.flatten.compact.map { |msg| transform_message_format(msg) }
-      raise "Can't complete an empty transcript" if messages.blank?
-
-      # used by FunctionDispatch
-      self.loop = loop
-
       # set params to default values if not provided
+      params[:cache_at] ||= cache_at.presence
       params[:frequency_penalty] ||= frequency_penalty.presence
       params[:logit_bias] ||= logit_bias.presence
       params[:logprobs] ||= logprobs.presence
+      params[:max_completion_tokens] ||= max_completion_tokens.presence || Raix.configuration.max_completion_tokens
       params[:max_tokens] ||= max_tokens.presence || Raix.configuration.max_tokens
       params[:min_p] ||= min_p.presence
       params[:presence_penalty] ||= presence_penalty.presence
@@ -57,23 +54,29 @@ module Raix
       params[:top_p] ||= top_p.presence
 
       if json
-        params[:provider] ||= {}
-        params[:provider][:require_parameters] = true
+        unless openai
+          params[:provider] ||= {}
+          params[:provider][:require_parameters] = true
+        end
         params[:response_format] ||= {}
         params[:response_format][:type] = "json_object"
       end
 
+      # used by FunctionDispatch
+      self.loop = loop
+
       # set the model to the default if not provided
       self.model ||= Raix.configuration.model
 
+      adapter = MessageAdapters::Base.new(self)
+      messages = transcript.flatten.compact.map { |msg| adapter.transform(msg) }
+      raise "Can't complete an empty transcript" if messages.blank?
+
       begin
         response = if openai
-                     openai_request(params:, model: openai,
-                                    messages:)
+                     openai_request(params:, model: openai, messages:)
                    else
-                     openrouter_request(
-                       params:, model:, messages:
-                     )
+                     openrouter_request(params:, model:, messages:)
                    end
         retry_count = 0
         content = nil
@@ -115,8 +118,8 @@ module Raix
           raise e # just fail if we can't get content after 3 attempts
         end
 
-        # attempt to fix the JSON
-        JsonFixer.new.call(content, e.message)
+        puts "Bad JSON received!!!!!!: #{content}"
+        raise e
       rescue Faraday::BadRequestError => e
         # make sure we see the actual error message on console or Honeybadger
         puts "Chat completion failed!!!!!!!!!!!!!!!!: #{e.response[:body]}"
@@ -132,6 +135,9 @@ module Raix
     #   { user: "Hey what time is it?" },
     #   { assistant: "Sorry, pumpkins do not wear watches" }
     #
+    # to add a function call use the following format:
+    #   { function: { name: 'fancy_pants_function', arguments: { param: 'value' } } }
+    #
     # to add a function result use the following format:
     #   { function: result, name: 'fancy_pants_function' }
     #
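
Putting the documented function-call and function-result entry formats together, a transcript that records a tool round-trip might look like this (the function name and values are illustrative):

```ruby
transcript << { user: "What's the weather in Lisbon?" }

# the model's function call, in the newly documented format
transcript << { function: { name: "check_weather", arguments: { location: "Lisbon" } } }

# the result returned by your function implementation
transcript << { function: "22C and sunny", name: "check_weather" }
```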
@@ -143,11 +149,21 @@ module Raix
     private
 
     def openai_request(params:, model:, messages:)
+      # deprecated in favor of max_completion_tokens
+      params.delete(:max_tokens)
+
       params[:stream] ||= stream.presence
+      params[:stream_options] = { include_usage: true } if params[:stream]
+
+      params.delete(:temperature) if model == "o1-preview"
+
       Raix.configuration.openai_client.chat(parameters: params.compact.merge(model:, messages:))
     end
 
     def openrouter_request(params:, model:, messages:)
+      # max_completion_tokens is not supported by OpenRouter
+      params.delete(:max_completion_tokens)
+
       retry_count = 0
 
       begin
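
`openai_request` goes through `Raix.configuration.openai_client`, so using the direct OpenAI path (`chat_completion(openai: "gpt-4o")`) requires that client to be configured. A sketch of what that might look like, assuming the `ruby-openai` client and an `OPENAI_API_KEY` environment variable (both assumptions, not shown in this diff):

```ruby
# config/initializers/raix.rb (hypothetical OpenAI setup)
Raix.configure do |config|
  config.openai_client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
end

# later, in an instance of a class that includes Raix::ChatCompletion:
# chat_completion(openai: "gpt-4o")
```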
@@ -163,17 +179,5 @@ module Raix
         raise e
       end
     end
-
-    def transform_message_format(message)
-      return message if message[:role].present?
-
-      if message[:function].present?
-        { role: "assistant", name: message.dig(:function, :name), content: message.dig(:function, :arguments).to_json }
-      elsif message[:result].present?
-        { role: "function", name: message[:name], content: message[:result] }
-      else
-        { role: message.first.first, content: message.first.last }
-      end
-    end
   end
 end
data/lib/raix/function_dispatch.rb CHANGED
@@ -35,7 +35,7 @@ module Raix
   # argument will be executed in the instance context of the class that includes this module.
   #
   # Example:
-  #   function :google_search, description: "Search Google for something", query: { type: "string" } do |arguments|
+  #   function :google_search, "Search Google for something", query: { type: "string" } do |arguments|
   #     GoogleSearch.new(arguments[:query]).search
   #   end
   #
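
In other words, the description is now the second positional argument rather than a `description:` keyword. In a class, the revised declaration style would read something like this (the class and block body are illustrative):

```ruby
class WeatherAssistant
  include Raix::ChatCompletion
  include Raix::FunctionDispatch

  # name, then description string, then parameter definitions
  function :check_weather, "Check the current weather for a location",
           location: { type: "string" } do |arguments|
    # illustrative body; a real implementation would call a weather API
    "The weather in #{arguments[:location]} is sunny"
  end
end
```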
data/lib/raix/prompt_declarations.rb CHANGED
@@ -83,7 +83,7 @@ module Raix
         params = @current_prompt.params.merge(params)
 
         # set the stream if necessary
-        self.stream = instance_exec(&current_prompt.stream) if current_prompt.stream.present?
+        self.stream = instance_exec(&@current_prompt.stream) if @current_prompt.stream.present?
 
         super(params:, raw:).then do |response|
           transcript << { assistant: response }
data/lib/raix/version.rb CHANGED
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Raix
-  VERSION = "0.3.2"
+  VERSION = "0.4.0"
 end
data/lib/raix.rb CHANGED
@@ -16,6 +16,9 @@ module Raix
     # The max_tokens option determines the maximum number of tokens to generate.
     attr_accessor :max_tokens
 
+    # The max_completion_tokens option determines the maximum number of tokens to generate.
+    attr_accessor :max_completion_tokens
+
     # The model option determines the model to use for text generation. This option
     # is normally set in each class that includes the ChatCompletion module.
     attr_accessor :model
@@ -27,12 +30,14 @@ module Raix
     attr_accessor :openai_client
 
     DEFAULT_MAX_TOKENS = 1000
+    DEFAULT_MAX_COMPLETION_TOKENS = 16_384
     DEFAULT_MODEL = "meta-llama/llama-3-8b-instruct:free"
     DEFAULT_TEMPERATURE = 0.0
 
     # Initializes a new instance of the Configuration class with default values.
     def initialize
       self.temperature = DEFAULT_TEMPERATURE
+      self.max_completion_tokens = DEFAULT_MAX_COMPLETION_TOKENS
       self.max_tokens = DEFAULT_MAX_TOKENS
       self.model = DEFAULT_MODEL
     end
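
With the new accessor and its 16,384-token default in place, overriding the ceiling globally could look like the snippet below (values are illustrative; per the `openrouter_request` change above, `max_completion_tokens` is only sent when calling the OpenAI API directly):

```ruby
Raix.configure do |config|
  config.max_completion_tokens = 4_096 # used for OpenAI requests; default is 16_384
  config.max_tokens = 1_000            # still used for OpenRouter requests
end
```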
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: raix
 version: !ruby/object:Gem::Version
-  version: 0.3.2
+  version: 0.4.0
 platform: ruby
 authors:
 - Obie Fernandez
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2024-06-30 00:00:00.000000000 Z
+date: 2024-10-19 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activesupport
@@ -84,7 +84,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.4.10
+rubygems_version: 3.5.21
 signing_key:
 specification_version: 4
 summary: Ruby AI eXtensions