RubyGems - active_genie - Versions diffs - 0.0.10 → 0.0.18 - Mend

active_genie 0.0.10 → 0.0.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (47)

checksums.yaml +4 -4
data/README.md +63 -57
data/VERSION +1 -1
data/lib/active_genie/battle/README.md +7 -7
data/lib/active_genie/battle/basic.rb +75 -68
data/lib/active_genie/battle.rb +4 -0
data/lib/active_genie/clients/anthropic_client.rb +110 -0
data/lib/active_genie/clients/google_client.rb +158 -0
data/lib/active_genie/clients/helpers/retry.rb +29 -0
data/lib/active_genie/clients/openai_client.rb +58 -38
data/lib/active_genie/clients/unified_client.rb +5 -5
data/lib/active_genie/concerns/loggable.rb +44 -0
data/lib/active_genie/configuration/log_config.rb +1 -1
data/lib/active_genie/configuration/providers/anthropic_config.rb +54 -0
data/lib/active_genie/configuration/providers/base_config.rb +85 -0
data/lib/active_genie/configuration/providers/deepseek_config.rb +54 -0
data/lib/active_genie/configuration/providers/google_config.rb +56 -0
data/lib/active_genie/configuration/providers/openai_config.rb +54 -0
data/lib/active_genie/configuration/providers_config.rb +7 -4
data/lib/active_genie/configuration/runtime_config.rb +35 -0
data/lib/active_genie/configuration.rb +18 -4
data/lib/active_genie/data_extractor/README.md +0 -1
data/lib/active_genie/data_extractor/basic.rb +22 -19
data/lib/active_genie/data_extractor/from_informal.rb +4 -15
data/lib/active_genie/data_extractor.rb +4 -0
data/lib/active_genie/logger.rb +60 -14
data/lib/active_genie/{league → ranking}/README.md +7 -7
data/lib/active_genie/ranking/elo_round.rb +134 -0
data/lib/active_genie/ranking/free_for_all.rb +93 -0
data/lib/active_genie/ranking/player.rb +92 -0
data/lib/active_genie/{league → ranking}/players_collection.rb +19 -12
data/lib/active_genie/ranking/ranking.rb +153 -0
data/lib/active_genie/ranking/ranking_scoring.rb +71 -0
data/lib/active_genie/ranking.rb +12 -0
data/lib/active_genie/scoring/README.md +1 -1
data/lib/active_genie/scoring/basic.rb +93 -49
data/lib/active_genie/scoring/{recommended_reviews.rb → recommended_reviewers.rb} +18 -7
data/lib/active_genie/scoring.rb +6 -3
data/lib/active_genie.rb +1 -1
data/lib/tasks/benchmark.rake +27 -0
metadata +100 -100
data/lib/active_genie/configuration/openai_config.rb +0 -56
data/lib/active_genie/league/elo_ranking.rb +0 -121
data/lib/active_genie/league/free_for_all.rb +0 -62
data/lib/active_genie/league/league.rb +0 -120
data/lib/active_genie/league/player.rb +0 -59
data/lib/active_genie/league.rb +0 -12

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 9d0424a39ba21d821cb2419730387e1b026c35b5e2e5dff9f6d615f3ec54e6a3
-  data.tar.gz: 17b460ccd1a689d0f8709af2b84f3cde65aa0075b76104ee1b4bb8b3b0ffc182
+  metadata.gz: 81b6b3ccf366bdeb07e1dfc1942749e4a1d48da74735c48a95cb9d53afb61b33
+  data.tar.gz: df2d1ee4ac8bbcfa031b261bedd228ed5c3a8772c055e312360d6a4ad2f699fa
 SHA512:
-  metadata.gz: ad98b2d5d063d0d1c4009e1a9f92d6d326ed948cbdee71317c94b6a4a0ee57042609c1f405ca7ce00d4beb321bc1e98ad722646349076929bcdc0a28da8b6b8b
-  data.tar.gz: d2cc39b77757619c5235041f12d5182778b4237444f3c2246982ebcf54c0542af0da194783cea57fbe4fbde0985ef635802c648796b4f5c4d7a5d4f42c6519c7
+  metadata.gz: d3a2ff8342483f8b475f0e60d91fa839ba57b0853e82e637ba4e761fd9ae749917e5ae134803200bfe3fd4bab658b297c1888c88d6c433d4f2c0a0694face6aa
+  data.tar.gz: a4b37bd1e6ba7a3a4b6edea20bd37f7e2dd11142182b65b8e62707e993d9c42d44486c2247e342ab8a81b01778456dd5bf15616261e01d6c0dd556757646da18
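
The digests above can be checked locally once a `.gem` archive has been unpacked. The following is a minimal sketch, assuming the archive member `data.tar.gz` has already been extracted to disk; the helper name `sha256_matches?` and the file path are hypothetical and not part of the gem.

```ruby
require 'digest'

# Hypothetical helper: compares the SHA256 hex digest of raw bytes
# with an expected hex string such as the ones listed in checksums.yaml.
def sha256_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# Usage sketch (the path is an assumption):
#   bytes = File.binread('data.tar.gz')
#   sha256_matches?(bytes, 'df2d1ee4ac8bbcfa031b261bedd228ed5c3a8772c055e312360d6a4ad2f699fa')
```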

data/README.md CHANGED

@@ -1,16 +1,11 @@
 # ActiveGenie 🧞‍♂️
-> Transform your Ruby application with powerful, production-ready GenAI features
+> The lodash for GenAI, stop reinventing the wheel
 [![Gem Version](https://badge.fury.io/rb/active_genie.svg?icon=si%3Arubygems)](https://badge.fury.io/rb/active_genie)
-[![Ruby](https://github.com/roriz/active_genie/actions/workflows/ruby.yml/badge.svg)](https://github.com/roriz/active_genie/actions/workflows/ruby.yml)
+[![Ruby](https://github.com/roriz/active_genie/actions/workflows/benchmark.yml/badge.svg)](https://github.com/roriz/active_genie/actions/workflows/benchmark.yml)
-ActiveGenie is a Ruby gem that provides a polished, production-ready interface for working with Generative AI (GenAI) models. Just like ActiveStorage simplifies file handling in Rails, ActiveGenie makes it effortless to integrate GenAI capabilities into your Ruby applications.
-## Features
-- 🎯 **Data Extraction**: Extract structured data from unstructured text with type validation
-- 📊 **Smart Scoring**: Multi-reviewer evaluation system with automatic expert selection
-- 💭 **Leaderboard**: Consistent rank items based on custom criteria, using multiple tecniques of ranking
+ActiveGenie is a Ruby gem that provides valuable solutions powered by Generative AI (GenAI) models. Just like Lodash or ActiveStorage, ActiveGenie brings a set of Modules reach real value fast and reliable.
+ActiveGenie is backed by a custom benchmarking system that ensures consistent quality and performance across different models and providers in every release.
 ## Installation
@@ -40,6 +35,7 @@ end
 ## Quick Start
 ### Data Extractor
 Extract structured data from text using AI-powered analysis, handling informal language and complex expressions.
 ```ruby
@@ -54,13 +50,17 @@ schema = {
     minimum: 0
   },
   size: {
-    type: 'integer',
+    type: 'number',
     minimum: 35,
     maximum: 46
   }
 }
-result = ActiveGenie::DataExtractor.call(text, schema)
+result = ActiveGenie::DataExtractor.call(
+  text,
+  schema,
+  config: { provider: :openai, model: 'gpt-4o-mini' } # optional
+)
 # => {
 #      brand: "Nike",
 #      brand_explanation: "Brand name found at start of text",
@@ -71,6 +71,8 @@ result = ActiveGenie::DataExtractor.call(text, schema)
 #    }
 ```
+*Recommended model*: `gpt-4o-mini`
 Features:
 - Structured data extraction with type validation
 - Schema-based extraction with custom constraints
@@ -86,7 +88,11 @@ Text evaluation system that provides detailed scoring and feedback using multipl
 text = "The code implements a binary search algorithm with O(log n) complexity"
 criteria = "Evaluate technical accuracy and clarity"
-result = ActiveGenie::Scoring.basic(text, criteria)
+result = ActiveGenie::Scoring.basic(
+  text,
+  criteria,
+  config: { provider: :anthropic, model: 'claude-3-5-haiku-20241022' } # optional
+)
 # => {
 #      algorithm_expert_score: 95,
 #      algorithm_expert_reasoning: "Accurately describes binary search and its complexity",
@@ -96,6 +102,8 @@ result = ActiveGenie::Scoring.basic(text, criteria)
 #    }
 ```
+*Recommended model*: `claude-3-5-haiku-20241022`
 Features:
 - Multi-reviewer evaluation with automatic expert selection
 - Detailed feedback with scoring reasoning
@@ -110,20 +118,27 @@ AI-powered battle evaluation system that determines winners between two players
 ```ruby
 require 'active_genie'
-player_a = "Implementation uses dependency injection for better testability"
-player_b = "Code has high test coverage but tightly coupled components"
+player_1 = "Implementation uses dependency injection for better testability"
+player_2 = "Code has high test coverage but tightly coupled components"
 criteria = "Evaluate code quality and maintainability"
-result = ActiveGenie::Battle.call(player_a, player_b, criteria)
+result = ActiveGenie::Battle.call(
+  player_1,
+  player_2,
+  criteria,
+  config: { provider: :google, model: 'gemini-2.0-flash-lite' } # optional
+)
 # => {
 #      winner_player: "Implementation uses dependency injection for better testability",
-#      reasoning: "Player A's implementation demonstrates better maintainability through dependency injection,
-#                 which allows for easier testing and component replacement. While Player B has good test coverage,
+#      reasoning: "Player 1 implementation demonstrates better maintainability through dependency injection,
+#                 which allows for easier testing and component replacement. While Player 2 has good test coverage,
 #                 the tight coupling makes the code harder to maintain and modify.",
 #      what_could_be_changed_to_avoid_draw: "Focus on specific architectural patterns and design principles"
 #    }
 ```
+*Recommended model*: `gemini-2.0-flash-lite`
 Features:
 - Multi-reviewer evaluation with automatic expert selection
 - Detailed feedback with scoring reasoning
@@ -132,9 +147,8 @@ Features:
 See the [Battle README](lib/active_genie/battle/README.md) for advanced usage, custom reviewers, and detailed interface documentation.
-### League
-The League module provides competitive ranking through multi-stage evaluation:
+### Ranking
+The Ranking module provides competitive ranking through multi-stage evaluation:
 ```ruby
 require 'active_genie'
@@ -142,62 +156,53 @@ require 'active_genie'
 players = ['REST API', 'GraphQL API', 'SOAP API', 'gRPC API', 'Websocket API']
 criteria = "Best one to be used into a high changing environment"
-result = ActiveGenie::League.call(players, criteria)
+result = ActiveGenie::Ranking.call(
+  players,
+  criteria,
+  config: { provider: :google, model: 'gemini-2.0-flash-lite' } # optional
+)
 # => {
 #      winner_player: "gRPC API",
 #      reasoning: "gRPC API is the best one to be used into a high changing environment",
 #    }
 ```
+*Recommended model*: `gemini-2.0-flash-lite`
 - **Multi-phase ranking system** combining expert scoring and ELO algorithms
 - **Automatic elimination** of inconsistent performers using statistical analysis
 - **Dynamic ranking adjustments** based on simulated pairwise battles, from bottom to top
-See the [League README](lib/active_genie/league/README.md) for implementation details, configuration, and advanced ranking strategies.
-### Summarizer (WIP)
-The summarizer is a tool that can be used to summarize a given text. It uses a set of rules to summarize the text out of the box. Uses the best practices of prompt engineering and engineering to make the summarization as accurate as possible.
-```ruby
-require 'active_genie'
-text = "Example text to be summarized. The fox jumps over the dog"
-summarized_text = ActiveGenie::Summarizer.call(text)
-puts summarized_text # => "The fox jumps over the dog"
-```
-### Language detector (WIP)
-The language detector is a tool that can be used to detect the language of a given text. It uses a set of rules to detect the language of the text out of the box. Uses the best practices of prompt engineering and engineering to make the language detection as accurate as possible.
+See the [Ranking README](lib/active_genie/ranking/README.md) for implementation details, configuration, and advanced ranking strategies.
-```ruby
-require 'active_genie'
+### Text Summarizer (Future)
+### Categorizer (Future)
+### Language detector (Future)
+### Translator (Future)
+### Sentiment analyzer (Future)
-text = "Example text to be detected"
-language = ActiveGenie::LanguageDetector.call(text)
-puts language # => "en"
-```
+## Benchmarking 🧪
-### Translator (WIP)
-The translator is a tool that can be used to translate a given text. It uses a set of rules to translate the text out of the box. Uses the best practices of prompt engineering and engineering to make the translation as accurate as possible.
+ActiveGenie includes a comprehensive benchmarking system to ensure consistent, high-quality outputs across different LLM models and providers.
 ```ruby
-require 'active_genie'
+# Run all benchmarks
+bundle exec rake active_genie:benchmark
-text = "Example text to be translated"
-translated_text = ActiveGenie::Translator.call(text, from: 'en', to: 'pt')
-puts translated_text # => "Exemplo de texto a ser traduzido"
+# Run benchmarks for a specific module
+bundle exec rake active_genie:benchmark[data_extractor]
 ```
-### Sentiment analyzer (WIP)
-The sentiment analyzer is a tool that can be used to analyze the sentiment of a given text. It uses a set of rules to analyze the sentiment of the text out of the box. Uses the best practices of prompt engineering and engineering to make the sentiment analysis as accurate as possible.
+### Latest Results
-```ruby
-require 'active_genie'
+| Model | Overall Precision |
+|-------|-------------------|
+| claude-3-5-haiku-20241022 | 92.25% |
+| gemini-2.0-flash-lite | 84.25% |
+| gpt-4o-mini | 62.75% |
+| deepseek-chat | 57.25% |
-text = "Example text to be analyzed"
-sentiment = ActiveGenie::SentimentAnalyzer.call(text)
-puts sentiment # => "positive"
-```
+See the [Benchmark README](benchmark/README.md) for detailed results, methodology, and how to contribute to our test suite.
 ## Configuration
@@ -218,6 +223,7 @@ puts sentiment # => "positive"
 3. Commit your changes (`git commit -m 'Add amazing feature'`)
 4. Push to the branch (`git push origin feature/amazing-feature`)
 5. Open a Pull Request
 ## License
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+This project is licensed under the Apache License 2.0 License - see the [LICENSE](LICENSE) file for details.

data/VERSION CHANGED

@@ -1 +1 @@
-0.0.10
+0.0.18

data/lib/active_genie/battle/README.md CHANGED

@@ -12,11 +12,11 @@ AI-powered battle evaluation system that determines winners between two players
 Evaluate a battle between two players with simple text content:
 ```ruby
-player_a = "Implementation uses dependency injection for better testability"
-player_b = "Code has high test coverage but tightly coupled components"
+player_1 = "Implementation uses dependency injection for better testability"
+player_2 = "Code has high test coverage but tightly coupled components"
 criteria = "Evaluate code quality and maintainability"
-result = ActiveGenie::Battle::Basic.call(player_a, player_b, criteria)
+result = ActiveGenie::Battle::Basic.call(player_1, player_2, criteria)
 # => {
 #      winner_player: "Implementation uses dependency injection for better testability",
 #      reasoning: "Player A's implementation demonstrates better maintainability through dependency injection,
@@ -27,13 +27,13 @@ result = ActiveGenie::Battle::Basic.call(player_a, player_b, criteria)
 ```
 ## Interface
-### Basic.call(player_a, player_b, criteria, config: {})
-- `player_a` [String, Hash] - The content or submission from the first player
-- `player_b` [String, Hash] - The content or submission from the second player
+### Basic.call(player_1, player_2, criteria, config: {})
+- `player_1` [String, Hash] - The content or submission from the first player
+- `player_2` [String, Hash] - The content or submission from the second player
 - `criteria` [String] - The evaluation criteria or rules to assess against
 - `config` [Hash] - Additional configuration config that modify the battle evaluation behavior
 Returns a Hash containing:
-- `winner_player` [String, Hash] - The winning player's content (either player_a or player_b)
+- `winner_player` [String, Hash] - The winning player's content (either player_1 or player_2)
 - `reasoning` [String] - Detailed explanation of why the winner was chosen
 - `what_could_be_changed_to_avoid_draw` [String] - A suggestion on how to avoid a draw

data/lib/active_genie/battle/basic.rb CHANGED

@@ -14,117 +14,124 @@ module ActiveGenie::Battle
   #   Basic.call("Player A content", "Player B content", "Evaluate keyword usage and pattern matching")
   #
   class Basic
-    def self.call(player_a, player_b, criteria, config: {})
-      new(player_a, player_b, criteria, config:).call
+    def self.call(...)
+      new(...).call
     end
-    # @param player_a [String] The content or submission from the first player
-    # @param player_b [String] The content or submission from the second player
+    # @param player_1 [String] The content or submission from the first player
+    # @param player_2 [String] The content or submission from the second player
     # @param criteria [String] The evaluation criteria or rules to assess against
-    # @param config [Hash] Additional configuration config that modify the battle evaluation behavior
+    # @param config [Hash] Additional configuration options that modify the battle evaluation behavior
     # @return [Hash] The evaluation result containing the winner and reasoning
-    #   @return [String] :winner The @param player_a or player_b
+    #   @return [String] :winner The winner, either player_1 or player_2
     #   @return [String] :reasoning Detailed explanation of why the winner was chosen
     #   @return [String] :what_could_be_changed_to_avoid_draw A suggestion on how to avoid a draw
-    def initialize(player_a, player_b, criteria, config: {})
-      @player_a = player_a
-      @player_b = player_b
+    def initialize(player_1, player_2, criteria, config: {})
+      @player_1 = player_1
+      @player_2 = player_2
       @criteria = criteria
-      @config = config
-      @response = nil
+      @config = ActiveGenie::Configuration.to_h(config)
     end
     def call
       messages = [
         {  role: 'system', content: PROMPT },
         {  role: 'user', content: "criteria: #{@criteria}" },
-        {  role: 'user', content: "player_a: #{player_content(@player_a)}" },
-        {  role: 'user', content: "player_b: #{player_content(@player_b)}" },
+        {  role: 'user', content: "player_1: #{@player_1}" },
+        {  role: 'user', content: "player_2: #{@player_2}" },
       ]
-      @response = ::ActiveGenie::Clients::UnifiedClient.function_calling(messages, FUNCTION, config:)
-      response_formatted
+      response = ::ActiveGenie::Clients::UnifiedClient.function_calling(
+        messages,
+        FUNCTION,
+        model_tier: 'lower_tier',
+        config: @config
+      )
+      ActiveGenie::Logger.debug({
+        code: :battle,
+        player_1: @player_1[0..30],
+        player_2: @player_2[0..30],
+        criteria: @criteria[0..30],
+        winner: response['impartial_judge_winner'],
+        reasoning: response['impartial_judge_winner_reasoning']
+      })
+      response_formatted(response)
     end
     private
-    def player_content(player)
-      return player.dig('content') if player.is_a?(Hash)
+    def response_formatted(response)
+      winner = response['impartial_judge_winner']
+      loser = case response['impartial_judge_winner']
+              when 'player_1' then 'player_2'
+              when 'player_2' then 'player_1'
+              end
-      player
-    end
-    def response_formatted
-      winner = case @response['winner']
-               when 'player_a' then @player_a
-               when 'player_b' then @player_b
-               end
-      @response.merge!('winner' => winner, 'loser' => winner ? (winner == @player_a ? @player_b : @player_a) : nil)
+      { 'winner' => winner, 'loser' => loser, 'reasoning' => response['impartial_judge_winner_reasoning'] }
     end
     PROMPT = <<~PROMPT
-    Evaluate a battle between player_a and player_b using predefined criteria and identify the winner.
-    Consider rules, keywords, and patterns as the criteria for evaluation. Analyze the content from both players objectively, focusing on who meets the criteria most effectively. Explain your decision clearly, with specific reasoning on how the chosen player fulfilled the criteria better than the other. Avoid selecting a draw unless absolutely necessary.
+    Based on two players, player_1 and player_2, they will battle against each other based on criteria. Criteria are vital as they provide a clear metric to compare the players. Follow these criteria strictly.
     # Steps
-    1. **Review Predefined Criteria**: Understand the specific rules, keywords, and patterns that serve as the basis for evaluation.
-    2. **Analyze Content**: Examine the contributions of both player_a and player_b. Look for how each player meets or fails to meet the criteria.
-    3. **Comparison**: Compare both players against each criterion to determine who aligns better with the standards set.
-    4. **Decision-Making**: Based on the analysis, determine the player who meets the most or all criteria effectively.
-    5. **Provide Justification**: Offer a clear and concise reason for your choice detailing how the winner outperformed the other.
+    1. player_1 presents their strengths and how they meet the criteria. Max of 100 words.
+    2. player_2 presents their strengths and how they meet the criteria. Max of 100 words.
+    3. player_1 argues why they should be the winner compared to player_2. Max of 100 words.
+    4. player_2 counter-argues why they should be the winner compared to player_1. Max of 100 words.
+    5. The impartial judge chooses the winner.
-    # Examples
-    - **Example 1**:
-      - Input: Player A uses keyword X, follows rule Y, Player B uses keyword Z, breaks rule Y.
-      - Output: winner: player_a
-        - Justification: Player A successfully used keyword X and followed rule Y, whereas Player B broke rule Y.
-    - **Example 2**:
-      - Input: Player A matches pattern P, Player B matches pattern P, uses keyword Q.
-      - Output: winner: player_b
-        - Justification: Both matched pattern P, but Player B also used keyword Q, meeting more criteria.
+    # Output Format
+    - The impartial judge chooses this player as the winner.
     # Notes
-    - Avoid drawing if a clear winner can be discerned.
+    - Avoid resulting in a draw. Use reasoning or make fair assumptions if needed.
     - Critically assess each player's adherence to the criteria.
     - Clearly communicate the reasoning behind your decision.
     PROMPT
     FUNCTION =  {
       name: 'battle_evaluation',
-      description: 'Evaluate a battle between player_a and player_b using predefined criteria and identify the winner.',
+      description: 'Evaluate a battle between player_1 and player_2 using predefined criteria and identify the winner.',
       schema: {
         type: "object",
         properties: {
-          winner: {
+          player_1_sell_himself: {
             type: 'string',
-            description: 'The player who won the battle based on the criteria.',
-            enum: ['player_a', 'player_b', 'draw']
+            description: 'player_1 presents their strengths and how they meet the criteria. Max of 100 words.',
           },
-          reasoning_of_winner: {
+          player_2_sell_himself: {
             type: 'string',
-            description: 'The detailed reasoning about why the winner won based on the criteria.',
+            description: 'player_2 presents their strengths and how they meet the criteria. Max of 100 words.',
           },
-          what_could_be_changed_to_avoid_draw: {
+          player_1_arguments: {
             type: 'string',
-            description: 'Suggestions on how to avoid a draw based on the criteria. Be as objective and short as possible. Can be empty.',
-          }
-        }
-      }
-    }
-    def config
-      {
-        all_providers: { model_tier: 'lower_tier' },
-        log: {
-          **(@config.dig(:log) || {}),
-          trace: self.class.name,
+            description: 'player_1 arguments for why they should be the winner compared to player_2. Max of 100 words.',
+          },
+          player_2_counter: {
+            type: 'string',
+            description: 'player_2 counter arguments for why they should be the winner compared to player_1. Max of 100 words.',
+          },
+          impartial_judge_winner_reasoning: {
+            type: 'string',
+            description: 'The detailed reasoning about why the impartial judge chose the winner. Max of 100 words.',
+          },
+          impartial_judge_winner: {
+            type: 'string',
+            description: 'Who is the winner based on the impartial judge reasoning?',
+            enum: ['player_1', 'player_2']
+          },
         },
-        **@config
+        required: [
+          'player_1_sell_himself',
+          'player_2_sell_himself',
+          'player_1_arguments',
+          'player_2_counter',
+          'impartial_judge_winner_reasoning',
+          'impartial_judge_winner'
+        ]
       }
-    end
+    }
   end
 end
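
In the hunk above, the 0.0.10 code mapped the `winner` field back to the player's content, while the 0.0.18 code appears to return the label (`'player_1'` or `'player_2'`) chosen by the judge. The sketch below isolates that mapping as a plain function so the behavior is easy to see. `format_battle_response` mirrors the logic in the diff; `resolve_content` is a hypothetical extra step, not present in the gem, for callers who want the original content back.

```ruby
# Stand-alone version of the winner/loser mapping shown in the diff.
# `response` is a Hash with string keys, as returned by the function call.
def format_battle_response(response)
  winner = response['impartial_judge_winner']
  loser = case winner
          when 'player_1' then 'player_2'
          when 'player_2' then 'player_1'
          end

  {
    'winner' => winner,
    'loser' => loser,
    'reasoning' => response['impartial_judge_winner_reasoning']
  }
end

# Hypothetical helper (an assumption, not in the diff): turns a label
# back into the content that was submitted for that player.
def resolve_content(label, player_1, player_2)
  { 'player_1' => player_1, 'player_2' => player_2 }[label]
end
```

An unexpected or missing label leaves `loser` as `nil`, because the `case` expression has no `else` branch.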

data/lib/active_genie/battle.rb CHANGED

@@ -9,5 +9,9 @@ module ActiveGenie
     def basic(...)
       Basic.call(...)
     end
+    def call(...)
+      Basic.call(...)
+    end
   end
 end
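
Both `Battle.call` and `Basic.call` in this release rely on Ruby's argument forwarding syntax (`...`, available since Ruby 2.7), which passes positional, keyword, and block arguments through unchanged. The following is a minimal sketch of the same pattern; the `Greeter` class and `Greeting` module are hypothetical stand-ins, not part of the gem.

```ruby
# Hypothetical class that follows the same `self.call(...)` pattern.
class Greeter
  def self.call(...)
    new(...).call
  end

  def initialize(name, greeting, config: {})
    @name = name
    @greeting = greeting
    @config = config
  end

  def call
    "#{@greeting}, #{@name}#{@config[:suffix]}"
  end
end

# Hypothetical module exposing two entry points that forward to the class,
# similar to the `basic` and `call` pair in the diff.
module Greeting
  module_function

  def basic(...)
    Greeter.call(...)
  end

  def call(...)
    Greeter.call(...)
  end
end
```

Because every layer forwards with `...`, a new keyword argument on the initializer becomes available at each entry point without touching the wrappers.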

data/lib/active_genie/clients/anthropic_client.rb ADDED

@@ -0,0 +1,110 @@
+require 'json'
+require 'net/http'
+require 'uri'
+require_relative './helpers/retry'
+module ActiveGenie
+  module Clients
+    # Client for interacting with the Anthropic (Claude) API with json response
+    class AnthropicClient
+      class AnthropicError < StandardError; end
+      class RateLimitError < AnthropicError; end
+      def initialize(config)
+        @app_config = config
+      end
+      # Requests structured JSON output from the Anthropic Claude model based on a schema.
+      #
+      # @param messages [Array<Hash>] A list of messages representing the conversation history.
+      #   Each hash should have :role ('user', 'assistant', or 'system') and :content (String).
+      #   Claude uses 'user', 'assistant', and 'system' roles.
+      # @param function [Hash] A JSON schema definition describing the desired output format.
+      # @param model_tier [Symbol, nil] A symbolic representation of the model quality/size tier.
+      # @param config [Hash] Optional configuration overrides:
+      #   - :api_key [String] Override the default API key.
+      #   - :model [String] Override the model name directly.
+      #   - :max_retries [Integer] Max retries for the request.
+      #   - :retry_delay [Integer] Initial delay for retries.
+      #   - :anthropic_version [String] Override the default Anthropic API version.
+      # @return [Hash, nil] The parsed JSON object matching the schema, or nil if parsing fails or content is empty.
+      def function_calling(messages, function, model_tier: nil, config: {})
+        model = config[:runtime][:model] || @app_config.tier_to_model(model_tier)
+        system_message = messages.find { |m| m[:role] == 'system' }&.dig(:content) || ''
+        user_messages = messages.select { |m| m[:role] == 'user' || m[:role] == 'assistant' }
+          .map { |m| { role: m[:role], content: m[:content] } }
+        anthropic_function = function
+        anthropic_function[:input_schema] = function[:schema]
+        anthropic_function.delete(:schema)
+        payload = {
+          model:,
+          system: system_message,
+          messages: user_messages,
+          tools: [anthropic_function],
+          tool_choice: { name: anthropic_function[:name], type: 'tool' },
+          max_tokens: config[:runtime][:max_tokens],
+          temperature: config[:runtime][:temperature] || 0,
+        }
+        api_key = config[:runtime][:api_key] || @app_config.api_key
+        headers = DEFAULT_HEADERS.merge(
+          'x-api-key': api_key,
+          'anthropic-version': config[:anthropic_version] || ANTHROPIC_VERSION
+        ).compact
+        retry_with_backoff(config:) do
+          response = request(payload, headers, config:)
+          content = response.dig('content', 0, 'input')
+          ActiveGenie::Logger.trace({code: :function_calling, payload:, parsed_response: content})
+          content
+        end
+      end
+      private
+      DEFAULT_HEADERS = {
+        'Content-Type': 'application/json',
+      }
+      ANTHROPIC_VERSION = '2023-06-01'
+      def request(payload, headers, config:)
+        start_time = Time.now
+        retry_with_backoff(config:) do
+          response = Net::HTTP.post(
+            URI("#{@app_config.api_url}/v1/messages"),
+            payload.to_json,
+            headers
+          )
+          if response.is_a?(Net::HTTPTooManyRequests)
+            raise RateLimitError, "Anthropic API rate limit exceeded: #{response.body}"
+          end
+          raise AnthropicError, response.body unless response.is_a?(Net::HTTPSuccess)
+          return nil if response.body.empty?
+          parsed_body = JSON.parse(response.body)
+          ActiveGenie::Logger.trace({
+            code: :llm_usage,
+            input_tokens: parsed_body.dig('usage', 'input_tokens'),
+            output_tokens: parsed_body.dig('usage', 'output_tokens'),
+            total_tokens: parsed_body.dig('usage', 'input_tokens') + parsed_body.dig('usage', 'output_tokens'),
+            model: payload[:model],
+            duration: Time.now - start_time,
+            usage: parsed_body.dig('usage')
+          })
+          parsed_body
+        end
+      end
+    end
+  end
+end
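
One detail in the client above is worth illustrating. `anthropic_function = function` binds a second name to the caller's hash, so the later `delete(:schema)` also removes the key from the original object. If the same hash is passed again (for example a shared constant), `function[:schema]` would then be `nil`. The sketch below shows a non-mutating way to build the tool definition, together with the system/chat message split used in the diff. Both helper names are hypothetical and not part of the gem.

```ruby
# Hypothetical helper: builds an Anthropic-style tool definition from a
# { name:, description:, schema: } hash without modifying the input.
def to_anthropic_tool(function)
  tool = function.reject { |key, _value| key == :schema }
  tool.merge(input_schema: function[:schema])
end

# Hypothetical helper: separates the system prompt from the chat turns,
# following the same role filtering as the client in the diff.
def split_messages(messages)
  system = messages.find { |m| m[:role] == 'system' }&.dig(:content) || ''
  chat = messages
         .select { |m| %w[user assistant].include?(m[:role]) }
         .map { |m| { role: m[:role], content: m[:content] } }
  [system, chat]
end
```

Since `to_anthropic_tool` only reads from its argument, it also works when the function definition is frozen.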