llm_ruby 0.1.0 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: dfe908817dd406ae16aca4130133b9a421b18333cdecc4ad870635dd997be500
- data.tar.gz: 105ae0dcc30918686abcf8d01d99605d2f70f41eebdd2737744bc2bf27c6575c
+ metadata.gz: edf2b09bc3a9416193088298e41577369bf5198230c7278d6a832854f04c7e20
+ data.tar.gz: 735c0e1735d90e5c41a93d7e123bfe359c0a3525bf292ba46d0a6b8c23580c05
  SHA512:
- metadata.gz: 5b9643df8771735111f18f52182b6f217231ca92c890e3456f21c3937850bf2c9ac668730cd1aeae8b43330d5a3c84eab6b419c60aa5d54d70f405793e4463ad
- data.tar.gz: 820687838675aeaadde8e5b7c5b7f7f45bfdf10beb74b90e2b594cd80d5e7accc38ace2deec56a6a2fcdc3de6a369af41405c5523da3e4c2b47f6e584c28f3fd
+ metadata.gz: d7509110f0f53028e6d6c6cc899ab31985748e2f5775be64286fc1a66659723953fee49f8ea70c1f797e9c7786bf401634d35f201d4b44d0f4abd11a22d691ed
+ data.tar.gz: 447086c9fed992e31db3e0bec1f9a92d0b52995ad899f11e9eeba0674eaf0a734918b4ee144a3b869ba6e61b7f2c73dcba2eaec5b602700d8c126a4de4c0ad33
data/README.md CHANGED
@@ -1,6 +1,6 @@
  # LLMRuby
 
- LLMRuby is a Ruby gem that provides a consistent interface for interacting with various Large Language Model (LLM) APIs, with a current focus on OpenAI's models.
+ LLMRuby is a Ruby gem that provides a consistent interface for interacting with multiple Large Language Model (LLM) APIs. Most OpenAI, Anthropic and Gemini models are currently supported.
 
  ## Installation
 
@@ -12,14 +12,14 @@ gem 'llm_ruby'
 
  And then execute:
 
- ```
- $ bundle install
+ ```shell
+ bundle install
  ```
 
  Or install it yourself as:
 
- ```
- $ gem install llm_ruby
+ ```shell
+ gem install llm_ruby
  ```
 
  ## Usage
@@ -27,7 +27,7 @@ $ gem install llm_ruby
 
  ### Basic Usage
 
  ```ruby
- require 'llm'
+ require 'llm_ruby'
 
  # Initialize an LLM instance
  llm = LLM.from_string!("gpt-4")
@@ -46,10 +46,10 @@ puts response.content
 
  LLMRuby supports streaming responses:
 
  ```ruby
- require 'llm'
+ require 'llm_ruby'
 
  # Initialize an LLM instance
- llm = LLM.from_string!("gpt-4")
+ llm = LLM.from_string!("gpt-4o")
 
  # Create a client
  client = llm.client
@@ -87,7 +87,7 @@ Here is an example of how to use the response object:
 
  ```ruby
  # Initialize an LLM instance
- llm = LLM.from_string!("gpt-4")
+ llm = LLM.from_string!("gpt-4o")
 
  # Create a client
  client = llm.client
@@ -101,37 +101,69 @@ puts "Raw response: #{response.raw_response}"
  puts "Stop reason: #{response.stop_reason}"
  ```
 
-
  ## Available Models
 
  LLMRuby supports various OpenAI models, including GPT-3.5 and GPT-4 variants. You can see the full list of supported models in the `KNOWN_MODELS` constant:
 
- | Canonical Name             | Display Name           | Provider |
- |----------------------------|------------------------|----------|
- | gpt-3.5-turbo              | GPT-3.5 Turbo          | openai   |
- | gpt-3.5-turbo-0125         | GPT-3.5 Turbo 0125     | openai   |
- | gpt-3.5-turbo-16k          | GPT-3.5 Turbo 16K      | openai   |
- | gpt-3.5-turbo-1106         | GPT-3.5 Turbo 1106     | openai   |
- | gpt-4                      | GPT-4                  | openai   |
- | gpt-4-32k                  | GPT-4 32K              | openai   |
- | gpt-4-1106-preview         | GPT-4 Turbo 1106       | openai   |
- | gpt-4-turbo-2024-04-09     | GPT-4 Turbo 2024-04-09 | openai   |
- | gpt-4-0125-preview         | GPT-4 Turbo 0125       | openai   |
- | gpt-4-turbo-preview        | GPT-4 Turbo            | openai   |
- | gpt-4-0613                 | GPT-4 0613             | openai   |
- | gpt-4-32k-0613             | GPT-4 32K 0613         | openai   |
- | gpt-4o                     | GPT-4o                 | openai   |
- | gpt-4o-mini                | GPT-4o Mini            | openai   |
- | gpt-4o-2024-05-13          | GPT-4o 2024-05-13      | openai   |
- | gpt-4o-2024-08-06          | GPT-4o 2024-08-06      | openai   |
-
+ ### OpenAI Models
+
+ | Canonical Name         | Display Name           |
+ |------------------------|------------------------|
+ | gpt-3.5-turbo          | GPT-3.5 Turbo          |
+ | gpt-3.5-turbo-0125     | GPT-3.5 Turbo 0125     |
+ | gpt-3.5-turbo-16k      | GPT-3.5 Turbo 16K      |
+ | gpt-3.5-turbo-1106     | GPT-3.5 Turbo 1106     |
+ | gpt-4                  | GPT-4                  |
+ | gpt-4-1106-preview     | GPT-4 Turbo 1106       |
+ | gpt-4-turbo-2024-04-09 | GPT-4 Turbo 2024-04-09 |
+ | gpt-4-0125-preview     | GPT-4 Turbo 0125       |
+ | gpt-4-turbo-preview    | GPT-4 Turbo            |
+ | gpt-4-0613             | GPT-4 0613             |
+ | gpt-4o                 | GPT-4o                 |
+ | gpt-4o-mini            | GPT-4o Mini            |
+ | gpt-4o-mini-2024-07-18 | GPT-4o Mini 2024-07-18 |
+ | gpt-4o-2024-05-13      | GPT-4o 2024-05-13      |
+ | gpt-4o-2024-08-06      | GPT-4o 2024-08-06      |
+ | gpt-4o-2024-11-20      | GPT-4o 2024-11-20      |
+ | chatgpt-4o-latest      | ChatGPT 4o Latest      |
+ | o1                     | o1                     |
+ | o1-2024-12-17          | o1 2024-12-17          |
+ | o1-preview             | o1 Preview             |
+ | o1-preview-2024-09-12  | o1 Preview 2024-09-12  |
+ | o1-mini                | o1 Mini                |
+ | o1-mini-2024-09-12     | o1 Mini 2024-09-12     |
+ | o3-mini                | o3 Mini                |
+ | o3-mini-2025-01-31     | o3 Mini 2025-01-31     |
+
+ ### Anthropic Models
+
+ | Canonical Name             | Display Name                 |
+ |----------------------------|------------------------------|
+ | claude-3-5-sonnet-20241022 | Claude 3.5 Sonnet 2024-10-22 |
+ | claude-3-5-haiku-20241022  | Claude 3.5 Haiku 2024-10-22  |
+ | claude-3-5-sonnet-20240620 | Claude 3.5 Sonnet 2024-06-20 |
+ | claude-3-opus-20240229     | Claude 3 Opus 2024-02-29     |
+ | claude-3-sonnet-20240229   | Claude 3 Sonnet 2024-02-29   |
+ | claude-3-haiku-20240307    | Claude 3 Haiku 2024-03-07    |
+
+ ### Google Models
+
+ | Canonical Name                      | Display Name                        |
+ |-------------------------------------|-------------------------------------|
+ | gemini-2.0-flash                    | Gemini 2.0 Flash                    |
+ | gemini-2.0-flash-lite-preview-02-05 | Gemini 2.0 Flash Lite Preview 02-05 |
+ | gemini-1.5-flash                    | Gemini 1.5 Flash                    |
+ | gemini-1.5-pro                      | Gemini 1.5 Pro                      |
+ | gemini-1.5-flash-8b                 | Gemini 1.5 Flash 8B                 |
 
  ## Configuration
 
- Set your OpenAI API key as an environment variable:
+ Set your OpenAI, Anthropic or Google API key as an environment variable:
 
- ```
+ ```shell
  export OPENAI_API_KEY=your_api_key_here
+ export ANTHROPIC_API_KEY=your_api_key_here
+ export GEMINI_API_KEY=your_api_key_here
  ```
 
  ## Development
@@ -142,12 +174,8 @@ To install this gem onto your local machine, run `bundle exec rake install`.
 
  ## Contributing
 
- Bug reports and pull requests are welcome on GitHub at https://github.com/contextco/llm_ruby.
+ Bug reports and pull requests are welcome.
 
  ## License
 
  The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
-
- ## Acknowledgements
-
- This gem is developed and maintained by [Context](https://context.ai).
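The 0.2.0 README above keeps the same `LLM.from_string!` → `llm.client` → `client.chat` flow for every provider. A minimal sketch of that flow with one of the newly listed Anthropic models; the message format (hashes with `:role` and `:content`) and the exported `ANTHROPIC_API_KEY` are assumptions carried over from the Usage and Configuration sections above:

```ruby
require "llm_ruby"

# Any canonical name from the model tables above can be used here.
llm = LLM.from_string!("claude-3-5-sonnet-20241022")
client = llm.client

# System messages are folded into Anthropic's "system" field by the new client later in this diff.
response = client.chat([
  {role: :system, content: "You are a terse assistant."},
  {role: :user, content: "Say hello in one word."}
])

puts response.content
puts response.stop_reason
```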
data/lib/llm/clients/anthropic/response.rb ADDED
@@ -0,0 +1,48 @@
+ # frozen_string_literal: true
+
+ class LLM
+   module Clients
+     class Anthropic
+       class Response
+         def initialize(raw_response)
+           @raw_response = raw_response
+         end
+
+         def to_normalized_response
+           LLM::Response.new(
+             content: content,
+             raw_response: parsed_response,
+             stop_reason: normalize_stop_reason
+           )
+         end
+
+         def self.normalize_stop_reason(stop_reason)
+           case stop_reason
+           when "end_turn"
+             LLM::StopReason::STOP
+           when "stop_sequence"
+             LLM::StopReason::STOP_SEQUENCE
+           when "max_tokens"
+             LLM::StopReason::MAX_TOKENS_REACHED
+           else
+             LLM::StopReason::OTHER
+           end
+         end
+
+         private
+
+         def content
+           parsed_response.dig("content", 0, "text")
+         end
+
+         def normalize_stop_reason
+           self.class.normalize_stop_reason(parsed_response["stop_reason"])
+         end
+
+         def parsed_response
+           @raw_response.parsed_response
+         end
+       end
+     end
+   end
+ end
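`normalize_stop_reason` is exposed as a class method so the streaming path in the Anthropic client can reuse it. A quick sketch of the mapping, with the symbol values taken from `LLM::StopReason` further down in this diff:

```ruby
LLM::Clients::Anthropic::Response.normalize_stop_reason("end_turn")      # => :stop
LLM::Clients::Anthropic::Response.normalize_stop_reason("stop_sequence") # => :stop_sequence
LLM::Clients::Anthropic::Response.normalize_stop_reason("max_tokens")    # => :max_tokens
LLM::Clients::Anthropic::Response.normalize_stop_reason("tool_use")      # => :other (any unrecognised reason)
```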
data/lib/llm/clients/anthropic.rb ADDED
@@ -0,0 +1,113 @@
+ # frozen_string_literal: true
2
+
3
+ require "httparty"
4
+
5
+ class LLM
6
+ module Clients
7
+ class Anthropic
8
+ include HTTParty
9
+ base_uri "https://api.anthropic.com"
10
+
11
+ def initialize(llm:)
12
+ @llm = llm
13
+ end
14
+
15
+ def chat(messages, options = {})
16
+ request = payload(messages, options)
17
+
18
+ return chat_streaming(request, options[:on_message], options[:on_complete]) if options[:stream]
19
+
20
+ resp = post_url("/v1/messages", body: request.to_json)
21
+
22
+ Response.new(resp).to_normalized_response
23
+ end
24
+
25
+ private
26
+
27
+ def chat_streaming(request, on_message, on_complete)
28
+ buffer = +""
29
+ chunks = []
30
+ output_data = {}
31
+
32
+ wrapped_on_complete = lambda { |stop_reason|
33
+ output_data[:stop_reason] = stop_reason
34
+ on_complete&.call(stop_reason)
35
+ }
36
+
37
+ request[:stream] = true
38
+
39
+ proc = handle_event_stream(buffer, chunks, on_message_proc: on_message, on_complete_proc: wrapped_on_complete)
40
+
41
+ _resp = post_url_streaming("/v1/messages", body: request.to_json, &proc)
42
+
43
+ LLM::Response.new(
44
+ content: buffer,
45
+ raw_response: chunks,
46
+ stop_reason: Response.normalize_stop_reason(output_data[:stop_reason])
47
+ )
48
+ end
49
+
50
+ def handle_event_stream(buffer, chunks, on_message_proc:, on_complete_proc:)
51
+ each_json_chunk do |type, chunk|
52
+ chunks << chunk
53
+ case type
54
+ when "content_block_delta"
55
+ new_content = chunk.dig("delta", "text")
56
+ buffer << new_content
57
+ on_message_proc&.call(new_content)
58
+ when "message_delta"
59
+ finish_reason = chunk.dig("delta", "stop_reason")
60
+ on_complete_proc&.call(finish_reason)
61
+ else
62
+ next
63
+ end
64
+ end
65
+ end
66
+
67
+ def each_json_chunk
68
+ parser = EventStreamParser::Parser.new
69
+
70
+ proc do |chunk|
71
+ # TODO: Add error handling.
72
+
73
+ parser.feed(chunk) do |type, data|
74
+ yield(type, JSON.parse(data))
75
+ end
76
+ end
77
+ end
78
+
79
+ def payload(messages, options = {})
80
+ {
81
+ system: combined_system_messages(messages),
82
+ messages: messages.filter { |m| m[:role].to_sym != :system },
83
+ model: @llm.canonical_name,
84
+ max_tokens: options[:max_output_tokens] || @llm.default_params[:max_output_tokens],
85
+ temperature: options[:temperature],
86
+ top_p: options[:top_p],
87
+ top_k: options[:top_k],
88
+ stream: options[:stream]
89
+ }.compact
90
+ end
91
+
92
+ def combined_system_messages(messages)
93
+ messages.filter { |m| m[:role].to_sym == :system }.map { |m| m[:content] }.join("\n\n")
94
+ end
95
+
96
+ def post_url(url, body:)
97
+ self.class.post(url, body: body, headers: default_headers)
98
+ end
99
+
100
+ def post_url_streaming(url, **kwargs, &block)
101
+ self.class.post(url, **kwargs.merge(headers: default_headers, stream_body: true), &block)
102
+ end
103
+
104
+ def default_headers
105
+ {
106
+ "anthropic-version" => "2023-06-01",
107
+ "x-api-key" => ENV["ANTHROPIC_API_KEY"],
108
+ "Content-Type" => "application/json"
109
+ }
110
+ end
111
+ end
112
+ end
113
+ end
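For the streaming branch above, `chat` accepts `stream: true` plus optional `on_message`/`on_complete` callbacks and still returns a buffered `LLM::Response` at the end. A hedged sketch of how this would be called (option names as consumed by `chat_streaming`; assumes `ANTHROPIC_API_KEY` is exported):

```ruby
llm = LLM.from_string!("claude-3-5-haiku-20241022")
client = llm.client

response = client.chat(
  [{role: :user, content: "Stream a two-sentence story."}],
  stream: true,
  on_message: ->(delta) { print delta },                   # each content_block_delta text fragment
  on_complete: ->(stop_reason) { puts "\n#{stop_reason}" } # raw Anthropic stop_reason
)

response.content # the fully buffered text is also available afterwards
```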
data/lib/llm/clients/gemini/request.rb ADDED
@@ -0,0 +1,66 @@
+ # frozen_string_literal: true
2
+
3
+ class LLM
4
+ module Clients
5
+ class Gemini
6
+ class Request
7
+ def initialize(messages, options)
8
+ @messages = messages
9
+ @options = options
10
+ end
11
+
12
+ def model_for_url
13
+ "models/#{model}"
14
+ end
15
+
16
+ def params
17
+ {
18
+ systemInstruction: normalized_prompt,
19
+ contents: normalized_messages
20
+ }
21
+ end
22
+
23
+ private
24
+
25
+ attr_reader :messages, :options
26
+
27
+ def model
28
+ options[:model]
29
+ end
30
+
31
+ def normalized_messages
32
+ user_visible_messages
33
+ .map(&method(:message_to_gemini_message))
34
+ end
35
+
36
+ def message_to_gemini_message(message)
37
+ {
38
+ role: ROLES_MAP[message[:role]],
39
+ parts: [{text: message[:content]}]
40
+ }
41
+ end
42
+
43
+ def normalized_prompt
44
+ return nil if system_messages.empty?
45
+
46
+ system_messages
47
+ .map { |message| message[:content] }
48
+ .join("\n\n")
49
+ end
50
+
51
+ def system_messages
52
+ messages.filter { |message| message[:role] == :system }
53
+ end
54
+
55
+ def user_visible_messages
56
+ messages.filter { |message| message[:role] != :system }
57
+ end
58
+
59
+ ROLES_MAP = {
60
+ assistant: :model,
61
+ user: :user
62
+ }.freeze
63
+ end
64
+ end
65
+ end
66
+ end
data/lib/llm/clients/gemini/response.rb ADDED
@@ -0,0 +1,54 @@
+ # frozen_string_literal: true
2
+
3
+ class LLM
4
+ module Clients
5
+ class Gemini
6
+ class Response
7
+ def initialize(raw_response)
8
+ @raw_response = raw_response
9
+ end
10
+
11
+ def to_normalized_response
12
+ LLM::Response.new(
13
+ content: content,
14
+ raw_response: parsed_response,
15
+ stop_reason: translated_stop_reason
16
+ )
17
+ end
18
+
19
+ def self.normalize_stop_reason(stop_reason)
20
+ case stop_reason
21
+ when "STOP"
22
+ LLM::StopReason::STOP
23
+ when "MAX_TOKENS"
24
+ LLM::StopReason::MAX_TOKENS_REACHED
25
+ when "SAFETY"
26
+ LLM::StopReason::SAFETY
27
+ else
28
+ LLM::StopReason::OTHER
29
+ end
30
+ end
31
+
32
+ private
33
+
34
+ attr_reader :raw_response
35
+
36
+ def content
37
+ parsed_response.dig("candidates", 0, "content", "parts", 0, "text")
38
+ end
39
+
40
+ def stop_reason
41
+ parsed_response.dig("candidates", 0, "finishReason")
42
+ end
43
+
44
+ def translated_stop_reason
45
+ self.class.normalize_stop_reason(stop_reason)
46
+ end
47
+
48
+ def parsed_response
49
+ raw_response.parsed_response
50
+ end
51
+ end
52
+ end
53
+ end
54
+ end
data/lib/llm/clients/gemini.rb ADDED
@@ -0,0 +1,102 @@
+ # frozen_string_literal: true
2
+
3
+ require "httparty"
4
+ require "event_stream_parser"
5
+
6
+ class LLM
7
+ module Clients
8
+ class Gemini
9
+ include HTTParty
10
+ base_uri "https://generativelanguage.googleapis.com"
11
+
12
+ def initialize(llm:)
13
+ @llm = llm
14
+ end
15
+
16
+ def chat(messages, options = {})
17
+ req = Request.new(messages, options)
18
+
19
+ return chat_streaming(req, options[:on_message], options[:on_complete]) if options[:stream]
20
+
21
+ resp = post_url(
22
+ "/v1beta/models/#{llm.canonical_name}:generateContent",
23
+ body: req.params.to_json
24
+ )
25
+
26
+ Response.new(resp).to_normalized_response
27
+ end
28
+
29
+ private
30
+
31
+ attr_reader :llm
32
+
33
+ def chat_streaming(request, on_message, on_complete)
34
+ buffer = +""
35
+ chunks = []
36
+ output_data = {}
37
+
38
+ wrapped_on_complete = lambda { |stop_reason|
39
+ output_data[:stop_reason] = stop_reason
40
+ on_complete&.call(stop_reason)
41
+ }
42
+
43
+ proc = handle_event_stream(buffer, chunks, on_message_proc: on_message, on_complete_proc: wrapped_on_complete)
44
+
45
+ _resp = post_url_streaming(
46
+ "/v1beta/models/#{llm.canonical_name}:streamGenerateContent?alt=sse",
47
+ body: request.params.to_json,
48
+ &proc
49
+ )
50
+
51
+ LLM::Response.new(
52
+ content: buffer,
53
+ raw_response: chunks,
54
+ stop_reason: Response.normalize_stop_reason(output_data[:stop_reason])
55
+ )
56
+ end
57
+
58
+ def handle_event_stream(buffer, chunks, on_message_proc:, on_complete_proc:)
59
+ each_json_chunk do |_type, chunk|
60
+ chunks << chunk
61
+
62
+ new_content = chunk.dig("candidates", 0, "content", "parts", 0, "text")
63
+
64
+ unless new_content.nil?
65
+ on_message_proc&.call(new_content)
66
+ buffer << new_content
67
+ end
68
+
69
+ stop_reason = chunk.dig("candidates", 0, "finishReason")
70
+ on_complete_proc&.call(stop_reason) unless stop_reason.nil?
71
+ end
72
+ end
73
+
74
+ def each_json_chunk
75
+ parser = EventStreamParser::Parser.new
76
+
77
+ proc do |chunk|
78
+ # TODO: Add error handling.
79
+
80
+ parser.feed(chunk) do |type, data|
81
+ yield(type, JSON.parse(data))
82
+ end
83
+ end
84
+ end
85
+
86
+ def post_url(url, **kwargs)
87
+ self.class.post(url, **kwargs.merge(headers: default_headers))
88
+ end
89
+
90
+ def post_url_streaming(url, **kwargs, &block)
91
+ self.class.post(url, **kwargs.merge(headers: default_headers, stream_body: true), &block)
92
+ end
93
+
94
+ def default_headers
95
+ {
96
+ "x-goog-api-key" => ENV["GEMINI_API_KEY"],
97
+ "Content-Type" => "application/json"
98
+ }
99
+ end
100
+ end
101
+ end
102
+ end
data/lib/llm/clients/open_ai/response.rb CHANGED
@@ -1,42 +1,48 @@
  # frozen_string_literal: true
2
2
 
3
- class LLM::Clients::OpenAI::Response
4
- def initialize(raw_response)
5
- @raw_response = raw_response
6
- end
3
+ class LLM
4
+ module Clients
5
+ class OpenAI
6
+ class Response
7
+ def initialize(raw_response)
8
+ @raw_response = raw_response
9
+ end
7
10
 
8
- def to_normalized_response
9
- LLM::Response.new(
10
- content: content,
11
- raw_response: parsed_response,
12
- stop_reason: normalize_stop_reason
13
- )
14
- end
11
+ def to_normalized_response
12
+ LLM::Response.new(
13
+ content: content,
14
+ raw_response: parsed_response,
15
+ stop_reason: normalize_stop_reason
16
+ )
17
+ end
15
18
 
16
- def self.normalize_stop_reason(stop_reason)
17
- case stop_reason
18
- when "stop"
19
- LLM::StopReason::STOP
20
- when "safety"
21
- LLM::StopReason::SAFETY
22
- when "max_tokens"
23
- LLM::StopReason::MAX_TOKENS_REACHED
24
- else
25
- LLM::StopReason::OTHER
26
- end
27
- end
19
+ def self.normalize_stop_reason(stop_reason)
20
+ case stop_reason
21
+ when "stop"
22
+ LLM::StopReason::STOP
23
+ when "safety"
24
+ LLM::StopReason::SAFETY
25
+ when "max_tokens"
26
+ LLM::StopReason::MAX_TOKENS_REACHED
27
+ else
28
+ LLM::StopReason::OTHER
29
+ end
30
+ end
28
31
 
29
- private
32
+ private
30
33
 
31
- def content
32
- parsed_response.dig("choices", 0, "message", "content")
33
- end
34
+ def content
35
+ parsed_response.dig("choices", 0, "message", "content")
36
+ end
34
37
 
35
- def normalize_stop_reason
36
- self.class.normalize_stop_reason(parsed_response.dig("choices", 0, "finish_reason"))
37
- end
38
+ def normalize_stop_reason
39
+ self.class.normalize_stop_reason(parsed_response.dig("choices", 0, "finish_reason"))
40
+ end
38
41
 
39
- def parsed_response
40
- @raw_response.parsed_response
42
+ def parsed_response
43
+ @raw_response.parsed_response
44
+ end
45
+ end
46
+ end
41
47
  end
42
48
  end
data/lib/llm/clients/open_ai.rb CHANGED
@@ -3,107 +3,111 @@
  require "httparty"
4
4
  require "event_stream_parser"
5
5
 
6
- class LLM::Clients::OpenAI
7
- include HTTParty
8
- base_uri "https://api.openai.com/v1"
6
+ class LLM
7
+ module Clients
8
+ class OpenAI
9
+ include HTTParty
10
+ base_uri "https://api.openai.com/v1"
11
+
12
+ def initialize(llm:)
13
+ @llm = llm
14
+ end
9
15
 
10
- def initialize(llm:)
11
- @llm = llm
12
- end
16
+ def chat(messages, options = {})
17
+ parameters = {
18
+ model: @llm.canonical_name,
19
+ messages: messages,
20
+ temperature: options[:temperature],
21
+ response_format: options[:response_format],
22
+ max_tokens: options[:max_output_tokens],
23
+ top_p: options[:top_p],
24
+ stop: options[:stop_sequences],
25
+ presence_penalty: options[:presence_penalty],
26
+ frequency_penalty: options[:frequency_penalty],
27
+ tools: options[:tools],
28
+ tool_choice: options[:tool_choice]
29
+ }.compact
30
+
31
+ return chat_streaming(parameters, options[:on_message], options[:on_complete]) if options[:stream]
32
+
33
+ resp = post_url("/chat/completions", body: parameters.to_json)
34
+
35
+ Response.new(resp).to_normalized_response
36
+ end
13
37
 
14
- def chat(messages, options = {})
15
- parameters = {
16
- model: @llm.canonical_name,
17
- messages: messages,
18
- temperature: options[:temperature],
19
- response_format: options[:response_format],
20
- max_tokens: options[:max_output_tokens],
21
- top_p: options[:top_p],
22
- stop: options[:stop_sequences],
23
- presence_penalty: options[:presence_penalty],
24
- frequency_penalty: options[:frequency_penalty],
25
- tools: options[:tools],
26
- tool_choice: options[:tool_choice]
27
- }.compact
28
-
29
- return chat_streaming(parameters, options[:on_message], options[:on_complete]) if options[:stream]
30
-
31
- resp = post_url("/chat/completions", body: parameters.to_json)
32
-
33
- Response.new(resp).to_normalized_response
34
- end
38
+ private
35
39
 
36
- private
40
+ def chat_streaming(parameters, on_message, on_complete)
41
+ buffer = +""
42
+ chunks = []
43
+ output_data = {}
37
44
 
38
- def chat_streaming(parameters, on_message, on_complete)
39
- buffer = +""
40
- chunks = []
41
- output_data = {}
45
+ wrapped_on_complete = lambda { |stop_reason|
46
+ output_data[:stop_reason] = stop_reason
47
+ on_complete&.call(stop_reason)
48
+ }
42
49
 
43
- wrapped_on_complete = lambda { |stop_reason|
44
- output_data[:stop_reason] = stop_reason
45
- on_complete&.call(stop_reason)
46
- }
50
+ parameters[:stream] = true
47
51
 
48
- parameters[:stream] = true
52
+ proc = stream_proc(buffer, chunks, on_message, wrapped_on_complete)
49
53
 
50
- proc = stream_proc(buffer, chunks, on_message, wrapped_on_complete)
54
+ parameters.delete(:on_message)
55
+ parameters.delete(:on_complete)
51
56
 
52
- parameters.delete(:on_message)
53
- parameters.delete(:on_complete)
57
+ _resp = post_url_streaming("/chat/completions", body: parameters.to_json, &proc)
54
58
 
55
- _resp = post_url_streaming("/chat/completions", body: parameters.to_json, &proc)
59
+ LLM::Response.new(
60
+ content: buffer,
61
+ raw_response: chunks,
62
+ stop_reason: output_data[:stop_reason]
63
+ )
64
+ end
56
65
 
57
- LLM::Response.new(
58
- content: buffer,
59
- raw_response: chunks,
60
- stop_reason: output_data[:stop_reason]
61
- )
62
- end
66
+ def stream_proc(buffer, chunks, on_message, complete_proc)
67
+ each_json_chunk do |_type, event|
68
+ next if event == "[DONE]"
63
69
 
64
- def stream_proc(buffer, chunks, on_message, complete_proc)
65
- each_json_chunk do |_type, event|
66
- next if event == "[DONE]"
70
+ chunks << event
71
+ new_content = event.dig("choices", 0, "delta", "content")
72
+ stop_reason = event.dig("choices", 0, "finish_reason")
67
73
 
68
- chunks << event
69
- new_content = event.dig("choices", 0, "delta", "content")
70
- stop_reason = event.dig("choices", 0, "finish_reason")
74
+ buffer << new_content unless new_content.nil?
75
+ on_message&.call(new_content) unless new_content.nil?
76
+ complete_proc&.call(Response.normalize_stop_reason(stop_reason)) unless stop_reason.nil?
77
+ end
78
+ end
71
79
 
72
- buffer << new_content unless new_content.nil?
73
- on_message&.call(new_content) unless new_content.nil?
74
- complete_proc&.call(Response.normalize_stop_reason(stop_reason)) unless stop_reason.nil?
75
- end
76
- end
80
+ def each_json_chunk
81
+ parser = EventStreamParser::Parser.new
77
82
 
78
- def each_json_chunk
79
- parser = EventStreamParser::Parser.new
83
+ proc do |chunk, _bytes, env|
84
+ if env && env.status != 200
85
+ raise_error = Faraday::Response::RaiseError.new
86
+ raise_error.on_complete(env.merge(body: try_parse_json(chunk)))
87
+ end
80
88
 
81
- proc do |chunk, _bytes, env|
82
- if env && env.status != 200
83
- raise_error = Faraday::Response::RaiseError.new
84
- raise_error.on_complete(env.merge(body: try_parse_json(chunk)))
85
- end
89
+ parser.feed(chunk) do |type, data|
90
+ next if data == "[DONE]"
86
91
 
87
- parser.feed(chunk) do |type, data|
88
- next if data == "[DONE]"
89
-
90
- yield(type, JSON.parse(data))
92
+ yield(type, JSON.parse(data))
93
+ end
94
+ end
91
95
  end
92
- end
93
- end
94
96
 
95
- def post_url(url, **kwargs)
96
- self.class.post(url, **kwargs.merge(headers: default_headers))
97
- end
97
+ def post_url(url, **kwargs)
98
+ self.class.post(url, **kwargs.merge(headers: default_headers))
99
+ end
98
100
 
99
- def post_url_streaming(url, **kwargs, &block)
100
- self.class.post(url, **kwargs.merge(headers: default_headers, stream_body: true), &block)
101
- end
101
+ def post_url_streaming(url, **kwargs, &block)
102
+ self.class.post(url, **kwargs.merge(headers: default_headers, stream_body: true), &block)
103
+ end
102
104
 
103
- def default_headers
104
- {
105
- "Authorization" => "Bearer #{ENV["OPENAI_API_KEY"]}",
106
- "Content-Type" => "application/json"
107
- }
105
+ def default_headers
106
+ {
107
+ "Authorization" => "Bearer #{ENV["OPENAI_API_KEY"]}",
108
+ "Content-Type" => "application/json"
109
+ }
110
+ end
111
+ end
108
112
  end
109
113
  end
data/lib/llm/info.rb CHANGED
@@ -1,94 +1,255 @@
1
1
  # frozen_string_literal: true
2
2
 
3
- module LLM::Info
4
- KNOWN_MODELS = [
5
- # Semantics of fields:
6
- # - canonical_name (required): A string that uniquely identifies the model.
7
- # We use this string as the public identifier when users choose this model via the API.
8
- # - display_name (required): A string that is displayed to the user when choosing this model via the UI.
3
+ class LLM
4
+ module Info
5
+ KNOWN_MODELS = [
6
+ # Semantics of fields:
7
+ # - canonical_name (required): A string that uniquely identifies the model.
8
+ # We use this string as the public identifier when users choose this model via the API.
9
+ # - display_name (required): A string that is displayed to the user when choosing this model via the UI.
10
+ # - client_class (required): The client class to be used for this model.
9
11
 
10
- # GPT-3.5 Turbo Models
11
- {
12
- canonical_name: "gpt-3.5-turbo",
13
- display_name: "GPT-3.5 Turbo",
14
- provider: :openai
15
- },
16
- {
17
- canonical_name: "gpt-3.5-turbo-0125",
18
- display_name: "GPT-3.5 Turbo 0125",
19
- provider: :openai
20
- },
21
- {
22
- canonical_name: "gpt-3.5-turbo-16k",
23
- display_name: "GPT-3.5 Turbo 16K",
24
- provider: :openai
25
- },
26
- {
27
- canonical_name: "gpt-3.5-turbo-1106",
28
- display_name: "GPT-3.5 Turbo 1106",
29
- provider: :openai
30
- },
12
+ # GPT-3.5 Turbo Models
13
+ {
14
+ canonical_name: "gpt-3.5-turbo",
15
+ display_name: "GPT-3.5 Turbo",
16
+ provider: :openai,
17
+ client_class: LLM::Clients::OpenAI
18
+ },
19
+ {
20
+ canonical_name: "gpt-3.5-turbo-0125",
21
+ display_name: "GPT-3.5 Turbo 0125",
22
+ provider: :openai,
23
+ client_class: LLM::Clients::OpenAI
24
+ },
25
+ {
26
+ canonical_name: "gpt-3.5-turbo-16k",
27
+ display_name: "GPT-3.5 Turbo 16K",
28
+ provider: :openai,
29
+ client_class: LLM::Clients::OpenAI
30
+ },
31
+ {
32
+ canonical_name: "gpt-3.5-turbo-1106",
33
+ display_name: "GPT-3.5 Turbo 1106",
34
+ provider: :openai,
35
+ client_class: LLM::Clients::OpenAI
36
+ },
31
37
 
32
- # GPT-4 Models
33
- {
34
- canonical_name: "gpt-4",
35
- display_name: "GPT-4",
36
- provider: :openai
37
- },
38
- {
39
- canonical_name: "gpt-4-32k",
40
- display_name: "GPT-4 32K",
41
- provider: :openai
42
- },
43
- {
44
- canonical_name: "gpt-4-1106-preview",
45
- display_name: "GPT-4 Turbo 1106",
46
- provider: :openai
47
- },
48
- {
49
- canonical_name: "gpt-4-turbo-2024-04-09",
50
- display_name: "GPT-4 Turbo 2024-04-09",
51
- provider: :openai
52
- },
53
- {
54
- canonical_name: "gpt-4-0125-preview",
55
- display_name: "GPT-4 Turbo 0125",
56
- provider: :openai
57
- },
58
- {
59
- canonical_name: "gpt-4-turbo-preview",
60
- display_name: "GPT-4 Turbo",
61
- provider: :openai
62
- },
63
- {
64
- canonical_name: "gpt-4-0613",
65
- display_name: "GPT-4 0613",
66
- provider: :openai
67
- },
68
- {
69
- canonical_name: "gpt-4-32k-0613",
70
- display_name: "GPT-4 32K 0613",
71
- provider: :openai
72
- },
73
- {
74
- canonical_name: "gpt-4o",
75
- display_name: "GPT-4o",
76
- provider: :openai
77
- },
78
- {
79
- canonical_name: "gpt-4o-mini",
80
- display_name: "GPT-4o Mini",
81
- provider: :openai
82
- },
83
- {
84
- canonical_name: "gpt-4o-2024-05-13",
85
- display_name: "GPT-4o 2024-05-13",
86
- provider: :openai
87
- },
88
- {
89
- canonical_name: "gpt-4o-2024-08-06",
90
- display_name: "GPT-4o 2024-08-06",
91
- provider: :openai
92
- }
93
- ].freeze
38
+ # GPT-4 Models
39
+ {
40
+ canonical_name: "gpt-4",
41
+ display_name: "GPT-4",
42
+ provider: :openai,
43
+ client_class: LLM::Clients::OpenAI
44
+ },
45
+ {
46
+ canonical_name: "gpt-4-1106-preview",
47
+ display_name: "GPT-4 Turbo 1106",
48
+ provider: :openai,
49
+ client_class: LLM::Clients::OpenAI
50
+ },
51
+ {
52
+ canonical_name: "gpt-4-turbo-2024-04-09",
53
+ display_name: "GPT-4 Turbo 2024-04-09",
54
+ provider: :openai,
55
+ client_class: LLM::Clients::OpenAI
56
+ },
57
+ {
58
+ canonical_name: "gpt-4-0125-preview",
59
+ display_name: "GPT-4 Turbo 0125",
60
+ provider: :openai,
61
+ client_class: LLM::Clients::OpenAI
62
+ },
63
+ {
64
+ canonical_name: "gpt-4-turbo-preview",
65
+ display_name: "GPT-4 Turbo",
66
+ provider: :openai,
67
+ client_class: LLM::Clients::OpenAI
68
+ },
69
+ {
70
+ canonical_name: "gpt-4-0613",
71
+ display_name: "GPT-4 0613",
72
+ provider: :openai,
73
+ client_class: LLM::Clients::OpenAI
74
+ },
75
+ {
76
+ canonical_name: "gpt-4o",
77
+ display_name: "GPT-4o",
78
+ provider: :openai,
79
+ client_class: LLM::Clients::OpenAI
80
+ },
81
+ {
82
+ canonical_name: "gpt-4o-mini",
83
+ display_name: "GPT-4o Mini",
84
+ provider: :openai,
85
+ client_class: LLM::Clients::OpenAI
86
+ },
87
+ {
88
+ canonical_name: "gpt-4o-mini-2024-07-18",
89
+ display_name: "GPT-4o Mini 2024-07-18",
90
+ provider: :openai,
91
+ client_class: LLM::Clients::OpenAI
92
+ },
93
+ {
94
+ canonical_name: "gpt-4o-2024-05-13",
95
+ display_name: "GPT-4o 2024-05-13",
96
+ provider: :openai,
97
+ client_class: LLM::Clients::OpenAI
98
+ },
99
+ {
100
+ canonical_name: "gpt-4o-2024-08-06",
101
+ display_name: "GPT-4o 2024-08-06",
102
+ provider: :openai,
103
+ client_class: LLM::Clients::OpenAI
104
+ },
105
+ {
106
+ canonical_name: "gpt-4o-2024-11-20",
107
+ display_name: "GPT-4o 2024-11-20",
108
+ provider: :openai,
109
+ client_class: LLM::Clients::OpenAI
110
+ },
111
+ {
112
+ canonical_name: "chatgpt-4o-latest",
113
+ display_name: "ChatGPT 4o Latest",
114
+ provider: :openai,
115
+ client_class: LLM::Clients::OpenAI
116
+ },
117
+ {
118
+ canonical_name: "o1",
119
+ display_name: "o1",
120
+ provider: :openai,
121
+ client_class: LLM::Clients::OpenAI
122
+ },
123
+ {
124
+ canonical_name: "o1-2024-12-17",
125
+ display_name: "o1 2024-12-17",
126
+ provider: :openai,
127
+ client_class: LLM::Clients::OpenAI
128
+ },
129
+ {
130
+ canonical_name: "o1-preview",
131
+ display_name: "o1 Preview",
132
+ provider: :openai,
133
+ client_class: LLM::Clients::OpenAI
134
+ },
135
+ {
136
+ canonical_name: "o1-preview-2024-09-12",
137
+ display_name: "o1 Preview 2024-09-12",
138
+ provider: :openai,
139
+ client_class: LLM::Clients::OpenAI
140
+ },
141
+ {
142
+ canonical_name: "o1-mini",
143
+ display_name: "o1 Mini",
144
+ provider: :openai,
145
+ client_class: LLM::Clients::OpenAI
146
+ },
147
+ {
148
+ canonical_name: "o1-mini-2024-09-12",
149
+ display_name: "o1 Mini 2024-09-12",
150
+ provider: :openai,
151
+ client_class: LLM::Clients::OpenAI
152
+ },
153
+ {
154
+ canonical_name: "o3-mini",
155
+ display_name: "o3 Mini",
156
+ provider: :openai,
157
+ client_class: LLM::Clients::OpenAI
158
+ },
159
+ {
160
+ canonical_name: "o3-mini-2025-01-31",
161
+ display_name: "o3 Mini 2025-01-31",
162
+ provider: :openai,
163
+ client_class: LLM::Clients::OpenAI
164
+ },
165
+
166
+ # Anthropic Models
167
+ {
168
+ canonical_name: "claude-3-5-sonnet-20241022",
169
+ display_name: "Claude 3.5 Sonnet 2024-10-22",
170
+ provider: :anthropic,
171
+ client_class: LLM::Clients::Anthropic,
172
+ additional_default_required_parameters: {
173
+ max_output_tokens: 8192
174
+ }
175
+ },
176
+ {
177
+ canonical_name: "claude-3-5-haiku-20241022",
178
+ display_name: "Claude 3.5 Haiku 2024-10-22",
179
+ provider: :anthropic,
180
+ client_class: LLM::Clients::Anthropic,
181
+ additional_default_required_parameters: {
182
+ max_output_tokens: 8192
183
+ }
184
+ },
185
+ {
186
+ canonical_name: "claude-3-5-sonnet-20240620",
187
+ display_name: "Claude 3.5 Sonnet 2024-06-20",
188
+ provider: :anthropic,
189
+ client_class: LLM::Clients::Anthropic,
190
+ additional_default_required_parameters: {
191
+ max_output_tokens: 8192
192
+ }
193
+ },
194
+ {
195
+ canonical_name: "claude-3-opus-20240229",
196
+ display_name: "Claude 3.5 Opus 2024-02-29",
197
+ provider: :anthropic,
198
+ client_class: LLM::Clients::Anthropic,
199
+ additional_default_required_parameters: {
200
+ max_output_tokens: 4096
201
+ }
202
+ },
203
+ {
204
+ canonical_name: "claude-3-sonnet-20240229",
205
+ display_name: "Claude 3.5 Sonnet 2024-02-29",
206
+ provider: :anthropic,
207
+ client_class: LLM::Clients::Anthropic,
208
+ additional_default_required_parameters: {
209
+ max_output_tokens: 4096
210
+ }
211
+ },
212
+ {
213
+ canonical_name: "claude-3-haiku-20240307",
214
+ display_name: "Claude 3.5 Opus 2024-03-07",
215
+ provider: :anthropic,
216
+ client_class: LLM::Clients::Anthropic,
217
+ additional_default_required_parameters: {
218
+ max_output_tokens: 4096
219
+ }
220
+ },
221
+
222
+ # Google Models
223
+ {
224
+ canonical_name: "gemini-2.0-flash",
225
+ display_name: "Gemini 2.0 Flash",
226
+ provider: :google,
227
+ client_class: LLM::Clients::Gemini
228
+ },
229
+ {
230
+ canonical_name: "gemini-2.0-flash-lite-preview-02-05",
231
+ display_name: "Gemini 2.0 Flash Lite Preview 02-05",
232
+ provider: :google,
233
+ client_class: LLM::Clients::Gemini
234
+ },
235
+ {
236
+ canonical_name: "gemini-1.5-flash-8b",
237
+ display_name: "Gemini 1.5 Flash 8B",
238
+ provider: :google,
239
+ client_class: LLM::Clients::Gemini
240
+ },
241
+ {
242
+ canonical_name: "gemini-1.5-flash",
243
+ display_name: "Gemini 1.5 Flash",
244
+ provider: :google,
245
+ client_class: LLM::Clients::Gemini
246
+ },
247
+ {
248
+ canonical_name: "gemini-1.5-pro",
249
+ display_name: "Gemini 1.5 Pro",
250
+ provider: :google,
251
+ client_class: LLM::Clients::Gemini
252
+ }
253
+ ].freeze
254
+ end
94
255
  end
data/lib/llm/stop_reason.rb CHANGED
@@ -1,9 +1,12 @@
  # frozen_string_literal: true
 
- module LLM::StopReason
-   STOP = :stop
-   SAFETY = :safety
-   MAX_TOKENS_REACHED = :max_tokens
+ class LLM
+   module StopReason
+     STOP = :stop
+     SAFETY = :safety
+     MAX_TOKENS_REACHED = :max_tokens
+     STOP_SEQUENCE = :stop_sequence
 
-   OTHER = :other
+     OTHER = :other
+   end
  end
data/lib/llm.rb CHANGED
@@ -13,7 +13,8 @@ class LLM
    @canonical_name = model[:canonical_name]
    @display_name = model[:display_name]
    @provider = model[:provider]
-   @client_class = LLM::Clients::OpenAI # TODO: Allow alternative client classes.
+   @client_class = model[:client_class]
+   @default_params = model[:additional_default_required_parameters] || {}
  end
 
  def client
@@ -22,7 +23,8 @@ class LLM
 
  attr_reader :canonical_name,
    :display_name,
-   :provider
+   :provider,
+   :default_params
 
  private
 
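With these two changes, the entry in `LLM::Info::KNOWN_MODELS` now decides both which client a model uses and which required defaults it carries. A sketch of the resulting behaviour, assuming `#client` still simply instantiates the stored `client_class` as before:

```ruby
anthropic = LLM.from_string!("claude-3-5-sonnet-20241022")
anthropic.client          # => an LLM::Clients::Anthropic instance
anthropic.default_params  # => {max_output_tokens: 8192}

gemini = LLM.from_string!("gemini-1.5-pro")
gemini.client             # => an LLM::Clients::Gemini instance
gemini.default_params     # => {} (no additional required parameters)
```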
metadata CHANGED
@@ -1,14 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: llm_ruby
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.2.0
  platform: ruby
  authors:
  - Alex Gamble
- autorequire:
  bindir: exe
  cert_chain: []
- date: 2024-09-13 00:00:00.000000000 Z
+ date: 2025-02-23 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: event_stream_parser
@@ -136,10 +135,6 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 3.23.0
- description:
- email:
- - alex@context.ai
- - alec@context.ai
  executables: []
  extensions: []
  extra_rdoc_files: []
@@ -150,6 +145,11 @@ files:
  - README.md
  - Rakefile
  - lib/llm.rb
+ - lib/llm/clients/anthropic.rb
+ - lib/llm/clients/anthropic/response.rb
+ - lib/llm/clients/gemini.rb
+ - lib/llm/clients/gemini/request.rb
+ - lib/llm/clients/gemini/response.rb
  - lib/llm/clients/open_ai.rb
  - lib/llm/clients/open_ai/response.rb
  - lib/llm/info.rb
@@ -157,13 +157,12 @@ files:
  - lib/llm/response.rb
  - lib/llm/stop_reason.rb
  - lib/llm/version.rb
- homepage: https://context.ai
+ homepage: https://github.com/agamble/llm_ruby
  licenses:
  - MIT
  metadata:
- homepage_uri: https://context.ai
- source_code_uri: https://github.com/contextco/llm_ruby
- post_install_message:
+ homepage_uri: https://github.com/agamble/llm_ruby
+ source_code_uri: https://github.com/agamble/llm_ruby
  rdoc_options: []
  require_paths:
  - lib
@@ -178,8 +177,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.5.16
- signing_key:
+ rubygems_version: 3.6.2
  specification_version: 4
  summary: A client to interact with LLM APIs in a consistent way.
  test_files: []