ruby-openai 7.1.0 → 7.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a86dc627f27eeea7cf3eb1bf2eec2b0209d0bb8c11fef0eb6fd6518f7f10cfe9
- data.tar.gz: 712ab627670853d680c8858a9d27aef5a82be09de8d53e5f7156ee608ba8d939
+ metadata.gz: 278f25c283d841bfa33614bd69b4340b9275712b83e9121a1aa2a6a439767714
+ data.tar.gz: 702c11ba4b0411a47e9d6f9fdb178d1eb40a7baede5909f0665b41edc00797b0
  SHA512:
- metadata.gz: 72e14dc39495046b71ca147953582a24f8c9261955f2ca2a8d898ca7f8e136b459c31583620c16db3fa80c39da61c1f3a4cc932c5b3f2e71741fed42719eaeaf
- data.tar.gz: 82db19d40f9b44fedb73d8f310771af71096d3d7e8e56f96d000a70f4c61abb1f21cbe98187d70fec1c3d639921eabe8cd8e456cac92dbcfa6e2efcb655f865b
+ metadata.gz: 7c4a1bdb8fd3f466808f740112c3223a04da5ef73fd355b5f2136ecf28f5be2968ec4ecced5db25833878ba7e9de1f63e063529580f86d269c7bdd82f7e77df9
+ data.tar.gz: '014855034340e14ac2e78c845ae791f619dedf636c9839ce5edc2ea27d7eb54e973dbe4a41998b41d1e89c2c56ce04cd2c062c90bdc87b858b7467005e78100c'
data/CHANGELOG.md CHANGED
@@ -5,6 +5,20 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [7.3.0] - 2024-10-11
+
+ ### Added
+
+ - Add ability to (with the right incantations) retrieve the chunks used by an Assistant file search - thanks to [@agamble](https://github.com/agamble) for the addition!
+
+ ## [7.2.0] - 2024-10-10
+
+ ### Added
+
+ - Add ability to pass parameters to Files#list endpoint - thanks to [@parterburn](https://github.com/parterburn)!
+ - Add Velvet observability platform to README - thanks to [@philipithomas](https://github.com/philipithomas)
+ - Add Assistants::Messages#delete endpoint - thanks to [@mochetts](https://github.com/mochetts)!
+
  ## [7.1.0] - 2024-06-10
 
  ### Added
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
    remote: .
    specs:
-     ruby-openai (7.1.0)
+     ruby-openai (7.3.0)
      event_stream_parser (>= 0.3.0, < 2.0.0)
      faraday (>= 1)
      faraday-multipart (>= 1)
@@ -38,7 +38,7 @@ GEM
    rainbow (3.1.1)
    rake (13.2.1)
    regexp_parser (2.8.0)
-   rexml (3.2.9)
+   rexml (3.3.6)
      strscan
    rspec (3.13.0)
      rspec-core (~> 3.13.0)
data/README.md CHANGED
@@ -8,7 +8,7 @@ Use the [OpenAI API](https://openai.com/blog/openai-api/) with Ruby! 🤖❤️
 
  Stream text with GPT-4o, transcribe and translate audio with Whisper, or create images with DALL·E...
 
- [🚢 Hire me](https://peaceterms.com?utm_source=ruby-openai&utm_medium=readme&utm_id=26072023) | [🎮 Ruby AI Builders Discord](https://discord.gg/k4Uc224xVD) | [🐦 Twitter](https://twitter.com/alexrudall) | [🧠 Anthropic Gem](https://github.com/alexrudall/anthropic) | [🚂 Midjourney Gem](https://github.com/alexrudall/midjourney)
+ [📚 Rails AI (FREE Book)](https://railsai.com) | [🎮 Ruby AI Builders Discord](https://discord.gg/k4Uc224xVD) | [🐦 X](https://x.com/alexrudall) | [🧠 Anthropic Gem](https://github.com/alexrudall/anthropic) | [🚂 Midjourney Gem](https://github.com/alexrudall/midjourney)
 
  ## Contents
 
@@ -139,7 +139,9 @@ client = OpenAI::Client.new(access_token: "access_token_goes_here")
 
  #### Custom timeout or base URI
 
- The default timeout for any request using this library is 120 seconds. You can change that by passing a number of seconds to the `request_timeout` when initializing the client. You can also change the base URI used for all requests, eg. to use observability tools like [Helicone](https://docs.helicone.ai/quickstart/integrate-in-one-line-of-code), and add arbitrary other headers e.g. for [openai-caching-proxy-worker](https://github.com/6/openai-caching-proxy-worker):
+ - The default timeout for any request using this library is 120 seconds. You can change that by passing a number of seconds to `request_timeout` when initializing the client.
+ - You can also change the base URI used for all requests, e.g. to use observability tools like [Helicone](https://docs.helicone.ai/quickstart/integrate-in-one-line-of-code) or [Velvet](https://docs.usevelvet.com/docs/getting-started).
+ - You can also add arbitrary other headers, e.g. for [openai-caching-proxy-worker](https://github.com/6/openai-caching-proxy-worker):
 
  ```ruby
  client = OpenAI::Client.new(
@@ -326,7 +328,28 @@ client.chat(
  # => "Anna is a young woman in her mid-twenties, with wavy chestnut hair that falls to her shoulders..."
  ```
 
- Note: OpenAPI currently does not report token usage for streaming responses. To count tokens while streaming, try `OpenAI.rough_token_count` or [tiktoken_ruby](https://github.com/IAPark/tiktoken_ruby). We think that each call to the stream proc corresponds to a single token, so you can also try counting the number of calls to the proc to get the completion token count.
+ Note: In order to get usage information, you can provide the [`stream_options` parameter](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options) and OpenAI will provide a final chunk with the usage. Here is an example:
+
+ ```ruby
+ stream_proc = proc { |chunk, _bytesize| puts "--------------"; puts chunk.inspect; }
+ client.chat(
+   parameters: {
+     model: "gpt-4o",
+     stream: stream_proc,
+     stream_options: { include_usage: true },
+     messages: [{ role: "user", content: "Hello!" }],
+   })
+ # => --------------
+ # => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{"role"=>"assistant", "content"=>""}, "logprobs"=>nil, "finish_reason"=>nil}], "usage"=>nil}
+ # => --------------
+ # => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{"content"=>"Hello"}, "logprobs"=>nil, "finish_reason"=>nil}], "usage"=>nil}
+ # => --------------
+ # => ... more content chunks
+ # => --------------
+ # => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{}, "logprobs"=>nil, "finish_reason"=>"stop"}], "usage"=>nil}
+ # => --------------
+ # => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[], "usage"=>{"prompt_tokens"=>9, "completion_tokens"=>9, "total_tokens"=>18}}
+ ```
 
  #### Vision
 
@@ -526,9 +549,11 @@ puts response.dig("data", 0, "embedding")
  ```
 
  ### Batches
+
  The Batches endpoint allows you to create and manage large batches of API requests to run asynchronously. Currently, the supported endpoints for batches are `/v1/chat/completions` (Chat Completions API) and `/v1/embeddings` (Embeddings API).
 
  To use the Batches endpoint, you need to first upload a JSONL file containing the batch requests using the Files endpoint. The file must be uploaded with the purpose set to `batch`. Each line in the JSONL file represents a single request and should have the following format:
+
  ```json
  {
    "custom_id": "request-1",
@@ -612,7 +637,9 @@ These files are in JSONL format, with each line representing the output or error
  If a request fails with a non-HTTP error, the error object will contain more information about the cause of the failure.
 
  ### Files
+
  #### For fine-tuning purposes
+
  Put your data in a `.jsonl` file like this:
 
  ```json
@@ -645,7 +672,6 @@ my_file = File.open("path/to/file.pdf", "rb")
  client.files.upload(parameters: { file: my_file, purpose: "assistants" })
  ```
 
-
  See supported file types on [API documentation](https://platform.openai.com/docs/assistants/tools/file-search/supported-files).
 
  ### Finetunes
@@ -701,6 +727,7 @@ client.finetunes.list_events(id: fine_tune_id)
  ```
 
  ### Vector Stores
+
  Vector Store objects give the File Search tool the ability to search your files.
 
  You can create a new vector store:
@@ -746,6 +773,7 @@ client.vector_stores.delete(id: vector_store_id)
  ```
 
  ### Vector Store Files
+
  Vector store files represent files inside a vector store.
 
  You can create a new vector store file by attaching a File to a vector store.
@@ -784,9 +812,11 @@ client.vector_store_files.delete(
    id: vector_store_file_id
  )
  ```
+
  Note: This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint.
 
  ### Vector Store File Batches
+
  Vector store file batches represent operations to add multiple files to a vector store.
 
  You can create a new vector store file batch by attaching multiple Files to a vector store.
@@ -1081,6 +1111,116 @@ end
 
  Note that you have 10 minutes to submit your tool output before the run expires.
 
+ #### Exploring chunks used in File Search
+
+ Take a deep breath. You might need a drink for this one.
+
+ OpenAI can share which chunks it used in its internal RAG pipeline to generate a File Search response.
+
+ An example spec that does this can be found [here](https://github.com/alexrudall/ruby-openai/blob/main/spec/openai/client/assistant_file_search_spec.rb), just so you know it's possible.
+
+ Here's how to get the chunks used in a file search. This example uses [this file](https://css4.pub/2015/textbook/somatosensory.pdf):
+
+ ```ruby
+ require "openai"
+
+ # Make a client
+ client = OpenAI::Client.new(
+   access_token: "access_token_goes_here",
+   log_errors: true # Don't do this in production.
+ )
+
+ # Upload your file(s)
+ file_id = client.files.upload(
+   parameters: {
+     file: "path/to/somatosensory.pdf",
+     purpose: "assistants"
+   }
+ )["id"]
+
+ # Create a vector store to store the vectorised file(s)
+ vector_store_id = client.vector_stores.create(parameters: {})["id"]
+
+ # Vectorise the file(s)
+ vector_store_file_id = client.vector_store_files.create(
+   vector_store_id: vector_store_id,
+   parameters: { file_id: file_id }
+ )["id"]
+
+ # Check that the file is vectorised (wait for "status" to be "completed")
+ client.vector_store_files.retrieve(vector_store_id: vector_store_id, id: vector_store_file_id)["status"]
+
+ # Create an assistant, referencing the vector store
+ assistant_id = client.assistants.create(
+   parameters: {
+     model: "gpt-4o",
+     name: "Answer finder",
+     instructions: "You are a file search tool. Find the answer in the given files, please.",
+     tools: [
+       { type: "file_search" }
+     ],
+     tool_resources: {
+       file_search: {
+         vector_store_ids: [vector_store_id]
+       }
+     }
+   }
+ )["id"]
+
+ # Create a thread with your question
+ thread_id = client.threads.create(parameters: {
+   messages: [
+     { role: "user",
+       content: "Find the description of a nociceptor." }
+   ]
+ })["id"]
+
+ # Run the thread to generate the response. Include the "GIVE ME THE CHUNKS" incantation.
+ run_id = client.runs.create(
+   thread_id: thread_id,
+   parameters: {
+     assistant_id: assistant_id
+   },
+   query_parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] } # incantation
+ )["id"]
+
+ # Get the steps that happened in the run
+ steps = client.run_steps.list(
+   thread_id: thread_id,
+   run_id: run_id,
+   parameters: { order: "asc" }
+ )
+
+ # Retrieve each step in full. Include the "GIVE ME THE CHUNKS" incantation again.
+ steps = steps["data"].map do |step|
+   client.run_steps.retrieve(
+     thread_id: thread_id,
+     run_id: run_id,
+     id: step["id"],
+     parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] } # incantation
+   )
+ end
+
+ # Now we've got the chunk info, buried deep. Loop through the steps and find chunks if included:
+ chunks = steps.flat_map do |step|
+   included_results = step.dig("step_details", "tool_calls", 0, "file_search", "results")
+
+   next if included_results.nil? || included_results.empty?
+
+   included_results.flat_map do |result|
+     result["content"].map do |content|
+       content["text"]
+     end
+   end
+ end.compact
+
+ # The first chunk will be the closest match to the prompt. Finally, if you want to view the completed message(s):
+ client.messages.list(thread_id: thread_id)
+ ```
+
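The dig/flat_map step at the end of the example above is plain data wrangling, so it can be exercised without an API key. Below is a minimal sketch run against a hypothetical run-steps payload (the field names mirror the Run Steps response shape used above; the sample text values are invented):

```ruby
# Hypothetical retrieved run steps: one file_search step with two result
# chunks, and one step (e.g. message_creation) with no tool calls.
steps = [
  {
    "step_details" => {
      "tool_calls" => [
        { "file_search" => {
            "results" => [
              { "content" => [{ "text" => "Chunk about nociceptors." }] },
              { "content" => [{ "text" => "A second, less relevant chunk." }] }
            ] } }
      ]
    }
  },
  { "step_details" => {} }
]

# Same extraction loop as the README example: dig down to the file_search
# results, skip steps without any, and collect every chunk's text.
chunks = steps.flat_map do |step|
  included_results = step.dig("step_details", "tool_calls", 0, "file_search", "results")

  next if included_results.nil? || included_results.empty?

  included_results.flat_map do |result|
    result["content"].map { |content| content["text"] }
  end
end.compact

puts chunks.inspect
# => ["Chunk about nociceptors.", "A second, less relevant chunk."]
```

The `next` inside `flat_map` yields `nil` for steps without results, which is why the trailing `.compact` is needed.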
  ### Image Generation
 
  Generate images using DALL·E 2 or DALL·E 3!
data/lib/openai/client.rb CHANGED
@@ -2,6 +2,7 @@ module OpenAI
  class Client
    include OpenAI::HTTP
 
+   SENSITIVE_ATTRIBUTES = %i[@access_token @organization_id @extra_headers].freeze
    CONFIG_KEYS = %i[
      api_type
      api_version
@@ -107,5 +108,15 @@ module OpenAI
        client.add_headers("OpenAI-Beta": apis.map { |k, v| "#{k}=#{v}" }.join(";"))
      end
    end
+
+   def inspect
+     vars = instance_variables.map do |var|
+       value = instance_variable_get(var)
+
+       SENSITIVE_ATTRIBUTES.include?(var) ? "#{var}=[REDACTED]" : "#{var}=#{value.inspect}"
+     end
+
+     "#<#{self.class}:#{object_id} #{vars.join(', ')}>"
+   end
  end
end
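The new `inspect` override keeps credentials out of logs and consoles. The pattern is self-contained Ruby, so it can be demonstrated with a toy class (the class name and attributes below are illustrative, not part of the gem):

```ruby
# Toy class illustrating the redaction pattern used by the new
# OpenAI::Client#inspect: variables listed in SENSITIVE_ATTRIBUTES are
# replaced with [REDACTED]; everything else is shown normally.
class ToyClient
  SENSITIVE_ATTRIBUTES = %i[@access_token @organization_id].freeze

  def initialize(access_token:, organization_id:, uri_base:)
    @access_token = access_token
    @organization_id = organization_id
    @uri_base = uri_base
  end

  def inspect
    vars = instance_variables.map do |var|
      value = instance_variable_get(var)

      SENSITIVE_ATTRIBUTES.include?(var) ? "#{var}=[REDACTED]" : "#{var}=#{value.inspect}"
    end

    "#<#{self.class}:#{object_id} #{vars.join(', ')}>"
  end
end

client = ToyClient.new(
  access_token: "sk-secret",
  organization_id: "org-1",
  uri_base: "https://api.openai.com/"
)
puts client.inspect
# The token and organization never appear; @uri_base is printed as usual.
```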
data/lib/openai/files.rb CHANGED
@@ -11,8 +11,8 @@ module OpenAI
    @client = client
  end
 
- def list
-   @client.get(path: "/files")
+ def list(parameters: {})
+   @client.get(path: "/files", parameters: parameters)
  end
 
  def upload(parameters: {})
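With this change, `client.files.list(parameters: { purpose: "assistants" })` forwards the hash to the GET request, where it becomes URL query parameters. The gem does this through Faraday; the stdlib sketch below (hypothetical values, not the gem's actual plumbing) shows the equivalent hash-to-query-string encoding:

```ruby
require "uri"

# A parameters hash like the one Files#list now accepts...
parameters = { purpose: "assistants", limit: 100 }

# ...is equivalent to appending an encoded query string to the endpoint.
query = URI.encode_www_form(parameters)
url = "https://api.openai.com/v1/files?#{query}"

puts url
# => https://api.openai.com/v1/files?purpose=assistants&limit=100
```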
data/lib/openai/http.rb CHANGED
@@ -18,9 +18,10 @@ module OpenAI
  end&.body)
  end
 
- def json_post(path:, parameters:)
+ def json_post(path:, parameters:, query_parameters: {})
    conn.post(uri(path: path)) do |req|
      configure_json_post_request(req, parameters)
+     req.params = query_parameters
    end&.body
  end
 
data/lib/openai/messages.rb CHANGED
@@ -16,8 +16,12 @@ module OpenAI
    @client.json_post(path: "/threads/#{thread_id}/messages", parameters: parameters)
  end
 
- def modify(id:, thread_id:, parameters: {})
+ def modify(thread_id:, id:, parameters: {})
    @client.json_post(path: "/threads/#{thread_id}/messages/#{id}", parameters: parameters)
  end
+
+ def delete(thread_id:, id:)
+   @client.delete(path: "/threads/#{thread_id}/messages/#{id}")
+ end
  end
end
data/lib/openai/run_steps.rb CHANGED
@@ -8,8 +8,8 @@ module OpenAI
    @client.get(path: "/threads/#{thread_id}/runs/#{run_id}/steps", parameters: parameters)
  end
 
- def retrieve(thread_id:, run_id:, id:)
-   @client.get(path: "/threads/#{thread_id}/runs/#{run_id}/steps/#{id}")
+ def retrieve(thread_id:, run_id:, id:, parameters: {})
+   @client.get(path: "/threads/#{thread_id}/runs/#{run_id}/steps/#{id}", parameters: parameters)
  end
  end
end
data/lib/openai/runs.rb CHANGED
@@ -12,8 +12,9 @@ module OpenAI
    @client.get(path: "/threads/#{thread_id}/runs/#{id}")
  end
 
- def create(thread_id:, parameters: {})
-   @client.json_post(path: "/threads/#{thread_id}/runs", parameters: parameters)
+ def create(thread_id:, parameters: {}, query_parameters: {})
+   @client.json_post(path: "/threads/#{thread_id}/runs", parameters: parameters,
+                     query_parameters: query_parameters)
  end
 
  def modify(id:, thread_id:, parameters: {})
data/lib/openai/version.rb CHANGED
@@ -1,3 +1,3 @@
  module OpenAI
-   VERSION = "7.1.0".freeze
+   VERSION = "7.3.0".freeze
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: ruby-openai
  version: !ruby/object:Gem::Version
-   version: 7.1.0
+   version: 7.3.0
  platform: ruby
  authors:
  - Alex
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2024-06-10 00:00:00.000000000 Z
+ date: 2024-10-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: event_stream_parser