ruby-openai 7.1.0 → 7.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: a86dc627f27eeea7cf3eb1bf2eec2b0209d0bb8c11fef0eb6fd6518f7f10cfe9
-  data.tar.gz: 712ab627670853d680c8858a9d27aef5a82be09de8d53e5f7156ee608ba8d939
+  metadata.gz: 278f25c283d841bfa33614bd69b4340b9275712b83e9121a1aa2a6a439767714
+  data.tar.gz: 702c11ba4b0411a47e9d6f9fdb178d1eb40a7baede5909f0665b41edc00797b0
 SHA512:
-  metadata.gz: 72e14dc39495046b71ca147953582a24f8c9261955f2ca2a8d898ca7f8e136b459c31583620c16db3fa80c39da61c1f3a4cc932c5b3f2e71741fed42719eaeaf
-  data.tar.gz: 82db19d40f9b44fedb73d8f310771af71096d3d7e8e56f96d000a70f4c61abb1f21cbe98187d70fec1c3d639921eabe8cd8e456cac92dbcfa6e2efcb655f865b
+  metadata.gz: 7c4a1bdb8fd3f466808f740112c3223a04da5ef73fd355b5f2136ecf28f5be2968ec4ecced5db25833878ba7e9de1f63e063529580f86d269c7bdd82f7e77df9
+  data.tar.gz: '014855034340e14ac2e78c845ae791f619dedf636c9839ce5edc2ea27d7eb54e973dbe4a41998b41d1e89c2c56ce04cd2c062c90bdc87b858b7467005e78100c'
data/CHANGELOG.md CHANGED
@@ -5,6 +5,20 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [7.3.0] - 2024-10-11
+
+### Added
+
+- Add ability to (with the right incantations) retrieve the chunks used by an Assistant file search - thanks to [@agamble](https://github.com/agamble) for the addition!
+
+## [7.2.0] - 2024-10-10
+
+### Added
+
+- Add ability to pass parameters to Files#list endpoint - thanks to [@parterburn](https://github.com/parterburn)!
+- Add Velvet observability platform to README - thanks to [@philipithomas](https://github.com/philipithomas)
+- Add Assistants::Messages#delete endpoint - thanks to [@mochetts](https://github.com/mochetts)!
+
 ## [7.1.0] - 2024-06-10
 
 ### Added
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    ruby-openai (7.1.0)
+    ruby-openai (7.3.0)
       event_stream_parser (>= 0.3.0, < 2.0.0)
       faraday (>= 1)
       faraday-multipart (>= 1)
@@ -38,7 +38,7 @@ GEM
     rainbow (3.1.1)
     rake (13.2.1)
     regexp_parser (2.8.0)
-    rexml (3.2.9)
+    rexml (3.3.6)
       strscan
     rspec (3.13.0)
       rspec-core (~> 3.13.0)
data/README.md CHANGED
@@ -8,7 +8,7 @@ Use the [OpenAI API](https://openai.com/blog/openai-api/) with Ruby! 🤖❤️
 
 Stream text with GPT-4o, transcribe and translate audio with Whisper, or create images with DALL·E...
 
-[🚢 Hire me](https://peaceterms.com?utm_source=ruby-openai&utm_medium=readme&utm_id=26072023) | [🎮 Ruby AI Builders Discord](https://discord.gg/k4Uc224xVD) | [🐦 Twitter](https://twitter.com/alexrudall) | [🧠 Anthropic Gem](https://github.com/alexrudall/anthropic) | [🚂 Midjourney Gem](https://github.com/alexrudall/midjourney)
+[📚 Rails AI (FREE Book)](https://railsai.com) | [🎮 Ruby AI Builders Discord](https://discord.gg/k4Uc224xVD) | [🐦 X](https://x.com/alexrudall) | [🧠 Anthropic Gem](https://github.com/alexrudall/anthropic) | [🚂 Midjourney Gem](https://github.com/alexrudall/midjourney)
 
 ## Contents
 
@@ -139,7 +139,9 @@ client = OpenAI::Client.new(access_token: "access_token_goes_here")
 
 #### Custom timeout or base URI
 
-The default timeout for any request using this library is 120 seconds. You can change that by passing a number of seconds to the `request_timeout` when initializing the client. You can also change the base URI used for all requests, eg. to use observability tools like [Helicone](https://docs.helicone.ai/quickstart/integrate-in-one-line-of-code), and add arbitrary other headers e.g. for [openai-caching-proxy-worker](https://github.com/6/openai-caching-proxy-worker):
+- The default timeout for any request using this library is 120 seconds. You can change that by passing a number of seconds to `request_timeout` when initializing the client.
+- You can also change the base URI used for all requests, e.g. to use observability tools like [Helicone](https://docs.helicone.ai/quickstart/integrate-in-one-line-of-code) or [Velvet](https://docs.usevelvet.com/docs/getting-started).
+- You can also add arbitrary other headers, e.g. for [openai-caching-proxy-worker](https://github.com/6/openai-caching-proxy-worker):
 
 ```ruby
 client = OpenAI::Client.new(
@@ -326,7 +328,28 @@ client.chat(
 # => "Anna is a young woman in her mid-twenties, with wavy chestnut hair that falls to her shoulders..."
 ```
 
-Note: OpenAPI currently does not report token usage for streaming responses. To count tokens while streaming, try `OpenAI.rough_token_count` or [tiktoken_ruby](https://github.com/IAPark/tiktoken_ruby). We think that each call to the stream proc corresponds to a single token, so you can also try counting the number of calls to the proc to get the completion token count.
+Note: In order to get usage information, you can provide the [`stream_options` parameter](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options) and OpenAI will provide a final chunk with the usage. Here is an example:
+
+```ruby
+stream_proc = proc { |chunk, _bytesize| puts "--------------"; puts chunk.inspect; }
+client.chat(
+  parameters: {
+    model: "gpt-4o",
+    stream: stream_proc,
+    stream_options: { include_usage: true },
+    messages: [{ role: "user", content: "Hello!"}],
+  })
+# => --------------
+# => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{"role"=>"assistant", "content"=>""}, "logprobs"=>nil, "finish_reason"=>nil}], "usage"=>nil}
+# => --------------
+# => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{"content"=>"Hello"}, "logprobs"=>nil, "finish_reason"=>nil}], "usage"=>nil}
+# => --------------
+# => ... more content chunks
+# => --------------
+# => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{}, "logprobs"=>nil, "finish_reason"=>"stop"}], "usage"=>nil}
+# => --------------
+# => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[], "usage"=>{"prompt_tokens"=>9, "completion_tokens"=>9, "total_tokens"=>18}}
+```
 
 #### Vision
 
@@ -526,9 +549,11 @@ puts response.dig("data", 0, "embedding")
 ```
 
 ### Batches
+
 The Batches endpoint allows you to create and manage large batches of API requests to run asynchronously. Currently, the supported endpoints for batches are `/v1/chat/completions` (Chat Completions API) and `/v1/embeddings` (Embeddings API).
 
 To use the Batches endpoint, you need to first upload a JSONL file containing the batch requests using the Files endpoint. The file must be uploaded with the purpose set to `batch`. Each line in the JSONL file represents a single request and should have the following format:
+
 ```json
 {
   "custom_id": "request-1",
@@ -612,7 +637,9 @@ These files are in JSONL format, with each line representing the output or error
 If a request fails with a non-HTTP error, the error object will contain more information about the cause of the failure.
 
 ### Files
+
 #### For fine-tuning purposes
+
 Put your data in a `.jsonl` file like this:
 
 ```json
@@ -645,7 +672,6 @@ my_file = File.open("path/to/file.pdf", "rb")
 client.files.upload(parameters: { file: my_file, purpose: "assistants" })
 ```
 
-
 See supported file types on [API documentation](https://platform.openai.com/docs/assistants/tools/file-search/supported-files).
 
 ### Finetunes
@@ -701,6 +727,7 @@ client.finetunes.list_events(id: fine_tune_id)
 ```
 
 ### Vector Stores
+
 Vector Store objects give the File Search tool the ability to search your files.
 
 You can create a new vector store:
@@ -746,6 +773,7 @@ client.vector_stores.delete(id: vector_store_id)
 ```
 
 ### Vector Store Files
+
 Vector store files represent files inside a vector store.
 
 You can create a new vector store file by attaching a File to a vector store.
@@ -784,9 +812,11 @@ client.vector_store_files.delete(
   id: vector_store_file_id
 )
 ```
+
 Note: This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint.
 
 ### Vector Store File Batches
+
 Vector store file batches represent operations to add multiple files to a vector store.
 
 You can create a new vector store file batch by attaching multiple Files to a vector store.
@@ -1081,6 +1111,116 @@ end
 
 Note that you have 10 minutes to submit your tool output before the run expires.
 
+#### Exploring chunks used in File Search
+
+Take a deep breath. You might need a drink for this one.
+
+It's possible to get OpenAI to share the chunks it used in its internal RAG pipeline to create its file search response.
+
+An example spec that does this can be found [here](https://github.com/alexrudall/ruby-openai/blob/main/spec/openai/client/assistant_file_search_spec.rb), just so you know it's possible.
+
+Here's how to get the chunks used in a file search. In this example I'm using [this file](https://css4.pub/2015/textbook/somatosensory.pdf):
+
+```ruby
+require "openai"
+
+# Make a client
+client = OpenAI::Client.new(
+  access_token: "access_token_goes_here",
+  log_errors: true # Don't do this in production.
+)
+
+# Upload your file(s)
+file_id = client.files.upload(
+  parameters: {
+    file: "path/to/somatosensory.pdf",
+    purpose: "assistants"
+  }
+)["id"]
+
+# Create a vector store to store the vectorised file(s)
+vector_store_id = client.vector_stores.create(parameters: {})["id"]
+
+# Vectorise the file(s)
+vector_store_file_id = client.vector_store_files.create(
+  vector_store_id: vector_store_id,
+  parameters: { file_id: file_id }
+)["id"]
+
+# Check that the file is vectorised (wait for status to be "completed")
+client.vector_store_files.retrieve(vector_store_id: vector_store_id, id: vector_store_file_id)["status"]
+
+# Create an assistant, referencing the vector store
+assistant_id = client.assistants.create(
+  parameters: {
+    model: "gpt-4o",
+    name: "Answer finder",
+    instructions: "You are a file search tool. Find the answer in the given files, please.",
+    tools: [
+      { type: "file_search" }
+    ],
+    tool_resources: {
+      file_search: {
+        vector_store_ids: [vector_store_id]
+      }
+    }
+  }
+)["id"]
+
+# Create a thread with your question
+thread_id = client.threads.create(parameters: {
+  messages: [
+    { role: "user",
+      content: "Find the description of a nociceptor." }
+  ]
+})["id"]
+
+# Run the thread to generate the response. Include the "GIVE ME THE CHUNKS" incantation.
+run_id = client.runs.create(
+  thread_id: thread_id,
+  parameters: {
+    assistant_id: assistant_id
+  },
+  query_parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] } # incantation
+)["id"]
+
+# Get the steps that happened in the run
+steps = client.run_steps.list(
+  thread_id: thread_id,
+  run_id: run_id,
+  parameters: { order: "asc" }
+)
+
+# Get the first step ID (or whichever one you want to look at)
+step_id = steps["data"].first["id"]
+
+# Retrieve all the steps. Include the "GIVE ME THE CHUNKS" incantation again.
+steps = steps["data"].map do |step|
+  client.run_steps.retrieve(
+    thread_id: thread_id,
+    run_id: run_id,
+    id: step["id"],
+    parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] } # incantation
+  )
+end
+
+# Now we've got the chunk info, buried deep. Loop through the steps and find chunks if included:
+chunks = steps.flat_map do |step|
+  included_results = step.dig("step_details", "tool_calls", 0, "file_search", "results")
+
+  next if included_results.nil? || included_results.empty?
+
+  included_results.flat_map do |result|
+    result["content"].map do |content|
+      content["text"]
+    end
+  end
+end.compact
+
+# The first chunk will be the closest match to the prompt. Finally, if you want to view the completed message(s):
+client.messages.list(thread_id: thread_id)
+```
+
 ### Image Generation
 
 Generate images using DALL·E 2 or DALL·E 3!
data/lib/openai/client.rb CHANGED
@@ -2,6 +2,7 @@ module OpenAI
   class Client
     include OpenAI::HTTP
 
+    SENSITIVE_ATTRIBUTES = %i[@access_token @organization_id @extra_headers].freeze
     CONFIG_KEYS = %i[
      api_type
      api_version
@@ -107,5 +108,15 @@ module OpenAI
         client.add_headers("OpenAI-Beta": apis.map { |k, v| "#{k}=#{v}" }.join(";"))
       end
     end
+
+    def inspect
+      vars = instance_variables.map do |var|
+        value = instance_variable_get(var)
+
+        SENSITIVE_ATTRIBUTES.include?(var) ? "#{var}=[REDACTED]" : "#{var}=#{value.inspect}"
+      end
+
+      "#<#{self.class}:#{object_id} #{vars.join(', ')}>"
+    end
   end
 end
data/lib/openai/files.rb CHANGED
@@ -11,8 +11,8 @@ module OpenAI
       @client = client
     end
 
-    def list
-      @client.get(path: "/files")
+    def list(parameters: {})
+      @client.get(path: "/files", parameters: parameters)
     end
 
     def upload(parameters: {})
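The new `parameters:` keyword lets callers pass query filters through to `GET /v1/files`, such as `purpose`. A quick sketch:

```ruby
# List only files uploaded for fine-tuning; "purpose" is a filter
# accepted by the OpenAI Files API.
client.files.list(parameters: { purpose: "fine-tune" })
```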
data/lib/openai/http.rb CHANGED
@@ -18,9 +18,10 @@ module OpenAI
       end&.body)
     end
 
-    def json_post(path:, parameters:)
+    def json_post(path:, parameters:, query_parameters: {})
       conn.post(uri(path: path)) do |req|
         configure_json_post_request(req, parameters)
+        req.params = query_parameters
       end&.body
     end
 
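In effect, `json_post` can now carry URL query parameters alongside the JSON body; Faraday serializes whatever is assigned to `req.params` into the query string. A sketch of the resulting request shape (path and IDs illustrative; callers would normally go through `Runs#create` below rather than calling `json_post` directly):

```ruby
# Produces POST /threads/thread_abc123/runs?include[]=... with a JSON body.
client.json_post(
  path: "/threads/thread_abc123/runs",
  parameters: { assistant_id: "asst_abc123" },
  query_parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] }
)
```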
data/lib/openai/messages.rb CHANGED
@@ -16,8 +16,12 @@ module OpenAI
       @client.json_post(path: "/threads/#{thread_id}/messages", parameters: parameters)
     end
 
-    def modify(id:, thread_id:, parameters: {})
+    def modify(thread_id:, id:, parameters: {})
       @client.json_post(path: "/threads/#{thread_id}/messages/#{id}", parameters: parameters)
     end
+
+    def delete(thread_id:, id:)
+      @client.delete(path: "/threads/#{thread_id}/messages/#{id}")
+    end
   end
 end
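Usage of the new endpoint mirrors `modify` (IDs illustrative):

```ruby
# Remove a single message from a thread.
client.messages.delete(thread_id: "thread_abc123", id: "msg_abc123")
```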
data/lib/openai/run_steps.rb CHANGED
@@ -8,8 +8,12 @@ module OpenAI
       @client.get(path: "/threads/#{thread_id}/runs/#{run_id}/steps", parameters: parameters)
     end
 
-    def retrieve(thread_id:, run_id:, id:)
-      @client.get(path: "/threads/#{thread_id}/runs/#{run_id}/steps/#{id}")
+    def retrieve(thread_id:, run_id:, id:, parameters: {})
+      @client.get(path: "/threads/#{thread_id}/runs/#{run_id}/steps/#{id}", parameters: parameters)
     end
   end
 end
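Together with the `Runs#create` change below, this enables the README's chunk-retrieval example: pass the `include` query parameter when retrieving a step (IDs illustrative):

```ruby
# Retrieve one run step, asking for file search result content to be included.
client.run_steps.retrieve(
  thread_id: "thread_abc123",
  run_id: "run_abc123",
  id: "step_abc123",
  parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] }
)
```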
data/lib/openai/runs.rb CHANGED
@@ -12,8 +12,9 @@ module OpenAI
       @client.get(path: "/threads/#{thread_id}/runs/#{id}")
     end
 
-    def create(thread_id:, parameters: {})
-      @client.json_post(path: "/threads/#{thread_id}/runs", parameters: parameters)
+    def create(thread_id:, parameters: {}, query_parameters: {})
+      @client.json_post(path: "/threads/#{thread_id}/runs", parameters: parameters,
+                        query_parameters: query_parameters)
     end
 
     def modify(id:, thread_id:, parameters: {})
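Query parameters passed here end up on the POST URL via the `json_post` change above (IDs illustrative):

```ruby
# Create a run, requesting that file search chunk content be included
# in later run step retrievals.
run_id = client.runs.create(
  thread_id: "thread_abc123",
  parameters: { assistant_id: "asst_abc123" },
  query_parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] }
)["id"]
```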
data/lib/openai/version.rb CHANGED
@@ -1,3 +1,3 @@
 module OpenAI
-  VERSION = "7.1.0".freeze
+  VERSION = "7.3.0".freeze
 end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-openai
 version: !ruby/object:Gem::Version
-  version: 7.1.0
+  version: 7.3.0
 platform: ruby
 authors:
 - Alex
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2024-06-10 00:00:00.000000000 Z
+date: 2024-10-11 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: event_stream_parser