ruby-openai 3.6.0 → 6.5.0

data/README.md CHANGED
@@ -1,15 +1,63 @@
1
1
  # Ruby OpenAI
2
2
 
3
- [![Gem Version](https://badge.fury.io/rb/ruby-openai.svg)](https://badge.fury.io/rb/ruby-openai)
3
+ [![Gem Version](https://img.shields.io/gem/v/ruby-openai.svg)](https://rubygems.org/gems/ruby-openai)
4
4
  [![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/alexrudall/ruby-openai/blob/main/LICENSE.txt)
5
5
  [![CircleCI Build Status](https://circleci.com/gh/alexrudall/ruby-openai.svg?style=shield)](https://circleci.com/gh/alexrudall/ruby-openai)
6
- [![Maintainability](https://api.codeclimate.com/v1/badges/a99a88d28ad37a79dbf6/maintainability)](https://codeclimate.com/github/codeclimate/codeclimate/maintainability)
7
6
 
8
- Use the [OpenAI API](https://openai.com/blog/openai-api/) with Ruby! 🤖❤️
9
-
10
- Generate text with ChatGPT, transcribe or translate audio with Whisper, create images with DALL·E, or write code with Codex...
11
-
12
- Check out [Ruby AI Builders](https://discord.gg/k4Uc224xVD) on Discord!
7
+ Use the [OpenAI API](https://openai.com/blog/openai-api/) with Ruby! 🤖🩵
8
+
9
+ Stream text with GPT-4, transcribe and translate audio with Whisper, or create images with DALL·E...
10
+
11
+ [🚢 Hire me](https://peaceterms.com?utm_source=ruby-openai&utm_medium=readme&utm_id=26072023) | [🎮 Ruby AI Builders Discord](https://discord.gg/k4Uc224xVD) | [🐦 Twitter](https://twitter.com/alexrudall) | [🧠 Anthropic Gem](https://github.com/alexrudall/anthropic) | [🚂 Midjourney Gem](https://github.com/alexrudall/midjourney)
12
+
13
+ # Table of Contents
14
+
15
+ - [Ruby OpenAI](#ruby-openai)
16
+ - [Table of Contents](#table-of-contents)
17
+ - [Installation](#installation)
18
+ - [Bundler](#bundler)
19
+ - [Gem install](#gem-install)
20
+ - [Usage](#usage)
21
+ - [Quickstart](#quickstart)
22
+ - [With Config](#with-config)
23
+ - [Custom timeout or base URI](#custom-timeout-or-base-uri)
24
+ - [Extra Headers per Client](#extra-headers-per-client)
25
+ - [Verbose Logging](#verbose-logging)
26
+ - [Azure](#azure)
27
+ - [Counting Tokens](#counting-tokens)
28
+ - [Models](#models)
29
+ - [Examples](#examples)
30
+ - [Chat](#chat)
31
+ - [Streaming Chat](#streaming-chat)
32
+ - [Vision](#vision)
33
+ - [JSON Mode](#json-mode)
34
+ - [Functions](#functions)
35
+ - [Edits](#edits)
36
+ - [Embeddings](#embeddings)
37
+ - [Files](#files)
38
+ - [Finetunes](#finetunes)
39
+ - [Assistants](#assistants)
40
+ - [Threads and Messages](#threads-and-messages)
41
+ - [Runs](#runs)
42
+ - [Runs involving function tools](#runs-involving-function-tools)
43
+ - [Image Generation](#image-generation)
44
+ - [DALL·E 2](#dalle-2)
45
+ - [DALL·E 3](#dalle-3)
46
+ - [Image Edit](#image-edit)
47
+ - [Image Variations](#image-variations)
48
+ - [Moderations](#moderations)
49
+ - [Whisper](#whisper)
50
+ - [Translate](#translate)
51
+ - [Transcribe](#transcribe)
52
+ - [Speech](#speech)
53
+ - [Errors](#errors)
54
+ - [Development](#development)
55
+ - [Release](#release)
56
+ - [Contributing](#contributing)
57
+ - [License](#license)
58
+ - [Code of Conduct](#code-of-conduct)
59
+
60
+ ## Installation
13
61
 
14
62
  ### Bundler
15
63
 
@@ -21,13 +69,17 @@ gem "ruby-openai"
21
69
 
22
70
  And then execute:
23
71
 
72
+ ```bash
24
73
  $ bundle install
74
+ ```
25
75
 
26
76
  ### Gem install
27
77
 
28
78
  Or install with:
29
79
 
80
+ ```bash
30
81
  $ gem install ruby-openai
82
+ ```
31
83
 
32
84
  and require with:
33
85
 
@@ -35,14 +87,10 @@ and require with:
35
87
  require "openai"
36
88
  ```
37
89
 
38
- ## Upgrading
39
-
40
- The `::Ruby::OpenAI` module has been removed and all classes have been moved under the top level `::OpenAI` module. To upgrade, change `require 'ruby/openai'` to `require 'openai'` and change all references to `Ruby::OpenAI` to `OpenAI`.
41
-
42
90
  ## Usage
43
91
 
44
- - Get your API key from [https://beta.openai.com/account/api-keys](https://beta.openai.com/account/api-keys)
45
- - If you belong to multiple organizations, you can get your Organization ID from [https://beta.openai.com/account/org-settings](https://beta.openai.com/account/org-settings)
92
+ - Get your API key from [https://platform.openai.com/account/api-keys](https://platform.openai.com/account/api-keys)
93
+ - If you belong to multiple organizations, you can get your Organization ID from [https://platform.openai.com/account/org-settings](https://platform.openai.com/account/org-settings)
46
94
 
47
95
  ### Quickstart
48
96
 
@@ -58,8 +106,8 @@ For a more robust setup, you can configure the gem with your API keys, for examp
58
106
 
59
107
  ```ruby
60
108
  OpenAI.configure do |config|
61
- config.access_token = ENV.fetch('OPENAI_ACCESS_TOKEN')
62
- config.organization_id = ENV.fetch('OPENAI_ORGANIZATION_ID') # Optional.
109
+ config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
110
+ config.organization_id = ENV.fetch("OPENAI_ORGANIZATION_ID") # Optional.
63
111
  end
64
112
  ```
65
113
 
@@ -69,27 +117,95 @@ Then you can create a client like this:
69
117
  client = OpenAI::Client.new
70
118
  ```
71
119
 
72
- #### Setting request timeout
120
+ You can still override the config defaults when making new clients; any options not included fall back to the global config set with `OpenAI.configure`. For example, here the `organization_id`, `request_timeout`, etc. fall back to the values set globally with `OpenAI.configure`, and only the `access_token` is overridden:
73
121
 
74
- The default timeout for any OpenAI request is 120 seconds. You can change that passing the `request_timeout` when initializing the client:
122
+ ```ruby
123
+ client = OpenAI::Client.new(access_token: "access_token_goes_here")
124
+ ```
125
+
126
+ #### Custom timeout or base URI
127
+
128
+ The default timeout for any request using this library is 120 seconds. You can change that by passing a number of seconds as `request_timeout` when initializing the client. You can also change the base URI used for all requests, e.g. to use observability tools like [Helicone](https://docs.helicone.ai/quickstart/integrate-in-one-line-of-code), and add arbitrary other headers, e.g. for [openai-caching-proxy-worker](https://github.com/6/openai-caching-proxy-worker):
75
129
 
76
130
  ```ruby
77
- client = OpenAI::Client.new(access_token: "access_token_goes_here", request_timeout: 25)
131
+ client = OpenAI::Client.new(
132
+ access_token: "access_token_goes_here",
133
+ uri_base: "https://oai.hconeai.com/",
134
+ request_timeout: 240,
135
+ extra_headers: {
136
+ "X-Proxy-TTL" => "43200", # For https://github.com/6/openai-caching-proxy-worker#specifying-a-cache-ttl
137
+ "X-Proxy-Refresh": "true", # For https://github.com/6/openai-caching-proxy-worker#refreshing-the-cache
138
+ "Helicone-Auth": "Bearer HELICONE_API_KEY", # For https://docs.helicone.ai/getting-started/integration-method/openai-proxy
139
+ "helicone-stream-force-format" => "true", # Use this with Helicone otherwise streaming drops chunks # https://github.com/alexrudall/ruby-openai/issues/251
140
+ }
141
+ )
78
142
  ```
79
143
 
80
144
  or when configuring the gem:
81
145
 
146
+ ```ruby
147
+ OpenAI.configure do |config|
148
+ config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
149
+ config.organization_id = ENV.fetch("OPENAI_ORGANIZATION_ID") # Optional
150
+ config.uri_base = "https://oai.hconeai.com/" # Optional
151
+ config.request_timeout = 240 # Optional
152
+ config.extra_headers = {
153
+ "X-Proxy-TTL" => "43200", # For https://github.com/6/openai-caching-proxy-worker#specifying-a-cache-ttl
154
+ "X-Proxy-Refresh": "true", # For https://github.com/6/openai-caching-proxy-worker#refreshing-the-cache
155
+ "Helicone-Auth": "Bearer HELICONE_API_KEY" # For https://docs.helicone.ai/getting-started/integration-method/openai-proxy
156
+ } # Optional
157
+ end
158
+ ```
159
+
160
+ #### Extra Headers per Client
161
+
162
+ You can dynamically pass headers per client object, which will be merged with any headers set globally with OpenAI.configure:
163
+
164
+ ```ruby
165
+ client = OpenAI::Client.new(access_token: "access_token_goes_here")
166
+ client.add_headers("X-Proxy-TTL" => "43200")
167
+ ```
168
+
169
+ #### Verbose Logging
170
+
171
+ You can pass [Faraday middleware](https://lostisland.github.io/faraday/#/middleware/index) to the client in a block, e.g. to enable verbose logging with Ruby's [Logger](https://ruby-doc.org/3.2.2/stdlibs/logger/Logger.html):
172
+
173
+ ```ruby
174
+ client = OpenAI::Client.new do |f|
175
+ f.response :logger, Logger.new($stdout), bodies: true
176
+ end
177
+ ```
178
+
179
+ #### Azure
180
+
181
+ To use the [Azure OpenAI Service](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/) API, you can configure the gem like this:
182
+
82
183
  ```ruby
83
184
  OpenAI.configure do |config|
84
- config.access_token = ENV.fetch('OPENAI_ACCESS_TOKEN')
85
- config.organization_id = ENV.fetch('OPENAI_ORGANIZATION_ID') # Optional.
86
- config.request_timeout = 25 # Optional
185
+ config.access_token = ENV.fetch("AZURE_OPENAI_API_KEY")
186
+ config.uri_base = ENV.fetch("AZURE_OPENAI_URI")
187
+ config.api_type = :azure
188
+ config.api_version = "2023-03-15-preview"
87
189
  end
88
190
  ```
89
191
 
192
+ where `AZURE_OPENAI_URI` is e.g. `https://custom-domain.openai.azure.com/openai/deployments/gpt-35-turbo`
193
+
194
+ ### Counting Tokens
195
+
196
+ OpenAI parses prompt text into [tokens](https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them), which are words or portions of words. (These tokens are unrelated to your API access_token.) Counting tokens can help you estimate your [costs](https://openai.com/pricing). It can also help you ensure your prompt text size is within the max-token limits of your model's context window, and choose an appropriate [`max_tokens`](https://platform.openai.com/docs/api-reference/chat/create#chat/create-max_tokens) completion parameter so your response will fit as well.
197
+
198
+ To estimate the token-count of your text:
199
+
200
+ ```ruby
201
+ OpenAI.rough_token_count("Your text")
202
+ ```
203
+
204
+ If you need a more accurate count, try [tiktoken_ruby](https://github.com/IAPark/tiktoken_ruby).
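As a rough illustration of how a token estimate can be used to budget a request, here is a minimal sketch. It does not call the gem: `estimate_tokens` and `fits_context?` are hypothetical helpers, and the figure of about 4 characters per token is only a common rule of thumb for English text, not how `OpenAI.rough_token_count` is implemented.

```ruby
# Hypothetical helper, not part of ruby-openai.
# Assumes roughly 4 characters per token, which is only an approximation.
def estimate_tokens(text, chars_per_token: 4.0)
  return 0 if text.nil? || text.empty?

  (text.length / chars_per_token).ceil
end

# Hypothetical budget check: prompt tokens plus the requested `max_tokens`
# must fit inside the model's context window.
def fits_context?(prompt, max_tokens:, context_window:)
  estimate_tokens(prompt) + max_tokens <= context_window
end

prompt = "Your text"
puts estimate_tokens(prompt)                                        # prints 3
puts fits_context?(prompt, max_tokens: 256, context_window: 4096)   # prints true
```

The context window size passed here (4096) is just an example value; check the documentation of the model you are using.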
205
+
90
206
  ### Models
91
207
 
92
- There are different models that can be used to generate text. For a full list and to retrieve information about a single models:
208
+ There are different models that can be used to generate text. For a full list and to retrieve information about a single model:
93
209
 
94
210
  ```ruby
95
211
  client.models.list
@@ -98,18 +214,22 @@ client.models.retrieve(id: "text-ada-001")
98
214
 
99
215
  #### Examples
100
216
 
101
- - [GPT-3](https://beta.openai.com/docs/models/gpt-3)
217
+ - [GPT-4 (limited beta)](https://platform.openai.com/docs/models/gpt-4)
218
+ - gpt-4 (uses current version)
219
+ - gpt-4-0314
220
+ - gpt-4-32k
221
+ - [GPT-3.5](https://platform.openai.com/docs/models/gpt-3-5)
222
+ - gpt-3.5-turbo
223
+ - gpt-3.5-turbo-0301
224
+ - text-davinci-003
225
+ - [GPT-3](https://platform.openai.com/docs/models/gpt-3)
102
226
  - text-ada-001
103
227
  - text-babbage-001
104
228
  - text-curie-001
105
- - text-davinci-001
106
- - [Codex (private beta)](https://beta.openai.com/docs/models/codex-series-private-beta)
107
- - code-davinci-002
108
- - code-cushman-001
109
229
 
110
- ### ChatGPT
230
+ ### Chat
111
231
 
112
- ChatGPT is a model that can be used to generate text in a conversational style. You can use it to [generate a response](https://platform.openai.com/docs/api-reference/chat/create) to a sequence of [messages](https://platform.openai.com/docs/guides/chat/introduction):
232
+ GPT is a model that can be used to generate text in a conversational style. You can use it to [generate a response](https://platform.openai.com/docs/api-reference/chat/create) to a sequence of [messages](https://platform.openai.com/docs/guides/chat/introduction):
113
233
 
114
234
  ```ruby
115
235
  response = client.chat(
@@ -122,6 +242,163 @@ puts response.dig("choices", 0, "message", "content")
122
242
  # => "Hello! How may I assist you today?"
123
243
  ```
124
244
 
245
+ #### Streaming Chat
246
+
247
+ [Quick guide to streaming Chat with Rails 7 and Hotwire](https://gist.github.com/alexrudall/cb5ee1e109353ef358adb4e66631799d)
248
+
249
+ You can stream from the API in realtime, which can be much faster and used to create a more engaging user experience. Pass a [Proc](https://ruby-doc.org/core-2.6/Proc.html) (or any object with a `#call` method) to the `stream` parameter to receive the stream of completion chunks as they are generated. Each time one or more chunks is received, the proc will be called once with each chunk, parsed as a Hash. If OpenAI returns an error, `ruby-openai` will raise a Faraday error.
250
+
251
+ ```ruby
252
+ client.chat(
253
+ parameters: {
254
+ model: "gpt-3.5-turbo", # Required.
255
+ messages: [{ role: "user", content: "Describe a character called Anna!"}], # Required.
256
+ temperature: 0.7,
257
+ stream: proc do |chunk, _bytesize|
258
+ print chunk.dig("choices", 0, "delta", "content")
259
+ end
260
+ })
261
+ # => "Anna is a young woman in her mid-twenties, with wavy chestnut hair that falls to her shoulders..."
262
+ ```
263
+
264
+ Note: OpenAI currently does not report token usage for streaming responses. To count tokens while streaming, try `OpenAI.rough_token_count` or [tiktoken_ruby](https://github.com/IAPark/tiktoken_ruby). We think that each call to the stream proc corresponds to a single token, so you can also try counting the number of calls to the proc to get the completion token count.
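To show what collecting the streamed text and counting proc calls might look like, here is a small sketch that runs without the API. The `chunks` array is made-up data shaped like the parsed Hashes the proc receives; the assumption that the first and last chunks carry no `content` is illustrative, so the proc skips `nil` content rather than relying on it.

```ruby
# Simulated chunks (illustrative data, not real API output).
chunks = [
  { "choices" => [{ "delta" => { "role" => "assistant" } }] },
  { "choices" => [{ "delta" => { "content" => "Anna" } }] },
  { "choices" => [{ "delta" => { "content" => " is" } }] },
  { "choices" => [{ "delta" => { "content" => " kind." } }] },
  { "choices" => [{ "delta" => {}, "finish_reason" => "stop" }] }
]

full_text = +""      # unfrozen string that accumulates the reply
content_chunks = 0   # number of chunks that carried text

# Same signature as the proc passed to `stream:` in the example above.
on_chunk = proc do |chunk, _bytesize|
  content = chunk.dig("choices", 0, "delta", "content")
  next if content.nil? # skip chunks without text

  full_text << content
  content_chunks += 1
end

# In real use the gem calls the proc; here we feed it the simulated chunks.
chunks.each { |chunk| on_chunk.call(chunk, 0) }

puts full_text       # prints "Anna is kind."
puts content_chunks  # prints 3
```

In real use you would pass `on_chunk` as the `stream:` parameter and read `full_text` once the call returns.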
265
+
266
+ #### Vision
267
+
268
+ You can use the GPT-4 Vision model to generate a description of an image:
269
+
270
+ ```ruby
271
+ messages = [
272
+ { "type": "text", "text": "What’s in this image?"},
273
+ { "type": "image_url",
274
+ "image_url": {
275
+ "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
276
+ },
277
+ }
278
+ ]
279
+ response = client.chat(
280
+ parameters: {
281
+ model: "gpt-4-vision-preview", # Required.
282
+ messages: [{ role: "user", content: messages}], # Required.
283
+ })
284
+ puts response.dig("choices", 0, "message", "content")
285
+ # => "The image depicts a serene natural landscape featuring a long wooden boardwalk extending straight ahead"
286
+ ```
287
+
288
+ #### JSON Mode
289
+
290
+ You can set the response_format to ask for responses in JSON (at least for `gpt-3.5-turbo-1106`):
291
+
292
+ ```ruby
293
+ response = client.chat(
294
+ parameters: {
295
+ model: "gpt-3.5-turbo-1106",
296
+ response_format: { type: "json_object" },
297
+ messages: [{ role: "user", content: "Hello! Give me some JSON please."}],
298
+ temperature: 0.7,
299
+ })
300
+ puts response.dig("choices", 0, "message", "content")
301
+ {
302
+ "name": "John",
303
+ "age": 30,
304
+ "city": "New York",
305
+ "hobbies": ["reading", "traveling", "hiking"],
306
+ "isStudent": false
307
+ }
308
+ ```
309
+
310
+ You can stream it as well!
311
+
312
+ ```ruby
313
+ response = client.chat(
314
+ parameters: {
315
+ model: "gpt-3.5-turbo-1106",
316
+ messages: [{ role: "user", content: "Can I have some JSON please?"}],
317
+ response_format: { type: "json_object" },
318
+ stream: proc do |chunk, _bytesize|
319
+ print chunk.dig("choices", 0, "delta", "content")
320
+ end
321
+ })
322
+ {
323
+ "message": "Sure, please let me know what specific JSON data you are looking for.",
324
+ "JSON_data": {
325
+ "example_1": {
326
+ "key_1": "value_1",
327
+ "key_2": "value_2",
328
+ "key_3": "value_3"
329
+ },
330
+ "example_2": {
331
+ "key_4": "value_4",
332
+ "key_5": "value_5",
333
+ "key_6": "value_6"
334
+ }
335
+ }
336
+ }
337
+ ```
338
+
339
+ ### Functions
340
+
341
+ You can describe and pass in functions, and the model will intelligently choose to output a JSON object containing arguments to call them. For example, if you want the model to use your method `get_current_weather` to get the current weather in a given location:
342
+
343
+ ```ruby
344
+ def get_current_weather(location:, unit: "fahrenheit")
345
+ # use a weather api to fetch weather
346
+ end
347
+
348
+ response =
349
+ client.chat(
350
+ parameters: {
351
+ model: "gpt-3.5-turbo-0613",
352
+ messages: [
353
+ {
354
+ "role": "user",
355
+ "content": "What is the weather like in San Francisco?",
356
+ },
357
+ ],
358
+ tools: [
359
+ {
360
+ type: "function",
361
+ function: {
362
+ name: "get_current_weather",
363
+ description: "Get the current weather in a given location",
364
+ parameters: {
365
+ type: :object,
366
+ properties: {
367
+ location: {
368
+ type: :string,
369
+ description: "The city and state, e.g. San Francisco, CA",
370
+ },
371
+ unit: {
372
+ type: "string",
373
+ enum: %w[celsius fahrenheit],
374
+ },
375
+ },
376
+ required: ["location"],
377
+ },
378
+ },
379
+ }
380
+ ],
381
+ },
382
+ )
383
+
384
+ message = response.dig("choices", 0, "message")
385
+
386
+ if message["role"] == "assistant" && message["tool_calls"]
387
+ function_name = message.dig("tool_calls", 0, "function", "name") # tool_calls is an Array
388
+ args =
389
+ JSON.parse(
390
+ message.dig("tool_calls", 0, "function", "arguments"),
391
+ { symbolize_names: true },
392
+ )
393
+
394
+ case function_name
395
+ when "get_current_weather"
396
+ get_current_weather(**args)
397
+ end
398
+ end
399
+ # => "The weather is nice 🌞"
400
+ ```
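Because `tool_calls` is an array, the model may request more than one call in a single message. The sketch below shows one way to dispatch every requested call through a lookup table. It runs offline: the `message` Hash is hand-written sample data in the shape described above, the `call_abc123` id is made up, and `handlers` is a hypothetical name introduced for illustration.

```ruby
require "json"

def get_current_weather(location:, unit: "fahrenheit")
  "Weather in #{location} (#{unit})" # placeholder instead of a real weather API
end

# Hand-written sample in the shape of an assistant message with tool calls.
message = {
  "role" => "assistant",
  "tool_calls" => [
    {
      "id" => "call_abc123",
      "type" => "function",
      "function" => {
        "name" => "get_current_weather",
        "arguments" => '{"location":"San Francisco, CA","unit":"celsius"}'
      }
    }
  ]
}

# Map function names the model may request to Ruby callables.
handlers = { "get_current_weather" => method(:get_current_weather) }

results = (message["tool_calls"] || []).map do |tool_call|
  name = tool_call.dig("function", "name")
  handler = handlers[name]

  if handler
    # Arguments arrive as a JSON string, so parse them before calling.
    args = JSON.parse(tool_call.dig("function", "arguments"), symbolize_names: true)
    { id: tool_call["id"], output: handler.call(**args) }
  else
    { id: tool_call["id"], error: "unknown function #{name}" }
  end
end

puts results.inspect
```

Using an explicit table means a function name the model invents is reported as an error instead of being called.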
401
+
125
402
  ### Completions
126
403
 
127
404
  Hit the OpenAI API for a completion using other GPT-3 models:
@@ -158,12 +435,15 @@ puts response.dig("choices", 0, "text")
158
435
  You can use the embeddings endpoint to get a vector of numbers representing an input. You can then compare these vectors for different inputs to efficiently check how similar the inputs are.
159
436
 
160
437
  ```ruby
161
- client.embeddings(
438
+ response = client.embeddings(
162
439
  parameters: {
163
- model: "babbage-similarity",
440
+ model: "text-embedding-ada-002",
164
441
  input: "The food was delicious and the waiter..."
165
442
  }
166
443
  )
444
+
445
+ puts response.dig("data", 0, "embedding")
446
+ # => Vector representation of your embedding
167
447
  ```
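One common way to compare two embedding vectors is cosine similarity. The sketch below is plain Ruby and does not call the API: the short vectors are toy values standing in for real embeddings, and `cosine_similarity` is a hypothetical helper, not something the gem provides.

```ruby
# Hypothetical helper: cosine similarity of two equal-length numeric arrays.
# Returns a value between -1.0 and 1.0; higher means more similar.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |y| y * y })
  return 0.0 if norm_a.zero? || norm_b.zero? # avoid dividing by zero

  dot / (norm_a * norm_b)
end

# Toy vectors; real embeddings would come from response.dig("data", 0, "embedding").
same_direction = cosine_similarity([1.0, 0.0, 1.0], [2.0, 0.0, 2.0])
orthogonal     = cosine_similarity([1.0, 0.0, 1.0], [0.0, 1.0, 0.0])

puts same_direction.round(3)
puts orthogonal.round(3)
```

Vectors pointing in the same direction score close to 1, while unrelated (orthogonal) vectors score 0.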
168
448
 
169
449
  ### Files
@@ -180,29 +460,29 @@ and pass the path to `client.files.upload` to upload it to OpenAI, and then inte
180
460
  ```ruby
181
461
  client.files.upload(parameters: { file: "path/to/sentiment.jsonl", purpose: "fine-tune" })
182
462
  client.files.list
183
- client.files.retrieve(id: 123)
184
- client.files.content(id: 123)
185
- client.files.delete(id: 123)
463
+ client.files.retrieve(id: "file-123")
464
+ client.files.content(id: "file-123")
465
+ client.files.delete(id: "file-123")
186
466
  ```
187
467
 
188
- ### Fine-tunes
468
+ ### Finetunes
189
469
 
190
470
  Upload your fine-tuning data in a `.jsonl` file as above and get its ID:
191
471
 
192
472
  ```ruby
193
- response = client.files.upload(parameters: { file: "path/to/sentiment.jsonl", purpose: "fine-tune" })
473
+ response = client.files.upload(parameters: { file: "path/to/sarcasm.jsonl", purpose: "fine-tune" })
194
474
  file_id = JSON.parse(response.body)["id"]
195
475
  ```
196
476
 
197
- You can then use this file ID to create a fine-tune model:
477
+ You can then use this file ID to create a fine-tuning job:
198
478
 
199
479
  ```ruby
200
480
  response = client.finetunes.create(
201
481
  parameters: {
202
482
  training_file: file_id,
203
- model: "text-ada-001"
483
+ model: "gpt-3.5-turbo-0613"
204
484
  })
205
- fine_tune_id = JSON.parse(response.body)["id"]
485
+ fine_tune_id = response["id"]
206
486
  ```
207
487
 
208
488
  That will give you the fine-tune ID. If you made a mistake you can cancel the fine-tune model before it is processed:
@@ -216,31 +496,242 @@ You may need to wait a short time for processing to complete. Once processed, yo
216
496
  ```ruby
217
497
  client.finetunes.list
218
498
  response = client.finetunes.retrieve(id: fine_tune_id)
219
- fine_tuned_model = JSON.parse(response.body)["fine_tuned_model"]
499
+ fine_tuned_model = response["fine_tuned_model"]
220
500
  ```
221
501
 
222
- This fine-tuned model name can then be used in completions:
502
+ This fine-tuned model name can then be used in chat completions:
223
503
 
224
504
  ```ruby
225
- response = client.completions(
505
+ response = client.chat(
226
506
  parameters: {
227
507
  model: fine_tuned_model,
228
- prompt: "I love Mondays!"
508
+ messages: [{ role: "user", content: "I love Mondays!"}]
229
509
  }
230
510
  )
231
- JSON.parse(response.body)["choices"].map { |c| c["text"] }
511
+ response.dig("choices", 0, "message", "content")
512
+ ```
513
+
514
+ You can also capture the events for a job:
515
+
516
+ ```ruby
517
+ client.finetunes.list_events(id: fine_tune_id)
518
+ ```
519
+
520
+ ### Assistants
521
+
522
+ Assistants can call models to interact with threads and use tools to perform tasks (see [Assistant Overview](https://platform.openai.com/docs/assistants/overview)).
523
+
524
+ To create a new assistant (see [API documentation](https://platform.openai.com/docs/api-reference/assistants/createAssistant)):
525
+
526
+ ```ruby
527
+ response = client.assistants.create(
528
+ parameters: {
529
+ model: "gpt-3.5-turbo-1106", # Retrieve via client.models.list. Assistants need 'gpt-3.5-turbo-1106' or later.
530
+ name: "OpenAI-Ruby test assistant",
531
+ description: nil,
532
+ instructions: "You are a helpful assistant for coding an OpenAI API client using the OpenAI-Ruby gem.",
533
+ tools: [
534
+ { type: 'retrieval' }, # Allow access to files attached using file_ids
535
+ { type: 'code_interpreter' }, # Allow access to Python code interpreter
536
+ ],
537
+ "file_ids": ["file-123"], # See Files section above for how to upload files
538
+ "metadata": { my_internal_version_id: '1.0.0' }
539
+ })
540
+ assistant_id = response["id"]
541
+ ```
542
+
543
+ Given an `assistant_id` you can `retrieve` the current field values:
544
+
545
+ ```ruby
546
+ client.assistants.retrieve(id: assistant_id)
547
+ ```
548
+
549
+ You can get a `list` of all assistants currently available under the organization:
550
+
551
+ ```ruby
552
+ client.assistants.list
232
553
  ```
233
554
 
234
- You can delete the fine-tuned model when you are done with it:
555
+ You can modify an existing assistant using the assistant's id (see [API documentation](https://platform.openai.com/docs/api-reference/assistants/modifyAssistant)):
235
556
 
236
557
  ```ruby
237
- client.finetunes.delete(fine_tuned_model: fine_tuned_model)
558
+ response = client.assistants.modify(
559
+ id: assistant_id,
560
+ parameters: {
561
+ name: "Modified Test Assistant for OpenAI-Ruby",
562
+ metadata: { my_internal_version_id: '1.0.1' }
563
+ })
564
+ ```
565
+
566
+ You can delete assistants:
567
+
568
+ ```ruby
569
+ client.assistants.delete(id: assistant_id)
238
570
  ```
239
571
 
572
+ ### Threads and Messages
573
+
574
+ Once you have created an assistant as described above, you need to prepare a `Thread` of `Messages` for the assistant to work on (see [introduction on Assistants](https://platform.openai.com/docs/assistants/how-it-works)). For example, as an initial setup you could do:
575
+
576
+ ```ruby
577
+ # Create thread
578
+ response = client.threads.create # Note: Once you create a thread, there is no way to list it
579
+ # or recover it currently (as of 2023-12-10). So hold onto the `id`
580
+ thread_id = response["id"]
581
+
582
+ # Add initial message from user (see https://platform.openai.com/docs/api-reference/messages/createMessage)
583
+ message_id = client.messages.create(
584
+ thread_id: thread_id,
585
+ parameters: {
586
+ role: "user", # Required for manually created messages
587
+ content: "Can you help me write an API library to interact with the OpenAI API please?"
588
+ })["id"]
589
+
590
+ # Retrieve individual message
591
+ message = client.messages.retrieve(thread_id: thread_id, id: message_id)
592
+
593
+ # Review all messages on the thread
594
+ messages = client.messages.list(thread_id: thread_id)
595
+ ```
596
+
597
+ To clean up after a thread is no longer needed:
598
+
599
+ ```ruby
600
+ # To delete the thread (and all associated messages):
601
+ client.threads.delete(id: thread_id)
602
+
603
+ client.messages.retrieve(thread_id: thread_id, id: message_id) # -> Fails after thread is deleted
604
+ ```
605
+
606
+ ### Runs
607
+
608
+ To submit a thread to be evaluated with the model of an assistant, create a `Run` as follows (Note: This is one place where OpenAI will take your money):
609
+
610
+ ```ruby
611
+ # Create run (will use instruction/model/tools from Assistant's definition)
612
+ response = client.runs.create(thread_id: thread_id,
613
+ parameters: {
614
+ assistant_id: assistant_id
615
+ })
616
+ run_id = response['id']
617
+
618
+ # Retrieve/poll Run to observe status
619
+ response = client.runs.retrieve(id: run_id, thread_id: thread_id)
620
+ status = response['status']
621
+ ```
622
+
623
+ The `status` response can be one of the following strings: `queued`, `in_progress`, `requires_action`, `cancelling`, `cancelled`, `failed`, `completed`, or `expired`. You can handle them as follows:
624
+
625
+ ```ruby
626
+ while true do
627
+
628
+ response = client.runs.retrieve(id: run_id, thread_id: thread_id)
629
+ status = response['status']
630
+
631
+ case status
632
+ when 'queued', 'in_progress', 'cancelling'
633
+ puts 'Sleeping'
634
+ sleep 1 # Wait one second and poll again
635
+ when 'completed'
636
+ break # Exit loop and report result to user
637
+ when 'requires_action'
638
+ # Handle tool calls (see below)
639
+ when 'cancelled', 'failed', 'expired'
640
+ puts response['last_error'].inspect
641
+ break # or `exit`
642
+ else
643
+ puts "Unknown status response: #{status}"
644
+ end
645
+ end
646
+ ```
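A polling loop is usually safer with a deadline, so a stuck run cannot block forever. The sketch below shows one possible shape. `wait_for_run` is a hypothetical helper, and a lambda returning canned statuses stands in for `client.runs.retrieve(...)["status"]` so the example runs offline.

```ruby
# Hypothetical helper: polls `fetch_status` until the run leaves a pending
# state or the deadline passes. Returns the first non-pending status
# (including "requires_action", which the caller must handle).
def wait_for_run(fetch_status, timeout: 60, interval: 1)
  pending = %w[queued in_progress cancelling]
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    status = fetch_status.call
    return status unless pending.include?(status)

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "Run did not finish within #{timeout} seconds (last status: #{status})"
    end

    sleep interval
  end
end

# Stand-in for the API: yields a fixed sequence of statuses.
statuses = %w[queued in_progress completed]
fake_fetch = -> { statuses.shift }

final_status = wait_for_run(fake_fetch, timeout: 5, interval: 0)
puts final_status # prints "completed"
```

With the real client, the lambda would wrap the `client.runs.retrieve` call shown above and return its `status` field.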
647
+
648
+ If the `status` response indicates that the `run` is `completed`, the associated `thread` will have one or more new `messages` attached:
649
+
650
+ ```ruby
651
+ # Either retrieve all messages in bulk again, or...
652
+ messages = client.messages.list(thread_id: thread_id) # Note: as of 2023-12-11 adding limit or order options isn't working yet
653
+
654
+ # Alternatively retrieve the `run steps` for the run which link to the messages:
655
+ run_steps = client.run_steps.list(thread_id: thread_id, run_id: run_id)
656
+ new_message_ids = run_steps['data'].filter_map { |step|
657
+ if step['type'] == 'message_creation'
658
+ step.dig('step_details', "message_creation", "message_id")
659
+ end # Ignore tool calls, because they don't create new messages.
660
+ }
661
+
662
+ # Retrieve the individual messages
663
+ new_messages = new_message_ids.map { |msg_id|
664
+ client.messages.retrieve(id: msg_id, thread_id: thread_id)
665
+ }
666
+
667
+ # Find the actual response text in the content array of the messages
668
+ new_messages.each { |msg|
669
+ msg['content'].each { |content_item|
670
+ case content_item['type']
671
+ when 'text'
672
+ puts content_item.dig('text', 'value')
673
+ # Also handle annotations
674
+ when 'image_file'
675
+ # Use File endpoint to retrieve file contents via id
676
+ id = content_item.dig('image_file', 'file_id')
677
+ end
678
+ }
679
+ }
680
+ ```
681
+
682
+ At any time you can list all runs which have been performed on a particular thread or are currently running (in descending/newest first order):
683
+
684
+ ```ruby
685
+ client.runs.list(thread_id: thread_id)
686
+ ```
687
+
688
+ #### Runs involving function tools
689
+
690
+ If you allow the assistant to access `function` tools (they are defined in the same way as functions during chat completion), you might get a status of `requires_action` when the assistant wants you to evaluate one or more function tools:
691
+
692
+ ```ruby
693
+ def get_current_weather(location:, unit: "celsius")
694
+ # Your function code goes here
695
+ if location =~ /San Francisco/i
696
+ return unit == "celsius" ? "The weather is nice 🌞 at 27°C" : "The weather is nice 🌞 at 80°F"
697
+ else
698
+ return unit == "celsius" ? "The weather is icy 🥶 at -5°C" : "The weather is icy 🥶 at 23°F"
699
+ end
700
+ end
701
+
702
+ if status == 'requires_action'
703
+
704
+ tools_to_call = response.dig('required_action', 'submit_tool_outputs', 'tool_calls')
705
+
706
+ my_tool_outputs = tools_to_call.map { |tool|
707
+ # Call the functions based on the tool's name
708
+ function_name = tool.dig('function', 'name')
709
+ arguments = JSON.parse(
710
+ tool.dig("function", "arguments"),
711
+ { symbolize_names: true },
712
+ )
713
+
714
+ tool_output = case function_name
715
+ when "get_current_weather"
716
+ get_current_weather(**arguments)
717
+ end
718
+
719
+ { tool_call_id: tool['id'], output: tool_output }
720
+ }
721
+
722
+ client.runs.submit_tool_outputs(thread_id: thread_id, run_id: run_id, parameters: { tool_outputs: my_tool_outputs })
723
+ end
724
+ ```
725
+
726
+ Note that you have 10 minutes to submit your tool output before the run expires.
727
+
240
728
  ### Image Generation
241
729
 
242
- Generate an image using DALL·E! The size of any generated images must be one of `256x256`, `512x512` or `1024x1024` -
243
- if not specified the image will default to `1024x1024`.
730
+ Generate images using DALL·E 2 or DALL·E 3!
731
+
732
+ #### DALL·E 2
733
+
734
+ For DALL·E 2 the size of any generated images must be one of `256x256`, `512x512` or `1024x1024` - if not specified the image will default to `1024x1024`.
244
735
 
245
736
  ```ruby
246
737
  response = client.images.generate(parameters: { prompt: "A baby sea otter cooking pasta wearing a hat of some sort", size: "256x256" })
@@ -250,6 +741,18 @@ puts response.dig("data", 0, "url")
250
741
 
251
742
  ![Ruby](https://i.ibb.co/6y4HJFx/img-d-Tx-Rf-RHj-SO5-Gho-Cbd8o-LJvw3.png)
252
743
 
744
+ #### DALL·E 3
745
+
746
+ For DALL·E 3 the size of any generated images must be one of `1024x1024`, `1024x1792` or `1792x1024`. Additionally the quality of the image can be specified to either `standard` or `hd`.
747
+
748
+ ```ruby
749
+ response = client.images.generate(parameters: { prompt: "A springer spaniel cooking pasta wearing a hat of some sort", size: "1024x1792", quality: "standard" })
750
+ puts response.dig("data", 0, "url")
751
+ # => "https://oaidalleapiprodscus.blob.core.windows.net/private/org-Rf437IxKhh..."
752
+ ```
753
+
754
+ ![Ruby](https://i.ibb.co/z2tCKv9/img-Goio0l-S0i81-NUNa-BIx-Eh-CT6-L.png)
755
+
253
756
  ### Image Edit
254
757
 
255
758
  Fill in the transparent part of an image, or upload a mask with transparent sections to indicate the parts of an image that can be changed according to your prompt...
@@ -294,12 +797,12 @@ Whisper is a speech to text model that can be used to generate text based on aud
294
797
  The translations API takes as input the audio file in any of the supported languages and transcribes the audio into English.
295
798
 
296
799
  ```ruby
297
- response = client.translate(
800
+ response = client.audio.translate(
298
801
  parameters: {
299
802
  model: "whisper-1",
300
- file: File.open('path_to_file'),
803
+ file: File.open("path_to_file", "rb"),
301
804
  })
302
- puts response.parsed_response['text']
805
+ puts response["text"]
303
806
  # => "Translation of the text"
304
807
  ```
305
808
 
@@ -307,28 +810,64 @@ puts response.parsed_response['text']
307
810
 
308
811
  The transcriptions API takes as input the audio file you want to transcribe and returns the text in the desired output file format.
309
812
 
813
+ You can pass the language of the audio file to improve transcription quality. Supported languages are listed [here](https://github.com/openai/whisper#available-models-and-languages). You need to provide the language as an ISO-639-1 code, e.g. "en" for English or "ne" for Nepali. You can look up the codes [here](https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes).
814
+
310
815
  ```ruby
311
- response = client.transcribe(
816
+ response = client.audio.transcribe(
312
817
  parameters: {
313
818
  model: "whisper-1",
314
- file: File.open('path_to_file'),
819
+ file: File.open("path_to_file", "rb"),
820
+ language: "en" # Optional.
315
821
  })
316
- puts response.parsed_response['text']
822
+ puts response["text"]
317
823
  # => "Transcription of the text"
318
824
  ```
319
825
 
826
+ #### Speech
827
+
828
+ The speech API takes as input the text and a voice and returns the content of an audio file you can listen to.
829
+
830
+ ```ruby
831
+ response = client.audio.speech(
832
+ parameters: {
833
+ model: "tts-1",
834
+ input: "This is a speech test!",
835
+ voice: "alloy"
836
+ }
837
+ )
838
+ File.binwrite('demo.mp3', response)
839
+ # => mp3 file that plays: "This is a speech test!"
840
+ ```
841
+
842
+ ### Errors
843
+
844
+ HTTP errors can be caught like this:
845
+
846
+ ```ruby
847
+ begin
848
+ OpenAI::Client.new.models.retrieve(id: "text-ada-001")
849
+ rescue Faraday::Error => e
850
+ raise "Got a Faraday error: #{e}"
851
+ end
852
+ ```
853
+
320
854
  ## Development
321
855
 
322
856
  After checking out the repo, run `bin/setup` to install dependencies. You can run `bin/console` for an interactive prompt that will allow you to experiment.
323
857
 
324
858
  To install this gem onto your local machine, run `bundle exec rake install`.
325
859
 
860
+ To run all tests, execute the command `bundle exec rake`, which will also run the linter (Rubocop). This repository uses [VCR](https://github.com/vcr/vcr) to log API requests.
861
+
862
+ > [!WARNING]
863
+ > If you have an `OPENAI_ACCESS_TOKEN` in your `ENV`, running the specs will use this to run the specs against the actual API, which will be slow and cost you money - 2 cents or more! Remove it from your environment with `unset` or similar if you just want to run the specs against the stored VCR responses.
864
+
326
865
  ## Release
327
866
 
328
- First run the specs without VCR so they actually hit the API. This will cost about 2 cents. You'll need to add your `OPENAI_ACCESS_TOKEN=` in `.env`.
867
+ First run the specs without VCR so they actually hit the API. This will cost 2 cents or more. Set OPENAI_ACCESS_TOKEN in your environment or pass it in like this:
329
868
 
330
869
  ```
331
- NO_VCR=true bundle exec rspec
870
+ OPENAI_ACCESS_TOKEN=123abc bundle exec rspec
332
871
  ```
333
872
 
334
873
  Then update the version number in `version.rb`, update `CHANGELOG.md`, run `bundle install` to update Gemfile.lock, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).