ai-chat 0.2.0 → 0.2.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 1b559dc1098b7391dbca24aea20c9c631ec770eecaf125dc118a1358d62fba39
- data.tar.gz: 8e5c4b588ed741e7e07d7cbfa975764e7c8df6bb7ebb27ab7d3f1730924bdca7
+ metadata.gz: 6b050afeef6a27a67c0125c131e9f2825a0201cf1c1781f7f87750705b150ea8
+ data.tar.gz: d87412fd5c1439eaad5eba3d919b6cbb7dfc795e762199beaeb28825dc1d0281
  SHA512:
- metadata.gz: 934e8b03fee2aade7ec67eb122d78c1d271af3681f8a4ac4712f4ec8e1132a36a2cf9291167d05a44fa2d0a7e7c9096e1ce767ccb277de929e5acc783bf1ff52
- data.tar.gz: d9edb5b4a0a2fb8da9cab3ad380ef21f66a054438c57bcc202769e4b5688a90e9820ff77a73331b4dceccb7f06bbebfa675264856cc1810fb77eb686575373af
+ metadata.gz: d7e6064820465b1ce64d2fa551e5e92fdf4bb74f6a817e4473a28f69541e431cd84deec12dce1df08fba16041baf27f120e7b37ec6721d40201d138f7f563f69
+ data.tar.gz: f13ebe743b083cd8089fa28e1de37750416b03ce0414eb0b754454f16a1b0bd080abb71c6e2c5b966462b6bda5dc9893dfd530060997a177fce493b1e0782eee
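If you want to check these values against a locally fetched copy, here is a minimal sketch (assumptions: the gem was downloaded with `gem fetch ai-chat --version 0.2.1` and unpacked with `tar -xf ai-chat-0.2.1.gem`, since a `.gem` file is itself a tar archive containing `metadata.gz` and `data.tar.gz`):

```ruby
require "digest"

# Compare these digests against the SHA256 entries in checksums.yaml above.
%w[metadata.gz data.tar.gz].each do |file|
  puts "#{file}: #{Digest::SHA256.file(file).hexdigest}"
end
```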
data/README.md CHANGED
@@ -241,6 +241,27 @@ h.last
  # => "Here's how to boil an egg..."
  ```

+ ## Web Search
+
+ To give the model access to real-time information from the internet, we enable the `web_search` feature by default. This uses OpenAI's built-in `web_search_preview` tool.
+
+ ```ruby
+ m = AI::Chat.new
+ m.user("What are the latest developments in the Ruby language?")
+ m.generate! # This may use web search to find current information
+ ```
+
+ **Note:** This feature requires a model that supports the `web_search_preview` tool, such as `gpt-4o` or `gpt-4o-mini`. The gem will attempt to use a compatible model if you have `web_search` enabled.
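Since `model` is a plain accessor on `AI::Chat` (see the `attr_accessor` line in the `lib/ai/chat.rb` diff below), you can also pick a supported model explicitly instead of relying on the gem's fallback. A minimal sketch, not part of the README itself:

```ruby
m = AI::Chat.new              # web_search is enabled by default in 0.2.1
m.model = "gpt-4o-mini"       # assumption: any model that supports web_search_preview works here
m.user("What are the latest developments in the Ruby language?")
m.generate!
```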
+
+ If you don't want the model to use web search, set `web_search` to `false`:
+
+ ```ruby
+ m = AI::Chat.new
+ m.web_search = false
+ m.user("What are the latest developments in the Ruby language?")
+ m.generate! # This definitely won't use web search to find current information
+ ```
+
  ## Structured Output

  Get back Structured Output by setting the `schema` attribute (I suggest using [OpenAI's handy tool for generating the JSON Schema](https://platform.openai.com/docs/guides/structured-outputs)):
@@ -412,18 +433,87 @@ l.generate!

  **Note**: Images should use `image:`/`images:` parameters, while documents should use `file:`/`files:` parameters.

- ## Web Search
+ ## Re-sending old images and files

- To give the model access to real-time information from the internet, you can enable the `web_search` feature. This uses OpenAI's built-in `web_search_preview` tool.
+ Note: if you generate another API request using the same chat, old images and files in the conversation history will not be re-sent by default. If you really want to re-send old images and files, then you must set `previous_response_id` to `nil`:

  ```ruby
- m = AI::Chat.new
- m.web_search = true
- m.user("What are the latest developments in the Ruby language?")
- m.generate! # This may use web search to find current information
+ a = AI::Chat.new
+ a.user("What color is the object in this photo?", image: "thing.png")
+ a.generate! # => "Red"
+ a.user("What is the object in the photo?")
+ a.generate! # => "I don't see a photo"
+
+ b = AI::Chat.new
+ b.user("What color is the object in this photo?", image: "thing.png")
+ b.generate! # => "Red"
+ b.user("What is the object in the photo?")
+ b.previous_response_id = nil
+ b.generate! # => "An apple"
  ```

- **Note:** This feature requires a model that supports the `web_search_preview` tool, such as `gpt-4o` or `gpt-4o-mini`. The gem will attempt to use a compatible model if you have `web_search` enabled.
+ If you don't set `previous_response_id` to `nil`, the model won't have the old image(s) to work with.
+
+ ## Image generation
+
+ You can enable OpenAI's image generation tool:
+
+ ```ruby
+ a = AI::Chat.new
+ a.image_generation = true
+ a.user("Draw a picture of a kitten")
+ a.generate! # => "Here is your picture of a kitten:"
+ ```
+
+ By default, images are saved to `./images`. You can configure a different location:
+
+ ```ruby
+ a = AI::Chat.new
+ a.image_generation = true
+ a.image_folder = "./my_images"
+ a.user("Draw a picture of a kitten")
+ a.generate! # => "Here is your picture of a kitten:"
+ ```
+
+ Images are saved in timestamped subfolders using ISO 8601 basic format. For example:
+ - `./images/20250804T11303912_resp_abc123/001.png`
+ - `./images/20250804T11303912_resp_abc123/002.png` (if multiple images)
+
+ The folder structure ensures images are organized chronologically and by response.
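For reference, a sketch of how those subfolder names are derived, mirroring the `extract_and_save_images` helper added in the `lib/ai/chat.rb` diff below (the response id is a placeholder here):

```ruby
require "fileutils"

response_id = "resp_abc123"                           # placeholder for the actual API response id
timestamp   = Time.now.strftime("%Y%m%dT%H%M%S%2N")   # ISO 8601 basic format, centisecond precision
subfolder   = File.join("./images", "#{timestamp}_#{response_id}")
FileUtils.mkdir_p(subfolder)

# Generated images are then numbered within the subfolder: 001.png, 002.png, ...
first_image_path = File.join(subfolder, "001.png")
```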
+
+ The messages array will now look like this:
+
+ ```ruby
+ pp a.messages
+ # => [
+ # {:role=>"user", :content=>"Draw a picture of a kitten"},
+ # {:role=>"assistant", :content=>"Here is your picture of a kitten:", :images => ["./images/20250804T11303912_resp_abc123/001.png"], :response => #<Response ...>}
+ # ]
+ ```
+
+ You can access the image filenames in several ways:
+
+ ```ruby
+ # From the last message
+ images = a.messages.last[:images]
+ # => ["./images/20250804T11303912_resp_abc123/001.png"]
+
+ # From the response object
+ images = a.messages.last[:response].images
+ # => ["./images/20250804T11303912_resp_abc123/001.png"]
+ ```
+
+ Note: Unlike with user-provided input images, OpenAI _does_ store AI-generated output images. So, if you make another API request using the same chat, previous images generated by the model in the conversation history will automatically be used — you don't have to re-send them. This allows you to easily refine an image with user input over multi-turn chats.
+
+ ```ruby
+ a = AI::Chat.new
+ a.image_generation = true
+ a.image_folder = "./images"
+ a.user("Draw a picture of a kitten")
+ a.generate! # => "Here is a picture of a kitten:"
+ a.user("Make it even cuter")
+ a.generate! # => "Here is the kitten, but even cuter:"
+ ```

  ## Building Conversations Without API Calls

data/ai-chat.gemspec CHANGED
@@ -2,7 +2,7 @@

  Gem::Specification.new do |spec|
  spec.name = "ai-chat"
- spec.version = "0.2.0"
+ spec.version = "0.2.1"
  spec.authors = ["Raghu Betina"]
  spec.email = ["raghu@firstdraft.com"]
  spec.homepage = "https://github.com/firstdraft/ai-chat"
data/lib/ai/chat.rb CHANGED
@@ -6,6 +6,7 @@ require "marcel"
  require "openai"
  require "pathname"
  require "stringio"
+ require "fileutils"

  require_relative "response"

@@ -17,7 +18,7 @@ module AI
  # :reek:IrresponsibleModule
  class Chat
  # :reek:Attribute
- attr_accessor :messages, :model, :web_search, :previous_response_id
+ attr_accessor :messages, :model, :web_search, :previous_response_id, :image_generation, :image_folder
  attr_reader :reasoning_effort, :client, :schema

  VALID_REASONING_EFFORTS = [:low, :medium, :high].freeze
@@ -29,6 +30,8 @@ module AI
  @model = "gpt-4.1-nano"
  @client = OpenAI::Client.new(api_key: api_key)
  @previous_response_id = nil
+ @image_generation = false
+ @image_folder = "./images"
  end

  # :reek:TooManyStatements
@@ -102,6 +105,10 @@ module AI

  text_response = extract_text_from_response(response)

+ image_filenames = extract_and_save_images(response)
+
+ chat_response.images = image_filenames
+
  message = if schema
  if text_response.nil? || text_response.empty?
  raise ArgumentError, "No text content in response to parse as JSON for schema: #{schema.inspect}"
@@ -111,7 +118,18 @@ module AI
  text_response
  end

- assistant(message, response: chat_response)
+ if image_filenames.empty?
+ assistant(message, response: chat_response)
+ else
+ messages.push(
+ {
+ role: "assistant",
+ content: message,
+ images: image_filenames,
+ response: chat_response
+ }.compact
+ )
+ end

  self.previous_response_id = response.id

@@ -333,9 +351,83 @@ module AI
  if web_search
  tools_list << {type: "web_search_preview"}
  end
+ if image_generation
+ tools_list << {type: "image_generation"}
+ end
+ tools_list
+ end
+
+ def extract_text_from_response(response)
+ response.output.flat_map { |output|
+ output.respond_to?(:content) ? output.content : []
+ }.compact.find { |content|
+ content.is_a?(OpenAI::Models::Responses::ResponseOutputText)
+ }&.text
+ end
+
+ # :reek:FeatureEnvy
+ def wrap_schema_if_needed(schema)
+ if schema.key?(:format) || schema.key?("format")
+ schema
+ elsif (schema.key?(:name) || schema.key?("name")) &&
+ (schema.key?(:schema) || schema.key?("schema")) &&
+ (schema.key?(:strict) || schema.key?("strict"))
+ {
+ format: schema.merge(type: :json_schema)
+ }
+ else
+ {
+ format: {
+ type: :json_schema,
+ name: "response",
+ schema: schema,
+ strict: true
+ }
+ }
+ end
  tools_list
  end

+ # :reek:DuplicateMethodCall
+ # :reek:FeatureEnvy
+ # :reek:ManualDispatch
+ # :reek:TooManyStatements
+ def extract_and_save_images(response)
+ image_filenames = []
+
+ image_outputs = response.output.select { |output|
+ output.respond_to?(:type) && output.type == :image_generation_call
+ }
+
+ return image_filenames if image_outputs.empty?
+
+ # ISO 8601 basic format with centisecond precision
+ timestamp = Time.now.strftime("%Y%m%dT%H%M%S%2N")
+
+ subfolder_name = "#{timestamp}_#{response.id}"
+ subfolder_path = File.join(image_folder || "./images", subfolder_name)
+ FileUtils.mkdir_p(subfolder_path)
+
+ image_outputs.each_with_index do |output, index|
+ next unless output.respond_to?(:result) && output.result
+
+ begin
+ image_data = Base64.strict_decode64(output.result)
+
+ filename = "#{(index + 1).to_s.rjust(3, "0")}.png"
+ filepath = File.join(subfolder_path, filename)
+
+ File.binwrite(filepath, image_data)
+
+ image_filenames << filepath
+ rescue => error
+ warn "Failed to save image: #{error.message}"
+ end
+ end
+
+ image_filenames
+ end
+
  # :reek:UtilityFunction
  # :reek:ManualDispatch
  def extract_text_from_response(response)
data/lib/ai/response.rb CHANGED
@@ -1,13 +1,17 @@
  module AI
  # :reek:IrresponsibleModule
+ # :reek:TooManyInstanceVariables
  class Response
  attr_reader :id, :model, :usage, :total_tokens
+ # :reek:Attribute
+ attr_accessor :images

  def initialize(response)
  @id = response.id
  @model = response.model
  @usage = response.usage.to_h.slice(:input_tokens, :output_tokens, :total_tokens)
  @total_tokens = @usage[:total_tokens]
+ @images = []
  end
  end
  end
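Tying the `Response` changes above to the README additions earlier in this diff, a small hedged usage sketch (the accessor names come straight from the diffs; the return values shown are only indicative):

```ruby
a = AI::Chat.new
a.image_generation = true
a.user("Draw a picture of a kitten")
a.generate!

r = a.messages.last[:response]
r.id           # e.g. "resp_abc123"
r.total_tokens # total tokens for this call, sliced from the usage hash
r.images       # e.g. ["./images/<timestamp>_resp_abc123/001.png"], via the new accessor
```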
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: ai-chat
  version: !ruby/object:Gem::Version
- version: 0.2.0
+ version: 0.2.1
  platform: ruby
  authors:
  - Raghu Betina