ruby-gemini-api 0.1.0

data/README.md ADDED
@@ -0,0 +1,764 @@
1
+ [README ‐ 日本語](https://github.com/rira100000000/ruby-gemini/wiki/README-%E2%80%90-%E6%97%A5%E6%9C%AC%E8%AA%9E)
2
+ # Ruby-Gemini
3
+
4
+ A Ruby client library for Google's Gemini API. This gem provides a simple, intuitive interface for interacting with Gemini's generative AI capabilities, following patterns similar to other AI client libraries.
5
+
6
+ This project is inspired by and pays homage to [ruby-openai](https://github.com/alexrudall/ruby-openai), aiming to provide a familiar and consistent experience for Ruby developers working with Gemini's AI models.
7
+
8
+ ## Features
9
+
10
+ - Text generation with Gemini models
11
+ - Chat functionality with conversation history
12
+ - Streaming responses for real-time text generation
13
+ - Audio transcription capabilities
14
+ - Thread and message management for chat applications
15
+ - Runs management for executing AI tasks
16
+ - Convenient Response object for easy access to generated content
17
+ - Structured output with JSON schema and enum constraints
18
+ - Document processing (PDFs and other formats)
19
+ - Context caching for efficient processing
20
+
21
+ ## Installation
22
+
23
+ Add this line to your application's Gemfile:
24
+
25
+ ```ruby
26
+ gem 'ruby-gemini-api'
27
+ ```
28
+
29
+ And then execute:
30
+
31
+ ```bash
32
+ $ bundle install
33
+ ```
34
+
35
+ Or install it yourself as:
36
+
37
+ ```bash
38
+ $ gem install ruby-gemini-api
39
+ ```
40
+
41
+ ## Quick Start
42
+
43
+ ### Text Generation
44
+
45
+ ```ruby
46
+ require 'gemini'
47
+
48
+ # Initialize client with API key
49
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
50
+
51
+ # Generate text
52
+ response = client.generate_content(
53
+ "What are the main features of Ruby programming language?",
54
+ model: "gemini-2.0-flash-lite"
55
+ )
56
+
57
+ # Access the generated content using Response object
58
+ if response.valid?
59
+ puts response.text
60
+ else
61
+ puts "Error: #{response.error}"
62
+ end
63
+ ```
64
+
65
+ ### Streaming Text Generation
66
+
67
+ ```ruby
68
+ require 'gemini'
69
+
70
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
71
+
72
+ # Stream response in real-time
73
+ client.generate_content_stream(
74
+ "Tell me a story about a programmer who loves Ruby",
75
+ model: "gemini-2.0-flash-lite"
76
+ ) do |chunk|
77
+ print chunk
78
+ $stdout.flush
79
+ end
80
+ ```
81
+
82
+ ### Chat Conversations
83
+
84
+ ```ruby
85
+ require 'gemini'
86
+
87
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
88
+
89
+ # Create conversation contents
90
+ contents = [
91
+ { role: "user", parts: [{ text: "Hello, I'm interested in learning Ruby." }] },
92
+ { role: "model", parts: [{ text: "That's great! Ruby is a dynamic, interpreted language..." }] },
93
+ { role: "user", parts: [{ text: "What makes Ruby different from other languages?" }] }
94
+ ]
95
+
96
+ # Get response with conversation history
97
+ response = client.chat(parameters: {
98
+ model: "gemini-2.0-flash-lite",
99
+ contents: contents
100
+ })
101
+
102
+ # Process the response using Response object
103
+ if response.success?
104
+ puts response.text
105
+ else
106
+ puts "Error: #{response.error}"
107
+ end
108
+
109
+ # You can also access other response information
110
+ puts "Finish reason: #{response.finish_reason}"
111
+ puts "Token usage: #{response.total_tokens}"
112
+ ```
113
+
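Writing the `contents` array by hand gets repetitive as a conversation grows. The sketch below shows one way to build it from role/text pairs. `build_contents` is a hypothetical helper, not part of the gem; it assumes only the hash shape used in the example above.

```ruby
# Hypothetical helper (not part of the gem): turns [role, text] pairs into
# the `contents` array shape used by the chat example above.
def build_contents(turns)
  turns.map do |role, text|
    { role: role.to_s, parts: [{ text: text }] }
  end
end

contents = build_contents([
  [:user, "Hello, I'm interested in learning Ruby."],
  [:model, "That's great! Ruby is a dynamic, interpreted language..."],
  [:user, "What makes Ruby different from other languages?"]
])

puts contents.length      # number of turns in the conversation
puts contents.last[:role] # role of the most recent turn
```

The resulting array can then be passed as `contents:` in the `parameters` hash, as in the chat example.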
114
+ ### Using System Instructions
115
+
116
+ ```ruby
117
+ require 'gemini'
118
+
119
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
120
+
121
+ # Set system instructions for model behavior
122
+ system_instruction = "You are a Ruby programming expert who provides concise code examples."
123
+
124
+ # Use system instructions with chat
125
+ response = client.chat(parameters: {
126
+ model: "gemini-2.0-flash-lite",
127
+ system_instruction: { parts: [{ text: system_instruction }] },
128
+ contents: [{ role: "user", parts: [{ text: "How do I write a simple web server in Ruby?" }] }]
129
+ })
130
+
131
+ # Access the response
132
+ puts response.text
133
+
134
+ # Check if the response was blocked for safety reasons
135
+ if response.safety_blocked?
136
+ puts "Response was blocked due to safety considerations"
137
+ end
138
+ ```
139
+
140
+ ### Image Recognition
141
+
142
+ ```ruby
143
+ require 'gemini'
144
+
145
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
146
+
147
+ # Analyze an image file (note: file size limit is 20MB for direct upload)
148
+ response = client.generate_content(
149
+ [
150
+ { type: "text", text: "Describe what you see in this image" },
151
+ { type: "image_file", image_file: { file_path: "path/to/image.jpg" } }
152
+ ],
153
+ model: "gemini-2.0-flash"
154
+ )
155
+
156
+ # Access the description using Response object
157
+ if response.success?
158
+ puts response.text
159
+ else
160
+ puts "Image analysis failed: #{response.error}"
161
+ end
162
+ ```
163
+
164
+ For image files larger than 20MB, you should use the `files.upload` method:
165
+
166
+ ```ruby
167
+ require 'gemini'
168
+
169
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
170
+
171
+ # Upload large image file
172
+ file = File.open("path/to/large_image.jpg", "rb")
173
+ upload_result = client.files.upload(file: file)
174
+ # Get file uri and name from the response
175
+ file_uri = upload_result["file"]["uri"]
176
+ file_name = upload_result["file"]["name"]
177
+
178
+ # Use the file URI for image analysis
179
+ response = client.generate_content(
180
+ [
181
+ { text: "Describe this image in detail" },
182
+ { file_data: { mime_type: "image/jpeg", file_uri: file_uri } }
183
+ ],
184
+ model: "gemini-2.0-flash"
185
+ )
186
+
187
+ # Process the response using Response object
188
+ if response.success?
189
+ puts response.text
190
+ else
191
+ puts "Image analysis failed: #{response.error}"
192
+ end
193
+
194
+ # Optionally delete the file when done
195
+ client.files.delete(name: file_name)
196
+ ```
197
+
198
+ For more examples, check out the `demo/vision_demo.rb` and `demo/file_vision_demo.rb` files included with the gem.
199
+
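The choice between sending a file directly and going through `files.upload` depends only on the file size, so it can be made up front. The sketch below is a minimal illustration: `upload_strategy` and the constant are hypothetical names, and reading "20MB" as 20 * 1024 * 1024 bytes is an assumption, since the text above does not say which unit is meant.

```ruby
# Assumption: "20MB" is read as 20 * 1024 * 1024 bytes.
DIRECT_UPLOAD_LIMIT_BYTES = 20 * 1024 * 1024

# Hypothetical helper (not part of the gem): returns :inline when a file of
# the given size fits under the direct-upload limit, :files_upload otherwise.
def upload_strategy(size_in_bytes, limit: DIRECT_UPLOAD_LIMIT_BYTES)
  size_in_bytes > limit ? :files_upload : :inline
end

puts upload_strategy(5 * 1024 * 1024)
puts upload_strategy(25 * 1024 * 1024)
```

In practice the size would come from `File.size("path/to/image.jpg")`.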
200
+ ### Image Generation
201
+
202
+ ```ruby
203
+ require 'gemini'
204
+
205
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
206
+
207
+ # Generate an image using Gemini 2.0
208
+ response = client.images.generate(
209
+ parameters: {
210
+ prompt: "A beautiful sunset over the ocean with sailing boats",
211
+ model: "gemini-2.0-flash-exp-image-generation",
212
+ size: "16:9"
213
+ }
214
+ )
215
+
216
+ # Save the generated image
217
+ if response.success? && !response.images.empty?
218
+ filepath = "generated_image.png"
219
+ response.save_image(filepath)
220
+ puts "Image saved to #{filepath}"
221
+ else
222
+ puts "Image generation failed: #{response.error}"
223
+ end
224
+ ```
225
+
226
+ You can also use the Imagen 3 model (note: this feature is not fully tested yet):
227
+
228
+ ```ruby
229
+ # Generate multiple images using Imagen 3
230
+ response = client.images.generate(
231
+ parameters: {
232
+ prompt: "A futuristic city with flying cars and tall skyscrapers",
233
+ model: "imagen-3.0-generate-002",
234
+ size: "1:1",
235
+ n: 4 # Generate 4 images
236
+ }
237
+ )
238
+
239
+ # Save all generated images
240
+ if response.success? && !response.images.empty?
241
+ filepaths = response.images.map.with_index { |_, i| "imagen_#{i+1}.png" }
242
+ saved_files = response.save_images(filepaths)
243
+ saved_files.each { |f| puts "Image saved to #{f}" if f }
244
+ end
245
+ ```
246
+
247
+ For a complete example, check out the `demo/image_generation_demo.rb` file included with the gem.
248
+
249
+ ### Audio Transcription
250
+
251
+ ```ruby
252
+ require 'gemini'
253
+
254
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
255
+
256
+ # Transcribe audio file (note: file size limit is 20MB for direct upload)
257
+ response = client.audio.transcribe(
258
+ parameters: {
259
+ model: "gemini-1.5-flash",
260
+ file: File.open("audio_file.mp3", "rb"),
261
+ language: "en",
262
+ content_text: "Transcribe this audio clip"
263
+ }
264
+ )
265
+
266
+ # Response object makes accessing the transcription easy
267
+ if response.success?
268
+ puts response.text
269
+ else
270
+ puts "Transcription failed: #{response.error}"
271
+ end
272
+ ```
273
+
274
+ For audio files larger than 20MB, you should use the `files.upload` method:
275
+
276
+ ```ruby
277
+ require 'gemini'
278
+
279
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
280
+
281
+ # Upload large audio file
282
+ file = File.open("path/to/audio.mp3", "rb")
283
+ upload_result = client.files.upload(file: file)
284
+ # Get file uri and name from the response
285
+ file_uri = upload_result["file"]["uri"]
286
+ file_name = upload_result["file"]["name"]
287
+
288
+ # Use the file URI for transcription
289
+ response = client.audio.transcribe(
290
+ parameters: {
291
+ model: "gemini-1.5-flash",
292
+ file_uri: file_uri,
293
+ language: "en"
294
+ }
295
+ )
296
+
297
+ # Check if the response was successful and get the transcription
298
+ if response.success?
299
+ puts response.text
300
+ else
301
+ puts "Transcription failed: #{response.error}"
302
+ end
303
+
304
+ # Optionally delete the file when done
305
+ client.files.delete(name: file_name)
306
+ ```
307
+
308
+ For more examples, check out the `demo/file_audio_demo.rb` file included with the gem.
309
+
310
+ ### Document Processing
311
+
312
+ The Gemini API can process long documents (up to 3,600 pages), including PDFs. Gemini models understand both the text and the images within a document, so you can analyze, summarize, and extract information from it.
313
+
314
+ ```ruby
315
+ require 'gemini'
316
+
317
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
318
+
319
+ # Process a PDF document
320
+ result = client.documents.process(
321
+ file_path: "path/to/document.pdf",
322
+ prompt: "Summarize this document in three key points",
323
+ model: "gemini-1.5-flash"
324
+ )
325
+
326
+ response = result[:response]
327
+
328
+ # Check the response
329
+ if response.success?
330
+ puts response.text
331
+ else
332
+ puts "Document processing failed: #{response.error}"
333
+ end
334
+
335
+ # File information (optional)
336
+ puts "File URI: #{result[:file_uri]}"
337
+ puts "File name: #{result[:file_name]}"
338
+ ```
339
+
340
+ For more complex document processing, you can have a conversation about the document:
341
+
342
+ ```ruby
343
+ require 'gemini'
344
+
345
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
346
+
347
+ # Start a conversation with a document
348
+ file_path = "path/to/document.pdf"
349
+ thread_result = client.chat_with_file(
350
+ file_path,
351
+ "Please provide an overview of this document",
352
+ model: "gemini-1.5-flash"
353
+ )
354
+
355
+ # Get the thread ID (for continuing the conversation)
356
+ thread_id = thread_result[:thread_id]
357
+
358
+ # Add another message to continue the conversation
359
+ client.messages.create(
360
+ thread_id: thread_id,
361
+ parameters: {
362
+ role: "user",
363
+ content: "Tell me more details about it"
364
+ }
365
+ )
366
+
367
+ # Execute to get a response
368
+ run = client.runs.create(thread_id: thread_id)
369
+
370
+ # Get conversation history
371
+ messages = client.messages.list(thread_id: thread_id)
372
+ puts "Conversation history:"
373
+ messages["data"].each do |msg|
374
+ role = msg["role"]
375
+ content = msg["content"].map { |c| c["text"]["value"] }.join("\n")
376
+ puts "#{role.upcase}: #{content}"
377
+ puts "--------------------------"
378
+ end
379
+ ```
380
+
381
+ Supported document formats:
382
+ - PDF - application/pdf
383
+ - Text - text/plain
384
+ - HTML - text/html
385
+ - CSS - text/css
386
+ - Markdown - text/md
387
+ - CSV - text/csv
388
+ - XML - text/xml
389
+ - RTF - text/rtf
390
+ - JavaScript - application/x-javascript, text/javascript
391
+ - Python - application/x-python, text/x-python
392
+
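If you need to pick a MIME type from a file name yourself, the list above maps naturally onto a lookup table. This is a sketch only: the MIME types are copied from the list, but the file extensions used as keys and the `document_mime_type` helper are assumptions, not something the gem is known to provide.

```ruby
# MIME types copied from the list above. The extension keys are an assumption.
# Where the list gives two MIME types, the text/* variant is used here.
DOCUMENT_MIME_TYPES = {
  ".pdf"  => "application/pdf",
  ".txt"  => "text/plain",
  ".html" => "text/html",
  ".css"  => "text/css",
  ".md"   => "text/md",
  ".csv"  => "text/csv",
  ".xml"  => "text/xml",
  ".rtf"  => "text/rtf",
  ".js"   => "text/javascript",
  ".py"   => "text/x-python"
}.freeze

# Hypothetical helper: returns the MIME type for a path, or nil when the
# extension is not in the table.
def document_mime_type(path)
  DOCUMENT_MIME_TYPES[File.extname(path).downcase]
end

puts document_mime_type("path/to/document.pdf")
puts document_mime_type("notes.docx").inspect
```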
393
+ Demo applications can be found in `demo/document_chat_demo.rb` and `demo/document_conversation_demo.rb`.
394
+
395
+ ### Context Caching
396
+
397
+ Context caching allows you to preprocess and store inputs like large documents or images with the Gemini API, then reuse them across multiple requests. This saves processing time and token usage when asking different questions about the same content.
398
+
399
+ **Important**: Context caching requires a minimum input of 32,768 tokens. The maximum token count matches the context window size of the model you are using. Caches automatically expire after 48 hours, but you can set a custom TTL (time to live). Caching is only available on stable models with a fixed version, and the version suffix must be included in the model name (e.g. `-001` in `gemini-1.5-pro-001`).
400
+
401
+ ```ruby
402
+ require 'gemini'
403
+
404
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
405
+
406
+ # Cache a document for repeated use
407
+ cache_result = client.documents.cache(
408
+ file_path: "path/to/large_document.pdf",
409
+ system_instruction: "You are a document analysis expert. Please understand the content thoroughly and answer questions accurately.",
410
+ ttl: "86400s", # 24 hours (in seconds)
411
+ model: "gemini-1.5-flash-001"
412
+ )
413
+
414
+ # Get the cache name
415
+ cache_name = cache_result[:cache][:name]
416
+ puts "Cache name: #{cache_name}"
417
+
418
+ # Ask questions using the cache
419
+ response = client.generate_content_with_cache(
420
+ "What are the key findings in this document?",
421
+ cached_content: cache_name,
422
+ model: "gemini-1.5-flash-001"
423
+ )
424
+
425
+ if response.success?
426
+ puts response.text
427
+ else
428
+ puts "Error: #{response.error}"
429
+ end
430
+
431
+ # Extend cache expiration
432
+ client.cached_content.update(
433
+ name: cache_name,
434
+ ttl: "172800s" # 48 hours (in seconds)
435
+ )
436
+
437
+ # Delete cache when done
438
+ client.cached_content.delete(name: cache_name)
439
+ ```
440
+
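The caching constraints described above (a minimum token count, a TTL given as a string of seconds, and a versioned model name) can be checked before any request is made. The helpers below are hypothetical and not part of the gem. In particular, treating "fixed version" as "the name ends in a dash followed by three digits" is an assumption based on the `-001` example.

```ruby
# Minimum input size for context caching, as stated above.
MIN_CACHE_TOKENS = 32_768

# Hypothetical helper: formats a duration in hours as the "<seconds>s"
# string used for `ttl` in the example above.
def ttl_from_hours(hours)
  "#{(hours * 3600).to_i}s"
end

# Assumption: a fixed-version model name ends in "-" plus three digits.
def versioned_model?(model)
  model.match?(/-\d{3}\z/)
end

# Hypothetical helper: true when the input is large enough to be cached.
def cacheable?(token_count)
  token_count >= MIN_CACHE_TOKENS
end

puts ttl_from_hours(24)                       # => 86400s
puts versioned_model?("gemini-1.5-flash-001") # => true
puts versioned_model?("gemini-1.5-flash")     # => false
puts cacheable?(10_000)                       # => false
```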
441
+ You can also list all your cached content:
442
+
443
+ ```ruby
444
+ # List all caches
445
+ caches = client.cached_content.list
446
+ puts "Cache list:"
447
+ caches.raw_data["cachedContents"].each do |cache|
448
+ puts "Name: #{cache['name']}"
449
+ puts "Model: #{cache['model']}"
450
+ puts "Expires: #{cache['expireTime']}"
451
+ puts "Token count: #{cache.dig('usageMetadata', 'totalTokenCount')}"
452
+ puts "--------------------------"
453
+ end
454
+ ```
455
+
456
+ For a complete example of context caching, check out the `demo/document_cache_demo.rb` file.
457
+
458
+ ### Structured Output with JSON Schema
459
+
460
+ You can request responses in structured JSON format by specifying a JSON schema:
461
+
462
+ ```ruby
463
+ require 'gemini'
464
+ require 'json'
465
+
466
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
467
+
468
+ # Define a schema for recipes
469
+ recipe_schema = {
470
+ type: "ARRAY",
471
+ items: {
472
+ type: "OBJECT",
473
+ properties: {
474
+ "recipe_name": { type: "STRING" },
475
+ "ingredients": {
476
+ type: "ARRAY",
477
+ items: { type: "STRING" }
478
+ },
479
+ "preparation_time": {
480
+ type: "INTEGER",
481
+ description: "Preparation time in minutes"
482
+ }
483
+ },
484
+ required: ["recipe_name", "ingredients"],
485
+ propertyOrdering: ["recipe_name", "ingredients", "preparation_time"]
486
+ }
487
+ }
488
+
489
+ # Request JSON-formatted response according to the schema
490
+ response = client.generate_content(
491
+ "List three popular cookie recipes with ingredients and preparation time",
492
+ response_mime_type: "application/json",
493
+ response_schema: recipe_schema
494
+ )
495
+
496
+ # Process JSON response
497
+ if response.success? && response.json?
498
+ recipes = response.json
499
+
500
+ # Work with structured data
501
+ recipes.each do |recipe|
502
+ puts "#{recipe['recipe_name']} (#{recipe['preparation_time']} minutes)"
503
+ puts "Ingredients: #{recipe['ingredients'].join(', ')}"
504
+ puts
505
+ end
506
+ else
507
+ puts "Failed to get JSON: #{response.error}"
508
+ end
509
+ ```
510
+
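Even with a schema in place, it can be worth checking the parsed data before using it. The sketch below checks each record for the keys listed under `required` in the recipe schema. `missing_required_keys` is a hypothetical helper, and the sample data is hand-written for illustration; it is not real model output.

```ruby
require 'json'

# Keys taken from the `required` array in the recipe schema above.
REQUIRED_RECIPE_KEYS = %w[recipe_name ingredients].freeze

# Hypothetical helper (not part of the gem): returns the required keys that
# are absent from one parsed record.
def missing_required_keys(record, required: REQUIRED_RECIPE_KEYS)
  required.reject { |key| record.key?(key) }
end

# Hand-written sample standing in for `response.json`; not real model output.
sample_recipes = JSON.parse(<<~SAMPLE)
  [
    { "recipe_name": "Sugar cookies", "ingredients": ["flour", "sugar"] },
    { "recipe_name": "Mystery cookies" }
  ]
SAMPLE

sample_recipes.each do |recipe|
  missing = missing_required_keys(recipe)
  status = missing.empty? ? "ok" : "missing #{missing.join(', ')}"
  puts "#{recipe['recipe_name']}: #{status}"
end
```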
511
+ ### Enum-Constrained Responses
512
+
513
+ You can limit possible values in responses using enums:
514
+
515
+ ```ruby
516
+ require 'gemini'
517
+ require 'json'
518
+
519
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
520
+
521
+ # Define schema with enum constraints
522
+ review_schema = {
523
+ type: "OBJECT",
524
+ properties: {
525
+ "product_name": { type: "STRING" },
526
+ "rating": {
527
+ type: "STRING",
528
+ enum: ["1", "2", "3", "4", "5"],
529
+ description: "Rating from 1 to 5"
530
+ },
531
+ "recommendation": {
532
+ type: "STRING",
533
+ enum: ["Not recommended", "Neutral", "Recommended", "Highly recommended"],
534
+ description: "Level of recommendation"
535
+ },
536
+ "comment": { type: "STRING" }
537
+ },
538
+ required: ["product_name", "rating", "recommendation"]
539
+ }
540
+
541
+ # Request constrained response
542
+ response = client.generate_content(
543
+ "Write a review for the new GeminiPhone 15 smartphone",
544
+ response_mime_type: "application/json",
545
+ response_schema: review_schema
546
+ )
547
+
548
+ # Work with structured data that follows constraints
549
+ if response.success? && response.json?
550
+ review = response.json
551
+ puts "Product: #{review['product_name']}"
552
+ puts "Rating: #{review['rating']}/5"
553
+ puts "Recommendation: #{review['recommendation']}"
554
+ puts "Comment: #{review['comment']}" if review['comment']
555
+ else
556
+ puts "Failed to get JSON: #{response.error}"
557
+ end
558
+ ```
559
+
560
+ For complete examples of structured output, check out the `demo/structured_output_demo.rb` and `demo/enum_response_demo.rb` files included with the gem.
561
+
562
+ ## Advanced Usage
563
+
564
+ ### Threads and Messages
565
+
566
+ The library supports a threads and messages concept similar to other AI platforms:
567
+
568
+ ```ruby
569
+ require 'gemini'
570
+
571
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
572
+
573
+ # Create a new thread
574
+ thread = client.threads.create(parameters: { model: "gemini-2.0-flash-lite" })
575
+ thread_id = thread["id"]
576
+
577
+ # Add a message to the thread
578
+ message = client.messages.create(
579
+ thread_id: thread_id,
580
+ parameters: {
581
+ role: "user",
582
+ content: "Tell me about Ruby on Rails"
583
+ }
584
+ )
585
+
586
+ # Execute a run on the thread
587
+ run = client.runs.create(thread_id: thread_id)
588
+
589
+ # Retrieve all messages in the thread
590
+ messages = client.messages.list(thread_id: thread_id)
591
+ puts "\nAll messages in thread:"
592
+ messages["data"].each do |msg|
593
+ role = msg["role"]
594
+ content = msg["content"].map { |c| c["text"]["value"] }.join("\n")
595
+ puts "#{role.upcase}: #{content}"
596
+ end
597
+ ```
598
+
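The expression that joins a message's text parts appears in more than one example in this README, so it may be convenient to wrap it in a small method. `message_text` below is a hypothetical helper, not part of the gem. It assumes the message shape implied by the loop above (a `"content"` array whose items hold `"text" => { "value" => ... }`) and skips items that lack a text value.

```ruby
# Hypothetical helper (not part of the gem): joins the text values of a
# message hash shaped like the ones iterated over in the example above.
def message_text(message)
  Array(message["content"])
    .filter_map { |item| item.dig("text", "value") }
    .join("\n")
end

# Hand-built message for illustration; not a real API response.
sample_message = {
  "role" => "user",
  "content" => [
    { "text" => { "value" => "Tell me about Ruby on Rails" } },
    { "text" => { "value" => "Keep it short" } }
  ]
}

puts "#{sample_message['role'].upcase}: #{message_text(sample_message)}"
```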
599
+ ### Working with Response Objects
600
+
601
+ The Response object provides several useful methods for working with API responses:
602
+
603
+ ```ruby
604
+ require 'gemini'
605
+
606
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
607
+
608
+ response = client.generate_content(
609
+ "Tell me about the Ruby programming language",
610
+ model: "gemini-2.0-flash-lite"
611
+ )
612
+
613
+ # Basic response information
614
+ puts "Valid response? #{response.valid?}"
615
+ puts "Success? #{response.success?}"
616
+
617
+ # Access text content
618
+ puts "Text: #{response.text}"
619
+ puts "Formatted text: #{response.formatted_text}"
620
+
621
+ # Get individual text parts
622
+ puts "Text parts: #{response.text_parts.size}"
623
+ response.text_parts.each_with_index do |part, i|
624
+ puts "Part #{i+1}: #{part[0..30]}..." # Print beginning of each part
625
+ end
626
+
627
+ # Access first candidate
628
+ puts "First candidate role: #{response.role}"
629
+
630
+ # Token usage information
631
+ puts "Prompt tokens: #{response.prompt_tokens}"
632
+ puts "Completion tokens: #{response.completion_tokens}"
633
+ puts "Total tokens: #{response.total_tokens}"
634
+
635
+ # Safety information
636
+ puts "Finish reason: #{response.finish_reason}"
637
+ puts "Safety blocked? #{response.safety_blocked?}"
638
+
639
+ # JSON handling methods (for structured output)
640
+ puts "Is JSON response? #{response.json?}"
641
+ if response.json?
642
+ puts "JSON data: #{response.json.inspect}"
643
+ puts "Pretty JSON: #{response.to_formatted_json(pretty: true)}"
644
+ end
645
+
646
+ # Raw data access for advanced needs
647
+ puts "Raw response data available? #{!response.raw_data.nil?}"
648
+ ```
649
+
650
+ ### Configuration
651
+
652
+ Configure the client with custom options:
653
+
654
+ ```ruby
655
+ require 'gemini'
656
+
657
+ # Global configuration
658
+ Gemini.configure do |config|
659
+ config.api_key = ENV['GEMINI_API_KEY']
660
+ config.uri_base = "https://generativelanguage.googleapis.com/v1beta"
661
+ config.request_timeout = 60
662
+ config.log_errors = true
663
+ end
664
+
665
+ # Or per-client configuration
666
+ client = Gemini::Client.new(
667
+ ENV['GEMINI_API_KEY'],
668
+ {
669
+ uri_base: "https://generativelanguage.googleapis.com/v1beta",
670
+ request_timeout: 60,
671
+ log_errors: true
672
+ }
673
+ )
674
+
675
+ # Add custom headers
676
+ client.add_headers({"X-Custom-Header" => "value"})
677
+ ```
678
+
679
+ ## Demo Applications
680
+
681
+ The gem includes several demo applications that showcase its functionality:
682
+
683
+ - `demo/demo.rb` - Basic text generation and chat
684
+ - `demo/stream_demo.rb` - Streaming text generation
685
+ - `demo/audio_demo.rb` - Audio transcription
686
+ - `demo/vision_demo.rb` - Image recognition
687
+ - `demo/image_generation_demo.rb` - Image generation
688
+ - `demo/file_vision_demo.rb` - Image recognition with large image files
689
+ - `demo/file_audio_demo.rb` - Audio transcription with large audio files
690
+ - `demo/structured_output_demo.rb` - Structured JSON output with schema
691
+ - `demo/enum_response_demo.rb` - Enum-constrained responses
692
+ - `demo/document_chat_demo.rb` - Document processing
693
+ - `demo/document_conversation_demo.rb` - Conversation with documents
694
+ - `demo/document_cache_demo.rb` - Document caching
695
+
696
+ Run the demos with the commands below.
697
+
698
+ Appending `_ja` to the name of a demo file launches the Japanese version of that demo.
699
+ Example: `ruby demo_ja.rb`
700
+
701
+ ```bash
702
+ # Basic chat demo
703
+ ruby demo/demo.rb
704
+
705
+ # Streaming chat demo
706
+ ruby demo/stream_demo.rb
707
+
708
+ # Audio transcription
709
+ ruby demo/audio_demo.rb path/to/audio/file.mp3
710
+
711
+ # Audio transcription with an audio file over 20MB
712
+ ruby demo/file_audio_demo.rb path/to/audio/file.mp3
713
+
714
+ # Image recognition
715
+ ruby demo/vision_demo.rb path/to/image/file.jpg
716
+
717
+ # Image recognition with large image files
718
+ ruby demo/file_vision_demo.rb path/to/image/file.jpg
719
+
720
+ # Image generation
721
+ ruby demo/image_generation_demo.rb
722
+
723
+ # Structured output with JSON schema
724
+ ruby demo/structured_output_demo.rb
725
+
726
+ # Enum-constrained responses
727
+ ruby demo/enum_response_demo.rb
728
+
729
+ # Document processing
730
+ ruby demo/document_chat_demo.rb path/to/document.pdf
731
+
732
+ # Conversation with a document
733
+ ruby demo/document_conversation_demo.rb path/to/document.pdf
734
+
735
+ # Document caching and querying
736
+ ruby demo/document_cache_demo.rb path/to/document.pdf
737
+ ```
738
+
739
+ ## Models
740
+
741
+ The library supports various Gemini models:
742
+
743
+ - `gemini-2.0-flash-lite`
744
+ - `gemini-2.0-flash`
745
+ - `gemini-2.0-pro`
746
+ - `gemini-1.5-flash`
747
+
748
+ ## Requirements
749
+
750
+ - Ruby 3.0 or higher
751
+ - Faraday 2.0 or higher
752
+ - Google Gemini API key
753
+
754
+ ## Contributing
755
+
756
+ 1. Fork it
757
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
758
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
759
+ 4. Push to the branch (`git push origin my-new-feature`)
760
+ 5. Create a new Pull Request
761
+
762
+ ## License
763
+
764
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).