ruby-gemini-api 0.1.7 → 1.1.0

This diff shows the changes between publicly released versions of the package, as they appear in a supported public registry. It is provided for informational purposes only.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 89f3cb8e2bfe7ee5c3e2319312a44b8eefe756d984aa0a26373102f5a3d7eb18
-   data.tar.gz: 0b3272695fc70f9a841261f525799faea9b6da6562bf949d167ec511f1e04eb4
+   metadata.gz: 31487006e959d8d9a755743f6471b54e075a1bc91aa36718274203deb9fda84d
+   data.tar.gz: bc7dbbbea933ed2343b2ec1a32d3cf9b03fe41a4494801c47e5d83842bf31603
  SHA512:
-   metadata.gz: 3ac51187c27d747ea663c269d63b87a8e31a52febdce56efcc7b17497ffa17987e9b61a957c1ee95ea265128594a22ac3ce937a0380c97e5422a81e6906b052c
-   data.tar.gz: 8ca1da9ce78243b3435c7baf5b059253ecb1a32b98092520342fa3063135315713da4cb98c9ee789fcbf28453fadf1f2e2bbfb39d28b3e515b60549ac78f8012
+   metadata.gz: cc033c4ab711800c56f2f1d8884c38a74359d7c3b66fb66ed39ea03b23b98c3748d08208dc560e3276ac3d6e25660ad3a2c5aef69ef25b523bbd6512a5b5d246
+   data.tar.gz: 998ca95babf2803241a9d0a01c00eefe7fcbd37990723ffc9378668bc30f5529a90daa92b20919e5c0e681f68f4e8a5d4206e66520eeec46da9aa18c169fc9be
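
Checksums like those above are compared by hashing the downloaded file and matching the hex digest. A minimal, generic sketch using Ruby's standard `digest` library (the file path and expected digest are placeholders, not values from this diff):

```ruby
require "digest"

# Compare a file's SHA256 digest against an expected hex string,
# as one would do to verify a downloaded archive against a
# published checksum. Purely illustrative; paths are placeholders.
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end
```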
data/CHANGELOG.md CHANGED
@@ -1,39 +1,77 @@
  ## [Unreleased]
 
- ## [0.1.0] - 2025-04-05
-
- - Initial release
+ ## [1.1.0] - 2026-04-29
+
+ ### Added
+ - Live API support for real-time bidirectional audio/video/text conversations over WebSocket
+ - `Gemini::Live::Session` with event-driven API (`:setup_complete`, `:text`, `:audio`, `:tool_call`, `:turn_complete`, `:interrupted`, `:usage_metadata`, `:session_resumption`, `:go_away`, `:close`, `:error`)
+ - `Gemini::Live::Configuration` with response modality, voice, system instruction, tools, context-window compression, session resumption, manual VAD, output audio transcription
+ - `Gemini::Live::MessageBuilder` for setup, clientContent, realtimeInput, activity start/end, and tool response messages
+ - Live API audio demos: `live_audio_demo.rb` (low-latency streaming), `live_audio_simple.rb`
+ - Manual VAD (Voice Activity Detection) support via `automatic_activity_detection: false`
+ - Live API Function Calling
+ - `Session#send_realtime_text(text)` — universal text input via `realtimeInput.text`, required by newer Live models such as `gemini-3.1-flash-live-preview`
+ - `MessageBuilder.realtime_text(text)` builder
+ - Async (NON_BLOCKING) function call support: `MessageBuilder.tool_response` validates and normalizes the `scheduling` field (`INTERRUPT`, `WHEN_IDLE`, `SILENT`), accepted either inside the response payload or as a top-level shortcut
+ - Demos: `live_function_calling_demo.rb` / `live_function_calling_demo_ja.rb`
+ - Embeddings API support (`embedContent` and `batchEmbedContents`)
+ - `client.embeddings_api.create(input:, ...)` for single embeddings
+ - `client.embeddings_api.batch_create(inputs:, ...)` for batch embeddings
+ - `client.embed_content(input, ...)` shortcut that auto-routes Array inputs to batch
+ - Optional parameters: `task_type` (RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING, QUESTION_ANSWERING, FACT_VERIFICATION, CODE_RETRIEVAL_QUERY), `title` (RETRIEVAL_DOCUMENT only), `output_dimensionality`
+ - Default model: `gemini-embedding-001`
+ - `Response` helpers for embeddings: `#embedding`, `#embeddings`, `#embedding_dimension`, `#embedding_response?`
+ - Demos: `embeddings_demo.rb` / `embeddings_demo_ja.rb`
+
+ ### Notes
+ - Verified Live model compatibility on the `bidiGenerateContent` endpoint: only the native-audio variants and `gemini-3.1-flash-live-preview` are deployed today. The latter requires `realtimeInput.text` (i.e., `Session#send_realtime_text`) and `AUDIO` modality. The `gemini-2.5-flash-live-preview` model name listed in the public tools docs is not yet deployed.
+ - `MessageBuilder.realtime_input` (legacy `mediaChunks` path) is documented as deprecated by the upstream API; prefer `realtime_text` going forward.
+
+ ## [1.0.0] - 2026-01-28
+
+ ### Added
+ - Thinking feature support for Gemini 2.5 and Gemini 3 models
+ - `thinking_budget` parameter for Gemini 2.5 (1-32768 tokens, -1 for dynamic, 0 to disable)
+ - `thinking_level` parameter for Gemini 3 (:minimal, :low, :medium, :high)
+ - Thought Signatures support for Function Calling with Thinking
+ - `FunctionCallingHelper.build_continuation` for automatic signature management
+ - Response methods: `thought_signatures`, `first_thought_signature`, `has_thought_signature?`
+ - Response helper methods: `thoughts_token_count`, `model_version`, `gemini_3?`
 
- ## [0.1.1] - 2025-05-04
+ ## [0.1.7] - 2026-01-13
 
- - Changed generate_contents to accept temperature parameter
+ - Remove dotenv dependency
 
- ## [0.1.2] - 2025-07-10
+ ## [0.1.6] - 2025-12-11
 
- - Add function calling
+ - Add support for video understanding
+ - Analyze local video files (Files API and inline data)
+ - Analyze YouTube videos
+ - Helper methods: describe, ask, extract_timestamps, analyze_segment
+ - Support for MP4, MPEG, MOV, AVI, FLV, WebM, WMV, 3GPP formats
+ - Change default model to gemini-2.5-flash
 
- ## [0.1.3] - 2025-10-09
+ ## [0.1.5] - 2025-11-13
 
- - Add support for multi-image input
+ - Add support for URL Context tool
+ - Add simplified method for accessing grounding search sources
 
  ## [0.1.4] - 2025-11-08
 
  - Add support for grounding search
 
- ## [0.1.5] - 2025-11-13
+ ## [0.1.3] - 2025-10-09
 
- - Add support for URL Context tool
- - Add simplified method for accessing grounding search sources
+ - Add support for multi-image input
 
- ## [0.1.6] - 2025-12-11
+ ## [0.1.2] - 2025-07-10
 
- - Add support for video understanding
- - Analyze local video files (Files API and inline data)
- - Analyze YouTube videos
- - Helper methods: describe, ask, extract_timestamps, analyze_segment
- - Support for MP4, MPEG, MOV, AVI, FLV, WebM, WMV, 3GPP formats
- - Change default model to gemini-2.5-flash
+ - Add function calling
 
- ## [0.1.7] - 2026-01-13
+ ## [0.1.1] - 2025-05-04
 
- - Remove dotenv dependency
+ - Changed generate_contents to accept temperature parameter
+
+ ## [0.1.0] - 2025-04-05
+
+ - Initial release
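
The changelog entry above describes `client.embed_content` as a shortcut that auto-routes Array inputs to the batch endpoint. A minimal sketch of that dispatch pattern in plain Ruby; the `single_embed`/`batch_embed` names are illustrative stand-ins, not the gem's internals:

```ruby
# Illustrative dispatch mirroring the documented behavior of
# `client.embed_content`: a String goes to the single-embedding path,
# an Array of Strings goes to the batch path. Helper names are hypothetical.
def embed_content(input, **opts)
  if input.is_a?(Array)
    batch_embed(inputs: input, **opts)   # would call batchEmbedContents
  else
    single_embed(input: input, **opts)   # would call embedContent
  end
end

# Stand-in implementations so the sketch runs on its own.
def single_embed(input:, **opts)
  { endpoint: "embedContent", count: 1 }
end

def batch_embed(inputs:, **opts)
  { endpoint: "batchEmbedContents", count: inputs.size }
end
```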
data/README.md CHANGED
@@ -1,10 +1,22 @@
  [README ‐ 日本語](https://github.com/rira100000000/ruby-gemini-api/blob/main/README_ja.md)
  # Ruby-Gemini-API
 
+ [![Gem Version](https://badge.fury.io/rb/ruby-gemini-api.svg)](https://badge.fury.io/rb/ruby-gemini-api)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+
  A Ruby client library for Google's Gemini API. This gem provides a simple, intuitive interface for interacting with Gemini's generative AI capabilities, following patterns similar to other AI client libraries.
 
  This project is inspired by and pays homage to [ruby-openai](https://github.com/alexrudall/ruby-openai), aiming to provide a familiar and consistent experience for Ruby developers working with Gemini's AI models.
 
+ ## Why This Gem?
+
+ - **Familiar Interface**: API design inspired by ruby-openai for a smooth transition
+ - **Comprehensive Features**: Text generation, vision, audio, video, function calling, and more
+ - **Response Object**: Convenient wrapper for easy access to generated content
+ - **Streaming Support**: Real-time text generation with block-based API
+ - **Thinking Support**: Built-in support for Gemini 2.5/3 thinking features
+ - **Production Ready**: Stable 1.0 release with thorough documentation
+
  ## Features
 
  - Text generation with Gemini models
@@ -18,6 +30,8 @@ This project is inspired by and pays homage to [ruby-openai](https://github.com/
  - Structured output with JSON schema and enum constraints
  - Document processing (PDFs and other formats)
  - Context caching for efficient processing
+ - Text embeddings (single and batch) with task type, title, and output dimensionality control
+ - Live API: real-time bidirectional conversations with text/audio/video and function calling (sync and async)
 
  ### Function Calling
 
@@ -94,6 +108,105 @@ puts "After deleting a function: #{all_tools.list_functions}"
  # => After deleting a function: [:get_current_weather, :send_email]
  ```
 
+ ### Thinking Feature
+
+ Gemini 2.5 and later models support the Thinking feature, which lets the model run an internal reasoning pass on complex problems to produce higher-quality answers.
+
+ #### Using with Gemini 2.5: `thinking_budget`
+
+ ```ruby
+ require 'gemini'
+
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
+
+ # Specify the thinking token budget (1-32768)
+ response = client.generate_content(
+   "Solve this complex math problem step by step",
+   model: "gemini-2.5-flash",
+   thinking_budget: 2048
+ )
+
+ puts "Thoughts token count: #{response.thoughts_token_count}"
+ puts "Answer: #{response.text}"
+
+ # Dynamic thinking (the model decides the budget automatically)
+ response = client.generate_content(
+   "A difficult logic puzzle",
+   model: "gemini-2.5-flash",
+   thinking_budget: -1
+ )
+
+ # Disable thinking
+ response = client.generate_content(
+   "Simple question",
+   thinking_budget: 0
+ )
+ ```
+
+ #### Using with Gemini 3: `thinking_level`
+
+ ```ruby
+ # Specify the thinking level (:minimal, :low, :medium, :high)
+ response = client.generate_content(
+   "Complex analysis task",
+   model: "gemini-3-flash-preview",
+   thinking_level: :high
+ )
+ ```
+
+ #### Function Calling with Thinking (Thought Signatures)
+
+ When using Function Calling with Gemini 3, Thought Signatures must be carried across requests. The library manages the signatures for you.
+
+ ```ruby
+ # Initial request
+ response = client.generate_content(
+   "What's the weather in Tokyo?",
+   tools: tools,
+   thinking_level: :medium
+ )
+
+ # If function calls are present
+ if response.function_calls.any?
+   # Execute the function
+   weather_data = get_weather("Tokyo")
+
+   # Build the continuation request (signatures attached automatically)
+   contents = Gemini::FunctionCallingHelper.build_continuation(
+     original_contents: [{ role: "user", parts: [{ text: "What's the weather in Tokyo?" }] }],
+     model_response: response,
+     function_responses: [
+       { name: "get_weather", response: weather_data }
+     ]
+   )
+
+   # Continuation request
+   final_response = client.generate_content(
+     contents,
+     tools: tools,
+     thinking_level: :medium
+   )
+
+   puts final_response.text
+ end
+ ```
+
+ #### Response Methods for Thinking
+
+ ```ruby
+ # Thoughts token count
+ response.thoughts_token_count # => 150
+
+ # Thought signatures (for Function Calling)
+ response.thought_signatures # => ["base64encoded..."]
+ response.first_thought_signature
+ response.has_thought_signature?
+
+ # Model version checks
+ response.model_version # => "gemini-3-flash-preview"
+ response.gemini_3? # => true
+ ```
+
  ## Installation
 
  Add this line to your application's Gemfile:
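
The `thinking_budget` contract documented above (0 disables thinking, -1 requests dynamic thinking, otherwise 1-32768 tokens) can be captured in a tiny validator. This is an illustrative sketch of the parameter's domain, not code from the gem:

```ruby
# Check a thinking_budget value against the documented domain:
# 0 disables thinking, -1 means dynamic, and any other value
# must be an Integer in 1..32768. Illustrative only.
def valid_thinking_budget?(budget)
  return false unless budget.is_a?(Integer)
  budget == -1 || budget == 0 || (1..32_768).cover?(budget)
end
```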
@@ -881,6 +994,275 @@ end
 
  For a complete example of context caching, check out the `demo/document_cache_demo.rb` file.
 
+ ### Live API (Real-time Conversations)
+
+ The Gemini Live API provides bidirectional WebSocket-based real-time conversations with audio, video, and text support. The library wraps the protocol behind an event-driven `Gemini::Live::Session`.
+
+ #### Basic Audio Conversation
+
+ The default model (`gemini-2.5-flash-native-audio-preview-12-2025`) responds with audio. You receive Base64-encoded 24 kHz 16-bit PCM chunks via the `:audio` event.
+
+ ```ruby
+ require 'gemini'
+ require 'base64'
+
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
+
+ client.live.connect(
+   response_modality: "AUDIO",
+   voice_name: "Kore",
+   system_instruction: "You are a helpful assistant. Be brief."
+ ) do |session|
+   setup_complete = false
+   audio_chunks = []
+
+   session.on(:setup_complete) { setup_complete = true }
+   session.on(:audio) { |data, _mime| audio_chunks << Base64.decode64(data) }
+   session.on(:turn_complete) { puts "[#{audio_chunks.sum(&:bytesize)} bytes]" }
+   session.on(:error) { |e| puts "Error: #{e.message}" }
+
+   sleep 0.05 until setup_complete
+
+   session.send_realtime_text("What is the capital of Japan?")
+   sleep 8
+ end
+ ```
+
+ For text-only responses, see the note below about Live model availability.
+
+ #### Function Calling (Synchronous)
+
+ The Live API supports function calling. Define your tools, register a `:tool_call` handler, and reply with `session.send_tool_response`.
+
+ > **Note on Live model input format**
+ > Newer Live models such as `gemini-3.1-flash-live-preview` reject the
+ > legacy `clientContent.turns[]` payload that older models (including the
+ > native-audio variants) accept. Use `session.send_realtime_text(...)`
+ > instead of `session.send_text(...)`; it emits the universal
+ > `realtimeInput.text` form and works on every currently deployed Live
+ > model. The `gemini-2.5-flash-live-preview` model name listed in the
+ > public tools docs is not deployed on the `bidiGenerateContent` endpoint
+ > at the time of writing.
+
+ ```ruby
+ require 'base64'
+
+ tools = [
+   {
+     functionDeclarations: [
+       {
+         name: "get_weather",
+         description: "Get the current weather for a location",
+         parameters: {
+           type: "object",
+           properties: {
+             location: { type: "string", description: "City name" }
+           },
+           required: ["location"]
+         }
+       }
+     ]
+   }
+ ]
+
+ audio_chunks = []
+
+ client.live.connect(
+   response_modality: "AUDIO",
+   voice_name: "Kore",
+   tools: tools,
+   system_instruction: "Use the available functions when asked about weather."
+ ) do |session|
+   session.on(:audio) { |data, _mime| audio_chunks << Base64.decode64(data) }
+
+   session.on(:tool_call) do |function_calls|
+     responses = function_calls.map do |call|
+       result = case call[:name]
+                when "get_weather"
+                  { temperature: 22, condition: "sunny", location: call[:args]["location"] }
+                end
+       { id: call[:id], name: call[:name], response: result }
+     end
+     session.send_tool_response(responses)
+   end
+
+   sleep 0.5 # wait for setup
+   session.send_realtime_text("What's the weather in Tokyo?")
+   sleep 18
+ end
+
+ # audio_chunks now contains 24 kHz, 16-bit PCM mono audio of the spoken reply.
+ ```
+
+ A complete example is in `demo/live_function_calling_demo.rb`.
+
+ #### Function Calling (Asynchronous / NON_BLOCKING)
+
+ According to the public docs, `gemini-2.5-flash-live-preview` supports asynchronous function calls. Mark a function declaration with `behavior: "NON_BLOCKING"` so the model can keep talking while the call runs, then control how the result is delivered back via `scheduling`.
+
+ ```ruby
+ tools = [
+   {
+     functionDeclarations: [
+       {
+         name: "fetch_long_running_data",
+         behavior: "NON_BLOCKING",
+         description: "Slow data lookup",
+         parameters: { type: "object", properties: {} }
+       }
+     ]
+   }
+ ]
+
+ session.on(:tool_call) do |function_calls|
+   responses = function_calls.map do |call|
+     {
+       id: call[:id],
+       name: call[:name],
+       response: { result: "data ready" },
+       scheduling: "INTERRUPT" # or "WHEN_IDLE", "SILENT"
+     }
+   end
+   session.send_tool_response(responses)
+ end
+ ```
+
+ `scheduling` can also be placed inside the `response:` hash directly. Valid values: `INTERRUPT`, `WHEN_IDLE`, `SILENT`. The library validates and uppercases the value automatically; an unknown value raises `ArgumentError`.
+
+ #### Built-in Tools
+
+ Google Search grounding is supported in the Live API:
+
+ ```ruby
+ client.live.connect(
+   model: "gemini-2.5-flash-live-preview",
+   tools: [{ google_search: {} }]
+ ) do |session|
+   # ...
+ end
+ ```
+
+ #### Supported Live API Models for Tools
+
+ The public Live API tools docs list:
+
+ | Model | Sync Function Calling | Async (NON_BLOCKING) | Google Search |
+ |---|---|---|---|
+ | `gemini-2.5-flash-live-preview` | ✓ | ✓ | ✓ |
+ | `gemini-3.1-flash-live-preview` | ✓ | — | ✓ |
+
+ In practice, on the `bidiGenerateContent` endpoint as of this writing:
+
+ - `gemini-3.1-flash-live-preview` is deployed and works with **AUDIO** response modality + tools, **but only when text input is sent via `session.send_realtime_text(...)`** (i.e., `realtimeInput.text`). It rejects the legacy `clientContent.turns[]` payload.
+ - `gemini-2.5-flash-native-audio-preview-12-2025` (the library default) is deployed and accepts both `send_realtime_text` and `send_text` (legacy `clientContent.turns[]`).
+ - `gemini-2.5-flash-live-preview` from the docs table is **not yet deployed**.
+
+ Once a TEXT-modality-capable Live model ships, the same code works with `response_modality: "TEXT"` and the `voice_name:` argument removed.
+
+ Demos available:
+
+ - `demo/live_text_demo.rb` - Live API text conversation
+ - `demo/live_audio_demo.rb` - Live API audio conversation
+ - `demo/live_function_calling_demo.rb` - Live API function calling
+
+ ### Embeddings
+
+ You can generate text embeddings using the Gemini Embeddings API. Embeddings are vector representations of text that can be used for semantic similarity, classification, clustering, retrieval, and more.
+
+ #### Single Embedding
+
+ ```ruby
+ require 'gemini'
+
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
+
+ response = client.embed_content(
+   "What is the meaning of life?",
+   model: "gemini-embedding-001"
+ )
+
+ if response.success?
+   puts "Dimension: #{response.embedding_dimension}"
+   puts "Vector (first 5 values): #{response.embedding.first(5).inspect}"
+ end
+ ```
+
+ #### Batch Embeddings
+
+ Pass an Array of strings to embed multiple texts in a single batch request (uses `batchEmbedContents` under the hood):
+
+ ```ruby
+ response = client.embed_content(
+   [
+     "I love programming in Ruby.",
+     "Rubies are red gemstones.",
+     "Python is also a programming language."
+   ],
+   model: "gemini-embedding-001",
+   task_type: :semantic_similarity
+ )
+
+ response.embeddings.each_with_index do |values, i|
+   puts "Embedding #{i}: dimension=#{values.size}"
+ end
+ ```
+
+ #### Task Type, Title, and Output Dimensionality
+
+ You can specify a `task_type` to optimize the embedding for a particular downstream task. When `task_type: :retrieval_document` is used, you may also pass a `title`. Use `output_dimensionality` to truncate the vector length (recommended values: 768, 1536, 3072).
+
+ ```ruby
+ response = client.embed_content(
+   "Ruby is a dynamic, open-source programming language.",
+   model: "gemini-embedding-001",
+   task_type: :retrieval_document,
+   title: "Ruby Overview",
+   output_dimensionality: 768
+ )
+ ```
+
+ Supported task types:
+
+ - `RETRIEVAL_QUERY`
+ - `RETRIEVAL_DOCUMENT`
+ - `SEMANTIC_SIMILARITY`
+ - `CLASSIFICATION`
+ - `CLUSTERING`
+ - `QUESTION_ANSWERING`
+ - `FACT_VERIFICATION`
+ - `CODE_RETRIEVAL_QUERY`
+
+ You can pass them as a String or Symbol, in any case (e.g. `:retrieval_query`, `"RETRIEVAL_QUERY"`).
+
+ #### Direct Access via `embeddings_api`
+
+ For more control, you can call the embeddings API directly:
+
+ ```ruby
+ # Single
+ client.embeddings_api.create(input: "Hello", model: "gemini-embedding-001")
+
+ # Batch
+ client.embeddings_api.batch_create(
+   inputs: ["First", "Second", "Third"],
+   model: "gemini-embedding-001",
+   task_type: :clustering
+ )
+ ```
+
+ #### Response Helpers
+
+ The Response object exposes a few helpers for embedding payloads:
+
+ ```ruby
+ response.embedding           # First embedding vector (Array of Floats)
+ response.embeddings          # All embedding vectors (Array of Arrays)
+ response.embedding_dimension # Length of the first embedding vector
+ response.embedding_response? # true if the payload contains embedding data
+ ```
+
+ A complete example is available in `demo/embeddings_demo.rb`.
+
  ### Structured Output with JSON Schema
 
  You can request responses in structured JSON format by specifying a JSON schema:
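
Embedding vectors like those returned above are commonly compared with cosine similarity. A minimal plain-Ruby helper (not part of this gem) for scoring two embedding arrays:

```ruby
# Cosine similarity between two equal-length numeric vectors,
# e.g. two embedding arrays returned by the Embeddings API.
# Returns a value in -1.0..1.0, where 1.0 means "same direction".
def cosine_similarity(a, b)
  raise ArgumentError, "length mismatch" unless a.size == b.size
  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?
  dot / (norm_a * norm_b)
end
```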
@@ -1116,9 +1498,15 @@ The gem includes several demo applications that showcase its functionality:
  - `demo/file_audio_demo.rb` - Audio transcription with large audio files
  - `demo/structured_output_demo.rb` - Structured JSON output with schema
  - `demo/enum_response_demo.rb` - Enum-constrained responses
+ - `demo/thinking_demo.rb` - Thinking feature (Gemini 2.5)
+ - `demo/thinking_gemini3_demo.rb` - Thinking feature (Gemini 3)
  - `demo/document_chat_demo.rb` - Document processing
  - `demo/document_conversation_demo.rb` - Conversation with documents
  - `demo/document_cache_demo.rb` - Document caching
+ - `demo/embeddings_demo.rb` - Text embeddings (single and batch)
+ - `demo/live_text_demo.rb` - Live API text conversation
+ - `demo/live_audio_demo.rb` - Live API audio conversation
+ - `demo/live_function_calling_demo.rb` - Live API function calling
 
  Run the demos with:
 
@@ -1159,6 +1547,12 @@ ruby demo/structured_output_demo.rb
  # Enum-constrained responses
  ruby demo/enum_response_demo.rb
 
+ # Thinking feature (Gemini 2.5)
+ ruby demo/thinking_demo.rb
+
+ # Thinking feature (Gemini 3)
+ ruby demo/thinking_gemini3_demo.rb
+
  # Document processing
  ruby demo/document_chat_demo.rb path/to/document.pdf
 
@@ -1167,6 +1561,9 @@ ruby demo/document_conversation_demo.rb path/to/document.pdf
 
  # Document caching and querying
  ruby demo/document_cache_demo.rb path/to/document.pdf
+
+ # Text embeddings (single and batch)
+ ruby demo/embeddings_demo.rb
  ```
 
  ## Models