ruby-gemini-api 1.0.0 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 92b2d6f35fbff3fda1457ef3244ae730ee99e156909c310b62878540d47d4076
4
- data.tar.gz: 710191ae9ac4f136ea586782be43b8aa18390d74a8ba1d844a9524fc120fdaa4
3
+ metadata.gz: 31487006e959d8d9a755743f6471b54e075a1bc91aa36718274203deb9fda84d
4
+ data.tar.gz: bc7dbbbea933ed2343b2ec1a32d3cf9b03fe41a4494801c47e5d83842bf31603
5
5
  SHA512:
6
- metadata.gz: 31a2f84ff0f7d9d5127fe3cfa2eba75c2395cc4105e5faa6d45b472ccc4dc8e6502f28635f6a7dfce0c071e445010af9a1781c833a18c6f491f55b9e2989957b
7
- data.tar.gz: 499e659d3a3284f461deede5fbcaff16ebb3476ced2c3c5c44d4d5e333d3751e45483ed221a654b0b560651127496933342e0fed5bfea8307de40edab80faa21
6
+ metadata.gz: cc033c4ab711800c56f2f1d8884c38a74359d7c3b66fb66ed39ea03b23b98c3748d08208dc560e3276ac3d6e25660ad3a2c5aef69ef25b523bbd6512a5b5d246
7
+ data.tar.gz: 998ca95babf2803241a9d0a01c00eefe7fcbd37990723ffc9378668bc30f5529a90daa92b20919e5c0e681f68f4e8a5d4206e66520eeec46da9aa18c169fc9be
data/CHANGELOG.md CHANGED
@@ -1,5 +1,32 @@
1
1
  ## [Unreleased]
2
2
 
3
+ ## [1.1.0] - 2026-04-29
4
+
5
+ ### Added
6
+ - Live API support for real-time bidirectional audio/video/text conversations over WebSocket
7
+ - `Gemini::Live::Session` with event-driven API (`:setup_complete`, `:text`, `:audio`, `:tool_call`, `:turn_complete`, `:interrupted`, `:usage_metadata`, `:session_resumption`, `:go_away`, `:close`, `:error`)
8
+ - `Gemini::Live::Configuration` with response modality, voice, system instruction, tools, context-window compression, session resumption, manual VAD, output audio transcription
9
+ - `Gemini::Live::MessageBuilder` for setup, clientContent, realtimeInput, activity start/end, and tool response messages
10
+ - Live API audio demos: `live_audio_demo.rb` (low-latency streaming), `live_audio_simple.rb`
11
+ - Manual VAD (Voice Activity Detection) support via `automatic_activity_detection: false`
12
+ - Live API Function Calling
13
+ - `Session#send_realtime_text(text)` — universal text input via `realtimeInput.text`, required by newer Live models such as `gemini-3.1-flash-live-preview`
14
+ - `MessageBuilder.realtime_text(text)` builder
15
+ - Async (NON_BLOCKING) function call support: `MessageBuilder.tool_response` validates and normalizes the `scheduling` field (`INTERRUPT`, `WHEN_IDLE`, `SILENT`), accepted either inside the response payload or as a top-level shortcut
16
+ - Demos: `live_function_calling_demo.rb` / `live_function_calling_demo_ja.rb`
17
+ - Embeddings API support (`embedContent` and `batchEmbedContents`)
18
+ - `client.embeddings_api.create(input:, ...)` for single embeddings
19
+ - `client.embeddings_api.batch_create(inputs:, ...)` for batch embeddings
20
+ - `client.embed_content(input, ...)` shortcut that auto-routes Array inputs to batch
21
+ - Optional parameters: `task_type` (RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING, QUESTION_ANSWERING, FACT_VERIFICATION, CODE_RETRIEVAL_QUERY), `title` (RETRIEVAL_DOCUMENT only), `output_dimensionality`
22
+ - Default model: `gemini-embedding-001`
23
+ - `Response` helpers for embeddings: `#embedding`, `#embeddings`, `#embedding_dimension`, `#embedding_response?`
24
+ - Demos: `embeddings_demo.rb` / `embeddings_demo_ja.rb`
25
+
26
+ ### Notes
27
+ - Verified Live model compatibility on the `bidiGenerateContent` endpoint: only the native-audio variants and `gemini-3.1-flash-live-preview` are deployed today. The latter requires `realtimeInput.text` (i.e., `Session#send_realtime_text`) and `AUDIO` modality. The `gemini-2.5-flash-live-preview` model name listed in the public tools docs is not yet deployed.
28
+ - `MessageBuilder.realtime_input` (legacy `mediaChunks` path) is documented as deprecated by the upstream API; prefer `realtime_text` going forward.
29
+
3
30
  ## [1.0.0] - 2026-01-28
4
31
 
5
32
  ### Added
data/README.md CHANGED
@@ -30,6 +30,8 @@ This project is inspired by and pays homage to [ruby-openai](https://github.com/
30
30
  - Structured output with JSON schema and enum constraints
31
31
  - Document processing (PDFs and other formats)
32
32
  - Context caching for efficient processing
33
+ - Text embeddings (single and batch) with task type, title, and output dimensionality control
34
+ - Live API: real-time bidirectional conversations with text/audio/video and function calling (sync and async)
33
35
 
34
36
  ### Function Calling
35
37
 
@@ -992,6 +994,275 @@ end
992
994
 
993
995
  For a complete example of context caching, check out the `demo/document_cache_demo.rb` file.
994
996
 
997
+ ### Live API (Real-time Conversations)
998
+
999
+ The Gemini Live API provides bidirectional WebSocket-based real-time conversations with audio, video, and text support. The library wraps the protocol behind an event-driven `Gemini::Live::Session`.
1000
+
1001
+ #### Basic Audio Conversation
1002
+
1003
+ The default model (`gemini-2.5-flash-native-audio-preview-12-2025`) responds with audio. You receive Base64-encoded 24 kHz 16-bit PCM chunks via the `:audio` event.
1004
+
1005
+ ```ruby
1006
+ require 'gemini'
1007
+ require 'base64'
1008
+
1009
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
1010
+
1011
+ client.live.connect(
1012
+ response_modality: "AUDIO",
1013
+ voice_name: "Kore",
1014
+ system_instruction: "You are a helpful assistant. Be brief."
1015
+ ) do |session|
1016
+ setup_complete = false
1017
+ audio_chunks = []
1018
+
1019
+ session.on(:setup_complete) { setup_complete = true }
1020
+ session.on(:audio) { |data, _mime| audio_chunks << Base64.decode64(data) }
1021
+ session.on(:turn_complete) { puts "[#{audio_chunks.sum(&:bytesize)} bytes]" }
1022
+ session.on(:error) { |e| puts "Error: #{e.message}" }
1023
+
1024
+ sleep 0.05 until setup_complete
1025
+
1026
+ session.send_realtime_text("What is the capital of Japan?")
1027
+ sleep 8
1028
+ end
1029
+ ```
1030
+
1031
+ For text-only responses, see the note below about Live model availability.
1032
+
1033
+ #### Function Calling (Synchronous)
1034
+
1035
+ The Live API supports function calling. Define your tools, register a `:tool_call` handler, and reply with `session.send_tool_response`.
1036
+
1037
+ > **Note on Live model input format**
1038
+ > Newer Live models such as `gemini-3.1-flash-live-preview` reject the
1039
+ > legacy `clientContent.turns[]` payload that older models (including the
1040
+ > native-audio variants) accept. Use `session.send_realtime_text(...)`
1041
+ > instead of `session.send_text(...)`, which emits the universal
1042
+ > `realtimeInput.text` form and works on every currently-deployed Live
1043
+ > model. The `gemini-2.5-flash-live-preview` model name listed in the
1044
+ > public tools docs is not deployed on the `bidiGenerateContent` endpoint
1045
+ > at the time of writing.
1046
+
1047
+ ```ruby
1048
+ require 'base64'
1049
+
1050
+ tools = [
1051
+ {
1052
+ functionDeclarations: [
1053
+ {
1054
+ name: "get_weather",
1055
+ description: "Get the current weather for a location",
1056
+ parameters: {
1057
+ type: "object",
1058
+ properties: {
1059
+ location: { type: "string", description: "City name" }
1060
+ },
1061
+ required: ["location"]
1062
+ }
1063
+ }
1064
+ ]
1065
+ }
1066
+ ]
1067
+
1068
+ audio_chunks = []
1069
+
1070
+ client.live.connect(
1071
+ response_modality: "AUDIO",
1072
+ voice_name: "Kore",
1073
+ tools: tools,
1074
+ system_instruction: "Use the available functions when asked about weather."
1075
+ ) do |session|
1076
+ session.on(:audio) { |data, _mime| audio_chunks << Base64.decode64(data) }
1077
+
1078
+ session.on(:tool_call) do |function_calls|
1079
+ responses = function_calls.map do |call|
1080
+ result = case call[:name]
1081
+ when "get_weather"
1082
+ { temperature: 22, condition: "sunny", location: call[:args]["location"] }
1083
+ end
1084
+ { id: call[:id], name: call[:name], response: result }
1085
+ end
1086
+ session.send_tool_response(responses)
1087
+ end
1088
+
1089
+ sleep 0.5 # wait for setup
1090
+ session.send_realtime_text("What's the weather in Tokyo?")
1091
+ sleep 18
1092
+ end
1093
+
1094
+ # audio_chunks now contains 24 kHz, 16-bit PCM mono audio of the spoken reply.
1095
+ ```
1096
+
1097
+ A complete example is in `demo/live_function_calling_demo.rb`.
1098
+
1099
+ #### Function Calling (Asynchronous / NON_BLOCKING)
1100
+
1101
+ `gemini-2.5-flash-live-preview` supports asynchronous function calls. Mark a function declaration with `behavior: "NON_BLOCKING"` so the model can keep talking while the call runs, then control how the result is delivered back via `scheduling`.
1102
+
1103
+ ```ruby
1104
+ tools = [
1105
+ {
1106
+ functionDeclarations: [
1107
+ {
1108
+ name: "fetch_long_running_data",
1109
+ behavior: "NON_BLOCKING",
1110
+ description: "Slow data lookup",
1111
+ parameters: { type: "object", properties: {} }
1112
+ }
1113
+ ]
1114
+ }
1115
+ ]
1116
+
1117
+ session.on(:tool_call) do |function_calls|
1118
+ responses = function_calls.map do |call|
1119
+ {
1120
+ id: call[:id],
1121
+ name: call[:name],
1122
+ response: { result: "data ready" },
1123
+ scheduling: "INTERRUPT" # or "WHEN_IDLE", "SILENT"
1124
+ }
1125
+ end
1126
+ session.send_tool_response(responses)
1127
+ end
1128
+ ```
1129
+
1130
+ `scheduling` can also be placed inside the `response:` hash directly. Valid values: `INTERRUPT`, `WHEN_IDLE`, `SILENT`. The library validates and uppercases the value automatically; an unknown value raises `ArgumentError`.
1131
+
1132
+ #### Built-in Tools
1133
+
1134
+ Google Search grounding is supported in the Live API:
1135
+
1136
+ ```ruby
1137
+ client.live.connect(
1138
+ model: "gemini-2.5-flash-live-preview",
1139
+ tools: [{ google_search: {} }]
1140
+ ) do |session|
1141
+ # ...
1142
+ end
1143
+ ```
1144
+
1145
+ #### Supported Live API Models for Tools
1146
+
1147
+ The public Live API tools docs list:
1148
+
1149
+ | Model | Sync Function Calling | Async (NON_BLOCKING) | Google Search |
1150
+ |---|---|---|---|
1151
+ | `gemini-2.5-flash-live-preview` | ✓ | ✓ | ✓ |
1152
+ | `gemini-3.1-flash-live-preview` | ✓ | — | ✓ |
1153
+
1154
+ In practice, on the `bidiGenerateContent` endpoint as of writing:
1155
+
1156
+ - `gemini-3.1-flash-live-preview` is deployed and works with **AUDIO** response modality + tools, **but only when text input is sent via `session.send_realtime_text(...)`** (i.e., `realtimeInput.text`). It rejects the legacy `clientContent.turns[]` payload.
1157
+ - `gemini-2.5-flash-native-audio-preview-12-2025` (the library default) is deployed and accepts both `send_realtime_text` and `send_text` (legacy `clientContent.turns[]`).
1158
+ - `gemini-2.5-flash-live-preview` from the docs table is **not yet deployed**.
1159
+
1160
+ Once a TEXT-modality-capable Live model ships, the same code works with `response_modality: "TEXT"` and the `voice_name:` argument removed.
1161
+
1162
+ Demos available:
1163
+
1164
+ - `demo/live_text_demo.rb` - Live API text conversation
1165
+ - `demo/live_audio_demo.rb` - Live API audio conversation
1166
+ - `demo/live_function_calling_demo.rb` - Live API function calling
1167
+
1168
+ ### Embeddings
1169
+
1170
+ You can generate text embeddings using the Gemini Embeddings API. Embeddings are vector representations of text that can be used for semantic similarity, classification, clustering, retrieval, and more.
1171
+
1172
+ #### Single Embedding
1173
+
1174
+ ```ruby
1175
+ require 'gemini'
1176
+
1177
+ client = Gemini::Client.new(ENV['GEMINI_API_KEY'])
1178
+
1179
+ response = client.embed_content(
1180
+ "What is the meaning of life?",
1181
+ model: "gemini-embedding-001"
1182
+ )
1183
+
1184
+ if response.success?
1185
+ puts "Dimension: #{response.embedding_dimension}"
1186
+ puts "Vector (first 5 values): #{response.embedding.first(5).inspect}"
1187
+ end
1188
+ ```
1189
+
1190
+ #### Batch Embeddings
1191
+
1192
+ Pass an Array of strings to embed multiple texts in a single batch request (uses `batchEmbedContents` under the hood):
1193
+
1194
+ ```ruby
1195
+ response = client.embed_content(
1196
+ [
1197
+ "I love programming in Ruby.",
1198
+ "Rubies are red gemstones.",
1199
+ "Python is also a programming language."
1200
+ ],
1201
+ model: "gemini-embedding-001",
1202
+ task_type: :semantic_similarity
1203
+ )
1204
+
1205
+ response.embeddings.each_with_index do |values, i|
1206
+ puts "Embedding #{i}: dimension=#{values.size}"
1207
+ end
1208
+ ```
1209
+
1210
+ #### Task Type, Title, and Output Dimensionality
1211
+
1212
+ You can specify a `task_type` to optimize the embedding for a particular downstream task. When `task_type: :retrieval_document` is used, you may also pass a `title`. Use `output_dimensionality` to truncate the vector length (recommended values: 768, 1536, 3072).
1213
+
1214
+ ```ruby
1215
+ response = client.embed_content(
1216
+ "Ruby is a dynamic, open-source programming language.",
1217
+ model: "gemini-embedding-001",
1218
+ task_type: :retrieval_document,
1219
+ title: "Ruby Overview",
1220
+ output_dimensionality: 768
1221
+ )
1222
+ ```
1223
+
1224
+ Supported task types:
1225
+
1226
+ - `RETRIEVAL_QUERY`
1227
+ - `RETRIEVAL_DOCUMENT`
1228
+ - `SEMANTIC_SIMILARITY`
1229
+ - `CLASSIFICATION`
1230
+ - `CLUSTERING`
1231
+ - `QUESTION_ANSWERING`
1232
+ - `FACT_VERIFICATION`
1233
+ - `CODE_RETRIEVAL_QUERY`
1234
+
1235
+ You can pass them as a String, Symbol, or in any case (e.g. `:retrieval_query`, `"RETRIEVAL_QUERY"`).
1236
+
1237
+ #### Direct Access via `embeddings_api`
1238
+
1239
+ For more control, you can call the embeddings API directly:
1240
+
1241
+ ```ruby
1242
+ # Single
1243
+ client.embeddings_api.create(input: "Hello", model: "gemini-embedding-001")
1244
+
1245
+ # Batch
1246
+ client.embeddings_api.batch_create(
1247
+ inputs: ["First", "Second", "Third"],
1248
+ model: "gemini-embedding-001",
1249
+ task_type: :clustering
1250
+ )
1251
+ ```
1252
+
1253
+ #### Response Helpers
1254
+
1255
+ The Response object exposes a few helpers for embedding payloads:
1256
+
1257
+ ```ruby
1258
+ response.embedding # First embedding values (Array of Floats)
1259
+ response.embeddings # All embedding value arrays (Array of Arrays)
1260
+ response.embedding_dimension # Length of the first embedding vector
1261
+ response.embedding_response? # true if the payload contains embedding data
1262
+ ```
1263
+
1264
+ A complete example is available in `demo/embeddings_demo.rb`.
1265
+
995
1266
  ### Structured Output with JSON Schema
996
1267
 
997
1268
  You can request responses in structured JSON format by specifying a JSON schema:
@@ -1232,6 +1503,10 @@ The gem includes several demo applications that showcase its functionality:
1232
1503
  - `demo/document_chat_demo.rb` - Document processing
1233
1504
  - `demo/document_conversation_demo.rb` - Conversation with documents
1234
1505
  - `demo/document_cache_demo.rb` - Document caching
1506
+ - `demo/embeddings_demo.rb` - Text embeddings (single and batch)
1507
+ - `demo/live_text_demo.rb` - Live API text conversation
1508
+ - `demo/live_audio_demo.rb` - Live API audio conversation
1509
+ - `demo/live_function_calling_demo.rb` - Live API function calling
1235
1510
 
1236
1511
  Run the demos with:
1237
1512
 
@@ -1286,6 +1561,9 @@ ruby demo/document_conversation_demo.rb path/to/document.pdf
1286
1561
 
1287
1562
  # Document caching and querying
1288
1563
  ruby demo/document_cache_demo.rb path/to/document.pdf
1564
+
1565
+ # Text embeddings (single and batch)
1566
+ ruby demo/embeddings_demo.rb
1289
1567
  ```
1290
1568
 
1291
1569
  ## Models
data/lib/gemini/client.rb CHANGED
@@ -70,6 +70,16 @@ module Gemini
70
70
  @cached_content ||= Gemini::CachedContent.new(client: self)
71
71
  end
72
72
 
73
+ # Live APIアクセサ
74
+ def live
75
+ @live ||= Gemini::Live.new(client: self)
76
+ end
77
+
78
+ # Embeddings APIアクセサ
79
+ def embeddings_api
80
+ @embeddings_api ||= Gemini::Embeddings.new(client: self)
81
+ end
82
+
73
83
  def reset_headers
74
84
  @extra_headers = {}
75
85
  end
@@ -112,10 +122,25 @@ module Gemini
112
122
  end
113
123
  end
114
124
 
115
- # Method corresponding to OpenAI's embeddings
125
+ # Generate embeddings for the given input.
126
+ # input can be a String (single embed) or Array of Strings (batch embed).
127
+ # Supports task_type, title (RETRIEVAL_DOCUMENT only), and output_dimensionality.
128
+ def embed_content(input, model: Gemini::Embeddings::DEFAULT_MODEL, task_type: nil,
129
+ title: nil, output_dimensionality: nil, **parameters)
130
+ embeddings_api.create(
131
+ input: input,
132
+ model: model,
133
+ task_type: task_type,
134
+ title: title,
135
+ output_dimensionality: output_dimensionality,
136
+ **parameters
137
+ )
138
+ end
139
+
140
+ # Method corresponding to OpenAI's embeddings (kept for compatibility)
116
141
  def embeddings(parameters: {})
117
- model = parameters.delete(:model) || "text-embedding-model"
118
- path = "models/#{model}:embedContent"
142
+ model = parameters.delete(:model) || Gemini::Embeddings::DEFAULT_MODEL
143
+ path = "models/#{model.to_s.delete_prefix("models/")}:embedContent"
119
144
  response = json_post(path: path, parameters: parameters)
120
145
  Gemini::Response.new(response)
121
146
  end
@@ -1,27 +1,118 @@
1
1
  module Gemini
2
2
  class Embeddings
3
+ DEFAULT_MODEL = "gemini-embedding-001".freeze
4
+
5
+ VALID_TASK_TYPES = %w[
6
+ RETRIEVAL_QUERY
7
+ RETRIEVAL_DOCUMENT
8
+ SEMANTIC_SIMILARITY
9
+ CLASSIFICATION
10
+ CLUSTERING
11
+ QUESTION_ANSWERING
12
+ FACT_VERIFICATION
13
+ CODE_RETRIEVAL_QUERY
14
+ ].freeze
15
+
3
16
  def initialize(client:)
4
17
  @client = client
5
18
  end
6
19
 
7
- def create(input:, model: "text-embedding-model", **parameters)
8
- content = case input
9
- when String
10
- { parts: [{ text: input }] }
11
- when Array
12
- { parts: input.map { |text| { text: text.to_s } } }
13
- else
14
- { parts: [{ text: input.to_s }] }
15
- end
16
-
17
- payload = {
18
- content: content
19
- }.merge(parameters)
20
-
21
- @client.json_post(
22
- path: "models/#{model}:embedContent",
20
+ # Generate an embedding for a single content, or batch when input is an Array
21
+ def create(input:, model: DEFAULT_MODEL, task_type: nil, title: nil,
22
+ output_dimensionality: nil, **parameters)
23
+ if input.is_a?(Array)
24
+ return batch_create(
25
+ inputs: input,
26
+ model: model,
27
+ task_type: task_type,
28
+ title: title,
29
+ output_dimensionality: output_dimensionality,
30
+ **parameters
31
+ )
32
+ end
33
+
34
+ payload = build_embed_payload(
35
+ input: input,
36
+ task_type: task_type,
37
+ title: title,
38
+ output_dimensionality: output_dimensionality
39
+ ).merge(parameters)
40
+
41
+ response = @client.json_post(
42
+ path: "models/#{normalize_model(model)}:embedContent",
23
43
  parameters: payload
24
44
  )
45
+ Gemini::Response.new(response)
46
+ end
47
+
48
+ # Generate embeddings for multiple inputs in a single batch request
49
+ def batch_create(inputs:, model: DEFAULT_MODEL, task_type: nil, title: nil,
50
+ output_dimensionality: nil, **parameters)
51
+ requests = inputs.map do |input|
52
+ req = build_embed_payload(
53
+ input: input,
54
+ task_type: task_type,
55
+ title: title,
56
+ output_dimensionality: output_dimensionality
57
+ )
58
+ req[:model] = "models/#{normalize_model(model)}"
59
+ req
60
+ end
61
+
62
+ payload = { requests: requests }.merge(parameters)
63
+
64
+ response = @client.json_post(
65
+ path: "models/#{normalize_model(model)}:batchEmbedContents",
66
+ parameters: payload
67
+ )
68
+ Gemini::Response.new(response)
69
+ end
70
+
71
+ private
72
+
73
+ def build_embed_payload(input:, task_type:, title:, output_dimensionality:)
74
+ payload = { content: format_content(input) }
75
+
76
+ if task_type
77
+ validate_task_type!(task_type)
78
+ payload[:taskType] = task_type.to_s.upcase
79
+ end
80
+
81
+ payload[:title] = title if title
82
+ payload[:outputDimensionality] = output_dimensionality if output_dimensionality
83
+
84
+ payload
85
+ end
86
+
87
+ def format_content(input)
88
+ case input
89
+ when String
90
+ { parts: [{ text: input }] }
91
+ when Hash
92
+ if input.key?(:parts) || input.key?("parts")
93
+ input
94
+ elsif input.key?(:text) || input.key?("text") ||
95
+ input.key?(:inline_data) || input.key?("inline_data") ||
96
+ input.key?(:file_data) || input.key?("file_data")
97
+ { parts: [input] }
98
+ else
99
+ input
100
+ end
101
+ else
102
+ { parts: [{ text: input.to_s }] }
103
+ end
104
+ end
105
+
106
+ def normalize_model(model)
107
+ model_str = model.to_s
108
+ model_str.start_with?("models/") ? model_str.delete_prefix("models/") : model_str
109
+ end
110
+
111
+ def validate_task_type!(task_type)
112
+ task_type_str = task_type.to_s.upcase
113
+ unless VALID_TASK_TYPES.include?(task_type_str)
114
+ raise ArgumentError, "task_type must be one of: #{VALID_TASK_TYPES.join(', ')}"
115
+ end
25
116
  end
26
117
  end
27
- end
118
+ end
@@ -0,0 +1,65 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Gemini
4
+ class Live
5
+ # Configuration class for Live API sessions
6
+ class Configuration
7
+ attr_accessor :model, :response_modality, :voice_name,
8
+ :system_instruction, :tools,
9
+ :context_window_compression, :session_resumption,
10
+ :automatic_activity_detection,
11
+ :media_resolution, :output_audio_transcription
12
+
13
+ VALID_MODALITIES = %w[TEXT AUDIO].freeze
14
+ VALID_VOICES = %w[Puck Charon Kore Fenrir Aoede Leda Orus Zephyr].freeze
15
+ # NOTE: gemini-2.5-flash-live-preview is listed in the public Live API
16
+ # tools documentation as the recommended model, but is not currently
17
+ # deployed (returns "model not found" on bidiGenerateContent). The
18
+ # native-audio preview model is the only Live model on which function
19
+ # calling currently works in practice (with AUDIO modality).
20
+ DEFAULT_MODEL = "gemini-2.5-flash-native-audio-preview-12-2025"
21
+
22
+ def initialize(
23
+ model: DEFAULT_MODEL,
24
+ response_modality: "TEXT",
25
+ voice_name: nil,
26
+ system_instruction: nil,
27
+ tools: nil,
28
+ context_window_compression: nil,
29
+ session_resumption: nil,
30
+ automatic_activity_detection: true,
31
+ media_resolution: nil,
32
+ output_audio_transcription: false
33
+ )
34
+ @model = model
35
+ @response_modality = validate_modality(response_modality)
36
+ @voice_name = validate_voice(voice_name)
37
+ @system_instruction = system_instruction
38
+ @tools = tools
39
+ @context_window_compression = context_window_compression
40
+ @session_resumption = session_resumption
41
+ @automatic_activity_detection = automatic_activity_detection
42
+ @media_resolution = media_resolution
43
+ @output_audio_transcription = output_audio_transcription
44
+ end
45
+
46
+ private
47
+
48
+ def validate_modality(modality)
49
+ modality = modality.to_s.upcase
50
+ unless VALID_MODALITIES.include?(modality)
51
+ raise ArgumentError, "Invalid modality: #{modality}. Must be one of: #{VALID_MODALITIES.join(', ')}"
52
+ end
53
+ modality
54
+ end
55
+
56
+ def validate_voice(voice)
57
+ return nil if voice.nil?
58
+ unless VALID_VOICES.include?(voice)
59
+ raise ArgumentError, "Invalid voice: #{voice}. Must be one of: #{VALID_VOICES.join(', ')}"
60
+ end
61
+ voice
62
+ end
63
+ end
64
+ end
65
+ end
@@ -0,0 +1,83 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "websocket-client-simple"
4
+ require "json"
5
+
6
+ module Gemini
7
+ class Live
8
+ # WebSocket connection manager for Live API
9
+ class Connection
10
+ WEBSOCKET_BASE_URL = "wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent"
11
+
12
+ attr_reader :connected
13
+
14
+ def initialize(api_key:, on_message:, on_open:, on_error:, on_close:)
15
+ @api_key = api_key
16
+ @on_message = on_message
17
+ @on_open = on_open
18
+ @on_error = on_error
19
+ @on_close = on_close
20
+ @ws = nil
21
+ @connected = false
22
+ @mutex = Mutex.new
23
+ end
24
+
25
+ def connect
26
+ url = "#{WEBSOCKET_BASE_URL}?key=#{@api_key}"
27
+
28
+ # Store callbacks in local variables for closure
29
+ on_message_callback = @on_message
30
+ on_open_callback = @on_open
31
+ on_error_callback = @on_error
32
+ on_close_callback = @on_close
33
+ connection = self
34
+
35
+ @ws = WebSocket::Client::Simple.connect(url) do |ws|
36
+ ws.on :open do
37
+ connection.instance_variable_set(:@connected, true)
38
+ on_open_callback.call if on_open_callback
39
+ end
40
+
41
+ ws.on :message do |msg|
42
+ on_message_callback.call(msg.data) if on_message_callback
43
+ end
44
+
45
+ ws.on :error do |e|
46
+ on_error_callback.call(e) if on_error_callback
47
+ end
48
+
49
+ ws.on :close do |e|
50
+ connection.instance_variable_set(:@connected, false)
51
+ code = e.respond_to?(:code) ? e.code : nil
52
+ reason = e.respond_to?(:reason) ? e.reason : nil
53
+ on_close_callback.call(code, reason) if on_close_callback
54
+ end
55
+ end
56
+
57
+ self
58
+ end
59
+
60
+ def send(data)
61
+ return false unless @ws && @connected
62
+
63
+ @mutex.synchronize do
64
+ json_data = data.is_a?(String) ? data : data.to_json
65
+ @ws.send(json_data)
66
+ end
67
+ true
68
+ rescue StandardError => e
69
+ @on_error&.call(e)
70
+ false
71
+ end
72
+
73
+ def close
74
+ @ws&.close
75
+ @connected = false
76
+ end
77
+
78
+ def connected?
79
+ @connected && @ws && !@ws.closed?
80
+ end
81
+ end
82
+ end
83
+ end
@@ -0,0 +1,217 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Gemini
4
+ class Live
5
+ # Helper class to build Live API messages
6
+ class MessageBuilder
7
+ VALID_SCHEDULING = %w[INTERRUPT WHEN_IDLE SILENT].freeze
8
+
9
+ class << self
10
+ # Build setup message from configuration
11
+ def setup(config)
12
+ message = {
13
+ setup: {
14
+ model: normalize_model_name(config.model)
15
+ }
16
+ }
17
+
18
+ generation_config = build_generation_config(config)
19
+ message[:setup][:generationConfig] = generation_config unless generation_config.empty?
20
+
21
+ # System instruction
22
+ if config.system_instruction
23
+ message[:setup][:systemInstruction] = {
24
+ parts: [{ text: config.system_instruction }]
25
+ }
26
+ end
27
+
28
+ # Tools configuration
29
+ message[:setup][:tools] = config.tools if config.tools
30
+
31
+ # Context window compression
32
+ if config.context_window_compression
33
+ message[:setup][:contextWindowCompression] = config.context_window_compression
34
+ end
35
+
36
+ # Session resumption
37
+ if config.session_resumption
38
+ message[:setup][:sessionResumption] = config.session_resumption
39
+ end
40
+
41
+ # VAD (Voice Activity Detection) settings
42
+ unless config.automatic_activity_detection
43
+ message[:setup][:realtimeInputConfig] = {
44
+ automaticActivityDetection: {
45
+ disabled: true
46
+ }
47
+ }
48
+ end
49
+
50
+ message
51
+ end
52
+
53
+ # Build client content message (text)
54
+ def client_content(text:, turn_complete: true, role: "user")
55
+ {
56
+ clientContent: {
57
+ turns: [
58
+ {
59
+ role: role,
60
+ parts: [{ text: text }]
61
+ }
62
+ ],
63
+ turnComplete: turn_complete
64
+ }
65
+ }
66
+ end
67
+
68
+ # Build client content with multiple parts
69
+ def client_content_parts(parts:, turn_complete: true, role: "user")
70
+ {
71
+ clientContent: {
72
+ turns: [
73
+ {
74
+ role: role,
75
+ parts: parts
76
+ }
77
+ ],
78
+ turnComplete: turn_complete
79
+ }
80
+ }
81
+ end
82
+
83
+ # Build realtime input message (audio/video) using the legacy
84
+ # mediaChunks field. NOTE: mediaChunks is deprecated by the API in
85
+ # favor of the dedicated audio/video fields built by realtime_audio
86
+ # and realtime_video. Kept for backward compatibility with older
87
+ # Live models that still accept it.
88
+ def realtime_input(audio_data: nil, video_data: nil, mime_type:)
89
+ data = audio_data || video_data
90
+ {
91
+ realtimeInput: {
92
+ mediaChunks: [
93
+ {
94
+ mimeType: mime_type,
95
+ data: data
96
+ }
97
+ ]
98
+ }
99
+ }
100
+ end
101
+
102
+ # Build a realtime text input message. This is the universal
103
+ # text-input form for the Live API and is required by newer Live
104
+ # models such as gemini-3.1-flash-live-preview, which reject the
105
+ # turn-based clientContent payload.
106
+ def realtime_text(text)
107
+ { realtimeInput: { text: text.to_s } }
108
+ end
109
+
110
+ # Build activity start message (for manual VAD)
111
+ def activity_start
112
+ {
113
+ realtimeInput: {
114
+ activityStart: {}
115
+ }
116
+ }
117
+ end
118
+
119
+ # Build activity end message (for manual VAD)
120
+ def activity_end
121
+ {
122
+ realtimeInput: {
123
+ activityEnd: {}
124
+ }
125
+ }
126
+ end
127
+
128
+ # Build tool response message.
129
+ #
130
+ # Each function response hash supports:
131
+ # :id - The function call id from the server
132
+ # :name - The function name
133
+ # :response - The function result (Hash or scalar). When using
134
+ # NON_BLOCKING (async) function calls, include
135
+ # `scheduling: "INTERRUPT" | "WHEN_IDLE" | "SILENT"`
136
+ # inside the response hash.
137
+ # :scheduling - (optional) Top-level shortcut. When provided,
138
+ # it is merged into the response hash as
139
+ # `response[:scheduling]`. Accepts Symbol or String.
140
+ #
141
+ # Raises ArgumentError if scheduling is not one of the valid values.
142
+ def tool_response(function_responses)
143
+ {
144
+ toolResponse: {
145
+ functionResponses: function_responses.map { |resp| build_function_response(resp) }
146
+ }
147
+ }
148
+ end
149
+
150
+ private
151
+
152
+ def build_function_response(resp)
153
+ response_payload =
154
+ case resp[:response]
155
+ when Hash then resp[:response].dup
156
+ when nil then {}
157
+ else { result: resp[:response] }
158
+ end
159
+
160
+ if (top_level_scheduling = resp[:scheduling])
161
+ response_payload[:scheduling] = normalize_scheduling(top_level_scheduling)
162
+ elsif (sched = response_payload[:scheduling] || response_payload["scheduling"])
163
+ normalized = normalize_scheduling(sched)
164
+ response_payload.delete("scheduling")
165
+ response_payload[:scheduling] = normalized
166
+ end
167
+
168
+ { id: resp[:id], name: resp[:name], response: response_payload }
169
+ end
170
+
171
+ def normalize_scheduling(value)
172
+ value_str = value.to_s.upcase
173
+ unless VALID_SCHEDULING.include?(value_str)
174
+ raise ArgumentError,
175
+ "scheduling must be one of: #{VALID_SCHEDULING.join(', ')} (got #{value.inspect})"
176
+ end
177
+ value_str
178
+ end
179
+
180
+
181
+ def normalize_model_name(model)
182
+ model.start_with?("models/") ? model : "models/#{model}"
183
+ end
184
+
185
+ def build_generation_config(config)
186
+ generation_config = {}
187
+
188
+ # Response modality
189
+ generation_config[:responseModalities] = [config.response_modality]
190
+
191
+ # Speech/Voice configuration for AUDIO modality
192
+ if config.response_modality == "AUDIO" && config.voice_name
193
+ generation_config[:speechConfig] = {
194
+ voiceConfig: {
195
+ prebuiltVoiceConfig: {
196
+ voiceName: config.voice_name
197
+ }
198
+ }
199
+ }
200
+ end
201
+
202
+ # Media resolution
203
+ if config.media_resolution
204
+ generation_config[:mediaResolution] = config.media_resolution
205
+ end
206
+
207
+ # Output audio transcription
208
+ if config.output_audio_transcription
209
+ generation_config[:outputAudioTranscription] = {}
210
+ end
211
+
212
+ generation_config
213
+ end
214
+ end
215
+ end
216
+ end
217
+ end
@@ -0,0 +1,223 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "json"
4
+ require "base64"
5
+
6
+ module Gemini
7
+ class Live
8
+ # Live API session manager
9
+ class Session
10
+ attr_reader :configuration, :last_resumption_token, :usage_metadata
11
+
12
+ def initialize(api_key:, configuration:)
13
+ @api_key = api_key
14
+ @configuration = configuration
15
+ @event_handlers = Hash.new { |h, k| h[k] = [] }
16
+ @connected = false
17
+ @setup_complete = false
18
+ @last_resumption_token = nil
19
+ @usage_metadata = nil
20
+ @connection = nil
21
+
22
+ setup_connection
23
+ end
24
+
25
+ # Register event handler
26
+ # Supported events:
27
+ # :setup_complete - Session setup completed
28
+ # :text - Text response received (text)
29
+ # :audio - Audio data received (base64_data, mime_type)
30
+ # :data - Other inline data received (base64_data, mime_type)
31
+ # :tool_call - Tool call requested (function_calls)
32
+ # :interrupted - User interrupted the model
33
+ # :turn_complete - Model turn completed
34
+ # :generation_complete - Generation completed
35
+ # :usage_metadata - Token usage info received (metadata)
36
+ # :session_resumption - Session resumption token updated (update)
37
+ # :go_away - Connection will close soon (info)
38
+ # :error - Error occurred (error)
39
+ # :close - Connection closed (code, reason)
40
+ def on(event, &block)
41
+ @event_handlers[event.to_sym] << block
42
+ self
43
+ end
44
+
45
+ # Send text message via clientContent.turns. This is the legacy form
46
+ # used by native-audio Live models. Newer models such as
47
+ # gemini-3.1-flash-live-preview reject this payload — use
48
+ # #send_realtime_text instead, which works on every Live model.
49
+ def send_text(text, turn_complete: true)
50
+ ensure_setup_complete!
51
+ message = MessageBuilder.client_content(
52
+ text: text,
53
+ turn_complete: turn_complete
54
+ )
55
+ @connection.send(message)
56
+ end
57
+
58
+ # Send text input via realtimeInput.text (universal form).
59
+ # Works with every currently-deployed Live model, including
60
+ # gemini-3.1-flash-live-preview and native-audio variants.
61
+ def send_realtime_text(text)
62
+ ensure_setup_complete!
63
+ @connection.send(MessageBuilder.realtime_text(text))
64
+ end
65
+
66
+ # Send audio data (Base64 encoded PCM)
67
+ def send_audio(audio_data, mime_type: "audio/pcm;rate=16000")
68
+ ensure_setup_complete!
69
+ encoded_data = audio_data.is_a?(String) && audio_data.encoding == Encoding::BINARY ?
70
+ Base64.strict_encode64(audio_data) : audio_data
71
+ message = MessageBuilder.realtime_input(
72
+ audio_data: encoded_data,
73
+ mime_type: mime_type
74
+ )
75
+ @connection.send(message)
76
+ end
77
+
78
+ # Send video/image data (Base64 encoded)
79
+ def send_video(image_data, mime_type: "image/jpeg")
80
+ ensure_setup_complete!
81
+ encoded_data = image_data.is_a?(String) && image_data.encoding == Encoding::BINARY ?
82
+ Base64.strict_encode64(image_data) : image_data
83
+ message = MessageBuilder.realtime_input(
84
+ video_data: encoded_data,
85
+ mime_type: mime_type
86
+ )
87
+ @connection.send(message)
88
+ end
89
+
90
+ # Send tool response
91
+ def send_tool_response(function_responses)
92
+ ensure_setup_complete!
93
+ message = MessageBuilder.tool_response(function_responses)
94
+ @connection.send(message)
95
+ end
96
+
97
+ # Manual VAD control - signal activity start
98
+ def activity_start
99
+ ensure_setup_complete!
100
+ @connection.send(MessageBuilder.activity_start)
101
+ end
102
+
103
+ # Manual VAD control - signal activity end
104
+ def activity_end
105
+ ensure_setup_complete!
106
+ @connection.send(MessageBuilder.activity_end)
107
+ end
108
+
109
+ # Close the session
110
+ def close
111
+ @connection&.close
112
+ @connected = false
113
+ @setup_complete = false
114
+ end
115
+
116
+ def connected?
117
+ @connected && @connection&.connected?
118
+ end
119
+
120
+ def setup_complete?
121
+ @setup_complete
122
+ end
123
+
124
+ private
125
+
126
+ def setup_connection
127
+ @connection = Connection.new(
128
+ api_key: @api_key,
129
+ on_message: method(:handle_message),
130
+ on_open: method(:handle_open),
131
+ on_error: method(:handle_error),
132
+ on_close: method(:handle_close)
133
+ )
134
+ @connection.connect
135
+ @connected = true
136
+ end
137
+
138
+ def handle_open
139
+ # Send setup message immediately after connection opens
140
+ setup_message = MessageBuilder.setup(@configuration)
141
+ @connection.send(setup_message)
142
+ end
143
+
144
+ def handle_message(data)
145
+ parsed = JSON.parse(data, symbolize_names: true)
146
+
147
+ if parsed[:setupComplete]
148
+ @setup_complete = true
149
+ emit(:setup_complete)
150
+ elsif parsed[:serverContent]
151
+ handle_server_content(parsed[:serverContent])
152
+ elsif parsed[:toolCall]
153
+ emit(:tool_call, parsed[:toolCall][:functionCalls])
154
+ elsif parsed[:usageMetadata]
155
+ @usage_metadata = parsed[:usageMetadata]
156
+ emit(:usage_metadata, parsed[:usageMetadata])
157
+ elsif parsed[:sessionResumptionUpdate]
158
+ handle_session_resumption(parsed[:sessionResumptionUpdate])
159
+ elsif parsed[:goAway]
160
+ emit(:go_away, parsed[:goAway])
161
+ end
162
+ rescue JSON::ParserError => e
163
+ emit(:error, e)
164
+ end
165
+
166
+ def handle_server_content(content)
167
+ # Check for interruption
168
+ if content[:interrupted]
169
+ emit(:interrupted)
170
+ return
171
+ end
172
+
173
+ # Check for generation complete
174
+ if content[:generationComplete]
175
+ emit(:generation_complete)
176
+ end
177
+
178
+ # Process model turn
179
+ model_turn = content[:modelTurn]
180
+ if model_turn
181
+ model_turn[:parts]&.each do |part|
182
+ if part[:text]
183
+ emit(:text, part[:text])
184
+ elsif part[:inlineData]
185
+ inline = part[:inlineData]
186
+ if inline[:mimeType]&.start_with?("audio/")
187
+ emit(:audio, inline[:data], inline[:mimeType])
188
+ else
189
+ emit(:data, inline[:data], inline[:mimeType])
190
+ end
191
+ end
192
+ end
193
+ end
194
+
195
+ # Check for turn complete
196
+ emit(:turn_complete) if content[:turnComplete]
197
+ end
198
+
199
+ def handle_session_resumption(update)
200
+ @last_resumption_token = update[:newHandle]
201
+ emit(:session_resumption, update)
202
+ end
203
+
204
+ def handle_error(error)
205
+ emit(:error, error)
206
+ end
207
+
208
+ def handle_close(code, reason)
209
+ @connected = false
210
+ @setup_complete = false
211
+ emit(:close, code, reason)
212
+ end
213
+
214
+ def emit(event, *args)
215
+ @event_handlers[event].each { |handler| handler.call(*args) }
216
+ end
217
+
218
+ def ensure_setup_complete!
219
+ raise Gemini::Error, "Session setup not complete. Wait for :setup_complete event." unless @setup_complete
220
+ end
221
+ end
222
+ end
223
+ end
@@ -0,0 +1,102 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "live/configuration"
4
+ require_relative "live/message_builder"
5
+ require_relative "live/connection"
6
+ require_relative "live/session"
7
+
8
+ module Gemini
9
+ # Live API client for real-time audio/video/text interactions
10
+ #
11
+ # @example Basic text conversation
12
+ # client = Gemini::Client.new(api_key)
13
+ # session = client.live.connect(model: "gemini-2.5-flash-live-preview")
14
+ #
15
+ # session.on(:setup_complete) { puts "Connected!" }
16
+ # session.on(:text) { |text| puts "AI: #{text}" }
17
+ # session.on(:error) { |e| puts "Error: #{e}" }
18
+ #
19
+ # session.send_text("Hello!")
20
+ # sleep 5
21
+ # session.close
22
+ #
23
+ # @example Audio conversation
24
+ # session = client.live.connect(
25
+ # model: "gemini-2.5-flash-live-preview",
26
+ # response_modality: "AUDIO",
27
+ # voice_name: "Puck"
28
+ # )
29
+ #
30
+ # session.on(:audio) { |data, mime| play_audio(data) }
31
+ # session.send_audio(pcm_data) # 16-bit PCM, 16kHz, mono
32
+ #
33
+ # @example With block (auto-close)
34
+ # client.live.connect(model: "gemini-2.5-flash-live-preview") do |session|
35
+ # session.on(:text) { |text| puts text }
36
+ # session.send_text("Hello!")
37
+ # sleep 5
38
+ # end # session.close called automatically
39
+ #
40
+ class Live
41
+ def initialize(client:)
42
+ @client = client
43
+ end
44
+
45
+ # Establish a WebSocket connection and return a session
46
+ #
47
+ # @param model [String] Model to use (default: "gemini-2.5-flash-live-preview")
48
+ # @param response_modality [String] "TEXT" or "AUDIO" (default: "TEXT")
49
+ # @param voice_name [String] Voice for audio responses (Puck, Charon, Kore, etc.)
50
+ # @param system_instruction [String] System prompt
51
+ # @param tools [Array] Tool definitions for function calling
52
+ # @param context_window_compression [Hash] Compression settings for long sessions
53
+ # @param session_resumption [Hash] Session resumption settings
54
+ # @param automatic_activity_detection [Boolean] Enable/disable automatic VAD (default: true)
55
+ # @param media_resolution [String] Media resolution setting
56
+ # @param output_audio_transcription [Boolean] Enable audio transcription (default: false)
57
+ # @yield [session] If block given, yields the session and closes it when block returns
58
+ # @return [Gemini::Live::Session] The live session
59
+ #
60
+ def connect(
61
+ model: Configuration::DEFAULT_MODEL,
62
+ response_modality: "TEXT",
63
+ voice_name: nil,
64
+ system_instruction: nil,
65
+ tools: nil,
66
+ context_window_compression: nil,
67
+ session_resumption: nil,
68
+ automatic_activity_detection: true,
69
+ media_resolution: nil,
70
+ output_audio_transcription: false,
71
+ &block
72
+ )
73
+ config = Configuration.new(
74
+ model: model,
75
+ response_modality: response_modality,
76
+ voice_name: voice_name,
77
+ system_instruction: system_instruction,
78
+ tools: tools,
79
+ context_window_compression: context_window_compression,
80
+ session_resumption: session_resumption,
81
+ automatic_activity_detection: automatic_activity_detection,
82
+ media_resolution: media_resolution,
83
+ output_audio_transcription: output_audio_transcription
84
+ )
85
+
86
+ session = Session.new(
87
+ api_key: @client.api_key,
88
+ configuration: config
89
+ )
90
+
91
+ if block_given?
92
+ begin
93
+ yield session
94
+ ensure
95
+ session.close
96
+ end
97
+ else
98
+ session
99
+ end
100
+ end
101
+ end
102
+ end
@@ -70,9 +70,49 @@ module Gemini
70
70
 
71
71
  # Check if response is valid
72
72
  def valid?
73
- !@raw_data.nil? &&
74
- ((@raw_data.key?("candidates") && !@raw_data["candidates"].empty?) ||
75
- (@raw_data.key?("predictions") && !@raw_data["predictions"].empty?))
73
+ !@raw_data.nil? &&
74
+ ((@raw_data.key?("candidates") && !@raw_data["candidates"].empty?) ||
75
+ (@raw_data.key?("predictions") && !@raw_data["predictions"].empty?) ||
76
+ embedding_response?)
77
+ end
78
+
79
+ # Check if the raw response contains embedding data
80
+ def embedding_response?
81
+ return false if @raw_data.nil?
82
+ (@raw_data.key?("embedding") && !@raw_data["embedding"].nil?) ||
83
+ (@raw_data.key?("embeddings") && @raw_data["embeddings"].is_a?(Array) && !@raw_data["embeddings"].empty?)
84
+ end
85
+
86
+ # Get the embedding values as an Array of Floats.
87
+ # For single embedContent responses returns the values array.
88
+ # For batchEmbedContents responses returns the first embedding's values.
89
+ def embedding
90
+ return nil unless @raw_data
91
+ if @raw_data["embedding"].is_a?(Hash)
92
+ @raw_data["embedding"]["values"]
93
+ elsif @raw_data["embeddings"].is_a?(Array) && @raw_data["embeddings"].first.is_a?(Hash)
94
+ @raw_data["embeddings"].first["values"]
95
+ end
96
+ end
97
+
98
+ # Get all embedding value arrays for batch responses.
99
+ # Returns an Array of Arrays of Floats.
100
+ # For single embedContent responses, returns a single-element array.
101
+ def embeddings
102
+ return [] unless @raw_data
103
+ if @raw_data["embeddings"].is_a?(Array)
104
+ @raw_data["embeddings"].map { |e| e["values"] }.compact
105
+ elsif @raw_data["embedding"].is_a?(Hash) && @raw_data["embedding"]["values"]
106
+ [@raw_data["embedding"]["values"]]
107
+ else
108
+ []
109
+ end
110
+ end
111
+
112
+ # Get the dimensionality (length) of the first embedding vector
113
+ def embedding_dimension
114
+ values = embedding
115
+ values.is_a?(Array) ? values.length : 0
76
116
  end
77
117
 
78
118
  # Get error message if any
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Gemini
4
- VERSION = "1.0.0"
4
+ VERSION = "1.1.0"
5
5
  end
data/lib/gemini.rb CHANGED
@@ -20,6 +20,7 @@ require_relative "gemini/function_calling_helper"
20
20
  require_relative "gemini/documents"
21
21
  require_relative "gemini/cached_content"
22
22
  require_relative "gemini/video"
23
+ require_relative "gemini/live"
23
24
 
24
25
  module Gemini
25
26
  class Error < StandardError; end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: ruby-gemini-api
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - rira100000000
@@ -51,6 +51,20 @@ dependencies:
51
51
  - - "~>"
52
52
  - !ruby/object:Gem::Version
53
53
  version: '2.0'
54
+ - !ruby/object:Gem::Dependency
55
+ name: websocket-client-simple
56
+ requirement: !ruby/object:Gem::Requirement
57
+ requirements:
58
+ - - "~>"
59
+ - !ruby/object:Gem::Version
60
+ version: '0.8'
61
+ type: :runtime
62
+ prerelease: false
63
+ version_requirements: !ruby/object:Gem::Requirement
64
+ requirements:
65
+ - - "~>"
66
+ - !ruby/object:Gem::Version
67
+ version: '0.8'
54
68
  - !ruby/object:Gem::Dependency
55
69
  name: rake
56
70
  requirement: !ruby/object:Gem::Requirement
@@ -156,6 +170,11 @@ files:
156
170
  - lib/gemini/http.rb
157
171
  - lib/gemini/http_headers.rb
158
172
  - lib/gemini/images.rb
173
+ - lib/gemini/live.rb
174
+ - lib/gemini/live/configuration.rb
175
+ - lib/gemini/live/connection.rb
176
+ - lib/gemini/live/message_builder.rb
177
+ - lib/gemini/live/session.rb
159
178
  - lib/gemini/messages.rb
160
179
  - lib/gemini/models.rb
161
180
  - lib/gemini/response.rb
@@ -187,7 +206,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
187
206
  - !ruby/object:Gem::Version
188
207
  version: '0'
189
208
  requirements: []
190
- rubygems_version: 3.7.2
209
+ rubygems_version: 3.6.9
191
210
  specification_version: 4
192
211
  summary: Ruby client for Google's Gemini API
193
212
  test_files: []