llm-gemini 0.3a0__tar.gz → 0.4.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: llm-gemini
- Version: 0.3a0
+ Version: 0.4.1
  Summary: LLM plugin to access Google's Gemini family of models
  Author: Simon Willison
  License: Apache-2.0
@@ -11,11 +11,12 @@ Project-URL: CI, https://github.com/simonw/llm-gemini/actions
  Classifier: License :: OSI Approved :: Apache Software License
  Description-Content-Type: text/markdown
  License-File: LICENSE
- Requires-Dist: llm>=0.17a0
+ Requires-Dist: llm>=0.18
  Requires-Dist: httpx
  Requires-Dist: ijson
  Provides-Extra: test
  Requires-Dist: pytest; extra == "test"
+ Requires-Dist: pytest-recording; extra == "test"

  # llm-gemini

@@ -35,13 +36,13 @@ llm install llm-gemini
  ## Usage

  Configure the model by setting a key called "gemini" to your [API key](https://aistudio.google.com/app/apikey):
-
  ```bash
  llm keys set gemini
  ```
  ```
  <paste key here>
  ```
+ You can also set the API key by assigning it to the environment variable `LLM_GEMINI_KEY`.

  Now run the model using `-m gemini-1.5-pro-latest`, for example:

@@ -55,16 +56,13 @@ llm -m gemini-1.5-pro-latest "A joke about a pelican and a walrus"
  >
  > The pelican taps its beak thoughtfully. "I believe," it says, "it's a billfish."

- To chat interactively with the model, run `llm chat`:
-
- ```bash
- llm chat -m gemini-1.5-pro-latest
- ```
-
  Other models are:

  - `gemini-1.5-flash-latest`
- - gemini-1.5-flash-8b-latest` - the least expensive
+ - `gemini-1.5-flash-8b-latest` - the least expensive
+ - `gemini-exp-1114` - recent experimental
+
+ ### Images, audio and video

  Gemini models are multi-modal. You can provide images, audio or video files as input like this:

@@ -76,8 +74,52 @@ Or with a URL:
  llm -m gemini-1.5-flash-8b-latest 'describe image' \
    -a https://static.simonwillison.net/static/2024/pelicans.jpg
  ```
+ Audio works too:
+
+ ```bash
+ llm -m gemini-1.5-pro-latest 'transcribe audio' -a audio.mp3
+ ```
+
+ And video:
+
+ ```bash
+ llm -m gemini-1.5-pro-latest 'describe what happens' -a video.mp4
+ ```
+ The Gemini prompting guide includes [extensive advice](https://ai.google.dev/gemini-api/docs/file-prompting-strategies) on multi-modal prompting.
+
+ ### JSON output
+
+ Use `-o json_object 1` to force the output to be JSON:
+
+ ```bash
+ llm -m gemini-1.5-flash-latest -o json_object 1 \
+   '3 largest cities in California, list of {"name": "..."}'
+ ```
+ Outputs:
+ ```json
+ {"cities": [{"name": "Los Angeles"}, {"name": "San Diego"}, {"name": "San Jose"}]}
+ ```
+
+ ### Code execution
+
+ Gemini models can [write and execute code](https://ai.google.dev/gemini-api/docs/code-execution) - they can decide to write Python code, execute it in a secure sandbox and use the result as part of their response.
+
+ To enable this feature, use `-o code_execution 1`:
+
+ ```bash
+ llm -m gemini-1.5-pro-latest -o code_execution 1 \
+   'use python to calculate (factorial of 13) * 3'
+ ```
+
+ ### Chat
+
+ To chat interactively with the model, run `llm chat`:
+
+ ```bash
+ llm chat -m gemini-1.5-pro-latest
+ ```

- ### Embeddings
+ ## Embeddings

  The plugin also adds support for the `text-embedding-004` embedding model.

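The `json_object` and `code_execution` options documented above are also reachable from llm's Python API, where plugin options are passed as keyword arguments to `prompt()`. A minimal sketch, assuming the "gemini" key is already configured via `llm keys set gemini` or `LLM_GEMINI_KEY`:

```python
import llm

# Equivalent of: llm -m gemini-1.5-flash-latest -o json_object 1 '...'
model = llm.get_model("gemini-1.5-flash-latest")
response = model.prompt(
    '3 largest cities in California, list of {"name": "..."}',
    json_object=True,  # maps to -o json_object 1 on the CLI
)
print(response.text())
```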
@@ -16,13 +16,13 @@ llm install llm-gemini
  ## Usage

  Configure the model by setting a key called "gemini" to your [API key](https://aistudio.google.com/app/apikey):
-
  ```bash
  llm keys set gemini
  ```
  ```
  <paste key here>
  ```
+ You can also set the API key by assigning it to the environment variable `LLM_GEMINI_KEY`.

  Now run the model using `-m gemini-1.5-pro-latest`, for example:

@@ -36,16 +36,13 @@ llm -m gemini-1.5-pro-latest "A joke about a pelican and a walrus"
  >
  > The pelican taps its beak thoughtfully. "I believe," it says, "it's a billfish."

- To chat interactively with the model, run `llm chat`:
-
- ```bash
- llm chat -m gemini-1.5-pro-latest
- ```
-
  Other models are:

  - `gemini-1.5-flash-latest`
- - gemini-1.5-flash-8b-latest` - the least expensive
+ - `gemini-1.5-flash-8b-latest` - the least expensive
+ - `gemini-exp-1114` - recent experimental
+
+ ### Images, audio and video

  Gemini models are multi-modal. You can provide images, audio or video files as input like this:

@@ -57,8 +54,52 @@ Or with a URL:
  llm -m gemini-1.5-flash-8b-latest 'describe image' \
    -a https://static.simonwillison.net/static/2024/pelicans.jpg
  ```
+ Audio works too:
+
+ ```bash
+ llm -m gemini-1.5-pro-latest 'transcribe audio' -a audio.mp3
+ ```
+
+ And video:
+
+ ```bash
+ llm -m gemini-1.5-pro-latest 'describe what happens' -a video.mp4
+ ```
+ The Gemini prompting guide includes [extensive advice](https://ai.google.dev/gemini-api/docs/file-prompting-strategies) on multi-modal prompting.
+
+ ### JSON output
+
+ Use `-o json_object 1` to force the output to be JSON:
+
+ ```bash
+ llm -m gemini-1.5-flash-latest -o json_object 1 \
+   '3 largest cities in California, list of {"name": "..."}'
+ ```
+ Outputs:
+ ```json
+ {"cities": [{"name": "Los Angeles"}, {"name": "San Diego"}, {"name": "San Jose"}]}
+ ```
+
+ ### Code execution
+
+ Gemini models can [write and execute code](https://ai.google.dev/gemini-api/docs/code-execution) - they can decide to write Python code, execute it in a secure sandbox and use the result as part of their response.
+
+ To enable this feature, use `-o code_execution 1`:
+
+ ```bash
+ llm -m gemini-1.5-pro-latest -o code_execution 1 \
+   'use python to calculate (factorial of 13) * 3'
+ ```
+
+ ### Chat
+
+ To chat interactively with the model, run `llm chat`:
+
+ ```bash
+ llm chat -m gemini-1.5-pro-latest
+ ```

- ### Embeddings
+ ## Embeddings

  The plugin also adds support for the `text-embedding-004` embedding model.

@@ -1,6 +1,7 @@
- llm>=0.17a0
+ llm>=0.18
  httpx
  ijson

  [test]
  pytest
+ pytest-recording
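The new `pytest-recording` test dependency indicates the suite replays recorded HTTP interactions instead of hitting the live API on every run. As a hypothetical illustration (not part of this package), a `conftest.py` could use the `vcr_config` fixture that pytest-recording supports to keep credentials out of committed cassettes:

```python
import pytest


@pytest.fixture(scope="module")
def vcr_config():
    # Strip the Gemini API key header from recorded cassettes
    return {"filter_headers": ["x-goog-api-key"]}
```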
@@ -0,0 +1,319 @@
+ import httpx
+ import ijson
+ import llm
+ from pydantic import Field
+ from typing import Optional
+
+ SAFETY_SETTINGS = [
+     {
+         "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
+         "threshold": "BLOCK_NONE",
+     },
+     {
+         "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
+         "threshold": "BLOCK_NONE",
+     },
+     {
+         "category": "HARM_CATEGORY_HATE_SPEECH",
+         "threshold": "BLOCK_NONE",
+     },
+     {
+         "category": "HARM_CATEGORY_HARASSMENT",
+         "threshold": "BLOCK_NONE",
+     },
+ ]
+
+
+ @llm.hookimpl
+ def register_models(register):
+     # Register both sync and async versions of each model
+     for model_id in [
+         "gemini-pro",
+         "gemini-1.5-pro-latest",
+         "gemini-1.5-flash-latest",
+         "gemini-1.5-pro-001",
+         "gemini-1.5-flash-001",
+         "gemini-1.5-pro-002",
+         "gemini-1.5-flash-002",
+         "gemini-1.5-flash-8b-latest",
+         "gemini-1.5-flash-8b-001",
+         "gemini-exp-1114",
+     ]:
+         register(GeminiPro(model_id), AsyncGeminiPro(model_id))
+
+
+ def resolve_type(attachment):
+     mime_type = attachment.resolve_type()
+     # https://github.com/simonw/llm/issues/587#issuecomment-2439785140
+     if mime_type == "audio/mpeg":
+         mime_type = "audio/mp3"
+     return mime_type
+
+
+ class _SharedGemini:
+     needs_key = "gemini"
+     key_env_var = "LLM_GEMINI_KEY"
+     can_stream = True
+
+     attachment_types = (
+         # PDF
+         "application/pdf",
+         # Images
+         "image/png",
+         "image/jpeg",
+         "image/webp",
+         "image/heic",
+         "image/heif",
+         # Audio
+         "audio/wav",
+         "audio/mp3",
+         "audio/aiff",
+         "audio/aac",
+         "audio/ogg",
+         "audio/flac",
+         "audio/mpeg",  # Treated as audio/mp3
+         # Video
+         "video/mp4",
+         "video/mpeg",
+         "video/mov",
+         "video/avi",
+         "video/x-flv",
+         "video/mpg",
+         "video/webm",
+         "video/wmv",
+         "video/3gpp",
+         "video/quicktime",
+     )
+
+     class Options(llm.Options):
+         code_execution: Optional[bool] = Field(
+             description="Enables the model to generate and run Python code",
+             default=None,
+         )
+         temperature: Optional[float] = Field(
+             description=(
+                 "Controls the randomness of the output. Use higher values for "
+                 "more creative responses, and lower values for more "
+                 "deterministic responses."
+             ),
+             default=None,
+             ge=0.0,
+             le=2.0,
+         )
+         max_output_tokens: Optional[int] = Field(
+             description="Sets the maximum number of tokens to include in a candidate.",
+             default=None,
+         )
+         top_p: Optional[float] = Field(
+             description=(
+                 "Changes how the model selects tokens for output. Tokens are "
+                 "selected from the most to least probable until the sum of "
+                 "their probabilities equals the topP value."
+             ),
+             default=None,
+             ge=0.0,
+             le=1.0,
+         )
+         top_k: Optional[int] = Field(
+             description=(
+                 "Changes how the model selects tokens for output. A topK of 1 "
+                 "means the selected token is the most probable among all the "
+                 "tokens in the model's vocabulary, while a topK of 3 means "
+                 "that the next token is selected from among the 3 most "
+                 "probable using the temperature."
+             ),
+             default=None,
+             ge=1,
+         )
+         json_object: Optional[bool] = Field(
+             description="Output a valid JSON object {...}",
+             default=None,
+         )
+
+     def __init__(self, model_id):
+         self.model_id = model_id
+
+     def build_messages(self, prompt, conversation):
+         messages = []
+         if conversation:
+             for response in conversation.responses:
+                 parts = []
+                 for attachment in response.attachments:
+                     mime_type = resolve_type(attachment)
+                     parts.append(
+                         {
+                             "inlineData": {
+                                 "data": attachment.base64_content(),
+                                 "mimeType": mime_type,
+                             }
+                         }
+                     )
+                 if response.prompt.prompt:
+                     parts.append({"text": response.prompt.prompt})
+                 messages.append({"role": "user", "parts": parts})
+                 messages.append({"role": "model", "parts": [{"text": response.text()}]})
+
+         parts = []
+         if prompt.prompt:
+             parts.append({"text": prompt.prompt})
+         for attachment in prompt.attachments:
+             mime_type = resolve_type(attachment)
+             parts.append(
+                 {
+                     "inlineData": {
+                         "data": attachment.base64_content(),
+                         "mimeType": mime_type,
+                     }
+                 }
+             )
+
+         messages.append({"role": "user", "parts": parts})
+         return messages
+
+     def build_request_body(self, prompt, conversation):
+         body = {
+             "contents": self.build_messages(prompt, conversation),
+             "safetySettings": SAFETY_SETTINGS,
+         }
+         if prompt.options and prompt.options.code_execution:
+             body["tools"] = [{"codeExecution": {}}]
+         if prompt.system:
+             body["systemInstruction"] = {"parts": [{"text": prompt.system}]}
+
+         config_map = {
+             "temperature": "temperature",
+             "max_output_tokens": "maxOutputTokens",
+             "top_p": "topP",
+             "top_k": "topK",
+         }
+         if prompt.options and prompt.options.json_object:
+             body["generationConfig"] = {"response_mime_type": "application/json"}
+
+         if any(
+             getattr(prompt.options, key, None) is not None for key in config_map.keys()
+         ):
+             generation_config = {}
+             for key, other_key in config_map.items():
+                 config_value = getattr(prompt.options, key, None)
+                 if config_value is not None:
+                     generation_config[other_key] = config_value
+             body["generationConfig"] = generation_config
+
+         return body
+
+     def process_part(self, part):
+         if "text" in part:
+             return part["text"]
+         elif "executableCode" in part:
+             return f'```{part["executableCode"]["language"].lower()}\n{part["executableCode"]["code"].strip()}\n```\n'
+         elif "codeExecutionResult" in part:
+             return f'```\n{part["codeExecutionResult"]["output"].strip()}\n```\n'
+         return ""
+
+
+ class GeminiPro(_SharedGemini, llm.Model):
+     def execute(self, prompt, stream, response, conversation):
+         key = self.get_key()
+         url = f"https://generativelanguage.googleapis.com/v1beta/models/{self.model_id}:streamGenerateContent"
+         gathered = []
+         body = self.build_request_body(prompt, conversation)
+
+         with httpx.stream(
+             "POST",
+             url,
+             timeout=None,
+             headers={"x-goog-api-key": key},
+             json=body,
+         ) as http_response:
+             events = ijson.sendable_list()
+             coro = ijson.items_coro(events, "item")
+             for chunk in http_response.iter_bytes():
+                 coro.send(chunk)
+                 if events:
+                     event = events[0]
+                     if isinstance(event, dict) and "error" in event:
+                         raise llm.ModelError(event["error"]["message"])
+                     try:
+                         part = event["candidates"][0]["content"]["parts"][0]
+                         yield self.process_part(part)
+                     except KeyError:
+                         yield ""
+                     gathered.append(event)
+                     events.clear()
+         response.response_json = gathered
+
+
+ class AsyncGeminiPro(_SharedGemini, llm.AsyncModel):
+     async def execute(self, prompt, stream, response, conversation):
+         key = self.get_key()
+         url = f"https://generativelanguage.googleapis.com/v1beta/models/{self.model_id}:streamGenerateContent"
+         gathered = []
+         body = self.build_request_body(prompt, conversation)
+
+         async with httpx.AsyncClient() as client:
+             async with client.stream(
+                 "POST",
+                 url,
+                 timeout=None,
+                 headers={"x-goog-api-key": key},
+                 json=body,
+             ) as http_response:
+                 events = ijson.sendable_list()
+                 coro = ijson.items_coro(events, "item")
+                 async for chunk in http_response.aiter_bytes():
+                     coro.send(chunk)
+                     if events:
+                         event = events[0]
+                         if isinstance(event, dict) and "error" in event:
+                             raise llm.ModelError(event["error"]["message"])
+                         try:
+                             part = event["candidates"][0]["content"]["parts"][0]
+                             yield self.process_part(part)
+                         except KeyError:
+                             yield ""
+                         gathered.append(event)
+                         events.clear()
+         response.response_json = gathered
+
+
+ @llm.hookimpl
+ def register_embedding_models(register):
+     register(
+         GeminiEmbeddingModel("text-embedding-004", "text-embedding-004"),
+     )
+
+
+ class GeminiEmbeddingModel(llm.EmbeddingModel):
+     needs_key = "gemini"
+     key_env_var = "LLM_GEMINI_KEY"
+     batch_size = 20
+
+     def __init__(self, model_id, gemini_model_id):
+         self.model_id = model_id
+         self.gemini_model_id = gemini_model_id
+
+     def embed_batch(self, items):
+         headers = {
+             "Content-Type": "application/json",
+             "x-goog-api-key": self.get_key(),
+         }
+         data = {
+             "requests": [
+                 {
+                     "model": "models/" + self.gemini_model_id,
+                     "content": {"parts": [{"text": item}]},
+                 }
+                 for item in items
+             ]
+         }
+
+         with httpx.Client() as client:
+             response = client.post(
+                 f"https://generativelanguage.googleapis.com/v1beta/models/{self.gemini_model_id}:batchEmbedContents",
+                 headers=headers,
+                 json=data,
+                 timeout=None,
+             )
+
+         response.raise_for_status()
+         return [item["values"] for item in response.json()["embeddings"]]
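Since this release registers an `AsyncGeminiPro` alongside each sync model, the plugin can also be driven through llm's async API, as the new test below does. A small illustrative sketch, assuming a configured key:

```python
import asyncio
import llm


async def main():
    # Resolves to the AsyncGeminiPro instance registered above
    model = llm.get_async_model("gemini-1.5-flash-latest")
    response = await model.prompt("Name for a pet pelican, just the name")
    print(await response.text())


asyncio.run(main())
```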
@@ -1,6 +1,6 @@
  [project]
  name = "llm-gemini"
- version = "0.3a0"
+ version = "0.4.1"
  description = "LLM plugin to access Google's Gemini family of models"
  readme = "README.md"
  authors = [{name = "Simon Willison"}]
@@ -9,7 +9,7 @@ classifiers = [
      "License :: OSI Approved :: Apache Software License"
  ]
  dependencies = [
-     "llm>=0.17a0",
+     "llm>=0.18",
      "httpx",
      "ijson"
  ]
@@ -24,4 +24,4 @@ CI = "https://github.com/simonw/llm-gemini/actions"
  gemini = "llm_gemini"

  [project.optional-dependencies]
- test = ["pytest"]
+ test = ["pytest", "pytest-recording"]
@@ -0,0 +1,29 @@
+ import llm
+ import os
+ import pytest
+
+ GEMINI_API_KEY = os.environ.get("PYTEST_GEMINI_API_KEY", None) or "gm-..."
+
+
+ @pytest.mark.vcr
+ @pytest.mark.asyncio
+ async def test_prompt():
+     model = llm.get_model("gemini-1.5-flash-latest")
+     model.key = model.key or GEMINI_API_KEY
+     response = model.prompt("Name for a pet pelican, just the name")
+     assert str(response) == "Percy"
+     assert response.response_json == [
+         {
+             "candidates": [
+                 {"content": {"parts": [{"text": "Percy"}], "role": "model"}}
+             ],
+             "usageMetadata": {"promptTokenCount": 10, "totalTokenCount": 10},
+             "modelVersion": "gemini-1.5-flash-002",
+         }
+     ]
+     # And try it async too
+     async_model = llm.get_async_model("gemini-1.5-flash-latest")
+     async_model.key = async_model.key or GEMINI_API_KEY
+     response = await async_model.prompt("Name for a pet pelican, just the name")
+     text = await response.text()
+     assert text == "Percy"
@@ -1,193 +0,0 @@
- import httpx
- import ijson
- import llm
- import urllib.parse
-
- # We disable all of these to avoid random unexpected errors
- SAFETY_SETTINGS = [
-     {
-         "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
-         "threshold": "BLOCK_NONE",
-     },
-     {
-         "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
-         "threshold": "BLOCK_NONE",
-     },
-     {
-         "category": "HARM_CATEGORY_HATE_SPEECH",
-         "threshold": "BLOCK_NONE",
-     },
-     {
-         "category": "HARM_CATEGORY_HARASSMENT",
-         "threshold": "BLOCK_NONE",
-     },
- ]
-
-
- @llm.hookimpl
- def register_models(register):
-     register(GeminiPro("gemini-pro"))
-     register(GeminiPro("gemini-1.5-pro-latest"))
-     register(GeminiPro("gemini-1.5-flash-latest"))
-     register(GeminiPro("gemini-1.5-pro-001"))
-     register(GeminiPro("gemini-1.5-flash-001"))
-     register(GeminiPro("gemini-1.5-pro-002"))
-     register(GeminiPro("gemini-1.5-flash-002"))
-     register(GeminiPro("gemini-1.5-flash-8b-latest"))
-     register(GeminiPro("gemini-1.5-flash-8b-001"))
-
-
- def resolve_type(attachment):
-     mime_type = attachment.resolve_type()
-     # https://github.com/simonw/llm/issues/587#issuecomment-2439785140
-     if mime_type == "audio/mpeg":
-         mime_type = "audio/mp3"
-     return mime_type
-
-
- class GeminiPro(llm.Model):
-     can_stream = True
-
-     attachment_types = (
-         # PDF
-         "application/pdf",
-         # Images
-         "image/png",
-         "image/jpeg",
-         "image/webp",
-         "image/heic",
-         "image/heif",
-         # Audio
-         "audio/wav",
-         "audio/mp3",
-         "audio/aiff",
-         "audio/aac",
-         "audio/ogg",
-         "audio/flac",
-         "audio/mpeg",  # Treated as audio/mp3
-         # Video
-         "video/mp4",
-         "video/mpeg",
-         "video/mov",
-         "video/avi",
-         "video/x-flv",
-         "video/mpg",
-         "video/webm",
-         "video/wmv",
-         "video/3gpp",
-     )
-
-     def __init__(self, model_id):
-         self.model_id = model_id
-
-     def build_messages(self, prompt, conversation):
-         messages = []
-         if conversation:
-             for response in conversation.responses:
-                 parts = []
-                 for attachment in response.attachments:
-                     mime_type = resolve_type(attachment)
-                     parts.append(
-                         {
-                             "inlineData": {
-                                 "data": attachment.base64_content(),
-                                 "mimeType": mime_type,
-                             }
-                         }
-                     )
-                 parts.append({"text": response.prompt.prompt})
-                 messages.append({"role": "user", "parts": parts})
-                 messages.append({"role": "model", "parts": [{"text": response.text()}]})
-
-         parts = [{"text": prompt.prompt}]
-         for attachment in prompt.attachments:
-             mime_type = resolve_type(attachment)
-             parts.append(
-                 {
-                     "inlineData": {
-                         "data": attachment.base64_content(),
-                         "mimeType": mime_type,
-                     }
-                 }
-             )
-
-         messages.append({"role": "user", "parts": parts})
-         return messages
-
-     def execute(self, prompt, stream, response, conversation):
-         key = llm.get_key("", "gemini", "LLM_GEMINI_KEY")
-         url = "https://generativelanguage.googleapis.com/v1beta/models/{}:streamGenerateContent?".format(
-             self.model_id
-         ) + urllib.parse.urlencode(
-             {"key": key}
-         )
-         gathered = []
-         body = {
-             "contents": self.build_messages(prompt, conversation),
-             "safetySettings": SAFETY_SETTINGS,
-         }
-         if prompt.system:
-             body["systemInstruction"] = {"parts": [{"text": prompt.system}]}
-         with httpx.stream(
-             "POST",
-             url,
-             timeout=None,
-             json=body,
-         ) as http_response:
-             events = ijson.sendable_list()
-             coro = ijson.items_coro(events, "item")
-             for chunk in http_response.iter_bytes():
-                 coro.send(chunk)
-                 if events:
-                     event = events[0]
-                     if isinstance(event, dict) and "error" in event:
-                         raise llm.ModelError(event["error"]["message"])
-                     try:
-                         yield event["candidates"][0]["content"]["parts"][0]["text"]
-                     except KeyError:
-                         yield ""
-                     gathered.append(event)
-                     events.clear()
-         response.response_json = gathered
-
-
- @llm.hookimpl
- def register_embedding_models(register):
-     register(
-         GeminiEmbeddingModel("text-embedding-004", "text-embedding-004"),
-     )
-
-
- class GeminiEmbeddingModel(llm.EmbeddingModel):
-     needs_key = "gemini"
-     key_env_var = "LLM_GEMINI_KEY"
-     batch_size = 20
-
-     def __init__(self, model_id, gemini_model_id):
-         self.model_id = model_id
-         self.gemini_model_id = gemini_model_id
-
-     def embed_batch(self, items):
-         headers = {
-             "Content-Type": "application/json",
-         }
-         data = {
-             "requests": [
-                 {
-                     "model": "models/" + self.gemini_model_id,
-                     "content": {"parts": [{"text": item}]},
-                 }
-                 for item in items
-             ]
-         }
-
-         with httpx.Client() as client:
-             response = client.post(
-                 f"https://generativelanguage.googleapis.com/v1beta/models/{self.gemini_model_id}:batchEmbedContents?key={self.get_key()}",
-                 headers=headers,
-                 json=data,
-                 timeout=None,
-             )
-
-         response.raise_for_status()
-         return [item["values"] for item in response.json()["embeddings"]]
@@ -1,6 +0,0 @@
- from llm.plugins import pm
-
-
- def test_plugin_is_installed():
-     names = [mod.__name__ for mod in pm.get_plugins()]
-     assert "llm_gemini" in names
File without changes
File without changes