llm_gemini-0.2-py3-none-any.whl → llm_gemini-0.3-py3-none-any.whl
This diff shows the content of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between the two versions as they appear in their public registry.
- {llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/METADATA +51 -8
- llm_gemini-0.3.dist-info/RECORD +7 -0
- {llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/WHEEL +1 -1
- llm_gemini.py +128 -9
- llm_gemini-0.2.dist-info/RECORD +0 -7
- {llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/LICENSE +0 -0
- {llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/entry_points.txt +0 -0
- {llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/top_level.txt +0 -0
{llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/METADATA CHANGED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: llm-gemini
-Version: 0.2
+Version: 0.3
 Summary: LLM plugin to access Google's Gemini family of models
 Author: Simon Willison
 License: Apache-2.0
@@ -11,7 +11,7 @@ Project-URL: CI, https://github.com/simonw/llm-gemini/actions
 Classifier: License :: OSI Approved :: Apache Software License
 Description-Content-Type: text/markdown
 License-File: LICENSE
-Requires-Dist: llm
+Requires-Dist: llm >=0.17
 Requires-Dist: httpx
 Requires-Dist: ijson
 Provides-Extra: test
@@ -43,23 +43,66 @@ llm keys set gemini
 <paste key here>
 ```
 
-Now run the model using `-m gemini-pro`, for example:
+Now run the model using `-m gemini-1.5-pro-latest`, for example:
 
 ```bash
-llm -m gemini-pro "A joke about a pelican and a walrus"
+llm -m gemini-1.5-pro-latest "A joke about a pelican and a walrus"
 ```
 
->
+> A pelican walks into a seafood restaurant with a huge fish hanging out of its beak. The walrus, sitting at the bar, eyes it enviously.
 >
->
+> "Hey," the walrus says, "That looks delicious! What kind of fish is that?"
+>
+> The pelican taps its beak thoughtfully. "I believe," it says, "it's a billfish."
+
+### Images, audio and video
+
+Gemini models are multi-modal. You can provide images, audio or video files as input like this:
+
+```bash
+llm -m gemini-1.5-flash-latest 'extract text' -a image.jpg
+```
+Or with a URL:
+```bash
+llm -m gemini-1.5-flash-8b-latest 'describe image' \
+  -a https://static.simonwillison.net/static/2024/pelicans.jpg
+```
+Audio works too:
+
+```bash
+llm -m gemini-1.5-pro-latest 'transcribe audio' -a audio.mp3
+```
+
+And video:
+
+```bash
+llm -m gemini-1.5-pro-latest 'describe what happens' -a video.mp4
+```
+
+## Code execution
+
+Gemini models can [write and execute code](https://ai.google.dev/gemini-api/docs/code-execution) - they can decide to write Python code, execute it in a secure sandbox and use the result as part of their response.
+
+To enable this feature, use `-o code_execution 1`:
+
+```bash
+llm -m gemini-1.5-pro-latest -o code_execution 1 \
+  'use python to calculate (factorial of 13) * 3'
+```
+
+### Chat
 
 To chat interactively with the model, run `llm chat`:
 
 ```bash
-llm chat -m gemini-pro
+llm chat -m gemini-1.5-pro-latest
 ```
 
-
+Other models are:
+
+- `gemini-1.5-flash-latest`
+- `gemini-1.5-flash-8b-latest` - the least expensive
+
 
 ### Embeddings
 
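The README additions above are all CLI examples; the same features are reachable from Python through llm's programmatic API. A minimal sketch, assuming llm >= 0.17 (the release that introduced `llm.Attachment`) and this plugin installed:

```python
import llm

# Assumes a Gemini key is configured (llm keys set gemini) or present
# in the LLM_GEMINI_KEY environment variable.
model = llm.get_model("gemini-1.5-flash-latest")

# Attachments cover the image/audio/video examples above
response = model.prompt(
    "extract text",
    attachments=[llm.Attachment(path="image.jpg")],
)
print(response.text())

# Plugin options such as code_execution should map to the CLI's -o flags
pro = llm.get_model("gemini-1.5-pro-latest")
print(
    pro.prompt(
        "use python to calculate (factorial of 13) * 3",
        code_execution=1,
    ).text()
)
```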
llm_gemini-0.3.dist-info/RECORD ADDED

@@ -0,0 +1,7 @@
+llm_gemini.py,sha256=DQO3ROfJSajqUYmgeuW-4_FJ1yvMoFVKb44ly20oqGw,8628
+llm_gemini-0.3.dist-info/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
+llm_gemini-0.3.dist-info/METADATA,sha256=ROGQUiOTfQHn1FXN3x6cgFFnNsJ75TtHTOyb_EJvFBA,4234
+llm_gemini-0.3.dist-info/WHEEL,sha256=OVMc5UfuAQiSplgO0_WdW7vXVGAt9Hdd6qtN4HotdyA,91
+llm_gemini-0.3.dist-info/entry_points.txt,sha256=n544bpgUPIBc5l_cnwsTxPc3gMGJHPtAyqBNp-CkMWk,26
+llm_gemini-0.3.dist-info/top_level.txt,sha256=WUQmG6_2QKbT_8W4HH93qyKl_0SUteL4Ra6_PhyNGKU,11
+llm_gemini-0.3.dist-info/RECORD,,
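Each RECORD row is `path,hash,size`, where the hash is `sha256=` plus the unpadded urlsafe-base64 SHA-256 of the file, per the wheel RECORD format. A quick sketch of reproducing the first entry above:

```python
import base64
import hashlib

def record_hash(path):
    # RECORD digests: urlsafe base64 of the raw SHA-256, "=" padding stripped
    digest = hashlib.sha256(open(path, "rb").read()).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode()

# Against the llm_gemini.py shipped in the 0.3 wheel this should print
# sha256=DQO3ROfJSajqUYmgeuW-4_FJ1yvMoFVKb44ly20oqGw
print(record_hash("llm_gemini.py"))
```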
llm_gemini.py CHANGED

@@ -1,6 +1,9 @@
 import httpx
 import ijson
 import llm
+from pydantic import Field
+from typing import Optional
+
 import urllib.parse
 
 # We disable all of these to avoid random unexpected errors
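The two new imports support the `Options` class added in the next hunk: pydantic's `Field` gives each `-o` option a description and range validation. A standalone sketch (hypothetical, not from the plugin) of the behaviour this buys:

```python
from typing import Optional

from pydantic import BaseModel, Field

class Options(BaseModel):
    temperature: Optional[float] = Field(default=None, ge=0.0, le=2.0)

Options(temperature=1.0)  # accepted
Options(temperature=3.0)  # raises ValidationError: 2.0 is the upper bound (le)
```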
@@ -37,26 +40,114 @@ def register_models(register):
     register(GeminiPro("gemini-1.5-flash-8b-001"))
 
 
+def resolve_type(attachment):
+    mime_type = attachment.resolve_type()
+    # https://github.com/simonw/llm/issues/587#issuecomment-2439785140
+    if mime_type == "audio/mpeg":
+        mime_type = "audio/mp3"
+    return mime_type
+
+
 class GeminiPro(llm.Model):
+    needs_key = "gemini"
+    key_env_var = "LLM_GEMINI_KEY"
     can_stream = True
 
+    attachment_types = (
+        # PDF
+        "application/pdf",
+        # Images
+        "image/png",
+        "image/jpeg",
+        "image/webp",
+        "image/heic",
+        "image/heif",
+        # Audio
+        "audio/wav",
+        "audio/mp3",
+        "audio/aiff",
+        "audio/aac",
+        "audio/ogg",
+        "audio/flac",
+        "audio/mpeg",  # Treated as audio/mp3
+        # Video
+        "video/mp4",
+        "video/mpeg",
+        "video/mov",
+        "video/avi",
+        "video/x-flv",
+        "video/mpg",
+        "video/webm",
+        "video/wmv",
+        "video/3gpp",
+    )
+
+    class Options(llm.Options):
+        code_execution: Optional[bool] = Field(
+            description="Enables the model to generate and run Python code",
+            default=None,
+        )
+        temperature: Optional[float] = Field(
+            description="Controls the randomness of the output. Use higher values for more creative responses, and lower values for more deterministic responses.",
+            default=None,
+            ge=0.0,
+            le=2.0,
+        )
+        max_output_tokens: Optional[int] = Field(
+            description="Sets the maximum number of tokens to include in a candidate.",
+            default=None,
+        )
+        top_p: Optional[float] = Field(
+            description="Changes how the model selects tokens for output. Tokens are selected from the most to least probable until the sum of their probabilities equals the topP value.",
+            default=None,
+            ge=0.0,
+            le=1.0,
+        )
+        top_k: Optional[int] = Field(
+            description="Changes how the model selects tokens for output. A topK of 1 means the selected token is the most probable among all the tokens in the model's vocabulary, while a topK of 3 means that the next token is selected from among the 3 most probable using the temperature.",
+            default=None,
+            ge=1,
+        )
+
     def __init__(self, model_id):
         self.model_id = model_id
 
     def build_messages(self, prompt, conversation):
-        if not conversation:
-            return [{"role": "user", "parts": [{"text": prompt.prompt}]}]
         messages = []
-
-
-
+        if conversation:
+            for response in conversation.responses:
+                parts = []
+                for attachment in response.attachments:
+                    mime_type = resolve_type(attachment)
+                    parts.append(
+                        {
+                            "inlineData": {
+                                "data": attachment.base64_content(),
+                                "mimeType": mime_type,
+                            }
+                        }
+                    )
+                parts.append({"text": response.prompt.prompt})
+                messages.append({"role": "user", "parts": parts})
+                messages.append({"role": "model", "parts": [{"text": response.text()}]})
+
+        parts = [{"text": prompt.prompt}]
+        for attachment in prompt.attachments:
+            mime_type = resolve_type(attachment)
+            parts.append(
+                {
+                    "inlineData": {
+                        "data": attachment.base64_content(),
+                        "mimeType": mime_type,
+                    }
+                }
             )
-
-        messages.append({"role": "user", "parts":
+
+        messages.append({"role": "user", "parts": parts})
         return messages
 
     def execute(self, prompt, stream, response, conversation):
-        key =
+        key = self.get_key()
         url = "https://generativelanguage.googleapis.com/v1beta/models/{}:streamGenerateContent?".format(
             self.model_id
         ) + urllib.parse.urlencode(
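For orientation, a sketch (illustrative, not from the diff) of the `contents` list the new `build_messages()` produces for a follow-up prompt whose previous turn carried one JPEG attachment; the base64 data and the model's reply are placeholder values:

```python
[
    # Rebuilt history: attachment parts first, then the prompt text,
    # then the model's recorded reply
    {
        "role": "user",
        "parts": [
            {"inlineData": {"data": "<base64>", "mimeType": "image/jpeg"}},
            {"text": "describe image"},
        ],
    },
    {"role": "model", "parts": [{"text": "A pelican standing on a rock."}]},
    # Current prompt: text part first, any new attachments appended after it
    {"role": "user", "parts": [{"text": "what colour is its beak?"}]},
]
```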
@@ -67,8 +158,28 @@ class GeminiPro(llm.Model):
             "contents": self.build_messages(prompt, conversation),
             "safetySettings": SAFETY_SETTINGS,
         }
+        if prompt.options and prompt.options.code_execution:
+            body["tools"] = [{"codeExecution": {}}]
         if prompt.system:
             body["systemInstruction"] = {"parts": [{"text": prompt.system}]}
+
+        config_map = {
+            "temperature": "temperature",
+            "max_output_tokens": "maxOutputTokens",
+            "top_p": "topP",
+            "top_k": "topK",
+        }
+        # If any of those are set in prompt.options...
+        if any(
+            getattr(prompt.options, key, None) is not None for key in config_map.keys()
+        ):
+            generation_config = {}
+            for key, other_key in config_map.items():
+                config_value = getattr(prompt.options, key, None)
+                if config_value is not None:
+                    generation_config[other_key] = config_value
+            body["generationConfig"] = generation_config
+
         with httpx.stream(
             "POST",
             url,
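Given that mapping, an invocation like `llm -m gemini-1.5-pro-latest -o temperature 0.2 -o max_output_tokens 200 '...'` would end up adding this to the request body (sketch):

```python
body["generationConfig"] = {
    "temperature": 0.2,      # passed through under the same name
    "maxOutputTokens": 200,  # snake_case option renamed to camelCase
}
```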
@@ -84,7 +195,15 @@ class GeminiPro(llm.Model):
                 if isinstance(event, dict) and "error" in event:
                     raise llm.ModelError(event["error"]["message"])
                 try:
-
+                    part = event["candidates"][0]["content"]["parts"][0]
+                    if "text" in part:
+                        yield part["text"]
+                    elif "executableCode" in part:
+                        # For code_execution
+                        yield f'```{part["executableCode"]["language"].lower()}\n{part["executableCode"]["code"].strip()}\n```\n'
+                    elif "codeExecutionResult" in part:
+                        # For code_execution
+                        yield f'```\n{part["codeExecutionResult"]["output"].strip()}\n```\n'
                 except KeyError:
                     yield ""
                 gathered.append(event)
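To make the two new branches concrete: a part shaped like the dict below (an illustrative value; the API reports the language in upper case, which the code lowercases for the fence) would be rendered as a fenced block in the streamed output:

```python
part = {
    "executableCode": {
        "language": "PYTHON",
        "code": "import math\nprint(math.factorial(13) * 3)\n",
    }
}
# The executableCode branch yields:
# ```python
# import math
# print(math.factorial(13) * 3)
# ```
# and a later codeExecutionResult part yields its "output" in a plain fence.
```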
llm_gemini-0.2.dist-info/RECORD DELETED

@@ -1,7 +0,0 @@
-llm_gemini.py,sha256=38ONnvzgDWJIE17ODeQd87UWsgvJSeTsDyHpLBTp9og,4305
-llm_gemini-0.2.dist-info/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
-llm_gemini-0.2.dist-info/METADATA,sha256=rVokMpbsBeOsCR59GzpyXcBlj99KFo3g4pO767oyi_k,3059
-llm_gemini-0.2.dist-info/WHEEL,sha256=GV9aMThwP_4oNCtvEC2ec3qUYutgWeAzklro_0m4WJQ,91
-llm_gemini-0.2.dist-info/entry_points.txt,sha256=n544bpgUPIBc5l_cnwsTxPc3gMGJHPtAyqBNp-CkMWk,26
-llm_gemini-0.2.dist-info/top_level.txt,sha256=WUQmG6_2QKbT_8W4HH93qyKl_0SUteL4Ra6_PhyNGKU,11
-llm_gemini-0.2.dist-info/RECORD,,
{llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/LICENSE: file without changes
{llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/entry_points.txt: file without changes
{llm_gemini-0.2.dist-info → llm_gemini-0.3.dist-info}/top_level.txt: file without changes