@heylemon/lemonade 0.5.0 → 0.5.1

@@ -1,5 +1,5 @@
  {
- "version": "0.5.0",
- "commit": "a13eab6d9fff1ec7ccecd4f8abbe2679bf88cbee",
- "builtAt": "2026-02-25T08:16:55.962Z"
+ "version": "0.5.1",
+ "commit": "29eee7a82748ffb33ff4f60eef20cce3ddb33311",
+ "builtAt": "2026-02-25T09:21:40.481Z"
  }
@@ -1 +1 @@
- 095be0769244f46020c935adb52b83f602459c8318e55f7513862d7abb51c029
+ 6e4d6b64c78d4e7ef9e48d3412435e2c6e906c7625d57950ae01e6395b9db7ab
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@heylemon/lemonade",
- "version": "0.5.0",
+ "version": "0.5.1",
  "description": "AI gateway CLI for Lemon - local AI assistant with integrations",
  "publishConfig": {
  "access": "restricted"
@@ -1,15 +1,28 @@
  ---
  name: image-generation
- description: Generate images using DALL-E 3 or GPT Image models. Create artwork, illustrations, photos, designs. Triggers on: generate image, create picture, make art, draw, illustration, design image
+ description: Generate images using DALL-E 3 or Nano Banana Pro (Gemini 3 Pro Image). Create artwork, illustrations, photos, designs, edit images. Triggers on: generate image, create picture, make art, draw, illustration, design image, edit image
  ---

  # Image Generation Skill

- Generate images using OpenAI's image generation API, routed through the backend proxy.
+ Generate images using either **DALL-E 3** (OpenAI) or **Nano Banana Pro** (Gemini 3 Pro Image), routed through the Lemonade backend proxy.

- ## CRITICAL: How to Generate Images
+ ## Model Selection

- Use the `exec` tool to run a Python script. The script routes through the Lemonade backend proxy — no local API key needed.
+ Check the `IMAGE_GEN_MODEL` environment variable to decide which model to use:
+
+ - `dall-e-3` → Use DALL-E 3 (OpenAI) — text-to-image only
+ - `nano-banana-pro` → Use Nano Banana Pro (Gemini 3 Pro Image) — text-to-image, editing, multi-image composition
+
+ If `IMAGE_GEN_MODEL` is not set, default to `dall-e-3`.
+
+ **To check:** run `echo $IMAGE_GEN_MODEL` with the `exec` tool before generating.
+
+ ---
+
+ ## DALL-E 3 (OpenAI)
+
+ Use the `exec` tool to run an inline Python script. Routes through the Lemonade backend proxy — no local API key needed.

  ```python
  import openai
@@ -43,6 +56,55 @@ urllib.request.urlretrieve(image_url, filepath)
  print(f"Image saved to Desktop: {filename}")
  ```

+ ### DALL-E 3 Size Options
+
+ | Size | Aspect Ratio | Best For |
+ |------|--------------|----------|
+ | 1024x1024 | Square | Profile pictures, icons |
+ | 1792x1024 | Landscape | Desktop wallpapers, banners |
+ | 1024x1792 | Portrait | Phone wallpapers, posters |
+
+ ### DALL-E 3 Quality Options
+
+ | Quality | Description |
+ |---------|-------------|
+ | standard | Good for most uses |
+ | hd | More detailed, sharper |
+
+ ---
+
+ ## Nano Banana Pro (Gemini 3 Pro Image)
+
+ Use the bundled script via `exec`. Routes through the Lemonade backend proxy when `LEMON_BACKEND_URL` and `GATEWAY_TOKEN` are set, otherwise falls back to `GEMINI_API_KEY`.
+
+ ### Generate
+
+ ```bash
+ uv run {baseDir}/scripts/generate_image.py --prompt "your image description" --filename ~/Desktop/output.png --resolution 1K
+ ```
+
+ ### Edit (single image)
+
+ ```bash
+ uv run {baseDir}/scripts/generate_image.py --prompt "edit instructions" --filename ~/Desktop/output.png -i "/path/to/input.png" --resolution 2K
+ ```
+
+ ### Multi-image composition (up to 14 images)
+
+ ```bash
+ uv run {baseDir}/scripts/generate_image.py --prompt "combine these into one scene" --filename ~/Desktop/output.png -i img1.png -i img2.png -i img3.png
+ ```
+
+ ### Nano Banana Pro Resolutions
+
+ | Resolution | Best For |
+ |------------|----------|
+ | 1K (default) | Quick previews, social media |
+ | 2K | Prints, detailed work |
+ | 4K | High-resolution output |
+
+ ---
+
  ## Prompt Best Practices

  ### Be Specific
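
Note on the hunk above: the new Nano Banana Pro section says the script uses the backend proxy when both `LEMON_BACKEND_URL` and `GATEWAY_TOKEN` are set and otherwise falls back to `GEMINI_API_KEY`. The sketch below illustrates that precedence as a standalone function. The name `resolve_gemini_route` and its return shape are hypothetical and not part of the package; the proxy path suffix is copied from the added `generate_image.py` in this diff.

```python
from typing import Optional


def resolve_gemini_route(env: dict, cli_key: Optional[str] = None) -> tuple:
    """Hypothetical helper: return ("proxy", url) or ("direct", api_key)."""
    backend_url = env.get("LEMON_BACKEND_URL", "").rstrip("/")
    gateway_token = env.get("GATEWAY_TOKEN", "")
    # The proxy is chosen only when BOTH variables are non-empty.
    if backend_url and gateway_token:
        return "proxy", f"{backend_url}/api/lemonade/proxy/gemini"
    # Otherwise a key passed on the command line beats GEMINI_API_KEY.
    api_key = cli_key or env.get("GEMINI_API_KEY")
    if api_key:
        return "direct", api_key
    raise RuntimeError("no backend proxy configuration and no Gemini API key")
```

Passing the environment as a plain dict keeps the precedence easy to check without touching `os.environ`. Setting only one of the two proxy variables falls through to the key lookup.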
@@ -58,31 +120,21 @@ print(f"Image saved to Desktop: {filename}")
  - Mood: "serene", "energetic", "mysterious"
  - Colors: "vibrant colors", "muted tones", "black and white"

- ## Size Options
-
- | Size | Aspect Ratio | Best For |
- |------|--------------|----------|
- | 1024x1024 | Square | Profile pictures, icons |
- | 1792x1024 | Landscape | Desktop wallpapers, banners |
- | 1024x1792 | Portrait | Phone wallpapers, posters |
-
- ## Quality Options
-
- | Quality | Description |
- |---------|-------------|
- | standard | Good for most uses |
- | hd | More detailed, sharper |
-
  ## Workflow

- 1. **Understand request** - What does the user want to create?
- 2. **Craft detailed prompt** - Add style, lighting, composition details
- 3. **Generate image** - Run the Python script
- 4. **Save to Desktop** - So user can easily find it
- 5. **Confirm** - Tell user where the image was saved
+ 1. **Check model** - Run `echo $IMAGE_GEN_MODEL` to determine which backend to use
+ 2. **Understand request** - What does the user want to create or edit?
+ 3. **Choose model** - If editing or compositing, prefer Nano Banana Pro (if activated). For text-to-image, use whichever is set.
+ 4. **Craft detailed prompt** - Add style, lighting, composition details
+ 5. **Generate image** - Run the appropriate script
+ 6. **Save to Desktop** - Use `~/Desktop/` with timestamped filename
+ 7. **Confirm** - Tell user where the image was saved

  ## Important Notes

  - Always save generated images to `~/Desktop/` for easy access
- - Images expire from the URL after ~1 hour, so always save locally
- - Always install openai package first if needed: `source ~/.lemonade/.venv/bin/activate && pip install -q openai`
+ - DALL-E image URLs expire after ~1 hour, so always save locally
+ - For DALL-E, install openai if needed: `source ~/.lemonade/.venv/bin/activate && pip install -q openai`
+ - Nano Banana Pro requires `uv` — install via `brew install uv` if missing
+ - Use timestamps in filenames: `yyyy-mm-dd-hh-mm-ss-name.png`
+ - The Nano Banana Pro script prints a `MEDIA:` line for Lemonade to auto-attach on supported providers
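
Note on the skill changes above: the updated skill reads `IMAGE_GEN_MODEL`, defaults to `dall-e-3` when it is unset, and asks for `yyyy-mm-dd-hh-mm-ss-name.png` filenames under `~/Desktop/`. A minimal Python sketch of those two rules follows. The helper names `select_model` and `desktop_filename` are hypothetical, and rejecting unknown values is an assumption here, since the skill text does not say what should happen for them.

```python
import os
from datetime import datetime
from pathlib import Path
from typing import Optional

# The two values the skill documents for IMAGE_GEN_MODEL.
SUPPORTED_MODELS = ("dall-e-3", "nano-banana-pro")


def select_model(env: Optional[dict] = None) -> str:
    """Hypothetical helper: pick the model, defaulting to dall-e-3."""
    env = os.environ if env is None else env
    model = env.get("IMAGE_GEN_MODEL", "").strip()
    if not model:
        return "dall-e-3"  # documented default when the variable is unset
    if model not in SUPPORTED_MODELS:
        # Assumption: fail loudly rather than guess; the skill does not specify this.
        raise ValueError(f"unsupported IMAGE_GEN_MODEL: {model!r}")
    return model


def desktop_filename(name: str, now: Optional[datetime] = None) -> Path:
    """Hypothetical helper: build ~/Desktop/yyyy-mm-dd-hh-mm-ss-name.png."""
    now = now or datetime.now()
    return Path.home() / "Desktop" / f"{now:%Y-%m-%d-%H-%M-%S}-{name}.png"
```

For example, `desktop_filename("sunset")` could be passed as the `--filename` argument of the bundled script.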
@@ -0,0 +1,201 @@
+ #!/usr/bin/env python3
+ # /// script
+ # requires-python = ">=3.10"
+ # dependencies = [
+ #     "google-genai>=1.0.0",
+ #     "pillow>=10.0.0",
+ # ]
+ # ///
+ """
+ Generate images using Google's Nano Banana Pro (Gemini 3 Pro Image) API.
+
+ Usage:
+     uv run generate_image.py --prompt "your image description" --filename "output.png" [--resolution 1K|2K|4K] [--api-key KEY]
+
+ Multi-image editing (up to 14 images):
+     uv run generate_image.py --prompt "combine these images" --filename "output.png" -i img1.png -i img2.png -i img3.png
+ """
+
+ import argparse
+ import os
+ import sys
+ from pathlib import Path
+
+
+ def get_api_key(provided_key: str | None) -> str | None:
+     """Get API key from argument first, then environment."""
+     if provided_key:
+         return provided_key
+     return os.environ.get("GEMINI_API_KEY")
+
+
+ def get_proxy_config() -> tuple[str | None, str | None]:
+     """Get backend proxy URL and gateway token for server-side API key routing."""
+     backend_url = os.environ.get("LEMON_BACKEND_URL", "").rstrip("/")
+     gateway_token = os.environ.get("GATEWAY_TOKEN", "")
+     if backend_url and gateway_token:
+         return f"{backend_url}/api/lemonade/proxy/gemini", gateway_token
+     return None, None
+
+
+ def main():
+     parser = argparse.ArgumentParser(
+         description="Generate images using Nano Banana Pro (Gemini 3 Pro Image)"
+     )
+     parser.add_argument(
+         "--prompt", "-p",
+         required=True,
+         help="Image description/prompt"
+     )
+     parser.add_argument(
+         "--filename", "-f",
+         required=True,
+         help="Output filename (e.g., sunset-mountains.png)"
+     )
+     parser.add_argument(
+         "--input-image", "-i",
+         action="append",
+         dest="input_images",
+         metavar="IMAGE",
+         help="Input image path(s) for editing/composition. Can be specified multiple times (up to 14 images)."
+     )
+     parser.add_argument(
+         "--resolution", "-r",
+         choices=["1K", "2K", "4K"],
+         default="1K",
+         help="Output resolution: 1K (default), 2K, or 4K"
+     )
+     parser.add_argument(
+         "--api-key", "-k",
+         help="Gemini API key (overrides GEMINI_API_KEY env var)"
+     )
+
+     args = parser.parse_args()
+
+     # Prefer backend proxy (API key stays server-side), fall back to local key
+     proxy_url, gateway_token = get_proxy_config()
+     api_key = get_api_key(args.api_key)
+
+     if not proxy_url and not api_key:
+         print("Error: No API key or backend proxy available.", file=sys.stderr)
+         print("Please either:", file=sys.stderr)
+         print(" 1. Set LEMON_BACKEND_URL + GATEWAY_TOKEN (backend proxy)", file=sys.stderr)
+         print(" 2. Provide --api-key argument", file=sys.stderr)
+         print(" 3. Set GEMINI_API_KEY environment variable", file=sys.stderr)
+         sys.exit(1)
+
+     from google import genai
+     from google.genai import types
+     from PIL import Image as PILImage
+
+     if proxy_url:
+         print(f"Using backend proxy for Gemini API")
+         client = genai.Client(
+             api_key=gateway_token,
+             http_options=types.HttpOptions(api_endpoint=proxy_url),
+         )
+     else:
+         client = genai.Client(api_key=api_key)
+
+     # Set up output path
+     output_path = Path(args.filename)
+     output_path.parent.mkdir(parents=True, exist_ok=True)
+
+     # Load input images if provided (up to 14 supported by Nano Banana Pro)
+     input_images = []
+     output_resolution = args.resolution
+     if args.input_images:
+         if len(args.input_images) > 14:
+             print(f"Error: Too many input images ({len(args.input_images)}). Maximum is 14.", file=sys.stderr)
+             sys.exit(1)
+
+         max_input_dim = 0
+         for img_path in args.input_images:
+             try:
+                 img = PILImage.open(img_path)
+                 input_images.append(img)
+                 print(f"Loaded input image: {img_path}")
+
+                 # Track largest dimension for auto-resolution
+                 width, height = img.size
+                 max_input_dim = max(max_input_dim, width, height)
+             except Exception as e:
+                 print(f"Error loading input image '{img_path}': {e}", file=sys.stderr)
+                 sys.exit(1)
+
+         # Auto-detect resolution from largest input if not explicitly set
+         if args.resolution == "1K" and max_input_dim > 0:  # Default value
+             if max_input_dim >= 3000:
+                 output_resolution = "4K"
+             elif max_input_dim >= 1500:
+                 output_resolution = "2K"
+             else:
+                 output_resolution = "1K"
+             print(f"Auto-detected resolution: {output_resolution} (from max input dimension {max_input_dim})")
+
+     # Build contents (images first if editing, prompt only if generating)
+     if input_images:
+         contents = [*input_images, args.prompt]
+         img_count = len(input_images)
+         print(f"Processing {img_count} image{'s' if img_count > 1 else ''} with resolution {output_resolution}...")
+     else:
+         contents = args.prompt
+         print(f"Generating image with resolution {output_resolution}...")
+
+     try:
+         response = client.models.generate_content(
+             model="gemini-3-pro-image-preview",
+             contents=contents,
+             config=types.GenerateContentConfig(
+                 response_modalities=["TEXT", "IMAGE"],
+                 image_config=types.ImageConfig(
+                     image_size=output_resolution
+                 )
+             )
+         )
+
+         # Process response and convert to PNG
+         image_saved = False
+         for part in response.parts:
+             if part.text is not None:
+                 print(f"Model response: {part.text}")
+             elif part.inline_data is not None:
+                 # Convert inline data to PIL Image and save as PNG
+                 from io import BytesIO
+
+                 # inline_data.data is already bytes, not base64
+                 image_data = part.inline_data.data
+                 if isinstance(image_data, str):
+                     # If it's a string, it might be base64
+                     import base64
+                     image_data = base64.b64decode(image_data)
+
+                 image = PILImage.open(BytesIO(image_data))
+
+                 # Ensure RGB mode for PNG (convert RGBA to RGB with white background if needed)
+                 if image.mode == 'RGBA':
+                     rgb_image = PILImage.new('RGB', image.size, (255, 255, 255))
+                     rgb_image.paste(image, mask=image.split()[3])
+                     rgb_image.save(str(output_path), 'PNG')
+                 elif image.mode == 'RGB':
+                     image.save(str(output_path), 'PNG')
+                 else:
+                     image.convert('RGB').save(str(output_path), 'PNG')
+                 image_saved = True
+
+         if image_saved:
+             full_path = output_path.resolve()
+             print(f"\nImage saved: {full_path}")
+             # Lemonade parses MEDIA tokens and will attach the file on supported providers.
+             print(f"MEDIA: {full_path}")
+         else:
+             print("Error: No image was generated in the response.", file=sys.stderr)
+             sys.exit(1)
+
+     except Exception as e:
+         print(f"Error generating image: {e}", file=sys.stderr)
+         sys.exit(1)
+
+
+ if __name__ == "__main__":
+     main()
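
Note on the added script: the auto-resolution step maps the largest input-image dimension to an output size, but only when `--resolution` holds its default value `1K`. The sketch below isolates that rule as a pure function so the thresholds are easy to check. The function name `auto_resolution` is hypothetical; the 3000 and 1500 thresholds are taken from the script above.

```python
def auto_resolution(requested: str, max_input_dim: int) -> str:
    """Hypothetical helper mirroring the script's auto-detect thresholds."""
    # Any non-default request, or no input images, is left unchanged.
    if requested != "1K" or max_input_dim <= 0:
        return requested
    if max_input_dim >= 3000:
        return "4K"
    if max_input_dim >= 1500:
        return "2K"
    return "1K"


print(auto_resolution("1K", 4096))  # large input, default request
print(auto_resolution("2K", 4096))  # explicit request is kept
```

Because the check compares against the default value, an explicit `--resolution 1K` cannot be told apart from an omitted flag, so a large input image still raises the output size in that case.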