@comate/zulu 1.4.0-beta.3 → 1.4.0-beta.5

Files changed (39)
  1. package/comate-engine/assets/skills/code-review/SKILL.md +6 -5
  2. package/comate-engine/assets/skills/code-review/agents/custom-reviewer.md +2 -2
  3. package/comate-engine/assets/skills/code-review/agents/meta-reviewer.md +2 -2
  4. package/comate-engine/assets/skills/code-review/agents/style-reviewer.md +72 -10
  5. package/comate-engine/assets/skills/code-review/references/dispatch-template.md +12 -12
  6. package/comate-engine/assets/skills/code-review/references/rules/Java/JAVA_STYLE_RULES.md +11 -5
  7. package/comate-engine/assets/skills/{create-automation-tasks-comate → create-automation}/SKILL.md +5 -8
  8. package/comate-engine/assets/skills/{create-automation-tasks-comate → create-automation}/references/long_running_task.md +0 -15
  9. package/comate-engine/assets/skills/{create-automation-tasks-comate → create-automation}/references/testing_strategy.md +1 -1
  10. package/comate-engine/assets/skills/create-image/SKILL.md +197 -201
  11. package/comate-engine/assets/skills/create-image/scripts/generate-image.ps1 +213 -0
  12. package/comate-engine/assets/skills/create-image/scripts/generate-image.sh +322 -0
  13. package/comate-engine/assets/skills/create-skill/SKILL.md +1 -2
  14. package/comate-engine/assets/skills/create-subagent/SKILL.md +11 -5
  15. package/comate-engine/assets/skills/get-ugate-token/SKILL.md +97 -13
  16. package/comate-engine/assets/skills/get-ugate-token/getUgateToken.py +99 -5
  17. package/comate-engine/node_modules/@baidu/comate-browser-use/dist/launch-chrome/index.js +1 -1
  18. package/comate-engine/node_modules/@baidu/comate-browser-use/package.json +5 -5
  19. package/comate-engine/node_modules/@comate/plugin-shared-internals/dist/index.js +1 -1
  20. package/comate-engine/package.json +1 -1
  21. package/comate-engine/server.js +149 -180
  22. package/dist/bundle/index.js +3 -3
  23. package/package.json +1 -1
  24. package/scripts/postinstall.js +4 -3
  25. package/comate-engine/assets/skills/code-review/evals/SKILL.md +0 -334
  26. package/comate-engine/assets/skills/code-review/evals/agents/gt-generator.md +0 -76
  27. package/comate-engine/assets/skills/code-review/evals/agents/miner.md +0 -87
  28. package/comate-engine/assets/skills/code-review/evals/agents/score-judge.md +0 -168
  29. package/comate-engine/assets/skills/code-review/evals/references/cli-query-template.md +0 -114
  30. package/comate-engine/assets/skills/code-review/evals/references/gt-schema.md +0 -77
  31. package/comate-engine/assets/skills/code-review/references/custom-rules/RULE_TEMPLATE.md +0 -141
  32. /package/comate-engine/assets/commands/{code-review-comate.md → code-review.md} +0 -0
  33. /package/comate-engine/assets/commands/{debug-comate.md → debug.md} +0 -0
  34. /package/comate-engine/assets/commands/{unit-test-comate.md → unit-test.md} +0 -0
  35. /package/comate-engine/assets/skills/{create-automation-tasks-comate → create-automation}/references/backend_dev.md +0 -0
  36. /package/comate-engine/assets/skills/{create-automation-tasks-comate → create-automation}/references/env_setup.md +0 -0
  37. /package/comate-engine/assets/skills/{create-automation-tasks-comate → create-automation}/references/frontend_dev.md +0 -0
  38. /package/comate-engine/assets/skills/{create-automation-tasks-comate → create-automation}/references/git_operations.md +0 -0
  39. /package/comate-engine/assets/skills/{create-automation-tasks-comate → create-automation}/scripts/check_config.py +0 -0
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: create-image
3
- description: "Generate/edit images with Nano Banana Pro (Gemini 3 Pro Image). Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K/2K/4K resolution."
3
+ description: "Generate/edit images with Nano Banana Pro (Gemini 3 Pro Image). Use for image create/modify requests incl. edits. Supports text-to-image + image-to-image; 1K/2K/4K resolution and multiple aspect ratios."
4
4
  metadata:
5
5
  enableWhen:
6
6
  - isInternalComateIDE
@@ -8,274 +8,270 @@ metadata:
8
8
 
9
9
  # Nano Banana Pro Image Generation & Editing
10
10
 
11
- Generate new images or edit existing ones using Google's Nano Banana Pro API (Gemini 3 Pro Image) via curl.
11
+ Generate new images or edit existing ones using Google's Nano Banana Pro API (Gemini 3 Pro Image), via a unified cross-platform wrapper script.
12
12
 
13
- ## Requirements
13
+ ---
14
+
15
+ ## Prerequisites
16
+
17
+ - **macOS / Linux:** `curl`, `base64`, and **one** of `python3` / `jq` for JSON handling (python3 is preferred and universally available). Resolution auto-detect needs `sips` (macOS built-in) or `identify` (Linux, `imagemagick`).
18
+ - **Windows:** `curl.exe` (Win10 1803+) and a PowerShell host. **Default to `powershell.exe`** (Windows PowerShell 5.1, ships with every Windows 10/11) — no detection needed. Only fall back to `pwsh.exe` (PowerShell 7+) if `powershell.exe` fails or is unavailable.
19
+ - **Credentials:** the script resolves the encrypted login name automatically:
20
+ 1. `COMATE_USERNAME_ENCRYPTED` env var, if set; otherwise
21
+ 2. `~/.comate/login` JWT (written by the Comate IDE on login) — payload's `content.identity` is extracted.
22
+
23
+ Agents do **not** need to export the env var themselves as long as the user is signed in to Comate.
24
+
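The credential lookup order described above can be sketched in Python. This is only an illustration, not the shipped script: the helper names `decode_jwt_payload` and `resolve_login_name` are hypothetical, the JWT signature is not verified, and it assumes `content` is a JSON object whose `identity` field holds the encrypted login name.

```python
import base64
import json
import os
from pathlib import Path


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    parts = token.strip().split(".")
    if len(parts) < 2:
        raise ValueError("not a JWT: expected header.payload[.signature]")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    decoded = json.loads(base64.urlsafe_b64decode(payload))
    if not isinstance(decoded, dict):
        raise ValueError("JWT payload is not a JSON object")
    return decoded


def resolve_login_name(env=None, login_file=None):
    """Return the login name from the env var, else from the login JWT, else None."""
    env = os.environ if env is None else env
    name = env.get("COMATE_USERNAME_ENCRYPTED")
    if name:
        return name  # step 1: the env var wins when set
    path = Path(login_file) if login_file else Path.home() / ".comate" / "login"
    if not path.is_file():
        return None
    try:
        payload = decode_jwt_payload(path.read_text(encoding="utf-8"))
    except (ValueError, OSError):
        return None  # unreadable or malformed token
    content = payload.get("content")  # assumed to be a nested object
    return content.get("identity") if isinstance(content, dict) else None
```

A `None` result corresponds to the situation the skill reports under the `env` stage: neither source is usable.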
25
+ ---
14
26
 
15
- ### macOS / Linux
27
+ ## PowerShell host on Windows
16
28
 
17
- - `curl` and `jq` must be installed
18
- - `base64` built-in
19
- - `sips` for auto-resolution detection (macOS built-in)
20
- - `identify` for auto-resolution detection (Linux, via `imagemagick`)
29
+ **Default: `powershell.exe`** (Windows PowerShell 5.1). It ships with every Windows 10/11, so no probing is required — just call it directly.
21
30
 
22
- ### Windows
31
+ Only switch to `pwsh.exe` (PowerShell 7+) if `powershell.exe` is missing or the invocation fails for a host-specific reason.
23
32
 
24
- - `curl.exe` built-in (Windows 10 1803+)
25
- - PowerShell 5.1+ built-in (Windows 10+), used for all JSON parsing
33
+ ### Recommended invocation template
26
34
 
27
- ### Common
35
+ Always pass `-NoProfile -ExecutionPolicy Bypass -File <script>` so user profile overrides and execution-policy blocks cannot break the run:
28
36
 
29
- - API endpoint: `https://comate.baidu-int.com/api/aidevops/autocomate/rest/autowork/v1/generate-image`
30
- - Username is passed via Header `login-name`, NOT in the request body
31
- - Username: ${COMATE_USERNAME_ENCRYPTED}
37
+ ```powershell
38
+ powershell -NoProfile -ExecutionPolicy Bypass -File <SKILL_DIR>\scripts\generate-image.ps1 `
39
+ -Prompt "..." -Output "hero-v1.png" ...
40
+ ```
41
+
42
+ If that call fails because `powershell.exe` is unavailable, retry with `pwsh` using the same arguments.
32
43
 
33
44
  ---
34
45
 
35
- ## Usage
36
46
 
37
- **Important:** Always save output images to the user's current working directory.
38
47
 
39
- ### 1. Generate new image (text-to-image)
48
+ Two scripts ship alongside this skill and encapsulate all HTTP / JSON / base64 / temp-file details. **Always call these instead of writing curl by hand.**
49
+
50
+ - macOS / Linux: `scripts/generate-image.sh`
51
+ - Windows: `scripts/generate-image.ps1`
52
+
53
+ Both produce output in `<projectRoot>/images/` (created on demand) and print the absolute path of the generated file to stdout.
40
54
 
41
- **macOS / Linux (bash/zsh):**
55
+ ### Arguments
56
+
57
+ | Flag (bash) | Param (PowerShell) | Required | Default | Values |
58
+ |---------------------|--------------------|----------|---------|-----------------------------------------------------------------|
59
+ | `--prompt`, `-p` | `-Prompt` | yes | — | any string |
60
+ | `--output`, `-o` | `-Output` | yes | — | filename only (e.g. `hero-v1.png`), stored under `images/` |
61
+ | `--input`, `-i` | `-InputPath` | no | — | path to reference image (png/jpg/webp) for image-to-image |
62
+ | `--resolution`, `-r`| `-Resolution` | no | `1K`* | `1K` / `2K` / `4K` |
63
+ | `--aspect-ratio`,`-a`| `-AspectRatio` | no | `1:1` | `1:1` / `16:9` / `9:16` / `4:3` / `3:4` / `3:2` / `2:3` / `21:9` |
64
+
65
+ *When `--input` is given and `--resolution` is omitted, the script auto-picks `4K` / `2K` / `1K` from the input image's longest edge (≥3000 / ≥1500 / else).
66
+
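The auto-pick rule in the footnote above amounts to a threshold check on the longest edge. A minimal Python sketch, assuming the pixel dimensions have already been read (the scripts use `sips` or `identify` for that); `pick_resolution` is a hypothetical name:

```python
def pick_resolution(width: int, height: int) -> str:
    """Map an input image's longest edge to 4K / 2K / 1K."""
    longest = max(width, height)
    if longest >= 3000:
        return "4K"
    if longest >= 1500:
        return "2K"
    return "1K"
```

For example, a 1920x1080 input falls in the `2K` band because its longest edge is at least 1500 but below 3000.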
67
+ ### Parameter selection heuristics (when user is ambiguous)
68
+
69
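+ - "wallpaper / landscape cover / banner" → `16:9`
70
+ - "phone wallpaper / portrait poster" → `9:16`
71
+ - "avatar / icon / square card" → `1:1`
72
+ - "book illustration / story scene (landscape)" → `4:3` or `3:2`
73
+ - "cinematic widescreen / cinematic" → `21:9`
74
+ - Draft / quick trial and error → `1K`; mid-quality delivery → `2K`; final version → `4K`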
75
+
76
+ If intent is unclear, ask the user before calling.
77
+
78
+ ---
79
+
80
+ ## Usage
81
+
82
+ ### Text-to-image
83
+
84
+ **macOS / Linux:**
42
85
 
43
86
  ```bash
44
- curl -s -X POST https://comate.baidu-int.com/api/aidevops/autocomate/rest/autowork/v1/generate-image \
45
- -H "Content-Type: application/json" \
46
- -H "login-name: ${COMATE_USERNAME_ENCRYPTED}" \
47
- -d '{
48
- "contents": [{"role": "USER", "parts": [{"text": "YOUR_PROMPT_HERE"}]}],
49
- "resolution": "1K"
50
- }' \
51
- | jq -r '.candidates[0].content.parts[] | select(.inlineData) | .inlineData.data' \
52
- | base64 --decode > output-name.png
87
+ bash <SKILL_DIR>/scripts/generate-image.sh \
88
+ --prompt "a minimalist mountain at sunrise, flat illustration" \
89
+ --output hero-v1.png \
90
+ --aspect-ratio 16:9 \
91
+ --resolution 2K
53
92
  ```
54
93
 
55
94
  **Windows (PowerShell):**
56
95
 
96
+ > Default to `powershell`; only fall back to `pwsh` if `powershell` fails or is unavailable.
97
+
57
98
  ```powershell
58
- # Write JSON to temp file to avoid PowerShell double-quote escaping bug with curl.exe -d
59
- $tmpFile = "$env:TEMP\request.json"
60
- @'
61
- {"contents":[{"role": "USER", "parts":[{"text":"YOUR_PROMPT_HERE"}]}],"resolution":"1K"}
62
- '@ | Set-Content -Path $tmpFile -Encoding UTF8 -NoNewline
63
-
64
- $response = curl.exe -s -X POST https://comate.baidu-int.com/api/aidevops/autocomate/rest/autowork/v1/generate-image `
65
- -H "Content-Type: application/json" `
66
- -H "login-name: $env:${COMATE_USERNAME_ENCRYPTED}" `
67
- -d "@$tmpFile" | ConvertFrom-Json
68
-
69
- $imgBase64 = $response.candidates[0].content.parts |
70
- Where-Object { $_.inlineData } |
71
- Select-Object -First 1 -ExpandProperty inlineData |
72
- Select-Object -ExpandProperty data
73
- [System.IO.File]::WriteAllBytes(
74
- (Join-Path (Get-Location) "output-name.png"),
75
- [System.Convert]::FromBase64String($imgBase64)
76
- )
99
+ powershell -NoProfile -ExecutionPolicy Bypass -File <SKILL_DIR>\scripts\generate-image.ps1 `
100
+ -Prompt "a minimalist mountain at sunrise, flat illustration" `
101
+ -Output hero-v1.png `
102
+ -AspectRatio 16:9 `
103
+ -Resolution 2K
77
104
  ```
78
105
 
79
- ### 2. Edit existing image (image-to-image)
106
+ ### Image-to-image (edit an existing file)
80
107
 
81
- **base64 image data is large — must write to a temp file, never inline in the command.**
82
-
83
- **macOS / Linux (bash/zsh):**
108
+ **macOS / Linux:**
84
109
 
85
110
  ```bash
86
- # Step 1: encode input image
87
- IMG_B64=$(base64 -i path/to/input.png | tr -d '\n')
88
-
89
- # Step 2: write JSON to temp file
90
- cat > /tmp/request.json << EOF
91
- {
92
- "contents": [
93
- {
94
- "role": "USER",
95
- "parts": [
96
- {"inline_data": {"mime_type": "image/png", "data": "${IMG_B64}"}},
97
- {"text": "YOUR_EDIT_INSTRUCTIONS_HERE"}
98
- ]
99
- }
100
- ],
101
- "resolution": "2K"
102
- }
103
- EOF
104
-
105
- # Step 3: send request
106
- curl -s -X POST https://comate.baidu-int.com/api/aidevops/autocomate/rest/autowork/v1/generate-image \
107
- -H "Content-Type: application/json" \
108
- -H "login-name: ${COMATE_USERNAME_ENCRYPTED}" \
109
- -d @/tmp/request.json \
110
- | jq -r '.candidates[0].content.parts[] | select(.inlineData) | .inlineData.data' \
111
- | base64 --decode > output-name.png
111
+ bash <SKILL_DIR>/scripts/generate-image.sh \
112
+ --prompt "repaint in Ghibli style, warmer palette" \
113
+ --input images/hero-v1.png \
114
+ --output hero-v2.png \
115
+ --aspect-ratio 16:9
112
116
  ```
113
117
 
114
118
  **Windows (PowerShell):**
115
119
 
116
120
  ```powershell
117
- # Step 1: encode input image
118
- $imgBytes = [System.IO.File]::ReadAllBytes((Resolve-Path "path\to\input.png"))
119
- $imgBase64 = [System.Convert]::ToBase64String($imgBytes)
120
-
121
- # Step 2: write JSON to temp file
122
- $tmpFile = "$env:TEMP\request.json"
123
- @"
124
- {
125
- "contents": [
126
- {
127
- "role": "USER",
128
- "parts": [
129
- {"inline_data": {"mime_type": "image/png", "data": "$imgBase64"}},
130
- {"text": "YOUR_EDIT_INSTRUCTIONS_HERE"}
131
- ]
132
- }
133
- ],
134
- "resolution": "2K"
135
- }
136
- "@ | Set-Content -Path $tmpFile -Encoding UTF8
137
-
138
- # Step 3: send request
139
- $response = curl.exe -s -X POST https://comate.baidu-int.com/api/aidevops/autocomate/rest/autowork/v1/generate-image `
140
- -H "Content-Type: application/json" `
141
- -H "login-name: $env:${COMATE_USERNAME_ENCRYPTED}" `
142
- -d "@$tmpFile" | ConvertFrom-Json
143
-
144
- $outBase64 = $response.candidates[0].content.parts |
145
- Where-Object { $_.inlineData } |
146
- Select-Object -First 1 -ExpandProperty inlineData |
147
- Select-Object -ExpandProperty data
148
-
149
- [System.IO.File]::WriteAllBytes(
150
- (Join-Path (Get-Location) "output-name.png"),
151
- [System.Convert]::FromBase64String($outBase64)
152
- )
121
+ powershell -NoProfile -ExecutionPolicy Bypass -File <SKILL_DIR>\scripts\generate-image.ps1 `
122
+ -Prompt "repaint in Ghibli style, warmer palette" `
123
+ -InputPath images\hero-v1.png `
124
+ -Output hero-v2.png `
125
+ -AspectRatio 16:9
153
126
  ```
154
127
 
155
- ---
128
+ `<SKILL_DIR>` resolves to this skill's directory at runtime. If the shell does not have that variable, substitute the absolute path to the skill folder.
129
+
130
+ ### Batch generation
156
131
 
157
- ## Default Workflow (draft > iterate > final)
132
+ Call the script multiple times (concurrent or sequential). Each run writes one image; use distinct `--output` names with a version suffix (`-v1`, `-v2`, `scene1`, `scene2`, …). Do NOT overwrite previous iterations.
158
133
 
159
- - Draft (1K): quick feedback loop
160
- - Iterate: adjust prompt in small diffs; keep filename new per run
161
- - Final (4K): only when prompt is locked
134
+ ---
162
135
 
163
- ## Resolution Options
136
+ ## Output Directory & Naming
164
137
 
165
- - **1K** (default) - ~1024px, use for drafts and quick feedback
166
- - **2K** - ~2048px, use for mid-quality or medium-size input images
167
- - **4K** - ~4096px, use only when prompt is locked and quality is required
138
+ - Always write to `<projectRoot>/images/` (the single unified location), **never** to `.comate/images/` or other subfolders.
139
+ - File names: kebab-case + version suffix (`hero-v1.png`, `scene1.png`). Iterate with `-v2`, `-v3` instead of overwriting.
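One way to follow the "bump the version instead of overwriting" rule is to scan `images/` for the highest existing suffix. The sketch below is an assumption about how an agent might choose the next name; `next_versioned_name` is hypothetical and is not part of the shipped scripts.

```python
import re
from pathlib import Path


def next_versioned_name(images_dir, stem, ext=".png"):
    """Return '<stem>-vN<ext>' where N is one past the highest version on disk."""
    pattern = re.compile(re.escape(stem) + r"-v(\d+)" + re.escape(ext))
    highest = 0
    directory = Path(images_dir)
    if directory.is_dir():
        for entry in directory.iterdir():
            match = pattern.fullmatch(entry.name)
            if match:
                highest = max(highest, int(match.group(1)))
    return f"{stem}-v{highest + 1}{ext}"
```

With `hero-v1.png` and `hero-v3.png` already present, the function returns `hero-v4.png`, so no earlier iteration is overwritten.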
168
140
 
169
141
  ---
170
142
 
171
- ## Auto-Resolution Detection (image-to-image only)
143
+ ## Workflow (draft → iterate → final)
172
144
 
173
- When editing an existing image and the user has not specified a resolution,
174
- auto-detect based on the input image's longest edge.
145
+ 1. **Draft:** `--resolution 1K`, quick feedback loop.
146
+ 2. **Iterate:** tweak prompt in small diffs; bump filename (`-v2`, `-v3`).
147
+ 3. **Final:** re-run with `--resolution 4K` once the prompt is locked.
175
148
 
176
- **macOS (bash/zsh):**
149
+ ---
177
150
 
178
- ```bash
179
- MAX_DIM=$(sips -g pixelWidth -g pixelHeight "input.png" \
180
- | awk '/pixel/{print $2}' | sort -rn | head -1)
151
+ ## Output Display Format (MUST FOLLOW)
181
152
 
182
- if [ "${MAX_DIM}" -ge 3000 ]; then RESOLUTION="4K"
183
- elif [ "${MAX_DIM}" -ge 1500 ]; then RESOLUTION="2K"
184
- else RESOLUTION="1K"
185
- fi
186
- ```
153
+ After generation, display results using an **HTML `<table>` block** (NOT `![](path)` markdown). This renders as a polished gallery in the chat UI and in embedded `.md` files.
187
154
 
188
- **Linux (bash):**
155
+ Each cell contains only:
189
156
 
190
- ```bash
191
- # requires: apt install imagemagick / yum install imagemagick
192
- MAX_DIM=$(identify -format "%[fx:max(w,h)]" input.png)
157
+ - `<img src="...">` with a `width` picked from the layout table below
193
158
 
194
- if [ "${MAX_DIM}" -ge 3000 ]; then RESOLUTION="4K"
195
- elif [ "${MAX_DIM}" -ge 1500 ]; then RESOLUTION="2K"
196
- else RESOLUTION="1K"
197
- fi
198
- ```
159
+ Do NOT add `<b>` titles or `<small>` captions unless the user explicitly asks for them. Keep cells image-only by default.
199
160
 
200
- **Windows (PowerShell):**
161
+ ### Path rule
201
162
 
202
- ```powershell
203
- Add-Type -AssemblyName System.Drawing
204
- $img = [System.Drawing.Image]::FromFile((Resolve-Path "input.png"))
205
- $maxDim = [Math]::Max($img.Width, $img.Height)
206
- $img.Dispose()
207
-
208
- $resolution = if ($maxDim -ge 3000) { "4K" }
209
- elseif ($maxDim -ge 1500) { "2K" }
210
- else { "1K" }
211
- ```
163
+ - Inside a `.md` file that lives next to the images → **relative path** (e.g. `scene1.png`).
164
+ - Inline in chat reply:
165
+ - **macOS / Linux** → absolute path (e.g. `/Users/.../project/images/hero.png`).
166
+ - **Windows** → **relative path** (e.g. `images/hero.png`), because absolute Windows paths (e.g. `C:\Users\...`) do not render reliably inside `<img src>` in the chat UI / markdown preview.
167
+ - **Always use forward slashes (`/`) in `<img src>` values, never backslashes (`\`)** — this applies to Windows too. For example, use `images/hero.png`, not `images\hero.png`.
168
+ - Never mix absolute and relative formats in one table.
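For the inline-chat case, the path rule can be sketched with `pathlib`'s pure path classes, which handle Windows separators on any host. The function name `img_src` and its parameters are hypothetical; the sketch assumes the script's printed absolute path and the project root are both known.

```python
from pathlib import PurePosixPath, PureWindowsPath


def img_src(abs_path: str, project_root: str, windows: bool) -> str:
    """Build an <img src> value for a chat reply, always with forward slashes."""
    if windows:
        # Windows: path relative to the project root, rendered with '/'.
        rel = PureWindowsPath(abs_path).relative_to(PureWindowsPath(project_root))
        return rel.as_posix()
    # macOS / Linux: keep the absolute path as printed by the script.
    return PurePosixPath(abs_path).as_posix()
```

`relative_to` raises `ValueError` if the image is not under the project root, which is a reasonable signal that the path should be checked before it is displayed.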
212
169
 
213
- ---
170
+ ### Layout rules
214
171
 
215
- ## Prompt Handling
172
+ | Image count | Layout | `width` per image |
173
+ |-------------|------------------------------------|-------------------|
174
+ | 1 | Single centered cell | `640` |
175
+ | 2 | 1 row × 2 columns | `480` |
176
+ | 3 | 1 row × 3 columns | `360` |
177
+ | 4 | **2 rows × 2 columns** (preferred) | `480` |
178
+ | 5–6 | 2 rows × 3 columns | `360` |
179
+ | 7+ | 3 columns per row, wrap as needed | `320` |
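The layout table above can be turned into a small renderer. This Python sketch is illustrative only: `layout_for` and `render_gallery` are hypothetical names, and the caller is assumed to pass `src` values that already follow the path rule.

```python
from html import escape


def layout_for(count: int):
    """Return (columns per row, width per image) for a given image count."""
    if count < 1:
        raise ValueError("at least one image is required")
    if count == 1:
        return 1, 640
    if count in (2, 4):
        return 2, 480  # 4 images prefer a 2 x 2 grid
    if count in (3, 5, 6):
        return 3, 360
    return 3, 320  # 7+: three per row, wrapping as needed


def render_gallery(srcs):
    """Render image-only cells inside a single HTML <table> block."""
    columns, width = layout_for(len(srcs))
    lines = ["<table>"]
    for start in range(0, len(srcs), columns):
        lines.append("  <tr>")
        for src in srcs[start:start + columns]:
            lines.append(
                f'    <td align="center"><img src="{escape(src, quote=True)}" '
                f'width="{width}" /></td>'
            )
        lines.append("  </tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Calling `render_gallery` with four paths yields two rows of two cells at width `480`, matching the preferred 2 x 2 layout.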
216
180
 
217
- **For generation:** Pass the user's image description as-is to the `text` field.
218
- **For editing:** Pass editing instructions to the `text` field alongside the `inline_data` image part.
181
+ ### Single image
219
182
 
220
- ---
183
+ ```html
184
+ <table>
185
+ <tr>
186
+ <td align="center">
187
+ <img src="/abs/path/to/images/hero.png" width="640" />
188
+ </td>
189
+ </tr>
190
+ </table>
191
+ ```
221
192
 
222
- ## Output
193
+ On Windows, use a relative path with forward slashes instead:
223
194
 
224
- - Saves PNG to the `.comate/images` directory; if the directory does not exist, create it.
225
- - Script outputs the full path to the generated image
226
- - **Do not read the image back** - display the image using markdown syntax: `![image](/absolute/path/to/image.png)`
227
- - **Display the image path only once** using an absolute path (e.g. `/home/user/project/.comate/images/output-name.png`)
228
- - Absolute paths are REQUIRED (e.g. paths starting with `/` on Unix or `C:\` on Windows);
229
- - do NOT use relative paths (e.g. `./output.png` or `images/output.png` formats are forbidden);
230
- - do NOT repeat the path multiple times in the conversation
195
+ ```html
196
+ <table>
197
+ <tr>
198
+ <td align="center">
199
+ <img src="images/hero.png" width="640" />
200
+ </td>
201
+ </tr>
202
+ </table>
203
+ ```
231
204
 
232
- ---
205
+ ### Multiple images (2×2 example)
206
+
207
+ ```html
208
+ <table>
209
+ <tr>
210
+ <td align="center"><img src="/abs/path/to/images/scene1.png" width="480" /></td>
211
+ <td align="center"><img src="/abs/path/to/images/scene2.png" width="480" /></td>
212
+ </tr>
213
+ <tr>
214
+ <td align="center"><img src="/abs/path/to/images/scene3.png" width="480" /></td>
215
+ <td align="center"><img src="/abs/path/to/images/scene4.png" width="480" /></td>
216
+ </tr>
217
+ </table>
218
+ ```
233
219
 
234
- ## Notes
220
+ On Windows, use relative paths with forward slashes:
221
+
222
+ ```html
223
+ <table>
224
+ <tr>
225
+ <td align="center"><img src="images/scene1.png" width="480" /></td>
226
+ <td align="center"><img src="images/scene2.png" width="480" /></td>
227
+ </tr>
228
+ <tr>
229
+ <td align="center"><img src="images/scene3.png" width="480" /></td>
230
+ <td align="center"><img src="images/scene4.png" width="480" /></td>
231
+ </tr>
232
+ </table>
233
+ ```
234
+
235
+ ### Display rules
235
236
 
236
- | Rule | macOS/Linux | Windows |
237
- |-------------------|-------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
238
- | curl command | `curl` | `curl.exe` (avoid PowerShell `curl` alias conflict) |
239
- | `-d` body | inline `'{"key":"val"}'` is fine | **never** inline — PowerShell escapes `"` to `\u0022`, corrupts JSON; **always** write to temp file and use `-d "@$tmpFile"` |
240
- | base64 encode | `base64 -i file \| tr -d '\n'` | `[Convert]::ToBase64String([IO.File]::ReadAllBytes(...))` |
241
- | base64 decode | `base64 --decode` | `[IO.File]::WriteAllBytes(path, [Convert]::FromBase64String(...))` (PS 5.1 compatible) |
242
- | image size detect | `sips` (macOS) / `identify` (Linux) | `System.Drawing.Image` (built-in) |
243
- | line continuation | `\` | `` ` `` (backtick) |
244
- | env variable | `${COMATE_USERNAME_ENCRYPTED}` | `$env:COMATE_USERNAME_ENCRYPTED` |
237
+ - Show the `<table>` block **once**; do not repeat paths elsewhere in the reply.
238
+ - **Do not read the image back** with a file tool — just reference the printed path.
239
+ - Do not use markdown `![]()` for the deliverable (only allowed inside the table's `<img>`).
245
240
 
246
241
  ---
247
242
 
248
- ## Windows Pitfall: PowerShell escapes `"` when passing to external programs
243
+ ## Error Handling (Agent contract)
249
244
 
250
- When PowerShell passes a variable containing `"` to an external program like `curl.exe`,
251
- it silently converts `"` into `\u0022`, which corrupts JSON and causes the server to return:
245
+ On any failure the script writes one or more **tagged, single-line messages** to stderr:
252
246
 
253
247
  ```
254
- JsonParseException: Unexpected character ('u'): was expecting double-quote to start field name
248
+ [generate-image:<stage>] <message>
255
249
  ```
256
250
 
257
- **Wrong (never do this on Windows):**
251
+ Always capture stderr alongside stdout (`bash ... 2>&1` or `$output = & ... 2>&1`) so you can forward the tag + message to the user. The final path on stdout only appears on success.
258
252
 
259
- ```powershell
260
- $body = '{"contents":[...]}'
261
- curl.exe ... -d $body # PowerShell corrupts the " characters
262
- ```
253
+ ### Stages you may see
263
254
 
264
- **Correct (always write JSON to a temp file and pass with `-d @file`):**
255
+ | Stage | Meaning | Typical remedy |
256
+ |-----------|---------------------------------------------------|--------------------------------------------------------------------|
257
+ | `args` | bad / missing CLI flag | Fix the call — do NOT retry blindly |
258
+ | `deps` | `curl` / `base64` missing, or neither `python3` nor `jq` available | Ask user to install the missing tool (python3 preferred) |
259
+ | `env` | neither `COMATE_USERNAME_ENCRYPTED` nor `~/.comate/login` usable | Ask user to sign in to Comate, or export the env var |
260
+ | `input` | input image missing / unreadable / empty | Verify the path, fix the `--input` argument |
261
+ | `output` | cannot create/use `images/` dir | Usually a file named `images` exists; rename or remove |
262
+ | `payload` | JSON build / write failed | Likely disk / `jq` issue; retry once |
263
+ | `http` | HTTP non-2xx or network failure; body snippet attached | Retry with a smaller input/resolution or a shorter prompt |
264
+ | `api` | API returned `.error` or `no inlineData` — stderr includes the model's text explanation when present | Inspect the model message; often a policy rejection or bad prompt |
265
+ | `write` | decode/write-to-disk failed | Disk full / permissions |
266
+ | `tmp` | `mktemp` failed | `/tmp` full or read-only |
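Since failures arrive as `[generate-image:<stage>] <message>` lines, an agent can pull the stage and message out of the captured output with a regular expression. A minimal Python sketch; `parse_tagged_lines` is a hypothetical helper, and the sample message in the usage note is invented for illustration.

```python
import re

# Matches the documented tag format: [generate-image:<stage>] <message>
TAG_RE = re.compile(r"^\[generate-image:(?P<stage>[^\]]+)\]\s*(?P<message>.*)$")


def parse_tagged_lines(output: str):
    """Return (stage, message) pairs for every tagged line in combined output."""
    found = []
    for line in output.splitlines():
        match = TAG_RE.match(line.strip())
        if match:
            found.append((match.group("stage"), match.group("message")))
    return found
```

For instance, `parse_tagged_lines("[generate-image:input] file not found")` returns `[("input", "file not found")]`, and the stage can then be looked up in the table above to pick a remedy.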
265
267
 
266
- ```powershell
267
- # Single-quote here-string: content is taken literally, no escaping
268
- $tmpFile = "$env:TEMP\request.json"
269
- @'
270
- {"contents":[{"role": "USER", "parts":[{"text":"YOUR_PROMPT_HERE"}]}],"resolution":"1K"}
271
- '@ | Set-Content -Path $tmpFile -Encoding UTF8 -NoNewline
268
+ ### Exit codes
272
269
 
273
- curl.exe ... -d "@$tmpFile" # curl reads the file directly, no PowerShell escaping
274
- ```
270
+ `0` success · `2` args · `3` deps · `4` env · `5` input · `10` HTTP/network · `11` API error · `12` no image data · `13` local I/O
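The exit codes above can be mapped back to their meaning when the script is driven from another program. The Python sketch below is one possible wrapper, not the skill's own tooling: `run_and_classify` is a hypothetical name, and treating the last non-empty stdout line as the generated path is an assumption based on the statement that the script prints the absolute path on success.

```python
import subprocess

# Exit-code meanings as listed in the skill.
EXIT_MEANINGS = {
    0: "success", 2: "args", 3: "deps", 4: "env", 5: "input",
    10: "HTTP/network", 11: "API error", 12: "no image data", 13: "local I/O",
}


def run_and_classify(command):
    """Run a command (argument list), capture both streams, classify the exit code."""
    proc = subprocess.run(command, capture_output=True, text=True)
    path = None
    if proc.returncode == 0:
        lines = [line.strip() for line in proc.stdout.splitlines() if line.strip()]
        path = lines[-1] if lines else None  # assumed: path is the last stdout line
    return {
        "code": proc.returncode,
        "meaning": EXIT_MEANINGS.get(proc.returncode, "unknown"),
        "path": path,
        "stderr": proc.stderr.strip(),
    }
```

A caller would pass the same argument list shown in the usage examples (for example `["bash", "<SKILL_DIR>/scripts/generate-image.sh", "--prompt", "...", "--output", "hero-v1.png"]`) and forward `stderr` to the user whenever `code` is non-zero.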
275
271
 
276
- **Here-string type selection:**
272
+ ### Idempotency guarantees
277
273
 
278
- | Situation | Here-string type | Reason |
279
- |--------------------------------------------|------------------------|---------------------------------|
280
- | text-to-image (no variables in JSON) | `@'...'@` single-quote | content taken literally, safest |
281
- | image-to-image (need `$imgBase64` in JSON) | `@"..."@` double-quote | allows variable expansion |
274
+ - Existing `images/` directory is reused, not recreated.
275
+ - Same `--output` filename will **overwrite** the previous file — if you want to keep both, use a new versioned filename (`-v2`, `-v3`).
276
+ - Temp files in `$TMPDIR` / `%TEMP%` are always cleaned up, even on failure.
277
+ - Safe to run concurrently with different `--output` names.