@heart-of-gold/toolkit 0.1.45 → 0.1.46

package/package.json CHANGED
@@ -1,8 +1,8 @@
  {
  "name": "@heart-of-gold/toolkit",
- "version": "0.1.45",
+ "version": "0.1.46",
  "type": "module",
- "description": "Cross-platform installer for Heart of Gold skills works with Codex, OpenCode, Pi, Claude Code, and more",
+ "description": "Cross-platform installer for Heart of Gold skills \u2014 works with Codex, OpenCode, Pi, Claude Code, and more",
  "bin": {
  "heart-of-gold": "src/index.ts"
  },
@@ -2,7 +2,7 @@
  name: image
  description: >
  AI image generation and editing. Text-to-image, style transfer, and logo generation.
- Currently powered by Gemini and FLUX via OpenRouter.
+ Powered by GPT Image 2 (via Codex CLI, no API key), Gemini, and FLUX.
  Triggers: generate image, create image, make image, draw, illustrate, logo, visual, picture.
  allowed-tools:
  - Read
@@ -26,10 +26,23 @@ Translating your thoughts into pictures. The Babel Fish handles all languages, i
 
  | Backend | When to Use | Key |
  |---------|-------------|-----|
- | Gemini (`gemini-3-pro-image-preview`) | Primary: best quality, text in images, style transfer | `GEMINI_API_KEY` |
- | FLUX via OpenRouter (`black-forest-labs/flux.2-pro`) | Fallback: artistic, creative work | `~/.claude/secrets/openrouter.json` |
+ | GPT Image 2 via Codex CLI (`gpt-image-2`) | Default: strong prompt adherence, text in images, no API key | ChatGPT OAuth (`codex login`) |
+ | Gemini (`gemini-3-pro-image-preview`) | Style transfer with reference images, multi-turn iteration, 4K | `GEMINI_API_KEY` |
+ | FLUX via OpenRouter (`black-forest-labs/flux.2-pro`) | Artistic, stylised work | `~/.claude/secrets/openrouter.json` |
 
- Check `$GEMINI_API_KEY` first; if unset, fall back to `~/.claude/secrets/openrouter.json`.
+ ## Backend selection (ask first)
+
+ Before generating, ask via `AskUserQuestion` which backend to use — list the three above as options with GPT Image 2 first (default). Skip the question only when:
+
+ - The user already named a backend in their prompt ("use Gemini", "with FLUX", "via codex")
+ - The request **forces** a specific one:
+   - Reference images for style matching → Gemini (the codex path does not accept reference images)
+   - True native transparent background → Gemini, or codex with explicit `gpt-image-1.5`
+   - 4K+ output above 3840px per edge → Gemini (codex caps at 3840)
+
+ Announce the chosen backend in one line before running so the user has clarity.
+
+ Requires: `codex-cli >= 0.124.0-alpha.2` for GPT Image 2. Install: `npm install -g @openai/codex@0.124.0-alpha.2`.
 
  ## Phase 0 — Understand
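The "forces a specific one" rules added in the hunk above amount to a small decision table. As a rough illustration only (the function name `forced_backend` and its three arguments are hypothetical and do not appear in the package), they might be sketched in bash like this. For the transparency case the skill text also allows codex with an explicit `gpt-image-1.5`; this sketch simply picks Gemini.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of @heart-of-gold/toolkit.
# Args: has_reference (0|1), needs_transparency (0|1), max_edge (pixels).
# Prints the backend the request forces, or "ask" when nothing forces one.
forced_backend() {
  local has_reference="$1" needs_transparency="$2" max_edge="$3"
  if [[ "$has_reference" == "1" ]]; then
    echo "gemini"   # the codex path does not accept reference images
  elif [[ "$needs_transparency" == "1" ]]; then
    echo "gemini"   # alternative per the skill text: codex with explicit gpt-image-1.5
  elif (( max_edge > 3840 )); then
    echo "gemini"   # codex caps at 3840px per edge
  else
    echo "ask"      # no forcing rule applies: ask the user, GPT Image 2 listed first
  fi
}

forced_backend 0 0 1024   # prints: ask
forced_backend 1 0 1024   # prints: gemini
forced_backend 0 0 4096   # prints: gemini
```

A backend named directly in the user's prompt would short-circuit this check before it runs; that branch is left out of the sketch.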
 
@@ -43,7 +56,31 @@ Gather before generating. Ask if unclear:
 
  ## Phase 1 — Generate
 
- ### Gemini (primary)
+ ### GPT Image 2 via Codex (default)
+
+ Runs through codex's built-in `image_gen` tool — uses your ChatGPT OAuth, no API key, no per-image billing (counts against your ChatGPT plan limits).
+
+ ```bash
+ plugins/babel-fish/skills/image/scripts/generate_image_codex.sh \
+   --output ./output.png \
+   --size 1024x1024 \
+   --quality high \
+   "your prompt here"
+ ```
+
+ Constraints:
+ - Sizes: `auto` or `WxH` — max edge ≤ 3840px, edges multiples of 16, 655k–8.3M total pixels.
+ - Quality: `low` | `medium` | `high` | `auto`.
+ - Does **not** support `background=transparent` — use Gemini or ask codex for `gpt-image-1.5` explicitly if true transparency is required.
+
+ Equivalent raw invocation (skip the wrapper for multi-variant riffs):
+
+ ```bash
+ codex exec --skip-git-repo-check --sandbox workspace-write --full-auto \
+   "Use your built-in image_gen tool to generate: <PROMPT>. After it returns, copy the file to <OUT_PATH> and print the path."
+ ```
+
+ ### Gemini
 
  ```python
  import os
@@ -74,7 +111,7 @@ for part in response.parts:
 
  **Critical:** Gemini returns JPEG by default. Use `.jpg` extension. If PNG is needed, pass `format="PNG"` explicitly to `img.save()`.
 
- ### FLUX via OpenRouter (fallback)
+ ### FLUX via OpenRouter
 
  ```bash
  curl https://openrouter.ai/api/v1/images/generations \
@@ -0,0 +1,77 @@
+ #!/usr/bin/env bash
+ # Generate an image via Codex CLI's built-in image_gen tool (gpt-image-2).
+ # Uses ChatGPT OAuth — no OPENAI_API_KEY required.
+ # Requires: codex-cli >= 0.124.0-alpha.2
+
+ set -euo pipefail
+
+ PROMPT=""
+ OUTPUT="./generated_image.png"
+ SIZE=""
+ QUALITY=""
+
+ usage() {
+   cat <<'EOF'
+ Usage: generate_image_codex.sh [options] "prompt"
+
+ Options:
+ -o, --output PATH Output file (default: ./generated_image.png)
+ --size SIZE e.g. 1024x1024, 1536x1024, 3840x2160, or auto
+ --quality LEVEL low | medium | high | auto
+ -h, --help Show this help
+
+ Examples:
+ generate_image_codex.sh "a red apple on a white table"
+ generate_image_codex.sh --output logo.png --quality high "a minimalist fox logo"
+ EOF
+ }
+
+ while [[ $# -gt 0 ]]; do
+   case "$1" in
+     -o|--output) OUTPUT="$2"; shift 2 ;;
+     --size) SIZE="$2"; shift 2 ;;
+     --quality) QUALITY="$2"; shift 2 ;;
+     -h|--help) usage; exit 0 ;;
+     --) shift; PROMPT="${1:-}"; shift || true ;;
+     -*) echo "Unknown flag: $1" >&2; usage >&2; exit 2 ;;
+     *) PROMPT="$1"; shift ;;
+   esac
+ done
+
+ if [[ -z "$PROMPT" ]]; then
+   echo "Error: prompt is required" >&2
+   usage >&2
+   exit 2
+ fi
+
+ if ! command -v codex >/dev/null 2>&1; then
+   echo "Error: codex CLI not found on PATH." >&2
+   echo "Install: npm install -g @openai/codex@0.124.0-alpha.2" >&2
+   exit 1
+ fi
+
+ OUT_DIR="$(cd "$(dirname "$OUTPUT")" 2>/dev/null && pwd || echo "$(pwd)")"
+ OUT_ABS="$OUT_DIR/$(basename "$OUTPUT")"
+ mkdir -p "$OUT_DIR"
+
+ INSTR="Use your built-in image_gen tool to generate a single image: $PROMPT"
+ [[ -n "$SIZE" ]] && INSTR+=$'\nSize: '"$SIZE"
+ [[ -n "$QUALITY" ]] && INSTR+=$'\nQuality: '"$QUALITY"
+ INSTR+=$'\nAfter image_gen returns, copy the generated file to '"$OUT_ABS"$' and print that absolute path as the final line. Do not use the fallback CLI scripts/image_gen.py.'
+
+ echo "Generating via codex built-in image_gen (gpt-image-2)..." >&2
+ echo "Prompt: $PROMPT" >&2
+
+ codex exec \
+   --skip-git-repo-check \
+   --sandbox workspace-write \
+   --full-auto \
+   -C "$OUT_DIR" \
+   "$INSTR" 2>/dev/null | tail -3
+
+ if [[ -f "$OUT_ABS" ]]; then
+   echo "Image saved to: $OUT_ABS"
+ else
+   echo "Error: expected image at $OUT_ABS but none was written." >&2
+   exit 1
+ fi
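One detail of the added script may be worth a second look. `OUT_DIR` is resolved by `cd`-ing into the parent directory of `--output` before `mkdir -p` runs. When that directory does not exist yet, the `cd` fails and the `|| echo "$(pwd)"` branch substitutes the current directory, so the image would be written there instead. A possible alternative ordering, offered purely as a sketch (the function name `resolve_output_path` is hypothetical and not part of the package), creates the directory first and only then resolves it:

```bash
#!/usr/bin/env bash
# Hypothetical alternative to the OUT_DIR/OUT_ABS lines; not the published code.
# Creates the parent directory first, then resolves it to an absolute path.
resolve_output_path() {
  local output="$1" dir
  dir="$(dirname "$output")"
  mkdir -p "$dir" || return 1             # create the target directory up front
  dir="$(cd "$dir" && pwd)" || return 1   # now cd cannot fail for a missing directory
  # Strip a trailing slash so a parent of "/" does not yield "//file".
  printf '%s/%s\n' "${dir%/}" "$(basename "$output")"
}

# Demo against a throwaway directory; prints an absolute path ending in renders/logo.png.
demo_root="$(mktemp -d)"
resolve_output_path "$demo_root/renders/logo.png"
```

With this ordering a call such as `--output ./renders/logo.png` would land in `./renders/` even on a first run, at the cost of creating the directory before codex has produced anything.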
@@ -75,3 +75,18 @@ Codex is powered by OpenAI models with their own knowledge cutoffs and limitatio
  - Stop and report failures whenever `codex --version` or a `codex exec` command exits non-zero; request direction before retrying.
  - Before using high-impact flags (`--full-auto`, `--sandbox danger-full-access`, `--skip-git-repo-check`) ask the user for permission using `AskUserQuestion` unless already given.
  - When output includes warnings or partial results, summarize them and ask how to adjust using `AskUserQuestion`.
+
+ ## Image Generation (codex ≥ 0.124.0-alpha.2)
+
+ Codex ships a built-in `image_gen` tool that uses OpenAI's `gpt-image-2` under the user's ChatGPT OAuth — **no `OPENAI_API_KEY` required**. To invoke it:
+
+ ```bash
+ codex exec --skip-git-repo-check --sandbox workspace-write --full-auto \
+   "Use your built-in image_gen tool to generate: <PROMPT>. After it returns, copy the file to <OUT_PATH> and print the path."
+ ```
+
+ - Supported models (server-picked): `gpt-image-2` (default, newest), `gpt-image-1.5` (only one supporting `background=transparent`), `gpt-image-1`, `gpt-image-1-mini`.
+ - Quality: `low` | `medium` | `high` | `auto`. Sizes: `auto` or `WxH` with max edge ≤ 3840px, edges multiples of 16, 655k–8.3M total pixels.
+ - Output lands in `$CODEX_HOME/generated_images/<session>/<hash>.png` — ask the agent to copy to the user's target.
+ - Convenience wrapper (handles prompt assembly + output copy) is at `~/.claude/skills/image-gen/scripts/generate_image_codex.sh`; the `image-gen` skill documents the full UX. Prefer that wrapper over hand-rolling the codex invocation.
+ - Do **not** use codex's bundled fallback script `scripts/image_gen.py` — it requires an `OPENAI_API_KEY`, which is precisely what the built-in path avoids.
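The size rules stated in the bullets above (max edge 3840px, edges in multiples of 16, a bounded total pixel count) lend themselves to a local pre-flight check before any codex call is made. The sketch below is an assumption-laden illustration, not code from the package: `validate_size` is a hypothetical name, and the exact pixel bounds are guesses at the values behind the rounded "655k" and "8.3M" figures.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the published package.
# The docs only give rounded bounds, so both constants are assumptions.
MIN_PIXELS=655360     # assumed exact value behind "655k"
MAX_PIXELS=8294400    # assumed exact value behind "8.3M" (3840 * 2160)

# Returns 0 when SIZE is "auto" or a WxH that satisfies the documented rules.
validate_size() {
  local size="$1" w h pixels
  [[ "$size" == "auto" ]] && return 0
  # Up to five digits per edge, no leading zero (avoids octal parsing and overflow).
  [[ "$size" =~ ^([1-9][0-9]{0,4})x([1-9][0-9]{0,4})$ ]] || return 1
  w="${BASH_REMATCH[1]}" h="${BASH_REMATCH[2]}"
  (( w <= 3840 && h <= 3840 )) || return 1        # max edge
  (( w % 16 == 0 && h % 16 == 0 )) || return 1    # edges are multiples of 16
  pixels=$(( w * h ))
  (( pixels >= MIN_PIXELS && pixels <= MAX_PIXELS )) || return 1
}

for s in auto 1024x1024 3840x2160 1000x1000 4096x2160 512x512; do
  if validate_size "$s"; then echo "$s: ok"; else echo "$s: rejected"; fi
done
```

Under these assumed bounds the first three sizes pass and the last three are rejected (not a multiple of 16, edge above 3840, and too few pixels, respectively). If the real server-side limits differ, only the two constants would need adjusting.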