@leejungkiin/awkit 1.6.5 → 1.6.8

Files changed (35)
  1. package/README.md +121 -130
  2. package/bin/awk.js +111 -8
  3. package/package.json +5 -3
  4. package/schemas/onboarding-screen.schema.json +108 -0
  5. package/scripts/__pycache__/openrouter_image_gen.cpython-311.pyc +0 -0
  6. package/scripts/cockpit-quota.js +93 -0
  7. package/scripts/openrouter_image_gen.py +772 -0
  8. package/scripts/video-analyzer.js +172 -0
  9. package/skills/CATALOG.md +2 -1
  10. package/skills/ai-sprite-maker/SKILL.md +27 -6
  11. package/skills/ai-sprite-maker/scripts/__pycache__/remove_chroma_key.cpython-311.pyc +0 -0
  12. package/skills/ai-sprite-maker/scripts/remove_chroma_key.py +440 -0
  13. package/skills/awf-caveman/SKILL.md +65 -0
  14. package/skills/expo-build-optimizer/SKILL.md +33 -0
  15. package/skills/ios-app-store-audit/SKILL.md +48 -0
  16. package/skills/ios-expert-coder/SKILL.md +45 -0
  17. package/skills/mascot-designer/SKILL.md +66 -0
  18. package/skills/mascot-designer/examples/witny-case-study.md +35 -0
  19. package/skills/orchestrator/SKILL.md +20 -0
  20. package/skills/short-maker/scripts/google-flow-cli/README.md +227 -115
  21. package/skills/short-maker/scripts/google-flow-cli/gflow/api/client.py +32 -3
  22. package/skills/short-maker/scripts/google-flow-cli/gflow/api/models.py +4 -2
  23. package/skills/short-maker/scripts/google-flow-cli/gflow/cli/main.py +33 -6
  24. package/skills/short-maker/scripts/google-flow-cli/pyproject.toml +1 -1
  25. package/skills/verification-gate/SKILL.md +4 -0
  26. package/templates/help.html +21 -0
  27. package/templates/project-identity/android.json +24 -0
  28. package/templates/project-identity/backend-nestjs.json +24 -0
  29. package/templates/project-identity/expo.json +24 -0
  30. package/templates/project-identity/ios.json +24 -0
  31. package/templates/project-identity/web-nextjs.json +24 -0
  32. package/templates/specs/design-template.md +71 -161
  33. package/templates/specs/requirements-template.md +133 -65
  34. package/workflows/ui/create-spec-architect.md +80 -50
  35. package/workflows/ui/image-gen.md +118 -0
@@ -1,168 +1,280 @@
1
- # gflow CLI for Google Flow
1
+ # Google Flow CLI (gflow)
2
2
 
3
- A command-line interface to [Google Flow](https://flow.google) (AI image & video generation), built using the same reverse-engineering approach as [tmc/nlm](https://github.com/tmc/nlm) for NotebookLM.
3
+ > Reverse-engineered CLI for Google's [Flow](https://labs.google/fx/tools/flow) **Imagen 4** image generation and **Veo 3.1** video generation, powered by the same APIs as the web UI.
4
4
 
5
- Lets AI agents and scripts generate images/videos via Google Flow without the GUI.
5
+ ## 🚀 Quick Start
6
6
 
7
- ## Architecture
7
+ ```bash
8
+ # 1. Activate virtual environment
9
+ cd /path/to/google-flow-cli
10
+ source .venv/bin/activate
11
+
12
+ # 2. Authenticate (opens Chrome for Google login)
13
+ gflow auth
14
+
15
+ # 3. Generate an image
16
+ gflow generate-image "a cute owl doctor mascot, chibi style" --json
8
17
 
18
+ # 4. Generate a video
19
+ gflow generate-video "a cat walking through a garden" --wait
9
20
  ```
10
- ┌────────────────────────────────────┐
11
- │ CLI Layer (click) │ gflow/cli/main.py
12
- │ generate-image, generate-video, │
13
- │ list, download, collections, raw │
14
- ├────────────────────────────────────┤
15
- │ API Client (FlowClient) │ gflow/api/client.py
16
- │ High-level ops, polling, parsing │
17
- ├────────────────────────────────────┤
18
- │ BatchExecute Protocol │ gflow/batchexecute/client.py
19
- │ RPC encoding, SAPISIDHASH, │
20
- │ chunked response decoding │
21
- ├────────────────────────────────────┤
22
- │ Browser Auth │ gflow/auth/browser_auth.py
23
- │ Cookie extraction from Chrome, │
24
- │ Selenium interactive login │
25
- └────────────────────────────────────┘
21
+
22
+ ## 📦 Installation
23
+
24
+ ```bash
25
+ # Clone and setup
26
+ pip install -e .
27
+
28
+ # Or just use the venv binary directly
29
+ .venv/bin/gflow --help
26
30
  ```
27
31
 
28
- This is the same layered architecture as `tmc/nlm`:
29
- - **Auth** extracts Google cookies from your browser (browser_cookie3 or Selenium)
30
- - **BatchExecute** encodes RPCs into Google's wire format and decodes responses
31
- - **API Client** wraps BatchExecute with typed methods for each Flow feature
32
- - **CLI** exposes everything as clean subcommands
32
+ ## 🔐 Authentication
33
33
 
34
- ## Install
34
+ gflow uses Chrome browser cookies for authentication. A persistent Chrome instance runs in the background for reCAPTCHA solving.
35
35
 
36
36
  ```bash
37
- pip install -e .
37
+ # First-time auth (opens Chrome, login with your Google account)
38
+ gflow auth
39
+
40
+ # Clear and re-authenticate (when getting 401 errors)
41
+ gflow auth --clear
42
+ gflow auth
43
+
44
+ # Close background Chrome when done
45
+ gflow close
38
46
  ```
39
47
 
40
- Or with optional Selenium support for interactive login:
48
+ **Requirements:**
49
+ - Google Chrome installed
50
+ - A Google account with access to [labs.google/fx](https://labs.google/fx)
51
+ - Chrome must stay running in background (for reCAPTCHA tokens)
52
+
53
+ ## 🎨 Image Generation
54
+
55
+ ### Basic Usage
56
+
41
57
  ```bash
42
- pip install -e ".[dev]"
43
- pip install selenium
58
+ # Simple generation
59
+ gflow generate-image "a sunset over mountains"
60
+
61
+ # With options
62
+ gflow generate-image "cute robot mascot" \
63
+ --aspect-ratio 1:1 \
64
+ --num 4 \
65
+ -m imagen4 \
66
+ --json
44
67
  ```
45
68
 
46
- ## Quick Start
69
+ ### Options
70
+
71
+ | Flag | Values | Default | Description |
72
+ |------|--------|---------|-------------|
73
+ | `--aspect-ratio` | `16:9`, `4:3`, `1:1`, `3:4`, `9:16`, `square`, `portrait`, `landscape` | `landscape` | Image aspect ratio |
74
+ | `--num` | `1-4` | `1` | Number of images to generate |
75
+ | `-m`, `--model` | `imagen4`, `nano-banana-2` | `imagen4` | Image generation model |
76
+ | `-i`, `--image` | path | — | Reference image for image-to-image |
77
+ | `-o`, `--output` | path | auto | Output file path |
78
+ | `--seed` | integer | random | Seed for reproducibility |
79
+ | `--project-id` | string | — | Reuse existing project (skip creation) |
80
+ | `--json` | flag | — | Output as JSON |
81
+
82
+ ### Models
83
+
84
+ | Model | Internal Name | Description |
85
+ |-------|---------------|-------------|
86
+ | `imagen4` | `NARWHAL` | Imagen 4 — highest quality, default |
87
+ | `nano-banana-2` | `NANOBANANA2` | Nano Banana 2 — faster, lower cost |
88
+
89
+ ### Image-to-Image (Reference)
90
+
91
+ Use `-i` to provide a reference image. The model will generate variations based on your image + prompt:
47
92
 
48
93
  ```bash
49
- # 1. Authenticate (extracts cookies from your Chrome browser)
50
- gflow auth
94
+ # Generate a variation of an existing character
95
+ gflow generate-image "same character but smiling" -i ./base-mascot.png
51
96
 
52
- # 2. Generate an image
53
- gflow generate-image "a cat astronaut floating in space"
97
+ # Style transfer
98
+ gflow generate-image "pixel art style" -i ./photo.jpg --aspect-ratio 1:1
99
+ ```
100
+
101
+ ### Project Reuse
54
102
 
55
- # 3. Generate a video
56
- gflow generate-video "a timelapse of a flower blooming"
103
+ By default, each CLI invocation creates a new Flow project. To avoid this overhead, pass `--project-id`:
57
104
 
58
- # 4. List your assets
59
- gflow list
105
+ ```bash
106
+ # First call — note the projectId in JSON output
107
+ gflow generate-image "owl mascot" --json
108
+ # Output: { "images": [...], "projectId": "abc123" }
60
109
 
61
- # 5. Download an asset
62
- gflow download <asset-id> -o output.png
110
+ # Subsequent calls reuse the project
111
+ gflow generate-image "same owl but dancing" --project-id abc123 -i owl.png --json
63
112
  ```
64
113
 
65
- ## Setup: Discovering RPC IDs
114
+ This is especially useful in automation (like Mascot Studio) where many images are generated for the same character.
66
115
 
67
- Google Flow uses the same BatchExecute protocol as NotebookLM, but with different RPC endpoint IDs. You need to discover these by inspecting network traffic:
116
+ ## 🎬 Video Generation
68
117
 
69
- 1. Open [flow.google](https://flow.google) in Chrome
70
- 2. Open DevTools → Network tab
71
- 3. Filter requests by `batchexecute`
72
- 4. Perform an action (e.g., generate an image)
73
- 5. In the request payload, find the `rpcids` parameter — that's the RPC ID
74
- 6. Update `gflow/api/rpc_ids.py` with the real ID
118
+ ### Text-to-Video
75
119
 
76
- Check which IDs are configured:
77
120
  ```bash
78
- gflow rpc-ids
121
+ # Generate and wait for completion
122
+ gflow generate-video "a cat walking through a sunlit garden" --wait
123
+
124
+ # With options
125
+ gflow generate-video "a spaceship launching" \
126
+ --aspect-ratio 16:9 \
127
+ --seed 42 \
128
+ -o spaceship.mp4 \
129
+ --json
79
130
  ```
80
131
 
81
- Use raw mode to test discovered IDs:
132
+ ### Image-to-Video
133
+
82
134
  ```bash
83
- gflow raw "xYz123" --args '["my prompt", "16:9"]'
135
+ # Use an image as the starting frame
136
+ gflow generate-video "the character starts walking" -i character.png --wait
84
137
  ```
85
138
 
86
- ## Commands
139
+ ### Video Extension
87
140
 
88
- | Command | Description |
89
- |---------|-------------|
90
- | `gflow auth` | Authenticate with Google Flow |
91
- | `gflow auth --status` | Check auth status |
92
- | `gflow auth --clear` | Clear saved credentials |
93
- | `gflow generate-image PROMPT` | Generate images (Imagen 4) |
94
- | `gflow generate-video PROMPT` | Generate videos (Veo 3.1) |
95
- | `gflow list` | List assets in your library |
96
- | `gflow get ASSET_ID` | Get asset details |
97
- | `gflow download ASSET_ID` | Download an asset |
98
- | `gflow delete ASSET_ID` | Delete an asset |
99
- | `gflow collections list` | List collections |
100
- | `gflow collections create NAME` | Create a collection |
101
- | `gflow collections add COL_ID ASSET_ID` | Add asset to collection |
102
- | `gflow raw RPC_ID` | Execute raw RPC (discovery mode) |
103
- | `gflow rpc-ids` | Show configured RPC IDs |
141
+ ```bash
142
+ # Extend an existing video
143
+ gflow extend-video "the character turns around and waves" \
144
+ --media-id <media_id_from_previous_generation> \
145
+ --wait
146
+ ```
104
147
 
105
- All commands support `--json` for machine-readable output (ideal for scripts/agents).
148
+ ### Video Options
106
149
 
107
- ## Environment Variables
150
+ | Flag | Values | Default | Description |
151
+ |------|--------|---------|-------------|
152
+ | `--aspect-ratio` | `16:9`, `9:16`, `1:1` | `landscape` | Video aspect ratio |
153
+ | `--seed` | integer | random | Seed for reproducibility |
154
+ | `--wait/--no-wait` | flag | `--wait` | Wait for rendering to complete |
155
+ | `--timeout` | seconds | `300` | Max wait time |
156
+ | `-i`, `--image` | path | — | Starting frame image |
157
+ | `-o`, `--output` | path | auto | Output file path |
158
+ | `--json` | flag | — | Output as JSON |
108
159
 
109
- | Variable | Description |
110
- |----------|-------------|
111
- | `GFLOW_AUTH_TOKEN` | Auth token (overrides saved credentials) |
112
- | `GFLOW_COOKIES` | Cookie string (overrides saved credentials) |
113
- | `GFLOW_CHROME_PATH` | Path to Chrome executable |
114
- | `GFLOW_DEBUG` | Set to `true` for debug output |
160
+ ## 🔧 Utility Commands
115
161
 
116
- ## For AI Agents / Scripts
162
+ ### Network Sniffer
117
163
 
118
- Every command supports `--json` output:
164
+ Discover what APIs Flow uses by capturing network traffic:
119
165
 
120
166
  ```bash
121
- # Generate and get JSON response
122
- gflow generate-image "a logo" --json | jq '.[0].url'
123
-
124
- # List assets as JSON for processing
125
- gflow list --type image --json | jq '.[].id'
167
+ # Capture for 2 minutes (default)
168
+ gflow sniff
126
169
 
127
- # Pipeline: generate, wait, download
128
- ASSET_ID=$(gflow generate-video "ocean waves" --json | jq -r '.[0].id')
129
- gflow download "$ASSET_ID" -o waves.mp4
170
+ # Custom duration and output
171
+ gflow sniff --duration 300 -o my-capture.json
130
172
  ```
131
173
 
132
- ## How It Works (Same as tmc/nlm)
174
+ ### Debug Mode
133
175
 
134
- 1. **Browser Auth**: Extracts Google cookies from Chrome/Brave/Edge profiles using `browser_cookie3`, then fetches the XSRF token from the Flow page HTML
135
- 2. **BatchExecute Protocol**: Encodes RPC calls into Google's `batchexecute` wire format — form-encoded POST with nested JSON arrays, SAPISIDHASH authorization header
136
- 3. **Response Decoding**: Parses Google's chunked response format (byte-count prefixed JSON chunks with `wrb.fr` markers and multi-layer JSON encoding)
137
- 4. **Retry Logic**: Exponential backoff for transient errors (429, 500, 502, 503, 504)
176
+ Add `--debug` before any command for verbose logging:
138
177
 
139
- ## Project Structure
178
+ ```bash
179
+ gflow --debug generate-image "test prompt"
180
+ ```
181
+
182
+ ## 🏗️ Architecture
140
183
 
141
184
  ```
142
- gflow-py/
143
- ├── pyproject.toml # Package config & dependencies
144
- ├── README.md
185
+ google-flow-cli/
145
186
  ├── gflow/
146
- │ ├── __init__.py
147
- │ ├── auth/
148
- │ │ ├── __init__.py
149
- │ │ └── browser_auth.py # Cookie extraction & Selenium login
150
- │ ├── batchexecute/
151
- │ │ ├── __init__.py
152
- │ │ └── client.py # Google BatchExecute protocol
153
187
  │ ├── api/
154
- │ │ ├── __init__.py
155
- │ │ ├── client.py # FlowClient (high-level API)
156
- ├── models.py # Pydantic models (Asset, Collection, etc.)
157
- │ │ └── rpc_ids.py # RPC endpoint IDs (fill these in!)
188
+ │ │ ├── client.py # FlowClient — core API interactions
189
+ │ │ └── models.py # Pydantic data models
190
+ │ ├── auth/
191
+ │ │ ├── browser_auth.py # Chrome cookie/token management
192
+ │ │ └── recaptcha.py # reCAPTCHA Enterprise solver
158
193
  │ └── cli/
159
- ├── __init__.py
160
- │ └── main.py # Click CLI commands
161
- └── tests/
162
- ├── __init__.py
163
- └── test_batchexecute.py # Unit tests
194
+ └── main.py # Click CLI commands
195
+ ├── pyproject.toml # Package config
196
+ └── README.md # This file
197
+ ```
198
+
199
+ ### API Endpoints (Reverse-Engineered)
200
+
201
+ | Endpoint | Purpose |
202
+ |----------|---------|
203
+ | `labs.google/fx/api/trpc/project.createProject` | Create a Flow project |
204
+ | `aisandbox-pa.googleapis.com/v1/projects/{pid}/flowMedia:batchGenerateImages` | Image generation |
205
+ | `aisandbox-pa.googleapis.com/v1/video:batchAsyncGenerateVideoText` | Video generation |
206
+ | `aisandbox-pa.googleapis.com/v1/video:batchCheckAsyncVideoGenerationStatus` | Video polling |
207
+ | `labs.google/fx/api/auth/session` | Session/auth management |
208
+
209
+ ### Authentication Flow
210
+
211
+ 1. Chrome opens `labs.google/fx` with remote debugging
212
+ 2. User logs in with Google account
213
+ 3. Cookies are captured from Chrome via CDP
214
+ 4. Access token is obtained from `/auth/session`
215
+ 5. reCAPTCHA tokens are solved via Chrome's embedded reCAPTCHA
216
+ 6. API calls use Bearer token (sandbox) + Cookies (labs)
217
+
218
+ ## ⚠️ Troubleshooting
219
+
220
+ ### `401 Unauthorized`
221
+ ```bash
222
+ gflow auth --clear
223
+ gflow auth
224
+ ```
225
+ Re-authenticate. This happens when cookies/tokens expire.
226
+
227
+ ### `No CDP WebSocket available`
228
+ Chrome isn't running or crashed. Start it again:
229
+ ```bash
230
+ gflow auth
231
+ ```
232
+
233
+ ### `reCAPTCHA evaluation failed`
234
+ - Ensure Chrome is on the Flow page (`labs.google/fx`)
235
+ - Wait 30-60 seconds and retry
236
+ - If persistent: `gflow auth --clear && gflow auth`, then manually interact with Flow page
237
+
238
+ ### `LibreSSL warning`
239
+ Harmless warning on macOS. The CLI works fine with LibreSSL.
240
+
241
+ ## 📝 Integration Examples
242
+
243
+ ### Mascot Studio (Vite Proxy)
244
+
245
+ The Mascot Studio web app uses gflow via a Vite dev server proxy:
246
+
247
+ ```typescript
248
+ // POST /api/gflow/generate
249
+ {
250
+ "prompt": "cute owl mascot",
251
+ "model": "imagen4", // optional
252
+ "aspectRatio": "1:1", // optional
253
+ "num": 4, // optional (1-4)
254
+ "baseImageBase64": "data:..." // optional (reference image)
255
+ }
256
+ ```
257
+
258
+ ### Shell Script Batch Generation
259
+
260
+ ```bash
261
+ #!/bin/bash
262
+ PROMPT="cute robot mascot"
263
+ PROJECT_ID=""
264
+
265
+ # Generate base
266
+ RESULT=$(gflow generate-image "$PROMPT" --num 4 --json)
267
+ PROJECT_ID=$(echo "$RESULT" | jq -r '.projectId')
268
+
269
+ # Generate states using same project
270
+ for state in "happy" "sad" "excited" "sleeping"; do
271
+ gflow generate-image "$PROMPT in $state state" \
272
+ --project-id "$PROJECT_ID" \
273
+ -i base.png \
274
+ -o "${state}.png"
275
+ done
164
276
  ```
165
277
 
166
- ## License
278
+ ## 📄 License
167
279
 
168
- MIT
280
+ Internal tool — not for public distribution. Uses Google's internal APIs which may change without notice.
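Review note: the project-reuse loop from the README's "Shell Script Batch Generation" example could also be driven from Python. The following is a minimal sketch, not part of the package. The helper name `build_generate_image_argv` is hypothetical; the flags are the ones listed in the README's options table, and `abc123` is the placeholder project ID used in the README. The sketch only builds argument lists; actually running them (for example with `subprocess.run`) assumes `gflow` is installed and authenticated.

```python
from typing import List, Optional


def build_generate_image_argv(
    prompt: str,
    project_id: Optional[str] = None,
    image: Optional[str] = None,
    output: Optional[str] = None,
    num: int = 1,
    as_json: bool = True,
) -> List[str]:
    """Build an argv list for `gflow generate-image` (hypothetical helper)."""
    if not 1 <= num <= 4:  # the CLI in this diff restricts --num to 1-4
        raise ValueError("num must be between 1 and 4")
    argv = ["gflow", "generate-image", prompt, "--num", str(num)]
    if project_id:
        argv += ["--project-id", project_id]  # skip project creation
    if image:
        argv += ["-i", image]  # reference image (image-to-image)
    if output:
        argv += ["-o", output]
    if as_json:
        argv.append("--json")
    return argv


# Reuse one project for several mascot states, mirroring the shell example.
states = ["happy", "sad", "excited", "sleeping"]
commands = [
    build_generate_image_argv(
        f"cute robot mascot in {state} state",
        project_id="abc123",  # placeholder; read it from the first call's JSON
        image="base.png",
        output=f"{state}.png",
        as_json=False,
    )
    for state in states
]
```

Passing each list to `subprocess.run(argv, check=True)` avoids shell quoting issues, because the prompt travels as a single argument.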
@@ -49,10 +49,22 @@ SANDBOX_BASE = "https://aisandbox-pa.googleapis.com"
49
49
  LABS_BASE = "https://labs.google/fx/api"
50
50
 
51
51
  # Internal model names (discovered from sniff)
52
- IMAGE_MODEL = "NARWHAL" # Imagen 4 internal name
52
+ # Flow UI shows friendly names; API uses internal codenames.
53
+ # NARWHAL = Imagen 4 (default, highest quality)
54
+ # NANOBANANA2 = Nano Banana 2 (faster, lower cost)
55
+ IMAGE_MODEL = "NARWHAL" # Imagen 4 internal name (default)
53
56
  VIDEO_MODEL = "veo_3_1_t2v_fast_ultra" # Veo 3.1 fast/ultra (text-to-video)
54
57
  TOOL_NAME = "PINHOLE" # Flow's internal tool name
55
58
 
59
+ # Image model mapping: friendly name -> API internal name
60
+ IMAGE_MODEL_MAP = {
61
+ "imagen4": "NARWHAL",
62
+ "narwhal": "NARWHAL",
63
+ "nano-banana-2": "NANOBANANA2",
64
+ "nanobanana2": "NANOBANANA2",
65
+ "banana": "NANOBANANA2",
66
+ }
67
+
56
68
  # I2V model mapping by aspect ratio (discovered from network sniff)
57
69
  I2V_MODEL_MAP = {
58
70
  "landscape": "veo_3_1_i2v_s_fast_ultra",
@@ -69,6 +81,7 @@ IMAGE_ASPECT_MAP = {
69
81
  "1:1": "IMAGE_ASPECT_RATIO_SQUARE",
70
82
  "portrait": "IMAGE_ASPECT_RATIO_PORTRAIT",
71
83
  "9:16": "IMAGE_ASPECT_RATIO_PORTRAIT",
84
+ "3:4": "IMAGE_ASPECT_RATIO_PORTRAIT_THREE_FOUR",
72
85
  "landscape": "IMAGE_ASPECT_RATIO_LANDSCAPE",
73
86
  "16:9": "IMAGE_ASPECT_RATIO_LANDSCAPE",
74
87
  "4:3": "IMAGE_ASPECT_RATIO_LANDSCAPE_FOUR_THREE",
@@ -684,7 +697,7 @@ class FlowClient:
684
697
 
685
698
  def generate_image(self, req: GenerateImageRequest) -> list[Asset]:
686
699
  """
687
- Generate images using Imagen 4 (NARWHAL).
700
+ Generate images using the specified model (default: Imagen 4 / NARWHAL).
688
701
 
689
702
  POST /v1/projects/{pid}/flowMedia:batchGenerateImages
690
703
  """
@@ -695,11 +708,20 @@ class FlowClient:
695
708
  aspect = IMAGE_ASPECT_MAP.get(req.aspect_ratio.lower(), "IMAGE_ASPECT_RATIO_LANDSCAPE")
696
709
  seed = req.seed if req.seed is not None else random.randint(10000, 99999)
697
710
 
711
+ # Resolve model name
712
+ model_name = IMAGE_MODEL # default
713
+ if req.model:
714
+ model_name = IMAGE_MODEL_MAP.get(req.model.lower(), req.model.upper())
715
+
698
716
  def _do_generate():
699
717
  recaptcha_token = self._get_recaptcha_token()
700
718
  client_ctx = self._build_client_context(project_id, recaptcha_token)
701
719
  batch_id = str(uuid.uuid4())
702
720
 
721
+ media_id = None
722
+ if req.image:
723
+ media_id = self.upload_image(req.image)
724
+
703
725
  payload = {
704
726
  "clientContext": client_ctx,
705
727
  "mediaGenerationContext": {"batchId": batch_id},
@@ -709,7 +731,7 @@ class FlowClient:
709
731
  for i in range(req.num_images):
710
732
  img_req = {
711
733
  "clientContext": client_ctx,
712
- "imageModelName": IMAGE_MODEL,
734
+ "imageModelName": model_name,
713
735
  "imageAspectRatio": aspect,
714
736
  "structuredPrompt": {
715
737
  "parts": [{"text": req.prompt}],
@@ -717,6 +739,13 @@ class FlowClient:
717
739
  "seed": seed + i,
718
740
  "imageInputs": [],
719
741
  }
742
+
743
+ if media_id:
744
+ img_req["imageInputs"].append({
745
+ "imageInputType": "IMAGE_INPUT_TYPE_BASE_IMAGE",
746
+ "name": media_id
747
+ })
748
+
720
749
  payload["requests"].append(img_req)
721
750
 
722
751
  if self.debug:
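Review note: the `client.py` hunks above change how each image request is assembled (model lookup, aspect fallback, per-image seed, optional base image). The following is a minimal standalone sketch of that logic, assuming only the mappings visible in this diff. The function names `resolve_model` and `build_image_requests` are hypothetical, and `clientContext` is omitted because it needs a live reCAPTCHA token.

```python
import random
from typing import Any, Dict, List, Optional

# Mappings copied from the entries visible in this diff (possibly incomplete).
IMAGE_MODEL = "NARWHAL"
IMAGE_MODEL_MAP = {
    "imagen4": "NARWHAL",
    "narwhal": "NARWHAL",
    "nano-banana-2": "NANOBANANA2",
    "nanobanana2": "NANOBANANA2",
    "banana": "NANOBANANA2",
}
IMAGE_ASPECT_MAP = {
    "1:1": "IMAGE_ASPECT_RATIO_SQUARE",
    "portrait": "IMAGE_ASPECT_RATIO_PORTRAIT",
    "9:16": "IMAGE_ASPECT_RATIO_PORTRAIT",
    "3:4": "IMAGE_ASPECT_RATIO_PORTRAIT_THREE_FOUR",
    "landscape": "IMAGE_ASPECT_RATIO_LANDSCAPE",
    "16:9": "IMAGE_ASPECT_RATIO_LANDSCAPE",
    "4:3": "IMAGE_ASPECT_RATIO_LANDSCAPE_FOUR_THREE",
}


def resolve_model(model: Optional[str]) -> str:
    """Friendly name -> API codename; unknown names are upper-cased as-is."""
    if not model:
        return IMAGE_MODEL
    return IMAGE_MODEL_MAP.get(model.lower(), model.upper())


def build_image_requests(
    prompt: str,
    num_images: int = 1,
    aspect_ratio: str = "landscape",
    model: Optional[str] = None,
    seed: Optional[int] = None,
    media_id: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Build one request dict per image, as the loop in generate_image does."""
    # Unknown aspect ratios silently fall back to landscape.
    aspect = IMAGE_ASPECT_MAP.get(aspect_ratio.lower(), "IMAGE_ASPECT_RATIO_LANDSCAPE")
    base_seed = seed if seed is not None else random.randint(10000, 99999)
    requests: List[Dict[str, Any]] = []
    for i in range(num_images):
        img_req: Dict[str, Any] = {
            "imageModelName": resolve_model(model),
            "imageAspectRatio": aspect,
            "structuredPrompt": {"parts": [{"text": prompt}]},
            "seed": base_seed + i,  # each image gets a distinct seed
            "imageInputs": [],
        }
        if media_id:
            img_req["imageInputs"].append(
                {"imageInputType": "IMAGE_INPUT_TYPE_BASE_IMAGE", "name": media_id}
            )
        requests.append(img_req)
    return requests
```

One consequence worth noting: because unknown model names are upper-cased and passed through, a misspelled name would reach the API unchanged. The CLI guards against this with `click.Choice`, but direct callers of `FlowClient` have no such check.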
@@ -36,11 +36,13 @@ class Asset(BaseModel):
36
36
 
37
37
 
38
38
  class GenerateImageRequest(BaseModel):
39
- """Request to generate images using Imagen 4."""
39
+ """Request to generate images using Imagen 4 or other models."""
40
40
  prompt: str
41
- aspect_ratio: str = "landscape" # square, portrait, landscape, 4:3
41
+ aspect_ratio: str = "landscape" # square, portrait, landscape, 4:3, 3:4
42
42
  seed: int | None = None
43
43
  num_images: int = 1 # Number of images to generate (1-8)
44
+ image: str | None = None # Path to image for reference (image-to-image)
45
+ model: str | None = None # Model name: imagen4, nano-banana-2 (default: imagen4)
44
46
 
45
47
 
46
48
  class GenerateVideoRequest(BaseModel):
@@ -140,29 +140,56 @@ def close_browser(ctx):
140
140
  @cli.command("generate-image")
141
141
  @click.argument("prompt")
142
142
  @click.option("--aspect-ratio", default="landscape",
143
- type=click.Choice(["square", "portrait", "landscape", "4:3", "1:1", "16:9", "9:16"]),
143
+ type=click.Choice(["square", "portrait", "landscape", "4:3", "3:4", "1:1", "16:9", "9:16"]),
144
144
  help="Image aspect ratio")
145
145
  @click.option("--seed", default=None, type=int, help="Random seed for reproducibility")
146
- @click.option("--num", default=1, type=click.IntRange(1, 8), help="Number of images (1-8)")
146
+ @click.option("--num", default=1, type=click.IntRange(1, 4), help="Number of images (1-4)")
147
147
  @click.option("-o", "--output", default=None, help="Output file path (auto-named if omitted)")
148
+ @click.option("-i", "--image", default=None, help="Reference base image path (image-to-image)")
149
+ @click.option("--project-id", default=None, help="Reuse an existing project ID (skip project creation)")
150
+ @click.option("-m", "--model", default=None,
151
+ type=click.Choice(["imagen4", "nano-banana-2"], case_sensitive=False),
152
+ help="Image model (default: imagen4)")
148
153
  @click.option("--json", "as_json", is_flag=True, help="Output as JSON")
149
154
  @click.pass_context
150
- def generate_image(ctx, prompt, aspect_ratio, seed, num, output, as_json):
151
- """Generate images from a text prompt using Imagen 4.
155
+ def generate_image(ctx, prompt, aspect_ratio, seed, num, output, image, project_id, model, as_json):
156
+ """Generate images from a text prompt.
157
+
158
+ \b
159
+ Models:
160
+ imagen4 — Imagen 4 (NARWHAL) highest quality (default)
161
+ nano-banana-2 — Nano Banana 2, faster generation
152
162
 
153
163
  \b
154
164
  Examples:
155
165
  gflow generate-image "a cat astronaut floating in space"
156
- gflow generate-image "sunset over mountains" --aspect-ratio landscape --num 4
166
+ gflow generate-image "sunset" --aspect-ratio 1:1 --num 4
157
167
  gflow generate-image "logo design" --aspect-ratio square -o logo.png
168
+ gflow generate-image "cute owl" -m nano-banana-2
169
+ gflow generate-image "make the character smile" -i base.png
170
+ gflow generate-image "prompt" --project-id abc123 (reuse project)
158
171
  """
159
172
  client = _get_client(ctx.obj["debug"])
160
173
 
174
+ # Set project ID if provided (skip project creation)
175
+ if project_id:
176
+ client._project_id = project_id
177
+
178
+ # Validate image path if provided
179
+ if image:
180
+ img_path = Path(image)
181
+ if not img_path.exists():
182
+ console.print(f"[red]Error: Image file not found:[/red] {image}")
183
+ sys.exit(1)
184
+ image = str(img_path.resolve()) # Use absolute path
185
+
161
186
  req = GenerateImageRequest(
162
187
  prompt=prompt,
163
188
  aspect_ratio=aspect_ratio,
164
189
  seed=seed,
165
190
  num_images=num,
191
+ image=image,
192
+ model=model,
166
193
  )
167
194
 
168
195
  try:
@@ -206,7 +233,7 @@ def generate_image(ctx, prompt, aspect_ratio, seed, num, output, as_json):
206
233
  enc = d["raw"]["encodedImage"]
207
234
  d["raw"]["encodedImage"] = f"<{len(enc)} chars base64>"
208
235
  result.append(d)
209
- click.echo(json.dumps({"images": result, "saved_files": saved_files}, indent=2))
236
+ click.echo(json.dumps({"images": result, "saved_files": saved_files, "projectId": client._project_id}, indent=2))
210
237
  else:
211
238
  console.print(f"\n[bold]Generated {len(assets)} image(s)[/bold] for: {prompt}")
212
239
 
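Review note: the last `main.py` hunk adds `projectId` to the `--json` output, next to the existing step that replaces each base64 payload with a length placeholder. The following is a minimal sketch of that output shaping, assuming assets are plain dicts with an optional `raw.encodedImage` field. The function name `summarize_for_json` is hypothetical, and unlike the original loop it copies rather than mutates its input.

```python
import json
from typing import Any, Dict, List, Optional


def summarize_for_json(
    assets: List[Dict[str, Any]],
    saved_files: List[str],
    project_id: Optional[str],
) -> str:
    """Return the CLI-style JSON summary with base64 payloads redacted."""
    result = []
    for asset in assets:
        d = dict(asset)  # shallow copy so the caller's data stays intact
        raw = d.get("raw")
        if isinstance(raw, dict) and "encodedImage" in raw:
            enc = raw["encodedImage"]
            # Replace the payload with a short placeholder, as main.py does.
            d["raw"] = {**raw, "encodedImage": f"<{len(enc)} chars base64>"}
        result.append(d)
    return json.dumps(
        {"images": result, "saved_files": saved_files, "projectId": project_id},
        indent=2,
    )
```

A caller can then read `projectId` from the parsed JSON and pass it back through `--project-id` on the next invocation.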
@@ -8,7 +8,7 @@ version = "0.1.0"
8
8
  description = "A command-line interface to Google Flow (AI image & video generation)"
9
9
  readme = "README.md"
10
10
  license = {text = "MIT"}
11
- requires-python = ">=3.10"
11
+ requires-python = ">=3.9"
12
12
  dependencies = [
13
13
  "requests>=2.31",
14
14
  "click>=8.1",
@@ -153,6 +153,10 @@ Before claiming DONE, check **each item** below:
153
153
  ☐ Backwards compatibility: Breaking changes documented?
154
154
  ☐ Localization (I18N): Has new UI text been added to Localizable.strings (EN & VI)?
155
155
  → Wrapping text in `Localized()` in code is not enough. You must ACTUALLY open the .strings file (or run update_strings.py) and add the key/value pairs before reporting DONE.
156
+ ☐ App Store Compliance (iOS only): Does the app comply with the App Store Review Guidelines?
157
+ → Background Modes (e.g. UIBackgroundModes `location`) must be valid and genuinely NECESSARY (Guideline 2.5.4). Unneeded declarations will be rejected.
158
+ → Data collection (user tracking, advertising/health) MUST use App Tracking Transparency (`NSUserTrackingUsageDescription`) (Guideline 5.1.2(i)).
159
+ → Third-party assets/APIs (e.g. Apple Weather) MUST display full trademark attribution and a link to the legal source (Guideline 5.2.5).
156
160
  ```
157
161
 
158
162
  **If any item is missing → report DONE_WITH_CONCERNS, not DONE.**
@@ -328,6 +328,27 @@ symphony task block &lt;id&gt; "Reason" # Mark as blocked due to an error</
328
328
  <p>The AWKit AI system will <strong>always automatically</strong> perform the actions above for you while it runs the 7-Gate system. You do not need to type commands manually unless you want to check progress.</p>
329
329
  </div>
330
330
 
331
+ <div class="card">
332
+ <h2>🗣️ Communication & Caveman Mode</h2>
333
+ <p>AWKit supports <strong>Caveman Mode</strong>, which makes the AI communicate extremely concisely by dropping filler words, greetings, and long-winded explanations. This saves 75% of tokens and speeds up responses.</p>
334
+
335
+ <h3>How to Configure (in <code>.project-identity</code>):</h3>
336
+ <pre><code>"communication": {
337
+ "cavemanMode": {
338
+ "enabled": true, // Turn on or off
339
+ "level": "full" // Levels: lite, full, ultra
340
+ }
341
+ }</code></pre>
342
+
343
+ <h3>Compression Levels:</h3>
344
+ <ul>
345
+ <li><strong>lite:</strong> Drops only greetings and empty pleasantries.</li>
346
+ <li><strong>full:</strong> Terse speech; drops particles and prioritizes technical keywords.</li>
347
+ <li><strong>ultra:</strong> Uses only keywords and symbols (e.g. <code>Fix bug A -> Done.</code>).</li>
348
+ </ul>
349
+ <p><em>⚠️ Note: For dangerous tasks (deleting files, deploying), the AI automatically turns this mode off so that it can warn you clearly.</em></p>
350
+ </div>
351
+
331
352
  <div class="card">
332
353
  <h2>🔌 Project Integration (Trello & Telegram & Neural Memory)</h2>
333
354
 
@@ -23,6 +23,12 @@
23
23
  "indentation": "spaces-4",
24
24
  "lineLength": 120
25
25
  },
26
+ "communication": {
27
+ "cavemanMode": {
28
+ "enabled": false,
29
+ "level": "full"
30
+ }
31
+ },
26
32
  "createdDate": "{{DATE}}",
27
33
  "lastUpdated": "{{DATE}}",
28
34
  "automation": {
@@ -50,6 +56,24 @@
50
56
  "git": {
51
57
  "autoCommit": true,
52
58
  "autoPush": true
59
+ },
60
+ "obsidian": {
61
+ "enabled": false,
62
+ "path": "",
63
+ "autoSync": false
64
+ },
65
+ "mcp": {
66
+ "pixel-mcp": {
67
+ "enabled": false
68
+ }
69
+ }
70
+ },
71
+ "modelPolicy": {
72
+ "mode": "auto",
73
+ "defaultTier": "STANDARD",
74
+ "tierOverrides": {
75
+ "*.plist|*.json|*.env": "LIGHT",
76
+ "docs/*": "LIGHT"
53
77
  }
54
78
  }
55
79
  }
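Review note: the diff adds a `modelPolicy.tierOverrides` map to the project-identity templates but does not show how it is evaluated. The following is a minimal sketch of one plausible reading. It assumes that `|` separates alternative glob patterns, that a pattern may match either the full path or the file name, and that the first matching entry wins. The function name `resolve_tier` is hypothetical, and AWKit's actual resolution rules may differ.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Any, Dict

# Values copied from the template hunk above.
MODEL_POLICY: Dict[str, Any] = {
    "mode": "auto",
    "defaultTier": "STANDARD",
    "tierOverrides": {
        "*.plist|*.json|*.env": "LIGHT",
        "docs/*": "LIGHT",
    },
}


def resolve_tier(path: str, policy: Dict[str, Any] = MODEL_POLICY) -> str:
    """Return the tier for a file path under the assumed matching rules."""
    name = PurePosixPath(path).name
    for patterns, tier in policy["tierOverrides"].items():
        for pattern in patterns.split("|"):  # assumed: "|" means "or"
            if fnmatch(path, pattern) or fnmatch(name, pattern):
                return tier  # assumed: first matching entry wins
    return policy["defaultTier"]
```

Under these assumptions, `resolve_tier("ios/Info.plist")` and `resolve_tier("docs/guide/intro.md")` resolve to `LIGHT`, while a path with no matching pattern falls back to `defaultTier`.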