ima2-gen 1.1.17 → 1.1.19
- package/README.md +26 -4
- package/bin/commands/capabilities.js +6 -0
- package/bin/commands/capabilities.ts +6 -0
- package/bin/commands/video.js +215 -0
- package/bin/commands/video.ts +205 -0
- package/bin/ima2.js +61 -6
- package/bin/ima2.ts +54 -6
- package/docs/API.md +73 -4
- package/docs/CLI.md +38 -0
- package/docs/README.ja.md +2 -2
- package/docs/README.ko.md +15 -3
- package/docs/README.zh-CN.md +2 -2
- package/lib/agentGenerationPlanner.js +18 -1
- package/lib/agentGenerationPlanner.ts +21 -1
- package/lib/agentRuntime.js +105 -1
- package/lib/agentRuntime.ts +118 -1
- package/lib/agentTypes.js +1 -0
- package/lib/agentTypes.ts +2 -1
- package/lib/assetLifecycle.js +12 -8
- package/lib/assetLifecycle.ts +12 -8
- package/lib/capabilities.js +9 -0
- package/lib/capabilities.ts +9 -0
- package/lib/grokVideoAdapter.js +45 -1
- package/lib/grokVideoAdapter.ts +49 -1
- package/lib/historyList.js +1 -0
- package/lib/historyList.ts +1 -0
- package/lib/imageModels.js +1 -1
- package/lib/imageModels.ts +2 -2
- package/lib/oauthLauncher.js +5 -2
- package/lib/oauthLauncher.ts +5 -3
- package/lib/videoSeriesChain.js +24 -0
- package/lib/videoSeriesChain.ts +29 -0
- package/node_modules/progrok/README.md +300 -22
- package/node_modules/progrok/dist/index.js +558 -173
- package/node_modules/progrok/dist/index.js.map +1 -1
- package/node_modules/progrok/package.json +3 -3
- package/node_modules/progrok/skills/progrok/SKILL.md +145 -109
- package/package.json +2 -2
- package/routes/video.js +10 -1
- package/routes/video.ts +11 -1
- package/ui/dist/.vite/manifest.json +12 -12
- package/ui/dist/assets/AgentWorkspace-DE_wg90f.js +3 -0
- package/ui/dist/assets/{CardNewsWorkspace-6y_HNp3I.js → CardNewsWorkspace--Myc5pAp.js} +1 -1
- package/ui/dist/assets/NodeCanvas-4U5oOT2y.js +7 -0
- package/ui/dist/assets/{PromptBuilderPanel-BQlPtGGm.js → PromptBuilderPanel-DNW1U8zI.js} +2 -2
- package/ui/dist/assets/{PromptImportDialog-aNk40wLt.js → PromptImportDialog-o-4Sqki1.js} +2 -2
- package/ui/dist/assets/{PromptImportDiscoverySection-B6NKkVBz.js → PromptImportDiscoverySection-BAbrRP8B.js} +1 -1
- package/ui/dist/assets/{PromptImportFolderSection-9-xbe-FM.js → PromptImportFolderSection-L-XI2noz.js} +1 -1
- package/ui/dist/assets/{PromptLibraryPanel-CbEY0AM6.js → PromptLibraryPanel-CrW9LYGD.js} +2 -2
- package/ui/dist/assets/{SettingsWorkspace-ao9ymIWt.js → SettingsWorkspace-Dn4SYTyZ.js} +1 -1
- package/ui/dist/assets/index-B6tcw_UF.css +1 -0
- package/ui/dist/assets/{index-DP88bEQf.js → index-BONbNNIi.js} +1 -1
- package/ui/dist/assets/index-CeSZ2L3-.js +32 -0
- package/ui/dist/index.html +2 -2
- package/vendor/progrok-0.1.1.tgz +0 -0
- package/ui/dist/assets/AgentWorkspace-CLHwx6u4.js +0 -3
- package/ui/dist/assets/NodeCanvas-DR2N5Dib.js +0 -7
- package/ui/dist/assets/index-B0re600T.js +0 -32
- package/ui/dist/assets/index-CXJEgTOQ.css +0 -1
- package/vendor/progrok-0.1.0.tgz +0 -0
package/README.md
CHANGED
@@ -16,9 +16,9 @@
 
 `ima2-gen` is a local image generation studio for people who want the ChatGPT/Codex image workflow in a small desktop-like web app.
 
-Run it with `npx`, sign in with
+Run it with `npx`, sign in with ChatGPT OAuth or Grok OAuth, and start generating images and videos. Iterate with history, references, node branches, multimode batches, Canvas Mode cleanup, and Grok Video generation. No API key required — free ChatGPT OAuth and SuperGrok subscription cover everything.
 
-![ima2-gen
 
 ## Quick Start
 
@@ -35,6 +35,13 @@ npx @openai/codex login
|
|
|
35
35
|
npx ima2-gen serve
|
|
36
36
|
```
|
|
37
37
|
|
|
38
|
+
To generate a video from the CLI:
|
|
39
|
+
|
|
40
|
+
```bash
|
|
41
|
+
ima2 video "a cat playing piano" --duration 5 --resolution 720p
|
|
42
|
+
ima2 video "animate this scene" --ref photo.png --duration 10
|
|
43
|
+
```
|
|
44
|
+
|
|
38
45
|
If `3333` is already occupied, `ima2-gen` binds the next available port and writes the actual URL to `~/.ima2/server.json`. Use `ima2 open` or the URL printed in the terminal instead of assuming the port.
|
|
39
46
|
|
|
40
47
|
You can also install it globally:
|
|
@@ -44,6 +51,17 @@ npm install -g ima2-gen
 ima2 serve
 ```
 
+### Setup
+
+`ima2 setup` offers four authentication choices:
+
+1. **GPT OAuth** — login with ChatGPT account (free, images only)
+2. **Grok OAuth** — login with xAI/Grok account (images + video)
+3. **Both** — GPT OAuth + Grok OAuth (full feature access)
+4. **API Key** — paste your OpenAI API key (paid)
+
+Video generation requires Grok OAuth (option 2 or 3). Run `ima2 grok login` separately if you already have GPT OAuth configured and want to add video support.
+
 Before updating a global install on Windows, stop any running `ima2 serve`
 process. If npm reports `EBUSY` or `resource busy or locked`, close ima2
 terminals, end stale `node.exe` processes if needed, and retry. If the lock
@@ -54,9 +72,10 @@ persists, reboot and run the update before starting ima2 again.
 - **Classic mode**: generate, edit, reuse the current image, paste references, and continue from history.
 - **Node mode**: branch a good image into multiple directions without losing the original.
 - **Multimode batches**: launch several Classic outputs from one prompt, watch slot-by-slot progress, and continue from the best result.
+- **Video generation**: create short videos from text, a single image, or multiple reference images via Grok video models. SSE streaming shows planning → submitted → progress % → done.
 - **Canvas Mode**: zoom, pan, annotate, erase, clean backgrounds, keep transparent previews, and export either alpha or matte-backed versions.
 - **Local gallery**: keep generated assets on your machine with session-aware history. By default the gallery shows the current session and an All Images toggle reveals the full history; the default scope is sticky across sessions. Each image records its generation time and reasoning effort in the result metadata, so they persist across reloads.
-- **Reference images**: drag, drop, paste, and attach up to 5 references; large images are compressed before upload.
+- **Reference images**: drag, drop, paste, and attach up to 5 references (images) or up to 7 references (video); large images are compressed before upload.
 - **Prompt library imports**: import local prompt packs, GitHub folders, and curated GPT-image prompt hints into the built-in prompt library.
 - **Mobile shell**: use the app bar, compose sheet, and compact settings toggle on smaller screens.
 - **Observable jobs**: active and recent jobs are tracked with safe logs and request IDs.
@@ -73,7 +92,7 @@ Image generation can run through the local Codex/ChatGPT OAuth path, a configure
 
 If no provider is specified, the app keeps the current OAuth/default behavior. API-key generation defaults to `gpt-5.4-mini`, `low` reasoning, and `1024x1024` unless the request passes validated model, reasoning, size, or web-search options. Grok defaults to `grok-imagine-image`; `quality: "high"` promotes the final image call to `grok-imagine-image-quality`.
 
-Grok video generation (
+Grok video generation uses `grok-imagine-video` (default) or `grok-imagine-video-1.5-preview`. Three modes are auto-detected from reference count: text-to-video (0 refs), image-to-video (1 ref), and reference-to-video (2–7 refs, max 10s duration). Video controls include duration (1–15s), resolution (480p, 720p), and aspect ratio (1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, auto).
 
 ![Provider and generation options](assets/screenshots/provider-options-light.png)
 
@@ -166,6 +185,7 @@ These require a running `ima2 serve`. The CLI covers every server route. The mos
 | `ima2 gen <prompt>` | Generate from the CLI |
 | `ima2 edit <file> --prompt <text>` | Edit an existing image |
 | `ima2 multimode <prompt>` | Multi-image SSE generation |
+| `ima2 video <prompt>` | Video generation via Grok (SSE streaming with progress) |
 | `ima2 ls [--session <id>] [--favorites]` | List recent history |
 | `ima2 show <name> [--metadata]` | Reveal a generated asset |
 | `ima2 prompt ls -q <search>` | Search the prompt library |
@@ -179,6 +199,8 @@ The server advertises its actual port at `~/.ima2/server.json`. If `3333` is bus
 ima2 gen "poster" --model gpt-5.4 --reasoning-effort high
 ima2 edit input.png --prompt "make it rainy" --web-search
 ima2 multimode "two cats playing" -n 2
+ima2 video "a cat playing piano" --duration 5 --resolution 720p
+ima2 video "animate this" --ref photo.png --aspect-ratio 16:9
 ima2 inflight ls --terminal
 ima2 config set imageModels.reasoningEffort high
 ```
|
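The README changes above describe three video modes selected by reference count. A minimal sketch of that rule follows, assuming a hypothetical helper named `detectVideoMode`. The helper name and its return shape are illustrative only and do not appear in the package; the limits (at most 7 references, 15 seconds in general, 10 seconds for reference-to-video) are the ones stated in the README text.

```javascript
// Hypothetical helper: the name and return shape are illustrative,
// not taken from the ima2-gen source.
function detectVideoMode(refCount) {
  if (!Number.isInteger(refCount) || refCount < 0 || refCount > 7) {
    throw new RangeError("expected between 0 and 7 reference images");
  }
  // 0 refs: text-to-video, 1 ref: image-to-video, 2-7 refs: reference-to-video.
  if (refCount === 0) return { mode: "text-to-video", maxDuration: 15 };
  if (refCount === 1) return { mode: "image-to-video", maxDuration: 15 };
  // The README states a 10 second cap for reference-to-video.
  return { mode: "reference-to-video", maxDuration: 10 };
}

console.log(detectVideoMode(0).mode);
console.log(detectVideoMode(3).mode, detectVideoMode(3).maxDuration);
```

The CLI command added later in this diff makes the same split when it builds the request: one reference is sent as `sourceImage`, two or more as `referenceImages`.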
package/bin/commands/capabilities.js
CHANGED
@@ -70,6 +70,12 @@ function printText(capabilities) {
     if (capabilities.valid?.imageModels?.grokSupported?.length) {
         out(` grok models: ${capabilities.valid.imageModels.grokSupported.join(", ")}`);
     }
+    if (capabilities.valid?.videoModels?.supported?.length) {
+        out(` video models: ${capabilities.valid.videoModels.supported.join(", ")}`);
+        out(` video resolutions: ${capabilities.valid.videoModels.resolutions?.join(", ")}`);
+        out(` video aspect ratios: ${capabilities.valid.videoModels.aspectRatios?.join(", ")}`);
+        out(` video duration: ${capabilities.valid.videoModels.durationRange?.[0]}-${capabilities.valid.videoModels.durationRange?.[1]}s`);
+    }
     out(` reasoning: ${capabilities.valid?.reasoningEfforts?.join(", ")}`);
     out(` quality: ${capabilities.valid?.quality?.join(", ")}`);
     out(` modes: ${capabilities.valid?.modes?.join(", ")}`);
package/bin/commands/capabilities.ts
CHANGED
@@ -74,6 +74,12 @@ function printText(capabilities: any): void {
   if (capabilities.valid?.imageModels?.grokSupported?.length) {
     out(` grok models: ${capabilities.valid.imageModels.grokSupported.join(", ")}`);
   }
+  if (capabilities.valid?.videoModels?.supported?.length) {
+    out(` video models: ${capabilities.valid.videoModels.supported.join(", ")}`);
+    out(` video resolutions: ${capabilities.valid.videoModels.resolutions?.join(", ")}`);
+    out(` video aspect ratios: ${capabilities.valid.videoModels.aspectRatios?.join(", ")}`);
+    out(` video duration: ${capabilities.valid.videoModels.durationRange?.[0]}-${capabilities.valid.videoModels.durationRange?.[1]}s`);
+  }
   out(` reasoning: ${capabilities.valid?.reasoningEfforts?.join(", ")}`);
   out(` quality: ${capabilities.valid?.quality?.join(", ")}`);
   out(` modes: ${capabilities.valid?.modes?.join(", ")}`);
|
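The `printText` change reads four fields under `capabilities.valid.videoModels`. The sketch below shows one payload shape that would satisfy those reads, plus a hypothetical formatter, `formatVideoCapabilities`, that returns the lines instead of printing them. The payload shape is inferred from the property accesses in the hunk and the values from the README. Neither is confirmed against the server source, and the formatter name is an assumption.

```javascript
// Assumed payload shape: inferred from the fields printText reads,
// not taken from the server implementation.
const sampleCapabilities = {
  valid: {
    videoModels: {
      supported: ["grok-imagine-video", "grok-imagine-video-1.5-preview"],
      resolutions: ["480p", "720p"],
      aspectRatios: ["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3", "auto"],
      durationRange: [1, 15],
    },
  },
};

// Hypothetical formatter: returns lines rather than printing, and falls back
// to empty values so a missing field never renders as "undefined".
function formatVideoCapabilities(capabilities) {
  const video = capabilities?.valid?.videoModels;
  if (!video?.supported?.length) return [];
  const [min, max] = video.durationRange ?? [];
  return [
    `video models: ${video.supported.join(", ")}`,
    `video resolutions: ${(video.resolutions ?? []).join(", ")}`,
    `video aspect ratios: ${(video.aspectRatios ?? []).join(", ")}`,
    `video duration: ${min ?? "?"}-${max ?? "?"}s`,
  ];
}

for (const line of formatVideoCapabilities(sampleCapabilities)) console.log(line);
```

Returning lines keeps the formatting testable without capturing stdout. The fallbacks are a design choice of this sketch; the diffed code uses optional chaining, so an absent `resolutions` array would interpolate as the string `undefined`.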
package/bin/commands/video.js
ADDED
@@ -0,0 +1,215 @@
+import { parseArgs } from "../lib/args.js";
+import { resolveServer } from "../lib/client.js";
+import { streamSse } from "../lib/sse.js";
+import { out, die, color, json, exitCodeForError } from "../lib/output.js";
+import { config } from "../../config.js";
+import { readFile, writeFile, mkdir } from "node:fs/promises";
+import { dirname, join } from "node:path";
+const VALID_RESOLUTIONS = new Set(["480p", "720p"]);
+const VALID_ASPECT_RATIOS = new Set(["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3", "auto"]);
+const VALID_MODELS = new Set(["grok-imagine-video", "grok-imagine-video-1.5-preview"]);
+const SPEC = {
+    flags: {
+        duration: { type: "string", default: "5" },
+        resolution: { type: "string", default: "480p" },
+        "aspect-ratio": { type: "string", default: "auto" },
+        model: { type: "string" },
+        topic: { type: "string" },
+        ref: { type: "string", repeatable: true },
+        out: { short: "o", type: "string" },
+        "out-dir": { short: "d", type: "string" },
+        json: { type: "boolean" },
+        timeout: { type: "string", default: "600" },
+        server: { type: "string" },
+        session: { type: "string" },
+        help: { short: "h", type: "boolean" },
+    },
+};
+const HELP = `
+ima2 video <prompt...> [options]
+
+Generate a video via the Grok video provider (SSE streaming).
+
+Options:
+--duration <1..15> Duration in seconds. Default: 5
+--resolution <480p|720p> Default: 480p
+--aspect-ratio <ratio|auto> 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, auto. Default: auto
+--model <name> grok-imagine-video, grok-imagine-video-1.5-preview
+--topic <text> Series topic for prompt chain continuity
+--ref <file> Attach source/reference image (repeatable, max 7)
+-o, --out <file> Output file path
+-d, --out-dir <dir> Output directory
+--json Print JSON result to stdout
+--timeout <sec> Default: 600
+--server <url> Override server URL
+--session <id> Session ID
+
+Modes (auto-detected from --ref count):
+0 refs → text-to-video
+1 ref → image-to-video
+2-7 refs → reference-to-video (max 10s duration)
+
+Examples:
+ima2 video "a cat playing piano"
+ima2 video "animate this" --ref photo.png --duration 10
+ima2 video "cinematic" --resolution 720p --aspect-ratio 16:9 -o out.mp4
+`;
+export default async function videoCmd(argv) {
+    const args = parseArgs(argv, SPEC);
+    if (args.help) {
+        out(HELP);
+        return;
+    }
+    const prompt = args.positional.join(" ");
+    if (!prompt)
+        die(2, "prompt is required");
+    const duration = parseInt(String(args.duration)) || 5;
+    if (duration < 1 || duration > 15)
+        die(2, "--duration must be between 1 and 15");
+    const resolution = String(args.resolution);
+    if (!VALID_RESOLUTIONS.has(resolution))
+        die(2, "--resolution must be one of: 480p, 720p");
+    const aspectRatio = String(args["aspect-ratio"]);
+    if (!VALID_ASPECT_RATIOS.has(aspectRatio))
+        die(2, "--aspect-ratio must be one of: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, auto");
+    if (args.model && !VALID_MODELS.has(String(args.model))) {
+        die(2, "--model must be one of: grok-imagine-video, grok-imagine-video-1.5-preview");
+    }
+    const refs = (Array.isArray(args.ref) ? args.ref : []);
+    if (refs.length > 7)
+        die(2, "max 7 --ref attachments for video");
+    let server;
+    try {
+        server = await resolveServer({ serverFlag: args.server });
+    }
+    catch (e) {
+        die(exitCodeForError(e), e.message);
+        throw e;
+    }
+    const referenceImages = await Promise.all(refs.map(async (p) => {
+        const buf = await readFile(p);
+        return buf.toString("base64");
+    }));
+    const timeoutMs = (parseInt(String(args.timeout)) || 600) * 1000;
+    const requestId = `req_cli_video_${Date.now().toString(36)}`;
+    const body = {
+        prompt,
+        provider: "grok",
+        duration,
+        resolution,
+        aspectRatio,
+        requestId,
+    };
+    if (args.model)
+        body.model = args.model;
+    if (args.session)
+        body.sessionId = args.session;
+    if (args.topic)
+        body.topic = args.topic;
+    if (referenceImages.length === 1) {
+        body.sourceImage = referenceImages[0];
+    }
+    else if (referenceImages.length > 1) {
+        body.referenceImages = referenceImages;
+    }
+    const ac = new AbortController();
+    let timedOut = false;
+    const timeoutTimer = setTimeout(() => { timedOut = true; ac.abort(); }, timeoutMs);
+    const onSig = () => { ac.abort(); process.exit(130); };
+    process.once("SIGINT", onSig);
+    process.once("SIGTERM", onSig);
+    const url = `${server.base}/api/video/generate`;
+    let doneData = null;
+    let lastProgress = -1;
+    try {
+        for await (const ev of streamSse(url, { body, signal: ac.signal, headers: { "X-Request-Id": requestId } })) {
+            switch (ev.event) {
+                case "planning":
+                    if (!args.json)
+                        out(color.dim("[planning] preparing video generation..."));
+                    break;
+                case "submitted":
+                    if (!args.json)
+                        out(color.dim(`[submitted] xai request: ${ev.data.xaiVideoRequestId || "..."}`));
+                    break;
+                case "progress": {
+                    const pct = typeof ev.data.progress === "number" ? Math.round(ev.data.progress * 100) : null;
+                    if (pct !== null && pct !== lastProgress && !args.json) {
+                        const bar = renderBar(pct);
+                        process.stdout.write(`\r ${bar} ${pct}%`);
+                        lastProgress = pct;
+                    }
+                    break;
+                }
+                case "done":
+                    if (!args.json && lastProgress >= 0)
+                        process.stdout.write("\n");
+                    doneData = ev.data;
+                    break;
+                case "error":
+                    if (!args.json && lastProgress >= 0)
+                        process.stdout.write("\n");
+                    die(1, `video error: ${ev.data.error || ev.data}${ev.data.code ? ` (${ev.data.code})` : ""}`);
+            }
+        }
+    }
+    catch (e) {
+        if (e.name === "AbortError" && !timedOut)
+            return;
+        if (!args.json && lastProgress >= 0)
+            process.stdout.write("\n");
+        die(exitCodeForError(e), e.message);
+    }
+    finally {
+        clearTimeout(timeoutTimer);
+        process.off("SIGINT", onSig);
+        process.off("SIGTERM", onSig);
+    }
+    if (!doneData?.filename)
+        die(1, "server did not return a video filename");
+    // Determine output path
+    const filename = String(doneData.filename);
+    const explicitOut = args.out ? String(args.out) : null;
+    const outDir = args["out-dir"] ? String(args["out-dir"]) : null;
+    let target;
+    if (explicitOut) {
+        target = explicitOut;
+    }
+    else if (outDir) {
+        target = join(outDir, filename);
+    }
+    else {
+        target = join(config.storage.generatedDir, filename);
+    }
+    // Download the video file from server
+    const videoUrl = `${server.base}${doneData.url || `/generated/${encodeURIComponent(filename)}`}`;
+    const dlRes = await fetch(videoUrl, { signal: AbortSignal.timeout(30_000) });
+    if (!dlRes.ok)
+        die(1, `failed to download video: HTTP ${dlRes.status}`);
+    const videoBuf = Buffer.from(await dlRes.arrayBuffer());
+    await mkdir(dirname(target), { recursive: true }).catch(() => { });
+    await writeFile(target, videoBuf);
+    if (args.json) {
+        json({
+            ok: true,
+            requestId: doneData.requestId,
+            path: target,
+            filename,
+            elapsed: doneData.elapsed,
+            video: doneData.video,
+            revisedPrompt: doneData.revisedPrompt,
+        });
+    }
+    else {
+        out(color.green("✓ ") + target);
+        if (doneData.elapsed)
+            out(color.dim(`elapsed ${doneData.elapsed}s`));
+        if (doneData.revisedPrompt)
+            out(color.dim(`revised: ${String(doneData.revisedPrompt).slice(0, 80)}`));
+    }
+}
+function renderBar(pct) {
+    const width = 20;
+    const filled = Math.round((pct / 100) * width);
+    return color.green("█".repeat(filled)) + color.dim("░".repeat(width - filled));
+}
|
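The progress handling in `video.js` converts a 0..1 fraction into a whole percent and draws a 20-cell bar. The sketch below isolates that logic without terminal colors. `renderPlainBar` and `toPercent` are hypothetical names introduced here, and the clamping is an addition of this sketch rather than behavior of the diffed `renderBar`. The clamp is there because `String.prototype.repeat` throws a `RangeError` on a negative count, which the original expression `width - filled` would produce for a value above 100. Whether the server can ever send such a value is not shown in this diff.

```javascript
// Hypothetical color-free variant of renderBar; the clamp is an addition.
function renderPlainBar(pct, width = 20) {
  const clamped = Math.min(100, Math.max(0, pct));
  const filled = Math.round((clamped / 100) * width);
  return "█".repeat(filled) + "░".repeat(width - filled);
}

// Mirrors the conversion in the "progress" case: a numeric fraction becomes
// a rounded percent, anything else becomes null and is skipped by the caller.
function toPercent(progress) {
  return typeof progress === "number" ? Math.round(progress * 100) : null;
}

console.log(renderPlainBar(toPercent(0.5)));
console.log(renderPlainBar(150));
```

Keeping the bar a pure function of the percent makes it easy to check each fill level without a live SSE stream.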
package/bin/commands/video.ts
ADDED
@@ -0,0 +1,205 @@
+import { parseArgs } from "../lib/args.js";
+import { resolveServer } from "../lib/client.js";
+import { streamSse } from "../lib/sse.js";
+import { out, die, color, json, exitCodeForError } from "../lib/output.js";
+import { config } from "../../config.js";
+import { readFile, writeFile, mkdir } from "node:fs/promises";
+import { dirname, join } from "node:path";
+
+const VALID_RESOLUTIONS = new Set(["480p", "720p"]);
+const VALID_ASPECT_RATIOS = new Set(["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3", "auto"]);
+const VALID_MODELS = new Set(["grok-imagine-video", "grok-imagine-video-1.5-preview"]);
+
+const SPEC = {
+  flags: {
+    duration: { type: "string", default: "5" },
+    resolution: { type: "string", default: "480p" },
+    "aspect-ratio": { type: "string", default: "auto" },
+    model: { type: "string" },
+    topic: { type: "string" },
+    ref: { type: "string", repeatable: true },
+    out: { short: "o", type: "string" },
+    "out-dir": { short: "d", type: "string" },
+    json: { type: "boolean" },
+    timeout: { type: "string", default: "600" },
+    server: { type: "string" },
+    session: { type: "string" },
+    help: { short: "h", type: "boolean" },
+  },
+};
+
+const HELP = `
+ima2 video <prompt...> [options]
+
+Generate a video via the Grok video provider (SSE streaming).
+
+Options:
+--duration <1..15> Duration in seconds. Default: 5
+--resolution <480p|720p> Default: 480p
+--aspect-ratio <ratio|auto> 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, auto. Default: auto
+--model <name> grok-imagine-video, grok-imagine-video-1.5-preview
+--topic <text> Series topic for prompt chain continuity
+--ref <file> Attach source/reference image (repeatable, max 7)
+-o, --out <file> Output file path
+-d, --out-dir <dir> Output directory
+--json Print JSON result to stdout
+--timeout <sec> Default: 600
+--server <url> Override server URL
+--session <id> Session ID
+
+Modes (auto-detected from --ref count):
+0 refs → text-to-video
+1 ref → image-to-video
+2-7 refs → reference-to-video (max 10s duration)
+
+Examples:
+ima2 video "a cat playing piano"
+ima2 video "animate this" --ref photo.png --duration 10
+ima2 video "cinematic" --resolution 720p --aspect-ratio 16:9 -o out.mp4
+`;
+
+export default async function videoCmd(argv: string[]) {
+  const args = parseArgs(argv, SPEC);
+  if (args.help) { out(HELP); return; }
+
+  const prompt = args.positional.join(" ");
+  if (!prompt) die(2, "prompt is required");
+
+  const duration = parseInt(String(args.duration)) || 5;
+  if (duration < 1 || duration > 15) die(2, "--duration must be between 1 and 15");
+
+  const resolution = String(args.resolution);
+  if (!VALID_RESOLUTIONS.has(resolution)) die(2, "--resolution must be one of: 480p, 720p");
+
+  const aspectRatio = String(args["aspect-ratio"]);
+  if (!VALID_ASPECT_RATIOS.has(aspectRatio)) die(2, "--aspect-ratio must be one of: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, auto");
+
+  if (args.model && !VALID_MODELS.has(String(args.model))) {
+    die(2, "--model must be one of: grok-imagine-video, grok-imagine-video-1.5-preview");
+  }
+
+  const refs = (Array.isArray(args.ref) ? args.ref : []) as string[];
+  if (refs.length > 7) die(2, "max 7 --ref attachments for video");
+
+  let server;
+  try { server = await resolveServer({ serverFlag: args.server }); }
+  catch (e: unknown) { die(exitCodeForError(e), (e as Error).message); throw e; }
+
+  const referenceImages = await Promise.all(refs.map(async (p: string) => {
+    const buf = await readFile(p);
+    return buf.toString("base64");
+  }));
+
+  const timeoutMs = (parseInt(String(args.timeout)) || 600) * 1000;
+  const requestId = `req_cli_video_${Date.now().toString(36)}`;
+
+  const body: Record<string, unknown> = {
+    prompt,
+    provider: "grok",
+    duration,
+    resolution,
+    aspectRatio,
+    requestId,
+  };
+  if (args.model) body.model = args.model;
+  if (args.session) body.sessionId = args.session;
+  if (args.topic) body.topic = args.topic;
+  if (referenceImages.length === 1) {
+    body.sourceImage = referenceImages[0];
+  } else if (referenceImages.length > 1) {
+    body.referenceImages = referenceImages;
+  }
+
+  const ac = new AbortController();
+  let timedOut = false;
+  const timeoutTimer = setTimeout(() => { timedOut = true; ac.abort(); }, timeoutMs);
+  const onSig = () => { ac.abort(); process.exit(130); };
+  process.once("SIGINT", onSig);
+  process.once("SIGTERM", onSig);
+
+  const url = `${server.base}/api/video/generate`;
+  let doneData: Record<string, unknown> | null = null;
+  let lastProgress = -1;
+
+  try {
+    for await (const ev of streamSse(url, { body, signal: ac.signal, headers: { "X-Request-Id": requestId } })) {
+      switch (ev.event) {
+        case "planning":
+          if (!args.json) out(color.dim("[planning] preparing video generation..."));
+          break;
+        case "submitted":
+          if (!args.json) out(color.dim(`[submitted] xai request: ${ev.data.xaiVideoRequestId || "..."}`));
+          break;
+        case "progress": {
+          const pct = typeof ev.data.progress === "number" ? Math.round(ev.data.progress * 100) : null;
+          if (pct !== null && pct !== lastProgress && !args.json) {
+            const bar = renderBar(pct);
+            process.stdout.write(`\r ${bar} ${pct}%`);
+            lastProgress = pct;
+          }
+          break;
+        }
+        case "done":
+          if (!args.json && lastProgress >= 0) process.stdout.write("\n");
+          doneData = ev.data;
+          break;
+        case "error":
+          if (!args.json && lastProgress >= 0) process.stdout.write("\n");
+          die(1, `video error: ${ev.data.error || ev.data}${ev.data.code ? ` (${ev.data.code})` : ""}`);
+      }
+    }
+  } catch (e: unknown) {
+    if ((e as Error).name === "AbortError" && !timedOut) return;
+    if (!args.json && lastProgress >= 0) process.stdout.write("\n");
+    die(exitCodeForError(e), (e as Error).message);
+  } finally {
+    clearTimeout(timeoutTimer);
+    process.off("SIGINT", onSig);
+    process.off("SIGTERM", onSig);
+  }
+
+  if (!doneData?.filename) die(1, "server did not return a video filename");
+
+  // Determine output path
+  const filename = String(doneData.filename);
+  const explicitOut = args.out ? String(args.out) : null;
+  const outDir = args["out-dir"] ? String(args["out-dir"]) : null;
+  let target: string;
+  if (explicitOut) {
+    target = explicitOut;
+  } else if (outDir) {
+    target = join(outDir, filename);
+  } else {
+    target = join(config.storage.generatedDir, filename);
+  }
+
+  // Download the video file from server
+  const videoUrl = `${server.base}${doneData.url || `/generated/${encodeURIComponent(filename)}`}`;
+  const dlRes = await fetch(videoUrl, { signal: AbortSignal.timeout(30_000) });
+  if (!dlRes.ok) die(1, `failed to download video: HTTP ${dlRes.status}`);
+  const videoBuf = Buffer.from(await dlRes.arrayBuffer());
+  await mkdir(dirname(target), { recursive: true }).catch(() => {});
+  await writeFile(target, videoBuf);
+
+  if (args.json) {
+    json({
+      ok: true,
+      requestId: doneData.requestId,
+      path: target,
+      filename,
+      elapsed: doneData.elapsed,
+      video: doneData.video,
+      revisedPrompt: doneData.revisedPrompt,
+    });
+  } else {
+    out(color.green("✓ ") + target);
+    if (doneData.elapsed) out(color.dim(`elapsed ${doneData.elapsed}s`));
+    if (doneData.revisedPrompt) out(color.dim(`revised: ${String(doneData.revisedPrompt).slice(0, 80)}`));
+  }
+}
+
+function renderBar(pct: number): string {
+  const width = 20;
+  const filled = Math.round((pct / 100) * width);
+  return color.green("█".repeat(filled)) + color.dim("░".repeat(width - filled));
+}
|
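In the command above, `parseInt(String(args.duration)) || 5` silently turns inputs such as `0` or `abc` into the default of 5, and `7.9` into 7. The 10 second cap for reference-to-video appears in the help text but is not checked in this file (the server may enforce it). The sketch below shows a stricter parse under those observations. `parseDuration` is a hypothetical function, not part of the package, and throwing instead of calling `die` is a choice made here so the example stands alone.

```javascript
// Hypothetical stricter parser; not part of the ima2-gen source.
// refCount is the number of --ref attachments (2 or more means reference-to-video).
function parseDuration(raw, refCount = 0) {
  const text = String(raw).trim();
  // Accept whole seconds only, so "abc", "7.9" and "10s" are rejected
  // instead of being coerced.
  if (!/^\d+$/.test(text)) {
    throw new Error(`--duration must be a whole number of seconds, got "${raw}"`);
  }
  const seconds = Number(text);
  const max = refCount >= 2 ? 10 : 15;
  if (seconds < 1 || seconds > max) {
    throw new Error(`--duration must be between 1 and ${max}`);
  }
  return seconds;
}

console.log(parseDuration("5"));
console.log(parseDuration("10", 3));
```

Rejecting malformed input surfaces typos early; the trade-off is that a caller who relied on the silent default of 5 would now see an error.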
package/bin/ima2.js
CHANGED
@@ -62,11 +62,13 @@ async function setup() {
     const rl = createInterface({ input: process.stdin, output: process.stdout });
     console.log("\n ima2-gen — GPT Image 2 Generator\n");
     console.log(" Choose authentication method:\n");
-    console.log(" 1)
-    console.log(" 2) OAuth
-
+    console.log(" 1) GPT OAuth — login with ChatGPT account (free, images only)");
+    console.log(" 2) Grok OAuth — login with xAI/Grok account (images + video)");
+    console.log(" 3) Both — GPT OAuth + Grok OAuth");
+    console.log(" 4) API Key — paste your OpenAI API key (paid)\n");
+    const choice = await rl.question(" Enter 1-4: ");
     const config = loadConfig();
-    if (choice.trim() === "
+    if (choice.trim() === "4") {
         const key = await rl.question(" OpenAI API Key: ");
         if (!key.startsWith("sk-")) {
             console.log(" Invalid API key format. Expected sk-...");
@@ -78,12 +80,62 @@ async function setup() {
         saveConfig(config);
         console.log("\n API key saved. Starting server...\n");
     }
+    else if (choice.trim() === "2") {
+        config.provider = "grok";
+        config.oauth = config.oauth || {};
+        config.oauth.disableAutoStart = true;
+        delete config.apiKey;
+        saveConfig(config);
+        console.log("\n Starting Grok OAuth login...\n");
+        try {
+            execSync(`node ${JSON.stringify(join(ROOT, "bin", "ima2.js"))} grok login`, { stdio: "inherit" });
+        }
+        catch {
+            console.log("\n Grok login failed or cancelled. You can retry with 'ima2 grok login'.\n");
+            rl.close();
+            process.exit(1);
+        }
+        console.log(" Grok configured. Run 'ima2 serve' to start.\n");
+    }
+    else if (choice.trim() === "3") {
+        config.provider = "oauth";
+        delete config.apiKey;
+        if (config.oauth)
+            delete config.oauth.disableAutoStart;
+        saveConfig(config);
+        console.log("\n Setting up both GPT OAuth + Grok OAuth...\n");
+        // GPT OAuth
+        const auth = detectCodexAuth();
+        if (!auth.authed) {
+            console.log(" Running GPT OAuth login...\n");
+            try {
+                execSync(`${resolveBin("npx")} @openai/codex login`, { stdio: "inherit" });
+            }
+            catch {
+                console.log("\n GPT login failed. Continuing with Grok...\n");
+            }
+        }
+        else {
+            console.log(` GPT OAuth session found.\n`);
+        }
+        // Grok OAuth
+        console.log(" Running Grok OAuth login...\n");
+        try {
+            execSync(`node ${JSON.stringify(join(ROOT, "bin", "ima2.js"))} grok login`, { stdio: "inherit" });
+        }
+        catch {
+            console.log("\n Grok login failed. You can retry with 'ima2 grok login'.\n");
+        }
+        console.log(" Both providers configured.\n");
+    }
     else {
+        // Default: GPT OAuth (choice 1 or anything else)
         config.provider = "oauth";
+        config.oauth = config.oauth || {};
+        config.oauth.disableAutoStart = false;
         delete config.apiKey;
         saveConfig(config);
         console.log("\n Starting OAuth login...\n");
-        // Check if codex auth exists (file OR keyring via `codex login status`)
         const auth = detectCodexAuth();
         const hasAuth = auth.authed;
         if (!hasAuth) {
|
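The three OAuth branches of `setup()` differ mainly in how they set `provider` and `oauth.disableAutoStart`. The sketch below restates that mapping as a pure function, assuming a hypothetical name `applyOAuthChoice`. It covers only choices 1 to 3 as shown in the hunk above; the API-key branch (choice 4) is only partly visible in this diff and is left out. Unlike the diffed code, it copies the config instead of mutating it and always creates an `oauth` object, both of which are simplifications made here.

```javascript
// Hypothetical restatement of the OAuth branches in setup(); not package code.
function applyOAuthChoice(config, choice) {
  const picked = String(choice).trim();
  if (picked === "4") {
    throw new Error("API-key setup is outside the scope of this sketch");
  }
  // Work on a copy so the caller's object is left untouched.
  const next = { ...config, oauth: { ...(config.oauth || {}) } };
  delete next.apiKey; // every OAuth branch in the diff drops a stored API key
  switch (picked) {
    case "2": // Grok OAuth only
      next.provider = "grok";
      next.oauth.disableAutoStart = true;
      break;
    case "3": // GPT OAuth + Grok OAuth
      next.provider = "oauth";
      delete next.oauth.disableAutoStart;
      break;
    default: // choice 1 or anything else: GPT OAuth
      next.provider = "oauth";
      next.oauth.disableAutoStart = false;
  }
  return next;
}

console.log(applyOAuthChoice({ apiKey: "sk-example" }, "2"));
console.log(applyOAuthChoice({}, "1"));
```

Separating the config mapping from the interactive prompts and `execSync` calls would let the branch logic be checked without launching a login flow.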
@@ -211,6 +263,7 @@ function showHelp() {
 
 Client commands (require a running 'ima2 serve'):
 gen <prompt> Generate image(s) from prompt (ima2 gen --help)
+video <prompt> Generate video via Grok (ima2 video --help)
 edit <file> Edit an existing image (ima2 edit --help)
 ls List recent history (ima2 ls --help)
 show <name> Show one history item (ima2 show --help)
@@ -256,6 +309,7 @@ function showHelp() {
 ima2 serve --dev Start with verbose server diagnostics
 ima2 gen "a shiba in space" Generate from CLI
 ima2 gen "merge" --ref a.png --ref b.png -q high -o out.png
+ima2 video "a cat playing piano" --duration 10
 ima2 ls -n 10 Last 10 generations
 ima2 skill Print agent usage skill
 ima2 capabilities --json Inspect supported models/options
@@ -271,7 +325,7 @@ if (args.includes("-v") || args.includes("--version")) {
     process.exit(0);
 }
 if ((!command || args.includes("-h") || args.includes("--help"))
-    && !["doctor", "gen", "edit", "ls", "show", "ps", "cancel", "session", "history", "prompt", "multimode", "node", "annotate", "canvas-versions", "metadata", "comfy", "cardnews", "inflight", "storage", "billing", "providers", "oauth", "grok", "config", "defaults", "capabilities", "skill", "ping"].includes(command)) {
+    && !["doctor", "gen", "video", "edit", "ls", "show", "ps", "cancel", "session", "history", "prompt", "multimode", "node", "annotate", "canvas-versions", "metadata", "comfy", "cardnews", "inflight", "storage", "billing", "providers", "oauth", "grok", "config", "defaults", "capabilities", "skill", "ping"].includes(command)) {
     showHelp();
     process.exit(command ? 0 : 1);
 }
@@ -314,6 +368,7 @@ switch (command) {
         }
         break;
     case "gen":
+    case "video":
     case "edit":
     case "ls":
     case "show":
|