@sogni-ai/sogni-creative-agent-skill 3.6.0 → 3.6.1

package/CHANGELOG.md CHANGED
@@ -5,6 +5,30 @@ All notable changes to this project are documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [3.6.1] - 2026-06-15
+
+ ### Changed
+
+ - **Hosted-API guidance now recommends client-side planning over hosted re-planning.** The skill is driven by a
+ frontier LLM that out-plans Sogni's hosted planning model, so steering it to delegate planning through
+ `--api-chat` was a downgrade. `SKILL.md`, `references/hosted-api.md`, and `README.md` now tell the calling agent
+ to plan and select tools itself, use `--api-workflow` with an explicit `--workflow-input` step graph for durable
+ multi-step work (the server executes the authored plan without re-planning), and reserve `--api-chat` /
+ `--durable-chat` for deliberately offloading a long server-side loop or uploading several local files in one
+ turn. `--api-chat` and all hosted modes remain fully supported — only the recommended default changed.
+
+ ### Fixed
+
+ - **Local Seedance reference images via `-c`/`--context` now auto-upload in direct CLI mode.** Local
+ loose-reference images were rejected with an HTTPS-only error that pushed users onto the unreliable
+ `--api-chat` / `--durable-chat` path; local `--ref-audio` and `--ref-video` already auto-uploaded through the
+ `/v2` presigned-POST flow, so images were the only modality missing it and one broken branch cascaded into
+ downstream failures (vision 1024px cap, HTTP timeout, no-content, missing durable SDK package). Local
+ `-c`/`--context` images now upload through the same `/v2/image` presigned flow and forward as Sogni-hosted URLs.
+ MIME type is resolved by magic-byte sniffing (falling back to extension), and the accepted set
+ (PNG/JPEG/WebP/GIF) mirrors the backend's `allowedContentTypes`. Adds local-PNG-upload and mislabeled-WebP
+ byte-sniff regression tests; verified end-to-end with a real Seedance 2.0 render from a local `-c` PNG.
+
  ## [3.6.0] - 2026-06-12
 
  ### Added
package/README.md CHANGED
@@ -600,7 +600,9 @@ Stored at `~/.config/sogni/personality.txt`.
 
  Hosted API modes require `SOGNI_API_KEY`.
 
- - **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools; best for text-first natural-language workflows. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
+ **Choosing a mode.** Whatever is driving this CLI is usually a more capable planner than Sogni's hosted model, so prefer to plan yourself and let the server execute: direct-to-SDK flags for one-shot work, and `--api-workflow` with an explicit `--workflow-input` step graph for multi-step/durable work (you author the plan; the server runs it durably with replay — no hosted re-planning). Use `--api-chat` / `--durable-chat` when you deliberately want the hosted model to own a long server-side loop, or when several local files must be uploaded for one turn.
+
+ - **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools and **delegates planning/tool-selection to the hosted model** — reach for it when the caller is a thin client, when you want the hosted model to drive a long server-side tool loop, or when several local files must be uploaded for one turn. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
  - **Sogni Intelligence controls** include `--task-profile general|coding|reasoning`, `--max-tokens`, and `--thinking` / `--no-thinking`, which forward to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. Use `--list-api-models` or `--get-api-model <id>` to inspect `/v1/models`.
  - **`--durable-chat`** starts a hosted `/v1/chat/runs` record through the SDK transport. Set `SOGNI_SKILL_USE_SDK_TRANSPORT=1` before using it. The CLI streams assistant deltas and de-duplicated per-job progress / ETA / result lines from hosted run events.
  - **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Requests carry `input.steps` plus snake_case controls such as `token_type`, `media_references`, `max_estimated_capacity_units`, and `confirm_cost`.
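As a rough illustration of the explicit step graph passed through `--workflow-input`, the sketch below assembles a two-step plan in JavaScript and checks that every `dependsOn` binding points at an earlier step. The names `steps`, `toolName`, `arguments`, `dependsOn`, `sourceStepId`, `targetArgument`, and `transform: "artifact_url"` are taken from the SKILL.md guidance in this release. The `id` field, the tool names `generate_image` / `generate_video`, the argument keys, and the array shape of `dependsOn` are assumptions made for the example, not the documented contract.

```javascript
// Hypothetical plan: step ids, tool names, and argument keys are illustrative only.
const plan = {
  steps: [
    {
      id: 'keyframe',
      toolName: 'generate_image', // assumed tool name
      arguments: { prompt: 'A graphite robot sketch on a drafting table' },
      dependsOn: [],
    },
    {
      id: 'clip',
      toolName: 'generate_video', // assumed tool name
      arguments: { prompt: 'The camera slowly pushes in' },
      dependsOn: [
        { sourceStepId: 'keyframe', targetArgument: 'referenceImageUrl', transform: 'artifact_url' },
      ],
    },
  ],
};

// Return a list of problems; an empty list means every binding refers to an earlier step.
function checkStepGraph({ steps }) {
  const seen = new Set();
  const problems = [];
  for (const step of steps) {
    if (seen.has(step.id)) problems.push(`duplicate step id "${step.id}"`);
    for (const dep of step.dependsOn ?? []) {
      if (!seen.has(dep.sourceStepId)) {
        problems.push(`step "${step.id}" depends on unknown or later step "${dep.sourceStepId}"`);
      }
    }
    seen.add(step.id);
  }
  return problems;
}

console.log(checkStepGraph(plan)); // []
console.log(JSON.stringify(plan, null, 2)); // could be saved as plan.json for --workflow-input @plan.json
```

Checking the graph locally before submission is a design choice of this sketch: it catches a dangling binding without spending a server round-trip. Whether the server enforces the same ordering rule is not stated in the source.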
package/SKILL.md CHANGED
@@ -2,7 +2,7 @@
  name: sogni-creative-agent-skill
  description: "Sogni Creative Agent Skill: agent skill and CLI for image, video, and music generation using Sogni AI's decentralized GPU network. Supports personas (named people with saved reference photos and voice clips), persistent memories, custom personality, style transfer, angle synthesis, Seedance/LTX/WAN video, music/lyrics, hosted chat, durable workflows, replay records, and multi-step creative workflows. Ask the agent to \"draw\", \"generate\", \"create an image\", \"make a video/animate\", \"make music\", \"apply a style\", or \"generate me as a superhero\"."
  metadata:
- version: "3.6.0"
+ version: "3.6.1"
  homepage: https://sogni.ai
  openclaw:
  emoji: "🎨"
@@ -98,28 +98,31 @@ sogni-agent -o /tmp/cat.png "a cat wearing a hat" # ✗ avoid — user can't
  - Media listing for `--list-media` (read): `~/.openclaw/media/inbound`, falling back to the legacy `~/.clawdbot/media/inbound` when only it exists (`SOGNI_MEDIA_INBOUND_DIR`)
  - Custom ffmpeg binary: `FFMPEG_PATH`
 
- ## Recommended path: hosted Sogni Intelligence endpoints
+ ## Recommended path: you plan, Sogni executes
 
- For any natural-language creative request that should be planned, multi-step, resumable, or benefit from server-side tool selection and repair, prefer the hosted endpoints over direct-to-SDK flags. **Read [`references/hosted-api.md`](./references/hosted-api.md) first** for the full contract (tool surfaces, durable workflows, templates, replays, Seedance reference modes, media-reference uploads, cost controls):
+ You (the calling LLM) are almost always more capable than Sogni's hosted planning model, so **do the planning and tool selection yourself** and let the hosted endpoints do what only the server can: run on the GPU network, persist assets/manifests, orchestrate durable multi-step runs with replay, and apply structured-contract repair. Don't flatten a rich request into a single natural-language string and hand planning back to a weaker model. Match the mode to the work:
 
- ```bash
- # Natural-language creative request (LLM picks the tool, dispatches, repairs)
- sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
+ - **One-shot generation** → direct-to-SDK flags (the Core Commands below). You already know the tool, model, and prompt — just run it. No LLM round-trip, lowest latency/cost.
+ - **Multi-step / durable / resumable** → `--api-workflow` with an explicit step graph via `--workflow-input <json|@path>`. *You* author the exact plan — `steps[]` with `toolName`, `arguments`, and `dependsOn` bindings (e.g. `sourceStepId`, `targetArgument`, `transform: "artifact_url"`) — and the server executes it durably with replay/resumability, **without re-planning through the hosted LLM**. Presets like `--api-workflow storyboard-video` are fine when they already match the request.
+ - **`--api-chat` / `--durable-chat` (hosted LLM owns the loop)** → reserve for when you deliberately *want* the hosted model to drive a long server-side tool loop (saves client round-trips on long async jobs), when structured-contract repair recipes should govern, or when several local files must be uploaded for a single turn (multi-file local upload is only supported here). These delegate planning to the hosted model — choose them on purpose, not by default.
 
- # Durable hosted chat run (persisted event log + SSE stream)
- SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat "Create a launch campaign and animate the hero clip"
+ **Read [`references/hosted-api.md`](./references/hosted-api.md) first** for the full hosted contract (tool surfaces, durable workflows, templates, replays, Seedance reference modes, media-reference uploads, cost controls).
 
- # Durable workflow (resumable, server-orchestrated)
- sogni-agent --api-workflow --video-prompt "The camera slowly pushes in" "A graphite robot sketch on a drafting table"
+ ```bash
+ # One-shot: you pick the tool, the server just executes (see Core Commands below)
+ sogni-agent -q -Q hq -o ./poster.png "Turn the product photo into a launch poster"
 
- # Storyboard → GPT Image 2 sheet → Seedance video, all server-side
+ # Multi-step durable: you author the step graph, the server executes it (no hosted re-planning)
+ sogni-agent --api-workflow --workflow-input @plan.json
  sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq "9:16 bakery launch video"
+
+ # Deliberately hand the whole loop to the hosted model (long async job, or multi local-file upload)
+ sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
+ SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat "Create a launch campaign and animate the hero clip"
  ```
 
  Hosted modes require `SOGNI_API_KEY`. Local file references are uploaded to Sogni media storage and forwarded as retrievable URLs — **use direct CLI mode for private media that must not leave the local machine.**
 
- Use the direct-to-SDK commands below for explicit one-shot generation when you already know the model, dimensions, and prompt.
-
  ## Core Commands (direct-to-SDK)
 
  ```bash
@@ -2,7 +2,7 @@
    "id": "sogni-creative-agent-skill",
    "name": "Sogni Creative Agent Skill — Image, Video & Music Generation",
    "description": "Agent skill and CLI for Sogni AI image, video, and music generation.",
-   "version": "3.6.0",
+   "version": "3.6.1",
    "skills": [
      "."
    ],
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@sogni-ai/sogni-creative-agent-skill",
-   "version": "3.6.0",
+   "version": "3.6.1",
    "description": "Sogni Creative Agent Skill: agent skill and CLI for Sogni AI image, video, and music generation.",
    "type": "module",
    "main": "sogni-agent.mjs",
@@ -7,35 +7,46 @@ All hosted modes require `SOGNI_API_KEY`.
 
  ## When to prefer the hosted path
 
- For any natural-language creative request that benefits from tool selection,
- repair, or durable workflows, prefer the hosted Sogni Intelligence endpoints
- over direct-to-SDK media flags. They are the canonical home for
- OpenAI-compatible chat, server-side creative tool dispatch, Structured
- Contracts v1 (gating policies, repair recipes, prompt contracts), durable chat
- runs, durable workflows, workflow templates, replay, and asset-manifest
- mapping.
+ The thing calling this API is usually a frontier LLM that is **more capable
+ than Sogni's hosted planning model**. So the default split is: *you* do the
+ planning and tool selection, and the hosted endpoints do what only the server
+ can: run on the GPU network, persist assets/manifests, orchestrate durable
+ multi-step runs with replay, and apply Structured Contracts v1 (gating
+ policies, repair recipes, prompt contracts). Routing a request through
+ `--api-chat` so a weaker model re-plans it is usually a downgrade; reach for the
+ hosted *planner* deliberately, not by default.
+
+ - **You already know the single tool + args** → direct-to-SDK flags. Lowest
+ latency/cost, no LLM round-trip.
+ - **Multi-step, durable, resumable** → `--api-workflow` with an explicit
+ `--workflow-input` step graph that *you* author (`steps[]` with `toolName`,
+ `arguments`, and `dependsOn` bindings). The server executes and repairs it
+ deterministically with replay/resumability and **no hosted-LLM re-planning**.
+ This is the best fit when a frontier client drives the work.
+ - **You want the hosted model to own a long loop** → `--api-chat` /
+ `--durable-chat`. Worth it when offloading a long async tool loop server-side
+ saves client round-trips, when structured-contract repair should govern, or
+ when several local files must be uploaded for one turn (only supported here).
 
  ```bash
- # Natural-language creative request (LLM picks the tool, dispatches, repairs)
+ # You author the exact durable plan; the server executes it (no hosted re-planning)
+ sogni-agent --api-workflow --workflow-input @plan.json
+
+ # Storyboard → GPT Image 2 sheet → Seedance, all server-side (preset plan)
+ sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
+ "Create a 9:16 bakery launch video with a neon street-window reveal"
+
+ # Deliberately hand planning to the hosted model (long async job / multi local-file upload)
  sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
 
  # Durable hosted chat run (persisted event log + SSE stream)
  SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat \
  "Create a four-shot launch campaign, generate the key art, and animate the hero clip"
-
- # Multi-step durable workflow (resumable, replay-friendly, server-orchestrated)
- sogni-agent --api-workflow \
- --video-prompt "The camera slowly pushes in" \
- "A graphite robot sketch on a drafting table"
-
- # Storyboard → GPT Image 2 sheet → Seedance, all server-side
- sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
- "Create a 9:16 bakery launch video with a neon street-window reveal"
  ```
 
- The direct-to-SDK flags remain available for explicit one-shot generation when
- you already know the exact model, dimensions, and prompt and don't need LLM
- planning — use them when latency or cost rules out the LLM round-trip.
+ The direct-to-SDK flags remain the right call for explicit one-shot generation
+ when you already know the exact model, dimensions, and prompt; use them
+ whenever latency or cost rules out an LLM round-trip.
 
  ## --api-chat (`POST /v1/chat/completions`)
 
@@ -156,19 +167,24 @@ per video request:
  - **Loose reference mode — `-c/--context` plus optional `--ref-audio` and
  `--ref-video` extras.** Anchor frame intent in the prompt with `@Image1` /
  `@Video1` / `@Audio1` etc. (e.g. *"Use @Image1 as the opening shot
- reference"*). Supports up to 9 image refs, 3 video refs, 3 audio refs, and
- 12 total reference assets per request (canonical caps come from
+ reference"*). Each `-c/--context` image may be a **local file or an HTTPS
+ URL** (PNG, JPEG, WebP, or GIF); local files are uploaded to Sogni media
+ storage automatically, so you do **not** need `--api-chat` / `--durable-chat`
+ just to attach a local loose-reference image. Supports up to 9 image refs, 3 video refs, 3 audio
+ refs, and 12 total reference assets per request (canonical caps come from
  `SEEDANCE_REFERENCE_LIMITS` / `validateSeedanceReferenceCounts()` in
  `@sogni-ai/sogni-intelligence-client/tools`).
 
  Combining `--ref` / `--ref-end` with `-c/--context` on Seedance is rejected
- client-side with an error pointing at the correct mode. In CLI direct-gen
- mode, additional `--ref-audio` / `--ref-video` entries beyond the first must
- be HTTPS URLs (the primary entry can still be a local file); for local
- multi-file Seedance uploads, use `--api-chat` / `--durable-chat` instead.
- Seedance accepts public HTTPS image, video, and audio references that pass the
- CLI URL safety checks; localhost and private-network URLs are rejected before
- forwarding. Audio references must be paired with an image or video reference.
+ client-side with an error pointing at the correct mode. In CLI direct-gen mode,
+ local `-c/--context` images and the primary `--ref-audio` / `--ref-video` are
+ uploaded to Sogni media storage automatically and forwarded as HTTPS URLs; only
+ *additional* `--ref-audio` / `--ref-video` entries beyond the first must already
+ be HTTPS URLs (use `--api-chat` / `--durable-chat` when you need to attach
+ several local audio or video files in one request). Seedance accepts public
+ HTTPS image, video, and audio references that pass the CLI URL safety checks;
+ localhost and private-network URLs are rejected before forwarding. Audio
+ references must be paired with an image or video reference.
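The loose-reference caps and the audio pairing rule described above can be sketched as a small pre-flight check. This is only an illustration under stated assumptions: the numbers (9 images, 3 videos, 3 audios, 12 total) are the ones quoted in the text, while `LIMITS` and `checkLooseReferenceCounts` are hypothetical names and do not reproduce the real `SEEDANCE_REFERENCE_LIMITS` or `validateSeedanceReferenceCounts()` exports, whose shapes are not shown in this diff.

```javascript
// Caps as quoted in the documentation text; the canonical values live in the client package.
const LIMITS = Object.freeze({ images: 9, videos: 3, audios: 3, total: 12 });

// Hypothetical helper: returns human-readable problems, or an empty array when the request fits.
function checkLooseReferenceCounts({ images = [], videos = [], audios = [] } = {}) {
  const problems = [];
  const counts = { images: images.length, videos: videos.length, audios: audios.length };
  for (const kind of ['images', 'videos', 'audios']) {
    if (counts[kind] > LIMITS[kind]) {
      problems.push(`${kind}: ${counts[kind]} exceeds the cap of ${LIMITS[kind]}`);
    }
  }
  const total = counts.images + counts.videos + counts.audios;
  if (total > LIMITS.total) problems.push(`total: ${total} exceeds the cap of ${LIMITS.total}`);
  // Audio may not travel alone: it needs at least one image or video reference.
  if (counts.audios > 0 && counts.images + counts.videos === 0) {
    problems.push('audio references must be paired with an image or video reference');
  }
  return problems;
}

console.log(checkLooseReferenceCounts({ images: ['a.png'], audios: ['voice.mp3'] })); // []
console.log(checkLooseReferenceCounts({ audios: ['voice.mp3'] })); // one problem: unpaired audio
```

Note that a request can stay within every per-kind cap and still exceed the total: 9 images, 3 videos, and 3 audios add up to 15 assets.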
 
  ## Models, replays, and contract debugging
 
package/sogni-agent.mjs CHANGED
@@ -4242,6 +4242,8 @@ function extensionForApiMediaReference(mimeType, kind) {
    const normalized = String(mimeType || '').split(';')[0].trim().toLowerCase();
    if (normalized === 'image/jpeg' || normalized === 'image/jpg') return 'jpg';
    if (normalized === 'image/png') return 'png';
+   if (normalized === 'image/webp') return 'webp';
+   if (normalized === 'image/gif') return 'gif';
    if (normalized === 'audio/mpeg' || normalized === 'audio/mp3') return 'mp3';
    if (normalized === 'audio/mp4' || normalized === 'audio/m4a' || normalized === 'audio/x-m4a') return 'm4a';
    if (normalized === 'audio/wav' || normalized === 'audio/x-wav' || normalized === 'audio/wave') return 'wav';
@@ -6285,6 +6287,84 @@ async function uploadSeedanceReferenceVideoUrl(pathOrUrl, apiKey, index = 0) {
    return uploaded.url;
  }
 
+ // Content types the Sogni media pipeline accepts for image references, mirroring
+ // the `allowedContentTypes` the /v2/image/uploadUrl presigned-POST endpoint
+ // returns. Kept as a constant so the skill validates exactly what the backend
+ // will store rather than imposing a narrower client-side policy.
+ const SEEDANCE_REFERENCE_IMAGE_MIME_TYPES = Object.freeze([
+   'image/png', 'image/jpeg', 'image/webp', 'image/gif',
+ ]);
+
+ // Identify an image's MIME type from its leading bytes (magic numbers). Reliable
+ // because we already hold the buffer, so it works regardless of file extension.
+ function sniffSeedanceReferenceImageMimeType(buffer) {
+   if (!buffer || buffer.length < 4) return null;
+   if (buffer[0] === 0x89 && buffer[1] === 0x50 && buffer[2] === 0x4e && buffer[3] === 0x47) return 'image/png';
+   if (buffer[0] === 0xff && buffer[1] === 0xd8 && buffer[2] === 0xff) return 'image/jpeg';
+   if (
+     buffer.length >= 12
+     && buffer[0] === 0x52 && buffer[1] === 0x49 && buffer[2] === 0x46 && buffer[3] === 0x46
+     && buffer[8] === 0x57 && buffer[9] === 0x45 && buffer[10] === 0x42 && buffer[11] === 0x50
+   ) return 'image/webp';
+   if (buffer[0] === 0x47 && buffer[1] === 0x49 && buffer[2] === 0x46 && buffer[3] === 0x38) return 'image/gif';
+   return null;
+ }
+
+ // Resolve a Seedance loose-reference image's MIME type from its bytes first,
+ // falling back to the file extension. Unsupported files fail fast with an
+ // actionable message instead of uploading bytes the render backend will reject.
+ function seedanceReferenceImageMimeType(pathOrUrl, buffer) {
+   const sniffed = sniffSeedanceReferenceImageMimeType(buffer);
+   if (sniffed) return sniffed;
+   const byPath = mimeTypeForPath(pathOrUrl, '');
+   const normalizedByPath = byPath === 'image/jpg' ? 'image/jpeg' : byPath;
+   if (SEEDANCE_REFERENCE_IMAGE_MIME_TYPES.includes(normalizedByPath)) return normalizedByPath;
+   const err = new Error(
+     `Seedance reference image "${pathOrUrl}" must be a PNG, JPEG, WebP, or GIF file (or an HTTPS URL to one).`,
+   );
+   err.code = 'UNSUPPORTED_MEDIA_TYPE';
+   err.hint = 'Convert the image to PNG, JPEG, or WebP, or pass an HTTPS URL.';
+   err.details = { source: pathOrUrl };
+   throw err;
+ }
+
+ async function prepareSeedanceReferenceImageUploadFile(pathOrUrl, buffer) {
+   const data = Buffer.from(buffer);
+   const mimeType = seedanceReferenceImageMimeType(pathOrUrl, data);
+   const filename = withMediaExtension(
+     mediaFilenameFromSource(pathOrUrl, 'reference-image'),
+     extensionForApiMediaReference(mimeType, 'image'),
+   );
+   const maxBytes = apiMediaReferenceMaxBytes();
+   if (data.length > maxBytes) {
+     const err = new Error(
+       `Seedance reference image "${pathOrUrl}" is ${data.length} bytes, above the ${maxBytes} byte upload limit.`,
+     );
+     err.code = 'MEDIA_REFERENCE_TOO_LARGE';
+     err.details = { source: pathOrUrl, byteLength: data.length, maxBytes };
+     throw err;
+   }
+   return {
+     buffer: data,
+     filename,
+     byteLength: data.length,
+     mimeType,
+   };
+ }
+
+ // Upload a local (non-HTTPS) Seedance loose-reference image and return its
+ // hosted HTTPS download URL. The Client SDK's loose-reference arrays accept only
+ // URL strings, so this is what lets `-c <local image>` work in direct generation
+ // without forcing the user onto the --api-chat / --durable-chat path. Mirrors
+ // uploadSeedanceReferenceAudioUrl / uploadSeedanceReferenceVideoUrl.
+ async function uploadSeedanceReferenceImageUrl(pathOrUrl, apiKey, index = 0) {
+   const ref = { flag: '-c/--context', value: pathOrUrl, kind: 'image' };
+   const buffer = await fetchMediaBuffer(pathOrUrl);
+   const file = await prepareSeedanceReferenceImageUploadFile(pathOrUrl, buffer);
+   const uploaded = await uploadPreparedApiMediaReferenceV2(ref, index, apiKey, file);
+   return uploaded.url;
+ }
+
  async function trimSeedanceV2VSourceVideoBuffer(buffer, sourceLabel, startOffset, requestedDuration) {
    const ffmpegPath = await ensureFfmpegAvailable();
    const tempDir = createTrackedTempDir('sogni-seedance-v2v-');
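To make the bytes-before-extension rule in the hunk above concrete, here is a minimal standalone sketch of the mislabeled-WebP case mentioned in the changelog. The magic-number checks restate the ones in the added code; `sniffImageMimeType`, `resolveImageMimeType`, and the `EXTENSION_MIME` map are hypothetical names, with the map standing in for the CLI's `mimeTypeForPath()` helper, whose behavior is not shown in this diff.

```javascript
// Standalone restatement of the magic-byte checks so the example runs on its own.
function sniffImageMimeType(buffer) {
  if (!buffer || buffer.length < 4) return null;
  const matches = (bytes, offset = 0) =>
    buffer.length >= offset + bytes.length && bytes.every((b, i) => buffer[offset + i] === b);
  if (matches([0x89, 0x50, 0x4e, 0x47])) return 'image/png'; // \x89PNG
  if (matches([0xff, 0xd8, 0xff])) return 'image/jpeg'; // JPEG SOI marker
  if (matches([0x52, 0x49, 0x46, 0x46]) && matches([0x57, 0x45, 0x42, 0x50], 8)) return 'image/webp'; // RIFF....WEBP
  if (matches([0x47, 0x49, 0x46, 0x38])) return 'image/gif'; // GIF8
  return null;
}

// Assumed extension table; a Map avoids accidental hits on Object.prototype keys.
const EXTENSION_MIME = new Map([
  ['png', 'image/png'], ['jpg', 'image/jpeg'], ['jpeg', 'image/jpeg'],
  ['webp', 'image/webp'], ['gif', 'image/gif'],
]);

// Bytes win; the extension is consulted only when sniffing finds nothing.
function resolveImageMimeType(path, buffer) {
  const sniffed = sniffImageMimeType(buffer);
  if (sniffed) return sniffed;
  const byPath = EXTENSION_MIME.get(String(path).split('.').pop().toLowerCase());
  if (byPath) return byPath;
  throw Object.assign(new Error(`Unsupported reference image: ${path}`), { code: 'UNSUPPORTED_MEDIA_TYPE' });
}

// A WebP payload saved under a .png name resolves to WebP, not PNG.
const mislabeledWebp = Buffer.from([0x52, 0x49, 0x46, 0x46, 0, 0, 0, 0, 0x57, 0x45, 0x42, 0x50]);
console.log(resolveImageMimeType('photo.png', mislabeledWebp)); // image/webp
```

Sniffing first means the uploaded content type describes the bytes actually stored, which matters when a file was renamed without being converted.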
@@ -8086,19 +8166,24 @@ async function main() {
    // Seedance loose-reference extras: -c/--context images beyond start/end,
    // plus repeated --ref-audio / --ref-video entries past the first. The
    // Sogni Client SDK accepts only URL arrays for these (createJobRequestMessage),
-   // so extras MUST be HTTPS URLs. For multi-file local uploads, use --api-chat /
-   // --durable-chat where the LLM upload pipeline handles per-file uploads.
+   // so each entry must resolve to an HTTPS URL. HTTPS inputs are forwarded as-is
+   // (SSRF-validated); local files are uploaded to a Sogni-hosted URL first, the
+   // same way the primary --ref-audio / --ref-video locals are handled. This lets
+   // `-c <local image>` work in direct generation without a detour through
+   // --api-chat / --durable-chat.
    if (isSeedanceVideo) {
-     for (const ctxImage of (Array.isArray(options.contextImages) ? options.contextImages : [])) {
+     for (const [ctxIndex, ctxImage] of (Array.isArray(options.contextImages) ? options.contextImages : []).entries()) {
        if (!ctxImage) continue;
-       if (!isHttpsUrl(ctxImage)) {
-         fatalCliError(
-           `Seedance extra image reference "${ctxImage}" must be an HTTPS URL. ` +
-           'Local file uploads beyond --ref / --ref-end are only supported in --api-chat / --durable-chat mode.',
-           { code: 'INVALID_ARGUMENT', details: { flag: '-c/--context', value: ctxImage } },
+       if (isHttpsUrl(ctxImage)) {
+         await appendSafeSeedanceReferenceUrl(seedanceReferenceImageUrls, ctxImage, 'Seedance image reference');
+       } else {
+         const uploadedImageUrl = await uploadSeedanceReferenceImageUrl(
+           ctxImage,
+           creds.SOGNI_API_KEY,
+           ctxIndex,
          );
+         seedanceReferenceImageUrls.push(uploadedImageUrl);
        }
-       await appendSafeSeedanceReferenceUrl(seedanceReferenceImageUrls, ctxImage, 'Seedance image reference');
      }
      for (const [extraAudioIndex, extraAudio] of options.refAudios.entries()) {
        if (!isHttpsUrl(extraAudio)) {
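The routing change in the hunk above (HTTPS inputs validated and forwarded, local paths uploaded first) can be exercised in isolation with injected collaborators. The following is a sketch under stated assumptions: `resolveContextImageUrls`, `checkUrl`, and `uploadLocal` are hypothetical names, the regex only approximates the CLI's `isHttpsUrl()` helper, and the fake uploader returns a made-up `example.invalid` URL where the real code performs a presigned upload.

```javascript
// Assumption: a simple scheme check stands in for the CLI's isHttpsUrl() helper.
const isHttpsUrl = (value) => /^https:\/\//i.test(String(value));

// Resolve mixed -c/--context entries into a URL array.
// `checkUrl` and `uploadLocal` are hypothetical injected stand-ins for the
// CLI's URL safety check and its presigned-upload helper.
async function resolveContextImageUrls(contextImages, { checkUrl, uploadLocal }) {
  const urls = [];
  const entries = Array.isArray(contextImages) ? contextImages : [];
  for (const [index, entry] of entries.entries()) {
    if (!entry) continue; // skip empty slots, as the loop in the hunk does
    if (isHttpsUrl(entry)) {
      await checkUrl(entry); // remote input: validate, then forward unchanged
      urls.push(entry);
    } else {
      urls.push(await uploadLocal(entry, index)); // local input: upload first
    }
  }
  return urls;
}

// Fake collaborators so the sketch runs offline.
const fakeDeps = {
  checkUrl: async () => {},
  uploadLocal: async (path, index) => `https://media.example.invalid/${index}-${path.split('/').pop()}`,
};

const resolved = resolveContextImageUrls(['./hero.png', '', 'https://example.com/ref.jpg'], fakeDeps);
resolved.then((urls) => console.log(urls));
```

Because the index comes from the position in the original array, a skipped empty slot still consumes an index. That matches the `ctxIndex` passed to the uploader in the hunk.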
package/version.mjs CHANGED
@@ -1 +1 @@
- export const PACKAGE_VERSION = '3.6.0';
+ export const PACKAGE_VERSION = '3.6.1';