@sogni-ai/sogni-creative-agent-skill 3.6.0 → 3.6.1
- package/CHANGELOG.md +24 -0
- package/README.md +3 -1
- package/SKILL.md +16 -13
- package/openclaw.plugin.json +1 -1
- package/package.json +1 -1
- package/references/hosted-api.md +45 -29
- package/sogni-agent.mjs +94 -9
- package/version.mjs +1 -1
package/CHANGELOG.md
CHANGED
@@ -5,6 +5,30 @@ All notable changes to this project are documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [3.6.1] - 2026-06-15
+
+### Changed
+
+- **Hosted-API guidance now recommends client-side planning over hosted re-planning.** The skill is driven by a
+frontier LLM that out-plans Sogni's hosted planning model, so steering it to delegate planning through
+`--api-chat` was a downgrade. `SKILL.md`, `references/hosted-api.md`, and `README.md` now tell the calling agent
+to plan and select tools itself, use `--api-workflow` with an explicit `--workflow-input` step graph for durable
+multi-step work (the server executes the authored plan without re-planning), and reserve `--api-chat` /
+`--durable-chat` for deliberately offloading a long server-side loop or uploading several local files in one
+turn. `--api-chat` and all hosted modes remain fully supported — only the recommended default changed.
+
+### Fixed
+
+- **Local Seedance reference images via `-c`/`--context` now auto-upload in direct CLI mode.** Local
+loose-reference images were rejected with an HTTPS-only error that pushed users onto the unreliable
+`--api-chat` / `--durable-chat` path; local `--ref-audio` and `--ref-video` already auto-uploaded through the
+`/v2` presigned-POST flow, so images were the only modality missing it and one broken branch cascaded into
+downstream failures (vision 1024px cap, HTTP timeout, no-content, missing durable SDK package). Local
+`-c`/`--context` images now upload through the same `/v2/image` presigned flow and forward as Sogni-hosted URLs.
+MIME type is resolved by magic-byte sniffing (falling back to extension), and the accepted set
+(PNG/JPEG/WebP/GIF) mirrors the backend's `allowedContentTypes`. Adds local-PNG-upload and mislabeled-WebP
+byte-sniff regression tests; verified end-to-end with a real Seedance 2.0 render from a local `-c` PNG.
+
 ## [3.6.0] - 2026-06-12
 
 ### Added
package/README.md
CHANGED
@@ -600,7 +600,9 @@ Stored at `~/.config/sogni/personality.txt`.
 
 Hosted API modes require `SOGNI_API_KEY`.
 
-
+**Choosing a mode.** Whatever is driving this CLI is usually a more capable planner than Sogni's hosted model, so prefer to plan yourself and let the server execute: direct-to-SDK flags for one-shot work, and `--api-workflow` with an explicit `--workflow-input` step graph for multi-step/durable work (you author the plan; the server runs it durably with replay — no hosted re-planning). Use `--api-chat` / `--durable-chat` when you deliberately want the hosted model to own a long server-side loop, or when several local files must be uploaded for one turn.
+
+- **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools and **delegates planning/tool-selection to the hosted model** — reach for it when the caller is a thin client, when you want the hosted model to drive a long server-side tool loop, or when several local files must be uploaded for one turn. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
 - **Sogni Intelligence controls** include `--task-profile general|coding|reasoning`, `--max-tokens`, and `--thinking` / `--no-thinking`, which forward to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. Use `--list-api-models` or `--get-api-model <id>` to inspect `/v1/models`.
 - **`--durable-chat`** starts a hosted `/v1/chat/runs` record through the SDK transport. Set `SOGNI_SKILL_USE_SDK_TRANSPORT=1` before using it. The CLI streams assistant deltas and de-duplicated per-job progress / ETA / result lines from hosted run events.
 - **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Requests carry `input.steps` plus snake_case controls such as `token_type`, `media_references`, `max_estimated_capacity_units`, and `confirm_cost`.
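Reviewer sketch: the README change points callers at `--api-workflow` with an explicit `--workflow-input` step graph, and the SKILL.md change in this release names the fields (`steps[]`, `toolName`, `arguments`, and `dependsOn` bindings with `sourceStepId`, `targetArgument`, `transform: "artifact_url"`). The snippet below shows one way such a plan might be assembled and sanity-checked before it is written to `plan.json`. The step `id` key, the tool names, the argument names, and the array shape of `dependsOn` are assumptions made for illustration; the authoritative contract is in `references/hosted-api.md`.

```javascript
// Hypothetical two-step plan: generate a still, then animate it.
// `id`, the tool names, and the argument names are illustrative assumptions.
const plan = {
  steps: [
    {
      id: 'poster',
      toolName: 'generate_image', // assumed tool name
      arguments: { prompt: 'Neon bakery storefront at dusk, 9:16' },
    },
    {
      id: 'hero-clip',
      toolName: 'generate_video', // assumed tool name
      arguments: { prompt: 'The camera slowly pushes in' },
      dependsOn: [
        {
          sourceStepId: 'poster',
          targetArgument: 'referenceImageUrl', // assumed argument name
          transform: 'artifact_url',
        },
      ],
    },
  ],
};

// Client-side sanity check: every binding must point at a step that appears
// earlier in the list, and every step must name a tool.
function findPlanProblems(candidate) {
  const problems = [];
  const seen = new Set();
  for (const step of candidate.steps) {
    if (!step.toolName) problems.push(`step "${step.id}" has no toolName`);
    for (const binding of step.dependsOn ?? []) {
      if (!seen.has(binding.sourceStepId)) {
        problems.push(`step "${step.id}" depends on unknown or later step "${binding.sourceStepId}"`);
      }
    }
    seen.add(step.id);
  }
  return problems;
}

const problems = findPlanProblems(plan);
if (problems.length > 0) throw new Error(problems.join('; '));

// Save this output as plan.json, then run:
//   sogni-agent --api-workflow --workflow-input @plan.json
console.log(JSON.stringify(plan, null, 2));
```

Checking the graph locally is optional; it only catches authoring mistakes such as a binding that points at a later step before the request reaches the server.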
package/SKILL.md
CHANGED
@@ -2,7 +2,7 @@
 name: sogni-creative-agent-skill
 description: "Sogni Creative Agent Skill: agent skill and CLI for image, video, and music generation using Sogni AI's decentralized GPU network. Supports personas (named people with saved reference photos and voice clips), persistent memories, custom personality, style transfer, angle synthesis, Seedance/LTX/WAN video, music/lyrics, hosted chat, durable workflows, replay records, and multi-step creative workflows. Ask the agent to \"draw\", \"generate\", \"create an image\", \"make a video/animate\", \"make music\", \"apply a style\", or \"generate me as a superhero\"."
 metadata:
-version: "3.6.
+version: "3.6.1"
 homepage: https://sogni.ai
 openclaw:
 emoji: "🎨"
@@ -98,28 +98,31 @@ sogni-agent -o /tmp/cat.png "a cat wearing a hat" # ✗ avoid — user can't
 - Media listing for `--list-media` (read): `~/.openclaw/media/inbound`, falling back to the legacy `~/.clawdbot/media/inbound` when only it exists (`SOGNI_MEDIA_INBOUND_DIR`)
 - Custom ffmpeg binary: `FFMPEG_PATH`
 
-## Recommended path:
+## Recommended path: you plan, Sogni executes
 
-
+You (the calling LLM) are almost always more capable than Sogni's hosted planning model, so **do the planning and tool selection yourself** and let the hosted endpoints do what only the server can — run on the GPU network, persist assets/manifests, orchestrate durable multi-step runs with replay, and apply structured-contract repair. Don't flatten a rich request into a single natural-language string and hand planning back to a weaker model. Match the mode to the work:
 
-
-
-
+- **One-shot generation** → direct-to-SDK flags (the Core Commands below). You already know the tool, model, and prompt — just run it. No LLM round-trip, lowest latency/cost.
+- **Multi-step / durable / resumable** → `--api-workflow` with an explicit step graph via `--workflow-input <json|@path>`. *You* author the exact plan — `steps[]` with `toolName`, `arguments`, and `dependsOn` bindings (e.g. `sourceStepId`, `targetArgument`, `transform: "artifact_url"`) — and the server executes it durably with replay/resumability, **without re-planning through the hosted LLM**. Presets like `--api-workflow storyboard-video` are fine when they already match the request.
+- **`--api-chat` / `--durable-chat` (hosted LLM owns the loop)** → reserve for when you deliberately *want* the hosted model to drive a long server-side tool loop (saves client round-trips on long async jobs), when structured-contract repair recipes should govern, or when several local files must be uploaded for a single turn (multi-file local upload is only supported here). These delegate planning to the hosted model — choose them on purpose, not by default.
 
-
-SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat "Create a launch campaign and animate the hero clip"
+**Read [`references/hosted-api.md`](./references/hosted-api.md) first** for the full hosted contract (tool surfaces, durable workflows, templates, replays, Seedance reference modes, media-reference uploads, cost controls).
 
-
-
+```bash
+# One-shot: you pick the tool, the server just executes (see Core Commands below)
+sogni-agent -q -Q hq -o ./poster.png "Turn the product photo into a launch poster"
 
-#
+# Multi-step durable: you author the step graph, the server executes it (no hosted re-planning)
+sogni-agent --api-workflow --workflow-input @plan.json
 sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq "9:16 bakery launch video"
+
+# Deliberately hand the whole loop to the hosted model (long async job, or multi local-file upload)
+sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
+SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat "Create a launch campaign and animate the hero clip"
 ```
 
 Hosted modes require `SOGNI_API_KEY`. Local file references are uploaded to Sogni media storage and forwarded as retrievable URLs — **use direct CLI mode for private media that must not leave the local machine.**
 
-Use the direct-to-SDK commands below for explicit one-shot generation when you already know the model, dimensions, and prompt.
-
 ## Core Commands (direct-to-SDK)
 
 ```bash
package/openclaw.plugin.json
CHANGED
@@ -2,7 +2,7 @@
   "id": "sogni-creative-agent-skill",
   "name": "Sogni Creative Agent Skill — Image, Video & Music Generation",
   "description": "Agent skill and CLI for Sogni AI image, video, and music generation.",
-  "version": "3.6.
+  "version": "3.6.1",
   "skills": [
     "."
   ],
package/package.json
CHANGED
package/references/hosted-api.md
CHANGED
@@ -7,35 +7,46 @@ All hosted modes require `SOGNI_API_KEY`.
 
 ## When to prefer the hosted path
 
-
-
-
-
-
-
-
+The thing calling this API is usually a frontier LLM that is **more capable
+than Sogni's hosted planning model**. So the default split is: *you* do the
+planning and tool selection, and the hosted endpoints do what only the server
+can — run on the GPU network, persist assets/manifests, orchestrate durable
+multi-step runs with replay, and apply Structured Contracts v1 (gating
+policies, repair recipes, prompt contracts). Routing a request through
+`--api-chat` so a weaker model re-plans it is usually a downgrade; reach for the
+hosted *planner* deliberately, not by default.
+
+- **You already know the single tool + args** → direct-to-SDK flags. Lowest
+latency/cost, no LLM round-trip.
+- **Multi-step, durable, resumable** → `--api-workflow` with an explicit
+`--workflow-input` step graph that *you* author (`steps[]` with `toolName`,
+`arguments`, and `dependsOn` bindings). The server executes and repairs it
+deterministically with replay/resumability and **no hosted-LLM re-planning**.
+This is the best fit when a frontier client drives the work.
+- **You want the hosted model to own a long loop** → `--api-chat` /
+`--durable-chat`. Worth it when offloading a long async tool loop server-side
+saves client round-trips, when structured-contract repair should govern, or
+when several local files must be uploaded for one turn (only supported here).
 
 ```bash
-#
+# You author the exact durable plan; the server executes it (no hosted re-planning)
+sogni-agent --api-workflow --workflow-input @plan.json
+
+# Storyboard → GPT Image 2 sheet → Seedance, all server-side (preset plan)
+sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
+"Create a 9:16 bakery launch video with a neon street-window reveal"
+
+# Deliberately hand planning to the hosted model (long async job / multi local-file upload)
 sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
 
 # Durable hosted chat run (persisted event log + SSE stream)
 SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat \
 "Create a four-shot launch campaign, generate the key art, and animate the hero clip"
-
-# Multi-step durable workflow (resumable, replay-friendly, server-orchestrated)
-sogni-agent --api-workflow \
---video-prompt "The camera slowly pushes in" \
-"A graphite robot sketch on a drafting table"
-
-# Storyboard → GPT Image 2 sheet → Seedance, all server-side
-sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
-"Create a 9:16 bakery launch video with a neon street-window reveal"
 ```
 
-The direct-to-SDK flags remain
-you already know the exact model, dimensions, and prompt
-
+The direct-to-SDK flags remain the right call for explicit one-shot generation
+when you already know the exact model, dimensions, and prompt — use them
+whenever latency or cost rules out an LLM round-trip.
 
 ## --api-chat (`POST /v1/chat/completions`)
 
@@ -156,19 +167,24 @@ per video request:
 - **Loose reference mode — `-c/--context` plus optional `--ref-audio` and
 `--ref-video` extras.** Anchor frame intent in the prompt with `@Image1` /
 `@Video1` / `@Audio1` etc. (e.g. *"Use @Image1 as the opening shot
-reference"*).
-
+reference"*). Each `-c/--context` image may be a **local file or an HTTPS
+URL** (PNG, JPEG, WebP, or GIF) — local files are uploaded to Sogni media
+storage automatically, so you do **not** need `--api-chat` / `--durable-chat`
+just to attach a local loose-reference image. Supports up to 9 image refs, 3 video refs, 3 audio
+refs, and 12 total reference assets per request (canonical caps come from
 `SEEDANCE_REFERENCE_LIMITS` / `validateSeedanceReferenceCounts()` in
 `@sogni-ai/sogni-intelligence-client/tools`).
 
 Combining `--ref` / `--ref-end` with `-c/--context` on Seedance is rejected
-client-side with an error pointing at the correct mode. In CLI direct-gen
-
-
-
-
-
-
+client-side with an error pointing at the correct mode. In CLI direct-gen mode,
+local `-c/--context` images and the primary `--ref-audio` / `--ref-video` are
+uploaded to Sogni media storage automatically and forwarded as HTTPS URLs; only
+*additional* `--ref-audio` / `--ref-video` entries beyond the first must already
+be HTTPS URLs (use `--api-chat` / `--durable-chat` when you need to attach
+several local audio or video files in one request). Seedance accepts public
+HTTPS image, video, and audio references that pass the CLI URL safety checks;
+localhost and private-network URLs are rejected before forwarding. Audio
+references must be paired with an image or video reference.
 
 ## Models, replays, and contract debugging
 
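Reviewer sketch: the loose-reference rules added to `references/hosted-api.md` (9 image, 3 video, 3 audio, 12 total; only extra audio/video entries past the first must already be HTTPS; audio needs an image or video partner) can be pre-checked on the client roughly as follows. This is not the package's `validateSeedanceReferenceCounts()`; `QUOTED_LIMITS` and `checkLooseReferences` are hypothetical names, and the assumption that index 0 of each list is the "primary" entry is mine.

```javascript
// Caps as quoted in references/hosted-api.md. The canonical values live in
// SEEDANCE_REFERENCE_LIMITS (@sogni-ai/sogni-intelligence-client/tools); this
// frozen copy and checkLooseReferences() are hypothetical stand-ins.
const QUOTED_LIMITS = Object.freeze({ images: 9, videos: 3, audios: 3, total: 12 });

const looksLikeHttpsUrl = (value) => /^https:\/\//i.test(String(value));

// Assumption: each list is in command-line order, so index 0 is the primary
// --ref-video / --ref-audio entry that the CLI uploads when it is a local file.
function checkLooseReferences({ images = [], videos = [], audios = [] }) {
  const errors = [];
  for (const [kind, list] of Object.entries({ images, videos, audios })) {
    if (list.length > QUOTED_LIMITS[kind]) {
      errors.push(`too many ${kind}: ${list.length} > ${QUOTED_LIMITS[kind]}`);
    }
  }
  const total = images.length + videos.length + audios.length;
  if (total > QUOTED_LIMITS.total) {
    errors.push(`too many references: ${total} > ${QUOTED_LIMITS.total}`);
  }
  // Local images are fine (auto-uploaded as of 3.6.1); only audio/video
  // entries past the first must already be HTTPS URLs.
  for (const [kind, list] of [['videos', videos], ['audios', audios]]) {
    for (const value of list.slice(1)) {
      if (!looksLikeHttpsUrl(value)) {
        errors.push(`additional ${kind} entry "${value}" must be an HTTPS URL`);
      }
    }
  }
  if (audios.length > 0 && images.length + videos.length === 0) {
    errors.push('audio references must be paired with an image or video reference');
  }
  return errors;
}

// Two local images plus one local audio clip: no errors expected.
console.log(checkLooseReferences({ images: ['./hero.png', './logo.webp'], audios: ['./voice.mp3'] }));
// A second local audio clip is flagged, since only the first is auto-uploaded.
console.log(checkLooseReferences({ images: ['./hero.png'], audios: ['./voice.mp3', './music.mp3'] }));
```

The CLI additionally rejects localhost and private-network URLs before forwarding; that check is left out of this sketch.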
package/sogni-agent.mjs
CHANGED
@@ -4242,6 +4242,8 @@ function extensionForApiMediaReference(mimeType, kind) {
   const normalized = String(mimeType || '').split(';')[0].trim().toLowerCase();
   if (normalized === 'image/jpeg' || normalized === 'image/jpg') return 'jpg';
   if (normalized === 'image/png') return 'png';
+  if (normalized === 'image/webp') return 'webp';
+  if (normalized === 'image/gif') return 'gif';
   if (normalized === 'audio/mpeg' || normalized === 'audio/mp3') return 'mp3';
   if (normalized === 'audio/mp4' || normalized === 'audio/m4a' || normalized === 'audio/x-m4a') return 'm4a';
   if (normalized === 'audio/wav' || normalized === 'audio/x-wav' || normalized === 'audio/wave') return 'wav';
@@ -6285,6 +6287,84 @@ async function uploadSeedanceReferenceVideoUrl(pathOrUrl, apiKey, index = 0) {
   return uploaded.url;
 }
 
+// Content types the Sogni media pipeline accepts for image references, mirroring
+// the `allowedContentTypes` the /v2/image/uploadUrl presigned-POST endpoint
+// returns. Kept as a constant so the skill validates exactly what the backend
+// will store rather than imposing a narrower client-side policy.
+const SEEDANCE_REFERENCE_IMAGE_MIME_TYPES = Object.freeze([
+  'image/png', 'image/jpeg', 'image/webp', 'image/gif',
+]);
+
+// Identify an image's MIME type from its leading bytes (magic numbers). Reliable
+// because we already hold the buffer, so it works regardless of file extension.
+function sniffSeedanceReferenceImageMimeType(buffer) {
+  if (!buffer || buffer.length < 4) return null;
+  if (buffer[0] === 0x89 && buffer[1] === 0x50 && buffer[2] === 0x4e && buffer[3] === 0x47) return 'image/png';
+  if (buffer[0] === 0xff && buffer[1] === 0xd8 && buffer[2] === 0xff) return 'image/jpeg';
+  if (
+    buffer.length >= 12
+    && buffer[0] === 0x52 && buffer[1] === 0x49 && buffer[2] === 0x46 && buffer[3] === 0x46
+    && buffer[8] === 0x57 && buffer[9] === 0x45 && buffer[10] === 0x42 && buffer[11] === 0x50
+  ) return 'image/webp';
+  if (buffer[0] === 0x47 && buffer[1] === 0x49 && buffer[2] === 0x46 && buffer[3] === 0x38) return 'image/gif';
+  return null;
+}
+
+// Resolve a Seedance loose-reference image's MIME type from its bytes first,
+// falling back to the file extension. Unsupported files fail fast with an
+// actionable message instead of uploading bytes the render backend will reject.
+function seedanceReferenceImageMimeType(pathOrUrl, buffer) {
+  const sniffed = sniffSeedanceReferenceImageMimeType(buffer);
+  if (sniffed) return sniffed;
+  const byPath = mimeTypeForPath(pathOrUrl, '');
+  const normalizedByPath = byPath === 'image/jpg' ? 'image/jpeg' : byPath;
+  if (SEEDANCE_REFERENCE_IMAGE_MIME_TYPES.includes(normalizedByPath)) return normalizedByPath;
+  const err = new Error(
+    `Seedance reference image "${pathOrUrl}" must be a PNG, JPEG, WebP, or GIF file (or an HTTPS URL to one).`,
+  );
+  err.code = 'UNSUPPORTED_MEDIA_TYPE';
+  err.hint = 'Convert the image to PNG, JPEG, or WebP, or pass an HTTPS URL.';
+  err.details = { source: pathOrUrl };
+  throw err;
+}
+
+async function prepareSeedanceReferenceImageUploadFile(pathOrUrl, buffer) {
+  const data = Buffer.from(buffer);
+  const mimeType = seedanceReferenceImageMimeType(pathOrUrl, data);
+  const filename = withMediaExtension(
+    mediaFilenameFromSource(pathOrUrl, 'reference-image'),
+    extensionForApiMediaReference(mimeType, 'image'),
+  );
+  const maxBytes = apiMediaReferenceMaxBytes();
+  if (data.length > maxBytes) {
+    const err = new Error(
+      `Seedance reference image "${pathOrUrl}" is ${data.length} bytes, above the ${maxBytes} byte upload limit.`,
+    );
+    err.code = 'MEDIA_REFERENCE_TOO_LARGE';
+    err.details = { source: pathOrUrl, byteLength: data.length, maxBytes };
+    throw err;
+  }
+  return {
+    buffer: data,
+    filename,
+    byteLength: data.length,
+    mimeType,
+  };
+}
+
+// Upload a local (non-HTTPS) Seedance loose-reference image and return its
+// hosted HTTPS download URL. The Client SDK's loose-reference arrays accept only
+// URL strings, so this is what lets `-c <local image>` work in direct generation
+// without forcing the user onto the --api-chat / --durable-chat path. Mirrors
+// uploadSeedanceReferenceAudioUrl / uploadSeedanceReferenceVideoUrl.
+async function uploadSeedanceReferenceImageUrl(pathOrUrl, apiKey, index = 0) {
+  const ref = { flag: '-c/--context', value: pathOrUrl, kind: 'image' };
+  const buffer = await fetchMediaBuffer(pathOrUrl);
+  const file = await prepareSeedanceReferenceImageUploadFile(pathOrUrl, buffer);
+  const uploaded = await uploadPreparedApiMediaReferenceV2(ref, index, apiKey, file);
+  return uploaded.url;
+}
+
 async function trimSeedanceV2VSourceVideoBuffer(buffer, sourceLabel, startOffset, requestedDuration) {
   const ffmpegPath = await ensureFfmpegAvailable();
   const tempDir = createTrackedTempDir('sogni-seedance-v2v-');
@@ -8086,19 +8166,24 @@ async function main() {
   // Seedance loose-reference extras: -c/--context images beyond start/end,
   // plus repeated --ref-audio / --ref-video entries past the first. The
   // Sogni Client SDK accepts only URL arrays for these (createJobRequestMessage),
-  // so
-  //
+  // so each entry must resolve to an HTTPS URL. HTTPS inputs are forwarded as-is
+  // (SSRF-validated); local files are uploaded to a Sogni-hosted URL first, the
+  // same way the primary --ref-audio / --ref-video locals are handled. This lets
+  // `-c <local image>` work in direct generation without a detour through
+  // --api-chat / --durable-chat.
   if (isSeedanceVideo) {
-    for (const ctxImage of (Array.isArray(options.contextImages) ? options.contextImages : [])) {
+    for (const [ctxIndex, ctxImage] of (Array.isArray(options.contextImages) ? options.contextImages : []).entries()) {
       if (!ctxImage) continue;
-      if (
-
-
-
-
+      if (isHttpsUrl(ctxImage)) {
+        await appendSafeSeedanceReferenceUrl(seedanceReferenceImageUrls, ctxImage, 'Seedance image reference');
+      } else {
+        const uploadedImageUrl = await uploadSeedanceReferenceImageUrl(
+          ctxImage,
+          creds.SOGNI_API_KEY,
+          ctxIndex,
         );
+        seedanceReferenceImageUrls.push(uploadedImageUrl);
       }
-      await appendSafeSeedanceReferenceUrl(seedanceReferenceImageUrls, ctxImage, 'Seedance image reference');
     }
     for (const [extraAudioIndex, extraAudio] of options.refAudios.entries()) {
       if (!isHttpsUrl(extraAudio)) {
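Reviewer sketch: the "mislabeled WebP" case that the changelog's regression tests cover can be reproduced in isolation. The snippet re-implements the same magic-number checks as `sniffSeedanceReferenceImageMimeType` in a standalone form; `sniffImageMimeType`, `resolveImageMimeType`, and the small extension table are hypothetical stand-ins (the package uses its internal `mimeTypeForPath`, which is not shown in this diff).

```javascript
// Standalone copy of the magic-number checks from the diff, for experimentation.
function sniffImageMimeType(buffer) {
  if (!buffer || buffer.length < 4) return null;
  const matches = (bytes, offset = 0) => bytes.every((byte, i) => buffer[offset + i] === byte);
  if (matches([0x89, 0x50, 0x4e, 0x47])) return 'image/png'; // \x89PNG
  if (matches([0xff, 0xd8, 0xff])) return 'image/jpeg'; // JPEG SOI marker
  if (
    buffer.length >= 12
    && matches([0x52, 0x49, 0x46, 0x46]) // "RIFF"
    && matches([0x57, 0x45, 0x42, 0x50], 8) // "WEBP" at byte 8
  ) return 'image/webp';
  if (matches([0x47, 0x49, 0x46, 0x38])) return 'image/gif'; // "GIF8"
  return null;
}

// Hypothetical stand-in for the CLI's extension lookup.
const EXTENSION_MIME_TYPES = new Map([
  ['png', 'image/png'],
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['webp', 'image/webp'],
  ['gif', 'image/gif'],
]);

// Bytes first, extension second, otherwise fail fast.
function resolveImageMimeType(path, buffer) {
  const sniffed = sniffImageMimeType(buffer);
  if (sniffed) return sniffed;
  const extension = path.split('.').pop().toLowerCase();
  const byExtension = EXTENSION_MIME_TYPES.get(extension);
  if (byExtension) return byExtension;
  throw new Error(`"${path}" must be a PNG, JPEG, WebP, or GIF file`);
}

// A WebP payload saved under a .png name: the bytes win over the extension.
const webpBytes = Buffer.concat([Buffer.from('RIFF'), Buffer.alloc(4), Buffer.from('WEBP')]);
console.log(resolveImageMimeType('photo.png', webpBytes)); // image/webp

// Unrecognized leading bytes: the extension is used as the fallback.
console.log(resolveImageMimeType('scan.jpeg', Buffer.from([0, 0, 0, 0]))); // image/jpeg
```

Sniffing first means the uploaded file gets an extension and content type that match what the backend will actually store, which is presumably why the changelog calls out the mislabeled-WebP test.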
package/version.mjs
CHANGED
@@ -1 +1 @@
-export const PACKAGE_VERSION = '3.6.
+export const PACKAGE_VERSION = '3.6.1';