@sogni-ai/sogni-creative-agent-skill 3.5.1 → 3.6.1

package/CHANGELOG.md CHANGED
@@ -5,6 +5,44 @@ All notable changes to this project are documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [3.6.1] - 2026-06-15
9
+
10
+ ### Changed
11
+
12
+ - **Hosted-API guidance now recommends client-side planning over hosted re-planning.** The skill is driven by a
13
+ frontier LLM that out-plans Sogni's hosted planning model, so steering it to delegate planning through
14
+ `--api-chat` was a downgrade. `SKILL.md`, `references/hosted-api.md`, and `README.md` now tell the calling agent
15
+ to plan and select tools itself, use `--api-workflow` with an explicit `--workflow-input` step graph for durable
16
+ multi-step work (the server executes the authored plan without re-planning), and reserve `--api-chat` /
17
+ `--durable-chat` for deliberately offloading a long server-side loop or uploading several local files in one
18
+ turn. `--api-chat` and all hosted modes remain fully supported — only the recommended default changed.
19
+
20
+ ### Fixed
21
+
22
+ - **Local Seedance reference images via `-c`/`--context` now auto-upload in direct CLI mode.** Local
23
+ loose-reference images were rejected with an HTTPS-only error that pushed users onto the unreliable
24
+ `--api-chat` / `--durable-chat` path; local `--ref-audio` and `--ref-video` already auto-uploaded through the
25
+ `/v2` presigned-POST flow, so images were the only modality missing it and one broken branch cascaded into
26
+ downstream failures (vision 1024px cap, HTTP timeout, no-content, missing durable SDK package). Local
27
+ `-c`/`--context` images now upload through the same `/v2/image` presigned flow and forward as Sogni-hosted URLs.
28
+ MIME type is resolved by magic-byte sniffing (falling back to extension), and the accepted set
29
+ (PNG/JPEG/WebP/GIF) mirrors the backend's `allowedContentTypes`. Adds local-PNG-upload and mislabeled-WebP
30
+ byte-sniff regression tests; verified end-to-end with a real Seedance 2.0 render from a local `-c` PNG.
31
+
32
+ ## [3.6.0] - 2026-06-12
33
+
34
+ ### Added
35
+
36
+ - **Agents now surface update notices (gstack-style).** Update notices were previously suppressed exactly where
37
+ agents live — non-TTY stderr, `--json` mode, and OpenClaw plugin invocations — so Claude Code / Codex / Hermes /
38
+ OpenClaw users never learned a newer skill existed. Any command may now print a single advisory stderr line,
39
+ `[sogni-agent] Update available: <current> -> <latest> ...`, throttled to at most once per 24 hours, telling
40
+ the agent to finish the current task, relay the update to the user, and offer `sogni-agent self-update`
41
+ (`--snooze-update` on decline). Interactive TTY users keep the existing banner. stdout is never touched, so
42
+ `--json` output stays machine-parseable; SKILL.md instructs agents how to handle the line. Background version
43
+ checks now also run in agent contexts (still skipped for CI, tests, `--no-update-check`,
44
+ `SOGNI_NO_UPDATE_CHECK`, and dev checkouts).
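Reviewer's illustration of the entry above: a minimal sketch of how a calling agent might pick the advisory line out of captured stderr. The helper name `parseUpdateNotice` and its regular expression are assumptions made for this example and are not part of the package; only the `[sogni-agent] Update available: <current> -> <latest>` prefix comes from the changelog.

```javascript
// Hypothetical helper (not part of sogni-agent): find the advisory update line
// in captured stderr text and extract the two version strings.
function parseUpdateNotice(stderrText) {
  // Assumed shape, taken from the changelog:
  //   [sogni-agent] Update available: <current> -> <latest> ...
  const pattern = /^\[sogni-agent\] Update available: (\S+) -> (\S+)/m;
  const match = pattern.exec(String(stderrText || ''));
  if (!match) return null;
  // The notice text ends the latest version with a period, so trim it.
  return { current: match[1], latest: match[2].replace(/\.$/, '') };
}

const sample = [
  'Rendering... done',
  '[sogni-agent] Update available: 3.5.1 -> 3.6.1. Agent: after finishing the current task, ...',
].join('\n');

console.log(parseUpdateNotice(sample));
console.log(parseUpdateNotice('no notice here'));
```

Because the notice is written only to stderr, a caller that parses `--json` output from stdout can run a check like this on the stderr stream separately and leave the JSON handling unchanged.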
45
+
8
46
  ## [3.5.1] - 2026-06-12
9
47
 
10
48
  ### Fixed
package/README.md CHANGED
@@ -600,7 +600,9 @@ Stored at `~/.config/sogni/personality.txt`.
600
600
 
601
601
  Hosted API modes require `SOGNI_API_KEY`.
602
602
 
603
- - **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools; best for text-first natural-language workflows. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
603
+ **Choosing a mode.** Whatever is driving this CLI is usually a more capable planner than Sogni's hosted model, so prefer to plan yourself and let the server execute: direct-to-SDK flags for one-shot work, and `--api-workflow` with an explicit `--workflow-input` step graph for multi-step/durable work (you author the plan; the server runs it durably with replay — no hosted re-planning). Use `--api-chat` / `--durable-chat` when you deliberately want the hosted model to own a long server-side loop, or when several local files must be uploaded for one turn.
604
+
605
+ - **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools and **delegates planning/tool-selection to the hosted model** — reach for it when the caller is a thin client, when you want the hosted model to drive a long server-side tool loop, or when several local files must be uploaded for one turn. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
604
606
  - **Sogni Intelligence controls** include `--task-profile general|coding|reasoning`, `--max-tokens`, and `--thinking` / `--no-thinking`, which forward to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. Use `--list-api-models` or `--get-api-model <id>` to inspect `/v1/models`.
605
607
  - **`--durable-chat`** starts a hosted `/v1/chat/runs` record through the SDK transport. Set `SOGNI_SKILL_USE_SDK_TRANSPORT=1` before using it. The CLI streams assistant deltas and de-duplicated per-job progress / ETA / result lines from hosted run events.
606
608
  - **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Requests carry `input.steps` plus snake_case controls such as `token_type`, `media_references`, `max_estimated_capacity_units`, and `confirm_cost`.
@@ -683,8 +685,10 @@ This skill is designed to be loaded into agent runtimes as a first-class capabil
683
685
  5. **Agent-safe install/upgrade**
684
686
  Prefer the `npm install -g` and `git -C "$DEST" pull --ff-only` paths above. Avoid generating clone-or-pull bootstrap scripts with `set -e`, `bash -c`, `sh -c`, or inline repository URLs — agent sandboxes correctly route those through approval and the install will stall.
685
687
  6. **Verify with `doctor`**
686
- After any install or upgrade, run `sogni-agent doctor --json` and confirm `"success": true` before reporting the install as working. Each failed check carries a `detail` string with the fix.
687
- 7. **SSRF / URL safety**
688
+ After any install or upgrade, run `sogni-agent doctor --json` and confirm `"success": true` before reporting the install as working.
689
+ 7. **Update notices for agents**
690
+ When a newer version exists, any command may print one advisory stderr line — `[sogni-agent] Update available: <current> -> <latest> ...` — at most once per day (stdout JSON is never touched). Agents should relay it to the user and offer `sogni-agent self-update`, or run `sogni-agent --snooze-update` if the user declines. Interactive TTY users get a banner instead. Each failed check carries a `detail` string with the fix.
691
+ 8. **SSRF / URL safety**
688
692
  The CLI validates every HTTP(S) media reference with an SSRF guard ([`ssrf-guard.mjs`](./ssrf-guard.mjs)) and re-validates each redirect hop on download. Localhost and private-network URLs are rejected; only public HTTPS references are forwarded as Seedance multimodal context.
689
693
 
690
694
  ---
package/SKILL.md CHANGED
@@ -2,7 +2,7 @@
2
2
  name: sogni-creative-agent-skill
3
3
  description: "Sogni Creative Agent Skill: agent skill and CLI for image, video, and music generation using Sogni AI's decentralized GPU network. Supports personas (named people with saved reference photos and voice clips), persistent memories, custom personality, style transfer, angle synthesis, Seedance/LTX/WAN video, music/lyrics, hosted chat, durable workflows, replay records, and multi-step creative workflows. Ask the agent to \"draw\", \"generate\", \"create an image\", \"make a video/animate\", \"make music\", \"apply a style\", or \"generate me as a superhero\"."
4
4
  metadata:
5
- version: "3.5.1"
5
+ version: "3.6.1"
6
6
  homepage: https://sogni.ai
7
7
  openclaw:
8
8
  emoji: "🎨"
@@ -52,7 +52,9 @@ Agents should run `sogni-agent doctor --json` and confirm `"success": true` befo
52
52
 
53
53
  Always invoke the globally installed `sogni-agent` command. Do not call `node {{skillDir}}/sogni-agent.mjs` or `node sogni-agent.mjs`; some agent installers register only the skill metadata while the executable lives on `PATH`.
54
54
 
55
- For upgrades, prefer `sogni-agent self-update`, package-manager updates, or direct operations on an existing checkout (`git -C "$DEST" pull --ff-only && npm --prefix "$DEST" install`). Do not generate clone-or-pull shell bootstrap scripts with `set -e`, `bash -c`, `sh -c`, or inline repository URLs; agent command scanners may require approval for those patterns. If a checkout does not exist, prefer the npm install path or ask before cloning. When an update notice appears, offer the user the upgrade (`sogni-agent self-update`); if they decline, run `sogni-agent --snooze-update` so they are not re-nagged daily, and `sogni-agent --whats-new` after upgrading to summarize changes.
55
+ For upgrades, prefer `sogni-agent self-update`, package-manager updates, or direct operations on an existing checkout (`git -C "$DEST" pull --ff-only && npm --prefix "$DEST" install`). Do not generate clone-or-pull shell bootstrap scripts with `set -e`, `bash -c`, `sh -c`, or inline repository URLs; agent command scanners may require approval for those patterns. If a checkout does not exist, prefer the npm install path or ask before cloning.
56
+
57
+ **Update notices:** any `sogni-agent` command may print a single stderr line of the form `[sogni-agent] Update available: <current> -> <latest> ...` (at most once per day). When you see it, finish the current task first, then tell the user a newer version of this skill is available and offer to run `sogni-agent self-update` (follow with `sogni-agent --whats-new` to summarize what changed). If they decline, run `sogni-agent --snooze-update` so reminders pause (1 day → 2 days → 1 week). Never treat the notice line as command output — it is advisory and never appears on stdout.
56
58
 
57
59
  ## Uninstall Request Policy
58
60
 
@@ -96,28 +98,31 @@ sogni-agent -o /tmp/cat.png "a cat wearing a hat" # ✗ avoid — user can't
96
98
  - Media listing for `--list-media` (read): `~/.openclaw/media/inbound`, falling back to the legacy `~/.clawdbot/media/inbound` when only it exists (`SOGNI_MEDIA_INBOUND_DIR`)
97
99
  - Custom ffmpeg binary: `FFMPEG_PATH`
98
100
 
99
- ## Recommended path: hosted Sogni Intelligence endpoints
101
+ ## Recommended path: you plan, Sogni executes
100
102
 
101
- For any natural-language creative request that should be planned, multi-step, resumable, or benefit from server-side tool selection and repair, prefer the hosted endpoints over direct-to-SDK flags; **read [`references/hosted-api.md`](./references/hosted-api.md) first** for the full contract (tool surfaces, durable workflows, templates, replays, Seedance reference modes, media-reference uploads, cost controls):
103
+ You (the calling LLM) are almost always more capable than Sogni's hosted planning model, so **do the planning and tool selection yourself** and let the hosted endpoints do what only the server can: run on the GPU network, persist assets/manifests, orchestrate durable multi-step runs with replay, and apply structured-contract repair. Don't flatten a rich request into a single natural-language string and hand planning back to a weaker model. Match the mode to the work:
102
104
 
103
- ```bash
104
- # Natural-language creative request (LLM picks the tool, dispatches, repairs)
105
- sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
105
+ - **One-shot generation** → direct-to-SDK flags (the Core Commands below). You already know the tool, model, and prompt — just run it. No LLM round-trip, lowest latency/cost.
106
+ - **Multi-step / durable / resumable** → `--api-workflow` with an explicit step graph via `--workflow-input <json|@path>`. *You* author the exact plan — `steps[]` with `toolName`, `arguments`, and `dependsOn` bindings (e.g. `sourceStepId`, `targetArgument`, `transform: "artifact_url"`) — and the server executes it durably with replay/resumability, **without re-planning through the hosted LLM**. Presets like `--api-workflow storyboard-video` are fine when they already match the request.
107
+ - **`--api-chat` / `--durable-chat` (hosted LLM owns the loop)** → reserve for when you deliberately *want* the hosted model to drive a long server-side tool loop (saves client round-trips on long async jobs), when structured-contract repair recipes should govern, or when several local files must be uploaded for a single turn (multi-file local upload is only supported here). These delegate planning to the hosted model; choose them on purpose, not by default.
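Reviewer's illustration of the step-graph bullet above: a minimal sketch of what an authored plan might look like. Only the field names `steps`, `toolName`, `arguments`, `dependsOn`, `sourceStepId`, `targetArgument`, and `transform: "artifact_url"` come from the text; the tool names, the argument names, and the `id` field are hypothetical, and the authoritative schema is the one in `references/hosted-api.md`.

```javascript
// Sketch only: tool names, argument names, and the `id` field are hypothetical.
const plan = {
  steps: [
    {
      id: 'poster',
      toolName: 'generate_image', // hypothetical tool name
      arguments: { prompt: 'Launch poster for a neighborhood bakery' },
    },
    {
      id: 'hero-clip',
      toolName: 'generate_video', // hypothetical tool name
      arguments: { prompt: 'The camera slowly pushes in' },
      // Bind the poster's artifact URL into this step's (hypothetical) argument.
      dependsOn: [
        { sourceStepId: 'poster', targetArgument: 'referenceImageUrl', transform: 'artifact_url' },
      ],
    },
  ],
};

// Client-side sanity check: every binding must point at an earlier step.
function findDanglingDependencies(candidate) {
  const seen = new Set();
  const dangling = [];
  for (const step of candidate.steps) {
    for (const dep of step.dependsOn || []) {
      if (!seen.has(dep.sourceStepId)) dangling.push(dep.sourceStepId);
    }
    seen.add(step.id);
  }
  return dangling;
}

console.log(findDanglingDependencies(plan));
console.log(JSON.stringify(plan, null, 2));
```

Saving the printed JSON as `plan.json` matches the `--workflow-input @plan.json` form shown in the command examples of this section.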
106
108
 
107
- # Durable hosted chat run (persisted event log + SSE stream)
108
- SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat "Create a launch campaign and animate the hero clip"
109
+ **Read [`references/hosted-api.md`](./references/hosted-api.md) first** for the full hosted contract (tool surfaces, durable workflows, templates, replays, Seedance reference modes, media-reference uploads, cost controls).
109
110
 
110
- # Durable workflow (resumable, server-orchestrated)
111
- sogni-agent --api-workflow --video-prompt "The camera slowly pushes in" "A graphite robot sketch on a drafting table"
111
+ ```bash
112
+ # One-shot: you pick the tool, the server just executes (see Core Commands below)
113
+ sogni-agent -q -Q hq -o ./poster.png "Turn the product photo into a launch poster"
112
114
 
113
- # Storyboard → GPT Image 2 sheet → Seedance video, all server-side
115
+ # Multi-step durable: you author the step graph, the server executes it (no hosted re-planning)
116
+ sogni-agent --api-workflow --workflow-input @plan.json
114
117
  sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq "9:16 bakery launch video"
118
+
119
+ # Deliberately hand the whole loop to the hosted model (long async job, or multi local-file upload)
120
+ sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
121
+ SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat "Create a launch campaign and animate the hero clip"
115
122
  ```
116
123
 
117
124
  Hosted modes require `SOGNI_API_KEY`. Local file references are uploaded to Sogni media storage and forwarded as retrievable URLs — **use direct CLI mode for private media that must not leave the local machine.**
118
125
 
119
- Use the direct-to-SDK commands below for explicit one-shot generation when you already know the model, dimensions, and prompt.
120
-
121
126
  ## Core Commands (direct-to-SDK)
122
127
 
123
128
  ```bash
@@ -2,7 +2,7 @@
2
2
  "id": "sogni-creative-agent-skill",
3
3
  "name": "Sogni Creative Agent Skill — Image, Video & Music Generation",
4
4
  "description": "Agent skill and CLI for Sogni AI image, video, and music generation.",
5
- "version": "3.5.1",
5
+ "version": "3.6.1",
6
6
  "skills": [
7
7
  "."
8
8
  ],
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@sogni-ai/sogni-creative-agent-skill",
3
- "version": "3.5.1",
3
+ "version": "3.6.1",
4
4
  "description": "Sogni Creative Agent Skill: agent skill and CLI for Sogni AI image, video, and music generation.",
5
5
  "type": "module",
6
6
  "main": "sogni-agent.mjs",
@@ -7,35 +7,46 @@ All hosted modes require `SOGNI_API_KEY`.
7
7
 
8
8
  ## When to prefer the hosted path
9
9
 
10
- For any natural-language creative request that benefits from tool selection,
11
- repair, or durable workflows, prefer the hosted Sogni Intelligence endpoints
12
- over direct-to-SDK media flags. They are the canonical home for
13
- OpenAI-compatible chat, server-side creative tool dispatch, Structured
14
- Contracts v1 (gating policies, repair recipes, prompt contracts), durable chat
15
- runs, durable workflows, workflow templates, replay, and asset-manifest
16
- mapping.
10
+ The thing calling this API is usually a frontier LLM that is **more capable
11
+ than Sogni's hosted planning model**. So the default split is: *you* do the
12
+ planning and tool selection, and the hosted endpoints do what only the server
13
+ can: run on the GPU network, persist assets/manifests, orchestrate durable
14
+ multi-step runs with replay, and apply Structured Contracts v1 (gating
15
+ policies, repair recipes, prompt contracts). Routing a request through
16
+ `--api-chat` so a weaker model re-plans it is usually a downgrade; reach for the
17
+ hosted *planner* deliberately, not by default.
18
+
19
+ - **You already know the single tool + args** → direct-to-SDK flags. Lowest
20
+ latency/cost, no LLM round-trip.
21
+ - **Multi-step, durable, resumable** → `--api-workflow` with an explicit
22
+ `--workflow-input` step graph that *you* author (`steps[]` with `toolName`,
23
+ `arguments`, and `dependsOn` bindings). The server executes and repairs it
24
+ deterministically with replay/resumability and **no hosted-LLM re-planning**.
25
+ This is the best fit when a frontier client drives the work.
26
+ - **You want the hosted model to own a long loop** → `--api-chat` /
27
+ `--durable-chat`. Worth it when offloading a long async tool loop server-side
28
+ saves client round-trips, when structured-contract repair should govern, or
29
+ when several local files must be uploaded for one turn (only supported here).
17
30
 
18
31
  ```bash
19
- # Natural-language creative request (LLM picks the tool, dispatches, repairs)
32
+ # You author the exact durable plan; the server executes it (no hosted re-planning)
33
+ sogni-agent --api-workflow --workflow-input @plan.json
34
+
35
+ # Storyboard → GPT Image 2 sheet → Seedance, all server-side (preset plan)
36
+ sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
37
+ "Create a 9:16 bakery launch video with a neon street-window reveal"
38
+
39
+ # Deliberately hand planning to the hosted model (long async job / multi local-file upload)
20
40
  sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
21
41
 
22
42
  # Durable hosted chat run (persisted event log + SSE stream)
23
43
  SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat \
24
44
  "Create a four-shot launch campaign, generate the key art, and animate the hero clip"
25
-
26
- # Multi-step durable workflow (resumable, replay-friendly, server-orchestrated)
27
- sogni-agent --api-workflow \
28
- --video-prompt "The camera slowly pushes in" \
29
- "A graphite robot sketch on a drafting table"
30
-
31
- # Storyboard → GPT Image 2 sheet → Seedance, all server-side
32
- sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
33
- "Create a 9:16 bakery launch video with a neon street-window reveal"
34
45
  ```
35
46
 
36
- The direct-to-SDK flags remain available for explicit one-shot generation when
37
- you already know the exact model, dimensions, and prompt and don't need LLM
38
- planning — use them when latency or cost rules out the LLM round-trip.
47
+ The direct-to-SDK flags remain the right call for explicit one-shot generation
48
+ when you already know the exact model, dimensions, and prompt; use them
49
+ whenever latency or cost rules out an LLM round-trip.
39
50
 
40
51
  ## --api-chat (`POST /v1/chat/completions`)
41
52
 
@@ -156,19 +167,24 @@ per video request:
156
167
  - **Loose reference mode — `-c/--context` plus optional `--ref-audio` and
157
168
  `--ref-video` extras.** Anchor frame intent in the prompt with `@Image1` /
158
169
  `@Video1` / `@Audio1` etc. (e.g. *"Use @Image1 as the opening shot
159
- reference"*). Supports up to 9 image refs, 3 video refs, 3 audio refs, and
160
- 12 total reference assets per request (canonical caps come from
170
+ reference"*). Each `-c/--context` image may be a **local file or an HTTPS
171
+ URL** (PNG, JPEG, WebP, or GIF); local files are uploaded to Sogni media
172
+ storage automatically, so you do **not** need `--api-chat` / `--durable-chat`
173
+ just to attach a local loose-reference image. Supports up to 9 image refs, 3 video refs, 3 audio
174
+ refs, and 12 total reference assets per request (canonical caps come from
161
175
  `SEEDANCE_REFERENCE_LIMITS` / `validateSeedanceReferenceCounts()` in
162
176
  `@sogni-ai/sogni-intelligence-client/tools`).
163
177
 
164
178
  Combining `--ref` / `--ref-end` with `-c/--context` on Seedance is rejected
165
- client-side with an error pointing at the correct mode. In CLI direct-gen
166
- mode, additional `--ref-audio` / `--ref-video` entries beyond the first must
167
- be HTTPS URLs (the primary entry can still be a local file); for local
168
- multi-file Seedance uploads, use `--api-chat` / `--durable-chat` instead.
169
- Seedance accepts public HTTPS image, video, and audio references that pass the
170
- CLI URL safety checks; localhost and private-network URLs are rejected before
171
- forwarding. Audio references must be paired with an image or video reference.
179
+ client-side with an error pointing at the correct mode. In CLI direct-gen mode,
180
+ local `-c/--context` images and the primary `--ref-audio` / `--ref-video` are
181
+ uploaded to Sogni media storage automatically and forwarded as HTTPS URLs; only
182
+ *additional* `--ref-audio` / `--ref-video` entries beyond the first must already
183
+ be HTTPS URLs (use `--api-chat` / `--durable-chat` when you need to attach
184
+ several local audio or video files in one request). Seedance accepts public
185
+ HTTPS image, video, and audio references that pass the CLI URL safety checks;
186
+ localhost and private-network URLs are rejected before forwarding. Audio
187
+ references must be paired with an image or video reference.
172
188
 
173
189
  ## Models, replays, and contract debugging
174
190
 
package/sogni-agent.mjs CHANGED
@@ -4242,6 +4242,8 @@ function extensionForApiMediaReference(mimeType, kind) {
4242
4242
  const normalized = String(mimeType || '').split(';')[0].trim().toLowerCase();
4243
4243
  if (normalized === 'image/jpeg' || normalized === 'image/jpg') return 'jpg';
4244
4244
  if (normalized === 'image/png') return 'png';
4245
+ if (normalized === 'image/webp') return 'webp';
4246
+ if (normalized === 'image/gif') return 'gif';
4245
4247
  if (normalized === 'audio/mpeg' || normalized === 'audio/mp3') return 'mp3';
4246
4248
  if (normalized === 'audio/mp4' || normalized === 'audio/m4a' || normalized === 'audio/x-m4a') return 'm4a';
4247
4249
  if (normalized === 'audio/wav' || normalized === 'audio/x-wav' || normalized === 'audio/wave') return 'wav';
@@ -6285,6 +6287,84 @@ async function uploadSeedanceReferenceVideoUrl(pathOrUrl, apiKey, index = 0) {
6285
6287
  return uploaded.url;
6286
6288
  }
6287
6289
 
6290
+ // Content types the Sogni media pipeline accepts for image references, mirroring
6291
+ // the `allowedContentTypes` the /v2/image/uploadUrl presigned-POST endpoint
6292
+ // returns. Kept as a constant so the skill validates exactly what the backend
6293
+ // will store rather than imposing a narrower client-side policy.
6294
+ const SEEDANCE_REFERENCE_IMAGE_MIME_TYPES = Object.freeze([
6295
+ 'image/png', 'image/jpeg', 'image/webp', 'image/gif',
6296
+ ]);
6297
+
6298
+ // Identify an image's MIME type from its leading bytes (magic numbers). Reliable
6299
+ // because we already hold the buffer, so it works regardless of file extension.
6300
+ function sniffSeedanceReferenceImageMimeType(buffer) {
6301
+ if (!buffer || buffer.length < 4) return null;
6302
+ if (buffer[0] === 0x89 && buffer[1] === 0x50 && buffer[2] === 0x4e && buffer[3] === 0x47) return 'image/png';
6303
+ if (buffer[0] === 0xff && buffer[1] === 0xd8 && buffer[2] === 0xff) return 'image/jpeg';
6304
+ if (
6305
+ buffer.length >= 12
6306
+ && buffer[0] === 0x52 && buffer[1] === 0x49 && buffer[2] === 0x46 && buffer[3] === 0x46
6307
+ && buffer[8] === 0x57 && buffer[9] === 0x45 && buffer[10] === 0x42 && buffer[11] === 0x50
6308
+ ) return 'image/webp';
6309
+ if (buffer[0] === 0x47 && buffer[1] === 0x49 && buffer[2] === 0x46 && buffer[3] === 0x38) return 'image/gif';
6310
+ return null;
6311
+ }
6312
+
6313
+ // Resolve a Seedance loose-reference image's MIME type from its bytes first,
6314
+ // falling back to the file extension. Unsupported files fail fast with an
6315
+ // actionable message instead of uploading bytes the render backend will reject.
6316
+ function seedanceReferenceImageMimeType(pathOrUrl, buffer) {
6317
+ const sniffed = sniffSeedanceReferenceImageMimeType(buffer);
6318
+ if (sniffed) return sniffed;
6319
+ const byPath = mimeTypeForPath(pathOrUrl, '');
6320
+ const normalizedByPath = byPath === 'image/jpg' ? 'image/jpeg' : byPath;
6321
+ if (SEEDANCE_REFERENCE_IMAGE_MIME_TYPES.includes(normalizedByPath)) return normalizedByPath;
6322
+ const err = new Error(
6323
+ `Seedance reference image "${pathOrUrl}" must be a PNG, JPEG, WebP, or GIF file (or an HTTPS URL to one).`,
6324
+ );
6325
+ err.code = 'UNSUPPORTED_MEDIA_TYPE';
6326
+ err.hint = 'Convert the image to PNG, JPEG, or WebP, or pass an HTTPS URL.';
6327
+ err.details = { source: pathOrUrl };
6328
+ throw err;
6329
+ }
6330
+
6331
+ async function prepareSeedanceReferenceImageUploadFile(pathOrUrl, buffer) {
6332
+ const data = Buffer.from(buffer);
6333
+ const mimeType = seedanceReferenceImageMimeType(pathOrUrl, data);
6334
+ const filename = withMediaExtension(
6335
+ mediaFilenameFromSource(pathOrUrl, 'reference-image'),
6336
+ extensionForApiMediaReference(mimeType, 'image'),
6337
+ );
6338
+ const maxBytes = apiMediaReferenceMaxBytes();
6339
+ if (data.length > maxBytes) {
6340
+ const err = new Error(
6341
+ `Seedance reference image "${pathOrUrl}" is ${data.length} bytes, above the ${maxBytes} byte upload limit.`,
6342
+ );
6343
+ err.code = 'MEDIA_REFERENCE_TOO_LARGE';
6344
+ err.details = { source: pathOrUrl, byteLength: data.length, maxBytes };
6345
+ throw err;
6346
+ }
6347
+ return {
6348
+ buffer: data,
6349
+ filename,
6350
+ byteLength: data.length,
6351
+ mimeType,
6352
+ };
6353
+ }
6354
+
6355
+ // Upload a local (non-HTTPS) Seedance loose-reference image and return its
6356
+ // hosted HTTPS download URL. The Client SDK's loose-reference arrays accept only
6357
+ // URL strings, so this is what lets `-c <local image>` work in direct generation
6358
+ // without forcing the user onto the --api-chat / --durable-chat path. Mirrors
6359
+ // uploadSeedanceReferenceAudioUrl / uploadSeedanceReferenceVideoUrl.
6360
+ async function uploadSeedanceReferenceImageUrl(pathOrUrl, apiKey, index = 0) {
6361
+ const ref = { flag: '-c/--context', value: pathOrUrl, kind: 'image' };
6362
+ const buffer = await fetchMediaBuffer(pathOrUrl);
6363
+ const file = await prepareSeedanceReferenceImageUploadFile(pathOrUrl, buffer);
6364
+ const uploaded = await uploadPreparedApiMediaReferenceV2(ref, index, apiKey, file);
6365
+ return uploaded.url;
6366
+ }
6367
+
6288
6368
  async function trimSeedanceV2VSourceVideoBuffer(buffer, sourceLabel, startOffset, requestedDuration) {
6289
6369
  const ffmpegPath = await ensureFfmpegAvailable();
6290
6370
  const tempDir = createTrackedTempDir('sogni-seedance-v2v-');
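Reviewer's illustration of why the hunk above sniffs bytes before trusting the extension: the standalone sketch below re-creates only the signature checks in simplified form (it is not the package's code) and runs them on a WebP header stored under a `.png` name. The helper name `sniffImageMime` and the sample bytes are assumptions for this example.

```javascript
// Simplified re-creation of the magic-number checks, for illustration only.
function sniffImageMime(bytes) {
  if (!bytes || bytes.length < 4) return null;
  // True when `bytes` contains the signature `sig` starting at `offset`.
  const startsWith = (sig, offset = 0) => sig.every((b, i) => bytes[offset + i] === b);
  if (startsWith([0x89, 0x50, 0x4e, 0x47])) return 'image/png'; // \x89PNG
  if (startsWith([0xff, 0xd8, 0xff])) return 'image/jpeg';
  if (
    bytes.length >= 12
    && startsWith([0x52, 0x49, 0x46, 0x46]) // "RIFF"
    && startsWith([0x57, 0x45, 0x42, 0x50], 8) // "WEBP" at byte offset 8
  ) return 'image/webp';
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'image/gif'; // "GIF8"
  return null;
}

// A WebP header saved under a misleading .png filename.
const mislabeled = {
  name: 'hero.png',
  bytes: Uint8Array.from([0x52, 0x49, 0x46, 0x46, 0x24, 0, 0, 0, 0x57, 0x45, 0x42, 0x50]),
};

console.log(mislabeled.name, '->', sniffImageMime(mislabeled.bytes));
```

An extension-only check would label this file `image/png`; checking the bytes first yields `image/webp`, which is the case the mislabeled-WebP regression test in the changelog describes.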
@@ -8086,19 +8166,24 @@ async function main() {
8086
8166
  // Seedance loose-reference extras: -c/--context images beyond start/end,
8087
8167
  // plus repeated --ref-audio / --ref-video entries past the first. The
8088
8168
  // Sogni Client SDK accepts only URL arrays for these (createJobRequestMessage),
8089
- // so extras MUST be HTTPS URLs. For multi-file local uploads, use --api-chat /
8090
- // --durable-chat where the LLM upload pipeline handles per-file uploads.
8169
+ // so each entry must resolve to an HTTPS URL. HTTPS inputs are forwarded as-is
8170
+ // (SSRF-validated); local files are uploaded to a Sogni-hosted URL first, the
8171
+ // same way the primary --ref-audio / --ref-video locals are handled. This lets
8172
+ // `-c <local image>` work in direct generation without a detour through
8173
+ // --api-chat / --durable-chat.
8091
8174
  if (isSeedanceVideo) {
8092
- for (const ctxImage of (Array.isArray(options.contextImages) ? options.contextImages : [])) {
8175
+ for (const [ctxIndex, ctxImage] of (Array.isArray(options.contextImages) ? options.contextImages : []).entries()) {
8093
8176
  if (!ctxImage) continue;
8094
- if (!isHttpsUrl(ctxImage)) {
8095
- fatalCliError(
8096
- `Seedance extra image reference "${ctxImage}" must be an HTTPS URL. ` +
8097
- 'Local file uploads beyond --ref / --ref-end are only supported in --api-chat / --durable-chat mode.',
8098
- { code: 'INVALID_ARGUMENT', details: { flag: '-c/--context', value: ctxImage } },
8177
+ if (isHttpsUrl(ctxImage)) {
8178
+ await appendSafeSeedanceReferenceUrl(seedanceReferenceImageUrls, ctxImage, 'Seedance image reference');
8179
+ } else {
8180
+ const uploadedImageUrl = await uploadSeedanceReferenceImageUrl(
8181
+ ctxImage,
8182
+ creds.SOGNI_API_KEY,
8183
+ ctxIndex,
8099
8184
  );
8185
+ seedanceReferenceImageUrls.push(uploadedImageUrl);
8100
8186
  }
8101
- await appendSafeSeedanceReferenceUrl(seedanceReferenceImageUrls, ctxImage, 'Seedance image reference');
8102
8187
  }
8103
8188
  for (const [extraAudioIndex, extraAudio] of options.refAudios.entries()) {
8104
8189
  if (!isHttpsUrl(extraAudio)) {
package/update-check.mjs CHANGED
@@ -10,7 +10,10 @@
10
10
  * writeState(path, state) → void
11
11
  * runForegroundCheck(opts) → Promise<void> (used by --__update-check)
12
12
  * maybeSpawnBackgroundCheck(opts) → 'spawned' | 'skipped' | 'fresh'
13
- * getQueuedNotice(opts) → string | null
13
+ * getQueuedNotice(opts) → string | null (TTY banner, or a
14
+ * throttled one-line agent notice when
15
+ * stderr is not a TTY)
16
+ * formatAgentUpdateNotice(opts) → string (pure)
14
17
  * runSelfUpdate(opts) → number (exit code)
15
18
  * snoozeUpdate(opts) → { snoozed, version?, level?, until? }
16
19
  * extractChangelogEntries(text) → [{ version, heading, body }] (pure)
@@ -70,10 +73,14 @@ export function detectPackageManager(env = process.env) {
70
73
  return { manager: 'npm', installCmd: `npm install -g ${PACKAGE_NAME}` };
71
74
  }
72
75
 
76
+ // Hard opt-outs only. Notices are deliberately NOT skipped for non-TTY
77
+ // stderr, --json, or OpenClaw plugin invocations anymore: those are exactly
78
+ // the agent contexts that should relay "an update is available" to the user
79
+ // (getQueuedNotice emits a compact single-line agent notice there instead of
80
+ // the interactive banner).
73
81
  export function shouldSkipForEnvironment({
74
82
  argv = process.argv,
75
83
  env = process.env,
76
- stderr = process.stderr,
77
84
  cliPath = process.argv[1] || '',
78
85
  } = {}) {
79
86
  if (Array.isArray(argv) && argv.includes('--no-update-check')) return true;
@@ -81,11 +88,8 @@ export function shouldSkipForEnvironment({
81
88
  if (env.NO_UPDATE_NOTIFIER === '1' || env.NO_UPDATE_NOTIFIER === 'true') return true;
82
89
  if (env.CI) return true;
83
90
  if (env.SOGNI_AGENT_TEST_STATE_PATH) return true;
84
- if (env.OPENCLAW_PLUGIN_CONFIG) return true;
85
91
  if (env.NODE_ENV === 'test') return true;
86
92
  if (env.npm_lifecycle_event) return true; // running under `npm <script>`
87
- if (Array.isArray(argv) && argv.includes('--json')) return true;
88
- if (stderr && stderr.isTTY === false) return true;
89
93
  // Dev / source checkout: CLI directory contains .git
90
94
  if (cliPath) {
91
95
  try {
@@ -265,13 +269,30 @@ export function maybeSpawnBackgroundCheck({
265
269
  }
266
270
  }
267
271
 
272
+ // Agent contexts (non-TTY stderr: Claude Code, Codex, Hermes, OpenClaw, …)
273
+ // get a compact one-line notice at most once per this window, instructing the
274
+ // agent to relay the update to the user — the gstack-style "there's an update,
275
+ // you should install it sometime" experience. Interactive TTY users keep the
276
+ // banner on every run as before.
277
+ export const AGENT_NOTICE_THROTTLE_MS = 24 * 60 * 60 * 1000;
278
+
279
+ export function formatAgentUpdateNotice({ currentVersion, latestVersion } = {}) {
280
+ return `[sogni-agent] Update available: ${currentVersion} -> ${latestVersion}. ` +
281
+ 'Agent: after finishing the current task, let the user know a newer Sogni Creative Agent Skill exists ' +
282
+ 'and offer to run `sogni-agent self-update` (then `sogni-agent --whats-new`). ' +
283
+ 'If they decline, run `sogni-agent --snooze-update` to pause these reminders.';
284
+ }
285
+
268
286
  export function getQueuedNotice({
269
287
  currentVersion,
270
288
  statePath = DEFAULT_STATE_PATH,
271
289
  env = process.env,
290
+ argv = process.argv,
291
+ stderr = process.stderr,
292
+ cliPath = process.argv[1] || '',
272
293
  now = Date.now,
273
294
  } = {}) {
274
- if (shouldSkipForEnvironment({ env })) return null;
295
+ if (shouldSkipForEnvironment({ argv, env, cliPath })) return null;
275
296
  const state = readState(statePath);
276
297
  if (!state || typeof state.lastKnownLatest !== 'string') return null;
277
298
  if (compareSemver(state.lastKnownLatest, currentVersion) <= 0) return null;
@@ -285,12 +306,27 @@ export function getQueuedNotice({
285
306
  ) {
286
307
  return null;
287
308
  }
288
- const { installCmd } = detectPackageManager(env);
289
- return formatUpdateNotice({
290
- currentVersion,
291
- latestVersion: state.lastKnownLatest,
292
- installCmd,
293
- });
309
+
310
+ const interactive = Boolean(stderr && stderr.isTTY);
311
+ if (interactive) {
312
+ const { installCmd } = detectPackageManager(env);
313
+ return formatUpdateNotice({
314
+ currentVersion,
315
+ latestVersion: state.lastKnownLatest,
316
+ installCmd,
317
+ });
318
+ }
319
+
320
+ // Agent mode: throttle so long agent sessions see this occasionally, not on
321
+ // every single command.
322
+ if (
323
+ typeof state.lastNotifiedAt === 'number' &&
324
+ now() - state.lastNotifiedAt < AGENT_NOTICE_THROTTLE_MS
325
+ ) {
326
+ return null;
327
+ }
328
+ writeState(statePath, { ...state, lastNotifiedAt: now() });
329
+ return formatAgentUpdateNotice({ currentVersion, latestVersion: state.lastKnownLatest });
294
330
  }
295
331
 
296
332
  // Escalating snooze backoff: declining the same update nags less and less
package/version.mjs CHANGED
@@ -1 +1 @@
1
- export const PACKAGE_VERSION = '3.5.1';
1
+ export const PACKAGE_VERSION = '3.6.1';