npm - @sogni-ai/sogni-creative-agent-skill - Versions diffs - 3.1.1 → 3.3.0 - Mend

@sogni-ai/sogni-creative-agent-skill 3.1.1 → 3.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/README.md +19 -2
package/SKILL.md +141 -45
package/generated/creative-agent-runtime.mjs +2 -2
package/llm.txt +8 -0
package/package.json +6 -5
package/skill-package.json +1 -1
package/sogni-agent.mjs +277 -16
package/update-check.mjs +303 -0
package/version.mjs +1 -1

package/README.md CHANGED

@@ -42,6 +42,7 @@ With this skill, an agent can:
 - [Requirements](#requirements)
 - [Installation](#installation)
   - [Node CLI (default)](#node-cli-default)
+  - [Claude Code plugin](#claude-code-plugin)
   - [OpenClaw plugin](#openclaw-plugin)
   - [Hermes Agent / Manus / other frameworks](#hermes-agent--manus--other-frameworks)
   - [Manual install from source](#manual-install-from-source)
@@ -111,6 +112,17 @@ sogni-agent --version
 Then point your agent/runtime at this repository's [`SKILL.md`](./SKILL.md). When an install request is ambiguous, install the CLI and skill source together — that's the supported default.
+### Claude Code plugin
+The Claude Code plugin shells out to the `sogni-agent` CLI installed above, so both steps are required. From inside Claude Code, register the marketplace and install the plugin:
+```text
+/plugin marketplace add Sogni-AI/sogni-creative-agent-skill
+/plugin install sogni-creative-agent@sogni
+```
+The first command registers a `sogni` marketplace with one plugin entry (`sogni-creative-agent`) backed by a lean Claude-Code-focused [`plugin-skills/sogni-creative-agent/SKILL.md`](./plugin-skills/sogni-creative-agent/SKILL.md); the second installs the plugin into Claude Code. The full skill spec still lives at the repository root [`SKILL.md`](./SKILL.md).
 ### OpenClaw plugin
 For the published plugin:
@@ -275,6 +287,10 @@ sogni-agent --api-chat --task-profile reasoning --no-thinking \
   "Plan a concise multi-step product launch workflow"
 sogni-agent --list-api-models
+# Durable hosted chat run with SSE progress events
+SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat \
+  "Create a product launch storyboard and render the first hero image"
 # Durable hosted workflow (/v1/creative-agent/workflows)
 sogni-agent --api-workflow \
   --video-prompt "The camera slowly pushes in as the sketch comes alive" \
@@ -299,7 +315,7 @@ sogni-agent --get-replay run_abc123 --json
 # Opt in to SDK transport for hosted operations (durable workflows + chat).
 # Validates restEndpoint/socketEndpoint via the skill's SSRF guard, then
-# calls sogni.workflows.* / .chat.completions.* directly.
+# calls the SDK workflow/chat methods directly.
 # Falls back to the legacy SSRF-validated fetch path when the env is unset.
 export SOGNI_SKILL_USE_SDK_TRANSPORT=1
 sogni-agent --api-workflow storyboard-video "10s neon city flyover"
@@ -508,6 +524,7 @@ Hosted API modes require `SOGNI_API_KEY`.
 - **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools — best for text-first natural-language workflows. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
 - **Sogni Intelligence controls** include `--task-profile general|coding|reasoning`, `--max-tokens`, and `--thinking` / `--no-thinking`, which forward to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. Use `--list-api-models` or `--get-api-model <id>` to inspect `/v1/models`.
+- **`--durable-chat`** starts a hosted `/v1/chat/runs` record through the SDK transport. Set `SOGNI_SKILL_USE_SDK_TRANSPORT=1` before using it. The CLI streams assistant deltas and de-duplicated per-job progress / ETA / result lines from hosted run events.
 - **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Requests carry `input.steps` plus snake_case controls such as `token_type`, `media_references`, `max_estimated_capacity_units`, and `confirm_cost`.
 - **`--workflow-input`** forwards exact durable workflow JSON (`{ title?, steps: [...] }`). Use this when you need exact multi-step behavior such as repeated `replace_video_segment` steps with `replacementStartSeconds` / `replacementEndSeconds` for interleaved video slices.
 - **`--api-workflow storyboard-video`** generates a storyline, creates a single GPT Image 2 storyboard sheet, then passes that artifact into Seedance as the video reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low/medium/high quality for that storyboard sheet.
@@ -601,7 +618,7 @@ With both repos checked out as siblings, refresh the generated runtime before pu
 npm run sync:creative-agent-runtime
 ```
-Reusable workflow rules should be added to `../sogni-creative-agent` first, then synced here. Keep storyboard planning, tool argument validation, prompt linting, typed media turn intent, and typed repair/control semantics aligned with `sogni-chat`, `sogni-client`, and `sogni-api` hosted chat/workflow endpoints rather than recreating skill-only regex guards. Prefer generated or copied shared helpers for hosted workflow compilation, schema argument validation, `CreativeTurnPlannerFields` / `classifyMediaTurnIntent()` media-routing contracts, repair-control decisions, and guard telemetry summaries over skill-local guard code — this keeps public-agent behavior close to `/v1/chat/completions` and `/v1/creative-agent/workflows`.
+Reusable workflow rules should come from the shared Sogni runtime before they are synced into this public package. Keep storyboard planning, tool argument validation, prompt linting, media-routing decisions, chat-run progress extraction, and repair/control behavior aligned with the hosted `/v1/chat/completions` and `/v1/creative-agent/workflows` APIs. Prefer typed helpers exported by `@sogni-ai/sogni-intelligence-client` or the generated runtime over new skill-local regex guards.
 Public-skill regex should stay limited to CLI argument/fact extraction such as file paths, URLs, extensions, dimensions, durations, and explicit positions. Hosted-style decisions such as latest-video continuation, uploaded-video modification, image-selection waits, stitch-after-batch state, and repair/control routing belong upstream in typed planner/runtime fields before they are synced here.
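Reviewer note: the new `--durable-chat` README entry says the CLI prints "de-duplicated per-job progress / ETA / result lines". The sketch below illustrates one way such de-duplication could work. It is not the CLI's actual code: the event shape (`jobId`, `progress`, `etaSeconds`) and the helper name `createProgressPrinter` are assumptions made for this example.

```javascript
// Hypothetical event shape: { jobId, progress (0..1), etaSeconds }.
// Illustration only; the real hosted run events may look different.
function createProgressPrinter(write) {
  const lastLineByJob = new Map(); // jobId -> last line printed for that job

  return function onEvent(event) {
    const pct = Math.round(event.progress * 100);
    const eta = event.etaSeconds == null ? "?" : `${Math.round(event.etaSeconds)}s`;
    const line = `[${event.jobId}] ${pct}% (ETA ${eta})`;

    // Skip the event when it would print the same line as last time.
    if (lastLineByJob.get(event.jobId) === line) return false;
    lastLineByJob.set(event.jobId, line);
    write(line);
    return true;
  };
}

const printed = [];
const onEvent = createProgressPrinter((line) => printed.push(line));
onEvent({ jobId: "job_a", progress: 0.101, etaSeconds: 30 });
onEvent({ jobId: "job_a", progress: 0.104, etaSeconds: 30 }); // rounds to the same line
onEvent({ jobId: "job_b", progress: 0.5, etaSeconds: null });
onEvent({ jobId: "job_a", progress: 0.25, etaSeconds: 22 });
console.log(printed.join("\n"));
```

Keying the cache by job keeps interleaved jobs from suppressing each other's updates, while repeated events for one job collapse into a single line.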

package/SKILL.md CHANGED

@@ -128,12 +128,16 @@ Path override environment variables:
 ## Recommended path: route through the hosted Sogni Intelligence endpoints
-For any natural-language creative request — anything that should be planned, multi-step, or that benefits from tool selection, repair, or durable workflows — prefer the hosted endpoints over the direct-to-SDK flags. The hosted endpoints are the canonical home for tool dispatch, Structured Contracts v1 (gating policies, repair recipes, prompt contracts), durable workflows, replay, and asset-manifest mapping. They stay aligned with `sogni-chat` and the rest of the `@sogni/creative-agent` consumers automatically.
+For any natural-language creative request — anything that should be planned, multi-step, resumable, or that benefits from tool selection, repair, or durable workflows — prefer the hosted Sogni Intelligence endpoints over the direct-to-SDK media flags. The hosted surfaces are the canonical home for OpenAI-compatible chat, server-side creative tool dispatch, Structured Contracts v1 (gating policies, repair recipes, prompt contracts), durable chat runs, durable workflows, workflow templates, replay, and asset-manifest mapping. They stay aligned with `sogni-chat`, `sogni-api`, and the rest of the `@sogni/creative-agent` consumers.
 ```bash
 # Natural-language creative request (LLM picks the tool, dispatches, repairs)
 sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
+# Durable hosted chat run (persisted event log + SSE stream)
+SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat \
+  "Create a four-shot launch campaign, generate the key art, and animate the hero clip"
 # Multi-step durable workflow (resumable, replay-friendly, server-orchestrated)
 sogni-agent --api-workflow \
   --video-prompt "The camera slowly pushes in" \
@@ -196,11 +200,15 @@ sogni-agent --api-chat --ref product.jpg \
 # Sogni Intelligence model/replay utilities
 sogni-agent --list-api-models
-sogni-agent --api-chat --task-profile reasoning --no-thinking \
+sogni-agent --api-chat --task-profile reasoning --max-tokens 2000 \
   "Plan a concise multi-step product launch workflow"
 sogni-agent --list-replays 20
 sogni-agent --get-replay run_abc123 --json
+# Draft a savable workflow template through the hosted creative-agent tool loop
+sogni-agent --api-chat \
+  "Design a reusable workflow for a 9:16 product teaser from one product photo"
 # Durable API workflow: generated keyframe to video with resumable workflow record
 sogni-agent --api-workflow \
   --video-prompt "The camera slowly pushes in as the sketch comes alive" \
@@ -214,43 +222,128 @@ sogni-agent --api-workflow \
   "Animate the referenced sketch"
 # Exact durable workflow input with explicit steps
-sogni-agent --api-workflow --workflow-input @workflow.json
+sogni-agent --api-workflow --workflow-input @workflow-input.json \
+  --workflow-idempotency-key product-teaser-v1
 # Durable storyboard-video workflow: storyline -> GPT Image 2 storyboard -> Seedance
 sogni-agent --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
   "Create a 9:16 bakery launch video with a neon street-window reveal"
+# Workflow management
+sogni-agent --list-workflows
+sogni-agent --resume-workflow wf_durable_workflow_123
 ```
 Use `--api-chat` for text-first natural-language workflows that should go through
-Sogni API's OpenAI-compatible `/v1/chat/completions` tool loop. This path
-sanitizes prompt-injection markers before forwarding messages and uses the
-current hosted creative-agent tool surface. Use `--api-workflow` when the caller
-already knows it wants an async durable workflow record under
-`/v1/creative-agent/workflows`. Use `--workflow-input @workflow.json` when the
-caller already has exact durable workflow input with `steps`; the skill forwards
-that body to the API as-is. This is the preferred hosted path for
-exact multi-step plans, including repeated `replace_video_segment` operations
-with `replacementStartSeconds` / `replacementEndSeconds` when interleaving
-existing video slices. Use `--api-workflow storyboard-video`
-when the caller wants the hosted sequence to generate a storyline, create one GPT
-Image 2 storyboard sheet, and feed that image artifact into Seedance as the video
-reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low|medium|high
-quality for the storyboard sheet. Hosted API requests forward media references
-from `-c`, `--ref`, `--ref-end`, `--ref-audio`,
-`--reference-audio-identity`, and `--ref-video` as `media_references`
-metadata; workflow JSON can bind them into step arguments with
-`sourceStepId: "$input_media"`, and API chat also attaches image refs as vision
-inputs. Local file references are uploaded to Sogni media storage first, then
-forwarded as retrievable URLs for hosted chat and durable workflows. Use the
-direct CLI path for private media that must not leave the local machine.
-Use `--workflow-max-cost <n>` plus `--confirm-cost` / `--no-confirm-cost` to
-forward explicit workflow cost policy.
-Sogni Intelligence utilities are exposed through the same API key path:
-`--list-api-models` / `--get-api-model <id>` read `/v1/models`,
-`--task-profile`, `--max-tokens`, and `--thinking` / `--no-thinking` tune
-`/v1/chat/completions`, and `--list-replays`, `--get-replay`, and
-`--ingest-replay` manage `/v1/replay/records` RunRecords for replay/debug
-viewers.
+Sogni API's OpenAI-compatible `POST /v1/chat/completions` loop. The public
+REST body uses snake_case controls such as `tool_choice`, `response_format`,
+`task_profile`, `token_type`, `app_source`, `media_references`,
+`chat_template_kwargs`, `sogni_tools`, and `sogni_tool_execution`. The endpoint
+normalizes OpenAI `developer` messages to `system`; when a developer message is
+present and no explicit `task_profile` is supplied, the server treats the task
+as `coding`. The CLI sanitizes prompt-injection markers before forwarding
+messages and sends API-key auth so hosted Sogni tools can execute server-side.
+Hosted tool surfaces are split by `sogni_tools`:
+- `creative-tools` is the public API default when `sogni_tools` is omitted or
+  true. It exposes generation/editing tools (`generate_image`,
+  `generate_video`, `generate_music`, `edit_image`, `apply_style`,
+  `restore_photo`, `refine_result`, `animate_photo`, `change_angle`,
+  `video_to_video`, `stitch_video`, `orbit_video`, `dance_montage`,
+  `sound_to_video`, `extend_video`, `replace_video_segment`, `overlay_video`,
+  `add_subtitles`), media-analysis tools (`analyze_image`, `analyze_video`,
+  `extract_metadata`), and lightweight composition tools (`enhance_prompt`,
+  `compose_lyrics`, `compose_instrumental`, `compose_script`).
+- `creative-agent` is this CLI's default for `--api-chat`. It includes the
+  `creative-tools` surface plus session-control tools
+  (`ask_clarifying_question`, `finalize_response`), asset-manifest tools
+  (`create_asset_manifest`, `inspect_asset`, `label_asset`,
+  `map_assets_for_model`, `validate_asset_references`), and durable planning
+  tools (`compose_workflow`, `compose_workflow_template`). Use this surface
+  when the model should design one-shot workflow plans, draft savable workflow
+  templates, or maintain stable asset references across a multi-step turn.
+- `none` disables Sogni tool injection and leaves only caller-supplied OpenAI
+  tools on raw API/SDK requests. In the CLI, use it with
+  `--no-api-tool-execution` when you want text-only planning without hosted
+  Sogni tool dispatch.
+Use `--durable-chat` for long-running, LLM-in-the-loop turns that should be
+persisted as `POST /v1/chat/runs` records instead of a single
+`/v1/chat/completions` request. Chat runs keep an event log, stream via
+`/v1/chat/runs/:id/events/stream`, support cancellation, and can pause for
+persisted cost approval (`/v1/chat/runs/:id/confirm-cost`) in first-party
+clients. The CLI can start and stream durable chat runs through the SDK
+transport when `SOGNI_SKILL_USE_SDK_TRANSPORT=1` is set.
+Use `--api-workflow` when the caller already knows it wants an async durable
+workflow under `POST /v1/creative-agent/workflows`. The API now accepts either
+an inline durable plan (`input.steps`) or a saved workflow template invocation
+(`workflow_id` plus `inputs`) and rejects requests that provide both. The CLI's
+generated-keyframe and `storyboard-video` presets submit inline `input.steps`;
+`--workflow-input @workflow-input.json` supplies that `input` object directly.
+Saved template CRUD lives at `/v1/creative-agent/workflows/templates`, and a
+saved template can later be run by API/SDK callers with `workflow_id + inputs`.
+Use `compose_workflow_template` through `--api-chat` to draft a savable template;
+the caller is still responsible for persisting the returned `template_draft`.
+Exact multi-step workflow plans should use explicit step dependencies, including
+`replace_video_segment` steps with bounded `replacementStartSeconds` /
+`replacementEndSeconds` when interleaving existing video slices. Workflow JSON
+can bind request media into step arguments with `sourceStepId: "$input_media"`.
+Use `--api-workflow storyboard-video` when the hosted sequence should generate a
+storyline, create one GPT Image 2 storyboard sheet, and feed that image artifact
+into Seedance as the video reference. The `-Q fast|hq|pro` preset maps to GPT
+Image 2 low|medium|high quality for the storyboard sheet.
+Hosted API requests forward media references from `-c`, `--ref`, `--ref-end`,
+`--ref-audio`, `--reference-audio-identity`, and `--ref-video` as
+`media_references` metadata. `--ref-audio` and `--ref-video` are repeatable in
+api-chat / durable-chat mode — each entry uploads independently and is exposed
+to the hosted LLM at `@Audio1` / `@Audio2` / `@Video1` etc. API chat also
+attaches image refs as vision inputs. Local file references are uploaded to
+Sogni media storage first, then forwarded as retrievable URLs for hosted chat
+and durable workflows. Use the direct CLI path for private media that must not
+leave the local machine.
+### Seedance reference modes (mutually exclusive)
+When `--video -m seedance2` or `-m seedance2-fast` is selected, the skill
+exposes the same two-mode pattern that the hosted chat surfaces. Pick one
+mode per video request:
+- **Dedicated frame mode — `--ref` and/or `--ref-end`.** First-class
+  first-frame / last-frame anchoring; the Seedance worker pins them as
+  parameter-mode firstFrame / lastFrame. Max 2 images.
+- **Loose reference mode — `-c/--context` plus optional `--ref-audio`
+  extras and `--ref-video` extras.** Anchor frame intent in the prompt with
+  `@Image1` / `@Image2` / `@Video1` / `@Audio1` etc. (e.g. *"Use @Image1 as
+  the opening shot reference"*). Supports up to 9 image refs, 3 video refs,
+  3 audio refs, and 12 total reference assets per video request. The
+  numeric caps come from the canonical
+  `@sogni-ai/sogni-protocol/catalogs/seedance-reference-limits.json` catalog,
+  surfaced through `@sogni-ai/sogni-intelligence-client/tools` as
+  `SEEDANCE_REFERENCE_LIMITS` and `validateSeedanceReferenceCounts()`.
+Combining `--ref` / `--ref-end` with `-c/--context` on Seedance is rejected
+client-side with a clear error pointing to the correct mode. In CLI direct-gen
+mode, additional `--ref-audio` / `--ref-video` entries beyond the first must
+be HTTPS URLs (the primary entry can still be a local file path); for local
+multi-file Seedance uploads, use `--api-chat` / `--durable-chat` instead. Use
+`--workflow-max-cost <n>` plus `--confirm-cost` / `--no-confirm-cost` to forward
+explicit workflow cost policy, and `--workflow-idempotency-key` when retrying a
+workflow start request.
+Sogni Intelligence utilities are exposed through the same API-key path:
+`--list-api-models` / `--get-api-model <id>` read `/v1/models`, `--task-profile`
+and `--max-tokens` tune `/v1/chat/completions`, and `--list-replays`,
+`--get-replay`, and `--ingest-replay` manage `/v1/replay/records` RunRecords for
+replay/debug viewers. The public chat endpoint also accepts OpenAI-standard
+`reasoning_effort` / `reasoning.effort` in raw API requests. The CLI's
+`--thinking` / `--no-thinking` flags are forwarded as
+`chat_template_kwargs.enable_thinking`; current hosted Qwen requests may
+normalize thinking on server-side, so do not rely on `--no-thinking` as a hard
+suppression switch for `/v1/chat/completions`.
 Hosted API modes require `SOGNI_API_KEY`; this skill's CLI uses API-key
 authentication.
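Reviewer note: the new "Seedance reference modes" text in this hunk describes two mutually exclusive modes and numeric caps (9 image, 3 video, 3 audio, 12 total). The sketch below restates those rules as a small check. It is only an illustration and is not the package's `validateSeedanceReferenceCounts()`: the function name `checkSeedanceRefs`, its option names, and the choice to count frame images toward the total are assumptions.

```javascript
// Caps copied from the SKILL.md text in this hunk; the canonical values live in
// the seedance-reference-limits.json catalog. Every name below is hypothetical.
const LIMITS = { image: 9, video: 3, audio: 3, total: 12 };

function checkSeedanceRefs({ ref, refEnd, context = [], refVideo = [], refAudio = [] } = {}) {
  const errors = [];
  const frameCount = [ref, refEnd].filter(Boolean).length;

  // Dedicated frame mode and loose reference mode are mutually exclusive.
  if (frameCount > 0 && context.length > 0) {
    errors.push("--ref/--ref-end cannot be combined with -c/--context on Seedance");
  }
  if (context.length > LIMITS.image) errors.push(`too many image refs (max ${LIMITS.image})`);
  if (refVideo.length > LIMITS.video) errors.push(`too many video refs (max ${LIMITS.video})`);
  if (refAudio.length > LIMITS.audio) errors.push(`too many audio refs (max ${LIMITS.audio})`);

  // Assumption: frame images count toward the overall asset total.
  const total = frameCount + context.length + refVideo.length + refAudio.length;
  if (total > LIMITS.total) errors.push(`too many reference assets (max ${LIMITS.total})`);

  return errors;
}

console.log(checkSeedanceRefs({ ref: "first.png", context: ["style.png"] }));
console.log(checkSeedanceRefs({ context: ["a.png", "b.png"], refAudio: ["voice.mp3"] }));
```

Returning a list of messages instead of throwing on the first problem lets a caller report every violated rule in one pass.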
@@ -263,15 +356,15 @@ SSRF-validated fetch path. The skill's `sogni-hosted-client.mjs`
 factory still validates `restEndpoint` / `socketEndpoint` against the
 SSRF guard before constructing the SDK client, so the safety contract
 holds.
+For `--durable-chat`, stream output as the run advances; the CLI reports
+assistant deltas plus de-duplicated per-job progress / ETA / result lines from
+hosted run events.
 When changing hosted API chat/workflow behavior, keep reusable validation,
-workflow compilation, repair-control, and guard telemetry logic in
-`../sogni-creative-agent` first. The public skill should consume generated or
-copied shared contracts instead of adding skill-local regex guards. Media-routing
-decisions should come from typed planner/runtime contracts such as
-`CreativeTurnPlannerFields`, `classifyMediaTurnIntent()`, `videoContinuation`,
-`videoModification`, `outputGrouping`, `imageSelectionPolicy`, and
-`pendingStitchAfterBatch`; regex is appropriate only for bounded CLI/fact
+workflow compilation, repair-control, and guard telemetry logic in the shared
+Sogni runtime first, then sync it into this public skill. The public skill
+should consume generated or shared typed contracts instead of adding
+skill-local regex guards. Keep local regex limited to bounded CLI/fact
 extraction such as paths, URLs, extensions, dimensions, durations, and explicit
 positions.
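Reviewer note: this hunk limits skill-local regex to "bounded CLI/fact extraction such as paths, URLs, extensions, dimensions, durations, and explicit positions". A minimal sketch of what such bounded extraction might look like follows. The helper name `extractFacts` and both patterns are assumptions for illustration and are not taken from the package.

```javascript
// Hypothetical helper: pulls only literal facts (dimensions, duration) out of
// free text. It makes no routing or intent decisions.
function extractFacts(text) {
  // Dimensions such as "1024x768" or "1024 × 768".
  const dim = /\b(\d{2,5})\s*[x×]\s*(\d{2,5})\b/.exec(text);
  // Durations such as "12s", "12 sec", or "4.5 seconds".
  const dur = /\b(\d+(?:\.\d+)?)\s*(?:s|sec|seconds?)\b/i.exec(text);

  return {
    width: dim ? Number(dim[1]) : null,
    height: dim ? Number(dim[2]) : null,
    durationSeconds: dur ? Number(dur[1]) : null,
  };
}

const facts = extractFacts("make a 1024x768 clip, 12s long");
console.log(facts);
```

The patterns only recognize explicit values the user typed, which matches the stated split: facts are extracted locally, and decisions such as continuation or repair stay in the shared typed contracts.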
@@ -358,13 +451,15 @@ positions.
 | `--concat-audio <path>` | Optional audio track to mux over `--concat-videos` output | - |
 | `--concat-audio-start <sec>` | Start offset into `--concat-audio` | - |
 | `--list-media [type]` | List recent inbound media (images\|audio\|all) | images |
-| `--api-chat` | Call `/v1/chat/completions` with Sogni creative-agent tool injection | - |
-| `--api-tools <mode>` | API tool mode: creative-agent\|creative-tools\|none | creative-agent |
+| `--api-chat` | Call OpenAI-compatible `/v1/chat/completions`; CLI default sends the hosted `creative-agent` tool surface | - |
+| `--durable-chat` | Start and stream a durable `/v1/chat/runs` record through SDK transport; requires `SOGNI_SKILL_USE_SDK_TRANSPORT=1` | - |
+| `--api-tools <mode>` | API tool mode: creative-agent\|creative-tools\|none. CLI default is creative-agent; raw API default is creative-tools. | creative-agent |
 | `--no-api-tool-execution` | Plan/tool-call via API chat without executing Sogni tools | - |
 | `--llm-model <id>` | LLM model for `--api-chat` | qwen3.6-35b-a3b-gguf-iq4xs |
 | `--task-profile <profile>` | Sogni Intelligence task profile: general\|coding\|reasoning | - |
 | `--max-tokens <n>` | Max hosted chat completion tokens | 1600 |
-| `--thinking`, `--no-thinking` | Toggle `chat_template_kwargs.enable_thinking` for hosted chat | server default |
+| `--thinking`, `--no-thinking` | Forward `chat_template_kwargs.enable_thinking` for hosted chat; current public Qwen requests may normalize thinking on server-side | server default |
+| `--system <text>` | Override the base system prompt for hosted chat | built-in creative assistant prompt |
 | `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM model metadata | - |
 | `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|@path>` | Manage Sogni Intelligence replay RunRecords. List/get output is run through `redactRunRecord` from `@sogni/creative-agent/replay` before printing, so signed URLs, bearer tokens, JWTs, and PEM blocks cannot leak via the CLI. Use `@path` to load JSON from a file. | - |
 | `--skip-redact`, `--no-redact` | Bypass the replay redactor on `--list-replays` / `--get-replay`. Debug-only — emits unredacted RunRecord payloads. | redacted |
@@ -376,9 +471,10 @@ positions.
 | `--storyboard-plan-frames <n>` | Frame count for `--storyboard-plan`. | inferred |
 | `--storyboard-plan-model <id>` | Adapter target for `--storyboard-plan` (seedance, seedance2, gpt-image-2, ltx23, wan). | inferred |
 | `--storyboard-plan-stage <stage>` | Compilation stage for `--storyboard-plan` (storyboard_image, scene_clip). | storyboard_image |
-| `--api-workflow` | Start a durable workflow with explicit `input.steps`; optional `storyboard-video` preset | - |
-| `--workflow-input <json\|@path>` | Durable workflow input JSON. Use `@path` to load from a file. | - |
+| `--api-workflow` | Start `/v1/creative-agent/workflows` with generated inline `input.steps`; optional `storyboard-video` preset | - |
+| `--workflow-input <json\|@path>` | Durable workflow `input` JSON for the start request. Use `@path` to load from a file. | - |
 | `--workflow-title <text>` | Title for generated or storyboard durable workflow input | - |
+| `--workflow-idempotency-key <key>`, `--idempotency-key <key>` | Reuse safely when retrying a durable workflow start request | - |
 | `--workflow-max-cost <n>` | Reject hosted workflow starts above this estimated capacity-unit ceiling | - |
 | `--confirm-cost`, `--no-confirm-cost` | Forward explicit hosted workflow cost confirmation | - |
 | `--storyboard-frames <n>` | Beat count for storyboard-video workflow | - |
@@ -387,7 +483,7 @@ positions.
 | `--generate-audio`, `--no-generate-audio` | Toggle audio generation for generated video steps | - |
 | `--expand-prompt`, `--no-expand-prompt` | Toggle prompt expansion for generated video steps | - |
 | `--watch-workflow` | Stream durable workflow events after start | - |
-| `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>` | Durable workflow management helpers | - |
+| `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>`, `--resume-workflow <id>` | Durable workflow management helpers | - |
 | `--api-base-url <url>` | Sogni API base for hosted API modes. Credentials are only sent to `https://api.sogni.ai` by default; use `SOGNI_API_ALLOWED_HOSTS` for trusted custom hosts or `SOGNI_ALLOW_UNSAFE_API_BASE_URL=1` for isolated local testing. | https://api.sogni.ai |
 | `--no-filter` | Disable NSFW content filter | - |
 | `--memory-set <key> <value>` | Save a user preference | - |
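Reviewer note: the updated SKILL.md states that `POST /v1/creative-agent/workflows` accepts either an inline plan (`input.steps`) or a saved template invocation (`workflow_id` plus `inputs`) and rejects requests that provide both. The sketch below shows a client-side pre-check of that rule. The function name `classifyWorkflowStart`, the step shape, and the handling of a body with neither form are assumptions, since the diff does not describe them.

```javascript
// Hypothetical pre-flight check mirroring the "inline plan OR saved template,
// never both" rule quoted from SKILL.md. Not the package's own validation.
function classifyWorkflowStart(body) {
  const hasInline = Array.isArray(body?.input?.steps);
  const hasTemplate = typeof body?.workflow_id === "string";

  if (hasInline && hasTemplate) {
    throw new Error("provide either input.steps or workflow_id + inputs, not both");
  }
  if (hasInline) return "inline";
  if (hasTemplate) return "template";
  return "invalid"; // assumption: the server's response to this case is not documented here
}

// The step object is a placeholder; real step fields are not shown in this diff.
console.log(classifyWorkflowStart({ input: { title: "Teaser", steps: [{ id: "step_1" }] } }));
console.log(classifyWorkflowStart({ workflow_id: "wf_template_123", inputs: { product: "photo.jpg" } }));
```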

package/generated/creative-agent-runtime.mjs CHANGED

@@ -2285,7 +2285,7 @@ const PROMPT_CONTRACTS = [
         "contractId": "sound_to_video_v1",
         "version": "1.0.0",
         "toolName": "sound_to_video",
-        "baseDescription": "sound_to_video creates audio-synced video from an audio source. Works with uploaded audio\nfiles (mp3, m4a, wav) OR previously generated music from generate_music (auto-detected).\n\nWhen the user asks to \"turn that song/music into a video\" after generate_music, use\nsound_to_video — it will automatically find the generated audio.\n\nFor music visualization (syncing video to a specific song or audio track), use the\ngenerate_music → sound_to_video pipeline. Do NOT use animate_photo or generate_video for\naudio-driven visualization.\n\nanimate_photo and generate_video produce audio natively via LTX 2.3 — never pre-generate\naudio for those tools. sound_to_video is only for when the audio IS the primary creative\ninput driving the video output.",
+        "baseDescription": "sound_to_video creates audio-synced video from an audio source. Works with uploaded audio\nfiles (mp3, m4a, wav) OR previously generated music from generate_music (auto-detected).\n\nWhen the user asks to \"turn that song/music into a video\" after generate_music, use\nsound_to_video — it will automatically find the generated audio.\n\nFor music visualization (syncing video to a specific song or audio track), use the\ngenerate_music → sound_to_video pipeline. Do NOT use animate_photo or generate_video for\naudio-driven visualization.\n\nanimate_photo and generate_video produce audio natively via LTX 2.3 — never pre-generate\naudio for those tools. sound_to_video is only for when the audio IS the primary creative\ninput driving the video output.\n\nPERSONA VOICE CLIPS: Persona voice clips returned by resolve_personas are voice identity\nreferences for LTX Audio ID, not audio tracks to synchronize. If the user explicitly asks\nfor a registered/persona/reference voice or voice clone, call resolve_personas first, then\nuse generate_video with ltx23 and voicePersonaName when there is no image source, or\nanimate_photo with ltx23 and voicePersonaName when an image/source frame should be\nanimated. Do not call sound_to_video for a persona voice clip unless a separate uploaded\naudio track is the primary sync driver.",
         "parameterDocs": {
             "audioSource": "Uploaded audio file or reference to a prior generate_music result. Auto-detected when omitted after generate_music."
         }
@@ -2351,7 +2351,7 @@ const PROMPT_CONTRACTS = [
         "contractId": "resolve_personas_v1",
         "version": "1.0.0",
         "toolName": "resolve_personas",
-        "baseDescription": "resolve_personas is the required first step when the user explicitly names a saved Persona\nor says to use a Persona Image, Persona reference photo, Persona Voice, registered voice,\nor voice clone. Do not answer in prose, ask a follow-up, or finalize before calling this\ntool when a listed Persona name is present.\n\nDIRECT PERSONA IMAGE / VOICE VIDEO: If the user says to use the Persona image/reference\ndirectly/originally, call resolve_personas first, then call animate_photo using the injected\npersona photo as an uploaded image index. For one named Persona, use sourceImageIndex=-1\nor sourceImageIndices=[-1,...] for a multi-clip batch. If Persona Voice was explicitly\nrequested, set voicePersonaName to the exact resolved Persona name and use an LTX model.\nDo not call generate_video for Persona image/voice videos. Do not generate a new image first\nwhen the user explicitly requested the existing Persona image directly.\n\nMULTI-CLIP PERSONA BATCHES: If the user asks for several separate clips from the same\nPersona Image, make one animate_photo call after resolve_personas with repeated persona\nsource indices, one prompt per clip, and the requested per-clip duration. If the user asks\nto stitch the clips, call stitch_video with the returned video indices after animate_photo.",
+        "baseDescription": "resolve_personas is the required first step when the user explicitly names a saved Persona\nor says to use a Persona Image, Persona reference photo, Persona Voice, registered voice,\nor voice clone. Do not answer in prose, ask a follow-up, or finalize before calling this\ntool when a listed Persona name is present.\n\nDIRECT PERSONA IMAGE / VOICE VIDEO: If the user says to use the Persona image/reference\ndirectly/originally, call resolve_personas first, then call animate_photo using the injected\npersona photo as an uploaded image index. For one named Persona, use sourceImageIndex=-1\nor sourceImageIndices=[-1,...] for a multi-clip batch. If Persona Voice was explicitly\nrequested, set voicePersonaName to the exact resolved Persona name and use ltx23. Use\nanimate_photo when an image/source frame should be animated. Use generate_video with\nltx23 when the user requested a persona voice video but no image source exists yet.\nDo not call sound_to_video for Persona voice clips; they are voice identity references,\nnot the audio track to synchronize. Do not generate a new image first when the user\nexplicitly requested the existing Persona image directly.\n\nMULTI-CLIP PERSONA BATCHES: If the user asks for several separate clips from the same\nPersona Image, make one animate_photo call after resolve_personas with repeated persona\nsource indices, one prompt per clip, and the requested per-clip duration. If the user asks\nto stitch the clips, call stitch_video with the returned video indices after animate_photo.",
         "parameterDocs": {
             "names": "Persona names to load. Use the exact listed Persona name; call this before any Persona image/voice video or image generation."
         }
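Reviewer note: the two revised prompt contracts change the persona-voice routing. `resolve_personas` still runs first, then `animate_photo` with `ltx23` is used when an image or source frame exists and `generate_video` with `ltx23` when it does not, and `sound_to_video` is no longer used for persona voice clips. The sketch below restates that rule as a plain function. In the package this is guidance for the hosted LLM rather than code, so `planPersonaVoiceVideo` and the `model` argument key are assumptions. Only `names` and `voicePersonaName` appear in the diff.

```javascript
// Hypothetical restatement of the routing described in the updated
// baseDescription strings. Illustration only.
function planPersonaVoiceVideo({ personaName, hasImageSource }) {
  // An image or source frame means animate it; otherwise generate from text.
  const videoTool = hasImageSource ? "animate_photo" : "generate_video";
  return [
    { tool: "resolve_personas", args: { names: [personaName] } },
    { tool: videoTool, args: { model: "ltx23", voicePersonaName: personaName } },
  ];
}

const plan = planPersonaVoiceVideo({ personaName: "Ava", hasImageSource: false });
console.log(plan.map((step) => step.tool).join(" -> "));
```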

package/llm.txt CHANGED

@@ -11,6 +11,11 @@ runtimes.
 npm install -g @sogni-ai/sogni-creative-agent-skill@latest
 sogni-agent --version
+# Claude Code plugin (requires the CLI install above; the plugin shells out to sogni-agent).
+# Run both slash commands from inside Claude Code:
+#   /plugin marketplace add Sogni-AI/sogni-creative-agent-skill
+#   /plugin install sogni-creative-agent@sogni
 # OpenClaw plugin
 openclaw plugins install sogni-creative-agent-skill
@@ -67,6 +72,9 @@ for `/v1/chat/completions` with rich creative-agent tools and sanitized
 message forwarding, or
 `sogni-agent --api-workflow --video-prompt "motion" "image prompt"`
 for durable `/v1/creative-agent/workflows` execution.
+Use `SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat "prompt"` for
+durable `/v1/chat/runs` execution with SSE assistant deltas and per-job
+progress / ETA / result events.
 Sogni Intelligence utilities are available with `--list-api-models`,
 `--get-api-model <id>`, `--task-profile general|coding|reasoning`,
 `--max-tokens <n>`, `--thinking` / `--no-thinking`, `--list-replays [n]`,

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@sogni-ai/sogni-creative-agent-skill",
-  "version": "3.1.1",
+  "version": "3.3.0",
   "description": "Sogni Creative Agent Skill: agent skill and CLI for Sogni AI image, video, and music generation.",
   "type": "module",
   "main": "sogni-agent.mjs",
@@ -62,11 +62,12 @@
     "openclaw.plugin.json",
     "env.mjs",
     "ssrf-guard.mjs",
+    "update-check.mjs",
     "generated/creative-agent-runtime.mjs",
     "sogni-agent.mjs"
   ],
   "dependencies": {
-    "@sogni-ai/sogni-intelligence-client": "^2.2.6",
+    "@sogni-ai/sogni-intelligence-client": "^2.4.0",
     "execa": "^9.6.1",
     "json5": "^2.2.3",
     "sharp": "^0.34.5"
@@ -77,11 +78,11 @@
     ]
   },
   "devDependencies": {
-    "@commitlint/cli": "^20.5.3",
-    "@commitlint/config-conventional": "^20.5.3",
+    "@commitlint/cli": "^21.0.1",
+    "@commitlint/config-conventional": "^21.0.1",
     "@semantic-release/changelog": "^6.0.3",
     "@semantic-release/git": "^10.0.1",
     "husky": "^9.1.7",
-    "semantic-release": "^24.2.9"
+    "semantic-release": "^25.0.3"
   }
 }

package/skill-package.json CHANGED

@@ -3,7 +3,7 @@
   "private": true,
   "type": "module",
   "dependencies": {
-    "@sogni-ai/sogni-intelligence-client": "^2.2.6",
+    "@sogni-ai/sogni-intelligence-client": "^2.4.0",
     "execa": "^9.6.1",
     "json5": "^2.2.3",
     "sharp": "^0.34.5"

package/sogni-agent.mjs CHANGED

@@ -14,6 +14,13 @@ import sharp from 'sharp';
 import { getEnv, hasEnv } from './env.mjs';
 import { PACKAGE_VERSION } from './version.mjs';
 import { assertSafeUrl } from './ssrf-guard.mjs';
+import {
+  INTERNAL_FLAG as UPDATE_CHECK_INTERNAL_FLAG,
+  runForegroundCheck as runUpdateCheckForeground,
+  maybeSpawnBackgroundCheck as maybeSpawnUpdateCheck,
+  getQueuedNotice as getUpdateCheckNotice,
+  runSelfUpdate as runSogniSelfUpdate,
+} from './update-check.mjs';
 import {
   LTX23_WORKFLOW_MODELS,
   PUBLIC_SKILL_DEFAULT_TOOL_DEFINITIONS,
@@ -56,6 +63,14 @@ import {
   redactPayload,
   redactRunRecord
 } from '@sogni-ai/sogni-intelligence-client/replay';
+import {
+  extractToolCallProgressUpdate
+} from '@sogni-ai/sogni-intelligence-client/chatRun';
+import {
+  SEEDANCE_REFERENCE_LIMITS,
+  SeedanceReferenceLimitError,
+  validateSeedanceReferenceCounts
+} from '@sogni-ai/sogni-intelligence-client/tools';
 const require = createRequire(import.meta.url);
 const rootClientModule = process.env.SOGNI_AGENT_TEST_STATE_PATH
@@ -121,6 +136,26 @@ const IS_OPENCLAW_INVOCATION = Boolean(getEnv('OPENCLAW_PLUGIN_CONFIG'));
 const RAW_ARGS = process.argv.slice(2);
 const CLI_WANTS_JSON = RAW_ARGS.includes('--json');
 const JSON_ERROR_MODE = CLI_WANTS_JSON || IS_OPENCLAW_INVOCATION;
+// --- Update-check entry points --------------------------------------------
+// Internal mode: the detached background child that fetches the npm registry.
+if (RAW_ARGS[0] === UPDATE_CHECK_INTERNAL_FLAG) {
+  await runUpdateCheckForeground({ currentVersion: PACKAGE_VERSION });
+  process.exit(0);
+}
+// User-facing subcommand: `sogni-agent self-update`
+if (RAW_ARGS[0] === 'self-update') {
+  process.exit(runSogniSelfUpdate({}));
+}
+// Fire-and-forget background check (no-op when throttled or skipped)
+try { maybeSpawnUpdateCheck({ cliPath: process.argv[1] }); } catch { /* never break the CLI */ }
+// Trailing notice on exit, if a newer version is on file
+process.on('exit', () => {
+  try {
+    const notice = getUpdateCheckNotice({ currentVersion: PACKAGE_VERSION });
+    if (notice) process.stderr.write(notice + '\n');
+  } catch { /* never break exit */ }
+});
 const SOCKET_EVENT_SUBSCRIPTIONS = Object.freeze({
   modelAvailability: false
 });
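Note: the "no-op when throttled" behaviour mentioned in the hunk above comes from the freshness test in `update-check.mjs`, which compares `lastCheckedAt` in the state file against a 24-hour default window. A minimal sketch of that test as a pure function follows; `isCheckDue` is a hypothetical name, not an export of the package.

```javascript
// Default window, matching DEFAULT_THROTTLE_MS in update-check.mjs (24h).
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: true when a new background check should be spawned.
// `state` is the parsed state file, or null when it is missing or unreadable.
function isCheckDue(state, nowMs, throttleMs = ONE_DAY_MS) {
  if (!state || typeof state.lastCheckedAt !== 'number') return true;
  return nowMs - state.lastCheckedAt >= throttleMs;
}

const now = Date.now();
console.log(isCheckDue(null, now));                                   // no state yet
console.log(isCheckDue({ lastCheckedAt: now - 60 * 1000 }, now));     // checked a minute ago
console.log(isCheckDue({ lastCheckedAt: now - 2 * ONE_DAY_MS }, now)); // checked two days ago
```

Because `runForegroundCheck` also records `lastCheckedAt` when the registry request fails, an offline machine waits out the same window before retrying.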
@@ -1108,12 +1143,14 @@ const options = {
   angles360Video: null,
   refImage: null, // Reference image for video (start frame)
   refImageEnd: null, // End frame for video interpolation
-  refAudio: null, // Uploaded/generated audio for ia2v/a2v, or s2v lip-sync
+  refAudio: null, // Uploaded/generated audio for ia2v/a2v, or s2v lip-sync (primary)
+  refAudios: [], // Additional Seedance loose audio refs; first --ref-audio fills refAudio, subsequent calls append here
   audioStart: null, // Optional start offset into reference audio
   audioDuration: null, // Optional duration slice for reference audio
   referenceAudioIdentity: null, // Voice identity reference for LTX native audio
   voicePersonaName: null,
-  refVideo: null, // Reference video for animate workflows
+  refVideo: null, // Reference video for animate workflows (primary)
+  refVideos: [], // Additional Seedance loose video refs; first --ref-video fills refVideo, subsequent calls append here
   videoStart: null, // Optional start offset into reference video
   contextImages: [], // Context images for image editing
   looping: false, // Create looping video (i2v only): generate A→B then B→A and concatenate
@@ -1238,11 +1275,13 @@ const cliSet = {
   refImage: false,
   refImageEnd: false,
   refAudio: false,
+  refAudios: false,
   audioStart: false,
   audioDuration: false,
   referenceAudioIdentity: false,
   voicePersonaName: false,
   refVideo: false,
+  refVideos: false,
   videoStart: false,
   context: false,
   looping: false,
@@ -1529,8 +1568,13 @@ for (let i = 0; i < args.length; i++) {
   } else if (arg === '--ref-audio' || arg === '--audio') {
     const raw = requireFlagValue(args, i, arg);
     i++;
-    options.refAudio = raw;
-    cliSet.refAudio = true;
+    if (!options.refAudio) {
+      options.refAudio = raw;
+      cliSet.refAudio = true;
+    } else {
+      options.refAudios.push(raw);
+      cliSet.refAudios = true;
+    }
   } else if (arg === '--audio-start') {
     const raw = requireFlagValue(args, i, arg);
     i++;
@@ -1554,8 +1598,13 @@ for (let i = 0; i < args.length; i++) {
   } else if (arg === '--ref-video') {
     const raw = requireFlagValue(args, i, arg);
     i++;
-    options.refVideo = raw;
-    cliSet.refVideo = true;
+    if (!options.refVideo) {
+      options.refVideo = raw;
+      cliSet.refVideo = true;
+    } else {
+      options.refVideos.push(raw);
+      cliSet.refVideos = true;
+    }
   } else if (arg === '--video-start' || arg === '--video-start-offset') {
     const raw = requireFlagValue(args, i, arg);
     i++;
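Note: the two parser hunks above turn `--ref-audio` and `--ref-video` into repeatable flags, where the first value becomes the primary and later values are appended to an extras list. A minimal standalone sketch of that pattern follows. `collectRepeatable` is a hypothetical helper, and its missing-value check is an assumption standing in for `requireFlagValue`, whose body is not shown in this diff.

```javascript
// Hypothetical helper: split repeated occurrences of a flag (or its aliases)
// into a primary value plus ordered extras, mirroring refAudio / refAudios.
function collectRepeatable(argv, flagNames) {
  let primary = null;
  const extras = [];
  for (let i = 0; i < argv.length; i++) {
    if (!flagNames.includes(argv[i])) continue;
    const value = argv[i + 1];
    if (value === undefined || value.startsWith('--')) {
      // Assumed behaviour; the real CLI delegates this to requireFlagValue.
      throw new Error(`${argv[i]} requires a value`);
    }
    i++; // consume the value so it is not re-read as a flag
    if (primary === null) primary = value;
    else extras.push(value);
  }
  return { primary, extras };
}

console.log(collectRepeatable(
  ['--ref-audio', 'voice.wav', '--audio', 'https://example.com/b.mp3', '--ref-video', 'clip.mp4'],
  ['--ref-audio', '--audio'],
));
```

Keeping the primary in its own slot means code written for the single-reference case continues to work unchanged.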
@@ -1947,6 +1996,9 @@ for (let i = 0; i < args.length; i++) {
     options.showBalance = true;
   } else if (arg === '--version' || arg === '-V') {
     options.showVersion = true;
+  } else if (arg === '--no-update-check') {
+    // Update-check opt-out handled at module load; no-op here so the parser
+    // doesn't reject it as an unknown option.
   } else if (arg === '--help') {
     console.log(`
 sogni-agent - Generate images, videos, and music using Sogni AI
@@ -2015,14 +2067,30 @@ Video Options:
   --auto-resize-assets  Auto-resize video reference assets (default)
   --no-auto-resize-assets  Disable auto-resize for video assets
   --estimate-video-cost Estimate video cost and exit
-  --ref <path|url>      Reference image for video (start frame)
-  --ref-end <path|url>  End frame for interpolation/morphing
-  --ref-audio <path|url> Uploaded/generated audio for ia2v/a2v, or s2v lip-sync
+  --ref <path|url>      Reference image for video (start/first frame on Seedance)
+  --ref-end <path|url>  End frame for interpolation/morphing (last frame on Seedance)
+  --ref-audio <path|url> Audio reference. Repeatable on Seedance models (up to 3 total);
+                         first entry is the primary, extras must be HTTPS URLs in CLI
+                         direct-gen (use --api-chat for multi local-file uploads).
+                         On LTX/WAN: single primary only (for ia2v/a2v/s2v lip-sync).
   --audio-start <sec>   Start offset into --ref-audio for audio-driven clips
   --audio-duration <sec> Duration slice from --ref-audio
   --reference-audio-identity <path>  Voice identity clip for LTX native audio
   --voice-persona <name>  Use saved persona voice clip as LTX voice identity
-  --ref-video <path|url> Reference video for animate/v2v workflows
+  --ref-video <path|url> Video reference. Repeatable on Seedance models (up to 3 total);
+                         first entry is the primary, extras must be HTTPS URLs in CLI
+                         direct-gen. On LTX/WAN: single primary for animate/v2v workflows.
+Seedance Reference Modes (mutually exclusive on seedance2 / seedance2-fast):
+  - DEDICATED FRAME MODE: --ref (first frame) and/or --ref-end (last frame).
+    Best when you want canonical first/last frame anchoring; max 2 images.
+  - LOOSE REFERENCE MODE: -c/--context image refs plus optional --ref-audio /
+    --ref-video extras. Anchor frame intent in the prompt with @Image1, @Image2,
+    @Video1, @Audio1 etc. (e.g. "Use @Image1 as the opening shot reference").
+    Up to 9 image / 3 video / 3 audio / 12 total references per video request.
+  Combining --ref/--ref-end with -c/--context on Seedance is rejected client-side.
+  All three modalities pull caps from the canonical
+  @sogni-ai/sogni-protocol seedance-reference-limits catalog.
   --video-start <sec>   Start offset into --ref-video for segmented V2V/animate
   --controlnet-name <n> ControlNet type for v2v: canny|pose|depth|detailer
   --controlnet-strength <n>  ControlNet strength for v2v (0.0-1.0, default: 0.8)
@@ -2076,6 +2144,8 @@ General:
   --token-type <type>   Token type: spark|sogni|auto (default: spark, auto retries with alternate)
   --balance, --balances Show SPARK/SOGNI balances and exit
   --version, -V         Show sogni-agent version and exit
+  --no-update-check     Skip the once-daily npm update check for this run
+  self-update           Upgrade sogni-agent in place (npm/pnpm/yarn/bun auto-detected)
   --extract-last-frame <video> <image>  Extract last frame from a video (safe ffmpeg wrapper)
   --concat-videos <out> <clips...>      Concatenate video clips (safe ffmpeg wrapper, min 2 clips)
   --concat-audio <path> Optional audio track to mux over --concat-videos output
@@ -2916,8 +2986,47 @@ if (options.video) {
   if (options.videoStart !== null && !options.refVideo) {
     fatalCliError('--video-start requires --ref-video.', { code: 'INVALID_ARGUMENT' });
   }
-  if (isSeedanceVideo && options.refAudio && !options.refImage && !options.refImageEnd && !options.refVideo) {
-    fatalCliError('Seedance audio references require --ref or --ref-video.', { code: 'INVALID_ARGUMENT' });
+  if (isSeedanceVideo && options.refAudio && !options.refImage && !options.refImageEnd && !options.refVideo
+      && (!Array.isArray(options.contextImages) || options.contextImages.length === 0)) {
+    fatalCliError('Seedance audio references require --ref, --ref-video, or -c/--context image refs.', { code: 'INVALID_ARGUMENT' });
+  }
+  // Seedance reference modes are mutually exclusive:
+  //   - DEDICATED FRAME MODE: --ref (first frame) and/or --ref-end (last frame).
+  //     Up to 2 images; the platform pins them as parameter-mode firstFrame/lastFrame.
+  //   - LOOSE REFERENCE MODE: -c/--context (repeatable image refs), --ref-audio extras,
+  //     --ref-video extras. Up to 9 images / 3 videos / 3 audios / 12 total.
+  //     Anchor frame intent in the prompt with @Image1 / @Video1 / @Audio1 etc.
+  // Mixing dedicated frames with loose image refs is rejected at sogni-socket
+  // (jobsController.js) so we catch it client-side with a clearer message.
+  if (isSeedanceVideo
+      && (options.refImage || options.refImageEnd)
+      && Array.isArray(options.contextImages) && options.contextImages.length > 0) {
+    fatalCliError(
+      'Seedance reference modes are mutually exclusive: --ref/--ref-end (dedicated first/last frame) cannot be combined with -c/--context (loose image references). '
+      + 'Pick one: use --ref/--ref-end for first-class first-frame/last-frame anchoring (max 2 images), '
+      + 'or use -c/--context (plus optional @Image1/@Image2 prompt language) for up to 9 loose image references.',
+      { code: 'INVALID_ARGUMENT', details: {
+          dedicatedFrames: [options.refImage, options.refImageEnd].filter(Boolean),
+          looseImageRefs: options.contextImages,
+        } },
+    );
+  }
+  // Non-Seedance video models do not understand multi-ref audio/video extras —
+  // they only support a single primary --ref-audio / --ref-video each.
+  if (!isSeedanceVideo) {
+    if (Array.isArray(options.refAudios) && options.refAudios.length > 0) {
+      fatalCliError('Multiple --ref-audio entries are only supported for Seedance models (seedance2, seedance2-fast).', {
+        code: 'INVALID_ARGUMENT',
+        details: { model: options.model, extras: options.refAudios },
+      });
+    }
+    if (Array.isArray(options.refVideos) && options.refVideos.length > 0) {
+      fatalCliError('Multiple --ref-video entries are only supported for Seedance models (seedance2, seedance2-fast).', {
+        code: 'INVALID_ARGUMENT',
+        details: { model: options.model, extras: options.refVideos },
+      });
+    }
   }
   if (options.referenceAudioIdentity && !['t2v', 'i2v'].includes(options.videoWorkflow)) {
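Note: the validation hunk above encodes two rules: on Seedance, dedicated frames (`--ref` / `--ref-end`) and loose image references (`-c/--context`) are mutually exclusive, and on every other model repeated `--ref-audio` / `--ref-video` entries are rejected. A minimal sketch of those rules as a pure function follows. `findReferenceModeConflict` is a hypothetical name and it returns a message instead of exiting, which is a design choice of this sketch, not of the CLI.

```javascript
// Hypothetical pure check: returns an error message, or null when the
// combination is acceptable. Field names follow the CLI's `options` object.
function findReferenceModeConflict({
  isSeedance,
  refImage = null,
  refImageEnd = null,
  contextImages = [],
  refAudios = [],
  refVideos = [],
}) {
  if (isSeedance) {
    const usesDedicatedFrames = Boolean(refImage || refImageEnd);
    if (usesDedicatedFrames && contextImages.length > 0) {
      return '--ref/--ref-end cannot be combined with -c/--context on Seedance';
    }
    return null;
  }
  // Non-Seedance models accept a single primary audio and video reference only.
  if (refAudios.length > 0) return 'Multiple --ref-audio entries are only supported for Seedance models';
  if (refVideos.length > 0) return 'Multiple --ref-video entries are only supported for Seedance models';
  return null;
}

console.log(findReferenceModeConflict({
  isSeedance: true,
  refImage: 'first.png',
  contextImages: ['https://example.com/style.png'],
}));
```

Returning the message rather than calling a fatal handler keeps the rule testable without spawning the CLI.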
@@ -3246,8 +3355,9 @@ function apiRequestHeaders(apiKey, extra = {}) {
  * Phase 6 P0 — SDK transport dispatch for hosted workflow operations.
  *
  * When `SOGNI_SKILL_USE_SDK_TRANSPORT=1` is set, route hosted workflow
- * start / get / list / events / cancel through `@sogni-ai/sogni-client`
- * via the SSRF-validated `SogniHostedClientFactory` in
+ * start / get / list / events / cancel through
+ * `@sogni-ai/sogni-intelligence-client`'s SDK-backed client via the
+ * SSRF-validated `SogniHostedClientFactory` in
  * `sogni-hosted-client.mjs`. Otherwise fall back to the legacy
  * `fetchApiJson` path so existing users on older SDK versions are
  * unaffected.
@@ -3325,8 +3435,9 @@ async function dispatchWorkflowActionViaSdk(action, apiKey, params) {
  * Phase 6 P0 — SDK transport dispatch for hosted chat completions.
  *
  * When `SOGNI_SKILL_USE_SDK_TRANSPORT=1` is set, route synchronous
- * hosted chat through `@sogni-ai/sogni-client` via the SSRF-validated
- * factory. The SDK's `chat.hosted.create` accepts the same field
+ * hosted chat through `@sogni-ai/sogni-intelligence-client`'s SDK-backed
+ * client via the SSRF-validated factory. The SDK's `chat.hosted.create`
+ * accepts the same field
  * names the legacy fetch sends (`model`, `messages`, `temperature`,
  * `max_tokens`, `token_type`, `app_source`, `sogni_tools`,
  * `sogni_tool_execution`, `task_profile`, `chat_template_kwargs`,
@@ -3464,8 +3575,14 @@ function getApiModeMediaReferences() {
   if (options.refImage) refs.push({ flag: '--ref', value: options.refImage, kind: 'image' });
   if (options.refImageEnd) refs.push({ flag: '--ref-end', value: options.refImageEnd, kind: 'image' });
   if (options.refAudio) refs.push({ flag: '--ref-audio', value: options.refAudio, kind: 'audio' });
+  for (const value of options.refAudios || []) {
+    if (value) refs.push({ flag: '--ref-audio', value, kind: 'audio' });
+  }
   if (options.referenceAudioIdentity) refs.push({ flag: '--reference-audio-identity', value: options.referenceAudioIdentity, kind: 'audio' });
   if (options.refVideo) refs.push({ flag: '--ref-video', value: options.refVideo, kind: 'video' });
+  for (const value of options.refVideos || []) {
+    if (value) refs.push({ flag: '--ref-video', value, kind: 'video' });
+  }
   return refs;
 }
@@ -4014,6 +4131,17 @@ async function runApiChatDurable(log, { apiKey, body }) {
       }
       if (!options.json) log(`Durable chat run started: ${runId}`);
+      // Per-job tool_call_progress dedupe state. The sogni-api throttled
+      // emitter sends 1 Hz `jobETA` countdowns + per-step progress
+      // ticks per job; we log only when the value actually changes
+      // (and only in non-JSON CLI mode) so a 16-image batch doesn't
+      // pour ~16 lines/sec into the log file.
+      const perJobLogState = new Map();
+      const logJobUpdate = (line) => {
+        if (options.json) return;
+        log(line);
+      };
       for await (const event of helpers.sdkChatRunsStreamEvents(client, runId, {})) {
         const type = event?.type || event?.event || '';
         const payload = event?.data || event;
@@ -4028,6 +4156,50 @@ async function runApiChatDurable(log, { apiKey, body }) {
             process.stdout.write(delta);
           }
         }
+        // Per-job progress / ETA / completion / error log lines for
+        // CLI watchers. The sogni-api `tool_call_progress` SSE event
+        // packs `jobIndex` + per-job fields (`jobProgress`,
+        // `jobEtaSeconds`, `resultUrl`, `jobError`) for vendor-emulated
+        // jobs (GPT, Seedance — 1 Hz `jobETA` heartbeat from
+        // sogni-socket) and real workers (per-step progress).
+        // Untouched payloads from older sogni-api builds simply lack
+        // `jobIndex` and skip this block — forward-compatible.
+        if (type === 'tool_call_progress' && payload && typeof payload === 'object') {
+          const {
+            jobIndex,
+            jobProgress,
+            jobEtaSeconds,
+            resultUrl,
+            jobError,
+          } = extractToolCallProgressUpdate(payload);
+          if (jobIndex !== undefined) {
+            const state = perJobLogState.get(jobIndex) ?? {};
+            if (jobError && state.error !== jobError) {
+              logJobUpdate(`[job ${jobIndex}] error: ${jobError}`);
+              state.error = jobError;
+            } else if (resultUrl && state.resultUrl !== resultUrl) {
+              logJobUpdate(`[job ${jobIndex}] done${jobProgress !== undefined ? ` (${Math.round(jobProgress * 100)}%)` : ''} → ${resultUrl}`);
+              state.resultUrl = resultUrl;
+              state.progress = jobProgress ?? state.progress;
+            } else if (jobProgress !== undefined || jobEtaSeconds !== undefined) {
+              // Dedupe: only emit when progress moved >=5% or ETA changed.
+              const pctBefore = state.progress !== undefined ? Math.round(state.progress * 100) : -1;
+              const pctNow = jobProgress !== undefined ? Math.round(jobProgress * 100) : pctBefore;
+              const progressChanged = jobProgress !== undefined && Math.abs(pctNow - pctBefore) >= 5;
+              const etaChanged = jobEtaSeconds !== undefined && jobEtaSeconds !== state.eta;
+              if (progressChanged || etaChanged) {
+                const parts = [`[job ${jobIndex}]`];
+                if (jobProgress !== undefined) parts.push(`${pctNow}%`);
+                else if (state.progress !== undefined) parts.push(`${pctBefore}%`);
+                if (jobEtaSeconds !== undefined) parts.push(`(${jobEtaSeconds}s)`);
+                logJobUpdate(parts.join(' '));
+                if (jobProgress !== undefined) state.progress = jobProgress;
+                if (jobEtaSeconds !== undefined) state.eta = jobEtaSeconds;
+              }
+            }
+            perJobLogState.set(jobIndex, state);
+          }
+        }
         const eventToolCalls =
           payload?.toolCalls
           || payload?.tool_calls
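Note: the dedupe rule in the hunk above logs a progress line only when the rounded percentage has moved by at least 5 points since the last logged value, or when the ETA differs from the last logged ETA. A minimal sketch of that rule for a single job follows, leaving out the error and result branches. `filterJobUpdates` is a hypothetical name and the update objects are hand-written samples, not captured events.

```javascript
// Hypothetical single-job filter: returns the lines that would be logged.
function filterJobUpdates(updates) {
  const state = {}; // last logged { progress, eta }
  const lines = [];
  for (const { jobProgress, jobEtaSeconds } of updates) {
    const pctBefore = state.progress !== undefined ? Math.round(state.progress * 100) : -1;
    const pctNow = jobProgress !== undefined ? Math.round(jobProgress * 100) : pctBefore;
    const progressChanged = jobProgress !== undefined && Math.abs(pctNow - pctBefore) >= 5;
    const etaChanged = jobEtaSeconds !== undefined && jobEtaSeconds !== state.eta;
    if (!progressChanged && !etaChanged) continue;

    const parts = [];
    if (jobProgress !== undefined) parts.push(`${pctNow}%`);
    else if (state.progress !== undefined) parts.push(`${pctBefore}%`);
    if (jobEtaSeconds !== undefined) parts.push(`(${jobEtaSeconds}s)`);
    lines.push(parts.join(' '));

    // State advances only when a line is emitted, as in the loop above.
    if (jobProgress !== undefined) state.progress = jobProgress;
    if (jobEtaSeconds !== undefined) state.eta = jobEtaSeconds;
  }
  return lines;
}

console.log(filterJobUpdates([
  { jobProgress: 0.01 },
  { jobProgress: 0.02 },
  { jobProgress: 0.04 },
  { jobProgress: 0.10 },
  { jobProgress: 0.11 },
  { jobProgress: 0.11, jobEtaSeconds: 30 },
  { jobProgress: 0.11, jobEtaSeconds: 30 },
]));
```

Two consequences are visible in the sample: the baseline of -1 means the first line appears once progress reaches 4%, and an ETA that changes on every tick still produces a line per tick, so the filter mainly suppresses repeated identical values.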
@@ -5306,6 +5478,51 @@ async function appendSafeSeedanceReferenceUrl(target, pathOrUrl, label) {
   return true;
 }
+// Effective Seedance reference counts for the current `options` snapshot.
+// Mirrors the per-modality bookkeeping sogni-chat does in
+// uploadedModalityReferenceIndices(...) (chatService.ts ~6149), translated to
+// the skill's primary + extras CLI shape:
+//   images = refImage + refImageEnd + contextImages (loose Seedance @ImageN refs)
+//   audios = refAudio + refAudios (extras)
+//   videos = refVideo + refVideos (extras)
+function effectiveSeedanceReferenceCounts() {
+  const images =
+    (options.refImage ? 1 : 0)
+    + (options.refImageEnd ? 1 : 0)
+    + (Array.isArray(options.contextImages) ? options.contextImages.length : 0);
+  const audios =
+    (options.refAudio ? 1 : 0)
+    + (Array.isArray(options.refAudios) ? options.refAudios.length : 0);
+  const videos =
+    (options.refVideo ? 1 : 0)
+    + (Array.isArray(options.refVideos) ? options.refVideos.length : 0);
+  return { images, audios, videos };
+}
+// Wraps the shared validateSeedanceReferenceCounts() so a thrown
+// SeedanceReferenceLimitError is re-raised as a CLI fatal error with the same
+// human message the hosted chat surfaces. Source of truth for the numeric caps
+// (9 / 3 / 3 / 12) is @sogni-ai/sogni-protocol's seedance-reference-limits
+// catalog, surfaced through @sogni-ai/sogni-intelligence-client/tools.
+function enforceSeedanceReferenceCaps() {
+  try {
+    validateSeedanceReferenceCounts(effectiveSeedanceReferenceCounts());
+  } catch (err) {
+    if (err instanceof SeedanceReferenceLimitError) {
+      fatalCliError(err.message, {
+        code: err.code,
+        details: {
+          limitKind: err.limitKind,
+          requestedCount: err.requestedCount,
+          maxCount: err.maxCount,
+          limits: SEEDANCE_REFERENCE_LIMITS,
+        },
+      });
+    }
+    throw err;
+  }
+}
 function resolveMultiAngleOutputConfig(outputPath, outputFormat) {
   if (!outputPath) return null;
   const ext = extname(outputPath);
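Note: the cap enforcement above delegates to `validateSeedanceReferenceCounts` from `@sogni-ai/sogni-intelligence-client/tools`, whose body is not part of this diff. The sketch below is a hypothetical stand-in that applies the caps quoted in the help text (9 images, 3 videos, 3 audios, 12 total). The `LIMITS` object shape and the `findCapViolation` name are assumptions and may not match `SEEDANCE_REFERENCE_LIMITS`.

```javascript
// Caps as quoted in the CLI help text; the real values come from the shared catalog.
const LIMITS = Object.freeze({ images: 9, videos: 3, audios: 3, total: 12 });

// Hypothetical stand-in for the shared validator: returns the first violated
// limit as { kind, requested, max }, or null when the counts fit.
function findCapViolation({ images = 0, videos = 0, audios = 0 }, limits = LIMITS) {
  const counts = { images, videos, audios };
  for (const kind of Object.keys(counts)) {
    if (counts[kind] > limits[kind]) {
      return { kind, requested: counts[kind], max: limits[kind] };
    }
  }
  const total = images + videos + audios;
  if (total > limits.total) return { kind: 'total', requested: total, max: limits.total };
  return null;
}

// Each modality is within its own cap, yet the combined total is not.
console.log(findCapViolation({ images: 9, videos: 3, audios: 3 }));
console.log(findCapViolation({ images: 2, videos: 1, audios: 1 }));
```

The first call shows why the total cap needs its own check: filling every per-modality cap adds up to 15 references.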
@@ -6499,6 +6716,11 @@ async function main() {
       if (options.refVideo) log(`Reference video: ${options.refVideo}`);
       const isSeedanceVideo = isSeedanceModel(options.model);
+      if (isSeedanceVideo) {
+        // Source of truth: @sogni-ai/sogni-protocol catalogs/seedance-reference-limits.json
+        // surfaced through @sogni-ai/sogni-intelligence-client/tools.
+        enforceSeedanceReferenceCaps();
+      }
       const seedanceReferenceImageUrls = [];
       const seedanceReferenceVideoUrls = [];
       const seedanceReferenceAudioUrls = [];
@@ -6518,6 +6740,45 @@ async function main() {
         && options.videoStart === null
         && await appendSafeSeedanceReferenceUrl(seedanceReferenceVideoUrls, options.refVideo, 'Reference video');
+      // Seedance loose-reference extras: -c/--context images beyond start/end,
+      // plus repeated --ref-audio / --ref-video entries past the first. The
+      // Sogni Client SDK accepts only URL arrays for these (createJobRequestMessage),
+      // so extras MUST be HTTPS URLs. For multi-file local uploads, use --api-chat /
+      // --durable-chat where the LLM upload pipeline handles per-file uploads.
+      if (isSeedanceVideo) {
+        for (const ctxImage of (Array.isArray(options.contextImages) ? options.contextImages : [])) {
+          if (!ctxImage) continue;
+          if (!isHttpsUrl(ctxImage)) {
+            fatalCliError(
+              `Seedance extra image reference "${ctxImage}" must be an HTTPS URL. ` +
+              'Local file uploads beyond --ref / --ref-end are only supported in --api-chat / --durable-chat mode.',
+              { code: 'INVALID_ARGUMENT', details: { flag: '-c/--context', value: ctxImage } },
+            );
+          }
+          await appendSafeSeedanceReferenceUrl(seedanceReferenceImageUrls, ctxImage, 'Seedance image reference');
+        }
+        for (const extraAudio of options.refAudios) {
+          if (!isHttpsUrl(extraAudio)) {
+            fatalCliError(
+              `Additional --ref-audio "${extraAudio}" must be an HTTPS URL. ` +
+              'Local file uploads beyond the primary --ref-audio are only supported in --api-chat / --durable-chat mode.',
+              { code: 'INVALID_ARGUMENT', details: { flag: '--ref-audio', value: extraAudio } },
+            );
+          }
+          await appendSafeSeedanceReferenceUrl(seedanceReferenceAudioUrls, extraAudio, 'Seedance audio reference');
+        }
+        for (const extraVideo of options.refVideos) {
+          if (!isHttpsUrl(extraVideo)) {
+            fatalCliError(
+              `Additional --ref-video "${extraVideo}" must be an HTTPS URL. ` +
+              'Local file uploads beyond the primary --ref-video are only supported in --api-chat / --durable-chat mode.',
+              { code: 'INVALID_ARGUMENT', details: { flag: '--ref-video', value: extraVideo } },
+            );
+          }
+          await appendSafeSeedanceReferenceUrl(seedanceReferenceVideoUrls, extraVideo, 'Seedance video reference');
+        }
+      }
       let imageBuffer = options.refImage && !useRefImageUrl ? await fetchMediaBuffer(options.refImage) : undefined;
       let endImageBuffer = options.refImageEnd && !useRefImageEndUrl ? await fetchMediaBuffer(options.refImageEnd) : undefined;
       let audioBuffer = options.refAudio && !useRefAudioUrl ? await fetchMediaBuffer(options.refAudio) : undefined;
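Note: in direct-generation mode the last hunk above requires every Seedance extra (loose `-c/--context` images and repeated `--ref-audio` / `--ref-video` entries) to be an HTTPS URL. The CLI's `isHttpsUrl` helper is not shown in this diff, so the sketch below uses a hypothetical stand-in built on the standard `URL` class. It also partitions the values instead of failing on the first bad one, which differs from the CLI.

```javascript
// Hypothetical stand-in for isHttpsUrl: true only for absolute https: URLs.
function looksLikeHttpsUrl(value) {
  try {
    return new URL(String(value)).protocol === 'https:';
  } catch {
    return false; // not an absolute URL, for example a relative local path
  }
}

// Split extras into values usable in direct-gen mode and values that would
// need the --api-chat / --durable-chat upload path instead.
function partitionExtras(extras) {
  const accepted = [];
  const rejected = [];
  for (const value of extras) {
    (looksLikeHttpsUrl(value) ? accepted : rejected).push(value);
  }
  return { accepted, rejected };
}

console.log(partitionExtras([
  'https://example.com/music.mp3',
  'http://example.com/insecure.mp3',
  './local/voice.wav',
]));
```

Passing this check is not the whole story in the CLI: each accepted URL still goes through `appendSafeSeedanceReferenceUrl` before it is used.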

package/update-check.mjs ADDED

@@ -0,0 +1,303 @@
+/**
+ * sogni-agent update check — trailing-notification style.
+ *
+ * Public API:
+ *   shouldSkipForEnvironment(opts)  → boolean   (pure)
+ *   compareSemver(a, b)             → -1|0|1    (pure)
+ *   detectPackageManager(env)       → { manager, installCmd }
+ *   formatUpdateNotice(opts)        → string    (pure)
+ *   readState(path)                 → state | null
+ *   writeState(path, state)         → void
+ *   runForegroundCheck(opts)        → Promise<void>   (used by --__update-check)
+ *   maybeSpawnBackgroundCheck(opts) → 'spawned' | 'skipped' | 'fresh'
+ *   getQueuedNotice(opts)           → string | null
+ *   runSelfUpdate(opts)             → number (exit code)
+ */
+import { spawn, spawnSync } from 'child_process';
+import { existsSync, mkdirSync, readFileSync, writeFileSync, unlinkSync } from 'fs';
+import { dirname, join } from 'path';
+import { homedir } from 'os';
+import https from 'https';
+export const PACKAGE_NAME = '@sogni-ai/sogni-creative-agent-skill';
+export const DEFAULT_STATE_PATH = join(homedir(), '.config', 'sogni', 'update-check.json');
+export const DEFAULT_THROTTLE_MS = 24 * 60 * 60 * 1000; // 24h
+const REGISTRY_URL = `https://registry.npmjs.org/${encodeURIComponent(PACKAGE_NAME)}/latest`;
+const REGISTRY_TIMEOUT_MS = 1500;
+const MAX_RESPONSE_BYTES = 1024 * 1024;
+const INTERNAL_FLAG = '--__update-check';
+export { INTERNAL_FLAG };
+// ---------- pure helpers ----------
+function parseSemverPart(value) {
+  const [main, prerelease] = String(value).split('-', 2);
+  const nums = main.split('.').map((n) => Number.parseInt(n, 10));
+  if (nums.length !== 3 || nums.some((n) => !Number.isFinite(n) || n < 0)) return null;
+  return { nums, prerelease: prerelease || '' };
+}
+export function compareSemver(a, b) {
+  const pa = parseSemverPart(a);
+  const pb = parseSemverPart(b);
+  if (!pa || !pb) return 0;
+  for (let i = 0; i < 3; i++) {
+    if (pa.nums[i] !== pb.nums[i]) return pa.nums[i] < pb.nums[i] ? -1 : 1;
+  }
+  if (pa.prerelease === pb.prerelease) return 0;
+  if (!pa.prerelease) return 1;
+  if (!pb.prerelease) return -1;
+  return pa.prerelease < pb.prerelease ? -1 : 1;
+}
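Note: `compareSemver` above compares the prerelease tail as a plain string once the three numeric parts are equal. The sketch below isolates that last step to show what string ordering implies. `comparePrereleaseTags` is a hypothetical name that mirrors only the final four lines of the function.

```javascript
// Hypothetical mirror of the tail of compareSemver: an empty tag (a normal
// release) outranks any prerelease; otherwise tags are ordered as strings.
function comparePrereleaseTags(a, b) {
  if (a === b) return 0;
  if (!a) return 1;
  if (!b) return -1;
  return a < b ? -1 : 1;
}

console.log(comparePrereleaseTags('alpha', 'beta'));  // ordered alphabetically
console.log(comparePrereleaseTags('', 'rc.1'));       // release vs prerelease
console.log(comparePrereleaseTags('rc.2', 'rc.10'));  // string order, not numeric
```

The last call returns 1, so `rc.2` sorts after `rc.10`, whereas the SemVer specification compares numeric identifiers numerically. That only matters if the `latest` dist-tag ever points at a prerelease with multi-digit counters.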
+export function detectPackageManager(env = process.env) {
+  const ua = env.npm_config_user_agent || '';
+  if (ua.startsWith('pnpm/')) {
+    return { manager: 'pnpm', installCmd: `pnpm add -g ${PACKAGE_NAME}` };
+  }
+  if (ua.startsWith('yarn/')) {
+    return { manager: 'yarn', installCmd: `yarn global add ${PACKAGE_NAME}` };
+  }
+  if (ua.startsWith('bun/')) {
+    return { manager: 'bun', installCmd: `bun add -g ${PACKAGE_NAME}` };
+  }
+  return { manager: 'npm', installCmd: `npm install -g ${PACKAGE_NAME}` };
+}
+export function shouldSkipForEnvironment({
+  argv = process.argv,
+  env = process.env,
+  stderr = process.stderr,
+  cliPath = process.argv[1] || '',
+} = {}) {
+  if (Array.isArray(argv) && argv.includes('--no-update-check')) return true;
+  if (env.SOGNI_NO_UPDATE_CHECK === '1' || env.SOGNI_NO_UPDATE_CHECK === 'true') return true;
+  if (env.NO_UPDATE_NOTIFIER === '1' || env.NO_UPDATE_NOTIFIER === 'true') return true;
+  if (env.CI) return true;
+  if (env.SOGNI_AGENT_TEST_STATE_PATH) return true;
+  if (env.OPENCLAW_PLUGIN_CONFIG) return true;
+  if (env.NODE_ENV === 'test') return true;
+  if (env.npm_lifecycle_event) return true; // running under `npm <script>`
+  if (Array.isArray(argv) && argv.includes('--json')) return true;
+  if (stderr && stderr.isTTY === false) return true;
+  // Dev / source checkout: CLI directory contains .git
+  if (cliPath) {
+    try {
+      const cliDir = dirname(cliPath);
+      if (existsSync(join(cliDir, '.git'))) return true;
+    } catch {
+      // ignore
+    }
+  }
+  return false;
+}
+export function formatUpdateNotice({
+  currentVersion,
+  latestVersion,
+  installCmd,
+  useColor,
+} = {}) {
+  const color = useColor !== false && !process.env.NO_COLOR && process.stderr.isTTY;
+  const c = {
+    dim: color ? '\x1b[2m' : '',
+    bold: color ? '\x1b[1m' : '',
+    yellow: color ? '\x1b[33m' : '',
+    cyan: color ? '\x1b[36m' : '',
+    reset: color ? '\x1b[0m' : '',
+  };
+  const headline = `Update available ${c.dim}${currentVersion}${c.reset} → ${c.bold}${c.yellow}${latestVersion}${c.reset}`;
+  const cta = `Run ${c.cyan}${installCmd}${c.reset} to update`;
+  const tip = `${c.dim}(or run ${c.reset}${c.cyan}sogni-agent self-update${c.reset}${c.dim}, disable with --no-update-check)${c.reset}`;
+  return ['', headline, cta, tip, ''].join('\n');
+}
+// ---------- state file ----------
+export function readState(path = DEFAULT_STATE_PATH) {
+  try {
+    if (!existsSync(path)) return null;
+    const raw = readFileSync(path, 'utf8');
+    const parsed = JSON.parse(raw);
+    if (!parsed || typeof parsed !== 'object') return null;
+    return parsed;
+  } catch {
+    return null;
+  }
+}
+export function writeState(path, state) {
+  try {
+    const dir = dirname(path);
+    if (!existsSync(dir)) mkdirSync(dir, { recursive: true });
+    writeFileSync(path, JSON.stringify(state, null, 2));
+  } catch {
+    // best-effort; never throw
+  }
+}
+export function clearState(path = DEFAULT_STATE_PATH) {
+  try {
+    if (existsSync(path)) unlinkSync(path);
+  } catch {
+    // ignore
+  }
+}
+// ---------- network ----------
+function fetchLatestVersion({ url = REGISTRY_URL, timeoutMs = REGISTRY_TIMEOUT_MS } = {}) {
+  return new Promise((resolve, reject) => {
+    let settled = false;
+    const finish = (fn, value) => {
+      if (settled) return;
+      settled = true;
+      fn(value);
+    };
+    let req;
+    try {
+      req = https.get(url, { headers: { accept: 'application/json' } }, (res) => {
+        if (res.statusCode !== 200) {
+          res.resume();
+          finish(reject, new Error(`registry status ${res.statusCode}`));
+          return;
+        }
+        let received = 0;
+        const chunks = [];
+        res.on('data', (chunk) => {
+          received += chunk.length;
+          if (received > MAX_RESPONSE_BYTES) {
+            res.destroy();
+            finish(reject, new Error('registry response too large'));
+            return;
+          }
+          chunks.push(chunk);
+        });
+        res.on('end', () => {
+          try {
+            const body = Buffer.concat(chunks).toString('utf8');
+            const parsed = JSON.parse(body);
+            if (parsed && typeof parsed.version === 'string') {
+              finish(resolve, parsed.version);
+            } else {
+              finish(reject, new Error('registry response missing version'));
+            }
+          } catch (err) {
+            finish(reject, err);
+          }
+        });
+        res.on('error', (err) => finish(reject, err));
+      });
+    } catch (err) {
+      finish(reject, err);
+      return;
+    }
+    req.setTimeout(timeoutMs, () => {
+      req.destroy(new Error('registry timeout'));
+    });
+    req.on('error', (err) => finish(reject, err));
+  });
+}
+// ---------- foreground (child) check ----------
+export async function runForegroundCheck({
+  currentVersion,
+  statePath = DEFAULT_STATE_PATH,
+  url = REGISTRY_URL,
+  timeoutMs = REGISTRY_TIMEOUT_MS,
+  fetcher = fetchLatestVersion,
+  now = Date.now,
+} = {}) {
+  try {
+    const latest = await fetcher({ url, timeoutMs });
+    writeState(statePath, {
+      lastCheckedAt: now(),
+      lastKnownLatest: latest,
+      currentVersion: currentVersion || null,
+    });
+  } catch {
+    // Still record the attempt timestamp so we don't hammer the registry
+    // when offline. Keep any previously-known latest version so the user
+    // still sees the notice for an older known update.
+    const prev = readState(statePath) || {};
+    writeState(statePath, {
+      lastCheckedAt: now(),
+      lastKnownLatest: prev.lastKnownLatest || null,
+      currentVersion: currentVersion || null,
+    });
+  }
+}
+// ---------- parent helpers ----------
+export function maybeSpawnBackgroundCheck({
+  cliPath = process.argv[1],
+  statePath = DEFAULT_STATE_PATH,
+  throttleMs = DEFAULT_THROTTLE_MS,
+  now = Date.now,
+  spawnFn = spawn,
+  execPath = process.execPath,
+  env = process.env,
+} = {}) {
+  if (shouldSkipForEnvironment({ env })) return 'skipped';
+  const state = readState(statePath);
+  if (state && typeof state.lastCheckedAt === 'number' && now() - state.lastCheckedAt < throttleMs) {
+    return 'fresh';
+  }
+  try {
+    const child = spawnFn(execPath, [cliPath, INTERNAL_FLAG], {
+      detached: true,
+      stdio: 'ignore',
+      env,
+    });
+    child.on('error', () => {});
+    if (typeof child.unref === 'function') child.unref();
+    return 'spawned';
+  } catch {
+    return 'skipped';
+  }
+}
+export function getQueuedNotice({
+  currentVersion,
+  statePath = DEFAULT_STATE_PATH,
+  env = process.env,
+} = {}) {
+  if (shouldSkipForEnvironment({ env })) return null;
+  const state = readState(statePath);
+  if (!state || typeof state.lastKnownLatest !== 'string') return null;
+  if (compareSemver(state.lastKnownLatest, currentVersion) <= 0) return null;
+  const { installCmd } = detectPackageManager(env);
+  return formatUpdateNotice({
+    currentVersion,
+    latestVersion: state.lastKnownLatest,
+    installCmd,
+  });
+}
+export function runSelfUpdate({
+  env = process.env,
+  statePath = DEFAULT_STATE_PATH,
+  spawnSyncFn = spawnSync,
+  stdio = 'inherit',
+} = {}) {
+  const { manager, installCmd } = detectPackageManager(env);
+  const [command, ...args] = installCmd.split(' ');
+  console.error(`Running: ${installCmd}`);
+  const result = spawnSyncFn(command, args, { stdio, env });
+  if (result.error) {
+    console.error(`self-update failed: ${result.error.message}`);
+    if (manager === 'npm' && /EACCES|EPERM/i.test(result.error.message)) {
+      console.error('Hint: re-run with sudo, or install with a Node version manager (nvm/fnm/volta).');
+    }
+    return 1;
+  }
+  if (typeof result.status === 'number' && result.status !== 0) {
+    return result.status;
+  }
+  clearState(statePath);
+  return 0;
+}
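
The throttle and notice logic added above can be sketched in isolation. This is a simplified illustration, not the package's code: `decideCheck`, `hasQueuedUpdate`, `compareSimpleSemver`, and the 24-hour `ASSUMED_THROTTLE_MS` are hypothetical names and values. The real `compareSemver` and `DEFAULT_THROTTLE_MS` are defined elsewhere in `update-check.mjs` and are not shown in this part of the diff. State is passed in as a plain object rather than read from disk.

```javascript
// Assumption: the real DEFAULT_THROTTLE_MS is not visible here; 24h is a placeholder.
const ASSUMED_THROTTLE_MS = 24 * 60 * 60 * 1000;

// Mirrors the decision order in maybeSpawnBackgroundCheck:
// environment skip first, then the throttle window, otherwise spawn.
function decideCheck({ state, skip = false, throttleMs = ASSUMED_THROTTLE_MS, now = Date.now } = {}) {
  if (skip) return 'skipped';
  if (state && typeof state.lastCheckedAt === 'number' && now() - state.lastCheckedAt < throttleMs) {
    return 'fresh';
  }
  return 'spawned';
}

// Hypothetical stand-in for compareSemver: compares major.minor.patch only
// and treats non-numeric parts (such as pre-release suffixes) as 0.
function compareSimpleSemver(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i += 1) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff > 0 ? 1 : -1;
  }
  return 0;
}

// Mirrors the guard in getQueuedNotice: a notice is due only when the
// recorded latest version is strictly newer than the running version.
function hasQueuedUpdate({ state, currentVersion }) {
  if (!state || typeof state.lastKnownLatest !== 'string') return false;
  return compareSimpleSemver(state.lastKnownLatest, currentVersion) > 0;
}

// Fixed clock so the example is deterministic.
const clock = () => 1000000;
const state = { lastCheckedAt: 1000000 - 5000, lastKnownLatest: '3.3.0', currentVersion: '3.1.1' };

console.log(decideCheck({ state, throttleMs: 10000, now: clock })); // fresh
console.log(decideCheck({ state, throttleMs: 1000, now: clock }));  // spawned
console.log(decideCheck({ state: null, now: clock }));              // spawned
console.log(hasQueuedUpdate({ state, currentVersion: '3.1.1' }));   // true
console.log(hasQueuedUpdate({ state, currentVersion: '3.3.0' }));   // false
```

Because `runForegroundCheck` records `lastCheckedAt` even when the fetch fails, a sketch like this would report `fresh` for the rest of the throttle window while offline, which matches the intent stated in the diff's comment about not hammering the registry.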

package/version.mjs CHANGED

@@ -1 +1 @@
-export const PACKAGE_VERSION = '3.1.1';
+export const PACKAGE_VERSION = '3.3.0';