npm - @sogni-ai/sogni-creative-agent-skill - Versions diffs - 2.3.0 → 3.1.0-alpha.0 - Mend

@sogni-ai/sogni-creative-agent-skill 2.3.0 → 3.1.0-alpha.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/README.md +18 -11
package/SKILL.md +61 -23
package/generated/creative-agent-runtime.mjs +127 -6366
package/llm.txt +2 -2
package/package.json +13 -3
package/skill-package.json +1 -1
package/sogni-agent.mjs +979 -207
package/version.mjs +1 -1

package/README.md CHANGED

@@ -276,18 +276,18 @@ sogni-agent --api-chat --task-profile reasoning --no-thinking \
 sogni-agent --list-api-models
 # Durable hosted workflow (/v1/creative-agent/workflows)
-sogni-agent --api-workflow image-to-video \
+sogni-agent --api-workflow \
   --video-prompt "The camera slowly pushes in as the sketch comes alive" \
   "A graphite robot sketch on a drafting table"
 # Durable workflow with a media reference and a cost ceiling
-sogni-agent --api-workflow image-to-video --ref https://cdn.example.com/sketch.png \
+sogni-agent --api-workflow --ref https://cdn.example.com/sketch.png \
   --workflow-max-cost 25 --confirm-cost \
   --video-prompt "The camera slowly pushes in as the sketch comes alive" \
   "Animate the referenced sketch"
-# Shared CreativeWorkflowPlan -> API compiles to hosted sequence
-sogni-agent --api-workflow creative-plan --workflow-input @plan.json
+# Exact durable workflow input
+sogni-agent --api-workflow --workflow-input @workflow.json
 # Storyline -> GPT Image 2 storyboard sheet -> Seedance video sequence
 sogni-agent --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
@@ -297,6 +297,13 @@ sogni-agent --api-workflow storyboard-video --storyboard-frames 6 --duration 12
 sogni-agent --list-replays 20
 sogni-agent --get-replay run_abc123 --json
+# Opt in to SDK transport for hosted operations (durable workflows + chat).
+# Validates restEndpoint/socketEndpoint via the skill's SSRF guard, then
+# calls sogni.workflows.* / .chat.completions.* directly.
+# Falls back to the legacy SSRF-validated fetch path when the env is unset.
+export SOGNI_SKILL_USE_SDK_TRANSPORT=1
+sogni-agent --api-workflow storyboard-video "10s neon city flyover"
 # Local segment + concat with external soundtrack
 sogni-agent --video --workflow v2v --ref-video dance.mp4 \
   --video-start 10 --duration 8 --controlnet-name pose -o /tmp/clip-2.mp4 \
@@ -333,15 +340,15 @@ Run `sogni-agent --help` for the full CLI. Below are the options and tables most
 | `--target-resolution <px>` | Target the short side, preserving aspect ratio |
 | `--workflow <type>` | Force `t2v`, `i2v`, `s2v`, `ia2v`, `a2v`, `v2v`, or animate workflows |
 | `--api-chat` | Use `/v1/chat/completions` with Sogni creative-agent tools |
-| `--api-workflow <kind>` | Start a `/v1/creative-agent/workflows` durable workflow: `image-to-video`, `hosted-tool-sequence`, `creative-plan`, or `storyboard-video` |
-| `--workflow-input <json\|path\|@path>` | Explicit hosted workflow input JSON |
+| `--api-workflow` | Start a `/v1/creative-agent/workflows` durable workflow with explicit `input.steps`; optional `storyboard-video` preset |
+| `--workflow-input <json\|@path>` | Explicit durable workflow input JSON. Use `@path` to load JSON from a file. |
 | `--workflow-max-cost <n>`, `--confirm-cost`, `--no-confirm-cost` | Set durable workflow capacity ceiling and explicit cost confirmation |
 | `--storyboard-frames <n>` | Beat count for `--api-workflow storyboard-video` |
-| `--video-prompt`, `--negative-prompt`, `--generate-audio`, `--expand-prompt` | Durable image-to-video workflow inputs |
+| `--video-prompt`, `--negative-prompt`, `--generate-audio`, `--expand-prompt` | Generated-keyframe durable workflow step controls |
 | `--watch-workflow`, `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>` | Manage durable workflows |
 | `--api-tools <mode>`, `--no-api-tool-execution`, `--llm-model <id>`, `--task-profile <profile>`, `--max-tokens <n>`, `--thinking` / `--no-thinking`, `--api-base-url <url>` | Tune hosted API requests |
 | `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM models |
-| `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|path\|@path>` | Manage Sogni Intelligence replay records |
+| `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|@path>` | Manage Sogni Intelligence replay records (use `@path` to load JSON from a file) |
 | `--persona <name>` | Use a saved persona |
 | `--concat-videos <out> <clips...>` | Stitch clips locally with FFmpeg |
 | `--last`, `--last-image` | Inspect last render / reuse last image as context or video reference |
@@ -501,12 +508,12 @@ Hosted API modes require `SOGNI_API_KEY`.
 - **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools — best for text-first natural-language workflows. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
 - **Sogni Intelligence controls** include `--task-profile general|coding|reasoning`, `--max-tokens`, and `--thinking` / `--no-thinking`, which forward to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. Use `--list-api-models` or `--get-api-model <id>` to inspect `/v1/models`.
-- **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Supported kinds: `image-to-video`, `hosted-tool-sequence`, `creative-plan`, and `storyboard-video`.
-- **`--api-workflow creative-plan`** forwards a shared `CreativeWorkflowPlan` JSON object (`{ title?, steps: [...] }`) to the API as `kind: "creative_plan"`. Compilation, hosted-tool argument validation, and persistence happen in `../sogni-api` through `@sogni/creative-agent`; the public skill does not duplicate that compiler. Use this when you need exact shared-plan behavior such as repeated `replace_video_segment` steps with `replacementStartSeconds` / `replacementEndSeconds` for interleaved video slices.
+- **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Requests carry `input.steps` plus snake_case controls such as `token_type`, `media_references`, `max_estimated_capacity_units`, and `confirm_cost`.
+- **`--workflow-input`** forwards exact durable workflow JSON (`{ title?, steps: [...] }`). Use this when you need exact multi-step behavior such as repeated `replace_video_segment` steps with `replacementStartSeconds` / `replacementEndSeconds` for interleaved video slices.
 - **`--api-workflow storyboard-video`** generates a storyline, creates a single GPT Image 2 storyboard sheet, then passes that artifact into Seedance as the video reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low/medium/high quality for that storyboard sheet.
 - **Media references** from `-c`, `--ref`, `--ref-end`, `--ref-audio`, `--reference-audio-identity`, and `--ref-video` are forwarded as `media_references` metadata in hosted API requests. API chat also attaches image refs as vision inputs. Local file references are uploaded to Sogni media storage first, then forwarded as retrievable URLs so durable executors do not depend on `data:` URI support. Durable workflow JSON can bind those references into step arguments with `sourceStepId: "$input_media"`. Use direct CLI mode for private media that must not leave the local machine.
 - **Cost controls** use `--workflow-max-cost <n>` to reject workflow starts above a capacity-unit ceiling, and `--confirm-cost` / `--no-confirm-cost` to forward explicit billing confirmation.
-- Manage runs with `--watch-workflow`, `--workflow-events`, `--stream-workflow`, `--list-workflows`, `--get-workflow`, and `--cancel-workflow`. Use `--workflow-input` to provide exact hosted workflow JSON.
+- Manage runs with `--watch-workflow`, `--workflow-events`, `--stream-workflow`, `--list-workflows`, `--get-workflow`, and `--cancel-workflow`. Use `--workflow-input` to provide exact durable workflow JSON.
 - **Replay records** use `/v1/replay/records`: `--list-replays [limit]`, `--get-replay <runId>`, and `--ingest-replay <json|path|@path>` expose redacted RunRecord storage for Sogni Intelligence replay/debug viewers.
 Override the API origin with `--api-base-url`, `SOGNI_API_BASE_URL`, or `SOGNI_REST_ENDPOINT`.
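
Reviewer note: the request shape described in the updated `--api-workflow` and `--workflow-input` bullets can be sketched roughly as follows. This is a minimal illustration, not the package's code. The field names `title`, `steps`, `media_references`, `max_estimated_capacity_units`, and `confirm_cost`, and the `replace_video_segment` arguments, come from the README text. The helper name `buildWorkflowRequest`, the placement of the snake_case controls beside `input`, and the step fields `id`, `tool`, and `arguments` are assumptions made for this example.

```javascript
// Hypothetical helper: assembles a durable workflow request body.
// Field names are taken from the README; their exact nesting is an assumption.
function buildWorkflowRequest({ title, steps, mediaReferences = [], maxCost, confirmCost }) {
  if (!Array.isArray(steps) || steps.length === 0) {
    throw new Error('input.steps must be a non-empty array');
  }
  const body = {
    input: { ...(title ? { title } : {}), steps },
  };
  // Snake_case controls are attached only when the caller set them.
  if (mediaReferences.length > 0) body.media_references = mediaReferences;
  if (typeof maxCost === 'number') body.max_estimated_capacity_units = maxCost;
  if (typeof confirmCost === 'boolean') body.confirm_cost = confirmCost;
  return body;
}

// Example: two replace_video_segment steps for interleaved slices.
// The step fields `id`, `tool`, and `arguments` are illustrative only.
const request = buildWorkflowRequest({
  title: 'Interleave two slices',
  steps: [
    {
      id: 'slice-1',
      tool: 'replace_video_segment',
      arguments: { replacementStartSeconds: 0, replacementEndSeconds: 4 },
    },
    {
      id: 'slice-2',
      tool: 'replace_video_segment',
      arguments: { replacementStartSeconds: 8, replacementEndSeconds: 12 },
    },
  ],
  maxCost: 25,
  confirmCost: true,
});

console.log(JSON.stringify(request, null, 2));
```

In this sketch the `maxCost` and `confirmCost` options stand in for `--workflow-max-cost 25 --confirm-cost`. The actual CLI may attach these controls differently, so check `sogni-agent.mjs` before relying on the layout.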

package/SKILL.md CHANGED

@@ -36,7 +36,7 @@ metadata:
 Generate **images, videos, and music** using Sogni AI's decentralized GPU network.
-> **Per-skill view**: hosts that want to load focused capabilities rather than this monolith can read [`skills/README.md`](./skills/README.md) for the per-skill index — one markdown file per skill (`image_generation`, `image_editing`, `video_generation`, `video_editing`, `music_generation`, `media_analysis`, `persona_management`, `app_settings`, plus the always-loaded `quality_audit`, `session_control`, `asset_reference_management`). Each file mirrors the canonical manifest in `@sogni/creative-agent`. The whole-monolith load below stays the default for OpenClaw / Claude Code / Hermes Agent / Manus AI integrations.
+> **Per-skill view**: hosts that want to load focused capabilities rather than this monolith can read [`skills/README.md`](./skills/README.md) for the per-skill index — one markdown file per skill (`image_generation`, `image_editing`, `video_generation`, `video_editing`, `music_generation`, `media_analysis`, `persona_management`, `app_settings`, `composition_planning`, plus the always-loaded `quality_audit`, `session_control`, `asset_reference_management`). Each file mirrors the canonical manifest in `@sogni/creative-agent`. The whole-monolith load below stays the default for OpenClaw / Claude Code / Hermes Agent / Manus AI integrations.
 ## Install Request Policy
@@ -124,13 +124,33 @@ Path override environment variables:
 - `SOGNI_MEDIA_INBOUND_DIR`
 - `OPENCLAW_CONFIG_PATH`
-## Usage (Images, Video & Music)
+## Recommended path: route through the hosted Sogni Intelligence endpoints
+For any natural-language creative request — anything that should be planned, multi-step, or that benefits from tool selection, repair, or durable workflows — prefer the hosted endpoints over the direct-to-SDK flags. The hosted endpoints are the canonical home for tool dispatch, Structured Contracts v1 (gating policies, repair recipes, prompt contracts), durable workflows, replay, and asset-manifest mapping. They stay aligned with `sogni-chat` and the rest of the `@sogni/creative-agent` consumers automatically.
+```bash
+# Natural-language creative request (LLM picks the tool, dispatches, repairs)
+node sogni-agent.mjs --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
+# Multi-step durable workflow (resumable, replay-friendly, server-orchestrated)
+node sogni-agent.mjs --api-workflow \
+  --video-prompt "The camera slowly pushes in" \
+  "A graphite robot sketch on a drafting table"
+# Storyboard → keyframe → Seedance, all server-side
+node sogni-agent.mjs --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
+  "Create a 9:16 bakery launch video with a neon street-window reveal"
+```
+The direct-to-SDK flags below remain available for explicit one-shot generation when you already know the exact model, dimensions, and prompt and don't need LLM planning. Use them when latency or cost rules out the LLM round-trip.
+## Usage (direct-to-SDK image, video & music)
 ```bash
 # Generate and get URL
 node sogni-agent.mjs "a cat wearing a hat"
-# Quality presets (recommended — auto-selects model, steps, and size)
+# Quality presets (recommended for direct mode — auto-selects model, steps, and size)
 node sogni-agent.mjs -Q fast "a cat wearing a hat"    # z_image_turbo, 8 steps, 512x512 (~5-10s)
 node sogni-agent.mjs -Q hq "a cat wearing a hat"      # z_image_turbo, default steps, 768x768 (~10-15s)
 node sogni-agent.mjs -Q pro "a cat wearing a hat"      # flux2_dev, 40 steps, 1024x1024 (~2min)
@@ -179,20 +199,20 @@ node sogni-agent.mjs --api-chat --task-profile reasoning --no-thinking \
 node sogni-agent.mjs --list-replays 20
 node sogni-agent.mjs --get-replay run_abc123 --json
-# Durable API workflow: async image-to-video with resumable workflow record
-node sogni-agent.mjs --api-workflow image-to-video \
+# Durable API workflow: generated keyframe to video with resumable workflow record
+node sogni-agent.mjs --api-workflow \
   --video-prompt "The camera slowly pushes in as the sketch comes alive" \
   "A graphite robot sketch on a drafting table"
 # Durable API workflow with media reference and cost controls
-node sogni-agent.mjs --api-workflow image-to-video \
+node sogni-agent.mjs --api-workflow \
   --ref https://cdn.example.com/sketch.png \
   --workflow-max-cost 25 --confirm-cost \
   --video-prompt "The camera slowly pushes in as the sketch comes alive" \
   "Animate the referenced sketch"
-# Shared CreativeWorkflowPlan: API compiles and validates through @sogni/creative-agent
-node sogni-agent.mjs --api-workflow creative-plan --workflow-input @plan.json
+# Exact durable workflow input with explicit steps
+node sogni-agent.mjs --api-workflow --workflow-input @workflow.json
 # Durable storyboard-video workflow: storyline -> GPT Image 2 storyboard -> Seedance
 node sogni-agent.mjs --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
@@ -204,13 +224,12 @@ Sogni API's OpenAI-compatible `/v1/chat/completions` tool loop. This path
 sanitizes prompt-injection markers before forwarding messages and uses the
 current hosted creative-agent tool surface. Use `--api-workflow` when the caller
 already knows it wants an async durable workflow record under
-`/v1/creative-agent/workflows`. Use `--api-workflow creative-plan` when the
-caller already has a shared `CreativeWorkflowPlan`; the skill forwards it as
-`kind: "creative_plan"` and lets Sogni API compile, validate, and persist it
-through `@sogni/creative-agent`. This is the preferred hosted path for exact
-multi-step plans, including repeated `replace_video_segment` operations with
-`replacementStartSeconds` / `replacementEndSeconds` when interleaving existing
-video slices. Use `--api-workflow storyboard-video`
+`/v1/creative-agent/workflows`. Use `--workflow-input @workflow.json` when the
+caller already has exact durable workflow input with `steps`; the skill forwards
+that body to the API as-is. This is the preferred hosted path for
+exact multi-step plans, including repeated `replace_video_segment` operations
+with `replacementStartSeconds` / `replacementEndSeconds` when interleaving
+existing video slices. Use `--api-workflow storyboard-video`
 when the caller wants the hosted sequence to generate a storyline, create one GPT
 Image 2 storyboard sheet, and feed that image artifact into Seedance as the video
 reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low|medium|high
@@ -233,6 +252,16 @@ viewers.
 Hosted API modes require `SOGNI_API_KEY`; this skill's CLI uses API-key
 authentication.
+For durable hosted chat runs (long-running multi-tool turns that should
+survive a client disconnect), the SDK now exposes
+`sogni.chat.runs.{create, get, cancel, streamEvents}`.
+Set `SOGNI_SKILL_USE_SDK_TRANSPORT=1` to route hosted workflow + chat
+operations through the SDK transport instead of the legacy
+SSRF-validated fetch path. The skill's `sogni-hosted-client.mjs`
+factory still validates `restEndpoint` / `socketEndpoint` against the
+SSRF guard before constructing the SDK client, so the safety contract
+holds.
 When changing hosted API chat/workflow behavior, keep reusable validation,
 workflow compilation, repair-control, and guard telemetry logic in
 `../sogni-creative-agent` first. The public skill should consume generated or
@@ -335,17 +364,26 @@ positions.
 | `--max-tokens <n>` | Max hosted chat completion tokens | 1600 |
 | `--thinking`, `--no-thinking` | Toggle `chat_template_kwargs.enable_thinking` for hosted chat | server default |
 | `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM model metadata | - |
-| `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|path\|@path>` | Manage Sogni Intelligence replay RunRecords | - |
-| `--api-workflow <kind>` | Start durable workflow: image-to-video\|hosted-tool-sequence\|creative-plan\|storyboard-video | - |
-| `--workflow-input <json\|path\|@path>` | Workflow input JSON for hosted tool sequences/custom starts | - |
-| `--workflow-title <text>` | Title for hosted-tool-sequence, creative-plan, or storyboard-video workflow input | - |
+| `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|@path>` | Manage Sogni Intelligence replay RunRecords. List/get output is run through `redactRunRecord` from `@sogni/creative-agent/replay` before printing, so signed URLs, bearer tokens, JWTs, and PEM blocks cannot leak via the CLI. Use `@path` to load JSON from a file. | - |
+| `--skip-redact`, `--no-redact` | Bypass the replay redactor on `--list-replays` / `--get-replay`. Debug-only — emits unredacted RunRecord payloads. | redacted |
+| `--turn-classify` | Print the public-skill turn policy (`visibleTools`, `forbiddenTools`, `requiredTools`) the default contract runtime would produce for the current session-state flags. Mirrors the chat / `/v1/chat/completions` Structured Contracts v1 pipeline. | - |
+| `--compile-tools` | Print the per-turn compiled tool surface (filtered tool list + prompt-contract fragments) the default contract runtime emits. | - |
+| `--dispatch-tool <name>` | Print the dispatch verdict (`allowed`, `mode`, repair recipe, suggested args) the default contract runtime would return for a tool call. Combine with `--tool-args` to supply arguments. | - |
+| `--tool-args <json>` | JSON arguments for `--dispatch-tool`. | `{}` |
+| `--storyboard-plan` | Build a storyboard project from the prompt locally (`buildStoryboardProject` + per-model adapter compilation via `compileForModel`) and print the plan as JSON. Does not call the network. Expects scene-structured prompt input (`SCENE NN - Title` / `VISUAL:` / `ACTION:` / `CAMERA:` / `AUDIO/SFX:` blocks) — for casual prompts, use `--api-workflow storyboard-video` instead, which runs an LLM storyline expansion first. Pair with `--storyboard-plan-frames`, `--storyboard-plan-model`, `--storyboard-plan-stage`. | - |
+| `--storyboard-plan-frames <n>` | Frame count for `--storyboard-plan`. | inferred |
+| `--storyboard-plan-model <id>` | Adapter target for `--storyboard-plan` (seedance, seedance2, gpt-image-2, ltx23, wan). | inferred |
+| `--storyboard-plan-stage <stage>` | Compilation stage for `--storyboard-plan` (storyboard_image, scene_clip). | storyboard_image |
+| `--api-workflow` | Start a durable workflow with explicit `input.steps`; optional `storyboard-video` preset | - |
+| `--workflow-input <json\|@path>` | Durable workflow input JSON. Use `@path` to load from a file. | - |
+| `--workflow-title <text>` | Title for generated or storyboard durable workflow input | - |
 | `--workflow-max-cost <n>` | Reject hosted workflow starts above this estimated capacity-unit ceiling | - |
 | `--confirm-cost`, `--no-confirm-cost` | Forward explicit hosted workflow cost confirmation | - |
 | `--storyboard-frames <n>` | Beat count for storyboard-video workflow | - |
-| `--video-prompt <text>` | Motion prompt for durable image-to-video workflow | - |
-| `--negative-prompt <text>` | Negative prompt for durable image-to-video workflow | - |
-| `--generate-audio`, `--no-generate-audio` | Toggle audio generation for durable image-to-video | - |
-| `--expand-prompt`, `--no-expand-prompt` | Toggle prompt expansion for durable image-to-video | - |
+| `--video-prompt <text>` | Motion prompt for generated-keyframe durable workflow | - |
+| `--negative-prompt <text>` | Negative prompt for generated-keyframe durable workflow | - |
+| `--generate-audio`, `--no-generate-audio` | Toggle audio generation for generated video steps | - |
+| `--expand-prompt`, `--no-expand-prompt` | Toggle prompt expansion for generated video steps | - |
 | `--watch-workflow` | Stream durable workflow events after start | - |
 | `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>` | Durable workflow management helpers | - |
 | `--api-base-url <url>` | Sogni API base for hosted API modes. Credentials are only sent to `https://api.sogni.ai` by default; use `SOGNI_API_ALLOWED_HOSTS` for trusted custom hosts or `SOGNI_ALLOW_UNSAFE_API_BASE_URL=1` for isolated local testing. | https://api.sogni.ai |
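
Reviewer note: the credential rule in the `--api-base-url` row can be sketched as a small host check. This is a hedged illustration only. The function name `isCredentialSafeBaseUrl` is hypothetical, and treating `SOGNI_API_ALLOWED_HOSTS` as a comma-separated list of hostnames is an assumption, since the diff does not state its format. The skill's real SSRF guard may apply further checks that are not shown here.

```javascript
const DEFAULT_API_HOST = 'api.sogni.ai';

// Hypothetical check: may credentials be sent to this base URL?
// `env` is passed in so the function stays pure and easy to test.
function isCredentialSafeBaseUrl(baseUrl, env = {}) {
  let url;
  try {
    url = new URL(baseUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  // Explicit escape hatch, intended for isolated local testing only.
  if (env.SOGNI_ALLOW_UNSAFE_API_BASE_URL === '1') return true;
  if (url.protocol !== 'https:') return false;
  // Assumption: the allowlist is a comma-separated list of hostnames.
  const allowed = (env.SOGNI_API_ALLOWED_HOSTS || '')
    .split(',')
    .map((host) => host.trim().toLowerCase())
    .filter(Boolean);
  return url.hostname === DEFAULT_API_HOST || allowed.includes(url.hostname);
}

console.log(isCredentialSafeBaseUrl('https://api.sogni.ai'));
console.log(isCredentialSafeBaseUrl('https://api.sogni.ai.evil.example'));
```

Comparing the parsed `hostname` exactly, rather than matching a prefix or substring of the URL string, is what keeps look-alike hosts such as `api.sogni.ai.evil.example` from passing.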