npm - @sogni-ai/sogni-creative-agent-skill - Versions diffs - 2.2.0 → 2.3.0 - Mend

@sogni-ai/sogni-creative-agent-skill 2.2.0 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/README.md +40 -10
package/SKILL.md +73 -17
package/generated/creative-agent-runtime.mjs +4076 -148
package/llm.txt +15 -5
package/openclaw.plugin.json +25 -2
package/package.json +6 -2
package/scripts/check-creative-agent-source.mjs +104 -0
package/sogni-agent.mjs +574 -75
package/ssrf-guard.mjs +2 -1
package/version.mjs +1 -1

package/README.md CHANGED

@@ -68,7 +68,7 @@ With this skill, an agent can:
 ## Quick Start
-1. Get a Sogni API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (click your username) and save it — see [Setup](#setup-sogni-api-key).
+1. Get a Sogni API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (open the account menu) and save it — see [Setup](#setup-sogni-api-key).
 2. Install the CLI:
    ```bash
@@ -147,7 +147,7 @@ The generated `.openclaw-link/` directory is only for OpenClaw; Hermes, Manus, a
 #### OpenClaw configuration
-When loaded through OpenClaw, this skill reads plugin defaults from OpenClaw config; CLI flags always override them. The supported config schema is defined in [`openclaw.plugin.json`](./openclaw.plugin.json) and includes default models, video workflow models, hosted API defaults (`apiBaseUrl`, `defaultLlmModel`, `defaultApiToolMode`), token type, seed strategy, timeouts, and media paths. If your OpenClaw config lives elsewhere, set `OPENCLAW_CONFIG_PATH`.
+When loaded through OpenClaw, this skill reads plugin defaults from OpenClaw config; CLI flags always override them. The supported config schema is defined in [`openclaw.plugin.json`](./openclaw.plugin.json) and includes default models, video workflow models, hosted API defaults (`apiBaseUrl`, `defaultLlmModel`, `defaultTaskProfile`, `defaultApiMaxTokens`, `defaultApiThinking`, `defaultApiToolMode`, workflow cost defaults), token type, seed strategy, timeouts, and media paths. If your OpenClaw config lives elsewhere, set `OPENCLAW_CONFIG_PATH`.
 ### Hermes Agent / Manus / other frameworks
@@ -186,7 +186,7 @@ If the checkout is missing, use the npm install path above or explicitly approve
 ## Setup (Sogni API key)
-1. Get your API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (click your username).
+1. Get your API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (open the account menu).
 2. Save it to a credentials file:
    ```bash
@@ -262,19 +262,41 @@ sogni-agent --music --lyrics "Rise with the morning light" --bpm 128 \
 sogni-agent --video --reference-audio-identity voice.webm \
   'NARRATOR: "This is my voice."'
-# Hosted chat with rich creative-agent tools (/v1/chat/completions)
+# Hosted chat with Sogni creative-agent tools (/v1/chat/completions)
 sogni-agent --api-chat \
   "Create a 4-shot product video concept for a red sneaker"
+# Hosted chat with image vision plus media-reference metadata
+sogni-agent --api-chat --ref product.jpg \
+  "Turn this into a launch poster and describe the edit plan"
+# Hosted chat controls and model discovery
+sogni-agent --api-chat --task-profile reasoning --no-thinking \
+  "Plan a concise multi-step product launch workflow"
+sogni-agent --list-api-models
 # Durable hosted workflow (/v1/creative-agent/workflows)
 sogni-agent --api-workflow image-to-video \
   --video-prompt "The camera slowly pushes in as the sketch comes alive" \
   "A graphite robot sketch on a drafting table"
+# Durable workflow with a media reference and a cost ceiling
+sogni-agent --api-workflow image-to-video --ref https://cdn.example.com/sketch.png \
+  --workflow-max-cost 25 --confirm-cost \
+  --video-prompt "The camera slowly pushes in as the sketch comes alive" \
+  "Animate the referenced sketch"
+# Shared CreativeWorkflowPlan -> API compiles to hosted sequence
+sogni-agent --api-workflow creative-plan --workflow-input @plan.json
 # Storyline -> GPT Image 2 storyboard sheet -> Seedance video sequence
 sogni-agent --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
   "Create a 9:16 bakery launch video with a neon street-window reveal"
+# Sogni Intelligence replay records
+sogni-agent --list-replays 20
+sogni-agent --get-replay run_abc123 --json
 # Local segment + concat with external soundtrack
 sogni-agent --video --workflow v2v --ref-video dance.mp4 \
   --video-start 10 --duration 8 --controlnet-name pose -o /tmp/clip-2.mp4 \
@@ -311,12 +333,15 @@ Run `sogni-agent --help` for the full CLI. Below are the options and tables most
 | `--target-resolution <px>` | Target the short side, preserving aspect ratio |
 | `--workflow <type>` | Force `t2v`, `i2v`, `s2v`, `ia2v`, `a2v`, `v2v`, or animate workflows |
 | `--api-chat` | Use `/v1/chat/completions` with Sogni creative-agent tools |
-| `--api-workflow <kind>` | Start a `/v1/creative-agent/workflows` durable workflow: `image-to-video`, `hosted-tool-sequence`, or `storyboard-video` |
+| `--api-workflow <kind>` | Start a `/v1/creative-agent/workflows` durable workflow: `image-to-video`, `hosted-tool-sequence`, `creative-plan`, or `storyboard-video` |
 | `--workflow-input <json\|path\|@path>` | Explicit hosted workflow input JSON |
+| `--workflow-max-cost <n>`, `--confirm-cost`, `--no-confirm-cost` | Set durable workflow capacity ceiling and explicit cost confirmation |
 | `--storyboard-frames <n>` | Beat count for `--api-workflow storyboard-video` |
 | `--video-prompt`, `--negative-prompt`, `--generate-audio`, `--expand-prompt` | Durable image-to-video workflow inputs |
 | `--watch-workflow`, `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>` | Manage durable workflows |
-| `--api-tools <mode>`, `--no-api-tool-execution`, `--llm-model <id>`, `--api-base-url <url>` | Tune hosted API requests |
+| `--api-tools <mode>`, `--no-api-tool-execution`, `--llm-model <id>`, `--task-profile <profile>`, `--max-tokens <n>`, `--thinking` / `--no-thinking`, `--api-base-url <url>` | Tune hosted API requests |
+| `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM models |
+| `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|path\|@path>` | Manage Sogni Intelligence replay records |
 | `--persona <name>` | Use a saved persona |
 | `--concat-videos <out> <clips...>` | Stitch clips locally with FFmpeg |
 | `--last`, `--last-image` | Inspect last render / reuse last image as context or video reference |
@@ -474,17 +499,22 @@ Stored at `~/.config/sogni/personality.txt`.
 Hosted API modes require `SOGNI_API_KEY`.
-- **`--api-chat`** targets `/v1/chat/completions` with rich creative-agent tools — best for text-first natural-language workflows. Tune with `--api-tools creative-agent|rich|hosted|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
-- **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Supported kinds: `image-to-video`, `hosted-tool-sequence`, and `storyboard-video`.
+- **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools — best for text-first natural-language workflows. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
+- **Sogni Intelligence controls** include `--task-profile general|coding|reasoning`, `--max-tokens`, and `--thinking` / `--no-thinking`, which forward to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. Use `--list-api-models` or `--get-api-model <id>` to inspect `/v1/models`.
+- **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Supported kinds: `image-to-video`, `hosted-tool-sequence`, `creative-plan`, and `storyboard-video`.
+- **`--api-workflow creative-plan`** forwards a shared `CreativeWorkflowPlan` JSON object (`{ title?, steps: [...] }`) to the API as `kind: "creative_plan"`. Compilation, hosted-tool argument validation, and persistence happen in `../sogni-api` through `@sogni/creative-agent`; the public skill does not duplicate that compiler. Use this when you need exact shared-plan behavior such as repeated `replace_video_segment` steps with `replacementStartSeconds` / `replacementEndSeconds` for interleaved video slices.
 - **`--api-workflow storyboard-video`** generates a storyline, creates a single GPT Image 2 storyboard sheet, then passes that artifact into Seedance as the video reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low/medium/high quality for that storyboard sheet.
+- **Media references** from `-c`, `--ref`, `--ref-end`, `--ref-audio`, `--reference-audio-identity`, and `--ref-video` are forwarded as `media_references` metadata in hosted API requests. API chat also attaches image refs as vision inputs. Local file references are uploaded to Sogni media storage first, then forwarded as retrievable URLs so durable executors do not depend on `data:` URI support. Durable workflow JSON can bind those references into step arguments with `sourceStepId: "$input_media"`. Use direct CLI mode for private media that must not leave the local machine.
+- **Cost controls** use `--workflow-max-cost <n>` to reject workflow starts above a capacity-unit ceiling, and `--confirm-cost` / `--no-confirm-cost` to forward explicit billing confirmation.
 - Manage runs with `--watch-workflow`, `--workflow-events`, `--stream-workflow`, `--list-workflows`, `--get-workflow`, and `--cancel-workflow`. Use `--workflow-input` to provide exact hosted workflow JSON.
+- **Replay records** use `/v1/replay/records`: `--list-replays [limit]`, `--get-replay <runId>`, and `--ingest-replay <json|path|@path>` expose redacted RunRecord storage for Sogni Intelligence replay/debug viewers.
 Override the API origin with `--api-base-url`, `SOGNI_API_BASE_URL`, or `SOGNI_REST_ENDPOINT`.
 Hosted API credentials are only sent to `https://api.sogni.ai` by default. Add trusted custom
 hosts with `SOGNI_API_ALLOWED_HOSTS`; loopback or non-HTTPS local testing requires
 `SOGNI_ALLOW_UNSAFE_API_BASE_URL=1`.
-> Uploaded local media still uses the direct CLI path because hosted API modes do not accept CLI `--ref*` media flags for server-side tool execution.
+> The public skill consumes generated storyboard adapters from `../sogni-creative-agent`: `compileForModel()` now works in the bundled runtime for Seedance, GPT Image 2, LTX-2.3, and WAN storyboard stages.
 ---
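Reviewer note on the hunk above: the README says credentials go only to `https://api.sogni.ai` by default, that extra hosts come from `SOGNI_API_ALLOWED_HOSTS`, and that loopback or non-HTTPS targets need `SOGNI_ALLOW_UNSAFE_API_BASE_URL=1`. The sketch below is one possible reading of that rule, not the package's actual guard. The function names, the comma-separated list format, and the exact matching logic are assumptions made for illustration.

```javascript
// Hypothetical sketch: decide whether API credentials may be sent to a base URL.
// Env var names come from the README; list format and matching are assumptions.
const DEFAULT_API_HOST = "api.sogni.ai";

function isLoopback(hostname) {
  // WHATWG URL keeps the brackets on IPv6 hostnames, hence "[::1]".
  return hostname === "localhost" || hostname === "127.0.0.1" || hostname === "[::1]";
}

function mayReceiveCredentials(baseUrl, env = {}) {
  const url = new URL(baseUrl);
  const extraHosts = (env.SOGNI_API_ALLOWED_HOSTS || "")
    .split(",")
    .map((host) => host.trim().toLowerCase())
    .filter(Boolean);
  const trusted = url.hostname === DEFAULT_API_HOST || extraHosts.includes(url.hostname);
  const loopback = isLoopback(url.hostname);
  const needsUnsafeOptIn = url.protocol !== "https:" || loopback;
  if (needsUnsafeOptIn) {
    // Local or plain-HTTP targets require the explicit opt-in flag.
    return env.SOGNI_ALLOW_UNSAFE_API_BASE_URL === "1" && (trusted || loopback);
  }
  return trusted;
}

console.log(mayReceiveCredentials("https://api.sogni.ai"));
console.log(mayReceiveCredentials("http://localhost:3000"));
```

Passing the environment as a plain object keeps the check easy to test; a caller could pass `process.env` in practice.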
@@ -520,7 +550,7 @@ Tries SPARK first (free daily tokens), then falls back to SOGNI if the balance i
 ## Error Reporting & Output
 - **Exit codes:** failures use a non-zero exit code with human-readable stderr.
-- **Structured output:** add `--json` when an agent needs machine-parseable success/error data, or `--last` to inspect the last render.
+- **Structured output:** add `--json` when an agent needs machine-parseable success/error data, or `--last` to inspect the last render. JSON failures include canonical `errorType`, `errorCategory`, and `retryable` fields where the shared runtime can classify the error.
 - **Output files:** use `-o <path>` to save locally; otherwise the CLI prints a result URL.
 - **Quiet mode:** `-q` / `--quiet` suppresses progress output without changing exit semantics.
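Reviewer note on the README changes: the new text states that `--task-profile`, `--max-tokens`, and `--thinking` / `--no-thinking` are forwarded to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. A minimal sketch of that mapping might look like the following. The helper name and option names are hypothetical, and the `messages` / `model` fields are assumed from the OpenAI-compatible request shape rather than taken from the package source.

```javascript
// Hypothetical helper: maps CLI-style options onto a chat completions body.
// task_profile, max_tokens, and chat_template_kwargs.enable_thinking are named
// in the README; the option names and validation here are assumptions.
const TASK_PROFILES = ["general", "coding", "reasoning"];

function buildChatRequestBody(prompt, options = {}) {
  const body = { messages: [{ role: "user", content: prompt }] };
  if (options.llmModel) {
    body.model = options.llmModel;
  }
  if (options.taskProfile !== undefined) {
    if (!TASK_PROFILES.includes(options.taskProfile)) {
      throw new Error(`Unknown task profile: ${options.taskProfile}`);
    }
    body.task_profile = options.taskProfile;
  }
  if (Number.isInteger(options.maxTokens) && options.maxTokens > 0) {
    body.max_tokens = options.maxTokens;
  }
  // Only send the thinking toggle when the caller chose a side explicitly,
  // so the server default applies otherwise.
  if (typeof options.thinking === "boolean") {
    body.chat_template_kwargs = { enable_thinking: options.thinking };
  }
  return body;
}

const exampleBody = buildChatRequestBody(
  "Plan a concise multi-step product launch workflow",
  { taskProfile: "reasoning", thinking: false }
);
console.log(JSON.stringify(exampleBody, null, 2));
```

Leaving unset options out of the body, rather than sending `null`, mirrors the SKILL.md table where `--thinking` defaults to "server default".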

package/SKILL.md CHANGED

@@ -2,7 +2,7 @@
 name: sogni-creative-agent-skill
 description: "Sogni Creative Agent Skill: agent skill and CLI for image, video, and music generation using Sogni AI's decentralized GPU network. Supports personas (named people with saved reference photos and voice clips), persistent memories (user preferences across sessions), custom personality, style transfer, angle synthesis, and multi-step creative workflows. Ask the agent to \"draw\", \"generate\", \"create an image\", \"make a video/animate\", \"make music\", \"apply a style\", or \"generate me as a superhero\"."
 metadata:
-  version: "2.2.0"
+  version: "2.3.0"
   homepage: https://sogni.ai
   clawdbot:
     emoji: "🎨"
@@ -72,7 +72,7 @@ If that checkout does not exist, prefer the npm-based local skill install below,
 ## Setup
-1. **Get your Sogni API key** by logging into https://dashboard.sogni.ai and clicking your username.
+1. **Get your Sogni API key** by logging into https://dashboard.sogni.ai and opening the account menu.
 2. **Create an API key credentials file:**
 ```bash
 mkdir -p ~/.config/sogni
@@ -82,7 +82,7 @@ EOF
 chmod 600 ~/.config/sogni/credentials
 ```
-You can also export `SOGNI_API_KEY` instead of writing the file. The API key can always be found by logging into https://dashboard.sogni.ai and clicking your username.
+You can also export `SOGNI_API_KEY` instead of writing the file. The API key can always be found by logging into https://dashboard.sogni.ai and opening the account menu.
 3. **Install the CLI and skill by default:**
 ```bash
@@ -165,32 +165,73 @@ node sogni-agent.mjs --music --duration 30 \
 node sogni-agent.mjs --music --lyrics "Rise with the morning light" --bpm 128 \
   --keyscale "C major" --output-format mp3 "bright indie pop chorus"
-# Hosted API chat: natural-language rich creative-agent tool execution
+# Hosted API chat: natural-language creative-agent tool execution
 node sogni-agent.mjs --api-chat "Create a 4-shot product video concept for a red sneaker"
+# Hosted API chat with image vision and media-reference metadata
+node sogni-agent.mjs --api-chat --ref product.jpg \
+  "Turn this into a launch poster and describe the edit plan"
+# Sogni Intelligence model/replay utilities
+node sogni-agent.mjs --list-api-models
+node sogni-agent.mjs --api-chat --task-profile reasoning --no-thinking \
+  "Plan a concise multi-step product launch workflow"
+node sogni-agent.mjs --list-replays 20
+node sogni-agent.mjs --get-replay run_abc123 --json
 # Durable API workflow: async image-to-video with resumable workflow record
 node sogni-agent.mjs --api-workflow image-to-video \
   --video-prompt "The camera slowly pushes in as the sketch comes alive" \
   "A graphite robot sketch on a drafting table"
+# Durable API workflow with media reference and cost controls
+node sogni-agent.mjs --api-workflow image-to-video \
+  --ref https://cdn.example.com/sketch.png \
+  --workflow-max-cost 25 --confirm-cost \
+  --video-prompt "The camera slowly pushes in as the sketch comes alive" \
+  "Animate the referenced sketch"
+# Shared CreativeWorkflowPlan: API compiles and validates through @sogni/creative-agent
+node sogni-agent.mjs --api-workflow creative-plan --workflow-input @plan.json
 # Durable storyboard-video workflow: storyline -> GPT Image 2 storyboard -> Seedance
 node sogni-agent.mjs --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
   "Create a 9:16 bakery launch video with a neon street-window reveal"
 ```
 Use `--api-chat` for text-first natural-language workflows that should go through
-Sogni API's OpenAI-compatible `/v1/chat/completions` tool loop. Use
-`--api-workflow` when the caller already knows it wants an async durable workflow
-record under `/v1/creative-agent/workflows`. Use `--api-workflow storyboard-video`
+Sogni API's OpenAI-compatible `/v1/chat/completions` tool loop. This path
+sanitizes prompt-injection markers before forwarding messages and uses the
+current hosted creative-agent tool surface. Use `--api-workflow` when the caller
+already knows it wants an async durable workflow record under
+`/v1/creative-agent/workflows`. Use `--api-workflow creative-plan` when the
+caller already has a shared `CreativeWorkflowPlan`; the skill forwards it as
+`kind: "creative_plan"` and lets Sogni API compile, validate, and persist it
+through `@sogni/creative-agent`. This is the preferred hosted path for exact
+multi-step plans, including repeated `replace_video_segment` operations with
+`replacementStartSeconds` / `replacementEndSeconds` when interleaving existing
+video slices. Use `--api-workflow storyboard-video`
 when the caller wants the hosted sequence to generate a storyline, create one GPT
 Image 2 storyboard sheet, and feed that image artifact into Seedance as the video
 reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low|medium|high
-quality for the storyboard sheet. Uploaded-media execution still
-belongs on the direct CLI path (`-c`, `--ref`, `--ref-audio`, `--ref-video`)
-until the hosted rich API and durable workflow endpoint support uploaded
-negative-index media references through CLI media flags.
-Hosted API modes require `SOGNI_API_KEY`; username/password credentials are only
-for the direct client-wrapper path.
+quality for the storyboard sheet. Hosted API requests forward media references
+from `-c`, `--ref`, `--ref-end`, `--ref-audio`,
+`--reference-audio-identity`, and `--ref-video` as `media_references`
+metadata; workflow JSON can bind them into step arguments with
+`sourceStepId: "$input_media"`, and API chat also attaches image refs as vision
+inputs. Local file references are uploaded to Sogni media storage first, then
+forwarded as retrievable URLs for hosted chat and durable workflows. Use the
+direct CLI path for private media that must not leave the local machine.
+Use `--workflow-max-cost <n>` plus `--confirm-cost` / `--no-confirm-cost` to
+forward explicit workflow cost policy.
+Sogni Intelligence utilities are exposed through the same API key path:
+`--list-api-models` / `--get-api-model <id>` read `/v1/models`,
+`--task-profile`, `--max-tokens`, and `--thinking` / `--no-thinking` tune
+`/v1/chat/completions`, and `--list-replays`, `--get-replay`, and
+`--ingest-replay` manage `/v1/replay/records` RunRecords for replay/debug
+viewers.
+Hosted API modes require `SOGNI_API_KEY`; this skill's CLI uses API-key
+authentication.
 When changing hosted API chat/workflow behavior, keep reusable validation,
 workflow compilation, repair-control, and guard telemetry logic in
@@ -286,13 +327,20 @@ positions.
 | `--concat-audio <path>` | Optional audio track to mux over `--concat-videos` output | - |
 | `--concat-audio-start <sec>` | Start offset into `--concat-audio` | - |
 | `--list-media [type]` | List recent inbound media (images\|audio\|all) | images |
-| `--api-chat` | Call `/v1/chat/completions` with rich creative-agent tool injection | - |
-| `--api-tools <mode>` | API tool mode: creative-agent\|rich\|hosted\|none | creative-agent |
+| `--api-chat` | Call `/v1/chat/completions` with Sogni creative-agent tool injection | - |
+| `--api-tools <mode>` | API tool mode: creative-agent\|creative-tools\|none | creative-agent |
 | `--no-api-tool-execution` | Plan/tool-call via API chat without executing Sogni tools | - |
 | `--llm-model <id>` | LLM model for `--api-chat` | qwen3.6-35b-a3b-gguf-iq4xs |
-| `--api-workflow <kind>` | Start durable workflow: image-to-video\|hosted-tool-sequence\|storyboard-video | - |
+| `--task-profile <profile>` | Sogni Intelligence task profile: general\|coding\|reasoning | - |
+| `--max-tokens <n>` | Max hosted chat completion tokens | 1600 |
+| `--thinking`, `--no-thinking` | Toggle `chat_template_kwargs.enable_thinking` for hosted chat | server default |
+| `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM model metadata | - |
+| `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|path\|@path>` | Manage Sogni Intelligence replay RunRecords | - |
+| `--api-workflow <kind>` | Start durable workflow: image-to-video\|hosted-tool-sequence\|creative-plan\|storyboard-video | - |
 | `--workflow-input <json\|path\|@path>` | Workflow input JSON for hosted tool sequences/custom starts | - |
-| `--workflow-title <text>` | Title for hosted-tool-sequence workflow input | - |
+| `--workflow-title <text>` | Title for hosted-tool-sequence, creative-plan, or storyboard-video workflow input | - |
+| `--workflow-max-cost <n>` | Reject hosted workflow starts above this estimated capacity-unit ceiling | - |
+| `--confirm-cost`, `--no-confirm-cost` | Forward explicit hosted workflow cost confirmation | - |
 | `--storyboard-frames <n>` | Beat count for storyboard-video workflow | - |
 | `--video-prompt <text>` | Motion prompt for durable image-to-video workflow | - |
 | `--negative-prompt <text>` | Negative prompt for durable image-to-video workflow | - |
@@ -349,7 +397,12 @@ When installed as an OpenClaw plugin, Sogni Creative Agent Skill will read defau
           "defaultTokenType": "spark",
           "apiBaseUrl": "https://api.sogni.ai",
           "defaultLlmModel": "qwen3.6-35b-a3b-gguf-iq4xs",
+          "defaultTaskProfile": "general",
+          "defaultApiMaxTokens": 1600,
+          "defaultApiThinking": false,
           "defaultApiToolMode": "creative-agent",
+          "defaultWorkflowMaxCost": 25,
+          "defaultWorkflowConfirmCost": false,
           "seedStrategy": "prompt-hash",
           "modelDefaults": {
             "flux1-schnell-fp8": { "steps": 4, "guidance": 3.5 },
@@ -877,6 +930,9 @@ On error (with `--json`), the script returns a single JSON object like:
   "success": false,
   "error": "Reference image 2314x1200 would resize to 512x266, but both dimensions must be divisible by 16.",
   "errorCode": "INVALID_VIDEO_SIZE",
+  "errorType": "PARAMETER_INVALID",
+  "errorCategory": "schema_validation",
+  "retryable": false,
   "hint": "Try: --width 1296 --height 672 (or omit --strict-size)"
 }
 ```
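Reviewer note on the new error fields: 2.3.0 adds `errorType`, `errorCategory`, and `retryable` to `--json` failures where the shared runtime can classify the error. A calling agent could branch on them roughly as sketched below. The function name and the returned `action` values are hypothetical, and treating any output without `"success": false` as success is an assumption about the success shape.

```javascript
// Hypothetical consumer of the CLI's --json output.
// Field names come from the SKILL.md example; the decision policy is an assumption.
function classifyCliResult(stdout) {
  let parsed;
  try {
    parsed = JSON.parse(stdout);
  } catch {
    return { action: "fail", reason: "unparseable output" };
  }
  if (parsed.success !== false) {
    return { action: "done" };
  }
  if (parsed.retryable === true) {
    // The classified fields may be absent, so fall back to errorCode.
    return { action: "retry", reason: parsed.errorType || parsed.errorCode || "unknown" };
  }
  // Non-retryable or unclassified failures: surface the hint when one is present.
  return { action: "fail", reason: parsed.hint || parsed.error || "unknown error" };
}

const sampleError = JSON.stringify({
  success: false,
  error: "Reference image 2314x1200 would resize to 512x266, but both dimensions must be divisible by 16.",
  errorCode: "INVALID_VIDEO_SIZE",
  errorType: "PARAMETER_INVALID",
  errorCategory: "schema_validation",
  retryable: false,
  hint: "Try: --width 1296 --height 672 (or omit --strict-size)",
});
console.log(classifyCliResult(sampleError));
```

Because the fields appear only "where the shared runtime can classify the error", the sketch treats a missing `retryable` as non-retryable instead of guessing.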