@sogni-ai/sogni-creative-agent-skill 2.2.0 → 2.3.0

package/README.md CHANGED
@@ -68,7 +68,7 @@ With this skill, an agent can:
68
68
 
69
69
  ## Quick Start
70
70
 
71
- 1. Get a Sogni API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (click your username) and save it — see [Setup](#setup-sogni-api-key).
71
+ 1. Get a Sogni API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (open the account menu) and save it — see [Setup](#setup-sogni-api-key).
72
72
  2. Install the CLI:
73
73
 
74
74
  ```bash
@@ -147,7 +147,7 @@ The generated `.openclaw-link/` directory is only for OpenClaw; Hermes, Manus, a
147
147
 
148
148
  #### OpenClaw configuration
149
149
 
150
- When loaded through OpenClaw, this skill reads plugin defaults from OpenClaw config; CLI flags always override them. The supported config schema is defined in [`openclaw.plugin.json`](./openclaw.plugin.json) and includes default models, video workflow models, hosted API defaults (`apiBaseUrl`, `defaultLlmModel`, `defaultApiToolMode`), token type, seed strategy, timeouts, and media paths. If your OpenClaw config lives elsewhere, set `OPENCLAW_CONFIG_PATH`.
150
+ When loaded through OpenClaw, this skill reads plugin defaults from OpenClaw config; CLI flags always override them. The supported config schema is defined in [`openclaw.plugin.json`](./openclaw.plugin.json) and includes default models, video workflow models, hosted API defaults (`apiBaseUrl`, `defaultLlmModel`, `defaultTaskProfile`, `defaultApiMaxTokens`, `defaultApiThinking`, `defaultApiToolMode`, workflow cost defaults), token type, seed strategy, timeouts, and media paths. If your OpenClaw config lives elsewhere, set `OPENCLAW_CONFIG_PATH`.
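The precedence rule in the paragraph above (plugin defaults from OpenClaw config, CLI flags always win) can be sketched as follows. This is a minimal illustration, not the skill's actual resolver: the config key names are taken from the README text, while `resolveHostedApiSettings` and the shape of the `cliFlags` object are hypothetical.

```javascript
// Hypothetical helper: resolve hosted-API settings so that CLI flags take
// precedence over OpenClaw plugin defaults. Config key names come from the
// README; the function name and the cliFlags shape are assumptions.
function resolveHostedApiSettings(pluginConfig = {}, cliFlags = {}) {
  // Use the CLI value when the flag was actually passed (not undefined),
  // otherwise fall back to the plugin default.
  const pick = (flagValue, configValue) =>
    flagValue !== undefined ? flagValue : configValue;

  return {
    llmModel: pick(cliFlags.llmModel, pluginConfig.defaultLlmModel),
    taskProfile: pick(cliFlags.taskProfile, pluginConfig.defaultTaskProfile),
    maxTokens: pick(cliFlags.maxTokens, pluginConfig.defaultApiMaxTokens),
    thinking: pick(cliFlags.thinking, pluginConfig.defaultApiThinking),
    apiToolMode: pick(cliFlags.apiTools, pluginConfig.defaultApiToolMode),
  };
}

// Plugin defaults say "general", but the CLI passed --task-profile reasoning.
const settings = resolveHostedApiSettings(
  { defaultTaskProfile: "general", defaultApiMaxTokens: 1600, defaultApiThinking: false },
  { taskProfile: "reasoning" }
);
console.log(settings.taskProfile, settings.maxTokens); // prints "reasoning 1600"
```

Checking for `undefined` rather than falsiness matters here, because `false` and `0` are legitimate flag values (for example `--no-thinking`).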
151
151
 
152
152
  ### Hermes Agent / Manus / other frameworks
153
153
 
@@ -186,7 +186,7 @@ If the checkout is missing, use the npm install path above or explicitly approve
186
186
 
187
187
  ## Setup (Sogni API key)
188
188
 
189
- 1. Get your API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (click your username).
189
+ 1. Get your API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (open the account menu).
190
190
  2. Save it to a credentials file:
191
191
 
192
192
  ```bash
@@ -262,19 +262,41 @@ sogni-agent --music --lyrics "Rise with the morning light" --bpm 128 \
262
262
  sogni-agent --video --reference-audio-identity voice.webm \
263
263
  'NARRATOR: "This is my voice."'
264
264
 
265
- # Hosted chat with rich creative-agent tools (/v1/chat/completions)
265
+ # Hosted chat with Sogni creative-agent tools (/v1/chat/completions)
266
266
  sogni-agent --api-chat \
267
267
  "Create a 4-shot product video concept for a red sneaker"
268
268
 
269
+ # Hosted chat with image vision plus media-reference metadata
270
+ sogni-agent --api-chat --ref product.jpg \
271
+ "Turn this into a launch poster and describe the edit plan"
272
+
273
+ # Hosted chat controls and model discovery
274
+ sogni-agent --api-chat --task-profile reasoning --no-thinking \
275
+ "Plan a concise multi-step product launch workflow"
276
+ sogni-agent --list-api-models
277
+
269
278
  # Durable hosted workflow (/v1/creative-agent/workflows)
270
279
  sogni-agent --api-workflow image-to-video \
271
280
  --video-prompt "The camera slowly pushes in as the sketch comes alive" \
272
281
  "A graphite robot sketch on a drafting table"
273
282
 
283
+ # Durable workflow with a media reference and a cost ceiling
284
+ sogni-agent --api-workflow image-to-video --ref https://cdn.example.com/sketch.png \
285
+ --workflow-max-cost 25 --confirm-cost \
286
+ --video-prompt "The camera slowly pushes in as the sketch comes alive" \
287
+ "Animate the referenced sketch"
288
+
289
+ # Shared CreativeWorkflowPlan -> API compiles to hosted sequence
290
+ sogni-agent --api-workflow creative-plan --workflow-input @plan.json
291
+
274
292
  # Storyline -> GPT Image 2 storyboard sheet -> Seedance video sequence
275
293
  sogni-agent --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
276
294
  "Create a 9:16 bakery launch video with a neon street-window reveal"
277
295
 
296
+ # Sogni Intelligence replay records
297
+ sogni-agent --list-replays 20
298
+ sogni-agent --get-replay run_abc123 --json
299
+
278
300
  # Local segment + concat with external soundtrack
279
301
  sogni-agent --video --workflow v2v --ref-video dance.mp4 \
280
302
  --video-start 10 --duration 8 --controlnet-name pose -o /tmp/clip-2.mp4 \
@@ -311,12 +333,15 @@ Run `sogni-agent --help` for the full CLI. Below are the options and tables most
311
333
  | `--target-resolution <px>` | Target the short side, preserving aspect ratio |
312
334
  | `--workflow <type>` | Force `t2v`, `i2v`, `s2v`, `ia2v`, `a2v`, `v2v`, or animate workflows |
313
335
  | `--api-chat` | Use `/v1/chat/completions` with Sogni creative-agent tools |
314
- | `--api-workflow <kind>` | Start a `/v1/creative-agent/workflows` durable workflow: `image-to-video`, `hosted-tool-sequence`, or `storyboard-video` |
336
+ | `--api-workflow <kind>` | Start a `/v1/creative-agent/workflows` durable workflow: `image-to-video`, `hosted-tool-sequence`, `creative-plan`, or `storyboard-video` |
315
337
  | `--workflow-input <json\|path\|@path>` | Explicit hosted workflow input JSON |
338
+ | `--workflow-max-cost <n>`, `--confirm-cost`, `--no-confirm-cost` | Set durable workflow capacity ceiling and explicit cost confirmation |
316
339
  | `--storyboard-frames <n>` | Beat count for `--api-workflow storyboard-video` |
317
340
  | `--video-prompt`, `--negative-prompt`, `--generate-audio`, `--expand-prompt` | Durable image-to-video workflow inputs |
318
341
  | `--watch-workflow`, `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>` | Manage durable workflows |
319
- | `--api-tools <mode>`, `--no-api-tool-execution`, `--llm-model <id>`, `--api-base-url <url>` | Tune hosted API requests |
342
+ | `--api-tools <mode>`, `--no-api-tool-execution`, `--llm-model <id>`, `--task-profile <profile>`, `--max-tokens <n>`, `--thinking` / `--no-thinking`, `--api-base-url <url>` | Tune hosted API requests |
343
+ | `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM models |
344
+ | `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|path\|@path>` | Manage Sogni Intelligence replay records |
320
345
  | `--persona <name>` | Use a saved persona |
321
346
  | `--concat-videos <out> <clips...>` | Stitch clips locally with FFmpeg |
322
347
  | `--last`, `--last-image` | Inspect last render / reuse last image as context or video reference |
@@ -474,17 +499,22 @@ Stored at `~/.config/sogni/personality.txt`.
474
499
 
475
500
  Hosted API modes require `SOGNI_API_KEY`.
476
501
 
477
- - **`--api-chat`** targets `/v1/chat/completions` with rich creative-agent tools — best for text-first natural-language workflows. Tune with `--api-tools creative-agent|rich|hosted|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
478
- - **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Supported kinds: `image-to-video`, `hosted-tool-sequence`, and `storyboard-video`.
502
+ - **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools — best for text-first natural-language workflows. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
503
+ - **Sogni Intelligence controls** include `--task-profile general|coding|reasoning`, `--max-tokens`, and `--thinking` / `--no-thinking`, which forward to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. Use `--list-api-models` or `--get-api-model <id>` to inspect `/v1/models`.
504
+ - **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Supported kinds: `image-to-video`, `hosted-tool-sequence`, `creative-plan`, and `storyboard-video`.
505
+ - **`--api-workflow creative-plan`** forwards a shared `CreativeWorkflowPlan` JSON object (`{ title?, steps: [...] }`) to the API as `kind: "creative_plan"`. Compilation, hosted-tool argument validation, and persistence happen in `../sogni-api` through `@sogni/creative-agent`; the public skill does not duplicate that compiler. Use this when you need exact shared-plan behavior such as repeated `replace_video_segment` steps with `replacementStartSeconds` / `replacementEndSeconds` for interleaved video slices.
479
506
  - **`--api-workflow storyboard-video`** generates a storyline, creates a single GPT Image 2 storyboard sheet, then passes that artifact into Seedance as the video reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low/medium/high quality for that storyboard sheet.
507
+ - **Media references** from `-c`, `--ref`, `--ref-end`, `--ref-audio`, `--reference-audio-identity`, and `--ref-video` are forwarded as `media_references` metadata in hosted API requests. API chat also attaches image refs as vision inputs. Local file references are uploaded to Sogni media storage first, then forwarded as retrievable URLs so durable executors do not depend on `data:` URI support. Durable workflow JSON can bind those references into step arguments with `sourceStepId: "$input_media"`. Use direct CLI mode for private media that must not leave the local machine.
508
+ - **Cost controls** use `--workflow-max-cost <n>` to reject workflow starts above a capacity-unit ceiling, and `--confirm-cost` / `--no-confirm-cost` to forward explicit billing confirmation.
480
509
  - Manage runs with `--watch-workflow`, `--workflow-events`, `--stream-workflow`, `--list-workflows`, `--get-workflow`, and `--cancel-workflow`. Use `--workflow-input` to provide exact hosted workflow JSON.
510
+ - **Replay records** use `/v1/replay/records`: `--list-replays [limit]`, `--get-replay <runId>`, and `--ingest-replay <json|path|@path>` expose redacted RunRecord storage for Sogni Intelligence replay/debug viewers.
481
511
 
482
512
  Override the API origin with `--api-base-url`, `SOGNI_API_BASE_URL`, or `SOGNI_REST_ENDPOINT`.
483
513
  Hosted API credentials are only sent to `https://api.sogni.ai` by default. Add trusted custom
484
514
  hosts with `SOGNI_API_ALLOWED_HOSTS`; loopback or non-HTTPS local testing requires
485
515
  `SOGNI_ALLOW_UNSAFE_API_BASE_URL=1`.
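The credential-scoping policy above could be checked along these lines. This is only a sketch under stated assumptions: the README does not specify the format of `SOGNI_API_ALLOWED_HOSTS` (a comma-separated host list is assumed here), nor exactly which addresses count as loopback, and `isApiBaseUrlAllowed` is a hypothetical name. As a simplification, any non-HTTPS origin is accepted once the unsafe opt-in is set.

```javascript
// Hypothetical check mirroring the documented policy. Assumptions: the
// allow-list variable holds comma-separated host names, and "loopback" means
// localhost, 127.0.0.1, or [::1]. The real CLI may differ on both points.
function isApiBaseUrlAllowed(baseUrl, env = {}) {
  const url = new URL(baseUrl);
  const allowUnsafe = env.SOGNI_ALLOW_UNSAFE_API_BASE_URL === "1";
  const isLoopback = ["localhost", "127.0.0.1", "[::1]"].includes(url.hostname);

  // Non-HTTPS or loopback origins need the explicit unsafe opt-in.
  if (url.protocol !== "https:" || isLoopback) return allowUnsafe;

  // Otherwise the host must be the default API host or explicitly trusted.
  const allowedHosts = (env.SOGNI_API_ALLOWED_HOSTS || "")
    .split(",")
    .map((host) => host.trim().toLowerCase())
    .filter(Boolean);
  return url.hostname === "api.sogni.ai" || allowedHosts.includes(url.hostname);
}

console.log(isApiBaseUrlAllowed("https://api.sogni.ai")); // prints true
console.log(isApiBaseUrlAllowed("http://localhost:3000")); // prints false
```

Passing the environment as a parameter, instead of reading `process.env` directly, keeps the check easy to exercise in tests.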
486
516
 
487
- > Uploaded local media still uses the direct CLI path because hosted API modes do not accept CLI `--ref*` media flags for server-side tool execution.
517
+ > The public skill consumes generated storyboard adapters from `../sogni-creative-agent`: `compileForModel()` now works in the bundled runtime for Seedance, GPT Image 2, LTX-2.3, and WAN storyboard stages.
488
518
 
489
519
  ---
490
520
 
@@ -520,7 +550,7 @@ Tries SPARK first (free daily tokens), then falls back to SOGNI if the balance i
520
550
  ## Error Reporting & Output
521
551
 
522
552
  - **Exit codes:** failures use a non-zero exit code with human-readable stderr.
523
- - **Structured output:** add `--json` when an agent needs machine-parseable success/error data, or `--last` to inspect the last render.
553
+ - **Structured output:** add `--json` when an agent needs machine-parseable success/error data, or `--last` to inspect the last render. JSON failures include canonical `errorType`, `errorCategory`, and `retryable` fields where the shared runtime can classify the error.
524
554
  - **Output files:** use `-o <path>` to save locally; otherwise the CLI prints a result URL.
525
555
  - **Quiet mode:** `-q` / `--quiet` suppresses progress output without changing exit semantics.
526
556
 
package/SKILL.md CHANGED
@@ -2,7 +2,7 @@
2
2
  name: sogni-creative-agent-skill
3
3
  description: "Sogni Creative Agent Skill: agent skill and CLI for image, video, and music generation using Sogni AI's decentralized GPU network. Supports personas (named people with saved reference photos and voice clips), persistent memories (user preferences across sessions), custom personality, style transfer, angle synthesis, and multi-step creative workflows. Ask the agent to \"draw\", \"generate\", \"create an image\", \"make a video/animate\", \"make music\", \"apply a style\", or \"generate me as a superhero\"."
4
4
  metadata:
5
- version: "2.2.0"
5
+ version: "2.3.0"
6
6
  homepage: https://sogni.ai
7
7
  clawdbot:
8
8
  emoji: "🎨"
@@ -72,7 +72,7 @@ If that checkout does not exist, prefer the npm-based local skill install below,
72
72
 
73
73
  ## Setup
74
74
 
75
- 1. **Get your Sogni API key** by logging into https://dashboard.sogni.ai and clicking your username.
75
+ 1. **Get your Sogni API key** by logging into https://dashboard.sogni.ai and opening the account menu.
76
76
  2. **Create an API key credentials file:**
77
77
  ```bash
78
78
  mkdir -p ~/.config/sogni
@@ -82,7 +82,7 @@ EOF
82
82
  chmod 600 ~/.config/sogni/credentials
83
83
  ```
84
84
 
85
- You can also export `SOGNI_API_KEY` instead of writing the file. The API key can always be found by logging into https://dashboard.sogni.ai and clicking your username.
85
+ You can also export `SOGNI_API_KEY` instead of writing the file. The API key can always be found by logging into https://dashboard.sogni.ai and opening the account menu.
86
86
 
87
87
  3. **Install the CLI and skill by default:**
88
88
  ```bash
@@ -165,32 +165,73 @@ node sogni-agent.mjs --music --duration 30 \
165
165
  node sogni-agent.mjs --music --lyrics "Rise with the morning light" --bpm 128 \
166
166
  --keyscale "C major" --output-format mp3 "bright indie pop chorus"
167
167
 
168
- # Hosted API chat: natural-language rich creative-agent tool execution
168
+ # Hosted API chat: natural-language creative-agent tool execution
169
169
  node sogni-agent.mjs --api-chat "Create a 4-shot product video concept for a red sneaker"
170
170
 
171
+ # Hosted API chat with image vision and media-reference metadata
172
+ node sogni-agent.mjs --api-chat --ref product.jpg \
173
+ "Turn this into a launch poster and describe the edit plan"
174
+
175
+ # Sogni Intelligence model/replay utilities
176
+ node sogni-agent.mjs --list-api-models
177
+ node sogni-agent.mjs --api-chat --task-profile reasoning --no-thinking \
178
+ "Plan a concise multi-step product launch workflow"
179
+ node sogni-agent.mjs --list-replays 20
180
+ node sogni-agent.mjs --get-replay run_abc123 --json
181
+
171
182
  # Durable API workflow: async image-to-video with resumable workflow record
172
183
  node sogni-agent.mjs --api-workflow image-to-video \
173
184
  --video-prompt "The camera slowly pushes in as the sketch comes alive" \
174
185
  "A graphite robot sketch on a drafting table"
175
186
 
187
+ # Durable API workflow with media reference and cost controls
188
+ node sogni-agent.mjs --api-workflow image-to-video \
189
+ --ref https://cdn.example.com/sketch.png \
190
+ --workflow-max-cost 25 --confirm-cost \
191
+ --video-prompt "The camera slowly pushes in as the sketch comes alive" \
192
+ "Animate the referenced sketch"
193
+
194
+ # Shared CreativeWorkflowPlan: API compiles and validates through @sogni/creative-agent
195
+ node sogni-agent.mjs --api-workflow creative-plan --workflow-input @plan.json
196
+
176
197
  # Durable storyboard-video workflow: storyline -> GPT Image 2 storyboard -> Seedance
177
198
  node sogni-agent.mjs --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
178
199
  "Create a 9:16 bakery launch video with a neon street-window reveal"
179
200
  ```
180
201
 
181
202
  Use `--api-chat` for text-first natural-language workflows that should go through
182
- Sogni API's OpenAI-compatible `/v1/chat/completions` tool loop. Use
183
- `--api-workflow` when the caller already knows it wants an async durable workflow
184
- record under `/v1/creative-agent/workflows`. Use `--api-workflow storyboard-video`
203
+ Sogni API's OpenAI-compatible `/v1/chat/completions` tool loop. This path
204
+ sanitizes prompt-injection markers before forwarding messages and uses the
205
+ current hosted creative-agent tool surface. Use `--api-workflow` when the caller
206
+ already knows it wants an async durable workflow record under
207
+ `/v1/creative-agent/workflows`. Use `--api-workflow creative-plan` when the
208
+ caller already has a shared `CreativeWorkflowPlan`; the skill forwards it as
209
+ `kind: "creative_plan"` and lets Sogni API compile, validate, and persist it
210
+ through `@sogni/creative-agent`. This is the preferred hosted path for exact
211
+ multi-step plans, including repeated `replace_video_segment` operations with
212
+ `replacementStartSeconds` / `replacementEndSeconds` when interleaving existing
213
+ video slices. Use `--api-workflow storyboard-video`
185
214
  when the caller wants the hosted sequence to generate a storyline, create one GPT
186
215
  Image 2 storyboard sheet, and feed that image artifact into Seedance as the video
187
216
  reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low|medium|high
188
- quality for the storyboard sheet. Uploaded-media execution still
189
- belongs on the direct CLI path (`-c`, `--ref`, `--ref-audio`, `--ref-video`)
190
- until the hosted rich API and durable workflow endpoint support uploaded
191
- negative-index media references through CLI media flags.
192
- Hosted API modes require `SOGNI_API_KEY`; username/password credentials are only
193
- for the direct client-wrapper path.
217
+ quality for the storyboard sheet. Hosted API requests forward media references
218
+ from `-c`, `--ref`, `--ref-end`, `--ref-audio`,
219
+ `--reference-audio-identity`, and `--ref-video` as `media_references`
220
+ metadata; workflow JSON can bind them into step arguments with
221
+ `sourceStepId: "$input_media"`, and API chat also attaches image refs as vision
222
+ inputs. Local file references are uploaded to Sogni media storage first, then
223
+ forwarded as retrievable URLs for hosted chat and durable workflows. Use the
224
+ direct CLI path for private media that must not leave the local machine.
225
+ Use `--workflow-max-cost <n>` plus `--confirm-cost` / `--no-confirm-cost` to
226
+ forward explicit workflow cost policy.
227
+ Sogni Intelligence utilities are exposed through the same API key path:
228
+ `--list-api-models` / `--get-api-model <id>` read `/v1/models`,
229
+ `--task-profile`, `--max-tokens`, and `--thinking` / `--no-thinking` tune
230
+ `/v1/chat/completions`, and `--list-replays`, `--get-replay`, and
231
+ `--ingest-replay` manage `/v1/replay/records` RunRecords for replay/debug
232
+ viewers.
233
+ Hosted API modes require `SOGNI_API_KEY`; this skill's CLI uses API-key
234
+ authentication.
194
235
 
195
236
  When changing hosted API chat/workflow behavior, keep reusable validation,
196
237
  workflow compilation, repair-control, and guard telemetry logic in
@@ -286,13 +327,20 @@ positions.
286
327
  | `--concat-audio <path>` | Optional audio track to mux over `--concat-videos` output | - |
287
328
  | `--concat-audio-start <sec>` | Start offset into `--concat-audio` | - |
288
329
  | `--list-media [type]` | List recent inbound media (images\|audio\|all) | images |
289
- | `--api-chat` | Call `/v1/chat/completions` with rich creative-agent tool injection | - |
290
- | `--api-tools <mode>` | API tool mode: creative-agent\|rich\|hosted\|none | creative-agent |
330
+ | `--api-chat` | Call `/v1/chat/completions` with Sogni creative-agent tool injection | - |
331
+ | `--api-tools <mode>` | API tool mode: creative-agent\|creative-tools\|none | creative-agent |
291
332
  | `--no-api-tool-execution` | Plan/tool-call via API chat without executing Sogni tools | - |
292
333
  | `--llm-model <id>` | LLM model for `--api-chat` | qwen3.6-35b-a3b-gguf-iq4xs |
293
- | `--api-workflow <kind>` | Start durable workflow: image-to-video\|hosted-tool-sequence\|storyboard-video | - |
334
+ | `--task-profile <profile>` | Sogni Intelligence task profile: general\|coding\|reasoning | - |
335
+ | `--max-tokens <n>` | Max hosted chat completion tokens | 1600 |
336
+ | `--thinking`, `--no-thinking` | Toggle `chat_template_kwargs.enable_thinking` for hosted chat | server default |
337
+ | `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM model metadata | - |
338
+ | `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|path\|@path>` | Manage Sogni Intelligence replay RunRecords | - |
339
+ | `--api-workflow <kind>` | Start durable workflow: image-to-video\|hosted-tool-sequence\|creative-plan\|storyboard-video | - |
294
340
  | `--workflow-input <json\|path\|@path>` | Workflow input JSON for hosted tool sequences/custom starts | - |
295
- | `--workflow-title <text>` | Title for hosted-tool-sequence workflow input | - |
341
+ | `--workflow-title <text>` | Title for hosted-tool-sequence, creative-plan, or storyboard-video workflow input | - |
342
+ | `--workflow-max-cost <n>` | Reject hosted workflow starts above this estimated capacity-unit ceiling | - |
343
+ | `--confirm-cost`, `--no-confirm-cost` | Forward explicit hosted workflow cost confirmation | - |
296
344
  | `--storyboard-frames <n>` | Beat count for storyboard-video workflow | - |
297
345
  | `--video-prompt <text>` | Motion prompt for durable image-to-video workflow | - |
298
346
  | `--negative-prompt <text>` | Negative prompt for durable image-to-video workflow | - |
@@ -349,7 +397,12 @@ When installed as an OpenClaw plugin, Sogni Creative Agent Skill will read defau
349
397
  "defaultTokenType": "spark",
350
398
  "apiBaseUrl": "https://api.sogni.ai",
351
399
  "defaultLlmModel": "qwen3.6-35b-a3b-gguf-iq4xs",
400
+ "defaultTaskProfile": "general",
401
+ "defaultApiMaxTokens": 1600,
402
+ "defaultApiThinking": false,
352
403
  "defaultApiToolMode": "creative-agent",
404
+ "defaultWorkflowMaxCost": 25,
405
+ "defaultWorkflowConfirmCost": false,
353
406
  "seedStrategy": "prompt-hash",
354
407
  "modelDefaults": {
355
408
  "flux1-schnell-fp8": { "steps": 4, "guidance": 3.5 },
@@ -877,6 +930,9 @@ On error (with `--json`), the script returns a single JSON object like:
877
930
  "success": false,
878
931
  "error": "Reference image 2314x1200 would resize to 512x266, but both dimensions must be divisible by 16.",
879
932
  "errorCode": "INVALID_VIDEO_SIZE",
933
+ "errorType": "PARAMETER_INVALID",
934
+ "errorCategory": "schema_validation",
935
+ "retryable": false,
880
936
  "hint": "Try: --width 1296 --height 672 (or omit --strict-size)"
881
937
  }
882
938
  ```
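An agent consuming `--json` output might use the new `errorType`, `errorCategory`, and `retryable` fields roughly as sketched below. The field names come from the example error object above; `classifyFailure` and its retry policy are hypothetical and only show one way to act on those fields.

```javascript
// Hypothetical agent-side helper: decide what to do with a --json failure.
// Field names come from the example error object; the policy is an assumption.
function classifyFailure(stdoutJson) {
  const result = JSON.parse(stdoutJson);
  if (result.success !== false) return { action: "none" };
  // `retryable` may be absent when the runtime cannot classify the error,
  // so only an explicit `true` is treated as safe to retry.
  if (result.retryable === true) return { action: "retry" };
  return {
    action: "fix-input",
    category: result.errorCategory ?? "unknown",
    hint: result.hint ?? null,
  };
}

const decision = classifyFailure(JSON.stringify({
  success: false,
  errorCode: "INVALID_VIDEO_SIZE",
  errorType: "PARAMETER_INVALID",
  errorCategory: "schema_validation",
  retryable: false,
  hint: "Try: --width 1296 --height 672 (or omit --strict-size)",
}));
console.log(decision.action, decision.category); // prints "fix-input schema_validation"
```

Treating a missing `retryable` as non-retryable is the conservative choice, since the README says the fields appear only where the shared runtime can classify the error.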