@sogni-ai/sogni-creative-agent-skill 3.1.0 → 3.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/SKILL.md CHANGED
@@ -2,7 +2,7 @@
2
2
  name: sogni-creative-agent-skill
3
3
  description: "Sogni Creative Agent Skill: agent skill and CLI for image, video, and music generation using Sogni AI's decentralized GPU network. Supports personas (named people with saved reference photos and voice clips), persistent memories (user preferences across sessions), custom personality, style transfer, angle synthesis, and multi-step creative workflows. Ask the agent to \"draw\", \"generate\", \"create an image\", \"make a video/animate\", \"make music\", \"apply a style\", or \"generate me as a superhero\"."
4
4
  metadata:
5
- version: "3.1.0"
5
+ version: "3.1.1"
6
6
  homepage: https://sogni.ai
7
7
  clawdbot:
8
8
  emoji: "🎨"
@@ -51,6 +51,8 @@ sogni-agent --version
51
51
 
52
52
  Then configure the agent/runtime to use this `SKILL.md` and invoke the `sogni-agent` CLI.
53
53
 
54
+ Always invoke the globally installed `sogni-agent` command. Do not call `node {{skillDir}}/sogni-agent.mjs` or `node sogni-agent.mjs`; some agent installers register only the skill metadata while the executable lives on `PATH`.
55
+
54
56
  For upgrades, prefer package-manager updates or direct operations on an existing checkout. Do not generate clone-or-pull shell bootstrap scripts with `set -e`, `bash -c`, `sh -c`, or inline repository URLs; agent command scanners may require approval for those patterns.
55
57
 
56
58
  Agent-safe CLI upgrade:
@@ -126,19 +128,23 @@ Path override environment variables:
126
128
 
127
129
  ## Recommended path: route through the hosted Sogni Intelligence endpoints
128
130
 
129
- For any natural-language creative request — anything that should be planned, multi-step, or that benefits from tool selection, repair, or durable workflows — prefer the hosted endpoints over the direct-to-SDK flags. The hosted endpoints are the canonical home for tool dispatch, Structured Contracts v1 (gating policies, repair recipes, prompt contracts), durable workflows, replay, and asset-manifest mapping. They stay aligned with `sogni-chat` and the rest of the `@sogni/creative-agent` consumers automatically.
131
+ For any natural-language creative request — anything that should be planned, multi-step, resumable, or that benefits from tool selection, repair, or durable workflows — prefer the hosted Sogni Intelligence endpoints over the direct-to-SDK media flags. The hosted surfaces are the canonical home for OpenAI-compatible chat, server-side creative tool dispatch, Structured Contracts v1 (gating policies, repair recipes, prompt contracts), durable chat runs, durable workflows, workflow templates, replay, and asset-manifest mapping. They stay aligned with `sogni-chat`, `sogni-api`, and the rest of the `@sogni/creative-agent` consumers.
130
132
 
131
133
  ```bash
132
134
  # Natural-language creative request (LLM picks the tool, dispatches, repairs)
133
- node sogni-agent.mjs --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
135
+ sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
136
+
137
+ # Durable hosted chat run (persisted event log + SSE stream)
138
+ SOGNI_SKILL_USE_SDK_TRANSPORT=1 sogni-agent --durable-chat \
139
+ "Create a four-shot launch campaign, generate the key art, and animate the hero clip"
134
140
 
135
141
  # Multi-step durable workflow (resumable, replay-friendly, server-orchestrated)
136
- node sogni-agent.mjs --api-workflow \
142
+ sogni-agent --api-workflow \
137
143
  --video-prompt "The camera slowly pushes in" \
138
144
  "A graphite robot sketch on a drafting table"
139
145
 
140
146
  # Storyboard → keyframe → Seedance, all server-side
141
- node sogni-agent.mjs --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
147
+ sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
142
148
  "Create a 9:16 bakery launch video with a neon street-window reveal"
143
149
  ```
144
150
 
@@ -148,107 +154,196 @@ The direct-to-SDK flags below remain available for explicit one-shot generation
148
154
 
149
155
  ```bash
150
156
  # Generate and get URL
151
- node sogni-agent.mjs "a cat wearing a hat"
157
+ sogni-agent "a cat wearing a hat"
152
158
 
153
159
  # Quality presets (recommended for direct mode — auto-selects model, steps, and size)
154
- node sogni-agent.mjs -Q fast "a cat wearing a hat" # z_image_turbo, 8 steps, 512x512 (~5-10s)
155
- node sogni-agent.mjs -Q hq "a cat wearing a hat" # z_image_turbo, default steps, 768x768 (~10-15s)
156
- node sogni-agent.mjs -Q pro "a cat wearing a hat" # flux2_dev, 40 steps, 1024x1024 (~2min)
160
+ sogni-agent -Q fast "a cat wearing a hat" # z_image_turbo, 8 steps, 512x512 (~5-10s)
161
+ sogni-agent -Q hq "a cat wearing a hat" # z_image_turbo, default steps, 768x768 (~10-15s)
162
+ sogni-agent -Q pro "a cat wearing a hat" # flux2_dev, 40 steps, 1024x1024 (~2min)
157
163
 
158
164
  # Dynamic prompt variations — diverse images in one call
159
- node sogni-agent.mjs -n 3 "a {red|blue|green} sports car"
165
+ sogni-agent -n 3 "a {red|blue|green} sports car"
160
166
  # → generates "a red sports car", "a blue sports car", "a green sports car"
161
167
 
162
168
  # Token auto-fallback (tries SPARK, falls back to SOGNI)
163
- node sogni-agent.mjs --token-type auto "a cat wearing a hat"
169
+ sogni-agent --token-type auto "a cat wearing a hat"
164
170
 
165
171
  # Save to file
166
- node sogni-agent.mjs -o /tmp/cat.png "a cat wearing a hat"
172
+ sogni-agent -o /tmp/cat.png "a cat wearing a hat"
167
173
 
168
174
  # JSON output (for scripting)
169
- node sogni-agent.mjs --json "a cat wearing a hat"
175
+ sogni-agent --json "a cat wearing a hat"
170
176
 
171
177
  # Check token balances (no prompt required)
172
- node sogni-agent.mjs --balance
178
+ sogni-agent --balance
173
179
 
174
180
  # Check token balances in JSON
175
- node sogni-agent.mjs --json --balance
181
+ sogni-agent --json --balance
176
182
 
177
183
  # Quiet mode (suppress progress)
178
- node sogni-agent.mjs -q -o /tmp/cat.png "a cat wearing a hat"
184
+ sogni-agent -q -o /tmp/cat.png "a cat wearing a hat"
179
185
 
180
186
  # Direct music/audio generation
181
- node sogni-agent.mjs --music --duration 30 \
187
+ sogni-agent --music --duration 30 \
182
188
  "uplifting cinematic synthwave theme for a product launch"
183
189
 
184
190
  # Song with lyrics and musical controls
185
- node sogni-agent.mjs --music --lyrics "Rise with the morning light" --bpm 128 \
191
+ sogni-agent --music --lyrics "Rise with the morning light" --bpm 128 \
186
192
  --keyscale "C major" --output-format mp3 "bright indie pop chorus"
187
193
 
188
194
  # Hosted API chat: natural-language creative-agent tool execution
189
- node sogni-agent.mjs --api-chat "Create a 4-shot product video concept for a red sneaker"
195
+ sogni-agent --api-chat "Create a 4-shot product video concept for a red sneaker"
190
196
 
191
197
  # Hosted API chat with image vision and media-reference metadata
192
- node sogni-agent.mjs --api-chat --ref product.jpg \
198
+ sogni-agent --api-chat --ref product.jpg \
193
199
  "Turn this into a launch poster and describe the edit plan"
194
200
 
195
201
  # Sogni Intelligence model/replay utilities
196
- node sogni-agent.mjs --list-api-models
197
- node sogni-agent.mjs --api-chat --task-profile reasoning --no-thinking \
202
+ sogni-agent --list-api-models
203
+ sogni-agent --api-chat --task-profile reasoning --max-tokens 2000 \
198
204
  "Plan a concise multi-step product launch workflow"
199
- node sogni-agent.mjs --list-replays 20
200
- node sogni-agent.mjs --get-replay run_abc123 --json
205
+ sogni-agent --list-replays 20
206
+ sogni-agent --get-replay run_abc123 --json
207
+
208
+ # Draft a savable workflow template through the hosted creative-agent tool loop
209
+ sogni-agent --api-chat \
210
+ "Design a reusable workflow for a 9:16 product teaser from one product photo"
201
211
 
202
212
  # Durable API workflow: generated keyframe to video with resumable workflow record
203
- node sogni-agent.mjs --api-workflow \
213
+ sogni-agent --api-workflow \
204
214
  --video-prompt "The camera slowly pushes in as the sketch comes alive" \
205
215
  "A graphite robot sketch on a drafting table"
206
216
 
207
217
  # Durable API workflow with media reference and cost controls
208
- node sogni-agent.mjs --api-workflow \
218
+ sogni-agent --api-workflow \
209
219
  --ref https://cdn.example.com/sketch.png \
210
220
  --workflow-max-cost 25 --confirm-cost \
211
221
  --video-prompt "The camera slowly pushes in as the sketch comes alive" \
212
222
  "Animate the referenced sketch"
213
223
 
214
224
  # Exact durable workflow input with explicit steps
215
- node sogni-agent.mjs --api-workflow --workflow-input @workflow.json
225
+ sogni-agent --api-workflow --workflow-input @workflow-input.json \
226
+ --workflow-idempotency-key product-teaser-v1
216
227
 
217
228
  # Durable storyboard-video workflow: storyline -> GPT Image 2 storyboard -> Seedance
218
- node sogni-agent.mjs --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
229
+ sogni-agent --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
219
230
  "Create a 9:16 bakery launch video with a neon street-window reveal"
231
+
232
+ # Workflow management
233
+ sogni-agent --list-workflows
234
+ sogni-agent --resume-workflow wf_durable_workflow_123
220
235
  ```
221
236
 
222
237
  Use `--api-chat` for text-first natural-language workflows that should go through
223
- Sogni API's OpenAI-compatible `/v1/chat/completions` tool loop. This path
224
- sanitizes prompt-injection markers before forwarding messages and uses the
225
- current hosted creative-agent tool surface. Use `--api-workflow` when the caller
226
- already knows it wants an async durable workflow record under
227
- `/v1/creative-agent/workflows`. Use `--workflow-input @workflow.json` when the
228
- caller already has exact durable workflow input with `steps`; the skill forwards
229
- that body to the API as-is. This is the preferred hosted path for
230
- exact multi-step plans, including repeated `replace_video_segment` operations
231
- with `replacementStartSeconds` / `replacementEndSeconds` when interleaving
232
- existing video slices. Use `--api-workflow storyboard-video`
233
- when the caller wants the hosted sequence to generate a storyline, create one GPT
234
- Image 2 storyboard sheet, and feed that image artifact into Seedance as the video
235
- reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low|medium|high
236
- quality for the storyboard sheet. Hosted API requests forward media references
237
- from `-c`, `--ref`, `--ref-end`, `--ref-audio`,
238
- `--reference-audio-identity`, and `--ref-video` as `media_references`
239
- metadata; workflow JSON can bind them into step arguments with
240
- `sourceStepId: "$input_media"`, and API chat also attaches image refs as vision
241
- inputs. Local file references are uploaded to Sogni media storage first, then
242
- forwarded as retrievable URLs for hosted chat and durable workflows. Use the
243
- direct CLI path for private media that must not leave the local machine.
244
- Use `--workflow-max-cost <n>` plus `--confirm-cost` / `--no-confirm-cost` to
245
- forward explicit workflow cost policy.
246
- Sogni Intelligence utilities are exposed through the same API key path:
247
- `--list-api-models` / `--get-api-model <id>` read `/v1/models`,
248
- `--task-profile`, `--max-tokens`, and `--thinking` / `--no-thinking` tune
249
- `/v1/chat/completions`, and `--list-replays`, `--get-replay`, and
250
- `--ingest-replay` manage `/v1/replay/records` RunRecords for replay/debug
251
- viewers.
238
+ Sogni API's OpenAI-compatible `POST /v1/chat/completions` loop. The public
239
+ REST body uses snake_case controls such as `tool_choice`, `response_format`,
240
+ `task_profile`, `token_type`, `app_source`, `media_references`,
241
+ `chat_template_kwargs`, `sogni_tools`, and `sogni_tool_execution`. The endpoint
242
+ normalizes OpenAI `developer` messages to `system`; when a developer message is
243
+ present and no explicit `task_profile` is supplied, the server treats the task
244
+ as `coding`. The CLI sanitizes prompt-injection markers before forwarding
245
+ messages and sends API-key auth so hosted Sogni tools can execute server-side.
246
+
247
+ Hosted tool surfaces are split by `sogni_tools`:
248
+
249
+ - `creative-tools` is the public API default when `sogni_tools` is omitted or
250
+ true. It exposes generation/editing tools (`generate_image`,
251
+ `generate_video`, `generate_music`, `edit_image`, `apply_style`,
252
+ `restore_photo`, `refine_result`, `animate_photo`, `change_angle`,
253
+ `video_to_video`, `stitch_video`, `orbit_video`, `dance_montage`,
254
+ `sound_to_video`, `extend_video`, `replace_video_segment`, `overlay_video`,
255
+ `add_subtitles`), media-analysis tools (`analyze_image`, `analyze_video`,
256
+ `extract_metadata`), and lightweight composition tools (`enhance_prompt`,
257
+ `compose_lyrics`, `compose_instrumental`, `compose_script`).
258
+ - `creative-agent` is this CLI's default for `--api-chat`. It includes the
259
+ `creative-tools` surface plus session-control tools
260
+ (`ask_clarifying_question`, `finalize_response`), asset-manifest tools
261
+ (`create_asset_manifest`, `inspect_asset`, `label_asset`,
262
+ `map_assets_for_model`, `validate_asset_references`), and durable planning
263
+ tools (`compose_workflow`, `compose_workflow_template`). Use this surface
264
+ when the model should design one-shot workflow plans, draft savable workflow
265
+ templates, or maintain stable asset references across a multi-step turn.
266
+ - `none` disables Sogni tool injection and leaves only caller-supplied OpenAI
267
+ tools on raw API/SDK requests. In the CLI, use it with
268
+ `--no-api-tool-execution` when you want text-only planning without hosted
269
+ Sogni tool dispatch.
270
+
271
+ Use `--durable-chat` for long-running, LLM-in-the-loop turns that should be
272
+ persisted as `POST /v1/chat/runs` records instead of a single
273
+ `/v1/chat/completions` request. Chat runs keep an event log, stream via
274
+ `/v1/chat/runs/:id/events/stream`, support cancellation, and can pause for
275
+ persisted cost approval (`/v1/chat/runs/:id/confirm-cost`) in first-party
276
+ clients. The CLI can start and stream durable chat runs through the SDK
277
+ transport when `SOGNI_SKILL_USE_SDK_TRANSPORT=1` is set.
278
+
279
+ Use `--api-workflow` when the caller already knows it wants an async durable
280
+ workflow under `POST /v1/creative-agent/workflows`. The API now accepts either
281
+ an inline durable plan (`input.steps`) or a saved workflow template invocation
282
+ (`workflow_id` plus `inputs`) and rejects requests that provide both. The CLI's
283
+ generated-keyframe and `storyboard-video` presets submit inline `input.steps`;
284
+ `--workflow-input @workflow-input.json` supplies that `input` object directly.
285
+ Saved template CRUD lives at `/v1/creative-agent/workflows/templates`, and a
286
+ saved template can later be run by API/SDK callers with `workflow_id + inputs`.
287
+ Use `compose_workflow_template` through `--api-chat` to draft a savable template;
288
+ the caller is still responsible for persisting the returned `template_draft`.
289
+
290
+ Exact multi-step workflow plans should use explicit step dependencies, including
291
+ `replace_video_segment` steps with bounded `replacementStartSeconds` /
292
+ `replacementEndSeconds` when interleaving existing video slices. Workflow JSON
293
+ can bind request media into step arguments with `sourceStepId: "$input_media"`.
294
+ Use `--api-workflow storyboard-video` when the hosted sequence should generate a
295
+ storyline, create one GPT Image 2 storyboard sheet, and feed that image artifact
296
+ into Seedance as the video reference. The `-Q fast|hq|pro` preset maps to GPT
297
+ Image 2 low|medium|high quality for the storyboard sheet.
298
+
299
+ Hosted API requests forward media references from `-c`, `--ref`, `--ref-end`,
300
+ `--ref-audio`, `--reference-audio-identity`, and `--ref-video` as
301
+ `media_references` metadata. `--ref-audio` and `--ref-video` are repeatable in
302
+ api-chat / durable-chat mode — each entry uploads independently and is exposed
303
+ to the hosted LLM at `@Audio1` / `@Audio2` / `@Video1` etc. API chat also
304
+ attaches image refs as vision inputs. Local file references are uploaded to
305
+ Sogni media storage first, then forwarded as retrievable URLs for hosted chat
306
+ and durable workflows. Use the direct CLI path for private media that must not
307
+ leave the local machine.
308
+
309
+ ### Seedance reference modes (mutually exclusive)
310
+
311
+ When `--video -m seedance2` or `-m seedance2-fast` is selected, the skill
312
+ exposes the same two-mode pattern that the hosted chat surfaces. Pick one
313
+ mode per video request:
314
+
315
+ - **Dedicated frame mode — `--ref` and/or `--ref-end`.** First-class
316
+ first-frame / last-frame anchoring; the Seedance worker pins them as
317
+ parameter-mode firstFrame / lastFrame. Max 2 images.
318
+ - **Loose reference mode — `-c/--context` plus optional `--ref-audio`
319
+ extras and `--ref-video` extras.** Anchor frame intent in the prompt with
320
+ `@Image1` / `@Image2` / `@Video1` / `@Audio1` etc. (e.g. *"Use @Image1 as
321
+ the opening shot reference"*). Supports up to 9 image refs, 3 video refs,
322
+ 3 audio refs, and 12 total reference assets per video request. The
323
+ numeric caps come from the canonical
324
+ `@sogni-ai/sogni-protocol/catalogs/seedance-reference-limits.json` catalog,
325
+ surfaced through `@sogni-ai/sogni-intelligence-client/tools` as
326
+ `SEEDANCE_REFERENCE_LIMITS` and `validateSeedanceReferenceCounts()`.
327
+
328
+ Combining `--ref` / `--ref-end` with `-c/--context` on Seedance is rejected
329
+ client-side with a clear error pointing to the correct mode. In CLI direct-gen
330
+ mode, additional `--ref-audio` / `--ref-video` entries beyond the first must
331
+ be HTTPS URLs (the primary entry can still be a local file path); for local
332
+ multi-file Seedance uploads, use `--api-chat` / `--durable-chat` instead. Use
333
+ `--workflow-max-cost <n>` plus `--confirm-cost` / `--no-confirm-cost` to forward
334
+ explicit workflow cost policy, and `--workflow-idempotency-key` when retrying a
335
+ workflow start request.
336
+
337
+ Sogni Intelligence utilities are exposed through the same API-key path:
338
+ `--list-api-models` / `--get-api-model <id>` read `/v1/models`, `--task-profile`
339
+ and `--max-tokens` tune `/v1/chat/completions`, and `--list-replays`,
340
+ `--get-replay`, and `--ingest-replay` manage `/v1/replay/records` RunRecords for
341
+ replay/debug viewers. The public chat endpoint also accepts OpenAI-standard
342
+ `reasoning_effort` / `reasoning.effort` in raw API requests. The CLI's
343
+ `--thinking` / `--no-thinking` flags are forwarded as
344
+ `chat_template_kwargs.enable_thinking`; current hosted Qwen requests may
345
+ normalize thinking on server-side, so do not rely on `--no-thinking` as a hard
346
+ suppression switch for `/v1/chat/completions`.
252
347
  Hosted API modes require `SOGNI_API_KEY`; this skill's CLI uses API-key
253
348
  authentication.
254
349
 
@@ -261,15 +356,15 @@ SSRF-validated fetch path. The skill's `sogni-hosted-client.mjs`
261
356
  factory still validates `restEndpoint` / `socketEndpoint` against the
262
357
  SSRF guard before constructing the SDK client, so the safety contract
263
358
  holds.
359
+ For `--durable-chat`, stream output as the run advances; the CLI reports
360
+ assistant deltas plus de-duplicated per-job progress / ETA / result lines from
361
+ hosted run events.
264
362
 
265
363
  When changing hosted API chat/workflow behavior, keep reusable validation,
266
- workflow compilation, repair-control, and guard telemetry logic in
267
- `../sogni-creative-agent` first. The public skill should consume generated or
268
- copied shared contracts instead of adding skill-local regex guards. Media-routing
269
- decisions should come from typed planner/runtime contracts such as
270
- `CreativeTurnPlannerFields`, `classifyMediaTurnIntent()`, `videoContinuation`,
271
- `videoModification`, `outputGrouping`, `imageSelectionPolicy`, and
272
- `pendingStitchAfterBatch`; regex is appropriate only for bounded CLI/fact
364
+ workflow compilation, repair-control, and guard telemetry logic in the shared
365
+ Sogni runtime first, then sync it into this public skill. The public skill
366
+ should consume generated or shared typed contracts instead of adding
367
+ skill-local regex guards. Keep local regex limited to bounded CLI/fact
273
368
  extraction such as paths, URLs, extensions, dimensions, durations, and explicit
274
369
  positions.
275
370
 
@@ -356,13 +451,15 @@ positions.
356
451
  | `--concat-audio <path>` | Optional audio track to mux over `--concat-videos` output | - |
357
452
  | `--concat-audio-start <sec>` | Start offset into `--concat-audio` | - |
358
453
  | `--list-media [type]` | List recent inbound media (images\|audio\|all) | images |
359
- | `--api-chat` | Call `/v1/chat/completions` with Sogni creative-agent tool injection | - |
360
- | `--api-tools <mode>` | API tool mode: creative-agent\|creative-tools\|none | creative-agent |
454
+ | `--api-chat` | Call OpenAI-compatible `/v1/chat/completions`; CLI default sends the hosted `creative-agent` tool surface | - |
455
+ | `--durable-chat` | Start and stream a durable `/v1/chat/runs` record through SDK transport; requires `SOGNI_SKILL_USE_SDK_TRANSPORT=1` | - |
456
+ | `--api-tools <mode>` | API tool mode: creative-agent\|creative-tools\|none. CLI default is creative-agent; raw API default is creative-tools. | creative-agent |
361
457
  | `--no-api-tool-execution` | Plan/tool-call via API chat without executing Sogni tools | - |
362
458
  | `--llm-model <id>` | LLM model for `--api-chat` | qwen3.6-35b-a3b-gguf-iq4xs |
363
459
  | `--task-profile <profile>` | Sogni Intelligence task profile: general\|coding\|reasoning | - |
364
460
  | `--max-tokens <n>` | Max hosted chat completion tokens | 1600 |
365
- | `--thinking`, `--no-thinking` | Toggle `chat_template_kwargs.enable_thinking` for hosted chat | server default |
461
+ | `--thinking`, `--no-thinking` | Forward `chat_template_kwargs.enable_thinking` for hosted chat; current public Qwen requests may normalize thinking on server-side | server default |
462
+ | `--system <text>` | Override the base system prompt for hosted chat | built-in creative assistant prompt |
366
463
  | `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM model metadata | - |
367
464
  | `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|@path>` | Manage Sogni Intelligence replay RunRecords. List/get output is run through `redactRunRecord` from `@sogni/creative-agent/replay` before printing, so signed URLs, bearer tokens, JWTs, and PEM blocks cannot leak via the CLI. Use `@path` to load JSON from a file. | - |
368
465
  | `--skip-redact`, `--no-redact` | Bypass the replay redactor on `--list-replays` / `--get-replay`. Debug-only — emits unredacted RunRecord payloads. | redacted |
@@ -374,9 +471,10 @@ positions.
374
471
  | `--storyboard-plan-frames <n>` | Frame count for `--storyboard-plan`. | inferred |
375
472
  | `--storyboard-plan-model <id>` | Adapter target for `--storyboard-plan` (seedance, seedance2, gpt-image-2, ltx23, wan). | inferred |
376
473
  | `--storyboard-plan-stage <stage>` | Compilation stage for `--storyboard-plan` (storyboard_image, scene_clip). | storyboard_image |
377
- | `--api-workflow` | Start a durable workflow with explicit `input.steps`; optional `storyboard-video` preset | - |
378
- | `--workflow-input <json\|@path>` | Durable workflow input JSON. Use `@path` to load from a file. | - |
474
+ | `--api-workflow` | Start `/v1/creative-agent/workflows` with generated inline `input.steps`; optional `storyboard-video` preset | - |
475
+ | `--workflow-input <json\|@path>` | Durable workflow `input` JSON for the start request. Use `@path` to load from a file. | - |
379
476
  | `--workflow-title <text>` | Title for generated or storyboard durable workflow input | - |
477
+ | `--workflow-idempotency-key <key>`, `--idempotency-key <key>` | Reuse safely when retrying a durable workflow start request | - |
380
478
  | `--workflow-max-cost <n>` | Reject hosted workflow starts above this estimated capacity-unit ceiling | - |
381
479
  | `--confirm-cost`, `--no-confirm-cost` | Forward explicit hosted workflow cost confirmation | - |
382
480
  | `--storyboard-frames <n>` | Beat count for storyboard-video workflow | - |
@@ -385,7 +483,7 @@ positions.
385
483
  | `--generate-audio`, `--no-generate-audio` | Toggle audio generation for generated video steps | - |
386
484
  | `--expand-prompt`, `--no-expand-prompt` | Toggle prompt expansion for generated video steps | - |
387
485
  | `--watch-workflow` | Stream durable workflow events after start | - |
388
- | `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>` | Durable workflow management helpers | - |
486
+ | `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>`, `--resume-workflow <id>` | Durable workflow management helpers | - |
389
487
  | `--api-base-url <url>` | Sogni API base for hosted API modes. Credentials are only sent to `https://api.sogni.ai` by default; use `SOGNI_API_ALLOWED_HOSTS` for trusted custom hosts or `SOGNI_ALLOW_UNSAFE_API_BASE_URL=1` for isolated local testing. | https://api.sogni.ai |
390
488
  | `--no-filter` | Disable NSFW content filter | - |
391
489
  | `--memory-set <key> <value>` | Save a user preference | - |
@@ -537,16 +635,16 @@ Edit images using reference images. Qwen models support up to 3 context images;
537
635
 
538
636
  ```bash
539
637
  # Single context image
540
- node sogni-agent.mjs -c photo.jpg "make the background a beach"
638
+ sogni-agent -c photo.jpg "make the background a beach"
541
639
 
542
640
  # Multiple context images (subject + style)
543
- node sogni-agent.mjs -c subject.jpg -c style.jpg "apply the style to the subject"
641
+ sogni-agent -c subject.jpg -c style.jpg "apply the style to the subject"
544
642
 
545
643
  # GPT Image 2 multi-reference edit
546
- node sogni-agent.mjs -m gpt-image-2 -c subject.jpg -c outfit.jpg -c room.jpg "place the subject in the room wearing the outfit"
644
+ sogni-agent -m gpt-image-2 -c subject.jpg -c outfit.jpg -c room.jpg "place the subject in the room wearing the outfit"
547
645
 
548
646
  # Use last generated image as context
549
- node sogni-agent.mjs --last-image "make it more vibrant"
647
+ sogni-agent --last-image "make it more vibrant"
550
648
  ```
551
649
 
552
650
  When context images are provided without `-m`, defaults to `qwen_image_edit_2511_fp8_lightning`. Select `-m gpt-image-2` for GPT Image 2's higher reference-image limit and OpenAI-backed image editing.
@@ -557,13 +655,13 @@ Generate stylized portraits from a face photo using InstantID ControlNet. When a
557
655
 
558
656
  ```bash
559
657
  # Basic photobooth
560
- node sogni-agent.mjs --photobooth --ref face.jpg "80s fashion portrait"
658
+ sogni-agent --photobooth --ref face.jpg "80s fashion portrait"
561
659
 
562
660
  # Multiple outputs
563
- node sogni-agent.mjs --photobooth --ref face.jpg -n 4 "LinkedIn professional headshot"
661
+ sogni-agent --photobooth --ref face.jpg -n 4 "LinkedIn professional headshot"
564
662
 
565
663
  # Custom ControlNet tuning
566
- node sogni-agent.mjs --photobooth --ref face.jpg --cn-strength 0.6 --cn-guidance-end 0.5 "oil painting"
664
+ sogni-agent --photobooth --ref face.jpg --cn-strength 0.6 --cn-guidance-end 0.5 "oil painting"
567
665
  ```
568
666
 
569
667
  Uses SDXL Turbo (`coreml-sogniXLturbo_alpha1_ad`) at 1024x1024 by default. The face image is passed via `--ref` and styled according to the prompt. Cannot be combined with `--video` or `-c/--context`.
@@ -571,10 +669,10 @@ Uses SDXL Turbo (`coreml-sogniXLturbo_alpha1_ad`) at 1024x1024 by default. The f
571
669
  **Agent usage:**
572
670
  ```bash
573
671
  # Photobooth: stylize a face photo
574
- node {{skillDir}}/sogni-agent.mjs -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
672
+ sogni-agent -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
575
673
 
576
674
  # Multiple photobooth outputs
577
- node {{skillDir}}/sogni-agent.mjs -q --photobooth --ref /path/to/face.jpg -n 4 -o /tmp/stylized.png "LinkedIn professional headshot"
675
+ sogni-agent -q --photobooth --ref /path/to/face.jpg -n 4 -o /tmp/stylized.png "LinkedIn professional headshot"
578
676
  ```
579
677
 
580
678
  ## Multiple Angles (Turnaround)
@@ -583,17 +681,17 @@ Generate specific camera angles from a single reference image using the Multiple
583
681
 
584
682
  ```bash
585
683
  # Single angle
586
- node sogni-agent.mjs --multi-angle -c subject.jpg \
684
+ sogni-agent --multi-angle -c subject.jpg \
587
685
  --azimuth front-right --elevation eye-level --distance medium \
588
686
  --angle-strength 0.9 \
589
687
  "studio portrait, same person"
590
688
 
591
689
  # 360 sweep (8 azimuths)
592
- node sogni-agent.mjs --angles-360 -c subject.jpg --distance medium --elevation eye-level \
690
+ sogni-agent --angles-360 -c subject.jpg --distance medium --elevation eye-level \
593
691
  "studio portrait, same person"
594
692
 
595
693
  # 360 sweep video (looping mp4, uses i2v between angles; requires ffmpeg)
596
- node sogni-agent.mjs --angles-360 --angles-360-video /tmp/turntable.mp4 \
694
+ sogni-agent --angles-360 --angles-360-video /tmp/turntable.mp4 \
597
695
  -c subject.jpg --distance medium --elevation eye-level \
598
696
  "studio portrait, same person"
599
697
  ```
@@ -623,7 +721,7 @@ When a user requests a "360 video", follow this workflow:
623
721
 
624
722
  4. **Example command**:
625
723
  ```bash
626
- node sogni-agent.mjs --angles-360 --angles-360-video /tmp/output.mp4 \
724
+ sogni-agent --angles-360 --angles-360-video /tmp/output.mp4 \
627
725
  -c /path/to/image.png --elevation eye-level --distance medium \
628
726
  "description of subject"
629
727
  ```
@@ -646,65 +744,65 @@ Generate videos from a reference image:
646
744
 
647
745
  ```bash
648
746
  # Text-to-video (t2v)
649
- node sogni-agent.mjs --video "A narrator says \"welcome to the story\" as ocean waves crash"
747
+ sogni-agent --video "A narrator says \"welcome to the story\" as ocean waves crash"
650
748
 
651
749
  # Basic video from image
652
- node sogni-agent.mjs --video --ref cat.jpg -o cat.mp4 "cat walks around"
750
+ sogni-agent --video --ref cat.jpg -o cat.mp4 "cat walks around"
653
751
 
654
752
  # Use last generated image as reference
655
- node sogni-agent.mjs --last-image --video "gentle camera pan"
753
+ sogni-agent --last-image --video "gentle camera pan"
656
754
 
657
755
  # Custom duration and FPS
658
- node sogni-agent.mjs --video --ref scene.png --duration 10 --fps 24 "zoom out slowly"
756
+ sogni-agent --video --ref scene.png --duration 10 --fps 24 "zoom out slowly"
659
757
 
660
758
  # Bare "720p" / "HD" without exact pixels: preserve aspect via short-side target
661
- node sogni-agent.mjs --video --target-resolution 768 \
759
+ sogni-agent --video --target-resolution 768 \
662
760
  "A calm cinematic shot of lanterns drifting across a night lake"
663
761
 
664
762
  # Natural-language aspect and resolution inference
665
- node sogni-agent.mjs --video \
763
+ sogni-agent --video \
666
764
  "Make a 720p 9:16 video of ocean waves at sunset"
667
765
 
668
766
  # Seedance 2.0 text-to-video
669
- node sogni-agent.mjs --video -m seedance2 --duration 8 \
767
+ sogni-agent --video -m seedance2 --duration 8 \
670
768
  "A polished product reveal with native ambient sound"
671
769
 
672
770
  # Seedance multimodal context with public HTTPS references
673
- node sogni-agent.mjs --video -m seedance2 --workflow t2v \
771
+ sogni-agent --video -m seedance2 --workflow t2v \
674
772
  --ref https://cdn.example.com/product.png \
675
773
  --ref-video https://cdn.example.com/motion.mp4 \
676
774
  --ref-audio https://cdn.example.com/music.m4a \
677
775
  "Use @Image1 for product identity, @Video1 for camera movement, and @Audio1 for music rhythm"
678
776
 
679
777
  # Sound-to-video (s2v)
680
- node sogni-agent.mjs --video --ref face.jpg --ref-audio speech.m4a \
778
+ sogni-agent --video --ref face.jpg --ref-audio speech.m4a \
681
779
  -m wan_v2.2-14b-fp8_s2v_lightx2v "lip sync talking head"
682
780
 
683
781
  # Image+audio-to-video (auto-routes to LTX 2.3 ia2v)
684
- node sogni-agent.mjs --video --ref cover.jpg --ref-audio song.mp3 \
782
+ sogni-agent --video --ref cover.jpg --ref-audio song.mp3 \
685
783
  "music video with synchronized motion"
686
784
 
687
785
  # Audio-to-video (auto-routes to LTX 2.3 a2v)
688
- node sogni-agent.mjs --video --ref-audio song.mp3 \
786
+ sogni-agent --video --ref-audio song.mp3 \
689
787
  "abstract audio-reactive visualizer"
690
788
 
691
789
  # Persona/voice identity with LTX native audio
692
- node sogni-agent.mjs --video --reference-audio-identity voice.webm \
790
+ sogni-agent --video --reference-audio-identity voice.webm \
693
791
  "NARRATOR: \"This is my voice.\""
694
792
 
695
793
  # Prefer .webm, .m4a, or .mp3 voice clips. Local .wav clips are normalized
696
794
  # to .m4a before upload when ffmpeg is available.
697
795
 
698
796
  # LTX-2.3 text-to-video
699
- node sogni-agent.mjs --video -m ltx23-22b-fp8_t2v_distilled --duration 20 \
797
+ sogni-agent --video -m ltx23-22b-fp8_t2v_distilled --duration 20 \
700
798
  "A wide cinematic aerial shot opens over steep tropical cliffs at golden hour, warm sunlight grazing the rock faces while sea mist drifts above the water below. Palm trees bend gently along the ridge as waves roll against the shoreline, leaving bright bands of foam across the dark stone. The camera glides forward in one continuous pass, revealing more of the coastline as sunlight flickers across wet surfaces and distant birds wheel through the haze. The scene holds a calm, upscale travel-film mood with smooth stabilized motion and crisp environmental detail."
701
799
 
702
800
  # Animate (motion transfer)
703
- node sogni-agent.mjs --video --ref subject.jpg --ref-video motion.mp4 \
801
+ sogni-agent --video --ref subject.jpg --ref-video motion.mp4 \
704
802
  --workflow animate-move "transfer motion"
705
803
 
706
804
  # Segment a longer reference video for local stitched workflows
707
- node sogni-agent.mjs --video --workflow v2v --ref-video dance.mp4 \
805
+ sogni-agent --video --workflow v2v --ref-video dance.mp4 \
708
806
  --video-start 10 --duration 8 --controlnet-name pose \
709
807
  "robot dancing"
710
808
  ```
@@ -715,15 +813,15 @@ Transform an existing video using LTX-2 models with ControlNet guidance:
715
813
 
716
814
  ```bash
717
815
  # Basic v2v with canny edge detection
718
- node sogni-agent.mjs --video --workflow v2v --ref-video input.mp4 \
816
+ sogni-agent --video --workflow v2v --ref-video input.mp4 \
719
817
  --controlnet-name canny "stylized anime version"
720
818
 
721
819
  # V2V with pose detection and custom strength
722
- node sogni-agent.mjs --video --workflow v2v --ref-video dance.mp4 \
820
+ sogni-agent --video --workflow v2v --ref-video dance.mp4 \
723
821
  --controlnet-name pose --controlnet-strength 0.7 "robot dancing"
724
822
 
725
823
  # V2V with depth map
726
- node sogni-agent.mjs --video --workflow v2v --ref-video scene.mp4 \
824
+ sogni-agent --video --workflow v2v --ref-video scene.mp4 \
727
825
  --controlnet-name depth "watercolor painting style"
728
826
  ```
729
827
 
@@ -732,7 +830,7 @@ Default V2V strengths are tuned from Sogni Chat: `canny`/`pose`/`depth` use `0.8
732
830
 
733
831
  ```bash
734
832
  # Seedance V2V without ControlNet
735
- node sogni-agent.mjs --video --workflow v2v -m seedance2-v2v \
833
+ sogni-agent --video --workflow v2v -m seedance2-v2v \
736
834
  --ref-video input.mp4 "make the clip more cinematic"
737
835
  ```
738
836
 
@@ -758,7 +856,7 @@ sogni-agent -c old_photo.jpg -o restored.png -w 1024 -h 1280 \
758
856
 
759
857
  **Finding received images (Telegram/etc):**
760
858
  ```bash
761
- node {{skillDir}}/sogni-agent.mjs --json --list-media images
859
+ sogni-agent --json --list-media images
762
860
  ```
763
861
 
764
862
  **Do NOT use `ls`, `cp`, or other shell commands to browse user files.** Always use `--list-media` to find inbound media.
@@ -826,41 +924,41 @@ When user asks to generate/draw/create an image:
826
924
 
827
925
  ```bash
828
926
  # Generate and save locally (use -Q for quality presets instead of memorizing model IDs)
829
- node {{skillDir}}/sogni-agent.mjs -q -Q fast -o /tmp/generated.png "user's prompt"
830
- node {{skillDir}}/sogni-agent.mjs -q -Q pro -o /tmp/generated.png "user's prompt"
927
+ sogni-agent -q -Q fast -o /tmp/generated.png "user's prompt"
928
+ sogni-agent -q -Q pro -o /tmp/generated.png "user's prompt"
831
929
 
832
930
  # Generate with prompt variations (diverse images in one call)
833
- node {{skillDir}}/sogni-agent.mjs -q -n 3 -o /tmp/cars.png "a {red|blue|green} sports car"
931
+ sogni-agent -q -n 3 -o /tmp/cars.png "a {red|blue|green} sports car"
834
932
 
835
933
  # Edit an existing image
836
- node {{skillDir}}/sogni-agent.mjs -q -c /path/to/input.jpg -o /tmp/edited.png "make it pop art style"
934
+ sogni-agent -q -c /path/to/input.jpg -o /tmp/edited.png "make it pop art style"
837
935
 
838
936
  # Generate video from image
839
- node {{skillDir}}/sogni-agent.mjs -q --video --ref /path/to/image.png -o /tmp/video.mp4 "A medium shot holds on the subject in soft late-afternoon light as fabric edges and background details remain clear and stable. The camera performs a slow push-in while the subject shifts weight subtly and turns slightly toward the lens, keeping the motion gentle and continuous. Leaves rustle softly in the background and the scene maintains smooth cinematic movement with no abrupt action changes."
937
+ sogni-agent -q --video --ref /path/to/image.png -o /tmp/video.mp4 "A medium shot holds on the subject in soft late-afternoon light as fabric edges and background details remain clear and stable. The camera performs a slow push-in while the subject shifts weight subtly and turns slightly toward the lens, keeping the motion gentle and continuous. Leaves rustle softly in the background and the scene maintains smooth cinematic movement with no abrupt action changes."
840
938
 
841
939
  # Generate text-to-video
842
- node {{skillDir}}/sogni-agent.mjs -q --video -o /tmp/video.mp4 "A wide cinematic shot opens on ocean waves rolling toward a rocky shoreline at sunset, golden light spreading across the water while sea mist drifts through the air. Foam patterns form and recede over the dark sand as the horizon glows orange and pink in the distance. The camera glides forward in one continuous movement, holding smooth stabilized motion and calm environmental detail throughout the scene."
940
+ sogni-agent -q --video -o /tmp/video.mp4 "A wide cinematic shot opens on ocean waves rolling toward a rocky shoreline at sunset, golden light spreading across the water while sea mist drifts through the air. Foam patterns form and recede over the dark sand as the horizon glows orange and pink in the distance. The camera glides forward in one continuous movement, holding smooth stabilized motion and calm environmental detail throughout the scene."
843
941
 
844
942
  # Generate direct music/audio
845
- node {{skillDir}}/sogni-agent.mjs -q --music --duration 30 -o /tmp/music.mp3 "uplifting cinematic synthwave theme for a product launch"
943
+ sogni-agent -q --music --duration 30 -o /tmp/music.mp3 "uplifting cinematic synthwave theme for a product launch"
846
944
 
847
945
  # HD / "4K" text-to-video: prefer LTX-2.3
848
- node {{skillDir}}/sogni-agent.mjs -q --video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "A wide cinematic aerial shot opens over a rugged ocean coastline at golden hour, warm sunlight catching the cliff faces while white surf breaks against dark rock below. Low sea mist hangs over the water and bands of foam trace the shoreline as gulls wheel through the distance. The camera glides forward in one continuous pass, revealing the curve of the coast while wet stone flashes with reflected light and the scene keeps smooth stabilized motion from start to finish. The overall mood feels expansive and polished, with crisp environmental detail and steady travel-film energy."
946
+ sogni-agent -q --video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "A wide cinematic aerial shot opens over a rugged ocean coastline at golden hour, warm sunlight catching the cliff faces while white surf breaks against dark rock below. Low sea mist hangs over the water and bands of foam trace the shoreline as gulls wheel through the distance. The camera glides forward in one continuous pass, revealing the curve of the coast while wet stone flashes with reflected light and the scene keeps smooth stabilized motion from start to finish. The overall mood feels expansive and polished, with crisp environmental detail and steady travel-film energy."
849
947
 
850
948
  # HD / "4K" image-to-video: prefer LTX i2v
851
- node {{skillDir}}/sogni-agent.mjs -q --video --ref /path/to/image.png -m ltx23-22b-fp8_i2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "A medium cinematic shot holds on the scene with clean subject separation and stable environmental detail as directional light shapes the surfaces and background depth. The camera performs a slow push-in while the main subject makes one subtle continuous movement, keeping posture and identity consistent from start to finish. Ambient motion in the background stays gentle and the overall clip remains smooth, stabilized, and visually coherent."
949
+ sogni-agent -q --video --ref /path/to/image.png -m ltx23-22b-fp8_i2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "A medium cinematic shot holds on the scene with clean subject separation and stable environmental detail as directional light shapes the surfaces and background depth. The camera performs a slow push-in while the main subject makes one subtle continuous movement, keeping posture and identity consistent from start to finish. Ambient motion in the background stays gentle and the overall clip remains smooth, stabilized, and visually coherent."
852
950
 
853
951
  # Photobooth: stylize a face photo
854
- node {{skillDir}}/sogni-agent.mjs -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
952
+ sogni-agent -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
855
953
 
856
954
  # Token auto-fallback (tries SPARK first, retries with SOGNI on insufficient balance)
857
- node {{skillDir}}/sogni-agent.mjs -q --token-type auto -o /tmp/generated.png "user's prompt"
955
+ sogni-agent -q --token-type auto -o /tmp/generated.png "user's prompt"
858
956
 
859
957
  # Check current SPARK/SOGNI balances (no prompt required)
860
- node {{skillDir}}/sogni-agent.mjs --json --balance
958
+ sogni-agent --json --balance
861
959
 
862
960
  # Find user-sent images/audio
863
- node {{skillDir}}/sogni-agent.mjs --json --list-media images
961
+ sogni-agent --json --list-media images
864
962
 
865
963
  # Then send via message tool with filePath
866
964
  ```
@@ -883,10 +981,10 @@ When the user wants multiple variations (different colors, styles, subjects), us
883
981
 
884
982
  ```bash
885
983
  # 3 color variations
886
- node {{skillDir}}/sogni-agent.mjs -q -n 3 "a {red|blue|green} sports car"
984
+ sogni-agent -q -n 3 "a {red|blue|green} sports car"
887
985
 
888
986
  # 4 style variations
889
- node {{skillDir}}/sogni-agent.mjs -q -n 4 "a portrait in {oil painting|watercolor|pencil sketch|pop art} style"
987
+ sogni-agent -q -n 4 "a portrait in {oil painting|watercolor|pencil sketch|pop art} style"
890
988
  ```
891
989
 
892
990
  Options cycle sequentially per image. Without `{...}` syntax, `-n` generates multiple images with the same prompt.
@@ -916,7 +1014,7 @@ When a user asks to **animate between two images**, use `--ref` (first frame) an
916
1014
 
917
1015
  ```bash
918
1016
  # Animate from image A to image B
919
- node {{skillDir}}/sogni-agent.mjs -q --video --ref /tmp/imageA.png --ref-end /tmp/imageB.png -o /tmp/transition.mp4 "descriptive prompt of the transition"
1017
+ sogni-agent -q --video --ref /tmp/imageA.png --ref-end /tmp/imageB.png -o /tmp/transition.mp4 "descriptive prompt of the transition"
920
1018
  ```
921
1019
 
922
1020
  ### Animate a Video to an Image (Scene Continuation)
@@ -925,15 +1023,15 @@ When a user asks to **animate from a video to an image** (or "continue" a video
925
1023
 
926
1024
  1. **Extract the last frame** of the existing video using the built-in safe wrapper:
927
1025
  ```bash
928
- node {{skillDir}}/sogni-agent.mjs --extract-last-frame /tmp/existing.mp4 /tmp/lastframe.png
1026
+ sogni-agent --extract-last-frame /tmp/existing.mp4 /tmp/lastframe.png
929
1027
  ```
930
1028
  2. **Generate a new video** using the last frame as `--ref` and the target image as `--ref-end`:
931
1029
  ```bash
932
- node {{skillDir}}/sogni-agent.mjs -q --video --ref /tmp/lastframe.png --ref-end /tmp/target.png -o /tmp/continuation.mp4 "scene transition prompt"
1030
+ sogni-agent -q --video --ref /tmp/lastframe.png --ref-end /tmp/target.png -o /tmp/continuation.mp4 "scene transition prompt"
933
1031
  ```
934
1032
  3. **Concatenate the videos** using the built-in safe wrapper:
935
1033
  ```bash
936
- node {{skillDir}}/sogni-agent.mjs --concat-videos /tmp/full_sequence.mp4 /tmp/existing.mp4 /tmp/continuation.mp4
1034
+ sogni-agent --concat-videos /tmp/full_sequence.mp4 /tmp/existing.mp4 /tmp/continuation.mp4
937
1035
  ```
938
1036
 
939
1037
  This ensures visual continuity — the new clip picks up exactly where the previous one ended.
@@ -998,22 +1096,22 @@ Personas are named people with saved reference photos and optional voice clips.
998
1096
 
999
1097
  ```bash
1000
1098
  # Add a persona with a reference photo
1001
- node {{skillDir}}/sogni-agent.mjs --persona-add "Mark" --ref face.jpg --relationship self --description "30s male, brown hair, brown eyes"
1099
+ sogni-agent --persona-add "Mark" --ref face.jpg --relationship self --description "30s male, brown hair, brown eyes"
1002
1100
 
1003
1101
  # Add with voice clip for video voice cloning
1004
- node {{skillDir}}/sogni-agent.mjs --persona-add "Sarah" --ref sarah.jpg --relationship partner --voice-clip sarah-voice.webm --voice "warm alto with British accent"
1102
+ sogni-agent --persona-add "Sarah" --ref sarah.jpg --relationship partner --voice-clip sarah-voice.webm --voice "warm alto with British accent"
1005
1103
 
1006
1104
  # List all personas
1007
- node {{skillDir}}/sogni-agent.mjs --persona-list --json
1105
+ sogni-agent --persona-list --json
1008
1106
 
1009
1107
  # Resolve a persona by name, tag, or pronoun
1010
- node {{skillDir}}/sogni-agent.mjs --persona-resolve "me" --json
1108
+ sogni-agent --persona-resolve "me" --json
1011
1109
 
1012
1110
  # Generate using a persona (auto-injects photo as context)
1013
- node {{skillDir}}/sogni-agent.mjs --persona "Mark" -o /tmp/hero.png "superhero in dramatic lighting"
1111
+ sogni-agent --persona "Mark" -o /tmp/hero.png "superhero in dramatic lighting"
1014
1112
 
1015
1113
  # Remove a persona
1016
- node {{skillDir}}/sogni-agent.mjs --persona-remove "Mark"
1114
+ sogni-agent --persona-remove "Mark"
1017
1115
  ```
1018
1116
 
1019
1117
  ### Persona Pipeline Rules
@@ -1038,18 +1136,18 @@ Memories are persistent key-value preferences stored locally at `~/.config/sogni
1038
1136
 
1039
1137
  ```bash
1040
1138
  # Save a preference
1041
- node {{skillDir}}/sogni-agent.mjs --memory-set preferred_style "watercolor and soft lighting"
1042
- node {{skillDir}}/sogni-agent.mjs --memory-set aspect_ratio "16:9"
1043
- node {{skillDir}}/sogni-agent.mjs --memory-set favorite_artist "Studio Ghibli"
1139
+ sogni-agent --memory-set preferred_style "watercolor and soft lighting"
1140
+ sogni-agent --memory-set aspect_ratio "16:9"
1141
+ sogni-agent --memory-set favorite_artist "Studio Ghibli"
1044
1142
 
1045
1143
  # Read all memories
1046
- node {{skillDir}}/sogni-agent.mjs --memory-list --json
1144
+ sogni-agent --memory-list --json
1047
1145
 
1048
1146
  # Get one memory
1049
- node {{skillDir}}/sogni-agent.mjs --memory-get preferred_style --json
1147
+ sogni-agent --memory-get preferred_style --json
1050
1148
 
1051
1149
  # Delete a memory
1052
- node {{skillDir}}/sogni-agent.mjs --memory-remove preferred_style
1150
+ sogni-agent --memory-remove preferred_style
1053
1151
  ```
1054
1152
 
1055
1153
  **Agent behavior:** Before generating, check memories with `--memory-list` and respect saved preferences. If the user says "I always want watercolor style", save it with `--memory-set`. Categories: `preference` (default), `fact`, `context`.
@@ -1060,13 +1158,13 @@ Users can set custom instructions that shape agent behavior, stored at `~/.confi
1060
1158
 
1061
1159
  ```bash
1062
1160
  # Set personality
1063
- node {{skillDir}}/sogni-agent.mjs --personality-set "Be concise, always use cinematic lighting, suggest bold creative ideas"
1161
+ sogni-agent --personality-set "Be concise, always use cinematic lighting, suggest bold creative ideas"
1064
1162
 
1065
1163
  # Read current personality
1066
- node {{skillDir}}/sogni-agent.mjs --personality-get --json
1164
+ sogni-agent --personality-get --json
1067
1165
 
1068
1166
  # Clear (reset to default)
1069
- node {{skillDir}}/sogni-agent.mjs --personality-clear
1167
+ sogni-agent --personality-clear
1070
1168
  ```
1071
1169
 
1072
1170
  **Agent behavior:** Check personality on startup and adopt those instructions. Personality overrides default style but not hard constraints (safety, tool usage rules).
@@ -1077,13 +1175,13 @@ Apply artistic styles to existing images:
1077
1175
 
1078
1176
  ```bash
1079
1177
  # Apply a named artist style
1080
- node {{skillDir}}/sogni-agent.mjs -c photo.jpg -o /tmp/styled.png "Apply style: Andy Warhol pop art with bold primary colors"
1178
+ sogni-agent -c photo.jpg -o /tmp/styled.png "Apply style: Andy Warhol pop art with bold primary colors"
1081
1179
 
1082
1180
  # Studio Ghibli transformation
1083
- node {{skillDir}}/sogni-agent.mjs -c photo.jpg -o /tmp/ghibli.png "Apply style: Studio Ghibli watercolor with soft pastel sky and lush greenery"
1181
+ sogni-agent -c photo.jpg -o /tmp/ghibli.png "Apply style: Studio Ghibli watercolor with soft pastel sky and lush greenery"
1084
1182
 
1085
1183
  # For photos with people, always preserve identity
1086
- node {{skillDir}}/sogni-agent.mjs -c portrait.jpg -o /tmp/styled.png "Apply style: oil painting in the style of Vermeer. Preserve all facial features, expressions, and identity."
1184
+ sogni-agent -c portrait.jpg -o /tmp/styled.png "Apply style: oil painting in the style of Vermeer. Preserve all facial features, expressions, and identity."
1087
1185
  ```
1088
1186
 
1089
1187
  **Tips:** Reference artists and styles BY NAME for best results. Use positive phrasing. For photos with people, always append identity preservation instructions.
@@ -1094,13 +1192,13 @@ Generate a photo from a different camera angle:
1094
1192
 
1095
1193
  ```bash
1096
1194
  # 3/4 view
1097
- node {{skillDir}}/sogni-agent.mjs --multi-angle -c subject.jpg --azimuth front-right "same subject"
1195
+ sogni-agent --multi-angle -c subject.jpg --azimuth front-right "same subject"
1098
1196
 
1099
1197
  # Side view
1100
- node {{skillDir}}/sogni-agent.mjs --multi-angle -c subject.jpg --azimuth left --elevation eye-level --distance medium "same subject"
1198
+ sogni-agent --multi-angle -c subject.jpg --azimuth left --elevation eye-level --distance medium "same subject"
1101
1199
 
1102
1200
  # Full 360 turntable
1103
- node {{skillDir}}/sogni-agent.mjs --angles-360 -c subject.jpg "same subject"
1201
+ sogni-agent --angles-360 -c subject.jpg "same subject"
1104
1202
  ```
1105
1203
 
1106
1204
  **User term mapping:**