@sogni-ai/sogni-creative-agent-skill 2.1.1 → 2.1.3
- package/README.md +47 -6
- package/SKILL.md +37 -11
- package/generated/creative-agent-runtime.mjs +1138 -0
- package/llm.txt +17 -6
- package/openclaw.plugin.json +1 -1
- package/package.json +10 -6
- package/scripts/check-creative-agent-runtime.mjs +57 -0
- package/skill-package.json +1 -1
- package/sogni-agent.mjs +267 -699
- package/version.mjs +1 -1
package/README.md
CHANGED
|
@@ -31,7 +31,7 @@ With Sogni Creative Agent Skill, an agent can:
|
|
|
31
31
|
2. Install the command-line tool:
|
|
32
32
|
|
|
33
33
|
```bash
|
|
34
|
-
npm install -g @sogni-ai/sogni-creative-agent-skill
|
|
34
|
+
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
35
35
|
sogni-agent --version
|
|
36
36
|
```
|
|
37
37
|
|
|
@@ -49,12 +49,33 @@ Then ask your agent to do something simple, for example:
|
|
|
49
49
|
Default install behavior: when someone asks to install Sogni Creative Agent Skill, install the command-line tool and skill source.
|
|
50
50
|
|
|
51
51
|
```bash
|
|
52
|
-
npm install -g @sogni-ai/sogni-creative-agent-skill
|
|
52
|
+
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
53
53
|
sogni-agent --version
|
|
54
54
|
```
|
|
55
55
|
|
|
56
56
|
Then point the agent/runtime at this repository's [`SKILL.md`](./SKILL.md).
|
|
57
57
|
|
|
58
|
+
### Agent-Safe Upgrade
|
|
59
|
+
|
|
60
|
+
When upgrading from inside an agent runtime, prefer direct package-manager or existing-checkout commands. Avoid asking the agent to build a clone-or-pull shell bootstrap script with `set -e`, `bash -c`, `sh -c`, or an inline repository URL; some sandboxes correctly route those through approval.
|
|
61
|
+
|
|
62
|
+
For the CLI:
|
|
63
|
+
|
|
64
|
+
```bash
|
|
65
|
+
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
66
|
+
sogni-agent --version
|
|
67
|
+
```
|
|
68
|
+
|
|
69
|
+
For an existing local checkout:
|
|
70
|
+
|
|
71
|
+
```bash
|
|
72
|
+
DEST="$HOME/Documents/git/sogni/sogni-creative-agent-skill"
|
|
73
|
+
git -C "$DEST" pull --ff-only
|
|
74
|
+
npm --prefix "$DEST" install
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
If the checkout is missing, use the npm install path above or explicitly approve a clone.
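As a rough illustration of why the guidance above steers away from bootstrap scripts, the following sketch flags the patterns it names (`set -e`, `bash -c`, `sh -c`, inline repository URLs). The helper name `findRiskyUpgradePatterns` and the regular expressions are assumptions made for this example; they are not part of `sogni-agent` and do not reproduce any real sandbox scanner.

```javascript
// Hypothetical helper, for illustration only: not part of sogni-agent
// and not a copy of any sandbox's actual command scanner.
const RISKY_PATTERNS = [
  { name: "set -e", regex: /\bset\s+-\w*e\w*\b/ },
  { name: "bash -c", regex: /\bbash\s+-c\b/ },
  // Require a non-letter before "sh" so "bash -c" is not counted twice.
  { name: "sh -c", regex: /(^|[^a-z])sh\s+-c\b/ },
  {
    name: "inline repository URL",
    regex: /https?:\/\/\S+\.git\b|git\s+clone\s+https?:\/\//,
  },
];

// Returns the names of every pattern found in the command string.
function findRiskyUpgradePatterns(command) {
  return RISKY_PATTERNS
    .filter(({ regex }) => regex.test(command))
    .map(({ name }) => name);
}

// The direct commands recommended above produce no findings.
console.log(findRiskyUpgradePatterns(
  "npm install -g @sogni-ai/sogni-creative-agent-skill@latest"
));
// A clone-or-pull bootstrap one-liner trips several patterns at once.
console.log(findRiskyUpgradePatterns(
  'bash -c "set -e; git clone https://github.com/example/repo.git"'
));
```

A real sandbox may apply different or stricter rules, so treat an empty result here as a hint rather than a guarantee that a command will run without approval.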
|
|
78
|
+
|
|
58
79
|
### OpenClaw Plugin
|
|
59
80
|
|
|
60
81
|
For the published plugin:
|
|
@@ -96,11 +117,23 @@ Point the agent to this repository's [`SKILL.md`](./SKILL.md) for behavior guida
|
|
|
96
117
|
### Manual Installation
|
|
97
118
|
|
|
98
119
|
```bash
|
|
99
|
-
|
|
120
|
+
gh repo clone Sogni-AI/sogni-creative-agent-skill
|
|
100
121
|
cd sogni-creative-agent-skill
|
|
101
122
|
npm install
|
|
102
123
|
```
|
|
103
124
|
|
|
125
|
+
### Maintainer Runtime Sync
|
|
126
|
+
|
|
127
|
+
This public skill keeps CLI/runtime glue here, but Sogni model routing, video workflow defaults, quality tiers, and prompt guardrails are generated from the private `sogni-creative-agent` repo. With both repos checked out as siblings, refresh the generated runtime before publishing:
|
|
128
|
+
|
|
129
|
+
```bash
|
|
130
|
+
npm run sync:creative-agent-runtime
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
`npm test` runs `npm run check:creative-agent-runtime` first, which regenerates this file and fails if it differs from the committed copy.
|
|
134
|
+
|
|
135
|
+
The generated file is committed at [`generated/creative-agent-runtime.mjs`](./generated/creative-agent-runtime.mjs) so public installs do not need access to the private repo.
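The regenerate-and-compare check described above can be pictured with a small sketch. This is not the content of `scripts/check-creative-agent-runtime.mjs`, which this diff does not show; the function name `checkGeneratedRuntime` and its return shape are assumptions chosen for illustration.

```javascript
// Hypothetical drift check: compares the committed generated file with a
// freshly regenerated copy. The real check script may work differently.
function checkGeneratedRuntime(committedSource, regeneratedSource) {
  if (committedSource === regeneratedSource) {
    return { ok: true, message: "generated runtime is up to date" };
  }
  // Locate the first differing line to make the failure easier to act on.
  const committedLines = committedSource.split("\n");
  const regeneratedLines = regeneratedSource.split("\n");
  const limit = Math.max(committedLines.length, regeneratedLines.length);
  let index = 0;
  while (index < limit && committedLines[index] === regeneratedLines[index]) {
    index += 1;
  }
  return {
    ok: false,
    message:
      `generated runtime is stale (first difference at line ${index + 1}); ` +
      "run `npm run sync:creative-agent-runtime`",
  };
}

const result = checkGeneratedRuntime(
  "export const a = 1;\nexport const b = 2;",
  "export const a = 1;\nexport const b = 3;"
);
console.log(result.ok, result.message);
```

In a test script, a failing result would typically translate into a non-zero exit code so that `npm test` stops before the runtime-dependent tests run.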
|
|
136
|
+
|
|
104
137
|
### Advanced OpenClaw Config
|
|
105
138
|
|
|
106
139
|
When loaded through OpenClaw, Sogni Creative Agent Skill reads plugin defaults from OpenClaw config. CLI flags always override those defaults.
|
|
@@ -152,6 +185,13 @@ sogni-agent --video --target-resolution 768 \
|
|
|
152
185
|
sogni-agent --video -m seedance2 --duration 8 \
|
|
153
186
|
"A polished product reveal with native ambient sound"
|
|
154
187
|
|
|
188
|
+
# Seedance multimodal context with public HTTPS references
|
|
189
|
+
sogni-agent --video -m seedance2 --workflow t2v \
|
|
190
|
+
--ref https://cdn.example.com/product.png \
|
|
191
|
+
--ref-video https://cdn.example.com/motion.mp4 \
|
|
192
|
+
--ref-audio https://cdn.example.com/music.m4a \
|
|
193
|
+
"Use @Image1 for product identity, @Video1 for camera movement, and @Audio1 for music rhythm"
|
|
194
|
+
|
|
155
195
|
# Image-to-video (i2v)
|
|
156
196
|
sogni-agent --video --ref cat.jpg "gentle camera pan"
|
|
157
197
|
|
|
@@ -177,7 +217,7 @@ sogni-agent --help
|
|
|
177
217
|
|
|
178
218
|
For local multi-clip workflows, prefer the built-in FFmpeg wrappers over raw shell commands. `--video-start`, `--audio-start`, and `--audio-duration` let you generate focused segments, while `--concat-videos` can stitch them and optionally mux a single soundtrack with `--concat-audio`.
|
|
179
219
|
|
|
180
|
-
V2V defaults mirror the Sogni Chat workflow tuning: `canny`, `pose`, and `depth` use ControlNet strength `0.85` with detailer assist, while `detailer` uses strength `1.0`. Use `-m seedance2-v2v` for Seedance V2V without ControlNet.
|
|
220
|
+
V2V defaults mirror the Sogni Chat workflow tuning: `canny`, `pose`, and `depth` use ControlNet strength `0.85` with detailer assist, while `detailer` uses strength `1.0`. Use `-m seedance2-v2v` for Seedance V2V without ControlNet. Seedance also accepts public HTTPS image, video, and audio references; audio references must be paired with an image or video reference.
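The Seedance reference rule stated above (HTTPS references, with audio paired to an image or video) can be sketched as a pre-flight check. The function `validateSeedanceUrlRefs` is a hypothetical helper, not an API of `sogni-agent`. It only models URL references and cannot tell whether a URL is publicly reachable.

```javascript
// Hypothetical pre-flight check for Seedance URL references.
// It models only the two rules stated in the README text.
function isHttpsUrl(value) {
  try {
    return new URL(value).protocol === "https:";
  } catch {
    return false; // not parseable as an absolute URL (e.g. a local path)
  }
}

function validateSeedanceUrlRefs({ image, video, audio } = {}) {
  const errors = [];
  for (const [kind, value] of Object.entries({ image, video, audio })) {
    if (value !== undefined && !isHttpsUrl(value)) {
      errors.push(`${kind} reference must be an HTTPS URL`);
    }
  }
  if (audio !== undefined && image === undefined && video === undefined) {
    errors.push("audio reference must be paired with an image or video reference");
  }
  return errors;
}

// Audio alone is rejected; audio plus an image passes.
console.log(validateSeedanceUrlRefs({
  audio: "https://cdn.example.com/music.m4a",
}));
console.log(validateSeedanceUrlRefs({
  image: "https://cdn.example.com/product.png",
  audio: "https://cdn.example.com/music.m4a",
}));
```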
|
|
181
221
|
|
|
182
222
|
## LTX-2.3 Prompting Guide
|
|
183
223
|
|
|
@@ -218,7 +258,8 @@ Multi-angle mode auto-builds the `<sks>` prompt and applies the `multiple_angles
|
|
|
218
258
|
## Video Sizing Rules (Aspect Ratios)
|
|
219
259
|
|
|
220
260
|
- WAN models use dimensions divisible by 16, min 480px, max 1536px.
|
|
221
|
-
- LTX family models (`ltx2-*`, `ltx23-*`) use dimensions divisible by 64.
|
|
261
|
+
- LTX family models (`ltx2-*`, `ltx23-*`) use dimensions divisible by 64. The current wrapper caps non-WAN video dimensions at 2048px on the long side.
|
|
262
|
+
- Seedance runs at fixed 24fps and supports 4-15s durations. Other default/WAN video paths support up to 10s; LTX and WAN animate workflows support up to 20s.
|
|
222
263
|
- The script auto-normalizes video sizes to satisfy those constraints.
|
|
223
264
|
- Use `--target-resolution <px>` for bare resolution requests such as "720p" when the user did not specify exact pixels. It targets the short side and preserves the inherited aspect ratio.
|
|
224
265
|
- For i2v (and any workflow using `--ref` / `--ref-end`), the client wrapper resizes the reference image with a strict aspect-fit (`fit: inside`) and then uses the *resized reference dimensions* as the final video size. Because that resize uses rounding, a “valid” requested size can still produce an invalid final size (example: `1024x1536` requested, but ref becomes `1024x1535`).
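To make the divisor rules and the rounding pitfall above concrete, here is a minimal sketch of size snapping. It is not the wrapper's actual normalization code. The `SIZE_RULES` table, the function names, and the 64px LTX minimum are assumptions. It also clamps each dimension independently, which respects the 2048px long-side cap but does not preserve aspect ratio.

```javascript
// Hypothetical size rules taken from the bullet list above.
// The LTX minimum is an assumption; the README does not state one.
const SIZE_RULES = {
  wan: { divisor: 16, min: 480, max: 1536 },
  ltx: { divisor: 64, min: 64, max: 2048 },
};

// Round to the nearest multiple of the divisor, then clamp to [min, max].
function snapDimension(value, { divisor, min, max }) {
  const snapped = Math.round(value / divisor) * divisor;
  return Math.min(max, Math.max(min, snapped));
}

function normalizeVideoSize(width, height, family) {
  const rule = SIZE_RULES[family];
  if (!rule) throw new Error(`unknown model family: ${family}`);
  return {
    width: snapDimension(width, rule),
    height: snapDimension(height, rule),
  };
}

// The README's example: a resized reference of 1024x1535 is not divisible
// by 16, so it snaps back to 1024x1536 under the WAN rule.
console.log(1535 % 16 === 0);
console.log(normalizeVideoSize(1024, 1535, "wan"));
```

Snapping after the reference resize, instead of before it, is what avoids the off-by-one failure described in the last bullet.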
|
|
@@ -239,7 +280,7 @@ Run `sogni-agent --help` for the complete CLI. These are the options most agents
|
|
|
239
280
|
| `-o <path>` | Save output locally |
|
|
240
281
|
| `-c <path>` | Provide image context for edits |
|
|
241
282
|
| `--video` | Generate video instead of image |
|
|
242
|
-
| `--ref`, `--ref-audio`, `--ref-video` | Provide image/audio/video references |
|
|
283
|
+
| `--ref`, `--ref-audio`, `--ref-video` | Provide image/audio/video references; Seedance HTTPS references are forwarded as URL context |
|
|
243
284
|
| `--target-resolution <px>` | Target the short side while preserving aspect ratio |
|
|
244
285
|
| `--workflow <type>` | Force `t2v`, `i2v`, `s2v`, `ia2v`, `a2v`, `v2v`, or animate workflows |
|
|
245
286
|
| `--persona <name>` | Use a saved persona reference |
|
package/SKILL.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: sogni-creative-agent-skill
|
|
3
|
-
version: "2.1.
|
|
3
|
+
version: "2.1.3"
|
|
4
4
|
description: Sogni Creative Agent Skill: agent skill and CLI for image and video generation using Sogni AI's decentralized GPU network. Supports personas (named people with saved reference photos and voice clips), persistent memories (user preferences across sessions), custom personality, style transfer, angle synthesis, and multi-step creative workflows. Ask the agent to "draw", "generate", "create an image", "make a video/animate", "apply a style", or "generate me as a superhero".
|
|
5
5
|
homepage: https://sogni.ai
|
|
6
6
|
metadata:
|
|
@@ -45,12 +45,31 @@ When a user asks to install this plugin, skill, or Sogni Creative Agent Skill, i
|
|
|
45
45
|
Default install path:
|
|
46
46
|
|
|
47
47
|
```bash
|
|
48
|
-
npm install -g @sogni-ai/sogni-creative-agent-skill
|
|
48
|
+
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
49
49
|
sogni-agent --version
|
|
50
50
|
```
|
|
51
51
|
|
|
52
52
|
Then configure the agent/runtime to use this `SKILL.md` and invoke the `sogni-agent` CLI.
|
|
53
53
|
|
|
54
|
+
For upgrades, prefer package-manager updates or direct operations on an existing checkout. Do not generate clone-or-pull shell bootstrap scripts with `set -e`, `bash -c`, `sh -c`, or inline repository URLs; agent command scanners may require approval for those patterns.
|
|
55
|
+
|
|
56
|
+
Agent-safe CLI upgrade:
|
|
57
|
+
|
|
58
|
+
```bash
|
|
59
|
+
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
60
|
+
sogni-agent --version
|
|
61
|
+
```
|
|
62
|
+
|
|
63
|
+
Agent-safe update for an existing local checkout:
|
|
64
|
+
|
|
65
|
+
```bash
|
|
66
|
+
DEST="$HOME/Documents/git/sogni/sogni-creative-agent-skill"
|
|
67
|
+
git -C "$DEST" pull --ff-only
|
|
68
|
+
npm --prefix "$DEST" install
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
If that checkout does not exist, prefer the npm-based local skill install below, or ask before cloning.
|
|
72
|
+
|
|
54
73
|
## Setup
|
|
55
74
|
|
|
56
75
|
1. **Get Sogni credentials** at https://app.sogni.ai/
|
|
@@ -70,7 +89,7 @@ You can also export `SOGNI_API_KEY`, or `SOGNI_USERNAME` + `SOGNI_PASSWORD`, ins
|
|
|
70
89
|
|
|
71
90
|
3. **Install the CLI and skill by default:**
|
|
72
91
|
```bash
|
|
73
|
-
npm install -g @sogni-ai/sogni-creative-agent-skill
|
|
92
|
+
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
74
93
|
sogni-agent --version
|
|
75
94
|
```
|
|
76
95
|
|
|
@@ -185,18 +204,18 @@ node sogni-agent.mjs -q -o /tmp/cat.png "a cat wearing a hat"
|
|
|
185
204
|
| `--target-resolution <px>` | Short-side video target preserving aspect ratio | - |
|
|
186
205
|
| `--auto-resize-assets` | Auto-resize video assets | true |
|
|
187
206
|
| `--no-auto-resize-assets` | Disable auto-resize | - |
|
|
188
|
-
| `--estimate-video-cost` | Estimate video cost and exit
|
|
207
|
+
| `--estimate-video-cost` | Estimate video cost and exit | - |
|
|
189
208
|
| `--photobooth` | Face transfer mode (InstantID + SDXL Turbo) | - |
|
|
190
209
|
| `--cn-strength <n>` | ControlNet strength (photobooth) | 0.8 |
|
|
191
210
|
| `--cn-guidance-end <n>` | ControlNet guidance end point (photobooth) | 0.3 |
|
|
192
|
-
| `--ref <path>` | Reference image for video or photobooth face | required for video/photobooth |
|
|
193
|
-
| `--ref-end <path>` | End frame for i2v interpolation | - |
|
|
194
|
-
| `--ref-audio <path>` | Uploaded/generated audio for ia2v/a2v, or s2v lip-sync | - |
|
|
211
|
+
| `--ref <path\|url>` | Reference image for video or photobooth face | required for video/photobooth |
|
|
212
|
+
| `--ref-end <path\|url>` | End frame for i2v interpolation | - |
|
|
213
|
+
| `--ref-audio <path\|url>` | Uploaded/generated audio for ia2v/a2v, or s2v lip-sync | - |
|
|
195
214
|
| `--audio-start <sec>` | Start offset into `--ref-audio` | - |
|
|
196
215
|
| `--audio-duration <sec>` | Duration slice from `--ref-audio` | - |
|
|
197
216
|
| `--reference-audio-identity <path>` | Voice identity clip for LTX native audio | - |
|
|
198
217
|
| `--voice-persona <name>` | Use saved persona voice clip as LTX voice identity | - |
|
|
199
|
-
| `--ref-video <path>` | Reference video for animate/v2v workflows | - |
|
|
218
|
+
| `--ref-video <path\|url>` | Reference video for animate/v2v workflows | - |
|
|
200
219
|
| `--video-start <sec>` | Start offset into `--ref-video` for segmented V2V/animate | - |
|
|
201
220
|
| `--controlnet-name <name>` | ControlNet type for v2v: canny\|pose\|depth\|detailer | - |
|
|
202
221
|
| `--controlnet-strength <n>` | ControlNet strength for v2v (0.0-1.0) | canny/pose/depth 0.85, detailer 1.0 |
|
|
@@ -462,6 +481,13 @@ node sogni-agent.mjs --video --target-resolution 768 \
|
|
|
462
481
|
node sogni-agent.mjs --video -m seedance2 --duration 8 \
|
|
463
482
|
"A polished product reveal with native ambient sound"
|
|
464
483
|
|
|
484
|
+
# Seedance multimodal context with public HTTPS references
|
|
485
|
+
node sogni-agent.mjs --video -m seedance2 --workflow t2v \
|
|
486
|
+
--ref https://cdn.example.com/product.png \
|
|
487
|
+
--ref-video https://cdn.example.com/motion.mp4 \
|
|
488
|
+
--ref-audio https://cdn.example.com/music.m4a \
|
|
489
|
+
"Use @Image1 for product identity, @Video1 for camera movement, and @Audio1 for music rhythm"
|
|
490
|
+
|
|
465
491
|
# Sound-to-video (s2v)
|
|
466
492
|
node sogni-agent.mjs --video --ref face.jpg --ref-audio speech.m4a \
|
|
467
493
|
-m wan_v2.2-14b-fp8_s2v_lightx2v "lip sync talking head"
|
|
@@ -511,7 +537,7 @@ node sogni-agent.mjs --video --workflow v2v --ref-video scene.mp4 \
|
|
|
511
537
|
```
|
|
512
538
|
|
|
513
539
|
ControlNet types: `canny` (edge detection), `pose` (body pose), `depth` (depth map), `detailer` (detail enhancement).
|
|
514
|
-
Default V2V strengths are tuned from Sogni Chat: `canny`/`pose`/`depth` use `0.85` plus detailer assist, while `detailer` uses `1.0` for preservation. For Seedance V2V, use `-m seedance2-v2v` and omit ControlNet.
|
|
540
|
+
Default V2V strengths are tuned from Sogni Chat: `canny`/`pose`/`depth` use `0.85` plus detailer assist, while `detailer` uses `1.0` for preservation. For Seedance V2V, use `-m seedance2-v2v` and omit ControlNet. Seedance accepts public HTTPS image, video, and audio references as URL context; audio references must be paired with an image or video reference.
|
|
515
541
|
|
|
516
542
|
```bash
|
|
517
543
|
# Seedance V2V without ControlNet
|
|
@@ -685,7 +711,7 @@ When the user asks for video in **"hd"**, **"1080p"**, **"4k"**, **"uhd"**, or *
|
|
|
685
711
|
- For bare named resolutions such as "720p" without orientation or exact pixels, prefer `--target-resolution 768` or the closest requested short side instead of forcing landscape dimensions.
|
|
686
712
|
- If the user explicitly asks for `vertical`, `portrait`, `story`, `reel`, `tiktok`, `square`, or `4:3`, apply the matching dimensions from the **Orientation Mapping** rules instead of defaulting to 16:9.
|
|
687
713
|
- Rewrite the user's request using the **LTX-2.3 Prompt Rule** before invoking the command. Do not send short slogan-style prompts to LTX.
|
|
688
|
-
- Treat "4k" as a signal to use the highest practical LTX path exposed by this skill, even
|
|
714
|
+
- Treat "4k" as a signal to use the highest practical LTX path exposed by this skill, even though the current wrapper caps non-WAN video dimensions at 2048px on the long side.
|
|
689
715
|
|
|
690
716
|
**Security:** Agents must use the CLI's built-in flags (`--extract-last-frame`, `--concat-videos`, `--list-media`) for all file operations and video manipulation. Never run raw shell commands (`ffmpeg`, `ls`, `cp`, etc.) directly.
|
|
691
717
|
|
|
@@ -919,7 +945,7 @@ node {{skillDir}}/sogni-agent.mjs --angles-360 -c subject.jpg "same subject"
|
|
|
919
945
|
## Troubleshooting
|
|
920
946
|
|
|
921
947
|
- **Auth errors**: Check `SOGNI_API_KEY` or the credentials in `~/.config/sogni/credentials`
|
|
922
|
-
- **i2v sizing gotchas**: Video sizes are model-specific. WAN uses min 480px, max 1536px, divisible by 16. LTX uses divisible-by-64 dimensions, and
|
|
948
|
+
- **i2v sizing gotchas**: Video sizes are model-specific. WAN uses min 480px, max 1536px, divisible by 16. LTX uses divisible-by-64 dimensions, and the current wrapper caps non-WAN video dimensions at 2048px on the long side. For i2v, the client wrapper resizes the reference (`fit: inside`) and uses the resized dimensions as the final video size. Because this uses rounding, a requested size can still yield an invalid final size.
|
|
923
949
|
- **Auto-adjustment**: With a local `--ref`, the script will auto-adjust the requested size to avoid resized reference dimensions that miss the model divisor.
|
|
924
950
|
- **If the script adjusts your size but you want to fail instead**: pass `--strict-size` and it will print a suggested `--width/--height`.
|
|
925
951
|
- **Timeouts**: Try a faster model or increase `-t` timeout
|