sogni-gen 1.5.13 → 1.5.14
- package/SKILL.md +17 -2
- package/desktop-extension/server/package.json +1 -1
- package/llm.txt +18 -5
- package/openclaw.plugin.json +1 -1
- package/package.json +1 -1
package/SKILL.md
CHANGED
@@ -1,6 +1,6 @@
 ---
 name: sogni-gen
-version: "1.5.
+version: "1.5.14"
 description: Generate images **and videos** using Sogni AI's decentralized network, with local credential/config files and optional local media inputs. Ask the agent to "draw", "generate", "create an image", or "make a video/animate" from a prompt or reference image.
 homepage: https://sogni.ai
 metadata:
@@ -147,7 +147,7 @@ node sogni-gen.mjs -q -o /tmp/cat.png "a cat wearing a hat"
 | `-c, --context <path>` | Context image for editing | - |
 | `--last-image` | Use last generated image as context/ref | - |
 | `--video, -v` | Generate video instead of image | - |
-| `--workflow <type>` | Video workflow (t2v\|i2v\|s2v\|v2v\|animate-move\|animate-replace) | inferred |
+| `--workflow <type>` | Video workflow (t2v\|i2v\|s2v\|ia2v\|a2v\|v2v\|animate-move\|animate-replace) | inferred |
 | `--fps <num>` | Frames per second (video) | 16 |
 | `--duration <sec>` | Duration in seconds (video) | 5 |
 | `--frames <num>` | Override total frames (video) | - |
@@ -480,6 +480,12 @@ node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -o /tmp/vide
 # Generate text-to-video
 node {{skillDir}}/sogni-gen.mjs -q --video -o /tmp/video.mp4 "ocean waves at sunset"
 
+# HD / "4K" text-to-video: prefer LTX-2.3
+node {{skillDir}}/sogni-gen.mjs -q --video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "cinematic ocean flyover"
+
+# HD / "4K" image-to-video: prefer LTX i2v
+node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -m ltx2-19b-fp8_i2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "slow cinematic push-in"
+
 # Photobooth: stylize a face photo
 node {{skillDir}}/sogni-gen.mjs -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
 
@@ -492,6 +498,15 @@ node {{skillDir}}/sogni-gen.mjs --json --list-media images
 # Then send via message tool with filePath
 ```
 
+## High-Res Video Routing
+
+When the user asks for video in **"hd"**, **"1080p"**, **"4k"**, **"uhd"**, or **"high-res"**, do not use the default WAN video models.
+
+- For **text-to-video**, use `-m ltx23-22b-fp8_t2v_distilled`.
+- For **image-to-video**, use `-m ltx2-19b-fp8_i2v_distilled`.
+- Prefer LTX-sized dimensions such as `-w 1920 -h 1088`.
+- Treat "4k" as a signal to use the highest practical LTX path exposed by this skill, even if the exact output is not literal 3840x2160.
+
 **Security:** Agents must use the CLI's built-in flags (`--extract-last-frame`, `--concat-videos`, `--list-media`) for all file operations and video manipulation. Never run raw shell commands (`ffmpeg`, `ls`, `cp`, etc.) directly.
 
 ## Animate Between Two Images (First-Frame / Last-Frame)
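Reviewer note: the "High-Res Video Routing" rules added in this hunk amount to a small decision table. The sketch below is one way to express it in JavaScript, not code from the package. `wantsHighRes` and `routeVideo` are hypothetical names. The keyword list, model IDs, and the 1920x1088 size are copied from the added SKILL.md text. Whole-word keyword matching is an assumption about how an agent might read the request.

```javascript
// Hypothetical helper, NOT part of sogni-gen. It only restates the routing
// rules from the added SKILL.md section as data plus two small functions.
const HIGH_RES_KEYWORDS = ["hd", "1080p", "4k", "uhd", "high-res"];

const LTX_MODELS = {
  t2v: "ltx23-22b-fp8_t2v_distilled", // text-to-video
  i2v: "ltx2-19b-fp8_i2v_distilled",  // image-to-video
};

// True when the request mentions one of the keywords as a whole word
// (assumption: substring hits inside longer words should not count).
function wantsHighRes(request) {
  const text = request.toLowerCase();
  return HIGH_RES_KEYWORDS.some((kw) => {
    const escaped = kw.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    return new RegExp(`(^|[^a-z0-9])${escaped}([^a-z0-9]|$)`).test(text);
  });
}

// Returns LTX overrides for high-res requests, or null to leave the
// CLI's default model selection untouched.
function routeVideo(request, hasReferenceImage) {
  if (!wantsHighRes(request)) {
    return null;
  }
  return {
    model: hasReferenceImage ? LTX_MODELS.i2v : LTX_MODELS.t2v,
    width: 1920,
    height: 1088,
  };
}

console.log(routeVideo("Create an HD video of a spaceship landing", false));
console.log(routeVideo("Make a video of a cat playing piano", false));
```

Returning `null` for ordinary requests keeps the WAN defaults in place, which matches the rule that only explicit high-res wording should switch models.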
package/llm.txt
CHANGED
@@ -86,6 +86,12 @@ node {{skillDir}}/sogni-gen.mjs -q --video --workflow a2v --ref-audio /path/to/s
 # LTX-2.3 text-to-video
 node {{skillDir}}/sogni-gen.mjs -q --video -m ltx23-22b-fp8_t2v_distilled --duration 20 -o /tmp/ltx23.mp4 "cinematic drone shot"
 
+# HD / "4K" text-to-video: prefer LTX-2.3 over WAN defaults
+node {{skillDir}}/sogni-gen.mjs -q --video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088 -o /tmp/hd-video.mp4 "cinematic ocean flyover"
+
+# HD / "4K" image-to-video: prefer LTX i2v over WAN defaults
+node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -m ltx2-19b-fp8_i2v_distilled -w 1920 -h 1088 -o /tmp/hd-i2v.mp4 "slow cinematic push-in"
+
 # Motion transfer from another video
 node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/subject.jpg --ref-video /path/to/motion.mp4 --workflow animate-move -o /tmp/animated.mp4 "transfer motion"
 ```
@@ -161,11 +167,12 @@ node {{skillDir}}/sogni-gen.mjs --json --balance
 3. When they send a photo and ask to edit/change/modify it: use -c with their image.
 4. When they send a photo and ask to animate it: use --video --ref with their image.
 5. When they send a photo + audio and ask for lip sync: use --video --ref IMAGE --ref-audio AUDIO.
-6.
-7.
-8.
-9.
-10.
+6. If the user asks for video in "hd", "1080p", "4k", "uhd", or "high-res", do not use WAN defaults. For text-to-video, use `-m ltx23-22b-fp8_t2v_distilled`. For image-to-video, use `-m ltx2-19b-fp8_i2v_distilled`. Prefer LTX-sized outputs such as `-w 1920 -h 1088`.
+7. Always use -q (quiet) and -o (output to file) so you can send the result back.
+8. After generating, send the file to the user via the message tool with filePath.
+9. If you get "Insufficient funds", tell them: "Claim 50 free daily Spark points at https://app.sogni.ai/"
+10. For transition/animation videos, always use this plugin's built-in flags (not raw ffmpeg). Use `--looping`, `--extract-last-frame`, or `--concat-videos`.
+11. Default to 768x768 for images. WAN video sizes must be divisible by 16; LTX family sizes must be divisible by 64.
 
 ## Finding User-Sent Media
 
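Reviewer note: rule 11 in the hunk above gives the size constraints (WAN divisible by 16, LTX divisible by 64). The arithmetic is consistent with the examples using 1088 instead of 1080, since 1080 / 64 = 16.875 while 1088 = 17 x 64. A minimal sketch of that check follows. `isValidSize` and `roundUpToMultiple` are hypothetical helpers, and rounding up (rather than down) is an assumption, chosen only because it reproduces the 1088 seen in the diff.

```javascript
// Hypothetical helpers, NOT part of sogni-gen. The divisors come from
// rule 11 in llm.txt; everything else is illustrative.
const SIZE_DIVISOR = { wan: 16, ltx: 64 };

// Checks that both dimensions are multiples of the family's divisor.
function isValidSize(family, width, height) {
  const divisor = SIZE_DIVISOR[family];
  if (!divisor) {
    throw new Error(`unknown model family: ${family}`);
  }
  return width % divisor === 0 && height % divisor === 0;
}

// Rounds a dimension up to the next multiple of the divisor.
function roundUpToMultiple(value, divisor) {
  return Math.ceil(value / divisor) * divisor;
}

console.log(isValidSize("ltx", 1920, 1080)); // false
console.log(roundUpToMultiple(1080, 64));    // 1088
console.log(isValidSize("ltx", 1920, 1088)); // true
```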
@@ -198,6 +205,12 @@ You: Use --video --ref with their photo, send video.
 User: "Make a video of a cat playing piano"
 You: Use --video (t2v), send video.
 
+User: "Make this image into a 4k cinematic video" *sends photo*
+You: Use `--video --ref` with the image and prefer `-m ltx2-19b-fp8_i2v_distilled -w 1920 -h 1088`.
+
+User: "Create an HD video of a spaceship landing"
+You: Use `--video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088`.
+
 User: *sends photo + audio* "Make this person say this"
 You: Use --video --ref photo --ref-audio audio (s2v), send video.
 
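Reviewer note: the two conversation examples added above map directly onto CLI argument lists. As a rough illustration, the sketch below assembles such a list from the flags that appear in this diff (`-q`, `--video`, `--ref`, `-m`, `-w`, `-h`, `-o`). `buildVideoArgs` is a hypothetical function, not an API of sogni-gen, and the argument order simply mirrors the documented command lines.

```javascript
// Hypothetical helper, NOT part of sogni-gen. It builds an argument array
// using only flags that appear in the diffed documentation.
function buildVideoArgs({ prompt, output, model, width, height, refImage }) {
  const args = ["-q", "--video"];
  if (refImage) {
    args.push("--ref", refImage); // image-to-video
  }
  if (model) {
    args.push("-m", model);
  }
  if (width && height) {
    args.push("-w", String(width), "-h", String(height));
  }
  args.push("-o", output, prompt);
  return args;
}

// "Make this image into a 4k cinematic video" *sends photo*
console.log(
  buildVideoArgs({
    prompt: "slow cinematic push-in",
    output: "/tmp/video.mp4",
    model: "ltx2-19b-fp8_i2v_distilled",
    width: 1920,
    height: 1088,
    refImage: "/path/to/image.png",
  }).join(" ")
);
```

Keeping the arguments as an array rather than a single string means the prompt never needs shell quoting if the list is later handed to something like Node's `child_process.execFile`.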
package/openclaw.plugin.json
CHANGED