npm - sogni-gen - Versions diffs - 1.5.13 → 1.5.14 - Mend

sogni-gen 1.5.13 → 1.5.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/SKILL.md +17 -2
package/desktop-extension/server/package.json +1 -1
package/llm.txt +18 -5
package/openclaw.plugin.json +1 -1
package/package.json +1 -1

package/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: sogni-gen
-version: "1.5.13"
+version: "1.5.14"
 description: Generate images **and videos** using Sogni AI's decentralized network, with local credential/config files and optional local media inputs. Ask the agent to "draw", "generate", "create an image", or "make a video/animate" from a prompt or reference image.
 homepage: https://sogni.ai
 metadata:
@@ -147,7 +147,7 @@ node sogni-gen.mjs -q -o /tmp/cat.png "a cat wearing a hat"
 | `-c, --context <path>` | Context image for editing | - |
 | `--last-image` | Use last generated image as context/ref | - |
 | `--video, -v` | Generate video instead of image | - |
-| `--workflow <type>` | Video workflow (t2v\|i2v\|s2v\|v2v\|animate-move\|animate-replace) | inferred |
+| `--workflow <type>` | Video workflow (t2v\|i2v\|s2v\|ia2v\|a2v\|v2v\|animate-move\|animate-replace) | inferred |
 | `--fps <num>` | Frames per second (video) | 16 |
 | `--duration <sec>` | Duration in seconds (video) | 5 |
 | `--frames <num>` | Override total frames (video) | - |
@@ -480,6 +480,12 @@ node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -o /tmp/vide
 # Generate text-to-video
 node {{skillDir}}/sogni-gen.mjs -q --video -o /tmp/video.mp4 "ocean waves at sunset"
+# HD / "4K" text-to-video: prefer LTX-2.3
+node {{skillDir}}/sogni-gen.mjs -q --video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "cinematic ocean flyover"
+# HD / "4K" image-to-video: prefer LTX i2v
+node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -m ltx2-19b-fp8_i2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "slow cinematic push-in"
 # Photobooth: stylize a face photo
 node {{skillDir}}/sogni-gen.mjs -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
@@ -492,6 +498,15 @@ node {{skillDir}}/sogni-gen.mjs --json --list-media images
 # Then send via message tool with filePath
 ```
+## High-Res Video Routing
+When the user asks for video in **"hd"**, **"1080p"**, **"4k"**, **"uhd"**, or **"high-res"**, do not use the default WAN video models.
+- For **text-to-video**, use `-m ltx23-22b-fp8_t2v_distilled`.
+- For **image-to-video**, use `-m ltx2-19b-fp8_i2v_distilled`.
+- Prefer LTX-sized dimensions such as `-w 1920 -h 1088`.
+- Treat "4k" as a signal to use the highest practical LTX path exposed by this skill, even if the exact output is not literal 3840x2160.
 **Security:** Agents must use the CLI's built-in flags (`--extract-last-frame`, `--concat-videos`, `--list-media`) for all file operations and video manipulation. Never run raw shell commands (`ffmpeg`, `ls`, `cp`, etc.) directly.
 ## Animate Between Two Images (First-Frame / Last-Frame)
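
The High-Res Video Routing rules added to SKILL.md in the hunk above can be read as a small decision function. The following is only a sketch of that reading: `highResVideoArgs` and `HIGH_RES_TERMS` are hypothetical names that do not appear in the package, and the keyword matching is an assumption about how an agent might detect the listed terms. Only the model IDs and the `-m`, `-w`, `-h` flags come from the diff.

```javascript
// Hypothetical helper, not part of sogni-gen. It maps a user request to extra
// CLI flags following the High-Res Video Routing rules quoted in the diff.
const HIGH_RES_TERMS = /\b(hd|1080p|4k|uhd|high-res)\b/i;

function highResVideoArgs(request, hasReferenceImage) {
  if (!HIGH_RES_TERMS.test(request)) {
    // No high-res cue: add nothing and let the skill pick its default model.
    return [];
  }
  const model = hasReferenceImage
    ? "ltx2-19b-fp8_i2v_distilled"   // image-to-video path
    : "ltx23-22b-fp8_t2v_distilled"; // text-to-video path
  // 1920x1088 is the LTX-sized example the diff recommends.
  return ["-m", model, "-w", "1920", "-h", "1088"];
}

console.log(
  highResVideoArgs("Create an HD video of a spaceship landing", false).join(" ")
);
// prints "-m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088"
```

The returned flags would be appended to a `--video` invocation (plus `--ref` for the image-to-video case), as in the example commands the diff adds.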

package/desktop-extension/server/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "sogni-gen-desktop-server",
-  "version": "1.5.13",
+  "version": "1.5.14",
   "private": true,
   "type": "module",
   "engines": {

package/llm.txt CHANGED

@@ -86,6 +86,12 @@ node {{skillDir}}/sogni-gen.mjs -q --video --workflow a2v --ref-audio /path/to/s
 # LTX-2.3 text-to-video
 node {{skillDir}}/sogni-gen.mjs -q --video -m ltx23-22b-fp8_t2v_distilled --duration 20 -o /tmp/ltx23.mp4 "cinematic drone shot"
+# HD / "4K" text-to-video: prefer LTX-2.3 over WAN defaults
+node {{skillDir}}/sogni-gen.mjs -q --video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088 -o /tmp/hd-video.mp4 "cinematic ocean flyover"
+# HD / "4K" image-to-video: prefer LTX i2v over WAN defaults
+node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -m ltx2-19b-fp8_i2v_distilled -w 1920 -h 1088 -o /tmp/hd-i2v.mp4 "slow cinematic push-in"
 # Motion transfer from another video
 node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/subject.jpg --ref-video /path/to/motion.mp4 --workflow animate-move -o /tmp/animated.mp4 "transfer motion"
 ```
@@ -161,11 +167,12 @@ node {{skillDir}}/sogni-gen.mjs --json --balance
 3. When they send a photo and ask to edit/change/modify it: use -c with their image.
 4. When they send a photo and ask to animate it: use --video --ref with their image.
 5. When they send a photo + audio and ask for lip sync: use --video --ref IMAGE --ref-audio AUDIO.
-6. Always use -q (quiet) and -o (output to file) so you can send the result back.
-7. After generating, send the file to the user via the message tool with filePath.
-8. If you get "Insufficient funds", tell them: "Claim 50 free daily Spark points at https://app.sogni.ai/"
-9. For transition/animation videos, always use this plugin's built-in flags (not raw ffmpeg). Use `--looping`, `--extract-last-frame`, or `--concat-videos`.
-10. Default to 768x768 for images. WAN video sizes must be divisible by 16; LTX family sizes must be divisible by 64.
+6. If the user asks for video in "hd", "1080p", "4k", "uhd", or "high-res", do not use WAN defaults. For text-to-video, use `-m ltx23-22b-fp8_t2v_distilled`. For image-to-video, use `-m ltx2-19b-fp8_i2v_distilled`. Prefer LTX-sized outputs such as `-w 1920 -h 1088`.
+7. Always use -q (quiet) and -o (output to file) so you can send the result back.
+8. After generating, send the file to the user via the message tool with filePath.
+9. If you get "Insufficient funds", tell them: "Claim 50 free daily Spark points at https://app.sogni.ai/"
+10. For transition/animation videos, always use this plugin's built-in flags (not raw ffmpeg). Use `--looping`, `--extract-last-frame`, or `--concat-videos`.
+11. Default to 768x768 for images. WAN video sizes must be divisible by 16; LTX family sizes must be divisible by 64.
 ## Finding User-Sent Media
@@ -198,6 +205,12 @@ You: Use --video --ref with their photo, send video.
 User: "Make a video of a cat playing piano"
 You: Use --video (t2v), send video.
+User: "Make this image into a 4k cinematic video" *sends photo*
+You: Use `--video --ref` with the image and prefer `-m ltx2-19b-fp8_i2v_distilled -w 1920 -h 1088`.
+User: "Create an HD video of a spaceship landing"
+You: Use `--video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088`.
 User: *sends photo + audio* "Make this person say this"
 You: Use --video --ref photo --ref-audio audio (s2v), send video.
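
Rule 11 in the llm.txt hunk above (WAN sizes divisible by 16, LTX family sizes divisible by 64) explains why the new examples use a height of 1088 rather than 1080. A minimal sketch of that check follows; `isValidVideoSize` and `snapUp` are hypothetical helpers, and the `"wan"` / `"ltx"` family labels are assumptions made for illustration, not identifiers from the package.

```javascript
// Hypothetical helpers, not part of sogni-gen. They illustrate the size rule
// quoted in the diff: WAN sizes divisible by 16, LTX family sizes by 64.
function isValidVideoSize(width, height, family) {
  const step = family === "ltx" ? 64 : 16; // any other label is treated as WAN
  return width % step === 0 && height % step === 0;
}

// Round a dimension up to the next multiple of `step`.
function snapUp(value, step) {
  return Math.ceil(value / step) * step;
}

console.log(isValidVideoSize(1920, 1080, "ltx")); // prints false (1080 = 16 * 64 + 56)
console.log(isValidVideoSize(1920, 1088, "ltx")); // prints true  (1088 = 17 * 64)
console.log(snapUp(1080, 64));                    // prints 1088
```

Under this reading, 1080 fails both checks (it is not a multiple of 16 either), so 1088 is the nearest valid height above 1080 for either model family.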

package/openclaw.plugin.json CHANGED

@@ -2,7 +2,7 @@
   "id": "sogni-gen",
   "name": "Sogni Image & Video Generation",
   "description": "Generate images and videos using Sogni AI via the sogni-gen skill.",
-  "version": "1.5.13",
+  "version": "1.5.14",
   "skills": [
     "."
   ],

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "sogni-gen",
-  "version": "1.5.13",
+  "version": "1.5.14",
   "description": "Sogni AI image & video generation — OpenClaw plugin and MCP server for Claude Code / Claude Desktop",
   "type": "module",
   "main": "sogni-gen.mjs",