npm - sogni-gen - Versions diffs - 1.5.13 → 1.5.15 - Mend

sogni-gen 1.5.13 → 1.5.15

Files changed (8)

package/README.md +31 -8
package/SKILL.md +74 -5
package/desktop-extension/manifest.json +1 -1
package/desktop-extension/server/package.json +1 -1
package/llm.txt +11 -182
package/openclaw.plugin.json +1 -1
package/package.json +1 -1
package/sogni-gen.mjs +1 -1

package/README.md CHANGED

@@ -14,10 +14,10 @@ Works as:
 ## Quick Start (OpenClaw + Manus)
 1. Create Sogni credentials (one-time): see [Setup](#setup).
-2. For OpenClaw, point your agent to:
+2. For OpenClaw, install the plugin:
-```
-https://raw.githubusercontent.com/Sogni-AI/openclaw-sogni-gen/main/llm.txt
+```bash
+openclaw plugins install sogni-gen
 ```
 3. For Manus AI agent, point it to this repository:
@@ -36,16 +36,18 @@ Then ask your agent:
 ## OpenClaw Installation (Recommended)
-### Quick Install (URL)
-Point OpenClaw to the [`llm.txt`](https://raw.githubusercontent.com/Sogni-AI/openclaw-sogni-gen/main/llm.txt). This is the fastest setup path.
 ### Plugin Install
 ```bash
 openclaw plugins install sogni-gen
 ```
+The installed plugin loads its behavior from [`SKILL.md`](./SKILL.md) via [`openclaw.plugin.json`](./openclaw.plugin.json).
+### Optional Install Helper
+[`llm.txt`](https://raw.githubusercontent.com/Sogni-AI/openclaw-sogni-gen/main/llm.txt) is now only a lightweight install/setup helper. It is not the primary behavior source for the installed OpenClaw plugin.
 ### Manual Installation
 ```bash
@@ -261,7 +263,7 @@ node sogni-gen.mjs --video --workflow a2v --ref-audio song.mp3 \
 # LTX-2.3 text-to-video
 node sogni-gen.mjs --video -m ltx23-22b-fp8_t2v_distilled --duration 20 \
-  "cinematic drone shot over tropical cliffs"
+  "A wide cinematic aerial shot opens over steep tropical cliffs at golden hour, warm sunlight grazing the rock faces while sea mist drifts above the water below. Palm trees bend gently along the ridge as waves roll against the shoreline, leaving bright bands of foam across the dark stone. The camera glides forward in one continuous pass, revealing more of the coastline as sunlight flickers across wet surfaces and distant birds wheel through the haze. The scene holds a calm, upscale travel-film mood with smooth stabilized motion and crisp environmental detail."
 # Animate (motion transfer)
 node sogni-gen.mjs --video --ref subject.jpg --ref-video motion.mp4 \
@@ -272,6 +274,27 @@ node sogni-gen.mjs --video --estimate-video-cost --steps 20 \
   -m wan_v2.2-14b-fp8_t2v_lightx2v "ocean waves at sunset"
 ```
+## LTX-2.3 Prompting Guide
+When you use `ltx23-22b-fp8_t2v_distilled`, do not feed it short tag prompts like `"cinematic drone shot over tropical cliffs"`. LTX-2.3 renders more reliably from a dense natural-language scene description.
+- Write one unbroken paragraph with no line breaks, bullets, headers, or tag blocks.
+- Use 4-8 flowing present-tense sentences describing one continuous shot, not a montage.
+- Start with shot scale and scene identity, then cover environment, time of day, textures, and named light sources.
+- Keep characters and objects concrete and stable. Describe one main action thread from start to finish.
+- If the user wants dialogue, weave it into the prose with the speaker and delivery identified inline.
+- Express mood through visible behavior, motion, and sound cues instead of vague adjectives.
+- Use positive phrasing. Avoid script formatting, negative prompts, on-screen text/logo requests, and generic filler words like "beautiful" or "nice".
+- Match scene density to clip length. For the default short clips, describe one main beat rather than several unrelated actions.
+Example rewrite:
+```text
+User ask: "make a 4k video of a woman in a neon alley"
+LTX-2.3 prompt: "A medium cinematic shot frames a woman in her 30s standing in a rain-soaked neon alley at night, violet and amber signs reflecting across the wet pavement while warm steam drifts from street vents. She wears a dark trench coat with damp strands of black hair clinging near her cheek as light glances across the fabric texture and the brick walls behind her. She turns toward the camera and steps forward with measured focus, one hand tightening around the strap of her bag while rain taps softly on the metal fire escape and a distant train hum rolls through the block. The camera performs a slow push-in as her jaw sets and her breathing steadies, maintaining smooth stabilized motion and a tense urban-thriller mood."
+```
 ## Photobooth (Face Transfer)
 Generate stylized portraits from a face photo using InstantID ControlNet:

package/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
 name: sogni-gen
-version: "1.5.13"
+version: "1.5.15"
 description: Generate images **and videos** using Sogni AI's decentralized network, with local credential/config files and optional local media inputs. Ask the agent to "draw", "generate", "create an image", or "make a video/animate" from a prompt or reference image.
 homepage: https://sogni.ai
 metadata:
@@ -147,7 +147,7 @@ node sogni-gen.mjs -q -o /tmp/cat.png "a cat wearing a hat"
 | `-c, --context <path>` | Context image for editing | - |
 | `--last-image` | Use last generated image as context/ref | - |
 | `--video, -v` | Generate video instead of image | - |
-| `--workflow <type>` | Video workflow (t2v\|i2v\|s2v\|v2v\|animate-move\|animate-replace) | inferred |
+| `--workflow <type>` | Video workflow (t2v\|i2v\|s2v\|ia2v\|a2v\|v2v\|animate-move\|animate-replace) | inferred |
 | `--fps <num>` | Frames per second (video) | 16 |
 | `--duration <sec>` | Duration in seconds (video) | 5 |
 | `--frames <num>` | Override total frames (video) | - |
@@ -404,7 +404,7 @@ node sogni-gen.mjs --video --workflow a2v --ref-audio song.mp3 \
 # LTX-2.3 text-to-video
 node sogni-gen.mjs --video -m ltx23-22b-fp8_t2v_distilled --duration 20 \
-  "cinematic drone shot over tropical cliffs"
+  "A wide cinematic aerial shot opens over steep tropical cliffs at golden hour, warm sunlight grazing the rock faces while sea mist drifts above the water below. Palm trees bend gently along the ridge as waves roll against the shoreline, leaving bright bands of foam across the dark stone. The camera glides forward in one continuous pass, revealing more of the coastline as sunlight flickers across wet surfaces and distant birds wheel through the haze. The scene holds a calm, upscale travel-film mood with smooth stabilized motion and crisp environmental detail."
 # Animate (motion transfer)
 node sogni-gen.mjs --video --ref subject.jpg --ref-video motion.mp4 \
@@ -463,6 +463,58 @@ node {{skillDir}}/sogni-gen.mjs --json --list-media images
 - If the user message includes the word "photobooth" (case-insensitive), always use `--photobooth` mode with `--ref` set to the user-provided face image.
 - Prioritize this rule over generic image-edit flows (`-c`) for that request.
+## LTX-2.3 Prompt Rule
+Whenever the chosen video model is `ltx23-22b-fp8_t2v_distilled`, do not pass the user's short request through unchanged. Rewrite it into an LTX-2.3-safe prompt before calling `sogni-gen`.
+- Output one single paragraph only. No line breaks, bullet points, section labels, tag lists, or screenplay formatting.
+- Use 4-8 flowing present-tense sentences describing one continuous shot. No cuts, montage, or unrelated scene jumps.
+- Start with shot scale plus the scene's visual identity, then describe environment, time of day, atmosphere, textures, and specific light sources.
+- Keep people, clothing, props, and locations concrete and stable across the whole paragraph.
+- Give the scene one main action thread from start to finish. Use connectors like `as`, `while`, and `then` so motion reads as a continuous filmed moment.
+- If the user asks for dialogue, embed the spoken words inline as prose and identify who is speaking and how they deliver the line.
+- Express emotion through visible physical cues such as posture, grip, jaw tension, breathing, or pacing. Ambient sound can be woven into the prose naturally.
+- Use positive phrasing only. Do not add negative prompts, "no ..." clauses, on-screen text/logo requests, vague filler words like `beautiful` or `nice`, or structural markup such as `[DIALOGUE]`.
+- Keep action density proportional to duration. For short clips, describe one main beat rather than several separate events.
+- Preserve the user's request, but expand it into cinematic prose. Do not invent a different story just to make the prompt longer.
+### Duration-Aware Pacing
+Match scene density to clip length so prompts stay filmable:
+- About `1-4s`: describe exactly 1 action or moment.
+- About `5-8s`: describe about 2 sequential actions.
+- About `9-12s`: describe about 3 sequential actions.
+- Longer clips: add only a small number of additional sequential beats. Do not turn the prompt into a montage or a full story arc unless the duration clearly supports it.
+### Orientation Mapping
+When the user explicitly asks for an orientation or aspect ratio, map it to safe LTX dimensions:
+- `vertical`, `portrait`, `story`, `reel`, `tiktok` -> `-w 1088 -h 1920`
+- `landscape`, `horizontal`, `widescreen`, `youtube`, `16:9` -> `-w 1920 -h 1088`
+- `square`, `1:1` -> `-w 1088 -h 1088`
+- `4:3 portrait` -> `-w 832 -h 1088`
+- `4:3 landscape` -> `-w 1088 -h 832`
+### Camera Language Normalization
+When the user uses loose camera language, translate it into concrete motion phrasing inside the prose prompt:
+- `zoom in` -> `slow push-in`
+- `zoom out` -> `slow pull-back`
+- `pan left` / `pan right` -> `smooth pan left` / `smooth pan right`
+- `orbit` / `circle around` -> `slow arc left` or `slow arc right`
+- `follow` -> `tracking follow`
+Short example:
+```text
+User ask: "4k video of a woman in a neon alley"
+Use this shape instead: "A medium cinematic shot frames a woman in her 30s standing in a rain-soaked neon alley at night, violet and amber signs reflecting across the wet pavement while warm steam drifts from street vents. She wears a dark trench coat with damp strands of black hair clinging near her cheek as light glances across the fabric texture and the brick walls behind her. She turns toward the camera and steps forward with measured focus, one hand tightening around the strap of her bag while rain taps softly on the metal fire escape and a distant train hum rolls through the block. The camera performs a slow push-in as her jaw sets and her breathing steadies, maintaining smooth stabilized motion and a tense urban-thriller mood."
+```
 ## Agent Usage
 When user asks to generate/draw/create an image:
@@ -475,10 +527,16 @@ node {{skillDir}}/sogni-gen.mjs -q -o /tmp/generated.png "user's prompt"
 node {{skillDir}}/sogni-gen.mjs -q -c /path/to/input.jpg -o /tmp/edited.png "make it pop art style"
 # Generate video from image
-node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -o /tmp/video.mp4 "camera slowly zooms in"
+node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -o /tmp/video.mp4 "A medium shot holds on the subject in soft late-afternoon light as fabric edges and background details remain clear and stable. The camera performs a slow push-in while the subject shifts weight subtly and turns slightly toward the lens, keeping the motion gentle and continuous. Leaves rustle softly in the background and the scene maintains smooth cinematic movement with no abrupt action changes."
 # Generate text-to-video
-node {{skillDir}}/sogni-gen.mjs -q --video -o /tmp/video.mp4 "ocean waves at sunset"
+node {{skillDir}}/sogni-gen.mjs -q --video -o /tmp/video.mp4 "A wide cinematic shot opens on ocean waves rolling toward a rocky shoreline at sunset, golden light spreading across the water while sea mist drifts through the air. Foam patterns form and recede over the dark sand as the horizon glows orange and pink in the distance. The camera glides forward in one continuous movement, holding smooth stabilized motion and calm environmental detail throughout the scene."
+# HD / "4K" text-to-video: prefer LTX-2.3
+node {{skillDir}}/sogni-gen.mjs -q --video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "A wide cinematic aerial shot opens over a rugged ocean coastline at golden hour, warm sunlight catching the cliff faces while white surf breaks against dark rock below. Low sea mist hangs over the water and bands of foam trace the shoreline as gulls wheel through the distance. The camera glides forward in one continuous pass, revealing the curve of the coast while wet stone flashes with reflected light and the scene keeps smooth stabilized motion from start to finish. The overall mood feels expansive and polished, with crisp environmental detail and steady travel-film energy."
+# HD / "4K" image-to-video: prefer LTX i2v
+node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -m ltx2-19b-fp8_i2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "A medium cinematic shot holds on the scene with clean subject separation and stable environmental detail as directional light shapes the surfaces and background depth. The camera performs a slow push-in while the main subject makes one subtle continuous movement, keeping posture and identity consistent from start to finish. Ambient motion in the background stays gentle and the overall clip remains smooth, stabilized, and visually coherent."
 # Photobooth: stylize a face photo
 node {{skillDir}}/sogni-gen.mjs -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
@@ -492,6 +550,17 @@ node {{skillDir}}/sogni-gen.mjs --json --list-media images
 # Then send via message tool with filePath
 ```
+## High-Res Video Routing
+When the user asks for video in **"hd"**, **"1080p"**, **"4k"**, **"uhd"**, or **"high-res"**, do not use the default WAN video models.
+- For **text-to-video**, use `-m ltx23-22b-fp8_t2v_distilled`.
+- For **image-to-video**, use `-m ltx2-19b-fp8_i2v_distilled`.
+- Prefer LTX-sized dimensions such as `-w 1920 -h 1088`.
+- If the user explicitly asks for `vertical`, `portrait`, `story`, `reel`, `tiktok`, `square`, or `4:3`, apply the matching dimensions from the **Orientation Mapping** rules instead of defaulting to 16:9.
+- Rewrite the user's request using the **LTX-2.3 Prompt Rule** before invoking the command. Do not send short slogan-style prompts to LTX.
+- Treat "4k" as a signal to use the highest practical LTX path exposed by this skill, even if the exact output is not literal 3840x2160.
 **Security:** Agents must use the CLI's built-in flags (`--extract-last-frame`, `--concat-videos`, `--list-media`) for all file operations and video manipulation. Never run raw shell commands (`ffmpeg`, `ls`, `cp`, etc.) directly.
 ## Animate Between Two Images (First-Frame / Last-Frame)
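
As a reading aid for the **High-Res Video Routing** and **Orientation Mapping** rules added to SKILL.md above, here is a minimal JavaScript sketch of how an agent might turn a user request into CLI flags. The helper names `hasKeyword` and `routeVideoRequest`, the `hasRefImage` option, and the keyword-matching approach are illustrative assumptions and are not code shipped in `sogni-gen`. Only the model IDs, keywords, and dimensions are taken from the diff. A bare `4:3` with no portrait or landscape qualifier is not mapped by the source rules, so this sketch falls back to the 16:9 default in that case.

```javascript
// Hypothetical helper, not part of sogni-gen. Model IDs, keywords, and
// dimensions are copied from the SKILL.md rules shown in this diff.
const HIGH_RES_KEYWORDS = ["hd", "1080p", "4k", "uhd", "high-res"];

// Order matters: the more specific "4:3 ..." entries are checked first.
const ORIENTATIONS = [
  { keywords: ["4:3 portrait"], width: 832, height: 1088 },
  { keywords: ["4:3 landscape"], width: 1088, height: 832 },
  { keywords: ["vertical", "portrait", "story", "reel", "tiktok"], width: 1088, height: 1920 },
  { keywords: ["landscape", "horizontal", "widescreen", "youtube", "16:9"], width: 1920, height: 1088 },
  { keywords: ["square", "1:1"], width: 1088, height: 1088 },
];

// Match a keyword only when it is not embedded in a longer word or number,
// so "hd" does not fire on "birthday".
function hasKeyword(text, keyword) {
  const escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`(^|[^a-z0-9])${escaped}($|[^a-z0-9])`).test(text);
}

// Returns CLI flags for a high-res request, or null when the request does
// not ask for high resolution (default model selection then applies).
function routeVideoRequest(request, { hasRefImage = false } = {}) {
  const text = request.toLowerCase();
  if (!HIGH_RES_KEYWORDS.some((kw) => hasKeyword(text, kw))) return null;

  const match = ORIENTATIONS.find((o) => o.keywords.some((kw) => hasKeyword(text, kw)));
  const { width, height } = match ?? { width: 1920, height: 1088 };
  const model = hasRefImage ? "ltx2-19b-fp8_i2v_distilled" : "ltx23-22b-fp8_t2v_distilled";
  return ["--video", "-m", model, "-w", String(width), "-h", String(height)];
}
```

The returned flags would still need the rewritten prose prompt appended, since the routing rules require applying the LTX-2.3 Prompt Rule before invoking the command.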

package/desktop-extension/manifest.json CHANGED

@@ -1,7 +1,7 @@
 {
   "manifest_version": "0.3",
   "name": "sogni-gen",
-  "version": "1.5.11",
+  "version": "1.5.15",
   "display_name": "Sogni AI Image & Video Generation",
   "description": "Generate images and videos using Sogni AI's decentralized GPU network",
   "long_description": "Generate images, edit photos, create videos, and transfer faces using Sogni AI. Supports multiple models with different speed/quality tradeoffs. Uses Spark tokens — claim 50 free daily at https://app.sogni.ai/",

package/desktop-extension/server/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "sogni-gen-desktop-server",
-  "version": "1.5.13",
+  "version": "1.5.15",
   "private": true,
   "type": "module",
   "engines": {

package/llm.txt CHANGED

@@ -1,19 +1,12 @@
-# sogni-gen — AI Image & Video Generation
+# sogni-gen OpenClaw Plugin
-> OpenClaw plugin powered by Sogni AI's decentralized GPU network.
-> Repo: https://github.com/Sogni-AI/openclaw-sogni-gen
-## What It Does
-Generates AI images and videos from text prompts or reference media. Users ask you to "draw", "generate", "create an image/video", or "animate" something and you produce it.
-## Install
+Install this plugin in OpenClaw:
 ```bash
 openclaw plugins install sogni-gen
 ```
-Then create Sogni credentials:
+After install, configure Sogni credentials:
 ```bash
 mkdir -p ~/.config/sogni
@@ -26,180 +19,16 @@ EOF
 chmod 600 ~/.config/sogni/credentials
 ```
-Sign up at https://app.sogni.ai/ if you don't have an account. You get 50 free Spark tokens daily at https://app.sogni.ai/
-## How to Generate
-### Images
-```bash
-# Basic — returns a URL
-node {{skillDir}}/sogni-gen.mjs -q "a cat wearing a hat"
-# Save to file (then send via message tool with filePath)
-node {{skillDir}}/sogni-gen.mjs -q -o /tmp/generated.png "a cat wearing a hat"
-# Bigger image
-node {{skillDir}}/sogni-gen.mjs -q -o /tmp/out.png -w 1024 -h 1024 "a dragon eating tacos"
-# Higher quality (slower)
-node {{skillDir}}/sogni-gen.mjs -q -m flux2_dev_fp8 -o /tmp/out.png "portrait of a wizard"
-```
-### Image Editing (needs a reference image)
-```bash
-# Edit an existing image
-node {{skillDir}}/sogni-gen.mjs -q -c /path/to/photo.jpg -o /tmp/edited.png "make the background a beach"
-# Use last generated image as input
-node {{skillDir}}/sogni-gen.mjs -q --last-image -o /tmp/edited.png "make it pop art style"
-# Restore a damaged photo
-node {{skillDir}}/sogni-gen.mjs -q -c /path/to/old_photo.jpg -o /tmp/restored.png "restore this vintage photo, remove damage and scratches"
-```
-### Videos
-```bash
-# Text-to-video
-node {{skillDir}}/sogni-gen.mjs -q --video -o /tmp/video.mp4 "ocean waves at sunset"
-# Image-to-video (animate an image)
-node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -o /tmp/video.mp4 "camera slowly zooms in"
-# Looping video
-node {{skillDir}}/sogni-gen.mjs -q --video --looping --ref /path/to/image.png -o /tmp/loop.mp4 "gentle camera pan"
-# Longer video (10 seconds)
-node {{skillDir}}/sogni-gen.mjs -q --video --duration 10 --ref /path/to/image.png -o /tmp/video.mp4 "camera orbits around"
-# Sound-to-video (lip sync / talking head)
-node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/face.jpg --ref-audio /path/to/speech.m4a -o /tmp/talking.mp4 "talking head"
-# Image+audio-to-video (LTX)
-node {{skillDir}}/sogni-gen.mjs -q --video --workflow ia2v --ref /path/to/cover.jpg --ref-audio /path/to/song.mp3 -o /tmp/music-video.mp4 "music video"
-# Audio-to-video (LTX)
-node {{skillDir}}/sogni-gen.mjs -q --video --workflow a2v --ref-audio /path/to/song.mp3 -o /tmp/visualizer.mp4 "abstract visualizer"
-# LTX-2.3 text-to-video
-node {{skillDir}}/sogni-gen.mjs -q --video -m ltx23-22b-fp8_t2v_distilled --duration 20 -o /tmp/ltx23.mp4 "cinematic drone shot"
-# Motion transfer from another video
-node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/subject.jpg --ref-video /path/to/motion.mp4 --workflow animate-move -o /tmp/animated.mp4 "transfer motion"
-```
-### 360 Turntable
-```bash
-# Generate 8 angles of a subject
-node {{skillDir}}/sogni-gen.mjs -q --angles-360 -c /path/to/subject.jpg "studio portrait"
-# 360 video (looping mp4, requires ffmpeg)
-node {{skillDir}}/sogni-gen.mjs -q --angles-360 --angles-360-video /tmp/turntable.mp4 -c /path/to/subject.jpg "studio portrait"
-```
-### Check Balance
-```bash
-node {{skillDir}}/sogni-gen.mjs --json --balance
-```
-## Image Models
-| Model | Speed | Best For |
-|-------|-------|----------|
-| z_image_turbo_bf16 | ~5-10s | Default, general purpose |
-| flux1-schnell-fp8 | ~3-5s | Quick iterations |
-| flux2_dev_fp8 | ~2min | Highest quality |
-| chroma-v.46-flash_fp8 | ~30s | Balanced speed/quality |
-| qwen_image_edit_2511_fp8_lightning | ~8s | Fast image editing (auto-selected with -c) |
-| qwen_image_edit_2511_fp8 | ~30s | Higher quality editing |
-## Video Models (auto-selected by workflow)
-| Workflow | Model | Speed |
-|----------|-------|-------|
-| t2v (text-to-video) | wan_v2.2-14b-fp8_t2v_lightx2v | ~5min |
-| i2v (image-to-video) | wan_v2.2-14b-fp8_i2v_lightx2v | ~3-5min |
-| s2v (sound-to-video) | wan_v2.2-14b-fp8_s2v_lightx2v | ~5min |
-| ia2v (image+audio-to-video) | ltx2-19b-fp8_ia2v_distilled | ~2-3min |
-| a2v (audio-to-video) | ltx2-19b-fp8_a2v_distilled | ~2-3min |
-| v2v (video-to-video) | ltx2-19b-fp8_v2v_distilled | ~3min |
-| animate-move | wan_v2.2-14b-fp8_animate-move_lightx2v | ~5min |
-| animate-replace | wan_v2.2-14b-fp8_animate-replace_lightx2v | ~5min |
-## Key Flags
-| Flag | What It Does |
-|------|-------------|
-| -o /path | Save output to file |
-| -q | Quiet mode (suppress progress) |
-| -w, -h | Width/height in pixels (default 768x768) |
-| -m MODEL | Choose a specific model |
-| -c IMAGE | Context image for editing (repeatable, max 3) |
-| --video, -v | Generate video instead of image |
-| --ref IMAGE | Reference image for video |
-| --ref-audio FILE | Audio for lip sync (s2v) |
-| --ref-video FILE | Motion source for animate workflows |
-| --looping | Seamless A-B-A loop (i2v only) |
-| --duration SEC | Video length (default 5s) |
-| --fps NUM | Frames per second (default 16) |
-| --last-image | Reuse last generated image as input |
-| --json | Machine-readable JSON output |
-| --balance | Show Spark/Sogni token balances |
-| --extract-last-frame VIDEO IMAGE | Extract last frame from a video file |
-| --concat-videos OUTPUT CLIPS... | Concatenate multiple video clips |
-| --list-media [images\|audio\|all] | List recent inbound media files |
-## Agent Behavior Guidelines
-0. If the user includes the keyword "photobooth" (case-insensitive), always use `--photobooth` with `--ref` to the user face image. Do not fall back to `-c` edit flow for that request.
-1. When the user asks to "draw", "generate", "create", or "make" an image: generate an image and send it.
-2. When they ask to "animate", "make a video", or "create a video": use --video mode.
-3. When they send a photo and ask to edit/change/modify it: use -c with their image.
-4. When they send a photo and ask to animate it: use --video --ref with their image.
-5. When they send a photo + audio and ask for lip sync: use --video --ref IMAGE --ref-audio AUDIO.
-6. Always use -q (quiet) and -o (output to file) so you can send the result back.
-7. After generating, send the file to the user via the message tool with filePath.
-8. If you get "Insufficient funds", tell them: "Claim 50 free daily Spark points at https://app.sogni.ai/"
-9. For transition/animation videos, always use this plugin's built-in flags (not raw ffmpeg). Use `--looping`, `--extract-last-frame`, or `--concat-videos`.
-10. Default to 768x768 for images. WAN video sizes must be divisible by 16; LTX family sizes must be divisible by 64.
-## Finding User-Sent Media
-When users send images/audio via Telegram, WhatsApp, or iMessages, use the built-in `--list-media` flag:
-```bash
-# Recent inbound images (default)
-node {{skillDir}}/sogni-gen.mjs --json --list-media images
-# Recent inbound audio
-node {{skillDir}}/sogni-gen.mjs --json --list-media audio
-# All recent media
-node {{skillDir}}/sogni-gen.mjs --json --list-media all
-```
-Do NOT use shell commands (`ls`, `cp`, etc.) to browse user media directories.
-## Example Conversations
-User: "Draw a sunset over mountains"
-You: Generate image, send it.
+Sign up at https://app.sogni.ai/ if needed.
-User: *sends photo* "Make this look like a watercolor painting"
-You: Use -c with their photo, edit prompt, send result.
+This package ships its OpenClaw behavior through the plugin skill declared in `openclaw.plugin.json`, so the installed plugin uses `SKILL.md` as the main instruction source.
-User: *sends photo* "Animate this"
-You: Use --video --ref with their photo, send video.
+Repo:
-User: "Make a video of a cat playing piano"
-You: Use --video (t2v), send video.
+https://github.com/Sogni-AI/openclaw-sogni-gen
-User: *sends photo + audio* "Make this person say this"
-You: Use --video --ref photo --ref-audio audio (s2v), send video.
+Key files:
-User: "Show me a 360 view of this" *sends photo*
-You: Use --angles-360 --angles-360-video with their photo, send video.
+- `SKILL.md` — agent behavior and usage rules
+- `openclaw.plugin.json` — plugin manifest and config schema
+- `sogni-gen.mjs` — CLI used by the skill

package/openclaw.plugin.json CHANGED

@@ -2,7 +2,7 @@
   "id": "sogni-gen",
   "name": "Sogni Image & Video Generation",
   "description": "Generate images and videos using Sogni AI via the sogni-gen skill.",
-  "version": "1.5.13",
+  "version": "1.5.15",
   "skills": [
     "."
   ],
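
This release sets the same version string in five files (`SKILL.md` front matter, `desktop-extension/manifest.json`, `desktop-extension/server/package.json`, `openclaw.plugin.json`, and `package.json`), and the removed lines show that the desktop-extension manifest had been lagging at `1.5.11`. A small check along the following lines could flag that kind of drift before publishing. `findVersionDrift` and its input shape are assumptions for illustration; nothing in the diff indicates the package ships such a check.

```javascript
// Hypothetical release check, not part of sogni-gen.
// `versions` maps a file path to the version string read from that file.
function findVersionDrift(versions, expected) {
  return Object.entries(versions)
    .filter(([, version]) => version !== expected)
    .map(([file, version]) => `${file}: found ${version}, expected ${expected}`);
}

// Values as they stood before this release, taken from the removed lines.
const before = {
  "SKILL.md": "1.5.13",
  "desktop-extension/manifest.json": "1.5.11",
  "desktop-extension/server/package.json": "1.5.13",
  "openclaw.plugin.json": "1.5.13",
  "package.json": "1.5.13",
};

console.log(findVersionDrift(before, "1.5.13"));
// prints [ 'desktop-extension/manifest.json: found 1.5.11, expected 1.5.13' ]
```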

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "sogni-gen",
-  "version": "1.5.13",
+  "version": "1.5.15",
   "description": "Sogni AI image & video generation — OpenClaw plugin and MCP server for Claude Code / Claude Desktop",
   "type": "module",
   "main": "sogni-gen.mjs",

package/sogni-gen.mjs CHANGED

@@ -1228,7 +1228,7 @@ Examples:
   sogni-gen --video --ref cat.jpg --ref-audio speech.m4a -m wan_v2.2-14b-fp8_s2v_lightx2v "lip sync"
   sogni-gen --video --workflow ia2v --ref cover.jpg --ref-audio song.mp3 "music video"
   sogni-gen --video --workflow a2v --ref-audio song.mp3 "abstract music visualizer"
-  sogni-gen --video -m ltx23-22b-fp8_t2v_distilled --duration 20 "cinematic drone shot"
+  sogni-gen --video -m ltx23-22b-fp8_t2v_distilled --duration 20 "A wide cinematic aerial shot opens over steep tropical cliffs at golden hour, warm sunlight grazing the rock faces while sea mist drifts above the water below. Palm trees bend gently along the ridge as waves roll against the shoreline, leaving bright bands of foam across the dark stone. The camera glides forward in one continuous pass, revealing more of the coastline as sunlight flickers across wet surfaces and distant birds wheel through the haze. The scene holds a calm, upscale travel-film mood with smooth stabilized motion and crisp environmental detail."
   sogni-gen --video --ref subject.jpg --ref-video motion.mp4 --workflow animate-move "transfer motion"
   sogni-gen --video --last-image "gentle camera pan"
   sogni-gen -c photo.jpg "make the background a beach" -m qwen_image_edit_2511_fp8
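
The replacement help-text example in this hunk follows the LTX-2.3 Prompt Rule that the same release adds to SKILL.md. As a rough illustration, the mechanical parts of that rule (one paragraph, 4-8 sentences, no filler words, no bracketed markup) could be checked before calling the CLI with a sketch like the one below. `lintLtxPrompt` is a hypothetical helper, its sentence splitter is deliberately naive, and the filler list holds only the two words the rule names. It cannot judge the stylistic parts of the rule, such as keeping one continuous shot.

```javascript
// Hypothetical pre-flight check, not part of sogni-gen.
const FILLER_WORDS = ["beautiful", "nice"]; // the two examples named in SKILL.md

// Returns a list of human-readable problems; an empty list means the prompt
// passed these mechanical checks.
function lintLtxPrompt(prompt) {
  const problems = [];
  const text = prompt.trim();

  if (/[\r\n]/.test(text)) problems.push("contains line breaks");

  // Naive sentence count: split after ., !, or ? when followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/).filter(Boolean);
  if (sentences.length < 4 || sentences.length > 8) {
    problems.push(`has ${sentences.length} sentence(s); the rule asks for 4-8`);
  }

  for (const word of FILLER_WORDS) {
    if (new RegExp(`\\b${word}\\b`, "i").test(text)) {
      problems.push(`uses filler word "${word}"`);
    }
  }

  // Structural markup such as [DIALOGUE] is disallowed by the rule.
  if (/\[[A-Z][A-Z _-]*\]/.test(text)) problems.push("contains bracketed markup");

  return problems;
}
```

Under these assumptions, the old help-text prompt `"cinematic drone shot"` would be flagged for its sentence count, which matches the reason the diff replaces it.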