@sogni-ai/sogni-creative-agent-skill 3.1.0-alpha.1 → 3.1.1
- package/SKILL.md +102 -100
- package/openclaw.plugin.json +1 -1
- package/package.json +1 -1
- package/version.mjs +1 -1
package/SKILL.md
CHANGED
@@ -2,7 +2,7 @@
 name: sogni-creative-agent-skill
 description: "Sogni Creative Agent Skill: agent skill and CLI for image, video, and music generation using Sogni AI's decentralized GPU network. Supports personas (named people with saved reference photos and voice clips), persistent memories (user preferences across sessions), custom personality, style transfer, angle synthesis, and multi-step creative workflows. Ask the agent to \"draw\", \"generate\", \"create an image\", \"make a video/animate\", \"make music\", \"apply a style\", or \"generate me as a superhero\"."
 metadata:
-version: "3.1.
+version: "3.1.1"
 homepage: https://sogni.ai
 clawdbot:
 emoji: "🎨"
@@ -51,6 +51,8 @@ sogni-agent --version
 
 Then configure the agent/runtime to use this `SKILL.md` and invoke the `sogni-agent` CLI.
 
+Always invoke the globally installed `sogni-agent` command. Do not call `node {{skillDir}}/sogni-agent.mjs` or `node sogni-agent.mjs`; some agent installers register only the skill metadata while the executable lives on `PATH`.
+
 For upgrades, prefer package-manager updates or direct operations on an existing checkout. Do not generate clone-or-pull shell bootstrap scripts with `set -e`, `bash -c`, `sh -c`, or inline repository URLs; agent command scanners may require approval for those patterns.
 
 Agent-safe CLI upgrade:
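The rule added in the hunk above (call the `sogni-agent` found on `PATH`, never a script path) can be checked before any invocation. The following is a minimal sketch, not part of the package; `require_cli` is a hypothetical helper name, and it relies only on the POSIX `command -v` builtin.

```bash
# Hypothetical helper (not shipped with the skill): succeed only when the
# named command resolves on PATH.
require_cli() {
  local name=$1
  if ! command -v "$name" >/dev/null 2>&1; then
    printf 'error: %s is not on PATH; install it globally first\n' "$name" >&2
    return 1
  fi
}

# Intended use, kept as a comment so the sketch runs without the CLI installed:
#   require_cli sogni-agent && sogni-agent --version
```

A guard like this fails early with a readable message instead of falling back to a `node ...mjs` path, which the new text tells agents to avoid.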
@@ -130,15 +132,15 @@ For any natural-language creative request — anything that should be planned, m
 
 ```bash
 # Natural-language creative request (LLM picks the tool, dispatches, repairs)
-
+sogni-agent --api-chat "Turn the attached product photo into a launch poster" --ref product.jpg
 
 # Multi-step durable workflow (resumable, replay-friendly, server-orchestrated)
-
+sogni-agent --api-workflow \
 --video-prompt "The camera slowly pushes in" \
 "A graphite robot sketch on a drafting table"
 
 # Storyboard → keyframe → Seedance, all server-side
-
+sogni-agent --api-workflow storyboard-video --storyboard-frames 6 -Q hq \
 "Create a 9:16 bakery launch video with a neon street-window reveal"
 ```
 
@@ -148,74 +150,74 @@ The direct-to-SDK flags below remain available for explicit one-shot generation
 
 ```bash
 # Generate and get URL
-
+sogni-agent "a cat wearing a hat"
 
 # Quality presets (recommended for direct mode — auto-selects model, steps, and size)
-
-
-
+sogni-agent -Q fast "a cat wearing a hat" # z_image_turbo, 8 steps, 512x512 (~5-10s)
+sogni-agent -Q hq "a cat wearing a hat" # z_image_turbo, default steps, 768x768 (~10-15s)
+sogni-agent -Q pro "a cat wearing a hat" # flux2_dev, 40 steps, 1024x1024 (~2min)
 
 # Dynamic prompt variations — diverse images in one call
-
+sogni-agent -n 3 "a {red|blue|green} sports car"
 # → generates "a red sports car", "a blue sports car", "a green sports car"
 
 # Token auto-fallback (tries SPARK, falls back to SOGNI)
-
+sogni-agent --token-type auto "a cat wearing a hat"
 
 # Save to file
-
+sogni-agent -o /tmp/cat.png "a cat wearing a hat"
 
 # JSON output (for scripting)
-
+sogni-agent --json "a cat wearing a hat"
 
 # Check token balances (no prompt required)
-
+sogni-agent --balance
 
 # Check token balances in JSON
-
+sogni-agent --json --balance
 
 # Quiet mode (suppress progress)
-
+sogni-agent -q -o /tmp/cat.png "a cat wearing a hat"
 
 # Direct music/audio generation
-
+sogni-agent --music --duration 30 \
 "uplifting cinematic synthwave theme for a product launch"
 
 # Song with lyrics and musical controls
-
+sogni-agent --music --lyrics "Rise with the morning light" --bpm 128 \
 --keyscale "C major" --output-format mp3 "bright indie pop chorus"
 
 # Hosted API chat: natural-language creative-agent tool execution
-
+sogni-agent --api-chat "Create a 4-shot product video concept for a red sneaker"
 
 # Hosted API chat with image vision and media-reference metadata
-
+sogni-agent --api-chat --ref product.jpg \
 "Turn this into a launch poster and describe the edit plan"
 
 # Sogni Intelligence model/replay utilities
-
-
+sogni-agent --list-api-models
+sogni-agent --api-chat --task-profile reasoning --no-thinking \
 "Plan a concise multi-step product launch workflow"
-
-
+sogni-agent --list-replays 20
+sogni-agent --get-replay run_abc123 --json
 
 # Durable API workflow: generated keyframe to video with resumable workflow record
-
+sogni-agent --api-workflow \
 --video-prompt "The camera slowly pushes in as the sketch comes alive" \
 "A graphite robot sketch on a drafting table"
 
 # Durable API workflow with media reference and cost controls
-
+sogni-agent --api-workflow \
 --ref https://cdn.example.com/sketch.png \
 --workflow-max-cost 25 --confirm-cost \
 --video-prompt "The camera slowly pushes in as the sketch comes alive" \
 "Animate the referenced sketch"
 
 # Exact durable workflow input with explicit steps
-
+sogni-agent --api-workflow --workflow-input @workflow.json
 
 # Durable storyboard-video workflow: storyline -> GPT Image 2 storyboard -> Seedance
-
+sogni-agent --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
 "Create a 9:16 bakery launch video with a neon street-window reveal"
 ```
 
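The quality-preset comments in the hunk above give rough timings (`fast` about 5-10s, `hq` about 10-15s, `pro` about 2 minutes). As an illustration only, a caller with a time budget could choose a preset like this. `pick_preset` and its thresholds are assumptions derived from those approximate figures, not behavior of the CLI.

```bash
# Hypothetical helper: map a time budget in seconds to a -Q preset name,
# using the upper timing estimates quoted in the SKILL.md comments.
pick_preset() {
  local budget=$1
  if (( budget >= 120 )); then
    echo pro    # flux2_dev, 40 steps, 1024x1024 (~2min)
  elif (( budget >= 15 )); then
    echo hq     # z_image_turbo, default steps, 768x768 (~10-15s)
  else
    echo fast   # z_image_turbo, 8 steps, 512x512 (~5-10s)
  fi
}

# Intended use (commented out so the sketch runs without the CLI installed):
#   sogni-agent -Q "$(pick_preset 30)" "a cat wearing a hat"
```

The thresholds are deliberately conservative; actual generation time depends on network load, so they should be treated as a starting point.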
@@ -537,16 +539,16 @@ Edit images using reference images. Qwen models support up to 3 context images;
 
 ```bash
 # Single context image
-
+sogni-agent -c photo.jpg "make the background a beach"
 
 # Multiple context images (subject + style)
-
+sogni-agent -c subject.jpg -c style.jpg "apply the style to the subject"
 
 # GPT Image 2 multi-reference edit
-
+sogni-agent -m gpt-image-2 -c subject.jpg -c outfit.jpg -c room.jpg "place the subject in the room wearing the outfit"
 
 # Use last generated image as context
-
+sogni-agent --last-image "make it more vibrant"
 ```
 
 When context images are provided without `-m`, defaults to `qwen_image_edit_2511_fp8_lightning`. Select `-m gpt-image-2` for GPT Image 2's higher reference-image limit and OpenAI-backed image editing.
@@ -557,13 +559,13 @@ Generate stylized portraits from a face photo using InstantID ControlNet. When a
 
 ```bash
 # Basic photobooth
-
+sogni-agent --photobooth --ref face.jpg "80s fashion portrait"
 
 # Multiple outputs
-
+sogni-agent --photobooth --ref face.jpg -n 4 "LinkedIn professional headshot"
 
 # Custom ControlNet tuning
-
+sogni-agent --photobooth --ref face.jpg --cn-strength 0.6 --cn-guidance-end 0.5 "oil painting"
 ```
 
 Uses SDXL Turbo (`coreml-sogniXLturbo_alpha1_ad`) at 1024x1024 by default. The face image is passed via `--ref` and styled according to the prompt. Cannot be combined with `--video` or `-c/--context`.
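The last context line of the hunk above states that `--photobooth` cannot be combined with `--video` or `-c/--context`. A wrapper could reject such combinations before calling the CLI. The sketch below is an assumption-level illustration; `check_photobooth_args` is a hypothetical name, and the CLI's own validation may differ.

```bash
# Hypothetical pre-flight check: fail when --photobooth appears together with
# --video, -c, or --context in the same argument list.
check_photobooth_args() {
  local has_photobooth=0 conflict="" arg
  for arg in "$@"; do
    case $arg in
      --photobooth) has_photobooth=1 ;;
      --video|-c|--context) conflict=$arg ;;
    esac
  done
  if (( has_photobooth )) && [[ -n $conflict ]]; then
    printf 'error: --photobooth cannot be combined with %s\n' "$conflict" >&2
    return 1
  fi
  return 0
}
```

The check compares raw arguments, so a prompt that is literally `-c` would be misread; a real wrapper would parse options properly.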
@@ -571,10 +573,10 @@ Uses SDXL Turbo (`coreml-sogniXLturbo_alpha1_ad`) at 1024x1024 by default. The f
 **Agent usage:**
 ```bash
 # Photobooth: stylize a face photo
-
+sogni-agent -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
 
 # Multiple photobooth outputs
-
+sogni-agent -q --photobooth --ref /path/to/face.jpg -n 4 -o /tmp/stylized.png "LinkedIn professional headshot"
 ```
 
 ## Multiple Angles (Turnaround)
@@ -583,17 +585,17 @@ Generate specific camera angles from a single reference image using the Multiple
 
 ```bash
 # Single angle
-
+sogni-agent --multi-angle -c subject.jpg \
 --azimuth front-right --elevation eye-level --distance medium \
 --angle-strength 0.9 \
 "studio portrait, same person"
 
 # 360 sweep (8 azimuths)
-
+sogni-agent --angles-360 -c subject.jpg --distance medium --elevation eye-level \
 "studio portrait, same person"
 
 # 360 sweep video (looping mp4, uses i2v between angles; requires ffmpeg)
-
+sogni-agent --angles-360 --angles-360-video /tmp/turntable.mp4 \
 -c subject.jpg --distance medium --elevation eye-level \
 "studio portrait, same person"
 ```
@@ -623,7 +625,7 @@ When a user requests a "360 video", follow this workflow:
 
 4. **Example command**:
 ```bash
-
+sogni-agent --angles-360 --angles-360-video /tmp/output.mp4 \
 -c /path/to/image.png --elevation eye-level --distance medium \
 "description of subject"
 ```
@@ -646,65 +648,65 @@ Generate videos from a reference image:
 
 ```bash
 # Text-to-video (t2v)
-
+sogni-agent --video "A narrator says \"welcome to the story\" as ocean waves crash"
 
 # Basic video from image
-
+sogni-agent --video --ref cat.jpg -o cat.mp4 "cat walks around"
 
 # Use last generated image as reference
-
+sogni-agent --last-image --video "gentle camera pan"
 
 # Custom duration and FPS
-
+sogni-agent --video --ref scene.png --duration 10 --fps 24 "zoom out slowly"
 
 # Bare "720p" / "HD" without exact pixels: preserve aspect via short-side target
-
+sogni-agent --video --target-resolution 768 \
 "A calm cinematic shot of lanterns drifting across a night lake"
 
 # Natural-language aspect and resolution inference
-
+sogni-agent --video \
 "Make a 720p 9:16 video of ocean waves at sunset"
 
 # Seedance 2.0 text-to-video
-
+sogni-agent --video -m seedance2 --duration 8 \
 "A polished product reveal with native ambient sound"
 
 # Seedance multimodal context with public HTTPS references
-
+sogni-agent --video -m seedance2 --workflow t2v \
 --ref https://cdn.example.com/product.png \
 --ref-video https://cdn.example.com/motion.mp4 \
 --ref-audio https://cdn.example.com/music.m4a \
 "Use @Image1 for product identity, @Video1 for camera movement, and @Audio1 for music rhythm"
 
 # Sound-to-video (s2v)
-
+sogni-agent --video --ref face.jpg --ref-audio speech.m4a \
 -m wan_v2.2-14b-fp8_s2v_lightx2v "lip sync talking head"
 
 # Image+audio-to-video (auto-routes to LTX 2.3 ia2v)
-
+sogni-agent --video --ref cover.jpg --ref-audio song.mp3 \
 "music video with synchronized motion"
 
 # Audio-to-video (auto-routes to LTX 2.3 a2v)
-
+sogni-agent --video --ref-audio song.mp3 \
 "abstract audio-reactive visualizer"
 
 # Persona/voice identity with LTX native audio
-
+sogni-agent --video --reference-audio-identity voice.webm \
 "NARRATOR: \"This is my voice.\""
 
 # Prefer .webm, .m4a, or .mp3 voice clips. Local .wav clips are normalized
 # to .m4a before upload when ffmpeg is available.
 
 # LTX-2.3 text-to-video
-
+sogni-agent --video -m ltx23-22b-fp8_t2v_distilled --duration 20 \
 "A wide cinematic aerial shot opens over steep tropical cliffs at golden hour, warm sunlight grazing the rock faces while sea mist drifts above the water below. Palm trees bend gently along the ridge as waves roll against the shoreline, leaving bright bands of foam across the dark stone. The camera glides forward in one continuous pass, revealing more of the coastline as sunlight flickers across wet surfaces and distant birds wheel through the haze. The scene holds a calm, upscale travel-film mood with smooth stabilized motion and crisp environmental detail."
 
 # Animate (motion transfer)
-
+sogni-agent --video --ref subject.jpg --ref-video motion.mp4 \
 --workflow animate-move "transfer motion"
 
 # Segment a longer reference video for local stitched workflows
-
+sogni-agent --video --workflow v2v --ref-video dance.mp4 \
 --video-start 10 --duration 8 --controlnet-name pose \
 "robot dancing"
 ```
@@ -715,15 +717,15 @@ Transform an existing video using LTX-2 models with ControlNet guidance:
 
 ```bash
 # Basic v2v with canny edge detection
-
+sogni-agent --video --workflow v2v --ref-video input.mp4 \
 --controlnet-name canny "stylized anime version"
 
 # V2V with pose detection and custom strength
-
+sogni-agent --video --workflow v2v --ref-video dance.mp4 \
 --controlnet-name pose --controlnet-strength 0.7 "robot dancing"
 
 # V2V with depth map
-
+sogni-agent --video --workflow v2v --ref-video scene.mp4 \
 --controlnet-name depth "watercolor painting style"
 ```
 
@@ -732,7 +734,7 @@ Default V2V strengths are tuned from Sogni Chat: `canny`/`pose`/`depth` use `0.8
 
 ```bash
 # Seedance V2V without ControlNet
-
+sogni-agent --video --workflow v2v -m seedance2-v2v \
 --ref-video input.mp4 "make the clip more cinematic"
 ```
 
@@ -758,7 +760,7 @@ sogni-agent -c old_photo.jpg -o restored.png -w 1024 -h 1280 \
 
 **Finding received images (Telegram/etc):**
 ```bash
-
+sogni-agent --json --list-media images
 ```
 
 **Do NOT use `ls`, `cp`, or other shell commands to browse user files.** Always use `--list-media` to find inbound media.
@@ -826,41 +828,41 @@ When user asks to generate/draw/create an image:
 
 ```bash
 # Generate and save locally (use -Q for quality presets instead of memorizing model IDs)
-
-
+sogni-agent -q -Q fast -o /tmp/generated.png "user's prompt"
+sogni-agent -q -Q pro -o /tmp/generated.png "user's prompt"
 
 # Generate with prompt variations (diverse images in one call)
-
+sogni-agent -q -n 3 -o /tmp/cars.png "a {red|blue|green} sports car"
 
 # Edit an existing image
-
+sogni-agent -q -c /path/to/input.jpg -o /tmp/edited.png "make it pop art style"
 
 # Generate video from image
-
+sogni-agent -q --video --ref /path/to/image.png -o /tmp/video.mp4 "A medium shot holds on the subject in soft late-afternoon light as fabric edges and background details remain clear and stable. The camera performs a slow push-in while the subject shifts weight subtly and turns slightly toward the lens, keeping the motion gentle and continuous. Leaves rustle softly in the background and the scene maintains smooth cinematic movement with no abrupt action changes."
 
 # Generate text-to-video
-
+sogni-agent -q --video -o /tmp/video.mp4 "A wide cinematic shot opens on ocean waves rolling toward a rocky shoreline at sunset, golden light spreading across the water while sea mist drifts through the air. Foam patterns form and recede over the dark sand as the horizon glows orange and pink in the distance. The camera glides forward in one continuous movement, holding smooth stabilized motion and calm environmental detail throughout the scene."
 
 # Generate direct music/audio
-
+sogni-agent -q --music --duration 30 -o /tmp/music.mp3 "uplifting cinematic synthwave theme for a product launch"
 
 # HD / "4K" text-to-video: prefer LTX-2.3
-
+sogni-agent -q --video -m ltx23-22b-fp8_t2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "A wide cinematic aerial shot opens over a rugged ocean coastline at golden hour, warm sunlight catching the cliff faces while white surf breaks against dark rock below. Low sea mist hangs over the water and bands of foam trace the shoreline as gulls wheel through the distance. The camera glides forward in one continuous pass, revealing the curve of the coast while wet stone flashes with reflected light and the scene keeps smooth stabilized motion from start to finish. The overall mood feels expansive and polished, with crisp environmental detail and steady travel-film energy."
 
 # HD / "4K" image-to-video: prefer LTX i2v
-
+sogni-agent -q --video --ref /path/to/image.png -m ltx23-22b-fp8_i2v_distilled -w 1920 -h 1088 -o /tmp/video.mp4 "A medium cinematic shot holds on the scene with clean subject separation and stable environmental detail as directional light shapes the surfaces and background depth. The camera performs a slow push-in while the main subject makes one subtle continuous movement, keeping posture and identity consistent from start to finish. Ambient motion in the background stays gentle and the overall clip remains smooth, stabilized, and visually coherent."
 
 # Photobooth: stylize a face photo
-
+sogni-agent -q --photobooth --ref /path/to/face.jpg -o /tmp/stylized.png "80s fashion portrait"
 
 # Token auto-fallback (tries SPARK first, retries with SOGNI on insufficient balance)
-
+sogni-agent -q --token-type auto -o /tmp/generated.png "user's prompt"
 
 # Check current SPARK/SOGNI balances (no prompt required)
-
+sogni-agent --json --balance
 
 # Find user-sent images/audio
-
+sogni-agent --json --list-media images
 
 # Then send via message tool with filePath
 ```
@@ -883,10 +885,10 @@ When the user wants multiple variations (different colors, styles, subjects), us
 
 ```bash
 # 3 color variations
-
+sogni-agent -q -n 3 "a {red|blue|green} sports car"
 
 # 4 style variations
-
+sogni-agent -q -n 4 "a portrait in {oil painting|watercolor|pencil sketch|pop art} style"
 ```
 
 Options cycle sequentially per image. Without `{...}` syntax, `-n` generates multiple images with the same prompt.
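The sentence "Options cycle sequentially per image" can be made concrete with a small model of the expansion. This is an illustrative sketch only: `expand_variation` is a hypothetical function, it handles a single `{...}` group, and it is not claimed to match how the CLI implements the feature.

```bash
# Hypothetical model of the {a|b|c} syntax: image number `index` (0-based)
# receives option number index modulo the option count.
expand_variation() {
  local prompt=$1 index=$2
  local re='^([^{]*)[{]([^}]*)[}](.*)$'
  if [[ $prompt =~ $re ]]; then
    local prefix=${BASH_REMATCH[1]}
    local group=${BASH_REMATCH[2]}
    local suffix=${BASH_REMATCH[3]}
    local -a options
    IFS='|' read -r -a options <<< "$group"
    local count=${#options[@]}
    if (( count > 0 )); then
      printf '%s%s%s\n' "$prefix" "${options[index % count]}" "$suffix"
      return 0
    fi
  fi
  # No {...} group: every image uses the prompt unchanged.
  printf '%s\n' "$prompt"
}

expand_variation "a {red|blue|green} sports car" 0   # a red sports car
expand_variation "a {red|blue|green} sports car" 1   # a blue sports car
expand_variation "a {red|blue|green} sports car" 3   # a red sports car (wraps)
```

Under this model, `-n` larger than the number of options simply wraps around to the first option again.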
@@ -916,7 +918,7 @@ When a user asks to **animate between two images**, use `--ref` (first frame) an
 
 ```bash
 # Animate from image A to image B
-
+sogni-agent -q --video --ref /tmp/imageA.png --ref-end /tmp/imageB.png -o /tmp/transition.mp4 "descriptive prompt of the transition"
 ```
 
 ### Animate a Video to an Image (Scene Continuation)
@@ -925,15 +927,15 @@ When a user asks to **animate from a video to an image** (or "continue" a video
 
 1. **Extract the last frame** of the existing video using the built-in safe wrapper:
 ```bash
-
+sogni-agent --extract-last-frame /tmp/existing.mp4 /tmp/lastframe.png
 ```
 2. **Generate a new video** using the last frame as `--ref` and the target image as `--ref-end`:
 ```bash
-
+sogni-agent -q --video --ref /tmp/lastframe.png --ref-end /tmp/target.png -o /tmp/continuation.mp4 "scene transition prompt"
 ```
 3. **Concatenate the videos** using the built-in safe wrapper:
 ```bash
-
+sogni-agent --concat-videos /tmp/full_sequence.mp4 /tmp/existing.mp4 /tmp/continuation.mp4
 ```
 
 This ensures visual continuity — the new clip picks up exactly where the previous one ended.
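The three steps in the hunk above always share the same intermediate files, so they can be planned together. The sketch below prints the commands rather than running them. `plan_continuation` and its optional working-directory argument are hypothetical; the flags and file names are the ones shown in the hunk.

```bash
# Hypothetical dry-run planner: print the extract, generate, and concat
# commands for a scene continuation. Arguments are shell-quoted with %q.
plan_continuation() {
  local existing=$1 target=$2 prompt=$3 workdir=${4:-/tmp}
  local last_frame="$workdir/lastframe.png"
  local continuation="$workdir/continuation.mp4"
  local full="$workdir/full_sequence.mp4"
  printf 'sogni-agent --extract-last-frame %q %q\n' "$existing" "$last_frame"
  printf 'sogni-agent -q --video --ref %q --ref-end %q -o %q %q\n' \
    "$last_frame" "$target" "$continuation" "$prompt"
  printf 'sogni-agent --concat-videos %q %q %q\n' "$full" "$existing" "$continuation"
}

plan_continuation /tmp/existing.mp4 /tmp/target.png "scene transition prompt"
```

Printing the plan first lets an agent show the user exactly which commands will run before any credits are spent.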
@@ -998,22 +1000,22 @@ Personas are named people with saved reference photos and optional voice clips.
 
 ```bash
 # Add a persona with a reference photo
-
+sogni-agent --persona-add "Mark" --ref face.jpg --relationship self --description "30s male, brown hair, brown eyes"
 
 # Add with voice clip for video voice cloning
-
+sogni-agent --persona-add "Sarah" --ref sarah.jpg --relationship partner --voice-clip sarah-voice.webm --voice "warm alto with British accent"
 
 # List all personas
-
+sogni-agent --persona-list --json
 
 # Resolve a persona by name, tag, or pronoun
-
+sogni-agent --persona-resolve "me" --json
 
 # Generate using a persona (auto-injects photo as context)
-
+sogni-agent --persona "Mark" -o /tmp/hero.png "superhero in dramatic lighting"
 
 # Remove a persona
-
+sogni-agent --persona-remove "Mark"
 ```
 
 ### Persona Pipeline Rules
@@ -1038,18 +1040,18 @@ Memories are persistent key-value preferences stored locally at `~/.config/sogni
 
 ```bash
 # Save a preference
-
-
-
+sogni-agent --memory-set preferred_style "watercolor and soft lighting"
+sogni-agent --memory-set aspect_ratio "16:9"
+sogni-agent --memory-set favorite_artist "Studio Ghibli"
 
 # Read all memories
-
+sogni-agent --memory-list --json
 
 # Get one memory
-
+sogni-agent --memory-get preferred_style --json
 
 # Delete a memory
-
+sogni-agent --memory-remove preferred_style
 ```
 
 **Agent behavior:** Before generating, check memories with `--memory-list` and respect saved preferences. If the user says "I always want watercolor style", save it with `--memory-set`. Categories: `preference` (default), `fact`, `context`.
@@ -1060,13 +1062,13 @@ Users can set custom instructions that shape agent behavior, stored at `~/.confi
 
 ```bash
 # Set personality
-
+sogni-agent --personality-set "Be concise, always use cinematic lighting, suggest bold creative ideas"
 
 # Read current personality
-
+sogni-agent --personality-get --json
 
 # Clear (reset to default)
-
+sogni-agent --personality-clear
 ```
 
 **Agent behavior:** Check personality on startup and adopt those instructions. Personality overrides default style but not hard constraints (safety, tool usage rules).
@@ -1077,13 +1079,13 @@ Apply artistic styles to existing images:
 
 ```bash
 # Apply a named artist style
-
+sogni-agent -c photo.jpg -o /tmp/styled.png "Apply style: Andy Warhol pop art with bold primary colors"
 
 # Studio Ghibli transformation
-
+sogni-agent -c photo.jpg -o /tmp/ghibli.png "Apply style: Studio Ghibli watercolor with soft pastel sky and lush greenery"
 
 # For photos with people, always preserve identity
-
+sogni-agent -c portrait.jpg -o /tmp/styled.png "Apply style: oil painting in the style of Vermeer. Preserve all facial features, expressions, and identity."
 ```
 
 **Tips:** Reference artists and styles BY NAME for best results. Use positive phrasing. For photos with people, always append identity preservation instructions.
@@ -1094,13 +1096,13 @@ Generate a photo from a different camera angle:
 
 ```bash
 # 3/4 view
-
+sogni-agent --multi-angle -c subject.jpg --azimuth front-right "same subject"
 
 # Side view
-
+sogni-agent --multi-angle -c subject.jpg --azimuth left --elevation eye-level --distance medium "same subject"
 
 # Full 360 turntable
-
+sogni-agent --angles-360 -c subject.jpg "same subject"
 ```
 
 **User term mapping:**
package/openclaw.plugin.json
CHANGED
@@ -2,7 +2,7 @@
 "id": "sogni-creative-agent-skill",
 "name": "Sogni Creative Agent Skill — Image, Video & Music Generation",
 "description": "Agent skill and CLI for Sogni AI image, video, and music generation.",
-"version": "3.1.
+"version": "3.1.1",
 "skills": [
 "."
 ],
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@sogni-ai/sogni-creative-agent-skill",
-"version": "3.1.
+"version": "3.1.1",
 "description": "Sogni Creative Agent Skill: agent skill and CLI for Sogni AI image, video, and music generation.",
 "type": "module",
 "main": "sogni-agent.mjs",
package/version.mjs
CHANGED
@@ -1 +1 @@
-export const PACKAGE_VERSION = '3.1.
+export const PACKAGE_VERSION = '3.1.1';
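This release changes the version string in four places (`SKILL.md`, `openclaw.plugin.json`, `package.json`, `version.mjs`). A release check could confirm they agree. The sketch below is a hedged illustration with hypothetical helper names; it works on individual lines of text, such as the added lines in the hunks above, and leaves reading the files to the caller.

```bash
# Hypothetical helper: print the first semver-like token found in a line.
extract_version() {
  local re='[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.]+)?'
  [[ $1 =~ $re ]] || return 1
  printf '%s\n' "${BASH_REMATCH[0]}"
}

# Hypothetical helper: succeed only if every line carries the same version.
versions_agree() {
  local first="" line version
  for line in "$@"; do
    version=$(extract_version "$line") || return 1
    if [[ -z $first ]]; then
      first=$version
    elif [[ $version != "$first" ]]; then
      return 1
    fi
  done
  return 0
}

versions_agree 'version: "3.1.1"' '"version": "3.1.1",' \
  "export const PACKAGE_VERSION = '3.1.1';" && echo "versions match"
```

A line with no complete `major.minor.patch` token makes `extract_version` fail, so the comparison reports a mismatch rather than guessing.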