sogni-gen 1.2.1 → 1.2.2
- package/SKILL.md +392 -0
- package/package.json +1 -1
package/SKILL.md
ADDED
@@ -0,0 +1,392 @@
---
name: sogni-gen
description: Generate images **and videos** using Sogni AI's decentralized network. Ask the agent to "draw", "generate", "create an image", or "make a video/animate" from a prompt or reference image.
homepage: https://sogni.ai
metadata:
  clawdbot:
    emoji: "🎨"
    os: ["darwin", "linux", "win32"]
    requires:
      bins: ["node"]
    install:
      - id: npm
        kind: exec
        command: "cd {{skillDir}} && npm i"
        label: "Install dependencies"
---

# Sogni Image & Video Generation

Generate **images and videos** using Sogni AI's decentralized GPU network.

## Setup

1. **Get Sogni credentials** at https://sogni.ai
2. **Create credentials file:**
```bash
mkdir -p ~/.config/sogni
cat > ~/.config/sogni/credentials << 'EOF'
SOGNI_USERNAME=your_username
SOGNI_PASSWORD=your_password
EOF
chmod 600 ~/.config/sogni/credentials
```

3. **Install dependencies (if cloned):**
```bash
cd /path/to/sogni-gen
npm i
```

4. **Or install from npm (no git clone):**
```bash
mkdir -p ~/.clawdbot/skills
cd ~/.clawdbot/skills
npm i sogni-gen
ln -sfn node_modules/sogni-gen sogni-gen
```

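The credentials file created in step 2 is a plain list of `KEY=VALUE` lines. How `sogni-gen.mjs` actually reads it is not shown here, so the following is only a minimal sketch of parsing that format; `parseCredentials` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper (not from sogni-gen): parse KEY=VALUE lines
// from the text of ~/.config/sogni/credentials.
function parseCredentials(text) {
  const creds = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq === -1) continue; // ignore lines without '='
    // Split on the first '=' only, so values may themselves contain '='.
    creds[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return creds;
}

const sampleCredentials = "SOGNI_USERNAME=your_username\nSOGNI_PASSWORD=your_password\n";
console.log(Object.keys(parseCredentials(sampleCredentials)));
```

Splitting on the first `=` only is a design choice that keeps passwords containing `=` intact.
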
## Usage (Images & Video)

```bash
# Generate and get URL
node sogni-gen.mjs "a cat wearing a hat"

# Save to file
node sogni-gen.mjs -o /tmp/cat.png "a cat wearing a hat"

# JSON output (for scripting)
node sogni-gen.mjs --json "a cat wearing a hat"

# Check token balances (no prompt required)
node sogni-gen.mjs --balance

# Check token balances in JSON
node sogni-gen.mjs --json --balance

# Quiet mode (suppress progress)
node sogni-gen.mjs -q -o /tmp/cat.png "a cat wearing a hat"
```

## Options

| Flag | Description | Default |
|------|-------------|---------|
| `-o, --output <path>` | Save to file | prints URL |
| `-m, --model <id>` | Model ID | z_image_turbo_bf16 |
| `-w, --width <px>` | Width | 512 |
| `-h, --height <px>` | Height | 512 |
| `-n, --count <num>` | Number of images | 1 |
| `-t, --timeout <sec>` | Timeout seconds | 30 (300 for video) |
| `-s, --seed <num>` | Specific seed | random |
| `--last-seed` | Reuse seed from last render | - |
| `--seed-strategy <s>` | Seed strategy: random\|prompt-hash | prompt-hash |
| `--multi-angle` | Multiple angles LoRA mode (Qwen Image Edit) | - |
| `--angles-360` | Generate 8 azimuths (front -> front-left) | - |
| `--angles-360-video` | Assemble looping 360 mp4 using i2v between angles (requires ffmpeg) | - |
| `--azimuth <key>` | front\|front-right\|right\|back-right\|back\|back-left\|left\|front-left | front |
| `--elevation <key>` | low-angle\|eye-level\|elevated\|high-angle | eye-level |
| `--distance <key>` | close-up\|medium\|wide | medium |
| `--angle-strength <n>` | LoRA strength for multiple_angles | 0.9 |
| `--angle-description <text>` | Optional subject description | - |
| `--steps <num>` | Override steps (model-dependent) | - |
| `--guidance <num>` | Override guidance (model-dependent) | - |
| `--output-format <f>` | Image output format: png\|jpg | png |
| `--sampler <name>` | Sampler (model-dependent) | - |
| `--scheduler <name>` | Scheduler (model-dependent) | - |
| `--lora <id>` | LoRA id (repeatable, edit only) | - |
| `--loras <ids>` | Comma-separated LoRA ids | - |
| `--lora-strength <n>` | LoRA strength (repeatable) | - |
| `--lora-strengths <n>` | Comma-separated LoRA strengths | - |
| `--token-type <type>` | Token type: spark\|sogni | spark |
| `--balance, --balances` | Show SPARK/SOGNI balances and exit | - |
| `-c, --context <path>` | Context image for editing | - |
| `--last-image` | Use last generated image as context/ref | - |
| `--video, -v` | Generate video instead of image | - |
| `--workflow <type>` | Video workflow: t2v\|i2v\|s2v\|animate-move\|animate-replace | inferred |
| `--fps <num>` | Frames per second (video) | 16 |
| `--duration <sec>` | Duration in seconds (video) | 5 |
| `--frames <num>` | Override total frames (video) | - |
| `--auto-resize-assets` | Auto-resize video assets | true |
| `--no-auto-resize-assets` | Disable auto-resize | - |
| `--estimate-video-cost` | Estimate video cost and exit (requires --steps) | - |
| `--ref <path>` | Reference image for video | required for video, except t2v |
| `--ref-end <path>` | End frame for i2v interpolation | - |
| `--ref-audio <path>` | Reference audio for s2v | - |
| `--ref-video <path>` | Reference video for animate workflows | - |
| `--last` | Show last render info | - |
| `--json` | JSON output | false |
| `--strict-size` | Do not auto-adjust i2v video size for reference resizing constraints | false |
| `-q, --quiet` | No progress output | false |

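When scripting against these flags, it can help to assemble the arguments as an array rather than a shell string. The sketch below covers only a handful of the flags from the table; `buildArgs` and the shape of its `options` object are hypothetical and not part of sogni-gen.

```javascript
// Hypothetical helper: turn an options object into an argv array for sogni-gen.mjs.
// Only flags that appear in the Options table are emitted.
function buildArgs(options, prompt) {
  const args = ["sogni-gen.mjs"];
  if (options.quiet) args.push("-q");
  if (options.json) args.push("--json");
  if (options.video) args.push("--video");
  if (options.model) args.push("-m", options.model);
  if (options.width) args.push("-w", String(options.width));
  if (options.height) args.push("-h", String(options.height));
  if (options.ref) args.push("--ref", options.ref);
  for (const contextPath of options.context || []) args.push("-c", contextPath);
  if (options.output) args.push("-o", options.output);
  // The prompt stays a single argv element, so quotes inside it need no escaping.
  args.push(prompt);
  return args;
}

console.log(buildArgs({ quiet: true, output: "/tmp/cat.png" }, "a cat wearing a hat"));
```

An array like this can be handed to Node's `child_process.execFile("node", args, callback)`, which does not pass the prompt through a shell.
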
## OpenClaw Config Defaults

When installed as an OpenClaw plugin, `sogni-gen` reads defaults from:

`~/.openclaw/openclaw.json`

```json
{
  "plugins": {
    "entries": {
      "sogni-gen": {
        "enabled": true,
        "config": {
          "defaultImageModel": "z_image_turbo_bf16",
          "defaultEditModel": "qwen_image_edit_2511_fp8_lightning",
          "videoModels": {
            "t2v": "wan_v2.2-14b-fp8_t2v_lightx2v",
            "i2v": "wan_v2.2-14b-fp8_i2v_lightx2v",
            "s2v": "wan_v2.2-14b-fp8_s2v_lightx2v",
            "animate-move": "wan_v2.2-14b-fp8_animate-move_lightx2v",
            "animate-replace": "wan_v2.2-14b-fp8_animate-replace_lightx2v"
          },
          "defaultVideoWorkflow": "t2v",
          "defaultNetwork": "fast",
          "defaultTokenType": "spark",
          "seedStrategy": "prompt-hash",
          "modelDefaults": {
            "flux1-schnell-fp8": { "steps": 4, "guidance": 3.5 },
            "flux2_dev_fp8": { "steps": 20, "guidance": 7.5 }
          },
          "defaultWidth": 768,
          "defaultHeight": 768,
          "defaultCount": 1,
          "defaultFps": 16,
          "defaultDurationSec": 5,
          "defaultImageTimeoutSec": 30,
          "defaultVideoTimeoutSec": 300
        }
      }
    }
  }
}
```

CLI flags always override these defaults.
If your OpenClaw config lives elsewhere, set `OPENCLAW_CONFIG_PATH`.
Seed strategies: `prompt-hash` (deterministic) or `random`.

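The precedence described above (CLI flags over OpenClaw config over built-in defaults) can be sketched as follows. The function names are hypothetical and the real loader in `sogni-gen.mjs` may differ. The fallback of 512 comes from the Options table, and the config keys come from the JSON example.

```javascript
// Hypothetical: where to look for the OpenClaw config file.
function openclawConfigPath(env, homeDir) {
  return env.OPENCLAW_CONFIG_PATH || `${homeDir}/.openclaw/openclaw.json`;
}

// Pull the sogni-gen config block out of a parsed openclaw.json object.
function pluginConfig(openclawJson) {
  return openclawJson?.plugins?.entries?.["sogni-gen"]?.config ?? {};
}

// Precedence: CLI flag > OpenClaw config > built-in default.
function resolveSize(cliFlags, config) {
  return {
    width: cliFlags.width ?? config.defaultWidth ?? 512,
    height: cliFlags.height ?? config.defaultHeight ?? 512,
  };
}

const sampleConfig = pluginConfig({
  plugins: { entries: { "sogni-gen": { enabled: true, config: { defaultWidth: 768, defaultHeight: 768 } } } },
});
console.log(resolveSize({ width: 1024 }, sampleConfig));
```

Using `??` rather than `||` means only a missing value falls through to the next level.
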
## Image Models

| Model | Speed | Use Case |
|-------|-------|----------|
| `z_image_turbo_bf16` | Fast (~5-10s) | General purpose, default |
| `flux1-schnell-fp8` | Very fast | Quick iterations |
| `flux2_dev_fp8` | Slow (~2min) | High quality |
| `chroma-v.46-flash_fp8` | Medium | Balanced |
| `qwen_image_edit_2511_fp8` | Medium | Image editing with context (up to 3 images) |
| `qwen_image_edit_2511_fp8_lightning` | Fast | Quick image editing |

## Video Models

| Model | Speed | Use Case |
|-------|-------|----------|
| `wan_v2.2-14b-fp8_i2v_lightx2v` | Fast | Default video generation |
| `wan_v2.2-14b-fp8_i2v` | Slow | Higher quality video |
| `wan_v2.2-14b-fp8_t2v_lightx2v` | Fast | Text-to-video |
| `wan_v2.2-14b-fp8_s2v_lightx2v` | Fast | Sound-to-video |
| `wan_v2.2-14b-fp8_animate-move_lightx2v` | Fast | Animate-move |
| `wan_v2.2-14b-fp8_animate-replace_lightx2v` | Fast | Animate-replace |

## Image Editing with Context

Edit images using reference images (Qwen models support up to 3):

```bash
# Single context image
node sogni-gen.mjs -c photo.jpg "make the background a beach"

# Multiple context images (subject + style)
node sogni-gen.mjs -c subject.jpg -c style.jpg "apply the style to the subject"

# Use last generated image as context
node sogni-gen.mjs --last-image "make it more vibrant"
```

When context images are provided without `-m`, the model defaults to `qwen_image_edit_2511_fp8_lightning`.

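The model-selection rule above can be sketched as a small function. `pickImageModel` is a hypothetical name, and rejecting more than three context images with an error is an assumption about how the limit might be enforced, not documented behavior.

```javascript
const DEFAULT_IMAGE_MODEL = "z_image_turbo_bf16";
const DEFAULT_EDIT_MODEL = "qwen_image_edit_2511_fp8_lightning";
const MAX_CONTEXT_IMAGES = 3; // Qwen edit models accept up to 3 context images

// Hypothetical: choose the model for an image request.
function pickImageModel(explicitModel, contextImages) {
  if (contextImages.length > MAX_CONTEXT_IMAGES) {
    // Assumption: this sketch fails fast; the real CLI may behave differently.
    throw new Error(`At most ${MAX_CONTEXT_IMAGES} context images are supported`);
  }
  if (explicitModel) return explicitModel; // -m always wins
  return contextImages.length > 0 ? DEFAULT_EDIT_MODEL : DEFAULT_IMAGE_MODEL;
}

console.log(pickImageModel(undefined, ["subject.jpg", "style.jpg"]));
```
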
## Multiple Angles (Turnaround)

Generate specific camera angles from a single reference image using the Multiple Angles LoRA:

```bash
# Single angle
node sogni-gen.mjs --multi-angle -c subject.jpg \
  --azimuth front-right --elevation eye-level --distance medium \
  --angle-strength 0.9 \
  "studio portrait, same person"

# 360 sweep (8 azimuths)
node sogni-gen.mjs --angles-360 -c subject.jpg --distance medium --elevation eye-level \
  "studio portrait, same person"

# 360 sweep video (looping mp4, uses i2v between angles; requires ffmpeg)
node sogni-gen.mjs --angles-360 --angles-360-video /tmp/turntable.mp4 \
  -c subject.jpg --distance medium --elevation eye-level \
  "studio portrait, same person"
```

The prompt is auto-built with the required `<sks>` token plus the selected camera angle keywords.
`--angles-360-video` generates i2v clips between consecutive angles (including last→first) and concatenates them with ffmpeg for a seamless loop.

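To make the looping behavior concrete, the sketch below lists the eight azimuth keys in the order given in the Options table and pairs each with its successor, wrapping the last back to the first. `loopPairs` is a hypothetical helper that illustrates the pairing only, not the script's actual implementation.

```javascript
// Azimuth keys in sweep order, as listed for --azimuth.
const AZIMUTHS = [
  "front", "front-right", "right", "back-right",
  "back", "back-left", "left", "front-left",
];

// Pair each angle with the next one; the modulo wraps last -> first.
function loopPairs(keys) {
  return keys.map((from, i) => [from, keys[(i + 1) % keys.length]]);
}

for (const [from, to] of loopPairs(AZIMUTHS)) {
  console.log(`${from} -> ${to}`);
}
```

Eight angles therefore yield eight clips, the last of which returns to `front` so the concatenated video loops.
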
### 360 Video Best Practices

When a user requests a "360 video", follow this workflow:

1. **Default camera parameters** (do not ask unless they specify):
   - **Elevation**: default to **eye-level** (what users call a "medium" angle)
   - **Distance**: default to **medium**

2. **Map user terms to flags**:
   | User says | Flag value |
   |-----------|------------|
   | "high" angle | `--elevation high-angle` |
   | "medium" angle | `--elevation eye-level` |
   | "low" angle | `--elevation low-angle` |
   | "close" | `--distance close-up` |
   | "medium" distance | `--distance medium` |
   | "far" | `--distance wide` |

3. **Always use first-frame/last-frame stitching** - the `--angles-360-video` flag automatically handles this by generating i2v clips between consecutive angles including last→first for seamless looping.

4. **Example command**:
   ```bash
   node sogni-gen.mjs --angles-360 --angles-360-video /tmp/output.mp4 \
     -c /path/to/image.png --elevation eye-level --distance medium \
     "description of subject"
   ```

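The term mapping in step 2 amounts to two lookup tables with defaults. The sketch below is one way to express it; `cameraFlags` is a hypothetical helper, and falling back to the defaults for unrecognized terms is an assumption.

```javascript
// User-facing terms mapped to flag values, per the table in step 2.
const ELEVATION_BY_TERM = new Map([
  ["high", "high-angle"],
  ["medium", "eye-level"],
  ["low", "low-angle"],
]);
const DISTANCE_BY_TERM = new Map([
  ["close", "close-up"],
  ["medium", "medium"],
  ["far", "wide"],
]);

// Hypothetical: build the camera flags, using the defaults when a term is absent or unknown.
function cameraFlags(angleTerm, distanceTerm) {
  const elevation = ELEVATION_BY_TERM.get(angleTerm) ?? "eye-level";
  const distance = DISTANCE_BY_TERM.get(distanceTerm) ?? "medium";
  return ["--elevation", elevation, "--distance", distance];
}

console.log(cameraFlags("high", "far").join(" "));
```
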
### Transition Video Rule

For **any transition video work**, always use the **Sogni skill/plugin** (not ffmpeg or other methods) unless explicitly told otherwise.

### Insufficient Funds Handling

When you see **"Debit Error: Insufficient funds"**, reply:

"Insufficient funds. Claim 50 free daily Spark points at https://app.sogni.ai/"

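For agents that post-process the script's error text, the rule above reduces to a substring check. This is a minimal sketch: `replyForError` is a hypothetical helper, and it assumes the error message is available as a string.

```javascript
const INSUFFICIENT_FUNDS_REPLY =
  "Insufficient funds. Claim 50 free daily Spark points at https://app.sogni.ai/";

// Hypothetical: return the canned reply for a funds error, or null for anything else.
function replyForError(errorText) {
  return errorText.includes("Debit Error: Insufficient funds")
    ? INSUFFICIENT_FUNDS_REPLY
    : null;
}

console.log(replyForError("Job failed: Debit Error: Insufficient funds"));
```
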
## Video Generation

Generate videos from a text prompt or a reference image:

```bash
# Text-to-video (t2v)
node sogni-gen.mjs --video "ocean waves at sunset"

# Image-to-video (i2v) from a reference image
node sogni-gen.mjs --video --ref cat.jpg -o cat.mp4 "cat walks around"

# Use last generated image as reference
node sogni-gen.mjs --last-image --video "gentle camera pan"

# Custom duration and FPS
node sogni-gen.mjs --video --ref scene.png --duration 10 --fps 24 "zoom out slowly"

# Sound-to-video (s2v)
node sogni-gen.mjs --video --ref face.jpg --ref-audio speech.m4a \
  -m wan_v2.2-14b-fp8_s2v_lightx2v "lip sync talking head"

# Animate (motion transfer)
node sogni-gen.mjs --video --ref subject.jpg --ref-video motion.mp4 \
  --workflow animate-move "transfer motion"
```

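The Options table says the workflow is inferred when `--workflow` is omitted, but the exact rules are not documented here. The sketch below is a guess that is merely consistent with the examples above (no reference gives t2v, an image gives i2v, audio gives s2v). `inferWorkflow` is hypothetical, and requiring an explicit workflow for `--ref-video` is an assumption, made because there are two animate variants.

```javascript
// Hypothetical inference rules; the real script may decide differently.
function inferWorkflow(opts) {
  if (opts.workflow) return opts.workflow; // explicit --workflow wins
  if (opts.refVideo) {
    // Assumption: animate-move vs animate-replace cannot be guessed.
    throw new Error("Pass --workflow animate-move or animate-replace with --ref-video");
  }
  if (opts.refAudio) return "s2v"; // reference audio present
  if (opts.ref) return "i2v"; // reference image only
  return "t2v"; // prompt only
}

console.log(inferWorkflow({ ref: "cat.jpg" }));
```
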
## Photo Restoration

Restore damaged vintage photos using Qwen image editing:

```bash
# Basic restoration
sogni-gen -c damaged_photo.jpg -o restored.png \
  "professionally restore this vintage photograph, remove damage and scratches"

# Detailed restoration with preservation hints
sogni-gen -c old_photo.jpg -o restored.png -w 1024 -h 1280 \
  "restore this vintage photo, remove peeling, tears and wear marks, \
preserve natural features and expression, maintain warm nostalgic color tones"
```

**Tips for good restorations:**
- Describe the damage: "peeling", "scratches", "tears", "fading"
- Specify what to preserve: "natural features", "eye color", "hair", "expression"
- Mention the era for color tones: "1970s warm tones", "vintage sepia"

**Finding received images (Telegram/etc):**
```bash
# List the three most recently modified inbound images (newest first)
ls -lt ~/.clawdbot/media/inbound/*.jpg | head -3
cp ~/.clawdbot/media/inbound/<latest>.jpg /tmp/to_restore.jpg
```

## Agent Usage

When the user asks to generate, draw, or create an image or video:

```bash
# Generate and save locally
node {{skillDir}}/sogni-gen.mjs -q -o /tmp/generated.png "user's prompt"

# Edit an existing image
node {{skillDir}}/sogni-gen.mjs -q -c /path/to/input.jpg -o /tmp/edited.png "make it pop art style"

# Generate video from image
node {{skillDir}}/sogni-gen.mjs -q --video --ref /path/to/image.png -o /tmp/video.mp4 "camera slowly zooms in"

# Generate text-to-video
node {{skillDir}}/sogni-gen.mjs -q --video -o /tmp/video.mp4 "ocean waves at sunset"

# Check current SPARK/SOGNI balances (no prompt required)
node {{skillDir}}/sogni-gen.mjs --json --balance

# Then send via message tool with filePath
```

## JSON Output

With `--json`, a successful run prints a single JSON object like:

```json
{
  "success": true,
  "prompt": "a cat wearing a hat",
  "model": "z_image_turbo_bf16",
  "width": 512,
  "height": 512,
  "urls": ["https://..."],
  "localPath": "/tmp/cat.png"
}
```

On error (with `--json`), the script returns a single JSON object like:

```json
{
  "success": false,
  "error": "Video width and height must be divisible by 16 (got 500x512).",
  "errorCode": "INVALID_VIDEO_SIZE",
  "hint": "Choose --width/--height divisible by 16. For i2v, also match the reference aspect ratio."
}
```

Balance check example (`--json --balance`):

```json
{
  "success": true,
  "type": "balance",
  "spark": 12.34,
  "sogni": 0.56
}
```

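A script consuming this output can branch on `success` and then on `type`. The sketch below relies only on the fields shown in the three examples above. `summarizeResult` and the `"UNKNOWN"` fallback code are hypothetical, and the script may emit fields not shown here.

```javascript
// Hypothetical: turn the --json output text into a one-line summary.
function summarizeResult(jsonText) {
  const result = JSON.parse(jsonText);
  if (!result.success) {
    const hint = result.hint ? ` Hint: ${result.hint}` : "";
    return `Failed (${result.errorCode || "UNKNOWN"}): ${result.error}${hint}`;
  }
  if (result.type === "balance") {
    return `Balances: ${result.spark} SPARK, ${result.sogni} SOGNI`;
  }
  // Prefer the saved file when -o was used, otherwise the first URL.
  return `Generated: ${result.localPath || result.urls[0]}`;
}

console.log(summarizeResult('{"success":true,"type":"balance","spark":12.34,"sogni":0.56}'));
```
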
## Cost

Uses Spark tokens from your Sogni account by default; pass `--token-type sogni` to use SOGNI tokens instead. 512x512 images are the most cost-efficient.

## Troubleshooting

- **Auth errors**: Check credentials in `~/.config/sogni/credentials`
- **i2v sizing gotchas**: Video sizes are constrained (min 480px, max 1536px, divisible by 16). For i2v, the client wrapper resizes the reference (`fit: inside`) and uses the resized dimensions as the final video size. Because this uses rounding, a requested size can still yield an invalid final size (example: `1024x1536` requested but ref becomes `1024x1535`).
- **Auto-adjustment**: With a local `--ref`, the script will auto-adjust the requested size to avoid non-16 resized reference dimensions.
- **If the script adjusts your size but you want to fail instead**: pass `--strict-size` and it will print a suggested `--width/--height`.
- **Timeouts**: Try a faster model or increase `-t` timeout
- **No workers**: Check https://sogni.ai for network status
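
The i2v sizing problem can be reproduced with a little arithmetic. The sketch below approximates a `fit: inside` resize with per-side rounding and then checks the stated constraints. The function names and the 2048x3070 reference image are hypothetical, and the sketch assumes the 480 to 1536 limits apply to each side.

```javascript
// Scale (refW x refH) to fit inside (maxW x maxH), keeping aspect ratio
// and rounding each side, as a "fit: inside" resize roughly does.
function fitInside(refW, refH, maxW, maxH) {
  const scale = Math.min(maxW / refW, maxH / refH);
  return { width: Math.round(refW * scale), height: Math.round(refH * scale) };
}

// Constraints from the list above: each side 480..1536 and divisible by 16.
function isValidVideoSize(width, height) {
  const sideOk = (n) => n >= 480 && n <= 1536 && n % 16 === 0;
  return sideOk(width) && sideOk(height);
}

// Hypothetical 2048x3070 reference with a requested size of 1024x1536.
const resizedRef = fitInside(2048, 3070, 1024, 1536);
console.log(resizedRef, isValidVideoSize(resizedRef.width, resizedRef.height));
```

Here the width is the limiting side (scale 0.5), so the height becomes 1535, which is not divisible by 16. This is the kind of case that auto-adjustment or `--strict-size` is meant to handle.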