@sogni-ai/sogni-creative-agent-skill 2.1.3 → 2.3.0
- package/README.md +392 -181
- package/SKILL.md +187 -27
- package/generated/creative-agent-runtime.mjs +8559 -899
- package/llm.txt +29 -7
- package/openclaw.plugin.json +59 -4
- package/package.json +10 -4
- package/scripts/check-creative-agent-source.mjs +104 -0
- package/sogni-agent.mjs +2329 -186
- package/ssrf-guard.mjs +2 -1
- package/version.mjs +1 -1
package/README.md
CHANGED
|
@@ -1,82 +1,117 @@
|
|
|
1
1
|
<p align="center">
|
|
2
|
-
<img src="https://raw.githubusercontent.com/Sogni-AI/sogni-creative-agent-skill/main/docs/screenshot.jpg" alt="
|
|
2
|
+
<img src="https://raw.githubusercontent.com/Sogni-AI/sogni-creative-agent-skill/main/docs/screenshot.jpg" alt="Sogni Creative Agent Skill rendering an image from a Telegram-style chat" width="320" />
|
|
3
3
|
</p>
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
<h1 align="center">Sogni Creative Agent Skill</h1>
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
[OpenClaw](https://github.com/OpenClaw/OpenClaw),
|
|
9
|
-
[Hermes Agent](https://hermes-agent.nousresearch.com/),
|
|
10
|
-
[Manus AI](https://manus.im), and more — image generation, video generation, and
|
|
11
|
-
creative-media tools powered by [Sogni AI](https://sogni.ai)'s decentralized GPU
|
|
12
|
-
network.
|
|
7
|
+
<p align="center">Image, video, and music generation for AI agents — powered by <a href="https://sogni.ai">Sogni AI</a>'s decentralized GPU network.</p>
|
|
13
8
|
|
|
14
|
-
|
|
15
|
-
-
|
|
16
|
-
|
|
17
|
-
|
|
9
|
+
<p align="center">
|
|
10
|
+
<a href="https://www.npmjs.com/package/@sogni-ai/sogni-creative-agent-skill"><img alt="npm" src="https://img.shields.io/npm/v/@sogni-ai/sogni-creative-agent-skill.svg" /></a>
|
|
11
|
+
<a href="https://www.npmjs.com/package/@sogni-ai/sogni-creative-agent-skill"><img alt="downloads" src="https://img.shields.io/npm/dm/@sogni-ai/sogni-creative-agent-skill.svg" /></a>
|
|
12
|
+
<img alt="node" src="https://img.shields.io/node/v/@sogni-ai/sogni-creative-agent-skill.svg" />
|
|
13
|
+
<a href="./LICENSE"><img alt="license" src="https://img.shields.io/npm/l/@sogni-ai/sogni-creative-agent-skill.svg" /></a>
|
|
14
|
+
</p>
|
|
15
|
+
|
|
16
|
+
---
|
|
17
|
+
|
|
18
|
+
**Sogni Creative Agent Skill** plugs into the agent runtime you already use — Claude Code, [OpenClaw](https://github.com/OpenClaw/OpenClaw), [Hermes Agent](https://hermes-agent.nousresearch.com/), [Manus AI](https://manus.im), and others — and gives it production-quality image, video, and music generation through a single CLI: `sogni-agent`.
|
|
19
|
+
|
|
20
|
+
It ships three ways:
|
|
21
|
+
|
|
22
|
+
- a standalone Node.js CLI (`sogni-agent`)
|
|
23
|
+
- a skill source that any [`SKILL.md`](./SKILL.md)-aware agent can load
|
|
24
|
+
- a published [OpenClaw](https://github.com/OpenClaw/OpenClaw) plugin
|
|
18
25
|
|
|
19
|
-
|
|
26
|
+
With this skill, an agent can:
|
|
20
27
|
|
|
21
|
-
|
|
22
|
-
-
|
|
23
|
-
-
|
|
24
|
-
-
|
|
28
|
+
- generate images from prompts and edit/restyle existing images
|
|
29
|
+
- create videos from text, images, audio, or reference video (LTX-2.3, WAN 2.2, Seedance 2.0)
|
|
30
|
+
- generate instrumental music or full songs with lyrics
|
|
31
|
+
- run hosted creative workflows including storyboard-driven video
|
|
25
32
|
- save personas, preferences, and last-render state across sessions
|
|
26
33
|
- check balances, list models, and refine previous results
|
|
27
34
|
|
|
35
|
+
> **Fastest install:** paste this repo's GitHub URL into your agent and ask it to "install this skill".
|
|
36
|
+
|
|
37
|
+
---
|
|
38
|
+
|
|
39
|
+
## Table of Contents
|
|
40
|
+
|
|
41
|
+
- [Quick Start](#quick-start)
|
|
42
|
+
- [Requirements](#requirements)
|
|
43
|
+
- [Installation](#installation)
|
|
44
|
+
- [Node CLI (default)](#node-cli-default)
|
|
45
|
+
- [OpenClaw plugin](#openclaw-plugin)
|
|
46
|
+
- [Hermes Agent / Manus / other frameworks](#hermes-agent--manus--other-frameworks)
|
|
47
|
+
- [Manual install from source](#manual-install-from-source)
|
|
48
|
+
- [Upgrading safely from inside an agent](#upgrading-safely-from-inside-an-agent)
|
|
49
|
+
- [Setup (Sogni API key)](#setup-sogni-api-key)
|
|
50
|
+
- [Usage](#usage)
|
|
51
|
+
- [CLI Reference](#cli-reference)
|
|
52
|
+
- [Common options](#common-options)
|
|
53
|
+
- [Quality presets](#quality-presets)
|
|
54
|
+
- [Recommended models](#recommended-models)
|
|
55
|
+
- [Video Sizing & Aspect Ratios](#video-sizing--aspect-ratios)
|
|
56
|
+
- [LTX-2.3 Prompting Guide](#ltx-23-prompting-guide)
|
|
57
|
+
- [Photobooth (Face Transfer)](#photobooth-face-transfer)
|
|
58
|
+
- [Personas, Memory, and Personality](#personas-memory-and-personality)
|
|
59
|
+
- [Hosted API Modes](#hosted-api-modes)
|
|
60
|
+
- [Dynamic Prompt Variations](#dynamic-prompt-variations)
|
|
61
|
+
- [Token Auto-Fallback](#token-auto-fallback)
|
|
62
|
+
- [Error Reporting & Output](#error-reporting--output)
|
|
63
|
+
- [For AI Agents](#for-ai-agents)
|
|
64
|
+
- [Development](#development)
|
|
65
|
+
- [License](#license)
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
28
69
|
## Quick Start
|
|
29
70
|
|
|
30
|
-
1.
|
|
31
|
-
2. Install the
|
|
71
|
+
1. Get a Sogni API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (open the account menu) and save it — see [Setup](#setup-sogni-api-key).
|
|
72
|
+
2. Install the CLI:
|
|
32
73
|
|
|
33
|
-
```bash
|
|
34
|
-
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
35
|
-
sogni-agent --version
|
|
36
|
-
```
|
|
74
|
+
```bash
|
|
75
|
+
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
76
|
+
sogni-agent --version
|
|
77
|
+
```
|
|
37
78
|
|
|
38
|
-
3. Point your agent
|
|
79
|
+
3. Point your agent runtime at this repository's [`SKILL.md`](./SKILL.md).
|
|
80
|
+
|
|
81
|
+
Then ask your agent to do something:
|
|
39
82
|
|
|
40
|
-
Then ask your agent to do something simple, for example:
|
|
41
83
|
- "Generate an image of a sunset over mountains"
|
|
42
84
|
- "Edit this image to add a rainbow"
|
|
43
85
|
- "Make a video of a cat playing piano"
|
|
86
|
+
- "Generate a 30 second synthwave product-launch theme"
|
|
44
87
|
- "Turn my selfie into James Bond using photobooth"
|
|
45
88
|
- "Refine the last image at higher quality"
|
|
46
89
|
|
|
47
|
-
|
|
90
|
+
---
|
|
48
91
|
|
|
49
|
-
|
|
92
|
+
## Requirements
|
|
50
93
|
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
94
|
+
- **Node.js ≥ 22.11.0**
|
|
95
|
+
- **Sogni API key** ([dashboard.sogni.ai](https://dashboard.sogni.ai))
|
|
96
|
+
- **`ffmpeg`** *(optional)* — required for local utilities such as `--angles-360-video`, `--concat-videos`, and `--extract-last-frame`. Set `FFMPEG_PATH` to override discovery.
|
|
97
|
+
- macOS, Linux, or Windows
|
|
55
98
|
|
|
56
|
-
|
|
99
|
+
---
|
|
57
100
|
|
|
58
|
-
|
|
101
|
+
## Installation
|
|
59
102
|
|
|
60
|
-
|
|
103
|
+
### Node CLI (default)
|
|
61
104
|
|
|
62
|
-
For
|
|
105
|
+
For most agents and human users:
|
|
63
106
|
|
|
64
107
|
```bash
|
|
65
108
|
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
66
109
|
sogni-agent --version
|
|
67
110
|
```
|
|
68
111
|
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
```bash
|
|
72
|
-
DEST="$HOME/Documents/git/sogni/sogni-creative-agent-skill"
|
|
73
|
-
git -C "$DEST" pull --ff-only
|
|
74
|
-
npm --prefix "$DEST" install
|
|
75
|
-
```
|
|
76
|
-
|
|
77
|
-
If the checkout is missing, use the npm install path above or explicitly approve a clone.
|
|
112
|
+
Then point your agent/runtime at this repository's [`SKILL.md`](./SKILL.md). When an install request is ambiguous, install the CLI and skill source together — that's the supported default.
|
|
78
113
|
|
|
79
|
-
### OpenClaw
|
|
114
|
+
### OpenClaw plugin
|
|
80
115
|
|
|
81
116
|
For the published plugin:
|
|
82
117
|
|
|
@@ -86,7 +121,7 @@ openclaw plugins install sogni-creative-agent-skill
|
|
|
86
121
|
|
|
87
122
|
The installed plugin loads its behavior from [`SKILL.md`](./SKILL.md) via [`openclaw.plugin.json`](./openclaw.plugin.json).
|
|
88
123
|
|
|
89
|
-
For a local checkout that you want to update continuously, link the minimal OpenClaw surface
|
|
124
|
+
For a local checkout that you want to update continuously, link the minimal OpenClaw surface (`.openclaw-link/`) — not the repository root, which contains development tests that OpenClaw correctly blocks during plugin safety scanning:
|
|
90
125
|
|
|
91
126
|
```bash
|
|
92
127
|
cd /path/to/sogni-creative-agent-skill
|
|
@@ -97,7 +132,7 @@ openclaw plugins install -l "$PWD/.openclaw-link"
|
|
|
97
132
|
openclaw gateway restart
|
|
98
133
|
```
|
|
99
134
|
|
|
100
|
-
To update
|
|
135
|
+
To update the linked install later:
|
|
101
136
|
|
|
102
137
|
```bash
|
|
103
138
|
cd /path/to/sogni-creative-agent-skill
|
|
@@ -108,13 +143,17 @@ npm run openclaw:sync
|
|
|
108
143
|
openclaw gateway restart
|
|
109
144
|
```
|
|
110
145
|
|
|
111
|
-
|
|
146
|
+
The generated `.openclaw-link/` directory is only for OpenClaw; Hermes, Manus, and other skill-based agents should continue using the root [`SKILL.md`](./SKILL.md).
|
|
112
147
|
|
|
113
|
-
|
|
148
|
+
#### OpenClaw configuration
|
|
114
149
|
|
|
115
|
-
|
|
150
|
+
When loaded through OpenClaw, this skill reads plugin defaults from OpenClaw config; CLI flags always override them. The supported config schema is defined in [`openclaw.plugin.json`](./openclaw.plugin.json) and includes default models, video workflow models, hosted API defaults (`apiBaseUrl`, `defaultLlmModel`, `defaultTaskProfile`, `defaultApiMaxTokens`, `defaultApiThinking`, `defaultApiToolMode`, workflow cost defaults), token type, seed strategy, timeouts, and media paths. If your OpenClaw config lives elsewhere, set `OPENCLAW_CONFIG_PATH`.
|
|
116
151
|
|
|
117
|
-
###
|
|
152
|
+
### Hermes Agent / Manus / other frameworks
|
|
153
|
+
|
|
154
|
+
Point the agent at this repository's [`SKILL.md`](./SKILL.md) for behavior guidance and [`llm.txt`](https://raw.githubusercontent.com/Sogni-AI/sogni-creative-agent-skill/main/llm.txt) for install/setup help. The agent should invoke the globally installed `sogni-agent` CLI by default.
|
|
155
|
+
|
|
156
|
+
### Manual install from source
|
|
118
157
|
|
|
119
158
|
```bash
|
|
120
159
|
gh repo clone Sogni-AI/sogni-creative-agent-skill
|
|
@@ -122,45 +161,57 @@ cd sogni-creative-agent-skill
|
|
|
122
161
|
npm install
|
|
123
162
|
```
|
|
124
163
|
|
|
125
|
-
###
|
|
164
|
+
### Upgrading safely from inside an agent
|
|
165
|
+
|
|
166
|
+
When upgrading from inside an agent runtime, prefer direct package-manager or existing-checkout commands. Avoid asking the agent to build a clone-or-pull shell bootstrap script with `set -e`, `bash -c`, `sh -c`, or an inline repository URL — some sandboxes correctly route those through approval and the install will stall.
|
|
126
167
|
|
|
127
|
-
|
|
168
|
+
For a global CLI:
|
|
128
169
|
|
|
129
170
|
```bash
|
|
130
|
-
npm
|
|
171
|
+
npm install -g @sogni-ai/sogni-creative-agent-skill@latest
|
|
172
|
+
sogni-agent --version
|
|
131
173
|
```
|
|
132
174
|
|
|
133
|
-
|
|
175
|
+
For an existing local checkout:
|
|
134
176
|
|
|
135
|
-
|
|
177
|
+
```bash
|
|
178
|
+
DEST="$HOME/Documents/git/sogni/sogni-creative-agent-skill"
|
|
179
|
+
git -C "$DEST" pull --ff-only
|
|
180
|
+
npm --prefix "$DEST" install
|
|
181
|
+
```
|
|
136
182
|
|
|
137
|
-
|
|
183
|
+
If the checkout is missing, use the npm install path above or explicitly approve a clone.
|
|
138
184
|
|
|
139
|
-
|
|
185
|
+
---
|
|
140
186
|
|
|
141
|
-
|
|
187
|
+
## Setup (Sogni API key)
|
|
142
188
|
|
|
143
|
-
|
|
189
|
+
1. Get your API key from [dashboard.sogni.ai](https://dashboard.sogni.ai) (open the account menu).
|
|
190
|
+
2. Save it to a credentials file:
|
|
144
191
|
|
|
145
|
-
|
|
146
|
-
|
|
192
|
+
```bash
|
|
193
|
+
mkdir -p ~/.config/sogni
|
|
194
|
+
cat > ~/.config/sogni/credentials << 'EOF'
|
|
195
|
+
SOGNI_API_KEY=your_api_key
|
|
196
|
+
EOF
|
|
197
|
+
chmod 600 ~/.config/sogni/credentials
|
|
198
|
+
```
|
|
147
199
|
|
|
148
|
-
|
|
149
|
-
mkdir -p ~/.config/sogni
|
|
150
|
-
cat > ~/.config/sogni/credentials << 'EOF'
|
|
151
|
-
SOGNI_API_KEY=your_api_key
|
|
152
|
-
# or:
|
|
153
|
-
# SOGNI_USERNAME=your_username
|
|
154
|
-
# SOGNI_PASSWORD=your_password
|
|
155
|
-
EOF
|
|
156
|
-
chmod 600 ~/.config/sogni/credentials
|
|
157
|
-
```
|
|
200
|
+
You can also skip the file and export `SOGNI_API_KEY` in your environment.
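The credentials file is a plain `KEY=value` list, so the lookup can be sketched in a few lines. This is a minimal illustration, not the CLI's actual loader: the function names `parseCredentials` and `resolveApiKey` are hypothetical, and letting the environment variable win over the file is an assumption the README does not state.

```javascript
// Hypothetical helper: parse KEY=value lines, skipping blank lines and # comments.
function parseCredentials(text) {
  const out = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=value line
    out[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return out;
}

// Assumption: an exported SOGNI_API_KEY takes precedence over the file contents.
function resolveApiKey(env, fileText) {
  if (env.SOGNI_API_KEY) return env.SOGNI_API_KEY;
  const creds = parseCredentials(fileText || "");
  return creds.SOGNI_API_KEY || null;
}
```

In a real script, `env` would be `process.env` and `fileText` the contents of `~/.config/sogni/credentials` (or whatever `SOGNI_CREDENTIALS_PATH` points to).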
|
|
158
201
|
|
|
159
|
-
|
|
202
|
+
### Filesystem path overrides
|
|
160
203
|
|
|
161
|
-
|
|
204
|
+
Defaults live under `~/.config/sogni/` for credentials, last-render metadata, personas, memories, and personality. Override individual paths with:
|
|
162
205
|
|
|
163
|
-
|
|
206
|
+
| Variable | Purpose |
|
|
207
|
+
|----------|---------|
|
|
208
|
+
| `SOGNI_CREDENTIALS_PATH` | Custom credentials file |
|
|
209
|
+
| `SOGNI_LAST_RENDER_PATH` | Where last-render state is persisted |
|
|
210
|
+
| `SOGNI_MEDIA_INBOUND_DIR` | Directory used by `--list-media` |
|
|
211
|
+
| `OPENCLAW_CONFIG_PATH` | OpenClaw config file location |
|
|
212
|
+
| `FFMPEG_PATH` | Custom `ffmpeg` binary |
|
|
213
|
+
|
|
214
|
+
---
|
|
164
215
|
|
|
165
216
|
## Usage
|
|
166
217
|
|
|
@@ -174,14 +225,14 @@ sogni-agent -c subject.jpg "add a neon cyberpunk glow"
|
|
|
174
225
|
# Photobooth face transfer
|
|
175
226
|
sogni-agent --photobooth --ref face.jpg "80s fashion portrait"
|
|
176
227
|
|
|
177
|
-
# Text-to-video (t2v)
|
|
178
|
-
sogni-agent --video
|
|
228
|
+
# Text-to-video (t2v) with native dialogue
|
|
229
|
+
sogni-agent --video 'A narrator says "welcome to the story" as ocean waves crash'
|
|
179
230
|
|
|
180
|
-
# Short-side targeting preserves the
|
|
231
|
+
# Short-side resolution targeting (preserves the inherited aspect ratio)
|
|
181
232
|
sogni-agent --video --target-resolution 768 \
|
|
182
233
|
"A calm cinematic shot of lanterns drifting across a night lake"
|
|
183
234
|
|
|
184
|
-
# Seedance 2.0
|
|
235
|
+
# Seedance 2.0 (4-15s vendor video path with native audio)
|
|
185
236
|
sogni-agent --video -m seedance2 --duration 8 \
|
|
186
237
|
"A polished product reveal with native ambient sound"
|
|
187
238
|
|
|
@@ -195,15 +246,58 @@ sogni-agent --video -m seedance2 --workflow t2v \
|
|
|
195
246
|
# Image-to-video (i2v)
|
|
196
247
|
sogni-agent --video --ref cat.jpg "gentle camera pan"
|
|
197
248
|
|
|
198
|
-
# Image+audio-to-video (auto-routes to LTX
|
|
249
|
+
# Image+audio-to-video (auto-routes to LTX-2.3 ia2v)
|
|
199
250
|
sogni-agent --video --ref cover.jpg --ref-audio song.mp3 \
|
|
200
251
|
"music video with synchronized motion"
|
|
201
252
|
|
|
202
|
-
#
|
|
253
|
+
# Direct music generation
|
|
254
|
+
sogni-agent --music --duration 30 \
|
|
255
|
+
"uplifting cinematic synthwave theme for a product launch"
|
|
256
|
+
|
|
257
|
+
# Song with lyrics and musical controls
|
|
258
|
+
sogni-agent --music --lyrics "Rise with the morning light" --bpm 128 \
|
|
259
|
+
--keyscale "C major" --output-format mp3 "bright indie pop chorus"
|
|
260
|
+
|
|
261
|
+
# LTX-2.3 voice identity / persona
|
|
203
262
|
sogni-agent --video --reference-audio-identity voice.webm \
|
|
204
|
-
|
|
263
|
+
'NARRATOR: "This is my voice."'
|
|
264
|
+
|
|
265
|
+
# Hosted chat with Sogni creative-agent tools (/v1/chat/completions)
|
|
266
|
+
sogni-agent --api-chat \
|
|
267
|
+
"Create a 4-shot product video concept for a red sneaker"
|
|
268
|
+
|
|
269
|
+
# Hosted chat with image vision plus media-reference metadata
|
|
270
|
+
sogni-agent --api-chat --ref product.jpg \
|
|
271
|
+
"Turn this into a launch poster and describe the edit plan"
|
|
272
|
+
|
|
273
|
+
# Hosted chat controls and model discovery
|
|
274
|
+
sogni-agent --api-chat --task-profile reasoning --no-thinking \
|
|
275
|
+
"Plan a concise multi-step product launch workflow"
|
|
276
|
+
sogni-agent --list-api-models
|
|
277
|
+
|
|
278
|
+
# Durable hosted workflow (/v1/creative-agent/workflows)
|
|
279
|
+
sogni-agent --api-workflow image-to-video \
|
|
280
|
+
--video-prompt "The camera slowly pushes in as the sketch comes alive" \
|
|
281
|
+
"A graphite robot sketch on a drafting table"
|
|
282
|
+
|
|
283
|
+
# Durable workflow with a media reference and a cost ceiling
|
|
284
|
+
sogni-agent --api-workflow image-to-video --ref https://cdn.example.com/sketch.png \
|
|
285
|
+
--workflow-max-cost 25 --confirm-cost \
|
|
286
|
+
--video-prompt "The camera slowly pushes in as the sketch comes alive" \
|
|
287
|
+
"Animate the referenced sketch"
|
|
288
|
+
|
|
289
|
+
# Shared CreativeWorkflowPlan -> API compiles to hosted sequence
|
|
290
|
+
sogni-agent --api-workflow creative-plan --workflow-input @plan.json
|
|
291
|
+
|
|
292
|
+
# Storyline -> GPT Image 2 storyboard sheet -> Seedance video sequence
|
|
293
|
+
sogni-agent --api-workflow storyboard-video --storyboard-frames 6 --duration 12 -Q hq \
|
|
294
|
+
"Create a 9:16 bakery launch video with a neon street-window reveal"
|
|
295
|
+
|
|
296
|
+
# Sogni Intelligence replay records
|
|
297
|
+
sogni-agent --list-replays 20
|
|
298
|
+
sogni-agent --get-replay run_abc123 --json
|
|
205
299
|
|
|
206
|
-
#
|
|
300
|
+
# Local segment + concat with external soundtrack
|
|
207
301
|
sogni-agent --video --workflow v2v --ref-video dance.mp4 \
|
|
208
302
|
--video-start 10 --duration 8 --controlnet-name pose -o /tmp/clip-2.mp4 \
|
|
209
303
|
"robot dancing"
|
|
@@ -215,114 +309,140 @@ sogni-agent --balance
|
|
|
215
309
|
sogni-agent --help
|
|
216
310
|
```
|
|
217
311
|
|
|
218
|
-
|
|
312
|
+
> Prefer `.webm`, `.m4a`, or `.mp3` voice clips. Local `.wav` clips are normalized to `.m4a` before upload when `ffmpeg` is available.
|
|
313
|
+
>
|
|
314
|
+
> For local multi-clip workflows, use the built-in FFmpeg wrappers (`--video-start`, `--audio-start`, `--audio-duration`, `--concat-videos`, `--concat-audio`) over raw shell commands — they produce safer, more reproducible results.
|
|
219
315
|
|
|
220
|
-
|
|
316
|
+
---
|
|
221
317
|
|
|
222
|
-
##
|
|
223
|
-
|
|
224
|
-
When you use `ltx23-22b-fp8_t2v_distilled`, do not feed it short tag prompts like `"cinematic drone shot over tropical cliffs"`. LTX-2.3 renders more reliably from a dense natural-language scene description.
|
|
318
|
+
## CLI Reference
|
|
225
319
|
|
|
226
|
-
-
|
|
227
|
-
- Use 4-8 flowing present-tense sentences describing one continuous shot, not a montage.
|
|
228
|
-
- Start with shot scale and scene identity, then cover environment, time of day, textures, and named light sources.
|
|
229
|
-
- Keep characters and objects concrete and stable. Describe one main action thread from start to finish.
|
|
230
|
-
- If the user wants dialogue, include the exact spoken words in double quotes with the speaker and delivery identified inline.
|
|
231
|
-
- Express mood through visible behavior, motion, and sound cues instead of vague adjectives.
|
|
232
|
-
- Use positive phrasing. Avoid script formatting, negative prompts, on-screen text/logo requests, and generic filler words like "beautiful" or "nice".
|
|
233
|
-
- Match scene density to clip length. For the default short clips, describe one main beat rather than several unrelated actions.
|
|
320
|
+
Run `sogni-agent --help` for the full CLI. Below are the options and tables most agents and users reach for first.
|
|
234
321
|
|
|
235
|
-
|
|
236
|
-
|
|
237
|
-
```text
|
|
238
|
-
User ask: "make a 4k video of a woman in a neon alley"
|
|
322
|
+
### Common options
|
|
239
323
|
|
|
240
|
-
|
|
241
|
-
|
|
324
|
+
| Option | Use |
|
|
325
|
+
|--------|-----|
|
|
326
|
+
| `-Q fast\|hq\|pro` | Pick image quality without memorizing model IDs |
|
|
327
|
+
| `-o <path>` | Save output locally |
|
|
328
|
+
| `-c <path>` | Provide image context for edits |
|
|
329
|
+
| `--video` | Generate video instead of image |
|
|
330
|
+
| `--music` | Generate music/audio instead of image |
|
|
331
|
+
| `--lyrics`, `--bpm`, `--keyscale`, `--timesig` | Music generation controls |
|
|
332
|
+
| `--ref`, `--ref-audio`, `--ref-video` | Image/audio/video references; HTTPS refs are forwarded as URL context for Seedance |
|
|
333
|
+
| `--target-resolution <px>` | Target the short side, preserving aspect ratio |
|
|
334
|
+
| `--workflow <type>` | Force `t2v`, `i2v`, `s2v`, `ia2v`, `a2v`, `v2v`, or animate workflows |
|
|
335
|
+
| `--api-chat` | Use `/v1/chat/completions` with Sogni creative-agent tools |
|
|
336
|
+
| `--api-workflow <kind>` | Start a `/v1/creative-agent/workflows` durable workflow: `image-to-video`, `hosted-tool-sequence`, `creative-plan`, or `storyboard-video` |
|
|
337
|
+
| `--workflow-input <json\|path\|@path>` | Explicit hosted workflow input JSON |
|
|
338
|
+
| `--workflow-max-cost <n>`, `--confirm-cost`, `--no-confirm-cost` | Set durable workflow capacity ceiling and explicit cost confirmation |
|
|
339
|
+
| `--storyboard-frames <n>` | Beat count for `--api-workflow storyboard-video` |
|
|
340
|
+
| `--video-prompt`, `--negative-prompt`, `--generate-audio`, `--expand-prompt` | Durable image-to-video workflow inputs |
|
|
341
|
+
| `--watch-workflow`, `--list-workflows`, `--get-workflow <id>`, `--workflow-events <id>`, `--stream-workflow <id>`, `--cancel-workflow <id>` | Manage durable workflows |
|
|
342
|
+
| `--api-tools <mode>`, `--no-api-tool-execution`, `--llm-model <id>`, `--task-profile <profile>`, `--max-tokens <n>`, `--thinking` / `--no-thinking`, `--api-base-url <url>` | Tune hosted API requests |
|
|
343
|
+
| `--list-api-models`, `--get-api-model <id>` | Inspect Sogni Intelligence LLM models |
|
|
344
|
+
| `--list-replays [n]`, `--get-replay <id>`, `--ingest-replay <json\|path\|@path>` | Manage Sogni Intelligence replay records |
|
|
345
|
+
| `--persona <name>` | Use a saved persona |
|
|
346
|
+
| `--concat-videos <out> <clips...>` | Stitch clips locally with FFmpeg |
|
|
347
|
+
| `--last`, `--last-image` | Inspect last render / reuse last image as context or video reference |
|
|
348
|
+
| `--strict-size` | Fail instead of auto-adjusting video size |
|
|
349
|
+
| `--json` | Emit structured output for agents |
|
|
242
350
|
|
|
243
|
-
|
|
351
|
+
### Quality presets
|
|
244
352
|
|
|
245
|
-
|
|
353
|
+
Skip remembering model IDs — `--quality` / `-Q` selects the right model, steps, and dimensions for image generation:
|
|
246
354
|
|
|
247
|
-
|
|
248
|
-
|
|
249
|
-
|
|
250
|
-
|
|
355
|
+
| Preset | Model | Steps | Size | Speed |
|
|
356
|
+
|--------|-------|-------|------|-------|
|
|
357
|
+
| `fast` | `z_image_turbo_bf16` | 8 | 512×512 | ~5–10s |
|
|
358
|
+
| `hq` | `z_image_turbo_bf16` | default | 768×768 | ~10–15s |
|
|
359
|
+
| `pro` | `flux2_dev_fp8` | 40 | 1024×1024 | ~2 min |
|
|
251
360
|
|
|
252
|
-
|
|
361
|
+
Explicit `--model` overrides the preset's model. Explicit `-w`/`-h` overrides dimensions.
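The override rule can be pictured as a lookup followed by a merge. The sketch below copies the values from the preset table; the function name `resolveImageSettings` and the object shape are hypothetical, and whether preset steps carry over to an overridden model is not stated in the README (this sketch simply keeps them).

```javascript
// Values copied from the preset table; `steps: null` stands for "model default".
const QUALITY_PRESETS = {
  fast: { model: "z_image_turbo_bf16", steps: 8, width: 512, height: 512 },
  hq: { model: "z_image_turbo_bf16", steps: null, width: 768, height: 768 },
  pro: { model: "flux2_dev_fp8", steps: 40, width: 1024, height: 1024 },
};

// Hypothetical resolver: explicit model and dimensions win over preset values.
function resolveImageSettings({ quality, model, width, height } = {}) {
  const preset = QUALITY_PRESETS[quality];
  if (!preset) throw new Error(`Unknown quality preset: ${quality}`);
  return {
    model: model ?? preset.model,
    steps: preset.steps, // assumption: steps stay with the preset
    width: width ?? preset.width,
    height: height ?? preset.height,
  };
}
```

For example, `resolveImageSettings({ quality: "fast", width: 640 })` keeps the `fast` model and steps but widens the image to 640 px.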
|
|
253
362
|
|
|
254
|
-
|
|
255
|
-
`--angles-360-video` generates i2v clips between consecutive angles (including last→first) and concatenates them with ffmpeg for a seamless loop.
|
|
256
|
-
`--balance` / `--balances` does not require a prompt and exits after printing current `SPARK` and `SOGNI` balances.
|
|
363
|
+
### Recommended models
|
|
257
364
|
|
|
258
|
-
|
|
365
|
+
Prefer `-Q fast|hq|pro` for images and automatic workflow routing for video. Pass `-m` only when you need a specific model family.
|
|
259
366
|
|
|
260
|
-
|
|
261
|
-
|
|
262
|
-
|
|
263
|
-
|
|
264
|
-
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
|
|
367
|
+
| Need | Recommended selector |
|
|
368
|
+
|------|----------------------|
|
|
369
|
+
| Default images | `z_image_turbo_bf16` |
|
|
370
|
+
| OpenAI GPT Image generation, editing, or strong text rendering | `gpt-image-2` |
|
|
371
|
+
| Highest-quality images | `flux2_dev_fp8` (or `-Q pro`) |
|
|
372
|
+
| Image editing | `qwen_image_edit_2511_fp8_lightning` |
|
|
373
|
+
| Photobooth face transfer | `coreml-sogniXLturbo_alpha1_ad` |
|
|
374
|
+
| Direct music generation | `ace_step_1.5_turbo` (or `--music-model turbo`) |
|
|
375
|
+
| Music with stronger lyric handling | `ace_step_1.5_sft` (or `--music-model sft`) |
|
|
376
|
+
| Text-to-video with native dialogue/audio | `ltx23-22b-fp8_t2v_distilled` |
|
|
377
|
+
| Image+audio-to-video | `ltx23-22b-fp8_ia2v_distilled` |
|
|
378
|
+
| Audio-to-video | `ltx23-22b-fp8_a2v_distilled` |
|
|
379
|
+
| Video-to-video with ControlNet | `ltx23-22b-fp8_v2v_distilled` |
|
|
380
|
+
| Seedance text-to-video | `seedance2` or `seedance2-fast` |
|
|
381
|
+
| Seedance video-to-video without ControlNet | `seedance2-v2v` |
|
|
382
|
+
| Face lip-sync with uploaded audio | `wan_v2.2-14b-fp8_s2v_lightx2v` |
|
|
268
383
|
|
|
269
|
-
|
|
384
|
+
`gpt-image-2` supports flexible OpenAI image sizes up to 3840 px on either edge, max 3:1 aspect ratio, and total pixels from 655,360 to 8,294,400; the API snaps dimensions to valid multiples of 16. For image editing with `gpt-image-2`, you can pass up to 16 context images.
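These size limits are easy to check before submitting a request. The following is a minimal sketch using only the numbers quoted above; `checkGptImage2Size` is a hypothetical helper, not part of `sogni-agent`, and it reports off-grid sizes rather than guessing how the API rounds them.

```javascript
// Limits quoted from the paragraph above.
const MAX_EDGE = 3840;
const MAX_ASPECT = 3; // long side divided by short side
const MIN_PIXELS = 655360;
const MAX_PIXELS = 8294400;

// Hypothetical pre-flight check; returns a list of problems (empty means OK).
function checkGptImage2Size(width, height) {
  const problems = [];
  if (width % 16 !== 0 || height % 16 !== 0) {
    problems.push("not a multiple of 16 (the API would snap it)");
  }
  if (Math.max(width, height) > MAX_EDGE) problems.push("edge exceeds 3840 px");
  if (Math.max(width, height) / Math.min(width, height) > MAX_ASPECT) {
    problems.push("aspect ratio exceeds 3:1");
  }
  const pixels = width * height;
  if (pixels < MIN_PIXELS || pixels > MAX_PIXELS) problems.push("total pixels out of range");
  return problems;
}
```

Note that 3840×2160 lands exactly on the upper pixel limit (8,294,400), while 512×512 falls below the lower one.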
|
|
270
385
|
|
|
271
|
-
|
|
386
|
+
Music generation uses `--music` and outputs `mp3` by default. `--audio` remains the video-reference alias for `--ref-audio`; use `--music` or `--generate-music` for direct audio-only generation.
|
|
272
387
|
|
|
273
|
-
|
|
388
|
+
---
|
|
274
389
|
|
|
275
|
-
|
|
390
|
+
## Video Sizing & Aspect Ratios
|
|
276
391
|
|
|
277
|
-
|
|
278
|
-
|
|
279
|
-
|
|
280
|
-
|
|
281
|
-
|
|
282
|
-
|
|
283
|
-
|
|
284
|
-
|
|
285
|
-
| `--workflow <type>` | Force `t2v`, `i2v`, `s2v`, `ia2v`, `a2v`, `v2v`, or animate workflows |
|
|
286
|
-
| `--persona <name>` | Use a saved persona reference |
|
|
287
|
-
| `--concat-videos <out> <clips...>` | Stitch clips locally with FFmpeg |
|
|
288
|
-
| `--json` | Return structured output for agents |
|
|
392
|
+
- **WAN models** use dimensions divisible by 16, min 480 px, max 1536 px.
|
|
393
|
+
- **LTX family** (`ltx2-*`, `ltx23-*`) uses dimensions divisible by 64. The current wrapper caps non-WAN video dimensions at 2048 px on the long side.
|
|
394
|
+
- **Seedance** runs at fixed 24 fps and supports 4–15 s durations. Other default/WAN paths support up to 10 s; LTX and WAN animate workflows support up to 20 s.
|
|
395
|
+
- The script auto-normalizes video sizes to satisfy these constraints.
|
|
396
|
+
- Use `--target-resolution <px>` for bare resolution requests like "720p" — it targets the short side and preserves the inherited aspect ratio.
|
|
397
|
+
- Natural-language aspect requests like "portrait", "square", "16:9", or "9:16" are inferred when width/height aren't explicitly set. Combined requests like "720p 9:16" keep the requested short side while applying the requested shape.
|
|
398
|
+
- For i2v (and any workflow using `--ref` / `--ref-end`), the client wrapper resizes the reference image with strict aspect-fit (`fit: inside`) and uses the *resized* dimensions as the final video size. Because that resize uses rounding, a "valid" requested size can still produce an invalid final size (example: `1024×1536` requested, but ref becomes `1024×1535`). `sogni-agent` detects this for local refs and auto-adjusts to a nearby safe size.
|
|
399
|
+
- Pass `--strict-size` to fail instead — the script will print a suggested size.
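The sizing rules in this list can be sketched as three small helpers. This is an illustration under stated assumptions, not the wrapper's actual code: the names are hypothetical, rounding to the nearest multiple is an assumption, and since no LTX minimum is documented the sketch uses one multiple (64 px) as a floor.

```javascript
// Constraints quoted from the list above.
const FAMILY_RULES = {
  wan: { multiple: 16, min: 480, max: 1536 },
  ltx: { multiple: 64, max: 2048 }, // no documented minimum
};

// Snap one dimension to the nearest allowed multiple, then clamp to the family range.
function snapDimension(px, family) {
  const rule = FAMILY_RULES[family];
  const min = rule.min ?? rule.multiple; // assumption for families without a stated minimum
  const snapped = Math.round(px / rule.multiple) * rule.multiple;
  return Math.min(rule.max, Math.max(min, snapped));
}

// Short-side targeting: keep the aspect ratio and set the short side to `shortPx`.
function targetShortSide(aspectW, aspectH, shortPx, family) {
  const landscape = aspectW >= aspectH;
  const longPx = shortPx * (landscape ? aspectW / aspectH : aspectH / aspectW);
  const short = snapDimension(shortPx, family);
  const long = snapDimension(longPx, family);
  return landscape ? { width: long, height: short } : { width: short, height: long };
}

// Aspect-fit a reference image inside the requested box, rounding each side.
function fitInside(refW, refH, boxW, boxH) {
  const scale = Math.min(boxW / refW, boxH / refH);
  return { width: Math.round(refW * scale), height: Math.round(refH * scale) };
}
```

A 1000×1499 reference fitted inside a requested 1024×1536 box comes out at 1024×1535, which is no longer divisible by 16. This is the rounding case described above, and re-snapping the height with `snapDimension(1535, "wan")` brings it back to 1536.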
|
|
289
400
|
|
|
290
|
-
|
|
401
|
+
V2V defaults mirror Sogni Chat workflow tuning: `canny`, `pose`, and `depth` use ControlNet strength `0.85` with detailer assist; `detailer` uses strength `1.0`. Use `-m seedance2-v2v` for Seedance V2V without ControlNet. Seedance accepts public HTTPS image, video, and audio references that pass CLI URL safety checks; localhost and private-network URLs are rejected before forwarding. Audio references must be paired with an image or video reference.
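The kind of URL screening described here can be sketched as follows. This is a hedged illustration only: the real checks live in the package's `ssrf-guard.mjs`, which is not shown in this diff, and `isPublicHttpsUrl` is a hypothetical name. The sketch inspects only the literal hostname, so a public-looking name that resolves to a private address would not be caught.

```javascript
// Hypothetical pre-check for reference URLs; not the CLI's actual guard.
function isPublicHttpsUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host.startsWith("[")) return false; // IPv6 literals are rejected wholesale in this sketch

  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (m) {
    const a = Number(m[1]);
    const b = Number(m[2]);
    if (a === 0 || a === 10 || a === 127) return false; // unspecified, private, loopback
    if (a === 169 && b === 254) return false; // link-local
    if (a === 172 && b >= 16 && b <= 31) return false; // private
    if (a === 192 && b === 168) return false; // private
  }
  return true;
}
```

With this sketch, `https://cdn.example.com/sketch.png` passes, while `http://` URLs, `https://localhost/...`, and `https://192.168.1.5/...` are rejected.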
|
|
291
402
|
|
|
292
|
-
|
|
403
|
+
---
|
|
293
404
|
|
|
294
|
-
|
|
295
|
-
|--------|-------|-------|------|-------|
|
|
296
|
-
| `fast` | z_image_turbo_bf16 | 8 | 512x512 | ~5-10s |
|
|
297
|
-
| `hq` | z_image_turbo_bf16 | default | 768x768 | ~10-15s |
|
|
298
|
-
| `pro` | flux2_dev_fp8 | 40 | 1024x1024 | ~2min |
|
|
405
|
+
## LTX-2.3 Prompting Guide
|
|
299
406
|
|
|
300
|
-
|
|
407
|
+
When you use `ltx23-22b-fp8_t2v_distilled`, do **not** feed it short tag prompts like `"cinematic drone shot over tropical cliffs"`. LTX-2.3 renders more reliably from a dense natural-language scene description.
|
|
301
408
|
|
|
302
|
-
|
|
409
|
+
- Write one unbroken paragraph — no line breaks, bullets, headers, or tag blocks.
|
|
410
|
+
- Use 4–8 flowing present-tense sentences describing one continuous shot, not a montage.
|
|
411
|
+
- Start with shot scale and scene identity, then cover environment, time of day, textures, and named light sources.
|
|
412
|
+
- Keep characters and objects concrete and stable; describe one main action thread from start to finish.
|
|
413
|
+
- For dialogue, include the exact spoken words in double quotes with the speaker and delivery identified inline.
|
|
414
|
+
- Express mood through visible behavior, motion, and sound cues — not vague adjectives.
|
|
415
|
+
- Use positive phrasing. Avoid script formatting, negative prompts, on-screen text/logo requests, and filler words like "beautiful" or "nice".
|
|
416
|
+
- Match scene density to clip length. For short clips, describe one main beat, not several actions.
|
|
303
417
|
|
|
304
|
-
|
|
418
|
+
**Example rewrite:**
|
|
305
419
|
|
|
306
|
-
```
|
|
307
|
-
|
|
308
|
-
sogni-agent -n 3 "a {red|blue|green} car"
|
|
420
|
+
```text
|
|
421
|
+
User ask: "make a 4k video of a woman in a neon alley"
|
|
309
422
|
|
|
310
|
-
|
|
311
|
-
sogni-agent -n 4 "a {cat|dog} in a {garden|kitchen}"
|
|
312
|
-
# → "a cat in a garden", "a dog in a kitchen", "a cat in a garden", "a dog in a kitchen"
|
|
423
|
+
LTX-2.3 prompt: "A medium cinematic shot frames a woman in her 30s standing in a rain-soaked neon alley at night, violet and amber signs reflecting across the wet pavement while warm steam drifts from street vents. She wears a dark trench coat with damp strands of black hair clinging near her cheek as light glances across the fabric texture and the brick walls behind her. She turns toward the camera and steps forward with measured focus, one hand tightening around the strap of her bag while rain taps softly on the metal fire escape and a distant train hum rolls through the block. The camera performs a slow push-in as her jaw sets and her breathing steadies, maintaining smooth stabilized motion and a tense urban-thriller mood."
|
|
313
424
|
```
|
|
314
425
|
|
|
315
|
-
|
|
426
|
+
---
|
|
316
427
|
|
|
317
|
-
|
|
428
|
+
## Photobooth (Face Transfer)
|
|
318
429
|
|
|
319
|
-
|
|
430
|
+
Generate stylized portraits from a face photo using InstantID ControlNet:
|
|
320
431
|
|
|
321
432
|
```bash
|
|
322
|
-
sogni-agent --
|
|
433
|
+
sogni-agent --photobooth --ref face.jpg "80s fashion portrait"
|
|
434
|
+
sogni-agent --photobooth --ref face.jpg -n 4 "LinkedIn professional headshot"
|
|
323
435
|
```
|
|
324
436
|
|
|
325
|
-
|
|
437
|
+
Uses SDXL Turbo (`coreml-sogniXLturbo_alpha1_ad`) at 1024×1024 by default. The face image is passed via `--ref` and styled by the prompt. Cannot be combined with `--video` or `-c` / `--context`.
|
|
438
|
+
|
|
439
|
+
Multi-angle mode (`--multi-angle` / `--angles-360`) auto-builds the `<sks>` prompt and applies the `multiple_angles` LoRA. `--angles-360-video` generates i2v clips between consecutive angles (including last → first) and concatenates them with `ffmpeg` into a seamless loop.
|
|
440
|
+
|
|
441
|
+
`--balance` / `--balances` does not require a prompt and prints current `SPARK` and `SOGNI` balances before exiting.
|
|
442
|
+
|
|
443
|
+
---
|
|
444
|
+
|
|
445
|
+
## Personas, Memory, and Personality
|
|
326
446
|
|
|
327
447
|
### Personas
|
|
328
448
|
|
|
@@ -335,20 +455,20 @@ sogni-agent --persona-add "Mark" --ref face.jpg --relationship self --descriptio
|
|
|
335
455
|
# Add with voice clip for video voice cloning
|
|
336
456
|
sogni-agent --persona-add "Sarah" --ref sarah.jpg --relationship partner --voice-clip voice.webm
|
|
337
457
|
|
|
338
|
-
# Generate
|
|
458
|
+
# Generate using a persona (auto-injects photo as context)
|
|
339
459
|
sogni-agent --persona "Mark" -o hero.png "superhero in dramatic lighting"
|
|
340
460
|
|
|
341
|
-
#
|
|
342
|
-
sogni-agent --video --persona "Sarah"
|
|
461
|
+
# Video using a persona photo + saved voice identity
|
|
462
|
+
sogni-agent --video --persona "Sarah" 'SARAH: "This is my voice."'
|
|
343
463
|
|
|
344
464
|
# List / remove
|
|
345
465
|
sogni-agent --persona-list
|
|
346
466
|
sogni-agent --persona-remove "Mark"
|
|
347
467
|
```
|
|
348
468
|
|
|
349
|
-
|
|
469
|
+
Stored at `~/.config/sogni/personas/`. First-person references such as "me" / "myself" auto-resolve to the `self` persona, and relationship phrases such as "my wife" resolve to `partner`, etc.
|
|
350
470
|
|
|
351
|
-
### Memory (
|
|
471
|
+
### Memory (persistent preferences)
|
|
352
472
|
|
|
353
473
|
Save preferences that agents respect across sessions:
|
|
354
474
|
|
|
@@ -361,9 +481,9 @@ sogni-agent --memory-remove preferred_style
|
|
|
361
481
|
|
|
362
482
|
Stored at `~/.config/sogni/memories.json`.
|
|
363
483
|
|
|
364
|
-
### Personality (
|
|
484
|
+
### Personality (custom agent instructions)
|
|
365
485
|
|
|
366
|
-
|
|
486
|
+
Tell the agent how it should behave:
|
|
367
487
|
|
|
368
488
|
```bash
|
|
369
489
|
sogni-agent --personality-set "Be concise, always use cinematic lighting"
|
|
@@ -373,24 +493,115 @@ sogni-agent --personality-clear
|
|
|
373
493
|
|
|
374
494
|
Stored at `~/.config/sogni/personality.txt`.
|
|
375
495
|
|
|
376
|
-
|
|
496
|
+
---
|
|
377
497
|
|
|
378
|
-
|
|
498
|
+
## Hosted API Modes
|
|
379
499
|
|
|
380
|
-
|
|
381
|
-
|
|
382
|
-
|
|
383
|
-
|
|
384
|
-
|
|
385
|
-
|
|
386
|
-
|
|
387
|
-
|
|
388
|
-
|
|
389
|
-
|
|
390
|
-
|
|
391
|
-
|
|
392
|
-
|
|
500
|
+
Hosted API modes require `SOGNI_API_KEY`.
|
|
501
|
+
|
|
502
|
+
- **`--api-chat`** targets `/v1/chat/completions` with Sogni creative-agent tools — best for text-first natural-language workflows. The CLI sanitizes prompt-injection markers before forwarding messages and can use the current server-side creative-agent media tools, including video extension, segment replacement, overlays, subtitles, stitch/orbit/dance composition, and generated artifact indexing. Tune with `--api-tools creative-agent|creative-tools|none`, `--no-api-tool-execution`, `--llm-model`, and `--system`.
|
|
503
|
+
- **Sogni Intelligence controls** include `--task-profile general|coding|reasoning`, `--max-tokens`, and `--thinking` / `--no-thinking`, which forward to `/v1/chat/completions` as `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking`. Use `--list-api-models` or `--get-api-model <id>` to inspect `/v1/models`.
|
|
504
|
+
- **`--api-workflow`** targets `/v1/creative-agent/workflows` for durable, async workflow records with event streaming and cancellation. Supported kinds: `image-to-video`, `hosted-tool-sequence`, `creative-plan`, and `storyboard-video`.
|
|
505
|
+
- **`--api-workflow creative-plan`** forwards a shared `CreativeWorkflowPlan` JSON object (`{ title?, steps: [...] }`) to the API as `kind: "creative_plan"`. Compilation, hosted-tool argument validation, and persistence happen in `../sogni-api` through `@sogni/creative-agent`; the public skill does not duplicate that compiler. Use this when you need exact shared-plan behavior such as repeated `replace_video_segment` steps with `replacementStartSeconds` / `replacementEndSeconds` for interleaved video slices.
|
|
506
|
+
- **`--api-workflow storyboard-video`** generates a storyline, creates a single GPT Image 2 storyboard sheet, then passes that artifact into Seedance as the video reference. The `-Q fast|hq|pro` preset maps to GPT Image 2 low/medium/high quality for that storyboard sheet.
|
|
507
|
+
- **Media references** from `-c`, `--ref`, `--ref-end`, `--ref-audio`, `--reference-audio-identity`, and `--ref-video` are forwarded as `media_references` metadata in hosted API requests. API chat also attaches image refs as vision inputs. Local file references are uploaded to Sogni media storage first, then forwarded as retrievable URLs so durable executors do not depend on `data:` URI support. Durable workflow JSON can bind those references into step arguments with `sourceStepId: "$input_media"`. Use direct CLI mode for private media that must not leave the local machine.
|
|
508
|
+
- **Cost controls** use `--workflow-max-cost <n>` to reject workflow starts above a capacity-unit ceiling, and `--confirm-cost` / `--no-confirm-cost` to forward explicit billing confirmation.
|
|
509
|
+
- Manage runs with `--watch-workflow`, `--workflow-events`, `--stream-workflow`, `--list-workflows`, `--get-workflow`, and `--cancel-workflow`. Use `--workflow-input` to provide exact hosted workflow JSON.
|
|
510
|
+
- **Replay records** use `/v1/replay/records`: `--list-replays [limit]`, `--get-replay <runId>`, and `--ingest-replay <json|path|@path>` expose redacted RunRecord storage for Sogni Intelligence replay/debug viewers.
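The flag-to-field mapping described for the Sogni Intelligence controls can be sketched as below. The `buildChatControls` helper and its option names are hypothetical, and leaving unset fields out of the body is an assumption. Only the request field names `task_profile`, `max_tokens`, and `chat_template_kwargs.enable_thinking` come from the list above.

```javascript
// Hypothetical helper: translate parsed CLI options into request-body fields.
function buildChatControls({ taskProfile, maxTokens, thinking } = {}) {
  const body = {};
  if (taskProfile !== undefined) body.task_profile = taskProfile; // --task-profile
  if (maxTokens !== undefined) body.max_tokens = maxTokens; // --max-tokens
  if (thinking !== undefined) {
    // --thinking / --no-thinking
    body.chat_template_kwargs = { enable_thinking: thinking };
  }
  return body;
}

const controls = buildChatControls({ taskProfile: "coding", maxTokens: 512, thinking: false });
console.log(JSON.stringify(controls));
```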
|
|
511
|
+
|
|
512
|
+
Override the API origin with `--api-base-url`, `SOGNI_API_BASE_URL`, or `SOGNI_REST_ENDPOINT`.
|
|
513
|
+
Hosted API credentials are only sent to `https://api.sogni.ai` by default. Add trusted custom
|
|
514
|
+
hosts with `SOGNI_API_ALLOWED_HOSTS`; loopback or non-HTTPS local testing requires
|
|
515
|
+
`SOGNI_ALLOW_UNSAFE_API_BASE_URL=1`.
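The credential-forwarding rule above can be illustrated with a small host check. This is a sketch under stated assumptions, not the CLI's actual logic: the comma-separated format of `SOGNI_API_ALLOWED_HOSTS`, the `mayReceiveCredentials` and `parseHostList` names, and the exact scope of the unsafe override are all hypothetical.

```javascript
// Assumption: the allowlist is a comma-separated list of hostnames.
function parseHostList(value = "") {
  return value.split(",").map((h) => h.trim().toLowerCase()).filter(Boolean);
}

function isLoopback(hostname) {
  return hostname === "localhost" || hostname === "127.0.0.1" || hostname === "[::1]";
}

// Hypothetical check: may the API key be sent to this base URL?
function mayReceiveCredentials(baseUrl, env = {}) {
  const url = new URL(baseUrl);
  const host = url.hostname.toLowerCase();
  const unsafeOk = env.SOGNI_ALLOW_UNSAFE_API_BASE_URL === "1";
  // Loopback or non-HTTPS origins need the explicit unsafe opt-in.
  if (url.protocol !== "https:" || isLoopback(host)) return unsafeOk;
  const allowed = new Set(["api.sogni.ai", ...parseHostList(env.SOGNI_API_ALLOWED_HOSTS)]);
  return allowed.has(host);
}

console.log(mayReceiveCredentials("https://api.sogni.ai"));
console.log(mayReceiveCredentials("https://staging.example.com"));
```

The point of the design is that a custom base URL never receives the key by accident: it has to be allowlisted, or explicitly marked unsafe for local testing.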
|
|
516
|
+
|
|
517
|
+
> The public skill consumes generated storyboard adapters from `../sogni-creative-agent`: `compileForModel()` now works in the bundled runtime for Seedance, GPT Image 2, LTX-2.3, and WAN storyboard stages.
|
|
518
|
+
|
|
519
|
+
---
|
|
520
|
+
|
|
521
|
+
## Dynamic Prompt Variations
|
|
522
|
+
|
|
523
|
+
Generate diverse images in a single call with `{option1|option2|option3}` syntax:
|
|
524
|
+
|
|
525
|
+
```bash
|
|
526
|
+
# 3 images: "a red car", "a blue car", "a green car"
|
|
527
|
+
sogni-agent -n 3 "a {red|blue|green} car"
|
|
528
|
+
|
|
529
|
+
# Multiple groups cycle independently
|
|
530
|
+
sogni-agent -n 4 "a {cat|dog} in a {garden|kitchen}"
|
|
531
|
+
# -> "a cat in a garden", "a dog in a kitchen", "a cat in a garden", "a dog in a kitchen"
|
|
532
|
+
```
|
|
533
|
+
|
|
534
|
+
Each group's options cycle in order, one per image, and wrap around when `-n` exceeds the number of options in a group. Without `{...}` syntax, `-n` produces multiple images with the same prompt.
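One way to read the cycling rule is that image `i` takes option `i mod k` from every group of `k` options. A minimal sketch of that reading follows; `expandPrompt` and `expandAll` are hypothetical names rather than the CLI's implementation, and nested braces or escaping are not handled.

```javascript
// Hypothetical expansion: image `index` picks option (index mod k) from each {a|b|...} group.
function expandPrompt(template, index) {
  return template.replace(/\{([^{}]+)\}/g, (_, group) => {
    const options = group.split("|");
    return options[index % options.length];
  });
}

// Build the prompt for each of `count` images.
function expandAll(template, count) {
  return Array.from({ length: count }, (_, i) => expandPrompt(template, i));
}

console.log(expandAll("a {cat|dog} in a {garden|kitchen}", 4));
// -> [ 'a cat in a garden', 'a dog in a kitchen', 'a cat in a garden', 'a dog in a kitchen' ]
```

Because every group advances with the same image index, two groups of equal size stay in lockstep, which matches the four prompts listed in the example above.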
|
|
535
|
+
|
|
536
|
+
---
|
|
537
|
+
|
|
538
|
+
## Token Auto-Fallback
|
|
539
|
+
|
|
540
|
+
Use `--token-type auto` to retry with SOGNI tokens when SPARK is insufficient:
|
|
541
|
+
|
|
542
|
+
```bash
|
|
543
|
+
sogni-agent --token-type auto "a dragon eating tacos"
|
|
544
|
+
```
|
|
545
|
+
|
|
546
|
+
The CLI tries SPARK first (free daily tokens) and falls back to SOGNI if the SPARK balance is too low.
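The fallback order can be sketched as a small retry wrapper. Everything here is illustrative: `renderWithFallback`, the `INSUFFICIENT_BALANCE` error code, and the lowercase `"spark"` / `"sogni"` values are assumptions, and the renderer is a stand-in rather than a real Sogni client call.

```javascript
// Hypothetical wrapper: in "auto" mode, try SPARK and retry once with SOGNI.
async function renderWithFallback(render, tokenType = "auto") {
  if (tokenType !== "auto") return render(tokenType);
  try {
    return await render("spark");
  } catch (err) {
    // Assumed error code; only a balance failure triggers the fallback.
    if (err.code !== "INSUFFICIENT_BALANCE") throw err;
    return render("sogni");
  }
}

// Demo with a fake renderer whose SPARK balance is always too low.
const attempts = [];
const fakeRender = async (tokenType) => {
  attempts.push(tokenType);
  if (tokenType === "spark") {
    throw Object.assign(new Error("not enough SPARK"), { code: "INSUFFICIENT_BALANCE" });
  }
  return { tokenType };
};

const demo = renderWithFallback(fakeRender).then((result) => {
  console.log(attempts.join(" -> "), "| used:", result.tokenType);
  return result;
});
```

Errors other than an insufficient balance are rethrown, so a failed render is not silently retried on a different token.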
|
|
547
|
+
|
|
548
|
+
---
|
|
549
|
+
|
|
550
|
+
## Error Reporting & Output
|
|
551
|
+
|
|
552
|
+
- **Exit codes:** failures exit with a non-zero code and write a human-readable message to stderr.
|
|
553
|
+
- **Structured output:** add `--json` when an agent needs machine-parseable success/error data, or `--last` to inspect the last render. JSON failures include canonical `errorType`, `errorCategory`, and `retryable` fields where the shared runtime can classify the error.
|
|
554
|
+
- **Output files:** use `-o <path>` to save locally; otherwise the CLI prints a result URL.
|
|
555
|
+
- **Quiet mode:** `-q` / `--quiet` suppresses progress output without changing exit semantics.
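An agent wrapping the CLI might combine the exit code with the `--json` payload to decide whether to retry. The sketch below assumes the JSON payload is captured from stdout and uses a hypothetical `classifyResult` helper. The field names `errorType`, `errorCategory`, and `retryable` come from the list above, while the sample values are invented for illustration.

```javascript
// Hypothetical helper: decide the next step from an exit code and captured --json output.
function classifyResult(exitCode, stdout) {
  if (exitCode === 0) return { action: "done" };
  let payload;
  try {
    payload = JSON.parse(stdout);
  } catch {
    // Output was not JSON (for example, --json was not passed).
    return { action: "fail", reason: "unparseable output" };
  }
  if (payload === null || typeof payload !== "object") {
    return { action: "fail", reason: "unparseable output" };
  }
  const reason = payload.errorType || payload.errorCategory || "unknown";
  return { action: payload.retryable === true ? "retry" : "fail", reason };
}

// Invented sample payload; real errorType values may differ.
const sample = JSON.stringify({ errorType: "timeout", errorCategory: "network", retryable: true });
console.log(classifyResult(1, sample));
```

The `retryable` flag is only present where the shared runtime can classify the error, so the sketch retries only when it is explicitly `true`.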
|
|
556
|
+
|
|
557
|
+
---
|
|
558
|
+
|
|
559
|
+
## For AI Agents
|
|
560
|
+
|
|
561
|
+
This skill is designed to be loaded into agent runtimes as a first-class capability.
|
|
562
|
+
|
|
563
|
+
1. **Behavior contract — [`SKILL.md`](./SKILL.md)**
|
|
564
|
+
The canonical instructions for how the agent should call `sogni-agent`. Load this as the skill source.
|
|
565
|
+
2. **Install/setup hints — [`llm.txt`](./llm.txt)**
|
|
566
|
+
A condensed install/setup reference for agents that fetch `llm.txt` over HTTPS:
|
|
567
|
+
`https://raw.githubusercontent.com/Sogni-AI/sogni-creative-agent-skill/main/llm.txt`
|
|
568
|
+
3. **OpenClaw manifest — [`openclaw.plugin.json`](./openclaw.plugin.json)**
|
|
569
|
+
Plugin metadata, config schema, and defaults for OpenClaw-aware runtimes.
|
|
570
|
+
4. **Structured output — `--json`**
|
|
571
|
+
Use `--json` for machine-readable success/error payloads. Use `--last` to read the previous render's metadata.
|
|
572
|
+
5. **Agent-safe install/upgrade**
|
|
573
|
+
Prefer the `npm install -g` and `git -C "$DEST" pull --ff-only` paths above. Avoid generating clone-or-pull bootstrap scripts with `set -e`, `bash -c`, `sh -c`, or inline repository URLs — agent sandboxes correctly route those through approval and the install will stall.
|
|
574
|
+
6. **SSRF / URL safety**
|
|
575
|
+
The CLI runs an SSRF guard ([`ssrf-guard.mjs`](./ssrf-guard.mjs)) before forwarding any HTTP(S) reference to hosted models. Localhost and private-network URLs are rejected; only public HTTPS references are forwarded as Seedance multimodal context.
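The kind of check described in item 6 can be sketched as follows. This is not the code in `ssrf-guard.mjs`: the `isForwardableReference` and `isPrivateIPv4` names are hypothetical, the sketch inspects only the literal hostname (no DNS resolution), and rejecting every IPv6 literal is a deliberately conservative assumption.

```javascript
// Loopback, private, link-local, and "this network" IPv4 ranges.
function isPrivateIPv4(host) {
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || a === 10 || a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Hypothetical guard: accept only public HTTPS references.
function isForwardableReference(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  if (host.startsWith("[")) return false; // conservative: reject IPv6 literals
  return !isPrivateIPv4(host);
}

console.log(isForwardableReference("https://example.com/face.jpg"));
console.log(isForwardableReference("https://192.168.1.10/face.jpg"));
```

A production guard would typically also resolve the hostname and re-check the resulting addresses, since a public-looking name can point at a private IP.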
|
|
576
|
+
|
|
577
|
+
---
|
|
578
|
+
|
|
579
|
+
## Development
|
|
580
|
+
|
|
581
|
+
The public skill keeps CLI/runtime glue in this repo, but Sogni model routing, video workflow defaults, quality tiers, and prompt guardrails are generated from the private `sogni-creative-agent` repo. The generated runtime is committed at [`generated/creative-agent-runtime.mjs`](./generated/creative-agent-runtime.mjs) so public installs do not need access to the private repo.
|
|
582
|
+
|
|
583
|
+
Run the test suite:
|
|
584
|
+
|
|
585
|
+
```bash
|
|
586
|
+
npm test
|
|
587
|
+
```
|
|
588
|
+
|
|
589
|
+
`npm test` first runs `npm run check:creative-agent-runtime`, which regenerates the runtime file and fails if it differs from the committed copy.
|
|
590
|
+
|
|
591
|
+
With both repos checked out as siblings, refresh the generated runtime before publishing:
|
|
592
|
+
|
|
593
|
+
```bash
|
|
594
|
+
npm run sync:creative-agent-runtime
|
|
595
|
+
```
|
|
596
|
+
|
|
597
|
+
Reusable workflow rules should be added to `../sogni-creative-agent` first, then synced here. Keep storyboard planning, tool argument validation, prompt linting, typed media turn intent, and typed repair/control semantics aligned with `sogni-chat`, `sogni-client`, and `sogni-api` hosted chat/workflow endpoints rather than recreating skill-only regex guards. Prefer generated or copied shared helpers for hosted workflow compilation, schema argument validation, `CreativeTurnPlannerFields` / `classifyMediaTurnIntent()` media-routing contracts, repair-control decisions, and guard telemetry summaries over skill-local guard code — this keeps public-agent behavior close to `/v1/chat/completions` and `/v1/creative-agent/workflows`.
|
|
598
|
+
|
|
599
|
+
Public-skill regex should stay limited to CLI argument/fact extraction such as file paths, URLs, extensions, dimensions, durations, and explicit positions. Hosted-style decisions such as latest-video continuation, uploaded-video modification, image-selection waits, stitch-after-batch state, and repair/control routing belong upstream in typed planner/runtime fields before they are synced here.
|
|
600
|
+
|
|
601
|
+
Issues and feature requests: [github.com/Sogni-AI/sogni-creative-agent-skill/issues](https://github.com/Sogni-AI/sogni-creative-agent-skill/issues).
|
|
602
|
+
|
|
603
|
+
---
|
|
393
604
|
|
|
394
605
|
## License
|
|
395
606
|
|
|
396
|
-
MIT
|
|
607
|
+
[MIT](./LICENSE) © Sogni AI
|