pixverse-ai-cli 1.1.12 → 1.2.1

package/README.md CHANGED
@@ -1,10 +1,10 @@
1
1
  # PixVerse CLI
2
2
 
3
- The official command-line interface (CLI) for [PixVerse](https://pixverse.ai) — create AI-powered videos and images directly from your terminal.
3
+ The official command-line interface (CLI) for [PixVerse](https://pixverse.ai) — create AI-powered videos, images, and audio directly from your terminal.
4
4
 
5
5
  ## What is PixVerse?
6
6
 
7
- PixVerse is an AI-powered creative platform that generates high-quality videos and images from text prompts or reference images. It supports a wide range of creative workflows including text-to-video, image-to-video, text-to-image, video transitions, lip-sync speech, sound effects, templates/effects, and more.
7
+ PixVerse is an AI-powered creative platform that generates high-quality videos, images, and audio from text prompts or reference images. It supports a wide range of creative workflows including text-to-video, image-to-video, text-to-image, video transitions, text-to-speech (voice synthesis), music generation, templates/effects, and more.
8
8
 
9
9
  ## What is PixVerse CLI?
10
10
 
@@ -13,12 +13,12 @@ PixVerse CLI is essentially **a UI-free version of the PixVerse website**. All f
13
13
  It is designed for:
14
14
 
15
15
  - **AI agents** — structured JSON output, deterministic exit codes, and pipeable commands make it a perfect tool for autonomous workflows (e.g. Claude Code, Cursor, Codex, LangChain, custom agents).
16
- - **Developers & power users** — scriptable video/image generation without leaving the terminal.
16
+ - **Developers & power users** — scriptable video/image/audio generation without leaving the terminal.
17
17
  - **Automation** — integrate AI content generation into CI/CD pipelines, batch processing scripts, or content production workflows.
18
18
 
19
19
  ## Subscription Required
20
20
 
21
- PixVerse CLI uses the same credit system as the website — generating videos and images consumes credits from your PixVerse account balance with the same pricing. To prevent abuse, **PixVerse CLI is currently available to subscribed users only**. For details on subscription plans and member benefits, see the [PixVerse Subscribe](https://app.pixverse.ai/subscribe) page.
21
+ PixVerse CLI uses the same credit system as the website — generating videos, images, and audio consumes credits from your PixVerse account balance with the same pricing. To prevent abuse, **PixVerse CLI is currently available to subscribed users only**. For details on subscription plans and member benefits, see the [PixVerse Subscribe](https://app.pixverse.ai/subscribe) page.
22
22
 
23
23
  ## Installation
24
24
 
@@ -55,56 +55,82 @@ This opens a browser where you confirm the authorization. You can also copy the
55
55
 
56
56
  ### Video Models (`--model <value>`)
57
57
 
58
- | Model | `--model` value | Quality | Duration | Aspect Ratio |
59
- |:---|:---|:---|:---|:---|
60
- | PixVerse V6 *(default)* | `v6` | `360p` `540p` `720p` `1080p` | `1`–`15`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` `21:9` |
61
- | PixVerse C1 | `pixverse-c1` | `360p` `540p` `720p` `1080p` | `1`–`15`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` |
62
- | Seedance 2.0 Standard | `seedance-2.0-standard` | `480p` `720p` `1080p` | `4`–`15`s | `16:9` `4:3` `1:1` `3:4` `9:16` `21:9` |
63
- | Seedance 2.0 Fast | `seedance-2.0-fast` | `480p` `720p` | `4`–`15`s | `16:9` `4:3` `1:1` `3:4` `9:16` `21:9` |
64
- | Happy Horse 1.0 | `happyhorse-1.0` | `720p` `1080p` | `3`–`15`s | `16:9` `9:16` `1:1` `4:3` `3:4` |
65
- | Kling O3 Pro | `kling-o3-pro` | `720p` | `3`–`15`s | `16:9` `9:16` `1:1` |
66
- | Kling O3 Standard | `kling-o3-standard` | `720p` | `3`–`15`s | `16:9` `9:16` `1:1` |
67
- | Kling 3.0 Pro | `kling-3.0-pro` | `720p` | `3`–`15`s | `16:9` `9:16` `1:1` |
68
- | Kling 3.0 Standard | `kling-3.0-standard` | `720p` | `3`–`15`s | `16:9` `9:16` `1:1` |
69
- | Grok Imagine | `grok-imagine` | `480p` `720p` | `1`–`15`s | `16:9` `4:3` `1:1` `9:16` `3:4` `3:2` `2:3` |
70
- | Veo 3.1 Lite | `veo-3.1-lite` | `720p` `1080p` | `4` `6` `8`s | `16:9` `9:16` |
71
- | Veo 3.1 Standard | `veo-3.1-standard` | `720p` `1080p` `2160p` | `4` `6` `8`s | `16:9` `9:16` |
72
- | Veo 3.1 Fast | `veo-3.1-fast` | `720p` `1080p` `2160p` | `4` `6` `8`s | `16:9` `9:16` |
73
- | Sora 2 Pro | `sora-2-pro` | `720p` `1080p` | `4` `8` `12`s | `16:9` `9:16` |
74
- | Sora 2 | `sora-2` | `720p` | `4` `8` `12`s | `16:9` `9:16` |
75
- | PixVerse v5.6 | `v5.6` | `360p` `480p` `540p` `720p` `1080p` | `1`–`10`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` |
76
- | PixVerse v5.5 | `v5.5` | `360p` `480p` `540p` `720p` `1080p` | `1`–`10`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` |
77
- | PixVerse v5 | `v5` | `360p` `480p` `540p` `720p` `1080p` | `1`–`10`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` |
58
+ | Model | `--model` value | Quality | Duration | Aspect Ratio |
59
+ | :---------------------- | :---------------------- | :---------------------------------- | :------------ | :------------------------------------------------- |
60
+ | PixVerse V6 _(default)_ | `v6` | `360p` `540p` `720p` `1080p` | `1`–`15`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` `21:9` |
61
+ | PixVerse C1 | `pixverse-c1` | `360p` `540p` `720p` `1080p` | `1`–`15`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` |
62
+ | Seedance 2.0 Standard | `seedance-2.0-standard` | `480p` `720p` `1080p` | `4`–`15`s | `16:9` `4:3` `1:1` `3:4` `9:16` `21:9` |
63
+ | Seedance 2.0 Fast | `seedance-2.0-fast` | `480p` `720p` | `4`–`15`s | `16:9` `4:3` `1:1` `3:4` `9:16` `21:9` |
64
+ | Happy Horse 1.0 | `happyhorse-1.0` | `720p` `1080p` | `3`–`15`s | `16:9` `9:16` `1:1` `4:3` `3:4` |
65
+ | Kling O3 Pro | `kling-o3-pro` | `720p` | `3`–`15`s | `16:9` `9:16` `1:1` |
66
+ | Kling O3 Standard | `kling-o3-standard` | `720p` | `3`–`15`s | `16:9` `9:16` `1:1` |
67
+ | Kling 3.0 Pro | `kling-3.0-pro` | `720p` | `3`–`15`s | `16:9` `9:16` `1:1` |
68
+ | Kling 3.0 Standard | `kling-3.0-standard` | `720p` | `3`–`15`s | `16:9` `9:16` `1:1` |
69
+ | Grok Imagine 1.5 | `grok-imagine-1.5` | `480p` `720p` | `1`–`15`s | _from image_ |
70
+ | Grok Imagine | `grok-imagine` | `480p` `720p` | `1`–`15`s | `16:9` `4:3` `1:1` `9:16` `3:4` `3:2` `2:3` |
71
+ | Veo 3.1 Lite | `veo-3.1-lite` | `720p` `1080p` | `4` `6` `8`s | `16:9` `9:16` |
72
+ | Veo 3.1 Standard | `veo-3.1-standard` | `720p` `1080p` `2160p` | `4` `6` `8`s | `16:9` `9:16` |
73
+ | Veo 3.1 Fast | `veo-3.1-fast` | `720p` `1080p` `2160p` | `4` `6` `8`s | `16:9` `9:16` |
74
+ | Sora 2 Pro | `sora-2-pro` | `720p` `1080p` | `4` `8` `12`s | `16:9` `9:16` |
75
+ | Sora 2 | `sora-2` | `720p` | `4` `8` `12`s | `16:9` `9:16` |
76
+ | PixVerse v5.6 | `v5.6` | `360p` `480p` `540p` `720p` `1080p` | `1`–`10`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` |
77
+ | PixVerse v5.5 | `v5.5` | `360p` `480p` `540p` `720p` `1080p` | `1`–`10`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` |
78
+ | PixVerse v5 | `v5` | `360p` `480p` `540p` `720p` `1080p` | `1`–`10`s | `16:9` `4:3` `1:1` `3:4` `9:16` `3:2` `2:3` |
79
+
80
+ > Grok Imagine 1.5 is image-to-video only — it requires `--image` and derives its aspect ratio from the input image (the `--aspect-ratio` flag is ignored).
78
81
 
79
82
  > Not all models support all creation modes. See the per-mode support matrix below.
80
83
 
81
84
  #### Per-mode Model Support
82
85
 
83
- | Creation mode | Supported `--model` values |
84
- |:---|:---|
85
- | `create video` (text-to-video / image-to-video) | `v6` `pixverse-c1` `seedance-2.0-standard` `seedance-2.0-fast` `happyhorse-1.0` `kling-o3-pro` `kling-o3-standard` `kling-3.0-pro` `kling-3.0-standard` `grok-imagine` `veo-3.1-lite` `veo-3.1-standard` `veo-3.1-fast` `sora-2-pro` `sora-2` `v5.6` |
86
- | `create extend` | `v6` `grok-imagine` |
87
- | `create reference` (multi-subject fusion) | `v6` `pixverse-c1` `seedance-2.0-standard` `seedance-2.0-fast` `kling-o3-pro` `kling-o3-standard` `grok-imagine` `v5.6` |
88
- | `create transition` (2 frames) | `v6` `pixverse-c1` `seedance-2.0-standard` `seedance-2.0-fast` `kling-o3-pro` `kling-o3-standard` `kling-3.0-pro` `kling-3.0-standard` `veo-3.1-lite` `veo-3.1-standard` `veo-3.1-fast` `v5.6` |
89
- | `create transition` (3+ frames) | `v5` |
90
- | `create modify` | `v5.5` |
91
- | `create motion-control` | `v5.6` |
92
- | `create speech` (lip sync) | `v5` |
86
+ | Creation mode | Supported `--model` values |
87
+ | :---------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
88
+ | `create video` (text-to-video / image-to-video) | `v6` `pixverse-c1` `seedance-2.0-standard` `seedance-2.0-fast` `happyhorse-1.0` `kling-o3-pro` `kling-o3-standard` `kling-3.0-pro` `kling-3.0-standard` `grok-imagine-1.5` `grok-imagine` `veo-3.1-lite` `veo-3.1-standard` `veo-3.1-fast` `sora-2-pro` `sora-2` `v5.6` |
89
+ | `create extend` | `v6` `grok-imagine` |
90
+ | `create reference` (multi-subject reference) | `v6` `pixverse-c1` `seedance-2.0-standard` `seedance-2.0-fast` `kling-o3-pro` `kling-o3-standard` `grok-imagine` `v5.6` |
91
+ | `create transition` (2 frames) | `v6` `pixverse-c1` `seedance-2.0-standard` `seedance-2.0-fast` `kling-o3-pro` `kling-o3-standard` `kling-3.0-pro` `kling-3.0-standard` `veo-3.1-lite` `veo-3.1-standard` `veo-3.1-fast` `v5.6` |
92
+ | `create transition` (3+ frames) | `v5` |
93
+ | `create modify` | `v5.5` |
94
+ | `create motion-control` | `v5.6` |
95
+
96
+ > Audio creation uses separate model families: `create voice` for text-to-speech and `create music` for prompt-to-music.
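As a rough illustration, the 1.2.1 per-mode matrix above can be encoded as a lookup that a script runs before calling a `create` subcommand. This is only a sketch: `supported_models` and `mode_supports_model` are hypothetical helper names, not CLI commands, and the lists are copied from the table, so they would need updating whenever it changes.

```bash
# Hypothetical helpers (not part of the CLI): model lists copied from the
# 1.2.1 "Per-mode Model Support" table.
supported_models() {
  mode=$1
  frames=${2:-2}   # only consulted for "transition"
  case $mode in
    video) echo "v6 pixverse-c1 seedance-2.0-standard seedance-2.0-fast happyhorse-1.0 kling-o3-pro kling-o3-standard kling-3.0-pro kling-3.0-standard grok-imagine-1.5 grok-imagine veo-3.1-lite veo-3.1-standard veo-3.1-fast sora-2-pro sora-2 v5.6" ;;
    extend) echo "v6 grok-imagine" ;;
    reference) echo "v6 pixverse-c1 seedance-2.0-standard seedance-2.0-fast kling-o3-pro kling-o3-standard grok-imagine v5.6" ;;
    transition)
      if [ "$frames" -ge 3 ]; then
        echo "v5"
      else
        echo "v6 pixverse-c1 seedance-2.0-standard seedance-2.0-fast kling-o3-pro kling-o3-standard kling-3.0-pro kling-3.0-standard veo-3.1-lite veo-3.1-standard veo-3.1-fast v5.6"
      fi ;;
    modify) echo "v5.5" ;;
    motion-control) echo "v5.6" ;;
    *) return 1 ;;
  esac
}

# usage: mode_supports_model <mode> <model> [frame-count]
mode_supports_model() {
  list=$(supported_models "$1" "${3:-2}") || return 1
  # Pad with spaces so only whole model names match.
  case " $list " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `mode_supports_model extend grok-imagine-1.5` fails, because the table lists only `v6` and `grok-imagine` for `create extend`.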
93
97
 
94
98
  ### Image Models (`--model <value>`)
95
99
 
96
- | Model | `--model` value | Quality | Aspect Ratio |
97
- |:---|:---|:---|:---|
98
- | GPT Image 2 *(default)* | `gpt-image-2.0` | `1080p` `1440p` `2160p` | `1:1` `16:9` `9:16` `4:3` `3:4` `3:2` `2:3` `2:1` `1:2` `21:9` |
99
- | Nano Banana 2 | `gemini-3.1-flash` | `512p` `1080p` `1440p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
100
- | Qwen-image | `qwen-image` | `720p` `1080p` | `1:1` `16:9` `9:16` `4:3` `3:4` `5:4` `4:5` `3:2` `2:3` `21:9` |
101
- | Nano Banana Pro | `gemini-3.0` | `1080p` `1440p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
102
- | Nano Banana | `gemini-2.5-flash` | `1080p` | `auto` `1:1` `16:9` `9:16` + more |
103
- | Seedream 5.0 Lite | `seedream-5.0-lite` | `1440p` `1800p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
104
- | Seedream 4.5 | `seedream-4.5` | `1440p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
105
- | Seedream 4.0 | `seedream-4.0` | `1080p` `1440p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
106
- | Kling Image O3 | `kling-image-o3` | `1080p` `1440p` `2160p` | `16:9` `9:16` `1:1` + more |
107
- | Kling Image V3 | `kling-image-v3` | `1080p` `1440p` | `16:9` `9:16` `1:1` + more |
100
+ | Model | `--model` value | Quality | Aspect Ratio |
101
+ | :---------------------- | :------------------ | :----------------------------- | :------------------------------------------------------------- |
102
+ | GPT Image 2 _(default)_ | `gpt-image-2.0` | `1080p` `1440p` `2160p` | `1:1` `16:9` `9:16` `4:3` `3:4` `3:2` `2:3` `2:1` `1:2` `21:9` |
103
+ | Nano Banana 2 | `gemini-3.1-flash` | `512p` `1080p` `1440p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
104
+ | Qwen-image | `qwen-image` | `720p` `1080p` | `1:1` `16:9` `9:16` `4:3` `3:4` `5:4` `4:5` `3:2` `2:3` `21:9` |
105
+ | Nano Banana Pro | `gemini-3.0` | `1080p` `1440p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
106
+ | Nano Banana | `gemini-2.5-flash` | `1080p` | `auto` `1:1` `16:9` `9:16` + more |
107
+ | Seedream 5.0 Lite | `seedream-5.0-lite` | `1440p` `1800p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
108
+ | Seedream 4.5 | `seedream-4.5` | `1440p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
109
+ | Seedream 4.0 | `seedream-4.0` | `1080p` `1440p` `2160p` | `auto` `1:1` `16:9` `9:16` + more |
110
+ | Kling Image O3 | `kling-image-o3` | `1080p` `1440p` `2160p` | `16:9` `9:16` `1:1` + more |
111
+ | Kling Image V3 | `kling-image-v3` | `1080p` `1440p` | `16:9` `9:16` `1:1` + more |
112
+
113
+ ### Voice / TTS Models (`create voice --model <value>`)
114
+
115
+ | Model | `--model` value | Provider | Max characters |
116
+ | :-------------------------------- | :----------------------- | :--------- | :------------- |
117
+ | MiniMax Speech 2.8 HD _(default)_ | `speech-2.8-hd` | MiniMax | 10,000 |
118
+ | MiniMax Speech 2.8 Turbo | `speech-2.8-turbo` | MiniMax | 10,000 |
119
+ | Eleven Multilingual v2 | `eleven-multilingual-v2` | ElevenLabs | 10,000 |
120
+ | Eleven v3 | `eleven-v3` | ElevenLabs | 5,000 |
121
+ | Eleven Turbo v2.5 | `eleven-turbo-v2.5` | ElevenLabs | 40,000 |
122
+
123
+ > Browse available preset voices with `pixverse voice presets --model <id>` and the full live model catalog with `pixverse voice models`.
124
+
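A script could check text length against the "Max characters" column before submitting a `create voice` job. The sketch below assumes the limits in the table above; `voice_max_chars` and `text_fits_model` are hypothetical helper names, and how the service itself counts characters is not documented here, so treat the result as an estimate.

```bash
# Hypothetical helpers (not part of the CLI): limits copied from the
# "Voice / TTS Models" table.
voice_max_chars() {
  case $1 in
    speech-2.8-hd|speech-2.8-turbo|eleven-multilingual-v2) echo 10000 ;;
    eleven-v3)         echo 5000 ;;
    eleven-turbo-v2.5) echo 40000 ;;
    *) return 1 ;;
  esac
}

# usage: text_fits_model <model> <text>
text_fits_model() {
  model=$1
  text=$2
  limit=$(voice_max_chars "$model") || { echo "unknown model: $model" >&2; return 1; }
  # ${#text} counts characters in bash under a UTF-8 locale; some shells
  # count bytes instead, which overestimates multi-byte text.
  [ "${#text}" -le "$limit" ]
}
```

The same input can therefore be accepted for `eleven-turbo-v2.5` but rejected for `eleven-v3`, whose limit is the lowest in the table.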
125
+ ### Music Models (`create music --model <value>`)
126
+
127
+ | Model | `--model` value | Provider | Duration | Notes |
128
+ | :---------------------------- | :-------------------- | :--------- | :---------- | :--------------------------------------- |
129
+ | MiniMax Music 2.6 _(default)_ | `music-2.6` | MiniMax | `10`-`240`s | Lyrics, auto lyrics, instrumental |
130
+ | ElevenLabs Music | `music_v1` | ElevenLabs | `10`-`240`s | Lyrics, auto lyrics, instrumental |
131
+ | Google Lyria 3 Pro | `lyria-3-pro-preview` | Google | `10`-`240`s | Image references, no separate `--lyrics` |
132
+
133
+ > Browse the live music model catalog with `pixverse music models`.
108
134
 
109
135
  ---
110
136
 
@@ -129,16 +155,23 @@ Local image inputs larger than `1920x1920` or `5MB` are automatically resized/co
129
155
  pixverse create video --prompt "A cat walking on Mars" --model v6 --quality 720p --aspect-ratio 16:9
130
156
  ```
131
157
 
132
- ### Prompts from stdin
158
+ ### Text inputs: literal, a file, or stdin
159
+
160
+ Text-input flags — `--prompt` (all create commands), `--text` (`create voice`), and `--lyrics` (`create music`) — accept three forms, just like `--image` / `--video`:
133
161
 
134
- Pass `-` to `--prompt` (or `--tts-text`) to read the value from stdin. Handy for long or multi-line prompts and for piping output from another tool without fighting shell quoting:
162
+ - a **literal** string: `--prompt "A neon city skyline"`
163
+ - a **local file path**: `--prompt ./scene.txt` (the file's contents are used)
164
+ - `-` to read from **stdin**: `... | pixverse create video --prompt -`
135
165
 
136
166
  ```bash
137
- echo "A neon city skyline at dusk, slow drone shot" | pixverse create video --prompt -
167
+ pixverse create video --prompt ./scene.txt
138
168
  cat scene.txt | pixverse create image --prompt - --json
139
- some-prompt-generator | pixverse create speech --video <id> --tts-text -
169
+ echo "Hello from the command line" | pixverse create voice --text -
170
+ pixverse create music --prompt "Bright synth-pop" --lyrics ./lyrics.txt
140
171
  ```
141
172
 
173
+ > A value is treated as a file only when a matching file actually exists on disk; otherwise it's used as literal text (the same rule as `--image` / `--video`).
174
+
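The three-way rule described above (stdin for `-`, file contents when the path exists, literal text otherwise) can be sketched as a small shell function. This is an illustration of the documented behaviour, not the CLI's actual implementation; `resolve_text_input` is a hypothetical name.

```bash
# Hypothetical helper (not part of the CLI): mirrors the documented rule for
# text-input flags such as --prompt, --text, and --lyrics.
resolve_text_input() {
  value=$1
  if [ "$value" = "-" ]; then
    # "-" means: read the whole value from stdin.
    cat
  elif [ -f "$value" ]; then
    # A matching file exists on disk: use its contents.
    cat -- "$value"
  else
    # Anything else is taken as literal text.
    printf '%s\n' "$value"
  fi
}
```

One consequence of the rule as stated: a literal prompt that happens to equal the name of an existing file would presumably be read as that file, so it may be safer to pipe such text through `-`.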
142
175
  ### Image to Video
143
176
 
144
177
  ```bash
@@ -163,9 +196,22 @@ pixverse create image --prompt "Turn this into a watercolor painting" --image ./
163
196
  # Create a transition between keyframes (requires 2+ images)
164
197
  pixverse create transition --images ./frame1.png ./frame2.png ./frame3.png
165
198
 
166
- # Add lip-sync speech to a video (via TTS or audio file)
167
- pixverse create speech --video <video_id> --tts-text "Hello world"
168
- pixverse create speech --video <video_id> --audio ./speech.mp3
199
+ # Generate speech audio from text (text-to-speech)
200
+ pixverse create voice --text "Hello world" --voice-id <preset_voice_id> --output ./out.mp3
201
+ # Browse available models / preset voices:
202
+ pixverse voice models
203
+ pixverse voice presets --model speech-2.8-hd
204
+
205
+ # Generate music audio from a prompt
206
+ pixverse create music --prompt "A cinematic pop song with bright synths" --auto-lyrics
207
+ pixverse create music --prompt "Uplifting piano theme" --instrumental --duration-seconds 60
208
+ # Lyrics-capable models require lyrics unless --auto-lyrics or --instrumental is used:
209
+ # (--lyrics takes a literal string, a local file path, or - for stdin)
210
+ pixverse create music --prompt "Bright synth-pop, uplifting mood" --lyrics ./lyrics.txt
211
+ # Google Lyria supports image references and expects lyric-like instructions in --prompt:
212
+ pixverse create music -m lyria-3-pro-preview --prompt "Instrumental orchestral cue inspired by these images" --image ./moodboard.png
213
+ # Browse available music models:
214
+ pixverse music models
169
215
 
170
216
  # Extend video duration
171
217
  pixverse create extend --video <video_id>
@@ -189,19 +235,26 @@ pixverse create motion-control --image ./character.png --video ./dance.mp4
189
235
  pixverse create template --template-id 12345 --image ./photo.png
190
236
  ```
191
237
 
238
+ Voice speed uses provider-specific validation:
239
+
240
+ | Provider | Default | Valid range | Invalid range error | Provider request field |
241
+ | :--------- | :------ | :---------- | :------------------------------------ | :--------------------- |
242
+ | ElevenLabs | `1.0` | `0.7..1.2` | `--speed must be between 0.7 and 1.2` | `voice_settings.speed` |
243
+ | MiniMax | `1.0` | `0.5..2.0` | `--speed must be between 0.5 and 2` | `voice_setting.speed` |
244
+
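The provider-specific ranges above can be checked locally before a job is submitted. The following is a minimal sketch, assuming the ranges and error strings from the table; `validate_speed` and the lowercase provider keys `elevenlabs` / `minimax` are hypothetical names chosen for this example, and the CLI may validate differently.

```bash
# Hypothetical pre-flight check (not part of the CLI): ranges and messages
# copied from the voice speed table.
validate_speed() {
  provider=$1
  speed=${2:-1.0}   # both providers default to 1.0
  case $provider in
    elevenlabs) min=0.7; max=1.2; label="0.7 and 1.2" ;;
    minimax)    min=0.5; max=2.0; label="0.5 and 2" ;;
    *) echo "unknown provider: $provider" >&2; return 1 ;;
  esac
  # awk does the floating-point comparison; [ ] only compares integers.
  if awk -v s="$speed" -v lo="$min" -v hi="$max" \
       'BEGIN { exit !(s + 0 >= lo + 0 && s + 0 <= hi + 0) }'; then
    return 0
  fi
  echo "--speed must be between $label" >&2
  return 1
}
```

Note that a speed such as `1.5` is valid for MiniMax but out of range for ElevenLabs, so a script that switches `--model` between providers may also need to adjust `--speed`.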
192
245
  ### Common Creation Flags
193
246
 
194
247
  These flags are available across most `create` subcommands:
195
248
 
196
- | Flag | Description |
197
- |:---|:---|
198
- | `--count <n>` | Generate multiple variations (1–4, default 1) |
199
- | `--seed <number>` | Set random seed for reproducible results |
200
- | `--off-peak` | Use off-peak pricing (lower credit cost) |
201
- | `--audio` / `--no-audio` | Enable or disable audio generation |
202
- | `--multi-shot` / `--no-multi-shot` | Enable or disable multi-shot mode (video only) |
203
- | `--no-wait` | Return immediately without waiting for completion |
204
- | `--timeout <sec>` | Polling timeout in seconds (default 300) |
249
+ | Flag | Description |
250
+ | :--------------------------------- | :------------------------------------------------ |
251
+ | `--count <n>` | Generate multiple variations (1–4, default 1) |
252
+ | `--seed <number>` | Set random seed for reproducible results |
253
+ | `--off-peak` | Use off-peak pricing (lower credit cost) |
254
+ | `--audio` / `--no-audio` | Enable or disable audio generation |
255
+ | `--multi-shot` / `--no-multi-shot` | Enable or disable multi-shot mode (video only) |
256
+ | `--no-wait` | Return immediately without waiting for completion |
257
+ | `--timeout <sec>` | Polling timeout in seconds (default 300) |
205
258
 
206
259
  ### Task Management
207
260
 
@@ -209,6 +262,9 @@ These flags are available across most `create` subcommands:
209
262
  # Check task status
210
263
  pixverse task status <id>
211
264
 
265
+ # Poll a voice/music audio task (audio is not auto-detected — pass --type audio)
266
+ pixverse task status <id> --type audio
267
+
212
268
  # Batch status query (parallel; per-ID failures captured in the response map)
213
269
  pixverse task status --ids 123,456,789 --type video --json
214
270
 
@@ -222,21 +278,32 @@ pixverse task wait <id>
222
278
  # List your generated assets (default: created videos)
223
279
  pixverse asset list
224
280
  pixverse asset list --type image
281
+ pixverse asset list --type audio # voice and music audio history
282
+ pixverse asset list --type audio --source upload
225
283
  pixverse asset list --source upload
226
284
  pixverse asset list --source create --off-peak
227
285
 
228
286
  # Upload a local file or URL to asset library
229
287
  pixverse asset upload ./photo.png
288
+ pixverse asset upload ./voice-over.mp3
230
289
  pixverse asset upload https://example.com/image.jpg
231
290
 
232
- # Get asset details
291
+ # Get asset details (type auto-detected: video → image → audio)
233
292
  pixverse asset info <id>
293
+ # Pass --type to skip auto-detection
294
+ pixverse asset info <id> --type audio
295
+ pixverse asset info <id> --type audio --source upload
234
296
 
235
- # Download a generated video or image
297
+ # Download a created video, image, or audio (uploads are not downloadable)
236
298
  pixverse asset download <id>
299
+ pixverse asset download <id> --type audio --dest ./out/
237
300
 
238
- # Delete an asset
301
+ # Delete a created asset — pass its id (auto-detected)
239
302
  pixverse asset delete <id>
303
+ pixverse asset delete <id> --type audio
304
+
305
+ # Delete an uploaded asset — pass the id from `asset list --source upload`
306
+ pixverse asset delete <id> --source upload --type image
240
307
  ```
241
308
 
242
309
  ### Saved Folders
@@ -362,81 +429,87 @@ pixverse asset download "$VID" --dest ./output/
362
429
 
363
430
  ### Exit Codes
364
431
 
365
- | Code | Meaning |
366
- |:---|:---|
367
- | `0` | Success |
368
- | `1` | General error |
369
- | `2` | Timeout |
370
- | `3` | Authentication error |
371
- | `4` | Credit / subscription limit |
372
- | `5` | Generation failed |
373
- | `6` | Validation error |
432
+ | Code | Meaning |
433
+ | :--- | :-------------------------- |
434
+ | `0` | Success |
435
+ | `1` | General error |
436
+ | `2` | Timeout |
437
+ | `3` | Authentication error |
438
+ | `4` | Credit / subscription limit |
439
+ | `5` | Generation failed |
440
+ | `6` | Validation error |
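Because the exit codes are stable across both versions, a wrapper script can translate them into readable messages. A minimal sketch, assuming the table above; `describe_exit_code` is a hypothetical helper name, and the wiring example is left commented out so the function has no side effects.

```bash
# Hypothetical helper (not part of the CLI): meanings copied from the
# "Exit Codes" table.
describe_exit_code() {
  case $1 in
    0) echo "Success" ;;
    1) echo "General error" ;;
    2) echo "Timeout" ;;
    3) echo "Authentication error" ;;
    4) echo "Credit / subscription limit" ;;
    5) echo "Generation failed" ;;
    6) echo "Validation error" ;;
    *) echo "Unknown exit code: $1" ;;
  esac
}

# Example wiring:
# pixverse create video --prompt "A cat walking on Mars" --json
# status=$?
# [ "$status" -eq 0 ] || echo "pixverse: $(describe_exit_code "$status")" >&2
```

A script might, for instance, retry on code `2` (timeout) but stop immediately on `3` or `4`, since retrying does not fix authentication or credit problems.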
374
441
 
375
442
  ## All Commands
376
443
 
377
- | Command | Description |
378
- |:---|:---|
379
- | `auth login` | Login via browser (OAuth device flow) |
380
- | `auth status` | Check authentication status |
381
- | `auth logout` | Remove stored token |
382
- | `create video` | Text-to-video or image-to-video |
383
- | `create image` | Text-to-image or image-to-image |
384
- | `create transition` | Create transitions between keyframes |
385
- | `create speech` | Add lip-sync speech to video |
386
- | `create extend` | Extend video duration |
387
- | `create modify` | Modify an existing video |
388
- | `create upscale` | Upscale video resolution |
389
- | `create reference` | Generate video with character references |
390
- | `create motion-control` | Motion control with character image + reference video |
391
- | `create template` | Create from a template/effect |
392
- | `template categories` | List template categories |
393
- | `template list` | List templates (with category filter) |
394
- | `template search` | Search templates by keyword |
395
- | `template info` | Get template details |
396
- | `task status` | Check task status (single `<id>` or `--ids id1,id2,...` for batch) |
397
- | `task wait` | Wait for task completion |
398
- | `asset list` | List assets (`--source create\|upload`, `--type video\|image`, `--off-peak`) |
399
- | `asset upload` | Upload a local file or HTTPS URL to asset library |
400
- | `asset info` | Get asset details |
401
- | `asset download` | Download a generated asset |
402
- | `asset delete` | Delete an asset |
403
- | `saved list` | List saved folders |
404
- | `saved items` | List items in a saved folder |
405
- | `saved new` | Create a new saved folder |
406
- | `saved rename` | Rename a saved folder |
407
- | `saved add` | Add assets to a saved folder |
408
- | `saved remove` | Remove assets from a saved folder |
409
- | `saved delete` | Delete a saved folder |
410
- | `workspace list` | List all workspaces |
411
- | `workspace status` | Show current workspace |
412
- | `workspace switch` | Switch workspace (interactive or by ID) |
413
- | `workspace manage` | Open workspace management in browser |
414
- | `account info` | View account info and workspace credits |
415
- | `account usage` | View credit usage |
416
- | `account slots` | View current concurrent generation slots (image / video) |
417
- | `subscribe` | Open subscription page |
418
- | `update` | Update the CLI to the latest version (`npm i -g pixverse@latest`) |
419
- | `config set` | Set a config value |
420
- | `config get` | Get a config value |
421
- | `config list` | List all config values |
422
- | `config reset` | Reset config to defaults |
423
- | `config path` | Show config file path |
424
- | `config defaults` | Manage per-mode creation defaults |
444
+ | Command | Description |
445
+ | :---------------------- | :---------------------------------------------------------------------------------- |
446
+ | `auth login` | Login via browser (OAuth device flow) |
447
+ | `auth status` | Check authentication status |
448
+ | `auth logout` | Remove stored token |
449
+ | `create video` | Text-to-video or image-to-video |
450
+ | `create image` | Text-to-image or image-to-image |
451
+ | `create transition` | Create transitions between keyframes |
452
+ | `create voice` | Generate speech audio from text (text-to-speech) |
453
+ | `create music` | Generate music audio from a prompt |
454
+ | `create extend` | Extend video duration |
455
+ | `create modify` | Modify an existing video |
456
+ | `create upscale` | Upscale video resolution |
457
+ | `create reference` | Generate video with character references |
458
+ | `create motion-control` | Motion control with character image + reference video |
459
+ | `create template` | Create from a template/effect |
460
+ | `template categories` | List template categories |
461
+ | `template list` | List templates (with category filter) |
462
+ | `template search` | Search templates by keyword |
463
+ | `template info` | Get template details |
464
+ | `voice models` | List voice/TTS providers, models, and supported languages |
465
+ | `voice presets` | List preset voices (filterable by model / language / provider) |
466
+ | `music models` | List music providers, models, and capabilities |
467
+ | `task status` | Check task status (single `<id>` or `--ids id1,id2,...` for batch) |
468
+ | `task wait` | Wait for task completion |
469
+ | `asset list` | List assets (`--source create\|upload`, `--type video\|image\|audio`, `--off-peak`) |
470
+ | `asset upload` | Upload a local file or HTTPS URL to asset library |
471
+ | `asset info` | Get asset details |
472
+ | `asset download` | Download a generated asset |
473
+ | `asset delete` | Delete an asset |
474
+ | `saved list` | List saved folders |
475
+ | `saved items` | List items in a saved folder |
476
+ | `saved new` | Create a new saved folder |
477
+ | `saved rename` | Rename a saved folder |
478
+ | `saved add` | Add assets to a saved folder |
479
+ | `saved remove` | Remove assets from a saved folder |
480
+ | `saved delete` | Delete a saved folder |
481
+ | `workspace list` | List all workspaces |
482
+ | `workspace status` | Show current workspace |
483
+ | `workspace switch` | Switch workspace (interactive or by ID) |
484
+ | `workspace manage` | Open workspace management in browser |
485
+ | `account info` | View account info and workspace credits |
486
+ | `account usage` | View credit usage |
487
+ | `account slots` | View current concurrent generation slots (image / video) |
488
+ | `subscribe` | Open subscription page |
489
+ | `update` | Update the CLI to the latest version (`npm i -g pixverse@latest`) |
490
+ | `config set` | Set a config value |
491
+ | `config get` | Get a config value |
492
+ | `config list` | List all config values |
493
+ | `config reset` | Reset config to defaults |
494
+ | `config path` | Show config file path |
495
+ | `config defaults` | Manage per-mode creation defaults |
425
496
 
426
497
  ## Global Flags
427
498
 
428
- | Flag | Description |
429
- |:---|:---|
430
- | `--json` | Output as JSON |
431
- | `-p` | Print mode (alias for `--json`) |
499
+ | Flag | Description |
500
+ | :-------------------- | :-------------------------------------------------------- |
501
+ | `--json` | Output as JSON |
502
+ | `-p` | Print mode (alias for `--json`) |
432
503
  | `--workspace-id <id>` | Override active workspace for this command (0 = personal) |
433
- | `-V, --version` | Show CLI version |
434
- | `-h, --help` | Show help for any command |
504
+ | `-V, --version` | Show CLI version |
505
+ | `-h, --help` | Show help for any command |
435
506
 
436
507
  ## For AI Agents — Advanced Usage
437
508
 
438
509
  For AI agents (Claude Code, Cursor, Codex, etc.), we **strongly recommend** installing [PixVerse Skills](https://github.com/PixVerseAI/skills) — a comprehensive skill library that teaches agents how to use PixVerse CLI correctly with full model constraints, multi-step pipelines, and error handling.
439
510
 
511
+ For lightweight discovery, the public repo also includes a compact machine-readable command manifest at `capabilities.json`; the npm package includes the same file at `dist/capabilities.json`.
512
+
440
513
  **Install via Skills CLI:**
441
514
 
442
515
  ```bash
@@ -448,6 +521,7 @@ npx skills add https://github.com/pixverseai/skills --skill pixverse-ai-image-an
448
521
  [https://clawhub.ai/pixverse-official/pixverse-ai-image-and-video-generator](https://clawhub.ai/pixverse-official/pixverse-ai-image-and-video-generator)
449
522
 
450
523
  Skills include:
524
+
451
525
  - Per-model parameter constraints (which models support which modes, quality levels, durations, aspect ratios)
452
526
  - End-to-end workflow pipelines (text-to-video, storyboard-to-video, video production, motion control, etc.)
453
527
  - Prompt optimization techniques for better generation quality