@felores/kie-ai-mcp-server 2.0.0 → 2.0.2
- package/README.md +189 -1893
- package/ai_docs/artist.md +437 -223
- package/ai_docs/filmographer.md +70 -25
- package/dist/index.js +1382 -1211
- package/package.json +1 -1
- package/ai_docs/kie/{bytedamce_seedream-v4-edit.md → bytedance_seedream-v4-edit.md} +0 -0
package/README.md
CHANGED
|
@@ -4,7 +4,14 @@
|
|
|
4
4
|
|
|
5
5
|
Kie.ai offers **30-50% lower cost** than competitors with 99.9% uptime and 24/7 human support.
|
|
6
6
|
|
|
7
|
-
##
|
|
7
|
+
## 📚 Documentation
|
|
8
|
+
|
|
9
|
+
- **[Complete Tool Reference](docs/TOOLS.md)** - Detailed documentation for all 21 AI tools
|
|
10
|
+
- **[Database & Task Management](docs/DATABASE.md)** - SQLite database and task lifecycle
|
|
11
|
+
- **[Administrator Configuration](docs/ADMIN.md)** - Deployment guides and environment setup
|
|
12
|
+
- **[Intelligent Features](docs/INTELLIGENCE.md)** - Smart mode detection and cost optimization
|
|
13
|
+
|
|
14
|
+
## 🚀 Quick Start - Add to Your MCP Client
|
|
8
15
|
|
|
9
16
|
The easiest way to use this server is to add it to your MCP client configuration:
|
|
10
17
|
|
|
@@ -39,7 +46,8 @@ The easiest way to use this server is to add it to your MCP client configuration
|
|
|
39
46
|
| **Support** | 24/7 Human | Email + Discord | 24/7 AI |
|
|
40
47
|
| **Free Trial** | Yes | Limited | Limited |
|
|
41
48
|
|
|
42
|
-
### 🚀
|
|
49
|
+
### 🚀 All AI Models in One API
|
|
50
|
+
|
|
43
51
|
- **Google Veo 3**: Cinematic video generation with synchronized audio and 1080p output
|
|
44
52
|
- **OpenAI Sora 2**: Advanced video generation with text/image/storyboard modes (unified)
|
|
45
53
|
- **Runway Aleph**: Advanced video editing with object removal and style transfer
|
|
@@ -53,51 +61,19 @@ The easiest way to use this server is to add it to your MCP client configuration
|
|
|
53
61
|
- **Flux Kontext**: Professional image generation and editing with advanced features (unified)
|
|
54
62
|
- **Alibaba Wan 2.5**: High-quality video generation with text-to-video and image-to-video (unified)
|
|
55
63
|
- **Hailuo 02**: Professional video generation with text-to-video and image-to-video modes (unified, standard/pro quality)
|
|
64
|
+
- **Kling Video**: Multi-tier video generation with v2.1-pro control and v2.5-turbo speed
|
|
56
65
|
- **Midjourney AI**: Industry-leading image and video generation with multiple modes (unified)
|
|
57
|
-
- **Recraft Remove Background**: Professional AI-powered background removal
|
|
58
|
-
- **Ideogram V3 Reframe**: Intelligent image reframing and aspect ratio conversion
|
|
59
|
-
|
|
60
|
-
### 💰 **Affordable Pricing**
|
|
61
|
-
Pay-as-you-go credit system means you only pay for what you use. Good for startups and enterprises looking to reduce AI costs.
|
|
62
|
-
|
|
63
|
-
### ⚡ **Fast & Reliable**
|
|
64
|
-
- **99.9% uptime**
|
|
65
|
-
- **25.2s average response time**
|
|
66
|
-
- Low latency for applications
|
|
67
|
-
- High concurrency support
|
|
68
|
-
|
|
69
|
-
### 🔒 **Secure**
|
|
70
|
-
Your data is protected with encryption. We prioritize privacy and do not expose your information.
|
|
66
|
+
- **Recraft Remove Background**: Professional AI-powered background removal
|
|
67
|
+
- **Ideogram V3 Reframe**: Intelligent image reframing and aspect ratio conversion
|
|
71
68
|
|
|
72
69
|
## What You Can Build
|
|
73
70
|
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
### 🎨 **Image Generation**
|
|
82
|
-
Create images, edit existing ones, and upscale with AI. Use for:
|
|
83
|
-
- Content creation
|
|
84
|
-
- Product photography
|
|
85
|
-
- Artistic projects
|
|
86
|
-
- Design mockups
|
|
87
|
-
|
|
88
|
-
### 🎵 **Music Generation**
|
|
89
|
-
Generate music tracks with vocals. Use for:
|
|
90
|
-
- Background music for videos
|
|
91
|
-
- Podcast intros/outros
|
|
92
|
-
- Game soundtracks
|
|
93
|
-
- Commercial projects
|
|
94
|
-
|
|
95
|
-
### 🎤 **Audio Generation**
|
|
96
|
-
Voiceovers and sound effects. Use for:
|
|
97
|
-
- Narration and voiceovers
|
|
98
|
-
- Podcast production
|
|
99
|
-
- Game audio
|
|
100
|
-
- Accessibility features
|
|
71
|
+
| Category | Use Cases |
|
|
72
|
+
|----------|-----------|
|
|
73
|
+
| **🎬 Video Generation** | Social media content, marketing materials, product demonstrations, creative projects |
|
|
74
|
+
| **🎨 Image Generation** | Content creation, product photography, artistic projects, design mockups |
|
|
75
|
+
| **🎵 Music Generation** | Background music for videos, podcast intros/outros, game soundtracks, commercial projects |
|
|
76
|
+
| **🎤 Audio Generation** | Narration and voiceovers, podcast production, game audio, accessibility features |
|
|
101
77
|
|
|
102
78
|
## MCP Features
|
|
103
79
|
|
|
@@ -105,17 +81,11 @@ Voiceovers and sound effects. Use for:
|
|
|
105
81
|
|
|
106
82
|
Trigger specialized AI agents with simple commands in your MCP client:
|
|
107
83
|
|
|
108
|
-
- **`/artist`** - Image generation and editing agent
|
|
109
|
-
|
|
110
|
-
- Handles text-to-image, image editing, upscaling, background removal
|
|
111
|
-
- Intelligently selects the best model for your request
|
|
112
|
-
- Just describe what you want: _"/artist create a logo for a coffee shop"_
|
|
84
|
+
- **`/artist`** - Image generation and editing agent
|
|
85
|
+
Just describe what you want: _"/artist create a logo for a coffee shop"_
|
|
113
86
|
|
|
114
|
-
- **`/filmographer`** - Video generation agent
|
|
115
|
-
|
|
116
|
-
- Handles text-to-video and image-to-video generation
|
|
117
|
-
- Optimizes quality vs cost based on your keywords
|
|
118
|
-
- Just describe what you want: _"/filmographer create a 10-second sunset video"_
|
|
87
|
+
- **`/filmographer`** - Video generation agent
|
|
88
|
+
Just describe what you want: _"/filmographer create a 10-second sunset video"_
|
|
119
89
|
|
|
120
90
|
### 📚 Knowledge Resources
|
|
121
91
|
|
|
@@ -125,60 +95,36 @@ Your AI assistant can research and learn about available models before using the
|
|
|
125
95
|
- `kie://agents/artist` - Complete image generation workflow
|
|
126
96
|
- `kie://agents/filmographer` - Complete video generation workflow
|
|
127
97
|
|
|
128
|
-
**Model Documentation (
|
|
98
|
+
**Model Documentation (33+ models):**
|
|
129
99
|
- `kie://models/bytedance-seedream` - 4K image generation
|
|
130
100
|
- `kie://models/veo3` - Premium cinematic video
|
|
131
101
|
- `kie://models/qwen-image` - Fast image processing
|
|
132
102
|
- `kie://models/flux-kontext` - Professional image generation
|
|
133
|
-
- ...and
|
|
103
|
+
- ...and 29 more models
|
|
134
104
|
|
|
135
105
|
**Comparison Guides:**
|
|
136
106
|
- `kie://guides/image-models-comparison` - Feature matrix for all image models
|
|
137
107
|
- `kie://guides/video-models-comparison` - Feature matrix for all video models
|
|
138
108
|
- `kie://guides/quality-optimization` - Cost/quality strategies
|
|
139
109
|
|
|
140
|
-
**Operational Resources:**
|
|
141
|
-
- `kie://tasks/active` - Real-time task monitoring
|
|
142
|
-
- `kie://stats/usage` - Usage statistics
|
|
143
|
-
|
|
144
110
|
### 🛠️ 21 Unified AI Tools
|
|
145
111
|
|
|
146
112
|
All tools feature **smart mode detection** - one tool does multiple things:
|
|
147
113
|
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
156
|
-
|
|
157
|
-
**Video Tools (8):**
|
|
158
|
-
- `veo3_generate_video` - Premium cinematic video (text OR image input)
|
|
159
|
-
- `sora_video` - OpenAI's advanced video model (text/image/storyboard modes, standard/pro)
|
|
160
|
-
- `bytedance_seedance_video` - Professional video (text OR image input, lite OR pro)
|
|
161
|
-
- `wan_video` - Fast social media video (text OR image input)
|
|
162
|
-
- `hailuo_video` - Professional video (text-to-video OR image-to-video, standard OR pro quality)
|
|
163
|
-
- `kling_video` - High-quality video (text, image-to-video, OR v2.1-pro with start+end frames)
|
|
164
|
-
- `runway_aleph_video` - Video-to-video transformation
|
|
165
|
-
- `midjourney_generate` - Images AND videos with multiple modes
|
|
166
|
-
|
|
167
|
-
**Audio Tools (3):**
|
|
168
|
-
- `suno_generate_music` - Professional music with vocals
|
|
169
|
-
- `elevenlabs_tts` - Studio-quality text-to-speech
|
|
170
|
-
- `elevenlabs_ttsfx` - AI-powered sound effects
|
|
171
|
-
|
|
172
|
-
**Utility Tools (3):**
|
|
173
|
-
- `list_tasks` - View all generation tasks
|
|
174
|
-
- `get_task_status` - Check task progress
|
|
175
|
-
- `veo3_get_1080p_video` - Upgrade to 1080p (Veo3 only)
|
|
114
|
+
| Category | Tools |
|
|
115
|
+
|----------|-------|
|
|
116
|
+
| **Image (7)** | `bytedance_seedream_image`, `qwen_image`, `nano_banana_image`, `flux_kontext_image`, `openai_4o_image`, `recraft_remove_background`, `ideogram_reframe` |
|
|
117
|
+
| **Video (8)** | `veo3_generate_video`, `sora_video`, `bytedance_seedance_video`, `wan_video`, `hailuo_video`, `kling_video`, `runway_aleph_video`, `midjourney_generate` |
|
|
118
|
+
| **Audio (3)** | `suno_generate_music`, `elevenlabs_tts`, `elevenlabs_ttsfx` |
|
|
119
|
+
| **Utility (3)** | `list_tasks`, `get_task_status`, `veo3_get_1080p_video` |
|
|
120
|
+
|
|
121
|
+
**→ [See complete tool documentation](docs/TOOLS.md)**
|
|
176
122
|
|
|
177
123
|
## Key Features
|
|
178
124
|
|
|
179
125
|
- **🎯 One API Key**: Access all models with one credential
|
|
180
126
|
- **🤖 AI Agent Prompts**: Slash commands trigger specialized workflows
|
|
181
|
-
- **📖 Knowledge Base**:
|
|
127
|
+
- **📖 Knowledge Base**: 33+ resources for model research and comparison
|
|
182
128
|
- **🔄 Task Management**: Built-in SQLite database for tracking generations
|
|
183
129
|
- **📱 Smart Routing**: Automatic endpoint detection and status monitoring
|
|
184
130
|
- **🛡️ Error Handling**: Validation and error recovery
|
|
@@ -191,195 +137,30 @@ All tools feature **smart mode detection** - one tool does multiple things:
|
|
|
191
137
|
|
|
192
138
|
The MCP server features advanced **intention detection algorithms** that automatically understand user requirements and optimize both cost and quality without manual configuration.
|
|
193
139
|
|
|
194
|
-
###
|
|
195
|
-
|
|
196
|
-
#### **Automatic Quality Detection**
|
|
197
|
-
The system analyzes user language to determine quality requirements:
|
|
198
|
-
|
|
199
|
-
```typescript
|
|
200
|
-
// Source: src/kie-ai-client.ts:224-232
|
|
201
|
-
const quality = request.quality || 'lite';
|
|
202
|
-
let model: string;
|
|
203
|
-
if (isImageToVideo) {
|
|
204
|
-
model = quality === 'pro' ? 'bytedance/v1-pro-image-to-video' : 'bytedance/v1-lite-image-to-video';
|
|
205
|
-
} else {
|
|
206
|
-
model = quality === 'pro' ? 'bytedance/v1-pro-text-to-video' : 'bytedance/v1-lite-text-to-video';
|
|
207
|
-
}
|
|
208
|
-
```
|
|
209
|
-
|
|
210
|
-
**User Language → System Action**:
|
|
211
|
-
- `"high quality"`, `"professional"`, `"premium"` → Pro models + 1080p
|
|
212
|
-
- `"fast"`, `"quick"`, `"social media"` → Lite models + 720p
|
|
213
|
-
- No quality mentioned → Lite models + 720p (cost-effective default)
|
|
214
|
-
|
|
215
|
-
#### **Dynamic Endpoint Routing**
|
|
216
|
-
Quality parameters automatically map to optimal endpoints:
|
|
217
|
-
|
|
218
|
-
| Quality Parameter | Text-to-Video Endpoint | Image-to-Video Endpoint |
|
|
219
|
-
|------------------|----------------------|-----------------------|
|
|
220
|
-
| `"lite"` | `bytedance/v1-lite-text-to-video` | `bytedance/v1-lite-image-to-video` |
|
|
221
|
-
| `"pro"` | `bytedance/v1-pro-text-to-video` | `bytedance/v1-pro-image-to-video` |
|
|
222
|
-
|
|
223
|
-
### **🔧 Unified Tool Architecture**
|
|
224
|
-
|
|
225
|
-
#### **Smart Mode Detection**
|
|
226
|
-
Single tools automatically detect operation mode based on parameter combinations:
|
|
227
|
-
|
|
228
|
-
```typescript
|
|
229
|
-
// Source: src/types.ts:146-166 (Nano Banana example)
|
|
230
|
-
.refine((data) => {
|
|
231
|
-
const hasPrompt = !!data.prompt;
|
|
232
|
-
const hasImage = !!data.image_urls;
|
|
233
|
-
const hasMask = !!data.mask_url;
|
|
234
|
-
|
|
235
|
-
if (hasImage && hasMask) return hasPrompt; // Edit mode
|
|
236
|
-
else if (hasImage) return true; // Variants mode
|
|
237
|
-
else return hasPrompt; // Generate mode
|
|
238
|
-
});
|
|
239
|
-
```
|
|
240
|
-
|
|
241
|
-
**Unified Tools with Auto-Detection**:
|
|
242
|
-
- **`nano_banana_image`**: Generate/Edit/Upscale based on parameters
|
|
243
|
-
- **`bytedance_seedance_video`**: Text-to-video vs Image-to-video based on `image_url` presence
|
|
244
|
-
- **`openai_4o_image`**: Generate/Edit/Variants based on `filesUrl` and `maskUrl` combination
|
|
245
|
-
- **`qwen_image`**: Text-to-image vs Image editing based on `image_url` presence
|
|
246
|
-
|
|
247
|
-
### **📊 Intelligent Task Management**
|
|
248
|
-
|
|
249
|
-
#### **Smart Status Routing**
|
|
250
|
-
The system automatically routes status checks to correct API endpoints based on task type:
|
|
251
|
-
|
|
252
|
-
```typescript
|
|
253
|
-
// Source: src/index.ts:1155-1175
|
|
254
|
-
switch (task.api_type) {
|
|
255
|
-
case 'veo3':
|
|
256
|
-
return this.makeRequest(`/veo/record-info?taskId=${taskId}`, 'GET');
|
|
257
|
-
case 'suno':
|
|
258
|
-
return this.makeRequest(`/generate/record-info?taskId=${taskId}`, 'GET');
|
|
259
|
-
case 'bytedance-seedance-video':
|
|
260
|
-
case 'midjourney-generate':
|
|
261
|
-
return this.makeRequest(`/jobs/recordInfo?taskId=${taskId}`, 'GET');
|
|
262
|
-
}
|
|
263
|
-
```
|
|
140
|
+
### Quick Summary
|
|
264
141
|
|
|
265
|
-
|
|
266
|
-
|
|
267
|
-
|
|
268
|
-
|
|
269
|
-
-- Source: README.md database schema
|
|
270
|
-
CREATE TABLE tasks (
|
|
271
|
-
task_id TEXT UNIQUE NOT NULL,
|
|
272
|
-
api_type TEXT NOT NULL, -- Enables intelligent endpoint routing
|
|
273
|
-
status TEXT DEFAULT 'pending',
|
|
274
|
-
result_url TEXT,
|
|
275
|
-
-- ... other fields
|
|
276
|
-
);
|
|
277
|
-
```
|
|
142
|
+
- **Automatic Quality Detection**: Analyzes user language ("high quality" → pro models, "quick" → lite models)
|
|
143
|
+
- **Smart Mode Detection**: Single tools auto-detect operation mode (generate/edit/upscale) based on parameters
|
|
144
|
+
- **Database-Driven Intelligence**: Local SQLite cache reduces API calls and provides smart routing
|
|
145
|
+
- **Cost Control by Design**: Defaults to cheapest options (720p, lite quality) unless explicitly requested
|
|
278
146
|
|
|
279
|
-
|
|
147
|
+
**Example**: User says _"Make a quick social media video"_ → System automatically chooses: lite quality + 720p + 5 second duration = lowest cost tier (1x baseline)
|
|
280
148
|
|
|
281
|
-
|
|
282
|
-
- **Resolution**: Defaults to `"720p"` (API defaults to 1080p - explicit setting prevents cost overruns)
|
|
283
|
-
- **Quality**: Defaults to `"lite"` (2-3x cheaper than pro versions)
|
|
284
|
-
- **Models**: Defaults to faster variants unless premium quality requested
|
|
149
|
+
**Example**: User says _"I need a high quality video for a client presentation"_ → System automatically chooses: pro quality + 1080p = highest cost tier (4-6x baseline)
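The two examples above can be read as a small keyword-to-settings mapping. The sketch below is only an illustration under assumptions: the function name `detectVideoSettings`, the `VideoSettings` type, and the regex-based resolution override are hypothetical and are not taken from the package source. The keyword phrases and the lite/720p default follow the wording in the README.

```typescript
// Hypothetical sketch: names and structure are assumptions, not the package's code.
type Quality = "lite" | "pro";
type Resolution = "720p" | "1080p";

interface VideoSettings {
  quality: Quality;
  resolution: Resolution;
}

// Phrases the README associates with pro models.
// "fast" / "quick" need no list of their own because lite is already the default.
const PRO_KEYWORDS = ["high quality", "professional", "premium"];

function detectVideoSettings(request: string): VideoSettings {
  const text = request.toLowerCase();
  const wantsPro = PRO_KEYWORDS.some((keyword) => text.includes(keyword));

  // An explicitly named resolution wins over the quality-based default.
  const explicit = text.match(/\b(720p|1080p)\b/);
  const resolution: Resolution = explicit
    ? (explicit[1] as Resolution)
    : wantsPro
      ? "1080p"
      : "720p";

  return { quality: wantsPro ? "pro" : "lite", resolution };
}

console.log(detectVideoSettings("Make a quick social media video"));
console.log(detectVideoSettings("I need a high quality video for a client presentation"));
```

In this sketch, the cheapest combination is the fallthrough case, so a request has to name a pro keyword or a resolution explicitly to move off the lowest cost tier.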
|
|
285
150
|
|
|
286
|
-
|
|
287
|
-
Users must explicitly request higher quality:
|
|
288
|
-
- `"high quality"` → Automatic upgrade to pro models + 1080p
|
|
289
|
-
- `"high quality in 720p"` → Pro models + cost-effective resolution
|
|
290
|
-
- `"professional"` → Pro models + balanced resolution
|
|
151
|
+
**→ [See complete intelligence documentation](docs/INTELLIGENCE.md)** with real-world examples and verifiable code references
|
|
291
152
|
|
|
292
|
-
|
|
153
|
+
## Installation & Configuration
|
|
293
154
|
|
|
294
|
-
|
|
295
|
-
|
|
296
|
-
- **Mode Detection**: `src/types.ts:146-166` (multiple examples)
|
|
297
|
-
- **Endpoint Routing**: `src/index.ts:1155-1175`
|
|
298
|
-
- **Schema Validation**: `src/types.ts` (all tool schemas)
|
|
299
|
-
- **Database Integration**: `src/database.ts` + `src/index.ts`
|
|
155
|
+
<details>
|
|
156
|
+
<summary><strong>📦 Installation Options (click to expand)</strong></summary>
|
|
300
157
|
|
|
301
|
-
|
|
302
|
-
|
|
303
|
-
### **🚀 Real-World Intelligence Examples**
|
|
304
|
-
|
|
305
|
-
#### **Example 1: Video Generation Request**
|
|
306
|
-
```
|
|
307
|
-
User: "Make a quick social media video of a sunset"
|
|
308
|
-
```
|
|
309
|
-
**System Automatically Chooses**:
|
|
310
|
-
- **Tool**: `bytedance_seedance_video` (default video model)
|
|
311
|
-
- **Quality**: `"lite"` (detected "quick" → cost-effective)
|
|
312
|
-
- **Resolution**: `"720p"` (default for cost control)
|
|
313
|
-
- **Endpoint**: `bytedance/v1-lite-text-to-video`
|
|
314
|
-
- **Duration**: `"5"` (optimal for social media)
|
|
315
|
-
|
|
316
|
-
#### **Example 2: Professional Quality Request**
|
|
317
|
-
```
|
|
318
|
-
User: "I need a high quality video for a client presentation"
|
|
319
|
-
```
|
|
320
|
-
**System Automatically Chooses**:
|
|
321
|
-
- **Tool**: `bytedance_seedance_video` (default video model)
|
|
322
|
-
- **Quality**: `"pro"` (detected "high quality" → premium)
|
|
323
|
-
- **Resolution**: `"1080p"` (high quality default)
|
|
324
|
-
- **Endpoint**: `bytedance/v1-pro-text-to-video`
|
|
325
|
-
- **Duration**: `"5"` (professional standard)
|
|
326
|
-
|
|
327
|
-
#### **Example 3: Specific Quality Requirements**
|
|
328
|
-
```
|
|
329
|
-
User: "Generate a professional video but keep it 720p to save costs"
|
|
330
|
-
```
|
|
331
|
-
**System Automatically Chooses**:
|
|
332
|
-
- **Tool**: `bytedance_seedance_video`
|
|
333
|
-
- **Quality**: `"pro"` (detected "professional" → premium)
|
|
334
|
-
- **Resolution**: `"720p"` (explicitly requested)
|
|
335
|
-
- **Endpoint**: `bytedance/v1-pro-text-to-video`
|
|
336
|
-
- **Cost**: ~2x lite model but 50% less than 1080p
|
|
337
|
-
|
|
338
|
-
#### **Example 4: Unified Tool Intelligence**
|
|
339
|
-
```json
|
|
340
|
-
// User provides image + prompt
|
|
341
|
-
{
|
|
342
|
-
"tool": "nano_banana_image",
|
|
343
|
-
"arguments": {
|
|
344
|
-
"prompt": "Add sunglasses to the person",
|
|
345
|
-
"image_urls": ["https://example.com/portrait.jpg"]
|
|
346
|
-
}
|
|
347
|
-
}
|
|
348
|
-
```
|
|
349
|
-
**System Automatically Detects**: **Edit Mode** (prompt + image_urls)
|
|
350
|
-
**Routes to**: `/jobs/createTask` with edit-specific parameters
|
|
351
|
-
|
|
352
|
-
#### **Example 5: Smart Status Monitoring**
|
|
353
|
-
```json
|
|
354
|
-
// User checks task status
|
|
355
|
-
{
|
|
356
|
-
"tool": "get_task_status",
|
|
357
|
-
"arguments": {
|
|
358
|
-
"task_id": "abc123"
|
|
359
|
-
}
|
|
360
|
-
}
|
|
361
|
-
```
|
|
362
|
-
**System Automatically**:
|
|
363
|
-
1. **Queries database**: Gets `api_type: "bytedance-seedance-video"`
|
|
364
|
-
2. **Routes to**: `/jobs/recordInfo?taskId=abc123` (correct endpoint)
|
|
365
|
-
3. **Updates local record**: Syncs API response with database
|
|
366
|
-
4. **Returns combined data**: Local + API information
|
|
367
|
-
|
|
368
|
-
## Quick Start
|
|
369
|
-
|
|
370
|
-
### 🎯 Get Your Free API Key
|
|
371
|
-
1. Visit [Kie.ai API Key](https://kie.ai/api-key) to get your free API key
|
|
372
|
-
2. **Try any model for free** in the AI Playground before committing
|
|
373
|
-
3. Choose the flexible pricing plan that fits your needs
|
|
374
|
-
|
|
375
|
-
### 📦 Installation
|
|
376
|
-
|
|
377
|
-
#### Option 1: Install from NPM (Recommended)
|
|
158
|
+
### Option 1: Install from NPM (Recommended)
|
|
378
159
|
```bash
|
|
379
160
|
npm install -g @felores/kie-ai-mcp-server
|
|
380
161
|
```
|
|
381
162
|
|
|
382
|
-
|
|
163
|
+
### Option 2: Install from Source
|
|
383
164
|
```bash
|
|
384
165
|
# Clone the repository
|
|
385
166
|
git clone https://github.com/felores/kie-ai-mcp-server.git
|
|
@@ -391,57 +172,73 @@ npm install
|
|
|
391
172
|
# Build the project
|
|
392
173
|
npm run build
|
|
393
174
|
```
|
|
175
|
+
</details>
|
|
394
176
|
|
|
395
|
-
|
|
177
|
+
<details>
|
|
178
|
+
<summary><strong>⚙️ Environment Variables (click to expand)</strong></summary>
|
|
396
179
|
|
|
397
|
-
|
|
180
|
+
### Required
|
|
398
181
|
```bash
|
|
399
|
-
#
|
|
400
|
-
export KIE_AI_API_KEY="your_api_key_here"
|
|
401
|
-
|
|
402
|
-
# Optional: Custom settings
|
|
403
|
-
export KIE_AI_BASE_URL="https://api.kie.ai/api/v1" # Default
|
|
404
|
-
export KIE_AI_TIMEOUT="60000" # Default: 60 seconds
|
|
405
|
-
export KIE_AI_DB_PATH="./tasks.db" # Default: local database
|
|
406
|
-
export KIE_AI_CALLBACK_URL="https://your-domain.com/webhook" # Optional: Custom callback
|
|
407
|
-
export KIE_AI_CALLBACK_URL_FALLBACK="https://your-proxy.com/callback" # Optional: Admin fallback
|
|
182
|
+
export KIE_AI_API_KEY="your-api-key-here" # Get from https://kie.ai/api-key
|
|
408
183
|
```
|
|
409
184
|
|
|
410
|
-
###
|
|
411
|
-
You're ready to create amazing AI content! The server will automatically:
|
|
412
|
-
- Track all your generations in a local database
|
|
413
|
-
- Handle task status and completion notifications
|
|
414
|
-
- Route requests to the optimal AI models
|
|
415
|
-
- Provide detailed error messages and guidance
|
|
416
|
-
|
|
417
|
-
## Configuration
|
|
418
|
-
|
|
419
|
-
### Environment Variables
|
|
420
|
-
|
|
185
|
+
### Optional
|
|
421
186
|
```bash
|
|
422
|
-
#
|
|
423
|
-
export
|
|
424
|
-
|
|
425
|
-
#
|
|
426
|
-
export
|
|
427
|
-
export KIE_AI_TIMEOUT="60000" # Default: 60 seconds
|
|
428
|
-
export KIE_AI_DB_PATH="./tasks.db" # Default: ./tasks.db
|
|
429
|
-
export KIE_AI_CALLBACK_URL="https://your-domain.com/api/callback" # Optional: Custom callback
|
|
430
|
-
export KIE_AI_CALLBACK_URL_FALLBACK="https://your-proxy.com/callback" # Optional: Admin fallback
|
|
187
|
+
export KIE_AI_BASE_URL="https://api.kie.ai/api/v1" # Default API base URL
|
|
188
|
+
export KIE_AI_TIMEOUT="60000" # Request timeout (ms)
|
|
189
|
+
export KIE_AI_DB_PATH="./tasks.db" # Database file location
|
|
190
|
+
export KIE_AI_CALLBACK_URL="https://your-domain.com/webhook" # Custom callback
|
|
191
|
+
export KIE_AI_CALLBACK_URL_FALLBACK="https://your-proxy.com/callback" # Admin fallback
|
|
431
192
|
```
|
|
432
193
|
|
|
433
|
-
###
|
|
194
|
+
### Callback URL Priority
|
|
434
195
|
|
|
435
196
|
| Priority | Source | Variable | Use Case |
|
|
436
197
|
|----------|--------|----------|----------|
|
|
437
198
|
| 1 | User Parameter | `callBackUrl` | Per-request override |
|
|
438
199
|
| 2 | Environment | `KIE_AI_CALLBACK_URL` | User's custom callback |
|
|
439
|
-
| 3 | Admin Fallback | `KIE_AI_CALLBACK_URL_FALLBACK` | ⭐
|
|
200
|
+
| 3 | Admin Fallback | `KIE_AI_CALLBACK_URL_FALLBACK` | ⭐ Deployment-wide default |
|
|
440
201
|
| 4 | Hardcoded | - | `https://proxy.kie.ai/mcp-callback` |
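
The priority table above amounts to a "first non-empty value wins" chain. A minimal sketch, assuming a hypothetical helper named `resolveCallbackUrl` that receives the per-request `callBackUrl` and an environment map (the real server may structure this differently):

```typescript
// Hypothetical helper: illustrates the documented priority order only.
const HARDCODED_CALLBACK = "https://proxy.kie.ai/mcp-callback";

function resolveCallbackUrl(
  callBackUrl: string | undefined,
  env: Record<string, string | undefined>,
): string {
  // Priority 1 to 4, top to bottom; empty strings are treated as unset.
  return (
    callBackUrl ||
    env.KIE_AI_CALLBACK_URL ||
    env.KIE_AI_CALLBACK_URL_FALLBACK ||
    HARDCODED_CALLBACK
  );
}

// Only the admin fallback is set, so it is chosen over the hardcoded default.
console.log(
  resolveCallbackUrl(undefined, {
    KIE_AI_CALLBACK_URL_FALLBACK: "https://your-proxy.com/callback",
  }),
);
```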
|
|
441
202
|
|
|
442
|
-
|
|
203
|
+
**→ [See administrator configuration guide](docs/ADMIN.md)** for Docker, Kubernetes, Systemd examples
|
|
204
|
+
</details>
|
|
205
|
+
|
|
206
|
+
### Tool Filtering (v2.0.2+)
|
|
207
|
+
|
|
208
|
+
**Filter which AI tools are available** to reduce cognitive load and focus your workflow:
|
|
209
|
+
|
|
210
|
+
```bash
|
|
211
|
+
# Whitelist: Enable only specific tools (highest priority)
|
|
212
|
+
# Note: Utility tools (list_tasks, get_task_status) are always included automatically
|
|
213
|
+
export KIE_AI_ENABLED_TOOLS="nano_banana_image,veo3_generate_video,suno_generate_music"
|
|
214
|
+
|
|
215
|
+
# Category filter: Enable by category (medium priority)
|
|
216
|
+
export KIE_AI_TOOL_CATEGORIES="image,video" # Categories: image, video, audio
|
|
217
|
+
|
|
218
|
+
# Blacklist: Disable specific tools (lowest priority)
|
|
219
|
+
# Note: Utility tools cannot be disabled
|
|
220
|
+
export KIE_AI_DISABLED_TOOLS="midjourney_generate,runway_aleph_video"
|
|
221
|
+
```
|
|
222
|
+
|
|
223
|
+
**Priority Logic**: `ENABLED_TOOLS` > `TOOL_CATEGORIES` > `DISABLED_TOOLS` > All tools (default)
|
|
224
|
+
|
|
225
|
+
**Tool Categories**:
|
|
226
|
+
- **image** (8): nano_banana, seedream, qwen, openai_4o, flux, recraft, ideogram, midjourney*
|
|
227
|
+
- **video** (9): veo3, veo3_1080p, sora, seedance, wan, hailuo, kling, runway, midjourney*
|
|
228
|
+
- **audio** (3): suno, elevenlabs_tts, elevenlabs_ttsfx
|
|
229
|
+
- **utility** (2): list_tasks, get_task_status ⭐ **Always enabled**
|
|
230
|
+
|
|
231
|
+
_* midjourney appears in both image and video categories (supports both)_
|
|
232
|
+
- ⭐ **Utility tools are always enabled** for server monitoring and task management
|
|
233
|
+
- When using whitelist mode, utility tools are automatically added to your selection
|
|
234
|
+
- When using blacklist mode, utility tools cannot be disabled (warning shown if attempted)
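
The priority logic above can be sketched as a single resolution function. This is an illustration under assumptions, not the package's implementation: `resolveEnabledTools`, `parseList`, and the abbreviated `TOOLS_BY_CATEGORY` table are hypothetical, and the sketch silently ignores attempts to disable utility tools rather than printing the warning the README mentions.

```typescript
// Hypothetical sketch: the category table is abbreviated for illustration.
const UTILITY_TOOLS = ["list_tasks", "get_task_status"];

const TOOLS_BY_CATEGORY: Record<string, string[]> = {
  image: ["nano_banana_image", "qwen_image", "midjourney_generate"],
  video: ["veo3_generate_video", "sora_video", "midjourney_generate"],
  audio: ["suno_generate_music", "elevenlabs_tts", "elevenlabs_ttsfx"],
};

// Split a comma-separated env value into trimmed, non-empty names.
function parseList(value: string | undefined): string[] {
  return (value ?? "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

function resolveEnabledTools(env: Record<string, string | undefined>): string[] {
  const enabled = parseList(env.KIE_AI_ENABLED_TOOLS);
  const categories = parseList(env.KIE_AI_TOOL_CATEGORIES);
  const disabled = parseList(env.KIE_AI_DISABLED_TOOLS);
  const allTools = Object.values(TOOLS_BY_CATEGORY).flat();

  let selected: string[];
  if (enabled.length > 0) {
    selected = enabled; // whitelist: highest priority
  } else if (categories.length > 0) {
    selected = categories.flatMap((name) => TOOLS_BY_CATEGORY[name] ?? []);
  } else if (disabled.length > 0) {
    selected = allTools.filter((tool) => !disabled.includes(tool)); // blacklist
  } else {
    selected = allTools; // default: everything
  }

  // Utility tools are always appended; the Set drops duplicates such as
  // midjourney_generate, which is listed under both image and video.
  return Array.from(new Set([...selected, ...UTILITY_TOOLS]));
}

console.log(resolveEnabledTools({ KIE_AI_TOOL_CATEGORIES: "audio" }));
```

Because each branch is checked in order, setting `KIE_AI_ENABLED_TOOLS` makes the other two variables irrelevant in this sketch, which matches the stated priority order.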
|
|
235
|
+
|
|
236
|
+
<details>
|
|
237
|
+
<summary><strong>🔧 MCP Client Configuration (click to expand)</strong></summary>
|
|
443
238
|
|
|
444
|
-
|
|
239
|
+
### Claude Desktop, Cursor, Windsurf, VS Code, etc.
|
|
240
|
+
|
|
241
|
+
Add to your MCP client configuration file:
|
|
445
242
|
|
|
446
243
|
```json
|
|
447
244
|
{
|
|
@@ -455,7 +252,7 @@ Add to your Claude Desktop or MCP client configuration:
|
|
|
455
252
|
}
|
|
456
253
|
```
|
|
457
254
|
|
|
458
|
-
Or if installed globally:
|
|
255
|
+
Or if installed globally with npx:
|
|
459
256
|
|
|
460
257
|
```json
|
|
461
258
|
{
|
|
@@ -468,1590 +265,107 @@ Or if installed globally:
|
|
|
468
265
|
}
|
|
469
266
|
}
|
|
470
267
|
```
|
|
268
|
+
</details>
|
|
471
269
|
|
|
472
|
-
##
|
|
473
|
-
|
|
474
|
-
**New in v1.9.8:** No callback URL setup required! The server automatically handles task completion notifications with intelligent fallback:
|
|
475
|
-
|
|
476
|
-
1. **User Parameter** - If you provide `callBackUrl` in a tool request, it uses that
|
|
477
|
-
2. **Environment Variable** - Uses `KIE_AI_CALLBACK_URL` if set (existing setups keep working)
|
|
478
|
-
3. **Admin Fallback** - Uses `KIE_AI_CALLBACK_URL_FALLBACK` for deployment-wide defaults
|
|
479
|
-
4. **Hardcoded Default** - Falls back to `https://proxy.kie.ai/mcp-callback` automatically
|
|
480
|
-
|
|
481
|
-
**For Users:** Just provide your API key - everything else is handled automatically!
|
|
482
|
-
**For Administrators:** Set `KIE_AI_CALLBACK_URL_FALLBACK` for custom proxy configurations
|
|
483
|
-
|
|
484
|
-
## 🔧 Administrator Configuration
|
|
485
|
-
|
|
486
|
-
### `KIE_AI_CALLBACK_URL_FALLBACK`
|
|
487
|
-
|
|
488
|
-
For system administrators and deployment managers, this environment variable provides organization-wide control over callback URLs:
|
|
489
|
-
|
|
490
|
-
```bash
|
|
491
|
-
# Set deployment-wide callback URL
|
|
492
|
-
export KIE_AI_CALLBACK_URL_FALLBACK="https://your-proxy.company.com/mcp-callback"
|
|
493
|
-
```
|
|
494
|
-
|
|
495
|
-
### **Use Cases:**
|
|
496
|
-
|
|
497
|
-
**1. Corporate Proxy Setup:**
|
|
498
|
-
```bash
|
|
499
|
-
# For enterprise deployments behind corporate firewalls
|
|
500
|
-
export KIE_AI_CALLBACK_URL_FALLBACK="https://internal-proxy.company.ai/kie-callback"
|
|
501
|
-
```
|
|
502
|
-
|
|
503
|
-
**2. Multi-Tenant Services:**
|
|
504
|
-
```bash
|
|
505
|
-
# For SaaS platforms managing multiple users
|
|
506
|
-
export KIE_AI_CALLBACK_URL_FALLBACK="https://api.yourservice.com/webhooks/kie-ai"
|
|
507
|
-
```
|
|
508
|
-
|
|
509
|
-
**3. Development/Staging Environments:**
|
|
510
|
-
```bash
|
|
511
|
-
# Separate callbacks for different environments
|
|
512
|
-
export KIE_AI_CALLBACK_URL_FALLBACK="https://staging-webhook.yourapp.com/kie"
|
|
513
|
-
```
|
|
514
|
-
|
|
515
|
-
### **Configuration Examples:**
|
|
516
|
-
|
|
517
|
-
**Docker Compose:**
|
|
518
|
-
```yaml
|
|
519
|
-
services:
|
|
520
|
-
kie-ai-mcp:
|
|
521
|
-
image: node:18
|
|
522
|
-
environment:
|
|
523
|
-
- KIE_AI_API_KEY=${API_KEY}
|
|
524
|
-
- KIE_AI_CALLBACK_URL_FALLBACK=https://proxy.company.com/webhook
|
|
525
|
-
command: npx -y @felores/kie-ai-mcp-server
|
|
526
|
-
```
|
|
527
|
-
|
|
528
|
-
**Kubernetes:**
|
|
529
|
-
```yaml
|
|
530
|
-
env:
|
|
531
|
-
- name: KIE_AI_API_KEY
|
|
532
|
-
valueFrom:
|
|
533
|
-
secretKeyRef:
|
|
534
|
-
name: kie-ai-secrets
|
|
535
|
-
key: api-key
|
|
536
|
-
- name: KIE_AI_CALLBACK_URL_FALLBACK
|
|
537
|
-
value: "https://proxy.company.com/kie-callback"
|
|
538
|
-
```
|
|
539
|
-
|
|
540
|
-
**Systemd Service:**
|
|
541
|
-
```ini
|
|
542
|
-
[Service]
|
|
543
|
-
Environment=KIE_AI_API_KEY=your-api-key
|
|
544
|
-
Environment=KIE_AI_CALLBACK_URL_FALLBACK=https://proxy.company.com/webhook
|
|
545
|
-
ExecStart=npx -y @felores/kie-ai-mcp-server
|
|
546
|
-
```
|
|
547
|
-
|
|
548
|
-
### **Security Considerations:**
|
|
549
|
-
|
|
550
|
-
- **HTTPS Required:** Always use HTTPS URLs for callbacks
|
|
551
|
-
- **Authentication:** Ensure your callback endpoint validates requests
|
|
552
|
-
- **Rate Limiting:** Implement rate limiting on your callback endpoint
|
|
553
|
-
- **Logging:** Log callback requests for debugging and monitoring
|
|
554
|
-
|
|
555
|
-
### **Fallback Behavior:**
|
|
556
|
-
|
|
557
|
-
The admin fallback only activates when:
|
|
558
|
-
1. No user-provided `callBackUrl` parameter
|
|
559
|
-
2. No `KIE_AI_CALLBACK_URL` environment variable set
|
|
560
|
-
|
|
561
|
-
This ensures user preferences and existing configurations take priority.
|
|
562
|
-
|
|
563
|
-
## Available Tools
|
|
564
|
-
|
|
565
|
-
### 1. `list_tasks`
|
|
566
|
-
List recent tasks with their status.
|
|
567
|
-
|
|
568
|
-
**Parameters:**
|
|
569
|
-
- `limit` (integer, optional): Max tasks to return (default: 20, max: 100)
|
|
570
|
-
- `status` (string, optional): Filter by status ("pending", "processing", "completed", "failed")
|
|
571
|
-
|
|
572
|
-
**Example:**
|
|
573
|
-
```json
|
|
574
|
-
{
|
|
575
|
-
"limit": 10,
|
|
576
|
-
"status": "completed"
|
|
577
|
-
}
|
|
578
|
-
```
|
|
579
|
-
|
|
580
|
-
### 2. `get_task_status`
|
|
581
|
-
Check the status of a generation task.
|
|
582
|
-
|
|
583
|
-
**Parameters:**
|
|
584
|
-
- `task_id` (string, required): Task ID to check
|
|
585
|
-
|
|
586
|
-
**Example:**
|
|
587
|
-
```json
|
|
588
|
-
{
|
|
589
|
-
"task_id": "281e5b0*********************f39b9"
|
|
590
|
-
}
|
|
591
|
-
```
|
|
592
|
-
|
|
593
|
-
### 3. `nano_banana_image`
|
|
594
|
-
Generate, edit, and upscale images using Google's Gemini 2.5 Flash Image Preview (Nano Banana). This unified tool automatically detects the operation mode based on parameters.
|
|
595
|
-
|
|
596
|
-
**Smart Mode Detection:**
|
|
597
|
-
- **Generate mode**: Provide `prompt` only
|
|
598
|
-
- **Edit mode**: Provide `prompt` + `image_urls`
|
|
599
|
-
- **Upscale mode**: Provide `image` (+ optional `scale`)
|
|
600
|
-
|
|
601
|
-
**Parameters:**
|
|
602
|
-
- `prompt` (string, optional): Text description for generate/edit modes (max 5000 chars)
|
|
603
|
-
- `image_urls` (array, optional): URLs of images for edit mode (1-10 URLs)
|
|
604
|
-
- `image` (string, optional): URL of image for upscale mode (max 10MB, jpeg/png/webp)
|
|
605
|
-
- `scale` (integer, optional): Upscale factor for upscale mode, 1-4 (default: 2)
|
|
606
|
-
- `face_enhance` (boolean, optional): Enable face enhancement for upscale mode (default: false)
|
|
607
|
-
- `output_format` (string, optional): "png" or "jpeg" for generate/edit modes (default: "png")
|
|
608
|
-
- `image_size` (string, optional): Aspect ratio for generate/edit modes - "1:1", "9:16", "16:9", "3:4", "4:3", "3:2", "2:3", "5:4", "4:5", "21:9", "auto" (default: "1:1")
|
|
609
|
-
|
|
610
|
-
**Examples:**
|
|
611
|
-
|
|
612
|
-
*Generate mode:*
|
|
613
|
-
```json
|
|
614
|
-
{
|
|
615
|
-
"prompt": "A surreal painting of a giant banana floating in space",
|
|
616
|
-
"output_format": "png",
|
|
617
|
-
"image_size": "16:9"
|
|
618
|
-
}
|
|
619
|
-
```
|
|
620
|
-
|
|
621
|
-
*Edit mode:*
|
|
622
|
-
```json
|
|
623
|
-
{
|
|
624
|
-
"prompt": "Add a rainbow arching over the mountains",
|
|
625
|
-
"image_urls": ["https://example.com/image.jpg"],
|
|
626
|
-
"output_format": "png",
|
|
627
|
-
"image_size": "16:9"
|
|
628
|
-
}
|
|
629
|
-
```
|
|
630
|
-
|
|
631
|
-
*Upscale mode:*
|
|
632
|
-
```json
|
|
633
|
-
{
|
|
634
|
-
"image": "https://example.com/image.jpg",
|
|
635
|
-
"scale": 4,
|
|
636
|
-
"face_enhance": true
|
|
637
|
-
}
|
|
638
|
-
```
|
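The mode detection described above could be mirrored on the client side roughly as follows. This is a minimal Python sketch, not the server's actual implementation: the function name `detect_nano_banana_mode` is hypothetical, and giving `image` precedence when several parameters are present is an assumption, since the README does not say how conflicts are resolved.

```python
from typing import Any, Mapping


def detect_nano_banana_mode(params: Mapping[str, Any]) -> str:
    """Return "upscale", "edit", or "generate" from the parameters provided."""
    # Assumption: `image` wins if present; the README does not define precedence.
    if params.get("image"):
        scale = params.get("scale", 2)  # documented default is 2
        if not 1 <= scale <= 4:
            raise ValueError("scale must be between 1 and 4")
        return "upscale"

    prompt = params.get("prompt")
    if not prompt:
        raise ValueError("prompt is required for generate and edit modes")
    if len(prompt) > 5000:
        raise ValueError("prompt exceeds 5000 characters")

    image_urls = params.get("image_urls") or []
    if image_urls:
        if len(image_urls) > 10:
            raise ValueError("image_urls accepts 1-10 URLs")
        return "edit"
    return "generate"
```

Running a check like this before calling the tool can surface a mistyped parameter name (for example `image_url` instead of `image_urls`) as an unexpected mode rather than a failed task.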
|
639
|
-
|
|
640
|
-
### 4. `sora_video`
|
|
641
|
-
Generate videos using OpenAI's Sora 2 models (unified tool for text-to-video, image-to-video, and storyboard modes).
|
|
642
|
-
|
|
643
|
-
**Parameters:**
|
|
644
|
-
- `prompt` (string, optional): Text prompt for video generation (max 4000 chars, required for text-to-video and image-to-video modes)
|
|
645
|
-
- `image_url` (string, optional): URL of input image for image-to-video mode (if not provided, uses text-to-video)
|
|
646
|
-
- `storyboard_image_url` (string, optional): URL of storyboard image for storyboard mode (if not provided, uses text-to-video)
|
|
647
|
-
- `storyboard_prompt` (string, optional): Text prompt for storyboard mode (max 4000 chars, if not provided, uses text-to-video)
|
|
648
|
-
- `model` (string, optional): Model version (default: "sora-2")
|
|
649
|
-
- Options: `sora-2` (standard), `sora-2-pro` (premium quality)
|
|
650
|
-
- `aspect_ratio` (string, optional): Video aspect ratio (default: "16:9")
|
|
651
|
-
- Options: `16:9`, `9:16`, `1:1`
|
|
652
|
-
- `resolution` (string, optional): Video resolution (default: "720p")
|
|
653
|
-
- `480p`: Faster generation
|
|
654
|
-
- `720p`: Balanced quality and speed
|
|
655
|
-
- `1080p`: Highest quality (pro model only)
|
|
656
|
-
- `duration` (string, optional): Video duration in seconds (default: "5")
|
|
657
|
-
- Standard: 5-20 seconds
|
|
658
|
-
- Pro: 5-20 seconds
|
|
659
|
-
- `seed` (integer, optional): Random seed for reproducible results (default: -1 for random)
|
|
660
|
-
- `watermark` (string, optional): Watermark text to add to the video (max 100 chars)
|
|
661
|
-
- `enable_translation` (boolean, optional): Auto-translate non-English prompts to English (default: true)
|
|
662
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
663
|
-
|
|
664
|
-
**Examples:**
|
|
665
|
-
|
|
666
|
-
Text-to-video generation:
|
|
667
|
-
```json
|
|
668
|
-
{
|
|
669
|
-
"prompt": "A serene Japanese garden with cherry blossoms falling gently around a tranquil koi pond. Soft morning light filters through the trees. No dialogue. Peaceful ambient audio with gentle water sounds and bird songs.",
|
|
670
|
-
"model": "sora-2",
|
|
671
|
-
"aspect_ratio": "16:9",
|
|
672
|
-
"resolution": "720p",
|
|
673
|
-
"duration": "10",
|
|
674
|
-
"seed": 42
|
|
675
|
-
}
|
|
676
|
-
```
|
|
677
|
-
|
|
678
|
-
Image-to-video generation:
|
|
679
|
-
```json
|
|
680
|
-
{
|
|
681
|
-
"prompt": "The person in the portrait smiles warmly and looks around, then speaks with enthusiasm: 'Welcome to the future of AI video generation!'",
|
|
682
|
-
"image_url": "https://example.com/portrait.jpg",
|
|
683
|
-
"model": "sora-2-pro",
|
|
684
|
-
"resolution": "1080p",
|
|
685
|
-
"duration": "8"
|
|
686
|
-
}
|
|
687
|
-
```
|
|
688
|
-
|
|
689
|
-
Storyboard mode (no prompt required):
|
|
690
|
-
```json
|
|
691
|
-
{
|
|
692
|
-
"storyboard_image_url": "https://example.com/storyboard-frame.jpg",
|
|
693
|
-
"storyboard_prompt": "A cinematic tracking shot through a futuristic city with flying vehicles",
|
|
694
|
-
"model": "sora-2-pro",
|
|
695
|
-
"aspect_ratio": "16:9",
|
|
696
|
-
"resolution": "1080p",
|
|
697
|
-
"duration": "15"
|
|
698
|
-
}
|
|
699
|
-
```
|
|
700
|
-
|
|
701
|
-
**Key Features:**
|
|
702
|
-
- **Unified Interface**: Single tool for text-to-video, image-to-video, and storyboard modes
|
|
703
|
-
- **Smart Mode Detection**: Automatically detects mode based on provided parameters
|
|
704
|
-
- Text-to-Video: `prompt` provided, no `image_url` or `storyboard_image_url`
|
|
705
|
-
- Image-to-Video: `prompt` + `image_url` provided
|
|
706
|
-
- Storyboard: `storyboard_image_url` provided (prompt optional)
|
|
707
|
-
- **Quality Tiers**: Standard for speed, Pro for premium quality
|
|
708
|
-
- **Flexible Resolutions**: 480p for speed, 720p for balance, 1080p for maximum quality
|
|
709
|
-
- **Aspect Ratio Control**: Support for horizontal, vertical, and square formats
|
|
710
|
-
- **Storyboard Mode**: Unique feature for creating videos from storyboard frames without prompts
|
|
711
|
-
- **Reproducible Results**: Seed control for consistent output
|
|
712
|
-
- **Translation Support**: Automatic translation for non-English prompts
|
|
713
|
-
|
|
714
|
-
**Mode Selection Logic:**
|
|
715
|
-
- If `storyboard_image_url` provided → Storyboard mode
|
|
716
|
-
- If `image_url` provided → Image-to-video mode
|
|
717
|
-
- If `prompt` provided → Text-to-video mode
|
|
718
|
-
- Quality automatically determined by `model` parameter (`sora-2` vs `sora-2-pro`)
|
|
719
|
-
|
|
720
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 2-8 minutes depending on model quality, resolution, and duration.
|
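The selection order listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the server's code: `detect_sora_mode` and `check_sora_resolution` are hypothetical helper names, and the checks only restate what the parameter list says (storyboard first, `prompt` required otherwise, `1080p` for the pro model only).

```python
from typing import Any, Mapping


def detect_sora_mode(params: Mapping[str, Any]) -> str:
    """Apply the documented order: storyboard, then image-to-video, then text-to-video."""
    if params.get("storyboard_image_url"):
        return "storyboard"  # prompt is optional in this mode
    if not params.get("prompt"):
        raise ValueError("prompt is required for text-to-video and image-to-video")
    return "image-to-video" if params.get("image_url") else "text-to-video"


def check_sora_resolution(params: Mapping[str, Any]) -> None:
    """Raise ValueError if the resolution is unknown or not allowed for the model."""
    model = params.get("model", "sora-2")          # documented default
    resolution = params.get("resolution", "720p")  # documented default
    if resolution not in ("480p", "720p", "1080p"):
        raise ValueError(f"unknown resolution: {resolution}")
    if resolution == "1080p" and model != "sora-2-pro":
        raise ValueError("1080p is documented as pro model only")
```

For example, a request with `"model": "sora-2"` and `"resolution": "1080p"` would be rejected by `check_sora_resolution` before any credits are spent.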
|
721
|
-
|
|
722
|
-
### 5. `veo3_generate_video`
|
|
723
|
-
Generate videos using Veo3.
|
|
724
|
-
|
|
725
|
-
**Parameters:**
|
|
726
|
-
- `prompt` (string, required): Video description
|
|
727
|
-
- `imageUrls` (array, optional): Image for image-to-video (max 1)
|
|
728
|
-
- `model` (enum, optional): "veo3" or "veo3_fast" (default: "veo3")
|
|
729
|
-
- `aspectRatio` (enum, optional): "16:9", "9:16", or "Auto" (default: "16:9", only 16:9 supports 1080P)
|
|
730
|
-
- `seeds` (integer, optional): Random seed 10000-99999
|
|
731
|
-
- `watermark` (string, optional): Watermark text
|
|
732
|
-
- `callBackUrl` (string, optional): Callback URL for completion notifications
|
|
733
|
-
- `enableFallback` (boolean, optional): Enable fallback mechanism (default: false, fallback videos cannot use 1080P endpoint)
|
|
734
|
-
- `enableTranslation` (boolean, optional): Auto-translate prompts to English (default: true)
|
|
735
|
-
|
|
736
|
-
**Example:**
|
|
737
|
-
```json
|
|
738
|
-
{
|
|
739
|
-
"prompt": "A dog playing in a park",
|
|
740
|
-
"model": "veo3",
|
|
741
|
-
"aspectRatio": "16:9",
|
|
742
|
-
"seeds": 12345,
|
|
743
|
-
"enableTranslation": true
|
|
744
|
-
}
|
|
745
|
-
```
|
|
746
|
-
|
|
747
|
-
### 6. `veo3_get_1080p_video`
|
|
748
|
-
Get the 1080P high-definition version of a Veo3 video.
|
|
749
|
-
|
|
750
|
-
**Parameters:**
|
|
751
|
-
- `task_id` (string, required): Veo3 task ID to get 1080p video for
|
|
752
|
-
- `index` (integer, optional): Video index (for multiple video results)
|
|
753
|
-
|
|
754
|
-
**Note**: Not available for videos generated with fallback mode.
|
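The Veo3 constraints above (at most one input image, seeds in 10000-99999, 1080P only for 16:9 and only without fallback) can be captured in a small pre-flight check. The sketch below is hypothetical client-side code: `check_veo3_request` and `supports_1080p` are illustrative names, and `used_fallback` is an assumed flag meaning the video was actually produced by the fallback mechanism.

```python
from typing import Any, List, Mapping


def check_veo3_request(params: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found in a veo3_generate_video request."""
    problems = []
    if not params.get("prompt"):
        problems.append("prompt is required")
    if len(params.get("imageUrls", [])) > 1:
        problems.append("imageUrls accepts at most 1 image")
    if params.get("model", "veo3") not in ("veo3", "veo3_fast"):
        problems.append("model must be veo3 or veo3_fast")
    if params.get("aspectRatio", "16:9") not in ("16:9", "9:16", "Auto"):
        problems.append("aspectRatio must be 16:9, 9:16, or Auto")
    seeds = params.get("seeds")
    if seeds is not None and not 10000 <= seeds <= 99999:
        problems.append("seeds must be between 10000 and 99999")
    return problems


def supports_1080p(aspect_ratio: str = "16:9", used_fallback: bool = False) -> bool:
    """True when a finished task should be eligible for veo3_get_1080p_video."""
    return aspect_ratio == "16:9" and not used_fallback
```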
|
755
|
-
|
|
756
|
-
### 7. `suno_generate_music`
|
|
757
|
-
Generate music with AI using Suno models.
|
|
758
|
-
|
|
759
|
-
**Parameters:**
|
|
760
|
-
- `prompt` (string, required): Description of desired audio content (max 5000 chars for V4_5+, V5; 3000 for V3_5, V4; 500 chars for non-custom mode)
|
|
761
|
-
- `customMode` (boolean, required): Enable advanced parameter customization
|
|
762
|
-
- `instrumental` (boolean, required): Generate instrumental music (no lyrics)
|
|
763
|
-
- `model` (enum, optional): AI model version - "V3_5", "V4", "V4_5", "V4_5PLUS", or "V5" (default: "V5")
|
|
764
|
-
- `callBackUrl` (string, optional): URL to receive task completion updates (automatic fallback if not provided)
|
|
765
|
-
- `style` (string, optional): Music style/genre (required in custom mode, max 1000 chars for V4_5+, V5; 200 for V3_5, V4)
|
|
766
|
-
- `title` (string, optional): Track title (required in custom mode, max 80 chars)
|
|
767
|
-
- `negativeTags` (string, optional): Music styles to exclude (max 200 chars)
|
|
768
|
-
- `vocalGender` (enum, optional): Vocal gender preference - "m" or "f" (custom mode only)
|
|
769
|
-
- `styleWeight` (number, optional): Style adherence strength (0-1, up to 2 decimal places)
|
|
770
|
-
- `weirdnessConstraint` (number, optional): Creative deviation control (0-1, up to 2 decimal places)
|
|
771
|
-
- `audioWeight` (number, optional): Audio feature balance (0-1, up to 2 decimal places)
|
|
772
|
-
|
|
773
|
-
**Examples:**
|
|
774
|
-
|
|
775
|
-
With explicit callback URL:
|
|
776
|
-
```json
|
|
777
|
-
{
|
|
778
|
-
"prompt": "A calm and relaxing piano track with soft melodies",
|
|
779
|
-
"customMode": true,
|
|
780
|
-
"instrumental": true,
|
|
781
|
-
"model": "V5",
|
|
782
|
-
"callBackUrl": "https://api.example.com/callback",
|
|
783
|
-
"style": "Classical",
|
|
784
|
-
"title": "Peaceful Piano Meditation"
|
|
785
|
-
}
|
|
786
|
-
```
|
|
787
|
-
|
|
788
|
-
Using automatic callback (no setup required):
|
|
789
|
-
```json
|
|
790
|
-
{
|
|
791
|
-
"prompt": "A relaxing electronic music track",
|
|
792
|
-
"customMode": false,
|
|
793
|
-
"instrumental": false
|
|
794
|
-
}
|
|
795
|
-
```
|
|
796
|
-
|
|
797
|
-
Using explicit model (overrides default V5):
|
|
798
|
-
```json
|
|
799
|
-
{
|
|
800
|
-
"prompt": "A relaxing electronic music track",
|
|
801
|
-
"customMode": false,
|
|
802
|
-
"instrumental": false,
|
|
803
|
-
"model": "V4_5PLUS"
|
|
804
|
-
}
|
|
805
|
-
```
|
|
806
|
-
|
|
807
|
-
**Note**: In custom mode, `style` and `title` are required. If `instrumental` is false, `prompt` is used as exact lyrics. The `callBackUrl` is optional and uses automatic fallback if not provided. The `model` parameter defaults to "V5" but can be explicitly set to any available version.
|
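Because the prompt and style limits depend on both the model and `customMode`, a small lookup table makes the rules easier to apply. The following Python sketch is an assumption-laden illustration: `validate_suno_request` is a hypothetical helper, and it reads "V4_5+" as covering both `V4_5` and `V4_5PLUS`, which the README does not spell out.

```python
from typing import Any, List, Mapping

# Assumption: "V4_5+" in the parameter list covers V4_5 and V4_5PLUS.
PROMPT_LIMITS = {"V3_5": 3000, "V4": 3000, "V4_5": 5000, "V4_5PLUS": 5000, "V5": 5000}
STYLE_LIMITS = {"V3_5": 200, "V4": 200, "V4_5": 1000, "V4_5PLUS": 1000, "V5": 1000}


def validate_suno_request(params: Mapping[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    errors = [f"{key} is required"
              for key in ("prompt", "customMode", "instrumental")
              if key not in params]
    if errors:
        return errors

    model = params.get("model", "V5")  # documented default
    if model not in PROMPT_LIMITS:
        return [f"unknown model: {model}"]

    if params["customMode"]:
        prompt_limit = PROMPT_LIMITS[model]
        for key in ("style", "title"):
            if not params.get(key):
                errors.append(f"{key} is required in custom mode")
        if len(params.get("style", "")) > STYLE_LIMITS[model]:
            errors.append(f"style exceeds {STYLE_LIMITS[model]} characters")
        if len(params.get("title", "")) > 80:
            errors.append("title exceeds 80 characters")
    else:
        prompt_limit = 500  # non-custom mode limit

    if len(params["prompt"]) > prompt_limit:
        errors.append(f"prompt exceeds {prompt_limit} characters")
    return errors
```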
|
808
|
-
|
|
809
|
-
### 8. `elevenlabs_tts`
|
|
810
|
-
Generate speech from text using ElevenLabs TTS models (Turbo 2.5 by default, with optional Multilingual v2 support).
|
|
811
|
-
|
|
812
|
-
**Parameters:**
|
|
813
|
-
- `text` (string, required): The text to convert to speech (max 5000 characters)
|
|
814
|
-
- `model` (enum, optional): TTS model to use - "turbo" (faster, default) or "multilingual" (supports context)
|
|
815
|
-
- `voice` (enum, optional): Voice to use - "Rachel", "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" (default: "Rachel")
|
|
816
|
-
- `stability` (number, optional): Voice stability (0-1, step 0.01, default: 0.5)
|
|
817
|
-
- `similarity_boost` (number, optional): Similarity boost (0-1, step 0.01, default: 0.75)
|
|
818
|
-
- `style` (number, optional): Style exaggeration (0-1, step 0.01, default: 0)
|
|
819
|
-
- `speed` (number, optional): Speech speed (0.7-1.2, step 0.01, default: 1.0)
|
|
820
|
-
- `timestamps` (boolean, optional): Whether to return timestamps for each word (default: false)
|
|
821
|
-
- `previous_text` (string, optional): Text that came before current request (multilingual model only, max 5000 chars)
|
|
822
|
-
- `next_text` (string, optional): Text that comes after current request (multilingual model only, max 5000 chars)
|
|
823
|
-
- `language_code` (string, optional): ISO 639-1 language code for language enforcement (turbo model only, max 500 chars)
|
|
824
|
-
- `callBackUrl` (string, optional): URL to receive task completion updates (automatic fallback if not provided)
|
|
825
|
-
|
|
826
|
-
**Examples:**
|
|
827
|
-
|
|
828
|
-
Basic TTS generation (uses Turbo model by default):
|
|
829
|
-
```json
|
|
830
|
-
{
|
|
831
|
-
"text": "Hello, this is a test of the ElevenLabs text-to-speech system.",
|
|
832
|
-
"voice": "Rachel"
|
|
833
|
-
}
|
|
834
|
-
```
|
|
835
|
-
|
|
836
|
-
Fast generation with language enforcement (Turbo model):
|
|
837
|
-
```json
|
|
838
|
-
{
|
|
839
|
-
"text": "Bonjour, comment allez-vous?",
|
|
840
|
-
"voice": "Rachel",
|
|
841
|
-
"model": "turbo",
|
|
842
|
-
"language_code": "fr"
|
|
843
|
-
}
|
|
844
|
-
```
|
|
845
|
-
|
|
846
|
-
Advanced voice controls with context (Multilingual model):
|
|
847
|
-
```json
|
|
848
|
-
{
|
|
849
|
-
"text": "This is the second part of our conversation.",
|
|
850
|
-
"voice": "Roger",
|
|
851
|
-
"model": "multilingual",
|
|
852
|
-
"stability": 0.8,
|
|
853
|
-
"similarity_boost": 0.9,
|
|
854
|
-
"previous_text": "This is the first part of our conversation.",
|
|
855
|
-
"next_text": "This is the third part of our conversation."
|
|
856
|
-
}
|
|
857
|
-
```
|
|
858
|
-
|
|
859
|
-
**Model Comparison:**
|
|
860
|
-
- **Turbo 2.5** (default): Faster generation (15-60 seconds), supports language enforcement with `language_code`
|
|
861
|
-
- **Multilingual v2**: Supports context with `previous_text`/`next_text`, generation takes 30-120 seconds
|
|
862
|
-
|
|
863
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Choose Turbo model for speed and language enforcement, or Multilingual model for context-aware speech generation.
|
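Several TTS parameters only apply to one of the two models, which is easy to get wrong when switching between them. As a rough sketch (the helper name `check_tts_params` is hypothetical and the server may enforce these rules differently), the model-specific restrictions and numeric ranges above could be checked like this:

```python
from typing import Any, List, Mapping

# Numeric ranges copied from the parameter list above.
TTS_RANGES = {
    "stability": (0.0, 1.0),
    "similarity_boost": (0.0, 1.0),
    "style": (0.0, 1.0),
    "speed": (0.7, 1.2),
}


def check_tts_params(params: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found in an elevenlabs_tts request."""
    problems = []
    text = params.get("text", "")
    if not text:
        problems.append("text is required")
    elif len(text) > 5000:
        problems.append("text exceeds 5000 characters")

    model = params.get("model", "turbo")  # documented default
    if model not in ("turbo", "multilingual"):
        problems.append(f"unknown model: {model}")

    # Context parameters are documented as multilingual model only.
    for key in ("previous_text", "next_text"):
        if key in params and model != "multilingual":
            problems.append(f"{key} is multilingual model only")
    # Language enforcement is documented as turbo model only.
    if "language_code" in params and model != "turbo":
        problems.append("language_code is turbo model only")

    for key, (low, high) in TTS_RANGES.items():
        if key in params and not low <= params[key] <= high:
            problems.append(f"{key} must be between {low} and {high}")
    return problems
```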
|
270
|
+
## Quick Examples
|
|
864
271
|
|
|
865
|
-
###
|
|
866
|
-
Generate sound effects from text descriptions using ElevenLabs Sound Effects v2 model.
|
|
867
|
-
|
|
868
|
-
**Parameters:**
|
|
869
|
-
- `text` (string, required): Description of the sound effect to generate (max 5000 chars)
|
|
870
|
-
- `loop` (boolean, optional): Whether to create a sound effect that loops smoothly (default: false)
|
|
871
|
-
- `duration_seconds` (number, optional): Duration in seconds (0.5-22, step 0.1). If not specified, optimal duration will be determined from prompt
|
|
872
|
-
- `prompt_influence` (number, optional): How closely to follow the prompt (0-1, step 0.01, default: 0.3). Higher values mean less variation
|
|
873
|
-
- `output_format` (string, optional): Audio output format (default: "mp3_44100_192")
|
|
874
|
-
- MP3 options: `mp3_22050_32`, `mp3_44100_32`, `mp3_44100_64`, `mp3_44100_96`, `mp3_44100_128`, `mp3_44100_192`
|
|
875
|
-
- PCM options: `pcm_8000`, `pcm_16000`, `pcm_22050`, `pcm_24000`, `pcm_44100`, `pcm_48000`
|
|
876
|
-
- Telephony: `ulaw_8000`, `alaw_8000`
|
|
877
|
-
- Opus: `opus_48000_32`, `opus_48000_64`, `opus_48000_96`, `opus_48000_128`, `opus_48000_192`
|
|
878
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
879
|
-
|
|
880
|
-
**Examples:**
|
|
881
|
-
|
|
882
|
-
Basic sound effect:
|
|
272
|
+
### Generate Image
|
|
883
273
|
```json
|
|
884
274
|
{
|
|
885
|
-
"
|
|
886
|
-
|
|
887
|
-
|
|
888
|
-
|
|
889
|
-
|
|
890
|
-
|
|
891
|
-
{
|
|
892
|
-
"text": "Epic thunderstorm with heavy rain and distant thunder",
|
|
893
|
-
"duration_seconds": 15.0,
|
|
894
|
-
"prompt_influence": 0.8,
|
|
895
|
-
"output_format": "mp3_44100_192"
|
|
896
|
-
}
|
|
897
|
-
```
|
|
898
|
-
|
|
899
|
-
Looping ambient sound:
|
|
900
|
-
```json
|
|
901
|
-
{
|
|
902
|
-
"text": "Gentle ocean waves lapping at the shore",
|
|
903
|
-
"loop": true,
|
|
904
|
-
"duration_seconds": 10.0
|
|
905
|
-
}
|
|
906
|
-
```
|
|
907
|
-
|
|
908
|
-
**Key Features:**
|
|
909
|
-
- **High-Quality Audio**: Professional-grade sound effect generation
|
|
910
|
-
- **Flexible Duration**: Control exact length from 0.5 to 22 seconds
|
|
911
|
-
- **Loop Support**: Create seamless looping sound effects
|
|
912
|
-
- **Multiple Formats**: Support for MP3, PCM, Opus, and telephony formats
|
|
913
|
-
- **Prompt Control**: Adjust how closely to follow your description
|
|
914
|
-
|
|
915
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Sound effects generation typically takes 30-90 seconds depending on complexity.
|
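The `output_format` values above follow a `codec_samplerate[_bitrate]` naming pattern. The Python sketch below splits such a value into its parts and checks the documented duration range; the helper names are hypothetical, and reading the second field as a sample rate in Hz and the third as a bitrate in kbps is an assumption based on the naming convention rather than something the README states.

```python
from typing import Optional, Tuple


def parse_output_format(value: str) -> Tuple[str, int, Optional[int]]:
    """Split e.g. "mp3_44100_192" into (codec, sample_rate, bitrate)."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected output_format: {value}")
    codec = parts[0]
    sample_rate = int(parts[1])
    # PCM and telephony formats carry no bitrate field.
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return codec, sample_rate, bitrate


def check_duration(duration_seconds: float) -> bool:
    """True when the duration falls inside the documented 0.5-22 second range."""
    return 0.5 <= duration_seconds <= 22
```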
|
916
|
-
|
|
917
|
-
### 10. `bytedance_seedance_video`
|
|
918
|
-
Generate videos using ByteDance Seedance models (unified tool for both text-to-video and image-to-video).
|
|
919
|
-
|
|
920
|
-
**Parameters:**
|
|
921
|
-
- `prompt` (string, required): Text prompt for video generation (max 10000 chars)
|
|
922
|
-
- `image_url` (string, optional): URL of input image for image-to-video generation (if not provided, uses text-to-video)
|
|
923
|
-
- `quality` (string, optional): Model quality level (default: "lite")
|
|
924
|
-
- `lite`: Faster generation with good quality
|
|
925
|
-
- `pro`: Higher quality with longer generation time
|
|
926
|
-
- `aspect_ratio` (string, optional): Video aspect ratio (default: "16:9")
|
|
927
|
-
- Options: `1:1`, `9:16`, `16:9`, `4:3`, `3:4`, `21:9`, `9:21`
|
|
928
|
-
- `resolution` (string, optional): Video resolution (default: "720p")
|
|
929
|
-
- `480p`: Faster generation
|
|
930
|
-
- `720p`: Balanced quality and speed
|
|
931
|
-
- `1080p`: Highest quality
|
|
932
|
-
- `duration` (string, optional): Video duration in seconds 2-12 (default: "5")
|
|
933
|
-
- `camera_fixed` (boolean, optional): Whether to fix camera position (default: false)
|
|
934
|
-
- `seed` (integer, optional): Random seed for reproducible results (default: -1 for random)
|
|
935
|
-
- `enable_safety_checker` (boolean, optional): Enable content safety checking (default: true)
|
|
936
|
-
- `end_image_url` (string, optional): URL of ending image (image-to-video only)
|
|
937
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
938
|
-
|
|
939
|
-
**Examples:**
|
|
940
|
-
|
|
941
|
-
Text-to-video (lite quality):
|
|
942
|
-
```json
|
|
943
|
-
{
|
|
944
|
-
"prompt": "A serene sailing boat gently sways in the harbor at dawn, surrounded by soft Impressionist hues of pink and orange",
|
|
945
|
-
"quality": "lite",
|
|
946
|
-
"aspect_ratio": "16:9",
|
|
947
|
-
"duration": "5"
|
|
948
|
-
}
|
|
949
|
-
```
|
|
950
|
-
|
|
951
|
-
Image-to-video (pro quality):
|
|
952
|
-
```json
|
|
953
|
-
{
|
|
954
|
-
"prompt": "A golden retriever dashing through shallow surf at the beach, splashes frozen in time",
|
|
955
|
-
"image_url": "https://example.com/golden-retriever.jpg",
|
|
956
|
-
"quality": "pro",
|
|
957
|
-
"resolution": "1080p",
|
|
958
|
-
"duration": "6",
|
|
959
|
-
"camera_fixed": false
|
|
960
|
-
}
|
|
961
|
-
```
|
|
962
|
-
|
|
963
|
-
Video with specific ending frame:
|
|
964
|
-
```json
|
|
965
|
-
{
|
|
966
|
-
"prompt": "A traveler crosses an endless desert toward a glowing archway",
|
|
967
|
-
"image_url": "https://example.com/desert-traveler.jpg",
|
|
968
|
-
"end_image_url": "https://example.com/archway.jpg",
|
|
969
|
-
"quality": "pro",
|
|
970
|
-
"duration": "8"
|
|
971
|
-
}
|
|
972
|
-
```
|
|
973
|
-
|
|
974
|
-
**Key Features:**
|
|
975
|
-
- **Unified Interface**: Single tool for both text-to-video and image-to-video
|
|
976
|
-
- **Smart Mode Detection**: Automatically detects mode based on presence of `image_url`
|
|
977
|
-
- **Quality Options**: Lite for speed, Pro for quality
|
|
978
|
-
- **Flexible Aspect Ratios**: Support for vertical, horizontal, and square formats
|
|
979
|
-
- **Camera Control**: Option to fix camera position for stable shots
|
|
980
|
-
- **Reproducible Results**: Seed control for consistent output
|
|
981
|
-
- **Safety Features**: Built-in content safety checking
|
|
982
|
-
|
|
983
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 2-5 minutes depending on quality and complexity.
|
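Mode detection for this tool depends only on whether `image_url` is present. A minimal Python sketch of that rule, together with the documented duration range and the image-to-video restriction on `end_image_url`, might look like this (the function name `detect_seedance_mode` is hypothetical and the real server logic may differ):

```python
from typing import Any, Mapping


def detect_seedance_mode(params: Mapping[str, Any]) -> str:
    """Return "image-to-video" or "text-to-video" after basic validation."""
    prompt = params.get("prompt")
    if not prompt:
        raise ValueError("prompt is required")
    if len(prompt) > 10000:
        raise ValueError("prompt exceeds 10000 characters")

    mode = "image-to-video" if params.get("image_url") else "text-to-video"
    if params.get("end_image_url") and mode != "image-to-video":
        raise ValueError("end_image_url is image-to-video only")

    # duration is passed as a string such as "5"; the documented range is 2-12.
    duration = int(params.get("duration", "5"))
    if not 2 <= duration <= 12:
        raise ValueError("duration must be between 2 and 12 seconds")
    return mode
```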
|
984
|
-
|
|
985
|
-
### 11. `bytedance_seedream_image`
|
|
986
|
-
Generate and edit images using ByteDance Seedream V4 models (unified tool for both text-to-image and image editing).
|
|
987
|
-
|
|
988
|
-
**Parameters:**
|
|
989
|
-
- `prompt` (string, required): Text prompt for image generation or editing (max 10000 chars)
|
|
990
|
-
- `image_urls` (array, optional): Array of image URLs for editing mode (1-10 images, if not provided, uses text-to-image)
|
|
991
|
-
- `image_size` (string, optional): Image aspect ratio (default: "1:1")
|
|
992
|
-
- Options: `1:1`, `4:3`, `3:4`, `16:9`, `9:16`, `21:9`, `9:21`, `3:2`, `2:3`
|
|
993
|
-
- `image_resolution` (string, optional): Image resolution (default: "1K")
|
|
994
|
-
- `1K`: Standard resolution (1024px on shortest side)
|
|
995
|
-
- `2K`: High resolution (2048px on shortest side)
|
|
996
|
-
- `4K`: Ultra high resolution (4096px on shortest side)
|
|
997
|
-
- `max_images` (integer, optional): Number of images to generate (1-6, default: 1)
|
|
998
|
-
- `seed` (integer, optional): Random seed for reproducible results (default: -1 for random)
|
|
999
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1000
|
-
|
|
1001
|
-
**Examples:**
|
|
1002
|
-
|
|
1003
|
-
Text-to-image generation:
|
|
1004
|
-
```json
|
|
1005
|
-
{
|
|
1006
|
-
"prompt": "A majestic dragon perched atop a crystal mountain at sunset, digital art style",
|
|
1007
|
-
"image_size": "16:9",
|
|
1008
|
-
"image_resolution": "2K",
|
|
1009
|
-
"max_images": 2,
|
|
1010
|
-
"seed": 42
|
|
1011
|
-
}
|
|
1012
|
-
```
|
|
1013
|
-
|
|
1014
|
-
Image editing:
|
|
1015
|
-
```json
|
|
1016
|
-
{
|
|
1017
|
-
"prompt": "Transform the day scene into a magical night with glowing stars and moonlight",
|
|
1018
|
-
"image_urls": ["https://example.com/day-landscape.jpg"],
|
|
1019
|
-
"image_size": "16:9",
|
|
1020
|
-
"image_resolution": "2K",
|
|
1021
|
-
"max_images": 1
|
|
1022
|
-
}
|
|
1023
|
-
```
|
|
1024
|
-
|
|
1025
|
-
Multiple image editing:
|
|
1026
|
-
```json
|
|
1027
|
-
{
|
|
1028
|
-
"prompt": "Apply a consistent cyberpunk aesthetic to all images with neon lights and futuristic elements",
|
|
1029
|
-
"image_urls": [
|
|
1030
|
-
"https://example.com/character1.jpg",
|
|
1031
|
-
"https://example.com/character2.jpg",
|
|
1032
|
-
"https://example.com/background.jpg"
|
|
1033
|
-
],
|
|
1034
|
-
"image_resolution": "4K",
|
|
1035
|
-
"max_images": 3
|
|
1036
|
-
}
|
|
1037
|
-
```
|
|
1038
|
-
|
|
1039
|
-
**Key Features:**
|
|
1040
|
-
- **Unified Interface**: Single tool for both text-to-image and image editing
|
|
1041
|
-
- **Smart Mode Detection**: Automatically detects mode based on presence of `image_urls`
|
|
1042
|
-
- **High Resolution**: Support for 1K, 2K, and 4K output
|
|
1043
|
-
- **Multiple Images**: Generate up to 6 images in a single request
|
|
1044
|
-
- **Batch Editing**: Edit up to 10 images simultaneously with consistent style
|
|
1045
|
-
- **Reproducible Results**: Seed control for consistent output
|
|
1046
|
-
|
|
1047
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Image generation typically takes 30-120 seconds depending on resolution and complexity.
|
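Since `image_resolution` is described in terms of the shortest side, the approximate output dimensions follow from the aspect ratio. The sketch below works through that arithmetic in Python; `estimate_dimensions` is a hypothetical helper, and the rounding is an assumption, so the pixel sizes actually returned by the API may differ.

```python
from typing import Tuple

# Shortest-side lengths in pixels, as listed for image_resolution above.
SHORT_SIDE = {"1K": 1024, "2K": 2048, "4K": 4096}


def estimate_dimensions(image_size: str = "1:1",
                        image_resolution: str = "1K") -> Tuple[int, int]:
    """Estimate (width, height) from an aspect ratio like "16:9" and a resolution tier."""
    w_ratio, h_ratio = (int(part) for part in image_size.split(":"))
    short = SHORT_SIDE[image_resolution]
    if w_ratio >= h_ratio:
        # Landscape or square: height is the shortest side.
        return round(short * w_ratio / h_ratio), short
    # Portrait: width is the shortest side.
    return short, round(short * h_ratio / w_ratio)
```

For instance, `estimate_dimensions("16:9", "2K")` gives `(3641, 2048)`, because 2048 × 16 / 9 ≈ 3640.9.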
|
1048
|
-
|
|
1049
|
-
### 12. `qwen_image`
|
|
1050
|
-
Generate and edit images using Qwen models (unified tool for both text-to-image and image editing).
|
|
1051
|
-
|
|
1052
|
-
**Parameters:**
|
|
1053
|
-
- `prompt` (string, required): Text prompt for image generation or editing
|
|
1054
|
-
- `image_url` (string, optional): URL of image to edit (if not provided, uses text-to-image)
|
|
1055
|
-
- `image_size` (string, optional): Image size (default: "square_hd")
|
|
1056
|
-
- Options: `square`, `square_hd`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`
|
|
1057
|
-
- `num_inference_steps` (integer, optional): Number of inference steps (default: 30 for text-to-image, 25 for edit)
|
|
1058
|
-
- Text-to-image: 2-250, Edit: 2-49
|
|
1059
|
-
- `guidance_scale` (number, optional): CFG scale (default: 2.5 for text-to-image, 4 for edit)
|
|
1060
|
-
- Range: 0-20
|
|
1061
|
-
- `enable_safety_checker` (boolean, optional): Enable safety checker (default: true)
|
|
1062
|
-
- `output_format` (string, optional): Output format (default: "png")
|
|
1063
|
-
- Options: `png`, `jpeg`
|
|
1064
|
-
- `negative_prompt` (string, optional): Negative prompt (max 500 chars, default: " ")
|
|
1065
|
-
- `acceleration` (string, optional): Acceleration level (default: "none")
|
|
1066
|
-
- Options: `none`, `regular`, `high`
|
|
1067
|
-
- `num_images` (string, optional): Number of images (edit mode only)
|
|
1068
|
-
- Options: `1`, `2`, `3`, `4`
|
|
1069
|
-
- `sync_mode` (boolean, optional): Sync mode (edit mode only, default: false)
|
|
1070
|
-
- `seed` (number, optional): Random seed for reproducible results
|
|
1071
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1072
|
-
|
|
1073
|
-
**Examples:**
|
|
1074
|
-
|
|
1075
|
-
Text-to-image generation:
|
|
1076
|
-
```json
|
|
1077
|
-
{
|
|
1078
|
-
"prompt": "A beautiful landscape with mountains and a lake at sunset",
|
|
1079
|
-
"image_size": "landscape_16_9",
|
|
1080
|
-
"num_inference_steps": 30,
|
|
1081
|
-
"guidance_scale": 2.5,
|
|
1082
|
-
"output_format": "png",
|
|
1083
|
-
"seed": 42
|
|
1084
|
-
}
|
|
1085
|
-
```
|
|
1086
|
-
|
|
1087
|
-
Image editing:
|
|
1088
|
-
```json
|
|
1089
|
-
{
|
|
1090
|
-
"prompt": "Change the day scene to night with stars and moonlight",
|
|
1091
|
-
"image_url": "https://example.com/day-landscape.jpg",
|
|
1092
|
-
"image_size": "landscape_16_9",
|
|
1093
|
-
"num_inference_steps": 25,
|
|
1094
|
-
"guidance_scale": 4,
|
|
1095
|
-
"num_images": "2",
|
|
1096
|
-
"output_format": "png"
|
|
1097
|
-
}
|
|
1098
|
-
```
|
|
1099
|
-
|
|
1100
|
-
High-acceleration generation:
|
|
1101
|
-
```json
|
|
1102
|
-
{
|
|
1103
|
-
"prompt": "A futuristic city with flying cars",
|
|
1104
|
-
"image_size": "square_hd",
|
|
1105
|
-
"acceleration": "high",
|
|
1106
|
-
"enable_safety_checker": true,
|
|
1107
|
-
"negative_prompt": "blurry, low quality"
|
|
1108
|
-
}
|
|
1109
|
-
```
|
|
1110
|
-
|
|
1111
|
-
**Key Features:**
|
|
1112
|
-
- **Unified Interface**: Single tool for both text-to-image and image editing
|
|
1113
|
-
- **Smart Mode Detection**: Automatically detects mode based on presence of `image_url`
|
|
1114
|
-
- **Flexible Sizing**: Support for multiple aspect ratios and resolutions
|
|
1115
|
-
- **Acceleration Options**: Speed up generation with acceleration levels
|
|
1116
|
-
- **Batch Generation**: Generate multiple images in edit mode
|
|
1117
|
-
- **Reproducible Results**: Seed control for consistent output
|
|
1118
|
-
|
|
1119
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Image generation typically takes 10-60 seconds depending on settings and acceleration level.
|
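The defaults and allowed ranges for `num_inference_steps` and `guidance_scale` change with the detected mode. The following Python sketch resolves the effective settings for a request; it is an illustration only, `qwen_effective_settings` is a hypothetical name, and rejecting edit-only parameters in text-to-image mode is an assumption about how strictly the rules are applied.

```python
from typing import Any, Dict, Mapping


def qwen_effective_settings(params: Mapping[str, Any]) -> Dict[str, Any]:
    """Resolve mode-dependent defaults and validate the documented ranges."""
    edit = bool(params.get("image_url"))
    steps = params.get("num_inference_steps", 25 if edit else 30)
    guidance = params.get("guidance_scale", 4 if edit else 2.5)

    max_steps = 49 if edit else 250
    if not 2 <= steps <= max_steps:
        raise ValueError(f"num_inference_steps must be between 2 and {max_steps}")
    if not 0 <= guidance <= 20:
        raise ValueError("guidance_scale must be between 0 and 20")

    if not edit:
        # Assumption: edit-only parameters are treated as errors in text-to-image mode.
        for key in ("num_images", "sync_mode"):
            if key in params:
                raise ValueError(f"{key} is edit mode only")

    return {
        "mode": "edit" if edit else "text-to-image",
        "num_inference_steps": steps,
        "guidance_scale": guidance,
    }
```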
|
1120
|
-
|
|
1121
|
-
### 13. `runway_aleph_video`
|
|
1122
|
-
Transform videos using Runway Aleph video-to-video generation with AI-powered editing.
|
|
1123
|
-
|
|
1124
|
-
**Parameters:**
|
|
1125
|
-
- `prompt` (string, required): Text prompt describing desired video transformation (max 1000 chars)
|
|
1126
|
-
- `videoUrl` (string, required): URL of the input video to transform
|
|
1127
|
-
- `waterMark` (string, optional): Watermark text to add to the video (max 100 chars, default: "")
|
|
1128
|
-
- `uploadCn` (boolean, optional): Whether to upload to China servers (default: false)
|
|
1129
|
-
- `aspectRatio` (enum, optional): Output video aspect ratio (default: "16:9")
|
|
1130
|
-
- Options: `16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `21:9`
|
|
1131
|
-
- `seed` (integer, optional): Random seed for reproducible results (1-999999)
|
|
1132
|
-
- `referenceImage` (string, optional): URL of reference image for style guidance
|
|
1133
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1134
|
-
|
|
1135
|
-
**Examples:**
|
|
1136
|
-
|
|
1137
|
-
Basic video transformation:
|
|
1138
|
-
```json
|
|
1139
|
-
{
|
|
1140
|
-
"prompt": "Transform this video into a cinematic anime style with vibrant colors",
|
|
1141
|
-
"videoUrl": "https://example.com/input-video.mp4",
|
|
1142
|
-
"aspectRatio": "16:9"
|
|
1143
|
-
}
|
|
1144
|
-
```
|
|
1145
|
-
|
|
1146
|
-
Advanced transformation with reference image:
|
|
1147
|
-
```json
|
|
1148
|
-
{
|
|
1149
|
-
"prompt": "Apply the artistic style of the reference image to this video",
|
|
1150
|
-
"videoUrl": "https://example.com/cooking-video.mp4",
|
|
1151
|
-
"referenceImage": "https://example.com/van-gogh-painting.jpg",
|
|
1152
|
-
"seed": 123456,
|
|
1153
|
-
"waterMark": "My Channel"
|
|
1154
|
-
}
|
|
1155
|
-
```
|
|
1156
|
-
|
|
1157
|
-
Vertical video for social media:
|
|
1158
|
-
```json
|
|
1159
|
-
{
|
|
1160
|
-
"prompt": "Convert to a dreamy, ethereal style with soft lighting",
|
|
1161
|
-
"videoUrl": "https://example.com/landscape-video.mp4",
|
|
1162
|
-
"aspectRatio": "9:16",
|
|
1163
|
-
"uploadCn": false
|
|
1164
|
-
}
|
|
1165
|
-
```
|
|
1166
|
-
|
|
1167
|
-
**Key Features:**
|
|
1168
|
-
- **Video-to-Video Transformation**: Transform existing videos with AI-powered editing
|
|
1169
|
-
- **Style Transfer**: Apply artistic styles from text prompts or reference images
|
|
1170
|
-
- **Aspect Ratio Control**: Convert between horizontal, vertical, and square formats
|
|
1171
|
-
- **Reproducible Results**: Seed control for consistent transformations
|
|
1172
|
-
- **Watermark Support**: Add custom watermarks to transformed videos
|
|
1173
|
-
- **Reference Guidance**: Use reference images to guide the transformation style
|
|
1174
|
-
|
|
1175
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video-to-video transformation typically takes 3-8 minutes depending on complexity and length.
|
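Given the 3-8 minute processing time, it can be worth validating a request locally before submitting it. A hedged Python sketch of such a check follows; `check_aleph_request` is a hypothetical helper that only restates the limits from the parameter list above.

```python
from typing import Any, List, Mapping

ALEPH_ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1", "21:9"}


def check_aleph_request(params: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found in a runway_aleph_video request."""
    problems = []
    prompt = params.get("prompt", "")
    if not prompt:
        problems.append("prompt is required")
    elif len(prompt) > 1000:
        problems.append("prompt exceeds 1000 characters")
    if not params.get("videoUrl"):
        problems.append("videoUrl is required")
    if len(params.get("waterMark", "")) > 100:
        problems.append("waterMark exceeds 100 characters")
    if params.get("aspectRatio", "16:9") not in ALEPH_ASPECT_RATIOS:
        problems.append("unsupported aspectRatio")
    seed = params.get("seed")
    if seed is not None and not 1 <= seed <= 999999:
        problems.append("seed must be between 1 and 999999")
    return problems
```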
|
1176
|
-
|
|
1177
|
-
### 14. `midjourney_generate`
|
|
1178
|
-
Generate images and videos using Midjourney AI models (unified tool for text-to-image, image-to-image, style reference, omni reference, and video generation).
|
|
1179
|
-
|
|
1180
|
-
**Parameters:**
|
|
1181
|
-
- `prompt` (string, required): Text prompt describing the desired image or video (max 2000 chars)
|
|
1182
|
-
- `taskType` (string, optional): Task type for generation mode (auto-detected if not provided)
|
|
1183
|
-
- Options: `mj_txt2img`, `mj_img2img`, `mj_style_reference`, `mj_omni_reference`, `mj_video`, `mj_video_hd`
|
|
1184
|
-
- `fileUrl` (string, optional): Single image URL for image-to-image or video generation (legacy - use fileUrls instead)
|
|
1185
|
-
- `fileUrls` (array, optional): Array of image URLs for image-to-image or video generation (recommended, max 10)
|
|
1186
|
-
- `speed` (string, optional): Generation speed (not required for video/omni tasks)
|
|
1187
|
-
- Options: `relaxed`, `fast`, `turbo`
|
|
1188
|
-
- `aspectRatio` (string, optional): Output aspect ratio (default: "16:9")
|
|
1189
|
-
- Options: `1:2`, `9:16`, `2:3`, `3:4`, `5:6`, `6:5`, `4:3`, `3:2`, `1:1`, `16:9`, `2:1`
|
|
1190
|
-
- `version` (string, optional): Midjourney model version (default: "7")
|
|
1191
|
-
- Options: `7`, `6.1`, `6`, `5.2`, `5.1`, `niji6`
|
|
1192
|
-
- `variety` (integer, optional): Controls diversity of generated results (0-100, increment by 5)
|
|
1193
|
-
- `stylization` (integer, optional): Artistic style intensity (0-1000, suggested multiple of 50)
|
|
1194
|
-
- `weirdness` (integer, optional): Creativity and uniqueness level (0-3000, suggested multiple of 100)
|
|
1195
|
-
- `ow` (integer, optional): Omni intensity parameter for omni reference tasks (1-1000)
|
|
1196
|
-
- `waterMark` (string, optional): Watermark identifier (max 100 chars)
|
|
1197
|
-
- `enableTranslation` (boolean, optional): Auto-translate non-English prompts to English (default: false)
|
|
1198
|
-
- `videoBatchSize` (string, optional): Number of videos to generate (video mode only, default: "1")
|
|
1199
|
-
- Options: `1`, `2`, `4`
|
|
1200
|
-
- `motion` (string, optional): Motion level for video generation (required for video mode, default: "high")
|
|
1201
|
-
- Options: `high`, `low`
|
|
1202
|
-
- `high_definition_video` (boolean, optional): Use HD video generation instead of standard definition (default: false)
|
|
1203
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
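The numeric parameters above each have their own range, and `variety` additionally moves in steps of 5. As a small illustration (the helper name `check_midjourney_numbers` is hypothetical, and the "suggested multiple" hints for `stylization` and `weirdness` are deliberately not enforced), the ranges could be checked like this:

```python
from typing import Any, List, Mapping

# Inclusive ranges copied from the parameter list above.
MJ_RANGES = {
    "variety": (0, 100),
    "stylization": (0, 1000),
    "weirdness": (0, 3000),
    "ow": (1, 1000),
}


def check_midjourney_numbers(params: Mapping[str, Any]) -> List[str]:
    """Return a list of range problems for the numeric Midjourney parameters."""
    problems = []
    for key, (low, high) in MJ_RANGES.items():
        if key in params and not low <= params[key] <= high:
            problems.append(f"{key} must be between {low} and {high}")
    # variety is documented as "increment by 5".
    if "variety" in params and params["variety"] % 5 != 0:
        problems.append("variety must be a multiple of 5")
    return problems
```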
|
1204
|
-
|
|
1205
|
-
**Examples:**
|
|
1206
|
-
|
|
1207
|
-
Text-to-image generation:
|
|
1208
|
-
```json
|
|
1209
|
-
{
|
|
1210
|
-
"prompt": "A majestic dragon perched atop a crystal mountain at sunset, digital art style",
|
|
1211
|
-
"aspectRatio": "16:9",
|
|
1212
|
-
"version": "7",
|
|
1213
|
-
"speed": "fast",
|
|
1214
|
-
"stylization": 500
|
|
1215
|
-
}
|
|
1216
|
-
```
|
|
1217
|
-
|
|
1218
|
-
Image-to-image generation:
|
|
1219
|
-
```json
|
|
1220
|
-
{
|
|
1221
|
-
"prompt": "Transform this portrait into a cyberpunk style with neon lights",
|
|
1222
|
-
"fileUrls": ["https://example.com/portrait.jpg"],
|
|
1223
|
-
"aspectRatio": "1:1",
|
|
1224
|
-
"version": "7",
|
|
1225
|
-
"variety": 10
|
|
1226
|
-
}
|
|
1227
|
-
```
|
|
1228
|
-
|
|
1229
|
-
Standard definition video generation (default):
|
|
1230
|
-
```json
|
|
1231
|
-
{
|
|
1232
|
-
"prompt": "Add gentle movement and atmospheric effects",
|
|
1233
|
-
"fileUrls": ["https://example.com/landscape.jpg"],
|
|
1234
|
-
"motion": "high",
|
|
1235
|
-
"videoBatchSize": "1",
|
|
1236
|
-
"aspectRatio": "16:9"
|
|
1237
|
-
}
|
|
1238
|
-
```
|
|
1239
|
-
|
|
1240
|
-
High definition video generation (explicit):
|
|
1241
|
-
```json
|
|
1242
|
-
{
|
|
1243
|
-
"prompt": "Create cinematic video with dramatic motion",
|
|
1244
|
-
"fileUrls": ["https://example.com/cityscape.jpg"],
|
|
1245
|
-
"motion": "high",
|
|
1246
|
-
"high_definition_video": true,
|
|
1247
|
-
"videoBatchSize": "2",
|
|
1248
|
-
"aspectRatio": "16:9"
|
|
1249
|
-
}
|
|
1250
|
-
```
|
|
1251
|
-
|
|
1252
|
-
Omni reference generation:
|
|
1253
|
-
```json
|
|
1254
|
-
{
|
|
1255
|
-
"prompt": "Place this character in a fantasy forest setting",
|
|
1256
|
-
"fileUrls": ["https://example.com/character.jpg"],
|
|
1257
|
-
"ow": 500,
|
|
1258
|
-
"aspectRatio": "16:9",
|
|
1259
|
-
"version": "7"
|
|
1260
|
-
}
|
|
1261
|
-
```
|
|
1262
|
-
|
|
1263
|
-
Style reference generation:
|
|
1264
|
-
```json
|
|
1265
|
-
{
|
|
1266
|
-
"prompt": "Apply this artistic style to a new landscape",
|
|
1267
|
-
"fileUrls": ["https://example.com/artistic-style.jpg"],
|
|
1268
|
-
"taskType": "mj_style_reference",
|
|
1269
|
-
"aspectRatio": "16:9",
|
|
1270
|
-
"stylization": 700
|
|
1271
|
-
}
|
|
1272
|
-
```
|
|
1273
|
-
|
|
1274
|
-
**Key Features:**
|
|
1275
|
-
- **Unified Interface**: Single tool for all Midjourney generation modes
|
|
1276
|
-
- **Smart Mode Detection**: Automatically detects task type based on parameters
|
|
1277
|
-
- **Video Default**: Uses standard definition video by default, HD only when explicitly requested
|
|
1278
|
-
- **Multiple Aspect Ratios**: Support for vertical, horizontal, square, and ultra-wide formats
|
|
1279
|
-
- **Style Control**: Fine-tune artistic style with stylization, variety, and weirdness parameters
|
|
1280
|
-
- **Speed Options**: Choose generation speed based on urgency (relaxed/fast/turbo)
|
|
1281
|
-
- **Model Versions**: Access different Midjourney models including niji for anime/illustration
|
|
1282
|
-
- **Reference Modes**: Advanced omni and style reference for character and style transfer
|
|
1283
|
-
- **Batch Generation**: Generate multiple videos in a single request
|
|
1284
|
-
|
|
1285
|
-
**Smart Detection Logic:**
|
|
1286
|
-
- If `high_definition_video` is true → `mj_video_hd`
|
|
1287
|
-
- If `motion` or `videoBatchSize` is present → `mj_video` (standard definition; `mj_video_hd` only when `high_definition_video` is true)
|
|
1288
|
-
- If `ow` present → `mj_omni_reference`
|
|
1289
|
-
- If `taskType` is `mj_style_reference` → `mj_style_reference`
|
|
1290
|
-
- If `fileUrl`/`fileUrls` present → `mj_img2img`
|
|
1291
|
-
- Otherwise → `mj_txt2img`
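The detection rules above can be sketched as a small function. This is a hypothetical illustration, not the server's actual code: the names `MidjourneyParams` and `detectMidjourneyTaskType` are invented here, and the sketch assumes the rules are evaluated top to bottom with the first match winning.

```typescript
// Hypothetical sketch of the documented detection rules (not the server's real implementation).
type MidjourneyTaskType =
  | "mj_txt2img"
  | "mj_img2img"
  | "mj_video"
  | "mj_video_hd"
  | "mj_omni_reference"
  | "mj_style_reference";

// Only the parameters that influence detection are modelled here.
interface MidjourneyParams {
  prompt: string;
  fileUrl?: string;
  fileUrls?: string[];
  taskType?: MidjourneyTaskType;
  motion?: "high" | "low";
  videoBatchSize?: "1" | "2" | "4";
  high_definition_video?: boolean;
  ow?: number;
}

// Assumption: rules are checked in the documented order, first match wins.
function detectMidjourneyTaskType(p: MidjourneyParams): MidjourneyTaskType {
  if (p.high_definition_video === true) return "mj_video_hd";
  if (p.motion !== undefined || p.videoBatchSize !== undefined) return "mj_video";
  if (p.ow !== undefined) return "mj_omni_reference";
  if (p.taskType === "mj_style_reference") return "mj_style_reference";
  if (p.fileUrl !== undefined || (p.fileUrls?.length ?? 0) > 0) return "mj_img2img";
  return "mj_txt2img";
}

// The omni reference example from this section resolves to mj_omni_reference.
console.log(
  detectMidjourneyTaskType({
    prompt: "Place this character in a fantasy forest setting",
    fileUrls: ["https://example.com/character.jpg"],
    ow: 500,
  })
);
```

Because video parameters are checked before `ow` and `fileUrls`, a request that carries both an image and `motion` is treated as a video task under this reading.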
|
|
1292
|
-
|
|
1293
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Generation times vary: text-to-image (1-3 minutes), image-to-image (2-4 minutes), video generation (3-8 minutes), reference modes (2-5 minutes).
|
|
1294
|
-
|
|
1295
|
-
### 15. `wan_video`
|
|
1296
|
-
Generate videos using Alibaba Wan 2.5 models (unified tool for both text-to-video and image-to-video).
|
|
1297
|
-
|
|
1298
|
-
**Parameters:**
|
|
1299
|
-
- `prompt` (string, required): Text prompt for video generation (max 800 chars)
|
|
1300
|
-
- `image_url` (string, optional): URL of input image for image-to-video generation (if not provided, uses text-to-video)
|
|
1301
|
-
- `aspect_ratio` (string, optional): Video aspect ratio for text-to-video (default: "16:9")
|
|
1302
|
-
- Options: `16:9`, `9:16`, `1:1`
|
|
1303
|
-
- `resolution` (string, optional): Video resolution (default: "1080p")
|
|
1304
|
-
- `720p`: Faster generation
|
|
1305
|
-
- `1080p`: Higher quality
|
|
1306
|
-
- `duration` (string, optional): Video duration for image-to-video (default: "5")
|
|
1307
|
-
- Options: `5`, `10` seconds
|
|
1308
|
-
- `negative_prompt` (string, optional): Negative prompt to describe content to avoid (max 500 chars, default: "")
|
|
1309
|
-
- `enable_prompt_expansion` (boolean, optional): Enable prompt rewriting using LLM (default: true)
|
|
1310
|
-
- `seed` (integer, optional): Random seed for reproducible results
|
|
1311
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1312
|
-
|
|
1313
|
-
**Examples:**
|
|
1314
|
-
|
|
1315
|
-
Text-to-video generation:
|
|
1316
|
-
```json
|
|
1317
|
-
{
|
|
1318
|
-
"prompt": "A dimly lit jazz bar at night, wooden tables glowing under warm pendant lights. Patrons sip drinks and chat quietly while a three-piece band performs on stage. The saxophone player stands under a spotlight, gleaming instrument reflecting the light. No dialogue. Ambient audio: smooth live jazz music with saxophone and piano, clinking glasses, low murmur of audience conversations.",
|
|
1319
|
-
"aspect_ratio": "16:9",
|
|
1320
|
-
"resolution": "1080p",
|
|
1321
|
-
"enable_prompt_expansion": true,
|
|
1322
|
-
"seed": 42
|
|
1323
|
-
}
|
|
1324
|
-
```
|
|
1325
|
-
|
|
1326
|
-
Image-to-video generation:
|
|
1327
|
-
```json
|
|
1328
|
-
{
|
|
1329
|
-
"prompt": "The same woman from the reference image looks directly into the camera, takes a breath, then smiles brightly and speaks with enthusiasm: 'Have you heard? Alibaba Wan 2.5 API is now available on Kie.ai!'",
|
|
1330
|
-
"image_url": "https://example.com/portrait.jpg",
|
|
1331
|
-
"duration": "5",
|
|
1332
|
-
"resolution": "1080p",
|
|
1333
|
-
"negative_prompt": "blurry, low quality",
|
|
1334
|
-
"seed": 123
|
|
1335
|
-
}
|
|
1336
|
-
```
|
|
1337
|
-
|
|
1338
|
-
**Key Features:**
|
|
1339
|
-
- **Unified Interface**: Single tool for both text-to-video and image-to-video
|
|
1340
|
-
- **Smart Mode Detection**: Automatically detects mode based on presence of `image_url`
|
|
1341
|
-
- **Prompt Expansion**: LLM-powered prompt rewriting for better results with short prompts
|
|
1342
|
-
- **Flexible Resolutions**: 720p for speed, 1080p for quality
|
|
1343
|
-
- **Aspect Ratio Control**: Support for horizontal, vertical, and square formats (text-to-video)
|
|
1344
|
-
- **Duration Control**: 5 or 10 second options for image-to-video
|
|
1345
|
-
- **Negative Prompts**: Fine-tune results by specifying what to avoid
|
|
1346
|
-
- **Reproducible Results**: Seed control for consistent output
|
|
1347
|
-
|
|
1348
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 2-6 minutes depending on resolution and complexity.
|
|
1349
|
-
|
|
1350
|
-
### 16. `hailuo_video`
|
|
1351
|
-
|
|
1352
|
-
Generate professional videos using Hailuo 02 models (unified tool for text-to-video and image-to-video with standard/pro quality).
|
|
1353
|
-
|
|
1354
|
-
**Parameters:**
|
|
1355
|
-
- `prompt` (string, required): Text prompt describing the video content (max 1500 chars)
|
|
1356
|
-
- `imageUrl` (string, optional): URL of input image for image-to-video mode (if not provided, uses text-to-video)
|
|
1357
|
-
- `endImageUrl` (string, optional): URL of end frame image for image-to-video (requires `imageUrl`)
|
|
1358
|
-
- `quality` (string, optional): Quality level of generation (default: "standard")
|
|
1359
|
-
- Options: `standard`, `pro`
|
|
1360
|
-
- `duration` (string, optional): Duration of video in seconds - standard quality only (default: "6")
|
|
1361
|
-
- Options: `6`, `10`
|
|
1362
|
-
- `resolution` (string, optional): Video resolution - standard quality only (default: "768P")
|
|
1363
|
-
- Options: `512P`, `768P`
|
|
1364
|
-
- `promptOptimizer` (boolean, optional): Enable prompt optimization (default: true)
|
|
1365
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1366
|
-
|
|
1367
|
-
**Examples:**
|
|
1368
|
-
|
|
1369
|
-
Text-to-video generation:
|
|
1370
|
-
```json
|
|
1371
|
-
{
|
|
1372
|
-
"prompt": "A cinematic shot of a futuristic city at night with flying vehicles and holographic billboards. Camera pans across the skyline.",
|
|
1373
|
-
"quality": "pro",
|
|
1374
|
-
"promptOptimizer": true
|
|
1375
|
-
}
|
|
1376
|
-
```
|
|
1377
|
-
|
|
1378
|
-
Image-to-video generation (standard quality):
|
|
1379
|
-
```json
|
|
1380
|
-
{
|
|
1381
|
-
"prompt": "The person in the image stands up and walks towards the window, looking out at the scenic view",
|
|
1382
|
-
"imageUrl": "https://example.com/portrait.jpg",
|
|
1383
|
-
"quality": "standard",
|
|
1384
|
-
"duration": "10",
|
|
1385
|
-
"resolution": "768P"
|
|
1386
|
-
}
|
|
1387
|
-
```
|
|
1388
|
-
|
|
1389
|
-
Image-to-video with end frame:
|
|
1390
|
-
```json
|
|
1391
|
-
{
|
|
1392
|
-
"prompt": "A smooth transition from the morning scene to sunset over the mountains",
|
|
1393
|
-
"imageUrl": "https://example.com/start-frame.jpg",
|
|
1394
|
-
"endImageUrl": "https://example.com/end-frame.jpg",
|
|
1395
|
-
"quality": "standard"
|
|
1396
|
-
}
|
|
1397
|
-
```
|
|
1398
|
-
|
|
1399
|
-
**Key Features:**
|
|
1400
|
-
- **Two Intelligent Modes**:
|
|
1401
|
-
- Text-to-video: Create videos from text descriptions
|
|
1402
|
-
- Image-to-video: Animate static images with optional end frame reference
|
|
1403
|
-
- **Quality Selection**: Choose between standard (faster) and pro (higher quality) modes
|
|
1404
|
-
- **Smart Mode Detection**: Automatically selects the best model based on parameters and quality setting
|
|
1405
|
-
- **Standard Quality Options**: Flexible duration (6/10 seconds) and resolution (512P/768P)
|
|
1406
|
-
- **Pro Quality**: Optimized for maximum visual fidelity (no resolution/duration constraints)
|
|
1407
|
-
- **Prompt Optimization**: AI-powered prompt enhancement for better results
|
|
1408
|
-
|
|
1409
|
-
**Model Selection Logic:**
|
|
1410
|
-
- If `imageUrl` provided:
|
|
1411
|
-
- `quality === 'pro'` → `hailuo/02-image-to-video-pro`
|
|
1412
|
-
- Otherwise → `hailuo/02-image-to-video-standard`
|
|
1413
|
-
- Otherwise (text-to-video):
|
|
1414
|
-
- `quality === 'pro'` → `hailuo/02-text-to-video-pro`
|
|
1415
|
-
- Otherwise → `hailuo/02-text-to-video-standard`
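As a rough sketch, the selection table above reduces to two independent choices (mode and quality tier). The function name `selectHailuoModel` and the `HailuoParams` type are hypothetical, and rejecting `endImageUrl` without `imageUrl` by throwing is an assumption based on the parameter description, not confirmed server behaviour.

```typescript
// Hypothetical sketch of the documented Hailuo model selection.
interface HailuoParams {
  prompt: string;
  imageUrl?: string;
  endImageUrl?: string;
  quality?: "standard" | "pro";
}

function selectHailuoModel(p: HailuoParams): string {
  // Assumption: an end frame without a start image is treated as an error.
  if (p.endImageUrl !== undefined && p.imageUrl === undefined) {
    throw new Error("endImageUrl requires imageUrl");
  }
  const mode = p.imageUrl !== undefined ? "image-to-video" : "text-to-video";
  // Anything other than an explicit "pro" falls back to standard (the documented default).
  const tier = p.quality === "pro" ? "pro" : "standard";
  return `hailuo/02-${mode}-${tier}`;
}

console.log(selectHailuoModel({ prompt: "A futuristic city at night", quality: "pro" }));
```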
|
|
1416
|
-
|
|
1417
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 1-5 minutes depending on quality setting and complexity.
|
|
1418
|
-
|
|
1419
|
-
### 17. `kling_video`
|
|
1420
|
-
|
|
1421
|
-
Generate high-quality videos using Kling AI models (unified tool for text-to-video, image-to-video, and v2.1-pro with start+end frames).
|
|
1422
|
-
|
|
1423
|
-
**Parameters:**
|
|
1424
|
-
- `prompt` (string, required): Text prompt describing the video (max 5000 chars)
|
|
1425
|
-
- `image_url` (string, optional): URL of input image for image-to-video or v2.1-pro start frame (if not provided, uses text-to-video)
|
|
1426
|
-
- `tail_image_url` (string, optional): URL of end frame image for v2.1-pro (requires `image_url`). When provided, the v2.1-pro model is used with start and end frame references.
|
|
1427
|
-
- `duration` (string, optional): Duration of video in seconds (default: "5")
|
|
1428
|
-
- Options: `5`, `10`
|
|
1429
|
-
- `aspect_ratio` (string, optional): Aspect ratio for text-to-video (default: "16:9")
|
|
1430
|
-
- Options: `16:9`, `9:16`, `1:1`
|
|
1431
|
-
- `negative_prompt` (string, optional): Elements to avoid (max 2500 chars, default: "blur, distort, and low quality")
|
|
1432
|
-
- `cfg_scale` (number, optional): CFG scale for prompt adherence (0-1, step 0.1, default: 0.5)
|
|
1433
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1434
|
-
|
|
1435
|
-
**Examples:**
|
|
1436
|
-
|
|
1437
|
-
Text-to-video generation:
|
|
1438
|
-
```json
|
|
1439
|
-
{
|
|
1440
|
-
"prompt": "A serene forest scene with sunlight filtering through the canopy. Birds chirping, gentle breeze rustling leaves. Camera slowly pans through the trees revealing a hidden waterfall",
|
|
1441
|
-
"aspect_ratio": "16:9",
|
|
1442
|
-
"duration": "10",
|
|
1443
|
-
"cfg_scale": 0.7
|
|
1444
|
-
}
|
|
1445
|
-
```
|
|
1446
|
-
|
|
1447
|
-
Image-to-video generation:
|
|
1448
|
-
```json
|
|
1449
|
-
{
|
|
1450
|
-
"prompt": "The person in the image waves and smiles, then turns to look at the scenic mountain view",
|
|
1451
|
-
"image_url": "https://example.com/portrait.jpg",
|
|
1452
|
-
"duration": "5"
|
|
1453
|
-
}
|
|
1454
|
-
```
|
|
1455
|
-
|
|
1456
|
-
V2.1-pro with start and end frames:
|
|
1457
|
-
```json
|
|
1458
|
-
{
|
|
1459
|
-
"prompt": "A smooth transition showing the landscape changing from day to night, with the person from frame 1 walking towards the sunset",
|
|
1460
|
-
"image_url": "https://example.com/start-frame.jpg",
|
|
1461
|
-
"tail_image_url": "https://example.com/end-frame.jpg",
|
|
1462
|
-
"duration": "10",
|
|
1463
|
-
"cfg_scale": 0.6
|
|
1464
|
-
}
|
|
1465
|
-
```
|
|
1466
|
-
|
|
1467
|
-
**Key Features:**
|
|
1468
|
-
- **Three Intelligent Modes**:
|
|
1469
|
-
- Text-to-video: Create videos from text descriptions
|
|
1470
|
-
- Image-to-video: Animate static images
|
|
1471
|
-
- V2.1-pro: Advanced mode with start and end frame references for controlled video transitions
|
|
1472
|
-
- **Smart Mode Detection**: Automatically selects the best model based on parameters
|
|
1473
|
-
- **Start/End Frame Control**: V2.1-pro uniquely supports specifying both start and end frames for precise video flows
|
|
1474
|
-
- **Flexible Duration**: 5 or 10 second options
|
|
1475
|
-
- **Aspect Ratio Control**: Multiple formats for text-to-video (16:9, 9:16, 1:1)
|
|
1476
|
-
- **Quality Control**: CFG scale for controlling prompt adherence
|
|
1477
|
-
- **Negative Prompts**: Fine-tune by specifying what to avoid
|
|
1478
|
-
|
|
1479
|
-
**Model Selection Logic:**
|
|
1480
|
-
- If `tail_image_url` provided → `kling/v2-1-pro` (start + end frame reference)
|
|
1481
|
-
- If `image_url` provided → `kling/v2-5-turbo-image-to-video-pro` (image animation)
|
|
1482
|
-
- Otherwise → `kling/v2-5-turbo-text-to-video-pro` (text-to-video)
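The three rules above might be expressed as follows. This is a minimal sketch with invented names (`KlingParams`, `selectKlingModel`); throwing when `tail_image_url` is given without `image_url` is an assumption drawn from the parameter description.

```typescript
// Hypothetical sketch of the documented Kling model selection.
interface KlingParams {
  prompt: string;
  image_url?: string;
  tail_image_url?: string;
}

function selectKlingModel(p: KlingParams): string {
  if (p.tail_image_url !== undefined) {
    // Assumption: the end frame is only valid together with a start frame.
    if (p.image_url === undefined) {
      throw new Error("tail_image_url requires image_url");
    }
    return "kling/v2-1-pro";
  }
  if (p.image_url !== undefined) {
    return "kling/v2-5-turbo-image-to-video-pro";
  }
  return "kling/v2-5-turbo-text-to-video-pro";
}

console.log(
  selectKlingModel({
    prompt: "A smooth transition from day to night",
    image_url: "https://example.com/start-frame.jpg",
    tail_image_url: "https://example.com/end-frame.jpg",
  })
);
```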
|
|
1483
|
-
|
|
1484
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 2-5 minutes depending on duration and complexity.
|
|
1485
|
-
|
|
1486
|
-
### 18. `openai_4o_image`
|
|
1487
|
-
Generate, edit, and create image variants using OpenAI's GPT-4o image models (unified tool for text-to-image, image editing, and image variants).
|
|
1488
|
-
|
|
1489
|
-
**Parameters:**
|
|
1490
|
-
- `prompt` (string, required): Text prompt for image generation or editing (max 4000 chars)
|
|
1491
|
-
- `filesUrl` (string, optional): URL of input image for editing/variants mode (if not provided, uses text-to-image)
|
|
1492
|
-
- `maskUrl` (string, optional): URL of mask image (required for editing mode; must have the same dimensions as the `filesUrl` image)
|
|
1493
|
-
- `nVariants` (integer, optional): Number of image variants to generate (1-4, default: 4)
|
|
1494
|
-
- `size` (string, optional): Output image size (default: "1024x1024")
|
|
1495
|
-
- Options: `256x256`, `512x512`, `1024x1024`, `1792x1024`, `1024x1792`
|
|
1496
|
-
- `model` (string, optional): Model to use (default: "gpt-4o-image")
|
|
1497
|
-
- Options: `gpt-4o-image`, `gpt-4o-image-mini`
|
|
1498
|
-
- `style` (string, optional): Image style (default: "vivid")
|
|
1499
|
-
- Options: `vivid`, `natural`
|
|
1500
|
-
- `quality` (string, optional): Image quality (default: "standard")
|
|
1501
|
-
- Options: `standard`, `hd`
|
|
1502
|
-
- `responseFormat` (string, optional): Response format (default: "url")
|
|
1503
|
-
- Options: `url`, `b64_json`
|
|
1504
|
-
- `user` (string, optional): User identifier for tracking (max 100 chars)
|
|
1505
|
-
- `enableFallback` (boolean, optional): Enable fallback mechanism (default: true)
|
|
1506
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1507
|
-
|
|
1508
|
-
**Examples:**
|
|
1509
|
-
|
|
1510
|
-
Text-to-image generation:
|
|
1511
|
-
```json
|
|
1512
|
-
{
|
|
1513
|
-
"prompt": "A futuristic city skyline at sunset with flying cars and neon lights, cyberpunk style",
|
|
1514
|
-
"nVariants": 4,
|
|
1515
|
-
"size": "1024x1024",
|
|
1516
|
-
"quality": "hd",
|
|
1517
|
-
"style": "vivid"
|
|
1518
|
-
}
|
|
1519
|
-
```
|
|
1520
|
-
|
|
1521
|
-
Image editing with mask:
|
|
1522
|
-
```json
|
|
1523
|
-
{
|
|
1524
|
-
"prompt": "Replace the cloudy sky with a clear starry night and add a full moon",
|
|
1525
|
-
"filesUrl": "https://example.com/landscape.jpg",
|
|
1526
|
-
"maskUrl": "https://example.com/landscape-mask.png",
|
|
1527
|
-
"nVariants": 2,
|
|
1528
|
-
"size": "1024x1024",
|
|
1529
|
-
"quality": "hd"
|
|
1530
|
-
}
|
|
1531
|
-
```
|
|
1532
|
-
|
|
1533
|
-
Image variants:
|
|
1534
|
-
```json
|
|
1535
|
-
{
|
|
1536
|
-
"filesUrl": "https://example.com/portrait.jpg",
|
|
1537
|
-
"nVariants": 4,
|
|
1538
|
-
"style": "natural",
|
|
1539
|
-
"quality": "standard"
|
|
1540
|
-
}
|
|
1541
|
-
```
|
|
1542
|
-
|
|
1543
|
-
High-quality generation with fallback:
|
|
1544
|
-
```json
|
|
1545
|
-
{
|
|
1546
|
-
"prompt": "A detailed oil painting of a serene mountain lake at dawn",
|
|
1547
|
-
"nVariants": 2,
|
|
1548
|
-
"size": "1792x1024",
|
|
1549
|
-
"quality": "hd",
|
|
1550
|
-
"model": "gpt-4o-image",
|
|
1551
|
-
"enableFallback": true
|
|
1552
|
-
}
|
|
1553
|
-
```
|
|
1554
|
-
|
|
1555
|
-
**Key Features:**
|
|
1556
|
-
- **Unified Interface**: Single tool for text-to-image, image editing, and image variants
|
|
1557
|
-
- **Smart Mode Detection**: Automatically detects mode based on provided parameters
|
|
1558
|
-
- Text-to-Image: `prompt` provided, no `filesUrl`
|
|
1559
|
-
- Image Editing: `filesUrl` + `maskUrl` provided
|
|
1560
|
-
- Image Variants: `filesUrl` provided, no `maskUrl`
|
|
1561
|
-
- **Multiple Variants**: Generate up to 4 image variations in a single request
|
|
1562
|
-
- **Flexible Sizing**: Support for square, portrait, and landscape formats
|
|
1563
|
-
- **Quality Options**: Standard or HD quality for different use cases
|
|
1564
|
-
- **Style Control**: Choose between vivid (creative) or natural (realistic) styles
|
|
1565
|
-
- **Fallback Support**: Automatic fallback to FLUX_MAX model if GPT-4o fails
|
|
1566
|
-
- **Model Options**: Use full GPT-4o or mini model based on requirements
|
|
1567
|
-
|
|
1568
|
-
**Smart Detection Logic:**
|
|
1569
|
-
- If `filesUrl` and `maskUrl` provided → Image Editing mode
|
|
1570
|
-
- If `filesUrl` provided but no `maskUrl` → Image Variants mode
|
|
1571
|
-
- If no `filesUrl` provided → Text-to-Image mode
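A compact way to read these rules is as a function of two optional fields. The sketch below uses hypothetical names (`Gpt4oImageParams`, `detectGpt4oImageMode`) and only models the fields that drive detection; how the real server handles a `maskUrl` supplied without `filesUrl` is not stated here, so the sketch simply follows the third rule in that case.

```typescript
// Hypothetical sketch of the documented GPT-4o image mode detection.
type Gpt4oImageMode = "text-to-image" | "image-editing" | "image-variants";

interface Gpt4oImageParams {
  prompt?: string;
  filesUrl?: string;
  maskUrl?: string;
}

function detectGpt4oImageMode(p: Gpt4oImageParams): Gpt4oImageMode {
  if (p.filesUrl !== undefined) {
    // An input image plus a mask means editing; an input image alone means variants.
    return p.maskUrl !== undefined ? "image-editing" : "image-variants";
  }
  // No input image: text-to-image (a stray maskUrl is ignored in this sketch).
  return "text-to-image";
}

console.log(
  detectGpt4oImageMode({
    prompt: "Replace the cloudy sky with a clear starry night",
    filesUrl: "https://example.com/landscape.jpg",
    maskUrl: "https://example.com/landscape-mask.png",
  })
);
```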
|
|
1572
|
-
|
|
1573
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Image generation typically takes 30-120 seconds depending on complexity and quality settings. The fallback mechanism uses FLUX_MAX model when GPT-4o fails, ensuring reliable generation.
|
|
1574
|
-
|
|
1575
|
-
### 19. `flux_kontext_image`
|
|
1576
|
-
Generate or edit images using Flux Kontext AI models (unified tool for text-to-image generation and image editing with advanced features).
|
|
1577
|
-
|
|
1578
|
-
**Parameters:**
|
|
1579
|
-
- `prompt` (string, required): Text prompt describing the desired image or edit (max 5000 chars, English recommended)
|
|
1580
|
-
- `inputImage` (string, optional): Input image URL for editing mode (omit for text-to-image generation)
|
|
1581
|
-
- `aspectRatio` (string, optional): Output aspect ratio (default: "16:9")
|
|
1582
|
-
- Options: `21:9` (ultra-wide), `16:9` (widescreen), `4:3` (standard), `1:1` (square), `3:4` (portrait), `9:16` (mobile portrait)
|
|
1583
|
-
- `outputFormat` (string, optional): Output image format (default: "jpeg")
|
|
1584
|
-
- Options: `jpeg`, `png`
|
|
1585
|
-
- `model` (string, optional): Model version (default: "flux-kontext-pro")
|
|
1586
|
-
- Options: `flux-kontext-pro` (standard), `flux-kontext-max` (enhanced)
|
|
1587
|
-
- `enableTranslation` (boolean, optional): Auto-translate non-English prompts (default: true)
|
|
1588
|
-
- `promptUpsampling` (boolean, optional): Enable prompt enhancement (default: false)
|
|
1589
|
-
- `safetyTolerance` (integer, optional): Content moderation level (default: 2)
|
|
1590
|
-
- Generation mode: 0-6 (0=strict, 6=permissive)
|
|
1591
|
-
- Editing mode: 0-2 (0=strict, 2=balanced)
|
|
1592
|
-
- `uploadCn` (boolean, optional): Route uploads via China servers (default: false)
|
|
1593
|
-
- `watermark` (string, optional): Watermark identifier to add to generated image
|
|
1594
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1595
|
-
|
|
1596
|
-
**Examples:**
|
|
1597
|
-
|
|
1598
|
-
Text-to-image generation:
|
|
1599
|
-
```json
|
|
1600
|
-
{
|
|
1601
|
-
"prompt": "A serene mountain landscape at sunset with a lake reflecting the orange sky, photorealistic style",
|
|
1602
|
-
"aspectRatio": "16:9",
|
|
1603
|
-
"model": "flux-kontext-max",
|
|
1604
|
-
"outputFormat": "png"
|
|
1605
|
-
}
|
|
1606
|
-
```
|
|
1607
|
-
|
|
1608
|
-
Image editing:
|
|
1609
|
-
```json
|
|
1610
|
-
{
|
|
1611
|
-
"prompt": "Replace the sky with a starry night and add glowing lanterns",
|
|
1612
|
-
"inputImage": "https://example.com/original-image.jpg",
|
|
1613
|
-
"aspectRatio": "16:9",
|
|
1614
|
-
"safetyTolerance": 2,
|
|
1615
|
-
"enableTranslation": false
|
|
1616
|
-
}
|
|
1617
|
-
```
|
|
1618
|
-
|
|
1619
|
-
Mobile portrait generation:
|
|
1620
|
-
```json
|
|
1621
|
-
{
|
|
1622
|
-
"prompt": "A futuristic cityscape with flying cars and neon lights, cyberpunk style",
|
|
1623
|
-
"aspectRatio": "9:16",
|
|
1624
|
-
"model": "flux-kontext-max",
|
|
1625
|
-
"promptUpsampling": true
|
|
1626
|
-
}
|
|
1627
|
-
```
|
|
1628
|
-
|
|
1629
|
-
**Key Features:**
|
|
1630
|
-
- **Unified Interface**: Single tool for both text-to-image generation and image editing
|
|
1631
|
-
- **Smart Mode Detection**: Automatically detects mode based on `inputImage` parameter
|
|
1632
|
-
- Text-to-Image: No `inputImage` provided
|
|
1633
|
-
- Image Editing: `inputImage` provided
|
|
1634
|
-
- **Advanced Translation**: Automatic translation of non-English prompts to English
|
|
1635
|
-
- **Multiple Aspect Ratios**: Support for ultra-wide, standard, square, and mobile formats
|
|
1636
|
-
- **Model Selection**: Choose between standard (pro) and enhanced (max) quality models
|
|
1637
|
-
- **Safety Controls**: Configurable content moderation with different levels for generation vs editing
|
|
1638
|
-
- **Prompt Enhancement**: Optional upsampling for improved generation quality
|
|
1639
|
-
- **Watermark Support**: Add custom watermarks to generated images
|
|
1640
|
-
- **Regional Optimization**: Choose optimal server region for uploads
|
|
1641
|
-
|
|
1642
|
-
**Smart Detection Logic:**
|
|
1643
|
-
- If `inputImage` provided → Image Editing mode
|
|
1644
|
-
- If no `inputImage` provided → Text-to-Image mode
|
|
1645
|
-
|
|
1646
|
-
**Performance:**
|
|
1647
|
-
- Text-to-image generation: 30-60 seconds
|
|
1648
|
-
- Image editing: 1-3 minutes
|
|
1649
|
-
- Enhanced model (flux-kontext-max): May take longer but provides higher quality
|
|
1650
|
-
|
|
1651
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Safety tolerance levels are automatically validated based on the generation mode (0-2 for editing, 0-6 for generation).
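The mode-dependent range check described in the note could look roughly like this. The function name `validateFluxSafetyTolerance` is hypothetical, and raising a `RangeError` for out-of-range values is an assumption; the note only says the level is validated, not how failures are reported.

```typescript
// Hypothetical sketch of mode-dependent safetyTolerance validation.
function validateFluxSafetyTolerance(
  safetyTolerance: number | undefined,
  inputImage?: string
): number {
  const value = safetyTolerance ?? 2; // documented default
  // Editing mode (inputImage present) allows 0-2; generation mode allows 0-6.
  const max = inputImage !== undefined ? 2 : 6;
  if (!Number.isInteger(value) || value < 0 || value > max) {
    // Assumption: invalid values are rejected rather than clamped.
    throw new RangeError(`safetyTolerance must be an integer between 0 and ${max}`);
  }
  return value;
}

console.log(validateFluxSafetyTolerance(6)); // generation mode accepts the full 0-6 range
```

Under this reading, a value such as `3` is accepted for text-to-image requests but rejected as soon as `inputImage` is supplied.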
|
|
1652
|
-
|
|
1653
|
-
### 20. `ideogram_reframe`
|
|
1654
|
-
Reframe images to different aspect ratios and sizes using Ideogram V3 Reframe model with intelligent content adaptation.
|
|
1655
|
-
|
|
1656
|
-
**Parameters:**
|
|
1657
|
-
- `image_url` (string, required): URL of image to reframe (JPEG, PNG, WEBP, max 10MB)
|
|
1658
|
-
- `image_size` (string, optional): Output size for the reframed image (default: "square_hd")
|
|
1659
|
-
- Options: `square`, `square_hd`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`
|
|
1660
|
-
- `rendering_speed` (string, optional): Rendering speed for generation (default: "BALANCED")
|
|
1661
|
-
- Options: `TURBO` (fast), `BALANCED` (default), `QUALITY` (best)
|
|
1662
|
-
- `style` (string, optional): Style type for generation (default: "AUTO")
|
|
1663
|
-
- Options: `AUTO`, `GENERAL`, `REALISTIC`, `DESIGN`
|
|
1664
|
-
- `num_images` (string, optional): Number of images to generate (default: "1")
|
|
1665
|
-
- Options: `1`, `2`, `3`, `4`
|
|
1666
|
-
- `seed` (number, optional): Seed for reproducible results (default: 0)
|
|
1667
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1668
|
-
|
|
1669
|
-
**Examples:**
|
|
1670
|
-
|
|
1671
|
-
Basic reframing to square HD:
|
|
1672
|
-
```json
|
|
1673
|
-
{
|
|
1674
|
-
"image_url": "https://example.com/landscape-photo.jpg",
|
|
1675
|
-
"image_size": "square_hd"
|
|
1676
|
-
}
|
|
1677
|
-
```
|
|
1678
|
-
|
|
1679
|
-
High-quality portrait reframing:
|
|
1680
|
-
```json
|
|
1681
|
-
{
|
|
1682
|
-
"image_url": "https://example.com/group-photo.jpg",
|
|
1683
|
-
"image_size": "portrait_16_9",
|
|
1684
|
-
"rendering_speed": "QUALITY",
|
|
1685
|
-
"style": "REALISTIC",
|
|
1686
|
-
"num_images": "2"
|
|
1687
|
-
}
|
|
1688
|
-
```
|
|
1689
|
-
|
|
1690
|
-
Fast generation with custom style:
|
|
1691
|
-
```json
|
|
1692
|
-
{
|
|
1693
|
-
"image_url": "https://example.com/artwork.jpg",
|
|
1694
|
-
"image_size": "landscape_16_9",
|
|
1695
|
-
"rendering_speed": "TURBO",
|
|
1696
|
-
"style": "DESIGN",
|
|
1697
|
-
"seed": 42
|
|
275
|
+
"tool": "nano_banana_image",
|
|
276
|
+
"arguments": {
|
|
277
|
+
"prompt": "A futuristic city at sunset, cyberpunk style",
|
|
278
|
+
"image_size": "16:9",
|
|
279
|
+
"output_format": "png"
|
|
280
|
+
}
|
|
1698
281
|
}
|
|
1699
282
|
```
|
|
1700
283
|
|
|
1701
|
-
|
|
284
|
+
### Generate Video
|
|
1702
285
|
```json
|
|
1703
286
|
{
|
|
1704
|
-
"
|
|
1705
|
-
"
|
|
1706
|
-
|
|
1707
|
-
|
|
287
|
+
"tool": "sora_video",
|
|
288
|
+
"arguments": {
|
|
289
|
+
"prompt": "A peaceful garden with blooming flowers and butterflies",
|
|
290
|
+
"model": "sora-2",
|
|
291
|
+
"resolution": "1080p",
|
|
292
|
+
"duration": "10"
|
|
293
|
+
}
|
|
1708
294
|
}
|
|
1709
295
|
```
|
|
1710
296
|
|
|
1711
|
-
|
|
1712
|
-
- **Intelligent Content Adaptation**: Smart content-aware reframing that preserves important elements
|
|
1713
|
-
- **Multiple Aspect Ratios**: Support for square, portrait, and landscape formats
|
|
1714
|
-
- **Rendering Speed Control**: Choose between speed (TURBO), balance (BALANCED), or quality (QUALITY)
|
|
1715
|
-
- **Style Options**: Auto-detection or specific style types (GENERAL, REALISTIC, DESIGN)
|
|
1716
|
-
- **Batch Generation**: Create multiple variants in a single request
|
|
1717
|
-
- **Reproducible Results**: Seed control for consistent output across sessions
|
|
1718
|
-
- **Professional Quality**: High-quality reframing with minimal artifacts
|
|
1719
|
-
|
|
1720
|
-
**Output Sizes:**
|
|
1721
|
-
- **Square**: 1:1 aspect ratio for social media and avatars
|
|
1722
|
-
- **Square HD**: High-definition square format with better quality
|
|
1723
|
-
- **Portrait 4:3**: Standard portrait orientation
|
|
1724
|
-
- **Portrait 16:9**: Wide portrait for mobile and stories
|
|
1725
|
-
- **Landscape 4:3**: Traditional landscape orientation
|
|
1726
|
-
- **Landscape 16:9**: Widescreen format for displays and video
|
|
1727
|
-
|
|
1728
|
-
**Use Cases:**
|
|
1729
|
-
- **Social Media**: Convert images to optimal formats for different platforms
|
|
1730
|
-
- **Content Adaptation**: Repurpose content for multiple aspect ratios
|
|
1731
|
-
- **Design Workflows**: Generate variations for different layout requirements
|
|
1732
|
-
- **Mobile Optimization**: Create mobile-friendly versions of desktop content
|
|
1733
|
-
- **Batch Processing**: Generate multiple format variants efficiently
|
|
1734
|
-
|
|
1735
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Image reframing typically takes 30-120 seconds depending on complexity, rendering speed, and output settings.
|
|
1736
|
-
|
|
1737
|
-
### 21. `recraft_remove_background`
|
|
1738
|
-
Remove backgrounds from images using Recraft AI background removal model with professional-quality edge detection.
|
|
1739
|
-
|
|
1740
|
-
**Parameters:**
|
|
1741
|
-
- `image` (string, required): URL of image to remove background from (PNG, JPG, WEBP, max 5MB, 16MP, 4096px max, 256px min)
|
|
1742
|
-
- `callBackUrl` (string, optional): URL for task completion notifications
|
|
1743
|
-
|
|
1744
|
-
**Examples:**
|
|
1745
|
-
|
|
1746
|
-
Basic background removal:
|
|
297
|
+
### Generate Music
|
|
1747
298
|
```json
|
|
1748
299
|
{
|
|
1749
|
-
"
|
|
300
|
+
"tool": "suno_generate_music",
|
|
301
|
+
"arguments": {
|
|
302
|
+
"prompt": "Upbeat electronic music with energetic beats",
|
|
303
|
+
"customMode": true,
|
|
304
|
+
"instrumental": false,
|
|
305
|
+
"model": "V5",
|
|
306
|
+
"style": "Electronic",
|
|
307
|
+
"title": "Energy Boost"
|
|
308
|
+
}
|
|
1750
309
|
}
|
|
1751
310
|
```
|
|
1752
311
|
|
|
1753
|
-
|
|
312
|
+
### Text-to-Speech
|
|
1754
313
|
```json
|
|
1755
314
|
{
|
|
1756
|
-
"
|
|
1757
|
-
"
|
|
315
|
+
"tool": "elevenlabs_tts",
|
|
316
|
+
"arguments": {
|
|
317
|
+
"text": "Welcome to the future of AI-powered content creation!",
|
|
318
|
+
"voice": "Rachel",
|
|
319
|
+
"model": "turbo"
|
|
320
|
+
}
|
|
1758
321
|
}
|
|
1759
322
|
```
|
|
1760
323
|
|
|
1761
|
-
**
|
|
1762
|
-
- **Professional Quality**: Clean edge detection with precise background separation
|
|
1763
|
-
- **Format Support**: Works with PNG, JPG, and WEBP images
|
|
1764
|
-
- **Size Optimization**: Handles images up to 16MP with optimal processing
|
|
1765
|
-
- **Fast Processing**: Quick background removal for most image types
|
|
1766
|
-
- **Automatic Enhancement**: Smart edge refinement for natural results
|
|
1767
|
-
|
|
1768
|
-
**Use Cases:**
|
|
1769
|
-
- **Product Photography**: Create clean product images with transparent backgrounds
|
|
1770
|
-
- **Portrait Processing**: Remove backgrounds for professional headshots
|
|
1771
|
-
- **Design Workflows**: Isolate subjects for composite images
|
|
1772
|
-
- **E-commerce**: Prepare product images for catalogs
|
|
1773
|
-
- **Content Creation**: Create assets for social media and marketing
|
|
1774
|
-
|
|
1775
|
-
**Technical Specifications:**
|
|
1776
|
-
- **Supported Formats**: PNG, JPG, WEBP
|
|
1777
|
-
- **Maximum File Size**: 5MB
|
|
1778
|
-
- **Maximum Resolution**: 16MP (4096px max dimension)
|
|
1779
|
-
- **Minimum Resolution**: 256px min dimension
|
|
1780
|
-
- **Output Format**: PNG with transparent background
|
|
1781
|
-
|
|
1782
|
-
**Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Background removal typically takes 10-30 seconds depending on image complexity and size.
|
|
1783
|
-
|
|
1784
|
-
## Why Developers Choose Kie.ai Over Alternatives
|
|
1785
|
-
|
|
1786
|
-
### 💸 **Better Value Than Fal.ai**
|
|
1787
|
-
- **Lower costs** for the same premium AI models
|
|
1788
|
-
- **Pay-as-you-go pricing** - no monthly commitments
|
|
1789
|
-
- **Free trial** to test before you buy
|
|
1790
|
-
|
|
1791
|
-
### 🛠️ **Developer Experience**
|
|
1792
|
-
- **Single API key** for all models
|
|
1793
|
-
- **Documentation** with examples
|
|
1794
|
-
- **Simple integration** - get started in minutes
|
|
1795
|
-
- **24/7 support** from technical team
|
|
1796
|
-
|
|
1797
|
-
### 🚀 **Performance**
|
|
1798
|
-
- **99.9% uptime**
|
|
1799
|
-
- **Fast response times** (25.2s average)
|
|
1800
|
-
- **High concurrency** for production workloads
|
|
1801
|
-
- **Reliable results**
|
|
1802
|
-
|
|
1803
|
-
### 🔒 **Security**
|
|
1804
|
-
- **Encryption** for your data
|
|
1805
|
-
- **GDPR compliant** data handling
|
|
1806
|
-
- **Private prompts and results**
|
|
1807
|
-
- **Regular security updates**
|
|
1808
|
-
|
|
1809
|
-
### 🎯 **Platform**
|
|
1810
|
-
- **Latest AI models** as they're released
|
|
1811
|
-
- **Backward compatible** API
|
|
1812
|
-
- **Feature updates** based on feedback
|
|
1813
|
-
- **Active development**
|
|
1814
|
-
|
|
1815
|
-
## API Endpoints
|
|
1816
|
-
|
|
1817
|
-
The server interfaces with these Kie.ai API endpoints:
|
|
1818
|
-
|
|
1819
|
-
- **Veo3 Video Generation**: `POST /api/v1/veo/generate`
|
|
1820
|
-
- **Veo3 Video Status**: `GET /api/v1/veo/record-info`
|
|
1821
|
-
- **Veo3 1080p Upgrade**: `GET /api/v1/veo/get-1080p-video`
|
|
1822
|
-
- **Nano Banana Generation**: `POST /api/v1/jobs/createTask`
|
|
1823
|
-
- **Nano Banana Edit**: `POST /api/v1/jobs/createTask`
|
|
1824
|
-
- **Nano Banana Upscale**: `POST /api/v1/jobs/createTask`
|
|
1825
|
-
- **Nano Banana Status**: `GET /api/v1/jobs/recordInfo`
|
|
1826
|
-
- **Suno Music Generation**: `POST /api/v1/generate`
|
|
1827
|
-
- **Suno Music Status**: `GET /api/v1/generate?taskId=XXX`
|
|
1828
|
-
- **ElevenLabs TTS Generation**: `POST /api/v1/jobs/createTask`
|
|
1829
|
-
- **ElevenLabs TTS Status**: `GET /api/v1/jobs/recordInfo`
|
|
1830
|
-
- **ElevenLabs Sound Effects**: `POST /api/v1/jobs/createTask`
|
|
1831
|
-
- **ElevenLabs Sound Effects Status**: `GET /api/v1/jobs/recordInfo`
|
|
1832
|
-
- **ByteDance Seedance Video**: `POST /api/v1/jobs/createTask`
|
|
1833
|
-
- **ByteDance Seedance Status**: `GET /api/v1/jobs/recordInfo`
|
|
1834
|
-
- **ByteDance Seedream Image**: `POST /api/v1/jobs/createTask`
|
|
1835
|
-
- **ByteDance Seedream Status**: `GET /api/v1/jobs/recordInfo`
|
|
1836
|
-
- **Qwen Image Generation**: `POST /api/v1/jobs/createTask`
|
|
1837
|
-
- **Qwen Image Status**: `GET /api/v1/jobs/recordInfo`
|
|
1838
|
-
- **Runway Aleph Video**: `POST /api/v1/jobs/createTask`
|
|
1839
|
-
- **Runway Aleph Status**: `GET /api/v1/jobs/recordInfo`
|
|
1840
|
-
- **Midjourney Generation**: `POST /api/v1/jobs/createTask`
|
|
1841
|
-
- **Midjourney Status**: `GET /api/v1/jobs/recordInfo`
|
|
1842
|
-
- **Wan Video Generation**: `POST /api/v1/jobs/createTask`
|
|
1843
|
-
- **Wan Video Status**: `GET /api/v1/jobs/recordInfo`
|
|
1844
|
-
- **OpenAI 4o Image Generation**: `POST /api/v1/jobs/createTask`
|
|
1845
|
-
- **OpenAI 4o Image Status**: `GET /api/v1/jobs/recordInfo`
|
|
1846
|
-
- **Flux Kontext Image**: `POST /api/v1/jobs/createTask`
|
|
1847
|
-
- **Flux Kontext Status**: `GET /api/v1/jobs/recordInfo`
|
|
1848
|
-
- **Recraft Remove Background**: `POST /api/v1/jobs/createTask`
|
|
1849
|
-
- **Recraft Remove Background Status**: `GET /api/v1/jobs/recordInfo`
|
|
1850
|
-
- **Ideogram V3 Reframe**: `POST /api/v1/jobs/createTask`
|
|
1851
|
-
- **Ideogram V3 Reframe Status**: `GET /api/v1/jobs/recordInfo`
|
|
1852
|
-
|
|
1853
|
-
All endpoints follow official Kie.ai API documentation.
|
|
1854
|
-
|
|
1855
|
-
## Database Schema
|
|
1856
|
-
|
|
1857
|
-
The server uses SQLite to track tasks:
|
|
1858
|
-
|
|
1859
|
-
```sql
|
|
1860
|
-
CREATE TABLE tasks (
|
|
1861
|
-
id INTEGER PRIMARY KEY AUTOINCREMENT,
|
|
1862
|
-
task_id TEXT UNIQUE NOT NULL,
|
|
1863
|
-
api_type TEXT NOT NULL, -- 'nano-banana', 'nano-banana-edit', 'nano-banana-upscale', 'veo3', 'suno', 'elevenlabs-tts', 'elevenlabs-sound-effects', 'bytedance-seedance-video', 'bytedance-seedream-image', 'qwen-image', 'runway-aleph-video', 'midjourney-generate', 'wan-video', 'kling-v2-1-pro', 'kling-v2-5-turbo-text-to-video', 'kling-v2-5-turbo-image-to-video', 'openai-4o-image', 'flux-kontext-image', 'recraft-remove-background', 'ideogram-reframe'
|
|
1864
|
-
status TEXT DEFAULT 'pending',
|
|
1865
|
-
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
|
1866
|
-
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
|
1867
|
-
result_url TEXT,
|
|
1868
|
-
error_message TEXT
|
|
1869
|
-
);
|
|
1870
|
-
```
|
|
324
|
+
**→ [See 100+ more examples in tool documentation](docs/TOOLS.md)**
|
|
1871
325
|
|
|
1872
326
|
## Database & Task Management
|
|
1873
327
|
|
|
1874
|
-
The server includes a built-in SQLite database for persistent task tracking
|
|
1875
|
-
|
|
1876
|
-
### **Database Features**
|
|
328
|
+
The server includes a built-in SQLite database for persistent task tracking:
|
|
1877
329
|
|
|
1878
330
|
- **🔄 Persistent Storage**: Tasks survive server restarts
|
|
1879
|
-
- **📊 Complete History**: Track all generation tasks and their results
|
|
1880
|
-
- **⚡ Smart Caching**: Local database reduces API calls
|
|
1881
|
-
- **🔍 Full Audit Trail**: Complete lifecycle tracking
|
|
331
|
+
- **📊 Complete History**: Track all generation tasks and their results
|
|
332
|
+
- **⚡ Smart Caching**: Local database reduces API calls
|
|
333
|
+
- **🔍 Full Audit Trail**: Complete lifecycle tracking
|
|
1882
334
|
- **🎯 Intelligent Routing**: Database provides api_type for correct endpoint selection
|
|
1883
335
|
|
|
1884
|
-
###
|
|
1885
|
-
|
|
1886
|
-
```
|
|
1887
|
-
1. Task Created → INSERT (status: 'pending')
|
|
1888
|
-
2. API Processing → UPDATE (status: 'processing')
|
|
1889
|
-
3. API Complete → UPDATE (status: 'completed', result_url: '...')
|
|
1890
|
-
4. API Failed → UPDATE (status: 'failed', error_message: '...')
|
|
1891
|
-
```
|
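The four lifecycle steps listed above can be sketched as a small state update in TypeScript. This is an illustrative sketch only: the field names follow the `tasks` schema shown in this README, but `createTask`, `applyEvent`, and the `TaskEvent` shape are hypothetical names, not functions from the package.

```typescript
// Status values as listed in the README's task lifecycle.
type TaskStatus = "pending" | "processing" | "completed" | "failed";

// Subset of the `tasks` table columns relevant to the lifecycle.
interface TaskRecord {
  task_id: string;
  api_type: string;
  status: TaskStatus;
  result_url: string | null;
  error_message: string | null;
}

// Hypothetical event shape describing what the remote API reported.
type TaskEvent =
  | { kind: "processing" }
  | { kind: "completed"; resultUrl: string }
  | { kind: "failed"; errorMessage: string };

// Step 1: a new task starts out as 'pending' (the INSERT).
function createTask(taskId: string, apiType: string): TaskRecord {
  return {
    task_id: taskId,
    api_type: apiType,
    status: "pending",
    result_url: null,
    error_message: null,
  };
}

// Steps 2-4: each event maps to one UPDATE of the record.
function applyEvent(task: TaskRecord, event: TaskEvent): TaskRecord {
  switch (event.kind) {
    case "processing":
      return { ...task, status: "processing" };
    case "completed":
      return { ...task, status: "completed", result_url: event.resultUrl };
    case "failed":
      return { ...task, status: "failed", error_message: event.errorMessage };
  }
}
```

The sketch returns a new record on every event instead of mutating in place, which keeps each step easy to mirror as a single SQL `UPDATE`.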
|
1892
|
-
|
|
1893
|
-
### **Available Task Management Tools**
|
|
1894
|
-
|
|
1895
|
-
#### **1. `list_tasks`**
|
|
1896
|
-
List all tasks in the database with optional limit.
|
|
336
|
+
### Quick Examples
|
|
1897
337
|
|
|
338
|
+
**List recent tasks:**
|
|
1898
339
|
```json
|
|
1899
340
|
{
|
|
1900
|
-
"
|
|
1901
|
-
|
|
1902
|
-
|
|
1903
|
-
|
|
1904
|
-
|
|
1905
|
-
```json
|
|
1906
|
-
{
|
|
1907
|
-
"tasks": [
|
|
1908
|
-
{
|
|
1909
|
-
"id": 1,
|
|
1910
|
-
"task_id": "281e5b0*********************f39b9",
|
|
1911
|
-
"api_type": "veo3",
|
|
1912
|
-
"status": "completed",
|
|
1913
|
-
"created_at": "2025-01-14T10:30:00.000Z",
|
|
1914
|
-
"updated_at": "2025-01-14T10:35:00.000Z",
|
|
1915
|
-
"result_url": "https://file.aiquickdraw.com/custom-page/akr/video.mp4",
|
|
1916
|
-
"error_message": null
|
|
1917
|
-
}
|
|
1918
|
-
]
|
|
1919
|
-
}
|
|
1920
|
-
```
|
|
1921
|
-
|
|
1922
|
-
#### **2. `get_task_status`**
|
|
1923
|
-
Get detailed status of a specific task, combining local database with live API data.
|
|
1924
|
-
|
|
1925
|
-
```json
|
|
1926
|
-
{
|
|
1927
|
-
"task_id": "281e5b0*********************f39b9"
|
|
341
|
+
"tool": "list_tasks",
|
|
342
|
+
"arguments": {
|
|
343
|
+
"limit": 20,
|
|
344
|
+
"status": "completed"
|
|
345
|
+
}
|
|
1928
346
|
}
|
|
1929
347
|
```
|
|
1930
348
|
|
|
1931
|
-
**
|
|
349
|
+
**Check task status:**
|
|
1932
350
|
```json
|
|
1933
351
|
{
|
|
1934
|
-
"
|
|
1935
|
-
"
|
|
1936
|
-
|
|
1937
|
-
"local_status": "completed",
|
|
1938
|
-
"api_status": "success",
|
|
1939
|
-
"created_at": "2025-01-14T10:30:00.000Z",
|
|
1940
|
-
"updated_at": "2025-01-14T10:35:00.000Z",
|
|
1941
|
-
"result_url": "https://file.aiquickdraw.com/custom-page/akr/video.mp4",
|
|
1942
|
-
"api_data": {
|
|
1943
|
-
"state": "success",
|
|
1944
|
-
"resultJson": "{\"resultUrls\":[\"https://file.aiquickdraw.com/custom-page/akr/video.mp4\"]}",
|
|
1945
|
-
"costTime": 180000,
|
|
1946
|
-
"completeTime": 1757584164490
|
|
352
|
+
"tool": "get_task_status",
|
|
353
|
+
"arguments": {
|
|
354
|
+
"task_id": "281e5b0*********************f39b9"
|
|
1947
355
|
}
|
|
1948
356
|
}
|
|
1949
357
|
```
|
|
1950
358
|
|
|
1951
|
-
|
|
1952
|
-
|
|
1953
|
-
#### **Environment Variables**
|
|
1954
|
-
```bash
|
|
1955
|
-
# Custom database file location (optional)
|
|
1956
|
-
KIE_AI_DB_PATH=./custom_tasks.db
|
|
1957
|
-
|
|
1958
|
-
# Default: ./tasks.db in current working directory
|
|
1959
|
-
```
|
|
1960
|
-
|
|
1961
|
-
#### **Database Behavior**
|
|
1962
|
-
- **Auto-initialization**: Creates tables and indexes on first run
|
|
1963
|
-
- **Indexing**: Optimized queries on `task_id` and `status` fields
|
|
1964
|
-
- **Thread-safe**: Uses SQLite serialization for concurrent access
|
|
1965
|
-
- **Persistent**: Data survives server restarts
|
|
1966
|
-
- **Inspectable**: Can be opened with any SQLite client tool
|
|
1967
|
-
|
|
1968
|
-
### **Smart Status Checking**
|
|
1969
|
-
|
|
1970
|
-
The `get_task_status` tool uses intelligent routing:
|
|
1971
|
-
|
|
1972
|
-
1. **Query Local Database**: Fast lookup of task metadata
|
|
1973
|
-
2. **API Status Check**: Calls appropriate endpoint based on `api_type`
|
|
1974
|
-
3. **Database Update**: Stores latest status from API response
|
|
1975
|
-
4. **Combined Response**: Merges local and API data for complete picture
|
|
1976
|
-
|
|
1977
|
-
### **API Type Routing**
|
|
1978
|
-
|
|
1979
|
-
The database `api_type` field determines which Kie.ai endpoint to query:
|
|
1980
|
-
|
|
1981
|
-
| api_type | Endpoint | Purpose |
|
|
1982
|
-
|----------|----------|---------|
|
|
1983
|
-
| `veo3` | `/veo/record-info` | Veo3 video generation |
|
|
1984
|
-
| `nano-banana` | `/jobs/recordInfo` | Image generation |
|
|
1985
|
-
| `nano-banana-edit` | `/jobs/recordInfo` | Image editing |
|
|
1986
|
-
| `nano-banana-upscale` | `/jobs/recordInfo` | Image upscaling |
|
|
1987
|
-
| `suno` | `/generate/record-info` | Music generation |
|
|
1988
|
-
| `elevenlabs-tts` | `/jobs/recordInfo` | Text-to-speech |
|
|
1989
|
-
| `elevenlabs-sound-effects` | `/jobs/recordInfo` | Sound effects |
|
|
1990
|
-
| `bytedance-seedance-video` | `/jobs/recordInfo` | Video generation |
|
|
1991
|
-
| `bytedance-seedream-image` | `/jobs/recordInfo` | Image generation/editing |
|
|
1992
|
-
| `qwen-image` | `/jobs/recordInfo` | Image generation/editing |
|
|
1993
|
-
| `runway-aleph-video` | `/jobs/recordInfo` | Video-to-video transformation |
|
|
1994
|
-
| `midjourney-generate` | `/jobs/recordInfo` | Image/video generation |
|
|
1995
|
-
| `wan-video` | `/jobs/recordInfo` | Video generation |
|
|
1996
|
-
| `kling-v2-1-pro` | `/jobs/recordInfo` | Video generation (start+end frames) |
|
|
1997
|
-
| `kling-v2-5-turbo-text-to-video` | `/jobs/recordInfo` | Video generation (text-to-video) |
|
|
1998
|
-
| `kling-v2-5-turbo-image-to-video` | `/jobs/recordInfo` | Video generation (image-to-video) |
|
|
1999
|
-
| `openai-4o-image` | `/jobs/recordInfo` | Image generation/editing/variants |
|
|
2000
|
-
| `flux-kontext-image` | `/jobs/recordInfo` | Image generation/editing |
|
|
2001
|
-
| `recraft-remove-background` | `/jobs/recordInfo` | Background removal |
|
|
2002
|
-
| `ideogram-reframe` | `/jobs/recordInfo` | Image reframing |
|
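The routing table above amounts to a lookup from `api_type` to a status endpoint. A minimal TypeScript sketch follows, assuming the paths exactly as written in the table. The names `SPECIAL_ENDPOINTS`, `JOBS_API_TYPES`, and `statusEndpointFor` are hypothetical and are not taken from the package source.

```typescript
// api_types with a dedicated status endpoint, per the table above.
const SPECIAL_ENDPOINTS = new Map<string, string>([
  ["veo3", "/veo/record-info"],
  ["suno", "/generate/record-info"],
]);

// api_types the table routes to the shared jobs endpoint.
const JOBS_API_TYPES = new Set<string>([
  "nano-banana",
  "nano-banana-edit",
  "nano-banana-upscale",
  "elevenlabs-tts",
  "elevenlabs-sound-effects",
  "bytedance-seedance-video",
  "bytedance-seedream-image",
  "qwen-image",
  "runway-aleph-video",
  "midjourney-generate",
  "wan-video",
  "kling-v2-1-pro",
  "kling-v2-5-turbo-text-to-video",
  "kling-v2-5-turbo-image-to-video",
  "openai-4o-image",
  "flux-kontext-image",
  "recraft-remove-background",
  "ideogram-reframe",
]);

// Resolve the status endpoint for a stored api_type.
// Throwing on unknown values is an assumption of this sketch.
function statusEndpointFor(apiType: string): string {
  const special = SPECIAL_ENDPOINTS.get(apiType);
  if (special !== undefined) {
    return special;
  }
  if (JOBS_API_TYPES.has(apiType)) {
    return "/jobs/recordInfo";
  }
  throw new Error(`Unknown api_type: ${apiType}`);
}
```

Rejecting unknown values rather than defaulting to `/jobs/recordInfo` is a design choice of this sketch: it surfaces a mistyped or newly added `api_type` early instead of silently querying the wrong endpoint.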
|
2003
|
-
|
|
2004
|
-
### **Task Status Values**
|
|
2005
|
-
|
|
2006
|
-
- **`pending`**: Task created, waiting for API processing
|
|
2007
|
-
- **`processing`**: API is actively processing the task
|
|
2008
|
-
- **`completed`**: Task finished successfully, result available
|
|
2009
|
-
- **`failed`**: Task failed, error message available
|
|
2010
|
-
|
|
2011
|
-
### **Best Practices**
|
|
2012
|
-
|
|
2013
|
-
- **Use `list_tasks`** to get overview of all generation activity
|
|
2014
|
-
- **Use `get_task_status`** for detailed progress tracking
|
|
2015
|
-
- **Monitor `updated_at`** to see when status last changed
|
|
2016
|
-
- **Check `error_message`** for failed tasks to debug issues
|
|
2017
|
-
- **Use `result_url`** to access completed generation results
|
|
2018
|
-
|
|
2019
|
-
## Usage Examples
|
|
2020
|
-
|
|
2021
|
-
### Basic Image Generation
|
|
2022
|
-
```bash
|
|
2023
|
-
# Generate an image
|
|
2024
|
-
curl -X POST http://localhost:3000/tools/call \
|
|
2025
|
-
-H "Content-Type: application/json" \
|
|
2026
|
-
-d '{
|
|
2027
|
-
"name": "nano_banana_generate",
|
|
2028
|
-
"arguments": {
|
|
2029
|
-
"prompt": "A cat wearing a space helmet"
|
|
2030
|
-
}
|
|
2031
|
-
}'
|
|
2032
|
-
```
|
|
2033
|
-
|
|
2034
|
-
### Video Generation with Options
|
|
2035
|
-
```bash
|
|
2036
|
-
# Generate a video
|
|
2037
|
-
curl -X POST http://localhost:3000/tools/call \
|
|
2038
|
-
-H "Content-Type: application/json" \
|
|
2039
|
-
-d '{
|
|
2040
|
-
"name": "veo3_generate_video",
|
|
2041
|
-
"arguments": {
|
|
2042
|
-
"prompt": "A peaceful garden with blooming flowers",
|
|
2043
|
-
"aspectRatio": "16:9",
|
|
2044
|
-
"model": "veo3_fast"
|
|
2045
|
-
}
|
|
2046
|
-
}'
|
|
2047
|
-
```
|
|
359
|
+
**→ [See complete database documentation](docs/DATABASE.md)** including schema, lifecycle, and best practices
|
|
2048
360
|
|
|
2049
361
|
## Real-World Use Cases
|
|
2050
362
|
|
|
2051
|
-
|
|
363
|
+
<details>
|
|
364
|
+
<summary><strong>🎬 Content Creation Agencies (click to expand)</strong></summary>
|
|
365
|
+
|
|
2052
366
|
```bash
|
|
2053
367
|
# Generate social media video content
|
|
2054
|
-
|
|
368
|
+
sora_video: "A trendy coffee shop with latte art, cinematic lighting"
|
|
2055
369
|
|
|
2056
370
|
# Create product photography
|
|
2057
371
|
nano_banana_image: "Luxury watch on marble surface, professional product shot"
|
|
@@ -2059,11 +373,14 @@ nano_banana_image: "Luxury watch on marble surface, professional product shot"
|
|
|
2059
373
|
# Add background music
|
|
2060
374
|
suno_generate_music: "Upbeat corporate background music, 2 minutes"
|
|
2061
375
|
```
|
|
376
|
+
</details>
|
|
377
|
+
|
|
378
|
+
<details>
|
|
379
|
+
<summary><strong>🎮 Game Development Studios (click to expand)</strong></summary>
|
|
2062
380
|
|
|
2063
|
-
### 🎮 **Game Development Studios**
|
|
2064
381
|
```bash
|
|
2065
382
|
# Generate game assets
|
|
2066
|
-
|
|
383
|
+
bytedance_seedream_image: "Fantasy sword with glowing runes, game asset style"
|
|
2067
384
|
|
|
2068
385
|
# Create character voiceovers
|
|
2069
386
|
elevenlabs_tts: "Welcome, brave adventurer! Your quest begins now."
|
|
@@ -2071,11 +388,14 @@ elevenlabs_tts: "Welcome, brave adventurer! Your quest begins now."
|
|
|
2071
388
|
# Design sound effects
|
|
2072
389
|
elevenlabs_ttsfx: "Magical spell casting with sparkles and energy"
|
|
2073
390
|
```
|
|
391
|
+
</details>
|
|
392
|
+
|
|
393
|
+
<details>
|
|
394
|
+
<summary><strong>📱 Mobile App Developers (click to expand)</strong></summary>
|
|
2074
395
|
|
|
2075
|
-
### 📱 **Mobile App Developers**
|
|
2076
396
|
```bash
|
|
2077
397
|
# Generate app icons and illustrations
|
|
2078
|
-
|
|
398
|
+
flux_kontext_image: "Modern minimalist app icon for fitness tracker"
|
|
2079
399
|
|
|
2080
400
|
# Create tutorial videos
|
|
2081
401
|
bytedance_seedance_video: "Screen recording showing app features, clean interface"
|
|
@@ -2083,64 +403,40 @@ bytedance_seedance_video: "Screen recording showing app features, clean interfac
|
|
|
2083
403
|
# Add narration
|
|
2084
404
|
elevenlabs_tts: "Tap here to get started with your new profile"
|
|
2085
405
|
```
|
|
406
|
+
</details>
|
|
407
|
+
|
|
408
|
+
<details>
|
|
409
|
+
<summary><strong>🏢 Enterprise Applications (click to expand)</strong></summary>
|
|
2086
410
|
|
|
2087
|
-
### 🏢 **Enterprise Applications**
|
|
2088
411
|
```bash
|
|
2089
412
|
# Generate training materials
|
|
2090
413
|
veo3_generate_video: "Professional office environment, employee training scenario"
|
|
2091
414
|
|
|
2092
415
|
# Create corporate presentations
|
|
2093
|
-
|
|
2094
|
-
"prompt": "Add company logo to presentation slide, maintain professional style",
|
|
2095
|
-
"image_urls": ["https://example.com/slide.jpg"]
|
|
2096
|
-
}
|
|
416
|
+
openai_4o_image: "Add company logo to presentation slide, maintain professional style"
|
|
2097
417
|
|
|
2098
418
|
# Produce marketing content
|
|
2099
419
|
suno_generate_music: "Corporate background music for promotional video"
|
|
2100
420
|
```
|
|
2101
|
-
|
|
2102
|
-
### 🎨 **Creative Professionals**
|
|
2103
|
-
```bash
|
|
2104
|
-
# Artistic projects
|
|
2105
|
-
bytedance_seedance_video: "Abstract art coming to life, vibrant colors flowing"
|
|
2106
|
-
|
|
2107
|
-
# Photography enhancement
|
|
2108
|
-
nano_banana_image: {
|
|
2109
|
-
"image": "https://example.com/portrait.jpg",
|
|
2110
|
-
"scale": 4,
|
|
2111
|
-
"face_enhance": true
|
|
2112
|
-
}
|
|
2113
|
-
|
|
2114
|
-
# Audio production
|
|
2115
|
-
elevenlabs_sound_effects: "Nature soundscape with birds and gentle wind"
|
|
2116
|
-
```
|
|
2117
|
-
|
|
2118
|
-
## Success Stories
|
|
2119
|
-
|
|
2120
|
-
### 🚀 **Startup Reduces AI Costs**
|
|
2121
|
-
*"Switched from multiple AI services to Kie.ai and cut our monthly AI budget from $2,000 to $600. The unified API simplified our codebase."* - CTO, Content Startup
|
|
2122
|
-
|
|
2123
|
-
### ⚡ **Agency Speeds Up Delivery**
|
|
2124
|
-
*"Our video production timeline went from 2 weeks to 3 days using Veo 3. Clients like the quality and we handle more projects."* - Creative Director, Marketing Agency
|
|
2125
|
-
|
|
2126
|
-
### 🎵 **Music Producer Scales Work**
|
|
2127
|
-
*"Suno API lets us generate custom background music for client videos in minutes instead of days. It improved our workflow."* - Producer, Video Production Company
|
|
421
|
+
</details>
|
|
2128
422
|
|
|
2129
423
|
## Error Handling
|
|
2130
424
|
|
|
2131
425
|
The server handles these HTTP error codes from Kie.ai:
|
|
2132
426
|
|
|
2133
|
-
|
|
2134
|
-
|
|
2135
|
-
|
|
2136
|
-
|
|
2137
|
-
|
|
2138
|
-
|
|
2139
|
-
|
|
2140
|
-
|
|
2141
|
-
|
|
2142
|
-
|
|
2143
|
-
|
|
427
|
+
| Code | Meaning |
|
|
428
|
+
|------|---------|
|
|
429
|
+
| **200** | Success |
|
|
430
|
+
| **400** | Content policy violation / English prompts only |
|
|
431
|
+
| **401** | Unauthorized (invalid API key) |
|
|
432
|
+
| **402** | Insufficient credits |
|
|
433
|
+
| **404** | Resource not found |
|
|
434
|
+
| **422** | Validation error / record is null |
|
|
435
|
+
| **429** | Rate limited |
|
|
436
|
+
| **451** | Image access limits |
|
|
437
|
+
| **455** | Service maintenance |
|
|
438
|
+
| **500** | Server error / timeout |
|
|
439
|
+
| **501** | Generation failed |
|
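A client could turn the table above into a lookup when reporting errors. The TypeScript sketch below copies the codes and meanings from the table. The `describeStatus` helper and the choice of which codes to treat as retryable (429, 455, 500) are assumptions made for illustration, not documented behavior of the server or of Kie.ai.

```typescript
// Codes and meanings copied from the error table above.
const KIE_STATUS_MEANINGS = new Map<number, string>([
  [200, "Success"],
  [400, "Content policy violation / English prompts only"],
  [401, "Unauthorized (invalid API key)"],
  [402, "Insufficient credits"],
  [404, "Resource not found"],
  [422, "Validation error / record is null"],
  [429, "Rate limited"],
  [451, "Image access limits"],
  [455, "Service maintenance"],
  [500, "Server error / timeout"],
  [501, "Generation failed"],
]);

// Assumption: these conditions are transient and worth retrying later.
const RETRYABLE_CODES = new Set<number>([429, 455, 500]);

interface StatusInfo {
  code: number;
  meaning: string;
  ok: boolean;
  retryable: boolean;
}

// Describe a status code; codes outside the table get a generic label.
function describeStatus(code: number): StatusInfo {
  return {
    code,
    meaning: KIE_STATUS_MEANINGS.get(code) ?? "Unknown status code",
    ok: code === 200,
    retryable: RETRYABLE_CODES.has(code),
  };
}
```

For example, `describeStatus(402)` yields the meaning "Insufficient credits" with `retryable` set to `false`, so a caller can prompt for a top-up instead of retrying.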
|
2144
440
|
|
|
2145
441
|
## Development
|
|
2146
442
|
|
|
@@ -2175,6 +471,8 @@ See https://kie.ai/billing for detailed pricing.
|
|
|
2175
471
|
4. **Monitoring**: Monitor task status and handle failed generations appropriately
|
|
2176
472
|
5. **Storage**: Consider automatic cleanup of old task records
|
|
2177
473
|
|
|
474
|
+
**→ [See complete administrator guide](docs/ADMIN.md)** for deployment best practices
|
|
475
|
+
|
|
2178
476
|
## Troubleshooting
|
|
2179
477
|
|
|
2180
478
|
### Common Issues
|
|
@@ -2201,21 +499,19 @@ For issues related to:
|
|
|
2201
499
|
|
|
2202
500
|
## 🚀 Start Building with Kie.ai
|
|
2203
501
|
|
|
2204
|
-
|
|
2205
|
-
|
|
2206
|
-
### 🎯 **Get Started**
|
|
502
|
+
### 🎯 Get Started
|
|
2207
503
|
1. **Get your free API key** at [kie.ai/api-key](https://kie.ai/api-key)
|
|
2208
504
|
2. **Install the MCP server**: `npm install @felores/kie-ai-mcp-server`
|
|
2209
505
|
3. **Generate your first AI content** in minutes
|
|
2210
506
|
|
|
2211
|
-
### 💡
|
|
507
|
+
### 💡 Benefits
|
|
2212
508
|
- ✅ **Free trial** - Test models before paying
|
|
2213
509
|
- ✅ **30-50% lower pricing** than competitors
|
|
2214
510
|
- ✅ **99.9% uptime** guarantee
|
|
2215
511
|
- ✅ **24/7 human support**
|
|
2216
512
|
- ✅ **Simple integration**
|
|
2217
513
|
|
|
2218
|
-
### 🌟
|
|
514
|
+
### 🌟 AI Content Generation
|
|
2219
515
|
Kie.ai provides access to advanced AI models at competitive pricing.
|
|
2220
516
|
|
|
2221
517
|
**Start your project today.** 🚀
|
|
@@ -2236,4 +532,4 @@ MIT License - see LICENSE file for details.
|
|
|
2236
532
|
|
|
2237
533
|
## Changelog
|
|
2238
534
|
|
|
2239
|
-
See [CHANGELOG.md](CHANGELOG.md) for detailed version history and release notes.
|
|
535
|
+
See [CHANGELOG.md](CHANGELOG.md) for detailed version history and release notes.
|