@felores/kie-ai-mcp-server 2.0.0 → 2.0.2

package/README.md CHANGED
@@ -4,7 +4,14 @@
4
4
 
5
5
  Kie.ai offers **30-50% lower cost** than competitors with 99.9% uptime and 24/7 human support.
6
6
 
7
- ## 🚀 **Quick Start - Add to Your MCP Client**
7
+ ## 📚 Documentation
8
+
9
+ - **[Complete Tool Reference](docs/TOOLS.md)** - Detailed documentation for all 21 AI tools
10
+ - **[Database & Task Management](docs/DATABASE.md)** - SQLite database and task lifecycle
11
+ - **[Administrator Configuration](docs/ADMIN.md)** - Deployment guides and environment setup
12
+ - **[Intelligent Features](docs/INTELLIGENCE.md)** - Smart mode detection and cost optimization
13
+
14
+ ## 🚀 Quick Start - Add to Your MCP Client
8
15
 
9
16
  The easiest way to use this server is to add it to your MCP client configuration:
10
17
 
@@ -39,7 +46,8 @@ The easiest way to use this server is to add it to your MCP client configuration
39
46
  | **Support** | 24/7 Human | Email + Discord | 24/7 AI |
40
47
  | **Free Trial** | Yes | Limited | Limited |
41
48
 
42
- ### 🚀 **All AI Models in One API**
49
+ ### 🚀 All AI Models in One API
50
+
43
51
  - **Google Veo 3**: Cinematic video generation with synchronized audio and 1080p output
44
52
  - **OpenAI Sora 2**: Advanced video generation with text/image/storyboard modes (unified)
45
53
  - **Runway Aleph**: Advanced video editing with object removal and style transfer
@@ -53,51 +61,19 @@ The easiest way to use this server is to add it to your MCP client configuration
53
61
  - **Flux Kontext**: Professional image generation and editing with advanced features (unified)
54
62
  - **Alibaba Wan 2.5**: High-quality video generation with text-to-video and image-to-video (unified)
55
63
  - **Hailuo 02**: Professional video generation with text-to-video and image-to-video modes (unified, standard/pro quality)
64
+ - **Kling Video**: Multi-tier video generation with v2.1-pro control and v2.5-turbo speed
56
65
  - **Midjourney AI**: Industry-leading image and video generation with multiple modes (unified)
57
- - **Recraft Remove Background**: Professional AI-powered background removal with clean edge detection
58
- - **Ideogram V3 Reframe**: Intelligent image reframing and aspect ratio conversion with content-aware adaptation
59
-
60
- ### 💰 **Affordable Pricing**
61
- Pay-as-you-go credit system means you only pay for what you use. Good for startups and enterprises looking to reduce AI costs.
62
-
63
- ### ⚡ **Fast & Reliable**
64
- - **99.9% uptime**
65
- - **25.2s average response time**
66
- - Low latency for applications
67
- - High concurrency support
68
-
69
- ### 🔒 **Secure**
70
- Your data is protected with encryption. We prioritize privacy and do not expose your information.
66
+ - **Recraft Remove Background**: Professional AI-powered background removal
67
+ - **Ideogram V3 Reframe**: Intelligent image reframing and aspect ratio conversion
71
68
 
72
69
  ## What You Can Build
73
70
 
74
- ### 🎬 **Video Generation**
75
- Generate videos from text or images. Use for:
76
- - Social media content
77
- - Marketing materials
78
- - Product demonstrations
79
- - Creative projects
80
-
81
- ### 🎨 **Image Generation**
82
- Create images, edit existing ones, and upscale with AI. Use for:
83
- - Content creation
84
- - Product photography
85
- - Artistic projects
86
- - Design mockups
87
-
88
- ### 🎵 **Music Generation**
89
- Generate music tracks with vocals. Use for:
90
- - Background music for videos
91
- - Podcast intros/outros
92
- - Game soundtracks
93
- - Commercial projects
94
-
95
- ### 🎤 **Audio Generation**
96
- Voiceovers and sound effects. Use for:
97
- - Narration and voiceovers
98
- - Podcast production
99
- - Game audio
100
- - Accessibility features
71
+ | Category | Use Cases |
72
+ |----------|-----------|
73
+ | **🎬 Video Generation** | Social media content, marketing materials, product demonstrations, creative projects |
74
+ | **🎨 Image Generation** | Content creation, product photography, artistic projects, design mockups |
75
+ | **🎵 Music Generation** | Background music for videos, podcast intros/outros, game soundtracks, commercial projects |
76
+ | **🎤 Audio Generation** | Narration and voiceovers, podcast production, game audio, accessibility features |
101
77
 
102
78
  ## MCP Features
103
79
 
@@ -105,17 +81,11 @@ Voiceovers and sound effects. Use for:
105
81
 
106
82
  Trigger specialized AI agents with simple commands in your MCP client:
107
83
 
108
- - **`/artist`** - Image generation and editing agent
109
- - Automatically loads full artist workflow instructions
110
- - Handles text-to-image, image editing, upscaling, background removal
111
- - Intelligently selects the best model for your request
112
- - Just describe what you want: _"/artist create a logo for a coffee shop"_
84
+ - **`/artist`** - Image generation and editing agent
85
+ Just describe what you want: _"/artist create a logo for a coffee shop"_
113
86
 
114
- - **`/filmographer`** - Video generation agent
115
- - Automatically loads full video workflow instructions
116
- - Handles text-to-video and image-to-video generation
117
- - Optimizes quality vs cost based on your keywords
118
- - Just describe what you want: _"/filmographer create a 10-second sunset video"_
87
+ - **`/filmographer`** - Video generation agent
88
+ Just describe what you want: _"/filmographer create a 10-second sunset video"_
119
89
 
120
90
  ### 📚 Knowledge Resources
121
91
 
@@ -125,60 +95,36 @@ Your AI assistant can research and learn about available models before using the
125
95
  - `kie://agents/artist` - Complete image generation workflow
126
96
  - `kie://agents/filmographer` - Complete video generation workflow
127
97
 
128
- **Model Documentation (12+ models):**
98
+ **Model Documentation (33+ models):**
129
99
  - `kie://models/bytedance-seedream` - 4K image generation
130
100
  - `kie://models/veo3` - Premium cinematic video
131
101
  - `kie://models/qwen-image` - Fast image processing
132
102
  - `kie://models/flux-kontext` - Professional image generation
133
- - ...and 8 more models
103
+ - ...and 29 more models
134
104
 
135
105
  **Comparison Guides:**
136
106
  - `kie://guides/image-models-comparison` - Feature matrix for all image models
137
107
  - `kie://guides/video-models-comparison` - Feature matrix for all video models
138
108
  - `kie://guides/quality-optimization` - Cost/quality strategies
139
109
 
140
- **Operational Resources:**
141
- - `kie://tasks/active` - Real-time task monitoring
142
- - `kie://stats/usage` - Usage statistics
143
-
144
110
  ### 🛠️ 21 Unified AI Tools
145
111
 
146
112
  All tools feature **smart mode detection** - one tool does multiple things:
147
113
 
148
- **Image Tools (7):**
149
- - `bytedance_seedream_image` - Generate OR edit images (detects mode automatically)
150
- - `qwen_image` - Generate OR edit images with acceleration
151
- - `nano_banana_image` - Generate OR edit OR upscale images
152
- - `flux_kontext_image` - Generate OR edit with advanced controls
153
- - `openai_4o_image` - Generate OR edit OR create variants
154
- - `recraft_remove_background` - Professional background removal
155
- - `ideogram_reframe` - Intelligent aspect ratio conversion
156
-
157
- **Video Tools (8):**
158
- - `veo3_generate_video` - Premium cinematic video (text OR image input)
159
- - `sora_video` - OpenAI's advanced video model (text/image/storyboard modes, standard/pro)
160
- - `bytedance_seedance_video` - Professional video (text OR image input, lite OR pro)
161
- - `wan_video` - Fast social media video (text OR image input)
162
- - `hailuo_video` - Professional video (text-to-video OR image-to-video, standard OR pro quality)
163
- - `kling_video` - High-quality video (text, image-to-video, OR v2.1-pro with start+end frames)
164
- - `runway_aleph_video` - Video-to-video transformation
165
- - `midjourney_generate` - Images AND videos with multiple modes
166
-
167
- **Audio Tools (3):**
168
- - `suno_generate_music` - Professional music with vocals
169
- - `elevenlabs_tts` - Studio-quality text-to-speech
170
- - `elevenlabs_ttsfx` - AI-powered sound effects
171
-
172
- **Utility Tools (3):**
173
- - `list_tasks` - View all generation tasks
174
- - `get_task_status` - Check task progress
175
- - `veo3_get_1080p_video` - Upgrade to 1080p (Veo3 only)
114
+ | Category | Tools |
115
+ |----------|-------|
116
+ | **Image (7)** | `bytedance_seedream_image`, `qwen_image`, `nano_banana_image`, `flux_kontext_image`, `openai_4o_image`, `recraft_remove_background`, `ideogram_reframe` |
117
+ | **Video (8)** | `veo3_generate_video`, `sora_video`, `bytedance_seedance_video`, `wan_video`, `hailuo_video`, `kling_video`, `runway_aleph_video`, `midjourney_generate` |
118
+ | **Audio (3)** | `suno_generate_music`, `elevenlabs_tts`, `elevenlabs_ttsfx` |
119
+ | **Utility (3)** | `list_tasks`, `get_task_status`, `veo3_get_1080p_video` |
120
+
121
+ **→ [See complete tool documentation](docs/TOOLS.md)**
176
122
 
177
123
  ## Key Features
178
124
 
179
125
  - **🎯 One API Key**: Access all models with one credential
180
126
  - **🤖 AI Agent Prompts**: Slash commands trigger specialized workflows
181
- - **📖 Knowledge Base**: 19 resources for model research and comparison
127
+ - **📖 Knowledge Base**: 33+ resources for model research and comparison
182
128
  - **🔄 Task Management**: Built-in SQLite database for tracking generations
183
129
  - **📱 Smart Routing**: Automatic endpoint detection and status monitoring
184
130
  - **🛡️ Error Handling**: Validation and error recovery
@@ -191,195 +137,30 @@ All tools feature **smart mode detection** - one tool does multiple things:
191
137
 
192
138
  The MCP server features advanced **intention detection algorithms** that automatically understand user requirements and optimize both cost and quality without manual configuration.
193
139
 
194
- ### **🎯 Quality & Cost Optimization**
195
-
196
- #### **Automatic Quality Detection**
197
- The system analyzes user language to determine quality requirements:
198
-
199
- ```typescript
200
- // Source: src/kie-ai-client.ts:224-232
201
- const quality = request.quality || 'lite';
202
- let model: string;
203
- if (isImageToVideo) {
204
- model = quality === 'pro' ? 'bytedance/v1-pro-image-to-video' : 'bytedance/v1-lite-image-to-video';
205
- } else {
206
- model = quality === 'pro' ? 'bytedance/v1-pro-text-to-video' : 'bytedance/v1-lite-text-to-video';
207
- }
208
- ```
209
-
210
- **User Language → System Action**:
211
- - `"high quality"`, `"professional"`, `"premium"` → Pro models + 1080p
212
- - `"fast"`, `"quick"`, `"social media"` → Lite models + 720p
213
- - No quality mentioned → Lite models + 720p (cost-effective default)
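
The keyword mapping above can be sketched as a small lookup. This is a minimal illustration only: `detectQuality` and `PRO_KEYWORDS` are hypothetical names, and the README does not show how the server (or the calling AI assistant) actually matches user language. The "fast" / "quick" / "social media" keywords need no explicit handling here because they map to the same lite + 720p result as the default.

```typescript
// Hypothetical sketch of the "User Language -> System Action" mapping.
// Names and matching strategy are assumptions, not the package's real code.
type Quality = "lite" | "pro";
type Resolution = "720p" | "1080p";

interface QualityChoice {
  quality: Quality;
  resolution: Resolution;
}

// Phrases the README lists as triggering pro models + 1080p.
const PRO_KEYWORDS: string[] = ["high quality", "professional", "premium"];

function detectQuality(userText: string): QualityChoice {
  const lower = userText.toLowerCase();
  const wantsPro = PRO_KEYWORDS.some((keyword) => lower.indexOf(keyword) !== -1);
  // Anything else falls through to the cost-effective default.
  return wantsPro
    ? { quality: "pro", resolution: "1080p" }
    : { quality: "lite", resolution: "720p" };
}
```

For example, `detectQuality("Make a quick social media video")` yields the lite + 720p default, while a request containing "high quality" yields pro + 1080p.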
214
-
215
- #### **Dynamic Endpoint Routing**
216
- Quality parameters automatically map to optimal endpoints:
217
-
218
- | Quality Parameter | Text-to-Video Endpoint | Image-to-Video Endpoint |
219
- |------------------|----------------------|-----------------------|
220
- | `"lite"` | `bytedance/v1-lite-text-to-video` | `bytedance/v1-lite-image-to-video` |
221
- | `"pro"` | `bytedance/v1-pro-text-to-video` | `bytedance/v1-pro-image-to-video` |
222
-
223
- ### **🔧 Unified Tool Architecture**
224
-
225
- #### **Smart Mode Detection**
226
- Single tools automatically detect operation mode based on parameter combinations:
227
-
228
- ```typescript
229
- // Source: src/types.ts:146-166 (Nano Banana example)
230
- .refine((data) => {
231
- const hasPrompt = !!data.prompt;
232
- const hasImage = !!data.image_urls;
233
- const hasMask = !!data.mask_url;
234
-
235
- if (hasImage && hasMask) return hasPrompt; // Edit mode
236
- else if (hasImage) return true; // Variants mode
237
- else return hasPrompt; // Generate mode
238
- });
239
- ```
240
-
241
- **Unified Tools with Auto-Detection**:
242
- - **`nano_banana_image`**: Generate/Edit/Upscale based on parameters
243
- - **`bytedance_seedance_video`**: Text-to-video vs Image-to-video based on `image_url` presence
244
- - **`openai_4o_image`**: Generate/Edit/Variants based on `filesUrl` and `maskUrl` combination
245
- - **`qwen_image`**: Text-to-image vs Image editing based on `image_url` presence
246
-
247
- ### **📊 Intelligent Task Management**
248
-
249
- #### **Smart Status Routing**
250
- The system automatically routes status checks to correct API endpoints based on task type:
251
-
252
- ```typescript
253
- // Source: src/index.ts:1155-1175
254
- switch (task.api_type) {
255
- case 'veo3':
256
- return this.makeRequest(`/veo/record-info?taskId=${taskId}`, 'GET');
257
- case 'suno':
258
- return this.makeRequest(`/generate/record-info?taskId=${taskId}`, 'GET');
259
- case 'bytedance-seedance-video':
260
- case 'midjourney-generate':
261
- return this.makeRequest(`/jobs/recordInfo?taskId=${taskId}`, 'GET');
262
- }
263
- ```
140
+ ### Quick Summary
264
141
 
265
- #### **Database-Driven Intelligence**
266
- Local SQLite database provides intelligent caching and routing:
267
-
268
- ```sql
269
- -- Source: README.md database schema
270
- CREATE TABLE tasks (
271
- task_id TEXT UNIQUE NOT NULL,
272
- api_type TEXT NOT NULL, -- Enables intelligent endpoint routing
273
- status TEXT DEFAULT 'pending',
274
- result_url TEXT,
275
- -- ... other fields
276
- );
277
- ```
142
+ - **Automatic Quality Detection**: Analyzes user language ("high quality" → pro models, "quick" → lite models)
143
+ - **Smart Mode Detection**: Single tools auto-detect operation mode (generate/edit/upscale) based on parameters
144
+ - **Database-Driven Intelligence**: Local SQLite cache reduces API calls and provides smart routing
145
+ - **Cost Control by Design**: Defaults to cheapest options (720p, lite quality) unless explicitly requested
278
146
 
279
- ### **💡 Cost Control by Design**
147
+ **Example**: User says _"Make a quick social media video"_ → System automatically chooses: lite quality + 720p + 5 second duration = lowest cost tier (1x baseline)
280
148
 
281
- #### **Default to Savings**
282
- - **Resolution**: Defaults to `"720p"` (API defaults to 1080p - explicit setting prevents cost overruns)
283
- - **Quality**: Defaults to `"lite"` (2-3x cheaper than pro versions)
284
- - **Models**: Defaults to faster variants unless premium quality requested
149
+ **Example**: User says _"I need a high quality video for a client presentation"_ → System automatically chooses: pro quality + 1080p = highest cost tier (4-6x baseline)
285
150
 
286
- #### **Explicit Upgrade Required**
287
- Users must explicitly request higher quality:
288
- - `"high quality"` → Automatic upgrade to pro models + 1080p
289
- - `"high quality in 720p"` → Pro models + cost-effective resolution
290
- - `"professional"` → Pro models + balanced resolution
151
+ **→ [See complete intelligence documentation](docs/INTELLIGENCE.md)** with real-world examples and verifiable code references
291
152
 
292
- ### **🔍 Verifiable Intelligence**
153
+ ## Installation & Configuration
293
154
 
294
- All intelligent behaviors are implemented in the codebase:
295
- - **Quality Detection**: `src/kie-ai-client.ts:224-232`
296
- - **Mode Detection**: `src/types.ts:146-166` (multiple examples)
297
- - **Endpoint Routing**: `src/index.ts:1155-1175`
298
- - **Schema Validation**: `src/types.ts` (all tool schemas)
299
- - **Database Integration**: `src/database.ts` + `src/index.ts`
155
+ <details>
156
+ <summary><strong>📦 Installation Options (click to expand)</strong></summary>
300
157
 
301
- This system ensures **optimal user experience** while maintaining **cost control** and **technical accuracy** - users get what they want without needing to understand the underlying complexity.
302
-
303
- ### **🚀 Real-World Intelligence Examples**
304
-
305
- #### **Example 1: Video Generation Request**
306
- ```
307
- User: "Make a quick social media video of a sunset"
308
- ```
309
- **System Automatically Chooses**:
310
- - **Tool**: `bytedance_seedance_video` (default video model)
311
- - **Quality**: `"lite"` (detected "quick" → cost-effective)
312
- - **Resolution**: `"720p"` (default for cost control)
313
- - **Endpoint**: `bytedance/v1-lite-text-to-video`
314
- - **Duration**: `"5"` (optimal for social media)
315
-
316
- #### **Example 2: Professional Quality Request**
317
- ```
318
- User: "I need a high quality video for a client presentation"
319
- ```
320
- **System Automatically Chooses**:
321
- - **Tool**: `bytedance_seedance_video` (default video model)
322
- - **Quality**: `"pro"` (detected "high quality" → premium)
323
- - **Resolution**: `"1080p"` (high quality default)
324
- - **Endpoint**: `bytedance/v1-pro-text-to-video`
325
- - **Duration**: `"5"` (professional standard)
326
-
327
- #### **Example 3: Specific Quality Requirements**
328
- ```
329
- User: "Generate a professional video but keep it 720p to save costs"
330
- ```
331
- **System Automatically Chooses**:
332
- - **Tool**: `bytedance_seedance_video`
333
- - **Quality**: `"pro"` (detected "professional" → premium)
334
- - **Resolution**: `"720p"` (explicitly requested)
335
- - **Endpoint**: `bytedance/v1-pro-text-to-video`
336
- - **Cost**: ~2x lite model but 50% less than 1080p
337
-
338
- #### **Example 4: Unified Tool Intelligence**
339
- ```json
340
- // User provides image + prompt
341
- {
342
- "tool": "nano_banana_image",
343
- "arguments": {
344
- "prompt": "Add sunglasses to the person",
345
- "image_urls": ["https://example.com/portrait.jpg"]
346
- }
347
- }
348
- ```
349
- **System Automatically Detects**: **Edit Mode** (prompt + image_urls)
350
- **Routes to**: `/jobs/createTask` with edit-specific parameters
351
-
352
- #### **Example 5: Smart Status Monitoring**
353
- ```json
354
- // User checks task status
355
- {
356
- "tool": "get_task_status",
357
- "arguments": {
358
- "task_id": "abc123"
359
- }
360
- }
361
- ```
362
- **System Automatically**:
363
- 1. **Queries database**: Gets `api_type: "bytedance-seedance-video"`
364
- 2. **Routes to**: `/jobs/recordInfo?taskId=abc123` (correct endpoint)
365
- 3. **Updates local record**: Syncs API response with database
366
- 4. **Returns combined data**: Local + API information
367
-
368
- ## Quick Start
369
-
370
- ### 🎯 Get Your Free API Key
371
- 1. Visit [Kie.ai API Key](https://kie.ai/api-key) to get your free API key
372
- 2. **Try any model for free** in the AI Playground before committing
373
- 3. Choose the flexible pricing plan that fits your needs
374
-
375
- ### 📦 Installation
376
-
377
- #### Option 1: Install from NPM (Recommended)
158
+ ### Option 1: Install from NPM (Recommended)
378
159
  ```bash
379
160
  npm install -g @felores/kie-ai-mcp-server
380
161
  ```
381
162
 
382
- #### Option 2: Install from Source
163
+ ### Option 2: Install from Source
383
164
  ```bash
384
165
  # Clone the repository
385
166
  git clone https://github.com/felores/kie-ai-mcp-server.git
@@ -391,57 +172,73 @@ npm install
391
172
  # Build the project
392
173
  npm run build
393
174
  ```
175
+ </details>
394
176
 
395
- ### ⚙️ Configuration
177
+ <details>
178
+ <summary><strong>⚙️ Environment Variables (click to expand)</strong></summary>
396
179
 
397
- Create your environment file:
180
+ ### Required
398
181
  ```bash
399
- # Required: Your API key from https://kie.ai/api-key
400
- export KIE_AI_API_KEY="your_api_key_here"
401
-
402
- # Optional: Custom settings
403
- export KIE_AI_BASE_URL="https://api.kie.ai/api/v1" # Default
404
- export KIE_AI_TIMEOUT="60000" # Default: 60 seconds
405
- export KIE_AI_DB_PATH="./tasks.db" # Default: local database
406
- export KIE_AI_CALLBACK_URL="https://your-domain.com/webhook" # Optional: Custom callback
407
- export KIE_AI_CALLBACK_URL_FALLBACK="https://your-proxy.com/callback" # Optional: Admin fallback
182
+ export KIE_AI_API_KEY="your-api-key-here" # Get from https://kie.ai/api-key
408
183
  ```
409
184
 
410
- ### 🚀 Start Generating
411
- You're ready to create amazing AI content! The server will automatically:
412
- - Track all your generations in a local database
413
- - Handle task status and completion notifications
414
- - Route requests to the optimal AI models
415
- - Provide detailed error messages and guidance
416
-
417
- ## Configuration
418
-
419
- ### Environment Variables
420
-
185
+ ### Optional
421
186
  ```bash
422
- # Required
423
- export KIE_AI_API_KEY="your-api-key-here"
424
-
425
- # Optional
426
- export KIE_AI_BASE_URL="https://api.kie.ai/api/v1" # Default
427
- export KIE_AI_TIMEOUT="60000" # Default: 60 seconds
428
- export KIE_AI_DB_PATH="./tasks.db" # Default: ./tasks.db
429
- export KIE_AI_CALLBACK_URL="https://your-domain.com/api/callback" # Optional: Custom callback
430
- export KIE_AI_CALLBACK_URL_FALLBACK="https://your-proxy.com/callback" # Optional: Admin fallback
187
+ export KIE_AI_BASE_URL="https://api.kie.ai/api/v1" # Default API base URL
188
+ export KIE_AI_TIMEOUT="60000" # Request timeout (ms)
189
+ export KIE_AI_DB_PATH="./tasks.db" # Database file location
190
+ export KIE_AI_CALLBACK_URL="https://your-domain.com/webhook" # Custom callback
191
+ export KIE_AI_CALLBACK_URL_FALLBACK="https://your-proxy.com/callback" # Admin fallback
431
192
  ```
432
193
 
433
- ### **Callback URL Priority:**
194
+ ### Callback URL Priority
434
195
 
435
196
  | Priority | Source | Variable | Use Case |
436
197
  |----------|--------|----------|----------|
437
198
  | 1 | User Parameter | `callBackUrl` | Per-request override |
438
199
  | 2 | Environment | `KIE_AI_CALLBACK_URL` | User's custom callback |
439
- | 3 | Admin Fallback | `KIE_AI_CALLBACK_URL_FALLBACK` | ⭐ **Deployment-wide default** |
200
+ | 3 | Admin Fallback | `KIE_AI_CALLBACK_URL_FALLBACK` | ⭐ Deployment-wide default |
440
201
  | 4 | Hardcoded | - | `https://proxy.kie.ai/mcp-callback` |
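
The priority table above amounts to a first-non-empty lookup. The sketch below illustrates that order under stated assumptions: `resolveCallbackUrl` is a hypothetical helper (not a documented function of the package), the environment is passed in as a plain object, and empty strings are treated as unset.

```typescript
// Hypothetical sketch of the callback URL priority order from the table.
type Env = Record<string, string | undefined>;

// Priority 4: the hardcoded default named in the README.
const HARDCODED_CALLBACK = "https://proxy.kie.ai/mcp-callback";

function resolveCallbackUrl(requestCallBackUrl: string | undefined, env: Env): string {
  const candidates: Array<string | undefined> = [
    requestCallBackUrl,                  // 1. per-request `callBackUrl` parameter
    env["KIE_AI_CALLBACK_URL"],          // 2. user's custom callback
    env["KIE_AI_CALLBACK_URL_FALLBACK"], // 3. deployment-wide admin fallback
  ];
  for (const candidate of candidates) {
    // Assumption of this sketch: blank values count as "not set".
    if (candidate && candidate.trim() !== "") {
      return candidate;
    }
  }
  return HARDCODED_CALLBACK;
}
```

With no parameter and no environment variables set, the function returns the hardcoded proxy URL, which matches the zero-configuration behavior the README describes.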
441
202
 
442
- ### MCP Configuration
203
+ **→ [See administrator configuration guide](docs/ADMIN.md)** for Docker, Kubernetes, Systemd examples
204
+ </details>
205
+
206
+ ### Tool Filtering (v2.0.2+)
207
+
208
+ **Filter which AI tools are available** to reduce cognitive load and focus your workflow:
209
+
210
+ ```bash
211
+ # Whitelist: Enable only specific tools (highest priority)
212
+ # Note: Utility tools (list_tasks, get_task_status) are always included automatically
213
+ export KIE_AI_ENABLED_TOOLS="nano_banana_image,veo3_generate_video,suno_generate_music"
214
+
215
+ # Category filter: Enable by category (medium priority)
216
+ export KIE_AI_TOOL_CATEGORIES="image,video" # Categories: image, video, audio
217
+
218
+ # Blacklist: Disable specific tools (lowest priority)
219
+ # Note: Utility tools cannot be disabled
220
+ export KIE_AI_DISABLED_TOOLS="midjourney_generate,runway_aleph_video"
221
+ ```
222
+
223
+ **Priority Logic**: `ENABLED_TOOLS` > `TOOL_CATEGORIES` > `DISABLED_TOOLS` > All tools (default)
224
+
225
+ **Tool Categories**:
226
+ - **image** (8): nano_banana, seedream, qwen, openai_4o, flux, recraft, ideogram, midjourney*
227
+ - **video** (9): veo3, veo3_1080p, sora, seedance, wan, hailuo, kling, runway, midjourney*
228
+ - **audio** (3): suno, elevenlabs_tts, elevenlabs_ttsfx
229
+ - **utility** (2): list_tasks, get_task_status ⭐ **Always enabled**
230
+
231
+ _* midjourney appears in both image and video categories (supports both)_
232
+ - ⭐ **Utility tools are always enabled** for server monitoring and task management
233
+ - When using whitelist mode, utility tools are automatically added to your selection
234
+ - When using blacklist mode, utility tools cannot be disabled (warning shown if attempted)
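
The filtering rules above can be sketched as follows. This is an illustration, not the server's implementation: `selectTools` is a hypothetical name, the category lists map the short names above onto the full tool names from the tools table (an assumption), and the sketch interprets the priority logic as "only the highest-priority variable that is set takes effect".

```typescript
// Hypothetical sketch of tool filtering: whitelist > categories > blacklist > all.
type Env = Record<string, string | undefined>;

// Utility tools are always enabled, per the README.
const UTILITY_TOOLS: string[] = ["list_tasks", "get_task_status"];

// Assumed mapping of categories to full tool names.
const TOOLS_BY_CATEGORY: Record<string, string[]> = {
  image: [
    "nano_banana_image", "bytedance_seedream_image", "qwen_image", "openai_4o_image",
    "flux_kontext_image", "recraft_remove_background", "ideogram_reframe", "midjourney_generate",
  ],
  video: [
    "veo3_generate_video", "veo3_get_1080p_video", "sora_video", "bytedance_seedance_video",
    "wan_video", "hailuo_video", "kling_video", "runway_aleph_video", "midjourney_generate",
  ],
  audio: ["suno_generate_music", "elevenlabs_tts", "elevenlabs_ttsfx"],
};

// Split a comma-separated variable into trimmed, non-empty names.
function parseList(value: string | undefined): string[] {
  return (value || "").split(",").map((item) => item.trim()).filter((item) => item.length > 0);
}

// Drop duplicates while keeping first-seen order.
function unique(items: string[]): string[] {
  return items.filter((item, index) => items.indexOf(item) === index);
}

function selectTools(env: Env): string[] {
  const enabled = parseList(env["KIE_AI_ENABLED_TOOLS"]);
  const categories = parseList(env["KIE_AI_TOOL_CATEGORIES"]);
  const disabled = parseList(env["KIE_AI_DISABLED_TOOLS"]);

  let known: string[] = [];
  for (const name of Object.keys(TOOLS_BY_CATEGORY)) {
    known = known.concat(TOOLS_BY_CATEGORY[name] || []);
  }
  known = unique(known);

  let selected: string[];
  if (enabled.length > 0) {
    // Whitelist: keep only recognized tool names.
    selected = enabled.filter((tool) => known.indexOf(tool) !== -1);
  } else if (categories.length > 0) {
    // Category filter: union of the requested categories.
    selected = [];
    for (const category of categories) {
      selected = selected.concat(TOOLS_BY_CATEGORY[category] || []);
    }
  } else {
    // Blacklist, or all tools when nothing is disabled.
    selected = known.filter((tool) => disabled.indexOf(tool) === -1);
  }
  // Utility tools are appended last, so they can never be filtered out.
  return unique(selected.concat(UTILITY_TOOLS));
}
```

Appending the utility tools after filtering is one simple way to guarantee they survive both whitelist and blacklist modes; the real server may additionally log a warning when a utility tool is listed in `KIE_AI_DISABLED_TOOLS`.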
235
+
236
+ <details>
237
+ <summary><strong>🔧 MCP Client Configuration (click to expand)</strong></summary>
443
238
 
444
- Add to your Claude Desktop or MCP client configuration:
239
+ ### Claude Desktop, Cursor, Windsurf, VS Code, etc.
240
+
241
+ Add to your MCP client configuration file:
445
242
 
446
243
  ```json
447
244
  {
@@ -455,7 +252,7 @@ Add to your Claude Desktop or MCP client configuration:
455
252
  }
456
253
  ```
457
254
 
458
- Or if installed globally:
255
+ Or if installed globally with npx:
459
256
 
460
257
  ```json
461
258
  {
@@ -468,1590 +265,107 @@ Or if installed globally:
468
265
  }
469
266
  }
470
267
  ```
268
+ </details>
471
269
 
472
- ## 🎯 Zero-Configuration Callback URLs
473
-
474
- **New in v1.9.8:** No callback URL setup required! The server automatically handles task completion notifications with intelligent fallback:
475
-
476
- 1. **User Parameter** - If you provide `callBackUrl` in a tool request, it uses that
477
- 2. **Environment Variable** - Uses `KIE_AI_CALLBACK_URL` if set (existing setups keep working)
478
- 3. **Admin Fallback** - Uses `KIE_AI_CALLBACK_URL_FALLBACK` for deployment-wide defaults
479
- 4. **Hardcoded Default** - Falls back to `https://proxy.kie.ai/mcp-callback` automatically
480
-
481
- **For Users:** Just provide your API key - everything else is handled automatically!
482
- **For Administrators:** Set `KIE_AI_CALLBACK_URL_FALLBACK` for custom proxy configurations
483
-
484
- ## 🔧 Administrator Configuration
485
-
486
- ### `KIE_AI_CALLBACK_URL_FALLBACK`
487
-
488
- For system administrators and deployment managers, this environment variable provides organization-wide control over callback URLs:
489
-
490
- ```bash
491
- # Set deployment-wide callback URL
492
- export KIE_AI_CALLBACK_URL_FALLBACK="https://your-proxy.company.com/mcp-callback"
493
- ```
494
-
495
- ### **Use Cases:**
496
-
497
- **1. Corporate Proxy Setup:**
498
- ```bash
499
- # For enterprise deployments behind corporate firewalls
500
- export KIE_AI_CALLBACK_URL_FALLBACK="https://internal-proxy.company.ai/kie-callback"
501
- ```
502
-
503
- **2. Multi-Tenant Services:**
504
- ```bash
505
- # For SaaS platforms managing multiple users
506
- export KIE_AI_CALLBACK_URL_FALLBACK="https://api.yourservice.com/webhooks/kie-ai"
507
- ```
508
-
509
- **3. Development/Staging Environments:**
510
- ```bash
511
- # Separate callbacks for different environments
512
- export KIE_AI_CALLBACK_URL_FALLBACK="https://staging-webhook.yourapp.com/kie"
513
- ```
514
-
515
- ### **Configuration Examples:**
516
-
517
- **Docker Compose:**
518
- ```yaml
519
- services:
520
- kie-ai-mcp:
521
- image: node:18
522
- environment:
523
- - KIE_AI_API_KEY=${API_KEY}
524
- - KIE_AI_CALLBACK_URL_FALLBACK=https://proxy.company.com/webhook
525
- command: npx -y @felores/kie-ai-mcp-server
526
- ```
527
-
528
- **Kubernetes:**
529
- ```yaml
530
- env:
531
- - name: KIE_AI_API_KEY
532
- valueFrom:
533
- secretKeyRef:
534
- name: kie-ai-secrets
535
- key: api-key
536
- - name: KIE_AI_CALLBACK_URL_FALLBACK
537
- value: "https://proxy.company.com/kie-callback"
538
- ```
539
-
540
- **Systemd Service:**
541
- ```ini
542
- [Service]
543
- Environment=KIE_AI_API_KEY=your-api-key
544
- Environment=KIE_AI_CALLBACK_URL_FALLBACK=https://proxy.company.com/webhook
545
- ExecStart=npx -y @felores/kie-ai-mcp-server
546
- ```
547
-
548
- ### **Security Considerations:**
549
-
550
- - **HTTPS Required:** Always use HTTPS URLs for callbacks
551
- - **Authentication:** Ensure your callback endpoint validates requests
552
- - **Rate Limiting:** Implement rate limiting on your callback endpoint
553
- - **Logging:** Log callback requests for debugging and monitoring
554
-
555
- ### **Fallback Behavior:**
556
-
557
- The admin fallback only activates when:
558
- 1. No user-provided `callBackUrl` parameter
559
- 2. No `KIE_AI_CALLBACK_URL` environment variable set
560
-
561
- This ensures user preferences and existing configurations take priority.
562
-
563
- ## Available Tools
564
-
565
- ### 1. `list_tasks`
566
- List recent tasks with their status.
567
-
568
- **Parameters:**
569
- - `limit` (integer, optional): Max tasks to return (default: 20, max: 100)
570
- - `status` (string, optional): Filter by status ("pending", "processing", "completed", "failed")
571
-
572
- **Example:**
573
- ```json
574
- {
575
- "limit": 10,
576
- "status": "completed"
577
- }
578
- ```
579
-
580
- ### 2. `get_task_status`
581
- Check the status of a generation task.
582
-
583
- **Parameters:**
584
- - `task_id` (string, required): Task ID to check
585
-
586
- **Example:**
587
- ```json
588
- {
589
- "task_id": "281e5b0*********************f39b9"
590
- }
591
- ```
592
-
593
- ### 3. `nano_banana_image`
594
- Generate, edit, and upscale images using Google's Gemini 2.5 Flash Image Preview (Nano Banana). This unified tool automatically detects the operation mode based on parameters.
595
-
596
- **Smart Mode Detection:**
597
- - **Generate mode**: Provide `prompt` only
598
- - **Edit mode**: Provide `prompt` + `image_urls`
599
- - **Upscale mode**: Provide `image` (+ optional `scale`)
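
The three modes listed above could be distinguished with a check along these lines. This is a hedged sketch: `detectNanoBananaMode` is a hypothetical function, and the precedence when several parameters are supplied together (here `image` wins) is an assumption that the README does not specify.

```typescript
// Hypothetical sketch of parameter-based mode detection for `nano_banana_image`.
type NanoBananaMode = "generate" | "edit" | "upscale";

interface NanoBananaArgs {
  prompt?: string;
  image_urls?: string[];
  image?: string;
}

function detectNanoBananaMode(args: NanoBananaArgs): NanoBananaMode {
  // Upscale mode: `image` is provided (assumed to take precedence).
  if (args.image) {
    return "upscale";
  }
  // Edit mode: `prompt` plus at least one entry in `image_urls`.
  if (args.prompt && args.image_urls && args.image_urls.length > 0) {
    return "edit";
  }
  // Generate mode: `prompt` only.
  if (args.prompt) {
    return "generate";
  }
  throw new Error("Provide `prompt` (generate/edit) or `image` (upscale)");
}
```

For instance, `{ prompt: "Add a rainbow", image_urls: ["https://example.com/image.jpg"] }` is classified as edit mode, while `{ image: "https://example.com/image.jpg", scale: 4 }` style input is classified as upscale.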
600
-
601
- **Parameters:**
602
- - `prompt` (string, optional): Text description for generate/edit modes (max 5000 chars)
603
- - `image_urls` (array, optional): URLs of images for edit mode (1-10 URLs)
604
- - `image` (string, optional): URL of image for upscale mode (max 10MB, jpeg/png/webp)
605
- - `scale` (integer, optional): Upscale factor for upscale mode, 1-4 (default: 2)
606
- - `face_enhance` (boolean, optional): Enable face enhancement for upscale mode (default: false)
607
- - `output_format` (string, optional): "png" or "jpeg" for generate/edit modes (default: "png")
608
- - `image_size` (string, optional): Aspect ratio for generate/edit modes - "1:1", "9:16", "16:9", "3:4", "4:3", "3:2", "2:3", "5:4", "4:5", "21:9", "auto" (default: "1:1")
609
-
610
- **Examples:**
611
-
612
- *Generate mode:*
613
- ```json
614
- {
615
- "prompt": "A surreal painting of a giant banana floating in space",
616
- "output_format": "png",
617
- "image_size": "16:9"
618
- }
619
- ```
620
-
621
- *Edit mode:*
622
- ```json
623
- {
624
- "prompt": "Add a rainbow arching over the mountains",
625
- "image_urls": ["https://example.com/image.jpg"],
626
- "output_format": "png",
627
- "image_size": "16:9"
628
- }
629
- ```
630
-
631
- *Upscale mode:*
632
- ```json
633
- {
634
- "image": "https://example.com/image.jpg",
635
- "scale": 4,
636
- "face_enhance": true
637
- }
638
- ```
639
-
640
- ### 4. `sora_video`
641
- Generate videos using OpenAI's Sora 2 models (unified tool for text-to-video, image-to-video, and storyboard modes).
642
-
643
- **Parameters:**
644
- - `prompt` (string, optional): Text prompt for video generation (max 4000 chars, required for text-to-video and image-to-video modes)
645
- - `image_url` (string, optional): URL of input image for image-to-video mode (if not provided, uses text-to-video)
646
- - `storyboard_image_url` (string, optional): URL of storyboard image for storyboard mode (if not provided, uses text-to-video)
647
- - `storyboard_prompt` (string, optional): Text prompt for storyboard mode (max 4000 chars, if not provided, uses text-to-video)
648
- - `model` (string, optional): Model version (default: "sora-2")
649
- - Options: `sora-2` (standard), `sora-2-pro` (premium quality)
650
- - `aspect_ratio` (string, optional): Video aspect ratio (default: "16:9")
651
- - Options: `16:9`, `9:16`, `1:1`
652
- - `resolution` (string, optional): Video resolution (default: "720p")
653
- - `480p`: Faster generation
654
- - `720p`: Balanced quality and speed
655
- - `1080p`: Highest quality (pro model only)
656
- - `duration` (string, optional): Video duration in seconds (default: "5")
657
- - Standard: 5-20 seconds
658
- - Pro: 5-20 seconds
659
- - `seed` (integer, optional): Random seed for reproducible results (default: -1 for random)
660
- - `watermark` (string, optional): Watermark text to add to the video (max 100 chars)
661
- - `enable_translation` (boolean, optional): Auto-translate non-English prompts to English (default: true)
662
- - `callBackUrl` (string, optional): URL for task completion notifications
663
-
664
- **Examples:**
665
-
666
- Text-to-video generation:
667
- ```json
668
- {
669
- "prompt": "A serene Japanese garden with cherry blossoms falling gently around a tranquil koi pond. Soft morning light filters through the trees. No dialogue. Peaceful ambient audio with gentle water sounds and bird songs.",
670
- "model": "sora-2",
671
- "aspect_ratio": "16:9",
672
- "resolution": "720p",
673
- "duration": "10",
674
- "seed": 42
675
- }
676
- ```
677
-
678
- Image-to-video generation:
679
- ```json
680
- {
681
- "prompt": "The person in the portrait smiles warmly and looks around, then speaks with enthusiasm: 'Welcome to the future of AI video generation!'",
682
- "image_url": "https://example.com/portrait.jpg",
683
- "model": "sora-2-pro",
684
- "resolution": "1080p",
685
- "duration": "8"
686
- }
687
- ```
688
-
689
- Storyboard mode (`prompt` not required):
690
- ```json
691
- {
692
- "storyboard_image_url": "https://example.com/storyboard-frame.jpg",
693
- "storyboard_prompt": "A cinematic tracking shot through a futuristic city with flying vehicles",
694
- "model": "sora-2-pro",
695
- "aspect_ratio": "16:9",
696
- "resolution": "1080p",
697
- "duration": "15"
698
- }
699
- ```
700
-
701
- **Key Features:**
702
- - **Unified Interface**: Single tool for text-to-video, image-to-video, and storyboard modes
703
- - **Smart Mode Detection**: Automatically detects mode based on provided parameters
704
- - Text-to-Video: `prompt` provided, no `image_url` or `storyboard_image_url`
705
- - Image-to-Video: `prompt` + `image_url` provided
706
- - Storyboard: `storyboard_image_url` provided (prompt optional)
707
- - **Quality Tiers**: Standard for speed, Pro for premium quality
708
- - **Flexible Resolutions**: 480p for speed, 720p for balance, 1080p for maximum quality
709
- - **Aspect Ratio Control**: Support for horizontal, vertical, and square formats
710
- - **Storyboard Mode**: Unique feature for creating videos from storyboard frames without prompts
711
- - **Reproducible Results**: Seed control for consistent output
712
- - **Translation Support**: Automatic translation for non-English prompts
713
-
714
- **Model Selection Logic:**
715
- - If `storyboard_image_url` provided → Storyboard mode
716
- - If `image_url` provided → Image-to-video mode
717
- - If `prompt` provided → Text-to-video mode
718
- - Quality automatically determined by `model` parameter (`sora-2` vs `sora-2-pro`)
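The selection order listed above can be sketched as a small function. This is only an illustration of the documented precedence, not the server's actual code: the names `SoraArgs` and `detectSoraMode` are hypothetical, and throwing when no input is supplied is an assumption.

```typescript
// Hypothetical sketch of the documented mode precedence for `sora_video`.
type SoraMode = "storyboard" | "image-to-video" | "text-to-video";

interface SoraArgs {
  prompt?: string;
  image_url?: string;
  storyboard_image_url?: string;
}

function detectSoraMode(args: SoraArgs): SoraMode {
  // Storyboard wins over everything else, then image input, then plain prompt.
  if (args.storyboard_image_url) return "storyboard";
  if (args.image_url) return "image-to-video";
  if (args.prompt) return "text-to-video";
  // Assumption: with no usable input there is nothing to generate from.
  throw new Error("Provide prompt, image_url, or storyboard_image_url");
}

console.log(detectSoraMode({ prompt: "A serene Japanese garden" }));
```

Checking the most specific input first means a request that carries several inputs resolves to exactly one mode.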
719
-
720
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 2-8 minutes depending on model quality, resolution, and duration.
721
-
722
- ### 5. `veo3_generate_video`
723
- Generate videos using Veo3.
724
-
725
- **Parameters:**
726
- - `prompt` (string, required): Video description
727
- - `imageUrls` (array, optional): Image for image-to-video (max 1)
728
- - `model` (enum, optional): "veo3" or "veo3_fast" (default: "veo3")
729
- - `aspectRatio` (enum, optional): "16:9", "9:16", or "Auto" (default: "16:9", only 16:9 supports 1080P)
730
- - `seeds` (integer, optional): Random seed 10000-99999
731
- - `watermark` (string, optional): Watermark text
732
- - `callBackUrl` (string, optional): Callback URL for completion notifications
733
- - `enableFallback` (boolean, optional): Enable fallback mechanism (default: false, fallback videos cannot use 1080P endpoint)
734
- - `enableTranslation` (boolean, optional): Auto-translate prompts to English (default: true)
735
-
736
- **Example:**
737
- ```json
738
- {
739
- "prompt": "A dog playing in a park",
740
- "model": "veo3",
741
- "aspectRatio": "16:9",
742
- "seeds": 12345,
743
- "enableTranslation": true
744
- }
745
- ```
746
-
747
- ### 6. `veo3_get_1080p_video`
748
- Get 1080P high-definition version of a Veo3 video.
749
-
750
- **Parameters:**
751
- - `task_id` (string, required): Veo3 task ID to get 1080p video for
752
- - `index` (integer, optional): Video index (for multiple video results)
753
-
754
- **Note**: Not available for videos generated with fallback mode.
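The two 1080P constraints mentioned for Veo3 (only `16:9` supports 1080P, and fallback videos cannot use the 1080P endpoint) could be checked before calling `veo3_get_1080p_video`. The sketch below is hypothetical: `Veo3TaskInfo`, its `usedFallback` field, and `canRequest1080p` are illustrative names, not part of the server's API.

```typescript
// Hypothetical pre-check mirroring the documented 1080P constraints.
interface Veo3TaskInfo {
  aspectRatio: "16:9" | "9:16" | "Auto";
  // Assumption: the caller knows whether the video was produced via fallback.
  usedFallback: boolean;
}

function canRequest1080p(task: Veo3TaskInfo): boolean {
  return task.aspectRatio === "16:9" && !task.usedFallback;
}

console.log(canRequest1080p({ aspectRatio: "16:9", usedFallback: false }));
```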
755
-
756
- ### 7. `suno_generate_music`
757
- Generate music with AI using Suno models.
758
-
759
- **Parameters:**
760
- - `prompt` (string, required): Description of desired audio content (max 5000 chars for V4_5+, V5; 3000 for V3_5, V4; 500 chars for non-custom mode)
761
- - `customMode` (boolean, required): Enable advanced parameter customization
762
- - `instrumental` (boolean, required): Generate instrumental music (no lyrics)
763
- - `model` (enum, optional): AI model version - "V3_5", "V4", "V4_5", "V4_5PLUS", or "V5" (default: "V5")
764
- - `callBackUrl` (string, optional): URL to receive task completion updates (automatic fallback if not provided)
765
- - `style` (string, optional): Music style/genre (required in custom mode, max 1000 chars for V4_5+, V5; 200 for V3_5, V4)
766
- - `title` (string, optional): Track title (required in custom mode, max 80 chars)
767
- - `negativeTags` (string, optional): Music styles to exclude (max 200 chars)
768
- - `vocalGender` (enum, optional): Vocal gender preference - "m" or "f" (custom mode only)
769
- - `styleWeight` (number, optional): Style adherence strength (0-1, up to 2 decimal places)
770
- - `weirdnessConstraint` (number, optional): Creative deviation control (0-1, up to 2 decimal places)
771
- - `audioWeight` (number, optional): Audio feature balance (0-1, up to 2 decimal places)
772
-
773
- **Examples:**
774
-
775
- With explicit callback URL:
776
- ```json
777
- {
778
- "prompt": "A calm and relaxing piano track with soft melodies",
779
- "customMode": true,
780
- "instrumental": true,
781
- "model": "V5",
782
- "callBackUrl": "https://api.example.com/callback",
783
- "style": "Classical",
784
- "title": "Peaceful Piano Meditation"
785
- }
786
- ```
787
-
788
- Using automatic callback (no setup required):
789
- ```json
790
- {
791
- "prompt": "A relaxing electronic music track",
792
- "customMode": false,
793
- "instrumental": false
794
- }
795
- ```
796
-
797
- Using explicit model (overrides default V5):
798
- ```json
799
- {
800
- "prompt": "A relaxing electronic music track",
801
- "customMode": false,
802
- "instrumental": false,
803
- "model": "V4_5PLUS"
804
- }
805
- ```
806
-
807
- **Note**: In custom mode, `style` and `title` are required. If `instrumental` is false, `prompt` is used as exact lyrics. The `callBackUrl` is optional and uses automatic fallback if not provided. The `model` parameter defaults to "V5" but can be explicitly set to any available version.
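As a rough illustration of the rules in this section, a client could validate arguments before submitting a request. The sketch assumes "V4_5+" means `V4_5` and `V4_5PLUS`; `SunoArgs`, `maxPromptLength`, and `validateSunoArgs` are hypothetical names, and the server's own validation may differ.

```typescript
// Hypothetical client-side validation of the documented Suno limits.
type SunoModel = "V3_5" | "V4" | "V4_5" | "V4_5PLUS" | "V5";

interface SunoArgs {
  prompt: string;
  customMode: boolean;
  instrumental: boolean;
  model?: SunoModel;
  style?: string;
  title?: string;
}

const isOlderModel = (m: SunoModel): boolean => m === "V3_5" || m === "V4";

function maxPromptLength(customMode: boolean, model: SunoModel): number {
  if (!customMode) return 500; // non-custom mode limit
  return isOlderModel(model) ? 3000 : 5000;
}

function validateSunoArgs(args: SunoArgs): string[] {
  const model: SunoModel = args.model ?? "V5"; // documented default
  const errors: string[] = [];
  const promptLimit = maxPromptLength(args.customMode, model);
  if (args.prompt.length > promptLimit) {
    errors.push(`prompt exceeds ${promptLimit} chars`);
  }
  if (args.customMode) {
    const styleLimit = isOlderModel(model) ? 200 : 1000;
    if (!args.style) errors.push("style is required in custom mode");
    else if (args.style.length > styleLimit) errors.push(`style exceeds ${styleLimit} chars`);
    if (!args.title) errors.push("title is required in custom mode");
    else if (args.title.length > 80) errors.push("title exceeds 80 chars");
  }
  return errors;
}

console.log(validateSunoArgs({ prompt: "A relaxing electronic music track", customMode: false, instrumental: false }));
```

Returning a list of messages rather than throwing lets a caller report every problem at once.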
808
-
809
- ### 8. `elevenlabs_tts`
810
- Generate speech from text using ElevenLabs TTS models (Turbo 2.5 by default, with optional Multilingual v2 support).
811
-
812
- **Parameters:**
813
- - `text` (string, required): The text to convert to speech (max 5000 characters)
814
- - `model` (enum, optional): TTS model to use - "turbo" (faster, default) or "multilingual" (supports context)
815
- - `voice` (enum, optional): Voice to use - "Rachel", "Aria", "Roger", "Sarah", "Laura", "Charlie", "George", "Callum", "River", "Liam", "Charlotte", "Alice", "Matilda", "Will", "Jessica", "Eric", "Chris", "Brian", "Daniel", "Lily", "Bill" (default: "Rachel")
816
- - `stability` (number, optional): Voice stability (0-1, step 0.01, default: 0.5)
817
- - `similarity_boost` (number, optional): Similarity boost (0-1, step 0.01, default: 0.75)
818
- - `style` (number, optional): Style exaggeration (0-1, step 0.01, default: 0)
819
- - `speed` (number, optional): Speech speed (0.7-1.2, step 0.01, default: 1.0)
820
- - `timestamps` (boolean, optional): Whether to return timestamps for each word (default: false)
821
- - `previous_text` (string, optional): Text that came before current request (multilingual model only, max 5000 chars)
822
- - `next_text` (string, optional): Text that comes after current request (multilingual model only, max 5000 chars)
823
- - `language_code` (string, optional): ISO 639-1 language code for language enforcement (turbo model only, max 500 chars)
824
- - `callBackUrl` (string, optional): URL to receive task completion updates (automatic fallback if not provided)
825
-
826
- **Examples:**
827
-
828
- Basic TTS generation (uses Turbo model by default):
829
- ```json
830
- {
831
- "text": "Hello, this is a test of the ElevenLabs text-to-speech system.",
832
- "voice": "Rachel"
833
- }
834
- ```
835
-
836
- Fast generation with language enforcement (Turbo model):
837
- ```json
838
- {
839
- "text": "Bonjour, comment allez-vous?",
840
- "voice": "Rachel",
841
- "model": "turbo",
842
- "language_code": "fr"
843
- }
844
- ```
845
-
846
- Advanced voice controls with context (Multilingual model):
847
- ```json
848
- {
849
- "text": "This is the second part of our conversation.",
850
- "voice": "Roger",
851
- "model": "multilingual",
852
- "stability": 0.8,
853
- "similarity_boost": 0.9,
854
- "previous_text": "This is the first part of our conversation.",
855
- "next_text": "This is the third part of our conversation."
856
- }
857
- ```
858
-
859
- **Model Comparison:**
860
- - **Turbo 2.5** (default): Faster generation (15-60 seconds), supports language enforcement with `language_code`
861
- - **Multilingual v2**: Supports context with `previous_text`/`next_text`, generation takes 30-120 seconds
862
-
863
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Choose Turbo model for speed and language enforcement, or Multilingual model for context-aware speech generation.
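Because `previous_text`/`next_text` are documented as multilingual-only and `language_code` as turbo-only, a caller might check for mismatched parameters up front. This is a minimal sketch under that reading; `TtsArgs` and `checkTtsArgs` are hypothetical names, and how the server treats mismatched parameters is not stated here.

```typescript
// Hypothetical check for model-specific ElevenLabs TTS parameters.
interface TtsArgs {
  text: string;
  model?: "turbo" | "multilingual";
  previous_text?: string;
  next_text?: string;
  language_code?: string;
}

function checkTtsArgs(args: TtsArgs): string[] {
  const model = args.model ?? "turbo"; // documented default
  const problems: string[] = [];
  if (args.text.length > 5000) problems.push("text exceeds 5000 characters");
  if (model === "turbo" && (args.previous_text !== undefined || args.next_text !== undefined)) {
    problems.push("previous_text/next_text apply to the multilingual model only");
  }
  if (model === "multilingual" && args.language_code !== undefined) {
    problems.push("language_code applies to the turbo model only");
  }
  return problems;
}

console.log(checkTtsArgs({ text: "Bonjour, comment allez-vous?", model: "turbo", language_code: "fr" }));
```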
270
+ ## Quick Examples
864
271
 
865
- ### 9. `elevenlabs_ttsfx`
866
- Generate sound effects from text descriptions using ElevenLabs Sound Effects v2 model.
867
-
868
- **Parameters:**
869
- - `text` (string, required): Description of the sound effect to generate (max 5000 chars)
870
- - `loop` (boolean, optional): Whether to create a sound effect that loops smoothly (default: false)
871
- - `duration_seconds` (number, optional): Duration in seconds (0.5-22, step 0.1). If not specified, optimal duration will be determined from prompt
872
- - `prompt_influence` (number, optional): How closely to follow the prompt (0-1, step 0.01, default: 0.3). Higher values mean less variation
873
- - `output_format` (string, optional): Audio output format (default: "mp3_44100_192")
874
- - MP3 options: `mp3_22050_32`, `mp3_44100_32`, `mp3_44100_64`, `mp3_44100_96`, `mp3_44100_128`, `mp3_44100_192`
875
- - PCM options: `pcm_8000`, `pcm_16000`, `pcm_22050`, `pcm_24000`, `pcm_44100`, `pcm_48000`
876
- - Telephony: `ulaw_8000`, `alaw_8000`
877
- - Opus: `opus_48000_32`, `opus_48000_64`, `opus_48000_96`, `opus_48000_128`, `opus_48000_192`
878
- - `callBackUrl` (string, optional): URL for task completion notifications
879
-
880
- **Examples:**
881
-
882
- Basic sound effect:
272
+ ### Generate Image
883
273
  ```json
884
274
  {
885
- "text": "Rain falling on a tin roof"
886
- }
887
- ```
888
-
889
- Advanced sound effect with custom duration:
890
- ```json
891
- {
892
- "text": "Epic thunderstorm with heavy rain and distant thunder",
893
- "duration_seconds": 15.0,
894
- "prompt_influence": 0.8,
895
- "output_format": "mp3_44100_192"
896
- }
897
- ```
898
-
899
- Looping ambient sound:
900
- ```json
901
- {
902
- "text": "Gentle ocean waves lapping at the shore",
903
- "loop": true,
904
- "duration_seconds": 10.0
905
- }
906
- ```
907
-
908
- **Key Features:**
909
- - **High-Quality Audio**: Professional-grade sound effect generation
910
- - **Flexible Duration**: Control exact length from 0.5 to 22 seconds
911
- - **Loop Support**: Create seamless looping sound effects
912
- - **Multiple Formats**: Support for MP3, PCM, Opus, and telephony formats
913
- - **Prompt Control**: Adjust how closely to follow your description
914
-
915
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Sound effects generation typically takes 30-90 seconds depending on complexity.
916
-
917
- ### 10. `bytedance_seedance_video`
918
- Generate videos using ByteDance Seedance models (unified tool for both text-to-video and image-to-video).
919
-
920
- **Parameters:**
921
- - `prompt` (string, required): Text prompt for video generation (max 10000 chars)
922
- - `image_url` (string, optional): URL of input image for image-to-video generation (if not provided, uses text-to-video)
923
- - `quality` (string, optional): Model quality level (default: "lite")
924
- - `lite`: Faster generation with good quality
925
- - `pro`: Higher quality with longer generation time
926
- - `aspect_ratio` (string, optional): Video aspect ratio (default: "16:9")
927
- - Options: `1:1`, `9:16`, `16:9`, `4:3`, `3:4`, `21:9`, `9:21`
928
- - `resolution` (string, optional): Video resolution (default: "720p")
929
- - `480p`: Faster generation
930
- - `720p`: Balanced quality and speed
931
- - `1080p`: Highest quality
932
- - `duration` (string, optional): Video duration in seconds 2-12 (default: "5")
933
- - `camera_fixed` (boolean, optional): Whether to fix camera position (default: false)
934
- - `seed` (integer, optional): Random seed for reproducible results (default: -1 for random)
935
- - `enable_safety_checker` (boolean, optional): Enable content safety checking (default: true)
936
- - `end_image_url` (string, optional): URL of ending image (image-to-video only)
937
- - `callBackUrl` (string, optional): URL for task completion notifications
938
-
939
- **Examples:**
940
-
941
- Text-to-video (lite quality):
942
- ```json
943
- {
944
- "prompt": "A serene sailing boat gently sways in the harbor at dawn, surrounded by soft Impressionist hues of pink and orange",
945
- "quality": "lite",
946
- "aspect_ratio": "16:9",
947
- "duration": "5"
948
- }
949
- ```
950
-
951
- Image-to-video (pro quality):
952
- ```json
953
- {
954
- "prompt": "A golden retriever dashing through shallow surf at the beach, splashes frozen in time",
955
- "image_url": "https://example.com/golden-retriever.jpg",
956
- "quality": "pro",
957
- "resolution": "1080p",
958
- "duration": "6",
959
- "camera_fixed": false
960
- }
961
- ```
962
-
963
- Video with specific ending frame:
964
- ```json
965
- {
966
- "prompt": "A traveler crosses an endless desert toward a glowing archway",
967
- "image_url": "https://example.com/desert-traveler.jpg",
968
- "end_image_url": "https://example.com/archway.jpg",
969
- "quality": "pro",
970
- "duration": "8"
971
- }
972
- ```
973
-
974
- **Key Features:**
975
- - **Unified Interface**: Single tool for both text-to-video and image-to-video
976
- - **Smart Mode Detection**: Automatically detects mode based on presence of `image_url`
977
- - **Quality Options**: Lite for speed, Pro for quality
978
- - **Flexible Aspect Ratios**: Support for vertical, horizontal, and square formats
979
- - **Camera Control**: Option to fix camera position for stable shots
980
- - **Reproducible Results**: Seed control for consistent output
981
- - **Safety Features**: Built-in content safety checking
982
-
983
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 2-5 minutes depending on quality and complexity.
984
-
985
- ### 11. `bytedance_seedream_image`
986
- Generate and edit images using ByteDance Seedream V4 models (unified tool for both text-to-image and image editing).
987
-
988
- **Parameters:**
989
- - `prompt` (string, required): Text prompt for image generation or editing (max 10000 chars)
990
- - `image_urls` (array, optional): Array of image URLs for editing mode (1-10 images, if not provided, uses text-to-image)
991
- - `image_size` (string, optional): Image aspect ratio (default: "1:1")
992
- - Options: `1:1`, `4:3`, `3:4`, `16:9`, `9:16`, `21:9`, `9:21`, `3:2`, `2:3`
993
- - `image_resolution` (string, optional): Image resolution (default: "1K")
994
- - `1K`: Standard resolution (1024px on shortest side)
995
- - `2K`: High resolution (2048px on shortest side)
996
- - `4K`: Ultra high resolution (4096px on shortest side)
997
- - `max_images` (integer, optional): Number of images to generate (1-6, default: 1)
998
- - `seed` (integer, optional): Random seed for reproducible results (default: -1 for random)
999
- - `callBackUrl` (string, optional): URL for task completion notifications
1000
-
1001
- **Examples:**
1002
-
1003
- Text-to-image generation:
1004
- ```json
1005
- {
1006
- "prompt": "A majestic dragon perched atop a crystal mountain at sunset, digital art style",
1007
- "image_size": "16:9",
1008
- "image_resolution": "2K",
1009
- "max_images": 2,
1010
- "seed": 42
1011
- }
1012
- ```
1013
-
1014
- Image editing:
1015
- ```json
1016
- {
1017
- "prompt": "Transform the day scene into a magical night with glowing stars and moonlight",
1018
- "image_urls": ["https://example.com/day-landscape.jpg"],
1019
- "image_size": "16:9",
1020
- "image_resolution": "2K",
1021
- "max_images": 1
1022
- }
1023
- ```
1024
-
1025
- Multiple image editing:
1026
- ```json
1027
- {
1028
- "prompt": "Apply a consistent cyberpunk aesthetic to all images with neon lights and futuristic elements",
1029
- "image_urls": [
1030
- "https://example.com/character1.jpg",
1031
- "https://example.com/character2.jpg",
1032
- "https://example.com/background.jpg"
1033
- ],
1034
- "image_resolution": "4K",
1035
- "max_images": 3
1036
- }
1037
- ```
1038
-
1039
- **Key Features:**
1040
- - **Unified Interface**: Single tool for both text-to-image and image editing
1041
- - **Smart Mode Detection**: Automatically detects mode based on presence of `image_urls`
1042
- - **High Resolution**: Support for 1K, 2K, and 4K output
1043
- - **Multiple Images**: Generate up to 6 images in a single request
1044
- - **Batch Editing**: Edit up to 10 images simultaneously with consistent style
1045
- - **Reproducible Results**: Seed control for consistent output
1046
-
1047
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Image generation typically takes 30-120 seconds depending on resolution and complexity.
1048
-
1049
- ### 12. `qwen_image`
1050
- Generate and edit images using Qwen models (unified tool for both text-to-image and image editing).
1051
-
1052
- **Parameters:**
1053
- - `prompt` (string, required): Text prompt for image generation or editing
1054
- - `image_url` (string, optional): URL of image to edit (if not provided, uses text-to-image)
1055
- - `image_size` (string, optional): Image size (default: "square_hd")
1056
- - Options: `square`, `square_hd`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`
1057
- - `num_inference_steps` (integer, optional): Number of inference steps (default: 30 for text-to-image, 25 for edit)
1058
- - Text-to-image: 2-250, Edit: 2-49
1059
- - `guidance_scale` (number, optional): CFG scale (default: 2.5 for text-to-image, 4 for edit)
1060
- - Range: 0-20
1061
- - `enable_safety_checker` (boolean, optional): Enable safety checker (default: true)
1062
- - `output_format` (string, optional): Output format (default: "png")
1063
- - Options: `png`, `jpeg`
1064
- - `negative_prompt` (string, optional): Negative prompt (max 500 chars, default: " ")
1065
- - `acceleration` (string, optional): Acceleration level (default: "none")
1066
- - Options: `none`, `regular`, `high`
1067
- - `num_images` (string, optional): Number of images (edit mode only)
1068
- - Options: `1`, `2`, `3`, `4`
1069
- - `sync_mode` (boolean, optional): Sync mode (edit mode only, default: false)
1070
- - `seed` (number, optional): Random seed for reproducible results
1071
- - `callBackUrl` (string, optional): URL for task completion notifications
1072
-
1073
- **Examples:**
1074
-
1075
- Text-to-image generation:
1076
- ```json
1077
- {
1078
- "prompt": "A beautiful landscape with mountains and a lake at sunset",
1079
- "image_size": "landscape_16_9",
1080
- "num_inference_steps": 30,
1081
- "guidance_scale": 2.5,
1082
- "output_format": "png",
1083
- "seed": 42
1084
- }
1085
- ```
1086
-
1087
- Image editing:
1088
- ```json
1089
- {
1090
- "prompt": "Change the day scene to night with stars and moonlight",
1091
- "image_url": "https://example.com/day-landscape.jpg",
1092
- "image_size": "landscape_16_9",
1093
- "num_inference_steps": 25,
1094
- "guidance_scale": 4,
1095
- "num_images": "2",
1096
- "output_format": "png"
1097
- }
1098
- ```
1099
-
1100
- High-acceleration generation:
1101
- ```json
1102
- {
1103
- "prompt": "A futuristic city with flying cars",
1104
- "image_size": "square_hd",
1105
- "acceleration": "high",
1106
- "enable_safety_checker": true,
1107
- "negative_prompt": "blurry, low quality"
1108
- }
1109
- ```
1110
-
1111
- **Key Features:**
1112
- - **Unified Interface**: Single tool for both text-to-image and image editing
1113
- - **Smart Mode Detection**: Automatically detects mode based on presence of `image_url`
1114
- - **Flexible Sizing**: Support for multiple aspect ratios and resolutions
1115
- - **Acceleration Options**: Speed up generation with acceleration levels
1116
- - **Batch Generation**: Generate multiple images in edit mode
1117
- - **Reproducible Results**: Seed control for consistent output
1118
-
1119
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Image generation typically takes 10-60 seconds depending on settings and acceleration level.
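The mode-dependent defaults and ranges for `qwen_image` can be summarized in code. The sketch below only restates the numbers given in the parameter list; `QwenArgs` and `resolveQwenSettings` are hypothetical names, and rejecting out-of-range values with an exception is an assumption.

```typescript
// Hypothetical resolution of mode-dependent defaults for `qwen_image`.
interface QwenArgs {
  prompt: string;
  image_url?: string;
  num_inference_steps?: number;
  guidance_scale?: number;
}

interface QwenSettings {
  mode: "edit" | "text-to-image";
  steps: number;
  guidance: number;
}

function resolveQwenSettings(args: QwenArgs): QwenSettings {
  // Mode is detected from the presence of image_url.
  const mode = args.image_url ? "edit" : "text-to-image";
  const steps = args.num_inference_steps ?? (mode === "edit" ? 25 : 30);
  const guidance = args.guidance_scale ?? (mode === "edit" ? 4 : 2.5);
  const maxSteps = mode === "edit" ? 49 : 250;
  if (!Number.isInteger(steps) || steps < 2 || steps > maxSteps) {
    throw new RangeError(`num_inference_steps must be an integer in 2-${maxSteps} for ${mode}`);
  }
  if (guidance < 0 || guidance > 20) {
    throw new RangeError("guidance_scale must be in 0-20");
  }
  return { mode, steps, guidance };
}

console.log(resolveQwenSettings({ prompt: "A futuristic city with flying cars" }));
```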
1120
-
1121
- ### 13. `runway_aleph_video`
1122
- Transform videos using Runway Aleph video-to-video generation with AI-powered editing.
1123
-
1124
- **Parameters:**
1125
- - `prompt` (string, required): Text prompt describing desired video transformation (max 1000 chars)
1126
- - `videoUrl` (string, required): URL of the input video to transform
1127
- - `waterMark` (string, optional): Watermark text to add to the video (max 100 chars, default: "")
1128
- - `uploadCn` (boolean, optional): Whether to upload to China servers (default: false)
1129
- - `aspectRatio` (enum, optional): Output video aspect ratio (default: "16:9")
1130
- - Options: `16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `21:9`
1131
- - `seed` (integer, optional): Random seed for reproducible results (1-999999)
1132
- - `referenceImage` (string, optional): URL of reference image for style guidance
1133
- - `callBackUrl` (string, optional): URL for task completion notifications
1134
-
1135
- **Examples:**
1136
-
1137
- Basic video transformation:
1138
- ```json
1139
- {
1140
- "prompt": "Transform this video into a cinematic anime style with vibrant colors",
1141
- "videoUrl": "https://example.com/input-video.mp4",
1142
- "aspectRatio": "16:9"
1143
- }
1144
- ```
1145
-
1146
- Advanced transformation with reference image:
1147
- ```json
1148
- {
1149
- "prompt": "Apply the artistic style of the reference image to this video",
1150
- "videoUrl": "https://example.com/cooking-video.mp4",
1151
- "referenceImage": "https://example.com/van-gogh-painting.jpg",
1152
- "seed": 123456,
1153
- "waterMark": "My Channel"
1154
- }
1155
- ```
1156
-
1157
- Vertical video for social media:
1158
- ```json
1159
- {
1160
- "prompt": "Convert to a dreamy, ethereal style with soft lighting",
1161
- "videoUrl": "https://example.com/landscape-video.mp4",
1162
- "aspectRatio": "9:16",
1163
- "uploadCn": false
1164
- }
1165
- ```
1166
-
1167
- **Key Features:**
1168
- - **Video-to-Video Transformation**: Transform existing videos with AI-powered editing
1169
- - **Style Transfer**: Apply artistic styles from text prompts or reference images
1170
- - **Aspect Ratio Control**: Convert between horizontal, vertical, and square formats
1171
- - **Reproducible Results**: Seed control for consistent transformations
1172
- - **Watermark Support**: Add custom watermarks to transformed videos
1173
- - **Reference Guidance**: Use reference images to guide the transformation style
1174
-
1175
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video-to-video transformation typically takes 3-8 minutes depending on complexity and length.
1176
-
1177
- ### 14. `midjourney_generate`
1178
- Generate images and videos using Midjourney AI models (unified tool for text-to-image, image-to-image, style reference, omni reference, and video generation).
1179
-
1180
- **Parameters:**
1181
- - `prompt` (string, required): Text prompt describing the desired image or video (max 2000 chars)
1182
- - `taskType` (string, optional): Task type for generation mode (auto-detected if not provided)
1183
- - Options: `mj_txt2img`, `mj_img2img`, `mj_style_reference`, `mj_omni_reference`, `mj_video`, `mj_video_hd`
1184
- - `fileUrl` (string, optional): Single image URL for image-to-image or video generation (legacy - use fileUrls instead)
1185
- - `fileUrls` (array, optional): Array of image URLs for image-to-image or video generation (recommended, max 10)
1186
- - `speed` (string, optional): Generation speed (not required for video/omni tasks)
1187
- - Options: `relaxed`, `fast`, `turbo`
1188
- - `aspectRatio` (string, optional): Output aspect ratio (default: "16:9")
1189
- - Options: `1:2`, `9:16`, `2:3`, `3:4`, `5:6`, `6:5`, `4:3`, `3:2`, `1:1`, `16:9`, `2:1`
1190
- - `version` (string, optional): Midjourney model version (default: "7")
1191
- - Options: `7`, `6.1`, `6`, `5.2`, `5.1`, `niji6`
1192
- - `variety` (integer, optional): Controls diversity of generated results (0-100, increment by 5)
1193
- - `stylization` (integer, optional): Artistic style intensity (0-1000, suggested multiple of 50)
1194
- - `weirdness` (integer, optional): Creativity and uniqueness level (0-3000, suggested multiple of 100)
1195
- - `ow` (integer, optional): Omni intensity parameter for omni reference tasks (1-1000)
1196
- - `waterMark` (string, optional): Watermark identifier (max 100 chars)
1197
- - `enableTranslation` (boolean, optional): Auto-translate non-English prompts to English (default: false)
1198
- - `videoBatchSize` (string, optional): Number of videos to generate (video mode only, default: "1")
1199
- - Options: `1`, `2`, `4`
1200
- - `motion` (string, optional): Motion level for video generation (required for video mode, default: "high")
1201
- - Options: `high`, `low`
1202
- - `high_definition_video` (boolean, optional): Use HD video generation instead of standard definition (default: false)
1203
- - `callBackUrl` (string, optional): URL for task completion notifications
1204
-
1205
- **Examples:**
1206
-
1207
- Text-to-image generation:
1208
- ```json
1209
- {
1210
- "prompt": "A majestic dragon perched atop a crystal mountain at sunset, digital art style",
1211
- "aspectRatio": "16:9",
1212
- "version": "7",
1213
- "speed": "fast",
1214
- "stylization": 500
1215
- }
1216
- ```
1217
-
1218
- Image-to-image generation:
1219
- ```json
1220
- {
1221
- "prompt": "Transform this portrait into a cyberpunk style with neon lights",
1222
- "fileUrls": ["https://example.com/portrait.jpg"],
1223
- "aspectRatio": "1:1",
1224
- "version": "7",
1225
- "variety": 10
1226
- }
1227
- ```
1228
-
1229
- Standard definition video generation (default):
1230
- ```json
1231
- {
1232
- "prompt": "Add gentle movement and atmospheric effects",
1233
- "fileUrls": ["https://example.com/landscape.jpg"],
1234
- "motion": "high",
1235
- "videoBatchSize": "1",
1236
- "aspectRatio": "16:9"
1237
- }
1238
- ```
1239
-
1240
- High definition video generation (explicit):
1241
- ```json
1242
- {
1243
- "prompt": "Create cinematic video with dramatic motion",
1244
- "fileUrls": ["https://example.com/cityscape.jpg"],
1245
- "motion": "high",
1246
- "high_definition_video": true,
1247
- "videoBatchSize": "2",
1248
- "aspectRatio": "16:9"
1249
- }
1250
- ```
1251
-
1252
- Omni reference generation:
1253
- ```json
1254
- {
1255
- "prompt": "Place this character in a fantasy forest setting",
1256
- "fileUrls": ["https://example.com/character.jpg"],
1257
- "ow": 500,
1258
- "aspectRatio": "16:9",
1259
- "version": "7"
1260
- }
1261
- ```
1262
-
1263
- Style reference generation:
1264
- ```json
1265
- {
1266
- "prompt": "Apply this artistic style to a new landscape",
1267
- "fileUrls": ["https://example.com/artistic-style.jpg"],
1268
- "taskType": "mj_style_reference",
1269
- "aspectRatio": "16:9",
1270
- "stylization": 700
1271
- }
1272
- ```
1273
-
1274
- **Key Features:**
1275
- - **Unified Interface**: Single tool for all Midjourney generation modes
1276
- - **Smart Mode Detection**: Automatically detects task type based on parameters
1277
- - **Video Default**: Uses standard definition video by default, HD only when explicitly requested
1278
- - **Multiple Aspect Ratios**: Support for vertical, horizontal, square, and ultra-wide formats
1279
- - **Style Control**: Fine-tune artistic style with stylization, variety, and weirdness parameters
1280
- - **Speed Options**: Choose generation speed based on urgency (relaxed/fast/turbo)
1281
- - **Model Versions**: Access different Midjourney models including niji for anime/illustration
1282
- - **Reference Modes**: Advanced omni and style reference for character and style transfer
1283
- - **Batch Generation**: Generate multiple videos in a single request
1284
-
1285
- **Smart Detection Logic:**
1286
- - If `high_definition_video` is true → `mj_video_hd`
1287
- - If `motion` or `videoBatchSize` present → `mj_video` (`mj_video_hd` only when `high_definition_video` is true)
1288
- - If `ow` present → `mj_omni_reference`
1289
- - If `taskType` is `mj_style_reference` → `mj_style_reference`
1290
- - If `fileUrl`/`fileUrls` present → `mj_img2img`
1291
- - Otherwise → `mj_txt2img`
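Read top to bottom, the detection rules above amount to a first-match chain. The following sketch follows the list literally; `MjArgs` and `detectMjTaskType` are hypothetical names, and how the real server treats an explicitly supplied `taskType` other than `mj_style_reference` is not covered by the list, so it is not handled here.

```typescript
// Hypothetical first-match implementation of the documented detection order.
type MjTaskType =
  | "mj_txt2img"
  | "mj_img2img"
  | "mj_style_reference"
  | "mj_omni_reference"
  | "mj_video"
  | "mj_video_hd";

interface MjArgs {
  taskType?: MjTaskType;
  fileUrl?: string;
  fileUrls?: string[];
  ow?: number;
  motion?: "high" | "low";
  videoBatchSize?: "1" | "2" | "4";
  high_definition_video?: boolean;
}

function detectMjTaskType(a: MjArgs): MjTaskType {
  if (a.high_definition_video === true) return "mj_video_hd";
  if (a.motion !== undefined || a.videoBatchSize !== undefined) return "mj_video";
  if (a.ow !== undefined) return "mj_omni_reference";
  if (a.taskType === "mj_style_reference") return "mj_style_reference";
  if (a.fileUrl || (a.fileUrls && a.fileUrls.length > 0)) return "mj_img2img";
  return "mj_txt2img";
}

console.log(detectMjTaskType({ fileUrls: ["https://example.com/portrait.jpg"] }));
```

Ordering matters: the style-reference example also carries `fileUrls`, so it has to be tested before the generic image-to-image rule.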
1292
-
1293
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Generation times vary: text-to-image (1-3 minutes), image-to-image (2-4 minutes), video generation (3-8 minutes), reference modes (2-5 minutes).
1294
-
1295
- ### 15. `wan_video`
1296
- Generate videos using Alibaba Wan 2.5 models (unified tool for both text-to-video and image-to-video).
1297
-
1298
- **Parameters:**
1299
- - `prompt` (string, required): Text prompt for video generation (max 800 chars)
1300
- - `image_url` (string, optional): URL of input image for image-to-video generation (if not provided, uses text-to-video)
1301
- - `aspect_ratio` (string, optional): Video aspect ratio for text-to-video (default: "16:9")
1302
- - Options: `16:9`, `9:16`, `1:1`
1303
- - `resolution` (string, optional): Video resolution (default: "1080p")
1304
- - `720p`: Faster generation
1305
- - `1080p`: Higher quality
1306
- - `duration` (string, optional): Video duration for image-to-video (default: "5")
1307
- - Options: `5`, `10` seconds
1308
- - `negative_prompt` (string, optional): Negative prompt to describe content to avoid (max 500 chars, default: "")
1309
- - `enable_prompt_expansion` (boolean, optional): Enable prompt rewriting using LLM (default: true)
1310
- - `seed` (integer, optional): Random seed for reproducible results
1311
- - `callBackUrl` (string, optional): URL for task completion notifications
1312
-
1313
- **Examples:**
1314
-
1315
- Text-to-video generation:
1316
- ```json
1317
- {
1318
- "prompt": "A dimly lit jazz bar at night, wooden tables glowing under warm pendant lights. Patrons sip drinks and chat quietly while a three-piece band performs on stage. The saxophone player stands under a spotlight, gleaming instrument reflecting the light. No dialogue. Ambient audio: smooth live jazz music with saxophone and piano, clinking glasses, low murmur of audience conversations.",
1319
- "aspect_ratio": "16:9",
1320
- "resolution": "1080p",
1321
- "enable_prompt_expansion": true,
1322
- "seed": 42
1323
- }
1324
- ```
1325
-
1326
- Image-to-video generation:
1327
- ```json
1328
- {
1329
- "prompt": "The same woman from the reference image looks directly into the camera, takes a breath, then smiles brightly and speaks with enthusiasm: 'Have you heard? Alibaba Wan 2.5 API is now available on Kie.ai!'",
1330
- "image_url": "https://example.com/portrait.jpg",
1331
- "duration": "5",
1332
- "resolution": "1080p",
1333
- "negative_prompt": "blurry, low quality",
1334
- "seed": 123
1335
- }
1336
- ```
1337
-
1338
- **Key Features:**
1339
- - **Unified Interface**: Single tool for both text-to-video and image-to-video
1340
- - **Smart Mode Detection**: Automatically detects mode based on presence of `image_url`
1341
- - **Prompt Expansion**: LLM-powered prompt rewriting for better results with short prompts
1342
- - **Flexible Resolutions**: 720p for speed, 1080p for quality
1343
- - **Aspect Ratio Control**: Support for horizontal, vertical, and square formats (text-to-video)
1344
- - **Duration Control**: 5 or 10 second options for image-to-video
1345
- - **Negative Prompts**: Fine-tune results by specifying what to avoid
1346
- - **Reproducible Results**: Seed control for consistent output
1347
-
1348
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 2-6 minutes depending on resolution and complexity.
1349
-
1350
- ### 16. `hailuo_video`
1351
-
1352
- Generate professional videos using Hailuo 02 models (unified tool for text-to-video and image-to-video with standard/pro quality).
1353
-
1354
- **Parameters:**
1355
- - `prompt` (string, required): Text prompt describing the video content (max 1500 chars)
1356
- - `imageUrl` (string, optional): URL of input image for image-to-video mode (if not provided, uses text-to-video)
1357
- - `endImageUrl` (string, optional): URL of end frame image for image-to-video (optional, requires imageUrl)
1358
- - `quality` (string, optional): Quality level of generation (default: "standard")
1359
- - Options: `standard`, `pro`
1360
- - `duration` (string, optional): Duration of video in seconds - standard quality only (default: "6")
1361
- - Options: `6`, `10`
1362
- - `resolution` (string, optional): Video resolution - standard quality only (default: "768P")
1363
- - Options: `512P`, `768P`
1364
- - `promptOptimizer` (boolean, optional): Enable prompt optimization (default: true)
1365
- - `callBackUrl` (string, optional): URL for task completion notifications
1366
-
1367
- **Examples:**
1368
-
1369
- Text-to-video generation:
1370
- ```json
1371
- {
1372
- "prompt": "A cinematic shot of a futuristic city at night with flying vehicles and holographic billboards. Camera pans across the skyline.",
1373
- "quality": "pro",
1374
- "promptOptimizer": true
1375
- }
1376
- ```
1377
-
1378
- Image-to-video generation (standard quality):
1379
- ```json
1380
- {
1381
- "prompt": "The person in the image stands up and walks towards the window, looking out at the scenic view",
1382
- "imageUrl": "https://example.com/portrait.jpg",
1383
- "quality": "standard",
1384
- "duration": "10",
1385
- "resolution": "768P"
1386
- }
1387
- ```
1388
-
1389
- Image-to-video with end frame:
1390
- ```json
1391
- {
1392
- "prompt": "A smooth transition from the morning scene to sunset over the mountains",
1393
- "imageUrl": "https://example.com/start-frame.jpg",
1394
- "endImageUrl": "https://example.com/end-frame.jpg",
1395
- "quality": "standard"
1396
- }
1397
- ```
1398
-
1399
- **Key Features:**
1400
- - **Two Intelligent Modes**:
1401
- - Text-to-video: Create videos from text descriptions
1402
- - Image-to-video: Animate static images with optional end frame reference
1403
- - **Quality Selection**: Choose between standard (faster) and pro (higher quality) modes
1404
- - **Smart Mode Detection**: Automatically selects the best model based on parameters and quality setting
1405
- - **Standard Quality Options**: Flexible duration (6/10 seconds) and resolution (512P/768P)
1406
- - **Pro Quality**: Optimized for maximum visual fidelity; the `duration` and `resolution` options apply to standard quality only
1407
- - **Prompt Optimization**: AI-powered prompt enhancement for better results
1408
-
1409
- **Model Selection Logic:**
1410
- - If `imageUrl` provided:
1411
- - `quality === 'pro'` → `hailuo/02-image-to-video-pro`
1412
- - Otherwise → `hailuo/02-image-to-video-standard`
1413
- - Otherwise (text-to-video):
1414
- - `quality === 'pro'` → `hailuo/02-text-to-video-pro`
1415
- - Otherwise → `hailuo/02-text-to-video-standard`
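The selection rules above can be sketched as a small pure function. This is a minimal illustration rather than the server's actual code: the `HailuoParams` interface and the `selectHailuoModel` name are hypothetical, and only the parameter names and model identifiers come from the documentation.

```typescript
// Hypothetical input shape; field names follow the documented parameters.
interface HailuoParams {
  prompt: string;
  imageUrl?: string;
  quality?: "standard" | "pro";
}

// Returns the model identifier according to the documented selection rules:
// presence of imageUrl picks the mode, quality === "pro" picks the tier.
function selectHailuoModel(params: HailuoParams): string {
  const mode = params.imageUrl ? "image-to-video" : "text-to-video";
  const tier = params.quality === "pro" ? "pro" : "standard";
  return `hailuo/02-${mode}-${tier}`;
}
```

Because `quality` defaults to `standard`, omitting it resolves to one of the two `-standard` models.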
1416
-
1417
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 1-5 minutes depending on quality setting and complexity.
1418
-
1419
- ### 17. `kling_video`
1420
-
1421
- Generate high-quality videos using Kling AI models (unified tool for text-to-video, image-to-video, and v2.1-pro with start+end frames).
1422
-
1423
- **Parameters:**
1424
- - `prompt` (string, required): Text prompt describing the video (max 5000 chars)
1425
- - `image_url` (string, optional): URL of input image for image-to-video or v2.1-pro start frame (if not provided, uses text-to-video)
1426
- - `tail_image_url` (string, optional): URL of end frame image for v2.1-pro (requires image_url). When provided, uses v2.1-pro model with start and end frame reference
1427
- - `duration` (string, optional): Duration of video in seconds (default: "5")
1428
- - Options: `5`, `10`
1429
- - `aspect_ratio` (string, optional): Aspect ratio for text-to-video (default: "16:9")
1430
- - Options: `16:9`, `9:16`, `1:1`
1431
- - `negative_prompt` (string, optional): Elements to avoid (max 2500 chars, default: "blur, distort, and low quality")
1432
- - `cfg_scale` (number, optional): CFG scale for prompt adherence (0-1, step 0.1, default: 0.5)
1433
- - `callBackUrl` (string, optional): URL for task completion notifications
1434
-
1435
- **Examples:**
1436
-
1437
- Text-to-video generation:
1438
- ```json
1439
- {
1440
- "prompt": "A serene forest scene with sunlight filtering through the canopy. Birds chirping, gentle breeze rustling leaves. Camera slowly pans through the trees revealing a hidden waterfall",
1441
- "aspect_ratio": "16:9",
1442
- "duration": "10",
1443
- "cfg_scale": 0.7
1444
- }
1445
- ```
1446
-
1447
- Image-to-video generation:
1448
- ```json
1449
- {
1450
- "prompt": "The person in the image waves and smiles, then turns to look at the scenic mountain view",
1451
- "image_url": "https://example.com/portrait.jpg",
1452
- "duration": "5"
1453
- }
1454
- ```
1455
-
1456
- V2.1-pro with start and end frames:
1457
- ```json
1458
- {
1459
- "prompt": "A smooth transition showing the landscape changing from day to night, with the person from frame 1 walking towards the sunset",
1460
- "image_url": "https://example.com/start-frame.jpg",
1461
- "tail_image_url": "https://example.com/end-frame.jpg",
1462
- "duration": "10",
1463
- "cfg_scale": 0.6
1464
- }
1465
- ```
1466
-
1467
- **Key Features:**
1468
- - **Three Intelligent Modes**:
1469
- - Text-to-video: Create videos from text descriptions
1470
- - Image-to-video: Animate static images
1471
- - V2.1-pro: Advanced mode with start and end frame references for controlled video transitions
1472
- - **Smart Mode Detection**: Automatically selects the best model based on parameters
1473
- - **Start/End Frame Control**: V2.1-pro uniquely supports specifying both start and end frames for precise video flows
1474
- - **Flexible Duration**: 5 or 10 second options
1475
- - **Aspect Ratio Control**: Multiple formats for text-to-video (16:9, 9:16, 1:1)
1476
- - **Quality Control**: CFG scale for controlling prompt adherence
1477
- - **Negative Prompts**: Fine-tune by specifying what to avoid
1478
-
1479
- **Model Selection Logic:**
1480
- - If `tail_image_url` provided → `kling/v2-1-pro` (start + end frame reference)
1481
- - If `image_url` provided → `kling/v2-5-turbo-image-to-video-pro` (image animation)
1482
- - Otherwise → `kling/v2-5-turbo-text-to-video-pro` (text-to-video)
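As a rough sketch, the precedence described above (end frame first, then start image, then text only) could look like the function below. The `KlingParams` interface and `selectKlingModel` name are hypothetical, and throwing when `tail_image_url` arrives without `image_url` is an assumption based on the documented "requires image_url" constraint, not confirmed server behavior.

```typescript
// Hypothetical input shape; field names follow the documented parameters.
interface KlingParams {
  prompt: string;
  image_url?: string;
  tail_image_url?: string;
}

// Checks are ordered from most specific to least specific.
function selectKlingModel(params: KlingParams): string {
  if (params.tail_image_url) {
    // Assumption: reject an end frame that has no start frame.
    if (!params.image_url) {
      throw new Error("tail_image_url requires image_url");
    }
    return "kling/v2-1-pro";
  }
  if (params.image_url) {
    return "kling/v2-5-turbo-image-to-video-pro";
  }
  return "kling/v2-5-turbo-text-to-video-pro";
}
```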
1483
-
1484
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Video generation typically takes 2-5 minutes depending on duration and complexity.
1485
-
1486
- ### 18. `openai_4o_image`
1487
- Generate, edit, and create image variants using OpenAI's GPT-4o image models (unified tool for text-to-image, image editing, and image variants).
1488
-
1489
- **Parameters:**
1490
- - `prompt` (string, required): Text prompt for image generation or editing (max 4000 chars)
1491
- - `filesUrl` (string, optional): URL of input image for editing/variants mode (if not provided, uses text-to-image)
1492
- - `maskUrl` (string, optional): URL of mask image; required only in editing mode, and must have the same dimensions as the `filesUrl` image
1493
- - `nVariants` (integer, optional): Number of image variants to generate (1-4, default: 4)
1494
- - `size` (string, optional): Output image size (default: "1024x1024")
1495
- - Options: `256x256`, `512x512`, `1024x1024`, `1792x1024`, `1024x1792`
1496
- - `model` (string, optional): Model to use (default: "gpt-4o-image")
1497
- - Options: `gpt-4o-image`, `gpt-4o-image-mini`
1498
- - `style` (string, optional): Image style (default: "vivid")
1499
- - Options: `vivid`, `natural`
1500
- - `quality` (string, optional): Image quality (default: "standard")
1501
- - Options: `standard`, `hd`
1502
- - `responseFormat` (string, optional): Response format (default: "url")
1503
- - Options: `url`, `b64_json`
1504
- - `user` (string, optional): User identifier for tracking (max 100 chars)
1505
- - `enableFallback` (boolean, optional): Enable fallback mechanism (default: true)
1506
- - `callBackUrl` (string, optional): URL for task completion notifications
1507
-
1508
- **Examples:**
1509
-
1510
- Text-to-image generation:
1511
- ```json
1512
- {
1513
- "prompt": "A futuristic city skyline at sunset with flying cars and neon lights, cyberpunk style",
1514
- "nVariants": 4,
1515
- "size": "1024x1024",
1516
- "quality": "hd",
1517
- "style": "vivid"
1518
- }
1519
- ```
1520
-
1521
- Image editing with mask:
1522
- ```json
1523
- {
1524
- "prompt": "Replace the cloudy sky with a clear starry night and add a full moon",
1525
- "filesUrl": "https://example.com/landscape.jpg",
1526
- "maskUrl": "https://example.com/landscape-mask.png",
1527
- "nVariants": 2,
1528
- "size": "1024x1024",
1529
- "quality": "hd"
1530
- }
1531
- ```
1532
-
1533
- Image variants:
1534
- ```json
1535
- {
1536
- "filesUrl": "https://example.com/portrait.jpg",
1537
- "nVariants": 4,
1538
- "style": "natural",
1539
- "quality": "standard"
1540
- }
1541
- ```
1542
-
1543
- High-quality generation with fallback:
1544
- ```json
1545
- {
1546
- "prompt": "A detailed oil painting of a serene mountain lake at dawn",
1547
- "nVariants": 2,
1548
- "size": "1792x1024",
1549
- "quality": "hd",
1550
- "model": "gpt-4o-image",
1551
- "enableFallback": true
1552
- }
1553
- ```
1554
-
1555
- **Key Features:**
1556
- - **Unified Interface**: Single tool for text-to-image, image editing, and image variants
1557
- - **Smart Mode Detection**: Automatically detects mode based on provided parameters
1558
- - Text-to-Image: `prompt` provided, no `filesUrl`
1559
- - Image Editing: `filesUrl` + `maskUrl` provided
1560
- - Image Variants: `filesUrl` provided, no `maskUrl`
1561
- - **Multiple Variants**: Generate up to 4 image variations in a single request
1562
- - **Flexible Sizing**: Support for square, portrait, and landscape formats
1563
- - **Quality Options**: Standard or HD quality for different use cases
1564
- - **Style Control**: Choose between vivid (creative) or natural (realistic) styles
1565
- - **Fallback Support**: Automatic fallback to FLUX_MAX model if GPT-4o fails
1566
- - **Model Options**: Use full GPT-4o or mini model based on requirements
1567
-
1568
- **Smart Detection Logic:**
1569
- - If `filesUrl` and `maskUrl` provided → Image Editing mode
1570
- - If `filesUrl` provided but no `maskUrl` → Image Variants mode
1571
- - If no `filesUrl` provided → Text-to-Image mode
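The three detection rules above map onto a short function. This is a sketch under stated assumptions: the `ImageMode` type and `detect4oImageMode` name are hypothetical, and only the `filesUrl` and `maskUrl` parameter names come from the documentation.

```typescript
type ImageMode = "text-to-image" | "image-editing" | "image-variants";

// Mirrors the documented rules: a mask only matters when an input image exists.
function detect4oImageMode(params: { filesUrl?: string; maskUrl?: string }): ImageMode {
  if (params.filesUrl && params.maskUrl) {
    return "image-editing";
  }
  if (params.filesUrl) {
    return "image-variants";
  }
  return "text-to-image";
}
```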
1572
-
1573
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Image generation typically takes 30-120 seconds depending on complexity and quality settings. The fallback mechanism uses FLUX_MAX model when GPT-4o fails, ensuring reliable generation.
1574
-
1575
- ### 19. `flux_kontext_image`
1576
- Generate or edit images using Flux Kontext AI models (unified tool for text-to-image generation and image editing with advanced features).
1577
-
1578
- **Parameters:**
1579
- - `prompt` (string, required): Text prompt describing the desired image or edit (max 5000 chars, English recommended)
1580
- - `inputImage` (string, optional): Input image URL for editing mode (omit for text-to-image generation)
1581
- - `aspectRatio` (string, optional): Output aspect ratio (default: "16:9")
1582
- - Options: `21:9` (ultra-wide), `16:9` (widescreen), `4:3` (standard), `1:1` (square), `3:4` (portrait), `9:16` (mobile portrait)
1583
- - `outputFormat` (string, optional): Output image format (default: "jpeg")
1584
- - Options: `jpeg`, `png`
1585
- - `model` (string, optional): Model version (default: "flux-kontext-pro")
1586
- - Options: `flux-kontext-pro` (standard), `flux-kontext-max` (enhanced)
1587
- - `enableTranslation` (boolean, optional): Auto-translate non-English prompts (default: true)
1588
- - `promptUpsampling` (boolean, optional): Enable prompt enhancement (default: false)
1589
- - `safetyTolerance` (integer, optional): Content moderation level (default: 2)
1590
- - Generation mode: 0-6 (0=strict, 6=permissive)
1591
- - Editing mode: 0-2 (0=strict, 2=balanced)
1592
- - `uploadCn` (boolean, optional): Route uploads via China servers (default: false)
1593
- - `watermark` (string, optional): Watermark identifier to add to generated image
1594
- - `callBackUrl` (string, optional): URL for task completion notifications
1595
-
1596
- **Examples:**
1597
-
1598
- Text-to-image generation:
1599
- ```json
1600
- {
1601
- "prompt": "A serene mountain landscape at sunset with a lake reflecting the orange sky, photorealistic style",
1602
- "aspectRatio": "16:9",
1603
- "model": "flux-kontext-max",
1604
- "outputFormat": "png"
1605
- }
1606
- ```
1607
-
1608
- Image editing:
1609
- ```json
1610
- {
1611
- "prompt": "Replace the sky with a starry night and add glowing lanterns",
1612
- "inputImage": "https://example.com/original-image.jpg",
1613
- "aspectRatio": "16:9",
1614
- "safetyTolerance": 2,
1615
- "enableTranslation": false
1616
- }
1617
- ```
1618
-
1619
- Mobile portrait generation:
1620
- ```json
1621
- {
1622
- "prompt": "A futuristic cityscape with flying cars and neon lights, cyberpunk style",
1623
- "aspectRatio": "9:16",
1624
- "model": "flux-kontext-max",
1625
- "promptUpsampling": true
1626
- }
1627
- ```
1628
-
1629
- **Key Features:**
1630
- - **Unified Interface**: Single tool for both text-to-image generation and image editing
1631
- - **Smart Mode Detection**: Automatically detects mode based on `inputImage` parameter
1632
- - Text-to-Image: No `inputImage` provided
1633
- - Image Editing: `inputImage` provided
1634
- - **Advanced Translation**: Automatic translation of non-English prompts to English
1635
- - **Multiple Aspect Ratios**: Support for ultra-wide, standard, square, and mobile formats
1636
- - **Model Selection**: Choose between standard (pro) and enhanced (max) quality models
1637
- - **Safety Controls**: Configurable content moderation with different levels for generation vs editing
1638
- - **Prompt Enhancement**: Optional upsampling for improved generation quality
1639
- - **Watermark Support**: Add custom watermarks to generated images
1640
- - **Regional Optimization**: Choose optimal server region for uploads
1641
-
1642
- **Smart Detection Logic:**
1643
- - If `inputImage` provided → Image Editing mode
1644
- - If no `inputImage` provided → Text-to-Image mode
1645
-
1646
- **Performance:**
1647
- - Text-to-image generation: 30-60 seconds
1648
- - Image editing: 1-3 minutes
1649
- - Enhanced model (flux-kontext-max): May take longer but provides higher quality
1650
-
1651
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Safety tolerance levels are automatically validated based on the generation mode (0-2 for editing, 0-6 for generation).
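The mode-dependent range check mentioned in the note can be illustrated as follows. This is only a sketch: the `validateSafetyTolerance` name and the choice to throw a `RangeError` are assumptions, while the ranges (0-2 for editing, 0-6 for generation) and the default of 2 come from the parameter list.

```typescript
// inputImage present means editing mode (max 2); otherwise generation mode (max 6).
function validateSafetyTolerance(inputImage: string | undefined, safetyTolerance: number = 2): number {
  const max = inputImage ? 2 : 6;
  if (!Number.isInteger(safetyTolerance) || safetyTolerance < 0 || safetyTolerance > max) {
    // Assumption: invalid values are rejected rather than clamped.
    throw new RangeError(`safetyTolerance must be an integer from 0 to ${max}`);
  }
  return safetyTolerance;
}
```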
1652
-
1653
- ### 20. `ideogram_reframe`
1654
- Reframe images to different aspect ratios and sizes using Ideogram V3 Reframe model with intelligent content adaptation.
1655
-
1656
- **Parameters:**
1657
- - `image_url` (string, required): URL of image to reframe (JPEG, PNG, WEBP, max 10MB)
1658
- - `image_size` (string, optional): Output size for the reframed image (default: "square_hd")
1659
- - Options: `square`, `square_hd`, `portrait_4_3`, `portrait_16_9`, `landscape_4_3`, `landscape_16_9`
1660
- - `rendering_speed` (string, optional): Rendering speed for generation (default: "BALANCED")
1661
- - Options: `TURBO` (fast), `BALANCED` (default), `QUALITY` (best)
1662
- - `style` (string, optional): Style type for generation (default: "AUTO")
1663
- - Options: `AUTO`, `GENERAL`, `REALISTIC`, `DESIGN`
1664
- - `num_images` (string, optional): Number of images to generate (default: "1")
1665
- - Options: `1`, `2`, `3`, `4`
1666
- - `seed` (number, optional): Seed for reproducible results (default: 0)
1667
- - `callBackUrl` (string, optional): URL for task completion notifications
1668
-
1669
- **Examples:**
1670
-
1671
- Basic reframing to square HD:
1672
- ```json
1673
- {
1674
- "image_url": "https://example.com/landscape-photo.jpg",
1675
- "image_size": "square_hd"
1676
- }
1677
- ```
1678
-
1679
- High-quality portrait reframing:
1680
- ```json
1681
- {
1682
- "image_url": "https://example.com/group-photo.jpg",
1683
- "image_size": "portrait_16_9",
1684
- "rendering_speed": "QUALITY",
1685
- "style": "REALISTIC",
1686
- "num_images": "2"
1687
- }
1688
- ```
1689
-
1690
- Fast generation with custom style:
1691
- ```json
1692
- {
1693
- "image_url": "https://example.com/artwork.jpg",
1694
- "image_size": "landscape_16_9",
1695
- "rendering_speed": "TURBO",
1696
- "style": "DESIGN",
1697
- "seed": 42
275
+ "tool": "nano_banana_image",
276
+ "arguments": {
277
+ "prompt": "A futuristic city at sunset, cyberpunk style",
278
+ "image_size": "16:9",
279
+ "output_format": "png"
280
+ }
1698
281
  }
1699
282
  ```
1700
283
 
1701
- Multiple variants for social media:
284
+ ### Generate Video
1702
285
  ```json
1703
286
  {
1704
- "image_url": "https://example.com/product-photo.jpg",
1705
- "image_size": "square",
1706
- "num_images": "4",
1707
- "style": "AUTO"
287
+ "tool": "sora_video",
288
+ "arguments": {
289
+ "prompt": "A peaceful garden with blooming flowers and butterflies",
290
+ "model": "sora-2",
291
+ "resolution": "1080p",
292
+ "duration": "10"
293
+ }
1708
294
  }
1709
295
  ```
1710
296
 
1711
- **Key Features:**
1712
- - **Intelligent Content Adaptation**: Smart content-aware reframing that preserves important elements
1713
- - **Multiple Aspect Ratios**: Support for square, portrait, and landscape formats
1714
- - **Rendering Speed Control**: Choose between speed (TURBO), balance (BALANCED), or quality (QUALITY)
1715
- - **Style Options**: Auto-detection or specific style types (GENERAL, REALISTIC, DESIGN)
1716
- - **Batch Generation**: Create multiple variants in a single request
1717
- - **Reproducible Results**: Seed control for consistent output across sessions
1718
- - **Professional Quality**: High-quality reframing with minimal artifacts
1719
-
1720
- **Output Sizes:**
1721
- - **Square**: 1:1 aspect ratio for social media and avatars
1722
- - **Square HD**: High-definition square format with better quality
1723
- - **Portrait 4:3**: Standard portrait orientation
1724
- - **Portrait 16:9**: Tall portrait for mobile and stories
1725
- - **Landscape 4:3**: Traditional landscape orientation
1726
- - **Landscape 16:9**: Widescreen format for displays and video
1727
-
1728
- **Use Cases:**
1729
- - **Social Media**: Convert images to optimal formats for different platforms
1730
- - **Content Adaptation**: Repurpose content for multiple aspect ratios
1731
- - **Design Workflows**: Generate variations for different layout requirements
1732
- - **Mobile Optimization**: Create mobile-friendly versions of desktop content
1733
- - **Batch Processing**: Generate multiple format variants efficiently
1734
-
1735
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Image reframing typically takes 30-120 seconds depending on complexity, rendering speed, and output settings.
1736
-
1737
- ### 21. `recraft_remove_background`
1738
- Remove backgrounds from images using Recraft AI background removal model with professional-quality edge detection.
1739
-
1740
- **Parameters:**
1741
- - `image` (string, required): URL of image to remove background from (PNG, JPG, WEBP, max 5MB, 16MP, 4096px max, 256px min)
1742
- - `callBackUrl` (string, optional): URL for task completion notifications
1743
-
1744
- **Examples:**
1745
-
1746
- Basic background removal:
297
+ ### Generate Music
1747
298
  ```json
1748
299
  {
1749
- "image": "https://example.com/portrait.jpg"
300
+ "tool": "suno_generate_music",
301
+ "arguments": {
302
+ "prompt": "Upbeat electronic music with energetic beats",
303
+ "customMode": true,
304
+ "instrumental": false,
305
+ "model": "V5",
306
+ "style": "Electronic",
307
+ "title": "Energy Boost"
308
+ }
1750
309
  }
1751
310
  ```
1752
311
 
1753
- With callback URL:
312
+ ### Text-to-Speech
1754
313
  ```json
1755
314
  {
1756
- "image": "https://example.com/product-photo.jpg",
1757
- "callBackUrl": "https://api.example.com/callback"
315
+ "tool": "elevenlabs_tts",
316
+ "arguments": {
317
+ "text": "Welcome to the future of AI-powered content creation!",
318
+ "voice": "Rachel",
319
+ "model": "turbo"
320
+ }
1758
321
  }
1759
322
  ```
1760
323
 
1761
- **Key Features:**
1762
- - **Professional Quality**: Clean edge detection with precise background separation
1763
- - **Format Support**: Works with PNG, JPG, and WEBP images
1764
- - **Size Optimization**: Handles images up to 16MP with optimal processing
1765
- - **Fast Processing**: Quick background removal for most image types
1766
- - **Automatic Enhancement**: Smart edge refinement for natural results
1767
-
1768
- **Use Cases:**
1769
- - **Product Photography**: Create clean product images with transparent backgrounds
1770
- - **Portrait Processing**: Remove backgrounds for professional headshots
1771
- - **Design Workflows**: Isolate subjects for composite images
1772
- - **E-commerce**: Prepare product images for catalogs
1773
- - **Content Creation**: Create assets for social media and marketing
1774
-
1775
- **Technical Specifications:**
1776
- - **Supported Formats**: PNG, JPG, WEBP
1777
- - **Maximum File Size**: 5MB
1778
- - **Maximum Resolution**: 16MP (4096px max dimension)
1779
- - **Minimum Resolution**: 256px min dimension
1780
- - **Output Format**: PNG with transparent background
1781
-
1782
- **Note**: The `callBackUrl` is optional and uses automatic fallback if not provided. Background removal typically takes 10-30 seconds depending on image complexity and size.
1783
-
1784
- ## Why Developers Choose Kie.ai Over Alternatives
1785
-
1786
- ### 💸 **Better Value Than Fal.ai**
1787
- - **Lower costs** for the same premium AI models
1788
- - **Pay-as-you-go pricing** - no monthly commitments
1789
- - **Free trial** to test before you buy
1790
-
1791
- ### 🛠️ **Developer Experience**
1792
- - **Single API key** for all models
1793
- - **Documentation** with examples
1794
- - **Simple integration** - get started in minutes
1795
- - **24/7 support** from technical team
1796
-
1797
- ### 🚀 **Performance**
1798
- - **99.9% uptime**
1799
- - **Fast response times** (25.2s average)
1800
- - **High concurrency** for production workloads
1801
- - **Reliable results**
1802
-
1803
- ### 🔒 **Security**
1804
- - **Encryption** for your data
1805
- - **GDPR compliant** data handling
1806
- - **Private prompts and results**
1807
- - **Regular security updates**
1808
-
1809
- ### 🎯 **Platform**
1810
- - **Latest AI models** as they're released
1811
- - **Backward compatible** API
1812
- - **Feature updates** based on feedback
1813
- - **Active development**
1814
-
1815
- ## API Endpoints
1816
-
1817
- The server interfaces with these Kie.ai API endpoints:
1818
-
1819
- - **Veo3 Video Generation**: `POST /api/v1/veo/generate`
1820
- - **Veo3 Video Status**: `GET /api/v1/veo/record-info`
1821
- - **Veo3 1080p Upgrade**: `GET /api/v1/veo/get-1080p-video`
1822
- - **Nano Banana Generation**: `POST /api/v1/jobs/createTask`
1823
- - **Nano Banana Edit**: `POST /api/v1/jobs/createTask`
1824
- - **Nano Banana Upscale**: `POST /api/v1/jobs/createTask`
1825
- - **Nano Banana Status**: `GET /api/v1/jobs/recordInfo`
1826
- - **Suno Music Generation**: `POST /api/v1/generate`
1827
- - **Suno Music Status**: `GET /api/v1/generate/record-info?taskId=XXX`
1828
- - **ElevenLabs TTS Generation**: `POST /api/v1/jobs/createTask`
1829
- - **ElevenLabs TTS Status**: `GET /api/v1/jobs/recordInfo`
1830
- - **ElevenLabs Sound Effects**: `POST /api/v1/jobs/createTask`
1831
- - **ElevenLabs Sound Effects Status**: `GET /api/v1/jobs/recordInfo`
1832
- - **ByteDance Seedance Video**: `POST /api/v1/jobs/createTask`
1833
- - **ByteDance Seedance Status**: `GET /api/v1/jobs/recordInfo`
1834
- - **ByteDance Seedream Image**: `POST /api/v1/jobs/createTask`
1835
- - **ByteDance Seedream Status**: `GET /api/v1/jobs/recordInfo`
1836
- - **Qwen Image Generation**: `POST /api/v1/jobs/createTask`
1837
- - **Qwen Image Status**: `GET /api/v1/jobs/recordInfo`
1838
- - **Runway Aleph Video**: `POST /api/v1/jobs/createTask`
1839
- - **Runway Aleph Status**: `GET /api/v1/jobs/recordInfo`
1840
- - **Midjourney Generation**: `POST /api/v1/jobs/createTask`
1841
- - **Midjourney Status**: `GET /api/v1/jobs/recordInfo`
1842
- - **Wan Video Generation**: `POST /api/v1/jobs/createTask`
1843
- - **Wan Video Status**: `GET /api/v1/jobs/recordInfo`
1844
- - **OpenAI 4o Image Generation**: `POST /api/v1/jobs/createTask`
1845
- - **OpenAI 4o Image Status**: `GET /api/v1/jobs/recordInfo`
1846
- - **Flux Kontext Image**: `POST /api/v1/jobs/createTask`
1847
- - **Flux Kontext Status**: `GET /api/v1/jobs/recordInfo`
1848
- - **Recraft Remove Background**: `POST /api/v1/jobs/createTask`
1849
- - **Recraft Remove Background Status**: `GET /api/v1/jobs/recordInfo`
1850
- - **Ideogram V3 Reframe**: `POST /api/v1/jobs/createTask`
1851
- - **Ideogram V3 Reframe Status**: `GET /api/v1/jobs/recordInfo`
1852
-
1853
- All endpoints follow official Kie.ai API documentation.
1854
-
1855
- ## Database Schema
1856
-
1857
- The server uses SQLite to track tasks:
1858
-
1859
- ```sql
1860
- CREATE TABLE tasks (
1861
- id INTEGER PRIMARY KEY AUTOINCREMENT,
1862
- task_id TEXT UNIQUE NOT NULL,
1863
- api_type TEXT NOT NULL, -- 'nano-banana', 'nano-banana-edit', 'nano-banana-upscale', 'veo3', 'suno', 'elevenlabs-tts', 'elevenlabs-sound-effects', 'bytedance-seedance-video', 'bytedance-seedream-image', 'qwen-image', 'runway-aleph-video', 'midjourney-generate', 'wan-video', 'kling-v2-1-pro', 'kling-v2-5-turbo-text-to-video', 'kling-v2-5-turbo-image-to-video', 'openai-4o-image', 'flux-kontext-image', 'recraft-remove-background', 'ideogram-reframe'
1864
- status TEXT DEFAULT 'pending',
1865
- created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
1866
- updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
1867
- result_url TEXT,
1868
- error_message TEXT
1869
- );
1870
- ```
324
+ **→ [See 100+ more examples in tool documentation](docs/TOOLS.md)**
1871
325
 
1872
326
  ## Database & Task Management
1873
327
 
1874
- The server includes a built-in SQLite database for persistent task tracking and management.
1875
-
1876
- ### **Database Features**
328
+ The server includes a built-in SQLite database for persistent task tracking:
1877
329
 
1878
330
  - **🔄 Persistent Storage**: Tasks survive server restarts
1879
- - **📊 Complete History**: Track all generation tasks and their results
1880
- - **⚡ Smart Caching**: Local database reduces API calls for status checks
1881
- - **🔍 Full Audit Trail**: Complete lifecycle tracking for every task
331
+ - **📊 Complete History**: Track all generation tasks and their results
332
+ - **⚡ Smart Caching**: Local database reduces API calls
333
+ - **🔍 Full Audit Trail**: Complete lifecycle tracking
1882
334
  - **🎯 Intelligent Routing**: Database provides api_type for correct endpoint selection
1883
335
 
1884
- ### **Task Lifecycle**
1885
-
1886
- ```
1887
- 1. Task Created → INSERT (status: 'pending')
1888
- 2. API Processing → UPDATE (status: 'processing')
1889
- 3. API Complete → UPDATE (status: 'completed', result_url: '...')
1890
- 4. API Failed → UPDATE (status: 'failed', error_message: '...')
1891
- ```
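One way to read the lifecycle above is as a small state machine. The sketch below assumes that only the forward transitions listed are legal and that `completed` and `failed` are terminal; the README does not state this explicitly, and the `NEXT_STATUSES` and `canTransition` names are hypothetical.

```typescript
type TaskStatus = "pending" | "processing" | "completed" | "failed";

// Assumption: pending -> processing -> completed | failed, with no way back.
const NEXT_STATUSES: Record<TaskStatus, readonly TaskStatus[]> = {
  pending: ["processing"],
  processing: ["completed", "failed"],
  completed: [],
  failed: [],
};

// Returns true when moving from one status to another follows the lifecycle.
function canTransition(from: TaskStatus, to: TaskStatus): boolean {
  return NEXT_STATUSES[from].includes(to);
}
```

A guard like this would typically run before each `UPDATE`, so that a late or duplicate API response cannot move a finished task back to `processing`.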
1892
-
1893
- ### **Available Task Management Tools**
1894
-
1895
- #### **1. `list_tasks`**
1896
- List all tasks in the database, optionally capped by `limit` (default: 100).
336
+ ### Quick Examples
1897
337
 
338
+ **List recent tasks:**
1898
339
  ```json
1899
340
  {
1900
- "limit": 50 // optional, default 100
1901
- }
1902
- ```
1903
-
1904
- **Response:**
1905
- ```json
1906
- {
1907
- "tasks": [
1908
- {
1909
- "id": 1,
1910
- "task_id": "281e5b0*********************f39b9",
1911
- "api_type": "veo3",
1912
- "status": "completed",
1913
- "created_at": "2025-01-14T10:30:00.000Z",
1914
- "updated_at": "2025-01-14T10:35:00.000Z",
1915
- "result_url": "https://file.aiquickdraw.com/custom-page/akr/video.mp4",
1916
- "error_message": null
1917
- }
1918
- ]
1919
- }
1920
- ```
1921
-
1922
- #### **2. `get_task_status`**
1923
- Get detailed status of a specific task, combining local database with live API data.
1924
-
1925
- ```json
1926
- {
1927
- "task_id": "281e5b0*********************f39b9"
341
+ "tool": "list_tasks",
342
+ "arguments": {
343
+ "limit": 20,
344
+ "status": "completed"
345
+ }
1928
346
  }
1929
347
  ```
1930
348
 
1931
- **Response:**
349
+ **Check task status:**
1932
350
  ```json
1933
351
  {
1934
- "task_id": "281e5b0*********************f39b9",
1935
- "api_type": "veo3",
1936
- "status": "completed",
1937
- "local_status": "completed",
1938
- "api_status": "success",
1939
- "created_at": "2025-01-14T10:30:00.000Z",
1940
- "updated_at": "2025-01-14T10:35:00.000Z",
1941
- "result_url": "https://file.aiquickdraw.com/custom-page/akr/video.mp4",
1942
- "api_data": {
1943
- "state": "success",
1944
- "resultJson": "{\"resultUrls\":[\"https://file.aiquickdraw.com/custom-page/akr/video.mp4\"]}",
1945
- "costTime": 180000,
1946
- "completeTime": 1757584164490
352
+ "tool": "get_task_status",
353
+ "arguments": {
354
+ "task_id": "281e5b0*********************f39b9"
1947
355
  }
1948
356
  }
1949
357
  ```
1950
358
 
1951
- ### **Database Configuration**
1952
-
1953
- #### **Environment Variables**
1954
- ```bash
1955
- # Custom database file location (optional)
1956
- KIE_AI_DB_PATH=./custom_tasks.db
1957
-
1958
- # Default: ./tasks.db in current working directory
1959
- ```
1960
-
1961
- #### **Database Behavior**
1962
- - **Auto-initialization**: Creates tables and indexes on first run
1963
- - **Indexing**: Optimized queries on `task_id` and `status` fields
1964
- - **Thread-safe**: Uses SQLite serialization for concurrent access
1965
- - **Persistent**: Data survives server restarts
1966
- - **Inspectable**: Can be opened with any SQLite client tool
1967
-
1968
- ### **Smart Status Checking**
1969
-
1970
- The `get_task_status` tool uses intelligent routing:
1971
-
1972
- 1. **Query Local Database**: Fast lookup of task metadata
1973
- 2. **API Status Check**: Calls appropriate endpoint based on `api_type`
1974
- 3. **Database Update**: Stores latest status from API response
1975
- 4. **Combined Response**: Merges local and API data for complete picture
1976
-
1977
- ### **API Type Routing**
1978
-
1979
- The database `api_type` field determines which Kie.ai endpoint to query:
1980
-
1981
- | api_type | Endpoint | Purpose |
1982
- |----------|----------|---------|
1983
- | `veo3` | `/veo/record-info` | Veo3 video generation |
1984
- | `nano-banana` | `/jobs/recordInfo` | Image generation |
1985
- | `nano-banana-edit` | `/jobs/recordInfo` | Image editing |
1986
- | `nano-banana-upscale` | `/jobs/recordInfo` | Image upscaling |
1987
- | `suno` | `/generate/record-info` | Music generation |
1988
- | `elevenlabs-tts` | `/jobs/recordInfo` | Text-to-speech |
1989
- | `elevenlabs-sound-effects` | `/jobs/recordInfo` | Sound effects |
1990
- | `bytedance-seedance-video` | `/jobs/recordInfo` | Video generation |
1991
- | `bytedance-seedream-image` | `/jobs/recordInfo` | Image generation/editing |
1992
- | `qwen-image` | `/jobs/recordInfo` | Image generation/editing |
1993
- | `runway-aleph-video` | `/jobs/recordInfo` | Video-to-video transformation |
1994
- | `midjourney-generate` | `/jobs/recordInfo` | Image/video generation |
1995
- | `wan-video` | `/jobs/recordInfo` | Video generation |
1996
- | `kling-v2-1-pro` | `/jobs/recordInfo` | Video generation (start+end frames) |
1997
- | `kling-v2-5-turbo-text-to-video` | `/jobs/recordInfo` | Video generation (text-to-video) |
1998
- | `kling-v2-5-turbo-image-to-video` | `/jobs/recordInfo` | Video generation (image-to-video) |
1999
- | `openai-4o-image` | `/jobs/recordInfo` | Image generation/editing/variants |
2000
- | `flux-kontext-image` | `/jobs/recordInfo` | Image generation/editing |
2001
- | `recraft-remove-background` | `/jobs/recordInfo` | Background removal |
2002
- | `ideogram-reframe` | `/jobs/recordInfo` | Image reframing |
2003
-
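Read as code, the table above reduces to two dedicated endpoints plus a shared one. The sketch below copies the endpoint paths from the table; the names `DEDICATED_ENDPOINTS`, `DEFAULT_ENDPOINT`, and `statusEndpointFor` are illustrative only, and the fallback for unlisted `api_type` values is an assumption.

```typescript
// Only veo3 and suno have their own status endpoints in the table.
const DEDICATED_ENDPOINTS = new Map<string, string>([
  ["veo3", "/veo/record-info"],
  ["suno", "/generate/record-info"],
]);

// Every other api_type listed in the table uses this shared endpoint.
const DEFAULT_ENDPOINT = "/jobs/recordInfo";

// Assumption: values not listed in the table also fall back to the shared
// endpoint. A stricter version could reject unknown api_type values instead.
function statusEndpointFor(apiType: string): string {
  return DEDICATED_ENDPOINTS.get(apiType) ?? DEFAULT_ENDPOINT;
}
```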
2004
- ### **Task Status Values**
2005
-
2006
- - **`pending`**: Task created, waiting for API processing
2007
- - **`processing`**: API is actively processing the task
2008
- - **`completed`**: Task finished successfully, result available
2009
- - **`failed`**: Task failed, error message available
2010
-
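One way a client might use these four values is to poll until a task reaches `completed` or `failed`. The sketch below assumes exactly that; `isTerminal`, `waitForTask`, and the attempt and delay parameters are hypothetical and not part of the server.

```typescript
type TaskStatus = "pending" | "processing" | "completed" | "failed";

// completed and failed are treated as final; pending and processing are not.
function isTerminal(status: TaskStatus): boolean {
  return status === "completed" || status === "failed";
}

// Calls check() up to maxAttempts times, waiting delayMs between calls,
// and returns the last status seen (which may still be non-terminal).
async function waitForTask(
  check: () => Promise<TaskStatus>,
  maxAttempts: number,
  delayMs: number,
): Promise<TaskStatus> {
  let status = await check();
  for (let attempt = 1; attempt < maxAttempts && !isTerminal(status); attempt++) {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    status = await check();
  }
  return status;
}
```

In practice `check` would wrap a call to the `get_task_status` tool; it is left as a parameter so the loop itself stays easy to test.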
2011
- ### **Best Practices**
2012
-
2013
- - **Use `list_tasks`** to get overview of all generation activity
2014
- - **Use `get_task_status`** for detailed progress tracking
2015
- - **Monitor `updated_at`** to see when status last changed
2016
- - **Check `error_message`** for failed tasks to debug issues
2017
- - **Use `result_url`** to access completed generation results
2018
-
2019
- ## Usage Examples
2020
-
2021
- ### Basic Image Generation
2022
- ```bash
2023
- # Generate an image
2024
- curl -X POST http://localhost:3000/tools/call \
2025
- -H "Content-Type: application/json" \
2026
- -d '{
2027
- "name": "nano_banana_generate",
2028
- "arguments": {
2029
- "prompt": "A cat wearing a space helmet"
2030
- }
2031
- }'
2032
- ```
2033
-
2034
- ### Video Generation with Options
2035
- ```bash
2036
- # Generate a video
2037
- curl -X POST http://localhost:3000/tools/call \
2038
- -H "Content-Type: application/json" \
2039
- -d '{
2040
- "name": "veo3_generate_video",
2041
- "arguments": {
2042
- "prompt": "A peaceful garden with blooming flowers",
2043
- "aspectRatio": "16:9",
2044
- "model": "veo3_fast"
2045
- }
2046
- }'
2047
- ```
359
+ **→ [See complete database documentation](docs/DATABASE.md)** including schema, lifecycle, and best practices
2048
360
 
2049
361
  ## Real-World Use Cases
2050
362
 
2051
- ### 🎬 **Content Creation Agencies**
363
+ <details>
364
+ <summary><strong>🎬 Content Creation Agencies (click to expand)</strong></summary>
365
+
2052
366
  ```bash
2053
367
  # Generate social media video content
2054
- veo3_generate_video: "A trendy coffee shop with latte art, cinematic lighting"
368
+ sora_video: "A trendy coffee shop with latte art, cinematic lighting"
2055
369
 
2056
370
  # Create product photography
2057
371
  nano_banana_image: "Luxury watch on marble surface, professional product shot"
@@ -2059,11 +373,14 @@ nano_banana_image: "Luxury watch on marble surface, professional product shot"
2059
373
  # Add background music
2060
374
  suno_generate_music: "Upbeat corporate background music, 2 minutes"
2061
375
  ```
376
+ </details>
377
+
378
+ <details>
379
+ <summary><strong>🎮 Game Development Studios (click to expand)</strong></summary>
2062
380
 
2063
- ### 🎮 **Game Development Studios**
2064
381
  ```bash
2065
382
  # Generate game assets
2066
- nano_banana_generate: "Fantasy sword with glowing runes, game asset style"
383
+ bytedance_seedream_image: "Fantasy sword with glowing runes, game asset style"
2067
384
 
2068
385
  # Create character voiceovers
2069
386
  elevenlabs_tts: "Welcome, brave adventurer! Your quest begins now."
@@ -2071,11 +388,14 @@ elevenlabs_tts: "Welcome, brave adventurer! Your quest begins now."
2071
388
  # Design sound effects
2072
389
  elevenlabs_ttsfx: "Magical spell casting with sparkles and energy"
2073
390
  ```
391
+ </details>
392
+
393
+ <details>
394
+ <summary><strong>📱 Mobile App Developers (click to expand)</strong></summary>
2074
395
 
2075
- ### 📱 **Mobile App Developers**
2076
396
  ```bash
2077
397
  # Generate app icons and illustrations
2078
- nano_banana_image: "Modern minimalist app icon for fitness tracker"
398
+ flux_kontext_image: "Modern minimalist app icon for fitness tracker"
2079
399
 
2080
400
  # Create tutorial videos
2081
401
  bytedance_seedance_video: "Screen recording showing app features, clean interface"
@@ -2083,64 +403,40 @@ bytedance_seedance_video: "Screen recording showing app features, clean interfac
2083
403
  # Add narration
2084
404
  elevenlabs_tts: "Tap here to get started with your new profile"
2085
405
  ```
406
+ </details>
407
+
408
+ <details>
409
+ <summary><strong>🏢 Enterprise Applications (click to expand)</strong></summary>
2086
410
 
2087
- ### 🏢 **Enterprise Applications**
2088
411
  ```bash
2089
412
  # Generate training materials
2090
413
  veo3_generate_video: "Professional office environment, employee training scenario"
2091
414
 
2092
415
  # Create corporate presentations
2093
- nano_banana_image: {
2094
- "prompt": "Add company logo to presentation slide, maintain professional style",
2095
- "image_urls": ["https://example.com/slide.jpg"]
2096
- }
416
+ openai_4o_image: "Add company logo to presentation slide, maintain professional style"
2097
417
 
2098
418
  # Produce marketing content
2099
419
  suno_generate_music: "Corporate background music for promotional video"
2100
420
  ```
2101
-
2102
- ### 🎨 **Creative Professionals**
2103
- ```bash
2104
- # Artistic projects
2105
- bytedance_seedance_video: "Abstract art coming to life, vibrant colors flowing"
2106
-
2107
- # Photography enhancement
2108
- nano_banana_image: {
2109
- "image": "https://example.com/portrait.jpg",
2110
- "scale": 4,
2111
- "face_enhance": true
2112
- }
2113
-
2114
- # Audio production
2115
- elevenlabs_sound_effects: "Nature soundscape with birds and gentle wind"
2116
- ```
2117
-
2118
- ## Success Stories
2119
-
2120
- ### 🚀 **Startup Reduces AI Costs**
2121
- *"Switched from multiple AI services to Kie.ai and cut our monthly AI budget from $2,000 to $600. The unified API simplified our codebase."* - CTO, Content Startup
2122
-
2123
- ### ⚡ **Agency Speeds Up Delivery**
2124
- *"Our video production timeline went from 2 weeks to 3 days using Veo 3. Clients like the quality and we handle more projects."* - Creative Director, Marketing Agency
2125
-
2126
- ### 🎵 **Music Producer Scales Work**
2127
- *"Suno API lets us generate custom background music for client videos in minutes instead of days. It improved our workflow."* - Producer, Video Production Company
421
+ </details>
2128
422
 
2129
423
  ## Error Handling
2130
424
 
2131
425
  The server handles these HTTP error codes from Kie.ai:
2132
426
 
2133
- - **200**: Success
2134
- - **400**: Content policy violation / English prompts only
2135
- - **401**: Unauthorized (invalid API key)
2136
- - **402**: Insufficient credits
2137
- - **404**: Resource not found
2138
- - **422**: Validation error / record is null
2139
- - **429**: Rate limited
2140
- - **451**: Image access limits
2141
- - **455**: Service maintenance
2142
- - **500**: Server error / timeout
2143
- - **501**: Generation failed
427
+ | Code | Meaning |
428
+ |------|---------|
429
+ | **200** | Success |
430
+ | **400** | Content policy violation / English prompts only |
431
+ | **401** | Unauthorized (invalid API key) |
432
+ | **402** | Insufficient credits |
433
+ | **404** | Resource not found |
434
+ | **422** | Validation error / record is null |
435
+ | **429** | Rate limited |
436
+ | **451** | Image access limits |
437
+ | **455** | Service maintenance |
438
+ | **500** | Server error / timeout |
439
+ | **501** | Generation failed |
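
A client that wants readable messages could keep the table above as a lookup. This is a minimal sketch: the codes and meanings are copied from the table, while `KIE_STATUS_MEANINGS`, `describeKieStatus`, and the wording used for unlisted codes are assumptions made for illustration.

```typescript
// Codes and meanings as listed in the README table.
const KIE_STATUS_MEANINGS = new Map<number, string>([
  [200, "Success"],
  [400, "Content policy violation / English prompts only"],
  [401, "Unauthorized (invalid API key)"],
  [402, "Insufficient credits"],
  [404, "Resource not found"],
  [422, "Validation error / record is null"],
  [429, "Rate limited"],
  [451, "Image access limits"],
  [455, "Service maintenance"],
  [500, "Server error / timeout"],
  [501, "Generation failed"],
]);

// Hypothetical helper: returns the documented meaning, or a generic
// message for any code the table does not cover.
function describeKieStatus(code: number): string {
  return KIE_STATUS_MEANINGS.get(code) ?? `Unexpected status code ${code}`;
}
```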
2144
440
 
2145
441
  ## Development
2146
442
 
@@ -2175,6 +471,8 @@ See https://kie.ai/billing for detailed pricing.
2175
471
  4. **Monitoring**: Monitor task status and handle failed generations appropriately
2176
472
  5. **Storage**: Consider automatic cleanup of old task records
2177
473
 
474
+ **→ [See complete administrator guide](docs/ADMIN.md)** for deployment best practices
475
+
2178
476
  ## Troubleshooting
2179
477
 
2180
478
  ### Common Issues
@@ -2201,21 +499,19 @@ For issues related to:
2201
499
 
2202
500
  ## 🚀 Start Building with Kie.ai
2203
501
 
2204
- Developers are using Kie.ai for their AI media generation:
2205
-
2206
- ### 🎯 **Get Started**
502
+ ### 🎯 Get Started
2207
503
  1. **Get your free API key** at [kie.ai/api-key](https://kie.ai/api-key)
2208
504
  2. **Install the MCP server**: `npm install @felores/kie-ai-mcp-server`
2209
505
  3. **Generate your first AI content** in minutes
2210
506
 
2211
- ### 💡 **Benefits**
507
+ ### 💡 Benefits
2212
508
  - ✅ **Free trial** - Test models before paying
2213
509
  - ✅ **30-50% lower pricing** than competitors
2214
510
  - ✅ **99.9% uptime** guarantee
2215
511
  - ✅ **24/7 human support**
2216
512
  - ✅ **Simple integration**
2217
513
 
2218
- ### 🌟 **AI Content Generation**
514
+ ### 🌟 AI Content Generation
2219
515
  Kie.ai provides access to advanced AI models at competitive pricing.
2220
516
 
2221
517
  **Start your project today.** 🚀
@@ -2236,4 +532,4 @@ MIT License - see LICENSE file for details.
2236
532
 
2237
533
  ## Changelog
2238
534
 
2239
- See [CHANGELOG.md](CHANGELOG.md) for detailed version history and release notes.
535
+ See [CHANGELOG.md](CHANGELOG.md) for detailed version history and release notes.