@acedatacloud/skills 2026.614.0 → 2026.614.2
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@acedatacloud/skills",
-  "version": "2026.614.
+  "version": "2026.614.2",
   "description": "Agent Skills for AceDataCloud AI services — music, image, video generation, LLM chat, web search. Compatible with Claude Code, GitHub Copilot, Gemini CLI, OpenAI Codex, and 30+ AI coding agents.",
   "keywords": [
     "agent-skills",
@@ -1,6 +1,6 @@
 ---
 name: kling-video
-description: Generate AI videos with Kuaishou Kling via AceDataCloud API. Use when creating videos from text or images, extending existing videos,
+description: Generate AI videos with Kuaishou Kling via AceDataCloud API. Use when creating videos from text or images, extending existing videos, applying motion control, or lip-syncing audio/text to video. Supports text-to-video, image-to-video, extend, motion generation, and lip-sync with multiple models and quality modes.
 license: Apache-2.0
 metadata:
 author: acedatacloud
@@ -104,6 +104,19 @@ POST /kling/motion
 }
 ```
 
+### 5. Lip Sync
+
+Create a lip-synced video from a source video plus either an audio track or input text.
+
+```json
+POST /kling/lip-sync
+{
+  "video_url": "https://example.com/source.mp4",
+  "mode": "audio2video",
+  "audio_url": "https://example.com/voiceover.mp3"
+}
+```
+
 ## Parameters
 
 | Parameter | Values | Description |
@@ -120,6 +133,16 @@ POST /kling/motion
 | `element_list` | array | Reference subjects from the element library (each item has `element_id`). Combined with `video_list`, total reference images + subjects ≤ 7 (or ≤ 4 if a reference video is included) |
 | `video_list` | array | Reference video(s) via `video_url` (MP4/MOV, 3–10s, ≤200MB, max 1 video). Each item has `video_url`, `refer_type` (`"feature"` or `"base"`), and optional `keep_original_sound` |
 | `callback_url` | string | Async callback URL |
+| `mode` (`/kling/lip-sync`) | `"audio2video"`, `"text2video"` | Lip-sync mode |
+| `video_url` (`/kling/lip-sync`) | URL | Source video URL for lip-sync |
+| `video_id` (`/kling/lip-sync`) | string | Existing Kling video ID for lip-sync |
+| `audio_url` (`/kling/lip-sync`) | URL | Audio source URL (for `audio2video`) |
+| `audio_type` (`/kling/lip-sync`) | `"url"`, `"file"` | Audio input type (default `url`) |
+| `audio_file` (`/kling/lip-sync`) | string | Audio file payload when `audio_type=file` |
+| `text` (`/kling/lip-sync`) | string | Input text to synthesize speech (for `text2video`) |
+| `voice_id` (`/kling/lip-sync`) | string | Voice preset ID used in `text2video` |
+| `voice_language` (`/kling/lip-sync`) | `"zh"`, `"en"` | TTS language for `text2video` (default `zh`) |
+| `voice_speed` (`/kling/lip-sync`) | number | TTS speaking speed (default `1.0`) |
 
 ## Gotchas
 
@@ -128,6 +151,7 @@ POST /kling/motion
 - `generate_audio` enables synchronized audio generation (supported by `kling-v3`, `kling-v3-omni`, and `kling-v2-6` in pro mode)
 - `end_image_url` is only for `image2video` action — it defines the last frame
 - Motion control (`/kling/motion`) is a separate endpoint from video generation
+- Lip-sync is a separate endpoint (`/kling/lip-sync`) and requires `mode`; use `audio_url` for `audio2video` or `text` + voice fields for `text2video`
 - `pro` mode costs roughly 2x `std` mode but generates faster with better quality
 - Task states use `"succeed"` (not "succeeded") — check for this value when polling
 - `negative_prompt` helps avoid unwanted elements (e.g., "blurry, low quality, text")
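Reviewer sketch (not part of the package): the lip-sync rows and the new gotcha in the hunks above imply a mode-dependent request body. The Python below is a minimal illustration of assembling that body client-side. The function name `build_lip_sync_payload` is hypothetical, the "exactly one of `video_url` / `video_id`" check is an assumption the diff does not state, and the `audio_type=file` / `audio_file` path is left out on purpose.

```python
from typing import Optional


def build_lip_sync_payload(
    mode: str,
    video_url: Optional[str] = None,
    video_id: Optional[str] = None,
    audio_url: Optional[str] = None,
    text: Optional[str] = None,
    voice_id: Optional[str] = None,
    voice_language: str = "zh",   # default taken from the parameter table
    voice_speed: float = 1.0,     # default taken from the parameter table
) -> dict:
    """Assemble a JSON body for POST /kling/lip-sync (client-side checks only)."""
    if mode not in ("audio2video", "text2video"):
        raise ValueError(f"unsupported mode: {mode!r}")
    # Assumption: exactly one video source is expected. The diff lists both
    # fields but does not say whether they are mutually exclusive.
    if (video_url is None) == (video_id is None):
        raise ValueError("pass exactly one of video_url or video_id")

    payload = {"mode": mode}
    if video_url is not None:
        payload["video_url"] = video_url
    else:
        payload["video_id"] = video_id

    if mode == "audio2video":
        # audio2video takes an audio source; only the URL form is sketched here.
        if not audio_url:
            raise ValueError("audio2video needs audio_url")
        payload["audio_url"] = audio_url
    else:
        # text2video takes text plus the voice fields.
        if not text:
            raise ValueError("text2video needs text")
        payload["text"] = text
        payload["voice_language"] = voice_language
        payload["voice_speed"] = voice_speed
        if voice_id is not None:
            payload["voice_id"] = voice_id
    return payload
```

Calling it with `mode="audio2video"`, the example `video_url`, and the example `audio_url` yields the same three fields as the JSON body added under "### 5. Lip Sync".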
@@ -66,14 +66,27 @@ curl -sS -X POST https://api.acedata.cloud/seedance/videos \
 curl -sS -X POST https://api.acedata.cloud/suno/audios \
   -H "Authorization: Bearer $ACEDATACLOUD_API_TOKEN" -H "Content-Type: application/json" \
   -d '{"action":"generate","prompt":"uplifting minimal electronic, premium tech","instrumental":true,"model":"chirp-v5-5"}'
+
+# Voiceover — OpenAI-compatible TTS, SYNCHRONOUS (returns audio bytes, no polling).
+# Generate ONE file per scene so the audio aligns to scene boundaries.
+curl -sS -X POST https://api.acedata.cloud/v1/audio/speech \
+  -H "Authorization: Bearer $ACEDATACLOUD_API_TOKEN" -H "Content-Type: application/json" \
+  -o scene1.mp3 \
+  -d '{"model":"tts-1-hd","input":"<scene narration>","voice":"nova"}'
 ```
 
 All of the above return a `task_id` — **poll the matching `/<service>/tasks`** until
 `state`/`status` is terminal, then read the media URL (see _shared/async-tasks.md).
 The media is served from `*.cdn.acedata.cloud`. Per-model details: `flux-image`,
 `seedream-image`, `nano-banana-image`, `seedance-video`, `veo-video`, `suno-music`,
-`fish-audio` skills.
-
+`fish-audio` skills.
+
+**Voiceover (TTS):** `POST /v1/audio/speech` is the OpenAI-compatible route — it is
+**synchronous** (returns the audio bytes directly, no `task_id`/polling), models
+`tts-1-hd` (default) / `tts-1`, voices `alloy|echo|fable|onyx|nova|shimmer`, and it
+speaks both English and Chinese. (`/fish/tts` is an alternate voice-cloning route.)
+The endpoint does **not** return word timings — run WhisperX on the returned audio
+for karaoke, or distribute words across the scene duration proportionally.
 
 ## Recipe — capture product UI (Playwright)
 
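Reviewer sketch (not part of the package): the added voiceover paragraph in the hunk above suggests distributing words across the scene duration proportionally when no word timings are available. One possible reading is sketched below in Python. Weighting each word by its character count is an assumption, since the diff does not say what the proportion is based on, and `distribute_word_timings` is a hypothetical name.

```python
from typing import List, Tuple


def distribute_word_timings(
    text: str, scene_duration: float
) -> List[Tuple[str, float, float]]:
    """Return (word, start_seconds, end_seconds) tuples spanning the scene.

    Each word receives a share of the duration proportional to its length
    in characters (an assumed weighting, not one specified by the package).
    """
    words = text.split()
    if not words or scene_duration <= 0:
        return []
    total_chars = sum(len(w) for w in words)
    timings = []
    elapsed_chars = 0
    for word in words:
        start = scene_duration * elapsed_chars / total_chars
        elapsed_chars += len(word)
        end = scene_duration * elapsed_chars / total_chars
        timings.append((word, start, end))
    return timings
```

Because `str.split()` only breaks on whitespace, Chinese narration would need a separate segmentation step before this helper is useful. Timings produced this way are estimates, so the WhisperX route mentioned in the hunk remains the option to use when captions must track the actual audio.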
@@ -126,8 +139,9 @@ real bold sans (`C:/Windows/Fonts/arialbd.ttf`, `DejaVuSans-Bold.ttf`, etc.),
 so the product stays visible.
 
 > Reference implementation (Scene-JSON contract, caption-burn, render driver,
-> material-library convention): **AceDataCloud/PlatformStudio**
-> `scripts/
+> material-library convention): **AceDataCloud/PlatformStudio** — `app/contract.py`,
+> `app/pipeline/`, `scripts/render_veo_rough_cut.py`, `scripts/build_material_catalog.py`,
+> and the curated material index `materials/catalog.json` + `materials/curated.json`.
 
 ## Recipe — upload to CDN + distribute
 