npm - @acedatacloud/skills - Versions diffs - 2026.620.0 → 2026.620.1 - Mend

@acedatacloud/skills 2026.620.0 → 2026.620.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/package.json +1 -1
package/skills/cos-upload/SKILL.md +41 -0
package/skills/fish-audio/SKILL.md +35 -51
package/skills/gpt-image-2/SKILL.md +64 -0

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@acedatacloud/skills",
-  "version": "2026.620.0",
+  "version": "2026.620.1",
   "description": "Agent Skills for AceDataCloud AI services — music, image, video generation, LLM chat, web search. Compatible with Claude Code, GitHub Copilot, Gemini CLI, OpenAI Codex, and 30+ AI coding agents.",
   "keywords": [
     "agent-skills",

package/skills/cos-upload/SKILL.md ADDED

@@ -0,0 +1,41 @@
+---
+name: cos-upload
+description: Upload a local file to AceData Cloud CDN and get back a public URL. Use whenever you produce a local artifact (image, audio, video, doc) that another API needs as a URL, or that you need to return/persist (e.g. feed a generated image into an image-to-video API, or publish a finished video).
+license: Apache-2.0
+metadata:
+  author: acedatacloud
+  version: "1.0"
+compatibility: Requires ACEDATACLOUD_PLATFORM_TOKEN (see _shared/authentication.md). Some runtimes instead provide a bundled uploader — prefer that when present.
+---
+# Upload a file → AceData CDN URL
+Turn a local file into a public `https://cdn.acedata.cloud/...` URL.
+## Upload (multipart, synchronous)
+```bash
+curl -s -X POST https://platform.acedata.cloud/api/v1/files/ \
+  -H "Authorization: Bearer $ACEDATACLOUD_PLATFORM_TOKEN" \
+  -F "file=@/path/to/video.mp4"
+```
+Response:
+```json
+{"file_url": "https://cdn.acedata.cloud/7f849b80b9.mp4"}
+```
+→ use `file_url`. One file per request (loop for several).
+## When to use
+- **Feed a generated asset into another API by URL** — e.g. upload a gpt-image-2 still, then pass its URL to `seedance` / `kling` image-to-video.
+- **Publish/return a finished artifact** (final video, cover image).
+- **Persist intermediate artifacts** so a later run can re-download and continue.
+## Notes
+- Auth uses the **platform token** (`ACEDATACLOUD_PLATFORM_TOKEN`), not the per-service API token — the files endpoint is on `platform.acedata.cloud`, not `api.acedata.cloud`.
+- If your runtime ships a bundled uploader (e.g. a worker that owns the storage creds), prefer it — it avoids handling the platform token directly.
+- The returned URL is CDN-served and stable; safe to store and re-download later.
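
Note on the cos-upload skill added above: the Python sketch below shows one way a caller might wrap the documented flow. It builds the curl invocation for each file and reads `file_url` out of the response body. The helper names `build_upload_command` and `extract_file_url` are hypothetical and not part of the package. The sketch assumes the response has exactly the shape shown in the SKILL.md, and it does not contact the endpoint itself.

```python
import json

# Endpoint as documented in the SKILL.md.
UPLOAD_ENDPOINT = "https://platform.acedata.cloud/api/v1/files/"


def build_upload_command(path: str, token: str) -> list[str]:
    """Return the curl argv for uploading one local file (hypothetical helper)."""
    return [
        "curl", "-s", "-X", "POST", UPLOAD_ENDPOINT,
        "-H", f"Authorization: Bearer {token}",
        "-F", f"file=@{path}",
    ]


def extract_file_url(response_body: str) -> str:
    """Parse the JSON response body and return its `file_url` field."""
    payload = json.loads(response_body)
    url = payload.get("file_url") if isinstance(payload, dict) else None
    if not isinstance(url, str) or not url:
        raise ValueError("response has no file_url")
    return url


# One file per request, so several files means several commands.
commands = [build_upload_command(p, "<platform-token>") for p in ["a.png", "b.mp4"]]
```

Each command could then be run with `subprocess.run(cmd, capture_output=True, text=True)` and its `stdout` passed to `extract_file_url`.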

package/skills/fish-audio/SKILL.md CHANGED

@@ -1,99 +1,83 @@
 ---
 name: fish-audio
-description: Generate AI audio and synthesize voices with Fish Audio via AceDataCloud API. Use when creating text-to-speech audio, synthesizing voices, or generating audio content. Supports multiple voice models and TTS capabilities.
+description: Generate AI text-to-speech audio and clone voices with Fish Audio via AceDataCloud API. Use when creating voiceover/narration audio (TTS), synthesizing speech, or cloning a reference voice. Chinese + multilingual.
 license: Apache-2.0
 metadata:
   author: acedatacloud
-  version: "1.0"
+  version: "1.1"
 compatibility: Requires ACEDATACLOUD_API_TOKEN in .env file (see _shared/authentication.md).
 ---
-# Fish Audio — Voice & Audio Synthesis
+# Fish Audio — Text-to-Speech & Voice Cloning
-Generate AI audio and synthesize voices through AceDataCloud's Fish Audio API.
+Generate narration / voiceover and clone voices through AceDataCloud's Fish Audio API.
 > **Setup:** See [authentication](../_shared/authentication.md) for token setup.
-## Quick Start
+## Quick Start (TTS — synchronous, ~3s)
 ```bash
 curl -X POST https://api.acedata.cloud/fish/audios \
   -H "Authorization: Bearer $ACEDATACLOUD_API_TOKEN" \
   -H "Content-Type: application/json" \
-  -d '{"prompt": "Hello, this is a demonstration of AI voice synthesis."}'
+  -d '{"action":"speech","model":"fish-tts","voice_id":"543e4181d81b4ef6874b0e8fbdf27c78","prompt":"你好,欢迎使用 AceData Cloud。"}'
 ```
-> **Async:** See [async task polling](../_shared/async-tasks.md). Poll via `POST /fish/tasks` with `{"id": "..."}`.
+Response (synchronous — no polling needed for `speech`):
+```json
+{"success": true, "data": [{"audio_url": "https://platform.r2.fish.audio/task/....mp3"}]}
+```
+→ download `data[0].audio_url`. `voice_id` is **required**. A good default Mandarin
+news-anchor voice is **`543e4181d81b4ef6874b0e8fbdf27c78`**.
 ## Endpoints
 | Endpoint | Purpose |
 |----------|---------|
-| `POST /fish/audios` | Generate audio from text or parameters |
-| `POST /fish/voices` | Voice synthesis and cloning |
-| `POST /fish/tasks` | Poll task status |
+| `POST /fish/audios` | TTS (`action: "speech"`) — synchronous |
+| `POST /fish/voices` | List / register (clone) voices |
 ## Workflows
-### 1. Text-to-Speech
+### 1. Text-to-Speech (the common case)
 ```json
 POST /fish/audios
 {
-  "prompt": "The quick brown fox jumps over the lazy dog.",
-  "voice_id": "default"
+  "action": "speech",
+  "model": "fish-tts",
+  "voice_id": "543e4181d81b4ef6874b0e8fbdf27c78",
+  "prompt": "你的旁白文本。"
 }
 ```
-### 2. Voice Cloning — Register a Voice
-Upload a reference audio to create a cloneable voice.
+### 2. Clone a voice from a reference sample
 ```json
 POST /fish/voices
 {
   "voice_url": "https://example.com/reference-voice.mp3",
   "title": "My Custom Voice",
-  "description": "Clear, neutral-toned speaker for TTS",
-  "image_url": "https://example.com/avatar.jpg"
-}
-```
-### 3. Text-to-Speech with Cloned Voice
-```json
-POST /fish/audios
-{
-  "prompt": "Welcome to our platform.",
-  "voice_id": "<voice_id from POST /fish/voices>"
+  "description": "Clear, neutral-toned speaker"
 }
 ```
-## Parameters
-### `/fish/audios`
-| Parameter | Type | Description |
-|-----------|------|-------------|
-| `prompt` | string | Text to synthesize into speech |
-| `voice_id` | string | Voice model or cloned voice ID to use |
-| `model` | string | TTS model (e.g., `"speech-1.5"`, `"speech-1.5-hd"`) |
-| `action` | string | Operation type (e.g., `"generate"`) |
-| `callback_url` | string | Webhook URL for async delivery |
+Then pass the returned id as `voice_id` in workflow 1.
-### `/fish/voices`
+## Parameters — `/fish/audios`
-| Parameter | Type | Description |
-|-----------|------|-------------|
-| `voice_url` | string | Reference audio URL for voice cloning |
-| `title` | string | Display title for the cloned voice |
-| `description` | string | Description of the voice |
-| `image_url` | string | Cover image URL for the voice |
-| `callback_url` | string | Webhook URL for async delivery |
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `action` | string | yes | Use `"speech"` for TTS |
+| `model` | string | yes | `"fish-tts"` |
+| `voice_id` | string | yes | A Fish reference/cloned voice id (default Mandarin: `543e4181d81b4ef6874b0e8fbdf27c78`) |
+| `prompt` | string | yes | Text to synthesize |
 ## Gotchas
-- Pricing is based on **byte count** of the generated audio
-- Voice cloning requires a clear reference audio sample
-- Text-to-speech supports multiple languages automatically
-- Use the `/fish/voices` endpoint to register a reference audio and receive a `voice_id` for TTS
+- **TTS (`action:"speech"`) is synchronous** — the response carries `data[0].audio_url`; do NOT poll `/fish/tasks` for it.
+- `voice_id` is **required** — a bare `{"prompt": "..."}` returns `400 voice_id is required when action is speech`.
+- `model` must be `"fish-tts"` for speech (NOT `speech-1.5`); sending a different model returns `400 model is invalid if action is speech`.
+- Pricing is based on the **byte count** of the generated audio. Multilingual is automatic.
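
Note on the fish-audio changes above: the new version makes `action`, `model`, `voice_id` and `prompt` all required and returns the audio URL synchronously. The Python sketch below checks the required fields locally before a request is sent and reads `data[0].audio_url` from the response. `build_speech_payload` and `extract_audio_url` are hypothetical helper names, and the local checks are an assumption about what is convenient for a caller, not a description of the server's own validation.

```python
import json

# Default Mandarin voice id named in the updated SKILL.md.
DEFAULT_VOICE_ID = "543e4181d81b4ef6874b0e8fbdf27c78"


def build_speech_payload(prompt: str, voice_id: str = DEFAULT_VOICE_ID) -> dict:
    """Build a POST /fish/audios body for TTS, checking required fields locally."""
    if not prompt:
        raise ValueError("prompt is required")
    if not voice_id:
        raise ValueError("voice_id is required when action is speech")
    return {
        "action": "speech",
        "model": "fish-tts",  # the only model the SKILL.md lists for speech
        "voice_id": voice_id,
        "prompt": prompt,
    }


def extract_audio_url(response_body: str) -> str:
    """Return data[0].audio_url from a synchronous speech response."""
    payload = json.loads(response_body)
    if not payload.get("success") or not payload.get("data"):
        raise ValueError("speech request did not succeed")
    return payload["data"][0]["audio_url"]
```

Because the speech action is synchronous, the caller can download the returned URL immediately; no task polling step is needed.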

package/skills/gpt-image-2/SKILL.md ADDED

@@ -0,0 +1,64 @@
+---
+name: gpt-image-2
+description: Generate and EDIT images with OpenAI gpt-image-2 via AceDataCloud API. Use when you need high-fidelity images from a prompt, or to edit/composite existing images (e.g. fuse a real logo/QR/screenshot into a scene, keep characters consistent, restyle). Strong at legible text and faithful editing.
+license: Apache-2.0
+metadata:
+  author: acedatacloud
+  version: "1.0"
+compatibility: Requires ACEDATACLOUD_API_TOKEN in .env file (see _shared/authentication.md).
+---
+# gpt-image-2 — Image Generation & Editing
+OpenAI `gpt-image-2` through AceDataCloud. Two endpoints, both **synchronous** (return image url(s) directly). Its standout is **editing**: feed real images (logos, QR codes, product shots, screenshots) and it composites/restyles them faithfully — great for on-brand video assets and character consistency.
+> **Setup:** See [authentication](../_shared/authentication.md) for token setup.
+## 1. Generate (text → image)
+```bash
+curl -X POST https://api.acedata.cloud/openai/images/generations \
+  -H "Authorization: Bearer $ACEDATACLOUD_API_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"model":"gpt-image-2","prompt":"a clean dark tech hero background with a glowing API hub, lots of negative space","size":"1792x1024","n":1}'
+```
+## 2. Edit / composite (images + prompt → image)  ← the powerful one
+Multipart. Pass one or more source images via repeated `image[]` (local files with
+`@`, or URLs). Use it to **fuse a real logo/QR into a generated scene**, keep a subject
+consistent across scenes, or restyle a screenshot.
+```bash
+curl -X POST https://api.acedata.cloud/openai/images/edits \
+  -H "Authorization: Bearer $ACEDATACLOUD_API_TOKEN" \
+  -F "model=gpt-image-2" \
+  -F "prompt=Place this logo crisply in the top-left on the tech background; keep the logo's exact colors and shape." \
+  -F "image[]=@background.png" \
+  -F "image[]=@logo.png" \
+  -F "size=1792x1024" \
+  -F "n=1"
+```
+Response (both endpoints): `{"data":[{"url":"https://...png"}]}` → download `data[0].url`.
+## Sizes
+`size` is `WxH` (a preset) or `"auto"`. Common presets:
+| Aspect | Sizes |
+|---|---|
+| 16:9 | `1792x1024` (HD), `2048x1152`, `3840x2160` (4K) |
+| 9:16 | `1024x1792`, `1152x2048`, `2160x3840` |
+| 1:1 | `1024x1024`, `2048x2048`, `4096x4096` |
+(Omit `size` or use `"auto"` to let the model pick. Invalid sizes 400.)
+## Tips
+- **Editing keeps things faithful** — to place a logo/QR exactly, pass it as one of the
+  `image[]` and say "keep its exact colors/shape, do not redraw it".
+- For **character/scene consistency** across video beats, generate one hero image, then
+  `edits` it per beat instead of regenerating from scratch.
+- Text in images renders legibly — good for titles/labels you don't want to overlay in HTML.
+- Both endpoints are synchronous; no `/tasks` polling.
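
Note on the gpt-image-2 skill added above: since invalid sizes are rejected with a 400, a caller may want to check `size` before sending. The Python sketch below validates against the presets listed in the Sizes table, builds a generation body, and reads `data[0].url` from a response. All helper names are hypothetical. The table describes its entries as common presets, so the API may accept sizes this check refuses; the check is deliberately conservative.

```python
# Presets copied from the Sizes table in the SKILL.md.
LISTED_SIZES = {
    "16:9": ["1792x1024", "2048x1152", "3840x2160"],
    "9:16": ["1024x1792", "1152x2048", "2160x3840"],
    "1:1": ["1024x1024", "2048x2048", "4096x4096"],
}


def is_listed_size(size: str) -> bool:
    """True for "auto" or any preset that appears in the Sizes table."""
    return size == "auto" or any(size in sizes for sizes in LISTED_SIZES.values())


def build_generation_payload(prompt: str, size: str = "auto", n: int = 1) -> dict:
    """Build a body for POST /openai/images/generations (hypothetical helper)."""
    if not is_listed_size(size):
        raise ValueError(f"size {size!r} is not one of the listed presets")
    return {"model": "gpt-image-2", "prompt": prompt, "size": size, "n": n}


def extract_image_url(response: dict) -> str:
    """Return data[0].url, the shape documented for both endpoints."""
    return response["data"][0]["url"]
```

The same `extract_image_url` helper applies to the edits endpoint, since the SKILL.md documents one response shape for both.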