getaiapi 1.3.1 → 2.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,12 +1,12 @@
1
1
  # getaiapi
2
2
 
3
- **One function to call any AI model.**
3
+ **Typed AI provider SDKs. One import per provider.**
4
4
 
5
5
  [![npm version](https://img.shields.io/npm/v/getaiapi)](https://www.npmjs.com/package/getaiapi)
6
6
  [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
7
7
  [![TypeScript](https://img.shields.io/badge/TypeScript-strict-blue.svg)](https://www.typescriptlang.org/)
8
8
 
9
- A unified TypeScript library that wraps 1,890+ AI models across 5 providers into a single `generate()` function. One input shape. One output shape. Any model.
9
+ Each AI provider gets a typed namespace with one function per model. No generic `generate()`, no model strings, no mapping layers. What you type is what gets sent.
10
10
 
11
11
  ## Install
12
12
 
@@ -14,639 +14,695 @@ A unified TypeScript library that wraps 1,890+ AI models across 5 providers into
14
14
  npm install getaiapi
15
15
  ```
16
16
 
17
- ## Quick Start
17
+ ## Kling AI
18
18
 
19
- ```typescript
20
- import { generate } from 'getaiapi'
19
+ 69 models across 20 endpoints. Each model is a typed function with Kling-native field names.
21
20
 
22
- const result = await generate({
23
- model: 'flux-schnell',
24
- prompt: 'a cat wearing sunglasses'
25
- })
21
+ ### Setup
26
22
 
27
- console.log(result.outputs[0].url)
23
+ ```bash
24
+ export KLING_ACCESS_KEY="your-access-key"
25
+ export KLING_SECRET_KEY="your-secret-key"
28
26
  ```
29
27
 
30
- ## More Examples
31
-
32
- **Text generation (LLMs)**
28
+ Or configure programmatically:
33
29
 
34
30
  ```typescript
35
- const answer = await generate({
36
- model: 'claude-sonnet-4-6',
37
- prompt: 'Explain quantum computing in one paragraph'
38
- })
31
+ import { kling } from 'getaiapi'
39
32
 
40
- console.log(answer.outputs[0].content)
33
+ kling.configure({ accessKey: '...', secretKey: '...' })
41
34
  ```
42
35
 
43
- With system prompt and parameters:
36
+ ### Text to Video
37
+
38
+ 9 models: V1 Standard, V1.6 Pro/Standard, V2 Master, V2.1 Master, V2.5 Turbo Pro, V2.6 Pro, V3 Pro/Standard.
44
39
 
45
40
  ```typescript
46
- const reply = await generate({
47
- model: 'gpt-4o',
48
- prompt: 'Write a haiku about TypeScript',
49
- options: {
50
- system: 'You are a creative poet.',
51
- temperature: 0.9,
52
- max_tokens: 100,
53
- }
41
+ import { kling } from 'getaiapi'
42
+
43
+ const result = await kling.textToVideoV3Pro({
44
+ prompt: 'a golden retriever running on a beach at sunset',
45
+ duration: '5',
46
+ aspect_ratio: '16:9',
47
+ sound: 'on',
54
48
  })
49
+
50
+ console.log(result.videos[0].url)
55
51
  ```
56
52
 
57
- **Text-to-video**
53
+ | Function | Model | Mode |
54
+ |----------|-------|------|
55
+ | `textToVideoV1Standard` | kling-v1 | std |
56
+ | `textToVideoV1_6Pro` | kling-v1-6 | pro |
57
+ | `textToVideoV1_6Standard` | kling-v1-6 | std |
58
+ | `textToVideoV2Master` | kling-v2-master | — |
59
+ | `textToVideoV2_1Master` | kling-v2-1-master | — |
60
+ | `textToVideoV2_5TurboPro` | kling-v2-5-turbo | pro |
61
+ | `textToVideoV2_6Pro` | kling-v2-6 | pro |
62
+ | `textToVideoV3Pro` | kling-v3 | pro |
63
+ | `textToVideoV3Standard` | kling-v3 | std |
64
+
65
+ **Input: `TextToVideoInput`**
58
66
 
59
67
  ```typescript
60
- const video = await generate({
61
- model: 'veo3.1',
62
- prompt: 'a timelapse of a flower blooming in a garden'
63
- })
68
+ {
69
+ prompt: string // required
70
+ negative_prompt?: string
71
+ duration?: string // '5' or '10'
72
+ aspect_ratio?: string // '16:9', '9:16', '1:1'
73
+ cfg_scale?: number
74
+ sound?: 'on' | 'off' // generate audio
75
+ }
64
76
  ```
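The shape above can be mirrored as a local type so invalid values fail at compile time. This is a sketch, not part of the SDK: the field names come from `TextToVideoInput` above, while the literal unions for `duration` and `aspect_ratio` are an assumption based on the commented values (the published types may be wider).

```typescript
// Local mirror of the documented TextToVideoInput shape (assumption:
// literal unions inferred from the comments above).
type TextToVideoInput = {
  prompt: string
  negative_prompt?: string
  duration?: '5' | '10'
  aspect_ratio?: '16:9' | '9:16' | '1:1'
  cfg_scale?: number
  sound?: 'on' | 'off'
}

// Typos in field names become compile errors rather than silent API failures.
const request: TextToVideoInput = {
  prompt: 'a golden retriever running on a beach at sunset',
  duration: '5',
  aspect_ratio: '16:9',
  sound: 'on',
}
console.log(Object.keys(request).length) // 4
```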
65
77
 
66
- **Image editing**
78
+ ### Image to Video
79
+
80
+ 13 models: V1 Standard, V1.5 Pro, V1.6 Pro/Standard, V2 Master, V2.1 Master/Pro/Standard, V2.5 Turbo Pro/Standard, V2.6 Pro, V3 Pro/Standard.
67
81
 
68
82
  ```typescript
69
- const edited = await generate({
70
- model: 'gpt-image-1.5-edit',
83
+ const result = await kling.imageToVideoV3Pro({
71
84
  image: 'https://example.com/photo.jpg',
72
- prompt: 'add a rainbow in the sky'
85
+ prompt: 'animate this photo with gentle wind',
86
+ duration: '5',
73
87
  })
74
88
  ```
75
89
 
76
- **Multi-image references** (e.g., character + location consistency)
90
+ | Function | Model | Mode |
91
+ |----------|-------|------|
92
+ | `imageToVideoV1Standard` | kling-v1 | std |
93
+ | `imageToVideoV1_5Pro` | kling-v1-5 | pro |
94
+ | `imageToVideoV1_6Pro` | kling-v1-6 | pro |
95
+ | `imageToVideoV1_6Standard` | kling-v1-6 | std |
96
+ | `imageToVideoV2Master` | kling-v2-master | — |
97
+ | `imageToVideoV2_1Master` | kling-v2-1-master | — |
98
+ | `imageToVideoV2_1Pro` | kling-v2-1 | pro |
99
+ | `imageToVideoV2_1Standard` | kling-v2-1 | std |
100
+ | `imageToVideoV2_5TurboPro` | kling-v2-5-turbo | pro |
101
+ | `imageToVideoV2_5TurboStandard` | kling-v2-5-turbo | std |
102
+ | `imageToVideoV2_6Pro` | kling-v2-6 | pro |
103
+ | `imageToVideoV3Pro` | kling-v3 | pro |
104
+ | `imageToVideoV3Standard` | kling-v3 | std |
105
+
106
+ **Input: `ImageToVideoInput`**
77
107
 
78
108
  ```typescript
79
- const scene = await generate({
80
- model: 'google-nano-banana-pro-edit',
81
- prompt: 'cinematic shot of the character in the location',
82
- image: 'https://example.com/character.jpg',
83
- images: [
84
- 'https://example.com/character.jpg',
85
- 'https://example.com/location.jpg',
86
- ],
87
- })
109
+ {
110
+ image: string // required — URL or base64
111
+ prompt?: string
112
+ negative_prompt?: string
113
+ duration?: string
114
+ aspect_ratio?: string
115
+ cfg_scale?: number
116
+ sound?: 'on' | 'off'
117
+ image_tail?: string // end frame image URL
118
+ voice_list?: Array<{ voice_id: string }>
119
+ element_list?: Array<{ id: string; image: string }>
120
+ }
88
121
  ```
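The `image` and `image_tail` fields above suggest a first-frame/last-frame request. A minimal sketch of a builder for that pattern — the helper and its URLs are illustrative, only the field names are taken from `ImageToVideoInput`:

```typescript
// Subset of the documented ImageToVideoInput fields used here.
type ImageToVideoInput = {
  image: string
  prompt?: string
  image_tail?: string
  duration?: string
}

// Hypothetical helper: interpolate between a start frame and an end frame.
function betweenFrames(first: string, last: string, prompt?: string): ImageToVideoInput {
  return { image: first, image_tail: last, prompt, duration: '5' }
}

const req = betweenFrames(
  'https://example.com/start.jpg',
  'https://example.com/end.jpg',
  'smooth transition between the two frames',
)
console.log(req.image_tail) // https://example.com/end.jpg
```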
89
122
 
90
- **Text-to-speech**
123
+ ### Omni Video
124
+
125
+ 17 models across O1 and O3 variants. Supports text-to-video, image-to-video, reference-to-video, video editing, and video reference — all through one endpoint.
91
126
 
92
127
  ```typescript
93
- const speech = await generate({
94
- model: 'elevenlabs-v3',
95
- prompt: 'Hello, welcome to getaiapi.',
96
- options: { voice_id: 'rachel' }
128
+ const result = await kling.omniVideoO3ProTextToVideo({
129
+ prompt: 'a cyberpunk city at night',
130
+ duration: '5',
131
+ aspect_ratio: '16:9',
97
132
  })
98
133
  ```
99
134
 
100
- **Upscale an image**
135
+ | Function | Model | Mode |
136
+ |----------|-------|------|
137
+ | `omniVideoO1ImageToVideo` | kling-video-o1 | — |
138
+ | `omniVideoO1ReferenceToVideo` | kling-video-o1 | — |
139
+ | `omniVideoO1StandardImageToVideo` | kling-video-o1 | std |
140
+ | `omniVideoO1StandardReferenceToVideo` | kling-video-o1 | std |
141
+ | `omniVideoO1StandardVideoEdit` | kling-video-o1 | std |
142
+ | `omniVideoO1StandardVideoReference` | kling-video-o1 | std |
143
+ | `omniVideoO1VideoEdit` | kling-video-o1 | — |
144
+ | `omniVideoO1VideoReference` | kling-video-o1 | — |
145
+ | `omniVideoO3ProImageToVideo` | kling-v3-omni | pro |
146
+ | `omniVideoO3ProReferenceToVideo` | kling-v3-omni | pro |
147
+ | `omniVideoO3ProTextToVideo` | kling-v3-omni | pro |
148
+ | `omniVideoO3ProVideoEdit` | kling-v3-omni | pro |
149
+ | `omniVideoO3ProVideoReference` | kling-v3-omni | pro |
150
+ | `omniVideoO3StandardReferenceToVideo` | kling-v3-omni | std |
151
+ | `omniVideoO3StandardTextToVideo` | kling-v3-omni | std |
152
+ | `omniVideoO3StandardVideoEdit` | kling-v3-omni | std |
153
+ | `omniVideoO3StandardVideoReference` | kling-v3-omni | std |
154
+
155
+ **Input: `OmniVideoInput`**
101
156
 
102
157
  ```typescript
103
- const upscaled = await generate({
104
- model: 'topaz-upscale-image',
105
- image: 'https://example.com/low-res.jpg'
106
- })
158
+ {
159
+ prompt: string // required
160
+ image?: string
161
+ negative_prompt?: string
162
+ duration?: string
163
+ aspect_ratio?: string
164
+ cfg_scale?: number
165
+ sound?: 'on' | 'off'
166
+ element_list?: Array<{ id: string; image: string }>
167
+ }
107
168
  ```
108
169
 
109
- **Kling native provider** (bypass fal-ai, call Kling API directly)
170
+ ### Image Generation
171
+
172
+ 2 models on `v1/images/generations` and 3 models on `v1/images/omni-image`.
110
173
 
111
174
  ```typescript
112
- const video = await generate({
113
- model: 'kling-video-v3-pro-text-to-video',
114
- provider: 'kling', // uses KLING_ACCESS_KEY directly
115
- prompt: 'a golden retriever running on a beach at sunset',
116
- duration: '5',
117
- options: { aspect_ratio: '16:9', sound: 'on' },
175
+ const result = await kling.imageO1({
176
+ prompt: 'a watercolor painting of a mountain lake',
177
+ n: 2,
178
+ aspect_ratio: '16:9',
118
179
  })
180
+
181
+ console.log(result.images[0].url)
119
182
  ```
120
183
 
121
- **Remove background**
184
+ | Function | Endpoint | Model |
185
+ |----------|----------|-------|
186
+ | `imageV3TextToImage` | generations | kling-v3 |
187
+ | `imageV3ImageToImage` | generations | kling-v3 |
188
+ | `imageO1` | omni-image | kling-image-o1 |
189
+ | `imageO3TextToImage` | omni-image | kling-v3-omni |
190
+ | `imageO3ImageToImage` | omni-image | kling-v3-omni |
191
+
192
+ **Input: `ImageGenerationInput` / `OmniImageInput`**
122
193
 
123
194
  ```typescript
124
- const cutout = await generate({
125
- model: 'birefnet-v2',
126
- image: 'https://example.com/portrait.jpg'
127
- })
195
+ {
196
+ prompt: string // required
197
+ image?: string // for image-to-image
198
+ n?: number // number of outputs
199
+ aspect_ratio?: string
200
+ }
128
201
  ```
129
202
 
130
- ## Async Job Control
131
-
132
- For long-running jobs (video generation, training), you can submit a job and poll for status separately instead of blocking until completion.
203
+ ### Virtual Try-On
133
204
 
134
205
  ```typescript
135
- import { submit, poll } from 'getaiapi'
136
-
137
- // Submit — returns immediately with the provider's task ID
138
- const job = await submit({
139
- model: 'veo3.1',
140
- prompt: 'a timelapse of a flower blooming',
206
+ const result = await kling.virtualTryOn({
207
+ human_image: 'https://example.com/person.jpg',
208
+ cloth_image: 'https://example.com/shirt.jpg',
141
209
  })
210
+ ```
142
211
 
143
- console.log(job.id) // provider task ID
144
- console.log(job.status) // 'pending' | 'processing' | 'completed'
145
-
146
- // Poll — check status manually (call in a loop, on a timer, etc.)
147
- let result = await poll(job)
148
-
149
- while (result.status === 'pending' || result.status === 'processing') {
150
- await new Promise(r => setTimeout(r, 2000))
151
- result = await poll(job)
152
- }
212
+ **Input: `VirtualTryOnInput`**
153
213
 
154
- if (result.status === 'completed') {
155
- console.log(result.outputs[0].url)
214
+ ```typescript
215
+ {
216
+ human_image: string // required
217
+ cloth_image: string // required
156
218
  }
157
219
  ```
158
220
 
159
- Synchronous providers (like OpenRouter) return `status: 'completed'` from `submit()` immediately -- check status before polling.
221
+ ### AI Avatar
160
222
 
161
- `submitAndPoll()` is an alias for `generate()` that makes the blocking behavior explicit:
223
+ 4 models: V1 Pro/Standard, V2 Pro/Standard.
162
224
 
163
225
  ```typescript
164
- import { submitAndPoll } from 'getaiapi'
165
-
166
- const result = await submitAndPoll({
167
- model: 'flux-schnell',
168
- prompt: 'a cat in space',
226
+ const result = await kling.avatarV2Pro({
227
+ image: 'https://example.com/portrait.jpg',
228
+ sound_file: 'https://example.com/speech.mp3',
229
+ prompt: 'talking head presentation',
169
230
  })
170
231
  ```
171
232
 
172
- ## Configuration
233
+ | Function | Mode |
234
+ |----------|------|
235
+ | `avatarV1Pro` | pro |
236
+ | `avatarV1Standard` | std |
237
+ | `avatarV2Pro` | pro |
238
+ | `avatarV2Standard` | std |
173
239
 
174
- ### Option 1: Environment Variables
240
+ **Input: `AvatarInput`**
175
241
 
176
- Set API keys as environment variables. You only need keys for the providers you plan to call.
242
+ ```typescript
243
+ {
244
+ image: string // required — portrait image
245
+ sound_file?: string // audio for lip sync
246
+ prompt?: string
247
+ }
248
+ ```
177
249
 
178
- ```bash
179
- # fal-ai (1,201 models)
180
- export FAL_KEY="your-fal-key"
250
+ ### Lip Sync
181
251
 
182
- # Replicate (687 models)
183
- export REPLICATE_API_TOKEN="your-replicate-token"
252
+ ```typescript
253
+ const result = await kling.lipSyncAudioToVideo({
254
+ sound_file: 'https://example.com/speech.mp3',
255
+ })
256
+ ```
184
257
 
185
- # WaveSpeed (66 models)
186
- export WAVESPEED_API_KEY="your-wavespeed-key"
258
+ | Function | Description |
259
+ |----------|-------------|
260
+ | `lipSyncAudioToVideo` | Audio-driven lip sync |
261
+ | `lipSyncTextToVideo` | Text-driven lip sync |
187
262
 
188
- # OpenRouter (24 LLM models — Claude, GPT, Gemini, Llama, etc.)
189
- export OPENROUTER_API_KEY="your-openrouter-key"
263
+ **Input: `LipSyncInput`**
190
264
 
191
- # Kling AI (69 models — native API, bypasses fal-ai middleman)
192
- export KLING_ACCESS_KEY="your-access-key"
193
- export KLING_SECRET_KEY="your-secret-key"
265
+ ```typescript
266
+ {
267
+ sound_file?: string // audio URL
268
+ }
194
269
  ```
195
270
 
196
- ### Option 2: Programmatic Configuration
271
+ ### Video Effects
197
272
 
198
- Use `configure()` to set keys in code -- useful when your env vars have different names or keys come from a secrets manager.
273
+ 4 models: V1 Standard, V1.5 Pro, V1.6 Pro/Standard.
199
274
 
200
275
  ```typescript
201
- import { configure } from 'getaiapi'
202
-
203
- configure({
204
- keys: {
205
- 'fal-ai': process.env.MY_FAL_TOKEN,
206
- 'replicate': process.env.MY_REPLICATE_TOKEN,
207
- 'wavespeed': process.env.MY_WAVESPEED_TOKEN,
208
- 'openrouter': process.env.MY_OPENROUTER_TOKEN,
209
- 'kling': `${process.env.MY_KLING_AK}:${process.env.MY_KLING_SK}`,
210
- },
276
+ const result = await kling.effectsV1_6Pro({
277
+ image: 'https://example.com/photo.jpg',
211
278
  })
212
279
  ```
213
280
 
214
- You can also set keys and storage together:
281
+ | Function |
282
+ |----------|
283
+ | `effectsV1Standard` |
284
+ | `effectsV1_5Pro` |
285
+ | `effectsV1_6Pro` |
286
+ | `effectsV1_6Standard` |
287
+
288
+ **Input: `EffectsInput`**
215
289
 
216
290
  ```typescript
217
- configure({
218
- keys: {
219
- 'fal-ai': 'your-fal-key',
220
- },
221
- storage: {
222
- accountId: 'your-r2-account',
223
- bucketName: 'your-bucket',
224
- accessKeyId: 'your-r2-key',
225
- secretAccessKey: 'your-r2-secret',
226
- publicUrlBase: 'https://cdn.example.com',
227
- },
228
- })
291
+ {
292
+ image: string // required
293
+ }
229
294
  ```
230
295
 
231
- Or set just provider keys with `configureAuth()`:
296
+ ### Motion Control
232
297
 
233
- ```typescript
234
- import { configureAuth } from 'getaiapi'
298
+ 4 models: V2.6 Pro/Standard, V3 Pro/Standard.
235
299
 
236
- configureAuth({
237
- 'fal-ai': myKeyVault.get('fal'),
238
- 'replicate': myKeyVault.get('replicate'),
300
+ ```typescript
301
+ const result = await kling.motionControlV3Pro({
302
+ image_url: 'https://example.com/scene.jpg',
303
+ prompt: 'camera pan left',
239
304
  })
240
305
  ```
241
306
 
242
- Programmatic keys take priority over environment variables. Any provider not set programmatically falls back to its default env var.
307
+ | Function | Model | Mode |
308
+ |----------|-------|------|
309
+ | `motionControlV2_6Pro` | kling-v2-6 | pro |
310
+ | `motionControlV2_6Standard` | kling-v2-6 | std |
311
+ | `motionControlV3Pro` | kling-v3 | pro |
312
+ | `motionControlV3Standard` | kling-v3 | std |
243
313
 
244
- Models are automatically filtered to only show providers where you have a valid key configured.
245
-
246
- ## Model Discovery
314
+ **Input: `MotionControlInput`**
247
315
 
248
316
  ```typescript
249
- import { listModels, resolveModel, deriveCategory } from 'getaiapi'
250
-
251
- // List all models
252
- const all = listModels()
253
-
254
- // Filter by input/output modality
255
- const imageModels = listModels({ input: 'text', output: 'image' })
256
-
257
- // Filter by provider
258
- const falModels = listModels({ provider: 'fal-ai' })
317
+ {
318
+ image_url: string // required
319
+ video_url?: string
320
+ prompt?: string
321
+ keep_original_sound?: boolean
322
+ character_orientation?: string
323
+ element_list?: Array<{ id: string; image: string }>
324
+ }
325
+ ```
259
326
 
260
- // Search by name
261
- const fluxModels = listModels({ query: 'flux' })
327
+ ### Text to Speech (Sync)
262
328
 
263
- // Resolve a specific model
264
- const model = resolveModel('flux-schnell')
265
- // => { canonical_name, aliases, modality, providers }
329
+ Returns immediately; no polling.
266
330
 
267
- // Derive a display label from modality
268
- deriveCategory(model) // => "text-to-image"
331
+ ```typescript
332
+ const result = await kling.tts({ text: 'Hello world' })
333
+ console.log(result.audios[0].url)
269
334
  ```
270
335
 
271
- ## Modality
336
+ **Input: `TtsInput`**
272
337
 
273
- Models declare their input and output types via `modality`. There are no fixed categories — modality is the source of truth.
274
-
275
- **Input types:** `text`, `image`, `audio`, `video`
338
+ ```typescript
339
+ {
340
+ text: string // required
341
+ }
342
+ ```
276
343
 
277
- **Output types:** `image`, `video`, `audio`, `text`, `3d`, `segmentation`
344
+ ### Video to Audio
278
345
 
279
- Common combinations across 1,890+ models (69 with native Kling provider):
346
+ Generates audio for a video. Returns both the merged video and the generated audio tracks.
280
347
 
281
- | Inputs | Outputs | Example |
282
- |---|---|---|
283
- | text | image | `flux-schnell`, `ideogram-v3` |
284
- | text | video | `veo3.1`, `sora-2` |
285
- | image, text | image | `gpt-image-1.5-edit`, `flux-2-pro-edit` |
286
- | image, text | video | `kling-video-v3-pro`, `seedance-v1.5-pro` |
287
- | text | audio | `elevenlabs-v3`, `minimax-music-v2` |
288
- | text | text | `claude-sonnet-4-6`, `gpt-4o` |
289
- | image | image | `topaz-upscale-image`, `birefnet-v2` |
290
- | image | 3d | `trellis-image-to-3d` |
291
- | audio | text | `whisper` |
348
+ ```typescript
349
+ const result = await kling.videoToAudio({
350
+ video_url: 'https://example.com/video.mp4',
351
+ sound_effect_prompt: 'ocean waves crashing',
352
+ })
292
353
 
293
- ## Providers
354
+ console.log(result.videos[0].url) // merged video with audio
355
+ console.log(result.audios[0].url_mp3) // audio track (mp3)
356
+ console.log(result.audios[0].url_wav) // audio track (wav)
357
+ ```
294
358
 
295
- | Provider | Models | Auth Env Var | Protocol |
296
- |---|---|---|---|
297
- | fal-ai | 1,201 | `FAL_KEY` | Native fetch |
298
- | Replicate | 687 | `REPLICATE_API_TOKEN` | Native fetch |
299
- | Kling AI | 69 | `KLING_ACCESS_KEY` | Native fetch + JWT |
300
- | WaveSpeed | 66 | `WAVESPEED_API_KEY` | Native fetch |
301
- | OpenRouter | 24 | `OPENROUTER_API_KEY` | Native fetch |
359
+ **Input: `VideoToAudioInput`**
302
360
 
303
- Many Kling models are available through both fal-ai and the native Kling provider. Using `provider: 'kling'` calls the Kling API directly with JWT authentication, bypassing intermediary markup. Set both `KLING_ACCESS_KEY` and `KLING_SECRET_KEY` env vars (or pass them combined as `accessKey:secretKey` via `configure()`).
361
+ ```typescript
362
+ {
363
+ video_url?: string // mutually exclusive with video_id
364
+ video_id?: string // mutually exclusive with video_url
365
+ sound_effect_prompt?: string
366
+ bgm_prompt?: string // background music prompt
367
+ asmr_mode?: boolean // enhanced detailed sound effects
368
+ }
369
+ ```
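Since `video_url` and `video_id` are mutually exclusive, a small guard can enforce that before calling. The guard is illustrative, not part of the SDK; the field names are from `VideoToAudioInput` above.

```typescript
// Subset of the documented VideoToAudioInput fields.
type VideoToAudioInput = {
  video_url?: string
  video_id?: string
  sound_effect_prompt?: string
  bgm_prompt?: string
  asmr_mode?: boolean
}

// Throws unless exactly one video source is provided.
function assertVideoSource(input: VideoToAudioInput): void {
  const sources = [input.video_url, input.video_id].filter(v => v !== undefined)
  if (sources.length !== 1) {
    throw new Error('Provide exactly one of video_url or video_id')
  }
}

assertVideoSource({ video_url: 'https://example.com/video.mp4' }) // ok
```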
304
370
 
305
- **Provider portability** -- the same code works across providers. Parameter names are aligned: `generate_audio`, `end_image_url`, `voice_ids`, and `elements` work identically whether you use `provider: 'fal-ai'` or `provider: 'kling'`. The library automatically translates to each provider's native field names (e.g., `generate_audio: true` becomes `sound: "on"` for Kling, stays `generate_audio: true` for fal-ai).
371
+ ### Text to Audio
306
372
 
307
- Zero external dependencies -- all provider communication uses native `fetch`. Works in Node.js, Vercel Edge, Cloudflare Workers, Deno, Bun, and any ESM runtime -- no `fs` or special bundler config needed.
373
+ ```typescript
374
+ const result = await kling.textToAudio({
375
+ prompt: 'thunderstorm with heavy rain',
376
+ duration: 5.0,
377
+ })
308
378
 
309
- ## API Reference
379
+ console.log(result.audios[0].url) // normalized from url_mp3
380
+ console.log(result.audios[0].url_mp3) // mp3 URL
381
+ console.log(result.audios[0].url_wav) // wav URL
382
+ ```
310
383
 
311
- ### `generate(request: GenerateRequest): Promise<GenerateResponse>`
384
+ **Input: `TextToAudioInput`**
312
385
 
313
- The core function. Resolves the model, maps parameters, calls the provider, and returns a unified response.
386
+ ```typescript
387
+ {
388
+ prompt: string // required
389
+ duration: number // required — 3.0 to 10.0
390
+ }
391
+ ```
314
392
 
315
- **GenerateRequest**
393
+ ### Voice Clone
316
394
 
317
395
  ```typescript
318
- interface GenerateRequest<P extends ProviderName = ProviderName> {
319
- model: string // required - model name
320
- provider?: P // preferred provider (optional)
321
- prompt?: string // text prompt
322
- image?: string | File // input image (URL or File)
323
- images?: (string | File)[] // multiple reference images
324
- audio?: string | File // input audio
325
- video?: string | File // input video
326
- negative_prompt?: string // what to avoid
327
- count?: number // number of outputs
328
- size?: string | { width: number; height: number } // output dimensions
329
- seed?: number // reproducibility seed
330
- guidance?: number // guidance scale
331
- steps?: number // inference steps
332
- strength?: number // denoising strength
333
- format?: 'png' | 'jpeg' | 'webp' | 'mp4' | 'mp3' | 'wav' | 'obj' | 'glb'
334
- quality?: number // output quality
335
- safety?: boolean // enable safety checker
336
- duration?: string // output duration (video/audio)
337
- options?: ProviderOptionsFor<P> // provider-specific overrides
338
- }
396
+ const result = await kling.createVoice({
397
+ voice_name: 'my-voice',
398
+ voice_url: 'https://example.com/sample.mp3',
399
+ })
400
+
401
+ console.log(result.voices[0].voice_id)
402
+ console.log(result.voices[0].trial_url)
339
403
  ```
340
404
 
341
- The generic `P` narrows `options` by provider. Use `GenerateRequest<'kling'>` for type-safe Kling options:
405
+ **Input: `CreateVoiceInput`**
342
406
 
343
407
  ```typescript
344
- const req: GenerateRequest<'kling'> = {
345
- model: 'kling-video-v3-pro-image-to-video',
346
- provider: 'kling',
347
- image: 'https://example.com/img.png',
348
- prompt: 'Animate this photo',
349
- options: {
350
- sound: 'on', // typed: 'on' | 'off'
351
- aspect_ratio: '16:9', // typed: string
352
- cfg_scale: 0.5, // typed: number
353
- },
408
+ {
409
+ voice_name: string // required
410
+ voice_url?: string // audio sample URL
411
+ video_id?: string // or extract from video
354
412
  }
355
413
  ```
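A cloned voice could plausibly feed the `voice_list` field of `ImageToVideoInput`. A sketch of that wiring — the `voices` shape is taken from the `createVoice` example above, but whether cloned voice IDs are accepted in `voice_list` is an assumption:

```typescript
// Result shape as shown in the createVoice example above.
type CreateVoiceResult = { voices: Array<{ voice_id: string; trial_url?: string }> }

// Map cloned voices into the { voice_id } entries expected by voice_list.
function toVoiceList(result: CreateVoiceResult): Array<{ voice_id: string }> {
  return result.voices.map(v => ({ voice_id: v.voice_id }))
}

// With a sample result shaped like the documented output:
const cloned: CreateVoiceResult = { voices: [{ voice_id: 'voice-123' }] }
console.log(toVoiceList(cloned)) // [ { voice_id: 'voice-123' } ]
```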
356
414
 
357
- Without a generic, `options` accepts any `Record<string, unknown>` (backward compatible).
415
+ ### Multi-Shot
358
416
 
359
- **GenerateResponse**
417
+ Generate multi-angle reference images from a frontal image. Each generated image comes with 3 angle variants.
360
418
 
361
419
  ```typescript
362
- interface GenerateResponse {
363
- id: string
364
- model: string
365
- provider: string
366
- status: 'completed' | 'failed'
367
- outputs: OutputItem[]
368
- metadata: {
369
- seed?: number
370
- inference_time_ms?: number
371
- cost?: number
372
- safety_flagged?: boolean
373
- tokens?: number // total tokens (LLM only)
374
- prompt_tokens?: number // input tokens (LLM only)
375
- completion_tokens?: number // output tokens (LLM only)
376
- }
377
- }
420
+ const result = await kling.multiShot({
421
+ element_frontal_image: 'https://example.com/face.jpg',
422
+ })
378
423
 
379
- interface OutputItem {
380
- type: 'image' | 'video' | 'audio' | 'text' | '3d' | 'segmentation'
381
- url?: string // URL for media outputs
382
- content?: string // text content for LLM outputs
383
- content_type: string
384
- size_bytes?: number
385
- }
424
+ console.log(result.images[0].url_1) // angle 1
425
+ console.log(result.images[0].url_2) // angle 2
426
+ console.log(result.images[0].url_3) // angle 3
386
427
  ```
387
428
 
388
- ### `submit(request: GenerateRequest): Promise<SubmitResponse>`
389
-
390
- Submits a job to the provider and returns immediately without waiting for completion. Returns the provider's task ID and enough context to poll later.
429
+ **Input: `MultiShotInput`**
391
430
 
392
431
  ```typescript
393
- interface SubmitResponse {
394
- id: string // provider's task/request ID
395
- model: string // canonical model name
396
- provider: ProviderName // which provider handled it
397
- endpoint: string // needed for polling
398
- status: 'pending' | 'processing' | 'completed'
432
+ {
433
+ element_frontal_image: string // required
399
434
  }
400
435
  ```
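Each multi-shot image exposes its three angle URLs as separate fields (`url_1`–`url_3`, as in the example above). Collecting them into an array is sometimes handier; the helper below is illustrative:

```typescript
// Angle-variant fields as shown in the multiShot example above.
type MultiShotImage = { url_1: string; url_2: string; url_3: string }

// Gather the three angle URLs into one array for iteration.
function angleUrls(image: MultiShotImage): string[] {
  return [image.url_1, image.url_2, image.url_3]
}

const urls = angleUrls({ url_1: 'a.jpg', url_2: 'b.jpg', url_3: 'c.jpg' })
console.log(urls.length) // 3
```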
401
436
 
402
- ### `poll(job: SubmitResponse): Promise<PollResponse>`
403
-
404
- Checks the status of a submitted job once. Returns current status, and includes mapped outputs and metadata when completed.
437
+ ### Reference to Image
405
438
 
406
439
  ```typescript
407
- interface PollResponse {
408
- id: string
409
- model: string
410
- provider: ProviderName
411
- status: 'completed' | 'failed' | 'processing' | 'pending'
412
- outputs?: OutputItem[] // populated when completed
413
- metadata?: GenerateResponse['metadata'] // populated when completed
414
- error?: string // populated when failed
415
- }
440
+ const result = await kling.referenceToImage({
441
+ prompt: 'portrait in watercolor style',
442
+ n: 2,
443
+ })
416
444
  ```
417
445
 
418
- ### `submitAndPoll(request: GenerateRequest): Promise<GenerateResponse>`
419
-
420
- Alias for `generate()`. Submits a job and polls until completion. Use this when you want the blocking behavior but want to be explicit about it.
446
+ **Input: `ReferenceToImageInput`**
421
447
 
422
- ### `listModels(filters?: ListModelsFilters): ModelEntry[]`
448
+ ```typescript
449
+ {
450
+ prompt: string // required
451
+ n?: number
452
+ aspect_ratio?: string
453
+ }
454
+ ```
423
455
 
424
- Returns all models in the registry. Accepts optional filters:
456
+ ### Expand Image
425
457
 
426
- - `input` -- filter by input modality (e.g. `'text'`, `'image'`, `'audio'`, `'video'`)
427
- - `output` -- filter by output modality (e.g. `'image'`, `'video'`, `'text'`, `'3d'`)
428
- - `provider` -- filter by provider (e.g. `'fal-ai'`)
429
- - `query` -- search canonical names and aliases
458
+ Outpainting: expands an image beyond its borders.
430
459
 
431
- ### `resolveModel(name: string): ModelEntry`
460
+ ```typescript
461
+ const result = await kling.expandImage({
462
+ image: 'https://example.com/photo.jpg',
463
+ prompt: 'extend the landscape',
464
+ })
465
+ ```
432
466
 
433
- Resolves a model by name. Accepts canonical names, aliases, and normalized variants. Throws if no match is found.
467
+ **Input: `ExpandImageInput`**
434
468
 
435
- ### `deriveCategory(model: ModelEntry): string`
469
+ ```typescript
470
+ {
471
+ image: string // required
472
+ prompt?: string
473
+ n?: number
474
+ }
475
+ ```
436
476
 
437
- Derives a display category label from a model's modality (e.g. `"text-to-image"`).
477
+ ### Extend Video
438
478
 
439
- ## R2 Storage (Asset Uploads)
479
+ Continue a video beyond its last frame.
440
480
 
441
- getaiapi includes built-in Cloudflare R2 storage support that automatically uploads binary assets before sending them to providers. Two modes are supported:
481
+ ```typescript
482
+ const result = await kling.extendVideo({
483
+ prompt: 'the camera continues to pan right',
484
+ })
485
+ ```
442
486
 
443
- - **`public`** (default) — requires a publicly readable bucket; returns public URLs (via `publicUrlBase` or the R2 endpoint)
444
- - **`presigned`** — works with private buckets; returns time-limited presigned GET URLs signed with S3 Signature V4 (no public access needed, `publicUrlBase` is not required)
487
+ **Input: `ExtendVideoInput`**
445
488
 
446
- ### Setup
489
+ ```typescript
490
+ {
491
+ prompt?: string
492
+ negative_prompt?: string
493
+ }
494
+ ```
447
495
 
448
- Set these environment variables:
496
+ ### Identify Face (Sync)
449
497
 
450
- ```bash
451
- # Required
452
- export R2_ACCOUNT_ID="your-cloudflare-account-id"
453
- export R2_BUCKET_NAME="your-bucket-name"
454
- export R2_ACCESS_KEY_ID="your-r2-access-key"
455
- export R2_SECRET_ACCESS_KEY="your-r2-secret-key"
498
+ Detect faces in a video for lip-sync targeting. Returns immediately — no polling.
456
499
 
457
- # Optional - custom public URL (only needed for mode: 'public')
458
- export R2_PUBLIC_URL="https://cdn.example.com"
500
+ ```typescript
501
+ const result = await kling.identifyFace({
502
+ video_url: 'https://example.com/video.mp4',
503
+ })
459
504
 
460
- # Optional - use presigned URLs for private buckets (default: 'public')
461
- export R2_STORAGE_MODE="presigned"
462
- export R2_PRESIGN_EXPIRES_IN="3600" # seconds, default: 3600, max: 604800 (7 days)
505
+ console.log(result.session_id)
506
+ result.face_data.forEach(face => {
507
+ console.log(face.face_id, face.face_image, face.start_time, face.end_time)
508
+ })
463
509
  ```
464
510
 
465
- #### How to get your R2 Public URL (public mode only)
511
+ **Input: `IdentifyFaceInput`**
466
512
 
467
- If using `mode: 'presigned'`, you can skip this — no public bucket access is needed.
513
+ ```typescript
514
+ {
515
+ video_url?: string // mutually exclusive with video_id
516
+ video_id?: string // mutually exclusive with video_url
517
+ }
518
+ ```
468
519
 
469
- 1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com)
470
- 2. Go to **R2 Object Storage** in the left sidebar
471
- 3. Click on your bucket
472
- 4. Go to the **Settings** tab
473
- 5. Under **Public access**, click **Allow Access**
474
- 6. Cloudflare will provide a public URL like `https://<bucket>.<account-id>.r2.dev` — use this as your `R2_PUBLIC_URL`
475
- 7. (Optional) You can also connect a **Custom Domain** under the same section for a cleaner URL like `https://cdn.yourdomain.com`
520
+ ### Image Recognize (Sync)
476
521
 
477
- Then call `configureStorage()` once at startup:
522
+ Returns immediately; no polling.
478
 
 ```typescript
- import { configureStorage } from 'getaiapi'
-
- // Read from environment variables
- configureStorage()
-
- // Or pass config directly
- configureStorage({
-   accountId: 'your-account-id',
-   bucketName: 'your-bucket',
-   accessKeyId: 'your-key',
-   secretAccessKey: 'your-secret',
-   publicUrlBase: 'https://cdn.example.com', // optional
-   autoUpload: false, // optional
-   mode: 'public', // 'public' | 'presigned' (default: 'public')
-   presignExpiresIn: 3600, // presigned URL TTL in seconds (default: 3600)
+ const result = await kling.imageRecognize({
+   image: 'https://example.com/photo.jpg',
  })
  ```
 
- ### Automatic Uploads in `generate()`
-
- Once storage is configured, any `Buffer`, `Blob`, `File`, or `ArrayBuffer` values in provider params are automatically uploaded to R2 and replaced with public URLs before the request is sent to the provider. This works recursively -- nested objects and arrays are traversed, so params like Kling's `elements[].frontal_image_url` are handled automatically. No code changes needed -- it just works.
+ **Input: `ImageRecognizeInput`**
 
 ```typescript
- import { generate, configureStorage } from 'getaiapi'
- import { readFileSync } from 'fs'
+ {
+   image: string // required
+ }
+ ```
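Per the Output Types section below, `imageRecognize` returns its payload as `data: unknown`, so the caller narrows it before use. A sketch of a type guard; the `RecognizedLabel` shape is an illustrative assumption, not a documented Kling payload:

```typescript
// The recognition payload shape below is an assumption for
// illustration; the SDK types the payload as `unknown`.
interface RecognizedLabel {
  label: string
  confidence: number
}

function isLabelArray(data: unknown): data is RecognizedLabel[] {
  return (
    Array.isArray(data) &&
    data.every(
      (d) =>
        typeof d === 'object' &&
        d !== null &&
        typeof (d as RecognizedLabel).label === 'string' &&
        typeof (d as RecognizedLabel).confidence === 'number',
    )
  )
}

const data: unknown = [{ label: 'cat', confidence: 0.97 }]
if (isLabelArray(data)) {
  console.log(data[0].label) // safely typed access after narrowing
}
```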
+
+ ### Account Costs
 
- configureStorage()
+ Query resource package balances under your account. Free to call; QPS ≤ 1. Note: `remaining_quantity` has a 12-hour reporting delay.
 
- const result = await generate({
-   model: 'gpt-image-1.5-edit',
-   image: readFileSync('./photo.jpg'), // Buffer uploaded to R2 automatically
-   prompt: 'add a rainbow in the sky',
+ ```typescript
+ const result = await kling.accountCosts({
+   start_time: Date.now() - 86_400_000, // last 24h
+   end_time: Date.now(),
  })
+
+ for (const pack of result.resource_pack_subscribe_infos) {
+   console.log(pack.resource_pack_name, pack.remaining_quantity, pack.status)
+ }
  ```
 
- To also re-upload URL strings through R2 (useful when providers can't access the original URL), pass `reupload: true` per-call:
+ **Input: `AccountCostsInput`**
 
 ```typescript
- const result = await generate({
-   model: 'kling-video-pro',
-   image: 'https://private-server.com/img.jpg',
-   prompt: 'animate this image',
-   options: { reupload: true },
- })
+ {
+   start_time: number // required — Unix ms
+   end_time: number // required — Unix ms
+   resource_pack_name?: string // optional — filter by exact package name
+ }
  ```
 
- Or enable it globally with `autoUpload: true` in the storage config.
+ **Output: `AccountCostsResult`**
 
- ### Cleanup / Lifecycle
-
- Assets uploaded automatically via `generate()` use the `getaiapi-tmp/` key prefix. You can set a [Cloudflare R2 lifecycle rule](https://developers.cloudflare.com/r2/buckets/object-lifecycles/) to auto-expire objects under that prefix (e.g. delete after 24 hours) so ephemeral generation assets don't accumulate.
+ ```typescript
+ {
+   resource_pack_subscribe_infos: Array<{
+     resource_pack_name: string
+     resource_pack_id: string
+     resource_pack_type: 'decreasing_total' | 'constant_period'
+     total_quantity: number
+     remaining_quantity: number // 12h delay
+     purchase_time: number
+     effective_time: number
+     invalid_time: number
+     status: 'toBeOnline' | 'online' | 'expired' | 'runOut'
+   }>
+ }
+ ```
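Because `remaining_quantity` lags by up to 12 hours, it is best treated as an estimate. A small helper (ours, not exported by the SDK) can still sum the usable balance across packs whose `status` is `'online'`, using the fields documented above:

```typescript
// Helper (not part of getaiapi): sum remaining quantity across
// currently usable packs. Field names follow AccountCostsResult.
interface ResourcePack {
  resource_pack_name: string
  remaining_quantity: number // estimate: up to 12h stale
  status: 'toBeOnline' | 'online' | 'expired' | 'runOut'
}

function totalRemaining(packs: ResourcePack[]): number {
  return packs
    .filter((p) => p.status === 'online')
    .reduce((sum, p) => sum + p.remaining_quantity, 0)
}

const packs: ResourcePack[] = [
  { resource_pack_name: 'video-gen', remaining_quantity: 120, status: 'online' },
  { resource_pack_name: 'old-pack', remaining_quantity: 50, status: 'expired' },
]
console.log(totalRemaining(packs)) // 120
```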
 
- ### Standalone Upload / Delete
+ ## Output Types
 
- You can also use R2 storage directly:
+ All functions return typed results based on output modality:
 
 ```typescript
- import { uploadAsset, deleteAsset, configureStorage } from 'getaiapi'
+ // Video endpoints (textToVideo, imageToVideo, omniVideo, avatar, lipSync, effects, motionControl, extendVideo)
+ interface KlingVideoResult {
+   task_id: string
+   videos: Array<{ id: string; url: string; duration: string }>
+ }
 
- configureStorage()
+ // Image endpoints (imageGeneration, omniImage, virtualTryOn, referenceToImage, expandImage)
+ interface KlingImageResult {
+   task_id: string
+   images: Array<{ index: number; url: string }>
+ }
 
- // Upload a buffer
- const { url, key, size_bytes, content_type } = await uploadAsset(
-   Buffer.from('hello world'),
-   { contentType: 'text/plain', prefix: 'uploads' }
- )
- console.log(url) // https://cdn.example.com/uploads/a1b2c3d4-...
+ // Audio endpoints (tts, textToAudio)
+ interface KlingAudioResult {
+   task_id: string
+   audios: Array<{ id: string; url: string; url_mp3?: string; url_wav?: string; duration?: string; duration_mp3?: string; duration_wav?: string }>
+ }
 
- // Delete by key
- await deleteAsset(key)
- ```
+ // Multi-shot endpoint — 3 angle URLs per image
+ interface KlingMultiShotResult {
+   task_id: string
+   images: Array<{ index: number; url_1: string; url_2: string; url_3: string }>
+ }
 
- ### Presigned URLs (Private Buckets)
+ // Voice clone endpoint
+ interface KlingVoiceResult {
+   task_id: string
+   voices: Array<{ voice_id: string; voice_name: string; trial_url: string; owned_by: string }>
+ }
 
- If your R2 bucket doesn't have public read access, use presigned mode. Instead of returning a public URL, `uploadAsset` will return a time-limited presigned GET URL signed with S3 Signature V4.
+ // Video-to-audio endpoint: merged video + generated audio
+ interface KlingVideoAudioResult {
+   task_id: string
+   videos: Array<{ id: string; url: string; duration: string }>
+   audios: Array<{ id: string; url_mp3?: string; url_wav?: string; duration_mp3?: string; duration_wav?: string }>
+ }
 
- ```typescript
- configureStorage({
-   accountId: 'your-account-id',
-   bucketName: 'private-bucket',
-   accessKeyId: 'your-key',
-   secretAccessKey: 'your-secret',
-   mode: 'presigned', // uploadAsset returns presigned URLs
-   presignExpiresIn: 1800, // URLs expire after 30 minutes
- })
+ // Face detection (identifyFace) — sync, no task_id
+ interface KlingFaceResult {
+   session_id: string
+   face_data: Array<{ face_id: string; face_image: string; start_time: number; end_time: number }>
+ }
 
- const { url } = await uploadAsset(Buffer.from('secret data'), {
-   contentType: 'application/octet-stream',
- })
- // url is a presigned GET URL, valid for 30 minutes
+ // Generic JSON (imageRecognize)
+ interface KlingJsonResult {
+   task_id: string
+   data: unknown
+ }
  ```
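Audio results expose several optional URL variants (`url`, `url_mp3`, `url_wav`). A small selection helper (ours, not part of the SDK) that works for both `KlingAudioResult` and `KlingVideoAudioResult` entries might look like:

```typescript
// Helper (not part of getaiapi): pick a playable URL, preferring the
// generic url, then mp3, then wav. All fields optional so the same
// shape covers KlingAudioResult and KlingVideoAudioResult entries.
interface AudioEntry {
  id: string
  url?: string
  url_mp3?: string
  url_wav?: string
}

function pickAudioUrl(audio: AudioEntry): string | undefined {
  return audio.url ?? audio.url_mp3 ?? audio.url_wav
}

pickAudioUrl({ id: 'a1', url_mp3: 'https://example.com/a1.mp3' })
```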
 
- You can also generate presigned URLs for existing objects:
-
- ```typescript
- import { presignAsset } from 'getaiapi'
+ ## Polling Control
 
- const url = presignAsset('uploads/my-file.png')
- // => https://<account>.r2.cloudflarestorage.com/<bucket>/uploads/my-file.png?X-Amz-Algorithm=...
+ All functions accept optional polling parameters:
 
- // Custom expiry per-call (overrides config default)
- const shortUrl = presignAsset('uploads/my-file.png', { expiresIn: 300 }) // 5 minutes
+ ```typescript
+ await kling.textToVideoV3Pro({
+   prompt: 'a sunset',
+   timeout: 600_000, // max wait time in ms (default: 300_000 = 5 min)
+   pollInterval: 5_000, // poll frequency in ms (default: 3_000)
+ })
  ```
 
- **UploadOptions**
+ Sync endpoints (`tts`, `imageRecognize`, `identifyFace`) return immediately regardless of these settings.
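To make the `timeout` / `pollInterval` semantics concrete, here is a simplified sketch of how an async endpoint could poll a task until completion. This is an illustration of the behavior described above, not the library's actual implementation; the status values and `pollUntilDone` name are ours:

```typescript
// Illustrative only: how timeout and pollInterval interact.
// The real SDK polls Kling's task-status endpoint; here we poll a fake task.
type TaskStatus = 'processing' | 'succeed' | 'failed'

async function pollUntilDone(
  check: () => Promise<TaskStatus>,
  { timeout = 300_000, pollInterval = 3_000 } = {},
): Promise<TaskStatus> {
  const deadline = Date.now() + timeout
  while (true) {
    const status = await check()
    if (status !== 'processing') return status
    // Give up if the next poll would land past the deadline.
    if (Date.now() + pollInterval > deadline) {
      throw new Error(`Task did not finish within ${timeout} ms`)
    }
    await new Promise((r) => setTimeout(r, pollInterval))
  }
}

// Fake task that succeeds on the third status check.
let calls = 0
const fakeCheck = async (): Promise<TaskStatus> =>
  ++calls < 3 ? 'processing' : 'succeed'

pollUntilDone(fakeCheck, { timeout: 1_000, pollInterval: 10 })
  .then((s) => console.log(s)) // 'succeed'
```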
 
- | Option | Type | Description |
- |---|---|---|
- | `key` | `string` | Custom object key (default: auto-generated UUID) |
- | `contentType` | `string` | MIME type (default: detected from input or `application/octet-stream`) |
- | `prefix` | `string` | Key prefix / folder (e.g. `"uploads"`) |
- | `maxBytes` | `number` | Max upload size in bytes (default: 500 MB) |
+ ## Extra Parameters
 
- ### Storage Errors
+ All input types accept additional Kling-native fields via an index signature. Pass any parameter the Kling API supports:
 
 ```typescript
- import { StorageError } from 'getaiapi'
-
- try {
-   await uploadAsset(buffer)
- } catch (err) {
-   if (err instanceof StorageError) {
-     console.error(err.operation) // 'upload' | 'delete' | 'config'
-     console.error(err.statusCode) // HTTP status from R2, if applicable
-   }
- }
+ await kling.textToVideoV3Pro({
+   prompt: 'a sunset',
+   camera_control: { type: 'simple', config: { horizontal: 5 } },
+   callback_url: 'https://example.com/webhook',
+ })
  ```
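For readers unfamiliar with index signatures, this is roughly how such an input type can be declared. The `TextToVideoInput` shape below is a sketch under that assumption; the SDK's actual type declarations may differ:

```typescript
// Sketch: known typed fields plus an open index signature, so any
// extra Kling-native parameter type-checks without a cast.
interface TextToVideoInput {
  prompt: string
  timeout?: number
  pollInterval?: number
  [key: string]: unknown // any additional provider parameter
}

const input: TextToVideoInput = {
  prompt: 'a sunset',
  callback_url: 'https://example.com/webhook', // accepted via index signature
}
```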
 
 ## Error Handling
 
- All errors extend `GetAIApiError` and can be caught uniformly or by type:
-
- | Error | When |
- |---|---|
- | `AuthError` | Missing or invalid API key for a provider |
- | `ModelNotFoundError` | Model name could not be resolved |
- | `ValidationError` | Invalid input parameters |
- | `ProviderError` | Provider returned an error response |
- | `TimeoutError` | Generation exceeded the timeout |
- | `RateLimitError` | Provider returned HTTP 429 |
- | `StorageError` | R2 upload, delete, or config failure |
-
 ```typescript
- import { generate, AuthError, ModelNotFoundError } from 'getaiapi'
+ import { kling, KlingAuthError, KlingTimeoutError, KlingTaskFailedError } from 'getaiapi'
 
 try {
-   const result = await generate({ model: 'flux-schnell', prompt: 'a cat' })
+   await kling.textToVideoV3Pro({ prompt: 'test' })
 } catch (err) {
-   if (err instanceof AuthError) {
-     console.error(`Set ${err.envVar} to use ${err.provider}`)
+   if (err instanceof KlingAuthError) {
+     // Missing or invalid credentials
+   }
+   if (err instanceof KlingTimeoutError) {
+     // Task took too long (increase timeout)
   }
-   if (err instanceof ModelNotFoundError) {
-     console.error(err.message) // includes "did you mean" suggestions
+   if (err instanceof KlingTaskFailedError) {
+     // Kling rejected the task (content violation, bad params, etc.)
+     console.error(err.taskId, err.message)
   }
 }
 ```
 
- ## Migrating from v0.x
+ | Error | Code | When |
+ |-------|------|------|
+ | `KlingAuthError` | `AUTH_ERROR` | Missing credentials or 401 response |
+ | `KlingRateLimitError` | `RATE_LIMIT` | HTTP 429 or body codes 1100-1102 |
+ | `KlingApiError` | `API_ERROR` | Provider returned an error |
+ | `KlingTimeoutError` | `TIMEOUT` | Polling exceeded the timeout |
+ | `KlingTaskFailedError` | `TASK_FAILED` | Task status is `'failed'` |
 
- v1.0.0 replaces the category-based architecture with a modality-first design. Key changes:
+ All errors extend `KlingError`, which extends `Error`.
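Since `KlingRateLimitError` maps to HTTP 429, it is a natural trigger for retries with backoff. A sketch of such a wrapper; the local `KlingRateLimitError` class here is a stand-in so the example is self-contained, and in real code you would import it from `getaiapi` instead:

```typescript
// Stand-in for the SDK's error class (import from 'getaiapi' in real code).
class KlingRateLimitError extends Error {}

// Retry wrapper (ours, not part of the SDK): retries only on rate-limit
// errors, with exponential backoff; all other errors are rethrown.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  backoffMs = 1_000,
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await fn()
    } catch (err) {
      if (!(err instanceof KlingRateLimitError) || i + 1 >= attempts) throw err
      // Backoff: backoffMs, 2 * backoffMs, 4 * backoffMs, ...
      await new Promise((r) => setTimeout(r, backoffMs * 2 ** i))
    }
  }
}
```

Usage would be, e.g., `await withRetry(() => kling.textToVideoV3Pro({ prompt: 'test' }))`.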
 
- - `getModel()` is now `resolveModel()`
- - `listModels({ category: '...' })` is now `listModels({ input: '...', output: '...' })`
- - No more `readFileSync` -- works in edge runtimes without any bundler config
+ ## Deprecated: v1 Unified Gateway
 
- See the full [Migration Guide](docs/MIGRATION.md) for details.
+ The previous `generate()`, `submit()`, `poll()` APIs and the multi-provider registry are deprecated but still exported for backward compatibility. They will be removed in the next major version.
 
- ## Documentation
+ ```typescript
+ // Deprecated — still works but will be removed
+ import { generate } from 'getaiapi'
+ await generate({ model: 'flux-schnell', prompt: '...' })
 
- Full documentation available at [interactive10.com/getaiapi.html](https://www.interactive10.com/getaiapi.html)
+ // New: use provider-specific typed functions
+ import { kling } from 'getaiapi'
+ await kling.textToVideoV3Pro({ prompt: '...' })
+ ```
 
 ## License