getaiapi 1.3.0 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,12 +1,12 @@
  # getaiapi
 
- **One function to call any AI model.**
+ **Typed AI provider SDKs. One import per provider.**
 
  [![npm version](https://img.shields.io/npm/v/getaiapi)](https://www.npmjs.com/package/getaiapi)
  [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
  [![TypeScript](https://img.shields.io/badge/TypeScript-strict-blue.svg)](https://www.typescriptlang.org/)
 
- A unified TypeScript library that wraps 1,890+ AI models across 4 providers into a single `generate()` function. One input shape. One output shape. Any model.
+ Each AI provider gets a typed namespace with one function per model. No generic `generate()`, no model strings, no mapping layers. What you type is what gets sent.
 
  ## Install
 
@@ -14,598 +14,652 @@ A unified TypeScript library that wraps 1,890+ AI models across 4 providers into
  npm install getaiapi
  ```
 
- ## Quick Start
+ ## Kling AI
 
- ```typescript
- import { generate } from 'getaiapi'
+ 69 models across 20 endpoints. Each model is a typed function with Kling-native field names.
 
- const result = await generate({
-   model: 'flux-schnell',
-   prompt: 'a cat wearing sunglasses'
- })
+ ### Setup
 
- console.log(result.outputs[0].url)
+ ```bash
+ export KLING_ACCESS_KEY="your-access-key"
+ export KLING_SECRET_KEY="your-secret-key"
  ```
 
- ## More Examples
-
- **Text generation (LLMs)**
+ Or configure programmatically:
 
  ```typescript
- const answer = await generate({
-   model: 'claude-sonnet-4-6',
-   prompt: 'Explain quantum computing in one paragraph'
- })
+ import { kling } from 'getaiapi'
 
- console.log(answer.outputs[0].content)
+ kling.configure({ accessKey: '...', secretKey: '...' })
  ```
 
- With system prompt and parameters:
+ ### Text to Video
+
+ 9 models: V1 Standard, V1.6 Pro/Standard, V2 Master, V2.1 Master, V2.5 Turbo Pro, V2.6 Pro, V3 Pro/Standard.
 
  ```typescript
- const reply = await generate({
-   model: 'gpt-4o',
-   prompt: 'Write a haiku about TypeScript',
-   options: {
-     system: 'You are a creative poet.',
-     temperature: 0.9,
-     max_tokens: 100,
-   }
+ import { kling } from 'getaiapi'
+
+ const result = await kling.textToVideoV3Pro({
+   prompt: 'a golden retriever running on a beach at sunset',
+   duration: '5',
+   aspect_ratio: '16:9',
+   sound: 'on',
  })
+
+ console.log(result.videos[0].url)
  ```
 
- **Text-to-video**
+ | Function | Model | Mode |
+ |----------|-------|------|
+ | `textToVideoV1Standard` | kling-v1 | std |
+ | `textToVideoV1_6Pro` | kling-v1-6 | pro |
+ | `textToVideoV1_6Standard` | kling-v1-6 | std |
+ | `textToVideoV2Master` | kling-v2-master | — |
+ | `textToVideoV2_1Master` | kling-v2-1-master | — |
+ | `textToVideoV2_5TurboPro` | kling-v2-5-turbo | pro |
+ | `textToVideoV2_6Pro` | kling-v2-6 | pro |
+ | `textToVideoV3Pro` | kling-v3 | pro |
+ | `textToVideoV3Standard` | kling-v3 | std |
+
+ **Input: `TextToVideoInput`**
 
  ```typescript
- const video = await generate({
-   model: 'veo3.1',
-   prompt: 'a timelapse of a flower blooming in a garden'
- })
+ {
+   prompt: string // required
+   negative_prompt?: string
+   duration?: string // '5' or '10'
+   aspect_ratio?: string // '16:9', '9:16', '1:1'
+   cfg_scale?: number
+   sound?: 'on' | 'off' // generate audio
+ }
  ```
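
Because the table above exposes one typed function per tier, a quality fallback is easy to wire up by hand. The sketch below is illustrative and not part of the package: `withFallback` is a hypothetical helper, and the two stub functions only mimic the documented `TextToVideoInput` / `KlingVideoResult` shapes so the snippet is self-contained; in a real project you would pass the imported `kling` functions directly.

```typescript
// Illustrative sketch, not a package export. The types mirror the documented
// TextToVideoInput / KlingVideoResult shapes from this README.
interface TextToVideoInput {
  prompt: string
  negative_prompt?: string
  duration?: string
  aspect_ratio?: string
  cfg_scale?: number
  sound?: 'on' | 'off'
}
interface KlingVideoResult {
  task_id: string
  videos: Array<{ id: string; url: string; duration: string }>
}

type TextToVideoFn = (input: TextToVideoInput) => Promise<KlingVideoResult>

// Try each tier in order; return the first success, rethrow the last failure.
async function withFallback(input: TextToVideoInput, tiers: TextToVideoFn[]): Promise<KlingVideoResult> {
  let lastError: unknown
  for (const call of tiers) {
    try {
      return await call(input)
    } catch (err) {
      lastError = err
    }
  }
  throw lastError
}

// Stubs standing in for kling.textToVideoV3Pro / kling.textToVideoV2_5TurboPro:
const v3Pro: TextToVideoFn = async () => { throw new Error('quota exceeded') }
const v2_5TurboPro: TextToVideoFn = async (i) => ({
  task_id: 'demo',
  videos: [{ id: 'v1', url: 'https://example.com/demo.mp4', duration: i.duration ?? '5' }],
})

const result = await withFallback(
  { prompt: 'a golden retriever on a beach', duration: '5', aspect_ratio: '16:9' },
  [v3Pro, v2_5TurboPro], // in real code: [kling.textToVideoV3Pro, kling.textToVideoV2_5TurboPro]
)
console.log(result.videos[0].url) // 'https://example.com/demo.mp4'
```

Because every tier shares the same input type, the same request object can be handed to any function in the table without remapping fields.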
 
- **Image editing**
+ ### Image to Video
+
+ 13 models: V1 Standard, V1.5 Pro, V1.6 Pro/Standard, V2 Master, V2.1 Master/Pro/Standard, V2.5 Turbo Pro/Standard, V2.6 Pro, V3 Pro/Standard.
 
  ```typescript
- const edited = await generate({
-   model: 'gpt-image-1.5-edit',
+ const result = await kling.imageToVideoV3Pro({
    image: 'https://example.com/photo.jpg',
-   prompt: 'add a rainbow in the sky'
+   prompt: 'animate this photo with gentle wind',
+   duration: '5',
  })
  ```
 
- **Multi-image references** (e.g., character + location consistency)
+ | Function | Model | Mode |
+ |----------|-------|------|
+ | `imageToVideoV1Standard` | kling-v1 | std |
+ | `imageToVideoV1_5Pro` | kling-v1-5 | pro |
+ | `imageToVideoV1_6Pro` | kling-v1-6 | pro |
+ | `imageToVideoV1_6Standard` | kling-v1-6 | std |
+ | `imageToVideoV2Master` | kling-v2-master | — |
+ | `imageToVideoV2_1Master` | kling-v2-1-master | — |
+ | `imageToVideoV2_1Pro` | kling-v2-1 | pro |
+ | `imageToVideoV2_1Standard` | kling-v2-1 | std |
+ | `imageToVideoV2_5TurboPro` | kling-v2-5-turbo | pro |
+ | `imageToVideoV2_5TurboStandard` | kling-v2-5-turbo | std |
+ | `imageToVideoV2_6Pro` | kling-v2-6 | pro |
+ | `imageToVideoV3Pro` | kling-v3 | pro |
+ | `imageToVideoV3Standard` | kling-v3 | std |
+
+ **Input: `ImageToVideoInput`**
 
  ```typescript
- const scene = await generate({
-   model: 'google-nano-banana-pro-edit',
-   prompt: 'cinematic shot of the character in the location',
-   image: 'https://example.com/character.jpg',
-   images: [
-     'https://example.com/character.jpg',
-     'https://example.com/location.jpg',
-   ],
- })
+ {
+   image: string // required — URL or base64
+   prompt?: string
+   negative_prompt?: string
+   duration?: string
+   aspect_ratio?: string
+   cfg_scale?: number
+   sound?: 'on' | 'off'
+   image_tail?: string // end frame image URL
+   voice_list?: Array<{ voice_id: string }>
+   element_list?: Array<{ id: string; image: string }>
+ }
  ```
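
Since `ImageToVideoInput` accepts both a start frame (`image`) and an end frame (`image_tail`), first-to-last-frame interpolation is just a matter of filling both fields. A small illustrative builder follows; the `firstLastFrame` helper is hypothetical, not a package export, and only the field names from the input type above are assumed.

```typescript
// Hypothetical helper, not a package export. Field names follow the
// documented ImageToVideoInput type above.
interface ImageToVideoInput {
  image: string
  prompt?: string
  duration?: string
  image_tail?: string
}

// Pin both the first and the last frame of the generated clip.
function firstLastFrame(start: string, end: string, prompt?: string): ImageToVideoInput {
  return { image: start, image_tail: end, prompt, duration: '5' }
}

const input = firstLastFrame(
  'https://example.com/frame-start.jpg',
  'https://example.com/frame-end.jpg',
  'smooth transition between the two frames',
)
// In a real project: await kling.imageToVideoV2_1Pro(input)
console.log(input.image_tail) // 'https://example.com/frame-end.jpg'
```

The returned object can be passed to any of the image-to-video functions in the table above, since they all share the same input type.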
 
- **Text-to-speech**
-
- ```typescript
- const speech = await generate({
-   model: 'elevenlabs-v3',
-   prompt: 'Hello, welcome to getaiapi.',
-   options: { voice_id: 'rachel' }
- })
- ```
+ ### Omni Video
 
- **Upscale an image**
+ 17 models across O1 and O3 variants. Supports text-to-video, image-to-video, reference-to-video, video editing, and video reference — all through one endpoint.
 
  ```typescript
- const upscaled = await generate({
-   model: 'topaz-upscale-image',
-   image: 'https://example.com/low-res.jpg'
+ const result = await kling.omniVideoO3ProTextToVideo({
+   prompt: 'a cyberpunk city at night',
+   duration: '5',
+   aspect_ratio: '16:9',
  })
  ```
 
- **Remove background**
+ | Function | Model | Mode |
+ |----------|-------|------|
+ | `omniVideoO1ImageToVideo` | kling-video-o1 | — |
+ | `omniVideoO1ReferenceToVideo` | kling-video-o1 | — |
+ | `omniVideoO1StandardImageToVideo` | kling-video-o1 | std |
+ | `omniVideoO1StandardReferenceToVideo` | kling-video-o1 | std |
+ | `omniVideoO1StandardVideoEdit` | kling-video-o1 | std |
+ | `omniVideoO1StandardVideoReference` | kling-video-o1 | std |
+ | `omniVideoO1VideoEdit` | kling-video-o1 | — |
+ | `omniVideoO1VideoReference` | kling-video-o1 | — |
+ | `omniVideoO3ProImageToVideo` | kling-v3-omni | pro |
+ | `omniVideoO3ProReferenceToVideo` | kling-v3-omni | pro |
+ | `omniVideoO3ProTextToVideo` | kling-v3-omni | pro |
+ | `omniVideoO3ProVideoEdit` | kling-v3-omni | pro |
+ | `omniVideoO3ProVideoReference` | kling-v3-omni | pro |
+ | `omniVideoO3StandardReferenceToVideo` | kling-v3-omni | std |
+ | `omniVideoO3StandardTextToVideo` | kling-v3-omni | std |
+ | `omniVideoO3StandardVideoEdit` | kling-v3-omni | std |
+ | `omniVideoO3StandardVideoReference` | kling-v3-omni | std |
+
+ **Input: `OmniVideoInput`**
 
  ```typescript
- const cutout = await generate({
-   model: 'birefnet-v2',
-   image: 'https://example.com/portrait.jpg'
- })
+ {
+   prompt: string // required
+   image?: string
+   negative_prompt?: string
+   duration?: string
+   aspect_ratio?: string
+   cfg_scale?: number
+   sound?: 'on' | 'off'
+   element_list?: Array<{ id: string; image: string }>
+ }
  ```
 
170
+ ### Image Generation
119
171
 
120
- For long-running jobs (video generation, training), you can submit a job and poll for status separately instead of blocking until completion.
172
+ 2 models on `v1/images/generations` and 3 models on `v1/images/omni-image`.
121
173
 
122
174
  ```typescript
123
- import { submit, poll } from 'getaiapi'
124
-
125
- // Submit — returns immediately with the provider's task ID
126
- const job = await submit({
127
- model: 'veo3.1',
128
- prompt: 'a timelapse of a flower blooming',
175
+ const result = await kling.imageO1({
176
+ prompt: 'a watercolor painting of a mountain lake',
177
+ n: 2,
178
+ aspect_ratio: '16:9',
129
179
  })
130
180
 
131
- console.log(job.id) // provider task ID
132
- console.log(job.status) // 'pending' | 'processing' | 'completed'
181
+ console.log(result.images[0].url)
182
+ ```
133
183
 
134
- // Poll check status manually (call in a loop, on a timer, etc.)
135
- let result = await poll(job)
184
+ | Function | Endpoint | Model |
185
+ |----------|----------|-------|
186
+ | `imageV3TextToImage` | generations | kling-v3 |
187
+ | `imageV3ImageToImage` | generations | kling-v3 |
188
+ | `imageO1` | omni-image | kling-image-o1 |
189
+ | `imageO3TextToImage` | omni-image | kling-v3-omni |
190
+ | `imageO3ImageToImage` | omni-image | kling-v3-omni |
136
191
 
137
- while (result.status === 'pending' || result.status === 'processing') {
138
- await new Promise(r => setTimeout(r, 2000))
139
- result = await poll(job)
140
- }
192
+ **Input: `ImageGenerationInput` / `OmniImageInput`**
141
193
 
142
- if (result.status === 'completed') {
143
- console.log(result.outputs[0].url)
194
+ ```typescript
195
+ {
196
+ prompt: string // required
197
+ image?: string // for image-to-image
198
+ n?: number // number of outputs
199
+ aspect_ratio?: string
144
200
  }
145
201
  ```
146
202
 
147
- Synchronous providers (like OpenRouter) return `status: 'completed'` from `submit()` immediately -- check status before polling.
148
-
149
- `submitAndPoll()` is an alias for `generate()` that makes the blocking behavior explicit:
203
+ ### Virtual Try-On
150
204
 
151
205
  ```typescript
152
- import { submitAndPoll } from 'getaiapi'
153
-
154
- const result = await submitAndPoll({
155
- model: 'flux-schnell',
156
- prompt: 'a cat in space',
206
+ const result = await kling.virtualTryOn({
207
+ human_image: 'https://example.com/person.jpg',
208
+ cloth_image: 'https://example.com/shirt.jpg',
157
209
  })
158
210
  ```
159
211
 
160
- ## Configuration
161
-
162
- ### Option 1: Environment Variables
212
+ **Input: `VirtualTryOnInput`**
163
213
 
164
- Set API keys as environment variables. You only need keys for the providers you plan to call.
165
-
166
- ```bash
167
- # fal-ai (1,201 models)
168
- export FAL_KEY="your-fal-key"
214
+ ```typescript
215
+ {
216
+ human_image: string // required
217
+ cloth_image: string // required
218
+ }
219
+ ```
169
220
 
170
- # Replicate (687 models)
171
- export REPLICATE_API_TOKEN="your-replicate-token"
221
+ ### AI Avatar
172
222
 
173
- # WaveSpeed (66 models)
174
- export WAVESPEED_API_KEY="your-wavespeed-key"
223
+ 4 models: V1 Pro/Standard, V2 Pro/Standard.
175
224
 
176
- # OpenRouter (24 LLM models — Claude, GPT, Gemini, Llama, etc.)
177
- export OPENROUTER_API_KEY="your-openrouter-key"
225
+ ```typescript
226
+ const result = await kling.avatarV2Pro({
227
+ image: 'https://example.com/portrait.jpg',
228
+ sound_file: 'https://example.com/speech.mp3',
229
+ prompt: 'talking head presentation',
230
+ })
178
231
  ```
179
232
 
180
- ### Option 2: Programmatic Configuration
233
+ | Function | Mode |
234
+ |----------|------|
235
+ | `avatarV1Pro` | pro |
236
+ | `avatarV1Standard` | std |
237
+ | `avatarV2Pro` | pro |
238
+ | `avatarV2Standard` | std |
181
239
 
182
- Use `configure()` to set keys in code -- useful when your env vars have different names or keys come from a secrets manager.
240
+ **Input: `AvatarInput`**
183
241
 
184
242
  ```typescript
185
- import { configure } from 'getaiapi'
186
-
187
- configure({
188
- keys: {
189
- 'fal-ai': process.env.MY_FAL_TOKEN,
190
- 'replicate': process.env.MY_REPLICATE_TOKEN,
191
- 'wavespeed': process.env.MY_WAVESPEED_TOKEN,
192
- 'openrouter': process.env.MY_OPENROUTER_TOKEN,
193
- },
194
- })
243
+ {
244
+ image: string // required — portrait image
245
+ sound_file?: string // audio for lip sync
246
+ prompt?: string
247
+ }
195
248
  ```
196
249
 
197
- You can also set keys and storage together:
250
+ ### Lip Sync
198
251
 
199
252
  ```typescript
200
- configure({
201
- keys: {
202
- 'fal-ai': 'your-fal-key',
203
- },
204
- storage: {
205
- accountId: 'your-r2-account',
206
- bucketName: 'your-bucket',
207
- accessKeyId: 'your-r2-key',
208
- secretAccessKey: 'your-r2-secret',
209
- publicUrlBase: 'https://cdn.example.com',
210
- },
253
+ const result = await kling.lipSyncAudioToVideo({
254
+ sound_file: 'https://example.com/speech.mp3',
211
255
  })
212
256
  ```
213
257
 
214
- Or set just provider keys with `configureAuth()`:
258
+ | Function | Description |
259
+ |----------|-------------|
260
+ | `lipSyncAudioToVideo` | Audio-driven lip sync |
261
+ | `lipSyncTextToVideo` | Text-driven lip sync |
215
262
 
216
- ```typescript
217
- import { configureAuth } from 'getaiapi'
263
+ **Input: `LipSyncInput`**
218
264
 
219
- configureAuth({
220
- 'fal-ai': myKeyVault.get('fal'),
221
- 'replicate': myKeyVault.get('replicate'),
222
- })
265
+ ```typescript
266
+ {
267
+ sound_file?: string // audio URL
268
+ }
223
269
  ```
224
270
 
225
- Programmatic keys take priority over environment variables. Any provider not set programmatically falls back to its default env var.
226
-
227
- Models are automatically filtered to only show providers where you have a valid key configured.
271
+ ### Video Effects
228
272
 
229
- ## Model Discovery
273
+ 4 models: V1 Standard, V1.5 Pro, V1.6 Pro/Standard.
230
274
 
231
275
  ```typescript
232
- import { listModels, resolveModel, deriveCategory } from 'getaiapi'
276
+ const result = await kling.effectsV1_6Pro({
277
+ image: 'https://example.com/photo.jpg',
278
+ })
279
+ ```
233
280
 
234
- // List all models
235
- const all = listModels()
281
+ | Function |
282
+ |----------|
283
+ | `effectsV1Standard` |
284
+ | `effectsV1_5Pro` |
285
+ | `effectsV1_6Pro` |
286
+ | `effectsV1_6Standard` |
236
287
 
237
- // Filter by input/output modality
238
- const imageModels = listModels({ input: 'text', output: 'image' })
288
+ **Input: `EffectsInput`**
239
289
 
240
- // Filter by provider
241
- const falModels = listModels({ provider: 'fal-ai' })
290
+ ```typescript
291
+ {
292
+ image: string // required
293
+ }
294
+ ```
242
295
 
243
- // Search by name
244
- const fluxModels = listModels({ query: 'flux' })
296
+ ### Motion Control
245
297
 
246
- // Resolve a specific model
247
- const model = resolveModel('flux-schnell')
248
- // => { canonical_name, aliases, modality, providers }
298
+ 4 models: V2.6 Pro/Standard, V3 Pro/Standard.
249
299
 
250
- // Derive a display label from modality
251
- deriveCategory(model) // => "text-to-image"
300
+ ```typescript
301
+ const result = await kling.motionControlV3Pro({
302
+ image_url: 'https://example.com/scene.jpg',
303
+ prompt: 'camera pan left',
304
+ })
252
305
  ```
253
306
 
254
- ## Modality
307
+ | Function | Model | Mode |
308
+ |----------|-------|------|
309
+ | `motionControlV2_6Pro` | kling-v2-6 | pro |
310
+ | `motionControlV2_6Standard` | kling-v2-6 | std |
311
+ | `motionControlV3Pro` | kling-v3 | pro |
312
+ | `motionControlV3Standard` | kling-v3 | std |
255
313
 
256
- Models declare their input and output types via `modality`. There are no fixed categories — modality is the source of truth.
314
+ **Input: `MotionControlInput`**
257
315
 
258
- **Input types:** `text`, `image`, `audio`, `video`
316
+ ```typescript
317
+ {
318
+ image_url: string // required
319
+ video_url?: string
320
+ prompt?: string
321
+ keep_original_sound?: boolean
322
+ character_orientation?: string
323
+ element_list?: Array<{ id: string; image: string }>
324
+ }
325
+ ```
259
326
 
260
- **Output types:** `image`, `video`, `audio`, `text`, `3d`, `segmentation`
327
+ ### Text to Speech (Sync)
261
328
 
262
- Common combinations across 1,890+ models:
329
+ Returns immediately no polling.
263
330
 
264
- | Inputs | Outputs | Example |
265
- |---|---|---|
266
- | text | image | `flux-schnell`, `ideogram-v3` |
267
- | text | video | `veo3.1`, `sora-2` |
268
- | image, text | image | `gpt-image-1.5-edit`, `flux-2-pro-edit` |
269
- | image, text | video | `kling-video-v3-pro`, `seedance-v1.5-pro` |
270
- | text | audio | `elevenlabs-v3`, `minimax-music-v2` |
271
- | text | text | `claude-sonnet-4-6`, `gpt-4o` |
272
- | image | image | `topaz-upscale-image`, `birefnet-v2` |
273
- | image | 3d | `trellis-image-to-3d` |
274
- | audio | text | `whisper` |
331
+ ```typescript
332
+ const result = await kling.tts({ text: 'Hello world' })
333
+ console.log(result.audios[0].url)
334
+ ```
275
335
 
276
- ## Providers
336
+ **Input: `TtsInput`**
277
337
 
278
- | Provider | Models | Auth Env Var | Protocol |
279
- |---|---|---|---|
280
- | fal-ai | 1,201 | `FAL_KEY` | Native fetch |
281
- | Replicate | 687 | `REPLICATE_API_TOKEN` | Native fetch |
282
- | WaveSpeed | 66 | `WAVESPEED_API_KEY` | Native fetch |
283
- | OpenRouter | 24 | `OPENROUTER_API_KEY` | Native fetch |
338
+ ```typescript
339
+ {
340
+ text: string // required
341
+ }
342
+ ```
284
343
 
285
- Zero external dependencies -- all provider communication uses native `fetch`. Works in Node.js, Vercel Edge, Cloudflare Workers, Deno, Bun, and any ESM runtime -- no `fs` or special bundler config needed.
344
+ ### Video to Audio
286
345
 
287
- ## API Reference
346
+ Generates audio for a video. Returns both the merged video and the generated audio tracks.
288
347
 
289
- ### `generate(request: GenerateRequest): Promise<GenerateResponse>`
348
+ ```typescript
349
+ const result = await kling.videoToAudio({
350
+ video_url: 'https://example.com/video.mp4',
351
+ sound_effect_prompt: 'ocean waves crashing',
352
+ })
290
353
 
291
- The core function. Resolves the model, maps parameters, calls the provider, and returns a unified response.
354
+ console.log(result.videos[0].url) // merged video with audio
355
+ console.log(result.audios[0].url_mp3) // audio track (mp3)
356
+ console.log(result.audios[0].url_wav) // audio track (wav)
357
+ ```
292
358
 
293
- **GenerateRequest**
359
+ **Input: `VideoToAudioInput`**
294
360
 
295
361
  ```typescript
296
- interface GenerateRequest {
297
- model: string // required - model name
298
- provider?: ProviderName // preferred provider (optional)
299
- prompt?: string // text prompt
300
- image?: string | File // input image (URL or File)
301
- images?: (string | File)[] // multiple reference images
302
- audio?: string | File // input audio
303
- video?: string | File // input video
304
- negative_prompt?: string // what to avoid
305
- count?: number // number of outputs
306
- size?: string | { width: number; height: number } // output dimensions
307
- seed?: number // reproducibility seed
308
- guidance?: number // guidance scale
309
- steps?: number // inference steps
310
- strength?: number // denoising strength
311
- format?: 'png' | 'jpeg' | 'webp' | 'mp4' | 'mp3' | 'wav' | 'obj' | 'glb'
312
- quality?: number // output quality
313
- safety?: boolean // enable safety checker
314
- options?: Record<string, unknown> // provider-specific overrides
362
+ {
363
+ video_url?: string // mutually exclusive with video_id
364
+ video_id?: string // mutually exclusive with video_url
365
+ sound_effect_prompt?: string
366
+ bgm_prompt?: string // background music prompt
367
+ asmr_mode?: boolean // enhanced detailed sound effects
315
368
  }
316
369
  ```
317
370
 
318
- **GenerateResponse**
371
+ ### Text to Audio
319
372
 
320
373
  ```typescript
321
- interface GenerateResponse {
322
- id: string
323
- model: string
324
- provider: string
325
- status: 'completed' | 'failed'
326
- outputs: OutputItem[]
327
- metadata: {
328
- seed?: number
329
- inference_time_ms?: number
330
- cost?: number
331
- safety_flagged?: boolean
332
- tokens?: number // total tokens (LLM only)
333
- prompt_tokens?: number // input tokens (LLM only)
334
- completion_tokens?: number // output tokens (LLM only)
335
- }
336
- }
374
+ const result = await kling.textToAudio({
375
+ prompt: 'thunderstorm with heavy rain',
376
+ duration: 5.0,
377
+ })
337
378
 
338
- interface OutputItem {
339
- type: 'image' | 'video' | 'audio' | 'text' | '3d' | 'segmentation'
340
- url?: string // URL for media outputs
341
- content?: string // text content for LLM outputs
342
- content_type: string
343
- size_bytes?: number
344
- }
379
+ console.log(result.audios[0].url) // normalized from url_mp3
380
+ console.log(result.audios[0].url_mp3) // mp3 URL
381
+ console.log(result.audios[0].url_wav) // wav URL
345
382
  ```
346
383
 
347
- ### `submit(request: GenerateRequest): Promise<SubmitResponse>`
348
-
349
- Submits a job to the provider and returns immediately without waiting for completion. Returns the provider's task ID and enough context to poll later.
384
+ **Input: `TextToAudioInput`**
350
385
 
351
386
  ```typescript
352
- interface SubmitResponse {
353
- id: string // provider's task/request ID
354
- model: string // canonical model name
355
- provider: ProviderName // which provider handled it
356
- endpoint: string // needed for polling
357
- status: 'pending' | 'processing' | 'completed'
387
+ {
388
+ prompt: string // required
389
+ duration: number // required 3.0 to 10.0
358
390
  }
359
391
  ```
360
392
 
361
- ### `poll(job: SubmitResponse): Promise<PollResponse>`
393
+ ### Voice Clone
+
+ ```typescript
+ const result = await kling.createVoice({
+   voice_name: 'my-voice',
+   voice_url: 'https://example.com/sample.mp3',
+ })
+
+ console.log(result.voices[0].voice_id)
+ console.log(result.voices[0].trial_url)
+ ```
 
- Checks the status of a submitted job once. Returns current status, and includes mapped outputs and metadata when completed.
+ **Input: `CreateVoiceInput`**
 
  ```typescript
- interface PollResponse {
-   id: string
-   model: string
-   provider: ProviderName
-   status: 'completed' | 'failed' | 'processing' | 'pending'
-   outputs?: OutputItem[] // populated when completed
-   metadata?: GenerateResponse['metadata'] // populated when completed
-   error?: string // populated when failed
+ {
+   voice_name: string // required
+   voice_url?: string // audio sample URL
+   video_id?: string // or extract from video
  }
  ```
 
- ### `submitAndPoll(request: GenerateRequest): Promise<GenerateResponse>`
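
The `voice_id` returned here has a natural consumer: the `voice_list` field on `ImageToVideoInput`. The sketch below chains the two; `createVoice` is stubbed with the documented `KlingVoiceResult` shape so the snippet runs on its own, and the final image-to-video call is shown as a comment rather than executed.

```typescript
// Self-contained sketch. createVoice is stubbed with the documented
// KlingVoiceResult shape; in a real project use kling.createVoice.
interface KlingVoiceResult {
  task_id: string
  voices: Array<{ voice_id: string; voice_name: string; trial_url: string; owned_by: string }>
}

const createVoice = async (input: { voice_name: string; voice_url?: string }): Promise<KlingVoiceResult> => ({
  task_id: 'voice-task',
  voices: [{
    voice_id: 'voice-123',
    voice_name: input.voice_name,
    trial_url: 'https://example.com/trial.mp3',
    owned_by: 'me',
  }],
})

const cloned = await createVoice({
  voice_name: 'narrator',
  voice_url: 'https://example.com/sample.mp3',
})

// Feed the cloned voice into an image-to-video request via the documented
// voice_list field on ImageToVideoInput:
const request = {
  image: 'https://example.com/portrait.jpg',
  prompt: 'the character introduces themselves',
  voice_list: [{ voice_id: cloned.voices[0].voice_id }],
}
// In a real project: await kling.imageToVideoV3Pro(request)
console.log(request.voice_list[0].voice_id) // 'voice-123'
```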
+ ### Multi-Shot
 
- Alias for `generate()`. Submits a job and polls until completion. Use this when you want the blocking behavior but want to be explicit about it.
+ Generate multi-angle reference images from a frontal image. Each image returns 3 angle variants.
 
- ### `listModels(filters?: ListModelsFilters): ModelEntry[]`
+ ```typescript
+ const result = await kling.multiShot({
+   element_frontal_image: 'https://example.com/face.jpg',
+ })
 
- Returns all models in the registry. Accepts optional filters:
+ console.log(result.images[0].url_1) // angle 1
+ console.log(result.images[0].url_2) // angle 2
+ console.log(result.images[0].url_3) // angle 3
+ ```
 
- - `input` -- filter by input modality (e.g. `'text'`, `'image'`, `'audio'`, `'video'`)
- - `output` -- filter by output modality (e.g. `'image'`, `'video'`, `'text'`, `'3d'`)
- - `provider` -- filter by provider (e.g. `'fal-ai'`)
- - `query` -- search canonical names and aliases
+ **Input: `MultiShotInput`**
 
- ### `resolveModel(name: string): ModelEntry`
+ ```typescript
+ {
+   element_frontal_image: string // required
+ }
+ ```
 
- Resolves a model by name. Accepts canonical names, aliases, and normalized variants. Throws if no match is found.
+ ### Reference to Image
 
- ### `deriveCategory(model: ModelEntry): string`
+ ```typescript
+ const result = await kling.referenceToImage({
+   prompt: 'portrait in watercolor style',
+   n: 2,
+ })
+ ```
 
- Derives a display category label from a model's modality (e.g. `"text-to-image"`).
+ **Input: `ReferenceToImageInput`**
 
- ## R2 Storage (Asset Uploads)
+ ```typescript
+ {
+   prompt: string // required
+   n?: number
+   aspect_ratio?: string
+ }
+ ```
 
- getaiapi includes built-in Cloudflare R2 storage support that automatically uploads binary assets before sending them to providers. Two modes are supported:
+ ### Expand Image
 
- - **`public`** (default) requires a publicly readable bucket; returns public URLs (via `publicUrlBase` or the R2 endpoint)
- - **`presigned`** — works with private buckets; returns time-limited presigned GET URLs signed with S3 Signature V4 (no public access needed, `publicUrlBase` is not required)
+ Outpainting: expand an image beyond its borders.
 
- ### Setup
+ ```typescript
+ const result = await kling.expandImage({
+   image: 'https://example.com/photo.jpg',
+   prompt: 'extend the landscape',
+ })
+ ```
 
- Set these environment variables:
+ **Input: `ExpandImageInput`**
 
- ```bash
- # Required
- export R2_ACCOUNT_ID="your-cloudflare-account-id"
- export R2_BUCKET_NAME="your-bucket-name"
- export R2_ACCESS_KEY_ID="your-r2-access-key"
- export R2_SECRET_ACCESS_KEY="your-r2-secret-key"
+ ```typescript
+ {
+   image: string // required
+   prompt?: string
+   n?: number
+ }
+ ```
+
+ ### Extend Video
 
- # Optional - custom public URL (only needed for mode: 'public')
- export R2_PUBLIC_URL="https://cdn.example.com"
+ Continue a video beyond its last frame.
 
- # Optional - use presigned URLs for private buckets (default: 'public')
- export R2_STORAGE_MODE="presigned"
- export R2_PRESIGN_EXPIRES_IN="3600" # seconds, default: 3600, max: 604800 (7 days)
+ ```typescript
+ const result = await kling.extendVideo({
+   prompt: 'the camera continues to pan right',
+ })
  ```
 
- #### How to get your R2 Public URL (public mode only)
+ **Input: `ExtendVideoInput`**
 
- If using `mode: 'presigned'`, you can skip this — no public bucket access is needed.
+ ```typescript
+ {
+   prompt?: string
+   negative_prompt?: string
+ }
+ ```
 
- 1. Log in to the [Cloudflare dashboard](https://dash.cloudflare.com)
- 2. Go to **R2 Object Storage** in the left sidebar
- 3. Click on your bucket
- 4. Go to the **Settings** tab
- 5. Under **Public access**, click **Allow Access**
- 6. Cloudflare will provide a public URL like `https://<bucket>.<account-id>.r2.dev` — use this as your `R2_PUBLIC_URL`
- 7. (Optional) You can also connect a **Custom Domain** under the same section for a cleaner URL like `https://cdn.yourdomain.com`
+ ### Identify Face (Sync)
 
- Then call `configureStorage()` once at startup:
+ Detect faces in a video for lip-sync targeting. Returns immediately — no polling.
 
  ```typescript
- import { configureStorage } from 'getaiapi'
-
- // Read from environment variables
- configureStorage()
+ const result = await kling.identifyFace({
+   video_url: 'https://example.com/video.mp4',
+ })
 
- // Or pass config directly
- configureStorage({
-   accountId: 'your-account-id',
-   bucketName: 'your-bucket',
-   accessKeyId: 'your-key',
-   secretAccessKey: 'your-secret',
-   publicUrlBase: 'https://cdn.example.com', // optional
-   autoUpload: false, // optional
-   mode: 'public', // 'public' | 'presigned' (default: 'public')
-   presignExpiresIn: 3600, // presigned URL TTL in seconds (default: 3600)
+ console.log(result.session_id)
+ result.face_data.forEach(face => {
+   console.log(face.face_id, face.face_image, face.start_time, face.end_time)
  })
  ```
 
- ### Automatic Uploads in `generate()`
-
- Once storage is configured, any `Buffer`, `Blob`, `File`, or `ArrayBuffer` values in provider params are automatically uploaded to R2 and replaced with public URLs before the request is sent to the provider. This works recursively -- nested objects and arrays are traversed, so params like Kling's `elements[].frontal_image_url` are handled automatically. No code changes needed -- it just works.
+ **Input: `IdentifyFaceInput`**
 
  ```typescript
- import { generate, configureStorage } from 'getaiapi'
- import { readFileSync } from 'fs'
-
- configureStorage()
-
- const result = await generate({
-   model: 'gpt-image-1.5-edit',
-   image: readFileSync('./photo.jpg'), // Buffer uploaded to R2 automatically
-   prompt: 'add a rainbow in the sky',
- })
+ {
+   video_url?: string // mutually exclusive with video_id
+   video_id?: string // mutually exclusive with video_url
+ }
  ```
 
- To also re-upload URL strings through R2 (useful when providers can't access the original URL), pass `reupload: true` per-call:
+ ### Image Recognize (Sync)
+
+ Returns immediately — no polling.
 
  ```typescript
- const result = await generate({
-   model: 'kling-video-pro',
-   image: 'https://private-server.com/img.jpg',
-   prompt: 'animate this image',
-   options: { reupload: true },
+ const result = await kling.imageRecognize({
+   image: 'https://example.com/photo.jpg',
  })
  ```
 
- Or enable it globally with `autoUpload: true` in the storage config.
+ **Input: `ImageRecognizeInput`**
 
- ### Cleanup / Lifecycle
-
- Assets uploaded automatically via `generate()` use the `getaiapi-tmp/` key prefix. You can set a [Cloudflare R2 lifecycle rule](https://developers.cloudflare.com/r2/buckets/object-lifecycles/) to auto-expire objects under that prefix (e.g. delete after 24 hours) so ephemeral generation assets don't accumulate.
+ ```typescript
+ {
+   image: string // required
+ }
+ ```
 
- ### Standalone Upload / Delete
+ ## Output Types
492
539
 
493
- You can also use R2 storage directly:
540
+ All functions return typed results based on output modality:
494
541
 
495
542
  ```typescript
496
- import { uploadAsset, deleteAsset, configureStorage } from 'getaiapi'
543
+ // Video endpoints (textToVideo, imageToVideo, omniVideo, avatar, lipSync, effects, motionControl, extendVideo)
544
+ interface KlingVideoResult {
545
+ task_id: string
546
+ videos: Array<{ id: string; url: string; duration: string }>
547
+ }
497
548
 
498
- configureStorage()
549
+ // Image endpoints (imageGeneration, omniImage, virtualTryOn, referenceToImage, expandImage)
550
+ interface KlingImageResult {
551
+ task_id: string
552
+ images: Array<{ index: number; url: string }>
553
+ }
499
554
 
500
- // Upload a buffer
501
- const { url, key, size_bytes, content_type } = await uploadAsset(
502
- Buffer.from('hello world'),
503
- { contentType: 'text/plain', prefix: 'uploads' }
504
- )
505
- console.log(url) // https://cdn.example.com/uploads/a1b2c3d4-...
555
+ // Audio endpoints (tts, textToAudio)
556
+ interface KlingAudioResult {
557
+ task_id: string
558
+ audios: Array<{ id: string; url: string; url_mp3?: string; url_wav?: string; duration?: string; duration_mp3?: string; duration_wav?: string }>
559
+ }
506
560
 
507
- // Delete by key
508
- await deleteAsset(key)
509
- ```
561
+ // Multi-shot endpoint — 3 angle URLs per image
562
+ interface KlingMultiShotResult {
563
+ task_id: string
564
+ images: Array<{ index: number; url_1: string; url_2: string; url_3: string }>
565
+ }
510
566
 
511
- ### Presigned URLs (Private Buckets)
567
+ // Voice clone endpoint
568
+ interface KlingVoiceResult {
569
+ task_id: string
570
+ voices: Array<{ voice_id: string; voice_name: string; trial_url: string; owned_by: string }>
571
+ }
512
572
 
513
- If your R2 bucket doesn't have public read access, use presigned mode. Instead of returning a public URL, `uploadAsset` will return a time-limited presigned GET URL signed with S3 Signature V4.
573
+ // Video-to-audio endpoint: merged video + generated audio
574
+ interface KlingVideoAudioResult {
575
+ task_id: string
576
+ videos: Array<{ id: string; url: string; duration: string }>
577
+ audios: Array<{ id: string; url_mp3?: string; url_wav?: string; duration_mp3?: string; duration_wav?: string }>
578
+ }
514
579
 
515
- ```typescript
516
- configureStorage({
517
- accountId: 'your-account-id',
518
- bucketName: 'private-bucket',
519
- accessKeyId: 'your-key',
520
- secretAccessKey: 'your-secret',
521
- mode: 'presigned', // uploadAsset returns presigned URLs
522
- presignExpiresIn: 1800, // URLs expire after 30 minutes
523
- })
580
+ // Face detection (identifyFace) — sync, no task_id
581
+ interface KlingFaceResult {
582
+ session_id: string
583
+ face_data: Array<{ face_id: string; face_image: string; start_time: number; end_time: number }>
584
+ }
524
585
 
525
- const { url } = await uploadAsset(Buffer.from('secret data'), {
526
- contentType: 'application/octet-stream',
527
- })
528
- // url is a presigned GET URL, valid for 30 minutes
586
+ // Generic JSON (imageRecognize)
587
+ interface KlingJsonResult {
588
+ task_id: string
589
+ data: unknown
590
+ }
529
591
  ```
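As a minimal sketch of consuming one of these shapes, a small helper can pull the first rendered URL out of a `KlingVideoResult`. The interface is redeclared locally here purely so the example is self-contained; real code would use the SDK's own result typing, and `firstVideoUrl` is an illustrative name, not part of the package:

```typescript
// Redeclared from the table above for a self-contained example.
interface KlingVideoResult {
  task_id: string
  videos: Array<{ id: string; url: string; duration: string }>
}

// Hypothetical helper: first rendered video URL, or null when the
// task produced no videos.
function firstVideoUrl(result: KlingVideoResult): string | null {
  return result.videos[0]?.url ?? null
}

const sample: KlingVideoResult = {
  task_id: 'task-123',
  videos: [{ id: 'v1', url: 'https://cdn.example.com/out.mp4', duration: '5' }],
}
console.log(firstVideoUrl(sample)) // → https://cdn.example.com/out.mp4
```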
530
592
 
531
- You can also generate presigned URLs for existing objects:
532
-
533
- ```typescript
534
- import { presignAsset } from 'getaiapi'
593
+ ## Polling Control
535
594
 
536
- const url = presignAsset('uploads/my-file.png')
537
- // => https://<account>.r2.cloudflarestorage.com/<bucket>/uploads/my-file.png?X-Amz-Algorithm=...
595
+ All functions accept optional polling parameters:
538
596
 
539
- // Custom expiry per-call (overrides config default)
540
- const shortUrl = presignAsset('uploads/my-file.png', { expiresIn: 300 }) // 5 minutes
597
+ ```typescript
598
+ await kling.textToVideoV3Pro({
599
+ prompt: 'a sunset',
600
+ timeout: 600_000, // max wait time in ms (default: 300_000 = 5 min)
601
+ pollInterval: 5_000, // poll frequency in ms (default: 3_000)
602
+ })
541
603
  ```
542
604
 
543
- **UploadOptions**
605
+ Sync endpoints (`tts`, `imageRecognize`, `identifyFace`) return immediately regardless of these settings.
544
606
 
545
- | Option | Type | Description |
546
- |---|---|---|
547
- | `key` | `string` | Custom object key (default: auto-generated UUID) |
548
- | `contentType` | `string` | MIME type (default: detected from input or `application/octet-stream`) |
549
- | `prefix` | `string` | Key prefix / folder (e.g. `"uploads"`) |
550
- | `maxBytes` | `number` | Max upload size in bytes (default: 500 MB) |
607
+ ## Extra Parameters
551
608
 
552
- ### Storage Errors
609
+ All input types accept additional Kling-native fields via an index signature, so you can pass any parameter the Kling API supports:
553
610
 
554
611
  ```typescript
555
- import { StorageError } from 'getaiapi'
556
-
557
- try {
558
- await uploadAsset(buffer)
559
- } catch (err) {
560
- if (err instanceof StorageError) {
561
- console.error(err.operation) // 'upload' | 'delete' | 'config'
562
- console.error(err.statusCode) // HTTP status from R2, if applicable
563
- }
564
- }
612
+ await kling.textToVideoV3Pro({
613
+ prompt: 'a sunset',
614
+ camera_control: { type: 'simple', config: { horizontal: 5 } },
615
+ callback_url: 'https://example.com/webhook',
616
+ })
565
617
  ```
566
618
 
567
619
  ## Error Handling
568
620
 
569
- All errors extend `GetAIApiError` and can be caught uniformly or by type:
570
-
571
- | Error | When |
572
- |---|---|
573
- | `AuthError` | Missing or invalid API key for a provider |
574
- | `ModelNotFoundError` | Model name could not be resolved |
575
- | `ValidationError` | Invalid input parameters |
576
- | `ProviderError` | Provider returned an error response |
577
- | `TimeoutError` | Generation exceeded the timeout |
578
- | `RateLimitError` | Provider returned HTTP 429 |
579
- | `StorageError` | R2 upload, delete, or config failure |
580
-
581
621
  ```typescript
582
- import { generate, AuthError, ModelNotFoundError } from 'getaiapi'
622
+ import { kling, KlingAuthError, KlingTimeoutError, KlingTaskFailedError } from 'getaiapi'
583
623
 
584
624
  try {
585
- const result = await generate({ model: 'flux-schnell', prompt: 'a cat' })
625
+ await kling.textToVideoV3Pro({ prompt: 'test' })
586
626
  } catch (err) {
587
- if (err instanceof AuthError) {
588
- console.error(`Set ${err.envVar} to use ${err.provider}`)
627
+ if (err instanceof KlingAuthError) {
628
+ // Missing or invalid credentials
589
629
  }
590
- if (err instanceof ModelNotFoundError) {
591
- console.error(err.message) // includes "did you mean" suggestions
630
+ if (err instanceof KlingTimeoutError) {
631
+ // Task took too long (increase timeout)
632
+ }
633
+ if (err instanceof KlingTaskFailedError) {
634
+ // Kling rejected the task (content violation, bad params, etc.)
635
+ console.error(err.taskId, err.message)
592
636
  }
593
637
  }
594
638
  ```
595
639
 
596
- ## Migrating from v0.x
640
+ | Error | Code | When |
641
+ |-------|------|------|
642
+ | `KlingAuthError` | `AUTH_ERROR` | Missing credentials or 401 response |
643
+ | `KlingRateLimitError` | `RATE_LIMIT` | HTTP 429 or body codes 1100-1102 |
644
+ | `KlingApiError` | `API_ERROR` | Provider returned an error |
645
+ | `KlingTimeoutError` | `TIMEOUT` | Polling exceeded timeout |
646
+ | `KlingTaskFailedError` | `TASK_FAILED` | Task status is 'failed' |
597
647
 
598
- v1.0.0 replaces the category-based architecture with a modality-first design. Key changes:
648
+ All errors extend `KlingError`, which extends `Error`.
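Because everything shares the `KlingError` base, a single `instanceof` check can separate SDK failures from unrelated exceptions, with the code from the table driving routing. The classes below are a minimal local mirror so the sketch runs standalone (real code imports them from `getaiapi`), and reading the code off the instance as `.code` is an assumption of this sketch:

```typescript
// Minimal mirror of the documented hierarchy — illustration only.
class KlingError extends Error {
  constructor(message: string, readonly code: string) {
    super(message)
  }
}
class KlingRateLimitError extends KlingError {
  constructor(message: string) {
    super(message, 'RATE_LIMIT')
  }
}

// Route on the shared base class first, then on the error code.
function classify(err: unknown): string {
  if (err instanceof KlingError) {
    return err.code === 'RATE_LIMIT' ? 'retry-later' : 'fail'
  }
  return 'rethrow'
}

console.log(classify(new KlingRateLimitError('429'))) // → retry-later
console.log(classify(new Error('bug'))) // → rethrow
```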
599
649
 
600
- - `getModel()` is now `resolveModel()`
601
- - `listModels({ category: '...' })` is now `listModels({ input: '...', output: '...' })`
602
- - No more `readFileSync` -- works in edge runtimes without any bundler config
650
+ ## Deprecated: v1 Unified Gateway
603
651
 
604
- See the full [Migration Guide](docs/MIGRATION.md) for details.
652
+ The previous `generate()`, `submit()`, and `poll()` APIs and the multi-provider registry are deprecated but still exported for backward compatibility. They will be removed in the next major version.
605
653
 
606
- ## Documentation
654
+ ```typescript
655
+ // Deprecated — still works but will be removed
656
+ import { generate } from 'getaiapi'
657
+ await generate({ model: 'flux-schnell', prompt: '...' })
607
658
 
608
- Full documentation available at [interactive10.com/getaiapi.html](https://www.interactive10.com/getaiapi.html)
659
+ // New: use provider-specific typed functions
660
+ import { kling } from 'getaiapi'
661
+ await kling.textToVideoV3Pro({ prompt: '...' })
662
+ ```
609
663
 
610
664
  ## License
611
665