@runpod/ai-sdk-provider 0.11.0 → 0.12.0

package/CHANGELOG.md CHANGED
@@ -1,5 +1,37 @@
  # @runpod/ai-sdk-provider
 
+ ## 0.12.0
+
+ ### Minor Changes
+
+ - dcc2cc5: Add support for speech generation with the `resembleai/chatterbox-turbo` model:
+   - `speechModel()` and `speech()` methods for text-to-speech
+   - Voice cloning via URL (5-10 seconds of audio)
+   - 20 built-in voices
+
+ ### Patch Changes
+
+ - ace58c2: Add comprehensive documentation for Pruna and Nano Banana Pro models, including all supported aspect ratios, resolutions, and output formats. Update examples to use standard AI SDK options where possible.
+
+ ## 0.11.1
+
+ ### Patch Changes
+
+ - f6115ac: Fix Pruna and Nano Banana Pro model support for all aspect ratios:
+
+   Pruna models:
+   - Skip standard size/aspectRatio validation
+   - Support all t2i aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, custom
+   - Support all edit aspect ratios: match_input_image, 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3
+   - Support custom width/height for t2i (256-1440, must be a multiple of 16)
+   - Support 1-5 images for edit
+
+   Nano Banana Pro model:
+   - Skip standard size/aspectRatio validation
+   - Support all aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3, 21:9, 9:21
+   - Support resolution: 1k, 2k, 4k
+   - Support output_format: jpeg, png, webp
+
  ## 0.11.0
 
  ### Minor Changes
package/README.md CHANGED
@@ -224,24 +224,69 @@ writeFileSync('landscape.jpg', image.uint8Array);
 
  ### Model Capabilities
 
- | Model ID | Description | Supported Aspect Ratios |
- | -------------------------------------- | ------------------------------- | ------------------------------------- |
- | `bytedance/seedream-3.0` | Advanced text-to-image model | 1:1, 4:3, 3:4 |
- | `bytedance/seedream-4.0` | Text-to-image (v4) | 1:1 (supports 1024, 2048, 4096) |
- | `bytedance/seedream-4.0-edit` | Image editing (v4, multi-image) | 1:1 (supports 1024, 1536, 2048, 4096) |
- | `black-forest-labs/flux-1-schnell` | Fast image generation (4 steps) | 1:1, 4:3, 3:4 |
- | `black-forest-labs/flux-1-dev` | High-quality image generation | 1:1, 4:3, 3:4 |
- | `black-forest-labs/flux-1-kontext-dev` | Context-aware image generation | 1:1, 4:3, 3:4 |
- | `qwen/qwen-image` | Text-to-image generation | 1:1, 4:3, 3:4 |
- | `qwen/qwen-image-edit` | Image editing (prompt-guided) | 1:1, 4:3, 3:4 |
- | `nano-banana-edit` | Image editing (multi-image) | 1:1, 4:3, 3:4 |
- | `google/nano-banana-pro-edit` | Image editing (Gemini-powered) | Uses resolution param (1k, 2k) |
- | `pruna/p-image-t2i` | Pruna text-to-image | 1:1, 16:9, 9:16, 4:3, 3:4, etc. |
- | `pruna/p-image-edit` | Pruna image editing | match_input_image, 1:1, 16:9, etc. |
-
- **Note**: The provider uses strict validation for image parameters. Unsupported aspect ratios (like `16:9`, `9:16`, `3:2`, `2:3`) will throw an `InvalidArgumentError` with a clear message about supported alternatives.
+ | Model ID | Type |
+ | -------------------------------------- | ---- |
+ | `bytedance/seedream-3.0` | t2i |
+ | `bytedance/seedream-4.0` | t2i |
+ | `bytedance/seedream-4.0-edit` | edit |
+ | `black-forest-labs/flux-1-schnell` | t2i |
+ | `black-forest-labs/flux-1-dev` | t2i |
+ | `black-forest-labs/flux-1-kontext-dev` | edit |
+ | `qwen/qwen-image` | t2i |
+ | `qwen/qwen-image-edit` | edit |
+ | `nano-banana-edit` | edit |
+ | `google/nano-banana-pro-edit` | edit |
+ | `pruna/p-image-t2i` | t2i |
+ | `pruna/p-image-edit` | edit |
+
+ For the full list of models, see the [Runpod Public Endpoint Reference](https://docs.runpod.io/hub/public-endpoint-reference).
+
+ ### Pruna Models
+
+ Supported models: `pruna/p-image-t2i`, `pruna/p-image-edit`
+
+ | Parameter | Supported Values | Notes |
+ | :---------------------------------------- | :------------------------------------------------ | :------------------------------------------------------ |
+ | `aspectRatio` | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3` | Standard AI SDK parameter |
+ | `aspectRatio` (t2i only) | `custom` | Requires `width` & `height` in providerOptions |
+ | `providerOptions.runpod.width` / `height` | `256` - `1440` | Custom dimensions (t2i only). Must be a multiple of 16. |
+ | `providerOptions.runpod.images` | `string[]` | Required for `p-image-edit`. Supports 1-5 images. |
+
+ **Example: Custom Resolution (t2i)**
 
- **Note:** This list is not complete. For a full list of all available models, see the [Runpod Public Endpoint Reference](https://docs.runpod.io/hub/public-endpoint-reference).
+ ```ts
+ const { image } = await generateImage({
+   model: runpod.imageModel('pruna/p-image-t2i'),
+   prompt: 'A robot',
+   providerOptions: {
+     runpod: {
+       aspect_ratio: 'custom',
+       width: 512,
+       height: 768,
+     },
+   },
+ });
+ ```
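+
+ **Example: Editing (p-image-edit)**
+
+ A minimal edit sketch along the same lines (the input image URL below is a placeholder; pass 1-5 of your own images):
+
+ ```ts
+ const { image } = await generateImage({
+   model: runpod.imageModel('pruna/p-image-edit'),
+   prompt: 'Turn the sky into a dramatic sunset',
+   providerOptions: {
+     runpod: {
+       aspect_ratio: 'match_input_image',
+       // 1-5 input image URLs (placeholder shown)
+       images: ['https://example.com/input.jpg'],
+     },
+   },
+ });
+ ```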
+
+ ### Google Models
+
+ #### Nano Banana Pro
+
+ Supported model: `google/nano-banana-pro-edit`
+
+ | Parameter | Supported Values | Notes |
+ | :-------------------------------------- | :---------------------------------------------------------------- | :-------------------------------- |
+ | `aspectRatio` | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`, `21:9`, `9:21` | Standard AI SDK parameter |
+ | `providerOptions.runpod.resolution` | `1k`, `2k`, `4k` | Output resolution quality |
+ | `providerOptions.runpod.output_format` | `jpeg`, `png`, `webp` | Output image format |
+ | `providerOptions.runpod.images` | `string[]` | Required. Input image(s) to edit. |
+
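+ For example, an edit call might look like this (the input image URL is a placeholder):
+
+ ```ts
+ const { image } = await generateImage({
+   model: runpod.imageModel('google/nano-banana-pro-edit'),
+   prompt: 'Replace the background with a beach at golden hour',
+   aspectRatio: '16:9',
+   providerOptions: {
+     runpod: {
+       images: ['https://example.com/photo.jpg'], // placeholder input image
+       resolution: '2k',
+       output_format: 'png',
+     },
+   },
+ });
+ ```
+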
+ ### Other Models
+
+ Most other models (Flux, Seedream, Qwen, etc.) support the standard `1:1`, `4:3`, and `3:4` aspect ratios.
+
+ - **Flux models**: Support `num_inference_steps` and `guidance` settings (see the sketch below).
+ - **Edit models**: Require an input image via `providerOptions.runpod.image` (single) or `images` (multiple).
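+
+ A minimal Flux sketch combining these settings (28 steps and guidance 2 are the documented defaults for non-schnell Flux models; the prompt is illustrative):
+
+ ```ts
+ const { image } = await generateImage({
+   model: runpod.imageModel('black-forest-labs/flux-1-dev'),
+   prompt: 'A lighthouse in a storm',
+   providerOptions: {
+     runpod: {
+       num_inference_steps: 28, // denoising steps
+       guidance: 2,             // prompt adherence strength
+     },
+   },
+ });
+ ```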
 
  ### Advanced Parameters
 
@@ -352,24 +397,122 @@ const { image } = await generateImage({
 
  ### Provider Options
 
- Runpod image models support flexible provider options through the `providerOptions.runpod` object:
-
- | Option | Type | Default | Description |
- | ------------------------ | ---------- | ------- | ------------------------------------------------------------------------ |
- | `negative_prompt` | `string` | `""` | Text describing what you don't want in the image |
- | `enable_safety_checker` | `boolean` | `true` | Enable content safety filtering |
- | `disable_safety_checker` | `boolean` | `false` | Disable safety checker (Pruna models) |
- | `image` | `string` | - | Single input image: URL or base64 data URI (Flux Kontext) |
- | `images` | `string[]` | - | Multiple input images (e.g., for `nano-banana-edit` multi-image editing) |
- | `aspect_ratio` | `string` | `"1:1"` | Aspect ratio string (Pruna: "16:9", "match_input_image", etc.) |
- | `resolution` | `string` | `"1k"` | Output resolution (Nano Banana Pro: "1k", "2k") |
- | `num_inference_steps` | `number` | Auto | Number of denoising steps (Flux: 4 for schnell, 28 for others) |
- | `guidance` | `number` | Auto | Guidance scale for prompt adherence (Flux: 7 for schnell, 2 for others) |
- | `output_format` | `string` | `"png"` | Output image format ("png", "jpg", or "jpeg") |
- | `enable_base64_output` | `boolean` | `false` | Return base64 instead of URL (Nano Banana Pro) |
- | `enable_sync_mode` | `boolean` | `false` | Enable synchronous mode (some models) |
- | `maxPollAttempts` | `number` | `60` | Maximum polling attempts for async generation |
- | `pollIntervalMillis` | `number` | `5000` | Polling interval in milliseconds (5 seconds) |
+ Use `providerOptions.runpod` for model-specific parameters:
+
+ | Option | Type | Default | Description |
+ | ------------------------ | ---------- | ------- | ----------------------------------------------- |
+ | `negative_prompt` | `string` | `""` | What to avoid in the image |
+ | `enable_safety_checker` | `boolean` | `true` | Content safety filtering |
+ | `disable_safety_checker` | `boolean` | `false` | Disable safety checker (Pruna) |
+ | `image` | `string` | - | Input image URL or base64 (Flux Kontext) |
+ | `images` | `string[]` | - | Multiple input images (edit models) |
+ | `resolution` | `string` | `"1k"` | Output resolution: 1k, 2k, 4k (Nano Banana Pro) |
+ | `width` / `height` | `number` | - | Custom dimensions (Pruna t2i, 256-1440) |
+ | `num_inference_steps` | `number` | Auto | Denoising steps |
+ | `guidance` | `number` | Auto | Prompt adherence strength |
+ | `output_format` | `string` | `"png"` | Output format: png, jpg, jpeg, webp |
+ | `maxPollAttempts` | `number` | `60` | Max polling attempts |
+ | `pollIntervalMillis` | `number` | `5000` | Polling interval (ms) |
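+
+ For example, to allow more time for a slow asynchronous job (the model and prompt are illustrative):
+
+ ```ts
+ const { image } = await generateImage({
+   model: runpod.imageModel('bytedance/seedream-4.0'),
+   prompt: 'A robot',
+   providerOptions: {
+     runpod: {
+       negative_prompt: 'blurry, low quality',
+       maxPollAttempts: 120,      // poll up to 120 times
+       pollIntervalMillis: 2000,  // every 2 seconds
+     },
+   },
+ });
+ ```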
+
+ ## Speech
+
+ You can generate speech using the AI SDK's `experimental_generateSpeech` and a Runpod speech model created via `runpod.speechModel()` (or the shorthand `runpod.speech()`).
+
+ ### Basic Usage
+
+ ```ts
+ import { writeFileSync } from 'fs';
+ import { runpod } from '@runpod/ai-sdk-provider';
+ import { experimental_generateSpeech as generateSpeech } from 'ai';
+
+ const result = await generateSpeech({
+   model: runpod.speechModel('resembleai/chatterbox-turbo'),
+   text: 'Hello, this is Chatterbox Turbo running on Runpod.',
+   voice: 'lucy',
+ });
+
+ // Save to filesystem:
+ writeFileSync('speech.wav', result.audio.uint8Array);
+ ```
+
+ **Returns:**
+
+ - `result.audio.uint8Array` - Binary audio data (efficient for processing/saving)
+ - `result.audio.base64` - Base64 encoded audio (useful for web embedding)
+ - `result.audio.mediaType` - MIME type (e.g. `audio/wav`)
+ - `result.audio.format` - Format (e.g. `wav`)
+ - `result.warnings` - Array of any warnings about unsupported parameters
+ - `result.providerMetadata.runpod.audioUrl` - Public URL to the generated audio
+ - `result.providerMetadata.runpod.cost` - Cost information (if available)
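+
+ For example, `base64` and `mediaType` can be combined into a data URI for browser playback (a minimal sketch):
+
+ ```ts
+ // Usable as the src of an HTML <audio> element:
+ const dataUri = `data:${result.audio.mediaType};base64,${result.audio.base64}`;
+ ```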
+
+ ### Supported Models
+
+ Supported model: `resembleai/chatterbox-turbo`
+
+ ### Parameters
+
+ | Parameter | Type | Default | Description |
+ | --------- | -------- | -------- | ---------------------------------------- |
+ | `text` | `string` | - | Required. The text to convert to speech. |
+ | `voice` | `string` | `"lucy"` | Built-in voice name (see list below). |
+
+ ### Provider Options
+
+ Use `providerOptions.runpod` for model-specific parameters:
+
+ | Option | Type | Default | Description |
+ | ----------- | -------- | ------- | ------------------------------------------- |
+ | `voice_url` | `string` | - | URL to audio file (5–10s) for voice cloning |
+ | `voiceUrl` | `string` | - | Alias for `voice_url` |
+
+ > Note: If `voice_url` is provided, the built-in `voice` is ignored.
+ >
+ > Note: This speech endpoint currently returns WAV only; `outputFormat` is ignored.
+
+ ### Voices
+
+ `voice` selects one of the built-in voices (default: `lucy`):
+
+ ```ts
+ [
+   'aaron',
+   'abigail',
+   'anaya',
+   'andy',
+   'archer',
+   'brian',
+   'chloe',
+   'dylan',
+   'emmanuel',
+   'ethan',
+   'evelyn',
+   'gavin',
+   'gordon',
+   'ivan',
+   'laura',
+   'lucy',
+   'madison',
+   'marisol',
+   'meera',
+   'walter',
+ ];
+ ```
+
+ ### Voice cloning (via URL)
+
+ You can provide a `voice_url` (5–10s audio) through `providerOptions.runpod`:
+
+ ```ts
+ const result = await generateSpeech({
+   model: runpod.speech('resembleai/chatterbox-turbo'),
+   text: 'Hello!',
+   providerOptions: {
+     runpod: {
+       voice_url: 'https://example.com/voice.wav',
+     },
+   },
+ });
+ ```
 
  ## About Runpod
 
package/dist/index.d.mts CHANGED
@@ -1,4 +1,4 @@
- import { LanguageModelV2, ImageModelV2 } from '@ai-sdk/provider';
+ import { LanguageModelV2, ImageModelV2, SpeechModelV2 } from '@ai-sdk/provider';
  import { FetchFunction } from '@ai-sdk/provider-utils';
  export { OpenAICompatibleErrorData as RunpodErrorData } from '@ai-sdk/openai-compatible';
  import { z } from 'zod';
@@ -44,6 +44,14 @@ interface RunpodProvider {
    Creates an image model for image generation.
    */
    imageModel(modelId: string): ImageModelV2;
+   /**
+   Creates a speech model for speech generation.
+   */
+   speechModel(modelId: string): SpeechModelV2;
+   /**
+   Creates a speech model for speech generation.
+   */
+   speech(modelId: string): SpeechModelV2;
  }
  declare function createRunpod(options?: RunpodProviderSettings): RunpodProvider;
  declare const runpod: RunpodProvider;
package/dist/index.d.ts CHANGED
@@ -1,4 +1,4 @@
- import { LanguageModelV2, ImageModelV2 } from '@ai-sdk/provider';
+ import { LanguageModelV2, ImageModelV2, SpeechModelV2 } from '@ai-sdk/provider';
  import { FetchFunction } from '@ai-sdk/provider-utils';
  export { OpenAICompatibleErrorData as RunpodErrorData } from '@ai-sdk/openai-compatible';
  import { z } from 'zod';
@@ -44,6 +44,14 @@ interface RunpodProvider {
    Creates an image model for image generation.
    */
    imageModel(modelId: string): ImageModelV2;
+   /**
+   Creates a speech model for speech generation.
+   */
+   speechModel(modelId: string): SpeechModelV2;
+   /**
+   Creates a speech model for speech generation.
+   */
+   speech(modelId: string): SpeechModelV2;
  }
  declare function createRunpod(options?: RunpodProviderSettings): RunpodProvider;
  declare const runpod: RunpodProvider;
package/dist/index.js CHANGED
@@ -27,7 +27,7 @@ module.exports = __toCommonJS(index_exports);
 
  // src/runpod-provider.ts
  var import_openai_compatible = require("@ai-sdk/openai-compatible");
- var import_provider_utils3 = require("@ai-sdk/provider-utils");
+ var import_provider_utils4 = require("@ai-sdk/provider-utils");
 
  // src/runpod-image-model.ts
  var import_provider_utils2 = require("@ai-sdk/provider-utils");
@@ -115,8 +115,12 @@ var RunpodImageModel = class {
      abortSignal
    }) {
      const warnings = [];
+     const isPrunaModel = this.modelId.includes("pruna") || this.modelId.includes("p-image");
+     const isNanoBananaProModel = this.modelId.includes("nano-banana-pro");
      let runpodSize;
-     if (size) {
+     if (isPrunaModel || isNanoBananaProModel) {
+       runpodSize = aspectRatio || "1:1";
+     } else if (size) {
        const runpodSizeCandidate = size.replace("x", "*");
        if (!SUPPORTED_SIZES.has(runpodSizeCandidate)) {
          throw new import_provider.InvalidArgumentError({
@@ -150,7 +154,8 @@ var RunpodImageModel = class {
        prompt,
        runpodSize,
        seed,
-       providerOptions.runpod
+       providerOptions.runpod,
+       aspectRatio
      );
      const { value: response, responseHeaders } = await (0, import_provider_utils2.postJsonToApi)({
        url: `${this.config.baseURL}/runsync`,
@@ -264,7 +269,7 @@ var RunpodImageModel = class {
        `Image generation timed out after ${maxAttempts} attempts (${maxAttempts * pollInterval / 1e3}s)`
      );
    }
-   buildInputPayload(prompt, runpodSize, seed, runpodOptions) {
+   buildInputPayload(prompt, runpodSize, seed, runpodOptions, aspectRatio) {
      const isFluxModel = this.modelId.includes("flux") || this.modelId.includes("black-forest-labs");
      if (isFluxModel) {
        const isKontext = this.modelId.includes("kontext");
@@ -300,50 +305,56 @@ var RunpodImageModel = class {
      if (isPrunaModel) {
        const isPrunaEdit = this.modelId.includes("edit");
        if (isPrunaEdit) {
-         return {
+         const editPayload = {
            prompt,
-           seed: seed ?? -1,
-           aspect_ratio: runpodOptions?.aspect_ratio ?? "match_input_image",
-           disable_safety_checker: runpodOptions?.disable_safety_checker ?? false,
-           enable_sync_mode: runpodOptions?.enable_sync_mode ?? false,
-           ...runpodOptions
+           aspect_ratio: runpodOptions?.aspect_ratio ?? aspectRatio ?? "1:1",
+           disable_safety_checker: runpodOptions?.disable_safety_checker ?? false
          };
+         if (seed !== void 0) {
+           editPayload.seed = seed;
+         } else if (runpodOptions?.seed !== void 0) {
+           editPayload.seed = runpodOptions.seed;
+         }
+         if (runpodOptions?.images) {
+           editPayload.images = runpodOptions.images;
+         }
+         return editPayload;
        } else {
-         const aspectRatioMap = {
-           "1328*1328": "1:1",
-           "1472*1140": "4:3",
-           "1140*1472": "3:4",
-           "512*512": "1:1",
-           "768*768": "1:1",
-           "1024*1024": "1:1",
-           "1536*1536": "1:1",
-           "2048*2048": "1:1",
-           "4096*4096": "1:1",
-           "512*768": "2:3",
-           "768*512": "3:2",
-           "1024*768": "4:3",
-           "768*1024": "3:4"
-         };
-         const aspectRatio = runpodOptions?.aspect_ratio ?? aspectRatioMap[runpodSize] ?? "1:1";
-         return {
+         const t2iPayload = {
            prompt,
-           seed: seed ?? 0,
-           aspect_ratio: aspectRatio,
-           enable_safety_checker: runpodOptions?.enable_safety_checker ?? true,
-           ...runpodOptions
+           aspect_ratio: runpodOptions?.aspect_ratio ?? aspectRatio ?? "1:1",
+           disable_safety_checker: runpodOptions?.disable_safety_checker ?? false
          };
+         if (seed !== void 0) {
+           t2iPayload.seed = seed;
+         } else if (runpodOptions?.seed !== void 0) {
+           t2iPayload.seed = runpodOptions.seed;
+         }
+         if (t2iPayload.aspect_ratio === "custom") {
+           if (runpodOptions?.width) {
+             t2iPayload.width = runpodOptions.width;
+           }
+           if (runpodOptions?.height) {
+             t2iPayload.height = runpodOptions.height;
+           }
+         }
+         return t2iPayload;
        }
      }
      const isNanaBananaProModel = this.modelId.includes("nano-banana-pro");
      if (isNanaBananaProModel) {
-       return {
+       const nanoBananaPayload = {
          prompt,
+         aspect_ratio: runpodOptions?.aspect_ratio ?? aspectRatio ?? "1:1",
          resolution: runpodOptions?.resolution ?? "1k",
          output_format: runpodOptions?.output_format ?? "jpeg",
          enable_base64_output: runpodOptions?.enable_base64_output ?? false,
-         enable_sync_mode: runpodOptions?.enable_sync_mode ?? false,
-         ...runpodOptions
+         enable_sync_mode: runpodOptions?.enable_sync_mode ?? false
        };
+       if (runpodOptions?.images) {
+         nanoBananaPayload.images = runpodOptions.images;
+       }
+       return nanoBananaPayload;
      }
      return {
        prompt,
@@ -381,6 +392,148 @@ var runpodImageStatusSchema = import_zod2.z.object({
    // Error message if FAILED
  });
 
+ // src/runpod-speech-model.ts
+ var import_provider_utils3 = require("@ai-sdk/provider-utils");
+ function isRecord(value) {
+   return typeof value === "object" && value !== null;
+ }
+ function replaceNewlinesWithSpaces(value) {
+   return value.replace(/[\r\n]+/g, " ");
+ }
+ var RunpodSpeechModel = class {
+   constructor(modelId, config) {
+     this.modelId = modelId;
+     this.config = config;
+     this.specificationVersion = "v2";
+   }
+   get provider() {
+     return this.config.provider;
+   }
+   getRunpodRunSyncUrl() {
+     const baseURL = (0, import_provider_utils3.withoutTrailingSlash)(this.config.baseURL) ?? this.config.baseURL;
+     if (baseURL.endsWith("/run") || baseURL.endsWith("/runsync")) {
+       return baseURL;
+     }
+     return `${baseURL}/runsync`;
+   }
+   async doGenerate(options) {
+     const currentDate = this.config._internal?.currentDate?.() ?? /* @__PURE__ */ new Date();
+     const warnings = [];
+     const {
+       text,
+       voice,
+       outputFormat,
+       instructions,
+       speed,
+       language,
+       providerOptions,
+       abortSignal,
+       headers
+     } = options;
+     if (outputFormat != null && outputFormat !== "wav") {
+       warnings.push({
+         type: "unsupported-setting",
+         setting: "outputFormat",
+         details: `Unsupported outputFormat: ${outputFormat}. This endpoint returns 'wav'.`
+       });
+     }
+     if (instructions != null) {
+       warnings.push({
+         type: "unsupported-setting",
+         setting: "instructions",
+         details: `Instructions are not supported by this speech endpoint.`
+       });
+     }
+     if (speed != null) {
+       warnings.push({
+         type: "unsupported-setting",
+         setting: "speed",
+         details: `Speed is not supported by this speech endpoint.`
+       });
+     }
+     if (language != null) {
+       warnings.push({
+         type: "unsupported-setting",
+         setting: "language",
+         details: `Language selection is not supported by this speech endpoint.`
+       });
+     }
+     const runpodProviderOptions = isRecord(providerOptions) ? providerOptions.runpod : void 0;
+     const voiceUrl = isRecord(runpodProviderOptions) && (typeof runpodProviderOptions.voice_url === "string" || typeof runpodProviderOptions.voiceUrl === "string") ? runpodProviderOptions.voice_url ?? runpodProviderOptions.voiceUrl ?? void 0 : void 0;
+     const input = {
+       prompt: replaceNewlinesWithSpaces(text)
+     };
+     if (voiceUrl) {
+       input.voice_url = voiceUrl;
+     } else if (voice) {
+       input.voice = voice;
+     }
+     const requestBody = { input };
+     const url = this.getRunpodRunSyncUrl();
+     const fetchFn = this.config.fetch ?? fetch;
+     const requestHeaders = {
+       "Content-Type": "application/json",
+       ...this.config.headers()
+     };
+     if (headers) {
+       for (const [key, value] of Object.entries(headers)) {
+         if (value != null) {
+           requestHeaders[key] = value;
+         }
+       }
+     }
+     const response = await fetchFn(url, {
+       method: "POST",
+       headers: requestHeaders,
+       body: JSON.stringify(requestBody),
+       signal: abortSignal
+     });
+     const responseHeaders = Object.fromEntries(response.headers.entries());
+     const rawBodyText = await response.text();
+     let parsed = void 0;
+     try {
+       parsed = rawBodyText ? JSON.parse(rawBodyText) : void 0;
+     } catch {
+     }
+     if (!response.ok) {
+       const message = parsed && typeof parsed.error === "string" && parsed.error || rawBodyText || `HTTP ${response.status}`;
+       throw new Error(`Runpod speech request failed: ${message}`);
+     }
+     const output = parsed?.output ?? parsed;
+     const audioUrl = output?.audio_url;
+     if (typeof audioUrl !== "string" || audioUrl.length === 0) {
+       throw new Error("Runpod speech response did not include an audio_url.");
+     }
+     const audioResponse = await fetchFn(audioUrl, { signal: abortSignal });
+     if (!audioResponse.ok) {
+       throw new Error(
+         `Failed to download generated audio (${audioResponse.status}).`
+       );
+     }
+     const audio = new Uint8Array(await audioResponse.arrayBuffer());
+     const providerMetadata = {
+       runpod: {
+         audioUrl,
+         ...typeof output?.cost === "number" ? { cost: output.cost } : {}
+       }
+     };
+     return {
+       audio,
+       warnings,
+       request: {
+         body: JSON.stringify(requestBody)
+       },
+       response: {
+         timestamp: currentDate,
+         modelId: this.modelId,
+         headers: responseHeaders,
+         body: rawBodyText
+       },
+       providerMetadata
+     };
+   }
+ };
+
  // src/runpod-provider.ts
  var MODEL_ID_TO_ENDPOINT_URL = {
    "qwen/qwen3-32b-awq": "https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1",
@@ -408,6 +561,9 @@ var IMAGE_MODEL_ID_TO_ENDPOINT_URL = {
    "pruna/p-image-t2i": "https://api.runpod.ai/v2/p-image-t2i",
    "pruna/p-image-edit": "https://api.runpod.ai/v2/p-image-edit"
  };
+ var SPEECH_MODEL_ID_TO_ENDPOINT_URL = {
+   "resembleai/chatterbox-turbo": "https://api.runpod.ai/v2/chatterbox-turbo/"
+ };
  var MODEL_ID_TO_OPENAI_NAME = {
    "qwen/qwen3-32b-awq": "Qwen/Qwen3-32B-AWQ",
    "deepcogito/cogito-671b-v2.1-fp8": "deepcogito/cogito-671b-v2.1-FP8",
@@ -417,9 +573,26 @@ function deriveEndpointURL(modelId) {
    const normalized = modelId.replace(/\//g, "-");
    return `https://api.runpod.ai/v2/${normalized}/openai/v1`;
  }
+ function parseRunpodConsoleEndpointId(modelIdOrUrl) {
+   if (!modelIdOrUrl.startsWith("http")) {
+     return null;
+   }
+   try {
+     const url = new URL(modelIdOrUrl);
+     if (url.hostname !== "console.runpod.io") {
+       return null;
+     }
+     const parts = url.pathname.split("/").filter(Boolean);
+     const idx = parts.lastIndexOf("endpoint");
+     const endpointId = idx !== -1 ? parts[idx + 1] : void 0;
+     return endpointId || null;
+   } catch {
+     return null;
+   }
+ }
  function createRunpod(options = {}) {
    const getHeaders = () => ({
-     Authorization: `Bearer ${(0, import_provider_utils3.loadApiKey)({
+     Authorization: `Bearer ${(0, import_provider_utils4.loadApiKey)({
        apiKey: options.apiKey,
        environmentVariableName: "RUNPOD_API_KEY",
        description: "Runpod"
@@ -449,7 +622,7 @@ function createRunpod(options = {}) {
    }
    return {
      provider: `runpod.${modelType}`,
-     url: ({ path }) => `${(0, import_provider_utils3.withoutTrailingSlash)(baseURL)}${path}`,
+     url: ({ path }) => `${(0, import_provider_utils4.withoutTrailingSlash)(baseURL)}${path}`,
      headers: getHeaders,
      fetch: runpodFetch
    };
@@ -482,11 +655,25 @@ function createRunpod(options = {}) {
      fetch: options.fetch
    });
  };
+ const createSpeechModel = (modelId) => {
+   const endpointIdFromConsole = parseRunpodConsoleEndpointId(modelId);
+   const normalizedModelId = endpointIdFromConsole ?? modelId;
+   const mappedBaseURL = SPEECH_MODEL_ID_TO_ENDPOINT_URL[normalizedModelId];
+   const baseURL = mappedBaseURL ?? (normalizedModelId.startsWith("http") ? normalizedModelId : `https://api.runpod.ai/v2/${normalizedModelId}`);
+   return new RunpodSpeechModel(normalizedModelId, {
+     provider: "runpod.speech",
+     baseURL,
+     headers: getHeaders,
+     fetch: runpodFetch
+   });
+ };
  const provider = (modelId) => createChatModel(modelId);
  provider.completionModel = createCompletionModel;
  provider.languageModel = createChatModel;
  provider.chatModel = createChatModel;
  provider.imageModel = createImageModel;
+ provider.speechModel = createSpeechModel;
+ provider.speech = createSpeechModel;
  return provider;
  }
  var runpod = createRunpod();