@mux/ai 0.1.6 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,36 +1,43 @@
- # @mux/ai 📼 🤝 🤖
+ # `@mux/ai` 📼 🤝 🤖
 
- A typescript library for connecting videos in your Mux account to multi-modal LLMs.
+ [![npm version](https://badge.fury.io/js/@mux%2Fai.svg)](https://www.npmjs.com/package/@mux/ai)
+ [![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
 
- `@mux/ai` contains two abstractions:
+ > **A TypeScript SDK for building AI-driven video workflows on the server, powered by [Mux](https://www.mux.com)!**
 
- **Workflows** are production-ready functions that handle common video<->LLM tasks. Each workflow orchestrates the entire process: fetching video data from Mux (transcripts, thumbnails, storyboards), formatting it for AI providers, and returning structured results. Use workflows when you need battle-tested solutions for tasks like summarization, content moderation, chapter generation, or translation.
+ `@mux/ai` does this by providing:
+ - Easy-to-use, purpose-driven, cost-effective, configurable **_workflow functions_** that integrate with a variety of popular AI/LLM providers (OpenAI, Anthropic, Google).
+   - **Examples:** [`generateChapters`](#chapter-generation), [`getModerationScores`](#content-moderation), [`generateVideoEmbeddings`](#video-search-with-embeddings), [`getSummaryAndTags`](#video-summarization)
+ - Convenient, parameterized, commonly needed **_primitive functions_** backed by [Mux Video](https://www.mux.com/video-api) for building your own media-based AI workflows and integrations.
+   - **Examples:** `getStoryboardUrl`, `chunkVTTCues`, `fetchTranscriptForAsset`
 
- **Primitives** are the low-level building blocks that workflows are composed from. They provide direct access to Mux video data (transcripts, storyboards, thumbnails) and utilities for chunking and processing text. Use primitives when you need complete control over your AI prompts or want to build custom workflows not covered by the pre-built options.
+ # Usage
 
- ## Available pre-built workflows
+ ```ts
+ import { getSummaryAndTags } from "@mux/ai/workflows";
+
+ const result = await getSummaryAndTags("your-asset-id", {
+   provider: "openai",
+   tone: "professional",
+   includeTranscript: true
+ });
+
+ console.log(result.title); // "Getting Started with TypeScript"
+ console.log(result.description); // "A comprehensive guide to..."
+ console.log(result.tags); // ["typescript", "tutorial", "programming"]
+ ```
+
+ > **⚠️ Important:** Many workflows rely on video transcripts for best results. Consider enabling [auto-generated captions](https://www.mux.com/docs/guides/add-autogenerated-captions-and-use-transcripts) on your Mux assets to unlock the full potential of transcript-based workflows like summarization, chapters, and embeddings.
+
+ # Quick Start
+
+ ## Prerequisites
+
+ - [Node.js](https://nodejs.org/en/download) (≥ 21.0.0)
+ - A Mux account and necessary [credentials](#credentials---mux) for your environment (sign up [here](https://dashboard.mux.com/signup) for free!)
+ - Accounts and [credentials](#credentials---ai-providers) for any AI providers you intend to use for your workflows
+ - (For some workflows only) AWS S3 and [other credentials](#credentials---cloud-infrastructure)
 
- | Workflow                                                                 | Description                                                       | Providers                 | Default Models                                                     |
- | ------------------------------------------------------------------------ | ----------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------ |
- | [`getSummaryAndTags`](./docs/WORKFLOWS.md#video-summarization)           | Generate titles, descriptions, and tags for an asset              | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash`              |
- | [`getModerationScores`](./docs/WORKFLOWS.md#content-moderation)          | Detect inappropriate (sexual or violent) content in an asset      | OpenAI, Hive              | `omni-moderation-latest` (OpenAI) or Hive visual moderation task   |
- | [`hasBurnedInCaptions`](./docs/WORKFLOWS.md#burned-in-caption-detection) | Detect burned-in captions (hardcoded subtitles) in an asset       | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash`              |
- | [`generateChapters`](./docs/WORKFLOWS.md#chapter-generation)             | Generate chapter markers for an asset using the transcript        | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash`              |
- | [`generateVideoEmbeddings`](./docs/WORKFLOWS.md#video-embeddings)        | Generate vector embeddings for an asset's transcript chunks       | OpenAI, Google            | `text-embedding-3-small` (OpenAI), `gemini-embedding-001` (Google) |
- | [`translateCaptions`](./docs/WORKFLOWS.md#caption-translation)           | Translate an asset's captions into different languages            | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash`              |
- | [`translateAudio`](./docs/WORKFLOWS.md#audio-dubbing)                    | Create AI-dubbed audio tracks in different languages for an asset | ElevenLabs only           | ElevenLabs Dubbing API                                             |
-
- ## Features
-
- - **Cost-Effective by Default**: Uses affordable frontier models like `gpt-5-mini`, `claude-sonnet-4-5`, and `gemini-2.5-flash` to keep analysis costs low while maintaining high quality results
- - **Multi-modal Analysis**: Combines storyboard images with video transcripts
- - **Tone Control**: Normal, sassy, or professional analysis styles
- - **Prompt Customization**: Override specific prompt sections to tune workflows to your use case
- - **Configurable Thresholds**: Custom sensitivity levels for content moderation
- - **TypeScript**: Fully typed for excellent developer experience
- - **Provider Choice**: Switch between OpenAI, Anthropic, and Google for different perspectives
- - **Composable Building Blocks**: Import primitives to fetch transcripts, thumbnails, and storyboards to build bespoke flows
- - **Universal Language Support**: Automatic language name detection using `Intl.DisplayNames` for all ISO 639-1 codes
 
  ## Installation
 
@@ -40,7 +47,7 @@ npm install @mux/ai
 
  ## Configuration
 
- Set environment variables:
+ We support [dotenv](https://www.npmjs.com/package/dotenv), so you can simply add the following environment variables to your `.env` file:
 
  ```bash
  # Required
@@ -51,12 +58,10 @@ MUX_TOKEN_SECRET=your_mux_token_secret
  MUX_SIGNING_KEY=your_signing_key_id
  MUX_PRIVATE_KEY=your_base64_encoded_private_key
 
- # You only need to configure API keys for the AI platforms you're using
+ # You only need to configure API keys for the AI platforms and workflows you're using
  OPENAI_API_KEY=your_openai_api_key
  ANTHROPIC_API_KEY=your_anthropic_api_key
  GOOGLE_GENERATIVE_AI_API_KEY=your_google_api_key
-
- # Needed for audio dubbing workflow
  ELEVENLABS_API_KEY=your_elevenlabs_api_key
 
  # S3-Compatible Storage (required for translation & audio dubbing)
@@ -77,106 +82,316 @@ const result = await getSummaryAndTags(assetId, {
  });
  ```
 
- ## Quick Start
+ > **💡 Tip:** If you're using `.env` in a repository or version tracking system, make sure you add this file to your `.gitignore` or equivalent to avoid unintentionally committing sensitive credentials.
+
+ # Workflows
+
+ ## Available pre-built workflows
+
+ | Workflow | Description | Providers | Default Models | Mux Asset Requirements | Cloud Infrastructure Requirements |
+ | ------------------------------------------------------------------------ | ----------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------ | ---------------------- | --------------------------------- |
+ | [`getSummaryAndTags`](./docs/WORKFLOWS.md#video-summarization)<br/>[API](./docs/API.md#getsummaryandtagsassetid-options) · [Source](./src/workflows/summarization.ts) | Generate titles, descriptions, and tags for an asset | OpenAI, Anthropic, Google | `gpt-5.1` (OpenAI), `claude-sonnet-4-5` (Anthropic), `gemini-2.5-flash` (Google) | Video (required), Captions (optional) | None |
+ | [`getModerationScores`](./docs/WORKFLOWS.md#content-moderation)<br/>[API](./docs/API.md#getmoderationscoresassetid-options) · [Source](./src/workflows/moderation.ts) | Detect inappropriate (sexual or violent) content in an asset | OpenAI, Hive | `omni-moderation-latest` (OpenAI) or Hive visual moderation task | Video (required) | None |
+ | [`hasBurnedInCaptions`](./docs/WORKFLOWS.md#burned-in-caption-detection)<br/>[API](./docs/API.md#hasburnedincaptionsassetid-options) · [Source](./src/workflows/burned-in-captions.ts) | Detect burned-in captions (hardcoded subtitles) in an asset | OpenAI, Anthropic, Google | `gpt-5.1` (OpenAI), `claude-sonnet-4-5` (Anthropic), `gemini-2.5-flash` (Google) | Video (required) | None |
+ | [`generateChapters`](./docs/WORKFLOWS.md#chapter-generation)<br/>[API](./docs/API.md#generatechaptersassetid-languagecode-options) · [Source](./src/workflows/chapters.ts) | Generate chapter markers for an asset using the transcript | OpenAI, Anthropic, Google | `gpt-5.1` (OpenAI), `claude-sonnet-4-5` (Anthropic), `gemini-2.5-flash` (Google) | Video (required), Captions (required) | None |
+ | [`generateVideoEmbeddings`](./docs/WORKFLOWS.md#video-embeddings)<br/>[API](./docs/API.md#generatevideoembeddingsassetid-options) · [Source](./src/workflows/embeddings.ts) | Generate vector embeddings for an asset's transcript chunks | OpenAI, Google | `text-embedding-3-small` (OpenAI), `gemini-embedding-001` (Google) | Video (required), Captions (required) | None |
+ | [`translateCaptions`](./docs/WORKFLOWS.md#caption-translation)<br/>[API](./docs/API.md#translatecaptionsassetid-fromlanguagecode-tolanguagecode-options) · [Source](./src/workflows/translate-captions.ts) | Translate an asset's captions into different languages | OpenAI, Anthropic, Google | `gpt-5.1` (OpenAI), `claude-sonnet-4-5` (Anthropic), `gemini-2.5-flash` (Google) | Video (required), Captions (required) | AWS S3 (if `uploadToMux=true`) |
+ | [`translateAudio`](./docs/WORKFLOWS.md#audio-dubbing)<br/>[API](./docs/API.md#translateaudioassetid-tolanguagecode-options) · [Source](./src/workflows/translate-audio.ts) | Create AI-dubbed audio tracks in different languages for an asset | ElevenLabs only | ElevenLabs Dubbing API | Video (required), Audio (required) | AWS S3 (if `uploadToMux=true`) |
 
- > **‼️ Important: ‼️** Most workflows rely on video transcripts for best results. Enable [auto-generated captions](https://www.mux.com/docs/guides/add-autogenerated-captions-and-use-transcripts) on your Mux assets to unlock the full potential of transcript-based workflows like summarization, chapters, and embeddings.
+ ## Example Workflows
 
  ### Video Summarization
 
+ Generate SEO-friendly titles, descriptions, and tags from your video content:
+
  ```typescript
  import { getSummaryAndTags } from "@mux/ai/workflows";
 
- const result = await getSummaryAndTags("your-mux-asset-id", {
-   tone: "professional"
+ const result = await getSummaryAndTags("your-asset-id", {
+   provider: "openai",
+   tone: "professional",
+   includeTranscript: true
  });
 
- console.log(result.title);
- console.log(result.description);
- console.log(result.tags);
+ console.log(result.title); // "Getting Started with TypeScript"
+ console.log(result.description); // "A comprehensive guide to..."
+ console.log(result.tags); // ["typescript", "tutorial", "programming"]
  ```
 
  ### Content Moderation
 
+ Automatically detect inappropriate content in videos:
+
  ```typescript
  import { getModerationScores } from "@mux/ai/workflows";
 
- const result = await getModerationScores("your-mux-asset-id", {
+ const result = await getModerationScores("your-asset-id", {
+   provider: "openai",
    thresholds: { sexual: 0.7, violence: 0.8 }
  });
 
- console.log(result.exceedsThreshold); // true if content flagged
+ if (result.exceedsThreshold) {
+   console.log("Content flagged for review");
+   console.log(`Max scores: ${result.maxScores}`);
+ }
  ```
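The `thresholds` option above sets per-category cutoffs that determine `exceedsThreshold`. As a rough, self-contained sketch of that comparison logic (the score shape here is illustrative, not the package's actual types):

```typescript
// Illustrative only: compare per-category moderation scores against cutoffs.
// The real workflow computes this for you and returns `exceedsThreshold`.
type CategoryScores = Record<string, number>;

function scoresExceedThresholds(
  scores: CategoryScores,
  thresholds: CategoryScores
): boolean {
  // Flag the asset if any category's score meets or passes its cutoff.
  return Object.entries(thresholds).some(
    ([category, cutoff]) => (scores[category] ?? 0) >= cutoff
  );
}
```

With the thresholds from the example (`{ sexual: 0.7, violence: 0.8 }`), a `sexual` score of `0.75` would flag the asset, while `0.65` would not.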
 
- ### Generate Chapters
+ ### Chapter Generation
+
+ Create automatic chapter markers for better video navigation:
 
  ```typescript
  import { generateChapters } from "@mux/ai/workflows";
 
- const result = await generateChapters("your-mux-asset-id", "en");
+ const result = await generateChapters("your-asset-id", "en", {
+   provider: "anthropic"
+ });
 
  // Use with Mux Player
  player.addChapters(result.chapters);
+ // [
+ //   { startTime: 0, title: "Introduction" },
+ //   { startTime: 45, title: "Main Content" },
+ //   { startTime: 120, title: "Conclusion" }
+ // ]
  ```
 
- ### Translate Captions
+ ### Video Search with Embeddings
+
+ Generate embeddings for semantic video search:
+
+ ```typescript
+ import { generateVideoEmbeddings } from "@mux/ai/workflows";
+
+ const result = await generateVideoEmbeddings("your-asset-id", {
+   provider: "openai",
+   languageCode: "en",
+   chunkingStrategy: {
+     type: "token",
+     maxTokens: 500,
+     overlap: 100
+   }
+ });
+
+ // Store embeddings in your vector database
+ for (const chunk of result.chunks) {
+   await vectorDB.insert({
+     embedding: chunk.embedding,
+     metadata: {
+       assetId: result.assetId,
+       startTime: chunk.metadata.startTime,
+       endTime: chunk.metadata.endTime
+     }
+   });
+ }
+ ```
+
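The `chunkingStrategy` above asks the workflow to split the transcript into overlapping token windows before embedding. The idea can be sketched with whitespace-split "tokens" (real tokenizers count subword tokens, and the package's chunker works on VTT cues, so this is only an approximation):

```typescript
// Sliding-window chunking with overlap, using whitespace-split "tokens".
// Approximation only: real tokenizers count subword tokens, not words.
function chunkByTokens(text: string, maxTokens: number, overlap: number): string[] {
  if (maxTokens <= overlap) throw new Error("maxTokens must be greater than overlap");
  const tokens = text.split(/\s+/).filter(Boolean);
  const step = maxTokens - overlap; // each window advances by (max - overlap)
  const chunks: string[] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens).join(" "));
    if (start + maxTokens >= tokens.length) break; // final window reached the end
  }
  return chunks;
}
```

The overlap keeps a little shared context between adjacent chunks, so sentences that straddle a boundary still embed coherently.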
+ # Key Features
+
+ - **Cost-Effective by Default**: Uses affordable frontier models like `gpt-5.1`, `claude-sonnet-4-5`, and `gemini-2.5-flash` to keep analysis costs low while maintaining high-quality results
+ - **Multi-modal Analysis**: Combines storyboard images with video transcripts for richer understanding
+ - **Tone Control**: Choose between normal, sassy, or professional analysis styles for summarization
+ - **Prompt Customization**: Override specific prompt sections to tune workflows to your exact use case
+ - **Configurable Thresholds**: Set custom sensitivity levels for content moderation
+ - **Full TypeScript Support**: Comprehensive types for excellent developer experience and IDE autocomplete
+ - **Provider Flexibility**: Switch between OpenAI, Anthropic, Google, and other providers based on your needs
+ - **Composable Building Blocks**: Use primitives to fetch transcripts, thumbnails, and storyboards for custom workflows
+ - **Universal Language Support**: Automatic language name detection using `Intl.DisplayNames` for all ISO 639-1 codes
+ - **Production Ready**: Built-in retry logic, error handling, and edge case management
+
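The language-support bullet refers to the standard `Intl.DisplayNames` API, which resolves ISO 639-1 codes to human-readable names without a hand-maintained lookup table:

```typescript
// Resolve ISO 639-1 codes to English display names with the built-in Intl API.
const languageNames = new Intl.DisplayNames(["en"], { type: "language" });

console.log(languageNames.of("en")); // "English"
console.log(languageNames.of("es")); // "Spanish"
console.log(languageNames.of("ja")); // "Japanese"
```

This works for any ISO 639-1 code the runtime's ICU data knows about, which is why no per-language configuration is needed.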
+ # Core Concepts
+
+ `@mux/ai` is built around two complementary abstractions:
+
+ ## Workflows
+
+ **Workflows** are functions that handle complete video AI tasks end-to-end. Each workflow orchestrates the entire process: fetching video data from Mux (transcripts, thumbnails, storyboards), formatting it for AI providers, and returning structured results.
 
  ```typescript
- import { translateCaptions } from "@mux/ai/workflows";
+ import { getSummaryAndTags } from "@mux/ai/workflows";
+
+ const result = await getSummaryAndTags("asset-id", { provider: "openai" });
+ ```
+
+ Use workflows when you need battle-tested solutions for common tasks like summarization, content moderation, chapter generation, or translation.
+
+ ## Primitives
 
- const result = await translateCaptions(
-   "your-mux-asset-id",
-   "en", // from
-   "es", // to
-   { provider: "anthropic" }
- );
+ **Primitives** are low-level building blocks that give you direct access to Mux video data and utilities. They provide functions for fetching transcripts, storyboards, thumbnails, and processing text—perfect for building custom workflows.
 
- console.log(result.uploadedTrackId); // New Mux track ID
+ ```typescript
+ import { fetchTranscriptForAsset, getStoryboardUrl } from "@mux/ai/primitives";
+
+ const transcript = await fetchTranscriptForAsset("asset-id", "en");
+ const storyboard = getStoryboardUrl("playback-id", { width: 640 });
  ```
 
+ Use primitives when you need complete control over your AI prompts or want to build custom workflows not covered by the pre-built options.
+
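For a sense of what `getStoryboardUrl` produces: Mux serves storyboard images from `image.mux.com`, so a simplified, unsigned version of the primitive might look like the sketch below. The query-parameter handling is an assumption for illustration; the real primitive also handles signed playback and more options.

```typescript
// Simplified sketch: build an unsigned Mux storyboard image URL.
// Illustrative only — the actual primitive supports signed URLs and more.
function buildStoryboardUrl(
  playbackId: string,
  params: Record<string, string | number> = {}
): string {
  const url = new URL(`https://image.mux.com/${playbackId}/storyboard.jpg`);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}
```

For example, `buildStoryboardUrl("abc123", { width: 640 })` yields `https://image.mux.com/abc123/storyboard.jpg?width=640`.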
  ## Package Structure
 
- This package ships with layered entry points:
+ ```typescript
+ // Import workflows
+ import { generateChapters } from "@mux/ai/workflows";
 
- - **`@mux/ai/workflows`** – Production-ready helpers like `getSummaryAndTags` and `generateChapters`
- - **`@mux/ai/primitives`** – Low-level building blocks like `fetchTranscriptForAsset` and `getStoryboardUrl`
- - **`@mux/ai`** – Main entry point that re-exports both namespaces plus shared types
+ // Import primitives
+ import { fetchTranscriptForAsset } from "@mux/ai/primitives";
 
- ```typescript
  // Or import everything
- import { primitives, workflows } from "@mux/ai";
- // Low-level primitives for custom workflows
- import { fetchTranscriptForAsset, getStoryboardUrl } from "@mux/ai/primitives";
- // High-level workflows
- import { getSummaryAndTags } from "@mux/ai/workflows";
+ import { workflows, primitives } from "@mux/ai";
  ```
 
- Every workflow is composed from primitives, so you can start high-level and drop down to primitives when you need more control.
+ # Credentials
 
- ## Documentation
+ You'll need to set up credentials for Mux as well as any AI provider you want to use for a particular workflow. In addition, some workflows need additional cloud infrastructure (e.g. object storage via AWS S3).
 
- - **[Workflows](./docs/WORKFLOWS.md)** - Detailed guide to each pre-built workflow
- - **[Primitives](./docs/PRIMITIVES.md)** - Low-level building blocks for custom workflows
- - **[API Reference](./docs/API.md)** - Complete API documentation for all functions
- - **[Examples](./docs/EXAMPLES.md)** - Running examples from the repository
+ ## Credentials - Mux
+
+ ### Access Token (required)
+
+ All workflows require a Mux API access token to interact with your video assets. If you're already logged into the dashboard, you can [create a new access token here](https://dashboard.mux.com/settings/access-tokens).
+
+ **Required Permissions:**
+ - **Mux Video**: Read + Write access
+ - **Mux Data**: Read access
+
+ These permissions cover all current workflows. You can set these when creating your token in the dashboard.
+
+ > **💡 Tip:** For security reasons, consider creating a dedicated access token specifically for your AI workflows rather than reusing existing tokens.
 
- ## Development
+ ### Signing Key (conditionally required)
 
+ If your Mux assets use [signed playback URLs](https://docs.mux.com/guides/secure-video-playback) for security, you'll need to provide signing credentials so `@mux/ai` can access the video data.
+
+ **When needed:** Only if your assets have signed playback policies enabled and no public playback ID.
+
+ **How to get:**
+ 1. Go to [Settings > Signing Keys](https://dashboard.mux.com/settings/signing-keys) in your Mux dashboard
+ 2. Create a new signing key or use an existing one
+ 3. Save both the **Signing Key ID** and the **Base64-encoded Private Key**
+
+ **Configuration:**
  ```bash
- # Clone and install
- git clone https://github.com/muxinc/mux-ai.git
- cd mux-ai
- npm install # Automatically sets up git hooks
+ MUX_SIGNING_KEY=your_signing_key_id
+ MUX_PRIVATE_KEY=your_base64_encoded_private_key
+ ```
+
+ ## Credentials - AI Providers
 
- # Linting and type checking
- npm run lint
- npm run lint:fix
- npm run typecheck
+ Different workflows support various AI providers. You only need to configure API keys for the providers you plan to use.
 
- # Run tests
- npm test
+ ### OpenAI
+
+ **Used by:** `getSummaryAndTags`, `getModerationScores`, `hasBurnedInCaptions`, `generateChapters`, `generateVideoEmbeddings`, `translateCaptions`
+
+ **Get your API key:** [OpenAI API Keys](https://platform.openai.com/api-keys)
+
+ ```bash
+ OPENAI_API_KEY=your_openai_api_key
+ ```
+
+ ### Anthropic
+
+ **Used by:** `getSummaryAndTags`, `hasBurnedInCaptions`, `generateChapters`, `translateCaptions`
+
+ **Get your API key:** [Anthropic Console](https://console.anthropic.com/)
+
+ ```bash
+ ANTHROPIC_API_KEY=your_anthropic_api_key
+ ```
+
+ ### Google Generative AI
+
+ **Used by:** `getSummaryAndTags`, `hasBurnedInCaptions`, `generateChapters`, `generateVideoEmbeddings`, `translateCaptions`
+
+ **Get your API key:** [Google AI Studio](https://aistudio.google.com/app/apikey)
+
+ ```bash
+ GOOGLE_GENERATIVE_AI_API_KEY=your_google_api_key
  ```
 
- This project uses ESLint with `@antfu/eslint-config`, TypeScript strict mode, and automated pre-commit hooks.
+ ### ElevenLabs
+
+ **Used by:** `translateAudio` (audio dubbing)
+
+ **Get your API key:** [ElevenLabs API Keys](https://elevenlabs.io/app/settings/api-keys)
+
+ **Note:** Requires a Creator plan or higher for dubbing features.
+
+ ```bash
+ ELEVENLABS_API_KEY=your_elevenlabs_api_key
+ ```
+
+ ### Hive
+
+ **Used by:** `getModerationScores` (alternative to OpenAI moderation)
+
+ **Get your API key:** [Hive Console](https://thehive.ai/)
+
+ ```bash
+ HIVE_API_KEY=your_hive_api_key
+ ```
+
+ ## Credentials - Cloud Infrastructure
+
+ ### AWS S3 (or S3-compatible storage)
+
+ **Required for:** `translateCaptions`, `translateAudio` (only if `uploadToMux` is true, which is the default)
+
+ Translation workflows need temporary storage to upload translated files before attaching them to your Mux assets. Any S3-compatible storage service works (AWS S3, Cloudflare R2, DigitalOcean Spaces, etc.).
+
+ **AWS S3 Setup:**
+ 1. [Create an S3 bucket](https://s3.console.aws.amazon.com/s3/home)
+ 2. [Create an IAM user](https://console.aws.amazon.com/iam/) with programmatic access
+ 3. Attach a policy with `s3:PutObject`, `s3:GetObject`, and `s3:PutObjectAcl` permissions for your bucket
+
+ **Configuration:**
+ ```bash
+ S3_ENDPOINT=https://s3.amazonaws.com # Or your S3-compatible endpoint
+ S3_REGION=us-east-1 # Your bucket region
+ S3_BUCKET=your-bucket-name
+ S3_ACCESS_KEY_ID=your-access-key
+ S3_SECRET_ACCESS_KEY=your-secret-key
+ ```
+
+ **Cloudflare R2 Example:**
+ ```bash
+ S3_ENDPOINT=https://your-account-id.r2.cloudflarestorage.com
+ S3_REGION=auto
+ S3_BUCKET=your-bucket-name
+ S3_ACCESS_KEY_ID=your-r2-access-key
+ S3_SECRET_ACCESS_KEY=your-r2-secret-key
+ ```
+
+ # Documentation
+
+ ## Full Documentation
+
+ - **[Workflows Guide](./docs/WORKFLOWS.md)** - Detailed guide to each pre-built workflow with examples
+ - **[API Reference](./docs/API.md)** - Complete API documentation for all functions, parameters, and return types
+ - **[Primitives Guide](./docs/PRIMITIVES.md)** - Low-level building blocks for custom workflows
+ - **[Examples](./docs/EXAMPLES.md)** - Running examples from the repository
+
+ ## Additional Resources
+
+ - **[Mux Video API Docs](https://docs.mux.com/guides/video)** - Learn about Mux Video features
+ - **[Auto-generated Captions](https://www.mux.com/docs/guides/add-autogenerated-captions-and-use-transcripts)** - Enable transcripts for your assets
+ - **[GitHub Repository](https://github.com/muxinc/ai)** - Source code, issues, and contributions
+ - **[npm Package](https://www.npmjs.com/package/@mux/ai)** - Package page and version history
+
+ # Contributing
+
+ We welcome contributions! Whether you're fixing bugs, adding features, or improving documentation, we'd love your help.
+
+ Please see our **[Contributing Guide](./CONTRIBUTING.md)** for details on:
+
+ - Setting up your development environment
+ - Running examples and tests
+ - Code style and conventions
+ - Submitting pull requests
+ - Reporting issues
+
+ For questions or discussions, feel free to [open an issue](https://github.com/muxinc/ai/issues).
 
  ## License
 
@@ -308,6 +308,10 @@ interface SummaryAndTagsResult {
    tags: string[];
    /** Storyboard image URL that was analyzed. */
    storyboardUrl: string;
+   /** Token usage from the AI provider (for efficiency/cost analysis). */
+   usage?: TokenUsage;
+   /** Raw transcript text used for analysis (when includeTranscript is true). */
+   transcriptText?: string;
  }
  /**
   * Sections of the summarization user prompt that can be overridden.
@@ -353,10 +357,111 @@ interface SummarizationOptions extends MuxAIOptions {
  }
  declare function getSummaryAndTags(assetId: string, options?: SummarizationOptions): Promise<SummaryAndTagsResult>;
 
+ /**
+  * Language Code Conversion Utilities
+  *
+  * Provides bidirectional mapping between:
+  * - ISO 639-1 (2-letter codes) - Used by browsers, BCP-47, most video players
+  * - ISO 639-3 (3-letter codes) - Used by various APIs and language processing systems
+  *
+  * This is essential for interoperability between different systems:
+  * - Mux uses ISO 639-1 for track language codes
+  * - Browser players expect BCP-47 compliant codes (based on ISO 639-1)
+  * - Some APIs require ISO 639-3 (3-letter) codes
+  */
+ /**
+  * Mapping from ISO 639-1 (2-letter) to ISO 639-3 (3-letter) codes.
+  * Covers the most common languages used in video translation.
+  *
+  * Reference: https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
+  */
+ declare const ISO639_1_TO_3: {
+   readonly en: "eng";
+   readonly es: "spa";
+   readonly fr: "fra";
+   readonly de: "deu";
+   readonly it: "ita";
+   readonly pt: "por";
+   readonly ru: "rus";
+   readonly zh: "zho";
+   readonly ja: "jpn";
+   readonly ko: "kor";
+   readonly ar: "ara";
+   readonly hi: "hin";
+   readonly nl: "nld";
+   readonly pl: "pol";
+   readonly sv: "swe";
+   readonly da: "dan";
+   readonly no: "nor";
+   readonly fi: "fin";
+   readonly el: "ell";
+   readonly cs: "ces";
+   readonly hu: "hun";
+   readonly ro: "ron";
+   readonly bg: "bul";
+   readonly hr: "hrv";
+   readonly sk: "slk";
+   readonly sl: "slv";
+   readonly uk: "ukr";
+   readonly tr: "tur";
+   readonly th: "tha";
+   readonly vi: "vie";
+   readonly id: "ind";
+   readonly ms: "msa";
+   readonly tl: "tgl";
+   readonly he: "heb";
+   readonly fa: "fas";
+   readonly bn: "ben";
+   readonly ta: "tam";
+   readonly te: "tel";
+   readonly mr: "mar";
+   readonly gu: "guj";
+   readonly kn: "kan";
+   readonly ml: "mal";
+   readonly pa: "pan";
+   readonly ur: "urd";
+   readonly sw: "swa";
+   readonly af: "afr";
+   readonly ca: "cat";
+   readonly eu: "eus";
+   readonly gl: "glg";
+   readonly is: "isl";
+   readonly et: "est";
+   readonly lv: "lav";
+   readonly lt: "lit";
+ };
+ /**
+  * Supported ISO 639-1 two-letter language codes.
+  * These are the language codes supported for translation workflows.
+  */
+ type SupportedISO639_1 = keyof typeof ISO639_1_TO_3;
+ /**
+  * Supported ISO 639-3 three-letter language codes.
+  * These are the language codes supported for translation workflows.
+  */
+ type SupportedISO639_3 = (typeof ISO639_1_TO_3)[SupportedISO639_1];
+ /** ISO 639-1 two-letter language code (e.g., "en", "fr", "es") */
+ type ISO639_1 = SupportedISO639_1 | (string & {});
+ /** ISO 639-3 three-letter language code (e.g., "eng", "fra", "spa") */
+ type ISO639_3 = SupportedISO639_3 | (string & {});
+ /** Structured language code result containing both formats */
+ interface LanguageCodePair {
+   /** ISO 639-1 two-letter code (BCP-47 compatible) */
+   iso639_1: ISO639_1;
+   /** ISO 639-3 three-letter code */
+   iso639_3: ISO639_3;
+ }
+
  /** Output returned from `translateAudio`. */
  interface AudioTranslationResult {
    assetId: string;
-   targetLanguageCode: string;
+   /** Target language code (ISO 639-1 two-letter format). */
+   targetLanguageCode: SupportedISO639_1;
+   /**
+    * Target language codes in both ISO 639-1 (2-letter) and ISO 639-3 (3-letter) formats.
+    * Use `iso639_1` for browser players (BCP-47 compliant) and `iso639_3` for ElevenLabs API.
+    */
+   targetLanguage: LanguageCodePair;
    dubbingId: string;
    uploadedTrackId?: string;
    presignedUrl?: string;
@@ -390,12 +495,26 @@ declare function translateAudio(assetId: string, toLanguageCode: string, options
  /** Output returned from `translateCaptions`. */
  interface TranslationResult {
    assetId: string;
-   sourceLanguageCode: string;
-   targetLanguageCode: string;
+   /** Source language code (ISO 639-1 two-letter format). */
+   sourceLanguageCode: SupportedISO639_1;
+   /** Target language code (ISO 639-1 two-letter format). */
+   targetLanguageCode: SupportedISO639_1;
+   /**
+    * Source language codes in both ISO 639-1 (2-letter) and ISO 639-3 (3-letter) formats.
+    * Use `iso639_1` for browser players (BCP-47 compliant) and `iso639_3` for APIs that require it.
+    */
+   sourceLanguage: LanguageCodePair;
+   /**
+    * Target language codes in both ISO 639-1 (2-letter) and ISO 639-3 (3-letter) formats.
+    * Use `iso639_1` for browser players (BCP-47 compliant) and `iso639_3` for APIs that require it.
+    */
+   targetLanguage: LanguageCodePair;
    originalVtt: string;
    translatedVtt: string;
    uploadedTrackId?: string;
    presignedUrl?: string;
+   /** Token usage from the AI provider (for efficiency/cost analysis). */
+   usage?: TokenUsage;
  }
  /** Configuration accepted by `translateCaptions`. */
  interface TranslationOptions<P extends SupportedProvider = SupportedProvider> extends MuxAIOptions {
package/dist/index.d.ts CHANGED
@@ -1,6 +1,6 @@
  export { i as primitives } from './index-DyTSka2R.js';
  export { A as AssetTextTrack, f as ChunkEmbedding, C as ChunkingStrategy, I as ImageSubmissionMode, M as MuxAIConfig, a as MuxAIOptions, b as MuxAsset, c as PlaybackAsset, P as PlaybackPolicy, e as TextChunk, d as TokenChunkingConfig, h as TokenUsage, T as ToneType, V as VTTChunkingConfig, g as VideoEmbeddingsResult } from './types-ktXDZ93V.js';
- export { i as workflows } from './index-Bnv7tv90.js';
+ export { i as workflows } from './index-CMZYZcj6.js';
  import '@mux/mux-node';
  import 'zod';
  import '@ai-sdk/anthropic';