@mux/ai 0.1.2 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,28 +1,70 @@
  # @mux/ai
 
- AI-powered video analysis library for Mux, built in TypeScript.
+ A set of tools for connecting videos in your Mux account to multi-modal LLMs.
 
- ## Available Tools
+ ## Available pre-built workflows
 
- | Function | Description | Providers | Default Models | Input | Output |
- |----------|-------------|-----------|----------------|--------|--------|
- | `getSummaryAndTags` | Generate titles, descriptions, and tags from a Mux video asset | OpenAI, Anthropic | `gpt-4o-mini`, `claude-3-5-haiku-20241022` | Asset ID + options | Title, description, tags, storyboard URL |
- | `getModerationScores` | Analyze video thumbnails for inappropriate content | OpenAI, Hive | `omni-moderation-latest`, Hive Visual API | Asset ID + thresholds | Sexual/violence scores, flagged status |
- | `hasBurnedInCaptions` | Detect burned-in captions (hardcoded subtitles) in video frames | OpenAI, Anthropic | `gpt-4o-mini`, `claude-3-5-haiku-20241022` | Asset ID + options | Boolean result, confidence, language |
- | `generateChapters` | Generate AI-powered chapter markers from video captions | OpenAI, Anthropic | `gpt-4o-mini`, `claude-3-5-haiku-20241022` | Asset ID + language + options | Timestamped chapter list |
- | `translateCaptions` | Translate video captions to different languages | Anthropic only | `claude-sonnet-4-20250514` | Asset ID + languages + S3 config | Translated VTT + Mux track ID |
- | `translateAudio` | Create AI-dubbed audio tracks in different languages | ElevenLabs only | ElevenLabs Dubbing API | Asset ID + languages + S3 config | Dubbed audio + Mux track ID |
+ | Workflow | Description | Providers | Default Models | Input | Output |
+ | ------------------------- | --------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------ | -------------------------------- | ---------------------------------------------- |
+ | `getSummaryAndTags` | Generate titles, descriptions, and tags from a Mux video asset | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Asset ID + options | Title, description, tags, storyboard URL |
+ | `getModerationScores` | Analyze video thumbnails for inappropriate content | OpenAI, Hive | `omni-moderation-latest` (OpenAI) or Hive visual moderation task | Asset ID + thresholds | Sexual/violence scores, flagged status |
+ | `hasBurnedInCaptions` | Detect burned-in captions (hardcoded subtitles) in video frames | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Asset ID + options | Boolean result, confidence, language |
+ | `generateChapters` | Generate AI-powered chapter markers from video captions | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Asset ID + language + options | Timestamped chapter list, ready for Mux Player |
+ | `generateVideoEmbeddings` | Generate vector embeddings for video transcript chunks | OpenAI, Google | `text-embedding-3-small` (OpenAI), `gemini-embedding-001` (Google) | Asset ID + chunking strategy | Chunk embeddings + averaged embedding |
+ | `translateCaptions` | Translate video captions to different languages | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Asset ID + languages + S3 config | Translated VTT + Mux track ID |
+ | `translateAudio` | Create AI-dubbed audio tracks in different languages | ElevenLabs only | ElevenLabs Dubbing API | Asset ID + languages + S3 config | Dubbed audio + Mux track ID |
 
  ## Features
 
- - **Cost-Effective by Default**: Uses affordable models like `gpt-4o-mini` and `claude-3-5-haiku` to keep analysis costs low while maintaining high quality results
+ - **Cost-Effective by Default**: Uses affordable frontier models like `gpt-5-mini`, `claude-sonnet-4-5`, and `gemini-2.5-flash` to keep analysis costs low while maintaining high-quality results
  - **Multi-modal Analysis**: Combines storyboard images with video transcripts
- - **Tone Control**: Normal, sassy, or professional analysis styles (summarization only)
+ - **Tone Control**: Normal, sassy, or professional analysis styles
+ - **Prompt Customization**: Override specific prompt sections to tune workflows to your use case
  - **Configurable Thresholds**: Custom sensitivity levels for content moderation
  - **TypeScript**: Fully typed for excellent developer experience
- - **Provider Choice**: Switch between OpenAI and Anthropic for different perspectives
+ - **Provider Choice**: Switch between OpenAI, Anthropic, and Google for different perspectives
+ - **Composable Building Blocks**: Import primitives to fetch transcripts, thumbnails, and storyboards to build bespoke flows
  - **Universal Language Support**: Automatic language name detection using `Intl.DisplayNames` for all ISO 639-1 codes
 
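The universal language support mentioned in the feature list relies on the standard ECMAScript `Intl.DisplayNames` API rather than a hard-coded lookup table; a minimal standalone sketch of the resolution it performs (this is illustrative, not package code):

```typescript
// Resolve ISO 639-1 codes to English display names via the built-in Intl API.
const languageNames = new Intl.DisplayNames(["en"], { type: "language" });

console.log(languageNames.of("es")); // "Spanish"
console.log(languageNames.of("ja")); // "Japanese"
```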
+ ## Package Structure
+
+ This package ships with layered entry points so you can pick the right level of abstraction for your workflow:
+
+ - `@mux/ai/workflows` – opinionated, production-ready helpers (`getSummaryAndTags`, `generateChapters`, `translateCaptions`, etc.) that orchestrate Mux API access, transcript/storyboard gathering, and the AI provider call.
+ - `@mux/ai/primitives` – low-level building blocks such as `fetchTranscriptForAsset`, `getStoryboardUrl`, and `getThumbnailUrls`. Use these when you need to mix our utilities into your own prompts or custom workflows.
+ - `@mux/ai` – re-exports both namespaces, plus shared `types`, so you can also write `import { workflows, primitives } from '@mux/ai';`.
+
+ Every helper inside `@mux/ai/workflows` is composed from the primitives. That means you can start with a high-level workflow and gradually drop down to primitives whenever you need more control.
+
+ ```typescript
+ import { fetchTranscriptForAsset, getStoryboardUrl } from "@mux/ai/primitives";
+ import { getModerationScores, getSummaryAndTags } from "@mux/ai/workflows";
+
+ // Compose high-level workflows into a custom flow
+ export async function summarizeIfSafe(assetId: string) {
+   const moderation = await getModerationScores(assetId, { provider: "openai" });
+   if (moderation.exceedsThreshold) {
+     throw new Error("Asset failed content safety review");
+   }
+
+   return getSummaryAndTags(assetId, {
+     provider: "anthropic",
+     tone: "professional",
+   });
+ }
+
+ // Or drop down to primitives to build bespoke AI workflows
+ export async function customTranscriptAnalysis(assetId: string, playbackId: string) {
+   const transcript = await fetchTranscriptForAsset(assetId, "en");
+   const storyboardUrl = getStoryboardUrl(playbackId);
+
+   // Use these primitives in your own AI prompts or custom logic
+   return { transcript, storyboardUrl };
+ }
+ ```
+
+ Use whichever layer makes sense: call a workflow as-is, compose multiple workflows together, or drop down to primitives to build a completely custom workflow.
+
  ## Installation
 
  ```bash
@@ -34,99 +76,102 @@ npm install @mux/ai
  ### Video Summarization
 
  ```typescript
- import { getSummaryAndTags } from '@mux/ai';
+ import { getSummaryAndTags } from "@mux/ai/workflows";
 
  // Uses built-in optimized prompt
- const result = await getSummaryAndTags('your-mux-asset-id', {
-   tone: 'professional'
+ const result = await getSummaryAndTags("your-mux-asset-id", {
+   tone: "professional"
  });
 
- console.log(result.title); // Short, descriptive title
+ console.log(result.title);         // Short, descriptive title
  console.log(result.description); // Detailed description
- console.log(result.tags); // Array of relevant keywords
+ console.log(result.tags);          // Array of relevant keywords
  console.log(result.storyboardUrl); // URL to Mux storyboard
 
- // Use base64 mode for improved reliability (works with both OpenAI and Anthropic)
- const reliableResult = await getSummaryAndTags('your-mux-asset-id', {
-   provider: 'anthropic',
-   imageSubmissionMode: 'base64', // Uses Files API for Anthropic, base64 for OpenAI
+ // Use base64 mode for improved reliability (works with OpenAI, Anthropic, and Google)
+ const reliableResult = await getSummaryAndTags("your-mux-asset-id", {
+   provider: "anthropic",
+   imageSubmissionMode: "base64", // Downloads storyboard locally before submission
    imageDownloadOptions: {
      timeout: 15000,
      retries: 2,
      retryDelay: 1000
    },
-   tone: 'professional'
+   tone: "professional"
+ });
+
+ // Customize for specific use cases with promptOverrides
+ const seoResult = await getSummaryAndTags("your-mux-asset-id", {
+   promptOverrides: {
+     task: "Generate SEO-optimized metadata for search engines.",
+     title: "Create a search-optimized title (50-60 chars) with primary keyword.",
+     keywords: "Focus on high search volume and long-tail keywords.",
+   },
  });
  ```
 
  ### Content Moderation
 
  ```typescript
- import { getModerationScores } from '@mux/ai';
+ import { getModerationScores } from "@mux/ai/workflows";
 
  // Analyze Mux video asset for inappropriate content (OpenAI default)
- const result = await getModerationScores('your-mux-asset-id', {
+ const result = await getModerationScores("your-mux-asset-id", {
    thresholds: { sexual: 0.7, violence: 0.8 }
  });
 
- console.log(result.maxScores); // Highest scores across all thumbnails
+ console.log(result.maxScores);        // Highest scores across all thumbnails
  console.log(result.exceedsThreshold); // true if content should be flagged
- console.log(result.thumbnailScores); // Individual thumbnail results
+ console.log(result.thumbnailScores);  // Individual thumbnail results
 
- // Or use Hive for moderation
- const hiveResult = await getModerationScores('your-mux-asset-id', {
-   provider: 'hive',
-   thresholds: { sexual: 0.7, violence: 0.8 }
+ // Run the same analysis using Hive's visual moderation API
+ const hiveResult = await getModerationScores("your-mux-asset-id", {
+   provider: "hive",
+   thresholds: { sexual: 0.9, violence: 0.9 },
  });
 
- // Use base64 submission for improved reliability (downloads images locally)
- const reliableResult = await getModerationScores('your-mux-asset-id', {
-   provider: 'openai',
-   imageSubmissionMode: 'base64',
+ // Use base64 submission for improved reliability with OpenAI (downloads images locally)
+ const reliableResult = await getModerationScores("your-mux-asset-id", {
+   provider: "openai",
+   imageSubmissionMode: "base64",
    imageDownloadOptions: {
      timeout: 15000,
      retries: 3,
      retryDelay: 1000
    }
  });
-
- // Hive also supports base64 mode (uses multipart upload)
- const hiveReliableResult = await getModerationScores('your-mux-asset-id', {
-   provider: 'hive',
-   imageSubmissionMode: 'base64',
-   imageDownloadOptions: {
-     timeout: 15000,
-     retries: 2,
-     retryDelay: 1000
-   }
- });
  ```
 
  ### Burned-in Caption Detection
 
  ```typescript
- import { hasBurnedInCaptions } from '@mux/ai';
+ import { hasBurnedInCaptions } from "@mux/ai/workflows";
 
  // Detect burned-in captions (hardcoded subtitles) in video frames
- const result = await hasBurnedInCaptions('your-mux-asset-id', {
-   provider: 'openai'
+ const result = await hasBurnedInCaptions("your-mux-asset-id", {
+   provider: "openai"
  });
 
  console.log(result.hasBurnedInCaptions); // true/false
- console.log(result.confidence); // 0.0-1.0 confidence score
- console.log(result.detectedLanguage); // Language if captions detected
- console.log(result.storyboardUrl); // Video storyboard analyzed
+ console.log(result.confidence);          // 0.0-1.0 confidence score
+ console.log(result.detectedLanguage);    // Language if captions detected
+ console.log(result.storyboardUrl);       // Video storyboard analyzed
 
  // Compare providers
- const anthropicResult = await hasBurnedInCaptions('your-mux-asset-id', {
-   provider: 'anthropic',
-   model: 'claude-3-5-haiku-20241022'
+ const anthropicResult = await hasBurnedInCaptions("your-mux-asset-id", {
+   provider: "anthropic",
+   model: "claude-sonnet-4-5"
+ });
+
+ const googleResult = await hasBurnedInCaptions("your-mux-asset-id", {
+   provider: "google",
+   model: "gemini-2.5-flash"
  });
 
  // Use base64 mode for improved reliability
- const reliableResult = await hasBurnedInCaptions('your-mux-asset-id', {
-   provider: 'openai',
-   imageSubmissionMode: 'base64',
+ const reliableResult = await hasBurnedInCaptions("your-mux-asset-id", {
+   provider: "openai",
+   imageSubmissionMode: "base64",
    imageDownloadOptions: {
      timeout: 15000,
      retries: 3,
@@ -140,28 +185,29 @@ const reliableResult = await hasBurnedInCaptions('your-mux-asset-id', {
  Choose between two methods for submitting images to AI providers:
 
  **URL Mode (Default):**
+
  - Fast initial response
  - Lower bandwidth usage
  - Relies on AI provider's image downloading
  - May encounter timeouts with slow/unreliable image sources
 
  **Base64 Mode (Recommended for Production):**
+
  - Downloads images locally with robust retry logic
  - Eliminates AI provider timeout issues
  - Better control over slow TTFB and network issues
  - Slightly higher bandwidth usage but more reliable results
  - For OpenAI: submits images as base64 data URIs
- - For Hive: uploads images via multipart/form-data (Hive doesn't support base64 data URIs)
- - For Anthropic (summarization): uploads to Files API then references by file_id (no size limit)
+ - For Anthropic/Google: the AI SDK handles converting the base64 payload into the provider-specific format automatically
 
  ```typescript
  // High reliability mode - recommended for production
  const result = await getModerationScores(assetId, {
-   imageSubmissionMode: 'base64',
+   imageSubmissionMode: "base64",
    imageDownloadOptions: {
-     timeout: 15000, // 15s timeout per image
-     retries: 3, // Retry failed downloads 3x
-     retryDelay: 1000, // 1s base delay with exponential backoff
+     timeout: 15000,    // 15s timeout per image
+     retries: 3,        // Retry failed downloads 3x
+     retryDelay: 1000,  // 1s base delay with exponential backoff
      exponentialBackoff: true
    }
  });
@@ -170,91 +216,142 @@ const result = await getModerationScores(assetId, {
  ### Caption Translation
 
  ```typescript
- import { translateCaptions } from '@mux/ai';
+ import { translateCaptions } from "@mux/ai/workflows";
 
  // Translate existing captions to Spanish and add as new track
  const result = await translateCaptions(
-   'your-mux-asset-id',
-   'en', // from language
-   'es', // to language
+   "your-mux-asset-id",
+   "en", // from language
+   "es", // to language
    {
-     provider: 'anthropic',
-     model: 'claude-sonnet-4-20250514'
+     provider: "google",
+     model: "gemini-2.5-flash"
    }
  );
 
- console.log(result.uploadedTrackId); // New Mux track ID
- console.log(result.presignedUrl); // S3 file URL
- console.log(result.translatedVtt); // Translated VTT content
+ console.log(result.uploadedTrackId); // New Mux track ID
+ console.log(result.presignedUrl);    // S3 file URL
+ console.log(result.translatedVtt);   // Translated VTT content
  ```
 
  ### Video Chapters
 
  ```typescript
- import { generateChapters } from '@mux/ai';
+ import { generateChapters } from "@mux/ai/workflows";
 
  // Generate AI-powered chapters from video captions
- const result = await generateChapters('your-mux-asset-id', 'en', {
-   provider: 'openai'
+ const result = await generateChapters("your-mux-asset-id", "en", {
+   provider: "openai"
  });
 
- console.log(result.chapters); // Array of {startTime: number, title: string}
+ console.log(result.chapters); // Array of {startTime: number, title: string}
 
  // Use with Mux Player
- const player = document.querySelector('mux-player');
+ const player = document.querySelector("mux-player");
  player.addChapters(result.chapters);
 
  // Compare providers
- const anthropicResult = await generateChapters('your-mux-asset-id', 'en', {
-   provider: 'anthropic',
-   model: 'claude-3-5-haiku-20241022'
+ const anthropicResult = await generateChapters("your-mux-asset-id", "en", {
+   provider: "anthropic",
+   model: "claude-sonnet-4-5"
+ });
+
+ const googleResult = await generateChapters("your-mux-asset-id", "en", {
+   provider: "google",
+   model: "gemini-2.5-flash"
+ });
+ ```
+
+ ### Video Embeddings
+
+ ```typescript
+ import { generateVideoEmbeddings } from "@mux/ai/workflows";
+
+ // Generate embeddings for semantic video search
+ const result = await generateVideoEmbeddings("your-mux-asset-id", {
+   provider: "openai",
+   chunkingStrategy: {
+     type: "token",
+     maxTokens: 500,
+     overlap: 100
+   }
+ });
+
+ console.log(result.chunks);            // Array of chunk embeddings with timestamps
+ console.log(result.averagedEmbedding); // Single embedding for entire video
+
+ // Store chunks in vector database for timestamp-accurate search
+ for (const chunk of result.chunks) {
+   await vectorDB.insert({
+     id: `${result.assetId}:${chunk.chunkId}`,
+     embedding: chunk.embedding,
+     startTime: chunk.metadata.startTime,
+     endTime: chunk.metadata.endTime
+   });
+ }
+
+ // Use VTT-based chunking to respect cue boundaries
+ const vttResult = await generateVideoEmbeddings("your-mux-asset-id", {
+   provider: "google",
+   chunkingStrategy: {
+     type: "vtt",
+     maxTokens: 500,
+     overlapCues: 2
+   }
  });
  ```
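A common use of the per-chunk embeddings shown above is ranking chunks against a query embedding. The sketch below is illustrative only: cosine similarity and the `number[]` embedding shape are assumptions for the example, not part of the package API.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank transcript chunks by similarity to a query embedding, highest first.
function rankChunks(
  queryEmbedding: number[],
  chunks: { chunkId: string; embedding: number[] }[]
) {
  return chunks
    .map((c) => ({ chunkId: c.chunkId, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .sort((a, b) => b.score - a.score);
}
```

Pair this with the `startTime`/`endTime` metadata on each chunk to jump straight to the matching moment in the video.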
 
  ### Audio Dubbing
 
  ```typescript
- import { translateAudio } from '@mux/ai';
+ import { translateAudio } from "@mux/ai/workflows";
 
  // Create AI-dubbed audio track and add to Mux asset
  // Uses the default audio track on your asset, language is auto-detected
  const result = await translateAudio(
-   'your-mux-asset-id',
-   'es', // target language
+   "your-mux-asset-id",
+   "es", // target language
    {
-     provider: 'elevenlabs',
+     provider: "elevenlabs",
      numSpeakers: 0 // Auto-detect speakers
    }
  );
 
- console.log(result.dubbingId); // ElevenLabs dubbing job ID
- console.log(result.uploadedTrackId); // New Mux audio track ID
- console.log(result.presignedUrl); // S3 audio file URL
+ console.log(result.dubbingId);       // ElevenLabs dubbing job ID
+ console.log(result.uploadedTrackId); // New Mux audio track ID
+ console.log(result.presignedUrl);    // S3 audio file URL
  ```
 
  ### Compare Summarization from Providers
 
  ```typescript
- import { getSummaryAndTags } from '@mux/ai';
+ import { getSummaryAndTags } from "@mux/ai/workflows";
 
  // Compare different AI providers analyzing the same Mux video asset
- const assetId = 'your-mux-asset-id';
+ const assetId = "your-mux-asset-id";
 
- // OpenAI analysis (default: gpt-4o-mini)
+ // OpenAI analysis (default: gpt-5-mini)
  const openaiResult = await getSummaryAndTags(assetId, {
-   provider: 'openai',
-   tone: 'professional'
+   provider: "openai",
+   tone: "professional"
  });
 
- // Anthropic analysis (default: claude-3-5-haiku-20241022)
+ // Anthropic analysis (default: claude-sonnet-4-5)
  const anthropicResult = await getSummaryAndTags(assetId, {
-   provider: 'anthropic',
-   tone: 'professional'
+   provider: "anthropic",
+   tone: "professional"
+ });
+
+ // Google Gemini analysis (default: gemini-2.5-flash)
+ const googleResult = await getSummaryAndTags(assetId, {
+   provider: "google",
+   tone: "professional"
  });
 
  // Compare results
- console.log('OpenAI:', openaiResult.title);
- console.log('Anthropic:', anthropicResult.title);
+ console.log("OpenAI:", openaiResult.title);
+ console.log("Anthropic:", anthropicResult.title);
+ console.log("Google:", googleResult.title);
  ```
 
  ## Configuration
@@ -266,8 +363,12 @@ MUX_TOKEN_ID=your_mux_token_id
  MUX_TOKEN_SECRET=your_mux_token_secret
  OPENAI_API_KEY=your_openai_api_key
  ANTHROPIC_API_KEY=your_anthropic_api_key
+ GOOGLE_GENERATIVE_AI_API_KEY=your_google_api_key
  ELEVENLABS_API_KEY=your_elevenlabs_api_key
- HIVE_API_KEY=your_hive_api_key
+
+ # Signed Playback (for assets with signed playback policies)
+ MUX_SIGNING_KEY=your_signing_key_id
+ MUX_PRIVATE_KEY=your_base64_encoded_private_key
 
  # S3-Compatible Storage (required for translation & audio dubbing)
  S3_ENDPOINT=https://your-s3-endpoint.com
@@ -281,9 +382,12 @@ Or pass credentials directly:
 
  ```typescript
  const result = await getSummaryAndTags(assetId, {
-   muxTokenId: 'your-token-id',
-   muxTokenSecret: 'your-token-secret',
-   openaiApiKey: 'your-openai-key'
+   muxTokenId: "your-token-id",
+   muxTokenSecret: "your-token-secret",
+   openaiApiKey: "your-openai-key",
+   // For assets with signed playback policies:
+   muxSigningKey: "your-signing-key-id",
+   muxPrivateKey: "your-base64-private-key"
  });
  ```
 
@@ -294,13 +398,15 @@ const result = await getSummaryAndTags(assetId, {
  Analyzes a Mux video asset and returns AI-generated metadata.
 
  **Parameters:**
+
  - `assetId` (string) - Mux video asset ID
  - `options` (optional) - Configuration options
 
  **Options:**
- - `provider?: 'openai' | 'anthropic'` - AI provider (default: 'openai')
+
+ - `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
  - `tone?: 'normal' | 'sassy' | 'professional'` - Analysis tone (default: 'normal')
- - `model?: string` - AI model to use (default: 'gpt-4o-mini' for OpenAI, 'claude-3-5-haiku-20241022' for Anthropic)
+ - `model?: string` - AI model to use (defaults: `gpt-5-mini`, `claude-sonnet-4-5`, or `gemini-2.5-flash`)
  - `includeTranscript?: boolean` - Include video transcript in analysis (default: true)
  - `cleanTranscript?: boolean` - Remove VTT timestamps and formatting from transcript (default: true)
  - `imageSubmissionMode?: 'url' | 'base64'` - How to submit storyboard to AI providers (default: 'url')
@@ -310,33 +416,45 @@ Analyzes a Mux video asset and returns AI-generated metadata.
    - `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
    - `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
    - `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
+ - `promptOverrides?: object` - Override specific sections of the prompt for custom use cases
+   - `task?: string` - Override the main task instruction
+   - `title?: string` - Override title generation guidance
+   - `description?: string` - Override description generation guidance
+   - `keywords?: string` - Override keywords generation guidance
+   - `qualityGuidelines?: string` - Override quality guidelines
  - `muxTokenId?: string` - Mux API token ID
- - `muxTokenSecret?: string` - Mux API token secret
+ - `muxTokenSecret?: string` - Mux API token secret
+ - `muxSigningKey?: string` - Signing key ID for signed playback policies
+ - `muxPrivateKey?: string` - Base64-encoded private key for signed playback policies
  - `openaiApiKey?: string` - OpenAI API key
  - `anthropicApiKey?: string` - Anthropic API key
+ - `googleApiKey?: string` - Google Generative AI API key
 
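For intuition, the retry options above combine roughly as follows. This is a hypothetical reconstruction of the schedule the option names imply (base delay doubled per attempt, capped by `maxRetryDelay`); the package's exact internals may differ.

```typescript
// Hypothetical retry schedule implied by the imageDownloadOptions defaults:
// retryDelay is the base delay, exponentialBackoff doubles it per attempt,
// and maxRetryDelay caps the result.
function retryDelayMs(
  attempt: number,            // 0-based retry attempt
  retryDelay = 1000,          // option default
  maxRetryDelay = 10000,      // option default
  exponentialBackoff = true   // option default
): number {
  const delay = exponentialBackoff ? retryDelay * 2 ** attempt : retryDelay;
  return Math.min(delay, maxRetryDelay);
}

console.log(retryDelayMs(0)); // 1000
console.log(retryDelayMs(3)); // 8000
console.log(retryDelayMs(5)); // 10000 (capped from 32000)
```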
  **Returns:**
+
  ```typescript
- {
+ interface SummaryAndTagsResult {
    assetId: string;
-   title: string; // Short title (max 100 chars)
-   description: string; // Detailed description
-   tags: string[]; // Relevant keywords
+   title: string;         // Short title (max 100 chars)
+   description: string;   // Detailed description
+   tags: string[];        // Relevant keywords
    storyboardUrl: string; // Video storyboard URL
  }
  ```
 
  ### `getModerationScores(assetId, options?)`
 
- Analyzes video thumbnails for inappropriate content using OpenAI's moderation API or Hive's Visual Moderation API.
+ Analyzes video thumbnails for inappropriate content using OpenAI's Moderation API or Hive's visual moderation API.
 
  **Parameters:**
+
  - `assetId` (string) - Mux video asset ID
  - `options` (optional) - Configuration options
 
  **Options:**
+
  - `provider?: 'openai' | 'hive'` - Moderation provider (default: 'openai')
- - `model?: string` - OpenAI model to use (default: 'omni-moderation-latest')
+ - `model?: string` - OpenAI moderation model to use (default: `omni-moderation-latest`)
  - `thresholds?: { sexual?: number; violence?: number }` - Custom thresholds (default: {sexual: 0.7, violence: 0.8})
  - `thumbnailInterval?: number` - Seconds between thumbnails for long videos (default: 10)
  - `thumbnailWidth?: number` - Thumbnail width in pixels (default: 640)
@@ -348,25 +466,26 @@ Analyzes video thumbnails for inappropriate content using OpenAI's moderation AP
    - `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
    - `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
    - `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
- - `muxTokenId/muxTokenSecret/openaiApiKey?: string` - API credentials
- - `hiveApiKey?: string` - Hive API key (required for Hive provider)
+ - `muxTokenId/muxTokenSecret?: string` - Mux credentials
+ - `openaiApiKey?/hiveApiKey?` - Provider credentials
 
  **Returns:**
+
  ```typescript
  {
    assetId: string;
-   thumbnailScores: Array<{ // Individual thumbnail results
+   thumbnailScores: Array<{   // Individual thumbnail results
      url: string;
-     sexual: number; // 0-1 score
-     violence: number; // 0-1 score
+     sexual: number;          // 0-1 score
+     violence: number;        // 0-1 score
      error: boolean;
    }>;
-   maxScores: { // Highest scores across all thumbnails
+   maxScores: {               // Highest scores across all thumbnails
      sexual: number;
      violence: number;
    };
-   exceedsThreshold: boolean; // true if content should be flagged
-   thresholds: { // Threshold values used
+   exceedsThreshold: boolean; // true if content should be flagged
+   thresholds: {              // Threshold values used
      sexual: number;
      violence: number;
    };
@@ -378,12 +497,14 @@ Analyzes video thumbnails for inappropriate content using OpenAI's moderation AP
  Analyzes video frames to detect burned-in captions (hardcoded subtitles) that are permanently embedded in the video image.
 
  **Parameters:**
+
  - `assetId` (string) - Mux video asset ID
  - `options` (optional) - Configuration options
 
  **Options:**
- - `provider?: 'openai' | 'anthropic'` - AI provider (default: 'openai')
- - `model?: string` - AI model to use (default: 'gpt-4o-mini' for OpenAI, 'claude-3-5-haiku-20241022' for Anthropic)
+
+ - `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
+ - `model?: string` - AI model to use (defaults: `gpt-5-mini`, `claude-sonnet-4-5`, or `gemini-2.5-flash`)
  - `imageSubmissionMode?: 'url' | 'base64'` - How to submit storyboard to AI providers (default: 'url')
  - `imageDownloadOptions?: object` - Options for image download when using base64 mode
    - `timeout?: number` - Request timeout in milliseconds (default: 10000)
@@ -395,19 +516,22 @@ Analyzes video frames to detect burned-in captions (hardcoded subtitles) that ar
  - `muxTokenSecret?: string` - Mux API token secret
  - `openaiApiKey?: string` - OpenAI API key
  - `anthropicApiKey?: string` - Anthropic API key
+ - `googleApiKey?: string` - Google Generative AI API key
 
  **Returns:**
+
  ```typescript
  {
    assetId: string;
-   hasBurnedInCaptions: boolean; // Whether burned-in captions were detected
-   confidence: number; // Confidence score (0.0-1.0)
+   hasBurnedInCaptions: boolean;    // Whether burned-in captions were detected
+   confidence: number;              // Confidence score (0.0-1.0)
    detectedLanguage: string | null; // Language of detected captions, or null
-   storyboardUrl: string; // URL to analyzed storyboard
+   storyboardUrl: string;           // URL to analyzed storyboard
  }
  ```
 
  **Detection Logic:**
+
  - Analyzes video storyboard frames to identify text overlays
  - Distinguishes between actual captions and marketing/end-card text
  - Text appearing only in final 1-2 frames is classified as marketing copy

Translates existing captions from one language to another and optionally adds them as a new track to the Mux asset.

**Parameters:**

- `assetId` (string) - Mux video asset ID
- `fromLanguageCode` (string) - Source language code (e.g., 'en', 'es', 'fr')
- `toLanguageCode` (string) - Target language code (e.g., 'es', 'fr', 'de')
- `options` (optional) - Configuration options

**Options:**

- `provider: 'openai' | 'anthropic' | 'google'` - AI provider (required)
- `model?: string` - Model to use (defaults to the provider's chat-vision model if omitted)
- `uploadToMux?: boolean` - Whether to upload the translated track to Mux (default: true)
- `s3Endpoint?: string` - S3-compatible storage endpoint
- `s3Region?: string` - S3 region (default: 'auto')
- `s3Bucket?: string` - S3 bucket name
- `s3AccessKeyId?: string` - S3 access key ID
- `s3SecretAccessKey?: string` - S3 secret access key
- `muxTokenId/muxTokenSecret?: string` - Mux credentials
- `openaiApiKey?/anthropicApiKey?/googleApiKey?` - Provider credentials

**Returns:**

```typescript
interface TranslateCaptionsResult {
  assetId: string;
  sourceLanguageCode: string;
  targetLanguageCode: string;
  originalVtt: string;      // Original VTT content
  translatedVtt: string;    // Translated VTT content
  uploadedTrackId?: string; // Mux track ID (if uploaded)
  presignedUrl?: string;    // S3 presigned URL (expires in 1 hour)
}
```

All ISO 639-1 language codes are automatically supported using `Intl.DisplayNames`.
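
The `Intl.DisplayNames` lookup also drives the generated track names ("{Language} (auto-translated)", as described in the translation workflow). A small sketch of that pattern, not the library's internal code:

```typescript
// Resolve an ISO 639-1 code to its English display name and build the
// "{Language} (auto-translated)" track name used for uploaded subtitle tracks.
const languageNames = new Intl.DisplayNames(["en"], { type: "language" });

function translatedTrackName(languageCode: string): string {
  const name = languageNames.of(languageCode) ?? languageCode;
  return `${name} (auto-translated)`;
}

console.log(translatedTrackName("es")); // "Spanish (auto-translated)"
console.log(translatedTrackName("fr")); // "French (auto-translated)"
```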

Generates AI-powered chapter markers by analyzing video captions. Creates logical chapter breaks based on topic changes and content transitions.

**Parameters:**

- `assetId` (string) - Mux video asset ID
- `languageCode` (string) - Language code for captions (e.g., 'en', 'es', 'fr')
- `options` (optional) - Configuration options

**Options:**

- `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
- `model?: string` - AI model to use (defaults: `gpt-5-mini`, `claude-sonnet-4-5`, or `gemini-2.5-flash`)
- `muxTokenId?: string` - Mux API token ID
- `muxTokenSecret?: string` - Mux API token secret
- `openaiApiKey?: string` - OpenAI API key
- `anthropicApiKey?: string` - Anthropic API key
- `googleApiKey?: string` - Google Generative AI API key

**Returns:**

```typescript
{
  assetId: string;
  languageCode: string;
  chapters: Array<{
    startTime: number; // Chapter start time in seconds
    title: string;     // Descriptive chapter title
  }>;
}
```

**Requirements:**

- Asset must have a caption track in the specified language
- The caption track must be in 'ready' status
- Uses existing auto-generated or uploaded captions

**Example Output:**

```javascript
// Perfect format for Mux Player
player.addChapters([
  { startTime: 0, title: "Introduction and Setup" },
  { startTime: 45, title: "Main Content Discussion" },
  { startTime: 120, title: "Conclusion" }
]);
```
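
If you want a WebVTT chapters file rather than calling `player.addChapters`, the same array can be serialized directly. A sketch (the video duration used to close the final cue is an assumed input, not something the workflow returns):

```typescript
interface Chapter {
  startTime: number; // seconds
  title: string;
}

// Format seconds as an HH:MM:SS.mmm WebVTT timestamp.
function vttTimestamp(seconds: number): string {
  const h = Math.floor(seconds / 3600);
  const m = Math.floor((seconds % 3600) / 60);
  const s = (seconds % 60).toFixed(3).padStart(6, "0");
  return `${String(h).padStart(2, "0")}:${String(m).padStart(2, "0")}:${s}`;
}

// Serialize chapters to WebVTT cues; each cue ends where the next begins.
function chaptersToVtt(chapters: Chapter[], videoDuration: number): string {
  const cues = chapters.map((c, i) => {
    const end = i + 1 < chapters.length ? chapters[i + 1].startTime : videoDuration;
    return `${vttTimestamp(c.startTime)} --> ${vttTimestamp(end)}\n${c.title}`;
  });
  return ["WEBVTT", ...cues].join("\n\n");
}

const vtt = chaptersToVtt(
  [
    { startTime: 0, title: "Introduction and Setup" },
    { startTime: 45, title: "Main Content Discussion" },
    { startTime: 120, title: "Conclusion" },
  ],
  180,
);
console.log(vtt);
```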

Creates AI-dubbed audio tracks from existing video content using ElevenLabs voice cloning and translation. Uses the default audio track on your asset; the source language is auto-detected.

**Parameters:**

- `assetId` (string) - Mux video asset ID (must have an audio.m4a static rendition)
- `toLanguageCode` (string) - Target language code (e.g., 'es', 'fr', 'de')
- `options` (optional) - Configuration options

**Options:**

- `provider?: 'elevenlabs'` - AI provider (default: 'elevenlabs')
- `numSpeakers?: number` - Number of speakers (default: 0 for auto-detect)
- `uploadToMux?: boolean` - Whether to upload the dubbed track to Mux (default: true)
- `muxTokenId/muxTokenSecret?: string` - API credentials

**Returns:**

```typescript
interface TranslateAudioResult {
  assetId: string;
  targetLanguageCode: string;
  dubbingId: string;        // ElevenLabs dubbing job ID
  uploadedTrackId?: string; // Mux audio track ID (if uploaded)
  presignedUrl?: string;    // S3 presigned URL (expires in 1 hour)
}
```

**Requirements:**

- Asset must have an `audio.m4a` static rendition
- ElevenLabs API key with Creator plan or higher
- S3-compatible storage for Mux ingestion

**Supported Languages:**
ElevenLabs supports 32+ languages with automatic language name detection via `Intl.DisplayNames`. Supported languages include English, Spanish, French, German, Italian, Portuguese, Polish, Japanese, Korean, Chinese, Russian, Arabic, Hindi, Thai, and many more. Track names are automatically generated (e.g., "Polish (auto-dubbed)").

### Custom Prompts with `promptOverrides`

Customize specific sections of the summarization prompt for different use cases like SEO, social media, or technical analysis.

**Tip:** Before adding overrides, read through the default summarization prompt template in `src/functions/summarization.ts` (the `summarizationPromptBuilder` config) so that you have clear context on what each section does and what you’re changing.

```typescript
import { getSummaryAndTags } from "@mux/ai/workflows";

// SEO-optimized metadata
const seoResult = await getSummaryAndTags(assetId, {
  tone: "professional",
  promptOverrides: {
    task: "Generate SEO-optimized metadata that maximizes discoverability.",
    title: "Create a search-optimized title (50-60 chars) with primary keyword front-loaded.",
    keywords: "Focus on high search volume terms and long-tail keywords.",
  },
});

// Social media optimized for engagement
const socialResult = await getSummaryAndTags(assetId, {
  promptOverrides: {
    title: "Create a scroll-stopping headline using emotional triggers or curiosity gaps.",
    description: "Write shareable copy that creates FOMO and works without watching the video.",
    keywords: "Generate hashtag-ready keywords for trending and niche community tags.",
  },
});

// Technical/production analysis
const technicalResult = await getSummaryAndTags(assetId, {
  tone: "professional",
  promptOverrides: {
    task: "Analyze cinematography, lighting, and production techniques.",
    title: "Describe the production style or filmmaking technique.",
    description: "Provide a technical breakdown of camera work, lighting, and editing.",
    keywords: "Use industry-standard production terminology.",
  },
});
```

**Available override sections:**

| Section | Description |
|---------|-------------|
| `task` | Main instruction for what to analyze |
| `title` | Guidance for generating the title |
| `description` | Guidance for generating the description |
| `keywords` | Guidance for generating keywords/tags |
| `qualityGuidelines` | General quality instructions |

Each override can be a simple string (replaces the section content) or a full `PromptSection` object for advanced control over XML tag names and attributes.
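
The object form might look like the following. The `PromptSection` field names here are hypothetical, chosen only to illustrate the idea of controlling the emitted XML tag; check the package's exported types before relying on them:

```typescript
// Hypothetical PromptSection shape for illustration only; the real exported
// type in @mux/ai may differ. A string override replaces a section's content,
// while the object form would also control the XML tag emitted into the prompt.
interface PromptSection {
  tagName?: string;                    // XML tag name for the section
  attributes?: Record<string, string>; // XML attributes on the tag
  content: string;                     // section body text
}

const titleOverride: PromptSection = {
  tagName: "title_instructions",
  attributes: { audience: "developers" },
  content: "Create a concise, technically accurate title.",
};

// Render the section as the XML-style prompt fragment it would produce.
function renderSection(section: PromptSection): string {
  const tag = section.tagName ?? "section";
  const attrs = Object.entries(section.attributes ?? {})
    .map(([k, v]) => ` ${k}="${v}"`)
    .join("");
  return `<${tag}${attrs}>${section.content}</${tag}>`;
}

console.log(renderSection(titleOverride));
// <title_instructions audience="developers">Create a concise, technically accurate title.</title_instructions>
```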

## Examples

See the `examples/` directory for complete working examples.

**Prerequisites:**
Create a `.env` file in the project root with your API credentials:

```bash
MUX_TOKEN_ID=your_token_id
MUX_TOKEN_SECRET=your_token_secret
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GOOGLE_GENERATIVE_AI_API_KEY=your_google_key
HIVE_API_KEY=your_hive_key # required for Hive moderation runs
```

All examples automatically load environment variables using `dotenv`.

### Quick Start (Run from Root)

You can run examples directly from the project root without installing dependencies in each example folder:

```bash
# Chapters
npm run example:chapters <asset-id> [language-code] [provider]
npm run example:chapters:compare <asset-id> [language-code]

# Burned-in Caption Detection
npm run example:burned-in <asset-id> [provider]
npm run example:burned-in:compare <asset-id>

# Summarization
npm run example:summarization <asset-id> [provider]
npm run example:summarization:compare <asset-id>

# Moderation
npm run example:moderation <asset-id> [provider]
npm run example:moderation:compare <asset-id>

# Caption Translation
npm run example:translate-captions <asset-id> [from-lang] [to-lang] [provider]

# Audio Translation (Dubbing)
npm run example:translate-audio <asset-id> [to-lang]

# Signed Playback (for assets with signed playback policies)
npm run example:signed-playback <signed-asset-id>
npm run example:signed-playback:summarize <signed-asset-id> [provider]
```

**Examples:**

```bash
# Generate chapters with OpenAI
npm run example:chapters abc123 en openai

# Detect burned-in captions with Anthropic
npm run example:burned-in abc123 anthropic

# Compare OpenAI vs Anthropic chapter generation
npm run example:chapters:compare abc123 en

# Run moderation analysis with Hive
npm run example:moderation abc123 hive

# Translate captions from English to Spanish with Anthropic
npm run example:translate-captions abc123 en es anthropic

# Summarize a video with Anthropic (default model: Claude Sonnet 4.5)
npm run example:summarization abc123 anthropic

# Create AI-dubbed audio in French
npm run example:translate-audio abc123 fr
```

### Summarization Examples

- **Basic Usage**: Default prompt with different tones
- **Custom Prompts**: Override prompt sections with presets (SEO, social, technical, ecommerce)
- **Tone Variations**: Compare analysis styles

```bash
cd examples/summarization
npm install
npm run basic <your-asset-id> [provider]
npm run tones <your-asset-id>

# Custom prompts with presets
npm run custom <your-asset-id> --preset seo
npm run custom <your-asset-id> --preset social
npm run custom <your-asset-id> --preset technical
npm run custom <your-asset-id> --preset ecommerce

# Or provide individual overrides
npm run custom <your-asset-id> --task "Focus on product features"
```

### Moderation Examples

- **Basic Moderation**: Analyze content with default thresholds
- **Custom Thresholds**: Compare strict/default/permissive settings
- **Provider Comparison**: Compare OpenAI’s dedicated Moderation API with Hive’s visual moderation API

```bash
cd examples/moderation
npm install
npm run basic <your-asset-id> [provider] # provider: openai | hive
npm run thresholds <your-asset-id>
npm run compare <your-asset-id>
```

Supported moderation providers: `openai` (default) and `hive`. Set `HIVE_API_KEY` when selecting Hive.
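
The thresholds comparison ultimately comes down to a score-versus-threshold check per category. An illustrative sketch (the field names and threshold values are examples, not the exact shape returned by `getModerationScores`):

```typescript
// Illustrative moderation result: category scores in the 0.0-1.0 range.
interface ModerationScores {
  sexual: number;
  violence: number;
}

// Flag content when any category score meets or exceeds its threshold.
function isFlagged(
  scores: ModerationScores,
  thresholds: ModerationScores,
): boolean {
  return scores.sexual >= thresholds.sexual || scores.violence >= thresholds.violence;
}

const scores: ModerationScores = { sexual: 0.02, violence: 0.41 };

console.log(isFlagged(scores, { sexual: 0.5, violence: 0.5 })); // permissive: false
console.log(isFlagged(scores, { sexual: 0.2, violence: 0.2 })); // strict: true
```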

### Burned-in Caption Examples

- **Basic Detection**: Detect burned-in captions with different AI providers
- **Provider Comparison**: Compare OpenAI vs Anthropic vs Google detection accuracy

```bash
cd examples/burned-in-captions
npm run compare <your-asset-id>
```

### Chapter Generation Examples

- **Basic Chapters**: Generate chapters with different AI providers
- **Provider Comparison**: Compare OpenAI vs Anthropic vs Google chapter generation

```bash
cd examples/chapters
npm run chapters:basic <your-asset-id> [language-code] [provider]
npm run compare <your-asset-id> [language-code]
```

### Caption Translation Examples

- **Basic Translation**: Translate captions and upload to Mux
- **Translation Only**: Translate without uploading to Mux

```bash
cd examples/translate-captions
npm install
npm run basic <your-asset-id> en es [provider]
npm run translation-only <your-asset-id> en fr [provider]
```

**Translation Workflow:**

1. Fetches existing captions from the Mux asset
2. Translates the VTT content using your selected provider (default: Claude Sonnet 4.5)
3. Uploads the translated VTT to S3-compatible storage
4. Generates a presigned URL (1-hour expiry)
5. Adds a new subtitle track to the Mux asset
6. Track name: "{Language} (auto-translated)"

### Audio Dubbing Examples

- **Basic Dubbing**: Create AI-dubbed audio and upload to Mux
- **Dubbing Only**: Create dubbed audio without uploading to Mux

```bash
cd examples/translate-audio
npm install
npm run basic <your-asset-id> es
npm run dubbing-only <your-asset-id> fr
```

**Audio Dubbing Workflow:**

1. Checks that the asset has an audio.m4a static rendition
2. Downloads the default audio track from Mux
3. Creates an ElevenLabs dubbing job with automatic language detection
8. Adds a new audio track to the Mux asset
9. Track name: "{Language} (auto-dubbed)"

### Signed Playback Examples

- **URL Generation Test**: Verify signed URLs work for storyboards, thumbnails, and transcripts
- **Signed Summarization**: Full summarization workflow with a signed asset

```bash
cd examples/signed-playback
npm install

# Verify signed URL generation
npm run basic <signed-asset-id>

# Summarize a signed asset
npm run summarize <signed-asset-id> [provider]
```

**Prerequisites:**

1. Create a Mux asset with `playback_policy: "signed"`
2. Create a signing key in Mux Dashboard → Settings → Signing Keys
3. Set the `MUX_SIGNING_KEY` and `MUX_PRIVATE_KEY` environment variables

**How Signed Playback Works:**
When you provide signing credentials, the library automatically:

- Detects if an asset has a signed playback policy
- Generates JWT tokens with the RS256 algorithm
- Uses the correct `aud` claim for each asset type (video, thumbnail, storyboard)
- Appends tokens to URLs as query parameters
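
That token construction can be sketched with Node's built-in `crypto` module. This is illustrative only, not the library's implementation; it assumes Mux's documented audience values ('v' for video, 't' for thumbnail, 's' for storyboard) and uses a throwaway key pair in place of a real Mux signing key:

```typescript
import { createSign, generateKeyPairSync } from "node:crypto";

// Encode JWT segments as base64url.
const b64url = (data: string | Buffer): string =>
  Buffer.from(data).toString("base64url");

// Build a signed-playback style JWT (RS256) by hand. In practice the key id
// and private key come from MUX_SIGNING_KEY / MUX_PRIVATE_KEY.
function signPlaybackToken(
  playbackId: string,
  aud: "v" | "t" | "s", // video, thumbnail, storyboard
  keyId: string,
  privateKeyPem: string,
): string {
  const header = b64url(JSON.stringify({ alg: "RS256", typ: "JWT", kid: keyId }));
  const payload = b64url(
    JSON.stringify({ sub: playbackId, aud, exp: Math.floor(Date.now() / 1000) + 3600 }),
  );
  const signer = createSign("RSA-SHA256");
  signer.update(`${header}.${payload}`);
  const signature = signer.sign(privateKeyPem).toString("base64url");
  return `${header}.${payload}.${signature}`;
}

// Demo with a throwaway key pair; the token is appended as a query parameter.
const { privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });
const pem = privateKey.export({ type: "pkcs1", format: "pem" }).toString();
const token = signPlaybackToken("playback-id-123", "s", "signing-key-id", pem);
console.log(`https://image.mux.com/playback-id-123/storyboard.png?token=${token}`);
```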
645
939
  ## S3-Compatible Storage
646
940
 
647
941
  The translation feature requires S3-compatible storage to temporarily host VTT files for Mux ingestion. Supported providers include:
@@ -655,17 +949,45 @@ The translation feature requires S3-compatible storage to temporarily host VTT f
655
949
 
656
950
  **Why S3 Storage?**
657
951
  Mux requires a publicly accessible URL to ingest subtitle tracks. The translation workflow:
952
+
658
953
  1. Uploads translated VTT to your S3 storage
659
954
  2. Generates a presigned URL for secure access
660
955
  3. Mux fetches the file using the presigned URL
661
956
  4. File remains in your storage for future use
662
957
 
663
- ## Planned Features
958
+ ## Development
959
+
960
+ ### Setup
961
+
962
+ ```bash
963
+ # Clone repo and install dependencies
964
+ git clone https://github.com/muxinc/mux-ai.git
965
+ cd mux-ai
966
+ npm install # Automatically sets up git hooks via Husky
967
+ ```
968
+
969
+ ### Style
970
+
971
+ This project uses automated tooling to enforce consistent code style:
664
972
 
665
- - **Additional Translation Providers**: OpenAI GPT-4 support
666
- - **Batch Translation**: Translate multiple assets at once
667
- - **Custom Translation Prompts**: Override default translation behavior
973
+ - **ESLint** with `@antfu/eslint-config` for linting and formatting
974
+ - **TypeScript** strict mode for type safety
975
+ - **Pre-commit hooks** that run automatically before each commit
976
+
977
+ ```bash
978
+ # Check for linting issues
979
+ npm run lint
980
+
981
+ # Auto-fix linting issues
982
+ npm run lint:fix
983
+
984
+ # Run type checking
985
+ npm run typecheck
986
+
987
+ # Run tests
988
+ npm test
989
+ ```
668
990
 
669
991
  ## License
670
992
 
671
- MIT © Mux, Inc.
993
+ [Apache 2.0](LICENSE)