@mux/ai 0.1.5 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +226 -821
- package/dist/{index-Bnv7tv90.d.ts → index-CMZYZcj6.d.ts} +122 -3
- package/dist/index.d.ts +1 -1
- package/dist/index.js +955 -624
- package/dist/index.js.map +1 -1
- package/dist/primitives/index.js +18 -71
- package/dist/primitives/index.js.map +1 -1
- package/dist/workflows/index.d.ts +1 -1
- package/dist/workflows/index.js +953 -638
- package/dist/workflows/index.js.map +1 -1
- package/package.json +21 -23
- package/dist/index-BNnz9P_5.d.mts +0 -144
- package/dist/index-vJ5r2FNm.d.mts +0 -477
- package/dist/index.d.mts +0 -13
- package/dist/index.mjs +0 -2205
- package/dist/index.mjs.map +0 -1
- package/dist/primitives/index.d.mts +0 -3
- package/dist/primitives/index.mjs +0 -358
- package/dist/primitives/index.mjs.map +0 -1
- package/dist/types-ktXDZ93V.d.mts +0 -137
- package/dist/workflows/index.d.mts +0 -8
- package/dist/workflows/index.mjs +0 -2168
- package/dist/workflows/index.mjs.map +0 -1
package/README.md
CHANGED
@@ -1,69 +1,43 @@
-#
+# `@mux/ai` 📼 🤝 🤖
 
-
+[](https://www.npmjs.com/package/@mux/ai)
+[](https://opensource.org/licenses/Apache-2.0)
 
-
-
-| Workflow | Description | Providers | Default Models | Input | Output |
-| ------------------------- | --------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------ | -------------------------------- | ---------------------------------------------- |
-| `getSummaryAndTags` | Generate titles, descriptions, and tags from a Mux video asset | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Asset ID + options | Title, description, tags, storyboard URL |
-| `getModerationScores` | Analyze video thumbnails for inappropriate content | OpenAI, Hive | `omni-moderation-latest` (OpenAI) or Hive visual moderation task | Asset ID + thresholds | Sexual/violence scores, flagged status |
-| `hasBurnedInCaptions` | Detect burned-in captions (hardcoded subtitles) in video frames | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Asset ID + options | Boolean result, confidence, language |
-| `generateChapters` | Generate AI-powered chapter markers from video captions | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Asset ID + language + options | Timestamped chapter list, ready for Mux Player |
-| `generateVideoEmbeddings` | Generate vector embeddings for video transcript chunks | OpenAI, Google | `text-embedding-3-small` (OpenAI), `gemini-embedding-001` (Google) | Asset ID + chunking strategy | Chunk embeddings + averaged embedding |
-| `translateCaptions` | Translate video captions to different languages | OpenAI, Anthropic, Google | `gpt-5-mini`, `claude-sonnet-4-5`, `gemini-2.5-flash` | Asset ID + languages + S3 config | Translated VTT + Mux track ID |
-| `translateAudio` | Create AI-dubbed audio tracks in different languages | ElevenLabs only | ElevenLabs Dubbing API | Asset ID + languages + S3 config | Dubbed audio + Mux track ID |
-
-## Features
-
-- **Cost-Effective by Default**: Uses affordable frontier models like `gpt-5-mini`, `claude-sonnet-4-5`, and `gemini-2.5-flash` to keep analysis costs low while maintaining high quality results
-- **Multi-modal Analysis**: Combines storyboard images with video transcripts
-- **Tone Control**: Normal, sassy, or professional analysis styles
-- **Prompt Customization**: Override specific prompt sections to tune workflows to your use case
-- **Configurable Thresholds**: Custom sensitivity levels for content moderation
-- **TypeScript**: Fully typed for excellent developer experience
-- **Provider Choice**: Switch between OpenAI, Anthropic, and Google for different perspectives
-- **Composable Building Blocks**: Import primitives to fetch transcripts, thumbnails, and storyboards to build bespoke flows
-- **Universal Language Support**: Automatic language name detection using `Intl.DisplayNames` for all ISO 639-1 codes
+> **A TypeScript SDK for building AI-driven video workflows on the server, powered by [Mux](https://www.mux.com)!**
 
-
+`@mux/ai` does this by providing:
+- Easy to use, purpose-driven, cost effective, configurable **_workflow functions_** that integrate with a variety of popular AI/LLM providers (OpenAI, Anthropic, Google).
+  - **Examples:** [`generateChapters`](#chapter-generation), [`getModerationScores`](#content-moderation), [`generateVideoEmbeddings`](#video-search-with-embeddings), [`getSummaryAndTags`](#video-summarization)
+- Convenient, parameterized, commonly needed **_primitive functions_** backed by [Mux Video](https://www.mux.com/video-api) for building your own media-based AI workflows and integrations.
+  - **Examples:** `getStoryboardUrl`, `chunkVTTCues`, `fetchTranscriptForAsset`
 
-
+# Usage
 
-
-
-- `@mux/ai` – re-exports both namespaces, plus shared `types`, so you can also write `import { workflows, primitives } from '@mux/ai';`.
+```ts
+import { getSummaryAndTags } from "@mux/ai/workflows";
 
-
+const result = await getSummaryAndTags("your-asset-id", {
+  provider: "openai",
+  tone: "professional",
+  includeTranscript: true
+});
 
-
-
-
+console.log(result.title); // "Getting Started with TypeScript"
+console.log(result.description); // "A comprehensive guide to..."
+console.log(result.tags); // ["typescript", "tutorial", "programming"]
+```
 
-export async function summarizeIfSafe(assetId: string) {
-  const moderation = await getModerationScores(assetId, { provider: "openai" });
-  if (moderation.exceedsThreshold) {
-    throw new Error("Asset failed content safety review");
-  }
+> **⚠️ Important:** Many workflows rely on video transcripts for best results. Consider enabling [auto-generated captions](https://www.mux.com/docs/guides/add-autogenerated-captions-and-use-transcripts) on your Mux assets to unlock the full potential of transcript-based workflows like summarization, chapters, and embeddings.
 
-
-    provider: "anthropic",
-    tone: "professional",
-  });
-}
+# Quick Start
 
-
-export async function customTranscriptAnalysis(assetId: string, playbackId: string) {
-  const transcript = await fetchTranscriptForAsset(assetId, "en");
-  const storyboardUrl = getStoryboardUrl(playbackId);
+## Prerequisites
 
-
-
-
-
+- [Node.js](https://nodejs.org/en/download) (≥ 21.0.0)
+- A Mux account and necessary [credentials](#credentials---mux) for your environment (sign up [here](https://dashboard.mux.com/signup) for free!)
+- Accounts and [credentials](#credentials---ai-providers) for any AI providers you intend to use for your workflows
+- (For some workflows only) AWS S3 and [other credentials](#credentials---other)
 
-Use whichever layer makes sense: call a workflow as-is, compose multiple workflows together, or drop down to primitives to build a completely custom workflow.
 
 ## Installation
 
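Aside: both the old and new README describe the moderation result in terms of per-thumbnail `sexual`/`violence` scores, a `maxScores` aggregate, and an `exceedsThreshold` flag. As an illustrative sketch (not part of the package — the aggregation logic below is an assumption inferred from that documented shape), the pieces fit together like this:

```typescript
// Sketch of the moderation result shape the README documents:
// per-thumbnail scores are aggregated by taking the max across thumbnails,
// then compared against the configured thresholds.

interface ThumbnailScore {
  url: string;
  sexual: number;   // 0-1 score
  violence: number; // 0-1 score
  error: boolean;
}

interface Thresholds {
  sexual: number;
  violence: number;
}

// Hypothetical helper (not a package export) mirroring the documented fields.
function aggregateModeration(scores: ThumbnailScore[], thresholds: Thresholds) {
  const valid = scores.filter((s) => !s.error); // skip failed thumbnails
  const maxScores = {
    sexual: Math.max(0, ...valid.map((s) => s.sexual)),
    violence: Math.max(0, ...valid.map((s) => s.violence)),
  };
  const exceedsThreshold =
    maxScores.sexual >= thresholds.sexual ||
    maxScores.violence >= thresholds.violence;
  return { maxScores, exceedsThreshold, thresholds };
}

const result = aggregateModeration(
  [
    { url: "https://image.mux.com/playback-id/thumbnail.png?time=0", sexual: 0.1, violence: 0.05, error: false },
    { url: "https://image.mux.com/playback-id/thumbnail.png?time=10", sexual: 0.85, violence: 0.2, error: false },
  ],
  { sexual: 0.7, violence: 0.8 }
);
console.log(result.exceedsThreshold); // true — max sexual score 0.85 >= 0.7
```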
@@ -71,205 +45,127 @@ Use whichever layer makes sense: call a workflow as-is, compose multiple workflows together, or drop down to primitives to build a completely custom workflow.
 npm install @mux/ai
 ```
 
-##
+## Configuration
 
-
+We support [dotenv](https://www.npmjs.com/package/dotenv), so you can simply add the following environment variables to your `.env` file:
 
-```
-
+```bash
+# Required
+MUX_TOKEN_ID=your_mux_token_id
+MUX_TOKEN_SECRET=your_mux_token_secret
 
-
-
-
-});
+# Needed if your assets _only_ have signed playback IDs
+MUX_SIGNING_KEY=your_signing_key_id
+MUX_PRIVATE_KEY=your_base64_encoded_private_key
 
-
-
-
-
-
-// Use base64 mode for improved reliability (works with OpenAI, Anthropic, and Google)
-const reliableResult = await getSummaryAndTags("your-mux-asset-id", {
-  provider: "anthropic",
-  imageSubmissionMode: "base64", // Downloads storyboard locally before submission
-  imageDownloadOptions: {
-    timeout: 15000,
-    retries: 2,
-    retryDelay: 1000
-  },
-  tone: "professional"
-});
+# You only need to configure API keys for the AI platforms and workflows you're using
+OPENAI_API_KEY=your_openai_api_key
+ANTHROPIC_API_KEY=your_anthropic_api_key
+GOOGLE_GENERATIVE_AI_API_KEY=your_google_api_key
+ELEVENLABS_API_KEY=your_elevenlabs_api_key
 
-
-
-
-
-
-
-  },
-});
+# S3-Compatible Storage (required for translation & audio dubbing)
+S3_ENDPOINT=https://your-s3-endpoint.com
+S3_REGION=auto
+S3_BUCKET=your-bucket-name
+S3_ACCESS_KEY_ID=your-access-key
+S3_SECRET_ACCESS_KEY=your-secret-key
 ```
 
-
+Or pass credentials directly to each function:
 
 ```typescript
-
-
-
-
-  thresholds: { sexual: 0.7, violence: 0.8 }
+const result = await getSummaryAndTags(assetId, {
+  muxTokenId: "your-token-id",
+  muxTokenSecret: "your-token-secret",
+  openaiApiKey: "your-openai-key"
 });
+```
 
-
-console.log(result.exceedsThreshold); // true if content should be flagged
-console.log(result.thumbnailScores); // Individual thumbnail results
+> **💡 Tip:** If you're using `.env` in a repository or version tracking system, make sure you add this file to your `.gitignore` or equivalent to avoid unintentionally committing secure credentials.
 
-
-const hiveResult = await getModerationScores("your-mux-asset-id", {
-  provider: "hive",
-  thresholds: { sexual: 0.9, violence: 0.9 },
-});
+# Workflows
 
-
-const reliableResult = await getModerationScores("your-mux-asset-id", {
-  provider: "openai",
-  imageSubmissionMode: "base64",
-  imageDownloadOptions: {
-    timeout: 15000,
-    retries: 3,
-    retryDelay: 1000
-  }
-});
-```
+## Available pre-built workflows
 
-
+| Workflow | Description | Providers | Default Models | Mux Asset Requirements | Cloud Infrastructure Requirements |
+| ------------------------------------------------------------------------ | ----------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------ | ---------------------- | --------------------------------- |
+| [`getSummaryAndTags`](./docs/WORKFLOWS.md#video-summarization)<br/>[API](./docs/API.md#getsummaryandtagsassetid-options) · [Source](./src/workflows/summarization.ts) | Generate titles, descriptions, and tags for an asset | OpenAI, Anthropic, Google | `gpt-5.1` (OpenAI), `claude-sonnet-4-5` (Anthropic), `gemini-2.5-flash` (Google) | Video (required), Captions (optional) | None |
+| [`getModerationScores`](./docs/WORKFLOWS.md#content-moderation)<br/>[API](./docs/API.md#getmoderationscoresassetid-options) · [Source](./src/workflows/moderation.ts) | Detect inappropriate (sexual or violent) content in an asset | OpenAI, Hive | `omni-moderation-latest` (OpenAI) or Hive visual moderation task | Video (required) | None |
+| [`hasBurnedInCaptions`](./docs/WORKFLOWS.md#burned-in-caption-detection)<br/>[API](./docs/API.md#hasburnedincaptionsassetid-options) · [Source](./src/workflows/burned-in-captions.ts) | Detect burned-in captions (hardcoded subtitles) in an asset | OpenAI, Anthropic, Google | `gpt-5.1` (OpenAI), `claude-sonnet-4-5` (Anthropic), `gemini-2.5-flash` (Google) | Video (required) | None |
+| [`generateChapters`](./docs/WORKFLOWS.md#chapter-generation)<br/>[API](./docs/API.md#generatechaptersassetid-languagecode-options) · [Source](./src/workflows/chapters.ts) | Generate chapter markers for an asset using the transcript | OpenAI, Anthropic, Google | `gpt-5.1` (OpenAI), `claude-sonnet-4-5` (Anthropic), `gemini-2.5-flash` (Google) | Video (required), Captions (required) | None |
+| [`generateVideoEmbeddings`](./docs/WORKFLOWS.md#video-embeddings)<br/>[API](./docs/API.md#generatevideoembeddingsassetid-options) · [Source](./src/workflows/embeddings.ts) | Generate vector embeddings for an asset's transcript chunks | OpenAI, Google | `text-embedding-3-small` (OpenAI), `gemini-embedding-001` (Google) | Video (required), Captions (required) | None |
+| [`translateCaptions`](./docs/WORKFLOWS.md#caption-translation)<br/>[API](./docs/API.md#translatecaptionsassetid-fromlanguagecode-tolanguagecode-options) · [Source](./src/workflows/translate-captions.ts) | Translate an asset's captions into different languages | OpenAI, Anthropic, Google | `gpt-5.1` (OpenAI), `claude-sonnet-4-5` (Anthropic), `gemini-2.5-flash` (Google) | Video (required), Captions (required) | AWS S3 (if `uploadToMux=true`) |
+| [`translateAudio`](./docs/WORKFLOWS.md#audio-dubbing)<br/>[API](./docs/API.md#translateaudioassetid-tolanguagecode-options) · [Source](./src/workflows/translate-audio.ts) | Create AI-dubbed audio tracks in different languages for an asset | ElevenLabs only | ElevenLabs Dubbing API | Video (required), Audio (required) | AWS S3 (if `uploadToMux=true`) |
 
-
-import { hasBurnedInCaptions } from "@mux/ai/workflows";
+## Example Workflows
 
-
-const result = await hasBurnedInCaptions("your-mux-asset-id", {
-  provider: "openai"
-});
+### Video Summarization
 
-
-console.log(result.confidence); // 0.0-1.0 confidence score
-console.log(result.detectedLanguage); // Language if captions detected
-console.log(result.storyboardUrl); // Video storyboard analyzed
+Generate SEO-friendly titles, descriptions, and tags from your video content:
 
-
-
-  provider: "anthropic",
-  model: "claude-sonnet-4-5"
-});
-
-const googleResult = await hasBurnedInCaptions("your-mux-asset-id", {
-  provider: "google",
-  model: "gemini-2.5-flash"
-});
+```typescript
+import { getSummaryAndTags } from "@mux/ai/workflows";
 
-
-const reliableResult = await hasBurnedInCaptions("your-mux-asset-id", {
+const result = await getSummaryAndTags("your-asset-id", {
   provider: "openai",
-
-
-    timeout: 15000,
-    retries: 3,
-    retryDelay: 1000
-  }
+  tone: "professional",
+  includeTranscript: true
 });
-```
 
-
-
-
-
-**URL Mode (Default):**
-
-- Fast initial response
-- Lower bandwidth usage
-- Relies on AI provider's image downloading
-- May encounter timeouts with slow/unreliable image sources
+console.log(result.title); // "Getting Started with TypeScript"
+console.log(result.description); // "A comprehensive guide to..."
+console.log(result.tags); // ["typescript", "tutorial", "programming"]
+```
 
-
+### Content Moderation
 
-
-- Eliminates AI provider timeout issues
-- Better control over slow TTFB and network issues
-- Slightly higher bandwidth usage but more reliable results
-- For OpenAI: submits images as base64 data URIs
-- For Anthropic/Google: the AI SDK handles converting the base64 payload into the provider-specific format automatically
+Automatically detect inappropriate content in videos:
 
 ```typescript
-
-const result = await getModerationScores(assetId, {
-  imageSubmissionMode: "base64",
-  imageDownloadOptions: {
-    timeout: 15000, // 15s timeout per image
-    retries: 3, // Retry failed downloads 3x
-    retryDelay: 1000, // 1s base delay with exponential backoff
-    exponentialBackoff: true
-  }
-});
-```
-
-### Caption Translation
+import { getModerationScores } from "@mux/ai/workflows";
 
-
-
-
-
-const result = await translateCaptions(
-  "your-mux-asset-id",
-  "en", // from language
-  "es", // to language
-  {
-    provider: "google",
-    model: "gemini-2.5-flash"
-  }
-);
+const result = await getModerationScores("your-asset-id", {
+  provider: "openai",
+  thresholds: { sexual: 0.7, violence: 0.8 }
+});
 
-
-console.log(
-console.log(result.
+if (result.exceedsThreshold) {
+  console.log("Content flagged for review");
+  console.log(`Max scores: ${result.maxScores}`);
+}
 ```
 
-###
+### Chapter Generation
+
+Create automatic chapter markers for better video navigation:
 
 ```typescript
 import { generateChapters } from "@mux/ai/workflows";
 
-
-
-  provider: "openai"
+const result = await generateChapters("your-asset-id", "en", {
+  provider: "anthropic"
 });
 
-console.log(result.chapters); // Array of {startTime: number, title: string}
-
 // Use with Mux Player
-const player = document.querySelector("mux-player");
 player.addChapters(result.chapters);
-
-//
-
-
-
-});
-
-const googleResult = await generateChapters("your-mux-asset-id", "en", {
-  provider: "google",
-  model: "gemini-2.5-flash"
-});
+// [
+//   { startTime: 0, title: "Introduction" },
+//   { startTime: 45, title: "Main Content" },
+//   { startTime: 120, title: "Conclusion" }
+// ]
 ```
 
-### Video Embeddings
+### Video Search with Embeddings
+
+Generate embeddings for semantic video search:
 
 ```typescript
 import { generateVideoEmbeddings } from "@mux/ai/workflows";
 
-
-const result = await generateVideoEmbeddings("your-mux-asset-id", {
+const result = await generateVideoEmbeddings("your-asset-id", {
   provider: "openai",
+  languageCode: "en",
   chunkingStrategy: {
     type: "token",
     maxTokens: 500,
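Aside: the embeddings workflow above stores chunk embeddings with `startTime`/`endTime` metadata in a vector database. A minimal sketch of the search side, ranking stored chunks by cosine similarity against a query embedding (the in-memory "index" and helper names here are assumptions for illustration; `vectorDB` in the README is a placeholder, not a package export):

```typescript
// Sketch: find the transcript chunk most similar to a query embedding,
// so the best hit's metadata gives a timestamp range to seek to.

interface Chunk {
  embedding: number[];
  metadata: { startTime: number; endTime: number };
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank all chunks against the query and keep the best match.
function bestChunk(query: number[], chunks: Chunk[]): Chunk {
  return chunks.reduce((best, c) =>
    cosineSimilarity(query, c.embedding) > cosineSimilarity(query, best.embedding) ? c : best
  );
}

const chunks: Chunk[] = [
  { embedding: [1, 0, 0], metadata: { startTime: 0, endTime: 30 } },
  { embedding: [0, 1, 0], metadata: { startTime: 30, endTime: 60 } },
];
console.log(bestChunk([0.9, 0.1, 0], chunks).metadata.startTime); // 0
```

In practice the query embedding would come from the same embedding model used for the chunks (`text-embedding-3-small` or `gemini-embedding-001`), and the ranking would be done by your vector database rather than in application code.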
@@ -277,716 +173,225 @@ const result = await generateVideoEmbeddings("your-mux-asset-id", {
|
|
|
277
173
|
}
|
|
278
174
|
});
|
|
279
175
|
|
|
280
|
-
|
|
281
|
-
console.log(result.averagedEmbedding); // Single embedding for entire video
|
|
282
|
-
|
|
283
|
-
// Store chunks in vector database for timestamp-accurate search
|
|
176
|
+
// Store embeddings in your vector database
|
|
284
177
|
for (const chunk of result.chunks) {
|
|
285
178
|
await vectorDB.insert({
|
|
286
|
-
id: `${result.assetId}:${chunk.chunkId}`,
|
|
287
179
|
embedding: chunk.embedding,
|
|
288
|
-
|
|
289
|
-
|
|
180
|
+
metadata: {
|
|
181
|
+
assetId: result.assetId,
|
|
182
|
+
startTime: chunk.metadata.startTime,
|
|
183
|
+
endTime: chunk.metadata.endTime
|
|
184
|
+
}
|
|
290
185
|
});
|
|
291
186
|
}
|
|
292
|
-
|
|
293
|
-
// Use VTT-based chunking to respect cue boundaries
|
|
294
|
-
const vttResult = await generateVideoEmbeddings("your-mux-asset-id", {
|
|
295
|
-
provider: "google",
|
|
296
|
-
chunkingStrategy: {
|
|
297
|
-
type: "vtt",
|
|
298
|
-
maxTokens: 500,
|
|
299
|
-
overlapCues: 2
|
|
300
|
-
}
|
|
301
|
-
});
|
|
302
|
-
```
|
|
303
|
-
|
|
304
|
-
### Audio Dubbing
|
|
305
|
-
|
|
306
|
-
```typescript
|
|
307
|
-
import { translateAudio } from "@mux/ai/workflows";
|
|
308
|
-
|
|
309
|
-
// Create AI-dubbed audio track and add to Mux asset
|
|
310
|
-
// Uses the default audio track on your asset, language is auto-detected
|
|
311
|
-
const result = await translateAudio(
|
|
312
|
-
"your-mux-asset-id",
|
|
313
|
-
"es", // target language
|
|
314
|
-
{
|
|
315
|
-
provider: "elevenlabs",
|
|
316
|
-
numSpeakers: 0 // Auto-detect speakers
|
|
317
|
-
}
|
|
318
|
-
);
|
|
319
|
-
|
|
320
|
-
console.log(result.dubbingId); // ElevenLabs dubbing job ID
|
|
321
|
-
console.log(result.uploadedTrackId); // New Mux audio track ID
|
|
322
|
-
console.log(result.presignedUrl); // S3 audio file URL
|
|
323
187
|
```
|
|
324
188
|
|
|
325
|
-
|
|
326
|
-
|
|
327
|
-
```typescript
|
|
328
|
-
import { getSummaryAndTags } from "@mux/ai/workflows";
|
|
329
|
-
|
|
330
|
-
// Compare different AI providers analyzing the same Mux video asset
|
|
331
|
-
const assetId = "your-mux-asset-id";
|
|
189
|
+
# Key Features
|
|
332
190
|
|
|
333
|
-
|
|
334
|
-
|
|
335
|
-
|
|
336
|
-
|
|
337
|
-
|
|
338
|
-
|
|
339
|
-
|
|
340
|
-
|
|
341
|
-
|
|
342
|
-
|
|
343
|
-
});
|
|
344
|
-
|
|
345
|
-
// Google Gemini analysis (default: gemini-2.5-flash)
|
|
346
|
-
const googleResult = await getSummaryAndTags(assetId, {
|
|
347
|
-
provider: "google",
|
|
348
|
-
tone: "professional"
|
|
349
|
-
});
|
|
350
|
-
|
|
351
|
-
// Compare results
|
|
352
|
-
console.log("OpenAI:", openaiResult.title);
|
|
353
|
-
console.log("Anthropic:", anthropicResult.title);
|
|
354
|
-
console.log("Google:", googleResult.title);
|
|
355
|
-
```
|
|
356
|
-
|
|
357
|
-
## Configuration
|
|
358
|
-
|
|
359
|
-
Set environment variables:
|
|
360
|
-
|
|
361
|
-
```bash
|
|
362
|
-
MUX_TOKEN_ID=your_mux_token_id
|
|
363
|
-
MUX_TOKEN_SECRET=your_mux_token_secret
|
|
364
|
-
OPENAI_API_KEY=your_openai_api_key
|
|
365
|
-
ANTHROPIC_API_KEY=your_anthropic_api_key
|
|
366
|
-
GOOGLE_GENERATIVE_AI_API_KEY=your_google_api_key
|
|
367
|
-
ELEVENLABS_API_KEY=your_elevenlabs_api_key
|
|
368
|
-
|
|
369
|
-
# Signed Playback (for assets with signed playback policies)
|
|
370
|
-
MUX_SIGNING_KEY=your_signing_key_id
|
|
371
|
-
MUX_PRIVATE_KEY=your_base64_encoded_private_key
|
|
191
|
+
- **Cost-Effective by Default**: Uses affordable frontier models like `gpt-5.1`, `claude-sonnet-4-5`, and `gemini-2.5-flash` to keep analysis costs low while maintaining high quality results
|
|
192
|
+
- **Multi-modal Analysis**: Combines storyboard images with video transcripts for richer understanding
|
|
193
|
+
- **Tone Control**: Choose between normal, sassy, or professional analysis styles for summarization
|
|
194
|
+
- **Prompt Customization**: Override specific prompt sections to tune workflows to your exact use case
|
|
195
|
+
- **Configurable Thresholds**: Set custom sensitivity levels for content moderation
|
|
196
|
+
- **Full TypeScript Support**: Comprehensive types for excellent developer experience and IDE autocomplete
|
|
197
|
+
- **Provider Flexibility**: Switch between OpenAI, Anthropic, Google, and other providers based on your needs
|
|
198
|
+
- **Composable Building Blocks**: Use primitives to fetch transcripts, thumbnails, and storyboards for custom workflows
|
|
199
|
+
- **Universal Language Support**: Automatic language name detection using `Intl.DisplayNames` for all ISO 639-1 codes
|
|
200
|
+
- **Production Ready**: Built-in retry logic, error handling, and edge case management
|
|
372
201
|
|
|
373
|
-
#
|
|
374
|
-
S3_ENDPOINT=https://your-s3-endpoint.com
|
|
375
|
-
S3_REGION=auto
|
|
376
|
-
S3_BUCKET=your-bucket-name
|
|
377
|
-
S3_ACCESS_KEY_ID=your-access-key
|
|
378
|
-
S3_SECRET_ACCESS_KEY=your-secret-key
|
|
379
|
-
```
|
|
202
|
+
# Core Concepts
|
|
380
203
|
|
|
381
|
-
|
|
204
|
+
`@mux/ai` is built around two complementary abstractions:
|
|
382
205
|
|
|
383
|
-
|
|
384
|
-
const result = await getSummaryAndTags(assetId, {
|
|
385
|
-
muxTokenId: "your-token-id",
|
|
386
|
-
muxTokenSecret: "your-token-secret",
|
|
387
|
-
openaiApiKey: "your-openai-key",
|
|
388
|
-
// For assets with signed playback policies:
|
|
389
|
-
muxSigningKey: "your-signing-key-id",
|
|
390
|
-
muxPrivateKey: "your-base64-private-key"
|
|
391
|
-
});
|
|
392
|
-
```
|
|
206
|
+
## Workflows
|
|
393
207
|
|
|
394
|
-
|
|
395
|
-
|
|
396
|
-
### `getSummaryAndTags(assetId, options?)`
|
|
397
|
-
|
|
398
|
-
Analyzes a Mux video asset and returns AI-generated metadata.
|
|
399
|
-
|
|
400
|
-
**Parameters:**
|
|
401
|
-
|
|
402
|
-
- `assetId` (string) - Mux video asset ID
|
|
403
|
-
- `options` (optional) - Configuration options
|
|
404
|
-
|
|
405
|
-
**Options:**
|
|
406
|
-
|
|
407
|
-
- `provider?: 'openai' | 'anthropic' | 'google'` - AI provider (default: 'openai')
|
|
408
|
-
- `tone?: 'normal' | 'sassy' | 'professional'` - Analysis tone (default: 'normal')
|
|
409
|
-
- `model?: string` - AI model to use (defaults: `gpt-5-mini`, `claude-sonnet-4-5`, or `gemini-2.5-flash`)
|
|
410
|
-
- `includeTranscript?: boolean` - Include video transcript in analysis (default: true)
|
|
411
|
-
- `cleanTranscript?: boolean` - Remove VTT timestamps and formatting from transcript (default: true)
|
|
412
|
-
- `imageSubmissionMode?: 'url' | 'base64'` - How to submit storyboard to AI providers (default: 'url')
|
|
413
|
-
- `imageDownloadOptions?: object` - Options for image download when using base64 mode
|
|
414
|
-
- `timeout?: number` - Request timeout in milliseconds (default: 10000)
|
|
415
|
-
- `retries?: number` - Maximum retry attempts (default: 3)
|
|
416
|
-
- `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
|
|
417
|
-
- `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
|
|
418
|
-
- `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
|
|
419
|
-
- `promptOverrides?: object` - Override specific sections of the prompt for custom use cases
|
|
420
|
-
- `task?: string` - Override the main task instruction
|
|
421
|
-
- `title?: string` - Override title generation guidance
|
|
422
|
-
- `description?: string` - Override description generation guidance
|
|
423
|
-
- `keywords?: string` - Override keywords generation guidance
|
|
424
|
-
- `qualityGuidelines?: string` - Override quality guidelines
|
|
425
|
-
- `muxTokenId?: string` - Mux API token ID
|
|
426
|
-
- `muxTokenSecret?: string` - Mux API token secret
|
|
427
|
-
- `muxSigningKey?: string` - Signing key ID for signed playback policies
|
|
428
|
-
- `muxPrivateKey?: string` - Base64-encoded private key for signed playback policies
|
|
429
|
-
- `openaiApiKey?: string` - OpenAI API key
|
|
430
|
-
- `anthropicApiKey?: string` - Anthropic API key
|
|
431
|
-
- `googleApiKey?: string` - Google Generative AI API key
|
|
432
|
-
|
|
433
|
-
**Returns:**
|
|
208
|
+
**Workflows** are functions that handle complete video AI tasks end-to-end. Each workflow orchestrates the entire process: fetching video data from Mux (transcripts, thumbnails, storyboards), formatting it for AI providers, and returning structured results.
|
|
434
209
|
|
|
435
210
|
```typescript
|
|
436
|
-
|
|
437
|
-
assetId: string;
|
|
438
|
-
title: string; // Short title (max 100 chars)
|
|
439
|
-
description: string; // Detailed description
|
|
440
|
-
tags: string[]; // Relevant keywords
|
|
441
|
-
storyboardUrl: string; // Video storyboard URL
|
|
442
|
-
}
|
|
443
|
-
```
|
|
444
|
-
|
|
445
|
-
### `getModerationScores(assetId, options?)`
|
|
446
|
-
|
|
447
|
-
Analyzes video thumbnails for inappropriate content using OpenAI's Moderation API or Hive’s visual moderation API.
|
|
448
|
-
|
|
449
|
-
**Parameters:**
|
|
450
|
-
|
|
451
|
-
- `assetId` (string) - Mux video asset ID
|
|
452
|
-
- `options` (optional) - Configuration options
|
|
453
|
-
|
|
454
|
-
**Options:**
|
|
455
|
-
|
|
456
|
-
- `provider?: 'openai' | 'hive'` - Moderation provider (default: 'openai')
|
|
457
|
-
- `model?: string` - OpenAI moderation model to use (default: `omni-moderation-latest`)
|
|
458
|
-
-- `thresholds?: { sexual?: number; violence?: number }` - Custom thresholds (default: {sexual: 0.7, violence: 0.8})
-- `thumbnailInterval?: number` - Seconds between thumbnails for long videos (default: 10)
-- `thumbnailWidth?: number` - Thumbnail width in pixels (default: 640)
-- `maxConcurrent?: number` - Maximum concurrent API requests (default: 5)
-- `imageSubmissionMode?: 'url' | 'base64'` - How to submit images to AI providers (default: 'url')
-- `imageDownloadOptions?: object` - Options for image download when using base64 mode
-- `timeout?: number` - Request timeout in milliseconds (default: 10000)
-- `retries?: number` - Maximum retry attempts (default: 3)
-- `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
-- `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
-- `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
-- `muxTokenId/muxTokenSecret?: string` - Mux credentials
-- `openaiApiKey?/hiveApiKey?` - Provider credentials
-
-**Returns:**
+import { getSummaryAndTags } from "@mux/ai/workflows";
 
-
-{
-  assetId: string;
-  thumbnailScores: Array<{ // Individual thumbnail results
-    url: string;
-    sexual: number; // 0-1 score
-    violence: number; // 0-1 score
-    error: boolean;
-  }>;
-  maxScores: { // Highest scores across all thumbnails
-    sexual: number;
-    violence: number;
-  };
-  exceedsThreshold: boolean; // true if content should be flagged
-  thresholds: { // Threshold values used
-    sexual: number;
-    violence: number;
-  };
-}
+const result = await getSummaryAndTags("asset-id", { provider: "openai" });
 ```
 
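The removed moderation reference above documents per-thumbnail scores, `maxScores`, and an `exceedsThreshold` flag. A minimal sketch of that aggregation, assuming only the documented return shape and default thresholds (the reducer itself is illustrative, not the library's code):

```typescript
// Shapes mirror the documented moderation return type; the aggregation
// logic below is an illustrative sketch, not @mux/ai's implementation.
interface ThumbnailScore {
  url: string;
  sexual: number; // 0-1 score
  violence: number; // 0-1 score
  error: boolean; // true if this thumbnail failed to analyze
}

interface Thresholds {
  sexual: number;
  violence: number;
}

function aggregateScores(
  scores: ThumbnailScore[],
  thresholds: Thresholds = { sexual: 0.7, violence: 0.8 }, // documented defaults
) {
  // Take the highest score per category, skipping failed thumbnails.
  const ok = scores.filter(s => !s.error);
  const maxScores = {
    sexual: Math.max(0, ...ok.map(s => s.sexual)),
    violence: Math.max(0, ...ok.map(s => s.violence)),
  };
  return {
    maxScores,
    exceedsThreshold:
      maxScores.sexual >= thresholds.sexual
      || maxScores.violence >= thresholds.violence,
    thresholds,
  };
}

const report = aggregateScores([
  { url: "thumb-1.png", sexual: 0.1, violence: 0.2, error: false },
  { url: "thumb-2.png", sexual: 0.05, violence: 0.85, error: false },
]);
console.log(report.exceedsThreshold); // true (violence 0.85 >= 0.8)
```

A single frame over either threshold flags the whole asset, which matches the documented `exceedsThreshold` semantics ("true if content should be flagged").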
-
-
-Analyzes video frames to detect burned-in captions (hardcoded subtitles) that are permanently embedded in the video image.
-
-**Parameters:**
-
-- `assetId` (string) - Mux video asset ID
-- `options` (optional) - Configuration options
+Use workflows when you need battle-tested solutions for common tasks like summarization, content moderation, chapter generation, or translation.
 
-
+## Primitives
 
--
-- `model?: string` - AI model to use (defaults: `gpt-5-mini`, `claude-sonnet-4-5`, or `gemini-2.5-flash`)
-- `imageSubmissionMode?: 'url' | 'base64'` - How to submit storyboard to AI providers (default: 'url')
-- `imageDownloadOptions?: object` - Options for image download when using base64 mode
-- `timeout?: number` - Request timeout in milliseconds (default: 10000)
-- `retries?: number` - Maximum retry attempts (default: 3)
-- `retryDelay?: number` - Base delay between retries in milliseconds (default: 1000)
-- `maxRetryDelay?: number` - Maximum delay between retries in milliseconds (default: 10000)
-- `exponentialBackoff?: boolean` - Whether to use exponential backoff (default: true)
-- `muxTokenId?: string` - Mux API token ID
-- `muxTokenSecret?: string` - Mux API token secret
-- `openaiApiKey?: string` - OpenAI API key
-- `anthropicApiKey?: string` - Anthropic API key
-- `googleApiKey?: string` - Google Generative AI API key
-
-**Returns:**
+**Primitives** are low-level building blocks that give you direct access to Mux video data and utilities. They provide functions for fetching transcripts, storyboards, thumbnails, and processing text—perfect for building custom workflows.
 
 ```typescript
-{
-  assetId: string;
-  hasBurnedInCaptions: boolean; // Whether burned-in captions were detected
-  confidence: number; // Confidence score (0.0-1.0)
-  detectedLanguage: string | null; // Language of detected captions, or null
-  storyboardUrl: string; // URL to analyzed storyboard
-}
-```
-
-**Detection Logic:**
-
-- Analyzes video storyboard frames to identify text overlays
-- Distinguishes between actual captions and marketing/end-card text
-- Text appearing only in final 1-2 frames is classified as marketing copy
-- Caption text must appear across multiple frames throughout the timeline
-- Both providers use optimized prompts to minimize false positives
-
-### `translateCaptions(assetId, fromLanguageCode, toLanguageCode, options?)`
-
-Translates existing captions from one language to another and optionally adds them as a new track to the Mux asset.
-
-**Parameters:**
-
-- `assetId` (string) - Mux video asset ID
-- `fromLanguageCode` (string) - Source language code (e.g., 'en', 'es', 'fr')
-- `toLanguageCode` (string) - Target language code (e.g., 'es', 'fr', 'de')
-- `options` (optional) - Configuration options
-
-**Options:**
-
-- `provider: 'openai' | 'anthropic' | 'google'` - AI provider (required)
-- `model?: string` - Model to use (defaults to the provider's chat-vision model if omitted)
-- `uploadToMux?: boolean` - Whether to upload translated track to Mux (default: true)
-- `s3Endpoint?: string` - S3-compatible storage endpoint
-- `s3Region?: string` - S3 region (default: 'auto')
-- `s3Bucket?: string` - S3 bucket name
-- `s3AccessKeyId?: string` - S3 access key ID
-- `s3SecretAccessKey?: string` - S3 secret access key
-- `muxTokenId/muxTokenSecret?: string` - Mux credentials
-- `openaiApiKey?/anthropicApiKey?/googleApiKey?` - Provider credentials
-
-**Returns:**
+import { fetchTranscriptForAsset, getStoryboardUrl } from "@mux/ai/primitives";
 
-
-
-  assetId: string;
-  sourceLanguageCode: string;
-  targetLanguageCode: string;
-  originalVtt: string; // Original VTT content
-  translatedVtt: string; // Translated VTT content
-  uploadedTrackId?: string; // Mux track ID (if uploaded)
-  presignedUrl?: string; // S3 presigned URL (expires in 1 hour)
-}
+const transcript = await fetchTranscriptForAsset("asset-id", "en");
+const storyboard = getStoryboardUrl("playback-id", { width: 640 });
 ```
 
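The added primitives snippet fetches a transcript and a storyboard URL; a custom workflow usually stitches those into a prompt for whichever model you call next. A hypothetical helper sketching that glue (the `buildPrompt` name and prompt wording are ours, not part of `@mux/ai`):

```typescript
// Hypothetical glue for a custom workflow: combine a transcript and a
// storyboard URL into one prompt. Only the input shapes come from the
// README; this helper is not part of @mux/ai.
interface PromptParts {
  transcript: string;
  storyboardUrl: string;
  maxTranscriptChars?: number;
}

function buildPrompt({
  transcript,
  storyboardUrl,
  maxTranscriptChars = 8000,
}: PromptParts): string {
  // Clip long transcripts so the prompt stays inside model context limits.
  const clipped = transcript.length > maxTranscriptChars
    ? `${transcript.slice(0, maxTranscriptChars)}…`
    : transcript;
  return [
    "You are analyzing a video.",
    `Storyboard (tiled frames): ${storyboardUrl}`,
    "Transcript:",
    clipped,
  ].join("\n");
}

const prompt = buildPrompt({
  transcript: "Welcome to the show…",
  storyboardUrl: "https://image.mux.com/playback-id/storyboard.png?width=640",
});
console.log(prompt.split("\n")[0]); // "You are analyzing a video."
```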
-
-All ISO 639-1 language codes are automatically supported using `Intl.DisplayNames`. Examples: Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Japanese (ja), Korean (ko), Chinese (zh), Russian (ru), Arabic (ar), Hindi (hi), Thai (th), Swahili (sw), and many more.
-
-### `generateChapters(assetId, languageCode, options?)`
-
-Generates AI-powered chapter markers by analyzing video captions. Creates logical chapter breaks based on topic changes and content transitions.
-
-**Parameters:**
-
-- `assetId` (string) - Mux video asset ID
-- `languageCode` (string) - Language code for captions (e.g., 'en', 'es', 'fr')
-- `options` (optional) - Configuration options
-
-**Options:**
+Use primitives when you need complete control over your AI prompts or want to build custom workflows not covered by the pre-built options.
 
-
-- `model?: string` - AI model to use (defaults: `gpt-5-mini`, `claude-sonnet-4-5`, or `gemini-2.5-flash`)
-- `muxTokenId?: string` - Mux API token ID
-- `muxTokenSecret?: string` - Mux API token secret
-- `openaiApiKey?: string` - OpenAI API key
-- `anthropicApiKey?: string` - Anthropic API key
-- `googleApiKey?: string` - Google Generative AI API key
-
-**Returns:**
+## Package Structure
 
 ```typescript
-
-
-  languageCode: string;
-  chapters: Array<{
-    startTime: number; // Chapter start time in seconds
-    title: string; // Descriptive chapter title
-  }>;
-}
-```
-
-**Requirements:**
-
-- Asset must have caption track in the specified language
-- Caption track must be in 'ready' status
-- Uses existing auto-generated or uploaded captions
+// Import workflows
+import { generateChapters } from "@mux/ai/workflows";
 
-
+// Import primitives
+import { fetchTranscriptForAsset } from "@mux/ai/primitives";
 
-
-
-player.addChapters([
-  { startTime: 0, title: "Introduction and Setup" },
-  { startTime: 45, title: "Main Content Discussion" },
-  { startTime: 120, title: "Conclusion" }
-]);
+// Or import everything
+import { workflows, primitives } from "@mux/ai";
 ```
 
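The removed `generateChapters` reference shows chapters as `{ startTime, title }` pairs. One common next step is serializing them as a WebVTT chapters track; a sketch under that assumption (the `toChapterVtt` helper is ours, only the chapter shape comes from the README):

```typescript
// Serialize { startTime, title } chapters (the documented
// generateChapters shape) as a WebVTT chapters track.
// This helper is illustrative and not part of @mux/ai.
interface Chapter {
  startTime: number; // seconds
  title: string;
}

function toTimestamp(seconds: number): string {
  const h = Math.floor(seconds / 3600);
  const m = Math.floor((seconds % 3600) / 60);
  const s = (seconds % 60).toFixed(3).padStart(6, "0");
  return `${String(h).padStart(2, "0")}:${String(m).padStart(2, "0")}:${s}`;
}

function toChapterVtt(chapters: Chapter[], videoDuration: number): string {
  const cues = chapters.map((c, i) => {
    // Each cue runs until the next chapter starts (or the video ends).
    const end = i + 1 < chapters.length ? chapters[i + 1].startTime : videoDuration;
    return `${toTimestamp(c.startTime)} --> ${toTimestamp(end)}\n${c.title}`;
  });
  return ["WEBVTT", ...cues].join("\n\n");
}

const vtt = toChapterVtt(
  [
    { startTime: 0, title: "Introduction and Setup" },
    { startTime: 45, title: "Main Content Discussion" },
    { startTime: 120, title: "Conclusion" },
  ],
  180,
);
console.log(vtt.includes("00:00:45.000 --> 00:02:00.000")); // true
```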
-
+# Credentials
 
-
+You'll need to set up credentials for Mux as well as any AI provider you want to use for a particular workflow. In addition, some workflows will need other cloud-hosted access (e.g. cloud storage via AWS S3).
 
-
+## Credentials - Mux
 
-
-- `toLanguageCode` (string) - Target language code (e.g., 'es', 'fr', 'de')
-- `options` (optional) - Configuration options
+### Access Token (required)
 
-
+All workflows require a Mux API access token to interact with your video assets. If you're already logged into the dashboard, you can [create a new access token here](https://dashboard.mux.com/settings/access-tokens).
 
-
--
--
-- `s3Endpoint?: string` - S3-compatible storage endpoint
-- `s3Region?: string` - S3 region (default: 'auto')
-- `s3Bucket?: string` - S3 bucket name
-- `s3AccessKeyId?: string` - S3 access key ID
-- `s3SecretAccessKey?: string` - S3 secret access key
-- `elevenLabsApiKey?: string` - ElevenLabs API key
-- `muxTokenId/muxTokenSecret?: string` - API credentials
+**Required Permissions:**
+- **Mux Video**: Read + Write access
+- **Mux Data**: Read access
 
-
+These permissions cover all current workflows. You can set these when creating your token in the dashboard.
 
-
-interface TranslateAudioResult {
-  assetId: string;
-  targetLanguageCode: string;
-  dubbingId: string; // ElevenLabs dubbing job ID
-  uploadedTrackId?: string; // Mux audio track ID (if uploaded)
-  presignedUrl?: string; // S3 presigned URL (expires in 1 hour)
-}
-```
-
-**Requirements:**
-
-- Asset must have an `audio.m4a` static rendition
-- ElevenLabs API key with Creator plan or higher
-- S3-compatible storage for Mux ingestion
+> **💡 Tip:** For security reasons, consider creating a dedicated access token specifically for your AI workflows rather than reusing existing tokens.
 
-
-ElevenLabs supports 32+ languages with automatic language name detection via `Intl.DisplayNames`. Supported languages include English, Spanish, French, German, Italian, Portuguese, Polish, Japanese, Korean, Chinese, Russian, Arabic, Hindi, Thai, and many more. Track names are automatically generated (e.g., "Polish (auto-dubbed)").
+### Signing Key (conditionally required)
 
-
+If your Mux assets use [signed playback URLs](https://docs.mux.com/guides/secure-video-playback) for security, you'll need to provide signing credentials so `@mux/ai` can access the video data.
 
-
+**When needed:** Only if your assets have signed playback policies enabled and no public playback ID.
 
-**
-
-
-
+**How to get:**
+1. Go to [Settings > Signing Keys](https://dashboard.mux.com/settings/signing-keys) in your Mux dashboard
+2. Create a new signing key or use an existing one
+3. Save both the **Signing Key ID** and the **Base64-encoded Private Key**
 
-
-
-
-
-    task: "Generate SEO-optimized metadata that maximizes discoverability.",
-    title: "Create a search-optimized title (50-60 chars) with primary keyword front-loaded.",
-    keywords: "Focus on high search volume terms and long-tail keywords.",
-  },
-});
-
-// Social media optimized for engagement
-const socialResult = await getSummaryAndTags(assetId, {
-  promptOverrides: {
-    title: "Create a scroll-stopping headline using emotional triggers or curiosity gaps.",
-    description: "Write shareable copy that creates FOMO and works without watching the video.",
-    keywords: "Generate hashtag-ready keywords for trending and niche community tags.",
-  },
-});
-
-// Technical/production analysis
-const technicalResult = await getSummaryAndTags(assetId, {
-  tone: "professional",
-  promptOverrides: {
-    task: "Analyze cinematography, lighting, and production techniques.",
-    title: "Describe the production style or filmmaking technique.",
-    description: "Provide a technical breakdown of camera work, lighting, and editing.",
-    keywords: "Use industry-standard production terminology.",
-  },
-});
+**Configuration:**
+```bash
+MUX_SIGNING_KEY=your_signing_key_id
+MUX_PRIVATE_KEY=your_base64_encoded_private_key
 ```
 
-
-| Section | Description |
-|---------|-------------|
-| `task` | Main instruction for what to analyze |
-| `title` | Guidance for generating the title |
-| `description` | Guidance for generating the description |
-| `keywords` | Guidance for generating keywords/tags |
-| `qualityGuidelines` | General quality instructions |
+## Credentials - AI Providers
 
-
+Different workflows support various AI providers. You only need to configure API keys for the providers you plan to use.
 
-
+### OpenAI
 
-
+**Used by:** `getSummaryAndTags`, `getModerationScores`, `hasBurnedInCaptions`, `generateChapters`, `generateVideoEmbeddings`, `translateCaptions`
 
-**
-Create a `.env` file in the project root with your API credentials:
+**Get your API key:** [OpenAI API Keys](https://platform.openai.com/api-keys)
 
 ```bash
-
-MUX_TOKEN_SECRET=your_token_secret
-OPENAI_API_KEY=your_openai_key
-ANTHROPIC_API_KEY=your_anthropic_key
-GOOGLE_GENERATIVE_AI_API_KEY=your_google_key
-HIVE_API_KEY=your_hive_key # required for Hive moderation runs
+OPENAI_API_KEY=your_openai_api_key
 ```
 
-
+### Anthropic
 
-
+**Used by:** `getSummaryAndTags`, `hasBurnedInCaptions`, `generateChapters`, `translateCaptions`
 
-
+**Get your API key:** [Anthropic Console](https://console.anthropic.com/)
 
 ```bash
-
-npm run example:chapters <asset-id> [language-code] [provider]
-npm run example:chapters:compare <asset-id> [language-code]
-
-# Burned-in Caption Detection
-npm run example:burned-in <asset-id> [provider]
-npm run example:burned-in:compare <asset-id>
-
-# Summarization
-npm run example:summarization <asset-id> [provider]
-npm run example:summarization:compare <asset-id>
-
-# Moderation
-npm run example:moderation <asset-id> [provider]
-npm run example:moderation:compare <asset-id>
-
-# Caption Translation
-npm run example:translate-captions <asset-id> [from-lang] [to-lang] [provider]
-
-# Audio Translation (Dubbing)
-npm run example:translate-audio <asset-id> [to-lang]
-
-# Signed Playback (for assets with signed playback policies)
-npm run example:signed-playback <signed-asset-id>
-npm run example:signed-playback:summarize <signed-asset-id> [provider]
+ANTHROPIC_API_KEY=your_anthropic_api_key
 ```
 
-
-
-```bash
-# Generate chapters with OpenAI
-npm run example:chapters abc123 en openai
-
-# Detect burned-in captions with Anthropic
-npm run example:burned-in abc123 anthropic
-
-# Compare OpenAI vs Anthropic chapter generation
-npm run example:chapters:compare abc123 en
+### Google Generative AI
 
-
-npm run example:moderation abc123 hive
+**Used by:** `getSummaryAndTags`, `hasBurnedInCaptions`, `generateChapters`, `generateVideoEmbeddings`, `translateCaptions`
 
-
-npm run example:translate-captions abc123 en es anthropic
-
-# Summarize a video with Claude Sonnet 4.5 (default)
-npm run example:summarization abc123 anthropic
-
-# Create AI-dubbed audio in French
-npm run example:translate-audio abc123 fr
-```
-
-### Summarization Examples
-
-- **Basic Usage**: Default prompt with different tones
-- **Custom Prompts**: Override prompt sections with presets (SEO, social, technical, ecommerce)
-- **Tone Variations**: Compare analysis styles
+**Get your API key:** [Google AI Studio](https://aistudio.google.com/app/apikey)
 
 ```bash
-
-npm install
-npm run basic <your-asset-id> [provider]
-npm run tones <your-asset-id>
-
-# Custom prompts with presets
-npm run custom <your-asset-id> --preset seo
-npm run custom <your-asset-id> --preset social
-npm run custom <your-asset-id> --preset technical
-npm run custom <your-asset-id> --preset ecommerce
-
-# Or provide individual overrides
-npm run custom <your-asset-id> --task "Focus on product features"
+GOOGLE_GENERATIVE_AI_API_KEY=your_google_api_key
 ```
 
-###
-
-- **Basic Moderation**: Analyze content with default thresholds
-- **Custom Thresholds**: Compare strict/default/permissive settings
-- **Provider Comparison**: Compare OpenAI's dedicated Moderation API with Hive's visual moderation API
-
-```bash
-cd examples/moderation
-npm install
-npm run basic <your-asset-id> [provider] # provider: openai | hive
-npm run thresholds <your-asset-id>
-npm run compare <your-asset-id>
-```
+### ElevenLabs
 
-
+**Used by:** `translateAudio` (audio dubbing)
 
-
+**Get your API key:** [ElevenLabs API Keys](https://elevenlabs.io/app/settings/api-keys)
 
-
-- **Provider Comparison**: Compare OpenAI vs Anthropic vs Google detection accuracy
+**Note:** Requires a Creator plan or higher for dubbing features.
 
 ```bash
-
-npm install
-npm run burned-in:basic <your-asset-id> [provider]
-npm run compare <your-asset-id>
+ELEVENLABS_API_KEY=your_elevenlabs_api_key
 ```
 
-###
-
-- **Basic Chapters**: Generate chapters with different AI providers
-- **Provider Comparison**: Compare OpenAI vs Anthropic vs Google chapter generation
-
-```bash
-cd examples/chapters
-npm install
-npm run chapters:basic <your-asset-id> [language-code] [provider]
-npm run compare <your-asset-id> [language-code]
-```
+### Hive
 
-
+**Used by:** `getModerationScores` (alternative to OpenAI moderation)
 
-
-- **Translation Only**: Translate without uploading to Mux
+**Get your API key:** [Hive Console](https://thehive.ai/)
 
 ```bash
-
-npm install
-npm run basic <your-asset-id> en es [provider]
-npm run translation-only <your-asset-id> en fr [provider]
+HIVE_API_KEY=your_hive_api_key
 ```
 
-
+## Credentials - Cloud Infrastructure
 
-
-2. Translates VTT content using your selected provider (default: Claude Sonnet 4.5)
-3. Uploads translated VTT to S3-compatible storage
-4. Generates presigned URL (1-hour expiry)
-5. Adds new subtitle track to Mux asset
-6. Track name: "{Language} (auto-translated)"
+### AWS S3 (or S3-compatible storage)
 
-
+**Required for:** `translateCaptions`, `translateAudio` (only if `uploadToMux` is true, which is the default)
 
-
-- **Dubbing Only**: Create dubbed audio without uploading to Mux
+Translation workflows need temporary storage to upload translated files before attaching them to your Mux assets. Any S3-compatible storage service works (AWS S3, Cloudflare R2, DigitalOcean Spaces, etc.).
 
+**AWS S3 Setup:**
+1. [Create an S3 bucket](https://s3.console.aws.amazon.com/s3/home)
+2. [Create an IAM user](https://console.aws.amazon.com/iam/) with programmatic access
+3. Attach a policy with `s3:PutObject`, `s3:GetObject`, and `s3:PutObjectAcl` permissions for your bucket
+
+**Configuration:**
 ```bash
-
-
-
-
+S3_ENDPOINT=https://s3.amazonaws.com # Or your S3-compatible endpoint
+S3_REGION=us-east-1 # Your bucket region
+S3_BUCKET=your-bucket-name
+S3_ACCESS_KEY_ID=your-access-key
+S3_SECRET_ACCESS_KEY=your-secret-key
 ```
 
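The configuration block above lists five `S3_*` variables. A hypothetical loader that collects and validates them (variable names match the README's configuration block; the validation and the `'auto'` region fallback, which mirrors the documented `s3Region` default, are our own glue, not part of `@mux/ai`):

```typescript
// Hypothetical loader for the S3 settings shown above. Variable names
// match the README; the validation logic is illustrative only.
interface S3Config {
  endpoint: string;
  region: string;
  bucket: string;
  accessKeyId: string;
  secretAccessKey: string;
}

function loadS3Config(env: Record<string, string | undefined>): S3Config {
  const required = {
    endpoint: env.S3_ENDPOINT,
    region: env.S3_REGION ?? "auto", // workflows document 'auto' as the default
    bucket: env.S3_BUCKET,
    accessKeyId: env.S3_ACCESS_KEY_ID,
    secretAccessKey: env.S3_SECRET_ACCESS_KEY,
  };
  for (const [key, value] of Object.entries(required)) {
    if (!value) throw new Error(`Missing S3 setting: ${key}`);
  }
  return required as S3Config;
}

const config = loadS3Config({
  S3_ENDPOINT: "https://s3.amazonaws.com",
  S3_BUCKET: "my-bucket",
  S3_ACCESS_KEY_ID: "key",
  S3_SECRET_ACCESS_KEY: "secret",
});
console.log(config.region); // "auto"
```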
-**
-
-1. Checks asset has audio.m4a static rendition
-2. Downloads default audio track from Mux
-3. Creates ElevenLabs dubbing job with automatic language detection
-4. Polls for completion (up to 30 minutes)
-5. Downloads dubbed audio file
-6. Uploads to S3-compatible storage
-7. Generates presigned URL (1-hour expiry)
-8. Adds new audio track to Mux asset
-9. Track name: "{Language} (auto-dubbed)"
-
-### Signed Playback Examples
-
-- **URL Generation Test**: Verify signed URLs work for storyboards, thumbnails, and transcripts
-- **Signed Summarization**: Full summarization workflow with a signed asset
-
+**Cloudflare R2 Example:**
 ```bash
-
-
-
-
-
-
-# Summarize a signed asset
-npm run summarize <signed-asset-id> [provider]
+S3_ENDPOINT=https://your-account-id.r2.cloudflarestorage.com
+S3_REGION=auto
+S3_BUCKET=your-bucket-name
+S3_ACCESS_KEY_ID=your-r2-access-key
+S3_SECRET_ACCESS_KEY=your-r2-secret-key
 ```
 
-
-
-1. Create a Mux asset with `playback_policy: "signed"`
-2. Create a signing key in Mux Dashboard → Settings → Signing Keys
-3. Set `MUX_SIGNING_KEY` and `MUX_PRIVATE_KEY` environment variables
-
-**How Signed Playback Works:**
-When you provide signing credentials, the library automatically:
+# Documentation
 
-
-- Generates JWT tokens with RS256 algorithm
-- Uses the correct `aud` claim for each asset type (video, thumbnail, storyboard)
-- Appends tokens to URLs as query parameters
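The removed signed-playback notes above mention RS256 JWTs with a per-asset-type `aud` claim. A sketch of just the claims object, assuming Mux's documented audience values (`v` for video, `t` for thumbnail, `s` for storyboard); actual signing would go through a JWT library using the base64-decoded private key:

```typescript
// Build the JWT claims the removed notes describe. Audience values
// "v"/"t"/"s" follow Mux's signed-URL documentation; signing itself
// (RS256, keyed by the signing key ID) is left to a JWT library.
type AssetType = "video" | "thumbnail" | "storyboard";

const AUD: Record<AssetType, string> = {
  video: "v",
  thumbnail: "t",
  storyboard: "s",
};

function buildClaims(playbackId: string, type: AssetType, ttlSeconds = 600) {
  return {
    sub: playbackId, // playback ID this token is valid for
    aud: AUD[type], // audience differs per asset type
    exp: Math.floor(Date.now() / 1000) + ttlSeconds, // expiry (unix seconds)
  };
}

console.log(buildClaims("playback-id", "storyboard").aud); // "s"
```

The signed token is then appended to the URL as a query parameter, as the last removed bullet notes.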
+## Full Documentation
 
-
+- **[Workflows Guide](./docs/WORKFLOWS.md)** - Detailed guide to each pre-built workflow with examples
+- **[API Reference](./docs/API.md)** - Complete API documentation for all functions, parameters, and return types
+- **[Primitives Guide](./docs/PRIMITIVES.md)** - Low-level building blocks for custom workflows
+- **[Examples](./docs/EXAMPLES.md)** - Running examples from the repository
 
-
+## Additional Resources
 
-- **
-- **
-- **
-- **
-- **Backblaze B2** - Cost-effective storage
-- **Wasabi** - Hot cloud storage
+- **[Mux Video API Docs](https://docs.mux.com/guides/video)** - Learn about Mux Video features
+- **[Auto-generated Captions](https://www.mux.com/docs/guides/add-autogenerated-captions-and-use-transcripts)** - Enable transcripts for your assets
+- **[GitHub Repository](https://github.com/muxinc/ai)** - Source code, issues, and contributions
+- **[npm Package](https://www.npmjs.com/package/@mux/ai)** - Package page and version history
 
-
-Mux requires a publicly accessible URL to ingest subtitle tracks. The translation workflow:
+# Contributing
 
-
-2. Generates a presigned URL for secure access
-3. Mux fetches the file using the presigned URL
-4. File remains in your storage for future use
+We welcome contributions! Whether you're fixing bugs, adding features, or improving documentation, we'd love your help.
 
-
-
-### Setup
-
-```bash
-# Clone repo and install dependencies
-git clone https://github.com/muxinc/mux-ai.git
-cd mux-ai
-npm install # Automatically sets up git hooks via Husky
-```
+Please see our **[Contributing Guide](./CONTRIBUTING.md)** for details on:
 
-
+- Setting up your development environment
+- Running examples and tests
+- Code style and conventions
+- Submitting pull requests
+- Reporting issues
 
-
-
-- **ESLint** with `@antfu/eslint-config` for linting and formatting
-- **TypeScript** strict mode for type safety
-- **Pre-commit hooks** that run automatically before each commit
-
-```bash
-# Check for linting issues
-npm run lint
-
-# Auto-fix linting issues
-npm run lint:fix
-
-# Run type checking
-npm run typecheck
-
-# Run tests
-npm test
-```
+For questions or discussions, feel free to [open an issue](https://github.com/muxinc/ai/issues).
 
 ## License
 