@ainative/ai-kit-video 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2024-2026 AINative Studio
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,325 @@
+ # @ainative/ai-kit-video
+
+ Video processing utilities for AI Kit, including screen recording, camera recording, audio processing, and AI-powered transcription using OpenAI's Whisper API.
+
+ [![npm version](https://img.shields.io/npm/v/@ainative/ai-kit-video.svg)](https://www.npmjs.com/package/@ainative/ai-kit-video)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5.3+-blue.svg)](https://www.typescriptlang.org/)
+
+ ## Features
+
+ ### Recording
+ - **Screen Recording** - Capture screen with customizable settings (quality, frame rate, audio)
+ - **Camera Recording** - Record from webcam with MediaStream API
+ - **Audio Recording** - Record audio with noise cancellation and processing
+ - **Picture-in-Picture** - Composite camera feed over screen recording
+
+ ### Processing
+ - **Audio Transcription** - AI-powered transcription using OpenAI Whisper
+ - **Text Formatting** - Clean and format transcribed text
+ - **Highlight Detection** - Detect significant moments in video
+ - **Noise Processing** - Reduce background noise in audio
+
+ ### Observability
+ - **Instrumented Recording** - Built-in performance metrics and event logging
+ - **Correlation IDs** - Track recordings across your application
+ - **Performance Metrics** - Monitor bitrate, file size, and duration
+
+ ## Installation
+
+ ```bash
+ npm install @ainative/ai-kit-video
+ ```
+
+ ## Usage
+
+ ### Screen Recording
+
+ ```typescript
+ import { ScreenRecorder } from '@ainative/ai-kit-video/recording'
+
+ const recorder = new ScreenRecorder({
+   mimeType: 'video/webm;codecs=vp9',
+   videoBitsPerSecond: 2500000, // 2.5 Mbps
+   audioBitsPerSecond: 128000   // 128 kbps
+ })
+
+ // Request the display stream, then start recording
+ const stream = await recorder.getStream()
+ await recorder.start()
+
+ // Stop and get recording
+ await recorder.stop()
+ const blob = recorder.getRecordingBlob()
+ const url = recorder.getRecordingURL()
+
+ // Download the recording
+ const link = document.createElement('a')
+ link.href = url
+ link.download = 'screen-recording.webm'
+ link.click()
+
+ // Clean up: revoke the Blob URL once the download has been triggered
+ recorder.revokeURL(url)
+ ```
+
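The `mimeType` above is not accepted by every browser, so it may help to choose one at runtime. The following is a minimal sketch, not part of this package: `pickMimeType` and the candidate order are hypothetical, and the support check is injected so the helper can also run outside a browser. In a browser the predicate would typically wrap the standard `MediaRecorder.isTypeSupported()`.

```typescript
// Hypothetical helper (not part of @ainative/ai-kit-video):
// return the first candidate that the supplied predicate accepts.
function pickMimeType(
  candidates: readonly string[],
  isSupported: (mimeType: string) => boolean
): string | undefined {
  for (const mimeType of candidates) {
    if (isSupported(mimeType)) return mimeType
  }
  return undefined
}

// The preference order is an assumption; adjust it for your target browsers.
const candidates = [
  'video/webm;codecs=vp9',
  'video/webm;codecs=vp8',
  'video/webm',
  'video/mp4'
]

// In a browser, pass the real capability check:
//   pickMimeType(candidates, (t) => MediaRecorder.isTypeSupported(t))
// Here a stand-in predicate simulates a browser without VP9 support.
const chosen = pickMimeType(candidates, (t) => t.indexOf('vp9') === -1)
console.log(chosen) // video/webm;codecs=vp8
```

If no candidate is supported the helper returns `undefined`, which a caller could treat as "let the browser choose its default".
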
+ ### Camera Recording
+
+ ```typescript
+ import { CameraRecorder } from '@ainative/ai-kit-video/recording'
+
+ const recorder = new CameraRecorder({
+   video: {
+     width: { ideal: 1920 },
+     height: { ideal: 1080 },
+     frameRate: { ideal: 30 }
+   },
+   audio: true
+ })
+
+ const stream = await recorder.getStream()
+ await recorder.start()
+
+ // Display preview (querySelector returns null if the page has no <video>)
+ const videoElement = document.querySelector('video')
+ if (videoElement) {
+   videoElement.srcObject = stream
+ }
+
+ // Stop and save
+ await recorder.stop()
+ const blob = recorder.getRecordingBlob()
+ ```
+
+ ### Audio Transcription with Whisper
+
+ ```typescript
+ import { transcribeAudio } from '@ainative/ai-kit-video/processing'
+
+ // Transcribe audio file (`audioFile` is the audio to transcribe, obtained elsewhere)
+ const result = await transcribeAudio(audioFile, {
+   apiKey: process.env.OPENAI_API_KEY,
+   language: 'en',
+   response_format: 'verbose_json', // needed for timestamps
+   timestamp_granularities: ['segment', 'word']
+ })
+
+ console.log(result.text)
+ console.log(result.segments) // Segment-level timestamps
+ console.log(result.words) // Word-level timestamps
+ ```
+
+ ### Text Formatting
+
+ ```typescript
+ import { TextFormatter } from '@ainative/ai-kit-video/processing'
+
+ const formatter = new TextFormatter({
+   removeFillerWords: true,
+   correctPunctuation: true,
+   capitalizeFirstWord: true
+ })
+
+ const formatted = formatter.format(transcribedText)
+ console.log(formatted)
+ ```
+
+ ### Noise Cancellation
+
+ ```typescript
+ import { NoiseProcessor } from '@ainative/ai-kit-video/recording'
+
+ const processor = new NoiseProcessor({
+   noiseReduction: 0.8,
+   autoGain: true
+ })
+
+ // Process audio stream
+ const cleanStream = await processor.process(audioStream)
+ ```
+
+ ### Instrumented Recording with Observability
+
+ ```typescript
+ import { InstrumentedScreenRecorder } from '@ainative/ai-kit-video/recording'
+
+ const recorder = new InstrumentedScreenRecorder({
+   correlationId: 'user-session-123',
+   logger: customLogger, // your application's logger, defined elsewhere
+   enablePerformanceMetrics: true
+ })
+
+ // All events are logged with correlation ID
+ recorder.on('recording_started', (event) => {
+   console.log(`Recording started: ${event.recordingId}`)
+   console.log(`Correlation: ${event.correlationId}`)
+ })
+
+ recorder.on('performance_metrics', (metrics) => {
+   console.log(`Bitrate: ${metrics.avgBitrate}`)
+   console.log(`File size: ${metrics.fileSize}`)
+ })
+
+ await recorder.start()
+ ```
+
+ ### Picture-in-Picture Composite
+
+ ```typescript
+ import { PiPCompositor } from '@ainative/ai-kit-video/recording'
+
+ const compositor = new PiPCompositor({
+   position: 'bottom-right',
+   size: { width: 320, height: 180 },
+   borderRadius: 8,
+   border: '2px solid white'
+ })
+
+ // Combine screen and camera
+ // (screenRecorder and cameraRecorder are ScreenRecorder and CameraRecorder instances created earlier)
+ const screenStream = await screenRecorder.getStream()
+ const cameraStream = await cameraRecorder.getStream()
+
+ const composite = compositor.composite(screenStream, cameraStream)
+ ```
+
+ ## API Reference
+
+ ### Recording Classes
+
+ #### `ScreenRecorder`
+ - `getStream(): Promise<MediaStream>` - Get display media stream
+ - `start(): Promise<void>` - Start recording
+ - `stop(): Promise<void>` - Stop recording
+ - `pause(): void` - Pause recording
+ - `resume(): void` - Resume recording
+ - `getRecordingBlob(): Blob` - Get recorded video as Blob
+ - `getRecordingURL(): string` - Get Blob URL for download
+ - `revokeURL(url: string): void` - Revoke Blob URL to free memory
+
+ #### `CameraRecorder`
+ - Same API as ScreenRecorder
+ - Automatically cleans up MediaStream on page unload
+ - Prevents memory leaks with proper resource disposal
+
+ #### `AudioRecorder`
+ - Record audio only with customizable settings
+ - Built-in noise reduction
+ - Auto-gain control
+
+ #### `InstrumentedScreenRecorder`
+ - Extends ScreenRecorder with observability
+ - Emits performance metrics and lifecycle events
+ - Supports correlation IDs for distributed tracing
+
+ ### Processing Functions
+
+ #### `transcribeAudio(file, options)`
+ Transcribe audio using the OpenAI Whisper API.
+
+ **Options:**
+ - `apiKey` (required) - OpenAI API key
+ - `language` - ISO-639-1 language code (e.g., 'en', 'es')
+ - `prompt` - Guide the model's style or terminology
+ - `response_format` - 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt'
+ - `temperature` - Sampling temperature (0-1)
+ - `timestamp_granularities` - ['word', 'segment']; the Whisper API honors this only with `response_format: 'verbose_json'`
+
+ **Returns:** `TranscriptionResult`
+ - `text` - Full transcription
+ - `language` - Detected language (verbose_json only)
+ - `duration` - Audio duration in seconds (verbose_json only)
+ - `segments` - Timestamped segments
+ - `words` - Word-level timestamps
+
+ #### `formatSegments(segments)`
+ Format transcription segments into readable text with timestamps.
+
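To illustrate what "readable text with timestamps" could look like, here is a rough sketch. It is not the package's `formatSegments` implementation: the `Segment` shape (start and end in seconds plus text), the `[MM:SS - MM:SS]` layout, and the names `toTimestamp` and `formatSegmentsSketch` are all assumptions made for the example.

```typescript
// Assumed segment shape: start/end in seconds plus the spoken text.
interface Segment {
  start: number
  end: number
  text: string
}

// Render whole seconds as MM:SS (minutes are not capped at 59).
function toTimestamp(seconds: number): string {
  const total = Math.floor(seconds)
  const mm = String(Math.floor(total / 60))
  const ss = String(total % 60)
  return (mm.length < 2 ? '0' + mm : mm) + ':' + (ss.length < 2 ? '0' + ss : ss)
}

// Hypothetical stand-in for formatSegments: one line per segment.
function formatSegmentsSketch(segments: readonly Segment[]): string {
  const lines: string[] = []
  for (const s of segments) {
    lines.push(`[${toTimestamp(s.start)} - ${toTimestamp(s.end)}] ${s.text.trim()}`)
  }
  return lines.join('\n')
}

const sample: Segment[] = [
  { start: 0, end: 4.2, text: ' Welcome to the demo.' },
  { start: 4.2, end: 65.9, text: ' Let us start recording.' }
]
console.log(formatSegmentsSketch(sample))
// [00:00 - 00:04] Welcome to the demo.
// [00:04 - 01:05] Let us start recording.
```
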
+ #### `extractSpeakers(text)`
+ Extract speaker-labeled text from transcription (requires speaker hints in prompt).
+
+ #### `estimateTranscriptionCost(durationSeconds)`
+ Estimate the cost of a Whisper transcription at $0.006 per minute of audio.
+
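The arithmetic behind that estimate is simple enough to check by hand. The sketch below is a standalone illustration, not the package's `estimateTranscriptionCost`; the name `estimateCostUSD` is hypothetical, and it assumes cost scales linearly with duration at the $0.006/minute rate quoted above, with no rounding applied.

```typescript
// Rate quoted in this README; verify against current OpenAI pricing before relying on it.
const WHISPER_USD_PER_MINUTE = 0.006

// Hypothetical helper: linear cost estimate, no rounding or minimum charge assumed.
function estimateCostUSD(durationSeconds: number): number {
  if (durationSeconds < 0) {
    throw new RangeError('durationSeconds must be non-negative')
  }
  return (durationSeconds / 60) * WHISPER_USD_PER_MINUTE
}

console.log(estimateCostUSD(600).toFixed(3))  // 10 minutes -> "0.060"
console.log(estimateCostUSD(5400).toFixed(2)) // 90 minutes -> "0.54"
```
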
+ ### Utility Classes
+
+ #### `TextFormatter`
+ Clean and format transcribed text.
+
+ **Options:**
+ - `removeFillerWords` - Remove 'um', 'uh', 'like', etc.
+ - `correctPunctuation` - Add proper punctuation
+ - `capitalizeFirstWord` - Capitalize first word of sentences
+ - `removeExtraSpaces` - Normalize whitespace
+
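As a rough picture of what three of these options could do to a transcript, consider the sketch below. It is an illustration under stated assumptions, not `TextFormatter`'s actual behavior: `formatTranscript` and `FormatOptions` are hypothetical names, only 'um' and 'uh' are treated as fillers (removing 'like' safely needs more context), and `correctPunctuation` is not attempted.

```typescript
// Hypothetical option bag mirroring three of the documented option names.
interface FormatOptions {
  removeFillerWords?: boolean
  capitalizeFirstWord?: boolean
  removeExtraSpaces?: boolean
}

function formatTranscript(text: string, options: FormatOptions = {}): string {
  let out = text
  if (options.removeFillerWords) {
    // Drop standalone 'um'/'uh' plus one trailing comma or period and any spaces.
    out = out.replace(/\b(?:um+|uh+)\b[,.]?\s*/gi, '')
  }
  if (options.removeExtraSpaces) {
    // Collapse runs of whitespace and trim both ends.
    out = out.replace(/\s+/g, ' ').trim()
  }
  if (options.capitalizeFirstWord) {
    // Uppercase the first letter of the text and of each sentence after . ! or ?
    out = out.replace(
      /(^|[.!?]\s+)([a-z])/g,
      (_match: string, lead: string, ch: string) => lead + ch.toUpperCase()
    )
  }
  return out
}

console.log(
  formatTranscript('um, so  this is, uh, the demo. it works.', {
    removeFillerWords: true,
    capitalizeFirstWord: true,
    removeExtraSpaces: true
  })
)
// So this is, the demo. It works.
```
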
+ #### `NoiseProcessor`
+ Process audio streams to reduce background noise.
+
+ **Options:**
+ - `noiseReduction` - Noise reduction strength (0-1)
+ - `autoGain` - Enable automatic gain control
+ - `echoCancellation` - Enable echo cancellation
+
+ ## TypeScript Support
+
+ This package is written in TypeScript and includes complete type definitions.
+
+ ```typescript
+ import type {
+   RecordingConfig,
+   TranscriptionOptions,
+   TranscriptionResult,
+   InstrumentationConfig
+ } from '@ainative/ai-kit-video'
+ ```
+
+ ## Browser Compatibility
+
+ - Chrome/Edge 87+
+ - Firefox 94+
+ - Safari 15.4+
+ - Requires a secure context (HTTPS or `localhost`) for MediaStream APIs
+
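Before constructing a recorder it can be useful to check these requirements up front. The sketch below is one possible approach and not part of this package: `RecordingEnv` and `missingRecordingFeatures` are hypothetical names, and the environment is passed in as plain booleans so the check can be exercised outside a browser.

```typescript
// Minimal view of the browser capabilities this check reads; injected so the
// function can also run outside a browser.
interface RecordingEnv {
  isSecureContext: boolean
  hasGetDisplayMedia: boolean
  hasGetUserMedia: boolean
  hasMediaRecorder: boolean
}

// Hypothetical helper: list whatever is missing; an empty array means ready.
function missingRecordingFeatures(env: RecordingEnv): string[] {
  const missing: string[] = []
  if (!env.isSecureContext) missing.push('secure context (HTTPS or localhost)')
  if (!env.hasGetDisplayMedia) missing.push('getDisplayMedia (screen capture)')
  if (!env.hasGetUserMedia) missing.push('getUserMedia (camera/microphone)')
  if (!env.hasMediaRecorder) missing.push('MediaRecorder')
  return missing
}

// In a browser the environment could be collected like this:
//   const env: RecordingEnv = {
//     isSecureContext: window.isSecureContext,
//     hasGetDisplayMedia: typeof navigator.mediaDevices?.getDisplayMedia === 'function',
//     hasGetUserMedia: typeof navigator.mediaDevices?.getUserMedia === 'function',
//     hasMediaRecorder: typeof MediaRecorder !== 'undefined'
//   }
const report = missingRecordingFeatures({
  isSecureContext: false, // e.g. a page served over plain HTTP
  hasGetDisplayMedia: true,
  hasGetUserMedia: true,
  hasMediaRecorder: true
})
console.log(report) // one entry: the secure-context requirement
```
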
+ ## Memory Management
+
+ Recorders clean up their own resources; Blob URLs you request stay valid until you revoke them:
+ - Blob URLs are released by calling `revokeURL()`
+ - MediaStreams are stopped on page unload
+ - Together these prevent leaks from unreleased streams and Blob URLs
+
+ ```typescript
+ // Manual cleanup
+ await recorder.stop() // stop() returns a Promise; wait for it before reading the recording
+ const url = recorder.getRecordingURL()
+ // ... use the URL
+ recorder.revokeURL(url) // Free memory
+ ```
+
+ ## Performance
+
+ - **Screen Recording**: 2.5 Mbps video, 128 kbps audio (default)
+ - **Camera Recording**: 1920x1080 @ 30fps (configurable)
+ - **Whisper Transcription**: ~$0.006 per minute of audio
+
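The default bitrates above give a quick way to estimate how large a recording will be. The sketch below is a back-of-the-envelope calculation, not a function exported by this package: `estimateRecordingBytes` is a hypothetical name, and it assumes the encoder actually hits the configured bitrates, which are targets and can vary with content.

```typescript
// Hypothetical helper: bytes = (video bps + audio bps) * seconds / 8.
// Defaults mirror the bitrates listed in this README.
function estimateRecordingBytes(
  durationSeconds: number,
  videoBitsPerSecond: number = 2500000,
  audioBitsPerSecond: number = 128000
): number {
  const totalBits = (videoBitsPerSecond + audioBitsPerSecond) * durationSeconds
  return Math.round(totalBits / 8)
}

const bytes = estimateRecordingBytes(60) // one minute at the default bitrates
console.log(bytes) // 19710000
console.log((bytes / 1000000).toFixed(1) + ' MB') // 19.7 MB
```

At these settings a ten-minute screen recording would come to roughly 197 MB, which is worth keeping in mind when holding the result in memory as a Blob.
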
+ ## Examples
+
+ See the [examples directory](https://github.com/AINative-Studio/ai-kit/tree/main/examples) for complete working examples:
+ - Screen recording with download
+ - Camera recording with preview
+ - Audio transcription with Whisper
+ - PiP composite recording
+ - Instrumented recording with metrics
+
+ ## Contributing
+
+ Contributions are welcome! Please see [CONTRIBUTING.md](https://github.com/AINative-Studio/ai-kit/blob/main/docs/contributing/CONTRIBUTING.md) for guidelines.
+
+ ## License
+
+ MIT - See [LICENSE](./LICENSE) for details.
+
+ ## Support
+
+ - [GitHub Issues](https://github.com/AINative-Studio/ai-kit/issues)
+ - [Documentation](https://ainative.studio/ai-kit)
+
+ ---
+
+ **Built by [AINative Studio](https://ainative.studio)**