npm - @nadimtuhin/ytranscript - Versions diffs - 1.0.2 → 1.2.0 - Mend

@nadimtuhin/ytranscript 1.0.2 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/README.md +210 -123
package/dist/cli.d.ts +6 -0
package/dist/cli.d.ts.map +1 -0
package/dist/cli.js +104 -51
package/dist/index.d.ts +30 -0
package/dist/index.d.ts.map +1 -0
package/dist/index.js +63 -25
package/dist/lib/fetcher.d.ts +26 -0
package/dist/lib/fetcher.d.ts.map +1 -0
package/dist/lib/fs.d.ts +20 -0
package/dist/lib/fs.d.ts.map +1 -0
package/dist/lib/processor.d.ts +14 -0
package/dist/lib/processor.d.ts.map +1 -0
package/dist/loaders/history.d.ts +9 -0
package/dist/loaders/history.d.ts.map +1 -0
package/dist/loaders/index.d.ts +20 -0
package/dist/loaders/index.d.ts.map +1 -0
package/dist/loaders/watchLater.d.ts +9 -0
package/dist/loaders/watchLater.d.ts.map +1 -0
package/dist/mcp.d.ts +8 -0
package/dist/mcp.d.ts.map +1 -0
package/dist/mcp.js +24 -7
package/dist/outputs/index.d.ts +30 -0
package/dist/outputs/index.d.ts.map +1 -0
package/dist/types.d.ts +93 -0
package/dist/types.d.ts.map +1 -0
package/package.json +6 -6

package/README.md CHANGED

@@ -1,36 +1,98 @@
 # ytranscript
-Fast YouTube transcript extraction with bulk processing, Google Takeout support, MCP server, and multiple output formats.
+[![npm version](https://img.shields.io/npm/v/@nadimtuhin/ytranscript.svg)](https://www.npmjs.com/package/@nadimtuhin/ytranscript)
+[![npm downloads](https://img.shields.io/npm/dm/@nadimtuhin/ytranscript.svg)](https://www.npmjs.com/package/@nadimtuhin/ytranscript)
+[![CI](https://github.com/nadimtuhin/ytranscript/actions/workflows/ci.yml/badge.svg)](https://github.com/nadimtuhin/ytranscript/actions/workflows/ci.yml)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-Built with [Bun](https://bun.sh) for maximum performance.
+Extract transcripts from your entire YouTube watch history in minutes. Build AI-powered video summaries, searchable archives, or feed transcripts directly to Claude, Cursor, and other AI assistants via the built-in MCP server.
-## Features
+**[Read the blog post: "Automating My Second Brain with YouTube Transcripts"](https://nadimtuhin.com/blog/ytranscript-mcp-youtube-transcripts)**
-- **Direct YouTube API** - No third-party services, uses YouTube's innertube API
-- **MCP Server** - Use with Claude, Cursor, and other AI assistants via Model Context Protocol
-- **Bulk processing** - Process thousands of videos with concurrency control
-- **Google Takeout support** - Import from watch history JSON and watch-later CSV
+## Why ytranscript?
+- **No API keys required** - Uses YouTube's public innertube API directly
+- **Works with AI assistants** - Built-in MCP server for Claude, Cursor, and others
+- **Bulk processing** - Process thousands of videos from Google Takeout exports
 - **Resume-safe** - Automatically skips already-processed videos
-- **Multiple output formats** - JSON, JSONL, CSV, SRT, VTT, plain text
-- **Language selection** - Choose preferred transcript languages
-- **Programmatic API** - Use as a library in your TypeScript/JavaScript projects
+- **Multiple formats** - JSON, JSONL, CSV, SRT, VTT, plain text
+## Quick Start
+```bash
+# Get a transcript in 10 seconds
+npx @nadimtuhin/ytranscript get dQw4w9WgXcQ
+# Output: "We're no strangers to love, you know the rules..."
+```
 ## Installation
 ```bash
-# Install globally
+# Global install (recommended for CLI usage)
 npm install -g @nadimtuhin/ytranscript
-# Or use locally in a project
+# Or use with npx (no install)
+npx @nadimtuhin/ytranscript get VIDEO_ID
+# Add to a project (for library usage)
 npm add @nadimtuhin/ytranscript
+```
+**Runtimes supported:** Node.js 18+ and Bun 1.0+
+## MCP Server (AI Assistant Integration)
+ytranscript includes an MCP (Model Context Protocol) server that lets Claude, Cursor, and other AI assistants fetch YouTube transcripts directly.
+### Available Tools
+| Tool | Description |
+|------|-------------|
+| `get_transcript` | Fetch transcript with format options (text, segments, srt, vtt) |
+| `get_transcript_languages` | List available caption languages for a video |
+| `extract_video_id` | Extract video ID from various YouTube URL formats |
+| `get_transcripts_bulk` | Fetch transcripts for multiple videos at once |
-# With bun
-bun add @nadimtuhin/ytranscript
+### Setup with Claude Desktop
+Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):
+```json
+{
+  "mcpServers": {
+    "ytranscript": {
+      "command": "npx",
+      "args": ["-y", "@nadimtuhin/ytranscript", "mcp"]
+    }
+  }
+}
 ```
+Or if installed globally:
+```json
+{
+  "mcpServers": {
+    "ytranscript": {
+      "command": "ytranscript-mcp"
+    }
+  }
+}
+```
+### Example Prompts for Claude
+Once configured, you can ask Claude:
+- "Get the transcript for this YouTube video: https://youtube.com/watch?v=dQw4w9WgXcQ"
+- "Summarize the key points from this video"
+- "What languages are available for this video's captions?"
+- "Get transcripts for these 5 videos and compare their content"
 ## CLI Usage
-### Fetch a single transcript
+### Single Video
 ```bash
 # Basic usage (outputs plain text)
@@ -49,13 +111,17 @@ ytranscript get dQw4w9WgXcQ --format srt -o video.srt
 ytranscript get dQw4w9WgXcQ --format json
 ```
-### Check available languages
+### Check Available Languages
 ```bash
 ytranscript info dQw4w9WgXcQ
+# Output:
+#   en     English (auto-generated)
+#   es     Spanish
+#   fr     French
 ```
-### Bulk processing
+### Bulk Processing
 ```bash
 # From Google Takeout exports
@@ -71,36 +137,77 @@ ytranscript bulk --videos "dQw4w9WgXcQ,jNQXAC9IVRw,9bZkp7q19f0"
 # From a file (one ID or URL per line)
 ytranscript bulk --file videos.txt
-# Resume a previous run
+# Resume a previous run (skips already-processed videos)
 ytranscript bulk --history watch-history.json --resume
+```
+### Rate Limiting
+YouTube may rate-limit requests. Use these flags to control pacing:
-# Control concurrency and rate limiting
+```bash
 ytranscript bulk \
   --history watch-history.json \
-  --concurrency 8 \
-  --pause-after 20 \
-  --pause-ms 3000
+  --concurrency 4 \      # Max concurrent requests (default: 4, safe: 1-8)
+  --pause-after 10 \     # Pause after N requests (default: 10)
+  --pause-ms 5000        # Pause duration in ms (default: 5000)
 ```
-## Programmatic API
+**Recommended for large batches:** `--concurrency 2 --pause-after 10 --pause-ms 5000`
+### Proxy Support
+Route requests through an HTTP proxy to avoid rate limiting or access from restricted networks:
-### Fetch a single transcript
+```bash
+# CLI with proxy
+ytranscript get dQw4w9WgXcQ --proxy http://localhost:8080
+# Bulk with proxy
+ytranscript bulk --history watch-history.json --proxy http://user:pass@proxy.example.com:8080
+# With authentication
+ytranscript get dQw4w9WgXcQ --proxy http://username:password@proxy:8080
+```
+Programmatic usage:
 ```typescript
 import { fetchTranscript } from '@nadimtuhin/ytranscript';
 const transcript = await fetchTranscript('dQw4w9WgXcQ', {
-  languages: ['en', 'es'], // Preference order
-  includeAutoGenerated: true,
+  proxy: {
+    url: 'http://localhost:8080',
+  },
 });
+```
+> Proxy support inspired by [ytfetcher](https://github.com/kaya70875/ytfetcher)
+## Programmatic API
+### Fetch a Single Transcript
+```typescript
+import { fetchTranscript } from '@nadimtuhin/ytranscript';
-console.log(transcript.text); // Full transcript text
-console.log(transcript.segments); // Array of { text, start, duration }
-console.log(transcript.language); // 'en'
-console.log(transcript.isAutoGenerated); // true/false
+try {
+  const transcript = await fetchTranscript('dQw4w9WgXcQ', {
+    languages: ['en', 'es'], // Preference order
+    includeAutoGenerated: true,
+  });
+  console.log(transcript.text);           // Full transcript text
+  console.log(transcript.segments);       // Array of { text, start, duration }
+  console.log(transcript.language);       // 'en'
+  console.log(transcript.isAutoGenerated); // true/false
+} catch (error) {
+  // See "Error Handling" section below
+  console.error(error.message);
+}
 ```
-### Bulk processing
+### Bulk Processing
 ```typescript
 import {
@@ -132,7 +239,7 @@ const results = await processVideos(videos, {
 const transcripts = results.filter((r) => r.transcript);
 ```
-### Streaming for large datasets
+### Streaming for Large Datasets
 ```typescript
 import { streamVideos, appendJsonl } from '@nadimtuhin/ytranscript';
@@ -143,20 +250,21 @@ for await (const result of streamVideos(videos, { concurrency: 4 })) {
 }
 ```
-### Output formatting
+### Output Formatting
 ```typescript
 import { fetchTranscript, formatSrt, formatVtt, formatText } from '@nadimtuhin/ytranscript';
+import { writeFile } from 'fs/promises';
 const transcript = await fetchTranscript('dQw4w9WgXcQ');
 // SRT subtitles
 const srt = formatSrt(transcript);
-await Bun.write('video.srt', srt);
+await writeFile('video.srt', srt);
 // VTT subtitles
 const vtt = formatVtt(transcript);
-await Bun.write('video.vtt', vtt);
+await writeFile('video.vtt', vtt);
 // Plain text with timestamps
 const text = formatText(transcript, true);
@@ -164,6 +272,43 @@ const text = formatText(transcript, true);
 // [0:05] Second line...
 ```
+## Error Handling
+The library throws errors for various failure cases:
+| Error Message | Cause | Solution |
+|---------------|-------|----------|
+| `No captions available for this video` | Video has no captions/subtitles | Check with `ytranscript info` first |
+| `No suitable caption track found` | Requested language not available | Use `includeAutoGenerated: true` or different language |
+| `Caption track is empty` | Captions exist but have no content | Rare; try a different language |
+| `HTTP 429` | Rate limited by YouTube | Reduce concurrency, add pauses |
+| `HTTP 403` | Video is private or region-locked | Cannot access this video |
+```typescript
+try {
+  const transcript = await fetchTranscript(videoId);
+} catch (error) {
+  if (error.message.includes('No captions available')) {
+    console.log('This video has no subtitles');
+  } else if (error.message.includes('429')) {
+    console.log('Rate limited - slow down requests');
+  }
+}
+```
+## Limitations
+| Scenario | Supported |
+|----------|-----------|
+| Public videos with captions | ✅ Yes |
+| Auto-generated captions | ✅ Yes |
+| Manual/community captions | ✅ Yes |
+| Private videos | ❌ No |
+| Age-restricted videos | ❌ No |
+| Live streams (while live) | ❌ No |
+| Premiere videos (before premiere) | ❌ No |
+| Region-locked videos | ❌ No (unless you're in the allowed region) |
 ## Google Takeout
 To export your YouTube data:
@@ -195,110 +340,52 @@ interface Transcript {
 interface TranscriptSegment {
   text: string;
-  start: number;  // seconds
-  duration: number;  // seconds
+  start: number;    // seconds
+  duration: number; // seconds
+}
+interface WatchHistoryMeta {
+  videoId: string;
+  title?: string;
+  url?: string;
+  channel?: { name?: string; url?: string };
+  watchedAt?: string;
+  source: 'history' | 'watch_later' | 'manual';
 }
 interface TranscriptResult {
   meta: WatchHistoryMeta;
   transcript: Transcript | null;
-  error?: string;
+  error?: string;  // Present when transcript is null
 }
 interface FetchOptions {
-  languages?: string[];
-  timeout?: number;
-  includeAutoGenerated?: boolean;
+  languages?: string[];          // Default: ['en']
+  timeout?: number;              // Default: 30000 (ms)
+  includeAutoGenerated?: boolean; // Default: true
+  proxy?: ProxyConfig;           // Optional proxy configuration
 }
-interface BulkOptions extends FetchOptions {
-  concurrency?: number;
-  pauseAfter?: number;
-  pauseDuration?: number;
-  skipIds?: Set<string>;
-  onProgress?: (completed: number, total: number, result: TranscriptResult) => void;
+interface ProxyConfig {
+  url: string;        // HTTP proxy URL (e.g., "http://user:pass@host:port")
 }
-```
-## License
-MIT
----
-## MCP Server (Model Context Protocol)
-ytranscript includes an MCP server that allows AI assistants like Claude to fetch YouTube transcripts directly.
-### Available Tools
-| Tool | Description |
-|------|-------------|
-| `get_transcript` | Fetch transcript for a YouTube video with format options (text, segments, srt, vtt) |
-| `get_transcript_languages` | List available caption languages for a video |
-| `extract_video_id` | Extract video ID from various YouTube URL formats |
-| `get_transcripts_bulk` | Fetch transcripts for multiple videos at once |
-### Setup with Claude Desktop
-Add to your Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):
-```json
-{
-  "mcpServers": {
-    "ytranscript": {
-      "command": "npx",
-      "args": ["-y", "ytranscript-mcp"]
-    }
-  }
-}
-```
-Or if installed globally:
-```json
-{
-  "mcpServers": {
-    "ytranscript": {
-      "command": "ytranscript-mcp"
-    }
-  }
-}
-```
-### Setup with Cursor
-Add to your Cursor MCP settings:
-```json
-{
-  "mcpServers": {
-    "ytranscript": {
-      "command": "npx",
-      "args": ["-y", "ytranscript-mcp"]
-    }
-  }
+interface BulkOptions extends FetchOptions {
+  concurrency?: number;    // Default: 4
+  pauseAfter?: number;     // Default: 10
+  pauseDuration?: number;  // Default: 5000 (ms)
+  skipIds?: Set<string>;   // Videos to skip
+  onProgress?: (completed: number, total: number, result: TranscriptResult) => void;
 }
 ```
-### Example Usage in Claude
+## Contributing
-Once configured, you can ask Claude:
+Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
-- "Get the transcript for this YouTube video: https://youtube.com/watch?v=dQw4w9WgXcQ"
-- "What languages are available for this video?"
-- "Summarize the transcript of this video"
-- "Get transcripts for these 5 videos and compare their content"
+- Report bugs via [GitHub Issues](https://github.com/nadimtuhin/ytranscript/issues)
+- Security issues: see [SECURITY.md](SECURITY.md)
-### Running the MCP Server Manually
-```bash
-# Via npx
-npx ytranscript-mcp
-# Or if installed globally
-ytranscript-mcp
+## License
-# For development
-bun run dev:mcp
-```
+MIT
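
Note on the "Rate Limiting" section in the README diff above: the new README documents `--concurrency`, `--pause-after`, and `--pause-ms` but does not show how they interact. The following is a minimal sketch of one plausible reading, assuming "pause after N requests" means sleeping once every N completed items. `runPaced`, `PacingOptions`, and `sleep` are hypothetical names for illustration and are not part of the ytranscript package.

```typescript
// Hypothetical sketch: NOT ytranscript's implementation, only an
// illustration of how the three pacing flags could combine.
interface PacingOptions {
  concurrency: number; // max tasks in flight at once (cf. --concurrency)
  pauseAfter: number;  // pause after this many items (cf. --pause-after)
  pauseMs: number;     // pause length in milliseconds (cf. --pause-ms)
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runPaced<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  options: PacingOptions,
): Promise<R[]> {
  // Clamp to at least 1 so the loops below always make progress.
  const concurrency = Math.max(1, Math.floor(options.concurrency));
  const pauseAfter = Math.max(1, Math.floor(options.pauseAfter));
  const results: R[] = [];

  // Outer loop: one batch of `pauseAfter` items between pauses.
  for (let start = 0; start < items.length; start += pauseAfter) {
    const batch = items.slice(start, start + pauseAfter);

    // Inner loop: at most `concurrency` workers run at the same time.
    for (let i = 0; i < batch.length; i += concurrency) {
      const chunk = batch.slice(i, i + concurrency);
      const settled = await Promise.all(chunk.map((item) => worker(item)));
      results.push(...settled);
    }

    // Sleep only when another batch follows.
    if (start + pauseAfter < items.length) {
      await sleep(options.pauseMs);
    }
  }
  return results;
}
```

With the README's recommended large-batch settings this would be called as `runPaced(ids, fetchOne, { concurrency: 2, pauseAfter: 10, pauseMs: 5000 })`, where `fetchOne` is whatever per-video function the caller supplies. Separately, the shell example in that section places comments after the trailing backslashes; in POSIX shells a backslash continues the line only when it is the last character, so the flags should be written without trailing comments when the command is actually run.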

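Note on the "Error Handling" section in the README diff above: the table distinguishes failures only by message text, and the README's examples read `error.message` directly. Under TypeScript's `strict` settings a `catch` variable is typed `unknown`, so it has to be narrowed first. Below is a minimal sketch of a classifier built from the message strings listed in that table; `classifyError`, `isRetryable`, and `FailureKind` are hypothetical helpers, not exports of the package, and the matching assumes the messages appear exactly as the table shows them.

```typescript
// Hypothetical helpers: not part of ytranscript. The message substrings
// come from the README's error table; everything else is an assumption.
type FailureKind =
  | 'no_captions'
  | 'no_track'
  | 'empty_track'
  | 'rate_limited'
  | 'forbidden'
  | 'unknown';

function classifyError(error: unknown): FailureKind {
  // Narrow the `unknown` catch variable before reading `.message`.
  const message = error instanceof Error ? error.message : String(error);

  if (message.includes('No captions available')) return 'no_captions';
  if (message.includes('No suitable caption track')) return 'no_track';
  if (message.includes('Caption track is empty')) return 'empty_track';
  if (message.includes('HTTP 429')) return 'rate_limited';
  if (message.includes('HTTP 403')) return 'forbidden';
  return 'unknown';
}

// Only rate limiting is worth retrying after a delay; the other
// categories describe properties of the video itself.
function isRetryable(kind: FailureKind): boolean {
  return kind === 'rate_limited';
}
```

In a `catch (error)` block, `classifyError(error)` can then drive a `switch` instead of a chain of `includes` checks, and `isRetryable` can decide whether to slow down and try again.
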
package/dist/cli.d.ts ADDED

@@ -0,0 +1,6 @@
+#!/usr/bin/env node
+/**
+ * ytranscript CLI - Bulk YouTube transcript extraction
+ */
+export {};
+//# sourceMappingURL=cli.d.ts.map

package/dist/cli.d.ts.map ADDED

@@ -0,0 +1 @@
+{"version":3,"file":"cli.d.ts","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AACA;;GAEG"}