npm - mcp-headless-youtube-transcript - Versions diffs - 0.3.0 → 0.5.0 - Mend

mcp-headless-youtube-transcript 0.3.0 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6) hide show

package/LICENSE ADDED Viewed

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2025 Andrew Lewin
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md CHANGED Viewed

@@ -1,10 +1,13 @@
 # MCP Headless YouTube Transcript
-An MCP (Model Context Protocol) server that extracts YouTube video transcripts using the `headless-youtube-captions` library.
+An MCP (Model Context Protocol) server that extracts YouTube video transcripts, channel videos, and comments using the `headless-youtube-captions` library.
 ## Features
 - Extract transcripts from YouTube videos using video ID or full URL
+- Get videos from YouTube channels with pagination support
+- Search for videos within a specific channel
+- Retrieve comments from YouTube videos
 - Support for multiple languages
 - Automatic pagination for large transcripts (98k character chunks)
 - Clean text output optimized for LLM consumption
@@ -84,9 +87,93 @@ With pagination:
 }
 ```
-## Response Format
+### `get_channel_videos`
-The tool returns the raw transcript text. For large transcripts, the response includes pagination information:
+Extract videos from a YouTube channel with pagination support.
+**Parameters:**
+- `channelUrl` (required): YouTube channel URL, @handle, or channel ID
+- `maxVideos` (optional): Maximum number of videos to retrieve. Defaults to 50
+**Examples:**
+Using handle:
+```json
+{
+  "name": "get_channel_videos",
+  "arguments": {
+    "channelUrl": "@mkbhd"
+  }
+}
+```
+Using channel URL:
+```json
+{
+  "name": "get_channel_videos",
+  "arguments": {
+    "channelUrl": "https://www.youtube.com/channel/UCBJycsmduvYEL83R_U4JriQ",
+    "maxVideos": 100
+  }
+}
+```
+### `search_channel_videos`
+Search for specific videos within a YouTube channel.
+**Parameters:**
+- `channelUrl` (required): YouTube channel URL, @handle, or channel ID
+- `query` (required): Search query to find videos in the channel
+**Example:**
+```json
+{
+  "name": "search_channel_videos",
+  "arguments": {
+    "channelUrl": "@mkbhd",
+    "query": "iPhone review"
+  }
+}
+```
+### `get_video_comments`
+Retrieve comments from a YouTube video.
+**Parameters:**
+- `videoId` (required): YouTube video ID or full URL
+- `sortBy` (optional): Sort comments by "top" or "newest". Defaults to "top"
+- `maxComments` (optional): Maximum number of comments to retrieve. Defaults to 100
+**Examples:**
+Basic usage:
+```json
+{
+  "name": "get_video_comments",
+  "arguments": {
+    "videoId": "dQw4w9WgXcQ"
+  }
+}
+```
+With sorting and limit:
+```json
+{
+  "name": "get_video_comments",
+  "arguments": {
+    "videoId": "dQw4w9WgXcQ",
+    "sortBy": "newest",
+    "maxComments": 50
+  }
+}
+```
+## Response Formats
+### Transcript Response
+For `get_youtube_transcript`, the tool returns the raw transcript text. For large transcripts, the response includes pagination information:
 ```
 [Segment 1 of 3]
@@ -96,6 +183,50 @@ this is the actual transcript text content...
 When multiple segments are available, you can retrieve subsequent segments by incrementing the `segment` parameter.
+### Channel Videos Response
+For `get_channel_videos` and `search_channel_videos`, the response is a JSON object containing channel information and video details:
+```json
+{
+  "channel": {
+    "name": "Channel Name",
+    "subscribers": "1.23M subscribers",
+    "videoCount": "500 videos"
+  },
+  "videos": [
+    {
+      "id": "videoId123",
+      "title": "Video Title",
+      "url": "https://www.youtube.com/watch?v=videoId123",
+      "views": "1.2M views",
+      "uploadTime": "2 weeks ago",
+      "duration": "10:34"
+    }
+  ],
+  "totalVideosRetrieved": 50
+}
+```
+### Comments Response
+For `get_video_comments`, the response includes comment details:
+```json
+{
+  "videoId": "dQw4w9WgXcQ",
+  "sortBy": "top",
+  "comments": [
+    {
+      "author": "Username",
+      "text": "This is a comment",
+      "likes": "1.2K",
+      "replyCount": 23,
+      "timeAgo": "2 weeks ago"
+    }
+  ],
+  "totalComments": 100
+}
+```
 ## Caching
 The server includes built-in caching to improve performance for paginated requests. The cache behavior can be configured with an environment variable:
@@ -115,8 +246,29 @@ The server includes built-in caching to improve performance for paginated reques
 TRANSCRIPT_CACHE_TTL=600 npx mcp-headless-youtube-transcript
 ```
+## Environment Variables
+### PUPPETEER_EXECUTABLE_PATH
+If you need to specify a custom path for the Chromium/Chrome executable used by Puppeteer, you can set the `PUPPETEER_EXECUTABLE_PATH` environment variable:
+```bash
+# Example: Using system Chrome
+PUPPETEER_EXECUTABLE_PATH="/usr/bin/google-chrome" npx mcp-headless-youtube-transcript
+# Example: Using a specific Chromium installation
+PUPPETEER_EXECUTABLE_PATH="/path/to/chromium" npx mcp-headless-youtube-transcript
+```
+This is useful when:
+- Running in containerized environments
+- Using a system-installed Chrome/Chromium instead of the bundled one
+- Working in environments with specific security requirements
+- Troubleshooting Puppeteer launch issues
 ## Supported URL Formats
+### Video URLs
 - Video ID: `dQw4w9WgXcQ`
 - YouTube URLs:
   - `https://www.youtube.com/watch?v=dQw4w9WgXcQ`
@@ -124,6 +276,15 @@ TRANSCRIPT_CACHE_TTL=600 npx mcp-headless-youtube-transcript
   - `https://www.youtube.com/embed/dQw4w9WgXcQ`
   - `https://www.youtube.com/v/dQw4w9WgXcQ`
+### Channel URLs
+- Handle: `@mkbhd`
+- Channel ID: `UCBJycsmduvYEL83R_U4JriQ`
+- Channel URLs:
+  - `https://www.youtube.com/channel/UCBJycsmduvYEL83R_U4JriQ`
+  - `https://www.youtube.com/c/mkbhd`
+  - `https://www.youtube.com/user/marquesbrownlee`
+  - `https://www.youtube.com/@mkbhd`
 ## Development
 ```bash

package/build/index.js CHANGED Viewed

@@ -2,8 +2,9 @@
 import { Server } from '@modelcontextprotocol/sdk/server/index.js';
 import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
 import { CallToolRequestSchema, ListToolsRequestSchema, } from '@modelcontextprotocol/sdk/types.js';
-import { getSubtitles } from 'headless-youtube-captions';
-import { extractVideoId } from './utils.js';
+// @ts-ignore - Types are defined in global.d.ts
+import { getSubtitles, getChannelVideos, searchChannelVideos, getVideoComments } from 'headless-youtube-captions';
+import { extractVideoId, extractChannelIdentifier, formatChannelUrl, truncateText } from './utils.js';
 // In-memory cache
 const transcriptCache = new Map();
 // Get cache TTL from environment variable (default 5 minutes)
@@ -75,6 +76,68 @@ server.setRequestHandler(ListToolsRequestSchema, async () => {
                     required: ['videoId'],
                 },
             },
+            {
+                name: 'get_channel_videos',
+                description: 'Extract videos from a YouTube channel with pagination support',
+                inputSchema: {
+                    type: 'object',
+                    properties: {
+                        channelUrl: {
+                            type: 'string',
+                            description: 'YouTube channel URL, @handle, or channel ID',
+                        },
+                        maxVideos: {
+                            type: 'number',
+                            description: 'Maximum number of videos to retrieve. Defaults to 50',
+                            default: 50,
+                        },
+                    },
+                    required: ['channelUrl'],
+                },
+            },
+            {
+                name: 'search_channel_videos',
+                description: 'Search for specific videos within a YouTube channel',
+                inputSchema: {
+                    type: 'object',
+                    properties: {
+                        channelUrl: {
+                            type: 'string',
+                            description: 'YouTube channel URL, @handle, or channel ID',
+                        },
+                        query: {
+                            type: 'string',
+                            description: 'Search query to find videos in the channel',
+                        },
+                    },
+                    required: ['channelUrl', 'query'],
+                },
+            },
+            {
+                name: 'get_video_comments',
+                description: 'Retrieve comments from a YouTube video',
+                inputSchema: {
+                    type: 'object',
+                    properties: {
+                        videoId: {
+                            type: 'string',
+                            description: 'YouTube video ID or full URL',
+                        },
+                        sortBy: {
+                            type: 'string',
+                            description: 'Sort comments by "top" or "newest". Defaults to "top"',
+                            enum: ['top', 'newest'],
+                            default: 'top',
+                        },
+                        maxComments: {
+                            type: 'number',
+                            description: 'Maximum number of comments to retrieve. Defaults to 100',
+                            default: 100,
+                        },
+                    },
+                    required: ['videoId'],
+                },
+            },
         ],
     };
 });
@@ -152,6 +215,164 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
             cleanupExpiredCache();
         }
     }
+    if (name === 'get_channel_videos') {
+        try {
+            const { channelUrl, maxVideos = 50 } = args;
+            // Extract and format channel URL
+            const channelIdentifier = extractChannelIdentifier(channelUrl);
+            const formattedUrl = formatChannelUrl(channelIdentifier);
+            // Get channel videos
+            const result = await getChannelVideos({
+                channelURL: formattedUrl,
+                limit: maxVideos
+            });
+            // Format the response
+            const response = {
+                channel: {
+                    name: result.channel.name,
+                    subscribers: result.channel.subscribers,
+                    videoCount: result.channel.videoCount,
+                },
+                videos: result.videos.map((video) => ({
+                    id: video.id,
+                    title: video.title,
+                    url: video.url,
+                    views: video.views,
+                    uploadTime: video.uploadTime,
+                    duration: video.duration,
+                    thumbnail: video.thumbnail,
+                })),
+                totalVideosRetrieved: result.totalLoaded,
+                hasMore: result.hasMore,
+            };
+            return {
+                content: [
+                    {
+                        type: 'text',
+                        text: JSON.stringify(response, null, 2),
+                    },
+                ],
+            };
+        }
+        catch (error) {
+            const errorMessage = error instanceof Error ? error.message : 'Unknown error occurred';
+            return {
+                content: [
+                    {
+                        type: 'text',
+                        text: `Error getting channel videos: ${errorMessage}`,
+                    },
+                ],
+                isError: true,
+            };
+        }
+    }
+    if (name === 'search_channel_videos') {
+        try {
+            const { channelUrl, query } = args;
+            // Extract and format channel URL
+            const channelIdentifier = extractChannelIdentifier(channelUrl);
+            const formattedUrl = formatChannelUrl(channelIdentifier);
+            // Search channel videos
+            const result = await searchChannelVideos({
+                channelURL: formattedUrl,
+                query: query
+            });
+            // Format the response
+            const response = {
+                query: result.query,
+                channelUrl: formattedUrl,
+                results: result.results.map((video) => ({
+                    id: video.id,
+                    title: video.title,
+                    url: video.url,
+                    views: video.views,
+                    uploadTime: video.uploadTime,
+                    duration: video.duration,
+                    thumbnail: video.thumbnail,
+                })),
+                totalResults: result.totalFound,
+            };
+            return {
+                content: [
+                    {
+                        type: 'text',
+                        text: JSON.stringify(response, null, 2),
+                    },
+                ],
+            };
+        }
+        catch (error) {
+            const errorMessage = error instanceof Error ? error.message : 'Unknown error occurred';
+            return {
+                content: [
+                    {
+                        type: 'text',
+                        text: `Error searching channel videos: ${errorMessage}`,
+                    },
+                ],
+                isError: true,
+            };
+        }
+    }
+    if (name === 'get_video_comments') {
+        try {
+            const { videoId, sortBy = 'top', maxComments = 100 } = args;
+            // Extract video ID from URL if needed
+            const extractedVideoId = extractVideoId(videoId);
+            if (!extractedVideoId) {
+                throw new Error('Invalid YouTube video ID or URL');
+            }
+            // Get video comments
+            const result = await getVideoComments({
+                videoID: extractedVideoId,
+                sortBy: sortBy,
+                limit: maxComments
+            });
+            // Format the response (truncate if needed)
+            const response = {
+                video: {
+                    id: result.video.id,
+                    title: result.video.title,
+                    channel: result.video.channel,
+                    views: result.video.views,
+                },
+                sortBy: result.sortBy,
+                comments: result.comments.map((comment) => ({
+                    author: comment.author,
+                    text: comment.text,
+                    likes: comment.likes,
+                    replyCount: comment.replyCount,
+                    time: comment.time,
+                })),
+                totalComments: result.totalComments,
+                totalLoaded: result.totalLoaded,
+                hasMore: result.hasMore,
+            };
+            const responseText = JSON.stringify(response, null, 2);
+            const truncatedResponse = truncateText(responseText);
+            return {
+                content: [
+                    {
+                        type: 'text',
+                        text: truncatedResponse,
+                    },
+                ],
+            };
+        }
+        catch (error) {
+            const errorMessage = error instanceof Error ? error.message : 'Unknown error occurred';
+            return {
+                content: [
+                    {
+                        type: 'text',
+                        text: `Error getting video comments: ${errorMessage}`,
+                    },
+                ],
+                isError: true,
+            };
+        }
+    }
     throw new Error(`Unknown tool: ${name}`);
 });
 async function main() {

package/build/utils.d.ts CHANGED Viewed

@@ -1,3 +1,6 @@
 export declare function extractVideoId(input: string): string | null;
 export declare function formatTime(seconds: number): string;
+export declare function extractChannelIdentifier(input: string): string;
+export declare function formatChannelUrl(identifier: string): string;
+export declare function truncateText(text: string, maxLength?: number): string;
 //# sourceMappingURL=utils.d.ts.map

package/build/utils.js CHANGED Viewed

@@ -23,4 +23,46 @@ export function formatTime(seconds) {
     const remainingSeconds = Math.floor(seconds % 60);
     return `${minutes.toString().padStart(2, '0')}:${remainingSeconds.toString().padStart(2, '0')}`;
 }
+// Helper function to extract channel identifier from various YouTube channel URL formats
+export function extractChannelIdentifier(input) {
+    // If it's already a channel ID (starts with UC) or a handle (starts with @)
+    if (/^UC[a-zA-Z0-9_-]{22}$/.test(input) || /^@[\w.-]+$/.test(input)) {
+        return input;
+    }
+    // Extract from various YouTube channel URL formats
+    const patterns = [
+        /youtube\.com\/channel\/(UC[a-zA-Z0-9_-]{22})/,
+        /youtube\.com\/c\/([^\/]+)/,
+        /youtube\.com\/user\/([^\/]+)/,
+        /youtube\.com\/(@[\w.-]+)/,
+    ];
+    for (const pattern of patterns) {
+        const match = input.match(pattern);
+        if (match) {
+            return match[1];
+        }
+    }
+    // If no pattern matches, return the input as-is (might be a channel name)
+    return input;
+}
+// Helper function to format/normalize channel URLs
+export function formatChannelUrl(identifier) {
+    // If it's a handle, construct the URL with the handle
+    if (identifier.startsWith('@')) {
+        return `https://www.youtube.com/${identifier}`;
+    }
+    // If it's a channel ID, use the channel URL format
+    if (/^UC[a-zA-Z0-9_-]{22}$/.test(identifier)) {
+        return `https://www.youtube.com/channel/${identifier}`;
+    }
+    // Otherwise, assume it's a custom URL/username
+    return `https://www.youtube.com/c/${identifier}`;
+}
+// Helper function to truncate text for large responses
+export function truncateText(text, maxLength = 50000) {
+    if (text.length <= maxLength) {
+        return text;
+    }
+    return text.substring(0, maxLength) + '\n\n[Content truncated due to length...]';
+}
 //# sourceMappingURL=utils.js.map

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "mcp-headless-youtube-transcript",
-  "version": "0.3.0",
+  "version": "0.5.0",
   "description": "MCP server for extracting YouTube video transcripts using headless-youtube-captions",
   "main": "build/index.js",
   "bin": {
@@ -34,7 +34,7 @@
   "license": "MIT",
   "dependencies": {
     "@modelcontextprotocol/sdk": "^1.0.0",
-    "headless-youtube-captions": "^1.0.2"
+    "headless-youtube-captions": "^1.2.0"
   },
   "devDependencies": {
     "@types/node": "^22.0.0",