mcp-headless-youtube-transcript 0.2.0 → 0.2.1

Files changed (2)
  1. package/README.md +64 -11
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -6,43 +6,96 @@ An MCP (Model Context Protocol) server that extracts YouTube video transcripts u
 
  - Extract transcripts from YouTube videos using video ID or full URL
  - Support for multiple languages
- - Timestamped transcript output
+ - Automatic pagination for large transcripts (98k character chunks)
+ - Clean text output optimized for LLM consumption
  - Built with TypeScript and the MCP SDK
 
  ## Installation
 
+ Install via npm:
+
  ```bash
- npm install
- npm run build
+ npm install -g mcp-headless-youtube-transcript
  ```
 
- ## Usage
+ Or use directly with npx:
 
- ### As an MCP Server
+ ```bash
+ npx mcp-headless-youtube-transcript
+ ```
 
- This server implements the Model Context Protocol and can be used with MCP clients.
+ ## MCP Configuration
 
- ### Tools Available
+ Add this server to your MCP settings:
 
- #### `get_youtube_transcript`
+ ```json
+ {
+   "mcpServers": {
+     "youtube-transcript": {
+       "command": "npx",
+       "args": ["-y", "mcp-headless-youtube-transcript"]
+     }
+   }
+ }
+ ```
 
- Extracts transcript/captions from a YouTube video.
+ ## Tools Available
+
+ ### `get_youtube_transcript`
+
+ Extracts transcript/captions from a YouTube video with automatic pagination for large transcripts.
 
  **Parameters:**
  - `videoId` (required): YouTube video ID or full URL
  - `lang` (optional): Language code for captions (e.g., "en", "es", "ko"). Defaults to "en"
+ - `segment` (optional): Segment number to retrieve (1-based). Each segment is ~98k characters. Defaults to 1
+
+ **Examples:**
 
- **Example:**
+ Basic usage:
+ ```json
+ {
+   "name": "get_youtube_transcript",
+   "arguments": {
+     "videoId": "dQw4w9WgXcQ"
+   }
+ }
+ ```
+
+ With language:
+ ```json
+ {
+   "name": "get_youtube_transcript",
+   "arguments": {
+     "videoId": "dQw4w9WgXcQ",
+     "lang": "es"
+   }
+ }
+ ```
+
+ With pagination:
  ```json
  {
    "name": "get_youtube_transcript",
    "arguments": {
      "videoId": "dQw4w9WgXcQ",
-     "lang": "en"
+     "segment": 2
    }
  }
  ```
 
+ ## Response Format
+
+ The tool returns the raw transcript text. For large transcripts, the response includes pagination information:
+
+ ```
+ [Segment 1 of 3]
+
+ this is the actual transcript text content...
+ ```
+
+ When multiple segments are available, you can retrieve subsequent segments by incrementing the `segment` parameter.
+
  ## Supported URL Formats
 
  - Video ID: `dQw4w9WgXcQ`
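Reviewer note on the README hunk above: the pagination it documents (a 1-based `segment` parameter, roughly 98k characters per segment, and a `[Segment N of M]` header on paginated responses) can be sketched in TypeScript as below. This is an illustrative sketch only, not the package's actual implementation. The names `SEGMENT_SIZE`, `paginateTranscript`, and `parseSegmentHeader` are hypothetical, the value 98000 is an assumption read off the "98k" figure, and the real server may choose its split points differently from the raw character offsets used here.

```typescript
// Assumed chunk size: the README only says "~98k characters"; the exact value is a guess.
const SEGMENT_SIZE = 98000;

interface SegmentResult {
  text: string;          // response body, with a header when paginated
  segment: number;       // 1-based segment that was returned
  totalSegments: number; // number of segments for this transcript
}

// Hypothetical server-side helper: return the requested 1-based segment.
function paginateTranscript(
  transcript: string,
  segment: number = 1,
  size: number = SEGMENT_SIZE,
): SegmentResult {
  // An empty transcript still counts as one (empty) segment.
  const totalSegments = Math.max(1, Math.ceil(transcript.length / size));
  if (!Number.isInteger(segment) || segment < 1 || segment > totalSegments) {
    throw new RangeError(`segment must be between 1 and ${totalSegments}`);
  }
  const start = (segment - 1) * size;
  const chunk = transcript.slice(start, start + size);
  // Only paginated responses carry the "[Segment N of M]" header.
  const text =
    totalSegments > 1
      ? `[Segment ${segment} of ${totalSegments}]\n\n${chunk}`
      : chunk;
  return { text, segment, totalSegments };
}

// Hypothetical client-side helper: read the header to learn how many segments exist.
// Returns null when the response has no pagination header.
function parseSegmentHeader(
  response: string,
): { segment: number; totalSegments: number } | null {
  const match = /^\[Segment (\d+) of (\d+)\]/.exec(response);
  return match
    ? { segment: Number(match[1]), totalSegments: Number(match[2]) }
    : null;
}
```

Under these assumptions, a client would call the tool with `segment` set to 1, parse the header, and then request segments 2 through `totalSegments` by incrementing `segment`, as the new README text describes.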
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "mcp-headless-youtube-transcript",
-   "version": "0.2.0",
+   "version": "0.2.1",
    "description": "MCP server for extracting YouTube video transcripts using headless-youtube-captions",
    "main": "build/index.js",
    "bin": {