mcp-headless-youtube-transcript 0.2.0 → 0.2.1
- package/README.md +64 -11
- package/package.json +1 -1
package/README.md
CHANGED
@@ -6,43 +6,96 @@ An MCP (Model Context Protocol) server that extracts YouTube video transcripts u

 - Extract transcripts from YouTube videos using video ID or full URL
 - Support for multiple languages
--
+- Automatic pagination for large transcripts (98k character chunks)
+- Clean text output optimized for LLM consumption
 - Built with TypeScript and the MCP SDK

 ## Installation

+Install via npm:
+
 ```bash
-npm install
-npm run build
+npm install -g mcp-headless-youtube-transcript
 ```

-
+Or use directly with npx:

-
+```bash
+npx mcp-headless-youtube-transcript
+```

-
+## MCP Configuration

-
+Add this server to your MCP settings:

-
+```json
+{
+  "mcpServers": {
+    "youtube-transcript": {
+      "command": "npx",
+      "args": ["-y", "mcp-headless-youtube-transcript"]
+    }
+  }
+}
+```

-
+## Tools Available
+
+### `get_youtube_transcript`
+
+Extracts transcript/captions from a YouTube video with automatic pagination for large transcripts.

 **Parameters:**
 - `videoId` (required): YouTube video ID or full URL
 - `lang` (optional): Language code for captions (e.g., "en", "es", "ko"). Defaults to "en"
+- `segment` (optional): Segment number to retrieve (1-based). Each segment is ~98k characters. Defaults to 1
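
The `segment` parameter implies a fixed-size, 1-based split of the transcript text. The sketch below shows one way such a split could work. It is not taken from the package source: the `getSegment` helper, the `SegmentResult` shape, and the exact size of 98000 characters (read from "~98k" in the README) are all assumptions made for illustration.

```typescript
// Hypothetical helper, not part of mcp-headless-youtube-transcript.
// Assumed segment size, taken from "~98k characters" in the README.
const SEGMENT_SIZE = 98000;

interface SegmentResult {
  text: string;          // the slice of the transcript for this segment
  segment: number;       // 1-based index of the returned segment
  totalSegments: number; // how many segments the transcript splits into
}

function getSegment(
  transcript: string,
  segment: number = 1,
  size: number = SEGMENT_SIZE
): SegmentResult {
  // An empty transcript still counts as one (empty) segment.
  const totalSegments = Math.max(1, Math.ceil(transcript.length / size));
  if (!Number.isInteger(segment) || segment < 1 || segment > totalSegments) {
    throw new RangeError(`segment must be between 1 and ${totalSegments}`);
  }
  // Convert the 1-based segment number to a 0-based character offset.
  const start = (segment - 1) * size;
  return { text: transcript.slice(start, start + size), segment, totalSegments };
}
```

Under these assumptions, a 200000-character transcript splits into three segments, and requesting a segment past the last one raises an error instead of returning an empty string.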
+
+**Examples:**

-
+Basic usage:
+```json
+{
+  "name": "get_youtube_transcript",
+  "arguments": {
+    "videoId": "dQw4w9WgXcQ"
+  }
+}
+```
+
+With language:
+```json
+{
+  "name": "get_youtube_transcript",
+  "arguments": {
+    "videoId": "dQw4w9WgXcQ",
+    "lang": "es"
+  }
+}
+```
+
+With pagination:
 ```json
 {
   "name": "get_youtube_transcript",
   "arguments": {
     "videoId": "dQw4w9WgXcQ",
-    "
+    "segment": 2
   }
 }
 ```

+## Response Format
+
+The tool returns the raw transcript text. For large transcripts, the response includes pagination information:
+
+```
+[Segment 1 of 3]
+
+this is the actual transcript text content...
+```
+
+When multiple segments are available, you can retrieve subsequent segments by incrementing the `segment` parameter.
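
A client can combine the `[Segment N of M]` header with the `segment` parameter to reassemble a long transcript. The following is a minimal sketch of that loop. The `CallTool` type is a hypothetical stand-in for whatever MCP client actually invokes `get_youtube_transcript`, and the parsing assumes the header appears at the very start of the response exactly as in the README example. A response without a header is treated as a single segment, which is also an assumption.

```typescript
// Hypothetical caller: wraps a real MCP client call and returns the tool's text output.
type CallTool = (args: {
  videoId: string;
  lang?: string;
  segment?: number;
}) => Promise<string>;

// Matches a leading "[Segment N of M]" header plus the whitespace after it.
const SEGMENT_HEADER = /^\[Segment (\d+) of (\d+)\]\s*/;

function parseSegment(response: string): { current: number; total: number; text: string } {
  const match = SEGMENT_HEADER.exec(response);
  // No header found: assume the whole response is the only segment.
  if (!match) return { current: 1, total: 1, text: response };
  return {
    current: Number(match[1]),
    total: Number(match[2]),
    text: response.slice(match[0].length),
  };
}

async function fetchFullTranscript(
  callTool: CallTool,
  videoId: string,
  lang: string = "en"
): Promise<string> {
  // The first response reports how many segments exist in total.
  const first = parseSegment(await callTool({ videoId, lang, segment: 1 }));
  const parts: string[] = [first.text];
  // Request the remaining segments one at a time by incrementing `segment`.
  for (let segment = 2; segment <= first.total; segment++) {
    const next = parseSegment(await callTool({ videoId, lang, segment }));
    parts.push(next.text);
  }
  return parts.join("");
}
```

Fetching segments sequentially keeps the sketch simple. Since the total is known after the first call, the remaining requests could also be issued in parallel if the server tolerates it.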
+
 ## Supported URL Formats

 - Video ID: `dQw4w9WgXcQ`