@tekmidian/scribe 0.1.0
- package/LICENSE +21 -0
- package/README.md +314 -0
- package/dist/index.js +62 -0
- package/package.json +37 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Matthias Nott

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

package/README.md
ADDED
@@ -0,0 +1,314 @@
# Scribe: YouTube transcript extraction for Claude

Scribe is an MCP server that extracts transcripts and captions from YouTube videos, giving Claude the ability to read, summarize, and analyze video content without watching it.

Scribe speaks directly to YouTube's Innertube API using an Android client context: no API keys, no third-party services, no credentials to manage. It bypasses EU/GDPR consent gates automatically, handles both manual and auto-generated captions, and outputs transcripts in plain text, SRT subtitle format, or structured JSON with millisecond-accurate timing data.

## How It Works

```
Claude (AI client)
    |
    | MCP (stdio)
    v
scribe-mcp server
    |
    |-- 1. Fetch youtube.com watch page (extract embedded JSON + cookies)
    |-- 2. POST /youtubei/v1/get_transcript (Android client context)
    |-- 3. Parse transcript segment list from API response
    |
    v
Transcript returned to Claude
(text / SRT / JSON with timing)
```

The server runs as a local process. Claude connects over stdio via the MCP protocol. No data leaves your machine except the requests to YouTube's own endpoints.

## Quick Start

### Prerequisites

- [Claude Desktop](https://claude.ai/download) or [Claude Code](https://claude.ai/code)
- [Node.js](https://nodejs.org) 18+ **or** [Bun](https://bun.sh) 1.0+

### Install via Claude Code

```bash
claude mcp add scribe-mcp -- npx -y scribe-mcp
```

### Manual install (Claude Desktop)

Add the following to your `claude_desktop_config.json`:

**macOS:** `~/Library/Application Support/Claude/claude_desktop_config.json`
**Windows:** `%APPDATA%\Claude\claude_desktop_config.json`

```json
{
  "mcpServers": {
    "scribe": {
      "command": "npx",
      "args": ["-y", "scribe-mcp"]
    }
  }
}
```

Or with Bun:

```json
{
  "mcpServers": {
    "scribe": {
      "command": "bunx",
      "args": ["scribe-mcp"]
    }
  }
}
```

### Build from source

```bash
git clone https://github.com/mnott/Scribe
cd Scribe
bun install
bun run build
```

Then point your MCP config at the built entry point (`dist/index.js`), using an absolute path:

```json
{
  "mcpServers": {
    "scribe": {
      "command": "node",
      "args": ["/absolute/path/to/Scribe/dist/index.js"]
    }
  }
}
```

## Tools at a Glance

| Tool | What it does |
|------|-------------|
| `youtube_transcribe` | Fetch the transcript for a YouTube video in text, SRT, or JSON format |
| `youtube_list_languages` | List every caption language available for a video |

## User Guide

Scribe gives Claude the ability to read YouTube videos as text. Just describe what you want in plain language.

### Get a transcript

```
Transcribe this YouTube video: https://www.youtube.com/watch?v=dQw4w9WgXcQ
```

```
Get the transcript of this talk: https://youtu.be/abc123xyz11
```

```
Can you read this video for me? https://youtube.com/shorts/def456uvw22
```

### Change the language

```
Get me the German transcript of this lecture: [url]
```

```
I need the Spanish subtitles for this video: [url]
```

```
Transcribe this video in French
```

### Discover available languages

```
What languages are available for this video? [url]
```

```
Does this talk have Japanese captions?
```

### Choose an output format

```
Give me the SRT subtitles for this video: [url]
```

```
Return the transcript as JSON with timing data: [url]
```

```
Get the transcript with timestamps included: [url]
```

### Ask Claude to analyze the content

```
Summarize this YouTube video: [url]
```

```
Extract the key points from this lecture: [url]
```

```
What are the main arguments made in this talk? [url]
```

```
Find every mention of "machine learning" in this video and the timestamp it appears: [url]
```

```
Translate the transcript of this video into English: [url]
```

## MCP Tool Reference

### youtube_transcribe

Fetches captions for a YouTube video and returns them in the requested format.

Supports `youtube.com/watch`, `youtu.be`, `youtube.com/shorts`, `youtube.com/embed`, and bare 11-character video IDs.
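
The accepted input forms listed above can be illustrated with a small normalizer. This is a minimal sketch, not Scribe's own code: the function name `extractVideoId`, the ID character set, and the handling of `www.`/`m.` prefixes are assumptions made for illustration.

```typescript
// Hypothetical helper; not taken from Scribe's source.
// Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

function extractVideoId(input: string): string | null {
  const trimmed = input.trim();

  // Case 1: a bare 11-character video ID.
  if (VIDEO_ID.test(trimmed)) return trimmed;

  // Case 2: a URL, with or without a scheme (e.g. "youtu.be/<id>").
  let url: URL;
  try {
    url = new URL(/^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`);
  } catch {
    return null; // not parseable as a URL at all
  }

  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  const parts = url.pathname.split("/").filter(Boolean);
  let candidate: string | null = null;

  if (host === "youtu.be") {
    candidate = parts[0] ?? null; // youtu.be/<id>
  } else if (host === "youtube.com") {
    if (parts[0] === "watch") {
      candidate = url.searchParams.get("v"); // youtube.com/watch?v=<id>
    } else if (parts[0] === "shorts" || parts[0] === "embed") {
      candidate = parts[1] ?? null; // youtube.com/shorts/<id>, /embed/<id>
    }
  }

  // Reject anything that does not look like a video ID.
  return candidate !== null && VIDEO_ID.test(candidate) ? candidate : null;
}
```

Returning `null` for unrecognized input, rather than throwing, lets a caller decide how to report the problem.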

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | string | yes | none | YouTube video URL or bare video ID |
| `language` | string | no | `en` | BCP-47 language code (`en`, `de`, `fr`, `ja`, …) |
| `format` | string | no | `text` | Output format: `text`, `srt`, or `json` |
| `timestamps` | boolean | no | `false` | Prepend `[MM:SS]` timestamps to each line (text format only) |
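
For hosts that talk to the server directly, the parameters above travel as the `arguments` of an MCP `tools/call` request. The sketch below shows one way to type them and apply the documented defaults. The names `TranscribeArgs`, `withDefaults`, and `buildToolCall` are hypothetical, and filling in defaults on the client side is an illustrative choice; the server presumably applies the same defaults when a field is omitted.

```typescript
// Argument shape taken from the parameter table; only `url` is required.
interface TranscribeArgs {
  url: string;
  language?: string;
  format?: "text" | "srt" | "json";
  timestamps?: boolean;
}

// Hypothetical helper: make the effective request explicit by filling in
// the defaults the table documents (`en`, `text`, `false`).
function withDefaults(args: TranscribeArgs): Required<TranscribeArgs> {
  return {
    url: args.url,
    language: args.language ?? "en",
    format: args.format ?? "text",
    timestamps: args.timestamps ?? false,
  };
}

// JSON-RPC 2.0 envelope for an MCP `tools/call` request.
function buildToolCall(id: number, args: TranscribeArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "youtube_transcribe", arguments: withDefaults(args) },
  };
}

// Example: request SRT output, leaving language and timestamps at defaults.
const exampleRequest = buildToolCall(1, { url: "dQw4w9WgXcQ", format: "srt" });
console.log(JSON.stringify(exampleRequest, null, 2));
```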

**Output formats**

- `text`: Clean continuous prose. Add `timestamps: true` for `[MM:SS]` prefixes on each segment.
- `srt`: Standard SubRip format, ready to use as a subtitle file.
- `json`: Array of objects with `text`, `startMs`, and `durationMs` fields for precise timing.
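
The three formats carry the same segments, so the `json` output can be converted into the other two on the client side. The following is a sketch under stated assumptions: the `Segment` fields come from the list above, while the function names and the exact rendering (one cue per segment, minutes not wrapped at 60 in `[MM:SS]`) are illustrative and may differ from what Scribe emits.

```typescript
// Segment shape as described for the `json` format.
interface Segment {
  text: string;
  startMs: number;
  durationMs: number;
}

// Left-pad a number with zeros to the given width.
function pad(n: number, width = 2): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// SubRip timestamps use HH:MM:SS,mmm (comma before the milliseconds).
function srtTime(ms: number): string {
  const h = Math.floor(ms / 3_600_000);
  const m = Math.floor((ms % 3_600_000) / 60_000);
  const s = Math.floor((ms % 60_000) / 1000);
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms % 1000, 3)}`;
}

// One numbered cue per segment, cues separated by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => {
      const end = seg.startMs + seg.durationMs;
      return `${i + 1}\n${srtTime(seg.startMs)} --> ${srtTime(end)}\n${seg.text}\n`;
    })
    .join("\n");
}

// Assumed rendering of `timestamps: true`: a [MM:SS] prefix per segment.
function toTimestampedText(segments: Segment[]): string {
  return segments
    .map((seg) => {
      const totalSec = Math.floor(seg.startMs / 1000);
      return `[${pad(Math.floor(totalSec / 60))}:${pad(totalSec % 60)}] ${seg.text}`;
    })
    .join("\n");
}
```

Working from `json` keeps the millisecond timing intact, which the `text` format discards.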

**Response includes a metadata header:**

```
Video ID: dQw4w9WgXcQ
Language: en (auto-generated)
Format: text

[transcript body follows]
```
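
A caller that wants the header fields separately from the transcript can split the response at the first blank line. This is a sketch that assumes the layout shown in the example (`Key: value` lines, one blank line, then the body); `splitResponse` is a hypothetical name and the real header may contain other fields.

```typescript
interface TranscriptResponse {
  meta: Record<string, string>; // e.g. { "Video ID": "...", "Language": "..." }
  body: string;                 // everything after the first blank line
}

// Hypothetical helper: split "Key: value" header lines from the body.
function splitResponse(raw: string): TranscriptResponse {
  const normalized = raw.replace(/\r\n/g, "\n");
  const gap = normalized.indexOf("\n\n");
  const head = gap === -1 ? normalized : normalized.slice(0, gap);
  const body = gap === -1 ? "" : normalized.slice(gap + 2);

  const meta: Record<string, string> = {};
  for (const line of head.split("\n")) {
    const colon = line.indexOf(":");
    // Split on the first colon only, so values may themselves contain colons.
    if (colon > 0) meta[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { meta, body };
}
```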

### youtube_list_languages

Lists every caption track available for a video, distinguishing manual captions from auto-generated ones.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | yes | YouTube video URL or bare video ID |

**Example output:**

```
Available languages for dQw4w9WgXcQ:

- en: English (manual)
- de: German (auto-generated)
- fr: French (auto-generated)
- ja: Japanese (auto-generated)
```
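
If a host needs this listing in structured form, for example to check that a language exists before calling `youtube_transcribe`, the lines can be parsed with a regular expression. The sketch assumes the exact line layout of the example output above; `parseLanguageList` and `pickTrack` are hypothetical names, and preferring manual tracks over auto-generated ones is an illustrative policy rather than documented behavior.

```typescript
interface CaptionTrack {
  code: string;
  name: string;
  kind: "manual" | "auto-generated";
}

// Matches lines such as "- de: German (auto-generated)".
const TRACK_LINE = /^-\s*([A-Za-z0-9-]+):\s*(.+?)\s*\((manual|auto-generated)\)\s*$/;

function parseLanguageList(output: string): CaptionTrack[] {
  const tracks: CaptionTrack[] = [];
  for (const line of output.split(/\r?\n/)) {
    const m = TRACK_LINE.exec(line.trim());
    // Non-matching lines (the heading, blank lines) are skipped.
    if (m) tracks.push({ code: m[1], name: m[2], kind: m[3] as CaptionTrack["kind"] });
  }
  return tracks;
}

// Prefer a manual track in the wanted language, then an auto-generated one.
function pickTrack(tracks: CaptionTrack[], wanted: string): CaptionTrack | undefined {
  const matches = tracks.filter((t) => t.code === wanted);
  return matches.filter((t) => t.kind === "manual")[0] ?? matches[0];
}
```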

## Configuration

Scribe has no configuration file. All behavior is controlled by parameters passed per-request. The server runs on stdio and exits when the client disconnects.

**npx (Node.js):**

```json
{
  "mcpServers": {
    "scribe": {
      "command": "npx",
      "args": ["-y", "scribe-mcp"]
    }
  }
}
```

**bunx (Bun):**

```json
{
  "mcpServers": {
    "scribe": {
      "command": "bunx",
      "args": ["scribe-mcp"]
    }
  }
}
```

**Local build:**

```json
{
  "mcpServers": {
    "scribe": {
      "command": "node",
      "args": ["/absolute/path/to/Scribe/dist/index.js"]
    }
  }
}
```

## Troubleshooting

**"This video does not have captions available"**

The video has no manual captions, and YouTube has not generated automatic captions for it. This is common for very new uploads (auto-captions can take hours), music videos, videos with no speech, or videos where the creator has explicitly disabled the feature. Nothing can be done in this case, because there are no captions to extract.

**"Language not available — supported: en, de, fr"**

The language code you requested does not have a caption track. Use `youtube_list_languages` first to see what is available, then retry with a supported code.

**Geo-restricted content**

If a video is only available in certain countries, Scribe may receive a `VIDEO_NOT_AVAILABLE` or similar playability error from YouTube. The server reports this clearly. There is no workaround: the restriction is enforced by YouTube's servers, not by Scribe.

**Age-restricted content**

Age-restricted videos require a logged-in session to view. Scribe intentionally does not implement cookie-based authenticated sessions, so these videos fail with a playability status error and their transcripts cannot be extracted.

**Rate limiting / HTTP 429**

If you transcribe many videos in rapid succession, YouTube may temporarily throttle requests. Wait a few minutes before retrying. Scribe does not implement retry logic or backoff; this is left to the caller.
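
Since retrying is left to the caller, a host that batches many transcriptions may want to wrap its tool calls in exponential backoff. The sketch below is one possible caller-side approach, not part of Scribe: `call` stands for whatever function invokes the tool in your host, `isThrottled` is a predicate you supply to recognize a throttling error, and the default delays are placeholders to tune.

```typescript
// Delay schedule: baseMs, 2*baseMs, 4*baseMs, ... each capped at capMs.
function backoffDelays(retries: number, baseMs = 1000, capMs = 60_000): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(capMs, baseMs * 2 ** i));
  }
  return delays;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: retry `call` while `isThrottled` says the failure
// looks like rate limiting; any other error is rethrown immediately.
async function withBackoff<T>(
  call: () => Promise<T>,
  isThrottled: (err: unknown) => boolean,
  delays: number[] = backoffDelays(4),
): Promise<T> {
  for (const delay of delays) {
    try {
      return await call();
    } catch (err) {
      if (!isThrottled(err)) throw err;
      await sleep(delay);
    }
  }
  // Final attempt after the last delay; errors propagate to the caller.
  return call();
}
```

With `backoffDelays(4, 1000, 5000)` the waits are 1 s, 2 s, 4 s, and 5 s. Given the advice above to wait a few minutes, a real batch job would likely use a much larger base and cap.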

## Requirements

- Node.js 18 or later (for `npx` usage), **or** Bun 1.0 or later (for `bunx` usage)
- An MCP-compatible client (Claude Desktop, Claude Code, or any MCP-aware host)
- Internet access to reach `youtube.com` and `www.youtube.com/youtubei/v1/`

No API keys. No accounts. No external dependencies beyond the MCP SDK and Zod.

## Coming soon

- Vimeo transcript extraction
- Direct audio/video file transcription
- Podcast RSS feed support
- SoundCloud track transcription

## License

MIT

## Author

Matthias Nott ([github.com/mnott](https://github.com/mnott))