@tekmidian/scribe 0.1.1 → 0.2.1
- package/README.md +134 -82
- package/dist/index.js +223 -41
- package/package.json +9 -2
package/README.md
CHANGED
|
@@ -1,8 +1,12 @@
|
|
|
1
|
-
|
|
1
|
+
---
|
|
2
|
+
links: "[[Ideaverse/AI/Scribe/Scribe|Scribe]]"
|
|
3
|
+
---
|
|
2
4
|
|
|
3
|
-
Scribe
|
|
5
|
+
# Scribe — Content extraction for Claude
|
|
4
6
|
|
|
5
|
-
Scribe
|
|
7
|
+
Scribe is an MCP server that extracts content from multiple sources — YouTube videos, web articles, PDFs, and Claude.ai conversations — giving Claude the ability to read and work with content from anywhere.
|
|
8
|
+
|
|
9
|
+
**4 providers, one tool:** `extract_content` auto-detects the source type and routes to the right provider. YouTube transcripts come from the Innertube API (no API keys needed), articles use Readability extraction, PDFs are parsed locally, and Claude.ai conversations are downloaded directly from the web UI API.
|
|
6
10
|
|
|
7
11
|
## How It Works
|
|
8
12
|
|
|
@@ -13,16 +17,17 @@ Claude (AI client)
|
|
|
13
17
|
v
|
|
14
18
|
scribe-mcp server
|
|
15
19
|
|
|
|
16
|
-
|--
|
|
17
|
-
|
|
18
|
-
|
|
20
|
+
|-- extract_content auto-routes by URL:
|
|
21
|
+
| youtube.com/* → YouTube provider (Innertube API)
|
|
22
|
+
| claude.ai/* → Claude provider (web UI API)
|
|
23
|
+
| *.pdf → PDF provider (local parsing)
|
|
24
|
+
| any other URL → Article provider (Readability)
|
|
19
25
|
|
|
|
20
26
|
v
|
|
21
|
-
|
|
22
|
-
(text / SRT / JSON with timing)
|
|
27
|
+
Clean text/markdown returned to Claude
|
|
23
28
|
```
|
|
24
29
|
|
|
25
|
-
The server runs as a local process. Claude connects over stdio via the MCP protocol.
|
|
30
|
+
The server runs as a local process. Claude connects over stdio via the MCP protocol.
|
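The routing shown in the diagram above can be sketched roughly as follows. This is a minimal illustration and not the package's actual code: `detectProvider` and the `Provider` type are hypothetical names, and the sketch assumes routing depends only on the hostname and the file extension. Short links and bare YouTube video IDs, which the README says are supported, are not handled here.

```typescript
// Illustrative sketch only: provider names follow the README diagram,
// but detectProvider is a hypothetical helper, not the package's code.
type Provider = "youtube" | "claude" | "pdf" | "article";

function detectProvider(input: string): Provider {
  let host = "";
  let path = input;
  try {
    const parsed = new URL(input);
    host = parsed.hostname.toLowerCase();
    path = parsed.pathname;
  } catch {
    // Not an absolute URL, so treat the input as a local file path.
  }
  // youtube.com and any subdomain such as www.youtube.com
  if (/(^|\.)youtube\.com$/.test(host)) return "youtube";
  if (host === "claude.ai") return "claude";
  // Covers both remote PDFs and local paths ending in .pdf
  if (/\.pdf$/i.test(path)) return "pdf";
  // Everything else falls through to the article provider.
  return "article";
}
```

The host checks run before the `.pdf` check, matching the top-to-bottom order of the diagram, so a PDF link on a YouTube or Claude.ai host would still go to that host's provider in this sketch.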
|
26
31
|
|
|
27
32
|
## Quick Start
|
|
28
33
|
|
|
@@ -45,7 +50,37 @@ Or use the CLI directly:
|
|
|
45
50
|
claude mcp add scribe-mcp -- npx -y @tekmidian/scribe
|
|
46
51
|
```
|
|
47
52
|
|
|
48
|
-
### Manual install
|
|
53
|
+
### Manual install
|
|
54
|
+
|
|
55
|
+
#### Claude Code
|
|
56
|
+
|
|
57
|
+
Add to `~/.claude.json`:
|
|
58
|
+
|
|
59
|
+
```json
|
|
60
|
+
{
|
|
61
|
+
"mcpServers": {
|
|
62
|
+
"scribe": {
|
|
63
|
+
"command": "npx",
|
|
64
|
+
"args": ["-y", "scribe-mcp"]
|
|
65
|
+
}
|
|
66
|
+
}
|
|
67
|
+
}
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
Or with Bun (faster):
|
|
71
|
+
|
|
72
|
+
```json
|
|
73
|
+
{
|
|
74
|
+
"mcpServers": {
|
|
75
|
+
"scribe": {
|
|
76
|
+
"command": "bunx",
|
|
77
|
+
"args": ["scribe-mcp"]
|
|
78
|
+
}
|
|
79
|
+
}
|
|
80
|
+
}
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
#### Claude Desktop
|
|
49
84
|
|
|
50
85
|
Add the following to your `claude_desktop_config.json`:
|
|
51
86
|
|
|
@@ -102,89 +137,137 @@ Then point your MCP config at the built binary:
|
|
|
102
137
|
|
|
103
138
|
| Tool | What it does |
|
|
104
139
|
|------|-------------|
|
|
105
|
-
| `
|
|
140
|
+
| `extract_content` | Extract content from any supported source — auto-detects the provider |
|
|
141
|
+
| `list_providers` | Show all available providers and their capabilities |
|
|
142
|
+
| `youtube_transcribe` | Fetch YouTube transcript in text, SRT, or JSON format |
|
|
106
143
|
| `youtube_list_languages` | List every caption language available for a video |
|
|
107
144
|
|
|
145
|
+
## Providers
|
|
146
|
+
|
|
147
|
+
| Provider | Sources | Output |
|
|
148
|
+
|----------|---------|--------|
|
|
149
|
+
| **youtube** | YouTube videos (all URL formats + bare IDs) | Text, SRT, JSON with timing |
|
|
150
|
+
| **claude** | Claude.ai chats and projects | Markdown with metadata |
|
|
151
|
+
| **pdf** | PDF files (URLs or local paths) | Plain text |
|
|
152
|
+
| **article** | Any web page | Clean text via Readability |
|
|
153
|
+
|
|
108
154
|
## User Guide
|
|
109
155
|
|
|
110
|
-
|
|
156
|
+
Just give Claude a URL. Scribe auto-detects the source type.
|
|
111
157
|
|
|
112
|
-
###
|
|
158
|
+
### YouTube videos
|
|
113
159
|
|
|
114
160
|
```
|
|
115
|
-
|
|
161
|
+
Summarize this YouTube video: https://www.youtube.com/watch?v=dQw4w9WgXcQ
|
|
116
162
|
```
|
|
117
163
|
|
|
118
164
|
```
|
|
119
|
-
Get the transcript of this
|
|
165
|
+
Get me the German transcript of this lecture: [url]
|
|
120
166
|
```
|
|
121
167
|
|
|
122
168
|
```
|
|
123
|
-
|
|
169
|
+
Return the transcript as JSON with timing data: [url]
|
|
124
170
|
```
|
|
125
171
|
|
|
126
|
-
###
|
|
172
|
+
### Claude.ai conversations
|
|
127
173
|
|
|
128
174
|
```
|
|
129
|
-
|
|
175
|
+
Download this conversation: https://claude.ai/chat/550e8400-e29b-41d4-a716-446655440000
|
|
130
176
|
```
|
|
131
177
|
|
|
132
178
|
```
|
|
133
|
-
|
|
179
|
+
Get all conversations from this project: https://claude.ai/project/550e8400-e29b-41d4-a716-446655440000
|
|
134
180
|
```
|
|
135
181
|
|
|
182
|
+
### Web articles
|
|
183
|
+
|
|
136
184
|
```
|
|
137
|
-
|
|
185
|
+
Extract the content from this article: https://example.com/interesting-post
|
|
138
186
|
```
|
|
139
187
|
|
|
140
|
-
###
|
|
188
|
+
### PDFs
|
|
141
189
|
|
|
142
190
|
```
|
|
143
|
-
|
|
191
|
+
Read this PDF: https://example.com/paper.pdf
|
|
144
192
|
```
|
|
145
193
|
|
|
146
194
|
```
|
|
147
|
-
|
|
195
|
+
Extract text from /Users/me/Documents/report.pdf
|
|
148
196
|
```
|
|
149
197
|
|
|
150
|
-
###
|
|
198
|
+
### Analyze anything
|
|
151
199
|
|
|
152
200
|
```
|
|
153
|
-
|
|
201
|
+
Summarize this: [any supported URL]
|
|
154
202
|
```
|
|
155
203
|
|
|
156
204
|
```
|
|
157
|
-
|
|
205
|
+
Extract the key points from this: [any supported URL]
|
|
158
206
|
```
|
|
159
207
|
|
|
160
|
-
|
|
161
|
-
Get the transcript with timestamps included: [url]
|
|
162
|
-
```
|
|
208
|
+
## Claude.ai Provider Setup
|
|
163
209
|
|
|
164
|
-
|
|
210
|
+
The Claude provider downloads conversations from the Claude.ai web UI. It requires a session cookie for authentication. Three options:
|
|
165
211
|
|
|
166
|
-
|
|
167
|
-
Summarize this YouTube video: [url]
|
|
168
|
-
```
|
|
212
|
+
**Option A — Playwright (automated, recommended if you have Playwright MCP):**
|
|
169
213
|
|
|
170
|
-
|
|
171
|
-
Extract the key points from this lecture: [url]
|
|
172
|
-
```
|
|
214
|
+
Ask Claude Code to navigate to claude.ai and extract cookies:
|
|
173
215
|
|
|
174
216
|
```
|
|
175
|
-
|
|
217
|
+
Navigate to claude.ai and extract all cookies, save them as JSON to ~/claude-cookies.json
|
|
176
218
|
```
|
|
177
219
|
|
|
178
|
-
|
|
179
|
-
Find every mention of "machine learning" in this video and the timestamp it appears: [url]
|
|
180
|
-
```
|
|
220
|
+
Then add to your MCP config:
|
|
181
221
|
|
|
222
|
+
```json
|
|
223
|
+
{
|
|
224
|
+
"mcpServers": {
|
|
225
|
+
"scribe": {
|
|
226
|
+
"command": "npx",
|
|
227
|
+
"args": ["-y", "@tekmidian/scribe"],
|
|
228
|
+
"env": {
|
|
229
|
+
"CLAUDE_COOKIES_FILE": "/Users/you/claude-cookies.json"
|
|
230
|
+
}
|
|
231
|
+
}
|
|
232
|
+
}
|
|
233
|
+
}
|
|
182
234
|
```
|
|
183
|
-
|
|
235
|
+
|
|
236
|
+
**Option B — Browser extension:**
|
|
237
|
+
|
|
238
|
+
Install a cookie export extension (e.g. "Cookie-Editor"), export claude.ai cookies as JSON, and set `CLAUDE_COOKIES_FILE` as above.
|
|
239
|
+
|
|
240
|
+
**Option C — Manual:**
|
|
241
|
+
|
|
242
|
+
Open claude.ai → F12 → Application → Cookies → copy the `sessionKey` value:
|
|
243
|
+
|
|
244
|
+
```json
|
|
245
|
+
{
|
|
246
|
+
"env": {
|
|
247
|
+
"CLAUDE_SESSION_KEY": "sk-ant-sid01-..."
|
|
248
|
+
}
|
|
249
|
+
}
|
|
184
250
|
```
|
|
185
251
|
|
|
252
|
+
Without either env var, the Claude provider is silently disabled — other providers still work normally.
|
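One way the two environment variables described above could be resolved is sketched below. It rests on several assumptions that the README does not confirm: that `CLAUDE_SESSION_KEY` takes precedence over `CLAUDE_COOKIES_FILE`, that the cookie export is a JSON array of objects with `name` and `value` fields, and that the relevant cookie is the `sessionKey` named in Option C. `resolveSessionKey` and `ExportedCookie` are hypothetical names, and the file reader is passed in so the sketch stays self-contained.

```typescript
// Illustrative sketch only; not the package's actual implementation.
type Env = { [name: string]: string | undefined };

// Assumed shape of one entry in a browser cookie export.
interface ExportedCookie {
  name: string;
  value: string;
}

// Returns the session key, or null when neither variable yields one
// (the case in which the Claude provider would stay disabled).
function resolveSessionKey(
  env: Env,
  readText: (path: string) => string
): string | null {
  // Assumption: a directly supplied key wins over the cookie file.
  const direct = env["CLAUDE_SESSION_KEY"];
  if (direct) return direct;

  const file = env["CLAUDE_COOKIES_FILE"];
  if (!file) return null;

  const cookies = JSON.parse(readText(file)) as ExportedCookie[];
  for (const cookie of cookies) {
    if (cookie.name === "sessionKey") return cookie.value;
  }
  return null;
}
```

In a Node.js process this could be called as `resolveSessionKey(process.env, (p) => fs.readFileSync(p, "utf8"))`.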
|
253
|
+
|
|
186
254
|
## MCP Tool Reference
|
|
187
255
|
|
|
256
|
+
### extract_content
|
|
257
|
+
|
|
258
|
+
Extracts content from any supported source. Auto-detects the provider based on the URL.
|
|
259
|
+
|
|
260
|
+
| Parameter | Type | Required | Default | Description |
|
|
261
|
+
|-----------|------|----------|---------|-------------|
|
|
262
|
+
| `url` | string | yes | — | URL or file path to extract content from |
|
|
263
|
+
| `format` | string | no | `text` | Output format (available formats depend on provider) |
|
|
264
|
+
| `language` | string | no | — | Preferred language code (YouTube only) |
|
|
265
|
+
| `timestamps` | boolean | no | `false` | Include timestamps (YouTube text format only) |
|
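The parameter table above can be read as the following argument shape with defaults applied. This is a sketch under stated assumptions, not the server's real schema: `ExtractContentArgs` and `withDefaults` are hypothetical names, and only the defaults listed in the table (`format` as `text`, `timestamps` as `false`) are filled in.

```typescript
// Illustrative sketch only; field names come from the parameter table,
// everything else is a hypothetical helper.
interface ExtractContentArgs {
  url: string;          // URL or file path (required)
  format?: string;      // available formats depend on the provider
  language?: string;    // preferred language code (YouTube only)
  timestamps?: boolean; // YouTube text format only
}

function withDefaults(args: ExtractContentArgs) {
  if (typeof args.url !== "string" || args.url.trim() === "") {
    throw new Error("url is required");
  }
  return {
    url: args.url,
    format: args.format ?? "text",
    language: args.language, // no default in the table, so left undefined
    timestamps: args.timestamps ?? false,
  };
}
```

For example, `withDefaults({ url: "https://example.com/paper.pdf" })` yields `format: "text"` and `timestamps: false`, while an explicit `format: "srt"` is passed through unchanged.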
|
266
|
+
|
|
267
|
+
### list_providers
|
|
268
|
+
|
|
269
|
+
Lists all available providers and their capabilities. No parameters.
|
|
270
|
+
|
|
188
271
|
### youtube_transcribe
|
|
189
272
|
|
|
190
273
|
Fetches captions for a YouTube video and returns them in the requested format.
|
|
@@ -235,46 +318,13 @@ Available languages for dQw4w9WgXcQ:
|
|
|
235
318
|
|
|
236
319
|
## Configuration
|
|
237
320
|
|
|
238
|
-
|
|
321
|
+
All behavior is controlled by parameters passed per-request. Optional environment variables enable the Claude.ai provider:
|
|
239
322
|
|
|
240
|
-
|
|
241
|
-
|
|
242
|
-
|
|
243
|
-
|
|
244
|
-
|
|
245
|
-
"scribe": {
|
|
246
|
-
"command": "npx",
|
|
247
|
-
"args": ["-y", "scribe-mcp"]
|
|
248
|
-
}
|
|
249
|
-
}
|
|
250
|
-
}
|
|
251
|
-
```
|
|
252
|
-
|
|
253
|
-
**bunx (Bun):**
|
|
254
|
-
|
|
255
|
-
```json
|
|
256
|
-
{
|
|
257
|
-
"mcpServers": {
|
|
258
|
-
"scribe": {
|
|
259
|
-
"command": "bunx",
|
|
260
|
-
"args": ["scribe-mcp"]
|
|
261
|
-
}
|
|
262
|
-
}
|
|
263
|
-
}
|
|
264
|
-
```
|
|
265
|
-
|
|
266
|
-
**Local build:**
|
|
267
|
-
|
|
268
|
-
```json
|
|
269
|
-
{
|
|
270
|
-
"mcpServers": {
|
|
271
|
-
"scribe": {
|
|
272
|
-
"command": "node",
|
|
273
|
-
"args": ["/path/to/Scribe/dist/index.js"]
|
|
274
|
-
}
|
|
275
|
-
}
|
|
276
|
-
}
|
|
277
|
-
```
|
|
323
|
+
| Variable | Required | Description |
|
|
324
|
+
|----------|----------|-------------|
|
|
325
|
+
| `CLAUDE_COOKIES_FILE` | no | Path to browser cookie export JSON (claude.ai provider) |
|
|
326
|
+
| `CLAUDE_SESSION_KEY` | no | Direct session key value (claude.ai provider) |
|
|
327
|
+
| `CLAUDE_ORG_ID` | no | Organization ID (auto-discovered if not set) |
|
|
278
328
|
|
|
279
329
|
## Troubleshooting
|
|
280
330
|
|
|
@@ -304,14 +354,13 @@ If you transcribe many videos in rapid succession, YouTube may temporarily throt
|
|
|
304
354
|
- An MCP-compatible client (Claude Desktop, Claude Code, or any MCP-aware host)
|
|
305
355
|
- Internet access to reach `youtube.com` and `www.youtube.com/youtubei/v1/`
|
|
306
356
|
|
|
307
|
-
No API keys
|
|
357
|
+
No API keys needed for YouTube, articles, or PDFs. Claude.ai provider requires a session cookie (see setup above).
|
|
308
358
|
|
|
309
359
|
## Coming soon
|
|
310
360
|
|
|
311
361
|
- Vimeo transcript extraction
|
|
312
362
|
- Direct audio/video file transcription
|
|
313
363
|
- Podcast RSS feed support
|
|
314
|
-
- SoundCloud track transcription
|
|
315
364
|
|
|
316
365
|
## License
|
|
317
366
|
|
|
@@ -320,3 +369,6 @@ MIT
|
|
|
320
369
|
## Author
|
|
321
370
|
|
|
322
371
|
Matthias Nott — [github.com/mnott](https://github.com/mnott)
|
|
372
|
+
|
|
373
|
+
---
|
|
374
|
+
*Links:* [[Ideaverse/AI/Scribe/Scribe|Scribe]]
|