@tekmidian/scribe 0.1.0 → 0.2.1
- package/README.md +144 -84
- package/dist/index.js +223 -41
- package/package.json +23 -3
package/README.md
@@ -1,8 +1,12 @@
-
+---
+links: "[[Ideaverse/AI/Scribe/Scribe|Scribe]]"
+---

-Scribe
+# Scribe — Content extraction for Claude

-Scribe
+Scribe is an MCP server that extracts content from multiple sources — YouTube videos, web articles, PDFs, and Claude.ai conversations — giving Claude the ability to read and work with content from anywhere.
+
+**4 providers, one tool:** `extract_content` auto-detects the source type and routes to the right provider. YouTube transcripts come from the Innertube API (no API keys needed), articles use Readability extraction, PDFs are parsed locally, and Claude.ai conversations are downloaded directly from the web UI API.

 ## How It Works

@@ -13,16 +17,17 @@ Claude (AI client)
 v
 scribe-mcp server
 |
-|--
-
-
+|-- extract_content auto-routes by URL:
+| youtube.com/* → YouTube provider (Innertube API)
+| claude.ai/* → Claude provider (web UI API)
+| *.pdf → PDF provider (local parsing)
+| any other URL → Article provider (Readability)
 |
 v
-
-(text / SRT / JSON with timing)
+Clean text/markdown returned to Claude
 ```

-The server runs as a local process. Claude connects over stdio via the MCP protocol.
+The server runs as a local process. Claude connects over stdio via the MCP protocol.

 ## Quick Start

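The routing rows added to the diagram in this hunk can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the package's actual code: the `Provider` type and the `detectProvider` function are hypothetical names, and the matching rules are inferred only from the four diagram rows. Bare YouTube video IDs and short links, which the README says the YouTube provider accepts, are deliberately not handled here.

```typescript
// Hypothetical provider names mirroring the four rows of the routing diagram.
type Provider = "youtube" | "claude" | "pdf" | "article";

// Pick a provider for a URL or local file path.
// The rules below are guesses based on the diagram, not the real implementation.
function detectProvider(source: string): Provider {
  let host = "";
  let path = source;
  try {
    const parsed = new URL(source);
    host = parsed.hostname.toLowerCase();
    path = parsed.pathname;
  } catch {
    // Not an absolute URL: treat the input as a local file path.
  }
  // youtube.com/* (including subdomains such as www. or m.)
  if (/(^|\.)youtube\.com$/.test(host)) return "youtube";
  // claude.ai/*
  if (host === "claude.ai") return "claude";
  // *.pdf, for both URLs and local paths
  if (/\.pdf$/i.test(path)) return "pdf";
  // any other URL
  return "article";
}
```

The host checks run before the `.pdf` suffix check, so a PDF-looking path on one of the two known hosts would still go to that host's provider. Whether the real server orders its checks this way is not visible in the diff.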
@@ -31,13 +36,51 @@ The server runs as a local process. Claude connects over stdio via the MCP proto
 - [Claude Desktop](https://claude.ai/download) or [Claude Code](https://claude.ai/code)
 - [Node.js](https://nodejs.org) 18+ **or** [Bun](https://bun.sh) 1.0+

-### Install
+### Install with Claude Code
+
+Tell Claude:
+
+> *"Install the scribe MCP server from github.com/mnott/Scribe"*
+
+Claude will clone the repo, build it, and add it to your MCP config.
+
+Or use the CLI directly:

 ```bash
-claude mcp add scribe-mcp -- npx -y scribe
+claude mcp add scribe-mcp -- npx -y @tekmidian/scribe
 ```

-### Manual install
+### Manual install
+
+#### Claude Code
+
+Add to `~/.claude.json`:
+
+```json
+{
+  "mcpServers": {
+    "scribe": {
+      "command": "npx",
+      "args": ["-y", "scribe-mcp"]
+    }
+  }
+}
+```
+
+Or with Bun (faster):
+
+```json
+{
+  "mcpServers": {
+    "scribe": {
+      "command": "bunx",
+      "args": ["scribe-mcp"]
+    }
+  }
+}
+```
+
+#### Claude Desktop

 Add the following to your `claude_desktop_config.json`:

@@ -94,89 +137,137 @@ Then point your MCP config at the built binary:

 | Tool | What it does |
 |------|-------------|
-| `
+| `extract_content` | Extract content from any supported source — auto-detects the provider |
+| `list_providers` | Show all available providers and their capabilities |
+| `youtube_transcribe` | Fetch YouTube transcript in text, SRT, or JSON format |
 | `youtube_list_languages` | List every caption language available for a video |

+## Providers
+
+| Provider | Sources | Output |
+|----------|---------|--------|
+| **youtube** | YouTube videos (all URL formats + bare IDs) | Text, SRT, JSON with timing |
+| **claude** | Claude.ai chats and projects | Markdown with metadata |
+| **pdf** | PDF files (URLs or local paths) | Plain text |
+| **article** | Any web page | Clean text via Readability |
+
 ## User Guide

-
+Just give Claude a URL. Scribe auto-detects the source type.

-###
+### YouTube videos

 ```
-
+Summarize this YouTube video: https://www.youtube.com/watch?v=dQw4w9WgXcQ
 ```

 ```
-Get the transcript of this
+Get me the German transcript of this lecture: [url]
 ```

 ```
-
+Return the transcript as JSON with timing data: [url]
 ```

-###
+### Claude.ai conversations

 ```
-
+Download this conversation: https://claude.ai/chat/550e8400-e29b-41d4-a716-446655440000
 ```

 ```
-
+Get all conversations from this project: https://claude.ai/project/550e8400-e29b-41d4-a716-446655440000
 ```

+### Web articles
+
 ```
-
+Extract the content from this article: https://example.com/interesting-post
 ```

-###
+### PDFs

 ```
-
+Read this PDF: https://example.com/paper.pdf
 ```

 ```
-
+Extract text from /Users/me/Documents/report.pdf
 ```

-###
+### Analyze anything

 ```
-
+Summarize this: [any supported URL]
 ```

 ```
-
+Extract the key points from this: [any supported URL]
 ```

-
-Get the transcript with timestamps included: [url]
-```
+## Claude.ai Provider Setup

-
+The Claude provider downloads conversations from the Claude.ai web UI. It requires a session cookie for authentication. Three options:

-
-Summarize this YouTube video: [url]
-```
+**Option A — Playwright (automated, recommended if you have Playwright MCP):**

-
-Extract the key points from this lecture: [url]
-```
+Ask Claude Code to navigate to claude.ai and extract cookies:

 ```
-
+Navigate to claude.ai and extract all cookies, save them as JSON to ~/claude-cookies.json
 ```

-
-Find every mention of "machine learning" in this video and the timestamp it appears: [url]
-```
+Then add to your MCP config:

+```json
+{
+  "mcpServers": {
+    "scribe": {
+      "command": "npx",
+      "args": ["-y", "@tekmidian/scribe"],
+      "env": {
+        "CLAUDE_COOKIES_FILE": "/Users/you/claude-cookies.json"
+      }
+    }
+  }
+}
 ```
-
+
+**Option B — Browser extension:**
+
+Install a cookie export extension (e.g. "Cookie-Editor"), export claude.ai cookies as JSON, and set `CLAUDE_COOKIES_FILE` as above.
+
+**Option C — Manual:**
+
+Open claude.ai → F12 → Application → Cookies → copy the `sessionKey` value:
+
+```json
+{
+  "env": {
+    "CLAUDE_SESSION_KEY": "sk-ant-sid01-..."
+  }
+}
 ```

+Without either env var, the Claude provider is silently disabled — other providers still work normally.
+
 ## MCP Tool Reference

+### extract_content
+
+Extracts content from any supported source. Auto-detects the provider based on the URL.
+
+| Parameter | Type | Required | Default | Description |
+|-----------|------|----------|---------|-------------|
+| `url` | string | yes | — | URL or file path to extract content from |
+| `format` | string | no | `text` | Output format (available formats depend on provider) |
+| `language` | string | no | — | Preferred language code (YouTube only) |
+| `timestamps` | boolean | no | `false` | Include timestamps (YouTube text format only) |
+
+### list_providers
+
+Lists all available providers and their capabilities. No parameters.
+
 ### youtube_transcribe

 Fetches captions for a YouTube video and returns them in the requested format.
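The `extract_content` parameter table added in this hunk maps naturally onto a typed argument object. The sketch below is an illustration only: `ExtractContentArgs`, `ResolvedArgs`, and `applyDefaults` are hypothetical names, and just the parameter names, types, and defaults come from the table. How the server actually validates its input is not shown in this part of the diff, so the empty-`url` check is an assumption.

```typescript
// Arguments for extract_content, as listed in the README table.
interface ExtractContentArgs {
  url: string;          // required: URL or file path
  format?: string;      // optional, documented default "text"
  language?: string;    // optional, YouTube only, no default
  timestamps?: boolean; // optional, documented default false
}

// The same arguments after the documented defaults have been applied.
interface ResolvedArgs {
  url: string;
  format: string;
  language: string | undefined;
  timestamps: boolean;
}

// Fill in the documented defaults. Rejecting a blank url is an assumption.
function applyDefaults(args: ExtractContentArgs): ResolvedArgs {
  if (args.url.trim() === "") {
    throw new Error("url is required");
  }
  return {
    url: args.url,
    format: args.format === undefined ? "text" : args.format,
    language: args.language,
    timestamps: args.timestamps === undefined ? false : args.timestamps,
  };
}
```

A call such as `applyDefaults({ url: "https://example.com/paper.pdf" })` leaves `format` at `"text"` and `timestamps` at `false`, matching the Default column.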
@@ -227,46 +318,13 @@ Available languages for dQw4w9WgXcQ:

 ## Configuration

-
+All behavior is controlled by parameters passed per-request. Optional environment variables enable the Claude.ai provider:

-
-
-
-
-
-    "scribe": {
-      "command": "npx",
-      "args": ["-y", "scribe-mcp"]
-    }
-  }
-}
-```
-
-**bunx (Bun):**
-
-```json
-{
-  "mcpServers": {
-    "scribe": {
-      "command": "bunx",
-      "args": ["scribe-mcp"]
-    }
-  }
-}
-```
-
-**Local build:**
-
-```json
-{
-  "mcpServers": {
-    "scribe": {
-      "command": "node",
-      "args": ["/path/to/Scribe/dist/index.js"]
-    }
-  }
-}
-```
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `CLAUDE_COOKIES_FILE` | no | Path to browser cookie export JSON (claude.ai provider) |
+| `CLAUDE_SESSION_KEY` | no | Direct session key value (claude.ai provider) |
+| `CLAUDE_ORG_ID` | no | Organization ID (auto-discovered if not set) |

 ## Troubleshooting

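One way the two credential variables in the new Configuration table could be resolved is sketched below. Everything beyond the variable names and the `sessionKey` cookie name is an assumption: the cookie export is taken to be a JSON array of objects with `name` and `value` fields, `CLAUDE_SESSION_KEY` is given precedence over `CLAUDE_COOKIES_FILE`, and the helper names are hypothetical. File reading is injected as a function so the sketch stays self-contained.

```typescript
// Assumed shape of one entry in a browser cookie export.
interface CookieEntry {
  name: string;
  value: string;
}

type Env = Record<string, string | undefined>;

// Find the sessionKey cookie in an exported cookie list.
// Throws if the text is not valid JSON; returns undefined if the cookie is absent.
function sessionKeyFromCookieExport(json: string): string | undefined {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) return undefined;
  for (const entry of parsed) {
    if (entry !== null && typeof entry === "object") {
      const cookie = entry as CookieEntry;
      if (cookie.name === "sessionKey" && typeof cookie.value === "string") {
        return cookie.value;
      }
    }
  }
  return undefined;
}

// Resolve a session key from the environment.
// Returning undefined models "provider disabled" when neither variable is set.
function resolveSessionKey(
  env: Env,
  readFile: (path: string) => string
): string | undefined {
  if (env.CLAUDE_SESSION_KEY) return env.CLAUDE_SESSION_KEY;
  if (env.CLAUDE_COOKIES_FILE) {
    return sessionKeyFromCookieExport(readFile(env.CLAUDE_COOKIES_FILE));
  }
  return undefined;
}
```

In Node, `env` would typically be `process.env` and `readFile` could wrap `fs.readFileSync(path, "utf8")`. The `undefined` result lines up with the README's statement that the Claude provider is disabled when neither variable is set, though the actual precedence between the two variables is not stated in the diff.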
@@ -296,14 +354,13 @@ If you transcribe many videos in rapid succession, YouTube may temporarily throt
 - An MCP-compatible client (Claude Desktop, Claude Code, or any MCP-aware host)
 - Internet access to reach `youtube.com` and `www.youtube.com/youtubei/v1/`

-No API keys
+No API keys needed for YouTube, articles, or PDFs. Claude.ai provider requires a session cookie (see setup above).

 ## Coming soon

 - Vimeo transcript extraction
 - Direct audio/video file transcription
 - Podcast RSS feed support
-- SoundCloud track transcription

 ## License

@@ -312,3 +369,6 @@ MIT
 ## Author

 Matthias Nott — [github.com/mnott](https://github.com/mnott)
+
+---
+*Links:* [[Ideaverse/AI/Scribe/Scribe|Scribe]]