ytb-tools 0.1.0
- package/LICENSE +21 -0
- package/README.md +143 -0
- package/dist/index.js +47 -0
- package/dist/library/cache.js +20 -0
- package/dist/library/paths.js +29 -0
- package/dist/library/save.js +41 -0
- package/dist/tools/saveSummary.js +21 -0
- package/dist/tools/search.js +11 -0
- package/dist/tools/transcript.js +27 -0
- package/dist/youtube/client.js +8 -0
- package/dist/youtube/search.js +72 -0
- package/dist/youtube/transcript.js +96 -0
- package/dist/youtube/types.js +1 -0
- package/dist/youtube/url.js +29 -0
- package/dist/ytdlp/provision.js +91 -0
- package/package.json +58 -0
package/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 aliildan
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
package/README.md
ADDED
|
@@ -0,0 +1,143 @@
|
|
|
1
|
+
# 🎬 ytb-tools
|
|
2
|
+
|
|
3
|
+
[npm package](https://www.npmjs.com/package/ytb-tools)
|
|
4
|
+
[MIT license](LICENSE)
|
|
5
|
+
|
|
6
|
+
**Search YouTube, pull transcripts, and get AI summaries — right inside Claude and any other MCP client.**
|
|
7
|
+
|
|
8
|
+
ytb-tools is a [Model Context Protocol](https://modelcontextprotocol.io) server that turns YouTube into something your AI assistant can actually work with. Ask it to find videos, grab a transcript, or summarize a talk — it just works.
|
|
9
|
+
|
|
10
|
+
> ✨ **Zero setup.** No API keys. No Google account. No manual installs. ytb-tools provisions everything it needs on its own.
|
|
11
|
+
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
## What you can do
|
|
15
|
+
|
|
16
|
+
- 🔎 **Search YouTube** — "find me the top React 19 talks" → ranked results with titles, channels, durations, and views.
|
|
17
|
+
- 📝 **Get transcripts** — full transcripts in the video's language (or any available caption track), saved to a tidy local library.
|
|
18
|
+
- 🧠 **Summarize videos** — TL;DR, structured notes, or a deep dive — written by Claude, in the video's own language.
|
|
19
|
+
- 💾 **Builds your library** — every transcript and summary is auto-saved as clean files you can browse, search, and keep.
|
|
20
|
+
|
|
21
|
+
---
|
|
22
|
+
|
|
23
|
+
## Quick start
|
|
24
|
+
|
|
25
|
+
### Any MCP client (Claude Desktop, Cursor, Cline, …)
|
|
26
|
+
|
|
27
|
+
Add this to your client's MCP config — that's the whole install:
|
|
28
|
+
|
|
29
|
+
```json
|
|
30
|
+
{
|
|
31
|
+
"mcpServers": {
|
|
32
|
+
"ytb-tools": {
|
|
33
|
+
"command": "npx",
|
|
34
|
+
"args": ["-y", "ytb-tools"]
|
|
35
|
+
}
|
|
36
|
+
}
|
|
37
|
+
}
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
Then just ask:
|
|
41
|
+
|
|
42
|
+
> *"Search YouTube for the best intro to Rust, then summarize the top result."*
|
|
43
|
+
|
|
44
|
+
### Claude Code (plugin)
|
|
45
|
+
|
|
46
|
+
Install it as a plugin to get the slash commands. Run these inside Claude Code:
|
|
47
|
+
|
|
48
|
+
```text
|
|
49
|
+
/plugin marketplace add aliildan/ytb-tools
|
|
50
|
+
/plugin install ytb-tools@ytb-tools
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
The first command registers this repo as a marketplace; the second installs the plugin (which pulls in the MCP server via `npx` and adds the slash commands). Prefer a menu? Just run `/plugin`.
|
|
54
|
+
|
|
55
|
+
You then get three commands:
|
|
56
|
+
|
|
57
|
+
| Command | What it does |
|
|
58
|
+
|---|---|
|
|
59
|
+
| `/yt-search <query>` | List ranked search results |
|
|
60
|
+
| `/yt-transcript <url\|id> [lang]` | Fetch a transcript |
|
|
61
|
+
| `/yt-summary <url\|id> [quick\|standard\|detailed]` | Summarize at the depth you want |
|
|
62
|
+
|
|
63
|
+
`/yt-summary` automatically picks the right model for the job — **quick → Haiku**, **standard → Sonnet**, **detailed → Opus** — and writes the summary in the video's language.
|
|
64
|
+
|
|
65
|
+
#### Updating the plugin
|
|
66
|
+
|
|
67
|
+
When a new version ships, refresh the marketplace catalog and update:
|
|
68
|
+
|
|
69
|
+
```text
|
|
70
|
+
/plugin marketplace update ytb-tools
|
|
71
|
+
/plugin update ytb-tools
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
#### Uninstalling
|
|
75
|
+
|
|
76
|
+
```text
|
|
77
|
+
/plugin uninstall ytb-tools@ytb-tools
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
> **Scripting it?** The same actions work non-interactively from your shell:
|
|
81
|
+
> ```bash
|
|
82
|
+
> claude plugin marketplace add aliildan/ytb-tools
|
|
83
|
+
> claude plugin install ytb-tools@ytb-tools
|
|
84
|
+
> ```
|
|
85
|
+
|
|
86
|
+
---
|
|
87
|
+
|
|
88
|
+
## Research a whole topic at once
|
|
89
|
+
|
|
90
|
+
Installed in Claude Code, the **`yt-research`** skill chains everything together. Just ask in plain language:
|
|
91
|
+
|
|
92
|
+
> *"Research the top 30 YouTube videos on 'rust async' and give me a digest."*
|
|
93
|
+
|
|
94
|
+
It searches, pulls each transcript, summarizes each (defaulting to quick/Haiku to keep big batches cheap), and produces a **combined digest** — recurring themes, a ranked "start here" shortlist, and any videos it had to skip. For large runs it **confirms with you first** and processes in batches with progress updates.
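The batching described here can be pictured with a small sketch. It is not the skill's actual implementation, which is not part of this package diff; `chunk`, `researchInBatches`, `processVideo`, `onProgress`, and the batch size of 5 are hypothetical names and values chosen for illustration.

```javascript
// Hypothetical sketch: split a list of video IDs into batches, process each
// batch concurrently, and collect both results and skipped videos.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

async function researchInBatches(videoIds, processVideo, options = {}) {
  const { batchSize = 5, onProgress = () => {} } = options;
  const done = [];
  const skipped = [];
  for (const batch of chunk(videoIds, batchSize)) {
    // allSettled keeps one failing video from aborting the whole batch.
    const settled = await Promise.allSettled(batch.map((id) => processVideo(id)));
    settled.forEach((r, i) => {
      if (r.status === "fulfilled") {
        done.push(r.value);
      } else {
        skipped.push({ videoId: batch[i], reason: String(r.reason?.message ?? r.reason) });
      }
    });
    onProgress(done.length + skipped.length, videoIds.length);
  }
  return { done, skipped };
}

// Usage with a stand-in worker; a real worker would fetch and summarize.
researchInBatches(["a", "b", "c"], async (id) => id.toUpperCase(), { batchSize: 2 })
  .then((result) => console.log(result.done, result.skipped));
```

Collecting failures instead of throwing is one way to end up with the "videos it had to skip" list mentioned above.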
|
|
95
|
+
|
|
96
|
+
## Your library
|
|
97
|
+
|
|
98
|
+
Everything is saved automatically (default `~/ytb-tools/`):
|
|
99
|
+
|
|
100
|
+
```
|
|
101
|
+
~/ytb-tools/
|
|
102
|
+
├── transcripts/
|
|
103
|
+
│ ├── dQw4w9WgXcQ.en.json # timestamped segments
|
|
104
|
+
│ └── dQw4w9WgXcQ.en.txt # plain text
|
|
105
|
+
└── summaries/
|
|
106
|
+
└── dQw4w9WgXcQ.standard.md # Markdown with title, url, model, date
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
Want them somewhere else? Set `YT_OUTPUT_DIR`.
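As a rough illustration of the naming scheme in the tree above, the sketch below builds the three file paths for one video. `libraryPaths` is a hypothetical helper, not part of the package, and it assumes POSIX-style separators and leaves `~` unexpanded.

```javascript
// Illustrative only: mirrors the file naming shown in the library tree.
function libraryPaths(root, videoId, language, style) {
  const base = root.replace(/\/+$/, ""); // drop trailing slashes
  return {
    transcriptJson: `${base}/transcripts/${videoId}.${language}.json`,
    transcriptText: `${base}/transcripts/${videoId}.${language}.txt`,
    summary: `${base}/summaries/${videoId}.${style}.md`,
  };
}

const paths = libraryPaths("~/ytb-tools/", "dQw4w9WgXcQ", "en", "standard");
console.log(paths.summary); // ~/ytb-tools/summaries/dQw4w9WgXcQ.standard.md
```

Because the language and the summary style are part of the file name, several transcripts or summaries of the same video can sit side by side.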
|
|
110
|
+
|
|
111
|
+
---
|
|
112
|
+
|
|
113
|
+
## The tools
|
|
114
|
+
|
|
115
|
+
| Tool | Does |
|
|
116
|
+
|---|---|
|
|
117
|
+
| `youtube_search` | Search YouTube and return ranked video results |
|
|
118
|
+
| `youtube_get_transcript` | Extract a transcript (with language selection), auto-saved |
|
|
119
|
+
| `youtube_save_summary` | Save a generated summary to your library |
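MCP clients normally build tool calls for you, but it can help to see the shape of one. The sketch below assembles a JSON-RPC `tools/call` request for `youtube_search`. `buildToolCall` is a hypothetical helper; the `query` and `limit` argument names follow the search tool's input schema in `dist/tools/search.js`.

```javascript
// Hypothetical helper: wraps a tool name and its arguments in the
// JSON-RPC envelope that MCP uses for tool invocation.
let nextId = 1;

function buildToolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall("youtube_search", { query: "rust async", limit: 5 });
console.log(JSON.stringify(request, null, 2));
```

Over the stdio transport this object would be sent to the server as one serialized JSON message.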
|
|
120
|
+
|
|
121
|
+
---
|
|
122
|
+
|
|
123
|
+
## Configuration
|
|
124
|
+
|
|
125
|
+
All optional:
|
|
126
|
+
|
|
127
|
+
| Variable | Purpose | Default |
|
|
128
|
+
|---|---|---|
|
|
129
|
+
| `YT_OUTPUT_DIR` | Where transcripts & summaries are saved | `~/ytb-tools` |
|
|
130
|
+
| `YT_CACHE_DIR` | Cache location | OS cache dir |
|
|
131
|
+
| `YT_DLP_PATH` | Use an existing yt-dlp binary instead of the auto-downloaded one | auto |
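A minimal sketch of how an optional override such as `YT_OUTPUT_DIR` could be resolved. It is a simplified stand-in for the logic in `dist/library/paths.js`: `expandTilde` and `resolveOutputDir` are hypothetical names, the home directory is passed in so the example needs no imports, and POSIX-style separators are assumed.

```javascript
// Expand a leading "~" against a given home directory.
function expandTilde(p, home) {
  if (p === "~") return home;
  if (p.startsWith("~/")) return `${home}/${p.slice(2)}`;
  return p;
}

// Use the override when it is set and non-blank, else the documented default.
function resolveOutputDir(env, home) {
  const override = (env.YT_OUTPUT_DIR ?? "").trim();
  return override ? expandTilde(override, home) : `${home}/ytb-tools`;
}

console.log(resolveOutputDir({}, "/home/ada")); // /home/ada/ytb-tools
console.log(resolveOutputDir({ YT_OUTPUT_DIR: " ~/notes/yt " }, "/home/ada")); // /home/ada/notes/yt
```

Trimming first means a blank or whitespace-only value falls back to the default.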
|
|
132
|
+
|
|
133
|
+
---
|
|
134
|
+
|
|
135
|
+
## How it works (the short version)
|
|
136
|
+
|
|
137
|
+
Search runs entirely in-process via [`youtubei.js`](https://github.com/LuanRT/YouTube.js) — no key, no quotas. Transcripts are powered by [yt-dlp](https://github.com/yt-dlp/yt-dlp), which ytb-tools **downloads and manages for you automatically** the first time you need it (it reuses the Node runtime that's already running — no Python, no Deno). Summaries are written by your assistant's own model, so there's no extra API bill.
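The lookup order for yt-dlp can be summarized as a small decision function. This sketch follows the order visible in `dist/ytdlp/provision.js` (explicit override, then a `yt-dlp` already on the PATH, then a previously downloaded copy, then a fresh download). `pickYtDlpSource` and its parameter names are hypothetical, and the probes are passed in as plain values so the example touches neither disk nor network.

```javascript
// Hypothetical sketch of a "use what exists, download last" decision.
function pickYtDlpSource({ override = "", foundOnPath = false, cachedPath = null }) {
  const explicit = override.trim();
  if (explicit) return { source: "override", bin: explicit };
  if (foundOnPath) return { source: "path", bin: "yt-dlp" };
  if (cachedPath) return { source: "cache", bin: cachedPath };
  // Nothing usable found: the caller would download the release asset here.
  return { source: "download", bin: null };
}

console.log(pickYtDlpSource({ override: "/opt/bin/yt-dlp" }).source); // override
console.log(pickYtDlpSource({ foundOnPath: true }).source); // path
console.log(pickYtDlpSource({}).source); // download
```

Checking the cheap, local options first keeps the one-time download as a last resort.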
|
|
138
|
+
|
|
139
|
+
---
|
|
140
|
+
|
|
141
|
+
## License
|
|
142
|
+
|
|
143
|
+
MIT © aliildan
|
package/dist/index.js
ADDED
|
@@ -0,0 +1,47 @@
|
|
|
1
|
+
#!/usr/bin/env node
|
|
2
|
+
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
|
|
3
|
+
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
|
|
4
|
+
import { searchInput, handleSearch } from "./tools/search.js";
|
|
5
|
+
import { transcriptInput, handleTranscript } from "./tools/transcript.js";
|
|
6
|
+
import { saveSummaryInput, handleSaveSummary } from "./tools/saveSummary.js";
|
|
7
|
+
function ok(data) {
|
|
8
|
+
return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
|
|
9
|
+
}
|
|
10
|
+
function fail(err) {
|
|
11
|
+
const message = err instanceof Error ? err.message : String(err);
|
|
12
|
+
return { content: [{ type: "text", text: message }], isError: true };
|
|
13
|
+
}
|
|
14
|
+
const server = new McpServer({ name: "ytb-tools", version: "0.1.0" });
|
|
15
|
+
server.registerTool("youtube_search", { description: "Search YouTube and return ranked video results.", inputSchema: searchInput }, async (args) => {
|
|
16
|
+
try {
|
|
17
|
+
return ok(await handleSearch(args));
|
|
18
|
+
}
|
|
19
|
+
catch (e) {
|
|
20
|
+
return fail(e);
|
|
21
|
+
}
|
|
22
|
+
});
|
|
23
|
+
server.registerTool("youtube_get_transcript", {
|
|
24
|
+
description: "Fetch a video's transcript via yt-dlp (cache-aware) and auto-save it to the library. " +
|
|
25
|
+
"May trigger a one-time yt-dlp download on first use.",
|
|
26
|
+
inputSchema: transcriptInput,
|
|
27
|
+
}, async (args) => {
|
|
28
|
+
try {
|
|
29
|
+
return ok(await handleTranscript(args));
|
|
30
|
+
}
|
|
31
|
+
catch (e) {
|
|
32
|
+
return fail(e);
|
|
33
|
+
}
|
|
34
|
+
});
|
|
35
|
+
server.registerTool("youtube_save_summary", {
|
|
36
|
+
description: "Persist a host-generated summary to the library as Markdown with frontmatter.",
|
|
37
|
+
inputSchema: saveSummaryInput,
|
|
38
|
+
}, async (args) => {
|
|
39
|
+
try {
|
|
40
|
+
return ok(await handleSaveSummary(args));
|
|
41
|
+
}
|
|
42
|
+
catch (e) {
|
|
43
|
+
return fail(e);
|
|
44
|
+
}
|
|
45
|
+
});
|
|
46
|
+
const transport = new StdioServerTransport();
|
|
47
|
+
await server.connect(transport);
|
package/dist/library/cache.js
ADDED
|
@@ -0,0 +1,20 @@
|
|
|
1
|
+
import { promises as fs } from "node:fs";
|
|
2
|
+
import path from "node:path";
|
|
3
|
+
import { cacheDir } from "./paths.js";
|
|
4
|
+
function cacheFile(videoId, key, env) {
|
|
5
|
+
return path.join(cacheDir(env), "transcripts", `${videoId}.${key}.json`);
|
|
6
|
+
}
|
|
7
|
+
export async function readCachedTranscript(videoId, key, env = process.env) {
|
|
8
|
+
try {
|
|
9
|
+
const raw = await fs.readFile(cacheFile(videoId, key, env), "utf8");
|
|
10
|
+
return JSON.parse(raw);
|
|
11
|
+
}
|
|
12
|
+
catch {
|
|
13
|
+
return null;
|
|
14
|
+
}
|
|
15
|
+
}
|
|
16
|
+
export async function writeCachedTranscript(t, key, env = process.env) {
|
|
17
|
+
const file = cacheFile(t.videoId, key, env);
|
|
18
|
+
await fs.mkdir(path.dirname(file), { recursive: true });
|
|
19
|
+
await fs.writeFile(file, JSON.stringify(t), "utf8");
|
|
20
|
+
}
|
package/dist/library/paths.js
ADDED
|
@@ -0,0 +1,29 @@
|
|
|
1
|
+
import os from "node:os";
|
|
2
|
+
import path from "node:path";
|
|
3
|
+
import envPaths from "env-paths";
|
|
4
|
+
export function expandHome(p) {
|
|
5
|
+
if (p === "~")
|
|
6
|
+
return os.homedir();
|
|
7
|
+
if (p.startsWith("~/") || p.startsWith("~\\")) {
|
|
8
|
+
return path.join(os.homedir(), p.slice(2));
|
|
9
|
+
}
|
|
10
|
+
return p;
|
|
11
|
+
}
|
|
12
|
+
export function outputDir(env = process.env) {
|
|
13
|
+
const override = env.YT_OUTPUT_DIR?.trim();
|
|
14
|
+
if (override)
|
|
15
|
+
return path.resolve(expandHome(override));
|
|
16
|
+
return path.join(os.homedir(), "ytb-tools");
|
|
17
|
+
}
|
|
18
|
+
export function cacheDir(env = process.env) {
|
|
19
|
+
const override = env.YT_CACHE_DIR?.trim();
|
|
20
|
+
if (override)
|
|
21
|
+
return path.resolve(expandHome(override));
|
|
22
|
+
return envPaths("ytb-tools", { suffix: "" }).cache;
|
|
23
|
+
}
|
|
24
|
+
export function transcriptsDir(env = process.env) {
|
|
25
|
+
return path.join(outputDir(env), "transcripts");
|
|
26
|
+
}
|
|
27
|
+
export function summariesDir(env = process.env) {
|
|
28
|
+
return path.join(outputDir(env), "summaries");
|
|
29
|
+
}
|
package/dist/library/save.js
ADDED
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
import { promises as fs } from "node:fs";
|
|
2
|
+
import path from "node:path";
|
|
3
|
+
import { transcriptsDir, summariesDir } from "./paths.js";
|
|
4
|
+
export function slugify(s) {
|
|
5
|
+
const out = s
|
|
6
|
+
.toLowerCase()
|
|
7
|
+
.normalize("NFKD")
|
|
8
|
+
.replace(/[^\p{Letter}\p{Number}]+/gu, "-")
|
|
9
|
+
.replace(/^-+|-+$/g, "")
|
|
10
|
+
.slice(0, 60);
|
|
11
|
+
return out || "untitled";
|
|
12
|
+
}
|
|
13
|
+
export async function saveTranscript(t, env = process.env) {
|
|
14
|
+
const dir = transcriptsDir(env);
|
|
15
|
+
await fs.mkdir(dir, { recursive: true });
|
|
16
|
+
const base = `${t.videoId}.${t.language}`;
|
|
17
|
+
const jsonPath = path.join(dir, `${base}.json`);
|
|
18
|
+
const txtPath = path.join(dir, `${base}.txt`);
|
|
19
|
+
await fs.writeFile(jsonPath, JSON.stringify(t, null, 2), "utf8");
|
|
20
|
+
await fs.writeFile(txtPath, t.fullText, "utf8");
|
|
21
|
+
return { jsonPath, txtPath };
|
|
22
|
+
}
|
|
23
|
+
export async function saveSummary(videoId, summary, meta, env = process.env) {
|
|
24
|
+
const dir = summariesDir(env);
|
|
25
|
+
await fs.mkdir(dir, { recursive: true });
|
|
26
|
+
const file = path.join(dir, `${videoId}.${meta.style}.md`);
|
|
27
|
+
const frontmatter = [
|
|
28
|
+
"---",
|
|
29
|
+
`title: ${JSON.stringify(meta.title)}`,
|
|
30
|
+
`url: ${meta.url}`,
|
|
31
|
+
`videoId: ${videoId}`,
|
|
32
|
+
`model: ${meta.model}`,
|
|
33
|
+
`style: ${meta.style}`,
|
|
34
|
+
`language: ${meta.language}`,
|
|
35
|
+
`date: ${new Date().toISOString()}`,
|
|
36
|
+
"---",
|
|
37
|
+
"",
|
|
38
|
+
].join("\n");
|
|
39
|
+
await fs.writeFile(file, frontmatter + summary + "\n", "utf8");
|
|
40
|
+
return file;
|
|
41
|
+
}
|
package/dist/tools/saveSummary.js
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
import { z } from "zod";
|
|
2
|
+
import { saveSummary } from "../library/save.js";
|
|
3
|
+
export const saveSummaryInput = {
|
|
4
|
+
videoId: z.string().min(1).describe("11-character video ID"),
|
|
5
|
+
summary: z.string().min(1).describe("The summary text produced by the host model"),
|
|
6
|
+
style: z.enum(["quick", "standard", "detailed"]).default("standard"),
|
|
7
|
+
title: z.string().default("").describe("Video title"),
|
|
8
|
+
url: z.string().default("").describe("Video URL"),
|
|
9
|
+
model: z.string().default("").describe("Model that produced the summary"),
|
|
10
|
+
language: z.string().default("").describe("Summary language (BCP-47)"),
|
|
11
|
+
};
|
|
12
|
+
export async function handleSaveSummary(args, env = process.env) {
|
|
13
|
+
const savedTo = await saveSummary(args.videoId, args.summary, {
|
|
14
|
+
title: args.title ?? "",
|
|
15
|
+
url: args.url ?? "",
|
|
16
|
+
model: args.model ?? "",
|
|
17
|
+
language: args.language ?? "",
|
|
18
|
+
style: args.style ?? "standard",
|
|
19
|
+
}, env);
|
|
20
|
+
return { savedTo };
|
|
21
|
+
}
|
package/dist/tools/search.js
ADDED
|
@@ -0,0 +1,11 @@
|
|
|
1
|
+
import { z } from "zod";
|
|
2
|
+
import { searchVideos } from "../youtube/search.js";
|
|
3
|
+
export const searchInput = {
|
|
4
|
+
query: z.string().min(1).describe("Search query"),
|
|
5
|
+
limit: z.number().int().min(1).max(200).default(10).describe("Maximum results (paginated)"),
|
|
6
|
+
type: z.enum(["video", "channel", "playlist"]).default("video").describe("Result type"),
|
|
7
|
+
};
|
|
8
|
+
export async function handleSearch(args) {
|
|
9
|
+
const results = await searchVideos(args.query, { limit: args.limit, type: args.type });
|
|
10
|
+
return { count: results.length, results };
|
|
11
|
+
}
|
package/dist/tools/transcript.js
ADDED
|
@@ -0,0 +1,27 @@
|
|
|
1
|
+
import { z } from "zod";
|
|
2
|
+
import { parseVideoId } from "../youtube/url.js";
|
|
3
|
+
import { fetchTranscript } from "../youtube/transcript.js";
|
|
4
|
+
import { readCachedTranscript, writeCachedTranscript } from "../library/cache.js";
|
|
5
|
+
import { saveTranscript } from "../library/save.js";
|
|
6
|
+
export const transcriptInput = {
|
|
7
|
+
video: z.string().min(1).describe("YouTube URL or 11-character video ID"),
|
|
8
|
+
lang: z.string().optional().describe("BCP-47 language code, e.g. en, tr"),
|
|
9
|
+
fresh: z.boolean().default(false).describe("Bypass the transcript cache"),
|
|
10
|
+
};
|
|
11
|
+
export async function handleTranscript(args, deps = {}) {
|
|
12
|
+
const fetch = deps.fetch ?? fetchTranscript;
|
|
13
|
+
const readCache = deps.readCache ?? ((id, key) => readCachedTranscript(id, key));
|
|
14
|
+
const writeCache = deps.writeCache ?? ((t, key) => writeCachedTranscript(t, key));
|
|
15
|
+
const save = deps.save ?? ((t) => saveTranscript(t));
|
|
16
|
+
const videoId = parseVideoId(args.video);
|
|
17
|
+
const key = args.lang ?? "auto";
|
|
18
|
+
if (!args.fresh) {
|
|
19
|
+
const cached = await readCache(videoId, key);
|
|
20
|
+
if (cached)
|
|
21
|
+
return cached;
|
|
22
|
+
}
|
|
23
|
+
const t = await fetch(videoId, args.lang);
|
|
24
|
+
await writeCache(t, key);
|
|
25
|
+
await save(t);
|
|
26
|
+
return t;
|
|
27
|
+
}
|
package/dist/youtube/search.js
ADDED
|
@@ -0,0 +1,72 @@
|
|
|
1
|
+
import { watchUrl } from "./url.js";
|
|
2
|
+
import { getInnertube } from "./client.js";
|
|
3
|
+
function txt(v) {
|
|
4
|
+
if (!v)
|
|
5
|
+
return "";
|
|
6
|
+
if (typeof v === "string")
|
|
7
|
+
return v;
|
|
8
|
+
if ("text" in v && typeof v.text === "string")
|
|
9
|
+
return v.text;
|
|
10
|
+
if ("name" in v && typeof v.name === "string")
|
|
11
|
+
return v.name;
|
|
12
|
+
return "";
|
|
13
|
+
}
|
|
14
|
+
function parseViewCount(s) {
|
|
15
|
+
const m = s.replace(/,/g, "").match(/([\d.]+)\s*([KMB]?)/i);
|
|
16
|
+
if (!m)
|
|
17
|
+
return null;
|
|
18
|
+
const n = parseFloat(m[1]);
|
|
19
|
+
if (Number.isNaN(n))
|
|
20
|
+
return null;
|
|
21
|
+
const mult = { K: 1e3, M: 1e6, B: 1e9 };
|
|
22
|
+
return Math.round(n * (mult[m[2].toUpperCase()] ?? 1));
|
|
23
|
+
}
|
|
24
|
+
export function normalizeSearchResults(raw, limit) {
|
|
25
|
+
const out = [];
|
|
26
|
+
for (const v of raw) {
|
|
27
|
+
const videoId = v.id ?? v.video_id;
|
|
28
|
+
if (!videoId)
|
|
29
|
+
continue;
|
|
30
|
+
out.push({
|
|
31
|
+
videoId,
|
|
32
|
+
title: txt(v.title),
|
|
33
|
+
channel: txt(v.author),
|
|
34
|
+
durationSeconds: v.duration?.seconds ?? null,
|
|
35
|
+
viewCount: parseViewCount(txt(v.view_count) || txt(v.short_view_count)),
|
|
36
|
+
publishedAt: txt(v.published) || null,
|
|
37
|
+
url: watchUrl(videoId),
|
|
38
|
+
descriptionSnippet: txt(v.description_snippet),
|
|
39
|
+
});
|
|
40
|
+
if (out.length >= limit)
|
|
41
|
+
break;
|
|
42
|
+
}
|
|
43
|
+
return out;
|
|
44
|
+
}
|
|
45
|
+
export async function searchVideos(query, opts = {}) {
|
|
46
|
+
const limit = opts.limit ?? 10;
|
|
47
|
+
const yt = await getInnertube();
|
|
48
|
+
let page = (await yt.search(query, { type: opts.type ?? "video" }));
|
|
49
|
+
const out = [];
|
|
50
|
+
const seen = new Set();
|
|
51
|
+
// Page through continuations until we have `limit` results or run out.
|
|
52
|
+
for (let guard = 0; page && guard < 30; guard++) {
|
|
53
|
+
const items = page.results ?? page.videos ?? [];
|
|
54
|
+
for (const r of normalizeSearchResults(items, items.length)) {
|
|
55
|
+
if (seen.has(r.videoId))
|
|
56
|
+
continue;
|
|
57
|
+
seen.add(r.videoId);
|
|
58
|
+
out.push(r);
|
|
59
|
+
if (out.length >= limit)
|
|
60
|
+
return out;
|
|
61
|
+
}
|
|
62
|
+
if (!page.has_continuation || typeof page.getContinuation !== "function")
|
|
63
|
+
break;
|
|
64
|
+
try {
|
|
65
|
+
page = await page.getContinuation();
|
|
66
|
+
}
|
|
67
|
+
catch {
|
|
68
|
+
break;
|
|
69
|
+
}
|
|
70
|
+
}
|
|
71
|
+
return out.slice(0, limit);
|
|
72
|
+
}
|
package/dist/youtube/transcript.js
ADDED
|
@@ -0,0 +1,96 @@
|
|
|
1
|
+
import { promises as fs } from "node:fs";
|
|
2
|
+
import os from "node:os";
|
|
3
|
+
import path from "node:path";
|
|
4
|
+
import { spawn } from "node:child_process";
|
|
5
|
+
import { watchUrl } from "./url.js";
|
|
6
|
+
import { ensureYtDlp, jsRuntimeArg } from "../ytdlp/provision.js";
|
|
7
|
+
export function parseJson3(data) {
|
|
8
|
+
const events = data?.events ?? [];
|
|
9
|
+
const segs = [];
|
|
10
|
+
for (const ev of events) {
|
|
11
|
+
const text = (ev.segs ?? []).map((s) => s.utf8 ?? "").join("").trim();
|
|
12
|
+
if (!text)
|
|
13
|
+
continue;
|
|
14
|
+
segs.push({ startSeconds: Math.round((ev.tStartMs ?? 0) / 1000), text });
|
|
15
|
+
}
|
|
16
|
+
return segs;
|
|
17
|
+
}
|
|
18
|
+
export function buildFullText(segments) {
|
|
19
|
+
return segments
|
|
20
|
+
.map((s) => s.text)
|
|
21
|
+
.join(" ")
|
|
22
|
+
.replace(/\s+/g, " ")
|
|
23
|
+
.trim();
|
|
24
|
+
}
|
|
25
|
+
export function pickLanguage(requested, metaLang, available) {
|
|
26
|
+
const match = (l) => available.find((a) => a === l) ??
|
|
27
|
+
available.find((a) => !!l && (a.startsWith(l + "-") || a === l + "-orig"));
|
|
28
|
+
return match(requested) ?? match(metaLang) ?? match("en") ?? available[0];
|
|
29
|
+
}
|
|
30
|
+
function run(bin, args) {
|
|
31
|
+
return new Promise((resolve, reject) => {
|
|
32
|
+
const p = spawn(bin, args);
|
|
33
|
+
let stdout = "";
|
|
34
|
+
let stderr = "";
|
|
35
|
+
p.on("error", reject);
|
|
36
|
+
p.stdout.on("data", (d) => (stdout += d));
|
|
37
|
+
p.stderr.on("data", (d) => (stderr += d));
|
|
38
|
+
p.on("close", (code) => resolve({ code: code ?? -1, stdout, stderr }));
|
|
39
|
+
});
|
|
40
|
+
}
|
|
41
|
+
export async function fetchTranscript(videoId, lang, env = process.env) {
|
|
42
|
+
const bin = await ensureYtDlp(env);
|
|
43
|
+
const url = watchUrl(videoId);
|
|
44
|
+
const jsRt = jsRuntimeArg();
|
|
45
|
+
// 1. Metadata: title + available caption languages.
|
|
46
|
+
const meta = await run(bin, ["-J", "--skip-download", "--no-warnings", "--js-runtimes", jsRt, url]);
|
|
47
|
+
if (meta.code !== 0)
|
|
48
|
+
throw new Error(`yt-dlp metadata failed for ${videoId}: ${meta.stderr.slice(-200)}`);
|
|
49
|
+
const info = JSON.parse(meta.stdout);
|
|
50
|
+
const available = Array.from(new Set([...Object.keys(info.subtitles ?? {}), ...Object.keys(info.automatic_captions ?? {})]));
|
|
51
|
+
if (available.length === 0)
|
|
52
|
+
throw new Error(`No transcript available for video ${videoId}`);
|
|
53
|
+
const chosen = pickLanguage(lang, info.language, available);
|
|
54
|
+
// 2. Download the chosen track as json3 into a temp dir, then parse.
|
|
55
|
+
const dir = await fs.mkdtemp(path.join(os.tmpdir(), "ytb-tr-"));
|
|
56
|
+
try {
|
|
57
|
+
const dl = await run(bin, [
|
|
58
|
+
"--skip-download",
|
|
59
|
+
"--write-subs",
|
|
60
|
+
"--write-auto-subs",
|
|
61
|
+
"--sub-langs",
|
|
62
|
+
chosen,
|
|
63
|
+
"--sub-format",
|
|
64
|
+
"json3",
|
|
65
|
+
"--no-warnings",
|
|
66
|
+
"--js-runtimes",
|
|
67
|
+
jsRt,
|
|
68
|
+
"-o",
|
|
69
|
+
path.join(dir, "t.%(ext)s"),
|
|
70
|
+
url,
|
|
71
|
+
]);
|
|
72
|
+
if (dl.code !== 0)
|
|
73
|
+
throw new Error(`yt-dlp subtitle download failed for ${videoId}: ${dl.stderr.slice(-200)}`);
|
|
74
|
+
const files = (await fs.readdir(dir)).filter((f) => f.endsWith(".json3"));
|
|
75
|
+
if (files.length === 0)
|
|
76
|
+
throw new Error(`No transcript track produced for ${videoId} (lang ${chosen})`);
|
|
77
|
+
const file = files.find((f) => f.includes(`.${chosen}.`)) ?? files[0];
|
|
78
|
+
const json = JSON.parse(await fs.readFile(path.join(dir, file), "utf8"));
|
|
79
|
+
const segments = parseJson3(json);
|
|
80
|
+
if (segments.length === 0)
|
|
81
|
+
throw new Error(`Empty transcript for ${videoId}`);
|
|
82
|
+
const m = file.match(/\.([^.]+)\.json3$/);
|
|
83
|
+
const language = m ? m[1] : chosen;
|
|
84
|
+
return {
|
|
85
|
+
videoId,
|
|
86
|
+
title: info.title ?? "",
|
|
87
|
+
language,
|
|
88
|
+
availableLanguages: available,
|
|
89
|
+
segments,
|
|
90
|
+
fullText: buildFullText(segments),
|
|
91
|
+
};
|
|
92
|
+
}
|
|
93
|
+
finally {
|
|
94
|
+
await fs.rm(dir, { recursive: true, force: true });
|
|
95
|
+
}
|
|
96
|
+
}
|
package/dist/youtube/types.js
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
export {};
|
package/dist/youtube/url.js
ADDED
|
@@ -0,0 +1,29 @@
|
|
|
1
|
+
const ID_RE = /^[a-zA-Z0-9_-]{11}$/;
|
|
2
|
+
export function parseVideoId(input) {
|
|
3
|
+
const s = input.trim();
|
|
4
|
+
if (ID_RE.test(s))
|
|
5
|
+
return s;
|
|
6
|
+
try {
|
|
7
|
+
const u = new URL(s);
|
|
8
|
+
if (u.hostname === "youtu.be") {
|
|
9
|
+
const id = u.pathname.slice(1);
|
|
10
|
+
if (ID_RE.test(id))
|
|
11
|
+
return id;
|
|
12
|
+
}
|
|
13
|
+
if (u.hostname.endsWith("youtube.com")) {
|
|
14
|
+
const v = u.searchParams.get("v");
|
|
15
|
+
if (v && ID_RE.test(v))
|
|
16
|
+
return v;
|
|
17
|
+
const m = u.pathname.match(/\/(shorts|embed|v)\/([a-zA-Z0-9_-]{11})/);
|
|
18
|
+
if (m)
|
|
19
|
+
return m[2];
|
|
20
|
+
}
|
|
21
|
+
}
|
|
22
|
+
catch {
|
|
23
|
+
/* not a URL — fall through */
|
|
24
|
+
}
|
|
25
|
+
throw new Error(`Could not parse a YouTube video ID from: ${input}`);
|
|
26
|
+
}
|
|
27
|
+
export function watchUrl(videoId) {
|
|
28
|
+
return `https://www.youtube.com/watch?v=${videoId}`;
|
|
29
|
+
}
|
package/dist/ytdlp/provision.js
ADDED
|
@@ -0,0 +1,91 @@
|
|
|
1
|
+
import { promises as fs } from "node:fs";
|
|
2
|
+
import path from "node:path";
|
|
3
|
+
import { createHash } from "node:crypto";
|
|
4
|
+
import { spawn } from "node:child_process";
|
|
5
|
+
import { cacheDir } from "../library/paths.js";
|
|
6
|
+
const RELEASE_BASE = "https://github.com/yt-dlp/yt-dlp/releases/latest/download";
|
|
7
|
+
export function assetName(platform = process.platform, arch = process.arch) {
|
|
8
|
+
if (platform === "win32")
|
|
9
|
+
return "yt-dlp.exe";
|
|
10
|
+
if (platform === "darwin")
|
|
11
|
+
return "yt-dlp_macos";
|
|
12
|
+
if (platform === "linux")
|
|
13
|
+
return arch === "arm64" ? "yt-dlp_linux_aarch64" : "yt-dlp_linux";
|
|
14
|
+
return "yt-dlp_linux";
|
|
15
|
+
}
|
|
16
|
+
export function jsRuntimeArg() {
|
|
17
|
+
return `node:${process.execPath}`;
|
|
18
|
+
}
|
|
19
|
+
function binPath(env = process.env) {
|
|
20
|
+
return path.join(cacheDir(env), "bin", assetName());
|
|
21
|
+
}
|
|
22
|
+
function commandExists(cmd) {
|
|
23
|
+
return new Promise((resolve) => {
|
|
24
|
+
const p = spawn(cmd, ["--version"]);
|
|
25
|
+
p.on("error", () => resolve(false));
|
|
26
|
+
p.on("close", (code) => resolve(code === 0));
|
|
27
|
+
});
|
|
28
|
+
}
|
|
29
|
+
async function fileExists(p) {
|
|
30
|
+
try {
|
|
31
|
+
await fs.access(p);
|
|
32
|
+
return true;
|
|
33
|
+
}
|
|
34
|
+
catch {
|
|
35
|
+
return false;
|
|
36
|
+
}
|
|
37
|
+
}
|
|
38
|
+
async function verifyChecksum(buf, asset) {
|
|
39
|
+
// Best-effort: if SUMS is unreachable, skip rather than break provisioning.
|
|
40
|
+
let sums;
|
|
41
|
+
try {
|
|
42
|
+
const res = await fetch(`${RELEASE_BASE}/SHA2-256SUMS`);
|
|
43
|
+
if (!res.ok)
|
|
44
|
+
return;
|
|
45
|
+
sums = await res.text();
|
|
46
|
+
}
|
|
47
|
+
catch {
|
|
48
|
+
return;
|
|
49
|
+
}
|
|
50
|
+
const line = sums.split("\n").find((l) => l.trim().endsWith(asset));
|
|
51
|
+
if (!line)
|
|
52
|
+
return;
|
|
53
|
+
const expected = line.trim().split(/\s+/)[0].toLowerCase();
|
|
54
|
+
const actual = createHash("sha256").update(buf).digest("hex");
|
|
55
|
+
if (expected && actual !== expected) {
|
|
56
|
+
throw new Error(`yt-dlp checksum mismatch for ${asset}: expected ${expected}, got ${actual}`);
|
|
57
|
+
}
|
|
58
|
+
}
|
|
59
|
+
async function download(env) {
|
|
60
|
+
const asset = assetName();
|
|
61
|
+
const dest = binPath(env);
|
|
62
|
+
await fs.mkdir(path.dirname(dest), { recursive: true });
|
|
63
|
+
const res = await fetch(`${RELEASE_BASE}/${asset}`);
|
|
64
|
+
if (!res.ok)
|
|
65
|
+
throw new Error(`Failed to download yt-dlp (${asset}): HTTP ${res.status}`);
|
|
66
|
+
const buf = Buffer.from(await res.arrayBuffer());
|
|
67
|
+
await verifyChecksum(buf, asset);
|
|
68
|
+
await fs.writeFile(dest, buf);
|
|
69
|
+
if (process.platform !== "win32")
|
|
70
|
+
await fs.chmod(dest, 0o755);
|
|
71
|
+
return dest;
|
|
72
|
+
}
|
|
73
|
+
export async function ensureYtDlp(env = process.env) {
|
|
74
|
+
const override = env.YT_DLP_PATH?.trim();
|
|
75
|
+
if (override)
|
|
76
|
+
return override;
|
|
77
|
+
if (await commandExists("yt-dlp"))
|
|
78
|
+
return "yt-dlp";
|
|
79
|
+
const cached = binPath(env);
|
|
80
|
+
if (await fileExists(cached))
|
|
81
|
+
return cached;
|
|
82
|
+
try {
|
|
83
|
+
return await download(env);
|
|
84
|
+
}
|
|
85
|
+
catch (e) {
|
|
86
|
+
const msg = e instanceof Error ? e.message : String(e);
|
|
87
|
+
throw new Error(`Could not provision yt-dlp automatically (${msg}). Install it manually — ` +
|
|
88
|
+
`macOS: 'brew install yt-dlp'; Linux: 'pipx install yt-dlp'; ` +
|
|
89
|
+
`Windows: 'winget install yt-dlp' — or set YT_DLP_PATH to an existing binary.`);
|
|
90
|
+
}
|
|
91
|
+
}
|
package/package.json
ADDED
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
{
|
|
2
|
+
"name": "ytb-tools",
|
|
3
|
+
"version": "0.1.0",
|
|
4
|
+
"description": "Keyless MCP server for YouTube search, transcript extraction (auto-provisioned yt-dlp), and host-driven summarization.",
|
|
5
|
+
"type": "module",
|
|
6
|
+
"license": "MIT",
|
|
7
|
+
"author": "aliildan",
|
|
8
|
+
"homepage": "https://github.com/aliildan/ytb-tools#readme",
|
|
9
|
+
"repository": {
|
|
10
|
+
"type": "git",
|
|
11
|
+
"url": "git+https://github.com/aliildan/ytb-tools.git"
|
|
12
|
+
},
|
|
13
|
+
"bugs": {
|
|
14
|
+
"url": "https://github.com/aliildan/ytb-tools/issues"
|
|
15
|
+
},
|
|
16
|
+
"keywords": [
|
|
17
|
+
"mcp",
|
|
18
|
+
"model-context-protocol",
|
|
19
|
+
"youtube",
|
|
20
|
+
"transcript",
|
|
21
|
+
"captions",
|
|
22
|
+
"summarize",
|
|
23
|
+
"claude",
|
|
24
|
+
"yt-dlp"
|
|
25
|
+
],
|
|
26
|
+
"bin": {
|
|
27
|
+
"ytb-tools": "dist/index.js"
|
|
28
|
+
},
|
|
29
|
+
"files": [
|
|
30
|
+
"dist"
|
|
31
|
+
],
|
|
32
|
+
"engines": {
|
|
33
|
+
"node": ">=20"
|
|
34
|
+
},
|
|
35
|
+
"publishConfig": {
|
|
36
|
+
"access": "public"
|
|
37
|
+
},
|
|
38
|
+
"scripts": {
|
|
39
|
+
"build": "tsc -p tsconfig.json",
|
|
40
|
+
"dev": "tsx src/index.ts",
|
|
41
|
+
"test": "vitest run",
|
|
42
|
+
"test:watch": "vitest",
|
|
43
|
+
"prepare": "npm run build",
|
|
44
|
+
"prepublishOnly": "npm run build && npm test"
|
|
45
|
+
},
|
|
46
|
+
"dependencies": {
|
|
47
|
+
"@modelcontextprotocol/sdk": "^1.29.0",
|
|
48
|
+
"env-paths": "^4.0.0",
|
|
49
|
+
"youtubei.js": "^17.2.0",
|
|
50
|
+
"zod": "^4.4.3"
|
|
51
|
+
},
|
|
52
|
+
"devDependencies": {
|
|
53
|
+
"@types/node": "^26.0.1",
|
|
54
|
+
"tsx": "^4.22.4",
|
|
55
|
+
"typescript": "^6.0.3",
|
|
56
|
+
"vitest": "^4.1.9"
|
|
57
|
+
}
|
|
58
|
+
}
|