yt-liked 0.2.0-alpha.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,203 @@
+ # yt-liked
+
+ `yt-liked` gives you the `ytl` command: an archive-first CLI for your YouTube liked videos.
+
+ It provides a local workflow that already works today:
+
+ - import an existing likes export
+ - repair collapsed uploader metadata
+ - classify it with Gemini, Claude, or Codex
+ - search it with local FTS
+ - inspect it in the terminal
+ - keep everything on your machine
+
+ It also includes an exploratory browser-session probe for future native sync work; that part is deliberately not presented as solved yet.
+
+ ## What Works Today
+
+ - Local-first JSONL + SQLite archive under `~/.yt-liked`
+ - Uploader metadata repair via YouTube oEmbed
+ - Interactive classify setup with a YouTube-themed engine picker
+ - Gemini API-key classification with resumable batching and concurrency
+ - Claude CLI and Codex CLI classification support
+ - Archive search, list, show, viz, status, and path commands
+ - Browser-session feasibility probe for YouTube Likes
+
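The archive is plain JSONL: one JSON object per line in `~/.yt-liked/videos.jsonl`. A minimal sketch of what one record might look like; the field names `video_id`, `url`, `channel_title`, and `channel_id` are the ones the enrichment code reads, but the real schema may carry additional fields:

```javascript
// Hypothetical shape of one line in ~/.yt-liked/videos.jsonl.
// Field names follow the enrichment code; other fields may exist.
const line = JSON.stringify({
  video_id: 'dQw4w9WgXcQ',
  url: 'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
  channel_title: 'Example Channel',
  channel_id: '@example',
});

// JSONL means each line parses independently of the rest of the file.
const record = JSON.parse(line);
console.log(record.video_id);
```

Because every line is self-contained, the archive can be repaired or re-indexed record by record without loading the whole file into a database first.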
+ ## What Is Experimental
+
+ - `ytl sync` is still a research/probe command
+ - uploader/channel metadata may need the built-in enrichment pass depending on your source archive
+ - the main supported path today is `import -> enrich-channels -> classify -> search/list/show/viz`
+
+ ## Principles
+
+ - Local-first
+ - No telemetry
+ - No hosted sync service
+ - No hosted classification service
+ - Gemini uses your own `GEMINI_API_KEY` or `GOOGLE_API_KEY`
+ - Claude and Codex reuse your existing local CLI login
+ - Saved local config lives in `~/.yt-liked/.env.local`
+
+ ## Install
+
+ `ytl` currently targets macOS first, especially for the exploratory Chrome-session sync probe.
+
+ Alpha install:
+
+ ```bash
+ npm install -g yt-liked@alpha
+ ```
+
+ One-off use:
+
+ ```bash
+ npx yt-liked@alpha status
+ ```
+
+ Local development:
+
+ ```bash
+ npm install
+ npm run build
+ npm link
+ ```
+
+ Then use the `ytl` command normally.
+
+ ## Quick Start
+
+ Import an existing archive:
+
+ ```bash
+ ytl import <path-to-liked_videos.json>
+ ```
+
+ Classify it:
+
+ ```bash
+ ytl enrich-channels
+ ytl classify
+ ```
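`ytl enrich-channels` repairs collapsed uploader metadata by asking YouTube's public oEmbed endpoint about each suspect video. The endpoint and query shape below match the package's enrichment code; the `buildOEmbedUrl` helper name is illustrative, not part of the `ytl` API:

```javascript
// Build the oEmbed URL the enrichment pass fetches for one video.
// buildOEmbedUrl is an illustrative helper, not part of ytl itself.
function buildOEmbedUrl(videoUrl) {
  return `https://www.youtube.com/oembed?url=${encodeURIComponent(videoUrl)}&format=json`;
}

console.log(buildOEmbedUrl('https://www.youtube.com/watch?v=dQw4w9WgXcQ'));
// The JSON response carries author_name and author_url, which ytl
// normalizes into channel_title and channel_id.
```

When a video disables or breaks oEmbed, the enrichment pass falls back to scraping the watch page for the owner's channel name.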
+
+ Search it:
+
+ ```bash
+ ytl search "sqlite"
+ ```
+
+ Inspect it:
+
+ ```bash
+ ytl list --limit 5
+ ytl show <stored-id-or-video-id-or-url>
+ ytl viz
+ ytl status
+ ```
+
+ ## Classification Engines
+
+ `ytl classify` and `ytl classify-domains` support three engines:
+
+ - `gemini`
+ - `claude`
+ - `codex`
+
+ If you omit `--engine` in an interactive terminal, `ytl` opens an engine picker.
+
+ ### Gemini
+
+ Gemini is the default guided path.
+
+ - Model default: `models/gemini-3.1-flash-lite-preview`
+ - Default concurrency: `10`
+ - Default batch size: `50`
+
+ If no Gemini key is configured, `ytl` opens a hidden paste-to-save prompt and writes the key to:
+
+ ```bash
+ ~/.yt-liked/.env.local
+ ```
+
+ If you launch `ytl classify` interactively without custom flags, `ytl` also offers these launch profiles:
+
+ - `Rocket`: preview Flash Lite, batch `50`, workers `10`
+ - `Balanced`: preview Flash Lite, batch `50`, workers `6`
+ - `Careful`: preview Flash Lite, batch `25`, workers `3`
+
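Batch size and worker count interact simply: pending videos are split into batches, and up to the worker count of batches are in flight at once. A rough sketch of the arithmetic only, not the actual classifier (which also resumes partially completed runs); `planBatches` is an illustrative helper:

```javascript
// Illustrative: how many requests a classification run takes at a given
// batch size. Mirrors the documented Gemini defaults (batch 50, workers 10).
function planBatches(totalVideos, batchSize) {
  const batches = Math.ceil(totalVideos / batchSize);
  return { batches, lastBatch: totalVideos - (batches - 1) * batchSize };
}

console.log(planBatches(1230, 50)); // 25 batches; the last one holds 30 videos
```

Smaller batches mean more requests but less rework when a batch fails, which is why the `Careful` profile halves the batch size as well as the worker count.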
+ ### Claude and Codex
+
+ Claude and Codex follow the same local-CLI model as Field Theory:
+
+ - `claude` uses your existing Claude CLI login
+ - `codex` uses your existing Codex CLI login
+ - both run single-worker classification for stability
+
+ Examples:
+
+ ```bash
+ ytl classify --engine gemini
+ ytl classify --engine claude
+ ytl classify-domains --engine codex
+ ```
+
+ ## Commands
+
+ ### Archive
+
+ ```bash
+ ytl import <path>
+ ytl enrich-channels [--limit <n>] [--concurrency <n>] [--force]
+ ytl search <query> [--channel <name>] [--category <slug>] [--domain <slug>] [--limit <n>] [--json]
+ ytl list [--query <q>] [--channel <name>] [--category <slug>] [--domain <slug>] [--privacy <value>] [--after <date>] [--before <date>] [--limit <n>] [--offset <n>] [--json]
+ ytl show <stored-id-or-video-id-or-url> [--json]
+ ytl viz
+ ytl stats
+ ytl status
+ ytl path
+ ```
+
+ ### Classification
+
+ ```bash
+ ytl classify [--engine <gemini|claude|codex>] [--model <name>] [--batch-size <n>] [--concurrency <n>] [--limit <n>]
+ ytl classify-domains [--all] [--engine <gemini|claude|codex>] [--model <name>] [--batch-size <n>] [--concurrency <n>] [--limit <n>]
+ ```
+
+ ### Experimental Sync Probe
+
+ ```bash
+ ytl sync [--max-pages <n>] [--delay-ms <ms>] [--max-minutes <n>] [--chrome-user-data-dir <path>] [--chrome-profile-directory <name>]
+ ```
+
+ This command is deliberately framed as exploratory. It tests whether the logged-in YouTube web client can expose more history than the current archive ceiling on your machine.
+
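The probe reuses Chrome's logged-in session, which hinges on the SAPISID cookie. As background, this is the commonly documented way YouTube's own web client authorizes API calls with that cookie; this sketch is not necessarily what `ytl sync` sends:

```javascript
import { createHash } from 'node:crypto';

// Background sketch: YouTube's web client builds an Authorization header
// from a unix timestamp, the SAPISID cookie value, and the page origin,
// hashed with SHA-1. Not guaranteed to match ytl's probe exactly.
function sapisidHashHeader(sapisid, origin = 'https://www.youtube.com') {
  const timestamp = Math.floor(Date.now() / 1000);
  const digest = createHash('sha1')
    .update(`${timestamp} ${sapisid} ${origin}`)
    .digest('hex');
  return `SAPISIDHASH ${timestamp}_${digest}`;
}

console.log(sapisidHashHeader('example-cookie-value'));
```

Because the hash embeds the timestamp, the header is short-lived and must be recomputed per request.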
+ ## Local Files
+
+ - `~/.yt-liked/videos.jsonl`
+ - `~/.yt-liked/videos.db`
+ - `~/.yt-liked/videos-meta.json`
+ - `~/.yt-liked/videos-backfill-state.json`
+ - `~/.yt-liked/.env.local`
+
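All state lives under a single directory in your home folder. A sketch of how those paths resolve with Node's standard `os` and `path` modules; the directory and file names are exactly the ones listed above, while the object keys are illustrative:

```javascript
import { homedir } from 'node:os';
import { join } from 'node:path';

// Everything ytl persists lives under ~/.yt-liked.
const root = join(homedir(), '.yt-liked');
const files = {
  archive: join(root, 'videos.jsonl'),
  index: join(root, 'videos.db'),
  meta: join(root, 'videos-meta.json'),
  backfillState: join(root, 'videos-backfill-state.json'),
  localEnv: join(root, '.env.local'),
};

console.log(files.archive); // e.g. /Users/you/.yt-liked/videos.jsonl
```

Keeping everything under one root makes backup and removal trivial: copy or delete `~/.yt-liked` and the whole archive goes with it.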
+ ## Development
+
+ Build and test:
+
+ ```bash
+ npm run build
+ npm test
+ ```
+
+ Create a publish tarball locally:
+
+ ```bash
+ npm pack
+ ```
+
+ ## Current Limitations
+
+ - Native YouTube Likes sync is not production-ready yet
+ - Videos that still lack uploader metadata after enrichment are mostly private or deleted, and `ytl viz` no longer attributes them to your profile
+ - Some imported archives collapse uploader metadata to the playlist owner, so `ytl enrich-channels` is the recommended repair step after import
+ - `ytl` is strongest today as a personal archive workflow, not as a full YouTube data replacement
package/bin/ytl.mjs ADDED
@@ -0,0 +1,2 @@
+ #!/usr/bin/env node
+ import('../dist/cli.js').then(({ run }) => run(process.argv));
@@ -0,0 +1,209 @@
+ import { buildIndex } from './videos-db.js';
+ import { readVideoArchive, writeJsonLines } from './jsonl.js';
+ import { videosJsonlPath } from './paths.js';
+ function normalizeString(value) {
+     if (typeof value !== 'string')
+         return null;
+     const trimmed = value.trim();
+     return trimmed.length > 0 ? trimmed : null;
+ }
+ function signatureKey(record) {
+     return `${record.channel_title ?? ''}||${record.channel_id ?? ''}`;
+ }
+ function detectDominantFallback(records) {
+     if (records.length === 0) {
+         return { title: null, id: null };
+     }
+     const counts = new Map();
+     for (const record of records) {
+         const key = signatureKey(record);
+         const entry = counts.get(key);
+         if (entry) {
+             entry.count += 1;
+         }
+         else {
+             counts.set(key, {
+                 count: 1,
+                 title: record.channel_title ?? null,
+                 id: record.channel_id ?? null,
+             });
+         }
+     }
+     const dominant = Array.from(counts.values()).sort((a, b) => b.count - a.count)[0];
+     if (!dominant) {
+         return { title: null, id: null };
+     }
+     const title = normalizeString(dominant.title);
+     const suspiciousShare = dominant.count / Math.max(records.length, 1);
+     const shortTitle = !title || title.length <= 2;
+     if (suspiciousShare < 0.4 && !shortTitle) {
+         return { title: null, id: null };
+     }
+     return {
+         title,
+         id: normalizeString(dominant.id),
+     };
+ }
+ function parseChannelKey(authorUrl) {
+     if (!authorUrl)
+         return null;
+     try {
+         const url = new URL(authorUrl);
+         const parts = url.pathname.split('/').filter(Boolean);
+         if (parts.length === 0)
+             return authorUrl;
+         if (parts[0]?.startsWith('@'))
+             return parts[0];
+         if ((parts[0] === 'channel' || parts[0] === 'user' || parts[0] === 'c') && parts[1]) {
+             return parts[1];
+         }
+         return parts.join('/');
+     }
+     catch {
+         return authorUrl;
+     }
+ }
+ function shouldEnrich(record, fallback, force) {
+     if (!record.video_id && !record.url)
+         return false;
+     if (force)
+         return true;
+     const title = normalizeString(record.channel_title);
+     const id = normalizeString(record.channel_id);
+     if (!title)
+         return true;
+     if (title.length <= 2)
+         return true;
+     if (fallback.title && title === fallback.title) {
+         if (!fallback.id)
+             return true;
+         return id === fallback.id;
+     }
+     return false;
+ }
+ async function fetchOEmbed(url) {
+     const endpoint = `https://www.youtube.com/oembed?url=${encodeURIComponent(url)}&format=json`;
+     const response = await fetch(endpoint, {
+         headers: {
+             accept: 'application/json',
+             'user-agent': 'ytl/0.2.0 (+https://github.com/afar1/fieldtheory-cli-inspired)',
+         },
+     });
+     if (!response.ok) {
+         throw new Error(`oEmbed request failed (${response.status})`);
+     }
+     return await response.json();
+ }
+ function extractWatchPageOwner(html) {
+     const ownerName = html.match(/"ownerChannelName":"([^"]+)"/)?.[1] ?? null;
+     const canonicalBaseUrl = html.match(/"canonicalBaseUrl":"([^"]+)"/)?.[1] ?? null;
+     const ownerProfileUrl = html.match(/"ownerProfileUrl":"([^"]+)"/)?.[1] ?? null;
+     const channelId = html.match(/"channelId":"([^"]+)"/)?.[1] ?? null;
+     const channelTitle = normalizeString(ownerName)
+         ?? normalizeString(canonicalBaseUrl?.split('/').filter(Boolean).at(-1)?.replace(/^@/, ''))
+         ?? normalizeString(ownerProfileUrl?.split('/').filter(Boolean).at(-1)?.replace(/^@/, ''));
+     const channelKey = parseChannelKey(normalizeString(ownerProfileUrl))
+         ?? parseChannelKey(normalizeString(canonicalBaseUrl))
+         ?? normalizeString(channelId);
+     return {
+         channelTitle,
+         channelKey,
+     };
+ }
+ async function fetchWatchPageOwner(url) {
+     const response = await fetch(url, {
+         headers: {
+             accept: 'text/html,application/xhtml+xml',
+             'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.0.0 Safari/537.36',
+         },
+     });
+     if (!response.ok) {
+         throw new Error(`watch page request failed (${response.status})`);
+     }
+     const html = await response.text();
+     return extractWatchPageOwner(html);
+ }
+ async function runConcurrent(items, concurrency, worker) {
+     let nextIndex = 0;
+     async function consume() {
+         while (true) {
+             const index = nextIndex;
+             nextIndex += 1;
+             if (index >= items.length) {
+                 return;
+             }
+             await worker(items[index], index);
+         }
+     }
+     const workers = Array.from({ length: Math.max(1, concurrency) }, () => consume());
+     await Promise.all(workers);
+ }
+ export async function enrichChannels(options = {}) {
+     const records = await readVideoArchive();
+     const fallback = detectDominantFallback(records);
+     const candidates = records
+         .filter((record) => shouldEnrich(record, fallback, Boolean(options.force)))
+         .slice(0, options.limit ?? Number.MAX_SAFE_INTEGER);
+     let completed = 0;
+     let updated = 0;
+     let failed = 0;
+     let skipped = 0;
+     await runConcurrent(candidates, options.concurrency ?? 8, async (record) => {
+         try {
+             let nextTitle = null;
+             let nextId = null;
+             try {
+                 const payload = await fetchOEmbed(record.url);
+                 nextTitle = normalizeString(payload.author_name);
+                 nextId = parseChannelKey(normalizeString(payload.author_url));
+             }
+             catch {
+                 // Some public videos disable or break oEmbed; fall back to the watch page.
+             }
+             if (!nextTitle) {
+                 const watchPage = await fetchWatchPageOwner(record.url);
+                 nextTitle = watchPage.channelTitle;
+                 nextId = nextId ?? watchPage.channelKey;
+             }
+             if (!nextTitle) {
+                 skipped += 1;
+             }
+             else {
+                 let changed = false;
+                 if (record.channel_title !== nextTitle) {
+                     record.channel_title = nextTitle;
+                     changed = true;
+                 }
+                 if (nextId && record.channel_id !== nextId) {
+                     record.channel_id = nextId;
+                     changed = true;
+                 }
+                 if (changed) {
+                     updated += 1;
+                 }
+                 else {
+                     skipped += 1;
+                 }
+             }
+         }
+         catch {
+             failed += 1;
+         }
+         finally {
+             completed += 1;
+             options.onProgress?.(completed, candidates.length);
+         }
+     });
+     if (updated > 0) {
+         writeJsonLines(videosJsonlPath(), records);
+         await buildIndex({ force: true });
+     }
+     return {
+         attempted: candidates.length,
+         updated,
+         failed,
+         skipped,
+         dominantFallbackTitle: fallback.title,
+         dominantFallbackId: fallback.id,
+     };
+ }
@@ -0,0 +1,130 @@
+ import { execFileSync } from 'node:child_process';
+ import { copyFileSync, existsSync, unlinkSync } from 'node:fs';
+ import { tmpdir } from 'node:os';
+ import { join } from 'node:path';
+ import { createDecipheriv, pbkdf2Sync, randomUUID } from 'node:crypto';
+ function getMacOSChromeKey() {
+     const candidates = [
+         { service: 'Chrome Safe Storage', account: 'Chrome' },
+         { service: 'Chrome Safe Storage', account: 'Google Chrome' },
+         { service: 'Google Chrome Safe Storage', account: 'Chrome' },
+         { service: 'Google Chrome Safe Storage', account: 'Google Chrome' },
+     ];
+     for (const candidate of candidates) {
+         try {
+             const password = execFileSync('security', ['find-generic-password', '-w', '-s', candidate.service, '-a', candidate.account], { encoding: 'utf8', stdio: ['pipe', 'pipe', 'pipe'] }).trim();
+             if (password) {
+                 return pbkdf2Sync(password, 'saltysalt', 1003, 16, 'sha1');
+             }
+         }
+         catch {
+             // Try the next naming pair.
+         }
+     }
+     throw new Error('Could not read Chrome Safe Storage from the macOS Keychain.\n' +
+         'Open Google Chrome once, confirm it is installed normally, and try again.');
+ }
+ function decryptCookieValue(encryptedValue, key, dbVersion) {
+     if (encryptedValue.length === 0)
+         return '';
+     if (encryptedValue[0] === 0x76 && encryptedValue[1] === 0x31 && encryptedValue[2] === 0x30) {
+         const iv = Buffer.alloc(16, 0x20);
+         const ciphertext = encryptedValue.subarray(3);
+         const decipher = createDecipheriv('aes-128-cbc', key, iv);
+         let decrypted = decipher.update(ciphertext);
+         decrypted = Buffer.concat([decrypted, decipher.final()]);
+         if (dbVersion >= 24 && decrypted.length > 32) {
+             decrypted = decrypted.subarray(32);
+         }
+         return decrypted.toString('utf8');
+     }
+     return encryptedValue.toString('utf8');
+ }
+ function runSqliteQuery(dbPath, sql) {
+     return execFileSync('sqlite3', ['-json', dbPath, sql], {
+         encoding: 'utf8',
+         stdio: ['pipe', 'pipe', 'pipe'],
+         timeout: 10000,
+     }).trim();
+ }
+ function withReadableDb(dbPath, fn) {
+     try {
+         return fn(dbPath);
+     }
+     catch {
+         const tmpDb = join(tmpdir(), `ytl-cookies-${randomUUID()}.db`);
+         try {
+             copyFileSync(dbPath, tmpDb);
+             return fn(tmpDb);
+         }
+         finally {
+             try {
+                 unlinkSync(tmpDb);
+             }
+             catch {
+                 // Ignore cleanup failures.
+             }
+         }
+     }
+ }
+ function queryDbVersion(dbPath) {
+     return withReadableDb(dbPath, (readablePath) => {
+         const value = execFileSync('sqlite3', [readablePath, "SELECT value FROM meta WHERE key='version';"], {
+             encoding: 'utf8',
+             stdio: ['pipe', 'pipe', 'pipe'],
+             timeout: 5000,
+         }).trim();
+         return Number.parseInt(value, 10) || 0;
+     });
+ }
+ function queryYoutubeCookies(dbPath) {
+     if (!existsSync(dbPath)) {
+         throw new Error(`Chrome Cookies database not found at ${dbPath}`);
+     }
+     const sql = `
+         SELECT
+             name,
+             host_key,
+             value,
+             hex(encrypted_value) AS encrypted_value_hex
+         FROM cookies
+         WHERE host_key LIKE '%youtube.com'
+         ORDER BY host_key DESC, name ASC;
+     `;
+     const raw = withReadableDb(dbPath, (readablePath) => runSqliteQuery(readablePath, sql));
+     if (!raw || raw === '[]')
+         return [];
+     return JSON.parse(raw);
+ }
+ function sanitizeCookieValue(name, value) {
+     const cleaned = value.replace(/\0+$/g, '').trim();
+     if (!cleaned) {
+         throw new Error(`Cookie ${name} was empty after decryption.`);
+     }
+     return cleaned;
+ }
+ export function extractChromeYoutubeCookies(chromeUserDataDir, profileDirectory = 'Default') {
+     const dbPath = join(chromeUserDataDir, profileDirectory, 'Cookies');
+     const key = getMacOSChromeKey();
+     const dbVersion = queryDbVersion(dbPath);
+     const rawCookies = queryYoutubeCookies(dbPath);
+     const cookies = new Map();
+     for (const cookie of rawCookies) {
+         const hexValue = cookie.encrypted_value_hex;
+         const value = hexValue
+             ? decryptCookieValue(Buffer.from(hexValue, 'hex'), key, dbVersion)
+             : cookie.value;
+         if (!value)
+             continue;
+         cookies.set(cookie.name, sanitizeCookieValue(cookie.name, value));
+     }
+     const sapisid = cookies.get('SAPISID') ?? cookies.get('__Secure-1PAPISID') ?? cookies.get('__Secure-3PAPISID');
+     if (!sapisid) {
+         throw new Error('No authenticated YouTube SAPISID cookie was found in Chrome.\n' +
+             'Open Google Chrome, make sure you are logged into YouTube, and try again.');
+     }
+     const cookieHeader = Array.from(cookies.entries())
+         .map(([name, value]) => `${name}=${value}`)
+         .join('; ');
+     return { cookies, cookieHeader, sapisid };
+ }