memex-mvp 0.6.1 → 0.7.0

package/HELP.md CHANGED
@@ -293,6 +293,51 @@ Memex sorts by **relevance** by default

  ---

+ ## 💻 Terminal CLI (v0.7+): when MCP isn't working
+
+ If the MCP integration did not attach to your agent (or you are in an agent without MCP support but with shell access), memex has a **terminal mode** in the same binary. One package, two modes.
+
+ ```bash
+ memex search "Postgres migration" # FTS5 search
+ memex search "Q2 deck" --chat "Memex Bot" # filter by chat title
+ memex search "auth" --source claude-code --limit 5 --sort date_desc
+
+ memex recent --limit 5 # most recent messages
+ memex recent --source telegram
+
+ memex list # all conversations
+ memex list --source web # saved URLs only
+
+ memex get web-1582ab51a7b7 # full content of a conversation
+
+ memex overview # corpus snapshot
+ memex projects # distinct project_paths
+ memex help # this guide in the terminal
+ memex --help # command reference
+ memex --version
+ ```
+
+ **All query commands support `--json`** for pipes and scripts:
+
+ ```bash
+ memex search "TODO" --json | jq '.results[].snippet'
+ memex list --source telegram --json | jq -r '.conversations[].title'
+ memex get web-1582ab51a7b7 --json > backup.json
+ ```
+
+ **The DB is opened read-only**, so it is safe to run while the daemon writer is working.
+
+ **When to use the CLI instead of MCP:**
+
+ - The MCP integration in your agent did not connect → `memex overview` will confirm that memex itself is healthy and the problem is in the client's MCP config
+ - Agents without MCP support (OpenCode + Kimi, any CLI-only agents) but with shell access
+ - Shell scripts / automation
+ - Debugging: "can I see my history directly?"
+
+ **`memex` (with no arguments)** is the MCP stdio server. This is the default behavior for Claude Code / Cursor / Cline through their MCP configs. CLI commands are activated only when a recognized subcommand is present.
+
+ ---
+
  ## If something doesn't work

  ### Search returns nothing
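The `--json` output described in this hunk can also be consumed from a script instead of `jq`. The following is a minimal sketch, assuming the payload shape that the search command in this release prints (`query`, `count`, and `results[]` entries carrying `snippet`, `conversation_id`, and `conversation_title`). The sample payload and the helper name `summarizeSearch` are invented for illustration and are not part of the package.

```javascript
// Hypothetical sample of what `memex search "TODO" --json` might print.
// Field names follow the search command in this diff; the values are invented.
const sampleOutput = JSON.stringify({
  query: 'TODO',
  count: 2,
  results: [
    { source: 'telegram', conversation_id: 'tg-1', conversation_title: 'Memex Bot',
      ts: 1700000000, snippet: 'ship the <<TODO>> list' },
    { source: 'web', conversation_id: 'web-2', conversation_title: null,
      ts: 0, snippet: '<<TODO>>: write docs' },
  ],
});

// Turn the JSON payload into one plain-text line per hit.
function summarizeSearch(jsonText) {
  const payload = JSON.parse(jsonText);
  return payload.results.map((r) => {
    // The CLI wraps matched terms in << >>; strip the markers for plain output.
    const snippet = r.snippet.replace(/<<(.+?)>>/g, '$1');
    // Fall back to the id when a conversation has no title.
    const label = r.conversation_title || r.conversation_id;
    return `${label}: ${snippet}`;
  });
}

console.log(summarizeSearch(sampleOutput).join('\n'));
```

In real use the string would come from the CLI's stdout rather than a literal.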
package/README.md CHANGED
@@ -100,6 +100,28 @@ For a fully-automated install across all detected MCP clients, see [the AI-drive

  ---

+ ## Terminal CLI (v0.7+): query memex without MCP
+
+ The same `memex` binary that runs as an MCP server also has a terminal mode for direct queries. Useful when MCP isn't wired up, when you want to pipe results into shell scripts, or when debugging MCP-config issues:
+
+ ```sh
+ memex search "Postgres migration" # full-text search
+ memex search "Q2 deck" --chat "Memex Bot" # scope to one conversation by title
+ memex recent --limit 5 # last 5 messages across all sources
+ memex list --source web # all saved URLs
+ memex get web-1582ab51a7b7 # full content of one conversation
+ memex overview # snapshot of corpus
+ memex projects # distinct project_paths captured
+ memex help # full user guide (HELP.md)
+ memex --help # command reference
+ ```
+
+ Every query supports `--json` for machine-readable output: `memex search foo --json | jq '.results[].snippet'`. The DB is opened **read-only**, so it is safe to run while the `memex-sync` daemon is writing.
+
+ When called **without arguments** (`memex`), the binary still runs as an MCP stdio server (the way Claude Code / Cursor / Cline launch it). CLI mode and MCP mode are the same package, with no extra install.
+
+ ---
+
  ## Save URLs into memex (v0.6+)

  Once memex is installed, any MCP-aware agent can also save **web pages, AI chat shares, and pasted text** into your memex memory, searchable from any other AI chat later. In Claude Code, Cursor, Cline, …:
package/README.ru.md CHANGED
@@ -136,6 +136,32 @@ curl -fsSL https://raw.githubusercontent.com/parallelclaw/memex-mvp/main/skills/

  Full guide and edge cases (private Perplexity, paywalls, login walls): [HELP.md §8](HELP.md).

+ ### Terminal CLI (v0.7+): query memex without MCP
+
+ The same `memex` binary that runs as an MCP server has a **terminal mode** for direct queries. Useful when MCP is not configured, when you want to pipe results into shell scripts, or when debugging the MCP config:
+
+ ```bash
+ memex search "Postgres migration" # full-text search
+ memex search "Q2 deck" --chat "Memex Bot" # narrow to a specific chat by title
+ memex recent --limit 5 # last 5 messages across all sources
+ memex list --source web # all saved URLs
+ memex get web-1582ab51a7b7 # full content of one conversation
+ memex overview # corpus snapshot
+ memex projects # distinct project_paths
+ memex help # full user guide (HELP.md)
+ memex --help # command reference
+ ```
+
+ Every query subcommand has `--json` for machine-readable output: `memex search foo --json | jq '.results[].snippet'`. The DB is opened **read-only**, so it is safe to run while the daemon is writing.
+
+ When launched **without arguments** (`memex`), the binary still runs as an MCP stdio server (which is how Claude Code / Cursor / Cline invoke it from their configs). CLI mode and MCP mode are the same package, with no extra install.
+
+ **Use the CLI when:**
+ - The MCP integration did not attach to your agent → `memex overview` will confirm that memex itself is healthy
+ - The agent has no MCP support but does have shell access
+ - You want to pipe results: `memex search foo --json | jq ...`
+ - You want to dump a full transcript to stdout for context
+
  ### Connecting to Claude Code

  First, get **two absolute paths** in the terminal:
package/lib/cli/index.js ADDED
@@ -0,0 +1,513 @@
+ /**
+  * memex CLI — terminal-mode subcommands for the `memex` binary.
+  *
+  * When the user invokes the `memex` bin with a recognized subcommand
+  * (search / recent / list / get / overview / projects / help / --help
+  * / --version), we run a one-shot query and exit. When called WITHOUT
+  * any argument, server.js falls through to MCP-stdio mode (the
+  * primary mode used by Claude Code, Cursor, Cline, Continue, Zed).
+  *
+  * The CLI opens memex.db in read-only mode and uses WAL-friendly
+  * queries — safe to run while memex-sync daemon is writing.
+  *
+  * Why duplicate SQL from server.js? The MCP handlers in server.js
+  * are tightly coupled with the JSON-RPC response shape (jsonResult /
+  * textResult, half-life-boost params, group_by_conversation, …).
+  * Replicating the simple queries here keeps the CLI self-contained
+  * and avoids a risky refactor of the production MCP path. The CLI
+  * intentionally exposes the MOST USEFUL subset — not every MCP tool
+  * has a CLI peer.
+  *
+  * Output format:
+  *   default → human-friendly markdown with light ANSI colors (TTY only)
+  *   --json → structured JSON for shell pipelines / agents
+  */
+
+ import Database from 'better-sqlite3';
+ import { join } from 'node:path';
+ import { homedir } from 'node:os';
+ import { existsSync, readFileSync } from 'node:fs';
+ import { fileURLToPath } from 'node:url';
+
+ // ---------- Subcommand registry ----------
+ export const CLI_SUBCOMMAND_NAMES = [
+   'search', 'recent', 'list', 'get', 'overview',
+   'projects', 'help', '-h', '--help', '-v', '--version',
+ ];
+
+ // ---------- Path helpers ----------
+ const HOME = homedir();
+ const MEMEX_DIR = process.env.MEMEX_DIR || join(HOME, '.memex');
+ const DB_PATH = join(MEMEX_DIR, 'data', 'memex.db');
+ // HELP.md lives at the package root, two levels up from lib/cli/
+ const PACKAGE_ROOT = fileURLToPath(new URL('../../', import.meta.url));
+ const HELP_MD_PATH = join(PACKAGE_ROOT, 'HELP.md');
+
+ // ---------- ANSI helpers ----------
+ const TTY = process.stdout.isTTY;
+ const c = TTY
+   ? {
+       dim: (s) => `\x1b[2m${s}\x1b[0m`,
+       bold: (s) => `\x1b[1m${s}\x1b[0m`,
+       cyan: (s) => `\x1b[36m${s}\x1b[0m`,
+       green: (s) => `\x1b[32m${s}\x1b[0m`,
+       yellow:(s) => `\x1b[33m${s}\x1b[0m`,
+     }
+   : {
+       dim: (s) => s, bold: (s) => s, cyan: (s) => s,
+       green: (s) => s, yellow: (s) => s,
+     };
+
+ // ---------- argv parser (minimal, no deps) ----------
+ function parseArgs(argv) {
+   const opts = {};
+   const positionals = [];
+   for (let i = 0; i < argv.length; i++) {
+     const a = argv[i];
+     if (a === '--json') opts.json = true;
+     else if (a === '--limit') opts.limit = parseInt(argv[++i], 10);
+     else if (a === '--source') opts.source = argv[++i];
+     else if (a === '--chat') opts.chat = argv[++i];
+     else if (a === '--project') opts.project = argv[++i];
+     else if (a === '--sort') opts.sort = argv[++i];
+     else if (a === '--include-archived') opts.includeArchived = true;
+     else if (a === '--help' || a === '-h') opts.help = true;
+     else if (a.startsWith('--')) { /* ignore unknown flag for forward-compat */ }
+     else positionals.push(a);
+   }
+   return { opts, positionals };
+ }
+
+ function openDb() {
+   if (!existsSync(DB_PATH)) {
+     console.error(`memex.db not found at ${DB_PATH}`);
+     console.error(`Run 'memex-sync install' to set up the daemon and create the DB.`);
+     process.exit(1);
+   }
+   // Read-only handle: WAL allows this to coexist with the writing daemon.
+   return new Database(DB_PATH, { readonly: true, fileMustExist: true });
+ }
+
+ function fmtDate(ts) {
+   if (!ts || ts === 0) return '?';
+   return new Date(ts * 1000).toISOString().slice(0, 10);
+ }
+
+ function fmtDateTime(ts) {
+   if (!ts || ts === 0) return '?';
+   return new Date(ts * 1000).toISOString().slice(0, 16).replace('T', ' ');
+ }
+
+ // FTS5 expects sanitized tokens — strip what would be operators
+ function sanitizeFtsQuery(q) {
+   return String(q || '')
+     .trim()
+     .replace(/[^\p{L}\p{N}_\-\s"]/gu, ' ')
+     .replace(/\s+/g, ' ')
+     .trim();
+ }
+
+ // =============================================================
+ // SEARCH
+ // =============================================================
+ async function cmdSearch(args) {
+   const { opts, positionals } = parseArgs(args);
+   const query = positionals.join(' ').trim();
+   if (!query || opts.help) {
+     console.error('Usage: memex search "<query>" [--source X] [--chat X] [--project X] [--sort SORT] [--limit N] [--json]');
+     console.error(' --sort: relevance (default) | date_asc | date_desc');
+     process.exit(query ? 0 : 2);
+   }
+   const limit = Math.min(50, Math.max(1, opts.limit || 10));
+   const sanitized = sanitizeFtsQuery(query);
+   if (!sanitized) {
+     console.error('Query became empty after sanitization — try simpler keywords.');
+     process.exit(2);
+   }
+
+   const filters = ['messages_fts MATCH ?'];
+   const params = [sanitized];
+   if (opts.source) {
+     filters.push('m.source = ?');
+     params.push(opts.source);
+   }
+   if (!opts.includeArchived) {
+     filters.push('(c.archived_at IS NULL OR c.archived_at = 0)');
+   }
+   if (opts.project) {
+     filters.push('c.project_path LIKE ?');
+     params.push(`%${opts.project}%`);
+   }
+   if (opts.chat) {
+     filters.push('LOWER(c.title) LIKE LOWER(?)');
+     params.push(`%${opts.chat}%`);
+   }
+
+   let orderBy;
+   if (opts.sort === 'date_asc') {
+     orderBy = 'CASE WHEN m.ts IS NULL OR m.ts = 0 THEN 1 ELSE 0 END, m.ts ASC';
+   } else if (opts.sort === 'date_desc') {
+     orderBy = 'CASE WHEN m.ts IS NULL OR m.ts = 0 THEN 1 ELSE 0 END, m.ts DESC';
+   } else {
+     // Same BM25 × recency formula as memex_search, with half_life = 30 days
+     orderBy = `bm25(messages_fts) * exp(-(CAST(strftime('%s','now') AS REAL) - COALESCE(NULLIF(m.ts, 0), CAST(strftime('%s','now') AS REAL))) / 86400.0 / 30.0)`;
+   }
+
+   const sql = `
+     SELECT m.source, m.conversation_id, m.role, m.sender, m.ts,
+            snippet(messages_fts, 0, '<<', '>>', ' … ', 18) AS snippet,
+            c.title AS conversation_title
+     FROM messages_fts
+     JOIN messages m ON m.id = messages_fts.rowid
+     LEFT JOIN conversations c ON c.conversation_id = m.conversation_id
+     WHERE ${filters.join(' AND ')}
+     ORDER BY ${orderBy}
+     LIMIT ?
+   `;
+   const db = openDb();
+   const rows = db.prepare(sql).all(...params, limit);
+   db.close();
+
+   if (opts.json) {
+     console.log(JSON.stringify({ query, count: rows.length, results: rows }, null, 2));
+     return;
+   }
+
+   if (rows.length === 0) {
+     console.log(`No results for ${c.bold('"' + query + '"')}`);
+     return;
+   }
+   console.log(`${c.bold(rows.length)} result(s) for ${c.bold('"' + query + '"')}\n`);
+   for (const r of rows) {
+     console.log(`${c.cyan(r.conversation_title || r.conversation_id)} ${c.dim('· ' + r.source + ' · ' + fmtDate(r.ts))}`);
+     console.log(` ${r.snippet.replace(/<<(.+?)>>/g, (_, m) => c.yellow(m))}`);
+     console.log(` ${c.dim('conversation_id: ' + r.conversation_id)}`);
+     console.log('');
+   }
+ }
+
+ // =============================================================
+ // RECENT
+ // =============================================================
+ async function cmdRecent(args) {
+   const { opts } = parseArgs(args);
+   if (opts.help) {
+     console.error('Usage: memex recent [--limit N] [--source X] [--json]');
+     process.exit(0);
+   }
+   const limit = Math.min(100, Math.max(1, opts.limit || 20));
+   const filters = [];
+   const params = [];
+   if (opts.source) {
+     filters.push('m.source = ?');
+     params.push(opts.source);
+   }
+   if (!opts.includeArchived) {
+     filters.push('(c.archived_at IS NULL OR c.archived_at = 0)');
+   }
+   const where = filters.length ? `WHERE ${filters.join(' AND ')}` : '';
+   const sql = `
+     SELECT m.source, m.conversation_id, m.role, m.sender, m.ts,
+            substr(m.text, 1, 240) AS preview,
+            c.title AS conversation_title
+     FROM messages m
+     LEFT JOIN conversations c ON c.conversation_id = m.conversation_id
+     ${where}
+     ORDER BY m.ts DESC
+     LIMIT ?
+   `;
+   const db = openDb();
+   const rows = db.prepare(sql).all(...params, limit);
+   db.close();
+
+   if (opts.json) {
+     console.log(JSON.stringify({ count: rows.length, results: rows }, null, 2));
+     return;
+   }
+   console.log(`${c.bold(rows.length)} recent message(s)\n`);
+   for (const r of rows) {
+     console.log(`${c.cyan(r.conversation_title || r.conversation_id)} ${c.dim('· ' + r.source + ' · ' + fmtDateTime(r.ts))}`);
+     console.log(` ${c.dim(r.role + ':')} ${r.preview.replace(/\s+/g, ' ').trim()}`);
+     console.log('');
+   }
+ }
+
+ // =============================================================
+ // LIST conversations
+ // =============================================================
+ async function cmdList(args) {
+   const { opts } = parseArgs(args);
+   if (opts.help) {
+     console.error('Usage: memex list [--source X] [--limit N] [--json]');
+     process.exit(0);
+   }
+   const limit = Math.min(200, Math.max(1, opts.limit || 20));
+   const filters = [];
+   const params = [];
+   if (opts.source) {
+     filters.push('source = ?');
+     params.push(opts.source);
+   }
+   if (!opts.includeArchived) {
+     filters.push('(archived_at IS NULL OR archived_at = 0)');
+   }
+   filters.push("(parent_conversation_id IS NULL)"); // skip subagents by default
+   const where = filters.length ? `WHERE ${filters.join(' AND ')}` : '';
+   const sql = `
+     SELECT conversation_id, source, title, first_ts, last_ts, message_count
+     FROM conversations
+     ${where}
+     ORDER BY last_ts DESC
+     LIMIT ?
+   `;
+   const db = openDb();
+   const rows = db.prepare(sql).all(...params, limit);
+   db.close();
+
+   if (opts.json) {
+     console.log(JSON.stringify({ count: rows.length, conversations: rows }, null, 2));
+     return;
+   }
+   console.log(`${c.bold(rows.length)} conversation(s)\n`);
+   for (const r of rows) {
+     console.log(`${c.cyan(r.title || r.conversation_id)}`);
+     console.log(` ${c.dim(r.source + ' · ' + r.message_count + ' msgs · ' + fmtDate(r.first_ts) + ' → ' + fmtDate(r.last_ts))}`);
+     console.log(` ${c.dim(r.conversation_id)}`);
+     console.log('');
+   }
+ }
+
+ // =============================================================
+ // GET full conversation
+ // =============================================================
+ async function cmdGet(args) {
+   const { opts, positionals } = parseArgs(args);
+   const convId = positionals[0];
+   if (!convId || opts.help) {
+     console.error('Usage: memex get <conversation_id> [--limit N] [--json]');
+     console.error('Find conversation_ids via `memex list` or `memex search`.');
+     process.exit(convId ? 0 : 2);
+   }
+   const limit = Math.min(2000, Math.max(1, opts.limit || 200));
+   const db = openDb();
+   const conv = db
+     .prepare(`SELECT * FROM conversations WHERE conversation_id = ?`)
+     .get(convId);
+   if (!conv) {
+     db.close();
+     console.error(`No conversation found for id: ${convId}`);
+     process.exit(1);
+   }
+   const msgs = db
+     .prepare(`
+       SELECT role, sender, text, ts
+       FROM messages
+       WHERE conversation_id = ?
+       ORDER BY ts ASC, id ASC
+       LIMIT ?
+     `)
+     .all(convId, limit);
+   db.close();
+
+   if (opts.json) {
+     console.log(JSON.stringify({ conversation: conv, messages: msgs }, null, 2));
+     return;
+   }
+   console.log(`# ${conv.title || conv.conversation_id}`);
+   console.log(`${c.dim(conv.source + ' · ' + msgs.length + ' message(s) · ' + fmtDate(conv.first_ts) + ' → ' + fmtDate(conv.last_ts))}`);
+   console.log('');
+   for (const m of msgs) {
+     console.log(`${c.cyan(m.role + ' (' + m.sender + ')')} ${c.dim(fmtDateTime(m.ts))}`);
+     console.log(m.text);
+     console.log('');
+   }
+ }
+
+ // =============================================================
+ // OVERVIEW
+ // =============================================================
+ async function cmdOverview(args) {
+   const { opts } = parseArgs(args);
+   const db = openDb();
+   const sources = db.prepare(`
+     SELECT source, COUNT(*) AS msgs, COUNT(DISTINCT conversation_id) AS chats,
+            MIN(ts) AS first_ts, MAX(ts) AS last_ts
+     FROM messages
+     GROUP BY source
+     ORDER BY msgs DESC
+   `).all();
+   const totalMsgs = db.prepare(`SELECT COUNT(*) AS c FROM messages`).get().c;
+   const totalConvs = db.prepare(`SELECT COUNT(*) AS c FROM conversations`).get().c;
+   const recentConvs = db.prepare(`
+     SELECT conversation_id, source, title, last_ts
+     FROM conversations
+     WHERE archived_at IS NULL OR archived_at = 0
+     ORDER BY last_ts DESC
+     LIMIT 10
+   `).all();
+   db.close();
+
+   if (opts.json) {
+     console.log(JSON.stringify({
+       total_messages: totalMsgs,
+       total_conversations: totalConvs,
+       sources,
+       recent_conversations: recentConvs,
+     }, null, 2));
+     return;
+   }
+   console.log(c.bold('memex corpus snapshot') + '\n');
+   console.log(`Total: ${c.green(totalMsgs + ' messages')} in ${c.green(totalConvs + ' conversations')}\n`);
+   console.log(c.bold('By source:'));
+   for (const s of sources) {
+     console.log(` ${s.source.padEnd(18)} ${String(s.msgs).padStart(7)} msgs · ${String(s.chats).padStart(5)} chats · ${fmtDate(s.first_ts)} → ${fmtDate(s.last_ts)}`);
+   }
+   console.log('');
+   console.log(c.bold('10 most recent conversations:'));
+   for (const r of recentConvs) {
+     console.log(` ${c.dim(fmtDate(r.last_ts))} ${c.cyan((r.title || r.conversation_id).slice(0, 60))} ${c.dim('(' + r.source + ')')}`);
+   }
+ }
+
+ // =============================================================
+ // PROJECTS
+ // =============================================================
+ async function cmdProjects(args) {
+   const { opts } = parseArgs(args);
+   const limit = Math.min(500, Math.max(1, opts.limit || 50));
+   const db = openDb();
+   const rows = db.prepare(`
+     SELECT project_path AS path, COUNT(*) AS chats
+     FROM conversations
+     WHERE project_path IS NOT NULL AND project_path != ''
+     GROUP BY project_path
+     ORDER BY chats DESC, project_path ASC
+     LIMIT ?
+   `).all(limit);
+   db.close();
+
+   if (opts.json) {
+     console.log(JSON.stringify({ count: rows.length, projects: rows }, null, 2));
+     return;
+   }
+   if (rows.length === 0) {
+     console.log('No projects captured yet. Run `memex-sync backfill-projects` to populate project paths on older conversations.');
+     return;
+   }
+   console.log(`${c.bold(rows.length)} project(s):\n`);
+   for (const r of rows) {
+     console.log(` ${String(r.chats).padStart(4)} chats ${c.cyan(r.path)}`);
+   }
+ }
+
+ // =============================================================
+ // HELP — print HELP.md content
+ // =============================================================
+ async function cmdHelp() {
+   if (!existsSync(HELP_MD_PATH)) {
+     console.error(`HELP.md not found at ${HELP_MD_PATH}`);
+     console.error(`See https://github.com/parallelclaw/memex-mvp/blob/main/HELP.md`);
+     process.exit(1);
+   }
+   process.stdout.write(readFileSync(HELP_MD_PATH, 'utf-8'));
+ }
+
+ // =============================================================
+ // USAGE — `memex --help`
+ // =============================================================
+ async function cmdUsage() {
+   console.log(`memex — local-first MCP memory server for AI agents
+
+ USAGE
+   memex                      run as MCP stdio server (called by Claude Code,
+                              Cursor, Cline, Continue, Zed via MCP config)
+
+   memex <command> [args]     run a one-shot terminal query and exit
+
+ COMMANDS
+   search "<query>"           full-text search across all sources
+     --source <name>          filter by source (telegram, claude-code, …)
+     --chat "<title>"         filter by conversation title (substring)
+     --project <path>         filter by project_path (substring)
+     --sort <mode>            relevance | date_asc | date_desc
+     --limit N                max results (default 10, max 50)
+     --json                   output JSON instead of markdown
+
+   recent                     most recent messages across all sources
+     --limit N                default 20, max 100
+     --source <name>          filter by source
+     --json
+
+   list                       list conversations by recency
+     --source <name>          filter by source
+     --limit N                default 20, max 200
+     --json
+
+   get <conversation_id>      full transcript of one conversation
+     --limit N                max messages (default 200, max 2000)
+     --json
+
+   overview                   corpus snapshot — sources, counts, recent chats
+     --json
+
+   projects                   list distinct project_paths captured
+     --limit N                default 50, max 500
+     --json
+
+   help                       print the user guide (HELP.md)
+   --help, -h                 this command reference
+   --version, -v              print package version
+
+ EXAMPLES
+   memex search "Postgres migration"
+   memex search "Q2 deck" --chat "Memex Bot"
+   memex search "auth" --source claude-code --sort date_desc --limit 5
+   memex list --source web --json | jq '.conversations[].title'
+   memex get web-1582ab51a7b7
+
+ DAEMON COMMANDS (separate binary)
+   memex-sync install         register the macOS LaunchAgent for auto-capture
+   memex-sync status          daemon health + watched files
+   memex-sync scan            one-time backfill of existing AI sessions
+   memex-sync --help          full daemon CLI reference
+
+ For the full user guide: memex help
+ On the web: https://memex.parallelclaw.ai
+ `);
+ }
+
+ // =============================================================
+ // VERSION
+ // =============================================================
+ async function cmdVersion() {
+   try {
+     const pkgPath = join(PACKAGE_ROOT, 'package.json');
+     const pkg = JSON.parse(readFileSync(pkgPath, 'utf-8'));
+     console.log(`memex-mvp ${pkg.version}`);
+   } catch (_) {
+     console.log('memex-mvp (version unknown)');
+   }
+ }
+
+ // =============================================================
+ // DISPATCH
+ // =============================================================
+ export async function runCli(sub, args) {
+   switch (sub) {
+     case 'search': return cmdSearch(args);
+     case 'recent': return cmdRecent(args);
+     case 'list': return cmdList(args);
+     case 'get': return cmdGet(args);
+     case 'overview': return cmdOverview(args);
+     case 'projects': return cmdProjects(args);
+     case 'help': return cmdHelp();
+     case '--help':
+     case '-h': return cmdUsage();
+     case '--version':
+     case '-v': return cmdVersion();
+     default:
+       console.error(`Unknown subcommand: ${sub}`);
+       console.error(`Run 'memex --help' for usage.`);
+       process.exit(2);
+   }
+ }
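The default `relevance` ordering in `cmdSearch` multiplies the FTS5 `bm25()` score by an exponential recency factor. The arithmetic can be sketched outside SQL as below. `recencyWeightedScore` is a hypothetical helper name and the sample scores are invented. SQLite's `bm25()` returns lower (more negative) values for better matches, so shrinking an old message's score toward zero moves it later in an ascending sort. Note that `exp(-age / 30)` decays by a factor of e every 30 days; a strict 30-day half-life would need an additional `ln 2` factor in the exponent.

```javascript
const SECONDS_PER_DAY = 86400;
const DECAY_DAYS = 30; // same constant as the SQL expression in cmdSearch

// Mirrors: bm25 * exp(-(now - ts) / 86400 / 30), where a missing or zero
// timestamp is treated as "now" (COALESCE(NULLIF(ts, 0), now)).
function recencyWeightedScore(bm25, ts, now) {
  const effectiveTs = ts ? ts : now;
  const ageDays = (now - effectiveTs) / SECONDS_PER_DAY;
  return bm25 * Math.exp(-ageDays / DECAY_DAYS);
}

const now = 1700000000; // arbitrary fixed "current time" for the example
const hits = [
  { id: 'fresh-weak', bm25: -2.0, ts: now },
  { id: 'old-strong', bm25: -4.0, ts: now - 60 * SECONDS_PER_DAY },
  { id: 'undated', bm25: -1.0, ts: 0 },
];

// Ascending sort: the most negative weighted score comes first.
const ranked = hits
  .map((h) => ({ ...h, score: recencyWeightedScore(h.bm25, h.ts, now) }))
  .sort((a, b) => a.score - b.score);

console.log(ranked.map((h) => `${h.id} ${h.score.toFixed(3)}`).join('\n'));
```

In this example the 60-day-old message keeps only about 13.5% of its raw score, so it falls behind both the weaker fresh hit and the undated one, which receives no decay at all.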
package/lib/store-doc/extract-title.js CHANGED
@@ -2,13 +2,18 @@
   * Extract a title from fetched page content.
   *
   * Strategy (first hit wins):
-  *   1. Markdown H1 — `# Title text` (Jina Reader's output starts with this)
-  *   2. HTML <title> — `<title>Page Title</title>`
-  *   3. HTML <h1> — `<h1>Page Title</h1>`
-  *   4. First non-empty line if short enough to look like a title
-  *   5. URL slug fallback — last meaningful path segment, decoded
-  *   6. Domain fallback — just the domain name
-  *   7. "Untitled document"
+  *   0. Strip Jina Reader prefix block if present (Jina prepends
+  *      `Title: …\nURL Source: …\nPublished Time: …\nMarkdown Content:\n`
+  *      to its output; the literal "Title:" line is often useless boilerplate
+  *      like "Title: Perplexity" rather than the actual thread title)
+  *   1. Markdown H1 — `# Title text`
+  *   2. Markdown H2 — `## Title text` (Perplexity threads start with H2)
+  *   3. HTML <title> — `<title>Page Title</title>`
+  *   4. HTML <h1> — `<h1>Page Title</h1>`
+  *   5. First non-empty line if short enough to look like a title
+  *   6. URL slug fallback — last meaningful path segment, decoded
+  *   7. Domain fallback — just the domain name
+  *   8. "Untitled document"
   *
   * Returns a trimmed string up to MAX_LEN characters. Always returns a
   * non-empty string (worst case "Untitled document").
@@ -23,13 +28,52 @@ function trimTitle(s) {
    return t;
  }

+ /**
+  * Jina AI Reader (r.jina.ai/<url>) wraps every page in a metadata
+  * prefix:
+  *
+  *   Title: <browser tab title>
+  *
+  *   URL Source: <original URL>
+  *
+  *   Published Time: <date>
+  *
+  *   Markdown Content:
+  *   <actual page markdown follows here>
+  *
+  * The "Title:" line is frequently a generic app shell ("Perplexity",
+  * "Twitter / X", "GitHub") rather than the actual document title — so
+  * we strip the whole prefix and run title extraction against the real
+  * markdown body. The actual H1/H2 inside is what we want.
+  *
+  * Detection is keyed on "URL Source: http" near the top — that line
+  * is unique to Jina's output format. If it's not present, content is
+  * returned unchanged (non-Jina source).
+  */
+ function stripJinaPrefix(content) {
+   // Quick gate: look for URL Source line in the first ~500 chars
+   if (!/^URL Source:\s*https?:\/\//m.test(content.slice(0, 500))) {
+     return content;
+   }
+   // Find the "Markdown Content:" delimiter and slice everything after it
+   const m = content.match(/^Markdown Content:\s*\n/m);
+   if (!m) return content;
+   return content.slice(m.index + m[0].length);
+ }
+
  function fromMarkdownH1(content) {
-   // Markdown H1: line starts with single # then space, then text.
-   // Use \r? for cross-platform line endings. Stop at end-of-line.
+   // Single # at start of line, then space(s), then text.
    const m = content.match(/^[ \t]*#[ \t]+([^\r\n]+?)[ \t]*$/m);
    return m ? trimTitle(m[1]) : '';
  }

+ function fromMarkdownH2(content) {
+   // ## at start of line — used as fallback when H1 absent
+   // (Perplexity, Jina-fetched Twitter threads, many blog "subtopic" layouts).
+   const m = content.match(/^[ \t]*##[ \t]+([^\r\n]+?)[ \t]*$/m);
+   return m ? trimTitle(m[1]) : '';
+ }
+
  function fromHtmlTitle(content) {
    const m = content.match(/<title[^>]*>([^<]+)<\/title>/i);
    return m ? trimTitle(decodeEntities(m[1])) : '';
@@ -104,12 +148,14 @@ function decodeEntities(s) {
   */
  export function extractTitle(content, url) {
    const safe = typeof content === 'string' ? content : '';
+   const body = stripJinaPrefix(safe);

    return (
-     fromMarkdownH1(safe) ||
-     fromHtmlTitle(safe) ||
-     fromHtmlH1(safe) ||
-     fromFirstLine(safe) ||
+     fromMarkdownH1(body) ||
+     fromMarkdownH2(body) ||
+     fromHtmlTitle(body) ||
+     fromHtmlH1(body) ||
+     fromFirstLine(body) ||
      fromUrlSlug(url) ||
      'Untitled document'
    );
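To see what the new step 0 and the H2 fallback change in practice, here is a small sketch that runs the first stages of the chain on a Jina-style payload. The three extraction functions copy the regular expressions from the hunks above. `trimTitleSketch` is a stand-in for the package's `trimTitle` (whose body is not shown in this diff), and both the sample payload and `titleFromHeadings` are invented for illustration.

```javascript
// Stand-in for trimTitle: the real helper also enforces MAX_LEN.
function trimTitleSketch(s) {
  return s.trim();
}

function stripJinaPrefix(content) {
  // Only treat the content as Jina output if "URL Source:" appears near the top.
  if (!/^URL Source:\s*https?:\/\//m.test(content.slice(0, 500))) return content;
  const m = content.match(/^Markdown Content:\s*\n/m);
  if (!m) return content;
  return content.slice(m.index + m[0].length);
}

function fromMarkdownH1(content) {
  const m = content.match(/^[ \t]*#[ \t]+([^\r\n]+?)[ \t]*$/m);
  return m ? trimTitleSketch(m[1]) : '';
}

function fromMarkdownH2(content) {
  const m = content.match(/^[ \t]*##[ \t]+([^\r\n]+?)[ \t]*$/m);
  return m ? trimTitleSketch(m[1]) : '';
}

// First two stages of the chain only: H1, then H2, on the stripped body.
function titleFromHeadings(content) {
  const body = stripJinaPrefix(content);
  return fromMarkdownH1(body) || fromMarkdownH2(body);
}

// Invented payload in the shape the stripJinaPrefix comment describes.
const jinaPayload = [
  'Title: Perplexity',
  '',
  'URL Source: https://example.com/thread/123',
  '',
  'Markdown Content:',
  '## How do WAL checkpoints work?',
  '',
  'Some answer text.',
].join('\n');

console.log(titleFromHeadings(jinaPayload));
```

The generic `Title: Perplexity` line never reaches the extractors because everything up to and including `Markdown Content:` is sliced off first, and the H1 pattern does not match a `##` line, so the H2 stage supplies the title.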
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "memex-mvp",
- "version": "0.6.1",
+ "version": "0.7.0",
  "description": "Local-first MCP server for cross-agent AI memory. One SQLite + FTS5 corpus across Claude Code, Cowork, Cursor, Continue, Zed, Obsidian, and Telegram — passively captured, verbatim, searchable from any MCP-compatible client.",
  "type": "module",
  "main": "server.js",
  "main": "server.js",
@@ -26,7 +26,7 @@
26
26
  "sync": "node ingest.js",
27
27
  "ingest": "node ingest.js",
28
28
  "bot": "node bot/index.js",
29
- "test": "node test/parser.test.js && node test/bot-inbox.test.js && node test/search-sort.test.js && node test/store-document.test.js",
29
+ "test": "node test/parser.test.js && node test/bot-inbox.test.js && node test/search-sort.test.js && node test/store-document.test.js && node test/cli.test.js",
30
30
  "prepublishOnly": "npm test"
31
31
  },
32
32
  "engines": {
package/server.js CHANGED
@@ -50,6 +50,31 @@ import {
  import { detectIssues, isBlocked } from './lib/store-doc/detect.js';
  import { extractTitle } from './lib/store-doc/extract-title.js';
  import { createHash } from 'node:crypto';
+ import { runCli, CLI_SUBCOMMAND_NAMES } from './lib/cli/index.js';
+
+ // -------------------- CLI subcommand dispatch --------------------
+ // When invoked with a recognized subcommand (search, recent, list, get,
+ // overview, projects, help, --help, --version) — run a one-shot query
+ // and exit. When invoked WITHOUT any argument (the way MCP clients
+ // always call this binary), fall through to MCP-stdio mode below.
+ //
+ // This runs BEFORE any DB/watcher side-effects so the CLI doesn't open
+ // the DB in write mode unnecessarily.
+ {
+   const sub = process.argv[2];
+   if (sub && CLI_SUBCOMMAND_NAMES.includes(sub)) {
+     await runCli(sub, process.argv.slice(3));
+     process.exit(0);
+   }
+   if (sub && !sub.startsWith('-')) {
+     // Unknown positional subcommand — fail fast with help, don't drift
+     // into MCP mode (which would just hang waiting for stdin).
+     console.error(`Unknown subcommand: ${sub}`);
+     console.error(`Run 'memex --help' for usage.`);
+     process.exit(2);
+   }
+   // No args (or only flags we don't recognize) → MCP mode
+ }

  // -------------------- Paths --------------------
  const HOME = homedir();
@@ -959,6 +984,38 @@ Archived conversations are hidden from default list/search but stay
959
984
  fully indexed. Pass include_archived: true on search/list to include
960
985
  them. Visibility flag only — never deletes data.
961
986
 
987
+ ══ CLI FALLBACK — when MCP isn't available ══
988
+
989
+ If you're running in an agent where memex MCP tools aren't wired up
990
+ (or wired up but not responding), memex ALSO ships a terminal CLI on
991
+ the same \`memex\` binary. Use this as a fallback before resorting to
992
+ raw SQLite. Available subcommands:
993
+
994
+ memex search "<query>" [--source X] [--chat X] [--sort MODE] [--limit N] [--json]
995
+ memex recent [--limit N] [--source X] [--json]
996
+ memex list [--source X] [--limit N] [--json]
997
+ memex get <id> [--json]
998
+ memex overview [--json]
999
+ memex projects
1000
+ memex help prints the full HELP.md user guide
1001
+ memex --help command reference
1002
+
1003
+ The --json flag on every query subcommand returns structured JSON
1004
+ for parsing. The DB is opened read-only — safe to run while the
1005
+ auto-capture daemon is writing.
1006
+
1007
+ WHEN TO USE THE CLI:
1008
+ • You suspect MCP integration is broken — \`memex overview\` confirms
1009
+ memex itself is healthy independent of MCP wiring
1010
+ • You're in an agent without MCP support but with shell access
1011
+ • You want to pipe results: \`memex search foo --json | jq ...\`
1012
+ • You want to dump a full conversation to stdout for context
1013
+
1014
+ DON'T fall back to raw SQLite queries against memex.db when the CLI
1015
+ exists — the CLI handles edge cases (FTS5 syntax sanitization,
1016
+ date formatting, snippet highlighting, archive filtering) that raw
1017
+ SQL doesn't, and the schema may change between versions.
1018
+
962
1019
  ══ DOCUMENT INGESTION (web pages, articles, AI chat shares) ══
963
1020
 
964
1021
  memex_store_document accepts content YOU fetch and stores it verbatim.
@@ -11,7 +11,7 @@ After you drop the skill into your agent (`~/.claude/skills/` for Claude Code, o
11
11
  3. **MCP config merge** — adds a single absolute-path `command` entry into your client's `mcpServers` config. Never overwrites your other servers
12
12
  4. **`memex-sync install`** — registers the macOS LaunchAgent for live auto-capture
13
13
  5. **`memex-sync scan`** — one-time backfill of every session that already exists on disk
14
- 6. **Restart hint + verification commands**
14
+ 6. **Restart hint + verification commands** — including the v0.7+ CLI fallback (`memex overview`, `memex search "foo"`) so you can verify memex works even if MCP didn't wire up cleanly
15
15
 
16
16
  End-to-end: **~2 minutes**, fully observable (agent shows each command before running).
17
17
 
@@ -164,6 +164,17 @@ After restart, suggest they try any of:
164
164
 
165
165
  These confirm everything works end-to-end.
166
166
 
167
+ **CLI fallback (v0.7+):** if the MCP integration isn't picked up by the user's client for any reason, tell them they can verify memex directly from the terminal, using the same binary with no MCP needed:
168
+
169
+ ```sh
170
+ memex overview # confirms memex itself is healthy
171
+ memex search "foo" # FTS search from CLI
172
+ memex list # list conversations
173
+ memex --help # command reference
174
+ ```
175
+
176
+ This is also useful for agents without native MCP support (OpenCode + Kimi, plain shell scripts, CI pipelines) — they can shell out to `memex` directly.
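
As an illustration of shelling out from a script, here is a minimal Node sketch. The helper names (`buildSearchArgs`, `extractSnippets`, `searchMemex`) are hypothetical and not part of memex. The `results[].snippet` shape is inferred from the `jq` example in HELP.md and may differ in practice.

```javascript
// Hypothetical helpers; only the `memex search` subcommand and its
// --source / --limit / --json flags come from the CLI reference.

// Build the argv for `memex search`.
function buildSearchArgs(query, { source, limit } = {}) {
  const args = ['search', query];
  if (source) args.push('--source', source);
  if (limit) args.push('--limit', String(limit));
  args.push('--json');
  return args;
}

// Pull snippets out of the JSON text printed by `memex search --json`.
// Assumption: the payload looks like { "results": [{ "snippet": "..." }] }.
function extractSnippets(jsonText) {
  const parsed = JSON.parse(jsonText);
  const results = Array.isArray(parsed.results) ? parsed.results : [];
  return results.map((r) => r.snippet);
}

// Shell out to the `memex` binary on PATH and return the snippets.
// execFile passes arguments without a shell, so the query needs no quoting.
async function searchMemex(query, options) {
  const { execFile } = await import('node:child_process');
  const { promisify } = await import('node:util');
  const { stdout } = await promisify(execFile)('memex', buildSearchArgs(query, options));
  return extractSnippets(stdout);
}
```

A caller would use it as `const hits = await searchMemex('auth', { source: 'claude-code', limit: 5 });`. Keeping the argument building and JSON parsing in separate pure functions lets them be checked without a memex install.
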
177
+
167
178
  ## Safety rules — read before starting
168
179
 
169
180
  - If `node` or `npm` aren't installed, stop and tell the user to install Node.js (recommend nvm.sh or `brew install node`). Don't try to install Node yourself.
@@ -207,6 +207,37 @@ Sample interactions when the user invokes this skill. These show typical happy p
207
207
 
208
208
  ---
209
209
 
210
+ ---
211
+
212
+ ## Example 7 · MCP didn't connect — agent recommends CLI fallback (v0.7+)
213
+
214
+ > **User:** ok, I installed memex. but when I ask Claude "show me what memex has", nothing happens.
215
+ >
216
+ > **Agent:** Could be an MCP-config issue with your Claude client. Let's verify memex itself is healthy first, separately from MCP:
217
+ >
218
+ > ```sh
219
+ > memex overview
220
+ > ```
221
+ >
222
+ > ```
223
+ > memex corpus snapshot
224
+ >
225
+ > Total: 1,247 messages in 23 conversations
226
+ >
227
+ > By source:
228
+ > claude-code 742 msgs · 18 chats · 2026-04-06 → 2026-05-15
229
+ > claude-cowork 312 msgs · 4 chats · 2026-04-06 → 2026-05-07
230
+ > ...
231
+ > ```
232
+ >
233
+ > Memex itself is healthy — your data is there. The problem is the MCP wiring in your Claude client.
234
+ >
235
+ > Try `memex search "recent topic you discussed"` directly in the terminal — if that returns hits, you've got a fully-working memex; it's just not exposed to your Claude session yet.
236
+ >
237
+ > Then check your Claude MCP config (`~/.claude/config.json` or equivalent): the `memex` entry should use an absolute path. Run `which memex` to get that path and confirm the config matches it. If you need to redo the wiring, ask me again and I'll re-run step 2 of the install.
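
The config check the agent describes could be sketched as below. This is a hedged illustration, not memex code: `checkMemexEntry` is a hypothetical name, and it assumes the entry lives at `mcpServers.memex.command` and that paths are POSIX-style.

```javascript
// Hypothetical check of a parsed MCP client config object.
// Assumption: the memex server is registered under mcpServers.memex
// and its `command` field holds the path to the binary.
function checkMemexEntry(config) {
  const entry = config && config.mcpServers && config.mcpServers.memex;
  if (!entry) return 'missing';
  const command = typeof entry.command === 'string' ? entry.command : '';
  // POSIX-only: an absolute path starts with "/".
  return command.startsWith('/') ? 'ok' : 'relative';
}
```

A result of `'missing'` or `'relative'` would point at the MCP wiring rather than at memex itself, which matches the diagnosis in the example above.
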
238
+
239
+ ---
240
+
210
241
  ## What these examples illustrate about the skill's behavior
211
242
 
212
243
  - **Always discover first**, then ask for confirmation before any write action