memex-mvp 0.10.10 → 0.10.12

package/HELP.md CHANGED
@@ -337,6 +337,13 @@ memex telegram open-pending --in terminal # forcefully в Terminal с
337
337
  memex telegram scan # one-shot re-scan of Downloads
338
338
  memex telegram status # all decisions
339
339
 
340
+ # Import a file from ANY path (v0.10.12+), no juggling with inbox/pending:
341
+ memex import ~/projects/memex/result.json # auto-detects the format
342
+ memex import ~/Downloads/ChatExport_2026-05-18/ # Telegram HTML directory
343
+ memex import ~/path/to/session.jsonl # Claude Code JSONL
344
+ memex import ~/Downloads/result.json --force # skip the privacy gate
345
+ memex import file --format claude-jsonl --json # explicit format + JSON output
346
+
340
347
  memex help # this guide in the terminal
341
348
  memex --help # command reference
342
349
  memex --version
@@ -363,6 +370,27 @@ memex get web-1582ab51a7b7 --json > backup.json
363
370
 
364
371
  ---
365
372
 
373
+ ## 📥 Import a file from an arbitrary path (v0.10.8+ → v0.10.12+ MCP)
374
+
375
+ If your `result.json` (Telegram export) or `.jsonl` (Claude/Cowork session) file sits somewhere in `~/Downloads/`, `~/Desktop/`, or `~/projects/foo/`, you do **not** need to copy it into `~/.memex/inbox/` by hand. One command:
376
+
377
+ ```bash
378
+ memex import ~/projects/memex/result.json
379
+ ```
380
+
381
+ What happens:
382
+ - Auto-detects the format (Telegram JSON / HTML / Claude Code JSONL / Cowork JSONL)
383
+ - For Telegram: checks your decision preferences (`allow` / `skip` / `block`)
384
+ - New chat → prints a preview (title, message count, date range, senders) and asks for confirmation (re-run with `--force` to import)
385
+ - Already in `allowed` → imports immediately
386
+ - In `skipped` / `blocked` → refuses (or pass `--force` to override)
387
+ - For Claude/Cowork JSONL: imports immediately (your own memory, no privacy gate)
388
+ - Idempotent: re-importing the same file does not create duplicates (UNIQUE constraint)
389
+
390
+ **The same via an AI agent:** any agent with an MCP connection can call `memex_import_file({path: "..."})` in a single tool call and get a structured response. This is **100x cheaper in tokens** than an agent shuffling files into the inbox by hand via bash and parsing logs.
391
+
392
+ ---
393
+
366
394
  ## 🖥 Web dashboard (v0.10.8+): see your memory with your own eyes
367
395
 
368
396
  When you would rather **browse** what is in your memex **yourself** than call an AI, there is a local read-only web UI. Same SQLite, different surface.
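The Telegram privacy-gate rules listed in the HELP.md hunk above can be summarised as a small decision function. This is a minimal sketch, not code from the package: `gateOutcome` is a hypothetical helper, and the `decision` strings (`'allow'`, `'skip'`, `'block'`) are assumed labels for the stored preference. It covers only the single-chat Telegram case; Claude/Cowork JSONL has no gate.

```javascript
// Hypothetical helper (not part of memex): maps a chat's stored decision
// and the --force flag to the status the import would report.
function gateOutcome(decision, force) {
  // --force bypasses the gate regardless of the stored decision.
  if (force) return 'imported';
  switch (decision) {
    case 'allow':
      return 'imported';       // already on the allow-list
    case 'skip':
    case 'block':
      return 'skipped';        // refused unless forced
    default:
      return 'needs_consent';  // new chat: preview, then ask the user
  }
}

console.log(gateOutcome('allow', false));   // imported
console.log(gateOutcome(undefined, false)); // needs_consent
console.log(gateOutcome('block', false));   // skipped
console.log(gateOutcome('block', true));    // imported
```

Keeping the gate as a pure function of (decision, force) makes the rule easy to test separately from file parsing.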
package/README.md CHANGED
@@ -131,10 +131,23 @@ memex list --source web # all saved URLs
131
131
  memex get web-1582ab51a7b7 # full content of one conversation
132
132
  memex overview # snapshot of corpus + v0.8.1: capture streak
133
133
  memex projects # distinct project_paths captured
134
+ memex import ~/projects/memex/result.json # v0.10.12: ingest any file from any path
134
135
  memex help # full user guide (HELP.md)
135
136
  memex --help # command reference
136
137
  ```
137
138
 
139
+ ### Ingest from any path (v0.10.12+)
140
+
141
+ ```sh
142
+ memex import ~/projects/memex/result.json # auto-detects Telegram JSON
143
+ memex import ~/Downloads/ChatExport_2026-05-18/ # Telegram HTML export directory
144
+ memex import ~/path/to/session.jsonl # Claude Code JSONL
145
+ memex import ~/Downloads/result.json --force # skip Telegram privacy gate
146
+ memex import some-file --format claude-jsonl # explicit format override
147
+ ```
148
+
149
+ For Telegram, the privacy gate fires for any chat that isn't on your allow-list — the command exits with a preview (title, message count, date range, senders) so you can review before re-running with `--force`. Same path via MCP: any AI agent can call `memex_import_file({path: "..."})` to ingest one file in one tool call (instead of ~10k tokens of bash `mv`-shuffling).
150
+
138
151
  Every query supports `--json` for machine-readable output: `memex search foo --json | jq '.results[].snippet'`. The DB is opened **read-only** — safe to run while `memex-sync` daemon is writing.
139
152
 
140
153
  When called **without arguments** (`memex`), the binary still runs as an MCP stdio server (the way Claude Code / Cursor / Cline launch it). CLI mode and MCP mode are the same package — no extra install.
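The README hunk above describes a two-step flow for agents: call `memex_import_file`, and if the result asks for consent, confirm with the user before calling again with `force: true`. A minimal sketch of that flow follows. `callTool` and `askUser` are hypothetical callbacks standing in for whatever the agent runtime provides, and `fakeTool` is a stub for demonstration only; the `status`, `chat_title` and `message_count` fields are taken from the diff.

```javascript
// Sketch of the consent flow. `callTool` and `askUser` are assumed
// callbacks supplied by the agent runtime, not memex APIs.
async function importWithConsent(callTool, askUser, path) {
  const first = await callTool('memex_import_file', { path });
  if (first.status !== 'needs_consent') return first;

  // Show the preview to the user and only force after an explicit yes.
  const approved = await askUser(
    `Import Telegram chat "${first.chat_title}" (${first.message_count} msgs)?`
  );
  if (!approved) return first;
  return callTool('memex_import_file', { path, force: true });
}

// Demonstration stub: asks for consent unless force is set.
async function fakeTool(name, args) {
  if (args.force) return { status: 'imported', total_imported: 3 };
  return { status: 'needs_consent', chat_title: 'Demo chat', message_count: 3 };
}

importWithConsent(fakeTool, async () => true, '~/Downloads/result.json')
  .then((result) => console.log(result.status)); // imported
```

If the user declines, the sketch returns the original `needs_consent` result unchanged, so the caller can still show the preview.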
package/README.ru.md CHANGED
@@ -523,9 +523,22 @@ One file with all your AI conversations — sounds scarier than it is.
523
523
  1. Telegram **Desktop** (mobile can't export)
524
524
  2. Chat → menu → **Export chat history**
525
525
  3. **Format: JSON** (not HTML)
526
- 4. **Path:** `~/.memex/inbox/`
526
+ 4. **Path:** `~/.memex/inbox/`, or, starting with v0.10.12, **any path at all**, followed by `memex import <path>` ([see below](#import-from-any-path-v01012))
527
527
  5. Done. Memex picks it up automatically.
528
528
 
529
+ ## Import from any path (v0.10.12+)
530
+
531
+ If the file is not in `~/.memex/inbox/` but in an ordinary place on disk (`~/Downloads/`, `~/Desktop/`, `~/projects/foo/`), one command does it:
532
+
533
+ ```bash
534
+ memex import ~/projects/memex/result.json
535
+ memex import ~/Downloads/ChatExport_2026-05-18/ # Telegram HTML folder
536
+ memex import ~/path/to/session.jsonl # Claude Code session
537
+ memex import ~/Downloads/result.json --force # skip the privacy gate
538
+ ```
539
+
540
+ The same via an AI agent: `memex_import_file({path: "..."})` is a single MCP tool call. No need to move files by hand.
541
+
529
542
  ---
530
543
 
531
544
  ## Web dashboard (v0.10.8+): see your memory without AI
package/lib/cli/index.js CHANGED
@@ -38,7 +38,7 @@ import {
38
38
  // ---------- Subcommand registry ----------
39
39
  export const CLI_SUBCOMMAND_NAMES = [
40
40
  'search', 'recent', 'list', 'get', 'overview',
41
- 'projects', 'context', 'hook', 'when', 'telegram', 'web',
41
+ 'projects', 'context', 'hook', 'when', 'telegram', 'web', 'import',
42
42
  'help', '-h', '--help', '-v', '--version',
43
43
  ];
44
44
 
@@ -1691,6 +1691,134 @@ function tgCmdScan(opts, discovery, pending) {
1691
1691
  console.log(c.dim('Review: memex telegram pending'));
1692
1692
  }
1693
1693
 
1694
+ // =============================================================
1695
+ // `memex import <path>` — single-command file ingest from any path
1696
+ // =============================================================
1697
+ async function cmdImport(args) {
1698
+ let path = null;
1699
+ let format = 'auto';
1700
+ let force = false;
1701
+ let json = false;
1702
+
1703
+ for (let i = 0; i < args.length; i++) {
1704
+ const a = args[i];
1705
+ if (a === '--format') format = args[++i] || 'auto';
1706
+ else if (a === '--force' || a === '-f') force = true;
1707
+ else if (a === '--json') json = true;
1708
+ else if (a === '--help' || a === '-h') {
1709
+ console.log(`memex import — ingest a chat file from any path on disk
1710
+
1711
+ Usage:
1712
+ memex import <path> [options]
1713
+
1714
+ Arguments:
1715
+ <path> Absolute or ~-relative path to the file or directory.
1716
+ Supported: Telegram JSON / HTML / dir, Claude Code JSONL,
1717
+ Cowork JSONL.
1718
+
1719
+ Options:
1720
+ --format <fmt> Override auto-detection. One of: telegram-json,
1721
+ telegram-html, claude-jsonl, cowork-jsonl, auto (default).
1722
+ --force, -f Bypass Telegram privacy gate (import new chat without
1723
+ asking, or override a previously-skipped chat). Only set
1724
+ this when the user has explicitly approved.
1725
+ --json Output JSON instead of human text.
1726
+ -h, --help Show this help.
1727
+
1728
+ Examples:
1729
+ memex import ~/projects/memex/result.json
1730
+ memex import ~/Downloads/ChatExport_2026-05-18/
1731
+ memex import ~/path/to/session.jsonl --format claude-jsonl
1732
+ memex import ~/Downloads/result.json --force # skip privacy gate
1733
+ `);
1734
+ return;
1735
+ }
1736
+ else if (a.startsWith('-')) {
1737
+ console.error(`Unknown flag: ${a}`);
1738
+ process.exit(2);
1739
+ }
1740
+ else if (!path) path = a;
1741
+ else { console.error(`Unexpected positional: ${a}`); process.exit(2); }
1742
+ }
1743
+
1744
+ if (!path) {
1745
+ console.error('Usage: memex import <path> [--format X] [--force]');
1746
+ console.error('Run `memex import --help` for details.');
1747
+ process.exit(2);
1748
+ }
1749
+
1750
+ // Open writable DB so the helper can do upserts directly.
1751
+ if (!existsSync(DB_PATH)) {
1752
+ console.error(`memex.db not found at ${DB_PATH}`);
1753
+ console.error(`Run 'memex-sync install' first.`);
1754
+ process.exit(1);
1755
+ }
1756
+ const db = new Database(DB_PATH);
1757
+ db.pragma('journal_mode = WAL');
1758
+ db.pragma('synchronous = NORMAL');
1759
+
1760
+ const { ingestFile } = await import('../ingest-file.js');
1761
+ let result;
1762
+ try {
1763
+ result = await ingestFile(db, path, { format, force });
1764
+ } finally {
1765
+ db.close();
1766
+ }
1767
+
1768
+ if (json) {
1769
+ console.log(JSON.stringify(result, null, 2));
1770
+ return;
1771
+ }
1772
+
1773
+ // Human-readable summary
1774
+ switch (result.status) {
1775
+ case 'imported': {
1776
+ if (result.chats) {
1777
+ // Telegram path
1778
+ console.log(c.green('✓') + ` Imported ${result.total_imported.toLocaleString()} messages from ${result.chats.length} chat(s):`);
1779
+ for (const ch of result.chats) {
1780
+ console.log(` • ${ch.title} — ${ch.msg_count.toLocaleString()} msgs · ${ch.date_first} → ${ch.date_last}`);
1781
+ }
1782
+ } else {
1783
+ // Claude/Cowork JSONL path
1784
+ console.log(c.green('✓') + ` Imported ${result.total_imported.toLocaleString()} records into "${result.title}"`);
1785
+ console.log(c.dim(` conversation_id: ${result.conversation_id}`));
1786
+ }
1787
+ console.log('');
1788
+ console.log(c.dim('Searchable from any MCP client now.'));
1789
+ break;
1790
+ }
1791
+ case 'needs_consent': {
1792
+ console.log(c.yellow('⚠ ') + `New Telegram chat requires your consent:`);
1793
+ console.log(` Title: ${c.bold(result.chat_title)}`);
1794
+ console.log(` Type: ${result.chat_type}`);
1795
+ console.log(` Messages: ${result.message_count.toLocaleString()}`);
1796
+ console.log(` Dates: ${result.date_first} → ${result.date_last}`);
1797
+ if (result.senders_sample?.length) {
1798
+ console.log(` Senders: ${result.senders_sample.join(', ')}`);
1799
+ }
1800
+ console.log('');
1801
+ console.log(c.dim('To import this chat:'));
1802
+ console.log(` ${c.cyan('memex import')} ${path} ${c.cyan('--force')}`);
1803
+ console.log('');
1804
+ console.log(c.dim('To never import this chat:'));
1805
+ console.log(` ${c.cyan('memex telegram skip')} "${result.chat_title}"`);
1806
+ process.exit(0);
1807
+ }
1808
+ case 'skipped': {
1809
+ console.log(c.yellow('⏭ Skipped: ') + result.chat_title);
1810
+ console.log(c.dim(` Reason: ${result.reason}`));
1811
+ if (result.message) console.log(c.dim(' ' + result.message));
1812
+ process.exit(0);
1813
+ }
1814
+ case 'error':
1815
+ default: {
1816
+ console.error(c.yellow('✗ ') + 'Import failed: ' + (result.error || 'unknown'));
1817
+ process.exit(1);
1818
+ }
1819
+ }
1820
+ }
1821
+
1694
1822
  // =============================================================
1695
1823
  // `memex web` — opt-in local dashboard
1696
1824
  // =============================================================
@@ -1798,6 +1926,7 @@ export async function runCli(sub, args) {
1798
1926
  case 'hook': await cmdHook(args); break;
1799
1927
  case 'telegram': await cmdTelegram(args); break;
1800
1928
  case 'web': await cmdWeb(args); break;
1929
+ case 'import': await cmdImport(args); break;
1801
1930
  case 'help': await cmdHelp(); break;
1802
1931
  case '--help':
1803
1932
  case '-h': await cmdUsage(); break;
package/lib/ingest-file.js ADDED
@@ -0,0 +1,366 @@
1
+ /**
2
+ * Single-call ingest of an arbitrary file path.
3
+ *
4
+ * Solves the v0.10.x onboarding gap: users put their Telegram/Claude exports
5
+ * in *natural* places like ~/projects/memex/ or ~/Desktop/ rather than the
6
+ * memex-magic paths (~/.memex/inbox/, ~/Downloads/Telegram Desktop/). The
7
+ * old answer was "drop it where memex watches" — but users don't know those
8
+ * paths and AI agents had to fall back to ~10k tokens of bash file-ops
9
+ * (mv to inbox, poll ingest.log, check DB count, …) just to get one file in.
10
+ *
11
+ * This helper lets the agent (or CLI) pass any path and get a single
12
+ * structured response. Auto-detects format, respects Telegram privacy
13
+ * decisions, returns precise insert/duplicate counts.
14
+ *
15
+ * Returns one of:
16
+ * { status: 'imported', ... } — records landed in DB
17
+ * { status: 'needs_consent',... } — new Telegram chat, ask user before forcing
18
+ * { status: 'skipped', ... } — user previously skipped/blocked this chat
19
+ * { status: 'error', ... } — file unreadable / format unknown / parse failed
20
+ */
21
+
22
+ import { existsSync, readFileSync, statSync } from 'node:fs';
23
+ import { homedir } from 'node:os';
24
+ import { basename, isAbsolute, join } from 'node:path';
25
+
26
+ const SUPPORTED_FORMATS = ['telegram-json', 'telegram-html', 'claude-jsonl', 'cowork-jsonl'];
27
+
28
+ /**
29
+ * Expand ~ and resolve relative paths against cwd. We DON'T realpath here —
30
+ * symlinks in the user's home are valid (e.g. iCloud-synced ~/Downloads).
31
+ */
32
+ export function resolvePath(p) {
33
+ if (!p || typeof p !== 'string') return null;
34
+ let s = p.trim();
35
+ if (s.startsWith('~/')) s = join(homedir(), s.slice(2));
36
+ else if (s === '~') s = homedir();
37
+ if (!isAbsolute(s)) s = join(process.cwd(), s);
38
+ return s;
39
+ }
40
+
41
+ /**
42
+ * Sniff format from path + content. Returns null when we can't tell.
43
+ *
44
+ * Heuristics:
45
+ * - directory containing messages.html → telegram-html (Desktop export dir)
46
+ * - .html / .htm → telegram-html
47
+ * - .json with Telegram markers in first 8KB → telegram-json
48
+ * - .jsonl starting with cowork- → cowork-jsonl
49
+ * - other .jsonl → claude-jsonl
50
+ */
51
+ export function detectFormat(absPath) {
52
+ if (!existsSync(absPath)) return null;
53
+ const stat = statSync(absPath);
54
+
55
+ if (stat.isDirectory()) {
56
+ // Telegram Desktop HTML export = directory with messages.html at root
57
+ if (existsSync(join(absPath, 'messages.html'))) return 'telegram-html';
58
+ // Some exports nest result.json inside
59
+ if (existsSync(join(absPath, 'result.json'))) return 'telegram-json-in-dir';
60
+ return null;
61
+ }
62
+
63
+ const lower = absPath.toLowerCase();
64
+ if (lower.endsWith('.html') || lower.endsWith('.htm')) return 'telegram-html';
65
+
66
+ if (lower.endsWith('.json')) {
67
+ let head = '';
68
+ try { head = readFileSync(absPath, 'utf-8').slice(0, 8192); }
69
+ catch (_) { return null; }
70
+ if (
71
+ head.includes('"messages"') ||
72
+ head.includes('"chats"') ||
73
+ head.includes('"personal_information"')
74
+ ) return 'telegram-json';
75
+ return null;
76
+ }
77
+
78
+ if (lower.endsWith('.jsonl')) {
79
+ const name = basename(lower);
80
+ if (name.startsWith('cowork-')) return 'cowork-jsonl';
81
+ return 'claude-jsonl';
82
+ }
83
+
84
+ return null;
85
+ }
86
+
87
+ /**
88
+ * Main entry. Caller is responsible for opening a writable better-sqlite3 Database
89
+ * handle and passing it in — we don't open/close the DB here.
90
+ *
91
+ * opts:
92
+ * format: one of SUPPORTED_FORMATS, or 'auto' (default).
93
+ * force: bypass Telegram privacy gate (skip/block decisions). Default false.
94
+ */
95
+ export async function ingestFile(db, rawPath, opts = {}) {
96
+ const path = resolvePath(rawPath);
97
+ if (!path) return { status: 'error', error: 'path required' };
98
+ if (!existsSync(path)) return { status: 'error', error: `file not found: ${path}` };
99
+
100
+ const format = opts.format && opts.format !== 'auto' ? opts.format : detectFormat(path);
101
+ if (!format) {
102
+ return {
103
+ status: 'error',
104
+ error: `unable to detect format; supported: ${SUPPORTED_FORMATS.join(', ')}. ` +
105
+ `Pass an explicit format= if the file has no extension hint.`,
106
+ };
107
+ }
108
+
109
+ if (format === 'telegram-json' || format === 'telegram-json-in-dir' || format === 'telegram-html') {
110
+ return ingestTelegram(db, path, format, opts);
111
+ }
112
+ if (format === 'claude-jsonl' || format === 'cowork-jsonl') {
113
+ return ingestClaudeJsonl(db, path, format, opts);
114
+ }
115
+ return { status: 'error', error: `unsupported format: ${format}` };
116
+ }
117
+
118
+ // ----- Telegram path -----
119
+
120
+ async function ingestTelegram(db, path, format, opts = {}) {
121
+ // Parse raw → { chats: { list: [...] }, ... }
122
+ let raw;
123
+ try {
124
+ if (format === 'telegram-html') {
125
+ const { parseTelegramHtmlExport } = await import('./parse-telegram-html.js');
126
+ raw = parseTelegramHtmlExport(path);
127
+ } else if (format === 'telegram-json-in-dir') {
128
+ raw = JSON.parse(readFileSync(join(path, 'result.json'), 'utf-8'));
129
+ } else {
130
+ raw = JSON.parse(readFileSync(path, 'utf-8'));
131
+ }
132
+ } catch (e) {
133
+ return { status: 'error', error: `parse failed: ${e.message}` };
134
+ }
135
+ if (!raw) return { status: 'error', error: 'parse returned empty' };
136
+
137
+ // Find the chat title — for single-chat exports, it's the only chat in the list
138
+ const chats = Array.isArray(raw.chats?.list)
139
+ ? raw.chats.list
140
+ : Array.isArray(raw.list)
141
+ ? raw.list
142
+ : raw.messages
143
+ ? [raw]
144
+ : [];
145
+ if (chats.length === 0) {
146
+ return { status: 'error', error: 'no chats found in file' };
147
+ }
148
+
149
+ // Privacy gate — single chat exports route through user's decision state;
150
+ // multi-chat exports (whole-Telegram-archive case, rare) skip the gate and
151
+ // import everything by default (the agent already had to coordinate that).
152
+ const decisionsMod = await import('./telegram-decisions.js');
153
+ const state = decisionsMod.loadDecisions();
154
+
155
+ if (chats.length === 1 && !opts.force) {
156
+ const chat = chats[0];
157
+ const title =
158
+ chat.name ||
159
+ (chat.type === 'saved_messages' ? 'Saved Messages' : `Telegram chat ${chat.id}`);
160
+
161
+ if (decisionsMod.isBlocked(state, title)) {
162
+ return {
163
+ status: 'skipped',
164
+ format,
165
+ chat_title: title,
166
+ reason: 'matches a block pattern in your decision state',
167
+ message: `Chat "${title}" matches a block pattern. To override, call again with force=true (and consider unblocking via memex telegram unblock <pattern>).`,
168
+ };
169
+ }
170
+ if (decisionsMod.isSkipped(state, title)) {
171
+ return {
172
+ status: 'skipped',
173
+ format,
174
+ chat_title: title,
175
+ reason: 'previously skipped by user',
176
+ message: `Chat "${title}" was previously skipped. To override, call again with force=true.`,
177
+ };
178
+ }
179
+ if (!decisionsMod.isAllowed(state, title)) {
180
+ // New chat — needs consent. Surface preview so the agent can ask the user.
181
+ const msgs = Array.isArray(chat.messages) ? chat.messages : [];
182
+ const realMsgs = msgs.filter((m) => m.type === 'message');
183
+ const senders = new Set();
184
+ let firstTs = Infinity, lastTs = 0;
185
+ for (const m of realMsgs.slice(0, 200)) {
186
+ if (m.from) senders.add(m.from);
187
+ const ts = parseInt(m.date_unixtime || '0', 10);
188
+ if (ts) { firstTs = Math.min(firstTs, ts); lastTs = Math.max(lastTs, ts); }
189
+ }
190
+ return {
191
+ status: 'needs_consent',
192
+ format,
193
+ chat_title: title,
194
+ chat_type: chat.type || 'unknown',
195
+ message_count: realMsgs.length,
196
+ senders_sample: Array.from(senders).slice(0, 5),
197
+ date_first: isFinite(firstTs) ? new Date(firstTs * 1000).toISOString().slice(0, 10) : null,
198
+ date_last: lastTs ? new Date(lastTs * 1000).toISOString().slice(0, 10) : null,
199
+ message:
200
+ `New Telegram chat "${title}" (${realMsgs.length} msgs). ` +
201
+ `Ask the user before importing — privacy gate. ` +
202
+ `To proceed: call memex_import_file again with force=true. ` +
203
+ `To never auto-import this chat: memex telegram skip "${title}".`,
204
+ };
205
+ }
206
+ }
207
+
208
+ // Allowed (or forced, or multi-chat archive) → run the upsert.
209
+ const { importTelegramRaw } = await import('./import-telegram.js');
210
+ let result;
211
+ try {
212
+ result = importTelegramRaw(db, raw);
213
+ } catch (e) {
214
+ return { status: 'error', error: `import failed: ${e.message}` };
215
+ }
216
+
217
+ // Mark chats as allowed for future re-exports
218
+ for (const c of result.chats) {
219
+ decisionsMod.allowChat(state, c.title);
220
+ }
221
+ decisionsMod.saveDecisions(state);
222
+
223
+ return {
224
+ status: 'imported',
225
+ format,
226
+ chats: result.chats.map((c) => ({
227
+ conversation_id: c.conversation_id,
228
+ title: c.title,
229
+ msg_count: c.msg_count,
230
+ date_first: c.first_ts ? new Date(c.first_ts * 1000).toISOString().slice(0, 10) : null,
231
+ date_last: c.last_ts ? new Date(c.last_ts * 1000).toISOString().slice(0, 10) : null,
232
+ })),
233
+ total_imported: result.totalImported,
234
+ };
235
+ }
236
+
237
+ // ----- Claude Code / Cowork JSONL path -----
238
+
239
+ async function ingestClaudeJsonl(db, path, format, _opts) {
240
+ const source = format === 'cowork-jsonl' ? 'claude-cowork' : 'claude-code';
241
+ const fileName = basename(path, '.jsonl');
242
+ const conversationId = `${source}-${fileName}`;
243
+
244
+ // We replicate the inner logic of server.js's importClaudeCodeJsonl so this
245
+ // module stays self-contained. (Extracting that function from server.js is
246
+ // a bigger refactor we don't need right now.)
247
+ const { extractMessageFromRecord, extractCompactBoundary, isContinuationBoilerplate } =
248
+ await import('./parse.js');
249
+
250
+ const insertMessage = db.prepare(`
251
+ INSERT INTO messages (source, conversation_id, msg_id, role, sender, text, ts, metadata, edited_at, uuid)
252
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
253
+ ON CONFLICT(source, conversation_id, msg_id) DO UPDATE SET
254
+ text = excluded.text,
255
+ uuid = COALESCE(messages.uuid, excluded.uuid)
256
+ `);
257
+ const upsertConversation = db.prepare(`
258
+ INSERT INTO conversations (conversation_id, source, title, first_ts, last_ts, message_count, parent_conversation_id, project_path)
259
+ VALUES (?, ?, ?, ?, ?, ?, ?, ?)
260
+ ON CONFLICT(conversation_id) DO UPDATE SET
261
+ title = COALESCE(excluded.title, title),
262
+ first_ts = MIN(first_ts, excluded.first_ts),
263
+ last_ts = MAX(last_ts, excluded.last_ts),
264
+ project_path = COALESCE(excluded.project_path, project_path),
265
+ message_count = (
266
+ SELECT COUNT(*) FROM messages WHERE messages.conversation_id = conversations.conversation_id
267
+ )
268
+ `);
269
+
270
+ let lines;
271
+ try {
272
+ lines = readFileSync(path, 'utf-8').split('\n').filter(Boolean);
273
+ } catch (e) {
274
+ return { status: 'error', error: `read failed: ${e.message}` };
275
+ }
276
+
277
+ let aiTitle = null;
278
+ let firstUserText = null;
279
+ let projectPath = null;
280
+ let first_ts = Infinity, last_ts = 0;
281
+ let imported = 0;
282
+
283
+ const tx = db.transaction(() => {
284
+ for (const line of lines) {
285
+ let obj;
286
+ try { obj = JSON.parse(line); } catch (_) { continue; }
287
+
288
+ if (obj.type === 'ai-title' && typeof obj.aiTitle === 'string' && obj.aiTitle.trim()) {
289
+ aiTitle = obj.aiTitle.trim();
290
+ continue;
291
+ }
292
+ if (obj.cwd && !projectPath) projectPath = obj.cwd;
293
+
294
+ const boundary = extractCompactBoundary(obj);
295
+ if (boundary) {
296
+ const ts = boundary.timestamp ? Math.floor(new Date(boundary.timestamp).getTime() / 1000) : 0;
297
+ if (ts) { first_ts = Math.min(first_ts, ts); last_ts = Math.max(last_ts, ts); }
298
+ const msgId =
299
+ boundary.id ||
300
+ (boundary.uuid ? `boundary-${boundary.uuid}` : null) ||
301
+ (boundary.timestamp ? `boundary-${boundary.timestamp}` : 'boundary-unknown');
302
+ try {
303
+ insertMessage.run(
304
+ source, conversationId, msgId, 'boundary', 'compact',
305
+ JSON.stringify(boundary.metadata || {}), ts,
306
+ JSON.stringify({ raw_type: 'compact_boundary', parentUuid: boundary.parentUuid || null }),
307
+ null, boundary.uuid || null,
308
+ );
309
+ imported++;
310
+ } catch (_) { /* dupe */ }
311
+ continue;
312
+ }
313
+
314
+ const m = extractMessageFromRecord(obj);
315
+ if (!m) continue;
316
+ if (!['user', 'assistant', 'summary'].includes(m.role)) continue;
317
+
318
+ const ts = m.timestamp ? Math.floor(new Date(m.timestamp).getTime() / 1000) : 0;
319
+ if (ts) { first_ts = Math.min(first_ts, ts); last_ts = Math.max(last_ts, ts); }
320
+
321
+ if (m.role === 'user' && !firstUserText) {
322
+ const text = m.text.trim().replace(/\s+/g, ' ');
323
+ if (text && !isContinuationBoilerplate(text)) firstUserText = text.slice(0, 80);
324
+ }
325
+
326
+ const sender =
327
+ m.role === 'user' ? 'me'
328
+ : m.role === 'summary' ? 'compact-summary'
329
+ : source;
330
+
331
+ try {
332
+ insertMessage.run(
333
+ source, conversationId,
334
+ obj.id || `${source}-${fileName}-${(m.uuid || ts).toString().slice(0, 16)}`,
335
+ m.role, sender, m.text, ts,
336
+ JSON.stringify({ raw_type: obj.type || null, parentUuid: m.parentUuid || null }),
337
+ null, m.uuid || null,
338
+ );
339
+ imported++;
340
+ } catch (_) { /* dupe */ }
341
+ }
342
+
343
+ const baseTitle = aiTitle || firstUserText || fileName;
344
+ upsertConversation.run(
345
+ conversationId, source, baseTitle,
346
+ isFinite(first_ts) ? first_ts : null,
347
+ last_ts || null,
348
+ 0, // placeholder: the ON CONFLICT subquery above recomputes message_count, but only when the row already exists
349
+ null, projectPath,
350
+ );
351
+ });
352
+
353
+ try {
354
+ tx();
355
+ } catch (e) {
356
+ return { status: 'error', error: `transaction failed: ${e.message}` };
357
+ }
358
+
359
+ return {
360
+ status: 'imported',
361
+ format,
362
+ conversation_id: conversationId,
363
+ total_imported: imported,
364
+ title: aiTitle || firstUserText || fileName,
365
+ };
366
+ }
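Both the consent preview and the import result in the file above report dates by converting Unix seconds to a `YYYY-MM-DD` string (UTC) and tracking the earliest and latest timestamps with an `Infinity` / `0` sentinel pair. A small worked example of that arithmetic is sketched below. `toDay` and `dateRange` are hypothetical names introduced for illustration; the `date_unixtime` field name comes from the diff.

```javascript
// Unix seconds -> 'YYYY-MM-DD' in UTC, as done via toISOString().slice(0, 10).
function toDay(unixSeconds) {
  return new Date(unixSeconds * 1000).toISOString().slice(0, 10);
}

// Track min/max timestamps; messages without a usable timestamp are ignored.
function dateRange(messages) {
  let firstTs = Infinity;
  let lastTs = 0;
  for (const m of messages) {
    const ts = parseInt(m.date_unixtime || '0', 10);
    if (ts) {
      firstTs = Math.min(firstTs, ts);
      lastTs = Math.max(lastTs, ts);
    }
  }
  return {
    date_first: Number.isFinite(firstTs) ? toDay(firstTs) : null,
    date_last: lastTs ? toDay(lastTs) : null,
  };
}

const sample = [
  { date_unixtime: '1747612800' }, // 1747526400 + 86400, one day later
  { date_unixtime: '1747526400' }, // 20226 days after the epoch
  {},                              // no timestamp: skipped
];
console.log(dateRange(sample));
// { date_first: '2025-05-18', date_last: '2025-05-19' }
console.log(dateRange([]));
// { date_first: null, date_last: null }
```

Because the conversion goes through `toISOString()`, the day boundary is always UTC, so a message sent late in the evening local time can land on the next calendar day in the preview.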
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "memex-mvp",
3
- "version": "0.10.10",
3
+ "version": "0.10.12",
4
4
  "description": "Local-first MCP server for cross-agent AI memory. One SQLite + FTS5 corpus across Claude Code, Cowork, Cursor, Continue, Zed, Obsidian, and Telegram — passively captured, verbatim, searchable from any MCP-compatible client.",
5
5
  "type": "module",
6
6
  "main": "server.js",
@@ -26,7 +26,7 @@
26
26
  "sync": "node ingest.js",
27
27
  "ingest": "node ingest.js",
28
28
  "bot": "node bot/index.js",
29
- "test": "node test/parser.test.js && node test/bot-inbox.test.js && node test/search-sort.test.js && node test/store-document.test.js && node test/cli.test.js && node test/hook.test.js && node test/telegram-html.test.js && node test/telegram-decisions.test.js && node test/telegram-pending.test.js && node test/telegram-notify.test.js && node test/notify-click-action.test.js",
29
+ "test": "node test/parser.test.js && node test/bot-inbox.test.js && node test/search-sort.test.js && node test/store-document.test.js && node test/cli.test.js && node test/hook.test.js && node test/telegram-html.test.js && node test/telegram-decisions.test.js && node test/telegram-pending.test.js && node test/telegram-notify.test.js && node test/notify-click-action.test.js && node test/inbox-watcher.test.js && node test/e2e-inbox.test.js && node test/ingest-file.test.js",
30
30
  "prepublishOnly": "npm test"
31
31
  },
32
32
  "engines": {
package/server.js CHANGED
@@ -810,18 +810,23 @@ function importFile(filePath) {
810
810
  Math.floor(Date.now() / 1000),
811
811
  imported
812
812
  );
813
+ log(`imported ${imported} messages from ${basename(filePath)} (${source})`);
814
+ } else {
815
+ log(`no NEW messages from ${basename(filePath)} (all dupes)`);
816
+ }
813
817
 
814
- // Move processed file to archive
818
+ // Move processed file to archive regardless of imported count. If we only
819
+ // archive when imported>0, a fully-deduplicated snapshot stays in inbox.
820
+ // Daemon then overwrites that file periodically — and on filesystems where
821
+ // rename-over-existing only fires chokidar 'change' (not 'add'), the
822
+ // 'change' listener above re-imports, fine; but it also means a wasted
823
+ // file accumulates in inbox if for any reason the listener didn't catch.
824
+ // Archiving always keeps inbox a clean "what's new" queue.
825
+ if (source !== 'unknown') {
815
826
  const targetDir = join(ARCHIVE, source);
816
827
  mkdirSync(targetDir, { recursive: true });
817
828
  const target = join(targetDir, basename(filePath));
818
- try {
819
- renameSync(filePath, target);
820
- } catch (_) {}
821
-
822
- log(`imported ${imported} messages from ${basename(filePath)} (${source})`);
823
- } else {
824
- log(`no messages imported from ${basename(filePath)}`);
829
+ try { renameSync(filePath, target); } catch (_) {}
825
830
  }
826
831
  return imported;
827
832
  }
@@ -852,6 +857,19 @@ chokidar
852
857
  log('inbox detected (file):', basename(filePath));
853
858
  importFile(filePath);
854
859
  })
860
+ // 'change' is critical: the ingest daemon overwrites the inbox snapshot
861
+ // file every few seconds as the underlying Claude Code / Cowork JSONL
862
+ // grows (the snapshot is a full re-serialisation, not a delta append).
863
+ // Without a 'change' listener, chokidar only fires 'add' once when the
864
+ // file first appears — every subsequent overwrite is silent, the inbox
865
+ // file accumulates new content on disk but server.js never re-imports
866
+ // it. UNIQUE(source, conv, msg_id) + INSERT OR IGNORE keep repeated
867
+ // imports idempotent, so re-processing the whole file on every change
868
+ // is correct (and cheap: SQLite handles ~10k rows in tens of ms).
869
+ .on('change', (filePath) => {
870
+ log('inbox changed (file):', basename(filePath));
871
+ importFile(filePath);
872
+ })
855
873
  .on('addDir', (dirPath) => {
856
874
  // Skip the inbox itself
857
875
  if (dirPath === INBOX) return;
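The comment on the new 'change' listener relies on one property: because rows are unique on (source, conversation_id, msg_id), re-processing a full snapshot on every overwrite adds only the rows that were not there before. The sketch below illustrates that property with an in-memory `Map` as a stand-in for the SQLite table. `makeStore` and `importSnapshot` are hypothetical names, and the ignore-on-duplicate behaviour is an assumption modelled on the `INSERT OR IGNORE` mentioned in the comment.

```javascript
// In-memory stand-in for a table with UNIQUE(source, conversation_id, msg_id).
// Illustration only; memex itself stores rows in SQLite.
function makeStore() {
  const rows = new Map();
  return {
    // Returns true when the row is new, false when the key already exists.
    insertOrIgnore(row) {
      const key = JSON.stringify([row.source, row.conversation_id, row.msg_id]);
      if (rows.has(key)) return false;
      rows.set(key, row);
      return true;
    },
    size: () => rows.size,
  };
}

// Re-processes the whole snapshot and counts only genuinely new rows.
function importSnapshot(store, snapshot) {
  let added = 0;
  for (const row of snapshot) {
    if (store.insertOrIgnore(row)) added++;
  }
  return added;
}

const row = (id) => ({ source: 'claude-code', conversation_id: 'demo', msg_id: id });
const store = makeStore();
const v1 = [row('a'), row('b')];
const v2 = [row('a'), row('b'), row('c')]; // the snapshot grew by one message

console.log(importSnapshot(store, v1)); // 2
console.log(importSnapshot(store, v1)); // 0 (same file again)
console.log(importSnapshot(store, v2)); // 1 (only the delta)
console.log(store.size());              // 3
```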
@@ -1820,6 +1838,60 @@ const TOOLS = [
1820
1838
  required: ['content'],
1821
1839
  },
1822
1840
  },
1841
+ {
1842
+ name: 'memex_import_file',
1843
+ description:
1844
+ 'Ingest a chat/conversation file at any path on disk into memex. ' +
1845
+ 'Use this when the user gives you a file path (e.g. "~/projects/memex/result.json", ' +
1846
+ '"~/Downloads/ChatExport_2026-05-18/", or a Claude Code .jsonl) and wants its content ' +
1847
+ 'in memex. Auto-detects format (Telegram JSON / Telegram HTML directory / Claude Code ' +
1848
+ 'JSONL / Cowork JSONL).\n\n' +
1849
+ 'WHEN TO USE:\n' +
1850
+ ' • User: "load my file from ~/projects/memex/result.json"\n' +
1851
+ ' • User: "here is a file with my Perplexity history, import it"\n' +
1852
+ ' • Anywhere a path is given outside the auto-watched directories.\n\n' +
1853
+ 'PRIVACY GATE: for Telegram chats not yet on the user\'s allow-list, the tool returns ' +
1854
+ 'status="needs_consent" with a preview (chat title, msg count, date range, senders). ' +
1855
+ 'Surface that to the user, confirm, then call again with force=true. Skipped/blocked ' +
1856
+ 'chats are refused similarly (override with force=true only after explicit "yes").\n\n' +
1857
+ 'IDEMPOTENT: re-importing the same file is safe — UNIQUE(source, conversation_id, msg_id) ' +
1858
+ 'dedups under the hood. Re-importing a Telegram export with new messages adds the delta.\n\n' +
1859
+ 'DO NOT use this for URLs / web pages — use memex_store_document for those. ' +
1860
+ 'DO NOT use bash file ops (mv, cp) to move files into ~/.memex/inbox/ — that\'s the ' +
1861
+ 'old workflow this tool replaces. Pass the path directly here and get a structured response.',
1862
+ inputSchema: {
1863
+ type: 'object',
1864
+ properties: {
1865
+ path: {
1866
+ type: 'string',
1867
+ description:
1868
+ 'Absolute or ~-relative path to the file or directory. Supported: ' +
+ '.json (Telegram Desktop JSON export), .html (Telegram HTML export, single file), ' +
+ 'directory (Telegram HTML export root with messages.html inside, or a Telegram dir ' +
+ 'with nested result.json), .jsonl (Claude Code / Cowork session log — prefix cowork- ' +
+ 'is treated as Cowork).',
+ },
+ format: {
+ type: 'string',
+ enum: ['auto', 'telegram-json', 'telegram-html', 'claude-jsonl', 'cowork-jsonl'],
+ default: 'auto',
+ description:
+ 'Override format detection. Default "auto" sniffs the file content/extension. ' +
+ 'Pass an explicit value when the file lacks an extension or auto-detection fails.',
+ },
+ force: {
+ type: 'boolean',
+ default: false,
+ description:
+ 'For Telegram: bypass the privacy gate — import even if the chat is new (no prior ' +
+ '"allow" decision), or if it was previously skipped/blocked. Default false: the tool ' +
+ 'returns "needs_consent" or "skipped" so the agent can confirm with the user first. ' +
+ 'Set true only AFTER the user has explicitly approved this chat\'s import.',
+ },
+ },
+ required: ['path'],
+ },
+ },
  // ---------------------------------------------------------------------
  // Telegram capture flow (v0.10+) — proactive agent-driven import path.
  //
@@ -3113,6 +3185,25 @@ server.setRequestHandler(CallToolRequestSchema, async (req) => {
  });
  }
 
+ // ============================================================
+ // ARBITRARY-PATH FILE INGEST (v0.10.12+)
+ //
+ // Closes the onboarding gap: a user puts an export at a path
+ // memex doesn't watch by default (~/projects/foo/result.json,
+ // ~/Desktop/, etc.) — agent calls this with the path and gets a
+ // structured result. No more 10k-token bash file-shuffling.
+ // ============================================================
+ if (name === 'memex_import_file') {
+ const { ingestFile } = await import('./lib/ingest-file.js');
+ const path = typeof args.path === 'string' ? args.path : '';
+ if (!path) return jsonResult({ status: 'error', error: 'path is required' });
+ const result = await ingestFile(db, path, {
+ format: args.format || 'auto',
+ force: args.force === true,
+ });
+ return jsonResult(result);
+ }
+
  // ============================================================
  // TELEGRAM CAPTURE FLOW (v0.10+)
  // ============================================================
@@ -1,7 +1,7 @@
  ---
  name: install-memex
  description: Make Claude, Cursor, Cline, Continue, and Zed remember every AI conversation forever — one local SQLite corpus shared across all of them. Installs memex (local-first MCP server) in ~60 seconds via curl one-liner. Includes auto-capture daemon for Claude Code / Cowork / Cursor / Obsidian; v0.10 Telegram auto-detect (export from Desktop → memex stages it → AI proactively asks which to import, privacy-first per-chat consent); v0.8 SessionStart hook for the Brian Chesky moment ("Claude already knows what you were doing"); URL / Perplexity / AI chat share capture via memex_store_document. 18 MCP tools, no cloud, no account, verbatim storage. Use when the user says "install memex", "set up memex", "add memory to my AI", "make my agent remember across sessions", "сохрани мои чаты", or similar.
- version: 1.4.1
+ version: 1.5.0
  metadata:
  openclaw:
  emoji: "📚"
@@ -290,7 +290,13 @@ If the user declines Telegram setup ("not now" / "skip"): say "OK, I'll skip Tel
 
  Tell the user to fully quit and reopen the MCP client (Cmd+Q on macOS) so it picks up the new memex tools.
 
- **🌟 FIRST, STRONGLY RECOMMEND THIS (v0.10.8+):** offer to run `memex web --open` for them right now. It's the single most magical post-install moment: a local browser tab opens at `http://127.0.0.1:8765` with 5 pages showing every conversation memex has captured, dialogue rendered as chat-bubbles, full-text searchable. Verbatim, not AI-summarized. Read-only, localhost-only, Ctrl+C to stop. Phrase it naturally, e.g.:
+ **🌟 OFFER TO IMPORT THEIR EXISTING DATA RIGHT NOW (v0.10.12+):** before anything else, ask the user if they have a Telegram export or a chat-history file somewhere on disk that they'd like to load. Many users come with a `result.json` they exported yesterday, sitting in `~/Downloads/` or a project folder. Don't make them figure out memex's "magic paths"; just ask:
+
+ > "Do you have any chat exports already on disk you want loaded into memex right now? Tell me the path (e.g. `~/Downloads/result.json` or `~/projects/memex/result.json`) — I'll handle it."
+
+ Then call `memex_import_file({path: "<their-path>"})`. The tool auto-detects format (Telegram JSON / HTML / Claude JSONL / Cowork JSONL), respects the privacy gate for Telegram (returns `needs_consent` with a preview — surface it, get user OK, retry with `force: true`), and gives you a structured result in one call. **Do not use bash `mv`/`cp` to shuffle files into `~/.memex/inbox/`** — that was the old workflow and burned ~10k tokens per file. Use `memex_import_file` instead.
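
The call-then-confirm flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not memex's own code: `callTool` and `askUser` are hypothetical callbacks standing in for the MCP client call and the user prompt, and the `declined` status is invented here for the sketch. Only the tool name, the `path` and `force` arguments, and the `needs_consent` / `skipped` statuses come from the diff.

```javascript
// Statuses that the diff says require explicit user approval before force=true.
const GATED_STATUSES = new Set(['needs_consent', 'skipped']);

// Sketch of the import-with-consent loop.
// `callTool(name, args)` and `askUser(result)` are hypothetical async callbacks
// supplied by the caller; they are not part of memex.
async function importWithConsent(callTool, askUser, path) {
  // First attempt without force, so the privacy gate can respond.
  const first = await callTool('memex_import_file', { path });

  // Anything that is not gated (success, error, ...) is returned unchanged.
  if (!GATED_STATUSES.has(first.status)) return first;

  // Surface the gated result to the user and wait for an explicit yes.
  const approved = await askUser(first);
  if (!approved) {
    // 'declined' is a label made up for this sketch, not a memex status.
    return { status: 'declined', path };
  }

  // Retry with force: true only after the user approved.
  return callTool('memex_import_file', { path, force: true });
}
```

The point of routing every gated status through `askUser` is that `force: true` is never sent on the first call, which matches the instruction to set it only after the user has approved that chat's import.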
+
+ **🌟 ALSO STRONGLY RECOMMEND THIS (v0.10.8+):** offer to run `memex web --open` for them right now. It's the single most magical post-install moment — a local browser tab opens at `http://127.0.0.1:8765` with 5 pages showing every conversation memex has captured, dialogue rendered as chat-bubbles, full-text searchable. Verbatim — not AI-summarized. Read-only, localhost-only, Ctrl+C to stop. Phrase it naturally, e.g.:
 
  > "I can open a local web dashboard that lets you actually *see* what memex captured — every message, in chat-bubble form, searchable. It's just a browser tab on `localhost:8765`. Run it now? It's a way better demo than asking me search queries blind."