npm - memex-mvp - Versions diffs - 0.8.1 → 0.9.1 - Mend

memex-mvp 0.8.1 → 0.9.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/HELP.md +20 -5
package/README.md +18 -1
package/README.ru.md +19 -1
package/lib/parse-telegram-html.js +384 -0
package/package.json +2 -2
package/server.js +109 -4
package/skills/install-memex/README.md +5 -6
package/skills/install-memex/SKILL.md +52 -10
package/skills/install-memex/examples.md +59 -13

package/HELP.md CHANGED

@@ -348,7 +348,7 @@ Magic feature. When you open Claude Code in a project
 **Technically:** a SessionStart hook in `~/.claude/settings.json`. At the start of every Claude Code session the hook calls `memex context` → memex emits a markdown summary → Claude receives it as a system message _before_ your first question.
-**Installation:** during `memex-sync install` there will be a `[Y/n]` prompt; accept it (Y is the default). Or install it later:
+**Installation:** the easiest way is `curl -fsSL https://memex.parallelclaw.ai/install.sh | bash`, which installs memex + daemon + auto-context hook in one command. If you installed memex manually, `memex-sync install` shows a `[Y/n]` prompt (accept it; Y is the default). Or install the hook later:
 ```bash
 memex hook install         # add the hook
@@ -378,7 +378,7 @@ memex context --freshness-days 30    # only the last 30 days
 1. Call `memex_overview`. Check the status banner at the top:
    - 🟢 daemon is running: all good, it may simply not have finished indexing yet
    - 🔴 daemon is installed but not running: `launchctl load ~/Library/LaunchAgents/com.parallelclaw.memex.sync.plist`
-   - ⚪ daemon is not installed: `npx memex-sync install` from the memex-mvp directory
+   - ⚪ daemon is not installed: the simplest way is `curl -fsSL https://memex.parallelclaw.ai/install.sh | bash` (it installs everything itself). Or manually: `memex-sync install`.
 ### I want to index existing sessions (backfill)
@@ -390,9 +390,24 @@ npx memex-sync scan
 ### Telegram chats don't show up
-1. In Telegram **Desktop** (not mobile!): chat → menu → **Export chat history** → **Format: JSON**
-2. Drop `result.json` into `~/.memex/inbox/`
-3. Memex picks it up automatically within ~1.5 s
+In Telegram **Desktop** (not mobile!): chat → menu (⋮ or ☰) → **Export chat history**.
+**memex supports both export formats (v0.9+):**
+- **"Machine-readable JSON"** ← recommended. A single `result.json` file that ingests cleanly.
+- **"HTML"** ← also works (since v0.9). You get a `ChatExport_<chat>_<date>/` directory; drop it **whole** into the inbox and memex parses it automatically.
+Steps:
+1. Pick a format (JSON is the shorter path; HTML if you have already exported).
+2. Save into **`~/.memex/inbox/`**: `result.json` for JSON, the whole `ChatExport_…` directory for HTML.
+3. Memex picks it up automatically within ~1.5 seconds. After ingest the file/directory is moved to `~/.memex/data/conversations/telegram/` (or `…/telegram-html/`).
+If the ingest did not happen, check `~/.memex/data/memex.log`. For an HTML export the parser writes an actionable error when something is wrong (Telegram changed the format, the directory is damaged, etc.).
+**What does NOT work:**
+- ❌ Mobile export (Telegram mobile has no structured export)
+- ❌ Chat screenshots
+- ❌ Media-only chats (messages without text are parsed as `[photo]`/`[voice]` placeholders, with no content)
 ### I want to connect a new Obsidian vault
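
The media-only limitation in the Telegram section of this hunk follows from how the HTML parser handles messages that have no text. Below is a minimal sketch of that fallback as a standalone function. The class-name patterns and labels are copied from `lib/parse-telegram-html.js` further down in this diff; the function name `mediaPlaceholder` and the table-driven layout are illustrative assumptions, not part of the package's API.

```javascript
// Hypothetical helper; patterns and labels mirror the parser shown later in this diff.
const MEDIA_PLACEHOLDERS = [
  [/class="[^"]*\bphoto_wrap\b/, '[photo]'],
  [/class="[^"]*\bmedia_voice_message\b/, '[voice message]'],
  [/class="[^"]*\bmedia_video_file\b/, '[video]'],
  [/class="[^"]*\bmedia_audio_file\b/, '[audio]'],
  [/class="[^"]*\bmedia_file\b/, '[file]'],
  [/class="[^"]*\bsticker\b/, '[sticker]'],
];

// Returns a placeholder for a text-less message, or null when no media marker is found.
function mediaPlaceholder(messageHtml) {
  for (const [pattern, label] of MEDIA_PLACEHOLDERS) {
    if (pattern.test(messageHtml)) return label;
  }
  return null;
}

console.log(mediaPlaceholder('<a class="photo_wrap clearfix pull_left" href="photos/1.jpg">'));
console.log(mediaPlaceholder('<div class="body">no media here</div>'));
```

The parser emits `[voice message]` for voice notes, while the help text abbreviates this to `[voice]`. A message that matches none of the patterns and has no text is skipped entirely.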

package/README.md CHANGED

@@ -26,6 +26,23 @@ MCP server  →  Cursor · Cline · Claude Code · Continue · Zed · Codex ·
 ## Install in 60 seconds
+**One-line install (recommended):**
+```sh
+curl -fsSL https://memex.parallelclaw.ai/install.sh | bash
+```
+That single command:
+1. Verifies Node ≥ 20.
+2. Runs `npm install -g memex-mvp`, auto-fixing `EACCES` by moving npm's prefix to `~/.npm-global` (no `sudo` needed, ever).
+3. Installs the auto-capture daemon (`memex-sync install`) **with** the v0.8 Brian Chesky auto-context hook into `~/.claude/settings.json` (preserves existing hooks).
+4. Backfills history (`memex-sync scan`) so memex already knows about your past sessions.
+5. If `claude` (Claude Code CLI) is on PATH, runs `claude mcp add memex --scope user -- memex` to wire MCP automatically.
+Idempotent — safe to re-run. To inspect the script before piping to bash: `curl -fsSL https://memex.parallelclaw.ai/install.sh | less`.
+**Prefer manual install?**
 ```sh
 npm install -g memex-mvp
 memex-sync install      # macOS LaunchAgent for auto-capture
@@ -172,7 +189,7 @@ Perplexity threads need to be made **Public** in the Share dialog first — meme
 | Cursor IDE chats      | Auto: reads Cursor's local SQLite session store                |
 | Continue / Zed        | Auto: filesystem watchers per platform                         |
 | Obsidian notes        | Auto: per-vault markdown watcher                               |
-| Telegram exports      | Manual: drop `result.json` (Telegram Desktop) into `~/.memex/inbox/` |
+| Telegram exports      | Manual: drop `result.json` (Telegram Desktop → Export → JSON) into `~/.memex/inbox/`. v0.9+: also accepts the **HTML** export directory (`ChatExport_*/`) — drop the whole folder, memex parses it. |
 | Telegram (live)       | Run [`memex-bot`](bot/README.md) — captures messages you send/forward to your private bot |
 | **Web pages, AI chat shares, pasted text** | From any MCP agent: *"save https://... to memex"*. Agent fetches; memex stores verbatim. Cloudflare-protected pages (Perplexity, npm.com, Twitter, Medium, …) handled via the agent's r.jina.ai fallback. See [HELP.md §8](HELP.md) |
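
The Telegram row above says to drop the whole `ChatExport_*/` folder. Inside that folder only the `messages*.html` chunks matter, and they have to be read in numeric order. The sketch below illustrates that filter and ordering. The filename pattern comes from `detectTelegramHtml` in this diff; the helper names `chunkNumber` and `orderChunks` and the sample listing are assumptions made for the example.

```javascript
// Hypothetical standalone sketch of the chunk filter and ordering.
const CHUNK_NAME = /^messages(\d*)\.html?$/i;

function chunkNumber(fileName) {
  const match = fileName.match(CHUNK_NAME);
  // "messages.html" has an empty capture group and sorts first as chunk 0.
  return match && match[1] ? parseInt(match[1], 10) : 0;
}

function orderChunks(fileNames) {
  return fileNames
    .filter((name) => CHUNK_NAME.test(name))
    .sort((a, b) => chunkNumber(a) - chunkNumber(b));
}

const listing = ['messages10.html', 'style.css', 'messages2.html', 'messages.html', 'messages3.htm'];
console.log(orderChunks(listing));
```

A plain lexicographic sort would place `messages10.html` before `messages2.html`, which is why the comparison is done on the parsed number.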

package/README.ru.md CHANGED

@@ -83,6 +83,23 @@ MCP server (stdio JSON-RPC)
 ## Install in 60 seconds
+**One-line install (recommended):**
+```bash
+curl -fsSL https://memex.parallelclaw.ai/install.sh | bash
+```
+The command does the following by itself:
+1. Checks for Node ≥ 20.
+2. Runs `npm install -g memex-mvp`, and if it hits `EACCES`, moves the npm prefix to `~/.npm-global` on its own (sudo is never needed again, for any `npm install -g`).
+3. Sets up the auto-capture daemon (`memex-sync install`) **together with** the Brian Chesky auto-context hook (v0.8+) in `~/.claude/settings.json` (other hooks are left untouched).
+4. Backfills history (`memex-sync scan`), so memex knows about your past sessions right away.
+5. If the Claude Code CLI (`claude`) is found on the machine, calls `claude mcp add memex --scope user -- memex`, so MCP is registered automatically.
+Idempotent: safe to re-run. To look at the script first: `curl -fsSL https://memex.parallelclaw.ai/install.sh | less`.
+**Or manually:**
 ```bash
 npm install -g memex-mvp
 memex-sync install      # macOS LaunchAgent for auto-capture
@@ -224,7 +241,8 @@ which node  # → path to the node binary (for example /Users
 | **Claude Cowork** | `cowork-*.jsonl` (via filename prefix), including subagents | ✅ works |
 | **Cursor IDE** (Composer + Chat) | SQLite `state.vscdb` in `~/Library/Application Support/Cursor/` | ✅ works (polled every 5 min) |
 | **Obsidian** vault notes | `.md` files + YAML frontmatter | ✅ works (FSEvents, hash-based dedupe) |
-| **Telegram** | `result.json` from Desktop export | ✅ works |
+| **Telegram (JSON export)** | `result.json` from Desktop export | ✅ works |
+| **Telegram (HTML export)** | `ChatExport_*` directory from Desktop (v0.9+) | ✅ works: drop the whole folder into the inbox |
 | **Telegram (live)** | the `memex-bot` bot captures your messages / forwards | ✅ works |
 | **Web pages, AI shares, pastes** | `memex_store_document`: the agent fetches, memex stores verbatim (v0.6+) | ✅ works |
 | Claude.ai web export | planned for v0.7 | - |

package/lib/parse-telegram-html.js ADDED

@@ -0,0 +1,384 @@
+/**
+ * Telegram Desktop HTML export → Telegram-JSON-shape converter.
+ *
+ * Telegram Desktop offers two export formats:
+ *   - "Machine-readable JSON"  — what memex's importTelegram expects
+ *   - "Human-readable HTML"    — what many users pick by default
+ *
+ * Users frequently export as HTML by accident (often the default in the
+ * Telegram UI), then memex's inbox watcher silently ignores the dropped
+ * directory. This module makes HTML work: parse → emit the same shape
+ * importTelegram already understands.
+ *
+ * Telegram's HTML export is reasonably stable:
+ *
+ *   ChatExport_<chat-title>_<date>/
+ *     ├── messages.html     (or messages.htm — chunked: messages2, messages3, …)
+ *     ├── photos/
+ *     ├── files/
+ *     ├── stickers/
+ *     └── voice_messages/
+ *
+ * Each messages*.html has structure:
+ *
+ *   <div class="message default clearfix" id="message12345">
+ *     <div class="body">
+ *       <div class="from_name"> ↳ Sender Name </div>      (may be absent on "joined" messages)
+ *       <div class="text"> message text </div>
+ *       <div class="pull_right date details" title="2024-01-01 14:23:45 UTC+03:00">14:23</div>
+ *     </div>
+ *   </div>
+ *
+ *   Joined message = same sender as previous, has class "joined", no from_name.
+ *   Service message = class "service" (joined chat, name change, …) — we skip these.
+ *   Forwarded = "forwarded body" wrapping the message body.
+ *   Reply = "reply_to details" sibling.
+ *
+ * We use regex-based parsing (no DOM dependency) because Telegram's class
+ * names are stable and we control which fields we care about. If Telegram
+ * radically changes the schema, parser breaks loudly (returns 0 messages
+ * + clear log) rather than silently corrupting.
+ */
+import { readFileSync, existsSync, readdirSync, statSync } from 'node:fs';
+import { join, basename, dirname } from 'node:path';
+/**
+ * Detect if a given path is a Telegram HTML export.
+ * Accepts both a directory (most common — ChatExport_xxx/) and a bare
+ * messages.html file (rare — user dropped just the one file).
+ *
+ * Returns { type: 'dir' | 'file' | null, htmlFiles: string[] }
+ *   null type means "not a Telegram HTML export"
+ */
+export function detectTelegramHtml(path) {
+  if (!existsSync(path)) return { type: null, htmlFiles: [] };
+  const stats = statSync(path);
+  // Directory case: look for messages*.html inside
+  if (stats.isDirectory()) {
+    let entries = [];
+    try { entries = readdirSync(path); } catch (_) { return { type: null, htmlFiles: [] }; }
+    const htmlFiles = entries
+      .filter((f) => /^messages\d*\.html?$/i.test(f))
+      .map((f) => join(path, f));
+    if (htmlFiles.length === 0) return { type: null, htmlFiles: [] };
+    // Verify the first one contains Telegram-shaped markers
+    const head = safeReadHead(htmlFiles[0]);
+    if (!looksLikeTelegram(head)) return { type: null, htmlFiles: [] };
+    // Sort chunks: messages.html < messages2.html < messages3.html …
+    htmlFiles.sort(numericChunkSort);
+    return { type: 'dir', htmlFiles };
+  }
+  // Single file case: must be messages*.html
+  if (stats.isFile() && /\.html?$/i.test(path) && /messages\d*\.html?$/i.test(basename(path))) {
+    const head = safeReadHead(path);
+    if (!looksLikeTelegram(head)) return { type: null, htmlFiles: [] };
+    return { type: 'file', htmlFiles: [path] };
+  }
+  return { type: null, htmlFiles: [] };
+}
+function safeReadHead(file, bytes = 8192) {
+  try {
+    return readFileSync(file, 'utf-8').slice(0, bytes);
+  } catch (_) {
+    return '';
+  }
+}
+function looksLikeTelegram(head) {
+  // Reliable markers in Telegram Desktop HTML exports
+  return /class="page_wrap"/.test(head) ||
+         /class="page_body chat_page"/.test(head) ||
+         (/class="from_name"/.test(head) && /class="text"/.test(head));
+}
+function numericChunkSort(a, b) {
+  const numA = parseInt((basename(a).match(/messages(\d*)\.html?/i) || [, '0'])[1] || '0', 10);
+  const numB = parseInt((basename(b).match(/messages(\d*)\.html?/i) || [, '0'])[1] || '0', 10);
+  return numA - numB;
+}
+/**
+ * Strip HTML tags and decode common entities → plain text.
+ * Conservative: preserves newlines from <br>, paragraph breaks from </div>.
+ */
+function htmlToText(html) {
+  if (!html) return '';
+  let out = String(html);
+  // Convert breaks to newlines BEFORE stripping tags
+  out = out.replace(/<br\s*\/?>/gi, '\n');
+  out = out.replace(/<\/p>/gi, '\n\n');
+  out = out.replace(/<\/div>/gi, '\n');
+  // Drop all remaining tags
+  out = out.replace(/<[^>]+>/g, '');
+  // Decode common entities
+  out = out
+    .replace(/&nbsp;/g, ' ')
+    .replace(/&amp;/g, '&')
+    .replace(/&lt;/g, '<')
+    .replace(/&gt;/g, '>')
+    .replace(/&quot;/g, '"')
+    .replace(/&apos;/g, "'")
+    .replace(/&#39;/g, "'")
+    .replace(/&#x([0-9a-f]+);/gi, (_, hex) => String.fromCodePoint(parseInt(hex, 16)))
+    .replace(/&#(\d+);/g, (_, dec) => String.fromCodePoint(parseInt(dec, 10)));
+  // Collapse 3+ blank lines, trim
+  out = out.replace(/\n{3,}/g, '\n\n').trim();
+  return out;
+}
+/**
+ * Parse a Telegram date title into Unix timestamp.
+ * Telegram emits dates in the user's locale format, e.g.:
+ *   • "2024-01-01 14:23:45 UTC+03:00"   (ISO — English locale)
+ *   • "01.01.2024 14:23:45 UTC+03:00"   (European — Russian / German / etc.)
+ *   • "01/01/2024 14:23:45 UTC+03:00"   (US slash format — less common in exports)
+ * Returns { tsUnix, isoString } or null if unparseable.
+ */
+function parseTelegramDate(title) {
+  if (!title) return null;
+  let y, mo, d, h, mi, s, sign, oh, om;
+  // ISO: YYYY-MM-DD HH:MM:SS [UTC±HH:MM]
+  let m = title.match(/^(\d{4})-(\d{2})-(\d{2})\s+(\d{2}):(\d{2}):(\d{2})(?:\s+UTC([+-])(\d{2}):(\d{2}))?$/);
+  if (m) {
+    [, y, mo, d, h, mi, s, sign, oh, om] = m;
+  } else {
+    // European: DD.MM.YYYY HH:MM:SS [UTC±HH:MM]  (also supports "/" or "-" as separator)
+    m = title.match(/^(\d{2})[.\/-](\d{2})[.\/-](\d{4})\s+(\d{2}):(\d{2}):(\d{2})(?:\s+UTC([+-])(\d{2}):(\d{2}))?$/);
+    if (!m) return null;
+    [, d, mo, y, h, mi, s, sign, oh, om] = m;
+  }
+  // Construct an ISO 8601 string with the explicit offset (or UTC if absent)
+  const offset = sign ? `${sign}${oh}:${om}` : 'Z';
+  const iso = `${y}-${mo}-${d}T${h}:${mi}:${s}${offset}`;
+  const date = new Date(iso);
+  if (isNaN(date.getTime())) return null;
+  return {
+    tsUnix: Math.floor(date.getTime() / 1000),
+    isoString: iso.replace(/[+-]\d{2}:\d{2}$/, '').replace('Z', ''),
+  };
+}
+/**
+ * Parse a single message div (raw HTML segment).
+ * Returns null for service messages (we skip those) or messages with no text.
+ */
+function parseMessageDiv(messageHtml, lastSender) {
+  // Skip service messages outright
+  if (/class="message service\b/.test(messageHtml)) return null;
+  // Extract message id from outer div: id="message12345"
+  const idMatch = messageHtml.match(/id="message(\d+)"/);
+  const msgId = idMatch ? idMatch[1] : null;
+  if (!msgId) return null;
+  const isJoined = /class="message [^"]*joined/.test(messageHtml);
+  // Forwarded marker
+  const isForwarded = /class="forwarded body"/.test(messageHtml);
+  let forwardedFrom = null;
+  if (isForwarded) {
+    const fwdM = messageHtml.match(/class="forwarded[^"]*"[\s\S]*?<div class="from_name"[^>]*>\s*([\s\S]*?)\s*<\/div>/);
+    if (fwdM) {
+      forwardedFrom = htmlToText(fwdM[1]).replace(/^Forwarded from:?\s*/i, '').trim();
+    }
+  }
+  // Sender (from_name) — absent on joined messages
+  let fromName = null;
+  const fromM = messageHtml.match(/<div class="from_name"[^>]*>\s*([\s\S]*?)\s*<\/div>/);
+  if (fromM && !isForwarded) {
+    fromName = htmlToText(fromM[1]).trim();
+  }
+  // If joined, inherit lastSender; otherwise use parsed or fallback
+  if (!fromName && isJoined && lastSender) fromName = lastSender;
+  if (!fromName) fromName = 'Unknown';
+  // Date — title attribute on `.date.details`
+  let date = null;
+  const dateM = messageHtml.match(/class="[^"]*\bdate details[^"]*"\s+title="([^"]+)"/);
+  if (dateM) date = parseTelegramDate(dateM[1]);
+  // Main text — last `<div class="text">…</div>` inside body (forwards may have one earlier)
+  let text = '';
+  const textMatches = [...messageHtml.matchAll(/<div class="text"[^>]*>([\s\S]*?)<\/div>(?=\s*(?:<div class="(?!text)|<\/div>|<a class="|$))/g)];
+  if (textMatches.length > 0) {
+    // Use last one (the actual message body, after any quoted/forwarded preamble)
+    text = htmlToText(textMatches[textMatches.length - 1][1]);
+  }
+  // Reply marker — include as prefix so it's searchable but not lost
+  const replyM = messageHtml.match(/class="reply_to details"[^>]*>([\s\S]*?)<\/div>/);
+  if (replyM) {
+    const replyTxt = htmlToText(replyM[1]).replace(/^In reply to\s+/i, '').trim();
+    if (replyTxt) text = `↩ Reply: ${replyTxt}\n\n${text}`;
+  }
+  // Photo / media — if no text, note the media presence so the row isn't lost.
+  // Use word-boundary regexes since class attrs like "photo_wrap clearfix pull_left"
+  // wouldn't match a strict `class="photo_wrap"` pattern.
+  if (!text) {
+    if (/class="[^"]*\bphoto_wrap\b/.test(messageHtml)) text = '[photo]';
+    else if (/class="[^"]*\bmedia_voice_message\b/.test(messageHtml)) text = '[voice message]';
+    else if (/class="[^"]*\bmedia_video_file\b/.test(messageHtml)) text = '[video]';
+    else if (/class="[^"]*\bmedia_audio_file\b/.test(messageHtml)) text = '[audio]';
+    else if (/class="[^"]*\bmedia_file\b/.test(messageHtml)) text = '[file]';
+    else if (/class="[^"]*\bsticker\b/.test(messageHtml)) text = '[sticker]';
+    else return null;  // Truly empty — skip
+  }
+  // Build the message object in the shape importTelegram expects
+  // (date and date_unixtime are required by the importer)
+  const isoDate = date ? date.isoString : null;
+  const tsUnix = date ? date.tsUnix : 0;
+  return {
+    id: parseInt(msgId, 10),
+    type: 'message',
+    date: isoDate || '1970-01-01T00:00:00',
+    date_unixtime: tsUnix > 0 ? String(tsUnix) : '0',
+    from: fromName,
+    from_id: fromName ? `user_html_${slugify(fromName)}` : 'unknown',
+    text: text,
+    ...(forwardedFrom ? { forwarded_from: forwardedFrom } : {}),
+  };
+}
+function slugify(s) {
+  return String(s).toLowerCase().replace(/[^a-z0-9]+/g, '_').replace(/^_+|_+$/g, '').slice(0, 40) || 'anon';
+}
+/**
+ * Extract chat title from messages.html (or first chunk).
+ * Falls back to directory name basename, then "Telegram chat".
+ */
+function extractChatTitle(htmlContent, fallbackPath) {
+  // PREFER the page_header — it's the actual chat name as shown in Telegram.
+  // The <title> tag is often the locale-specific "Exported Data" / "Telegram"
+  // boilerplate, which we want to avoid.
+  const headerM = htmlContent.match(/<div class="page_header"[\s\S]*?<div class="text bold"[^>]*>\s*([\s\S]*?)\s*<\/div>/);
+  if (headerM) {
+    const t = htmlToText(headerM[1]).trim();
+    if (t) return t;
+  }
+  // Fallback: <title>...</title> — strip "Chat Export" / "Telegram" / "Exported Data" suffixes
+  const titleM = htmlContent.match(/<title>\s*([^<]+?)\s*<\/title>/i);
+  if (titleM) {
+    let t = titleM[1].trim();
+    t = t.replace(/\s*[—-]\s*(Chat Export|Telegram).*$/i, '').trim();
+    // Skip locale boilerplate that Telegram itself uses as the page <title>
+    if (t && !/^(Telegram|Exported Data|Экспорт(ированные)? данные|Эспортированные данные)$/i.test(t)) {
+      return t;
+    }
+  }
+  // Fallback: dirname of the parent ChatExport_xxx folder
+  if (fallbackPath) {
+    const parent = basename(dirname(fallbackPath));
+    if (parent && parent.startsWith('ChatExport')) {
+      return parent.replace(/^ChatExport_?/, '').replace(/_/g, ' ').trim() || 'Telegram chat';
+    }
+  }
+  return 'Telegram chat';
+}
+/**
+ * Main entrypoint. Parse a Telegram HTML export path → return an object
+ * shaped like a Telegram JSON export, ready for importTelegram().
+ *
+ * Returns null if path isn't a valid Telegram HTML export.
+ *
+ * Object shape:
+ *   {
+ *     personal_information: { user_id: "" },
+ *     chats: {
+ *       list: [{
+ *         id: <stable hash of chat title>,
+ *         name: <chat title>,
+ *         type: "personal_chat",
+ *         messages: [{ id, type, date, date_unixtime, from, from_id, text, … }, …]
+ *       }]
+ *     }
+ *   }
+ */
+export function parseTelegramHtmlExport(path, opts = {}) {
+  const detection = detectTelegramHtml(path);
+  if (!detection.type) return null;
+  if (detection.htmlFiles.length === 0) return null;
+  let allMessages = [];
+  let chatTitle = null;
+  let lastSender = null;
+  for (const htmlPath of detection.htmlFiles) {
+    let content;
+    try { content = readFileSync(htmlPath, 'utf-8'); }
+    catch (_) { continue; }
+    if (!chatTitle) chatTitle = extractChatTitle(content, htmlPath);
+    // Split into per-message blocks. The reliable boundary is the
+    // opening `<div class="message ` of the next message.
+    // Use a tolerant regex that handles the message default / joined variants.
+    const messageBlocks = [...content.matchAll(/<div class="message [^"]*"[\s\S]*?(?=<div class="message [^"]*"|<div class="page_footer"|<\/body>)/g)];
+    for (const blockMatch of messageBlocks) {
+      const msg = parseMessageDiv(blockMatch[0], lastSender);
+      if (msg) {
+        allMessages.push(msg);
+        // Track sender for "joined" continuation messages
+        if (msg.from && msg.from !== 'Unknown') lastSender = msg.from;
+      }
+    }
+  }
+  if (allMessages.length === 0) return null;
+  // Stable chat id: hash of title + first message ts (good enough for dedup)
+  // We use a simple numeric hash so the synthetic chat_id is stable across re-imports.
+  const chatId = stableChatId(chatTitle || 'Telegram chat', allMessages[0]?.date_unixtime || '0');
+  // Detect chat type from sender diversity. A `personal_chat` has at most 2 distinct
+  // senders (you + the other person). 3+ distinct senders → group / supergroup.
+  // We can't distinguish private_group vs public_supergroup from HTML alone, so we
+  // call it `private_group` (matches the JSON export taxonomy).
+  const distinctSenders = new Set();
+  for (const m of allMessages) {
+    if (m.from && m.from !== 'Unknown') distinctSenders.add(m.from);
+    if (distinctSenders.size > 2) break;
+  }
+  const chatType = distinctSenders.size > 2 ? 'private_group' : 'personal_chat';
+  return {
+    personal_information: { user_id: '' },
+    chats: {
+      list: [
+        {
+          id: chatId,
+          name: chatTitle || 'Telegram chat',
+          type: chatType,
+          messages: allMessages,
+        },
+      ],
+    },
+    _source: {
+      format: 'telegram-html',
+      original_path: path,
+      chunks: detection.htmlFiles.length,
+      messages_total: allMessages.length,
+    },
+  };
+}
+function stableChatId(title, firstTs) {
+  let hash = 0;
+  const key = title + ':' + firstTs;
+  for (let i = 0; i < key.length; i++) {
+    hash = ((hash << 5) - hash) + key.charCodeAt(i);
+    hash |= 0;
+  }
+  return Math.abs(hash);
+}
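
The date handling in `parseTelegramDate` is the part of this file most sensitive to the exporter's locale, so a small standalone sketch may help when reviewing it. The two regular expressions are copied from the diff above; the wrapper name `toUnixSeconds` and the constant names are assumptions made for the example, and it returns only the Unix timestamp rather than the `{ tsUnix, isoString }` pair.

```javascript
// Standalone sketch that mirrors parseTelegramDate from the diff above.
const ISO_FORM = /^(\d{4})-(\d{2})-(\d{2})\s+(\d{2}):(\d{2}):(\d{2})(?:\s+UTC([+-])(\d{2}):(\d{2}))?$/;
const DAY_FIRST_FORM = /^(\d{2})[.\/-](\d{2})[.\/-](\d{4})\s+(\d{2}):(\d{2}):(\d{2})(?:\s+UTC([+-])(\d{2}):(\d{2}))?$/;

function toUnixSeconds(title) {
  let y, mo, d, h, mi, s, sign, oh, om;
  let m = title.match(ISO_FORM);
  if (m) {
    [, y, mo, d, h, mi, s, sign, oh, om] = m;
  } else {
    m = title.match(DAY_FIRST_FORM);
    if (!m) return null;
    [, d, mo, y, h, mi, s, sign, oh, om] = m;
  }
  // Without an explicit "UTC±HH:MM" suffix the time is treated as UTC.
  const offset = sign ? `${sign}${oh}:${om}` : 'Z';
  const ms = new Date(`${y}-${mo}-${d}T${h}:${mi}:${s}${offset}`).getTime();
  return Number.isNaN(ms) ? null : Math.floor(ms / 1000);
}

// The ISO and the dotted day-first spellings of the same instant give the same timestamp.
console.log(toUnixSeconds('2024-01-01 14:23:45 UTC+03:00'));
console.log(toUnixSeconds('01.01.2024 14:23:45 UTC+03:00'));
console.log(toUnixSeconds('not a date'));
```

Slash-separated dates go through the day-first branch, so a title such as `03/04/2024 10:00:00` is read as 3 April, not 4 March. When a title cannot be parsed, the message builder in the diff falls back to `1970-01-01T00:00:00` and a `date_unixtime` of `'0'`.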

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "memex-mvp",
-  "version": "0.8.1",
+  "version": "0.9.1",
   "description": "Local-first MCP server for cross-agent AI memory. One SQLite + FTS5 corpus across Claude Code, Cowork, Cursor, Continue, Zed, Obsidian, and Telegram — passively captured, verbatim, searchable from any MCP-compatible client.",
   "type": "module",
   "main": "server.js",
@@ -26,7 +26,7 @@
     "sync": "node ingest.js",
     "ingest": "node ingest.js",
     "bot": "node bot/index.js",
-    "test": "node test/parser.test.js && node test/bot-inbox.test.js && node test/search-sort.test.js && node test/store-document.test.js && node test/cli.test.js && node test/hook.test.js",
+    "test": "node test/parser.test.js && node test/bot-inbox.test.js && node test/search-sort.test.js && node test/store-document.test.js && node test/cli.test.js && node test/hook.test.js && node test/telegram-html.test.js",
     "prepublishOnly": "npm test"
   },
   "engines": {

package/server.js CHANGED

@@ -49,6 +49,10 @@ import {
 } from './lib/store-doc/canonicalize.js';
 import { detectIssues, isBlocked } from './lib/store-doc/detect.js';
 import { extractTitle } from './lib/store-doc/extract-title.js';
+import {
+  detectTelegramHtml,
+  parseTelegramHtmlExport,
+} from './lib/parse-telegram-html.js';
 import { createHash } from 'node:crypto';
 import { runCli, CLI_SUBCOMMAND_NAMES } from './lib/cli/index.js';
@@ -325,9 +329,17 @@ const insertImport = db.prepare(`
 // -------------------- Importers --------------------
-/** Telegram Desktop JSON export (single chat or all_chats). */
-function importTelegram(filePath) {
-  const raw = JSON.parse(readFileSync(filePath, 'utf-8'));
+/**
+ * Telegram Desktop export importer. Accepts:
+ *   - filePath (string) — path to result.json
+ *   - rawObject (object) — already-parsed export, e.g. from parseTelegramHtmlExport
+ *
+ * Returns total imported message count.
+ */
+function importTelegram(filePathOrRaw) {
+  const raw = typeof filePathOrRaw === 'string'
+    ? JSON.parse(readFileSync(filePathOrRaw, 'utf-8'))
+    : filePathOrRaw;
   // Telegram Desktop produces either a single chat object or { chats: { list: [...] } }
   const chats = Array.isArray(raw.chats?.list)
@@ -670,9 +682,83 @@ function resolvePendingParents() {
 }
 /** Auto-detect format and import */
+/**
+ * Try to import a path as a Telegram HTML export (directory or single file).
+ * Returns imported message count, or 0 if not an HTML export.
+ *
+ * Side effects on success:
+ *   - Inserts an `imports` row tagged "telegram-html"
+ *   - Moves the source directory/file to ~/.memex/data/conversations/telegram-html/
+ *
+ * If it LOOKS like a Telegram HTML export but parsing failed, prints an
+ * actionable error pointing the user at the Desktop export menu — instead
+ * of silently ignoring. This was Tester 5's friction point.
+ */
+function importTelegramHtmlIfMatches(path) {
+  const detection = detectTelegramHtml(path);
+  if (!detection.type) return 0;
+  let parsed;
+  try {
+    parsed = parseTelegramHtmlExport(path);
+  } catch (err) {
+    log('telegram-html parse error:', basename(path), err.message);
+    parsed = null;
+  }
+  if (!parsed || parsed.chats.list[0].messages.length === 0) {
+    // Looked like Telegram HTML (had markers) but extraction yielded nothing.
+    // Print actionable error rather than silent ignore.
+    log('');
+    log('⚠ Detected Telegram HTML export at ' + basename(path) + ' but extracted 0 messages.');
+    log('  This usually means Telegram changed the HTML format, or the export is partial.');
+    log('  EASIEST FIX — re-export as JSON:');
+    log('    1. Open Telegram Desktop');
+    log('    2. Click the chat → ⋮ menu → "Export chat history"');
+    log('    3. Format: change "HTML" to "Machine-readable JSON"');
+    log('    4. Drop the new result.json into ~/.memex/inbox/');
+    log('');
+    log('  HTML export will be left in place — feel free to delete it once JSON works.');
+    return 0;
+  }
+  let imported = 0;
+  try {
+    imported = importTelegram(parsed);
+  } catch (err) {
+    log('telegram-html import error:', err.message);
+    return 0;
+  }
+  if (imported > 0) {
+    insertImport.run(
+      basename(path),
+      'telegram-html',
+      Math.floor(Date.now() / 1000),
+      imported
+    );
+    // Archive: move the whole directory (or file) so the watcher doesn't re-process
+    const targetDir = join(ARCHIVE, 'telegram-html');
+    mkdirSync(targetDir, { recursive: true });
+    const target = join(targetDir, basename(path));
+    try {
+      renameSync(path, target);
+    } catch (_) {}
+    log(`imported ${imported} messages from ${basename(path)} (telegram-html, ${detection.htmlFiles.length} chunk(s))`);
+  }
+  return imported;
+}
 function importFile(filePath) {
   if (!existsSync(filePath)) return 0;
   const stats = statSync(filePath);
+  // Telegram HTML export — can be either a directory (ChatExport_xxx/)
+  // or a bare messages.html file. We accept both. Detected via marker
+  // patterns inside the HTML, not file extension alone.
+  if (stats.isDirectory()) {
+    return importTelegramHtmlIfMatches(filePath);
+  }
   if (!stats.isFile()) return 0;
   const lower = filePath.toLowerCase();
@@ -692,6 +778,10 @@ function importFile(filePath) {
         imported = importTelegram(filePath);
         source = 'telegram';
       }
+    } else if (/\.html?$/i.test(lower)) {
+      // Single-file HTML drop (rare — usually a directory)
+      imported = importTelegramHtmlIfMatches(filePath);
+      if (imported > 0) source = 'telegram';
     } else if (lower.endsWith('.jsonl')) {
       // Filename prefix tells us which product the session came from.
       // cowork-   → Claude Cowork (incl. its subagents)
@@ -739,15 +829,30 @@ function importFile(filePath) {
 // dropping a partial file by hand — the watcher must not race the writer and
 // move the unfinished tmp into archive, which used to spam ENOENT into the
 // daemon's rename and corrupt the import accounting.
+// Watch INBOX top-level. Files: chokidar 'add' event. Directories:
+// chokidar 'addDir' event (v0.9+ inbox can also receive Telegram HTML
+// export DIRECTORIES like ChatExport_xxx/, not just JSON/JSONL files).
+//
+// `depth: 0` means we only get top-level entries — we DON'T want every
+// .html chunk inside ChatExport_xxx to fire 'add' separately. The
+// directory drop itself is what we react to; the HTML parser walks
+// inside.
 chokidar
   .watch(INBOX, {
     ignoreInitial: false,
     ignored: /\.tmp$/,
     awaitWriteFinish: { stabilityThreshold: 800 },
+    depth: 0,
   })
   .on('add', (filePath) => {
-    log('inbox detected:', basename(filePath));
+    log('inbox detected (file):', basename(filePath));
     importFile(filePath);
+  })
+  .on('addDir', (dirPath) => {
+    // Skip the inbox itself
+    if (dirPath === INBOX) return;
+    log('inbox detected (dir):', basename(dirPath));
+    importFile(dirPath);
   });
 // -------------------- MCP Server --------------------
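
The key change in `importTelegram` is that it now accepts either a path to `result.json` or an already-parsed object, which is what lets the HTML parser reuse the JSON import path. The sketch below shows that dispatch in isolation. It is a simplified illustration, not the server's code: `resolveExport`, `listChats`, `countMessages`, and the injected `readText` callback are hypothetical names, and wrapping a single-chat export into a one-element list is an assumption based on the truncated comment in the hunk.

```javascript
// Hypothetical sketch; `readText` stands in for readFileSync(path, 'utf-8').
function resolveExport(filePathOrRaw, readText) {
  return typeof filePathOrRaw === 'string'
    ? JSON.parse(readText(filePathOrRaw))
    : filePathOrRaw;
}

// Assumption: a single-chat export is wrapped into a one-element list.
function listChats(raw) {
  return Array.isArray(raw.chats?.list) ? raw.chats.list : [raw];
}

function countMessages(filePathOrRaw, readText) {
  return listChats(resolveExport(filePathOrRaw, readText))
    .reduce((total, chat) => total + (chat.messages?.length ?? 0), 0);
}

// A JSON export referenced by path (served from an in-memory stand-in for the disk).
const fakeDisk = {
  '/inbox/result.json': JSON.stringify({ name: 'Alice', type: 'personal_chat', messages: [{ id: 1, text: 'hi' }] }),
};
// An object shaped like the output of parseTelegramHtmlExport.
const parsedFromHtml = {
  chats: { list: [{ name: 'Team', type: 'private_group', messages: [{ id: 1 }, { id: 2 }] }] },
  _source: { format: 'telegram-html' },
};

console.log(countMessages('/inbox/result.json', (path) => fakeDisk[path]));
console.log(countMessages(parsedFromHtml));
```

Because the HTML parser emits the `{ chats: { list: [...] } }` shape, the importer does not need a separate code path for it; only the archiving and the `imports` row differ, as `importTelegramHtmlIfMatches` shows.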

package/skills/install-memex/README.md CHANGED

@@ -7,13 +7,12 @@
 After you drop the skill into your agent (`~/.claude/skills/` for Claude Code, or your client's equivalent), saying **"install memex"** triggers a guided installation:
 1. **Discovery** — read-only checks for which MCP client you're using and what AI data already exists on your machine
-2. **`npm install -g memex-mvp`** — with EACCES fallbacks (one-shot sudo OR permanent prefix-fix; the agent asks you which)
-3. **MCP config merge** — adds a single absolute-path `command` entry into your client's `mcpServers` config. Never overwrites your other servers
-4. **`memex-sync install`** — registers the macOS LaunchAgent for live auto-capture
-5. **`memex-sync scan`** — one-time backfill of every session that already exists on disk
-6. **Restart hint + verification commands** — including the v0.7+ CLI fallback (`memex overview`, `memex search "foo"`) so you can verify memex works even if MCP didn't wire up cleanly
+2. **Fast path (v1.1+)** — `curl -fsSL https://memex.parallelclaw.ai/install.sh | bash`: one hosted bash script does npm install (with EACCES auto-fix to `~/.npm-global`), daemon setup, v0.8 auto-context hook, history backfill, and `claude mcp add memex` if Claude Code CLI is on PATH. Idempotent.
+3. **Fallback: manual five-step** — if curl fails, the user declines, or the agent is inside a GUI client (Cursor/Cline/Continue/Zed) where the MCP config still needs editing: `npm install -g memex-mvp` → MCP config merge → `memex-sync install` → `memex-sync scan`.
+4. **MCP config merge** (only needed for GUI clients) — adds a single absolute-path `command` entry into your client's `mcpServers` config. Never overwrites your other servers.
+5. **Restart hint + verification commands** — including the v0.7+ CLI fallback (`memex overview`, `memex search "foo"`) so you can verify memex works even if MCP didn't wire up cleanly.
-End-to-end: **~2 minutes**, fully observable (agent shows each command before running).
+End-to-end: **~60 seconds** via fast path, **~2 minutes** via manual flow, fully observable (agent shows each command before running).
 ## What is memex?

package/skills/install-memex/SKILL.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: install-memex
 description: Make Claude, Cursor, Cline, Continue, and Zed remember every AI conversation forever — one local SQLite corpus shared across all of them. Installs memex (local-first MCP server) in ~2 minutes — npm install, MCP config wiring, auto-capture daemon, history backfill. No cloud, no account, verbatim storage. Also indexes Obsidian notes, Telegram chats, and any URL the user wants to save (web pages, Perplexity threads, AI chat shares — memex_store_document tool, v0.6+). Use when the user says "install memex", "set up memex", "add memory to my AI", "make my agent remember across sessions", or similar.
-version: 1.0.0
+version: 1.1.0
 metadata:
   openclaw:
     emoji: "📚"
@@ -48,7 +48,32 @@ Scan the user's setup so you can tailor advice and tell them exactly what memex
    - "After install, memex will auto-index conversations from: [detected sources]"
 4. Wait for the user's "ok" before starting step 1.
-## Five-step install
+## Fast path — one-line installer (try this first)
+memex ships a hosted bash installer that does steps 1, 3, and 4 in a single run — and also wires up Claude Code's MCP entry if `claude` is on PATH. It's idempotent (safe to re-run), auto-fixes the `EACCES` case by moving npm's prefix to `~/.npm-global`, and prompts before enabling the auto-context hook.
+Show this command to the user, explain what it does, get their **explicit ok**, then run:
+```sh
+curl -fsSL https://memex.parallelclaw.ai/install.sh | bash
+```
+What the script does, in order:
+1. Checks Node ≥ 20.
+2. `npm install -g memex-mvp` — on EACCES, sets `npm config set prefix ~/.npm-global`, appends PATH to `~/.zshrc`, retries.
+3. `memex-sync install` with `--auto-context yes` (Brian Chesky hook into `~/.claude/settings.json` — preserves other hooks).
+4. `memex-sync scan` — backfills existing history.
+5. `claude mcp add memex --scope user -- memex` if Claude Code CLI is detected.
+After the script finishes:
+- If the user is in **Claude Code (CLI)** → install is complete. Skip straight to step 5 (verification + restart).
+- If the user is in **Cursor / Cline / Continue / Zed** → the npm install + daemon + auto-context + scan are done, but the GUI client's MCP config still needs the memex entry. **Skip step 1 (already installed)**, **skip step 3** (daemon already installed) and **skip step 4** (scan already ran). **Do step 2** (wire MCP into the GUI client's config) and **step 5** (verify + restart).
+If the script fails for any reason — non-zero exit, weird output, user uncomfortable piping curl to bash — fall back to the **Manual five-step install** below.
+To inspect what the script does first: `curl -fsSL https://memex.parallelclaw.ai/install.sh | less` (don't pipe to bash).
+## Manual five-step install (if the fast path didn't fit)
 Do these in order. Show each command before running it. Stop and ask if anything fails or looks wrong.
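Reviewer note: the fast-path section in this hunk says which manual steps are still needed once the one-line installer has succeeded. That branching can be sketched as a small lookup. The function name `remaining_steps` and the client labels are hypothetical and not part of memex. The catch-all branch for an unrecognized client is an assumption, not something the skill text specifies.

```sh
# Hypothetical helper: which manual steps remain after the one-line installer succeeds.
# Step numbers follow the manual install: 1 npm install, 2 MCP config merge,
# 3 memex-sync install, 4 memex-sync scan, 5 verify + restart.
remaining_steps() {
  case "$1" in
    claude-code)               echo "5" ;;          # script already ran `claude mcp add memex`
    cursor|cline|continue|zed) echo "2 5" ;;        # GUI client: MCP config still needs the entry
    *)                         echo "1 2 3 4 5" ;;  # assumption: unknown client, skip nothing
  esac
}

remaining_steps cursor   # prints: 2 5
```

Keeping the mapping in one place makes it easy to check against the prose: Claude Code CLI goes straight to verification, while the four GUI clients need step 2 before step 5.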
@@ -58,14 +83,19 @@ Do these in order. Show each command before running it. Stop and ask if anything
 npm install -g memex-mvp
 ```
-If you get `EACCES` (macOS system Node), tell the user to choose:
+If you get `EACCES` (macOS system Node), **recommend Option B (permanent prefix fix) by default** and proceed unless the user explicitly chooses sudo.
-**Option A** — one-shot sudo:
-```sh
-sudo npm install -g memex-mvp
-```
+Say to the user:
+> "Hit EACCES — your Node is installed in a system directory that npm can't write to without admin rights. Two fixes:
+>
+> **A)** Quick: `sudo npm install -g memex-mvp` — one-time, requires your password
+> **B)** Permanent fix: I'll move npm's install location to your home directory. After this, no `npm install -g` will ever ask for sudo again, for any package. Five commands, one-time.
+>
+> I recommend **B** — it's a permanent improvement that benefits all your future Node tools, not just memex. Proceed with B?"
+If the user agrees (or says anything like "ok", "yes", "default", "go", "B") — proceed with B without further questions:
-**Option B** — fix prefix permanently (better long-term):
 ```sh
 mkdir -p ~/.npm-global
 npm config set prefix ~/.npm-global
@@ -74,7 +104,19 @@ source ~/.zshrc
 npm install -g memex-mvp
 ```
-**Ask which the user prefers — don't guess.**
+If the user **explicitly** picks A (or says "sudo", "fast", "quick"), use sudo:
+```sh
+sudo npm install -g memex-mvp
+```
+After the install completes, verify with `which memex` (should print an absolute path) and `memex --version` (should print the version).
+**Important if Option B was used + user already had memex from a prior sudo install:** the old sudo-installed copy in `/usr/local/lib/node_modules/memex-mvp/` is now orphaned (PATH prefers the new home install). Mention this once:
+> "By the way, you have an older sudo-installed memex copy in `/usr/local/lib/`. It's harmless but takes ~60 MB. Clean it up when convenient: `sudo npm uninstall -g memex-mvp` — not urgent."
+Don't run that uninstall yourself — let the user decide when.
 If `node` or `npm` are missing, STOP and tell the user to install Node.js (recommend nvm.sh or `brew install node`). Don't try to install Node yourself.
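Reviewer note: a minimal preflight sketch for the rule above, combined with the "Node ≥ 20" requirement from the installer's step list. The helper names `node_major_ok` and `preflight` are hypothetical. The sketch only reports problems and never tries to install Node, in line with the instruction to stop and hand the decision to the user.

```sh
# Hypothetical helper: succeed if a `node --version` string (e.g. "v20.11.1")
# has a major version of 20 or higher.
node_major_ok() {
  major=${1#v}          # strip the leading "v"
  major=${major%%.*}    # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as unsupported
  esac
  [ "$major" -ge 20 ]
}

# Hypothetical preflight: print the problem and return non-zero
# instead of attempting to install Node.
preflight() {
  command -v node >/dev/null 2>&1 || { echo "node not found"; return 1; }
  command -v npm  >/dev/null 2>&1 || { echo "npm not found"; return 1; }
  node_major_ok "$(node --version)" || { echo "Node 20 or newer required"; return 1; }
  echo "ok"
}
```

An agent could call `preflight` before step 1 and, on a non-zero status, relay the printed message along with the nvm or `brew install node` suggestion.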
@@ -207,4 +249,4 @@ This is also useful for agents without native MCP support (OpenCode + Kimi, plai
 ## Begin
-Greet the user, confirm which MCP client you're running inside, and run the Discovery checks before any install actions.
+Greet the user, confirm which MCP client you're running inside, and run the Discovery checks before any install actions. After Discovery, **propose the fast path (curl one-liner) first** — it covers ~90% of cases in one shot. Only fall back to the manual five-step install if the user objects, the script fails, or you're inside a GUI client where you'll still need to do step 2 manually after the script runs.

package/skills/install-memex/examples.md CHANGED

@@ -2,6 +2,52 @@
 Sample interactions when the user invokes this skill. These show typical happy paths and how the skill handles common edge cases.
+> **Note (v1.1+):** The skill now leads with the **fast-path one-liner** (`curl -fsSL https://memex.parallelclaw.ai/install.sh | bash`) after Discovery — see Example 0 below. The legacy five-step examples (1–5 below) stay as the fallback when the agent is inside a GUI client (Cursor / Cline / Continue / Zed — still need step 2 manually after the script), the user declines curl, or the script fails.
+---
+## Example 0 · Fast path on Claude Code CLI (v1.1+)
+> **User:** install memex
+>
+> **Agent:** 👋 You're running inside Claude Code CLI. Let me run a quick discovery first.
+>
+> ```
+> ✓ Found Claude Code data at ~/.claude/projects/ (14 sessions)
+> ✓ Found Cowork sessions (5 subagent transcripts)
+> ✗ Cursor data not found
+> ✗ Continue / Zed not found
+> ```
+>
+> After install, memex will auto-index conversations from: **Claude Code, Cowork**. You can drop Telegram exports into `~/.memex/inbox/` later if needed.
+>
+> The fast path is a single hosted bash script that does **npm install (with EACCES auto-fix), daemon setup, Brian Chesky auto-context hook, history backfill, and `claude mcp add memex`** all in one go. Idempotent. Want me to run:
+>
+> ```sh
+> curl -fsSL https://memex.parallelclaw.ai/install.sh | bash
+> ```
+>
+> **User:** ok
+>
+> **Agent:** [runs the curl command, streams output]
+>
+> ```
+> ▸ memex one-line installer
+> ▸ Node v20.11.1 — ok
+> ▸ Installing memex-mvp from npm...
+> ✓ memex 0.9.0 ready at /Users/you/.npm-global/bin/memex
+> ▸ Installing memex-sync (LaunchAgent + auto-context hook)...
+> ✓ daemon running (PID 41872)
+> ✓ Brian Chesky auto-context hook added to ~/.claude/settings.json
+> ▸ Backfilling history (memex-sync scan)...
+> ✓ 87 conversations indexed
+> ▸ Wiring MCP into Claude Code...
+> ✓ claude mcp add memex — done
+> ▸ All set. Quit Claude Code (Cmd+Q) and reopen any project. Memex will inject prior context before your first message.
+> ```
+>
+> All five steps done in ~25 seconds. Quit Claude Code (Cmd+Q) and reopen — when you open any project you've worked in, memex will inject relevant context before you even type. Try: `memex overview` from terminal to confirm.
 ---
 ## Example 1 · Happy path on a fresh Cursor install
@@ -73,7 +119,7 @@ Sample interactions when the user invokes this skill. These show typical happy p
 ---
-## Example 2 · EACCES recovery (macOS system Node)
+## Example 2 · EACCES recovery (macOS system Node) — recommend B by default
 > **User:** install memex
 >
@@ -84,14 +130,18 @@ Sample interactions when the user invokes this skill. These show typical happy p
 > npm error path: '/usr/local/lib/node_modules/memex-mvp'
 > ```
 >
-> Hit a permissions error — this is common on macOS when Node is installed system-wide. Two ways to fix:
+> Hit EACCES — your Node is installed in a system directory that npm can't write to without admin rights. Two fixes:
 >
-> **Option A** (quick, requires sudo password):
-> ```sh
-> sudo npm install -g memex-mvp
-> ```
+> **A)** Quick: `sudo npm install -g memex-mvp` — one-time, requires your password
+>
+> **B)** Permanent fix: I'll move npm's install location to your home directory. After this, no `npm install -g` will ever ask for sudo again, for any package. Five commands, one-time.
+>
+> I recommend **B** — it's a permanent improvement that benefits all your future Node tools, not just memex. **Proceed with B?**
+>
+> **User:** ok
+>
+> **Agent:** Running the 5 prefix-fix commands:
 >
-> **Option B** (one-time setup, then future `npm install -g` works without sudo):
 > ```sh
 > mkdir -p ~/.npm-global
 > npm config set prefix ~/.npm-global
@@ -100,11 +150,7 @@ Sample interactions when the user invokes this skill. These show typical happy p
 > npm install -g memex-mvp
 > ```
 >
-> Option B is the standard npm recommendation — solves this for every future global install. Option A is faster if you don't want to touch your shell config. Which do you prefer?
->
-> **User:** B
->
-> **Agent:** [runs the 5 commands in sequence, shows output for each] ✓ Done. Verifying:
+> ✓ Done. Verifying:
 >
 > ```
 > which memex → /Users/macbook/.npm-global/bin/memex
@@ -242,7 +288,7 @@ Sample interactions when the user invokes this skill. These show typical happy p
 - **Always discover first**, then ask for confirmation before any write action
 - **Show every command** before running it; show diffs for config edits
-- **Offer choices** at decision points (EACCES, daemon yes/no) rather than guessing
+- **Recommend the better default** with a clear reason (e.g. Option B for EACCES) instead of forcing the user to weigh both options blind. Fall back to alternatives only if explicitly chosen.
 - **Detect and adapt** to existing installs, missing prerequisites, partial setups
 - **Refuse gracefully** when running in a context where the install can't actually work (web-only agents)
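
Reviewer note: the "recommend the better default" rule for EACCES can be sketched as a reply classifier. The accepted words come from the SKILL.md hunk ("ok", "yes", "default", "go", "B" for the prefix fix; "A", "sudo", "fast", "quick" for sudo). The function name `eacces_choice` is hypothetical, and the `ask` fallback for unclear replies is an assumption rather than documented behavior.

```sh
# Hypothetical helper: map the user's reply to an EACCES option.
# Prints "A" (sudo), "B" (permanent prefix fix), or "ask" when the reply is unclear.
eacces_choice() {
  reply=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')   # case-insensitive match
  case "$reply" in
    a|sudo|fast|quick)   echo "A" ;;    # only an explicit pick selects sudo
    b|ok|yes|default|go) echo "B" ;;    # agreement defaults to the recommended option
    *)                   echo "ask" ;;  # assumption: anything else means ask again
  esac
}

eacces_choice "OK"     # prints: B
eacces_choice "sudo"   # prints: A
```

The asymmetry is the point of the rule: plain agreement lands on B, and sudo is used only when the user names it.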