alvin-bot 4.20.2 → 4.22.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -2,6 +2,87 @@
 
  All notable changes to Alvin Bot are documented here.
 
+ ## [4.22.0] — 2026-05-05
+
+ ### 🧠 Memory architecture overhaul: pluggable providers + smart inject
+
+ Public users without `GOOGLE_API_KEY` (the v4.20–v4.21 default for embeddings) now get a working indexed memory store out of the box. The embeddings layer is refactored behind a provider interface with four backends auto-detected at startup:
+
+ | Tier | Provider | Setup | Cost | Dim |
+ |---|---|---|---|---|
+ | 1 | Gemini (`gemini-embedding-001`) | `GOOGLE_API_KEY` | free tier | 3072 |
+ | 2 | OpenAI (`text-embedding-3-small`) | `OPENAI_API_KEY` | ~$0.02 / 1M tokens | 1536 |
+ | 3 | Ollama (default `nomic-embed-text`) | `ollama pull nomic-embed-text` | free, local, private | 768 |
+ | 4 | **FTS5 (BM25 keyword)** | nothing | free | n/a |
+
+ The FTS5 fallback is the headline: SQLite's built-in full-text-search virtual table with BM25 ranking. No API key, no network, no setup. It indexes the same chunks as the vector providers (`MEMORY.md`, daily logs, project files, hub memory, asset index) and ranks matches by relevance. Excellent for proper-noun and exact-term lookups (project names, commands, error messages); weaker than vector search for synonyms and conceptual paraphrase queries — but available everywhere.
+
+ **Upgrade path.** A user starts on FTS5 (no keys needed). Later they set `GOOGLE_API_KEY` in their `.env` → the next bot start detects the schema mismatch via `meta.embedding_model`, drops the FTS5 table, initialises the vector schema, and reindexes. The same works in reverse. All seamless, no manual steps — see the sketch below.
+
+ Override the auto-detection with `EMBEDDINGS_PROVIDER=gemini|openai|ollama|fts5|auto` (default `auto`).
+
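+ The mismatch check is small enough to sketch. The following is illustrative only — the shipped logic lives in the `src/services/embeddings/` facade and may differ in detail:
+
+ ```js
+ // Sketch (illustrative): provider-switch detection at startup.
+ // The store records which model built it; a different active provider
+ // means drop the old schema and rebuild the index from the source files.
+ function ensureSchemaMatches(db, provider) {
+   let stored = "";
+   try {
+     stored = db.prepare("SELECT value FROM meta WHERE key = 'embedding_model'").get()?.value ?? "";
+   } catch { /* no meta table yet — fresh store */ }
+   if (stored === provider.name) return; // same provider → keep the index
+   db.exec("DROP TABLE IF EXISTS entries; DROP TABLE IF EXISTS entries_fts;");
+   db.exec("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)");
+   provider.initSchema(db);
+   db.prepare("INSERT OR REPLACE INTO meta (key, value) VALUES ('embedding_model', ?)").run(provider.name);
+   // ...the caller then reindexes MEMORY.md, daily logs, project files, etc.
+ }
+ ```
+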
+ ### ✂️ MEMORY.md no longer bulk-injected into every system prompt (when SQLite is populated)
+
+ Pre-v4.22, `MEMORY.md` (typically tens of KB of curated long-term knowledge) and the last two daily logs were plain-text-injected into the system prompt on **every turn**. With a populated SQLite store, the same content is available via query-targeted `searchMemory()` retrieval — much smaller prompts, much more relevant context.
+
+ New `MEMORY_INJECT_MODE` env var:
+
+ - `auto` (default) — sqlite when the store has indexed entries, else legacy
+ - `legacy` — pre-v4.22 behaviour, full plain-text inject every turn
+ - `sqlite` — never plain-text-inject `MEMORY.md` or daily logs (force smart mode regardless of store state)
+
+ Always plain-text injected regardless of mode: `identity.md` (L0) and `preferences.md` (L1) — these are tiny by design and contain always-on facts that semantic search may miss for short or generic queries. Recommended pattern: keep critical "never X" / "always Y" rules in `preferences.md`, and let the bulk knowledge live in `MEMORY.md`, retrieved on demand.
+
+ For users still on the legacy monolithic `MEMORY.md` setup (no `identity.md`, no `preferences.md`), auto mode kicks in only after the SQLite store is populated — until then, plain-text injection of `MEMORY.md` continues to work as before. Zero-touch upgrade.
+
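+ The resolver behind `auto` is small; a sketch of its contract (illustrative — the shipped file is `src/services/memory-inject-mode.ts`, and whether it reads the store via `getIndexStats()` is an assumption):
+
+ ```js
+ // Sketch (illustrative) of the inject-mode resolver.
+ import { getIndexStats } from "./memory.js"; // assumed import path
+
+ export function getInjectModeRaw() {
+   const v = (process.env.MEMORY_INJECT_MODE || "auto").toLowerCase();
+   return ["auto", "legacy", "sqlite"].includes(v) ? v : "auto";
+ }
+
+ export function getEffectiveInjectMode() {
+   const raw = getInjectModeRaw();
+   if (raw !== "auto") return raw; // explicit override wins
+   // auto: populated store → smart mode; empty store → bulk inject as before
+   return getIndexStats().entries > 0 ? "sqlite" : "legacy";
+ }
+ ```
+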
+ ### 🔇 Quieter logs for missing keys
+
+ The `⚠️ Embeddings init failed: Google API key not configured` warning is gone — that startup line is now `ℹ️ Memory provider: fts5-bm25 (keyword-local). Initial index will run on first use.` Public users without Gemini no longer see a scary warning that suggested the bot was broken when in fact it was working correctly.
+
+ ### 🩺 `alvin-bot doctor` Memory section expanded
+
+ Reports the active provider, dimension, indexed entry/file counts, last-reindex timestamp, and effective inject mode. For not-yet-initialised stores it predicts which provider will run on first start so users can confirm the auto-detection picked what they expected.
+
+ ```
+ Memory:
+ ✅ Provider: gemini-embedding-001 (vector-cloud, 3072-dim)
+ 3827 entries / 316 files indexed, 48.8 MB on disk
+ Last reindex: 25 h ago
+ Inject mode: sqlite (auto)
+ ```
+
+ ### Architecture
+
+ - New: `src/services/embeddings/` directory — `provider.ts` (interface), `vector-base.ts` (shared vector logic), `gemini.ts`, `openai.ts`, `ollama.ts`, `fts5.ts`, `auto-detect.ts`, `index.ts` (facade)
+ - New: `src/services/memory-inject-mode.ts` — env resolver
+ - Updated: `src/services/memory-layers.ts`, `src/services/memory.ts` — gate plain-text injection on inject mode
+ - `src/services/embeddings.ts` is now a thin re-export shim — all existing imports keep working
+
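+ The `provider.ts` interface is not reproduced in this diff; the shape below is inferred from the two implementations included further down (`fts5`, `gemini`) and is a sketch, not the shipped file:
+
+ ```js
+ // Inferred provider surface (sketch). Vector providers additionally implement
+ // embed(texts) and embedQuery(text); FTS5 skips both and matches on keywords.
+ /**
+  * @typedef {object} MemoryProvider
+  * @property {string} name // "gemini-embedding-001", "fts5-bm25", ...
+  * @property {number} dim  // embedding dimension; 0 for keyword providers
+  * @property {string} tier // "vector-cloud" | "keyword-local" | (Ollama's tier name is not shown in this diff)
+  * @property {() => Promise<boolean>} isAvailable
+  * @property {(db) => void} initSchema
+  * @property {(db) => void} dropSchema
+  * @property {(db, chunks) => Promise<void>} indexChunks // chunks: {id, source, text}[]
+  * @property {(db, sources) => void} dropEntriesForSources
+  * @property {(db, query, topK, minScore) => Promise<{text, source, score}[]>} search
+  * @property {(db) => number} countEntries
+  */
+ ```
+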
+ ### Tests
+
+ - 24 new tests across FTS5 provider, auto-detection, and inject-mode resolver
+ - All 535 existing tests still pass (one pre-existing port-binding flake in `web-server-integration.test.ts` is unrelated)
+
+ ## [4.21.0] — 2026-05-04
+
+ ### 🌐 New skill: Agent Browser (Tier-1.5)
+
+ Adds a new bundled skill, `skills/agent-browser/SKILL.md`, that teaches the bot to use the `agent-browser` CLI when it's available. Agent Browser is a [Vercel Labs](https://github.com/vercel-labs/agent-browser) tool that exposes pages as accessibility-tree snapshots with `@e1`, `@e2`, … refs — interactions cost ~200–400 tokens per turn instead of parsing rendered HTML, which is roughly 90% cheaper than a Playwright/Puppeteer-driven flow.
+
+ The skill is **opt-in by install, not by config**: it only activates when `command -v agent-browser` succeeds (a sketch of the gate follows below). No new dependency in `package.json`, no postinstall hook, no extra disk on a fresh install. Existing browser strategies (Tier 1 Stealth, Tier 2 CDP, Tier 3 Extension) keep working untouched and remain the right tool for stealth scraping, logged-in personal accounts, and watch-along flows.
+
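+ The gate is the same pattern the `doctor` changes below use — a minimal sketch (illustrative, not the shipped skill loader):
+
+ ```js
+ // Advertise the skill only when the binary resolves on PATH (sketch).
+ import { execSync } from "node:child_process";
+
+ function agentBrowserAvailable() {
+   try {
+     execSync("command -v agent-browser", { stdio: "ignore", timeout: 3000 });
+     return true;
+   } catch {
+     return false; // not installed — Tier 1/2/3 strategies still apply
+   }
+ }
+ ```
+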
+ The bundled `Browser Automation` skill (`skills/browse/SKILL.md`) was updated to route the bot to the Agent Browser skill first when the binary is on the PATH and the task is interactive (click/fill/extract on cooperative pages).
+
+ `alvin-bot doctor` shows a new `Browser tools:` section reporting whether agent-browser is installed, and gives the one-liner install command if not:
+
+ ```
+ npm i -g agent-browser && agent-browser install
+ ```
+
+ The first command pulls the Node CLI; the second downloads a private Chrome-for-Testing build into `~/.agent-browser/`. Together about 240 MB — that's why we don't bundle it.
+
+ No code changes in the bot's core pipeline. Existing users notice nothing unless they install the CLI.
+
  ## [4.20.2] — 2026-05-04
 
  ### 🛡️ Security: Web UI loopback by default + Slack caller allowlist
package/README.md CHANGED
@@ -433,6 +433,13 @@ OPENROUTER_API_KEY=<key> # OpenRouter (100+ models)
  PRIMARY_PROVIDER=claude-sdk # Primary AI provider
  FALLBACK_PROVIDERS=nvidia-kimi-k2.5,nvidia-llama-3.3-70b
 
+ # Memory backend (v4.22+) — auto-detects based on what keys you have.
+ # Set to override the default priority: gemini → openai → ollama → fts5.
+ # fts5 is the zero-config keyword fallback — no key needed, works for everyone.
+ EMBEDDINGS_PROVIDER=auto # auto | gemini | openai | ollama | fts5
+ OLLAMA_EMBEDDING_MODEL=nomic-embed-text # only used for ollama provider
+ MEMORY_INJECT_MODE=auto # auto | legacy | sqlite (see CHANGELOG v4.22)
+
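+ # Illustrative example: fully local, private semantic search with no cloud
+ # keys (run `ollama pull nomic-embed-text` first):
+ # EMBEDDINGS_PROVIDER=ollama
+ # OLLAMA_EMBEDDING_MODEL=nomic-embed-text
+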
  # Optional Platforms
  WHATSAPP_ENABLED=true # Enable WhatsApp (needs Chrome)
  DISCORD_TOKEN=<token> # Enable Discord
package/bin/cli.js CHANGED
@@ -1392,13 +1392,28 @@ async function doctor() {
  }
  }
 
- // ── Memory (semantic search backend) ──
+ // ── Browser tools (optional Tier-1.5 agent-browser) ──
+ console.log("\n Browser tools:");
+ let agentBrowserVersion = "";
+ try {
+ agentBrowserVersion = execSync("agent-browser --version 2>/dev/null", { encoding: "utf-8", timeout: 3000 }).trim();
+ } catch {}
+ if (agentBrowserVersion) {
+ // `agent-browser --version` prints "agent-browser X.Y.Z" — strip the prefix.
+ const v = agentBrowserVersion.replace(/^agent-browser\s+/i, "");
+ console.log(` ✅ agent-browser ${v} — Tier-1.5 (token-efficient snapshot+ref) available`);
+ } else {
+ console.log(` ℹ️ agent-browser not installed (optional Tier-1.5)`);
+ console.log(` Install for ~90% cheaper interactive automation:`);
+ console.log(` npm i -g agent-browser && agent-browser install`);
+ }
+
+ // ── Memory (provider + index health) ──
  console.log("\n Memory:");
  const embJson = resolve(DATA_DIR, "memory", ".embeddings.json");
  const embDb = resolve(DATA_DIR, "memory", ".embeddings.db");
  const embBakSqlite = resolve(DATA_DIR, "memory", ".embeddings.json.bak-pre-sqlite");
 
- // better-sqlite3 native binary loadable?
  let sqliteOk = false;
  let sqliteErr = "";
  try {
@@ -1408,26 +1423,54 @@ async function doctor() {
  } catch (err) {
  sqliteErr = err instanceof Error ? err.message : String(err);
  }
- if (sqliteOk) {
- console.log(` ✅ better-sqlite3 native binary loadable`);
- } else {
- console.log(` ❌ better-sqlite3 native binary not loadable — semantic search disabled`);
+ if (!sqliteOk) {
+ console.log(` ❌ better-sqlite3 native binary not loadable — memory store disabled`);
  console.log(` Fix: cd $(npm root -g)/alvin-bot && npm rebuild better-sqlite3`);
  console.log(` Detail: ${sqliteErr.split("\n")[0]}`);
- }
-
- if (sqliteOk && existsSync(embDb)) {
+ } else if (existsSync(embDb)) {
  try {
  const req = (await import("module")).createRequire(import.meta.url);
  const Database = req("better-sqlite3");
  const db = new Database(embDb, { readonly: true });
- const entries = db.prepare("SELECT COUNT(*) AS c FROM entries").get().c;
- const files = db.prepare("SELECT COUNT(*) AS c FROM file_mtimes").get().c;
- const sizeMb = (statSync(embDb).size / 1024 / 1024).toFixed(0);
+ // Read provider + meta
+ let model = "unknown", tier = "unknown", dim = 0, lastReindex = 0;
+ try {
+ const meta = db.prepare("SELECT key, value FROM meta").all();
+ const m = Object.fromEntries(meta.map(r => [r.key, r.value]));
+ // v4.22 keys preferred; fall back to v4.20 legacy "model" key.
+ // Legacy v4.20 DBs only have meta.model (always Gemini-format). v4.22+
+ // sets meta.embedding_model with a tier-prefixed name.
+ model = m.embedding_model || m.model || "unknown";
+ tier = m.embedding_tier || (m.model ? "vector-cloud" : "unknown");
+ dim = Number(m.embedding_dim || 0);
+ lastReindex = Number(m.lastReindex || 0);
+ } catch { /* meta table missing */ }
+
+ // Count rows in whichever provider table exists.
+ let entries = 0;
+ for (const tbl of ["entries", "entries_fts"]) {
+ try {
+ entries = db.prepare(`SELECT COUNT(*) AS c FROM ${tbl}`).get().c;
+ if (entries > 0) break;
+ } catch { /* table missing */ }
+ }
+ const files = (() => {
+ try { return db.prepare("SELECT COUNT(*) AS c FROM file_mtimes").get().c; } catch { return 0; }
+ })();
+ const sizeMb = (statSync(embDb).size / 1024 / 1024).toFixed(1);
  db.close();
- console.log(` ✅ Vector store: ${entries} entries across ${files} sources (${sizeMb} MB SQLite)`);
+
+ console.log(` ✅ Provider: ${model}${dim ? ` (${tier}, ${dim}-dim)` : ` (${tier})`}`);
+ console.log(` ${entries} entries / ${files} files indexed, ${sizeMb} MB on disk`);
+ if (lastReindex) {
+ const ago = Math.round((Date.now() - lastReindex) / 1000 / 60);
+ console.log(` Last reindex: ${ago < 60 ? `${ago} min ago` : `${Math.round(ago / 60)} h ago`}`);
+ }
+ const injectMode = (getEnv("MEMORY_INJECT_MODE") || "auto").toLowerCase();
+ const effective = injectMode === "auto" ? (entries > 0 ? "sqlite" : "legacy") : injectMode;
+ console.log(` Inject mode: ${effective}${injectMode === "auto" ? " (auto)" : ""}`);
  } catch (err) {
- console.log(` ⚠️ Vector store exists but unreadable: ${err.message}`);
+ console.log(` ⚠️ Memory store exists but unreadable: ${err.message}`);
  }
  } else if (existsSync(embJson)) {
  const sizeMb = (statSync(embJson).size / 1024 / 1024).toFixed(0);
@@ -1435,7 +1478,13 @@ async function doctor() {
  } else if (existsSync(embBakSqlite)) {
  console.log(` ✅ Migration to SQLite already done (legacy JSON kept as .bak-pre-sqlite)`);
  } else {
- console.log(` ℹ️ No vector store yet — will be built on first message`);
+ // Predict which provider will be picked on first start.
+ const hasGoogle = !!getEnv("GOOGLE_API_KEY");
+ const hasOpenAI = !!getEnv("OPENAI_API_KEY");
+ console.log(` ℹ️ Memory store not initialised yet (will be on first bot start)`);
+ if (hasGoogle) console.log(` Will use: Gemini (3072-dim, semantic)`);
+ else if (hasOpenAI) console.log(` Will use: OpenAI text-embedding-3-small (1536-dim, semantic)`);
+ else console.log(` Will use: FTS5 keyword (zero-config). Set GOOGLE_API_KEY or OPENAI_API_KEY for semantic vectors.`);
  }
 
  // ── Extras ──
@@ -370,7 +370,12 @@ export function registerCommands(bot) {
  // Memory stats
  const memStats = getMemoryStats();
  const idxStats = getIndexStats();
- const memLine = `${memStats.dailyLogs} days, ${memStats.todayEntries} entries today, ${formatBytes(memStats.longTermSize)} LTM | 🔍 ${idxStats.entries} vectors`;
+ const { getEffectiveInjectMode, getInjectModeRaw } = await import("../services/memory-inject-mode.js");
+ const injectMode = getEffectiveInjectMode();
+ const injectRaw = getInjectModeRaw();
+ const indexLabel = idxStats.tier === "keyword-local" ? "FTS5" : "vec";
+ const modeLabel = injectRaw === "auto" ? `${injectMode}(auto)` : injectMode;
+ const memLine = `${memStats.dailyLogs} days, ${memStats.todayEntries} entries today, ${formatBytes(memStats.longTermSize)} LTM | 🔍 ${idxStats.entries} ${indexLabel} (${idxStats.provider}) | inject:${modeLabel}`;
  // Provider health + failover state
  const healthRows = getHealthStatus();
  const failedOver = isFailedOver();
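With an FTS5 store and auto inject mode, the new status line renders along these lines (values illustrative):

```
42 days, 17 entries today, 38.2 KB LTM | 🔍 3827 FTS5 (fts5-bm25) | inject:sqlite(auto)
```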
@@ -0,0 +1,74 @@
+ /**
+  * Provider auto-detection for the memory backend.
+  *
+  * Probes available providers in priority order and returns the first one that
+  * is usable right now. The order is:
+  *
+  * 1. EMBEDDINGS_PROVIDER env override (gemini|openai|ollama|fts5) — explicit wins.
+  * 2. Gemini (free tier, 3072-dim) — when GOOGLE_API_KEY is set.
+  * 3. OpenAI (cheap, 1536-dim) — when OPENAI_API_KEY is set.
+  * 4. Ollama (local, free, 768-dim default) — when /api/tags responds AND
+  *    an embedding model is pulled. Many Ollama users only have chat
+  *    models, so we don't auto-pull; we return false from isAvailable.
+  * 5. FTS5 (always available) — universal zero-config fallback.
+  *
+  * The facade calls this once per startup and caches the chosen provider for
+  * the lifetime of the process. If the user changes EMBEDDINGS_PROVIDER or
+  * adds a key, a restart picks up the new choice (and triggers a reindex via
+  * schema-mismatch detection in the facade).
+  */
+ import { GeminiProvider } from "./gemini.js";
+ import { OpenAIProvider } from "./openai.js";
+ import { OllamaProvider } from "./ollama.js";
+ import { Fts5Provider } from "./fts5.js";
+ export function parseProviderKey(raw) {
+   const v = (raw ?? "").trim().toLowerCase();
+   switch (v) {
+     case "gemini":
+     case "openai":
+     case "ollama":
+     case "fts5":
+     case "auto":
+       return v;
+     default:
+       return "auto";
+   }
+ }
+ function instantiate(key) {
+   switch (key) {
+     case "gemini":
+       return new GeminiProvider();
+     case "openai":
+       return new OpenAIProvider();
+     case "ollama":
+       return new OllamaProvider();
+     case "fts5":
+       return new Fts5Provider();
+   }
+ }
+ /**
+  * Pick the active provider. If override is given (and not "auto"), force it
+  * regardless of availability — the facade still runs isAvailable() and
+  * surfaces a clear error if the forced provider can't actually run.
+  *
+  * Otherwise probe in priority order until one succeeds. FTS5 is the universal
+  * tail and always succeeds (assuming better-sqlite3 loaded).
+  */
+ export async function detectProvider(override) {
+   if (override && override !== "auto") {
+     return instantiate(override);
+   }
+   const tryOrder = ["gemini", "openai", "ollama", "fts5"];
+   for (const key of tryOrder) {
+     const p = instantiate(key);
+     try {
+       if (await p.isAvailable()) return p;
+     } catch {
+       // probe failure is non-fatal — try next
+     }
+   }
+   // unreachable: fts5.isAvailable always returns true
+   return new Fts5Provider();
+ }
@@ -0,0 +1,108 @@
+ /**
+  * FTS5 Memory Provider — zero-config keyword search via SQLite full-text.
+  *
+  * No API keys, no network, no embeddings. Indexes chunk text into an FTS5
+  * virtual table and ranks matches via BM25. Universal fallback when the user
+  * has no Gemini / OpenAI / Ollama configured. Excellent for proper-noun and
+  * exact-term lookups (project names, commands, error messages); weaker than
+  * vector search for synonyms and conceptual paraphrase queries.
+  *
+  * Schema:
+  *   entries_fts (id UNINDEXED, source UNINDEXED, text)
+  *   tokenizer: unicode61 with diacritic stripping (works for de/en mixed memory).
+  *
+  * Score normalisation: SQLite's bm25() returns negative numbers (more negative
+  * = more relevant). We map to [0, 1) via |bm25| / (1 + |bm25|) so more relevant
+  * rows score higher and callers can use the same minScore semantics as vector
+  * providers.
+  */
+ const TABLE = "entries_fts";
+ /** FTS5 has reserved characters/operators in MATCH queries. Sanitize to plain
+  * word-OR by extracting alphanumeric tokens and quoting each as a phrase. */
+ function sanitizeQuery(query) {
+   const tokens = query
+     .toLowerCase()
+     .split(/[\s\W]+/u)
+     .filter(t => t.length >= 2 && t.length <= 64);
+   if (tokens.length === 0) return "";
+   // Each token wrapped in double quotes makes it a literal phrase, immune to
+   // FTS5 operator characters (NEAR, AND, OR, NOT, *, etc.). Joined with OR.
+   return tokens.map(t => `"${t.replace(/"/g, '""')}"`).join(" OR ");
+ }
+ export class Fts5Provider {
+   name = "fts5-bm25";
+   dim = 0;
+   tier = "keyword-local";
+   async isAvailable() {
+     return true;
+   }
+   initSchema(db) {
+     // FTS5 doesn't allow secondary indexes on the virtual table itself;
+     // source filtering happens via WHERE clauses on the UNINDEXED column,
+     // which is fast enough at our corpus size (<100k chunks).
+     db.exec(`
+       CREATE VIRTUAL TABLE IF NOT EXISTS ${TABLE} USING fts5(
+         id UNINDEXED,
+         source UNINDEXED,
+         text,
+         tokenize = 'unicode61 remove_diacritics 2'
+       );
+     `);
+   }
+   dropSchema(db) {
+     db.exec(`DROP TABLE IF EXISTS ${TABLE};`);
+   }
+   async indexChunks(db, chunks) {
+     if (chunks.length === 0) return;
+     const ins = db.prepare(`INSERT INTO ${TABLE} (id, source, text) VALUES (?, ?, ?)`);
+     const writeAll = db.transaction((rows) => {
+       for (const c of rows) ins.run(c.id, c.source, c.text);
+     });
+     writeAll(chunks);
+   }
+   dropEntriesForSources(db, sources) {
+     if (sources.length === 0) return;
+     const del = db.prepare(`DELETE FROM ${TABLE} WHERE source = ?`);
+     const dropAll = db.transaction((srcs) => {
+       for (const s of srcs) del.run(s);
+     });
+     dropAll(sources);
+   }
+   async search(db, query, topK, minScore) {
+     const matchExpr = sanitizeQuery(query);
+     if (!matchExpr) return [];
+     let rows;
+     try {
+       rows = db
+         .prepare(`SELECT source, text, bm25(${TABLE}) AS bm25 FROM ${TABLE} WHERE ${TABLE} MATCH ? ORDER BY bm25(${TABLE}) LIMIT ?`)
+         .all(matchExpr, topK * 3);
+     } catch {
+       // FTS5 MATCH parse errors (e.g. exotic Unicode) → return empty.
+       return [];
+     }
+     return rows
+       .map(r => ({
+         text: r.text,
+         source: r.source,
+         score: Math.abs(r.bm25) / (1 + Math.abs(r.bm25)),
+       }))
+       .filter(r => r.score >= minScore)
+       .slice(0, topK);
+   }
+   countEntries(db) {
+     try {
+       const row = db.prepare(`SELECT COUNT(*) AS c FROM ${TABLE}`).get();
+       return row?.c ?? 0;
+     } catch {
+       return 0;
+     }
+   }
+ }
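A quick worked example of the sanitisation and score mapping (illustrative values; `sanitizeQuery` is module-private, shown here for exposition):

```js
sanitizeQuery('deploy NEAR "alvin-bot" — error 401');
// → '"deploy" OR "near" OR "alvin" OR "bot" OR "error" OR "401"'
// Quoting each token as a phrase neutralises FTS5 operators like NEAR/AND/OR/*.

// bm25() is more negative for better matches; |bm25| / (1 + |bm25|) keeps
// higher-is-better minScore semantics:
//   bm25 = -8.3  →  score ≈ 0.89  (strong match)
//   bm25 = -0.2  →  score ≈ 0.17  (weak match)
```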
@@ -0,0 +1,65 @@
+ /**
+  * Gemini Memory Provider — Google's gemini-embedding-001 (3072-dim).
+  *
+  * Uses the public Generative Language API. Free tier limits: 100 RPM, 30k TPM,
+  * 1500 RPD as of 2026-04. Batches up to 100 texts per request via
+  * batchEmbedContents. RETRIEVAL_DOCUMENT for index, RETRIEVAL_QUERY for search.
+  */
+ import { config } from "../../config.js";
+ import { VectorProviderBase } from "./vector-base.js";
+ const MODEL = "gemini-embedding-001";
+ const BATCH_SIZE = 100;
+ export class GeminiProvider extends VectorProviderBase {
+   name = MODEL;
+   dim = 3072;
+   tier = "vector-cloud";
+   async isAvailable() {
+     return Boolean(config.apiKeys.google);
+   }
+   async embed(texts) {
+     const apiKey = config.apiKeys.google;
+     if (!apiKey) throw new Error("GOOGLE_API_KEY not configured");
+     const out = [];
+     for (let i = 0; i < texts.length; i += BATCH_SIZE) {
+       const batch = texts.slice(i, i + BATCH_SIZE);
+       const res = await fetch(`https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:batchEmbedContents?key=${apiKey}`, {
+         method: "POST",
+         headers: { "Content-Type": "application/json" },
+         body: JSON.stringify({
+           requests: batch.map(text => ({
+             model: `models/${MODEL}`,
+             content: { parts: [{ text }] },
+             taskType: "RETRIEVAL_DOCUMENT",
+           })),
+         }),
+       });
+       if (!res.ok) {
+         throw new Error(`Gemini embeddings API error: ${res.status} — ${await res.text()}`);
+       }
+       const data = await res.json();
+       for (const e of data.embeddings) out.push(e.values);
+     }
+     return out;
+   }
+   async embedQuery(text) {
+     const apiKey = config.apiKeys.google;
+     if (!apiKey) throw new Error("GOOGLE_API_KEY not configured");
+     const res = await fetch(`https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:embedContent?key=${apiKey}`, {
+       method: "POST",
+       headers: { "Content-Type": "application/json" },
+       body: JSON.stringify({
+         model: `models/${MODEL}`,
+         content: { parts: [{ text }] },
+         taskType: "RETRIEVAL_QUERY",
+       }),
+     });
+     if (!res.ok) {
+       throw new Error(`Gemini embeddings API error: ${res.status} — ${await res.text()}`);
+     }
+     const data = await res.json();
+     return data.embedding.values;
+   }
+ }
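`vector-base.js` (the shared logic both cloud providers extend) is not included in this diff. For orientation, a sketch of what its search side plausibly does — brute-force cosine ranking over stored embeddings; the names and storage format here are assumptions, not the shipped code:

```js
// Sketch (assumed): embed the query once, rank stored chunk embeddings by
// cosine similarity, then apply the same minScore/topK semantics as FTS5.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na * nb) || 1);
}

async function searchVectors(db, provider, query, topK, minScore) {
  const q = await provider.embedQuery(query); // RETRIEVAL_QUERY task type on Gemini
  return db.prepare("SELECT source, text, embedding FROM entries").all()
    .map(r => {
      // assumes embeddings are stored as float32 BLOBs
      const vec = new Float32Array(r.embedding.buffer, r.embedding.byteOffset, r.embedding.byteLength / 4);
      return { source: r.source, text: r.text, score: cosine(q, vec) };
    })
    .filter(r => r.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```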