kongbrain 0.4.4 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -2,6 +2,53 @@
2
2
 
3
3
  All notable changes to KongBrain are documented here.
4
4
 
5
+ ## [0.5.0] - 2026-04-25
6
+
7
+ Configurable embedding providers. Closes #1.
8
+
9
+ ### Features
10
+ - **Configurable embedding provider**: New `embedding.provider` config field. Options: `local` (BGE-M3 via node-llama-cpp, default and unchanged) or `openai-compat` (any OpenAI-compatible `/v1/embeddings` endpoint — OpenAI, Azure OpenAI, Together, Anyscale, vLLM, LM Studio, Ollama, DeepInfra, Fireworks).
11
+ - **OpenAI-compatible provider**: `fetch`-based, no SDK dependency. Batches inputs at 96/request, retries 429 and 5xx with exponential backoff (honoring `Retry-After`), hard-fails on 401/403/404 and `insufficient_quota` with clear error messages, and verifies that the returned dimensionality matches the config.
12
+ - **Per-row provider tagging**: Every vector-bearing table (`turn`, `concept`, `memory`, `artifact`, `identity_chunk`, `skill`, `reflection`, `monologue`) gets an `embedding_provider` column. Searches filter by the active provider so vectors from different models (different vector spaces) never mix in HNSW results.
13
+ - **Re-embed migration tool**: `npx kongbrain-reembed --from <provider-id> [--dry-run] [--tables …] [--batch …]`. Resumable on interruption (the WHERE filter naturally excludes processed rows). Reports per-table progress and estimated cost.
14
+ - **Startup mismatch warning**: Logs a clear notice (with row counts and migration command) when the configured provider does not match what is in the database.
15
+ - **Provider env overrides**: `KONGBRAIN_EMBED_PROVIDER` flips provider without editing config; `OPENAI_BASE_URL` overrides endpoint (matches the official OpenAI SDK convention); `embedding.openaiCompat.apiKeyEnv` names the env var holding the secret so keys never appear in config files.
16
+ - **Plugin manifest**: `openclaw.plugin.json` extended with `provider` / `dimensions` / `openaiCompat` schema and uiHints with inline help text.
17
+
18
+ ### Infrastructure
19
+ - **Idempotent schema migration + backfill**: All schema additions use `IF NOT EXISTS`. On first startup with this version, existing rows are tagged with `local-bge-m3`. Runs cleanly on every subsequent startup as a no-op.
20
+
21
+ ### Tests
22
+ - 439 → 469 tests. New: 17 OpenAI provider unit tests (success, batching, dim mismatch, retry, hard-fail), 4 config tests (provider, env overrides, fallback), 8 migration tests (full migrate, table filter, dry-run, blank-text, multi-batch, refusal, no-op, format), 4 backfill upgrade-path integration tests, plus 6 gated live tests (real OpenAI / real DB / real reembed) that skip in CI.
23
+
24
+ ### Documentation
25
+ - README and README.npm updated with embedding-provider comparison table, switching instructions for OpenAI / Ollama / vLLM, and migration command.
26
+
27
+ ### Upgrade notes
28
+ - **No action required for existing local BGE-M3 deployments.** The schema migration adds the new column and tags all existing rows as `local-bge-m3`. Search continues to work identically.
29
+ - **To switch providers**: set `embedding.provider: "openai-compat"` and `OPENAI_API_KEY`. On restart you will see a warning about rows in the old vector space. Run `npx kongbrain-reembed --from local-bge-m3 --dry-run` to estimate cost (~$0.04 per ~3,400 turns on text-embedding-3-small), then drop `--dry-run` to migrate. Resumable if interrupted.
30
+
31
+ ## [0.4.4] - 2026-04-04
32
+
33
+ ### Performance
34
+ - **WMR rebalance**: Cosine-dominant scoring weights; dampened the access-count feedback loop that was reinforcing already-popular memories.
35
+ - **Tag-boosted concept retrieval**: Surface topically relevant concepts even when embedding similarity alone misses them.
36
+
37
+ ### Bug Fixes
38
+ - **Empty LLM extraction responses**: `outputFormat` injected via pi-ai's `onPayload` hook caused Anthropic API to return 0 content blocks. Removed structured output from pi-ai code path; daemon's JSON parsing cascade handles free-text reliably.
39
+ - **`SELECT WHERE id IN $ids` binding**: Same silent no-op as `bumpAccessCounts` — SurrealDB string arrays don't resolve to record references. Fixed in `getSessionRetrievedMemories` and ACAN `fetchTrainingData`.
40
+ - **ACAN NaN/Infinity validation**: `loadWeights` now rejects corrupted weights (null, NaN, Infinity in bias, W_final, or spot-checked W_q/W_k rows).
41
+ - **Lazy daemon start**: If gateway restarts mid-session, `afterTurn` now starts the daemon on demand instead of silently skipping extraction.
42
+ - **`getOrCreateSession` in afterTurn**: Resumed sessions after gateway restart no longer return null.
43
+ - **Model object unwrapping**: `defaults.model` can be `{primary: "provider/model"}` — unwrap and split provider/model format.
44
+
45
+ ### Infrastructure
46
+ - **CI pipeline**: GitHub Actions with SurrealDB service container, Node 22, 439 tests (unit + integration).
47
+ - **PR checks**: Type checking + unit tests on all pull requests.
48
+
49
+ ### Tests
50
+ - 415 → 439 tests. New: ACAN NaN/Infinity validation (7), score stability/performance (3), `SELECT IN` integration test, additional integration coverage.
51
+
5
52
  ## [0.4.2] - 2026-04-03
6
53
 
7
54
  ### Performance
package/README.github.md CHANGED
@@ -11,7 +11,7 @@
11
11
  [![Node.js](https://img.shields.io/badge/Node.js-20+-339933?style=for-the-badge&logo=node.js&logoColor=white)](https://nodejs.org)
12
12
  [![SurrealDB](https://img.shields.io/badge/SurrealDB-3.0-ff00a0?style=for-the-badge&logo=surrealdb&logoColor=white)](https://surrealdb.com)
13
13
  [![OpenClaw](https://img.shields.io/badge/OpenClaw-Plugin-ff6b35?style=for-the-badge)](https://github.com/openclaw/openclaw)
14
- [![Tests](https://img.shields.io/badge/Tests-415_passing-brightgreen?style=for-the-badge&logo=vitest&logoColor=white)](https://vitest.dev)
14
+ [![Tests](https://img.shields.io/badge/Tests-469_passing-brightgreen?style=for-the-badge&logo=vitest&logoColor=white)](https://vitest.dev)
15
15
 
16
16
  **A graph-backed cognitive engine for [OpenClaw](https://github.com/openclaw/openclaw).**
17
17
 
@@ -121,7 +121,9 @@ openclaw tui
121
121
 
122
122
  That's it. KongBrain uses whatever LLM provider and model you already have configured in OpenClaw (Anthropic, OpenAI, Google, Ollama, whatever). No separate API keys needed for the brain itself.
123
123
 
124
- The BGE-M3 embedding model (~420MB) downloads automatically on first startup from [Hugging Face](https://huggingface.co/BAAI/bge-m3). All database tables and indexes are created automatically on first run. No manual setup required.
124
+ By default KongBrain runs the BGE-M3 embedding model locally via `node-llama-cpp` — the GGUF (~420MB) auto-downloads from [Hugging Face](https://huggingface.co/BAAI/bge-m3) on first startup. For high-traffic deployments the local model can become a bottleneck on serial embedding calls; in that case switch to any OpenAI-compatible API (real OpenAI, Azure OpenAI, Together, vLLM, LM Studio, Ollama) by changing one config field. See [Embedding Providers](#embedding-providers) below.
125
+
126
+ All database tables and indexes are created automatically on first run. No manual setup required.
125
127
 
126
128
  <details>
127
129
  <summary><strong>Configuration Options</strong></summary>
@@ -135,8 +137,12 @@ All options have sensible defaults. Override via plugin config or environment va
135
137
  | `surreal.pass` | `SURREAL_PASS` | (required) |
136
138
  | `surreal.ns` | `SURREAL_NS` | `kong` |
137
139
  | `surreal.db` | `SURREAL_DB` | `memory` |
138
- | `embedding.modelPath` | `KONGBRAIN_EMBEDDING_MODEL` | Auto-downloaded BGE-M3 Q4_K_M |
140
+ | `embedding.provider` | `KONGBRAIN_EMBED_PROVIDER` | `local` (or `openai-compat`) |
139
141
  | `embedding.dimensions` | - | `1024` |
142
+ | `embedding.modelPath` | `EMBED_MODEL_PATH` | Auto-downloaded BGE-M3 Q4_K_M |
143
+ | `embedding.openaiCompat.model` | - | `text-embedding-3-small` |
144
+ | `embedding.openaiCompat.baseURL` | `OPENAI_BASE_URL` | `https://api.openai.com/v1` |
145
+ | `embedding.openaiCompat.apiKeyEnv` | - | `OPENAI_API_KEY` |
140
146
 
141
147
  Full config example:
142
148
 
@@ -166,6 +172,50 @@ Full config example:
166
172
 
167
173
  </details>
168
174
 
175
+ ### Embedding Providers
176
+
177
+ | | `local` (default) | `openai-compat` |
178
+ |---|---|---|
179
+ | **Inference** | BGE-M3 GGUF via node-llama-cpp, in-process | HTTP POST to `/v1/embeddings` |
180
+ | **Cost** | Zero (~420MB model on disk, CPU inference) | Per-token API charges |
181
+ | **Throughput** | Serial; bottlenecks under high turn volume | High parallelism, batched at 96 inputs/request |
182
+ | **Network** | None required | Required |
183
+ | **Compatible servers** | n/a | OpenAI, Azure OpenAI, Together, Anyscale, vLLM, LM Studio, Ollama, DeepInfra, Fireworks |
184
+
185
+ Switching providers is safe by design. Every embedding written to the database is tagged with the provider that produced it (`embedding_provider` column). At search time, KongBrain only compares vectors that were produced by the active provider — vectors from a different provider live in a different vector space and would corrupt similarity scores if mixed.
186
+
187
+ When you switch from `local` to `openai-compat`, pre-existing rows stay in the database but become invisible to recall until re-embedded. Run the included migration tool to bring them into the new vector space:
188
+
189
+ ```bash
190
+ # Estimate cost first (no writes)
191
+ npx kongbrain-reembed --from local-bge-m3 --dry-run
192
+
193
+ # Then run for real (resumable on interruption)
194
+ npx kongbrain-reembed --from local-bge-m3
195
+ ```
196
+
197
+ Cost is small: text-embedding-3-small is $0.02 per 1M tokens; a typical 3,400-turn database costs ~$0.04 to re-embed.
198
+
199
+ #### Switching to OpenAI
200
+
201
+ ```bash
202
+ export KONGBRAIN_EMBED_PROVIDER=openai-compat
203
+ export OPENAI_API_KEY=sk-...
204
+ # Restart the plugin. Migration tool handles existing data.
205
+ ```
206
+
207
+ #### Switching to a local OpenAI-compatible server (Ollama, vLLM, LM Studio)
208
+
209
+ ```bash
210
+ # Ollama example — runs entirely locally, no API key needed
211
+ ollama pull nomic-embed-text
212
+ export OPENAI_BASE_URL=http://localhost:11434/v1
213
+ export OPENAI_API_KEY=ollama # any non-empty string; Ollama doesn't validate
214
+ export KONGBRAIN_EMBED_PROVIDER=openai-compat
215
+ ```
216
+
217
+ Set `embedding.dimensions` to match the server's native output dim. KongBrain verifies the returned vector size on every response and fails loudly if the server ignores the `dimensions` parameter — preventing silently corrupted indexes.
218
+
169
219
  ---
170
220
 
171
221
  ## Architecture
package/README.md CHANGED
@@ -7,7 +7,7 @@
7
7
  [![Node.js](https://img.shields.io/badge/Node.js-20+-339933?style=for-the-badge&logo=node.js&logoColor=white)](https://nodejs.org)
8
8
  [![SurrealDB](https://img.shields.io/badge/SurrealDB-3.0-ff00a0?style=for-the-badge&logo=surrealdb&logoColor=white)](https://surrealdb.com)
9
9
  [![OpenClaw](https://img.shields.io/badge/OpenClaw-Plugin-ff6b35?style=for-the-badge)](https://github.com/openclaw/openclaw)
10
- [![Tests](https://img.shields.io/badge/Tests-88_passing-brightgreen?style=for-the-badge&logo=vitest&logoColor=white)](https://vitest.dev)
10
+ [![Tests](https://img.shields.io/badge/Tests-469_passing-brightgreen?style=for-the-badge&logo=vitest&logoColor=white)](https://vitest.dev)
11
11
 
12
12
  **A graph-backed cognitive engine for [OpenClaw](https://github.com/openclaw/openclaw).**
13
13
 
@@ -111,7 +111,9 @@ openclaw tui
111
111
 
112
112
  That's it. KongBrain uses whatever LLM provider and model you already have configured in OpenClaw (Anthropic, OpenAI, Google, Ollama, whatever). No separate API keys needed for the brain itself.
113
113
 
114
- The BGE-M3 embedding model (~420MB) downloads automatically on first startup from [Hugging Face](https://huggingface.co/BAAI/bge-m3). All database tables and indexes are created automatically on first run. No manual setup required.
114
+ By default KongBrain runs the BGE-M3 embedding model locally via `node-llama-cpp` — the GGUF (~420MB) auto-downloads from [Hugging Face](https://huggingface.co/BAAI/bge-m3) on first startup. For high-traffic deployments the local model can become a bottleneck on serial embedding calls; in that case switch to any OpenAI-compatible API (real OpenAI, Azure OpenAI, Together, vLLM, LM Studio, Ollama) by changing one config field.
115
+
116
+ All database tables and indexes are created automatically on first run. No manual setup required.
115
117
 
116
118
  <details>
117
119
  <summary><strong>Configuration Options</strong></summary>
@@ -125,8 +127,12 @@ All options have sensible defaults. Override via plugin config or environment va
125
127
  | `surreal.pass` | `SURREAL_PASS` | (required) |
126
128
  | `surreal.ns` | `SURREAL_NS` | `kong` |
127
129
  | `surreal.db` | `SURREAL_DB` | `memory` |
128
- | `embedding.modelPath` | `KONGBRAIN_EMBEDDING_MODEL` | Auto-downloaded BGE-M3 Q4_K_M |
130
+ | `embedding.provider` | `KONGBRAIN_EMBED_PROVIDER` | `local` (or `openai-compat`) |
129
131
  | `embedding.dimensions` | - | `1024` |
132
+ | `embedding.modelPath` | `EMBED_MODEL_PATH` | Auto-downloaded BGE-M3 Q4_K_M |
133
+ | `embedding.openaiCompat.model` | - | `text-embedding-3-small` |
134
+ | `embedding.openaiCompat.baseURL` | `OPENAI_BASE_URL` | `https://api.openai.com/v1` |
135
+ | `embedding.openaiCompat.apiKeyEnv` | - | `OPENAI_API_KEY` |
130
136
 
131
137
  Full config example:
132
138
 
@@ -156,6 +162,26 @@ Full config example:
156
162
 
157
163
  </details>
158
164
 
165
+ ### Embedding Providers
166
+
167
+ | | `local` (default) | `openai-compat` |
168
+ |---|---|---|
169
+ | **Inference** | BGE-M3 GGUF via node-llama-cpp, in-process | HTTP POST to `/v1/embeddings` |
170
+ | **Cost** | Zero | Per-token API charges |
171
+ | **Throughput** | Serial; bottlenecks under high turn volume | High parallelism, batched at 96 inputs/request |
172
+ | **Compatible servers** | n/a | OpenAI, Azure OpenAI, Together, Anyscale, vLLM, LM Studio, Ollama, DeepInfra, Fireworks |
173
+
174
+ Every embedding is tagged with the provider that produced it. At search time, KongBrain only compares vectors from the active provider — vectors from a different provider live in a different vector space.
175
+
176
+ When you switch providers, run the included migration tool to re-embed pre-existing rows:
177
+
178
+ ```bash
179
+ npx kongbrain-reembed --from local-bge-m3 --dry-run # estimate cost
180
+ npx kongbrain-reembed --from local-bge-m3 # run for real (resumable)
181
+ ```
182
+
183
+ text-embedding-3-small costs ~$0.04 to re-embed a typical 3,400-turn database.
184
+
159
185
  ---
160
186
 
161
187
  ## Architecture
package/README.npm.md CHANGED
@@ -7,7 +7,7 @@
7
7
  [![Node.js](https://img.shields.io/badge/Node.js-20+-339933?style=for-the-badge&logo=node.js&logoColor=white)](https://nodejs.org)
8
8
  [![SurrealDB](https://img.shields.io/badge/SurrealDB-3.0-ff00a0?style=for-the-badge&logo=surrealdb&logoColor=white)](https://surrealdb.com)
9
9
  [![OpenClaw](https://img.shields.io/badge/OpenClaw-Plugin-ff6b35?style=for-the-badge)](https://github.com/openclaw/openclaw)
10
- [![Tests](https://img.shields.io/badge/Tests-88_passing-brightgreen?style=for-the-badge&logo=vitest&logoColor=white)](https://vitest.dev)
10
+ [![Tests](https://img.shields.io/badge/Tests-469_passing-brightgreen?style=for-the-badge&logo=vitest&logoColor=white)](https://vitest.dev)
11
11
 
12
12
  **A graph-backed cognitive engine for [OpenClaw](https://github.com/openclaw/openclaw).**
13
13
 
@@ -111,7 +111,9 @@ openclaw tui
111
111
 
112
112
  That's it. KongBrain uses whatever LLM provider and model you already have configured in OpenClaw (Anthropic, OpenAI, Google, Ollama, whatever). No separate API keys needed for the brain itself.
113
113
 
114
- The BGE-M3 embedding model (~420MB) downloads automatically on first startup from [Hugging Face](https://huggingface.co/BAAI/bge-m3). All database tables and indexes are created automatically on first run. No manual setup required.
114
+ By default KongBrain runs the BGE-M3 embedding model locally via `node-llama-cpp` — the GGUF (~420MB) auto-downloads from [Hugging Face](https://huggingface.co/BAAI/bge-m3) on first startup. For high-traffic deployments the local model can become a bottleneck on serial embedding calls; in that case switch to any OpenAI-compatible API (real OpenAI, Azure OpenAI, Together, vLLM, LM Studio, Ollama) by changing one config field.
115
+
116
+ All database tables and indexes are created automatically on first run. No manual setup required.
115
117
 
116
118
  <details>
117
119
  <summary><strong>Configuration Options</strong></summary>
@@ -125,8 +127,12 @@ All options have sensible defaults. Override via plugin config or environment va
125
127
  | `surreal.pass` | `SURREAL_PASS` | (required) |
126
128
  | `surreal.ns` | `SURREAL_NS` | `kong` |
127
129
  | `surreal.db` | `SURREAL_DB` | `memory` |
128
- | `embedding.modelPath` | `KONGBRAIN_EMBEDDING_MODEL` | Auto-downloaded BGE-M3 Q4_K_M |
130
+ | `embedding.provider` | `KONGBRAIN_EMBED_PROVIDER` | `local` (or `openai-compat`) |
129
131
  | `embedding.dimensions` | - | `1024` |
132
+ | `embedding.modelPath` | `EMBED_MODEL_PATH` | Auto-downloaded BGE-M3 Q4_K_M |
133
+ | `embedding.openaiCompat.model` | - | `text-embedding-3-small` |
134
+ | `embedding.openaiCompat.baseURL` | `OPENAI_BASE_URL` | `https://api.openai.com/v1` |
135
+ | `embedding.openaiCompat.apiKeyEnv` | - | `OPENAI_API_KEY` |
130
136
 
131
137
  Full config example:
132
138
 
@@ -156,6 +162,26 @@ Full config example:
156
162
 
157
163
  </details>
158
164
 
165
+ ### Embedding Providers
166
+
167
+ | | `local` (default) | `openai-compat` |
168
+ |---|---|---|
169
+ | **Inference** | BGE-M3 GGUF via node-llama-cpp, in-process | HTTP POST to `/v1/embeddings` |
170
+ | **Cost** | Zero | Per-token API charges |
171
+ | **Throughput** | Serial; bottlenecks under high turn volume | High parallelism, batched at 96 inputs/request |
172
+ | **Compatible servers** | n/a | OpenAI, Azure OpenAI, Together, Anyscale, vLLM, LM Studio, Ollama, DeepInfra, Fireworks |
173
+
174
+ Every embedding is tagged with the provider that produced it. At search time, KongBrain only compares vectors from the active provider — vectors from a different provider live in a different vector space.
175
+
176
+ When you switch providers, run the included migration tool to re-embed pre-existing rows:
177
+
178
+ ```bash
179
+ npx kongbrain-reembed --from local-bge-m3 --dry-run # estimate cost
180
+ npx kongbrain-reembed --from local-bge-m3 # run for real (resumable)
181
+ ```
182
+
183
+ text-embedding-3-small costs ~$0.04 to re-embed a typical 3,400-turn database.
184
+
159
185
  ---
160
186
 
161
187
  ## Architecture
package/SKILL.md CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: kongbrain
3
3
  description: Graph-backed persistent memory engine for OpenClaw. Replaces the default context window with SurrealDB + vector embeddings that learn across sessions.
4
- version: 0.4.4
4
+ version: 0.5.0
5
5
  homepage: https://github.com/42U/kongbrain
6
6
  metadata:
7
7
  openclaw:
@@ -0,0 +1,143 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * kongbrain-reembed — re-embed migration CLI.
4
+ *
5
+ * Reads connection settings from the same env vars the plugin uses, plus
6
+ * KONGBRAIN_EMBED_PROVIDER / OPENAI_BASE_URL / etc. for the target
7
+ * provider. Migrates rows tagged with --from to the active provider's
8
+ * vector space.
9
+ *
10
+ * Usage:
11
+ * kongbrain-reembed --from local-bge-m3 [--dry-run] [--tables turn,memory] [--batch 256]
12
+ *
13
+ * Resumable: each batch flips embedding_provider so processed rows leave
14
+ * the FROM filter. Restarting after a crash continues from where it
15
+ * stopped.
16
+ */
17
+
18
+ import { parsePluginConfig } from "../src/config.js";
19
+ import { createEmbeddingService } from "../src/embeddings.js";
20
+ import { SurrealStore } from "../src/surreal.js";
21
+ import {
22
+ formatResult,
23
+ reembedAll,
24
+ VECTOR_TABLES,
25
+ type VectorTable,
26
+ } from "../src/migrate-reembed.js";
27
+
28
+ interface CliFlags {
29
+ from: string | null;
30
+ dryRun: boolean;
31
+ tables: VectorTable[] | null;
32
+ batch: number;
33
+ help: boolean;
34
+ }
35
+
36
+ function parseArgs(argv: string[]): CliFlags {
37
+ const flags: CliFlags = {
38
+ from: null,
39
+ dryRun: false,
40
+ tables: null,
41
+ batch: 256,
42
+ help: false,
43
+ };
44
+ for (let i = 0; i < argv.length; i++) {
45
+ const a = argv[i];
46
+ if (a === "--help" || a === "-h") flags.help = true;
47
+ else if (a === "--dry-run") flags.dryRun = true;
48
+ else if (a === "--from") flags.from = argv[++i] ?? null;
49
+ else if (a === "--batch") flags.batch = Number(argv[++i] ?? "256");
50
+ else if (a === "--tables") {
51
+ const list = (argv[++i] ?? "").split(",").map(s => s.trim()).filter(Boolean);
52
+ const valid: VectorTable[] = [];
53
+ for (const t of list) {
54
+ if ((VECTOR_TABLES as readonly string[]).includes(t)) valid.push(t as VectorTable);
55
+ else throw new Error(`Unknown table: ${t}. Valid: ${VECTOR_TABLES.join(", ")}`);
56
+ }
57
+ flags.tables = valid;
58
+ }
59
+ }
60
+ return flags;
61
+ }
62
+
63
+ const HELP = `kongbrain-reembed — migrate embeddings between providers
64
+
65
+ Required:
66
+ --from <provider-id> Provider tag to migrate FROM (e.g. local-bge-m3)
67
+
68
+ Optional:
69
+ --dry-run Count rows + estimate cost without writing
70
+ --tables turn,memory Only migrate these tables (default: all 8)
71
+ --batch <n> Rows per batch (default: 256)
72
+ --help Show this message
73
+
74
+ The TARGET provider is whatever the active EmbeddingService produces, set
75
+ via plugin config + env vars (KONGBRAIN_EMBED_PROVIDER, OPENAI_BASE_URL,
76
+ the API key env var named in embedding.openaiCompat.apiKeyEnv).
77
+
78
+ Resumability: each batch flips the embedding_provider tag, so re-running
79
+ after an interruption picks up from where the last successful batch left
80
+ off — no checkpoint file needed.
81
+
82
+ Example: migrate from local BGE-M3 to OpenAI text-embedding-3-small at 1024d:
83
+ export KONGBRAIN_EMBED_PROVIDER=openai-compat
84
+ export OPENAI_API_KEY=sk-...
85
+ npx kongbrain-reembed --from local-bge-m3 --dry-run # check the size
86
+ npx kongbrain-reembed --from local-bge-m3 # run for real
87
+ `;
88
+
89
+ async function main(): Promise<number> {
90
+ const flags = parseArgs(process.argv.slice(2));
91
+ if (flags.help) {
92
+ console.log(HELP);
93
+ return 0;
94
+ }
95
+ if (!flags.from) {
96
+ console.error("Missing required --from <provider-id>. See --help.");
97
+ return 2;
98
+ }
99
+
100
+ const config = parsePluginConfig();
101
+ const store = new SurrealStore(config.surreal);
102
+ const embeddings = createEmbeddingService(config.embedding);
103
+
104
+ console.log(`Source provider: ${flags.from}`);
105
+ console.log(`Target provider: ${embeddings.providerId}`);
106
+ console.log(`SurrealDB: ${config.surreal.url}`);
107
+ console.log(`Mode: ${flags.dryRun ? "DRY RUN" : "WRITE"}`);
108
+ console.log("");
109
+
110
+ await store.initialize();
111
+ store.setActiveProvider(embeddings.providerId);
112
+ if (!flags.dryRun) {
113
+ await embeddings.initialize();
114
+ }
115
+
116
+ try {
117
+ const result = await reembedAll(store, embeddings, {
118
+ fromProvider: flags.from,
119
+ tables: flags.tables ?? undefined,
120
+ batchSize: flags.batch,
121
+ dryRun: flags.dryRun,
122
+ onProgress: ev => {
123
+ process.stdout.write(
124
+ `[${ev.table}] ${ev.tableProcessed}/${ev.tableTotal}\r`,
125
+ );
126
+ },
127
+ });
128
+ process.stdout.write("\n");
129
+ console.log(formatResult(result, embeddings.providerId));
130
+ return 0;
131
+ } finally {
132
+ await embeddings.dispose().catch(() => {});
133
+ await store.close().catch(() => {});
134
+ }
135
+ }
136
+
137
+ main().then(
138
+ code => process.exit(code),
139
+ err => {
140
+ console.error("kongbrain-reembed failed:", err?.message ?? err);
141
+ process.exit(1);
142
+ },
143
+ );
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "id": "kongbrain",
3
3
  "name": "KongBrain",
4
- "description": "Graph-backed cognitive context engine with SurrealDB + BGE-M3",
4
+ "description": "Graph-backed cognitive context engine with SurrealDB + pluggable embeddings (local BGE-M3 or OpenAI-compatible)",
5
5
  "kind": "context-engine",
6
6
  "requires": {
7
7
  "bins": ["surreal"],
@@ -30,16 +30,36 @@
30
30
  "label": "SurrealDB Database",
31
31
  "placeholder": "memory"
32
32
  },
33
- "embedding.modelPath": {
34
- "label": "Embedding Model Path",
35
- "placeholder": "~/.node-llama-cpp/models/bge-m3-q4_k_m.gguf",
36
- "help": "Path to BGE-M3 GGUF model file (~420MB, auto-downloaded if missing)",
37
- "advanced": true
33
+ "embedding.provider": {
34
+ "label": "Embedding Provider",
35
+ "placeholder": "local",
36
+ "help": "Either 'local' (BGE-M3 via node-llama-cpp) or 'openai-compat' (any OpenAI-compatible /v1/embeddings endpoint: OpenAI, Azure, Together, vLLM, Ollama, etc.)"
38
37
  },
39
38
  "embedding.dimensions": {
40
39
  "label": "Embedding Dimensions",
41
40
  "placeholder": "1024",
41
+ "help": "Output vector dimensionality. Must match across providers if you intend to swap them. OpenAI's text-embedding-3-* models honor arbitrary values; other OpenAI-compatible servers may return their native dimension regardless."
42
+ },
43
+ "embedding.modelPath": {
44
+ "label": "Local Model Path",
45
+ "placeholder": "~/.node-llama-cpp/models/bge-m3-q4_k_m.gguf",
46
+ "help": "Path to BGE-M3 GGUF file. Only used when provider is 'local'.",
42
47
  "advanced": true
48
+ },
49
+ "embedding.openaiCompat.model": {
50
+ "label": "OpenAI-compat Model",
51
+ "placeholder": "text-embedding-3-small",
52
+ "help": "Model name passed in the embeddings request body. Only used when provider is 'openai-compat'."
53
+ },
54
+ "embedding.openaiCompat.baseURL": {
55
+ "label": "OpenAI-compat Base URL",
56
+ "placeholder": "https://api.openai.com/v1",
57
+ "help": "Endpoint base. The OPENAI_BASE_URL env var overrides this if set. Only used when provider is 'openai-compat'."
58
+ },
59
+ "embedding.openaiCompat.apiKeyEnv": {
60
+ "label": "API Key Env Var",
61
+ "placeholder": "OPENAI_API_KEY",
62
+ "help": "Name of the env var holding the API key (the secret stays out of config files). Only used when provider is 'openai-compat'."
43
63
  }
44
64
  },
45
65
  "configSchema": {
@@ -61,8 +81,18 @@
61
81
  "type": "object",
62
82
  "additionalProperties": false,
63
83
  "properties": {
84
+ "provider": { "type": "string", "enum": ["local", "openai-compat"] },
85
+ "dimensions": { "type": "number" },
64
86
  "modelPath": { "type": "string" },
65
- "dimensions": { "type": "number" }
87
+ "openaiCompat": {
88
+ "type": "object",
89
+ "additionalProperties": false,
90
+ "properties": {
91
+ "model": { "type": "string" },
92
+ "baseURL": { "type": "string" },
93
+ "apiKeyEnv": { "type": "string" }
94
+ }
95
+ }
66
96
  }
67
97
  }
68
98
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "kongbrain",
3
- "version": "0.4.4",
3
+ "version": "0.5.0",
4
4
  "description": "Graph-backed persistent memory engine for OpenClaw. Replaces the default context window with SurrealDB + vector embeddings that learn across sessions.",
5
5
  "type": "module",
6
6
  "license": "MIT",
@@ -24,6 +24,9 @@
24
24
  "knowledge-graph",
25
25
  "llm"
26
26
  ],
27
+ "bin": {
28
+ "kongbrain-reembed": "./bin/kongbrain-reembed.ts"
29
+ },
27
30
  "openclaw": {
28
31
  "extensions": [
29
32
  "./src/index.ts"
package/src/causal.ts CHANGED
@@ -127,9 +127,12 @@ export async function queryCausalContext(
127
127
  const seen = new Set<string>(validIds);
128
128
  let frontier = validIds;
129
129
  const results: VectorSearchResult[] = [];
130
- const bindings = { vec: queryVec };
130
+ // Score only against rows in the active provider's vector space; rows from
131
+ // other providers still appear via graph traversal but score 0.
132
+ const bindings = { vec: queryVec, provider: store.getActiveProvider() };
131
133
 
132
134
  const scoreExpr = `, IF embedding != NONE AND array::len(embedding) > 0
135
+ AND embedding_provider = $provider
133
136
  THEN vector::similarity::cosine(embedding, $vec)
134
137
  ELSE 0 END AS score`;
135
138
 
@@ -146,6 +146,7 @@ export async function seedCognitiveBootstrap(
146
146
  chunk_index: i,
147
147
  text: chunk.text,
148
148
  embedding: vec,
149
+ embedding_provider: embeddings.providerId,
149
150
  importance: chunk.importance,
150
151
  },
151
152
  });
@@ -108,9 +108,10 @@ export async function linkToRelevantConcepts(
108
108
  `SELECT id, vector::similarity::cosine(embedding, $vec) AS score
109
109
  FROM concept
110
110
  WHERE embedding != NONE AND array::len(embedding) > 0
111
+ AND embedding_provider = $provider
111
112
  ORDER BY score DESC
112
113
  LIMIT $lim`,
113
- { vec, lim: limit },
114
+ { vec, lim: limit, provider: embeddings.providerId },
114
115
  );
115
116
  for (const m of matches) {
116
117
  if (m.score < threshold) break;
@@ -175,9 +176,10 @@ export async function linkConceptHierarchy(
175
176
  FROM concept
176
177
  WHERE id != $cid
177
178
  AND embedding != NONE AND array::len(embedding) > 0
179
+ AND embedding_provider = $provider
178
180
  ORDER BY score DESC
179
181
  LIMIT 3`,
180
- { vec: conceptEmb, cid: conceptId },
182
+ { vec: conceptEmb, cid: conceptId, provider: embeddings.providerId },
181
183
  );
182
184
  for (const s of similar) {
183
185
  if (s.score < 0.75) break;
package/src/config.ts CHANGED
@@ -10,9 +10,26 @@ export interface SurrealConfig {
10
10
  db: string;
11
11
  }
12
12
 
13
+ export type EmbeddingProvider = "local" | "openai-compat";
14
+
15
+ export interface OpenAICompatEmbeddingConfig {
16
+ /** Model name passed in the embeddings request body (e.g. "text-embedding-3-small"). */
17
+ model: string;
18
+ /** Endpoint base URL. Default: "https://api.openai.com/v1". */
19
+ baseURL: string;
20
+ /** Name of the env var holding the API key. Default: "OPENAI_API_KEY". */
21
+ apiKeyEnv: string;
22
+ }
23
+
13
24
  export interface EmbeddingConfig {
14
- modelPath: string;
25
+ /** Which provider to use. Default "local" (BGE-M3 via node-llama-cpp). */
26
+ provider: EmbeddingProvider;
27
+ /** Vector dimensionality the active provider should produce. */
15
28
  dimensions: number;
29
+ /** Path to the local GGUF model — only consulted when provider === "local". */
30
+ modelPath: string;
31
+ /** OpenAI-compatible provider settings — only consulted when provider === "openai-compat". */
32
+ openaiCompat: OpenAICompatEmbeddingConfig;
16
33
  }
17
34
 
18
35
  export interface ThresholdConfig {
@@ -34,6 +51,43 @@ export interface KongBrainConfig {
34
51
  thresholds: ThresholdConfig;
35
52
  }
36
53
 
54
+ function parseEmbeddingConfig(raw: Record<string, unknown>): EmbeddingConfig {
55
+ const openaiCompatRaw = (raw.openaiCompat ?? {}) as Record<string, unknown>;
56
+
57
+ // Provider precedence: env var > plugin config > default "local"
58
+ const rawProvider =
59
+ process.env.KONGBRAIN_EMBED_PROVIDER ??
60
+ (typeof raw.provider === "string" ? raw.provider : null);
61
+ const provider: EmbeddingProvider =
62
+ rawProvider === "openai-compat" ? "openai-compat" : "local";
63
+
64
+ return {
65
+ provider,
66
+ dimensions: typeof raw.dimensions === "number" ? raw.dimensions : 1024,
67
+ modelPath:
68
+ process.env.EMBED_MODEL_PATH ??
69
+ (typeof raw.modelPath === "string"
70
+ ? raw.modelPath
71
+ : join(homedir(), ".node-llama-cpp", "models", "bge-m3-q4_k_m.gguf")),
72
+ openaiCompat: {
73
+ model:
74
+ typeof openaiCompatRaw.model === "string"
75
+ ? openaiCompatRaw.model
76
+ : "text-embedding-3-small",
77
+ // baseURL: env wins (matches the official openai SDK convention)
78
+ baseURL:
79
+ process.env.OPENAI_BASE_URL ??
80
+ (typeof openaiCompatRaw.baseURL === "string"
81
+ ? openaiCompatRaw.baseURL
82
+ : "https://api.openai.com/v1"),
83
+ apiKeyEnv:
84
+ typeof openaiCompatRaw.apiKeyEnv === "string"
85
+ ? openaiCompatRaw.apiKeyEnv
86
+ : "OPENAI_API_KEY",
87
+ },
88
+ };
89
+ }
90
+
37
91
  /**
38
92
  * Parse plugin config from openclaw.plugin.json configSchema values,
39
93
  * with env var overrides and sensible defaults.
@@ -66,15 +120,7 @@ export function parsePluginConfig(raw?: Record<string, unknown>): KongBrainConfi
66
120
  ns: (typeof surreal.ns === "string" ? surreal.ns : null) ?? process.env.SURREAL_NS ?? "kong",
67
121
  db: (typeof surreal.db === "string" ? surreal.db : null) ?? process.env.SURREAL_DB ?? "memory",
68
122
  },
69
- embedding: {
70
- modelPath:
71
- process.env.EMBED_MODEL_PATH ??
72
- (typeof embedding.modelPath === "string"
73
- ? embedding.modelPath
74
- : join(homedir(), ".node-llama-cpp", "models", "bge-m3-q4_k_m.gguf")),
75
- dimensions:
76
- typeof embedding.dimensions === "number" ? embedding.dimensions : 1024,
77
- },
123
+ embedding: parseEmbeddingConfig(embedding),
78
124
  thresholds: {
79
125
  daemonTokenThreshold:
80
126
  typeof thresholds.daemonTokenThreshold === "number" ? thresholds.daemonTokenThreshold : 4000,