npm - @lacneu/openclaw-knowledge - Versions diffs - 3.2.2 → 3.2.4 - Mend

@lacneu/openclaw-knowledge 3.2.2 → 3.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31)

package/CHANGELOG.md +329 -0
package/dist/config.js +22 -0
package/dist/config.js.map +1 -1
package/dist/index.js +102 -16
package/dist/index.js.map +1 -1
package/dist/jina/classifier.d.ts +7 -2
package/dist/jina/classifier.js +4 -2
package/dist/jina/classifier.js.map +1 -1
package/dist/jina/client.d.ts +11 -1
package/dist/jina/client.js +5 -1
package/dist/jina/client.js.map +1 -1
package/dist/jina/rate-limit.d.ts +62 -0
package/dist/jina/rate-limit.js +100 -0
package/dist/jina/rate-limit.js.map +1 -0
package/dist/jina/reranker.d.ts +4 -1
package/dist/jina/reranker.js +2 -1
package/dist/jina/reranker.js.map +1 -1
package/dist/pgvector.d.ts +43 -0
package/dist/pgvector.js +47 -25
package/dist/pgvector.js.map +1 -1
package/dist/router/heuristic.js +39 -0
package/dist/router/heuristic.js.map +1 -1
package/dist/router/index.d.ts +15 -0
package/dist/router/index.js +9 -0
package/dist/router/index.js.map +1 -1
package/dist/tracing/events.d.ts +43 -2
package/dist/tracing/events.js +7 -0
package/dist/tracing/events.js.map +1 -1
package/dist/types.d.ts +53 -0
package/openclaw.plugin.json +37 -1
package/package.json +1 -1

package/CHANGELOG.md CHANGED

@@ -7,6 +7,335 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [3.2.4] - 2026-05-24
+### Added — payload-size guards for the pgvector reranker
+Production observation on jerome's Jina dashboard (2026-05-17 →
+2026-05-24): rerank calls averaged **~66 600 tokens** each — way over
+the model's 8 K context window. Almost all of that came from LightRAG-
+side reranker chunking, but the plugin-side reranker would face the
+same risk once `knowledge_jerome` is populated. v3.2.4 ships two
+preventive knobs to keep the plugin-side spend bounded:
+- **`jina.pgvectorReranker.candidatePoolMax`** (default `20`). Caps the
+  number of cosine-ranked candidates sent to Jina /v1/rerank. Pgvector
+  recall is typically 20-50 hits; only the top 10-15 are worth
+  reranking. Setting to `0` disables the cap (legacy v3.2.3 behavior).
+- **`jina.pgvectorReranker.maxCharsPerDoc`** (default `2000`). Pre-
+  truncates each candidate text BEFORE submission. Long chunks
+  (transcripts, books) carry most of their relevance signal in the
+  first ~2000 chars; the tail wastes Jina tokens without adding signal.
+  Setting to `0` disables truncation.
+Both guards are pure pre-filters — they do not alter the reranker's
+output ordering, only the input size.
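The two guards can be pictured as a pre-filter applied to the cosine-ranked candidates before the rerank request is built. The sketch below is illustrative only: `trimCandidates` and the candidate shape (`id`, `score`, `text`) are hypothetical names, not the plugin's actual implementation, and it assumes the candidates arrive sorted best-first.

```javascript
// Hypothetical helper, not the plugin's real code (that lives in dist/pgvector.js).
// Assumes `candidates` is already sorted by cosine score, best first.
function trimCandidates(candidates, { candidatePoolMax, maxCharsPerDoc } = {}) {
  // A cap of 0 (or undefined) means "no cap", matching the documented legacy behavior.
  const pool = candidatePoolMax > 0 ? candidates.slice(0, candidatePoolMax) : candidates;
  // Truncate long texts without mutating the original candidate objects.
  return pool.map((c) =>
    maxCharsPerDoc > 0 && c.text.length > maxCharsPerDoc
      ? { ...c, text: c.text.slice(0, maxCharsPerDoc) }
      : c
  );
}

// 30 fake hits of 3000 characters each.
const hits = Array.from({ length: 30 }, (_, i) => ({
  id: i,
  score: 1 - i / 100,
  text: "x".repeat(3000),
}));

const trimmed = trimCandidates(hits, { candidatePoolMax: 20, maxCharsPerDoc: 2000 });
const legacy = trimCandidates(hits, { candidatePoolMax: 0, maxCharsPerDoc: 0 });
console.log(trimmed.length, trimmed[0].text.length, legacy.length, legacy[0].text.length);
```

With the defaults, the request carries at most 20 documents of at most 2000 characters each. Passing `0` for both knobs leaves the input untouched.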
+### Added — `jina` usage events
+The `JinaUsageEvent` shape (already exported since v3.2.0) is now
+actually emitted on every successful Jina API call:
+- **`/v1/classify`** — emitted from `decideRoute` via the new
+  `RouterConfig.onClassifierUsage` callback. `inputCount: 1` (one
+  query per call), `model: "zero-shot" | "few-shot"`.
+- **`/v1/rerank`** — emitted from `rerankPgvectorResults` via the new
+  `RerankPgvectorParams.onUsage` callback. `inputCount: N` (post-trim
+  document count).
+Dashboards can now graph Jina consumption per turn AND per endpoint without
+re-deriving from log timestamps.
+### Added — soft RPM monitor
+A lightweight sliding-window counter (`src/jina/rate-limit.ts`)
+observes outbound Jina request rate across the whole plugin. When the
+configured `jina.rpmBudget` is exceeded within any 60-second window,
+the plugin:
+- emits a single `{type: "jina_rpm_exceeded", count, budget}` event
+  for that window,
+- logs a warning naming the budget and the observed count.
+**The call is never blocked.** The existing 429 cooldown breaker
+remains the hard backstop; the monitor only adds visibility so the
+operator can alert BEFORE billing surprises hit — especially relevant
+when the API key is shared with another service (e.g. Hindsight).
+Default budget: `60` RPM (well below the Jina free-tier 100 RPM
+ceiling). `0` disables the monitor.
+### LightRAG companion changes (operator action)
+A non-trivial share of Jina-token saving sits in the LightRAG `.env`
+config, NOT the plugin. The companion
+`openclaw-notes/lightrag/.env.template` is updated in the same patch
+to recommend:
+- `RERANK_MAX_TOKENS_PER_DOC=600` (was `480`): each candidate splits
+  into 2 sub-rerank calls instead of 3 (1200/600=2 vs ⌈1200/480⌉=3).
+  Cuts the per-rerank Jina spend by **~33%** at no quality cost
+  (sub-chunks remain complete; payload fits cleanly in v2's 8K
+  context window with no server-side truncation).
+- `MIN_RERANK_SCORE=0.05` (was `0.0`): drops pure-noise reranked
+  chunks. Cleaner injected context, ~10% fewer chars on the
+  downstream LLM prompt.
+- `RERANK_ENABLE_CHUNKING` stays `true` — disabling it would push
+  the cumulative payload (10 docs × 1200 tokens ≈ 12 K) over the
+  8 K v2-multilingual context window, triggering silent server-side
+  truncation (Jina's `truncate:true` policy) and lossy ranking. See
+  the `.env.template` block for the full operating-point comparison
+  (chunking-ON vs TOP_K=5 vs jina-reranker-v3).
+Apply on the NAS:
+```bash
+sudo vi /volume3/openclaw/lightrag/.env.jerome
+# Change: RERANK_MAX_TOKENS_PER_DOC=600
+# Change: MIN_RERANK_SCORE=0.05
+sudo docker restart openclaw-lightrag-jerome
+```
+Expected saving on jerome's observed rate: ~8.4 M → ~5.6 M tokens
+per 7 days (-33%), no quality regression.
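The arithmetic behind the ~33% figure can be checked in a few lines. This is a sketch under the changelog's own assumptions: a 1200-token document, and Jina spend that scales with the number of sub-rerank calls. The helper name `subRerankCalls` is hypothetical.

```javascript
// Assumption: a 1200-token document, the figure used in the changelog arithmetic.
const DOC_TOKENS = 1200;

// Number of sub-rerank calls when a document is split into chunks of at most
// `maxTokensPerDoc` tokens. A partial last chunk still costs a full call,
// hence the ceiling.
function subRerankCalls(docTokens, maxTokensPerDoc) {
  return Math.ceil(docTokens / maxTokensPerDoc);
}

const callsBefore = subRerankCalls(DOC_TOKENS, 480); // 1200/480 = 2.5, rounded up
const callsAfter = subRerankCalls(DOC_TOKENS, 600); // 1200/600 = 2 exactly
const savingRatio = 1 - callsAfter / callsBefore;

// Weekly spend observed in the changelog, scaled by the call ratio.
const weeklyTokensBefore = 8.4e6;
const weeklyTokensAfter = weeklyTokensBefore * (callsAfter / callsBefore);

console.log(callsBefore, callsAfter, Math.round(savingRatio * 100), Math.round(weeklyTokensAfter));
```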
+### Migration
+Drop-in plugin patch. Defaults preserve the v3.2.3 behavior on
+runtimes that don't set the new fields. To activate the new caps on
+both instances:
+```bash
+sudo docker exec openclaw-jerome openclaw plugins update @lacneu/openclaw-knowledge
+sudo docker exec openclaw-olivier openclaw plugins update @lacneu/openclaw-knowledge
+sudo docker restart openclaw-jerome openclaw-olivier
+```
+Verify post-restart:
+- `[knowledge.event] {"type":"jina","endpoint":"classify",...}` lines
+  appear on every classifier call.
+- `[knowledge.event] {"type":"jina","endpoint":"rerank","inputCount":N,...}`
+  lines appear on every rerank call. `inputCount` is now bounded by
+  `candidatePoolMax`.
+- `[knowledge.event] {"type":"jina_rpm_exceeded",...}` appears ONLY
+  during real overshoots (typically silent on single-user workloads).
+### Codex pass #33 correction (P2)
+`rpmBudget: 0` is documented as "disables the monitor entirely" on
+both `JinaPluginConfig.rpmBudget` and the `openclaw.plugin.json`
+schema. The initial implementation contradicted that promise: the
+overshoot check `count > this.budget` evaluated to `true` on the very
+first record (`1 > 0`), firing a spurious `jina_rpm_exceeded` event
+on every plugin turn.
+Fixed with defense-in-depth:
+- **`RpmMonitor.record()` / `RpmMonitor.peek()`** short-circuit to
+  `0` when `budget <= 0`. No timestamp tracked, no `onExceeded`
+  callback ever fires. Negative budgets get the same treatment.
+- **`src/index.ts`** skips constructing the monitor entirely when
+  `config.jinaRpmBudget === 0` — `rpmMonitor` becomes `undefined`
+  and every `rpmMonitor?.record()` callsite becomes a free no-op.
+### Codex pass #34 correction (P3)
+`RpmMonitor.lastExceededNotice` was initialized to `0`. With a test
+clock starting at `now=0` (a documented use case for
+`RpmMonitorOptions.now`), the dedup check
+`t - lastExceededNotice >= 60_000` could not be satisfied until
+simulated time reached 60s, so the first alert was silently
+suppressed during the first minute of any such test.
+Initialized to `Number.NEGATIVE_INFINITY` so the first overshoot
+fires regardless of clock origin. Production `Date.now()` was always
+well above 60_000, so this is a test-correctness fix with no behavior
+change in production.
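A minimal sliding-window monitor that incorporates both Codex corrections might look like the sketch below. It is not the contents of `src/jina/rate-limit.ts`: the class name `RpmMonitorSketch`, the `windowMs` option and the internal `prune` method are assumptions made for illustration. Only `budget`, `onExceeded`, `now`, `record()`, `peek()` and `lastExceededNotice` are named in the changelog.

```javascript
// Illustrative sketch only; the real RpmMonitor may differ in structure.
class RpmMonitorSketch {
  constructor({ budget, onExceeded, now = Date.now, windowMs = 60_000 }) {
    this.budget = budget;
    this.onExceeded = onExceeded;
    this.now = now;
    this.windowMs = windowMs;
    this.timestamps = [];
    // Codex #34: NEGATIVE_INFINITY lets the first overshoot fire even when
    // an injected test clock starts at 0.
    this.lastExceededNotice = Number.NEGATIVE_INFINITY;
  }

  // Record one outbound request and return the count in the current window.
  // Never blocks or throws: the monitor only observes.
  record() {
    // Codex #33: a budget of 0 (or negative) disables the monitor entirely.
    if (this.budget <= 0) return 0;
    const t = this.now();
    this.timestamps.push(t);
    this.prune(t);
    const count = this.timestamps.length;
    if (count > this.budget && t - this.lastExceededNotice >= this.windowMs) {
      this.lastExceededNotice = t; // at most one notice per window
      this.onExceeded?.({ count, budget: this.budget });
    }
    return count;
  }

  // Read the current count without recording a request.
  peek() {
    if (this.budget <= 0) return 0;
    this.prune(this.now());
    return this.timestamps.length;
  }

  // Drop timestamps that have left the sliding window.
  prune(t) {
    while (this.timestamps.length > 0 && t - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
  }
}

// Usage with a simulated clock that starts at 0.
let clock = 0;
const notices = [];
const monitor = new RpmMonitorSketch({
  budget: 2,
  now: () => clock,
  onExceeded: (notice) => notices.push(notice),
});
const counts = [monitor.record(), monitor.record(), monitor.record(), monitor.record()];
clock = 61_000;
const afterExpiry = monitor.peek();

const disabled = new RpmMonitorSketch({ budget: 0, onExceeded: () => notices.push("never") });
const disabledCount = disabled.record();
```

In this run the third call overshoots the budget of 2 and produces the only notice; the fourth call falls in the same window and is deduplicated.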
+### Test coverage
+- Total: 271 tests, all green (was 250 in 3.2.3; +21 new).
+- New file `test/jina/rate-limit.test.ts` (12 tests) — sliding-window
+  count, 60s expiry, one-notice-per-window deduplication,
+  `DEFAULT_RPM_BUDGET` constant, **Codex #33 regression: budget=0
+  AND negative-budget disable the monitor completely**,
+  **Codex #34 regression: first overshoot fires even with a test
+  clock starting at `now=0`**.
+- 5 new tests in `test/pgvector.test.ts` — `candidatePoolMax` cap,
+  `maxCharsPerDoc` truncation, `onUsage` callback fires on success,
+  `onUsage` NOT fired on failure, legacy `undefined` keeps no-trim
+  behavior.
+- 4 new tests in `test/config.test.ts` — default values, override,
+  `[0, ∞)` clamping on the three new numeric fields.
+- 1 fixture update in `test/tracing/events.test.ts` for the new
+  `JinaRpmExceededEvent` shape.
+## [3.2.3] - 2026-05-23
+### Added — observability for "ran and matched nothing" (P1)
+Post-3.2.2 production observation: when the router lets retrieval go
+through (route=ALL or route=PGVECTOR_ONLY/LIGHTRAG_ONLY), the event
+log can stay completely silent if a source returns zero hits above
+the score threshold. That makes it impossible to distinguish:
+- "pgvector / LightRAG ran and matched nothing"  →  knowledge gap
+- "pgvector / LightRAG was never called"         →  config / cooldown
+Both cases looked identical from dashboards. v3.2.3 fixes that.
+**`pgvector` event now emitted unconditionally**. Previously the event
+was inside the `if (formatted)` block in `renderSection`; a 0-hit
+query produced no event line. The event now fires every time pgvector
+is consulted, with `rawCount: 0`, `topScore: null`, `rerankedCount: null`
+when nothing matched. The plugin still returns `null` from
+`renderSection` (no prompt injection), but operators can now see the
+attempt and the empty result in dashboards.
+**`lightrag` event gains a `sparse: boolean` field**. Set to `true`
+when the truncated payload is shorter than the new constant
+`LIGHTRAG_SPARSE_THRESHOLD_CHARS` (200). The threshold is calibrated
+on the production noise floor observed 2026-05-23 19:42:05 where
+LightRAG returned 70 chars of stub on a 1862-char query (OWUI title
+generation — see P2 below). 200 chars ≈ two short sentences; below
+that, the response cannot ground a non-trivial answer.
+The `lightrag` event is now also emitted on empty responses, with
+`contextChars: 0` and `sparse: true`, for the same visibility reason
+as the pgvector change above.
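The `sparse` rule described above reduces to a length comparison. In the sketch below, the threshold value comes from the changelog, while the helper name `describeLightragPayload` and the reduced event shape (only `type`, `contextChars`, `sparse`) are assumptions; the real event carries more fields.

```javascript
// Value stated in the changelog for LIGHTRAG_SPARSE_THRESHOLD_CHARS.
const LIGHTRAG_SPARSE_THRESHOLD_CHARS = 200;

// Hypothetical helper: derives the two fields discussed above from the
// (already truncated) LightRAG payload. Empty or missing payloads still
// produce an event so that dashboards can see the attempt.
function describeLightragPayload(payload) {
  const text = payload ?? "";
  return {
    type: "lightrag",
    contextChars: text.length,
    sparse: text.length < LIGHTRAG_SPARSE_THRESHOLD_CHARS,
  };
}

const emptyEvent = describeLightragPayload("");
const stubEvent = describeLightragPayload("x".repeat(70)); // like the 70-char stub case
const denseEvent = describeLightragPayload("x".repeat(200)); // exactly at the threshold
console.log(emptyEvent, stubEvent, denseEvent);
```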
+### Added — short-circuit Open WebUI auto-prompts (P2)
+Open WebUI re-uses the same chat thread to ask the LLM for chat
+metadata after every assistant turn:
+- Chat title:    `### Task:\nGenerate a concise, 3-5 word title …`
+- Tags:          `### Task:\nGenerate 1-3 broad tags …`
+- Follow-ups:    `### Task:\nSuggest 3-5 follow-up questions …`
+- Summary:       `### Task:\nCreate a short summary …`
+These are not user questions and have no business hitting the
+knowledge base. Previously the heuristic let them through, the
+Jina classifier was billed, and on jerome they typically scored 0.27
+→ `classifier_low_confidence` → ALL → wasted LightRAG call (the
+2026-05-23 19:42:05 case in the issue).
+The new META_PATTERN catches them at the start of the prompt:
+```
+/^\s*###\s*Task:\s*\n\s*(?:Generate|Suggest|Create)\s+/i
+```
+Result: route `NONE`, reason `heuristic_meta`, **zero Jina spend and
+zero RAG call** for the OWUI metadata loop. Anchored on `^` so a user
+who quotes the template inside a real question keeps their content.
+### Migration
+Drop-in patch. No config change required.
+```bash
+sudo docker exec openclaw-jerome openclaw plugins update @lacneu/openclaw-knowledge
+sudo docker exec openclaw-olivier openclaw plugins update @lacneu/openclaw-knowledge
+sudo docker restart openclaw-jerome openclaw-olivier
+```
+After restart, expect:
+- `[knowledge.event] {"type":"pgvector","rawCount":0,...}` lines on
+  queries where the collection had no hits above threshold.
+- `[knowledge.event] {"type":"lightrag",...,"sparse":true}` lines on
+  thin responses — track this as a KG-coverage indicator.
+- Zero `[knowledge.event] {"type":"router",...}` lines on OWUI
+  title/tag/followup prompts. They now appear as
+  `route=NONE,reason=heuristic_meta` and no downstream events follow.
+### Codex pass #28 corrections (P2)
+**P2 #1 — OWUI META_PATTERN tightened from verb-only to structural.**
+The initial pattern `^\s*### Task:\n\s*(Generate|Suggest|Create)\s+`
+also matched legitimate user prompts such as
+`### Task:\nCreate a migration plan from the docs`. The verb alone is
+not discriminant.
+**Codex pass #29 P2 follow-up — first try.**
+The structural triple `### Task:` + `### Output:` + `JSON format: {`
+also matched legitimate structured-output user tasks
+(`### Output:\nJSON format: { "clients": ["..."] }`). We added a
+whitelist on the first JSON key (title / tags / follow_ups / summary).
+**Codex pass #30 — second try: `<chat_history>` XML block.**
+The four canonical OWUI keys are themselves not specific enough — a
+user can legitimately ask for `{ "summary": "..." }` of documents.
+Added `<chat_history>…</chat_history>` anchor.
+**Codex pass #31 P2 — final discriminator: full 4-section template
+ending at EOF.**
+Even `<chat_history>…</chat_history>` was insufficient — a user
+analyzing the OWUI template could paste the block as content. The
+final pattern stacks FOUR OWUI-specific structural markers:
+```
+/^\s*###\s*Task:[\s\S]{1,16000}\n###\s*Output:[\s\S]{1,4000}?\n###\s*Chat\s+History:\s*\n<chat_history>[\s\S]{0,32000}<\/chat_history>\s*$/i
+```
+- `### Task:`         at the START of the prompt (anchored)
+- `### Output:`       OWUI directive section
+- `### Chat History:` OWUI section header (literal — a user
+                      pasting the XML inline omits this)
+- `<chat_history>…</chat_history>` block, AT END-OF-PROMPT (`\s*$`).
+  OWUI auto-prompts terminate exactly here; a user analyzing the
+  template typically appends a question AFTER the closing tag,
+  defeating the end anchor.
+Bounded quantifiers `{1,16000}`, `{1,4000}?`, `{0,32000}` cap how
+far the regex engine can scan or backtrack on malformed input.
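The final pattern can be exercised against a few sample prompts. The regex below is copied from the changelog; the prompts are hypothetical and abbreviated, and a real Open WebUI template contains more sections than this one.

```javascript
// Pattern copied verbatim from the changelog entry above.
const META_PATTERN =
  /^\s*###\s*Task:[\s\S]{1,16000}\n###\s*Output:[\s\S]{1,4000}?\n###\s*Chat\s+History:\s*\n<chat_history>[\s\S]{0,32000}<\/chat_history>\s*$/i;

// Abbreviated, made-up stand-in for an OWUI title-generation prompt.
const owuiPrompt = [
  "### Task:",
  "Generate a concise, 3-5 word title summarizing the chat history.",
  "### Output:",
  'JSON format: { "title": "your concise title here" }',
  "### Chat History:",
  "<chat_history>",
  "USER: hello",
  "</chat_history>",
].join("\n");

// A user who pastes the template and then asks a question after it.
const pastedThenAsked = owuiPrompt + "\n\nWhy does this template end with the history block?";

// A legitimate structured task that merely starts with "### Task:".
const realTask = "### Task:\nCreate a migration plan from the docs";

// The template quoted in the middle of a real question.
const quotedMidBody = "Please explain this prompt:\n" + owuiPrompt;

const verdicts = {
  owui: META_PATTERN.test(owuiPrompt),
  pastedThenAsked: META_PATTERN.test(pastedThenAsked),
  realTask: META_PATTERN.test(realTask),
  quotedMidBody: META_PATTERN.test(quotedMidBody),
};
console.log(verdicts);
```

Only the first prompt should be short-circuited. The other three fail one of the structural anchors (the end-of-prompt anchor, the missing sections, and the start anchor respectively) and would continue to retrieval.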
+**P2 #2 — pgvector errored vs. empty distinguished.**
+`searchCollection` previously caught every SQL error and returned
+`[]`. Combined with the unconditional event emission added earlier in
+v3.2.3, a real database failure (DB down, schema drift, network) was
+logged as `rawCount: 0` — visually identical to a clean 0-hit query.
+The catch was removed from `searchCollection`. `runPgvectorSource`
+now uses `Promise.allSettled` over the per-collection searches; a
+single failing collection no longer erases the results from the
+others (graceful degradation preserved). The new
+`PgvectorSourceResult.errored` flag propagates to the event:
+- `errored: true`  → `rawCount: null` (failure, not a metric)
+- `errored: false` → `rawCount: N`     (real recall count)
+The per-collection failure is logged with the error **class name
+only** — no SQL params, no query content — to keep PHI out of logs.
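The errored-versus-empty distinction can be sketched as a fold over the outcomes returned by `Promise.allSettled`. The helper `foldSettled` and its return shape are hypothetical. The sketch also assumes `rawCount` becomes `null` whenever any collection failed, as the two bullets above describe, and operates on hand-built outcome objects so it runs synchronously.

```javascript
// Hypothetical helper: folds already-settled per-collection searches into one
// result. `settled` has the shape produced by Promise.allSettled.
function foldSettled(settled, collections, logError) {
  const results = [];
  let errored = false;
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      results.push(...outcome.value);
    } else {
      errored = true;
      // Log the error class only: the message may carry query text or SQL params.
      const reasonClass = outcome.reason?.constructor?.name ?? "Error";
      logError(`pgvector collection "${collections[i]}" failed: ${reasonClass}`);
    }
  });
  // A failure is reported as null, never as a misleading 0.
  return { results, errored, rawCount: errored ? null : results.length };
}

const logLines = [];
const partialFailure = foldSettled(
  [
    { status: "fulfilled", value: [{ id: "a", score: 0.61 }] },
    { status: "rejected", reason: new TypeError("contains private query text") },
  ],
  ["notes", "archive"],
  (line) => logLines.push(line)
);
const cleanEmpty = foldSettled([{ status: "fulfilled", value: [] }], ["notes"], () => {});
console.log(partialFailure, cleanEmpty, logLines);
```

The surviving collection's hit is kept in the partial-failure case, while the clean zero-hit query reports `rawCount: 0` with `errored: false`.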
+### Test coverage
+- Total: 250 tests, all green (was 230 in 3.2.2; +20 new).
+- 3 new tests in `test/plugin.test.ts` covering the LightRAG
+  `sparse:false`/`sparse:true`/empty-response paths.
+- 7 new tests in `test/plugin.test.ts` covering OWUI title-gen
+  (with the full 4-section template), tag-gen (with the full
+  template), Codex P2 #1 regression (real `### Task:` prompt MUST
+  reach sources), Codex P2 #29 regression (DOMAIN-key JSON task
+  MUST reach sources), Codex P2 #30 regression (user `{"summary":...}`
+  request MUST reach sources), Codex P2 #31 regression (user pastes
+  OWUI block then asks something after MUST reach sources), and
+  the mid-body negative case.
+- 1 new test in `test/plugin.test.ts` for the Codex P2 #2 regression
+  (`errored:true` + `rawCount:null` on SQL failure).
+- 7 new tests in `test/router/heuristic.test.ts`: OWUI positive
+  template (full 4-section shape), Codex #28 negative (real
+  `### Task:` prompts NOT skipped), Codex #29 negative (domain-key
+  JSON tasks NOT skipped), Codex #30 negative (user `{summary/tags/
+  title/follow_ups}` of docs NOT skipped), two Codex #31 negatives
+  (user pastes template-then-asks; user embeds XML inline without
+  the OWUI section header), and the anchor-on-start negative.
+- 1 updated test in `test/pgvector.test.ts` — `searchCollection` now
+  rejects instead of returning `[]` on DB errors.
+- 2 fixture updates in `test/tracing/events.test.ts` for the new
+  `sparse` field on `LightRAGEvent` and the new `errored` field on
+  `PgvectorEvent`.
 ## [3.2.2] - 2026-05-23
 ### Fixed — Jina classifier silently blocked retrieval on low-confidence scores

package/dist/config.js CHANGED

@@ -2,6 +2,7 @@
 //
 // These helpers are the only place that touches `process.env`, keeping the
 // rest of the plugin easy to test with deterministic values.
+import { DEFAULT_RPM_BUDGET } from "./jina/rate-limit.js";
 import { DEFAULT_MIN_CONFIDENCE } from "./router/index.js";
 /**
  * Expand `${VAR_NAME}` patterns in a config string against `process.env`.
@@ -45,6 +46,12 @@ const DEFAULT_ROUTER_MODE = "heuristic";
 // staying below typical hit scores (≈ 0.40-0.65).
 const DEFAULT_RERANKER_MODEL = "jina-reranker-v2-base-multilingual";
 const DEFAULT_RERANKER_TOP_N = 5;
+// 3.2.4 — payload-trimming defaults. Empirically calibrated on jerome's
+// Jina dashboard (2026-05-17 → 2026-05-24): 20 candidates × 2000 chars
+// fits in ~10K tokens (well below jina-reranker-v2's 8K context window
+// once the query is added) while preserving the top-precision band.
+const DEFAULT_RERANKER_CANDIDATE_POOL_MAX = 20;
+const DEFAULT_RERANKER_MAX_CHARS_PER_DOC = 2000;
 /**
  * Apply defaults and env substitution to the raw plugin config. A source is
  * enabled when its credentials are present, unless the user explicitly toggles
@@ -80,6 +87,8 @@ export function resolveConfig(cfg = {}) {
         lightragEnabled: cfg.lightragEnabled !== false && Boolean(lightragUrl),
         // Jina shared key (used by router and/or reranker)
         jinaApiKey,
+        // 3.2.4 — soft RPM budget. 0 disables the monitor entirely.
+        jinaRpmBudget: clampNonNegInt(jina.rpmBudget ?? DEFAULT_RPM_BUDGET),
         // Router — disabled by default, even with a Jina key present, so
         // operators must opt in explicitly. "heuristic" mode is the safest
         // entry point: zero cost, deterministic.
@@ -95,6 +104,19 @@ export function resolveConfig(cfg = {}) {
         pgvectorRerankerEnabled: reranker.enabled === true && Boolean(jinaApiKey),
         pgvectorRerankerModel: reranker.model ?? DEFAULT_RERANKER_MODEL,
         pgvectorRerankerTopN: reranker.topN ?? DEFAULT_RERANKER_TOP_N,
+        // 3.2.4 — payload-size guards. `null`/`undefined` user input falls
+        // back to the production-tuned defaults; an explicit `0` disables
+        // the corresponding cap (legacy v3.2.3 behavior).
+        pgvectorRerankerCandidatePoolMax: clampNonNegInt(reranker.candidatePoolMax ?? DEFAULT_RERANKER_CANDIDATE_POOL_MAX),
+        pgvectorRerankerMaxCharsPerDoc: clampNonNegInt(reranker.maxCharsPerDoc ?? DEFAULT_RERANKER_MAX_CHARS_PER_DOC),
     };
 }
+/** Clamp a value to a non-negative integer. Bad input collapses to `0`. */
+function clampNonNegInt(value) {
+    if (!Number.isFinite(value))
+        return 0;
+    if (value < 0)
+        return 0;
+    return Math.floor(value);
+}
 //# sourceMappingURL=config.js.map
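To see how `clampNonNegInt` combines with the `??` fallback in `resolveConfig`, the function can be run on a few representative inputs. The function body below is copied from the diff; `DEFAULT_POOL_MAX` and `resolvePoolMax` are hypothetical stand-ins for the config wiring.

```javascript
// Copied from the diff above.
function clampNonNegInt(value) {
  if (!Number.isFinite(value)) return 0;
  if (value < 0) return 0;
  return Math.floor(value);
}

// Hypothetical stand-in for `reranker.candidatePoolMax ?? DEFAULT_...`.
const DEFAULT_POOL_MAX = 20;
const resolvePoolMax = (userValue) => clampNonNegInt(userValue ?? DEFAULT_POOL_MAX);

const resolved = {
  missing: resolvePoolMax(undefined), // falls back to the default
  explicitNull: resolvePoolMax(null), // `??` treats null like undefined
  zero: resolvePoolMax(0), // explicit 0 survives and disables the cap
  negative: resolvePoolMax(-5),
  fractional: resolvePoolMax(12.9),
  notANumber: resolvePoolMax(NaN),
  numericString: resolvePoolMax("15"), // Number.isFinite does not coerce strings
};
console.log(resolved);
```

One consequence worth noting: because `Number.isFinite` does not coerce its argument, a numeric string such as `"15"` collapses to `0`, which in this scheme means the cap is disabled rather than set to 15.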

package/dist/config.js.map CHANGED

	@@ -1 +1 @@
1	- {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA,gCAAgC;AAChC,EAAE;AACF,2EAA2E;AAC3E,6DAA6D;~~AAG7D~~,OAAO,EAAE,sBAAsB,EAAE,MAAM,mBAAmB,CAAC;AAU3D;;;;;GAKG;AACH,MAAM,UAAU,UAAU,CAAI,KAAQ;IACpC,IAAI,OAAO,KAAK,KAAK,QAAQ;QAAE,OAAO,KAAK,CAAC;IAC5C,OAAO,KAAK,CAAC,OAAO,CAAC,cAAc,EAAE,CAAC,CAAC,EAAE,IAAY,EAAE,EAAE;QACvD,OAAO,OAAO,CAAC,GAAG,CAAC,IAAI,CAAC,IAAI,EAAE,CAAC;IACjC,CAAC,CAAiB,CAAC;AACrB,CAAC;AAED,+EAA+E;AAC/E,SAAS,OAAO,CAAC,KAAa;IAC5B,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,KAAK,CAAC;AACf,CAAC;AAED,MAAM,oBAAoB,GAAG,kDAAkD,CAAC;AAChF,MAAM,mBAAmB,GAAG,CAAC,mBAAmB,CAAC,CAAC;AAClD,MAAM,aAAa,GAAG,CAAC,CAAC;AACxB,MAAM,uBAAuB,GAAG,GAAG,CAAC;AACpC,MAAM,wBAAwB,GAAG,IAAI,CAAC;AACtC,MAAM,qBAAqB,GAAsB,QAAQ,CAAC;AAC1D,MAAM,0BAA0B,GAAG,IAAI,CAAC;AAExC,MAAM,mBAAmB,GAAoC,WAAW,CAAC;AACzE,kEAAkE;AAClE,sEAAsE;AACtE,gEAAgE;AAChE,EAAE;AACF,wEAAwE;AACxE,uEAAuE;AACvE,kEAAkE;AAClE,kEAAkE;AAClE,kDAAkD;AAClD,MAAM,sBAAsB,GAAkB,oCAAoC,CAAC;AACnF,MAAM,sBAAsB,GAAG,CAAC,CAAC;~~AAEjC~~;;;;;;;;GAQG;AACH,MAAM,UAAU,aAAa,CAC3B,MAA6B,EAAE;IAE/B,MAAM,YAAY,GAAG,UAAU,CAAC,GAAG,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IACxD,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,oBAAoB,CAAC,CAAC;IACxE,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,EAAE,CAAC,CAAC;IACtD,MAAM,cAAc,GAAG,UAAU,CAAC,GAAG,CAAC,cAAc,IAAI,EAAE,CAAC,CAAC;IAE5D,MAAM,IAAI,GAAG,CAAC,GAAG,CAAC,IAAI,IAAI,EAAE,CAAqB,CAAC;IAClD,MAAM,MAAM,GAAG,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAuB,CAAC;IACzD,MAAM,QAAQ,GAAG,CAAC,IAAI,CAAC,gBAAgB,IAAI,EAAE,CAAiC,CAAC;IAC/E,MAAM,UAAU,GAAG,UAAU,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAC,CAAC;IACjD,MAAM,kBAAkB,GAAG,UAAU,CAAC,MAAM,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IAEjE,OAAO;QACL,OAAO,EAAE,GAAG,CAAC,OAAO,KAAK,KAAK;QAC9B,YAAY;QACZ,WAAW;QACX,WAAW,EAAE,GAAG,CAAC,WAAW,IAAI,mBAAmB;QACnD,IAAI,EAAE,GAAG,CAAC,IAAI,IAAI,aAAa;QAC/B,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,uBAAuB;QAC7D,cAA
c,EAAE,GAAG,CAAC,cAAc,IAAI,wBAAwB;QAC9D,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,YAAY,CAAC;QACvE,WAAW;QACX,cAAc;QACd,iBAAiB,EAAE,GAAG,CAAC,iBAAiB,IAAI,qBAAqB;QACjE,gBAAgB,EAAE,GAAG,CAAC,gBAAgB,IAAI,0BAA0B;QACpE,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,WAAW,CAAC;QAEtE,mDAAmD;QACnD,UAAU;~~QAEV~~,iEAAiE;QACjE,mEAAmE;QACnE,yCAAyC;QACzC,aAAa,EAAE,MAAM,CAAC,OAAO,KAAK,IAAI;QACtC,UAAU,EAAE,MAAM,CAAC,IAAI,IAAI,mBAAmB;QAC9C,kBAAkB;QAClB,iEAAiE;QACjE,+DAA+D;QAC/D,mBAAmB,EAAE,OAAO,CAC1B,MAAM,CAAC,aAAa,IAAI,sBAAsB,CAC/C;QAED,oEAAoE;QACpE,qEAAqE;QACrE,oCAAoC;QACpC,uBAAuB,EAAE,QAAQ,CAAC,OAAO,KAAK,IAAI,IAAI,OAAO,CAAC,UAAU,CAAC;QACzE,qBAAqB,EAAE,QAAQ,CAAC,KAAK,IAAI,sBAAsB;QAC/D,oBAAoB,EAAE,QAAQ,CAAC,IAAI,IAAI,sBAAsB;~~KAC9D~~,CAAC;AACJ,CAAC"}
1	+ {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA,gCAAgC;AAChC,EAAE;AACF,2EAA2E;AAC3E,6DAA6D;AAE7D,OAAO,EAAE,kBAAkB,EAAE,MAAM,sBAAsB,CAAC;AAE1D,OAAO,EAAE,sBAAsB,EAAE,MAAM,mBAAmB,CAAC;AAU3D;;;;;GAKG;AACH,MAAM,UAAU,UAAU,CAAI,KAAQ;IACpC,IAAI,OAAO,KAAK,KAAK,QAAQ;QAAE,OAAO,KAAK,CAAC;IAC5C,OAAO,KAAK,CAAC,OAAO,CAAC,cAAc,EAAE,CAAC,CAAC,EAAE,IAAY,EAAE,EAAE;QACvD,OAAO,OAAO,CAAC,GAAG,CAAC,IAAI,CAAC,IAAI,EAAE,CAAC;IACjC,CAAC,CAAiB,CAAC;AACrB,CAAC;AAED,+EAA+E;AAC/E,SAAS,OAAO,CAAC,KAAa;IAC5B,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,KAAK,CAAC;AACf,CAAC;AAED,MAAM,oBAAoB,GAAG,kDAAkD,CAAC;AAChF,MAAM,mBAAmB,GAAG,CAAC,mBAAmB,CAAC,CAAC;AAClD,MAAM,aAAa,GAAG,CAAC,CAAC;AACxB,MAAM,uBAAuB,GAAG,GAAG,CAAC;AACpC,MAAM,wBAAwB,GAAG,IAAI,CAAC;AACtC,MAAM,qBAAqB,GAAsB,QAAQ,CAAC;AAC1D,MAAM,0BAA0B,GAAG,IAAI,CAAC;AAExC,MAAM,mBAAmB,GAAoC,WAAW,CAAC;AACzE,kEAAkE;AAClE,sEAAsE;AACtE,gEAAgE;AAChE,EAAE;AACF,wEAAwE;AACxE,uEAAuE;AACvE,kEAAkE;AAClE,kEAAkE;AAClE,kDAAkD;AAClD,MAAM,sBAAsB,GAAkB,oCAAoC,CAAC;AACnF,MAAM,sBAAsB,GAAG,CAAC,CAAC;AACjC,wEAAwE;AACxE,uEAAuE;AACvE,uEAAuE;AACvE,oEAAoE;AACpE,MAAM,mCAAmC,GAAG,EAAE,CAAC;AAC/C,MAAM,kCAAkC,GAAG,IAAI,CAAC;AAEhD;;;;;;;;GAQG;AACH,MAAM,UAAU,aAAa,CAC3B,MAA6B,EAAE;IAE/B,MAAM,YAAY,GAAG,UAAU,CAAC,GAAG,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IACxD,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,oBAAoB,CAAC,CAAC;IACxE,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,EAAE,CAAC,CAAC;IACtD,MAAM,cAAc,GAAG,UAAU,CAAC,GAAG,CAAC,cAAc,IAAI,EAAE,CAAC,CAAC;IAE5D,MAAM,IAAI,GAAG,CAAC,GAAG,CAAC,IAAI,IAAI,EAAE,CAAqB,CAAC;IAClD,MAAM,MAAM,GAAG,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAuB,CAAC;IACzD,MAAM,QAAQ,GAAG,CAAC,IAAI,CAAC,gBAAgB,IAAI,EAAE,CAAiC,CAAC;IAC/E,MAAM,UAAU,GAAG,UAAU,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAC,CAAC;IACjD,MAAM,kBAAkB,GAAG,UAAU,CAAC,MAAM,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IAEjE,OAAO;QACL,OAAO,EAAE,GAAG,CAAC,OAAO,KAAK,KAAK;Q
AC9B,YAAY;QACZ,WAAW;QACX,WAAW,EAAE,GAAG,CAAC,WAAW,IAAI,mBAAmB;QACnD,IAAI,EAAE,GAAG,CAAC,IAAI,IAAI,aAAa;QAC/B,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,uBAAuB;QAC7D,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,wBAAwB;QAC9D,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,YAAY,CAAC;QACvE,WAAW;QACX,cAAc;QACd,iBAAiB,EAAE,GAAG,CAAC,iBAAiB,IAAI,qBAAqB;QACjE,gBAAgB,EAAE,GAAG,CAAC,gBAAgB,IAAI,0BAA0B;QACpE,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,WAAW,CAAC;QAEtE,mDAAmD;QACnD,UAAU;QACV,4DAA4D;QAC5D,aAAa,EAAE,cAAc,CAAC,IAAI,CAAC,SAAS,IAAI,kBAAkB,CAAC;QAEnE,iEAAiE;QACjE,mEAAmE;QACnE,yCAAyC;QACzC,aAAa,EAAE,MAAM,CAAC,OAAO,KAAK,IAAI;QACtC,UAAU,EAAE,MAAM,CAAC,IAAI,IAAI,mBAAmB;QAC9C,kBAAkB;QAClB,iEAAiE;QACjE,+DAA+D;QAC/D,mBAAmB,EAAE,OAAO,CAC1B,MAAM,CAAC,aAAa,IAAI,sBAAsB,CAC/C;QAED,oEAAoE;QACpE,qEAAqE;QACrE,oCAAoC;QACpC,uBAAuB,EAAE,QAAQ,CAAC,OAAO,KAAK,IAAI,IAAI,OAAO,CAAC,UAAU,CAAC;QACzE,qBAAqB,EAAE,QAAQ,CAAC,KAAK,IAAI,sBAAsB;QAC/D,oBAAoB,EAAE,QAAQ,CAAC,IAAI,IAAI,sBAAsB;QAC7D,mEAAmE;QACnE,kEAAkE;QAClE,kDAAkD;QAClD,gCAAgC,EAAE,cAAc,CAC9C,QAAQ,CAAC,gBAAgB,IAAI,mCAAmC,CACjE;QACD,8BAA8B,EAAE,cAAc,CAC5C,QAAQ,CAAC,cAAc,IAAI,kCAAkC,CAC9D;KACF,CAAC;AACJ,CAAC;AAED,2EAA2E;AAC3E,SAAS,cAAc,CAAC,KAAa;IACnC,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,IAAI,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC;AAC3B,CAAC"}

package/dist/index.js CHANGED

@@ -28,7 +28,8 @@ import { searchCollection, formatPgvectorResults, rerankPgvectorResults, } from
 import { queryLightRAG, formatLightRAGResults } from "./lightrag.js";
 import { decideRoute } from "./router/index.js";
 import { JinaError, summarizeJinaError } from "./jina/errors.js";
-import { emitEvent, emitTurnMetadata } from "./tracing/events.js";
+import { RpmMonitor } from "./jina/rate-limit.js";
+import { emitEvent, emitTurnMetadata, LIGHTRAG_SPARSE_THRESHOLD_CHARS, } from "./tracing/events.js";
 // Re-export helpers so the test suite can import them directly without
 // duplicating imports from every submodule.
 export { resolveEnv, resolveConfig } from "./config.js";
@@ -58,6 +59,27 @@ export function createBeforePromptBuildHandler(deps) {
         router: newCooldown(),
         pgvector_reranker: newCooldown(),
     };
+    // Per-instance RPM monitor — one sliding window per plugin runtime.
+    // The `onExceeded` callback emits a structured event the FIRST time the
+    // budget is overshot in any given 60-second window, so dashboards alert
+    // BEFORE the operator sees billing surprises (especially relevant when
+    // the Jina key is shared with another service like Hindsight).
+    //
+    // When `config.jinaRpmBudget === 0`, the monitor is fully disabled
+    // (no instance constructed, no timestamps tracked, no callback ever
+    // fires). This matches the contract documented on
+    // `JinaPluginConfig.rpmBudget`. Defense-in-depth: even if a caller
+    // bypasses this gate and constructs `RpmMonitor` with budget=0
+    // directly, the `record()` method itself short-circuits to a no-op.
+    const rpmMonitor = config.jinaRpmBudget > 0
+        ? new RpmMonitor({
+            budget: config.jinaRpmBudget,
+            onExceeded: ({ count, budget }) => {
+                logger.warn(`openclaw-knowledge: Jina RPM budget exceeded — ${count}/${budget} requests in the last 60s`);
+                emitEvent(logger, { type: "jina_rpm_exceeded", count, budget });
+            },
+        })
+        : undefined;
     return async function beforePromptBuild(event, ctx) {
         if (!config.enabled)
             return undefined;
@@ -73,7 +95,7 @@ export function createBeforePromptBuildHandler(deps) {
         // -----------------------------------------------------------------
         // Router gate — decide which sources (if any) to consult.
         // -----------------------------------------------------------------
-        const decision = await runRouterWithCooldown(config, ctx, query, cooldowns.router, logger);
+        const decision = await runRouterWithCooldown(config, ctx, query, cooldowns.router, logger, rpmMonitor);
         // Project the abstract router decision onto the sources actually
         // configured in this deployment. Without this projection, an
         // exclusive route (e.g. LIGHTRAG_ONLY) on a single-source deployment
@@ -98,7 +120,7 @@ export function createBeforePromptBuildHandler(deps) {
             if (shouldUsePgvector(effectiveRoute) &&
                 config.pgvectorEnabled &&
                 pool) {
-                tasks.push(runPgvectorSource(pool, query, config, cooldowns.pgvector_reranker, logger));
+                tasks.push(runPgvectorSource(pool, query, config, cooldowns.pgvector_reranker, logger, rpmMonitor));
             }
             if (shouldUseLightRAG(effectiveRoute) && config.lightragEnabled) {
                 tasks.push(runLightRAGSource(query, config));
@@ -195,7 +217,7 @@ export function projectRouteOnEnabledSources(route, pgvectorEnabled, lightragEna
  * meant to suppress repeated log spam during a sustained outage, not to
  * stop retrieval.
  */
-async function runRouterWithCooldown(config, ctx, query, cooldown, logger) {
+async function runRouterWithCooldown(config, ctx, query, cooldown, logger, rpmMonitor) {
     // Reset stale cooldown FIRST so we don't keep the classifier circuit
     // open longer than necessary (the first turn after expiry must be
     // able to attempt the classifier again).
@@ -217,6 +239,16 @@ async function runRouterWithCooldown(config, ctx, query, cooldown, logger) {
             jinaApiKey: config.jinaApiKey,
             classifierId: config.routerClassifierId || undefined,
             minConfidence: config.routerMinConfidence,
+            onClassifierUsage: (usage) => emitEvent(logger, {
+                type: "jina",
+                endpoint: "classify",
+                model: usage.model,
+                durationMs: usage.durationMs,
+                // 1 query item per call. Few-shot adds no labels in the
+                // body, so inputCount = 1 covers both paths.
+                inputCount: 1,
+            }),
+            rpmMonitor,
         }, {
             query,
             trigger: ctx?.trigger,
@@ -431,11 +463,32 @@ export function extractQueryFromMessages(messages) {
     }
     return "";
 }
-async function runPgvectorSource(pool, query, config, rerankerCooldown, logger) {
+async function runPgvectorSource(pool, query, config, rerankerCooldown, logger, rpmMonitor) {
     const startedAt = Date.now();
     const vector = await embedQuery(query, config.geminiApiKey);
-    const searches = config.collections.map((col) => searchCollection(pool, col, vector, config.topK, config.scoreThreshold));
-    const allResults = (await Promise.all(searches)).flat();
+    // Use `Promise.allSettled` so a single failing collection (transient DB
+    // hiccup, bad schema on one shard, etc.) does NOT erase the results
+    // from the others. `errored` is set when ANY settle is rejected so
+    // the downstream event can flag the partial failure.
+    const settled = await Promise.allSettled(config.collections.map((col) => searchCollection(pool, col, vector, config.topK, config.scoreThreshold)));
+    const allResults = [];
+    let errored = false;
+    for (let i = 0; i < settled.length; i++) {
+        const r = settled[i];
+        if (r.status === "fulfilled") {
+            allResults.push(...r.value);
+        }
+        else {
+            errored = true;
+            // SECURITY: never log r.reason directly. pg errors can include
+            // the offending SQL parameter values (the embedding vector and,
+            // historically, the query text in older driver versions). We log
+            // the constructor name only — sufficient to triage without
+            // risking PHI / query leakage.
+            const reasonClass = r.reason?.constructor?.name ?? "Error";
+            logger.error(`openclaw-knowledge: pgvector collection "${config.collections[i]}" failed — ${reasonClass}`);
+        }
+    }
     allResults.sort((a, b) => b.score - a.score);
     // Capture the recall size BEFORE the reranker runs. This is the
     // number that monitors "how many candidates did pgvector find?"
@@ -461,6 +514,7 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
             rawCount,
             reranked: false,
             durationMs: Date.now() - startedAt,
+            errored,
         };
     }
     try {
@@ -469,6 +523,16 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
             query,
             model: config.pgvectorRerankerModel,
             topN: config.pgvectorRerankerTopN,
+            candidatePoolMax: config.pgvectorRerankerCandidatePoolMax || undefined,
+            maxCharsPerDoc: config.pgvectorRerankerMaxCharsPerDoc || undefined,
+            rpmMonitor,
+            onUsage: (usage) => emitEvent(logger, {
+                type: "jina",
+                endpoint: "rerank",
+                model: config.pgvectorRerankerModel,
+                durationMs: usage.durationMs,
+                inputCount: usage.inputCount,
+            }),
         });
         rerankerCooldown.consecutiveErrors = 0;
         return {
@@ -477,6 +541,7 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
             rawCount,
             reranked: true,
             durationMs: Date.now() - startedAt,
+            errored,
         };
     }
     catch (err) {
@@ -499,6 +564,7 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
             rawCount,
             reranked: false,
             durationMs: Date.now() - startedAt,
+            errored,
         };
     }
 }
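Note on the hunks above: `runPgvectorSource` now uses `Promise.allSettled`, so one failing collection no longer discards the results of the others, and it logs only the error's constructor name. A minimal standalone sketch of that pattern follows. The helper name `collectSettled` and its `labels` / `tasks` parameters are hypothetical illustrations and are not part of the package.

```javascript
// Hypothetical helper (not from the package): gathers results from several
// promises, keeps the successful ones, and records only the error class of
// each failure so that no message text or parameter values are exposed.
async function collectSettled(labels, tasks) {
    const settled = await Promise.allSettled(tasks);
    const results = [];
    const failures = [];
    settled.forEach((r, i) => {
        if (r.status === "fulfilled") {
            // Each fulfilled task is assumed to resolve to an array of hits.
            results.push(...r.value);
        } else {
            // Constructor name only; falls back to "Error" when the reason
            // is null or undefined.
            const reasonClass = r.reason?.constructor?.name ?? "Error";
            failures.push({ label: labels[i], reasonClass });
        }
    });
    return { results, errored: failures.length > 0, failures };
}
```

A caller could pass the collection names as `labels` and the per-collection search promises as `tasks`, then forward `errored` into the emitted event, much as the diff does with its own `errored` flag.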
@@ -510,11 +576,13 @@ async function runLightRAGSource(query, config) {
 function renderSection(result, config, logger) {
     if (result.source === "pgvector") {
         const formatted = formatPgvectorResults(result.data, config.maxInjectChars);
-        if (!formatted)
-            return null;
         const topScore = result.data[0]?.score?.toFixed(2) ?? "n/a";
         const rerankNote = result.reranked ? " [reranked]" : "";
-        logger.info(`openclaw-knowledge: pgvector — ${result.data.length} result(s)${rerankNote} (top: ${topScore})`);
+        // Emit the event UNCONDITIONALLY — even when pgvector returned no
+        // result above threshold. The previous behavior (silent on empty)
+        // made it impossible to distinguish "pgvector ran and matched
+        // nothing" from "pgvector was never called". Operators need the
+        // former to monitor recall and trigger ingestion when warranted.
         emitEvent(logger, {
             type: "pgvector",
             collections: config.collections,
@@ -523,25 +591,43 @@ function renderSection(result, config, logger) {
             // final size that reaches the LLM (or `null` when the reranker
             // is inactive). This split lets operators monitor recall vs.
             // pruning independently.
-            rawCount: result.rawCount,
+            //
+            // When `errored` is set, `rawCount` is reported as `null` rather
+            // than `0` so dashboards do not conflate a partial SQL failure
+            // with a clean 0-hit query. See the `runPgvectorSource` comment
+            // about `Promise.allSettled` for the source of the flag.
+            rawCount: result.errored ? null : result.rawCount,
             rerankedCount: result.reranked ? result.data.length : null,
             topScore: result.data[0]?.score ?? null,
             durationMs: result.durationMs,
+            errored: result.errored,
         });
+        if (!formatted) {
+            logger.info(`openclaw-knowledge: pgvector — no result above threshold (rawCount=${result.rawCount})`);
+            return null;
+        }
+        logger.info(`openclaw-knowledge: pgvector — ${result.data.length} result(s)${rerankNote} (top: ${topScore})`);
         return "### Document Search Results (pgvector)\n" + formatted;
     }
     if (result.source === "lightrag") {
         const formatted = formatLightRAGResults(result.data, config.lightragMaxChars);
-        if (!formatted)
-            return null;
-        logger.info(`openclaw-knowledge: LightRAG — ${formatted.truncated.length}/${formatted.originalLength} chars (truncated from ${formatted.originalLength})`);
+        // Emit the event UNCONDITIONALLY too — sparse responses are the
+        // single most useful signal for diagnosing KG coverage gaps.
+        const truncatedLen = formatted?.truncated.length ?? 0;
+        const originalLen = formatted?.originalLength ?? result.data.length;
         emitEvent(logger, {
             type: "lightrag",
             mode: config.lightragQueryMode,
-            contextChars: formatted.originalLength,
-            truncatedChars: formatted.truncated.length,
+            contextChars: originalLen,
+            truncatedChars: truncatedLen,
             durationMs: result.durationMs,
+            sparse: truncatedLen < LIGHTRAG_SPARSE_THRESHOLD_CHARS,
         });
+        if (!formatted) {
+            logger.info(`openclaw-knowledge: LightRAG — empty response (${originalLen} chars)`);
+            return null;
+        }
+        logger.info(`openclaw-knowledge: LightRAG — ${formatted.truncated.length}/${formatted.originalLength} chars (truncated from ${formatted.originalLength})`);
         return "### Knowledge Graph Context (LightRAG)\n" + formatted.truncated;
     }
     return null;