@lacneu/openclaw-knowledge 3.2.2 → 3.2.4

package/CHANGELOG.md CHANGED
@@ -7,6 +7,335 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
+ ## [3.2.4] - 2026-05-24
11
+
12
+ ### Added — payload-size guards for the pgvector reranker
13
+
14
+ Production observation on jerome's Jina dashboard (2026-05-17 →
15
+ 2026-05-24): rerank calls averaged **~66 600 tokens** each — way over
16
+ the model's 8 K context window. Almost all of that came from LightRAG-
17
+ side reranker chunking, but the plugin-side reranker would face the
18
+ same risk once `knowledge_jerome` is populated. v3.2.4 ships two
19
+ preventive knobs to keep the plugin-side spend bounded:
20
+
21
+ - **`jina.pgvectorReranker.candidatePoolMax`** (default `20`). Caps the
22
+ number of cosine-ranked candidates sent to Jina /v1/rerank. Pgvector
23
+ recall is typically 20-50 hits; only the top 10-15 are worth
24
+ reranking. Setting to `0` disables the cap (legacy v3.2.3 behavior).
25
+
26
+ - **`jina.pgvectorReranker.maxCharsPerDoc`** (default `2000`). Pre-
27
+ truncates each candidate text BEFORE submission. Long chunks
28
+ (transcripts, books) carry most of their relevance signal in the
29
+ first ~2000 chars; the tail wastes Jina tokens without adding signal.
30
+ Setting to `0` disables truncation.
31
+
32
+ Both guards are pure pre-filters — they do not alter the reranker's
33
+ output ordering, only the input size.
34
+
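A minimal sketch of the two guards, assuming candidates arrive sorted by descending cosine score. The helper name `trimRerankCandidates` and the candidate shape (`id`, `score`, `text`) are hypothetical, chosen for illustration; this is not the plugin's actual implementation.

```javascript
// Hypothetical helper (not the plugin's real code): applies the two
// pre-filters described above to candidates sorted by descending score.
function trimRerankCandidates(candidates, { candidatePoolMax = 20, maxCharsPerDoc = 2000 } = {}) {
  // A cap of 0 disables the corresponding guard, as documented.
  const pool = candidatePoolMax > 0 ? candidates.slice(0, candidatePoolMax) : candidates;
  return pool.map((c) => ({
    ...c,
    text: maxCharsPerDoc > 0 ? c.text.slice(0, maxCharsPerDoc) : c.text,
  }));
}

// 30 fake candidates of 5000 chars each, already ranked by cosine score.
const sampleCandidates = Array.from({ length: 30 }, (_, i) => ({
  id: i,
  score: 1 - i / 100,
  text: "x".repeat(5000),
}));

const guarded = trimRerankCandidates(sampleCandidates);
console.log(guarded.length, guarded[0].text.length); // 20 2000

const legacy = trimRerankCandidates(sampleCandidates, { candidatePoolMax: 0, maxCharsPerDoc: 0 });
console.log(legacy.length, legacy[0].text.length); // 30 5000
```

Because both steps only slice the input, the relative order of the surviving candidates is unchanged before the reranker sees them.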
35
+ ### Added — `jina` usage events
36
+
37
+ The `JinaUsageEvent` shape (already exported since v3.2.0) is now
38
+ actually emitted on every successful Jina API call:
39
+
40
+ - **`/v1/classify`** — emitted from `decideRoute` via the new
41
+ `RouterConfig.onClassifierUsage` callback. `inputCount: 1` (one
42
+ query per call), `model: "zero-shot" | "few-shot"`.
43
+ - **`/v1/rerank`** — emitted from `rerankPgvectorResults` via the new
44
+ `RerankPgvectorParams.onUsage` callback. `inputCount: N` (post-trim
45
+ document count).
46
+
47
+ Dashboards can now graph Jina consumption per turn AND per endpoint without
48
+ re-deriving from log timestamps.
49
+
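As an illustration of what a dashboard could do with these events, the sketch below tallies calls and input items per endpoint. It assumes each event is logged on a single line as `[knowledge.event] ` followed by JSON, as in the verification notes of this release; the helper name `summarizeJinaUsage` and the sample values are made up for the example.

```javascript
const EVENT_PREFIX = "[knowledge.event] ";

// Hypothetical aggregation helper: tallies calls and input items per endpoint.
function summarizeJinaUsage(logLines) {
  const summary = {};
  for (const line of logLines) {
    const at = line.indexOf(EVENT_PREFIX);
    if (at === -1) continue;
    let event;
    try {
      event = JSON.parse(line.slice(at + EVENT_PREFIX.length));
    } catch {
      continue; // not a well-formed event line
    }
    if (event.type !== "jina") continue;
    const entry = (summary[event.endpoint] ??= { calls: 0, inputCount: 0 });
    entry.calls += 1;
    entry.inputCount += event.inputCount ?? 0;
  }
  return summary;
}

// Sample lines with invented values, shaped like the documented events.
const sampleLines = [
  '[knowledge.event] {"type":"jina","endpoint":"classify","model":"zero-shot","durationMs":120,"inputCount":1}',
  '[knowledge.event] {"type":"jina","endpoint":"rerank","model":"jina-reranker-v2-base-multilingual","durationMs":340,"inputCount":20}',
  '[knowledge.event] {"type":"pgvector","rawCount":0}',
  "unrelated log line",
];

const usage = summarizeJinaUsage(sampleLines);
console.log(usage); // classify: 1 call, 1 input; rerank: 1 call, 20 inputs
```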
50
+ ### Added — soft RPM monitor
51
+
52
+ A lightweight sliding-window counter (`src/jina/rate-limit.ts`)
53
+ observes outbound Jina request rate across the whole plugin. When the
54
+ configured `jina.rpmBudget` is exceeded within any 60-second window,
55
+ the plugin:
56
+
57
+ - emits a single `{type: "jina_rpm_exceeded", count, budget}` event
58
+ for that window,
59
+ - logs a warning naming the budget and the observed count.
60
+
61
+ **The call is never blocked.** The existing 429 cooldown breaker
62
+ remains the hard backstop; the monitor only adds visibility so the
63
+ operator can alert BEFORE billing surprises hit — especially relevant
64
+ when the API key is shared with another service (e.g. Hindsight).
65
+ Default budget: `60` RPM (well below the Jina free-tier 100 RPM
66
+ ceiling). `0` disables the monitor.
67
+
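The behavior described above can be sketched as a small sliding-window counter. This is an illustrative reimplementation, not the contents of `src/jina/rate-limit.ts`: the class name `SoftRpmMonitor` and the `windowMs` option are assumptions, while `budget`, `onExceeded` and `now` follow the names used elsewhere in this release.

```javascript
// Illustrative sliding-window monitor: observes the rate, never blocks a call.
class SoftRpmMonitor {
  constructor({ budget, onExceeded, now = Date.now, windowMs = 60_000 }) {
    this.budget = budget;
    this.onExceeded = onExceeded;
    this.now = now;
    this.windowMs = windowMs;
    this.timestamps = [];
    this.lastNotice = Number.NEGATIVE_INFINITY;
  }

  // Records one outbound request and returns the count in the current window.
  record() {
    if (this.budget <= 0) return 0; // monitor disabled
    const t = this.now();
    this.timestamps.push(t);
    // Drop timestamps that fell out of the window.
    while (t - this.timestamps[0] >= this.windowMs) this.timestamps.shift();
    const count = this.timestamps.length;
    // Notify at most once per window.
    if (count > this.budget && t - this.lastNotice >= this.windowMs) {
      this.lastNotice = t;
      this.onExceeded?.({ count, budget: this.budget });
    }
    return count;
  }
}

// Five calls one second apart against a budget of 3, on a fake clock.
let fakeNow = 0;
const notices = [];
const monitor = new SoftRpmMonitor({
  budget: 3,
  now: () => fakeNow,
  onExceeded: (notice) => notices.push(notice),
});
for (let i = 0; i < 5; i++) {
  monitor.record();
  fakeNow += 1000;
}
console.log(notices); // a single notice, with count 4 and budget 3

fakeNow = 70_000; // more than 60s later: the earlier calls have expired
const countAfterExpiry = monitor.record();
console.log(countAfterExpiry); // 1
```

The fifth call also exceeds the budget but produces no second notice, because less than one window has elapsed since the first.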
68
+ ### LightRAG companion changes (operator action)
69
+
70
+ A non-trivial share of Jina-token saving sits in the LightRAG `.env`
71
+ config, NOT the plugin. The companion
72
+ `openclaw-notes/lightrag/.env.template` is updated in the same patch
73
+ to recommend:
74
+
75
+ - `RERANK_MAX_TOKENS_PER_DOC=600` (was `480`): each candidate splits
76
+ into 2 sub-rerank calls instead of 3 (⌈1200/600⌉ = 2 vs ⌈1200/480⌉ = 3).
77
+ Cuts the per-rerank Jina spend by **~33%** at no quality cost
78
+ (sub-chunks remain complete; payload fits cleanly in v2's 8K
79
+ context window with no server-side truncation).
80
+ - `MIN_RERANK_SCORE=0.05` (was `0.0`): drops pure-noise reranked
81
+ chunks. Cleaner injected context, ~10% fewer chars on the
82
+ downstream LLM prompt.
83
+ - `RERANK_ENABLE_CHUNKING` stays `true` — disabling it would push
84
+ the cumulative payload (10 docs × 1200 tokens ≈ 12 K) over the
85
+ 8 K v2-multilingual context window, triggering silent server-side
86
+ truncation (Jina's `truncate:true` policy) and lossy ranking. See
87
+ the `.env.template` block for the full operating-point comparison
88
+ (chunking-ON vs TOP_K=5 vs jina-reranker-v3).
89
+
90
+ Apply on the NAS:
91
+
92
+ ```bash
93
+ sudo vi /volume3/openclaw/lightrag/.env.jerome
94
+ # Change: RERANK_MAX_TOKENS_PER_DOC=600
95
+ # Change: MIN_RERANK_SCORE=0.05
96
+ sudo docker restart openclaw-lightrag-jerome
97
+ ```
98
+
99
+ Expected saving on jerome's observed rate: ~8.4 M → ~5.6 M tokens
100
+ per 7 days (-33%), no quality regression.
101
+
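The arithmetic behind the 33% figure can be checked with a few lines. The sketch assumes 1200-token candidates (the figure used in the changelog) and that spend scales with the number of sub-rerank calls, which is what the quoted saving implies; `subCalls` is a hypothetical helper.

```javascript
// Sub-rerank calls needed for one candidate of `docTokens` tokens.
const subCalls = (docTokens, maxTokensPerDoc) => Math.ceil(docTokens / maxTokensPerDoc);

const callsBefore = subCalls(1200, 480); // 2.5 rounds up to 3
const callsAfter = subCalls(1200, 600); // exactly 2
const ratio = callsAfter / callsBefore; // 2/3

console.log(callsBefore, callsAfter); // 3 2
console.log(`${Math.round((1 - ratio) * 100)}% fewer sub-calls`); // 33% fewer sub-calls
console.log(`${(8.4 * ratio).toFixed(1)} M tokens per 7 days`); // 5.6 M tokens per 7 days
```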
102
+ ### Migration
103
+
104
+ Drop-in plugin patch. The new caps and the RPM monitor are active by
105
+ default; set a field to `0` to keep the v3.2.3 behavior. To roll out
106
+ the patch on both instances:
107
+
108
+ ```bash
109
+ sudo docker exec openclaw-jerome openclaw plugins update @lacneu/openclaw-knowledge
110
+ sudo docker exec openclaw-olivier openclaw plugins update @lacneu/openclaw-knowledge
111
+ sudo docker restart openclaw-jerome openclaw-olivier
112
+ ```
113
+
114
+ Verify post-restart:
115
+ - `[knowledge.event] {"type":"jina","endpoint":"classify",...}` lines
116
+ appear on every classifier call.
117
+ - `[knowledge.event] {"type":"jina","endpoint":"rerank","inputCount":N,...}`
118
+ lines appear on every rerank call. `inputCount` is now bounded by
119
+ `candidatePoolMax`.
120
+ - `[knowledge.event] {"type":"jina_rpm_exceeded",...}` appears ONLY
121
+ during real overshoots (typically silent on single-user workloads).
122
+
123
+ ### Codex pass #33 correction (P2)
124
+
125
+ `rpmBudget: 0` is documented as "disables the monitor entirely" on
126
+ both `JinaPluginConfig.rpmBudget` and the `openclaw.plugin.json`
127
+ schema. The initial implementation contradicted that promise: the
128
+ overshoot check `count > this.budget` evaluated to `true` on the very
129
+ first record (`1 > 0`), firing a spurious `jina_rpm_exceeded` event
130
+ on every plugin turn.
131
+
132
+ Fixed with defense-in-depth:
133
+
134
+ - **`RpmMonitor.record()` / `RpmMonitor.peek()`** short-circuit to
135
+ `0` when `budget <= 0`. No timestamp tracked, no `onExceeded`
136
+ callback ever fires. Negative budgets get the same treatment.
137
+ - **`src/index.ts`** skips constructing the monitor entirely when
138
+ `config.jinaRpmBudget === 0` — `rpmMonitor` becomes `undefined`
139
+ and every `rpmMonitor?.record()` callsite becomes a free no-op.
140
+
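The bug and the guard fit in two one-line predicates. The function names below are hypothetical and only isolate the comparison; they are not the `RpmMonitor` methods themselves.

```javascript
// Check as first shipped: a budget of 0 trips on the very first call.
const naiveExceeded = (count, budget) => count > budget;
// Guarded check: a non-positive budget means the monitor is off.
const guardedExceeded = (count, budget) => budget > 0 && count > budget;

console.log(naiveExceeded(1, 0)); // true  (spurious alert)
console.log(guardedExceeded(1, 0)); // false (monitor disabled)
console.log(guardedExceeded(1, -5)); // false (negative budgets treated the same)
console.log(guardedExceeded(61, 60)); // true  (real overshoot)
```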
141
+ ### Codex pass #34 correction (P3)
142
+
143
+ `RpmMonitor.lastExceededNotice` was initialized to `0`. The dedup
144
+ check `t - lastExceededNotice >= 60_000` was unsatisfiable until
145
+ simulated time reached 60s when a test clock started at `now=0` — a
146
+ documented use case for `RpmMonitorOptions.now`. The first alert was
147
+ silently suppressed during the first minute of any such test.
148
+
149
+ Initialized to `Number.NEGATIVE_INFINITY` so the first overshoot
150
+ fires regardless of clock origin. Production `Date.now()` was always
151
+ well above 60_000, so this is a test-correctness fix with no behavior
152
+ change in production.
153
+
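A short check of the dedup comparison shows why the initial value matters. `mayNotify` is a hypothetical stand-in for the check quoted above, and the epoch-scale timestamp is an arbitrary example value.

```javascript
const WINDOW_MS = 60_000;
// Dedup check from the changelog: notify only if a full window has passed.
const mayNotify = (t, lastExceededNotice) => t - lastExceededNotice >= WINDOW_MS;

console.log(mayNotify(0, 0)); // false: test clock at 0, first alert suppressed
console.log(mayNotify(0, Number.NEGATIVE_INFINITY)); // true: first alert always fires
console.log(mayNotify(1_700_000_000_000, 0)); // true: an epoch-scale clock never hit the bug
```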
154
+ ### Test coverage
155
+
156
+ - Total: 271 tests, all green (was 250 in 3.2.3; +21 new).
157
+ - New file `test/jina/rate-limit.test.ts` (12 tests) — sliding-window
158
+ count, 60s expiry, one-notice-per-window deduplication,
159
+ `DEFAULT_RPM_BUDGET` constant, **Codex #33 regression: budget=0
160
+ AND negative-budget disable the monitor completely**,
161
+ **Codex #34 regression: first overshoot fires even with a test
162
+ clock starting at `now=0`**.
163
+ - 5 new tests in `test/pgvector.test.ts` — `candidatePoolMax` cap,
164
+ `maxCharsPerDoc` truncation, `onUsage` callback fires on success,
165
+ `onUsage` NOT fired on failure, legacy `undefined` keeps no-trim
166
+ behavior.
167
+ - 4 new tests in `test/config.test.ts` — default values, override,
168
+ `[0, ∞)` clamping on the three new numeric fields.
169
+ - 1 fixture update in `test/tracing/events.test.ts` for the new
170
+ `JinaRpmExceededEvent` shape.
171
+
172
+ ## [3.2.3] - 2026-05-23
173
+
174
+ ### Added — observability for "ran and matched nothing" (P1)
175
+
176
+ Post-3.2.2 production observation: when the router lets retrieval go
177
+ through (route=ALL or route=PGVECTOR_ONLY/LIGHTRAG_ONLY), the event
178
+ log can stay completely silent if a source returns zero hits above
179
+ the score threshold. That makes it impossible to distinguish:
180
+
181
+ - "pgvector / LightRAG ran and matched nothing" → knowledge gap
182
+ - "pgvector / LightRAG was never called" → config / cooldown
183
+
184
+ Both cases looked identical from dashboards. v3.2.3 fixes that.
185
+
186
+ **`pgvector` event now emitted unconditionally**. Previously the event
187
+ was inside the `if (formatted)` block in `renderSection`; a 0-hit
188
+ query produced no event line. The event now fires every time pgvector
189
+ is consulted, with `rawCount: 0`, `topScore: null`, `rerankedCount: null`
190
+ when nothing matched. The plugin still returns `null` from
191
+ `renderSection` (no prompt injection), but operators can now see the
192
+ attempt and the empty result in dashboards.
193
+
194
+ **`lightrag` event gains a `sparse: boolean` field**. Set to `true`
195
+ when the truncated payload is shorter than the new constant
196
+ `LIGHTRAG_SPARSE_THRESHOLD_CHARS` (200). The threshold is calibrated
197
+ on the production noise floor observed 2026-05-23 19:42:05 where
198
+ LightRAG returned 70 chars of stub on a 1862-char query (OWUI title
199
+ generation — see P2 below). 200 chars ≈ two short sentences; below
200
+ that, the response cannot ground a non-trivial answer.
201
+
202
+ The `lightrag` event is now also emitted on empty responses, with
203
+ `contextChars: 0` and `sparse: true`, for the same visibility reason
204
+ as the pgvector change above.
205
+
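A minimal sketch of how the two fields could be derived from the truncated payload. Only `LIGHTRAG_SPARSE_THRESHOLD_CHARS`, `contextChars` and `sparse` come from this release; the builder name `describeLightragPayload` is hypothetical.

```javascript
const LIGHTRAG_SPARSE_THRESHOLD_CHARS = 200;

// Hypothetical builder: derives the two fields from the truncated payload.
function describeLightragPayload(payload) {
  const contextChars = (payload ?? "").length;
  return {
    type: "lightrag",
    contextChars,
    sparse: contextChars < LIGHTRAG_SPARSE_THRESHOLD_CHARS,
  };
}

console.log(describeLightragPayload("")); // contextChars 0, sparse true
console.log(describeLightragPayload("a".repeat(70))); // contextChars 70, sparse true
console.log(describeLightragPayload("a".repeat(1500))); // contextChars 1500, sparse false
```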
206
+ ### Added — short-circuit Open WebUI auto-prompts (P2)
207
+
208
+ Open WebUI re-uses the same chat thread to ask the LLM for chat
209
+ metadata after every assistant turn:
210
+
211
+ - Chat title: `### Task:\nGenerate a concise, 3-5 word title …`
212
+ - Tags: `### Task:\nGenerate 1-3 broad tags …`
213
+ - Follow-ups: `### Task:\nSuggest 3-5 follow-up questions …`
214
+ - Summary: `### Task:\nCreate a short summary …`
215
+
216
+ These are not user questions and have no business hitting the
217
+ knowledge base. Previously the heuristic let them through, the
218
+ Jina classifier was billed, and on jerome they typically scored 0.27
219
+ → `classifier_low_confidence` → ALL → wasted LightRAG call (the
220
+ 2026-05-23 19:42:05 case in the issue).
221
+
222
+ The new META_PATTERN catches them at the start of the prompt:
223
+
224
+ ```
225
+ /^\s*###\s*Task:\s*\n\s*(?:Generate|Suggest|Create)\s+/i
226
+ ```
227
+
228
+ Result: route `NONE`, reason `heuristic_meta`, **zero Jina spend and
229
+ zero RAG call** for the OWUI metadata loop. Anchored on `^` so a user
230
+ who quotes the template inside a real question keeps their content.
231
+
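The effect of the `^` anchor can be checked directly against the pattern quoted above. The two prompts are abridged illustrations, not verbatim Open WebUI output.

```javascript
const META_PATTERN = /^\s*###\s*Task:\s*\n\s*(?:Generate|Suggest|Create)\s+/i;

// Auto-prompt shape: the template opens the prompt.
const owuiTitlePrompt = "### Task:\nGenerate a concise, 3-5 word title";
// User question that merely quotes the template further down.
const quotedByUser = "What does this template do?\n### Task:\nGenerate 1-3 broad tags";

console.log(META_PATTERN.test(owuiTitlePrompt)); // true
console.log(META_PATTERN.test(quotedByUser)); // false
```

Without the `m` flag, `^` only matches at the very start of the string, so the quoted copy on the second line is ignored.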
232
+ ### Migration
233
+
234
+ Drop-in patch. No config change required.
235
+
236
+ ```bash
237
+ sudo docker exec openclaw-jerome openclaw plugins update @lacneu/openclaw-knowledge
238
+ sudo docker exec openclaw-olivier openclaw plugins update @lacneu/openclaw-knowledge
239
+ sudo docker restart openclaw-jerome openclaw-olivier
240
+ ```
241
+
242
+ After restart, expect:
243
+
244
+ - `[knowledge.event] {"type":"pgvector","rawCount":0,...}` lines on
245
+ queries where the collection had no hits above threshold.
246
+ - `[knowledge.event] {"type":"lightrag",...,"sparse":true}` lines on
247
+ thin responses — track this as a KG-coverage indicator.
248
+ - Zero `[knowledge.event] {"type":"router",...}` lines on OWUI
249
+ title/tag/followup prompts. They now appear as
250
+ `route=NONE,reason=heuristic_meta` and no downstream events follow.
251
+
252
+ ### Codex pass #28 corrections (P2)
253
+
254
+ **P2 #1 — OWUI META_PATTERN tightened from verb-only to structural.**
255
+ The initial pattern `^\s*### Task:\n\s*(Generate|Suggest|Create)\s+`
256
+ also matched legitimate user prompts such as
257
+ `### Task:\nCreate a migration plan from the docs`. The verb alone is
258
+ not discriminant.
259
+
260
+ **Codex pass #29 P2 follow-up — first try.**
261
+ The structural triple `### Task:` + `### Output:` + `JSON format: {`
262
+ also matched legitimate structured-output user tasks
263
+ (`### Output:\nJSON format: { "clients": ["..."] }`). We added a
264
+ whitelist on the first JSON key (title / tags / follow_ups / summary).
265
+
266
+ **Codex pass #30 — second try: `<chat_history>` XML block.**
267
+ The four canonical OWUI keys are themselves not specific enough — a
268
+ user can legitimately ask for `{ "summary": "..." }` of documents.
269
+ Added `<chat_history>…</chat_history>` anchor.
270
+
271
+ **Codex pass #31 P2 — final discriminator: full 4-section template
272
+ ending at EOF.**
273
+ Even `<chat_history>…</chat_history>` was insufficient — a user
274
+ analyzing the OWUI template could paste the block as content. The
275
+ final pattern stacks FOUR OWUI-specific structural markers:
276
+
277
+ ```
278
+ /^\s*###\s*Task:[\s\S]{1,16000}\n###\s*Output:[\s\S]{1,4000}?\n###\s*Chat\s+History:\s*\n<chat_history>[\s\S]{0,32000}<\/chat_history>\s*$/i
279
+ ```
280
+
281
+ - `### Task:` at the START of the prompt (anchored)
282
+ - `### Output:` OWUI directive section
283
+ - `### Chat History:` OWUI section header (literal — a user
284
+ pasting the XML inline omits this)
285
+ - `<chat_history>…</chat_history>` block, AT END-OF-PROMPT (`\s*$`).
286
+ OWUI auto-prompts terminate exactly here; a user analyzing the
287
+ template typically appends a question AFTER the closing tag,
288
+ defeating the end anchor.
289
+
290
+ Bounded quantifiers `{1,16000}`, `{1,4000}?` (lazy), `{0,32000}` cap
291
+ the regex engine's backtracking on malformed input.
292
+
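The final pattern can be exercised against the cases listed in the Codex passes. The prompts below are abridged illustrations in the four-section shape, not the verbatim Open WebUI template, and the variable names are chosen for this example.

```javascript
const OWUI_META_PATTERN =
  /^\s*###\s*Task:[\s\S]{1,16000}\n###\s*Output:[\s\S]{1,4000}?\n###\s*Chat\s+History:\s*\n<chat_history>[\s\S]{0,32000}<\/chat_history>\s*$/i;

// Abridged prompt with all four structural markers, ending at the closing tag.
const autoPrompt = [
  "### Task:",
  "Generate a concise, 3-5 word title",
  "### Output:",
  'JSON format: { "title": "..." }',
  "### Chat History:",
  "<chat_history>",
  "USER: hello",
  "</chat_history>",
].join("\n");

// A user pastes the block, then asks something after it.
const pastedThenAsked = `${autoPrompt}\n\nWhy does this template end with the history block?`;
// A user embeds the XML inline without the "### Chat History:" header.
const inlineXmlOnly =
  "### Task:\nCreate a plan\n### Output:\nJSON\n<chat_history>USER: hi</chat_history>";
// A plain "### Task:" prompt from a user.
const realTask = "### Task:\nCreate a migration plan from the docs";

console.log(OWUI_META_PATTERN.test(autoPrompt)); // true
console.log(OWUI_META_PATTERN.test(pastedThenAsked)); // false
console.log(OWUI_META_PATTERN.test(inlineXmlOnly)); // false
console.log(OWUI_META_PATTERN.test(realTask)); // false
```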
293
+ **P2 #2 — pgvector errored vs. empty distinguished.**
294
+ `searchCollection` previously caught every SQL error and returned
295
+ `[]`. Combined with the unconditional event emission added earlier in
296
+ v3.2.3, a real database failure (DB down, schema drift, network) was
297
+ logged as `rawCount: 0` — visually identical to a clean 0-hit query.
298
+
299
+ The catch was removed from `searchCollection`. `runPgvectorSource`
300
+ now uses `Promise.allSettled` over the per-collection searches; a
301
+ single failing collection no longer erases the results from the
302
+ others (graceful degradation preserved). The new
303
+ `PgvectorSourceResult.errored` flag propagates to the event:
304
+
305
+ - `errored: true` → `rawCount: null` (failure, not a metric)
306
+ - `errored: false` → `rawCount: N` (real recall count)
307
+
308
+ The per-collection failure is logged with the error **class name
309
+ only** — no SQL params, no query content — to keep PHI out of logs.
310
+
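The errored-versus-empty distinction can be sketched with `Promise.allSettled`. The helper `collectSettled` and the fake search function are hypothetical stand-ins for `runPgvectorSource` and `searchCollection`; only the `errored` / `rawCount` semantics and the class-name-only logging follow the description above.

```javascript
// Merges settled per-collection searches; logs only the error class name.
function collectSettled(settled, collectionNames, logError = console.error) {
  const results = [];
  let errored = false;
  settled.forEach((r, i) => {
    if (r.status === "fulfilled") {
      results.push(...r.value);
    } else {
      errored = true;
      // Never log r.reason itself: it may carry query parameters.
      const reasonClass = r.reason?.constructor?.name ?? "Error";
      logError(`collection "${collectionNames[i]}" failed: ${reasonClass}`);
    }
  });
  results.sort((a, b) => b.score - a.score);
  // A failure is reported as null, not as a recall count of 0.
  return { results, errored, rawCount: errored ? null : results.length };
}

// Fake per-collection search (stand-in for the real search function).
async function fakeSearch(name) {
  if (name === "broken") throw new TypeError("connection refused");
  return [{ collection: name, score: name === "notes" ? 0.61 : 0.47 }];
}

const names = ["notes", "broken", "books"];
Promise.allSettled(names.map(fakeSearch)).then((settled) => {
  // Results from "notes" and "books" survive; errored is true, rawCount is null.
  console.log(collectSettled(settled, names));
});
```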
311
+ ### Test coverage
312
+
313
+ - Total: 250 tests, all green (was 230 in 3.2.2; +20 new).
314
+ - 3 new tests in `test/plugin.test.ts` covering the LightRAG
315
+ `sparse:false`/`sparse:true`/empty-response paths.
316
+ - 7 new tests in `test/plugin.test.ts` covering OWUI title-gen
317
+ (with the full 4-section template), tag-gen (with the full
318
+ template), Codex P2 #1 regression (real `### Task:` prompt MUST
319
+ reach sources), Codex P2 #29 regression (DOMAIN-key JSON task
320
+ MUST reach sources), Codex P2 #30 regression (user `{"summary":...}`
321
+ request MUST reach sources), Codex P2 #31 regression (user pastes
322
+ OWUI block then asks something after MUST reach sources), and
323
+ the mid-body negative case.
324
+ - 1 new test in `test/plugin.test.ts` for the Codex P2 #2 regression
325
+ (`errored:true` + `rawCount:null` on SQL failure).
326
+ - 7 new tests in `test/router/heuristic.test.ts`: OWUI positive
327
+ template (full 4-section shape), Codex #28 negative (real
328
+ `### Task:` prompts NOT skipped), Codex #29 negative (domain-key
329
+ JSON tasks NOT skipped), Codex #30 negative (user `{summary/tags/
330
+ title/follow_ups}` of docs NOT skipped), two Codex #31 negatives
331
+ (user pastes template-then-asks; user embeds XML inline without
332
+ the OWUI section header), and the anchor-on-start negative.
333
+ - 1 updated test in `test/pgvector.test.ts` — `searchCollection` now
334
+ rejects instead of returning `[]` on DB errors.
335
+ - 2 fixture updates in `test/tracing/events.test.ts` for the new
336
+ `sparse` field on `LightRAGEvent` and the new `errored` field on
337
+ `PgvectorEvent`.
338
+
10
339
  ## [3.2.2] - 2026-05-23
11
340
 
12
341
  ### Fixed — Jina classifier silently blocked retrieval on low-confidence scores
package/dist/config.js CHANGED
@@ -2,6 +2,7 @@
2
2
  //
3
3
  // These helpers are the only place that touches `process.env`, keeping the
4
4
  // rest of the plugin easy to test with deterministic values.
5
+ import { DEFAULT_RPM_BUDGET } from "./jina/rate-limit.js";
5
6
  import { DEFAULT_MIN_CONFIDENCE } from "./router/index.js";
6
7
  /**
7
8
  * Expand `${VAR_NAME}` patterns in a config string against `process.env`.
@@ -45,6 +46,12 @@ const DEFAULT_ROUTER_MODE = "heuristic";
45
46
  // staying below typical hit scores (≈ 0.40-0.65).
46
47
  const DEFAULT_RERANKER_MODEL = "jina-reranker-v2-base-multilingual";
47
48
  const DEFAULT_RERANKER_TOP_N = 5;
49
+ // 3.2.4 — payload-trimming defaults. Empirically calibrated on jerome's
50
+ // Jina dashboard (2026-05-17 → 2026-05-24): 20 candidates × 2000 chars
51
+ // totals ~10K tokens, i.e. ~500 per candidate (well below jina-reranker-v2's
52
+ // 8K context window once the query is added), and keeps the top-precision band.
53
+ const DEFAULT_RERANKER_CANDIDATE_POOL_MAX = 20;
54
+ const DEFAULT_RERANKER_MAX_CHARS_PER_DOC = 2000;
48
55
  /**
49
56
  * Apply defaults and env substitution to the raw plugin config. A source is
50
57
  * enabled when its credentials are present, unless the user explicitly toggles
@@ -80,6 +87,8 @@ export function resolveConfig(cfg = {}) {
80
87
  lightragEnabled: cfg.lightragEnabled !== false && Boolean(lightragUrl),
81
88
  // Jina shared key (used by router and/or reranker)
82
89
  jinaApiKey,
90
+ // 3.2.4 — soft RPM budget. 0 disables the monitor entirely.
91
+ jinaRpmBudget: clampNonNegInt(jina.rpmBudget ?? DEFAULT_RPM_BUDGET),
83
92
  // Router — disabled by default, even with a Jina key present, so
84
93
  // operators must opt in explicitly. "heuristic" mode is the safest
85
94
  // entry point: zero cost, deterministic.
@@ -95,6 +104,19 @@ export function resolveConfig(cfg = {}) {
95
104
  pgvectorRerankerEnabled: reranker.enabled === true && Boolean(jinaApiKey),
96
105
  pgvectorRerankerModel: reranker.model ?? DEFAULT_RERANKER_MODEL,
97
106
  pgvectorRerankerTopN: reranker.topN ?? DEFAULT_RERANKER_TOP_N,
107
+ // 3.2.4 — payload-size guards. `null`/`undefined` user input falls
108
+ // back to the production-tuned defaults; an explicit `0` disables
109
+ // the corresponding cap (legacy v3.2.3 behavior).
110
+ pgvectorRerankerCandidatePoolMax: clampNonNegInt(reranker.candidatePoolMax ?? DEFAULT_RERANKER_CANDIDATE_POOL_MAX),
111
+ pgvectorRerankerMaxCharsPerDoc: clampNonNegInt(reranker.maxCharsPerDoc ?? DEFAULT_RERANKER_MAX_CHARS_PER_DOC),
98
112
  };
99
113
  }
114
+ /** Clamp a value to a non-negative integer. Bad input collapses to `0`. */
115
+ function clampNonNegInt(value) {
116
+ if (!Number.isFinite(value))
117
+ return 0;
118
+ if (value < 0)
119
+ return 0;
120
+ return Math.floor(value);
121
+ }
100
122
  //# sourceMappingURL=config.js.map
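To make the clamping rules concrete, the sketch below copies `clampNonNegInt` from the hunk above and pairs it with the `value ?? default` pattern used in `resolveConfig`. The `resolve` wrapper and the constant name are illustrative only.

```javascript
// Copy of the helper added in this diff, so the sketch runs on its own.
function clampNonNegInt(value) {
  if (!Number.isFinite(value)) return 0;
  if (value < 0) return 0;
  return Math.floor(value);
}

const DEFAULT_CANDIDATE_POOL_MAX = 20; // same value as DEFAULT_RERANKER_CANDIDATE_POOL_MAX

// Mirrors the `clampNonNegInt(user ?? default)` pattern used in resolveConfig.
const resolve = (userValue) => clampNonNegInt(userValue ?? DEFAULT_CANDIDATE_POOL_MAX);

console.log(resolve(undefined)); // 20 (default applies)
console.log(resolve(null)); // 20 (default applies)
console.log(resolve(0)); // 0  (explicit 0 disables the cap)
console.log(resolve(12.9)); // 12 (floored)
console.log(resolve(-3)); // 0  (negative collapses to 0)
console.log(resolve(NaN)); // 0  (non-finite collapses to 0)
console.log(resolve("15")); // 0  (Number.isFinite does not coerce strings)
```

One consequence worth keeping in mind: a negative or non-numeric value disables the guard rather than falling back to the default, since only `null` and `undefined` trigger the `??` fallback.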
@@ -1 +1 @@
1
- {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA,gCAAgC;AAChC,EAAE;AACF,2EAA2E;AAC3E,6DAA6D;AAG7D,OAAO,EAAE,sBAAsB,EAAE,MAAM,mBAAmB,CAAC;AAU3D;;;;;GAKG;AACH,MAAM,UAAU,UAAU,CAAI,KAAQ;IACpC,IAAI,OAAO,KAAK,KAAK,QAAQ;QAAE,OAAO,KAAK,CAAC;IAC5C,OAAO,KAAK,CAAC,OAAO,CAAC,cAAc,EAAE,CAAC,CAAC,EAAE,IAAY,EAAE,EAAE;QACvD,OAAO,OAAO,CAAC,GAAG,CAAC,IAAI,CAAC,IAAI,EAAE,CAAC;IACjC,CAAC,CAAiB,CAAC;AACrB,CAAC;AAED,+EAA+E;AAC/E,SAAS,OAAO,CAAC,KAAa;IAC5B,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,KAAK,CAAC;AACf,CAAC;AAED,MAAM,oBAAoB,GAAG,kDAAkD,CAAC;AAChF,MAAM,mBAAmB,GAAG,CAAC,mBAAmB,CAAC,CAAC;AAClD,MAAM,aAAa,GAAG,CAAC,CAAC;AACxB,MAAM,uBAAuB,GAAG,GAAG,CAAC;AACpC,MAAM,wBAAwB,GAAG,IAAI,CAAC;AACtC,MAAM,qBAAqB,GAAsB,QAAQ,CAAC;AAC1D,MAAM,0BAA0B,GAAG,IAAI,CAAC;AAExC,MAAM,mBAAmB,GAAoC,WAAW,CAAC;AACzE,kEAAkE;AAClE,sEAAsE;AACtE,gEAAgE;AAChE,EAAE;AACF,wEAAwE;AACxE,uEAAuE;AACvE,kEAAkE;AAClE,kEAAkE;AAClE,kDAAkD;AAClD,MAAM,sBAAsB,GAAkB,oCAAoC,CAAC;AACnF,MAAM,sBAAsB,GAAG,CAAC,CAAC;AAEjC;;;;;;;;GAQG;AACH,MAAM,UAAU,aAAa,CAC3B,MAA6B,EAAE;IAE/B,MAAM,YAAY,GAAG,UAAU,CAAC,GAAG,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IACxD,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,oBAAoB,CAAC,CAAC;IACxE,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,EAAE,CAAC,CAAC;IACtD,MAAM,cAAc,GAAG,UAAU,CAAC,GAAG,CAAC,cAAc,IAAI,EAAE,CAAC,CAAC;IAE5D,MAAM,IAAI,GAAG,CAAC,GAAG,CAAC,IAAI,IAAI,EAAE,CAAqB,CAAC;IAClD,MAAM,MAAM,GAAG,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAuB,CAAC;IACzD,MAAM,QAAQ,GAAG,CAAC,IAAI,CAAC,gBAAgB,IAAI,EAAE,CAAiC,CAAC;IAC/E,MAAM,UAAU,GAAG,UAAU,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAC,CAAC;IACjD,MAAM,kBAAkB,GAAG,UAAU,CAAC,MAAM,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IAEjE,OAAO;QACL,OAAO,EAAE,GAAG,CAAC,OAAO,KAAK,KAAK;QAC9B,YAAY;QACZ,WAAW;QACX,WAAW,EAAE,GAAG,CAAC,WAAW,IAAI,mBAAmB;QACnD,IAAI,EAAE,GAAG,CAAC,IAAI,IAAI,aAAa;QAC/B,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,uBAAuB;QAC7D,cAAc,EAAE,GAA
G,CAAC,cAAc,IAAI,wBAAwB;QAC9D,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,YAAY,CAAC;QACvE,WAAW;QACX,cAAc;QACd,iBAAiB,EAAE,GAAG,CAAC,iBAAiB,IAAI,qBAAqB;QACjE,gBAAgB,EAAE,GAAG,CAAC,gBAAgB,IAAI,0BAA0B;QACpE,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,WAAW,CAAC;QAEtE,mDAAmD;QACnD,UAAU;QAEV,iEAAiE;QACjE,mEAAmE;QACnE,yCAAyC;QACzC,aAAa,EAAE,MAAM,CAAC,OAAO,KAAK,IAAI;QACtC,UAAU,EAAE,MAAM,CAAC,IAAI,IAAI,mBAAmB;QAC9C,kBAAkB;QAClB,iEAAiE;QACjE,+DAA+D;QAC/D,mBAAmB,EAAE,OAAO,CAC1B,MAAM,CAAC,aAAa,IAAI,sBAAsB,CAC/C;QAED,oEAAoE;QACpE,qEAAqE;QACrE,oCAAoC;QACpC,uBAAuB,EAAE,QAAQ,CAAC,OAAO,KAAK,IAAI,IAAI,OAAO,CAAC,UAAU,CAAC;QACzE,qBAAqB,EAAE,QAAQ,CAAC,KAAK,IAAI,sBAAsB;QAC/D,oBAAoB,EAAE,QAAQ,CAAC,IAAI,IAAI,sBAAsB;KAC9D,CAAC;AACJ,CAAC"}
1
+ {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA,gCAAgC;AAChC,EAAE;AACF,2EAA2E;AAC3E,6DAA6D;AAE7D,OAAO,EAAE,kBAAkB,EAAE,MAAM,sBAAsB,CAAC;AAE1D,OAAO,EAAE,sBAAsB,EAAE,MAAM,mBAAmB,CAAC;AAU3D;;;;;GAKG;AACH,MAAM,UAAU,UAAU,CAAI,KAAQ;IACpC,IAAI,OAAO,KAAK,KAAK,QAAQ;QAAE,OAAO,KAAK,CAAC;IAC5C,OAAO,KAAK,CAAC,OAAO,CAAC,cAAc,EAAE,CAAC,CAAC,EAAE,IAAY,EAAE,EAAE;QACvD,OAAO,OAAO,CAAC,GAAG,CAAC,IAAI,CAAC,IAAI,EAAE,CAAC;IACjC,CAAC,CAAiB,CAAC;AACrB,CAAC;AAED,+EAA+E;AAC/E,SAAS,OAAO,CAAC,KAAa;IAC5B,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,KAAK,CAAC;AACf,CAAC;AAED,MAAM,oBAAoB,GAAG,kDAAkD,CAAC;AAChF,MAAM,mBAAmB,GAAG,CAAC,mBAAmB,CAAC,CAAC;AAClD,MAAM,aAAa,GAAG,CAAC,CAAC;AACxB,MAAM,uBAAuB,GAAG,GAAG,CAAC;AACpC,MAAM,wBAAwB,GAAG,IAAI,CAAC;AACtC,MAAM,qBAAqB,GAAsB,QAAQ,CAAC;AAC1D,MAAM,0BAA0B,GAAG,IAAI,CAAC;AAExC,MAAM,mBAAmB,GAAoC,WAAW,CAAC;AACzE,kEAAkE;AAClE,sEAAsE;AACtE,gEAAgE;AAChE,EAAE;AACF,wEAAwE;AACxE,uEAAuE;AACvE,kEAAkE;AAClE,kEAAkE;AAClE,kDAAkD;AAClD,MAAM,sBAAsB,GAAkB,oCAAoC,CAAC;AACnF,MAAM,sBAAsB,GAAG,CAAC,CAAC;AACjC,wEAAwE;AACxE,uEAAuE;AACvE,uEAAuE;AACvE,oEAAoE;AACpE,MAAM,mCAAmC,GAAG,EAAE,CAAC;AAC/C,MAAM,kCAAkC,GAAG,IAAI,CAAC;AAEhD;;;;;;;;GAQG;AACH,MAAM,UAAU,aAAa,CAC3B,MAA6B,EAAE;IAE/B,MAAM,YAAY,GAAG,UAAU,CAAC,GAAG,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IACxD,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,oBAAoB,CAAC,CAAC;IACxE,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,EAAE,CAAC,CAAC;IACtD,MAAM,cAAc,GAAG,UAAU,CAAC,GAAG,CAAC,cAAc,IAAI,EAAE,CAAC,CAAC;IAE5D,MAAM,IAAI,GAAG,CAAC,GAAG,CAAC,IAAI,IAAI,EAAE,CAAqB,CAAC;IAClD,MAAM,MAAM,GAAG,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAuB,CAAC;IACzD,MAAM,QAAQ,GAAG,CAAC,IAAI,CAAC,gBAAgB,IAAI,EAAE,CAAiC,CAAC;IAC/E,MAAM,UAAU,GAAG,UAAU,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAC,CAAC;IACjD,MAAM,kBAAkB,GAAG,UAAU,CAAC,MAAM,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IAEjE,OAAO;QACL,OAAO,EAAE,GAAG,CAAC,OAAO,KAAK,KAAK;QAC
9B,YAAY;QACZ,WAAW;QACX,WAAW,EAAE,GAAG,CAAC,WAAW,IAAI,mBAAmB;QACnD,IAAI,EAAE,GAAG,CAAC,IAAI,IAAI,aAAa;QAC/B,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,uBAAuB;QAC7D,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,wBAAwB;QAC9D,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,YAAY,CAAC;QACvE,WAAW;QACX,cAAc;QACd,iBAAiB,EAAE,GAAG,CAAC,iBAAiB,IAAI,qBAAqB;QACjE,gBAAgB,EAAE,GAAG,CAAC,gBAAgB,IAAI,0BAA0B;QACpE,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,WAAW,CAAC;QAEtE,mDAAmD;QACnD,UAAU;QACV,4DAA4D;QAC5D,aAAa,EAAE,cAAc,CAAC,IAAI,CAAC,SAAS,IAAI,kBAAkB,CAAC;QAEnE,iEAAiE;QACjE,mEAAmE;QACnE,yCAAyC;QACzC,aAAa,EAAE,MAAM,CAAC,OAAO,KAAK,IAAI;QACtC,UAAU,EAAE,MAAM,CAAC,IAAI,IAAI,mBAAmB;QAC9C,kBAAkB;QAClB,iEAAiE;QACjE,+DAA+D;QAC/D,mBAAmB,EAAE,OAAO,CAC1B,MAAM,CAAC,aAAa,IAAI,sBAAsB,CAC/C;QAED,oEAAoE;QACpE,qEAAqE;QACrE,oCAAoC;QACpC,uBAAuB,EAAE,QAAQ,CAAC,OAAO,KAAK,IAAI,IAAI,OAAO,CAAC,UAAU,CAAC;QACzE,qBAAqB,EAAE,QAAQ,CAAC,KAAK,IAAI,sBAAsB;QAC/D,oBAAoB,EAAE,QAAQ,CAAC,IAAI,IAAI,sBAAsB;QAC7D,mEAAmE;QACnE,kEAAkE;QAClE,kDAAkD;QAClD,gCAAgC,EAAE,cAAc,CAC9C,QAAQ,CAAC,gBAAgB,IAAI,mCAAmC,CACjE;QACD,8BAA8B,EAAE,cAAc,CAC5C,QAAQ,CAAC,cAAc,IAAI,kCAAkC,CAC9D;KACF,CAAC;AACJ,CAAC;AAED,2EAA2E;AAC3E,SAAS,cAAc,CAAC,KAAa;IACnC,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,IAAI,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC;AAC3B,CAAC"}
package/dist/index.js CHANGED
@@ -28,7 +28,8 @@ import { searchCollection, formatPgvectorResults, rerankPgvectorResults, } from
28
28
  import { queryLightRAG, formatLightRAGResults } from "./lightrag.js";
29
29
  import { decideRoute } from "./router/index.js";
30
30
  import { JinaError, summarizeJinaError } from "./jina/errors.js";
31
- import { emitEvent, emitTurnMetadata } from "./tracing/events.js";
31
+ import { RpmMonitor } from "./jina/rate-limit.js";
32
+ import { emitEvent, emitTurnMetadata, LIGHTRAG_SPARSE_THRESHOLD_CHARS, } from "./tracing/events.js";
32
33
  // Re-export helpers so the test suite can import them directly without
33
34
  // duplicating imports from every submodule.
34
35
  export { resolveEnv, resolveConfig } from "./config.js";
@@ -58,6 +59,27 @@ export function createBeforePromptBuildHandler(deps) {
58
59
  router: newCooldown(),
59
60
  pgvector_reranker: newCooldown(),
60
61
  };
62
+ // Per-instance RPM monitor — one sliding window per plugin runtime.
63
+ // The `onExceeded` callback emits a structured event the FIRST time the
64
+ // budget is overshot in any given 60-second window, so dashboards alert
65
+ // BEFORE the operator sees billing surprises (especially relevant when
66
+ // the Jina key is shared with another service like Hindsight).
67
+ //
68
+ // When `config.jinaRpmBudget === 0`, the monitor is fully disabled
69
+ // (no instance constructed, no timestamps tracked, no callback ever
70
+ // fires). This matches the contract documented on
71
+ // `JinaPluginConfig.rpmBudget`. Defense-in-depth: even if a caller
72
+ // bypasses this gate and constructs `RpmMonitor` with budget=0
73
+ // directly, the `record()` method itself short-circuits to a no-op.
74
+ const rpmMonitor = config.jinaRpmBudget > 0
75
+ ? new RpmMonitor({
76
+ budget: config.jinaRpmBudget,
77
+ onExceeded: ({ count, budget }) => {
78
+ logger.warn(`openclaw-knowledge: Jina RPM budget exceeded — ${count}/${budget} requests in the last 60s`);
79
+ emitEvent(logger, { type: "jina_rpm_exceeded", count, budget });
80
+ },
81
+ })
82
+ : undefined;
61
83
  return async function beforePromptBuild(event, ctx) {
62
84
  if (!config.enabled)
63
85
  return undefined;
@@ -73,7 +95,7 @@ export function createBeforePromptBuildHandler(deps) {
73
95
  // -----------------------------------------------------------------
74
96
  // Router gate — decide which sources (if any) to consult.
75
97
  // -----------------------------------------------------------------
76
- const decision = await runRouterWithCooldown(config, ctx, query, cooldowns.router, logger);
98
+ const decision = await runRouterWithCooldown(config, ctx, query, cooldowns.router, logger, rpmMonitor);
77
99
  // Project the abstract router decision onto the sources actually
78
100
  // configured in this deployment. Without this projection, an
79
101
  // exclusive route (e.g. LIGHTRAG_ONLY) on a single-source deployment
@@ -98,7 +120,7 @@ export function createBeforePromptBuildHandler(deps) {
98
120
  if (shouldUsePgvector(effectiveRoute) &&
99
121
  config.pgvectorEnabled &&
100
122
  pool) {
101
- tasks.push(runPgvectorSource(pool, query, config, cooldowns.pgvector_reranker, logger));
123
+ tasks.push(runPgvectorSource(pool, query, config, cooldowns.pgvector_reranker, logger, rpmMonitor));
102
124
  }
103
125
  if (shouldUseLightRAG(effectiveRoute) && config.lightragEnabled) {
104
126
  tasks.push(runLightRAGSource(query, config));
@@ -195,7 +217,7 @@ export function projectRouteOnEnabledSources(route, pgvectorEnabled, lightragEna
195
217
  * meant to suppress repeated log spam during a sustained outage, not to
196
218
  * stop retrieval.
197
219
  */
198
- async function runRouterWithCooldown(config, ctx, query, cooldown, logger) {
220
+ async function runRouterWithCooldown(config, ctx, query, cooldown, logger, rpmMonitor) {
199
221
  // Reset stale cooldown FIRST so we don't keep the classifier circuit
200
222
  // open longer than necessary (the first turn after expiry must be
201
223
  // able to attempt the classifier again).
@@ -217,6 +239,16 @@ async function runRouterWithCooldown(config, ctx, query, cooldown, logger) {
217
239
  jinaApiKey: config.jinaApiKey,
218
240
  classifierId: config.routerClassifierId || undefined,
219
241
  minConfidence: config.routerMinConfidence,
242
+ onClassifierUsage: (usage) => emitEvent(logger, {
243
+ type: "jina",
244
+ endpoint: "classify",
245
+ model: usage.model,
246
+ durationMs: usage.durationMs,
247
+ // 1 query item per call. Few-shot adds no labels in the
248
+ // body, so inputCount = 1 covers both paths.
249
+ inputCount: 1,
250
+ }),
251
+ rpmMonitor,
220
252
  }, {
221
253
  query,
222
254
  trigger: ctx?.trigger,
@@ -431,11 +463,32 @@ export function extractQueryFromMessages(messages) {
  }
  return "";
  }
- async function runPgvectorSource(pool, query, config, rerankerCooldown, logger) {
+ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger, rpmMonitor) {
  const startedAt = Date.now();
  const vector = await embedQuery(query, config.geminiApiKey);
- const searches = config.collections.map((col) => searchCollection(pool, col, vector, config.topK, config.scoreThreshold));
- const allResults = (await Promise.all(searches)).flat();
+ // Use `Promise.allSettled` so a single failing collection (transient DB
+ // hiccup, bad schema on one shard, etc.) does NOT erase the results
+ // from the others. `errored` is set when ANY settle is rejected so
+ // the downstream event can flag the partial failure.
+ const settled = await Promise.allSettled(config.collections.map((col) => searchCollection(pool, col, vector, config.topK, config.scoreThreshold)));
+ const allResults = [];
+ let errored = false;
+ for (let i = 0; i < settled.length; i++) {
+ const r = settled[i];
+ if (r.status === "fulfilled") {
+ allResults.push(...r.value);
+ }
+ else {
+ errored = true;
+ // SECURITY: never log r.reason directly. pg errors can include
+ // the offending SQL parameter values (the embedding vector and,
+ // historically, the query text in older driver versions). We log
+ // the constructor name only — sufficient to triage without
+ // risking PHI / query leakage.
+ const reasonClass = r.reason?.constructor?.name ?? "Error";
+ logger.error(`openclaw-knowledge: pgvector collection "${config.collections[i]}" failed — ${reasonClass}`);
+ }
+ }
  allResults.sort((a, b) => b.score - a.score);
  // Capture the recall size BEFORE the reranker runs. This is the
  // number that monitors "how many candidates did pgvector find?"
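The `Promise.allSettled` change in the hunk above can be read in isolation as the minimal sketch below. `mergeSettled` and its `log` callback are hypothetical names introduced for illustration and are not part of the package; only the fold over settled results and the class-name-only logging mirror the diff.

```javascript
// Hypothetical helper (not in the package): folds the output of
// Promise.allSettled into merged rows plus an `errored` flag.
function mergeSettled(settled, names, log) {
  const rows = [];
  let errored = false;
  settled.forEach((r, i) => {
    if (r.status === "fulfilled") {
      rows.push(...r.value);
    } else {
      errored = true;
      // Log the error class only; the reason itself may carry
      // parameter values that should not reach the logs.
      const reasonClass = r.reason?.constructor?.name ?? "Error";
      log(`collection "${names[i]}" failed: ${reasonClass}`);
    }
  });
  // Highest score first, as in the diff.
  rows.sort((a, b) => b.score - a.score);
  return { rows, errored };
}
```

A caller would pass `await Promise.allSettled(searches)` as `settled` and the collection names in the same order, so that a rejected entry can be attributed to its collection by index.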
@@ -461,6 +514,7 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
  rawCount,
  reranked: false,
  durationMs: Date.now() - startedAt,
+ errored,
  };
  }
  try {
@@ -469,6 +523,16 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
  query,
  model: config.pgvectorRerankerModel,
  topN: config.pgvectorRerankerTopN,
+ candidatePoolMax: config.pgvectorRerankerCandidatePoolMax || undefined,
+ maxCharsPerDoc: config.pgvectorRerankerMaxCharsPerDoc || undefined,
+ rpmMonitor,
+ onUsage: (usage) => emitEvent(logger, {
+ type: "jina",
+ endpoint: "rerank",
+ model: config.pgvectorRerankerModel,
+ durationMs: usage.durationMs,
+ inputCount: usage.inputCount,
+ }),
  });
  rerankerCooldown.consecutiveErrors = 0;
  return {
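The hunk above forwards two new options, `candidatePoolMax` and `maxCharsPerDoc`, to the reranker; the changelog describes both as pure pre-filters on the input. The reranker's internals are not part of this diff, so the following is only a sketch of what such a pre-filter could look like. `applyPayloadGuards` and the `text` field on each candidate are assumptions made for illustration.

```javascript
// Hypothetical pre-filter (not the package's implementation).
// Assumes `candidates` is already sorted by cosine score, best first,
// and that each candidate carries its chunk in a `text` field.
function applyPayloadGuards(candidates, { candidatePoolMax, maxCharsPerDoc } = {}) {
  // An undefined or 0 cap leaves the pool untouched, matching the
  // `|| undefined` normalisation at the call site.
  const pool = candidatePoolMax > 0
    ? candidates.slice(0, candidatePoolMax)
    : candidates;
  // Truncate long texts without mutating the caller's objects.
  return pool.map((c) =>
    maxCharsPerDoc > 0 && c.text.length > maxCharsPerDoc
      ? { ...c, text: c.text.slice(0, maxCharsPerDoc) }
      : c
  );
}
```

Because the call site maps `0` to `undefined`, a configured `0` ends up disabling the corresponding guard, which is consistent with the behaviour the changelog documents for both options.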
@@ -477,6 +541,7 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
  rawCount,
  reranked: true,
  durationMs: Date.now() - startedAt,
+ errored,
  };
  }
  catch (err) {
@@ -499,6 +564,7 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
  rawCount,
  reranked: false,
  durationMs: Date.now() - startedAt,
+ errored,
  };
  }
  }
@@ -510,11 +576,13 @@ async function runLightRAGSource(query, config) {
  function renderSection(result, config, logger) {
  if (result.source === "pgvector") {
  const formatted = formatPgvectorResults(result.data, config.maxInjectChars);
- if (!formatted)
- return null;
  const topScore = result.data[0]?.score?.toFixed(2) ?? "n/a";
  const rerankNote = result.reranked ? " [reranked]" : "";
- logger.info(`openclaw-knowledge: pgvector${result.data.length} result(s)${rerankNote} (top: ${topScore})`);
+ // Emit the event UNCONDITIONALLY even when pgvector returned no
+ // result above threshold. The previous behavior (silent on empty)
+ // made it impossible to distinguish "pgvector ran and matched
+ // nothing" from "pgvector was never called". Operators need the
+ // former to monitor recall and trigger ingestion when warranted.
  emitEvent(logger, {
  type: "pgvector",
  collections: config.collections,
@@ -523,25 +591,43 @@ function renderSection(result, config, logger) {
  // final size that reaches the LLM (or `null` when the reranker
  // is inactive). This split lets operators monitor recall vs.
  // pruning independently.
- rawCount: result.rawCount,
+ //
+ // When `errored` is set, `rawCount` is reported as `null` rather
+ // than `0` so dashboards do not conflate a partial SQL failure
+ // with a clean 0-hit query. See the `runPgvectorSource` comment
+ // about `Promise.allSettled` for the source of the flag.
+ rawCount: result.errored ? null : result.rawCount,
  rerankedCount: result.reranked ? result.data.length : null,
  topScore: result.data[0]?.score ?? null,
  durationMs: result.durationMs,
+ errored: result.errored,
  });
+ if (!formatted) {
+ logger.info(`openclaw-knowledge: pgvector — no result above threshold (rawCount=${result.rawCount})`);
+ return null;
+ }
+ logger.info(`openclaw-knowledge: pgvector — ${result.data.length} result(s)${rerankNote} (top: ${topScore})`);
  return "### Document Search Results (pgvector)\n" + formatted;
  }
  if (result.source === "lightrag") {
  const formatted = formatLightRAGResults(result.data, config.lightragMaxChars);
- if (!formatted)
- return null;
- logger.info(`openclaw-knowledge: LightRAG ${formatted.truncated.length}/${formatted.originalLength} chars (truncated from ${formatted.originalLength})`);
+ // Emit the event UNCONDITIONALLY too — sparse responses are the
+ // single most useful signal for diagnosing KG coverage gaps.
+ const truncatedLen = formatted?.truncated.length ?? 0;
+ const originalLen = formatted?.originalLength ?? result.data.length;
  emitEvent(logger, {
  type: "lightrag",
  mode: config.lightragQueryMode,
- contextChars: formatted.originalLength,
- truncatedChars: formatted.truncated.length,
+ contextChars: originalLen,
+ truncatedChars: truncatedLen,
  durationMs: result.durationMs,
+ sparse: truncatedLen < LIGHTRAG_SPARSE_THRESHOLD_CHARS,
  });
+ if (!formatted) {
+ logger.info(`openclaw-knowledge: LightRAG — empty response (${originalLen} chars)`);
+ return null;
+ }
+ logger.info(`openclaw-knowledge: LightRAG — ${formatted.truncated.length}/${formatted.originalLength} chars (truncated from ${formatted.originalLength})`);
  return "### Knowledge Graph Context (LightRAG)\n" + formatted.truncated;
  }
  return null;