@lacneu/openclaw-knowledge 3.2.3 → 3.2.5

package/CHANGELOG.md CHANGED
@@ -7,6 +7,243 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
+ ### Added — provenance reporting (provenance/v1, opt-in)
11
+
12
+ New `provenanceReport` config (`"off"` default | `"metadata"` | `"full"`).
13
+ When enabled, every injected knowledge section also emits a structured
14
+ provenance report on the gateway agent-event bus (stream
15
+ `openclaw-knowledge.provenance`) so a chat frontend (openclaw-webchat) can
16
+ show the user **which documents fed this reply** — pgvector items carry
17
+ `file_name`/`collection`/`score` (and the exact injected excerpt at `full`),
18
+ LightRAG carries the mode and the injected context excerpt. The report mirrors
19
+ EXACTLY what reached the LLM (post-rerank, post-truncation), never the raw
20
+ retrieval. Normative contract: openclaw-webchat `docs/PROVENANCE_CONTRACT.md`.
21
+
22
+ Robustness/privacy invariants: emission degrades to SILENCE on old SDKs
23
+ (`emitAgentEvent` feature-detected), on a missing `runId`, or on a gateway
24
+ rejection (logged, metadata only); reports never go through logs; the gateway
25
+ re-registration quirk is handled by emitting through the FIRST registration's
26
+ api (module singleton). Off-list config values normalize to `"off"`.
27
+
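As an illustration of the normalization rule above, the following TypeScript sketch maps any off-list value to `"off"`. The function name `resolveProvenanceLevelSketch` is hypothetical; the shipped `resolveProvenanceLevel` in `provenance.js` is not shown in this diff and may differ.

```typescript
type ProvenanceLevel = "off" | "metadata" | "full";

// Hypothetical stand-in for the plugin's resolver: only the two opt-in
// levels are accepted, everything else (typos, wrong types, undefined)
// collapses to "off" so a misconfiguration cannot enable reporting.
function resolveProvenanceLevelSketch(raw: unknown): ProvenanceLevel {
  return raw === "metadata" || raw === "full" ? raw : "off";
}

console.log(resolveProvenanceLevelSketch("full")); // "full"
console.log(resolveProvenanceLevelSketch("Full")); // "off" (case-sensitive)
console.log(resolveProvenanceLevelSketch(undefined)); // "off"
```

Failing closed keeps the privacy default intact: an operator has to spell a supported level exactly to turn reporting on.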
28
+ ### Fixed — pgvector provenance now reports ONLY entries actually injected
29
+
30
+ `buildPgvectorProvenance` was previously fed the full post-rerank list
31
+ (`result.data`) instead of the post-`maxInjectChars` subset that actually
32
+ reaches the LLM. When the character budget truncated the list,
33
+ `formatPgvectorResults` `break`ed after the budget was exhausted but the
34
+ provenance report still exposed metadata (and excerpts in `full` mode) for
35
+ the dropped entries — leaking file names and content for documents the
36
+ LLM never saw, in violation of the PROVENANCE_CONTRACT rule "emit what was
37
+ injected, not what was retrieved".
38
+
39
+ A new internal helper `formatPgvectorResultsDetailed(results, maxChars):
40
+ { output, injectedCount } | null` carries the truncation logic and exposes
41
+ the count of entries that actually fit the budget; `renderSection` uses it
42
+ to slice `result.data` to the exact prefix injected and pass it to
43
+ `buildPgvectorProvenance`. The public `formatPgvectorResults` signature
44
+ remains `(results, maxChars) => string | null` (a thin wrapper over the
45
+ Detailed variant) — preserving backward compatibility across the v3.x
46
+ line. Both helpers return `null` when no entry fits the budget,
47
+ preserving the prior `if (!formatted)` skip-path. The LightRAG provenance
48
+ path was unaffected because it already used the post-truncation
49
+ `formatted.truncated` text directly.
50
+
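A rough TypeScript sketch of the fix described above, assuming a simple per-entry layout. The names `PgvectorHit`, `formatDetailedSketch`, `formatSketch` and `injectedSubset`, and the entry formatting itself, are illustrative assumptions; only the `{ output, injectedCount } | null` return shape and the prefix-slicing idea come from the changelog.

```typescript
// Hypothetical hit shape; the real plugin type may carry more fields.
interface PgvectorHit {
  file_name: string;
  collection: string;
  score: number;
  text: string;
}

interface DetailedFormat {
  output: string;
  injectedCount: number;
}

// Append entries until the next one would exceed the character budget,
// and remember how many entries were actually written.
function formatDetailedSketch(results: PgvectorHit[], maxChars: number): DetailedFormat | null {
  let output = "";
  let injectedCount = 0;
  for (const hit of results) {
    // Assumed entry layout, for illustration only.
    const entry = `[${hit.file_name}] (score ${hit.score.toFixed(2)})\n${hit.text}\n\n`;
    if (output.length + entry.length > maxChars) break;
    output += entry;
    injectedCount += 1;
  }
  return injectedCount === 0 ? null : { output, injectedCount };
}

// Thin wrapper that keeps the `string | null` signature.
function formatSketch(results: PgvectorHit[], maxChars: number): string | null {
  return formatDetailedSketch(results, maxChars)?.output ?? null;
}

// Provenance is built from the injected prefix only, never the full list.
function injectedSubset(results: PgvectorHit[], maxChars: number): PgvectorHit[] {
  const formatted = formatDetailedSketch(results, maxChars);
  return formatted ? results.slice(0, formatted.injectedCount) : [];
}
```

Because truncation stops at the first entry that does not fit, the injected entries always form a prefix of the reranked list, which is why a single count is enough to recover them.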
51
+ ### Fixed — provenance emission failures never leak payload to logs (codex pass #36 P2)
52
+
53
+ `emitProvenanceReports` previously interpolated the gateway's raw `reason`
54
+ string AND the caught `Error.message` into its `logger.warn` lines.
55
+ Either path could echo back content the emitter was rejecting — the
56
+ gateway commonly cites the offending field/value in its validation
57
+ reasons, and third-party libs routinely enrich `Error.message` with input
58
+ data. That broke the module-level invariant that report content NEVER
59
+ reaches logs.
60
+
61
+ The helper now classifies rejection reasons to a stable category code
62
+ (`plugin_not_loaded`, `missing_run_context`, `invalid_stream`,
63
+ `validation_error`, `rate_limited`, `rejected`) and, for thrown errors,
64
+ logs only `throw:<Error.name>` (constructor name only — never `.message`,
65
+ never the raw exception). Operators still get enough signal to triage;
66
+ no payload byte can reach the log stream.
67
+
68
+ ### Fixed — LightRAG provenance `injected.chars` matches what the LLM actually saw (codex pass #36 P3)
69
+
70
+ `buildLightRAGProvenance` was being called with `formatted.truncated`,
71
+ which is the BODY of the section. The block actually delivered to the
72
+ LLM also includes the `### Knowledge Graph Context (LightRAG)\n` header,
73
+ so the report's `injected.chars` field was systematically short by the
74
+ header length — contradicting the PROVENANCE_CONTRACT promise that the
75
+ field reflects exactly what reached the model.
76
+
77
+ `buildLightRAGProvenance` now takes an optional fourth parameter
78
+ `injectedChars` for callers that track the full section length
79
+ separately. The plugin's render path passes `text.length` (header
80
+ INCLUDED). The `full`-level excerpt body remains the post-truncation
81
+ context (the header carries no semantic content for the chat frontend).
82
+ Invalid `injectedChars` values (`NaN`, negative numbers) fall back to
83
+ `injectedText.length` to keep the report well-formed.
84
+
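The fallback rule above can be sketched in TypeScript as follows. `resolveInjectedChars` is a hypothetical helper written for illustration, not the plugin's actual function; the header string is the one quoted in the entry above.

```typescript
// Header text quoted from the changelog entry above.
const LIGHTRAG_HEADER = "### Knowledge Graph Context (LightRAG)\n";

// Hypothetical helper: choose the character count to report.
// A missing, NaN or negative override falls back to the body length.
function resolveInjectedChars(injectedText: string, injectedChars?: number): number {
  if (injectedChars === undefined || Number.isNaN(injectedChars) || injectedChars < 0) {
    return injectedText.length;
  }
  return injectedChars;
}

const body = "entity A relates to entity B";
const text = LIGHTRAG_HEADER + body;

// The render path passes the full section length, header included.
console.log(resolveInjectedChars(body, text.length) === text.length); // true
// An invalid override falls back to the body length.
console.log(resolveInjectedChars(body, Number.NaN) === body.length); // true
```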
85
+ ## [3.2.4] - 2026-05-24
86
+
87
+ ### Added — payload-size guards for the pgvector reranker
88
+
89
+ Production observation on jerome's Jina dashboard (2026-05-17 →
90
+ 2026-05-24): rerank calls averaged **~66 600 tokens** each — way over
91
+ the model's 8 K context window. Almost all of that came from LightRAG-
92
+ side reranker chunking, but the plugin-side reranker would face the
93
+ same risk once `knowledge_jerome` is populated. v3.2.4 ships two
94
+ preventive knobs to keep the plugin-side spend bounded:
95
+
96
+ - **`jina.pgvectorReranker.candidatePoolMax`** (default `20`). Caps the
97
+ number of cosine-ranked candidates sent to Jina /v1/rerank. Pgvector
98
+ recall is typically 20-50 hits; only the top 10-15 are worth
99
+ reranking. Setting to `0` disables the cap (legacy v3.2.3 behavior).
100
+
101
+ - **`jina.pgvectorReranker.maxCharsPerDoc`** (default `2000`). Pre-
102
+ truncates each candidate text BEFORE submission. Long chunks
103
+ (transcripts, books) carry most of their relevance signal in the
104
+ first ~2000 chars; the tail wastes Jina tokens without adding signal.
105
+ Setting to `0` disables truncation.
106
+
107
+ Both guards are pure pre-filters — they do not alter the reranker's
108
+ output ordering, only the input size.
109
+
110
+ ### Added — `jina` usage events
111
+
112
+ The `JinaUsageEvent` shape (already exported since v3.2.0) is now
113
+ actually emitted on every successful Jina API call:
114
+
115
+ - **`/v1/classify`** — emitted from `decideRoute` via the new
116
+ `RouterConfig.onClassifierUsage` callback. `inputCount: 1` (one
117
+ query per call), `model: "zero-shot" | "few-shot"`.
118
+ - **`/v1/rerank`** — emitted from `rerankPgvectorResults` via the new
119
+ `RerankPgvectorParams.onUsage` callback. `inputCount: N` (post-trim
120
+ document count).
121
+
122
+ Dashboards can now graph Jina consumption per turn AND per endpoint without
123
+ re-deriving from log timestamps.
124
+
125
+ ### Added — soft RPM monitor
126
+
127
+ A lightweight sliding-window counter (`src/jina/rate-limit.ts`)
128
+ observes outbound Jina request rate across the whole plugin. When the
129
+ configured `jina.rpmBudget` is exceeded within any 60-second window,
130
+ the plugin:
131
+
132
+ - emits a single `{type: "jina_rpm_exceeded", count, budget}` event
133
+ for that window,
134
+ - logs a warning naming the budget and the observed count.
135
+
136
+ **The call is never blocked.** The existing 429 cooldown breaker
137
+ remains the hard backstop; the monitor only adds visibility so the
138
+ operator can alert BEFORE billing surprises hit — especially relevant
139
+ when the API key is shared with another service (e.g. Hindsight).
140
+ Default budget: `60` RPM (well below the Jina free-tier 100 RPM
141
+ ceiling). `0` disables the monitor.
142
+
143
+ ### LightRAG companion changes (operator action)
144
+
145
+ A non-trivial share of Jina-token saving sits in the LightRAG `.env`
146
+ config, NOT the plugin. The companion
147
+ `openclaw-notes/lightrag/.env.template` is updated in the same patch
148
+ to recommend:
149
+
150
+ - `RERANK_MAX_TOKENS_PER_DOC=600` (was `480`): each candidate splits
151
+ into 2 sub-rerank calls instead of 3 (1200/600 = 2 vs. 1200/480 = 2.5, rounded up to 3).
152
+ Cuts the per-rerank Jina spend by **~33%** at no quality cost
153
+ (sub-chunks remain complete; payload fits cleanly in v2's 8K
154
+ context window with no server-side truncation).
155
+ - `MIN_RERANK_SCORE=0.05` (was `0.0`): drops pure-noise reranked
156
+ chunks. Cleaner injected context, ~10% fewer chars on the
157
+ downstream LLM prompt.
158
+ - `RERANK_ENABLE_CHUNKING` stays `true` — disabling it would push
159
+ the cumulative payload (10 docs × 1200 tokens ≈ 12 K) over the
160
+ 8 K v2-multilingual context window, triggering silent server-side
161
+ truncation (Jina's `truncate:true` policy) and lossy ranking. See
162
+ the `.env.template` block for the full operating-point comparison
163
+ (chunking-ON vs TOP_K=5 vs jina-reranker-v3).
164
+
165
+ Apply on the NAS:
166
+
167
+ ```bash
168
+ sudo vi /volume3/openclaw/lightrag/.env.jerome
169
+ # Change: RERANK_MAX_TOKENS_PER_DOC=600
170
+ # Change: MIN_RERANK_SCORE=0.05
171
+ sudo docker restart openclaw-lightrag-jerome
172
+ ```
173
+
174
+ Expected saving on jerome's observed rate: ~8.4 M → ~5.6 M tokens
175
+ per 7 days (-33%), no quality regression.
176
+
177
+ ### Migration
178
+
179
+ Drop-in plugin patch. Defaults preserve the v3.2.3 behavior on
180
+ runtimes that don't set the new fields. To activate the new caps on
181
+ both instances:
182
+
183
+ ```bash
184
+ sudo docker exec openclaw-jerome openclaw plugins update @lacneu/openclaw-knowledge
185
+ sudo docker exec openclaw-olivier openclaw plugins update @lacneu/openclaw-knowledge
186
+ sudo docker restart openclaw-jerome openclaw-olivier
187
+ ```
188
+
189
+ Verify post-restart:
190
+ - `[knowledge.event] {"type":"jina","endpoint":"classify",...}` lines
191
+ appear on every classifier call.
192
+ - `[knowledge.event] {"type":"jina","endpoint":"rerank","inputCount":N,...}`
193
+ lines appear on every rerank call. `inputCount` is now bounded by
194
+ `candidatePoolMax`.
195
+ - `[knowledge.event] {"type":"jina_rpm_exceeded",...}` appears ONLY
196
+ during real overshoots (typically silent on single-user workloads).
197
+
198
+ ### Codex pass #33 correction (P2)
199
+
200
+ `rpmBudget: 0` is documented as "disables the monitor entirely" on
201
+ both `JinaPluginConfig.rpmBudget` and the `openclaw.plugin.json`
202
+ schema. The initial implementation contradicted that promise: the
203
+ overshoot check `count > this.budget` evaluated to `true` on the very
204
+ first record (`1 > 0`), firing a spurious `jina_rpm_exceeded` event
205
+ on every plugin turn.
206
+
207
+ Fixed with defense-in-depth:
208
+
209
+ - **`RpmMonitor.record()` / `RpmMonitor.peek()`** short-circuit to
210
+ `0` when `budget <= 0`. No timestamp tracked, no `onExceeded`
211
+ callback ever fires. Negative budgets get the same treatment.
212
+ - **`src/index.ts`** skips constructing the monitor entirely when
213
+ `config.jinaRpmBudget === 0` — `rpmMonitor` becomes `undefined`
214
+ and every `rpmMonitor?.record()` callsite becomes a free no-op.
215
+
216
+ ### Codex pass #34 correction (P3)
217
+
218
+ `RpmMonitor.lastExceededNotice` was initialized to `0`. The dedup
219
+ check `t - lastExceededNotice >= 60_000` was unsatisfiable until
220
+ simulated time reached 60s when a test clock started at `now=0` — a
221
+ documented use case for `RpmMonitorOptions.now`. The first alert was
222
+ silently suppressed during the first minute of any such test.
223
+
224
+ Initialized to `Number.NEGATIVE_INFINITY` so the first overshoot
225
+ fires regardless of clock origin. Production `Date.now()` was always
226
+ well above 60_000, so this is a test-correctness fix with no behavior
227
+ change in production.
228
+
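The monitor behaviour described in the 3.2.4 entries above (sliding 60-second window, one notice per window, `budget <= 0` disabling everything, and the `Number.NEGATIVE_INFINITY` initial value) can be sketched in TypeScript as below. This is a minimal reconstruction from the changelog text, not the contents of `src/jina/rate-limit.ts`; the class name `RpmMonitorSketch` and its internals are assumptions.

```typescript
interface RpmMonitorSketchOptions {
  budget: number;
  onExceeded?: (info: { count: number; budget: number }) => void;
  now?: () => number; // injectable clock in milliseconds, for tests
}

const WINDOW_MS = 60_000;

class RpmMonitorSketch {
  private readonly timestamps: number[] = [];
  // NEGATIVE_INFINITY lets the first overshoot fire even when the clock starts at 0.
  private lastExceededNotice = Number.NEGATIVE_INFINITY;
  private readonly budget: number;
  private readonly onExceeded: ((info: { count: number; budget: number }) => void) | undefined;
  private readonly now: () => number;

  constructor(options: RpmMonitorSketchOptions) {
    this.budget = options.budget;
    this.onExceeded = options.onExceeded;
    this.now = options.now ?? (() => Date.now());
  }

  /** Record one outbound request and return the count in the current window. */
  record(): number {
    if (this.budget <= 0) return 0; // disabled: nothing tracked, nothing fired
    const t = this.now();
    this.prune(t);
    this.timestamps.push(t);
    const count = this.timestamps.length;
    if (count > this.budget && t - this.lastExceededNotice >= WINDOW_MS) {
      this.lastExceededNotice = t;
      this.onExceeded?.({ count, budget: this.budget });
    }
    return count; // the call itself is never blocked
  }

  /** Return the current window count without recording a request. */
  peek(): number {
    if (this.budget <= 0) return 0;
    this.prune(this.now());
    return this.timestamps.length;
  }

  // Drop timestamps that are 60 seconds old or older.
  private prune(t: number): void {
    while (this.timestamps.length > 0 && t - (this.timestamps[0] ?? t) >= WINDOW_MS) {
      this.timestamps.shift();
    }
  }
}
```

With an initial value of `0` instead of `Number.NEGATIVE_INFINITY`, a test clock starting at `0` would compute `0 - 0 >= 60_000`, which is false, and the first notice would be suppressed; that is the Codex #34 case.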
229
+ ### Test coverage
230
+
231
+ - Total: 271 tests, all green (was 250 in 3.2.3; +21 new).
232
+ - New file `test/jina/rate-limit.test.ts` (12 tests) — sliding-window
233
+ count, 60s expiry, one-notice-per-window deduplication,
234
+ `DEFAULT_RPM_BUDGET` constant, **Codex #33 regression: budget=0
235
+ AND negative-budget disable the monitor completely**,
236
+ **Codex #34 regression: first overshoot fires even with a test
237
+ clock starting at `now=0`**.
238
+ - 5 new tests in `test/pgvector.test.ts` — `candidatePoolMax` cap,
239
+ `maxCharsPerDoc` truncation, `onUsage` callback fires on success,
240
+ `onUsage` NOT fired on failure, legacy `undefined` keeps no-trim
241
+ behavior.
242
+ - 4 new tests in `test/config.test.ts` — default values, override,
243
+ `[0, ∞)` clamping on the three new numeric fields.
244
+ - 1 fixture update in `test/tracing/events.test.ts` for the new
245
+ `JinaRpmExceededEvent` shape.
246
+
10
247
  ## [3.2.3] - 2026-05-23
11
248
 
12
249
  ### Added — observability for "ran and matched nothing" (P1)
package/dist/config.js CHANGED
@@ -2,6 +2,8 @@
2
2
  //
3
3
  // These helpers are the only place that touches `process.env`, keeping the
4
4
  // rest of the plugin easy to test with deterministic values.
5
+ import { DEFAULT_RPM_BUDGET } from "./jina/rate-limit.js";
6
+ import { resolveProvenanceLevel } from "./provenance.js";
5
7
  import { DEFAULT_MIN_CONFIDENCE } from "./router/index.js";
6
8
  /**
7
9
  * Expand `${VAR_NAME}` patterns in a config string against `process.env`.
@@ -45,6 +47,12 @@ const DEFAULT_ROUTER_MODE = "heuristic";
45
47
  // staying below typical hit scores (≈ 0.40-0.65).
46
48
  const DEFAULT_RERANKER_MODEL = "jina-reranker-v2-base-multilingual";
47
49
  const DEFAULT_RERANKER_TOP_N = 5;
50
+ // 3.2.4 — payload-trimming defaults. Empirically calibrated on jerome's
51
+ // Jina dashboard (2026-05-17 → 2026-05-24): 20 candidates × 2000 chars
52
+ // fits in ~10K tokens (well below jina-reranker-v2's 8K context window
53
+ // once the query is added) while preserving the top-precision band.
54
+ const DEFAULT_RERANKER_CANDIDATE_POOL_MAX = 20;
55
+ const DEFAULT_RERANKER_MAX_CHARS_PER_DOC = 2000;
48
56
  /**
49
57
  * Apply defaults and env substitution to the raw plugin config. A source is
50
58
  * enabled when its credentials are present, unless the user explicitly toggles
@@ -80,6 +88,8 @@ export function resolveConfig(cfg = {}) {
80
88
  lightragEnabled: cfg.lightragEnabled !== false && Boolean(lightragUrl),
81
89
  // Jina shared key (used by router and/or reranker)
82
90
  jinaApiKey,
91
+ // 3.2.4 — soft RPM budget. 0 disables the monitor entirely.
92
+ jinaRpmBudget: clampNonNegInt(jina.rpmBudget ?? DEFAULT_RPM_BUDGET),
83
93
  // Router — disabled by default, even with a Jina key present, so
84
94
  // operators must opt in explicitly. "heuristic" mode is the safest
85
95
  // entry point: zero cost, deterministic.
@@ -95,6 +105,23 @@ export function resolveConfig(cfg = {}) {
95
105
  pgvectorRerankerEnabled: reranker.enabled === true && Boolean(jinaApiKey),
96
106
  pgvectorRerankerModel: reranker.model ?? DEFAULT_RERANKER_MODEL,
97
107
  pgvectorRerankerTopN: reranker.topN ?? DEFAULT_RERANKER_TOP_N,
108
+ // 3.2.4 — payload-size guards. `null`/`undefined` user input falls
109
+ // back to the production-tuned defaults; an explicit `0` disables
110
+ // the corresponding cap (legacy v3.2.3 behavior).
111
+ pgvectorRerankerCandidatePoolMax: clampNonNegInt(reranker.candidatePoolMax ?? DEFAULT_RERANKER_CANDIDATE_POOL_MAX),
112
+ pgvectorRerankerMaxCharsPerDoc: clampNonNegInt(reranker.maxCharsPerDoc ?? DEFAULT_RERANKER_MAX_CHARS_PER_DOC),
113
+ // 3.3.0 — provenance reporting toward chat frontends. Off-list values
114
+ // (typos, future levels) normalize to "off": a misconfiguration must
115
+ // never silently leak content.
116
+ provenanceReport: resolveProvenanceLevel(cfg.provenanceReport),
98
117
  };
99
118
  }
119
+ /** Clamp a value to a non-negative integer. Bad input collapses to `0`. */
120
+ function clampNonNegInt(value) {
121
+ if (!Number.isFinite(value))
122
+ return 0;
123
+ if (value < 0)
124
+ return 0;
125
+ return Math.floor(value);
126
+ }
100
127
  //# sourceMappingURL=config.js.map
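The `??`-then-clamp pattern in the hunk above has a few edge cases worth spelling out. The TypeScript sketch below repeats `clampNonNegInt` as it appears in the diff and wraps it in a hypothetical `resolveCap` helper (not part of the package) to show which inputs keep the default and which disable the cap.

```typescript
// Same logic as the `clampNonNegInt` added in this diff.
function clampNonNegInt(value: number): number {
  if (!Number.isFinite(value)) return 0;
  if (value < 0) return 0;
  return Math.floor(value);
}

// Hypothetical helper mirroring `clampNonNegInt(user ?? DEFAULT)`.
function resolveCap(userValue: number | null | undefined, fallback: number): number {
  return clampNonNegInt(userValue ?? fallback);
}

console.log(resolveCap(undefined, 20)); // 20: unset falls back to the default
console.log(resolveCap(null, 20)); // 20: null also falls back
console.log(resolveCap(0, 20)); // 0: explicit zero disables the cap
console.log(resolveCap(12.7, 20)); // 12: fractions are floored
console.log(resolveCap(-5, 20)); // 0: negatives collapse to 0
console.log(resolveCap(Number.NaN, 20)); // 0: NaN is not nullish, so no fallback
```

Note the consequence of the last two lines: because `??` only catches `null` and `undefined`, a negative or `NaN` value resolves to `0`, which these settings treat as "disabled" rather than as the default.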
@@ -1 +1 @@
1
- {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA,gCAAgC;AAChC,EAAE;AACF,2EAA2E;AAC3E,6DAA6D;AAG7D,OAAO,EAAE,sBAAsB,EAAE,MAAM,mBAAmB,CAAC;AAU3D;;;;;GAKG;AACH,MAAM,UAAU,UAAU,CAAI,KAAQ;IACpC,IAAI,OAAO,KAAK,KAAK,QAAQ;QAAE,OAAO,KAAK,CAAC;IAC5C,OAAO,KAAK,CAAC,OAAO,CAAC,cAAc,EAAE,CAAC,CAAC,EAAE,IAAY,EAAE,EAAE;QACvD,OAAO,OAAO,CAAC,GAAG,CAAC,IAAI,CAAC,IAAI,EAAE,CAAC;IACjC,CAAC,CAAiB,CAAC;AACrB,CAAC;AAED,+EAA+E;AAC/E,SAAS,OAAO,CAAC,KAAa;IAC5B,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,KAAK,CAAC;AACf,CAAC;AAED,MAAM,oBAAoB,GAAG,kDAAkD,CAAC;AAChF,MAAM,mBAAmB,GAAG,CAAC,mBAAmB,CAAC,CAAC;AAClD,MAAM,aAAa,GAAG,CAAC,CAAC;AACxB,MAAM,uBAAuB,GAAG,GAAG,CAAC;AACpC,MAAM,wBAAwB,GAAG,IAAI,CAAC;AACtC,MAAM,qBAAqB,GAAsB,QAAQ,CAAC;AAC1D,MAAM,0BAA0B,GAAG,IAAI,CAAC;AAExC,MAAM,mBAAmB,GAAoC,WAAW,CAAC;AACzE,kEAAkE;AAClE,sEAAsE;AACtE,gEAAgE;AAChE,EAAE;AACF,wEAAwE;AACxE,uEAAuE;AACvE,kEAAkE;AAClE,kEAAkE;AAClE,kDAAkD;AAClD,MAAM,sBAAsB,GAAkB,oCAAoC,CAAC;AACnF,MAAM,sBAAsB,GAAG,CAAC,CAAC;AAEjC;;;;;;;;GAQG;AACH,MAAM,UAAU,aAAa,CAC3B,MAA6B,EAAE;IAE/B,MAAM,YAAY,GAAG,UAAU,CAAC,GAAG,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IACxD,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,oBAAoB,CAAC,CAAC;IACxE,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,EAAE,CAAC,CAAC;IACtD,MAAM,cAAc,GAAG,UAAU,CAAC,GAAG,CAAC,cAAc,IAAI,EAAE,CAAC,CAAC;IAE5D,MAAM,IAAI,GAAG,CAAC,GAAG,CAAC,IAAI,IAAI,EAAE,CAAqB,CAAC;IAClD,MAAM,MAAM,GAAG,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAuB,CAAC;IACzD,MAAM,QAAQ,GAAG,CAAC,IAAI,CAAC,gBAAgB,IAAI,EAAE,CAAiC,CAAC;IAC/E,MAAM,UAAU,GAAG,UAAU,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAC,CAAC;IACjD,MAAM,kBAAkB,GAAG,UAAU,CAAC,MAAM,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IAEjE,OAAO;QACL,OAAO,EAAE,GAAG,CAAC,OAAO,KAAK,KAAK;QAC9B,YAAY;QACZ,WAAW;QACX,WAAW,EAAE,GAAG,CAAC,WAAW,IAAI,mBAAmB;QACnD,IAAI,EAAE,GAAG,CAAC,IAAI,IAAI,aAAa;QAC/B,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,uBAAuB;QAC7D,cAAc,EAAE,GAA
G,CAAC,cAAc,IAAI,wBAAwB;QAC9D,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,YAAY,CAAC;QACvE,WAAW;QACX,cAAc;QACd,iBAAiB,EAAE,GAAG,CAAC,iBAAiB,IAAI,qBAAqB;QACjE,gBAAgB,EAAE,GAAG,CAAC,gBAAgB,IAAI,0BAA0B;QACpE,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,WAAW,CAAC;QAEtE,mDAAmD;QACnD,UAAU;QAEV,iEAAiE;QACjE,mEAAmE;QACnE,yCAAyC;QACzC,aAAa,EAAE,MAAM,CAAC,OAAO,KAAK,IAAI;QACtC,UAAU,EAAE,MAAM,CAAC,IAAI,IAAI,mBAAmB;QAC9C,kBAAkB;QAClB,iEAAiE;QACjE,+DAA+D;QAC/D,mBAAmB,EAAE,OAAO,CAC1B,MAAM,CAAC,aAAa,IAAI,sBAAsB,CAC/C;QAED,oEAAoE;QACpE,qEAAqE;QACrE,oCAAoC;QACpC,uBAAuB,EAAE,QAAQ,CAAC,OAAO,KAAK,IAAI,IAAI,OAAO,CAAC,UAAU,CAAC;QACzE,qBAAqB,EAAE,QAAQ,CAAC,KAAK,IAAI,sBAAsB;QAC/D,oBAAoB,EAAE,QAAQ,CAAC,IAAI,IAAI,sBAAsB;KAC9D,CAAC;AACJ,CAAC"}
1
+ {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA,gCAAgC;AAChC,EAAE;AACF,2EAA2E;AAC3E,6DAA6D;AAE7D,OAAO,EAAE,kBAAkB,EAAE,MAAM,sBAAsB,CAAC;AAE1D,OAAO,EAAE,sBAAsB,EAAE,MAAM,iBAAiB,CAAC;AACzD,OAAO,EAAE,sBAAsB,EAAE,MAAM,mBAAmB,CAAC;AAU3D;;;;;GAKG;AACH,MAAM,UAAU,UAAU,CAAI,KAAQ;IACpC,IAAI,OAAO,KAAK,KAAK,QAAQ;QAAE,OAAO,KAAK,CAAC;IAC5C,OAAO,KAAK,CAAC,OAAO,CAAC,cAAc,EAAE,CAAC,CAAC,EAAE,IAAY,EAAE,EAAE;QACvD,OAAO,OAAO,CAAC,GAAG,CAAC,IAAI,CAAC,IAAI,EAAE,CAAC;IACjC,CAAC,CAAiB,CAAC;AACrB,CAAC;AAED,+EAA+E;AAC/E,SAAS,OAAO,CAAC,KAAa;IAC5B,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,KAAK,CAAC;AACf,CAAC;AAED,MAAM,oBAAoB,GAAG,kDAAkD,CAAC;AAChF,MAAM,mBAAmB,GAAG,CAAC,mBAAmB,CAAC,CAAC;AAClD,MAAM,aAAa,GAAG,CAAC,CAAC;AACxB,MAAM,uBAAuB,GAAG,GAAG,CAAC;AACpC,MAAM,wBAAwB,GAAG,IAAI,CAAC;AACtC,MAAM,qBAAqB,GAAsB,QAAQ,CAAC;AAC1D,MAAM,0BAA0B,GAAG,IAAI,CAAC;AAExC,MAAM,mBAAmB,GAAoC,WAAW,CAAC;AACzE,kEAAkE;AAClE,sEAAsE;AACtE,gEAAgE;AAChE,EAAE;AACF,wEAAwE;AACxE,uEAAuE;AACvE,kEAAkE;AAClE,kEAAkE;AAClE,kDAAkD;AAClD,MAAM,sBAAsB,GAAkB,oCAAoC,CAAC;AACnF,MAAM,sBAAsB,GAAG,CAAC,CAAC;AACjC,wEAAwE;AACxE,uEAAuE;AACvE,uEAAuE;AACvE,oEAAoE;AACpE,MAAM,mCAAmC,GAAG,EAAE,CAAC;AAC/C,MAAM,kCAAkC,GAAG,IAAI,CAAC;AAEhD;;;;;;;;GAQG;AACH,MAAM,UAAU,aAAa,CAC3B,MAA6B,EAAE;IAE/B,MAAM,YAAY,GAAG,UAAU,CAAC,GAAG,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IACxD,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,oBAAoB,CAAC,CAAC;IACxE,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,EAAE,CAAC,CAAC;IACtD,MAAM,cAAc,GAAG,UAAU,CAAC,GAAG,CAAC,cAAc,IAAI,EAAE,CAAC,CAAC;IAE5D,MAAM,IAAI,GAAG,CAAC,GAAG,CAAC,IAAI,IAAI,EAAE,CAAqB,CAAC;IAClD,MAAM,MAAM,GAAG,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAuB,CAAC;IACzD,MAAM,QAAQ,GAAG,CAAC,IAAI,CAAC,gBAAgB,IAAI,EAAE,CAAiC,CAAC;IAC/E,MAAM,UAAU,GAAG,UAAU,CAAC,IAAI,CAAC,MAAM,IAAI,EAAE,CAAC,CAAC;IACjD,MAAM,kBAAkB,GAAG,UAAU,CAAC,MAAM,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IAEjE,OAA
O;QACL,OAAO,EAAE,GAAG,CAAC,OAAO,KAAK,KAAK;QAC9B,YAAY;QACZ,WAAW;QACX,WAAW,EAAE,GAAG,CAAC,WAAW,IAAI,mBAAmB;QACnD,IAAI,EAAE,GAAG,CAAC,IAAI,IAAI,aAAa;QAC/B,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,uBAAuB;QAC7D,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,wBAAwB;QAC9D,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,YAAY,CAAC;QACvE,WAAW;QACX,cAAc;QACd,iBAAiB,EAAE,GAAG,CAAC,iBAAiB,IAAI,qBAAqB;QACjE,gBAAgB,EAAE,GAAG,CAAC,gBAAgB,IAAI,0BAA0B;QACpE,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,WAAW,CAAC;QAEtE,mDAAmD;QACnD,UAAU;QACV,4DAA4D;QAC5D,aAAa,EAAE,cAAc,CAAC,IAAI,CAAC,SAAS,IAAI,kBAAkB,CAAC;QAEnE,iEAAiE;QACjE,mEAAmE;QACnE,yCAAyC;QACzC,aAAa,EAAE,MAAM,CAAC,OAAO,KAAK,IAAI;QACtC,UAAU,EAAE,MAAM,CAAC,IAAI,IAAI,mBAAmB;QAC9C,kBAAkB;QAClB,iEAAiE;QACjE,+DAA+D;QAC/D,mBAAmB,EAAE,OAAO,CAC1B,MAAM,CAAC,aAAa,IAAI,sBAAsB,CAC/C;QAED,oEAAoE;QACpE,qEAAqE;QACrE,oCAAoC;QACpC,uBAAuB,EAAE,QAAQ,CAAC,OAAO,KAAK,IAAI,IAAI,OAAO,CAAC,UAAU,CAAC;QACzE,qBAAqB,EAAE,QAAQ,CAAC,KAAK,IAAI,sBAAsB;QAC/D,oBAAoB,EAAE,QAAQ,CAAC,IAAI,IAAI,sBAAsB;QAC7D,mEAAmE;QACnE,kEAAkE;QAClE,kDAAkD;QAClD,gCAAgC,EAAE,cAAc,CAC9C,QAAQ,CAAC,gBAAgB,IAAI,mCAAmC,CACjE;QACD,8BAA8B,EAAE,cAAc,CAC5C,QAAQ,CAAC,cAAc,IAAI,kCAAkC,CAC9D;QAED,sEAAsE;QACtE,qEAAqE;QACrE,+BAA+B;QAC/B,gBAAgB,EAAE,sBAAsB,CAAC,GAAG,CAAC,gBAAgB,CAAC;KAC/D,CAAC;AACJ,CAAC;AAED,2EAA2E;AAC3E,SAAS,cAAc,CAAC,KAAa;IACnC,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,KAAK,CAAC;QAAE,OAAO,CAAC,CAAC;IACtC,IAAI,KAAK,GAAG,CAAC;QAAE,OAAO,CAAC,CAAC;IACxB,OAAO,IAAI,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC;AAC3B,CAAC"}
package/dist/index.d.ts CHANGED
@@ -1,4 +1,6 @@
1
+ import type { OpenClawPluginDefinition } from "openclaw/plugin-sdk/plugin-entry";
1
2
  import type { OpenClawPluginApi, PluginLogger } from "openclaw/plugin-sdk/plugin-entry";
3
+ import { type EmitAgentEventFn } from "./provenance.js";
2
4
  import type { Route } from "./router/types.js";
3
5
  import type { BeforePromptBuildEvent, BeforePromptBuildResult, PgPoolLike, PluginHookAgentContext, PromptMessage, ResolvedKnowledgeConfig } from "./types.js";
4
6
  export { resolveEnv, resolveConfig } from "./config.js";
@@ -11,6 +13,13 @@ interface HookHandlerDeps {
11
13
  config: ResolvedKnowledgeConfig;
12
14
  pool: PgPoolLike | null;
13
15
  logger: PluginLogger;
16
+ /**
17
+ * Gateway agent-event emitter for provenance reports (provenance/v1).
18
+ * `undefined` on SDKs that predate emitAgentEvent — the handler then
19
+ * degrades to silence. MUST be bound to the FIRST registration's api
20
+ * (gateway re-registration quirk; see registerKnowledgePlugin).
21
+ */
22
+ emitAgentEvent?: EmitAgentEventFn;
14
23
  }
15
24
  /**
16
25
  * Build the `before_prompt_build` handler bound to a specific plugin state.
@@ -72,17 +81,6 @@ export declare function extractUserQuery(event: BeforePromptBuildEvent): string;
72
81
  * @internal exported for unit testing and backward compatibility
73
82
  */
74
83
  export declare function extractQueryFromMessages(messages: PromptMessage[] | undefined): string;
75
- /**
76
- * Register the plugin against a minimal shape-compatible subset of the
77
- * OpenClaw plugin API. Returns nothing; side effects are setting a hook and
78
- * logging the initial status.
79
- */
80
84
  export declare function registerKnowledgePlugin(api: OpenClawPluginApi): void;
81
- declare const _default: {
82
- id: string;
83
- name: string;
84
- description: string;
85
- configSchema: import("openclaw/plugin-sdk/plugin-entry").OpenClawPluginConfigSchema;
86
- register: NonNullable<import("openclaw/plugin-sdk/plugin-entry").OpenClawPluginDefinition["register"]>;
87
- } & Pick<import("openclaw/plugin-sdk/plugin-entry").OpenClawPluginDefinition, "kind" | "reload" | "nodeHostCommands" | "securityAuditCollectors">;
88
- export default _default;
85
+ declare const knowledgePluginEntry: OpenClawPluginDefinition;
86
+ export default knowledgePluginEntry;
package/dist/index.js CHANGED
@@ -23,11 +23,13 @@
23
23
  import pg from "pg";
24
24
  import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";
25
25
  import { resolveConfig } from "./config.js";
26
+ import { buildLightRAGProvenance, buildPgvectorProvenance, emitProvenanceReports, resolveEmitAgentEvent, } from "./provenance.js";
26
27
  import { embedQuery } from "./embeddings.js";
27
- import { searchCollection, formatPgvectorResults, rerankPgvectorResults, } from "./pgvector.js";
28
+ import { searchCollection, formatPgvectorResultsDetailed, rerankPgvectorResults, } from "./pgvector.js";
28
29
  import { queryLightRAG, formatLightRAGResults } from "./lightrag.js";
29
30
  import { decideRoute } from "./router/index.js";
30
31
  import { JinaError, summarizeJinaError } from "./jina/errors.js";
32
+ import { RpmMonitor } from "./jina/rate-limit.js";
31
33
  import { emitEvent, emitTurnMetadata, LIGHTRAG_SPARSE_THRESHOLD_CHARS, } from "./tracing/events.js";
32
34
  // Re-export helpers so the test suite can import them directly without
33
35
  // duplicating imports from every submodule.
@@ -58,6 +60,27 @@ export function createBeforePromptBuildHandler(deps) {
58
60
  router: newCooldown(),
59
61
  pgvector_reranker: newCooldown(),
60
62
  };
63
+ // Per-instance RPM monitor — one sliding window per plugin runtime.
64
+ // The `onExceeded` callback emits a structured event the FIRST time the
65
+ // budget is overshot in any given 60-second window, so dashboards alert
66
+ // BEFORE the operator sees billing surprises (especially relevant when
67
+ // the Jina key is shared with another service like Hindsight).
68
+ //
69
+ // When `config.jinaRpmBudget === 0`, the monitor is fully disabled
70
+ // (no instance constructed, no timestamps tracked, no callback ever
71
+ // fires). This matches the contract documented on
72
+ // `JinaPluginConfig.rpmBudget`. Defense-in-depth: even if a caller
73
+ // bypasses this gate and constructs `RpmMonitor` with budget=0
74
+ // directly, the `record()` method itself short-circuits to a no-op.
75
+ const rpmMonitor = config.jinaRpmBudget > 0
76
+ ? new RpmMonitor({
77
+ budget: config.jinaRpmBudget,
78
+ onExceeded: ({ count, budget }) => {
79
+ logger.warn(`openclaw-knowledge: Jina RPM budget exceeded — ${count}/${budget} requests in the last 60s`);
80
+ emitEvent(logger, { type: "jina_rpm_exceeded", count, budget });
81
+ },
82
+ })
83
+ : undefined;
61
84
  return async function beforePromptBuild(event, ctx) {
62
85
  if (!config.enabled)
63
86
  return undefined;
@@ -73,7 +96,7 @@ export function createBeforePromptBuildHandler(deps) {
73
96
  // -----------------------------------------------------------------
74
97
  // Router gate — decide which sources (if any) to consult.
75
98
  // -----------------------------------------------------------------
76
- const decision = await runRouterWithCooldown(config, ctx, query, cooldowns.router, logger);
99
+ const decision = await runRouterWithCooldown(config, ctx, query, cooldowns.router, logger, rpmMonitor);
77
100
  // Project the abstract router decision onto the sources actually
78
101
  // configured in this deployment. Without this projection, an
79
102
  // exclusive route (e.g. LIGHTRAG_ONLY) on a single-source deployment
@@ -98,7 +121,7 @@ export function createBeforePromptBuildHandler(deps) {
98
121
  if (shouldUsePgvector(effectiveRoute) &&
99
122
  config.pgvectorEnabled &&
100
123
  pool) {
101
- tasks.push(runPgvectorSource(pool, query, config, cooldowns.pgvector_reranker, logger));
124
+ tasks.push(runPgvectorSource(pool, query, config, cooldowns.pgvector_reranker, logger, rpmMonitor));
102
125
  }
103
126
  if (shouldUseLightRAG(effectiveRoute) && config.lightragEnabled) {
104
127
  tasks.push(runLightRAGSource(query, config));
@@ -107,6 +130,7 @@ export function createBeforePromptBuildHandler(deps) {
107
130
  return undefined;
108
131
  const settled = await Promise.allSettled(tasks);
109
132
  const sections = [];
133
+ const provenanceReports = [];
110
134
  let failedSources = 0;
111
135
  for (const result of settled) {
112
136
  if (result.status === "rejected") {
@@ -116,8 +140,10 @@ export function createBeforePromptBuildHandler(deps) {
116
140
  continue;
117
141
  }
118
142
  const section = renderSection(result.value, config, logger);
119
- if (section)
120
- sections.push(section);
143
+ if (section) {
144
+ sections.push(section.text);
145
+ provenanceReports.push(section.provenance);
146
+ }
121
147
  }
122
148
  // If every source we launched failed, treat the turn as a failure for
123
149
  // cooldown tracking. A partial failure is fine — the other source's
@@ -129,6 +155,10 @@ export function createBeforePromptBuildHandler(deps) {
129
155
  cooldowns.global.consecutiveErrors = 0;
130
156
  if (sections.length === 0)
131
157
  return undefined;
158
+ // Provenance reports describe EXACTLY the sections returned below —
159
+ // emitted just before the injection is handed to the gateway, so a
160
+ // dropped turn can never have reported sources it did not use.
161
+ emitProvenanceReports(deps.emitAgentEvent, logger, ctx?.runId, ctx?.sessionKey, provenanceReports);
132
162
  return {
133
163
  appendSystemContext: [
134
164
  "",
@@ -195,7 +225,7 @@ export function projectRouteOnEnabledSources(route, pgvectorEnabled, lightragEna
195
225
  * meant to suppress repeated log spam during a sustained outage, not to
196
226
  * stop retrieval.
197
227
  */
198
- async function runRouterWithCooldown(config, ctx, query, cooldown, logger) {
228
+ async function runRouterWithCooldown(config, ctx, query, cooldown, logger, rpmMonitor) {
199
229
  // Reset stale cooldown FIRST so we don't keep the classifier circuit
200
230
  // open longer than necessary (the first turn after expiry must be
201
231
  // able to attempt the classifier again).
@@ -217,6 +247,16 @@ async function runRouterWithCooldown(config, ctx, query, cooldown, logger) {
217
247
  jinaApiKey: config.jinaApiKey,
218
248
  classifierId: config.routerClassifierId || undefined,
219
249
  minConfidence: config.routerMinConfidence,
250
+ onClassifierUsage: (usage) => emitEvent(logger, {
251
+ type: "jina",
252
+ endpoint: "classify",
253
+ model: usage.model,
254
+ durationMs: usage.durationMs,
255
+ // 1 query item per call. Few-shot adds no labels in the
256
+ // body, so inputCount = 1 covers both paths.
257
+ inputCount: 1,
258
+ }),
259
+ rpmMonitor,
220
260
  }, {
221
261
  query,
222
262
  trigger: ctx?.trigger,
@@ -431,7 +471,7 @@ export function extractQueryFromMessages(messages) {
431
471
  }
432
472
  return "";
433
473
  }
434
- async function runPgvectorSource(pool, query, config, rerankerCooldown, logger) {
474
+ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger, rpmMonitor) {
435
475
  const startedAt = Date.now();
436
476
  const vector = await embedQuery(query, config.geminiApiKey);
437
477
  // Use `Promise.allSettled` so a single failing collection (transient DB
@@ -491,6 +531,16 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
491
531
  query,
492
532
  model: config.pgvectorRerankerModel,
493
533
  topN: config.pgvectorRerankerTopN,
534
+ candidatePoolMax: config.pgvectorRerankerCandidatePoolMax || undefined,
535
+ maxCharsPerDoc: config.pgvectorRerankerMaxCharsPerDoc || undefined,
536
+ rpmMonitor,
537
+ onUsage: (usage) => emitEvent(logger, {
538
+ type: "jina",
539
+ endpoint: "rerank",
540
+ model: config.pgvectorRerankerModel,
541
+ durationMs: usage.durationMs,
542
+ inputCount: usage.inputCount,
543
+ }),
494
544
  });
495
545
  rerankerCooldown.consecutiveErrors = 0;
496
546
  return {
@@ -533,7 +583,7 @@ async function runLightRAGSource(query, config) {
533
583
  }
534
584
  function renderSection(result, config, logger) {
535
585
  if (result.source === "pgvector") {
536
- const formatted = formatPgvectorResults(result.data, config.maxInjectChars);
586
+ const formatted = formatPgvectorResultsDetailed(result.data, config.maxInjectChars);
537
587
  const topScore = result.data[0]?.score?.toFixed(2) ?? "n/a";
538
588
  const rerankNote = result.reranked ? " [reranked]" : "";
539
589
  // Emit the event UNCONDITIONALLY — even when pgvector returned no
@@ -564,8 +614,20 @@ function renderSection(result, config, logger) {
564
614
  logger.info(`openclaw-knowledge: pgvector — no result above threshold (rawCount=${result.rawCount})`);
565
615
  return null;
566
616
  }
567
- logger.info(`openclaw-knowledge: pgvector ${result.data.length} result(s)${rerankNote} (top: ${topScore})`);
568
- return "### Document Search Results (pgvector)\n" + formatted;
617
+ // `injectedCount` is the count of entries that actually fit in
618
+ // `maxInjectChars`. It is `<= result.data.length`; anything past the
619
+ // budget was dropped by `formatPgvectorResultsDetailed`. The provenance
620
+ // report MUST mirror the injected subset (contract: "emit what was
621
+ // injected, not what was retrieved"), not the post-rerank candidate
622
+ // list — otherwise `metadata` mode leaks file names and `full` mode
623
+ // leaks excerpts of documents that never reached the LLM.
624
+ const injected = result.data.slice(0, formatted.injectedCount);
625
+ logger.info(`openclaw-knowledge: pgvector — ${formatted.injectedCount}/${result.data.length} result(s)${rerankNote} (top: ${topScore})`);
626
+ const text = "### Document Search Results (pgvector)\n" + formatted.output;
627
+ return {
628
+ text,
629
+ provenance: buildPgvectorProvenance(injected, config.collections, config.provenanceReport, text.length),
630
+ };
569
631
  }
570
632
  if (result.source === "lightrag") {
571
633
  const formatted = formatLightRAGResults(result.data, config.lightragMaxChars);
@@ -586,7 +648,16 @@ function renderSection(result, config, logger) {
586
648
  return null;
587
649
  }
588
650
  logger.info(`openclaw-knowledge: LightRAG — ${formatted.truncated.length}/${formatted.originalLength} chars (truncated from ${formatted.originalLength})`);
589
- return "### Knowledge Graph Context (LightRAG)\n" + formatted.truncated;
651
+ const text = "### Knowledge Graph Context (LightRAG)\n" + formatted.truncated;
652
+ return {
653
+ text,
654
+ // `text.length` (header INCLUDED) is what actually reaches the LLM —
655
+ // pass it through so the provenance `injected.chars` field matches.
656
+ // The `formatted.truncated` body is still used for the `full`-level
657
+ // excerpt because the header is structural noise (no semantic
658
+ // content worth surfacing to the chat frontend).
659
+ provenance: buildLightRAGProvenance(formatted.truncated, config.lightragQueryMode, config.provenanceReport, text.length),
660
+ };
590
661
  }
591
662
  return null;
592
663
  }
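The `renderSection` change above reports provenance only for the entries that fit inside `maxInjectChars`. A minimal sketch of that idea follows, assuming the formatter stops at the first entry that would overflow the budget, so the injected entries are always a prefix of the ranked list. `formatWithBudget` and the entry layout are hypothetical stand-ins; the real `formatPgvectorResultsDetailed` is not shown in this diff.

```javascript
// Hypothetical stand-in for `formatPgvectorResultsDetailed`: the real
// formatting and truncation rules are not visible in this diff.
function formatWithBudget(results, maxChars) {
  let output = "";
  let injectedCount = 0;
  for (const r of results) {
    const entry = `[${r.file_name}] ${r.content}\n`;
    // Stop at the first entry that would overflow the budget, so the
    // injected entries are always a prefix of the ranked list.
    if (output.length + entry.length > maxChars) break;
    output += entry;
    injectedCount += 1;
  }
  return { output, injectedCount };
}

// Illustrative data only.
const ranked = [
  { file_name: "a.md", content: "alpha", score: 0.9 },
  { file_name: "b.md", content: "bravo", score: 0.8 },
  { file_name: "c.md", content: "charlie", score: 0.7 },
];

const formatted = formatWithBudget(ranked, 30);
// The report is built from the injected prefix, not from `ranked` itself.
const injected = ranked.slice(0, formatted.injectedCount);
const reportedFiles = injected.map((r) => r.file_name);
```

With a 30-character budget the third entry does not fit, so `reportedFiles` lists only the first two files. Slicing by `injectedCount` is only valid when truncation drops a tail of the list; a formatter that skipped entries in the middle would need to return the kept indices instead.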
@@ -625,7 +696,13 @@ function registerError(state, scope, logger) {
625
696
  * OpenClaw plugin API. Returns nothing; side effects are setting a hook and
626
697
  * logging the initial status.
627
698
  */
699
+ // FIRST registration's api (gateway re-registration quirk — see the handler
700
+ // wiring below). Module-level: the ESM cache is per-process, so every later
701
+ // registration in the same gateway process sees the original, "loaded" api.
702
+ let stableApi = null;
628
703
  export function registerKnowledgePlugin(api) {
704
+ if (stableApi === null)
705
+ stableApi = api;
629
706
  const rawConfig = (api.pluginConfig ?? {});
630
707
  const config = resolveConfig(rawConfig);
631
708
  if (!config.pgvectorEnabled && !config.lightragEnabled) {
@@ -674,6 +751,12 @@ export function registerKnowledgePlugin(api) {
674
751
  config,
675
752
  pool,
676
753
  logger: api.logger,
754
+ // Provenance reports ride the agent-event bus. GATEWAY QUIRK
755
+ // (bench-verified 2026-06-12): the runtime RE-REGISTERS plugins per run
756
+ // and emitting through a re-registration's api is rejected with "plugin
757
+ // is not loaded"; only the FIRST registration's api stays loaded, hence
758
+ // the module-level singleton.
759
+ emitAgentEvent: resolveEmitAgentEvent(stableApi ?? api),
677
760
  });
678
761
  // The SDK's `api.on<K>` signature is strongly typed per hook name, so we
679
762
  // use a cast here to bridge our structural handler type with the precise
@@ -686,7 +769,15 @@ export function registerKnowledgePlugin(api) {
686
769
  // ---------------------------------------------------------------------------
687
770
  // Canonical plugin entry
688
771
  // ---------------------------------------------------------------------------
689
- export default definePluginEntry({
772
+ // Explicit annotation on the default export, otherwise TS2742 fires when
773
+ // `declaration: true` is on: `definePluginEntry`'s return type
774
+ // (`DefinedPluginEntry`) is a module-local alias that is NOT exported by
775
+ // the SDK's public surface, so TypeScript has no portable name to write
776
+ // into our emitted `dist/index.d.ts`. Pinning to the publicly-exported
777
+ // supertype `OpenClawPluginDefinition` resolves the diagnostic without
778
+ // loosening type safety (the return type is structurally assignable to
779
+ // it — see `Pick<OpenClawPluginDefinition, …>` in the SDK definition).
780
+ const knowledgePluginEntry = definePluginEntry({
690
781
  id: "openclaw-knowledge",
691
782
  name: "Knowledge Base",
692
783
  description: "Multi-source knowledge search for OpenClaw (pgvector + LightRAG) with optional Jina-powered router & reranker",
@@ -694,4 +785,5 @@ export default definePluginEntry({
694
785
  registerKnowledgePlugin(api);
695
786
  },
696
787
  });
788
+ export default knowledgePluginEntry;
697
789
  //# sourceMappingURL=index.js.map
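The `stableApi` singleton in `registerKnowledgePlugin` can be illustrated in isolation. The sketch below assumes that the emitter is a function named `emitAgentEvent` found directly on the api object and that feature detection is a plain `typeof` check; both are assumptions, since the real `resolveEmitAgentEvent` is not shown in this diff. `resolveEmit` and `register` are hypothetical names.

```javascript
// Module-level state: shared by every later call within the same process.
let stableApi = null;

// Hypothetical feature detection. Where the emitter lives on the api
// object is an assumption made for this sketch.
function resolveEmit(api) {
  return typeof api?.emitAgentEvent === "function"
    ? (event) => api.emitAgentEvent(event)
    : null; // older SDK without the emitter: degrade to silence
}

function register(api) {
  // Remember only the FIRST api; later registrations reuse it.
  if (stableApi === null) stableApi = api;
  return resolveEmit(stableApi ?? api);
}

// Two fake registrations standing in for the gateway re-registering the plugin.
const first = { emitAgentEvent: (e) => `first:${e}` };
const second = { emitAgentEvent: (e) => `second:${e}` };

const emitFromFirst = register(first);
const emitFromSecond = register(second);
const viaSecond = emitFromSecond("x"); // routed through the first api
```

Here the emitter obtained during the second registration still routes through the first api, which matches the behaviour the comment in the diff describes. The trade-off is that module-level state persists for the life of the process, so tests that register more than once need a way to reset it.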