npm - @lacneu/openclaw-knowledge - Versions diffs - 3.2.3 → 3.2.5 - Mend

@lacneu/openclaw-knowledge 3.2.3 → 3.2.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

package/CHANGELOG.md +237 -0
package/dist/config.js +27 -0
package/dist/config.js.map +1 -1
package/dist/index.d.ts +11 -13
package/dist/index.js +104 -12
package/dist/index.js.map +1 -1
package/dist/jina/classifier.d.ts +7 -2
package/dist/jina/classifier.js +4 -2
package/dist/jina/classifier.js.map +1 -1
package/dist/jina/client.d.ts +11 -1
package/dist/jina/client.js +5 -1
package/dist/jina/client.js.map +1 -1
package/dist/jina/rate-limit.d.ts +62 -0
package/dist/jina/rate-limit.js +100 -0
package/dist/jina/rate-limit.js.map +1 -0
package/dist/jina/reranker.d.ts +4 -1
package/dist/jina/reranker.js +2 -1
package/dist/jina/reranker.js.map +1 -1
package/dist/pgvector.d.ts +83 -0
package/dist/pgvector.js +62 -9
package/dist/pgvector.js.map +1 -1
package/dist/provenance.d.ts +105 -0
package/dist/provenance.js +186 -0
package/dist/provenance.js.map +1 -0
package/dist/router/index.d.ts +15 -0
package/dist/router/index.js +9 -0
package/dist/router/index.js.map +1 -1
package/dist/tracing/events.d.ts +13 -1
package/dist/tracing/events.js.map +1 -1
package/dist/types.d.ts +68 -0
package/openclaw.plugin.json +64 -6
package/package.json +1 -1

package/CHANGELOG.md CHANGED

@@ -7,6 +7,243 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+### Added — provenance reporting (provenance/v1, opt-in)
+New `provenanceReport` config (`"off"` default | `"metadata"` | `"full"`).
+When enabled, every injected knowledge section also emits a structured
+provenance report on the gateway agent-event bus (stream
+`openclaw-knowledge.provenance`) so a chat frontend (openclaw-webchat) can
+show the user **which documents fed this reply** — pgvector items carry
+`file_name`/`collection`/`score` (and the exact injected excerpt at `full`),
+LightRAG carries the mode and the injected context excerpt. The report mirrors
+EXACTLY what reached the LLM (post-rerank, post-truncation), never the raw
+retrieval. Normative contract: openclaw-webchat `docs/PROVENANCE_CONTRACT.md`.
+Robustness/privacy invariants: emission degrades to SILENCE on old SDKs
+(`emitAgentEvent` feature-detected), on a missing `runId`, or on a gateway
+rejection (logged, metadata only); reports never go through logs; the gateway
+re-registration quirk is handled by emitting through the FIRST registration's
+api (module singleton). Off-list config values normalize to `"off"`.
+### Fixed — pgvector provenance now reports ONLY entries actually injected
+`buildPgvectorProvenance` was previously fed the full post-rerank list
+(`result.data`) instead of the post-`maxInjectChars` subset that actually
+reaches the LLM. When the character budget truncated the list,
+`formatPgvectorResults` `break`ed after the budget was exhausted but the
+provenance report still exposed metadata (and excerpts in `full` mode) for
+the dropped entries — leaking file names and content for documents the
+LLM never saw, in violation of the PROVENANCE_CONTRACT rule "emit what was
+injected, not what was retrieved".
+A new internal helper `formatPgvectorResultsDetailed(results, maxChars):
+{ output, injectedCount } | null` carries the truncation logic and exposes
+the count of entries that actually fit the budget; `renderSection` uses it
+to slice `result.data` to the exact prefix injected and pass it to
+`buildPgvectorProvenance`. The public `formatPgvectorResults` signature
+remains `(results, maxChars) => string | null` (a thin wrapper over the
+Detailed variant) — preserving backward compatibility across the v3.x
+line. Both helpers return `null` when no entry fits the budget,
+preserving the prior `if (!formatted)` skip-path. The LightRAG provenance
+path was unaffected because it already used the post-truncation
+`formatted.truncated` text directly.
+### Fixed — provenance emission failures never leak payload to logs (codex pass #36 P2)
+`emitProvenanceReports` previously interpolated the gateway's raw `reason`
+string AND the caught `Error.message` into its `logger.warn` lines.
+Either path could echo back content the emitter was rejecting — the
+gateway commonly cites the offending field/value in its validation
+reasons, and third-party libs routinely enrich `Error.message` with input
+data. That broke the module-level invariant that report content NEVER
+reaches logs.
+The helper now classifies rejection reasons to a stable category code
+(`plugin_not_loaded`, `missing_run_context`, `invalid_stream`,
+`validation_error`, `rate_limited`, `rejected`) and, for thrown errors,
+logs only `throw:<Error.name>` (constructor name only — never `.message`,
+never the raw exception). Operators still get enough signal to triage;
+no payload byte can reach the log stream.
+### Fixed — LightRAG provenance `injected.chars` matches what the LLM actually saw (codex pass #36 P3)
+`buildLightRAGProvenance` was being called with `formatted.truncated`,
+which is the BODY of the section. The block actually delivered to the
+LLM also includes the `### Knowledge Graph Context (LightRAG)\n` header,
+so the report's `injected.chars` field was systematically short by the
+header length — contradicting the PROVENANCE_CONTRACT promise that the
+field reflects exactly what reached the model.
+`buildLightRAGProvenance` now takes an optional fourth parameter
+`injectedChars` for callers that track the full section length
+separately. The plugin's render path passes `text.length` (header
+INCLUDED). The `full`-level excerpt body remains the post-truncation
+context (the header carries no semantic content for the chat frontend).
+Invalid `injectedChars` values (`NaN`, negative numbers) fall back to
+`injectedText.length` to keep the report well-formed.
+## [3.2.4] - 2026-05-24
+### Added — payload-size guards for the pgvector reranker
+Production observation on jerome's Jina dashboard (2026-05-17 →
+2026-05-24): rerank calls averaged **~66 600 tokens** each — way over
+the model's 8 K context window. Almost all of that came from LightRAG-
+side reranker chunking, but the plugin-side reranker would face the
+same risk once `knowledge_jerome` is populated. v3.2.4 ships two
+preventive knobs to keep the plugin-side spend bounded:
+- **`jina.pgvectorReranker.candidatePoolMax`** (default `20`). Caps the
+  number of cosine-ranked candidates sent to Jina /v1/rerank. Pgvector
+  recall is typically 20-50 hits; only the top 10-15 are worth
+  reranking. Setting to `0` disables the cap (legacy v3.2.3 behavior).
+- **`jina.pgvectorReranker.maxCharsPerDoc`** (default `2000`). Pre-
+  truncates each candidate text BEFORE submission. Long chunks
+  (transcripts, books) carry most of their relevance signal in the
+  first ~2000 chars; the tail wastes Jina tokens without adding signal.
+  Setting to `0` disables truncation.
+Both guards are pure pre-filters — they do not alter the reranker's
+output ordering, only the input size.
+### Added — `jina` usage events
+The `JinaUsageEvent` shape (already exported since v3.2.0) is now
+actually emitted on every successful Jina API call:
+- **`/v1/classify`** — emitted from `decideRoute` via the new
+  `RouterConfig.onClassifierUsage` callback. `inputCount: 1` (one
+  query per call), `model: "zero-shot" | "few-shot"`.
+- **`/v1/rerank`** — emitted from `rerankPgvectorResults` via the new
+  `RerankPgvectorParams.onUsage` callback. `inputCount: N` (post-trim
+  document count).
+Dashboards can now graph Jina consumption per turn AND per endpoint without
+re-deriving from log timestamps.
+### Added — soft RPM monitor
+A lightweight sliding-window counter (`src/jina/rate-limit.ts`)
+observes outbound Jina request rate across the whole plugin. When the
+configured `jina.rpmBudget` is exceeded within any 60-second window,
+the plugin:
+- emits a single `{type: "jina_rpm_exceeded", count, budget}` event
+  for that window,
+- logs a warning naming the budget and the observed count.
+**The call is never blocked.** The existing 429 cooldown breaker
+remains the hard backstop; the monitor only adds visibility so the
+operator can alert BEFORE billing surprises hit — especially relevant
+when the API key is shared with another service (e.g. Hindsight).
+Default budget: `60` RPM (well below the Jina free-tier 100 RPM
+ceiling). `0` disables the monitor.
+### LightRAG companion changes (operator action)
+A non-trivial share of Jina-token saving sits in the LightRAG `.env`
+config, NOT the plugin. The companion
+`openclaw-notes/lightrag/.env.template` is updated in the same patch
+to recommend:
+- `RERANK_MAX_TOKENS_PER_DOC=600` (was `480`): each candidate splits
+  into 2 sub-rerank calls instead of 3 (1200/600 = 2 vs ⌈1200/480⌉ = ⌈2.5⌉ = 3).
+  Cuts the per-rerank Jina spend by **~33%** at no quality cost
+  (sub-chunks remain complete; payload fits cleanly in v2's 8K
+  context window with no server-side truncation).
+- `MIN_RERANK_SCORE=0.05` (was `0.0`): drops pure-noise reranked
+  chunks. Cleaner injected context, ~10% fewer chars on the
+  downstream LLM prompt.
+- `RERANK_ENABLE_CHUNKING` stays `true` — disabling it would push
+  the cumulative payload (10 docs × 1200 tokens ≈ 12 K) over the
+  8 K v2-multilingual context window, triggering silent server-side
+  truncation (Jina's `truncate:true` policy) and lossy ranking. See
+  the `.env.template` block for the full operating-point comparison
+  (chunking-ON vs TOP_K=5 vs jina-reranker-v3).
+Apply on the NAS:
+```bash
+sudo vi /volume3/openclaw/lightrag/.env.jerome
+# Change: RERANK_MAX_TOKENS_PER_DOC=600
+# Change: MIN_RERANK_SCORE=0.05
+sudo docker restart openclaw-lightrag-jerome
+```
+Expected saving on jerome's observed rate: ~8.4 M → ~5.6 M tokens
+per 7 days (-33%), no quality regression.
+### Migration
+Drop-in plugin patch. Defaults preserve the v3.2.3 behavior on
+runtimes that don't set the new fields. To activate the new caps on
+both instances:
+```bash
+sudo docker exec openclaw-jerome openclaw plugins update @lacneu/openclaw-knowledge
+sudo docker exec openclaw-olivier openclaw plugins update @lacneu/openclaw-knowledge
+sudo docker restart openclaw-jerome openclaw-olivier
+```
+Verify post-restart:
+- `[knowledge.event] {"type":"jina","endpoint":"classify",...}` lines
+  appear on every classifier call.
+- `[knowledge.event] {"type":"jina","endpoint":"rerank","inputCount":N,...}`
+  lines appear on every rerank call. `inputCount` is now bounded by
+  `candidatePoolMax`.
+- `[knowledge.event] {"type":"jina_rpm_exceeded",...}` appears ONLY
+  during real overshoots (typically silent on single-user workloads).
+### Codex pass #33 correction (P2)
+`rpmBudget: 0` is documented as "disables the monitor entirely" on
+both `JinaPluginConfig.rpmBudget` and the `openclaw.plugin.json`
+schema. The initial implementation contradicted that promise: the
+overshoot check `count > this.budget` evaluated to `true` on the very
+first record (`1 > 0`), firing a spurious `jina_rpm_exceeded` event
+on every plugin turn.
+Fixed with defense-in-depth:
+- **`RpmMonitor.record()` / `RpmMonitor.peek()`** short-circuit to
+  `0` when `budget <= 0`. No timestamp tracked, no `onExceeded`
+  callback ever fires. Negative budgets get the same treatment.
+- **`src/index.ts`** skips constructing the monitor entirely when
+  `config.jinaRpmBudget === 0` — `rpmMonitor` becomes `undefined`
+  and every `rpmMonitor?.record()` callsite becomes a free no-op.
+### Codex pass #34 correction (P3)
+`RpmMonitor.lastExceededNotice` was initialized to `0`. The dedup
+check `t - lastExceededNotice >= 60_000` was unsatisfiable until
+simulated time reached 60s when a test clock started at `now=0` — a
+documented use case for `RpmMonitorOptions.now`. The first alert was
+silently suppressed during the first minute of any such test.
+Initialized to `Number.NEGATIVE_INFINITY` so the first overshoot
+fires regardless of clock origin. Production `Date.now()` was always
+well above 60_000, so this is a test-correctness fix with no behavior
+change in production.
+### Test coverage
+- Total: 271 tests, all green (was 250 in 3.2.3; +21 new).
+- New file `test/jina/rate-limit.test.ts` (12 tests) — sliding-window
+  count, 60s expiry, one-notice-per-window deduplication,
+  `DEFAULT_RPM_BUDGET` constant, **Codex #33 regression: budget=0
+  AND negative-budget disable the monitor completely**,
+  **Codex #34 regression: first overshoot fires even with a test
+  clock starting at `now=0`**.
+- 5 new tests in `test/pgvector.test.ts` — `candidatePoolMax` cap,
+  `maxCharsPerDoc` truncation, `onUsage` callback fires on success,
+  `onUsage` NOT fired on failure, legacy `undefined` keeps no-trim
+  behavior.
+- 4 new tests in `test/config.test.ts` — default values, override,
+  `[0, ∞)` clamping on the three new numeric fields.
+- 1 fixture update in `test/tracing/events.test.ts` for the new
+  `JinaRpmExceededEvent` shape.
 ## [3.2.3] - 2026-05-23
 ### Added — observability for "ran and matched nothing" (P1)
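
The "report only what was injected" fix described in the changelog can be sketched as follows. This is a minimal illustration, not the package's code: `PgvectorHit`, `renderEntry`, `formatDetailedSketch`, `formatSketch` and `injectedPrefix` are hypothetical names, and the per-entry layout is an assumption. Only the `{ output, injectedCount } | null` return shape and the prefix slicing come from the changelog text.

```typescript
// Hypothetical hit shape; the package's real row type is not shown in this diff.
interface PgvectorHit {
  file_name: string;
  content: string;
  score: number;
}

interface DetailedFormat {
  output: string;
  injectedCount: number;
}

// Assumed rendering of one entry; the real layout may differ.
function renderEntry(hit: PgvectorHit): string {
  return `[${hit.file_name}] (score ${hit.score.toFixed(2)})\n${hit.content}\n`;
}

// Append entries in rank order until the character budget is exhausted,
// and remember how many entries actually fit.
function formatDetailedSketch(
  results: PgvectorHit[],
  maxChars: number,
): DetailedFormat | null {
  let output = "";
  let injectedCount = 0;
  for (const hit of results) {
    const entry = renderEntry(hit);
    if (output.length + entry.length > maxChars) break; // budget exhausted
    output += entry;
    injectedCount += 1;
  }
  // Nothing fits: return null so an `if (!formatted)` skip-path still works.
  return injectedCount === 0 ? null : { output, injectedCount };
}

// Thin wrapper that keeps the `string | null` shape.
function formatSketch(results: PgvectorHit[], maxChars: number): string | null {
  return formatDetailedSketch(results, maxChars)?.output ?? null;
}

// Provenance is built from the injected prefix only, never the full list.
function injectedPrefix(results: PgvectorHit[], maxChars: number): PgvectorHit[] {
  const formatted = formatDetailedSketch(results, maxChars);
  return formatted ? results.slice(0, formatted.injectedCount) : [];
}
```

With two equally sized hits and a budget that fits only one of them, `injectedPrefix` returns a single element, so a report built from it cannot name the dropped document.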

package/dist/config.js CHANGED

@@ -2,6 +2,8 @@
 //
 // These helpers are the only place that touches `process.env`, keeping the
 // rest of the plugin easy to test with deterministic values.
+import { DEFAULT_RPM_BUDGET } from "./jina/rate-limit.js";
+import { resolveProvenanceLevel } from "./provenance.js";
 import { DEFAULT_MIN_CONFIDENCE } from "./router/index.js";
 /**
  * Expand `${VAR_NAME}` patterns in a config string against `process.env`.
@@ -45,6 +47,12 @@ const DEFAULT_ROUTER_MODE = "heuristic";
 // staying below typical hit scores (≈ 0.40-0.65).
 const DEFAULT_RERANKER_MODEL = "jina-reranker-v2-base-multilingual";
 const DEFAULT_RERANKER_TOP_N = 5;
+// 3.2.4 — payload-trimming defaults. Empirically calibrated on jerome's
+// Jina dashboard (2026-05-17 → 2026-05-24): 20 candidates × 2000 chars
+// fits in ~10K tokens (well below jina-reranker-v2's 8K context window
+// once the query is added) while preserving the top-precision band.
+const DEFAULT_RERANKER_CANDIDATE_POOL_MAX = 20;
+const DEFAULT_RERANKER_MAX_CHARS_PER_DOC = 2000;
 /**
  * Apply defaults and env substitution to the raw plugin config. A source is
  * enabled when its credentials are present, unless the user explicitly toggles
@@ -80,6 +88,8 @@ export function resolveConfig(cfg = {}) {
         lightragEnabled: cfg.lightragEnabled !== false && Boolean(lightragUrl),
         // Jina shared key (used by router and/or reranker)
         jinaApiKey,
+        // 3.2.4 — soft RPM budget. 0 disables the monitor entirely.
+        jinaRpmBudget: clampNonNegInt(jina.rpmBudget ?? DEFAULT_RPM_BUDGET),
         // Router — disabled by default, even with a Jina key present, so
         // operators must opt in explicitly. "heuristic" mode is the safest
         // entry point: zero cost, deterministic.
@@ -95,6 +105,23 @@ export function resolveConfig(cfg = {}) {
         pgvectorRerankerEnabled: reranker.enabled === true && Boolean(jinaApiKey),
         pgvectorRerankerModel: reranker.model ?? DEFAULT_RERANKER_MODEL,
         pgvectorRerankerTopN: reranker.topN ?? DEFAULT_RERANKER_TOP_N,
+        // 3.2.4 — payload-size guards. `null`/`undefined` user input falls
+        // back to the production-tuned defaults; an explicit `0` disables
+        // the corresponding cap (legacy v3.2.3 behavior).
+        pgvectorRerankerCandidatePoolMax: clampNonNegInt(reranker.candidatePoolMax ?? DEFAULT_RERANKER_CANDIDATE_POOL_MAX),
+        pgvectorRerankerMaxCharsPerDoc: clampNonNegInt(reranker.maxCharsPerDoc ?? DEFAULT_RERANKER_MAX_CHARS_PER_DOC),
+        // 3.3.0 — provenance reporting toward chat frontends. Off-list values
+        // (typos, future levels) normalize to "off": a misconfiguration must
+        // never silently leak content.
+        provenanceReport: resolveProvenanceLevel(cfg.provenanceReport),
     };
 }
+/** Clamp a value to a non-negative integer. Bad input collapses to `0`. */
+function clampNonNegInt(value) {
+    if (!Number.isFinite(value))
+        return 0;
+    if (value < 0)
+        return 0;
+    return Math.floor(value);
+}
 //# sourceMappingURL=config.js.map
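
The interaction between the `??` fallback and `clampNonNegInt` in the hunk above is easy to misread, so here is a small sketch. `clampNonNegInt` mirrors the function shown in the diff; `DEFAULT_CANDIDATE_POOL_MAX` and `resolvePoolMax` are hypothetical stand-ins for the corresponding constant and the inline expression in `resolveConfig`.

```typescript
// Mirrors the helper added in the diff above.
function clampNonNegInt(value: number): number {
  if (!Number.isFinite(value)) return 0;
  if (value < 0) return 0;
  return Math.floor(value);
}

// Stand-in for DEFAULT_RERANKER_CANDIDATE_POOL_MAX (hypothetical name).
const DEFAULT_CANDIDATE_POOL_MAX = 20;

// Hypothetical wrapper around the inline `clampNonNegInt(x ?? DEFAULT)` pattern.
function resolvePoolMax(raw: number | null | undefined): number {
  // `??` only replaces null/undefined, so an explicit 0 survives as 0.
  return clampNonNegInt(raw ?? DEFAULT_CANDIDATE_POOL_MAX);
}

console.log(resolvePoolMax(undefined)); // 20 (default applied)
console.log(resolvePoolMax(0));         // 0  (cap explicitly disabled)
console.log(resolvePoolMax(12.7));      // 12 (floored)
console.log(resolvePoolMax(-3));        // 0  (negative collapses to 0)
console.log(resolvePoolMax(NaN));       // 0  (bad input collapses to 0)
```

One consequence worth noting: under this pattern a malformed value such as `NaN` resolves to `0`, which these fields document as "disabled", rather than to the default.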

package/dist/config.js.map CHANGED

@@ -1 +1 @@
- (single-line source map for the 3.2.3 build of config.js; encoded mappings omitted)
+ (single-line source map for the 3.2.5 build of config.js, regenerated alongside the config.js changes above; encoded mappings omitted)

package/dist/index.d.ts CHANGED

@@ -1,4 +1,6 @@
+import type { OpenClawPluginDefinition } from "openclaw/plugin-sdk/plugin-entry";
 import type { OpenClawPluginApi, PluginLogger } from "openclaw/plugin-sdk/plugin-entry";
+import { type EmitAgentEventFn } from "./provenance.js";
 import type { Route } from "./router/types.js";
 import type { BeforePromptBuildEvent, BeforePromptBuildResult, PgPoolLike, PluginHookAgentContext, PromptMessage, ResolvedKnowledgeConfig } from "./types.js";
 export { resolveEnv, resolveConfig } from "./config.js";
@@ -11,6 +13,13 @@ interface HookHandlerDeps {
     config: ResolvedKnowledgeConfig;
     pool: PgPoolLike | null;
     logger: PluginLogger;
+    /**
+     * Gateway agent-event emitter for provenance reports (provenance/v1).
+     * `undefined` on SDKs that predate emitAgentEvent — the handler then
+     * degrades to silence. MUST be bound to the FIRST registration's api
+     * (gateway re-registration quirk; see registerKnowledgePlugin).
+     */
+    emitAgentEvent?: EmitAgentEventFn;
 }
 /**
  * Build the `before_prompt_build` handler bound to a specific plugin state.
@@ -72,17 +81,6 @@ export declare function extractUserQuery(event: BeforePromptBuildEvent): string;
  * @internal exported for unit testing and backward compatibility
  */
 export declare function extractQueryFromMessages(messages: PromptMessage[] | undefined): string;
-/**
- * Register the plugin against a minimal shape-compatible subset of the
- * OpenClaw plugin API. Returns nothing; side effects are setting a hook and
- * logging the initial status.
- */
 export declare function registerKnowledgePlugin(api: OpenClawPluginApi): void;
-declare const _default: {
-    id: string;
-    name: string;
-    description: string;
-    configSchema: import("openclaw/plugin-sdk/plugin-entry").OpenClawPluginConfigSchema;
-    register: NonNullable<import("openclaw/plugin-sdk/plugin-entry").OpenClawPluginDefinition["register"]>;
-} & Pick<import("openclaw/plugin-sdk/plugin-entry").OpenClawPluginDefinition, "kind" | "reload" | "nodeHostCommands" | "securityAuditCollectors">;
-export default _default;
+declare const knowledgePluginEntry: OpenClawPluginDefinition;
+export default knowledgePluginEntry;
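
The "degrades to silence" behaviour documented on `emitAgentEvent` above, together with the changelog rule that failures log only `throw:<Error.name>`, can be sketched like this. Everything here is an assumption for illustration: `EmitFn`, its event shape, `detectEmitter` and `emitSafely` are hypothetical, and the sketch assumes the emitter is exposed as a method on the api object, which this diff does not confirm. Only the stream name comes from the changelog.

```typescript
// Assumed emitter signature and event shape; the real EmitAgentEventFn is not shown here.
type EmitFn = (event: { runId: string; stream: string; data: unknown }) => void;

// Feature-detect the emitter so older SDKs simply yield `undefined`.
function detectEmitter(api: unknown): EmitFn | undefined {
  if (typeof api !== "object" || api === null) return undefined;
  const candidate = (api as Record<string, unknown>)["emitAgentEvent"];
  return typeof candidate === "function"
    ? (candidate as EmitFn).bind(api)
    : undefined;
}

// Emit each report; on any precondition failure, stay silent.
// Returns the number of reports actually handed to the emitter.
function emitSafely(
  emit: EmitFn | undefined,
  runId: string | undefined,
  reports: unknown[],
  warn: (message: string) => void,
): number {
  if (!emit || !runId || reports.length === 0) return 0;
  let sent = 0;
  for (const data of reports) {
    try {
      emit({ runId, stream: "openclaw-knowledge.provenance", data });
      sent += 1;
    } catch (err) {
      // Log the error class only, never `.message`, which may echo payload.
      const name = err instanceof Error ? err.name : "unknown";
      warn(`provenance emit failed (throw:${name})`);
    }
  }
  return sent;
}
```

Keeping the log line to a fixed template plus the error class name means no report content can reach the log stream, at the cost of less detail when triaging.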

package/dist/index.js CHANGED

@@ -23,11 +23,13 @@
 import pg from "pg";
 import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";
 import { resolveConfig } from "./config.js";
+import { buildLightRAGProvenance, buildPgvectorProvenance, emitProvenanceReports, resolveEmitAgentEvent, } from "./provenance.js";
 import { embedQuery } from "./embeddings.js";
-import { searchCollection, formatPgvectorResults, rerankPgvectorResults, } from "./pgvector.js";
+import { searchCollection, formatPgvectorResultsDetailed, rerankPgvectorResults, } from "./pgvector.js";
 import { queryLightRAG, formatLightRAGResults } from "./lightrag.js";
 import { decideRoute } from "./router/index.js";
 import { JinaError, summarizeJinaError } from "./jina/errors.js";
+import { RpmMonitor } from "./jina/rate-limit.js";
 import { emitEvent, emitTurnMetadata, LIGHTRAG_SPARSE_THRESHOLD_CHARS, } from "./tracing/events.js";
 // Re-export helpers so the test suite can import them directly without
 // duplicating imports from every submodule.
@@ -58,6 +60,27 @@ export function createBeforePromptBuildHandler(deps) {
         router: newCooldown(),
         pgvector_reranker: newCooldown(),
     };
+    // Per-instance RPM monitor — one sliding window per plugin runtime.
+    // The `onExceeded` callback emits a structured event the FIRST time the
+    // budget is overshot in any given 60-second window, so dashboards alert
+    // BEFORE the operator sees billing surprises (especially relevant when
+    // the Jina key is shared with another service like Hindsight).
+    //
+    // When `config.jinaRpmBudget === 0`, the monitor is fully disabled
+    // (no instance constructed, no timestamps tracked, no callback ever
+    // fires). This matches the contract documented on
+    // `JinaPluginConfig.rpmBudget`. Defense-in-depth: even if a caller
+    // bypasses this gate and constructs `RpmMonitor` with budget=0
+    // directly, the `record()` method itself short-circuits to a no-op.
+    const rpmMonitor = config.jinaRpmBudget > 0
+        ? new RpmMonitor({
+            budget: config.jinaRpmBudget,
+            onExceeded: ({ count, budget }) => {
+                logger.warn(`openclaw-knowledge: Jina RPM budget exceeded — ${count}/${budget} requests in the last 60s`);
+                emitEvent(logger, { type: "jina_rpm_exceeded", count, budget });
+            },
+        })
+        : undefined;
     return async function beforePromptBuild(event, ctx) {
         if (!config.enabled)
             return undefined;
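
The monitor constructed in the hunk above lives in `dist/jina/rate-limit.js`, which is not part of this excerpt. The following is a rough sketch of the behaviour the changelog describes (sliding 60-second window, one notice per window, `budget <= 0` as a no-op, first notice independent of clock origin). `RpmMonitorSketch` and its option names are hypothetical and are not the package's `RpmMonitor`.

```typescript
interface RpmMonitorSketchOptions {
  budget: number;
  onExceeded?: (info: { count: number; budget: number }) => void;
  now?: () => number; // injectable clock for tests
}

const WINDOW_MS = 60_000;

class RpmMonitorSketch {
  private readonly opts: RpmMonitorSketchOptions;
  private readonly timestamps: number[] = [];
  // -Infinity so the first overshoot fires even when the clock starts at 0.
  private lastNotice = Number.NEGATIVE_INFINITY;

  constructor(opts: RpmMonitorSketchOptions) {
    this.opts = opts;
  }

  // Record one outbound request; never blocks, only observes.
  record(): number {
    const { budget, onExceeded, now = Date.now } = this.opts;
    if (budget <= 0) return 0; // disabled: track nothing, notify nothing
    const t = now();
    this.timestamps.push(t);
    // Drop timestamps that have left the 60-second window.
    for (;;) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || t - oldest < WINDOW_MS) break;
      this.timestamps.shift();
    }
    const count = this.timestamps.length;
    if (count > budget && t - this.lastNotice >= WINDOW_MS) {
      this.lastNotice = t; // at most one notice per window
      onExceeded?.({ count, budget });
    }
    return count;
  }
}
```

With a budget of 2 and a test clock pinned at 0, the third `record()` call triggers the callback once; further calls within the same minute keep counting but stay quiet.
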
@@ -73,7 +96,7 @@ export function createBeforePromptBuildHandler(deps) {
         // -----------------------------------------------------------------
         // Router gate — decide which sources (if any) to consult.
         // -----------------------------------------------------------------
-        const decision = await runRouterWithCooldown(config, ctx, query, cooldowns.router, logger);
+        const decision = await runRouterWithCooldown(config, ctx, query, cooldowns.router, logger, rpmMonitor);
         // Project the abstract router decision onto the sources actually
         // configured in this deployment. Without this projection, an
         // exclusive route (e.g. LIGHTRAG_ONLY) on a single-source deployment
@@ -98,7 +121,7 @@ export function createBeforePromptBuildHandler(deps) {
             if (shouldUsePgvector(effectiveRoute) &&
                 config.pgvectorEnabled &&
                 pool) {
-                tasks.push(runPgvectorSource(pool, query, config, cooldowns.pgvector_reranker, logger));
+                tasks.push(runPgvectorSource(pool, query, config, cooldowns.pgvector_reranker, logger, rpmMonitor));
             }
             if (shouldUseLightRAG(effectiveRoute) && config.lightragEnabled) {
                 tasks.push(runLightRAGSource(query, config));
@@ -107,6 +130,7 @@ export function createBeforePromptBuildHandler(deps) {
                 return undefined;
             const settled = await Promise.allSettled(tasks);
             const sections = [];
+            const provenanceReports = [];
             let failedSources = 0;
             for (const result of settled) {
                 if (result.status === "rejected") {
@@ -116,8 +140,10 @@ export function createBeforePromptBuildHandler(deps) {
                     continue;
                 }
                 const section = renderSection(result.value, config, logger);
-                if (section)
-                    sections.push(section);
+                if (section) {
+                    sections.push(section.text);
+                    provenanceReports.push(section.provenance);
+                }
             }
             // If every source we launched failed, treat the turn as a failure for
             // cooldown tracking. A partial failure is fine — the other source's
@@ -129,6 +155,10 @@ export function createBeforePromptBuildHandler(deps) {
             cooldowns.global.consecutiveErrors = 0;
             if (sections.length === 0)
                 return undefined;
+            // Provenance reports describe EXACTLY the sections returned below —
+            // emitted just before the injection is handed to the gateway, so a
+            // dropped turn can never have reported sources it did not use.
+            emitProvenanceReports(deps.emitAgentEvent, logger, ctx?.runId, ctx?.sessionKey, provenanceReports);
             return {
                 appendSystemContext: [
                     "",
@@ -195,7 +225,7 @@ export function projectRouteOnEnabledSources(route, pgvectorEnabled, lightragEna
  * meant to suppress repeated log spam during a sustained outage, not to
  * stop retrieval.
  */
-async function runRouterWithCooldown(config, ctx, query, cooldown, logger) {
+async function runRouterWithCooldown(config, ctx, query, cooldown, logger, rpmMonitor) {
     // Reset stale cooldown FIRST so we don't keep the classifier circuit
     // open longer than necessary (the first turn after expiry must be
     // able to attempt the classifier again).
@@ -217,6 +247,16 @@ async function runRouterWithCooldown(config, ctx, query, cooldown, logger) {
             jinaApiKey: config.jinaApiKey,
             classifierId: config.routerClassifierId || undefined,
             minConfidence: config.routerMinConfidence,
+            onClassifierUsage: (usage) => emitEvent(logger, {
+                type: "jina",
+                endpoint: "classify",
+                model: usage.model,
+                durationMs: usage.durationMs,
+                // 1 query item per call. Few-shot adds no labels in the
+                // body, so inputCount = 1 covers both paths.
+                inputCount: 1,
+            }),
+            rpmMonitor,
         }, {
             query,
             trigger: ctx?.trigger,
@@ -431,7 +471,7 @@ export function extractQueryFromMessages(messages) {
     }
     return "";
 }
-async function runPgvectorSource(pool, query, config, rerankerCooldown, logger) {
+async function runPgvectorSource(pool, query, config, rerankerCooldown, logger, rpmMonitor) {
     const startedAt = Date.now();
     const vector = await embedQuery(query, config.geminiApiKey);
     // Use `Promise.allSettled` so a single failing collection (transient DB
@@ -491,6 +531,16 @@ async function runPgvectorSource(pool, query, config, rerankerCooldown, logger)
             query,
             model: config.pgvectorRerankerModel,
             topN: config.pgvectorRerankerTopN,
+            candidatePoolMax: config.pgvectorRerankerCandidatePoolMax || undefined,
+            maxCharsPerDoc: config.pgvectorRerankerMaxCharsPerDoc || undefined,
+            rpmMonitor,
+            onUsage: (usage) => emitEvent(logger, {
+                type: "jina",
+                endpoint: "rerank",
+                model: config.pgvectorRerankerModel,
+                durationMs: usage.durationMs,
+                inputCount: usage.inputCount,
+            }),
         });
         rerankerCooldown.consecutiveErrors = 0;
         return {
@@ -533,7 +583,7 @@ async function runLightRAGSource(query, config) {
 }
 function renderSection(result, config, logger) {
     if (result.source === "pgvector") {
-        const formatted = formatPgvectorResults(result.data, config.maxInjectChars);
+        const formatted = formatPgvectorResultsDetailed(result.data, config.maxInjectChars);
         const topScore = result.data[0]?.score?.toFixed(2) ?? "n/a";
         const rerankNote = result.reranked ? " [reranked]" : "";
         // Emit the event UNCONDITIONALLY — even when pgvector returned no
@@ -564,8 +614,20 @@ function renderSection(result, config, logger) {
             logger.info(`openclaw-knowledge: pgvector — no result above threshold (rawCount=${result.rawCount})`);
             return null;
         }
-        logger.info(`openclaw-knowledge: pgvector — ${result.data.length} result(s)${rerankNote} (top: ${topScore})`);
-        return "### Document Search Results (pgvector)\n" + formatted;
+        // `injectedCount` is the count of entries that actually fit in
+        // `maxInjectChars`. It is `<= result.data.length` — anything past the
+        // budget was dropped by `formatPgvectorResults`. The provenance
+        // report MUST mirror the injected subset (contract: "emit what was
+        // injected, not what was retrieved"), not the post-rerank candidate
+        // list — otherwise `metadata` mode leaks file names and `full` mode
+        // leaks excerpts of documents that never reached the LLM.
+        const injected = result.data.slice(0, formatted.injectedCount);
+        logger.info(`openclaw-knowledge: pgvector — ${formatted.injectedCount}/${result.data.length} result(s)${rerankNote} (top: ${topScore})`);
+        const text = "### Document Search Results (pgvector)\n" + formatted.output;
+        return {
+            text,
+            provenance: buildPgvectorProvenance(injected, config.collections, config.provenanceReport, text.length),
+        };
     }
     if (result.source === "lightrag") {
         const formatted = formatLightRAGResults(result.data, config.lightragMaxChars);
@@ -586,7 +648,16 @@ function renderSection(result, config, logger) {
             return null;
         }
         logger.info(`openclaw-knowledge: LightRAG — ${formatted.truncated.length}/${formatted.originalLength} chars (truncated from ${formatted.originalLength})`);
-        return "### Knowledge Graph Context (LightRAG)\n" + formatted.truncated;
+        const text = "### Knowledge Graph Context (LightRAG)\n" + formatted.truncated;
+        return {
+            text,
+            // `text.length` (header INCLUDED) is what actually reaches the LLM —
+            // pass it through so the provenance `injected.chars` field matches.
+            // The `formatted.truncated` body is still used for the `full`-level
+            // excerpt because the header is structural noise (no semantic
+            // content worth surfacing to the chat frontend).
+            provenance: buildLightRAGProvenance(formatted.truncated, config.lightragQueryMode, config.provenanceReport, text.length),
+        };
     }
     return null;
 }
@@ -625,7 +696,13 @@ function registerError(state, scope, logger) {
  * OpenClaw plugin API. Returns nothing; side effects are setting a hook and
  * logging the initial status.
  */
+// FIRST registration's api (gateway re-registration quirk — see the handler
+// wiring below). Module-level: the ESM cache is per-process, so every later
+// registration in the same gateway process sees the original, "loaded" api.
+let stableApi = null;
 export function registerKnowledgePlugin(api) {
+    if (stableApi === null)
+        stableApi = api;
     const rawConfig = (api.pluginConfig ?? {});
     const config = resolveConfig(rawConfig);
     if (!config.pgvectorEnabled && !config.lightragEnabled) {
@@ -674,6 +751,12 @@ export function registerKnowledgePlugin(api) {
         config,
         pool,
         logger: api.logger,
+        // Provenance reports ride the agent-event bus. GATEWAY QUIRK
+        // (bench-verified 2026-06-12): the runtime RE-REGISTERS plugins per run
+        // and emitting through a re-registration's api is rejected "plugin is
+        // not loaded" — only the FIRST registration's api stays loaded, hence
+        // the module-level singleton.
+        emitAgentEvent: resolveEmitAgentEvent(stableApi ?? api),
     });
     // The SDK's `api.on<K>` signature is strongly typed per hook name, so we
     // use a cast here to bridge our structural handler type with the precise
@@ -686,7 +769,15 @@ export function registerKnowledgePlugin(api) {
 // ---------------------------------------------------------------------------
 // Canonical plugin entry
 // ---------------------------------------------------------------------------
-export default definePluginEntry({
+// Explicit annotation on the default export, otherwise TS2742 fires when
+// `declaration: true` is on: `definePluginEntry`'s return type
+// (`DefinedPluginEntry`) is a module-local alias that is NOT exported by
+// the SDK's public surface, so TypeScript has no portable name to write
+// into our emitted `dist/index.d.ts`. Pinning to the publicly-exported
+// supertype `OpenClawPluginDefinition` resolves the diagnostic without
+// loosening type safety (the return type is structurally assignable to
+// it — see `Pick<OpenClawPluginDefinition, …>` in the SDK definition).
+const knowledgePluginEntry = definePluginEntry({
     id: "openclaw-knowledge",
     name: "Knowledge Base",
     description: "Multi-source knowledge search for OpenClaw (pgvector + LightRAG) with optional Jina-powered router & reranker",
@@ -694,4 +785,5 @@ export default definePluginEntry({
         registerKnowledgePlugin(api);
     },
 });
+export default knowledgePluginEntry;
 //# sourceMappingURL=index.js.map
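
Note on the `renderSection` change above: the new comments state the contract as "emit what was injected, not what was retrieved", meaning the provenance report is built from the first `injectedCount` entries rather than the full post-rerank list. The following is a minimal sketch of that idea, not the package's implementation. `formatWithBudget` and `renderWithProvenance` are hypothetical stand-ins (the real `formatPgvectorResultsDetailed` and `buildPgvectorProvenance` are not shown in this diff), and the entry shape `{ fileName, text }` is assumed for illustration only.

```javascript
// Hypothetical stand-in for a budgeted formatter. It appends entries in
// order and stops at the first one that would exceed the character budget.
function formatWithBudget(results, maxChars) {
    let output = "";
    let injectedCount = 0;
    for (const r of results) {
        const entry = `- ${r.fileName}: ${r.text}\n`;
        if (output.length + entry.length > maxChars)
            break;
        output += entry;
        injectedCount += 1;
    }
    return { output, injectedCount };
}

// Hypothetical renderer: the provenance is derived from the injected
// subset only, so sources that were dropped by the budget are never reported.
function renderWithProvenance(results, maxChars) {
    const formatted = formatWithBudget(results, maxChars);
    if (formatted.injectedCount === 0)
        return null;
    const injected = results.slice(0, formatted.injectedCount);
    const text = "### Document Search Results (pgvector)\n" + formatted.output;
    return {
        text,
        provenance: {
            sources: injected.map((r) => r.fileName),
            // Header included, matching what would reach the model.
            chars: text.length,
        },
    };
}

const section = renderWithProvenance([
    { fileName: "a.md", text: "alpha" },
    { fileName: "b.md", text: "beta" },
    { fileName: "c.md", text: "a much longer passage that does not fit" },
], 30);
console.log(section.provenance.sources); // [ 'a.md', 'b.md' ]
```

With a 30-character budget, only the first two entries fit, so `c.md` appears neither in the injected text nor in the reported sources.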