@lacneu/openclaw-knowledge 3.1.0
- package/CHANGELOG.md +159 -0
- package/LICENSE +21 -0
- package/README.md +424 -0
- package/dist/config.d.ts +14 -0
- package/dist/config.js +51 -0
- package/dist/config.js.map +1 -0
- package/dist/embeddings.d.ts +9 -0
- package/dist/embeddings.js +33 -0
- package/dist/embeddings.js.map +1 -0
- package/dist/index.d.ts +31 -0
- package/dist/index.js +239 -0
- package/dist/index.js.map +1 -0
- package/dist/lightrag.d.ts +28 -0
- package/dist/lightrag.js +69 -0
- package/dist/lightrag.js.map +1 -0
- package/dist/pgvector.d.ts +21 -0
- package/dist/pgvector.js +81 -0
- package/dist/pgvector.js.map +1 -0
- package/dist/types.d.ts +107 -0
- package/dist/types.js +6 -0
- package/dist/types.js.map +1 -0
- package/openclaw.plugin.json +149 -0
- package/package.json +70 -0
package/CHANGELOG.md
ADDED
|
@@ -0,0 +1,159 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to this project will be documented in this file.
|
|
4
|
+
|
|
5
|
+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
6
|
+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
|
+
|
|
8
|
+
## [Unreleased]
|
|
9
|
+
|
|
10
|
+
## [3.1.0] - 2026-04-10
|
|
11
|
+
|
|
12
|
+
### Changed
|
|
13
|
+
- **Distribution migrated to npm as `@lacneu/openclaw-knowledge`.** The plugin
|
|
14
|
+
is now published to the public npm registry under the `lacneu` organization
|
|
15
|
+
and installs via the official OpenClaw CLI:
|
|
16
|
+
```bash
|
|
17
|
+
openclaw plugins install @lacneu/openclaw-knowledge
|
|
18
|
+
openclaw plugins update @lacneu/openclaw-knowledge
|
|
19
|
+
```
|
|
20
|
+
OpenClaw tracks the install under `plugins.installs`, so `openclaw plugins update`
|
|
21
|
+
works out of the box — no custom deployment script required.
|
|
22
|
+
- `package.json` renamed to the scoped package `@lacneu/openclaw-knowledge`
|
|
23
|
+
and now declares `publishConfig.access: "public"`, a narrow `files` allowlist
|
|
24
|
+
(dist, manifest, README, CHANGELOG, LICENSE), `repository`, `homepage`, `bugs`
|
|
25
|
+
and `keywords` metadata for discoverability on npmjs.com.
|
|
26
|
+
- GitHub Actions release workflow now runs `npm publish --access public` via
|
|
27
|
+
**npm Trusted Publishing (OIDC)** — no `NPM_TOKEN` secret needed. The workflow
|
|
28
|
+
requests an OIDC token from GitHub, exchanges it for a short-lived npm
|
|
29
|
+
credential scoped to this repo + workflow, and ships a provenance statement
|
|
30
|
+
with every release (automatic with Trusted Publishing, `--provenance` flag no
|
|
31
|
+
longer needed). No bundled tarball artifact — the GitHub Release is still
|
|
32
|
+
created for changelog visibility but carries no files.
|
|
33
|
+
- **Full migration to TypeScript + the official OpenClaw plugin SDK.** The plugin now
|
|
34
|
+
uses `definePluginEntry` from `openclaw/plugin-sdk/plugin-entry` as the canonical
|
|
35
|
+
entry point, replacing the bare `{ id, name, register }` object export.
|
|
36
|
+
- Source code is split into focused modules under `src/`:
|
|
37
|
+
`index.ts` (entry + hook wiring), `config.ts` (resolveEnv + defaults),
|
|
38
|
+
`embeddings.ts` (Gemini client), `pgvector.ts` (PostgreSQL search + formatter),
|
|
39
|
+
`lightrag.ts` (LightRAG client + truncation), `types.ts` (shared interfaces).
|
|
40
|
+
- Tests migrated to TypeScript under `test/*.test.ts` using `node:test`.
|
|
41
|
+
Coverage trimmed to 56 tests after removing legacy-shape test cases.
|
|
42
|
+
- Business logic is **unchanged**: same hook (`before_prompt_build`), same output
|
|
43
|
+
format (`### Document Search Results` + `### Knowledge Graph Context`), same
|
|
44
|
+
parallel execution via `Promise.allSettled`, same cooldown (3 errors → 5 min),
|
|
45
|
+
same Gemini native `embedContent` endpoint, same `halfvec(3072)` SQL cast.
|
|
46
|
+
- Current plugin configurations (Olivier and Jerome instances) continue to work
|
|
47
|
+
without any changes — all config keys and defaults are preserved. The breaking
|
|
48
|
+
changes below are limited to internal types and legacy input shapes that were
|
|
49
|
+
defensive cruft, not fields used by active deployments.
|
|
50
|
+
|
|
51
|
+
### Added
|
|
52
|
+
- `tsconfig.json` with strict mode (`noImplicitAny`, `noUnusedLocals`,
|
|
53
|
+
`noUnusedParameters`, `noImplicitReturns`, `noFallthroughCasesInSwitch`).
|
|
54
|
+
- `tsconfig.test.json` and `tsconfig.test-build.json` for typecheck and test compilation.
|
|
55
|
+
- `npm run build`, `npm run typecheck`, `npm run clean` scripts.
|
|
56
|
+
- `@types/node`, `@types/pg`, `typescript`, and `openclaw` (for SDK types) as
|
|
57
|
+
`devDependencies`.
|
|
58
|
+
- Release workflow now runs `npm run typecheck`, `npm run build`, then prunes to
|
|
59
|
+
production dependencies before bundling. The release tarball ships the compiled
|
|
60
|
+
`dist/` directory rather than raw source.
|
|
61
|
+
- CI workflow runs typecheck, tests, and build on Node.js 22 and 24.
|
|
62
|
+
|
|
63
|
+
### Removed (BREAKING)
|
|
64
|
+
- `index.js` and `index.test.js` at the repository root (replaced by `src/` and `test/`).
|
|
65
|
+
- Legacy message shapes in `extractQueryFromMessages`: the `sender` field (alias
|
|
66
|
+
for `role`), the `"human"` role alias, and the `{text: "..."}` fallback form
|
|
67
|
+
are no longer recognized. Only the canonical `{role, content}` shape is accepted,
|
|
68
|
+
where `content` is a `string` or an array of `{type, text}` parts.
|
|
69
|
+
- Legacy LightRAG response shapes in `queryLightRAG`: plain string responses and
|
|
70
|
+
`{context: ...}` payloads are no longer normalized. Only the current
|
|
71
|
+
`{response: string}` shape is supported (LightRAG 1.4.x+).
|
|
72
|
+
- `PromptMessage.sender`, `PromptMessage.text`, and the `[key: string]: unknown`
|
|
73
|
+
index signatures on `PromptMessage` and `PromptContentPart` are removed from
|
|
74
|
+
the exported types. Strict structural typing only.
|
|
75
|
+
- `truncateLightRAG(text: string | null | undefined, ...)` tightened to
|
|
76
|
+
`truncateLightRAG(text: string, ...)`. Callers must pre-check for non-empty.
|
|
77
|
+
- `resolveConfig(raw: KnowledgePluginConfig | null | undefined)` tightened to
|
|
78
|
+
`resolveConfig(cfg?: KnowledgePluginConfig)`. Pass `{}` or no argument instead
|
|
79
|
+
of `null` / `undefined`.
|
|
80
|
+
- `PgvectorRow.score` type tightened from `string | number` to `string` (matches
|
|
81
|
+
actual `pg` driver behaviour for numeric columns).
|
|
82
|
+
|
|
83
|
+
### Previous [Unreleased] entries (now folded into this TS migration)
|
|
84
|
+
- `package.json` now declares `openclaw.compat.pluginApi` and `openclaw.compat.minGatewayVersion`
|
|
85
|
+
so OpenClaw can validate compatibility before loading the plugin.
|
|
86
|
+
- Full `uiHints` coverage in `openclaw.plugin.json` for every config field (labels, placeholders,
|
|
87
|
+
`sensitive: true` on secrets, `advanced: true` on tuning knobs).
|
|
88
|
+
- JSON Schema constraints in `configSchema`: `default`, `minimum`/`maximum` on numeric fields,
|
|
89
|
+
`enum` on `lightragQueryMode`, explicit `default` on `enabled`, `topK`, `scoreThreshold`,
|
|
90
|
+
`maxInjectChars`, `lightragMaxChars`, `lightragQueryMode` and `collections`.
|
|
91
|
+
- Manifest `description` updated to explicitly mention the `before_prompt_build` hook.
|
|
92
|
+
- `peerDependencies.openclaw` bumped to `>=2026.3.7` to match the hook requirement already
|
|
93
|
+
stated in the README.
|
|
94
|
+
|
|
95
|
+
## [3.0.4] - 2026-04-10
|
|
96
|
+
|
|
97
|
+
### Fixed
|
|
98
|
+
- Release tarball now bundles `node_modules` with the `pg` dependency, eliminating
|
|
99
|
+
the need for `npm install` at deployment time. Previously, runtime `npm install`
|
|
100
|
+
would silently fail on Docker installations with tmpfs cache conflicts, leaving
|
|
101
|
+
the plugin unable to load (`Cannot find module 'pg'`).
|
|
102
|
+
|
|
103
|
+
### Changed
|
|
104
|
+
- `update-knowledge-plugin.sh` simplified: no longer runs `npm install` on target
|
|
105
|
+
containers, only verifies that bundled dependencies are present.
|
|
106
|
+
|
|
107
|
+
## [1.2.0] - 2026-03-30
|
|
108
|
+
|
|
109
|
+
### Changed
|
|
110
|
+
- Reverted hook from `before_prompt_build` back to `before_agent_start` for broader compatibility.
|
|
111
|
+
- Changed context injection from `appendSystemContext` to `prependContext` with `<relevant-documents>` tagging.
|
|
112
|
+
- Added logic to prevent memory pollution by `autoCapture`.
|
|
113
|
+
|
|
114
|
+
### Fixed
|
|
115
|
+
- Release workflow now stamps version from tag into `package.json` and `openclaw.plugin.json` before building artifact.
|
|
116
|
+
- Improved logging: removed noisy event keys log, added query length and preview logging.
|
|
117
|
+
|
|
118
|
+
## [1.1.2] - 2026-03-30
|
|
119
|
+
|
|
120
|
+
### Fixed
|
|
121
|
+
- Enhanced logging in `before_prompt_build` hook: capture event keys and improve query handling logic.
|
|
122
|
+
|
|
123
|
+
## [1.1.1] - 2026-03-30
|
|
124
|
+
|
|
125
|
+
### Changed
|
|
126
|
+
- Stabilized hook naming: renamed from `before_agent_start` to `before_prompt_build`.
|
|
127
|
+
- Updated test cases to reflect new hook names.
|
|
128
|
+
- Streamlined query handling and improved context injection logic.
|
|
129
|
+
|
|
130
|
+
## [1.1.0] - 2026-03-30
|
|
131
|
+
|
|
132
|
+
### Changed
|
|
133
|
+
- Switched hook from `before_agent_start` to `before_prompt_build`.
|
|
134
|
+
- Changed injection mechanism from `prependContext` to `appendSystemContext` for system prompt handling.
|
|
135
|
+
- Expanded README with installation and update guidance.
|
|
136
|
+
|
|
137
|
+
## [1.0.0] - 2026-03-30
|
|
138
|
+
|
|
139
|
+
### Added
|
|
140
|
+
- Multi-collection Qdrant vector search via `before_agent_start` hook.
|
|
141
|
+
- Query embedding using Gemini Embedding 2 Preview (3072 dimensions, cross-modal compatible).
|
|
142
|
+
- Parallel search across multiple Qdrant collections.
|
|
143
|
+
- Results sorted by similarity score, injected as `<relevant-documents>` block via `prependContext`.
|
|
144
|
+
- Environment variable substitution in config values (`${VAR_NAME}` syntax).
|
|
145
|
+
- Configurable score threshold, top-K, max injection size, and per-instance collection list.
|
|
146
|
+
- Fail-safe error handling: errors never block the agent.
|
|
147
|
+
- Cooldown mechanism: pauses 5 minutes after 3 consecutive failures.
|
|
148
|
+
- Unit tests (26 tests) using Node.js built-in test runner (`node:test`).
|
|
149
|
+
- CI workflow: tests on Node.js 18, 20, and 22.
|
|
150
|
+
- Release workflow: creates GitHub Release with tarball on tag push.
|
|
151
|
+
- Architecture, lifecycle, and sequence diagrams in `schemas/`.
|
|
152
|
+
|
|
153
|
+
[Unreleased]: https://github.com/OlivierNeu/openclaw-knowledge-plugin/compare/v3.1.0...HEAD
|
|
154
|
+
[3.1.0]: https://github.com/OlivierNeu/openclaw-knowledge-plugin/compare/v1.2.0...v3.1.0
|
|
155
|
+
[1.2.0]: https://github.com/OlivierNeu/openclaw-knowledge-plugin/compare/v1.1.2...v1.2.0
|
|
156
|
+
[1.1.2]: https://github.com/OlivierNeu/openclaw-knowledge-plugin/compare/v1.1.1...v1.1.2
|
|
157
|
+
[1.1.1]: https://github.com/OlivierNeu/openclaw-knowledge-plugin/compare/v1.1.0...v1.1.1
|
|
158
|
+
[1.1.0]: https://github.com/OlivierNeu/openclaw-knowledge-plugin/compare/v1.0.0...v1.1.0
|
|
159
|
+
[1.0.0]: https://github.com/OlivierNeu/openclaw-knowledge-plugin/releases/tag/v1.0.0
|
package/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 OlivierNeu
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
package/README.md
ADDED
|
@@ -0,0 +1,424 @@
|
|
|
1
|
+
# openclaw-knowledge-plugin
|
|
2
|
+
|
|
3
|
+
> **Dual-source knowledge injection plugin for OpenClaw**
|
|
4
|
+
> Automatically enriches agent prompts with relevant context from your document knowledge base,
|
|
5
|
+
> combining **pgvector semantic search** and **LightRAG knowledge graph** in a single hook.
|
|
6
|
+
|
|
7
|
+
[License](LICENSE)
|
|
8
|
+
[OpenClaw](https://github.com/openclaw/openclaw)
|
|
9
|
+
[npm](https://www.npmjs.com/package/@lacneu/openclaw-knowledge)
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## Overview
|
|
14
|
+
|
|
15
|
+
`openclaw-knowledge` is an OpenClaw plugin that automatically injects relevant
|
|
16
|
+
documents and knowledge graph context into every agent turn. It hooks into
|
|
17
|
+
`before_prompt_build` and queries **two complementary sources in parallel**:
|
|
18
|
+
|
|
19
|
+
| Source | Technology | What it provides |
|
|
20
|
+
|--------|------------|------------------|
|
|
21
|
+
| **pgvector** | PostgreSQL + `pgvector` extension | Semantic vector search on document chunks (cosine similarity on 3072-dim embeddings) |
|
|
22
|
+
| **LightRAG** | Neo4j + PostgreSQL | Knowledge graph with entity/relation multi-hop traversal |
|
|
23
|
+
|
|
24
|
+
Both sources run **in parallel** via `Promise.allSettled`, so a failure in one
|
|
25
|
+
source doesn't block the other. Results are merged and injected into the agent's
|
|
26
|
+
system prompt via `appendSystemContext`.
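
The parallel pattern described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `collectBlocks`, `gatherContext`, and the two source callbacks are hypothetical names, and each source is assumed to resolve to an already formatted text block.

```typescript
// Hypothetical sketch: these names are illustrative, not the plugin's real API.
type SourceResult = PromiseSettledResult<string>;

// Keep only the blocks from sources that succeeded and returned some text.
function collectBlocks(results: SourceResult[]): string[] {
  const blocks: string[] = [];
  for (const result of results) {
    if (result.status === "fulfilled" && result.value.length > 0) {
      blocks.push(result.value);
    }
  }
  return blocks;
}

// Run both sources concurrently; a rejection in one leaves the other intact.
async function gatherContext(
  searchPgvector: () => Promise<string>,
  queryLightrag: () => Promise<string>,
): Promise<string> {
  const results = await Promise.allSettled([searchPgvector(), queryLightrag()]);
  return collectBlocks(results).join("\n\n");
}
```

Because `Promise.allSettled` never rejects, the caller only has to inspect each result's `status`; a rejected source simply contributes no block.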
|
|
27
|
+
|
|
28
|
+
---
|
|
29
|
+
|
|
30
|
+
## Why two sources?
|
|
31
|
+
|
|
32
|
+
Vector search and knowledge graphs answer different kinds of questions:
|
|
33
|
+
|
|
34
|
+
- **Vector search** finds passages that are **semantically similar** to the query.
|
|
35
|
+
Good for "What did the meeting say about pricing?" — matches embeddings.
|
|
36
|
+
- **Knowledge graph** finds entities and **their relationships**.
|
|
37
|
+
Good for "Which clients work in the insurance sector?" — traverses entity links.
|
|
38
|
+
|
|
39
|
+
Running both gives the agent both capabilities simultaneously, without requiring
|
|
40
|
+
the LLM to decide which to use.
|
|
41
|
+
|
|
42
|
+
---
|
|
43
|
+
|
|
44
|
+
## Architecture
|
|
45
|
+
|
|
46
|
+

|
|
47
|
+
|
|
48
|
+
The plugin is the **query layer** of a larger knowledge pipeline:
|
|
49
|
+
|
|
50
|
+
1. **Ingestion (background, via n8n):** Google Drive documents are polled,
|
|
51
|
+
OCR'd via Mistral, embedded via Gemini, and stored in PostgreSQL (`pgvector`)
|
|
52
|
+
and Neo4j (LightRAG knowledge graph).
|
|
53
|
+
2. **Query (real-time, via this plugin):** Every user message triggers a
|
|
54
|
+
parallel search in both sources; the results are formatted and appended to
|
|
55
|
+
the agent's prompt.
|
|
56
|
+
|
|
57
|
+
The plugin does **not** handle ingestion — that's the responsibility of the n8n
|
|
58
|
+
ETL pipeline. This plugin only reads from the existing data stores.
|
|
59
|
+
|
|
60
|
+
---
|
|
61
|
+
|
|
62
|
+
## Query lifecycle
|
|
63
|
+
|
|
64
|
+

|
|
65
|
+
|
|
66
|
+
Every user message triggers the following sequence:
|
|
67
|
+
|
|
68
|
+
1. OpenClaw fires `before_prompt_build` with the user's prompt
|
|
69
|
+
2. The plugin checks its **cooldown state** (pauses 5 min after 3 consecutive errors)
|
|
70
|
+
3. Query text is extracted and validated (≥ 3 characters)
|
|
71
|
+
4. **In parallel** (`Promise.allSettled`):
|
|
72
|
+
- **pgvector path:** embed query via Gemini → SQL search on `knowledge_vectors`
|
|
73
|
+
- **LightRAG path:** POST `/query` with `mode=hybrid` to the LightRAG server
|
|
74
|
+
5. Results are merged and truncated to `maxInjectChars`
|
|
75
|
+
6. Formatted blocks (`### Document Search Results` + `### Knowledge Graph Context`)
|
|
76
|
+
are injected via `appendSystemContext`
|
|
77
|
+
7. The agent receives the enriched prompt and generates its response
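
Steps 3 and 5 of the sequence above amount to two small guards. The sketch below is an assumption-laden illustration: the helper names are hypothetical, trimming whitespace before the length check is a guess, and the plugin's real truncation may cut at a different boundary.

```typescript
// Hypothetical helpers. The 3-character minimum and the character budget come
// from the README; trimming and the plain slice are assumptions.
const MIN_QUERY_LENGTH = 3;

// Step 3: accept a query only if it has at least 3 characters after trimming.
function isUsableQuery(query: string): boolean {
  return query.trim().length >= MIN_QUERY_LENGTH;
}

// Step 5: cut the merged text down to the configured character budget.
function truncateToBudget(text: string, maxChars: number): string {
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}
```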
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## Decision flow
|
|
82
|
+
|
|
83
|
+

|
|
84
|
+
|
|
85
|
+
The plugin implements several safeguards to ensure it never blocks the agent:
|
|
86
|
+
|
|
87
|
+
| Safeguard | Purpose |
|
|
88
|
+
|-----------|---------|
|
|
89
|
+
| **Cooldown** (3 errors → 5 min pause) | Avoid log spam and unnecessary API calls during outages |
|
|
90
|
+
| **Query length check** (≥ 3 chars) | Skip meaningless searches |
|
|
91
|
+
| **`Promise.allSettled`** for sources | A failure in one source doesn't affect the other |
|
|
92
|
+
| **Silent error handling** | Errors are logged but never thrown to the agent |
|
|
93
|
+
| **Graceful degradation** | If both sources fail, the agent runs as if the plugin weren't there |
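
The cooldown safeguard from the table above could look roughly like this. It is a sketch under assumptions: the `Cooldown` class and its method names are hypothetical, and only the thresholds (3 consecutive errors, 5 minutes) come from the README.

```typescript
// Hypothetical cooldown tracker: 3 consecutive errors trigger a 5-minute pause.
const MAX_CONSECUTIVE_ERRORS = 3;
const COOLDOWN_MS = 5 * 60 * 1000;

class Cooldown {
  private consecutiveErrors = 0;
  private pausedUntil = 0;

  // True while the pause window is still open.
  isActive(now: number): boolean {
    return now < this.pausedUntil;
  }

  // A successful turn resets the error streak.
  recordSuccess(): void {
    this.consecutiveErrors = 0;
  }

  // Count a failure; on the third in a row, start the pause and reset the streak.
  recordError(now: number): void {
    this.consecutiveErrors += 1;
    if (this.consecutiveErrors >= MAX_CONSECUTIVE_ERRORS) {
      this.pausedUntil = now + COOLDOWN_MS;
      this.consecutiveErrors = 0;
    }
  }
}
```

Passing the current time as an argument (for example `Date.now()`) instead of reading the clock inside the class keeps the logic deterministic in tests.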
|
|
94
|
+
|
|
95
|
+
---
|
|
96
|
+
|
|
97
|
+
## Installation
|
|
98
|
+
|
|
99
|
+
### Requirements
|
|
100
|
+
|
|
101
|
+
- OpenClaw ≥ `v2026.3.7` (for `before_prompt_build` hook)
|
|
102
|
+
- PostgreSQL with `pgvector` extension
|
|
103
|
+
- LightRAG server (optional — plugin works with pgvector alone)
|
|
104
|
+
- Gemini API key (for query embedding)
|
|
105
|
+
|
|
106
|
+
### Install via OpenClaw CLI (recommended)
|
|
107
|
+
|
|
108
|
+
The plugin is published on npm as `@lacneu/openclaw-knowledge`. Use the
|
|
109
|
+
official `openclaw plugins` commands — install, update, list, inspect all
|
|
110
|
+
work out of the box:
|
|
111
|
+
|
|
112
|
+
```bash
|
|
113
|
+
# Install (pulls the latest version from npm)
|
|
114
|
+
openclaw plugins install @lacneu/openclaw-knowledge
|
|
115
|
+
|
|
116
|
+
# Inspect the installed version and manifest
|
|
117
|
+
openclaw plugins inspect @lacneu/openclaw-knowledge
|
|
118
|
+
|
|
119
|
+
# Update to the latest published version
|
|
120
|
+
openclaw plugins update @lacneu/openclaw-knowledge
|
|
121
|
+
|
|
122
|
+
# List everything installed
|
|
123
|
+
openclaw plugins list
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
OpenClaw tracks the install source under `plugins.installs` in your
|
|
127
|
+
configuration, so subsequent `update` calls know where to fetch new versions
|
|
128
|
+
from.
|
|
129
|
+
|
|
130
|
+
### Configuration
|
|
131
|
+
|
|
132
|
+
Add to your `openclaw.json`:
|
|
133
|
+
|
|
134
|
+
```json
|
|
135
|
+
{
|
|
136
|
+
"plugins": {
|
|
137
|
+
"allow": ["openclaw-knowledge", "hindsight-openclaw", "telegram"],
|
|
138
|
+
"entries": {
|
|
139
|
+
"openclaw-knowledge": {
|
|
140
|
+
"enabled": true,
|
|
141
|
+
"config": {
|
|
142
|
+
"geminiApiKey": "${GEMINI_API_KEY}",
|
|
143
|
+
"postgresUrl": "postgresql://user:${POSTGRES_PASSWORD}@postgresql:5432/knowledge",
|
|
144
|
+
"collections": ["knowledge_alice"],
|
|
145
|
+
"topK": 5,
|
|
146
|
+
"scoreThreshold": 0,
|
|
147
|
+
"maxInjectChars": 4000,
|
|
148
|
+
"lightragUrl": "http://lightrag:9621",
|
|
149
|
+
"lightragApiKey": "${LIGHTRAG_API_KEY}",
|
|
150
|
+
"lightragQueryMode": "hybrid",
|
|
151
|
+
"lightragMaxChars": 4000
|
|
152
|
+
}
|
|
153
|
+
}
|
|
154
|
+
}
|
|
155
|
+
}
|
|
156
|
+
}
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
Then restart the gateway:
|
|
160
|
+
|
|
161
|
+
```bash
|
|
162
|
+
openclaw gateway restart
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
---
|
|
166
|
+
|
|
167
|
+
## Configuration reference
|
|
168
|
+
|
|
169
|
+
| Parameter | Type | Default | Description |
|
|
170
|
+
|-----------|------|---------|-------------|
|
|
171
|
+
| `enabled` | boolean | `true` | Master switch for the plugin |
|
|
172
|
+
| **pgvector source** | | | |
|
|
173
|
+
| `geminiApiKey` | string | — | Gemini API key for query embedding (supports `${ENV_VAR}`) |
|
|
174
|
+
| `postgresUrl` | string | — | PostgreSQL connection URL (supports `${ENV_VAR}`) |
|
|
175
|
+
| `collections` | string[] | `["knowledge_default"]` | Collections to search in the `knowledge_vectors` table |
|
|
176
|
+
| `topK` | number | `5` | Max results per collection |
|
|
177
|
+
| `scoreThreshold` | number | `0.3` | Minimum cosine similarity (0–1) |
|
|
178
|
+
| `maxInjectChars` | number | `4000` | Character budget for pgvector results |
|
|
179
|
+
| `pgvectorEnabled` | boolean | `true` if `geminiApiKey` set | Disable pgvector while keeping LightRAG |
|
|
180
|
+
| **LightRAG source** | | | |
|
|
181
|
+
| `lightragUrl` | string | — | LightRAG server base URL |
|
|
182
|
+
| `lightragApiKey` | string | — | LightRAG API key (supports `${ENV_VAR}`) |
|
|
183
|
+
| `lightragQueryMode` | string | `"hybrid"` | Query mode: `naive`, `local`, `global`, `hybrid` |
|
|
184
|
+
| `lightragMaxChars` | number | `4000` | Character budget for LightRAG context |
|
|
185
|
+
| `lightragEnabled` | boolean | `true` if `lightragUrl` set | Disable LightRAG while keeping pgvector |
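
Several parameters in the table above accept `${ENV_VAR}` placeholders. The sketch below shows how such a substitution can work. It follows the regular expression used by the shipped `resolveEnv` helper, but `expandEnv` and its explicit `env` argument are hypothetical; the plugin itself reads from `process.env`.

```typescript
// Hypothetical variant of the plugin's resolveEnv helper. The environment is
// passed in explicitly here so the example stays deterministic.
type Env = Record<string, string | undefined>;

// Replace every ${VAR_NAME} with its value; missing variables become "".
function expandEnv(value: string, env: Env): string {
  return value.replace(/\$\{(\w+)\}/g, (_match: string, name: string) => env[name] ?? "");
}
```

Note that a missing variable silently expands to an empty string, so a typo in a variable name shows up as an empty credential rather than an error.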
|
|
186
|
+
|
|
187
|
+
### LightRAG query modes
|
|
188
|
+
|
|
189
|
+
| Mode | Description | Best for |
|
|
190
|
+
|------|-------------|----------|
|
|
191
|
+
| `naive` | Simple vector similarity on chunks | Fast, basic chunk retrieval without graph traversal |
|
|
192
|
+
| `local` | Entity neighborhood traversal | Questions about a specific entity |
|
|
193
|
+
| `global` | Community summaries | Broad, overview questions |
|
|
194
|
+
| `hybrid` | Combines local + global | **Recommended for most cases** |
|
|
195
|
+
|
|
196
|
+
---
|
|
197
|
+
|
|
198
|
+
## Data model
|
|
199
|
+
|
|
200
|
+
### pgvector: `knowledge_vectors` table
|
|
201
|
+
|
|
202
|
+
The plugin expects a PostgreSQL table with this structure:
|
|
203
|
+
|
|
204
|
+
```sql
|
|
205
|
+
CREATE TABLE knowledge_vectors (
|
|
206
|
+
id SERIAL PRIMARY KEY,
|
|
207
|
+
collection TEXT NOT NULL,
|
|
208
|
+
file_name TEXT,
|
|
209
|
+
mime_type TEXT,
|
|
210
|
+
text TEXT,
|
|
211
|
+
file_id TEXT,
|
|
212
|
+
source TEXT,
|
|
213
|
+
owner TEXT,
|
|
214
|
+
chunk_index INTEGER,
|
|
215
|
+
total_chunks INTEGER,
|
|
216
|
+
timestamp_start TEXT,
|
|
217
|
+
timestamp_end TEXT,
|
|
218
|
+
embedded_at TIMESTAMPTZ,
|
|
219
|
+
embedding vector(3072) NOT NULL
|
|
220
|
+
);
|
|
221
|
+
|
|
222
|
+
CREATE INDEX idx_knowledge_vectors_hnsw
|
|
223
|
+
ON knowledge_vectors
|
|
224
|
+
USING hnsw ((embedding::halfvec(3072)) halfvec_cosine_ops);
|
|
225
|
+
```
|
|
226
|
+
|
|
227
|
+
**Important:** The HNSW index must use `halfvec(3072)` because pgvector's HNSW
|
|
228
|
+
index has a 2000-dimension limit for the native `vector` type. `halfvec`
|
|
229
|
+
supports up to 4000 dimensions. The plugin query casts both the column and the
|
|
230
|
+
parameter accordingly.
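
To make the double cast concrete, here is one way such a query could be assembled. This is a sketch, not the plugin's actual SQL: the table and column names come from the schema above, while `buildSearchQuery`, the selected columns, and the single overall `LIMIT` (rather than a per-collection limit) are assumptions. The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives a similarity score.

```typescript
// Hypothetical query builder; the exact SQL the plugin runs is an assumption.

// pgvector accepts vectors in text form as "[v1,v2,...]".
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(",")}]`;
}

// Both the column and the parameter are cast to halfvec(3072) so the
// expression matches the HNSW index definition.
const SEARCH_SQL = `
  SELECT collection, file_name, text, chunk_index, total_chunks,
         1 - (embedding::halfvec(3072) <=> $1::halfvec(3072)) AS score
  FROM knowledge_vectors
  WHERE collection = ANY($2)
  ORDER BY embedding::halfvec(3072) <=> $1::halfvec(3072)
  LIMIT $3
`;

function buildSearchQuery(embedding: number[], collections: string[], topK: number) {
  return { text: SEARCH_SQL, values: [toVectorLiteral(embedding), collections, topK] };
}
```

With node-postgres, an object of this `{ text, values }` shape can be passed directly to `query`.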
|
|
231
|
+
|
|
232
|
+
### Embeddings
|
|
233
|
+
|
|
234
|
+
- **Model:** `gemini-embedding-2-preview` via the native Gemini API
|
|
235
|
+
- **Dimensions:** 3072
|
|
236
|
+
- **Distance metric:** cosine similarity
|
|
237
|
+
- **Query endpoint:** the plugin uses the **native** `embedContent` endpoint
|
|
238
|
+
(not the OpenAI-compatible one), because the native endpoint supports
|
|
239
|
+
multimodal embedding at ingestion time while still working for text queries.
|
|
240
|
+
|
|
241
|
+
### LightRAG query
|
|
242
|
+
|
|
243
|
+
The plugin sends a POST request:
|
|
244
|
+
|
|
245
|
+
```http
|
|
246
|
+
POST /query HTTP/1.1
|
|
247
|
+
X-API-Key: <lightragApiKey>
|
|
248
|
+
Content-Type: application/json
|
|
249
|
+
|
|
250
|
+
{
|
|
251
|
+
"query": "<user message>",
|
|
252
|
+
"mode": "hybrid",
|
|
253
|
+
"only_need_context": true
|
|
254
|
+
}
|
|
255
|
+
```
|
|
256
|
+
|
|
257
|
+
`only_need_context: true` tells LightRAG to return the retrieved context
|
|
258
|
+
**without** running the final LLM synthesis — the plugin only needs the
|
|
259
|
+
raw context to inject into the agent's prompt.
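
The request above could be built like this in TypeScript. The sketch assumes nothing beyond what the HTTP example shows; `buildLightragRequest` and the `LightragRequest` interface are hypothetical names.

```typescript
// Hypothetical request builder based on the HTTP example above.
interface LightragRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildLightragRequest(
  baseUrl: string,
  apiKey: string,
  query: string,
  mode: string,
): LightragRequest {
  return {
    // Drop trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/query`,
    method: "POST",
    headers: { "X-API-Key": apiKey, "Content-Type": "application/json" },
    // only_need_context asks for the retrieved context without LLM synthesis.
    body: JSON.stringify({ query, mode, only_need_context: true }),
  };
}
```

The `method`, `headers`, and `body` fields can then be handed to `fetch` together with `url`.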
|
|
260
|
+
|
|
261
|
+
---
|
|
262
|
+
|
|
263
|
+
## Multi-tenant support
|
|
264
|
+
|
|
265
|
+
Each OpenClaw instance can configure its own set of collections:
|
|
266
|
+
|
|
267
|
+
```json
|
|
268
|
+
// Alice's instance
|
|
269
|
+
"collections": ["knowledge_alice", "knowledge_shared"]
|
|
270
|
+
|
|
271
|
+
// Bob's instance
|
|
272
|
+
"collections": ["knowledge_bob", "knowledge_shared"]
|
|
273
|
+
```
|
|
274
|
+
|
|
275
|
+
All instances can share the same PostgreSQL database — isolation is done
|
|
276
|
+
at the collection level. LightRAG, however, uses one instance per tenant
|
|
277
|
+
(workspace isolation is not yet exposed in the plugin).
|
|
278
|
+
|
|
279
|
+
---
|
|
280
|
+
|
|
281
|
+
## Example output
|
|
282
|
+
|
|
283
|
+
When the agent receives a user message, it sees something like this in its system prompt:
|
|
284
|
+
|
|
285
|
+
```
|
|
286
|
+
<existing system prompt>
|
|
287
|
+
|
|
288
|
+
### Document Search Results (pgvector)
|
|
289
|
+
|
|
290
|
+
[knowledge_alice] Contrat_Acme_Corp.pdf (score: 0.92, chunk 2/5)
|
|
291
|
+
Service agreement between Alice Consulting and Acme Corp. Duration: 6 months,
|
|
292
|
+
daily rate: 1500 EUR, start date: 2026-01-15, deliverables: strategy workshops,
|
|
293
|
+
CODIR alignment sessions, monthly follow-ups...
|
|
294
|
+
|
|
295
|
+
[knowledge_shared] Pricing_Grid_2026.pdf (score: 0.87, chunk 1/1)
|
|
296
|
+
Standard pricing grid: senior consulting 1500 EUR/day, junior 900 EUR/day,
|
|
297
|
+
workshops 3500 EUR/day flat...
|
|
298
|
+
|
|
299
|
+
### Knowledge Graph Context (LightRAG)
|
|
300
|
+
|
|
301
|
+
Entity: Acme Corp (Organization)
|
|
302
|
+
Relationships:
|
|
303
|
+
- Acme Corp → client_of → Alice Consulting (since 2026-01-15)
|
|
304
|
+
- Acme Corp → subject_of → Contrat_Acme_Corp.pdf
|
|
305
|
+
- Acme Corp → operates_in → Insurance sector
|
|
306
|
+
- Acme Corp → represented_by → Thomas Martin (Contact)
|
|
307
|
+
|
|
308
|
+
User: What were the terms of the Acme contract?
|
|
309
|
+
```
|
|
310
|
+
|
|
311
|
+
The LLM can now cite both the vector search hits (specific text passages) and
|
|
312
|
+
the knowledge graph entities (relationships and structure) to produce a
|
|
313
|
+
grounded answer.
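
A formatter producing entries like the pgvector hits in the example above might look like the sketch below. The `ResultRow` shape and `formatRow` are hypothetical, modeled on the `knowledge_vectors` columns; whether `chunk_index` is stored zero-based or one-based is not stated, so the sketch prints it as stored.

```typescript
// Hypothetical row shape and formatter modeled on the example output above.
interface ResultRow {
  collection: string;
  file_name: string;
  text: string;
  score: string; // the changelog notes that `pg` returns numeric columns as strings
  chunk_index: number;
  total_chunks: number;
}

// Render one hit as a header line followed by the chunk text.
function formatRow(row: ResultRow): string {
  const score = Number(row.score).toFixed(2);
  const header = `[${row.collection}] ${row.file_name} (score: ${score}, chunk ${row.chunk_index}/${row.total_chunks})`;
  return `${header}\n${row.text}`;
}
```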
|
|
314
|
+
|
|
315
|
+
---
|
|
316
|
+
|
|
317
|
+
## Relationship with Hindsight
|
|
318
|
+
|
|
319
|
+
This plugin **complements** [Hindsight](https://github.com/vectorize-io/hindsight)
|
|
320
|
+
(the memory plugin) without conflict:
|
|
321
|
+
|
|
322
|
+
| | Hindsight | openclaw-knowledge |
|
|
323
|
+
|---|-----------|-------------------|
|
|
324
|
+
| **Purpose** | Conversational memory | Document knowledge (RAG) |
|
|
325
|
+
| **Source** | Facts extracted from chats | Documents from Google Drive |
|
|
326
|
+
| **Storage** | PostgreSQL (Hindsight schema) | PostgreSQL (`knowledge_vectors`) + Neo4j |
|
|
327
|
+
| **Trigger** | `auto-recall` on every message | `before_prompt_build` on every message |
|
|
328
|
+
| **Injection block** | `<relevant-memories>` | `### Document Search Results` + `### Knowledge Graph Context` |
|
|
329
|
+
| **OpenClaw slot** | `memory` (exclusive) | None (coexists freely) |
|
|
330
|
+
|
|
331
|
+
Both run on every user message. The agent receives **both** blocks, giving it
|
|
332
|
+
conversational memory AND document knowledge simultaneously.
|
|
333
|
+
|
|
334
|
+
---
|
|
335
|
+
|
|
336
|
+
## Development
|
|
337
|
+
|
|
338
|
+
This plugin is written in **TypeScript** and builds against the official
|
|
339
|
+
OpenClaw plugin SDK (`openclaw/plugin-sdk/plugin-entry`).
|
|
340
|
+
|
|
341
|
+
### Project layout
|
|
342
|
+
|
|
343
|
+
```
|
|
344
|
+
openclaw-knowledge-plugin/
|
|
345
|
+
├── src/ # TypeScript source
|
|
346
|
+
│ ├── index.ts # Entry point (definePluginEntry + register)
|
|
347
|
+
│ ├── config.ts # resolveEnv + default resolution
|
|
348
|
+
│ ├── embeddings.ts # Gemini embedContent client
|
|
349
|
+
│ ├── pgvector.ts # PostgreSQL search + result formatter
|
|
350
|
+
│ ├── lightrag.ts # LightRAG client + truncation
|
|
351
|
+
│ └── types.ts # Shared interfaces
|
|
352
|
+
├── test/ # TypeScript test suites (node:test)
|
|
353
|
+
├── dist/ # Compiled JS + .d.ts (gitignored)
|
|
354
|
+
├── tsconfig.json # Strict TS config for src
|
|
355
|
+
├── tsconfig.test.json # Typecheck (src + test)
|
|
356
|
+
├── tsconfig.test-build.json # Compile tests to dist-test/ for node:test
|
|
357
|
+
├── openclaw.plugin.json # Plugin manifest (config schema + uiHints)
|
|
358
|
+
└── package.json
|
|
359
|
+
```
|
|
360
|
+
|
|
361
|
+
### Build and test
|
|
362
|
+
|
|
363
|
+
```bash
|
|
364
|
+
# Install dev dependencies (includes the openclaw SDK for types, ~200 MB)
|
|
365
|
+
npm install
|
|
366
|
+
|
|
367
|
+
# Strict type check (src + tests)
|
|
368
|
+
npm run typecheck
|
|
369
|
+
|
|
370
|
+
# Run the full test suite (compiles tests then runs node:test)
|
|
371
|
+
npm test
|
|
372
|
+
|
|
373
|
+
# Compile TS → dist/
|
|
374
|
+
npm run build
|
|
375
|
+
|
|
376
|
+
# Clean build output
|
|
377
|
+
npm run clean
|
|
378
|
+
```
|
|
379
|
+
|
|
380
|
+
### Release process
|
|
381
|
+
|
|
382
|
+
1. Update `CHANGELOG.md` with the new version (add a `## [x.y.z] - YYYY-MM-DD` section)
|
|
383
|
+
2. Commit the changelog update
|
|
384
|
+
3. Create and push a git tag:
|
|
385
|
+
```bash
|
|
386
|
+
git tag v3.1.0
|
|
387
|
+
git push origin v3.1.0
|
|
388
|
+
```
|
|
389
|
+
4. GitHub Actions will automatically:
|
|
390
|
+
- Run `npm run typecheck`, `npm test`, `npm run build` on Node.js 24
|
|
391
|
+
- Stamp the version from the tag into `package.json` and `openclaw.plugin.json`
|
|
392
|
+
- Compile TypeScript (`npm run build`)
|
|
393
|
+
- **Publish `@lacneu/openclaw-knowledge` to npm** (public access)
|
|
394
|
+
- Create a GitHub Release with changelog notes extracted from `CHANGELOG.md`
|
|
395
|
+
|
|
396
|
+
#### Required GitHub secret
|
|
397
|
+
|
|
398
|
+
The workflow needs an `NPM_TOKEN` secret. Because the npm account has 2FA
|
|
399
|
+
enabled with a security key, the token **must** be an **Automation token**
|
|
400
|
+
(not a regular Publish token), because automation tokens bypass 2FA for CI/CD.
|
|
401
|
+
|
|
402
|
+
Generate it on npm: *Access Tokens → Generate New Token → Classic Token →
|
|
403
|
+
Automation*, then add it under GitHub repo *Settings → Secrets and variables
|
|
404
|
+
→ Actions* as `NPM_TOKEN`.
|
|
405
|
+
|
|
406
|
+
---
|
|
407
|
+
|
|
408
|
+
## Troubleshooting
|
|
409
|
+
|
|
410
|
+
| Symptom | Cause | Solution |
|
|
411
|
+
|---------|-------|----------|
|
|
412
|
+
| `Cannot find module 'pg'` | Old release (pre-v3.0.4) without bundled deps | Upgrade to v3.0.4+ |
|
|
413
|
+
| `neither pgvector nor LightRAG configured — plugin disabled` | No `geminiApiKey` and no `lightragUrl` | Configure at least one source |
|
|
414
|
+
| `pgvector — source failed: Gemini embedding failed (429)` | Gemini quota exceeded | Check Gemini API quotas or back off |
|
|
415
|
+
| `LightRAG query failed (401)` | Wrong or missing `lightragApiKey` | Verify the header `X-API-Key` is accepted |
|
|
416
|
+
| `LightRAG query failed (503)` | LightRAG server down | Check LightRAG container status |
|
|
417
|
+
| Plugin loads but no context injected | `scoreThreshold` too high | Lower to `0` to see all matches |
|
|
418
|
+
| Plugin enters 5-min cooldown | 3 consecutive errors on all sources | Check logs, fix the underlying issue |
|
|
419
|
+
|
|
420
|
+
---
|
|
421
|
+
|
|
422
|
+
## License
|
|
423
|
+
|
|
424
|
+
MIT — see [LICENSE](LICENSE)
|
package/dist/config.d.ts
ADDED
|
@@ -0,0 +1,14 @@
|
|
|
1
|
+
import type { KnowledgePluginConfig, ResolvedKnowledgeConfig } from "./types.js";
|
|
2
|
+
/**
|
|
3
|
+
* Expand `${VAR_NAME}` patterns in a config string against `process.env`.
|
|
4
|
+
* Non-string values are returned untouched so the helper can be used on any
|
|
5
|
+
* raw config field without type narrowing at the call site. Missing env vars
|
|
6
|
+
* become empty strings to avoid leaking `undefined` into downstream code.
|
|
7
|
+
*/
|
|
8
|
+
export declare function resolveEnv<T>(value: T): T;
|
|
9
|
+
/**
|
|
10
|
+
* Apply defaults and env substitution to the raw plugin config. A source is
|
|
11
|
+
* enabled when its credentials are present, unless the user explicitly toggles
|
|
12
|
+
* `pgvectorEnabled`/`lightragEnabled` off.
|
|
13
|
+
*/
|
|
14
|
+
export declare function resolveConfig(cfg?: KnowledgePluginConfig): ResolvedKnowledgeConfig;
|
package/dist/config.js
ADDED
|
@@ -0,0 +1,51 @@
|
|
|
1
|
+
// Plugin configuration helpers.
|
|
2
|
+
//
|
|
3
|
+
// These helpers are the only place that touches `process.env`, keeping the
|
|
4
|
+
// rest of the plugin easy to test with deterministic values.
|
|
5
|
+
/**
|
|
6
|
+
* Expand `${VAR_NAME}` patterns in a config string against `process.env`.
|
|
7
|
+
* Non-string values are returned untouched so the helper can be used on any
|
|
8
|
+
* raw config field without type narrowing at the call site. Missing env vars
|
|
9
|
+
* become empty strings to avoid leaking `undefined` into downstream code.
|
|
10
|
+
*/
|
|
11
|
+
export function resolveEnv(value) {
|
|
12
|
+
if (typeof value !== "string")
|
|
13
|
+
return value;
|
|
14
|
+
return value.replace(/\$\{(\w+)\}/g, (_, name) => {
|
|
15
|
+
return process.env[name] ?? "";
|
|
16
|
+
});
|
|
17
|
+
}
|
|
18
|
+
const DEFAULT_POSTGRES_URL = "postgresql://openclaw:@postgresql:5432/knowledge";
|
|
19
|
+
const DEFAULT_COLLECTIONS = ["knowledge_default"];
|
|
20
|
+
const DEFAULT_TOP_K = 5;
|
|
21
|
+
const DEFAULT_SCORE_THRESHOLD = 0.3;
|
|
22
|
+
const DEFAULT_MAX_INJECT_CHARS = 4000;
|
|
23
|
+
const DEFAULT_LIGHTRAG_MODE = "hybrid";
|
|
24
|
+
const DEFAULT_LIGHTRAG_MAX_CHARS = 4000;
|
|
25
|
+
/**
|
|
26
|
+
* Apply defaults and env substitution to the raw plugin config. A source is
|
|
27
|
+
* enabled when its credentials are present, unless the user explicitly toggles
|
|
28
|
+
* `pgvectorEnabled`/`lightragEnabled` off.
|
|
29
|
+
*/
|
|
30
|
+
export function resolveConfig(cfg = {}) {
|
|
31
|
+
const geminiApiKey = resolveEnv(cfg.geminiApiKey ?? "");
|
|
32
|
+
const postgresUrl = resolveEnv(cfg.postgresUrl ?? DEFAULT_POSTGRES_URL);
|
|
33
|
+
const lightragUrl = resolveEnv(cfg.lightragUrl ?? "");
|
|
34
|
+
const lightragApiKey = resolveEnv(cfg.lightragApiKey ?? "");
|
|
35
|
+
return {
|
|
36
|
+
enabled: cfg.enabled !== false,
|
|
37
|
+
geminiApiKey,
|
|
38
|
+
postgresUrl,
|
|
39
|
+
collections: cfg.collections ?? DEFAULT_COLLECTIONS,
|
|
40
|
+
topK: cfg.topK ?? DEFAULT_TOP_K,
|
|
41
|
+
scoreThreshold: cfg.scoreThreshold ?? DEFAULT_SCORE_THRESHOLD,
|
|
42
|
+
maxInjectChars: cfg.maxInjectChars ?? DEFAULT_MAX_INJECT_CHARS,
|
|
43
|
+
pgvectorEnabled: cfg.pgvectorEnabled !== false && Boolean(geminiApiKey),
|
|
44
|
+
lightragUrl,
|
|
45
|
+
lightragApiKey,
|
|
46
|
+
lightragQueryMode: cfg.lightragQueryMode ?? DEFAULT_LIGHTRAG_MODE,
|
|
47
|
+
lightragMaxChars: cfg.lightragMaxChars ?? DEFAULT_LIGHTRAG_MAX_CHARS,
|
|
48
|
+
lightragEnabled: cfg.lightragEnabled !== false && Boolean(lightragUrl),
|
|
49
|
+
};
|
|
50
|
+
}
|
|
51
|
+
//# sourceMappingURL=config.js.map
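
To make the behaviour of `config.js` concrete, here is a small self-contained sketch of the two rules it applies: `${VAR_NAME}` expansion and credential-based enablement. `expandEnv` and `sourceEnabled` are hypothetical helper names introduced for this example. The real `resolveEnv` reads `process.env` directly, whereas this sketch takes the environment as a parameter so the result is deterministic.

```typescript
// Illustrative re-statement of the `${VAR_NAME}` expansion in resolveEnv.
// The environment is injected (an assumption for this sketch) instead of
// being read from process.env.
function expandEnv(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(
    /\$\{(\w+)\}/g,
    (_match: string, name: string) => env[name] ?? "",
  );
}

// Enablement rule used for pgvectorEnabled / lightragEnabled: a source is on
// when its credential is non-empty, unless the toggle is explicitly false.
function sourceEnabled(toggle: boolean | undefined, credential: string): boolean {
  return toggle !== false && Boolean(credential);
}

const env: Record<string, string | undefined> = { GEMINI_API_KEY: "abc123" };
const geminiApiKey = expandEnv("${GEMINI_API_KEY}", env); // "abc123"
const lightragUrl = expandEnv("${LIGHTRAG_URL}", env); // "" (variable missing)

console.log(sourceEnabled(undefined, geminiApiKey)); // true: credential present
console.log(sourceEnabled(false, geminiApiKey)); // false: explicitly disabled
console.log(sourceEnabled(true, lightragUrl)); // false: no credential
```

Note the last case: setting the toggle to `true` does not enable a source whose credential resolved to an empty string.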
|
package/dist/config.js.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA,gCAAgC;AAChC,EAAE;AACF,2EAA2E;AAC3E,6DAA6D;AAQ7D;;;;;GAKG;AACH,MAAM,UAAU,UAAU,CAAI,KAAQ;IACpC,IAAI,OAAO,KAAK,KAAK,QAAQ;QAAE,OAAO,KAAK,CAAC;IAC5C,OAAO,KAAK,CAAC,OAAO,CAAC,cAAc,EAAE,CAAC,CAAC,EAAE,IAAY,EAAE,EAAE;QACvD,OAAO,OAAO,CAAC,GAAG,CAAC,IAAI,CAAC,IAAI,EAAE,CAAC;IACjC,CAAC,CAAiB,CAAC;AACrB,CAAC;AAED,MAAM,oBAAoB,GAAG,kDAAkD,CAAC;AAChF,MAAM,mBAAmB,GAAG,CAAC,mBAAmB,CAAC,CAAC;AAClD,MAAM,aAAa,GAAG,CAAC,CAAC;AACxB,MAAM,uBAAuB,GAAG,GAAG,CAAC;AACpC,MAAM,wBAAwB,GAAG,IAAI,CAAC;AACtC,MAAM,qBAAqB,GAAsB,QAAQ,CAAC;AAC1D,MAAM,0BAA0B,GAAG,IAAI,CAAC;AAExC;;;;GAIG;AACH,MAAM,UAAU,aAAa,CAC3B,MAA6B,EAAE;IAE/B,MAAM,YAAY,GAAG,UAAU,CAAC,GAAG,CAAC,YAAY,IAAI,EAAE,CAAC,CAAC;IACxD,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,oBAAoB,CAAC,CAAC;IACxE,MAAM,WAAW,GAAG,UAAU,CAAC,GAAG,CAAC,WAAW,IAAI,EAAE,CAAC,CAAC;IACtD,MAAM,cAAc,GAAG,UAAU,CAAC,GAAG,CAAC,cAAc,IAAI,EAAE,CAAC,CAAC;IAE5D,OAAO;QACL,OAAO,EAAE,GAAG,CAAC,OAAO,KAAK,KAAK;QAC9B,YAAY;QACZ,WAAW;QACX,WAAW,EAAE,GAAG,CAAC,WAAW,IAAI,mBAAmB;QACnD,IAAI,EAAE,GAAG,CAAC,IAAI,IAAI,aAAa;QAC/B,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,uBAAuB;QAC7D,cAAc,EAAE,GAAG,CAAC,cAAc,IAAI,wBAAwB;QAC9D,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,YAAY,CAAC;QACvE,WAAW;QACX,cAAc;QACd,iBAAiB,EAAE,GAAG,CAAC,iBAAiB,IAAI,qBAAqB;QACjE,gBAAgB,EAAE,GAAG,CAAC,gBAAgB,IAAI,0BAA0B;QACpE,eAAe,EAAE,GAAG,CAAC,eAAe,KAAK,KAAK,IAAI,OAAO,CAAC,WAAW,CAAC;KACvE,CAAC;AACJ,CAAC"}
|
package/dist/embeddings.d.ts
ADDED
|
@@ -0,0 +1,9 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* Embed a text query via Gemini Embedding 2 Preview.
|
|
3
|
+
* Uses the same model as n8n document ingestion so that query vectors and
|
|
4
|
+
* stored chunks live in the same 3072-dimensional space.
|
|
5
|
+
*
|
|
6
|
+
* @throws Error on any non-OK HTTP response, with the first 200 chars of the
|
|
7
|
+
* error body for debugging.
|
|
8
|
+
*/
|
|
9
|
+
export declare function embedQuery(text: string, geminiApiKey: string): Promise<number[]>;
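
The doc comment above says `embedQuery` throws on any non-OK HTTP response and includes the first 200 characters of the error body. A minimal sketch of that error-shaping step follows. The `HttpResult` type, the `assertEmbeddingOk` name, and the exact message layout after the status code are assumptions for illustration. Only the `Gemini embedding failed (<status>)` prefix appears in the README's troubleshooting table.

```typescript
// Hypothetical, simplified view of an HTTP response for this sketch.
interface HttpResult {
  ok: boolean;
  status: number;
  body: string;
}

// Throw on a non-OK response, keeping only the first 200 characters of the
// body so logs stay readable. The message layout is an assumption.
function assertEmbeddingOk(res: HttpResult): void {
  if (res.ok) return;
  const snippet = res.body.slice(0, 200);
  throw new Error(`Gemini embedding failed (${res.status}): ${snippet}`);
}

try {
  assertEmbeddingOk({ ok: false, status: 429, body: "quota exceeded" });
} catch (err) {
  console.log((err as Error).message);
}
```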
|