@crowi/plugin-search-elasticsearch 0.1.0-alpha.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +105 -0
- package/dist/index.d.mts +250 -0
- package/dist/index.d.ts +250 -0
- package/dist/index.js +831 -0
- package/dist/index.js.map +1 -0
- package/dist/index.mjs +801 -0
- package/dist/index.mjs.map +1 -0
- package/package.json +46 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2013 Sotaro KARASAWA <sotaro.k@gmail.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,105 @@
# @crowi/plugin-search-elasticsearch

Elasticsearch 9 search driver for Crowi 2.0. Indexes pages on create /
update / delete, serves the wiki search box, and rebuilds the whole
index from scratch on demand. Targets a `<indexName>-current` alias so
a rebuild can swap the underlying index atomically.

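An atomic alias swap of this kind is usually expressed as one Elasticsearch update-aliases request that removes the alias from the old index and adds it to the new one. The following is a minimal sketch of building the `actions` array for such a request. The helper names `aliasFor` and `buildAliasSwapActions` and the physical index names are illustrative assumptions, not taken from this package.

```typescript
// Hypothetical helpers for illustration; not part of the package's API.
type AliasAction =
  | { remove: { index: string; alias: string } }
  | { add: { index: string; alias: string } };

/** Alias name the README says the driver targets: `<indexName>-current`. */
function aliasFor(indexName: string): string {
  return `${indexName}-current`;
}

/**
 * Build the `actions` array for a single update-aliases request.
 * All actions in one request are applied together, so readers never
 * see the alias pointing at no index in between.
 */
function buildAliasSwapActions(
  indexName: string,
  oldIndices: string[],
  newIndex: string,
): AliasAction[] {
  const alias = aliasFor(indexName);
  // Detach the alias from every old index except the one we are moving to.
  const removals: AliasAction[] = oldIndices
    .filter((index) => index !== newIndex)
    .map((index) => ({ remove: { index, alias } }));
  return [...removals, { add: { index: newIndex, alias } }];
}

// Example with made-up physical index names.
const swapActions = buildAliasSwapActions('crowi', ['crowi-old'], 'crowi-new');
console.log(JSON.stringify(swapActions, null, 2));
```

With the official JavaScript client, an array like this would typically be passed as `client.indices.updateAliases({ actions })`. How this plugin names its physical indices is not shown in the README.
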
## Install

```bash
crowi-admin plugin add @crowi/plugin-search-elasticsearch
```

(or, in dev: `pnpm --filter @crowi/api add -D @crowi/plugin-search-elasticsearch`)

## Configure

### 1. Activate the driver in `crowi.config.json`

```jsonc
{
  "plugins": ["@crowi/plugin-search-elasticsearch"],
  "search": { "driver": "elasticsearch" }
}
```

A server restart is required when `search.driver` changes — Crowi
reads this file once at boot.

### 2. Fill in connection settings in the admin UI

Open `/admin/plugins` and edit `@crowi/plugin-search-elasticsearch`:

- **`url`** — `https://[user:pass@]host[:port][/indexName]`. The URL
  embeds the cluster password (Bonsai-style), so it is encrypted at
  rest with `CROWI_ENCRYPTION_KEY`.
- **`indexName`** — base index name (default `crowi`). The driver
  reads / writes the `<indexName>-current` alias.
- **`requestTimeout`** — per-request timeout in ms (default `5000`).
- **`analyzer`** — `default` / `kuromoji` / `sudachi` (see below).

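To make the `url` format concrete, here is a rough sketch of splitting such a value into a node address, credentials, and an index name with the standard `URL` class. The function name `parseSearchUrl` and the returned field layout are assumptions for illustration; this is not the plugin's actual parser. The rule that an index name in the URL path takes precedence over `indexName` follows the schema comment in the type declarations.

```typescript
// Illustrative shape; the real driver state may differ.
interface ParsedSearchUrl {
  node: string;            // scheme + host (+ port), without credentials or path
  username: string | null;
  password: string | null;
  baseIndexName: string;
  aliasName: string;       // `<baseIndexName>-current`
}

function parseSearchUrl(url: string, fallbackIndexName = 'crowi'): ParsedSearchUrl | null {
  // An empty url means "not configured"; signal that with null.
  if (url.trim() === '') return null;

  const parsed = new URL(url);
  // The first non-empty path segment, if present, names the index.
  const fromPath = parsed.pathname.split('/').filter(Boolean)[0];
  const baseIndexName = fromPath ?? fallbackIndexName;

  return {
    node: `${parsed.protocol}//${parsed.host}`,
    // URL keeps credentials percent-encoded, so decode them for use.
    username: parsed.username ? decodeURIComponent(parsed.username) : null,
    password: parsed.password ? decodeURIComponent(parsed.password) : null,
    baseIndexName,
    aliasName: `${baseIndexName}-current`,
  };
}

// Example with a made-up host and credentials.
console.log(parseSearchUrl('https://user:p%40ss@es.example.com:9243/wiki'));
```

Keeping the credentials out of `node` means the node address can be logged without leaking the password, which fits the README's note that the full URL is treated as sensitive.
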
## Hot-reload (no restart needed)

This plugin implements `reconfigure`, so **saving connection settings
in the admin UI applies without a server restart**. When you save:

- the `url` / `indexName` / `requestTimeout` / `analyzer` changes are
  picked up by the live driver,
- a fresh Elasticsearch client is built and the previous one is closed
  in the background (its HTTP keep-alive pool drains),
- the admin UI shows a "saved — applied immediately" toast.

Mechanics: the driver holds a module-scope state ref; each operation
(`query` / `index` / `remove` / `rebuild`) snapshots the state once at
the top of the call, so a save that lands mid-request cannot retarget
an inflight operation onto a different cluster. The next request sees
the new settings.

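The state-ref plus snapshot mechanics can be sketched in a few lines. This is a simplified model under stated assumptions: `FakeClient`, `makeClient`, `DriverState`, and `demo` are stand-ins invented for the example, and the real driver works with an Elasticsearch client rather than this stub.

```typescript
// Stand-in for the real client; only what the sketch needs.
interface FakeClient {
  name: string;
  search(index: string, q: string): Promise<string>;
}

interface DriverState {
  client: FakeClient | null;
  aliasName: string;
}

function makeClient(name: string): FakeClient {
  return {
    name,
    // Resolve on a later tick so a reconfigure can land mid-request.
    search: (index, q) =>
      new Promise((resolve) => setTimeout(() => resolve(`${name}/${index}?q=${q}`), 10)),
  };
}

// Module-scope state ref shared by the operations and `reconfigure`.
const state: DriverState = { client: makeClient('cluster-a'), aliasName: 'crowi-current' };

async function query(q: string): Promise<string> {
  // Snapshot once at the top; `state` is never re-read after an await.
  const { client, aliasName } = state;
  if (!client) throw new Error('Search not configured');
  await Promise.resolve(); // simulated async work before the request
  return client.search(aliasName, q);
}

function reconfigure(next: DriverState): FakeClient | null {
  const oldClient = state.client;
  // Mutate in place so every holder of the ref sees the new values.
  state.client = next.client;
  state.aliasName = next.aliasName;
  return oldClient; // the caller may close this in the background
}

async function demo(): Promise<string[]> {
  reconfigure({ client: makeClient('cluster-a'), aliasName: 'crowi-current' });
  const inflight = query('wiki'); // snapshots cluster-a
  reconfigure({ client: makeClient('cluster-b'), aliasName: 'docs-current' });
  const next = query('wiki');     // snapshots cluster-b
  return Promise.all([inflight, next]);
}
```

In `demo`, the first query finishes against `cluster-a` even though the settings changed while it was pending, and the second query uses `cluster-b`. Re-reading `state.client` after the `await` instead of using the snapshot would break that guarantee.
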
### Caveats

- **Analyzer changes need a manual rebuild.** Switching `analyzer`
  (`default` ↔ `kuromoji` ↔ `sudachi`) updates the setting immediately,
  but the **existing index keeps its old analyzer** — Elasticsearch
  analyzers are fixed at index-creation time. Run a full rebuild from
  `/admin/search` (or the rebuild action) to create a new index with
  the new analyzer and swap the alias to it.
- **Empty `url` → configured `url` is restart-only.** If `url` was
  empty at boot, the driver is not registered and there is nothing for
  `reconfigure` to mutate; configure a `url` and restart once. After
  that, all further changes hot-reload. Clearing a configured `url`
  (configured → empty) *is* handled live — search requests then fail
  with a clear `Search not configured` error until a `url` is set again.
- A rebuild that is already running when you reconfigure runs to
  completion against the cluster / index name it started with.

## Analyzer flavours

| Analyzer | Cluster requirement |
|---|---|
| `default` | No extra Elasticsearch plugin. |
| `kuromoji` | `analysis-kuromoji` plugin (Elastic-distributed). The dev image (`elasticsearch.Dockerfile`) preinstalls it. |
| `sudachi` | Third-party `analysis-sudachi` plugin + dictionary. **Not** bundled in the dev image — operators must build a derived image. Picking this without the plugin makes `rebuild()` fail. |

## Verifying hot-reload in dev

The dev `docker compose` stack includes an Elasticsearch service, so
you can exercise the hot-reload path end to end:

1. `docker compose up -d` — brings up mongo / redis / elasticsearch.
2. `pnpm dev` — starts api + web.
3. In `/admin/plugins`, set `@crowi/plugin-search-elasticsearch`'s
   `url` to `http://elasticsearch:9200/crowi` and save; restart once if
   the driver was previously unconfigured.
4. From `/admin/search`, run a rebuild to populate the index.
5. Change `requestTimeout` (or point `indexName` at a freshly rebuilt
   index) and save **without restarting**. The next search query uses
   the new settings — confirmed by the api log line
   `reconfigured elasticsearch search driver (...)`.

## See also

- RFC-0001 §"Search" for the migration story from the legacy ES7 indexer.
- [`@crowi/plugin-storage-aws-s3`](../plugin-storage-aws-s3) — the
  reference implementation of the same state-ref + snapshot hot-reload
  pattern.
package/dist/index.d.mts
ADDED
@@ -0,0 +1,250 @@
import { z } from 'zod/v3';
import { SearchDriver, PluginLogger, SearchQueryViewer, SearchQueryGrants, CrowiPlugin } from '@crowi/plugin-api';
import { Client } from '@elastic/elasticsearch';

/**
 * Elasticsearch 9 driver implementing the `SearchDriver` contract.
 * Owns the Client, the `${indexName}-current` alias (legacy ops
 * compat), single-doc index / remove, query against the alias, and
 * rebuild-from-scratch in 2k-doc bulk batches with bookmark counts
 * pre-fetched in one aggregate. Document field shape (path / body /
 * username / grant / granted_users / *_count / *_at) matches the
 * legacy ES7 indexer for reindex-free migration.
 */

type Analyzer = 'default' | 'kuromoji' | 'sudachi';
interface ElasticsearchDriverConfig {
    url: string;
    indexName: string;
    requestTimeout: number;
    analyzer: Analyzer;
}
interface ElasticsearchDriverDeps {
    log?: PluginLogger;
    /**
     * Iterate every page in the Mongo Page collection in cursor-style.
     * Plugin can't import the Page model directly, so the manager wires
     * this in from `ctx.model('Page')`. Each yielded doc is the lean
     * shape produced by `Page.getStreamOfFindAll({ publicOnly: false })`.
     */
    iteratePages?: (handler: (page: PageStreamDoc) => Promise<void>) => Promise<void>;
    /** Total page count, used for progress reporting. */
    countAllPages?: () => Promise<number>;
    /**
     * Bulk-fetch bookmark counts for every page in one Mongo aggregate.
     * Avoids the per-doc N+1 lookup the legacy rebuild used. Returns a
     * `Map<pageId, count>`; pages without bookmarks may be absent
     * (caller defaults to 0).
     */
    getBookmarkCountsBulk?: () => Promise<Map<string, number>>;
    /** Total user count, used to scale the bookmark-count factor. */
    countUsers?: () => Promise<number>;
}
/** The lean Page document shape we expect from the rebuild stream. */
interface PageStreamDoc {
    _id: {
        toString: () => string;
    } | string;
    path: string;
    redirectTo: string | null;
    status: string;
    grant: number;
    grantedUsers?: Array<{
        toString: () => string;
    } | string>;
    creator?: {
        username?: string;
    };
    revision?: {
        body?: string;
    };
    liker?: unknown[];
    commentCount?: number;
    bookmarkCount?: number;
    createdAt?: Date;
    updatedAt?: Date;
}
interface ElasticsearchDriver extends SearchDriver {
    /** Currently-targeted alias name (`<indexName>-current`). Exposed for tests / admin UI. */
    readonly aliasName: string;
    /** ES node URI parsed out of `config.url`. */
    readonly node: string;
    /** Base index name (without timestamp / `-current` suffix). */
    readonly baseIndexName: string;
    /** Test-only handle to the underlying client. */
    readonly client: Client;
}
/**
 * Mutable driver state. `createElasticsearchDriver` receives a ref to
 * this; each driver method snapshots the fields it needs *once at the
 * top* of the call, so a concurrent `reconfigure` cannot swap the
 * client / index name mid-operation. `reconfigure` mutates the fields
 * in place via {@link applyConfigInPlace}; the next call sees the new
 * values. An empty `url` leaves `client` as `null` — the methods then
 * throw a `Search not configured` error rather than touching a stale
 * client.
 */
interface ESDriverState {
    /** `null` when `url` is empty (driver configured-but-disabled). */
    client: Client | null;
    /** ES node URI parsed out of `config.url`; empty string when `url` is empty. */
    node: string;
    /** Base index name (without timestamp / `-current` suffix). */
    baseIndexName: string;
    /** Runtime alias the driver reads / writes (`<baseIndexName>-current`). */
    aliasName: string;
    analyzer: Analyzer;
    requestTimeout: number;
}
/**
 * Build a fresh {@link ESDriverState} from a config. An empty `url`
 * yields a disabled state (`client: null`) instead of throwing — the
 * driver stays registered but every method rejects with a
 * `Search not configured` error.
 */
declare function applyConfig(config: ElasticsearchDriverConfig): ESDriverState;
/**
 * Mutate `target` in place to reflect `config`. Used by `reconfigure`:
 * the old client reference is returned so the caller can `close()` it
 * (fire-and-forget) once the swap is done — inflight operations have
 * already snapshotted the old client and will run to completion.
 */
declare function applyConfigInPlace(target: ESDriverState, config: ElasticsearchDriverConfig): {
    oldClient: Client | null;
};
/**
 * Build the search driver around a {@link ESDriverState} ref. Methods
 * snapshot `state` *once at the top* — a `reconfigure` running
 * concurrently with an inflight call cannot swap the client mid-call;
 * the next call sees the new client / index name.
 */
declare function createElasticsearchDriver(state: ESDriverState, deps?: ElasticsearchDriverDeps): ElasticsearchDriver;

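The header comment in this file describes the rebuild as running in 2k-doc bulk batches, with bookmark counts pre-fetched into a `Map` rather than looked up per document. A rough sketch of that shape follows. `toBulkBatches`, `PageLike`, `IndexedDoc`, and the `bookmark_count` field name are illustrative assumptions, and a real rebuild would consume the page stream instead of an in-memory array.

```typescript
// Minimal page shape for the sketch; the real stream doc has more fields.
interface PageLike {
  _id: { toString: () => string } | string;
  path: string;
}

// Hypothetical indexed-document shape.
interface IndexedDoc {
  id: string;
  path: string;
  bookmark_count: number;
}

const BATCH_SIZE = 2000; // "2k-doc bulk batches" per the header comment

function toBulkBatches(
  pages: PageLike[],
  bookmarkCounts: Map<string, number>,
  batchSize: number = BATCH_SIZE,
): IndexedDoc[][] {
  const batches: IndexedDoc[][] = [];
  let current: IndexedDoc[] = [];
  for (const page of pages) {
    const id = typeof page._id === 'string' ? page._id : page._id.toString();
    // Pages without bookmarks may be absent from the map; default to 0.
    const bookmarkCount = bookmarkCounts.get(id) ?? 0;
    current.push({ id, path: page.path, bookmark_count: bookmarkCount });
    if (current.length === batchSize) {
      batches.push(current);
      current = [];
    }
  }
  // Flush the final, possibly partial, batch.
  if (current.length > 0) batches.push(current);
  return batches;
}
```

Looking counts up in a pre-fetched map keeps the rebuild at one aggregate query plus one bulk request per batch, instead of one extra database round trip per page.
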
/**
 * Search-string parser for the Elasticsearch driver.
 *
 * Splits a free-form query into positive / negative keywords and
 * phrases. Lifted from the legacy `packages/api/src/service/query.ts`
 * with no behaviour changes — preserved here as a plugin-private
 * helper because the parser is currently ES-specific (the +/- and
 * `"phrase"` syntax maps directly to ES `multi_match` queries). When
 * a future driver wants the same shape, factor it back into
 * `@crowi/plugin-api`.
 */
type PositiveAndNegative<T> = {
    positive: T;
    negative: T;
};
type ParsedSearchQuery = {
    keywords: PositiveAndNegative<string[]>;
    phrases: PositiveAndNegative<string[]>;
};
declare const parseQuery: (query: string) => ParsedSearchQuery;

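Only the signature of `parseQuery` is visible here, so the following is a guess at the general technique rather than the package's implementation. The sketch splits a query into bare keywords and `"quoted phrases"`, each optionally negated with a leading `-`. The name `parseQuerySketch` is hypothetical, and how the real parser treats a leading `+` or unbalanced quotes is not shown in this diff.

```typescript
type PositiveAndNegative<T> = { positive: T; negative: T };
type ParsedSearchQuery = {
  keywords: PositiveAndNegative<string[]>;
  phrases: PositiveAndNegative<string[]>;
};

// Hypothetical re-implementation for illustration only.
function parseQuerySketch(query: string): ParsedSearchQuery {
  const result: ParsedSearchQuery = {
    keywords: { positive: [], negative: [] },
    phrases: { positive: [], negative: [] },
  };
  // A token is an optional `-`, then either a "quoted phrase" or a bare word.
  const tokenPattern = /(-?)(?:"([^"]*)"|(\S+))/g;
  let match: RegExpExecArray | null;
  while ((match = tokenPattern.exec(query)) !== null) {
    const negated = match[1] === '-';
    const phrase: string | undefined = match[2];
    const word: string | undefined = match[3];
    if (phrase !== undefined) {
      const trimmed = phrase.trim();
      if (trimmed === '') continue; // ignore empty quotes
      (negated ? result.phrases.negative : result.phrases.positive).push(trimmed);
    } else if (word !== undefined) {
      (negated ? result.keywords.negative : result.keywords.positive).push(word);
    }
  }
  return result;
}

// Example: two keywords and two phrases, one of each negated.
console.log(parseQuerySketch('wiki -draft "release notes" -"old api"'));
```

Keeping keywords and phrases in separate positive and negative buckets lets a query builder map each bucket to its own clause without re-parsing the string.
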
/**
 * Build an Elasticsearch 8 search request body from the SearchQuery
 * shape exposed by `@crowi/plugin-api`. The driver passes a parsed
 * keyword/phrase tree plus the viewer + grants from the original
 * SearchQuery; this module composes them into a single bool query.
 *
 * Design notes:
 *  - All filters are composed at the top-level `bool`. We never nest
 *    a second `bool` for the same operator type (must / filter /
 *    should / must_not), so the generated body is small and easy to
 *    diff in tests.
 *  - The grant filter mirrors the legacy ES Searcher precisely:
 *    a non-public page (RESTRICTED / SPECIFIED / OWNER) is hidden
 *    unless its `username` field matches the viewer's username.
 *    For SPECIFIED / OWNER / RESTRICTED pages, we additionally allow
 *    the page through if `granted_users` contains the viewer id —
 *    the legacy query only checked `username`, but the new
 *    SearchableDoc lets us index `granted_users` precisely so we
 *    can express "shared with me" as well.
 *  - Type filter (portal / public / user) reproduces the legacy
 *    `path.raw` regex / prefix queries.
 */

type FunctionScoreParams = {
    fieldValueFactor: {
        field: string;
        factor?: number;
        modifier?: 'log' | 'log1p' | 'log2p' | 'ln' | 'ln1p' | 'ln2p' | 'square' | 'sqrt' | 'reciprocal' | 'none';
        missing: number;
    };
    boostMode: 'multiply' | 'replace' | 'sum' | 'avg' | 'max' | 'min';
};
interface BuildSearchBodyParams {
    parsed: ParsedSearchQuery;
    pathPrefix?: string;
    viewer?: SearchQueryViewer;
    grants?: SearchQueryGrants;
    functionScore?: FunctionScoreParams;
    from: number;
    size: number;
}
/**
 * Build the ES9 search request body. Returns an object suitable for
 * `client.search({ index, ...body })`.
 */
declare function buildSearchBody(params: BuildSearchBodyParams): {
    from: number;
    size: number;
    sort: Array<Record<string, unknown>>;
    highlight: Record<string, unknown>;
    query: Record<string, unknown>;
    _source: string[];
};

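The grant rule in the design notes can be restated as a plain predicate over one indexed document. This sketch only models the visibility rule; it does not reproduce the Elasticsearch query that `buildSearchBody` generates. The `Viewer` shape, the function name `isVisibleTo`, and the numeric value of `GRANT_PUBLIC` are assumptions for illustration.

```typescript
// Assumed numeric code for a public page; the diff does not show the real constant.
const GRANT_PUBLIC = 1;

// Subset of the indexed fields named in the design notes.
interface GrantFields {
  grant: number;
  username: string;        // page creator's username
  granted_users: string[]; // ids of users the page is shared with
}

// Hypothetical viewer shape; `SearchQueryViewer` itself is not shown here.
interface Viewer {
  id: string;
  username: string;
}

function isVisibleTo(doc: GrantFields, viewer?: Viewer): boolean {
  // Public pages are visible to everyone, including anonymous viewers.
  if (doc.grant === GRANT_PUBLIC) return true;
  // Non-public pages are never visible without a viewer.
  if (!viewer) return false;
  // Legacy rule: the creator sees their own non-public pages.
  if (doc.username === viewer.username) return true;
  // Added rule: "shared with me" through granted_users.
  return doc.granted_users.indexOf(viewer.id) !== -1;
}
```

A predicate like this can serve as a reference in tests: for any document and viewer, the generated query should match the document exactly when the predicate returns `true`.
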
/**
 * @crowi/plugin-search-elasticsearch — search driver registering
 * `'elasticsearch'` against the SearchRegistry.
 *
 * Activation: add this plugin to the runner's `crowi.config.json`
 * `plugins` array and set `search.driver: 'elasticsearch'`. Configure
 * via the Mongo Config namespace `plugin:@crowi/plugin-search-elasticsearch:*`
 * — operators set the connection URL exclusively from the admin UI.
 */

declare const ElasticsearchConfigSchema: z.ZodObject<{
    /**
     * `https://[user:pass@]host[:port][/indexName]`. Empty string keeps
     * the driver registered but disabled — `query()` will throw a
     * helpful error and `index()` becomes a no-op.
     *
     * Marked `@sensitive` because the URL embeds the cluster password
     * (Bonsai-style `https://USER:PASS@HOST/INDEX`); we don't want
     * Mongo to keep it in plaintext.
     */
    url: z.ZodDefault<z.ZodString>;
    /**
     * Base index name. Used as the `indexName` if not provided in the
     * URL path. The runtime alias `${indexName}-current` is what the
     * driver actually targets for read / write.
     */
    indexName: z.ZodDefault<z.ZodString>;
    requestTimeout: z.ZodDefault<z.ZodNumber>;
    /**
     * Mapping flavour. Cluster requirements:
     *  - `default`: no extra ES plugin.
     *  - `kuromoji`: `analysis-kuromoji` plugin (Elastic-distributed).
     *    The dev image (`elasticsearch.Dockerfile`) preinstalls it.
     *  - `sudachi`: third-party `analysis-sudachi` plugin + dictionary.
     *    NOT bundled in the dev image; operators must build a derived
     *    image. Picking this without the plugin makes `rebuild()` fail.
     */
    analyzer: z.ZodDefault<z.ZodEnum<["default", "kuromoji", "sudachi"]>>;
}, "strict", z.ZodTypeAny, {
    url: string;
    indexName: string;
    requestTimeout: number;
    analyzer: "default" | "kuromoji" | "sudachi";
}, {
    url?: string | undefined;
    indexName?: string | undefined;
    requestTimeout?: number | undefined;
    analyzer?: "default" | "kuromoji" | "sudachi" | undefined;
}>;
type ElasticsearchConfig = z.infer<typeof ElasticsearchConfigSchema>;
declare const plugin: CrowiPlugin;

export { type Analyzer, type ESDriverState, type ElasticsearchConfig, ElasticsearchConfigSchema, type ElasticsearchDriver, type ElasticsearchDriverConfig, type ElasticsearchDriverDeps, type PageStreamDoc, applyConfig, applyConfigInPlace, buildSearchBody, createElasticsearchDriver, plugin as default, parseQuery };
package/dist/index.d.ts
ADDED
@@ -0,0 +1,250 @@
import { z } from 'zod/v3';
import { SearchDriver, PluginLogger, SearchQueryViewer, SearchQueryGrants, CrowiPlugin } from '@crowi/plugin-api';
import { Client } from '@elastic/elasticsearch';

/**
 * Elasticsearch 9 driver implementing the `SearchDriver` contract.
 * Owns the Client, the `${indexName}-current` alias (legacy ops
 * compat), single-doc index / remove, query against the alias, and
 * rebuild-from-scratch in 2k-doc bulk batches with bookmark counts
 * pre-fetched in one aggregate. Document field shape (path / body /
 * username / grant / granted_users / *_count / *_at) matches the
 * legacy ES7 indexer for reindex-free migration.
 */

type Analyzer = 'default' | 'kuromoji' | 'sudachi';
interface ElasticsearchDriverConfig {
    url: string;
    indexName: string;
    requestTimeout: number;
    analyzer: Analyzer;
}
interface ElasticsearchDriverDeps {
    log?: PluginLogger;
    /**
     * Iterate every page in the Mongo Page collection in cursor-style.
     * Plugin can't import the Page model directly, so the manager wires
     * this in from `ctx.model('Page')`. Each yielded doc is the lean
     * shape produced by `Page.getStreamOfFindAll({ publicOnly: false })`.
     */
    iteratePages?: (handler: (page: PageStreamDoc) => Promise<void>) => Promise<void>;
    /** Total page count, used for progress reporting. */
    countAllPages?: () => Promise<number>;
    /**
     * Bulk-fetch bookmark counts for every page in one Mongo aggregate.
     * Avoids the per-doc N+1 lookup the legacy rebuild used. Returns a
     * `Map<pageId, count>`; pages without bookmarks may be absent
     * (caller defaults to 0).
     */
    getBookmarkCountsBulk?: () => Promise<Map<string, number>>;
    /** Total user count, used to scale the bookmark-count factor. */
    countUsers?: () => Promise<number>;
}
/** The lean Page document shape we expect from the rebuild stream. */
interface PageStreamDoc {
    _id: {
        toString: () => string;
    } | string;
    path: string;
    redirectTo: string | null;
    status: string;
    grant: number;
    grantedUsers?: Array<{
        toString: () => string;
    } | string>;
    creator?: {
        username?: string;
    };
    revision?: {
        body?: string;
    };
    liker?: unknown[];
    commentCount?: number;
    bookmarkCount?: number;
    createdAt?: Date;
    updatedAt?: Date;
}
interface ElasticsearchDriver extends SearchDriver {
    /** Currently-targeted alias name (`<indexName>-current`). Exposed for tests / admin UI. */
    readonly aliasName: string;
    /** ES node URI parsed out of `config.url`. */
    readonly node: string;
    /** Base index name (without timestamp / `-current` suffix). */
    readonly baseIndexName: string;
    /** Test-only handle to the underlying client. */
    readonly client: Client;
}
/**
 * Mutable driver state. `createElasticsearchDriver` receives a ref to
 * this; each driver method snapshots the fields it needs *once at the
 * top* of the call, so a concurrent `reconfigure` cannot swap the
 * client / index name mid-operation. `reconfigure` mutates the fields
 * in place via {@link applyConfigInPlace}; the next call sees the new
 * values. An empty `url` leaves `client` as `null` — the methods then
 * throw a `Search not configured` error rather than touching a stale
 * client.
 */
interface ESDriverState {
    /** `null` when `url` is empty (driver configured-but-disabled). */
    client: Client | null;
    /** ES node URI parsed out of `config.url`; empty string when `url` is empty. */
    node: string;
    /** Base index name (without timestamp / `-current` suffix). */
    baseIndexName: string;
    /** Runtime alias the driver reads / writes (`<baseIndexName>-current`). */
    aliasName: string;
    analyzer: Analyzer;
    requestTimeout: number;
}
/**
 * Build a fresh {@link ESDriverState} from a config. An empty `url`
 * yields a disabled state (`client: null`) instead of throwing — the
 * driver stays registered but every method rejects with a
 * `Search not configured` error.
 */
declare function applyConfig(config: ElasticsearchDriverConfig): ESDriverState;
/**
 * Mutate `target` in place to reflect `config`. Used by `reconfigure`:
 * the old client reference is returned so the caller can `close()` it
 * (fire-and-forget) once the swap is done — inflight operations have
 * already snapshotted the old client and will run to completion.
 */
declare function applyConfigInPlace(target: ESDriverState, config: ElasticsearchDriverConfig): {
    oldClient: Client | null;
};
/**
 * Build the search driver around a {@link ESDriverState} ref. Methods
 * snapshot `state` *once at the top* — a `reconfigure` running
 * concurrently with an inflight call cannot swap the client mid-call;
 * the next call sees the new client / index name.
 */
declare function createElasticsearchDriver(state: ESDriverState, deps?: ElasticsearchDriverDeps): ElasticsearchDriver;

/**
 * Search-string parser for the Elasticsearch driver.
 *
 * Splits a free-form query into positive / negative keywords and
 * phrases. Lifted from the legacy `packages/api/src/service/query.ts`
 * with no behaviour changes — preserved here as a plugin-private
 * helper because the parser is currently ES-specific (the +/- and
 * `"phrase"` syntax maps directly to ES `multi_match` queries). When
 * a future driver wants the same shape, factor it back into
 * `@crowi/plugin-api`.
 */
type PositiveAndNegative<T> = {
    positive: T;
    negative: T;
};
type ParsedSearchQuery = {
    keywords: PositiveAndNegative<string[]>;
    phrases: PositiveAndNegative<string[]>;
};
declare const parseQuery: (query: string) => ParsedSearchQuery;

/**
 * Build an Elasticsearch 8 search request body from the SearchQuery
 * shape exposed by `@crowi/plugin-api`. The driver passes a parsed
 * keyword/phrase tree plus the viewer + grants from the original
 * SearchQuery; this module composes them into a single bool query.
 *
 * Design notes:
 *  - All filters are composed at the top-level `bool`. We never nest
 *    a second `bool` for the same operator type (must / filter /
 *    should / must_not), so the generated body is small and easy to
 *    diff in tests.
 *  - The grant filter mirrors the legacy ES Searcher precisely:
 *    a non-public page (RESTRICTED / SPECIFIED / OWNER) is hidden
 *    unless its `username` field matches the viewer's username.
 *    For SPECIFIED / OWNER / RESTRICTED pages, we additionally allow
 *    the page through if `granted_users` contains the viewer id —
 *    the legacy query only checked `username`, but the new
 *    SearchableDoc lets us index `granted_users` precisely so we
 *    can express "shared with me" as well.
 *  - Type filter (portal / public / user) reproduces the legacy
 *    `path.raw` regex / prefix queries.
 */

type FunctionScoreParams = {
    fieldValueFactor: {
        field: string;
        factor?: number;
        modifier?: 'log' | 'log1p' | 'log2p' | 'ln' | 'ln1p' | 'ln2p' | 'square' | 'sqrt' | 'reciprocal' | 'none';
        missing: number;
    };
    boostMode: 'multiply' | 'replace' | 'sum' | 'avg' | 'max' | 'min';
};
interface BuildSearchBodyParams {
    parsed: ParsedSearchQuery;
    pathPrefix?: string;
    viewer?: SearchQueryViewer;
    grants?: SearchQueryGrants;
    functionScore?: FunctionScoreParams;
    from: number;
    size: number;
}
/**
 * Build the ES9 search request body. Returns an object suitable for
 * `client.search({ index, ...body })`.
 */
declare function buildSearchBody(params: BuildSearchBodyParams): {
    from: number;
    size: number;
    sort: Array<Record<string, unknown>>;
    highlight: Record<string, unknown>;
    query: Record<string, unknown>;
    _source: string[];
};

/**
 * @crowi/plugin-search-elasticsearch — search driver registering
 * `'elasticsearch'` against the SearchRegistry.
 *
 * Activation: add this plugin to the runner's `crowi.config.json`
 * `plugins` array and set `search.driver: 'elasticsearch'`. Configure
 * via the Mongo Config namespace `plugin:@crowi/plugin-search-elasticsearch:*`
 * — operators set the connection URL exclusively from the admin UI.
 */

declare const ElasticsearchConfigSchema: z.ZodObject<{
    /**
     * `https://[user:pass@]host[:port][/indexName]`. Empty string keeps
     * the driver registered but disabled — `query()` will throw a
     * helpful error and `index()` becomes a no-op.
     *
     * Marked `@sensitive` because the URL embeds the cluster password
     * (Bonsai-style `https://USER:PASS@HOST/INDEX`); we don't want
     * Mongo to keep it in plaintext.
     */
    url: z.ZodDefault<z.ZodString>;
    /**
     * Base index name. Used as the `indexName` if not provided in the
     * URL path. The runtime alias `${indexName}-current` is what the
     * driver actually targets for read / write.
     */
    indexName: z.ZodDefault<z.ZodString>;
    requestTimeout: z.ZodDefault<z.ZodNumber>;
    /**
     * Mapping flavour. Cluster requirements:
     *  - `default`: no extra ES plugin.
     *  - `kuromoji`: `analysis-kuromoji` plugin (Elastic-distributed).
     *    The dev image (`elasticsearch.Dockerfile`) preinstalls it.
     *  - `sudachi`: third-party `analysis-sudachi` plugin + dictionary.
     *    NOT bundled in the dev image; operators must build a derived
     *    image. Picking this without the plugin makes `rebuild()` fail.
     */
    analyzer: z.ZodDefault<z.ZodEnum<["default", "kuromoji", "sudachi"]>>;
}, "strict", z.ZodTypeAny, {
    url: string;
    indexName: string;
    requestTimeout: number;
    analyzer: "default" | "kuromoji" | "sudachi";
}, {
    url?: string | undefined;
    indexName?: string | undefined;
    requestTimeout?: number | undefined;
    analyzer?: "default" | "kuromoji" | "sudachi" | undefined;
}>;
type ElasticsearchConfig = z.infer<typeof ElasticsearchConfigSchema>;
declare const plugin: CrowiPlugin;

export { type Analyzer, type ESDriverState, type ElasticsearchConfig, ElasticsearchConfigSchema, type ElasticsearchDriver, type ElasticsearchDriverConfig, type ElasticsearchDriverDeps, type PageStreamDoc, applyConfig, applyConfigInPlace, buildSearchBody, createElasticsearchDriver, plugin as default, parseQuery };