@radically-straightforward/utilities 2.0.6 → 2.0.7

package/CHANGELOG.md CHANGED
@@ -1,5 +1,9 @@
1
1
  # Changelog
2
2
 
3
+ ## 2.0.7 · 2024-10-18
4
+
5
+ - Added `tokenize()`, `normalizeToken()`, `highlight()`, and `snippet()`.
6
+
3
7
  ## 2.0.6 · 2024-07-12
4
8
 
5
9
  - Added `dedent()`.
package/README.md CHANGED
@@ -111,6 +111,194 @@ utilities.dedent`
111
111
  // => "Here is an\n\nexample of\n an interpolated string including a newline and indentation\n\nfollowed by some more text."
112
112
  ```
113
113
 
114
+ ### `tokenize()`
115
+
116
+ ```typescript
117
+ export function tokenize(
118
+ text: string,
119
+ {
120
+ stopWords = new Set(),
121
+ stopWordsAction = "delete",
122
+ stem = (token) => token,
123
+ }: {
124
+ stopWords?: Set<string>;
125
+ stopWordsAction?: "delete" | "mark";
126
+ stem?: (token: string) => string;
127
+ } = {},
128
+ ): {
129
+ token: string;
130
+ tokenIsStopWord: boolean;
131
+ start: number;
132
+ end: number;
133
+ }[];
134
+ ```
135
+
136
+ Process text into tokens that can be used for full-text search.
137
+
138
+ The part that breaks the text into tokens matches the behavior of [SQLite’s Unicode61 Tokenizer](https://www.sqlite.org/fts5.html#unicode61_tokenizer).
139
+
140
+ The `stopWords` are removed from the text. They are expected to be the result of `normalizeToken()`.
141
+
142
+ The `stem()` allows you to implement, for example, [SQLite’s Porter Tokenizer](https://www.sqlite.org/fts5.html#porter_tokenizer).
143
+
144
+ Reasons to use `tokenize()` instead of SQLite’s Tokenizers:
145
+
146
+ 1. `tokenize()` provides a source map, linking each token back to the range in `text` where it came from. This is useful in `highlight()`. [SQLite’s own `highlight()` function](https://www.sqlite.org/fts5.html#the_highlight_function) doesn’t allow you to, for example, do full-text search on just the text from a message, while `highlight()`ing the message including markup.
147
+ 2. The `stopWords` may be removed.
148
+ 3. The `stem()` may support other languages, while SQLite’s Porter Tokenizer only supports English.
149
+
150
+ When using `tokenize()`, it’s appropriate to rely on the default tokenizer in SQLite, Unicode61.
151
+
152
+ We recommend using [Natural](https://naturalnode.github.io/natural/) for [`stopWords`](https://github.com/NaturalNode/natural/tree/791df0bb8011c6caa8fc2a3a00f75deed4b3a855/lib/natural/util) and [`stem()`](https://github.com/NaturalNode/natural/tree/791df0bb8011c6caa8fc2a3a00f75deed4b3a855/lib/natural/stemmers).
153
+
154
+ **Example**
155
+
156
+ ```typescript
157
+ import * as utilities from "@radically-straightforward/utilities";
158
+ import natural from "natural";
159
+
160
+ console.log(
161
+ utilities.tokenize(
162
+ "For my peanuts allergy peanut butter is sometimes used.",
163
+ {
164
+ stopWords: new Set(
165
+ natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
166
+ ),
167
+ stem: (token) => natural.PorterStemmer.stem(token),
168
+ },
169
+ ),
170
+ );
171
+ // =>
172
+ // [
173
+ // { token: 'peanut', tokenIsStopWord: false, start: 7, end: 14 },
174
+ // { token: 'allergi', tokenIsStopWord: false, start: 15, end: 22 },
175
+ // { token: 'peanut', tokenIsStopWord: false, start: 23, end: 29 },
176
+ // { token: 'butter', tokenIsStopWord: false, start: 30, end: 36 },
177
+ // { token: 'sometim', tokenIsStopWord: false, start: 40, end: 49 },
178
+ // { token: 'us', tokenIsStopWord: false, start: 50, end: 54 }
179
+ // ]
180
+ ```
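The README example above only exercises the default `stopWordsAction: "delete"`. The following is a minimal sketch, not the package's export, of how `stopWordsAction: "mark"` and the `start`/`end` ranges behave. It re-creates the tokenizing logic from the build output shown later in this diff so that it runs on its own. The names `tokenizeSketch`, `normalizeTokenSketch`, `sampleText`, and `sampleStopWords` are hypothetical, the stop-word list is made up, and `stem` is left out for brevity.

```typescript
// Sketch only: a self-contained re-creation for illustration, not the package's export.
type TokenWithPosition = {
  token: string;
  tokenIsStopWord: boolean;
  start: number;
  end: number;
};

// Remove accents and lowercase, as described for `normalizeToken()`.
function normalizeTokenSketch(token: string): string {
  return token
    .normalize("NFKD")
    .replace(/\p{Diacritic}/gu, "")
    .toLowerCase();
}

function tokenizeSketch(
  text: string,
  {
    stopWords = new Set<string>(),
    stopWordsAction = "delete",
  }: { stopWords?: Set<string>; stopWordsAction?: "delete" | "mark" } = {},
): TokenWithPosition[] {
  const tokens: TokenWithPosition[] = [];
  for (const match of text.matchAll(/[\p{Letter}\p{Number}\p{Private_Use}]+/gu)) {
    const token = normalizeTokenSketch(match[0]);
    const tokenIsStopWord = stopWords.has(token);
    // "delete" drops stop words; "mark" keeps them and flags them.
    if (tokenIsStopWord && stopWordsAction === "delete") continue;
    const start = match.index ?? 0;
    tokens.push({ token, tokenIsStopWord, start, end: start + match[0].length });
  }
  return tokens;
}

const sampleText = "The caf\u00e9 is OPEN."; // "The café is OPEN."
const sampleStopWords = new Set(["the", "is"]); // hypothetical stop-word list

const markedTokens = tokenizeSketch(sampleText, {
  stopWords: sampleStopWords,
  stopWordsAction: "mark",
});
const keptTokens = tokenizeSketch(sampleText, { stopWords: sampleStopWords });

// Each range slices the original, un-normalized text.
console.log(
  markedTokens.map(({ token, tokenIsStopWord, start, end }) => ({
    token,
    tokenIsStopWord,
    source: sampleText.slice(start, end),
  })),
);
console.log(keptTokens.map(({ token }) => token));
```

With `"mark"`, all four words come back and the stop words carry `tokenIsStopWord: true`; with the default, only the non-stop-word tokens remain. In both cases the ranges index into the original text, which is what lets a caller recover `café` from the normalized token `cafe`.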
181
+
182
+ ### `normalizeToken()`
183
+
184
+ ```typescript
185
+ export function normalizeToken(token: string): string;
186
+ ```
187
+
188
+ Normalize a token for `tokenize()`. It removes accents, for example, turning `ú` into `u`. It lower cases, for example, turning `HeLlO` into `hello`.
189
+
190
+ **References**
191
+
192
+ - https://stackoverflow.com/a/37511463
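As an illustration of why `stopWords` are expected to be normalized first, here is a minimal sketch under stated assumptions. `normalizeTokenSketch` is a hypothetical stand-in that follows the same three steps as the build output shown later in this diff (NFKD normalization, removal of diacritics, lowercasing), and the stop-word list is made up.

```typescript
// Sketch only: a stand-in for `normalizeToken()`, written to be self-contained.
function normalizeTokenSketch(token: string): string {
  return token
    .normalize("NFKD") // split accented letters into base letter + combining mark
    .replace(/\p{Diacritic}/gu, "") // drop the combining marks
    .toLowerCase();
}

// Hypothetical stop-word list: "Où", "Très", "Le".
const rawStopWords = ["O\u00f9", "Tr\u00e8s", "Le"];
const normalizedStopWords = new Set(
  rawStopWords.map((stopWord) => normalizeTokenSketch(stopWord)),
);

console.log(normalizeTokenSketch("\u00fa")); // => "u"
console.log(normalizeTokenSketch("HeLlO")); // => "hello"

// "TRÈS" in a text normalizes to "tres", which only matches a normalized stop word.
console.log(normalizedStopWords.has(normalizeTokenSketch("TR\u00c8S"))); // => true
console.log(new Set(rawStopWords).has(normalizeTokenSketch("TR\u00c8S"))); // => false
```

The last two lines show the failure mode: a raw stop-word list never matches, because tokens are compared after normalization.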
193
+
194
+ ### `highlight()`
195
+
196
+ ```typescript
197
+ export function highlight(
198
+ text: string,
199
+ search: Set<string>,
200
+ {
201
+ start = `<span class="highlight">`,
202
+ end = `</span>`,
203
+ ...tokenizeOptions
204
+ }: {
205
+ start?: string;
206
+ end?: string;
207
+ } & Parameters<typeof tokenize>[1] = {},
208
+ ): string;
209
+ ```
210
+
211
+ Highlight the `search` terms in `text`. Similar to [SQLite’s `highlight()` function](https://www.sqlite.org/fts5.html#the_highlight_function), but because it’s implemented at the application level, it can work with `text` including markup by parsing the markup into DOM and traversing the DOM using `highlight()` on each [Text Node](https://developer.mozilla.org/en-US/docs/Web/API/Text).
212
+
213
+ `search` is the `token` part of the value returned by `tokenize()`.
214
+
215
+ **Example**
216
+
217
+ ```typescript
218
+ import * as utilities from "@radically-straightforward/utilities";
219
+ import natural from "natural";
220
+
221
+ const stopWords = new Set(
222
+ natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
223
+ );
224
+
225
+ console.log(
226
+ utilities.highlight(
227
+ "For my peanuts allergy peanut butter is sometimes used.",
228
+ new Set(
229
+ utilities
230
+ .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
231
+ .map((tokenWithPosition) => tokenWithPosition.token),
232
+ ),
233
+ { stopWords, stem: natural.PorterStemmer.stem },
234
+ ),
235
+ );
236
+ // => `For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes used.`
237
+ ```
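The `start` and `end` options are not shown in the example above. The sketch below illustrates them with `<mark>` tags. It is a simplified, self-contained re-creation and not the package's `highlight()`: it skips `stopWords` and `stem`, and the names `highlightSketch` and `normalizeTokenSketch` are hypothetical.

```typescript
// Sketch only: simplified stand-ins, not the package's exports.
function normalizeTokenSketch(token: string): string {
  return token
    .normalize("NFKD")
    .replace(/\p{Diacritic}/gu, "")
    .toLowerCase();
}

function highlightSketch(
  text: string,
  search: Set<string>,
  {
    start = `<span class="highlight">`,
    end = `</span>`,
  }: { start?: string; end?: string } = {},
): string {
  let output = "";
  let cursor = 0; // position in `text` up to which output has been emitted
  for (const match of text.matchAll(/[\p{Letter}\p{Number}\p{Private_Use}]+/gu)) {
    if (!search.has(normalizeTokenSketch(match[0]))) continue;
    const tokenStart = match.index ?? 0;
    const tokenEnd = tokenStart + match[0].length;
    // Copy the untouched text before the token, then the wrapped token.
    output +=
      text.slice(cursor, tokenStart) +
      start +
      text.slice(tokenStart, tokenEnd) +
      end;
    cursor = tokenEnd;
  }
  return output + text.slice(cursor);
}

console.log(
  highlightSketch("Peanut butter and PEANUT oil.", new Set(["peanut"]), {
    start: "<mark>",
    end: "</mark>",
  }),
);
// => "<mark>Peanut</mark> butter and <mark>PEANUT</mark> oil."
```

Because matching happens on normalized tokens while slicing happens on the original text, the original casing is preserved inside the markers.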
238
+
239
+ ### `snippet()`
240
+
241
+ ```typescript
242
+ export function snippet(
243
+ text: string,
244
+ search: Set<string>,
245
+ {
246
+ surroundingTokens = 5,
247
+ ...highlightOptions
248
+ }: {
249
+ surroundingTokens?: number;
250
+ } & Parameters<typeof highlight>[2] = {},
251
+ ): string;
252
+ ```
253
+
254
+ Extract a snippet from a long `text` that includes the `search` terms.
255
+
256
+ **Example**
257
+
258
+ ```typescript
259
+ import * as utilities from "@radically-straightforward/utilities";
260
+ import natural from "natural";
261
+
262
+ const stopWords = new Set(
263
+ natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
264
+ );
265
+
266
+ console.log(
267
+ utilities.snippet(
268
+ utilities.dedent`
269
+ Typically mixed in these languages the. Paste extracted from sugarcane or sugar beet was the genesis of contemporary. British brought western style pastry to the spouts mounted on sledges or wagons. Toss their pancakes as well liked by.
270
+
271
+ Locally e g i aquatica. Hardness whiteness and gloss and.
272
+
273
+ Extensively planted as ornamental trees by homeowners businesses and. Yh t ritarit poor knights once only a dessert.
274
+
275
+ A shortbread base and was then only known. Pies of meat particularly beef chicken or turkey gravy and mixed vegetables potatoes. A level the name for an extended time to incorporate. Of soup beer bread and onions before they left for work in restaurants?
276
+
277
+ For my peanuts allergy peanut butter is sometimes used.
278
+
279
+ Is transformed from an inferior ovary i e one. They declined in popularity with the correct humidity. Christmas foods to be referred to as xoc l tl. Which part or all of them contain cocoa butter while maintaining.
280
+
281
+ Potato was called morgenmete and the united states? Used oil in place of. These sandwiches were not as sweet fillings include.
282
+
283
+ Granola mixed with achiote because. Has undergone multiple changes since then until. Made before making white chocolate they say. Confectionery recipes for them proliferated ' the.
284
+
285
+ Outdoorsman horace kephart recommended it in central america. Chickpea flour and certain areas of the peter.
286
+
287
+ Wan are the results two classic ways of manually tempering chocolate. Cost cocoa beans is ng g which is a. Croatian serbian and slovene pala. Km mi further south revealed that sweet potatoes have been identified from grinding. Rabanadas are a range of apple sauce depending on its consistency. Retail value rose percent latin?
288
+
289
+ Ghee and tea aid the body it is the largest pies of the era. In turkey ak tma in areas of central europe formerly belonging to!
290
+ `,
291
+ new Set(
292
+ utilities
293
+ .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
294
+ .map((tokenWithPosition) => tokenWithPosition.token),
295
+ ),
296
+ { stopWords, stem: natural.PorterStemmer.stem },
297
+ ),
298
+ );
299
+ // => `… work in restaurants? For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes …`
300
+ ```
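To make the role of `surroundingTokens` concrete, the following sketch isolates the window arithmetic that the build output later in this diff appears to use: the window is clamped to the token list, and an ellipsis is added only on a side that was cut. The function name `snippetWindowSketch` and its return shape are hypothetical and exist only for illustration.

```typescript
// Sketch only: isolates the clamping arithmetic, not the package's `snippet()`.
function snippetWindowSketch(
  tokenCount: number,
  matchIndex: number,
  surroundingTokens = 5,
): {
  startIndex: number;
  endIndex: number;
  ellipsisBefore: boolean;
  ellipsisAfter: boolean;
} {
  // Clamp the window to the first and last token indexes.
  const startIndex = Math.max(0, matchIndex - surroundingTokens);
  const endIndex = Math.min(tokenCount - 1, matchIndex + surroundingTokens);
  return {
    startIndex,
    endIndex,
    ellipsisBefore: startIndex !== 0,
    ellipsisAfter: endIndex !== tokenCount - 1,
  };
}

console.log(snippetWindowSketch(20, 10));
// => { startIndex: 5, endIndex: 15, ellipsisBefore: true, ellipsisAfter: true }
console.log(snippetWindowSketch(8, 1));
// => { startIndex: 0, endIndex: 6, ellipsisBefore: false, ellipsisAfter: true }
console.log(snippetWindowSketch(3, 1, 2));
// => { startIndex: 0, endIndex: 2, ellipsisBefore: false, ellipsisAfter: false }
```

This agrees with the README example above, where five tokens ("work in restaurants For my") precede the first match. Stop words count toward the window because the text is tokenized with `stopWordsAction: "mark"`. Note that, per the build output in this diff, `snippet()` throws when no token matches, so callers may want to check for a match first.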
301
+
114
302
  ### `isDate()`
115
303
 
116
304
  ```typescript
package/build/index.d.mts CHANGED
@@ -68,6 +68,156 @@ export declare function capitalize(string: string): string;
68
68
  * ```
69
69
  */
70
70
  export declare function dedent(templateStrings: TemplateStringsArray, ...substitutions: any[]): string;
71
+ /**
72
+ * Process text into tokens that can be used for full-text search.
73
+ *
74
+ * The part that breaks the text into tokens matches the behavior of [SQLite’s Unicode61 Tokenizer](https://www.sqlite.org/fts5.html#unicode61_tokenizer).
75
+ *
76
+ * The `stopWords` are removed from the text. They are expected to be the result of `normalizeToken()`.
77
+ *
78
+ * The `stem()` allows you to implement, for example, [SQLite’s Porter Tokenizer](https://www.sqlite.org/fts5.html#porter_tokenizer).
79
+ *
80
+ * Reasons to use `tokenize()` instead of SQLite’s Tokenizers:
81
+ *
82
+ * 1. `tokenize()` provides a source map, linking each token back to the range in `text` where it came from. This is useful in `highlight()`. [SQLite’s own `highlight()` function](https://www.sqlite.org/fts5.html#the_highlight_function) doesn’t allow you to, for example, do full-text search on just the text from a message, while `highlight()`ing the message including markup.
83
+ * 2. The `stopWords` may be removed.
84
+ * 3. The `stem()` may support other languages, while SQLite’s Porter Tokenizer only supports English.
85
+ *
86
+ * When using `tokenize()`, it’s appropriate to rely on the default tokenizer in SQLite, Unicode61.
87
+ *
88
+ * We recommend using [Natural](https://naturalnode.github.io/natural/) for [`stopWords`](https://github.com/NaturalNode/natural/tree/791df0bb8011c6caa8fc2a3a00f75deed4b3a855/lib/natural/util) and [`stem()`](https://github.com/NaturalNode/natural/tree/791df0bb8011c6caa8fc2a3a00f75deed4b3a855/lib/natural/stemmers).
89
+ *
90
+ * **Example**
91
+ *
92
+ * ```typescript
93
+ * import * as utilities from "@radically-straightforward/utilities";
94
+ * import natural from "natural";
95
+ *
96
+ * console.log(
97
+ * utilities.tokenize(
98
+ * "For my peanuts allergy peanut butter is sometimes used.",
99
+ * {
100
+ * stopWords: new Set(
101
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
102
+ * ),
103
+ * stem: (token) => natural.PorterStemmer.stem(token),
104
+ * },
105
+ * ),
106
+ * );
107
+ * // =>
108
+ * // [
109
+ * // { token: 'peanut', tokenIsStopWord: false, start: 7, end: 14 },
110
+ * // { token: 'allergi', tokenIsStopWord: false, start: 15, end: 22 },
111
+ * // { token: 'peanut', tokenIsStopWord: false, start: 23, end: 29 },
112
+ * // { token: 'butter', tokenIsStopWord: false, start: 30, end: 36 },
113
+ * // { token: 'sometim', tokenIsStopWord: false, start: 40, end: 49 },
114
+ * // { token: 'us', tokenIsStopWord: false, start: 50, end: 54 }
115
+ * // ]
116
+ * ```
117
+ */
118
+ export declare function tokenize(text: string, { stopWords, stopWordsAction, stem, }?: {
119
+ stopWords?: Set<string>;
120
+ stopWordsAction?: "delete" | "mark";
121
+ stem?: (token: string) => string;
122
+ }): {
123
+ token: string;
124
+ tokenIsStopWord: boolean;
125
+ start: number;
126
+ end: number;
127
+ }[];
128
+ /**
129
+ * Normalize a token for `tokenize()`. It removes accents, for example, turning `ú` into `u`. It lower cases, for example, turning `HeLlO` into `hello`.
130
+ *
131
+ * **References**
132
+ *
133
+ * - https://stackoverflow.com/a/37511463
134
+ */
135
+ export declare function normalizeToken(token: string): string;
136
+ /**
137
+ * Highlight the `search` terms in `text`. Similar to [SQLite’s `highlight()` function](https://www.sqlite.org/fts5.html#the_highlight_function), but because it’s implemented at the application level, it can work with `text` including markup by parsing the markup into DOM and traversing the DOM using `highlight()` on each [Text Node](https://developer.mozilla.org/en-US/docs/Web/API/Text).
138
+ *
139
+ * `search` is the `token` part of the value returned by `tokenize()`.
140
+ *
141
+ * **Example**
142
+ *
143
+ * ```typescript
144
+ * import * as utilities from "@radically-straightforward/utilities";
145
+ * import natural from "natural";
146
+ *
147
+ * const stopWords = new Set(
148
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
149
+ * );
150
+ *
151
+ * console.log(
152
+ * utilities.highlight(
153
+ * "For my peanuts allergy peanut butter is sometimes used.",
154
+ * new Set(
155
+ * utilities
156
+ * .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
157
+ * .map((tokenWithPosition) => tokenWithPosition.token),
158
+ * ),
159
+ * { stopWords, stem: natural.PorterStemmer.stem },
160
+ * ),
161
+ * );
162
+ * // => `For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes used.`
163
+ * ```
164
+ */
165
+ export declare function highlight(text: string, search: Set<string>, { start, end, ...tokenizeOptions }?: {
166
+ start?: string;
167
+ end?: string;
168
+ } & Parameters<typeof tokenize>[1]): string;
169
+ /**
170
+ * Extract a snippet from a long `text` that includes the `search` terms.
171
+ *
172
+ * **Example**
173
+ *
174
+ * ```typescript
175
+ * import * as utilities from "@radically-straightforward/utilities";
176
+ * import natural from "natural";
177
+ *
178
+ * const stopWords = new Set(
179
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
180
+ * );
181
+ *
182
+ * console.log(
183
+ * utilities.snippet(
184
+ * utilities.dedent`
185
+ * Typically mixed in these languages the. Paste extracted from sugarcane or sugar beet was the genesis of contemporary. British brought western style pastry to the spouts mounted on sledges or wagons. Toss their pancakes as well liked by.
186
+ *
187
+ * Locally e g i aquatica. Hardness whiteness and gloss and.
188
+ *
189
+ * Extensively planted as ornamental trees by homeowners businesses and. Yh t ritarit poor knights once only a dessert.
190
+ *
191
+ * A shortbread base and was then only known. Pies of meat particularly beef chicken or turkey gravy and mixed vegetables potatoes. A level the name for an extended time to incorporate. Of soup beer bread and onions before they left for work in restaurants?
192
+ *
193
+ * For my peanuts allergy peanut butter is sometimes used.
194
+ *
195
+ * Is transformed from an inferior ovary i e one. They declined in popularity with the correct humidity. Christmas foods to be referred to as xoc l tl. Which part or all of them contain cocoa butter while maintaining.
196
+ *
197
+ * Potato was called morgenmete and the united states? Used oil in place of. These sandwiches were not as sweet fillings include.
198
+ *
199
+ * Granola mixed with achiote because. Has undergone multiple changes since then until. Made before making white chocolate they say. Confectionery recipes for them proliferated ' the.
200
+ *
201
+ * Outdoorsman horace kephart recommended it in central america. Chickpea flour and certain areas of the peter.
202
+ *
203
+ * Wan are the results two classic ways of manually tempering chocolate. Cost cocoa beans is ng g which is a. Croatian serbian and slovene pala. Km mi further south revealed that sweet potatoes have been identified from grinding. Rabanadas are a range of apple sauce depending on its consistency. Retail value rose percent latin?
204
+ *
205
+ * Ghee and tea aid the body it is the largest pies of the era. In turkey ak tma in areas of central europe formerly belonging to!
206
+ * `,
207
+ * new Set(
208
+ * utilities
209
+ * .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
210
+ * .map((tokenWithPosition) => tokenWithPosition.token),
211
+ * ),
212
+ * { stopWords, stem: natural.PorterStemmer.stem },
213
+ * ),
214
+ * );
215
+ * // => `… work in restaurants? For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes …`
216
+ * ```
217
+ */
218
+ export declare function snippet(text: string, search: Set<string>, { surroundingTokens, ...highlightOptions }?: {
219
+ surroundingTokens?: number;
220
+ } & Parameters<typeof highlight>[2]): string;
71
221
  /**
72
222
  * Determine whether the given `string` is a valid `Date`, that is, it’s in ISO format and corresponds to an existing date, for example, it is **not** April 32nd.
73
223
  */
@@ -1 +1 @@
1
- {"version":3,"file":"index.d.mts","sourceRoot":"","sources":["../source/index.mts"],"names":[],"mappings":"AAAA;;GAEG;AACH,wBAAgB,KAAK,CAAC,QAAQ,EAAE,MAAM,GAAG,OAAO,CAAC,IAAI,CAAC,CAErD;AAED;;GAEG;AACH,wBAAgB,YAAY,IAAI,MAAM,CAErC;AAED;;GAEG;AACH,wBAAgB,GAAG,CAAC,GAAG,YAAY,EAAE,MAAM,EAAE,GAAG,IAAI,CAEnD;AAED;;;;;;;;;;;;;;;;;GAiBG;AACH,qBAAa,wBAAyB,SAAQ,eAAe;;CAkB5D;AAED;;GAEG;AACH,wBAAgB,UAAU,CAAC,MAAM,EAAE,MAAM,GAAG,MAAM,CAIjD;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA+BG;AACH,wBAAgB,MAAM,CACpB,eAAe,EAAE,oBAAoB,EACrC,GAAG,aAAa,EAAE,GAAG,EAAE,UAwBxB;AAED;;GAEG;AACH,wBAAgB,MAAM,CAAC,MAAM,EAAE,MAAM,GAAG,OAAO,CAI9C;AAED;;GAEG;AACH,eAAO,MAAM,WAAW,EAAE,MAAoD,CAAC;AAE/E;;GAEG;AACH,eAAO,MAAM,aAAa,EAAE,MACqB,CAAC;AAElD;;GAEG;AACH,MAAM,MAAM,MAAM,CAAC,IAAI,IAAI,QAAQ,CAAC,IAAI,GAAG;IAAE,CAAC,YAAY,CAAC,EAAE,IAAI,CAAA;CAAE,CAAC,CAAC;AAErE;;GAEG;AACH,MAAM,MAAM,gBAAgB,GACxB,MAAM,GACN,MAAM,GACN,MAAM,GACN,OAAO,GACP,MAAM,GACN,SAAS,GACT,IAAI,GACJ,MAAM,CAAC,OAAO,CAAC,CAAC;AAEpB;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GAqGG;AACH,wBAAgB,MAAM,CACpB,IAAI,SAAS,KAAK,CAAC,gBAAgB,CAAC,GAAG;IAAE,CAAC,GAAG,EAAE,MAAM,GAAG,gBAAgB,CAAA;CAAE,EAC1E,KAAK,EAAE,IAAI,GAAG,MAAM,CAAC,IAAI,CAAC,CAyC3B;yBA3Ce,MAAM;;;;;;;;;;;;;;;;AA6CtB,eAAO,MAAM,YAAY,eAAmB,CAAC;AAiB7C;;;;;;;;;;;;;;;;;;;;;;;;;;GA0BG;AACH,wBAAgB,aAAa,CAC3B,EACE,QAAQ,EACR,MAAiB,GAClB,EAAE;IACD,QAAQ,EAAE,MAAM,CAAC;IACjB,MAAM,CAAC,EAAE,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC,CAAC;CACrC,EACD,GAAG,EAAE,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC,GAC9B;IACD,GAAG,EAAE,MAAM,OAAO,CAAC,IAAI,CAAC,CAAC;IACzB,IAAI,EAAE,MAAM,OAAO,CAAC,IAAI,CAAC,CAAC;CAC3B,CA6CA;AAED;;GAEG;AACH,wBAAsB,OAAO,CAAC,IAAI,EAChC,QAAQ,EAAE,MAAM,EAChB,SAAS,EAAE,MAAM,OAAO,CAAC,IAAI,CAAC,GAC7B,OAAO,CAAC,IAAI,CAAC,CAcf;AAED;;;;;;;;GAQG;AACH,wBAAgB,aAAa,CAC3B,GAAG,EAAE,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC,GAC9B,MAAM,OAAO,CAAC,IAAI,CAAC,CA8BrB"}
1
+ {"version":3,"file":"index.d.mts","sourceRoot":"","sources":["../source/index.mts"],"names":[],"mappings":"AAAA;;GAEG;AACH,wBAAgB,KAAK,CAAC,QAAQ,EAAE,MAAM,GAAG,OAAO,CAAC,IAAI,CAAC,CAErD;AAED;;GAEG;AACH,wBAAgB,YAAY,IAAI,MAAM,CAErC;AAED;;GAEG;AACH,wBAAgB,GAAG,CAAC,GAAG,YAAY,EAAE,MAAM,EAAE,GAAG,IAAI,CAEnD;AAED;;;;;;;;;;;;;;;;;GAiBG;AACH,qBAAa,wBAAyB,SAAQ,eAAe;;CAkB5D;AAED;;GAEG;AACH,wBAAgB,UAAU,CAAC,MAAM,EAAE,MAAM,GAAG,MAAM,CAIjD;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA+BG;AACH,wBAAgB,MAAM,CACpB,eAAe,EAAE,oBAAoB,EACrC,GAAG,aAAa,EAAE,GAAG,EAAE,UAwBxB;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA8CG;AACH,wBAAgB,QAAQ,CACtB,IAAI,EAAE,MAAM,EACZ,EACE,SAAqB,EACrB,eAA0B,EAC1B,IAAuB,GACxB,GAAE;IACD,SAAS,CAAC,EAAE,GAAG,CAAC,MAAM,CAAC,CAAC;IACxB,eAAe,CAAC,EAAE,QAAQ,GAAG,MAAM,CAAC;IACpC,IAAI,CAAC,EAAE,CAAC,KAAK,EAAE,MAAM,KAAK,MAAM,CAAC;CAC7B,GACL;IAAE,KAAK,EAAE,MAAM,CAAC;IAAC,eAAe,EAAE,OAAO,CAAC;IAAC,KAAK,EAAE,MAAM,CAAC;IAAC,GAAG,EAAE,MAAM,CAAA;CAAE,EAAE,CAiB3E;AAED;;;;;;GAMG;AACH,wBAAgB,cAAc,CAAC,KAAK,EAAE,MAAM,GAAG,MAAM,CAKpD;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA4BG;AACH,wBAAgB,SAAS,CACvB,IAAI,EAAE,MAAM,EACZ,MAAM,EAAE,GAAG,CAAC,MAAM,CAAC,EACnB,EACE,KAAkC,EAClC,GAAe,EACf,GAAG,eAAe,EACnB,GAAE;IAAE,KAAK,CAAC,EAAE,MAAM,CAAC;IAAC,GAAG,CAAC,EAAE,MAAM,CAAA;CAAE,GAAG,UAAU,CAAC,OAAO,QAAQ,CAAC,CAAC,CAAC,CAAM,GACxE,MAAM,CAmBR;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GAgDG;AACH,wBAAgB,OAAO,CACrB,IAAI,EAAE,MAAM,EACZ,MAAM,EAAE,GAAG,CAAC,MAAM,CAAC,EACnB,EACE,iBAAqB,EACrB,GAAG,gBAAgB,EACpB,GAAE;IAAE,iBAAiB,CAAC,EAAE,MAAM,CAAA;CAAE,GAAG,UAAU,CAAC,OAAO,SAAS,CAAC,CAAC,CAAC,CAAM,GACvE,MAAM,CAkCR;AAED;;GAEG;AACH,wBAAgB,MAAM,CAAC,MAAM,EAAE,MAAM,GAAG,OAAO,CAI9C;AAED;;GAEG;AACH,eAAO,MAAM,WAAW,EAAE,MAAoD,CAAC;AAE/E;;GAEG;AACH,eAAO,MAAM,aAAa,EAAE,MACqB,CAAC;AAElD;;GAEG;AACH,MAAM,MAAM,MAAM,CAAC,IAAI,IAAI,QAAQ,CAAC,IAAI,GAAG;IAAE,CAAC,YAAY,CAAC,EAAE,IAAI,CAAA;CAAE,CAAC,CAAC;AAErE;;GAEG;AACH,MAAM,MAAM,gBAAgB,GACxB,MAAM,GACN,MAAM,GACN,MAAM,GACN,OAAO,GACP,MAAM,GACN,SAAS,GACT,IAAI,GACJ,MAAM,CAAC,
OAAO,CAAC,CAAC;AAEpB;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GAqGG;AACH,wBAAgB,MAAM,CACpB,IAAI,SAAS,KAAK,CAAC,gBAAgB,CAAC,GAAG;IAAE,CAAC,GAAG,EAAE,MAAM,GAAG,gBAAgB,CAAA;CAAE,EAC1E,KAAK,EAAE,IAAI,GAAG,MAAM,CAAC,IAAI,CAAC,CAyC3B;yBA3Ce,MAAM;;;;;;;;;;;;;;;;AA6CtB,eAAO,MAAM,YAAY,eAAmB,CAAC;AAiB7C;;;;;;;;;;;;;;;;;;;;;;;;;;GA0BG;AACH,wBAAgB,aAAa,CAC3B,EACE,QAAQ,EACR,MAAiB,GAClB,EAAE;IACD,QAAQ,EAAE,MAAM,CAAC;IACjB,MAAM,CAAC,EAAE,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC,CAAC;CACrC,EACD,GAAG,EAAE,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC,GAC9B;IACD,GAAG,EAAE,MAAM,OAAO,CAAC,IAAI,CAAC,CAAC;IACzB,IAAI,EAAE,MAAM,OAAO,CAAC,IAAI,CAAC,CAAC;CAC3B,CA6CA;AAED;;GAEG;AACH,wBAAsB,OAAO,CAAC,IAAI,EAChC,QAAQ,EAAE,MAAM,EAChB,SAAS,EAAE,MAAM,OAAO,CAAC,IAAI,CAAC,GAC7B,OAAO,CAAC,IAAI,CAAC,CAcf;AAED;;;;;;;;GAQG;AACH,wBAAgB,aAAa,CAC3B,GAAG,EAAE,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC,GAC9B,MAAM,OAAO,CAAC,IAAI,CAAC,CA8BrB"}
package/build/index.mjs CHANGED
@@ -111,6 +111,191 @@ export function dedent(templateStrings, ...substitutions) {
111
111
  .replaceAll("\n" + " ".repeat(indentationLevel), "\n");
112
112
  return output.replace(/^[ ]*\n/, "").replace(/\n[ ]*$/, "");
113
113
  }
114
+ /**
115
+ * Process text into tokens that can be used for full-text search.
116
+ *
117
+ * The part that breaks the text into tokens matches the behavior of [SQLite’s Unicode61 Tokenizer](https://www.sqlite.org/fts5.html#unicode61_tokenizer).
118
+ *
119
+ * The `stopWords` are removed from the text. They are expected to be the result of `normalizeToken()`.
120
+ *
121
+ * The `stem()` allows you to implement, for example, [SQLite’s Porter Tokenizer](https://www.sqlite.org/fts5.html#porter_tokenizer).
122
+ *
123
+ * Reasons to use `tokenize()` instead of SQLite’s Tokenizers:
124
+ *
125
+ * 1. `tokenize()` provides a source map, linking each token back to the range in `text` where it came from. This is useful in `highlight()`. [SQLite’s own `highlight()` function](https://www.sqlite.org/fts5.html#the_highlight_function) doesn’t allow you to, for example, do full-text search on just the text from a message, while `highlight()`ing the message including markup.
126
+ * 2. The `stopWords` may be removed.
127
+ * 3. The `stem()` may support other languages, while SQLite’s Porter Tokenizer only supports English.
128
+ *
129
+ * When using `tokenize()`, it’s appropriate to rely on the default tokenizer in SQLite, Unicode61.
130
+ *
131
+ * We recommend using [Natural](https://naturalnode.github.io/natural/) for [`stopWords`](https://github.com/NaturalNode/natural/tree/791df0bb8011c6caa8fc2a3a00f75deed4b3a855/lib/natural/util) and [`stem()`](https://github.com/NaturalNode/natural/tree/791df0bb8011c6caa8fc2a3a00f75deed4b3a855/lib/natural/stemmers).
132
+ *
133
+ * **Example**
134
+ *
135
+ * ```typescript
136
+ * import * as utilities from "@radically-straightforward/utilities";
137
+ * import natural from "natural";
138
+ *
139
+ * console.log(
140
+ * utilities.tokenize(
141
+ * "For my peanuts allergy peanut butter is sometimes used.",
142
+ * {
143
+ * stopWords: new Set(
144
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
145
+ * ),
146
+ * stem: (token) => natural.PorterStemmer.stem(token),
147
+ * },
148
+ * ),
149
+ * );
150
+ * // =>
151
+ * // [
152
+ * // { token: 'peanut', tokenIsStopWord: false, start: 7, end: 14 },
153
+ * // { token: 'allergi', tokenIsStopWord: false, start: 15, end: 22 },
154
+ * // { token: 'peanut', tokenIsStopWord: false, start: 23, end: 29 },
155
+ * // { token: 'butter', tokenIsStopWord: false, start: 30, end: 36 },
156
+ * // { token: 'sometim', tokenIsStopWord: false, start: 40, end: 49 },
157
+ * // { token: 'us', tokenIsStopWord: false, start: 50, end: 54 }
158
+ * // ]
159
+ * ```
160
+ */
161
+ export function tokenize(text, { stopWords = new Set(), stopWordsAction = "delete", stem = (token) => token, } = {}) {
162
+ return [...text.matchAll(/[\p{Letter}\p{Number}\p{Private_Use}]+/gu)].flatMap((match) => {
163
+ const token = normalizeToken(match[0]);
164
+ const tokenIsStopWord = stopWords.has(token);
165
+ return tokenIsStopWord && stopWordsAction === "delete"
166
+ ? []
167
+ : [
168
+ {
169
+ token: stem(token),
170
+ tokenIsStopWord,
171
+ start: match.index,
172
+ end: match.index + match[0].length,
173
+ },
174
+ ];
175
+ });
176
+ }
177
+ /**
178
+ * Normalize a token for `tokenize()`. It removes accents, for example, turning `ú` into `u`. It lower cases, for example, turning `HeLlO` into `hello`.
179
+ *
180
+ * **References**
181
+ *
182
+ * - https://stackoverflow.com/a/37511463
183
+ */
184
+ export function normalizeToken(token) {
185
+ return token
186
+ .normalize("NFKD")
187
+ .replace(/\p{Diacritic}/gu, "")
188
+ .toLowerCase();
189
+ }
190
+ /**
191
+ * Highlight the `search` terms in `text`. Similar to [SQLite’s `highlight()` function](https://www.sqlite.org/fts5.html#the_highlight_function), but because it’s implemented at the application level, it can work with `text` including markup by parsing the markup into DOM and traversing the DOM using `highlight()` on each [Text Node](https://developer.mozilla.org/en-US/docs/Web/API/Text).
192
+ *
193
+ * `search` is the `token` part of the value returned by `tokenize()`.
194
+ *
195
+ * **Example**
196
+ *
197
+ * ```typescript
198
+ * import * as utilities from "@radically-straightforward/utilities";
199
+ * import natural from "natural";
200
+ *
201
+ * const stopWords = new Set(
202
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
203
+ * );
204
+ *
205
+ * console.log(
206
+ * utilities.highlight(
207
+ * "For my peanuts allergy peanut butter is sometimes used.",
208
+ * new Set(
209
+ * utilities
210
+ * .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
211
+ * .map((tokenWithPosition) => tokenWithPosition.token),
212
+ * ),
213
+ * { stopWords, stem: natural.PorterStemmer.stem },
214
+ * ),
215
+ * );
216
+ * // => `For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes used.`
217
+ * ```
218
+ */
219
+ export function highlight(text, search, { start = `<span class="highlight">`, end = `</span>`, ...tokenizeOptions } = {}) {
220
+ let highlightedText = "";
221
+ const highlightedTokens = tokenize(text, tokenizeOptions).filter((tokenWithPosition) => search.has(tokenWithPosition.token));
222
+ highlightedText += text.slice(0, highlightedTokens[0]?.start ?? text.length);
223
+ for (const highlightedTokensIndex of highlightedTokens.keys())
224
+ highlightedText +=
225
+ start +
226
+ text.slice(highlightedTokens[highlightedTokensIndex].start, highlightedTokens[highlightedTokensIndex].end) +
227
+ end +
228
+ text.slice(highlightedTokens[highlightedTokensIndex].end, highlightedTokens[highlightedTokensIndex + 1]?.start ?? text.length);
229
+ return highlightedText;
230
+ }
231
+ /**
232
+ * Extract a snippet from a long `text` that includes the `search` terms.
233
+ *
234
+ * **Example**
235
+ *
236
+ * ```typescript
237
+ * import * as utilities from "@radically-straightforward/utilities";
238
+ * import natural from "natural";
239
+ *
240
+ * const stopWords = new Set(
241
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
242
+ * );
243
+ *
244
+ * console.log(
245
+ * utilities.snippet(
246
+ * utilities.dedent`
247
+ * Typically mixed in these languages the. Paste extracted from sugarcane or sugar beet was the genesis of contemporary. British brought western style pastry to the spouts mounted on sledges or wagons. Toss their pancakes as well liked by.
248
+ *
249
+ * Locally e g i aquatica. Hardness whiteness and gloss and.
250
+ *
251
+ * Extensively planted as ornamental trees by homeowners businesses and. Yh t ritarit poor knights once only a dessert.
252
+ *
253
+ * A shortbread base and was then only known. Pies of meat particularly beef chicken or turkey gravy and mixed vegetables potatoes. A level the name for an extended time to incorporate. Of soup beer bread and onions before they left for work in restaurants?
254
+ *
255
+ * For my peanuts allergy peanut butter is sometimes used.
256
+ *
257
+ * Is transformed from an inferior ovary i e one. They declined in popularity with the correct humidity. Christmas foods to be referred to as xoc l tl. Which part or all of them contain cocoa butter while maintaining.
258
+ *
259
+ * Potato was called morgenmete and the united states? Used oil in place of. These sandwiches were not as sweet fillings include.
260
+ *
261
+ * Granola mixed with achiote because. Has undergone multiple changes since then until. Made before making white chocolate they say. Confectionery recipes for them proliferated ' the.
262
+ *
263
+ * Outdoorsman horace kephart recommended it in central america. Chickpea flour and certain areas of the peter.
264
+ *
265
+ * Wan are the results two classic ways of manually tempering chocolate. Cost cocoa beans is ng g which is a. Croatian serbian and slovene pala. Km mi further south revealed that sweet potatoes have been identified from grinding. Rabanadas are a range of apple sauce depending on its consistency. Retail value rose percent latin?
266
+ *
267
+ * Ghee and tea aid the body it is the largest pies of the era. In turkey ak tma in areas of central europe formerly belonging to!
268
+ * `,
269
+ * new Set(
270
+ * utilities
271
+ * .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
272
+ * .map((tokenWithPosition) => tokenWithPosition.token),
273
+ * ),
274
+ * { stopWords, stem: natural.PorterStemmer.stem },
275
+ * ),
276
+ * );
277
+ * // => `… work in restaurants? For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes …`
278
+ * ```
279
+ */
280
+ export function snippet(text, search, { surroundingTokens = 5, ...highlightOptions } = {}) {
281
+     const textTokens = tokenize(text, {
282
+         ...highlightOptions,
283
+         stopWordsAction: "mark",
284
+     });
285
+     const textTokenMatchIndex = textTokens.findIndex((tokenWithPosition) => tokenWithPosition.tokenIsStopWord === false &&
286
+         search.has(tokenWithPosition.token));
287
+     if (textTokenMatchIndex === -1)
288
+         throw new Error(`‘snippet()’ called with no matching token.`);
289
+     const textTokenSnippetIndexStart = Math.max(0, textTokenMatchIndex - surroundingTokens);
290
+     const textTokenSnippetIndexEnd = Math.min(textTokens.length - 1, textTokenMatchIndex + surroundingTokens);
291
+     return highlight((textTokenSnippetIndexStart === 0 ? "" : "… ") +
292
+         text.slice(textTokenSnippetIndexStart === 0
293
+             ? 0
294
+             : textTokens[textTokenSnippetIndexStart].start, textTokenSnippetIndexEnd === textTokens.length - 1
295
+             ? text.length
296
+             : textTokens[textTokenSnippetIndexEnd].end) +
297
+         (textTokenSnippetIndexEnd === textTokens.length - 1 ? "" : " …"), search, highlightOptions);
298
+ }
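The index arithmetic in `snippet()` can be looked at in isolation. The sketch below is not part of the package: `snippetWindow` is a hypothetical helper that reproduces only the clamping of the token window and the decision to add a leading or trailing ellipsis, under the assumption that the tokens have already been produced with `stopWordsAction: "mark"`.

```javascript
// Hypothetical helper (not exported by the package): mirrors the window
// computation in `snippet()` without depending on `tokenize()` or `highlight()`.
function snippetWindow(tokenCount, matchIndex, surroundingTokens = 5) {
  // Clamp the window so that it never runs past either end of the token list.
  const start = Math.max(0, matchIndex - surroundingTokens);
  const end = Math.min(tokenCount - 1, matchIndex + surroundingTokens);
  return {
    start,
    end,
    // An ellipsis is only added on a side where text was actually cut off.
    leadingEllipsis: start !== 0,
    trailingEllipsis: end !== tokenCount - 1,
  };
}

console.log(snippetWindow(20, 10));
// => { start: 5, end: 15, leadingEllipsis: true, trailingEllipsis: true }
console.log(snippetWindow(3, 0));
// => { start: 0, end: 2, leadingEllipsis: false, trailingEllipsis: false }
```

Because stop words are marked rather than deleted, they still count toward `surroundingTokens`, which keeps the snippet readable instead of skipping over short words.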
114
299
  /**
115
300
  * Determine whether the given `string` is a valid `Date`, that is, it’s in ISO format and corresponds to an existing date, for example, it is **not** April 32nd.
116
301
  */
@@ -1 +1 @@
1
- {"version":3,"file":"index.mjs","sourceRoot":"","sources":["../source/index.mts"],"names":[],"mappings":"AAAA;;GAEG;AACH,MAAM,UAAU,KAAK,CAAC,QAAgB;IACpC,OAAO,IAAI,OAAO,CAAC,CAAC,OAAO,EAAE,EAAE,CAAC,UAAU,CAAC,OAAO,EAAE,QAAQ,CAAC,CAAC,CAAC;AACjE,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,YAAY;IAC1B,OAAO,IAAI,CAAC,MAAM,EAAE,CAAC,QAAQ,CAAC,EAAE,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;AAC7C,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,GAAG,CAAC,GAAG,YAAsB;IAC3C,OAAO,CAAC,GAAG,CAAC,CAAC,IAAI,IAAI,EAAE,CAAC,WAAW,EAAE,EAAE,GAAG,YAAY,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC;AACvE,CAAC;AAED;;;;;;;;;;;;;;;;;GAiBG;AACH,MAAM,OAAO,wBAAyB,SAAQ,eAAe;IAC3D;QACE,IAAI,MAAM,GAAG,EAAE,CAAC;QAChB,KAAK,CAAC;YACJ,KAAK,CAAC,SAAS,CAAC,KAAK,EAAE,UAAU;gBAC/B,MAAM,IAAI,MAAM,KAAK,CAAC;gBACtB,MAAM,KAAK,GAAG,MAAM,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;gBACjC,MAAM,GAAG,KAAK,CAAC,GAAG,EAAE,IAAI,EAAE,CAAC;gBAC3B,KAAK,MAAM,IAAI,IAAI,KAAK;oBACtB,IAAI,IAAI,CAAC,IAAI,EAAE,KAAK,EAAE;wBACpB,IAAI,CAAC;4BACH,UAAU,CAAC,OAAO,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC,CAAC;wBACvC,CAAC;wBAAC,OAAO,KAAK,EAAE,CAAC;4BACf,UAAU,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC;wBAC1B,CAAC;YACP,CAAC;SACF,CAAC,CAAC;IACL,CAAC;CACF;AAED;;GAEG;AACH,MAAM,UAAU,UAAU,CAAC,MAAc;IACvC,OAAO,MAAM,CAAC,MAAM,KAAK,CAAC;QACxB,CAAC,CAAC,MAAM;QACR,CAAC,CAAC,GAAG,MAAM,CAAC,CAAC,CAAC,CAAC,WAAW,EAAE,GAAG,MAAM,CAAC,KAAK,CAAC,CAAC,CAAC,EAAE,CAAC;AACrD,CAAC;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA+BG;AACH,MAAM,UAAU,MAAM,CACpB,eAAqC,EACrC,GAAG,aAAoB;IAEvB,MAAM,gBAAgB,GAAG,eAAe;SACrC,IAAI,CAAC,EAAE,CAAC;SACR,KAAK,CAAC,IAAI,CAAC;SACX,MAAM,CACL,CAAC,gBAAgB,EAAE,IAAI,EAAE,EAAE,CACzB,IAAI,CAAC,IAAI,EAAE,KAAK,EAAE;QAChB,CAAC,CAAC,gBAAgB;QAClB,CAAC,CAAC,IAAI,CAAC,GAAG,CAAC,gBAAgB,EAAE,IAAI,CAAC,KAAK,CAAC,OAAO,CAAE,CAAC,CAAC,CAAC,CAAC,MAAM,CAAC,EAChE,QAAQ,CACT,CAAC;IACJ,IAAI,MAAM,GAAG,EAAE,CAAC;IAChB,KAAK,MAAM,KAAK,IAAI,aAAa,CAAC,IAAI,EAAE,EAAE,CAAC;QACzC,MAAM,IAAI,eAAe,CAAC,KAAK,CAAC,CAAC,UAAU,CACzC,IAAI,GAAG,GAAG,CAAC,MAAM,CAAC,gBAAgB,CAAC,EACnC,IAAI,CACL,CAAC;QACF,MAAM,IAAI,aAAa,CAAC,KAAK,CAAC,CAAC;IACjC,CAAC;IA
CD,MAAM,IAAI,eAAe;SACtB,EAAE,CAAC,CAAC,CAAC,CAAE;SACP,UAAU,CAAC,IAAI,GAAG,GAAG,CAAC,MAAM,CAAC,gBAAgB,CAAC,EAAE,IAAI,CAAC,CAAC;IACzD,OAAO,MAAM,CAAC,OAAO,CAAC,SAAS,EAAE,EAAE,CAAC,CAAC,OAAO,CAAC,SAAS,EAAE,EAAE,CAAC,CAAC;AAC9D,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,MAAM,CAAC,MAAc;IACnC,OAAO,CACL,MAAM,CAAC,KAAK,CAAC,aAAa,CAAC,KAAK,IAAI,IAAI,CAAC,KAAK,CAAC,IAAI,IAAI,CAAC,MAAM,CAAC,CAAC,OAAO,EAAE,CAAC,CAC3E,CAAC;AACJ,CAAC;AAED;;GAEG;AACH,MAAM,CAAC,MAAM,WAAW,GAAW,2CAA2C,CAAC;AAE/E;;GAEG;AACH,MAAM,CAAC,MAAM,aAAa,GACxB,+CAA+C,CAAC;AAoBlD;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GAqGG;AACH,MAAM,UAAU,MAAM,CAEpB,KAAW;IACX,MAAM,IAAI,GAAG,KAAK,CAAC,OAAO,CAAC,KAAK,CAAC;QAC/B,CAAC,CAAC,OAAO;QACT,CAAC,CAAC,OAAO,KAAK,KAAK,QAAQ,IAAI,KAAK,KAAK,IAAI;YAC3C,CAAC,CAAC,QAAQ;YACV,CAAC,CAAC,CAAC,GAAG,EAAE;gBACJ,MAAM,IAAI,KAAK,CAAC,yBAAyB,CAAC,CAAC;YAC7C,CAAC,CAAC,EAAE,CAAC;IACX,MAAM,IAAI,GAAG,MAAM,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;IAChC,KAAK,MAAM,aAAa,IAAI,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,MAAM,EAAE,EAAE,CAAC;QACvD,MAAM,WAAW,GAAG,aAAa,CAAC,KAAK,EAAE,CAAC;QAC1C,IACE,WAAW,KAAK,SAAS;YACzB,IAAI,CAAC,MAAM,KAAK,MAAM,CAAC,IAAI,CAAC,WAAW,CAAC,CAAC,MAAM;YAE/C,SAAS;QACX,IAAI,IAAI,CAAC,KAAK,CAAC,CAAC,GAAG,EAAE,EAAE,CAAE,KAAa,CAAC,GAAG,CAAC,KAAM,WAAmB,CAAC,GAAG,CAAC,CAAC;YACxE,OAAO,WAAkB,CAAC;IAC9B,CAAC;IACD,KAAK,MAAM,UAAU,IAAI,MAAM,CAAC,MAAM,CAAC,KAAK,CAAC;QAC3C,IACE,CAAC,CACC,OAAO,UAAU,KAAK,QAAQ;YAC9B,OAAO,UAAU,KAAK,QAAQ;YAC9B,OAAO,UAAU,KAAK,QAAQ;YAC9B,OAAO,UAAU,KAAK,SAAS;YAC/B,OAAO,UAAU,KAAK,QAAQ;YAC9B,UAAU,KAAK,SAAS;YACxB,UAAU,KAAK,IAAI;YAClB,UAAkB,CAAC,YAAY,CAAC,KAAK,IAAI,CAC3C;YAED,MAAM,IAAI,KAAK,CACb,6DAA6D,CAC9D,CAAC;IACN,MAAM,GAAG,GAAG,MAAM,EAAE,CAAC;IACpB,KAAa,CAAC,YAAY,CAAC,GAAG,IAAI,CAAC;IACpC,MAAM,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC;IACrB,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,GAAG,CAAC,GAAG,EAAE,IAAI,OAAO,CAAC,KAAY,CAAC,CAAC,CAAC;IACtD,MAAM,CAAC,oBAAoB,CAAC,QAAQ,CAAC,KAAK,EAAE,EAAE,IAAI,EAAE,GAAG,EAAE,CAAC,CAAC;IAC3D,OAAO,KAAY,CAAC;AACtB,CAAC;AAED,MAAM,C
AAC,MAAM,YAAY,GAAG,MAAM,CAAC,QAAQ,CAAC,CAAC;AAE7C,MAAM,CAAC,IAAI,GAAG;IACZ,KAAK,EAAE,IAAI,GAAG,EAA+C;IAC7D,MAAM,EAAE,IAAI,GAAG,EAGZ;CACJ,CAAC;AAEF,MAAM,CAAC,oBAAoB,GAAG,IAAI,oBAAoB,CAGnD,CAAC,EAAE,IAAI,EAAE,GAAG,EAAE,EAAE,EAAE;IACnB,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC;AAChC,CAAC,CAAC,CAAC;AAEH;;;;;;;;;;;;;;;;;;;;;;;;;;GA0BG;AACH,MAAM,UAAU,aAAa,CAC3B,EACE,QAAQ,EACR,MAAM,GAAG,GAAG,EAAE,GAAE,CAAC,GAIlB,EACD,GAA+B;IAK/B,IAAI,KAAK,GACP,UAAU,CAAC;IACb,IAAI,OAAO,GAAG,UAAU,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC;IACnC,MAAM,aAAa,GAAG;QACpB,GAAG,EAAE,KAAK,IAAI,EAAE;YACd,QAAQ,KAAK,EAAE,CAAC;gBACd,KAAK,UAAU;oBACb,YAAY,CAAC,OAAO,CAAC,CAAC;oBACtB,KAAK,GAAG,SAAS,CAAC;oBAClB,IAAI,CAAC;wBACH,MAAM,GAAG,EAAE,CAAC;oBACd,CAAC;oBAAC,OAAO,KAAK,EAAE,CAAC;wBACf,GAAG,CACD,sBAAsB,EACtB,MAAM,CAAC,KAAK,CAAC,EACZ,KAAe,EAAE,KAAK,IAAI,EAAE,CAC9B,CAAC;oBACJ,CAAC;oBACD,IAAI,KAAK,KAAK,SAAS,IAAI,KAAK,KAAK,0BAA0B,EAAE,CAAC;wBAChE,OAAO,GAAG,UAAU,CAClB,GAAG,EAAE;4BACH,aAAa,CAAC,GAAG,EAAE,CAAC;wBACtB,CAAC,EACA,KAAgD;4BAC/C,0BAA0B;4BAC1B,CAAC,CAAC,CAAC;4BACH,CAAC,CAAC,QAAQ,GAAG,CAAC,CAAC,GAAG,GAAG,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,CACzC,CAAC;wBACF,KAAK,GAAG,UAAU,CAAC;oBACrB,CAAC;oBACD,MAAM;gBACR,KAAK,SAAS;oBACZ,KAAK,GAAG,0BAA0B,CAAC;oBACnC,MAAM;YACV,CAAC;QACH,CAAC;QACD,IAAI,EAAE,KAAK,IAAI,EAAE;YACf,YAAY,CAAC,OAAO,CAAC,CAAC;YACtB,KAAK,GAAG,SAAS,CAAC;YAClB,MAAM,MAAM,EAAE,CAAC;QACjB,CAAC;KACF,CAAC;IACF,aAAa,CAAC,GAAG,EAAE,CAAC;IACpB,OAAO,aAAa,CAAC;AACvB,CAAC;AAED;;GAEG;AACH,MAAM,CAAC,KAAK,UAAU,OAAO,CAC3B,QAAgB,EAChB,SAA8B;IAE9B,IAAI,OAAO,GAA+B,SAAS,CAAC;IACpD,IAAI,CAAC;QACH,OAAO,MAAM,OAAO,CAAC,IAAI,CAAO;YAC9B,SAAS,EAAE;YACX,IAAI,OAAO,CAAC,CAAC,OAAO,EAAE,MAAM,EAAE,EAAE;gBAC9B,OAAO,GAAG,UAAU,CAAC,GAAG,EAAE;oBACxB,MAAM,CAAC,SAAS,CAAC,CAAC;gBACpB,CAAC,EAAE,QAAQ,CAAC,CAAC;YACf,CAAC,CAAC;SACH,CAAC,CAAC;IACL,CAAC;YAAS,CAAC;QACT,YAAY,CAAC,OAAO,CAAC,CAAC;IACxB,CAAC;AACH,CAAC;AAED;;;;;;;;GAQG;AACH,MAAM,UAAU,aAAa,CAC3B,GAA+B;IAE/B,IAAI,KAAK,GAAyD,WAAW,CAAC;IAC9E,MAAM,GAAG,GAAG,KAAK,IAAI,EAAE;QACrB,QAAQ,KA
AK,EAAE,CAAC;YACd,KAAK,WAAW;gBACd,KAAK,GAAG,SAAS,CAAC;gBAClB,IAAI,CAAC;oBACH,MAAM,GAAG,EAAE,CAAC;gBACd,CAAC;gBAAC,OAAO,KAAK,EAAE,CAAC;oBACf,GAAG,CACD,sBAAsB,EACtB,MAAM,CAAC,KAAK,CAAC,EACZ,KAAe,EAAE,KAAK,IAAI,EAAE,CAC9B,CAAC;gBACJ,CAAC;gBACD,IACG,KAAgD;oBACjD,0BAA0B;oBAE1B,UAAU,CAAC,GAAG,EAAE;wBACd,GAAG,EAAE,CAAC;oBACR,CAAC,CAAC,CAAC;gBACL,KAAK,GAAG,WAAW,CAAC;gBACpB,MAAM;YACR,KAAK,SAAS;gBACZ,KAAK,GAAG,0BAA0B,CAAC;gBACnC,MAAM;QACV,CAAC;IACH,CAAC,CAAC;IACF,OAAO,GAAG,CAAC;AACb,CAAC"}
1
+ {"version":3,"file":"index.mjs","sourceRoot":"","sources":["../source/index.mts"],"names":[],"mappings":"AAAA;;GAEG;AACH,MAAM,UAAU,KAAK,CAAC,QAAgB;IACpC,OAAO,IAAI,OAAO,CAAC,CAAC,OAAO,EAAE,EAAE,CAAC,UAAU,CAAC,OAAO,EAAE,QAAQ,CAAC,CAAC,CAAC;AACjE,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,YAAY;IAC1B,OAAO,IAAI,CAAC,MAAM,EAAE,CAAC,QAAQ,CAAC,EAAE,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;AAC7C,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,GAAG,CAAC,GAAG,YAAsB;IAC3C,OAAO,CAAC,GAAG,CAAC,CAAC,IAAI,IAAI,EAAE,CAAC,WAAW,EAAE,EAAE,GAAG,YAAY,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC;AACvE,CAAC;AAED;;;;;;;;;;;;;;;;;GAiBG;AACH,MAAM,OAAO,wBAAyB,SAAQ,eAAe;IAC3D;QACE,IAAI,MAAM,GAAG,EAAE,CAAC;QAChB,KAAK,CAAC;YACJ,KAAK,CAAC,SAAS,CAAC,KAAK,EAAE,UAAU;gBAC/B,MAAM,IAAI,MAAM,KAAK,CAAC;gBACtB,MAAM,KAAK,GAAG,MAAM,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;gBACjC,MAAM,GAAG,KAAK,CAAC,GAAG,EAAE,IAAI,EAAE,CAAC;gBAC3B,KAAK,MAAM,IAAI,IAAI,KAAK;oBACtB,IAAI,IAAI,CAAC,IAAI,EAAE,KAAK,EAAE;wBACpB,IAAI,CAAC;4BACH,UAAU,CAAC,OAAO,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC,CAAC;wBACvC,CAAC;wBAAC,OAAO,KAAK,EAAE,CAAC;4BACf,UAAU,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC;wBAC1B,CAAC;YACP,CAAC;SACF,CAAC,CAAC;IACL,CAAC;CACF;AAED;;GAEG;AACH,MAAM,UAAU,UAAU,CAAC,MAAc;IACvC,OAAO,MAAM,CAAC,MAAM,KAAK,CAAC;QACxB,CAAC,CAAC,MAAM;QACR,CAAC,CAAC,GAAG,MAAM,CAAC,CAAC,CAAC,CAAC,WAAW,EAAE,GAAG,MAAM,CAAC,KAAK,CAAC,CAAC,CAAC,EAAE,CAAC;AACrD,CAAC;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA+BG;AACH,MAAM,UAAU,MAAM,CACpB,eAAqC,EACrC,GAAG,aAAoB;IAEvB,MAAM,gBAAgB,GAAG,eAAe;SACrC,IAAI,CAAC,EAAE,CAAC;SACR,KAAK,CAAC,IAAI,CAAC;SACX,MAAM,CACL,CAAC,gBAAgB,EAAE,IAAI,EAAE,EAAE,CACzB,IAAI,CAAC,IAAI,EAAE,KAAK,EAAE;QAChB,CAAC,CAAC,gBAAgB;QAClB,CAAC,CAAC,IAAI,CAAC,GAAG,CAAC,gBAAgB,EAAE,IAAI,CAAC,KAAK,CAAC,OAAO,CAAE,CAAC,CAAC,CAAC,CAAC,MAAM,CAAC,EAChE,QAAQ,CACT,CAAC;IACJ,IAAI,MAAM,GAAG,EAAE,CAAC;IAChB,KAAK,MAAM,KAAK,IAAI,aAAa,CAAC,IAAI,EAAE,EAAE,CAAC;QACzC,MAAM,IAAI,eAAe,CAAC,KAAK,CAAC,CAAC,UAAU,CACzC,IAAI,GAAG,GAAG,CAAC,MAAM,CAAC,gBAAgB,CAAC,EACnC,IAAI,CACL,CAAC;QACF,MAAM,IAAI,aAAa,CAAC,KAAK,CAAC,CAAC;IACjC,CAAC;IA
CD,MAAM,IAAI,eAAe;SACtB,EAAE,CAAC,CAAC,CAAC,CAAE;SACP,UAAU,CAAC,IAAI,GAAG,GAAG,CAAC,MAAM,CAAC,gBAAgB,CAAC,EAAE,IAAI,CAAC,CAAC;IACzD,OAAO,MAAM,CAAC,OAAO,CAAC,SAAS,EAAE,EAAE,CAAC,CAAC,OAAO,CAAC,SAAS,EAAE,EAAE,CAAC,CAAC;AAC9D,CAAC;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA8CG;AACH,MAAM,UAAU,QAAQ,CACtB,IAAY,EACZ,EACE,SAAS,GAAG,IAAI,GAAG,EAAE,EACrB,eAAe,GAAG,QAAQ,EAC1B,IAAI,GAAG,CAAC,KAAK,EAAE,EAAE,CAAC,KAAK,MAKrB,EAAE;IAEN,OAAO,CAAC,GAAG,IAAI,CAAC,QAAQ,CAAC,0CAA0C,CAAC,CAAC,CAAC,OAAO,CAC3E,CAAC,KAAK,EAAE,EAAE;QACR,MAAM,KAAK,GAAG,cAAc,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC;QACvC,MAAM,eAAe,GAAG,SAAS,CAAC,GAAG,CAAC,KAAK,CAAC,CAAC;QAC7C,OAAO,eAAe,IAAI,eAAe,KAAK,QAAQ;YACpD,CAAC,CAAC,EAAE;YACJ,CAAC,CAAC;gBACE;oBACE,KAAK,EAAE,IAAI,CAAC,KAAK,CAAC;oBAClB,eAAe;oBACf,KAAK,EAAE,KAAK,CAAC,KAAK;oBAClB,GAAG,EAAE,KAAK,CAAC,KAAK,GAAG,KAAK,CAAC,CAAC,CAAC,CAAC,MAAM;iBACnC;aACF,CAAC;IACR,CAAC,CACF,CAAC;AACJ,CAAC;AAED;;;;;;GAMG;AACH,MAAM,UAAU,cAAc,CAAC,KAAa;IAC1C,OAAO,KAAK;SACT,SAAS,CAAC,MAAM,CAAC;SACjB,OAAO,CAAC,iBAAiB,EAAE,EAAE,CAAC;SAC9B,WAAW,EAAE,CAAC;AACnB,CAAC;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA4BG;AACH,MAAM,UAAU,SAAS,CACvB,IAAY,EACZ,MAAmB,EACnB,EACE,KAAK,GAAG,0BAA0B,EAClC,GAAG,GAAG,SAAS,EACf,GAAG,eAAe,KACmD,EAAE;IAEzE,IAAI,eAAe,GAAG,EAAE,CAAC;IACzB,MAAM,iBAAiB,GAAG,QAAQ,CAAC,IAAI,EAAE,eAAe,CAAC,CAAC,MAAM,CAC9D,CAAC,iBAAiB,EAAE,EAAE,CAAC,MAAM,CAAC,GAAG,CAAC,iBAAiB,CAAC,KAAK,CAAC,CAC3D,CAAC;IACF,eAAe,IAAI,IAAI,CAAC,KAAK,CAAC,CAAC,EAAE,iBAAiB,CAAC,CAAC,CAAC,EAAE,KAAK,IAAI,IAAI,CAAC,MAAM,CAAC,CAAC;IAC7E,KAAK,MAAM,sBAAsB,IAAI,iBAAiB,CAAC,IAAI,EAAE;QAC3D,eAAe;YACb,KAAK;gBACL,IAAI,CAAC,KAAK,CACR,iBAAiB,CAAC,sBAAsB,CAAC,CAAC,KAAK,EAC/C,iBAAiB,CAAC,sBAAsB,CAAC,CAAC,GAAG,CAC9C;gBACD,GAAG;gBACH,IAAI,CAAC,KAAK,CACR,iBAAiB,CAAC,sBAAsB,CAAC,CAAC,GAAG,EAC7C,iBAAiB,CAAC,sBAAsB,GAAG,CAAC,CAAC,EAAE,KAAK,IAAI,IAAI,CAAC,MAAM,CACpE,CAAC;IACN,OAAO,eAAe,CAAC;AACzB,CAAC;AAED;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GAgDG;AACH,MAAM,UAAU,OAAO,CACrB,IAAY,EACZ,MAAmB,EACnB,EACE,iBAAiB,GAAG,CAAC,EACrB,
GAAG,gBAAgB,KACiD,EAAE;IAExE,MAAM,UAAU,GAAG,QAAQ,CAAC,IAAI,EAAE;QAChC,GAAG,gBAAgB;QACnB,eAAe,EAAE,MAAM;KACxB,CAAC,CAAC;IACH,MAAM,mBAAmB,GAAG,UAAU,CAAC,SAAS,CAC9C,CAAC,iBAAiB,EAAE,EAAE,CACpB,iBAAiB,CAAC,eAAe,KAAK,KAAK;QAC3C,MAAM,CAAC,GAAG,CAAC,iBAAiB,CAAC,KAAK,CAAC,CACtC,CAAC;IACF,IAAI,mBAAmB,KAAK,CAAC,CAAC;QAC5B,MAAM,IAAI,KAAK,CAAC,4CAA4C,CAAC,CAAC;IAChE,MAAM,0BAA0B,GAAG,IAAI,CAAC,GAAG,CACzC,CAAC,EACD,mBAAmB,GAAG,iBAAiB,CACxC,CAAC;IACF,MAAM,wBAAwB,GAAG,IAAI,CAAC,GAAG,CACvC,UAAU,CAAC,MAAM,GAAG,CAAC,EACrB,mBAAmB,GAAG,iBAAiB,CACxC,CAAC;IACF,OAAO,SAAS,CACd,CAAC,0BAA0B,KAAK,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,IAAI,CAAC;QAC5C,IAAI,CAAC,KAAK,CACR,0BAA0B,KAAK,CAAC;YAC9B,CAAC,CAAC,CAAC;YACH,CAAC,CAAC,UAAU,CAAC,0BAA0B,CAAC,CAAC,KAAK,EAChD,wBAAwB,KAAK,UAAU,CAAC,MAAM,GAAG,CAAC;YAChD,CAAC,CAAC,IAAI,CAAC,MAAM;YACb,CAAC,CAAC,UAAU,CAAC,wBAAwB,CAAC,CAAC,GAAG,CAC7C;QACD,CAAC,wBAAwB,KAAK,UAAU,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,IAAI,CAAC,EAClE,MAAM,EACN,gBAAgB,CACjB,CAAC;AACJ,CAAC;AAED;;GAEG;AACH,MAAM,UAAU,MAAM,CAAC,MAAc;IACnC,OAAO,CACL,MAAM,CAAC,KAAK,CAAC,aAAa,CAAC,KAAK,IAAI,IAAI,CAAC,KAAK,CAAC,IAAI,IAAI,CAAC,MAAM,CAAC,CAAC,OAAO,EAAE,CAAC,CAC3E,CAAC;AACJ,CAAC;AAED;;GAEG;AACH,MAAM,CAAC,MAAM,WAAW,GAAW,2CAA2C,CAAC;AAE/E;;GAEG;AACH,MAAM,CAAC,MAAM,aAAa,GACxB,+CAA+C,CAAC;AAoBlD;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GAqGG;AACH,MAAM,UAAU,MAAM,CAEpB,KAAW;IACX,MAAM,IAAI,GAAG,KAAK,CAAC,OAAO,CAAC,KAAK,CAAC;QAC/B,CAAC,CAAC,OAAO;QACT,CAAC,CAAC,OAAO,KAAK,KAAK,QAAQ,IAAI,KAAK,KAAK,IAAI;YAC3C,CAAC,CAAC,QAAQ;YACV,CAAC,CAAC,CAAC,GAAG,EAAE;gBACJ,MAAM,IAAI,KAAK,CAAC,yBAAyB,CAAC,CAAC;YAC7C,CAAC,CAAC,EAAE,CAAC;IACX,MAAM,IAAI,GAAG,MAAM,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;IAChC,KAAK,MAAM,aAAa,IAAI,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,MAAM,EAAE,EAAE,CAAC;QACvD,MAAM,WAAW,GAAG,aAAa,CAAC,KAAK,EAAE,CAAC;QAC1C,IACE,WAAW,KAAK,SAAS;YACzB,IAAI,CAAC,MAAM,KAAK,MAAM,CAAC,IAAI,CAAC,WAAW,CAAC,CAAC,MAAM;YAE/C,SAAS;QACX,IAAI,IAAI,CAAC,KAAK,CAAC,CAAC
,GAAG,EAAE,EAAE,CAAE,KAAa,CAAC,GAAG,CAAC,KAAM,WAAmB,CAAC,GAAG,CAAC,CAAC;YACxE,OAAO,WAAkB,CAAC;IAC9B,CAAC;IACD,KAAK,MAAM,UAAU,IAAI,MAAM,CAAC,MAAM,CAAC,KAAK,CAAC;QAC3C,IACE,CAAC,CACC,OAAO,UAAU,KAAK,QAAQ;YAC9B,OAAO,UAAU,KAAK,QAAQ;YAC9B,OAAO,UAAU,KAAK,QAAQ;YAC9B,OAAO,UAAU,KAAK,SAAS;YAC/B,OAAO,UAAU,KAAK,QAAQ;YAC9B,UAAU,KAAK,SAAS;YACxB,UAAU,KAAK,IAAI;YAClB,UAAkB,CAAC,YAAY,CAAC,KAAK,IAAI,CAC3C;YAED,MAAM,IAAI,KAAK,CACb,6DAA6D,CAC9D,CAAC;IACN,MAAM,GAAG,GAAG,MAAM,EAAE,CAAC;IACpB,KAAa,CAAC,YAAY,CAAC,GAAG,IAAI,CAAC;IACpC,MAAM,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC;IACrB,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,GAAG,CAAC,GAAG,EAAE,IAAI,OAAO,CAAC,KAAY,CAAC,CAAC,CAAC;IACtD,MAAM,CAAC,oBAAoB,CAAC,QAAQ,CAAC,KAAK,EAAE,EAAE,IAAI,EAAE,GAAG,EAAE,CAAC,CAAC;IAC3D,OAAO,KAAY,CAAC;AACtB,CAAC;AAED,MAAM,CAAC,MAAM,YAAY,GAAG,MAAM,CAAC,QAAQ,CAAC,CAAC;AAE7C,MAAM,CAAC,IAAI,GAAG;IACZ,KAAK,EAAE,IAAI,GAAG,EAA+C;IAC7D,MAAM,EAAE,IAAI,GAAG,EAGZ;CACJ,CAAC;AAEF,MAAM,CAAC,oBAAoB,GAAG,IAAI,oBAAoB,CAGnD,CAAC,EAAE,IAAI,EAAE,GAAG,EAAE,EAAE,EAAE;IACnB,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC;AAChC,CAAC,CAAC,CAAC;AAEH;;;;;;;;;;;;;;;;;;;;;;;;;;GA0BG;AACH,MAAM,UAAU,aAAa,CAC3B,EACE,QAAQ,EACR,MAAM,GAAG,GAAG,EAAE,GAAE,CAAC,GAIlB,EACD,GAA+B;IAK/B,IAAI,KAAK,GACP,UAAU,CAAC;IACb,IAAI,OAAO,GAAG,UAAU,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC;IACnC,MAAM,aAAa,GAAG;QACpB,GAAG,EAAE,KAAK,IAAI,EAAE;YACd,QAAQ,KAAK,EAAE,CAAC;gBACd,KAAK,UAAU;oBACb,YAAY,CAAC,OAAO,CAAC,CAAC;oBACtB,KAAK,GAAG,SAAS,CAAC;oBAClB,IAAI,CAAC;wBACH,MAAM,GAAG,EAAE,CAAC;oBACd,CAAC;oBAAC,OAAO,KAAK,EAAE,CAAC;wBACf,GAAG,CACD,sBAAsB,EACtB,MAAM,CAAC,KAAK,CAAC,EACZ,KAAe,EAAE,KAAK,IAAI,EAAE,CAC9B,CAAC;oBACJ,CAAC;oBACD,IAAI,KAAK,KAAK,SAAS,IAAI,KAAK,KAAK,0BAA0B,EAAE,CAAC;wBAChE,OAAO,GAAG,UAAU,CAClB,GAAG,EAAE;4BACH,aAAa,CAAC,GAAG,EAAE,CAAC;wBACtB,CAAC,EACA,KAAgD;4BAC/C,0BAA0B;4BAC1B,CAAC,CAAC,CAAC;4BACH,CAAC,CAAC,QAAQ,GAAG,CAAC,CAAC,GAAG,GAAG,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,CACzC,CAAC;wBACF,KAAK,GAAG,UAAU,CAAC;oBACrB,CAAC;oBACD,MAAM;gBACR,KAAK,SAAS;oBACZ,KAAK,GAAG,0BAA0B,CAAC;o
BACnC,MAAM;YACV,CAAC;QACH,CAAC;QACD,IAAI,EAAE,KAAK,IAAI,EAAE;YACf,YAAY,CAAC,OAAO,CAAC,CAAC;YACtB,KAAK,GAAG,SAAS,CAAC;YAClB,MAAM,MAAM,EAAE,CAAC;QACjB,CAAC;KACF,CAAC;IACF,aAAa,CAAC,GAAG,EAAE,CAAC;IACpB,OAAO,aAAa,CAAC;AACvB,CAAC;AAED;;GAEG;AACH,MAAM,CAAC,KAAK,UAAU,OAAO,CAC3B,QAAgB,EAChB,SAA8B;IAE9B,IAAI,OAAO,GAA+B,SAAS,CAAC;IACpD,IAAI,CAAC;QACH,OAAO,MAAM,OAAO,CAAC,IAAI,CAAO;YAC9B,SAAS,EAAE;YACX,IAAI,OAAO,CAAC,CAAC,OAAO,EAAE,MAAM,EAAE,EAAE;gBAC9B,OAAO,GAAG,UAAU,CAAC,GAAG,EAAE;oBACxB,MAAM,CAAC,SAAS,CAAC,CAAC;gBACpB,CAAC,EAAE,QAAQ,CAAC,CAAC;YACf,CAAC,CAAC;SACH,CAAC,CAAC;IACL,CAAC;YAAS,CAAC;QACT,YAAY,CAAC,OAAO,CAAC,CAAC;IACxB,CAAC;AACH,CAAC;AAED;;;;;;;;GAQG;AACH,MAAM,UAAU,aAAa,CAC3B,GAA+B;IAE/B,IAAI,KAAK,GAAyD,WAAW,CAAC;IAC9E,MAAM,GAAG,GAAG,KAAK,IAAI,EAAE;QACrB,QAAQ,KAAK,EAAE,CAAC;YACd,KAAK,WAAW;gBACd,KAAK,GAAG,SAAS,CAAC;gBAClB,IAAI,CAAC;oBACH,MAAM,GAAG,EAAE,CAAC;gBACd,CAAC;gBAAC,OAAO,KAAK,EAAE,CAAC;oBACf,GAAG,CACD,sBAAsB,EACtB,MAAM,CAAC,KAAK,CAAC,EACZ,KAAe,EAAE,KAAK,IAAI,EAAE,CAC9B,CAAC;gBACJ,CAAC;gBACD,IACG,KAAgD;oBACjD,0BAA0B;oBAE1B,UAAU,CAAC,GAAG,EAAE;wBACd,GAAG,EAAE,CAAC;oBACR,CAAC,CAAC,CAAC;gBACL,KAAK,GAAG,WAAW,CAAC;gBACpB,MAAM;YACR,KAAK,SAAS;gBACZ,KAAK,GAAG,0BAA0B,CAAC;gBACnC,MAAM;QACV,CAAC;IACH,CAAC,CAAC;IACF,OAAO,GAAG,CAAC;AACb,CAAC"}
@@ -1,5 +1,6 @@
1
1
  import test from "node:test";
2
2
  import assert from "node:assert/strict";
3
+ import natural from "natural";
3
4
  import * as utilities from "@radically-straightforward/utilities";
4
5
  import { intern as $ } from "@radically-straightforward/utilities";
5
6
  test("sleep()", async () => {
@@ -51,6 +52,56 @@ test("dedent()", () => {
51
52
  followed by some more text.
52
53
  `, "Here is an\n\nexample of\n an interpolated string including a newline and indentation\n\nfollowed by some more text.");
53
54
  });
55
+ test("tokenize()", () => {
56
+     assert.deepEqual(utilities.tokenize("For my peanuts allergy peanut butter is sometimes used.", {
57
+         stopWords: new Set(natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord))),
58
+         stem: (token) => natural.PorterStemmer.stem(token),
59
+     }), [
60
+         { token: "peanut", tokenIsStopWord: false, start: 7, end: 14 },
61
+         { token: "allergi", tokenIsStopWord: false, start: 15, end: 22 },
62
+         { token: "peanut", tokenIsStopWord: false, start: 23, end: 29 },
63
+         { token: "butter", tokenIsStopWord: false, start: 30, end: 36 },
64
+         { token: "sometim", tokenIsStopWord: false, start: 40, end: 49 },
65
+         { token: "us", tokenIsStopWord: false, start: 50, end: 54 },
66
+     ]);
67
+ });
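The test above shows that every token carries the `start` and `end` offsets of the text it came from, even after stemming. A minimal sketch of that idea follows. It is an approximation, not the package's implementation: `tokenizeSketch` is a hypothetical name, plain lowercasing stands in for `normalizeToken()`, and the character classes (letters, numbers, private-use characters) are my assumption about the default categories of SQLite's Unicode61 tokenizer.

```javascript
// Sketch only: a simplified stand-in for `tokenize()` that keeps source offsets.
function tokenizeSketch(
  text,
  { stopWords = new Set(), stopWordsAction = "delete", stem = (token) => token } = {},
) {
  const tokens = [];
  // Assumption: a token is a run of letters, numbers, or private-use characters.
  for (const match of text.matchAll(/[\p{L}\p{N}\p{Co}]+/gu)) {
    const token = match[0].toLowerCase(); // Simplified normalization.
    const tokenIsStopWord = stopWords.has(token);
    if (tokenIsStopWord && stopWordsAction === "delete") continue;
    tokens.push({
      token: stem(token),
      tokenIsStopWord,
      // Offsets refer to the original text, not to the normalized or stemmed token.
      start: match.index,
      end: match.index + match[0].length,
    });
  }
  return tokens;
}

console.log(tokenizeSketch("For my peanuts allergy", { stopWords: new Set(["for", "my"]) }));
```

The offsets are what make it possible to highlight the original text even when the stored token (for example `"peanut"`) no longer matches the characters in the text (`"peanuts"`).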
68
+ test("normalizeToken()", () => {
69
+     assert.equal(utilities.normalizeToken("ú HeLlo"), "u hello");
70
+ });
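The `normalizeToken()` test pins down the behavior (diacritics removed, letters lowercased) but not the mechanism. One way to obtain the same result is sketched below. The use of NFKD decomposition followed by removal of nonspacing marks is an assumption on my part, and `normalizeTokenSketch` is a hypothetical name, not the package's function.

```javascript
// Sketch only: one possible way to strip diacritics and lowercase a token.
function normalizeTokenSketch(token) {
  return token
    .normalize("NFKD") // Split "ú" into "u" followed by a combining acute accent.
    .replace(/\p{Mn}/gu, "") // Drop the combining (nonspacing) marks.
    .toLowerCase();
}

console.log(normalizeTokenSketch("ú HeLlo"));
// => "u hello"
```

Applying the same normalization to stop words and to search terms is what lets them match tokens regardless of accents or capitalization.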
71
+ test("highlight()", () => {
72
+     const stopWords = new Set(natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)));
73
+     assert.equal(utilities.highlight("For my peanuts allergy peanut butter is sometimes used.", new Set(utilities
74
+         .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
75
+         .map((tokenWithPosition) => tokenWithPosition.token)), { stopWords, stem: natural.PorterStemmer.stem }), `For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes used.`);
76
+ });
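Once the matching tokens and their offsets are known, highlighting reduces to wrapping ranges of the original text. The sketch below illustrates that step on its own. `wrapRanges` is a hypothetical helper, its default `prefix` and `suffix` simply mirror the markup seen in the expected output of the test above, and the ranges are assumed to be sorted and non-overlapping, as token ranges would be.

```javascript
// Hypothetical helper: wraps each [start, end) range of `text` in `prefix`/`suffix`.
function wrapRanges(
  text,
  ranges,
  prefix = '<span class="highlight">',
  suffix = "</span>",
) {
  let output = "";
  let cursor = 0;
  for (const { start, end } of ranges) {
    // Copy the untouched text before the range, then the wrapped range itself.
    output += text.slice(cursor, start) + prefix + text.slice(start, end) + suffix;
    cursor = end;
  }
  // Copy whatever remains after the last range.
  return output + text.slice(cursor);
}

console.log(
  wrapRanges("For my peanuts allergy peanut butter", [
    { start: 7, end: 14 },
    { start: 23, end: 29 },
  ]),
);
// => `For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter`
```

Working from offsets rather than from string replacement means both `peanuts` and `peanut` are wrapped exactly as they appear in the text, even though they share a single stemmed token.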
77
+ test("snippet()", () => {
78
+ const stopWords = new Set(natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)));
79
+ assert.equal(utilities.snippet(utilities.dedent `
80
+ Typically mixed in these languages the. Paste extracted from sugarcane or sugar beet was the genesis of contemporary. British brought western style pastry to the spouts mounted on sledges or wagons. Toss their pancakes as well liked by.
81
+
82
+ Locally e g i aquatica. Hardness whiteness and gloss and.
83
+
84
+ Extensively planted as ornamental trees by homeowners businesses and. Yh t ritarit poor knights once only a dessert.
85
+
86
+ A shortbread base and was then only known. Pies of meat particularly beef chicken or turkey gravy and mixed vegetables potatoes. A level the name for an extended time to incorporate. Of soup beer bread and onions before they left for work in restaurants?
87
+
88
+ For my peanuts allergy peanut butter is sometimes used.
89
+
90
+ Is transformed from an inferior ovary i e one. They declined in popularity with the correct humidity. Christmas foods to be referred to as xoc l tl. Which part or all of them contain cocoa butter while maintaining.
91
+
92
+ Potato was called morgenmete and the united states? Used oil in place of. These sandwiches were not as sweet fillings include.
93
+
94
+ Granola mixed with achiote because. Has undergone multiple changes since then until. Made before making white chocolate they say. Confectionery recipes for them proliferated ' the.
95
+
96
+ Outdoorsman horace kephart recommended it in central america. Chickpea flour and certain areas of the peter.
97
+
98
+ Wan are the results two classic ways of manually tempering chocolate. Cost cocoa beans is ng g which is a. Croatian serbian and slovene pala. Km mi further south revealed that sweet potatoes have been identified from grinding. Rabanadas are a range of apple sauce depending on its consistency. Retail value rose percent latin?
99
+
100
+ Ghee and tea aid the body it is the largest pies of the era. In turkey ak tma in areas of central europe formerly belonging to!
101
+ `, new Set(utilities
102
+ .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
103
+ .map((tokenWithPosition) => tokenWithPosition.token)), { stopWords, stem: natural.PorterStemmer.stem }), `… work in restaurants?\n\nFor my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes …`);
104
+ });
54
105
  test("isDate()", () => {
55
106
  assert(utilities.isDate("2024-04-01T14:57:46.638Z"));
56
107
  assert(!utilities.isDate("2024-04-01T14:57:46.68Z"));
@@ -1 +1 @@
1
- {"version":3,"file":"index.test.mjs","sourceRoot":"","sources":["../source/index.test.mts"],"names":[],"mappings":"AAAA,OAAO,IAAI,MAAM,WAAW,CAAC;AAC7B,OAAO,MAAM,MAAM,oBAAoB,CAAC;AACxC,OAAO,KAAK,SAAS,MAAM,sCAAsC,CAAC;AAClE,OAAO,EAAE,MAAM,IAAI,CAAC,EAAE,MAAM,sCAAsC,CAAC;AAEnE,IAAI,CAAC,SAAS,EAAE,KAAK,IAAI,EAAE;IACzB,MAAM,MAAM,GAAG,IAAI,CAAC,GAAG,EAAE,CAAC;IAC1B,MAAM,SAAS,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;IAC5B,MAAM,CAAC,IAAI,CAAC,GAAG,EAAE,GAAG,MAAM,IAAI,IAAI,CAAC,CAAC;AACtC,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,gBAAgB,EAAE,GAAG,EAAE;IAC1B,MAAM,CAAC,KAAK,CAAC,SAAS,CAAC,YAAY,EAAE,EAAE,aAAa,CAAC,CAAC;AACxD,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,OAAO,EAAE,GAAG,EAAE;IACjB,SAAS,CAAC,GAAG,CAAC,SAAS,EAAE,IAAI,EAAE,uBAAuB,CAAC,CAAC;AAC1D,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,0BAA0B,EAAE,KAAK,IAAI,EAAE;IAC1C,CAAC;QACC,MAAM,MAAM,GAAG,IAAI,IAAI,CAAC;YACtB,OAAO,IAAI,CAAC,SAAS,CAAC,IAAI,CAAC,KAAK,IAAI,CAAC,SAAS,CAAC,EAAE,KAAK,EAAE,OAAO,EAAE,CAAC,IAAI;SACvE,CAAC;aACC,MAAM,EAAE;aACR,WAAW,CAAC,IAAI,iBAAiB,EAAE,CAAC;aACpC,WAAW,CAAC,IAAI,SAAS,CAAC,wBAAwB,EAAE,CAAC;aACrD,SAAS,EAAE,CAAC;QACf,MAAM,CAAC,KAAK,CAAC,CAAC,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,KAAK,EAAE,IAAI,CAAC,CAAC;QAChD,MAAM,CAAC,SAAS,CAAC,CAAC,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,KAAK,EAAE,EAAE,KAAK,EAAE,OAAO,EAAE,CAAC,CAAC;QAClE,MAAM,CAAC,KAAK,CAAC,CAAC,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,KAAK,EAAE,SAAS,CAAC,CAAC;IACvD,CAAC;IAED,CAAC;QACC,MAAM,MAAM,GAAG,IAAI,IAAI,CAAC,CAAC,OAAO,IAAI,CAAC,SAAS,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC;aAC1D,MAAM,EAAE;aACR,WAAW,CAAC,IAAI,iBAAiB,EAAE,CAAC;aACpC,WAAW,CAAC,IAAI,SAAS,CAAC,wBAAwB,EAAE,CAAC;aACrD,SAAS,EAAE,CAAC;QACf,MAAM,CAAC,KAAK,CAAC,CAAC,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,KAAK,EAAE,IAAI,CAAC,CAAC;QAChD,MAAM,MAAM,CAAC,OAAO,CAAC,KAAK,IAAI,EAAE;YAC9B,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC;QACtB,CAAC,CAAC,CAAC;IACL,CAAC;AACH,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,cAAc,EAAE,GAAG,EAAE;IACxB,MAAM,CAAC,KAAK,CACV,SAAS,CAAC,UAAU,CAAC,qBAAqB,CAAC,EAC3C,qBAAqB,CACtB,CAAC;AACJ,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,UAAU,EAAE,GAAG,EAAE;IACpB,MAAM,2BAA2B,GAC/B,y
EAAyE,CAAC;IAE5E,MAAM,CAAC,KAAK,CACV,SAAS,CAAC,MAAM,CAAA;;;QAGZ,2BAA2B;;;KAG9B,EACD,sHAAsH,CACvH,CAAC;AACJ,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,UAAU,EAAE,GAAG,EAAE;IACpB,MAAM,CAAC,SAAS,CAAC,MAAM,CAAC,0BAA0B,CAAC,CAAC,CAAC;IACrD,MAAM,CAAC,CAAC,SAAS,CAAC,MAAM,CAAC,yBAAyB,CAAC,CAAC,CAAC;IACrD,MAAM,CAAC,CAAC,SAAS,CAAC,MAAM,CAAC,0BAA0B,CAAC,CAAC,CAAC;AACxD,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,aAAa,EAAE,GAAG,EAAE;IACvB,MAAM,CAAC,KAAK,CAAC,oBAAoB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IAC1D,MAAM,CAAC,YAAY,CAAC,kBAAkB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IAC/D,MAAM,CAAC,YAAY,CAAC,mBAAmB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IAChE,MAAM,CAAC,YAAY,CAAC,YAAY,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IACzD,MAAM,CAAC,YAAY,CAAC,oBAAoB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IACjE,MAAM,CAAC,YAAY,CAAC,qBAAqB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;AACpE,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,eAAe,EAAE,GAAG,EAAE;IACzB,MAAM,CAAC,KAAK,CAAC,0BAA0B,EAAE,SAAS,CAAC,aAAa,CAAC,CAAC;IAClE,MAAM,CAAC,YAAY,CAAC,kBAAkB,EAAE,SAAS,CAAC,aAAa,CAAC,CAAC;AACnE,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,UAAU,EAAE,GAAG,EAAE;IACpB,mBAAmB;IACnB,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,KAAK,KAAK,CAAC,CAAC;IAChC,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;IAC1B,MAAM,CAAC,CAAC,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,KAAK,CAAC,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,CAAC,CAAC;IAEhD,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;IAE1B,CAAC;QACC,MAAM,GAAG,GAAG,IAAI,GAAG,EAAoB,CAAC;QACxC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QAChB,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QAChB,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;QAC1B,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,SAAS,CAAC,CAAC;IACxC,CAAC;IAED,CAAC;QACC,MAAM,GAAG,GAAG,IAAI,GAAG,EAAsC,CAAC;QAC1D,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QACnB,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QACnB,MAAM,CAAC,KAAK,CAAC,G
AAG,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;QAC1B,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;IACnC,CAAC;IAED,CAAC;QACC,MAAM,GAAG,GAAG,IAAI,GAAG,EAAY,CAAC;QAChC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;QACb,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;QACb,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;QAC1B,MAAM,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,KAAK,CAAC,CAAC;IACjC,CAAC;IAED,CAAC;QACC,MAAM,GAAG,GAAG,IAAI,GAAG,EAA8B,CAAC;QAClD,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;QAChB,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;QAChB,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;QAC1B,MAAM,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;IAC1B,CAAC;IAED,MAAM,CAAC,MAAM,CAAC,GAAG,EAAE;QACjB,mBAAmB;QACnB,CAAC,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC;IACb,CAAC,CAAC,CAAC;IACH,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,CAAC,CAAC;IAExC,MAAM,CAAC,MAAM,CAAC,GAAG,EAAE;QACjB,mBAAmB;QACnB,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC;IAChB,CAAC,CAAC,CAAC;AACL,CAAC,CAAC,CAAC;AAEH,IAAI,CACF,iBAAiB,EACjB;IACE,IAAI,EACF,OAAO,CAAC,KAAK,CAAC,KAAK,IAAI,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,KAAK,iBAAiB;QAC1D,CAAC,CAAC,KAAK;QACP,CAAC,CAAC,4EAA4E;CACnF,EACD,KAAK,IAAI,EAAE;IACT,MAAM,aAAa,GAAG,SAAS,CAAC,aAAa,CAC3C,EAAE,QAAQ,EAAE,CAAC,GAAG,IAAI,EAAE,EACtB,KAAK,IAAI,EAAE;QACT,OAAO,CAAC,GAAG,CAAC,4CAA4C,CAAC,CAAC;QAC1D,MAAM,SAAS,CAAC,KAAK,CAAC,CAAC,GAAG,IAAI,CAAC,CAAC;QAChC,OAAO,CAAC,GAAG,CAAC,sDAAsD,CAAC,CAAC;QACpE,IAAI,IAAI,CAAC,MAAM,EAAE,GAAG,GAAG;YACrB,MAAM,IAAI,KAAK,CACb,gEAAgE,CACjE,CAAC;IACN,CAAC,CACF,CAAC;IACF,OAAO,CAAC,EAAE,CAAC,SAAS,EAAE,GAAG,EAAE;QACzB,aAAa,CAAC,GAAG,EAAE,CAAC;IACtB,CAAC,CAAC,CAAC;IACH,OAAO,CAAC,EAAE,CAAC,QAAQ,EAAE,GAAG,EAAE;QACxB,aAAa,CAAC,IAAI,EAAE,CAAC;IACvB,CAAC,CAAC,CAAC;IACH,OAAO,CAAC,GAAG,CAAC,4DAA4D,CAAC,CAAC;AAC5E,CAAC,CACF,CAAC;AAEF,IAAI,CAAC,WAAW,E
AAE,KAAK,IAAI,EAAE;IAC3B,MAAM,SAAS,CAAC,OAAO,CAAC,IAAI,EAAE,KAAK,IAAI,EAAE;QACvC,MAAM,SAAS,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;QAC5B,OAAO,CAAC,GAAG,CAAC,0CAA0C,CAAC,CAAC;IAC1D,CAAC,CAAC,CAAC;IACH,MAAM,MAAM,CAAC,OAAO,CAAC,KAAK,IAAI,EAAE;QAC9B,MAAM,SAAS,CAAC,OAAO,CAAC,IAAI,EAAE,KAAK,IAAI,EAAE;YACvC,MAAM,SAAS,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;YAC5B,MAAM,IAAI,KAAK,CAAC,uCAAuC,CAAC,CAAC;QAC3D,CAAC,CAAC,CAAC;IACL,CAAC,CAAC,CAAC;IACH,MAAM,MAAM,CAAC,OAAO,CAAC,KAAK,IAAI,EAAE;QAC9B,MAAM,SAAS,CAAC,OAAO,CAAC,IAAI,EAAE,KAAK,IAAI,EAAE;YACvC,MAAM,SAAS,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;YAC5B,OAAO,CAAC,GAAG,CAAC,wDAAwD,CAAC,CAAC;QACxE,CAAC,CAAC,CAAC;IACL,CAAC,CAAC,CAAC;AACL,CAAC,CAAC,CAAC;AAEH,IAAI,CACF,iBAAiB,EACjB;IACE,IAAI,EACF,OAAO,CAAC,KAAK,CAAC,KAAK,IAAI,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,KAAK,iBAAiB;QAC1D,CAAC,CAAC,KAAK;QACP,CAAC,CAAC,4EAA4E;CACnF,EACD,KAAK,IAAI,EAAE;IACT,MAAM,aAAa,GAAG,SAAS,CAAC,aAAa,CAAC,KAAK,IAAI,EAAE;QACvD,OAAO,CAAC,GAAG,CAAC,4CAA4C,CAAC,CAAC;QAC1D,MAAM,SAAS,CAAC,KAAK,CAAC,CAAC,GAAG,IAAI,CAAC,CAAC;QAChC,OAAO,CAAC,GAAG,CAAC,sDAAsD,CAAC,CAAC;QACpE,IAAI,IAAI,CAAC,MAAM,EAAE,GAAG,GAAG;YACrB,MAAM,IAAI,KAAK,CACb,gEAAgE,CACjE,CAAC;IACN,CAAC,CAAC,CAAC;IACH,OAAO,CAAC,EAAE,CAAC,SAAS,EAAE,GAAG,EAAE;QACzB,aAAa,EAAE,CAAC;IAClB,CAAC,CAAC,CAAC;IACH,OAAO,CAAC,GAAG,CAAC,oDAAoD,CAAC,CAAC;IAClE,WAAW,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC;AACxB,CAAC,CACF,CAAC"}
1
+ {"version":3,"file":"index.test.mjs","sourceRoot":"","sources":["../source/index.test.mts"],"names":[],"mappings":"AAAA,OAAO,IAAI,MAAM,WAAW,CAAC;AAC7B,OAAO,MAAM,MAAM,oBAAoB,CAAC;AACxC,OAAO,OAAO,MAAM,SAAS,CAAC;AAC9B,OAAO,KAAK,SAAS,MAAM,sCAAsC,CAAC;AAClE,OAAO,EAAE,MAAM,IAAI,CAAC,EAAE,MAAM,sCAAsC,CAAC;AAEnE,IAAI,CAAC,SAAS,EAAE,KAAK,IAAI,EAAE;IACzB,MAAM,MAAM,GAAG,IAAI,CAAC,GAAG,EAAE,CAAC;IAC1B,MAAM,SAAS,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;IAC5B,MAAM,CAAC,IAAI,CAAC,GAAG,EAAE,GAAG,MAAM,IAAI,IAAI,CAAC,CAAC;AACtC,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,gBAAgB,EAAE,GAAG,EAAE;IAC1B,MAAM,CAAC,KAAK,CAAC,SAAS,CAAC,YAAY,EAAE,EAAE,aAAa,CAAC,CAAC;AACxD,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,OAAO,EAAE,GAAG,EAAE;IACjB,SAAS,CAAC,GAAG,CAAC,SAAS,EAAE,IAAI,EAAE,uBAAuB,CAAC,CAAC;AAC1D,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,0BAA0B,EAAE,KAAK,IAAI,EAAE;IAC1C,CAAC;QACC,MAAM,MAAM,GAAG,IAAI,IAAI,CAAC;YACtB,OAAO,IAAI,CAAC,SAAS,CAAC,IAAI,CAAC,KAAK,IAAI,CAAC,SAAS,CAAC,EAAE,KAAK,EAAE,OAAO,EAAE,CAAC,IAAI;SACvE,CAAC;aACC,MAAM,EAAE;aACR,WAAW,CAAC,IAAI,iBAAiB,EAAE,CAAC;aACpC,WAAW,CAAC,IAAI,SAAS,CAAC,wBAAwB,EAAE,CAAC;aACrD,SAAS,EAAE,CAAC;QACf,MAAM,CAAC,KAAK,CAAC,CAAC,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,KAAK,EAAE,IAAI,CAAC,CAAC;QAChD,MAAM,CAAC,SAAS,CAAC,CAAC,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,KAAK,EAAE,EAAE,KAAK,EAAE,OAAO,EAAE,CAAC,CAAC;QAClE,MAAM,CAAC,KAAK,CAAC,CAAC,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,KAAK,EAAE,SAAS,CAAC,CAAC;IACvD,CAAC;IAED,CAAC;QACC,MAAM,MAAM,GAAG,IAAI,IAAI,CAAC,CAAC,OAAO,IAAI,CAAC,SAAS,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC;aAC1D,MAAM,EAAE;aACR,WAAW,CAAC,IAAI,iBAAiB,EAAE,CAAC;aACpC,WAAW,CAAC,IAAI,SAAS,CAAC,wBAAwB,EAAE,CAAC;aACrD,SAAS,EAAE,CAAC;QACf,MAAM,CAAC,KAAK,CAAC,CAAC,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,KAAK,EAAE,IAAI,CAAC,CAAC;QAChD,MAAM,MAAM,CAAC,OAAO,CAAC,KAAK,IAAI,EAAE;YAC9B,MAAM,MAAM,CAAC,IAAI,EAAE,CAAC;QACtB,CAAC,CAAC,CAAC;IACL,CAAC;AACH,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,cAAc,EAAE,GAAG,EAAE;IACxB,MAAM,CAAC,KAAK,CACV,SAAS,CAAC,UAAU,CAAC,qBAAqB,CAAC,EAC3C,qBAAqB,CACtB,CAAC;AACJ,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,UAAU,EAAE,GAAG
,EAAE;IACpB,MAAM,2BAA2B,GAC/B,yEAAyE,CAAC;IAE5E,MAAM,CAAC,KAAK,CACV,SAAS,CAAC,MAAM,CAAA;;;QAGZ,2BAA2B;;;KAG9B,EACD,sHAAsH,CACvH,CAAC;AACJ,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,YAAY,EAAE,GAAG,EAAE;IACtB,MAAM,CAAC,SAAS,CACd,SAAS,CAAC,QAAQ,CAChB,yDAAyD,EACzD;QACE,SAAS,EAAE,IAAI,GAAG,CAChB,OAAO,CAAC,SAAS,CAAC,GAAG,CAAC,CAAC,QAAQ,EAAE,EAAE,CACjC,SAAS,CAAC,cAAc,CAAC,QAAQ,CAAC,CACnC,CACF;QACD,IAAI,EAAE,CAAC,KAAK,EAAE,EAAE,CAAC,OAAO,CAAC,aAAa,CAAC,IAAI,CAAC,KAAK,CAAC;KACnD,CACF,EACD;QACE,EAAE,KAAK,EAAE,QAAQ,EAAE,eAAe,EAAE,KAAK,EAAE,KAAK,EAAE,CAAC,EAAE,GAAG,EAAE,EAAE,EAAE;QAC9D,EAAE,KAAK,EAAE,SAAS,EAAE,eAAe,EAAE,KAAK,EAAE,KAAK,EAAE,EAAE,EAAE,GAAG,EAAE,EAAE,EAAE;QAChE,EAAE,KAAK,EAAE,QAAQ,EAAE,eAAe,EAAE,KAAK,EAAE,KAAK,EAAE,EAAE,EAAE,GAAG,EAAE,EAAE,EAAE;QAC/D,EAAE,KAAK,EAAE,QAAQ,EAAE,eAAe,EAAE,KAAK,EAAE,KAAK,EAAE,EAAE,EAAE,GAAG,EAAE,EAAE,EAAE;QAC/D,EAAE,KAAK,EAAE,SAAS,EAAE,eAAe,EAAE,KAAK,EAAE,KAAK,EAAE,EAAE,EAAE,GAAG,EAAE,EAAE,EAAE;QAChE,EAAE,KAAK,EAAE,IAAI,EAAE,eAAe,EAAE,KAAK,EAAE,KAAK,EAAE,EAAE,EAAE,GAAG,EAAE,EAAE,EAAE;KAC5D,CACF,CAAC;AACJ,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,kBAAkB,EAAE,GAAG,EAAE;IAC5B,MAAM,CAAC,KAAK,CAAC,SAAS,CAAC,cAAc,CAAC,SAAS,CAAC,EAAE,SAAS,CAAC,CAAC;AAC/D,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,aAAa,EAAE,GAAG,EAAE;IACvB,MAAM,SAAS,GAAG,IAAI,GAAG,CACvB,OAAO,CAAC,SAAS,CAAC,GAAG,CAAC,CAAC,QAAQ,EAAE,EAAE,CAAC,SAAS,CAAC,cAAc,CAAC,QAAQ,CAAC,CAAC,CACxE,CAAC;IACF,MAAM,CAAC,KAAK,CACV,SAAS,CAAC,SAAS,CACjB,yDAAyD,EACzD,IAAI,GAAG,CACL,SAAS;SACN,QAAQ,CAAC,SAAS,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,OAAO,CAAC,aAAa,CAAC,IAAI,EAAE,CAAC;SACpE,GAAG,CAAC,CAAC,iBAAiB,EAAE,EAAE,CAAC,iBAAiB,CAAC,KAAK,CAAC,CACvD,EACD,EAAE,SAAS,EAAE,IAAI,EAAE,OAAO,CAAC,aAAa,CAAC,IAAI,EAAE,CAChD,EACD,uHAAuH,CACxH,CAAC;AACJ,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,WAAW,EAAE,GAAG,EAAE;IACrB,MAAM,SAAS,GAAG,IAAI,GAAG,CACvB,OAAO,CAAC,SAAS,CAAC,GAAG,CAAC,CAAC,QAAQ,EAAE,EAAE,CAAC,SAAS,CAAC,cAAc,CAAC,QAAQ,CAAC,CAAC,CACxE,CAAC;IACF,MAAM,CAAC,KAAK,CACV,SAAS,CAAC,OAAO,CACf,SAAS,CAAC,MAAM,CAAA;;;;;;;;;;;;;;;;;;;;;;OAsBf,EACD,IAAI,GAAG,CACL,SAAS;SACN,Q
AAQ,CAAC,SAAS,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,OAAO,CAAC,aAAa,CAAC,IAAI,EAAE,CAAC;SACpE,GAAG,CAAC,CAAC,iBAAiB,EAAE,EAAE,CAAC,iBAAiB,CAAC,KAAK,CAAC,CACvD,EACD,EAAE,SAAS,EAAE,IAAI,EAAE,OAAO,CAAC,aAAa,CAAC,IAAI,EAAE,CAChD,EACD,6IAA6I,CAC9I,CAAC;AACJ,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,UAAU,EAAE,GAAG,EAAE;IACpB,MAAM,CAAC,SAAS,CAAC,MAAM,CAAC,0BAA0B,CAAC,CAAC,CAAC;IACrD,MAAM,CAAC,CAAC,SAAS,CAAC,MAAM,CAAC,yBAAyB,CAAC,CAAC,CAAC;IACrD,MAAM,CAAC,CAAC,SAAS,CAAC,MAAM,CAAC,0BAA0B,CAAC,CAAC,CAAC;AACxD,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,aAAa,EAAE,GAAG,EAAE;IACvB,MAAM,CAAC,KAAK,CAAC,oBAAoB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IAC1D,MAAM,CAAC,YAAY,CAAC,kBAAkB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IAC/D,MAAM,CAAC,YAAY,CAAC,mBAAmB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IAChE,MAAM,CAAC,YAAY,CAAC,YAAY,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IACzD,MAAM,CAAC,YAAY,CAAC,oBAAoB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;IACjE,MAAM,CAAC,YAAY,CAAC,qBAAqB,EAAE,SAAS,CAAC,WAAW,CAAC,CAAC;AACpE,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,eAAe,EAAE,GAAG,EAAE;IACzB,MAAM,CAAC,KAAK,CAAC,0BAA0B,EAAE,SAAS,CAAC,aAAa,CAAC,CAAC;IAClE,MAAM,CAAC,YAAY,CAAC,kBAAkB,EAAE,SAAS,CAAC,aAAa,CAAC,CAAC;AACnE,CAAC,CAAC,CAAC;AAEH,IAAI,CAAC,UAAU,EAAE,GAAG,EAAE;IACpB,mBAAmB;IACnB,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,KAAK,KAAK,CAAC,CAAC;IAChC,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;IAC1B,MAAM,CAAC,CAAC,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,KAAK,CAAC,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,EAAE,CAAC,CAAC,CAAC;IAEhD,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;IAE1B,CAAC;QACC,MAAM,GAAG,GAAG,IAAI,GAAG,EAAoB,CAAC;QACxC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QAChB,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QAChB,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;QAC1B,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,SAAS,CAAC,CAAC;IACxC,CAAC;IAED,CAAC;QACC,MAAM,GAAG,GAAG,IAAI,GAAG,EAAsC,CAAC;QAC1D,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,C
AAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QACnB,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QACnB,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;QAC1B,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;IACnC,CAAC;IAED,CAAC;QACC,MAAM,GAAG,GAAG,IAAI,GAAG,EAAY,CAAC;QAChC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;QACb,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;QACb,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;QAC1B,MAAM,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,KAAK,CAAC,CAAC;IACjC,CAAC;IAED,CAAC;QACC,MAAM,GAAG,GAAG,IAAI,GAAG,EAA8B,CAAC;QAClD,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;QAChB,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;QAChB,MAAM,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC;QAC1B,MAAM,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC;IAC1B,CAAC;IAED,MAAM,CAAC,MAAM,CAAC,GAAG,EAAE;QACjB,mBAAmB;QACnB,CAAC,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC;IACb,CAAC,CAAC,CAAC;IACH,MAAM,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,CAAC,CAAC;IAExC,MAAM,CAAC,MAAM,CAAC,GAAG,EAAE;QACjB,mBAAmB;QACnB,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC;IAChB,CAAC,CAAC,CAAC;AACL,CAAC,CAAC,CAAC;AAEH,IAAI,CACF,iBAAiB,EACjB;IACE,IAAI,EACF,OAAO,CAAC,KAAK,CAAC,KAAK,IAAI,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,KAAK,iBAAiB;QAC1D,CAAC,CAAC,KAAK;QACP,CAAC,CAAC,4EAA4E;CACnF,EACD,KAAK,IAAI,EAAE;IACT,MAAM,aAAa,GAAG,SAAS,CAAC,aAAa,CAC3C,EAAE,QAAQ,EAAE,CAAC,GAAG,IAAI,EAAE,EACtB,KAAK,IAAI,EAAE;QACT,OAAO,CAAC,GAAG,CAAC,4CAA4C,CAAC,CAAC;QAC1D,MAAM,SAAS,CAAC,KAAK,CAAC,CAAC,GAAG,IAAI,CAAC,CAAC;QAChC,OAAO,CAAC,GAAG,CAAC,sDAAsD,CAAC,CAAC;QACpE,IAAI,IAAI,CAAC,MAAM,EAAE,GAAG,GAAG;YACrB,MAAM,IAAI,KAAK,CACb,gEAAgE,CACjE,CAAC;IACN,CAAC,CACF,CAAC;IACF,OAAO,CAAC,EAAE,CAAC,SAAS,EAAE,GAAG,EAAE;QACzB,aAAa,CAAC,GAAG,EAAE,CAAC;IACtB,CAAC,CAAC,CAAC;IACH,OAAO,CAAC,EAAE,CAAC,QAAQ,EAAE,GAAG,EAAE
;QACxB,aAAa,CAAC,IAAI,EAAE,CAAC;IACvB,CAAC,CAAC,CAAC;IACH,OAAO,CAAC,GAAG,CAAC,4DAA4D,CAAC,CAAC;AAC5E,CAAC,CACF,CAAC;AAEF,IAAI,CAAC,WAAW,EAAE,KAAK,IAAI,EAAE;IAC3B,MAAM,SAAS,CAAC,OAAO,CAAC,IAAI,EAAE,KAAK,IAAI,EAAE;QACvC,MAAM,SAAS,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;QAC5B,OAAO,CAAC,GAAG,CAAC,0CAA0C,CAAC,CAAC;IAC1D,CAAC,CAAC,CAAC;IACH,MAAM,MAAM,CAAC,OAAO,CAAC,KAAK,IAAI,EAAE;QAC9B,MAAM,SAAS,CAAC,OAAO,CAAC,IAAI,EAAE,KAAK,IAAI,EAAE;YACvC,MAAM,SAAS,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;YAC5B,MAAM,IAAI,KAAK,CAAC,uCAAuC,CAAC,CAAC;QAC3D,CAAC,CAAC,CAAC;IACL,CAAC,CAAC,CAAC;IACH,MAAM,MAAM,CAAC,OAAO,CAAC,KAAK,IAAI,EAAE;QAC9B,MAAM,SAAS,CAAC,OAAO,CAAC,IAAI,EAAE,KAAK,IAAI,EAAE;YACvC,MAAM,SAAS,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;YAC5B,OAAO,CAAC,GAAG,CAAC,wDAAwD,CAAC,CAAC;QACxE,CAAC,CAAC,CAAC;IACL,CAAC,CAAC,CAAC;AACL,CAAC,CAAC,CAAC;AAEH,IAAI,CACF,iBAAiB,EACjB;IACE,IAAI,EACF,OAAO,CAAC,KAAK,CAAC,KAAK,IAAI,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,KAAK,iBAAiB;QAC1D,CAAC,CAAC,KAAK;QACP,CAAC,CAAC,4EAA4E;CACnF,EACD,KAAK,IAAI,EAAE;IACT,MAAM,aAAa,GAAG,SAAS,CAAC,aAAa,CAAC,KAAK,IAAI,EAAE;QACvD,OAAO,CAAC,GAAG,CAAC,4CAA4C,CAAC,CAAC;QAC1D,MAAM,SAAS,CAAC,KAAK,CAAC,CAAC,GAAG,IAAI,CAAC,CAAC;QAChC,OAAO,CAAC,GAAG,CAAC,sDAAsD,CAAC,CAAC;QACpE,IAAI,IAAI,CAAC,MAAM,EAAE,GAAG,GAAG;YACrB,MAAM,IAAI,KAAK,CACb,gEAAgE,CACjE,CAAC;IACN,CAAC,CAAC,CAAC;IACH,OAAO,CAAC,EAAE,CAAC,SAAS,EAAE,GAAG,EAAE;QACzB,aAAa,EAAE,CAAC;IAClB,CAAC,CAAC,CAAC;IACH,OAAO,CAAC,GAAG,CAAC,oDAAoD,CAAC,CAAC;IAClE,WAAW,CAAC,GAAG,EAAE,GAAE,CAAC,CAAC,CAAC;AACxB,CAAC,CACF,CAAC"}
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@radically-straightforward/utilities",
3
- "version": "2.0.6",
3
+ "version": "2.0.7",
4
4
  "description": "🛠️ Utilities for Node.js and the browser",
5
5
  "keywords": [
6
6
  "node",
@@ -31,6 +31,7 @@
31
31
  "@radically-straightforward/documentation": "^1.0.4",
32
32
  "@radically-straightforward/typescript": "^1.0.0",
33
33
  "@types/node": "^20.14.1",
34
+ "natural": "^8.0.1",
34
35
  "prettier": "^3.3.0",
35
36
  "typescript": "^5.4.5"
36
37
  },
package/source/index.mts CHANGED
@@ -126,6 +126,247 @@ export function dedent(
126
126
  return output.replace(/^[ ]*\n/, "").replace(/\n[ ]*$/, "");
127
127
  }
128
128
 
129
+ /**
130
+ * Process text into tokens that can be used for full-text search.
131
+ *
132
+ * The part that breaks the text into tokens matches the behavior of [SQLite’s Unicode61 Tokenizer](https://www.sqlite.org/fts5.html#unicode61_tokenizer).
133
+ *
134
+ * The `stopWords` are removed from the text. They are expected to be the result of `normalizeToken()`.
135
+ *
136
+ * The `stem()` allows you to implement, for example, [SQLite’s Porter Tokenizer](https://www.sqlite.org/fts5.html#porter_tokenizer).
137
+ *
138
+ * Reasons to use `tokenize()` instead of SQLite’s Tokenizers:
139
+ *
140
+ * 1. `tokenize()` provides a source map, linking each token back to the range in `text` where it came from. This is useful in `highlight()`. [SQLite’s own `highlight()` function](https://www.sqlite.org/fts5.html#the_highlight_function) doesn’t allow you to, for example, do full-text search on just the text from a message, while `highlight()`ing the message including markup.
141
+ * 2. The `stopWords` may be removed.
142
+ * 3. The `stem()` may support other languages, while SQLite’s Porter Tokenizer only supports English.
143
+ *
144
+ * When using `tokenize()`, it’s appropriate to rely on the default tokenizer in SQLite, Unicode61.
145
+ *
146
+ * We recommend using [Natural](https://naturalnode.github.io/natural/) for [`stopWords`](https://github.com/NaturalNode/natural/tree/791df0bb8011c6caa8fc2a3a00f75deed4b3a855/lib/natural/util) and [`stem()`](https://github.com/NaturalNode/natural/tree/791df0bb8011c6caa8fc2a3a00f75deed4b3a855/lib/natural/stemmers).
147
+ *
148
+ * **Example**
149
+ *
150
+ * ```typescript
151
+ * import * as utilities from "@radically-straightforward/utilities";
152
+ * import natural from "natural";
153
+ *
154
+ * console.log(
155
+ * utilities.tokenize(
156
+ * "For my peanuts allergy peanut butter is sometimes used.",
157
+ * {
158
+ * stopWords: new Set(
159
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
160
+ * ),
161
+ * stem: (token) => natural.PorterStemmer.stem(token),
162
+ * },
163
+ * ),
164
+ * );
165
+ * // =>
166
+ * // [
167
+ * // { token: 'peanut', tokenIsStopWord: false, start: 7, end: 14 },
168
+ * // { token: 'allergi', tokenIsStopWord: false, start: 15, end: 22 },
169
+ * // { token: 'peanut', tokenIsStopWord: false, start: 23, end: 29 },
170
+ * // { token: 'butter', tokenIsStopWord: false, start: 30, end: 36 },
171
+ * // { token: 'sometim', tokenIsStopWord: false, start: 40, end: 49 },
172
+ * // { token: 'us', tokenIsStopWord: false, start: 50, end: 54 }
173
+ * // ]
174
+ * ```
175
+ */
176
+ export function tokenize(
177
+ text: string,
178
+ {
179
+ stopWords = new Set(),
180
+ stopWordsAction = "delete",
181
+ stem = (token) => token,
182
+ }: {
183
+ stopWords?: Set<string>;
184
+ stopWordsAction?: "delete" | "mark";
185
+ stem?: (token: string) => string;
186
+ } = {},
187
+ ): { token: string; tokenIsStopWord: boolean; start: number; end: number }[] {
188
+ return [...text.matchAll(/[\p{Letter}\p{Number}\p{Private_Use}]+/gu)].flatMap(
189
+ (match) => {
190
+ const token = normalizeToken(match[0]);
191
+ const tokenIsStopWord = stopWords.has(token);
192
+ return tokenIsStopWord && stopWordsAction === "delete"
193
+ ? []
194
+ : [
195
+ {
196
+ token: stem(token),
197
+ tokenIsStopWord,
198
+ start: match.index,
199
+ end: match.index + match[0].length,
200
+ },
201
+ ];
202
+ },
203
+ );
204
+ }
205
+
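The source-map behavior described in the doc comment above can be illustrated with a minimal standalone sketch. It assumes the same regular expression and normalization shown in this diff, leaves out stemming, and uses a hypothetical name (`tokenizeSketch`) and a hypothetical one-word stop list. It is not the package's implementation, only a way to see that `start` and `end` index the original text even when normalization rewrites the token, and how `"mark"` differs from `"delete"`.

```typescript
type Token = {
  token: string;
  tokenIsStopWord: boolean;
  start: number;
  end: number;
};

// Same normalization as in the diff: decompose, strip diacritics, lowercase.
function normalizeToken(token: string): string {
  return token
    .normalize("NFKD")
    .replace(/\p{Diacritic}/gu, "")
    .toLowerCase();
}

// Hypothetical simplified tokenizer: no stemming, positional parameters.
function tokenizeSketch(
  text: string,
  stopWords: Set<string> = new Set(),
  stopWordsAction: "delete" | "mark" = "delete",
): Token[] {
  const tokens: Token[] = [];
  for (const match of text.matchAll(
    /[\p{Letter}\p{Number}\p{Private_Use}]+/gu,
  )) {
    const token = normalizeToken(match[0]);
    const tokenIsStopWord = stopWords.has(token);
    // "delete" drops stop words; "mark" keeps them, flagged.
    if (tokenIsStopWord && stopWordsAction === "delete") continue;
    const start = match.index ?? 0;
    // Offsets come from the match in the original text, not from the normalized token.
    tokens.push({ token, tokenIsStopWord, start, end: start + match[0].length });
  }
  return tokens;
}

const text = "My Cr\u00E8me br\u00FBl\u00E9e";
const stopWords = new Set(["my"]); // hypothetical stop list
const deleted = tokenizeSketch(text, stopWords, "delete");
const marked = tokenizeSketch(text, stopWords, "mark");
```

Here `deleted` holds two tokens (`creme` and `brulee`), while `marked` also keeps `my` with `tokenIsStopWord: true`. Slicing `text` with a token's `start` and `end` returns the original accented spelling, which is what a highlighter needs.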
206
+ /**
207
+ * Normalize a token for `tokenize()`. It removes accents, for example, turning `ú` into `u`. It lowercases, for example, turning `HeLlO` into `hello`.
208
+ *
209
+ * **References**
210
+ *
211
+ * - https://stackoverflow.com/a/37511463
212
+ */
213
+ export function normalizeToken(token: string): string {
214
+ return token
215
+ .normalize("NFKD")
216
+ .replace(/\p{Diacritic}/gu, "")
217
+ .toLowerCase();
218
+ }
219
+
220
+ /**
221
+ * Highlight the `search` terms in `text`. Similar to [SQLite’s `highlight()` function](https://www.sqlite.org/fts5.html#the_highlight_function), but because it’s implemented at the application level, it can work with `text` including markup by parsing the markup into DOM and traversing the DOM using `highlight()` on each [Text Node](https://developer.mozilla.org/en-US/docs/Web/API/Text).
222
+ *
223
+ * `search` is the `token` part of the value returned by `tokenize()`.
224
+ *
225
+ * **Example**
226
+ *
227
+ * ```typescript
228
+ * import * as utilities from "@radically-straightforward/utilities";
229
+ * import natural from "natural";
230
+ *
231
+ * const stopWords = new Set(
232
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
233
+ * );
234
+ *
235
+ * console.log(
236
+ * utilities.highlight(
237
+ * "For my peanuts allergy peanut butter is sometimes used.",
238
+ * new Set(
239
+ * utilities
240
+ * .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
241
+ * .map((tokenWithPosition) => tokenWithPosition.token),
242
+ * ),
243
+ * { stopWords, stem: natural.PorterStemmer.stem },
244
+ * ),
245
+ * );
246
+ * // => `For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes used.`
247
+ * ```
248
+ */
249
+ export function highlight(
250
+ text: string,
251
+ search: Set<string>,
252
+ {
253
+ start = `<span class="highlight">`,
254
+ end = `</span>`,
255
+ ...tokenizeOptions
256
+ }: { start?: string; end?: string } & Parameters<typeof tokenize>[1] = {},
257
+ ): string {
258
+ let highlightedText = "";
259
+ const highlightedTokens = tokenize(text, tokenizeOptions).filter(
260
+ (tokenWithPosition) => search.has(tokenWithPosition.token),
261
+ );
262
+ highlightedText += text.slice(0, highlightedTokens[0]?.start ?? text.length);
263
+ for (const highlightedTokensIndex of highlightedTokens.keys())
264
+ highlightedText +=
265
+ start +
266
+ text.slice(
267
+ highlightedTokens[highlightedTokensIndex].start,
268
+ highlightedTokens[highlightedTokensIndex].end,
269
+ ) +
270
+ end +
271
+ text.slice(
272
+ highlightedTokens[highlightedTokensIndex].end,
273
+ highlightedTokens[highlightedTokensIndex + 1]?.start ?? text.length,
274
+ );
275
+ return highlightedText;
276
+ }
277
+
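The doc comment for `highlight()` mentions applying it to each Text Node of parsed markup. The sketch below is a rough stand-in for that idea which runs without a DOM: it splits on anything that looks like a tag and highlights only the text between tags. Both `highlightText` and `highlightMarkup` are hypothetical helpers, the tag-matching regular expression is an assumption that will not handle every HTML edge case (comments, `>` inside attribute values), and stop words and stemming are omitted. A real implementation would presumably walk an actual DOM as the comment describes.

```typescript
// Simplified stand-in for `highlight()`: lowercases tokens, no stop words, no stemming.
function highlightText(text: string, search: Set<string>): string {
  let output = "";
  let cursor = 0;
  for (const match of text.matchAll(
    /[\p{Letter}\p{Number}\p{Private_Use}]+/gu,
  )) {
    if (!search.has(match[0].toLowerCase())) continue;
    const start = match.index ?? 0;
    // Copy the untouched text before the match, then the wrapped match.
    output +=
      text.slice(cursor, start) +
      `<span class="highlight">` +
      match[0] +
      `</span>`;
    cursor = start + match[0].length;
  }
  return output + text.slice(cursor);
}

// Hypothetical helper: with a capturing group, `split()` puts the tags at odd
// indices and the text between them at even indices.
function highlightMarkup(markup: string, search: Set<string>): string {
  return markup
    .split(/(<[^>]*>)/)
    .map((part, index) =>
      index % 2 === 1 ? part : highlightText(part, search),
    )
    .join("");
}

const highlightedMarkup = highlightMarkup(
  `<p class="peanut">Peanut butter</p>`,
  new Set(["peanut"]),
);
// => `<p class="peanut"><span class="highlight">Peanut</span> butter</p>`
```

Note that the `class="peanut"` attribute is left alone, because only the segments between tags are passed to the highlighter.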
278
+ /**
279
+ * Extract a snippet from a long `text` that includes the `search` terms.
280
+ *
281
+ * **Example**
282
+ *
283
+ * ```typescript
284
+ * import * as utilities from "@radically-straightforward/utilities";
285
+ * import natural from "natural";
286
+ *
287
+ * const stopWords = new Set(
288
+ * natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
289
+ * );
290
+ *
291
+ * console.log(
292
+ * utilities.snippet(
293
+ * utilities.dedent`
294
+ * Typically mixed in these languages the. Paste extracted from sugarcane or sugar beet was the genesis of contemporary. British brought western style pastry to the spouts mounted on sledges or wagons. Toss their pancakes as well liked by.
295
+ *
296
+ * Locally e g i aquatica. Hardness whiteness and gloss and.
297
+ *
298
+ * Extensively planted as ornamental trees by homeowners businesses and. Yh t ritarit poor knights once only a dessert.
299
+ *
300
+ * A shortbread base and was then only known. Pies of meat particularly beef chicken or turkey gravy and mixed vegetables potatoes. A level the name for an extended time to incorporate. Of soup beer bread and onions before they left for work in restaurants?
301
+ *
302
+ * For my peanuts allergy peanut butter is sometimes used.
303
+ *
304
+ * Is transformed from an inferior ovary i e one. They declined in popularity with the correct humidity. Christmas foods to be referred to as xoc l tl. Which part or all of them contain cocoa butter while maintaining.
305
+ *
306
+ * Potato was called morgenmete and the united states? Used oil in place of. These sandwiches were not as sweet fillings include.
307
+ *
308
+ * Granola mixed with achiote because. Has undergone multiple changes since then until. Made before making white chocolate they say. Confectionery recipes for them proliferated ' the.
309
+ *
310
+ * Outdoorsman horace kephart recommended it in central america. Chickpea flour and certain areas of the peter.
311
+ *
312
+ * Wan are the results two classic ways of manually tempering chocolate. Cost cocoa beans is ng g which is a. Croatian serbian and slovene pala. Km mi further south revealed that sweet potatoes have been identified from grinding. Rabanadas are a range of apple sauce depending on its consistency. Retail value rose percent latin?
313
+ *
314
+ * Ghee and tea aid the body it is the largest pies of the era. In turkey ak tma in areas of central europe formerly belonging to!
315
+ * `,
316
+ * new Set(
317
+ * utilities
318
+ * .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
319
+ * .map((tokenWithPosition) => tokenWithPosition.token),
320
+ * ),
321
+ * { stopWords, stem: natural.PorterStemmer.stem },
322
+ * ),
323
+ * );
324
+ * // => `… work in restaurants?\n\nFor my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes …`
325
+ * ```
326
+ */
327
+ export function snippet(
328
+ text: string,
329
+ search: Set<string>,
330
+ {
331
+ surroundingTokens = 5,
332
+ ...highlightOptions
333
+ }: { surroundingTokens?: number } & Parameters<typeof highlight>[2] = {},
334
+ ): string {
335
+ const textTokens = tokenize(text, {
336
+ ...highlightOptions,
337
+ stopWordsAction: "mark",
338
+ });
339
+ const textTokenMatchIndex = textTokens.findIndex(
340
+ (tokenWithPosition) =>
341
+ tokenWithPosition.tokenIsStopWord === false &&
342
+ search.has(tokenWithPosition.token),
343
+ );
344
+ if (textTokenMatchIndex === -1)
345
+ throw new Error(`‘snippet()’ called with no matching token.`);
346
+ const textTokenSnippetIndexStart = Math.max(
347
+ 0,
348
+ textTokenMatchIndex - surroundingTokens,
349
+ );
350
+ const textTokenSnippetIndexEnd = Math.min(
351
+ textTokens.length - 1,
352
+ textTokenMatchIndex + surroundingTokens,
353
+ );
354
+ return highlight(
355
+ (textTokenSnippetIndexStart === 0 ? "" : "… ") +
356
+ text.slice(
357
+ textTokenSnippetIndexStart === 0
358
+ ? 0
359
+ : textTokens[textTokenSnippetIndexStart].start,
360
+ textTokenSnippetIndexEnd === textTokens.length - 1
361
+ ? text.length
362
+ : textTokens[textTokenSnippetIndexEnd].end,
363
+ ) +
364
+ (textTokenSnippetIndexEnd === textTokens.length - 1 ? "" : " …"),
365
+ search,
366
+ highlightOptions,
367
+ );
368
+ }
369
+
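The windowing arithmetic in `snippet()` above can be isolated in a small sketch. It assumes a plain word-matching regular expression in place of `tokenize()`, takes the index of the matching token directly instead of searching for it, and skips highlighting. `wordSpans` and `windowAround` are hypothetical names, and the sketch assumes `matchIndex` is a valid index into the spans.

```typescript
type Span = { start: number; end: number };

// Hypothetical stand-in for `tokenize()`: only the positions of each word.
function wordSpans(text: string): Span[] {
  return [...text.matchAll(/[\p{Letter}\p{Number}]+/gu)].map((match) => {
    const start = match.index ?? 0;
    return { start, end: start + match[0].length };
  });
}

// Keep `surroundingTokens` tokens on each side of the match, clamped to the
// text boundaries, and add an ellipsis on whichever side was cut.
function windowAround(
  text: string,
  matchIndex: number,
  surroundingTokens = 5,
): string {
  const spans = wordSpans(text);
  const first = Math.max(0, matchIndex - surroundingTokens);
  const last = Math.min(spans.length - 1, matchIndex + surroundingTokens);
  return (
    (first === 0 ? "" : "… ") +
    text.slice(
      // When the window reaches an edge, keep the text's own leading or
      // trailing characters (for example, final punctuation).
      first === 0 ? 0 : spans[first].start,
      last === spans.length - 1 ? text.length : spans[last].end,
    ) +
    (last === spans.length - 1 ? "" : " …")
  );
}

const sample = "one two three four five six seven";
const middle = windowAround(sample, 3, 1);
// => "… three four five …"
```

Because the slice runs between token offsets in the original text, whatever separates the tokens (spaces, punctuation, newlines) is carried into the snippet unchanged.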
129
370
  /**
130
371
  * Determine whether the given `string` is a valid `Date`, that is, it’s in ISO format and corresponds to an existing date, for example, it is **not** April 32nd.
131
372
  */
@@ -1,5 +1,6 @@
1
1
  import test from "node:test";
2
2
  import assert from "node:assert/strict";
3
+ import natural from "natural";
3
4
  import * as utilities from "@radically-straightforward/utilities";
4
5
  import { intern as $ } from "@radically-straightforward/utilities";
5
6
 
@@ -67,6 +68,92 @@ test("dedent()", () => {
67
68
  );
68
69
  });
69
70
 
71
+ test("tokenize()", () => {
72
+ assert.deepEqual(
73
+ utilities.tokenize(
74
+ "For my peanuts allergy peanut butter is sometimes used.",
75
+ {
76
+ stopWords: new Set(
77
+ natural.stopwords.map((stopWord) =>
78
+ utilities.normalizeToken(stopWord),
79
+ ),
80
+ ),
81
+ stem: (token) => natural.PorterStemmer.stem(token),
82
+ },
83
+ ),
84
+ [
85
+ { token: "peanut", tokenIsStopWord: false, start: 7, end: 14 },
86
+ { token: "allergi", tokenIsStopWord: false, start: 15, end: 22 },
87
+ { token: "peanut", tokenIsStopWord: false, start: 23, end: 29 },
88
+ { token: "butter", tokenIsStopWord: false, start: 30, end: 36 },
89
+ { token: "sometim", tokenIsStopWord: false, start: 40, end: 49 },
90
+ { token: "us", tokenIsStopWord: false, start: 50, end: 54 },
91
+ ],
92
+ );
93
+ });
94
+
95
+ test("normalizeToken()", () => {
96
+ assert.equal(utilities.normalizeToken("ú HeLlo"), "u hello");
97
+ });
98
+
99
+ test("highlight()", () => {
100
+ const stopWords = new Set(
101
+ natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
102
+ );
103
+ assert.equal(
104
+ utilities.highlight(
105
+ "For my peanuts allergy peanut butter is sometimes used.",
106
+ new Set(
107
+ utilities
108
+ .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
109
+ .map((tokenWithPosition) => tokenWithPosition.token),
110
+ ),
111
+ { stopWords, stem: natural.PorterStemmer.stem },
112
+ ),
113
+ `For my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes used.`,
114
+ );
115
+ });
116
+
117
+ test("snippet()", () => {
118
+ const stopWords = new Set(
119
+ natural.stopwords.map((stopWord) => utilities.normalizeToken(stopWord)),
120
+ );
121
+ assert.equal(
122
+ utilities.snippet(
123
+ utilities.dedent`
124
+ Typically mixed in these languages the. Paste extracted from sugarcane or sugar beet was the genesis of contemporary. British brought western style pastry to the spouts mounted on sledges or wagons. Toss their pancakes as well liked by.
125
+
126
+ Locally e g i aquatica. Hardness whiteness and gloss and.
127
+
128
+ Extensively planted as ornamental trees by homeowners businesses and. Yh t ritarit poor knights once only a dessert.
129
+
130
+ A shortbread base and was then only known. Pies of meat particularly beef chicken or turkey gravy and mixed vegetables potatoes. A level the name for an extended time to incorporate. Of soup beer bread and onions before they left for work in restaurants?
131
+
132
+ For my peanuts allergy peanut butter is sometimes used.
133
+
134
+ Is transformed from an inferior ovary i e one. They declined in popularity with the correct humidity. Christmas foods to be referred to as xoc l tl. Which part or all of them contain cocoa butter while maintaining.
135
+
136
+ Potato was called morgenmete and the united states? Used oil in place of. These sandwiches were not as sweet fillings include.
137
+
138
+ Granola mixed with achiote because. Has undergone multiple changes since then until. Made before making white chocolate they say. Confectionery recipes for them proliferated ' the.
139
+
140
+ Outdoorsman horace kephart recommended it in central america. Chickpea flour and certain areas of the peter.
141
+
142
+ Wan are the results two classic ways of manually tempering chocolate. Cost cocoa beans is ng g which is a. Croatian serbian and slovene pala. Km mi further south revealed that sweet potatoes have been identified from grinding. Rabanadas are a range of apple sauce depending on its consistency. Retail value rose percent latin?
143
+
144
+ Ghee and tea aid the body it is the largest pies of the era. In turkey ak tma in areas of central europe formerly belonging to!
145
+ `,
146
+ new Set(
147
+ utilities
148
+ .tokenize("peanuts", { stopWords, stem: natural.PorterStemmer.stem })
149
+ .map((tokenWithPosition) => tokenWithPosition.token),
150
+ ),
151
+ { stopWords, stem: natural.PorterStemmer.stem },
152
+ ),
153
+ `… work in restaurants?\n\nFor my <span class="highlight">peanuts</span> allergy <span class="highlight">peanut</span> butter is sometimes …`,
154
+ );
155
+ });
156
+
70
157
  test("isDate()", () => {
71
158
  assert(utilities.isDate("2024-04-01T14:57:46.638Z"));
72
159
  assert(!utilities.isDate("2024-04-01T14:57:46.68Z"));