allprofanity 2.2.0 → 2.3.0
- package/README.md +144 -25
- package/allprofanity.config.example.json +6 -0
- package/bin/init.js +1 -1
- package/bin/mcp.js +6 -0
- package/config.schema.json +44 -0
- package/dist/algos/aho-corasick.d.ts +11 -1
- package/dist/algos/aho-corasick.js +31 -6
- package/dist/algos/aho-corasick.js.map +1 -1
- package/dist/algos/bloom-filter.d.ts +2 -2
- package/dist/algos/bloom-filter.js +6 -6
- package/dist/algos/bloom-filter.js.map +1 -1
- package/dist/index.d.ts +896 -48
- package/dist/index.js +1438 -177
- package/dist/index.js.map +1 -1
- package/dist/languages/hindi-words.js +2 -2
- package/dist/languages/hindi-words.js.map +1 -1
- package/dist/mcp/server.d.ts +30 -0
- package/dist/mcp/server.js +364 -0
- package/dist/mcp/server.js.map +1 -0
- package/dist/mcp/stdio.d.ts +1 -0
- package/dist/mcp/stdio.js +72 -0
- package/dist/mcp/stdio.js.map +1 -0
- package/examples-config/README.md +113 -0
- package/examples-config/chat-app.json +24 -0
- package/examples-config/content-moderation.json +42 -0
- package/examples-config/family-friendly-max.json +33 -0
- package/examples-config/high-throughput-api.json +29 -0
- package/examples-config/low-latency-minimal.json +24 -0
- package/examples-config/medical-professional.json +42 -0
- package/examples-config/multilingual-global.json +33 -0
- package/package.json +17 -7
package/README.md
CHANGED
|
@@ -1,33 +1,61 @@
|
|
|
1
1
|
# AllProfanity
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
The most evasion-resistant multi-language profanity filter for JavaScript/TypeScript. Catches leet-speak (`sh1t`), masked words (`f*ck`, `f#ck`), stretched letters (`fuuuuck`), spaced-out spelling (`f u c k`), Unicode tricks (fullwidth `ｆｕｃｋ`, homoglyph `fυck`, zero-width injection) and nine languages, with zero false positives on the classic traps ("Scunthorpe", "classic", "bass") and sub-millisecond checks.
|
|
4
4
|
|
|
5
5
|
[](https://www.npmjs.com/package/allprofanity)
|
|
6
6
|
[](https://opensource.org/licenses/MIT)
|
|
7
7
|
|
|
8
8
|
---
|
|
9
9
|
|
|
10
|
-
## What's New in v2.
|
|
10
|
+
## What's New in v2.3.0
|
|
11
11
|
|
|
12
|
-
- **
|
|
13
|
-
- **
|
|
14
|
-
- **
|
|
15
|
-
- **
|
|
16
|
-
- **
|
|
17
|
-
-
|
|
12
|
+
- **Evasion protection suite:** masked characters (`f*ck`, `f#ck`, `f@ck`), stretched letters (`fuuuuck`), separated letters (`f u c k`, `f.u.c.k`), and a Unicode folding pass (fullwidth forms, Cyrillic/Greek homoglyphs, diacritics, zero-width/invisible characters) — all on by default, all individually configurable
|
|
13
|
+
- **Early-exit `check()`:** boolean checks stop at the first confirmed match — a 20k-word profane document that took v2.2.1 ~24 ms now checks in ~0.01 ms
|
|
14
|
+
- **8× faster leet-speak engine vs v2.2.1:** the normalizer was rewritten with first-character bucketing and an ASCII fast path — a 100 KB clean document dropped from ~24 ms to ~3 ms per scan
|
|
15
|
+
- **Position-accurate everywhere:** matches found through any normalization (leet, Unicode, collapse) now report exact positions in the *original* string, so cleaning masks exactly the evasive text
|
|
16
|
+
- **Hardened core:** `remove()`/`clearList()` now reach the Aho-Corasick automaton, the result cache is truly LRU and invalidates on every mutation, `contextAnalysis.scoreThreshold` and `ahoCorasick.prebuild` are honored, and importing the library no longer writes to the console
|
|
17
|
+
- **`ProfanitySeverity.NONE` (0):** clean text now reports `NONE` instead of `MILD`
|
|
18
18
|
|
|
19
19
|
[Read the full Performance Analysis →](./docs/SPEED_VS_ACCURACY.md)
|
|
20
20
|
|
|
21
21
|
---
|
|
22
22
|
|
|
23
|
+
## How It Compares
|
|
24
|
+
|
|
25
|
+
Benchmarked against the most popular npm profanity filters on a 20-case evasion battery, 15 false-positive traps, and throughput tests. Reproduce with [`benchmark/compare-libraries.mjs`](./benchmark/compare-libraries.mjs).
|
|
26
|
+
|
|
27
|
+
### Detection accuracy
|
|
28
|
+
|
|
29
|
+
| Library | Evasion cases caught | False positives | Multi-language |
|
|
30
|
+
|---|---|---|---|
|
|
31
|
+
| **allprofanity** | **19/20** | **0/15** | **9 languages** |
|
|
32
|
+
| obscenity | 16/20 | 1/15 (flags "shiitake") | English only |
|
|
33
|
+
| bad-words | 9/20 | 0/15 | English only |
|
|
34
|
+
| @2toad/profanity | 9/20 | 0/15 | English only |
|
|
35
|
+
| leo-profanity | 6/20 | 0/15 | English only |
|
|
36
|
+
|
|
37
|
+
The battery covers plain profanity, leet-speak (`sh1t`, `a55hole`, `b@stard`), masked words (`f*ck`, `f#ck`), stretched letters, spaced/dotted spelling, case tricks, fullwidth Unicode, homoglyphs, and Hindi in both Roman and Devanagari script.
|
|
38
|
+
|
|
39
|
+
### Performance (ms per `check()` call)
|
|
40
|
+
|
|
41
|
+
| Input | allprofanity | leo-profanity | bad-words | obscenity | @2toad |
|
|
42
|
+
|---|---|---|---|---|---|
|
|
43
|
+
| Short message (10 words) | 0.001 | 0.001 | 0.356 | 0.017 | 0.001 |
|
|
44
|
+
| Large text, profane (20k words) | **0.013** | 0.449 | 15.1 | 11.3 | 0.041 |
|
|
45
|
+
| Large text, clean (20k words) | 3.20 | 0.75 | 13.2 | 14.6 | 0.29 |
|
|
46
|
+
|
|
47
|
+
Against **obscenity** — the only library close on detection — allprofanity is ~4.5× faster on clean text and ~870× faster on profane text, with better detection and zero false positives. The thin word-list filters (leo-profanity, @2toad) are faster on clean throughput but miss half to two-thirds of the evasion battery.
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
23
51
|
## Features
|
|
24
52
|
|
|
25
53
|
### Performance & Speed
|
|
26
54
|
|
|
27
55
|
- **Multiple Algorithm Options:** Choose between Trie (default), Aho-Corasick, or Hybrid modes
|
|
28
|
-
- **
|
|
29
|
-
- **
|
|
30
|
-
-
|
|
56
|
+
- **300K+ ops/sec on short texts:** default Trie mode (chat-message-sized inputs, measured on the comparison benchmark hardware); profane text exits early at 1M+ ops/sec
|
|
57
|
+
- **O(1) Cached Repeats:** LRU result cache turns repeated checks into map lookups — 25× (short texts) to 6,500× (large texts) faster than the same uncached call
|
|
58
|
+
- **Aho-Corasick Option:** single-pass multi-pattern matching whose cost stays flat as your dictionary grows (vs the Trie's per-position scanning)
|
|
31
59
|
- **Single-Pass Scanning:** O(n) complexity regardless of dictionary size
|
|
32
60
|
- **Batch Processing Ready:** Optimized for high-throughput API endpoints
|
|
33
61
|
|
|
@@ -35,9 +63,13 @@ A blazing-fast, multi-language profanity filter for JavaScript/TypeScript with a
|
|
|
35
63
|
|
|
36
64
|
- **Word Boundary Matching:** Smart whole-word detection prevents false positives like "assassin" or "assistance"
|
|
37
65
|
- **Pattern-Based Context Detection:** Recognizes medical terms ("anal region") and negation patterns ("not bad")
|
|
38
|
-
- **Advanced Leet-Speak:** Detects obfuscated profanities (`
|
|
39
|
-
- **
|
|
40
|
-
- **
|
|
66
|
+
- **Advanced Leet-Speak:** Detects obfuscated profanities (`a55hole`, `sh1t`, `b!tch`, etc.)
|
|
67
|
+
- **Masked-Character Wildcards:** Resolves `f*ck`, `f#ck`, `f@ck` — without flagging `c#`, `5% off`, or email addresses
|
|
68
|
+
- **Stretched & Separated Letters:** Catches `fuuuuck`, `f u c k`, `f.u.c.k` while leaving "U S A" and "hmmmm" alone
|
|
69
|
+
- **Embedded Strong Words:** Catches profanity glued into non-words (`sisfuck`, `totalshitshow`) for a curated list of unambiguous stems — with built-in exceptions so "Scunthorpe", "mishit" and "snigger" never flag
|
|
70
|
+
- **Unicode Evasion Folding:** Fullwidth forms (`ｆｕｃｋ`), Cyrillic/Greek homoglyphs (`fυck`, `bаstard`), diacritics (`fück`), and zero-width/invisible character injection
|
|
71
|
+
- **Position-Accurate Cleaning:** Every match maps back to the exact span in the original text, however it was obfuscated
|
|
72
|
+
- **Configurable Strictness:** Each evasion pass can be toggled independently via `evasionProtection`
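
The position-accurate cleaning described in the list above can be pictured with a small sketch. It is not the library's implementation: `collapseRuns` and `maskOriginal` are hypothetical helpers, and collapsing every run is a simplification (it also turns "hello" into "helo", so the word passed in must itself be in collapsed form). The point is only that keeping an index map next to the normalized text lets a match be masked in the original string.

```typescript
// Hypothetical helpers, for illustration only.
interface Collapsed {
  text: string;     // collapsed text, e.g. "fuck this"
  starts: number[]; // starts[i]: original index where the run for text[i] begins
  ends: number[];   // ends[i]: original index just past that run
}

// Collapse each run of identical characters to one character,
// remembering which span of the original it came from.
function collapseRuns(input: string): Collapsed {
  let text = "";
  const starts: number[] = [];
  const ends: number[] = [];
  let i = 0;
  while (i < input.length) {
    let j = i + 1;
    while (j < input.length && input.charAt(j) === input.charAt(i)) j++;
    text += input.charAt(i);
    starts.push(i);
    ends.push(j);
    i = j;
  }
  return { text, starts, ends };
}

// Find `word` in the collapsed text, then mask the matching span of the original.
function maskOriginal(input: string, word: string, placeholder = "*"): string {
  if (word.length === 0) return input;
  const { text, starts, ends } = collapseRuns(input);
  const at = text.indexOf(word);
  if (at === -1) return input;
  const from = starts[at] ?? 0;
  const to = ends[at + word.length - 1] ?? input.length;
  return input.slice(0, from) + placeholder.repeat(to - from) + input.slice(to);
}

console.log(maskOriginal("fuuuuck this", "fuck")); // "******* this"
```

All seven characters of the stretched word are masked, not just the four of the dictionary entry, because the span comes from the original indices.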
|
|
41
73
|
|
|
42
74
|
### Multi-Language & Flexibility
|
|
43
75
|
|
|
@@ -81,9 +113,18 @@ import profanity from 'allprofanity';
|
|
|
81
113
|
|
|
82
114
|
// Simple check
|
|
83
115
|
profanity.check('This is a clean sentence.'); // false
|
|
84
|
-
profanity.check('What the f#ck is this?'); // true (
|
|
116
|
+
profanity.check('What the f#ck is this?'); // true (masked character)
|
|
117
|
+
profanity.check('This is sh1t'); // true (leet-speak)
|
|
118
|
+
profanity.check('fuuuuck this'); // true (stretched letters)
|
|
119
|
+
profanity.check('f u c k you'); // true (separated letters)
|
|
120
|
+
profanity.check('ｆｕｃｋ and fυck'); // true (unicode evasion)
|
|
85
121
|
profanity.check('यह एक चूतिया परीक्षण है।'); // true (Hindi)
|
|
86
122
|
profanity.check('Ye ek chutiya test hai.'); // true (Hinglish Roman script)
|
|
123
|
+
|
|
124
|
+
// But the classics stay clean:
|
|
125
|
+
profanity.check('I live in Scunthorpe'); // false
|
|
126
|
+
profanity.check('a classic bass guitar class'); // false
|
|
127
|
+
profanity.check('I write c# code, get 5% off'); // false
|
|
87
128
|
```
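
One way to see why `f#ck` can be caught while `c#` and `5% off` stay clean is to treat a mask character as a wildcard only when it sits between two letters. The sketch below illustrates that idea under stated assumptions: the mask set, the two-word list, the ASCII-only letter test and the whitespace tokenizer are all hypothetical, and the library's actual rules may differ.

```typescript
// Hypothetical mask set and word list, for illustration only.
const MASK_CHARS = new Set(["*", "#", "@", "$"]);
const WORDS = ["fuck", "shit"];

// ASCII-only letter test (a simplification).
function isLetter(ch: string): boolean {
  return /^[a-z]$/i.test(ch);
}

// True when `token` equals `word` except for mask characters,
// each of which must be flanked by letters on both sides.
function matchesMasked(token: string, word: string): boolean {
  if (token.length !== word.length) return false;
  let sawMask = false;
  for (let i = 0; i < token.length; i++) {
    const ch = token.charAt(i).toLowerCase();
    if (ch === word.charAt(i)) continue;
    const flanked =
      i > 0 &&
      i < token.length - 1 &&
      isLetter(token.charAt(i - 1)) &&
      isLetter(token.charAt(i + 1));
    if (!MASK_CHARS.has(ch) || !flanked) return false;
    sawMask = true;
  }
  // Unmasked words are left to the regular matcher.
  return sawMask;
}

function hasMaskedWord(text: string): boolean {
  return text
    .split(/\s+/)
    .some((token) => WORDS.some((word) => matchesMasked(token, word)));
}

console.log(hasMaskedWord("What the f#ck is this?"));      // true
console.log(hasMaskedWord("I write c# code, get 5% off")); // false
```

The `#` in `c#` ends the token and the `%` in `5%` follows a digit, so neither is flanked by letters and neither is treated as a wildcard.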
|
|
88
129
|
|
|
89
130
|
---
|
|
@@ -92,6 +133,22 @@ profanity.check('Ye ek chutiya test hai.'); // true (Hinglish Roman scr
|
|
|
92
133
|
|
|
93
134
|
AllProfanity v2.2+ offers multiple algorithms optimized for different use cases. You can configure via **constructor options** or **config file**.
|
|
94
135
|
|
|
136
|
+
### Plug-and-Play Presets
|
|
137
|
+
|
|
138
|
+
Skip the tuning: [`examples-config/`](./examples-config/) ships ready-made configurations for every type of user — chat apps, content moderation, high-throughput APIs, medical/professional content, kids' platforms, latency-critical paths, and global multilingual audiences. Each preset is documented with who it's for, why it's tuned that way, and its trade-offs in the [preset guide](./examples-config/README.md).
|
|
139
|
+
|
|
140
|
+
```bash
|
|
141
|
+
# Copy the preset that fits your use case
|
|
142
|
+
cp node_modules/allprofanity/examples-config/chat-app.json ./allprofanity.config.json
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
```typescript
|
|
146
|
+
import { AllProfanity } from 'allprofanity';
|
|
147
|
+
import config from './allprofanity.config.json';
|
|
148
|
+
|
|
149
|
+
const filter = AllProfanity.fromConfig(config);
|
|
150
|
+
```
|
|
151
|
+
|
|
95
152
|
### Configuration Methods
|
|
96
153
|
|
|
97
154
|
#### Method 1: Constructor Options (Inline)
|
|
@@ -169,7 +226,7 @@ const filter2 = AllProfanity.fromConfig({
|
|
|
169
226
|
```typescript
|
|
170
227
|
import { AllProfanity } from 'allprofanity';
|
|
171
228
|
const filter = new AllProfanity();
|
|
172
|
-
// Uses optimized Trie - fast and reliable (
|
|
229
|
+
// Uses optimized Trie - fast and reliable (300K+ ops/sec on short texts)
|
|
173
230
|
```
|
|
174
231
|
|
|
175
232
|
#### 2. Large Text Processing (Documents, Articles)
|
|
@@ -178,7 +235,8 @@ const filter = new AllProfanity();
|
|
|
178
235
|
const filter = new AllProfanity({
|
|
179
236
|
algorithm: { matching: "aho-corasick" }
|
|
180
237
|
});
|
|
181
|
-
//
|
|
238
|
+
// Single-pass scanning that stays flat as the dictionary grows;
|
|
239
|
+
// prebuild pays the compile cost at startup instead of the first request
|
|
182
240
|
```
|
|
183
241
|
|
|
184
242
|
#### 3. Reduced False Positives (Social Media, Content Moderation)
|
|
@@ -210,10 +268,27 @@ const filter = new AllProfanity({
|
|
|
210
268
|
cacheSize: 1000
|
|
211
269
|
}
|
|
212
270
|
});
|
|
213
|
-
//
|
|
271
|
+
// O(1) on cache hits - 25x to 6,500x faster than the same uncached
|
|
272
|
+
// detect() call, depending on text size (short message vs 20k words)
|
|
214
273
|
```
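
For readers unfamiliar with LRU result caching, here is a minimal sketch of the idea using a `Map`, whose insertion order doubles as recency order. `LruCache` is a hypothetical class written for this note, not the library's internal cache, and keying by the raw input text is an assumption.

```typescript
// Hypothetical LRU cache; Map iteration order tracks recency.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
  }

  // Call whenever the word list changes so stale results are never served.
  clear(): void {
    this.entries.clear();
  }

  get size(): number {
    return this.entries.size;
  }
}

const cache = new LruCache<boolean>(2);
cache.set("hello", false);
cache.set("some text", true);
cache.get("hello");        // touch "hello"
cache.set("third", false); // evicts "some text", the least recently used
console.log(cache.get("some text")); // undefined
```

A cache hit costs one map lookup plus a delete and re-insert, independent of text length, which is where the large speedups on big inputs come from.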
|
|
215
274
|
|
|
216
|
-
#### 5.
|
|
275
|
+
#### 5. Tuning Evasion Protection
|
|
276
|
+
|
|
277
|
+
```typescript
|
|
278
|
+
const filter = new AllProfanity({
|
|
279
|
+
evasionProtection: {
|
|
280
|
+
unicode: true, // fold fullwidth/homoglyph/diacritic/invisible-char tricks
|
|
281
|
+
repeatedCharacters: true, // collapse "fuuuuck" -> "fuck"
|
|
282
|
+
maskedCharacters: true, // resolve "f*ck", "f#ck", "f@ck"
|
|
283
|
+
separatedLetters: true // catch "f u c k", "f.u.c.k"
|
|
284
|
+
}
|
|
285
|
+
});
|
|
286
|
+
// All four passes are ON by default and only run when their trigger
|
|
287
|
+
// characters appear in the text, so ordinary input pays near-zero cost.
|
|
288
|
+
// Disable any of them for maximum-throughput or specialised pipelines.
|
|
289
|
+
```
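
As a rough picture of what a Unicode folding pass does, the sketch below uses the standard `String.prototype.normalize` plus a tiny homoglyph table. It is an illustration only: `foldUnicode` and the four-entry table are hypothetical, a production table would be far larger, and because folding changes string length a real pass would also need an index map back to the original text.

```typescript
// Hypothetical homoglyph table; a real one would cover many more code points.
const HOMOGLYPHS: Record<string, string> = {
  "\u03C5": "u", // Greek small upsilon
  "\u0430": "a", // Cyrillic small a
  "\u0435": "e", // Cyrillic small ie
  "\u043E": "o", // Cyrillic small o
};

function foldUnicode(input: string): string {
  return input
    .normalize("NFKD")                            // fullwidth -> ASCII, split off diacritics
    .replace(/[\u0300-\u036F]/g, "")              // drop combining diacritical marks
    .replace(/[\u200B-\u200D\u2060\uFEFF]/g, "")  // drop zero-width characters
    .replace(/[\u03C5\u0430\u0435\u043E]/g, (ch) => HOMOGLYPHS[ch] ?? ch)
    .toLowerCase();
}

console.log(foldUnicode("\uFF46\uFF55\uFF43\uFF4B")); // "fuck" (fullwidth input)
console.log(foldUnicode("f\u00FCck"));                // "fuck" (diacritic)
console.log(foldUnicode("fu\u200Bck"));               // "fuck" (zero-width space)
```

The folded copy is meant for matching only; text shown to users should be cleaned by position in the original string.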
|
|
290
|
+
|
|
291
|
+
#### 6. Medical/Professional Content
|
|
217
292
|
|
|
218
293
|
```typescript
|
|
219
294
|
const filter = new AllProfanity({
|
|
@@ -234,9 +309,11 @@ const filter = new AllProfanity({
|
|
|
234
309
|
|
|
235
310
|
| Use Case | Algorithm | Speed | Detection | Best For |
|
|
236
311
|
|----------|-----------|-------|----------|----------|
|
|
237
|
-
| Short texts (<500 chars) | Trie (default) |
|
|
238
|
-
| Large texts (
|
|
239
|
-
| Repeated patterns | Any + Caching |
|
|
312
|
+
| Short texts (<500 chars) | Trie (default) | 300K+ ops/sec | Excellent | Chat, comments |
|
|
313
|
+
| Large texts (20KB+) | Trie or Aho-Corasick | ~3 ms/check (clean), ~0.01 ms (profane, early exit) | Excellent | Documents, articles |
|
|
314
|
+
| Repeated patterns | Any + Caching | O(1) — 25× to 6,500× faster than uncached | Excellent | Forms, validation |
|
|
315
|
+
|
|
316
|
+
All numbers measured with [`benchmark/compare-libraries.mjs`](./benchmark/compare-libraries.mjs) and `npm run benchmark` (Node 22, Windows x64); baselines are v2.2.1 for version-to-version claims and the named library for cross-library claims. Re-run them yourself — results vary with hardware.
|
|
240
317
|
| Content moderation | Hybrid + Context | Moderate | Good (fewer false positives) | Social media, UGC |
|
|
241
318
|
| Professional content | Hybrid + Context (strict) | Moderate | Reduced false flags | Medical, academic |
|
|
242
319
|
|
|
@@ -248,12 +325,14 @@ const filter = new AllProfanity({
|
|
|
248
325
|
|
|
249
326
|
### `check(text: string): boolean`
|
|
250
327
|
|
|
251
|
-
Returns `true` if the text contains any profanity.
|
|
328
|
+
Returns `true` if the text contains any profanity. Uses an early-exit fast
|
|
329
|
+
path that stops scanning at the first confirmed match — for boolean
|
|
330
|
+
moderation gates it is dramatically faster than `detect()` on profane input.
|
|
252
331
|
|
|
253
332
|
```typescript
|
|
254
333
|
profanity.check('This is a clean sentence.'); // false
|
|
255
334
|
profanity.check('This is a bullshit sentence.'); // true
|
|
256
|
-
profanity.check('What the f#ck is this?'); // true (
|
|
335
|
+
profanity.check('What the f#ck is this?'); // true (masked character)
|
|
257
336
|
profanity.check('यह एक चूतिया परीक्षण है।'); // true (Hindi)
|
|
258
337
|
```
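
The difference between an early-exit boolean check and a full detection pass can be sketched as follows. This is a toy model, not the library's code: the word set, the ASCII tokenizer and both function names are hypothetical. It counts visited tokens to make the effect visible.

```typescript
// Hypothetical word list, for illustration only.
const WORDS = new Set(["bullshit", "damn"]);

// Full scan: visits every token and collects every hit.
function detectAll(text: string): string[] {
  const hits: string[] = [];
  const re = /[a-z0-9]+/gi;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const token = m[0].toLowerCase();
    if (WORDS.has(token)) hits.push(token);
  }
  return hits;
}

// Boolean gate: stops at the first hit and reports how many tokens it looked at.
function checkEarlyExit(text: string): { found: boolean; visited: number } {
  const re = /[a-z0-9]+/gi;
  let visited = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    visited++;
    if (WORDS.has(m[0].toLowerCase())) return { found: true, visited };
  }
  return { found: false, visited };
}

const doc = "damn " + "clean ".repeat(1000);
console.log(checkEarlyExit(doc)); // { found: true, visited: 1 }
console.log(detectAll(doc).length); // 1, after scanning all 1001 tokens
```

On clean text both functions visit every token, which matches the benchmark pattern above: the large win applies to profane input only.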
|
|
259
338
|
|
|
@@ -521,6 +600,12 @@ AllProfanity supports JSON-based configuration for easy setup and deployment. Th
|
|
|
521
600
|
"detectPartialWords": boolean, // Detect within words (default: false)
|
|
522
601
|
"defaultPlaceholder": string // Default censoring character (default: "*")
|
|
523
602
|
},
|
|
603
|
+
"evasionProtection": {
|
|
604
|
+
"unicode": boolean, // Fold fullwidth/homoglyphs/diacritics/invisibles (default: true)
|
|
605
|
+
"repeatedCharacters": boolean, // Collapse "fuuuuck" (default: true)
|
|
606
|
+
"maskedCharacters": boolean, // Resolve "f*ck", "f#ck" (default: true)
|
|
607
|
+
"separatedLetters": boolean // Catch "f u c k" (default: true)
|
|
608
|
+
},
|
|
524
609
|
"performance": {
|
|
525
610
|
"enableCaching": boolean, // Enable result cache (default: false)
|
|
526
611
|
"cacheSize": number // Cache size limit (default: 1000)
|
|
@@ -618,6 +703,7 @@ Severity reflects the number and variety of detected profanities:
|
|
|
618
703
|
|
|
619
704
|
| Level | Enum Value | Description |
|
|
620
705
|
|-----------|------------|-----------------------------------------|
|
|
706
|
+
| NONE | 0 | No profanity detected |
|
|
621
707
|
| MILD | 1 | 1 unique/total word |
|
|
622
708
|
| MODERATE | 2 | 2 unique or total words |
|
|
623
709
|
| SEVERE | 3 | 3 unique/total words |
|
|
@@ -654,7 +740,8 @@ console.log(englishBadWords.slice(0, 5)); // ["fuck", "shit", ...]
|
|
|
654
740
|
|
|
655
741
|
- **No wordlist exposure:** There is no `.list()` function for security and encapsulation. Use exported word arrays for samples.
|
|
656
742
|
- **TRIE-based:** Scales easily to 50,000+ words.
|
|
657
|
-
- **
|
|
743
|
+
- **Evasion-resistant:** Catches leet-speak (`a55hole`), masked characters (`f*ck`, `f#ck`), stretched letters (`fuuuuck`), separated letters (`f u c k`), and Unicode tricks (fullwidth, homoglyphs, zero-width injection).
|
|
744
|
+
- **Silent by default:** Importing the library never writes to the console; instantiate with a custom `logger` for diagnostics.
|
|
658
745
|
|
|
659
746
|
---
|
|
660
747
|
|
|
@@ -704,6 +791,38 @@ A: Yes! AllProfanity is universal.
|
|
|
704
791
|
|
|
705
792
|
---
|
|
706
793
|
|
|
794
|
+
## Use with AI Agents (MCP Server)
|
|
795
|
+
|
|
796
|
+
AllProfanity ships a built-in [Model Context Protocol](https://modelcontextprotocol.io) server, so any MCP-capable agent (Claude Code, Claude Desktop, Cursor, custom agents) can check, analyze and clean text with zero integration code — and zero extra dependencies.
|
|
797
|
+
|
|
798
|
+
**Claude Code:**
|
|
799
|
+
|
|
800
|
+
```bash
|
|
801
|
+
claude mcp add allprofanity -- npx -y -p allprofanity allprofanity-mcp
|
|
802
|
+
```
|
|
803
|
+
|
|
804
|
+
**Claude Desktop / generic MCP config:**
|
|
805
|
+
|
|
806
|
+
```json
|
|
807
|
+
{
|
|
808
|
+
"mcpServers": {
|
|
809
|
+
"allprofanity": {
|
|
810
|
+
"command": "npx",
|
|
811
|
+
"args": ["-y", "-p", "allprofanity", "allprofanity-mcp"],
|
|
812
|
+
"env": {
|
|
813
|
+
"ALLPROFANITY_LANGUAGES": "french,german,spanish"
|
|
814
|
+
}
|
|
815
|
+
}
|
|
816
|
+
}
|
|
817
|
+
}
|
|
818
|
+
```
|
|
819
|
+
|
|
820
|
+
**Exposed tools:** `check_profanity`, `detect_profanity` (positions + severity + cleaned text), `clean_profanity` (character or word mode), `add_words`, `add_to_whitelist`, `load_language`, `list_languages`, and `get_documentation` — agents can read this README and the preset guide directly through the server (also exposed as MCP resources), so they can learn how to integrate and configure the library on their own.
|
|
821
|
+
|
|
822
|
+
**Configuration:** set `ALLPROFANITY_LANGUAGES` (comma-separated) to preload languages, or `ALLPROFANITY_CONFIG` to the path of an `allprofanity.config.json` (any [preset](./examples-config/) works).
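
A comma-separated setting such as `ALLPROFANITY_LANGUAGES` is typically split, trimmed and validated before use. The sketch below shows one way to do that against the language names listed in `config.schema.json`. How the bundled server actually treats whitespace, case, duplicates or unknown names is not shown in this diff, so `parseLanguages` and its behavior are assumptions.

```typescript
// Language names taken from the enum in config.schema.json.
const BUILT_IN_LANGUAGES = [
  "english", "hindi", "french", "german", "spanish",
  "bengali", "tamil", "telugu", "brazilian",
] as const;
type Language = (typeof BUILT_IN_LANGUAGES)[number];

// Hypothetical parser: trims, lowercases, de-duplicates, and reports unknown names.
function parseLanguages(
  raw: string | undefined,
): { languages: Language[]; unknown: string[] } {
  const languages: Language[] = [];
  const unknown: string[] = [];
  for (const part of (raw ?? "").split(",")) {
    const name = part.trim().toLowerCase();
    if (name === "") continue;
    const known = BUILT_IN_LANGUAGES.find((lang) => lang === name);
    if (known === undefined) unknown.push(name);
    else if (!languages.includes(known)) languages.push(known);
  }
  return { languages, unknown };
}

console.log(parseLanguages("french, German,,klingon,french"));
// { languages: ["french", "german"], unknown: ["klingon"] }
```

In a Node process the argument would come from `process.env.ALLPROFANITY_LANGUAGES`; it is passed in explicitly here to keep the sketch self-contained.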
|
|
823
|
+
|
|
824
|
+
---
|
|
825
|
+
|
|
707
826
|
## Middleware Examples
|
|
708
827
|
|
|
709
828
|
**Looking for Express.js/Node.js middleware to use AllProfanity in your API or chat app?**
|
package/allprofanity.config.example.json
CHANGED
|
@@ -28,6 +28,12 @@
|
|
|
28
28
|
"detectPartialWords": false,
|
|
29
29
|
"defaultPlaceholder": "*"
|
|
30
30
|
},
|
|
31
|
+
"evasionProtection": {
|
|
32
|
+
"unicode": true,
|
|
33
|
+
"repeatedCharacters": true,
|
|
34
|
+
"maskedCharacters": true,
|
|
35
|
+
"separatedLetters": true
|
|
36
|
+
},
|
|
31
37
|
"performance": {
|
|
32
38
|
"cacheSize": 1000,
|
|
33
39
|
"enableCaching": true
|
package/bin/init.js
CHANGED
package/bin/mcp.js
ADDED
package/config.schema.json
CHANGED
|
@@ -140,6 +140,50 @@
|
|
|
140
140
|
}
|
|
141
141
|
}
|
|
142
142
|
},
|
|
143
|
+
"evasionProtection": {
|
|
144
|
+
"type": "object",
|
|
145
|
+
"description": "Evasion-protection passes (all enabled by default; each only runs when its trigger characters appear in the text)",
|
|
146
|
+
"properties": {
|
|
147
|
+
"unicode": {
|
|
148
|
+
"type": "boolean",
|
|
149
|
+
"default": true,
|
|
150
|
+
"description": "Fold unicode evasion: fullwidth forms, Cyrillic/Greek homoglyphs, diacritics, zero-width/invisible characters"
|
|
151
|
+
},
|
|
152
|
+
"repeatedCharacters": {
|
|
153
|
+
"type": "boolean",
|
|
154
|
+
"default": true,
|
|
155
|
+
"description": "Collapse stretched characters (fuuuuck -> fuck)"
|
|
156
|
+
},
|
|
157
|
+
"maskedCharacters": {
|
|
158
|
+
"type": "boolean",
|
|
159
|
+
"default": true,
|
|
160
|
+
"description": "Resolve masked characters as wildcards (f*ck, f#ck, f@ck)"
|
|
161
|
+
},
|
|
162
|
+
"separatedLetters": {
|
|
163
|
+
"type": "boolean",
|
|
164
|
+
"default": true,
|
|
165
|
+
"description": "Detect words spelled with uniform single separators (f u c k, f.u.c.k)"
|
|
166
|
+
}
|
|
167
|
+
}
|
|
168
|
+
},
|
|
169
|
+
"languages": {
|
|
170
|
+
"type": "array",
|
|
171
|
+
"items": {
|
|
172
|
+
"type": "string",
|
|
173
|
+
"enum": ["english", "hindi", "french", "german", "spanish", "bengali", "tamil", "telugu", "brazilian"]
|
|
174
|
+
},
|
|
175
|
+
"description": "Additional built-in language packs to load (english and hindi load by default)"
|
|
176
|
+
},
|
|
177
|
+
"whitelistWords": {
|
|
178
|
+
"type": "array",
|
|
179
|
+
"items": { "type": "string" },
|
|
180
|
+
"description": "Words that are never flagged as profanity"
|
|
181
|
+
},
|
|
182
|
+
"silent": {
|
|
183
|
+
"type": "boolean",
|
|
184
|
+
"default": false,
|
|
185
|
+
"description": "Suppress all log output"
|
|
186
|
+
},
|
|
143
187
|
"performance": {
|
|
144
188
|
"type": "object",
|
|
145
189
|
"description": "Performance optimization settings",
|
package/dist/algos/aho-corasick.d.ts
CHANGED
|
@@ -10,6 +10,7 @@ export interface Match {
|
|
|
10
10
|
export declare class AhoCorasick {
|
|
11
11
|
private root;
|
|
12
12
|
private patterns;
|
|
13
|
+
private patternSet;
|
|
13
14
|
private compiled;
|
|
14
15
|
constructor(patterns?: string[]);
|
|
15
16
|
/**
|
|
@@ -21,9 +22,18 @@ export declare class AhoCorasick {
|
|
|
21
22
|
*/
|
|
22
23
|
addPatterns(patterns: string[]): void;
|
|
23
24
|
/**
|
|
24
|
-
* Add a single pattern to the automaton
|
|
25
|
+
* Add a single pattern to the automaton (duplicates are ignored)
|
|
25
26
|
*/
|
|
26
27
|
addPattern(pattern: string): void;
|
|
28
|
+
/**
|
|
29
|
+
* Remove a pattern from the automaton
|
|
30
|
+
* @returns True if the pattern existed and was removed
|
|
31
|
+
*/
|
|
32
|
+
removePattern(pattern: string): boolean;
|
|
33
|
+
/**
|
|
34
|
+
* Compile the automaton now instead of lazily on first search
|
|
35
|
+
*/
|
|
36
|
+
build(): void;
|
|
27
37
|
/**
|
|
28
38
|
* Build the Aho-Corasick automaton
|
|
29
39
|
*/
|
package/dist/algos/aho-corasick.js
CHANGED
|
@@ -4,9 +4,10 @@
|
|
|
4
4
|
export class AhoCorasick {
|
|
5
5
|
constructor(patterns = []) {
|
|
6
6
|
this.compiled = false;
|
|
7
|
-
this.
|
|
7
|
+
this.patternSet = new Set(patterns.filter((p) => p && p.length > 0));
|
|
8
|
+
this.patterns = [...this.patternSet];
|
|
8
9
|
this.root = this.createNode();
|
|
9
|
-
if (patterns.length > 0) {
|
|
10
|
+
if (this.patterns.length > 0) {
|
|
10
11
|
this.buildAutomaton();
|
|
11
12
|
}
|
|
12
13
|
}
|
|
@@ -26,18 +27,41 @@ export class AhoCorasick {
|
|
|
26
27
|
* Add patterns to the automaton
|
|
27
28
|
*/
|
|
28
29
|
addPatterns(patterns) {
|
|
29
|
-
|
|
30
|
-
|
|
30
|
+
for (const pattern of patterns) {
|
|
31
|
+
this.addPattern(pattern);
|
|
32
|
+
}
|
|
31
33
|
}
|
|
32
34
|
/**
|
|
33
|
-
* Add a single pattern to the automaton
|
|
35
|
+
* Add a single pattern to the automaton (duplicates are ignored)
|
|
34
36
|
*/
|
|
35
37
|
addPattern(pattern) {
|
|
36
|
-
if (pattern && pattern.length > 0) {
|
|
38
|
+
if (pattern && pattern.length > 0 && !this.patternSet.has(pattern)) {
|
|
39
|
+
this.patternSet.add(pattern);
|
|
37
40
|
this.patterns.push(pattern);
|
|
38
41
|
this.compiled = false;
|
|
39
42
|
}
|
|
40
43
|
}
|
|
44
|
+
/**
|
|
45
|
+
* Remove a pattern from the automaton
|
|
46
|
+
* @returns True if the pattern existed and was removed
|
|
47
|
+
*/
|
|
48
|
+
removePattern(pattern) {
|
|
49
|
+
if (!this.patternSet.has(pattern)) {
|
|
50
|
+
return false;
|
|
51
|
+
}
|
|
52
|
+
this.patternSet.delete(pattern);
|
|
53
|
+
this.patterns = this.patterns.filter((p) => p !== pattern);
|
|
54
|
+
this.compiled = false;
|
|
55
|
+
return true;
|
|
56
|
+
}
|
|
57
|
+
/**
|
|
58
|
+
* Compile the automaton now instead of lazily on first search
|
|
59
|
+
*/
|
|
60
|
+
build() {
|
|
61
|
+
if (!this.compiled) {
|
|
62
|
+
this.buildAutomaton();
|
|
63
|
+
}
|
|
64
|
+
}
|
|
41
65
|
/**
|
|
42
66
|
* Build the Aho-Corasick automaton
|
|
43
67
|
*/
|
|
@@ -206,6 +230,7 @@ export class AhoCorasick {
|
|
|
206
230
|
*/
|
|
207
231
|
clear() {
|
|
208
232
|
this.patterns = [];
|
|
233
|
+
this.patternSet.clear();
|
|
209
234
|
this.root = this.createNode();
|
|
210
235
|
this.compiled = false;
|
|
211
236
|
}
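
The pattern used in this hunk (a `Set` for duplicate checks, an array for order, and a `compiled` flag that defers rebuilding until the next search) can be shown in isolation. `PatternRegistry` below is a hypothetical stand-in written for this note: its "compile" step just sorts the patterns and counts how often it runs, where the real class rebuilds an Aho-Corasick automaton.

```typescript
// Hypothetical registry illustrating Set-based dedupe plus lazy recompilation.
class PatternRegistry {
  private patterns: string[] = [];
  private readonly patternSet = new Set<string>();
  private compiled = false;
  private sorted: string[] = [];
  compileCount = 0; // exposed only to make the laziness observable

  addPattern(pattern: string): void {
    if (pattern.length > 0 && !this.patternSet.has(pattern)) {
      this.patternSet.add(pattern);
      this.patterns.push(pattern);
      this.compiled = false; // mark dirty; do not rebuild yet
    }
  }

  removePattern(pattern: string): boolean {
    if (!this.patternSet.has(pattern)) return false;
    this.patternSet.delete(pattern);
    this.patterns = this.patterns.filter((p) => p !== pattern);
    this.compiled = false;
    return true;
  }

  // Compile now instead of on the first search.
  build(): void {
    if (!this.compiled) this.compile();
  }

  hasMatch(text: string): boolean {
    this.build();
    return this.sorted.some((p) => text.includes(p));
  }

  // Stand-in for the expensive automaton build.
  private compile(): void {
    this.sorted = [...this.patterns].sort();
    this.compileCount++;
    this.compiled = true;
  }
}

const registry = new PatternRegistry();
registry.addPattern("bad");
registry.addPattern("bad");  // duplicate, ignored
registry.addPattern("worse");
console.log(registry.compileCount);       // 0: nothing compiled yet
console.log(registry.hasMatch("so bad")); // true, triggers one compile
console.log(registry.compileCount);       // 1
```

Any number of additions and removals between searches costs a single rebuild, and calling `build()` at startup moves that cost off the first request.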
|
package/dist/algos/aho-corasick.js.map
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"aho-corasick.js","sourceRoot":"","sources":["../../src/algos/aho-corasick.ts"],"names":[],"mappings":"AAAA;;GAEG;AAiBH,MAAM,OAAO,WAAW;
|
|
1
|
+
{"version":3,"file":"aho-corasick.js","sourceRoot":"","sources":["../../src/algos/aho-corasick.ts"],"names":[],"mappings":"AAAA;;GAEG;AAiBH,MAAM,OAAO,WAAW;IAMtB,YAAY,WAAqB,EAAE;QAF3B,aAAQ,GAAY,KAAK,CAAC;QAGhC,IAAI,CAAC,UAAU,GAAG,IAAI,GAAG,CAAC,QAAQ,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,IAAI,CAAC,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC;QACrE,IAAI,CAAC,QAAQ,GAAG,CAAC,GAAG,IAAI,CAAC,UAAU,CAAC,CAAC;QACrC,IAAI,CAAC,IAAI,GAAG,IAAI,CAAC,UAAU,EAAE,CAAC;QAC9B,IAAI,IAAI,CAAC,QAAQ,CAAC,MAAM,GAAG,CAAC,EAAE;YAC5B,IAAI,CAAC,cAAc,EAAE,CAAC;SACvB;IACH,CAAC;IAED;;OAEG;IACK,UAAU;QAChB,OAAO;YACL,QAAQ,EAAE,IAAI,GAAG,EAAoB;YACrC,MAAM,EAAE,EAAE;YACV,aAAa,EAAE,EAAE;YACjB,OAAO,EAAE,IAAI;YACb,cAAc,EAAE,KAAK;SACtB,CAAC;IACJ,CAAC;IAED;;OAEG;IACH,WAAW,CAAC,QAAkB;QAC5B,KAAK,MAAM,OAAO,IAAI,QAAQ,EAAE;YAC9B,IAAI,CAAC,UAAU,CAAC,OAAO,CAAC,CAAC;SAC1B;IACH,CAAC;IAED;;OAEG;IACH,UAAU,CAAC,OAAe;QACxB,IAAI,OAAO,IAAI,OAAO,CAAC,MAAM,GAAG,CAAC,IAAI,CAAC,IAAI,CAAC,UAAU,CAAC,GAAG,CAAC,OAAO,CAAC,EAAE;YAClE,IAAI,CAAC,UAAU,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;YAC7B,IAAI,CAAC,QAAQ,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC;YAC5B,IAAI,CAAC,QAAQ,GAAG,KAAK,CAAC;SACvB;IACH,CAAC;IAED;;;OAGG;IACH,aAAa,CAAC,OAAe;QAC3B,IAAI,CAAC,IAAI,CAAC,UAAU,CAAC,GAAG,CAAC,OAAO,CAAC,EAAE;YACjC,OAAO,KAAK,CAAC;SACd;QACD,IAAI,CAAC,UAAU,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC;QAChC,IAAI,CAAC,QAAQ,GAAG,IAAI,CAAC,QAAQ,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,KAAK,OAAO,CAAC,CAAC;QAC3D,IAAI,CAAC,QAAQ,GAAG,KAAK,CAAC;QACtB,OAAO,IAAI,CAAC;IACd,CAAC;IAED;;OAEG;IACH,KAAK;QACH,IAAI,CAAC,IAAI,CAAC,QAAQ,EAAE;YAClB,IAAI,CAAC,cAAc,EAAE,CAAC;SACvB;IACH,CAAC;IAED;;OAEG;IACK,cAAc;QACpB,IAAI,CAAC,SAAS,EAAE,CAAC;QACjB,IAAI,CAAC,iBAAiB,EAAE,CAAC;QACzB,IAAI,CAAC,gBAAgB,EAAE,CAAC;QACxB,IAAI,CAAC,QAAQ,GAAG,IAAI,CAAC;IACvB,CAAC;IAED;;OAEG;IACK,SAAS;QACf,IAAI,CAAC,IAAI,GAAG,IAAI,CAAC,UAAU,EAAE,CAAC;QAE9B,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,QAAQ,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YAC7C,MAAM,OAAO,GAAG,IAAI,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC;YACjC,IAAI,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC;YAExB
,KAAK,MAAM,IAAI,IAAI,OAAO,EAAE;gBAC1B,IAAI,CAAC,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE;oBAC/B,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,EAAE,IAAI,CAAC,UAAU,EAAE,CAAC,CAAC;iBAC/C;gBACD,OAAO,GAAG,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAE,CAAC;aACvC;YAED,OAAO,CAAC,cAAc,GAAG,IAAI,CAAC;YAC9B,OAAO,CAAC,MAAM,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC;YAC7B,OAAO,CAAC,aAAa,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;SAC/B;IACH,CAAC;IAED;;OAEG;IACK,iBAAiB;QACvB,MAAM,KAAK,GAAe,EAAE,CAAC;QAE7B,6CAA6C;QAC7C,KAAK,MAAM,KAAK,IAAI,IAAI,CAAC,IAAI,CAAC,QAAQ,CAAC,MAAM,EAAE,EAAE;YAC/C,KAAK,CAAC,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC;YAC1B,KAAK,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;SACnB;QAED,uCAAuC;QACvC,OAAO,KAAK,CAAC,MAAM,GAAG,CAAC,EAAE;YACvB,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,EAAG,CAAC;YAE/B,KAAK,MAAM,CAAC,IAAI,EAAE,KAAK,CAAC,IAAI,OAAO,CAAC,QAAQ,EAAE;gBAC5C,KAAK,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;gBAElB,IAAI,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC;gBAC9B,OAAO,OAAO,KAAK,IAAI,IAAI,CAAC,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE;oBACtD,OAAO,GAAG,OAAO,CAAC,OAAO,CAAC;iBAC3B;gBAED,KAAK,CAAC,OAAO,GAAG,OAAO,CAAC,CAAC,CAAC,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAE,CAAC,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC;aACnE;SACF;IACH,CAAC;IAED;;OAEG;IACK,gBAAgB;QACtB,MAAM,KAAK,GAAe,EAAE,CAAC;QAE7B,KAAK,MAAM,KAAK,IAAI,IAAI,CAAC,IAAI,CAAC,QAAQ,CAAC,MAAM,EAAE,EAAE;YAC/C,KAAK,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;SACnB;QAED,OAAO,KAAK,CAAC,MAAM,GAAG,CAAC,EAAE;YACvB,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,EAAG,CAAC;YAE/B,2CAA2C;YAC3C,IAAI,OAAO,CAAC,OAAO,IAAI,OAAO,CAAC,OAAO,CAAC,MAAM,CAAC,MAAM,GAAG,CAAC,EAAE;gBACxD,OAAO,CAAC,MAAM,CAAC,IAAI,CAAC,GAAG,OAAO,CAAC,OAAO,CAAC,MAAM,CAAC,CAAC;gBAC/C,OAAO,CAAC,aAAa,CAAC,IAAI,CAAC,GAAG,OAAO,CAAC,OAAO,CAAC,aAAa,CAAC,CAAC;aAC9D;YAED,KAAK,MAAM,KAAK,IAAI,OAAO,CAAC,QAAQ,CAAC,MAAM,EAAE,EAAE;gBAC7C,KAAK,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC;aACnB;SACF;IACH,CAAC;IAED;;OAEG;IACH,OAAO,CAAC,IAAY;QAClB,IAAI,CAAC,IAAI,CAAC,QAAQ,EAAE;YAClB,IAAI,CAAC,cAAc,EAAE,CAAC;SACvB;QAED,MAAM,OAAO,GAAY,EAAE,CAAC;QAC5B,IAAI,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC;QAExB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG
,IAAI,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YACpC,MAAM,IAAI,GAAG,IAAI,CAAC,CAAC,CAAC,CAAC;YAErB,gEAAgE;YAChE,OAAO,OAAO,KAAK,IAAI,CAAC,IAAI,IAAI,CAAC,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE;gBAC3D,OAAO,GAAG,OAAO,CAAC,OAAQ,CAAC;aAC5B;YAED,uCAAuC;YACvC,IAAI,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE;gBAC9B,OAAO,GAAG,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAE,CAAC;aACvC;YAED,gDAAgD;YAChD,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,OAAO,CAAC,MAAM,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;gBAC9C,MAAM,OAAO,GAAG,OAAO,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC;gBAClC,MAAM,YAAY,GAAG,OAAO,CAAC,aAAa,CAAC,CAAC,CAAC,CAAC;gBAC9C,MAAM,KAAK,GAAG,CAAC,GAAG,OAAO,CAAC,MAAM,GAAG,CAAC,CAAC;gBAErC,OAAO,CAAC,IAAI,CAAC;oBACX,OAAO;oBACP,KAAK;oBACL,GAAG,EAAE,CAAC,GAAG,CAAC;oBACV,YAAY;iBACb,CAAC,CAAC;aACJ;SACF;QAED,OAAO,OAAO,CAAC;IACjB,CAAC;IAED;;OAEG;IACH,QAAQ,CAAC,IAAY;QACnB,IAAI,CAAC,IAAI,CAAC,QAAQ,EAAE;YAClB,IAAI,CAAC,cAAc,EAAE,CAAC;SACvB;QAED,IAAI,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC;QAExB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YACpC,MAAM,IAAI,GAAG,IAAI,CAAC,CAAC,CAAC,CAAC;YAErB,OAAO,OAAO,KAAK,IAAI,CAAC,IAAI,IAAI,CAAC,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE;gBAC3D,OAAO,GAAG,OAAO,CAAC,OAAQ,CAAC;aAC5B;YAED,IAAI,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE;gBAC9B,OAAO,GAAG,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAE,CAAC;aACvC;YAED,IAAI,OAAO,CAAC,MAAM,CAAC,MAAM,GAAG,CAAC,EAAE;gBAC7B,OAAO,IAAI,CAAC;aACb;SACF;QAED,OAAO,KAAK,CAAC;IACf,CAAC;IAED;;OAEG;IACH,SAAS,CAAC,IAAY;QACpB,IAAI,CAAC,IAAI,CAAC,QAAQ,EAAE;YAClB,IAAI,CAAC,cAAc,EAAE,CAAC;SACvB;QAED,IAAI,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC;QAExB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YACpC,MAAM,IAAI,GAAG,IAAI,CAAC,CAAC,CAAC,CAAC;YAErB,OAAO,OAAO,KAAK,IAAI,CAAC,IAAI,IAAI,CAAC,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE;gBAC3D,OAAO,GAAG,OAAO,CAAC,OAAQ,CAAC;aAC5B;YAED,IAAI,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE;gBAC9B,OAAO,GAAG,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,CAAE,CAAC;aACvC;YAED,IAAI,OAAO,CAAC,MAAM,CAAC,MAAM,GAAG,
CAAC,EAAE;gBAC7B,MAAM,OAAO,GAAG,OAAO,CAAC,MAAM,CAAC,CAAC,CAAC,CAAC;gBAClC,MAAM,YAAY,GAAG,OAAO,CAAC,aAAa,CAAC,CAAC,CAAC,CAAC;gBAC9C,MAAM,KAAK,GAAG,CAAC,GAAG,OAAO,CAAC,MAAM,GAAG,CAAC,CAAC;gBAErC,OAAO;oBACL,OAAO;oBACP,KAAK;oBACL,GAAG,EAAE,CAAC,GAAG,CAAC;oBACV,YAAY;iBACb,CAAC;aACH;SACF;QAED,OAAO,IAAI,CAAC;IACd,CAAC;IAED;;OAEG;IACH,WAAW;QACT,OAAO,CAAC,GAAG,IAAI,CAAC,QAAQ,CAAC,CAAC;IAC5B,CAAC;IAED;;OAEG;IACH,KAAK;QACH,IAAI,CAAC,QAAQ,GAAG,EAAE,CAAC;QACnB,IAAI,CAAC,UAAU,CAAC,KAAK,EAAE,CAAC;QACxB,IAAI,CAAC,IAAI,GAAG,IAAI,CAAC,UAAU,EAAE,CAAC;QAC9B,IAAI,CAAC,QAAQ,GAAG,KAAK,CAAC;IACxB,CAAC;IAED;;OAEG;IACH,QAAQ;QAKN,MAAM,SAAS,GAAG,IAAI,CAAC,UAAU,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC;QAC7C,MAAM,oBAAoB,GACxB,IAAI,CAAC,QAAQ,CAAC,MAAM,GAAG,CAAC;YACtB,CAAC,CAAC,IAAI,CAAC,QAAQ,CAAC,MAAM,CAAC,CAAC,GAAG,EAAE,CAAC,EAAE,EAAE,CAAC,GAAG,GAAG,CAAC,CAAC,MAAM,EAAE,CAAC,CAAC;gBACnD,IAAI,CAAC,QAAQ,CAAC,MAAM;YACtB,CAAC,CAAC,CAAC,CAAC;QAER,OAAO;YACL,YAAY,EAAE,IAAI,CAAC,QAAQ,CAAC,MAAM;YAClC,SAAS;YACT,oBAAoB;SACrB,CAAC;IACJ,CAAC;IAED;;OAEG;IACK,UAAU,CAAC,IAAc;QAC/B,IAAI,KAAK,GAAG,CAAC,CAAC;QACd,KAAK,MAAM,KAAK,IAAI,IAAI,CAAC,QAAQ,CAAC,MAAM,EAAE,EAAE;YAC1C,KAAK,IAAI,IAAI,CAAC,UAAU,CAAC,KAAK,CAAC,CAAC;SACjC;QACD,OAAO,KAAK,CAAC;IACf,CAAC;CACF"}
|
|
@@ -16,11 +16,11 @@ export declare class BloomFilter {
|
|
|
16
16
|
*/
|
|
17
17
|
private calculateOptimalHashCount;
|
|
18
18
|
/**
|
|
19
|
-
* Hash function 1 (FNV-1a
|
|
19
|
+
* Hash function 1 (FNV-1a, 32-bit)
|
|
20
20
|
*/
|
|
21
21
|
private hash1;
|
|
22
22
|
/**
|
|
23
|
-
* Hash function 2 (djb2
|
|
23
|
+
* Hash function 2 (djb2, 32-bit)
|
|
24
24
|
*/
|
|
25
25
|
private hash2;
|
|
26
26
|
/**
|
|
@@ -22,25 +22,25 @@ export class BloomFilter {
|
|
|
22
22
|
return Math.ceil((m / n) * Math.log(2));
|
|
23
23
|
}
|
|
24
24
|
/**
|
|
25
|
-
* Hash function 1 (FNV-1a
|
|
25
|
+
* Hash function 1 (FNV-1a, 32-bit)
|
|
26
26
|
*/
|
|
27
27
|
hash1(item) {
|
|
28
28
|
let hash = 2166136261;
|
|
29
29
|
for (let i = 0; i < item.length; i++) {
|
|
30
30
|
hash ^= item.charCodeAt(i);
|
|
31
|
-
hash
|
|
31
|
+
hash = Math.imul(hash, 16777619);
|
|
32
32
|
}
|
|
33
|
-
return
|
|
33
|
+
return (hash >>> 0) % this.size;
|
|
34
34
|
}
|
|
35
35
|
/**
|
|
36
|
-
* Hash function 2 (djb2
|
|
36
|
+
* Hash function 2 (djb2, 32-bit)
|
|
37
37
|
*/
|
|
38
38
|
hash2(item) {
|
|
39
39
|
let hash = 5381;
|
|
40
40
|
for (let i = 0; i < item.length; i++) {
|
|
41
|
-
hash = (hash
|
|
41
|
+
hash = (Math.imul(hash, 33) + item.charCodeAt(i)) | 0;
|
|
42
42
|
}
|
|
43
|
-
return
|
|
43
|
+
return (hash >>> 0) % this.size;
|
|
44
44
|
}
|
|
45
45
|
/**
|
|
46
46
|
* Generate k hash values for an item using double hashing
|
|
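The hunk above switches both hash functions to explicit 32-bit arithmetic. The following standalone sketch shows the same pattern outside the class. The function names `fnv1a32`, `djb2_32`, and `bitIndex` are hypothetical and do not come from the package. The likely motivation is an assumption, not something the diff states: a plain `*` on JavaScript numbers works in double precision and can lose low-order bits once products exceed 2^53, whereas `Math.imul` multiplies with 32-bit wraparound.

```javascript
// Illustrative sketch only: standalone helpers, not the package's BloomFilter class.

// FNV-1a, 32-bit: XOR in each UTF-16 code unit, then multiply by the FNV prime.
// Math.imul performs the multiplication with 32-bit wraparound.
function fnv1a32(item) {
  let hash = 2166136261; // FNV offset basis (0x811c9dc5)
  for (let i = 0; i < item.length; i++) {
    hash ^= item.charCodeAt(i);
    hash = Math.imul(hash, 16777619); // FNV prime (0x01000193)
  }
  return hash >>> 0; // reinterpret the int32 result as unsigned
}

// djb2, 32-bit: hash * 33 + code unit, truncated to int32 on every step.
function djb2_32(item) {
  let hash = 5381;
  for (let i = 0; i < item.length; i++) {
    hash = (Math.imul(hash, 33) + item.charCodeAt(i)) | 0;
  }
  return hash >>> 0;
}

// Hypothetical helper: map an unsigned hash to a bit index in a filter of `size` bits.
function bitIndex(hash, size) {
  return hash % size;
}

console.log(fnv1a32("a").toString(16)); // e40c292c
console.log(djb2_32("a")); // 177670
```

The `>>> 0` before the modulo matters because `Math.imul` and `| 0` return signed values. Without it, a negative hash would produce a negative remainder and therefore an invalid bit index.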
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"bloom-filter.js","sourceRoot":"","sources":["../../src/algos/bloom-filter.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,MAAM,OAAO,WAAW;IAMtB,YAAY,aAAqB,EAAE,oBAA4B,IAAI;QAF3D,cAAS,GAAW,CAAC,CAAC;QAG5B,wCAAwC;QACxC,IAAI,CAAC,IAAI,GAAG,IAAI,CAAC,oBAAoB,CAAC,aAAa,EAAE,iBAAiB,CAAC,CAAC;QACxE,IAAI,CAAC,SAAS,GAAG,IAAI,CAAC,yBAAyB,CAAC,IAAI,CAAC,IAAI,EAAE,aAAa,CAAC,CAAC;QAC1E,IAAI,CAAC,QAAQ,GAAG,IAAI,UAAU,CAAC,IAAI,CAAC,IAAI,CAAC,IAAI,CAAC,IAAI,GAAG,CAAC,CAAC,CAAC,CAAC;IAC3D,CAAC;IAED;;OAEG;IACK,oBAAoB,CAAC,CAAS,EAAE,CAAS;QAC/C,OAAO,IAAI,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC;IAC1D,CAAC;IAED;;OAEG;IACK,yBAAyB,CAAC,CAAS,EAAE,CAAS;QACpD,OAAO,IAAI,CAAC,IAAI,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC;IAC1C,CAAC;IAED;;OAEG;IACK,KAAK,CAAC,IAAY;QACxB,IAAI,IAAI,GAAG,UAAU,CAAC;QACtB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YACpC,IAAI,IAAI,IAAI,CAAC,UAAU,CAAC,CAAC,CAAC,CAAC;YAC3B,IAAI,IAAI,QAAQ,CAAC;
|
|
1
|
+
{"version":3,"file":"bloom-filter.js","sourceRoot":"","sources":["../../src/algos/bloom-filter.ts"],"names":[],"mappings":"AAAA;;GAEG;AAEH,MAAM,OAAO,WAAW;IAMtB,YAAY,aAAqB,EAAE,oBAA4B,IAAI;QAF3D,cAAS,GAAW,CAAC,CAAC;QAG5B,wCAAwC;QACxC,IAAI,CAAC,IAAI,GAAG,IAAI,CAAC,oBAAoB,CAAC,aAAa,EAAE,iBAAiB,CAAC,CAAC;QACxE,IAAI,CAAC,SAAS,GAAG,IAAI,CAAC,yBAAyB,CAAC,IAAI,CAAC,IAAI,EAAE,aAAa,CAAC,CAAC;QAC1E,IAAI,CAAC,QAAQ,GAAG,IAAI,UAAU,CAAC,IAAI,CAAC,IAAI,CAAC,IAAI,CAAC,IAAI,GAAG,CAAC,CAAC,CAAC,CAAC;IAC3D,CAAC;IAED;;OAEG;IACK,oBAAoB,CAAC,CAAS,EAAE,CAAS;QAC/C,OAAO,IAAI,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC;IAC1D,CAAC;IAED;;OAEG;IACK,yBAAyB,CAAC,CAAS,EAAE,CAAS;QACpD,OAAO,IAAI,CAAC,IAAI,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC;IAC1C,CAAC;IAED;;OAEG;IACK,KAAK,CAAC,IAAY;QACxB,IAAI,IAAI,GAAG,UAAU,CAAC;QACtB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YACpC,IAAI,IAAI,IAAI,CAAC,UAAU,CAAC,CAAC,CAAC,CAAC;YAC3B,IAAI,GAAG,IAAI,CAAC,IAAI,CAAC,IAAI,EAAE,QAAQ,CAAC,CAAC;SAClC;QACD,OAAO,CAAC,IAAI,KAAK,CAAC,CAAC,GAAG,IAAI,CAAC,IAAI,CAAC;IAClC,CAAC;IAED;;OAEG;IACK,KAAK,CAAC,IAAY;QACxB,IAAI,IAAI,GAAG,IAAI,CAAC;QAChB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YACpC,IAAI,GAAG,CAAC,IAAI,CAAC,IAAI,CAAC,IAAI,EAAE,EAAE,CAAC,GAAG,IAAI,CAAC,UAAU,CAAC,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC;SACvD;QACD,OAAO,CAAC,IAAI,KAAK,CAAC,CAAC,GAAG,IAAI,CAAC,IAAI,CAAC;IAClC,CAAC;IAED;;OAEG;IACK,SAAS,CAAC,IAAY;QAC5B,MAAM,KAAK,GAAG,IAAI,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;QAC/B,MAAM,KAAK,GAAG,IAAI,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;QAC/B,MAAM,MAAM,GAAa,EAAE,CAAC;QAE5B,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,SAAS,EAAE,CAAC,EAAE,EAAE;YACvC,MAAM,IAAI,GAAG,CAAC,KAAK,GAAG,CAAC,GAAG,KAAK,CAAC,GAAG,IAAI,CAAC,IAAI,CAAC;YAC7C,MAAM,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,CAAC,IAAI,CAAC,CAAC,CAAC;SAC7B;QAED,OAAO,MAAM,CAAC;IAChB,CAAC;IAED;;OAEG;IACK,MAAM,CAAC,KAAa;QAC1B,MAAM,SAAS
,GAAG,IAAI,CAAC,KAAK,CAAC,KAAK,GAAG,CAAC,CAAC,CAAC;QACxC,MAAM,QAAQ,GAAG,KAAK,GAAG,CAAC,CAAC;QAC3B,IAAI,CAAC,QAAQ,CAAC,SAAS,CAAC,IAAI,CAAC,IAAI,QAAQ,CAAC;IAC5C,CAAC;IAED;;OAEG;IACK,MAAM,CAAC,KAAa;QAC1B,MAAM,SAAS,GAAG,IAAI,CAAC,KAAK,CAAC,KAAK,GAAG,CAAC,CAAC,CAAC;QACxC,MAAM,QAAQ,GAAG,KAAK,GAAG,CAAC,CAAC;QAC3B,OAAO,CAAC,IAAI,CAAC,QAAQ,CAAC,SAAS,CAAC,GAAG,CAAC,CAAC,IAAI,QAAQ,CAAC,CAAC,KAAK,CAAC,CAAC;IAC5D,CAAC;IAED;;OAEG;IACH,GAAG,CAAC,IAAY;QACd,MAAM,MAAM,GAAG,IAAI,CAAC,SAAS,CAAC,IAAI,CAAC,CAAC;QACpC,KAAK,MAAM,IAAI,IAAI,MAAM,EAAE;YACzB,IAAI,CAAC,MAAM,CAAC,IAAI,CAAC,CAAC;SACnB;QACD,IAAI,CAAC,SAAS,EAAE,CAAC;IACnB,CAAC;IAED;;OAEG;IACH,MAAM,CAAC,KAAe;QACpB,KAAK,MAAM,IAAI,IAAI,KAAK,EAAE;YACxB,IAAI,CAAC,GAAG,CAAC,IAAI,CAAC,CAAC;SAChB;IACH,CAAC;IAED;;OAEG;IACH,YAAY,CAAC,IAAY;QACvB,MAAM,MAAM,GAAG,IAAI,CAAC,SAAS,CAAC,IAAI,CAAC,CAAC;QACpC,KAAK,MAAM,IAAI,IAAI,MAAM,EAAE;YACzB,IAAI,CAAC,IAAI,CAAC,MAAM,CAAC,IAAI,CAAC,EAAE;gBACtB,OAAO,KAAK,CAAC;aACd;SACF;QACD,OAAO,IAAI,CAAC;IACd,CAAC;IAED;;OAEG;IACH,eAAe,CAAC,KAAe;QAC7B,OAAO,KAAK,CAAC,IAAI,CAAC,CAAC,IAAI,EAAE,EAAE,CAAC,IAAI,CAAC,YAAY,CAAC,IAAI,CAAC,CAAC,CAAC;IACvD,CAAC;IAED;;OAEG;IACH,MAAM,CAAC,KAAe;QACpB,OAAO,KAAK,CAAC,MAAM,CAAC,CAAC,IAAI,EAAE,EAAE,CAAC,IAAI,CAAC,YAAY,CAAC,IAAI,CAAC,CAAC,CAAC;IACzD,CAAC;IAED;;OAEG;IACH,KAAK;QACH,IAAI,CAAC,QAAQ,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;QACtB,IAAI,CAAC,SAAS,GAAG,CAAC,CAAC;IACrB,CAAC;IAED;;OAEG;IACH,2BAA2B;QACzB,MAAM,KAAK,GAAG,IAAI,CAAC,SAAS,GAAG,IAAI,CAAC,IAAI,CAAC;QACzC,OAAO,IAAI,CAAC,GAAG,CAAC,CAAC,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,IAAI,CAAC,SAAS,GAAG,KAAK,CAAC,EAAE,IAAI,CAAC,SAAS,CAAC,CAAC;IACzE,CAAC;IAED;;OAEG;IACH,QAAQ;QAQN,IAAI,OAAO,GAAG,CAAC,CAAC;QAChB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,IAAI,EAAE,CAAC,EAAE,EAAE;YAClC,IAAI,IAAI,CAAC,MAAM,CAAC,CAAC,CAAC,EAAE;gBAClB,OAAO,EAAE,CAAC;aACX;SACF;QAED,MAAM,UAAU,GAAG,OAAO,GAAG,IAAI,CAAC,IAAI,CAAC;QACvC,MAAM,0BAA0B,GAAG,IAAI,CAAC,GAAG,CAAC,UAAU,EAAE,IAAI,CAAC,SAAS,CAAC,CAAC;QAExE,OAAO;YACL,IAAI,EAAE,IAAI,CAAC,IAAI;YACf,SAAS,EAAE,IAAI,CAAC,SA
AS;YACzB,SAAS,EAAE,IAAI,CAAC,SAAS;YACzB,OAAO;YACP,UAAU;YACV,0BAA0B;SAC3B,CAAC;IACJ,CAAC;IAED;;OAEG;IACH,MAAM;QAMJ,OAAO;YACL,IAAI,EAAE,IAAI,CAAC,IAAI;YACf,SAAS,EAAE,IAAI,CAAC,SAAS;YACzB,SAAS,EAAE,IAAI,CAAC,SAAS;YACzB,QAAQ,EAAE,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,QAAQ,CAAC;SACpC,CAAC;IACJ,CAAC;IAED;;OAEG;IACH,MAAM,CAAC,QAAQ,CAAC,IAKf;QACC,MAAM,MAAM,GAAG,MAAM,CAAC,MAAM,CAAC,WAAW,CAAC,SAAS,CAAC,CAAC;QACpD,MAAM,CAAC,IAAI,GAAG,IAAI,CAAC,IAAI,CAAC;QACxB,MAAM,CAAC,SAAS,GAAG,IAAI,CAAC,SAAS,CAAC;QAClC,MAAM,CAAC,SAAS,GAAG,IAAI,CAAC,SAAS,CAAC;QAClC,MAAM,CAAC,QAAQ,GAAG,IAAI,UAAU,CAAC,IAAI,CAAC,QAAQ,CAAC,CAAC;QAChD,OAAO,MAAM,CAAC;IAChB,CAAC;IAED;;OAEG;IACH,KAAK,CAAC,KAAkB;QACtB,IAAI,IAAI,CAAC,IAAI,KAAK,KAAK,CAAC,IAAI,IAAI,IAAI,CAAC,SAAS,KAAK,KAAK,CAAC,SAAS,EAAE;YAClE,MAAM,IAAI,KAAK,CACb,sEAAsE,CACvE,CAAC;SACH;QAED,MAAM,MAAM,GAAG,IAAI,WAAW,CAAC,CAAC,EAAE,IAAI,CAAC,CAAC;QACxC,MAAM,CAAC,IAAI,GAAG,IAAI,CAAC,IAAI,CAAC;QACxB,MAAM,CAAC,SAAS,GAAG,IAAI,CAAC,SAAS,CAAC;QAClC,MAAM,CAAC,QAAQ,GAAG,IAAI,UAAU,CAAC,IAAI,CAAC,QAAQ,CAAC,MAAM,CAAC,CAAC;QACvD,MAAM,CAAC,SAAS,GAAG,IAAI,CAAC,SAAS,GAAG,KAAK,CAAC,SAAS,CAAC;QAEpD,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,QAAQ,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YAC7C,MAAM,CAAC,QAAQ,CAAC,CAAC,CAAC,GAAG,IAAI,CAAC,QAAQ,CAAC,CAAC,CAAC,GAAG,KAAK,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC;SAC3D;QAED,OAAO,MAAM,CAAC;IAChB,CAAC;IAED;;OAEG;IACH,SAAS,CAAC,KAAkB;QAC1B,IAAI,IAAI,CAAC,IAAI,KAAK,KAAK,CAAC,IAAI,IAAI,IAAI,CAAC,SAAS,KAAK,KAAK,CAAC,SAAS,EAAE;YAClE,MAAM,IAAI,KAAK,CACb,6EAA6E,CAC9E,CAAC;SACH;QAED,MAAM,MAAM,GAAG,IAAI,WAAW,CAAC,CAAC,EAAE,IAAI,CAAC,CAAC;QACxC,MAAM,CAAC,IAAI,GAAG,IAAI,CAAC,IAAI,CAAC;QACxB,MAAM,CAAC,SAAS,GAAG,IAAI,CAAC,SAAS,CAAC;QAClC,MAAM,CAAC,QAAQ,GAAG,IAAI,UAAU,CAAC,IAAI,CAAC,QAAQ,CAAC,MAAM,CAAC,CAAC;QACvD,MAAM,CAAC,SAAS,GAAG,IAAI,CAAC,GAAG,CAAC,IAAI,CAAC,SAAS,EAAE,KAAK,CAAC,SAAS,CAAC,CAAC;QAE7D,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,QAAQ,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE;YAC7C,MAAM,CAAC,QAAQ,CAAC,CAAC,CAAC,GAAG,IAAI,CAAC,QAAQ,CAAC,CAAC,CAAC,GAAG,KAAK,CAA
C,QAAQ,CAAC,CAAC,CAAC,CAAC;SAC3D;QAED,OAAO,MAAM,CAAC;IAChB,CAAC;CACF"}
|