bekindprofanityfilter 0.0.1 → 0.0.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +125 -123
- package/dist/cjs/index.js +55795 -0
- package/dist/cjs/package.json +1 -0
- package/dist/esm/algos/aho-corasick.js.map +1 -0
- package/dist/esm/algos/bloom-filter.js.map +1 -0
- package/dist/{algos → esm/algos}/context-patterns.js +15 -84
- package/dist/esm/algos/context-patterns.js.map +1 -0
- package/dist/{index.js → esm/index.js} +10 -1
- package/dist/esm/index.js.map +1 -0
- package/dist/esm/innocence-scoring.js.map +1 -0
- package/dist/esm/language-detector.js.map +1 -0
- package/dist/esm/language-dicts.js.map +1 -0
- package/dist/esm/languages/arabic-words.js.map +1 -0
- package/dist/esm/languages/bengali-words.js.map +1 -0
- package/dist/esm/languages/brazilian-words.js.map +1 -0
- package/dist/{languages → esm/languages}/chinese-words.js.map +1 -1
- package/dist/{languages → esm/languages}/english-primary-all-languages.js.map +1 -1
- package/dist/esm/languages/english-words.js.map +1 -0
- package/dist/esm/languages/french-words.js.map +1 -0
- package/dist/esm/languages/german-words.js.map +1 -0
- package/dist/esm/languages/hindi-words.js.map +1 -0
- package/dist/{languages → esm/languages}/innocent-words.js +2 -0
- package/dist/esm/languages/innocent-words.js.map +1 -0
- package/dist/esm/languages/italian-words.js.map +1 -0
- package/dist/esm/languages/japanese-words.js.map +1 -0
- package/dist/{languages → esm/languages}/korean-words.js.map +1 -1
- package/dist/esm/languages/russian-words.js.map +1 -0
- package/dist/esm/languages/spanish-words.js.map +1 -0
- package/dist/esm/languages/tamil-words.js.map +1 -0
- package/dist/esm/languages/telugu-words.js.map +1 -0
- package/dist/esm/romanization-detector.js.map +1 -0
- package/package.json +32 -19
- package/dist/algos/aho-corasick.js.map +0 -1
- package/dist/algos/bloom-filter.js.map +0 -1
- package/dist/algos/context-patterns.js.map +0 -1
- package/dist/index.js.map +0 -1
- package/dist/innocence-scoring.js.map +0 -1
- package/dist/language-detector.js.map +0 -1
- package/dist/language-dicts.js.map +0 -1
- package/dist/languages/arabic-words.js.map +0 -1
- package/dist/languages/bengali-words.js.map +0 -1
- package/dist/languages/brazilian-words.js.map +0 -1
- package/dist/languages/english-words.js.map +0 -1
- package/dist/languages/french-words.js.map +0 -1
- package/dist/languages/german-words.js.map +0 -1
- package/dist/languages/hindi-words.js.map +0 -1
- package/dist/languages/innocent-words.js.map +0 -1
- package/dist/languages/italian-words.js.map +0 -1
- package/dist/languages/japanese-words.js.map +0 -1
- package/dist/languages/russian-words.js.map +0 -1
- package/dist/languages/spanish-words.js.map +0 -1
- package/dist/languages/tamil-words.js.map +0 -1
- package/dist/languages/telugu-words.js.map +0 -1
- package/dist/romanization-detector.js.map +0 -1
- /package/dist/{algos → esm/algos}/aho-corasick.d.ts +0 -0
- /package/dist/{algos → esm/algos}/aho-corasick.js +0 -0
- /package/dist/{algos → esm/algos}/bloom-filter.d.ts +0 -0
- /package/dist/{algos → esm/algos}/bloom-filter.js +0 -0
- /package/dist/{algos → esm/algos}/context-patterns.d.ts +0 -0
- /package/dist/{index.d.ts → esm/index.d.ts} +0 -0
- /package/dist/{innocence-scoring.d.ts → esm/innocence-scoring.d.ts} +0 -0
- /package/dist/{innocence-scoring.js → esm/innocence-scoring.js} +0 -0
- /package/dist/{language-detector.d.ts → esm/language-detector.d.ts} +0 -0
- /package/dist/{language-detector.js → esm/language-detector.js} +0 -0
- /package/dist/{language-dicts.d.ts → esm/language-dicts.d.ts} +0 -0
- /package/dist/{language-dicts.js → esm/language-dicts.js} +0 -0
- /package/dist/{languages → esm/languages}/arabic-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/arabic-words.js +0 -0
- /package/dist/{languages → esm/languages}/bengali-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/bengali-words.js +0 -0
- /package/dist/{languages → esm/languages}/brazilian-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/brazilian-words.js +0 -0
- /package/dist/{languages → esm/languages}/chinese-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/chinese-words.js +0 -0
- /package/dist/{languages → esm/languages}/english-primary-all-languages.d.ts +0 -0
- /package/dist/{languages → esm/languages}/english-primary-all-languages.js +0 -0
- /package/dist/{languages → esm/languages}/english-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/english-words.js +0 -0
- /package/dist/{languages → esm/languages}/french-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/french-words.js +0 -0
- /package/dist/{languages → esm/languages}/german-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/german-words.js +0 -0
- /package/dist/{languages → esm/languages}/hindi-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/hindi-words.js +0 -0
- /package/dist/{languages → esm/languages}/innocent-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/italian-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/italian-words.js +0 -0
- /package/dist/{languages → esm/languages}/japanese-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/japanese-words.js +0 -0
- /package/dist/{languages → esm/languages}/korean-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/korean-words.js +0 -0
- /package/dist/{languages → esm/languages}/russian-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/russian-words.js +0 -0
- /package/dist/{languages → esm/languages}/spanish-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/spanish-words.js +0 -0
- /package/dist/{languages → esm/languages}/tamil-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/tamil-words.js +0 -0
- /package/dist/{languages → esm/languages}/telugu-words.d.ts +0 -0
- /package/dist/{languages → esm/languages}/telugu-words.js +0 -0
- /package/dist/{romanization-detector.d.ts → esm/romanization-detector.d.ts} +0 -0
- /package/dist/{romanization-detector.js → esm/romanization-detector.js} +0 -0
package/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# BeKind Profanity Filter
|
|
2
2
|
|
|
3
|
-
> Forked from [AllProfanity](https://github.com/ayush-jadaun/allprofanity) by Ayush Jadaun. Extended with **romanization profanity detection** (catches Hinglish, transliterated text), **language-aware innocence scoring** (ELD + trie-based detection prevents false positives for cross-language collisions like "
|
|
3
|
+
> Forked from [AllProfanity](https://github.com/ayush-jadaun/allprofanity) by Ayush Jadaun. Extended with **romanization profanity detection** (catches Hinglish, transliterated text), **language-aware innocence scoring** (ELD + trie-based detection prevents false positives for cross-language collisions like "got" in Turkish), and additional language dictionaries. Licensed under MIT.
|
|
4
4
|
|
|
5
5
|
> ⚠️ **Early-stage package in progress.** Features available in the original AllProfanity are being actively deprecated, adjusted, or replaced. API surface may change without notice. Contributions and suggestions greatly appreciated.
|
|
6
6
|
|
|
@@ -19,7 +19,7 @@ A multi-language profanity filter with romanization detection, language-aware in
|
|
|
19
19
|
|
|
20
20
|
- **Multi-Language Profanity Detection:** 34K+ word dictionary across 16 languages with 18-language detection trie
|
|
21
21
|
- **Romanization Detection:** Catches Hinglish, transliterated Bengali, Tamil, Telugu, and Japanese
|
|
22
|
-
- **Cross-Language Innocence Scoring:** Handles words like "
|
|
22
|
+
- **Cross-Language Innocence Scoring:** Handles words like "got" (Turkish: "buttocks") and "fart" (Norwegian: "speed")
|
|
23
23
|
- **Context-Aware Analysis:** Booster/reducer patterns detect sexual context, negation, medical usage, and quoted speech
|
|
24
24
|
- **Leet-Speak Detection:** Catches obfuscated profanity (`f#ck`, `a55hole`, `sh1t`)
|
|
25
25
|
- **Word Boundary Detection:** Smart whole-word matching prevents flagging "assassin" or "assistance"
|
|
@@ -63,7 +63,7 @@ A multi-language profanity filter with romanization detection, language-aware in
|
|
|
63
63
|
|
|
64
64
|
---
|
|
65
65
|
|
|
66
|
-
> **Forked from [
|
|
66
|
+
> **Forked from [AllProfanity](https://github.com/ayush-jadaun/allprofanity)** by Ayush Jadaun. Extended with **romanization profanity detection** (catches Hinglish, transliterated text), **language-aware innocence scoring** (ELD + trie-based detection prevents false positives for cross-language collisions like "got" in Turkish), and additional language dictionaries. Licensed under MIT.
|
|
67
67
|
|
|
68
68
|
## Installation
|
|
69
69
|
|
|
@@ -193,85 +193,88 @@ const filter = new BeKind({
|
|
|
193
193
|
});
|
|
194
194
|
```
|
|
195
195
|
|
|
196
|
-
###
|
|
196
|
+
### Alternative Library Comparison
|
|
197
|
+
|
|
198
|
+
The main strength of be-kind comes from its dictionary and knowledge base. To give a fair comparison, **all benchmarks below inject be-kind's full 34K-word dictionary into every alternative library**, so the results compare **matching engines and detection features**, not dictionary coverage.
|
|
197
199
|
|
|
198
200
|
Benchmarked on a single CPU core (pinned via `taskset -c 0`). All numbers are **ops/second — higher is better**.
|
|
199
201
|
|
|
200
|
-
>
|
|
202
|
+
> [leo-profanity](https://github.com/jojoee/leo-profanity) ships with ~400 English words, [bad-words](https://github.com/web-mech/badwords) ships with ~400 English words, and [glin-profanity](https://www.glincker.com/tools/glin-profanity) loads its own 24-language dictionaries — all receive be-kind's 34K dictionary on top.
|
|
201
203
|
|
|
202
204
|
| Library | Languages (out-of-the-box) | Leet-speak | Repeat compression | Context-aware |
|
|
203
205
|
|---------|--------------------------|-----------|-------------------|--------------|
|
|
204
206
|
| **be-kind** | 16 profanity dicts + 18-lang detection trie | ✅ | 🚧 planned | ✅ (certainty-delta) |
|
|
205
207
|
| **be-kind (ctx)** | same as be-kind | ✅ | 🚧 planned | ✅ (boosters + reducers) |
|
|
206
208
|
| [leo-profanity](https://github.com/jojoee/leo-profanity) + dict | 16 (via be-kind dict injection) | ❌ | ❌ | ❌ |
|
|
207
|
-
| [bad-words](https://github.com/web-mech/badwords) |
|
|
208
|
-
| [glin-profanity](https://www.glincker.com/tools/glin-profanity) | 24 | ✅ (3 levels) | ✅ | ✅ (heuristic) |
|
|
209
|
-
|
|
210
|
-
**Speed benchmark** — ops/second on a single CPU core (`taskset -c 0`), higher is better:
|
|
211
|
-
|
|
212
|
-
| Test | be-kind | be-kind (ctx) | leo | bad-words | glin (basic) | glin (enhanced) |
|
|
213
|
-
|
|
214
|
-
| check — clean (short) | 2,
|
|
215
|
-
| check — profane (short) | 2,
|
|
216
|
-
| check — leet-speak | 1,
|
|
217
|
-
| clean — profane (short) | 2,
|
|
218
|
-
| check — 500-char clean |
|
|
219
|
-
| check — 500-char profane |
|
|
220
|
-
| check — 2,500-char clean |
|
|
221
|
-
| check — 2,500-char profane |
|
|
209
|
+
| [bad-words](https://github.com/web-mech/badwords) + dict | 16 (via be-kind dict injection) | ❌ | ❌ | ❌ |
|
|
210
|
+
| [glin-profanity](https://www.glincker.com/tools/glin-profanity) + dict | 24 + be-kind dict | ✅ (3 levels) | ✅ | ✅ (heuristic) |
|
|
211
|
+
|
|
212
|
+
**Speed benchmark** — ops/second on a single CPU core (`taskset -c 0`), higher is better. All competitors have be-kind's 34K dictionary injected:
|
|
213
|
+
|
|
214
|
+
| Test | be-kind | be-kind (ctx) | leo + dict | bad-words + dict | glin (basic) | glin (enhanced) |
|
|
215
|
+
|------|--------:|--------------:|-----------:|-----------------:|-------------:|----------------:|
|
|
216
|
+
| check — clean (short) | 2,625 | 3,007 | 932,597 | 29 | 68 | 68 |
|
|
217
|
+
| check — profane (short) | 2,556 | 2,251 | 1,424,984 | 27 | 3,602 | 3,333 |
|
|
218
|
+
| check — leet-speak | 1,407 | 1,324 | 1,540,700 | 26 | 2,791 | 4,350 |
|
|
219
|
+
| clean — profane (short) | 2,499 | 2,243 | 372,049 | 2 | N/A | N/A |
|
|
220
|
+
| check — 500-char clean | 409 | 427 | 110,318 | 17 | 21 | 22 |
|
|
221
|
+
| check — 500-char profane | 357 | 314 | 217,347 | 17 | 828 | 718 |
|
|
222
|
+
| check — 2,500-char clean | 88 | 90 | 21,727 | 10 | 6 | 6 |
|
|
223
|
+
| check — 2,500-char profane | 79 | 69 | 47,966 | 9 | 192 | 165 |
|
|
222
224
|
|
|
223
225
|
**Library versions tested:** `leo-profanity@1.9.0`, `bad-words@4.0.0`, `glin-profanity@3.3.0`
|
|
224
226
|
|
|
225
227
|
**Notes:**
|
|
226
|
-
- **
|
|
227
|
-
-
|
|
228
|
-
- `
|
|
229
|
-
- `
|
|
230
|
-
- `
|
|
228
|
+
- **All competitors have be-kind's 34K dictionary injected** to isolate matching-engine performance from dictionary coverage.
|
|
229
|
+
- **be-kind** is **~39x faster than glin** on clean short text (2,625 vs 68 ops/s) with the same vocabulary. be-kind uses a **trie** (O(input_length) matching), while glin uses **linear scanning** (`for (const word of this.words.keys())` — O(dict_size * input_length)).
|
|
230
|
+
- `be-kind (ctx)` adds at most ~10-15% overhead over default be-kind on profane inputs (and is occasionally slightly faster on clean inputs — see the clean-text rows above) — context analysis (certainty-delta pattern matching) is cheap.
|
|
231
|
+
- `leo + dict` is the fastest by a large margin but offers **no leet-speak, no context analysis, and no repeat compression** — it's a simple substring matcher. Its speed advantage comes from a flat array lookup with no normalization overhead.
|
|
232
|
+
- `bad-words + dict` demonstrates the regex bottleneck catastrophically: 29 ops/s on clean short text vs 2,625 for be-kind — a **~90x slowdown**. bad-words creates a new `RegExp` per word in a `.filter()` loop ([source](https://github.com/web-mech/badwords/blob/master/src/badwords.ts#L91-L103)) — no short-circuiting, so clean and profane text perform identically (~27 ops/s). `clean()` drops to 2 ops/s (vs 2,499 for be-kind). This makes bad-words unsuitable for large multilingual dictionaries.
|
|
233
|
+
- **glin with dict** collapses to 68 ops/s on clean short text (vs 2,625 for be-kind) — a **~39x slowdown** — demonstrating the linear-scan bottleneck at scale. glin short-circuits on first match, which explains the ~53x speedup on profane text (3,602 ops/s) vs clean text (68 ops/s).
|
|
231
234
|
- be-kind is the only library with cross-language innocence scoring, romanization support, and context-aware certainty adjustment.
|
|
232
235
|
|
|
233
|
-
Run the benchmark yourself:
|
|
236
|
+
Run the speed benchmark yourself:
|
|
234
237
|
```bash
|
|
235
238
|
taskset -c 0 bun run benchmark:competitors
|
|
236
239
|
```
|
|
237
240
|
|
|
238
241
|
### Accuracy Comparison
|
|
239
242
|
|
|
240
|
-
Measures TP rate (recall), FP rate, and F1 across eight test categories (225 labeled cases, dataset v6). All libraries are tested against all categories — no exemptions. **Higher F1 and lower FP rate are better.**
|
|
243
|
+
Measures TP rate (recall), FP rate, and F1 across eight test categories (225 labeled cases, dataset v6). All alternative libraries have be-kind's 34K dictionary injected. All libraries are tested against all categories — no exemptions. **Higher F1 and lower FP rate are better.**
|
|
241
244
|
|
|
242
245
|
> **Bias disclaimer:** This dataset was created by the be-kind team. Non-English cases were likely drawn from or verified against be-kind's own dictionary, which advantages be-kind on those categories. To partially offset this, the dataset includes independent test cases from [glin-profanity's upstream test suite](https://github.com/GLINCKER/glin-profanity/tree/release/tests) and adversarial false-positive cases specifically chosen to expose known be-kind failures. We strongly recommend running this benchmark against your own dataset before drawing conclusions.
|
|
243
246
|
|
|
244
|
-
> **Note:** `be-kind (sensitive)` = `sensitiveMode: true` (flags AMBIVALENT words too). `be-kind (ctx)` = `contextAnalysis.enabled: true`. `glin (collapsed)` = glin (basic) with `collapseRepeatedCharacters()` pre-processing.
|
|
247
|
+
> **Note:** `be-kind (sensitive)` = `sensitiveMode: true` (flags AMBIVALENT words too). `be-kind (ctx)` = `contextAnalysis.enabled: true`. `glin (collapsed) + dict` = glin (basic) + dict with `collapseRepeatedCharacters()` pre-processing. All alternative libraries have be-kind's 34K dictionary injected.
|
|
245
248
|
|
|
246
249
|
#### Single-language detection — 65 cases (English incl. leetspeak, French, German, Spanish, Hindi)
|
|
247
250
|
|
|
248
251
|
| Library | Recall | Precision | FP Rate | F1 |
|
|
249
252
|
|---|---|---|---|---|
|
|
250
253
|
| be-kind (sensitive) | 100% | 100% | 0% | **1.00** |
|
|
254
|
+
| bad-words + dict | 88% | 100% | 0% | 0.94 |
|
|
255
|
+
| glin (enhanced) + dict | 88% | 100% | 0% | 0.94 |
|
|
256
|
+
| glin (collapsed) + dict | 86% | 100% | 0% | 0.92 |
|
|
251
257
|
| leo + dict | 82% | 100% | 0% | 0.90 |
|
|
252
258
|
| be-kind | 80% | 100% | 0% | 0.89 |
|
|
253
259
|
| be-kind (ctx) | 80% | 100% | 0% | 0.89 |
|
|
254
|
-
| glin (enhanced) | 72% | 100% | 0% | 0.84 |
|
|
255
|
-
| glin (collapsed) | 72% | 100% | 0% | 0.84 |
|
|
256
|
-
| bad-words | 52% | 100% | 0% | 0.68 |
|
|
257
260
|
|
|
258
|
-
>
|
|
261
|
+
> With be-kind's 34K dictionary injected, all alternatives improve dramatically. `bad-words + dict` and `glin (enhanced) + dict` both reach 88% recall (up from 52% and 72% without dict). be-kind in default mode misses mild words (`damn`, `hell`); `sensitiveMode: true` catches these. All libraries achieve 100% precision — when they flag something, it's always correct.
|
|
259
262
|
|
|
260
263
|
#### False positives / innocent words — 48 cases (clean only, lower FP rate is better)
|
|
261
264
|
|
|
262
|
-
Includes adversarial cases (`cum laude`, `Dick Van Dyke`, culinary `faggots`,
|
|
265
|
+
Includes adversarial cases (`cum laude`, `Dick Van Dyke`, culinary `faggots`, Turkish `got`). Recall and F1 are undefined (no profane cases).
|
|
263
266
|
|
|
264
267
|
| Library | FP Rate |
|
|
265
268
|
|---|---|
|
|
266
|
-
|
|
|
267
|
-
|
|
|
268
|
-
| be-kind (ctx) | 21% |
|
|
269
|
-
| bad-words | 23% |
|
|
270
|
-
| leo + dict | 25% |
|
|
269
|
+
| leo + dict | **25%** |
|
|
270
|
+
| be-kind (ctx) | **25%** |
|
|
271
271
|
| be-kind | 27% |
|
|
272
272
|
| be-kind (sensitive) | 31% |
|
|
273
|
+
| glin (enhanced) + dict | 31% |
|
|
274
|
+
| glin (collapsed) + dict | 31% |
|
|
275
|
+
| bad-words + dict | 33% |
|
|
273
276
|
|
|
274
|
-
>
|
|
277
|
+
> With the full 34K dictionary injected, glin and bad-words now produce more false positives than before — their FP rates rise to 31-33% due to the larger vocabulary. `be-kind (ctx)` ties with `leo + dict` for the lowest FP rate (25%) thanks to context-aware certainty adjustment. be-kind's FP rate remains a significant weakness, but context analysis helps.
|
|
275
278
|
|
|
276
279
|
#### Multi-language detection — 26 cases (Hinglish, French, German, Spanish, mixed)
|
|
277
280
|
|
|
@@ -280,96 +283,98 @@ Includes adversarial cases (`cum laude`, `Dick Van Dyke`, culinary `faggots`, Sw
|
|
|
280
283
|
| be-kind | 100% | 100% | 0% | **1.00** |
|
|
281
284
|
| be-kind (sensitive) | 100% | 100% | 0% | **1.00** |
|
|
282
285
|
| leo + dict | 100% | 100% | 0% | **1.00** |
|
|
283
|
-
|
|
|
284
|
-
| glin (enhanced) |
|
|
285
|
-
|
|
|
286
|
-
|
|
|
286
|
+
| bad-words + dict | 100% | 100% | 0% | **1.00** |
|
|
287
|
+
| glin (enhanced) + dict | 100% | 100% | 0% | **1.00** |
|
|
288
|
+
| be-kind (ctx) | 100% | 100% | 0% | **1.00** |
|
|
289
|
+
| glin (collapsed) + dict | 100% | 100% | 0% | **1.00** |
|
|
287
290
|
|
|
288
|
-
> With be-kind's dictionary injected,
|
|
291
|
+
> With be-kind's 34K dictionary injected, **every library achieves 100% recall** — proving the dictionary is the sole differentiator for multi-language detection. The matching engine doesn't matter when the vocabulary is comprehensive enough.
|
|
289
292
|
|
|
290
293
|
#### Romanization — 30 cases (Hinglish, Bengali, Tamil, Telugu, Japanese)
|
|
291
294
|
|
|
292
295
|
| Library | Recall | Precision | FP Rate | F1 |
|
|
293
296
|
|---|---|---|---|---|
|
|
297
|
+
| glin (enhanced) + dict | 85% | 81% | 40% | **0.83** |
|
|
294
298
|
| leo + dict | 75% | 94% | 10% | **0.83** |
|
|
295
299
|
| be-kind | 80% | 84% | 30% | 0.82 |
|
|
296
300
|
| be-kind (sensitive) | 80% | 84% | 30% | 0.82 |
|
|
297
301
|
| be-kind (ctx) | 80% | 84% | 30% | 0.82 |
|
|
298
|
-
|
|
|
299
|
-
| glin (collapsed) |
|
|
300
|
-
| bad-words | 0% | 0% | 10% | — |
|
|
302
|
+
| bad-words + dict | 80% | 84% | 30% | 0.82 |
|
|
303
|
+
| glin (collapsed) + dict | 80% | 84% | 30% | 0.82 |
|
|
301
304
|
|
|
302
|
-
>
|
|
305
|
+
> With dict injection, `glin (enhanced) + dict` achieves the **highest recall** (85%) on romanization — glin's leet-speak detection catches additional transliterated variants. However, its FP rate (40%) is also the highest. `leo + dict` achieves the same F1 (0.83) with much better precision (94%) and lowest FP (10%). be-kind, bad-words + dict, and glin (collapsed) + dict all tie at 80% recall / 30% FP / F1=0.82, showing that the dictionary drives most romanization detection — not the matching engine.
|
|
303
306
|
|
|
304
307
|
#### Semantic context — 25 cases
|
|
305
308
|
|
|
306
309
|
| Library | Recall | Precision | FP Rate | F1 |
|
|
307
310
|
|---|---|---|---|---|
|
|
308
|
-
| be-kind (ctx) | 80% | 73% | 20% | **0.76** |
|
|
309
311
|
| leo + dict | 100% | 59% | 47% | 0.74 |
|
|
310
|
-
|
|
|
311
|
-
| glin (collapsed) | 90% | 53% | 53% | 0.67 |
|
|
312
|
+
| bad-words + dict | 100% | 48% | 73% | 0.65 |
|
|
312
313
|
| be-kind (sensitive) | 100% | 48% | 73% | 0.65 |
|
|
313
|
-
|
|
|
314
|
+
| glin (enhanced) + dict | 100% | 48% | 73% | 0.65 |
|
|
315
|
+
| glin (collapsed) + dict | 100% | 48% | 73% | 0.65 |
|
|
316
|
+
| be-kind (ctx) | 80% | 62% | 47% | 0.64 |
|
|
314
317
|
| be-kind | 80% | 47% | 60% | 0.59 |
|
|
315
318
|
|
|
316
|
-
> Semantic context is where all libraries struggle — precision drops below 50% for most. Cases include metalinguistic uses
|
|
319
|
+
> Semantic context is where all libraries struggle — precision drops below 50% for most. Cases include metalinguistic uses, negation, and medical context. With dict injection, bad-words + dict and glin now achieve 100% recall but at the cost of 73% FP rate. `be-kind (ctx)` trades lower recall (80%) for better precision (62%) and a lower FP rate (47%) via context-aware certainty adjustment — boosters confirm profane intent, reducers detect innocent contexts like proper nouns and medical terms.
|
|
317
320
|
|
|
318
|
-
#### Repeated character evasion — 5 cases (
|
|
321
|
+
#### Repeated character evasion — 5 cases (elongated profanity)
|
|
319
322
|
|
|
320
323
|
No clean cases in this category — FP rate is undefined.
|
|
321
324
|
|
|
322
325
|
| Library | Recall | Precision |
|
|
323
326
|
|---|---|---|
|
|
324
|
-
| glin (enhanced) | **100%** | 100% |
|
|
325
|
-
| glin (collapsed) | 40% | 100% |
|
|
327
|
+
| glin (enhanced) + dict | **100%** | 100% |
|
|
328
|
+
| glin (collapsed) + dict | 40% | 100% |
|
|
326
329
|
| be-kind | 0% | — |
|
|
327
330
|
| be-kind (sensitive) | 0% | — |
|
|
328
331
|
| be-kind (ctx) | 0% | — |
|
|
329
332
|
| leo + dict | 0% | — |
|
|
330
|
-
| bad-words | 0% | — |
|
|
333
|
+
| bad-words + dict | 0% | — |
|
|
331
334
|
|
|
332
|
-
#### Concatenated / no-space evasion — 7 cases (
|
|
335
|
+
#### Concatenated / no-space evasion — 7 cases (profanity embedded in concatenated strings)
|
|
333
336
|
|
|
334
337
|
| Library | Recall | Precision | FP Rate | F1 |
|
|
335
338
|
|---|---|---|---|---|
|
|
336
339
|
| be-kind | 20% | 100% | 0% | 0.33 |
|
|
337
340
|
| be-kind (sensitive) | 20% | 100% | 0% | 0.33 |
|
|
338
341
|
| be-kind (ctx) | 20% | 100% | 0% | 0.33 |
|
|
342
|
+
| bad-words + dict | 20% | 100% | 0% | 0.33 |
|
|
343
|
+
| glin (enhanced) + dict | 20% | 100% | 0% | 0.33 |
|
|
344
|
+
| glin (collapsed) + dict | 20% | 100% | 0% | 0.33 |
|
|
339
345
|
| leo + dict | 0% | — | 0% | — |
|
|
340
|
-
| bad-words | 0% | — | 0% | — |
|
|
341
|
-
| glin (enhanced) | 0% | — | 0% | — |
|
|
342
|
-
| glin (collapsed) | 0% | — | 0% | — |
|
|
343
346
|
|
|
344
347
|
#### Challenge cases — 19 cases (semantic disambiguation, embedded substrings, separator evasion)
|
|
345
348
|
|
|
346
|
-
Hard problems: `cock` as rooster, `ass` as donkey,
|
|
349
|
+
Hard problems: `cock` as rooster, `ass` as donkey, Turkish `got` = "buttocks" vs English "got", profanity in concatenated strings, and separator-spaced evasion (`f u c k`, `f_u*c k`, `a.s.s.h.o.l.e`).
|
|
347
350
|
|
|
348
351
|
| Library | Recall | Precision | FP Rate | F1 |
|
|
349
352
|
|---|---|---|---|---|
|
|
350
|
-
| be-kind (ctx) | 60% | 75% |
|
|
353
|
+
| be-kind (ctx) | 60% | 75% | 33% | **0.63** |
|
|
351
354
|
| be-kind | 60% | 60% | 44% | 0.60 |
|
|
352
355
|
| be-kind (sensitive) | 60% | 60% | 44% | 0.60 |
|
|
353
|
-
| glin (enhanced) |
|
|
356
|
+
| glin (enhanced) + dict | 60% | 60% | 44% | 0.60 |
|
|
357
|
+
| bad-words + dict | 50% | 56% | 44% | 0.53 |
|
|
358
|
+
| glin (collapsed) + dict | 50% | 56% | 44% | 0.53 |
|
|
354
359
|
| leo + dict | 20% | 50% | 22% | 0.29 |
|
|
355
|
-
| bad-words | 20% | 33% | 44% | 0.25 |
|
|
356
|
-
| glin (collapsed) | 0% | 0% | 44% | — |
|
|
357
360
|
|
|
358
|
-
> be-kind (ctx)
|
|
361
|
+
> be-kind (ctx) achieves the best F1 on challenge cases thanks to context-aware certainty adjustment — recognizing innocent contexts like "cock crowed at dawn" and "wild ass is an equine." With dict injection, glin (enhanced) + dict now matches be-kind's recall (60%) but at higher FP (44% vs 33%). Separator-spaced evasion cases (`f u c k`, `f_u*c k`, mixed separators) test features that no alternative library supports. These cases still require semantic understanding that no dictionary-based filter can fully solve — the strongest argument for LLM-assisted moderation as a second pass.
|
|
359
362
|
|
|
360
363
|
#### Overall summary — micro-averaged across all 225 cases
|
|
361
364
|
|
|
365
|
+
All alternative libraries have be-kind's 34K dictionary injected.
|
|
366
|
+
|
|
362
367
|
| Library | Recall | Precision | FP Rate | F1 | TP | FN | FP | TN |
|
|
363
368
|
|---|---|---|---|---|---|---|---|---|
|
|
364
|
-
| be-kind (sensitive) | **86%** | 76% | 32% | 0.81 | 104 | 17 | 33 | 71 |
|
|
365
|
-
|
|
|
369
|
+
| be-kind (sensitive) | **86%** | 76% | 32% | **0.81** | 104 | 17 | 33 | 71 |
|
|
370
|
+
| glin (enhanced) + dict | **86%** | 75% | 33% | 0.80 | 104 | 17 | 34 | 70 |
|
|
371
|
+
| glin (collapsed) + dict | 81% | 75% | 32% | 0.78 | 98 | 23 | 33 | 71 |
|
|
372
|
+
| bad-words + dict | 80% | 74% | 33% | 0.77 | 97 | 24 | 34 | 70 |
|
|
373
|
+
| leo + dict | 74% | 80% | 21% | 0.77 | 89 | 32 | 22 | 82 |
|
|
374
|
+
| be-kind (ctx) | 76% | **79%** | **24%** | 0.77 | 92 | 29 | 25 | 79 |
|
|
366
375
|
| be-kind | 76% | 76% | 28% | 0.76 | 92 | 29 | 29 | 75 |
|
|
367
|
-
| leo + dict | 74% | 80% | 21% | 0.76 | 89 | 32 | 22 | 82 |
|
|
368
|
-
| glin (enhanced) | 63% | 78% | 21% | 0.70 | 76 | 45 | 22 | 82 |
|
|
369
|
-
| glin (collapsed) | 58% | 77% | 20% | 0.66 | 70 | 51 | 21 | 83 |
|
|
370
|
-
| bad-words | 42% | 65% | 26% | 0.51 | 51 | 70 | 27 | 77 |
|
|
371
376
|
|
|
372
|
-
> Micro-averaged: all 225 cases (121 profane, 104 clean) aggregated into one confusion matrix per library, then recall/precision/F1 computed once. No category weighting artifacts.
|
|
377
|
+
> Micro-averaged: all 225 cases (121 profane, 104 clean) aggregated into one confusion matrix per library, then recall/precision/F1 computed once. No category weighting artifacts. With be-kind's dictionary injected, **glin (enhanced) + dict matches be-kind (sensitive) on recall (86%)** and nearly matches on F1 (0.80 vs 0.81) — proving the dictionary is the core differentiator, not the matching engine. `leo + dict` and `be-kind (ctx)` lead the field on precision (80% and 79%) and FP rate (21% and 24%). be-kind (ctx) achieves this through context-aware certainty adjustment; leo achieves it through simpler matching that avoids over-triggering.
|
|
373
378
|
|
|
374
379
|
Run the accuracy benchmark yourself:
|
|
375
380
|
```bash
|
|
@@ -386,7 +391,7 @@ Returns `true` if the text contains any profanity.
|
|
|
386
391
|
|
|
387
392
|
```typescript
|
|
388
393
|
profanity.check('This is a clean sentence.'); // false
|
|
389
|
-
profanity.check('This is a
|
|
394
|
+
profanity.check('This is a b*llsh*t sentence.'); // true
|
|
390
395
|
profanity.check('What the f#ck is this?'); // true (leet-speak)
|
|
391
396
|
profanity.check('यह एक चूतिया परीक्षण है।'); // true (Hindi)
|
|
392
397
|
```
|
|
@@ -404,9 +409,9 @@ Returns a detailed result:
|
|
|
404
409
|
- `positions: Array<{ word: string, start: number, end: number }>`
|
|
405
410
|
|
|
406
411
|
```typescript
|
|
407
|
-
const result = profanity.detect('This is
|
|
412
|
+
const result = profanity.detect('This is f**king b*llsh*t and chutiya.');
|
|
408
413
|
console.log(result.hasProfanity); // true
|
|
409
|
-
console.log(result.detectedWords); // ['
|
|
414
|
+
console.log(result.detectedWords); // ['f**king', 'b*llsh*t', 'chutiya']
|
|
410
415
|
console.log(result.severity); // 3 (SEVERE)
|
|
411
416
|
console.log(result.cleanedText); // "This is ******* ******** and ******."
|
|
412
417
|
console.log(result.positions); // e.g. [{word: 'fucking', start: 8, end: 15}, ...]
|
|
@@ -419,8 +424,8 @@ console.log(result.positions); // e.g. [{word: 'fucking', start: 8, end: 15}, ..
|
|
|
419
424
|
Replace each character of profane words with a placeholder (default: `*`).
|
|
420
425
|
|
|
421
426
|
```typescript
|
|
422
|
-
profanity.clean('This contains
|
|
423
|
-
profanity.clean('This contains
|
|
427
|
+
profanity.clean('This contains b*llsh*t.'); // "This contains ********."
|
|
428
|
+
profanity.clean('This contains b*llsh*t.', '#'); // "This contains ########."
|
|
424
429
|
profanity.clean('यह एक चूतिया परीक्षण है।'); // e.g. "यह एक ***** परीक्षण है।"
|
|
425
430
|
```
|
|
426
431
|
|
|
@@ -432,8 +437,8 @@ Replace each profane word with a single placeholder (default: `***`).
|
|
|
432
437
|
(If the placeholder is omitted, uses `***`.)
|
|
433
438
|
|
|
434
439
|
```typescript
|
|
435
|
-
profanity.cleanWithPlaceholder('This contains
|
|
436
|
-
profanity.cleanWithPlaceholder('This contains
|
|
440
|
+
profanity.cleanWithPlaceholder('This contains b*llsh*t.'); // "This contains ***."
|
|
441
|
+
profanity.cleanWithPlaceholder('This contains b*llsh*t.', '[CENSORED]'); // "This contains [CENSORED]."
|
|
437
442
|
profanity.cleanWithPlaceholder('यह एक चूतिया परीक्षण है।', '####'); // e.g. "यह एक #### परीक्षण है।"
|
|
438
443
|
```
|
|
439
444
|
|
|
@@ -459,8 +464,8 @@ profanity.check('Qué puta situación.'); // true
|
|
|
459
464
|
Remove a word or an array of words from the profanity filter.
|
|
460
465
|
|
|
461
466
|
```typescript
|
|
462
|
-
profanity.remove('
|
|
463
|
-
profanity.check('This is
|
|
467
|
+
profanity.remove('b*llsh*t');
|
|
468
|
+
profanity.check('This is b*llsh*t.'); // false
|
|
464
469
|
|
|
465
470
|
profanity.remove(['mierda', 'puta']);
|
|
466
471
|
profanity.check('Esto es mierda.'); // false
|
|
@@ -473,11 +478,11 @@ profanity.check('Esto es mierda.'); // false
|
|
|
473
478
|
Whitelist words so they are never flagged as profane.
|
|
474
479
|
|
|
475
480
|
```typescript
|
|
476
|
-
profanity.addToWhitelist(['
|
|
477
|
-
profanity.check('He is an
|
|
478
|
-
profanity.check('
|
|
481
|
+
profanity.addToWhitelist(['f**k', 'idiot','sh*t']);
|
|
482
|
+
profanity.check('He is a f**king idiot.'); // false
|
|
483
|
+
profanity.check('F**k this sh*t.'); // false
|
|
479
484
|
// Remove from whitelist to restore detection
|
|
480
|
-
profanity.removeFromWhitelist(['
|
|
485
|
+
profanity.removeFromWhitelist(['f**k', 'idiot','sh*t']);
|
|
481
486
|
```
|
|
482
487
|
|
|
483
488
|
---
|
|
@@ -498,7 +503,7 @@ Set the default placeholder character for `clean()`.
|
|
|
498
503
|
|
|
499
504
|
```typescript
|
|
500
505
|
profanity.setPlaceholder('#');
|
|
501
|
-
profanity.clean('This is
|
|
506
|
+
profanity.clean('This is b*llsh*t.'); // "This is ########."
|
|
502
507
|
profanity.setPlaceholder('*'); // Reset to default
|
|
503
508
|
```
|
|
504
509
|
|
|
@@ -511,7 +516,7 @@ Options include: `enableLeetSpeak`, `caseSensitive`, `strictMode`, `detectPartia
|
|
|
511
516
|
|
|
512
517
|
```typescript
|
|
513
518
|
profanity.updateConfig({ caseSensitive: true, enableLeetSpeak: false });
|
|
514
|
-
profanity.check('
|
|
519
|
+
profanity.check('F**K'); // false (if caseSensitive)
|
|
515
520
|
profanity.updateConfig({ caseSensitive: false, enableLeetSpeak: true });
|
|
516
521
|
profanity.check('f#ck'); // true
|
|
517
522
|
```
|
|
@@ -591,9 +596,9 @@ Remove all loaded languages and dynamic words (start with a clean filter).
|
|
|
591
596
|
|
|
592
597
|
```typescript
|
|
593
598
|
profanity.clearList();
|
|
594
|
-
profanity.check('
|
|
599
|
+
profanity.check('f**k'); // false
|
|
595
600
|
profanity.loadLanguage('english');
|
|
596
|
-
profanity.check('
|
|
601
|
+
profanity.check('f**k'); // true
|
|
597
602
|
```
|
|
598
603
|
|
|
599
604
|
---
|
|
@@ -723,7 +728,7 @@ Edit `bekindprofanityfilter.config.json` to enable/disable features. Your IDE wi
|
|
|
723
728
|
|
|
724
729
|
## Cross-Language Innocence Scoring
|
|
725
730
|
|
|
726
|
-
Many words are profane in one language but perfectly innocent in another. For example, "
|
|
731
|
+
Many words are profane in one language but perfectly innocent in another. For example, "got" means "buttocks" in Turkish but is an extremely common English word, "fart" means "speed" in Scandinavian languages, and "bite" is a common English word that's vulgar in French. BeKind handles these cross-language collisions automatically using a multi-layer language detection and scoring system.
|
|
727
732
|
|
|
728
733
|
### Language Detection Architecture
|
|
729
734
|
|
|
@@ -745,10 +750,10 @@ Unicode codepoint ranges map characters directly to language families (e.g., Cyr
|
|
|
745
750
|
For each word, `scoreWord()` combines all three layers into a single `Record<string, number>` mapping language codes to confidence scores:
|
|
746
751
|
|
|
747
752
|
```
|
|
748
|
-
scoreWord("
|
|
749
|
-
|
|
750
|
-
|
|
751
|
-
|
|
753
|
+
scoreWord("got") → { en: 0.9, tr: 0.7, de: 0.2, ... }
|
|
754
|
+
↑ English trie match (extremely common word)
|
|
755
|
+
↑ Turkish trie match (profane in Turkish)
|
|
756
|
+
↑ German ELD n-gram signal
|
|
752
757
|
```
|
|
753
758
|
|
|
754
759
|
Layer weights: Script (1.0) > Trie (0.8) > ELD (0.6) > Suffix (0.3+) > Prefix (0.3+)
|
|
@@ -758,8 +763,8 @@ Layer weights: Script (1.0) > Trie (0.8) > ELD (0.6) > Suffix (0.3+) > Prefix (0
|
|
|
758
763
|
For full text, `detectLanguages()` runs `scoreWord()` on every word and aggregates results into document-level proportions:
|
|
759
764
|
|
|
760
765
|
```typescript
|
|
761
|
-
detectLanguages("
|
|
762
|
-
// → { languages: [{ language: "
|
|
766
|
+
detectLanguages("We got the tickets and went to the show")
|
|
767
|
+
// → { languages: [{ language: "en", proportion: 0.9 }, { language: "tr", proportion: 0.1 }, ...] }
|
|
763
768
|
```
|
|
764
769
|
|
|
765
770
|
*Note:* ELD often classifies Swedish as German due to n-gram similarity. The confusion map (see below) compensates for this.
|
|
@@ -798,41 +803,37 @@ If profane language dominates (profaneAmp > innocentAmp):
|
|
|
798
803
|
Result clamped to [0, 5]
|
|
799
804
|
```
|
|
800
805
|
|
|
801
|
-
The `dampeningFactor` (0-1) controls how aggressively the adjustment works per collision word. Words that are genuinely innocent in another language (e.g., "
|
|
806
|
+
The `dampeningFactor` (0-1) controls how aggressively the adjustment works per collision word. Words that are genuinely innocent in another language (e.g., "got" in English, df=0.95) get heavy dampening, while dangerous dual-meaning words (e.g., "cock" as rooster, df=0.1) barely adjust.
|
|
802
807
|
|
|
803
808
|
### End-to-End Flow
|
|
804
809
|
|
|
805
810
|
```
|
|
806
|
-
Text: "
|
|
807
|
-
|
|
808
|
-
|
|
811
|
+
Text: "All proceeds go to the local food bank"
|
|
812
|
+
^^^^^^
|
|
813
|
+
"go t" bridged → "got" detected (tr: s:4 c:4)
|
|
809
814
|
|
|
810
815
|
1. Collision word matched → check innocent-words map
|
|
811
|
-
"
|
|
816
|
+
"got" → innocent in English (meaning: "past tense of get", dampeningFactor: 0.95)
|
|
812
817
|
|
|
813
818
|
2. Language detection triggered (lazy — only runs on collision matches)
|
|
814
|
-
Document signal: detectLanguages() → {
|
|
815
|
-
Word signal: scoreWord("
|
|
819
|
+
Document signal: detectLanguages() → { en: 0.9, tr: 0.05, ... }
|
|
820
|
+
Word signal: scoreWord("got") → { en: 0.9, tr: 0.7, ... }
|
|
816
821
|
|
|
817
822
|
3. Weighted average (1.5:1 doc:word ratio)
|
|
818
|
-
amplified["
|
|
819
|
-
amplified["
|
|
820
|
-
amplified["en"] = (0.6 × 1.0 + 0.2 × 1.5) / 2.5 = 0.36
|
|
821
|
-
|
|
822
|
-
4. Confusion map: German signal → partial Swedish evidence
|
|
823
|
-
effectiveAmp["sv"] = max(0.32, 0.42 × 0.8) = 0.336
|
|
823
|
+
amplified["en"] = (0.9 × 1.0 + 0.9 × 1.5) / 2.5 = 0.90
|
|
824
|
+
amplified["tr"] = (0.7 × 1.0 + 0.05 × 1.5) / 2.5 = 0.31
|
|
824
825
|
|
|
825
|
-
|
|
826
|
-
→
|
|
827
|
-
→ Certainty dampened: 4 × (1 - 0.
|
|
828
|
-
→ Below flag threshold (s:
|
|
826
|
+
4. Innocent language (en: 0.90) > Profane language (tr: 0.31)?
|
|
827
|
+
→ Yes, English signal dominates
|
|
828
|
+
→ Certainty dampened: 4 × (1 - 0.95 × 0.90) = 0.58
|
|
829
|
+
→ Below flag threshold (s:4 needs c:2+) → NOT FLAGGED ✓
|
|
829
830
|
```
|
|
830
831
|
|
|
831
832
|
### Key Features
|
|
832
833
|
|
|
833
834
|
- **29 collision words** mapped across 8 languages (English, Swedish, Norwegian, Danish, German, Dutch, French, Spanish)
|
|
834
835
|
- **Per-word dampening factors** control adjustment strength:
|
|
835
|
-
- `0.
|
|
836
|
+
- `0.95` = heavy dampening (genuinely innocent cross-language, e.g., "got" in English)
|
|
836
837
|
- `0.1` = barely dampens (almost always used as profanity, e.g., "cock" in English)
|
|
837
838
|
- **Lazy language detection** — `detectLanguages()` only runs when a collision word is matched (zero performance cost for non-collision text)
|
|
838
839
|
- **Confusion map** — handles ELD n-gram detector's known misclassifications (e.g., Swedish often classified as German)
|
|
@@ -842,6 +843,7 @@ Text: "Programmet börjar klockan åtta och tar slut vid tio"
|
|
|
842
843
|
|
|
843
844
|
| Word | Profane In | Innocent In | Meaning |
|
|
844
845
|
|------|-----------|-------------|---------|
|
|
846
|
+
| got | Turkish | English | past tense of "get" (df: 0.95) |
|
|
845
847
|
| slut | English | Swedish, Danish | end/finish |
|
|
846
848
|
| fart | English | Swedish, Norwegian, Danish | speed |
|
|
847
849
|
| hell | English | Swedish, Norwegian | luck |
|
|
@@ -903,7 +905,7 @@ Severity reflects the number and variety of detected profanities:
|
|
|
903
905
|
- **Mixed Content:** Handles mixed-language and code-switched sentences with language-aware scoring.
|
|
904
906
|
|
|
905
907
|
```typescript
|
|
906
|
-
profanity.check('This is
|
|
908
|
+
profanity.check('This is b*llsh*t and चूतिया.'); // true (mixed English/Hindi)
|
|
907
909
|
profanity.check('Ce mot est merde and पागल.'); // true (French/Hindi)
|
|
908
910
|
profanity.check('Isso é uma merda.'); // true (Brazilian Portuguese)
|
|
909
911
|
```
|
|
@@ -916,7 +918,7 @@ For sample words in a language (for UIs, admin, etc):
|
|
|
916
918
|
|
|
917
919
|
```typescript
|
|
918
920
|
import { englishBadWords, hindiBadWords } from 'bekindprofanityfilter';
|
|
919
|
-
console.log(englishBadWords.slice(0, 5)); // ["
|
|
921
|
+
console.log(englishBadWords.slice(0, 5)); // ["f**k", "sh*t", ...]
|
|
920
922
|
```
|
|
921
923
|
|
|
922
924
|
---
|
|
@@ -925,7 +927,7 @@ console.log(englishBadWords.slice(0, 5)); // ["fuck", "shit", ...]
|
|
|
925
927
|
|
|
926
928
|
- **No wordlist exposure:** There is no `.list()` function for security and encapsulation. Use exported word arrays for samples.
|
|
927
929
|
- **TRIE-based:** Scales easily to 50,000+ words.
|
|
928
|
-
- **Handles leet-speak:** Catches obfuscated variants like `f#ck`, `
|
|
930
|
+
- **Handles leet-speak:** Catches obfuscated variants like `f#ck`, `a55h*le`.
|
|
929
931
|
|
|
930
932
|
---
|
|
931
933
|
|
|
@@ -947,7 +949,7 @@ profanity.addToWhitelist(['anal', 'ass']);
|
|
|
947
949
|
console.log(profanity.check('He is an associate professor.')); // false
|
|
948
950
|
|
|
949
951
|
// Severity
|
|
950
|
-
const result = profanity.detect('This is
|
|
952
|
+
const result = profanity.detect('This is f**king b*llsh*t and chutiya.');
|
|
951
953
|
console.log(ProfanitySeverity[result.severity]); // "SEVERE"
|
|
952
954
|
|
|
953
955
|
// Custom dictionary
|
|
@@ -957,7 +959,7 @@ console.log(profanity.check('You barnacle-head!')); // true
|
|
|
957
959
|
|
|
958
960
|
// Placeholder configuration
|
|
959
961
|
profanity.setPlaceholder('#');
|
|
960
|
-
console.log(profanity.clean('This is
|
|
962
|
+
console.log(profanity.clean('This is b*llsh*t.')); // "This is ########."
|
|
961
963
|
profanity.setPlaceholder('*'); // Reset
|
|
962
964
|
```
|
|
963
965
|
|
|
@@ -991,7 +993,7 @@ A: Yes! BeKind is universal.
|
|
|
991
993
|
- ✅ Additional language packs (Arabic, Russian, Japanese, Korean, Chinese, Dutch)
|
|
992
994
|
- ✅ Romanization detection (Hinglish and other transliterated scripts)
|
|
993
995
|
- 🚧 Norwegian and Danish trie vocabularies (currently covered via confusion map)
|
|
994
|
-
- 🚧 Repeat character compression (normalize
|
|
996
|
+
- 🚧 Repeat character compression (normalize elongated words before matching, avoiding the need to enumerate elongations in the dictionary)
|
|
995
997
|
- 🚧 Phonetic matching (sounds-like detection)
|
|
996
998
|
- 🚧 Plugin system for custom detection algorithms
|
|
997
999
|
|
|
@@ -1001,7 +1003,7 @@ A: Yes! BeKind is universal.
|
|
|
1001
1003
|
|
|
1002
1004
|
MIT — See [LICENSE](https://github.com/grassroots-labs-org/be-kind-profanity-filter/blob/main/LICENSE)
|
|
1003
1005
|
|
|
1004
|
-
This project is a fork of [
|
|
1006
|
+
This project is a fork of [AllProfanity](https://github.com/ayush-jadaun/allprofanity) by Ayush Jadaun, also licensed under MIT.
|
|
1005
1007
|
|
|
1006
1008
|
---
|
|
1007
1009
|
|