npm - allprofanity - Versions diffs - 1.0.4 → 2.0.0 - Mend

allprofanity 1.0.4 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/README.md +129 -85
package/dist/index.d.ts +100 -18
package/dist/index.js +546 -99
package/dist/index.js.map +1 -1
package/dist/languages/bengali-words.js +0 -1
package/dist/languages/bengali-words.js.map +1 -1
package/dist/languages/english-words.d.ts +2 -0
package/dist/languages/english-words.js +256 -0
package/dist/languages/english-words.js.map +1 -0
package/package.json +2 -5

package/README.md CHANGED

@@ -1,24 +1,29 @@
 # AllProfanity
-A comprehensive multi-language profanity filter for JavaScript/TypeScript applications with built-in support for English, Hindi, Hinglish, Bengali, Tamil, Telugu, French, German, and Spanish content.
+A comprehensive, zero-dependency, multi-language profanity filter for JavaScript/TypeScript applications with built-in support for English, Hindi, Hinglish, Bengali, Tamil, Telugu, French, German, and Spanish content.
 [![npm version](https://img.shields.io/npm/v/allprofanity.svg)](https://www.npmjs.com/package/allprofanity)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 ## Features
-- **Multi-language Support**: Pre-loaded with English profanities (from leo-profanity) and extensive dictionaries for Hindi/Hinglish, Bengali, Tamil, Telugu, French, German, and Spanish
-- **Multiple Scripts**: Detects profanity in both Latin/Roman and native scripts (Devanagari, Bengali, Tamil, Telugu)
-- **Case Insensitive**: Works regardless of letter case
+- **Multi-language Support**: Built-in dictionaries for English, Hindi/Hinglish, Bengali, Tamil, Telugu, French, German, and Spanish.
+- **Multiple Scripts**: Detects profanity in both Latin/Roman and native scripts (Devanagari, Bengali, Tamil, Telugu).
+- **Case Insensitive (configurable)**: By default, not case sensitive, but can be configured to be case sensitive.
+- **Leet Speak Detection**: Optionally detects leet speak and obfuscated profanities.
 - **Flexible Cleaning Options**:
-  - Character-level replacement (each character of a profane word becomes a placeholder)
-  - Word-level replacement (entire profane word becomes a single placeholder)
-- **Customizable**:
-  - Dynamically add/remove words from the filter
-  - Set custom placeholder characters or strings
-- **Zero Dependencies**: Only depends on leo-profanity as the base filter
-- **TypeScript Support**: Full TypeScript type definitions included
-- **Extensible**: Designed with multi-language support in mind, making it easy to add more languages in the future
+  - Character-level replacement (each character of a profane word becomes a placeholder).
+  - Word-level replacement (entire profane word becomes a single placeholder).
+- **Customizable & Extensible**:
+  - Dynamically add/remove words or whole dictionaries.
+  - Set custom placeholder characters or strings.
+  - Supports custom language packs.
+  - Whitelist words to avoid false positives.
+  - Strict mode and partial word detection options.
+- **Severity Levels**: Detects severity of profanities (MILD, MODERATE, SEVERE, EXTREME).
+- **Zero External Dependencies**: Fully built from scratch for maximum performance and control.
+- **TypeScript Support**: Full TypeScript type definitions included.
+- **Exportable Dictionaries**: Language word lists are exportable for direct use or extension.
 ## Installation
@@ -42,11 +47,11 @@ profanity.check('यह एक चूतिया परीक्षण है
 profanity.check('Ye ek chutiya test hai.');    // true (Hinglish example)
 // Clean profanity (character by character replacement)
-profanity.clean('This is a fucking test.');
+profanity.clean('This is a fucking test.');
 // => "This is a ****ing test."
 // Clean profanity (whole word replacement)
-profanity.cleanWithWord('This is a fucking test.');
+profanity.cleanWithWord('This is a fucking test.');
 // => "This is a *** test."
 ```
@@ -61,6 +66,21 @@ profanity.check('This contains bullshit.');  // true
 profanity.check('This is clean.');           // false
 ```
+### `detect(string: string): ProfanityDetectionResult`
+Advanced detection with details about profanities found, severity, cleaned text, and word positions.
+```javascript
+const result = profanity.detect('This contains bullshit.');
+// result: {
+//   hasProfanity: true,
+//   detectedWords: [...],
+//   cleanedText: ...,
+//   severity: ...,
+//   positions: [...]
+// }
+```
 ### `clean(string: string, placeholder?: string): string`
 Cleans a string by replacing each character of profane words with a placeholder character.
@@ -139,114 +159,140 @@ Sets the default placeholder character for the `clean` method.
 profanity.setPlaceholder('#');
 ```
-## Word Boundary Detection
+### `getLoadedLanguages(): string[]`
-The library is designed to handle word boundaries correctly, reducing false positives:
+Returns the list of currently loaded languages.
 ```javascript
-profanity.check('He is an associate professor.');  // false, even though 'ass' is a profane word
-profanity.check('I'm an analyst at this company.'); // false, even though 'anal' is a profane word
-profanity.check('This is ass and that's bad.');     // true
+const loaded = profanity.getLoadedLanguages();
+// => ['english', 'hindi', ...]
 ```
-## Language Support
-### Current Languages
+### `getAvailableLanguages(): string[]`
-#### English
+Returns the list of all available built-in languages.
-Built on top of the leo-profanity library, AllProfanity includes comprehensive English profanity detection.
+```javascript
+const available = profanity.getAvailableLanguages();
+// => ['english', 'hindi', 'bengali', 'tamil', 'telugu', 'french', 'german', 'spanish']
+```
-#### Hindi/Hinglish Support
+### `loadLanguage(language: string): boolean`
-The library comes pre-loaded with an extensive list of Hindi profanities in both Devanagari and Roman scripts, as well as common Hinglish abbreviations and variations.
+Loads a built-in language dictionary by name.
 ```javascript
-// Hindi in Devanagari script
-profanity.check('इस वाक्य में लंड शब्द है।');  // true
+profanity.loadLanguage('bengali');
+profanity.loadLanguage('french');
+```
-// Hindi in Roman script
-profanity.check('Is vakya mein lund shabd hai.');  // true
+### `loadLanguages(languages: string[]): number`
-// Hinglish abbreviations
-profanity.check('Usne bc kaha.');  // true
+Loads multiple languages at once.
+```javascript
+profanity.loadLanguages(['tamil', 'german', 'spanish']);
 ```
-#### Indian Languages
+### `loadIndianLanguages(): number`
-AllProfanity supports multiple Indian languages including Bengali, Tamil, and Telugu in both their native scripts and Roman transliterations.
+Loads Hindi, Bengali, Tamil, and Telugu dictionaries.
 ```javascript
-// Bengali in Bengali script
-profanity.check('এই বাক্যে বাল শব্দ আছে।');  // true
+profanity.loadIndianLanguages();
+```
-// Tamil in Tamil script
-profanity.check('இந்த வாக்கியத்தில் கூதி உள்ளது.');  // true
+### `loadCustomDictionary(name: string, words: string[]): void`
-// Telugu in Telugu script
-profanity.check('ఈ వాక్యంలో పూకు పదం ఉంది.');  // true
+Loads a custom dictionary under the given name.
-// Loading all Indian languages at once
-import { AllProfanity } from 'allprofanity';
-const filter = new AllProfanity();
-filter.loadIndianLanguages();  // Loads Hindi, Bengali, Tamil, and Telugu
+```javascript
+profanity.loadCustomDictionary('myLanguage', ['word1', 'word2']);
+profanity.loadLanguage('myLanguage');
 ```
-#### European Languages
+### `addToWhitelist(words: string[]): void` / `removeFromWhitelist(words: string[]): void`
-AllProfanity also supports several European languages including French, German, and Spanish.
+Add or remove words from the whitelist (words never flagged as profanity).
 ```javascript
-// French example
-profanity.check('Cette phrase contient le mot merde.');  // true
+profanity.addToWhitelist(['anal', 'ass']);
+profanity.removeFromWhitelist(['anal']);
+```
+### `getConfig(): AllProfanityOptions`
+Get current configuration.
+### `updateConfig(options: Partial<AllProfanityOptions>): void`
-// German example
-profanity.check('Dieser Satz enthält das Wort scheisse.');  // true
+Update configuration (enable/disable leet speak, case sensitivity, etc.).
-// Spanish example
-profanity.check('Esta frase contiene la palabra mierda.');  // true
+## Word Boundary Detection
+The library handles word boundaries and reduces false positives:
+```javascript
+profanity.check('He is an associate professor.');  // false, even though 'ass' is a profane word
+profanity.check('I\'m an analyst at this company.'); // false, even though 'anal' is a profane word
+profanity.check('This is ass and that\'s bad.');     // true
 ```
-### Loading Additional Languages
+## Language Support
+### Current Languages
-By default, only English and Hindi are loaded. You can load additional languages as needed:
+- **English** (imported from `./languages/english-words.js`)
+- **Hindi/Hinglish** (`./languages/hindi-words.js`)
+- **Bengali** (`./languages/bengali-words.js`)
+- **Tamil** (`./languages/tamil-words.js`)
+- **Telugu** (`./languages/telugu-words.js`)
+- **French** (`./languages/french-words.js`)
+- **German** (`./languages/german-words.js`)
+- **Spanish** (`./languages/spanish-words.js`)
+> **Note:** All dictionaries are exported for direct access/import.
+#### Usage Examples
 ```javascript
-// Load individual languages
-profanity.loadLanguage('bengali');
-profanity.loadLanguage('tamil');
-profanity.loadLanguage('french');
+profanity.check('इस वाक्य में लंड शब्द है।');  // true (Hindi)
+profanity.check('Is vakya mein lund shabd hai.');  // true (Hinglish)
+profanity.check('এই বাক্যে বাল শব্দ আছে।');  // true (Bengali)
+profanity.check('இந்த வாக்கியத்தில் கூதி உள்ளது.');  // true (Tamil)
+profanity.check('Cette phrase contient le mot merde.');  // true (French)
+```
-// Load multiple languages at once
-profanity.loadLanguages(['telugu', 'german', 'spanish']);
+### Mixed Language Content
-// Get available languages
-const availableLanguages = profanity.getAvailableLanguages();
-// => ['hindi', 'bengali', 'tamil', 'telugu', 'french', 'german', 'spanish']
+AllProfanity can detect profanities from multiple languages in a single string:
-// Get currently loaded languages
-const loadedLanguages = profanity.getLoadedLanguages();
-// => ['english', 'hindi', ...]
+```javascript
+profanity.check('This English sentence has chutiya which is bad.');  // true
+profanity.check('I\'m saying मादरचोद and bullshit in one sentence.');  // true
 ```
-### Future Language Support
+### Loading Additional or Custom Languages
-AllProfanity is designed with extensibility in mind. If you'd like to contribute language packs, please see the Contributing section below.
+By default, only English and Hindi are loaded. You can load additional languages as needed:
-## Mixed Language Content
+```javascript
+profanity.loadLanguage('bengali');
+profanity.loadLanguages(['tamil', 'french']);
+```
-AllProfanity effectively handles mixed-language content containing profanities from different languages:
+You can also load custom dictionaries:
 ```javascript
-profanity.check('This English sentence has chutiya which is bad.');  // true
-profanity.check('I'm saying मादरचोद and bullshit in one sentence.');  // true
+profanity.loadCustomDictionary('swedish', ['fulord1', 'fulord2']);
+profanity.loadLanguage('swedish');
 ```
 ## Customizing The Library
 ### Adding Custom Profanity Lists
-You can add your own profanity lists to extend support for other languages or add additional words to existing languages:
+You can add your own profanity words to extend support for other languages or add additional words to existing languages:
 ```javascript
 // Add custom profanity words
@@ -255,31 +301,26 @@ profanity.add([
   'customword2',
   'customword3'
 ]);
-// Now it will detect Spanish profanity
-profanity.check('Este es un ejemplo de mierda.');  // true
 ```
-### Creating a Custom-Configured Instance
+## Creating a Custom-Configured Instance
-If you need multiple differently-configured instances of the filter, you can import the AllProfanity class directly:
+If you need multiple differently-configured filters, import the `AllProfanity` class directly:
 ```javascript
 import { AllProfanity } from 'allprofanity';
-// Create custom instances
-const kidSafeFilter = new AllProfanity({ includeModerate: true });
-const adultFilter = new AllProfanity({ includeModerate: false });
+const kidSafeFilter = new AllProfanity({ enableLeetSpeak: true, strictMode: true });
+const adultFilter = new AllProfanity({ enableLeetSpeak: false, detectPartialWords: false });
 ```
 ## Advanced Use Cases
 ### Performance Optimization
-For applications processing large volumes of text:
+For high-throughput applications:
 ```javascript
-// Pre-compile your most used strings for faster checking
 const badWordsList = profanity.list();
 const preCompiledRegex = new RegExp('\\b(' + badWordsList.join('|') + ')\\b', 'i');
@@ -322,7 +363,7 @@ AllProfanity works in all modern browsers and Node.js environments.
 ## Roadmap
-- Add support for more languages (Spanish, French, German, Arabic, etc.)
+- Add support for more languages (Arabic, Chinese, Russian, etc.)
 - Contextual profanity detection
 - Severity levels for different categories of profanity
 - Phonetic matching for evasion attempts
@@ -348,5 +389,8 @@ To add support for a new language:
 ## Acknowledgements
-- Built on top of [leo-profanity](https://github.com/jojoee/leo-profanity)
+- Inspired by [leo-profanity](https://github.com/jojoee/leo-profanity), but fully rebuilt for extensibility and multi-language support.
+```diff
+- Note: As of v2+, AllProfanity is zero-dependency and does not use leo-profanity internally.
+```
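
Note on the README's "Performance Optimization" snippet, which this release leaves in place: it joins `profanity.list()` directly into a `RegExp` and relies on `\b`. If any list entry contains a regex metacharacter, the joined pattern changes meaning, and in JavaScript `\b` is defined in terms of ASCII word characters, so it may not delimit words written in Devanagari, Bengali, Tamil, or Telugu. The following is a minimal sketch of an escaped, Unicode-aware variant. `escapeRegex` and `compileWordRegex` are hypothetical helper names and are not part of allprofanity's public API; the word array stands in for whatever `profanity.list()` returns.

```typescript
// Hypothetical helper: escape regex metacharacters so each word is matched literally.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build one case-insensitive regex for a word list.
// Lookarounds on Unicode letters, marks, digits, and "_" replace \b so that
// words in non-Latin scripts are delimited as well.
function compileWordRegex(words: string[]): RegExp {
  if (words.length === 0) {
    return /(?!)/; // an empty list should never match anything
  }
  const alternatives = words.map(escapeRegex).join("|");
  const wordChar = "[\\p{L}\\p{M}\\p{N}_]";
  return new RegExp(`(?<!${wordChar})(?:${alternatives})(?!${wordChar})`, "iu");
}

// Usage with an illustrative list (assumption: profanity.list() yields plain strings).
const compiled = compileWordRegex(["ass", "a.b"]);
console.log(compiled.test("He is an associate professor.")); // false
console.log(compiled.test("This is ASS and that's bad."));   // true
```

This only reproduces whole-word matching. It does not attempt the leet speak, whitelist, or partial-word behavior that 2.0.0 adds inside the library.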

package/dist/index.d.ts CHANGED

@@ -1,10 +1,11 @@
-export { default as hindiBadWords } from "./languages/hindi-words";
-export { default as frenchBadWords } from "./languages/french-words";
-export { default as germanBadWords } from "./languages/german-words";
-export { default as spanishBadWords } from "./languages/spanish-words";
-export { default as bengaliBadWords } from "./languages/bengali-words";
-export { default as tamilBadWords } from "./languages/tamil-words";
-export { default as teluguBadWords } from "./languages/telugu-words";
+export { default as englishBadWords } from "./languages/english-words.js";
+export { default as hindiBadWords } from "./languages/hindi-words.js";
+export { default as frenchBadWords } from "./languages/french-words.js";
+export { default as germanBadWords } from "./languages/german-words.js";
+export { default as spanishBadWords } from "./languages/spanish-words.js";
+export { default as bengaliBadWords } from "./languages/bengali-words.js";
+export { default as tamilBadWords } from "./languages/tamil-words.js";
+export { default as teluguBadWords } from "./languages/telugu-words.js";
 /**
  * Configuration options for AllProfanity
  */
@@ -12,21 +13,78 @@ export interface AllProfanityOptions {
     languages?: string[];
     customDictionaries?: Record<string, string[]>;
     defaultPlaceholder?: string;
+    enableLeetSpeak?: boolean;
+    caseSensitive?: boolean;
+    whitelistWords?: string[];
+    strictMode?: boolean;
+    detectPartialWords?: boolean;
 }
 /**
- * AllProfanity - Extended profanity filter with multi-language support
- * Based on leo-profanity with additional language capabilities
+ * Severity levels for profanity detection
+ */
+export declare enum ProfanitySeverity {
+    MILD = 1,
+    MODERATE = 2,
+    SEVERE = 3,
+    EXTREME = 4
+}
+/**
+ * Detection result interface
+ */
+export interface ProfanityDetectionResult {
+    hasProfanity: boolean;
+    detectedWords: string[];
+    cleanedText: string;
+    severity: ProfanitySeverity;
+    positions: Array<{
+        word: string;
+        start: number;
+        end: number;
+    }>;
+}
+/**
+ * Advanced AllProfanity - Custom profanity filter with multi-language support and leet speak detection
+ * No external dependencies - built from scratch for maximum performance and control
  */
 export declare class AllProfanity {
-    private filter;
+    private profanitySet;
+    private normalizedProfanityMap;
     private defaultPlaceholder;
     private loadedLanguages;
+    private whitelistSet;
+    private enableLeetSpeak;
+    private caseSensitive;
+    private strictMode;
+    private detectPartialWords;
+    private readonly leetMap;
+    private readonly wordBoundaryChars;
+    private readonly commonSuffixes;
+    private readonly commonPrefixes;
     private availableLanguages;
     /**
      * Create a new AllProfanity instance
      * @param options - Configuration options
      */
     constructor(options?: AllProfanityOptions);
+    /**
+     * Normalize text by converting leet speak to regular characters
+     * @param text - Text to normalize
+     * @returns Normalized text
+     */
+    private normalizeLeetSpeak;
+    private escapeRegex;
+    /**
+     * Generate word variations with common prefixes and suffixes
+     */
+    private generateWordVariations;
+    /**
+     * Check if text contains word boundaries around a match
+     */
+    private hasWordBoundaries;
+    /**
+     * Calculate severity based on detected words
+     */
+    private calculateSeverity;
     /**
      * Load a built-in language dictionary
      * @param language - The language to load
@@ -51,17 +109,23 @@ export declare class AllProfanity {
      */
     loadCustomDictionary(name: string, words: string[]): void;
     /**
-     * Get the list of currently loaded languages
-     * @returns string[] - Array of loaded language names
+     * Add words to whitelist (words that should never be flagged as profanity)
+     * @param words - Array of words to whitelist
      */
-    getLoadedLanguages(): string[];
+    addToWhitelist(words: string[]): void;
     /**
-     * Get the list of available language dictionaries
-     * @returns string[] - Array of available language names
+     * Remove words from whitelist
+     * @param words - Array of words to remove from whitelist
      */
-    getAvailableLanguages(): string[];
+    removeFromWhitelist(words: string[]): void;
     /**
-     * Check if a string contains profanity
+     * Advanced profanity detection with detailed results
+     * @param text - The text to analyze
+     * @returns ProfanityDetectionResult - Detailed detection results
+     */
+    detect(text: string): ProfanityDetectionResult;
+    /**
+     * Check if a string contains profanity (simple boolean check)
      * @param string - The string to check
      * @returns boolean - True if profanity found, false otherwise
      */
@@ -69,7 +133,7 @@ export declare class AllProfanity {
     /**
      * Clean a string by replacing profanities with placeholders
      * @param string - The string to clean
-     * @param placeholder - Optional custom placeholder (defaults to '*')
+     * @param placeholder - Optional custom placeholder
      * @returns string - The cleaned string
      */
     clean(string: string, placeholder?: string): string;
@@ -104,6 +168,24 @@ export declare class AllProfanity {
      * @param placeholder - Single character to use as placeholder
      */
     setPlaceholder(placeholder: string): void;
+    /**
+     * Get the list of currently loaded languages
+     * @returns string[] - Array of loaded language names
+     */
+    getLoadedLanguages(): string[];
+    /**
+     * Get the list of available language dictionaries
+     * @returns string[] - Array of available language names
+     */
+    getAvailableLanguages(): string[];
+    /**
+     * Get current configuration
+     */
+    getConfig(): Partial<AllProfanityOptions>;
+    /**
+     * Update configuration
+     */
+    updateConfig(options: Partial<AllProfanityOptions>): void;
 }
 declare const allProfanity: AllProfanity;
 export default allProfanity;
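
Note on the new `detect()` declaration: the typings above expose `hasProfanity`, `severity`, and `cleanedText`, which is enough to drive a simple moderation decision. The sketch below mirrors those declared shapes locally so it runs without the package installed. `moderate`, `Decision`, and the sample values are hypothetical. The typings do not say what `severity` holds when nothing is detected, or whether `positions[].end` is inclusive, so the sketch avoids depending on either.

```typescript
// Local mirror of the ProfanitySeverity enum from the 2.0.0 typings,
// written as a const object so the sketch needs no package import.
const ProfanitySeverity = { MILD: 1, MODERATE: 2, SEVERE: 3, EXTREME: 4 } as const;
type ProfanitySeverity = (typeof ProfanitySeverity)[keyof typeof ProfanitySeverity];

// Local mirror of the ProfanityDetectionResult interface from the typings.
interface ProfanityDetectionResult {
  hasProfanity: boolean;
  detectedWords: string[];
  cleanedText: string;
  severity: ProfanitySeverity;
  positions: Array<{ word: string; start: number; end: number }>;
}

// Hypothetical decision type for this sketch.
interface Decision {
  action: "allow" | "mask" | "reject";
  text: string | null;
}

// Hypothetical helper: allow clean text, reject at or above a severity
// threshold, otherwise fall back to the library's cleaned text.
function moderate(
  original: string,
  result: ProfanityDetectionResult,
  rejectAt: ProfanitySeverity = ProfanitySeverity.SEVERE,
): Decision {
  if (!result.hasProfanity) {
    return { action: "allow", text: original };
  }
  if (result.severity >= rejectAt) {
    return { action: "reject", text: null };
  }
  return { action: "mask", text: result.cleanedText };
}

// Hand-built sample result (illustrative values, not real library output).
const sample: ProfanityDetectionResult = {
  hasProfanity: true,
  detectedWords: ["word1"],
  cleanedText: "this is *****",
  severity: ProfanitySeverity.MILD,
  positions: [{ word: "word1", start: 8, end: 13 }],
};
console.log(moderate("this is word1", sample).action); // "mask"
```

With the package installed, the result would presumably come from `profanity.detect(text)` instead of a hand-built object.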