@redsocs/spam-warden 1.0.4
- package/README.md +144 -0
- package/dist/spamwarden.js +418 -0
- package/dist/spamwarden.min.js +1 -0
- package/package.json +41 -0
package/README.md
ADDED
@@ -0,0 +1,144 @@
# SpamWarden.js

Lightweight, client-side JavaScript library for real-time spam detection and automated form protection. Optimized for Thai text and high-performance browser environments.

[Pipeline status](https://gitlab.com/redsocs/spam-warden/-/pipelines)
[npm package](https://www.npmjs.com/package/@redsocs/spam-warden)
[Buy Me a Coffee](https://buymeacoffee.com/redsocs?new=1)

# What is this?

**SpamWarden.js** is a zero-dependency, client-side engine that detects spam directly in the user's browser. It uses a **Present-Only Naive Bayes** model (derived from Bernoulli Naive Bayes) trained specifically on Thai spam patterns (gambling, loans, "fast money" scams) and optimized with a dynamic, length-calibrated decision threshold to eliminate false positives on longer, clean text.

By running in the browser, it allows you to **block spam before it ever hits your database**, saving server resources and keeping your data clean.
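
The length-calibrated decision can be illustrated with a small sketch. It is not the library's actual code: the `weights` map, its values, and the rule "flag as spam when the summed log-odds of the matched features exceed 5.5 + 0.49 × N" are assumptions built around the threshold offset quoted under "What's inside?".

```javascript
// Hypothetical per-feature log-odds, log(P(f|spam) / P(f|ham)); values are made up.
const weights = new Map([
  ["โบนัส", 4.2],
  ["ฟรี", 3.1],
  ["สมัคร", 2.6],
  ["ประชุม", -1.8],
]);

// Present-only scoring: only features that occur in the text AND in the
// vocabulary contribute; vocabulary features absent from the text are ignored.
function classify(features, weights) {
  let score = 0;
  let matched = 0;
  for (const feature of new Set(features)) {
    if (weights.has(feature)) {
      score += weights.get(feature);
      matched += 1;
    }
  }
  // Assumed decision rule: the bar rises with the number of matched features.
  const threshold = 5.5 + 0.49 * matched;
  return { isSpam: matched > 0 && score > threshold, score, matched, threshold };
}

const spamResult = classify(["สมัคร", "โบนัส", "ฟรี"], weights);
const hamResult = classify(["ประชุม", "ฟรี"], weights);
console.log(spamResult.isSpam, hamResult.isSpam);
```

Because the threshold grows with every matched feature, a long message has to accumulate proportionally more spam evidence before it is flagged, which is the intuition behind calibrating on length.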

# Live Demo & Scanner

You can test the spam engine interactively, analyze your forms, and generate auto-blocking script configurations directly on our GitLab Pages site:

👉 **[Live Demo & Generator](https://spam-warden-js-527b79.gitlab.io/)**

# Quickstart

> [!IMPORTANT]
> **Are you a Thai government agency or public sector website administrator?**
> Get your free token configuration and drop-in script to protect your online portals from annoying gambling/loan ads and spam campaigns at [redsocs.com/spam-warden](https://redsocs.com/spam-warden).

### 1. The "No-Code" Way (Auto-Blocking)

Add this script to your page. It will automatically find your form and block submission if spam is detected.

```html
<script src="https://cdn.redsocs.com/js/spamwarden.min.js?client=cG9zdHEtZm9ybXxtZXNzYWdlLWlucHV0fDE"></script>
```

_Note: The `client` parameter is a Base64-encoded configuration string in the form `formId|inputId|sdFlag|siemEndpoint`. The value above encodes `postq-form|message-input|1`, with the trailing `siemEndpoint` field left out._
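
To check what a given `client` value contains, it can be decoded by hand. The helper below is a hypothetical sketch, not part of the SpamWarden API; it assumes the string is plain Base64 with the `=` padding stripped, as in the example above, and relies on the global `atob` available in browsers and recent Node versions.

```javascript
// Hypothetical helper: decode a `client` configuration string into its fields.
function decodeClientConfig(client) {
  // Accept URL-safe Base64 characters and restore any stripped "=" padding.
  const normalized = client.replace(/-/g, "+").replace(/_/g, "/");
  const padded = normalized + "=".repeat((4 - (normalized.length % 4)) % 4);
  const [formId, inputId, sdFlag, siemEndpoint] = atob(padded).split("|");
  return { formId, inputId, sdFlag, siemEndpoint };
}

const config = decodeClientConfig("cG9zdHEtZm9ybXxtZXNzYWdlLWlucHV0fDE");
console.log(config.formId, config.inputId, config.sdFlag);
// → postq-form message-input 1
```

Since the example string carries only three fields, `siemEndpoint` comes back as `undefined`.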

### 2. Manual Configuration

```html
<script src="dist/spamwarden.min.js"></script>
<script>
  spamwarden.configure({
    siteToken: "YOUR_TOKEN",
    formId: "contact-form",
    inputId: "message-field",
    autoReport: true,
    isTrusted: true, // Required to authorize telemetry reporting
    reportSD: true, // Optional: Enable PII/DLP leak telemetry auditing
    siemEndpoint: "https://api.yourdomain.com/v1/telemetry", // Optional: Custom secondary SIEM/SOC endpoint
    onSpam: (result) => {
      alert("Please do not send spam!");
    },
  });
</script>
```

### 3. API Usage (Node or Browser)

```javascript
const result = spamwarden.spamcheck("สมัครสมาชิกวันนี้ รับโบนัส ฟรี!");
if (result.isSpam) {
  console.log("Blocked:", result.reason || "AI match");
  console.log("Confidence:", result.prob);
}
```
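
When the auto-interceptor is not used, the result of `spamcheck` can be wired into a form by hand. The sketch below relies only on the result shape shown above (`isSpam`, `reason`, `prob`). The `guardSubmit` helper, the stand-in checker, and the fake event are hypothetical names introduced for illustration; in a real page the checker would be `spamwarden.spamcheck`.

```javascript
// Hypothetical helper: cancel a submit event when the checker flags the text.
// `check` is injected so the sketch runs without the library loaded.
function guardSubmit(event, text, check) {
  const result = check(text);
  if (result.isSpam) {
    event.preventDefault(); // stop the form from being sent
  }
  return result;
}

// Stand-in checker for demonstration only; it is NOT the SpamWarden model.
const fakeCheck = (text) =>
  text.includes("฿")
    ? { isSpam: true, reason: "currency_symbol", prob: 1 }
    : { isSpam: false, prob: 0 };

// Minimal fake event so the example also runs outside a browser.
const fakeEvent = {
  defaultPrevented: false,
  preventDefault() {
    this.defaultPrevented = true;
  },
};

const outcome = guardSubmit(fakeEvent, "รับเงิน ฿500 ทันที", fakeCheck);
console.log(outcome.isSpam, fakeEvent.defaultPrevented);
```

In a browser, the same helper could be attached with `form.addEventListener("submit", (e) => guardSubmit(e, input.value, (t) => spamwarden.spamcheck(t)))`, where `form` and `input` are your own elements.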

# Scope

SpamWarden is designed for **interactive web elements**:

- **Contact Forms:** Prevent bot and manual spam submissions.
- **Comment Sections:** Real-time feedback for users before they post.
- **Chat Inputs:** Instant filtering of malicious links and currency-heavy spam.
- **Privacy-First Apps:** Since detection happens locally, user data doesn't leave the browser unless explicitly reported.

# What's inside?

- **Hybrid Detection Engine:**
  - **Hard Rules:** Instant blocking for currency symbols (`$€£฿`) and known spam link patterns (`line[dot]me`, `bit[dot]ly`).
  - **Thai-Optimized Tokenizer:** Extracts whitespace tokens, **trigrams**, and **quadgrams** to handle the space-less nature of the Thai language.
  - **Present-Only NB Classifier:** A modified Naive Bayes model trained on real-world spam samples. It evaluates only the vocabulary features present in the text and applies a length-dependent threshold offset ($5.5 + 0.49 \times N$, where $N$ is the number of matched features) to calibrate confidence and prevent false positives on longer clean texts.
- **Telemetry System:** Optional auto-reporting of spam hits to `api.redsocs.com` for global threat intelligence.
- **Auto-Interceptor:** Event listeners that hook into DOM forms to provide "Drop-in" protection.
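
The first two detection stages listed above can be sketched as follows. This is an illustration under stated assumptions, not the shipped implementation: it reads `line[dot]me` and `bit[dot]ly` as the domains `line.me` and `bit.ly`, the `spam_link` rule name is made up, and building n-grams per whitespace token (rather than across the whole message) is a guess.

```javascript
// Assumed hard rules: currency symbols and the two link patterns named above.
const CURRENCY = /[$€£฿]/;
const SPAM_LINKS = /\b(?:line\.me|bit\.ly)\b/i;

// Returns the name of the first matching rule, or null when none applies.
function hardRule(text) {
  if (CURRENCY.test(text)) return "currency_symbol";
  if (SPAM_LINKS.test(text)) return "spam_link"; // hypothetical rule name
  return null;
}

// Whitespace tokens plus character trigrams and quadgrams of each token.
function tokenize(text) {
  const features = new Set();
  for (const token of text.trim().split(/\s+/).filter(Boolean)) {
    features.add(token);
    const chars = Array.from(token); // iterate by code point, not UTF-16 unit
    for (const n of [3, 4]) {
      for (let i = 0; i + n <= chars.length; i++) {
        features.add(chars.slice(i, i + n).join(""));
      }
    }
  }
  return features;
}

console.log(hardRule("โอน ฿500"));
console.log([...tokenize("ฟรีโบนัส")]);
```

Character n-grams let a word such as `ฟรี` surface as a feature even when it is embedded in a longer run of Thai text with no spaces around it.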

# Why does this exist?

Traditional spam filters (like Akismet or reCAPTCHA) often:

1. Require a round-trip to a server (latency).
2. Are expensive for high-volume sites.
3. Over-collect user data (privacy concerns).
4. Struggle with specific Thai-language spam patterns.

**SpamWarden** exists to provide a **local, fast, and Thai-centric** alternative that stops spam at the source: the user's input field.

# Local Simulation & Testing

You can spin up a local simulation server to test the DOM auto-blocking behavior and inspect the SIEM telemetry payloads in real time:

1. **Start the simulation server**:
   ```bash
   npm run test-server
   ```
2. **Open the test page** in your browser:
   [http://localhost:3000/](http://localhost:3000/)
3. **Submit a spam message** (e.g., including currency signs like `฿` or links like `line[dot]me`).
4. **Observe the result**:
   - The form submission will be blocked on the page.
   - The terminal will display the defanged and sanitized telemetry payload sent to the SIEM receiver:
     ```text
     🚨 [SIEM RECEIVER] Blocked Payload Received!
     ================================================
     Client Token: cG9zdHEtZm9ybXxtZXNzYWdlLWlucHV0fDF8aHR0cDovL2xvY2FsaG9zdDozMDAwL3YxL3RlbGVtZXRyeQ
     URL: h_tt_p://localhost:3000/
     Rule Matched: currency_symbol
     Confidence: 100%
     PII Masked? false
     Pasted? false
     Actors: []
     Sanitized: "Win [CARD_MASKED] now!"
     ================================================
     ```
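
The `URL` and `Sanitized` fields in the payload above suggest two transformations: defanging the URL scheme and masking card-like digit runs. The sketch below shows one way such output could be produced. Both functions and their regular expressions are assumptions for illustration and may differ from what the library does internally.

```javascript
// Assumed defanging: break the scheme so the URL is no longer clickable.
function defangUrl(url) {
  return url.replace(/^http/i, "h_tt_p");
}

// Assumed masking: replace runs of 13 to 19 digits (optionally separated by
// single spaces or hyphens) with a placeholder.
function maskCards(text) {
  return text.replace(/\b\d(?:[ -]?\d){12,18}\b/g, "[CARD_MASKED]");
}

console.log(defangUrl("http://localhost:3000/"));
// → h_tt_p://localhost:3000/
console.log(maskCards("Win 4111 1111 1111 1111 now!"));
// → Win [CARD_MASKED] now!
```

The masking pattern starts and ends on a digit, so the spaces around the number are left in place.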

# About

- **Version:** 1.0.4 (v2 Engine)
- **Author:** [RedSocs](https://github.com/RedSocs)
- **License:** MIT
- **Model Origin:** Trained via [RedSocs/spam-labeler](https://github.com/RedSocs/spam-labeler)
- **Inquiries & Enterprise Support:** [pichit[at]redsocs.com](mailto:pichit@redsocs.com)
- **Sponsor:** [Buy Me a Coffee](https://buymeacoffee.com/redsocs?new=1)

---

### Technical Specs

| Property          | Value                     |
| ----------------- | ------------------------- |
| **Minified Size** | ~2.0 MB (including model) |
| **Gzipped Size**  | **~341 KB**               |
| **Dependencies**  | 0 (Vanilla JS)            |
| **Vocabulary**    | 36,017 features           |