web-sentinel 0.0.1
- package/README.md +68 -0
- package/dist/chunk-VGMZURHA.js +5 -0
- package/dist/hooks.d.ts +6 -0
- package/dist/hooks.js +1 -0
- package/dist/index.d.ts +12 -0
- package/dist/index.js +1 -0
- package/dist/middleware.d.ts +6 -0
- package/dist/middleware.js +1 -0
- package/dist/types-Cow_3dYg.d.ts +18 -0
- package/package.json +72 -0
package/README.md
ADDED
|
@@ -0,0 +1,68 @@
|
|
|
1
|
+
# Sentinel
|
|
2
|
+
|
|
3
|
+
Ultra-high performance blocking of bots and vulnerability scanners.
|
|
4
|
+
|
|
5
|
+
When you host a service on the web, you'll inevitably get hit by bots scanning for vulnerable websites. The most common target is WordPress, due to its poor security history, but there are others too, and sometimes the target is a modern site to which someone has accidentally deployed credentials or some other sensitive file.
|
|
6
|
+
|
|
7
|
+
Even if your site isn't vulnerable, these bots can be a nuisance sending hundreds of requests that could trigger database lookups, wake instances, fill your logs with noise, and generally waste your resources.
|
|
8
|
+
|
|
9
|
+
Sure, there are some very nice protection services available, but they cost money and take time to set up and manage. Wouldn't it be nice to just detect illegitimate requests and slam the door shut as quickly as possible? After all, you moved on from WordPress 20 years ago and nothing on your site is serving up .php files, so why let those requests hit your app?
|
|
10
|
+
|
|
11
|
+
It's provided as a low-level function generator, as Node or Polka HTTP middleware, or as a SvelteKit `hooks.server` `handle` function (although it's better to put it in front of your app, even if you are using SvelteKit).
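As a rough illustration of the middleware form, the sketch below wraps any four-argument check function in a Node-style `(req, res, next)` handler. The names `createMiddleware` and `isBlocked` are hypothetical, and the check passed in at the bottom is a hand-written stand-in rather than the compiled function the library generates.

```javascript
// Hypothetical wrapper: turns a check function into Node/Polka-style middleware.
function createMiddleware(isBlocked, status = 404) {
  return (req, res, next) => {
    const url = req.url || '/';
    // Drop any fragment, then separate the path from the query string.
    const [withoutHash] = url.split('#');
    const [pathname, search = ''] = withoutHash.split('?');
    const hostname = req.headers.host || '';
    const userAgent = req.headers['user-agent'] || '';

    if (isBlocked(hostname, pathname, userAgent, search)) {
      // Reject immediately, before the request reaches the app.
      res.writeHead(status, { 'Content-Type': 'text/plain', Connection: 'close' });
      res.end();
    } else {
      next();
    }
  };
}

// Stand-in check for illustration only: block anything ending in ".php".
const blockPhp = createMiddleware((hostname, pathname) => pathname.endsWith('.php'));
```

Mounted ahead of the application's own handlers, a wrapper like this means a rejected request never reaches routing, database lookups, or rendering.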
|
|
12
|
+
|
|
13
|
+
# High-Performance Bot Filter: The "Jump-Trie" Approach
|
|
14
|
+
|
|
15
|
+
This library uses a pre-compiled, static **Deterministic Finite Automaton (DFA)** implemented via nested `switch` statements. This approach is specifically engineered to leverage the architectural strengths of modern CPUs and the V8 JavaScript engine's optimization pipeline to produce the fastest pattern checks possible.
|
|
16
|
+
|
|
17
|
+
## 🚀 Performance Benchmarks
|
|
18
|
+
|
|
19
|
+
On an **M4 Mac Mini**, this implementation achieves:
|
|
20
|
+
|
|
21
|
+
- **Throughput:** ~37,000,000+ operations per second.
|
|
22
|
+
- **Latency:** < 0.0001ms (Mean).
|
|
23
|
+
- **Stability:** ±0.1% RME (Relative Margin of Error).
|
|
24
|
+
|
|
25
|
+
## 🧠 How the "Jump Code" Works
|
|
26
|
+
|
|
27
|
+
Standard routers or filters typically use a `Map`, `Set`, or `Regex`. While flexible, those methods incur overhead from hashing, state machine initialization, or object property lookups.
|
|
28
|
+
|
|
29
|
+
This implementation compiles your blacklists into a **Hard-Coded Prefix Tree (Trie)** that branches on `path.charCodeAt(n)` and uses a fast `path.startsWith()` check against a static string to confirm a match. No garbage, no intermediate strings.
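To make this concrete, here is a hand-written sketch of the kind of function such a compile step could produce for the three prefixes `/.env`, `/.git`, and `/wp-admin`. The name `isBlockedPath` is an illustrative assumption, and the real generated code may differ in detail.

```javascript
// Hand-written illustration of a hard-coded prefix trie for:
//   /.env   /.git   /wp-admin
function isBlockedPath(value) {
  // Shorter than the shortest prefix: cannot match.
  if (value.length < 5) return false;
  switch (value.charCodeAt(0)) {
    case 47: // '/'
      switch (value.charCodeAt(1)) {
        case 46: // '.'
          switch (value.charCodeAt(2)) {
            case 101: // 'e'
              // Only one candidate remains, so confirm the rest in one call.
              if (value.startsWith('nv', 3)) return true;
              break;
            case 103: // 'g'
              if (value.startsWith('it', 3)) return true;
              break;
          }
          return false;
        case 119: // 'w'
          if (value.startsWith('p-admin', 2)) return true;
          break;
      }
      return false;
  }
  return false;
}
```

A path such as `/about` leaves the function at the second character, because `'a'` matches neither the `'.'` nor the `'w'` case.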
|
|
30
|
+
|
|
31
|
+
### 1. The Power of `charCodeAt`
|
|
32
|
+
|
|
33
|
+
Unlike `path[0]` or `path.substring()`, `charCodeAt` returns a plain integer (the UTF-16 code unit at the given index) rather than a new string. This maps closely to low-level CPU instructions and avoids creating new string objects.
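A small sketch of the difference, using a made-up `requestPath` value:

```javascript
const requestPath = '/wp-admin';

// Indexing returns a one-character string...
const asString = requestPath[1]; // 'w'

// ...whereas charCodeAt returns a number that can be used directly as a switch case.
const asCode = requestPath.charCodeAt(1); // 119

// Reading past the end yields NaN instead of throwing,
// so a switch on the result simply matches no case.
const pastEnd = requestPath.charCodeAt(100); // NaN
```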
|
|
34
|
+
|
|
35
|
+
### 2. Jump Tables vs. Linear Search
|
|
36
|
+
|
|
37
|
+
When the JavaScript engine (V8) encounters a `switch` statement with integer cases, it doesn't have to perform a series of "if/else" checks. If the case values are dense enough, it can compile them to a **Jump Table**.
|
|
38
|
+
|
|
39
|
+
Instead of checking every possibility, the CPU calculates a memory offset based on the character code and "jumps" directly to the next block of code. This makes the search complexity `O(L)`, where `L` is the length of the string to match, regardless of how many total paths are in your blacklist. What does this mean? Pure speed, baby!
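The following is a simplified sketch of how a prefix list could be compiled into nested `switch` statements at startup. The names `buildTrie`, `emitSwitch`, and `compilePrefixes` are hypothetical, and the sketch leaves out the `startsWith` shortcut and suffix handling, so treat it as an illustration of the idea rather than the library's generator.

```javascript
// Build a trie keyed by character code from a list of prefixes.
function buildTrie(prefixes) {
  const root = { children: new Map(), terminal: false };
  for (const prefix of prefixes) {
    let node = root;
    for (let i = 0; i < prefix.length; i++) {
      const code = prefix.charCodeAt(i);
      let child = node.children.get(code);
      if (!child) {
        child = { children: new Map(), terminal: false };
        node.children.set(code, child);
      }
      node = child;
    }
    node.terminal = true; // a complete prefix ends here
  }
  return root;
}

// Emit one switch per trie level; each level inspects exactly one character.
function emitSwitch(node, depth) {
  if (node.terminal) return 'return true;';
  const cases = [];
  for (const [code, child] of node.children) {
    cases.push(`case ${code}: { ${emitSwitch(child, depth + 1)} }`);
  }
  return `switch (value.charCodeAt(${depth})) { ${cases.join(' ')} } return false;`;
}

// Turn the generated source into a callable function.
function compilePrefixes(prefixes) {
  return new Function('value', emitSwitch(buildTrie(prefixes), 0));
}

const isProbe = compilePrefixes(['/.env', '/.git', '/wp-admin']);
```

Each call walks at most one `switch` per character of the longest prefix, so adding more prefixes widens the switches without adding steps. Only numeric character codes are written into the generated source, which keeps pattern text out of the code string.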
|
|
40
|
+
|
|
41
|
+
### 3. Early Exit Strategy
|
|
42
|
+
|
|
43
|
+
Most bot probes can be rejected after checking only one or two characters.
|
|
44
|
+
|
|
45
|
+
- **Traditional Regex:** Scans the string for patterns, often looking at the entire path. Multiple regexes scan the same path over and over.
|
|
46
|
+
- **Jump-Trie:** If a path starts with `/a...` and your filter only cares about `/w...` (WordPress) and `/.e...` (Env), the function returns `false` on the **very first character**.
|
|
47
|
+
|
|
48
|
+
## 🛠 Why it’s so fast on modern hardware
|
|
49
|
+
|
|
50
|
+
### Mechanical Sympathy
|
|
51
|
+
|
|
52
|
+
Modern CPUs feature highly advanced **Branch Predictors**. Because the logic is "baked" into the source code rather than stored in a data structure, the CPU can "learn" the structure of your filter. It begins speculatively executing the next switch level before the current one has even finished, effectively hiding the latency of the check.
|
|
53
|
+
|
|
54
|
+
### Zero Allocations
|
|
55
|
+
|
|
56
|
+
This filter is "garbage collector friendly." It performs:
|
|
57
|
+
|
|
58
|
+
- **Zero** array iterations.
|
|
59
|
+
- **Zero** object allocations.
|
|
60
|
+
- **Zero** string slicing.
|
|
61
|
+
|
|
62
|
+
It operates entirely on the stack using primitives.
|
|
63
|
+
|
|
64
|
+
## ⚠️ Limitations & Best Practices
|
|
65
|
+
|
|
66
|
+
- **Static Nature:** This is not a dynamic router. Any change to the blacklist requires a recompile, but that is also blazingly fast and can be done during app startup to generate optimized code that is then reused.
|
|
67
|
+
- **Code Size:** While extremely fast, a list of 10,000+ paths will result in a large JS file. This may eventually exceed the CPU’s **L1 Instruction Cache**, leading to a slight performance dip. It's intended for hundreds of patterns.
|
|
68
|
+
- **Type Safety:** Ensure only strings are passed to the `test()` function. Passing `undefined` or an `object` will cause a V8 "De-optimization," dropping performance significantly.
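One way to follow the type-safety advice is to guard the compiled check so that only strings ever reach it. The sketch below uses hypothetical names (`guarded`, `safeTest`) and a stand-in check in place of a real compiled function.

```javascript
// Hypothetical guard: non-string input is rejected before the check runs.
function guarded(test) {
  return (value) => typeof value === 'string' && test(value);
}

// Stand-in for a compiled check, for illustration only.
const endsWithPhp = (value) => value.endsWith('.php');

const safeTest = guarded(endsWithPhp);
```

The shipped middleware and hook take a similar precaution for missing headers by falling back to an empty string before calling the check.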
|
package/dist/chunk-VGMZURHA.js
ADDED
|
@@ -0,0 +1,5 @@
|
|
|
1
|
+
var k={hostname:{suffix:[".bc.googleusercontent.com",".appspot.com",".google.com"]},pathname:{prefix:["/.env","/.git","/.ssh","/.map","/.yml","/.yaml","/.vscode","/.npmrc","/.DS_Store","/.well-known/security.txt","/.aws/credentials","/wp-admin","/wp-config","/wp-content","/wp-includes","/wlwmanifest.xml","/package.json","/cgi-bin","/bash_history","/etc/passwd"],suffix:[".env",".asp",".ashx",".aspx",".bak",".cgi",".php",".py",".rb",".rss",".zip",".dat",".rar",".gz",".sql",".go",".swift",".ts"]},user_agent:{prefix:["python-requests/","go-http-client/","curl/","wget/","scrapy/","urllib/","aiohttp/","axios/"]},search_params:{contain:["../"]},http_status:404};function P(e,t){let n={};for(let r of t){let s=n,o=e==="suffix"?r.split("").reverse().join(""):r;for(let i of o)s[i]||(s[i]={}),s=s[i]}return n}function m(e){let t=Object.keys(e);if(t.length){let n=0;for(let r of t)n+=m(e[r]);return n}else return 1}function x(e){let t=Object.keys(e);if(t.length===1){let n=t[0];return n+x(e[n])}else return""}function a(e){return" ".repeat(e)}function y(e,t,n,r=0){let s=Object.keys(e),o=s.length===0,i=n==="suffix";if(s.length===1&&!o&&m(e)===1){let c=s[0],p=c+x(e[c]);if(i){let f=p.split("").reverse().join("");return[`${a(r)}if (value.endsWith('${f}', value.length - ${t})) return true;`,...t>0?[`${a(r)}break;`]:[]]}else{let f=t+p.length,v=n==="exact"?`value.length === ${f} && value.startsWith('${p}', ${t})`:`value.startsWith('${p}', ${t})`;return[`${a(r)}if (${v}) return true;`,...t>0?[`${a(r)}break;`]:[]]}}if(s.length===0)return[`${a(t)}return true;`];let l=i?`value.length - ${t+1}`:t.toString(),u=[];o&&!i&&u.push(`${a(r)}if (value.length === ${t}) return true;`),u.push(`${a(r)}switch (value.charCodeAt(${l})) {`);for(let c of s)u.push(`${a(r+1)}case ${c.charCodeAt(0)}: // '${c}'`),u.push(...y(e[c],t+1,n,r+2));return u.push(`${a(r)}}`),u.push(`${a(r)}return false;`),u.filter(c=>c.trim().length)}function h(e,t){switch(e){case"contain":return t.map(s=>`if (value.includes('${s}')) return 
true;`).concat("return false;");case"exact":case"prefix":case"suffix":let n=P(e,t);return[`if (value.length < ${Math.min(...t.map(s=>s.length))}) return false;`,...y(n,0,e)]}}function w(e,t){let n=h(e,t).join(`
|
|
2
|
+
`);return new Function("value",n)}var g=["prefix","suffix","exact","contain"];var $=["hostname","pathname","user_agent","search_params"];function b(e){let t=[];for(let n of $){let r=e[n];t.push(` function check_${n}(value) {`);for(let o of g)if(r[o]){let i=h(o,r[o]);t.push(` function check_${o}(value) {`),t.push(...i.map(l=>" "+l)),t.push(` };
|
|
3
|
+
`)}let s=g.filter(o=>r[o]).map(o=>`check_${o}(value)`);t.push(` return ${s.join(" || ")};`),t.push(` };
|
|
4
|
+
`)}return t.push(` return ${$.map(n=>`check_${n}(${n})`).join(" || ")};`),t.join(`
|
|
5
|
+
`)}function d(e){let t=b(e);return new Function("hostname","pathname","user_agent","search_params",t)}export{k as a,h as b,w as c,g as d,b as e,d as f};
|
package/dist/hooks.d.ts
ADDED
package/dist/hooks.js
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
import{a as o,f as s}from"./chunk-VGMZURHA.js";import{STATUS_CODES as h}from"http";var _=n=>{let t={...o,...n},r=s(t);return async function({event:e,resolve:a}){let{request:p,url:i}=e,{hostname:c,pathname:m,search:l}=i,u=p.headers.get("user-agent")||"";return r(c,m,u,l)?new Response(h[t.http_status],{status:t.http_status,headers:{"Content-Type":"text/plain","Cache-Control":"public, max-age=3600",Connection:"close"}}):a(e)}};export{_ as createHandler};
|
package/dist/index.d.ts
ADDED
|
@@ -0,0 +1,12 @@
|
|
|
1
|
+
import { O as Options, P as PatternType } from './types-Cow_3dYg.js';
|
|
2
|
+
export { H as Http4xx, a as Patterns } from './types-Cow_3dYg.js';
|
|
3
|
+
|
|
4
|
+
declare const default_options: Options;
|
|
5
|
+
|
|
6
|
+
declare function buildPatternCheck(options: Options): string;
|
|
7
|
+
declare function compilePatternCheck(options: Options): (this: void, hostname: string, pathname: string, user_agent: string, search_params: string) => boolean;
|
|
8
|
+
|
|
9
|
+
declare function buildRules(type: PatternType, patterns: string[]): string[];
|
|
10
|
+
declare function compileRules(type: PatternType, patterns: string[]): (this: void, value: string) => boolean;
|
|
11
|
+
|
|
12
|
+
export { Options, PatternType, buildPatternCheck, buildRules, compilePatternCheck, compileRules, default_options };
|
package/dist/index.js
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
import{a as o,b as r,c as e,d as f,e as m,f as p}from"./chunk-VGMZURHA.js";export{f as PatternType,m as buildPatternCheck,r as buildRules,p as compilePatternCheck,e as compileRules,o as default_options};
|
package/dist/middleware.d.ts
ADDED
package/dist/middleware.js
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
import{a as s,f as n}from"./chunk-VGMZURHA.js";import{STATUS_CODES as d}from"http";function b(r){let t={...s,...r},a=n(t);return(e,o,p)=>{let i=e.url||"/",[c]=i.split("#"),[m,l=""]=c.split("?"),h=e.headers.host||"",u=e.headers["user-agent"]||"";a(h,m,u,l)?(o.writeHead(t.http_status,d[t.http_status],{"Content-Type":"text/plain","Cache-Control":"public, max-age=3600",Connection:"close"}),o.end()):p()}}export{b as middleware};
|
package/dist/types-Cow_3dYg.d.ts
ADDED
|
@@ -0,0 +1,18 @@
|
|
|
1
|
+
declare const PatternType: readonly ["prefix", "suffix", "exact", "contain"];
|
|
2
|
+
type PatternType = (typeof PatternType)[number];
|
|
3
|
+
type Http4xx = 400 | 401 | 402 | 403 | 404 | 405 | 406 | 407 | 408 | 409 | 410 | 411 | 412 | 413 | 414 | 415 | 416 | 417 | 418 | 421 | 422 | 423 | 424 | 425 | 426 | 428 | 429 | 431 | 451;
|
|
4
|
+
interface Patterns {
|
|
5
|
+
exact?: string[];
|
|
6
|
+
prefix?: string[];
|
|
7
|
+
suffix?: string[];
|
|
8
|
+
contain?: string[];
|
|
9
|
+
}
|
|
10
|
+
interface Options {
|
|
11
|
+
hostname: Patterns;
|
|
12
|
+
pathname: Patterns;
|
|
13
|
+
user_agent: Patterns;
|
|
14
|
+
search_params: Patterns;
|
|
15
|
+
http_status: Http4xx;
|
|
16
|
+
}
|
|
17
|
+
|
|
18
|
+
export { type Http4xx as H, type Options as O, PatternType as P, type Patterns as a };
|
package/package.json
ADDED
|
@@ -0,0 +1,72 @@
|
|
|
1
|
+
{
|
|
2
|
+
"name": "web-sentinel",
|
|
3
|
+
"description": "High performance protection against bots and probes",
|
|
4
|
+
"license": "MIT",
|
|
5
|
+
"version": "0.0.1",
|
|
6
|
+
"homepage": "https://captaincodeman.github.io/sentinel/",
|
|
7
|
+
"repository": {
|
|
8
|
+
"type": "git",
|
|
9
|
+
"url": "git+https://github.com/captaincodeman/sentinel.git"
|
|
10
|
+
},
|
|
11
|
+
"author": {
|
|
12
|
+
"name": "Simon Green",
|
|
13
|
+
"email": "simon@captaincodeman.com",
|
|
14
|
+
"url": "https://www.captaincodeman.com/"
|
|
15
|
+
},
|
|
16
|
+
"keywords": [
|
|
17
|
+
"http",
|
|
18
|
+
"bot",
|
|
19
|
+
"scrapers",
|
|
20
|
+
"sveltekit",
|
|
21
|
+
"middleware"
|
|
22
|
+
],
|
|
23
|
+
"type": "module",
|
|
24
|
+
"exports": {
|
|
25
|
+
".": {
|
|
26
|
+
"types": "./dist/index.d.ts",
|
|
27
|
+
"import": "./dist/index.js"
|
|
28
|
+
},
|
|
29
|
+
"./hooks": {
|
|
30
|
+
"types": "./dist/hooks.d.ts",
|
|
31
|
+
"import": "./dist/hooks.js"
|
|
32
|
+
},
|
|
33
|
+
"./middleware": {
|
|
34
|
+
"types": "./dist/middleware.d.ts",
|
|
35
|
+
"import": "./dist/middleware.js"
|
|
36
|
+
}
|
|
37
|
+
},
|
|
38
|
+
"files": [
|
|
39
|
+
"dist"
|
|
40
|
+
],
|
|
41
|
+
"devDependencies": {
|
|
42
|
+
"@sveltejs/adapter-static": "^3.0.10",
|
|
43
|
+
"@sveltejs/kit": "^2.50.2",
|
|
44
|
+
"@sveltejs/vite-plugin-svelte": "^6.2.4",
|
|
45
|
+
"@tailwindcss/vite": "^4.1.18",
|
|
46
|
+
"@types/node": "^25.2.3",
|
|
47
|
+
"@vitest/ui": "4.0.18",
|
|
48
|
+
"lucide-svelte": "^0.563.0",
|
|
49
|
+
"prettier": "^3.8.1",
|
|
50
|
+
"prettier-plugin-svelte": "^3.4.1",
|
|
51
|
+
"prettier-plugin-tailwindcss": "^0.7.2",
|
|
52
|
+
"svelte": "^5.49.2",
|
|
53
|
+
"svelte-check": "^4.3.6",
|
|
54
|
+
"tailwindcss": "^4.1.18",
|
|
55
|
+
"tsup": "^8.5.1",
|
|
56
|
+
"typescript": "^5.9.3",
|
|
57
|
+
"vite": "^7.3.1",
|
|
58
|
+
"vitest": "^4.0.18"
|
|
59
|
+
},
|
|
60
|
+
"scripts": {
|
|
61
|
+
"dev": "vite dev",
|
|
62
|
+
"build": "vite build",
|
|
63
|
+
"preview": "vite preview",
|
|
64
|
+
"check": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json",
|
|
65
|
+
"check:watch": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json --watch",
|
|
66
|
+
"lint": "prettier --plugin prettier-plugin-svelte --check .",
|
|
67
|
+
"format": "prettier --plugin prettier-plugin-svelte --write .",
|
|
68
|
+
"test": "vitest",
|
|
69
|
+
"test:ui": "vitest --ui",
|
|
70
|
+
"bench": "vitest bench"
|
|
71
|
+
}
|
|
72
|
+
}
|