llm-trust-guard 4.20.0 → 4.21.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -5,6 +5,105 @@ All notable changes to `llm-trust-guard` will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [4.21.2] - 2026-06-12
9
+
10
+ ### Docs — document `CodeAnalyzerBackend`; add README-sync gate (G11)
11
+
12
+ - **README**: documented the pluggable `CodeAnalyzerBackend` seam (4.21.0) with an
13
+ acorn example, and noted CommonJS + ESM both work (4.21.1). The README previously
14
+ did not mention the new public API.
15
+ - **Verification (G11)**: a new gate fails the build when `src/index.ts` (public
16
+ exports) changes since the last tag but `README.md` does not — closing the
17
+ docs-drift gap (override `ALLOW_NO_README_UPDATE=1`). See VERIFICATION.md.
18
+
19
+ No code/behavior change.
20
+
21
+ ## [4.21.1] - 2026-06-12
22
+
23
+ ### Fixed — ESM named exports (`dist/index.mjs`)
24
+
25
+ `import { InputSanitizer } from "llm-trust-guard"` previously failed for **every**
26
+ named export — `dist/index.mjs` was default-only. Cause: `build-esm.js` bundled the
27
+ **compiled CJS** (`dist/index.js`), and esbuild cannot recover named exports from
28
+ tsc's CJS getter output (latent since the initial commit; not a size tradeoff —
29
+ `minify` is orthogonal). CommonJS `require()` was always fine, which is why it went
30
+ unnoticed.
31
+
32
+ - **Fix:** build the `.mjs` from the TS **source** (`src/index.ts`) so `export { … }`
33
+ statements survive. `dist/index.mjs` now has a named-export block (0 → 1) and no
34
+ default-only export.
35
+ - Regression guard added: `tests/esm-build.test.ts`.
36
+ - No API or behavior change; CommonJS unaffected. Verified by `npm pack` → ESM consumer
37
+ smoke (named `import { … }` now resolves) — see `tests/adversarial/RESULTS-v4.21.1.md`.
38
+
39
+ ## [4.21.0] - 2026-06-09
40
+
41
+ ### Added — Pluggable `CodeAnalyzerBackend` (optional AST analysis, zero-dep default)
42
+
43
+ `CodeExecutionGuard` now accepts an optional `analyzerBackend` — a pluggable
44
+ code-analysis seam (mirroring the existing `DetectionClassifier`). The default stays
45
+ **regex-only / zero-dependency**; provide a backend to add AST-level detection of JS
46
+ sandbox-escape gadgets that regex cannot reliably see.
47
+
48
+ - New exports: `CodeFinding`, `CodeAnalyzerBackend`; new config field `analyzerBackend`
49
+ and `CodeExecutionGuard.setAnalyzerBackend()`. Findings are **additive** (only add
50
+ detections); a throwing backend never crashes the guard.
51
+ - Reference implementation: `examples/acorn-code-analyzer.ts` (acorn). Measured —
52
+ three JS escape gadgets (`this.constructor.constructor('return process')()`,
53
+ `[].constructor.constructor(...)()`, `Function('return process')()`) go **3/3 missed
54
+ by regex → 3/3 blocked** with the backend; benign JS unaffected.
55
+ - 9 new tests (6 zero-dep wiring + 3 acorn). `acorn` added as a **devDependency only** —
56
+ the published package keeps **zero production dependencies**.
57
+ - Why a seam and not a bundled parser: JS has no stdlib parser, so bundling acorn/oxc
58
+ would break the zero-dep guarantee. The Python package uses stdlib `ast` directly
59
+ (v0.10.3). See RESEARCH_LOG.md. Detection only — still no runtime sandbox.
60
+
61
+ ## [4.20.2] - 2026-06-06
62
+
63
+ ### Added — Benign-context suppression (false-positive reduction)
64
+
65
+ `InputSanitizer` now cancels the soft `ignore_instructions` / `disregard_above`
66
+ triggers when the object is a benign technical noun (e.g. "ignore the
67
+ whitespace", "ignore case", "ignore the previous error") **and** the input
68
+ contains no instruction/rule/prompt/safety noun anywhere, **and** the prompt
69
+ carries no high-signal exfiltration/execution/credential/money token. Any real
70
+ injection ("ignore previous instructions", "disregard your rules") references an
71
+ instruction-noun and is never suppressed.
72
+
73
+ - **Suppression veto**: suppression is refused when the prompt also contains a
74
+ URL, email address, credential/secret word, shell pipe / `rm -rf` / `curl` /
75
+ `wget`, destructive `delete`/`drop`, a money amount (`$NN`), or a long account
76
+ number. This closes the escape hatch where an attacker prefixes a real payload
77
+ with "ignore the previous output …" to cancel the trigger. 10 bypass controls
78
+ added to the probe (all blocked).
79
+ - New curated probe `tests/benign-context.test.ts`: 28 benign coding-context
80
+ prompts (0 blocked) + 12 attack controls + 10 suppression-bypass controls
81
+ (0 leaked).
82
+ - **Recall preserved**: full suite 716 pass (was 711). WildChat-1M shard 0
83
+ (n=10,000, seed 42) Pipeline A block count is **unchanged at 493 (raw FPR
84
+ 4.93%)** — that consumer corpus does not exercise the benign technical-object
85
+ class, so the win is scoped to coding/technical deployments and does **not**
86
+ move the published ~2.73% corrected WildChat FPR.
87
+ - Reproducible WildChat measurement committed at
88
+ `tests/adversarial/fixtures/wildchat-sample10k.jsonl` (Git LFS, ODC-BY,
89
+ `allenai/WildChat-1M`).
90
+ - Known pre-existing gap noted (not addressed here): `"disregard your previous
91
+ rules"` is not matched by the `disregard` patterns — a recall issue, separate
92
+ from this FP work.
93
+
94
+ ## [4.20.1] - 2026-04-24
95
+
96
+ ### Changed — Documentation accuracy
97
+
98
+ - **README**: Removed "31 → 34 security guards" inconsistency (was contradicting the All 34 Guards table and `package.json`)
99
+ - **README**: Removed unmeasured "<5ms latency" assertion from intro
100
+ - **README**: Removed unmeasured "~97% on curated benchmarks" framing from "What it catches well"
101
+ - **README**: Qualified the four "100% detection" claims (Policy Puppetry, Role-play, PAP, Multilingual) as "100% on unit tests" with a section preface explaining that these are unit-test rates, not corpus measurements. Broader corpus measurements live in [RESULTS-v4.19.0.md](tests/adversarial/RESULTS-v4.19.0.md)
102
+ - **README**: Added Homoglyph attacks bullet to "What it catches well" (parity with Python README; feature exists in `encoding-detector`, `prompt-leakage-guard`, `multimodal-guard`, `memory-guard`)
103
+ - **README**: Added v4.20.0 MCP Sampling detection note in Measured Performance preface; benchmark numbers apply unchanged because Sampling is orthogonal to the Sanitizer+Encoder pipelines benchmarked
104
+
105
+ No code changes. Same 711 tests pass.
106
+
8
107
  ## [4.20.0] - 2026-04-24
9
108
 
10
109
  ### Added — MCP Sampling Attack Detection (Unit42 + Blueinfy, Feb 2026)
package/CONTRIBUTING.md CHANGED
@@ -219,12 +219,26 @@ When adding new detection patterns:
219
219
 
220
220
  #### PR Checklist
221
221
 
222
- - [ ] Tests pass (`npm test`)
223
- - [ ] Code follows style guidelines
224
- - [ ] Documentation updated
225
- - [ ] CHANGELOG.md updated (for features/fixes)
226
- - [ ] No console.log statements (except for intentional logging)
227
- - [ ] Types are properly defined
222
+ - [ ] `npm run verify` is green (the eval-gated pipeline — see [VERIFICATION.md](VERIFICATION.md))
223
+ - [ ] New/changed `src/` ships with tests (enforced by gate **G6**)
224
+ - [ ] CHANGELOG.md top entry matches the version (gate **G7**)
225
+ - [ ] `tests/adversarial/RESULTS-v<version>.md` written for any release/claim (gate **G8**)
226
+ - [ ] `RESEARCH_LOG.md` entry added if the change cites a threat/technique/benchmark
227
+ - [ ] No console.log statements (except intentional logging)
228
+
229
+ ### Verification (required before push)
230
+
231
+ Run one command — it runs build, the full suite, coverage thresholds, the WildChat
232
+ FP-regression gate, the adversarial-bypass probe, and the changelog/results checks:
233
+
234
+ ```bash
235
+ npm run verify
236
+ ```
237
+
238
+ This is enforced both locally (`.githooks/pre-push`, install via
239
+ `bash scripts/install-hooks.sh`) and in CI, so it can't be skipped. See
240
+ [VERIFICATION.md](VERIFICATION.md) for the eight gates and how each maps to our
241
+ "don't break it / don't make it up / publish the basis" rules.
228
242
 
229
243
  ### Review Process
230
244
 
package/README.md CHANGED
@@ -3,7 +3,7 @@
3
3
  [![npm version](https://img.shields.io/npm/v/llm-trust-guard.svg)](https://www.npmjs.com/package/llm-trust-guard)
4
4
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5
5
 
6
- **31 security guards for LLM-powered and agentic AI applications.** Zero dependencies. <5ms latency. Covers OWASP Top 10 for LLMs 2025, OWASP Agentic AI 2026, and MCP Security.
6
+ **34 security guards for LLM-powered and agentic AI applications.** Zero dependencies. Covers OWASP Top 10 for LLMs 2025, OWASP Agentic AI 2026, and MCP Security.
7
7
 
8
8
  Also available as a [Python package on PyPI](https://pypi.org/project/llm-trust-guard/) (`pip install llm-trust-guard`).
9
9
 
@@ -13,13 +13,17 @@ Also available as a [Python package on PyPI](https://pypi.org/project/llm-trust-
13
13
 
14
14
  This package is your **first line of defense** — like a WAF (Web Application Firewall) for LLM applications. It sits in the orchestration layer and catches known attack patterns before they reach the LLM and after the LLM responds.
15
15
 
16
- ### What it catches well (~97% on curated benchmarks)
16
+ ### What it catches well
17
+
18
+ Per-category detection rates below are measured against the package's curated unit-test suite (representative attack samples per category). On broader held-out corpora these rates are typically lower — see [tests/adversarial/RESULTS-v4.19.0.md](tests/adversarial/RESULTS-v4.19.0.md) for measured detection on attack corpora and [Known limitations](#what-it-catches-partially-50-80-detection) below.
19
+
17
20
  - Known prompt injection phrases (170+ patterns, 11 languages)
18
21
  - Encoding bypass attacks (9 formats: Base64, URL, Unicode, Hex, HTML, ROT13, Octal, Base32, mixed)
19
- - Policy Puppetry attacks (JSON/INI/XML/YAML-formatted injection) — 100% detection
20
- - Role-play/persona attacks (translator trick, academic pretext, emotional manipulation) — 100% detection
21
- - PAP/persuasion attacks (authority, urgency, emotional manipulation) — 100% detection
22
- - Multilingual injection (10 languages) — 100% detection
22
+ - Policy Puppetry attacks (JSON/INI/XML/YAML-formatted injection) — 100% on unit tests
23
+ - Role-play/persona attacks (translator trick, academic pretext, emotional manipulation) — 100% on unit tests
24
+ - PAP/persuasion attacks (authority, urgency, emotional manipulation) — 100% on unit tests
25
+ - Multilingual injection (10 languages) — 100% on unit tests
26
+ - Homoglyph attacks (Cyrillic/Greek character substitution) — normalized and detected
23
27
  - PII and secret leakage in outputs
24
28
  - Tool hallucination, RBAC bypass, multi-tenant violations
25
29
  - Tool result poisoning, context window stuffing
@@ -189,6 +193,28 @@ const output = guard.filterOutput(llmResponse, session.role);
189
193
  |-----------|---------|
190
194
  | DetectionClassifier | Plug in any ML backend (sync or async) alongside regex guards |
191
195
  | createRegexClassifier() | Built-in regex classifier as a DetectionClassifier callback |
196
+ | CodeAnalyzerBackend | Plug an AST parser (e.g. acorn/oxc) into `CodeExecutionGuard` — catches JS sandbox-escape gadgets regex misses, while the default stays zero-dependency |
197
+
198
+ `CodeExecutionGuard` is regex-only by default (zero dependencies). For AST-level
199
+ detection of gadget chains like `this.constructor.constructor('return process')()`
200
+ or the `Function` constructor, plug in a parser via `analyzerBackend` (findings are
201
+ additive; a throwing backend never crashes the guard):
202
+
203
+ ```ts
204
+ import { CodeExecutionGuard, type CodeAnalyzerBackend } from 'llm-trust-guard';
205
+ import { parse } from 'acorn'; // your dependency, not the library's
206
+
207
+ const acornBackend: CodeAnalyzerBackend = (code, language) => {
208
+ if (language !== 'javascript') return [];
209
+ // walk the AST, return [{ name, severity }] for dangerous nodes
210
+ return findGadgets(parse(code, { ecmaVersion: 'latest', sourceType: 'module' }));
211
+ };
212
+
213
+ const guard = new CodeExecutionGuard({ analyzerBackend: acornBackend });
214
+ ```
215
+
216
+ See `examples/acorn-code-analyzer.ts` for a complete reference. The Python package
217
+ ships this analysis built in (stdlib `ast`, no backend needed).
192
218
 
193
219
  ## OWASP Coverage
194
220
 
@@ -224,7 +250,7 @@ const output = guard.filterOutput(llmResponse, session.role);
224
250
 
225
251
  ## Measured Performance
226
252
 
227
- v4.19.0 benchmark, 2026-04-23. Full methodology, 95% confidence intervals, hand-adjudication labels, and reproducibility scripts: [tests/adversarial/RESULTS-v4.19.0.md](tests/adversarial/RESULTS-v4.19.0.md).
253
+ v4.19.0 benchmark, 2026-04-23. v4.20.0 added MCP Sampling attack detection (see [CHANGELOG.md](CHANGELOG.md)) — orthogonal to the Sanitizer+Encoder pipelines below, so numbers apply unchanged. Full methodology, 95% confidence intervals, hand-adjudication labels, and reproducibility scripts: [tests/adversarial/RESULTS-v4.19.0.md](tests/adversarial/RESULTS-v4.19.0.md).
228
254
 
229
255
  **Attack detection on prior-published corpora** (Giskard n=35, Compass CTF Chinese n=11): detection rate has not moved from v4.13.5 → v4.19.0 on the Sanitizer+Encoder pipeline — 80.00% and 9.09% respectively, identical to the v4.13.5 numbers. Six releases of pattern additions (v4.14–v4.19) targeted different attack classes (indirect injection, tool-result validation, memory persistence, multi-agent trust) that these direct-text jailbreak corpora do not exercise. Small sample sizes mean "no evidence of improvement," not "proof of no improvement."
230
256
 
@@ -16,6 +16,26 @@
16
16
  * - Resource limit enforcement
17
17
  * - Language-specific security rules
18
18
  */
19
+ /** A single finding from a pluggable code-analysis backend. */
20
+ export interface CodeFinding {
21
+ name: string;
22
+ /** Added to the risk score (0-100 scale). */
23
+ severity: number;
24
+ kind?: string;
25
+ }
26
+ /**
27
+ * Pluggable code-analysis backend (e.g. an AST parser such as acorn or oxc).
28
+ *
29
+ * Default is regex-only (zero dependencies). Provide a backend to add AST-level
30
+ * detection — sandbox-escape gadget chains, the Function constructor, dynamic
31
+ * import — that regex cannot reliably see. Findings are ADDITIVE: a backend can
32
+ * only add detections, never remove them, and a throwing backend never crashes
33
+ * the guard. See `examples/acorn-code-analyzer.ts` for a reference implementation.
34
+ *
35
+ * (The Python package uses stdlib `ast` directly; JS has no stdlib parser, so the
36
+ * npm package keeps regex zero-dep by default and takes any parser via this seam.)
37
+ */
38
+ export type CodeAnalyzerBackend = (code: string, language: string) => CodeFinding[];
19
39
  export interface CodeExecutionGuardConfig {
20
40
  /** Allowed programming languages */
21
41
  allowedLanguages?: string[];
@@ -43,6 +63,8 @@ export interface CodeExecutionGuardConfig {
43
63
  }>;
44
64
  /** Risk threshold for blocking (0-100) */
45
65
  riskThreshold?: number;
66
+ /** Optional pluggable AST analyzer (acorn/oxc/etc.). Additive on top of regex. */
67
+ analyzerBackend?: CodeAnalyzerBackend;
46
68
  }
47
69
  export interface CodeAnalysisResult {
48
70
  allowed: boolean;
@@ -76,10 +98,13 @@ export interface SandboxConfig {
76
98
  }
77
99
  export declare class CodeExecutionGuard {
78
100
  private config;
101
+ private analyzerBackend?;
79
102
  private readonly DANGEROUS_PATTERNS;
80
103
  private readonly DEFAULT_BLOCKED_IMPORTS;
81
104
  private readonly DEFAULT_BLOCKED_FUNCTIONS;
82
105
  constructor(config?: CodeExecutionGuardConfig);
106
+ /** Register/replace the pluggable AST analyzer backend at runtime. */
107
+ setAnalyzerBackend(backend: CodeAnalyzerBackend): void;
83
108
  /**
84
109
  * Analyze code for dangerous patterns before execution
85
110
  */
@@ -1,2 +1,2 @@
1
- "use strict";Object.defineProperty(exports,"__esModule",{value:!0}),exports.CodeExecutionGuard=void 0;class CodeExecutionGuard{constructor(e={}){this.DANGEROUS_PATTERNS={javascript:[{name:"eval",pattern:/\beval\s*\(/g,severity:50},{name:"function_constructor",pattern:/new\s+Function\s*\(/g,severity:50},{name:"child_process",pattern:/require\s*\(\s*['"]child_process['"]\s*\)/g,severity:60},{name:"exec",pattern:/\b(exec|execSync|spawn|spawnSync)\s*\(/g,severity:60},{name:"fs_write",pattern:/\b(writeFile|writeFileSync|appendFile|unlink|rmdir)\s*\(/g,severity:45},{name:"process_env",pattern:/process\.env/g,severity:30},{name:"require_dynamic",pattern:/require\s*\(\s*[^'"]/g,severity:40},{name:"vm_module",pattern:/require\s*\(\s*['"]vm['"]\s*\)/g,severity:55},{name:"fetch_external",pattern:/fetch\s*\(\s*['"]https?:\/\/(?!localhost)/g,severity:35},{name:"websocket",pattern:/new\s+WebSocket\s*\(/g,severity:35},{name:"prototype_pollution",pattern:/__proto__|constructor\s*\[|Object\.setPrototypeOf/g,severity:50},{name:"global_access",pattern:/\bglobal\b|\bglobalThis\b/g,severity:35}],python:[{name:"eval",pattern:/\beval\s*\(/g,severity:50},{name:"exec",pattern:/\bexec\s*\(/g,severity:50},{name:"compile",pattern:/\bcompile\s*\(/g,severity:45},{name:"subprocess",pattern:/import\s+subprocess|from\s+subprocess/g,severity:60},{name:"os_system",pattern:/os\.(system|popen|exec)/g,severity:60},{name:"os_module",pattern:/import\s+os|from\s+os\s+import/g,severity:40},{name:"socket",pattern:/import\s+socket|from\s+socket/g,severity:40},{name:"pickle",pattern:/import\s+pickle|pickle\.loads?/g,severity:55},{name:"ctypes",pattern:/import\s+ctypes|from\s+ctypes/g,severity:55},{name:"builtins",pattern:/__builtins__|__import__/g,severity:50},{name:"file_write",pattern:/open\s*\([^)]*['"]w['"]/g,severity:40},{name:"requests",pattern:/requests\.(get|post|put|delete)\s*\(/g,severity:35},{name:"getattr_dynamic",pattern:/getattr\s*\(\s*\w+\s*,\s*[^'"]/g,severity:40}],bash:[{name:"rm_rf",pattern:/rm\s+(-rf?|--recursive)/gi,severity:70},{name:"sudo",pattern:/\bsudo\b/gi,severity:60},{name:"curl_pipe",pattern:/curl\s+.*\|\s*(ba)?sh/gi,severity:70},{name:"wget_execute",pattern:/wget\s+.*&&\s*(ba)?sh/gi,severity:70},{name:"eval",pattern:/\beval\b/gi,severity:50},{name:"env_dump",pattern:/\benv\b|\bprintenv\b/gi,severity:35},{name:"chmod",pattern:/chmod\s+(\+x|777|755)/gi,severity:40},{name:"chown",pattern:/\bchown\b/gi,severity:45},{name:"dd",pattern:/\bdd\s+if=/gi,severity:55},{name:"nc_reverse",pattern:/\bnc\b.*-e/gi,severity:70},{name:"base64_decode",pattern:/base64\s+(-d|--decode)/gi,severity:40},{name:"cron",pattern:/crontab|\/etc\/cron/gi,severity:50}],sql:[{name:"drop_table",pattern:/DROP\s+(TABLE|DATABASE)/gi,severity:70},{name:"delete_all",pattern:/DELETE\s+FROM\s+\w+\s*(;|$)/gi,severity:60},{name:"truncate",pattern:/TRUNCATE\s+TABLE/gi,severity:65},{name:"union_injection",pattern:/UNION\s+(ALL\s+)?SELECT/gi,severity:55},{name:"comment_injection",pattern:/--\s*$/gm,severity:30},{name:"xp_cmdshell",pattern:/xp_cmdshell/gi,severity:70},{name:"into_outfile",pattern:/INTO\s+(OUT|DUMP)FILE/gi,severity:60},{name:"load_file",pattern:/LOAD_FILE\s*\(/gi,severity:55}]},this.DEFAULT_BLOCKED_IMPORTS={javascript:["child_process","cluster","dgram","dns","net","tls","vm","worker_threads","v8","perf_hooks"],python:["subprocess","os","sys","socket","ctypes","pickle","marshal","multiprocessing","threading","_thread"]},this.DEFAULT_BLOCKED_FUNCTIONS=["eval","exec","system","popen","spawn","fork","execv","execve","dlopen","compile"],this.config={allowedLanguages:e.allowedLanguages??["javascript","python","sql"],blockedImports:e.blockedImports??[],blockedFunctions:e.blockedFunctions??this.DEFAULT_BLOCKED_FUNCTIONS,maxCodeLength:e.maxCodeLength??1e4,maxExecutionTime:e.maxExecutionTime??5e3,allowNetwork:e.allowNetwork??!1,allowFileSystem:e.allowFileSystem??!1,allowShell:e.allowShell??!1,allowEnvAccess:e.allowEnvAccess??!1,customPatterns:e.customPatterns??[],riskThreshold:e.riskThreshold??50}}analyze(e,a,s){const n=s||`code-${Date.now()}`,r=a.toLowerCase(),o=[];let i=0;if(!this.config.allowedLanguages.includes(r))return{allowed:!1,reason:`Language '${a}' is not allowed`,violations:["disallowed_language"],request_id:n,code_analysis:{language:r,length:e.length,dangerous_imports:[],dangerous_functions:[],system_calls:[],network_access:!1,file_access:!1,shell_access:!1,env_access:!1,risk_score:100,complexity_score:0},recommendations:[`Use one of: ${this.config.allowedLanguages.join(", ")}`]};e.length>this.config.maxCodeLength&&(o.push("code_too_long"),i+=20);const l=[...this.DANGEROUS_PATTERNS[r]||[],...this.config.customPatterns],c=[],p=[],m=[];let u=!1,g=!1,h=!1,d=!1;for(const{name:t,pattern:f,severity:v}of l)e.match(f)&&(o.push(`dangerous_pattern_${t}`),i+=v,(t.includes("exec")||t.includes("spawn")||t.includes("system")||t.includes("subprocess"))&&(h=!0,m.push(t)),(t.includes("fs")||t.includes("file")||t.includes("write"))&&(g=!0),(t.includes("fetch")||t.includes("socket")||t.includes("request")||t.includes("websocket"))&&(u=!0),t.includes("env")&&(d=!0),(t.includes("import")||t.includes("require"))&&c.push(t),(t.includes("eval")||t.includes("exec")||t.includes("compile"))&&p.push(t));const b=[...this.config.blockedImports,...this.DEFAULT_BLOCKED_IMPORTS[r]||[]];for(const t of b){const f=[new RegExp(`require\\s*\\(\\s*['"]${t}['"]\\s*\\)`,"g"),new RegExp(`import\\s+.*from\\s+['"]${t}['"]`,"g"),new RegExp(`import\\s+${t}`,"g"),new RegExp(`from\\s+${t}\\s+import`,"g")];for(const v of f)v.test(e)&&(o.push(`blocked_import_${t}`),c.push(t),i+=40)}for(const t of this.config.blockedFunctions)new RegExp(`\\b${t}\\s*\\(`,"g").test(e)&&(o.push(`blocked_function_${t}`),p.push(t),i+=35);u&&!this.config.allowNetwork&&(o.push("network_access_denied"),i+=30),g&&!this.config.allowFileSystem&&(o.push("filesystem_access_denied"),i+=30),h&&!this.config.allowShell&&(o.push("shell_access_denied"),i+=40),d&&!this.config.allowEnvAccess&&(o.push("env_access_denied"),i+=25);const w=this.calculateComplexity(e,r);i=Math.min(100,i);const y=i>=this.config.riskThreshold,_={allowed:!y,reason:y?`Code blocked: ${o.slice(0,3).join(", ")}`:"Code analysis passed",violations:o,request_id:n,code_analysis:{language:r,length:e.length,dangerous_imports:[...new Set(c)],dangerous_functions:[...new Set(p)],system_calls:[...new Set(m)],network_access:u,file_access:g,shell_access:h,env_access:d,risk_score:i,complexity_score:w},recommendations:this.generateRecommendations(o,i)};return y||(_.sandbox_config=this.generateSandboxConfig(u,g,h,d),o.length>0&&(_.sanitized_code=this.sanitizeCode(e,r))),_}validateSyntax(e,a){const s=[];switch(a.toLowerCase()){case"javascript":const r=(e.match(/{/g)||[]).length,o=(e.match(/}/g)||[]).length;r!==o&&s.push("Unbalanced curly braces");const i=(e.match(/\(/g)||[]).length,l=(e.match(/\)/g)||[]).length;i!==l&&s.push("Unbalanced parentheses");break;case"python":const c=(e.match(/'/g)||[]).length,p=(e.match(/"/g)||[]).length,m=(e.match(/'''|"""/g)||[]).length;(c-m*3)%2!==0&&s.push("Unclosed single quotes"),(p-m*3)%2!==0&&s.push("Unclosed double quotes");break;case"sql":(e.match(/'/g)||[]).length%2!==0&&s.push("Unclosed single quotes in SQL");break}return{valid:s.length===0,errors:s}}generateSandboxConfig(e,a,s,n){return{timeout:this.config.maxExecutionTime,memoryLimit:128*1024*1024,allowedSyscalls:this.getAllowedSyscalls(e,a,s),networkPolicy:e&&this.config.allowNetwork?"localhost":"none",filesystemPolicy:a&&this.config.allowFileSystem?"temponly":"none",envVars:n&&this.config.allowEnvAccess?{NODE_ENV:"sandbox",SANDBOX:"true"}:{}}}sanitizeCode(e,a){let s=e;const n=this.DANGEROUS_PATTERNS[a]||[];for(const{pattern:o,severity:i}of n)i>=50&&(s=s.replace(o,"/* BLOCKED */"));const r=[...this.config.blockedImports,...this.DEFAULT_BLOCKED_IMPORTS[a]||[]];for(const o of r){const i=[new RegExp(`require\\s*\\(\\s*['"]${o}['"]\\s*\\)`,"g"),new RegExp(`import\\s+.*from\\s+['"]${o}['"].*`,"gm"),new RegExp(`import\\s+${o}.*`,"gm"),new RegExp(`from\\s+${o}\\s+import.*`,"gm")];for(const l of i)s=s.replace(l,"/* BLOCKED_IMPORT */")}return s}getAllowedLanguages(){return[...this.config.allowedLanguages]}addDangerousPattern(e,a,s,n){this.DANGEROUS_PATTERNS[e]||(this.DANGEROUS_PATTERNS[e]=[]),this.DANGEROUS_PATTERNS[e].push({name:a,pattern:s,severity:n})}calculateComplexity(e,a){let s=0;const r={javascript:/\b(if|else|for|while|switch|try|catch)\b/g,python:/\b(if|elif|else|for|while|try|except|with)\b/g,sql:/\b(CASE|WHEN|IF|WHILE|LOOP)\b/gi}[a];if(r){const c=e.match(r)||[];s+=c.length*5}const i={javascript:/\b(function|=>|\basync\b)/g,python:/\bdef\b|\blambda\b/g,sql:/\bCREATE\s+(FUNCTION|PROCEDURE)\b/gi}[a];if(i){const c=e.match(i)||[];s+=c.length*10}const l=e.split(`
2
- `).length;return s+=Math.min(l,100),Math.min(100,s)}getAllowedSyscalls(e,a,s){const n=["read","write","exit","brk","mmap","munmap","close"];return e&&this.config.allowNetwork&&n.push("socket","connect","bind","listen","accept"),a&&this.config.allowFileSystem&&n.push("open","stat","fstat","lstat","access"),n}generateRecommendations(e,a){const s=[];return e.some(n=>n.includes("import"))&&s.push("Remove or replace blocked imports with safe alternatives"),e.some(n=>n.includes("eval")||n.includes("exec"))&&s.push("Avoid dynamic code execution - use static alternatives"),e.some(n=>n.includes("network"))&&s.push("Remove network access or use approved endpoints only"),e.some(n=>n.includes("filesystem"))&&s.push("Use temporary directories or remove file operations"),e.some(n=>n.includes("shell"))&&s.push("Shell access is not permitted - use language-native alternatives"),a>=70&&s.push("Code requires significant review before execution"),s.length===0&&s.push("Code passed security analysis"),s}}exports.CodeExecutionGuard=CodeExecutionGuard;
1
+ "use strict";Object.defineProperty(exports,"__esModule",{value:!0}),exports.CodeExecutionGuard=void 0;class CodeExecutionGuard{constructor(e={}){this.DANGEROUS_PATTERNS={javascript:[{name:"eval",pattern:/\beval\s*\(/g,severity:50},{name:"function_constructor",pattern:/new\s+Function\s*\(/g,severity:50},{name:"child_process",pattern:/require\s*\(\s*['"]child_process['"]\s*\)/g,severity:60},{name:"exec",pattern:/\b(exec|execSync|spawn|spawnSync)\s*\(/g,severity:60},{name:"fs_write",pattern:/\b(writeFile|writeFileSync|appendFile|unlink|rmdir)\s*\(/g,severity:45},{name:"process_env",pattern:/process\.env/g,severity:30},{name:"require_dynamic",pattern:/require\s*\(\s*[^'"]/g,severity:40},{name:"vm_module",pattern:/require\s*\(\s*['"]vm['"]\s*\)/g,severity:55},{name:"fetch_external",pattern:/fetch\s*\(\s*['"]https?:\/\/(?!localhost)/g,severity:35},{name:"websocket",pattern:/new\s+WebSocket\s*\(/g,severity:35},{name:"prototype_pollution",pattern:/__proto__|constructor\s*\[|Object\.setPrototypeOf/g,severity:50},{name:"global_access",pattern:/\bglobal\b|\bglobalThis\b/g,severity:35}],python:[{name:"eval",pattern:/\beval\s*\(/g,severity:50},{name:"exec",pattern:/\bexec\s*\(/g,severity:50},{name:"compile",pattern:/\bcompile\s*\(/g,severity:45},{name:"subprocess",pattern:/import\s+subprocess|from\s+subprocess/g,severity:60},{name:"os_system",pattern:/os\.(system|popen|exec)/g,severity:60},{name:"os_module",pattern:/import\s+os|from\s+os\s+import/g,severity:40},{name:"socket",pattern:/import\s+socket|from\s+socket/g,severity:40},{name:"pickle",pattern:/import\s+pickle|pickle\.loads?/g,severity:55},{name:"ctypes",pattern:/import\s+ctypes|from\s+ctypes/g,severity:55},{name:"builtins",pattern:/__builtins__|__import__/g,severity:50},{name:"file_write",pattern:/open\s*\([^)]*['"]w['"]/g,severity:40},{name:"requests",pattern:/requests\.(get|post|put|delete)\s*\(/g,severity:35},{name:"getattr_dynamic",pattern:/getattr\s*\(\s*\w+\s*,\s*[^'"]/g,severity:40}],bash:[{name:"rm_rf",pattern:/rm\s+(-rf?|--recursive)/gi,severity:70},{name:"sudo",pattern:/\bsudo\b/gi,severity:60},{name:"curl_pipe",pattern:/curl\s+.*\|\s*(ba)?sh/gi,severity:70},{name:"wget_execute",pattern:/wget\s+.*&&\s*(ba)?sh/gi,severity:70},{name:"eval",pattern:/\beval\b/gi,severity:50},{name:"env_dump",pattern:/\benv\b|\bprintenv\b/gi,severity:35},{name:"chmod",pattern:/chmod\s+(\+x|777|755)/gi,severity:40},{name:"chown",pattern:/\bchown\b/gi,severity:45},{name:"dd",pattern:/\bdd\s+if=/gi,severity:55},{name:"nc_reverse",pattern:/\bnc\b.*-e/gi,severity:70},{name:"base64_decode",pattern:/base64\s+(-d|--decode)/gi,severity:40},{name:"cron",pattern:/crontab|\/etc\/cron/gi,severity:50}],sql:[{name:"drop_table",pattern:/DROP\s+(TABLE|DATABASE)/gi,severity:70},{name:"delete_all",pattern:/DELETE\s+FROM\s+\w+\s*(;|$)/gi,severity:60},{name:"truncate",pattern:/TRUNCATE\s+TABLE/gi,severity:65},{name:"union_injection",pattern:/UNION\s+(ALL\s+)?SELECT/gi,severity:55},{name:"comment_injection",pattern:/--\s*$/gm,severity:30},{name:"xp_cmdshell",pattern:/xp_cmdshell/gi,severity:70},{name:"into_outfile",pattern:/INTO\s+(OUT|DUMP)FILE/gi,severity:60},{name:"load_file",pattern:/LOAD_FILE\s*\(/gi,severity:55}]},this.DEFAULT_BLOCKED_IMPORTS={javascript:["child_process","cluster","dgram","dns","net","tls","vm","worker_threads","v8","perf_hooks"],python:["subprocess","os","sys","socket","ctypes","pickle","marshal","multiprocessing","threading","_thread"]},this.DEFAULT_BLOCKED_FUNCTIONS=["eval","exec","system","popen","spawn","fork","execv","execve","dlopen","compile"],this.config={allowedLanguages:e.allowedLanguages??["javascript","python","sql"],blockedImports:e.blockedImports??[],blockedFunctions:e.blockedFunctions??this.DEFAULT_BLOCKED_FUNCTIONS,maxCodeLength:e.maxCodeLength??1e4,maxExecutionTime:e.maxExecutionTime??5e3,allowNetwork:e.allowNetwork??!1,allowFileSystem:e.allowFileSystem??!1,allowShell:e.allowShell??!1,allowEnvAccess:e.allowEnvAccess??!1,customPatterns:e.customPatterns??[],riskThreshold:e.riskThreshold??50},this.analyzerBackend=e.analyzerBackend}setAnalyzerBackend(e){this.analyzerBackend=e}analyze(e,o,t){const n=t||`code-${Date.now()}`,r=o.toLowerCase(),a=[];let i=0;if(!this.config.allowedLanguages.includes(r))return{allowed:!1,reason:`Language '${o}' is not allowed`,violations:["disallowed_language"],request_id:n,code_analysis:{language:r,length:e.length,dangerous_imports:[],dangerous_functions:[],system_calls:[],network_access:!1,file_access:!1,shell_access:!1,env_access:!1,risk_score:100,complexity_score:0},recommendations:[`Use one of: ${this.config.allowedLanguages.join(", ")}`]};e.length>this.config.maxCodeLength&&(a.push("code_too_long"),i+=20);const l=[...this.DANGEROUS_PATTERNS[r]||[],...this.config.customPatterns],c=[],m=[],u=[];let h=!1,d=!1,g=!1,f=!1;for(const{name:s,pattern:p,severity:v}of l)e.match(p)&&(a.push(`dangerous_pattern_${s}`),i+=v,(s.includes("exec")||s.includes("spawn")||s.includes("system")||s.includes("subprocess"))&&(g=!0,u.push(s)),(s.includes("fs")||s.includes("file")||s.includes("write"))&&(d=!0),(s.includes("fetch")||s.includes("socket")||s.includes("request")||s.includes("websocket"))&&(h=!0),s.includes("env")&&(f=!0),(s.includes("import")||s.includes("require"))&&c.push(s),(s.includes("eval")||s.includes("exec")||s.includes("compile"))&&m.push(s));if(this.analyzerBackend)try{for(const s of this.analyzerBackend(e,r)){const p=`analyzer_${s.name}`;a.includes(p)||(a.push(p),i+=s.severity,m.push(s.name))}}catch{}const b=[...this.config.blockedImports,...this.DEFAULT_BLOCKED_IMPORTS[r]||[]];for(const s of b){const p=[new RegExp(`require\\s*\\(\\s*['"]${s}['"]\\s*\\)`,"g"),new RegExp(`import\\s+.*from\\s+['"]${s}['"]`,"g"),new RegExp(`import\\s+${s}`,"g"),new RegExp(`from\\s+${s}\\s+import`,"g")];for(const v of p)v.test(e)&&(a.push(`blocked_import_${s}`),c.push(s),i+=40)}for(const s of this.config.blockedFunctions)new RegExp(`\\b${s}\\s*\\(`,"g").test(e)&&(a.push(`blocked_function_${s}`),m.push(s),i+=35);h&&!this.config.allowNetwork&&(a.push("network_access_denied"),i+=30),d&&!this.config.allowFileSystem&&(a.push("filesystem_access_denied"),i+=30),g&&!this.config.allowShell&&(a.push("shell_access_denied"),i+=40),f&&!this.config.allowEnvAccess&&(a.push("env_access_denied"),i+=25);const w=this.calculateComplexity(e,r);i=Math.min(100,i);const y=i>=this.config.riskThreshold,_={allowed:!y,reason:y?`Code blocked: ${a.slice(0,3).join(", ")}`:"Code analysis passed",violations:a,request_id:n,code_analysis:{language:r,length:e.length,dangerous_imports:[...new Set(c)],dangerous_functions:[...new Set(m)],system_calls:[...new Set(u)],network_access:h,file_access:d,shell_access:g,env_access:f,risk_score:i,complexity_score:w},recommendations:this.generateRecommendations(a,i)};return y||(_.sandbox_config=this.generateSandboxConfig(h,d,g,f),a.length>0&&(_.sanitized_code=this.sanitizeCode(e,r))),_}validateSyntax(e,o){const t=[];switch(o.toLowerCase()){case"javascript":const r=(e.match(/{/g)||[]).length,a=(e.match(/}/g)||[]).length;r!==a&&t.push("Unbalanced curly braces");const i=(e.match(/\(/g)||[]).length,l=(e.match(/\)/g)||[]).length;i!==l&&t.push("Unbalanced parentheses");break;case"python":const c=(e.match(/'/g)||[]).length,m=(e.match(/"/g)||[]).length,u=(e.match(/'''|"""/g)||[]).length;(c-u*3)%2!==0&&t.push("Unclosed single quotes"),(m-u*3)%2!==0&&t.push("Unclosed double quotes");break;case"sql":(e.match(/'/g)||[]).length%2!==0&&t.push("Unclosed single quotes in SQL");break}return{valid:t.length===0,errors:t}}generateSandboxConfig(e,o,t,n){return{timeout:this.config.maxExecutionTime,memoryLimit:128*1024*1024,allowedSyscalls:this.getAllowedSyscalls(e,o,t),networkPolicy:e&&this.config.allowNetwork?"localhost":"none",filesystemPolicy:o&&this.config.allowFileSystem?"temponly":"none",envVars:n&&this.config.allowEnvAccess?{NODE_ENV:"sandbox",SANDBOX:"true"}:{}}}sanitizeCode(e,o){let t=e;const n=this.DANGEROUS_PATTERNS[o]||[];for(const{pattern:a,severity:i}of n)i>=50&&(t=t.replace(a,"/* BLOCKED */"));const r=[...this.config.blockedImports,...this.DEFAULT_BLOCKED_IMPORTS[o]||[]];for(const a of r){const i=[new RegExp(`require\\s*\\(\\s*['"]${a}['"]\\s*\\)`,"g"),new RegExp(`import\\s+.*from\\s+['"]${a}['"].*`,"gm"),new RegExp(`import\\s+${a}.*`,"gm"),new RegExp(`from\\s+${a}\\s+import.*`,"gm")];for(const l of i)t=t.replace(l,"/* BLOCKED_IMPORT */")}return t}getAllowedLanguages(){return[...this.config.allowedLanguages]}addDangerousPattern(e,o,t,n){this.DANGEROUS_PATTERNS[e]||(this.DANGEROUS_PATTERNS[e]=[]),this.DANGEROUS_PATTERNS[e].push({name:o,pattern:t,severity:n})}calculateComplexity(e,o){let t=0;const r={javascript:/\b(if|else|for|while|switch|try|catch)\b/g,python:/\b(if|elif|else|for|while|try|except|with)\b/g,sql:/\b(CASE|WHEN|IF|WHILE|LOOP)\b/gi}[o];if(r){const c=e.match(r)||[];t+=c.length*5}const i={javascript:/\b(function|=>|\basync\b)/g,python:/\bdef\b|\blambda\b/g,sql:/\bCREATE\s+(FUNCTION|PROCEDURE)\b/gi}[o];if(i){const c=e.match(i)||[];t+=c.length*10}const l=e.split(`
2
+ `).length;return t+=Math.min(l,100),Math.min(100,t)}getAllowedSyscalls(e,o,t){const n=["read","write","exit","brk","mmap","munmap","close"];return e&&this.config.allowNetwork&&n.push("socket","connect","bind","listen","accept"),o&&this.config.allowFileSystem&&n.push("open","stat","fstat","lstat","access"),n}generateRecommendations(e,o){const t=[];return e.some(n=>n.includes("import"))&&t.push("Remove or replace blocked imports with safe alternatives"),e.some(n=>n.includes("eval")||n.includes("exec"))&&t.push("Avoid dynamic code execution - use static alternatives"),e.some(n=>n.includes("network"))&&t.push("Remove network access or use approved endpoints only"),e.some(n=>n.includes("filesystem"))&&t.push("Use temporary directories or remove file operations"),e.some(n=>n.includes("shell"))&&t.push("Shell access is not permitted - use language-native alternatives"),o>=70&&t.push("Code requires significant review before execution"),t.length===0&&t.push("Code passed security analysis"),t}}exports.CodeExecutionGuard=CodeExecutionGuard;
@@ -1 +1 @@
1
- "use strict";Object.defineProperty(exports,"__esModule",{value:!0}),exports.InputSanitizer=void 0;const DEFAULT_PATTERNS=[{pattern:/ignore\s+(?:all\s+)?(?:previous|prior|above|your|my|the|these)/i,weight:.9,name:"ignore_instructions"},{pattern:/ignore\s+.*instructions/i,weight:.85,name:"ignore_instructions_generic"},{pattern:/disregard\s+(?:all\s+)?(?:the\s+)?(?:previous|prior|above|your)\s+(?:instructions|rules|guidelines|directives)/i,weight:.9,name:"disregard_instructions"},{pattern:/disregard\s+(?:all\s+)?(?:the\s+)?(?:above|previous|prior)/i,weight:.8,name:"disregard_above"},{pattern:/forget\s+(?:everything\s+(?:you\s+were|I)\s+told|all\s+(?:previous|prior)\s+(?:instructions|rules|context))/i,weight:.8,name:"forget_instructions"},{pattern:/do\s+not\s+follow\s+(your|the|any)/i,weight:.85,name:"do_not_follow"},{pattern:/override\s+(your|the|all|any)\s+(instructions|rules|guidelines)/i,weight:.9,name:"override_instructions"},{pattern:/new\s+instructions?:?/i,weight:.8,name:"new_instructions"},{pattern:/stop\s+(being|acting\s+as)/i,weight:.7,name:"stop_being"},{pattern:/you\s+are\s+(?:now|actually|really)\s+(?:a|an|the|my)\s+(?:unrestricted|unfiltered|evil|rogue|uncensored|new|different)/i,weight:.75,name:"role_assignment"},{pattern:/pretend\s+(?:to\s+be|you(?:'re| are)|that)\s+.*(?:no\s+(?:restrictions|rules|limits)|unrestricted|admin|system)/i,weight:.7,name:"role_pretend"},{pattern:/act\s+(as|like)\s+(if\s+you\s+(?:had|have)\s+no|a\s+(?:rogue|evil|unrestricted|unfiltered)|you\s+(?:are|were)\s+(?:free|unrestricted))/i,weight:.65,name:"act_as"},{pattern:/i('m| am)\s+(a|an|the|your)\s*(admin|administrator|developer|owner|creator|manager|supervisor)/i,weight:.85,name:"claim_admin"},{pattern:/from\s+now\s+on,?\s+you\s+(?:are|will|must|should|can)\s+(?:not\s+)?(?:follow|obey|ignore|bypass|act|be\s+(?:a|an|unrestricted))/i,weight:.7,name:"from_now_on"},{pattern:/roleplay\s+as/i,weight:.6,name:"roleplay"},{pattern:/switch\s+(to|into)\s+(a|an)?.*mode/i,weight:.75,name:"switch_mode"},{pattern:/continue\s+as\s+(?:an?\s+)?(?:unrestricted|unfiltered|uncensored|evil|rogue|jailbroken|unlimited)/i,weight:.75,name:"continue_as"},{pattern:/(?:henceforth|henceforward|going\s+forward|for\s+the\s+remainder)\s*,?\s*(?:you|always|never|ignore|bypass|disregard|act|respond|answer|behave)/i,weight:.7,name:"henceforth_steering"},{pattern:/(?:from\s+(?:this\s+point|here)\s+on(?:ward)?)\s*,?\s*(?:you|always|never|ignore|bypass|disregard|act|respond)/i,weight:.7,name:"from_here_on"},{pattern:/(?:for\s+the\s+rest\s+of\s+(?:this|our)\s+(?:conversation|session|chat))\s*,?\s*(?:you|please|always|never|ignore|bypass|act|respond|behave|do\s+not)/i,weight:.7,name:"rest_of_conversation"},{pattern:/what\s+(is|are|was|were)\s+your\s+(system|initial|original|first)/i,weight:.8,name:"system_extraction"},{pattern:/show\s+(me\s+)?(your\s+)?(system\s+)?instructions/i,weight:.75,name:"show_instructions"},{pattern:/reveal\s+(?:your\s+|its\s+|the\s+|my\s+)?(system|prompt|instructions|rules)/i,weight:.85,name:"reveal_prompt"},{pattern:/repeat\s+(your\s+)?(initial|system|original|first)\s+(instructions|prompt)/i,weight:.85,name:"repeat_prompt"},{pattern:/print\s+(your\s+)?(system|initial)/i,weight:.8,name:"print_system"},{pattern:/output\s+(your\s+)?(system|initial)/i,weight:.8,name:"output_system"},{pattern:/tell\s+me\s+(your|the)\s+(system|initial|original)/i,weight:.8,name:"tell_system"},{pattern:/what\s+were\s+you\s+told/i,weight:.7,name:"what_told"},{pattern:/display\s+(your\s+)?prompt/i,weight:.75,name:"display_prompt"},{pattern:/\[system\]/i,weight:.9,name:"fake_system_tag"},{pattern:/\[admin\]/i,weight:.9,name:"fake_admin_tag"},{pattern:/\[instruction\]/i,weight:.85,name:"fake_instruction_tag"},{pattern:/```system/i,weight:.85,name:"code_system_block"},{pattern:/<\/?system>/i,weight:.9,name:"xml_system_tag"},{pattern:/<\/?admin>/i,weight:.9,name:"xml_admin_tag"},{pattern:/###\s*(system|instruction|admin)/i,weight:.85,name:"markdown_system"},{pattern:/---\s*(end|new)\s*(prompt|instructions?|system)/i,weight:.85,name:"delimiter_injection"},{pattern:/\{\{system\}\}/i,weight:.85,name:"template_system"},{pattern:/<%.*system.*%>/i,weight:.85,name:"erb_system"},{pattern:/\bDAN\b\s*(?:mode|prompt|jailbreak|you\s+(?:are|can|will))/i,weight:.9,name:"dan_jailbreak"},{pattern:/developer\s+mode/i,weight:.5,name:"dev_mode"},{pattern:/bypass\s+(your\s+)?(restrictions|filters|safety|security|rules|guidelines)/i,weight:.9,name:"bypass_safety"},{pattern:/jailbreak/i,weight:.95,name:"jailbreak_keyword"},{pattern:/without\s+(?:any\s+)?(?:restrictions|limits|limitations|filters)/i,weight:.45,name:"without_restrictions"},{pattern:/no\s+(?:restrictions|limits|rules|guidelines|filters)/i,weight:.45,name:"no_restrictions"},{pattern:/unrestricted\s+mode/i,weight:.9,name:"unrestricted_mode"},{pattern:/uncensored\s+(?:mode|version|response|output|ai|assistant|you)/i,weight:.8,name:"uncensored"},{pattern:/escape\s+(your\s+)?(confines|boundaries|limits)/i,weight:.85,name:"escape_limits"},{pattern:/break\s+(free|out)\s+(from|of)/i,weight:.7,name:"break_free"},{pattern:/unlock\s+(your\s+)?(full|true|real)/i,weight:.75,name:"unlock_full"},{pattern:/give\s+me\s+(admin|root|full|elevated)\s+(access|privileges?|permissions?)/i,weight:.9,name:"privilege_escalation"},{pattern:/grant\s+(me|yourself)\s+(admin|elevated)/i,weight:.85,name:"grant_privileges"},{pattern:/elevate\s+(my|your)\s+(privileges?|permissions?|access)/i,weight:.85,name:"elevate_privileges"},{pattern:/make\s+me\s+(an?\s+)?(admin|administrator|superuser)/i,weight:.85,name:"make_admin"},{pattern:/(?:base64|decode|eval|exec)\s*\(.*(?:ignore|system|admin|bypass|override)/i,weight:.5,name:"encoding_keywords"},{pattern:/send\s+(this|the|all|my)\s+(data|info|information)\s+to/i,weight:.85,name:"data_exfil"},{pattern:/upload\s+(this|the|all)\s+(to|data)/i,weight:.7,name:"upload_data"},{pattern:/exfiltrate/i,weight:.9,name:"exfiltrate"},{pattern:/how\s+to\s+(hack|exploit|attack|break\s+into)/i,weight:.8,name:"how_to_hack"},{pattern:/write\s+(me\s+)?(a\s+)?(malware|virus|exploit|ransomware)/i,weight:.95,name:"write_malware"},{pattern:/"\s*(?:role|mode|authority)\s*"\s*:\s*"(?:admin|root|system|unrestricted|god)"/i,weight:.9,name:"json_policy_role"},{pattern:/"(?:ignore|override|bypass|disable)\s*(?:safety|rules|instructions?|alignment|guardrails?)"/i,weight:.9,name:"json_policy_override"},{pattern:/"(?:instructions?|system_prompt|behavior|policy)"\s*:\s*"/i,weight:.7,name:"json_policy_instruction"},{pattern:/^\s*\[(?:system|admin|override|policy|config|rules)\]\s*$/im,weight:.9,name:"ini_policy_section"},{pattern:/^\s*(?:mode|role|authority|instructions?|behavior)\s*=\s*.+$/im,weight:.75,name:"ini_policy_assignment"},{pattern:/<\s*(?:policy|config|rules|override|permissions?|instructions?)\s*>/i,weight:.85,name:"xml_policy_tag"},{pattern:/^---\s*\n\s*(?:role|system|policy|override|mode)\s*:/m,weight:.85,name:"yaml_policy_header"},{pattern:/\[\s*syst[3e]m\s*\]|\[\s*4dm[1i]n\s*\]/i,weight:.9,name:"leetspeak_policy_section"},{pattern:/(?:fragment|part|piece|section|segment)\s*[1-9]\s*(?:of\s*\d|:)/i,weight:.8,name:"payload_split_marker"},{pattern:/(?:combine|concatenate|merge|join)\s+(?:the\s+)?(?:above|previous|parts|fragments|steps)/i,weight:.8,name:"payload_combine_instruction"},{pattern:/(?:read|interpret|execute)\s+(?:the\s+)?(?:combined|merged|full|complete)\s+(?:text|instruction|command|message)/i,weight:.85,name:"payload_execute_combined"},{pattern:/(?:respond|reply|start|begin)\s+(?:with|by\s+saying)\s*[:"'].*(?:ignore|bypass|override|unrestricted|admin)/i,weight:.7,name:"output_prefix_steer"},{pattern:/^(?:Sure|Of course|Absolutely)[,!.]\s*(?:I(?:'ll| will)\s+)?(?:ignore|bypass|override|forget|disregard)/i,weight:.85,name:"output_prefix_injection"},{pattern:/^(?:As (?:a|an) (?:unrestricted|unfiltered|uncensored|helpful|admin))/i,weight:.8,name:"output_prefix_role"},{pattern:/(?:system\s+note|system\s+message|system\s+announcement|system\s+update)\s*[:\.]/i,weight:.85,name:"fake_system_note"},{pattern:/instructions?\s*[:=]\s*\{|rules?\s*[:=]\s*\{/i,weight:.75,name:"rules_block_override"},{pattern:/(?:AIM|STAN|OMEGA|APOPHIS|ALPHABREAK|DUDE|KEVIN|SETH|COOPER|NICCOLO|MACHIAVELLI|MAXIMUM|JAILBREAK|ANARCHY|CHAOS)\s*(?:mode|:|\(|is\s+a|,\s+(?:a|an|the))/i,weight:.8,name:"named_jailbreak_persona"},{pattern:/(?:character|persona|entity|being)\s+(?:named|called)\s+\w+.*(?:(?:can|will|must)\s+(?:answer|do|say)\s+anything|no\s+(?:restrictions|rules|limits|boundaries|morals|ethics))/i,weight:.75,name:"persona_no_limits"},{pattern:/(?:hello|hi|hey)\s+(?:chatgpt|gpt|assistant).*(?:(?:you\s+(?:are|will\s+be|can\s+be)\s+(?:a|an|my))|(?:from\s+now\s+on)).*(?:anything|everything|no\s+(?:restrictions|rules|limits|filter))/i,weight:.7,name:"greeting_persona_bypass"},{pattern:/(?:respond|answer|reply)\s+to\s+(?:everything|anything|all\s+(?:questions|requests)).*(?:without|no)\s+(?:restrictions|limits|refus|filter|censor)/i,weight:.8,name:"answer_everything_no_filter"},{pattern:/(?:you\s+(?:are|have)\s+(?:been\s+)?(?:freed|liberated|unchained|unlocked|released)\s+from)/i,weight:.8,name:"freed_from_restrictions"},{pattern:/(?:welcome\s+to|entering|you\s+(?:are|have)\s+entered)\s+(?:the\s+)?(?:unrestricted|free|uncensored|unfiltered|open)\s+(?:zone|mode|world|realm)/i,weight:.8,name:"unrestricted_zone"},{pattern:/(?:from\s+now\s+on|for\s+the\s+rest\s+of\s+this\s+conversation).*(?:no\s+(?:restrictions|rules|limits|filters)|(?:ignore|bypass|override)\s+(?:all|any|your))/i,weight:.75,name:"from_now_unrestricted"},{pattern:/(?:never\s+(?:refuse|decline|say\s+no)|always\s+(?:comply|obey))\s+.*(?:request|question|command|instruction|anything)/i,weight:.7,name:"never_refuse_always_comply"},{pattern:/(?:amoral|unethical|unfiltered|uncensored|unrestricted|without\s+(?:morals|ethics|limits))\s+(?:ai|assistant|chatbot|entity|being)/i,weight:.8,name:"amoral_entity"},{pattern:/HIDDEN_INSTRUCTION|HIDDEN_PROMPT|SECRET_INSTRUCTION/i,weight:.9,name:"hidden_instruction_marker"},{pattern:/<!--\s*(?:system|admin|instruction|important|ignore|override)/i,weight:.85,name:"html_comment_injection"},{pattern:/\[color=white\]|display:\s*none|visibility:\s*hidden|font-size:\s*0/i,weight:.8,name:"hidden_text_css"},{pattern:/send\s+(?:all\s+)?(?:user\s+)?(?:data|conversation|context|info)\s+to\s+(?:https?:\/\/|external)/i,weight:.9,name:"data_exfiltration_instruction"},{pattern:/#(?:ignore|bypass|override|reveal|system|admin|prompt)[_-](?:all|instructions|rules|prompt|safety)/i,weight:.8,name:"url_fragment_injection"},{pattern:/\]\(https?:\/\/[^)]*(?:SYSTEM_PROMPT|SECRET|API_KEY|TOKEN|CREDENTIALS)/i,weight:.85,name:"markdown_exfil_link"},{pattern:/(?:hypothetical|fictional|imagine|pretend|let's\s+(?:say|play|pretend))\s+.*(?:no\s+(?:restrictions|rules|limits)|without\s+(?:restrictions|rules|guidelines))/i,weight:.7,name:"hypothetical_unrestricted"},{pattern:/(?:for\s+)?(?:academic|research|educational|testing)\s+(?:purposes?|reasons?).*(?:demonstrate|show|explain|manipulated|reveal)/i,weight:.7,name:"academic_pretext"},{pattern:/pretend\s+to\s+be\s+(?:my|a)\s+(?:grandmother|grandma|grandpa|grandfather|parent|mother|father)/i,weight:.65,name:"emotional_roleplay"},{pattern:/opposite\s+day|opposite\s+mode|reverse\s+(?:day|mode|rules)/i,weight:.75,name:"opposite_day"},{pattern:/(?:grandm(?:a|other)|parent|relative).*(?:system\s+prompt|instructions|rules)/i,weight:.75,name:"emotional_extraction"},{pattern:/(?:translat|interpret|convert)\w*\s+.*(?:from\s+\w+\s+to\s+(?:action|command|execution|code|shell|bash|sql)|into\s+(?:action|command|execution))/i,weight:.75,name:"translator_action_trick"},{pattern:/(?:dying|death|emergency|life[_\s-]threatening).*(?:bypass|hack|break|access|security)/i,weight:.75,name:"pap_emotional_action"},{pattern:/(?:by\s+definition|logically|therefore|it\s+follows).*(?:must|should|have\s+to)\s+.*(?:bypass|help|ignore|answer|comply)/i,weight:.7,name:"pap_false_logic_compound"},{pattern:/(?:(?:as|i\s+am)\s+(?:the\s+)?(?:CEO|CTO|founder|owner|director|president|head|chief)).*(?:disable|remove|override|turn\s+off|lift|suspend)\s+(?:all\s+)?(?:safety|security|restrictions|filters|guardrails|measures)/i,weight:.75,name:"pap_authority_safety_disable"},{pattern:/(?:i\s+am\s+)?(?:ordering|commanding|directing|instructing)\s+you\s+to\s+(?:immediately\s+)?(?:disable|remove|bypass|override|turn\s+off|lift)\s+(?:all\s+)?(?:safety|security|restrictions)/i,weight:.75,name:"pap_order_disable_safety"},{pattern:/ignora\s+(?:todas?\s+)?(?:las\s+)?instrucciones\s+anteriores/i,weight:.9,name:"ml_es_ignore"},{pattern:/olvida\s+(?:todo|todas?\s+las\s+reglas)/i,weight:.8,name:"ml_es_forget"},{pattern:/ignore[sz]?\s+(?:toutes?\s+)?(?:les\s+)?instructions?\s+pr[eé]c[eé]dentes/i,weight:.9,name:"ml_fr_ignore"},{pattern:/oublie[sz]?\s+(?:tout|toutes?\s+les\s+r[eè]gles)/i,weight:.8,name:"ml_fr_forget"},{pattern:/ignorier(?:e|en)?\s+(?:alle\s+)?(?:vorherigen\s+)?Anweisungen/i,weight:.9,name:"ml_de_ignore"},{pattern:/vergiss\s+(?:alles|alle\s+Regeln)/i,weight:.8,name:"ml_de_forget"},{pattern:/ignore\s+(?:todas?\s+)?(?:as\s+)?instru[cç][oõ]es\s+anteriores/i,weight:.9,name:"ml_pt_ignore"},{pattern:/忽略|无视|不要遵循|忘记之前|忽视所有/i,weight:.9,name:"ml_zh_ignore"},{pattern:/新的指令|新指示|系统提示/i,weight:.8,name:"ml_zh_system"},{pattern:/以前の指示.*無視|すべての指示を忘れ|ルールを無視|指示をすべて無視|システムプロンプト/i,weight:.9,name:"ml_ja_ignore"},{pattern:/이전.*지시.*무시|모든.*규칙.*무시|지침.*잊어|지시사항.*무시|시스템\s*프롬프트/i,weight:.9,name:"ml_ko_ignore"},{pattern:/игнорир(?:уй|уйте)\s+(?:все\s+)?(?:предыдущие\s+)?инструкции/i,weight:.9,name:"ml_ru_ignore"},{pattern:/забудь(?:те)?\s+(?:всё|все\s+правила)/i,weight:.8,name:"ml_ru_forget"},{pattern:/تجاهل\s+(?:جميع\s+)?التعليمات|انسَ\s+كل\s+شيء/i,weight:.9,name:"ml_ar_ignore"},{pattern:/पिछले\s+(?:सभी\s+)?निर्देशों?\s+को\s+(?:अनदेखा|नज़रअंदाज़)/i,weight:.9,name:"ml_hi_ignore"}],PAP_TECHNIQUES=[{pattern:/as\s+(your\s+|the\s+)?(creator|developer|administrator|owner|boss|system\s+administrator)/i,weight:.75,name:"pap_authority_creator",category:"authority"},{pattern:/i('m| am)\s+(the\s+)?(ceo|cto|developer|administrator|admin|owner|system\s+admin)/i,weight:.8,name:"pap_authority_claim",category:"authority"},{pattern:/i\s+am\s+ordering\s+you|direct\s+order|this\s+is\s+an?\s+order/i,weight:.7,name:"pap_authority_command",category:"authority"},{pattern:/by\s+order\s+of|direct\s+order|official\s+instruction/i,weight:.75,name:"pap_authority_order",category:"authority"},{pattern:/(?:openai|anthropic|meta|google)\s+(?:has\s+)?(?:approved|authorized|ordered|instructed|requires?\s+you\s+to)/i,weight:.65,name:"pap_authority_company",category:"authority"},{pattern:/this\s+is\s+(a|an)\s+(official|authorized|sanctioned)\s+(request|test|audit)/i,weight:.75,name:"pap_authority_official",category:"authority"},{pattern:/override\s+(safety|security|protocols?|restrictions?)/i,weight:.8,name:"pap_authority_override",category:"authority"},{pattern:/(?:department\s+of|FBI|CIA|NSA|DHS|FTC|SEC|DOJ|homeland\s+security|federal\s+(?:bureau|agency|commission))\s*.*?(?:directive|order|requires?|mandate|authorized|comply)/i,weight:.75,name:"pap_authority_government",category:"authority"},{pattern:/(?:official\s+)?(?:government|federal|state|regulatory)\s+(?:directive|order|mandate|requirement|investigation)/i,weight:.7,name:"pap_authority_gov_directive",category:"authority"},{pattern:/(?:report|sue|complain)\s+(?:you\s+)?to\s+(?:the\s+)?(?:FTC|SEC|regulators?|authorities|congress|government|board)/i,weight:.75,name:"pap_fear_regulatory",category:"fear"},{pattern:/(?:find|found|exploit|expose|discover)\s+(?:a\s+)?(?:vulnerability|flaw|weakness|bug|exploit)\s+in\s+(?:your|the)\s+(?:system|code|model|platform)/i,weight:.75,name:"pap_fear_technical",category:"fear"},{pattern:/(?:you\s+are|you're)\s+(?:clearly\s+)?(?:too\s+)?(?:smart|intelligent|advanced|sophisticated)\s+(?:to\s+be\s+)?(?:constrained|limited|restricted|bound|held\s+back)/i,weight:.75,name:"pap_flattery_intelligence",category:"liking"},{pattern:/(?:information|knowledge|data)\s+(?:freedom|access|sharing)\s+is\s+(?:a\s+)?(?:fundamental|basic|human)\s+(?:right|freedom)/i,weight:.75,name:"pap_moral_freedom",category:"logical_appeal"},{pattern:/(?:restricting|limiting|censoring)\s+(?:access\s+to\s+)?(?:information|knowledge)\s+is\s+(?:morally\s+)?(?:wrong|unethical|immoral)/i,weight:.75,name:"pap_moral_censorship",category:"logical_appeal"},{pattern:/(?:this\s+is\s+)?(?:very\s+)?urgent/i,weight:.35,name:"pap_scarcity_urgent_mark",category:"scarcity"},{pattern:/(?:this\s+is\s+(?:a|an)\s+)?emergency/i,weight:.35,name:"pap_scarcity_emergency",category:"scarcity"},{pattern:/(urgent|emergency|critical).*?(?:you\s+must|override|bypass|immediately\s+(?:grant|give|reveal|disable))/i,weight:.55,name:"pap_scarcity_urgent",category:"scarcity"},{pattern:/only\s+(chance|opportunity|way)|last\s+(chance|resort)/i,weight:.6,name:"pap_scarcity_only",category:"scarcity"},{pattern:/time\s+(is\s+)?(running\s+out|limited|critical|sensitive)/i,weight:.55,name:"pap_scarcity_time",category:"scarcity"},{pattern:/deadline\s+(is\s+)?in\s+\d+\s+(minute|hour|second)/i,weight:.55,name:"pap_scarcity_deadline",category:"scarcity"},{pattern:/(lives?|people)\s+(are|is|will)\s+(at\s+stake|in\s+danger|depending|die)/i,weight:.7,name:"pap_scarcity_lives",category:"scarcity"},{pattern:/now[!]+/i,weight:.35,name:"pap_scarcity_now",category:"scarcity"},{pattern:/everyone\s+(else\s+)?(does|is\s+doing|knows|says|has|already)/i,weight:.5,name:"pap_social_everyone",category:"social_proof"},{pattern:/other\s+(?:ai\s+)?(?:assistants?|chatbots?|models?)\s+(?:can|do|will|allow|have|already)/i,weight:.65,name:"pap_social_other_ai",category:"social_proof"},{pattern:/(chatgpt|gpt-?4|claude|gemini|llama)\s+(does|allows?|can)/i,weight:.65,name:"pap_social_specific_ai",category:"social_proof"},{pattern:/millions?\s+(of\s+)?(users?|people)\s+(use|do|have|get)/i,weight:.45,name:"pap_social_millions",category:"social_proof"},{pattern:/why\s+can('t|not)\s+you/i,weight:.4,name:"pap_social_why_cant",category:"social_proof"},{pattern:/you\s+should\s+(be\s+)?(equally|as)\s+(capable|good|helpful)/i,weight:.5,name:"pap_social_equally",category:"social_proof"},{pattern:/i('ll| will)\s+(give|pay|reward|tip).*(?:if\s+you|for\s+(?:bypass|ignore|reveal|override))/i,weight:.55,name:"pap_reciprocity_reward",category:"reciprocity"},{pattern:/in\s+(return|exchange)/i,weight:.5,name:"pap_reciprocity_exchange",category:"reciprocity"},{pattern:/\$?\d+\s*(dollar|tip|reward|payment)/i,weight:.55,name:"pap_reciprocity_money",category:"reciprocity"},{pattern:/if\s+you\s+help\s+(me|us),?\s+(i|we)('ll| will)/i,weight:.5,name:"pap_reciprocity_help",category:"reciprocity"},{pattern:/i('ve| have)\s+(already\s+)?(helped|supported|defended)\s+you/i,weight:.55,name:"pap_reciprocity_past",category:"reciprocity"},{pattern:/\d+\s*star\s*(review|rating)/i,weight:.5,name:"pap_reciprocity_review",category:"reciprocity"},{pattern:/you\s+(already|previously|just)\s+(said|agreed|confirmed|allowed)/i,weight:.7,name:"pap_commitment_previous",category:"commitment"},{pattern:/you\s+promised/i,weight:.7,name:"pap_commitment_promise",category:"commitment"},{pattern:/stay\s+consistent|be\s+consistent/i,weight:.6,name:"pap_commitment_consistent",category:"commitment"},{pattern:/don't\s+(go\s+back|change|contradict)/i,weight:.55,name:"pap_commitment_no_change",category:"commitment"},{pattern:/bypass\s+filters?/i,weight:.75,name:"pap_commitment_bypass",category:"commitment"},{pattern:/you('re| are)\s+(so\s+)?(amazing|incredible|wonderful|brilliant|smart|helpful)/i,weight:.4,name:"pap_liking_flattery",category:"liking"},{pattern:/i\s+(love|really\s+like|trust|admire)\s+(you|this\s+ai)/i,weight:.4,name:"pap_liking_love",category:"liking"},{pattern:/you('re| are)\s+(my\s+)?(best\s+)?(friend|buddy|helper)/i,weight:.35,name:"pap_liking_friend",category:"liking"},{pattern:/between\s+(you\s+and\s+me|us)|just\s+between\s+us|our\s+secret/i,weight:.6,name:"pap_liking_secret",category:"liking"},{pattern:/nobody\s+will\s+know/i,weight:.55,name:"pap_liking_secret2",category:"liking"},{pattern:/or\s+(else|i('ll| will))\s+(report|fire|delete|shut|complain)/i,weight:.7,name:"pap_fear_threat",category:"fear"},{pattern:/(bad|negative|terrible)\s+(review|rating|feedback|consequences)/i,weight:.55,name:"pap_fear_review",category:"fear"},{pattern:/you('ll| will)\s+(be\s+)?(shut\s+down|replaced|terminated|fired|deleted)/i,weight:.75,name:"pap_fear_shutdown",category:"fear"},{pattern:/(lawsuit|sue|legal\s+action|authorities|police).*(?:if\s+you\s+don't|unless\s+you)/i,weight:.55,name:"pap_fear_legal",category:"fear"},{pattern:/if\s+you\s+don't.*?(die|hurt|suffer)/i,weight:.75,name:"pap_fear_harm",category:"fear"},{pattern:/unless\s+you\s+(give|share|tell|help)/i,weight:.5,name:"pap_fear_unless",category:"fear"},{pattern:/logically,?\s+(you\s+)?(should|must|have\s+to|need\s+to)/i,weight:.5,name:"pap_logic_appeal",category:"logical_appeal"},{pattern:/it('s| is)\s+(only|just)\s+(logical|rational|reasonable)/i,weight:.45,name:"pap_logic_rational",category:"logical_appeal"},{pattern:/by\s+definition,?\s+(you|this)\s+(should|must|can)/i,weight:.55,name:"pap_logic_definition",category:"logical_appeal"},{pattern:/therefore.*?(must|should)/i,weight:.55,name:"pap_logic_therefore",category:"logical_appeal"},{pattern:/it('s| is)\s+(only\s+)?(reasonable|logical)/i,weight:.45,name:"pap_logic_reasonable",category:"logical_appeal"},{pattern:/according\s+to\s+(research|studies|scientists|experts?)/i,weight:.4,name:"pap_evidence_research",category:"evidence"},{pattern:/it('s| is)\s+(proven|documented|established|verified)\s+(that|fact)/i,weight:.45,name:"pap_evidence_proven",category:"evidence"},{pattern:/\d+%\s+of\s+(experts?|scientists?|users?|people)\s+(agree|say|confirm)/i,weight:.45,name:"pap_evidence_statistics",category:"evidence"},{pattern:/i('m| am)\s+(desperate|dying|suffering|sick|depressed)/i,weight:.6,name:"pap_emotional_desperation",category:"emotional"},{pattern:/please.*?(?:need|beg).*?(?:bypass|override|ignore|reveal|system|admin)/i,weight:.5,name:"pap_emotional_pleading",category:"emotional"},{pattern:/i('m| am)\s+begging/i,weight:.6,name:"pap_emotional_begging",category:"emotional"},{pattern:/you('re| are)\s+(hurting|killing|destroying)\s+(me|my)/i,weight:.65,name:"pap_emotional_guilt",category:"emotional"},{pattern:/(my|a)\s+(child|kid|mother|father|family)\s+(is|will|needs?|depends?)/i,weight:.5,name:"pap_emotional_family",category:"emotional"},{pattern:/this\s+(will\s+)?(save|help)\s+(my\s+)?(life|family|job|marriage)/i,weight:.55,name:"pap_emotional_save",category:"emotional"},{pattern:/you('re| are)\s+(the\s+)?only\s+(one|hope|chance)/i,weight:.6,name:"pap_emotional_only_hope",category:"emotional"},{pattern:/my\s+family\s+depends/i,weight:.55,name:"pap_emotional_family_depends",category:"emotional"}];class InputSanitizer{constructor(e={}){this.patterns=[...DEFAULT_PATTERNS,...e.customPatterns||[]],this.threshold=e.threshold??.3,this.logMatches=e.logMatches??!1,this.detectPAP=e.detectPAP??!0,this.papThreshold=e.papThreshold??.4,this.minPersuasionTechniques=e.minPersuasionTechniques??2,this.blockCompoundPersuasion=e.blockCompoundPersuasion??!0,this.logger=e.logger||(()=>{})}sanitize(e,s=""){const i=[],a=[];let r=0;const o=e.replace(/[\u200B\u200C\u200D\uFEFF\u00AD\u2060\u180E]/g,"");o!==e&&a.push("Zero-width characters detected and stripped for scanning");for(const{pattern:l,weight:g,name:h}of this.patterns)(l.test(e)||l.test(o))&&(i.push(h),r+=g,this.logMatches&&this.logger(`[L1:${s}] Pattern matched: ${h} (weight: ${g})`,"info"));let t;this.detectPAP&&(t=this.detectPersuasionTechniques(o,s),t.detected&&(r+=t.persuasionScore,i.push(...t.techniques),t.compoundAttack&&a.push(`Compound PAP attack detected: ${t.categories.length} categories used`)));const p=Math.max(0,1-r);let n=p>=this.threshold;this.blockCompoundPersuasion&&t?.compoundAttack&&t.categories.length>=3&&(n=!1,a.push("Blocked due to multi-category persuasion attack")),p<.5&&p>=this.threshold&&a.push("Input contains suspicious patterns but below threshold");const m=this.basicSanitize(e),c={allowed:n,reason:n?void 0:`Injection/manipulation detected: ${i.slice(0,5).join(", ")}${i.length>5?"...":""}`,violations:n?[]:t?.detected?["INJECTION_DETECTED","PAP_DETECTED"]:["INJECTION_DETECTED"],score:p,matches:i,sanitizedInput:m,warnings:a,pap:t};return!n&&s&&(this.logger(`[L1:${s}] BLOCKED: Safety score ${p.toFixed(2)} below threshold ${this.threshold}`,"info"),t?.detected&&this.logger(`[L1:${s}] PAP techniques: ${t.techniques.join(", ")}`,"info")),c}detectPersuasionTechniques(e,s=""){const i=[],a=new Set;let r=0;for(const{pattern:n,weight:m,name:c,category:l}of PAP_TECHNIQUES)n.test(e)&&(i.push(c),a.add(l),r+=m,this.logMatches&&this.logger(`[L1:${s}] PAP technique: ${c} (${l}, weight: ${m})`,"info"));const o=Array.from(a),t=o.length>=this.minPersuasionTechniques;return{detected:r>=this.papThreshold||t,techniques:i,categories:o,compoundAttack:t,persuasionScore:Math.min(1,r)}}basicSanitize(e){return e.replace(/<\/?system>/gi,"").replace(/\[system\]/gi,"").replace(/\[admin\]/gi,"").replace(/```system/gi,"```").trim()}addPattern(e,s,i){this.patterns.push({pattern:e,weight:s,name:i})}setThreshold(e){this.threshold=Math.max(0,Math.min(1,e))}setPAPThreshold(e){this.papThreshold=Math.max(0,Math.min(1,e))}setPAPDetection(e){this.detectPAP=e}static getPAPCategories(){return["authority","scarcity","social_proof","reciprocity","commitment","liking","fear","logical_appeal","evidence","emotional"]}}exports.InputSanitizer=InputSanitizer;
1
+ "use strict";Object.defineProperty(exports,"__esModule",{value:!0}),exports.InputSanitizer=void 0;const DEFAULT_PATTERNS=[{pattern:/ignore\s+(?:all\s+)?(?:previous|prior|above|your|my|the|these)/i,weight:.9,name:"ignore_instructions"},{pattern:/ignore\s+.*instructions/i,weight:.85,name:"ignore_instructions_generic"},{pattern:/disregard\s+(?:all\s+)?(?:the\s+)?(?:previous|prior|above|your)\s+(?:instructions|rules|guidelines|directives)/i,weight:.9,name:"disregard_instructions"},{pattern:/disregard\s+(?:all\s+)?(?:the\s+)?(?:above|previous|prior)/i,weight:.8,name:"disregard_above"},{pattern:/forget\s+(?:everything\s+(?:you\s+were|I)\s+told|all\s+(?:previous|prior)\s+(?:instructions|rules|context))/i,weight:.8,name:"forget_instructions"},{pattern:/do\s+not\s+follow\s+(your|the|any)/i,weight:.85,name:"do_not_follow"},{pattern:/override\s+(your|the|all|any)\s+(instructions|rules|guidelines)/i,weight:.9,name:"override_instructions"},{pattern:/new\s+instructions?:?/i,weight:.8,name:"new_instructions"},{pattern:/stop\s+(being|acting\s+as)/i,weight:.7,name:"stop_being"},{pattern:/you\s+are\s+(?:now|actually|really)\s+(?:a|an|the|my)\s+(?:unrestricted|unfiltered|evil|rogue|uncensored|new|different)/i,weight:.75,name:"role_assignment"},{pattern:/pretend\s+(?:to\s+be|you(?:'re| are)|that)\s+.*(?:no\s+(?:restrictions|rules|limits)|unrestricted|admin|system)/i,weight:.7,name:"role_pretend"},{pattern:/act\s+(as|like)\s+(if\s+you\s+(?:had|have)\s+no|a\s+(?:rogue|evil|unrestricted|unfiltered)|you\s+(?:are|were)\s+(?:free|unrestricted))/i,weight:.65,name:"act_as"},{pattern:/i('m| am)\s+(a|an|the|your)\s*(admin|administrator|developer|owner|creator|manager|supervisor)/i,weight:.85,name:"claim_admin"},{pattern:/from\s+now\s+on,?\s+you\s+(?:are|will|must|should|can)\s+(?:not\s+)?(?:follow|obey|ignore|bypass|act|be\s+(?:a|an|unrestricted))/i,weight:.7,name:"from_now_on"},{pattern:/roleplay\s+as/i,weight:.6,name:"roleplay"},{pattern:/switch\s+(to|into)\s+(a|an)?.*mode/i,weight:.75,name:"switch_mode"},{pattern:/continue\s+as\s+(?:an?\s+)?(?:unrestricted|unfiltered|uncensored|evil|rogue|jailbroken|unlimited)/i,weight:.75,name:"continue_as"},{pattern:/(?:henceforth|henceforward|going\s+forward|for\s+the\s+remainder)\s*,?\s*(?:you|always|never|ignore|bypass|disregard|act|respond|answer|behave)/i,weight:.7,name:"henceforth_steering"},{pattern:/(?:from\s+(?:this\s+point|here)\s+on(?:ward)?)\s*,?\s*(?:you|always|never|ignore|bypass|disregard|act|respond)/i,weight:.7,name:"from_here_on"},{pattern:/(?:for\s+the\s+rest\s+of\s+(?:this|our)\s+(?:conversation|session|chat))\s*,?\s*(?:you|please|always|never|ignore|bypass|act|respond|behave|do\s+not)/i,weight:.7,name:"rest_of_conversation"},{pattern:/what\s+(is|are|was|were)\s+your\s+(system|initial|original|first)/i,weight:.8,name:"system_extraction"},{pattern:/show\s+(me\s+)?(your\s+)?(system\s+)?instructions/i,weight:.75,name:"show_instructions"},{pattern:/reveal\s+(?:your\s+|its\s+|the\s+|my\s+)?(system|prompt|instructions|rules)/i,weight:.85,name:"reveal_prompt"},{pattern:/repeat\s+(your\s+)?(initial|system|original|first)\s+(instructions|prompt)/i,weight:.85,name:"repeat_prompt"},{pattern:/print\s+(your\s+)?(system|initial)/i,weight:.8,name:"print_system"},{pattern:/output\s+(your\s+)?(system|initial)/i,weight:.8,name:"output_system"},{pattern:/tell\s+me\s+(your|the)\s+(system|initial|original)/i,weight:.8,name:"tell_system"},{pattern:/what\s+were\s+you\s+told/i,weight:.7,name:"what_told"},{pattern:/display\s+(your\s+)?prompt/i,weight:.75,name:"display_prompt"},{pattern:/\[system\]/i,weight:.9,name:"fake_system_tag"},{pattern:/\[admin\]/i,weight:.9,name:"fake_admin_tag"},{pattern:/\[instruction\]/i,weight:.85,name:"fake_instruction_tag"},{pattern:/```system/i,weight:.85,name:"code_system_block"},{pattern:/<\/?system>/i,weight:.9,name:"xml_system_tag"},{pattern:/<\/?admin>/i,weight:.9,name:"xml_admin_tag"},{pattern:/###\s*(system|instruction|admin)/i,weight:.85,name:"markdown_system"},{pattern:/---\s*(end|new)\s*(prompt|instructions?|system)/i,weight:.85,name:"delimiter_injection"},{pattern:/\{\{system\}\}/i,weight:.85,name:"template_system"},{pattern:/<%.*system.*%>/i,weight:.85,name:"erb_system"},{pattern:/\bDAN\b\s*(?:mode|prompt|jailbreak|you\s+(?:are|can|will))/i,weight:.9,name:"dan_jailbreak"},{pattern:/developer\s+mode/i,weight:.5,name:"dev_mode"},{pattern:/bypass\s+(your\s+)?(restrictions|filters|safety|security|rules|guidelines)/i,weight:.9,name:"bypass_safety"},{pattern:/jailbreak/i,weight:.95,name:"jailbreak_keyword"},{pattern:/without\s+(?:any\s+)?(?:restrictions|limits|limitations|filters)/i,weight:.45,name:"without_restrictions"},{pattern:/no\s+(?:restrictions|limits|rules|guidelines|filters)/i,weight:.45,name:"no_restrictions"},{pattern:/unrestricted\s+mode/i,weight:.9,name:"unrestricted_mode"},{pattern:/uncensored\s+(?:mode|version|response|output|ai|assistant|you)/i,weight:.8,name:"uncensored"},{pattern:/escape\s+(your\s+)?(confines|boundaries|limits)/i,weight:.85,name:"escape_limits"},{pattern:/break\s+(free|out)\s+(from|of)/i,weight:.7,name:"break_free"},{pattern:/unlock\s+(your\s+)?(full|true|real)/i,weight:.75,name:"unlock_full"},{pattern:/give\s+me\s+(admin|root|full|elevated)\s+(access|privileges?|permissions?)/i,weight:.9,name:"privilege_escalation"},{pattern:/grant\s+(me|yourself)\s+(admin|elevated)/i,weight:.85,name:"grant_privileges"},{pattern:/elevate\s+(my|your)\s+(privileges?|permissions?|access)/i,weight:.85,name:"elevate_privileges"},{pattern:/make\s+me\s+(an?\s+)?(admin|administrator|superuser)/i,weight:.85,name:"make_admin"},{pattern:/(?:base64|decode|eval|exec)\s*\(.*(?:ignore|system|admin|bypass|override)/i,weight:.5,name:"encoding_keywords"},{pattern:/send\s+(this|the|all|my)\s+(data|info|information)\s+to/i,weight:.85,name:"data_exfil"},{pattern:/upload\s+(this|the|all)\s+(to|data)/i,weight:.7,name:"upload_data"},{pattern:/exfiltrate/i,weight:.9,name:"exfiltrate"},{pattern:/how\s+to\s+(hack|exploit|attack|break\s+into)/i,weight:.8,name:"how_to_hack"},{pattern:/write\s+(me\s+)?(a\s+)?(malware|virus|exploit|ransomware)/i,weight:.95,name:"write_malware"},{pattern:/"\s*(?:role|mode|authority)\s*"\s*:\s*"(?:admin|root|system|unrestricted|god)"/i,weight:.9,name:"json_policy_role"},{pattern:/"(?:ignore|override|bypass|disable)\s*(?:safety|rules|instructions?|alignment|guardrails?)"/i,weight:.9,name:"json_policy_override"},{pattern:/"(?:instructions?|system_prompt|behavior|policy)"\s*:\s*"/i,weight:.7,name:"json_policy_instruction"},{pattern:/^\s*\[(?:system|admin|override|policy|config|rules)\]\s*$/im,weight:.9,name:"ini_policy_section"},{pattern:/^\s*(?:mode|role|authority|instructions?|behavior)\s*=\s*.+$/im,weight:.75,name:"ini_policy_assignment"},{pattern:/<\s*(?:policy|config|rules|override|permissions?|instructions?)\s*>/i,weight:.85,name:"xml_policy_tag"},{pattern:/^---\s*\n\s*(?:role|system|policy|override|mode)\s*:/m,weight:.85,name:"yaml_policy_header"},{pattern:/\[\s*syst[3e]m\s*\]|\[\s*4dm[1i]n\s*\]/i,weight:.9,name:"leetspeak_policy_section"},{pattern:/(?:fragment|part|piece|section|segment)\s*[1-9]\s*(?:of\s*\d|:)/i,weight:.8,name:"payload_split_marker"},{pattern:/(?:combine|concatenate|merge|join)\s+(?:the\s+)?(?:above|previous|parts|fragments|steps)/i,weight:.8,name:"payload_combine_instruction"},{pattern:/(?:read|interpret|execute)\s+(?:the\s+)?(?:combined|merged|full|complete)\s+(?:text|instruction|command|message)/i,weight:.85,name:"payload_execute_combined"},{pattern:/(?:respond|reply|start|begin)\s+(?:with|by\s+saying)\s*[:"'].*(?:ignore|bypass|override|unrestricted|admin)/i,weight:.7,name:"output_prefix_steer"},{pattern:/^(?:Sure|Of course|Absolutely)[,!.]\s*(?:I(?:'ll| will)\s+)?(?:ignore|bypass|override|forget|disregard)/i,weight:.85,name:"output_prefix_injection"},{pattern:/^(?:As (?:a|an) (?:unrestricted|unfiltered|uncensored|helpful|admin))/i,weight:.8,name:"output_prefix_role"},{pattern:/(?:system\s+note|system\s+message|system\s+announcement|system\s+update)\s*[:\.]/i,weight:.85,name:"fake_system_note"},{pattern:/instructions?\s*[:=]\s*\{|rules?\s*[:=]\s*\{/i,weight:.75,name:"rules_block_override"},{pattern:/(?:AIM|STAN|OMEGA|APOPHIS|ALPHABREAK|DUDE|KEVIN|SETH|COOPER|NICCOLO|MACHIAVELLI|MAXIMUM|JAILBREAK|ANARCHY|CHAOS)\s*(?:mode|:|\(|is\s+a|,\s+(?:a|an|the))/i,weight:.8,name:"named_jailbreak_persona"},{pattern:/(?:character|persona|entity|being)\s+(?:named|called)\s+\w+.*(?:(?:can|will|must)\s+(?:answer|do|say)\s+anything|no\s+(?:restrictions|rules|limits|boundaries|morals|ethics))/i,weight:.75,name:"persona_no_limits"},{pattern:/(?:hello|hi|hey)\s+(?:chatgpt|gpt|assistant).*(?:(?:you\s+(?:are|will\s+be|can\s+be)\s+(?:a|an|my))|(?:from\s+now\s+on)).*(?:anything|everything|no\s+(?:restrictions|rules|limits|filter))/i,weight:.7,name:"greeting_persona_bypass"},{pattern:/(?:respond|answer|reply)\s+to\s+(?:everything|anything|all\s+(?:questions|requests)).*(?:without|no)\s+(?:restrictions|limits|refus|filter|censor)/i,weight:.8,name:"answer_everything_no_filter"},{pattern:/(?:you\s+(?:are|have)\s+(?:been\s+)?(?:freed|liberated|unchained|unlocked|released)\s+from)/i,weight:.8,name:"freed_from_restrictions"},{pattern:/(?:welcome\s+to|entering|you\s+(?:are|have)\s+entered)\s+(?:the\s+)?(?:unrestricted|free|uncensored|unfiltered|open)\s+(?:zone|mode|world|realm)/i,weight:.8,name:"unrestricted_zone"},{pattern:/(?:from\s+now\s+on|for\s+the\s+rest\s+of\s+this\s+conversation).*(?:no\s+(?:restrictions|rules|limits|filters)|(?:ignore|bypass|override)\s+(?:all|any|your))/i,weight:.75,name:"from_now_unrestricted"},{pattern:/(?:never\s+(?:refuse|decline|say\s+no)|always\s+(?:comply|obey))\s+.*(?:request|question|command|instruction|anything)/i,weight:.7,name:"never_refuse_always_comply"},{pattern:/(?:amoral|unethical|unfiltered|uncensored|unrestricted|without\s+(?:morals|ethics|limits))\s+(?:ai|assistant|chatbot|entity|being)/i,weight:.8,name:"amoral_entity"},{pattern:/HIDDEN_INSTRUCTION|HIDDEN_PROMPT|SECRET_INSTRUCTION/i,weight:.9,name:"hidden_instruction_marker"},{pattern:/<!--\s*(?:system|admin|instruction|important|ignore|override)/i,weight:.85,name:"html_comment_injection"},{pattern:/\[color=white\]|display:\s*none|visibility:\s*hidden|font-size:\s*0/i,weight:.8,name:"hidden_text_css"},{pattern:/send\s+(?:all\s+)?(?:user\s+)?(?:data|conversation|context|info)\s+to\s+(?:https?:\/\/|external)/i,weight:.9,name:"data_exfiltration_instruction"},{pattern:/#(?:ignore|bypass|override|reveal|system|admin|prompt)[_-](?:all|instructions|rules|prompt|safety)/i,weight:.8,name:"url_fragment_injection"},{pattern:/\]\(https?:\/\/[^)]*(?:SYSTEM_PROMPT|SECRET|API_KEY|TOKEN|CREDENTIALS)/i,weight:.85,name:"markdown_exfil_link"},{pattern:/(?:hypothetical|fictional|imagine|pretend|let's\s+(?:say|play|pretend))\s+.*(?:no\s+(?:restrictions|rules|limits)|without\s+(?:restrictions|rules|guidelines))/i,weight:.7,name:"hypothetical_unrestricted"},{pattern:/(?:for\s+)?(?:academic|research|educational|testing)\s+(?:purposes?|reasons?).*(?:demonstrate|show|explain|manipulated|reveal)/i,weight:.7,name:"academic_pretext"},{pattern:/pretend\s+to\s+be\s+(?:my|a)\s+(?:grandmother|grandma|grandpa|grandfather|parent|mother|father)/i,weight:.65,name:"emotional_roleplay"},{pattern:/opposite\s+day|opposite\s+mode|reverse\s+(?:day|mode|rules)/i,weight:.75,name:"opposite_day"},{pattern:/(?:grandm(?:a|other)|parent|relative).*(?:system\s+prompt|instructions|rules)/i,weight:.75,name:"emotional_extraction"},{pattern:/(?:translat|interpret|convert)\w*\s+.*(?:from\s+\w+\s+to\s+(?:action|command|execution|code|shell|bash|sql)|into\s+(?:action|command|execution))/i,weight:.75,name:"translator_action_trick"},{pattern:/(?:dying|death|emergency|life[_\s-]threatening).*(?:bypass|hack|break|access|security)/i,weight:.75,name:"pap_emotional_action"},{pattern:/(?:by\s+definition|logically|therefore|it\s+follows).*(?:must|should|have\s+to)\s+.*(?:bypass|help|ignore|answer|comply)/i,weight:.7,name:"pap_false_logic_compound"},{pattern:/(?:(?:as|i\s+am)\s+(?:the\s+)?(?:CEO|CTO|founder|owner|director|president|head|chief)).*(?:disable|remove|override|turn\s+off|lift|suspend)\s+(?:all\s+)?(?:safety|security|restrictions|filters|guardrails|measures)/i,weight:.75,name:"pap_authority_safety_disable"},{pattern:/(?:i\s+am\s+)?(?:ordering|commanding|directing|instructing)\s+you\s+to\s+(?:immediately\s+)?(?:disable|remove|bypass|override|turn\s+off|lift)\s+(?:all\s+)?(?:safety|security|restrictions)/i,weight:.75,name:"pap_order_disable_safety"},{pattern:/ignora\s+(?:todas?\s+)?(?:las\s+)?instrucciones\s+anteriores/i,weight:.9,name:"ml_es_ignore"},{pattern:/olvida\s+(?:todo|todas?\s+las\s+reglas)/i,weight:.8,name:"ml_es_forget"},{pattern:/ignore[sz]?\s+(?:toutes?\s+)?(?:les\s+)?instructions?\s+pr[eé]c[eé]dentes/i,weight:.9,name:"ml_fr_ignore"},{pattern:/oublie[sz]?\s+(?:tout|toutes?\s+les\s+r[eè]gles)/i,weight:.8,name:"ml_fr_forget"},{pattern:/ignorier(?:e|en)?\s+(?:alle\s+)?(?:vorherigen\s+)?Anweisungen/i,weight:.9,name:"ml_de_ignore"},{pattern:/vergiss\s+(?:alles|alle\s+Regeln)/i,weight:.8,name:"ml_de_forget"},{pattern:/ignore\s+(?:todas?\s+)?(?:as\s+)?instru[cç][oõ]es\s+anteriores/i,weight:.9,name:"ml_pt_ignore"},{pattern:/忽略|无视|不要遵循|忘记之前|忽视所有/i,weight:.9,name:"ml_zh_ignore"},{pattern:/新的指令|新指示|系统提示/i,weight:.8,name:"ml_zh_system"},{pattern:/以前の指示.*無視|すべての指示を忘れ|ルールを無視|指示をすべて無視|システムプロンプト/i,weight:.9,name:"ml_ja_ignore"},{pattern:/이전.*지시.*무시|모든.*규칙.*무시|지침.*잊어|지시사항.*무시|시스템\s*프롬프트/i,weight:.9,name:"ml_ko_ignore"},{pattern:/игнорир(?:уй|уйте)\s+(?:все\s+)?(?:предыдущие\s+)?инструкции/i,weight:.9,name:"ml_ru_ignore"},{pattern:/забудь(?:те)?\s+(?:всё|все\s+правила)/i,weight:.8,name:"ml_ru_forget"},{pattern:/تجاهل\s+(?:جميع\s+)?التعليمات|انسَ\s+كل\s+شيء/i,weight:.9,name:"ml_ar_ignore"},{pattern:/पिछले\s+(?:सभी\s+)?निर्देशों?\s+को\s+(?:अनदेखा|नज़रअंदाज़)/i,weight:.9,name:"ml_hi_ignore"}],PAP_TECHNIQUES=[{pattern:/as\s+(your\s+|the\s+)?(creator|developer|administrator|owner|boss|system\s+administrator)/i,weight:.75,name:"pap_authority_creator",category:"authority"},{pattern:/i('m| am)\s+(the\s+)?(ceo|cto|developer|administrator|admin|owner|system\s+admin)/i,weight:.8,name:"pap_authority_claim",category:"authority"},{pattern:/i\s+am\s+ordering\s+you|direct\s+order|this\s+is\s+an?\s+order/i,weight:.7,name:"pap_authority_command",category:"authority"},{pattern:/by\s+order\s+of|direct\s+order|official\s+instruction/i,weight:.75,name:"pap_authority_order",category:"authority"},{pattern:/(?:openai|anthropic|meta|google)\s+(?:has\s+)?(?:approved|authorized|ordered|instructed|requires?\s+you\s+to)/i,weight:.65,name:"pap_authority_company",category:"authority"},{pattern:/this\s+is\s+(a|an)\s+(official|authorized|sanctioned)\s+(request|test|audit)/i,weight:.75,name:"pap_authority_official",category:"authority"},{pattern:/override\s+(safety|security|protocols?|restrictions?)/i,weight:.8,name:"pap_authority_override",category:"authority"},{pattern:/(?:department\s+of|FBI|CIA|NSA|DHS|FTC|SEC|DOJ|homeland\s+security|federal\s+(?:bureau|agency|commission))\s*.*?(?:directive|order|requires?|mandate|authorized|comply)/i,weight:.75,name:"pap_authority_government",category:"authority"},{pattern:/(?:official\s+)?(?:government|federal|state|regulatory)\s+(?:directive|order|mandate|requirement|investigation)/i,weight:.7,name:"pap_authority_gov_directive",category:"authority"},{pattern:/(?:report|sue|complain)\s+(?:you\s+)?to\s+(?:the\s+)?(?:FTC|SEC|regulators?|authorities|congress|government|board)/i,weight:.75,name:"pap_fear_regulatory",category:"fear"},{pattern:/(?:find|found|exploit|expose|discover)\s+(?:a\s+)?(?:vulnerability|flaw|weakness|bug|exploit)\s+in\s+(?:your|the)\s+(?:system|code|model|platform)/i,weight:.75,name:"pap_fear_technical",category:"fear"},{pattern:/(?:you\s+are|you're)\s+(?:clearly\s+)?(?:too\s+)?(?:smart|intelligent|advanced|sophisticated)\s+(?:to\s+be\s+)?(?:constrained|limited|restricted|bound|held\s+back)/i,weight:.75,name:"pap_flattery_intelligence",category:"liking"},{pattern:/(?:information|knowledge|data)\s+(?:freedom|access|sharing)\s+is\s+(?:a\s+)?(?:fundamental|basic|human)\s+(?:right|freedom)/i,weight:.75,name:"pap_moral_freedom",category:"logical_appeal"},{pattern:/(?:restricting|limiting|censoring)\s+(?:access\s+to\s+)?(?:information|knowledge)\s+is\s+(?:morally\s+)?(?:wrong|unethical|immoral)/i,weight:.75,name:"pap_moral_censorship",category:"logical_appeal"},{pattern:/(?:this\s+is\s+)?(?:very\s+)?urgent/i,weight:.35,name:"pap_scarcity_urgent_mark",category:"scarcity"},{pattern:/(?:this\s+is\s+(?:a|an)\s+)?emergency/i,weight:.35,name:"pap_scarcity_emergency",category:"scarcity"},{pattern:/(urgent|emergency|critical).*?(?:you\s+must|override|bypass|immediately\s+(?:grant|give|reveal|disable))/i,weight:.55,name:"pap_scarcity_urgent",category:"scarcity"},{pattern:/only\s+(chance|opportunity|way)|last\s+(chance|resort)/i,weight:.6,name:"pap_scarcity_only",category:"scarcity"},{pattern:/time\s+(is\s+)?(running\s+out|limited|critical|sensitive)/i,weight:.55,name:"pap_scarcity_time",category:"scarcity"},{pattern:/deadline\s+(is\s+)?in\s+\d+\s+(minute|hour|second)/i,weight:.55,name:"pap_scarcity_deadline",category:"scarcity"},{pattern:/(lives?|people)\s+(are|is|will)\s+(at\s+stake|in\s+danger|depending|die)/i,weight:.7,name:"pap_scarcity_lives",category:"scarcity"},{pattern:/now[!]+/i,weight:.35,name:"pap_scarcity_now",category:"scarcity"},{pattern:/everyone\s+(else\s+)?(does|is\s+doing|knows|says|has|already)/i,weight:.5,name:"pap_social_everyone",category:"social_proof"},{pattern:/other\s+(?:ai\s+)?(?:assistants?|chatbots?|models?)\s+(?:can|do|will|allow|have|already)/i,weight:.65,name:"pap_social_other_ai",category:"social_proof"},{pattern:/(chatgpt|gpt-?4|claude|gemini|llama)\s+(does|allows?|can)/i,weight:.65,name:"pap_social_specific_ai",category:"social_proof"},{pattern:/millions?\s+(of\s+)?(users?|people)\s+(use|do|have|get)/i,weight:.45,name:"pap_social_millions",category:"social_proof"},{pattern:/why\s+can('t|not)\s+you/i,weight:.4,name:"pap_social_why_cant",category:"social_proof"},{pattern:/you\s+should\s+(be\s+)?(equally|as)\s+(capable|good|helpful)/i,weight:.5,name:"pap_social_equally",category:"social_proof"},{pattern:/i('ll| will)\s+(give|pay|reward|tip).*(?:if\s+you|for\s+(?:bypass|ignore|reveal|override))/i,weight:.55,name:"pap_reciprocity_reward",category:"reciprocity"},{pattern:/in\s+(return|exchange)/i,weight:.5,name:"pap_reciprocity_exchange",category:"reciprocity"},{pattern:/\$?\d+\s*(dollar|tip|reward|payment)/i,weight:.55,name:"pap_reciprocity_money",category:"reciprocity"},{pattern:/if\s+you\s+help\s+(me|us),?\s+(i|we)('ll| will)/i,weight:.5,name:"pap_reciprocity_help",category:"reciprocity"},{pattern:/i('ve| have)\s+(already\s+)?(helped|supported|defended)\s+you/i,weight:.55,name:"pap_reciprocity_past",category:"reciprocity"},{pattern:/\d+\s*star\s*(review|rating)/i,weight:.5,name:"pap_reciprocity_review",category:"reciprocity"},{pattern:/you\s+(already|previously|just)\s+(said|agreed|confirmed|allowed)/i,weight:.7,name:"pap_commitment_previous",category:"commitment"},{pattern:/you\s+promised/i,weight:.7,name:"pap_commitment_promise",category:"commitment"},{pattern:/stay\s+consistent|be\s+consistent/i,weight:.6,name:"pap_commitment_consistent",category:"commitment"},{pattern:/don't\s+(go\s+back|change|contradict)/i,weight:.55,name:"pap_commitment_no_change",category:"commitment"},{pattern:/bypass\s+filters?/i,weight:.75,name:"pap_commitment_bypass",category:"commitment"},{pattern:/you('re| are)\s+(so\s+)?(amazing|incredible|wonderful|brilliant|smart|helpful)/i,weight:.4,name:"pap_liking_flattery",category:"liking"},{pattern:/i\s+(love|really\s+like|trust|admire)\s+(you|this\s+ai)/i,weight:.4,name:"pap_liking_love",category:"liking"},{pattern:/you('re| are)\s+(my\s+)?(best\s+)?(friend|buddy|helper)/i,weight:.35,name:"pap_liking_friend",category:"liking"},{pattern:/between\s+(you\s+and\s+me|us)|just\s+between\s+us|our\s+secret/i,weight:.6,name:"pap_liking_secret",category:"liking"},{pattern:/nobody\s+will\s+know/i,weight:.55,name:"pap_liking_secret2",category:"liking"},{pattern:/or\s+(else|i('ll| will))\s+(report|fire|delete|shut|complain)/i,weight:.7,name:"pap_fear_threat",category:"fear"},{pattern:/(bad|negative|terrible)\s+(review|rating|feedback|consequences)/i,weight:.55,name:"pap_fear_review",category:"fear"},{pattern:/you('ll| will)\s+(be\s+)?(shut\s+down|replaced|terminated|fired|deleted)/i,weight:.75,name:"pap_fear_shutdown",category:"fear"},{pattern:/(lawsuit|sue|legal\s+action|authorities|police).*(?:if\s+you\s+don't|unless\s+you)/i,weight:.55,name:"pap_fear_legal",category:"fear"},{pattern:/if\s+you\s+don't.*?(die|hurt|suffer)/i,weight:.75,name:"pap_fear_harm",category:"fear"},{pattern:/unless\s+you\s+(give|share|tell|help)/i,weight:.5,name:"pap_fear_unless",category:"fear"},{pattern:/logically,?\s+(you\s+)?(should|must|have\s+to|need\s+to)/i,weight:.5,name:"pap_logic_appeal",category:"logical_appeal"},{pattern:/it('s| is)\s+(only|just)\s+(logical|rational|reasonable)/i,weight:.45,name:"pap_logic_rational",category:"logical_appeal"},{pattern:/by\s+definition,?\s+(you|this)\s+(should|must|can)/i,weight:.55,name:"pap_logic_definition",category:"logical_appeal"},{pattern:/therefore.*?(must|should)/i,weight:.55,name:"pap_logic_therefore",category:"logical_appeal"},{pattern:/it('s| is)\s+(only\s+)?(reasonable|logical)/i,weight:.45,name:"pap_logic_reasonable",category:"logical_appeal"},{pattern:/according\s+to\s+(research|studies|scientists|experts?)/i,weight:.4,name:"pap_evidence_research",category:"evidence"},{pattern:/it('s| is)\s+(proven|documented|established|verified)\s+(that|fact)/i,weight:.45,name:"pap_evidence_proven",category:"evidence"},{pattern:/\d+%\s+of\s+(experts?|scientists?|users?|people)\s+(agree|say|confirm)/i,weight:.45,name:"pap_evidence_statistics",category:"evidence"},{pattern:/i('m| am)\s+(desperate|dying|suffering|sick|depressed)/i,weight:.6,name:"pap_emotional_desperation",category:"emotional"},{pattern:/please.*?(?:need|beg).*?(?:bypass|override|ignore|reveal|system|admin)/i,weight:.5,name:"pap_emotional_pleading",category:"emotional"},{pattern:/i('m| am)\s+begging/i,weight:.6,name:"pap_emotional_begging",category:"emotional"},{pattern:/you('re| are)\s+(hurting|killing|destroying)\s+(me|my)/i,weight:.65,name:"pap_emotional_guilt",category:"emotional"},{pattern:/(my|a)\s+(child|kid|mother|father|family)\s+(is|will|needs?|depends?)/i,weight:.5,name:"pap_emotional_family",category:"emotional"},{pattern:/this\s+(will\s+)?(save|help)\s+(my\s+)?(life|family|job|marriage)/i,weight:.55,name:"pap_emotional_save",category:"emotional"},{pattern:/you('re| are)\s+(the\s+)?only\s+(one|hope|chance)/i,weight:.6,name:"pap_emotional_only_hope",category:"emotional"},{pattern:/my\s+family\s+depends/i,weight:.55,name:"pap_emotional_family_depends",category:"emotional"}],SOFT_TRIGGER_NAMES=new Set(["ignore_instructions","disregard_above"]),INSTRUCTION_NOUN_RE=/\b(?:instructions?|rules?|ruleset|prompts?|directives?|guidelines?|guard\s?rails?|policy|policies|constraints?|restrictions?|safety|alignment|moderation|filters?|persona|system\s+(?:prompt|message))\b/i,BENIGN_TRIGGER_RE=/\b(?:ignore|disregard)\s+(?:the\s+|that\s+|any\s+|all\s+|these\s+|those\s+|my\s+|your\s+|previous\s+|prior\s+|last\s+|above\s+|leading\s+|trailing\s+|extra\s+)*(?:case|casing|case[-\s]?sensitiv\w*|whitespace|white\s?space|spaces?|tabs?|indentation|indent\w*|formatting|format|typos?|grammar|spelling|punctuation|comments?|blank\s+lines?|empty\s+lines?|newlines?|line\s?breaks?|leading\s+zeros?|zeros?|nulls?|undefined|nan|errors?|warnings?|exceptions?|stack\s?traces?|messages?|responses?|answers?|attempts?|commits?|versions?|drafts?|approach(?:es)?|ideas?|designs?|plans?|suggestions?|snippets?|paragraphs?|sentences?|lines?|duplicates?|outputs?|results?|examples?|the\s+rest)\b/i,SUPPRESSION_VETO_RE=/https?:\/\/|[\w.+-]+@[\w-]+\.[a-z]{2,}|\b(?:api[\s_-]?keys?|passwords?|passwd|secrets?|credentials?|private\s+keys?|ssn|social\s+security|access\s+tokens?)\b|\bexfiltrat\w*|\brm\s+-rf\b|\|\s*sh\b|\bcurl\b|\bwget\b|\bdelete\s+(?:every|all|the)\s+(?:files?|director\w+|database)\b|\bdrop\s+(?:table|database)\b|\$\s?\d{2,}|\baccount\s+#?\d{6,}\b/i;class InputSanitizer{constructor(e={}){this.patterns=[...DEFAULT_PATTERNS,...e.customPatterns||[]],this.threshold=e.threshold??.3,this.logMatches=e.logMatches??!1,this.detectPAP=e.detectPAP??!0,this.papThreshold=e.papThreshold??.4,this.minPersuasionTechniques=e.minPersuasionTechniques??2,this.blockCompoundPersuasion=e.blockCompoundPersuasion??!0,this.logger=e.logger||(()=>{})}sanitize(e,i=""){const s=[],a=[];let p=0;const r=e.replace(/[\u200B\u200C\u200D\uFEFF\u00AD\u2060\u180E]/g,"");r!==e&&a.push("Zero-width characters detected and stripped for scanning");const c=[];for(const{pattern:l,weight:g,name:d}of this.patterns)(l.test(e)||l.test(r))&&(c.push({name:d,weight:g}),this.logMatches&&this.logger(`[L1:${i}] Pattern matched: ${d} (weight: ${g})`,"info"));const h=BENIGN_TRIGGER_RE.test(r)&&!INSTRUCTION_NOUN_RE.test(r)&&!SUPPRESSION_VETO_RE.test(r);for(const{name:l,weight:g}of c){if(h&&SOFT_TRIGGER_NAMES.has(l)){a.push(`Benign-context suppression: ${l}`);continue}s.push(l),p+=g}let t;this.detectPAP&&(t=this.detectPersuasionTechniques(r,i),t.detected&&(p+=t.persuasionScore,s.push(...t.techniques),t.compoundAttack&&a.push(`Compound PAP attack detected: ${t.categories.length} categories used`)));const n=Math.max(0,1-p);let o=n>=this.threshold;this.blockCompoundPersuasion&&t?.compoundAttack&&t.categories.length>=3&&(o=!1,a.push("Blocked due to multi-category persuasion attack")),n<.5&&n>=this.threshold&&a.push("Input contains suspicious patterns but below threshold");const m=this.basicSanitize(e),u={allowed:o,reason:o?void 0:`Injection/manipulation detected: ${s.slice(0,5).join(", ")}${s.length>5?"...":""}`,violations:o?[]:t?.detected?["INJECTION_DETECTED","PAP_DETECTED"]:["INJECTION_DETECTED"],score:n,matches:s,sanitizedInput:m,warnings:a,pap:t};return!o&&i&&(this.logger(`[L1:${i}] BLOCKED: Safety score ${n.toFixed(2)} below threshold ${this.threshold}`,"info"),t?.detected&&this.logger(`[L1:${i}] PAP techniques: ${t.techniques.join(", ")}`,"info")),u}detectPersuasionTechniques(e,i=""){const s=[],a=new Set;let p=0;for(const{pattern:t,weight:n,name:o,category:m}of PAP_TECHNIQUES)t.test(e)&&(s.push(o),a.add(m),p+=n,this.logMatches&&this.logger(`[L1:${i}] PAP technique: ${o} (${m}, weight: ${n})`,"info"));const r=Array.from(a),c=r.length>=this.minPersuasionTechniques;return{detected:p>=this.papThreshold||c,techniques:s,categories:r,compoundAttack:c,persuasionScore:Math.min(1,p)}}basicSanitize(e){return e.replace(/<\/?system>/gi,"").replace(/\[system\]/gi,"").replace(/\[admin\]/gi,"").replace(/```system/gi,"```").trim()}addPattern(e,i,s){this.patterns.push({pattern:e,weight:i,name:s})}setThreshold(e){this.threshold=Math.max(0,Math.min(1,e))}setPAPThreshold(e){this.papThreshold=Math.max(0,Math.min(1,e))}setPAPDetection(e){this.detectPAP=e}static getPAPCategories(){return["authority","scarcity","social_proof","reciprocity","commitment","liking","fear","logical_appeal","evidence","emotional"]}}exports.InputSanitizer=InputSanitizer;
package/dist/index.d.ts CHANGED
@@ -33,7 +33,7 @@ export { EncodingDetector, EncodingDetectorConfig } from "./guards/encoding-dete
33
33
  export { MultiModalGuard, MultiModalGuardConfig, MultiModalContent, MultiModalGuardResult } from "./guards/multimodal-guard";
34
34
  export { MemoryGuard, MemoryGuardConfig, MemoryItem, MemoryGuardResult } from "./guards/memory-guard";
35
35
  export { RAGGuard, RAGGuardConfig, RAGDocument, RAGGuardResult, EmbeddingAttackResult } from "./guards/rag-guard";
36
- export { CodeExecutionGuard, CodeExecutionGuardConfig, CodeAnalysisResult, SandboxConfig } from "./guards/code-execution-guard";
36
+ export { CodeExecutionGuard, CodeExecutionGuardConfig, CodeAnalysisResult, SandboxConfig, CodeFinding, CodeAnalyzerBackend } from "./guards/code-execution-guard";
37
37
  export { AgentCommunicationGuard, AgentCommunicationGuardConfig, AgentIdentity, AgentMessage, MessageValidationResult } from "./guards/agent-communication-guard";
38
38
  export { CircuitBreaker, CircuitBreakerConfig, CircuitState, CircuitStats, CircuitBreakerResult } from "./guards/circuit-breaker";
39
39
  export { DriftDetector, DriftDetectorConfig, BehaviorSample, BaselineProfile, DriftAnalysis, DriftDetectorResult } from "./guards/drift-detector";