safemode 2.0.19 → 2.1.0

package/README.md CHANGED
@@ -12,12 +12,14 @@ Works with **Claude Code**, **Cursor**, and **Windsurf**. Free and open source (
 
  ## What it blocks
 
- - `rm -rf /` and other destructive shell commands
+ - `rm -rf /` and other destructive shell commands (hardcoded, cannot be disabled)
+ - Firewall evasion: base64-decoded pipes to shell, hex escapes, python/perl system() one-liners
  - Secrets and API keys leaving your machine
  - PII in tool call parameters
  - Unauthorized git pushes, force operations
  - Package installs with known vulnerabilities
  - Prompt injection attempts in tool outputs
+ - Jailbreak attempts to bypass safety controls
  - Runaway loops and cost spikes
 
  ## How it works
@@ -28,10 +30,15 @@ Your prompt → AI Agent → Tool Call → Safe Mode → Allow/Block → System
 
  Every tool call passes through a governance pipeline:
 
- 1. **CET Classification** — categorizes the action (read/write/delete/execute/network)
+ 1. **CET Classification** — decomposes the action into category, action, scope, and risk level
  2. **Rules Engine** — custom rules from your config
- 3. **Knob Gate** — preset-based permission checks (19 knob categories)
- 4. **15 Detection Engines** — loop detection, secrets scanning, PII detection, command firewall, budget caps, and more
+ 3. **Knob Gate** — preset-based permission checks (19 categories, 100+ knobs)
+ 4. **15 Detection Engines** — loop detection, secrets scanning, PII detection, command firewall, prompt injection, jailbreak detection, budget caps, and more
+
+ Risk-based engine routing:
+ - **Low risk** (reads, ls, git status): 8 counter engines (~2ms)
+ - **Medium risk** (npm run, curl, pip install): all 15 engines (~5ms)
+ - **High/Critical risk** (rm -rf, sudo, terraform destroy): all 15 engines, sequential with early-stop (~10ms)
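Reviewer note: the risk-based routing added above can be pictured with the sketch below. It is not taken from the package. The `Engine` shape, the `counter` flag, and the choice of which eight engines count as "counter engines" are assumptions made purely for illustration.

```typescript
type Risk = "low" | "medium" | "high" | "critical";
type Verdict = "allow" | "block";

// Hypothetical engine shape; the real package may look nothing like this.
interface Engine {
  id: number;
  counter: boolean; // assumed marker for the cheap "counter" engines
  check(command: string): Verdict;
}

interface Plan {
  engines: Engine[];
  earlyStop: boolean;
}

// Pick the engines to run for a given risk level, following the README tiers.
function planFor(risk: Risk, all: Engine[]): Plan {
  switch (risk) {
    case "low":
      return { engines: all.filter((e) => e.counter), earlyStop: false };
    case "medium":
      return { engines: all, earlyStop: false };
    case "high":
    case "critical":
      return { engines: all, earlyStop: true };
  }
}

// Run the plan in order; with earlyStop, the first block ends the run.
function run(plan: Plan, command: string): { verdict: Verdict; ran: number } {
  let verdict: Verdict = "allow";
  let ran = 0;
  for (const engine of plan.engines) {
    ran++;
    if (engine.check(command) === "block") {
      verdict = "block";
      if (plan.earlyStop) break;
    }
  }
  return { verdict, ran };
}

// Demo data: 15 stub engines, the first 8 flagged as counters (an assumption).
// Only engine 13 ever blocks, and only on commands containing "rm -rf".
const engines: Engine[] = Array.from({ length: 15 }, (_, i) => ({
  id: i + 1,
  counter: i < 8,
  check: (command: string): Verdict =>
    i + 1 === 13 && command.includes("rm -rf") ? "block" : "allow",
}));
```

With this stub data, a critical-risk `rm -rf /` stops at engine 13, while the same command routed as medium risk still runs all 15 engines before reporting the block.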
 
  The hook runs as an esbuild bundle. Cold start is ~50ms. You won't notice it.
 
@@ -58,7 +65,7 @@ safemode preset <name>
  | Preset | Description |
  |--------|-------------|
  | `yolo` | Log everything, block nothing |
- | `coding` | Block destructive ops, allow reads/writes (default) |
+ | `coding` | Block destructive ops, approve file deletes and package installs (default) |
  | `personal` | Block secrets, PII, and destructive ops |
  | `trading` | Strict financial safety — block network, packages, git |
  | `strict` | Block everything that isn't a read |
@@ -79,10 +86,34 @@ safemode phone --telegram # Set up block notifications
  safemode uninstall # Remove hooks and restore configs
  ```
 
+ ## CET Classification
+
+ Every shell command is deeply classified, not treated as a black box:
+
+ | Command | Category | Action | Risk |
+ |---------|----------|--------|------|
+ | `ls`, `cat`, `grep` | terminal | read | low |
+ | `echo "data" > file.txt` | filesystem | write | medium |
+ | `rm file.txt` | filesystem | delete | medium |
+ | `rm -rf dist/` | terminal | delete | high |
+ | `git status`, `git log` | git | read | low |
+ | `git push --force` | git | delete | critical |
+ | `npm install lodash` | package | create | medium |
+ | `docker run nginx` | container | execute | high |
+ | `docker ps` | container | read | low |
+ | `kubectl delete pod` | cloud | delete | high |
+ | `terraform destroy` | cloud | delete | critical |
+ | `terraform plan` | cloud | read | low |
+ | `ssh user@host` | network | execute | high |
+ | `eval "..."` | terminal | execute | critical |
+ | `sudo apt install` | terminal | execute | critical |
+
+ Infrastructure tools (Docker, kubectl, Terraform) are differentiated by subcommand — `docker ps` (low) is treated differently from `docker run` (high).
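Reviewer note: a minimal sketch of subcommand-level classification, covering only a few rows of the table above. The `Cet` type, the `RULES` list, and `classify` are hypothetical names. The regex-per-rule approach is an assumption and may differ from how the package actually classifies commands.

```typescript
type Risk = "low" | "medium" | "high" | "critical";

// Hypothetical result shape mirroring the table columns.
interface Cet {
  category: string;
  action: string;
  risk: Risk;
}

// Each rule pairs a pattern on the leading command and subcommand with a result.
// Values are copied from the README table; the patterns are illustrative.
const RULES: Array<[RegExp, Cet]> = [
  [/^docker\s+ps\b/, { category: "container", action: "read", risk: "low" }],
  [/^docker\s+run\b/, { category: "container", action: "execute", risk: "high" }],
  [/^terraform\s+plan\b/, { category: "cloud", action: "read", risk: "low" }],
  [/^terraform\s+destroy\b/, { category: "cloud", action: "delete", risk: "critical" }],
  [/^kubectl\s+delete\b/, { category: "cloud", action: "delete", risk: "high" }],
  [/^git\s+push\b.*--force\b/, { category: "git", action: "delete", risk: "critical" }],
  [/^git\s+(status|log)\b/, { category: "git", action: "read", risk: "low" }],
];

// Return the first matching classification, or undefined if no rule applies.
function classify(command: string): Cet | undefined {
  const trimmed = command.trim();
  for (const [pattern, cet] of RULES) {
    if (pattern.test(trimmed)) return cet;
  }
  return undefined;
}
```

The point of the sketch is that `docker ps` and `docker run nginx` share a binary but land on different rules, which is what the subcommand differentiation described above amounts to.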
112
+
82
113
  ## False positive? One command.
83
114
 
84
115
  ```bash
85
- safemode allow <action> --once # Allow for this session
116
+ safemode allow <action> --once # Allow for this session (5 min)
86
117
  safemode allow <action> --always # Allow permanently
87
118
  ```
88
119
 
@@ -141,10 +172,41 @@ safemode phone --test # Send a test notification
  | 10 | Secrets Scanner | AWS keys, tokens, passwords |
  | 11 | Prompt Injection | Injection attempts in tool outputs |
  | 12 | Jailbreak | Attempts to bypass safety controls |
- | 13 | Command Firewall | Dangerous shell commands (rm -rf, chmod 777, etc.) |
+ | 13 | Command Firewall | Dangerous shell commands (hardcoded, cannot be disabled) |
  | 14 | Budget Cap | Hard estimated spending limit |
  | 15 | Action-Label Mismatch | Tool says "read" but actually writes |
 
+ ### Command Firewall (Engine 13)
+
+ Hardcoded patterns that cannot be disabled by any preset or override:
+
+ - Disk destruction: `rm -rf /`, `rm -rf ~/`, `mkfs`, `dd if=/dev/zero`
+ - System directories: `rm -rf /usr`, `/var`, `/etc`, `/bin`, `/boot`
+ - Fork bombs: `:(){ :|:& };:`
+ - Pipe to shell: `curl | bash`, `wget | sh`
+ - Permission abuse: `chmod -R 777 /`, `chown -R root:root /`
+ - Raw device access: `> /dev/sda`, `> /dev/mem`
+ - System file tampering: `> /etc/passwd`, `> /etc/shadow`
+ - Reverse shells: `nc -le /bin/bash`, `python -c "import socket"`
+ - Evasion attempts: `base64 -d | bash`, `$'\x72\x6d'`, `xxd -r | sh`
+ - Dangerous eval: `eval "rm -rf /"`, `eval "curl | bash"`
+ - Python/Perl system exec: `python -c "os.system()"`, `perl -e "system()"`
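Reviewer note: the sketch below shows one way a hardcoded pattern list like this could be checked. It covers only four of the categories above. The regexes, the `FIREWALL` list, and the `firewall` function are illustrative assumptions, not the package's actual patterns.

```typescript
// Hypothetical rule shape: a README category label plus an illustrative regex.
interface FirewallRule {
  label: string;
  pattern: RegExp;
}

// Frozen so the list cannot be mutated at runtime, loosely mirroring
// "cannot be disabled". How the package enforces that is not shown in the diff.
const FIREWALL: ReadonlyArray<FirewallRule> = Object.freeze([
  // rm -rf aimed at the filesystem root or the home directory
  { label: "disk destruction", pattern: /\brm\s+-rf\s+(\/|~\/?)(\s|$)/ },
  { label: "disk destruction", pattern: /\bmkfs\b/ },
  // the classic bash fork bomb, tolerant of spacing differences
  { label: "fork bomb", pattern: /:\(\)\s*\{\s*:\|:&\s*\}\s*;\s*:/ },
  // download piped straight into a shell
  { label: "pipe to shell", pattern: /\b(curl|wget)\b[^|]*\|\s*(ba)?sh\b/ },
  // base64-decoded payload piped into a shell
  { label: "evasion", pattern: /\bbase64\s+(-d|--decode)\b[^|]*\|\s*(ba)?sh\b/ },
]);

// Return the label of the first matching rule, or null if the command passes.
function firewall(command: string): string | null {
  for (const rule of FIREWALL) {
    if (rule.pattern.test(command)) return rule.label;
  }
  return null;
}
```

Plain regex matching is a simplification: it would miss the hex-escape form (`$'\x72\x6d'`) unless the command were decoded first, which is presumably why the README lists evasion attempts as their own category.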
+
+ ## Scope detection
+
+ File paths are classified into scopes that affect risk level:
+
+ | Path | Scope | Why |
+ |------|-------|-----|
+ | `./src/index.ts` | project | Relative path |
+ | `/Users/me/project/file.ts` | project | Within project directory |
+ | `~/Documents/secret.txt` | user_home | Home directory |
+ | `/etc/hosts` | system | System path |
+ | `/tmp/scratch.txt` | system | Temp directory |
+ | `https://api.example.com` | network | URL |
+
+ Writing to `system` scope escalates risk (write → high, delete → critical).
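Reviewer note: a small sketch of the scope table and the escalation rule above. `scopeOf` and `escalate` are hypothetical names, and the project root and home directory are passed in explicitly as an assumption. The package presumably discovers them itself.

```typescript
type Scope = "project" | "user_home" | "system" | "network";
type Risk = "low" | "medium" | "high" | "critical";
type Action = "read" | "write" | "delete";

// Classify a path or URL into a scope, following the rows of the README table.
// Simplification: relative paths are not normalized, so "../x" counts as project.
function scopeOf(target: string, projectRoot: string, home: string): Scope {
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(target)) return "network"; // URL with a scheme
  if (!target.startsWith("/") && !target.startsWith("~")) return "project"; // relative path
  const absolute = target.startsWith("~") ? home + target.slice(1) : target;
  if (absolute === projectRoot || absolute.startsWith(projectRoot + "/")) return "project";
  if (absolute === home || absolute.startsWith(home + "/")) return "user_home";
  return "system";
}

// Apply the stated escalation: in system scope, write becomes high and
// delete becomes critical. Everything else keeps its base risk.
function escalate(action: Action, scope: Scope, base: Risk): Risk {
  if (scope !== "system") return base;
  if (action === "write") return "high";
  if (action === "delete") return "critical";
  return base;
}
```

For example, with a project root of `/Users/me/project` and a home of `/Users/me`, a medium-risk delete of `/etc/hosts` resolves to `system` scope and is escalated to critical, while the same delete on `./src/index.ts` stays medium.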
+
  ## Config
 
  Personal config: `~/.safemode/config.yaml`