@intentsolutionsio/penetration-tester 2.0.0 → 3.0.4

Files changed (112)
  1. package/.claude-plugin/plugin.json +8 -3
  2. package/README.md +8 -0
  3. package/commands/pentest.md +5 -0
  4. package/package.json +8 -3
  5. package/skills/analyzing-tls-config/SKILL.md +221 -0
  6. package/skills/analyzing-tls-config/references/AUTHORIZATION.md +133 -0
  7. package/skills/analyzing-tls-config/references/PLAYBOOK.md +267 -0
  8. package/skills/analyzing-tls-config/references/THEORY.md +128 -0
  9. package/skills/analyzing-tls-config/scripts/analyze_tls.py +415 -0
  10. package/skills/auditing-cors-policy/SKILL.md +186 -0
  11. package/skills/auditing-cors-policy/references/PLAYBOOK.md +220 -0
  12. package/skills/auditing-cors-policy/references/THEORY.md +142 -0
  13. package/skills/auditing-cors-policy/scripts/audit_cors.py +350 -0
  14. package/skills/auditing-npm-dependencies/SKILL.md +254 -0
  15. package/skills/auditing-npm-dependencies/references/PLAYBOOK.md +175 -0
  16. package/skills/auditing-npm-dependencies/references/THEORY.md +122 -0
  17. package/skills/auditing-npm-dependencies/scripts/audit_npm.py +408 -0
  18. package/skills/auditing-python-dependencies/SKILL.md +251 -0
  19. package/skills/auditing-python-dependencies/references/PLAYBOOK.md +193 -0
  20. package/skills/auditing-python-dependencies/references/THEORY.md +122 -0
  21. package/skills/auditing-python-dependencies/scripts/audit_python.py +459 -0
  22. package/skills/checking-http-security-headers/SKILL.md +176 -0
  23. package/skills/checking-http-security-headers/references/PLAYBOOK.md +212 -0
  24. package/skills/checking-http-security-headers/references/THEORY.md +137 -0
  25. package/skills/checking-http-security-headers/scripts/check_headers.py +362 -0
  26. package/skills/checking-license-compliance/SKILL.md +225 -0
  27. package/skills/checking-license-compliance/references/PLAYBOOK.md +161 -0
  28. package/skills/checking-license-compliance/references/THEORY.md +152 -0
  29. package/skills/checking-license-compliance/scripts/check_licenses.py +461 -0
  30. package/skills/composing-vulnerability-report/SKILL.md +212 -0
  31. package/skills/composing-vulnerability-report/references/PLAYBOOK.md +180 -0
  32. package/skills/composing-vulnerability-report/references/THEORY.md +178 -0
  33. package/skills/composing-vulnerability-report/scripts/compose_report.py +396 -0
  34. package/skills/confirming-pentest-authorization/SKILL.md +247 -0
  35. package/skills/confirming-pentest-authorization/references/PLAYBOOK.md +189 -0
  36. package/skills/confirming-pentest-authorization/references/THEORY.md +167 -0
  37. package/skills/confirming-pentest-authorization/scripts/check_authorization.py +457 -0
  38. package/skills/defining-pentest-scope/SKILL.md +227 -0
  39. package/skills/defining-pentest-scope/references/PLAYBOOK.md +238 -0
  40. package/skills/defining-pentest-scope/references/THEORY.md +170 -0
  41. package/skills/defining-pentest-scope/scripts/define_scope.py +472 -0
  42. package/skills/detecting-command-injection-patterns/SKILL.md +144 -0
  43. package/skills/detecting-command-injection-patterns/references/PLAYBOOK.md +302 -0
  44. package/skills/detecting-command-injection-patterns/references/THEORY.md +206 -0
  45. package/skills/detecting-command-injection-patterns/scripts/scan_cmdi.py +290 -0
  46. package/skills/detecting-debug-endpoints/SKILL.md +207 -0
  47. package/skills/detecting-debug-endpoints/references/PLAYBOOK.md +402 -0
  48. package/skills/detecting-debug-endpoints/references/THEORY.md +218 -0
  49. package/skills/detecting-debug-endpoints/scripts/probe_debug.py +518 -0
  50. package/skills/detecting-directory-listing/SKILL.md +206 -0
  51. package/skills/detecting-directory-listing/references/PLAYBOOK.md +277 -0
  52. package/skills/detecting-directory-listing/references/THEORY.md +203 -0
  53. package/skills/detecting-directory-listing/scripts/probe_directory_listing.py +180 -0
  54. package/skills/detecting-eval-exec-usage/SKILL.md +128 -0
  55. package/skills/detecting-eval-exec-usage/references/PLAYBOOK.md +306 -0
  56. package/skills/detecting-eval-exec-usage/references/THEORY.md +159 -0
  57. package/skills/detecting-eval-exec-usage/scripts/scan_eval.py +223 -0
  58. package/skills/detecting-exposed-secrets-files/SKILL.md +179 -0
  59. package/skills/detecting-exposed-secrets-files/references/PLAYBOOK.md +274 -0
  60. package/skills/detecting-exposed-secrets-files/references/THEORY.md +174 -0
  61. package/skills/detecting-exposed-secrets-files/scripts/probe_secrets.py +207 -0
  62. package/skills/detecting-insecure-deserialization/SKILL.md +148 -0
  63. package/skills/detecting-insecure-deserialization/references/PLAYBOOK.md +333 -0
  64. package/skills/detecting-insecure-deserialization/references/THEORY.md +199 -0
  65. package/skills/detecting-insecure-deserialization/scripts/scan_deserialization.py +250 -0
  66. package/skills/detecting-sql-injection-patterns/SKILL.md +161 -0
  67. package/skills/detecting-sql-injection-patterns/references/PLAYBOOK.md +317 -0
  68. package/skills/detecting-sql-injection-patterns/references/THEORY.md +261 -0
  69. package/skills/detecting-sql-injection-patterns/scripts/scan_sqli.py +354 -0
  70. package/skills/detecting-ssl-cert-issues/SKILL.md +182 -0
  71. package/skills/detecting-ssl-cert-issues/references/PLAYBOOK.md +203 -0
  72. package/skills/detecting-ssl-cert-issues/references/THEORY.md +133 -0
  73. package/skills/detecting-ssl-cert-issues/scripts/check_cert_chain.py +481 -0
  74. package/skills/detecting-weak-cryptography/SKILL.md +147 -0
  75. package/skills/detecting-weak-cryptography/references/PLAYBOOK.md +466 -0
  76. package/skills/detecting-weak-cryptography/references/THEORY.md +194 -0
  77. package/skills/detecting-weak-cryptography/scripts/scan_weak_crypto.py +417 -0
  78. package/skills/fingerprinting-server-software/SKILL.md +191 -0
  79. package/skills/fingerprinting-server-software/references/PLAYBOOK.md +337 -0
  80. package/skills/fingerprinting-server-software/references/THEORY.md +183 -0
  81. package/skills/fingerprinting-server-software/scripts/fingerprint_server.py +347 -0
  82. package/skills/generating-executive-summary/SKILL.md +261 -0
  83. package/skills/generating-executive-summary/references/PLAYBOOK.md +201 -0
  84. package/skills/generating-executive-summary/references/THEORY.md +195 -0
  85. package/skills/generating-executive-summary/scripts/exec_summary.py +538 -0
  86. package/skills/mapping-findings-to-owasp-top10/SKILL.md +235 -0
  87. package/skills/mapping-findings-to-owasp-top10/references/PLAYBOOK.md +193 -0
  88. package/skills/mapping-findings-to-owasp-top10/references/THEORY.md +160 -0
  89. package/skills/mapping-findings-to-owasp-top10/scripts/map_owasp.py +540 -0
  90. package/skills/performing-penetration-testing/SKILL.md +282 -190
  91. package/skills/performing-penetration-testing/references/OWASP_TOP_10.md +22 -0
  92. package/skills/performing-penetration-testing/references/REMEDIATION_PLAYBOOK.md +46 -0
  93. package/skills/performing-penetration-testing/references/SECURITY_HEADERS.md +41 -0
  94. package/skills/performing-penetration-testing/scripts/code_security_scanner.py +144 -79
  95. package/skills/performing-penetration-testing/scripts/dependency_auditor.py +116 -93
  96. package/skills/performing-penetration-testing/scripts/security_scanner.py +574 -446
  97. package/skills/probing-dangerous-http-methods/SKILL.md +182 -0
  98. package/skills/probing-dangerous-http-methods/references/PLAYBOOK.md +234 -0
  99. package/skills/probing-dangerous-http-methods/references/THEORY.md +145 -0
  100. package/skills/probing-dangerous-http-methods/scripts/probe_methods.py +263 -0
  101. package/skills/recording-pentest-engagement/SKILL.md +253 -0
  102. package/skills/recording-pentest-engagement/references/PLAYBOOK.md +203 -0
  103. package/skills/recording-pentest-engagement/references/THEORY.md +195 -0
  104. package/skills/recording-pentest-engagement/scripts/record_engagement.py +461 -0
  105. package/skills/scanning-for-hardcoded-secrets/SKILL.md +215 -0
  106. package/skills/scanning-for-hardcoded-secrets/references/PLAYBOOK.md +325 -0
  107. package/skills/scanning-for-hardcoded-secrets/references/THEORY.md +175 -0
  108. package/skills/scanning-for-hardcoded-secrets/scripts/scan_secrets.py +395 -0
  109. package/skills/tracing-transitive-vulnerabilities/SKILL.md +235 -0
  110. package/skills/tracing-transitive-vulnerabilities/references/PLAYBOOK.md +233 -0
  111. package/skills/tracing-transitive-vulnerabilities/references/THEORY.md +138 -0
  112. package/skills/tracing-transitive-vulnerabilities/scripts/trace_vulns.py +484 -0
package/skills/detecting-insecure-deserialization/SKILL.md
@@ -0,0 +1,148 @@
1
+ ---
2
+ name: detecting-insecure-deserialization
3
+ description: |
4
+ Scan a source tree for unsafe-by-default deserialization APIs:
5
+ Python pickle.loads / cPickle / shelve / dill, Ruby Marshal.load /
6
+ YAML.load (pre-Ruby-3.1 default), Java ObjectInputStream.readObject,
7
+ PHP unserialize, .NET BinaryFormatter / NetDataContractSerializer,
8
+ Node.js node-serialize, JavaScript JSON.parse with reviver
9
+ containing eval.
10
+ Use when: pre-commit gate on services that accept binary blobs,
11
+ audit of legacy job-queue code (workers deserializing tasks),
12
+ post-bug-report when "we accept user-uploaded archives."
13
+ Threshold: any call to a known-unsafe deserialization API on
14
+ data that originates from user input, network, file upload,
15
+ or untrusted storage.
16
+ Trigger with: "scan deserialization", "pickle audit", "java
17
+ readObject scan", "yaml.load check".
18
+ allowed-tools:
19
+ - Read
20
+ - Bash(python3:*)
21
+ - Glob
22
+ - Grep
23
+ disallowed-tools:
24
+ - Bash(rm:*)
25
+ - Bash(curl:*)
26
+ version: 3.0.0-dev
27
+ author: Jeremy Longshore <jeremy@intentsolutions.io>
28
+ license: MIT
29
+ compatibility: Designed for Claude Code
30
+ tags:
31
+ - security
32
+ - static-analysis
33
+ - deserialization
34
+ - pentest
35
+ ---
36
+
37
+ # Detecting Insecure Deserialization
38
+
39
+ ## Overview
40
+
41
+ Insecure deserialization (CWE-502, OWASP A08:2021) is the highest-
42
+ severity injection class in many language stacks because it directly
43
+ maps to RCE. Pickle, Java serialization, PHP unserialize, and
44
+ BinaryFormatter all execute object-construction code during
45
+ deserialization. An attacker who controls the bytes chooses which
45
+ `__reduce__` / `readObject` / `__wakeup` / `OnDeserialization`
46
+ callbacks run, so the deserialization step IS code execution.
48
+
49
+ Most legitimate use cases have safer alternatives (JSON for data,
50
+ YAML with safe-load, Protocol Buffers, Avro). The remaining cases
51
+ need explicit type allow-lists and HMAC-signed payloads.
52
+
53
+ ## When the skill produces findings
54
+
55
+ | Finding | Severity | Threshold | Affected control |
56
+ |---|---|---|---|
57
+ | Python `pickle.loads(...)` | **CRITICAL** | always (untrusted input) | CWE-502 |
58
+ | Python `pickle.load(file)` | **CRITICAL** | always | CWE-502 |
59
+ | Python `dill.loads` | **CRITICAL** | always | CWE-502 |
60
+ | Python `yaml.load(...)` without Loader= | **CRITICAL** | unsafe legacy default | CWE-502 |
61
+ | Python `yaml.unsafe_load(...)` | **CRITICAL** | explicit unsafe | CWE-502 |
62
+ | Python `shelve.open(...)` | **HIGH** | pickle-backed; user-controllable filename | CWE-502 |
63
+ | Java `ObjectInputStream.readObject()` | **CRITICAL** | always | CWE-502 |
64
+ | PHP `unserialize($input)` | **CRITICAL** | non-literal input | CWE-502 |
65
+ | .NET `BinaryFormatter.Deserialize(...)` | **CRITICAL** | deprecated unsafe API | CWE-502 |
66
+ | .NET `NetDataContractSerializer` | **CRITICAL** | also unsafe | CWE-502 |
67
+ | .NET `LosFormatter.Deserialize` | **CRITICAL** | ViewState path | CWE-502 |
68
+ | Ruby `Marshal.load(...)` | **CRITICAL** | non-literal | CWE-502 |
69
+ | Ruby `YAML.load(...)` (Psych < 4.0, Ruby < 3.1) | **CRITICAL** | safe in Psych 4.0+; needs version check | CWE-502 |
70
+ | Node.js `node-serialize.unserialize` | **CRITICAL** | known-vulnerable lib | CWE-502 |
71
+ | Node.js `serialize-javascript` reviver | **HIGH** | if used to deserialize untrusted | CWE-502 |
72
+
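The table rows for Python can be illustrated with a small AST walk. This is a minimal sketch of the general technique, not the logic of the packaged `scan_deserialization.py` (which is not shown here). The names `UNSAFE_CALLS` and `find_unsafe_calls` are hypothetical, and the sketch only matches direct `module.function(...)` calls, so aliased imports such as `from pickle import loads` would be missed.

```python
import ast

# Hypothetical rule set mirroring a few rows of the findings table.
UNSAFE_CALLS = {
    ("pickle", "loads"),
    ("pickle", "load"),
    ("dill", "loads"),
    ("yaml", "unsafe_load"),
}


def find_unsafe_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, finding) pairs for direct module.function calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only handle the simple `name.attr(...)` shape.
        if not (isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name)):
            continue
        target = (func.value.id, func.attr)
        if target in UNSAFE_CALLS:
            findings.append((node.lineno, f"{target[0]}.{target[1]}"))
        elif target == ("yaml", "load"):
            # yaml.load is flagged only when no Loader is passed,
            # either positionally or by keyword.
            has_loader = len(node.args) >= 2 or any(
                kw.arg == "Loader" for kw in node.keywords
            )
            if not has_loader:
                findings.append((node.lineno, "yaml.load without Loader"))
    return sorted(findings)
```

A real scanner would also need import resolution and per-language rules; the sketch only shows why `yaml.load` needs an argument check while `pickle.loads` is flagged unconditionally.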
73
+ ## Prerequisites
74
+
75
+ - Python 3.9+
76
+ - Source tree on local filesystem
77
+
78
+ ## Instructions
79
+
80
+ ### Run
81
+
82
+ ```bash
83
+ python3 ${CLAUDE_PLUGIN_ROOT}/skills/detecting-insecure-deserialization/scripts/scan_deserialization.py /path/to/repo
84
+ ```
85
+
86
+ Options same as previous skills: `--output`, `--format`,
87
+ `--min-severity`, `--include-tests`, `--languages`.
88
+
89
+ ### Interpret
90
+
91
+ CRITICAL across the board because these APIs grant RCE during
92
+ deserialization if the input is attacker-controlled. The
93
+ verification step is "can the input ever originate from an
94
+ untrusted source" — if yes, it's an immediate fix.
95
+
96
+ ### Remediation
97
+
98
+ The fix depends on the data shape:
99
+
100
+ - **Data is structured (JSON-shaped):** switch to `json.loads`.
101
+ - **Data needs polymorphism / arbitrary types:** define a strict
102
+ schema (Pydantic / dataclasses / Protocol Buffers) and validate
103
+ on parse.
104
+ - **Data must round-trip exact Python / Java / .NET objects:** use
105
+ HMAC-signed serialization with an explicit type allow-list.
106
+
107
+ See `references/PLAYBOOK.md` for per-language migrations.
108
+
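For the "strict schema, validate on parse" option, a standard-library-only sketch might look like the following. The `Task` shape, the `ALLOWED_ACTIONS` set, and `parse_task` are hypothetical names chosen for illustration; a real service would substitute its own fields or use Pydantic as the list above suggests.

```python
import json
from dataclasses import dataclass

# Hypothetical set of actions a worker is willing to run.
ALLOWED_ACTIONS = {"resize", "thumbnail"}


@dataclass(frozen=True)
class Task:
    task_id: int
    action: str


def parse_task(raw: bytes) -> Task:
    """Parse a JSON payload and reject anything outside the expected shape."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"task_id", "action"}:
        raise ValueError("unexpected payload shape")
    task_id, action = data["task_id"], data["action"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(task_id, int) or isinstance(task_id, bool):
        raise ValueError("task_id must be an integer")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError("unknown action")
    return Task(task_id=task_id, action=action)
```

The parser only ever constructs `Task`, so the payload cannot select which class gets instantiated.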
109
+ ## Examples
110
+
111
+ ### Worker-queue audit
112
+
113
+ ```bash
114
+ python3 ${CLAUDE_PLUGIN_ROOT}/skills/detecting-insecure-deserialization/scripts/scan_deserialization.py \
115
+ /path/to/celery-workers --min-severity high
116
+ ```
117
+
118
+ Celery before 4.0 defaults to the pickle serializer; this finds the
119
+ remaining unsafe-default callers.
120
+
121
+ ### CI
122
+
123
+ ```yaml
124
+ - name: Deserialization scan
125
+ run: |
126
+ python3 plugins/security/penetration-tester/skills/detecting-insecure-deserialization/scripts/scan_deserialization.py \
127
+ . --min-severity high
128
+ ```
129
+
130
+ ## Output
131
+
132
+ JSON / JSONL / Markdown. Exit codes: 0 / 1 / 2.
133
+
134
+ ## Error Handling
135
+
136
+ Pickle / Marshal usage on a private cache file written by the same
137
+ application is technically safe (the attacker can't influence the
138
+ file contents). The scanner flags it as CRITICAL; verify by reading
139
+ where the input file originates.
140
+
141
+ ## Resources
142
+
143
+ - `references/THEORY.md` — Why deserialization is RCE, gadget chains,
144
+ HMAC-signing pattern, schema-validation alternatives
145
+ - `references/PLAYBOOK.md` — Per-language migrations (Python pickle
146
+ → JSON / msgpack, yaml.load → yaml.safe_load, Java ObjectInputStream
147
+ → JSON via Jackson with allow-list, PHP unserialize → JSON
148
+ alternatives, .NET BinaryFormatter → System.Text.Json)
package/skills/detecting-insecure-deserialization/references/PLAYBOOK.md
@@ -0,0 +1,333 @@
1
+ # Insecure-Deserialization Remediation Playbook
2
+
3
+ The universal migration: switch from a behavioral format (pickle,
4
+ Java serialization, PHP unserialize, BinaryFormatter) to a
5
+ schema-validated structural format (JSON + Pydantic / dataclasses /
6
+ Jackson with allow-list / System.Text.Json).
7
+
8
+ ## Python — pickle → JSON / msgpack / pydantic
9
+
10
+ ### Before (Celery task queue with pickle)
11
+
12
+ ```python
13
+ # Celery worker
14
+ CELERY_TASK_SERIALIZER = 'pickle' # unsafe
15
+ result = pickle.loads(task_payload)
16
+ ```
17
+
18
+ ### After
19
+
20
+ ```python
21
+ # Celery configuration
22
+ CELERY_TASK_SERIALIZER = 'json'
23
+ CELERY_ACCEPT_CONTENT = ['json']
24
+ ```
25
+
26
+ ### Cache layer migration
27
+
28
+ ```python
29
+ # Before
30
+ import pickle
31
+ data = pickle.loads(redis_client.get(key))
32
+
33
+ # After (using msgpack for binary efficiency)
34
+ import msgpack
35
+ data = msgpack.unpackb(redis_client.get(key), raw=False)
36
+
37
+ # Or JSON if size isn't critical
38
+ import json
39
+ data = json.loads(redis_client.get(key))
40
+ ```
41
+
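During a cache migration, keys written before the switch may still hold pickle bytes, and a cache miss returns `None`. The sketch below shows one way to handle both cases with the JSON variant. It assumes the cache client returns `bytes` or `None`, and `load_cached` is a hypothetical helper name.

```python
import json


def load_cached(raw):
    """Decode a cached JSON value; treat misses and legacy entries as absent."""
    if raw is None:
        # Cache miss.
        return None
    try:
        return json.loads(raw)
    except ValueError:
        # Covers json.JSONDecodeError and UnicodeDecodeError, for example a
        # leftover pickle entry. Treat it as a miss and let the caller
        # recompute; never fall back to pickle.loads here.
        return None
```

Treating undecodable entries as misses lets old entries expire naturally without keeping a pickle code path alive.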
42
+ ### Application-data migration
43
+
44
+ ```python
45
+ # Before
46
+ class Order:
47
+ def __init__(self, id, items, total): ...
48
+ # Stored via pickle.dumps
49
+
50
+ # After (Pydantic with strict validation)
51
+ from pydantic import BaseModel
52
+ class Order(BaseModel):
53
+ id: int
54
+ items: list[str]
55
+ total: float
56
+
57
+ # Serialize
58
+ payload = order.model_dump_json()
59
+ # Deserialize with validation
60
+ restored = Order.model_validate_json(payload)
61
+ ```
62
+
63
+ ### YAML migration
64
+
65
+ ```python
66
+ # Before
67
+ import yaml
68
+ data = yaml.load(file_content) # UNSAFE
69
+
70
+ # After
71
+ data = yaml.safe_load(file_content) # restricts to basic types
72
+ ```
73
+
74
+ If you previously relied on `yaml.load` to instantiate Python
75
+ classes from YAML, replace the YAML schema with explicit
76
+ construction:
77
+
78
+ ```python
79
+ # Before YAML
80
+ # !MyClass
81
+ # arg1: hello
82
+ # arg2: 42
83
+
84
+ # Before code
85
+ import yaml
86
+ obj = yaml.load(content) # auto-constructs MyClass
87
+
88
+ # After: data-only YAML + explicit constructor
89
+ data = yaml.safe_load(content) # returns plain dict
90
+ obj = MyClass(arg1=data['arg1'], arg2=data['arg2'])
91
+ ```
92
+
93
+ ## Java — ObjectInputStream → Jackson with allow-list
94
+
95
+ ### Before
96
+
97
+ ```java
98
+ ObjectInputStream ois = new ObjectInputStream(inputStream);
99
+ MyObject obj = (MyObject) ois.readObject();
100
+ ```
101
+
102
+ ### After (Jackson, type-allow-listed)
103
+
104
+ ```java
105
+ import com.fasterxml.jackson.databind.*;
106
+ import com.fasterxml.jackson.databind.jsontype.*;
107
+
108
+ ObjectMapper mapper = new ObjectMapper();
109
+ mapper.activateDefaultTyping(
110
+ BasicPolymorphicTypeValidator.builder()
111
+ .allowIfSubType("com.example.") // your package only
112
+ .build(),
113
+ ObjectMapper.DefaultTyping.NON_FINAL
114
+ );
115
+ MyObject obj = mapper.readValue(inputStream, MyObject.class);
116
+ ```
117
+
118
+ The `BasicPolymorphicTypeValidator` restricts which classes
119
+ Jackson will instantiate when default typing is on. Default typing
120
+ without a validator is roughly as exploitable as ObjectInputStream.
121
+
122
+ ### Legacy ObjectInputStream — minimal safety (last resort)
123
+
124
+ If you can't migrate from ObjectInputStream immediately:
125
+
126
+ ```java
127
+ public class AllowlistedObjectInputStream extends ObjectInputStream {
128
+ private static final Set<String> ALLOWED = Set.of(
129
+ "com.example.User",
130
+ "com.example.Order",
131
+ "java.lang.String",
132
+ "java.util.ArrayList"
133
+ );
134
+
135
+ public AllowlistedObjectInputStream(InputStream is) throws IOException {
136
+ super(is);
137
+ }
138
+
139
+ @Override
140
+ protected Class<?> resolveClass(ObjectStreamClass desc)
141
+ throws IOException, ClassNotFoundException {
142
+ if (!ALLOWED.contains(desc.getName())) {
143
+ throw new InvalidClassException(
144
+ "Unauthorized deserialization: " + desc.getName());
145
+ }
146
+ return super.resolveClass(desc);
147
+ }
148
+ }
149
+ ```
150
+
151
+ Use this as transitional protection only; migrate to a schema-
152
+ based format ASAP.
153
+
154
+ ## PHP — unserialize → json_decode
155
+
156
+ ### Before
157
+
158
+ ```php
159
+ $obj = unserialize($_POST['data']);
160
+ ```
161
+
162
+ ### After
163
+
164
+ ```php
165
+ $obj = json_decode($_POST['data'], true);
166
+ // Validate the shape:
167
+ if (!is_array($obj) || !isset($obj['id'], $obj['name'])) {
168
+ throw new InvalidArgumentException("Invalid payload");
169
+ }
170
+ ```
171
+
172
+ ### If you must keep unserialize: restrict allowed classes
173
+
174
+ ```php
175
+ $obj = unserialize($data, [
176
+ 'allowed_classes' => ['User', 'Order', 'OrderItem']
177
+ ]);
178
+ ```
179
+
180
+ In PHP 7.0+, passing `'allowed_classes' => false` restricts to
181
+ basic types only (still safer than the default).
182
+
183
+ ## .NET — BinaryFormatter → System.Text.Json
184
+
185
+ ### Before
186
+
187
+ ```csharp
188
+ var formatter = new BinaryFormatter();
189
+ var obj = (MyClass)formatter.Deserialize(stream);
190
+ ```
191
+
192
+ ### After (System.Text.Json)
193
+
194
+ ```csharp
195
+ using System.Text.Json;
196
+ var options = new JsonSerializerOptions {
197
+ PropertyNameCaseInsensitive = true,
198
+ };
199
+ MyClass obj = JsonSerializer.Deserialize<MyClass>(jsonString, options);
200
+ ```
201
+
202
+ ### For polymorphic types — explicit type discriminator
203
+
204
+ ```csharp
205
+ [JsonDerivedType(typeof(User), "user")]
206
+ [JsonDerivedType(typeof(Order), "order")]
207
+ public abstract class Entity { }
208
+
209
+ // Now the JSON looks like {"$type": "user", "id": 1, ...}
210
+ // And only the registered subtypes can be instantiated
211
+ ```
212
+
213
+ ### Avoid:
214
+
215
+ ```csharp
216
+ // Avoid: modifiers or converters that map payload-supplied type names to arbitrary types
217
+ new JsonSerializerOptions {
218
+ TypeInfoResolver = new DefaultJsonTypeInfoResolver { ... }
219
+ };
220
+ ```
221
+
222
+ ## Ruby — Marshal → JSON / safe YAML
223
+
224
+ ### Before
225
+
226
+ ```ruby
227
+ obj = Marshal.load(data)
228
+ ```
229
+
230
+ ### After
231
+
232
+ ```ruby
233
+ require 'json'
234
+ data = JSON.parse(json_str, create_additions: false) # create_additions: false REQUIRED
235
+ ```
236
+
237
+ `create_additions: true` (the default in older json gem versions) lets JSON
238
+ instantiate arbitrary classes via the `json_create` hook. Set
239
+ `create_additions: false` to disable.
240
+
241
+ ### YAML: Psych 4.0+ defaults to safe loading; older versions need explicit `safe_load`
242
+
243
+ ```ruby
244
+ # Psych 4.0+ (default since Ruby 3.1)
245
+ require 'yaml'
246
+ data = YAML.load(content) # safe by default
247
+
248
+ # Pre-4.0 explicit safety
249
+ data = YAML.safe_load(content, permitted_classes: [Symbol, Date])
250
+ ```
251
+
252
+ ## Node.js — node-serialize and friends
253
+
254
+ ### Before (using the known-vulnerable node-serialize package)
255
+
256
+ ```javascript
257
+ const serialize = require('node-serialize');
258
+ const obj = serialize.unserialize(data); // RCE
259
+ ```
260
+
261
+ ### After
262
+
263
+ ```javascript
264
+ const obj = JSON.parse(data);
265
+ // Validate shape with ajv or zod
266
+ const schema = z.object({ id: z.number(), name: z.string() });
267
+ const validated = schema.parse(obj);
268
+ ```
269
+
270
+ ### Avoid JSON.parse with reviver containing eval
271
+
272
+ ```javascript
273
+ // BAD
274
+ const obj = JSON.parse(data, (key, value) => eval(value));
275
+
276
+ // GOOD
277
+ const obj = JSON.parse(data);
278
+ ```
279
+
280
+ ## HMAC-signed serialization (when migration isn't feasible)
281
+
282
+ If you must keep pickle / Marshal / unserialize / BinaryFormatter
283
+ because of legacy storage that can't be re-encoded, wrap with HMAC
284
+ authentication.
285
+
286
+ ```python
287
+ import hmac, hashlib, pickle, os
288
+
289
+ KEY = os.environ["SERIALIZATION_HMAC_KEY"].encode()
290
+
291
+ def serialize_signed(obj):
292
+ payload = pickle.dumps(obj)
293
+ sig = hmac.new(KEY, payload, hashlib.sha256).digest()
294
+ return sig + payload
295
+
296
+ def deserialize_signed(data):
297
+ sig, payload = data[:32], data[32:]
298
+ expected = hmac.new(KEY, payload, hashlib.sha256).digest()
299
+ if not hmac.compare_digest(sig, expected):
300
+ raise ValueError("HMAC verification failed")
301
+ return pickle.loads(payload)
302
+ ```
303
+
304
+ The HMAC step proves the payload was created by code that holds
305
+ the KEY. Any tampering invalidates the HMAC. `pickle.loads` still
306
+ runs, but only on payloads that you yourself signed.
307
+
308
+ This is necessary infrastructure when migrating; it's not a
309
+ permanent solution. The KEY is now a high-value target — leak the
310
+ key and the deserialization is exploitable again.
311
+
312
+ ## CI integration
313
+
314
+ ```yaml
315
+ - name: Insecure-deserialization scan
316
+ run: |
317
+ python3 plugins/security/penetration-tester/skills/detecting-insecure-deserialization/scripts/scan_deserialization.py \
318
+ . --min-severity high --format json --output deser-scan.json
319
+ - run: |
320
+ if jq 'length > 0' deser-scan.json | grep -q true; then
321
+ echo "::error::Insecure deserialization detected"
322
+ exit 1
323
+ fi
324
+ ```
325
+
326
+ ## Verification after remediation
327
+
328
+ ```bash
329
+ python3 ${CLAUDE_PLUGIN_ROOT}/skills/detecting-insecure-deserialization/scripts/scan_deserialization.py \
330
+ /path/to/repo --min-severity high
331
+ ```
332
+
333
+ Expected: exit 0, zero findings.
package/skills/detecting-insecure-deserialization/references/THEORY.md
@@ -0,0 +1,199 @@
1
+ # Insecure-Deserialization Theory
2
+
3
+ ## Why deserialization is RCE
4
+
5
+ Most serialization formats are structural (JSON: numbers, strings,
6
+ lists, dicts; XML: a tagged tree). They describe data shape, not
7
+ behavior. Deserializing them is a parse step: validate, build value
8
+ trees, return.
9
+
10
+ A small subset of formats is BEHAVIORAL. Python pickle, Java
11
+ serialization, PHP unserialize, .NET BinaryFormatter all store
12
+ not just object state but the type information AND any
13
+ deserialization callbacks the type registered. Re-creating the
14
+ object during deserialization invokes those callbacks.
15
+
16
+ The attack: craft a serialized payload that describes an object
17
+ graph using types whose deserialization callbacks execute
18
+ arbitrary code. The application calls `pickle.loads(payload)` and
19
+ the constructor / `__reduce__` / `readObject` chain runs the
20
+ attacker's code IN the application's process.
21
+
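The mechanism can be seen with a deliberately harmless callable. In this minimal sketch the class `Demo` is hypothetical and `sorted` stands in for whatever function an attacker would pick; the point is that `pickle.loads` calls the function named in the payload instead of rebuilding a `Demo` instance.

```python
import pickle


class Demo:
    def __reduce__(self):
        # Tell pickle: "to rebuild me, call sorted([3, 1, 2])".
        # An attacker would name a dangerous callable here instead.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Demo())

# Loading runs the callable chosen by the payload.
result = pickle.loads(payload)
print(type(result).__name__, result)  # prints: list [1, 2, 3]
```

The loading side never needs the `Demo` class: the payload only references `builtins.sorted`, which is why the receiver cannot defend itself by controlling which application classes exist.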
22
+ This is not theoretical. There are public gadget-chain libraries
23
+ for every language with behavioral deserialization:
24
+
25
+ - Java: `ysoserial` (Spring, Commons Collections, Hibernate gadgets)
26
+ - .NET: `ysoserial.net` (TypeConfuseDelegate, ObjectDataProvider)
27
+ - Python: pickle is trivially exploitable via `__reduce__` returning `(eval, ('...code...',))`
28
+ - PHP: PHPGGC (PHP Generic Gadget Chains library)
29
+ - Ruby: Marshal exploitation via `_load` callbacks
30
+
31
+ If your application accepts any of these formats from any source
32
+ not under your direct control, you have a deserialization RCE
33
+ unless and until you switch formats.
34
+
35
+ ## The "trusted source" trap
36
+
37
+ A common defense: "we only deserialize from a trusted source —
38
+ this database we control, this S3 bucket we own, etc."
39
+
40
+ Two problems:
41
+
42
+ 1. **The "trusted source" is often less trusted than assumed.**
43
+ The database might be writable by other services. The S3 bucket
44
+ might be publicly writable by accident (see skill #9). An attacker
45
+ who reaches the storage layer (via SQL injection, IAM
46
+ misconfiguration, supply-chain compromise) can plant a payload
47
+ that the deserializer later picks up.
48
+
49
+ 2. **The format is brittle even without attackers.** Pickle and
50
+ Java serialization are not stable across language version
51
+ changes; an upgrade can break deserialization of stored data.
52
+ Switching to schema-validated JSON / Protobuf eliminates both
53
+ the security and stability concerns.
54
+
55
+ The pragmatic posture: assume any deserialization input could
56
+ become attacker-controlled through some future change. Use safe
57
+ formats by default.
58
+
59
+ ## Safe alternatives by use case
60
+
61
+ ### Use case: "I need to store structured data and reload it"
62
+
63
+ → JSON (with Pydantic, dataclasses, or msgspec for schema
64
+ validation). Or Protocol Buffers / Avro / MessagePack for binary
65
+ efficiency.
66
+
67
+ ### Use case: "I need to round-trip arbitrary Python / Java / .NET objects"
68
+
69
+ → Define a schema. If the data is genuinely polymorphic, use a
70
+ tagged-union pattern with explicit discriminator field. Validate
71
+ the discriminator against an allow-list before instantiating any
72
+ class.
73
+
74
+ ```python
75
+ # JSON with explicit type discriminator + allow-list
76
+ TYPE_REGISTRY = {
77
+ "User": User,
78
+ "Order": Order,
79
+ "Payment": Payment,
80
+ }
81
+ def deserialize(data: dict):
82
+ type_name = data["__type"]
83
+ if type_name not in TYPE_REGISTRY:
84
+ raise ValueError(f"Unknown type: {type_name}")
85
+ cls = TYPE_REGISTRY[type_name]
86
+ return cls(**{k: v for k, v in data.items() if k != "__type"})
87
+ ```
88
+
89
+ ### Use case: "I have a binary cache file I trust"
90
+
91
+ → HMAC-sign the file when writing; verify HMAC before
92
+ deserializing. The HMAC step proves the file wasn't tampered with
93
+ since you wrote it. Then deserialize.
94
+
95
+ ```python
96
+ import hmac, hashlib, pickle
97
+
98
+ def write_signed(path, obj, key):
99
+ payload = pickle.dumps(obj)
100
+ sig = hmac.new(key, payload, hashlib.sha256).digest()
101
+ with open(path, "wb") as f:
102
+ f.write(sig + payload)
103
+
104
+ def read_signed(path, key):
105
+ with open(path, "rb") as f:
106
+ data = f.read()
107
+ sig, payload = data[:32], data[32:]
108
+ expected = hmac.new(key, payload, hashlib.sha256).digest()
109
+ if not hmac.compare_digest(sig, expected):
110
+ raise ValueError("HMAC mismatch")
111
+ return pickle.loads(payload) # safe because HMAC verified
112
+ ```
113
+
114
+ This works only if the HMAC key is genuinely private. Once the
115
+ key is compromised, the protection is gone.
116
+
117
+ ### Use case: "I need to deserialize YAML config"
118
+
119
+ → Use the safe-load mode of your YAML library. Python's PyYAML
120
+ has `yaml.safe_load()`; Ruby Psych 4.0+ defaults to safe mode;
121
+ SnakeYAML (Java) supports `SafeConstructor`. These restrict
122
+ deserialization to basic types (strings, numbers, lists, dicts)
123
+ and refuse to instantiate arbitrary classes.
124
+
125
+ ## Language-specific notes
126
+
127
+ ### Python pickle
128
+
129
+ `pickle.loads(b)` on attacker-controlled bytes is roughly equivalent
130
+ to `exec()` on attacker-controlled code. There is no "safer pickle"
131
+ mode. Migrate to JSON / msgpack / pydantic-validated formats.
132
+
133
+ If you absolutely must keep pickle for round-tripping:
134
+
135
+ - HMAC-sign with a private key (see above)
136
+ - Restrict to in-process / same-machine usage; never accept pickle
137
+ from a network input
138
+ - Use `pickletools.dis()` to verify payloads conform to expected
139
+ opcodes (still not safe, just slightly harder to abuse)
140
+
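The Python documentation linked under primary sources describes a further mitigation: overriding `Unpickler.find_class` so that only allow-listed globals can be resolved. A minimal sketch of that pattern follows. The names `ALLOWED`, `AllowlistUnpickler`, and `restricted_loads` are illustrative, and the single allow-listed type is an assumption. Like the measures above, this narrows the attack surface and does not make pickle a safe format for untrusted input.

```python
import io
import pickle
from collections import OrderedDict

# Illustrative allow-list of (module, name) pairs the payload may reference.
ALLOWED = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the payload references a global (class or function).
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Unpickle data, refusing any global outside the allow-list."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

Payloads built only from primitives and allow-listed types load normally, while a payload whose `__reduce__` names any other callable fails before that callable runs. Combining this with the HMAC check gives both an authenticity test and a type restriction.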
141
+ ### Java ObjectInputStream
142
+
143
+ Use Jackson with type-allow-list. For legacy code that must keep
144
+ `ObjectInputStream`:
145
+
146
+ ```java
147
+ public class AllowlistedObjectInputStream extends ObjectInputStream {
148
+ private static final Set<String> ALLOWED = Set.of(
149
+ "com.example.User", "com.example.Order"
150
+ );
151
+ @Override
152
+ protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
153
+ if (!ALLOWED.contains(desc.getName())) {
154
+ throw new InvalidClassException("Not allowed: " + desc.getName());
155
+ }
156
+ return super.resolveClass(desc);
157
+ }
158
+ }
159
+ ```
160
+
161
+ ### PHP unserialize
162
+
163
+ `unserialize($input, ["allowed_classes" => false])` restricts to
164
+ basic types (PHP 7+). For class-bearing payloads, use the array
165
+ form: `["allowed_classes" => ["User", "Order"]]`.
166
+
167
+ But the better answer is to switch to `json_decode` if the data
168
+ shape allows.
169
+
170
+ ### .NET BinaryFormatter
171
+
172
+ BinaryFormatter is obsolete since .NET 5 and throws on use in .NET 9+.
173
+ Migrate to `System.Text.Json` for general-purpose serialization,
174
+ or `MessagePack-CSharp` for performance. The migration is
175
+ straightforward in most codebases; the only friction is
176
+ round-tripping types BinaryFormatter handles implicitly that
177
+ require explicit converter registration in System.Text.Json.
178
+
179
+ ## Gadget chain references
180
+
181
+ These are educational, not for use on production systems you
182
+ don't own:
183
+
184
+ - `ysoserial` — Java
185
+ - `ysoserial.net` — .NET
186
+ - `PHPGGC` — PHP
187
+ - Anthony Sotirov's pickle-exploit primers
188
+
189
+ Knowing the gadget-chain shape is useful for understanding the
190
+ risk; the operational answer is still "don't use unsafe
191
+ deserialization formats."
192
+
193
+ ## Primary sources
194
+
195
+ - [CWE-502 Deserialization of Untrusted Data](https://cwe.mitre.org/data/definitions/502.html)
196
+ - [OWASP A08:2021 Software and Data Integrity Failures](https://owasp.org/Top10/A08_2021-Software_and_Data_Integrity_Failures/)
197
+ - [OWASP Deserialization Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html)
198
+ - [Python pickle security docs](https://docs.python.org/3/library/pickle.html#restricting-globals)
199
+ - [Microsoft BinaryFormatter deprecation](https://learn.microsoft.com/en-us/dotnet/standard/serialization/binaryformatter-security-guide)