hppx 0.1.8 → 0.2.2

package/README.md CHANGED
@@ -1,253 +1,492 @@
1
- # hppx
2
-
3
- 🔐 **Superior HTTP Parameter Pollution protection middleware** for Node.js/Express, written in TypeScript. It sanitizes `req.query`, `req.body`, and `req.params`, blocks prototype-pollution keys, supports nested whitelists, multiple merge strategies, and plays nicely with stacked middlewares.
4
-
5
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
6
- [![TypeScript](https://img.shields.io/badge/TypeScript-5.9.3-blue.svg)](https://www.typescriptlang.org/)
7
- [![Node.js](https://img.shields.io/badge/Node.js-16+-green.svg)](https://nodejs.org/)
8
-
9
- ## Features
10
-
11
- - **Multiple merge strategies**: `keepFirst`, `keepLast` (default), `combine`
12
- - **Enhanced security**:
13
- - Blocks dangerous keys: `__proto__`, `prototype`, `constructor`
14
- - Prevents null-byte injection in keys
15
- - Validates key lengths to prevent DoS attacks
16
- - Limits array sizes to prevent memory exhaustion
17
- - **Flexible whitelisting**: Nested whitelist with dot-notation and leaf matching
18
- - **Pollution tracking**: Records polluted parameters on the request (`queryPolluted`, `bodyPolluted`, `paramsPolluted`)
19
- - **Multi-middleware support**: Works with multiple middlewares on different routes (whitelists applied incrementally)
20
- - **DoS protection**: `maxDepth`, `maxKeys`, `maxArrayLength`, `maxKeyLength`
21
- - **Performance optimized**: Path caching for improved performance
22
- - **Fully typed API**: TypeScript-first with comprehensive type definitions and helper functions (`sanitize`)
23
-
24
- ## 📦 Installation
25
-
26
- ```bash
27
- npm install hppx
28
- ```
29
-
30
- ## Usage
31
-
32
- ### ESM (ES Modules)
33
-
34
- ```typescript
35
- import express from "express";
36
- import hppx from "hppx";
37
-
38
- const app = express();
39
- app.use(express.urlencoded({ extended: true }));
40
- app.use(express.json());
41
-
42
- app.use(
43
- hppx({
44
- whitelist: ["tags", "user.roles", "ids"],
45
- mergeStrategy: "keepLast",
46
- sources: ["query", "body"],
47
- }),
48
- );
49
-
50
- app.get("/search", (req, res) => {
51
- res.json({
52
- query: req.query,
53
- queryPolluted: req.queryPolluted ?? {},
54
- body: req.body ?? {},
55
- bodyPolluted: req.bodyPolluted ?? {},
56
- });
57
- });
58
- ```
59
-
60
- ### CommonJS
61
-
62
- ```javascript
63
- const express = require("express");
64
- const hppx = require("hppx");
65
-
66
- const app = express();
67
- app.use(express.urlencoded({ extended: true }));
68
- app.use(express.json());
69
-
70
- app.use(
71
- hppx({
72
- whitelist: ["tags", "user.roles", "ids"],
73
- mergeStrategy: "keepLast",
74
- sources: ["query", "body"],
75
- }),
76
- );
77
-
78
- app.get("/search", (req, res) => {
79
- res.json({
80
- query: req.query,
81
- queryPolluted: req.queryPolluted ?? {},
82
- body: req.body ?? {},
83
- bodyPolluted: req.bodyPolluted ?? {},
84
- });
85
- });
86
- ```
87
-
88
- ## API
89
-
90
- ### default export: `hppx(options?: HppxOptions)`
91
-
92
- Creates an Express-compatible middleware. Applies sanitization to each selected source and exposes `*Polluted` objects.
93
-
94
- #### Key Options
95
-
96
- **Whitelist & Strategy:**
97
-
98
- - `whitelist?: string[]` — keys allowed as arrays; supports dot-notation; leaf matches too
99
- - `mergeStrategy?: 'keepFirst'|'keepLast'|'combine'` — how to reduce arrays when not whitelisted
100
-
101
- **Source Selection:**
102
-
103
- - `sources?: Array<'query'|'body'|'params'>` — which request parts to sanitize (default: all)
104
- - `checkBodyContentType?: 'urlencoded'|'any'|'none'`: when to process `req.body` (default: `urlencoded`)
105
- - `excludePaths?: string[]`: exclude specific paths (supports `*` wildcard suffix)
106
-
107
- **Security Limits (DoS Protection):**
108
-
109
- - `maxDepth?: number`: maximum object nesting depth (default: 20, max: 100)
110
- - `maxKeys?: number` — maximum number of keys to process (default: 5000)
111
- - `maxArrayLength?: number` — maximum array length (default: 1000)
112
- - `maxKeyLength?: number`: maximum key string length (default: 200, max: 1000)
113
-
114
- **Additional Options:**
115
-
116
- - `trimValues?: boolean` — trim string values (default: false)
117
- - `preserveNull?: boolean` — preserve null values (default: true)
118
- - `strict?: boolean` — if pollution detected, immediately respond with 400 error
119
- - `onPollutionDetected?: (req, info) => void` — callback on pollution detection
120
- - `logger?: (err: Error | string) => void` — custom logger for errors and pollution warnings
121
- - `logPollution?: boolean` — enable automatic logging when pollution is detected (default: true)
122
-
123
- ### named export: `sanitize(input, options)`
124
-
125
- Sanitize an arbitrary object using the same rules as the middleware. Useful for manual usage.
126
-
127
- ## Advanced usage
128
-
129
- ### Strict mode (respond 400 on pollution)
130
-
131
- ```typescript
132
- app.use(hppx({ strict: true }));
133
- ```
134
-
135
- ### Process JSON bodies too
136
-
137
- ```typescript
138
- app.use(express.json());
139
- app.use(hppx({ checkBodyContentType: "any" }));
140
- ```
141
-
142
- ### Exclude specific paths (supports `*` suffix)
143
-
144
- ```typescript
145
- app.use(hppx({ excludePaths: ["/public", "/assets*"] }));
146
- ```
147
-
148
- ### Custom logging for pollution detection
149
-
150
- ```typescript
151
- // Use your application's logger
152
- app.use(
153
- hppx({
154
- logger: (message) => {
155
- if (typeof message === "string") {
156
- myLogger.warn(message); // Pollution warnings
157
- } else {
158
- myLogger.error(message); // Errors
159
- }
160
- },
161
- }),
162
- );
163
-
164
- // Disable automatic pollution logging
165
- app.use(hppx({ logPollution: false }));
166
- ```
167
-
168
- ### Use the sanitizer directly
169
-
170
- ```typescript
171
- import { sanitize } from "hppx";
172
-
173
- const clean = sanitize(payload, {
174
- whitelist: ["user.tags"],
175
- mergeStrategy: "keepFirst",
176
- });
177
- ```
178
-
179
- **CommonJS:**
180
-
181
- ```javascript
182
- const { sanitize } = require("hppx");
183
-
184
- const clean = sanitize(payload, {
185
- whitelist: ["user.tags"],
186
- mergeStrategy: "keepFirst",
187
- });
188
- ```
189
-
190
- ## Security Best Practices
191
-
192
- ### Input Validation
193
-
194
- Always combine HPP protection with additional input validation:
195
-
196
- - Use schema validation libraries (e.g., Joi, Yup, Zod)
197
- - Validate data types and ranges after sanitization
198
- - Never trust user input, even after sanitization
199
-
200
- ### Configuration Recommendations
201
-
202
- For production environments, consider these settings:
203
-
204
- ```ts
205
- app.use(
206
- hppx({
207
- maxDepth: 10, // Lower depth for typical use cases
208
- maxKeys: 1000, // Reasonable limit for most requests
209
- maxArrayLength: 100, // Prevent large array attacks
210
- maxKeyLength: 100, // Shorter keys for most applications
211
- strict: true, // Return 400 on pollution attempts
212
- onPollutionDetected: (req, info) => {
213
- // Log security events for monitoring
214
- securityLogger.warn("HPP detected", {
215
- ip: req.ip,
216
- path: req.path,
217
- pollutedKeys: info.pollutedKeys,
218
- });
219
- },
220
- }),
221
- );
222
- ```
223
-
224
- ### What HPP Protects Against
225
-
226
- - **Parameter pollution**: Duplicate parameters causing unexpected behavior
227
- - **Prototype pollution**: Attacks via `__proto__`, `constructor`, `prototype`
228
- - **DoS attacks**: Excessive nesting, too many keys, huge arrays
229
- - **Null-byte injection**: Keys containing null characters (`\u0000`)
230
-
231
- ### What HPP Does NOT Protect Against
232
-
233
- HPP is not a complete security solution. You still need:
234
-
235
- - **SQL injection protection**: Use parameterized queries
236
- - **XSS protection**: Sanitize output, use CSP headers
237
- - **CSRF protection**: Use CSRF tokens
238
- - **Authentication/Authorization**: Validate user permissions
239
- - **Rate limiting**: Prevent brute-force attacks
240
-
241
- ## 📄 License
242
-
243
- MIT License - see [LICENSE](LICENSE) file for details.
244
-
245
- ## 🔗 Links
246
-
247
- - [NPM Package](https://www.npmjs.com/package/hppx)
248
- - [GitHub Repository](https://github.com/Hiprax/hppx)
249
- - [Issue Tracker](https://github.com/Hiprax/hppx/issues)
250
-
251
- ---
252
-
253
- ### **Made with ❤️ for secure applications**
1
+ # hppx
2
+
3
+ **Superior HTTP Parameter Pollution protection middleware** for Node.js/Express, written in TypeScript. It sanitizes `req.query`, `req.body`, and `req.params`, blocks prototype-pollution keys, supports nested whitelists, multiple merge strategies, and plays nicely with stacked middlewares.
4
+
5
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
6
+ [![npm version](https://img.shields.io/npm/v/hppx)](https://www.npmjs.com/package/hppx)
7
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5.9-blue.svg)](https://www.typescriptlang.org/)
8
+ [![Node.js](https://img.shields.io/badge/Node.js-%E2%89%A518-green.svg)](https://nodejs.org/)
9
+ [![Zero Dependencies](https://img.shields.io/badge/Dependencies-0-brightgreen.svg)](#)
10
+
11
+ ---
12
+
13
+ ## Features
14
+
15
+ - **Zero runtime dependencies**: minimal attack surface and bundle size
16
+ - **Multiple merge strategies**: `keepFirst`, `keepLast` (default), `combine`
17
+ - **Enhanced security:**
18
+ - Blocks dangerous keys: `__proto__`, `prototype`, `constructor`
19
+ - Prevents null-byte injection in keys
20
+ - Rejects malformed keys (dot/bracket-only patterns)
21
+ - Validates key lengths to prevent DoS attacks
22
+ - Limits array sizes to prevent memory exhaustion
23
+ - **Flexible whitelisting** — nested whitelist with dot-notation and leaf matching
24
+ - **Pollution tracking** — records polluted parameters on the request (`queryPolluted`, `bodyPolluted`, `paramsPolluted`)
25
+ - **Multi-middleware support** — works with multiple middlewares on different routes (whitelists applied incrementally)
26
+ - **DoS protection** — `maxDepth`, `maxKeys`, `maxArrayLength`, `maxKeyLength`
27
+ - **Performance optimized** — path caching and Set-based lookups for fast whitelist checks
28
+ - **Fully typed API** — TypeScript-first with comprehensive type definitions for both ESM and CommonJS
29
+
30
+ ---
31
+
32
+ ## Installation
33
+
34
+ ```bash
35
+ npm install hppx
36
+ ```
37
+
38
+ ---
39
+
40
+ ## Quick Start
41
+
42
+ ### ESM (ES Modules)
43
+
44
+ ```typescript
45
+ import express from "express";
46
+ import hppx from "hppx";
47
+
48
+ const app = express();
49
+ app.use(express.urlencoded({ extended: true }));
50
+ app.use(express.json());
51
+
52
+ app.use(
53
+ hppx({
54
+ whitelist: ["tags", "user.roles", "ids"],
55
+ mergeStrategy: "keepLast",
56
+ sources: ["query", "body"],
57
+ }),
58
+ );
59
+
60
+ app.get("/search", (req, res) => {
61
+ res.json({
62
+ query: req.query,
63
+ queryPolluted: req.queryPolluted ?? {},
64
+ body: req.body ?? {},
65
+ bodyPolluted: req.bodyPolluted ?? {},
66
+ params: req.params,
67
+ paramsPolluted: req.paramsPolluted ?? {},
68
+ });
69
+ });
70
+ ```
71
+
72
+ ### CommonJS
73
+
74
+ ```javascript
75
+ const express = require("express");
76
+ const hppx = require("hppx");
77
+
78
+ const app = express();
79
+ app.use(express.urlencoded({ extended: true }));
80
+ app.use(express.json());
81
+
82
+ app.use(
83
+ hppx({
84
+ whitelist: ["tags", "user.roles", "ids"],
85
+ mergeStrategy: "keepLast",
86
+ sources: ["query", "body"],
87
+ }),
88
+ );
89
+
90
+ app.get("/search", (req, res) => {
91
+ res.json({
92
+ query: req.query,
93
+ queryPolluted: req.queryPolluted ?? {},
94
+ body: req.body ?? {},
95
+ bodyPolluted: req.bodyPolluted ?? {},
96
+ params: req.params,
97
+ paramsPolluted: req.paramsPolluted ?? {},
98
+ });
99
+ });
100
+ ```
101
+
102
+ ### Polluted Parameter Tree
103
+
104
+ For each enabled source, hppx attaches a parallel `*Polluted` object to the
105
+ request that records the original (pre-reduction) array values for any keys
106
+ that were detected as polluted:
107
+
108
+ | Source | Cleaned data on `req` | Polluted tree on `req` |
109
+ | -------- | --------------------- | ---------------------- |
110
+ | `query` | `req.query` | `req.queryPolluted` |
111
+ | `body` | `req.body` | `req.bodyPolluted` |
112
+ | `params` | `req.params` | `req.paramsPolluted` |
113
+
114
+ These properties are typed via a TypeScript module augmentation included in
115
+ the published types — no extra import is needed.
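The split between cleaned data and the polluted tree described above can be pictured with a small standalone sketch. This is not hppx's implementation: `splitPolluted` and its return shape are hypothetical names for this example, and it only handles flat keys with a keep-last reduction.

```typescript
// Illustrative sketch only, not hppx's actual code.
type Flat = Record<string, unknown>;

// Hypothetical helper: reduce every array value to its last element and
// remember the original array under the same key in a parallel object.
function splitPolluted(input: Flat): { cleaned: Flat; polluted: Flat } {
  const cleaned: Flat = {};
  const polluted: Flat = {};
  for (const key of Object.keys(input)) {
    const value = input[key];
    if (Array.isArray(value)) {
      polluted[key] = [...value]; // original, pre-reduction values
      cleaned[key] = value[value.length - 1]; // keep-last reduction
    } else {
      cleaned[key] = value;
    }
  }
  return { cleaned, polluted };
}

// A query string such as ?x=1&x=2&y=3 is typically parsed as
// { x: ["1", "2"], y: "3" } before any middleware sees it.
const result = splitPolluted({ x: ["1", "2"], y: "3" });
console.log(result.cleaned); // x is "2", y is "3"
console.log(result.polluted); // x is ["1", "2"]
```

In this picture, `cleaned` plays the role of `req.query` and `polluted` the role of `req.queryPolluted`.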
116
+
117
+ ---
118
+
119
+ ## API
120
+
121
+ ### Default Export: `hppx(options?: HppxOptions)`
122
+
123
+ Creates an Express-compatible middleware. Applies sanitization to each selected source and exposes `*Polluted` objects on the request.
124
+
125
+ > **Note:** Invalid options throw a `TypeError` at middleware creation time, not at request time. This ensures misconfiguration is caught early.
126
+
127
+ #### Options
128
+
129
+ **Whitelist & Strategy:**
130
+
131
+ | Option | Type | Default | Description |
132
+ | --------------- | ---------------------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
133
+ | `whitelist` | `string[] \| string` | `[]` | Keys allowed to remain as arrays. Supports dot-notation (`"user.tags"`) and leaf matching (`"tags"` matches any path ending in `tags`). |
134
+ | `mergeStrategy` | `'keepFirst' \| 'keepLast' \| 'combine'` | `'keepLast'` | How to reduce duplicate/array parameters when not whitelisted. `keepFirst` takes the first value, `keepLast` takes the last, `combine` flattens all values into a single array. |
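To make the two rows above concrete, here is a minimal sketch of the three merge strategies and of dot-path plus leaf whitelist matching. The helper names `reduceValues` and `isWhitelisted` are hypothetical, the full-depth flattening in `combine` is an assumption, and none of this is taken from hppx's source.

```typescript
// Illustrative sketch only, not hppx's actual code.
type MergeStrategy = "keepFirst" | "keepLast" | "combine";

// Recursively flatten nested arrays into one flat array.
function flatten(values: unknown[]): unknown[] {
  const out: unknown[] = [];
  for (const v of values) {
    if (Array.isArray(v)) out.push(...flatten(v));
    else out.push(v);
  }
  return out;
}

// Reduce duplicate values according to the chosen strategy.
function reduceValues(values: unknown[], strategy: MergeStrategy): unknown {
  switch (strategy) {
    case "keepFirst":
      return values[0];
    case "keepLast":
      return values[values.length - 1];
    case "combine":
      return flatten(values);
  }
}

// A path is allowed to stay an array when the whitelist contains either
// the full dot-path or just its last segment (leaf matching).
function isWhitelisted(path: string, whitelist: string[]): boolean {
  const allowed = new Set(whitelist);
  const segments = path.split(".");
  const leaf = segments[segments.length - 1];
  return allowed.has(path) || allowed.has(leaf);
}

console.log(reduceValues(["1", "2"], "keepFirst")); // "1"
console.log(reduceValues(["1", "2"], "keepLast")); // "2"
console.log(isWhitelisted("user.tags", ["tags"])); // true (leaf match)
```

Whether a dotted whitelist entry such as `"user.tags"` also matches deeper paths is not stated in the README; the sketch treats it as an exact path.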
135
+
136
+ **Source Selection:**
137
+
138
+ | Option | Type | Default | Description |
139
+ | ---------------------- | -------------------------------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
140
+ | `sources` | `Array<'query' \| 'body' \| 'params'>` | `['query', 'body', 'params']` | Which request parts to sanitize. |
141
+ | `checkBodyContentType` | `'urlencoded' \| 'any' \| 'none'` | `'urlencoded'` | When to process `req.body`. `urlencoded` only processes URL-encoded bodies, `any` processes all content types, `none` skips body processing entirely. |
142
+ | `excludePaths` | `string[]` | `[]` | Paths to exclude from sanitization. Supports `*` wildcard suffix (e.g., `"/assets*"`). |
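One plausible reading of the `*` wildcard suffix for `excludePaths` is a prefix match, sketched below. The helper `isExcluded` is hypothetical, and treating entries without `*` as exact matches is an assumption, since the README does not spell that out.

```typescript
// Illustrative sketch only, not hppx's actual code.
// A pattern ending in "*" is treated as a prefix match; any other pattern
// must equal the request path exactly (assumed behavior).
function isExcluded(path: string, patterns: string[]): boolean {
  return patterns.some((pattern) =>
    pattern.endsWith("*")
      ? path.startsWith(pattern.slice(0, -1))
      : path === pattern,
  );
}

const patterns = ["/public", "/assets*"];
console.log(isExcluded("/assets/app.js", patterns)); // true (prefix match)
console.log(isExcluded("/public", patterns)); // true (exact match)
console.log(isExcluded("/api/users", patterns)); // false
```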
143
+
144
+ **Security Limits (DoS Protection):**
145
+
146
+ | Option | Type | Default | Range | Description |
147
+ | ---------------- | -------- | ------- | -------- | ------------------------------------------------------------------------------------- |
148
+ | `maxDepth` | `number` | `20` | 1 - 100 | Maximum object nesting depth. Exceeding this throws an error passed to `next()`. |
149
+ | `maxKeys` | `number` | `5000` | >= 1 | Maximum number of keys to process. Exceeding this throws an error passed to `next()`. |
150
+ | `maxArrayLength` | `number` | `1000` | >= 1 | Maximum array length. Arrays are truncated before processing. |
151
+ | `maxKeyLength` | `number` | `200` | 1 - 1000 | Maximum key string length. Longer keys are silently dropped. |
152
+
153
+ **Behavior & Callbacks:**
154
+
155
+ | Option | Type | Default | Description |
156
+ | --------------------- | --------------------------------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
157
+ | `trimValues` | `boolean` | `false` | Trim whitespace from string values. |
158
+ | `preserveNull` | `boolean` | `true` | Preserve `null` values in the output. |
159
+ | `strict` | `boolean` | `false` | Immediately respond with HTTP 400 when pollution is detected. Response includes `error`, `message`, `pollutedParameters`, and `code` (`"HPP_DETECTED"`) fields. |
160
+ | `onPollutionDetected` | `(req, info) => void` | — | Callback fired on pollution detection. Called **once per polluted source** (e.g., fires twice if both query and body are polluted). `info` contains `{ source: RequestSource, pollutedKeys: string[] }`. |
161
+ | `logger` | `(err: Error \| unknown) => void` | — | Custom logger for errors and pollution warnings. Receives `string` for pollution warnings and `Error` for caught errors. Falls back to `console.warn`/`console.error` if the logger throws. |
162
+ | `logPollution` | `boolean` | `true` | Enable automatic logging when pollution is detected. |
163
+
164
+ ---
165
+
166
+ ### Named Export: `sanitize(input, options?)`
167
+
168
+ ```typescript
169
+ function sanitize<T extends Record<string, unknown>>(input: T, options?: SanitizeOptions): T;
170
+ ```
171
+
172
+ Sanitize a plain object using the same rules as the middleware.
173
+
174
+ **Return shape:** `sanitize()` returns **only** the cleaned object — the
175
+ same shape as `input`, with arrays reduced according to the chosen merge
176
+ strategy. It does **not** return the internal `SanitizedResult`
177
+ (`{ cleaned, pollutedTree, pollutedKeys }`); polluted-tree data is
178
+ deliberately discarded. If you need access to the polluted tree or polluted
179
+ keys, use the middleware factory `hppx()` and read `req.queryPolluted` /
180
+ `req.bodyPolluted` / `req.paramsPolluted` instead.
181
+
182
+ **Options:** `sanitize()` accepts only `SanitizeOptions` —
183
+ `whitelist`, `mergeStrategy`, `maxDepth`, `maxKeys`, `maxArrayLength`,
184
+ `maxKeyLength`, `trimValues`, and `preserveNull`. Middleware-only options
185
+ (`sources`, `excludePaths`, `strict`, `onPollutionDetected`, `logger`,
186
+ `logPollution`, `checkBodyContentType`) are silently ignored when passed
187
+ to `sanitize()` — use `hppx()` if you need any of those features.
188
+
189
+ **ESM:**
190
+
191
+ ```typescript
192
+ import { sanitize } from "hppx";
193
+
194
+ const clean = sanitize(payload, {
195
+ whitelist: ["user.tags"],
196
+ mergeStrategy: "keepFirst",
197
+ });
198
+ ```
199
+
200
+ **CommonJS:**
201
+
202
+ ```javascript
203
+ const { sanitize } = require("hppx");
204
+
205
+ const clean = sanitize(payload, {
206
+ whitelist: ["user.tags"],
207
+ mergeStrategy: "keepFirst",
208
+ });
209
+ ```
210
+
211
+ ---
212
+
213
+ ### Exported Types
214
+
215
+ All types are available for both ESM and CommonJS consumers:
216
+
217
+ ```typescript
218
+ import type {
219
+ RequestSource, // "query" | "body" | "params"
220
+ MergeStrategy, // "keepFirst" | "keepLast" | "combine"
221
+ SanitizeOptions, // Options for sanitize()
222
+ HppxOptions, // Full middleware options (extends SanitizeOptions)
223
+ SanitizedResult, // { cleaned, pollutedTree, pollutedKeys }
224
+ } from "hppx";
225
+ ```
226
+
227
+ ### Exported Constants
228
+
229
+ ```typescript
230
+ import { DANGEROUS_KEYS, DEFAULT_SOURCES, DEFAULT_STRATEGY } from "hppx";
231
+
232
+ DANGEROUS_KEYS; // Set<string> — {"__proto__", "prototype", "constructor"}
233
+ DEFAULT_SOURCES; // ["query", "body", "params"]
234
+ DEFAULT_STRATEGY; // "keepLast"
235
+ ```
236
+
237
+ ---
238
+
239
+ ## Advanced Usage
240
+
241
+ ### Strict Mode (Respond 400 on Pollution)
242
+
243
+ ```typescript
244
+ app.use(hppx({ strict: true }));
245
+
246
+ // Polluted requests receive:
247
+ // {
248
+ // "error": "Bad Request",
249
+ // "message": "HTTP Parameter Pollution detected",
250
+ // "pollutedParameters": ["query.x"],
251
+ // "code": "HPP_DETECTED"
252
+ // }
253
+ ```
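Client or test code may want to recognize this response body. The sketch below types the four fields listed above and adds a type guard; `HppErrorBody` and `isHppErrorBody` are hypothetical names for this example and are not exported by hppx.

```typescript
// Shape of the strict-mode 400 body, using the fields shown above.
interface HppErrorBody {
  error: string;
  message: string;
  pollutedParameters: string[];
  code: "HPP_DETECTED";
}

// Hypothetical type guard: checks an unknown JSON value against that shape.
function isHppErrorBody(body: unknown): body is HppErrorBody {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    b.code === "HPP_DETECTED" &&
    typeof b.error === "string" &&
    typeof b.message === "string" &&
    Array.isArray(b.pollutedParameters) &&
    b.pollutedParameters.every((p) => typeof p === "string")
  );
}

const sample: unknown = {
  error: "Bad Request",
  message: "HTTP Parameter Pollution detected",
  pollutedParameters: ["query.x"],
  code: "HPP_DETECTED",
};
if (isHppErrorBody(sample)) {
  console.log(sample.pollutedParameters); // typed as string[]
}
```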
254
+
255
+ ### Process JSON Bodies Too
256
+
257
+ ```typescript
258
+ app.use(express.json());
259
+ app.use(hppx({ checkBodyContentType: "any" }));
260
+ ```
261
+
262
+ ### Exclude Specific Paths
263
+
264
+ ```typescript
265
+ app.use(hppx({ excludePaths: ["/public", "/assets*"] }));
266
+ ```
267
+
268
+ ### Custom Logging
269
+
270
+ ```typescript
271
+ // Use your application's logger
272
+ app.use(
273
+ hppx({
274
+ logger: (msg) => {
275
+ if (typeof msg === "string") {
276
+ myLogger.warn(msg); // Pollution warnings
277
+ } else {
278
+ myLogger.error(msg); // Errors
279
+ }
280
+ },
281
+ }),
282
+ );
283
+
284
+ // Disable automatic pollution logging
285
+ app.use(hppx({ logPollution: false }));
286
+ ```
287
+
288
+ ### Multi-Middleware Stacking
289
+
290
+ hppx supports incremental whitelisting across multiple middleware instances. Each subsequent middleware applies its own whitelist to the already-collected polluted data:
291
+
292
+ ```typescript
293
+ // Global middleware — whitelist "a"
294
+ app.use(hppx({ whitelist: ["a"] }));
295
+
296
+ // Route-level middleware — additionally whitelist "b" and "c"
297
+ const router = express.Router();
298
+ router.use(hppx({ whitelist: ["b", "c"] }));
299
+
300
+ // On this route, "a", "b", and "c" are all allowed as arrays
301
+ router.get("/data", (req, res) => {
302
+ res.json({ query: req.query });
303
+ });
304
+
305
+ app.use("/api", router);
306
+ ```
307
+
308
+ #### Option Precedence Across Stacked Middleware
309
+
310
+ When the same source (`query` / `body` / `params`) has already been processed by an
311
+ earlier `hppx()` instance on the same request, a subsequent `hppx()` only applies its
312
+ own **`whitelist`** — used to restore additional whitelisted entries from the
313
+ polluted tree the first middleware already collected. Every other option on the
314
+ later instance is **silently ignored for that source** because the source is no
315
+ longer available in its original (un-reduced) form.
316
+
317
+ The options ignored on subsequent middleware (per-source) are:
318
+
319
+ - `mergeStrategy`
320
+ - `maxDepth`, `maxKeys`, `maxArrayLength`, `maxKeyLength`
321
+ - `trimValues`, `preserveNull`
322
+ - `strict` (will **not** trigger HTTP 400 if the earlier middleware already cleaned the source)
323
+ - `onPollutionDetected`, `logger`, `logPollution`
324
+ - Exception: `excludePaths` is still checked per-instance (independent of the processed flag), but
325
+ if the source was already processed, only whitelist restoration runs.
326
+
327
+ **Footgun example:**
328
+
329
+ ```typescript
330
+ // Global middleware — keepLast strategy, no strict mode
331
+ app.use(hppx({ mergeStrategy: "keepLast" }));
332
+
333
+ // Route-level middleware — strict mode, but it's TOO LATE
334
+ app.use(
335
+ "/api/admin",
336
+ hppx({ strict: true }), // SILENTLY IGNORED — the source was already
337
+ // cleaned by the global middleware, so strict
338
+ // mode here will NOT cause a 400 response.
339
+ );
340
+ ```
341
+
342
+ If you need strict mode on a specific route, configure `strict: true` on the
343
+ **first** `hppx()` instance that processes the relevant source — typically the
344
+ global middleware. Equivalent options (`maxDepth`, `mergeStrategy`, callbacks,
345
+ loggers) must likewise be set on the first instance. Subsequent instances are
346
+ useful only for **expanding** the whitelist to recover additional fields from
347
+ `req.queryPolluted`/`req.bodyPolluted`/`req.paramsPolluted` on a per-route basis.
348
+
349
+ A subsequent middleware can only ever _expand_ the whitelist (by restoring more
350
+ fields back from the polluted tree). It cannot restrict an already-whitelisted
351
+ field, because the previous middleware has already moved that field back into
352
+ the source.
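The "expand only" behavior described above can be sketched as a second pass that moves newly whitelisted keys from the polluted tree back into the cleaned data. `restoreWhitelisted` is a hypothetical helper limited to flat keys, and removing restored keys from the polluted object is an assumption, not something the README confirms.

```typescript
// Illustrative sketch only, not hppx's actual code.
type Flat = Record<string, unknown>;

// Hypothetical second-pass step: restore whitelisted keys from the
// polluted object and leave every other key untouched.
function restoreWhitelisted(cleaned: Flat, polluted: Flat, whitelist: string[]): void {
  for (const key of whitelist) {
    if (Object.prototype.hasOwnProperty.call(polluted, key)) {
      cleaned[key] = polluted[key]; // the original array comes back
      delete polluted[key]; // assumed: it no longer counts as polluted
    }
  }
}

// State after a first pass with whitelist ["a"] on ?a=1&a=2&b=3&b=4:
const cleaned: Flat = { a: ["1", "2"], b: "4" };
const polluted: Flat = { b: ["3", "4"] };

// A later instance with whitelist ["b", "c"] can only add to what is allowed.
restoreWhitelisted(cleaned, polluted, ["b", "c"]);
console.log(cleaned); // a is ["1", "2"], b is ["3", "4"]
console.log(polluted); // empty object
```

Nothing in this step can turn `a` back into a single value, which mirrors why a later instance cannot restrict a field that an earlier one already whitelisted.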
353
+
354
+ ### Pollution Detection Callback
355
+
356
+ ```typescript
357
+ app.use(
358
+ hppx({
359
+ onPollutionDetected: (req, info) => {
360
+ // Called once per polluted source (query, body, params)
361
+ securityLogger.warn("HPP detected", {
362
+ source: info.source,
363
+ pollutedKeys: info.pollutedKeys,
364
+ });
365
+ },
366
+ }),
367
+ );
368
+ ```
369
+
370
+ ---
371
+
372
+ ## Security
373
+
374
+ ### What hppx Protects Against
375
+
376
+ | Threat | Protection |
377
+ | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
378
+ | **Parameter pollution** | Duplicate parameters are reduced to a single value via the chosen merge strategy |
379
+ | **Prototype pollution** | `__proto__`, `constructor`, `prototype` keys are blocked at every processing level |
380
+ | **DoS via deep nesting** | `maxDepth` limit throws error on excessive nesting |
381
+ | **DoS via key flooding** | `maxKeys` limit throws error when key count is exceeded |
382
+ | **DoS via large arrays** | `maxArrayLength` truncates arrays before processing |
383
+ | **DoS via long keys** | `maxKeyLength` silently drops excessively long keys |
384
+ | **Null-byte injection** | Keys containing `\u0000` are silently dropped |
385
+ | **Control / bidi key chars** | Keys containing ASCII / C1 control characters (`\x00`-`\x1F`, `\x7F`-`\x9F`) or Unicode bidirectional override characters (LRM/RLM, LRE/RLE/PDF/LRO/RLO, LRI/RLI/FSI/PDI, BOM) are dropped |
386
+ | **Malformed keys** | Keys consisting only of dots/brackets (e.g., `"..."`, `"[["`) are dropped |
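The key-level checks in the table can be approximated with a few regular expressions. This is a sketch under stated assumptions: `isAcceptableKey` and the constant names are hypothetical, the code points are the standard Unicode values for the characters named in the table, and hppx's real checks may differ in detail.

```typescript
// Illustrative sketch only, not hppx's actual code.
const BLOCKED_KEYS = new Set(["__proto__", "prototype", "constructor"]);
// ASCII and C1 control characters (includes the null byte).
const CONTROL_CHARS = /[\u0000-\u001F\u007F-\u009F]/;
// LRM/RLM, LRE/RLE/PDF/LRO/RLO, LRI/RLI/FSI/PDI and BOM.
const BIDI_CHARS = /[\u200E\u200F\u202A-\u202E\u2066-\u2069\uFEFF]/;
// Keys made up only of dots and square brackets.
const ONLY_DOTS_OR_BRACKETS = /^[.\[\]]+$/;

// Returns false for any key that one of the table's rules would drop.
function isAcceptableKey(key: string, maxKeyLength = 200): boolean {
  if (key.length > maxKeyLength) return false;
  if (BLOCKED_KEYS.has(key)) return false;
  if (CONTROL_CHARS.test(key)) return false;
  if (BIDI_CHARS.test(key)) return false;
  if (ONLY_DOTS_OR_BRACKETS.test(key)) return false;
  return true;
}

console.log(isAcceptableKey("user.tags")); // true
console.log(isAcceptableKey("__proto__")); // false
console.log(isAcceptableKey("a\u0000b")); // false
console.log(isAcceptableKey("[[")); // false
```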
387
+
388
+ ### Production Configuration
389
+
390
+ ```typescript
391
+ app.use(
392
+ hppx({
393
+ maxDepth: 10,
394
+ maxKeys: 1000,
395
+ maxArrayLength: 100,
396
+ maxKeyLength: 100,
397
+ strict: true,
398
+ onPollutionDetected: (req, info) => {
399
+ securityLogger.warn("HPP detected", {
400
+ ip: req.ip,
401
+ path: req.path,
402
+ source: info.source,
403
+ pollutedKeys: info.pollutedKeys,
404
+ });
405
+ },
406
+ }),
407
+ );
408
+ ```
409
+
410
+ ### Express 5 Note
411
+
412
+ In Express 5, `req.query` is exposed as a lazy getter on the prototype chain rather than
413
+ an own property. hppx handles this transparently: it uses `Object.defineProperty` to
414
+ install the sanitized value as a writable own property that shadows the proto-level
415
+ getter. After the middleware runs, `req.query` reflects the cleaned value (e.g.
416
+ `req.query.x === "2"` after `?x=1&x=2` with `mergeStrategy: "keepLast"`).
417
+
418
+ If an upstream layer makes `req.query` non-configurable AND non-writable before hppx
419
+ runs (uncommon), hppx will not silently leave the polluted value in place — it will
420
+ emit a warning via the configured `logger` (or `console.warn` if none is provided) so
421
+ the misconfiguration is visible. The warning is de-duplicated per request and per
422
+ source.
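The shadowing technique mentioned above can be demonstrated without Express. In the sketch below, `FakeRequest` is a stand-in class with a prototype-level getter and `shadowQuery` is a hypothetical helper; only `Object.defineProperty` itself is the mechanism the note refers to.

```typescript
// Illustrative sketch only, not hppx's actual code.
// Stand-in for a request whose `query` is a getter on the prototype.
class FakeRequest {
  get query(): Record<string, unknown> {
    return { x: ["1", "2"] }; // recomputed on every access
  }
}

// Install a cleaned value as a writable own property that shadows the
// prototype getter. Returns false when the property cannot be defined.
function shadowQuery(req: object, cleaned: Record<string, unknown>): boolean {
  try {
    Object.defineProperty(req, "query", {
      value: cleaned,
      writable: true,
      configurable: true,
      enumerable: true,
    });
    return true;
  } catch {
    return false; // e.g. the object is frozen or the property is locked
  }
}

const req = new FakeRequest();
shadowQuery(req, { x: "2" });
console.log(req.query); // the own property now wins: x is "2"
```

When `shadowQuery` returns `false`, the getter's original value is still what readers of `req.query` see, which is the situation the warning described above is meant to surface.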
423
+
424
+ ### What hppx Does NOT Protect Against
425
+
426
+ hppx is not a complete security solution. You still need:
427
+
428
+ - **SQL injection protection** — use parameterized queries
429
+ - **XSS protection** — sanitize output, use CSP headers
430
+ - **CSRF protection** — use CSRF tokens
431
+ - **Authentication/Authorization** — validate user permissions
432
+ - **Rate limiting** — prevent brute-force attacks
433
+ - **Input validation** — use schema validation libraries (Joi, Yup, Zod) alongside hppx
434
+
435
+ ---
436
+
437
+ ## FAQ / Known Behaviors
438
+
439
+ A short reference for behaviors that surprise people most often.
440
+
441
+ **1. The `combine` strategy still records pollution.**
442
+ Earlier versions silently dropped pollution events when `mergeStrategy: "combine"`
443
+ was in effect, because the array was preserved as-is. This was a footgun for
444
+ security logging. Today, `combine` records polluted keys into `req.queryPolluted`
445
+ (etc.) and fires `onPollutionDetected` and `logPollution` exactly like the other
446
+ strategies — the cleaned data simply contains the flattened array rather than a
447
+ reduced single value.
448
+
449
+ **2. Multi-middleware: subsequent passes only honor `whitelist`.**
450
+ When two `hppx()` instances run on the same request (e.g. global + router), the
451
+ second one only applies its own `whitelist` — to restore additional fields out
452
+ of the polluted tree the first instance already collected. All other options
453
+ (`mergeStrategy`, `strict`, `onPollutionDetected`, limits, etc.) on the second
454
+ instance are silently ignored for any source the first instance already
455
+ processed. See **Multi-Middleware Stacking → Option Precedence** above for the
456
+ full list. Configure `strict: true`, callbacks, and limits on the **first**
457
+ `hppx()` that processes the source — typically the global middleware.
458
+
459
+ **3. Express 5 frozen `req.query` fallback.**
460
+ Express 5 exposes `req.query` as a lazy getter on the prototype chain. hppx
461
+ shadows it with a writable own property carrying the cleaned value. If an
462
+ unusual upstream layer makes `req.query` non-configurable AND non-writable
463
+ before hppx runs, hppx will emit a warning via the configured `logger` (or
464
+ `console.warn`) instead of silently leaving the polluted array in place.
465
+
466
+ **4. Control / bidirectional override characters in keys are rejected.**
467
+ Keys containing ASCII / C1 control characters (`\x00`-`\x1F`, `\x7F`-`\x9F`),
468
+ Unicode bidirectional override characters (LRM/RLM, LRE/RLE/PDF/LRO/RLO,
469
+ LRI/RLI/FSI/PDI), or BOM (`\uFEFF`) are silently dropped. This prevents
470
+ log-injection / DB-corruption tricks that use invisible control characters
471
+ to disguise key names.
472
+
473
+ **5. `sanitize()` returns only the cleaned object.**
474
+ The standalone `sanitize()` function returns the same shape as its input, with
475
+ arrays reduced. It does **not** return `{cleaned, pollutedTree, pollutedKeys}`
476
+ — if you need the polluted tree, use the middleware factory `hppx()` and read
477
+ `req.queryPolluted` / `req.bodyPolluted` / `req.paramsPolluted`. Middleware-only
478
+ options (`sources`, `excludePaths`, `strict`, callbacks, etc.) are silently
479
+ ignored when passed to `sanitize()`.
480
+
481
+ ---
482
+
483
+ ## License
484
+
485
+ MIT License - see [LICENSE](LICENSE) file for details.
486
+
487
+ ## Links
488
+
489
+ - [NPM Package](https://www.npmjs.com/package/hppx)
490
+ - [GitHub Repository](https://github.com/Hiprax/hppx)
491
+ - [Issue Tracker](https://github.com/Hiprax/hppx/issues)
492
+ - [Changelog](CHANGELOG.md)