hppx 0.1.10 → 0.2.3

package/README.md CHANGED
@@ -1,351 +1,494 @@
1
- # hppx
2
-
3
- **Superior HTTP Parameter Pollution protection middleware** for Node.js/Express, written in TypeScript. It sanitizes `req.query`, `req.body`, and `req.params`, blocks prototype-pollution keys, supports nested whitelists, multiple merge strategies, and plays nicely with stacked middlewares.
4
-
5
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
6
- [![npm version](https://img.shields.io/npm/v/hppx)](https://www.npmjs.com/package/hppx)
7
- [![TypeScript](https://img.shields.io/badge/TypeScript-5.9-blue.svg)](https://www.typescriptlang.org/)
8
- [![Node.js](https://img.shields.io/badge/Node.js-%E2%89%A516-green.svg)](https://nodejs.org/)
9
- [![Zero Dependencies](https://img.shields.io/badge/Dependencies-0-brightgreen.svg)](#)
10
-
11
- ---
12
-
13
- ## Features
14
-
15
- - **Zero runtime dependencies** — minimal attack surface and bundle size
16
- - **Multiple merge strategies** — `keepFirst`, `keepLast` (default), `combine`
17
- - **Enhanced security:**
18
- - Blocks dangerous keys: `__proto__`, `prototype`, `constructor`
19
- - Prevents null-byte injection in keys
20
- - Rejects malformed keys (dot/bracket-only patterns)
21
- - Validates key lengths to prevent DoS attacks
22
- - Limits array sizes to prevent memory exhaustion
23
- - **Flexible whitelisting**: nested whitelist with dot-notation and leaf matching
24
- - **Pollution tracking**: records polluted parameters on the request (`queryPolluted`, `bodyPolluted`, `paramsPolluted`)
25
- - **Multi-middleware support** — works with multiple middlewares on different routes (whitelists applied incrementally)
26
- - **DoS protection** — `maxDepth`, `maxKeys`, `maxArrayLength`, `maxKeyLength`
27
- - **Performance optimized** — path caching and Set-based lookups for fast whitelist checks
28
- - **Fully typed API** — TypeScript-first with comprehensive type definitions for both ESM and CommonJS
29
-
30
- ---
31
-
32
- ## Installation
33
-
34
- ```bash
35
- npm install hppx
36
- ```
37
-
38
- ---
39
-
40
- ## Quick Start
41
-
42
- ### ESM (ES Modules)
43
-
44
- ```typescript
45
- import express from "express";
46
- import hppx from "hppx";
47
-
48
- const app = express();
49
- app.use(express.urlencoded({ extended: true }));
50
- app.use(express.json());
51
-
52
- app.use(
53
- hppx({
54
- whitelist: ["tags", "user.roles", "ids"],
55
- mergeStrategy: "keepLast",
56
- sources: ["query", "body"],
57
- }),
58
- );
59
-
60
- app.get("/search", (req, res) => {
61
- res.json({
62
- query: req.query,
63
- queryPolluted: req.queryPolluted ?? {},
64
- body: req.body ?? {},
65
- bodyPolluted: req.bodyPolluted ?? {},
66
- });
67
- });
68
- ```
69
-
70
- ### CommonJS
71
-
72
- ```javascript
73
- const express = require("express");
74
- const hppx = require("hppx");
75
-
76
- const app = express();
77
- app.use(express.urlencoded({ extended: true }));
78
- app.use(express.json());
79
-
80
- app.use(
81
- hppx({
82
- whitelist: ["tags", "user.roles", "ids"],
83
- mergeStrategy: "keepLast",
84
- sources: ["query", "body"],
85
- }),
86
- );
87
-
88
- app.get("/search", (req, res) => {
89
- res.json({
90
- query: req.query,
91
- queryPolluted: req.queryPolluted ?? {},
92
- body: req.body ?? {},
93
- bodyPolluted: req.bodyPolluted ?? {},
94
- });
95
- });
96
- ```
97
-
98
- ---
99
-
100
- ## API
101
-
102
- ### Default Export: `hppx(options?: HppxOptions)`
103
-
104
- Creates an Express-compatible middleware. Applies sanitization to each selected source and exposes `*Polluted` objects (`queryPolluted`, `bodyPolluted`, `paramsPolluted`) on the request.
105
-
106
- > **Note:** Invalid options throw a `TypeError` at middleware creation time, not at request time. This ensures misconfiguration is caught early.
107
-
108
- #### Options
109
-
110
- **Whitelist & Strategy:**
111
-
112
- | Option | Type | Default | Description |
113
- | --------------- | ---------------------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
114
- | `whitelist` | `string[] \| string` | `[]` | Keys allowed to remain as arrays. Supports dot-notation (`"user.tags"`) and leaf matching (`"tags"` matches any path ending in `tags`). |
115
- | `mergeStrategy` | `'keepFirst' \| 'keepLast' \| 'combine'` | `'keepLast'` | How to reduce duplicate/array parameters when not whitelisted. `keepFirst` takes the first value, `keepLast` takes the last, `combine` flattens all values into a single array. |
116
-
117
- **Source Selection:**
118
-
119
- | Option | Type | Default | Description |
120
- | ---------------------- | -------------------------------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
121
- | `sources` | `Array<'query' \| 'body' \| 'params'>` | `['query', 'body', 'params']` | Which request parts to sanitize. |
122
- | `checkBodyContentType` | `'urlencoded' \| 'any' \| 'none'` | `'urlencoded'` | When to process `req.body`. `urlencoded` only processes URL-encoded bodies, `any` processes all content types, `none` skips body processing entirely. |
123
- | `excludePaths` | `string[]` | `[]` | Paths to exclude from sanitization. Supports `*` wildcard suffix (e.g., `"/assets*"`). |
124
-
125
- **Security Limits (DoS Protection):**
126
-
127
- | Option | Type | Default | Range | Description |
128
- | ---------------- | -------- | ------- | -------- | ------------------------------------------------------------------------------------- |
129
- | `maxDepth` | `number` | `20` | 1 - 100 | Maximum object nesting depth. Exceeding this throws an error passed to `next()`. |
130
- | `maxKeys` | `number` | `5000` | >= 1 | Maximum number of keys to process. Exceeding this throws an error passed to `next()`. |
131
- | `maxArrayLength` | `number` | `1000` | >= 1 | Maximum array length. Arrays are truncated before processing. |
132
- | `maxKeyLength` | `number` | `200` | 1 - 1000 | Maximum key string length. Longer keys are silently dropped. |
133
-
134
- **Behavior & Callbacks:**
135
-
136
- | Option | Type | Default | Description |
137
- | --------------------- | --------------------------------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
138
- | `trimValues` | `boolean` | `false` | Trim whitespace from string values. |
139
- | `preserveNull` | `boolean` | `true` | Preserve `null` values in the output. |
140
- | `strict` | `boolean` | `false` | Immediately respond with HTTP 400 when pollution is detected. Response includes `error`, `message`, `pollutedParameters`, and `code` (`"HPP_DETECTED"`) fields. |
141
- | `onPollutionDetected` | `(req, info) => void` | — | Callback fired on pollution detection. Called **once per polluted source** (e.g., fires twice if both query and body are polluted). `info` contains `{ source: RequestSource, pollutedKeys: string[] }`. |
142
- | `logger` | `(err: Error \| unknown) => void` | — | Custom logger for errors and pollution warnings. Receives `string` for pollution warnings and `Error` for caught errors. Falls back to `console.warn`/`console.error` if the logger throws. |
143
- | `logPollution` | `boolean` | `true` | Enable automatic logging when pollution is detected. |
144
-
145
- ---
146
-
147
- ### Named Export: `sanitize(input, options?)`
148
-
149
- ```typescript
150
- function sanitize<T extends Record<string, unknown>>(input: T, options?: SanitizeOptions): T;
151
- ```
152
-
153
- Sanitize a plain object using the same rules as the middleware. Returns only the cleaned object (polluted data is not returned — use the middleware if you need `req.queryPolluted` etc.).
154
-
155
- **ESM:**
156
-
157
- ```typescript
158
- import { sanitize } from "hppx";
159
-
160
- const clean = sanitize(payload, {
161
- whitelist: ["user.tags"],
162
- mergeStrategy: "keepFirst",
163
- });
164
- ```
165
-
166
- **CommonJS:**
167
-
168
- ```javascript
169
- const { sanitize } = require("hppx");
170
-
171
- const clean = sanitize(payload, {
172
- whitelist: ["user.tags"],
173
- mergeStrategy: "keepFirst",
174
- });
175
- ```
176
-
177
- ---
178
-
179
- ### Exported Types
180
-
181
- All types are available for both ESM and CommonJS consumers:
182
-
183
- ```typescript
184
- import type {
185
- RequestSource, // "query" | "body" | "params"
186
- MergeStrategy, // "keepFirst" | "keepLast" | "combine"
187
- SanitizeOptions, // Options for sanitize()
188
- HppxOptions, // Full middleware options (extends SanitizeOptions)
189
- SanitizedResult, // { cleaned, pollutedTree, pollutedKeys }
190
- } from "hppx";
191
- ```
192
-
193
- ### Exported Constants
194
-
195
- ```typescript
196
- import { DANGEROUS_KEYS, DEFAULT_SOURCES, DEFAULT_STRATEGY } from "hppx";
197
-
198
- DANGEROUS_KEYS; // Set<string> — {"__proto__", "prototype", "constructor"}
199
- DEFAULT_SOURCES; // ["query", "body", "params"]
200
- DEFAULT_STRATEGY; // "keepLast"
201
- ```
202
-
203
- ---
204
-
205
- ## Advanced Usage
206
-
207
- ### Strict Mode (Respond 400 on Pollution)
208
-
209
- ```typescript
210
- app.use(hppx({ strict: true }));
211
-
212
- // Polluted requests receive:
213
- // {
214
- // "error": "Bad Request",
215
- // "message": "HTTP Parameter Pollution detected",
216
- // "pollutedParameters": ["query.x"],
217
- // "code": "HPP_DETECTED"
218
- // }
219
- ```
220
-
221
- ### Process JSON Bodies Too
222
-
223
- ```typescript
224
- app.use(express.json());
225
- app.use(hppx({ checkBodyContentType: "any" }));
226
- ```
227
-
228
- ### Exclude Specific Paths
229
-
230
- ```typescript
231
- app.use(hppx({ excludePaths: ["/public", "/assets*"] }));
232
- ```
233
-
234
- ### Custom Logging
235
-
236
- ```typescript
237
- // Use your application's logger
238
- app.use(
239
- hppx({
240
- logger: (msg) => {
241
- if (typeof msg === "string") {
242
- myLogger.warn(msg); // Pollution warnings
243
- } else {
244
- myLogger.error(msg); // Errors
245
- }
246
- },
247
- }),
248
- );
249
-
250
- // Disable automatic pollution logging
251
- app.use(hppx({ logPollution: false }));
252
- ```
253
-
254
- ### Multi-Middleware Stacking
255
-
256
- hppx supports incremental whitelisting across multiple middleware instances. Each subsequent middleware applies its own whitelist to the already-collected polluted data:
257
-
258
- ```typescript
259
- // Global middleware — whitelist "a"
260
- app.use(hppx({ whitelist: ["a"] }));
261
-
262
- // Route-level middleware — additionally whitelist "b" and "c"
263
- const router = express.Router();
264
- router.use(hppx({ whitelist: ["b", "c"] }));
265
-
266
- // On this route, "a", "b", and "c" are all allowed as arrays
267
- router.get("/data", (req, res) => {
268
- res.json({ query: req.query });
269
- });
270
-
271
- app.use("/api", router);
272
- ```
273
-
274
- ### Pollution Detection Callback
275
-
276
- ```typescript
277
- app.use(
278
- hppx({
279
- onPollutionDetected: (req, info) => {
280
- // Called once per polluted source (query, body, params)
281
- securityLogger.warn("HPP detected", {
282
- source: info.source,
283
- pollutedKeys: info.pollutedKeys,
284
- });
285
- },
286
- }),
287
- );
288
- ```
289
-
290
- ---
291
-
292
- ## Security
293
-
294
- ### What hppx Protects Against
295
-
296
- | Threat | Protection |
297
- | ------------------------ | ---------------------------------------------------------------------------------- |
298
- | **Parameter pollution** | Duplicate parameters are reduced to a single value via the chosen merge strategy |
299
- | **Prototype pollution** | `__proto__`, `constructor`, `prototype` keys are blocked at every processing level |
300
- | **DoS via deep nesting** | `maxDepth` limit throws error on excessive nesting |
301
- | **DoS via key flooding** | `maxKeys` limit throws error when key count is exceeded |
302
- | **DoS via large arrays** | `maxArrayLength` truncates arrays before processing |
303
- | **DoS via long keys** | `maxKeyLength` silently drops excessively long keys |
304
- | **Null-byte injection** | Keys containing `\u0000` are silently dropped |
305
- | **Malformed keys** | Keys consisting only of dots/brackets (e.g., `"..."`, `"[["`) are dropped |
306
-
307
- ### Production Configuration
308
-
309
- ```typescript
310
- app.use(
311
- hppx({
312
- maxDepth: 10,
313
- maxKeys: 1000,
314
- maxArrayLength: 100,
315
- maxKeyLength: 100,
316
- strict: true,
317
- onPollutionDetected: (req, info) => {
318
- securityLogger.warn("HPP detected", {
319
- ip: req.ip,
320
- path: req.path,
321
- source: info.source,
322
- pollutedKeys: info.pollutedKeys,
323
- });
324
- },
325
- }),
326
- );
327
- ```
328
-
329
- ### What hppx Does NOT Protect Against
330
-
331
- hppx is not a complete security solution. You still need:
332
-
333
- - **SQL injection protection** — use parameterized queries
334
- - **XSS protection** — sanitize output, use CSP headers
335
- - **CSRF protection**: use CSRF tokens
336
- - **Authentication/Authorization** — validate user permissions
337
- - **Rate limiting** — prevent brute-force attacks
338
- - **Input validation**: use schema validation libraries (Joi, Yup, Zod) alongside hppx
339
-
340
- ---
341
-
342
- ## License
343
-
344
- MIT License - see [LICENSE](LICENSE) file for details.
345
-
346
- ## Links
347
-
348
- - [NPM Package](https://www.npmjs.com/package/hppx)
349
- - [GitHub Repository](https://github.com/Hiprax/hppx)
350
- - [Issue Tracker](https://github.com/Hiprax/hppx/issues)
351
- - [Changelog](CHANGELOG.md)
1
+ # hppx
2
+
3
+ **Superior HTTP Parameter Pollution protection middleware** for Node.js/Express, written in TypeScript. It sanitizes `req.query`, `req.body`, and `req.params`, blocks prototype-pollution keys, supports nested whitelists, multiple merge strategies, and plays nicely with stacked middlewares.
4
+
5
+ [![npm version](https://img.shields.io/npm/v/hppx)](https://www.npmjs.com/package/hppx)
6
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
7
+ [![CI](https://github.com/Hiprax/hppx/actions/workflows/ci.yml/badge.svg)](https://github.com/Hiprax/hppx/actions/workflows/ci.yml)
8
+ [![CodeQL](https://github.com/Hiprax/hppx/actions/workflows/codeql.yml/badge.svg)](https://github.com/Hiprax/hppx/actions/workflows/codeql.yml)
9
+ [![codecov](https://codecov.io/gh/Hiprax/hppx/branch/main/graph/badge.svg)](https://codecov.io/gh/Hiprax/hppx)
10
+ [![Node.js](https://img.shields.io/badge/Node.js-%E2%89%A518-green.svg)](https://nodejs.org/)
11
+ [![Zero Dependencies](https://img.shields.io/badge/Dependencies-0-brightgreen.svg)](#)
12
+
13
+ ---
14
+
15
+ ## Features
16
+
17
+ - **Zero runtime dependencies** — minimal attack surface and bundle size
18
+ - **Multiple merge strategies**: `keepFirst`, `keepLast` (default), `combine`
19
+ - **Enhanced security:**
20
+ - Blocks dangerous keys: `__proto__`, `prototype`, `constructor`
21
+ - Prevents null-byte injection in keys
22
+ - Rejects malformed keys (dot/bracket-only patterns)
23
+ - Validates key lengths to prevent DoS attacks
24
+ - Limits array sizes to prevent memory exhaustion
25
+ - **Flexible whitelisting** — nested whitelist with dot-notation and leaf matching
26
+ - **Pollution tracking** — records polluted parameters on the request (`queryPolluted`, `bodyPolluted`, `paramsPolluted`)
27
+ - **Multi-middleware support** — works with multiple middlewares on different routes (whitelists applied incrementally)
28
+ - **DoS protection** — `maxDepth`, `maxKeys`, `maxArrayLength`, `maxKeyLength`
29
+ - **Performance optimized** — path caching and Set-based lookups for fast whitelist checks
30
+ - **Fully typed API** — TypeScript-first with comprehensive type definitions for both ESM and CommonJS
31
+
32
+ ---
33
+
34
+ ## Installation
35
+
36
+ ```bash
37
+ npm install hppx
38
+ ```
39
+
40
+ ---
41
+
42
+ ## Quick Start
43
+
44
+ ### ESM (ES Modules)
45
+
46
+ ```typescript
47
+ import express from "express";
48
+ import hppx from "hppx";
49
+
50
+ const app = express();
51
+ app.use(express.urlencoded({ extended: true }));
52
+ app.use(express.json());
53
+
54
+ app.use(
55
+ hppx({
56
+ whitelist: ["tags", "user.roles", "ids"],
57
+ mergeStrategy: "keepLast",
58
+ sources: ["query", "body"],
59
+ }),
60
+ );
61
+
62
+ app.get("/search", (req, res) => {
63
+ res.json({
64
+ query: req.query,
65
+ queryPolluted: req.queryPolluted ?? {},
66
+ body: req.body ?? {},
67
+ bodyPolluted: req.bodyPolluted ?? {},
68
+ params: req.params,
69
+ paramsPolluted: req.paramsPolluted ?? {},
70
+ });
71
+ });
72
+ ```
73
+
74
+ ### CommonJS
75
+
76
+ ```javascript
77
+ const express = require("express");
78
+ const hppx = require("hppx");
79
+
80
+ const app = express();
81
+ app.use(express.urlencoded({ extended: true }));
82
+ app.use(express.json());
83
+
84
+ app.use(
85
+ hppx({
86
+ whitelist: ["tags", "user.roles", "ids"],
87
+ mergeStrategy: "keepLast",
88
+ sources: ["query", "body"],
89
+ }),
90
+ );
91
+
92
+ app.get("/search", (req, res) => {
93
+ res.json({
94
+ query: req.query,
95
+ queryPolluted: req.queryPolluted ?? {},
96
+ body: req.body ?? {},
97
+ bodyPolluted: req.bodyPolluted ?? {},
98
+ params: req.params,
99
+ paramsPolluted: req.paramsPolluted ?? {},
100
+ });
101
+ });
102
+ ```
103
+
104
+ ### Polluted Parameter Tree
105
+
106
+ For each enabled source, hppx attaches a parallel `*Polluted` object to the
107
+ request that records the original (pre-reduction) array values for any keys
108
+ that were detected as polluted:
109
+
110
+ | Source | Cleaned data on `req` | Polluted tree on `req` |
111
+ | -------- | --------------------- | ---------------------- |
112
+ | `query` | `req.query` | `req.queryPolluted` |
113
+ | `body` | `req.body` | `req.bodyPolluted` |
114
+ | `params` | `req.params` | `req.paramsPolluted` |
115
+
116
+ These properties are typed via a TypeScript module augmentation included in
117
+ the published types — no extra import is needed.
118
+
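As a rough illustration of the cleaned/polluted split described above, the sketch below reduces duplicate values in a flat object and records the original arrays separately. `splitPolluted`, `FlatParams`, and the return shape are hypothetical names for this example and are not part of the hppx API. The sketch assumes that any non-empty array counts as polluted and always applies `keepLast`; the real middleware also handles nesting, whitelists, and limits.

```typescript
// Hypothetical helper, not part of hppx: flat objects only, keepLast only.
type FlatParams = Record<string, unknown>;

function splitPolluted(input: FlatParams): { cleaned: FlatParams; polluted: FlatParams } {
  const cleaned: FlatParams = {};
  const polluted: FlatParams = {};
  for (const key of Object.keys(input)) {
    const value = input[key];
    if (Array.isArray(value) && value.length > 0) {
      polluted[key] = value; // original, pre-reduction values
      cleaned[key] = value[value.length - 1]; // keepLast
    } else {
      cleaned[key] = value; // single values pass through untouched
    }
  }
  return { cleaned, polluted };
}

// Shape a typical query parser produces for ?x=1&x=2&y=3
const splitResult = splitPolluted({ x: ["1", "2"], y: "3" });
// splitResult.cleaned  -> { x: "2", y: "3" }
// splitResult.polluted -> { x: ["1", "2"] }
```

In the middleware, the two halves correspond to `req.query` and `req.queryPolluted` (and likewise for `body` and `params`).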
119
+ ---
120
+
121
+ ## API
122
+
123
+ ### Default Export: `hppx(options?: HppxOptions)`
124
+
125
+ Creates an Express-compatible middleware. Applies sanitization to each selected source and exposes `*Polluted` objects (`queryPolluted`, `bodyPolluted`, `paramsPolluted`) on the request.
126
+
127
+ > **Note:** Invalid options throw a `TypeError` at middleware creation time, not at request time. This ensures misconfiguration is caught early.
128
+
129
+ #### Options
130
+
131
+ **Whitelist & Strategy:**
132
+
133
+ | Option | Type | Default | Description |
134
+ | --------------- | ---------------------------------------- | ------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
135
+ | `whitelist` | `string[] \| string` | `[]` | Keys allowed to remain as arrays. Supports dot-notation (`"user.tags"`) and leaf matching (`"tags"` matches any path ending in `tags`). |
136
+ | `mergeStrategy` | `'keepFirst' \| 'keepLast' \| 'combine'` | `'keepLast'` | How to reduce duplicate/array parameters when not whitelisted. `keepFirst` takes the first value, `keepLast` takes the last, `combine` flattens all values into a single array. |
137
+
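The three `mergeStrategy` values in the table can be pictured with a minimal sketch. `reduceDuplicates` and `flattenDeep` are hypothetical helpers written for this illustration, not hppx exports, and the deep flattening used for `combine` is an assumption about what "flattens all values into a single array" means.

```typescript
// Hypothetical sketch of the three strategies; not the hppx source.
type Strategy = "keepFirst" | "keepLast" | "combine";

// Recursively flatten nested arrays into one array.
function flattenDeep(values: unknown[]): unknown[] {
  const out: unknown[] = [];
  for (const v of values) {
    if (Array.isArray(v)) {
      out.push(...flattenDeep(v));
    } else {
      out.push(v);
    }
  }
  return out;
}

// Reduce the values collected for one duplicated parameter.
function reduceDuplicates(values: unknown[], strategy: Strategy): unknown {
  switch (strategy) {
    case "keepFirst":
      return values[0];
    case "keepLast":
      return values[values.length - 1];
    case "combine":
      return flattenDeep(values);
  }
}

const duplicated = ["a", "b", "c"];
const first = reduceDuplicates(duplicated, "keepFirst"); // "a"
const last = reduceDuplicates(duplicated, "keepLast"); // "c"
const combined = reduceDuplicates(["a", ["b", "c"]], "combine"); // ["a", "b", "c"]
```

Whitelisted keys skip this reduction entirely and keep their array value.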
138
+ **Source Selection:**
139
+
140
+ | Option | Type | Default | Description |
141
+ | ---------------------- | -------------------------------------- | ----------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- |
142
+ | `sources` | `Array<'query' \| 'body' \| 'params'>` | `['query', 'body', 'params']` | Which request parts to sanitize. |
143
+ | `checkBodyContentType` | `'urlencoded' \| 'any' \| 'none'` | `'urlencoded'` | When to process `req.body`. `urlencoded` only processes URL-encoded bodies, `any` processes all content types, `none` skips body processing entirely. |
144
+ | `excludePaths` | `string[]` | `[]` | Paths to exclude from sanitization. Supports `*` wildcard suffix (e.g., `"/assets*"`). |
145
+
146
+ **Security Limits (DoS Protection):**
147
+
148
+ | Option | Type | Default | Range | Description |
149
+ | ---------------- | -------- | ------- | -------- | ------------------------------------------------------------------------------------- |
150
+ | `maxDepth` | `number` | `20` | 1 - 100 | Maximum object nesting depth. Exceeding this throws an error passed to `next()`. |
151
+ | `maxKeys` | `number` | `5000` | >= 1 | Maximum number of keys to process. Exceeding this throws an error passed to `next()`. |
152
+ | `maxArrayLength` | `number` | `1000` | >= 1 | Maximum array length. Arrays are truncated before processing. |
153
+ | `maxKeyLength` | `number` | `200` | 1 - 1000 | Maximum key string length. Longer keys are silently dropped. |
154
+
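To make the four limits concrete, here is a small sketch of the semantics the table describes: two limits throw, one truncates, one drops. `applyLimits` is a hypothetical function, not an hppx export, and the way it counts depth (each object or array is one level, starting at 0 for the root) is an assumption of this example.

```typescript
// Hypothetical sketch of the limit semantics in the table; not the hppx source.
interface Limits {
  maxDepth: number;
  maxKeys: number;
  maxArrayLength: number;
  maxKeyLength: number;
}

// Defaults as listed in the table above.
const DEFAULT_LIMITS: Limits = { maxDepth: 20, maxKeys: 5000, maxArrayLength: 1000, maxKeyLength: 200 };

function applyLimits(input: unknown, limits: Limits = DEFAULT_LIMITS): unknown {
  let keyCount = 0;

  const walk = (node: unknown, depth: number): unknown => {
    if (node === null || typeof node !== "object") return node; // primitives pass through
    if (depth > limits.maxDepth) throw new Error("maxDepth exceeded");

    if (Array.isArray(node)) {
      // Truncate first, then process what remains.
      return node.slice(0, limits.maxArrayLength).map((item) => walk(item, depth + 1));
    }

    const out: Record<string, unknown> = {};
    for (const key of Object.keys(node)) {
      if (key.length > limits.maxKeyLength) continue; // silently dropped
      keyCount += 1;
      if (keyCount > limits.maxKeys) throw new Error("maxKeys exceeded");
      out[key] = walk((node as Record<string, unknown>)[key], depth + 1);
    }
    return out;
  };

  return walk(input, 0);
}

const tight: Limits = { ...DEFAULT_LIMITS, maxArrayLength: 2, maxKeyLength: 3 };
const limited = applyLimits({ ok: ["a", "b", "c"], toolong: 1 }, tight);
// limited -> { ok: ["a", "b"] }
```

In the middleware, the thrown errors are passed to `next()` rather than surfacing directly, as the table notes.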
155
+ **Behavior & Callbacks:**
156
+
157
+ | Option | Type | Default | Description |
158
+ | --------------------- | --------------------------------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
159
+ | `trimValues` | `boolean` | `false` | Trim whitespace from string values. |
160
+ | `preserveNull` | `boolean` | `true` | Preserve `null` values in the output. |
161
+ | `strict` | `boolean` | `false` | Immediately respond with HTTP 400 when pollution is detected. Response includes `error`, `message`, `pollutedParameters`, and `code` (`"HPP_DETECTED"`) fields. |
162
+ | `onPollutionDetected` | `(req, info) => void` | — | Callback fired on pollution detection. Called **once per polluted source** (e.g., fires twice if both query and body are polluted). `info` contains `{ source: RequestSource, pollutedKeys: string[] }`. |
163
+ | `logger` | `(err: Error \| unknown) => void` | — | Custom logger for errors and pollution warnings. Receives `string` for pollution warnings and `Error` for caught errors. Falls back to `console.warn`/`console.error` if the logger throws. |
164
+ | `logPollution` | `boolean` | `true` | Enable automatic logging when pollution is detected. |
165
+
166
+ ---
167
+
168
+ ### Named Export: `sanitize(input, options?)`
169
+
170
+ ```typescript
171
+ function sanitize<T extends Record<string, unknown>>(input: T, options?: SanitizeOptions): T;
172
+ ```
173
+
174
+ Sanitize a plain object using the same rules as the middleware.
175
+
176
+ **Return shape:** `sanitize()` returns **only** the cleaned object — the
177
+ same shape as `input`, with arrays reduced according to the chosen merge
178
+ strategy. It does **not** return the internal `SanitizedResult`
179
+ (`{ cleaned, pollutedTree, pollutedKeys }`); polluted-tree data is
180
+ deliberately discarded. If you need access to the polluted tree or polluted
181
+ keys, use the middleware factory `hppx()` and read `req.queryPolluted` /
182
+ `req.bodyPolluted` / `req.paramsPolluted` instead.
183
+
184
+ **Options:** `sanitize()` accepts only `SanitizeOptions` —
185
+ `whitelist`, `mergeStrategy`, `maxDepth`, `maxKeys`, `maxArrayLength`,
186
+ `maxKeyLength`, `trimValues`, and `preserveNull`. Middleware-only options
187
+ (`sources`, `excludePaths`, `strict`, `onPollutionDetected`, `logger`,
188
+ `logPollution`, `checkBodyContentType`) are silently ignored when passed
189
+ to `sanitize()`; use `hppx()` if you need any of those features.
190
+
191
+ **ESM:**
192
+
193
+ ```typescript
194
+ import { sanitize } from "hppx";
195
+
196
+ const clean = sanitize(payload, {
197
+ whitelist: ["user.tags"],
198
+ mergeStrategy: "keepFirst",
199
+ });
200
+ ```
201
+
202
+ **CommonJS:**
203
+
204
+ ```javascript
205
+ const { sanitize } = require("hppx");
206
+
207
+ const clean = sanitize(payload, {
208
+ whitelist: ["user.tags"],
209
+ mergeStrategy: "keepFirst",
210
+ });
211
+ ```
212
+
213
+ ---
214
+
215
+ ### Exported Types
216
+
217
+ All types are available for both ESM and CommonJS consumers:
218
+
219
+ ```typescript
220
+ import type {
221
+ RequestSource, // "query" | "body" | "params"
222
+ MergeStrategy, // "keepFirst" | "keepLast" | "combine"
223
+ SanitizeOptions, // Options for sanitize()
224
+ HppxOptions, // Full middleware options (extends SanitizeOptions)
225
+ SanitizedResult, // { cleaned, pollutedTree, pollutedKeys }
226
+ } from "hppx";
227
+ ```
228
+
229
+ ### Exported Constants
230
+
231
+ ```typescript
232
+ import { DANGEROUS_KEYS, DEFAULT_SOURCES, DEFAULT_STRATEGY } from "hppx";
233
+
234
+ DANGEROUS_KEYS; // Set<string> — {"__proto__", "prototype", "constructor"}
235
+ DEFAULT_SOURCES; // ["query", "body", "params"]
236
+ DEFAULT_STRATEGY; // "keepLast"
237
+ ```
238
+
239
+ ---
240
+
241
+ ## Advanced Usage
242
+
243
+ ### Strict Mode (Respond 400 on Pollution)
244
+
245
+ ```typescript
246
+ app.use(hppx({ strict: true }));
247
+
248
+ // Polluted requests receive:
249
+ // {
250
+ // "error": "Bad Request",
251
+ // "message": "HTTP Parameter Pollution detected",
252
+ // "pollutedParameters": ["query.x"],
253
+ // "code": "HPP_DETECTED"
254
+ // }
255
+ ```
256
+
257
+ ### Process JSON Bodies Too
258
+
259
+ ```typescript
260
+ app.use(express.json());
261
+ app.use(hppx({ checkBodyContentType: "any" }));
262
+ ```
263
+
264
+ ### Exclude Specific Paths
265
+
266
+ ```typescript
267
+ app.use(hppx({ excludePaths: ["/public", "/assets*"] }));
268
+ ```
269
+
270
+ ### Custom Logging
271
+
272
+ ```typescript
273
+ // Use your application's logger
274
+ app.use(
275
+ hppx({
276
+ logger: (msg) => {
277
+ if (typeof msg === "string") {
278
+ myLogger.warn(msg); // Pollution warnings
279
+ } else {
280
+ myLogger.error(msg); // Errors
281
+ }
282
+ },
283
+ }),
284
+ );
285
+
286
+ // Disable automatic pollution logging
287
+ app.use(hppx({ logPollution: false }));
288
+ ```
289
+
290
+ ### Multi-Middleware Stacking
291
+
292
+ hppx supports incremental whitelisting across multiple middleware instances. Each subsequent middleware applies its own whitelist to the already-collected polluted data:
293
+
294
+ ```typescript
295
+ // Global middleware — whitelist "a"
296
+ app.use(hppx({ whitelist: ["a"] }));
297
+
298
+ // Route-level middleware: additionally whitelist "b" and "c"
299
+ const router = express.Router();
300
+ router.use(hppx({ whitelist: ["b", "c"] }));
301
+
302
+ // On this route, "a", "b", and "c" are all allowed as arrays
303
+ router.get("/data", (req, res) => {
304
+ res.json({ query: req.query });
305
+ });
306
+
307
+ app.use("/api", router);
308
+ ```
309
+
310
+ #### Option Precedence Across Stacked Middleware
311
+
312
+ When the same source (`query` / `body` / `params`) has already been processed by an
313
+ earlier `hppx()` instance on the same request, a subsequent `hppx()` only applies its
314
+ own **`whitelist`** — used to restore additional whitelisted entries from the
315
+ polluted tree the first middleware already collected. Every other option on the
316
+ later instance is **silently ignored for that source** because the source is no
317
+ longer available in its original (un-reduced) form.
318
+
319
+ The options ignored on subsequent middleware (per-source) are:
320
+
321
+ - `mergeStrategy`
322
+ - `maxDepth`, `maxKeys`, `maxArrayLength`, `maxKeyLength`
323
+ - `trimValues`, `preserveNull`
324
+ - `strict` (will **not** trigger HTTP 400 if the earlier middleware already cleaned the source)
325
+ - `onPollutionDetected`, `logger`, `logPollution`
326
+ - `excludePaths` is the one exception: it is still checked per instance (independent of the processed flag), but
327
+ if the source was already processed, only whitelist restoration runs.
328
+
329
+ **Footgun example:**
330
+
331
+ ```typescript
332
+ // Global middleware — keepLast strategy, no strict mode
333
+ app.use(hppx({ mergeStrategy: "keepLast" }));
334
+
335
+ // Route-level middleware: strict mode, but it's TOO LATE
336
+ app.use(
337
+ "/api/admin",
338
+ hppx({ strict: true }), // SILENTLY IGNORED: the source was already
339
+ // cleaned by the global middleware, so strict
340
+ // mode here will NOT cause a 400 response.
341
+ );
342
+ ```
343
+
344
+ If you need strict mode on a specific route, configure `strict: true` on the
345
+ **first** `hppx()` instance that processes the relevant source — typically the
346
+ global middleware. Equivalent options (`maxDepth`, `mergeStrategy`, callbacks,
347
+ loggers) must likewise be set on the first instance. Subsequent instances are
348
+ useful only for **expanding** the whitelist to recover additional fields from
349
+ `req.queryPolluted`/`req.bodyPolluted`/`req.paramsPolluted` on a per-route basis.
350
+
351
+ A subsequent middleware can only ever _expand_ the whitelist (by restoring more
352
+ fields back from the polluted tree). It cannot restrict an already-whitelisted
353
+ field, because the previous middleware has already moved that field back into
354
+ the source.
355
+
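The "expand only" behavior can be pictured with a small model of what a later pass does. `isWhitelisted` and `restoreWhitelisted` are hypothetical helpers for this illustration, not hppx exports; the sketch works on flat objects and assumes that a restored key is removed from the polluted tree.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the hppx source.
type ParamBag = Record<string, unknown>;

// Exact dot-path match, or leaf match when the whitelist entry has no dots.
function isWhitelisted(path: string, whitelist: string[]): boolean {
  const leaf = path.split(".").pop() ?? path;
  return whitelist.some((entry) => entry === path || (!entry.includes(".") && entry === leaf));
}

// What a later pass does: restore newly whitelisted keys from the polluted tree.
function restoreWhitelisted(cleaned: ParamBag, polluted: ParamBag, whitelist: string[]): void {
  for (const key of Object.keys(polluted)) {
    if (isWhitelisted(key, whitelist)) {
      cleaned[key] = polluted[key]; // the original array goes back into the source
      delete polluted[key];
    }
  }
}

// State after a first pass with whitelist ["a"] and keepLast
// on ?a=1&a=2&b=3&b=4&d=5&d=6
const stackedQuery: ParamBag = { a: ["1", "2"], b: "4", d: "6" };
const stackedPolluted: ParamBag = { b: ["3", "4"], d: ["5", "6"] };

// Second pass with whitelist ["b", "c"]: "b" is restored, "d" stays reduced.
restoreWhitelisted(stackedQuery, stackedPolluted, ["b", "c"]);
// stackedQuery    -> { a: ["1", "2"], b: ["3", "4"], d: "6" }
// stackedPolluted -> { d: ["5", "6"] }
```

Nothing in the second pass looks at `a`: it is no longer in the polluted tree, which is why a later instance has no way to restrict it.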
356
+ ### Pollution Detection Callback
357
+
358
+ ```typescript
359
+ app.use(
360
+ hppx({
361
+ onPollutionDetected: (req, info) => {
362
+ // Called once per polluted source (query, body, params)
363
+ securityLogger.warn("HPP detected", {
364
+ source: info.source,
365
+ pollutedKeys: info.pollutedKeys,
366
+ });
367
+ },
368
+ }),
369
+ );
370
+ ```
371
+
372
+ ---
373
+
374
+ ## Security
375
+
376
+ ### What hppx Protects Against
377
+
378
+ | Threat | Protection |
379
+ | ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
380
+ | **Parameter pollution** | Duplicate parameters are reduced to a single value via the chosen merge strategy |
381
+ | **Prototype pollution** | `__proto__`, `constructor`, `prototype` keys are blocked at every processing level |
382
+ | **DoS via deep nesting** | `maxDepth` limit throws error on excessive nesting |
383
+ | **DoS via key flooding** | `maxKeys` limit throws error when key count is exceeded |
384
+ | **DoS via large arrays** | `maxArrayLength` truncates arrays before processing |
385
+ | **DoS via long keys** | `maxKeyLength` silently drops excessively long keys |
386
+ | **Null-byte injection** | Keys containing `\u0000` are silently dropped |
387
+ | **Control / bidi key chars** | Keys containing ASCII / C1 control characters (`\x00`-`\x1F`, `\x7F`-`\x9F`) or Unicode bidirectional override characters (LRM/RLM, LRE/RLE/PDF/LRO/RLO, LRI/RLI/FSI/PDI, BOM) are dropped |
388
+ | **Malformed keys** | Keys consisting only of dots/brackets (e.g., `"..."`, `"[["`) are dropped |
389
+
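The key-level rows of the table can be sketched as a single predicate. `isSafeKey` and the constant names are hypothetical and written for this illustration; the character classes follow the ranges listed in the table, but the real implementation may differ in detail.

```typescript
// Hypothetical sketch of the key checks listed in the table; not the hppx source.
const BLOCKED_KEYS = new Set(["__proto__", "prototype", "constructor"]);

// ASCII and C1 control characters (this range includes the null byte).
const CONTROL_CHARS = /[\x00-\x1F\x7F-\x9F]/;
// LRM/RLM, LRE/RLE/PDF/LRO/RLO, LRI/RLI/FSI/PDI, BOM.
const BIDI_CHARS = /[\u200E\u200F\u202A-\u202E\u2066-\u2069\uFEFF]/;
// Keys made up only of dots and brackets, such as "..." or "[[".
const ONLY_DOTS_OR_BRACKETS = /^[.\[\]]+$/;

function isSafeKey(key: string, maxKeyLength = 200): boolean {
  if (key.length > maxKeyLength) return false;
  if (BLOCKED_KEYS.has(key)) return false;
  if (CONTROL_CHARS.test(key) || BIDI_CHARS.test(key)) return false;
  if (ONLY_DOTS_OR_BRACKETS.test(key)) return false;
  return true;
}

const candidateKeys = ["name", "__proto__", "a\u0000b", "...", "\u202Eevil", "user[roles]"];
// Wrap in an arrow function so filter's index argument is not taken as maxKeyLength.
const acceptedKeys = candidateKeys.filter((k) => isSafeKey(k));
// acceptedKeys -> ["name", "user[roles]"]
```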
390
+ ### Production Configuration
391
+
392
+ ```typescript
393
+ app.use(
394
+ hppx({
395
+ maxDepth: 10,
396
+ maxKeys: 1000,
397
+ maxArrayLength: 100,
398
+ maxKeyLength: 100,
399
+ strict: true,
400
+ onPollutionDetected: (req, info) => {
401
+ securityLogger.warn("HPP detected", {
402
+ ip: req.ip,
403
+ path: req.path,
404
+ source: info.source,
405
+ pollutedKeys: info.pollutedKeys,
406
+ });
407
+ },
408
+ }),
409
+ );
410
+ ```
411
+
412
+ ### Express 5 Note
413
+
414
+ In Express 5, `req.query` is exposed as a lazy getter on the prototype chain rather than
415
+ an own property. hppx handles this transparently: it uses `Object.defineProperty` to
416
+ install the sanitized value as a writable own property that shadows the proto-level
417
+ getter. After the middleware runs, `req.query` reflects the cleaned value (e.g.
418
+ `req.query.x === "2"` after `?x=1&x=2` with `mergeStrategy: "keepLast"`).
419
+
420
+ If an upstream layer makes `req.query` non-configurable AND non-writable before hppx
421
+ runs (uncommon), hppx will not silently leave the polluted value in place — it will
422
+ emit a warning via the configured `logger` (or `console.warn` if none is provided) so
423
+ the misconfiguration is visible. The warning is de-duplicated per request and per
424
+ source.
425
+
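The shadowing mechanism described in this note can be reproduced without Express. The sketch below assumes a stand-in class, `FakeRequest`, whose `query` is a prototype-level getter, and a hypothetical helper, `installSanitizedQuery`; neither is part of hppx or Express, and the real middleware may differ in detail.

```typescript
// Stand-in for an Express 5 request: `query` is a getter on the prototype,
// not an own property of the instance. (Hypothetical class for illustration.)
class FakeRequest {
  get query(): Record<string, unknown> {
    return { x: ["1", "2"] }; // polluted value
  }
}

// Hypothetical helper: installs the cleaned value as a writable own property
// that shadows the prototype getter. Returns false if the property is locked,
// in which case the caller would emit a warning instead.
function installSanitizedQuery(
  req: object,
  cleaned: Record<string, unknown>,
): boolean {
  try {
    Object.defineProperty(req, "query", {
      value: cleaned,
      writable: true,
      configurable: true,
      enumerable: true,
    });
    return true;
  } catch {
    // Object.defineProperty throws a TypeError when an existing own property
    // is non-configurable and non-writable.
    return false;
  }
}

const fakeReq = new FakeRequest();
const installed = installSanitizedQuery(fakeReq, { x: "2" });
```

Plain assignment would not work here, because the prototype defines a getter without a setter; `Object.defineProperty` creates the own property directly on the instance.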
426
+ ### What hppx Does NOT Protect Against
427
+
428
+ hppx is not a complete security solution. You still need:
429
+
430
+ - **SQL injection protection** — use parameterized queries
431
+ - **XSS protection** — sanitize output, use CSP headers
432
+ - **CSRF protection** — use CSRF tokens
433
+ - **Authentication/Authorization** — validate user permissions
434
+ - **Rate limiting** — prevent brute-force attacks
435
+ - **Input validation** — use schema validation libraries (Joi, Yup, Zod) alongside hppx
436
+
437
+ ---
438
+
439
+ ## FAQ / Known Behaviors
440
+
441
+ A short reference for behaviors that surprise people most often.
442
+
443
+ **1. The `combine` strategy still records pollution.**
444
+ Earlier versions silently dropped pollution events when `mergeStrategy: "combine"`
445
+ was in effect, because the array was preserved as-is. This was a footgun for
446
+ security logging. Today, `combine` records polluted keys into `req.queryPolluted`
447
+ (etc.) and fires `onPollutionDetected` and `logPollution` exactly like the other
448
+ strategies — the cleaned data simply contains the flattened array rather than a
449
+ reduced single value.
450
+
451
+ **2. Multi-middleware: subsequent passes only honor `whitelist`.**
452
+ When two `hppx()` instances run on the same request (e.g. global + router), the
453
+ second one only applies its own `whitelist` — to restore additional fields out
454
+ of the polluted tree the first instance already collected. All other options
455
+ (`mergeStrategy`, `strict`, `onPollutionDetected`, limits, etc.) on the second
456
+ instance are silently ignored for any source the first instance already
457
+ processed. See **Multi-Middleware Stacking → Option Precedence** above for the
458
+ full list. Configure `strict: true`, callbacks, and limits on the **first**
459
+ `hppx()` that processes the source — typically the global middleware.
460
+
461
+ **3. Express 5 frozen `req.query` fallback.**
462
+ Express 5 exposes `req.query` as a lazy getter on the prototype chain. hppx
463
+ shadows it with a writable own property carrying the cleaned value. If an
464
+ unusual upstream layer makes `req.query` non-configurable AND non-writable
465
+ before hppx runs, hppx will emit a warning via the configured `logger` (or
466
+ `console.warn`) instead of silently leaving the polluted array in place.
467
+
468
+ **4. Control / bidirectional override characters in keys are rejected.**
469
+ Keys containing ASCII / C1 control characters (`\x00`-`\x1F`, `\x7F`-`\x9F`),
470
+ Unicode bidirectional override characters (LRM/RLM, LRE/RLE/PDF/LRO/RLO,
471
+ LRI/RLI/FSI/PDI), or BOM (`\uFEFF`) are silently dropped. This prevents
472
+ log-injection / DB-corruption tricks that use invisible control characters
473
+ to disguise key names.
474
+
475
+ **5. `sanitize()` returns only the cleaned object.**
476
+ The standalone `sanitize()` function returns the same shape as its input, with
477
+ arrays reduced. It does **not** return `{cleaned, pollutedTree, pollutedKeys}`
478
+ — if you need the polluted tree, use the middleware factory `hppx()` and read
479
+ `req.queryPolluted` / `req.bodyPolluted` / `req.paramsPolluted`. Middleware-only
480
+ options (`sources`, `excludePaths`, `strict`, callbacks, etc.) are silently
481
+ ignored when passed to `sanitize()`.
482
+
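As a rough illustration of item 1 above, the following sketch shows how all three merge strategies can record a polluted key while differing only in the cleaned value they keep. `reduceDuplicates`, `flattenValues`, and `SketchResult` are hypothetical names, the input is a flat object, and treating every array as pollution is a simplification; this is not how hppx is implemented.

```typescript
type MergeStrategy = "keepFirst" | "keepLast" | "combine";

interface SketchResult {
  cleaned: Record<string, unknown>;
  pollutedKeys: string[];
}

// Recursively flattens nested arrays into a single-level array.
function flattenValues(values: unknown[]): unknown[] {
  const out: unknown[] = [];
  for (const v of values) {
    if (Array.isArray(v)) out.push(...flattenValues(v));
    else out.push(v);
  }
  return out;
}

// Hypothetical sketch: reduces array values per strategy and records the key
// as polluted for every strategy, including "combine".
function reduceDuplicates(
  input: Record<string, unknown>,
  strategy: MergeStrategy,
): SketchResult {
  const cleaned: Record<string, unknown> = {};
  const pollutedKeys: string[] = [];
  for (const [key, value] of Object.entries(input)) {
    if (!Array.isArray(value)) {
      cleaned[key] = value; // single value, nothing to reduce
      continue;
    }
    pollutedKeys.push(key);
    const flat = flattenValues(value);
    if (strategy === "combine") cleaned[key] = flat;
    else if (strategy === "keepFirst") cleaned[key] = flat[0];
    else cleaned[key] = flat[flat.length - 1];
  }
  return { cleaned, pollutedKeys };
}
```

Calling `reduceDuplicates({ x: ["1", "2"] }, "combine")` in this sketch keeps both values in `cleaned.x` and still lists `x` in `pollutedKeys`, which mirrors the logging behavior described in item 1.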
483
+ ---
484
+
485
+ ## License
486
+
487
+ MIT License - see [LICENSE](LICENSE) file for details.
488
+
489
+ ## Links
490
+
491
+ - [NPM Package](https://www.npmjs.com/package/hppx)
492
+ - [GitHub Repository](https://github.com/Hiprax/hppx)
493
+ - [Issue Tracker](https://github.com/Hiprax/hppx/issues)
494
+ - [Changelog](CHANGELOG.md)