ata-validator 0.4.4 → 0.4.5
- package/README.md +48 -33
- package/index.js +17 -10
- package/lib/js-compiler.js +9 -10
- package/package.json +1 -1
package/README.md
CHANGED
@@ -10,10 +10,10 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
 
 | Scenario | ata | ajv | |
 |---|---|---|---|
-| **validate(obj)** |
-| **isValidObject(obj)** |
-| **validateJSON(str)** |
-| **isValidJSON(str)** |
+| **validate(obj)** | 15M ops/sec | 8.5M ops/sec | **ata 1.8x faster** |
+| **isValidObject(obj)** | 17.4M ops/sec | 9.4M ops/sec | **ata 1.8x faster** |
+| **validateJSON(str)** | 2.1M ops/sec | 1.9M ops/sec | **ata 1.1x faster** |
+| **isValidJSON(str)** | 2.0M ops/sec | 1.9M ops/sec | **ata 1.1x faster** |
 | **Schema compilation** | 125,690 ops/sec | 831 ops/sec | **ata 151x faster** |
 
 ### Large Data — JS Object Validation
@@ -24,61 +24,64 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
 | 100 users (20KB) | 658K ops/sec | 243K ops/sec | **ata 2.7x faster** |
 | 1,000 users (205KB) | 64K ops/sec | 23.5K ops/sec | **ata 2.7x faster** |
 
-###
+### Real-World Scenarios
 
-|
+| Scenario | ata | ajv | |
 |---|---|---|---|
-|
-|
-
-
+| **Serverless cold start** (50 schemas) | 7.7ms | 96ms | **ata 12.5x faster** |
+| **ReDoS protection** (`^(a+)+$`) | 0.3ms | 765ms | **ata immune (RE2)** |
+| **Batch NDJSON** (10K items, multi-core) | 13.4M/sec | 5.1M/sec | **ata 2.6x faster** |
+| **Fastify HTTP** (100 users POST) | 24.6K req/sec | 22.6K req/sec | **ata 9% faster** |
 
 ### Where ajv wins
 
 | Scenario | ata | ajv | |
 |---|---|---|---|
-| **validate(obj)** (invalid data
-| **validateJSON(str)** (invalid data) |
+| **validate(obj)** (invalid data) | 6M ops/sec | 7.9M ops/sec | **ajv 1.3x faster** |
+| **validateJSON(str)** (invalid data) | 2.2M ops/sec | 2.3M ops/sec | **ajv 1.1x faster** |
 
-> Invalid-data error
+> Invalid-data error path — ajv is slightly faster. Production traffic is overwhelmingly valid.
 
 ### How it works
 
-**Speculative validation**: For valid data (the common case), ata runs a JS codegen fast path entirely in V8 JIT — no NAPI boundary crossing. Only when validation fails does it fall through to the
+**Speculative validation**: For valid data (the common case), ata runs a JS codegen fast path entirely in V8 JIT — no NAPI boundary crossing. Only when validation fails does it fall through to the JS error-collecting codegen or C++ engine.
 
-**JS codegen**: Schemas are compiled to monolithic JS functions (like ajv). Supported keywords: `type`, `required`, `properties`, `items`, `enum`, `const`, `allOf`, `anyOf`, `oneOf`, `not`, `if/then/else`, `uniqueItems`, `contains`, `prefixItems`, `additionalProperties`, `dependentRequired`, `minimum/maximum`, `minLength/maxLength`, `pattern`, `format`.
+**JS codegen**: Schemas are compiled to monolithic JS functions (like ajv). Supported keywords: `type`, `required`, `properties`, `items`, `enum`, `const`, `allOf`, `anyOf`, `oneOf`, `not`, `if/then/else`, `uniqueItems`, `contains`, `prefixItems`, `additionalProperties`, `dependentRequired`, `$ref` (local), `minimum/maximum`, `minLength/maxLength`, `pattern`, `format`.
 
-**V8 TurboFan optimizations**: Destructuring batch reads, `undefined` checks instead of `in` operator, context-aware type guard elimination, property hoisting to local variables.
+**V8 TurboFan optimizations**: Destructuring batch reads, `undefined` checks instead of `in` operator, context-aware type guard elimination, property hoisting to local variables, tiered uniqueItems (nested loop for small arrays).
 
 **Adaptive simdjson**: For large documents (>8KB) with selective schemas, simdjson On Demand seeks only the needed fields — skipping irrelevant data at GB/s speeds.
 
 ### JSON Schema Test Suite
 
-**98.
+**98.4%** pass rate (937/952) on official [JSON Schema Test Suite](https://github.com/json-schema-org/JSON-Schema-Test-Suite) (Draft 2020-12).
 
 ## When to use ata
 
-- **Any `validate(obj)` workload** — 1.
-- **
-- **
+- **Any `validate(obj)` workload** — 1.8x–2.7x faster than ajv on valid data
+- **Serverless / cold starts** — 12.5x faster schema compilation
+- **Security-sensitive apps** — RE2 regex, immune to ReDoS attacks
+- **Batch/streaming validation** — NDJSON log processing, data pipelines (2.6x faster)
+- **Standard Schema V1** — native support for Fastify v5, tRPC, TanStack
 - **C/C++ embedding** — native library, no JS runtime needed
 
 ## When to use ajv
 
-- **Error-heavy workloads** — where most data is invalid
-- **Schemas with
+- **Error-heavy workloads** — where most data is invalid (ajv 1.3x faster on error path)
+- **Schemas with `patternProperties`, `dependentSchemas`** — these bypass JS codegen
 
 ## Features
 
 - **Speculative validation**: JS codegen fast path — valid data never crosses the NAPI boundary
-- **Multi-core**: Parallel validation across all CPU cores —
+- **Multi-core**: Parallel validation across all CPU cores — 13.4M validations/sec
 - **simdjson**: SIMD-accelerated JSON parsing at GB/s speeds, adaptive On Demand for large docs
-- **RE2 regex**: Linear-time guarantees, immune to ReDoS attacks
+- **RE2 regex**: Linear-time guarantees, immune to ReDoS attacks (2391x faster on pathological input)
 - **V8-optimized codegen**: Destructuring batch reads, type guard elimination, property hoisting
 - **Standard Schema V1**: Compatible with Fastify, tRPC, TanStack, Drizzle
 - **Zero-copy paths**: Buffer and pre-padded input support — no unnecessary copies
+- **Defaults + coercion**: `default` values, `coerceTypes`, `removeAdditional` support
 - **C/C++ library**: Native API for non-Node.js environments
-- **98.
+- **98.4% spec compliant**: Draft 2020-12
 
 ## Installation
 
@@ -98,18 +101,18 @@ const v = new Validator({
 properties: {
 name: { type: 'string', minLength: 1 },
 email: { type: 'string', format: 'email' },
-age: { type: 'integer', minimum: 0 }
+age: { type: 'integer', minimum: 0 },
+role: { type: 'string', default: 'user' }
 },
 required: ['name', 'email']
 });
 
-// Fast boolean check — JS codegen, no NAPI (1.
+// Fast boolean check — JS codegen, no NAPI (1.8x faster than ajv)
 v.isValidObject({ name: 'Mert', email: 'mert@example.com', age: 26 }); // true
 
-// Full validation with error details
-const result = v.validate({ name: 'Mert', email: 'mert@example.com'
-
-console.log(result.errors); // []
+// Full validation with error details + defaults applied
+const result = v.validate({ name: 'Mert', email: 'mert@example.com' });
+// result.valid === true, data.role === 'user' (default applied)
 
 // JSON string validation (simdjson fast path)
 v.validateJSON('{"name": "Mert", "email": "mert@example.com"}');
@@ -118,12 +121,21 @@ v.isValidJSON('{"name": "Mert", "email": "mert@example.com"}'); // true
 // Buffer input (zero-copy, raw NAPI)
 v.isValid(Buffer.from('{"name": "Mert", "email": "mert@example.com"}'));
 
-// Parallel batch — multi-core, NDJSON (
+// Parallel batch — multi-core, NDJSON (2.6x faster than ajv)
 const ndjson = Buffer.from(lines.join('\n'));
 v.isValidParallel(ndjson); // bool[]
 v.countValid(ndjson); // number
 ```
 
+### Options
+
+```javascript
+const v = new Validator(schema, {
+coerceTypes: true, // "42" → 42 for integer fields
+removeAdditional: true, // strip properties not in schema
+});
+```
+
 ### Standard Schema V1
 
 ```javascript
@@ -143,7 +155,10 @@ npm install fastify-ata
 
 ```javascript
 const fastify = require('fastify')();
-fastify.register(require('fastify-ata')
+fastify.register(require('fastify-ata'), {
+coerceTypes: true,
+removeAdditional: true,
+});
 
 // All existing JSON Schema route definitions work as-is
 ```
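The README's "tiered uniqueItems (nested loop for small arrays)" note can be illustrated with a minimal sketch. This is not ata-validator's generated code: the function name `hasUniquePrimitives`, the `SMALL_ARRAY_LIMIT` cutoff of 8, and the use of a `Set` for the larger tier are all assumptions made for illustration. The sketch handles primitive items only, whereas JSON Schema `uniqueItems` also requires deep comparison of objects and arrays.

```javascript
// Hypothetical cutoff; the package's real threshold is not shown in this diff.
const SMALL_ARRAY_LIMIT = 8;

// Returns true when no two primitive items in `arr` are strictly equal.
function hasUniquePrimitives(arr) {
  if (arr.length <= SMALL_ARRAY_LIMIT) {
    // Small tier: a nested loop avoids allocating any helper structure.
    for (let i = 1; i < arr.length; i++) {
      for (let j = 0; j < i; j++) {
        if (arr[i] === arr[j]) return false;
      }
    }
    return true;
  }
  // Large tier: a Set detects duplicates in roughly linear time.
  return new Set(arr).size === arr.length;
}
```

The quadratic loop is only competitive while the array is short, which is why a cutoff is needed at all; where exactly the crossover sits depends on the engine and the data.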
package/index.js
CHANGED
@@ -250,17 +250,24 @@ class Validator {
 const useSimdjsonForLarge = !hasArrayTraversal;
 
 if (jsFn) {
-//
-
-
-
-
-
-
-
-
+// Best path: combined validator (single pass, lazy error array)
+// Valid data: no array allocation, returns VALID_RESULT
+// Invalid data: collects errors in one pass (no double validation)
+// Fallback: jsFn + errFn for schemas combined can't handle
+const errFn = jsErrFn
+? (d) => { try { return jsErrFn(d, true); } catch { return compiled.validate(d); } }
+: (d) => compiled.validate(d);
+this.validate = jsCombinedFn
+? (preprocess
+? (data) => { preprocess(data); try { return jsCombinedFn(data); } catch { return jsFn(data) ? VALID_RESULT : errFn(data); } }
+: (data) => { try { return jsCombinedFn(data); } catch { return jsFn(data) ? VALID_RESULT : errFn(data); } })
+: (preprocess
+? (data) => { preprocess(data); return jsFn(data) ? VALID_RESULT : errFn(data); }
+: (data) => jsFn(data) ? VALID_RESULT : errFn(data));
 this.isValidObject = jsFn;
-const jsonValidateFn =
+const jsonValidateFn = jsCombinedFn
+? (obj) => { try { return jsCombinedFn(obj); } catch { return jsFn(obj) ? VALID_RESULT : errFn(obj); } }
+: (obj) => jsFn(obj) ? VALID_RESULT : errFn(obj);
 this.validateJSON = useSimdjsonForLarge
 ? (jsonStr) => {
 if (jsonStr.length >= SIMDJSON_THRESHOLD) {
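The dispatch added in this hunk can be read as "try the single-pass combined validator, and fall back to the boolean check plus error collector if it throws or is unavailable". Below is a minimal standalone sketch of that shape. `buildValidate`, `combinedFn`, `booleanFn`, and `errorFn` are hypothetical stand-ins for `jsCombinedFn`, `jsFn`, and `errFn`; the shape of `VALID_RESULT` and the `type_mismatch` code are assumptions, and preprocessing is left out.

```javascript
// Shared result for valid data, so the hot path allocates nothing per call.
// The exact shape used by the package is an assumption here.
const VALID_RESULT = Object.freeze({ valid: true, errors: [] });

// Builds a validate() that prefers the combined validator when one exists.
function buildValidate({ combinedFn, booleanFn, errorFn }) {
  // Two-step path: cheap boolean check first, error collection only on failure.
  const twoStep = (data) => (booleanFn(data) ? VALID_RESULT : errorFn(data));
  if (!combinedFn) return twoStep;
  return (data) => {
    try {
      return combinedFn(data); // single pass: result or errors in one go
    } catch {
      return twoStep(data); // combined validator could not handle this input
    }
  };
}

// Usage with toy functions standing in for compiled validators.
const booleanFn = (d) => typeof d === 'string';
const errorFn = () => ({
  valid: false,
  errors: [{ code: 'type_mismatch', path: '', message: 'expected string' }],
});
const alwaysThrows = () => { throw new Error('unsupported schema'); };

const validate = buildValidate({ combinedFn: alwaysThrows, booleanFn, errorFn });
const okResult = validate('hello');
const badResult = validate(42);
```

Here `okResult` is the shared `VALID_RESULT` object and `badResult` carries one error, even though the combined function threw on both calls.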
package/lib/js-compiler.js
CHANGED
@@ -1208,16 +1208,18 @@ function compileToJSCombined(schema, VALID_RESULT) {
 
 // Use factory pattern: closure vars (regexes, etc.) created once, not per call
 const closureParams = ctx.closureVars.join(',')
-
+// Lazy error array — no allocation for valid data (the common case)
+const inner = `let _e;\n ` +
 (ctx.helperCode.length ? ctx.helperCode.join('\n ') + '\n ' : '') +
 lines.join('\n ') +
-`\n return _e
+`\n return _e?{valid:false,errors:_e}:R`
 
 try {
 const factory = new Function('R' + (closureParams ? ',' + closureParams : ''),
 `return function(d){${inner}}`)
 return factory(VALID_RESULT, ...ctx.closureVals)
-} catch {
+} catch (e) {
+if (process.env.ATA_DEBUG) console.error('compileToJSCombined error:', e.message, '\n', inner.slice(0, 500))
 return null
 }
 }
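To make the two ideas in this hunk concrete (a factory built with `new Function`, and an error array that is only allocated on the first failure), here is a small self-contained sketch. It mirrors the generated-code shape visible in the diff (`let _e`, `(_e||(_e=[])).push(...)`, `return _e?{valid:false,errors:_e}:R`), but the schema it encodes, the `re` closure variable, and the `pattern_mismatch` code are invented for illustration, and it assumes the input `d` is a non-null object.

```javascript
const VALID_RESULT = Object.freeze({ valid: true, errors: [] });

// Body of the generated validator. `R` (the shared valid result) and `re`
// (a precompiled regex) are parameters of the factory, not of each call.
const inner = [
  'let _e;', // stays undefined unless a check fails
  "if(d.name===undefined){(_e||(_e=[])).push({code:'required_missing',path:'/name',message:'missing: name'})}",
  "else if(!re.test(d.name)){(_e||(_e=[])).push({code:'pattern_mismatch',path:'/name',message:'pattern mismatch'})}",
  'return _e?{valid:false,errors:_e}:R',
].join('\n');

// The factory runs once; the returned function closes over R and re.
const factory = new Function('R', 're', `return function(d){${inner}}`);
const validate = factory(VALID_RESULT, /^[a-z]+$/);

const valid = validate({ name: 'mert' });
const missing = validate({});
const mismatch = validate({ name: 'MERT' });
```

For valid input no array and no result object are created: the call returns the same `VALID_RESULT` reference every time. Only `missing` and `mismatch` pay for an allocation.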
@@ -1239,7 +1241,7 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 const types = schema.type ? (Array.isArray(schema.type) ? schema.type : [schema.type]) : null
 let isObj = false, isArr = false, isStr = false, isNum = false
 
-const fail = (code, msg) => `_e.push({code:'${code}',path:${pathExpr||'""'},message:${msg}})`
+const fail = (code, msg) => `(_e||(_e=[])).push({code:'${code}',path:${pathExpr||'""'},message:${msg}})`
 
 if (types) {
 const conds = types.map(t => {
@@ -1267,8 +1269,6 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 isNum = types[0] === 'number' || types[0] === 'integer'
 }
 lines.push(`if(${typeOk}){`)
-// We'll close this block at the end of genCodeC — mark it
-ctx._typeBlock = true
 }
 
 // enum
@@ -1311,7 +1311,7 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 for (const key of schema.required) {
 const check = hoisted[key] ? `${hoisted[key]}===undefined` : `${v}[${JSON.stringify(key)}]===undefined`
 const p = pathExpr ? `${pathExpr}+'/${esc(key)}'` : `'/${esc(key)}'`
-lines.push(`if(${check}){${`_e.push({code:'required_missing',path:${p},message:'missing: ${esc(key)}'})`}}`)
+lines.push(`if(${check}){${`(_e||(_e=[])).push({code:'required_missing',path:${p},message:'missing: ${esc(key)}'})`}}`)
 }
 } else if (schema.required) {
 for (const key of schema.required) {
@@ -1383,7 +1383,7 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 for (const [key, deps] of Object.entries(schema.dependentRequired)) {
 for (const dep of deps) {
 const p = pathExpr ? `${pathExpr}+'/${esc(dep)}'` : `'/${esc(dep)}'`
-lines.push(`if(typeof ${v}==='object'&&${v}!==null&&${JSON.stringify(key)} in ${v}&&!(${JSON.stringify(dep)} in ${v})){_e.push({code:'required_missing',path:${p},message:'${esc(key)} requires ${esc(dep)}'})}`)
+lines.push(`if(typeof ${v}==='object'&&${v}!==null&&${JSON.stringify(key)} in ${v}&&!(${JSON.stringify(dep)} in ${v})){(_e||(_e=[])).push({code:'required_missing',path:${p},message:'${esc(key)} requires ${esc(dep)}'})}`)
 }
 }
 }
@@ -1482,9 +1482,8 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 }
 
 // Close type-success block if opened
-if (
+if (types) {
 lines.push(`}`)
-ctx._typeBlock = false
 }
 }
 
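The last hunks replace a `ctx._typeBlock` flag with a check on the local `types` variable when closing the type-guard block. One plausible reading (an inference, not something the diff states) is that a flag stored on the shared `ctx` can be reset by a nested `genCodeC` call before the outer call reaches its closing check, whereas a local variable belongs to exactly one call. The miniature generator below illustrates the local-variable approach; `gen` is a hypothetical stand-in, not the package's `genCodeC`, and it only understands `object` plus the types that `typeof` reports directly.

```javascript
// Hypothetical miniature code generator: emits a type guard per schema node
// and recurses into `properties`.
function gen(schema, v, lines) {
  const types = schema.type ? [].concat(schema.type) : null;
  if (types) {
    const conds = types.map((t) => (t === 'object'
      ? `(typeof ${v}==='object'&&${v}!==null)`
      : `typeof ${v}==='${t}'`));
    lines.push(`if(${conds.join('||')}){`); // open this node's guard block
  }
  for (const [key, sub] of Object.entries(schema.properties || {})) {
    // The nested call may open and close a guard block of its own.
    gen(sub, `${v}[${JSON.stringify(key)}]`, lines);
  }
  // `types` is local to this call, so nested calls cannot change the decision.
  if (types) lines.push('}');
  return lines;
}

const body = gen(
  { type: 'object', properties: { name: { type: 'string' }, age: {} } },
  'd',
  []
).join('\n');

// new Function throws a SyntaxError if the emitted braces are unbalanced.
const compiled = new Function('d', body + '\nreturn true');
```

Each call that opens a block is the one that closes it, so the emitted braces stay balanced regardless of how deeply `properties` nest.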
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "ata-validator",
-"version": "0.4.
+"version": "0.4.5",
 "description": "Ultra-fast JSON Schema validator. Beats ajv on every valid-path benchmark: 1.1x–2.7x faster validate(obj), 151x faster compilation, 5.9x faster parallel batch. Speculative validation with V8-optimized JS codegen, simdjson, multi-core. Standard Schema V1 compatible.",
 "main": "index.js",
 "types": "index.d.ts",