ata-validator 0.4.4 → 0.4.6
- package/README.md +49 -40
- package/index.js +17 -10
- package/lib/js-compiler.js +9 -10
- package/package.json +1 -1
package/README.md
CHANGED
@@ -6,15 +6,16 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
 
 ## Performance
 
-### Single-Document Validation
+### Single-Document Validation
 
 | Scenario | ata | ajv | |
 |---|---|---|---|
-| **validate(obj)** |
-| **
-| **
-| **
-| **
+| **validate(obj)** valid | 15M ops/sec | 8M ops/sec | **ata 1.9x faster** |
+| **validate(obj)** invalid | 13.1M ops/sec | 8.1M ops/sec | **ata 1.6x faster** |
+| **isValidObject(obj)** | 15.4M ops/sec | 9.2M ops/sec | **ata 1.7x faster** |
+| **validateJSON(str)** valid | 2.15M ops/sec | 1.88M ops/sec | **ata 1.1x faster** |
+| **validateJSON(str)** invalid | 2.62M ops/sec | 2.35M ops/sec | **ata 1.1x faster** |
+| **Schema compilation** | 112K ops/sec | 773 ops/sec | **ata 145x faster** |
 
 ### Large Data — JS Object Validation
 
@@ -24,61 +25,57 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
 | 100 users (20KB) | 658K ops/sec | 243K ops/sec | **ata 2.7x faster** |
 | 1,000 users (205KB) | 64K ops/sec | 23.5K ops/sec | **ata 2.7x faster** |
 
-###
-
-| Batch Size | ata | ajv | |
-|---|---|---|---|
-| 1,000 items | 8.4M items/sec | 2.2M items/sec | **ata 3.9x faster** |
-| 10,000 items | 12.5M items/sec | 2.1M items/sec | **ata 5.9x faster** |
-
-> ajv is single-threaded (JS). ata uses all CPU cores via a persistent C++ thread pool.
-
-### Where ajv wins
+### Real-World Scenarios
 
 | Scenario | ata | ajv | |
 |---|---|---|---|
-| **
-| **
+| **Serverless cold start** (50 schemas) | 7.7ms | 96ms | **ata 12.5x faster** |
+| **ReDoS protection** (`^(a+)+$`) | 0.3ms | 765ms | **ata immune (RE2)** |
+| **Batch NDJSON** (10K items, multi-core) | 13.4M/sec | 5.1M/sec | **ata 2.6x faster** |
+| **Fastify HTTP** (100 users POST) | 24.6K req/sec | 22.6K req/sec | **ata 9% faster** |
 
->
+> ata is faster than ajv on **every** benchmark — valid and invalid data, objects and JSON strings, single documents and parallel batches.
 
 ### How it works
 
-**
+**Combined single-pass validation**: ata compiles schemas into monolithic JS functions that both validate and collect errors in a single pass. Valid data returns immediately (lazy error array — zero allocation). Invalid data collects errors without a second pass.
 
-**JS codegen**: Schemas are compiled to monolithic JS functions (like ajv). Supported keywords: `type`, `required`, `properties`, `items`, `enum`, `const`, `allOf`, `anyOf`, `oneOf`, `not`, `if/then/else`, `uniqueItems`, `contains`, `prefixItems`, `additionalProperties`, `dependentRequired`, `minimum/maximum`, `minLength/maxLength`, `pattern`, `format`.
+**JS codegen**: Schemas are compiled to monolithic JS functions (like ajv). Supported keywords: `type`, `required`, `properties`, `items`, `enum`, `const`, `allOf`, `anyOf`, `oneOf`, `not`, `if/then/else`, `uniqueItems`, `contains`, `prefixItems`, `additionalProperties`, `dependentRequired`, `$ref` (local), `minimum/maximum`, `minLength/maxLength`, `pattern`, `format`.
 
-**V8 TurboFan optimizations**: Destructuring batch reads, `undefined` checks instead of `in` operator, context-aware type guard elimination, property hoisting to local variables.
+**V8 TurboFan optimizations**: Destructuring batch reads, `undefined` checks instead of `in` operator, context-aware type guard elimination, property hoisting to local variables, tiered uniqueItems (nested loop for small arrays).
 
 **Adaptive simdjson**: For large documents (>8KB) with selective schemas, simdjson On Demand seeks only the needed fields — skipping irrelevant data at GB/s speeds.
 
 ### JSON Schema Test Suite
 
-**98.
+**98.4%** pass rate (937/952) on official [JSON Schema Test Suite](https://github.com/json-schema-org/JSON-Schema-Test-Suite) (Draft 2020-12).
 
 ## When to use ata
 
-- **Any `validate(obj)` workload** — 1.
-- **
-- **
+- **Any `validate(obj)` workload** — 1.6x–2.7x faster than ajv on all data
+- **Serverless / cold starts** — 12.5x faster schema compilation
+- **Security-sensitive apps** — RE2 regex, immune to ReDoS attacks
+- **Batch/streaming validation** — NDJSON log processing, data pipelines (2.6x faster)
+- **Standard Schema V1** — native support for Fastify v5, tRPC, TanStack
 - **C/C++ embedding** — native library, no JS runtime needed
 
 ## When to use ajv
 
-- **
-- **
+- **Schemas with `patternProperties`, `dependentSchemas`** — these bypass JS codegen and hit the slower NAPI path
+- **100% spec compliance needed** — ajv covers more edge cases (ata: 98.4%)
 
 ## Features
 
-- **
-- **Multi-core**: Parallel validation across all CPU cores —
+- **Combined single-pass validation**: One JS function validates + collects errors — no double pass, lazy error allocation
+- **Multi-core**: Parallel validation across all CPU cores — 13.4M validations/sec
 - **simdjson**: SIMD-accelerated JSON parsing at GB/s speeds, adaptive On Demand for large docs
-- **RE2 regex**: Linear-time guarantees, immune to ReDoS attacks
+- **RE2 regex**: Linear-time guarantees, immune to ReDoS attacks (2391x faster on pathological input)
 - **V8-optimized codegen**: Destructuring batch reads, type guard elimination, property hoisting
 - **Standard Schema V1**: Compatible with Fastify, tRPC, TanStack, Drizzle
 - **Zero-copy paths**: Buffer and pre-padded input support — no unnecessary copies
+- **Defaults + coercion**: `default` values, `coerceTypes`, `removeAdditional` support
 - **C/C++ library**: Native API for non-Node.js environments
-- **98.
+- **98.4% spec compliant**: Draft 2020-12
 
 ## Installation
 
@@ -98,18 +95,18 @@ const v = new Validator({
 properties: {
 name: { type: 'string', minLength: 1 },
 email: { type: 'string', format: 'email' },
-age: { type: 'integer', minimum: 0 }
+age: { type: 'integer', minimum: 0 },
+role: { type: 'string', default: 'user' }
 },
 required: ['name', 'email']
 });
 
-// Fast boolean check — JS codegen
+// Fast boolean check — JS codegen (1.7x faster than ajv)
 v.isValidObject({ name: 'Mert', email: 'mert@example.com', age: 26 }); // true
 
-// Full validation with error details
-const result = v.validate({ name: 'Mert', email: 'mert@example.com'
-
-console.log(result.errors); // []
+// Full validation with error details + defaults applied
+const result = v.validate({ name: 'Mert', email: 'mert@example.com' });
+// result.valid === true, data.role === 'user' (default applied)
 
 // JSON string validation (simdjson fast path)
 v.validateJSON('{"name": "Mert", "email": "mert@example.com"}');
@@ -118,12 +115,21 @@ v.isValidJSON('{"name": "Mert", "email": "mert@example.com"}'); // true
 // Buffer input (zero-copy, raw NAPI)
 v.isValid(Buffer.from('{"name": "Mert", "email": "mert@example.com"}'));
 
-// Parallel batch — multi-core, NDJSON (
+// Parallel batch — multi-core, NDJSON (2.6x faster than ajv)
 const ndjson = Buffer.from(lines.join('\n'));
 v.isValidParallel(ndjson); // bool[]
 v.countValid(ndjson); // number
 ```
 
+### Options
+
+```javascript
+const v = new Validator(schema, {
+coerceTypes: true, // "42" → 42 for integer fields
+removeAdditional: true, // strip properties not in schema
+});
+```
+
 ### Standard Schema V1
 
 ```javascript
@@ -143,7 +149,10 @@ npm install fastify-ata
 
 ```javascript
 const fastify = require('fastify')();
-fastify.register(require('fastify-ata')
+fastify.register(require('fastify-ata'), {
+coerceTypes: true,
+removeAdditional: true,
+});
 
 // All existing JSON Schema route definitions work as-is
 ```
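Note on "tiered uniqueItems (nested loop for small arrays)": the README change names the technique but does not show it. A minimal sketch of the idea follows, assuming a hypothetical cutoff `SMALL_ARRAY_LIMIT` and a hypothetical helper `hasUniquePrimitives`; neither name comes from the package, and the sketch only covers JSON primitives (strings, numbers, booleans, null), whereas `uniqueItems` in JSON Schema also requires deep equality for objects and arrays.

```javascript
// Hypothetical cutoff; the value ata-validator actually uses is not shown in this diff.
const SMALL_ARRAY_LIMIT = 8;

// Returns true when no two primitive items are strictly equal.
function hasUniquePrimitives(items) {
  if (items.length <= SMALL_ARRAY_LIMIT) {
    // Small tier: pairwise comparison, no allocation.
    for (let i = 1; i < items.length; i++) {
      for (let j = 0; j < i; j++) {
        if (items[i] === items[j]) return false;
      }
    }
    return true;
  }
  // Large tier: one Set allocation, linear time.
  return new Set(items).size === items.length;
}

console.log(hasUniquePrimitives(['a', 'b', 'c'])); // true
console.log(hasUniquePrimitives([1, 2, 1]));       // false
```

The nested loop is quadratic, but for a handful of items it avoids the allocation and hashing cost of a `Set`, which is the usual motivation for a tiered check.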
package/index.js
CHANGED
@@ -250,17 +250,24 @@ class Validator {
 const useSimdjsonForLarge = !hasArrayTraversal;
 
 if (jsFn) {
-//
-
-
-
-
-
-
-
-
+// Best path: combined validator (single pass, lazy error array)
+// Valid data: no array allocation, returns VALID_RESULT
+// Invalid data: collects errors in one pass (no double validation)
+// Fallback: jsFn + errFn for schemas combined can't handle
+const errFn = jsErrFn
+? (d) => { try { return jsErrFn(d, true); } catch { return compiled.validate(d); } }
+: (d) => compiled.validate(d);
+this.validate = jsCombinedFn
+? (preprocess
+? (data) => { preprocess(data); try { return jsCombinedFn(data); } catch { return jsFn(data) ? VALID_RESULT : errFn(data); } }
+: (data) => { try { return jsCombinedFn(data); } catch { return jsFn(data) ? VALID_RESULT : errFn(data); } })
+: (preprocess
+? (data) => { preprocess(data); return jsFn(data) ? VALID_RESULT : errFn(data); }
+: (data) => jsFn(data) ? VALID_RESULT : errFn(data));
 this.isValidObject = jsFn;
-const jsonValidateFn =
+const jsonValidateFn = jsCombinedFn
+? (obj) => { try { return jsCombinedFn(obj); } catch { return jsFn(obj) ? VALID_RESULT : errFn(obj); } }
+: (obj) => jsFn(obj) ? VALID_RESULT : errFn(obj);
 this.validateJSON = useSimdjsonForLarge
 ? (jsonStr) => {
 if (jsonStr.length >= SIMDJSON_THRESHOLD) {
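Note on the dispatch above: the new `validate` tries the combined function first and, if it throws, falls back to the boolean check plus a separate error collector. The sketch below restates that control flow in isolation. `makeValidate` and its argument names are hypothetical, the shape of `VALID_RESULT` is an assumption, and the stand-in validators are invented for illustration. The real code builds four separate closures rather than composing them, presumably to avoid an extra call per validation.

```javascript
// Assumed shape of the shared "valid" result; the real VALID_RESULT may differ.
const VALID_RESULT = Object.freeze({ valid: true, errors: [] });

// Hypothetical helper mirroring the fallback order in the diff:
// combined validator -> (boolean check ? VALID_RESULT : error collector).
function makeValidate({ combinedFn, boolFn, errFn, preprocess }) {
  const twoStep = (data) => (boolFn(data) ? VALID_RESULT : errFn(data));
  const core = combinedFn
    ? (data) => { try { return combinedFn(data); } catch { return twoStep(data); } }
    : twoStep;
  // preprocess mutates the input (e.g. applies defaults) before validation.
  return preprocess ? (data) => { preprocess(data); return core(data); } : core;
}

// Stand-in validators for a schema with a required string `name`.
const boolFn = (d) => typeof d === 'object' && d !== null && typeof d.name === 'string';
const errFn = () => ({
  valid: false,
  errors: [{ code: 'required_missing', path: '/name', message: 'missing: name' }],
});
// Simulates a combined function that cannot handle the input.
const throwingCombined = () => { throw new Error('unsupported'); };
const applyDefaults = (d) => { if (d.role === undefined) d.role = 'user'; };

const validate = makeValidate({
  combinedFn: throwingCombined, boolFn, errFn, preprocess: applyDefaults,
});

const input = { name: 'Mert' };
console.log(validate(input) === VALID_RESULT); // true
console.log(input.role);                       // user
console.log(validate({}).valid);               // false
```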
package/lib/js-compiler.js
CHANGED
@@ -1208,16 +1208,18 @@ function compileToJSCombined(schema, VALID_RESULT) {
 
 // Use factory pattern: closure vars (regexes, etc.) created once, not per call
 const closureParams = ctx.closureVars.join(',')
-
+// Lazy error array — no allocation for valid data (the common case)
+const inner = `let _e;\n ` +
 (ctx.helperCode.length ? ctx.helperCode.join('\n ') + '\n ' : '') +
 lines.join('\n ') +
-`\n return _e
+`\n return _e?{valid:false,errors:_e}:R`
 
 try {
 const factory = new Function('R' + (closureParams ? ',' + closureParams : ''),
 `return function(d){${inner}}`)
 return factory(VALID_RESULT, ...ctx.closureVals)
-} catch {
+} catch (e) {
+if (process.env.ATA_DEBUG) console.error('compileToJSCombined error:', e.message, '\n', inner.slice(0, 500))
 return null
 }
 }
@@ -1239,7 +1241,7 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 const types = schema.type ? (Array.isArray(schema.type) ? schema.type : [schema.type]) : null
 let isObj = false, isArr = false, isStr = false, isNum = false
 
-const fail = (code, msg) => `_e.push({code:'${code}',path:${pathExpr||'""'},message:${msg}})`
+const fail = (code, msg) => `(_e||(_e=[])).push({code:'${code}',path:${pathExpr||'""'},message:${msg}})`
 
 if (types) {
 const conds = types.map(t => {
@@ -1267,8 +1269,6 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 isNum = types[0] === 'number' || types[0] === 'integer'
 }
 lines.push(`if(${typeOk}){`)
-// We'll close this block at the end of genCodeC — mark it
-ctx._typeBlock = true
 }
 
 // enum
@@ -1311,7 +1311,7 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 for (const key of schema.required) {
 const check = hoisted[key] ? `${hoisted[key]}===undefined` : `${v}[${JSON.stringify(key)}]===undefined`
 const p = pathExpr ? `${pathExpr}+'/${esc(key)}'` : `'/${esc(key)}'`
-lines.push(`if(${check}){${`_e.push({code:'required_missing',path:${p},message:'missing: ${esc(key)}'})`}}`)
+lines.push(`if(${check}){${`(_e||(_e=[])).push({code:'required_missing',path:${p},message:'missing: ${esc(key)}'})`}}`)
 }
 } else if (schema.required) {
 for (const key of schema.required) {
@@ -1383,7 +1383,7 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 for (const [key, deps] of Object.entries(schema.dependentRequired)) {
 for (const dep of deps) {
 const p = pathExpr ? `${pathExpr}+'/${esc(dep)}'` : `'/${esc(dep)}'`
-lines.push(`if(typeof ${v}==='object'&&${v}!==null&&${JSON.stringify(key)} in ${v}&&!(${JSON.stringify(dep)} in ${v})){_e.push({code:'required_missing',path:${p},message:'${esc(key)} requires ${esc(dep)}'})}`)
+lines.push(`if(typeof ${v}==='object'&&${v}!==null&&${JSON.stringify(key)} in ${v}&&!(${JSON.stringify(dep)} in ${v})){(_e||(_e=[])).push({code:'required_missing',path:${p},message:'${esc(key)} requires ${esc(dep)}'})}`)
 }
 }
 }
@@ -1482,9 +1482,8 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
 }
 
 // Close type-success block if opened
-if (
+if (types) {
 lines.push(`}`)
-ctx._typeBlock = false
 }
 }
 
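Note on the lazy error array: the hunks above swap `_e.push(...)` for `(_e||(_e=[])).push(...)` and return the shared `R` object when `_e` was never created. The standalone sketch below shows how the two pieces fit together. The generated body, the `type_mismatch` code, and the shape of `VALID_RESULT` are assumptions made for illustration; only the `(_e||(_e=[]))` idiom, the `required_missing` code, the final `return` expression, and the `new Function('R', ...)` factory come from the diff.

```javascript
// Assumed shape of the shared "valid" result; the real VALID_RESULT may differ.
const VALID_RESULT = Object.freeze({ valid: true, errors: Object.freeze([]) });

// Hypothetical generated body for a schema that requires a string `name`.
// `_e` stays undefined until the first failure, so valid data allocates nothing.
const inner = [
  'let _e;',
  "if (typeof d !== 'object' || d === null) {",
  "  (_e || (_e = [])).push({ code: 'type_mismatch', path: '', message: 'expected object' });",
  '} else {',
  "  if (d.name === undefined) (_e || (_e = [])).push({ code: 'required_missing', path: '/name', message: 'missing: name' });",
  "  else if (typeof d.name !== 'string') (_e || (_e = [])).push({ code: 'type_mismatch', path: '/name', message: 'expected string' });",
  '}',
  'return _e ? { valid: false, errors: _e } : R;',
].join('\n');

// Factory pattern: `R` is bound once, the returned function runs per document.
const factory = new Function('R', `return function (d) {\n${inner}\n}`);
const validateUser = factory(VALID_RESULT);

const ok = validateUser({ name: 'Mert' });
const bad = validateUser({});
console.log(ok === VALID_RESULT); // true
console.log(bad.errors[0].code);  // required_missing
```

Because every valid document gets the same frozen `R` back, callers should treat the result as read-only.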
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "ata-validator",
-"version": "0.4.
+"version": "0.4.6",
 "description": "Ultra-fast JSON Schema validator. Beats ajv on every valid-path benchmark: 1.1x–2.7x faster validate(obj), 151x faster compilation, 5.9x faster parallel batch. Speculative validation with V8-optimized JS codegen, simdjson, multi-core. Standard Schema V1 compatible.",
 "main": "index.js",
 "types": "index.d.ts",