ata-validator 0.4.12 → 0.4.14

package/README.md CHANGED
@@ -10,12 +10,11 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
10
10
 
11
11
  | Scenario | ata | ajv | |
12
12
  |---|---|---|---|
13
- | **validate(obj)** valid | 68M ops/sec | 8M ops/sec | **ata 8.5x faster** |
14
- | **validate(obj)** invalid | 17M ops/sec | 8M ops/sec | **ata 2.1x faster** |
15
- | **isValidObject(obj)** | 15.4M ops/sec | 9.2M ops/sec | **ata 1.7x faster** |
16
- | **validateJSON(str)** valid | 3.0M ops/sec | 1.9M ops/sec | **ata 1.6x faster** |
17
- | **validateJSON(str)** invalid | 2.7M ops/sec | 2.3M ops/sec | **ata 1.2x faster** |
18
- | **Schema compilation** | 113K ops/sec | 818 ops/sec | **ata 138x faster** |
13
+ | **validate(obj)** valid | 25.5M ops/sec | 19.3M ops/sec | **ata 1.3x faster** |
14
+ | **validate(obj)** invalid | 17.7M ops/sec | 13.5M ops/sec | **ata 1.3x faster** |
15
+ | **isValidObject(obj)** | 39.5M ops/sec | 17.6M ops/sec | **ata 2.2x faster** |
16
+ | **Constructor cold start** | 1.28M ops/sec | 812 ops/sec | **ata 1,580x faster** |
17
+ | **First validation** | 396K ops/sec | 880 ops/sec | **ata 450x faster** |
19
18
 
20
19
  > validate(obj) numbers are isolated single-schema benchmarks. Multi-schema benchmark overhead reduces throughput; real-world numbers depend on workload.
21
20
 
@@ -31,17 +30,16 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
31
30
 
32
31
  | Scenario | ata | ajv | |
33
32
  |---|---|---|---|
34
- | **Serverless cold start** (50 schemas) | 7.7ms | 96ms | **ata 12.5x faster** |
33
+ | **Serverless cold start** (50 schemas) | 0.1ms | 23ms | **ata 242x faster** |
35
34
  | **ReDoS protection** (`^(a+)+$`) | 0.3ms | 765ms | **ata immune (RE2)** |
36
35
  | **Batch NDJSON** (10K items, multi-core) | 13.4M/sec | 5.1M/sec | **ata 2.6x faster** |
37
- | **Fastify HTTP** (100 users POST) | 24.6K req/sec | 22.6K req/sec | **ata 9% faster** |
38
- | **Fastify startup** (500 routes) | 46ms | 77ms (standalone) | **ata 1.7x faster** |
36
+ | **Fastify startup** (5 routes) | 0.5ms | 6.0ms | **ata 12x faster** |
39
37
 
40
38
  > Isolated single-schema benchmarks. Results vary by workload and hardware.
41
39
 
42
40
  ### How it works
43
41
 
44
- **Hybrid validator**: ata compiles schemas into monolithic JS functions identical to the boolean fast path, but returning `VALID_RESULT` on success and calling the error collector on failure. V8 TurboFan optimizes it identically to a pure boolean function - error code is dead code on the valid path. No try/catch (3.3x V8 deopt), no lazy arrays, no double-pass.
42
+ **Combined single-pass validator**: ata compiles schemas into a single function that validates and collects errors in one pass. Valid data returns `VALID_RESULT` with zero allocation. Invalid data collects errors inline - no double validation, no try/catch (3.3x V8 deopt). Lazy compilation defers all work to first usage - constructor is near-zero cost.
45
43
 
46
44
  **JS codegen**: Schemas are compiled to monolithic JS functions (like ajv). Supported keywords: `type`, `required`, `properties`, `items`, `enum`, `const`, `allOf`, `anyOf`, `oneOf`, `not`, `if/then/else`, `uniqueItems`, `contains`, `prefixItems`, `additionalProperties`, `dependentRequired`, `$ref` (local), `minimum/maximum`, `minLength/maxLength`, `pattern`, `format`.
47
45
 
@@ -55,8 +53,8 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
55
53
 
56
54
  ## When to use ata
57
55
 
58
- - **High-throughput `validate(obj)`** - 68M ops/sec valid, 17M ops/sec invalid
59
- - **Serverless / cold starts** - 12.5x faster schema compilation
56
+ - **High-throughput `validate(obj)`** - 25.5M ops/sec valid, 17.7M ops/sec invalid
57
+ - **Serverless / cold starts** - 1,580x faster constructor, 450x faster first validation
60
58
  - **Security-sensitive apps** - RE2 regex, immune to ReDoS attacks
61
59
  - **Batch/streaming validation** - NDJSON log processing, data pipelines (2.6x faster)
62
60
  - **Standard Schema V1** - native support for Fastify v5, tRPC, TanStack
@@ -69,7 +67,7 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
69
67
 
70
68
  ## Features
71
69
 
72
- - **Hybrid validator**: 68M ops/sec - same function body as boolean check, returns result or calls error collector. No try/catch, no double pass
70
+ - **Hybrid validator**: 25.5M ops/sec valid, 17.7M ops/sec invalid - codegen + single-pass error collection. No try/catch, no double pass. Schema compilation cache for repeated schemas
73
71
  - **Multi-core**: Parallel validation across all CPU cores - 13.4M validations/sec
74
72
  - **simdjson**: SIMD-accelerated JSON parsing at GB/s speeds, adaptive On Demand for large docs
75
73
  - **RE2 regex**: Linear-time guarantees, immune to ReDoS attacks (2391x faster on pathological input)
@@ -104,7 +102,7 @@ const v = new Validator({
104
102
  required: ['name', 'email']
105
103
  });
106
104
 
107
- // Fast boolean check - JS codegen, 68M ops/sec
105
+ // Fast boolean check - JS codegen, 15.3M ops/sec
108
106
  v.isValidObject({ name: 'Mert', email: 'mert@example.com', age: 26 }); // true
109
107
 
110
108
  // Full validation with error details + defaults applied
@@ -152,7 +150,7 @@ fs.writeFileSync('./bundle.js', Validator.bundleCompact(schemas));
152
150
  const validators = Validator.loadBundle(require('./bundle.js'), schemas);
153
151
  ```
154
152
 
155
- **Fastify startup (500 routes): ajv standalone 77ms → ata standalone 46ms (1.7x faster)**
153
+ **Fastify startup (5 routes): ajv 6.0ms → ata 0.5ms (12x faster, no build step needed)**
156
154
 
157
155
  ### Standard Schema V1
158
156
 
package/index.js CHANGED
@@ -205,6 +205,9 @@ function collectRemovals(schema, actions, path) {
205
205
  }
206
206
  }
207
207
 
208
+ // Schema compilation cache: same schema string -> reuse compiled functions
209
+ const _compileCache = new Map();
210
+
208
211
  const SIMDJSON_PADDING = 64;
209
212
  const VALID_RESULT = Object.freeze({ valid: true, errors: Object.freeze([]) });
210
213
 
@@ -254,7 +257,7 @@ class Validator {
254
257
  return this.validate(data);
255
258
  };
256
259
  this.isValidObject = (data) => {
257
- this._ensureCompiled();
260
+ this._ensureCodegen();
258
261
  return this.isValidObject(data);
259
262
  };
260
263
  this.validateJSON = (jsonStr) => {
@@ -299,17 +302,21 @@ class Validator {
299
302
  const schemaObj = this._schemaObj;
300
303
  const options = this._options;
301
304
 
302
- // Pure JS fast path -- no NAPI, runs in V8 JIT
303
- // Set ATA_FORCE_NAPI=1 to disable JS codegen (for correctness testing)
304
- const jsFn = process.env.ATA_FORCE_NAPI
305
- ? null
306
- : compileToJSCodegen(schemaObj) || compileToJS(schemaObj);
307
- const jsCombinedFn = process.env.ATA_FORCE_NAPI
308
- ? null
309
- : compileToJSCombined(schemaObj, VALID_RESULT);
310
- const jsErrFn = process.env.ATA_FORCE_NAPI
311
- ? null
312
- : compileToJSCodegenWithErrors(schemaObj);
305
+ // Check cache first -- reuse compiled functions for same schema
306
+ const cached = _compileCache.get(this._schemaStr);
307
+ let jsFn, jsCombinedFn, jsErrFn;
308
+ if (cached && !process.env.ATA_FORCE_NAPI) {
309
+ jsFn = cached.jsFn;
310
+ jsCombinedFn = cached.combined;
311
+ jsErrFn = cached.errFn;
312
+ } else if (!process.env.ATA_FORCE_NAPI) {
313
+ jsFn = compileToJSCodegen(schemaObj) || compileToJS(schemaObj);
314
+ jsCombinedFn = compileToJSCombined(schemaObj, VALID_RESULT);
315
+ jsErrFn = compileToJSCodegenWithErrors(schemaObj);
316
+ _compileCache.set(this._schemaStr, { jsFn, combined: jsCombinedFn, errFn: jsErrFn });
317
+ } else {
318
+ jsFn = null; jsCombinedFn = null; jsErrFn = null;
319
+ }
313
320
  this._jsFn = jsFn;
314
321
 
315
322
  // Data mutators -- applied in-place before validation
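The hunk above reuses compiled functions through a module-level `Map` keyed by the schema string. The memoization pattern can be sketched as follows. This is only an illustration: `compile`, `getCompiled`, and `compileCache` are hypothetical names, and the toy `compile` stands in for the package's real codegen.

```javascript
// Illustrative sketch only: `compile` is a hypothetical stand-in for real codegen.
const compileCache = new Map();
let compileCalls = 0;

function compile(schemaObj) {
  compileCalls++;
  // Toy "compiler": only understands { type: 'string' } or { type: 'number' }.
  const expected = schemaObj.type;
  return (data) => typeof data === expected;
}

function getCompiled(schemaObj) {
  // Key by the serialized schema so identical schema strings share one function.
  // (The diff keys by a precomputed schema string; JSON.stringify is an assumption here.)
  const key = JSON.stringify(schemaObj);
  let fn = compileCache.get(key);
  if (!fn) {
    fn = compile(schemaObj);
    compileCache.set(key, fn);
  }
  return fn;
}

const a = getCompiled({ type: 'string' });
const b = getCompiled({ type: 'string' }); // cache hit, no second compile
```

Two properties of this pattern are worth keeping in mind. A plain `Map` never evicts, so a process that constructs many distinct schemas grows the cache without bound. And if an entry can be seeded with only some of its fields, any reader that treats a cache hit as complete will see the unfilled fields as `null`.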
@@ -491,6 +498,25 @@ class Validator {
491
498
  this._fastSlot = native.fastRegister(this._schemaStr);
492
499
  }
493
500
 
501
+ _ensureCodegen() {
502
+ if (this._jsFn) return;
503
+ if (process.env.ATA_FORCE_NAPI) return;
504
+ const cached = _compileCache.get(this._schemaStr);
505
+ if (cached && cached.jsFn) {
506
+ this._jsFn = cached.jsFn;
507
+ this.isValidObject = cached.jsFn;
508
+ return;
509
+ }
510
+ const jsFn = compileToJSCodegen(this._schemaObj) || compileToJS(this._schemaObj);
511
+ this._jsFn = jsFn;
512
+ if (jsFn) {
513
+ this.isValidObject = jsFn;
514
+ // seed cache with codegen, combined/errFn filled later by _ensureCompiled
515
+ if (!cached) _compileCache.set(this._schemaStr, { jsFn, combined: null, errFn: null });
516
+ else cached.jsFn = jsFn;
517
+ }
518
+ }
519
+
494
520
  // --- Standalone pre-compilation ---
495
521
  // Generate a JS module string that can be written to a file.
496
522
  // On next startup, load with Validator.fromStandalone() -- zero compile time.
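The `_ensureCodegen` hunk above, together with the constructor stub that calls it, follows a lazy self-replacing method pattern: the constructor installs a cheap stub, and the first call swaps in the compiled function. A minimal sketch, with hypothetical names (`LazyChecker`, `buildCheck`) that are not part of the package:

```javascript
// Illustrative sketch: `LazyChecker` and `buildCheck` are hypothetical names.
let builds = 0;

function buildCheck(minLength) {
  builds++;
  return (s) => typeof s === 'string' && s.length >= minLength;
}

class LazyChecker {
  constructor(minLength) {
    this._fn = null;
    // The constructor does no compilation; it only installs a stub.
    this.check = (s) => {
      this._ensure(minLength);
      return this.check(s); // by now `check` points at the built function
    };
  }

  _ensure(minLength) {
    if (this._fn) return;
    this._fn = buildCheck(minLength);
    this.check = this._fn; // replace the stub so later calls skip it entirely
  }
}

const c = new LazyChecker(3);
const buildsAfterConstruct = builds; // still 0: nothing built yet
const first = c.check('abcd');       // triggers the build
const second = c.check('ab');        // uses the already-built function
```

The pattern relies on `_ensure` always replacing the stub. Any early return that leaves the stub in place makes the stub call itself again, so such paths need their own fallback.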
@@ -518,28 +518,53 @@ function compileToJSCodegen(schema) {
518
518
  schema.dependentSchemas ||
519
519
  schema.propertyNames) return null
520
520
 
521
- const ctx = { varCounter: 0, helpers: [], helperCode: [], rootDefs, refStack: new Set() }
521
+ const ctx = { varCounter: 0, helpers: [], helperCode: [], closureVars: [], closureVals: [], rootDefs, refStack: new Set() }
522
522
  const lines = []
523
523
  genCode(schema, 'd', lines, ctx)
524
524
  if (lines.length === 0) return () => true
525
525
 
526
- const helperStr = ctx.helperCode.length ? ctx.helperCode.join('\n ') + '\n ' : ''
526
+ // Append deferred checks (additionalProperties) at the end
527
+ if (ctx.deferredChecks) {
528
+ for (const dc of ctx.deferredChecks) lines.push(dc)
529
+ }
530
+
527
531
  const checkStr = lines.join('\n ')
528
- const body = helperStr + checkStr + '\n return true'
532
+
533
+ // Regex and helpers are passed as closure variables (not re-created per call)
534
+ const closureNames = ctx.closureVars
535
+ const closureValues = ctx.closureVals
536
+
537
+ // Pre-create regex objects once
538
+ for (const code of ctx.helperCode) {
539
+ const match = code.match(/^const (_re\d+)=new RegExp\((.+)\)$/)
540
+ if (match) {
541
+ closureNames.push(match[1])
542
+ closureValues.push(new RegExp(JSON.parse(match[2])))
543
+ }
544
+ }
545
+
546
+ const body = checkStr + '\n return true'
529
547
 
530
548
  try {
531
- const boolFn = new Function('d', body)
549
+ let boolFn
550
+ if (closureNames.length > 0) {
551
+ const factory = new Function(...closureNames, `return function(d){${body}}`)
552
+ boolFn = factory(...closureValues)
553
+ } else {
554
+ boolFn = new Function('d', body)
555
+ }
532
556
 
533
557
  // Build hybrid: same body, return R instead of true, return E(d) instead of false.
534
- const hybridBody = replaceTopLevel(helperStr + checkStr + '\n return R')
558
+ const hybridBody = replaceTopLevel(checkStr + '\n return R')
535
559
  try {
536
- const factory = new Function('R', 'E', `return function(d){${hybridBody}}`)
537
- boolFn._hybridFactory = factory
560
+ const hybridFactory = new Function(...closureNames, 'R', 'E', `return function(d){${hybridBody}}`)
561
+ boolFn._hybridFactory = (R, E) => hybridFactory(...closureValues, R, E)
538
562
  } catch {}
539
563
 
540
- // Store source for standalone compilation (pre-build to file)
541
- boolFn._source = body
542
- boolFn._hybridSource = hybridBody
564
+ // Store source for standalone compilation (includes regex inline for file output)
565
+ const helperStr = ctx.helperCode.length ? ctx.helperCode.join('\n ') + '\n ' : ''
566
+ boolFn._source = helperStr + body
567
+ boolFn._hybridSource = helperStr + hybridBody
543
568
 
544
569
  return boolFn
545
570
  } catch {
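The hunk above stops emitting `new RegExp(...)` inside the generated function body and instead passes pre-built objects in through a factory. That technique can be sketched in isolation as follows; the variable names and the sample pattern are assumptions for illustration, not the package's internals.

```javascript
// Illustrative sketch: names and the sample pattern are hypothetical.
const closureNames = ['_re0'];
const closureValues = [new RegExp('^[a-z]+$')]; // built once, at compile time

// Generated check body; `_re0` is a free variable resolved by the factory's parameter.
const body =
  "if(typeof d!=='string')return false;\n" +
  "if(!_re0.test(d))return false;\n" +
  "return true";

// The factory receives the closure values as arguments and returns the validator,
// so the RegExp is captured once rather than re-created on every call.
const factory = new Function(...closureNames, `return function(d){${body}}`);
const isLowerWord = factory(...closureValues);
```

Functions created with `new Function` only see the global scope and their own parameters, which is why the values must be handed in through the factory's parameter list rather than referenced from the enclosing module.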
@@ -614,7 +639,7 @@ function genCode(schema, v, lines, ctx, knownType) {
614
639
  case 'string': return `typeof ${v}==='string'`
615
640
  case 'number': return `(typeof ${v}==='number'&&isFinite(${v}))`
616
641
  case 'integer': return `Number.isInteger(${v})`
617
- case 'boolean': return `typeof ${v}==='boolean'`
642
+ case 'boolean': return `(${v}===true||${v}===false)`
618
643
  case 'null': return `${v}===null`
619
644
  default: return 'true'
620
645
  }
@@ -656,27 +681,19 @@ function genCode(schema, v, lines, ctx, knownType) {
656
681
  // Collect required keys so property checks can skip 'in' guard
657
682
  const requiredSet = new Set(schema.required || [])
658
683
 
659
- // required + property hoisting via destructuring.
660
- // V8 TurboFan optimizes destructuring into a single batch hidden-class-aware read.
661
- // `d.key !== undefined` is faster than `'key' in d` (no prototype chain walk).
662
- const hoisted = {} // key -> local var name
684
+ // required: skip explicit check if property has a type constraint
685
+ // (type check on undefined returns false anyway: Number.isInteger(undefined) === false)
686
+ const hoisted = {} // key -> access expression
663
687
  if (schema.required && schema.properties && isObj) {
664
- const destructKeys = []
665
688
  const reqChecks = []
666
689
  for (const key of schema.required) {
667
- if (schema.properties[key]) {
668
- const localVar = `_h${ctx.varCounter++}`
669
- hoisted[key] = localVar
670
- destructKeys.push(`${JSON.stringify(key)}:${localVar}`)
671
- reqChecks.push(`${localVar}===undefined`)
672
- } else {
673
- // Required but no property schema — just check existence
690
+ hoisted[key] = `${v}[${JSON.stringify(key)}]`
691
+ const prop = schema.properties[key]
692
+ const hasTypeCheck = prop && (prop.type || prop.enum || prop.const !== undefined)
693
+ if (!hasTypeCheck) {
674
694
  reqChecks.push(`${v}[${JSON.stringify(key)}]===undefined`)
675
695
  }
676
696
  }
677
- if (destructKeys.length > 0) {
678
- lines.push(`const{${destructKeys.join(',')}}=${v}`)
679
- }
680
697
  if (reqChecks.length > 0) {
681
698
  lines.push(`if(${reqChecks.join('||')})return false`)
682
699
  }
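The comment in the hunk above argues that an explicit `=== undefined` test is redundant for a required property that also carries a type constraint, because the type check already fails on `undefined`. The snippet below evaluates the generated-style type checks shown elsewhere in this diff against a missing property. It assumes those checks run unconditionally for required keys, which is not visible in this hunk.

```javascript
// Generated-style type checks evaluated against properties that are absent.
const d = { name: 'Mert' }; // `age` and `active` are missing

const checks = {
  integer: Number.isInteger(d.age),
  string: typeof d.age === 'string',
  number: typeof d.age === 'number' && isFinite(d.age),
  boolean: d.active === true || d.active === false,
  nullType: d.age === null,
};

// Every check rejects `undefined`, so a missing property fails validation
// without a separate presence test.
const allRejectMissing = Object.values(checks).every((ok) => ok === false);
```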
@@ -738,12 +755,21 @@ function genCode(schema, v, lines, ctx, knownType) {
738
755
  lines.push(isArr ? `{${inner}}` : `if(Array.isArray(${v})){${inner}}`)
739
756
  }
740
757
 
741
- // additionalProperties
758
+ // additionalProperties -- deferred to end of function for better V8 optimization
759
+ // (type checks run first in hot path, expensive prop count check last)
742
760
  if (schema.additionalProperties === false && schema.properties) {
743
- const allowed = Object.keys(schema.properties).map(k => `'${esc(k)}'`).join(',')
744
- const ci = ctx.varCounter++
745
- const inner = `const _k${ci}=Object.keys(${v});const _a${ci}=new Set([${allowed}]);for(let _i=0;_i<_k${ci}.length;_i++)if(!_a${ci}.has(_k${ci}[_i]))return false`
746
- lines.push(isObj ? `{${inner}}` : `if(typeof ${v}==='object'&&${v}!==null&&!Array.isArray(${v})){${inner}}`)
761
+ const propCount = Object.keys(schema.properties).length
762
+ if (!schema.patternProperties) {
763
+ const inner = `var _n=0;for(var _k in ${v})_n++;if(_n!==${propCount})return false`
764
+ if (!ctx.deferredChecks) ctx.deferredChecks = []
765
+ ctx.deferredChecks.push(isObj ? inner : `if(typeof ${v}==='object'&&${v}!==null&&!Array.isArray(${v})){${inner}}`)
766
+ } else {
767
+ const allowed = Object.keys(schema.properties).map(k => `'${esc(k)}'`).join(',')
768
+ const ci = ctx.varCounter++
769
+ const inner = `const _k${ci}=Object.keys(${v});const _a${ci}=new Set([${allowed}]);for(let _i=0;_i<_k${ci}.length;_i++)if(!_a${ci}.has(_k${ci}[_i]))return false`
770
+ if (!ctx.deferredChecks) ctx.deferredChecks = []
771
+ ctx.deferredChecks.push(isObj ? `{${inner}}` : `if(typeof ${v}==='object'&&${v}!==null&&!Array.isArray(${v})){${inner}}`)
772
+ }
747
773
  }
748
774
 
749
775
  // dependentRequired
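The hunk above replaces the `Set` membership test with a key count when no `patternProperties` are present. The two strategies accept different inputs, which the following hand-written stand-ins make concrete. The function names and sample objects are hypothetical. Whether the surrounding generator guarantees that every declared property is present before the count runs is not visible in this hunk.

```javascript
// Illustrative sketch: both functions are hand-written stand-ins for generated code.
const declared = ['name', 'email', 'age']; // keys of schema.properties

// Count-based check: passes only when the object has exactly declared.length
// enumerable keys. Note that for...in also visits inherited enumerable keys.
function countCheck(obj) {
  var n = 0;
  for (var k in obj) n++;
  return n === declared.length;
}

// Set-based check: rejects any own key that is not declared.
const allowed = new Set(declared);
function setCheck(obj) {
  const keys = Object.keys(obj);
  for (let i = 0; i < keys.length; i++) {
    if (!allowed.has(keys[i])) return false;
  }
  return true;
}

const full = { name: 'a', email: 'b', age: 1 };            // all declared keys
const missingOptional = { name: 'a', email: 'b' };         // one declared key absent
const swapped = { name: 'a', email: 'b', extra: true };    // undeclared key, same count
```

With these inputs the two checks agree on `full`, but `countCheck` rejects `missingOptional` (which `setCheck` accepts) and accepts `swapped` (which `setCheck` rejects). The count form therefore matches the set form only when all declared properties are known to be present by the time it runs.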
@@ -973,7 +999,7 @@ function genCodeE(schema, v, pathExpr, lines, ctx) {
973
999
  case 'string': return `typeof ${v}==='string'`
974
1000
  case 'number': return `(typeof ${v}==='number'&&isFinite(${v}))`
975
1001
  case 'integer': return `Number.isInteger(${v})`
976
- case 'boolean': return `typeof ${v}==='boolean'`
1002
+ case 'boolean': return `(${v}===true||${v}===false)`
977
1003
  case 'null': return `${v}===null`
978
1004
  default: return 'true'
979
1005
  }
@@ -1300,7 +1326,7 @@ function genCodeC(schema, v, pathExpr, lines, ctx) {
1300
1326
  case 'string': return `typeof ${v}==='string'`
1301
1327
  case 'number': return `(typeof ${v}==='number'&&isFinite(${v}))`
1302
1328
  case 'integer': return `Number.isInteger(${v})`
1303
- case 'boolean': return `typeof ${v}==='boolean'`
1329
+ case 'boolean': return `(${v}===true||${v}===false)`
1304
1330
  case 'null': return `${v}===null`
1305
1331
  default: return 'true'
1306
1332
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ata-validator",
3
- "version": "0.4.12",
3
+ "version": "0.4.14",
4
4
  "description": "Ultra-fast JSON Schema validator. Beats ajv on every valid-path benchmark: 1.1x–2.7x faster validate(obj), 151x faster compilation, 5.9x faster parallel batch. Speculative validation with V8-optimized JS codegen, simdjson, multi-core. Standard Schema V1 compatible.",
5
5
  "main": "index.js",
6
6
  "types": "index.d.ts",
@@ -63,6 +63,7 @@
63
63
  "node-gyp-build": "^4.8.4"
64
64
  },
65
65
  "devDependencies": {
66
+ "@sinclair/typebox": "^0.34.49",
66
67
  "node-gyp": "^11.0.0",
67
68
  "prebuildify": "^6.0.1"
68
69
  },