ata-validator 0.11.1 → 0.12.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,8 +1,12 @@
1
+ <p align="center">
2
+ <img src="./assets/ata-validator.svg" alt="ata-validator" width="640" />
3
+ </p>
4
+
1
5
  # ata-validator
2
6
 
3
7
  Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjson/simdjson). Multi-core parallel validation, RE2 regex, codegen bytecode engine. Standard Schema V1 compatible.
4
8
 
5
- **[ata-validator.com](https://ata-validator.com)** | **[API Docs](docs/API.md)** | **[Migrate from ajv](docs/migration-from-ajv.md)** | **[Contributing](CONTRIBUTING.md)**
9
+ **[ata-validator.com](https://ata-validator.com)** | **[API Docs](docs/API.md)** | **[Migrate from ajv](docs/migration-from-ajv.md)** | **[Framework integrations](docs/integrations/)** | **[Contributing](CONTRIBUTING.md)**
6
10
 
7
11
  ## Performance
8
12
 
@@ -10,26 +14,34 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
10
14
 
11
15
  | Scenario | ata | ajv | |
12
16
  |---|---|---|---|
13
- | **validate(obj)** valid | 22ns | 102ns | **ata 4.6x faster** |
14
- | **validate(obj)** invalid | 87ns | 182ns | **ata 2.1x faster** |
15
- | **isValidObject(obj)** | 21ns | 100ns | **ata 4.7x faster** |
16
- | **Schema compilation** | 453ns | 1.24ms | **ata 2,729x faster** |
17
- | **First validation** | 2.07μs | 1.11ms | **ata 534x faster** |
17
+ | **validate(obj)** valid | 21ns | 108ns | **ata 5.1x faster** |
18
+ | **validate(obj)** invalid | 86ns | 104ns | **ata 1.2x faster** |
19
+ | **isValidObject(obj)** | 20ns | 109ns | **ata 5.4x faster** |
20
+ | **Schema instantiation** (lazy compile) | 8ns | 1.33ms | **ata 159,000x faster** |
21
+ | **First validation** (compile + validate) | 28ns | 1.21ms | **ata 43,000x faster** |
22
+
23
+ > **Honest read of the three rows above:**
24
+ >
25
+ > - **Hot loop** (millions of `validate(obj)` calls on a warm validator): ata is **~5× faster** than ajv. This is the steady-state advantage and what most apps care about most of the time.
26
+ > - **Cold start** (construct + first validate, apples-to-apples vs `ajv.compile(schema) + validate(obj)`): ata is **~43,000× faster**. Matters for serverless cold starts, CLI tools, batch workers — anywhere you instantiate a schema and exercise it once or a few times.
27
+ > - **Instantiation only** (`new Validator(schema)` with no validation yet): ata is **~159,000× faster**, but only because ata defers codegen to first use (lazy compile + a tier-0 interpreter for low-traffic schemas). The number is real but it is constructor cost vs ajv's full compile cost — not the same unit of work. Quote it carefully.
28
+ >
29
+ > The lazy compile architecture is also why an instantiated-but-never-validated schema is essentially free in ata, while in ajv it costs the full compile. That's the underlying real win, beyond the multiplier above.
18
30
 
19
31
  ### Complex Schema (patternProperties + dependentSchemas + propertyNames + additionalProperties)
20
32
 
21
33
  | Scenario | ata | ajv | |
22
34
  |---|---|---|---|
23
- | **validate(obj)** valid | 17ns | 115ns | **ata 6.8x faster** |
24
- | **validate(obj)** invalid | 59ns | 194ns | **ata 3.3x faster** |
25
- | **isValidObject(obj)** | 19ns | 124ns | **ata 6.6x faster** |
35
+ | **validate(obj)** valid | 19ns | 116ns | **ata 6.1x faster** |
36
+ | **validate(obj)** invalid | 62ns | 195ns | **ata 3.1x faster** |
37
+ | **isValidObject(obj)** | 18ns | 122ns | **ata 6.8x faster** |
26
38
 
27
39
  ### Cross-Schema `$ref` (multi-schema with `$id` registry)
28
40
 
29
41
  | Scenario | ata | ajv | |
30
42
  |---|---|---|---|
31
- | **validate(obj)** valid | 17ns | 25ns | **ata 1.5x faster** |
32
- | **validate(obj)** invalid | 34ns | 54ns | **ata 1.6x faster** |
43
+ | **validate(obj)** valid | 13ns | 25ns | **ata 2.0x faster** |
44
+ | **validate(obj)** invalid | 28ns | 56ns | **ata 2.0x faster** |
33
45
 
34
46
  > Measured with [mitata](https://github.com/evanwashere/mitata) on Apple M4 Pro (process-isolated). [Benchmark code](benchmark/bench_complex_mitata.mjs)
35
47
 
@@ -37,14 +49,14 @@ Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjs
37
49
 
38
50
  | Scenario | ata | ajv | |
39
51
  |---|---|---|---|
40
- | **Tier 1** (properties only) valid | 3.3ns | 8.7ns | **ata 2.6x faster** |
41
- | **Tier 1** invalid | 3.7ns | 19.1ns | **ata 5.2x faster** |
42
- | **Tier 2** (allOf) valid | 3.3ns | 9.9ns | **ata 3.0x faster** |
43
- | **Tier 3** (anyOf) valid | 6.7ns | 23.2ns | **ata 3.5x faster** |
44
- | **Tier 3** invalid | 7.1ns | 42.4ns | **ata 6.0x faster** |
45
- | **unevaluatedItems** valid | 1.0ns | 5.5ns | **ata 5.4x faster** |
46
- | **unevaluatedItems** invalid | 0.96ns | 14.2ns | **ata 14.8x faster** |
47
- | **Compilation** | 375ns | 2.59ms | **ata 6,904x faster** |
52
+ | **Tier 1** (properties only) valid | 3.3ns | 8.5ns | **ata 2.6x faster** |
53
+ | **Tier 1** invalid | 3.6ns | 18.6ns | **ata 5.2x faster** |
54
+ | **Tier 2** (allOf) valid | 3.3ns | 10.1ns | **ata 3.0x faster** |
55
+ | **Tier 3** (anyOf) valid | 6.7ns | 22.9ns | **ata 3.4x faster** |
56
+ | **Tier 3** invalid | 7.5ns | 41.8ns | **ata 5.6x faster** |
57
+ | **unevaluatedItems** valid | 0.97ns | 5.4ns | **ata 5.6x faster** |
58
+ | **unevaluatedItems** invalid | 0.99ns | 14.9ns | **ata 15.0x faster** |
59
+ | **Compilation** | 8.8ns | 2.64ms | **ata 298,000x faster** |
48
60
 
49
61
  Three-tier hybrid codegen: static schemas compile to zero-overhead key checks, dynamic schemas (anyOf/oneOf) use bitmask tracking with V8-inlined branch functions. [Benchmark code](benchmark/bench_unevaluated_mitata.mjs)
50
62
 
@@ -52,20 +64,21 @@ Three-tier hybrid codegen: static schemas compile to zero-overhead key checks, d
52
64
 
53
65
  | Scenario | ata | ajv | typebox | zod | valibot |
54
66
  |---|---|---|---|---|---|
55
- | **validate (valid)** | **9ns** | 38ns | 50ns | 334ns | 326ns |
56
- | **validate (invalid)** | **37ns** | 103ns | 4ns | 11.8μs | 842ns |
57
- | **compilation** | **453ns** | 1.24ms | 52μs | | |
58
- | **first validation** | **2.1μs** | 1.11ms | 54μs | | |
67
+ | **validate (valid)** | **7ns** | 38ns | 50ns | 342ns | 337ns |
68
+ | **validate (invalid, all errors)** | **38ns** | 102ns | n/a | 11.9μs | 855ns |
69
+ | **isValid (invalid, boolean)** | **0.93ns** | 16ns | 2.3ns | n/a | n/a |
70
+ | **compilation** | **9ns** | 1.20ms | 53μs | n/a | n/a |
71
+ | **first validation** | **16ns** | 1.16ms | 54μs | n/a | n/a |
59
72
 
60
- > Different categories: ata/ajv/typebox are JSON Schema validators, zod/valibot are schema-builder DSLs. [Benchmark code](benchmark/bench_all_mitata.mjs)
73
+ > Different categories: ata/ajv/typebox are JSON Schema validators, zod/valibot are schema-builder DSLs. The two invalid-path rows compare different units of work — `validate(invalid, all errors)` walks the full schema and builds an errors array (apples-to-apples vs ajv `{allErrors: true}`), while `isValid(invalid, boolean)` returns false on the first failed check (apples-to-apples vs typebox `Check()` and ajv `{allErrors: false}`). Reading both rows together avoids the trap of comparing a full error walk against a first-fail boolean. [Benchmark code](benchmark/bench_all_mitata.mjs)
61
74
 
62
75
  ### Large Data - JS Object Validation
63
76
 
64
77
  | Size | ata | ajv | |
65
78
  |---|---|---|---|
66
- | 10 users (2KB) | 6.2M ops/sec | 2.5M ops/sec | **ata 2.5x faster** |
67
- | 100 users (20KB) | 658K ops/sec | 243K ops/sec | **ata 2.7x faster** |
68
- | 1,000 users (205KB) | 64K ops/sec | 23.5K ops/sec | **ata 2.7x faster** |
79
+ | 10 users (2KB) | 6.0M ops/sec | 2.4M ops/sec | **ata 2.5x faster** |
80
+ | 100 users (20KB) | 621K ops/sec | 229K ops/sec | **ata 2.7x faster** |
81
+ | 1,000 users (205KB) | 63K ops/sec | 22.5K ops/sec | **ata 2.8x faster** |
69
82
 
70
83
  ### Real-World Scenarios
71
84
 
@@ -93,10 +106,10 @@ Three-tier hybrid codegen: static schemas compile to zero-overhead key checks, d
93
106
  | Scenario | ata | ajv | |
94
107
  |---|---|---|---|
95
108
  | **$dynamicRef tree** valid | 22ns | 54ns | **ata 2.4x faster** |
96
- | **$dynamicRef tree** invalid | 70ns | 76ns | **ata 1.1x faster** |
97
- | **$dynamicRef override** valid | 2.6ns | 183ns | **ata 70x faster** |
98
- | **$dynamicRef override** invalid | 48ns | 185ns | **ata 3.8x faster** |
99
- | **$anchor array** valid | 2.3ns | 3.1ns | **ata 1.4x faster** |
109
+ | **$dynamicRef tree** invalid | 71ns | 77ns | **ata 1.1x faster** |
110
+ | **$dynamicRef override** valid | 2.6ns | 187ns | **ata 71x faster** |
111
+ | **$dynamicRef override** invalid | 50ns | 189ns | **ata 3.8x faster** |
112
+ | **$anchor array** valid | 2.2ns | 3.2ns | **ata 1.4x faster** |
100
113
 
101
114
  Self-recursive named functions for $dynamicRef, compile-time cross-schema resolution, zero-wrapper hybrid path. [Benchmark code](benchmark/bench_dynamicref_vs_ajv.mjs)
102
115
 
@@ -106,7 +119,7 @@ Self-recursive named functions for $dynamicRef, compile-time cross-schema resolu
106
119
 
107
120
  ## When to use ata
108
121
 
109
- - **High-throughput `validate(obj)`** - 3.1x faster than ajv, 38x faster than zod
122
+ - **High-throughput `validate(obj)`** - 5.1x faster than ajv, 47x faster than zod
110
123
  - **Complex schemas** - `patternProperties`, `dependentSchemas`, `propertyNames`, `unevaluatedProperties` all inline JS codegen
111
124
  - **Multi-schema projects** - cross-schema `$ref` with `$id` registry, `addSchema()` API
112
125
  - **Draft 7 migration** - auto-detects `$schema`, normalizes Draft 7 keywords transparently
@@ -219,7 +232,7 @@ const v = new Validator(schema, {
219
232
 
220
233
  ### Build-time compile (`ata compile`)
221
234
 
222
- The `ata` CLI turns a JSON Schema file into a self-contained JavaScript module. No runtime dependency on `ata-validator`, so only the generated validator ships to the browser typical output is ~1 KB gzipped compared to ~27 KB for the full runtime.
235
+ The `ata` CLI turns a JSON Schema file into a self-contained JavaScript module. No runtime dependency on `ata-validator`, so only the generated validator ships to the browser. Typical output is ~1 KB gzipped compared to ~27 KB for the full runtime.
223
236
 
224
237
  ```bash
225
238
  npx ata compile schemas/user.json -o src/generated/user.validator.mjs
@@ -314,6 +327,24 @@ auto result = ata::validate(schema, R"({"name": "Mert"})");
314
327
  // result.valid == true
315
328
  ```
316
329
 
330
+ ## Framework integrations
331
+
332
+ Copy-paste recipes for the common frameworks. Most need 10-20 lines of glue. See [docs/integrations](docs/integrations/) for the full set.
333
+
334
+ | Framework | Pattern | Recipe |
335
+ |---|---|---|
336
+ | Fastify | dedicated plugin | [`fastify-ata`](https://github.com/ata-core/fastify-ata) |
337
+ | Vite (build-time compile) | dedicated plugin | [`ata-vite`](https://github.com/ata-core/ata-vite) |
338
+ | Hono | async middleware | [docs/integrations/hono.md](docs/integrations/hono.md) |
339
+ | Elysia | direct handler check | [docs/integrations/elysia.md](docs/integrations/elysia.md) |
340
+ | tRPC | Standard Schema V1 input | [docs/integrations/trpc.md](docs/integrations/trpc.md) |
341
+ | TanStack Form | Standard Schema V1 validator | [docs/integrations/tanstack-form.md](docs/integrations/tanstack-form.md) |
342
+ | Express | sync middleware | [docs/integrations/express.md](docs/integrations/express.md) |
343
+ | Koa | async ctx middleware | [docs/integrations/koa.md](docs/integrations/koa.md) |
344
+ | NestJS | validation pipe | [docs/integrations/nestjs.md](docs/integrations/nestjs.md) |
345
+ | SvelteKit | form action, API route | [docs/integrations/sveltekit.md](docs/integrations/sveltekit.md) |
346
+ | Astro | API route, server action | [docs/integrations/astro.md](docs/integrations/astro.md) |
347
+
317
348
  ## Supported Keywords
318
349
 
319
350
  | Category | Keywords |
@@ -973,15 +973,27 @@ function tryGenCombined(schema, access, ctx) {
973
973
 
974
974
  if (t === 'string') {
975
975
  if (schema.pattern || schema.format) return null
976
- // Both bounds set: hoist _cpLen once so ASCII strings are not scanned twice.
976
+ // s.length is an upper bound on cpLen and at least cpLen / 2 (worst case
977
+ // all-surrogate). Use s.length fast paths and only call _cpLen in the
978
+ // uncertain band; ASCII strings (>99% of real data) skip _cpLen entirely.
977
979
  if (schema.minLength !== undefined && schema.maxLength !== undefined) {
980
+ const M = schema.minLength
981
+ const X = schema.maxLength
978
982
  const v2 = isIdent ? access : '_v'
979
983
  const prelude = isIdent ? '' : `const _v=${access};`
980
- return `{${prelude}if(typeof ${v2}!=='string')return false;const _lv=_cpLen(${v2});if(_lv<${schema.minLength}||_lv>${schema.maxLength})return false}`
984
+ return `{${prelude}if(typeof ${v2}!=='string')return false;const _lv=${v2}.length;if(_lv<${M}||_lv>${X * 2})return false;if(_lv<${M * 2}||_lv>${X}){const _cp=_cpLen(${v2});if(_cp<${M}||_cp>${X})return false}}`
981
985
  }
982
986
  const conds = [`typeof _v!=='string'`]
983
- if (schema.minLength !== undefined) conds.push(`_cpLen(_v)<${schema.minLength}`)
984
- if (schema.maxLength !== undefined) conds.push(`_cpLen(_v)>${schema.maxLength}`)
987
+ if (schema.minLength !== undefined) {
988
+ const M = schema.minLength
989
+ conds.push(`_v.length<${M}`)
990
+ conds.push(`_v.length<${M * 2}&&_cpLen(_v)<${M}`)
991
+ }
992
+ if (schema.maxLength !== undefined) {
993
+ const X = schema.maxLength
994
+ conds.push(`_v.length>${X * 2}`)
995
+ conds.push(`_v.length>${X}&&_cpLen(_v)>${X}`)
996
+ }
985
997
  if (conds.length < 2) return null
986
998
  return bind(conds)
987
999
  }
@@ -1235,20 +1247,32 @@ function genCode(schema, v, lines, ctx, knownType) {
1235
1247
  if (schema.exclusiveMaximum !== undefined) lines.push(isNum ? `if(${v}>=${schema.exclusiveMaximum})return false` : `if(typeof ${v}==='number'&&${v}>=${schema.exclusiveMaximum})return false`)
1236
1248
  if (schema.multipleOf !== undefined) lines.push(isNum ? `if(${v}%${schema.multipleOf}!==0)return false` : `if(typeof ${v}==='number'&&${v}%${schema.multipleOf}!==0)return false`)
1237
1249
 
1238
- // string — skip type guard if known string. When both bounds are set, call
1239
- // _cpLen once and compare the cached result so ASCII strings do not get
1240
- // scanned twice.
1250
+ // string length — skip type guard if known string.
1251
+ // s.length (UTF-16 code units) is an upper bound on cpLen, and at least cpLen
1252
+ // (worst case all surrogate pairs gives s.length = 2 * cpLen). So:
1253
+ // length < M → certain fail minLength
1254
+ // length > 2*X → certain fail maxLength
1255
+ // 2*M <= length <= X → certain pass both
1256
+ // Only call _cpLen in the uncertain band. ASCII strings (>99% of real data)
1257
+ // never enter the band.
1241
1258
  if (schema.minLength !== undefined && schema.maxLength !== undefined) {
1259
+ const M = schema.minLength
1260
+ const X = schema.maxLength
1242
1261
  const li = ctx.varCounter++
1243
1262
  const lv = `_l${li}`
1244
- if (isStr) {
1245
- lines.push(`{const ${lv}=_cpLen(${v});if(${lv}<${schema.minLength}||${lv}>${schema.maxLength})return false}`)
1246
- } else {
1247
- lines.push(`if(typeof ${v}==='string'){const ${lv}=_cpLen(${v});if(${lv}<${schema.minLength}||${lv}>${schema.maxLength})return false}`)
1248
- }
1263
+ const body = `{const ${lv}=${v}.length;if(${lv}<${M}||${lv}>${X * 2})return false;if(${lv}<${M * 2}||${lv}>${X}){const _cp=_cpLen(${v});if(_cp<${M}||_cp>${X})return false}}`
1264
+ lines.push(isStr ? body : `if(typeof ${v}==='string')${body}`)
1249
1265
  } else {
1250
- if (schema.minLength !== undefined) lines.push(isStr ? `if(_cpLen(${v})<${schema.minLength})return false` : `if(typeof ${v}==='string'&&_cpLen(${v})<${schema.minLength})return false`)
1251
- if (schema.maxLength !== undefined) lines.push(isStr ? `if(_cpLen(${v})>${schema.maxLength})return false` : `if(typeof ${v}==='string'&&_cpLen(${v})>${schema.maxLength})return false`)
1266
+ if (schema.minLength !== undefined) {
1267
+ const M = schema.minLength
1268
+ const body = `if(${v}.length<${M})return false;if(${v}.length<${M * 2}&&_cpLen(${v})<${M})return false`
1269
+ lines.push(isStr ? body : `if(typeof ${v}==='string'){${body}}`)
1270
+ }
1271
+ if (schema.maxLength !== undefined) {
1272
+ const X = schema.maxLength
1273
+ const body = `if(${v}.length>${X * 2})return false;if(${v}.length>${X}&&_cpLen(${v})>${X})return false`
1274
+ lines.push(isStr ? body : `if(typeof ${v}==='string'){${body}}`)
1275
+ }
1252
1276
  }
1253
1277
 
1254
1278
  // array size — skip guard if known array
@@ -2199,9 +2223,20 @@ function compilePatternInline(pattern, varName) {
2199
2223
  // Match: ^[chars]{exact}$ — e.g., ^[0-9]{5}$
2200
2224
  let m = pattern.match(/^\^(\[[\w\-]+\])\{(\d+)\}\$$/)
2201
2225
  if (m) {
2226
+ const len = parseInt(m[2])
2227
+ // For small fixed-length patterns, fully unroll: avoids the per-call closure
2228
+ // allocation of the IIFE form. Cap at 16 chars to keep emitted code small.
2229
+ if (len <= 16) {
2230
+ const checks = []
2231
+ for (let i = 0; i < len; i++) {
2232
+ const ck = charClassToCheck(m[1], `${varName}.charCodeAt(${i})`)
2233
+ if (!ck) return null
2234
+ checks.push(ck)
2235
+ }
2236
+ return `${varName}.length===${len}&&${checks.join('&&')}`
2237
+ }
2202
2238
  const rangeCheck = charClassToCheck(m[1], `${varName}.charCodeAt(_pi)`)
2203
2239
  if (!rangeCheck) return null
2204
- const len = parseInt(m[2])
2205
2240
  return `${varName}.length===${len}&&(()=>{for(let _pi=0;_pi<${len};_pi++){if(!(${rangeCheck}))return false}return true})()`
2206
2241
  }
2207
2242
  // Match: ^[chars]+$ — e.g., ^[a-z]+$
@@ -2539,14 +2574,21 @@ function genCodeE(schema, v, pathExpr, lines, ctx, schemaPrefix) {
2539
2574
  lines.push(`{const _r${ci}=typeof ${v}==='number'?${v}%${m}:NaN;if(typeof ${v}==='number'&&Math.abs(_r${ci})>1e-8&&Math.abs(_r${ci}-${m})>1e-8){${fail('multipleOf', 'multipleOf', `{multipleOf:${m}}`, `'must be multiple of ${m}'`)}}}`)
2540
2575
  }
2541
2576
 
2542
- // string
2577
+ // string length — same s.length fast paths as the boolean codegen above.
2578
+ // length < M → certain fail; length > 2*X → certain fail; sweet spot
2579
+ // 2*M <= length <= X passes without scanning. Only call _cpLen in the
2580
+ // uncertain band (caches it so it isn't called twice when both bounds are set).
2543
2581
  if (schema.minLength !== undefined) {
2544
- const c = isStr ? `_cpLen(${v})<${schema.minLength}` : `typeof ${v}==='string'&&_cpLen(${v})<${schema.minLength}`
2545
- lines.push(`if(${c}){${fail('minLength', 'minLength', `{limit:${schema.minLength}}`, `'must NOT have fewer than ${schema.minLength} characters'`)}}`)
2582
+ const M = schema.minLength
2583
+ const inner = `${v}.length<${M}||(${v}.length<${M * 2}&&_cpLen(${v})<${M})`
2584
+ const c = isStr ? inner : `typeof ${v}==='string'&&(${inner})`
2585
+ lines.push(`if(${c}){${fail('minLength', 'minLength', `{limit:${M}}`, `'must NOT have fewer than ${M} characters'`)}}`)
2546
2586
  }
2547
2587
  if (schema.maxLength !== undefined) {
2548
- const c = isStr ? `_cpLen(${v})>${schema.maxLength}` : `typeof ${v}==='string'&&_cpLen(${v})>${schema.maxLength}`
2549
- lines.push(`if(${c}){${fail('maxLength', 'maxLength', `{limit:${schema.maxLength}}`, `'must NOT have more than ${schema.maxLength} characters'`)}}`)
2588
+ const X = schema.maxLength
2589
+ const inner = `${v}.length>${X * 2}||(${v}.length>${X}&&_cpLen(${v})>${X})`
2590
+ const c = isStr ? inner : `typeof ${v}==='string'&&(${inner})`
2591
+ lines.push(`if(${c}){${fail('maxLength', 'maxLength', `{limit:${X}}`, `'must NOT have more than ${X} characters'`)}}`)
2550
2592
  }
2551
2593
  if (schema.pattern) {
2552
2594
  const inlineCheck = compilePatternInline(schema.pattern, v)
@@ -3081,9 +3123,19 @@ function genCodeC(schema, v, pathExpr, lines, ctx, schemaPrefix) {
3081
3123
  lines.push(`{const _r${ci}=typeof ${v}==='number'?${v}%${m}:NaN;if(typeof ${v}==='number'&&Math.abs(_r${ci})>1e-8&&Math.abs(_r${ci}-${m})>1e-8){${fail('multipleOf', 'multipleOf', `{multipleOf:${m}}`, `'must be multiple of ${m}'`)}}}`)
3082
3124
  }
3083
3125
 
3084
- // string — skip guard if known
3085
- if (schema.minLength !== undefined) { const c = isStr ? `_cpLen(${v})<${schema.minLength}` : `typeof ${v}==='string'&&_cpLen(${v})<${schema.minLength}`; lines.push(`if(${c}){${fail('minLength', 'minLength', `{limit:${schema.minLength}}`, `'must NOT have fewer than ${schema.minLength} characters'`)}}`) }
3086
- if (schema.maxLength !== undefined) { const c = isStr ? `_cpLen(${v})>${schema.maxLength}` : `typeof ${v}==='string'&&_cpLen(${v})>${schema.maxLength}`; lines.push(`if(${c}){${fail('maxLength', 'maxLength', `{limit:${schema.maxLength}}`, `'must NOT have more than ${schema.maxLength} characters'`)}}`) }
3126
+ // string length s.length fast paths, _cpLen only in the uncertain band.
3127
+ if (schema.minLength !== undefined) {
3128
+ const M = schema.minLength
3129
+ const inner = `${v}.length<${M}||(${v}.length<${M * 2}&&_cpLen(${v})<${M})`
3130
+ const c = isStr ? inner : `typeof ${v}==='string'&&(${inner})`
3131
+ lines.push(`if(${c}){${fail('minLength', 'minLength', `{limit:${M}}`, `'must NOT have fewer than ${M} characters'`)}}`)
3132
+ }
3133
+ if (schema.maxLength !== undefined) {
3134
+ const X = schema.maxLength
3135
+ const inner = `${v}.length>${X * 2}||(${v}.length>${X}&&_cpLen(${v})>${X})`
3136
+ const c = isStr ? inner : `typeof ${v}==='string'&&(${inner})`
3137
+ lines.push(`if(${c}){${fail('maxLength', 'maxLength', `{limit:${X}}`, `'must NOT have more than ${X} characters'`)}}`)
3138
+ }
3087
3139
  if (schema.pattern) {
3088
3140
  const inlineCheck = compilePatternInline(schema.pattern, v)
3089
3141
  if (inlineCheck) {
package/lib/ts-gen.js CHANGED
@@ -56,6 +56,28 @@ function renderValueType(schema, defs, depth = 0) {
56
56
 
57
57
  if (t === 'array') {
58
58
  const items = schema.items;
59
+ const prefix = Array.isArray(schema.prefixItems) ? schema.prefixItems : null;
60
+
61
+ if (prefix) {
62
+ const prefixTypes = prefix.map((s) => renderValueType(s, defs, depth + 1));
63
+ const minItems = typeof schema.minItems === 'number' ? schema.minItems : 0;
64
+ // Elements before minItems are required; the remainder are optional
65
+ // because JSON Schema does not require prefixItems to be present.
66
+ const elements = prefixTypes.map((t, i) => (i < minItems ? t : `${t}?`));
67
+ if (items === false) {
68
+ return `[${elements.join(', ')}]`;
69
+ }
70
+ if (items === undefined || items === true) {
71
+ return `[${elements.join(', ')}, ...unknown[]]`;
72
+ }
73
+ if (typeof items === 'object' && items !== null) {
74
+ const rest = renderValueType(items, defs, depth + 1);
75
+ const restType = rest.includes(' | ') ? `(${rest})` : rest;
76
+ return `[${elements.join(', ')}, ...${restType}[]]`;
77
+ }
78
+ }
79
+
80
+ if (items === false) return 'never[]';
59
81
  if (items === undefined || items === true) return 'unknown[]';
60
82
  const inner = renderValueType(items, defs, depth + 1);
61
83
  return inner.includes(' | ') ? `Array<${inner}>` : `${inner}[]`;
@@ -84,15 +106,30 @@ function renderObject(schema, defs, depth) {
84
106
  const t = renderValueType(props[k], defs, depth + 1);
85
107
  const opt = required.has(k) ? '' : '?';
86
108
  const safeKey = /^[A-Za-z_$][\w$]*$/.test(k) ? k : JSON.stringify(k);
87
- const desc = typeof props[k] === 'object' && props[k] && typeof props[k].description === 'string'
88
- ? ` /** ${props[k].description.replace(/\*\//g, '* /')} */\n`
89
- : '';
90
- return `${desc} ${safeKey}${opt}: ${t};`;
109
+ const doc = renderJsDoc(props[k], ' ');
110
+ return `${doc} ${safeKey}${opt}: ${t};`;
91
111
  });
92
112
  // extra keys when additionalProperties is present as a schema or true
93
113
  const extra = schema.additionalProperties;
94
114
  if (extra && typeof extra === 'object') {
95
- lines.push(` [key: string]: ${renderValueType(extra, defs, depth + 1)};`);
115
+ // TypeScript requires the index signature to be a supertype of every
116
+ // named property's emitted type. Widen to a union covering each property
117
+ // type, plus undefined when any property is optional.
118
+ const widen = new Set();
119
+ widen.add(renderValueType(extra, defs, depth + 1));
120
+ let hasOptional = false;
121
+ for (const k of keys) {
122
+ widen.add(renderValueType(props[k], defs, depth + 1));
123
+ if (!required.has(k)) hasOptional = true;
124
+ }
125
+ if (hasOptional) widen.add('undefined');
126
+ const indexType = widen.has('unknown') ? 'unknown' : Array.from(widen).join(' | ');
127
+ lines.push(` [key: string]: ${indexType};`);
128
+ } else if (extra !== false) {
129
+ // JSON Schema accepts extra keys by default. Emit a permissive index
130
+ // signature so tsc does not reject excess properties that the runtime
131
+ // would consider valid.
132
+ lines.push(` [key: string]: unknown;`);
96
133
  }
97
134
  return `{\n${lines.join('\n')}\n}`;
98
135
  }
@@ -106,10 +143,54 @@ function renderLiteral(v) {
106
143
 
107
144
  function toTypeName(name) {
108
145
  const cleaned = String(name).replace(/[^A-Za-z0-9_]/g, '_');
146
+ if (cleaned === '') return '_Anon';
109
147
  if (/^[0-9]/.test(cleaned)) return `_${cleaned}`;
110
148
  return cleaned.charAt(0).toUpperCase() + cleaned.slice(1);
111
149
  }
112
150
 
151
+ // Build a JSDoc block that captures the description plus any runtime-only
152
+ // constraints the TypeScript type cannot express (minLength, format, range,
153
+ // etc.). Editors and TypeDoc surface these on hover, so authors can see what
154
+ // the schema requires even though tsc does not enforce it.
155
+ function renderJsDoc(schema, indent) {
156
+ if (!schema || typeof schema !== 'object') return '';
157
+
158
+ let description = '';
159
+ if (typeof schema.description === 'string' && schema.description.length > 0) {
160
+ description = schema.description.replace(/\*\//g, '* /');
161
+ }
162
+
163
+ const tags = [];
164
+ const numKeys = ['minLength', 'maxLength', 'minItems', 'maxItems', 'minProperties', 'maxProperties',
165
+ 'minimum', 'maximum', 'exclusiveMinimum', 'exclusiveMaximum', 'multipleOf'];
166
+ for (const k of numKeys) {
167
+ if (typeof schema[k] === 'number') tags.push(`@${k} ${schema[k]}`);
168
+ }
169
+ if (typeof schema.pattern === 'string') tags.push(`@pattern ${schema.pattern}`);
170
+ if (typeof schema.format === 'string') tags.push(`@format ${schema.format}`);
171
+ if (schema.uniqueItems === true) tags.push('@uniqueItems');
172
+ if (schema.deprecated === true) tags.push('@deprecated');
173
+ if (schema.default !== undefined) {
174
+ try { tags.push(`@default ${JSON.stringify(schema.default)}`); } catch (_) {}
175
+ }
176
+ if (Array.isArray(schema.examples) && schema.examples.length > 0) {
177
+ try { tags.push(`@example ${JSON.stringify(schema.examples[0])}`); } catch (_) {}
178
+ }
179
+
180
+ if (description === '' && tags.length === 0) return '';
181
+
182
+ if (description !== '' && tags.length === 0) {
183
+ return `${indent}/** ${description} */\n`;
184
+ }
185
+
186
+ const lines = [`${indent}/**`];
187
+ if (description !== '') lines.push(`${indent} * ${description}`);
188
+ if (description !== '' && tags.length > 0) lines.push(`${indent} *`);
189
+ for (const t of tags) lines.push(`${indent} * ${t}`);
190
+ lines.push(`${indent} */`);
191
+ return lines.join('\n') + '\n';
192
+ }
193
+
113
194
  // Public: given a schema and optional type name, return a .d.ts source.
114
195
  function toTypeScript(schema, opts) {
115
196
  const options = opts || {};
@@ -125,13 +206,14 @@ function toTypeScript(schema, opts) {
125
206
  }
126
207
 
127
208
  const rootType = renderValueType(schema, defs, 0);
209
+ const rootDoc = renderJsDoc(schema, '');
128
210
  // Use `interface` only for a pure object literal; otherwise fall back to
129
211
  // `type`. Catches cases like `{...}[]` (array of object) and `Record<...>`
130
212
  // which are valid TS but cannot be expressed as an interface body.
131
213
  const isPureObjectLiteral = rootType.startsWith('{') && rootType.endsWith('}') && !rootType.includes(' | ');
132
214
  const rootDecl = isPureObjectLiteral
133
- ? `export interface ${rootName} ${rootType}`
134
- : `export type ${rootName} = ${rootType};`;
215
+ ? `${rootDoc}export interface ${rootName} ${rootType}`
216
+ : `${rootDoc}export type ${rootName} = ${rootType};`;
135
217
 
136
218
  return `// Auto-generated by ata-validator — do not edit.
137
219
  ${defLines.length ? defLines.join('\n\n') + '\n\n' : ''}${rootDecl}
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "ata-validator",
3
- "version": "0.11.1",
4
- "description": "Ultra-fast JSON Schema validator. 4.7x faster validation, 1,800x faster compilation. Works without native addon. Cross-schema $ref, Draft 2020-12 + Draft 7, V8-optimized JS codegen, simdjson, RE2, multi-core. Standard Schema V1 compatible.",
3
+ "version": "0.12.1",
4
+ "description": "Ultra-fast JSON Schema validator. 5x faster validation, 159,000x faster compilation. Works without native addon. Cross-schema $ref, Draft 2020-12 + Draft 7, V8-optimized JS codegen, simdjson, RE2, multi-core. Standard Schema V1 compatible.",
5
5
  "main": "index.js",
6
6
  "module": "index.mjs",
7
7
  "types": "index.d.ts",
@@ -41,6 +41,8 @@
41
41
  "test:standard-schema": "node tests/test_standard_schema.js",
42
42
  "test:browser": "node tests/test_browser.js",
43
43
  "test:ts": "node tests/test_ts_gen.js",
44
+ "test:ts-corpus": "node tests/test_ts_corpus.js",
45
+ "test:ts-differential": "node tests/test_ts_differential.js",
44
46
  "bench": "node benchmark/bench_large.js",
45
47
  "fuzz": "node tests/fuzz_differential.js",
46
48
  "fuzz:long": "FUZZ_ITERATIONS=100000 node tests/fuzz_differential.js",