ata-validator 0.12.6 → 0.13.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,23 @@
1
+ # Changelog
2
+
3
+ All notable changes to ata-validator are documented here. The format follows [Keep a Changelog](https://keepachangelog.com/), and this project adheres to semantic versioning.
4
+
5
+ ## 0.13.0 — 2026-05-09
6
+
7
+ ### Added
8
+
9
+ - **`ata build <glob>`** subcommand for project-wide AOT compilation. Compiles each matched schema to a per-file `.compiled.mjs` ESM module with a sibling `.d.mts` TypeScript declaration. Production bundles can drop the runtime ata-validator dependency entirely and import compiled validators as plain ESM modules.
10
+ - **`ata-validator/build` programmatic subpath export.** `import { build, watch } from 'ata-validator/build'` exposes the same engine the CLI uses, so build pipelines and bundler plugins can integrate without going through the CLI.
11
+ - **CLI flags for `ata build`:** `--out-dir`, `--suffix`, `--format esm|cjs`, `--abort-early`, `--no-types`, `--cache-file`, `--check`, `--watch`, `--max-size`, `--strict`.
12
+ - **Incremental cache** via content-hashed `--cache-file`. Second run on unchanged inputs skips compilation.
13
+ - **YAML schema support** when the `yaml` peer dependency is installed (optional). `.yaml` and `.yml` inputs parse the same as `.json`.
14
+ - **AOT vs AJV-runtime benchmark** at `benchmark/bench_aot_vs_ajv.mjs`. On the included fixtures, ata-AOT outputs are 25-56x smaller gzipped than the AJV runtime, cold start is ~2x faster, throughput is 2-4.5x faster, and compile time is 71-246x shorter.
15
+
16
+ ### Fixed
17
+
18
+ - **Standalone modules now correctly serialize closure-bound helpers** (RegExp, Set, sub-validator functions, branch-property arrays) into the emitted `.mjs`. Previously, schemas using `patternProperties`, `propertyNames` with regex, or `unevaluatedProperties` with `anyOf`/`oneOf` produced standalone output that referenced undefined variables (`_ppf0_0`, `_re*`, `_es*`, `_bk*`) and threw `ReferenceError` at runtime. The runtime validation path was unaffected.
19
+
20
+ ### Notes
21
+
22
+ - The runtime `Validator` API and the `ata-validator/compat` AJV-shim remain unchanged. Existing dynamic-schema users have no migration to do.
23
+ - Bundler plugins (ata-vite v0.2.0, ata-webpack, ata-codemod-ajv) are out of scope for this release and will land in 0.14.0+.
package/README.md CHANGED
@@ -1,161 +1,57 @@
1
- <p align="center">
2
- <img src="./assets/ata-validator.svg" alt="ata-validator" width="640" />
3
- </p>
4
-
5
1
  # ata-validator
6
2
 
7
- Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjson/simdjson). Multi-core parallel validation, RE2 regex, codegen bytecode engine. Standard Schema V1 compatible.
8
-
9
- **[ata-validator.com](https://ata-validator.com)** | **[API Docs](docs/API.md)** | **[Migrate from ajv](docs/migration-from-ajv.md)** | **[Framework integrations](docs/integrations/)** | **[Contributing](CONTRIBUTING.md)**
10
-
11
- ## Performance
12
-
13
- ### Simple Schema (7 properties, type + format + range + nested object)
14
-
15
- | Scenario | ata | ajv | |
16
- |---|---|---|---|
17
- | **validate(obj)** valid | 21ns | 108ns | **ata 5.1x faster** |
18
- | **validate(obj)** invalid | 86ns | 104ns | **ata 1.2x faster** |
19
- | **isValidObject(obj)** | 20ns | 109ns | **ata 5.4x faster** |
20
- | **Schema instantiation** (lazy compile) | 8ns | 1.33ms | **ata 159,000x faster** |
21
- | **First validation** (compile + validate) | 28ns | 1.21ms | **ata 43,000x faster** |
22
-
23
- > **Honest read of the three rows above:**
24
- >
25
- > - **Hot loop** (millions of `validate(obj)` calls on a warm validator): ata is **~5× faster** than ajv. This is the steady-state advantage and what most apps care about most of the time.
26
- > - **Cold start** (construct + first validate, apples-to-apples vs `ajv.compile(schema) + validate(obj)`): ata is **~43,000× faster**. Matters for serverless cold starts, CLI tools, batch workers — anywhere you instantiate a schema and exercise it once or a few times.
27
- > - **Instantiation only** (`new Validator(schema)` with no validation yet): ata is **~159,000× faster**, but only because ata defers codegen to first use (lazy compile + a tier-0 interpreter for low-traffic schemas). The number is real but it is constructor cost vs ajv's full compile cost — not the same unit of work. Quote it carefully.
28
- >
29
- > The lazy compile architecture is also why an instantiated-but-never-validated schema is essentially free in ata, while in ajv it costs the full compile. That's the underlying real win, beyond the multiplier above.
30
-
31
- ### Complex Schema (patternProperties + dependentSchemas + propertyNames + additionalProperties)
32
-
33
- | Scenario | ata | ajv | |
34
- |---|---|---|---|
35
- | **validate(obj)** valid | 19ns | 116ns | **ata 6.1x faster** |
36
- | **validate(obj)** invalid | 62ns | 195ns | **ata 3.1x faster** |
37
- | **isValidObject(obj)** | 18ns | 122ns | **ata 6.8x faster** |
38
-
39
- ### Cross-Schema `$ref` (multi-schema with `$id` registry)
40
-
41
- | Scenario | ata | ajv | |
42
- |---|---|---|---|
43
- | **validate(obj)** valid | 13ns | 25ns | **ata 2.0x faster** |
44
- | **validate(obj)** invalid | 28ns | 56ns | **ata 2.0x faster** |
45
-
46
- > Measured with [mitata](https://github.com/evanwashere/mitata) on Apple M4 Pro (process-isolated). [Benchmark code](benchmark/bench_complex_mitata.mjs)
47
-
48
- ### unevaluatedProperties / unevaluatedItems
49
-
50
- | Scenario | ata | ajv | |
51
- |---|---|---|---|
52
- | **Tier 1** (properties only) valid | 3.3ns | 8.5ns | **ata 2.6x faster** |
53
- | **Tier 1** invalid | 3.6ns | 18.6ns | **ata 5.2x faster** |
54
- | **Tier 2** (allOf) valid | 3.3ns | 10.1ns | **ata 3.0x faster** |
55
- | **Tier 3** (anyOf) valid | 6.7ns | 22.9ns | **ata 3.4x faster** |
56
- | **Tier 3** invalid | 7.5ns | 41.8ns | **ata 5.6x faster** |
57
- | **unevaluatedItems** valid | 0.97ns | 5.4ns | **ata 5.6x faster** |
58
- | **unevaluatedItems** invalid | 0.99ns | 14.9ns | **ata 15.0x faster** |
59
- | **Compilation** | 8.8ns | 2.64ms | **ata 298,000x faster** |
60
-
61
- Three-tier hybrid codegen: static schemas compile to zero-overhead key checks, dynamic schemas (anyOf/oneOf) use bitmask tracking with V8-inlined branch functions. [Benchmark code](benchmark/bench_unevaluated_mitata.mjs)
62
-
63
- ### vs Ecosystem (Zod, Valibot, TypeBox)
64
-
65
- | Scenario | ata | ajv | typebox | zod | valibot |
66
- |---|---|---|---|---|---|
67
- | **validate (valid)** | **7ns** | 38ns | 50ns | 342ns | 337ns |
68
- | **validate (invalid, all errors)** | **38ns** | 102ns | n/a | 11.9μs | 855ns |
69
- | **isValid (invalid, boolean)** | **0.93ns** | 16ns | 2.3ns | n/a | n/a |
70
- | **compilation** | **9ns** | 1.20ms | 53μs | n/a | n/a |
71
- | **first validation** | **16ns** | 1.16ms | 54μs | n/a | n/a |
72
-
73
- > Different categories: ata/ajv/typebox are JSON Schema validators, zod/valibot are schema-builder DSLs. The two invalid-path rows compare different units of work — `validate(invalid, all errors)` walks the full schema and builds an errors array (apples-to-apples vs ajv `{allErrors: true}`), while `isValid(invalid, boolean)` returns false on the first failed check (apples-to-apples vs typebox `Check()` and ajv `{allErrors: false}`). Reading both rows together avoids the trap of comparing a full error walk against a first-fail boolean. [Benchmark code](benchmark/bench_all_mitata.mjs)
74
-
75
- ### Large Data - JS Object Validation
76
-
77
- | Size | ata | ajv | |
78
- |---|---|---|---|
79
- | 10 users (2KB) | 6.0M ops/sec | 2.4M ops/sec | **ata 2.5x faster** |
80
- | 100 users (20KB) | 621K ops/sec | 229K ops/sec | **ata 2.7x faster** |
81
- | 1,000 users (205KB) | 63K ops/sec | 22.5K ops/sec | **ata 2.8x faster** |
3
+ Compile JSON Schema files into per-schema ESM modules at build time. Drop the runtime validator from your production bundle. Optional runtime API for dynamic schemas.
82
4
 
83
- ### Real-World Scenarios
5
+ [![npm](https://img.shields.io/npm/v/ata-validator)](https://www.npmjs.com/package/ata-validator)
6
+ [![License](https://img.shields.io/npm/l/ata-validator)](LICENSE)
84
7
 
85
- | Scenario | ata | ajv | |
86
- |---|---|---|---|
87
- | **Serverless cold start** (50 schemas) | 0.087ms | 3.67ms | **ata 42x faster** |
88
- | **ReDoS protection** (`^(a+)+$`) | 0.3ms | 765ms | **ata immune (RE2)** |
89
- | **Batch NDJSON** (10K items, multi-core) | 13.4M/sec | 5.1M/sec | **ata 2.6x faster** |
90
- | **Fastify startup** (5 routes) | 0.5ms | 6.0ms | **ata 12x faster** |
8
+ ## Quick start
91
9
 
92
- > Isolated single-schema benchmarks. Results vary by workload and hardware.
93
-
94
- ### How it works
95
-
96
- **Combined single-pass validator**: ata compiles schemas into a single function that validates and collects errors in one pass. Valid data returns `VALID_RESULT` with zero allocation. Invalid data collects errors inline with pre-allocated frozen error objects - no double validation, no try/catch (3.3x V8 deopt). Lazy compilation defers all work to first usage - constructor is near-zero cost.
97
-
98
- **JS codegen**: Schemas are compiled to monolithic JS functions (like ajv). Full keyword support including `patternProperties`, `dependentSchemas`, `propertyNames`, `unevaluatedProperties`, `unevaluatedItems`, cross-schema `$ref` with `$id` registry, and Draft 7 auto-detection. Three-tier hybrid approach for unevaluated keywords: compile-time resolution for static schemas, bitmask tracking for dynamic ones. charCodeAt prefix matching replaces regex for simple patterns (4x faster). Merged key iteration loops (patternProperties + propertyNames + additionalProperties in a single `for..in`).
99
-
100
- **V8 TurboFan optimizations**: Destructuring batch reads, `undefined` checks instead of `in` operator, context-aware type guard elimination, property hoisting to local variables, tiered uniqueItems (nested loop for small arrays), inline key comparison for small property sets (no Set.has overhead).
101
-
102
- **Adaptive simdjson**: For large documents (>8KB) with selective schemas, simdjson On Demand seeks only the needed fields - skipping irrelevant data at GB/s speeds.
103
-
104
- ### $dynamicRef / $dynamicAnchor / $anchor
10
+ ```bash
11
+ npm install --save-dev ata-validator
12
+ npx ata build 'schemas/*.json' --out-dir src/generated
13
+ ```
105
14
 
106
- | Scenario | ata | ajv | |
107
- |---|---|---|---|
108
- | **$dynamicRef tree** valid | 22ns | 54ns | **ata 2.4x faster** |
109
- | **$dynamicRef tree** invalid | 71ns | 77ns | **ata 1.1x faster** |
110
- | **$dynamicRef override** valid | 2.6ns | 187ns | **ata 71x faster** |
111
- | **$dynamicRef override** invalid | 50ns | 189ns | **ata 3.8x faster** |
112
- | **$anchor array** valid | 2.2ns | 3.2ns | **ata 1.4x faster** |
15
+ In your code:
113
16
 
114
- Self-recursive named functions for $dynamicRef, compile-time cross-schema resolution, zero-wrapper hybrid path. [Benchmark code](benchmark/bench_dynamicref_vs_ajv.mjs)
17
+ ```ts
18
+ import { validate, isValid, type User } from './generated/user.compiled.mjs'
115
19
 
116
- ### JSON Schema Test Suite
20
+ if (isValid(req.body)) {
21
+ const user: User = req.body
22
+ // ...
23
+ }
24
+ ```
117
25
 
118
- **98.5%** pass rate (1172/1190) on official [JSON Schema Test Suite](https://github.com/json-schema-org/JSON-Schema-Test-Suite) (Draft 2020-12), excluding remote refs and vocabulary (intentionally unsupported). **95.3%** on [@exodus/schemasafe](https://github.com/ExodusMovement/schemasafe) test suite.
26
+ The `.compiled.mjs` modules are self-contained: zero runtime dependency on ata-validator, fully tree-shakeable, with TypeScript types emitted alongside.
119
27
 
120
- ## When to use ata
28
+ ## Why AOT
121
29
 
122
- - **High-throughput `validate(obj)`** - 5.1x faster than ajv, 47x faster than zod
123
- - **Complex schemas** - `patternProperties`, `dependentSchemas`, `propertyNames`, `unevaluatedProperties` all inline JS codegen
124
- - **Multi-schema projects** - cross-schema `$ref` with `$id` registry, `addSchema()` API
125
- - **Draft 7 migration** - auto-detects `$schema`, normalizes Draft 7 keywords transparently
126
- - **Serverless / cold starts** - 6,904x faster compilation, 5,148x faster first validation
127
- - **Security-sensitive apps** - RE2 regex, immune to ReDoS attacks
128
- - **Batch/streaming validation** - NDJSON log processing, data pipelines (2.6x faster)
129
- - **Standard Schema V1** - native support for Fastify v5, tRPC, TanStack
130
- - **C/C++ embedding** - native library, no JS runtime needed
30
+ | Dimension | Schema | ata-AOT | AJV-runtime | Difference |
31
+ |---|---|---|---|---|
32
+ | Bundle (gzipped) | simple | 955 B | 52.7 KB | 56x smaller |
33
+ | Bundle (gzipped) | complex | 1.6 KB | 52.7 KB | 32x smaller |
34
+ | Cold start | simple | 21 ms | 38 ms | 1.8x faster |
35
+ | Throughput (10M ops) | simple | 345 Mops/s | 116 Mops/s | 3.0x faster |
36
+ | Compile time | simple | 6 µs | 1.5 ms | 246x faster |
131
37
 
132
- ## When to use ajv
38
+ Reproduce on your machine with `npm run bench:aot-vs-ajv`. Numbers measured on Apple M4 Pro, Node 25.2.1.
133
39
 
134
- - **Existing ajv ecosystem** - plugins, custom keywords, large community
135
- - **Full unevaluatedProperties/Items** - ata covers most cases but some edge cases remain
40
+ The wins are largest on bundle size and compile time because AOT moves work from runtime to build time. Throughput and cold start are also faster because the compiled validator is a tight straight-line function with no schema-walk overhead.
136
41
 
137
- ## Features
42
+ ## When to use the runtime API instead
138
43
 
139
- - **Hybrid validator**: 4.1x faster than ajv, up to 70x faster on $dynamicRef - zero-wrapper hybrid path for valid data (no allocation), combined codegen for error collection. Schema compilation cache for repeated schemas
140
- - **$dynamicRef / $dynamicAnchor / $anchor**: Full Draft 2020-12 dynamic reference support. Self-recursive named functions, compile-time cross-schema resolution (42/42 spec tests)
141
- - **Cross-schema `$ref`**: `schemas` option and `addSchema()` API. Compile-time resolution with `$id` registry, zero runtime overhead
142
- - **Draft 7 support**: Auto-detects `$schema` field, normalizes `dependencies`/`additionalItems`/`definitions` transparently
143
- - **Multi-core**: Parallel validation across all CPU cores - 13.4M validations/sec
144
- - **simdjson**: SIMD-accelerated JSON parsing at GB/s speeds, adaptive On Demand for large docs
145
- - **RE2 regex**: Linear-time guarantees, immune to ReDoS attacks (2391x faster on pathological input)
146
- - **V8-optimized codegen**: Destructuring batch reads, type guard elimination, property hoisting
147
- - **Standard Schema V1**: Compatible with Fastify, tRPC, TanStack, Drizzle
148
- - **Zero-copy paths**: Buffer and pre-padded input support - no unnecessary copies
149
- - **Defaults + coercion**: `default` values, `coerceTypes`, `removeAdditional` support
150
- - **C/C++ library**: Native API for non-Node.js environments
151
- - **98.5% spec compliant**: Draft 2020-12
44
+ `ata build` is for schemas you know at build time. If your schemas are user-supplied at runtime (form builders, no-code platforms, dynamic API ingestion), use the runtime API:
152
45
 
153
- ## Installation
46
+ ```js
47
+ import { Validator } from 'ata-validator'
154
48
 
155
- ```bash
156
- npm install ata-validator
49
+ const v = new Validator(schema)
50
+ const result = v.validate(data)
157
51
  ```
158
52
 
53
+ The runtime API is unchanged from previous releases. AJV-shim users continue importing from `ata-validator/compat`.
54
+
159
55
  ## Usage
160
56
 
161
57
  ### Node.js
@@ -265,6 +161,14 @@ CLI options:
265
161
  | `--abort-early` | off | Generate the stub-error variant (~0.5 KB gzipped) |
266
162
  | `--no-types` | off | Skip the `.d.mts` / `.d.cts` output |
267
163
 
164
+ For a project with many schemas, `ata build <glob>` compiles them all in one command:
165
+
166
+ ```bash
167
+ npx ata build 'schemas/*.json' --out-dir build/validators --check
168
+ ```
169
+
170
+ Run with `--watch` during development for incremental rebuilds.
171
+
268
172
  Typical bundle sizes (10-field user schema, gzipped):
269
173
 
270
174
  | Variant | Size | Notes |
package/bin/ata.js CHANGED
@@ -8,20 +8,34 @@ function usage() {
8
8
  process.stdout.write(`ata-validator CLI
9
9
 
10
10
  Usage:
11
- ata compile <schema-file> [options]
11
+ ata compile <schema-file> [options] Compile one schema to a standalone module.
12
+ ata build <glob>... [options] Compile a project's schemas (glob pattern) per file.
12
13
 
13
- Options:
14
+ Compile options:
14
15
  -o, --output <file> Output path. Default: <schema-file>.validator.mjs
15
16
  -f, --format <fmt> Module format: esm | cjs. Default: esm
16
17
  --name <TypeName> Name of the top-level type in .d.ts. Default: inferred from filename
17
18
  --no-types Skip .d.ts generation
18
19
  --abort-early Use stub errors (smallest bundle)
20
+
21
+ Build options:
22
+ --out-dir <dir> Write outputs into this directory instead of alongside sources
23
+ --suffix <str> Output filename suffix (default: ".compiled")
24
+ -f, --format <fmt> Module format: esm | cjs. Default: esm
25
+ --abort-early Use stub errors (smallest bundle)
26
+ --check Check (don't write); exit 1 if any output is stale
27
+ --cache-file <path> Cache file for incremental builds (default: cache disabled)
28
+ --max-size <bytes> Fail build if any compiled module exceeds this gzipped size
29
+ --strict Treat any AOT-incompatible schema as a build error (default: skip + warn)
30
+ --watch Re-emit on schema change (Ctrl-C to exit)
31
+ --no-types Skip .d.mts/.d.cts emission alongside compiled modules
32
+
19
33
  -h, --help Show this message
20
34
 
21
35
  Examples:
22
36
  ata compile schemas/user.json -o src/generated/user.validator.mjs
23
- ata compile schemas/user.json --format cjs -o dist/user.validator.cjs
24
- ata compile schemas/public-api.json --abort-early -o dist/api.mjs
37
+ ata build 'schemas/*.json'
38
+ ata build 'src/**/*.schema.json' --out-dir build/validators
25
39
  `);
26
40
  }
27
41
 
@@ -35,6 +49,21 @@ function parseArgs(argv) {
35
49
  if (a === '--name') { out.opts.name = argv[++i]; continue; }
36
50
  if (a === '--no-types') { out.opts.types = false; continue; }
37
51
  if (a === '--abort-early') { out.opts.abortEarly = true; continue; }
52
+ if (a === '--check') { out.opts.check = true; continue; }
53
+ if (a === '--strict') { out.opts.strict = true; continue; }
54
+ if (a === '--out-dir') { out.opts.outDir = argv[++i]; continue; }
55
+ if (a === '--suffix') { out.opts.suffix = argv[++i]; continue; }
56
+ if (a === '--cache-file') { out.opts.cacheFile = argv[++i]; continue; }
57
+ if (a === '--max-size') {
58
+ const v = argv[++i];
59
+ const n = Number(v);
60
+ if (!Number.isFinite(n) || n <= 0 || !Number.isInteger(n)) {
61
+ throw new Error(`--max-size requires a positive integer (got "${v}")`);
62
+ }
63
+ out.opts.maxSize = n;
64
+ continue;
65
+ }
66
+ if (a === '--watch') { out.opts.watch = true; continue; }
38
67
  if (a.startsWith('-')) { throw new Error(`Unknown option: ${a}`); }
39
68
  out._.push(a);
40
69
  }
@@ -113,6 +142,73 @@ function cmdCompile(args) {
113
142
  }
114
143
  }
115
144
 
145
+ function cmdBuild(args) {
146
+ if (args._.length === 0) {
147
+ process.stderr.write('error: missing <glob>\n\n');
148
+ usage();
149
+ process.exit(1);
150
+ }
151
+ const buildLib = require('../lib/aot-build');
152
+ const format = args.opts.format || 'esm';
153
+ if (format !== 'esm' && format !== 'cjs') {
154
+ process.stderr.write(`error: --format must be esm or cjs (got "${format}")\n`);
155
+ process.exit(1);
156
+ }
157
+ const buildOpts = {
158
+ globs: args._,
159
+ format,
160
+ outDir: args.opts.outDir,
161
+ suffix: args.opts.suffix,
162
+ abortEarly: !!args.opts.abortEarly,
163
+ check: !!args.opts.check,
164
+ maxSize: args.opts.maxSize,
165
+ strict: !!args.opts.strict,
166
+ types: args.opts.types,
167
+ cacheFile: args.opts.cacheFile,
168
+ };
169
+
170
+ const printReport = (report) => {
171
+ if (args.opts.check) {
172
+ process.stdout.write(`ata: check — ${report.cached.length} up to date, ${report.staleCount} stale\n`);
173
+ return;
174
+ }
175
+ for (const c of report.compiled) {
176
+ process.stdout.write(`ata: ${c.input} -> ${c.output} (${c.bytes.toLocaleString()} bytes)\n`);
177
+ }
178
+ for (const c of report.cached) {
179
+ process.stdout.write(`ata: cached ${c.input}\n`);
180
+ }
181
+ for (const s of report.skipped) {
182
+ process.stdout.write(`ata: skipped ${s.input}: ${s.reason}\n`);
183
+ }
184
+ for (const f of report.failed) {
185
+ process.stderr.write(`ata: failed ${f.input}: ${f.error}\n`);
186
+ }
187
+ };
188
+
189
+ if (args.opts.watch) {
190
+ let handle;
191
+ buildLib.watch(buildOpts, printReport).then((h) => {
192
+ handle = h;
193
+ process.stdout.write('ata: watching for changes (Ctrl-C to exit)\n');
194
+ });
195
+ process.on('SIGINT', () => {
196
+ if (handle) handle.close();
197
+ process.exit(0);
198
+ });
199
+ return;
200
+ }
201
+
202
+ buildLib.build(buildOpts).then((report) => {
203
+ printReport(report);
204
+ if (args.opts.check && report.staleCount > 0) process.exit(1);
205
+ if (report.failed.length > 0) process.exit(1);
206
+ }).catch((e) => {
207
+ process.stderr.write(`error: ${e.message}\n`);
208
+ process.exit(1);
209
+ });
210
+ }
211
+
116
212
  function main() {
117
213
  const argv = process.argv.slice(2);
118
214
  if (argv.length === 0) { usage(); process.exit(0); }
@@ -136,6 +232,11 @@ function main() {
136
232
  return;
137
233
  }
138
234
 
235
+ if (cmd === 'build') {
236
+ cmdBuild(args);
237
+ return;
238
+ }
239
+
139
240
  process.stderr.write(`error: unknown command "${cmd}"\n\n`);
140
241
  usage();
141
242
  process.exit(1);
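The `--max-size` branch added to `parseArgs` accepts only positive integers. The sketch below restates that rule as a standalone function so the accepted and rejected inputs are easy to see. `parseMaxSize` is a hypothetical name used for illustration; it is not a function in the package.

```javascript
'use strict';

// Hypothetical helper mirroring the --max-size validation in parseArgs.
function parseMaxSize(raw) {
  const n = Number(raw);
  // Number(undefined) is NaN, so a flag given without a value is rejected too.
  if (!Number.isFinite(n) || n <= 0 || !Number.isInteger(n)) {
    throw new Error(`--max-size requires a positive integer (got "${raw}")`);
  }
  return n;
}

console.log(parseMaxSize('2048'));
for (const bad of ['0', '-5', '1.5', 'abc', undefined]) {
  try {
    parseMaxSize(bad);
  } catch (e) {
    console.log(e.message);
  }
}
```

Because the check goes through `Number()`, strings such as `'1e3'` also pass (as 1000); the value is compared against the gzipped size of each compiled module, in bytes.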
package/build.d.ts ADDED
@@ -0,0 +1,63 @@
1
+ export interface BuildOptions {
2
+ /** Glob patterns to expand into input schema files. */
3
+ globs: string[];
4
+ /** Module format for compiled outputs. Default: 'esm'. */
5
+ format?: 'esm' | 'cjs';
6
+ /** Write outputs into this directory instead of alongside sources. */
7
+ outDir?: string;
8
+ /** Output filename suffix. Default: '.compiled'. */
9
+ suffix?: string;
10
+ /** Use stub error functions for the smallest output. Default: false. */
11
+ abortEarly?: boolean;
12
+ /** Path to incremental cache file. Default: cache disabled. */
13
+ cacheFile?: string;
14
+ /** When true, do not write outputs; only report stale count. */
15
+ check?: boolean;
16
+ /** Maximum gzipped output size per compiled module, in bytes. */
17
+ maxSize?: number;
18
+ /** When true, AOT-incompatible schemas become failures (default: skipped). */
19
+ strict?: boolean;
20
+ /** Emit a .d.mts/.d.cts/.d.ts sibling for each compiled module. Default: true. */
21
+ types?: boolean;
22
+ }
23
+
24
+ export interface CompiledEntry {
25
+ input: string;
26
+ output: string;
27
+ bytes: number;
28
+ gzipBytes?: number;
29
+ }
30
+
31
+ export interface CachedEntry {
32
+ input: string;
33
+ output: string;
34
+ }
35
+
36
+ export interface SkippedEntry {
37
+ input: string;
38
+ reason: string;
39
+ }
40
+
41
+ export interface FailedEntry {
42
+ input: string;
43
+ error: string;
44
+ }
45
+
46
+ export interface BuildReport {
47
+ compiled: CompiledEntry[];
48
+ cached: CachedEntry[];
49
+ skipped: SkippedEntry[];
50
+ failed: FailedEntry[];
51
+ /** Only set when opts.check === true. */
52
+ staleCount?: number;
53
+ }
54
+
55
+ export function build(opts: BuildOptions): Promise<BuildReport>;
56
+ export function expandGlobs(globs: string[]): Promise<string[]>;
57
+ export function parseSchemaFile(filePath: string): unknown;
58
+ export function outputPathFor(input: string, opts: { format?: 'esm' | 'cjs'; outDir?: string; suffix?: string }): string;
59
+
60
+ export interface WatchHandle {
61
+ close(): void;
62
+ }
63
+ export function watch(opts: BuildOptions, onReport?: (r: BuildReport) => void): Promise<WatchHandle>;
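The `BuildReport` shape declared above is what `cmdBuild` in `bin/ata.js` inspects to pick an exit status. A minimal sketch of that decision, assuming only the fields in the declaration; `exitCodeFor` is a hypothetical helper and the report objects are invented data, not real build output.

```javascript
'use strict';

// Hypothetical helper: maps a BuildReport-shaped object to a process exit code,
// following the two checks cmdBuild performs after build() resolves.
function exitCodeFor(report, { check = false } = {}) {
  if (check && (report.staleCount || 0) > 0) return 1; // stale outputs in --check mode
  if (report.failed.length > 0) return 1;              // any failed input
  return 0;
}

// Illustrative reports (invented data).
const clean = {
  compiled: [],
  cached: [{ input: 'a.json', output: 'a.compiled.mjs' }],
  skipped: [],
  failed: [],
};
const stale = { ...clean, staleCount: 2 };
const broken = { ...clean, failed: [{ input: 'b.json', error: 'Unexpected token' }] };

console.log(exitCodeFor(clean));
console.log(exitCodeFor(stale, { check: true }));
console.log(exitCodeFor(stale));
console.log(exitCodeFor(broken));
```

Entries in `skipped` do not affect the exit code on their own; with `strict` set, the build reports AOT-incompatible schemas under `failed` instead, which is what turns them into a non-zero exit.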
package/build.js ADDED
@@ -0,0 +1,6 @@
1
+ 'use strict';
2
+
3
+ // Public subpath: `ata-validator/build`
4
+ // Re-exports the programmatic build API. The CLI in bin/ata.js is a thin
5
+ // wrapper around the same module.
6
+ module.exports = require('./lib/aot-build');
package/build.mjs ADDED
@@ -0,0 +1,8 @@
1
+ import mod from './build.js';
2
+
3
+ export const build = mod.build;
4
+ export const expandGlobs = mod.expandGlobs;
5
+ export const parseSchemaFile = mod.parseSchemaFile;
6
+ export const outputPathFor = mod.outputPathFor;
7
+ export const watch = mod.watch;
8
+ export default mod;
package/index.js CHANGED
@@ -964,6 +964,32 @@ module.exports = { boolFn, hybridFactory, errFn };
964
964
  }
965
965
  }
966
966
 
967
+ // Serialize closure vars referenced in _fn body: regex, sub-validators, sets.
968
+ let closureDecls = '';
969
+ if (jsFn._closures && jsFn._closures.length > 0) {
970
+ const lines = [];
971
+ for (const { name, val } of jsFn._closures) {
972
+ if (Array.isArray(val)) {
973
+ lines.push(`const ${name} = ${JSON.stringify(val)};`);
974
+ continue;
975
+ }
976
+ if (val instanceof RegExp) {
977
+ const flags = val.flags;
978
+ lines.push(`const ${name} = new RegExp(${JSON.stringify(val.source)}${flags ? ', ' + JSON.stringify(flags) : ''});`);
979
+ } else if (val instanceof Set) {
980
+ lines.push(`const ${name} = new Set(${JSON.stringify([...val])});`);
981
+ } else if (typeof val === 'function') {
982
+ // new Function('_ppv', body) — extract body from toString()
983
+ const str = val.toString();
984
+ // Matches: "function anonymous(_ppv\n) {\nbody\n}" or "function(_ppv){body}"
985
+ const m = str.match(/^function[^(]*\([^)]*\)\s*\{([\s\S]*)\}$/)
986
+ const body = m ? m[1].trim() : str;
987
+ lines.push(`const ${name} = function(_ppv) { ${body} };`);
988
+ }
989
+ }
990
+ if (lines.length) closureDecls = lines.join('\n') + '\n';
991
+ }
992
+
967
993
  const validBody = errCore
968
994
  ? 'return _fn(data) ? VALID : { valid: false, errors: errFn(data, true).errors }'
969
995
  : 'return _fn(data) ? VALID : ABORT';
@@ -978,7 +1004,7 @@ module.exports = { boolFn, hybridFactory, errFn };
978
1004
  ${_CP_LEN_SOURCE}
979
1005
  const VALID = Object.freeze({ valid: true, errors: Object.freeze([]) });
980
1006
  const ABORT = Object.freeze({ valid: false, errors: Object.freeze([Object.freeze({ message: 'validation failed' })]) });
981
- const _fn = function(d) {
1007
+ ${closureDecls}const _fn = function(d) {
982
1008
  ${src}
983
1009
  };
984
1010
  ${errCore}function isValid(data) { return _fn(data); }
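The hunk above fixes the `ReferenceError` described in the changelog by emitting a `const` declaration for every closure-bound value before `_fn`. The sketch below reproduces the array, RegExp, and Set cases in isolation and evaluates the emitted text to show the names resolve. It is a simplified illustration, not the package's code: `serializeClosures` and the variable names are invented, the function-valued case is left out, and it assumes array and set members are JSON-serializable.

```javascript
'use strict';

// Sketch: turn closure-bound values into source-level declarations,
// covering the array, RegExp, and Set cases from the hunk above.
function serializeClosures(closures) {
  const lines = [];
  for (const { name, val } of closures) {
    if (Array.isArray(val)) {
      lines.push(`const ${name} = ${JSON.stringify(val)};`);
    } else if (val instanceof RegExp) {
      const flags = val.flags ? ', ' + JSON.stringify(val.flags) : '';
      lines.push(`const ${name} = new RegExp(${JSON.stringify(val.source)}${flags});`);
    } else if (val instanceof Set) {
      lines.push(`const ${name} = new Set(${JSON.stringify([...val])});`);
    }
  }
  return lines.join('\n');
}

// Names below are invented for the example.
const decls = serializeClosures([
  { name: '_re0', val: /^x-[a-z]+$/i },
  { name: '_es0', val: new Set(['id', 'name']) },
  { name: '_bk0', val: ['a', 'b'] },
]);

// Evaluate the emitted declarations together with a body that uses them,
// the way a standalone module would.
const check = new Function(
  'key',
  `${decls}\nreturn _re0.test(key) || _es0.has(key) || _bk0.includes(key);`
);

console.log(decls);
console.log(check('X-Custom'));
console.log(check('other'));
```

Without the declarations, the same body would throw `ReferenceError: _re0 is not defined` when called, which matches the failure mode the changelog describes for standalone output.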
@@ -0,0 +1,206 @@
1
+ 'use strict';
2
+
3
+ const crypto = require('crypto');
4
+ const fs = require('fs');
5
+ const path = require('path');
6
+ const zlib = require('zlib');
7
+ const { Validator } = require('..');
8
+
9
+ async function expandGlobs(globs) {
10
+ const out = [];
11
+ for (const g of globs) {
12
+ if (typeof fs.promises.glob === 'function') {
13
+ // Node 22+
14
+ for await (const f of fs.promises.glob(g)) out.push(path.resolve(f));
15
+ } else {
16
+ // Node 18-21 fallback: simple non-recursive directory + extension match
17
+ // Pattern accepted: '<dir>/*.<ext>' or absolute file path.
18
+ if (fs.existsSync(g) && fs.statSync(g).isFile()) {
19
+ out.push(path.resolve(g));
20
+ continue;
21
+ }
22
+ const m = g.match(/^(.*?)(?:\/\*\.(.+))?$/);
23
+ const dir = m && m[1] ? m[1] : '.';
24
+ const ext = m && m[2] ? '.' + m[2] : null;
25
+ if (!fs.existsSync(dir)) continue;
26
+ for (const entry of fs.readdirSync(dir)) {
27
+ if (ext && !entry.endsWith(ext)) continue;
28
+ const full = path.join(dir, entry);
29
+ if (fs.statSync(full).isFile()) out.push(path.resolve(full));
30
+ }
31
+ }
32
+ }
33
+ return [...new Set(out)];
34
+ }
35
+
36
+ function parseSchemaFile(filePath) {
37
+ const text = fs.readFileSync(filePath, 'utf8');
38
+ const ext = path.extname(filePath).toLowerCase();
39
+ if (ext === '.json') return JSON.parse(text);
40
+ if (ext === '.yaml' || ext === '.yml') {
41
+ let yaml;
42
+ try { yaml = require('yaml'); }
43
+ catch { throw new Error(`install the 'yaml' package to compile YAML schemas (file: ${filePath})`); }
44
+ return yaml.parse(text);
45
+ }
46
+ throw new Error(`unsupported schema extension: ${ext} (file: ${filePath})`);
47
+ }
48
+
49
+ function outputPathFor(input, opts) {
50
+ const suffix = opts.suffix || '.compiled';
51
+ const ext = opts.format === 'cjs' ? '.cjs' : '.mjs';
52
+ const dir = opts.outDir || path.dirname(input);
53
+ const base = path.basename(input, path.extname(input));
54
+ // Strip a trailing ".schema" for cleaner output names: foo.schema.json -> foo.compiled.mjs
55
+ const stem = base.endsWith('.schema') ? base.slice(0, -('.schema'.length)) : base;
56
+ return path.join(dir, stem + suffix + ext);
57
+ }
58
+
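A few worked inputs make the naming rule in `outputPathFor` concrete. The sketch restates the function using `path.posix` so the printed paths are the same on every OS (the package itself uses the platform `path` module); the input paths are invented examples.

```javascript
'use strict';
const path = require('node:path').posix; // posix flavor for OS-independent output

// Re-statement of the naming rule for illustration only.
function outputPathFor(input, opts = {}) {
  const suffix = opts.suffix || '.compiled';
  const ext = opts.format === 'cjs' ? '.cjs' : '.mjs';
  const dir = opts.outDir || path.dirname(input);
  const base = path.basename(input, path.extname(input));
  // A trailing ".schema" is dropped: foo.schema.json -> foo.compiled.mjs
  const stem = base.endsWith('.schema') ? base.slice(0, -'.schema'.length) : base;
  return path.join(dir, stem + suffix + ext);
}

console.log(outputPathFor('schemas/user.json'));
console.log(outputPathFor('schemas/order.schema.yaml', { format: 'cjs' }));
console.log(outputPathFor('schemas/user.json', { outDir: 'build/validators', suffix: '.v' }));

// Same stem in different source directories maps to one file under outDir.
const a = outputPathFor('a/user.json', { outDir: 'out' });
const b = outputPathFor('b/user.schema.json', { outDir: 'out' });
console.log(a === b);
```

One consequence of the rule as written: with `outDir` set, only the file stem is kept, so two inputs that share a stem in different directories appear to resolve to the same output path.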
59
+ function readCache(cacheFile) {
60
+ if (!cacheFile || !fs.existsSync(cacheFile)) return {};
61
+ try { return JSON.parse(fs.readFileSync(cacheFile, 'utf8')); } catch { return {}; }
62
+ }
63
+
64
+ function writeCache(cacheFile, data) {
65
+ if (!cacheFile) return;
66
+ fs.mkdirSync(path.dirname(cacheFile), { recursive: true });
67
+ fs.writeFileSync(cacheFile, JSON.stringify(data, null, 2));
68
+ }
69
+
70
+ function hashContent(buf) {
71
+ return crypto.createHash('sha256').update(buf).digest('hex').slice(0, 32);
72
+ }
73
+
74
+ async function build(opts) {
75
+ const globs = opts.globs || [];
76
+ if (globs.length === 0) throw new Error('build: at least one glob required');
77
+ const format = opts.format || 'esm';
78
+ const inputs = await expandGlobs(globs);
79
+ const cache = readCache(opts.cacheFile);
80
+ const newCache = {};
81
+
82
+ const compiled = [];
83
+ const cached = [];
84
+ const skipped = [];
85
+ const failed = [];
86
+
87
+ for (const input of inputs) {
88
+ try {
89
+ const raw = fs.readFileSync(input);
90
+ const inputHash = hashContent(raw);
91
+ const output = outputPathFor(input, { ...opts, format });
92
+ const cacheEntry = cache[input];
93
+ if (opts.check) {
94
+ const upToDate = (
95
+ cacheEntry &&
96
+ cacheEntry.inputHash === inputHash &&
97
+ cacheEntry.output === output &&
98
+ fs.existsSync(output) &&
99
+ cacheEntry.outputHash === hashContent(fs.readFileSync(output))
100
+ );
101
+ if (upToDate) {
102
+ cached.push({ input, output });
103
+ }
104
+ continue;
105
+ }
106
+ if (
107
+ cacheEntry &&
108
+ cacheEntry.inputHash === inputHash &&
109
+ cacheEntry.output === output &&
110
+ fs.existsSync(output) &&
111
+ cacheEntry.outputHash === hashContent(fs.readFileSync(output))
112
+ ) {
113
+ cached.push({ input, output });
114
+ newCache[input] = cacheEntry;
115
+ continue;
116
+ }
+       const schema = parseSchemaFile(input);
+       const v = new Validator(schema);
+       const src = v.toStandaloneModule({ format, abortEarly: !!opts.abortEarly });
+       if (!src) {
+         const reason = 'schema is not AOT-compatible (toStandaloneModule returned null)';
+         if (opts.strict) failed.push({ input, error: reason });
+         else skipped.push({ input, reason });
+         continue;
+       }
+       fs.mkdirSync(path.dirname(output), { recursive: true });
+       fs.writeFileSync(output, src);
+       const outBytes = Buffer.byteLength(src, 'utf8');
+       const gz = zlib.gzipSync(src);
+       const gzBytes = gz.length;
+       if (typeof opts.maxSize === 'number' && gzBytes > opts.maxSize) {
+         // Roll back the write so a failed build doesn't leave a stale artifact.
+         try { fs.unlinkSync(output); } catch {}
+         failed.push({ input, error: `output ${output} exceeds --max-size: ${gzBytes} > ${opts.maxSize} (gzipped bytes)` });
+         continue;
+       }
+       compiled.push({ input, output, bytes: outBytes, gzipBytes: gzBytes });
+       if (opts.types !== false) {
+         const { toTypeScript } = require('./ts-gen');
+         const stem = path.basename(input, path.extname(input)).replace(/\.schema$/, '');
+         const typeName = stem
+           .replace(/[^A-Za-z0-9_]/g, '_')
+           .replace(/^([a-z])/, (m) => m.toUpperCase()) || 'Data';
+         const dts = toTypeScript(schema, { name: typeName });
+         const ext = path.extname(output);
+         const dtsExt = ext === '.mjs' ? '.d.mts'
+           : ext === '.cjs' ? '.d.cts'
+           : '.d.ts';
+         const base = output.slice(0, output.length - ext.length);
+         fs.writeFileSync(base + dtsExt, dts);
+       }
+       newCache[input] = {
+         inputHash,
+         output,
+         outputHash: hashContent(Buffer.from(src, 'utf8')),
+       };
+     } catch (e) {
+       failed.push({ input, error: e.message });
+     }
+   }
+
+   if (opts.check) {
+     const staleCount = inputs.length - cached.length;
+     return { compiled: [], cached, skipped, failed, staleCount };
+   }
+
+   writeCache(opts.cacheFile, newCache);
+
+   return { compiled, cached, skipped, failed };
+ }
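
Review note: `build()` evaluates the same five-part freshness predicate twice, once under `--check` and once on the normal path. The sketch below factors it into a single helper to make the rule explicit. It is only an illustration: `isUpToDate` is a hypothetical name, and the `hashContent` shown here assumes SHA-256 hex digests, since the package's own `hashContent` is defined outside this hunk.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Assumption: stand-in for the package's hashContent(); the real algorithm
// is not visible in this hunk.
function hashContent(buf) {
  return crypto.createHash('sha256').update(buf).digest('hex');
}

// Hypothetical helper: an entry is fresh only if the input bytes, the output
// path, and the bytes currently on disk all match what the cache recorded.
function isUpToDate(cacheEntry, inputHash, output) {
  return Boolean(
    cacheEntry &&
    cacheEntry.inputHash === inputHash &&
    cacheEntry.output === output &&
    fs.existsSync(output) &&
    cacheEntry.outputHash === hashContent(fs.readFileSync(output))
  );
}

// Demo in a throwaway directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'ata-cache-demo-'));
const output = path.join(dir, 'user.compiled.mjs');
const src = 'export default () => true;\n';
fs.writeFileSync(output, src);

const inputHash = hashContent(Buffer.from('{"type":"object"}'));
const entry = { inputHash, output, outputHash: hashContent(Buffer.from(src, 'utf8')) };

const fresh = isUpToDate(entry, inputHash, output);
// A hand edit to the emitted file invalidates the entry.
fs.writeFileSync(output, src + '// edited by hand\n');
const afterEdit = isUpToDate(entry, inputHash, output);
// No cache entry at all is always stale.
const noEntry = isUpToDate(undefined, inputHash, output);

fs.rmSync(dir, { recursive: true, force: true });
console.log({ fresh, afterEdit, noEntry });
```

Hashing the output as well as the input means a deleted or manually edited artifact is rebuilt even when the schema itself has not changed.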
+
+ async function watch(opts, onReport) {
+   const initial = await build(opts);
+   if (typeof onReport === 'function') onReport(initial);
+
+   const inputs = await expandGlobs(opts.globs || []);
+   const dirs = [...new Set(inputs.map((p) => path.dirname(p)))];
+   let debounceTimer = null;
+
+   const runOnce = async () => {
+     debounceTimer = null;
+     try {
+       const r = await build(opts);
+       if (typeof onReport === 'function') onReport(r);
+     } catch (e) {
+       if (typeof onReport === 'function') onReport({ compiled: [], cached: [], skipped: [], failed: [{ input: '<watch>', error: e.message }] });
+     }
+   };
+
+   const watchers = dirs.map((d) => fs.watch(d, (_event, filename) => {
+     if (!filename) return;
+     const ext = path.extname(filename).toLowerCase();
+     if (ext !== '.json' && ext !== '.yaml' && ext !== '.yml') return;
+     if (debounceTimer) clearTimeout(debounceTimer);
+     debounceTimer = setTimeout(runOnce, 100);
+   }));
+
+   return {
+     close() {
+       if (debounceTimer) clearTimeout(debounceTimer);
+       for (const w of watchers) w.close();
+     },
+   };
+ }
+
+ module.exports = { build, expandGlobs, parseSchemaFile, outputPathFor, watch };
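
Review note: `watch()` coalesces bursts of `fs.watch` events with a 100 ms trailing debounce before rebuilding. A minimal sketch of that pattern in isolation follows. `createDebouncer` and `createFakeTimers` are hypothetical names introduced here, not part of ata-validator, and the fake timers exist only so the demo runs deterministically without waiting.

```javascript
// Hypothetical helper: collapse repeated triggers into one trailing call.
// `timers` defaults to the global setTimeout/clearTimeout pair.
function createDebouncer(fn, delayMs, timers = globalThis) {
  let timer = null;
  return {
    trigger() {
      // Each new event cancels the pending call and restarts the delay.
      if (timer) timers.clearTimeout(timer);
      timer = timers.setTimeout(() => {
        timer = null;
        fn();
      }, delayMs);
    },
    cancel() {
      if (timer) timers.clearTimeout(timer);
      timer = null;
    },
  };
}

// Minimal fake timers for the demo: callbacks run only when flush() is called.
function createFakeTimers() {
  let nextId = 1;
  const pending = new Map();
  return {
    setTimeout(cb) {
      const id = nextId++;
      pending.set(id, cb);
      return id;
    },
    clearTimeout(id) {
      pending.delete(id);
    },
    flush() {
      const callbacks = [...pending.values()];
      pending.clear();
      for (const cb of callbacks) cb();
    },
  };
}

const fake = createFakeTimers();
let rebuilds = 0;
const debouncer = createDebouncer(() => { rebuilds += 1; }, 100, fake);

// Three file events in quick succession lead to a single rebuild.
debouncer.trigger();
debouncer.trigger();
debouncer.trigger();
fake.flush();
const afterBurst = rebuilds;

// Cancelling (as close() does) drops the pending rebuild.
debouncer.trigger();
debouncer.cancel();
fake.flush();
const afterCancel = rebuilds;

console.log({ afterBurst, afterCancel });
```

Two properties of the code as written in the diff may be worth a follow-up: the debounce alone does not prevent a second `build()` from starting while an earlier one is still awaiting, and the watched directory list is computed once from the initial glob expansion, so schemas added later in new directories would not be picked up.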
@@ -957,6 +957,18 @@ function compileToJSCodegen(schema, schemaMap, userFormats) {
  }
  if (fmtEntries.length) boolFn._formatClosures = fmtEntries
  }
+ // Closure variables (regex, sub-validators, sets) referenced in _source that
+ // standalone module output must declare. Excludes _cpLen (emitted by _CP_LEN_SOURCE)
+ // and _uf_* (emitted via _formatClosures).
+ {
+   const entries = []
+   for (let i = 0; i < closureNames.length; i++) {
+     const name = closureNames[i]
+     if (name === '_cpLen' || name.startsWith('_uf_')) continue
+     entries.push({ name, val: closureValues[i] })
+   }
+   if (entries.length) boolFn._closures = entries
+ }
 
  return boolFn
  } catch {
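
Review note: this hunk only records `{ name, val }` pairs on `boolFn._closures`; the code that turns them into declarations in the standalone module is not part of this section. As a rough sketch of what such an emitter could look like, assuming values are limited to RegExp, Set, plain arrays, and self-contained functions: `serializeClosureValue` and `emitClosureDeclarations` are hypothetical names, and the pairing of the `_re`, `_es`, and `_bk` prefixes with particular value kinds is a guess based on the changelog wording.

```javascript
// Hypothetical serializer: turn one closure value into JavaScript source text.
function serializeClosureValue(val) {
  if (val instanceof RegExp) return val.toString(); // yields /source/flags
  if (val instanceof Set) return `new Set(${JSON.stringify([...val])})`;
  if (Array.isArray(val)) return JSON.stringify(val);
  // Only safe for functions that do not capture outer variables themselves,
  // and not for method-shorthand functions.
  if (typeof val === 'function') return `(${val.toString()})`;
  throw new TypeError(`unsupported closure value: ${typeof val}`);
}

// Hypothetical emitter: one `const` declaration per recorded closure entry.
function emitClosureDeclarations(entries) {
  return entries
    .map(({ name, val }) => `const ${name} = ${serializeClosureValue(val)};`)
    .join('\n');
}

// Example entries shaped like the ones the hunk pushes onto boolFn._closures.
const entries = [
  { name: '_re0', val: /^x-[a-z]+$/u },
  { name: '_es0', val: new Set(['a', 'b']) },
  { name: '_bk0', val: ['id', 'kind'] },
];

const decls = emitClosureDeclarations(entries);

// Evaluate the emitted text in a fresh function scope to confirm that the
// names resolve, which is the failure mode the changelog describes.
const check = new Function(
  `${decls}\nreturn _re0.test('x-foo') && _es0.has('b') && _bk0.length === 2;`
);
const ok = check();
console.log(decls);
console.log({ ok });
```

Without declarations like these, generated source that mentions `_re0` or `_es0` would throw `ReferenceError` as soon as the standalone module runs.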
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "ata-validator",
-   "version": "0.12.6",
+   "version": "0.13.0",
    "description": "Ultra-fast JSON Schema validator. 5x faster validation, 159,000x faster compilation. Works without native addon. Cross-schema $ref, Draft 2020-12 + Draft 7, V8-optimized JS codegen, simdjson, RE2, multi-core. Standard Schema V1 compatible.",
    "main": "index.js",
    "module": "index.mjs",
@@ -20,6 +20,11 @@
        "import": "./compat.mjs",
        "require": "./compat.js"
      },
+     "./build": {
+       "types": "./build.d.ts",
+       "import": "./build.mjs",
+       "require": "./build.js"
+     },
      "./package.json": "./package.json"
    },
    "sideEffects": false,
@@ -35,7 +40,7 @@
      "rebuild": "cmake-js rebuild --target ata",
      "prebuild": "pkg-prebuilds-copy --baseDir build/Release --source ata.node --name=ata --strip --napi_version=10",
      "prebuild-all": "npm run prebuild -- --arch x64 && npm run prebuild -- --arch arm64",
-     "test": "node test.js",
+     "test": "node test.js && node tests/test_aot_build.js && node tests/test_aot_differential.js && node tests/test_aot_cli_build.js && node tests/test_aot_cli_smoke.js",
      "test:suite": "node tests/run_suite.js",
      "test:compat": "node tests/test_compat.js",
      "test:standard-schema": "node tests/test_standard_schema.js",
@@ -46,7 +51,11 @@
      "bench": "node benchmark/bench_large.js",
      "fuzz": "node tests/fuzz_differential.js",
      "fuzz:long": "FUZZ_ITERATIONS=100000 node tests/fuzz_differential.js",
-     "test:json-suite": "node tests/run_json_test_suite.js"
+     "test:aot": "node tests/test_aot_build.js",
+     "test:aot-differential": "node tests/test_aot_differential.js",
+     "test:aot-cli": "node tests/test_aot_cli_build.js",
+     "test:json-suite": "node tests/run_json_test_suite.js",
+     "bench:aot-vs-ajv": "node benchmark/bench_aot_vs_ajv.mjs"
    },
    "keywords": [
      "json",
@@ -85,6 +94,9 @@
      "compat.js",
      "compat.mjs",
      "compat.d.ts",
+     "build.js",
+     "build.mjs",
+     "build.d.ts",
      "binding-options.js",
      "binding/",
      "include/",
@@ -94,6 +106,7 @@
      "scripts/",
      "CMakeLists.txt",
      "README.md",
+     "CHANGELOG.md",
      "LICENSE"
    ],
    "dependencies": {
@@ -101,6 +114,12 @@
      "node-api-headers": "^1.8.0",
      "pkg-prebuilds": "^1.0.0"
    },
+   "peerDependencies": {
+     "yaml": "^2.0.0"
+   },
+   "peerDependenciesMeta": {
+     "yaml": { "optional": true }
+   },
    "devDependencies": {
      "@sinclair/typebox": "^0.34.49",
      "cmake-js": "^8.0.0",