ata-validator 0.1.0 → 0.4.0
- package/README.md +120 -187
- package/binding/ata_napi.cpp +903 -114
- package/binding.gyp +13 -2
- package/compat.d.ts +23 -0
- package/include/ata.h +10 -2
- package/index.d.ts +37 -0
- package/index.js +150 -5
- package/lib/js-compiler.js +845 -0
- package/package.json +15 -8
- package/prebuilds/darwin-arm64/ata-validator.node +0 -0
- package/src/ata.cpp +776 -125
package/README.md
CHANGED
@@ -1,65 +1,89 @@
  1     - # ata
      1 + # ata-validator
  2   2
  3     -
      3 + Ultra-fast JSON Schema validator powered by [simdjson](https://github.com/simdjson/simdjson). Multi-core parallel validation, RE2 regex, codegen bytecode engine. Standard Schema V1 compatible.
      4 +
      5 + **[ata-validator.com](https://ata-validator.com)**
  4   6
  5   7   ## Performance
  6   8
  7     - ###
      9 + ### Single-Document Validation (valid data)
     10 +
     11 + | Scenario | ata | ajv | |
     12 + |---|---|---|---|
     13 + | **validate(obj)** | 9.6M ops/sec | 8.5M ops/sec | **ata 1.1x faster** |
     14 + | **isValidObject(obj)** | 10.4M ops/sec | 9.3M ops/sec | **ata 1.1x faster** |
     15 + | **validateJSON(str)** | 1.9M ops/sec | 1.87M ops/sec | **ata 1.02x faster** |
     16 + | **isValidJSON(str)** | 1.9M ops/sec | 1.89M ops/sec | **ata 1.01x faster** |
     17 + | **Schema compilation** | 125,690 ops/sec | 831 ops/sec | **ata 151x faster** |
     18 +
     19 + ### Large Data — JS Object Validation
  8  20
  9     - |
 10     -
 11     - | **ata
 12     - |
     21 + | Size | ata | ajv | |
     22 + |---|---|---|---|
     23 + | 10 users (2KB) | 6.2M ops/sec | 2.5M ops/sec | **ata 2.5x faster** |
     24 + | 100 users (20KB) | 658K ops/sec | 243K ops/sec | **ata 2.7x faster** |
     25 + | 1,000 users (205KB) | 64K ops/sec | 23.5K ops/sec | **ata 2.7x faster** |
     26 +
     27 + ### Parallel Batch Validation (multi-core)
     28 +
     29 + | Batch Size | ata | ajv | |
     30 + |---|---|---|---|
     31 + | 1,000 items | 8.4M items/sec | 2.2M items/sec | **ata 3.9x faster** |
     32 + | 10,000 items | 12.5M items/sec | 2.1M items/sec | **ata 5.9x faster** |
 13  33
 14     - > ata
     34 + > ajv is single-threaded (JS). ata uses all CPU cores via a persistent C++ thread pool.
 15  35
 16     - ###
     36 + ### Where ajv wins
 17  37
 18     - |
     38 + | Scenario | ata | ajv | |
 19  39   |---|---|---|---|
 20     - |
 21     - |
 22     - | 20 KB | 73,142 | 20,459 | **ata 3.6x faster** |
 23     - | 100 KB | 14,388 | 4,062 | **ata 3.5x faster** |
 24     - | 200 KB | 7,590 | 2,021 | **ata 3.8x faster** |
     40 + | **validate(obj)** (invalid data, error collection) | 133K ops/sec | 7.5M ops/sec | **ajv 56x faster** |
     41 + | **validateJSON(str)** (invalid data) | 169K ops/sec | 2.3M ops/sec | **ajv 14x faster** |
 25  42
 26     - >
     43 + > Invalid-data error collection goes through the C++ NAPI path. This is the slow path by design — production traffic is overwhelmingly valid.
     44 +
     45 + ### How it works
     46 +
     47 + **Speculative validation**: For valid data (the common case), ata runs a JS codegen fast path entirely in V8 JIT — no NAPI boundary crossing. Only when validation fails does it fall through to the C++ engine for detailed error collection.
     48 +
     49 + **JS codegen**: Schemas are compiled to monolithic JS functions (like ajv). Supported keywords: `type`, `required`, `properties`, `items`, `enum`, `const`, `allOf`, `anyOf`, `oneOf`, `not`, `if/then/else`, `uniqueItems`, `contains`, `prefixItems`, `additionalProperties`, `dependentRequired`, `minimum/maximum`, `minLength/maxLength`, `pattern`, `format`.
     50 +
     51 + **V8 TurboFan optimizations**: Destructuring batch reads, `undefined` checks instead of `in` operator, context-aware type guard elimination, property hoisting to local variables.
     52 +
     53 + **Adaptive simdjson**: For large documents (>8KB) with selective schemas, simdjson On Demand seeks only the needed fields — skipping irrelevant data at GB/s speeds.
 27  54
 28  55   ### JSON Schema Test Suite
 29  56
 30     - **
     57 + **98.5%** pass rate (938/952) on official [JSON Schema Test Suite](https://github.com/json-schema-org/JSON-Schema-Test-Suite) (Draft 2020-12).
 31  58
 32     - ##
     59 + ## When to use ata
 33  60
 34     - - **
 35     - - **
 36     - - **
 37     - - **
 38     - - **Multi-Language**: C API (`ata_c.h`) enables bindings for Rust, Python, Go, Ruby, and more
 39     - - **Drop-in Replacement**: ajv-compatible API — switch with one line change
 40     - - **Node.js Binding**: Native N-API addon
 41     - - **Error Details**: Rich error messages with JSON Pointer paths
     61 + - **Any `validate(obj)` workload** — 1.1x–2.7x faster than ajv on valid data
     62 + - **Batch/streaming validation** — NDJSON log processing, data pipelines (5.9x faster)
     63 + - **Schema-heavy startup** — many schemas compiled at boot (151x faster compile)
     64 + - **C/C++ embedding** — native library, no JS runtime needed
 42  65
 43     - ##
     66 + ## When to use ajv
 44  67
 45     -
     68 + - **Error-heavy workloads** — where most data is invalid and error details matter
     69 + - **Schemas with `$ref`, `patternProperties`, `dependentSchemas`** — these bypass JS codegen and hit the slower NAPI path
 46  70
 47     -
 48     - npm install ata-validator
 49     - ```
     71 + ## Features
 50  72
 51     -
     73 + - **Speculative validation**: JS codegen fast path — valid data never crosses the NAPI boundary
     74 + - **Multi-core**: Parallel validation across all CPU cores — 12.5M validations/sec
     75 + - **simdjson**: SIMD-accelerated JSON parsing at GB/s speeds, adaptive On Demand for large docs
     76 + - **RE2 regex**: Linear-time guarantees, immune to ReDoS attacks
     77 + - **V8-optimized codegen**: Destructuring batch reads, type guard elimination, property hoisting
     78 + - **Standard Schema V1**: Compatible with Fastify, tRPC, TanStack, Drizzle
     79 + - **Zero-copy paths**: Buffer and pre-padded input support — no unnecessary copies
     80 + - **C/C++ library**: Native API for non-Node.js environments
     81 + - **98.5% spec compliant**: Draft 2020-12
 52  82
 53     -
 54     - include(FetchContent)
 55     - FetchContent_Declare(
 56     -   ata
 57     -   GIT_REPOSITORY https://github.com/mertcanaltin/ata.git
 58     -   GIT_TAG main
 59     - )
 60     - FetchContent_MakeAvailable(ata)
     83 + ## Installation
 61  84
 62     -
     85 + ```bash
     86 + npm install ata-validator
 63  87   ```
 64  88
 65  89   ## Usage
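The "speculative validation" and "JS codegen" ideas described in the new README's "How it works" section can be illustrated with a small sketch. This is not ata-validator's implementation: `compileFastCheck`, `collectErrors`, and `makeValidator` are hypothetical names, the sketch only handles object schemas with `required` and `string`/`number` property types, and the slow path here is plain JavaScript standing in for the C++ engine the README mentions.

```javascript
// Hypothetical sketch: a generated boolean fast path plus a slow
// error-collecting fallback. None of these names come from ata-validator.

// Fast path: emit one function body for the whole schema (codegen).
function compileFastCheck(schema) {
  const lines = [
    "if (typeof d !== 'object' || d === null || Array.isArray(d)) return false;",
  ];
  for (const key of schema.required || []) {
    // Uses an `undefined` check rather than the `in` operator.
    lines.push(`if (d[${JSON.stringify(key)}] === undefined) return false;`);
  }
  for (const [key, sub] of Object.entries(schema.properties || {})) {
    if (sub.type === 'string' || sub.type === 'number') {
      const k = JSON.stringify(key);
      lines.push(
        `if (d[${k}] !== undefined && typeof d[${k}] !== '${sub.type}') return false;`
      );
    }
  }
  lines.push('return true;');
  return new Function('d', lines.join('\n'));
}

// Slow path: walk the schema again and record every failure.
function collectErrors(schema, d) {
  if (typeof d !== 'object' || d === null || Array.isArray(d)) {
    return [{ path: '', message: 'must be object' }];
  }
  const errors = [];
  for (const key of schema.required || []) {
    if (d[key] === undefined) errors.push({ path: '/' + key, message: 'is required' });
  }
  for (const [key, sub] of Object.entries(schema.properties || {})) {
    const typed = sub.type === 'string' || sub.type === 'number';
    if (typed && d[key] !== undefined && typeof d[key] !== sub.type) {
      errors.push({ path: '/' + key, message: 'must be ' + sub.type });
    }
  }
  return errors;
}

// Speculate that data is valid; only pay for error details on failure.
function makeValidator(schema) {
  const fast = compileFastCheck(schema);
  return (data) =>
    fast(data)
      ? { valid: true, errors: [] }
      : { valid: false, errors: collectErrors(schema, data) };
}

const validate = makeValidator({
  type: 'object',
  properties: { name: { type: 'string' }, age: { type: 'number' } },
  required: ['name'],
});
const ok = validate({ name: 'Mert', age: 26 });
const bad = validate({ age: 'x' });
```

In this sketch the valid input never touches `collectErrors`, which mirrors the trade-off the benchmark tables show: valid data is cheap, while invalid data pays for a second pass.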
@@ -67,9 +91,8 @@ target_link_libraries(your_target PRIVATE ata::ata)
 67  91   ### Node.js
 68  92
 69  93   ```javascript
 70     - const { Validator
     94 + const { Validator } = require('ata-validator');
 71  95
 72     - // Pre-compiled schema (recommended)
 73  96   const v = new Validator({
 74  97     type: 'object',
 75  98     properties: {
@@ -80,196 +103,106 @@ const v = new Validator({
 80 103     required: ['name', 'email']
 81 104   });
 82 105
 83     - //
 84     -
    106 + // Fast boolean check — JS codegen, no NAPI (1.1x faster than ajv)
    107 + v.isValidObject({ name: 'Mert', email: 'mert@example.com', age: 26 }); // true
    108 +
    109 + // Full validation with error details
    110 + const result = v.validate({ name: 'Mert', email: 'mert@example.com', age: 26 });
 85 111   console.log(result.valid); // true
    112 + console.log(result.errors); // []
 86 113
 87     - //
 88     -
 89     -
    114 + // JSON string validation (simdjson fast path)
    115 + v.validateJSON('{"name": "Mert", "email": "mert@example.com"}');
    116 + v.isValidJSON('{"name": "Mert", "email": "mert@example.com"}'); // true
 90 117
 91     - //
 92     -
 93     -
 94     - //
    118 + // Buffer input (zero-copy, raw NAPI)
    119 + v.isValid(Buffer.from('{"name": "Mert", "email": "mert@example.com"}'));
    120 +
    121 + // Parallel batch — multi-core, NDJSON (5.9x faster than ajv)
    122 + const ndjson = Buffer.from(lines.join('\n'));
    123 + v.isValidParallel(ndjson); // bool[]
    124 + v.countValid(ndjson); // number
 95 125   ```
 96 126
 97     - ###
    127 + ### Standard Schema V1
 98 128
 99     - ```
100     -
101     - + const Ajv = require('ata-validator/compat');
    129 + ```javascript
    130 + const v = new Validator(schema);
102 131
103     -
104     - const
105     -
106     -
    132 + // Works with Fastify, tRPC, TanStack, etc.
    133 + const result = v['~standard'].validate(data);
    134 + // { value: data } on success
    135 + // { issues: [{ message, path }] } on failure
107 136   ```
108 137
109     - ###
    138 + ### Fastify Plugin
110 139
111     - ```
112     -
113     - #include <iostream>
114     -
115     - int main() {
116     -   auto schema = ata::compile(R"({
117     -     "type": "object",
118     -     "properties": {
119     -       "name": {"type": "string"},
120     -       "age": {"type": "integer", "minimum": 0}
121     -     },
122     -     "required": ["name"]
123     -   })");
124     -
125     -   auto result = ata::validate(schema, R"({"name": "Mert", "age": 28})");
126     -
127     -   if (result) {
128     -     std::cout << "Valid!" << std::endl;
129     -   } else {
130     -     for (const auto& err : result.errors) {
131     -       std::cout << err.path << ": " << err.message << std::endl;
132     -     }
133     -   }
134     -   return 0;
135     - }
    140 + ```bash
    141 + npm install fastify-ata
136 142   ```
137 143
138     -
    144 + ```javascript
    145 + const fastify = require('fastify')();
    146 + fastify.register(require('fastify-ata'));
139 147
140     -
141     -
142     - #include <stdio.h>
143     - #include <string.h>
    148 + // All existing JSON Schema route definitions work as-is
    149 + ```
144 150
145     -
146     - const char* schema = "{\"type\":\"string\",\"minLength\":3}";
147     - ata_schema s = ata_compile(schema, strlen(schema));
    151 + ### C++
148 152
149     -
150     -
    153 + ```cpp
    154 + #include "ata.h"
151 155
152     -
153     -
154     -
155     -
156     -
157     - printf("Error: %.*s\n", (int)msg.length, msg.data);
158     - }
159     - }
    156 + auto schema = ata::compile(R"({
    157 +   "type": "object",
    158 +   "properties": { "name": {"type": "string"} },
    159 +   "required": ["name"]
    160 + })");
160 161
161     -
162     -
163     - }
    162 + auto result = ata::validate(schema, R"({"name": "Mert"})");
    163 + // result.valid == true
164 164   ```
165 165
166 166   ## Supported Keywords
167 167
168 168   | Category | Keywords |
169 169   |----------|----------|
170     - | Type | `type`
    170 + | Type | `type` |
171 171   | Numeric | `minimum`, `maximum`, `exclusiveMinimum`, `exclusiveMaximum`, `multipleOf` |
172 172   | String | `minLength`, `maxLength`, `pattern`, `format` |
173     - | Array | `items`, `prefixItems`, `minItems`, `maxItems`, `uniqueItems` |
174     - | Object | `properties`, `required`, `additionalProperties`, `patternProperties`, `minProperties`, `maxProperties` |
    173 + | Array | `items`, `prefixItems`, `minItems`, `maxItems`, `uniqueItems`, `contains`, `minContains`, `maxContains` |
    174 + | Object | `properties`, `required`, `additionalProperties`, `patternProperties`, `minProperties`, `maxProperties`, `propertyNames`, `dependentRequired`, `dependentSchemas` |
175 175   | Enum/Const | `enum`, `const` |
176 176   | Composition | `allOf`, `anyOf`, `oneOf`, `not` |
177 177   | Conditional | `if`, `then`, `else` |
178 178   | References | `$ref`, `$defs`, `definitions`, `$id` |
179     - | Boolean | `true
    179 + | Boolean | `true`, `false` |
180 180
181     - ### Format Validators
    181 + ### Format Validators (hand-written, no regex)
182 182
183 183   `email`, `date`, `date-time`, `time`, `uri`, `uri-reference`, `ipv4`, `ipv6`, `uuid`, `hostname`
184 184
185     - ## Why ata over ajv?
186     -
187     - | | ata | ajv |
188     - |---|---|---|
189     - | Schema compilation | **11,000x faster** | Slow (code generation) |
190     - | JSON string validation | **2-4x faster** | JSON.parse + validate |
191     - | CSP compatible | Yes | No (`new Function()`) |
192     - | Multi-language | C, C++, Rust, Python, Go | JavaScript only |
193     - | Bundle size | ~20KB JS + native | ~150KB minified |
194     - | Node.js core candidate | Yes (like ada-url, simdutf) | No (JS dependency) |
195     -
196 185   ## Building from Source
197 186
198 187   ```bash
199 188   # C++ library + tests
200 189   cmake -B build
201 190   cmake --build build
202     -
203     -
204     - # With benchmarks
205     - cmake -B build -DATA_BENCHMARKS=ON
206     - cmake --build build
207     - ./build/ata_bench
    191 + ./build/ata_tests
208 192
209 193   # Node.js addon
210 194   npm install
211     -
212     -
213     - # Run JSON Schema Test Suite
214     - node tests/run_suite.js
215     - ```
216     -
217     - ### Build Options
218     -
219     - | Option | Default | Description |
220     - |--------|---------|-------------|
221     - | `ATA_TESTING` | `ON` | Build test suite |
222     - | `ATA_BENCHMARKS` | `OFF` | Build benchmarks |
223     - | `ATA_SANITIZE` | `OFF` | Enable address sanitizer |
    195 + npm run build
    196 + npm test
224 197
225     -
226     -
227     - ### C++ API
228     -
229     - #### `ata::compile(schema_json) -> schema_ref`
230     - Compile a JSON Schema string. Returns a reusable `schema_ref` (falsy on error).
231     -
232     - #### `ata::validate(schema_ref, json, opts) -> validation_result`
233     - Validate a JSON string against a pre-compiled schema. Pass `{.all_errors = false}` to stop at first error (faster).
234     -
235     - #### `ata::validation_result`
236     - ```cpp
237     - struct validation_result {
238     -   bool valid;
239     -   std::vector<validation_error> errors;
240     -   explicit operator bool() const noexcept { return valid; }
241     - };
242     - ```
243     -
244     - ### Node.js API
245     -
246     - #### `new Validator(schema)`
247     - Create a validator with a pre-compiled schema. `schema` can be an object or JSON string.
248     -
249     - #### `validator.validate(data) -> { valid, errors }`
250     - Validate any JS value directly via V8 traversal (no serialization).
251     -
252     - #### `validator.validateJSON(jsonString) -> { valid, errors }`
253     - Validate a JSON string via simdjson (fastest path for string input).
254     -
255     - #### `validate(schema, data) -> { valid, errors }`
256     - One-shot validation without pre-compilation.
257     -
258     - ### ajv-compatible API (`compat.js`)
259     -
260     - ```javascript
261     - const Ata = require('ata-validator/compat');
262     - const ata = new Ata();
263     - const validate = ata.compile(schema);
264     - const valid = validate(data);
265     - if (!valid) console.log(validate.errors);
    198 + # JSON Schema Test Suite
    199 + npm run test:suite
266 200   ```
267 201
268 202   ## License
269 203
270     -
    204 + MIT
271 205
272     -
273     - - MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
    206 + ## Author
274 207
275     -
    208 + [Mert Can Altin](https://github.com/mertcanaltin)
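The new README's parallel batch example calls `Buffer.from(lines.join('\n'))` without showing where `lines` comes from. A minimal sketch of preparing such an NDJSON buffer is given below. The record data, `checkEachLine`, and `hasNameAndEmail` are hypothetical and use no ata-validator API; the per-line check is a plain JavaScript stand-in so the example runs on its own.

```javascript
// Hypothetical sample records; the second one is missing "email".
const records = [
  { name: 'Mert', email: 'mert@example.com' },
  { name: 'Ada' },
];

// NDJSON: one JSON document per line. JSON.stringify never emits raw
// newlines, so each record stays on a single line.
const lines = records.map((record) => JSON.stringify(record));
const ndjson = Buffer.from(lines.join('\n'));

// Stand-in for a batch check: returns one boolean per non-empty line.
// A real validator would replace `check`; unparseable lines count as invalid.
function checkEachLine(buffer, check) {
  return buffer
    .toString('utf8')
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => {
      try {
        return check(JSON.parse(line));
      } catch {
        return false;
      }
    });
}

const hasNameAndEmail = (o) =>
  typeof o.name === 'string' && typeof o.email === 'string';

const results = checkEachLine(ndjson, hasNameAndEmail);
const validCount = results.filter(Boolean).length;
```

The shape of `results` (one boolean per line) and `validCount` (a number) matches the `bool[]` and `number` comments in the README example, but how the library itself splits and schedules lines is not shown in this diff.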