eyeleng 1.0.6 → 1.0.7
- package/README.md +275 -18
- package/dist/browser/eyeleng.browser.js +123 -13
- package/eyeleng.js +123 -13
- package/package.json +6 -4
- package/playground.html +1 -1
- package/reports/w3c-rdf-earl.ttl +11707 -0
- package/reports/w3c-shacl12-rules-earl.ttl +1055 -1336
- package/src/parser.js +76 -2
- package/src/rdfManifest.js +13 -0
- package/src/shacl12RulesManifest.js +386 -0
- package/src/tokenizer.js +47 -11
- package/test/api.test.js +32 -0
- package/test/shacl12-rules.test.js +37 -215
- package/test/w3c-rdf.test.js +5 -2
- package/tools/w3c-rdf.js +20 -1
- package/tools/w3c-shacl12-rules.js +55 -0
- package/HANDBOOK.md +0 -1070
package/README.md
CHANGED
|
@@ -3,38 +3,146 @@
|
|
|
3
3
|
[eyeleng on npm](https://www.npmjs.com/package/eyeleng)
|
|
4
4
|
[DOI 10.5281/zenodo.20342577](https://doi.org/10.5281/zenodo.20342577)
|
|
5
5
|
|
|
6
|
-
`eyeleng` stands for **EYE Logic Engine**.
|
|
6
|
+
`eyeleng` stands for **EYE Logic Engine**. It is a compact JavaScript implementation of SHACL 1.2 Rules with two rule front-ends:
|
|
7
|
+
|
|
8
|
+
- **SRL** — the Shape Rules Language syntax used by the SHACL 1.2 Rules draft.
|
|
9
|
+
- **RDF Rules** — a Turtle/RDF syntax for rule sets.
|
|
10
|
+
|
|
11
|
+
Eyeleng is a forward-chaining reasoner over RDF-style triples. It is deliberately small, dependency-free at runtime, readable as ordinary JavaScript, and usable from the CLI, Node.js, and the browser playground.
|
|
12
|
+
|
|
13
|
+
Eyeleng implements the rules/reasoning surface. It is **not** a SHACL validation engine and does not emit SHACL validation reports.
|
|
7
14
|
|
|
8
15
|
## Quick start
|
|
9
16
|
|
|
10
17
|
```sh
|
|
18
|
+
npm install
|
|
11
19
|
npm test
|
|
12
20
|
./eyeleng.js examples/family.srl
|
|
13
|
-
./eyeleng.js examples/
|
|
14
|
-
./eyeleng.js examples/deep-taxonomy-100.srl
|
|
21
|
+
./eyeleng.js --all examples/family.srl
|
|
15
22
|
./eyeleng.js examples/basic-ruleset.ttl
|
|
16
|
-
./eyeleng.js --syntax rdf examples/w3c-rule-set-snippet.ttl
|
|
17
23
|
./eyeleng.js --check --deps examples/stratified-negation.srl
|
|
18
24
|
```
|
|
19
25
|
|
|
20
|
-
|
|
26
|
+
A minimal SRL program:
|
|
27
|
+
|
|
28
|
+
```srl
|
|
29
|
+
PREFIX : <http://example/>
|
|
30
|
+
|
|
31
|
+
DATA {
|
|
32
|
+
:Socrates a :Man .
|
|
33
|
+
}
|
|
34
|
+
|
|
35
|
+
RULE { ?x a :Mortal } WHERE { ?x a :Man }
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
It derives:
|
|
39
|
+
|
|
40
|
+
```srl
|
|
41
|
+
:Socrates a :Mortal .
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
Open the [Playground](https://eyereasoner.github.io/eyeleng/playground) for a self-contained browser UI with URL loading, autosave, share links, diagnostics, queries, and SRL/RDF Rules syntax selection.
|
|
45
|
+
|
|
46
|
+
## How the reasoner works
|
|
47
|
+
|
|
48
|
+
Eyeleng computes the closure of a rule set:
|
|
49
|
+
|
|
50
|
+
```text
|
|
51
|
+
parse source
|
|
52
|
+
analyze dependencies and strata
|
|
53
|
+
match rule bodies against known triples
|
|
54
|
+
instantiate rule heads
|
|
55
|
+
add new triples
|
|
56
|
+
repeat until stable
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
A rule has a body and a head:
|
|
60
|
+
|
|
61
|
+
```srl
|
|
62
|
+
RULE { ?child :childOf ?parent } WHERE { ?parent :parentOf ?child }
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
If the graph contains:
|
|
66
|
+
|
|
67
|
+
```srl
|
|
68
|
+
:alice :parentOf :bob .
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
then the body matches with `?parent = :alice` and `?child = :bob`, and the head derives:
|
|
72
|
+
|
|
73
|
+
```srl
|
|
74
|
+
:bob :childOf :alice .
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
Negation is handled by stratified evaluation: rules are grouped into dependency layers, and recursion through negation is rejected so the result stays deterministic.
|
|
78
|
+
|
|
79
|
+
## Language surface
|
|
80
|
+
|
|
81
|
+
SRL supports the practical rule features used by the SHACL 1.2 Rules tests and examples:
|
|
82
|
+
|
|
83
|
+
```srl
|
|
84
|
+
PREFIX : <http://example/>
|
|
85
|
+
|
|
86
|
+
DATA {
|
|
87
|
+
:alice :score 7 .
|
|
88
|
+
:bob :score 3 .
|
|
89
|
+
}
|
|
90
|
+
|
|
91
|
+
RULE { ?x :grade ?grade } WHERE {
|
|
92
|
+
?x :score ?score .
|
|
93
|
+
FILTER(?score >= 5) .
|
|
94
|
+
BIND(concat("pass-", str(?score)) AS ?grade)
|
|
95
|
+
}
|
|
96
|
+
|
|
97
|
+
RULE { ?x :eligible true } WHERE {
|
|
98
|
+
?x :grade ?grade .
|
|
99
|
+
NOT { ?x :blocked true . }
|
|
100
|
+
}
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
Implemented syntax includes:
|
|
104
|
+
|
|
105
|
+
- `PREFIX`, `BASE`, `VERSION`, and `IMPORTS`
|
|
106
|
+
- `DATA`, `RULE`, `WHERE`, `FILTER`, `BIND`, `SET`, and `NOT`
|
|
107
|
+
- variables such as `?x`
|
|
108
|
+
- IRIs, prefixed names, blank nodes, literals, RDF collections, and RDF 1.2 triple terms
|
|
109
|
+
- Turtle-style `;`, `,`, `a`, blank-node property lists, lists, annotations, and reifiers where supported
|
|
110
|
+
- property paths in rule bodies
|
|
111
|
+
- language tags, base-direction literals, and common XML Schema datatypes
|
|
112
|
+
- SRL and RDF Rules syntax front-ends
|
|
113
|
+
|
|
114
|
+
The RDF parsing path is shared with the W3C RDF syntax harness, so SRL `DATA { ... }` uses the same grammar-hardened RDF parser surface as Turtle/TriG input.
|
|
21
115
|
|
|
22
|
-
|
|
116
|
+
## Builtins and expressions
|
|
23
117
|
|
|
24
|
-
|
|
118
|
+
`FILTER`, `BIND`, and `SET` use expression evaluation. Supported operations include comparisons, boolean operators, arithmetic, `IN`, `NOT IN`, datatype/language checks, string functions, numeric functions, and selected date/time helpers.
|
|
25
119
|
|
|
26
|
-
|
|
120
|
+
Common builtins include:
|
|
27
121
|
|
|
28
|
-
|
|
122
|
+
```text
|
|
123
|
+
str, concat, lcase, ucase, replace
|
|
124
|
+
abs, round, floor, ceil
|
|
125
|
+
datatype, lang, iri, uri
|
|
126
|
+
now, year, month, day
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
The goal is useful SHACL Rules/SRL behavior, not complete SPARQL expression coverage.
|
|
29
130
|
|
|
30
|
-
|
|
131
|
+
## RDF 1.2 features
|
|
31
132
|
|
|
32
|
-
|
|
133
|
+
Eyeleng includes grammar-hardened RDF 1.1/1.2 parsing support in `src/rdfSyntax.js` and W3C manifest runners in `src/rdfManifest.js` / `src/rdfEntailment.js`.
|
|
33
134
|
|
|
135
|
+
Covered surfaces include:
|
|
34
136
|
|
|
35
|
-
|
|
137
|
+
- N-Triples and N-Quads
|
|
138
|
+
- Turtle and TriG
|
|
139
|
+
- RDF 1.2 triple terms
|
|
140
|
+
- reifiers and annotation blocks
|
|
141
|
+
- language-direction literals
|
|
142
|
+
- graph isomorphism for blank nodes
|
|
143
|
+
- simple, RDF, and RDFS entailment checks for RDF-MT / RDF 1.2 Semantics manifests
|
|
36
144
|
|
|
37
|
-
|
|
145
|
+
W3C checks:
|
|
38
146
|
|
|
39
147
|
```sh
|
|
40
148
|
npm run w3c:rules
|
|
@@ -44,10 +152,9 @@ npm run w3c:rdf:json
|
|
|
44
152
|
npm run w3c:rdf:earl
|
|
45
153
|
```
|
|
46
154
|
|
|
47
|
-
`npm test` includes
|
|
48
|
-
|
|
49
|
-
The RDF harness covers N-Triples, N-Quads, Turtle, and TriG RDF 1.1/1.2 parser syntax/eval manifests, plus the RDF-MT and RDF 1.2 Semantics entailment manifests. Entailment tests are evaluated under their declared `mf:entailmentRegime` (`simple`, `RDF`, or `RDFS`) with their declared recognized datatypes.
|
|
155
|
+
`npm test` includes the W3C harnesses. When W3C URLs are reachable, progress is printed test by test. In offline environments, remote W3C checks are reported as unreachable unless `EYELENG_W3C_REQUIRED=1` is set.
|
|
50
156
|
|
|
157
|
+
The official Eyeleng EARL 1.0 report for the W3C SHACL 1.2 Rules manifest is in [reports/w3c-shacl12-rules-earl.ttl](./reports/w3c-shacl12-rules-earl.ttl).
|
|
51
158
|
|
|
52
159
|
## RDF Message Logs
|
|
53
160
|
|
|
@@ -64,7 +171,7 @@ MESSAGE
|
|
|
64
171
|
_:reading :sensor :s2 .
|
|
65
172
|
```
|
|
66
173
|
|
|
67
|
-
Use it directly, import it from SRL, or force message-log parsing from the CLI:
|
|
174
|
+
Use message logs directly, import them from SRL, or force message-log parsing from the CLI:
|
|
68
175
|
|
|
69
176
|
```sh
|
|
70
177
|
./eyeleng.js examples/rdf-messages.srl
|
|
@@ -72,4 +179,154 @@ Use it directly, import it from SRL, or force message-log parsing from the CLI:
|
|
|
72
179
|
./eyeleng.js --stream-messages --all examples/rdf-messages.trig
|
|
73
180
|
```
|
|
74
181
|
|
|
75
|
-
The replay data includes
|
|
182
|
+
The replay data includes message streams, envelopes, offsets, next-envelope links, payload kind, payload graph, and `eymsg:payloadTriple` triple terms. Blank-node labels are scoped per message. For Eyeleng, each payload graph is also represented as a closed RDF list of RDF 1.2 triple terms via `log:nameOf`.
|
|
183
|
+
|
|
184
|
+
## CLI
|
|
185
|
+
|
|
186
|
+
Common commands:
|
|
187
|
+
|
|
188
|
+
```sh
|
|
189
|
+
./eyeleng.js examples/family.srl
|
|
190
|
+
./eyeleng.js --all examples/family.srl
|
|
191
|
+
./eyeleng.js --check --deps examples/stratified-negation.srl
|
|
192
|
+
./eyeleng.js --json --trace --stats examples/if-then.srl
|
|
193
|
+
./eyeleng.js --query-file examples/query-body.txt examples/query.srl
|
|
194
|
+
./eyeleng.js --syntax rdf examples/w3c-rule-set-snippet.ttl
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
Important options:
|
|
198
|
+
|
|
199
|
+
```text
|
|
200
|
+
--all print input and inferred triples
|
|
201
|
+
--check parse and analyze only
|
|
202
|
+
--strict warnings become fatal
|
|
203
|
+
--deps print dependency edges and layers
|
|
204
|
+
--trace show rule firings
|
|
205
|
+
--stats show iteration and rule counts
|
|
206
|
+
--json structured output
|
|
207
|
+
--query run a raw body pattern over the closure
|
|
208
|
+
--query-file FILE read a query body from a file
|
|
209
|
+
--max-iterations N recursive-layer fixpoint safety guard
|
|
210
|
+
--no-imports parse imports but do not load them
|
|
211
|
+
--rdf-messages parse input as an RDF Message Log
|
|
212
|
+
--stream-messages replay RDF Message Log envelopes
|
|
213
|
+
```
|
|
214
|
+
|
|
215
|
+
## Public API
|
|
216
|
+
|
|
217
|
+
Typical API use:
|
|
218
|
+
|
|
219
|
+
```js
|
|
220
|
+
const { run, formatTriples } = require('./src/index.js');
|
|
221
|
+
|
|
222
|
+
const result = run(`
|
|
223
|
+
PREFIX : <http://example/>
|
|
224
|
+
DATA { :Socrates a :Man . }
|
|
225
|
+
RULE { ?x a :Mortal } WHERE { ?x a :Man }
|
|
226
|
+
`);
|
|
227
|
+
|
|
228
|
+
console.log(formatTriples(result.inferred, result.prefixes));
|
|
229
|
+
```
|
|
230
|
+
|
|
231
|
+
Query mode:
|
|
232
|
+
|
|
233
|
+
```js
|
|
234
|
+
const { runQuery, formatBindings } = require('./src/index.js');
|
|
235
|
+
|
|
236
|
+
const result = runQuery(source, '?x :ancestorOf ?y');
|
|
237
|
+
console.log(formatBindings(result.query.bindings, result.prefixes));
|
|
238
|
+
```
|
|
239
|
+
|
|
240
|
+
Imports:
|
|
241
|
+
|
|
242
|
+
```js
|
|
243
|
+
const result = run(source, {
|
|
244
|
+
baseIRI: 'file:///main.srl',
|
|
245
|
+
importResolver(target) {
|
|
246
|
+
return {
|
|
247
|
+
source: readSomehow(target),
|
|
248
|
+
options: { baseIRI: target, filename: target }
|
|
249
|
+
};
|
|
250
|
+
}
|
|
251
|
+
});
|
|
252
|
+
```
|
|
253
|
+
|
|
254
|
+
The API returns structured parsed programs, diagnostics, inferred triples, closure triples, traces, stats, and query bindings.
|
|
255
|
+
|
|
256
|
+
## Project layout
|
|
257
|
+
|
|
258
|
+
```text
|
|
259
|
+
src/tokenizer.js source text -> tokens
|
|
260
|
+
src/parser.js SRL parser -> program object
|
|
261
|
+
src/rdfSyntax.js RDF 1.1/1.2 N-Triples/N-Quads/Turtle/TriG syntax
|
|
262
|
+
src/rdfManifest.js W3C RDF manifest runner
|
|
263
|
+
src/rdfEntailment.js simple/RDF/RDFS entailment checks
|
|
264
|
+
src/rdfMessages.js RDF Message Log replay support
|
|
265
|
+
src/term.js terms, keys, equality, formatting
|
|
266
|
+
src/store.js triple set, predicate index, matching, paths
|
|
267
|
+
src/builtins.js expression evaluation and built-in functions
|
|
268
|
+
src/analyze.js diagnostics, dependencies, strata
|
|
269
|
+
src/engine.js layered forward-chaining evaluator
|
|
270
|
+
src/query.js external raw-body query operation
|
|
271
|
+
src/format.js text and JSON output
|
|
272
|
+
src/api.js public JavaScript API and import merging
|
|
273
|
+
src/cli.js command-line interface
|
|
274
|
+
tools/bundle.js self-contained bundle generator
|
|
275
|
+
test/*.test.js executable regression and conformance tests
|
|
276
|
+
examples/*.srl runnable SRL examples
|
|
277
|
+
examples/*.ttl RDF Rules / Turtle examples
|
|
278
|
+
```
|
|
279
|
+
|
|
280
|
+
A good reading order is `term.js`, `tokenizer.js`, `parser.js`, `rdfSyntax.js`, `store.js`, `builtins.js`, `analyze.js`, `engine.js`, then `api.js` and `cli.js`.
|
|
281
|
+
|
|
282
|
+
## Tests and build
|
|
283
|
+
|
|
284
|
+
```sh
|
|
285
|
+
npm test
|
|
286
|
+
npm run build
|
|
287
|
+
```
|
|
288
|
+
|
|
289
|
+
`npm run build` writes the command-line bundle to `eyeleng.js` and the browser API bundle to `dist/browser/eyeleng.browser.js`. In a browser, the bundle exposes `window.eyeleng`.
|
|
290
|
+
|
|
291
|
+
The tests are executable documentation. They cover parsing, recursion, filters, negation, assignment, typed/language literals, RDF 1.2 syntax, property paths, stratification, imports, queries, examples, deep taxonomy benchmarks, W3C SHACL Rules, W3C RDF syntax, and RDF/RDFS entailment.
|
|
292
|
+
|
|
293
|
+
## Examples
|
|
294
|
+
|
|
295
|
+
Examples live in [examples/](./examples/):
|
|
296
|
+
|
|
297
|
+
- `family.srl` — small recursive rules
|
|
298
|
+
- `negation.srl` — stratified negation
|
|
299
|
+
- `assignment.srl` — assignment and expressions
|
|
300
|
+
- `property-paths.srl` — path matching in bodies
|
|
301
|
+
- `basic-ruleset.ttl` — RDF Rules syntax
|
|
302
|
+
- `rdf-messages.srl` / `rdf-messages.trig` — RDF Message Log replay
|
|
303
|
+
- `deep-taxonomy-*.srl` — generated benchmark programs
|
|
304
|
+
|
|
305
|
+
## Known boundaries
|
|
306
|
+
|
|
307
|
+
Eyeleng intentionally remains a compact reasoner:
|
|
308
|
+
|
|
309
|
+
- it does not implement SHACL validation or validation reports
|
|
310
|
+
- it does not aim to be a full RDF database
|
|
311
|
+
- RDF Rules syntax support is a front-end for rule execution, not a shapes-validation layer
|
|
312
|
+
- property paths and SPARQL expressions are practical subsets
|
|
313
|
+
- W3C manifests are used as executable alignment tests, but implementation status should be read from the current test reports
|
|
314
|
+
|
|
315
|
+
## Extending Eyeleng safely
|
|
316
|
+
|
|
317
|
+
Preserve the pipeline:
|
|
318
|
+
|
|
319
|
+
```text
|
|
320
|
+
syntax -> AST/program -> analysis -> evaluation -> formatting
|
|
321
|
+
```
|
|
322
|
+
|
|
323
|
+
Avoid making the evaluator parse strings. Parsing belongs in `parser.js` or `rdfSyntax.js`. Avoid making the parser derive triples. Inference belongs in `engine.js`.
|
|
324
|
+
|
|
325
|
+
A safe extension usually needs:
|
|
326
|
+
|
|
327
|
+
1. syntax support
|
|
328
|
+
2. one focused example
|
|
329
|
+
3. parser/API tests
|
|
330
|
+
4. execution tests
|
|
331
|
+
5. README or handbook notes
|
|
332
|
+
6. bundle regeneration
|
|
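The "How the reasoner works" loop in the new README (match rule bodies, instantiate heads, add new triples, repeat until stable) can be illustrated with a small standalone sketch. This is not Eyeleng's engine: the array-based triple format and the names `matchTriple`, `matchBody`, and `closure` are assumptions made for illustration, and the sketch ignores strata, negation, filters, and indexing.

```js
// Minimal forward-chaining sketch (illustrative only; not Eyeleng's engine).
// Triples are [subject, predicate, object] arrays; variables start with '?'.
const isVar = (term) => typeof term === 'string' && term.startsWith('?');

// Try to extend `bindings` so that `pattern` matches `triple`; null on mismatch.
function matchTriple(pattern, triple, bindings) {
  const next = { ...bindings };
  for (let i = 0; i < 3; i += 1) {
    const p = pattern[i];
    if (isVar(p)) {
      if (p in next && next[p] !== triple[i]) return null;
      next[p] = triple[i];
    } else if (p !== triple[i]) {
      return null;
    }
  }
  return next;
}

// Find every binding set that satisfies all body patterns.
function matchBody(body, triples) {
  let solutions = [{}];
  for (const pattern of body) {
    const extended = [];
    for (const bindings of solutions) {
      for (const triple of triples) {
        const next = matchTriple(pattern, triple, bindings);
        if (next) extended.push(next);
      }
    }
    solutions = extended;
  }
  return solutions;
}

// Apply all rules repeatedly until a full pass derives nothing new.
function closure(data, rules) {
  const known = new Map(data.map((t) => [JSON.stringify(t), t]));
  let changed = true;
  while (changed) {
    changed = false;
    for (const rule of rules) {
      for (const bindings of matchBody(rule.body, [...known.values()])) {
        for (const pattern of rule.head) {
          const triple = pattern.map((term) => (isVar(term) ? bindings[term] : term));
          const key = JSON.stringify(triple);
          if (!known.has(key)) { known.set(key, triple); changed = true; }
        }
      }
    }
  }
  return [...known.values()];
}

// The README's parentOf/childOf example, expressed in the sketch's format.
const derived = closure(
  [[':alice', ':parentOf', ':bob']],
  [{ head: [['?child', ':childOf', '?parent']], body: [['?parent', ':parentOf', '?child']] }]
);
console.log(derived);
```

The loop stops on the second pass because the only derivable triple, `:bob :childOf :alice`, is already known. A real engine would typically avoid rescanning every triple on each pass.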
@@ -166,7 +166,7 @@
|
|
|
166
166
|
|
|
167
167
|
class Parser {
|
|
168
168
|
constructor(source, options = {}) {
|
|
169
|
-
this.tokens = Array.isArray(source) ? source : tokenize(source, options
|
|
169
|
+
this.tokens = Array.isArray(source) ? source : tokenize(source, options);
|
|
170
170
|
this.pos = 0;
|
|
171
171
|
this.options = options;
|
|
172
172
|
this.baseIRI = options.baseIRI || null;
|
|
@@ -229,6 +229,7 @@
|
|
|
229
229
|
let name = nameToken.value;
|
|
230
230
|
if (!name.endsWith(':')) throw this.error('Prefix name must end with :', nameToken);
|
|
231
231
|
name = name.slice(0, -1);
|
|
232
|
+
if (this.strictGrammar() && !isValidPNPrefix(name)) throw this.error(`Invalid prefix name ${nameToken.value}`, nameToken);
|
|
232
233
|
const iriToken = this.expectType('iri');
|
|
233
234
|
this.prefixes[name] = this.resolveIRI(iriToken.value, iriToken);
|
|
234
235
|
if (wasAtPrefix) this.consumeOptionalDot();
|
|
@@ -236,6 +237,10 @@
|
|
|
236
237
|
|
|
237
238
|
parseVersion() {
|
|
238
239
|
const token = this.expectType('string');
|
|
240
|
+
if (this.strictGrammar()) {
|
|
241
|
+
if (token.long) throw this.error('VERSION must use a short string literal', token);
|
|
242
|
+
if (token.value !== '1.2') throw this.error('VERSION must be the SHACL Rules version label \"1.2\"', token);
|
|
243
|
+
}
|
|
239
244
|
this.version = token.value;
|
|
240
245
|
}
|
|
241
246
|
|
|
@@ -631,6 +636,7 @@
|
|
|
631
636
|
} else if (this.matchWord('SET')) {
|
|
632
637
|
clauses.push(this.parseSetClause());
|
|
633
638
|
} else if (this.matchWord('BIND')) {
|
|
639
|
+
if (this.strictGrammar()) throw this.error('BIND is not part of the SHACL 1.2 Rules grammar; use SET');
|
|
634
640
|
clauses.push(this.parseBindClause());
|
|
635
641
|
} else if (this.matchWord('NOT')) {
|
|
636
642
|
this.expectValue('{');
|
|
@@ -738,8 +744,9 @@
|
|
|
738
744
|
if (colon < 0) throw this.error(`Expected IRI, prefixed name, literal, blank node, or variable; got ${value}`, token);
|
|
739
745
|
const prefix = value.slice(0, colon);
|
|
740
746
|
const local = value.slice(colon + 1);
|
|
747
|
+
if (this.strictGrammar()) validatePrefixedName(prefix, local, value, token, (message, errToken) => this.error(message, errToken));
|
|
741
748
|
if (!(prefix in this.prefixes)) throw this.error(`Unknown prefix ${prefix}:`, token);
|
|
742
|
-
return this.prefixes[prefix] + local;
|
|
749
|
+
return this.prefixes[prefix] + decodePNLocalEscapes(local);
|
|
743
750
|
}
|
|
744
751
|
|
|
745
752
|
resolveIRI(value, token = null) {
|
|
@@ -898,9 +905,76 @@
|
|
|
898
905
|
peek() { return this.tokens[this.pos]; }
|
|
899
906
|
peekN(n) { return this.tokens[this.pos + n] || this.tokens[this.tokens.length - 1]; }
|
|
900
907
|
previous() { return this.tokens[this.pos - 1]; }
|
|
908
|
+
strictGrammar() { return !!this.options.strictGrammar; }
|
|
901
909
|
error(message, token = this.peek()) { return new SyntaxErrorWithLocation(message, token); }
|
|
902
910
|
}
|
|
903
911
|
|
|
912
|
+
|
|
913
|
+
function isPnCharsBase(ch) {
|
|
914
|
+
if (!ch) return false;
|
|
915
|
+
return /[A-Za-z]/.test(ch) || ch.codePointAt(0) >= 0x00C0;
|
|
916
|
+
}
|
|
917
|
+
|
|
918
|
+
function isPnCharsU(ch) {
|
|
919
|
+
return isPnCharsBase(ch) || ch === '_';
|
|
920
|
+
}
|
|
921
|
+
|
|
922
|
+
function isPnChars(ch) {
|
|
923
|
+
return isPnCharsU(ch) || /[0-9-]/.test(ch) || ch === '\u00B7' || /[\u0300-\u036F\u203F-\u2040]/u.test(ch);
|
|
924
|
+
}
|
|
925
|
+
|
|
926
|
+
function isValidPNPrefix(prefix) {
|
|
927
|
+
if (prefix === '') return true;
|
|
928
|
+
const chars = Array.from(prefix);
|
|
929
|
+
if (!isPnCharsBase(chars[0])) return false;
|
|
930
|
+
if (chars.length > 1 && chars.at(-1) === '.') return false;
|
|
931
|
+
return chars.slice(1).every((ch) => isPnChars(ch) || ch === '.');
|
|
932
|
+
}
|
|
933
|
+
|
|
934
|
+
function plxLength(text, index) {
|
|
935
|
+
const ch = text[index];
|
|
936
|
+
if (ch === '%' && /[0-9A-Fa-f]/.test(text[index + 1] || '') && /[0-9A-Fa-f]/.test(text[index + 2] || '')) return 3;
|
|
937
|
+
if (ch === '\\' && /[_~.!$&'()*+,;=/?#@%-]/.test(text[index + 1] || '')) return 2;
|
|
938
|
+
return 0;
|
|
939
|
+
}
|
|
940
|
+
|
|
941
|
+
function isPNLocalStartAt(text, index) {
|
|
942
|
+
const ch = text[index];
|
|
943
|
+
return isPnCharsU(ch) || /[0-9:]/.test(ch || '') || plxLength(text, index) > 0;
|
|
944
|
+
}
|
|
945
|
+
|
|
946
|
+
function isPNLocalBodyAt(text, index) {
|
|
947
|
+
const ch = text[index];
|
|
948
|
+
return isPnChars(ch) || ch === '.' || ch === ':' || plxLength(text, index) > 0;
|
|
949
|
+
}
|
|
950
|
+
|
|
951
|
+
function isPNLocalEndAt(text, index) {
|
|
952
|
+
const ch = text[index];
|
|
953
|
+
return isPnChars(ch) || ch === ':' || plxLength(text, index) > 0;
|
|
954
|
+
}
|
|
955
|
+
|
|
956
|
+
function validatePNLocal(local) {
|
|
957
|
+
if (local === '') return true;
|
|
958
|
+
if (!isPNLocalStartAt(local, 0)) return false;
|
|
959
|
+
let lastStart = 0;
|
|
960
|
+
for (let i = 0; i < local.length;) {
|
|
961
|
+
const len = plxLength(local, i) || 1;
|
|
962
|
+
if (i > 0 && !isPNLocalBodyAt(local, i)) return false;
|
|
963
|
+
lastStart = i;
|
|
964
|
+
i += len;
|
|
965
|
+
}
|
|
966
|
+
return isPNLocalEndAt(local, lastStart);
|
|
967
|
+
}
|
|
968
|
+
|
|
969
|
+
function validatePrefixedName(prefix, local, value, token, makeError) {
|
|
970
|
+
if (!isValidPNPrefix(prefix)) throw makeError(`Invalid prefixed name ${value}: invalid prefix`, token);
|
|
971
|
+
if (!validatePNLocal(local)) throw makeError(`Invalid prefixed name ${value}: invalid local name`, token);
|
|
972
|
+
}
|
|
973
|
+
|
|
974
|
+
function decodePNLocalEscapes(local) {
|
|
975
|
+
return String(local).replace(/\\([_~.!$&'()*+,;=/?#@%-])/g, '$1');
|
|
976
|
+
}
|
|
977
|
+
|
|
904
978
|
function numericLiteral(value) {
|
|
905
979
|
if (Number.isInteger(value)) return literal(value, XSD_INTEGER);
|
|
906
980
|
return literal(value, XSD_DECIMAL);
|
|
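The practical effect of `decodePNLocalEscapes` is that backslash escapes in the local part of a prefixed name are now removed before the name is appended to the namespace IRI, whereas 1.0.6 concatenated the local part unchanged. The sketch below copies `decodePNLocalEscapes` from the hunk above; `expandPrefixedName` is a hypothetical helper written only for this illustration and is not part of the package.

```js
// Copied from the added parser code in this diff.
function decodePNLocalEscapes(local) {
  return String(local).replace(/\\([_~.!$&'()*+,;=/?#@%-])/g, '$1');
}

// Hypothetical helper for illustration: expand a prefixed name with a prefix map.
function expandPrefixedName(value, prefixes) {
  const colon = value.indexOf(':');
  if (colon < 0) throw new Error(`Not a prefixed name: ${value}`);
  const prefix = value.slice(0, colon);
  if (!(prefix in prefixes)) throw new Error(`Unknown prefix ${prefix}:`);
  return prefixes[prefix] + decodePNLocalEscapes(value.slice(colon + 1));
}

const prefixes = { ex: 'http://example/' };
const plain = expandPrefixedName('ex:item', prefixes);
const escaped = expandPrefixedName('ex:v1\\.0\\~beta', prefixes);
const percent = expandPrefixedName('ex:a%20b', prefixes);

console.log(plain);   // http://example/item
console.log(escaped); // http://example/v1.0~beta
console.log(percent); // http://example/a%20b
```

Percent-encoded sequences such as `%20` pass through untouched; only the backslash-escaped punctuation characters are rewritten.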
@@ -970,7 +1044,10 @@
|
|
|
970
1044
|
}
|
|
971
1045
|
}
|
|
972
1046
|
|
|
973
|
-
function tokenize(source,
|
|
1047
|
+
function tokenize(source, filenameOrOptions = '<input>') {
|
|
1048
|
+
const options = typeof filenameOrOptions === 'object' && filenameOrOptions !== null ? filenameOrOptions : { filename: filenameOrOptions };
|
|
1049
|
+
const filename = options.filename || '<input>';
|
|
1050
|
+
const strictGrammar = !!options.strictGrammar;
|
|
974
1051
|
const tokens = [];
|
|
975
1052
|
let i = 0;
|
|
976
1053
|
let line = 1;
|
|
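The new `tokenize(source, filenameOrOptions)` signature accepts either a filename string, as before, or an options object. The argument handling can be isolated as in the sketch below; `normalizeTokenizeOptions` is a name invented here for illustration, while the expressions inside it mirror the added lines in this hunk.

```js
// Mirrors the argument normalization added to tokenize() in this diff.
// The wrapper function itself is hypothetical.
function normalizeTokenizeOptions(filenameOrOptions = '<input>') {
  const options = typeof filenameOrOptions === 'object' && filenameOrOptions !== null
    ? filenameOrOptions
    : { filename: filenameOrOptions };
  return {
    filename: options.filename || '<input>',
    strictGrammar: !!options.strictGrammar,
  };
}

// Old call style: a bare filename string still works.
const legacy = normalizeTokenizeOptions('rules.srl');
// New call style: an options object can also switch on strict grammar checks.
const modern = normalizeTokenizeOptions({ filename: 'rules.srl', strictGrammar: true });
// No argument or null both fall back to the '<input>' placeholder.
const bare = normalizeTokenizeOptions();
const nulled = normalizeTokenizeOptions(null);

console.log(legacy, modern, bare, nulled);
```

Because `Parser` now forwards its whole `options` object to `tokenize`, this normalization is what keeps existing string-based callers working.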
@@ -985,8 +1062,8 @@
|
|
|
985
1062
|
else column += 1;
|
|
986
1063
|
return ch;
|
|
987
1064
|
}
|
|
988
|
-
function token(type, value, startLine, startColumn) {
|
|
989
|
-
tokens.push({ type, value, line: startLine, column: startColumn, filename });
|
|
1065
|
+
function token(type, value, startLine, startColumn, extra = {}) {
|
|
1066
|
+
tokens.push({ type, value, line: startLine, column: startColumn, filename, ...extra });
|
|
990
1067
|
}
|
|
991
1068
|
function syntax(message, startLine, startColumn) {
|
|
992
1069
|
throw new SyntaxErrorWithLocation(message, { line: startLine, column: startColumn, filename });
|
|
@@ -1024,14 +1101,37 @@
|
|
|
1024
1101
|
const length = esc === 'u' ? 4 : 8;
|
|
1025
1102
|
let hex = '';
|
|
1026
1103
|
for (let j = 0; j < length; j += 1) {
|
|
1027
|
-
if (!/[0-9A-Fa-f]/.test(current() || '')) syntax(`Invalid
|
|
1104
|
+
if (!/[0-9A-Fa-f]/.test(current() || '')) syntax(`Invalid \\${esc} escape`, startLine, startColumn);
|
|
1028
1105
|
hex += advance();
|
|
1029
1106
|
}
|
|
1030
|
-
|
|
1107
|
+
const codePoint = Number.parseInt(hex, 16);
|
|
1108
|
+
try { return String.fromCodePoint(codePoint); }
|
|
1109
|
+
catch { syntax(`Invalid \\${esc} escape`, startLine, startColumn); }
|
|
1031
1110
|
}
|
|
1111
|
+
if (strictGrammar && !Object.hasOwn(escapeMap, esc)) syntax(`Invalid escape \\${esc}`, startLine, startColumn);
|
|
1032
1112
|
return escapeValue(esc);
|
|
1033
1113
|
}
|
|
1034
1114
|
|
|
1115
|
+
function readIriChar(startLine, startColumn) {
|
|
1116
|
+
if (current() === '\\') {
|
|
1117
|
+
advance();
|
|
1118
|
+
const esc = advance();
|
|
1119
|
+
if (esc !== 'u' && esc !== 'U') syntax(`Invalid IRI escape \\${esc}`, startLine, startColumn);
|
|
1120
|
+
const length = esc === 'u' ? 4 : 8;
|
|
1121
|
+
let hex = '';
|
|
1122
|
+
for (let j = 0; j < length; j += 1) {
|
|
1123
|
+
if (!/[0-9A-Fa-f]/.test(current() || '')) syntax(`Invalid \\${esc} escape`, startLine, startColumn);
|
|
1124
|
+
hex += advance();
|
|
1125
|
+
}
|
|
1126
|
+
const codePoint = Number.parseInt(hex, 16);
|
|
1127
|
+
try { return String.fromCodePoint(codePoint); }
|
|
1128
|
+
catch { syntax(`Invalid \\${esc} escape`, startLine, startColumn); }
|
|
1129
|
+
}
|
|
1130
|
+
const c = current();
|
|
1131
|
+
if (strictGrammar && (/[\u0000-\u0020]/.test(c) || /[<>"{}|^`]/.test(c))) syntax(`Invalid character in IRI reference ${JSON.stringify(c)}`, startLine, startColumn);
|
|
1132
|
+
return advance();
|
|
1133
|
+
}
|
|
1134
|
+
|
|
1035
1135
|
while (i < source.length) {
|
|
1036
1136
|
const ch = current();
|
|
1037
1137
|
if (/\s/.test(ch)) { advance(); continue; }
|
|
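Both the string-escape reader and the new `readIriChar` decode `\u` and `\U` escapes by parsing the hex digits and calling `String.fromCodePoint` inside a `try`/`catch`. A standalone sketch of that step follows. The function name `decodeUnicodeEscape` and the use of a plain `SyntaxError` are assumptions for this example; the package reports errors through its own `syntax(...)` helper with location information.

```js
// Sketch of \uXXXX / \UXXXXXXXX decoding, assuming the hex digits were
// already collected into a string. Not the package's actual function.
function decodeUnicodeEscape(esc, hex) {
  const length = esc === 'u' ? 4 : 8;
  // Require exactly 4 (for \u) or 8 (for \U) hexadecimal digits.
  if (!new RegExp(`^[0-9A-Fa-f]{${length}}$`).test(hex)) {
    throw new SyntaxError(`Invalid \\${esc} escape`);
  }
  const codePoint = Number.parseInt(hex, 16);
  try {
    // Throws RangeError for values above 0x10FFFF.
    return String.fromCodePoint(codePoint);
  } catch {
    throw new SyntaxError(`Invalid \\${esc} escape`);
  }
}

console.log(decodeUnicodeEscape('u', '0041'));     // A
console.log(decodeUnicodeEscape('U', '00000042')); // B

let rejected = false;
try {
  decodeUnicodeEscape('U', '00110000'); // beyond the Unicode range
} catch (err) {
  rejected = err instanceof SyntaxError;
}
console.log(rejected); // true
```

`String.fromCodePoint` accepts surrogate code points such as `D800` without throwing, so this check only rejects values above `10FFFF`.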
@@ -1077,7 +1177,7 @@
|
|
|
1077
1177
|
if (ch === '<' && looksLikeIRI(source, i)) {
|
|
1078
1178
|
let value = '';
|
|
1079
1179
|
advance();
|
|
1080
|
-
while (i < source.length && current() !== '>') value +=
|
|
1180
|
+
while (i < source.length && current() !== '>') value += readIriChar(startLine, startColumn);
|
|
1081
1181
|
if (current() !== '>') syntax('Unterminated IRI', startLine, startColumn);
|
|
1082
1182
|
advance();
|
|
1083
1183
|
token('iri', value, startLine, startColumn);
|
|
@@ -1097,7 +1197,7 @@
|
|
|
1097
1197
|
}
|
|
1098
1198
|
if (!startsWith(quote.repeat(3))) syntax('Unterminated long string literal', startLine, startColumn);
|
|
1099
1199
|
advance(); advance(); advance();
|
|
1100
|
-
token('string', value, startLine, startColumn);
|
|
1200
|
+
token('string', value, startLine, startColumn, { long: true, quote });
|
|
1101
1201
|
continue;
|
|
1102
1202
|
}
|
|
1103
1203
|
|
|
@@ -1115,7 +1215,7 @@
|
|
|
1115
1215
|
}
|
|
1116
1216
|
if (current() !== quote) syntax('Unterminated string literal', startLine, startColumn);
|
|
1117
1217
|
advance();
|
|
1118
|
-
token('string', value, startLine, startColumn);
|
|
1218
|
+
token('string', value, startLine, startColumn, { long: false, quote });
|
|
1119
1219
|
continue;
|
|
1120
1220
|
}
|
|
1121
1221
|
|
|
@@ -1161,7 +1261,16 @@
|
|
|
1161
1261
|
let value = '';
|
|
1162
1262
|
while (i < source.length) {
|
|
1163
1263
|
const c = current();
|
|
1164
|
-
if (
|
|
1264
|
+
if (c === '\\' && peek() !== undefined) {
|
|
1265
|
+
value += advance();
|
|
1266
|
+
value += advance();
|
|
1267
|
+
continue;
|
|
1268
|
+
}
|
|
1269
|
+
if (/\s/.test(c) || '{}()[],;|'.includes(c) || '=<>+-*/!^~'.includes(c)) break;
|
|
1270
|
+
if (c === '.') {
|
|
1271
|
+
const n = peek();
|
|
1272
|
+
if (n === undefined || /\s/.test(n) || '{}()[],;|'.includes(n) || '=<>+-*/!^~'.includes(n)) break;
|
|
1273
|
+
}
|
|
1165
1274
|
if (c === '#') break;
|
|
1166
1275
|
value += advance();
|
|
1167
1276
|
}
|
|
@@ -1194,9 +1303,10 @@
|
|
|
1194
1303
|
return false;
|
|
1195
1304
|
}
|
|
1196
1305
|
|
|
1306
|
+
const escapeMap = { n: '\n', r: '\r', t: '\t', b: '\b', f: '\f', '"': '"', "'": "'", '\\': '\\' };
|
|
1307
|
+
|
|
1197
1308
|
function escapeValue(esc) {
|
|
1198
|
-
|
|
1199
|
-
return map[esc] ?? esc;
|
|
1309
|
+
return escapeMap[esc] ?? esc;
|
|
1200
1310
|
}
|
|
1201
1311
|
|
|
1202
1312
|
module.exports = { tokenize, SyntaxErrorWithLocation };
|
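Hoisting `escapeMap` to module scope lets the tokenizer distinguish known escapes from unknown ones when `strictGrammar` is set, while keeping the lenient fallback otherwise. The sketch below combines the two added lines into one hypothetical function, `resolveEscape`; in the package the strict check lives in the string-escape reader and `escapeValue` stays separate.

```js
// Copied from the added tokenizer code in this diff.
const escapeMap = { n: '\n', r: '\r', t: '\t', b: '\b', f: '\f', '"': '"', "'": "'", '\\': '\\' };

// Hypothetical combined helper: strict mode rejects unknown escapes,
// lenient mode returns the escaped character unchanged.
function resolveEscape(esc, strictGrammar = false) {
  if (strictGrammar && !Object.hasOwn(escapeMap, esc)) {
    throw new SyntaxError(`Invalid escape \\${esc}`);
  }
  return escapeMap[esc] ?? esc;
}

console.log(JSON.stringify(resolveEscape('n'))); // "\n"
console.log(resolveEscape('q'));                 // q

let strictRejected = false;
try {
  resolveEscape('q', true);
} catch (err) {
  strictRejected = err instanceof SyntaxError;
}
console.log(strictRejected); // true
```

`Object.hasOwn` checks own properties only, so inherited names such as `constructor` are not mistaken for valid escapes.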