eyeleng 1.0.6 → 1.0.7

package/README.md CHANGED
@@ -3,38 +3,146 @@
  [![npm version](https://img.shields.io/npm/v/eyeleng.svg)](https://www.npmjs.com/package/eyeleng)
  [![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.20342577-blue.svg)](https://doi.org/10.5281/zenodo.20342577)
 
- `eyeleng` stands for **EYE Logic Engine**. Eyeleng is a JavaScript implementation of SHACL 1.2 Rules, including SRL and RDF Rules syntax front-ends.
+ `eyeleng` stands for **EYE Logic Engine**. It is a compact JavaScript implementation of SHACL 1.2 Rules with two rule front-ends:
+
+ - **SRL** — the Shape Rules Language syntax used by the SHACL 1.2 Rules draft.
+ - **RDF Rules** — a Turtle/RDF syntax for rule sets.
+
+ Eyeleng is a forward-chaining reasoner over RDF-style triples. It is deliberately small, dependency-free at runtime, readable as ordinary JavaScript, and usable from the CLI, Node.js, and the browser playground.
+
+ Eyeleng implements the rules/reasoning surface. It is **not** a SHACL validation engine and does not emit SHACL validation reports.
 
  ## Quick start
 
  ```sh
+ npm install
  npm test
  ./eyeleng.js examples/family.srl
- ./eyeleng.js examples/spec-2-2-recursion.srl
- ./eyeleng.js examples/deep-taxonomy-100.srl
+ ./eyeleng.js --all examples/family.srl
  ./eyeleng.js examples/basic-ruleset.ttl
- ./eyeleng.js --syntax rdf examples/w3c-rule-set-snippet.ttl
  ./eyeleng.js --check --deps examples/stratified-negation.srl
  ```
 
- ## Read next
+ A minimal SRL program:
+
+ ```srl
+ PREFIX : <http://example/>
+
+ DATA {
+   :Socrates a :Man .
+ }
+
+ RULE { ?x a :Mortal } WHERE { ?x a :Man }
+ ```
+
+ It derives:
+
+ ```srl
+ :Socrates a :Mortal .
+ ```
+
+ Open the [Playground](https://eyereasoner.github.io/eyeleng/playground) for a self-contained browser UI with URL loading, autosave, share links, diagnostics, queries, and SRL/RDF Rules syntax selection.
+
+ ## How the reasoner works
+
+ Eyeleng computes the closure of a rule set:
+
+ ```text
+ parse source
+ analyze dependencies and strata
+ match rule bodies against known triples
+ instantiate rule heads
+ add new triples
+ repeat until stable
+ ```
+
+ A rule has a body and a head:
+
+ ```srl
+ RULE { ?child :childOf ?parent } WHERE { ?parent :parentOf ?child }
+ ```
+
+ If the graph contains:
+
+ ```srl
+ :alice :parentOf :bob .
+ ```
+
+ then the body matches with `?parent = :alice` and `?child = :bob`, and the head derives:
+
+ ```srl
+ :bob :childOf :alice .
+ ```
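
Reviewer note: the match, instantiate, and repeat loop described in this README section can be sketched in a few lines. This is an illustration only, not Eyeleng's engine: triples are plain `[s, p, o]` arrays, each rule has a single-pattern body, and `matchPattern`, `instantiate`, and `closure` are hypothetical names that do not appear in the package.

```javascript
// Illustrative sketch only; none of these names come from Eyeleng's source.
const isVar = (term) => typeof term === 'string' && term.startsWith('?');

// Match one body pattern against one triple; return bindings or null.
function matchPattern(pattern, triple) {
  const bindings = {};
  for (let i = 0; i < 3; i += 1) {
    const p = pattern[i];
    if (isVar(p)) {
      if (p in bindings && bindings[p] !== triple[i]) return null;
      bindings[p] = triple[i];
    } else if (p !== triple[i]) {
      return null;
    }
  }
  return bindings;
}

// Substitute bindings into a head pattern.
const instantiate = (pattern, bindings) => pattern.map((t) => (isVar(t) ? bindings[t] : t));

// Repeat match -> instantiate -> add until a full pass adds nothing new.
function closure(facts, rules) {
  const known = new Map(facts.map((t) => [t.join(' '), t]));
  let changed = true;
  while (changed) {
    changed = false;
    for (const rule of rules) {
      for (const triple of [...known.values()]) {
        const bindings = matchPattern(rule.body, triple);
        if (!bindings) continue;
        const derived = instantiate(rule.head, bindings);
        const key = derived.join(' ');
        if (!known.has(key)) { known.set(key, derived); changed = true; }
      }
    }
  }
  return [...known.values()];
}

const result = closure(
  [[':alice', ':parentOf', ':bob']],
  [{ head: ['?child', ':childOf', '?parent'], body: ['?parent', ':parentOf', '?child'] }]
);
console.log(result); // the input triple plus [':bob', ':childOf', ':alice']
```

A real engine would join multi-pattern bodies and index triples by predicate; the sketch keeps only the fixpoint shape.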
+
+ Negation is handled by stratified evaluation: rules are grouped into dependency layers, and recursion through negation is rejected so the result stays deterministic.
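
Reviewer note: one common way to group rules into layers and reject recursion through negation is the textbook stratification loop below. It is a sketch under assumptions: the rule shape (`head`, `pos`, `neg` predicate lists) and the `stratify` function are hypothetical, and nothing here is taken from Eyeleng's `analyze.js`.

```javascript
// Illustrative sketch: each rule names the predicate it derives and the
// predicates its body uses positively (pos) or under NOT (neg).
function stratify(rules) {
  const preds = new Set();
  for (const r of rules) [r.head, ...r.pos, ...r.neg].forEach((p) => preds.add(p));
  const stratum = new Map([...preds].map((p) => [p, 0]));
  let changed = true;
  while (changed) {
    changed = false;
    for (const r of rules) {
      let needed = stratum.get(r.head);
      // A head sits at least as high as its positive dependencies...
      for (const p of r.pos) needed = Math.max(needed, stratum.get(p));
      // ...and strictly above anything it negates.
      for (const p of r.neg) needed = Math.max(needed, stratum.get(p) + 1);
      // With n predicates no valid stratum can reach n, so this is a negative cycle.
      if (needed >= preds.size) throw new Error(`Recursion through negation involving ${r.head}`);
      if (needed > stratum.get(r.head)) { stratum.set(r.head, needed); changed = true; }
    }
  }
  return stratum;
}

const strata = stratify([
  { head: ':grade', pos: [':score'], neg: [] },
  { head: ':eligible', pos: [':grade'], neg: [':blocked'] },
]);
console.log(strata.get(':eligible')); // 1: evaluated after the layer that settles :blocked
```

Evaluating layer 0 to a fixpoint before layer 1 means a `NOT` pattern only ever inspects predicates that can no longer change, which is what keeps the result deterministic.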
+
+ ## Language surface
+
+ SRL supports the practical rule features used by the SHACL 1.2 Rules tests and examples:
+
+ ```srl
+ PREFIX : <http://example/>
+
+ DATA {
+   :alice :score 7 .
+   :bob :score 3 .
+ }
+
+ RULE { ?x :grade ?grade } WHERE {
+   ?x :score ?score .
+   FILTER(?score >= 5) .
+   BIND(concat("pass-", str(?score)) AS ?grade)
+ }
+
+ RULE { ?x :eligible true } WHERE {
+   ?x :grade ?grade .
+   NOT { ?x :blocked true . }
+ }
+ ```
+
+ Implemented syntax includes:
+
+ - `PREFIX`, `BASE`, `VERSION`, and `IMPORTS`
+ - `DATA`, `RULE`, `WHERE`, `FILTER`, `BIND`, `SET`, and `NOT`
+ - variables such as `?x`
+ - IRIs, prefixed names, blank nodes, literals, RDF collections, and RDF 1.2 triple terms
+ - Turtle-style `;`, `,`, `a`, blank-node property lists, lists, annotations, and reifiers where supported
+ - property paths in rule bodies
+ - language tags, base-direction literals, and common XML Schema datatypes
+ - SRL and RDF Rules syntax front-ends
+
+ The RDF parsing path is shared with the W3C RDF syntax harness, so SRL `DATA { ... }` uses the same grammar-hardened RDF parser surface as Turtle/TriG input.
 
- Read [Handbook](https://eyereasoner.github.io/eyeleng/HANDBOOK) for the full explanation of Eyeleng as code and as a reasoning machine.
+ ## Builtins and expressions
 
- Open [Playground](https://eyereasoner.github.io/eyeleng/playground) for a self-contained browser playground with URL loading, autosave, share links, diagnostics, queries, and SRL/RDF Rules syntax selection.
+ `FILTER`, `BIND`, and `SET` use expression evaluation. Supported operations include comparisons, boolean operators, arithmetic, `IN`, `NOT IN`, datatype/language checks, string functions, numeric functions, and selected date/time helpers.
 
- `npm run build` writes the command-line bundle to `eyeleng.js` and the browser API bundle to `dist/browser/eyeleng.browser.js`. In a browser, the bundle exposes the API as `window.eyeleng`.
+ Common builtins include:
 
- The examples live in [examples/](./examples/) at one level. Draft SRL examples are named `spec-*.srl`, RDF Rules syntax examples use `.ttl`, and deep taxonomy benchmarks are named `deep-taxonomy-*.srl`.
+ ```text
+ str, concat, lcase, ucase, replace
+ abs, round, floor, ceil
+ datatype, lang, iri, uri
+ now, year, month, day
+ ```
+
+ The goal is useful SHACL Rules/SRL behavior, not complete SPARQL expression coverage.
 
- Status: Eyeleng runs a growing implementation of the SHACL 1.2 Rules draft surface. It does not implement SHACL validation.
+ ## RDF 1.2 features
 
- The official Eyeleng EARL 1.0 test report for the W3C SHACL 1.2 Rules manifest is in [reports/w3c-shacl12-rules-earl.ttl](./reports/w3c-shacl12-rules-earl.ttl). It records 88/88 passing tests for `https://w3c.github.io/data-shapes/shacl12-test-suite/tests/rules/manifest-rules.ttl`.
+ Eyeleng includes grammar-hardened RDF 1.1/1.2 parsing support in `src/rdfSyntax.js` and W3C manifest runners in `src/rdfManifest.js` / `src/rdfEntailment.js`.
 
+ Covered surfaces include:
 
- ## W3C RDF syntax and semantics manifests
+ - N-Triples and N-Quads
+ - Turtle and TriG
+ - RDF 1.2 triple terms
+ - reifiers and annotation blocks
+ - language-direction literals
+ - graph isomorphism for blank nodes
+ - simple, RDF, and RDFS entailment checks for RDF-MT / RDF 1.2 Semantics manifests
 
- Eyeleng now integrates the grammar-hardened RDF syntax work into its normal `src/rdfSyntax.js` / `src/rdfManifest.js` structure, and adds a small RDF/RDFS entailment runner in `src/rdfEntailment.js`. The W3C RDF checks are therefore part of the same parser/reasoner discipline as the RDF Rules front-end.
+ W3C checks:
 
  ```sh
  npm run w3c:rules
@@ -44,10 +152,9 @@ npm run w3c:rdf:json
  npm run w3c:rdf:earl
  ```
 
- `npm test` includes both W3C harnesses. When the W3C URLs are reachable, progress is printed test by test. In offline environments, the remote W3C checks are reported as unreachable unless `EYELENG_W3C_REQUIRED=1` is set.
-
- The RDF harness covers N-Triples, N-Quads, Turtle, and TriG RDF 1.1/1.2 parser syntax/eval manifests, plus the RDF-MT and RDF 1.2 Semantics entailment manifests. Entailment tests are evaluated under their declared `mf:entailmentRegime` (`simple`, `RDF`, or `RDFS`) with their declared recognized datatypes.
+ `npm test` includes the W3C harnesses. When W3C URLs are reachable, progress is printed test by test. In offline environments, remote W3C checks are reported as unreachable unless `EYELENG_W3C_REQUIRED=1` is set.
 
+ The official Eyeleng EARL 1.0 report for the W3C SHACL 1.2 Rules manifest is in [reports/w3c-shacl12-rules-earl.ttl](./reports/w3c-shacl12-rules-earl.ttl).
 
  ## RDF Message Logs
 
@@ -64,7 +171,7 @@ MESSAGE
  _:reading :sensor :s2 .
  ```
 
- Use it directly, import it from SRL, or force message-log parsing from the CLI:
+ Use message logs directly, import them from SRL, or force message-log parsing from the CLI:
 
  ```sh
  ./eyeleng.js examples/rdf-messages.srl
@@ -72,4 +179,154 @@ Use it directly, import it from SRL, or force message-log parsing from the CLI:
  ./eyeleng.js --stream-messages --all examples/rdf-messages.trig
  ```
 
- The replay data includes `eymsg:RDFMessageStream`, `eymsg:MessageEnvelope`, offsets, next-envelope links, payload kind, payload graph, and `eymsg:payloadTriple` triple terms. Blank-node labels are scoped per message. For Eyeleng, each payload graph is represented as a closed list of RDF 1.2 triple terms via `log:nameOf`, plus direct `eymsg:payloadTriple` links for convenient SRL rules.
+ The replay data includes message streams, envelopes, offsets, next-envelope links, payload kind, payload graph, and `eymsg:payloadTriple` triple terms. Blank-node labels are scoped per message. For Eyeleng, each payload graph is also represented as a closed RDF list of RDF 1.2 triple terms via `log:nameOf`.
+
+ ## CLI
+
+ Common commands:
+
+ ```sh
+ ./eyeleng.js examples/family.srl
+ ./eyeleng.js --all examples/family.srl
+ ./eyeleng.js --check --deps examples/stratified-negation.srl
+ ./eyeleng.js --json --trace --stats examples/if-then.srl
+ ./eyeleng.js --query-file examples/query-body.txt examples/query.srl
+ ./eyeleng.js --syntax rdf examples/w3c-rule-set-snippet.ttl
+ ```
+
+ Important options:
+
+ ```text
+ --all               print input and inferred triples
+ --check             parse and analyze only
+ --strict            warnings become fatal
+ --deps              print dependency edges and layers
+ --trace             show rule firings
+ --stats             show iteration and rule counts
+ --json              structured output
+ --query             run a raw body pattern over the closure
+ --query-file FILE   read a query body from a file
+ --max-iterations N  recursive-layer fixpoint safety guard
+ --no-imports        parse imports but do not load them
+ --rdf-messages      parse input as an RDF Message Log
+ --stream-messages   replay RDF Message Log envelopes
+ ```
+
+ ## Public API
+
+ Typical API use:
+
+ ```js
+ const { run, formatTriples } = require('./src/index.js');
+
+ const result = run(`
+ PREFIX : <http://example/>
+ DATA { :Socrates a :Man . }
+ RULE { ?x a :Mortal } WHERE { ?x a :Man }
+ `);
+
+ console.log(formatTriples(result.inferred, result.prefixes));
+ ```
+
+ Query mode:
+
+ ```js
+ const { runQuery, formatBindings } = require('./src/index.js');
+
+ const result = runQuery(source, '?x :ancestorOf ?y');
+ console.log(formatBindings(result.query.bindings, result.prefixes));
+ ```
+
+ Imports:
+
+ ```js
+ const result = run(source, {
+   baseIRI: 'file:///main.srl',
+   importResolver(target) {
+     return {
+       source: readSomehow(target),
+       options: { baseIRI: target, filename: target }
+     };
+   }
+ });
+ ```
+
+ The API returns structured parsed programs, diagnostics, inferred triples, closure triples, traces, stats, and query bindings.
+
+ ## Project layout
+
+ ```text
+ src/tokenizer.js      source text -> tokens
+ src/parser.js         SRL parser -> program object
+ src/rdfSyntax.js      RDF 1.1/1.2 N-Triples/N-Quads/Turtle/TriG syntax
+ src/rdfManifest.js    W3C RDF manifest runner
+ src/rdfEntailment.js  simple/RDF/RDFS entailment checks
+ src/rdfMessages.js    RDF Message Log replay support
+ src/term.js           terms, keys, equality, formatting
+ src/store.js          triple set, predicate index, matching, paths
+ src/builtins.js       expression evaluation and built-in functions
+ src/analyze.js        diagnostics, dependencies, strata
+ src/engine.js         layered forward-chaining evaluator
+ src/query.js          external raw-body query operation
+ src/format.js         text and JSON output
+ src/api.js            public JavaScript API and import merging
+ src/cli.js            command-line interface
+ tools/bundle.js       self-contained bundle generator
+ test/*.test.js        executable regression and conformance tests
+ examples/*.srl        runnable SRL examples
+ examples/*.ttl        RDF Rules / Turtle examples
+ ```
+
+ A good reading order is `term.js`, `tokenizer.js`, `parser.js`, `rdfSyntax.js`, `store.js`, `builtins.js`, `analyze.js`, `engine.js`, then `api.js` and `cli.js`.
+
+ ## Tests and build
+
+ ```sh
+ npm test
+ npm run build
+ ```
+
+ `npm run build` writes the command-line bundle to `eyeleng.js` and the browser API bundle to `dist/browser/eyeleng.browser.js`. In a browser, the bundle exposes `window.eyeleng`.
+
+ The tests are executable documentation. They cover parsing, recursion, filters, negation, assignment, typed/language literals, RDF 1.2 syntax, property paths, stratification, imports, queries, examples, deep taxonomy benchmarks, W3C SHACL Rules, W3C RDF syntax, and RDF/RDFS entailment.
+
+ ## Examples
+
+ Examples live in [examples/](./examples/):
+
+ - `family.srl` — small recursive rules
+ - `negation.srl` — stratified negation
+ - `assignment.srl` — assignment and expressions
+ - `property-paths.srl` — path matching in bodies
+ - `basic-ruleset.ttl` — RDF Rules syntax
+ - `rdf-messages.srl` / `rdf-messages.trig` — RDF Message Log replay
+ - `deep-taxonomy-*.srl` — generated benchmark programs
+
+ ## Known boundaries
+
+ Eyeleng intentionally remains a compact reasoner:
+
+ - it does not implement SHACL validation or validation reports
+ - it does not aim to be a full RDF database
+ - RDF Rules syntax support is a front-end for rule execution, not a shapes-validation layer
+ - property paths and SPARQL expressions are practical subsets
+ - W3C manifests are used as executable alignment tests, but implementation status should be read from the current test reports
+
+ ## Extending Eyeleng safely
+
+ Preserve the pipeline:
+
+ ```text
+ syntax -> AST/program -> analysis -> evaluation -> formatting
+ ```
+
+ Avoid making the evaluator parse strings. Parsing belongs in `parser.js` or `rdfSyntax.js`. Avoid making the parser derive triples. Inference belongs in `engine.js`.
+
+ A safe extension usually needs:
+
+ 1. syntax support
+ 2. one focused example
+ 3. parser/API tests
+ 4. execution tests
+ 5. README or handbook notes
+ 6. bundle regeneration
@@ -166,7 +166,7 @@
 
  class Parser {
    constructor(source, options = {}) {
-     this.tokens = Array.isArray(source) ? source : tokenize(source, options.filename);
+     this.tokens = Array.isArray(source) ? source : tokenize(source, options);
      this.pos = 0;
      this.options = options;
      this.baseIRI = options.baseIRI || null;
@@ -229,6 +229,7 @@
  let name = nameToken.value;
  if (!name.endsWith(':')) throw this.error('Prefix name must end with :', nameToken);
  name = name.slice(0, -1);
+ if (this.strictGrammar() && !isValidPNPrefix(name)) throw this.error(`Invalid prefix name ${nameToken.value}`, nameToken);
  const iriToken = this.expectType('iri');
  this.prefixes[name] = this.resolveIRI(iriToken.value, iriToken);
  if (wasAtPrefix) this.consumeOptionalDot();
@@ -236,6 +237,10 @@
 
  parseVersion() {
    const token = this.expectType('string');
+   if (this.strictGrammar()) {
+     if (token.long) throw this.error('VERSION must use a short string literal', token);
+     if (token.value !== '1.2') throw this.error('VERSION must be the SHACL Rules version label \"1.2\"', token);
+   }
    this.version = token.value;
  }
 
@@ -631,6 +636,7 @@
  } else if (this.matchWord('SET')) {
    clauses.push(this.parseSetClause());
  } else if (this.matchWord('BIND')) {
+   if (this.strictGrammar()) throw this.error('BIND is not part of the SHACL 1.2 Rules grammar; use SET');
    clauses.push(this.parseBindClause());
  } else if (this.matchWord('NOT')) {
    this.expectValue('{');
@@ -738,8 +744,9 @@
    if (colon < 0) throw this.error(`Expected IRI, prefixed name, literal, blank node, or variable; got ${value}`, token);
    const prefix = value.slice(0, colon);
    const local = value.slice(colon + 1);
+   if (this.strictGrammar()) validatePrefixedName(prefix, local, value, token, (message, errToken) => this.error(message, errToken));
    if (!(prefix in this.prefixes)) throw this.error(`Unknown prefix ${prefix}:`, token);
-   return this.prefixes[prefix] + local;
+   return this.prefixes[prefix] + decodePNLocalEscapes(local);
  }
 
  resolveIRI(value, token = null) {
@@ -898,9 +905,76 @@
    peek() { return this.tokens[this.pos]; }
    peekN(n) { return this.tokens[this.pos + n] || this.tokens[this.tokens.length - 1]; }
    previous() { return this.tokens[this.pos - 1]; }
+   strictGrammar() { return !!this.options.strictGrammar; }
    error(message, token = this.peek()) { return new SyntaxErrorWithLocation(message, token); }
  }
 
+
+ function isPnCharsBase(ch) {
+   if (!ch) return false;
+   return /[A-Za-z]/.test(ch) || ch.codePointAt(0) >= 0x00C0;
+ }
+
+ function isPnCharsU(ch) {
+   return isPnCharsBase(ch) || ch === '_';
+ }
+
+ function isPnChars(ch) {
+   return isPnCharsU(ch) || /[0-9-]/.test(ch) || ch === '\u00B7' || /[\u0300-\u036F\u203F-\u2040]/u.test(ch);
+ }
+
+ function isValidPNPrefix(prefix) {
+   if (prefix === '') return true;
+   const chars = Array.from(prefix);
+   if (!isPnCharsBase(chars[0])) return false;
+   if (chars.length > 1 && chars.at(-1) === '.') return false;
+   return chars.slice(1).every((ch) => isPnChars(ch) || ch === '.');
+ }
+
+ function plxLength(text, index) {
+   const ch = text[index];
+   if (ch === '%' && /[0-9A-Fa-f]/.test(text[index + 1] || '') && /[0-9A-Fa-f]/.test(text[index + 2] || '')) return 3;
+   if (ch === '\\' && /[_~.!$&'()*+,;=/?#@%-]/.test(text[index + 1] || '')) return 2;
+   return 0;
+ }
+
+ function isPNLocalStartAt(text, index) {
+   const ch = text[index];
+   return isPnCharsU(ch) || /[0-9:]/.test(ch || '') || plxLength(text, index) > 0;
+ }
+
+ function isPNLocalBodyAt(text, index) {
+   const ch = text[index];
+   return isPnChars(ch) || ch === '.' || ch === ':' || plxLength(text, index) > 0;
+ }
+
+ function isPNLocalEndAt(text, index) {
+   const ch = text[index];
+   return isPnChars(ch) || ch === ':' || plxLength(text, index) > 0;
+ }
+
+ function validatePNLocal(local) {
+   if (local === '') return true;
+   if (!isPNLocalStartAt(local, 0)) return false;
+   let lastStart = 0;
+   for (let i = 0; i < local.length;) {
+     const len = plxLength(local, i) || 1;
+     if (i > 0 && !isPNLocalBodyAt(local, i)) return false;
+     lastStart = i;
+     i += len;
+   }
+   return isPNLocalEndAt(local, lastStart);
+ }
+
+ function validatePrefixedName(prefix, local, value, token, makeError) {
+   if (!isValidPNPrefix(prefix)) throw makeError(`Invalid prefixed name ${value}: invalid prefix`, token);
+   if (!validatePNLocal(local)) throw makeError(`Invalid prefixed name ${value}: invalid local name`, token);
+ }
+
+ function decodePNLocalEscapes(local) {
+   return String(local).replace(/\\([_~.!$&'()*+,;=/?#@%-])/g, '$1');
+ }
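
Reviewer note: the effect of `decodePNLocalEscapes` on prefixed-name expansion can be checked in isolation. The function body below is copied from the hunk above; `expandPrefixedName` is a hypothetical stand-in for the parser method, and locating the colon with `indexOf` is an assumption, since the diff does not show how `colon` is computed.

```javascript
// Copied from the hunk above so the example is self-contained.
function decodePNLocalEscapes(local) {
  return String(local).replace(/\\([_~.!$&'()*+,;=/?#@%-])/g, '$1');
}

// Hypothetical helper mirroring `this.prefixes[prefix] + decodePNLocalEscapes(local)`.
function expandPrefixedName(value, prefixes) {
  const colon = value.indexOf(':'); // assumption: first colon splits prefix and local part
  const prefix = value.slice(0, colon);
  const local = value.slice(colon + 1);
  if (!(prefix in prefixes)) throw new Error(`Unknown prefix ${prefix}:`);
  return prefixes[prefix] + decodePNLocalEscapes(local);
}

const demoPrefixes = { ex: 'http://example/' };
console.log(expandPrefixedName('ex:a\\~b', demoPrefixes)); // http://example/a~b
console.log(expandPrefixedName('ex:50%25', demoPrefixes)); // http://example/50%25
```

Only backslash escapes are rewritten; percent sequences such as `%25` pass through unchanged because the regular expression requires a leading backslash.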
+
  function numericLiteral(value) {
    if (Number.isInteger(value)) return literal(value, XSD_INTEGER);
    return literal(value, XSD_DECIMAL);
@@ -970,7 +1044,10 @@
    }
  }
 
- function tokenize(source, filename = '<input>') {
+ function tokenize(source, filenameOrOptions = '<input>') {
+   const options = typeof filenameOrOptions === 'object' && filenameOrOptions !== null ? filenameOrOptions : { filename: filenameOrOptions };
+   const filename = options.filename || '<input>';
+   const strictGrammar = !!options.strictGrammar;
    const tokens = [];
    let i = 0;
    let line = 1;
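
Reviewer note: the new second parameter accepts either the old filename string or an options object. The normalization can be exercised on its own; `normalizeTokenizeOptions` is a hypothetical extraction of the three added lines, not a function in the package.

```javascript
// Hypothetical extraction of the argument normalization added to tokenize().
function normalizeTokenizeOptions(filenameOrOptions = '<input>') {
  const options = typeof filenameOrOptions === 'object' && filenameOrOptions !== null
    ? filenameOrOptions
    : { filename: filenameOrOptions };
  return {
    filename: options.filename || '<input>',
    strictGrammar: !!options.strictGrammar,
  };
}

console.log(normalizeTokenizeOptions());                        // { filename: '<input>', strictGrammar: false }
console.log(normalizeTokenizeOptions('main.srl'));              // { filename: 'main.srl', strictGrammar: false }
console.log(normalizeTokenizeOptions({ strictGrammar: true })); // { filename: '<input>', strictGrammar: true }
```

Existing callers that pass a filename string keep working, while the `Parser` constructor can now forward its whole options object.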
@@ -985,8 +1062,8 @@
    else column += 1;
    return ch;
  }
- function token(type, value, startLine, startColumn) {
-   tokens.push({ type, value, line: startLine, column: startColumn, filename });
+ function token(type, value, startLine, startColumn, extra = {}) {
+   tokens.push({ type, value, line: startLine, column: startColumn, filename, ...extra });
  }
  function syntax(message, startLine, startColumn) {
    throw new SyntaxErrorWithLocation(message, { line: startLine, column: startColumn, filename });
@@ -1024,14 +1101,37 @@
        const length = esc === 'u' ? 4 : 8;
        let hex = '';
        for (let j = 0; j < length; j += 1) {
-         if (!/[0-9A-Fa-f]/.test(current() || '')) syntax(`Invalid \${esc} escape`, startLine, startColumn);
+         if (!/[0-9A-Fa-f]/.test(current() || '')) syntax(`Invalid \\${esc} escape`, startLine, startColumn);
          hex += advance();
        }
-       return String.fromCodePoint(Number.parseInt(hex, 16));
+       const codePoint = Number.parseInt(hex, 16);
+       try { return String.fromCodePoint(codePoint); }
+       catch { syntax(`Invalid \\${esc} escape`, startLine, startColumn); }
      }
+     if (strictGrammar && !Object.hasOwn(escapeMap, esc)) syntax(`Invalid escape \\${esc}`, startLine, startColumn);
      return escapeValue(esc);
    }
 
+   function readIriChar(startLine, startColumn) {
+     if (current() === '\\') {
+       advance();
+       const esc = advance();
+       if (esc !== 'u' && esc !== 'U') syntax(`Invalid IRI escape \\${esc}`, startLine, startColumn);
+       const length = esc === 'u' ? 4 : 8;
+       let hex = '';
+       for (let j = 0; j < length; j += 1) {
+         if (!/[0-9A-Fa-f]/.test(current() || '')) syntax(`Invalid \\${esc} escape`, startLine, startColumn);
+         hex += advance();
+       }
+       const codePoint = Number.parseInt(hex, 16);
+       try { return String.fromCodePoint(codePoint); }
+       catch { syntax(`Invalid \\${esc} escape`, startLine, startColumn); }
+     }
+     const c = current();
+     if (strictGrammar && (/[\u0000-\u0020]/.test(c) || /[<>"{}|^`]/.test(c))) syntax(`Invalid character in IRI reference ${JSON.stringify(c)}`, startLine, startColumn);
+     return advance();
+   }
+
+
1035
1135
  while (i < source.length) {
1036
1136
  const ch = current();
1037
1137
  if (/\s/.test(ch)) { advance(); continue; }
@@ -1077,7 +1177,7 @@
  if (ch === '<' && looksLikeIRI(source, i)) {
    let value = '';
    advance();
-   while (i < source.length && current() !== '>') value += advance();
+   while (i < source.length && current() !== '>') value += readIriChar(startLine, startColumn);
    if (current() !== '>') syntax('Unterminated IRI', startLine, startColumn);
    advance();
    token('iri', value, startLine, startColumn);
@@ -1097,7 +1197,7 @@
    }
    if (!startsWith(quote.repeat(3))) syntax('Unterminated long string literal', startLine, startColumn);
    advance(); advance(); advance();
-   token('string', value, startLine, startColumn);
+   token('string', value, startLine, startColumn, { long: true, quote });
    continue;
  }
 
@@ -1115,7 +1215,7 @@
    }
    if (current() !== quote) syntax('Unterminated string literal', startLine, startColumn);
    advance();
-   token('string', value, startLine, startColumn);
+   token('string', value, startLine, startColumn, { long: false, quote });
    continue;
  }
 
@@ -1161,7 +1261,16 @@
  let value = '';
  while (i < source.length) {
    const c = current();
-   if (/\s/.test(c) || '{}()[].,;|'.includes(c) || '=<>+-*/!^~'.includes(c)) break;
+   if (c === '\\' && peek() !== undefined) {
+     value += advance();
+     value += advance();
+     continue;
+   }
+   if (/\s/.test(c) || '{}()[],;|'.includes(c) || '=<>+-*/!^~'.includes(c)) break;
+   if (c === '.') {
+     const n = peek();
+     if (n === undefined || /\s/.test(n) || '{}()[],;|'.includes(n) || '=<>+-*/!^~'.includes(n)) break;
+   }
    if (c === '#') break;
    value += advance();
  }
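
Reviewer note: this hunk stops treating every `.` as a delimiter, so a dot inside a name is kept while a statement-terminating dot still ends the token. The standalone sketch below mirrors the new loop using plain index arithmetic; `readName`, `DELIMS`, `OPS`, and `isBoundary` are hypothetical names, and the tokenizer's own `current()`, `peek()`, and `advance()` helpers are replaced by direct string indexing.

```javascript
const DELIMS = '{}()[],;|';
const OPS = '=<>+-*/!^~';
// A dot ends the name when it is followed by end of input, whitespace,
// a delimiter, or an operator character.
const isBoundary = (c) => c === undefined || /\s/.test(c) || DELIMS.includes(c) || OPS.includes(c);

function readName(source, start = 0) {
  let i = start;
  let value = '';
  while (i < source.length) {
    const c = source[i];
    const next = source[i + 1];
    // Keep a backslash escape pair intact; decoding happens later.
    if (c === '\\' && next !== undefined) { value += c + next; i += 2; continue; }
    if (/\s/.test(c) || DELIMS.includes(c) || OPS.includes(c)) break;
    if (c === '.' && isBoundary(next)) break;
    if (c === '#') break;
    value += c;
    i += 1;
  }
  return value;
}

console.log(readName('ex:a.b .'));  // ex:a.b
console.log(readName('ex:a. '));    // ex:a
console.log(readName('ex:a\\.b ')); // ex:a\.b (escape preserved for later decoding)
```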
@@ -1194,9 +1303,10 @@
    return false;
  }
 
+ const escapeMap = { n: '\n', r: '\r', t: '\t', b: '\b', f: '\f', '"': '"', "'": "'", '\\': '\\' };
+
  function escapeValue(esc) {
-   const map = { n: '\n', r: '\r', t: '\t', b: '\b', f: '\f', '"': '"', "'": "'", '\\': '\\' };
-   return map[esc] ?? esc;
+   return escapeMap[esc] ?? esc;
  }
 
  module.exports = { tokenize, SyntaxErrorWithLocation };
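
Reviewer note: hoisting `escapeMap` to module scope lets the strict path ask whether an escape is known before falling back to the lenient pass-through. The sketch below shows the two behaviors side by side; `decodeEscape` is a hypothetical combination of the strict check and `escapeValue`, and `Object.hasOwn` assumes a reasonably recent Node.js (16.9 or later).

```javascript
const escapeMap = { n: '\n', r: '\r', t: '\t', b: '\b', f: '\f', '"': '"', "'": "'", '\\': '\\' };

// Hypothetical helper: lenient mode passes unknown escapes through unchanged,
// strict mode rejects them.
function decodeEscape(esc, strictGrammar = false) {
  if (strictGrammar && !Object.hasOwn(escapeMap, esc)) {
    throw new SyntaxError(`Invalid escape \\${esc}`);
  }
  return escapeMap[esc] ?? esc;
}

console.log(JSON.stringify(decodeEscape('n'))); // "\n"
console.log(decodeEscape('q'));                 // q (lenient pass-through)

try {
  decodeEscape('q', true);
} catch (err) {
  console.log(err.message);                     // Invalid escape \q
}
```

`Object.hasOwn` checks only the map's own keys, so inherited properties never count as valid escapes.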