re2js 1.0.1 → 1.2.0
- package/README.md +33 -14
- package/build/index.cjs.cjs +3513 -3341
- package/build/index.cjs.cjs.map +1 -1
- package/build/index.esm.d.ts +14 -2
- package/build/index.esm.d.ts.map +1 -1
- package/build/index.esm.js +3513 -3341
- package/build/index.esm.js.map +1 -1
- package/build/index.umd.js +3513 -3341
- package/build/index.umd.js.map +1 -1
- package/package.json +17 -17
package/README.md
CHANGED
@@ -98,18 +98,6 @@ RE2JS.DISABLE_UNICODE_GROUPS
 RE2JS.LONGEST_MATCH
 ```
 
-### Program size
-
-The program size represents a very approximate measure of a regexp's "cost". Larger numbers are more expensive than smaller numbers.
-
-```js
-import { RE2JS } from 're2js'
-
-console.log(RE2JS.compile('^').programSize()); // Outputs: 3
-console.log(RE2JS.compile('a+b').programSize()); // Outputs: 5
-console.log(RE2JS.compile('(a+b?)').programSize()); // Outputs: 8
-```
-
 ### Checking for Matches
 
 RE2JS allows you to check if a string matches a given regex pattern using the `matches()` function
@@ -353,6 +341,25 @@ RE2JS.compile('(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)')
 
 Function support second argument `perlMode`, which work in the same way, as for `replaceAll` function
 
+### Translating Regular Expressions
+
+The `translateRegExp()` method preprocesses a given regular expression string to ensure compatibility with RE2JS.
+It applies necessary transformations, such as escaping special characters, adjusting Unicode sequences, and converting named capture groups
+
+```js
+import { RE2JS } from 're2js'
+
+const regexp = RE2JS.translateRegExp('(?<word>\\w+)') // '(?P<word>\\w+)'
+
+RE2JS.matches(regexp, 'hello') // true
+RE2JS.matches(regexp, '123') // true
+
+const unicodeRegexp = RE2JS.translateRegExp('\\u{1F600}') // '\\x{1F600}'
+
+RE2JS.matches(unicodeRegexp, '😀') // true
+RE2JS.matches(unicodeRegexp, '😃') // false
+```
+
 ### Escaping Special Characters
 
 The `quote()` method returns a literal pattern string for the specified string. This can be useful if you want to search for a literal string pattern that may contain special characters
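Note on the `translateRegExp()` hunk above: the two rewrites shown in the added README example (JS named groups to `(?P<name>...)`, and `\u{...}` to `\x{...}`) can be pictured with the rough sketch below. It is not the re2js implementation. `sketchTranslate` is a hypothetical helper, it covers only those two cases, and it ignores edge cases such as escaped backslashes or character classes that the real method presumably handles.

```javascript
// Hypothetical helper, NOT part of re2js: a simplified illustration of the
// two rewrites documented in the README example.
function sketchTranslate(source) {
  return source
    // JS named group (?<name>...) -> RE2 syntax (?P<name>...).
    // Requiring an identifier after "<" skips lookbehinds like (?<= and (?<!
    .replace(/\(\?<([A-Za-z_][A-Za-z0-9_]*)>/g, '(?P<$1>')
    // JS code point escape \u{1F600} -> RE2 hex escape \x{1F600}
    .replace(/\\u\{([0-9A-Fa-f]+)\}/g, '\\x{$1}');
}

console.log(sketchTranslate('(?<word>\\w+)')); // (?P<word>\w+)
console.log(sketchTranslate('\\u{1F600}'));    // \x{1F600}
```

In real code, calling `RE2JS.translateRegExp()` as shown in the README is the safer choice; the sketch only shows the shape of the transformation.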
@@ -366,9 +373,21 @@ RE2JS.matches(regexp, 'ab+c') // true
 RE2JS.matches(regexp, 'abc') // false
 ```
 
+### Program size
+
+The program size represents a very approximate measure of a regexp's "cost". Larger numbers are more expensive than smaller numbers
+
+```js
+import { RE2JS } from 're2js'
+
+console.log(RE2JS.compile('^').programSize()); // Outputs: 3
+console.log(RE2JS.compile('a+b').programSize()); // Outputs: 5
+console.log(RE2JS.compile('(a+b?)').programSize()); // Outputs: 8
+```
+
 ## Performance
 
-The RE2JS engine runs more slowly compared to native RegExp objects. This reduced speed is also noticeable when comparing RE2JS to the original RE2 engine. The C++ implementation of the RE2 engine includes both NFA (Nondeterministic Finite Automaton) and DFA (Deterministic Finite Automaton) engines, as well as a variety of optimizations. Russ Cox ported a simplified version of the NFA engine to Go. Later, Alan Donovan ported the NFA-based Go implementation to Java. I then ported the NFA-based Java implementation to a pure JS version. This is another reason why the pure JS version will perform more slowly compared to the original RE2 engine.
+The RE2JS engine runs more slowly compared to native RegExp objects. This reduced speed is also noticeable when comparing RE2JS to the original RE2 engine. The C++ implementation of the RE2 engine includes both NFA (Nondeterministic Finite Automaton) and DFA (Deterministic Finite Automaton) engines, as well as a variety of optimizations. Russ Cox ported a simplified version of the NFA engine to Go. Later, Alan Donovan ported the NFA-based Go implementation to Java. I then ported the NFA-based Java implementation (plus Golang stuff, which are not present in Java implementation, like checks for regular expression complexity) to a pure JS version. This is another reason why the pure JS version will perform more slowly compared to the original RE2 engine.
 
 Should you require high performance on the server side when using RE2, it would be beneficial to consider the following packages for JS:
 
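Note on the "Program size" section moved in the hunk above: since the README describes program size as an approximate cost measure, one possible use is to reject patterns that exceed a caller-chosen budget. The sketch below assumes only the `programSize()` method shown in the README. `withinBudget` and `MAX_PROGRAM_SIZE` are hypothetical names, and the threshold is arbitrary.

```javascript
// Hypothetical budget; pick a value that suits your workload.
const MAX_PROGRAM_SIZE = 100;

// `compiled` is anything exposing programSize(), for example the result of
// RE2JS.compile(pattern) as used in the README example.
function withinBudget(compiled, maxSize = MAX_PROGRAM_SIZE) {
  return compiled.programSize() <= maxSize;
}

// Stub standing in for a compiled pattern so the sketch runs without re2js.
const stub = { programSize: () => 8 };
console.log(withinBudget(stub));    // true
console.log(withinBudget(stub, 5)); // false
```

With re2js installed, the stub would be replaced by something like `RE2JS.compile('(a+b?)')`, which the README reports as having a program size of 8.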
@@ -418,7 +437,7 @@ These factors combined make the RE2 vanilla JS port a valuable tool for develope
 Some files like `CharGroup.js` and `UnicodeTables.js` is generated and should be edited in generator files
 
 ```bash
-./tools/scripts/make_perl_groups.pl
+./tools/scripts/make_perl_groups.pl > src/CharGroup.js
 yarn node ./tools/scripts/genUnicodeTable.js > src/UnicodeTables.js
 ```
 