@zigsterz/parzing 1.0.0 → 1.1.0

package/README.md CHANGED
@@ -1,7 +1,9 @@
  # Parzing: TypesSript Parser Combinator Library
 
  ## Overview
- This package implements a parser combinator system, allowing client code to easily create parsers in JavaScript or TypeScript. When used with TypeScript accurate types are computed for parsed and intermediate results, allowing quick and safe implementation and utilization of parsers.
+ This package is Parzing: a parser combinator library, allowing client code to easily create parsers in JavaScript or TypeScript. When used with TypeScript accurate types are computed for parsed and intermediate results, allowing easy and safe implementation and use of parsers.
+
+ This page provides instructions on how to use and customize Parzing. For more insights about the library, [see this blog](https://www.imonlydoingthis.benhaim.net/home/categories/parzing).
 
  ## Installation
 
@@ -13,7 +15,7 @@ npm install --save @zigsterz/parzing
 
  Parzing exposes a `parse` function for invoking a parser on content. To use it, we first construct a parser, and then pass the parser along with the content to parse. `parse` will either return a the result of succesfuly parsing the content, or throw an error describing a failure to parse.
 
- The `ParserBuilder` class exposes a set of helper factory functions for constructing parser.
+ The parser passed to `parse` is usually built using the `ParserBuilder` class. This class exposes a set of helper factory functions for constructing parsers.
 
  The example below demonstrates how to parse a sequence of 1 to 3 digits by first constructing a `ParserBuilder`, then using it to create an `AnyOfParser` parser and finally running the parser using `parse`.
 
@@ -29,11 +31,11 @@ assert(result == "123");
 
  ### Parsing Results
  The result of parsing content may be a value of any type. A Parzing parser has an associated result type that describes the type of the parsing result returned by that parser.
-
+ The return type from `parse` will match the result type of the parser passed to it.
 
  ### Basic Parsers
 
- `pb.anyOf` creates the Any Of *basic parser*. Basic parsers are the atomic building blocks for parsing. They may be combined using [*parser combinators*](#parser-combinators) to construct more complex parsers.
+ In the example above, `pb.anyOf` creates the Any Of *basic parser*. Basic parsers are the atomic building blocks for parsing. They may be combined using [*parser combinators*](#parser-combinators) to construct more complex parsers.
 
  Parzing offers the following basic parsers out of the box:
 
@@ -41,17 +43,18 @@ Parzing offers the following basic parsers out of the box:
  | ------------------ | ----------- | ----------- |
  | `ParserBuilder.anyOf(chars, min, max)` | Parses a minimum of *min* characters, and up to *max* characters, all out the characters listed in *chars*. | `string` |
  | `ParserBuilder.token(token)` | Parses the exact string specified by *token*. | `string` |
+ | `ParserBuilder.regex(regex)` | Parses the regular expression specified by *regex*. | `string` |
  | `ParserBuilder.pass()` | This is a no-op parser. It consumes no input and always succeeds. | `void` |
  | `ParserBuilder.fail(message)` | Fail parsing with the error message provided in *message*. | `void` |
 
- In addition to these parsers, you can create [custom parsers](#custom-parsers) to parse arbitrary complex "atoms". Customer parsers may provide any result type.
+ In addition to these parsers, you can create [custom parsers](#custom-parsers) to parse arbitrary complex "atoms". Custom parsers may provide any result type.
 
  ## Creating Complex Parsers
 
  ### Parser Combinators
- *Parser Combinators* are parsers that are constructed based on other parsers, and combine these parsers in some form to generate a more complex parser.
+ *Parser Combinators* are parsers constructed based on other parsers that combine these parsers in some form to generate a more complex parser.
 
- Perhaps the simplest example of a paser combinator is the Sequence combinator.
+ Perhaps the simplest example of a parser combinator is the Sequence combinator.
  The sequence combinator is constructed based on a sequence of underlying parsers, using the `ParserBuilder.sequence(...)` factory method.
  When parsing input content, the combinator will invoke each of the underlying parsers to parse consecutive fragments of the content.
  If any underlying parser fails, the sequence parser fails as well.
@@ -76,7 +79,7 @@ const result = parse(sample_parser, "1-2-3", true);
  assert.deepEqual(result, ["1", "-", "2", "-", "3"]);
  ```
 
- In addition to the `sequence` parser combinator, the following parser combinators offered by Parzing out-of-the box:
+ In addition to the `sequence` parser combinator, the following parser combinators are offered by Parzing out-of-the box:
 
  | Parser constructor | Description | Result type |
  | ------------------ | ----------- | ----------- |
@@ -91,13 +94,13 @@ In addition to the `sequence` parser combinator, the following parser combinator
 
  ### Whitespace Support
 
- Parsers that derive from `ParserWithInternalWhitespaceSupport` support ignoring whitespace within the content. Exactly where whitespace is ignored depends on the specific parser (see table below).
+ Parsers that derive from `ParserWithInternalWhitespaceSupport` support ignoring whitespace within parsed content. Exactly where whitespace is ignored depends on the specific parser as per the table below.
 
  For all of these parsers, the ignored "whitespace" is defined as content that can be parsed by the *whitespace parser*. The whitespace parser can be set by invoking `target_parser.whitespace(whitespace_parser)`.
- If this method is not invoked on the parser, and the parser was created using `ParserBuilder`, then the whitespace parser is set as the default whitespace parser for the builder. The default whitespace parser for a `ParserBuilder` can be set by passing it on construction.
-
  If there's no set whitespace parser on a `ParserWithInternalWhitespaceSupport`, no whitespace will be ignored.
 
+ If the `whitespace` method is not invoked on a parser, and the parser was created using `ParserBuilder`, then the whitespace parser is set as the default whitespace parser for the builder. The default whitespace parser for a `ParserBuilder` can be set by passing it on construction.
+
  The `WhitespaceParser` class implements a parser that accepts common whitespace patterns.
 
  The following code example illustrates a few ways to set whitespace parsers:
@@ -167,9 +170,10 @@ This parser will fail, but the failure will be reported by the `choice` combinat
  ParseError { message: 'Parser rejected input' }
  ```
 
- Clearly, for the input `number a123` a more approach would be to not even try the second alternative in the `choice` combinator above, and immediately bail out if we've encountered the `number` token.
- This kind of behavior can be achieved using cuts. A cut is a special parser, constructed using `ParserBuilder.cut`, that doesn't attempt to consume any input. Rather,
- When a cut 'parses', the fact that it was encountered is recored in the parsing context. Backtracking parsers, such as the ones listed above, will not attempt to backtrack parsing if a cut was encountered by one of their underlying parsers. Rather they will immediate fail with whatever failure that would have caused them to backtrack.
+ Clearly, for the input `number a123` a more reasonable behavior would be if the parser didn't even try the second alternative in the `choice` combinator above, and immediately bail out if we've encountered the `number` token. In Parzing, This kind of behavior can be achieved using *cuts*.
+
+ A cut is a special parser, constructed using `ParserBuilder.cut`, that doesn't attempt to consume any input. Rather,
+ When a cut 'parses', the fact that it was encountered is recored in the parsing context. Backtracking parsers, such as the ones listed above, will not attempt to backtrack parsing if a cut was encountered by one of their underlying parsers. Rather they will immediately fail with whatever failure that would have caused them to backtrack.
 
  Fixing the example above using cuts, we can write:
 
@@ -202,7 +206,7 @@ which would now result in the following exception:
  ParseError { message: "Expecting AnyOf 0123456789 at 7 ('a123')" }
  ```
 
- Clearly a more useful error message. Note that in addition to yielding clearer errors, cuts may also improve parsing performance since by preventing backtracks.
+ Clearly a more useful error message. Note that in addition to yielding clearer errors, cuts may also improve parsing performance by preventing backtracks.
 
  There are cases where you may want to reuse the same parser in different contexts -- where in some contexts you want the cut to appear but in others you want the cut to be ignored. This is achieved by invoking ``ParserBuilder.attempt`` on the parser which will return an ``AttemptCombinator``. This combinator parser will "swallow" any cut encountered indication within the underlying parser.
 
@@ -250,12 +254,14 @@ Parzing offers the following operators out of the box. Note that you can also cr
 
  ### Recursive Parsers
 
- In many cases a language may include recursive grammar definitions. Consider for example the following simple expresion parser grammer:
+ Grammars often include recursive definitions. Consider for example the following simple expresion parser grammer:
 
+ ```
  expression := addition | subtraction
  addition := term '+' term
  subtraction := term '-' term
  term := number | '(' expression ')'
+ ````
 
  How would we define this using Parzing?
 
@@ -289,9 +295,9 @@ const subtraction = pb.sequence(
  const expression = pb.choice(addition, subtraction);
  ```
 
- Note the comment "Ooops!" above. The recursive nature of the parser creates a circular declaration in Typescript, which is disallowed.
+ Note the comment "Ooops!" above. The recursive nature of the parser creates a circular declaration, which is disallowed in Typescript.
 
- The `ParserBuilder.ref` method allows coping with this situation by receiving a parameterless function returning a parser, and creating a parser that lazily resolves to the function's return value.
+ The `ParserBuilder.ref` method allows solving this problem by receiving a parameterless function returning a parser, and creating a parser that lazily resolves to the function's return value.
  Using this mechanism, our recursive parser becomes possible by modifying the code above as follows:
 
 
package/dist/builder.d.ts CHANGED
@@ -6,6 +6,7 @@ import { OptionalCombinator } from "./combinators/OptionalCombinator";
  import { SequenceCombinator } from "./combinators/SequenceCombinator";
  import { CutParser, FailParser, Parser, ParserType, PassParser, RefParser } from "./core";
  import { AnyOfParser } from "./parsers/AnyOfParser";
+ import { RegexParser } from "./parsers/RegexParser";
  import { TokenParser } from "./parsers/TokenParser";
  declare type WithPostfixSupport<T> = T & {
      _<R>(f: (target: WithPostfixSupport<T>) => R): WithPostfixSupport<R>;
@@ -17,6 +18,7 @@ export declare class ParserBuilder {
      parser<T extends Parser<any>>(p: T): WithPostfixSupport<T>;
      token(tok: string): WithPostfixSupport<TokenParser>;
      anyOf(alts: string, minLen?: number | null, maxLen?: number | null): WithPostfixSupport<AnyOfParser>;
+     regex(re: RegExp): WithPostfixSupport<RegexParser>;
      fail(message?: string): WithPostfixSupport<FailParser>;
      pass(): WithPostfixSupport<PassParser>;
      cut(): WithPostfixSupport<CutParser>;
package/dist/builder.js CHANGED
@@ -9,6 +9,7 @@ var OptionalCombinator_1 = require("./combinators/OptionalCombinator");
  var SequenceCombinator_1 = require("./combinators/SequenceCombinator");
  var core_1 = require("./core");
  var AnyOfParser_1 = require("./parsers/AnyOfParser");
+ var RegexParser_1 = require("./parsers/RegexParser");
  var TokenParser_1 = require("./parsers/TokenParser");
  function addPostfixSupport(who) {
      var ret = who;
@@ -41,6 +42,9 @@ var ParserBuilder = /** @class */ (function () {
          if (maxLen === void 0) { maxLen = null; }
          return this.postProcessParser(new AnyOfParser_1.AnyOfParser(alts, minLen, maxLen));
      };
+     ParserBuilder.prototype.regex = function (re) {
+         return this.postProcessParser(new RegexParser_1.RegexParser(re));
+     };
      ParserBuilder.prototype.fail = function (message) {
          if (message === void 0) { message = null; }
          return this.postProcessParser(new core_1.FailParser(message));
package/dist/core.d.ts CHANGED
@@ -3,6 +3,8 @@ export interface ParserInputBookmark {
  export interface ParserInput {
      read(readLen: number): string;
      peek(peekLen: number): string;
+     readRegex?(regex: RegExp): string | null;
+     peekRegex?(regex: RegExp): string | null;
      getBookmark(): ParserInputBookmark;
      seekToBookmark(bm: ParserInputBookmark): any;
      eof(): boolean;
@@ -14,6 +16,8 @@ export declare class StringParserInput implements ParserInput {
      constructor(_text: String);
      read(readLen: number): string;
      peek(readLen: number): string;
+     readRegex(regex: RegExp): string | null;
+     peekRegex(regex: RegExp): string | null;
      getBookmark(): ParserInputBookmark;
      seekToBookmark(bm: ParserInputBookmark): void;
      eof(): boolean;
@@ -48,7 +52,7 @@ export interface Parser<T> {
      parse(parserContext: ParserContext): ParseResult<T>;
  }
  export declare function isParser(p: any): p is Parser<unknown>;
- export declare class FailParser implements Parser<void> {
+ export declare class FailParser implements Parser<unknown> {
      private _message;
      constructor(_message?: string);
      parse(parserContext: ParserContext): any;
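
Because `readRegex` and `peekRegex` are declared with `?` on `ParserInput`, an input class written against 1.0.0 should still satisfy the interface, and any caller has to check for the method before using it. The sketch below illustrates that pattern with trimmed-down stand-ins; `MiniInput`, `LegacyInput` and `tryReadRegex` are hypothetical names for illustration, not part of Parzing.

```typescript
// Hypothetical, trimmed-down stand-in for the ParserInput interface in core.d.ts.
interface MiniInput {
  read(readLen: number): string;
  readRegex?(regex: RegExp): string | null; // optional, as declared in 1.1.0
}

// An input written against 1.0.0: it has no regex methods at all,
// yet it still satisfies the interface because the new member is optional.
class LegacyInput implements MiniInput {
  private text: string;
  private index = 0;
  constructor(text: string) {
    this.text = text;
  }
  read(readLen: number): string {
    const out = this.text.slice(this.index, this.index + readLen);
    this.index += out.length;
    return out;
  }
}

// Callers must feature-detect before using the optional method.
// Returns undefined when the input has no regex support.
function tryReadRegex(input: MiniInput, re: RegExp): string | null | undefined {
  return input.readRegex ? input.readRegex(re) : undefined;
}

const legacy = new LegacyInput("abc");
console.log(tryReadRegex(legacy, /a/)); // undefined
console.log(legacy.read(2)); // "ab"
```

Keeping the members optional avoids a breaking change for custom inputs, at the cost of a runtime check in every consumer.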
package/dist/core.js CHANGED
@@ -18,6 +18,23 @@ var StringParserInput = /** @class */ (function () {
      StringParserInput.prototype.peek = function (readLen) {
          return this._text.substr(this._index, readLen);
      };
+     StringParserInput.prototype.readRegex = function (regex) {
+         var ret = this.peekRegex(regex);
+         if (ret) {
+             this._index += ret.length;
+         }
+         return ret;
+     };
+     StringParserInput.prototype.peekRegex = function (regex) {
+         var matchResult = regex.exec(this._text.substring(this._index));
+         if (!matchResult) {
+             return null;
+         }
+         if (matchResult.index != 0) {
+             return null;
+         }
+         return matchResult[0];
+     };
      StringParserInput.prototype.getBookmark = function () {
          return this._index;
      };
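
The new `peekRegex` accepts a match only when it starts exactly at the current read position; a match further along in the remaining text is rejected. The standalone sketch below ports that logic to a free function (`peekRegexAt` is a hypothetical name) so the behavior can be checked in isolation. It also shows a caveat that follows from how `RegExp.prototype.exec` works in JavaScript: a regex created with the `g` or `y` flag keeps `lastIndex` between calls, so reusing one against fresh substrings can give surprising results.

```typescript
// Standalone port of the peekRegex logic from the core.js hunk.
function peekRegexAt(text: string, index: number, regex: RegExp): string | null {
  const m = regex.exec(text.substring(index));
  // Accept only matches that begin exactly at the cursor.
  if (!m || m.index !== 0) {
    return null;
  }
  return m[0];
}

const atCursor = peekRegexAt("abc123", 3, /[0-9]+/);    // "123"
const notAtCursor = peekRegexAt("abc123", 0, /[0-9]+/); // null: digits exist, but not at index 0

// Caveat: a regex with the g (or y) flag carries lastIndex between calls.
const stateful = /[0-9]+/g;
const first = peekRegexAt("12 34", 0, stateful);  // "12"; lastIndex is now 2
const second = peekRegexAt("12 34", 0, stateful); // null: the search resumes at index 2 and finds "34" at index 3

console.log(atCursor, notAtCursor, first, second);
```

Under this logic, passing regexes without the `g` or `y` flag avoids the state problem, and anchoring the pattern with `^` lets a non-matching input be rejected without scanning the rest of the text.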
@@ -0,0 +1,7 @@
+ import { Parser, ParserContext, ParseResult } from "../core";
+ export declare class RegexParser implements Parser<string> {
+     private _regex;
+     _charBitmap: number[] | null;
+     constructor(_regex: RegExp);
+     parse(parserContext: ParserContext): ParseResult<string>;
+ }
@@ -0,0 +1,21 @@
+ "use strict";
+ Object.defineProperty(exports, "__esModule", { value: true });
+ exports.RegexParser = void 0;
+ var core_1 = require("../core");
+ var RegexParser = /** @class */ (function () {
+     function RegexParser(_regex) {
+         this._regex = _regex;
+     }
+     RegexParser.prototype.parse = function (parserContext) {
+         if (!parserContext.input.readRegex) {
+             throw core_1.ParseError.parserRejected(this, parserContext, "Input doesn't support regex parsing");
+         }
+         var reResult = parserContext.input.readRegex(this._regex);
+         if (!reResult) {
+             return core_1.ParseResult.failed(core_1.ParseError.parserRejected(this, parserContext, "Expected " + this._regex.source));
+         }
+         return core_1.ParseResult.successful(reResult);
+     };
+     return RegexParser;
+ }());
+ exports.RegexParser = RegexParser;
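
One consequence of the `if (!reResult)` check in `RegexParser.parse` is that a zero-length match is reported as a failure, because the empty string is falsy in JavaScript. The sketch below mirrors that control flow in isolation; `Result`, `parseWithRegex` and `readerFor` are hypothetical names for illustration, and the reader is a simplified stand-in for `readRegex` rather than Parzing's own code.

```typescript
type Result = { ok: true; value: string } | { ok: false; error: string };

// Mirrors the success/failure branching of RegexParser.parse.
function parseWithRegex(readRegex: (re: RegExp) => string | null, re: RegExp): Result {
  const reResult = readRegex(re);
  if (!reResult) {
    // Taken for null (no match) and also for "" (zero-length match).
    return { ok: false, error: "Expected " + re.source };
  }
  return { ok: true, value: reResult };
}

// Simplified reader: accepts only matches that start at index 0 of the text.
const readerFor = (text: string) => (re: RegExp): string | null => {
  const m = re.exec(text);
  return m && m.index === 0 ? m[0] : null;
};

const nonEmpty = parseWithRegex(readerFor("aab"), /a*/); // ok, value "aa"
const empty = parseWithRegex(readerFor("bbb"), /a*/);    // fails: /a*/ matches "" at index 0

console.log(nonEmpty, empty);
```

If this reading is right, patterns that can match the empty string (such as `a*` or `\s*`) will reject input instead of succeeding without consuming anything, so an optional fragment would need a combinator around the regex parser rather than a `*` inside it.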
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@zigsterz/parzing",
-   "version": "1.0.0",
+   "version": "1.1.0",
    "description": "Parser combinators library",
    "scripts": {
      "test": "ts-mocha test/**.ts",