clarity-pattern-parser 11.3.5 → 11.3.7

package/README.md CHANGED
@@ -1,472 +1,513 @@
1
+ # Clarity Pattern Parser
2
+
3
+ A pattern-matching and parsing library that provides a flexible grammar for defining complex patterns. It is suited to building parsers, validators, and text-processing tools.
4
+
5
+ > **Try it online!** 🚀 [Open in Playground](https://jaredjbarnes.github.io/cpat-editor/)
6
+
7
+ ## Features
8
+
9
+ - 🎯 Flexible pattern matching with both grammar and direct API
10
+ - 🔄 Support for recursive patterns and expressions
11
+ - 🎨 Customizable pattern composition
12
+ - 🚀 High performance parsing
13
+ - 🔍 Built-in debugging support
14
+ - 📝 Rich AST manipulation capabilities
15
+ - 🔌 Extensible through custom patterns and decorators
16
+
1
17
  ## Installation
2
18
 
3
- ```
19
+ ```bash
4
20
  npm install clarity-pattern-parser
5
21
  ```
6
- ## Overview
7
22
 
8
- ### Leaf Patterns
9
- * Literal
10
- * Regex
23
+ ## Quick Start
11
24
 
12
- ### Composing Patterns
13
- * And
14
- * Or
15
- * Repeat
16
- * Reference
17
- * Not
25
+ ### Using Grammar
18
26
 
19
- The `Not` pattern is a negative look ahead and used with the `And` pattern. This will be illustrated in more detail within the `Not` pattern section.
27
+ ```typescript
28
+ import { patterns } from "clarity-pattern-parser";
20
29
 
21
- ## Literal
22
- The `Literal` pattern uses a string literal to match patterns.
23
- ```ts
24
- import { Literal } from "clarity-pattern-parser";
30
+ // Define patterns using grammar
31
+ const { fullName } = patterns`
32
+ first-name = "John"
33
+ last-name = "Doe"
34
+ space = /\s+/
35
+ full-name = first-name + space + last-name
36
+ `;
37
+
38
+ // Execute pattern
39
+ const result = fullName.exec("John Doe");
40
+ console.log(result.ast?.value); // "John Doe"
41
+ ```
25
42
 
43
+ ### Using Direct API
44
+
45
+ ```typescript
46
+ import { Literal, Sequence } from "clarity-pattern-parser";
47
+
48
+ // Create patterns directly
26
49
  const firstName = new Literal("first-name", "John");
50
+ const space = new Literal("space", " ");
51
+ const lastName = new Literal("last-name", "Doe");
52
+ const fullName = new Sequence("full-name", [firstName, space, lastName]);
53
+
54
+ // Execute pattern
55
+ const result = fullName.exec("John Doe");
56
+ console.log(result.ast?.value); // "John Doe"
57
+ ```
58
+
59
+ ## Online Playground
60
+
61
+ Try Clarity Pattern Parser in your browser with our interactive playground:
62
+
63
+ [Open in Playground](https://jaredjbarnes.github.io/cpat-editor/)
64
+
65
+ The playground allows you to:
66
+ - Write and test patterns in real-time
67
+ - See the AST visualization
68
+ - Debug pattern execution
69
+ - Share patterns with others
70
+ - Try out different examples
27
71
 
28
- const { ast } = firstName.exec("John");
72
+ ## Table of Contents
29
73
 
30
- ast.toJson(2)
74
+ 1. [Grammar Documentation](#grammar-documentation)
75
+ - [Basic Patterns](#basic-patterns)
76
+ - [Pattern Operators](#pattern-operators)
77
+ - [Repetition](#repetition)
78
+ - [Imports and Parameters](#imports-and-parameters)
79
+ - [Decorators](#decorators)
80
+ - [Comments](#comments)
81
+ - [Pattern References](#pattern-references)
82
+ - [Pattern Aliasing](#pattern-aliasing)
83
+ - [String Template Patterns](#string-template-patterns)
84
+
85
+ 2. [Direct Pattern Usage](#direct-pattern-usage)
86
+ - [Basic Patterns](#basic-patterns-1)
87
+ - [Composite Patterns](#composite-patterns)
88
+ - [Pattern Context](#pattern-context)
89
+ - [Pattern Reference](#pattern-reference)
90
+ - [Pattern Execution](#pattern-execution)
91
+ - [AST Manipulation](#ast-manipulation)
92
+
93
+ 3. [Advanced Topics](#advanced-topics)
94
+ - [Custom Patterns](#custom-patterns)
95
+ - [Performance Tips](#performance-tips)
96
+ - [Debugging](#debugging)
97
+ - [Error Handling](#error-handling)
98
+
99
+ ## Grammar Documentation
100
+
101
+ This section describes the grammar features supported by the Clarity Pattern Parser.
102
+
103
+ ### Basic Patterns
104
+
105
+ #### Literal Strings
106
+ Define literal string patterns using double quotes:
31
107
  ```
32
- ```json
33
- {
34
- "type": "literal",
35
- "name": "first-name",
36
- "value": "John",
37
- "firstIndex": 0,
38
- "lastIndex": 3,
39
- "startIndex": 0,
40
- "endIndex": 4,
41
- "children": []
42
- }
108
+ name = "John"
43
109
  ```
44
110
 
45
- ## Regex
46
- The `Regex` pattern uses regular expressions to match patterns.
47
- ```ts
48
- import { Regex } from "clarity-pattern-parser";
111
+ Escaped characters are supported in literals:
112
+ - `\n` - newline
113
+ - `\r` - carriage return
114
+ - `\t` - tab
115
+ - `\b` - backspace
116
+ - `\f` - form feed
117
+ - `\v` - vertical tab
118
+ - `\0` - null character
119
+ - `\x00` - hex character
120
+ - `\u0000` - unicode character
121
+ - `\"` - escaped quote
122
+ - `\\` - escaped backslash
123
+
124
+ #### Regular Expressions
125
+ Define regex patterns using forward slashes:
126
+ ```
127
+ name = /\w/
128
+ ```
49
129
 
50
- const digits = new Regex("digits", "\\d+");
130
+ ### Pattern Operators
131
+
132
+ #### Options (|)
133
+ Match one of multiple patterns using the `|` operator. This is used for simple alternatives where order doesn't matter:
134
+ ```
135
+ names = john | jane
136
+ ```
137
+
138
+ #### Expression (|)
139
+ Expression patterns also use the `|` operator but are used for defining operator precedence in expressions. The order of alternatives determines precedence, with earlier alternatives having higher precedence. By default, operators are left-associative.
140
+
141
+ Example of an arithmetic expression grammar:
142
+ ```
143
+ prefix-operators = "+" | "-"
144
+ prefix-expression = prefix-operators + expression
145
+ postfix-operators = "++" | "--"
146
+ postfix-expression = expression + postfix-operators
147
+ add-sub-operators = "+" | "-"
148
+ add-sub-expression = expression + add-sub-operators + expression
149
+ mul-div-operators = "*" | "/"
150
+ mul-div-expression = expression + mul-div-operators + expression
151
+ expression = prefix-expression | mul-div-expression | add-sub-expression | postfix-expression
152
+ ```
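The precedence and associativity rules described above can be illustrated with a small, self-contained sketch. It is not the library's algorithm: `PRECEDENCE`, `tokenize`, and `group` are hypothetical names, and the code is a plain precedence-climbing parser that handles only the binary operators from the grammar. It assumes `*` and `/` bind tighter than `+` and `-` (mirroring `mul-div-expression` being listed before `add-sub-expression`) and that all operators are left-associative.

```typescript
// Illustrative only: a tiny precedence-climbing parser, not clarity-pattern-parser's implementation.
// A higher number binds tighter, mirroring "earlier alternative = higher precedence".
const PRECEDENCE: Record<string, number> = { "*": 2, "/": 2, "+": 1, "-": 1 };

// Split the input into integer and operator tokens, ignoring whitespace.
function tokenize(input: string): string[] {
  return input.match(/\d+|[-+*/]/g) ?? [];
}

// Return the input with explicit parentheses showing how the operators group.
function group(input: string): string {
  const tokens = tokenize(input);
  let position = 0;

  function parseExpression(minPrecedence: number): string {
    let left = tokens[position++] ?? ""; // operand
    while (position < tokens.length) {
      const operator = tokens[position] ?? "";
      const precedence = PRECEDENCE[operator] ?? 0;
      if (precedence < minPrecedence) break;
      position++;
      // Requiring precedence + 1 on the right side makes the operator left-associative.
      const right = parseExpression(precedence + 1);
      left = `(${left} ${operator} ${right})`;
    }
    return left;
  }

  return parseExpression(1);
}

console.log(group("1 + 2 * 3")); // "(1 + (2 * 3))"
console.log(group("8 - 3 - 2")); // "((8 - 3) - 2)"
```

The first call shows the higher-precedence `*` grouping before `+`; the second shows left associativity, where `8 - 3` is grouped first.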
153
+
154
+ ### Repetition
155
+
156
+ #### Basic Repeat
157
+ Repeat a pattern one or more times using `+`:
158
+ ```
159
+ digits = (digit)+
160
+ ```
161
+
162
+ #### Zero or More
163
+ Repeat a pattern zero or more times using `*`:
164
+ ```
165
+ digits = (digit)*
166
+ ```
167
+
168
+ #### Bounded Repetition
169
+ Specify exact counts or ranges of repetitions using curly braces:
170
+ - `{n}` - Exactly n times: `(pattern){3}`
171
+ - `{n,}` - At least n times: `(pattern){1,}`
172
+ - `{,n}` - At most n times: `(pattern){,3}`
173
+ - `{n,m}` - Between n and m times: `(pattern){1,3}`
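As a rough illustration of how the four quantifier forms above map to a minimum and maximum count, here is a minimal sketch. `RepeatBounds` and `parseBounds` are hypothetical helpers, not part of clarity-pattern-parser, and the sketch assumes that `{,n}` means a minimum of zero and that `{n,}` has no upper limit.

```typescript
// Hypothetical helper types and functions; not part of clarity-pattern-parser.
interface RepeatBounds {
  min: number;
  max: number; // Infinity when no upper bound is given
}

// Convert a quantifier such as "{3}", "{1,}", "{,3}" or "{1,3}" into bounds.
function parseBounds(quantifier: string): RepeatBounds {
  const match = /^\{(\d*)(,?)(\d*)\}$/.exec(quantifier);
  const low = match?.[1] ?? "";
  const comma = match?.[2] ?? "";
  const high = match?.[3] ?? "";

  if (match === null || (low === "" && high === "")) {
    throw new Error(`Invalid quantifier: ${quantifier}`);
  }

  if (comma === "") {
    // {n}: exactly n times
    const exact = Number(low);
    return { min: exact, max: exact };
  }

  return {
    min: low === "" ? 0 : Number(low), // {,n}: assumed to allow zero matches
    max: high === "" ? Infinity : Number(high), // {n,}: no upper limit
  };
}

// Check whether a number of matches satisfies the bounds.
function withinBounds(count: number, bounds: RepeatBounds): boolean {
  return count >= bounds.min && count <= bounds.max;
}

console.log(parseBounds("{1,3}")); // { min: 1, max: 3 }
console.log(withinBounds(4, parseBounds("{1,3}"))); // false
```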
51
174
 
52
- const { ast } = digits.exec("12");
175
+ #### Repetition with Divider
176
+ Repeat patterns with a divider between occurrences:
177
+ ```
178
+ digits = (digit, comma){3}
179
+ ```
180
+
181
+ Add the `trim` keyword to trim the divider from the end:
182
+ ```
183
+ digits = (digit, comma trim)+
184
+ ```
185
+
186
+ ### Imports and Parameters
187
+
188
+ #### Basic Import
189
+ Import patterns from other files:
190
+ ```
191
+ import { pattern-name } from "path/to/file.cpat"
192
+ ```
53
193
 
54
- ast.toJson(2);
194
+ #### Import with Parameters
195
+ Import with custom parameters:
55
196
  ```
56
- ```json
57
- {
58
- "type": "regex",
59
- "name": "digits",
60
- "value": "12",
61
- "firstIndex": 0,
62
- "lastIndex": 1,
63
- "startIndex": 0,
64
- "endIndex": 2,
65
- "children": []
197
+ import { pattern } from "file.cpat" with params {
198
+ custom-param = "value"
66
199
  }
67
200
  ```
68
201
 
69
- ### Regex Caveats
70
- Do not use "^" at the beginning or "$" at the end of your regular expression. If you are creating a regular expression that is concerned about the beginning and end of the text you should probably just use a regular expression.
202
+ #### Parameter Declaration
203
+ Declare parameters that can be passed to the grammar:
204
+ ```
205
+ use params {
206
+ param-name
207
+ }
208
+ ```
71
209
 
72
- ## And
73
- The `And` pattern is a way to make a sequence pattern. `And` accepts all other patterns as children.
74
- ```ts
75
- import { And, Literal } from "clarity-pattern-parser";
210
+ #### Default Parameters
211
+ Specify default values for parameters:
212
+ ```
213
+ use params {
214
+ param = default-value
215
+ }
216
+ ```
217
+
218
+ ### Decorators
76
219
 
77
- const jane = new Literal("first-name", "Jane");
220
+ #### Token Decorator
221
+ Specify tokens for a pattern:
222
+ ```
223
+ @tokens([" "])
224
+ spaces = /\s+/
225
+ ```
226
+
227
+ #### Custom Decorators
228
+ Custom decorators accept various argument types:
229
+ ```
230
+ @decorator() // No arguments
231
+ @decorator(["value"]) // Array argument
232
+ @decorator({"prop": value}) // Object argument
233
+ ```
234
+
235
+ ### Comments
236
+ Add comments using the `#` symbol:
237
+ ```
238
+ # This is a comment
239
+ pattern = "value"
240
+ ```
241
+
242
+ ### Pattern References
243
+ Reference other patterns by name:
244
+ ```
245
+ pattern1 = "value"
246
+ pattern2 = pattern1
247
+ ```
248
+
249
+ ### Pattern Aliasing
250
+ Import patterns with aliases:
251
+ ```
252
+ import { original as alias } from "file.cpat"
253
+ ```
254
+
255
+ ### String Template Patterns
256
+
257
+ Patterns can be defined inline using string templates. This allows for quick pattern definition and testing without creating separate files.
258
+
259
+ #### Basic Example
260
+ ```typescript
261
+ const { fullName } = patterns`
262
+ first-name = "John"
263
+ last-name = "Doe"
264
+ space = /\s+/
265
+ full-name = first-name + space + last-name
266
+ `;
267
+
268
+ const result = fullName.exec("John Doe");
269
+ // result.ast.value will be "John Doe"
270
+ ```
271
+
272
+ #### Complex Example (HTML-like Markup)
273
+ ```typescript
274
+ const { body } = patterns`
275
+ tag-name = /[a-zA-Z_-]+[a-zA-Z0-9_-]*/
276
+ ws = /\s+/
277
+ opening-tag = "<" + tag-name + ws? + ">"
278
+ closing-tag = "</" + tag-name + ws? + ">"
279
+ child = ws? + element + ws?
280
+ children = (child)*
281
+ element = opening-tag + children + closing-tag
282
+ body = ws? + element + ws?
283
+ `;
284
+
285
+ const result = body.exec(`
286
+ <div>
287
+ <div></div>
288
+ <div></div>
289
+ </div>
290
+ `, true);
291
+
292
+ // Clean up spaces from the AST
293
+ result?.ast?.findAll(n => n.name.includes("ws")).forEach(n => n.remove());
294
+ // result.ast.value will be "<div><div></div><div></div></div>"
295
+ ```
296
+
297
+ ## Direct Pattern Usage
298
+
299
+ While the grammar provides a convenient way to define patterns, you can also use the Pattern classes directly for more control and flexibility.
300
+
301
+ ### Basic Patterns
302
+
303
+ #### Literal
304
+ ```typescript
305
+ import { Literal } from "clarity-pattern-parser";
306
+
307
+ const firstName = new Literal("first-name", "John");
308
+ const result = firstName.exec("John");
309
+ // result.ast.value will be "John"
310
+ ```
311
+
312
+ #### Regex
313
+ ```typescript
314
+ import { Regex } from "clarity-pattern-parser";
315
+
316
+ const digits = new Regex("digits", "\\d+");
317
+ const result = digits.exec("123");
318
+ // result.ast.value will be "123"
319
+ ```
320
+
321
+ ### Composite Patterns
322
+
323
+ #### Sequence
324
+ ```typescript
325
+ import { Sequence, Literal } from "clarity-pattern-parser";
326
+
327
+ const firstName = new Literal("first-name", "John");
78
328
  const space = new Literal("space", " ");
79
- const doe = new Literal("last-name", "Doe");
80
- const fullName = new And("full-name", [jane, space, doe]);
81
-
82
- const { ast } = fullName.exec("Jane Doe");
83
-
84
- ast.toJson(2); // Look Below for output
85
- ```
86
-
87
- ```json
88
- {
89
- "type": "and",
90
- "name": "full-name",
91
- "value": "Jane Doe",
92
- "firstIndex": 0,
93
- "lastIndex": 7,
94
- "startIndex": 0,
95
- "endIndex": 8,
96
- "children": [
97
- {
98
- "type": "literal",
99
- "name": "first-name",
100
- "value": "Jane",
101
- "firstIndex": 0,
102
- "lastIndex": 3,
103
- "startIndex": 0,
104
- "endIndex": 4,
105
- "children": []
106
- },
107
- {
108
- "type": "and",
109
- "name": "space",
110
- "value": " ",
111
- "firstIndex": 4,
112
- "lastIndex": 4,
113
- "startIndex": 4,
114
- "endIndex": 5,
115
- "children": []
116
- },
117
- {
118
- "type": "and",
119
- "name": "last-name",
120
- "value": "Doe",
121
- "firstIndex": 5,
122
- "lastIndex": 7,
123
- "startIndex": 5,
124
- "endIndex": 8,
125
- "children": []
126
- }
127
- ]
128
- }
329
+ const lastName = new Literal("last-name", "Doe");
330
+ const fullName = new Sequence("full-name", [firstName, space, lastName]);
331
+
332
+ const result = fullName.exec("John Doe");
333
+ // result.ast.value will be "John Doe"
129
334
  ```
130
335
 
131
- ## Or
132
- The `Or` pattern matches any of the patterns given to the constructor.
133
- ```ts
134
- import { Or, Literal } from "clarity-pattern-parser";
336
+ #### Options
337
+ ```typescript
338
+ import { Options, Literal } from "clarity-pattern-parser";
135
339
 
136
- const jane = new Literal("jane", "Jane");
137
340
  const john = new Literal("john", "John");
138
- const firstName = new Or("first-name", [jane, john]);
139
- const { ast } = firstName.exec("Jane");
140
-
141
- ast.toJson(2)
142
- ```
143
- ```json
144
- {
145
- "type": "literal",
146
- "name": "jane",
147
- "value": "Jane",
148
- "firstIndex": 0,
149
- "lastIndex": 3,
150
- "startIndex": 0,
151
- "endIndex": 4,
152
- "children": []
341
+ const jane = new Literal("jane", "Jane");
342
+ const names = new Options("names", [john, jane]);
343
+
344
+ const result = names.exec("Jane");
345
+ // result.ast.value will be "Jane"
346
+ ```
347
+
348
+ #### Expression
349
+ ```typescript
350
+ import { Expression, Literal } from "clarity-pattern-parser";
351
+
352
+ const a = new Literal("a", "a");
353
+ const b = new Literal("b", "b");
354
+ const c = new Literal("c", "c");
355
+ const expression = new Expression("expression", [a, b, c]);
356
+
357
+ const result = expression.exec("a ? b : c");
358
+ // result.ast.value will be "a ? b : c"
359
+ ```
360
+
361
+ ### Pattern Context
362
+ ```typescript
363
+ import { Context, Literal } from "clarity-pattern-parser";
364
+
365
+ const name = new Literal("name", "John");
366
+ const context = new Context("name-context", name);
367
+
368
+ const result = context.exec("John");
369
+ // result.ast.value will be "John"
370
+ ```
371
+
372
+ ### Pattern Reference
373
+ ```typescript
374
+ import { Reference, Literal, Sequence } from "clarity-pattern-parser";
375
+
376
+ const name = new Literal("name", "John");
377
+ const reference = new Reference("name-ref", name);
378
+ const pattern = new Sequence("pattern", [reference]);
379
+
380
+ const result = pattern.exec("John");
381
+ // result.ast.value will be "John"
382
+ ```
383
+
384
+ ### Pattern Execution
385
+
386
+ Pattern execution returns a `ParseResult` that includes the AST and any error information:
387
+
388
+ ```typescript
389
+ const result = pattern.exec("some text");
390
+ if (result.error) {
391
+ console.error(result.error.message);
392
+ console.error(result.error.expected);
393
+ console.error(result.error.position);
394
+ } else {
395
+ console.log(result.ast?.value);
396
+ }
397
+ ```
398
+
399
+ ### AST Manipulation
400
+
401
+ The AST (Abstract Syntax Tree) returned by pattern execution can be manipulated:
402
+
403
+ ```typescript
404
+ const result = pattern.exec("some text");
405
+ if (result.ast) {
406
+ // Find all nodes with a specific name
407
+ const nodes = result.ast.findAll(n => n.name === "space");
408
+
409
+ // Remove nodes
410
+ nodes.forEach(n => n.remove());
411
+
412
+ // Get the final value
413
+ const value = result.ast.value;
153
414
  }
154
415
  ```
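To make the find-and-remove flow above concrete without depending on the package, the following sketch uses a stand-in tree. `MiniNode` is a hypothetical class written for this illustration; the library's real node type has a richer API, and only the `name`, `value`, `findAll`, and `remove` members used in the README snippet are mimicked here. The sketch assumes a branch's value is the concatenation of its children's values.

```typescript
// Hypothetical stand-in for an AST node; not the library's Node class.
class MiniNode {
  parent: MiniNode | null = null;
  readonly name: string;
  readonly children: MiniNode[];
  private readonly text: string | null;

  private constructor(name: string, text: string | null, children: MiniNode[]) {
    this.name = name;
    this.text = text;
    this.children = children;
    for (const child of children) child.parent = this;
  }

  static leaf(name: string, text: string): MiniNode {
    return new MiniNode(name, text, []);
  }

  static branch(name: string, children: MiniNode[]): MiniNode {
    return new MiniNode(name, null, children);
  }

  // A leaf returns its own text; a branch concatenates its children's values.
  get value(): string {
    return this.text ?? this.children.map(child => child.value).join("");
  }

  // Depth-first search that includes the node itself.
  findAll(predicate: (node: MiniNode) => boolean): MiniNode[] {
    const found: MiniNode[] = predicate(this) ? [this] : [];
    for (const child of this.children) found.push(...child.findAll(predicate));
    return found;
  }

  // Detach this node from its parent, if it has one.
  remove(): void {
    const parent = this.parent;
    if (parent === null) return;
    const index = parent.children.indexOf(this);
    if (index >= 0) parent.children.splice(index, 1);
    this.parent = null;
  }
}

const ast = MiniNode.branch("full-name", [
  MiniNode.leaf("first-name", "John"),
  MiniNode.leaf("space", " "),
  MiniNode.leaf("last-name", "Doe"),
]);

const before = ast.value; // "John Doe"
ast.findAll(node => node.name === "space").forEach(node => node.remove());
const after = ast.value; // "JohnDoe"
console.log(before, "->", after);
```

Because the value is derived from the remaining children, removing the `space` node changes the value the tree reports.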
155
- ## Repeat
156
- The `Repeat` patterns allows you to match repeating patterns with, or without a divider.
157
-
158
- For example you may want to match a pattern like so.
159
- ```
160
- 1,2,3
161
- ```
162
- Here is the code to do so.
163
- ```ts
164
- import { Repeat, Literal, Regex } from "clarity-pattern-parser";
165
-
166
- const digit = new Regex("digit", "\\d+");
167
- const commaDivider = new Literal("comma", ",");
168
- const numberList = new Repeat("number-list", digit, commaDivider);
169
-
170
- const ast = numberList.exec("1,2,3").ast;
171
-
172
- ast.type // ==> "repeat"
173
- ast.name // ==> "number-list"
174
- ast.value // ==> "1,2,3
175
-
176
- ast.children[0].value // ==> "1"
177
- ast.children[1].value // ==> ","
178
- ast.children[2].value // ==> "2"
179
- ast.children[3].value // ==> ","
180
- ast.children[4].value // ==> "3"
181
- ```
182
-
183
- If there is a trailing divider without the repeating pattern, it will not include the trailing divider as part of the result. Here is an example.
184
-
185
- ```ts
186
- import { Repeat, Literal, Regex } from "clarity-pattern-parser";
187
-
188
- const digit = new Regex("digit", "\\d+");
189
- const commaDivider = new Literal("comma", ",");
190
- const numberList = new Repeat("number-list", digit, commaDivider);
191
-
192
- const ast = numberList.exec("1,2,").ast;
193
-
194
- ast.type // ==> "repeat"
195
- ast.name // ==> "number-list"
196
- ast.value // ==> "1,2
197
-
198
- ast.children[0].value // ==> "1"
199
- ast.children[1].value // ==> ","
200
- ast.children[2].value // ==> "2"
201
- ast.children.length // ==> 3
202
- ```
203
-
204
- ## Reference
205
- Reference is a way to handle cyclical patterns. An example of this would be arrays within arrays. Lets say we want to make a pattern that matches an array that can store numbers and arrays.
206
- ```
207
- [[1, [1]], 1, 2, 3]
208
- ```
209
- Here is an example of using `Reference` to parse this pattern.
210
- ```ts
211
- import { Regex, Literal, Or, Repeat, And, Reference } from "clarity-pattern-parser";
212
-
213
- const integer = new Regex("integer", "\\d+");
214
- const commaDivider = new Regex("comma-divider", "\\s*,\\s*");
215
-
216
- const openBracket = new Literal("open-bracket", "[");
217
- const closeBracket = new Literal("close-bracket", "]");
218
- const item = new Or("item", [integer, new Reference("array")]);
219
- const items = new Repeat("items", item, commaDivider);
220
-
221
- const array = new And("array", [openBracket, items, closeBracket]);
222
- const { ast } = array.exec("[[1, [1]], 1, 2, 3]");
223
-
224
- ast.toJson();
225
- ```
226
- ```json
227
- {
228
- "type": "and",
229
- "name": "array",
230
- "value": "[[1, [1]], 1, 2, 3]",
231
- "firstIndex": 0,
232
- "lastIndex": 18,
233
- "startIndex": 0,
234
- "endIndex": 19,
235
- "children": [
236
- {
237
- "type": "literal",
238
- "name": "open-bracket",
239
- "value": "[",
240
- "firstIndex": 0,
241
- "lastIndex": 0,
242
- "startIndex": 0,
243
- "endIndex": 1,
244
- "children": []
245
- },
246
- {
247
- "type": "repeat",
248
- "name": "items",
249
- "value": "[1, [1]], 1, 2, 3",
250
- "firstIndex": 1,
251
- "lastIndex": 17,
252
- "startIndex": 1,
253
- "endIndex": 18,
254
- "children": [
255
- {
256
- "type": "and",
257
- "name": "array",
258
- "value": "[1, [1]]",
259
- "firstIndex": 1,
260
- "lastIndex": 8,
261
- "startIndex": 1,
262
- "endIndex": 9,
263
- "children": [
264
- {
265
- "type": "literal",
266
- "name": "open-bracket",
267
- "value": "[",
268
- "firstIndex": 1,
269
- "lastIndex": 1,
270
- "startIndex": 1,
271
- "endIndex": 2,
272
- "children": []
273
- },
274
- {
275
- "type": "repeat",
276
- "name": "items",
277
- "value": "1, [1]",
278
- "firstIndex": 2,
279
- "lastIndex": 7,
280
- "startIndex": 2,
281
- "endIndex": 8,
282
- "children": [
283
- {
284
- "type": "regex",
285
- "name": "integer",
286
- "value": "1",
287
- "firstIndex": 2,
288
- "lastIndex": 2,
289
- "startIndex": 2,
290
- "endIndex": 3,
291
- "children": []
292
- },
293
- {
294
- "type": "regex",
295
- "name": "comma-divider",
296
- "value": ", ",
297
- "firstIndex": 3,
298
- "lastIndex": 4,
299
- "startIndex": 3,
300
- "endIndex": 5,
301
- "children": []
302
- },
303
- {
304
- "type": "and",
305
- "name": "array",
306
- "value": "[1]",
307
- "firstIndex": 5,
308
- "lastIndex": 7,
309
- "startIndex": 5,
310
- "endIndex": 8,
311
- "children": [
312
- {
313
- "type": "literal",
314
- "name": "open-bracket",
315
- "value": "[",
316
- "firstIndex": 5,
317
- "lastIndex": 5,
318
- "startIndex": 5,
319
- "endIndex": 6,
320
- "children": []
321
- },
322
- {
323
- "type": "repeat",
324
- "name": "items",
325
- "value": "1",
326
- "firstIndex": 6,
327
- "lastIndex": 6,
328
- "startIndex": 6,
329
- "endIndex": 7,
330
- "children": [
331
- {
332
- "type": "regex",
333
- "name": "integer",
334
- "value": "1",
335
- "firstIndex": 6,
336
- "lastIndex": 6,
337
- "startIndex": 6,
338
- "endIndex": 7,
339
- "children": []
340
- }
341
- ]
342
- },
343
- {
344
- "type": "literal",
345
- "name": "close-bracket",
346
- "value": "]",
347
- "firstIndex": 7,
348
- "lastIndex": 7,
349
- "startIndex": 7,
350
- "endIndex": 8,
351
- "children": []
352
- }
353
- ]
354
- }
355
- ]
356
- },
357
- {
358
- "type": "literal",
359
- "name": "close-bracket",
360
- "value": "]",
361
- "firstIndex": 8,
362
- "lastIndex": 8,
363
- "startIndex": 8,
364
- "endIndex": 9,
365
- "children": []
366
- }
367
- ]
368
- },
369
- {
370
- "type": "regex",
371
- "name": "comma-divider",
372
- "value": ", ",
373
- "firstIndex": 9,
374
- "lastIndex": 10,
375
- "startIndex": 9,
376
- "endIndex": 11,
377
- "children": []
378
- },
379
- {
380
- "type": "regex",
381
- "name": "integer",
382
- "value": "1",
383
- "firstIndex": 11,
384
- "lastIndex": 11,
385
- "startIndex": 11,
386
- "endIndex": 12,
387
- "children": []
388
- },
389
- {
390
- "type": "regex",
391
- "name": "comma-divider",
392
- "value": ", ",
393
- "firstIndex": 12,
394
- "lastIndex": 13,
395
- "startIndex": 12,
396
- "endIndex": 14,
397
- "children": []
398
- },
399
- {
400
- "type": "regex",
401
- "name": "integer",
402
- "value": "2",
403
- "firstIndex": 14,
404
- "lastIndex": 14,
405
- "startIndex": 14,
406
- "endIndex": 15,
407
- "children": []
408
- },
409
- {
410
- "type": "regex",
411
- "name": "comma-divider",
412
- "value": ", ",
413
- "firstIndex": 15,
414
- "lastIndex": 16,
415
- "startIndex": 15,
416
- "endIndex": 17,
417
- "children": []
418
- },
419
- {
420
- "type": "regex",
421
- "name": "integer",
422
- "value": "3",
423
- "firstIndex": 17,
424
- "lastIndex": 17,
425
- "startIndex": 17,
426
- "endIndex": 18,
427
- "children": []
428
- }
429
- ]
430
- },
431
- {
432
- "type": "literal",
433
- "name": "close-bracket",
434
- "value": "]",
435
- "firstIndex": 18,
436
- "lastIndex": 18,
437
- "startIndex": 18,
438
- "endIndex": 19,
439
- "children": []
416
+
417
+ ## Advanced Topics
418
+
419
+ ### Custom Patterns
420
+
421
+ You can create custom patterns by extending the base `Pattern` class:
422
+
423
+ ```typescript
424
+ import { Pattern } from "clarity-pattern-parser";
425
+
426
+ class CustomPattern extends Pattern {
427
+ constructor(name: string) {
428
+ super(name);
429
+ }
430
+
431
+ exec(text: string) {
432
+ // Custom pattern implementation
440
433
  }
441
- ]
442
434
  }
443
435
  ```
444
- The `Reference` pattern traverses the pattern composition to find the pattern that matches the one given to it at construction. It will then clone that pattern and tell that pattern to parse the text. If it cannot find the pattern with the given name, it will throw a runtime error.
445
- ## Not
446
436
 
447
- ## Intellisense
448
- Because the patterns are composed in a tree and the cursor remembers what patterns matched last, we can ask what tokens are next. We will discuss how you can use clarity-pattern-parser for text auto complete and other interesting approaches for intellisense.
437
+ ### Performance Tips
449
438
 
450
- ## GetTokens
451
- The `getTokens` method allow you to ask the pattern what tokens it is looking for. The Regex pattern was the only pattern that didn't already intrinsically know what patterns it was looking for, and we solved this by adding a `setTokens` to its class. This allows you to define a regexp that can capture infinitely many patterns, but suggest a finite set. We will discuss this further in the setTokens section. For now we will demonstrate what `getTokens` does.
439
+ 1. Use `test()` instead of `exec()` when you only need to check if a pattern matches
440
+ 2. Cache frequently used patterns
441
+ 3. Use `Reference` for recursive patterns instead of direct recursion
442
+ 4. Minimize the use of optional patterns in sequences
443
+ 5. Use bounded repetition when possible
452
444
 
453
- ```ts
454
- import { Or, Literal } from "clarity-pattern-parser";
445
+ ### Debugging
455
446
 
456
- const jane = new Literal("jane", "Jane");
457
- const john = new Literal("john", "John");
458
- const jack = new Literal("jack", "Jack");
459
- const jill = new Literal("jill", "Jill");
447
+ Enable debug mode to get detailed information about pattern execution:
460
448
 
461
- const names = new Or("names", [jane, john, jack, jill]);
449
+ ```typescript
450
+ const result = pattern.exec("some text", true);
451
+ // Debug information will be available in result.debug
452
+ ```
453
+
454
+ ### Error Handling
455
+
456
+ Pattern execution returns a `ParseResult` that includes error information:
457
+
458
+ ```typescript
459
+ const result = pattern.exec("invalid text");
460
+ if (result.error) {
461
+ console.error(result.error.message);
462
+ console.error(result.error.expected);
463
+ console.error(result.error.position);
464
+ }
465
+ ```
462
466
 
463
- names.getTokens();
467
+ ## Examples
468
+
469
+ ### JSON Parser
470
+ ```typescript
471
+ const { json } = patterns`
472
+ # Basic JSON grammar
473
+ ws = /\s+/
474
+ string = /"[^"]*"/
475
+ number = /-?\d+(\.\d+)?/
476
+ boolean = "true" | "false"
477
+ null = "null"
478
+ value = string | number | boolean | null | array | object
479
+ array-items = (value, /\s*,\s*/)+
480
+ array = "[" + ws? + array-items? + ws? + "]"
481
+ object-property = string + ws? + ":" + ws? + value
482
+ object-properties = (object-property, /\s*,\s*/ trim)+
483
+ object = "{" + ws? + object-properties? + ws? + "}"
484
+ json = ws? + value + ws?
485
+ `;
464
486
  ```
465
- ```json
466
- ["Jane", "John", "Jack", "Jill"]
487
+
488
+ ### HTML Parser
489
+ ```typescript
490
+ const { html } = patterns`
491
+ # Basic HTML grammar
492
+ ws = /\s+/
493
+ tag-name = /[a-zA-Z_-]+[a-zA-Z0-9_-]*/
494
+ attribute-name = /[a-zA-Z_-]+[a-zA-Z0-9_-]*/
495
+ attribute-value = /"[^"]*"/
496
+ value-attribute = attribute-name + "=" + attribute-value
497
+ bool-attribute = attribute-name
498
+ attribute = value-attribute | bool-attribute
499
+ attributes = (attribute, ws)*
500
+ opening-tag = "<" + ws? + tag-name + ws? + attributes? + ">"
501
+ closing-tag = "</" + ws? + tag-name + ws? + ">"
502
+ text = /[^<]+/
503
+ child = text | element
504
+ children = (child, /\s*/)+
505
+ element = opening-tag + children? + closing-tag
506
+ html = ws? + element + ws?
507
+ `;
467
508
  ```
468
- ## GetNextTokens
469
509
 
470
- ## SetTokens
510
+ ## License
511
+
512
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
471
513
 
472
- ## Error Handling