clarity-pattern-parser 11.3.5 → 11.3.6

package/README.md CHANGED
@@ -1,472 +1,887 @@
1
+ # Clarity Pattern Parser
2
+
3
+ A powerful pattern matching and parsing library that provides a flexible grammar for defining complex patterns. Perfect for building parsers, validators, and text processing tools.
4
+
5
+ > **Try it online!** 🚀 [Open in Playground](https://jaredjbarnes.github.io/cpat-editor/)
6
+
7
+ ## Features
8
+
9
+ - 🎯 Flexible pattern matching with both grammar and direct API
10
+ - 🔄 Support for recursive patterns and expressions
11
+ - 🎨 Customizable pattern composition
12
+ - 🚀 High performance parsing
13
+ - 🔍 Built-in debugging support
14
+ - 📝 Rich AST manipulation capabilities
15
+ - 🔌 Extensible through custom patterns and decorators
16
+
1
17
  ## Installation
2
18
 
3
- ```
19
+ ```bash
4
20
  npm install clarity-pattern-parser
5
21
  ```
6
- ## Overview
7
22
 
8
- ### Leaf Patterns
9
- * Literal
10
- * Regex
23
+ ## Quick Start
11
24
 
12
- ### Composing Patterns
13
- * And
14
- * Or
15
- * Repeat
16
- * Reference
17
- * Not
25
+ ### Using Grammar
18
26
 
19
- The `Not` pattern is a negative look ahead and used with the `And` pattern. This will be illustrated in more detail within the `Not` pattern section.
27
+ ```typescript
28
+ import { patterns } from "clarity-pattern-parser";
20
29
 
21
- ## Literal
22
- The `Literal` pattern uses a string literal to match patterns.
23
- ```ts
24
- import { Literal } from "clarity-pattern-parser";
30
+ // Define patterns using grammar
31
+ const { fullName } = patterns`
32
+ first-name = "John"
33
+ last-name = "Doe"
34
+ space = /\s+/
35
+ full-name = first-name + space + last-name
36
+ `;
25
37
 
38
+ // Execute pattern
39
+ const result = fullName.exec("John Doe");
40
+ console.log(result.ast?.value); // "John Doe"
41
+ ```
42
+
43
+ ### Using Direct API
44
+
45
+ ```typescript
46
+ import { Literal, Sequence } from "clarity-pattern-parser";
47
+
48
+ // Create patterns directly
26
49
  const firstName = new Literal("first-name", "John");
50
+ const space = new Literal("space", " ");
51
+ const lastName = new Literal("last-name", "Doe");
52
+ const fullName = new Sequence("full-name", [firstName, space, lastName]);
53
+
54
+ // Execute pattern
55
+ const result = fullName.exec("John Doe");
56
+ console.log(result.ast?.value); // "John Doe"
57
+ ```
58
+
59
+ ## Online Playground
60
+
61
+ Try Clarity Pattern Parser in your browser with our interactive playground:
62
+
63
+ [Open in Playground](https://jaredjbarnes.github.io/cpat-editor/)
64
+
65
+ The playground allows you to:
66
+ - Write and test patterns in real-time
67
+ - See the AST visualization
68
+ - Debug pattern execution
69
+ - Share patterns with others
70
+ - Try out different examples
71
+
72
+ ## Table of Contents
73
+
74
+ 1. [Grammar Documentation](#grammar-documentation)
75
+ - [Basic Patterns](#basic-patterns)
76
+ - [Pattern Operators](#pattern-operators)
77
+ - [Repetition](#repetition)
78
+ - [Imports and Parameters](#imports-and-parameters)
79
+ - [Decorators](#decorators)
80
+ - [Comments](#comments)
81
+ - [Pattern References](#pattern-references)
82
+ - [Pattern Aliasing](#pattern-aliasing)
83
+ - [String Template Patterns](#string-template-patterns)
84
+
85
+ 2. [Direct Pattern Usage](#direct-pattern-usage)
86
+ - [Basic Patterns](#basic-patterns-1)
87
+ - [Composite Patterns](#composite-patterns)
88
+ - [Pattern Context](#pattern-context)
89
+ - [Pattern Reference](#pattern-reference)
90
+ - [Pattern Execution](#pattern-execution)
91
+ - [AST Manipulation](#ast-manipulation)
92
+
93
+ 3. [Advanced Topics](#advanced-topics)
94
+ - [Custom Patterns](#custom-patterns)
95
+ - [Performance Tips](#performance-tips)
96
+ - [Debugging](#debugging)
97
+ - [Error Handling](#error-handling)
98
+
99
+ ## Advanced Topics
100
+
101
+ ### Custom Patterns
102
+
103
+ You can create custom patterns by extending the base `Pattern` class:
104
+
105
+ ```typescript
106
+ import { Pattern } from "clarity-pattern-parser";
27
107
 
28
- const { ast } = firstName.exec("John");
108
+ class CustomPattern extends Pattern {
109
+ constructor(name: string) {
110
+ super(name);
111
+ }
29
112
 
30
- ast.toJson(2)
113
+ exec(text: string) {
114
+ // Custom pattern implementation
115
+ }
116
+ }
31
117
  ```
32
- ```json
33
- {
34
- "type": "literal",
35
- "name": "first-name",
36
- "value": "John",
37
- "firstIndex": 0,
38
- "lastIndex": 3,
39
- "startIndex": 0,
40
- "endIndex": 4,
41
- "children": []
118
+
119
+ ### Performance Tips
120
+
121
+ 1. Use `test()` instead of `exec()` when you only need to check if a pattern matches
122
+ 2. Cache frequently used patterns
123
+ 3. Use `Reference` for recursive patterns instead of direct recursion
124
+ 4. Minimize the use of optional patterns in sequences
125
+ 5. Use bounded repetition when possible
126
+
127
+ ### Debugging
128
+
129
+ Enable debug mode to get detailed information about pattern execution:
130
+
131
+ ```typescript
132
+ const result = pattern.exec("some text", true);
133
+ // Debug information will be available in result.debug
134
+ ```
135
+
136
+ ### Error Handling
137
+
138
+ Pattern execution returns a `ParseResult` that includes error information:
139
+
140
+ ```typescript
141
+ const result = pattern.exec("invalid text");
142
+ if (result.error) {
143
+ console.error(result.error.message);
144
+ console.error(result.error.expected);
145
+ console.error(result.error.position);
42
146
  }
43
147
  ```
44
148
 
45
- ## Regex
46
- The `Regex` pattern uses regular expressions to match patterns.
47
- ```ts
48
- import { Regex } from "clarity-pattern-parser";
149
+ ## Examples
150
+
151
+ ### JSON Parser
152
+ ```typescript
153
+ const { json } = patterns`
154
+ # Basic JSON grammar
155
+ ws = /\s+/
156
+ string = /"[^"]*"/
157
+ number = /-?\d+(\.\d+)?/
158
+ boolean = "true" | "false"
159
+ null = "null"
160
+ value = string | number | boolean | null | array | object
161
+ array-items = (value, /\s*,\s*/)+
162
+ array = "[" + ws? + array-items? + ws? + "]"
163
+ object-property = string + ws? + ":" + ws? + value
164
+ object-properties = (object-property, /\s*,\s*/ trim)+
165
+ object = "{" + ws? + object-properties? + ws? + "}"
166
+ json = ws? + value + ws?
167
+ `;
168
+ ```
49
169
 
50
- const digits = new Regex("digits", "\\d+");
170
+ ### HTML Parser
171
+ ```typescript
172
+ const { html } = patterns`
173
+ # Basic HTML grammar
174
+ ws = /\s+/
175
+ tag-name = /[a-zA-Z_-]+[a-zA-Z0-9_-]*/
176
+ attribute-name = /[a-zA-Z_-]+[a-zA-Z0-9_-]*/
177
+ attribute-value = /"[^"]*"/
178
+ value-attribute = attribute-name + "=" + attribute-value
179
+ bool-attribute = attribute-name
180
+ attribute = value-attribute | bool-attribute
181
+ attributes = (attribute, ws)*
182
+ opening-tag = "<" + ws? + tag-name + ws? + attributes? + ">"
183
+ closing-tag = "</" + ws? + tag-name + ws? + ">"
184
+ text = /[^<]+/
185
+ child = text | element
186
+ children = (child, /\s*/)+
187
+ element = opening-tag + children? + closing-tag
188
+ html = ws? + element + ws?
189
+ `;
190
+ ```
191
+
192
+ ## License
193
+
194
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
195
+
196
+ ## Grammar Documentation
197
+
198
+ This document describes the grammar features supported by the Clarity Pattern Parser.
51
199
 
52
- const { ast } = digits.exec("12");
200
+ ## Basic Patterns
53
201
 
54
- ast.toJson(2);
202
+ ### Literal Strings
203
+ Define literal string patterns using double quotes:
55
204
  ```
56
- ```json
57
- {
58
- "type": "regex",
59
- "name": "digits",
60
- "value": "12",
61
- "firstIndex": 0,
62
- "lastIndex": 1,
63
- "startIndex": 0,
64
- "endIndex": 2,
65
- "children": []
205
+ name = "John"
206
+ ```
207
+
208
+ Escaped characters are supported in literals:
209
+ - `\n` - newline
210
+ - `\r` - carriage return
211
+ - `\t` - tab
212
+ - `\b` - backspace
213
+ - `\f` - form feed
214
+ - `\v` - vertical tab
215
+ - `\0` - null character
216
+ - `\x00` - hex character
217
+ - `\u0000` - unicode character
218
+ - `\"` - escaped quote
219
+ - `\\` - escaped backslash
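As a rough illustration of what decoding the escapes listed above involves, here is a minimal sketch. `unescapeLiteral` and `SIMPLE_ESCAPES` are hypothetical names chosen for this example and are not part of clarity-pattern-parser; the library's own handling may differ.

```typescript
// Hypothetical helper: decodes the escape sequences listed above.
// Illustration only, not the library's implementation.
const SIMPLE_ESCAPES: Record<string, string> = {
  n: "\n",
  r: "\r",
  t: "\t",
  b: "\b",
  f: "\f",
  v: "\v",
  "0": "\0",
  '"': '"',
  "\\": "\\",
};

function unescapeLiteral(raw: string): string {
  return raw.replace(
    /\\(x[0-9a-fA-F]{2}|u[0-9a-fA-F]{4}|[\s\S])/g,
    (whole: string, body: string): string => {
      // \xHH and \uHHHH carry a hexadecimal character code.
      if (body.length > 1) {
        return String.fromCharCode(parseInt(body.slice(1), 16));
      }
      // Single-character escapes; unknown ones are left untouched.
      const mapped = SIMPLE_ESCAPES[body];
      return mapped === undefined ? whole : mapped;
    }
  );
}
```

In this sketch an unrecognized escape such as `\q` is passed through unchanged rather than rejected; a stricter decoder could throw instead.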
220
+
221
+ ### Regular Expressions
222
+ Define regex patterns using forward slashes:
223
+ ```
224
+ name = /\w/
225
+ ```
226
+
227
+ ## Pattern Operators
228
+
229
+ ### Options (|)
230
+ Match one of multiple patterns using the `|` operator. This is used for simple alternatives where order doesn't matter:
231
+ ```
232
+ names = john | jane
233
+ ```
234
+
235
+ ### Expression (|)
236
+ Expression patterns also use the `|` operator but are used for defining operator precedence in expressions. The order of alternatives determines precedence, with earlier alternatives having higher precedence. By default, operators are left-associative.
237
+
238
+ Example of an arithmetic expression grammar:
239
+ ```
240
+ prefix-operators = "+" | "-"
241
+ prefix-expression = prefix-operators + expression
242
+ postfix-operators = "++" | "--"
243
+ postfix-expression = expression + postfix-operators
244
+ add-sub-operators = "+" | "-"
245
+ add-sub-expression = expression + add-sub-operators + expression
246
+ mul-div-operators = "*" | "/"
247
+ mul-div-expression = expression + mul-div-operators + expression
248
+ expression = prefix-expression | mul-div-expression | add-sub-expression | postfix-expression
249
+ ```
250
+
251
+ In this example:
252
+ - `prefix-expression` has highest precedence
253
+ - `mul-div-expression` has next highest precedence
254
+ - `add-sub-expression` has next highest precedence
255
+ - `postfix-expression` has lowest precedence
256
+
257
+ To make an operator right-associative, add the `right` keyword:
258
+ ```
259
+ expression = prefix-expression | mul-div-expression | add-sub-expression right | postfix-expression
260
+ ```
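To make the ordering and associativity rules more concrete, the following is a small precedence-climbing sketch. It is not how the library's `Expression` pattern is implemented; `group`, `rules`, and `bindingPower` are hypothetical names, and only infix operators over integers are handled (no prefix or postfix forms).

```typescript
// Illustrative only: a tiny precedence-climbing parser.
type Assoc = "left" | "right";

interface InfixRule {
  operators: string[];
  assoc: Assoc;
}

// Earlier entries bind tighter, mirroring the ordering rule described above.
const rules: InfixRule[] = [
  { operators: ["*", "/"], assoc: "left" },
  { operators: ["+", "-"], assoc: "left" },
];

// Earlier rule => larger number => higher precedence; -1 means "not an operator".
function bindingPower(op: string, table: InfixRule[]): number {
  for (let i = 0; i < table.length; i++) {
    if (table[i].operators.indexOf(op) !== -1) {
      return table.length - i;
    }
  }
  return -1;
}

// Returns the input with explicit parentheses showing how it was grouped.
function group(source: string, table: InfixRule[] = rules): string {
  const tokens: string[] = source.match(/\d+|[-+*\/]/g) || [];
  let position = 0;

  function parse(minPower: number): string {
    let left = tokens[position++];
    while (position < tokens.length) {
      const op = tokens[position];
      const power = bindingPower(op, table);
      if (power < minPower) break;
      position++;
      const rule = table[table.length - power];
      // Left-associative operators only accept tighter operators on their right.
      const right = parse(rule.assoc === "left" ? power + 1 : power);
      left = "(" + left + " " + op + " " + right + ")";
    }
    return left;
  }

  return parse(1);
}
```

With the default table, `group("1 + 2 * 3")` yields `(1 + (2 * 3))` and `group("8 - 3 - 2")` yields `((8 - 3) - 2)`. Marking the `+`/`-` rule as `"right"` changes the second result to `(8 - (3 - 2))`, which is the effect the `right` keyword describes.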
261
+
262
+ ### Sequence (+)
263
+ Concatenate patterns in sequence using the `+` operator:
264
+ ```
265
+ full-name = first-name + space + last-name
266
+ ```
267
+
268
+ ### Optional (?)
269
+ Make a pattern optional using the `?` operator:
270
+ ```
271
+ full-name = first-name + space + middle-name? + last-name
272
+ ```
273
+
274
+ ### Not (!)
275
+ Negative lookahead using the `!` operator:
276
+ ```
277
+ pattern = !excluded-pattern + actual-pattern
278
+ ```
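A negative lookahead checks that something does not match at the current position and consumes no input. The sketch below shows that idea with plain strings; `matchUnlessPrefixed` is a hypothetical helper written for this example, not a library function.

```typescript
// Illustrative only: a zero-width negative lookahead followed by a real match.
// Returns the matched text, or null when the sequence fails.
// Pass a non-global RegExp so that no lastIndex state is carried between calls.
function matchUnlessPrefixed(
  text: string,
  excluded: string,
  actual: RegExp
): string | null {
  // Negative lookahead: fail if the excluded text starts here; consume nothing otherwise.
  if (text.indexOf(excluded) === 0) {
    return null;
  }
  const match = actual.exec(text);
  // Only accept a match anchored at position 0, as a sequence would require.
  return match !== null && match.index === 0 ? match[0] : null;
}
```

For example, `matchUnlessPrefixed("Jane", "John", /[A-Z][a-z]+/)` returns `"Jane"`, while the same call with `"Johnny"` returns `null` because the excluded text is found at the start.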
279
+
280
+ ### Take Until (?->|)
281
+ Match all characters until a specific pattern is found:
282
+ ```
283
+ script-text = ?->| "</script"
284
+ ```
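The behaviour can be pictured as a search for the terminator followed by a slice. This is a minimal sketch with a hypothetical `takeUntil` function; what happens when the terminator never appears is an assumption made here, and the library may behave differently in that case.

```typescript
// Illustrative only: consume characters up to, but not including, a terminator.
function takeUntil(text: string, terminator: string, start: number = 0): string | null {
  const end = text.indexOf(terminator, start);
  // Assumption for this sketch: with no terminator, take the rest of the text.
  const stop = end === -1 ? text.length : end;
  // An empty match is treated as a failure here.
  return stop > start ? text.slice(start, stop) : null;
}
```

Calling `takeUntil("function() { return 1; }</script>", "</script")` returns `"function() { return 1; }"`; the terminator itself is left unconsumed for the next pattern.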
285
+
286
+ ## Repetition
287
+
288
+ ### Basic Repeat
289
+ Repeat a pattern one or more times using `+`:
290
+ ```
291
+ digits = (digit)+
292
+ ```
293
+
294
+ ### Zero or More
295
+ Repeat a pattern zero or more times using `*`:
296
+ ```
297
+ digits = (digit)*
298
+ ```
299
+
300
+ ### Bounded Repetition
301
+ Specify a repetition count or range using curly braces:
302
+ - `{n}` - Exactly n times: `(pattern){3}`
303
+ - `{n,}` - At least n times: `(pattern){1,}`
304
+ - `{,n}` - At most n times: `(pattern){,3}`
305
+ - `{n,m}` - Between n and m times: `(pattern){1,3}`
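Each of the four forms above reduces to a minimum and a maximum count. The sketch below makes that mapping explicit; `parseBounds` and `Bounds` are hypothetical names, and treating `{,n}` as a minimum of zero is an assumption of this example rather than a documented rule.

```typescript
// Illustrative only: turn a bound such as "{1,3}" into explicit min/max limits.
interface Bounds {
  min: number;
  max: number; // Infinity when no upper limit is given
}

function parseBounds(spec: string): Bounds | null {
  const match = /^\{(\d*)(,?)(\d*)\}$/.exec(spec);
  if (match === null || (match[1] === "" && match[3] === "")) {
    return null; // "{}" and "{,}" carry no information
  }
  const hasComma = match[2] === ",";
  // Assumption: a missing lower bound means zero.
  const min = match[1] === "" ? 0 : parseInt(match[1], 10);
  // Without a comma the count is exact, so max equals min.
  const max = !hasComma ? min : match[3] === "" ? Infinity : parseInt(match[3], 10);
  return min <= max ? { min: min, max: max } : null;
}
```

Under these assumptions `{3}` becomes `{ min: 3, max: 3 }`, `{1,}` becomes `{ min: 1, max: Infinity }`, `{,3}` becomes `{ min: 0, max: 3 }`, and an inverted range such as `{3,1}` is rejected.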
306
+
307
+ ### Repetition with Divider
308
+ Repeat patterns with a divider between occurrences:
309
+ ```
310
+ digits = (digit, comma){3}
311
+ ```
312
+
313
+ Add the `trim` keyword to trim a trailing divider from the end:
314
+ ```
315
+ digits = (digit, comma trim)+
316
+ ```
317
+
318
+ ## Imports and Parameters
319
+
320
+ ### Basic Import
321
+ Import patterns from other files:
322
+ ```
323
+ import { pattern-name } from "path/to/file.cpat"
324
+ ```
325
+
326
+ ### Import with Parameters
327
+ Import with custom parameters:
328
+ ```
329
+ import { pattern } from "file.cpat" with params {
330
+ custom-param = "value"
331
+ }
332
+ ```
333
+
334
+ ### Parameter Declaration
335
+ Declare parameters that can be passed to the grammar:
336
+ ```
337
+ use params {
338
+ param-name
339
+ }
340
+ ```
341
+
342
+ ### Default Parameters
343
+ Specify default values for parameters:
344
+ ```
345
+ use params {
346
+ param = default-value
66
347
  }
67
348
  ```
68
349
 
69
- ### Regex Caveats
70
- Do not use "^" at the beginning or "$" at the end of your regular expression. If you are creating a regular expression that is concerned about the beginning and end of the text you should probably just use a regular expression.
350
+ ## Custom Grammar Resolvers
351
+
352
+ The Clarity Pattern Parser allows you to provide your own resolver for handling imports of `.cpat` files. This is useful when you need to load patterns from different sources like a database, network, or custom file system.
353
+
354
+ ### Basic Resolver Example
355
+
356
+ ```typescript
357
+ import { Grammar } from "clarity-pattern-parser";
358
+
359
+ // Simple in-memory resolver
360
+ const pathMap: Record<string, string> = {
361
+ "first-name.cpat": `first-name = "John"`,
362
+ "space.cpat": `space = " "`
363
+ };
364
+
365
+ const resolver = (resource: string) => {
366
+ return Promise.resolve({
367
+ expression: pathMap[resource],
368
+ resource
369
+ });
370
+ };
371
+
372
+ const patterns = await Grammar.parse(`
373
+ import { first-name } from "first-name.cpat"
374
+ import { space } from "space.cpat"
375
+ last-name = "Doe"
376
+ full-name = first-name + space + last-name
377
+ `, { resolveImport: resolver });
378
+
379
+ const result = patterns["full-name"].exec("John Doe");
380
+ // result.ast.value will be "John Doe"
381
+ ```
382
+
383
+ ### Resolver with Parameters
384
+
385
+ ```typescript
386
+ const spaceExpression = `
387
+ use params { custom-space }
388
+ space = custom-space
389
+ `;
390
+
391
+ const pathMap: Record<string, string> = {
392
+ "space.cpat": spaceExpression
393
+ };
394
+
395
+ const resolver = (resource: string) => {
396
+ return Promise.resolve({
397
+ expression: pathMap[resource],
398
+ resource
399
+ });
400
+ };
401
+
402
+ const patterns = await Grammar.parse(`
403
+ import { space } from "space.cpat" with params {
404
+ custom-space = " "
405
+ }
406
+ first-name = "John"
+ last-name = "Doe"
407
+ full-name = first-name + space + last-name
408
+ `, { resolveImport: resolver });
409
+
410
+ const result = patterns["full-name"].exec("John Doe");
411
+ // result.ast.value will be "John Doe"
412
+ ```
413
+
414
+ ### Resolver with Aliases
415
+
416
+ ```typescript
417
+ const pathMap: Record<string, string> = {
418
+ "resource1.cpat": `value = "Value"`,
419
+ "resource2.cpat": `
420
+ use params { param }
421
+ export-value = param
422
+ `
423
+ };
424
+
425
+ const resolver = (resource: string) => {
426
+ return Promise.resolve({
427
+ expression: pathMap[resource],
428
+ resource
429
+ });
430
+ };
431
+
432
+ const patterns = await Grammar.parse(`
433
+ import { value as alias } from "resource1.cpat"
434
+ import { export-value } from "resource2.cpat" with params {
435
+ param = alias
436
+ }
437
+ name = export-value
438
+ `, { resolveImport: resolver });
439
+
440
+ const result = patterns["name"].exec("Value");
441
+ // result.ast.value will be "Value"
442
+ ```
443
+
444
+ ### Resolver with Default Values
445
+
446
+ ```typescript
447
+ import { Grammar, Literal } from "clarity-pattern-parser";
+
+ const resolver = (_: string) => {
448
+ return Promise.reject(new Error("No Import"));
449
+ };
450
+
451
+ const patterns = await Grammar.parse(`
452
+ use params {
453
+ value = default-value
454
+ }
455
+ default-value = "DefaultValue"
456
+ alias = value
457
+ `, {
458
+ resolveImport: resolver,
459
+ params: [new Literal("value", "Value")]
460
+ });
461
+
462
+ const result = patterns["alias"].exec("Value");
463
+ // result.ast.value will be "Value"
464
+ ```
465
+
466
+ ### Key Features of Custom Resolvers
467
+
468
+ 1. **Flexibility**: Load patterns from any source (filesystem, network, database, etc.)
469
+ 2. **Parameter Support**: Handle parameter passing between imported patterns
470
+ 3. **Alias Support**: Support pattern aliasing during import
471
+ 4. **Default Values**: Provide default values for parameters
472
+ 5. **Error Handling**: Custom error handling for import failures
473
+ 6. **Resource Tracking**: Track the origin of imported patterns
474
+
475
+ ### Resolver Interface
476
+
477
+ The resolver function should implement the following interface:
478
+
479
+ ```typescript
480
+ type Resolver = (resource: string, originResource: string | null) => Promise<{
481
+ expression: string; // The pattern expression to parse
482
+ resource: string; // The resource identifier
483
+ }>;
484
+ ```
485
+
486
+ ## Decorators
487
+
488
+ Decorators can be applied to patterns using the `@` syntax:
489
+
490
+ ### Token Decorator
491
+ Specify the tokens a pattern suggests, which is useful for a regex that can match many strings but should suggest a finite set:
492
+ ```
493
+ @tokens([" "])
494
+ spaces = /\s+/
495
+ ```
496
+
497
+ ### Custom Decorators
498
+ Support for custom decorators with various argument types:
499
+ ```
500
+ @decorator() // No arguments
501
+ @decorator(["value"]) // Array argument
502
+ @decorator({"prop": value}) // Object argument
503
+ ```
504
+
505
+ ## Comments
506
+ Add comments using the `#` symbol:
507
+ ```
508
+ # This is a comment
509
+ pattern = "value"
510
+ ```
71
511
 
72
- ## And
73
- The `And` pattern is a way to make a sequence pattern. `And` accepts all other patterns as children.
74
- ```ts
75
- import { And, Literal } from "clarity-pattern-parser";
512
+ ## Pattern References
513
+ Reference other patterns by name:
514
+ ```
515
+ pattern1 = "value"
516
+ pattern2 = pattern1
517
+ ```
518
+
519
+ ## Pattern Aliasing
520
+ Import patterns with aliases:
521
+ ```
522
+ import { original as alias } from "file.cpat"
523
+ ```
76
524
 
77
- const jane = new Literal("first-name", "Jane");
525
+ ## String Template Patterns
526
+
527
+ Patterns can be defined inline using string templates. This allows for quick pattern definition and testing without creating separate files.
528
+
529
+ ### Basic Example
530
+ ```typescript
531
+ const { fullName } = patterns`
532
+ first-name = "John"
533
+ last-name = "Doe"
534
+ space = /\s+/
535
+ full-name = first-name + space + last-name
536
+ `;
537
+
538
+ const result = fullName.exec("John Doe");
539
+ // result.ast.value will be "John Doe"
540
+ ```
541
+
542
+ ### Complex Example (HTML-like Markup)
543
+ ```typescript
544
+ const { body } = patterns`
545
+ tag-name = /[a-zA-Z_-]+[a-zA-Z0-9_-]*/
546
+ ws = /\s+/
547
+ opening-tag = "<" + tag-name + ws? + ">"
548
+ closing-tag = "</" + tag-name + ws? + ">"
549
+ child = ws? + element + ws?
550
+ children = (child)*
551
+ element = opening-tag + children + closing-tag
552
+ body = ws? + element + ws?
553
+ `;
554
+
555
+ const result = body.exec(`
556
+ <div>
557
+ <div></div>
558
+ <div></div>
559
+ </div>
560
+ `, true);
561
+
562
+ // Clean up spaces from the AST
563
+ result?.ast?.findAll(n => n.name.includes("ws")).forEach(n => n.remove());
564
+ // result.ast.value will be "<div><div></div><div></div></div>"
565
+ ```
566
+
567
+ ### Key Features
568
+ 1. Patterns are defined in a template literal (backticks) tagged with `patterns`
569
+ 2. Each pattern definition is on a new line
570
+ 3. The `patterns` function returns an object with all defined patterns
571
+ 4. Patterns can be used immediately after definition
572
+ 5. The AST can be manipulated after parsing (e.g., removing spaces)
573
+ 6. The `exec` method can take an optional second parameter to enable debug mode
574
+
575
+ ## Direct Pattern Usage
576
+
577
+ While the grammar provides a convenient way to define patterns, you can also use the Pattern classes directly for more control and flexibility.
578
+
579
+ ### Basic Patterns
580
+
581
+ #### Literal
582
+ ```typescript
583
+ import { Literal } from "clarity-pattern-parser";
584
+
585
+ const firstName = new Literal("first-name", "John");
586
+ const result = firstName.exec("John");
587
+ // result.ast.value will be "John"
588
+ ```
589
+
590
+ #### Regex
591
+ ```typescript
592
+ import { Regex } from "clarity-pattern-parser";
593
+
594
+ const digits = new Regex("digits", "\\d+");
595
+ const result = digits.exec("123");
596
+ // result.ast.value will be "123"
597
+ ```
598
+
599
+ ### Composite Patterns
600
+
601
+ #### Sequence
602
+ ```typescript
603
+ import { Sequence, Literal } from "clarity-pattern-parser";
604
+
605
+ const firstName = new Literal("first-name", "John");
78
606
  const space = new Literal("space", " ");
79
- const doe = new Literal("last-name", "Doe");
80
- const fullName = new And("full-name", [jane, space, doe]);
81
-
82
- const { ast } = fullName.exec("Jane Doe");
83
-
84
- ast.toJson(2); // Look Below for output
85
- ```
86
-
87
- ```json
88
- {
89
- "type": "and",
90
- "name": "full-name",
91
- "value": "Jane Doe",
92
- "firstIndex": 0,
93
- "lastIndex": 7,
94
- "startIndex": 0,
95
- "endIndex": 8,
96
- "children": [
97
- {
98
- "type": "literal",
99
- "name": "first-name",
100
- "value": "Jane",
101
- "firstIndex": 0,
102
- "lastIndex": 3,
103
- "startIndex": 0,
104
- "endIndex": 4,
105
- "children": []
106
- },
107
- {
108
- "type": "and",
109
- "name": "space",
110
- "value": " ",
111
- "firstIndex": 4,
112
- "lastIndex": 4,
113
- "startIndex": 4,
114
- "endIndex": 5,
115
- "children": []
116
- },
117
- {
118
- "type": "and",
119
- "name": "last-name",
120
- "value": "Doe",
121
- "firstIndex": 5,
122
- "lastIndex": 7,
123
- "startIndex": 5,
124
- "endIndex": 8,
125
- "children": []
126
- }
127
- ]
128
- }
607
+ const lastName = new Literal("last-name", "Doe");
608
+ const fullName = new Sequence("full-name", [firstName, space, lastName]);
609
+
610
+ const result = fullName.exec("John Doe");
611
+ // result.ast.value will be "John Doe"
129
612
  ```
130
613
 
131
- ## Or
132
- The `Or` pattern matches any of the patterns given to the constructor.
133
- ```ts
134
- import { Or, Literal } from "clarity-pattern-parser";
614
+ #### Options
615
+ ```typescript
616
+ import { Options, Literal } from "clarity-pattern-parser";
135
617
 
136
- const jane = new Literal("jane", "Jane");
137
618
  const john = new Literal("john", "John");
138
- const firstName = new Or("first-name", [jane, john]);
139
- const { ast } = firstName.exec("Jane");
140
-
141
- ast.toJson(2)
142
- ```
143
- ```json
144
- {
145
- "type": "literal",
146
- "name": "jane",
147
- "value": "Jane",
148
- "firstIndex": 0,
149
- "lastIndex": 3,
150
- "startIndex": 0,
151
- "endIndex": 4,
152
- "children": []
153
- }
619
+ const jane = new Literal("jane", "Jane");
620
+ const names = new Options("names", [john, jane]);
621
+
622
+ const result = names.exec("Jane");
623
+ // result.ast.value will be "Jane"
154
624
  ```
155
- ## Repeat
156
- The `Repeat` pattern allows you to match repeating patterns with or without a divider.
157
625
 
158
- For example you may want to match a pattern like so.
626
+ #### Expression
627
+ ```typescript
628
+ import { Expression, Literal } from "clarity-pattern-parser";
629
+
630
+ const a = new Literal("a", "a");
631
+ const b = new Literal("b", "b");
632
+ const c = new Literal("c", "c");
633
+ const expression = new Expression("expression", [a, b, c]);
634
+
635
+ const result = expression.exec("a");
636
+ // result.ast.value will be "a"
159
637
  ```
160
- 1,2,3
638
+
639
+ #### Not (Negative Lookahead)
640
+ ```typescript
641
+ import { Not, Literal, Sequence } from "clarity-pattern-parser";
642
+
643
+ const notJohn = new Not("not-john", new Literal("john", "John"));
644
+ const name = new Literal("name", "Jane");
645
+ const pattern = new Sequence("pattern", [notJohn, name]);
646
+
647
+ const result = pattern.exec("Jane");
648
+ // result.ast.value will be "Jane"
161
649
  ```
162
- Here is the code to do so.
163
- ```ts
164
- import { Repeat, Literal, Regex } from "clarity-pattern-parser";
650
+
651
+ #### Repeat
652
+ ```typescript
653
+ import { Repeat, Regex, Literal } from "clarity-pattern-parser";
165
654
 
166
655
  const digit = new Regex("digit", "\\d+");
167
- const commaDivider = new Literal("comma", ",");
168
- const numberList = new Repeat("number-list", digit, commaDivider);
656
+ const comma = new Literal("comma", ",");
657
+ const digits = new Repeat("digits", digit, { divider: comma, min: 1, max: 3 });
169
658
 
170
- const ast = numberList.exec("1,2,3").ast;
659
+ const result = digits.exec("1,2,3");
660
+ // result.ast.value will be "1,2,3"
661
+ ```
171
662
 
172
- ast.type // ==> "repeat"
173
- ast.name // ==> "number-list"
174
- ast.value // ==> "1,2,3"
663
+ #### Take Until
664
+ ```typescript
665
+ import { TakeUntil, Literal } from "clarity-pattern-parser";
175
666
 
176
- ast.children[0].value // ==> "1"
177
- ast.children[1].value // ==> ","
178
- ast.children[2].value // ==> "2"
179
- ast.children[3].value // ==> ","
180
- ast.children[4].value // ==> "3"
667
+ const scriptText = new TakeUntil("script-text", new Literal("end-script", "</script"));
668
+ const result = scriptText.exec("function() { return 1; }</script>");
669
+ // result.ast.value will be "function() { return 1; }"
181
670
  ```
182
671
 
183
- If there is a trailing divider without the repeating pattern, it will not include the trailing divider as part of the result. Here is an example.
672
+ ### Pattern Context
673
+ ```typescript
674
+ import { Context, Literal } from "clarity-pattern-parser";
184
675
 
185
- ```ts
186
- import { Repeat, Literal, Regex } from "clarity-pattern-parser";
676
+ const name = new Literal("name", "John");
677
+ const context = new Context("name-context", name);
187
678
 
188
- const digit = new Regex("digit", "\\d+");
189
- const commaDivider = new Literal("comma", ",");
190
- const numberList = new Repeat("number-list", digit, commaDivider);
191
-
192
- const ast = numberList.exec("1,2,").ast;
193
-
194
- ast.type // ==> "repeat"
195
- ast.name // ==> "number-list"
196
- ast.value // ==> "1,2"
197
-
198
- ast.children[0].value // ==> "1"
199
- ast.children[1].value // ==> ","
200
- ast.children[2].value // ==> "2"
201
- ast.children.length // ==> 3
202
- ```
203
-
204
- ## Reference
205
- Reference is a way to handle cyclical patterns. An example of this would be arrays within arrays. Let's say we want to make a pattern that matches an array that can store numbers and arrays.
206
- ```
207
- [[1, [1]], 1, 2, 3]
208
- ```
209
- Here is an example of using `Reference` to parse this pattern.
210
- ```ts
211
- import { Regex, Literal, Or, Repeat, And, Reference } from "clarity-pattern-parser";
212
-
213
- const integer = new Regex("integer", "\\d+");
214
- const commaDivider = new Regex("comma-divider", "\\s*,\\s*");
215
-
216
- const openBracket = new Literal("open-bracket", "[");
217
- const closeBracket = new Literal("close-bracket", "]");
218
- const item = new Or("item", [integer, new Reference("array")]);
219
- const items = new Repeat("items", item, commaDivider);
220
-
221
- const array = new And("array", [openBracket, items, closeBracket]);
222
- const { ast } = array.exec("[[1, [1]], 1, 2, 3]");
223
-
224
- ast.toJson();
225
- ```
226
- ```json
227
- {
228
- "type": "and",
229
- "name": "array",
230
- "value": "[[1, [1]], 1, 2, 3]",
231
- "firstIndex": 0,
232
- "lastIndex": 18,
233
- "startIndex": 0,
234
- "endIndex": 19,
235
- "children": [
236
- {
237
- "type": "literal",
238
- "name": "open-bracket",
239
- "value": "[",
240
- "firstIndex": 0,
241
- "lastIndex": 0,
242
- "startIndex": 0,
243
- "endIndex": 1,
244
- "children": []
245
- },
246
- {
247
- "type": "repeat",
248
- "name": "items",
249
- "value": "[1, [1]], 1, 2, 3",
250
- "firstIndex": 1,
251
- "lastIndex": 17,
252
- "startIndex": 1,
253
- "endIndex": 18,
254
- "children": [
255
- {
256
- "type": "and",
257
- "name": "array",
258
- "value": "[1, [1]]",
259
- "firstIndex": 1,
260
- "lastIndex": 8,
261
- "startIndex": 1,
262
- "endIndex": 9,
263
- "children": [
264
- {
265
- "type": "literal",
266
- "name": "open-bracket",
267
- "value": "[",
268
- "firstIndex": 1,
269
- "lastIndex": 1,
270
- "startIndex": 1,
271
- "endIndex": 2,
272
- "children": []
273
- },
274
- {
275
- "type": "repeat",
276
- "name": "items",
277
- "value": "1, [1]",
278
- "firstIndex": 2,
279
- "lastIndex": 7,
280
- "startIndex": 2,
281
- "endIndex": 8,
282
- "children": [
283
- {
284
- "type": "regex",
285
- "name": "integer",
286
- "value": "1",
287
- "firstIndex": 2,
288
- "lastIndex": 2,
289
- "startIndex": 2,
290
- "endIndex": 3,
291
- "children": []
292
- },
293
- {
294
- "type": "regex",
295
- "name": "comma-divider",
296
- "value": ", ",
297
- "firstIndex": 3,
298
- "lastIndex": 4,
299
- "startIndex": 3,
300
- "endIndex": 5,
301
- "children": []
302
- },
303
- {
304
- "type": "and",
305
- "name": "array",
306
- "value": "[1]",
307
- "firstIndex": 5,
308
- "lastIndex": 7,
309
- "startIndex": 5,
310
- "endIndex": 8,
311
- "children": [
312
- {
313
- "type": "literal",
314
- "name": "open-bracket",
315
- "value": "[",
316
- "firstIndex": 5,
317
- "lastIndex": 5,
318
- "startIndex": 5,
319
- "endIndex": 6,
320
- "children": []
321
- },
322
- {
323
- "type": "repeat",
324
- "name": "items",
325
- "value": "1",
326
- "firstIndex": 6,
327
- "lastIndex": 6,
328
- "startIndex": 6,
329
- "endIndex": 7,
330
- "children": [
331
- {
332
- "type": "regex",
333
- "name": "integer",
334
- "value": "1",
335
- "firstIndex": 6,
336
- "lastIndex": 6,
337
- "startIndex": 6,
338
- "endIndex": 7,
339
- "children": []
340
- }
341
- ]
342
- },
343
- {
344
- "type": "literal",
345
- "name": "close-bracket",
346
- "value": "]",
347
- "firstIndex": 7,
348
- "lastIndex": 7,
349
- "startIndex": 7,
350
- "endIndex": 8,
351
- "children": []
352
- }
353
- ]
354
- }
355
- ]
356
- },
357
- {
358
- "type": "literal",
359
- "name": "close-bracket",
360
- "value": "]",
361
- "firstIndex": 8,
362
- "lastIndex": 8,
363
- "startIndex": 8,
364
- "endIndex": 9,
365
- "children": []
366
- }
367
- ]
368
- },
369
- {
370
- "type": "regex",
371
- "name": "comma-divider",
372
- "value": ", ",
373
- "firstIndex": 9,
374
- "lastIndex": 10,
375
- "startIndex": 9,
376
- "endIndex": 11,
377
- "children": []
378
- },
379
- {
380
- "type": "regex",
381
- "name": "integer",
382
- "value": "1",
383
- "firstIndex": 11,
384
- "lastIndex": 11,
385
- "startIndex": 11,
386
- "endIndex": 12,
387
- "children": []
388
- },
389
- {
390
- "type": "regex",
391
- "name": "comma-divider",
392
- "value": ", ",
393
- "firstIndex": 12,
394
- "lastIndex": 13,
395
- "startIndex": 12,
396
- "endIndex": 14,
397
- "children": []
398
- },
399
- {
400
- "type": "regex",
401
- "name": "integer",
402
- "value": "2",
403
- "firstIndex": 14,
404
- "lastIndex": 14,
405
- "startIndex": 14,
406
- "endIndex": 15,
407
- "children": []
408
- },
409
- {
410
- "type": "regex",
411
- "name": "comma-divider",
412
- "value": ", ",
413
- "firstIndex": 15,
414
- "lastIndex": 16,
415
- "startIndex": 15,
416
- "endIndex": 17,
417
- "children": []
418
- },
419
- {
420
- "type": "regex",
421
- "name": "integer",
422
- "value": "3",
423
- "firstIndex": 17,
424
- "lastIndex": 17,
425
- "startIndex": 17,
426
- "endIndex": 18,
427
- "children": []
428
- }
429
- ]
430
- },
431
- {
432
- "type": "literal",
433
- "name": "close-bracket",
434
- "value": "]",
435
- "firstIndex": 18,
436
- "lastIndex": 18,
437
- "startIndex": 18,
438
- "endIndex": 19,
439
- "children": []
440
- }
441
- ]
442
- }
679
+ const result = context.exec("John");
680
+ // result.ast.value will be "John"
443
681
  ```
444
- The `Reference` pattern traverses the pattern composition to find the pattern that matches the one given to it at construction. It will then clone that pattern and tell that pattern to parse the text. If it cannot find the pattern with the given name, it will throw a runtime error.
445
- ## Not
446
682
 
447
- ## Intellisense
448
- Because the patterns are composed in a tree and the cursor remembers what patterns matched last, we can ask what tokens are next. We will discuss how you can use clarity-pattern-parser for text auto complete and other interesting approaches for intellisense.
683
+ ### Pattern Reference
684
+ ```typescript
685
+ import { Reference, Literal, Sequence } from "clarity-pattern-parser";
449
686
 
450
- ## GetTokens
451
- The `getTokens` method allows you to ask the pattern what tokens it is looking for. The Regex pattern was the only pattern that didn't already intrinsically know what tokens it was looking for, and we solved this by adding a `setTokens` method to its class. This allows you to define a regexp that can capture infinitely many patterns, but suggest a finite set. We will discuss this further in the setTokens section. For now we will demonstrate what `getTokens` does.
687
+ const name = new Literal("name", "John");
688
+ const reference = new Reference("name-ref", name);
689
+ const pattern = new Sequence("pattern", [reference]);
452
690
 
453
- ```ts
454
- import { Or, Literal } from "clarity-pattern-parser";
691
+ const result = pattern.exec("John");
692
+ // result.ast.value will be "John"
693
+ ```
455
694
 
456
- const jane = new Literal("jane", "Jane");
457
- const john = new Literal("john", "John");
458
- const jack = new Literal("jack", "Jack");
459
- const jill = new Literal("jill", "Jill");
695
+ ### Key Features of Direct Pattern Usage
696
+ 1. Full control over pattern construction and configuration
697
+ 2. Ability to create custom pattern types
698
+ 3. Direct access to pattern execution and AST manipulation
699
+ 4. Better performance for complex patterns
700
+ 5. Easier debugging and testing
701
+ 6. More flexible pattern composition
702
+
703
+ ## Pattern Interface
704
+
705
+ All patterns implement the `Pattern` interface, which provides a consistent API for pattern matching and manipulation.
706
+
707
+ ### Core Methods
708
+
709
+ #### `parse(cursor: Cursor): Node | null`
710
+ Parses the text using the provided cursor and returns a Node if successful.
711
+ - `cursor`: The cursor tracking the current parsing position
712
+ - Returns: A Node if parsing succeeds, null otherwise
713
+
714
+ #### `exec(text: string, record?: boolean): ParseResult`
715
+ Executes the pattern against the given text and returns a `ParseResult` containing the AST and any errors.
716
+ - `text`: The text to parse
717
+ - `record`: Optional boolean to enable debug recording
718
+ - Returns: `ParseResult` with AST and error information
719
+
720
+ #### `test(text: string, record?: boolean): boolean`
721
+ Tests if the pattern matches the given text without building an AST.
722
+ - `text`: The text to test
723
+ - `record`: Optional boolean to enable debug recording
724
+ - Returns: `true` if the pattern matches, `false` otherwise
725
+
726
+ #### `clone(name?: string): Pattern`
727
+ Creates a deep copy of the pattern.
728
+ - `name`: Optional new name for the cloned pattern
729
+ - Returns: A new instance of the pattern
730
+
731
+ ### Token Methods
732
+
733
+ #### `getTokens(): string[]`
734
+ Returns all possible tokens that this pattern can match.
735
+ - Returns: Array of possible token strings
736
+
737
+ #### `getTokensAfter(childReference: Pattern): string[]`
738
+ Returns tokens that can appear after a specific child pattern.
739
+ - `childReference`: The child pattern to check after
740
+ - Returns: Array of possible token strings
741
+
742
+ #### `getNextTokens(): string[]`
743
+ Returns the next possible tokens based on the current state.
744
+ - Returns: Array of possible token strings
745
+
746
+ ### Pattern Methods
747
+
748
+ #### `getPatterns(): Pattern[]`
749
+ Returns all child patterns.
750
+ - Returns: Array of child patterns
751
+
752
+ #### `getPatternsAfter(childReference: Pattern): Pattern[]`
753
+ Returns patterns that can appear after a specific child pattern.
754
+ - `childReference`: The child pattern to check after
755
+ - Returns: Array of possible patterns
756
+
757
+ #### `getNextPatterns(): Pattern[]`
758
+ Returns the next possible patterns based on the current state.
759
+ - Returns: Array of possible patterns
760
+
761
+ ### Utility Methods
762
+
763
+ #### `find(predicate: (pattern: Pattern) => boolean): Pattern | null`
764
+ Finds a pattern that matches the given predicate.
765
+ - `predicate`: Function that tests each pattern
766
+ - Returns: The first matching pattern or null
767
+
768
+ #### `isEqual(pattern: Pattern): boolean`
769
+ Tests if this pattern is equal to another pattern.
770
+ - `pattern`: The pattern to compare with
771
+ - Returns: `true` if patterns are equal, `false` otherwise
772
+
773
+ ### Properties
774
+
775
+ - `id`: Unique identifier for the pattern
776
+ - `type`: Type of the pattern (e.g., "literal", "regex", "sequence")
777
+ - `name`: Name of the pattern
778
+ - `parent`: Parent pattern or null
779
+ - `children`: Array of child patterns
780
+ - `startedOnIndex`: Index in the input text at which the pattern started parsing
781
+
782
+ ### AST Manipulation
783
+ The AST (Abstract Syntax Tree) returned by pattern execution can be manipulated:
784
+ ```typescript
785
+ const result = pattern.exec("some text");
786
+ if (result.ast) {
787
+ // Find all nodes with a specific name
788
+ const nodes = result.ast.findAll(n => n.name === "space");
789
+
790
+ // Remove nodes
791
+ nodes.forEach(n => n.remove());
792
+
793
+ // Get the final value
794
+ const value = result.ast.value;
795
+ }
796
+ ```
460
797
 
461
- const names = new Or("names", [jane, john, jack, jill]);
798
+ ### Node Class Reference
799
+
800
+ The `Node` class is the fundamental building block of the AST (Abstract Syntax Tree) in Clarity Pattern Parser. It provides a rich set of methods for tree manipulation and traversal.
801
+
802
+ #### Basic Properties
803
+ - `id`: Unique identifier for the node
804
+ - `type`: Type of the node (e.g., "literal", "regex", "sequence")
805
+ - `name`: Name of the node
806
+ - `value`: String value of the node (concatenated from children if present)
807
+ - `firstIndex`: Index of the first matched character in the input text
808
+ - `lastIndex`: Index of the last matched character in the input text (inclusive)
809
+ - `startIndex`: Starting position in the input text (same as `firstIndex`)
810
+ - `endIndex`: Ending position in the input text (exclusive, one past `lastIndex`)
811
+ - `parent`: Parent node or null
812
+ - `children`: Array of child nodes
813
+ - `hasChildren`: Whether the node has any children
814
+ - `isLeaf`: Whether the node is a leaf (no children)
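
The four index properties are easy to mix up, so here is a minimal sketch of how they appear to relate, based on the JSON output in the earlier README example (where the `", "` divider has `firstIndex` 12, `lastIndex` 13, `startIndex` 12, `endIndex` 14). The `Span` type, the `spanFor` helper, and the sample string are hypothetical and are not part of clarity-pattern-parser.

```typescript
// Hypothetical shape holding only the four index fields of a node.
interface Span {
  firstIndex: number; // index of the first matched character
  lastIndex: number;  // index of the last matched character (inclusive)
  startIndex: number; // same as firstIndex
  endIndex: number;   // one past lastIndex (exclusive)
}

// Hypothetical helper: derive the indices for `value` matched at `offset`.
// Assumes `value` is non-empty.
function spanFor(offset: number, value: string): Span {
  return {
    firstIndex: offset,
    lastIndex: offset + value.length - 1,
    startIndex: offset,
    endIndex: offset + value.length,
  };
}

// Illustrative input; the ", " divider begins at index 12.
const sampleText = "let arr = [1, 2, 3]";
const divider = spanFor(12, ", ");

console.log(divider);
// startIndex/endIndex work directly with String.prototype.slice:
console.log(sampleText.slice(divider.startIndex, divider.endIndex) === ", ");
```

Under this reading, `startIndex` and `endIndex` form a half-open range that can be passed straight to `slice`, while `firstIndex` and `lastIndex` name the actual first and last characters.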
815
+
816
+ #### Tree Manipulation
817
+ ```typescript
818
+ // Create nodes
819
+ const node = Node.createValueNode("type", "name", "value");
820
+ const parent = Node.createNode("type", "name", [node]);
821
+
822
+ // Add/remove children
823
+ parent.appendChild(newNode);
824
+ parent.removeChild(node);
825
+ parent.removeAllChildren();
826
+
827
+ // Insert/replace nodes
828
+ parent.insertBefore(newNode, referenceNode);
829
+ parent.replaceChild(newNode, referenceNode);
830
+ node.replaceWith(newNode);
831
+
832
+ // Navigate siblings
833
+ const next = node.nextSibling();
834
+ const prev = node.previousSibling();
835
+ ```
836
+
837
+ #### Tree Traversal
838
+ ```typescript
839
+ // Find nodes
840
+ const found = node.find(n => n.name === "target");
841
+ const all = node.findAll(n => n.type === "literal");
842
+
843
+ // Walk the tree
844
+ node.walkUp(n => console.log(n.name)); // Bottom-up
845
+ node.walkDown(n => console.log(n.name)); // Top-down
846
+ node.walkBreadthFirst(n => console.log(n.name)); // Level by level
462
847
 
463
- names.getTokens();
848
+ // Find ancestors
849
+ const ancestor = node.findAncestor(n => n.type === "parent");
464
850
  ```
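
To make the three walk orders concrete, the following sketch reimplements them on a toy tree. It assumes that "bottom-up" means children are visited before their parent and "top-down" means the parent is visited first; the `TinyNode` type and the functions below are illustrative stand-ins, not the library's `Node` implementation.

```typescript
// Toy tree type for illustration only; the library's Node class is richer.
interface TinyNode {
  name: string;
  children: TinyNode[];
}

type Visit = (n: TinyNode) => void;

// Top-down (pre-order): visit a node before its children.
function walkDown(node: TinyNode, visit: Visit): void {
  visit(node);
  node.children.forEach(child => walkDown(child, visit));
}

// Bottom-up (post-order): visit children before their parent.
function walkUp(node: TinyNode, visit: Visit): void {
  node.children.forEach(child => walkUp(child, visit));
  visit(node);
}

// Breadth-first: visit level by level using a queue.
function walkBreadthFirst(root: TinyNode, visit: Visit): void {
  const queue: TinyNode[] = [root];
  while (queue.length > 0) {
    const current = queue.shift() as TinyNode;
    visit(current);
    queue.push(...current.children);
  }
}

const leaf = (name: string): TinyNode => ({ name, children: [] });
const tree: TinyNode = {
  name: "root",
  children: [{ name: "a", children: [leaf("a1"), leaf("a2")] }, leaf("b")],
};

// Collect the visit order produced by a given walk function.
function order(walk: (n: TinyNode, v: Visit) => void): string[] {
  const names: string[] = [];
  walk(tree, n => { names.push(n.name); });
  return names;
}

console.log(order(walkDown));         // order: root, a, a1, a2, b
console.log(order(walkUp));           // order: a1, a2, a, b, root
console.log(order(walkBreadthFirst)); // order: root, a, b, a1, a2
```

A bottom-up walk is usually the safer choice when the callback removes or replaces nodes, since each node's children have already been processed by the time the node itself is visited.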
465
- ```json
466
- ["Jane", "John", "Jack", "Jill"]
851
+
852
+ #### Tree Transformation
853
+ ```typescript
854
+ // Transform nodes based on type
855
+ const transformed = node.transform({
856
+ "literal": n => Node.createValueNode("new-type", n.name, n.value),
857
+ "sequence": n => Node.createNode("new-type", n.name, n.children)
858
+ });
859
+ ```
860
+
861
+ #### Tree Operations
862
+ ```typescript
863
+ // Flatten tree to array
864
+ const nodes = node.flatten();
865
+
866
+ // Compact node (remove children, keep value)
867
+ node.compact();
868
+
869
+ // Clone node
870
+ const clone = node.clone();
871
+
872
+ // Normalize indices
873
+ node.normalize();
874
+
875
+ // Convert to JSON
876
+ const json = node.toJson(2);
467
877
  ```
468
- ## GetNextTokens
469
878
 
470
- ## SetTokens
879
+ #### Static Methods
880
+ ```typescript
881
+ // Create a value node
882
+ const valueNode = Node.createValueNode("type", "name", "value");
883
+
884
+ // Create a node with children
885
+ const parentNode = Node.createNode("type", "name", [child1, child2]);
886
+ ```
471
887
 
472
- ## Error Handling