prism 0.13.0 → 0.15.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 290f2191f3700e2584cc320e6dd99f78cfbd2cfd82006700f0bf977bfdc30424
- data.tar.gz: 2886b0f5553307c1cd5d5e963ac03ea66330e2fa7d26cc4684d8f556e52520f5
+ metadata.gz: 1d02e58a6abcd72bbe920246599ed073e7c074929dff4f5b2408f87c9a222a51
+ data.tar.gz: 3c89a774748375ff30057d712465678ecff0936db4359d8fef10a819ba410d42
  SHA512:
- metadata.gz: 24dc8c15f9693b120e8dfcdf8edf2d3a396e6eb636589eded6dac5b651cc62beee1f9c6464d49b1c93b5666e963755bfc4eb041a020ca9585ab7a0bb10172787
- data.tar.gz: 272d11c30527f924644c3c97ac2895167798774007919d429b03325926f1853422a56463e7272ae02d4855f54dd1f775ce546065d2d2c78b1187192cce722e48
+ metadata.gz: c8b5d4de5b481f30fe65177f04b22d323cc697512d0a16077a44aefe393c821d1527343d369fc0a4a850cae725d7ef5a5f073c65c29d4cb49a558262e5c4decb
+ data.tar.gz: 6a487e4ca950684075ab462ce3cb9e04653cfa0e9af6888280866f08ca05653172ce86dcb89a8dd7bda1d4c5d4bd45f6f8cd106f428cca723ffe9787ff3d19b5
data/CHANGELOG.md CHANGED
@@ -6,6 +6,40 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a
 
  ## [Unreleased]
 
+ ## [0.15.0] - 2023-10-18
+
+ ### Added
+
+ - `BackReferenceReadNode#name` is now provided.
+ - `Index{Operator,And,Or}WriteNode` are introduced, split out from `Call{Operator,And,Or}WriteNode` when the method is `[]`.
+
+ ### Changed
+
+ - Ensure `PM_NODE_FLAG_COMMON_MASK` into a constant expression to fix compile errors.
+ - `super(&arg)` is now fixed.
+ - Ensure the last encoding flag on regular expressions wins.
+ - Fix the common whitespace calculation when embedded expressions begin on a line.
+ - Capture groups in regular expressions now scan the unescaped version to get the correct local variables.
+ - `*` and `&` are added to the local table when `...` is found in the parameters of a method definition.
+
+ ## [0.14.0] - 2023-10-13
+
+ ### Added
+
+ - Syntax errors are added for invalid lambda local semicolon placement.
+ - Lambda locals are now checked for duplicate names.
+ - Destructured parameters are now checked for duplicate names.
+ - `Constant{Read,Path,PathTarget}Node#full_name` and `Constant{Read,Path,PathTarget}Node#full_name_parts` are added to walk constant paths for you to find the full name of the constant.
+ - Syntax errors are added when assigning to a numbered parameter.
+ - `Node::type` is added, which matches the `Node#type` API.
+ - Magic comments are now parsed as part of the parsing process and a new field is added in the form of `ParseResult#magic_comments` to access them.
+
+ ### Changed
+
+ - **BREAKING**: `Call*Node#name` methods now return symbols instead of strings.
+ - **BREAKING**: For loops now have their index value considered as part of the body, so depths of local variable assignments will be increased by 1.
+ - Tilde heredocs now split up their lines into multiple string nodes to make them easier to dedent.
+
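The `full_name` and `full_name_parts` entry above describes walking a constant path to recover the whole name. A minimal sketch of that idea follows. The `ConstantRead` and `ConstantPath` structs are hypothetical stand-ins, not Prism's node classes, and returning the parts as symbols is an assumption made for illustration.

```ruby
# Hypothetical stand-ins for constant nodes; these are NOT Prism's classes.
ConstantRead = Struct.new(:name)           # e.g. Foo
ConstantPath = Struct.new(:parent, :child) # e.g. Foo::Bar

# Collects the name parts of a constant path, outermost parent first.
def full_name_parts(node)
  case node
  when ConstantRead
    [node.name]
  when ConstantPath
    full_name_parts(node.parent) + full_name_parts(node.child)
  else
    raise ArgumentError, "not a constant path: #{node.inspect}"
  end
end

# Joins the parts with the scope resolution operator.
def full_name(node)
  full_name_parts(node).join("::")
end

# Models the source text `Foo::Bar::Baz`.
def example_path
  ConstantPath.new(
    ConstantPath.new(ConstantRead.new(:Foo), ConstantRead.new(:Bar)),
    ConstantRead.new(:Baz)
  )
end
```

With these stand-ins, `full_name(example_path)` joins the three parts into a single string, which is the convenience the changelog entry describes.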
  ## [0.13.0] - 2023-09-29
 
  ### Added
@@ -161,7 +195,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a
 
  - 🎉 Initial release! 🎉
 
- [unreleased]: https://github.com/ruby/prism/compare/v0.13.0...HEAD
+ [unreleased]: https://github.com/ruby/prism/compare/v0.15.0...HEAD
+ [0.15.0]: https://github.com/ruby/prism/compare/v0.14.0...v0.15.0
+ [0.14.0]: https://github.com/ruby/prism/compare/v0.13.0...v0.14.0
  [0.13.0]: https://github.com/ruby/prism/compare/v0.12.0...v0.13.0
  [0.12.0]: https://github.com/ruby/prism/compare/v0.11.0...v0.12.0
  [0.11.0]: https://github.com/ruby/prism/compare/v0.10.0...v0.11.0
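The changelog entries on tilde heredocs and the common whitespace calculation refer to dedenting: the indentation of the least-indented non-blank line is removed from every line. The sketch below is a simplified model of that calculation, assuming space-only indentation with no tabs and no embedded expressions. It is not Prism's implementation, and `common_indent` and `dedent` are names chosen here for illustration.

```ruby
# Simplified model of squiggly-heredoc dedenting.
# Assumptions: indentation uses spaces only; no tabs, no interpolation.

# Width of the smallest leading-space run among non-blank lines.
def common_indent(lines)
  widths = lines
    .reject { |line| line.strip.empty? }
    .map { |line| line[/\A */].size }
  widths.min || 0
end

# Removes up to the common indentation from the start of every line.
def dedent(lines)
  width = common_indent(lines)
  lines.map { |line| line.sub(/\A {0,#{width}}/, "") }
end
```

Splitting the heredoc into one string per line, as the 0.14.0 entry describes, is what makes a per-line pass like `dedent` straightforward: each line can be trimmed independently once the common width is known.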
data/README.md CHANGED
@@ -1,4 +1,7 @@
- # Prism Ruby parser
+ <h1 align="center">Prism Ruby parser</h1>
+ <div align="center">
+ <img alt="Prism Ruby parser" height="256px" src="https://github.com/ruby/prism/blob/main/docs/prism.png?raw=true">
+ </div>
 
  This is a parser for the Ruby programming language. It is designed to be portable, error tolerant, and maintainable. It is written in C99 and has no dependencies. It is currently being integrated into [CRuby](https://github.com/ruby/ruby), [JRuby](https://github.com/jruby/jruby), [TruffleRuby](https://github.com/oracle/truffleruby), [Sorbet](https://github.com/sorbet/sorbet), and [Syntax Tree](https://github.com/ruby-syntax-tree/syntax_tree).
 
data/config.yml CHANGED
@@ -361,6 +361,8 @@ flags:
  comment: "x - ignores whitespace and allows comments in regular expressions"
  - name: MULTI_LINE
  comment: "m - allows $ to match the end of lines within strings"
+ - name: ONCE
+ comment: "o - only interpolates values into the regular expression once"
  - name: EUC_JP
  comment: "e - forces the EUC-JP encoding"
  - name: ASCII_8BIT
@@ -369,12 +371,10 @@ flags:
  comment: "s - forces the Windows-31J encoding"
  - name: UTF_8
  comment: "u - forces the UTF-8 encoding"
- - name: ONCE
- comment: "o - only interpolates values into the regular expression once"
  - name: StringFlags
  values:
  - name: FROZEN
- comment: "frozen by virtue of a frozen_string_literal comment"
+ comment: "frozen by virtue of a `frozen_string_literal` comment"
  nodes:
  - name: AliasGlobalVariableNode
  fields:
@@ -507,6 +507,9 @@ nodes:
  { **foo }
  ^^^^^
  - name: BackReferenceReadNode
+ fields:
+ - name: name
+ type: constant
  comment: |
  Represents reading a reference to a field in the previous match.
 
@@ -630,20 +633,13 @@ nodes:
  type: location?
  - name: message_loc
  type: location?
- - name: opening_loc
- type: location?
- - name: arguments
- type: node?
- kind: ArgumentsNode
- - name: closing_loc
- type: location?
  - name: flags
  type: flags
  kind: CallNodeFlags
  - name: read_name
- type: string
+ type: constant
  - name: write_name
- type: string
+ type: constant
  - name: operator_loc
  type: location
  - name: value
@@ -674,7 +670,7 @@ nodes:
  type: flags
  kind: CallNodeFlags
  - name: name
- type: string
+ type: constant
  comment: |
  Represents a method call, in all of the various forms that can take.
 
@@ -703,20 +699,13 @@ nodes:
  type: location?
  - name: message_loc
  type: location?
- - name: opening_loc
- type: location?
- - name: arguments
- type: node?
- kind: ArgumentsNode
- - name: closing_loc
- type: location?
  - name: flags
  type: flags
  kind: CallNodeFlags
  - name: read_name
- type: string
+ type: constant
  - name: write_name
- type: string
+ type: constant
  - name: operator
  type: constant
  - name: operator_loc
@@ -736,20 +725,13 @@ nodes:
  type: location?
  - name: message_loc
  type: location?
- - name: opening_loc
- type: location?
- - name: arguments
- type: node?
- kind: ArgumentsNode
- - name: closing_loc
- type: location?
  - name: flags
  type: flags
  kind: CallNodeFlags
  - name: read_name
- type: string
+ type: constant
  - name: write_name
- type: string
+ type: constant
  - name: operator_loc
  type: location
  - name: value
@@ -1443,6 +1425,89 @@ nodes:
 
  case a; in b then c end
  ^^^^^^^^^^^
+ - name: IndexAndWriteNode
+ fields:
+ - name: receiver
+ type: node?
+ - name: call_operator_loc
+ type: location?
+ - name: opening_loc
+ type: location
+ - name: arguments
+ type: node?
+ kind: ArgumentsNode
+ - name: closing_loc
+ type: location
+ - name: block
+ type: node?
+ - name: flags
+ type: flags
+ kind: CallNodeFlags
+ - name: operator_loc
+ type: location
+ - name: value
+ type: node
+ comment: |
+ Represents the use of the `&&=` operator on a call to the `[]` method.
+
+ foo.bar[baz] &&= value
+ ^^^^^^^^^^^^^^^^^^^^^^
+ - name: IndexOperatorWriteNode
+ fields:
+ - name: receiver
+ type: node?
+ - name: call_operator_loc
+ type: location?
+ - name: opening_loc
+ type: location
+ - name: arguments
+ type: node?
+ kind: ArgumentsNode
+ - name: closing_loc
+ type: location
+ - name: block
+ type: node?
+ - name: flags
+ type: flags
+ kind: CallNodeFlags
+ - name: operator
+ type: constant
+ - name: operator_loc
+ type: location
+ - name: value
+ type: node
+ comment: |
+ Represents the use of an assignment operator on a call to `[]`.
+
+ foo.bar[baz] += value
+ ^^^^^^^^^^^^^^^^^^^^^
+ - name: IndexOrWriteNode
+ fields:
+ - name: receiver
+ type: node?
+ - name: call_operator_loc
+ type: location?
+ - name: opening_loc
+ type: location
+ - name: arguments
+ type: node?
+ kind: ArgumentsNode
+ - name: closing_loc
+ type: location
+ - name: block
+ type: node?
+ - name: flags
+ type: flags
+ kind: CallNodeFlags
+ - name: operator_loc
+ type: location
+ - name: value
+ type: node
+ comment: |
+ Represents the use of the `||=` operator on a call to `[]`.
+
+ foo.bar[baz] ||= value
+ ^^^^^^^^^^^^^^^^^^^^^^
  - name: InstanceVariableAndWriteNode
  fields:
  - name: name
@@ -1772,7 +1837,6 @@ nodes:
  type: location
  - name: content_loc
  type: location
- semantic_field: true # https://github.com/ruby/prism/issues/1452
  - name: closing_loc
  type: location
  - name: unescaped
@@ -2093,7 +2157,6 @@ nodes:
  type: location
  - name: content_loc
  type: location
- semantic_field: true # https://github.com/ruby/prism/issues/1452
  - name: closing_loc
  type: location
  - name: unescaped
@@ -2287,10 +2350,8 @@ nodes:
  kind: StringFlags
  - name: opening_loc
  type: location?
- semantic_field: true # https://github.com/ruby/prism/issues/1452
  - name: content_loc
  type: location
- semantic_field: true # https://github.com/ruby/prism/issues/1452
  - name: closing_loc
  type: location?
  - name: unescaped
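The `Index{And,Operator,Or}WriteNode` definitions in this file cover expressions such as `foo.bar[baz] ||= value`, each of which involves both a `[]` read and a possible `[]=` write. The sketch below shows that runtime behaviour in plain Ruby. It illustrates the language semantics the nodes represent, not Prism's API, and the `Recorder` class and `index_write_trace` method are hypothetical names introduced here.

```ruby
# Records every call to [] and []= so the read/write sequence is visible.
class Recorder
  attr_reader :calls

  def initialize
    @store = {}
    @calls = []
  end

  def [](key)
    @calls << [:[], key]
    @store[key]
  end

  def []=(key, value)
    @calls << [:[]=, key, value]
    @store[key] = value
  end
end

# Runs one of each index write form and returns the recorded calls.
def index_write_trace
  rec = Recorder.new
  rec[:a] ||= 1  # reads nil, so it writes 1
  rec[:a] ||= 2  # reads 1 (truthy), so no write happens
  rec[:a] += 10  # reads 1, then writes 11
  rec[:a] &&= 0  # reads 11 (truthy), so it writes 0
  rec.calls
end
```

The trace shows that `||=` and `&&=` only write conditionally while an operator such as `+=` always reads and then writes, which is one reason the three forms are modelled as separate node types.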
data/docs/fuzzing.md CHANGED
@@ -6,8 +6,7 @@ We use fuzzing to test the various entrypoints to the library. The fuzzer we use
  fuzz
  ├── corpus
  │   ├── parse fuzzing corpus for parsing (a symlink to our fixtures)
- │   ├── regexp fuzzing corpus for regexp
- │   └── unescape fuzzing corpus for unescaping strings
+ │   └── regexp fuzzing corpus for regexp
  ├── dict a AFL++ dictionary containing various tokens
  ├── docker
  │   └── Dockerfile for building a container with the fuzzer toolchain
@@ -17,11 +16,9 @@ fuzz
  ├── parse.sh script to run parsing fuzzer
  ├── regexp.c fuzz handler for regular expression parsing
  ├── regexp.sh script to run regexp fuzzer
- ├── tools
- │   ├── backtrace.sh generates backtrace files for a crash directory
- │   └── minimize.sh generates minimized crash or hang files
- ├── unescape.c fuzz handler for unescape functionality
- └── unescape.sh script to run unescape fuzzer
+ └── tools
+    ├── backtrace.sh generates backtrace files for a crash directory
+    └── minimize.sh generates minimized crash or hang files
  ```
 
  ## Usage
@@ -30,14 +27,12 @@ There are currently three fuzzing targets
 
  - `pm_parse_serialize` (parse)
  - `pm_regexp_named_capture_group_names` (regexp)
- - `pm_unescape_manipulate_string` (unescape)
 
  Respectively, fuzzing can be performed with
 
  ```
  make fuzz-run-parse
  make fuzz-run-regexp
- make fuzz-run-unescape
  ```
 
  To end a fuzzing job, interrupt with CTRL+C. To enter a container with the fuzzing toolchain and debug utilities, run
@@ -60,7 +55,7 @@ Note, that this may make reproducing bugs difficult as they may depend on memory
 
  ```
  make fuzz-debug # enter the docker container with build tools
- make build/fuzz.heisenbug.parse # or .unescape or .regexp
+ make build/fuzz.heisenbug.parse # or .regexp
  ./build/fuzz.heisenbug.parse path-to-problem-input
  ```
 
data/docs/prism.png ADDED
Binary file
@@ -31,6 +31,7 @@ This drastically cuts down on the size of the serialized string, especially when
  ### comment
 
  The comment type is one of:
+
  * 0=`INLINE` (`# comment`)
  * 1=`EMBEDDED_DOCUMENT` (`=begin`/`=end`)
  * 2=`__END__` (after `__END__`)
@@ -40,6 +41,13 @@ The comment type is one of:
  | `1` | comment type |
  | location | the location in the source of this comment |
 
+ ### magic comment
+
+ | # bytes | field |
+ | --- | --- |
+ | location | the location of the key of the magic comment |
+ | location | the location of the value of the magic comment |
+
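The magic comment record above is two locations back to back, preceded in the header by a varint count. A minimal decoding sketch follows. It rests on two assumptions not shown in this diff: that a varint is an unsigned LEB128 integer, and that a location is serialized as two varints (start offset, then length). The method names are hypothetical, and this is not Prism's own deserializer.

```ruby
# Reads one unsigned LEB128 varint from +bytes+ at +pos+ (assumed encoding).
# Returns [value, next_pos].
def read_varint(bytes, pos)
  value = 0
  shift = 0
  loop do
    byte = bytes.getbyte(pos)
    raise ArgumentError, "truncated varint" if byte.nil?
    pos += 1
    value |= (byte & 0x7f) << shift
    return [value, pos] if (byte & 0x80).zero?
    shift += 7
  end
end

# Reads a location, assumed to be a start offset followed by a length.
def read_location(bytes, pos)
  start, pos = read_varint(bytes, pos)
  length, pos = read_varint(bytes, pos)
  [{ start: start, length: length }, pos]
end

# Reads the magic comment count, then that many key/value location pairs.
def read_magic_comments(bytes, pos)
  count, pos = read_varint(bytes, pos)
  comments = Array.new(count) do
    key, pos = read_location(bytes, pos)
    value, pos = read_location(bytes, pos)
    { key: key, value: value }
  end
  [comments, pos]
end
```

Under these assumptions, the source `# frozen_string_literal: true` would produce one record whose key location covers `frozen_string_literal` (start 2, length 21) and whose value location covers `true` (start 25, length 4). Slicing the source with those offsets recovers both strings without copying them into the serialized output.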
  ### diagnostic
 
  | # bytes | field |
@@ -66,6 +74,8 @@ The header is structured like the following table:
  | string | the encoding name |
  | varint | number of comments |
  | comment* | comments |
+ | varint | number of magic comments |
+ | magic comment* | magic comments |
  | varint | number of errors |
  | diagnostic* | errors |
  | varint | number of warnings |