prism 0.29.0 → 1.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +77 -1
- data/CONTRIBUTING.md +0 -4
- data/README.md +4 -0
- data/config.yml +498 -145
- data/docs/fuzzing.md +1 -1
- data/docs/parsing_rules.md +4 -1
- data/docs/ripper_translation.md +22 -0
- data/docs/serialization.md +3 -0
- data/ext/prism/api_node.c +2858 -2082
- data/ext/prism/extconf.rb +1 -1
- data/ext/prism/extension.c +203 -421
- data/ext/prism/extension.h +2 -2
- data/include/prism/ast.h +1732 -453
- data/include/prism/defines.h +36 -0
- data/include/prism/diagnostic.h +23 -6
- data/include/prism/node.h +0 -21
- data/include/prism/options.h +94 -3
- data/include/prism/parser.h +57 -28
- data/include/prism/regexp.h +18 -8
- data/include/prism/static_literals.h +3 -2
- data/include/prism/util/pm_char.h +1 -2
- data/include/prism/util/pm_constant_pool.h +0 -8
- data/include/prism/util/pm_integer.h +22 -15
- data/include/prism/util/pm_newline_list.h +11 -0
- data/include/prism/util/pm_string.h +28 -12
- data/include/prism/version.h +3 -3
- data/include/prism.h +0 -11
- data/lib/prism/compiler.rb +3 -0
- data/lib/prism/desugar_compiler.rb +111 -74
- data/lib/prism/dispatcher.rb +16 -1
- data/lib/prism/dot_visitor.rb +45 -34
- data/lib/prism/dsl.rb +660 -468
- data/lib/prism/ffi.rb +64 -6
- data/lib/prism/inspect_visitor.rb +294 -64
- data/lib/prism/lex_compat.rb +1 -1
- data/lib/prism/mutation_compiler.rb +11 -6
- data/lib/prism/node.rb +2469 -4973
- data/lib/prism/node_ext.rb +91 -14
- data/lib/prism/parse_result/comments.rb +0 -7
- data/lib/prism/parse_result/errors.rb +65 -0
- data/lib/prism/parse_result/newlines.rb +101 -11
- data/lib/prism/parse_result.rb +43 -3
- data/lib/prism/reflection.rb +10 -8
- data/lib/prism/serialize.rb +484 -609
- data/lib/prism/translation/parser/compiler.rb +152 -132
- data/lib/prism/translation/parser/lexer.rb +26 -4
- data/lib/prism/translation/parser.rb +9 -4
- data/lib/prism/translation/ripper.rb +22 -20
- data/lib/prism/translation/ruby_parser.rb +73 -13
- data/lib/prism/visitor.rb +3 -0
- data/lib/prism.rb +0 -4
- data/prism.gemspec +3 -5
- data/rbi/prism/dsl.rbi +521 -0
- data/rbi/prism/node.rbi +744 -4837
- data/rbi/prism/visitor.rbi +3 -0
- data/rbi/prism.rbi +36 -30
- data/sig/prism/dsl.rbs +190 -303
- data/sig/prism/mutation_compiler.rbs +1 -0
- data/sig/prism/node.rbs +759 -628
- data/sig/prism/parse_result.rbs +2 -0
- data/sig/prism/visitor.rbs +1 -0
- data/sig/prism.rbs +103 -64
- data/src/diagnostic.c +62 -28
- data/src/node.c +499 -1754
- data/src/options.c +76 -27
- data/src/prettyprint.c +156 -112
- data/src/prism.c +2773 -2081
- data/src/regexp.c +202 -69
- data/src/serialize.c +170 -50
- data/src/static_literals.c +63 -84
- data/src/token_type.c +4 -4
- data/src/util/pm_constant_pool.c +0 -8
- data/src/util/pm_integer.c +53 -25
- data/src/util/pm_newline_list.c +29 -0
- data/src/util/pm_string.c +130 -80
- data/src/util/pm_strpbrk.c +32 -6
- metadata +4 -6
- data/include/prism/util/pm_string_list.h +0 -44
- data/lib/prism/debug.rb +0 -249
- data/lib/prism/translation/parser/rubocop.rb +0 -73
- data/src/util/pm_string_list.c +0 -28
data/docs/fuzzing.md
CHANGED
data/docs/parsing_rules.md
CHANGED
@@ -12,7 +12,10 @@ Constants in Ruby begin with an upper-case letter. This is followed by any numbe
 
 Most expressions in CRuby are non-void. This means the expression they represent resolves to a value. For example, `1 + 2` is a non-void expression, because it resolves to a method call. Even things like `class Foo; end` is a non-void expression, because it returns the last evaluated expression in the body of the class (or `nil`).
 
-Certain nodes, however, are void expressions, and cannot be combined to form larger expressions.
+Certain nodes, however, are void expressions, and cannot be combined to form larger expressions.
+* `BEGIN {}`, `END {}`, `alias foo bar`, and `undef foo` can only be at a statement position.
+* The "jumps": `return`, `break`, `next`, `redo`, `retry` are void expressions.
+* `value => pattern` is also considered a void expression.
 
 ## Identifiers
 
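Note on the hunk above: the rule that "jumps" are void expressions can be observed without the Prism API. The following is a minimal sketch, assuming it runs on CRuby, where `RubyVM::InstructionSequence.compile` raises `SyntaxError` for code the active parser rejects. The helper name `compiles?` is hypothetical and not part of the diff.

```ruby
# Sketch only: uses CRuby's built-in compiler, not the Prism API itself.
# `compiles?` is a hypothetical helper introduced for illustration.
def compiles?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  # The parser rejected the source, for example because a void
  # expression was used where a value is required.
  false
end

# `1 + 2` is a non-void expression, so it may be assigned.
puts compiles?("x = 1 + 2")

# `return` is a jump and therefore a void expression, so using it as
# the right-hand side of an assignment is rejected.
puts compiles?("def m; x = return; end")
```

The exact wording of the error message differs between parsers, which is why the sketch only checks whether compilation succeeds.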
data/docs/ripper_translation.md
CHANGED
@@ -48,3 +48,25 @@ ArithmeticRipper.new("1 + 2 - 3").parse # => [0]
 ```
 
 The exact names of the `on_*` methods are listed in the `Ripper` source.
+
+## Background
+
+It is helpful to understand the differences between the `Ripper` library and the `Prism` library. Both libraries perform parsing and provide you with APIs to manipulate and understand the resulting syntax tree. However, there are a few key differences.
+
+### Design
+
+`Ripper` is a streaming parser. This means as it is parsing Ruby code, it dispatches events back to the consumer. This allows quite a bit of flexibility. You can use it to build your own syntax tree or to find specific patterns in the code. `Prism` on the other hand returns to your the completed syntax tree _before_ it allows you to manipulate it. This means the tree that you get back is the only representation that can be generated by the parser _at parse time_ (but of course can be manipulated later).
+
+### Fields
+
+We use the term "field" to mean a piece of information on a syntax tree node. `Ripper` provides the minimal number of fields to accurately represent the syntax tree for the purposes of compilation/interpretation. For example, in the callbacks for nodes that are based on keywords (`class`, `module`, `for`, `while`, etc.) you are not given the keyword itself, you need to attach it on your own. In other cases, tokens are not necessarily dispatched at all, meaning you need to find them yourself. `Prism` provides the opposite: the maximum number of fields on nodes is provided. As a tradeoff, this requires more memory, but this is chosen to make it easier on consumers.
+
+### Maintainability
+
+The `Ripper` interface is not guaranteed in any way, and tends to change between patch versions of CRuby. This is largely due to the fact that `Ripper` is a by-product of the generated parser, as opposed to its own parser. As an example, in the expression `foo::bar = baz`, there are three different represents possible for the call operator, including:
+
+* `:"::"` - Ruby 1.9 to Ruby 3.1.4
+* `73` - Ruby 3.1.5 to Ruby 3.1.6
+* `[:@op, "::", [lineno, column]]` - Ruby 3.2.0 and later
+
+The `Prism` interface is guaranteed going forward to be the consistent, and the official Ruby syntax tree interface. This means you can rely on this interface without having to worry about individual changes between Ruby versions. It also is a gem, which means it is versioned based on the gem version, as opposed to being versioned based on the Ruby version. Finally, you can use `Prism` to parse multiple versions of Ruby, whereas `Ripper` is tied to the Ruby version it is running on.
CHANGED
@@ -116,7 +116,9 @@ Each node is structured like the following table:
 | # bytes | field |
 | --- | --- |
 | `1` | node type |
+| varuint | node identifier |
 | location | node location |
+| varuint | node flags |
 
 Every field on the node is then appended to the serialized string. The fields can be determined by referencing `config.yml`. Depending on the type of field, it could take a couple of different forms, described below:
 
@@ -199,6 +201,7 @@ The final argument to `pm_serialize_parse` is an optional string that controls t
 | `1` | frozen string literal |
 | `1` | command line flags |
 | `1` | syntax version, see [pm_options_version_t](https://github.com/ruby/prism/blob/main/include/prism/options.h) for valid values |
+| `1` | whether or not the encoding is locked (should almost always be false) |
 | `4` | the number of scopes |
 | ... | the scopes |
 
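Note on the node header table in the first serialization.md hunk: the new layout (1-byte node type, varuint identifier, location, varuint flags) can be read back as sketched below. Two points are assumptions not shown in this diff: that a varuint is an unsigned LEB128 integer, and that a location is a pair of varuints (start offset, then length). The names `read_varuint`, `read_node_header`, and `NodeHeader` are hypothetical, and the input bytes are hand-built rather than real Prism output.

```ruby
require "stringio"

# Assumption: varuint is unsigned LEB128, i.e. 7 payload bits per byte,
# with the high bit set on every byte except the last.
def read_varuint(io)
  value = 0
  shift = 0
  loop do
    byte = io.getbyte
    raise EOFError, "truncated varuint" if byte.nil?
    value |= (byte & 0x7f) << shift
    return value if byte < 0x80
    shift += 7
  end
end

# Hypothetical container for the header fields listed in the table.
NodeHeader = Struct.new(:type, :node_id, :start_offset, :length, :flags)

def read_node_header(io)
  type = io.getbyte                # 1 byte: node type
  node_id = read_varuint(io)       # varuint: node identifier (new in this diff)
  start_offset = read_varuint(io)  # location, assumed to be start offset...
  length = read_varuint(io)        # ...followed by length
  flags = read_varuint(io)         # varuint: node flags (new in this diff)
  NodeHeader.new(type, node_id, start_offset, length, flags)
end

# Hand-built example: type 5, identifier 300 (0xAC 0x02), offset 0, length 9, flags 1.
bytes = [5, 0xAC, 0x02, 0, 9, 1].pack("C*")
p read_node_header(StringIO.new(bytes)).to_a
```

The node-specific fields from `config.yml` would follow the header; they are out of scope for this sketch.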