ed-precompiled_prism 1.5.2-arm64-darwin
- checksums.yaml +7 -0
- data/BSDmakefile +58 -0
- data/CHANGELOG.md +723 -0
- data/CODE_OF_CONDUCT.md +76 -0
- data/CONTRIBUTING.md +58 -0
- data/LICENSE.md +7 -0
- data/Makefile +110 -0
- data/README.md +143 -0
- data/config.yml +4714 -0
- data/docs/build_system.md +119 -0
- data/docs/configuration.md +68 -0
- data/docs/cruby_compilation.md +27 -0
- data/docs/design.md +53 -0
- data/docs/encoding.md +121 -0
- data/docs/fuzzing.md +88 -0
- data/docs/heredocs.md +36 -0
- data/docs/javascript.md +118 -0
- data/docs/local_variable_depth.md +229 -0
- data/docs/mapping.md +117 -0
- data/docs/parser_translation.md +24 -0
- data/docs/parsing_rules.md +22 -0
- data/docs/releasing.md +98 -0
- data/docs/relocation.md +34 -0
- data/docs/ripper_translation.md +72 -0
- data/docs/ruby_api.md +44 -0
- data/docs/ruby_parser_translation.md +19 -0
- data/docs/serialization.md +233 -0
- data/docs/testing.md +55 -0
- data/ext/prism/api_node.c +6941 -0
- data/ext/prism/api_pack.c +276 -0
- data/ext/prism/extconf.rb +127 -0
- data/ext/prism/extension.c +1419 -0
- data/ext/prism/extension.h +19 -0
- data/include/prism/ast.h +8220 -0
- data/include/prism/defines.h +260 -0
- data/include/prism/diagnostic.h +456 -0
- data/include/prism/encoding.h +283 -0
- data/include/prism/node.h +129 -0
- data/include/prism/options.h +482 -0
- data/include/prism/pack.h +163 -0
- data/include/prism/parser.h +933 -0
- data/include/prism/prettyprint.h +34 -0
- data/include/prism/regexp.h +43 -0
- data/include/prism/static_literals.h +121 -0
- data/include/prism/util/pm_buffer.h +236 -0
- data/include/prism/util/pm_char.h +204 -0
- data/include/prism/util/pm_constant_pool.h +218 -0
- data/include/prism/util/pm_integer.h +130 -0
- data/include/prism/util/pm_list.h +103 -0
- data/include/prism/util/pm_memchr.h +29 -0
- data/include/prism/util/pm_newline_list.h +113 -0
- data/include/prism/util/pm_string.h +200 -0
- data/include/prism/util/pm_strncasecmp.h +32 -0
- data/include/prism/util/pm_strpbrk.h +46 -0
- data/include/prism/version.h +29 -0
- data/include/prism.h +408 -0
- data/lib/prism/3.0/prism.bundle +0 -0
- data/lib/prism/3.1/prism.bundle +0 -0
- data/lib/prism/3.2/prism.bundle +0 -0
- data/lib/prism/3.3/prism.bundle +0 -0
- data/lib/prism/3.4/prism.bundle +0 -0
- data/lib/prism/compiler.rb +801 -0
- data/lib/prism/desugar_compiler.rb +392 -0
- data/lib/prism/dispatcher.rb +2210 -0
- data/lib/prism/dot_visitor.rb +4762 -0
- data/lib/prism/dsl.rb +1003 -0
- data/lib/prism/ffi.rb +570 -0
- data/lib/prism/inspect_visitor.rb +2392 -0
- data/lib/prism/lex_compat.rb +928 -0
- data/lib/prism/mutation_compiler.rb +772 -0
- data/lib/prism/node.rb +18816 -0
- data/lib/prism/node_ext.rb +511 -0
- data/lib/prism/pack.rb +230 -0
- data/lib/prism/parse_result/comments.rb +188 -0
- data/lib/prism/parse_result/errors.rb +66 -0
- data/lib/prism/parse_result/newlines.rb +155 -0
- data/lib/prism/parse_result.rb +911 -0
- data/lib/prism/pattern.rb +269 -0
- data/lib/prism/polyfill/append_as_bytes.rb +15 -0
- data/lib/prism/polyfill/byteindex.rb +13 -0
- data/lib/prism/polyfill/scan_byte.rb +14 -0
- data/lib/prism/polyfill/unpack1.rb +14 -0
- data/lib/prism/polyfill/warn.rb +36 -0
- data/lib/prism/reflection.rb +416 -0
- data/lib/prism/relocation.rb +505 -0
- data/lib/prism/serialize.rb +2398 -0
- data/lib/prism/string_query.rb +31 -0
- data/lib/prism/translation/parser/builder.rb +62 -0
- data/lib/prism/translation/parser/compiler.rb +2234 -0
- data/lib/prism/translation/parser/lexer.rb +820 -0
- data/lib/prism/translation/parser.rb +374 -0
- data/lib/prism/translation/parser33.rb +13 -0
- data/lib/prism/translation/parser34.rb +13 -0
- data/lib/prism/translation/parser35.rb +13 -0
- data/lib/prism/translation/parser_current.rb +24 -0
- data/lib/prism/translation/ripper/sexp.rb +126 -0
- data/lib/prism/translation/ripper/shim.rb +5 -0
- data/lib/prism/translation/ripper.rb +3474 -0
- data/lib/prism/translation/ruby_parser.rb +1929 -0
- data/lib/prism/translation.rb +16 -0
- data/lib/prism/visitor.rb +813 -0
- data/lib/prism.rb +97 -0
- data/prism.gemspec +174 -0
- data/rbi/prism/compiler.rbi +12 -0
- data/rbi/prism/dsl.rbi +524 -0
- data/rbi/prism/inspect_visitor.rbi +12 -0
- data/rbi/prism/node.rbi +8734 -0
- data/rbi/prism/node_ext.rbi +107 -0
- data/rbi/prism/parse_result.rbi +404 -0
- data/rbi/prism/reflection.rbi +58 -0
- data/rbi/prism/string_query.rbi +12 -0
- data/rbi/prism/translation/parser.rbi +11 -0
- data/rbi/prism/translation/parser33.rbi +6 -0
- data/rbi/prism/translation/parser34.rbi +6 -0
- data/rbi/prism/translation/parser35.rbi +6 -0
- data/rbi/prism/translation/ripper.rbi +15 -0
- data/rbi/prism/visitor.rbi +473 -0
- data/rbi/prism.rbi +66 -0
- data/sig/prism/compiler.rbs +9 -0
- data/sig/prism/dispatcher.rbs +19 -0
- data/sig/prism/dot_visitor.rbs +6 -0
- data/sig/prism/dsl.rbs +351 -0
- data/sig/prism/inspect_visitor.rbs +22 -0
- data/sig/prism/lex_compat.rbs +10 -0
- data/sig/prism/mutation_compiler.rbs +159 -0
- data/sig/prism/node.rbs +4028 -0
- data/sig/prism/node_ext.rbs +149 -0
- data/sig/prism/pack.rbs +43 -0
- data/sig/prism/parse_result/comments.rbs +38 -0
- data/sig/prism/parse_result.rbs +196 -0
- data/sig/prism/pattern.rbs +13 -0
- data/sig/prism/reflection.rbs +50 -0
- data/sig/prism/relocation.rbs +185 -0
- data/sig/prism/serialize.rbs +8 -0
- data/sig/prism/string_query.rbs +11 -0
- data/sig/prism/visitor.rbs +169 -0
- data/sig/prism.rbs +254 -0
- data/src/diagnostic.c +850 -0
- data/src/encoding.c +5235 -0
- data/src/node.c +8676 -0
- data/src/options.c +328 -0
- data/src/pack.c +509 -0
- data/src/prettyprint.c +8941 -0
- data/src/prism.c +23361 -0
- data/src/regexp.c +790 -0
- data/src/serialize.c +2268 -0
- data/src/static_literals.c +617 -0
- data/src/token_type.c +703 -0
- data/src/util/pm_buffer.c +357 -0
- data/src/util/pm_char.c +318 -0
- data/src/util/pm_constant_pool.c +342 -0
- data/src/util/pm_integer.c +670 -0
- data/src/util/pm_list.c +49 -0
- data/src/util/pm_memchr.c +35 -0
- data/src/util/pm_newline_list.c +125 -0
- data/src/util/pm_string.c +381 -0
- data/src/util/pm_strncasecmp.c +36 -0
- data/src/util/pm_strpbrk.c +206 -0
- metadata +202 -0
|
@@ -0,0 +1,229 @@
|
|
|
1
|
+
# Local variable depth
|
|
2
|
+
|
|
3
|
+
One feature of Prism is that it resolves local variables as it parses. It's necessary to do this because of ambiguities in the grammar. For example, consider the following code:
|
|
4
|
+
|
|
5
|
+
```ruby
|
|
6
|
+
foo / bar#/
|
|
7
|
+
```
|
|
8
|
+
|
|
9
|
+
If `foo` is a local variable, this is a call to `/` with `bar` as an argument, followed by a comment. If it's not a local variable, this is a method call to `foo` with a regular expression argument.
|
|
10
|
+
|
|
11
|
+
"Depth" refers to the number of visible scopes that Prism has to go up to find the declaration of a local variable.
|
|
12
|
+
Note that this follows the same scoping rules as Ruby, so a local variable is only visible in the scope it is declared in and in blocks nested in that scope.
|
|
13
|
+
The rules for calculating the depth are important to understand because the language does not specify them, so they may differ between individual Ruby implementations.
|
|
14
|
+
|
|
15
|
+
Prism uses the minimum number of scopes: it creates a scope only when the semantics require distinct scopes (which can be observed through `binding.local_variables`).
|
|
16
|
+
There are no "transparent/invisible" scopes in Prism.
|
|
17
|
+
Some Ruby implementations use those for some language constructs and need to adjust by maintaining a depth offset.
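
As a rough illustration of what "observable through `binding.local_variables`" means, the plain-Ruby sketch below shows that a block introduces a distinct scope. The method name `scope_demo` is made up for this example and is not part of Prism.

```ruby
# Hypothetical helper: records which locals are visible inside and outside a block.
def scope_demo
  outer = 1
  seen_in_block = nil

  [0].each do
    inner = 2
    # Inside the block, locals of the enclosing scope are visible too.
    seen_in_block = binding.local_variables
  end

  # Back in the method scope, `inner` is no longer visible.
  { in_block: seen_in_block, after_block: binding.local_variables }
end

result = scope_demo
result[:in_block].include?(:inner)    # true: declared in the block's own scope
result[:in_block].include?(:outer)    # true: visible from the parent scope
result[:after_block].include?(:inner) # false: the block scope is distinct
```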
|
|
18
|
+
|
|
19
|
+
Below are the places where a local variable can be written/targeted, along with how the depth is calculated at that point.
|
|
20
|
+
|
|
21
|
+
## General
|
|
22
|
+
|
|
23
|
+
In the course of general Ruby code when reading a local variable, the depth is equal to the number of scopes to go up to find the declaration of that variable. For example:
|
|
24
|
+
|
|
25
|
+
```ruby
|
|
26
|
+
foo = 1
|
|
27
|
+
bar = 2
|
|
28
|
+
baz = 3
|
|
29
|
+
|
|
30
|
+
foo # depth 0
|
|
31
|
+
tap { bar } # depth 1
|
|
32
|
+
tap { tap { baz } } # depth 2
|
|
33
|
+
```
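
The depths in the example above can be inspected from Ruby. The following is a minimal sketch, assuming the `prism` gem is available; the class name `DepthCollector` and the helper `local_read_depths` are invented for illustration, while `Prism.parse`, `Prism::Visitor`, and `LocalVariableReadNode#depth` come from the gem.

```ruby
require "prism"

# Hypothetical visitor that records [name, depth] for every local variable read.
class DepthCollector < Prism::Visitor
  attr_reader :reads

  def initialize
    super()
    @reads = []
  end

  def visit_local_variable_read_node(node)
    @reads << [node.name, node.depth]
    super
  end
end

# Hypothetical helper: parse a snippet and return the collected reads.
def local_read_depths(source)
  collector = DepthCollector.new
  Prism.parse(source).value.accept(collector)
  collector.reads
end

source = <<~RUBY
  foo = 1
  bar = 2
  baz = 3

  foo
  tap { bar }
  tap { tap { baz } }
RUBY

local_read_depths(source) # => [[:foo, 0], [:bar, 1], [:baz, 2]]
```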
|
|
34
|
+
|
|
35
|
+
This also includes writing to a local variable, which could be writing to a local variable that is already declared. For example:
|
|
36
|
+
|
|
37
|
+
```ruby
|
|
38
|
+
foo = 1
|
|
39
|
+
bar = 2
|
|
40
|
+
|
|
41
|
+
foo = 3 # depth 0
|
|
42
|
+
tap { bar = 4 } # depth 1
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
This includes multiple assignment, where the same principle applies. For example:
|
|
46
|
+
|
|
47
|
+
```ruby
|
|
48
|
+
foo = 1
|
|
49
|
+
bar = 2
|
|
50
|
+
|
|
51
|
+
foo, bar = 3, 4 # depth 0
|
|
52
|
+
tap { foo, bar = 5, 6 } # depth 1
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
## `for` loops
|
|
56
|
+
|
|
57
|
+
`for` loops in Ruby break down to calls to `.each` with a block.
|
|
58
|
+
However, in that case local variable reads and writes within the block are in the same scope as the code surrounding the `for`, not in a deeper/separate scope (surprising, but these are Ruby semantics).
|
|
59
|
+
For example:
|
|
60
|
+
|
|
61
|
+
```ruby
|
|
62
|
+
foo = 1
|
|
63
|
+
|
|
64
|
+
for e in baz
|
|
65
|
+
foo # depth 0
|
|
66
|
+
bar = 2 # depth 0
|
|
67
|
+
end
|
|
68
|
+
|
|
69
|
+
p bar # depth 0, prints 2
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
The local variable(s) used for the index of the `for` are also at the same depth (as variables inside and outside the `for`):
|
|
73
|
+
|
|
74
|
+
```ruby
|
|
75
|
+
for e in [1, 2] # depth 0
|
|
76
|
+
e # depth 0
|
|
77
|
+
end
|
|
78
|
+
|
|
79
|
+
p e # depth 0, prints 2
|
|
80
|
+
```
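
The difference from an explicit `.each` block can be checked at runtime. This is a small plain-Ruby sketch; both method names are hypothetical and only serve the comparison.

```ruby
# With `for`, the index and any variable assigned in the body live in the
# surrounding scope, so both are still readable after the loop.
def for_scope_demo
  for e in [1, 2]
    inside = e * 10
  end
  [e, inside]
end

# With `.each` and a block, the parameter and the variable assigned in the
# body belong to the block's own scope and do not leak out.
def each_scope_demo
  [1, 2].each { |e| inside = e * 10 }
  binding.local_variables
end

for_scope_demo  # => [2, 20]
each_scope_demo # => []
```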
|
|
81
|
+
|
|
82
|
+
## Pattern matching captures
|
|
83
|
+
|
|
84
|
+
You can target a local variable in a pattern matching expression using capture syntax. Using this syntax, you can target local variables in the current scope or in visible parent scopes. For example:
|
|
85
|
+
|
|
86
|
+
```ruby
|
|
87
|
+
42 => bar # depth 0
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
The example above writes to a local variable in the current scope. If the variable is already declared in a higher visible scope, it will be written to that scope instead. For example:
|
|
91
|
+
|
|
92
|
+
```ruby
|
|
93
|
+
foo = 1
|
|
94
|
+
tap { 42 => foo } # depth 1
|
|
95
|
+
```
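
The runtime effect of these two cases can be sketched in plain Ruby (3.1 or later, where `value => pattern` is no longer experimental). The method names are hypothetical.

```ruby
# The capture targets `foo`, which is already declared one scope up,
# so the outer variable is overwritten.
def capture_into_outer
  foo = 1
  [0].each do
    42 => foo
  end
  foo
end

# Here `bar` is not declared anywhere yet, so the capture declares it in
# the block's own scope and it is not visible afterwards.
def capture_new_variable
  [0].each do
    42 => bar
  end
  binding.local_variables.include?(:bar)
end

capture_into_outer   # => 42
capture_new_variable # => false
```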
|
|
96
|
+
|
|
97
|
+
## Named capture groups
|
|
98
|
+
|
|
99
|
+
You can target local variables through named capture groups in regular expressions if they are used on the left-hand side of a `=~` operator. For example:
|
|
100
|
+
|
|
101
|
+
```ruby
|
|
102
|
+
/(?<foo>\d+)/ =~ "42" # depth 0
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
This will write to a `foo` local variable. If the variable is already declared in a higher visible scope, it will be written to that scope instead. For example:
|
|
106
|
+
|
|
107
|
+
```ruby
|
|
108
|
+
foo = 1
|
|
109
|
+
tap { /(?<foo>\d+)/ =~ "42" } # depth 1
|
|
110
|
+
```
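
A plain-Ruby sketch of the same behavior at runtime follows; the method names are made up for illustration. It also shows that the regular expression literal has to be on the left-hand side for the captures to become local variables.

```ruby
# The named capture writes to `foo`, which is declared one scope up.
def named_capture_into_outer
  foo = 1
  [0].each do
    /(?<foo>\d+)/ =~ "42"
  end
  foo
end

# With the regular expression on the right-hand side, no local variable
# is created for the named group.
def named_capture_on_right
  "42" =~ /(?<bar>\d+)/
  binding.local_variables.include?(:bar)
end

named_capture_into_outer # => "42"
named_capture_on_right   # => false
```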
|
|
111
|
+
|
|
112
|
+
## "interpolated once" regular expressions
|
|
113
|
+
|
|
114
|
+
Regular expressions that interpolate local variables (unrelated to capture group local variables) and have the `o` flag will only interpolate the local variables once for the runtime of the program.
|
|
115
|
+
In CRuby, this is implemented by compiling the regular expression within a nested instruction sequence, which means CRuby thinks the depth is one more than prism does. For example:
|
|
116
|
+
|
|
117
|
+
```
|
|
118
|
+
$ ruby --dump=insns -e 'foo = 1; /#{foo}/o'
|
|
119
|
+
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,18)> (catch: false)
|
|
120
|
+
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
|
|
121
|
+
[ 1] foo@0
|
|
122
|
+
0000 putobject_INT2FIX_1_ ( 1)[Li]
|
|
123
|
+
0001 setlocal_WC_0 foo@0
|
|
124
|
+
0003 once block in <main>, <is:0>
|
|
125
|
+
0006 leave
|
|
126
|
+
|
|
127
|
+
== disasm: #<ISeq:block in <main>@-e:1 (1,9)-(1,18)> (catch: false)
|
|
128
|
+
0000 putobject "" ( 1)
|
|
129
|
+
0002 getlocal_WC_1 foo@0
|
|
130
|
+
0004 dup
|
|
131
|
+
0005 objtostring <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
|
|
132
|
+
0007 anytostring
|
|
133
|
+
0008 toregexp 0, 2
|
|
134
|
+
0011 leave
|
|
135
|
+
```
|
|
136
|
+
|
|
137
|
+
In this case CRuby fetches the local variable with `getlocal_WC_1` as the second instruction of the "once" instruction sequence. When compiling for CRuby, prism therefore will adjust the depth to account for this difference.
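
For readers unfamiliar with the `o` flag itself, here is a short plain-Ruby sketch of the "interpolate once" behavior described at the top of this section. The method name is hypothetical.

```ruby
# The regular expression literal is built the first time it is evaluated and
# reused afterwards, so later values of `n` are never interpolated.
def interpolate_once
  [1, 2, 3].map { |n| /#{n}/o.source }
end

interpolate_once # => ["1", "1", "1"]
```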
|
|
138
|
+
|
|
139
|
+
## `rescue` clauses
|
|
140
|
+
|
|
141
|
+
In CRuby, `rescue` clauses are implemented as their own instruction sequence, and therefore CRuby thinks the depth is one more than prism does. For example:
|
|
142
|
+
|
|
143
|
+
```
|
|
144
|
+
$ ruby --dump=insns -e 'begin; foo = 1; rescue; foo; end'
|
|
145
|
+
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,32)> (catch: true)
|
|
146
|
+
== catch table
|
|
147
|
+
| catch type: rescue st: 0000 ed: 0004 sp: 0000 cont: 0005
|
|
148
|
+
| == disasm: #<ISeq:rescue in <main>@-e:1 (1,16)-(1,28)> (catch: true)
|
|
149
|
+
| local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
|
|
150
|
+
| [ 1] $!@0
|
|
151
|
+
| 0000 getlocal_WC_0 $!@0 ( 1)
|
|
152
|
+
| 0002 putobject StandardError
|
|
153
|
+
| 0004 checkmatch 3
|
|
154
|
+
| 0006 branchunless 11
|
|
155
|
+
| 0008 getlocal_WC_1 foo@0[Li]
|
|
156
|
+
| 0010 leave
|
|
157
|
+
| 0011 getlocal_WC_0 $!@0
|
|
158
|
+
| 0013 throw 0
|
|
159
|
+
| catch type: retry st: 0004 ed: 0005 sp: 0000 cont: 0000
|
|
160
|
+
|------------------------------------------------------------------------
|
|
161
|
+
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
|
|
162
|
+
[ 1] foo@0
|
|
163
|
+
0000 putobject_INT2FIX_1_ ( 1)[Li]
|
|
164
|
+
0001 dup
|
|
165
|
+
0002 setlocal_WC_0 foo@0
|
|
166
|
+
0004 nop
|
|
167
|
+
0005 leave
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
In the catch table, CRuby is reading the `foo` local variable using `getlocal_WC_1` as the fifth instruction of the "rescue" instruction sequence. When compiling for CRuby, prism therefore will adjust the depth to account for this difference.
|
|
171
|
+
|
|
172
|
+
Note that this includes the error reference, which can target local variables, as in:
|
|
173
|
+
|
|
174
|
+
```
|
|
175
|
+
$ ruby --dump=insns -e 'foo = 1; begin; rescue => foo; end'
|
|
176
|
+
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,34)> (catch: true)
|
|
177
|
+
== catch table
|
|
178
|
+
| catch type: rescue st: 0003 ed: 0004 sp: 0000 cont: 0005
|
|
179
|
+
| == disasm: #<ISeq:rescue in <main>@-e:1 (1,16)-(1,30)> (catch: true)
|
|
180
|
+
| local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
|
|
181
|
+
| [ 1] $!@0
|
|
182
|
+
| 0000 getlocal_WC_0 $!@0 ( 1)
|
|
183
|
+
| 0002 putobject StandardError
|
|
184
|
+
| 0004 checkmatch 3
|
|
185
|
+
| 0006 branchunless 14
|
|
186
|
+
| 0008 getlocal_WC_0 $!@0
|
|
187
|
+
| 0010 setlocal_WC_1 foo@0
|
|
188
|
+
| 0012 putnil
|
|
189
|
+
| 0013 leave
|
|
190
|
+
| 0014 getlocal_WC_0 $!@0
|
|
191
|
+
| 0016 throw 0
|
|
192
|
+
| catch type: retry st: 0004 ed: 0005 sp: 0000 cont: 0003
|
|
193
|
+
|------------------------------------------------------------------------
|
|
194
|
+
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
|
|
195
|
+
[ 1] foo@0
|
|
196
|
+
0000 putobject_INT2FIX_1_ ( 1)[Li]
|
|
197
|
+
0001 setlocal_WC_0 foo@0
|
|
198
|
+
0003 putnil
|
|
199
|
+
0004 nop
|
|
200
|
+
0005 leave
|
|
201
|
+
```
|
|
202
|
+
|
|
203
|
+
Note that CRuby is writing to the `foo` local variable using the `setlocal_WC_1` instruction as the sixth instruction of the "rescue" instruction sequence. When compiling for CRuby, prism therefore will adjust the depth to account for this difference.
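
To see prism's side of this, the depths for both `rescue` snippets can be read from the syntax tree. This is a minimal sketch, assuming the `prism` gem is available; `RescueDepths` and `rescue_depths` are names invented for the example.

```ruby
require "prism"

# Hypothetical visitor that records local variable reads and targets.
class RescueDepths < Prism::Visitor
  attr_reader :entries

  def initialize
    super()
    @entries = []
  end

  def visit_local_variable_read_node(node)
    @entries << [:read, node.name, node.depth]
    super
  end

  def visit_local_variable_target_node(node)
    @entries << [:target, node.name, node.depth]
    super
  end
end

# Hypothetical helper: parse a snippet and return what the visitor saw.
def rescue_depths(source)
  visitor = RescueDepths.new
  Prism.parse(source).value.accept(visitor)
  visitor.entries
end

# prism reports depth 0 where CRuby uses level 1 in the disassembly above.
rescue_depths("begin; foo = 1; rescue; foo; end")   # => [[:read, :foo, 0]]
rescue_depths("foo = 1; begin; rescue => foo; end") # => [[:target, :foo, 0]]
```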
|
|
204
|
+
|
|
205
|
+
## Post execution blocks
|
|
206
|
+
|
|
207
|
+
The `END {}` syntax allows executing code when the program exits. In CRuby, this is implemented as two nested instruction sequences. CRuby therefore thinks the depth is two more than prism does. For example:
|
|
208
|
+
|
|
209
|
+
```
|
|
210
|
+
$ ruby --dump=insns -e 'foo = 1; END { foo }'
|
|
211
|
+
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,20)> (catch: false)
|
|
212
|
+
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
|
|
213
|
+
[ 1] foo@0
|
|
214
|
+
0000 putobject_INT2FIX_1_ ( 1)[Li]
|
|
215
|
+
0001 setlocal_WC_0 foo@0
|
|
216
|
+
0003 once block in <main>, <is:0>
|
|
217
|
+
0006 leave
|
|
218
|
+
|
|
219
|
+
== disasm: #<ISeq:block in <main>@-e:0 (0,0)-(-1,-1)> (catch: false)
|
|
220
|
+
0000 putspecialobject 1 ( 1)
|
|
221
|
+
0002 send <calldata!mid:core#set_postexe, argc:0, FCALL>, block in <main>
|
|
222
|
+
0005 leave
|
|
223
|
+
|
|
224
|
+
== disasm: #<ISeq:block in <main>@-e:1 (1,9)-(1,20)> (catch: false)
|
|
225
|
+
0000 getlocal foo@0, 2 ( 1)[LiBc]
|
|
226
|
+
0003 leave [Br]
|
|
227
|
+
```
|
|
228
|
+
|
|
229
|
+
In the instruction sequence corresponding to the code that gets executed inside the `END` block, CRuby is reading the `foo` local variable using `getlocal` with a level of 2 as the first instruction of the innermost `"block in <main>"` instruction sequence. When compiling for CRuby, prism therefore will adjust the depth to account for this difference.
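
The corresponding depth on the prism side can be read directly from the tree. A minimal sketch, assuming the `prism` gem is available; the helper name `post_execution_read_depth` is hypothetical.

```ruby
require "prism"

# Hypothetical helper: returns the depth prism records for the first
# statement inside the last top-level `END {}` block of the snippet.
def post_execution_read_depth(source)
  post_execution = Prism.parse(source).value.statements.body.last
  post_execution.statements.body.first.depth
end

# prism does not create a scope for END, so the read stays at depth 0,
# two less than the level CRuby uses in the disassembly above.
post_execution_read_depth("foo = 1; END { foo }") # => 0
```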
|
data/docs/mapping.md
ADDED
|
@@ -0,0 +1,117 @@
|
|
|
1
|
+
# Mapping
|
|
2
|
+
|
|
3
|
+
This document describes how concepts in the previous CRuby parser map to their counterparts in prism.
|
|
4
|
+
|
|
5
|
+
## Nodes
|
|
6
|
+
|
|
7
|
+
The following table shows how the various CRuby nodes are mapped to prism nodes.
|
|
8
|
+
|
|
9
|
+
| CRuby | prism |
|
|
10
|
+
| --- | --- |
|
|
11
|
+
| `NODE_SCOPE` | |
|
|
12
|
+
| `NODE_BLOCK` | |
|
|
13
|
+
| `NODE_IF` | `PM_IF_NODE` |
|
|
14
|
+
| `NODE_UNLESS` | `PM_UNLESS_NODE` |
|
|
15
|
+
| `NODE_CASE` | `PM_CASE_NODE` |
|
|
16
|
+
| `NODE_CASE2` | `PM_CASE_NODE` (with a null predicate) |
|
|
17
|
+
| `NODE_CASE3` | |
|
|
18
|
+
| `NODE_WHEN` | `PM_WHEN_NODE` |
|
|
19
|
+
| `NODE_IN` | `PM_IN_NODE` |
|
|
20
|
+
| `NODE_WHILE` | `PM_WHILE_NODE` |
|
|
21
|
+
| `NODE_UNTIL` | `PM_UNTIL_NODE` |
|
|
22
|
+
| `NODE_ITER` | `PM_CALL_NODE` (with a non-null block) |
|
|
23
|
+
| `NODE_FOR` | `PM_FOR_NODE` |
|
|
24
|
+
| `NODE_FOR_MASGN` | `PM_FOR_NODE` (with a multi-write node as the index) |
|
|
25
|
+
| `NODE_BREAK` | `PM_BREAK_NODE` |
|
|
26
|
+
| `NODE_NEXT` | `PM_NEXT_NODE` |
|
|
27
|
+
| `NODE_REDO` | `PM_REDO_NODE` |
|
|
28
|
+
| `NODE_RETRY` | `PM_RETRY_NODE` |
|
|
29
|
+
| `NODE_BEGIN` | `PM_BEGIN_NODE` |
|
|
30
|
+
| `NODE_RESCUE` | `PM_RESCUE_NODE` |
|
|
31
|
+
| `NODE_RESBODY` | |
|
|
32
|
+
| `NODE_ENSURE` | `PM_ENSURE_NODE` |
|
|
33
|
+
| `NODE_AND` | `PM_AND_NODE` |
|
|
34
|
+
| `NODE_OR` | `PM_OR_NODE` |
|
|
35
|
+
| `NODE_MASGN` | `PM_MULTI_WRITE_NODE` |
|
|
36
|
+
| `NODE_LASGN` | `PM_LOCAL_VARIABLE_WRITE_NODE` |
|
|
37
|
+
| `NODE_DASGN` | `PM_LOCAL_VARIABLE_WRITE_NODE` |
|
|
38
|
+
| `NODE_GASGN` | `PM_GLOBAL_VARIABLE_WRITE_NODE` |
|
|
39
|
+
| `NODE_IASGN` | `PM_INSTANCE_VARIABLE_WRITE_NODE` |
|
|
40
|
+
| `NODE_CDECL` | `PM_CONSTANT_PATH_WRITE_NODE` |
|
|
41
|
+
| `NODE_CVASGN` | `PM_CLASS_VARIABLE_WRITE_NODE` |
|
|
42
|
+
| `NODE_OP_ASGN1` | |
|
|
43
|
+
| `NODE_OP_ASGN2` | |
|
|
44
|
+
| `NODE_OP_ASGN_AND` | `PM_OPERATOR_AND_ASSIGNMENT_NODE` |
|
|
45
|
+
| `NODE_OP_ASGN_OR` | `PM_OPERATOR_OR_ASSIGNMENT_NODE` |
|
|
46
|
+
| `NODE_OP_CDECL` | |
|
|
47
|
+
| `NODE_CALL` | `PM_CALL_NODE` |
|
|
48
|
+
| `NODE_OPCALL` | `PM_CALL_NODE` (with an operator as the method) |
|
|
49
|
+
| `NODE_FCALL` | `PM_CALL_NODE` (with a null receiver and parentheses) |
|
|
50
|
+
| `NODE_VCALL` | `PM_CALL_NODE` (with a null receiver and no parentheses or arguments) |
|
|
51
|
+
| `NODE_QCALL` | `PM_CALL_NODE` (with a &. operator) |
|
|
52
|
+
| `NODE_SUPER` | `PM_SUPER_NODE` |
|
|
53
|
+
| `NODE_ZSUPER` | `PM_FORWARDING_SUPER_NODE` |
|
|
54
|
+
| `NODE_LIST` | `PM_ARRAY_NODE` |
|
|
55
|
+
| `NODE_ZLIST` | `PM_ARRAY_NODE` (with no child elements) |
|
|
56
|
+
| `NODE_VALUES` | `PM_ARGUMENTS_NODE` |
|
|
57
|
+
| `NODE_HASH` | `PM_HASH_NODE` |
|
|
58
|
+
| `NODE_RETURN` | `PM_RETURN_NODE` |
|
|
59
|
+
| `NODE_YIELD` | `PM_YIELD_NODE` |
|
|
60
|
+
| `NODE_LVAR` | `PM_LOCAL_VARIABLE_READ_NODE` |
|
|
61
|
+
| `NODE_DVAR` | `PM_LOCAL_VARIABLE_READ_NODE` |
|
|
62
|
+
| `NODE_GVAR` | `PM_GLOBAL_VARIABLE_READ_NODE` |
|
|
63
|
+
| `NODE_IVAR` | `PM_INSTANCE_VARIABLE_READ_NODE` |
|
|
64
|
+
| `NODE_CONST` | `PM_CONSTANT_PATH_READ_NODE` |
|
|
65
|
+
| `NODE_CVAR` | `PM_CLASS_VARIABLE_READ_NODE` |
|
|
66
|
+
| `NODE_NTH_REF` | `PM_NUMBERED_REFERENCE_READ_NODE` |
|
|
67
|
+
| `NODE_BACK_REF` | `PM_BACK_REFERENCE_READ_NODE` |
|
|
68
|
+
| `NODE_MATCH` | |
|
|
69
|
+
| `NODE_MATCH2` | `PM_CALL_NODE` (with regular expression as receiver) |
|
|
70
|
+
| `NODE_MATCH3` | `PM_CALL_NODE` (with regular expression as only argument) |
|
|
71
|
+
| `NODE_LIT` | |
|
|
72
|
+
| `NODE_STR` | `PM_STRING_NODE` |
|
|
73
|
+
| `NODE_DSTR` | `PM_INTERPOLATED_STRING_NODE` |
|
|
74
|
+
| `NODE_XSTR` | `PM_X_STRING_NODE` |
|
|
75
|
+
| `NODE_DXSTR` | `PM_INTERPOLATED_X_STRING_NODE` |
|
|
76
|
+
| `NODE_EVSTR` | `PM_STRING_INTERPOLATED_NODE` |
|
|
77
|
+
| `NODE_DREGX` | `PM_INTERPOLATED_REGULAR_EXPRESSION_NODE` |
|
|
78
|
+
| `NODE_ONCE` | |
|
|
79
|
+
| `NODE_ARGS` | `PM_PARAMETERS_NODE` |
|
|
80
|
+
| `NODE_ARGS_AUX` | |
|
|
81
|
+
| `NODE_OPT_ARG` | `PM_OPTIONAL_PARAMETER_NODE` |
|
|
82
|
+
| `NODE_KW_ARG` | `PM_KEYWORD_PARAMETER_NODE` |
|
|
83
|
+
| `NODE_POSTARG` | `PM_REQUIRED_PARAMETER_NODE` |
|
|
84
|
+
| `NODE_ARGSCAT` | |
|
|
85
|
+
| `NODE_ARGSPUSH` | |
|
|
86
|
+
| `NODE_SPLAT` | `PM_SPLAT_NODE` |
|
|
87
|
+
| `NODE_BLOCK_PASS` | `PM_BLOCK_ARGUMENT_NODE` |
|
|
88
|
+
| `NODE_DEFN` | `PM_DEF_NODE` (with a null receiver) |
|
|
89
|
+
| `NODE_DEFS` | `PM_DEF_NODE` (with a non-null receiver) |
|
|
90
|
+
| `NODE_ALIAS` | `PM_ALIAS_NODE` |
|
|
91
|
+
| `NODE_VALIAS` | `PM_ALIAS_NODE` (with a global variable first argument) |
|
|
92
|
+
| `NODE_UNDEF` | `PM_UNDEF_NODE` |
|
|
93
|
+
| `NODE_CLASS` | `PM_CLASS_NODE` |
|
|
94
|
+
| `NODE_MODULE` | `PM_MODULE_NODE` |
|
|
95
|
+
| `NODE_SCLASS` | `PM_S_CLASS_NODE` |
|
|
96
|
+
| `NODE_COLON2` | `PM_CONSTANT_PATH_NODE` |
|
|
97
|
+
| `NODE_COLON3` | `PM_CONSTANT_PATH_NODE` (with a null receiver) |
|
|
98
|
+
| `NODE_DOT2` | `PM_RANGE_NODE` (with a .. operator) |
|
|
99
|
+
| `NODE_DOT3` | `PM_RANGE_NODE` (with a ... operator) |
|
|
100
|
+
| `NODE_FLIP2` | `PM_RANGE_NODE` (with a .. operator) |
|
|
101
|
+
| `NODE_FLIP3` | `PM_RANGE_NODE` (with a ... operator) |
|
|
102
|
+
| `NODE_SELF` | `PM_SELF_NODE` |
|
|
103
|
+
| `NODE_NIL` | `PM_NIL_NODE` |
|
|
104
|
+
| `NODE_TRUE` | `PM_TRUE_NODE` |
|
|
105
|
+
| `NODE_FALSE` | `PM_FALSE_NODE` |
|
|
106
|
+
| `NODE_ERRINFO` | |
|
|
107
|
+
| `NODE_DEFINED` | `PM_DEFINED_NODE` |
|
|
108
|
+
| `NODE_POSTEXE` | `PM_POST_EXECUTION_NODE` |
|
|
109
|
+
| `NODE_DSYM` | `PM_INTERPOLATED_SYMBOL_NODE` |
|
|
110
|
+
| `NODE_ATTRASGN` | `PM_CALL_NODE` (with a message that ends with =) |
|
|
111
|
+
| `NODE_LAMBDA` | `PM_LAMBDA_NODE` |
|
|
112
|
+
| `NODE_ARYPTN` | `PM_ARRAY_PATTERN_NODE` |
|
|
113
|
+
| `NODE_HSHPTN` | `PM_HASH_PATTERN_NODE` |
|
|
114
|
+
| `NODE_FNDPTN` | `PM_FIND_PATTERN_NODE` |
|
|
115
|
+
| `NODE_ERROR` | `PM_MISSING_NODE` |
|
|
116
|
+
| `NODE_LAST` | |
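
Several CRuby call nodes in the table collapse into a single prism call node that is distinguished by its receiver and flags. The sketch below, which assumes the `prism` gem is available, inspects a few call shapes through the Ruby API; the helper name `call_shape` is invented for this example.

```ruby
require "prism"

# Hypothetical helper: summarize the first statement of a snippet,
# which is expected to be a call.
def call_shape(source)
  node = Prism.parse(source).value.statements.body.first
  {
    type: node.type,
    name: node.name,
    receiver: !node.receiver.nil?,
    variable_call: node.variable_call?,
    safe_navigation: node.safe_navigation?
  }
end

call_shape("foo")   # bare identifier: no receiver, variable_call is true
call_shape("foo()") # explicit parentheses: no receiver, variable_call is false
call_shape("a&.b")  # receiver present, safe_navigation is true
call_shape("1 + 2") # operator call: name is :+, receiver present
```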
|
|
117
|
+
|
|
@@ -0,0 +1,24 @@
|
|
|
1
|
+
# parser translation
|
|
2
|
+
|
|
3
|
+
Prism ships with the ability to translate its syntax tree into the syntax tree used by the [whitequark/parser](https://github.com/whitequark/parser) gem. This allows you to use tools built on top of the `parser` gem with the `prism` parser.
|
|
4
|
+
|
|
5
|
+
## Usage
|
|
6
|
+
|
|
7
|
+
The `parser` gem provides multiple parsers to support different versions of the Ruby grammar. This includes all of the Ruby versions going back to 1.8, as well as third-party parsers like MacRuby and RubyMotion. The `prism` gem provides another parser that uses the `prism` parser to build the syntax tree.
|
|
8
|
+
|
|
9
|
+
You can use the `prism` parser like you would any other. After requiring `prism`, you should be able to call any of the regular `Parser::Base` APIs that you would normally use.
|
|
10
|
+
|
|
11
|
+
```ruby
|
|
12
|
+
require "prism"
|
|
13
|
+
|
|
14
|
+
# Same as `Parser::Ruby34`
|
|
15
|
+
Prism::Translation::Parser34.parse_file("path/to/file.rb")
|
|
16
|
+
|
|
17
|
+
# Same as `Parser::CurrentRuby`
|
|
18
|
+
Prism::Translation::ParserCurrent.parse("puts 'Hello World!'")
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
All the parsers are autoloaded, so you don't have to worry about requiring them yourself.
|
|
22
|
+
|
|
23
|
+
If you also need to parse Ruby versions below 3.3 (for which the `prism` translation layer does not have explicit support), check out
|
|
24
|
+
[this guide](https://github.com/whitequark/parser/blob/master/doc/PRISM_TRANSLATION.md) from the `parser` gem on how to use both in conjunction.
|
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
# Rules
|
|
2
|
+
|
|
3
|
+
This document contains information related to the rules of the parser for Ruby source code.
|
|
4
|
+
|
|
5
|
+
As an example, in the documentation of many of the fields of nodes, it's mentioned that a field follows the lexing rules for `identifier` or `constant`. This document describes what those rules are.
|
|
6
|
+
|
|
7
|
+
## Constants
|
|
8
|
+
|
|
9
|
+
Constants in Ruby begin with an upper-case letter. This is followed by any number of underscores, alphanumeric, or non-ASCII characters. The definitions of "alphanumeric" and "upper-case letter" are encoding-dependent.
|
|
10
|
+
|
|
11
|
+
## Non-void expression
|
|
12
|
+
|
|
13
|
+
Most expressions in CRuby are non-void. This means the expression they represent resolves to a value. For example, `1 + 2` is a non-void expression, because it resolves to a method call. Even something like `class Foo; end` is a non-void expression, because it returns the last evaluated expression in the body of the class (or `nil`).
|
|
14
|
+
|
|
15
|
+
Certain nodes, however, are void expressions, and cannot be combined to form larger expressions.
|
|
16
|
+
* `BEGIN {}`, `END {}`, `alias foo bar`, and `undef foo` can only be at a statement position.
|
|
17
|
+
* The "jumps": `return`, `break`, `next`, `redo`, `retry` are void expressions.
|
|
18
|
+
* `value => pattern` is also considered a void expression.
|
|
19
|
+
|
|
20
|
+
## Identifiers
|
|
21
|
+
|
|
22
|
+
Identifiers in Ruby begin with an underscore or lower-case letter. This is followed by any number of underscores, alphanumeric, or non-ASCII characters. The definitions of "alphanumeric" and "lower-case letter" are encoding-dependent.
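
As a rough approximation only, the rules for constants and identifiers can be written as regular expressions for UTF-8 encoded strings. These patterns are an illustration of the prose, not the lexer's actual implementation, and the constant names `CONSTANT_NAME` and `IDENTIFIER_NAME` are made up for this sketch.

```ruby
# An upper-case letter, then underscores, alphanumerics, or non-ASCII characters.
CONSTANT_NAME = /\A[[:upper:]](?:[[:alnum:]_]|[^\x00-\x7F])*\z/

# An underscore or lower-case letter, then the same set of trailing characters.
IDENTIFIER_NAME = /\A[[:lower:]_](?:[[:alnum:]_]|[^\x00-\x7F])*\z/

CONSTANT_NAME.match?("Foo_1")        # => true
CONSTANT_NAME.match?("foo")          # => false
IDENTIFIER_NAME.match?("_bar2")      # => true
IDENTIFIER_NAME.match?("caf\u00E9")  # => true
IDENTIFIER_NAME.match?("Foo")        # => false
```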
|
data/docs/releasing.md
ADDED
|
@@ -0,0 +1,98 @@
|
|
|
1
|
+
# Releasing
|
|
2
|
+
|
|
3
|
+
To release a new version of Prism, perform the following steps:
|
|
4
|
+
|
|
5
|
+
## Preparation
|
|
6
|
+
|
|
7
|
+
* Update the `CHANGELOG.md` file.
|
|
8
|
+
* Add a new section for the new version at the top of the file.
|
|
9
|
+
* Fill in the relevant changes — it may be easiest to click the link for the `Unreleased` heading to find the commits.
|
|
10
|
+
* Update the links at the bottom of the file.
|
|
11
|
+
* Update the version numbers in the various files that reference them:
|
|
12
|
+
|
|
13
|
+
```sh
|
|
14
|
+
export PRISM_MAJOR="x"
|
|
15
|
+
export PRISM_MINOR="y"
|
|
16
|
+
export PRISM_PATCH="z"
|
|
17
|
+
export PRISM_VERSION="$PRISM_MAJOR.$PRISM_MINOR.$PRISM_PATCH"
|
|
18
|
+
ruby -pi -e 'gsub(/spec\.version = ".+?"/, %Q{spec.version = "#{ENV["PRISM_VERSION"]}"})' prism.gemspec
|
|
19
|
+
ruby -pi -e 'gsub(/EXPECTED_PRISM_VERSION ".+?"/, %Q{EXPECTED_PRISM_VERSION "#{ENV["PRISM_VERSION"]}"})' ext/prism/extension.h
|
|
20
|
+
ruby -pi -e 'gsub(/PRISM_VERSION_MAJOR \d+/, %Q{PRISM_VERSION_MAJOR #{ENV["PRISM_MAJOR"]}})' include/prism/version.h
|
|
21
|
+
ruby -pi -e 'gsub(/PRISM_VERSION_MINOR \d+/, %Q{PRISM_VERSION_MINOR #{ENV["PRISM_MINOR"]}})' include/prism/version.h
|
|
22
|
+
ruby -pi -e 'gsub(/PRISM_VERSION_PATCH \d+/, %Q{PRISM_VERSION_PATCH #{ENV["PRISM_PATCH"]}})' include/prism/version.h
|
|
23
|
+
ruby -pi -e 'gsub(/PRISM_VERSION ".+?"/, %Q{PRISM_VERSION "#{ENV["PRISM_VERSION"]}"})' include/prism/version.h
|
|
24
|
+
ruby -pi -e 'gsub(/"version": ".+?"/, %Q{"version": "#{ENV["PRISM_VERSION"]}"})' javascript/package.json
|
|
25
|
+
ruby -pi -e 'gsub(/lossy\(\), ".+?"/, %Q{lossy(), "#{ENV["PRISM_VERSION"]}"})' rust/ruby-prism-sys/tests/utils_tests.rs
|
|
26
|
+
ruby -pi -e 'gsub(/\d+, "prism major/, %Q{#{ENV["PRISM_MAJOR"]}, "prism major})' templates/java/org/prism/Loader.java.erb
|
|
27
|
+
ruby -pi -e 'gsub(/\d+, "prism minor/, %Q{#{ENV["PRISM_MINOR"]}, "prism minor})' templates/java/org/prism/Loader.java.erb
|
|
28
|
+
ruby -pi -e 'gsub(/\d+, "prism patch/, %Q{#{ENV["PRISM_PATCH"]}, "prism patch})' templates/java/org/prism/Loader.java.erb
|
|
29
|
+
ruby -pi -e 'gsub(/MAJOR_VERSION = \d+/, %Q{MAJOR_VERSION = #{ENV["PRISM_MAJOR"]}})' templates/javascript/src/deserialize.js.erb
|
|
30
|
+
ruby -pi -e 'gsub(/MINOR_VERSION = \d+/, %Q{MINOR_VERSION = #{ENV["PRISM_MINOR"]}})' templates/javascript/src/deserialize.js.erb
|
|
31
|
+
ruby -pi -e 'gsub(/PATCH_VERSION = \d+/, %Q{PATCH_VERSION = #{ENV["PRISM_PATCH"]}})' templates/javascript/src/deserialize.js.erb
|
|
32
|
+
ruby -pi -e 'gsub(/MAJOR_VERSION = \d+/, %Q{MAJOR_VERSION = #{ENV["PRISM_MAJOR"]}})' templates/lib/prism/serialize.rb.erb
|
|
33
|
+
ruby -pi -e 'gsub(/MINOR_VERSION = \d+/, %Q{MINOR_VERSION = #{ENV["PRISM_MINOR"]}})' templates/lib/prism/serialize.rb.erb
|
|
34
|
+
ruby -pi -e 'gsub(/PATCH_VERSION = \d+/, %Q{PATCH_VERSION = #{ENV["PRISM_PATCH"]}})' templates/lib/prism/serialize.rb.erb
|
|
35
|
+
ruby -pi -e 'gsub(/^version = ".+?"/, %Q{version = "#{ENV["PRISM_VERSION"]}"})' rust/ruby-prism-sys/Cargo.toml
|
|
36
|
+
ruby -pi -e 'gsub(/^version = ".+?"/, %Q{version = "#{ENV["PRISM_VERSION"]}"})' rust/ruby-prism/Cargo.toml
|
|
37
|
+
ruby -pi -e 'gsub(/^ruby-prism-sys = \{ version = ".+?"/, %Q{ruby-prism-sys = \{ version = "#{ENV["PRISM_VERSION"]}"})' rust/ruby-prism/Cargo.toml
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
* Update the `Gemfile.lock` file:
|
|
41
|
+
|
|
42
|
+
```sh
|
|
43
|
+
chruby ruby-3.5.0-dev
|
|
44
|
+
bundle install
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
* Update the version-specific lockfiles:
|
|
48
|
+
|
|
49
|
+
```sh
|
|
50
|
+
for VERSION in "2.7" "3.0" "3.1" "3.2" "3.3" "3.4"; do docker run -it --rm -v "$PWD":/usr/src/app -w /usr/src/app -e BUNDLE_GEMFILE="gemfiles/$VERSION/Gemfile" "ruby:$VERSION" bundle update; done
|
|
51
|
+
docker run -it --rm -v "$PWD":/usr/src/app -w /usr/src/app -e BUNDLE_GEMFILE="gemfiles/3.5/Gemfile" ruby:3.5.0-preview1 bundle update
|
|
52
|
+
BUNDLE_GEMFILE=gemfiles/truffleruby/Gemfile chruby-exec truffleruby -- bundle update
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
* Update the cargo lockfiles:
|
|
56
|
+
|
|
57
|
+
```sh
|
|
58
|
+
bundle exec rake cargo:build
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
* Commit all of the updated files:
|
|
62
|
+
|
|
63
|
+
```sh
|
|
64
|
+
git commit -am "Bump to v$PRISM_VERSION"
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
* Push up the changes:
|
|
68
|
+
|
|
69
|
+
```sh
|
|
70
|
+
git push
|
|
71
|
+
```
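
The version-bump one-liners in the list above all follow the same pattern: `ruby -pi -e` runs a `gsub` against every line of the file and writes the result back in place. The sketch below applies the first substitution to sample strings so its effect is easier to see. The helper name and the sample lines are hypothetical, and the version is passed as an argument here rather than read from `ENV["PRISM_VERSION"]`.

```ruby
# Hypothetical helper mirroring the gsub used for prism.gemspec.
def bump_gemspec_version(line, version)
  line.gsub(/spec\.version = ".+?"/, %Q{spec.version = "#{version}"})
end

# A line that matches is rewritten with the new version.
bump_gemspec_version('  spec.version = "1.5.1"', "1.5.2")
# => '  spec.version = "1.5.2"'

# Lines that do not match pass through unchanged.
bump_gemspec_version('  spec.name = "prism"', "1.5.2")
# => '  spec.name = "prism"'
```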
|
|
72
|
+
|
|
73
|
+
## Publishing
|
|
74
|
+
|
|
75
|
+
* Update the GitHub release page with a copy of the latest entry in the `CHANGELOG.md` file.
|
|
76
|
+
* Publish the gem to [rubygems.org](https://rubygems.org). Note that you must have access to the `prism` gem to do this.
|
|
77
|
+
|
|
78
|
+
```sh
|
|
79
|
+
bundle exec rake release
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
* Generate the `wasm` artifact (or download it from GitHub Actions and put it in `javascript/src/prism.wasm`).
|
|
83
|
+
|
|
84
|
+
```sh
|
|
85
|
+
make wasm
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
* Publish the JavaScript package to [npmjs.com](https://www.npmjs.com). Note that you must have access to the `@ruby/prism` package to do this.
|
|
89
|
+
|
|
90
|
+
```sh
|
|
91
|
+
npm publish
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
* Publish the Rust crates to [crates.io](https://crates.io). Note that you must have access to the `ruby-prism-sys` and `ruby-prism` crates to do this.
|
|
95
|
+
|
|
96
|
+
```sh
|
|
97
|
+
bundle exec rake cargo:publish:real
|
|
98
|
+
```
|
data/docs/relocation.md
ADDED
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
# Relocation
|
|
2
|
+
|
|
3
|
+
Prism parses deterministically for the same input. This provides a nice property that is exposed through the `#node_id` API on nodes: for the same input, these values remain consistent every time the source is parsed. This means we can reparse the same source with a `#node_id` value and find the exact same node again.
|
|
4
|
+
|
|
5
|
+
The `Relocation` module provides an API around this property. It allows you to "save" nodes and locations using a minimal amount of memory (just the node_id and a field identifier) and then reify them later. This minimizes the amount of memory you need to allocate to store this information because it does not keep around a pointer to the source string.
|
|
6
|
+
|
|
7
|
+
## Getting started
|
|
8
|
+
|
|
9
|
+
To get started with the `Relocation` module, you first instantiate a `Repository` object. You do this through a DSL that chains method calls for configuration. For example, if for every entry in the repository you want to store the start and end lines, the start and end code unit columns in UTF-16, and the leading comments, you would write:
|
|
10
|
+
|
|
11
|
+
```ruby
|
|
12
|
+
repository = Prism::Relocation.filepath("path/to/file").lines.code_unit_columns(Encoding::UTF_16).leading_comments
|
|
13
|
+
```
|
|
14
|
+
|
|
15
|
+
Now that you have the repository, you can pass it into any of the `save*` APIs on nodes or locations to create entries in the repository that will be lazily reified.
|
|
16
|
+
|
|
17
|
+
```ruby
|
|
18
|
+
# assume that node is a Prism::ClassNode object
|
|
19
|
+
entry = node.constant_path.save(repository)
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
Now that you have the entry object, you do not need to keep a reference to the repository; it will be cleaned up on its own when the last entry is reified. Whenever you need to, you may call the associated field methods on the entry object, as in:
|
|
23
|
+
|
|
24
|
+
```ruby
|
|
25
|
+
entry.start_line
|
|
26
|
+
entry.end_line
|
|
27
|
+
|
|
28
|
+
entry.start_code_units_column
|
|
29
|
+
entry.end_code_units_column
|
|
30
|
+
|
|
31
|
+
entry.leading_comments
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
Note that if you had configured other fields to be saved, you would be able to access them as well. The first time one of these fields is accessed, the repository will reify every entry it knows about and then clean itself up. In this way, you can effectively treat them as if you had kept around lightweight versions of `Prism::Node` or `Prism::Location` objects.
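Putting the steps together, a minimal end-to-end sketch might look like the following. It assumes a version of the `prism` gem that ships the `Relocation` module. To stay self-contained it uses `Prism::Relocation.string` with an inline source and the `columns` field, rather than the `filepath` and `code_unit_columns` configuration shown above.

```ruby
require "prism"

# Inline source used only for this illustration.
source = "class Foo\n  def bar; end\nend\n"

# Configure a repository that stores line numbers and byte columns.
repository = Prism::Relocation.string(source).lines.columns

# Parse once and save a lightweight entry for the constant path (Foo).
class_node = Prism.parse(source).value.statements.body.first
entry = class_node.constant_path.save(repository)

# The first field access makes the repository reparse the source and reify
# every entry that was saved against it.
puts entry.start_line
puts entry.start_column
puts entry.end_column
```

After `entry` is created, the parsed tree is no longer needed; only the entry has to be retained until its fields are read.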
|
|
@@ -0,0 +1,72 @@
|
|
|
1
|
+
# Ripper translation
|
|
2
|
+
|
|
3
|
+
Prism provides the ability to mirror the `Ripper` standard library. You can do this by requiring the shim:
|
|
4
|
+
|
|
5
|
+
```ruby
|
|
6
|
+
require "prism/translation/ripper/shim"
|
|
7
|
+
```
|
|
8
|
+
|
|
9
|
+
This provides APIs like:
|
|
10
|
+
|
|
11
|
+
```ruby
|
|
12
|
+
Ripper.lex
|
|
13
|
+
Ripper.parse
|
|
14
|
+
Ripper.sexp_raw
|
|
15
|
+
Ripper.sexp
|
|
16
|
+
|
|
17
|
+
Ripper::SexpBuilder
|
|
18
|
+
Ripper::SexpBuilderPP
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
Briefly, `Ripper` is a streaming parser that allows you to construct your own syntax tree. As an example:
|
|
22
|
+
|
|
23
|
+
```ruby
|
|
24
|
+
class ArithmeticRipper < Prism::Translation::Ripper
|
|
25
|
+
def on_binary(left, operator, right)
|
|
26
|
+
left.public_send(operator, right)
|
|
27
|
+
end
|
|
28
|
+
|
|
29
|
+
def on_int(value)
|
|
30
|
+
value.to_i
|
|
31
|
+
end
|
|
32
|
+
|
|
33
|
+
def on_program(stmts)
|
|
34
|
+
stmts
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
def on_stmts_new
|
|
38
|
+
[]
|
|
39
|
+
end
|
|
40
|
+
|
|
41
|
+
def on_stmts_add(stmts, stmt)
|
|
42
|
+
stmts << stmt
|
|
43
|
+
stmts
|
|
44
|
+
end
|
|
45
|
+
end
|
|
46
|
+
|
|
47
|
+
ArithmeticRipper.new("1 + 2 - 3").parse # => [0]
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
The exact names of the `on_*` methods are listed in the `Ripper` source.
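As an alternative to reading the source, the event names can also be listed at runtime. The sketch below uses the standard library `Ripper` constants `PARSER_EVENTS` and `SCANNER_EVENTS`, not the Prism translation layer. It does not tell you whether the translation layer dispatches every one of these events.

```ruby
require "ripper"

# Parser events correspond to grammar rules; scanner events correspond to tokens.
parser_handlers = Ripper::PARSER_EVENTS.map { |event| :"on_#{event}" }
scanner_handlers = Ripper::SCANNER_EVENTS.map { |event| :"on_#{event}" }

# The handlers used in the arithmetic example appear in these lists.
puts parser_handlers.include?(:on_binary)
puts parser_handlers.include?(:on_program)
puts scanner_handlers.include?(:on_int)
```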
|
|
51
|
+
|
|
52
|
+
## Background
|
|
53
|
+
|
|
54
|
+
It is helpful to understand the differences between the `Ripper` library and the `Prism` library. Both libraries perform parsing and provide you with APIs to manipulate and understand the resulting syntax tree. However, there are a few key differences.
|
|
55
|
+
|
|
56
|
+
### Design
|
|
57
|
+
|
|
58
|
+
`Ripper` is a streaming parser. This means as it is parsing Ruby code, it dispatches events back to the consumer. This allows quite a bit of flexibility. You can use it to build your own syntax tree or to find specific patterns in the code. `Prism`, on the other hand, returns the completed syntax tree to you _before_ it allows you to manipulate it. This means the tree that you get back is the only representation that can be generated by the parser _at parse time_ (but of course it can be manipulated later).
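The streaming behavior can be observed with a small sketch against the standard library `Ripper`. The `EventLogger` class is hypothetical and records only a handful of events, so it shows the dispatch order for one tiny input rather than the full event set.

```ruby
require "ripper"

# Hypothetical logger that records a few events in the order they are dispatched.
class EventLogger < Ripper
  attr_reader :events

  def initialize(source)
    super
    @events = []
  end

  # Record the event name and return it as the "value" of that event.
  %i[int binary stmts_new stmts_add program].each do |event|
    define_method(:"on_#{event}") do |*_args|
      @events << event
      event
    end
  end
end

logger = EventLogger.new("1 + 2")
logger.parse

# Events arrive while parsing is still in progress; :program is dispatched last.
puts logger.events.inspect
```

With `Prism`, by contrast, there is no such callback sequence to hook into: you receive the finished tree and walk it afterwards.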
|
|
59
|
+
|
|
60
|
+
### Fields
|
|
61
|
+
|
|
62
|
+
We use the term "field" to mean a piece of information on a syntax tree node. `Ripper` provides the minimal number of fields to accurately represent the syntax tree for the purposes of compilation/interpretation. For example, in the callbacks for nodes that are based on keywords (`class`, `module`, `for`, `while`, etc.) you are not given the keyword itself; you need to attach it on your own. In other cases, tokens are not necessarily dispatched at all, meaning you need to find them yourself. `Prism` provides the opposite: the maximum number of fields on nodes is provided. As a tradeoff, this requires more memory, but this is chosen to make it easier on consumers.
|
|
63
|
+
|
|
64
|
+
### Maintainability
|
|
65
|
+
|
|
66
|
+
The `Ripper` interface is not guaranteed in any way, and tends to change between patch versions of CRuby. This is largely due to the fact that `Ripper` is a by-product of the generated parser, as opposed to its own parser. As an example, in the expression `foo::bar = baz`, there are three different possible representations for the call operator, depending on the Ruby version:
|
|
67
|
+
|
|
68
|
+
* `:"::"` - Ruby 1.9 to Ruby 3.1.4
|
|
69
|
+
* `73` - Ruby 3.1.5 to Ruby 3.1.6
|
|
70
|
+
* `[:@op, "::", [lineno, column]]` - Ruby 3.2.0 and later
|
|
71
|
+
|
|
72
|
+
The `Prism` interface is guaranteed going forward to be the consistent, official Ruby syntax tree interface. This means you can rely on this interface without having to worry about individual changes between Ruby versions. It also is a gem, which means it is versioned based on the gem version, as opposed to being versioned based on the Ruby version. Finally, you can use `Prism` to parse multiple versions of Ruby, whereas `Ripper` is tied to the Ruby version it is running on.
|
data/docs/ruby_api.md
ADDED
|
@@ -0,0 +1,44 @@
|
|
|
1
|
+
# Ruby API
|
|
2
|
+
|
|
3
|
+
The `prism` gem provides a Ruby API for accessing the syntax tree.
|
|
4
|
+
|
|
5
|
+
For the most part, the API for accessing the tree mirrors that found in the [Syntax Tree](https://github.com/ruby-syntax-tree/syntax_tree) project. This means:
|
|
6
|
+
|
|
7
|
+
* Walking the tree involves creating a visitor and passing it to the `#accept` method on any node in the tree
|
|
8
|
+
* Nodes in the tree respond to named methods for accessing their children as well as `#child_nodes`
|
|
9
|
+
* Nodes respond to the pattern matching interfaces `#deconstruct` and `#deconstruct_keys`
|
|
10
|
+
|
|
11
|
+
Every entry in `config.yml` will generate a Ruby class as well as the code that builds the nodes themselves.
|
|
12
|
+
Creating a syntax tree involves calling one of the class methods on the `Prism` module.
|
|
13
|
+
The full API is documented below.
|
|
14
|
+
|
|
15
|
+
## API
|
|
16
|
+
|
|
17
|
+
* `Prism.dump(source)` - parse the syntax tree corresponding to the given source string, and serialize it to a string
|
|
18
|
+
* `Prism.dump_file(filepath)` - parse the syntax tree corresponding to the given source file and serialize it to a string
|
|
19
|
+
* `Prism.lex(source)` - parse the tokens corresponding to the given source string and return them as an array within a parse result
|
|
20
|
+
* `Prism.lex_file(filepath)` - parse the tokens corresponding to the given source file and return them as an array within a parse result
|
|
21
|
+
* `Prism.parse(source)` - parse the syntax tree corresponding to the given source string and return it within a parse result
|
|
22
|
+
* `Prism.parse_file(filepath)` - parse the syntax tree corresponding to the given source file and return it within a parse result
|
|
23
|
+
* `Prism.parse_stream(io)` - parse the syntax tree corresponding to the source that is read out of the given IO object using the `#gets` method and return it within a parse result
|
|
24
|
+
* `Prism.parse_lex(source)` - parse the syntax tree corresponding to the given source string and return it within a parse result, along with the tokens
|
|
25
|
+
* `Prism.parse_lex_file(filepath)` - parse the syntax tree corresponding to the given source file and return it within a parse result, along with the tokens
|
|
26
|
+
* `Prism.load(source, serialized, freeze = false)` - load the serialized syntax tree using the source as a reference into a syntax tree
|
|
27
|
+
* `Prism.parse_comments(source)` - parse the comments corresponding to the given source string and return them
|
|
28
|
+
* `Prism.parse_file_comments(filepath)` - parse the comments corresponding to the given source file and return them
|
|
29
|
+
* `Prism.parse_success?(source)` - parse the syntax tree corresponding to the given source string and return true if it was parsed without errors
|
|
30
|
+
* `Prism.parse_file_success?(filepath)` - parse the syntax tree corresponding to the given source file and return true if it was parsed without errors
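A short sketch of a few of these entry points follows. The source strings are made up for illustration, and the example relies on `ParseResult#success?`, `#failure?`, and `#errors`, and on `Prism.lex` returning token/state pairs in its `value`.

```ruby
require "prism"

# A valid snippet and one with a dangling operator.
good = Prism.parse("puts 1 + 2")
bad = Prism.parse("puts 1 +")

puts good.success?
puts bad.failure?
puts bad.errors.first.message

# Prism.lex wraps an array of [token, lexer_state] pairs in a result object.
token_types = Prism.lex("1 + 2").value.map { |token, _state| token.type }
puts token_types.inspect
```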
|
|
31
|
+
|
|
32
|
+
## Nodes
|
|
33
|
+
|
|
34
|
+
Once you have nodes in hand coming out of a parse result, there are a number of common APIs that are available on each instance. They are:
|
|
35
|
+
|
|
36
|
+
* `#accept(visitor)` - a method that will immediately call `visit_*` to specialize for the node type
|
|
37
|
+
* `#child_nodes` - a positional array of the child nodes of the node, with `nil` values for any missing children
|
|
38
|
+
* `#compact_child_nodes` - a positional array of the child nodes of the node with no `nil` values
|
|
39
|
+
* `#copy(**keys)` - a method that allows creating a shallow copy of the node with the given keys overridden
|
|
40
|
+
* `#deconstruct`/`#deconstruct_keys(keys)` - the pattern matching interface for nodes
|
|
41
|
+
* `#inspect` - a string representation that looks like the syntax tree of the node
|
|
42
|
+
* `#location` - a `Location` object that describes the location of the node in the source file
|
|
43
|
+
* `#to_dot` - convert the node's syntax tree into graphviz dot notation
|
|
44
|
+
* `#type` - a symbol that represents the type of the node, useful for quick comparisons
|
|
@@ -0,0 +1,19 @@
|
|
|
1
|
+
# ruby_parser translation
|
|
2
|
+
|
|
3
|
+
Prism ships with the ability to translate its syntax tree into the syntax tree used by the [seattlerb/ruby_parser](https://github.com/seattlerb/ruby_parser) gem. This allows you to use tools built on top of the `ruby_parser` gem with the `prism` parser.
|
|
4
|
+
|
|
5
|
+
## Usage
|
|
6
|
+
|
|
7
|
+
You can call the `parse` and `parse_file` methods on the `Prism::Translation::RubyParser` module:
|
|
8
|
+
|
|
9
|
+
```ruby
|
|
10
|
+
filepath = "path/to/file.rb"
|
|
11
|
+
Prism::Translation::RubyParser.parse_file(filepath)
|
|
12
|
+
```
|
|
13
|
+
|
|
14
|
+
This will return to you `Sexp` objects that mirror the result of calling `RubyParser` methods, as in:
|
|
15
|
+
|
|
16
|
+
```ruby
|
|
17
|
+
filepath = "path/to/file.rb"
|
|
18
|
+
RubyParser.new.parse(File.read(filepath), filepath)
|
|
19
|
+
```
|