prism 0.16.0 → 0.17.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +16 -1
- data/Makefile +6 -0
- data/README.md +1 -1
- data/config.yml +50 -35
- data/docs/fuzzing.md +1 -1
- data/docs/serialization.md +28 -29
- data/ext/prism/api_node.c +802 -770
- data/ext/prism/api_pack.c +20 -9
- data/ext/prism/extension.c +464 -162
- data/ext/prism/extension.h +1 -1
- data/include/prism/ast.h +3173 -763
- data/include/prism/defines.h +32 -9
- data/include/prism/diagnostic.h +36 -3
- data/include/prism/enc/pm_encoding.h +118 -28
- data/include/prism/node.h +38 -13
- data/include/prism/options.h +204 -0
- data/include/prism/pack.h +44 -33
- data/include/prism/parser.h +445 -200
- data/include/prism/prettyprint.h +12 -1
- data/include/prism/regexp.h +16 -2
- data/include/prism/util/pm_buffer.h +94 -16
- data/include/prism/util/pm_char.h +162 -48
- data/include/prism/util/pm_constant_pool.h +126 -32
- data/include/prism/util/pm_list.h +68 -38
- data/include/prism/util/pm_memchr.h +18 -3
- data/include/prism/util/pm_newline_list.h +70 -27
- data/include/prism/util/pm_state_stack.h +25 -7
- data/include/prism/util/pm_string.h +115 -27
- data/include/prism/util/pm_string_list.h +25 -6
- data/include/prism/util/pm_strncasecmp.h +32 -0
- data/include/prism/util/pm_strpbrk.h +31 -17
- data/include/prism/version.h +27 -2
- data/include/prism.h +224 -31
- data/lib/prism/compiler.rb +6 -3
- data/lib/prism/debug.rb +23 -7
- data/lib/prism/dispatcher.rb +33 -18
- data/lib/prism/dsl.rb +10 -5
- data/lib/prism/ffi.rb +132 -80
- data/lib/prism/lex_compat.rb +25 -15
- data/lib/prism/mutation_compiler.rb +10 -5
- data/lib/prism/node.rb +370 -135
- data/lib/prism/node_ext.rb +1 -1
- data/lib/prism/node_inspector.rb +1 -1
- data/lib/prism/pack.rb +79 -40
- data/lib/prism/parse_result/comments.rb +7 -2
- data/lib/prism/parse_result/newlines.rb +4 -0
- data/lib/prism/parse_result.rb +150 -30
- data/lib/prism/pattern.rb +11 -0
- data/lib/prism/ripper_compat.rb +28 -10
- data/lib/prism/serialize.rb +86 -54
- data/lib/prism/visitor.rb +10 -3
- data/lib/prism.rb +20 -2
- data/prism.gemspec +4 -2
- data/rbi/prism.rbi +104 -60
- data/rbi/prism_static.rbi +16 -2
- data/sig/prism.rbs +72 -43
- data/sig/prism_static.rbs +14 -1
- data/src/diagnostic.c +56 -53
- data/src/enc/pm_big5.c +1 -0
- data/src/enc/pm_euc_jp.c +1 -0
- data/src/enc/pm_gbk.c +1 -0
- data/src/enc/pm_shift_jis.c +1 -0
- data/src/enc/pm_tables.c +316 -80
- data/src/enc/pm_unicode.c +53 -8
- data/src/enc/pm_windows_31j.c +1 -0
- data/src/node.c +334 -321
- data/src/options.c +170 -0
- data/src/prettyprint.c +74 -47
- data/src/prism.c +1642 -856
- data/src/regexp.c +151 -95
- data/src/serialize.c +44 -20
- data/src/token_type.c +3 -1
- data/src/util/pm_buffer.c +45 -15
- data/src/util/pm_char.c +103 -57
- data/src/util/pm_constant_pool.c +51 -21
- data/src/util/pm_list.c +12 -4
- data/src/util/pm_memchr.c +5 -3
- data/src/util/pm_newline_list.c +20 -12
- data/src/util/pm_state_stack.c +9 -3
- data/src/util/pm_string.c +95 -85
- data/src/util/pm_string_list.c +14 -15
- data/src/util/pm_strncasecmp.c +10 -3
- data/src/util/pm_strpbrk.c +25 -19
- metadata +5 -3
- data/docs/prism.png +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: 69cdca044f91ad2a198666562fdffd6035323906da465743ebff2cbd34ccac5d
+ data.tar.gz: 3da1460885f7cabda4b1ae6438274ab64a6e966536587d98eab2b3c764d0b6c0
  SHA512:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: c01d8b62728fe1cbce99394d683c28b943b7962e6a1fc82d435b3190286612fc4d7e4aaab93a6bef7a75f8acdfa3b5a597acac3295e489a7dd89aada94b29b23
+ data.tar.gz: ce88826e2a46cb18fe5e89b1a7c18c8fba2387b9c9e5625ec9aac5e5bd1449b21f23624bd0a6b526c4a9fa156c679186f8b89c09d0a003d386663f4a6a33269e
data/CHANGELOG.md
CHANGED
@@ -6,6 +6,20 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a

  ## [Unreleased]

+ ## [0.17.0] - 2023-11-03
+
+ ### Added
+
+ - We now properly support forwarding arguments into arrays, like `def foo(*) = [*]`.
+ - We now have much better documentation for the C and Ruby APIs.
+ - We now properly provide an error message when attempting to assign to numbered parameters from within regular expression named capture groups, as in `/(?<_1>)/ =~ ""`.
+
+ ### Changed
+
+ - **BREAKING**: `KeywordParameterNode` is split into `OptionalKeywordParameterNode` and `RequiredKeywordParameterNode`. `RequiredKeywordParameterNode` has no `value` field.
+ - **BREAKING**: Most of the `Prism::` APIs now accept a bunch of keyword options. The options we now support are: `filepath`, `encoding`, `line`, `frozen_string_literal`, `verbose`, and `scopes`. See [the pull request](https://github.com/ruby/prism/pull/1763) for more details.
+ - **BREAKING**: Comments are now split into three different classes instead of a single class, and the `type` field has been removed. They are: `InlineComment`, `EmbDocComment`, and `DATAComment`.
+
  ## [0.16.0] - 2023-10-30

  ### Added
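As a rough illustration of the `KeywordParameterNode` split recorded in this hunk, the sketch below shows how consumer code might branch on the two new node classes instead of checking whether `value` is `nil`. The two `Struct` definitions and the `KeywordParameterSketch` module are stand-ins invented for this example so that it runs without the gem; they are not the prism classes, which carry more fields (such as `name_loc`).

```ruby
# Stand-ins for illustration only; the real node classes live in the prism gem.
module KeywordParameterSketch
  OptionalKeywordParameterNode = Struct.new(:name, :value)
  RequiredKeywordParameterNode = Struct.new(:name)

  # Render a keyword parameter by dispatching on the node class, since the
  # required variant no longer has a `value` field to inspect.
  def self.describe(node)
    case node
    when OptionalKeywordParameterNode
      "#{node.name}: #{node.value}"
    when RequiredKeywordParameterNode
      "#{node.name}:"
    else
      raise ArgumentError, "unexpected node: #{node.class}"
    end
  end
end

required = KeywordParameterSketch::RequiredKeywordParameterNode.new(:b)
optional = KeywordParameterSketch::OptionalKeywordParameterNode.new(:b, 1)

puts KeywordParameterSketch.describe(required)
puts KeywordParameterSketch.describe(optional)
```

Code written against 0.16.0 that tested `node.value.nil?` on a single `KeywordParameterNode` would, under this assumption, move to a class-based branch like the one above.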
@@ -219,7 +233,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a

  - 🎉 Initial release! 🎉

- [unreleased]: https://github.com/ruby/prism/compare/v0.
+ [unreleased]: https://github.com/ruby/prism/compare/v0.17.0...HEAD
+ [0.17.0]: https://github.com/ruby/prism/compare/v0.16.0...v0.17.0
  [0.16.0]: https://github.com/ruby/prism/compare/v0.15.1...v0.16.0
  [0.15.1]: https://github.com/ruby/prism/compare/v0.15.0...v0.15.1
  [0.15.0]: https://github.com/ruby/prism/compare/v0.14.0...v0.15.0
data/Makefile
CHANGED
@@ -88,3 +88,9 @@ clean:
  all-no-debug: DEBUG_FLAGS := -DNDEBUG=1
  all-no-debug: OPTFLAGS := -O3
  all-no-debug: all
+
+ run: Makefile $(STATIC_OBJECTS) $(HEADERS) test.c
+ $(ECHO) "compiling test.c"
+ $(Q) $(CC) $(CPPFLAGS) $(CFLAGS) $(STATIC_OBJECTS) test.c
+ $(ECHO) "running test.c"
+ $(Q) ./a.out
data/README.md
CHANGED
@@ -1,6 +1,6 @@
  <h1 align="center">Prism Ruby parser</h1>
  <div align="center">
- <img alt="Prism Ruby parser" height="256px" src="https://github.com/ruby/prism/blob/main/
+ <img alt="Prism Ruby parser" height="256px" src="https://github.com/ruby/prism/blob/main/doc/images/prism.png?raw=true">
  </div>

  This is a parser for the Ruby programming language. It is designed to be portable, error tolerant, and maintainable. It is written in C99 and has no dependencies. It is currently being integrated into [CRuby](https://github.com/ruby/ruby), [JRuby](https://github.com/jruby/jruby), [TruffleRuby](https://github.com/oracle/truffleruby), [Sorbet](https://github.com/sorbet/sorbet), and [Syntax Tree](https://github.com/ruby-syntax-tree/syntax_tree).
data/config.yml
CHANGED
@@ -59,11 +59,11 @@ tokens:
  - name: CONSTANT
  comment: "a constant"
  - name: DOT
- comment: "."
+ comment: "the . call operator"
  - name: DOT_DOT
- comment: ".."
+ comment: "the .. range operator"
  - name: DOT_DOT_DOT
- comment: "..."
+ comment: "the ... range operator or forwarding parameter"
  - name: EMBDOC_BEGIN
  comment: "=begin"
  - name: EMBDOC_END
@@ -311,9 +311,9 @@ tokens:
  - name: UCOLON_COLON
  comment: "unary ::"
  - name: UDOT_DOT
- comment: "unary .."
+ comment: "unary .. operator"
  - name: UDOT_DOT_DOT
- comment: "unary ..."
+ comment: "unary ... operator"
  - name: UMINUS
  comment: "-@"
  - name: UMINUS_NUM
@@ -333,12 +333,14 @@ flags:
  values:
  - name: KEYWORD_SPLAT
  comment: "if arguments contain keyword splat"
+ comment: Flags for arguments nodes.
  - name: CallNodeFlags
  values:
  - name: SAFE_NAVIGATION
  comment: "&. operator"
  - name: VARIABLE_CALL
  comment: "a call that could have been a local variable"
+ comment: Flags for call nodes.
  - name: IntegerBaseFlags
  values:
  - name: BINARY
@@ -349,14 +351,17 @@ flags:
  comment: "0d or no prefix"
  - name: HEXADECIMAL
  comment: "0x prefix"
+ comment: Flags for integer nodes that correspond to the base of the integer.
  - name: LoopFlags
  values:
  - name: BEGIN_MODIFIER
  comment: "a loop after a begin statement, so the body is executed first before the condition"
+ comment: Flags for while and until loop nodes.
  - name: RangeFlags
  values:
  - name: EXCLUDE_END
  comment: "... operator"
+ comment: Flags for range and flip-flop nodes.
  - name: RegularExpressionFlags
  values:
  - name: IGNORE_CASE
@@ -375,10 +380,12 @@ flags:
  comment: "s - forces the Windows-31J encoding"
  - name: UTF_8
  comment: "u - forces the UTF-8 encoding"
+ comment: Flags for regular expression and match last line nodes.
  - name: StringFlags
  values:
  - name: FROZEN
  comment: "frozen by virtue of a `frozen_string_literal` comment"
+ comment: Flags for string nodes.
  nodes:
  - name: AliasGlobalVariableNode
  fields:
@@ -777,10 +784,10 @@ nodes:
  comment: |
  Represents the use of a case statement.

-
-
-
-
+ case true
+ when false
+ end
+ ^^^^^^^^^^
  - name: ClassNode
  fields:
  - name: locals
@@ -818,7 +825,7 @@ nodes:
  Represents the use of the `&&=` operator for assignment to a class variable.

  @@target &&= value
-
+ ^^^^^^^^^^^^^^^^^^
  - name: ClassVariableOperatorWriteNode
  fields:
  - name: name
@@ -1183,13 +1190,13 @@ nodes:
  Represents a find pattern in pattern matching.

  foo in *bar, baz, *qux
-
+ ^^^^^^^^^^^^^^^

  foo in [*bar, baz, *qux]
-
+ ^^^^^^^^^^^^^^^^^

  foo in Foo(*bar, baz, *qux)
-
+ ^^^^^^^^^^^^^^^^^^^^
  - name: FlipFlopNode
  fields:
  - name: left
@@ -1240,7 +1247,7 @@ nodes:

  def foo(...)
  bar(...)
-
+ ^^^
  end
  - name: ForwardingParameterNode
  comment: |
@@ -1692,24 +1699,6 @@ nodes:

  foo(a: b)
  ^^^^
- - name: KeywordParameterNode
- fields:
- - name: name
- type: constant
- - name: name_loc
- type: location
- - name: value
- type: node?
- comment: |
- Represents a keyword parameter to a method, block, or lambda definition.
-
- def a(b:)
- ^^
- end
-
- def a(b: 1)
- ^^^^
- end
  - name: KeywordRestParameterNode
  fields:
  - name: name
@@ -1997,6 +1986,20 @@ nodes:

  $1
  ^^
+ - name: OptionalKeywordParameterNode
+ fields:
+ - name: name
+ type: constant
+ - name: name_loc
+ type: location
+ - name: value
+ type: node
+ comment: |
+ Represents an optional keyword parameter to a method, block, or lambda definition.
+
+ def a(b: 1)
+ ^^^^
+ end
  - name: OptionalParameterNode
  fields:
  - name: name
@@ -2184,6 +2187,18 @@ nodes:

  /foo/i
  ^^^^^^
+ - name: RequiredKeywordParameterNode
+ fields:
+ - name: name
+ type: constant
+ - name: name_loc
+ type: location
+ comment: |
+ Represents a required keyword parameter to a method, block, or lambda definition.
+
+ def a(b: )
+ ^^
+ end
  - name: RequiredParameterNode
  fields:
  - name: name
@@ -2206,8 +2221,8 @@ nodes:
  comment: |
  Represents an expression modified with a rescue.

-
-
+ foo rescue nil
+ ^^^^^^^^^^^^^^
  - name: RescueNode
  fields:
  - name: keyword_loc
@@ -2229,8 +2244,8 @@ nodes:

  begin
  rescue Foo, *splat, Bar => ex
- ^^^^^^
  foo
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  end

  `Foo, *splat, Bar` are in the `exceptions` field.
data/docs/fuzzing.md
CHANGED
data/docs/serialization.md
CHANGED
@@ -72,6 +72,7 @@ The header is structured like the following table:
  | `1` | patch version number |
  | `1` | 1 indicates only semantics fields were serialized, 0 indicates all fields were serialized (including location fields) |
  | string | the encoding name |
+ | varint | the start line |
  | varint | number of comments |
  | comment* | comments |
  | varint | number of magic comments |
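The header rows above store several counts, including the new start line, as `varint` values. The excerpt shown in this hunk does not define that encoding, so the sketch below assumes unsigned LEB128 (seven payload bits per byte, least-significant group first, high bit set on every byte except the last). The helper names `encode_varint` and `decode_varint` are hypothetical and are not part of prism.

```ruby
# Sketch only: assumes `varint` means unsigned LEB128.

# Encode a non-negative integer into a binary string.
def encode_varint(value)
  raise ArgumentError, "value must be non-negative" if value.negative?

  bytes = []
  loop do
    byte = value & 0x7f
    value >>= 7
    if value.zero?
      bytes << byte # final byte: continuation bit clear
      break
    end
    bytes << (byte | 0x80) # more bytes follow: continuation bit set
  end
  bytes.pack("C*")
end

# Decode one varint starting at `offset`; returns [value, next_offset].
def decode_varint(bytes, offset = 0)
  value = 0
  shift = 0
  loop do
    byte = bytes.getbyte(offset)
    raise ArgumentError, "truncated varint" if byte.nil?

    offset += 1
    value |= (byte & 0x7f) << shift
    return [value, offset] if (byte & 0x80).zero?

    shift += 7
  end
end

encoded = encode_varint(300)
puts encoded.bytes.inspect
puts decode_varint(encoded).inspect
```

Returning the next offset alongside the value lets a reader walk consecutive header fields without tracking byte widths separately.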
@@ -136,56 +137,54 @@ typedef struct {
  size_t capacity;
  } pm_buffer_t;

- // Initialize a pm_buffer_t with its default values.
- bool pm_buffer_init(pm_buffer_t *);
-
  // Free the memory associated with the buffer.
  void pm_buffer_free(pm_buffer_t *);

  // Parse and serialize the AST represented by the given source to the given
  // buffer.
- void
+ void pm_serialize_parse(pm_buffer_t *buffer, const uint8_t *source, size_t length, const char *data);
  ```

- Typically you would use a stack-allocated `pm_buffer_t` and call `
+ Typically you would use a stack-allocated `pm_buffer_t` and call `pm_serialize_parse`, as in:

  ```c
  void
  serialize(const uint8_t *source, size_t length) {
- pm_buffer_t buffer;
-
+ pm_buffer_t buffer = { 0 };
+ pm_serialize_parse(&buffer, source, length, NULL);

- pm_parse_serialize(source, length, &buffer, NULL);
  // Do something with the serialized string.

  pm_buffer_free(&buffer);
  }
  ```

- The final argument to `
- This includes the filepath that the source is associated with, and any nested local variables scopes that are necessary to properly parse the file (in the case of parsing an `eval`).
- Note that no `varint` are used here to make it easier to produce the metadata for the caller, and also serialized size is less important here.
- The metadata is a serialized format itself, and is structured as follows:
+ The final argument to `pm_serialize_parse` is an optional string that controls the options to the parse function. This includes all of the normal options that could be passed to `pm_parser_init` through a `pm_options_t` struct, but serialized as a string to make it easier for callers through FFI. Note that no `varint` are used here to make it easier to produce the data for the caller, and also serialized size is less important here. The format of the data is structured as follows:

- | # bytes | field
- |
- | `4`
- | | the filepath
- | `4`
+ | # bytes | field |
+ | ------- | -------------------------- |
+ | `4` | the length of the filepath |
+ | ... | the filepath bytes |
+ | `4` | the line number |
+ | `4` | the length the encoding |
+ | ... | the encoding bytes |
+ | `1` | frozen string literal |
+ | `1` | suppress warnings |
+ | `4` | the number of scopes |
+ | ... | the scopes |

-
+ Each scope is layed out as follows:

- | # bytes | field
- |
- | `4`
- | | the
+ | # bytes | field |
+ | ------- | -------------------------- |
+ | `4` | the number of locals |
+ | ... | the locals |

- Each local
+ Each local is layed out as follows:

- | # bytes | field
- |
- | `4`
- | | the local
+ | # bytes | field |
+ | ------- | -------------------------- |
+ | `4` | the length of the local |
+ | ... | the local bytes |

- The
- If it is not null, then a minimal metadata string would be `"\0\0\0\0\0\0\0\0"` which would use 4 bytes to indicate an empty filepath string and 4 bytes to indicate that there were no local variable scopes.
+ The data can be `NULL` (as seen in the example above).
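The option-string layout added in the tables above can be sketched from the caller's side as follows. This is a minimal illustration, not prism's own code: `pack_options` and its keyword names are hypothetical, and the tables do not state a byte order or signedness, so the sketch assumes each 4-byte field is an unsigned 32-bit integer in native byte order (`"L"` in `Array#pack`) and each 1-byte field is 0 or 1.

```ruby
# Hypothetical helper (not part of prism): packs parse options into the byte
# layout described by the tables above.
# Assumption: 4-byte fields are unsigned 32-bit integers in native byte order.
def pack_options(filepath: "", line: 1, encoding: "", frozen_string_literal: false,
                 suppress_warnings: false, scopes: [])
  data = "".b

  # 4-byte length followed by the filepath bytes.
  data << [filepath.bytesize].pack("L") << filepath.b
  # 4-byte line number.
  data << [line].pack("L")
  # 4-byte length followed by the encoding name bytes.
  data << [encoding.bytesize].pack("L") << encoding.b
  # Two 1-byte flags.
  data << [frozen_string_literal ? 1 : 0].pack("C")
  data << [suppress_warnings ? 1 : 0].pack("C")

  # 4-byte scope count, then each scope as a count of locals plus the locals.
  data << [scopes.length].pack("L")
  scopes.each do |locals|
    data << [locals.length].pack("L")
    locals.each do |local|
      name = local.to_s
      data << [name.bytesize].pack("L") << name.b
    end
  end

  data
end

minimal = pack_options
puts minimal.bytesize

with_scope = pack_options(filepath: "foo.rb", encoding: "UTF-8", scopes: [[:a, :bc]])
puts with_scope.bytesize
```

With every option left at its default, the layout still needs all the fixed-width fields, so the smallest non-`NULL` string under these assumptions is 18 bytes rather than the 8 bytes described in the removed 0.16.0 text.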