prism 0.15.1 → 0.17.0

Files changed (91)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +35 -1
  3. data/Makefile +12 -0
  4. data/README.md +3 -1
  5. data/config.yml +66 -50
  6. data/docs/configuration.md +2 -0
  7. data/docs/fuzzing.md +1 -1
  8. data/docs/javascript.md +90 -0
  9. data/docs/releasing.md +27 -0
  10. data/docs/ruby_api.md +2 -0
  11. data/docs/serialization.md +28 -29
  12. data/ext/prism/api_node.c +856 -826
  13. data/ext/prism/api_pack.c +20 -9
  14. data/ext/prism/extension.c +494 -119
  15. data/ext/prism/extension.h +1 -1
  16. data/include/prism/ast.h +3157 -747
  17. data/include/prism/defines.h +40 -8
  18. data/include/prism/diagnostic.h +36 -3
  19. data/include/prism/enc/pm_encoding.h +119 -28
  20. data/include/prism/node.h +38 -30
  21. data/include/prism/options.h +204 -0
  22. data/include/prism/pack.h +44 -33
  23. data/include/prism/parser.h +445 -199
  24. data/include/prism/prettyprint.h +26 -0
  25. data/include/prism/regexp.h +16 -2
  26. data/include/prism/util/pm_buffer.h +102 -18
  27. data/include/prism/util/pm_char.h +162 -48
  28. data/include/prism/util/pm_constant_pool.h +128 -34
  29. data/include/prism/util/pm_list.h +68 -38
  30. data/include/prism/util/pm_memchr.h +18 -3
  31. data/include/prism/util/pm_newline_list.h +71 -28
  32. data/include/prism/util/pm_state_stack.h +25 -7
  33. data/include/prism/util/pm_string.h +115 -27
  34. data/include/prism/util/pm_string_list.h +25 -6
  35. data/include/prism/util/pm_strncasecmp.h +32 -0
  36. data/include/prism/util/pm_strpbrk.h +31 -17
  37. data/include/prism/version.h +28 -3
  38. data/include/prism.h +229 -36
  39. data/lib/prism/compiler.rb +5 -5
  40. data/lib/prism/debug.rb +43 -13
  41. data/lib/prism/desugar_compiler.rb +1 -1
  42. data/lib/prism/dispatcher.rb +27 -26
  43. data/lib/prism/dsl.rb +16 -16
  44. data/lib/prism/ffi.rb +138 -61
  45. data/lib/prism/lex_compat.rb +26 -16
  46. data/lib/prism/mutation_compiler.rb +11 -11
  47. data/lib/prism/node.rb +426 -227
  48. data/lib/prism/node_ext.rb +23 -16
  49. data/lib/prism/node_inspector.rb +1 -1
  50. data/lib/prism/pack.rb +79 -40
  51. data/lib/prism/parse_result/comments.rb +7 -2
  52. data/lib/prism/parse_result/newlines.rb +4 -0
  53. data/lib/prism/parse_result.rb +157 -21
  54. data/lib/prism/pattern.rb +14 -3
  55. data/lib/prism/ripper_compat.rb +28 -10
  56. data/lib/prism/serialize.rb +935 -307
  57. data/lib/prism/visitor.rb +9 -5
  58. data/lib/prism.rb +20 -2
  59. data/prism.gemspec +11 -2
  60. data/rbi/prism.rbi +7305 -0
  61. data/rbi/prism_static.rbi +196 -0
  62. data/sig/prism.rbs +4468 -0
  63. data/sig/prism_static.rbs +123 -0
  64. data/src/diagnostic.c +56 -53
  65. data/src/enc/pm_big5.c +1 -0
  66. data/src/enc/pm_euc_jp.c +1 -0
  67. data/src/enc/pm_gbk.c +1 -0
  68. data/src/enc/pm_shift_jis.c +1 -0
  69. data/src/enc/pm_tables.c +316 -80
  70. data/src/enc/pm_unicode.c +54 -9
  71. data/src/enc/pm_windows_31j.c +1 -0
  72. data/src/node.c +357 -345
  73. data/src/options.c +170 -0
  74. data/src/prettyprint.c +7697 -1643
  75. data/src/prism.c +1964 -1125
  76. data/src/regexp.c +153 -95
  77. data/src/serialize.c +432 -397
  78. data/src/token_type.c +3 -1
  79. data/src/util/pm_buffer.c +88 -23
  80. data/src/util/pm_char.c +103 -57
  81. data/src/util/pm_constant_pool.c +52 -22
  82. data/src/util/pm_list.c +12 -4
  83. data/src/util/pm_memchr.c +5 -3
  84. data/src/util/pm_newline_list.c +25 -63
  85. data/src/util/pm_state_stack.c +9 -3
  86. data/src/util/pm_string.c +95 -85
  87. data/src/util/pm_string_list.c +14 -15
  88. data/src/util/pm_strncasecmp.c +10 -3
  89. data/src/util/pm_strpbrk.c +25 -19
  90. metadata +12 -3
  91. data/docs/prism.png +0 -0
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 967c414d9d2354dd828368920733b1c81240d1f0c267fdc422d075965f1cec94
4
- data.tar.gz: a258399d77a8f4c6ca297fcea435c8b2c167bced045b32d2884f270a9483c33b
3
+ metadata.gz: 69cdca044f91ad2a198666562fdffd6035323906da465743ebff2cbd34ccac5d
4
+ data.tar.gz: 3da1460885f7cabda4b1ae6438274ab64a6e966536587d98eab2b3c764d0b6c0
5
5
  SHA512:
6
- metadata.gz: 104746123657e06d12be6da5ebc6778afda2f94646d3544ff88acd42c5ef2d01f46855f96548360166ee0b4ce6d5164107f7a857c1b52984237421287da4e85c
7
- data.tar.gz: d4237da63459ab266d5ee824d680eadef6ee893da5b7fd47960b55f4496630e18b0804b455e51576dd52ab4f085a3eb013d1e08fc7c4577c1ed07441abf6b53e
6
+ metadata.gz: c01d8b62728fe1cbce99394d683c28b943b7962e6a1fc82d435b3190286612fc4d7e4aaab93a6bef7a75f8acdfa3b5a597acac3295e489a7dd89aada94b29b23
7
+ data.tar.gz: ce88826e2a46cb18fe5e89b1a7c18c8fba2387b9c9e5625ec9aac5e5bd1449b21f23624bd0a6b526c4a9fa156c679186f8b89c09d0a003d386663f4a6a33269e
data/CHANGELOG.md CHANGED
@@ -6,6 +6,38 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a
6
6
 
7
7
  ## [Unreleased]
8
8
 
9
+ ## [0.17.0] - 2023-11-03
10
+
11
+ ### Added
12
+
13
+ - We now properly support forwarding arguments into arrays, like `def foo(*) = [*]`.
14
+ - We now have much better documentation for the C and Ruby APIs.
15
+ - We now properly provide an error message when attempting to assign to numbered parameters from within regular expression named capture groups, as in `/(?<_1>)/ =~ ""`.
16
+
17
+ ### Changed
18
+
19
+ - **BREAKING**: `KeywordParameterNode` is split into `OptionalKeywordParameterNode` and `RequiredKeywordParameterNode`. `RequiredKeywordParameterNode` has no `value` field.
20
+ - **BREAKING**: Most of the `Prism::` APIs now accept a bunch of keyword options. The options we now support are: `filepath`, `encoding`, `line`, `frozen_string_literal`, `verbose`, and `scopes`. See [the pull request](https://github.com/ruby/prism/pull/1763) for more details.
21
+ - **BREAKING**: Comments are now split into three different classes instead of a single class, and the `type` field has been removed. They are: `InlineComment`, `EmbDocComment`, and `DATAComment`.
22
+
23
+ ## [0.16.0] - 2023-10-30
24
+
25
+ ### Added
26
+
27
+ - `InterpolatedMatchLastLineNode#options` and `MatchLastLineNode#options` are added, which are the same methods as are exposed on `InterpolatedRegularExpressionNode` and `RegularExpressionNode`.
28
+ - The project can now be compiled with `wasi-sdk` to expose a WebAssembly interface.
29
+ - `ArgumentsNode#keyword_splat?` is added to indicate if the arguments node has a keyword splat.
30
+ - The C API `pm_prettyprint` has a much improved output which lines up closely with `Node#inspect`.
31
+ - Prism now ships with `RBS` and `RBI` type signatures (in the `/sig` and `/rbi` directories, respectively).
32
+ - `Prism::parse_comments` and `Prism::parse_file_comments` APIs are added to extract only the comments from the source code.
33
+
34
+ ### Changed
35
+
36
+ - **BREAKING**: `Multi{Target,Write}Node#targets` is split up now into `lefts`, `rest`, and `rights`. This is to avoid having to scan the list in the case that there are splat nodes.
37
+ - Some bugs are fixed on `Multi{Target,Write}Node` accidentally creating additional nesting when not necessary.
38
+ - **BREAKING**: `RequiredDestructuredParameterNode` has been removed in favor of using `MultiTargetNode` in those places.
39
+ - **BREAKING**: `HashPatternNode#assocs` has been renamed to `HashPatternNode#elements`. `HashPatternNode#kwrest` has been renamed to `HashPatternNode#rest`.
40
+
9
41
  ## [0.15.1] - 2023-10-18
10
42
 
11
43
  ### Changed
@@ -201,7 +233,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a
201
233
 
202
234
  - 🎉 Initial release! 🎉
203
235
 
204
- [unreleased]: https://github.com/ruby/prism/compare/v0.15.1...HEAD
236
+ [unreleased]: https://github.com/ruby/prism/compare/v0.17.0...HEAD
237
+ [0.17.0]: https://github.com/ruby/prism/compare/v0.16.0...v0.17.0
238
+ [0.16.0]: https://github.com/ruby/prism/compare/v0.15.1...v0.16.0
205
239
  [0.15.1]: https://github.com/ruby/prism/compare/v0.15.0...v0.15.1
206
240
  [0.15.0]: https://github.com/ruby/prism/compare/v0.14.0...v0.15.0
207
241
  [0.14.0]: https://github.com/ruby/prism/compare/v0.13.0...v0.14.0
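The `lefts`/`rest`/`rights` split described in the 0.16.0 entry can be pictured with a small stand-alone sketch. It does not use the Prism classes: `Splat` and `partition_targets` are hypothetical names invented for this illustration, and the parser itself presumably fills the three fields directly rather than partitioning a flat list afterwards.

```ruby
# Hypothetical stand-in for a splat target such as `*b`; not a Prism class.
Splat = Struct.new(:name)

# Partition a flat list of targets into the three groups named in the
# changelog entry: lefts, rest, and rights.
def partition_targets(targets)
  index = targets.index { |target| target.is_a?(Splat) }
  return { lefts: targets, rest: nil, rights: [] } if index.nil?

  {
    lefts: targets.take(index),
    rest: targets[index],
    rights: targets.drop(index + 1)
  }
end

# Models the left-hand side of `a, *b, c = 1, 2, 3`.
result = partition_targets([:a, Splat.new(:b), :c])
```

With the fields already separated like this, a consumer can read `rest` directly instead of scanning the target list for a splat, which is the motivation the changelog gives.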
data/Makefile CHANGED
@@ -13,6 +13,7 @@ SOEXT := $(shell ruby -e 'puts RbConfig::CONFIG["SOEXT"]')
13
13
  CPPFLAGS := -Iinclude
14
14
  CFLAGS := -g -O2 -std=c99 -Wall -Werror -Wextra -Wpedantic -Wundef -Wconversion -fPIC -fvisibility=hidden
15
15
  CC := cc
16
+ WASI_SDK_PATH := /opt/wasi-sdk
16
17
 
17
18
  HEADERS := $(shell find include -name '*.h')
18
19
  SOURCES := $(shell find src -name '*.c')
@@ -23,6 +24,7 @@ all: shared static
23
24
 
24
25
  shared: build/librubyparser.$(SOEXT)
25
26
  static: build/librubyparser.a
27
+ wasm: javascript/src/prism.wasm
26
28
 
27
29
  build/librubyparser.$(SOEXT): $(SHARED_OBJECTS)
28
30
  $(ECHO) "linking $@"
@@ -32,6 +34,10 @@ build/librubyparser.a: $(STATIC_OBJECTS)
32
34
  $(ECHO) "building $@"
33
35
  $(Q) $(AR) $(ARFLAGS) $@ $(STATIC_OBJECTS) $(Q1:0=>/dev/null)
34
36
 
37
+ javascript/src/prism.wasm: Makefile $(SOURCES) $(HEADERS)
38
+ $(ECHO) "building $@"
39
+ $(Q) $(WASI_SDK_PATH)/bin/clang --sysroot=$(WASI_SDK_PATH)/share/wasi-sysroot/ $(DEBUG_FLAGS) -DPRISM_EXPORT_SYMBOLS -D_WASI_EMULATED_MMAN -lwasi-emulated-mman $(CPPFLAGS) $(CFLAGS) -Wl,--export-all -Wl,--no-entry -mexec-model=reactor -o $@ $(SOURCES)
40
+
35
41
  build/shared/%.o: src/%.c Makefile $(HEADERS)
36
42
  $(ECHO) "compiling $@"
37
43
  $(Q) mkdir -p $(@D)
@@ -82,3 +88,9 @@ clean:
82
88
  all-no-debug: DEBUG_FLAGS := -DNDEBUG=1
83
89
  all-no-debug: OPTFLAGS := -O3
84
90
  all-no-debug: all
91
+
92
+ run: Makefile $(STATIC_OBJECTS) $(HEADERS) test.c
93
+ $(ECHO) "compiling test.c"
94
+ $(Q) $(CC) $(CPPFLAGS) $(CFLAGS) $(STATIC_OBJECTS) test.c
95
+ $(ECHO) "running test.c"
96
+ $(Q) ./a.out
data/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  <h1 align="center">Prism Ruby parser</h1>
2
2
  <div align="center">
3
- <img alt="Prism Ruby parser" height="256px" src="https://github.com/ruby/prism/blob/main/docs/prism.png?raw=true">
3
+ <img alt="Prism Ruby parser" height="256px" src="https://github.com/ruby/prism/blob/main/doc/images/prism.png?raw=true">
4
4
  </div>
5
5
 
6
6
  This is a parser for the Ruby programming language. It is designed to be portable, error tolerant, and maintainable. It is written in C99 and has no dependencies. It is currently being integrated into [CRuby](https://github.com/ruby/ruby), [JRuby](https://github.com/jruby/jruby), [TruffleRuby](https://github.com/oracle/truffleruby), [Sorbet](https://github.com/sorbet/sorbet), and [Syntax Tree](https://github.com/ruby-syntax-tree/syntax_tree).
@@ -85,7 +85,9 @@ See the [CONTRIBUTING.md](CONTRIBUTING.md) file for more information. We additio
85
85
  * [Encoding](docs/encoding.md)
86
86
  * [Fuzzing](docs/fuzzing.md)
87
87
  * [Heredocs](docs/heredocs.md)
88
+ * [JavaScript](docs/javascript.md)
88
89
  * [Mapping](docs/mapping.md)
90
+ * [Releasing](docs/releasing.md)
89
91
  * [Ripper](docs/ripper.md)
90
92
  * [Ruby API](docs/ruby_api.md)
91
93
  * [Serialization](docs/serialization.md)
data/config.yml CHANGED
@@ -59,11 +59,11 @@ tokens:
59
59
  - name: CONSTANT
60
60
  comment: "a constant"
61
61
  - name: DOT
62
- comment: "."
62
+ comment: "the . call operator"
63
63
  - name: DOT_DOT
64
- comment: ".."
64
+ comment: "the .. range operator"
65
65
  - name: DOT_DOT_DOT
66
- comment: "..."
66
+ comment: "the ... range operator or forwarding parameter"
67
67
  - name: EMBDOC_BEGIN
68
68
  comment: "=begin"
69
69
  - name: EMBDOC_END
@@ -311,9 +311,9 @@ tokens:
311
311
  - name: UCOLON_COLON
312
312
  comment: "unary ::"
313
313
  - name: UDOT_DOT
314
- comment: "unary .."
314
+ comment: "unary .. operator"
315
315
  - name: UDOT_DOT_DOT
316
- comment: "unary ..."
316
+ comment: "unary ... operator"
317
317
  - name: UMINUS
318
318
  comment: "-@"
319
319
  - name: UMINUS_NUM
@@ -329,12 +329,18 @@ tokens:
329
329
  - name: __END__
330
330
  comment: "marker for the point in the file at which the parser should stop"
331
331
  flags:
332
+ - name: ArgumentsNodeFlags
333
+ values:
334
+ - name: KEYWORD_SPLAT
335
+ comment: "if arguments contain keyword splat"
336
+ comment: Flags for arguments nodes.
332
337
  - name: CallNodeFlags
333
338
  values:
334
339
  - name: SAFE_NAVIGATION
335
340
  comment: "&. operator"
336
341
  - name: VARIABLE_CALL
337
342
  comment: "a call that could have been a local variable"
343
+ comment: Flags for call nodes.
338
344
  - name: IntegerBaseFlags
339
345
  values:
340
346
  - name: BINARY
@@ -345,14 +351,17 @@ flags:
345
351
  comment: "0d or no prefix"
346
352
  - name: HEXADECIMAL
347
353
  comment: "0x prefix"
354
+ comment: Flags for integer nodes that correspond to the base of the integer.
348
355
  - name: LoopFlags
349
356
  values:
350
357
  - name: BEGIN_MODIFIER
351
358
  comment: "a loop after a begin statement, so the body is executed first before the condition"
359
+ comment: Flags for while and until loop nodes.
352
360
  - name: RangeFlags
353
361
  values:
354
362
  - name: EXCLUDE_END
355
363
  comment: "... operator"
364
+ comment: Flags for range and flip-flop nodes.
356
365
  - name: RegularExpressionFlags
357
366
  values:
358
367
  - name: IGNORE_CASE
@@ -371,10 +380,12 @@ flags:
371
380
  comment: "s - forces the Windows-31J encoding"
372
381
  - name: UTF_8
373
382
  comment: "u - forces the UTF-8 encoding"
383
+ comment: Flags for regular expression and match last line nodes.
374
384
  - name: StringFlags
375
385
  values:
376
386
  - name: FROZEN
377
387
  comment: "frozen by virtue of a `frozen_string_literal` comment"
388
+ comment: Flags for string nodes.
378
389
  nodes:
379
390
  - name: AliasGlobalVariableNode
380
391
  fields:
@@ -432,6 +443,9 @@ nodes:
432
443
  fields:
433
444
  - name: arguments
434
445
  type: node[]
446
+ - name: flags
447
+ type: flags
448
+ kind: ArgumentsNodeFlags
435
449
  comment: |
436
450
  Represents a set of arguments to a method or a keyword.
437
451
 
@@ -770,10 +784,10 @@ nodes:
770
784
  comment: |
771
785
  Represents the use of a case statement.
772
786
 
773
- case true
774
- ^^^^^^^^^
775
- when false
776
- end
787
+ case true
788
+ when false
789
+ end
790
+ ^^^^^^^^^^
777
791
  - name: ClassNode
778
792
  fields:
779
793
  - name: locals
@@ -811,7 +825,7 @@ nodes:
811
825
  Represents the use of the `&&=` operator for assignment to a class variable.
812
826
 
813
827
  @@target &&= value
814
- ^^^^^^^^^^^^^^^^
828
+ ^^^^^^^^^^^^^^^^^^
815
829
  - name: ClassVariableOperatorWriteNode
816
830
  fields:
817
831
  - name: name
@@ -1176,13 +1190,13 @@ nodes:
1176
1190
  Represents a find pattern in pattern matching.
1177
1191
 
1178
1192
  foo in *bar, baz, *qux
1179
- ^^^^^^^^^^^^^^^^^^^^^^
1193
+ ^^^^^^^^^^^^^^^
1180
1194
 
1181
1195
  foo in [*bar, baz, *qux]
1182
- ^^^^^^^^^^^^^^^^^^^^^^^^
1196
+ ^^^^^^^^^^^^^^^^^
1183
1197
 
1184
1198
  foo in Foo(*bar, baz, *qux)
1185
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^
1199
+ ^^^^^^^^^^^^^^^^^^^^
1186
1200
  - name: FlipFlopNode
1187
1201
  fields:
1188
1202
  - name: left
@@ -1233,7 +1247,7 @@ nodes:
1233
1247
 
1234
1248
  def foo(...)
1235
1249
  bar(...)
1236
- ^^^^^^^^
1250
+ ^^^
1237
1251
  end
1238
1252
  - name: ForwardingParameterNode
1239
1253
  comment: |
@@ -1349,9 +1363,9 @@ nodes:
1349
1363
  fields:
1350
1364
  - name: constant
1351
1365
  type: node?
1352
- - name: assocs
1366
+ - name: elements
1353
1367
  type: node[]
1354
- - name: kwrest
1368
+ - name: rest
1355
1369
  type: node?
1356
1370
  - name: opening_loc
1357
1371
  type: location?
@@ -1685,24 +1699,6 @@ nodes:
1685
1699
 
1686
1700
  foo(a: b)
1687
1701
  ^^^^
1688
- - name: KeywordParameterNode
1689
- fields:
1690
- - name: name
1691
- type: constant
1692
- - name: name_loc
1693
- type: location
1694
- - name: value
1695
- type: node?
1696
- comment: |
1697
- Represents a keyword parameter to a method, block, or lambda definition.
1698
-
1699
- def a(b:)
1700
- ^^
1701
- end
1702
-
1703
- def a(b: 1)
1704
- ^^^^
1705
- end
1706
1702
  - name: KeywordRestParameterNode
1707
1703
  fields:
1708
1704
  - name: name
@@ -1915,7 +1911,11 @@ nodes:
1915
1911
  ^^^^^^^^^^^^^^
1916
1912
  - name: MultiTargetNode
1917
1913
  fields:
1918
- - name: targets
1914
+ - name: lefts
1915
+ type: node[]
1916
+ - name: rest
1917
+ type: node?
1918
+ - name: rights
1919
1919
  type: node[]
1920
1920
  - name: lparen_loc
1921
1921
  type: location?
@@ -1924,11 +1924,15 @@ nodes:
1924
1924
  comment: |
1925
1925
  Represents a multi-target expression.
1926
1926
 
1927
- a, b, c = 1, 2, 3
1928
- ^^^^^^^
1927
+ a, (b, c) = 1, 2, 3
1928
+ ^^^^^^
1929
1929
  - name: MultiWriteNode
1930
1930
  fields:
1931
- - name: targets
1931
+ - name: lefts
1932
+ type: node[]
1933
+ - name: rest
1934
+ type: node?
1935
+ - name: rights
1932
1936
  type: node[]
1933
1937
  - name: lparen_loc
1934
1938
  type: location?
@@ -1982,6 +1986,20 @@ nodes:
1982
1986
 
1983
1987
  $1
1984
1988
  ^^
1989
+ - name: OptionalKeywordParameterNode
1990
+ fields:
1991
+ - name: name
1992
+ type: constant
1993
+ - name: name_loc
1994
+ type: location
1995
+ - name: value
1996
+ type: node
1997
+ comment: |
1998
+ Represents an optional keyword parameter to a method, block, or lambda definition.
1999
+
2000
+ def a(b: 1)
2001
+ ^^^^
2002
+ end
1985
2003
  - name: OptionalParameterNode
1986
2004
  fields:
1987
2005
  - name: name
@@ -2169,19 +2187,17 @@ nodes:
2169
2187
 
2170
2188
  /foo/i
2171
2189
  ^^^^^^
2172
- - name: RequiredDestructuredParameterNode
2190
+ - name: RequiredKeywordParameterNode
2173
2191
  fields:
2174
- - name: parameters
2175
- type: node[]
2176
- - name: opening_loc
2177
- type: location
2178
- - name: closing_loc
2192
+ - name: name
2193
+ type: constant
2194
+ - name: name_loc
2179
2195
  type: location
2180
2196
  comment: |
2181
- Represents a destructured required parameter node.
2197
+ Represents a required keyword parameter to a method, block, or lambda definition.
2182
2198
 
2183
- def foo((bar, baz))
2184
- ^^^^^^^^^^
2199
+ def a(b: )
2200
+ ^^
2185
2201
  end
2186
2202
  - name: RequiredParameterNode
2187
2203
  fields:
@@ -2205,8 +2221,8 @@ nodes:
2205
2221
  comment: |
2206
2222
  Represents an expression modified with a rescue.
2207
2223
 
2208
- foo rescue nil
2209
- ^^^^^^^^^^^^^^
2224
+ foo rescue nil
2225
+ ^^^^^^^^^^^^^^
2210
2226
  - name: RescueNode
2211
2227
  fields:
2212
2228
  - name: keyword_loc
@@ -2228,8 +2244,8 @@ nodes:
2228
2244
 
2229
2245
  begin
2230
2246
  rescue Foo, *splat, Bar => ex
2231
- ^^^^^^
2232
2247
  foo
2248
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2233
2249
  end
2234
2250
 
2235
2251
  `Foo, *splat, Bar` are in the `exceptions` field.
data/docs/configuration.md CHANGED
@@ -4,6 +4,8 @@ A lot of code in prism's repository is templated from a single configuration fil
4
4
 
5
5
  * `ext/prism/api_node.c` - for defining how to build Ruby objects for the nodes out of C structs
6
6
  * `include/prism/ast.h` - for defining the C structs that represent the nodes
7
+ * `javascript/src/deserialize.js` - for defining how to deserialize the nodes in JavaScript
8
+ * `javascript/src/nodes.js` - for defining the nodes in JavaScript
7
9
  * `java/org/prism/AbstractNodeVisitor.java` - for defining the visitor interface for the nodes in Java
8
10
  * `java/org/prism/Loader.java` - for defining how to deserialize the nodes in Java
9
11
  * `java/org/prism/Nodes.java` - for defining the nodes in Java
data/docs/fuzzing.md CHANGED
@@ -25,7 +25,7 @@ fuzz
25
25
 
26
26
  There are currently three fuzzing targets
27
27
 
28
- - `pm_parse_serialize` (parse)
28
+ - `pm_serialize_parse` (parse)
29
29
  - `pm_regexp_named_capture_group_names` (regexp)
30
30
 
31
31
  Respectively, fuzzing can be performed with
data/docs/javascript.md ADDED
@@ -0,0 +1,90 @@
1
+ # JavaScript
2
+
3
+ Prism provides bindings to JavaScript out of the box.
4
+
5
+ ## Node
6
+
7
+ To use the package from node, install the `@ruby/prism` dependency:
8
+
9
+ ```sh
10
+ npm install @ruby/prism
11
+ ```
12
+
13
+ Then import the package:
14
+
15
+ ```js
16
+ import { loadPrism } from "@ruby/prism";
17
+ ```
18
+
19
+ Then call the load function to get a parse function:
20
+
21
+ ```js
22
+ const parse = await loadPrism();
23
+ ```
24
+
25
+ ## Browser
26
+
27
+ To use the package from the browser, you will need to do some additional work. The [javascript/example.html](javascript/example.html) file shows an example of running Prism in the browser. You will need to instantiate the WebAssembly module yourself and then pass it to the `parsePrism` function.
28
+
29
+ First, get a shim for WASI since not all browsers support it yet.
30
+
31
+ ```js
32
+ import { WASI } from "https://unpkg.com/@bjorn3/browser_wasi_shim@latest/dist/index.js";
33
+ ```
34
+
35
+ Next, import the `parsePrism` function from `@ruby/prism`, either through a CDN or by bundling it with your application.
36
+
37
+ ```js
38
+ import { parsePrism } from "https://unpkg.com/@ruby/prism@latest/src/parsePrism.js";
39
+ ```
40
+
41
+ Next, fetch and instantiate the WebAssembly module. You can access it through a CDN or by bundling it with your application.
42
+
43
+ ```js
44
+ const wasm = await WebAssembly.compileStreaming(fetch("https://unpkg.com/@ruby/prism@latest/src/prism.wasm"));
45
+ ```
46
+
47
+ Next, instantiate the module and initialize WASI.
48
+
49
+ ```js
50
+ const wasi = new WASI([], [], []);
51
+ const instance = await WebAssembly.instantiate(wasm, { wasi_snapshot_preview1: wasi.wasiImport });
52
+ wasi.initialize(instance);
53
+ ```
54
+
55
+ Finally, you can create a function that will parse a string of Ruby code.
56
+
57
+ ```js
58
+ function parse(source) {
59
+ return parsePrism(instance.exports, source);
60
+ }
61
+ ```
62
+
63
+ ## API
64
+
65
+ Now that we have access to a `parse` function, we can use it to parse Ruby code:
66
+
67
+ ```js
68
+ const parseResult = parse("1 + 2");
69
+ ```
70
+
71
+ A ParseResult object is very similar to the Prism::ParseResult object from Ruby. It has the same properties: `value`, `comments`, `magicComments`, `errors`, and `warnings`. Here we can serialize the AST to JSON.
72
+
73
+ ```js
74
+ console.log(JSON.stringify(parseResult.value, null, 2));
75
+ ```
76
+
77
+ ## Building
78
+
79
+ To build the WASM package yourself, first obtain a copy of `wasi-sdk`. You can retrieve this here: <https://github.com/WebAssembly/wasi-sdk>. Next, run:
80
+
81
+ ```sh
82
+ make wasm WASI_SDK_PATH=path/to/wasi-sdk
83
+ ```
84
+
85
+ This will generate `javascript/src/prism.wasm`. From there, you can run the tests to verify everything was generated correctly.
86
+
87
+ ```sh
88
+ cd javascript
89
+ node test
90
+ ```
data/docs/releasing.md ADDED
@@ -0,0 +1,27 @@
1
+ # Releasing
2
+
3
+ To release a new version of Prism, perform the following steps:
4
+
5
+ ## Preparation
6
+
7
+ * Update the CHANGELOG.md file.
8
+ * Add a new section for the new version at the top of the file.
9
+ * Fill in the relevant changes — it may be easiest to click the link for the `Unreleased` heading to find the commits.
10
+ * Update the links at the bottom of the file.
11
+ * Update the version in the following files:
12
+ * `prism.gemspec` in the `Gem::Specification#version=` method call
13
+ * `ext/prism/extension.h` in the `EXPECTED_PRISM_VERSION` macro
14
+ * `include/prism/version.h` in the version macros
15
+ * `javascript/package.json` in the `version` field
16
+ * `rust/prism-sys/tests/utils_tests.rs` in the `version_test` function
17
+ * `templates/java/org/prism/Loader.java.erb` in the `load` function
18
+ * `templates/javascript/src/deserialize.js.erb` in the version constants
19
+ * `templates/lib/prism/serialize.rb.erb` in the version constants
20
+ * Run `bundle install` to update the `Gemfile.lock` file.
21
+ * Update `rust/prism-sys/Cargo.toml` to match the new version and run `cargo build`
22
+ * Update `rust/prism/Cargo.toml` to match the new version and run `cargo build`
23
+ * Commit all of the updated files.
24
+
25
+ ## Publishing
26
+
27
+ * Run `bundle exec rake release` to publish the gem to [rubygems.org](https://rubygems.org). Note that you must have access to the `prism` gem to do this.
data/docs/ruby_api.md CHANGED
@@ -23,3 +23,5 @@ The full API is documented below.
23
23
  * `Prism.parse_lex(source)` - parse the syntax tree corresponding to the given source string and return it within a parse result, along with the tokens
24
24
  * `Prism.parse_lex_file(filepath)` - parse the syntax tree corresponding to the given source file and return it within a parse result, along with the tokens
25
25
  * `Prism.load(source, serialized)` - load the serialized syntax tree using the source as a reference into a syntax tree
26
+ * `Prism.parse_comments(source)` - parse the comments corresponding to the given source string and return them
27
+ * `Prism.parse_file_comments(filepath)` - parse the comments corresponding to the given source file and return them
data/docs/serialization.md CHANGED
@@ -72,6 +72,7 @@ The header is structured like the following table:
72
72
  | `1` | patch version number |
73
73
  | `1` | 1 indicates only semantics fields were serialized, 0 indicates all fields were serialized (including location fields) |
74
74
  | string | the encoding name |
75
+ | varint | the start line |
75
76
  | varint | number of comments |
76
77
  | comment* | comments |
77
78
  | varint | number of magic comments |
@@ -136,56 +137,54 @@ typedef struct {
136
137
  size_t capacity;
137
138
  } pm_buffer_t;
138
139
 
139
- // Initialize a pm_buffer_t with its default values.
140
- bool pm_buffer_init(pm_buffer_t *);
141
-
142
140
  // Free the memory associated with the buffer.
143
141
  void pm_buffer_free(pm_buffer_t *);
144
142
 
145
143
  // Parse and serialize the AST represented by the given source to the given
146
144
  // buffer.
147
- void pm_parse_serialize(const uint8_t *source, size_t length, pm_buffer_t *buffer, const char *metadata);
145
+ void pm_serialize_parse(pm_buffer_t *buffer, const uint8_t *source, size_t length, const char *data);
148
146
  ```
149
147
 
150
- Typically you would use a stack-allocated `pm_buffer_t` and call `pm_parse_serialize`, as in:
148
+ Typically you would use a stack-allocated `pm_buffer_t` and call `pm_serialize_parse`, as in:
151
149
 
152
150
  ```c
153
151
  void
154
152
  serialize(const uint8_t *source, size_t length) {
155
- pm_buffer_t buffer;
156
- if (!pm_buffer_init(&buffer)) return;
153
+ pm_buffer_t buffer = { 0 };
154
+ pm_serialize_parse(&buffer, source, length, NULL);
157
155
 
158
- pm_parse_serialize(source, length, &buffer, NULL);
159
156
  // Do something with the serialized string.
160
157
 
161
158
  pm_buffer_free(&buffer);
162
159
  }
163
160
  ```
164
161
 
165
- The final argument to `pm_parse_serialize` controls the metadata of the source.
166
- This includes the filepath that the source is associated with, and any nested local variables scopes that are necessary to properly parse the file (in the case of parsing an `eval`).
167
- Note that no `varint` are used here to make it easier to produce the metadata for the caller, and also serialized size is less important here.
168
- The metadata is a serialized format itself, and is structured as follows:
162
+ The final argument to `pm_serialize_parse` is an optional string that controls the options passed to the parse function. It can carry all of the normal options that could be passed to `pm_parser_init` through a `pm_options_t` struct, serialized as a string to make them easier to supply through FFI. Note that no `varint`s are used here, which makes the data easier for the caller to produce; serialized size is also less important here. The data is structured as follows:
169
163
 
170
- | # bytes | field |
171
- | --- | --- |
172
- | `4` | the size of the filepath string |
173
- | | the filepath string |
174
- | `4` | the number of local variable scopes |
164
+ | # bytes | field |
165
+ | ------- | -------------------------- |
166
+ | `4` | the length of the filepath |
167
+ | ... | the filepath bytes |
168
+ | `4` | the line number |
169
+ | `4` | the length of the encoding |
170
+ | ... | the encoding bytes |
171
+ | `1` | frozen string literal |
172
+ | `1` | suppress warnings |
173
+ | `4` | the number of scopes |
174
+ | ... | the scopes |
175
175
 
176
- Then, each local variable scope is encoded as:
176
+ Each scope is laid out as follows:
177
177
 
178
- | # bytes | field |
179
- | --- | --- |
180
- | `4` | the number of local variables in the scope |
181
- | | the local variables |
178
+ | # bytes | field |
179
+ | ------- | -------------------------- |
180
+ | `4` | the number of locals |
181
+ | ... | the locals |
182
182
 
183
- Each local variable within each scope is encoded as:
183
+ Each local is laid out as follows:
184
184
 
185
- | # bytes | field |
186
- | --- | --- |
187
- | `4` | the size of the local variable name |
188
- | | the local variable name |
185
+ | # bytes | field |
186
+ | ------- | -------------------------- |
187
+ | `4` | the length of the local |
188
+ | ... | the local bytes |
189
189
 
190
- The metadata can be `NULL` (as seen in the example above).
191
- If it is not null, then a minimal metadata string would be `"\0\0\0\0\0\0\0\0"` which would use 4 bytes to indicate an empty filepath string and 4 bytes to indicate that there were no local variable scopes.
190
+ The data can be `NULL` (as seen in the example above).
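
The option layout in the tables above can be sketched from the Ruby side as follows. This is a minimal illustration, not the library's own code: `pack_options` is a hypothetical helper, the default line number of 1 is an assumption, and so is the byte order, since the tables give only field widths. The sketch uses native-endian unsigned 32-bit integers (the `"L"` pack directive).

```ruby
# Pack parse options into the byte layout from the tables above.
# Assumption: every 4-byte field is a native-endian unsigned 32-bit integer
# (pack directive "L"); the tables specify only the widths.
def pack_options(filepath: "", line: 1, encoding: "",
                 frozen_string_literal: false, suppress_warnings: false,
                 scopes: [])
  data = "".b
  data << [filepath.bytesize].pack("L") << filepath.b
  data << [line].pack("L")
  data << [encoding.bytesize].pack("L") << encoding.b
  data << [frozen_string_literal ? 1 : 0, suppress_warnings ? 1 : 0].pack("CC")
  data << [scopes.length].pack("L")

  # Each scope is a count of locals followed by length-prefixed local names.
  scopes.each do |locals|
    data << [locals.length].pack("L")
    locals.each do |local|
      name = local.to_s
      data << [name.bytesize].pack("L") << name.b
    end
  end

  data
end

minimal = pack_options
with_scope = pack_options(filepath: "eval.rb", scopes: [[:foo, :bar]])
```

Under these assumptions the smallest non-`NULL` string is 18 bytes: three 4-byte fields, two 1-byte flags, and a 4-byte scope count, with empty filepath and encoding.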