prism 0.13.0

Files changed (95)
  1. checksums.yaml +7 -0
  2. data/CHANGELOG.md +172 -0
  3. data/CODE_OF_CONDUCT.md +76 -0
  4. data/CONTRIBUTING.md +62 -0
  5. data/LICENSE.md +7 -0
  6. data/Makefile +84 -0
  7. data/README.md +89 -0
  8. data/config.yml +2481 -0
  9. data/docs/build_system.md +74 -0
  10. data/docs/building.md +22 -0
  11. data/docs/configuration.md +60 -0
  12. data/docs/design.md +53 -0
  13. data/docs/encoding.md +117 -0
  14. data/docs/fuzzing.md +93 -0
  15. data/docs/heredocs.md +36 -0
  16. data/docs/mapping.md +117 -0
  17. data/docs/ripper.md +36 -0
  18. data/docs/ruby_api.md +25 -0
  19. data/docs/serialization.md +181 -0
  20. data/docs/testing.md +55 -0
  21. data/ext/prism/api_node.c +4725 -0
  22. data/ext/prism/api_pack.c +256 -0
  23. data/ext/prism/extconf.rb +136 -0
  24. data/ext/prism/extension.c +626 -0
  25. data/ext/prism/extension.h +18 -0
  26. data/include/prism/ast.h +1932 -0
  27. data/include/prism/defines.h +45 -0
  28. data/include/prism/diagnostic.h +231 -0
  29. data/include/prism/enc/pm_encoding.h +95 -0
  30. data/include/prism/node.h +41 -0
  31. data/include/prism/pack.h +141 -0
  32. data/include/prism/parser.h +418 -0
  33. data/include/prism/regexp.h +19 -0
  34. data/include/prism/unescape.h +48 -0
  35. data/include/prism/util/pm_buffer.h +51 -0
  36. data/include/prism/util/pm_char.h +91 -0
  37. data/include/prism/util/pm_constant_pool.h +78 -0
  38. data/include/prism/util/pm_list.h +67 -0
  39. data/include/prism/util/pm_memchr.h +14 -0
  40. data/include/prism/util/pm_newline_list.h +61 -0
  41. data/include/prism/util/pm_state_stack.h +24 -0
  42. data/include/prism/util/pm_string.h +61 -0
  43. data/include/prism/util/pm_string_list.h +25 -0
  44. data/include/prism/util/pm_strpbrk.h +29 -0
  45. data/include/prism/version.h +4 -0
  46. data/include/prism.h +82 -0
  47. data/lib/prism/compiler.rb +465 -0
  48. data/lib/prism/debug.rb +157 -0
  49. data/lib/prism/desugar_compiler.rb +206 -0
  50. data/lib/prism/dispatcher.rb +2051 -0
  51. data/lib/prism/dsl.rb +750 -0
  52. data/lib/prism/ffi.rb +251 -0
  53. data/lib/prism/lex_compat.rb +838 -0
  54. data/lib/prism/mutation_compiler.rb +718 -0
  55. data/lib/prism/node.rb +14540 -0
  56. data/lib/prism/node_ext.rb +55 -0
  57. data/lib/prism/node_inspector.rb +68 -0
  58. data/lib/prism/pack.rb +185 -0
  59. data/lib/prism/parse_result/comments.rb +172 -0
  60. data/lib/prism/parse_result/newlines.rb +60 -0
  61. data/lib/prism/parse_result.rb +266 -0
  62. data/lib/prism/pattern.rb +239 -0
  63. data/lib/prism/ripper_compat.rb +174 -0
  64. data/lib/prism/serialize.rb +662 -0
  65. data/lib/prism/visitor.rb +470 -0
  66. data/lib/prism.rb +64 -0
  67. data/prism.gemspec +113 -0
  68. data/src/diagnostic.c +287 -0
  69. data/src/enc/pm_big5.c +52 -0
  70. data/src/enc/pm_euc_jp.c +58 -0
  71. data/src/enc/pm_gbk.c +61 -0
  72. data/src/enc/pm_shift_jis.c +56 -0
  73. data/src/enc/pm_tables.c +507 -0
  74. data/src/enc/pm_unicode.c +2324 -0
  75. data/src/enc/pm_windows_31j.c +56 -0
  76. data/src/node.c +2633 -0
  77. data/src/pack.c +493 -0
  78. data/src/prettyprint.c +2136 -0
  79. data/src/prism.c +14587 -0
  80. data/src/regexp.c +580 -0
  81. data/src/serialize.c +1899 -0
  82. data/src/token_type.c +349 -0
  83. data/src/unescape.c +637 -0
  84. data/src/util/pm_buffer.c +103 -0
  85. data/src/util/pm_char.c +272 -0
  86. data/src/util/pm_constant_pool.c +252 -0
  87. data/src/util/pm_list.c +41 -0
  88. data/src/util/pm_memchr.c +33 -0
  89. data/src/util/pm_newline_list.c +134 -0
  90. data/src/util/pm_state_stack.c +19 -0
  91. data/src/util/pm_string.c +200 -0
  92. data/src/util/pm_string_list.c +29 -0
  93. data/src/util/pm_strncasecmp.c +17 -0
  94. data/src/util/pm_strpbrk.c +66 -0
  95. metadata +138 -0
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 290f2191f3700e2584cc320e6dd99f78cfbd2cfd82006700f0bf977bfdc30424
4
+ data.tar.gz: 2886b0f5553307c1cd5d5e963ac03ea66330e2fa7d26cc4684d8f556e52520f5
5
+ SHA512:
6
+ metadata.gz: 24dc8c15f9693b120e8dfcdf8edf2d3a396e6eb636589eded6dac5b651cc62beee1f9c6464d49b1c93b5666e963755bfc4eb041a020ca9585ab7a0bb10172787
7
+ data.tar.gz: 272d11c30527f924644c3c97ac2895167798774007919d429b03325926f1853422a56463e7272ae02d4855f54dd1f775ce546065d2d2c78b1187192cce722e48
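The digests above can be compared against the two archive members of an unpacked `.gem` file. What follows is a minimal sketch, assuming the member contents have already been read into memory. `verify_checksums` and its `members` argument are hypothetical names for illustration, not part of RubyGems.

```ruby
require "digest"
require "yaml"

# Hypothetical helper: compares the SHA256/SHA512 digests listed in a
# checksums.yaml document against the raw bytes of each archive member.
# `members` maps a member name (e.g. "data.tar.gz") to its contents.
# Returns a list of [algorithm, member, reason] triples; empty means all match.
def verify_checksums(checksums_yaml, members)
  expected = YAML.safe_load(checksums_yaml)
  algorithms = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }

  mismatches = []
  algorithms.each do |name, digest_class|
    (expected[name] || {}).each do |member, hex|
      unless members.key?(member)
        mismatches << [name, member, :missing]
        next
      end

      actual = digest_class.hexdigest(members[member])
      mismatches << [name, member, :mismatch] unless actual == hex.to_s
    end
  end
  mismatches
end
```

An empty result means every listed digest matched. Any entry names the algorithm and member that failed, which is usually enough to tell a truncated download from a tampered one.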
data/CHANGELOG.md ADDED
@@ -0,0 +1,172 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
6
+
7
+ ## [Unreleased]
8
+
9
+ ## [0.13.0] - 2023-09-29
10
+
11
+ ### Added
12
+
13
+ - `BEGIN {}` blocks are only allowed at the top-level, and will now provide a syntax error if they are not.
14
+ - Numbered parameters are not allowed in block parameters, and will now provide a syntax error if they are.
15
+ - Many more Ruby modules and classes are now documented. Also, many have been moved into their own files and autoloaded so that initial boot time of the gem is much faster.
16
+ - `PM_TOKEN_METHOD_NAME` is introduced, used to indicate an identifier that is definitely a method name because it has an `!` or `?` at the end.
17
+ - In the C API, arrays, assocs, and hashes now can have the `PM_NODE_FLAG_STATIC_LITERAL` flag attached if they can be compiled statically. This is used in CRuby, for example, to determine if a `duphash`/`duparray` instruction can be used as opposed to a `newhash`/`newarray`.
18
+ - `Node#type` is introduced, which returns a symbol representing the type of the node. This is useful for case comparisons when you have to compare against multiple types.
19
+
20
+ ### Changed
21
+
22
+ - **BREAKING**: Everything has been renamed to `prism` instead of `yarp`. The `yp_`/`YP_` prefix in the C API has been changed to `pm_`/`PM_`. For the most part, everything should be find/replaceable.
23
+ - **BREAKING**: `BlockArgumentNode` nodes now go into the `block` field on `CallNode` nodes, in addition to the `BlockNode` nodes that used to be there. Hopefully this makes it more consistent to compile/deal with in general, but it does mean it can be a surprising breaking change.
24
+ - Escaped whitespace in `%w` lists is now properly unescaped.
25
+ - `Node#pretty_print` now respects pretty print indentation.
26
+ - `Dispatcher` was previously firing `_leave` events in the incorrect order. This has now been fixed.
27
+ - **BREAKING**: `Visitor` has now been split into `Visitor` and `Compiler`. The visitor visits nodes but doesn't return anything from the visit methods. It is suitable for taking action based on the tree, but not manipulating the tree itself. The `Compiler` visits nodes and returns the computed value up the tree. It is suitable for compiling the tree into another format. As such, `MutationVisitor` has been renamed to `MutationCompiler`.
28
+
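The `Visitor`/`Compiler` split described in the 0.13.0 entry can be illustrated with a toy tree. This is only a sketch: `Int`, `Add`, `CountingVisitor`, and `EvalCompiler` are hypothetical stand-ins and do not use the real `Prism::Visitor` or `Prism::Compiler` classes.

```ruby
# Toy nodes, hypothetical stand-ins for real syntax tree nodes.
Int = Struct.new(:value) do
  def accept(visitor)
    visitor.visit_int(self)
  end

  def child_nodes
    []
  end
end

Add = Struct.new(:left, :right) do
  def accept(visitor)
    visitor.visit_add(self)
  end

  def child_nodes
    [left, right]
  end
end

# Visitor style: walks the tree for side effects; return values are ignored.
class CountingVisitor
  attr_reader :ints

  def initialize
    @ints = 0
  end

  def visit_int(_node)
    @ints += 1
  end

  def visit_add(node)
    node.child_nodes.each { |child| child.accept(self) }
  end
end

# Compiler style: each visit method returns a value that its parent combines.
class EvalCompiler
  def visit_int(node)
    node.value
  end

  def visit_add(node)
    node.left.accept(self) + node.right.accept(self)
  end
end

tree = Add.new(Int.new(1), Add.new(Int.new(2), Int.new(3)))
counter = CountingVisitor.new
tree.accept(counter)
counter.ints                  # => 3
tree.accept(EvalCompiler.new) # => 6
```

The visitor accumulates state on itself, while the compiler threads results back up through return values, which is why the latter suits translating a tree into another format.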
29
+ ## [0.12.0] - 2023-09-15
30
+
31
+ ### Added
32
+
33
+ - `RegularExpressionNode#options` and `InterpolatedRegularExpressionNode#options` are now provided. These return integers that match up to the `Regexp#options` API.
34
+ - Greatly improved `Node#inspect` and `Node#pretty_print` APIs.
35
+ - `MatchLastLineNode` and `InterpolatedMatchLastLineNode` are introduced to represent using a regular expression as the predicate of an `if` or `unless` statement.
36
+ - `IntegerNode` now has a base flag on it.
37
+ - Heredocs that were previously `InterpolatedStringNode` and `InterpolatedXStringNode` nodes without any actual interpolation are now `StringNode` and `XStringNode`, respectively.
38
+ - `StringNode` now has a `frozen?` flag on it, which respects the `frozen_string_literal` magic comment.
39
+ - Numbered parameters are now supported, and are properly represented using `LocalVariableReadNode` nodes.
40
+ - `ImplicitNode` is introduced, which wraps implicit calls, local variable reads, or constant reads in omitted hash values.
41
+ - `YARP::Dispatcher` is introduced, which provides a way for multiple objects to listen for certain events on the AST while it is being walked. This is effectively a way to implement a more efficient visitor pattern when you have many different uses for the AST.
42
+
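The dispatcher idea above, several listeners served by a single walk of the tree, can be sketched without the library. `ToyNode`, `ToyDispatcher`, `CallLogger`, and the `on_<type>_enter`/`on_<type>_leave` event names are assumptions made for this illustration and are not taken from the real `Dispatcher` class.

```ruby
# Hypothetical minimal node: a type symbol plus child nodes.
ToyNode = Struct.new(:type, :children)

class ToyDispatcher
  def initialize
    @listeners = {}
  end

  # Register a listener for events such as :on_call_enter or :on_call_leave.
  def register(listener, *events)
    events.each { |event| (@listeners[event] ||= []) << listener }
  end

  # Walk the tree once, notifying every listener registered for each event.
  # Leave events fire after all children have been dispatched.
  def dispatch(node)
    fire(:"on_#{node.type}_enter", node)
    node.children.each { |child| dispatch(child) }
    fire(:"on_#{node.type}_leave", node)
  end

  private

  def fire(event, node)
    @listeners.fetch(event, []).each { |listener| listener.public_send(event, node) }
  end
end

# Example listener that records the order of call events.
class CallLogger
  attr_reader :log

  def initialize
    @log = []
  end

  def on_call_enter(_node)
    @log << :enter
  end

  def on_call_leave(_node)
    @log << :leave
  end
end

tree = ToyNode.new(:call, [ToyNode.new(:integer, []), ToyNode.new(:call, [])])
logger = CallLogger.new
dispatcher = ToyDispatcher.new
dispatcher.register(logger, :on_call_enter, :on_call_leave)
dispatcher.dispatch(tree)
logger.log # => [:enter, :enter, :leave, :leave]
```

Because listeners subscribe only to the events they care about, adding another consumer does not add another traversal.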
43
+ ### Changed
44
+
45
+ - **BREAKING**: Flags fields are now marked as private, to ensure we can change their implementation under the hood. Flags should be queried through the accessor methods instead.
46
+ - **BREAKING**: `AliasNode` is now split into `AliasMethodNode` and `AliasGlobalVariableNode`.
47
+ - Method definitions on local variables are now correctly handled.
48
+ - Unary minus precedence has been fixed.
49
+ - Concatenating character literals with string literals is now fixed.
50
+ - Many more invalid syntaxes are now properly rejected.
51
+ - **BREAKING**: Comments now no longer include their trailing newline.
52
+
53
+ ## [0.11.0] - 2023-09-08
54
+
55
+ ### Added
56
+
57
+ - `Node#inspect` is much improved.
58
+ - `YARP::Pattern` is introduced, which can construct procs to match against nodes.
59
+ - `BlockLocalVariableNode` is introduced to take the place of the locations array on `BlockParametersNode`.
60
+ - `ParseResult#attach_comments!` is now provided to attach comments to locations in the tree.
61
+ - `MultiTargetNode` is introduced as the target of multi writes and for loops.
62
+ - `Node#comment_targets` is introduced to return the list of objects that can have attached comments.
63
+
64
+ ### Changed
65
+
66
+ - **BREAKING**: `GlobalVariable*Node#name` now returns a symbol.
67
+ - **BREAKING**: `Constant*Node#name` now returns a symbol.
68
+ - **BREAKING**: `BlockParameterNode`, `KeywordParameterNode`, `KeywordRestParameterNode`, `RestParameterNode`, `DefNode` all have their `name` methods returning symbols now.
69
+ - **BREAKING**: `ClassNode#name` and `ModuleNode#name` now return symbols.
70
+ - **BREAKING**: `Location#end_column` is now exclusive instead of inclusive.
71
+ - `Location#slice` now returns a properly encoded string.
72
+ - `CallNode#operator_loc` is now `CallNode#call_operator_loc`.
73
+ - `CallOperatorAndWriteNode` is renamed to `CallAndWriteNode` and its structure has changed.
74
+ - `CallOperatorOrWriteNode` is renamed to `CallOrWriteNode` and its structure has changed.
75
+
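The practical effect of an exclusive `end_column` is that a location behaves like a half-open range. The sketch below uses a hypothetical `ToySpan` struct and assumes 0-based columns on a single line; it does not use the real `Location` class.

```ruby
# Hypothetical single-line span described by 0-based columns.
ToySpan = Struct.new(:start_column, :end_column) do
  # With an exclusive end column the slice is a half-open range.
  def slice(line)
    line[start_column...end_column]
  end

  # The width is a plain subtraction, with no "+ 1" adjustment.
  def width
    end_column - start_column
  end
end

line = "foo + bar"
span = ToySpan.new(6, 9)
span.slice(line) # => "bar"
span.width       # => 3
```

Code written against the earlier inclusive convention would have used `end_column + 1` as the upper bound, so callers upgrading across this change may need to drop that adjustment.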
76
+ ## [0.10.0] - 2023-09-01
77
+
78
+ ### Added
79
+
80
+ - `InstanceVariable*Node` and `ClassVariable*Node` objects now have their `name` returning a Symbol. This is because they are now part of the constant pool.
81
+ - `NumberedReferenceReadNode` now has a `number` field, which returns an Integer.
82
+
83
+ ### Changed
84
+
85
+ - **BREAKING**: Various `operator_id` and `constant_id` fields have been renamed to `operator` and `name`, respectively. See [09d0a144](https://github.com/ruby/yarp/commit/09d0a144dfd519c5b5f96f0b6ee95d256e2cb1a6) for details.
86
+ - `%w`, `%W`, `%i`, `%I`, `%q`, and `%Q` literals can now span around the contents of a heredoc.
87
+ - **BREAKING**: All of the public C APIs that accept the source string now accept `const uint8_t *` as opposed to `const char *`.
88
+
89
+ ## [0.9.0] - 2023-08-25
90
+
91
+ ### Added
92
+
93
+ - Regular expressions can now be bound by `\n`, `\r`, and a combination of `\r\n`.
94
+ - Strings delimited by `%`, `%q`, and `%Q` can now be bound by `\n`, `\r`, and a combination of `\r\n`.
95
+ - `IntegerNode#value` now returns the value of the integer as a Ruby `Integer`.
96
+ - `FloatNode#value` now returns the value of the float as a Ruby `Float`.
97
+ - `RationalNode#value` now returns the value of the rational as a Ruby `Rational`.
98
+ - `ImaginaryNode#value` now returns the value of the imaginary as a Ruby `Complex`.
99
+ - `ClassNode#name` is now a string that returns the name of just the class, without the namespace.
100
+ - `ModuleNode#name` is now a string that returns the name of just the module, without the namespace.
101
+ - Regular expressions and strings found after a heredoc declaration but before the heredoc body are now parsed correctly.
102
+ - The serialization API now supports shared strings, which should help reduce the size of the serialized AST.
103
+ - `*Node#copy` is introduced, which returns a copy of the node with the given overrides.
104
+ - `Location#copy` is introduced, which returns a copy of the location with the given overrides.
105
+ - `DesugarVisitor` is introduced, which provides a simpler AST for use in tools that want to process fewer node types.
106
+ - `{ClassVariable,Constant,ConstantPath,GlobalVariable,InstanceVariable,LocalVariable}TargetNode` are introduced. These nodes represent the target of writes in locations where a value cannot be provided, like a multi write or a rescue reference.
107
+ - `UntilNode#closing_loc` and `WhileNode#closing_loc` are now provided.
108
+ - `Location#join` is now provided, which joins two locations together.
109
+ - `YARP::parse_lex` and `YARP::parse_lex_file` are introduced to parse and lex in one result.
110
+
111
+ ### Changed
112
+
113
+ - When there is a magic encoding comment, the encoding of the first token's source string is now properly reencoded.
114
+ - Constants followed by unary `&` are now properly parsed as a call with a passed block argument.
115
+ - Escaping multi-byte characters in a string literal will now properly escape the entire character.
116
+ - `YARP.lex_compat` now has more accurate behavior when a byte-order mark is present in the file.
117
+ - **BREAKING**: `AndWriteNode`, `OrWriteNode`, and `OperatorWriteNode` have been split back up into their `0.7.0` versions.
118
+ - We now properly support spaces between the `encoding` and `=`/`:` in a magic encoding comment.
119
+ - We now properly parse `-> foo: bar do end`.
120
+
121
+ ## [0.8.0] - 2023-08-18
122
+
123
+ ### Added
124
+
125
+ - Some performance improvements when converting from the C AST to the Ruby AST.
126
+ - Two Rust crates have been added: `yarp-sys` and `yarp`. They are as yet unpublished.
127
+
128
+ ### Changed
129
+
130
+ - Escaped newlines in strings and heredocs are now handled more correctly.
131
+ - Dedenting heredocs that result in empty string nodes will now drop those string nodes from the list.
132
+ - Beginless and endless ranges in conditional expressions now properly form a flip flop node.
133
+ - `%` at the end of files no longer crashes.
134
+ - Location information has been corrected for `if/elsif` chains that have no `else`.
135
+ - `__END__` at the very end of the file was previously parsed as an identifier, but is now correct.
136
+ - **BREAKING**: Nodes that reference `&&=`, `||=`, and other writing operators have been consolidated. Previously, they were separate individual nodes. Now they are a tree with the target being the left-hand side and the value being the right-hand side with a joining `AndWriteNode`, `OrWriteNode`, or `OperatorWriteNode` in the middle. This impacts all of the nodes that match this pattern: `{ClassVariable,Constant,ConstantPath,GlobalVariable,InstanceVariable,LocalVariable}Operator{And,Or,}WriteNode`.
137
+ - **BREAKING**: `BlockParametersNode`, `ClassNode`, `DefNode`, `LambdaNode`, `ModuleNode`, `ParenthesesNode`, and `SingletonClassNode` have had their `statements` field renamed to `body` to give a hint that it might not be a `StatementsNode` (it could also be a `BeginNode`).
138
+
139
+ ## [0.7.0] - 2023-08-14
140
+
141
+ ### Added
142
+
143
+ - We now have an explicit `FlipFlopNode`. It has the same flags as `RangeNode`.
144
+ - We now have a syntax error when implicit and explicit blocks are passed to a method call.
145
+ - `Node#slice` is now implemented, for retrieving the slice of the source code corresponding to a node.
146
+ - We now support the `utf8-mac` encoding.
147
+ - Predicate methods have been added for nodes that have flags. For example `CallNode#safe_navigation?` and `RangeNode#exclude_end?`.
148
+ - The gem now functions on JRuby and TruffleRuby, thanks to a new FFI backend.
149
+ - Comments are now part of the serialization API.
150
+
151
+ ### Changed
152
+
153
+ - Autotools has been removed from the build system, so when the gem is installed it will no longer need to go through a configure step.
154
+ - The AST for `foo = *bar` has changed to have an explicit array on the right hand side, rather than a splat node. This is more consistent with how other parsers handle this.
155
+ - **BREAKING**: `RangeNodeFlags` has been renamed to `RangeFlags`.
156
+ - Unary minus on number literals is now parsed as part of the literal, rather than a call to a unary operator. This is more consistent with how other parsers handle this.
157
+
158
+ ## [0.6.0] - 2023-08-09
159
+
160
+ ### Added
161
+
162
+ - 🎉 Initial release! 🎉
163
+
164
+ [unreleased]: https://github.com/ruby/prism/compare/v0.13.0...HEAD
165
+ [0.13.0]: https://github.com/ruby/prism/compare/v0.12.0...v0.13.0
166
+ [0.12.0]: https://github.com/ruby/prism/compare/v0.11.0...v0.12.0
167
+ [0.11.0]: https://github.com/ruby/prism/compare/v0.10.0...v0.11.0
168
+ [0.10.0]: https://github.com/ruby/prism/compare/v0.9.0...v0.10.0
169
+ [0.9.0]: https://github.com/ruby/prism/compare/v0.8.0...v0.9.0
170
+ [0.8.0]: https://github.com/ruby/prism/compare/v0.7.0...v0.8.0
171
+ [0.7.0]: https://github.com/ruby/prism/compare/v0.6.0...v0.7.0
172
+ [0.6.0]: https://github.com/ruby/prism/compare/d60531...v0.6.0
data/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,76 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, sex characteristics, gender identity and expression,
9
+ level of experience, education, socio-economic status, nationality, personal
10
+ appearance, race, religion, or sexual identity and orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ - Using welcoming and inclusive language
18
+ - Being respectful of differing viewpoints and experiences
19
+ - Gracefully accepting constructive criticism
20
+ - Focusing on what is best for the community
21
+ - Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ - The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ - Trolling, insulting/derogatory comments, and personal or political attacks
28
+ - Public or private harassment
29
+ - Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ - Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at opensource@shopify.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
72
+
73
+ [homepage]: https://www.contributor-covenant.org
74
+
75
+ For answers to common questions about this code of conduct, see
76
+ https://www.contributor-covenant.org/faq
data/CONTRIBUTING.md ADDED
@@ -0,0 +1,62 @@
1
+ # Contributing
2
+
3
+ Thank you for your interest in contributing to prism! Below are a couple of ways that you can help out.
4
+
5
+ ## Discussions
6
+
7
+ The discussions page on the GitHub repository is open. If you have a question or want to discuss the project, feel free to open a new discussion or comment on an existing one. This is the best place to ask questions about the project.
8
+
9
+ ## Code
10
+
11
+ If you want to contribute code, please first open or contribute to a discussion. A lot of the project is in flux, and we want to make sure that you are contributing to the right place. Once you have a discussion going, you can open a pull request with your changes. We will review your code and get it merged in.
12
+
13
+ ### Ruby Features
14
+
15
+ Pattern matching and endless method definitions should be avoided as long as the latest TruffleRuby release does not support them.
16
+
17
+ ## Tests
18
+
19
+ We could always use more tests! One of the biggest challenges of this project is building up a big test suite. If you want to contribute tests, feel free to open a pull request. These will get merged in as soon as possible.
20
+
21
+ The `test` Rake task will not compile libraries or the C extension, and this is intentional (to make testing against an installed version easier). If you want to test your changes, please make sure you're also running either the default task:
22
+
23
+ ``` sh
24
+ bundle exec rake
25
+ ```
26
+
27
+ or explicitly running the `compile` task:
28
+
29
+ ``` sh
30
+ bundle exec rake compile test
31
+ # or to just compile the C extension ...
32
+ bundle exec rake compile:prism test
33
+ ```
34
+
35
+ To test the Rust bindings (with caveats about setting up your Rust environment properly first):
36
+
37
+ ``` sh
38
+ bundle exec rake compile test:rust
39
+ ```
40
+
41
+
42
+ ## Documentation
43
+
44
+ We could always use more documentation! If you want to contribute documentation, feel free to open a pull request. These will get merged in as soon as possible. Documenting functions or methods is always useful, but we also need more guides and tutorials. If you have an idea for a guide or tutorial, feel free to open an issue and we can discuss it.
45
+
46
+ ## Developing
47
+
48
+ To get `clangd` support in the editor for development, generate the compilation database. This command will
49
+ create an ignored `compile_commands.json` file at the project root, which is used by clangd to provide functionality.
50
+
51
+ You will need `bear` which can be installed on macOS with `brew install bear`.
52
+
53
+ ```sh
54
+ bundle exec rake bear
55
+ ```
56
+
57
+ ## Debugging
58
+
59
+ Some useful rake tasks:
60
+
61
+ - `test:valgrind` runs the test suite under valgrind to look for illegal memory access or memory leaks
62
+ - `test:gdb` and `test:lldb` run the test suite under those debuggers
data/LICENSE.md ADDED
@@ -0,0 +1,7 @@
1
+ Copyright 2022-present, Shopify Inc.
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4
+
5
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6
+
7
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/Makefile ADDED
@@ -0,0 +1,84 @@
1
+
2
+ # V=0 quiet, V=1 verbose. other values don't work.
3
+ V = 0
4
+ V0 = $(V:0=)
5
+ Q1 = $(V:1=)
6
+ Q = $(Q1:0=@)
7
+ ECHO1 = $(V:1=@ :)
8
+ ECHO = $(ECHO1:0=@ echo)
9
+ FUZZ_OUTPUT_DIR = $(shell pwd)/fuzz/output
10
+
11
+ SOEXT := $(shell ruby -e 'puts RbConfig::CONFIG["SOEXT"]')
12
+
13
+ CPPFLAGS := -Iinclude
14
+ CFLAGS := -g -O2 -std=c99 -Wall -Werror -Wextra -Wpedantic -Wundef -Wconversion -fPIC -fvisibility=hidden
15
+ CC := cc
16
+
17
+ HEADERS := $(shell find include -name '*.h')
18
+ SOURCES := $(shell find src -name '*.c')
19
+ SHARED_OBJECTS := $(subst src/,build/shared/,$(SOURCES:.c=.o))
20
+ STATIC_OBJECTS := $(subst src/,build/static/,$(SOURCES:.c=.o))
21
+
22
+ all: shared static
23
+
24
+ shared: build/librubyparser.$(SOEXT)
25
+ static: build/librubyparser.a
26
+
27
+ build/librubyparser.$(SOEXT): $(SHARED_OBJECTS)
28
+ $(ECHO) "linking $@"
29
+ $(Q) $(CC) $(DEBUG_FLAGS) $(CFLAGS) -shared -o $@ $(SHARED_OBJECTS)
30
+
31
+ build/librubyparser.a: $(STATIC_OBJECTS)
32
+ $(ECHO) "building $@"
33
+ $(Q) $(AR) $(ARFLAGS) $@ $(STATIC_OBJECTS) $(Q1:0=>/dev/null)
34
+
35
+ build/shared/%.o: src/%.c Makefile $(HEADERS)
36
+ $(ECHO) "compiling $@"
37
+ $(Q) mkdir -p $(@D)
38
+ $(Q) $(CC) $(DEBUG_FLAGS) -DPRISM_EXPORT_SYMBOLS $(CPPFLAGS) $(CFLAGS) -c -o $@ $<
39
+
40
+ build/static/%.o: src/%.c Makefile $(HEADERS)
41
+ $(ECHO) "compiling $@"
42
+ $(Q) mkdir -p $(@D)
43
+ $(Q) $(CC) $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) -c -o $@ $<
44
+
45
+ build/fuzz.%: $(SOURCES) fuzz/%.c fuzz/fuzz.c
46
+ $(ECHO) "building $* fuzzer"
47
+ $(Q) mkdir -p $(@D)
48
+ $(ECHO) "building main fuzz binary"
49
+ $(Q) AFL_HARDEN=1 afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize-ignorelist=fuzz/asan.ignore -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@ $^
50
+ $(ECHO) "building cmplog binary"
51
+ $(Q) AFL_HARDEN=1 AFL_LLVM_CMPLOG=1 afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize-ignorelist=fuzz/asan.ignore -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@.cmplog $^
52
+
53
+ build/fuzz.heisenbug.%: $(SOURCES) fuzz/%.c fuzz/heisenbug.c
54
+ $(Q) AFL_HARDEN=1 afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize-ignorelist=fuzz/asan.ignore -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@ $^
55
+
56
+ fuzz-debug:
57
+ $(ECHO) "entering debug shell"
58
+ $(Q) docker run -it --rm -e HISTFILE=/prism/fuzz/output/.bash_history -v $(shell pwd):/prism -v $(FUZZ_OUTPUT_DIR):/fuzz_output prism/fuzz
59
+
60
+ fuzz-docker-build: fuzz/docker/Dockerfile
61
+ $(ECHO) "building docker image"
62
+ $(Q) docker build -t prism/fuzz fuzz/docker/
63
+
64
+ fuzz-run-%: FORCE fuzz-docker-build
65
+ $(ECHO) "generating templates"
66
+ $(Q) bundle exec rake templates
67
+ $(ECHO) "running $* fuzzer"
68
+ $(Q) docker run --rm -v $(shell pwd):/prism prism/fuzz /bin/bash -c "FUZZ_FLAGS=\"$(FUZZ_FLAGS)\" make build/fuzz.$*"
69
+ $(ECHO) "starting AFL++ run"
70
+ $(Q) mkdir -p $(FUZZ_OUTPUT_DIR)/$*
71
+ $(Q) docker run -it --rm -v $(shell pwd):/prism -v $(FUZZ_OUTPUT_DIR):/fuzz_output prism/fuzz /bin/bash -c "./fuzz/$*.sh /fuzz_output/$*"
72
+ FORCE:
73
+
74
+ fuzz-clean:
75
+ $(Q) rm -f -r fuzz/output
76
+
77
+ clean:
78
+ $(Q) rm -f -r build
79
+
80
+ .PHONY: clean fuzz-clean
81
+
82
+ all-no-debug: DEBUG_FLAGS := -DNDEBUG=1
83
+ all-no-debug: OPTFLAGS := -O3
84
+ all-no-debug: all
data/README.md ADDED
@@ -0,0 +1,89 @@
1
+ # Prism Ruby parser
2
+
3
+ This is a parser for the Ruby programming language. It is designed to be portable, error tolerant, and maintainable. It is written in C99 and has no dependencies. It is currently being integrated into [CRuby](https://github.com/ruby/ruby), [JRuby](https://github.com/jruby/jruby), [TruffleRuby](https://github.com/oracle/truffleruby), [Sorbet](https://github.com/sorbet/sorbet), and [Syntax Tree](https://github.com/ruby-syntax-tree/syntax_tree).
4
+
5
+ ## Overview
6
+
7
+ The repository contains the infrastructure for both a shared library (librubyparser) and a native CRuby extension. The shared library has no bindings to CRuby itself, and so can be used by other projects. The native CRuby extension links against `ruby.h`, and so is suitable in the context of CRuby.
8
+
9
+ ```
10
+ .
11
+ ├── Makefile configuration to compile the shared library and native tests
12
+ ├── Rakefile configuration to compile the native extension and run the Ruby tests
13
+ ├── bin
14
+ │   ├── lex runs the lexer on a file or string, prints the tokens, and compares to ripper
15
+ │   └── parse runs the parser on a file or string and prints the syntax tree
16
+ ├── config.yml specification for tokens and nodes in the tree
17
+ ├── docs documentation about the project
18
+ ├── ext
19
+ │   └── prism
20
+ │   ├── extconf.rb configuration to generate the Makefile for the native extension
21
+ │   └── extension.c the native extension that interacts with librubyparser
22
+ ├── fuzz files related to fuzz testing
23
+ ├── include
24
+ │   ├── prism header files for the shared library
25
+ │   └── prism.h main header file for the shared library
26
+ ├── java Java bindings for the shared library
27
+ ├── lib
28
+ │   ├── prism Ruby library files
29
+ │   └── prism.rb main entrypoint for the Ruby library
30
+ ├── rakelib various Rake tasks for the project
31
+ ├── rust
32
+ │   ├── prism Rustified crate for the shared library
33
+ │   └── prism-sys FFI binding for Rust
34
+ ├── src
35
+ │   ├── enc various encoding files
36
+ │   ├── util various utility files
37
+ │   └── prism.c main entrypoint for the shared library
38
+ ├── templates contains ERB templates generated by templates/template.rb
39
+ │   └── template.rb generates code from the nodes and tokens configured by config.yml
40
+ └── test
41
+ └── prism
42
+ ├── fixtures Ruby code used for testing
43
+ └── snapshots snapshots of generated syntax trees corresponding to fixtures
44
+ ```
45
+
46
+ ## Getting started
47
+
48
+ To compile the shared library, you will need:
49
+
50
+ * A C99 compiler
51
+ * autotools (autoconf, automake, libtool)
52
+ * make
53
+ * Ruby 3.3.0-preview1 or later
54
+
55
+ Once you have these dependencies, run:
56
+
57
+ ```
58
+ bundle install
59
+ ```
60
+
61
+ to fetch the Ruby dependencies. Finally, run:
62
+
63
+ ```
64
+ rake compile
65
+ ```
66
+
67
+ to compile the shared library. It will be built in the `build` directory. To test that everything is working, run:
68
+
69
+ ```
70
+ bin/parse -e "1 + 2"
71
+ ```
72
+
73
+ to see the syntax tree for the expression `1 + 2`.
74
+
75
+ ## Contributing
76
+
77
+ See the [CONTRIBUTING.md](CONTRIBUTING.md) file for more information. We additionally have documentation about the overall design of the project as well as various subtopics.
78
+
79
+ * [Building](docs/building.md)
80
+ * [Configuration](docs/configuration.md)
81
+ * [Design](docs/design.md)
82
+ * [Encoding](docs/encoding.md)
83
+ * [Fuzzing](docs/fuzzing.md)
84
+ * [Heredocs](docs/heredocs.md)
85
+ * [Mapping](docs/mapping.md)
86
+ * [Ripper](docs/ripper.md)
87
+ * [Ruby API](docs/ruby_api.md)
88
+ * [Serialization](docs/serialization.md)
89
+ * [Testing](docs/testing.md)