regexp_parser 2.1.1 → 2.9.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (167)
  1. checksums.yaml +4 -4
  2. data/Gemfile +6 -5
  3. data/LICENSE +1 -1
  4. data/Rakefile +6 -70
  5. data/lib/regexp_parser/error.rb +1 -1
  6. data/lib/regexp_parser/expression/base.rb +76 -0
  7. data/lib/regexp_parser/expression/classes/alternation.rb +1 -1
  8. data/lib/regexp_parser/expression/classes/anchor.rb +0 -2
  9. data/lib/regexp_parser/expression/classes/{backref.rb → backreference.rb} +18 -3
  10. data/lib/regexp_parser/expression/classes/{set → character_set}/range.rb +2 -7
  11. data/lib/regexp_parser/expression/classes/{set.rb → character_set.rb} +4 -8
  12. data/lib/regexp_parser/expression/classes/{type.rb → character_type.rb} +0 -2
  13. data/lib/regexp_parser/expression/classes/conditional.rb +2 -6
  14. data/lib/regexp_parser/expression/classes/{escape.rb → escape_sequence.rb} +15 -7
  15. data/lib/regexp_parser/expression/classes/free_space.rb +4 -4
  16. data/lib/regexp_parser/expression/classes/group.rb +10 -22
  17. data/lib/regexp_parser/expression/classes/keep.rb +2 -0
  18. data/lib/regexp_parser/expression/classes/literal.rb +1 -5
  19. data/lib/regexp_parser/expression/classes/posix_class.rb +5 -5
  20. data/lib/regexp_parser/expression/classes/root.rb +3 -6
  21. data/lib/regexp_parser/expression/classes/{property.rb → unicode_property.rb} +10 -11
  22. data/lib/regexp_parser/expression/methods/construct.rb +41 -0
  23. data/lib/regexp_parser/expression/methods/human_name.rb +43 -0
  24. data/lib/regexp_parser/expression/methods/match_length.rb +9 -5
  25. data/lib/regexp_parser/expression/methods/negative.rb +20 -0
  26. data/lib/regexp_parser/expression/methods/parts.rb +23 -0
  27. data/lib/regexp_parser/expression/methods/printing.rb +26 -0
  28. data/lib/regexp_parser/expression/methods/strfregexp.rb +1 -1
  29. data/lib/regexp_parser/expression/methods/tests.rb +47 -1
  30. data/lib/regexp_parser/expression/methods/traverse.rb +35 -19
  31. data/lib/regexp_parser/expression/quantifier.rb +55 -24
  32. data/lib/regexp_parser/expression/sequence.rb +11 -31
  33. data/lib/regexp_parser/expression/sequence_operation.rb +4 -9
  34. data/lib/regexp_parser/expression/shared.rb +111 -0
  35. data/lib/regexp_parser/expression/subexpression.rb +26 -18
  36. data/lib/regexp_parser/expression.rb +37 -155
  37. data/lib/regexp_parser/lexer.rb +81 -39
  38. data/lib/regexp_parser/parser.rb +135 -173
  39. data/lib/regexp_parser/scanner/errors/premature_end_error.rb +8 -0
  40. data/lib/regexp_parser/scanner/errors/scanner_error.rb +6 -0
  41. data/lib/regexp_parser/scanner/errors/validation_error.rb +63 -0
  42. data/lib/regexp_parser/scanner/properties/long.csv +651 -0
  43. data/lib/regexp_parser/scanner/properties/short.csv +249 -0
  44. data/lib/regexp_parser/scanner/property.rl +2 -2
  45. data/lib/regexp_parser/scanner/scanner.rl +127 -185
  46. data/lib/regexp_parser/scanner.rb +1185 -1402
  47. data/lib/regexp_parser/syntax/any.rb +2 -7
  48. data/lib/regexp_parser/syntax/base.rb +91 -66
  49. data/lib/regexp_parser/syntax/token/anchor.rb +15 -0
  50. data/lib/regexp_parser/syntax/{tokens → token}/assertion.rb +2 -2
  51. data/lib/regexp_parser/syntax/token/backreference.rb +33 -0
  52. data/lib/regexp_parser/syntax/token/character_set.rb +16 -0
  53. data/lib/regexp_parser/syntax/{tokens → token}/character_type.rb +3 -3
  54. data/lib/regexp_parser/syntax/{tokens → token}/conditional.rb +3 -3
  55. data/lib/regexp_parser/syntax/token/escape.rb +33 -0
  56. data/lib/regexp_parser/syntax/{tokens → token}/group.rb +7 -7
  57. data/lib/regexp_parser/syntax/{tokens → token}/keep.rb +1 -1
  58. data/lib/regexp_parser/syntax/token/meta.rb +20 -0
  59. data/lib/regexp_parser/syntax/{tokens → token}/posix_class.rb +3 -3
  60. data/lib/regexp_parser/syntax/token/quantifier.rb +35 -0
  61. data/lib/regexp_parser/syntax/token/unicode_property.rb +751 -0
  62. data/lib/regexp_parser/syntax/token/virtual.rb +11 -0
  63. data/lib/regexp_parser/syntax/token.rb +45 -0
  64. data/lib/regexp_parser/syntax/version_lookup.rb +17 -34
  65. data/lib/regexp_parser/syntax/versions/1.8.6.rb +13 -20
  66. data/lib/regexp_parser/syntax/versions/1.9.1.rb +10 -17
  67. data/lib/regexp_parser/syntax/versions/1.9.3.rb +3 -10
  68. data/lib/regexp_parser/syntax/versions/2.0.0.rb +8 -15
  69. data/lib/regexp_parser/syntax/versions/2.2.0.rb +3 -9
  70. data/lib/regexp_parser/syntax/versions/2.3.0.rb +3 -9
  71. data/lib/regexp_parser/syntax/versions/2.4.0.rb +3 -9
  72. data/lib/regexp_parser/syntax/versions/2.4.1.rb +2 -8
  73. data/lib/regexp_parser/syntax/versions/2.5.0.rb +3 -9
  74. data/lib/regexp_parser/syntax/versions/2.6.0.rb +3 -9
  75. data/lib/regexp_parser/syntax/versions/2.6.2.rb +3 -9
  76. data/lib/regexp_parser/syntax/versions/2.6.3.rb +3 -9
  77. data/lib/regexp_parser/syntax/versions/3.1.0.rb +4 -0
  78. data/lib/regexp_parser/syntax/versions/3.2.0.rb +4 -0
  79. data/lib/regexp_parser/syntax/versions.rb +4 -2
  80. data/lib/regexp_parser/syntax.rb +2 -2
  81. data/lib/regexp_parser/token.rb +9 -20
  82. data/lib/regexp_parser/version.rb +1 -1
  83. data/lib/regexp_parser.rb +6 -8
  84. data/regexp_parser.gemspec +20 -22
  85. metadata +49 -171
  86. data/CHANGELOG.md +0 -494
  87. data/README.md +0 -479
  88. data/lib/regexp_parser/scanner/properties/long.yml +0 -594
  89. data/lib/regexp_parser/scanner/properties/short.yml +0 -237
  90. data/lib/regexp_parser/syntax/tokens/anchor.rb +0 -15
  91. data/lib/regexp_parser/syntax/tokens/backref.rb +0 -24
  92. data/lib/regexp_parser/syntax/tokens/character_set.rb +0 -13
  93. data/lib/regexp_parser/syntax/tokens/escape.rb +0 -30
  94. data/lib/regexp_parser/syntax/tokens/meta.rb +0 -13
  95. data/lib/regexp_parser/syntax/tokens/quantifier.rb +0 -35
  96. data/lib/regexp_parser/syntax/tokens/unicode_property.rb +0 -675
  97. data/lib/regexp_parser/syntax/tokens.rb +0 -45
  98. data/spec/expression/base_spec.rb +0 -104
  99. data/spec/expression/clone_spec.rb +0 -152
  100. data/spec/expression/conditional_spec.rb +0 -89
  101. data/spec/expression/free_space_spec.rb +0 -27
  102. data/spec/expression/methods/match_length_spec.rb +0 -161
  103. data/spec/expression/methods/match_spec.rb +0 -25
  104. data/spec/expression/methods/strfregexp_spec.rb +0 -224
  105. data/spec/expression/methods/tests_spec.rb +0 -99
  106. data/spec/expression/methods/traverse_spec.rb +0 -161
  107. data/spec/expression/options_spec.rb +0 -128
  108. data/spec/expression/subexpression_spec.rb +0 -50
  109. data/spec/expression/to_h_spec.rb +0 -26
  110. data/spec/expression/to_s_spec.rb +0 -108
  111. data/spec/lexer/all_spec.rb +0 -22
  112. data/spec/lexer/conditionals_spec.rb +0 -53
  113. data/spec/lexer/delimiters_spec.rb +0 -68
  114. data/spec/lexer/escapes_spec.rb +0 -14
  115. data/spec/lexer/keep_spec.rb +0 -10
  116. data/spec/lexer/literals_spec.rb +0 -64
  117. data/spec/lexer/nesting_spec.rb +0 -99
  118. data/spec/lexer/refcalls_spec.rb +0 -60
  119. data/spec/parser/all_spec.rb +0 -43
  120. data/spec/parser/alternation_spec.rb +0 -88
  121. data/spec/parser/anchors_spec.rb +0 -17
  122. data/spec/parser/conditionals_spec.rb +0 -179
  123. data/spec/parser/errors_spec.rb +0 -30
  124. data/spec/parser/escapes_spec.rb +0 -121
  125. data/spec/parser/free_space_spec.rb +0 -130
  126. data/spec/parser/groups_spec.rb +0 -108
  127. data/spec/parser/keep_spec.rb +0 -6
  128. data/spec/parser/options_spec.rb +0 -28
  129. data/spec/parser/posix_classes_spec.rb +0 -8
  130. data/spec/parser/properties_spec.rb +0 -115
  131. data/spec/parser/quantifiers_spec.rb +0 -68
  132. data/spec/parser/refcalls_spec.rb +0 -117
  133. data/spec/parser/set/intersections_spec.rb +0 -127
  134. data/spec/parser/set/ranges_spec.rb +0 -111
  135. data/spec/parser/sets_spec.rb +0 -178
  136. data/spec/parser/types_spec.rb +0 -18
  137. data/spec/scanner/all_spec.rb +0 -18
  138. data/spec/scanner/anchors_spec.rb +0 -21
  139. data/spec/scanner/conditionals_spec.rb +0 -128
  140. data/spec/scanner/delimiters_spec.rb +0 -52
  141. data/spec/scanner/errors_spec.rb +0 -67
  142. data/spec/scanner/escapes_spec.rb +0 -64
  143. data/spec/scanner/free_space_spec.rb +0 -165
  144. data/spec/scanner/groups_spec.rb +0 -61
  145. data/spec/scanner/keep_spec.rb +0 -10
  146. data/spec/scanner/literals_spec.rb +0 -39
  147. data/spec/scanner/meta_spec.rb +0 -18
  148. data/spec/scanner/options_spec.rb +0 -36
  149. data/spec/scanner/properties_spec.rb +0 -64
  150. data/spec/scanner/quantifiers_spec.rb +0 -25
  151. data/spec/scanner/refcalls_spec.rb +0 -55
  152. data/spec/scanner/sets_spec.rb +0 -151
  153. data/spec/scanner/types_spec.rb +0 -14
  154. data/spec/spec_helper.rb +0 -16
  155. data/spec/support/runner.rb +0 -42
  156. data/spec/support/shared_examples.rb +0 -77
  157. data/spec/support/warning_extractor.rb +0 -60
  158. data/spec/syntax/syntax_spec.rb +0 -48
  159. data/spec/syntax/syntax_token_map_spec.rb +0 -23
  160. data/spec/syntax/versions/1.8.6_spec.rb +0 -17
  161. data/spec/syntax/versions/1.9.1_spec.rb +0 -10
  162. data/spec/syntax/versions/1.9.3_spec.rb +0 -9
  163. data/spec/syntax/versions/2.0.0_spec.rb +0 -13
  164. data/spec/syntax/versions/2.2.0_spec.rb +0 -9
  165. data/spec/syntax/versions/aliases_spec.rb +0 -37
  166. data/spec/token/token_spec.rb +0 -85
  167. data/lib/regexp_parser/expression/classes/{set → character_set}/intersection.rb +0 -0
data/CHANGELOG.md DELETED
@@ -1,494 +0,0 @@
1
- ## [Unreleased]
2
-
3
- ## [2.1.1] - 2021-02-23 - [Janosch Müller](mailto:janosch84@gmail.com)
4
-
5
- ### Fixed
6
-
7
- - fixed `NameError` when requiring only `'regexp_parser/scanner'` in v2.1.0
8
- * thanks to [Jared White and Sam Ruby](https://github.com/ruby2js/ruby2js) for the report
9
-
10
- ## [2.1.0] - 2021-02-22 - [Janosch Müller](mailto:janosch84@gmail.com)
11
-
12
- ### Added
13
-
14
- - common ancestor for all scanning/parsing/lexing errors
15
- * `Regexp::Parser::Error` can now be rescued as a catch-all
16
- * the following errors (and their many descendants) now inherit from it:
17
- - `Regexp::Expression::Conditional::TooManyBranches`
18
- - `Regexp::Parser::ParserError`
19
- - `Regexp::Scanner::ScannerError`
20
- - `Regexp::Scanner::ValidationError`
21
- - `Regexp::Syntax::SyntaxError`
22
- * it replaces `ArgumentError` in some rare cases (`Regexp::Parser.parse('?')`)
23
- * thanks to [sandstrom](https://github.com/sandstrom) for the cue
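The catch-all ancestor described in the 2.1.0 entry above follows a common Ruby pattern. A minimal sketch, assuming a stand-in hierarchy: the `Demo` module, its error classes, and `safe_call` are hypothetical names made up for illustration and are not the gem's own classes.

```ruby
# Hypothetical stand-in hierarchy; these are NOT the gem's classes.
module Demo
  # Common ancestor, playing the role the changelog gives Regexp::Parser::Error.
  class Error < StandardError; end
  class ScannerError < Error; end
  class ParserError < Error; end

  # Raises one of the descendants depending on the requested kind.
  def self.risky(kind)
    case kind
    when :scan  then raise ScannerError, 'scan failed'
    when :parse then raise ParserError, 'parse failed'
    else :ok
    end
  end
end

# A single rescue clause on the ancestor covers every descendant.
def safe_call(kind)
  Demo.risky(kind)
rescue Demo::Error => e
  "rescued #{e.class.name}"
end
```

Callers that only need to know that "something went wrong in this library" can rescue the ancestor, while callers that care about the stage can still rescue a specific descendant.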
24
-
25
- ### Fixed
26
-
27
- - fixed scanning of whole-pattern recursion calls `\g<0>` and `\g'0'`
28
- * a regression in v2.0.1 had caused them to be scanned as literals
29
- - fixed scanning of some backreference and subexpression call edge cases
30
- * e.g. `\k<+1>`, `\g<x-1>`
31
- - fixed tokenization of some escapes in character sets
32
- * `.`, `|`, `{`, `}`, `(`, `)`, `^`, `$`, `?`, `+`, `*`
33
- * all of these correctly emitted `#type` `:literal` and `#token` `:literal` if *not* escaped
34
- * if escaped, they emitted e.g. `#type` `:escape` and `#token` `:group_open` for `[\(]`
35
- * the escaped versions now correctly emit `#type` `:escape` and `#token` `:literal`
36
- - fixed handling of control/metacontrol escapes in character sets
37
- * e.g. `[\cX]`, `[\M-\C-X]`
38
- * they were misread as a bunch of individual literals, escapes, and ranges
39
- - fixed some cases where calling `#dup`/`#clone` on expressions led to shared state
40
-
41
- ## [2.0.3] - 2020-12-28 - [Janosch Müller](mailto:janosch84@gmail.com)
42
-
43
- ### Fixed
44
-
45
- - fixed error when scanning some unlikely and redundant but valid charset patterns
46
- * e.g. `/[[.a-b.]]/`, `/[[=e=]]/`
47
- - fixed ancestry of some error classes related to syntax version lookup
48
- * `NotImplementedError`, `InvalidVersionNameError`, `UnknownSyntaxNameError`
49
- * they now correctly inherit from `Regexp::Syntax::SyntaxError` instead of Ruby's `::SyntaxError`
50
-
51
- ## [2.0.2] - 2020-12-25 - [Janosch Müller](mailto:janosch84@gmail.com)
52
-
53
- ### Fixed
54
-
55
- - fixed `FrozenError` when calling `#to_s` on a frozen `Group::Passive`
56
- * thanks to [Daniel Gollahon](https://github.com/dgollahon)
57
-
58
- ## [2.0.1] - 2020-12-20 - [Janosch Müller](mailto:janosch84@gmail.com)
59
-
60
- ### Fixed
61
-
62
- - fixed error when scanning some group names
63
- * this affected names containing hyphens, digits or multibyte chars, e.g. `/(?<a1>a)/`
64
- * thanks to [Daniel Gollahon](https://github.com/dgollahon) for the report
65
- - fixed error when scanning hex escapes with just one hex digit
66
- * e.g. `/\x0A/` was scanned correctly, but the equivalent `/\xA/` was not
67
- * thanks to [Daniel Gollahon](https://github.com/dgollahon) for the report
68
-
69
- ## [2.0.0] - 2020-11-25 - [Janosch Müller](mailto:janosch84@gmail.com)
70
-
71
- ### Changed
72
-
73
- - some methods that used to return byte-based indices now return char-based indices
74
- * the returned values have only changed for Regexps that contain multibyte chars
75
- * this is only a breaking change if you used such methods directly AND relied on them pointing to bytes
76
- * affected methods:
77
- * `Regexp::Token` `#length`, `#offset`, `#te`, `#ts`
78
- * `Regexp::Expression::Base` `#full_length`, `#offset`, `#starts_at`, `#te`, `#ts`
79
- * thanks to [Akinori MUSHA](https://github.com/knu) for the report
80
- - removed some deprecated methods/signatures
81
- * these are rarely used and have been showing deprecation warnings for a long time
82
- * `Regexp::Expression::Subexpression.new` with 3 arguments
83
- * `Regexp::Expression::Root.new` without a token argument
84
- * `Regexp::Expression.parsed`
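The byte-based versus char-based distinction in the 2.0.0 entry above only shows up with multibyte characters. A small sketch using plain Ruby strings rather than the gem itself; the `offsets` helper is a hypothetical name introduced here for illustration.

```ruby
# Returns the position of `needle` in `source`, counted in characters and in bytes.
# String#b gives a binary (ASCII-8BIT) copy, so its index counts bytes.
def offsets(source, needle)
  { chars: source.index(needle), bytes: source.b.index(needle.b) }
end

ascii     = offsets('a(b)', '(')        # single-byte characters only
multibyte = offsets("\u00E4(b)", '(')   # "\u00E4" takes two bytes in UTF-8
```

For the ASCII-only source both counts agree; for the multibyte source the byte count is one higher than the character count, which is the only situation in which the changed return values differ.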
85
-
86
- ### Added
87
-
88
- - `Regexp::Expression::Base#base_length`
89
- * returns the character count of an expression body, ignoring any quantifier
90
- - pragmatic, experimental support for chained quantifiers
91
- * e.g.: `/^a{10}{4,6}$/` matches exactly 40, 50 or 60 `a`s
92
- * successive quantifiers used to be silently dropped by the parser
93
- * they are now wrapped with passive groups as if they were written `(?:a{10}){4,6}`
94
- * thanks to [calfeld](https://github.com/calfeld) for reporting this a while back
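The equivalence claimed for chained quantifiers above can be checked against Ruby's own regexp engine. A minimal sketch, assuming nothing from the gem: `CHAINED`, `EXPLICIT`, and `matching_lengths` are names made up for this illustration.

```ruby
# The chained form from the changelog and the passive-group form it is wrapped into.
CHAINED  = /^a{10}{4,6}$/
EXPLICIT = /^(?:a{10}){4,6}$/

# Collects every length n (up to max) for which a string of n "a"s matches.
def matching_lengths(regexp, max = 70)
  (0..max).select { |n| regexp.match?('a' * n) }
end

explicit_lengths = matching_lengths(EXPLICIT)
chained_lengths  = matching_lengths(CHAINED)
```

Both lists contain only 40, 50 and 60, which is why wrapping the first quantified expression in a passive group preserves the pattern's meaning.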
95
-
96
- ### Fixed
97
-
98
- - incorrect encoding output for non-ascii comments
99
- * this led to a crash when calling `#to_s` on parse results containing such comments
100
- * thanks to [Michael Glass](https://github.com/michaelglass) for the report
101
- - some crashes when scanning contrived patterns such as `'\😋'`
102
-
103
- ## [1.8.2] - 2020-10-11 - [Janosch Müller](mailto:janosch84@gmail.com)
104
-
105
- ### Fixed
106
-
107
- - fix `FrozenError` in `Expression::Base#repetitions` on Ruby 3.0
108
- * thanks to [Thomas Walpole](https://github.com/twalpole)
109
- - removed "unknown future version" warning on Ruby 3.0
110
-
111
- ## [1.8.1] - 2020-09-28 - [Janosch Müller](mailto:janosch84@gmail.com)
112
-
113
- ### Fixed
114
-
115
- - fixed scanning of comment-like text in normal mode
116
- * this was an old bug, but had become more prevalent in v1.8.0
117
- * thanks to [Tietew](https://github.com/Tietew) for the report
118
- - specified correct minimum Ruby version in gemspec
119
- * it said 1.9 but really required 2.0 as of v1.8.0
120
-
121
- ## [1.8.0] - 2020-09-20 - [Janosch Müller](mailto:janosch84@gmail.com)
122
-
123
- ### Changed
124
-
125
- - dropped support for running on Ruby 1.9.x
126
-
127
- ### Added
128
-
129
- - regexp flags can now be passed when parsing a `String` as regexp body
130
- * see the [README](/README.md#usage) for details
131
- * thanks to [Owen Stephens](https://github.com/owst)
132
- - bare occurrences of `\g` and `\k` are now allowed and scanned as literal escapes
133
- * matches Onigmo behavior
134
- * thanks to [Marc-André Lafortune](https://github.com/marcandre) for the report
135
-
136
- ### Fixed
137
-
138
- - fixed parsing comments without preceding space or trailing newline in x-mode
139
- * thanks to [Owen Stephens](https://github.com/owst)
140
-
141
- ## [1.7.1] - 2020-06-07 - [Ammar Ali](mailto:ammarabuali@gmail.com)
142
-
143
- ### Fixed
144
-
145
- - Support for literals that include the unescaped delimiters `{`, `}`, and `]`. These
146
- delimiters are informally supported by various regexp engines.
147
-
148
- ## [1.7.0] - 2020-02-23 - [Janosch Müller](mailto:janosch84@gmail.com)
149
-
150
- ### Added
151
-
152
- - `Expression#each_expression` and `#traverse` can now be called without a block
153
- * this returns an `Enumerator` and allows chaining, e.g. `each_expression.select`
154
- * thanks to [Masataka Kuwabara](https://github.com/pocke)
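Returning an `Enumerator` when no block is given, as the 1.7.0 entry above describes, is usually done with `enum_for`. A sketch of that idiom on a hypothetical `Node` class; it is not the gem's `Expression` implementation, and `each_node` is a name chosen for illustration.

```ruby
# Hypothetical tree node, not the gem's Expression class.
class Node
  attr_reader :name, :children

  def initialize(name, children = [])
    @name = name
    @children = children
  end

  # Yields every descendant depth-first; without a block, returns an Enumerator
  # so that calls such as each_node.select { ... } can be chained.
  def each_node(&block)
    return enum_for(__method__) unless block

    children.each do |child|
      block.call(child)
      child.each_node(&block)
    end
    self
  end
end

tree = Node.new(:root, [Node.new(:group, [Node.new(:literal)]), Node.new(:literal)])
names    = tree.each_node.map(&:name)
literals = tree.each_node.select { |node| node.name == :literal }
```

The same method serves both styles: passing a block iterates immediately, and omitting it hands back a lazy view of the traversal.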
155
-
156
- ### Fixed
157
-
158
- - `MatchLength#each` no longer ignores the given `limit:` when called without a block
159
-
160
- ## [1.6.0] - 2019-06-16 - [Janosch Müller](mailto:janosch84@gmail.com)
161
-
162
- ### Added
163
-
164
- - Added support for 16 new unicode properties introduced in Ruby 2.6.2 and 2.6.3
165
-
166
- ## [1.5.1] - 2019-05-23 - [Janosch Müller](mailto:janosch84@gmail.com)
167
-
168
- ### Fixed
169
-
170
- - Fixed `#options` (and thus `#i?`, `#u?` etc.) not being set for some expressions:
171
- * this affected posix classes as well as alternation, conditional, and intersection branches
172
- * `#options` was already correct for all child expressions of such branches
173
- * this only made an operational difference for posix classes as they respect encoding flags
174
- - Fixed `#options` not respecting all negative options in weird cases like '(?u-m-x)'
175
- - Fixed `Group#option_changes` not accounting for indirectly disabled (overridden) encoding flags
176
- - Fixed `Scanner` allowing negative encoding options if there were no positive options, e.g. '(?-u)'
177
- - Fixed `ScannerError` for some valid meta/control sequences such as '\\C-\\\\'
178
- - Fixed `Expression#match` and `#=~` not working with a single argument
179
-
180
- ## [1.5.0] - 2019-05-14 - [Janosch Müller](mailto:janosch84@gmail.com)
181
-
182
- ### Added
183
-
184
- - Added `#referenced_expression` for backrefs, subexp calls and conditionals
185
- * returns the `Group` expression that is being referenced via name or number
186
- - Added `Expression#repetitions`
187
- * returns a `Range` of allowed repetitions (`1..1` if there is no quantifier)
188
- * like `#quantity` but with a more uniform interface
189
- - Added `Expression#match_length`
190
- * allows inspecting and iterating over the String lengths matched by the Expression
191
-
192
- ### Fixed
193
-
194
- - Fixed `Expression#clone` "direction"
195
- * it used to dup ivars onto the callee, leaving only the clone referencing the original objects
196
- * this will affect you if you call `#eql?`/`#equal?` on expressions or use them as Hash keys
197
- - Fixed `#clone` results for `Sequences`, e.g. alternations and conditionals
198
- * the inner `#text` was cloned onto the `Sequence` and thus duplicated
199
- * e.g. `Regexp::Parser.parse(/(a|bc)/).clone.to_s # => (aa|bcbc)`
200
- - Fixed inconsistent `#to_s` output for `Sequences`
201
- * it used to return only the "specific" text, e.g. "|" for an alternation
202
- * now it includes nested expressions as it does for all other `Subexpressions`
203
- - Fixed quantification of codepoint lists with more than one entry (`\u{62 63 64}+`)
204
- * quantifiers apply only to the last entry, so this token is now split up if quantified
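The codepoint-list behaviour noted above (a quantifier binds only to the last entry) can be observed with Ruby's built-in regexps. A minimal sketch that does not use the gem; `CODEPOINT_LIST` and `last_entry_only?` are names introduced for illustration.

```ruby
# \u{62 63 64} spells "b", "c", "d"; the trailing + applies to "d" alone.
CODEPOINT_LIST = /\A\u{62 63 64}+\z/

# True when the pattern accepts repeated "d" but rejects a repeated "bcd".
def last_entry_only?(regexp)
  regexp.match?('bcddd') && !regexp.match?('bcdbcd')
end

result = last_entry_only?(CODEPOINT_LIST)
```

This is why a quantified codepoint list has to be split into separate tokens: treating the whole list as the quantifier's target would describe a different pattern than the one Ruby runs.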
205
-
206
- ## [1.4.0] - 2019-04-02 - [Janosch Müller](mailto:janosch84@gmail.com)
207
-
208
- ### Added
209
-
210
- - Added support for 19 new unicode properties introduced in Ruby 2.6.0
211
-
212
- ## [1.3.0] - 2018-11-14 - [Janosch Müller](mailto:janosch84@gmail.com)
213
-
214
- ### Added
215
-
216
- - `Syntax#features` returns a `Hash` of all types and tokens supported by a given `Syntax`
217
-
218
- ### Fixed
219
-
220
- - Thanks to [Akira Matsuda](https://github.com/amatsuda)
221
- * eliminated warning "assigned but unused variable - testEof"
222
-
223
- ## [1.2.0] - 2018-09-28 - [Janosch Müller](mailto:janosch84@gmail.com)
224
-
225
- ### Added
226
-
227
- - `Subexpression` (branch node) includes `Enumerable`, allowing you to `#select` children etc.
228
-
229
- ### Fixed
230
-
231
- - Fixed missing quantifier in `Conditional::Expression` methods `#to_s`, `#to_re`
232
- - `Conditional::Condition` no longer lives outside the recursive `#expressions` tree
233
- - it used to be the only expression stored in a custom ivar, complicating traversal
234
- - its setter and getter (`#condition=`, `#condition`) still work as before
235
-
236
- ## [1.1.0] - 2018-09-17 - [Janosch Müller](mailto:janosch84@gmail.com)
237
-
238
- ### Added
239
-
240
- - Added `Quantifier` methods `#greedy?`, `#possessive?`, `#reluctant?`/`#lazy?`
241
- - Added `Group::Options#option_changes`
242
- - shows the options enabled or disabled by the given options group
243
- - as with all other expressions, `#options` shows the overall active options
244
- - Added `Conditional#reference` and `Condition#reference`, indicating the determinative group
245
- - Added `Subexpression#dig`, acts like [`Array#dig`](http://ruby-doc.org/core-2.5.0/Array.html#method-i-dig)
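Since `Subexpression#dig` is described above as acting like `Array#dig`, the core method itself shows the intended semantics. A short sketch on nested arrays standing in for an expression tree; the `tree` data is made up for illustration.

```ruby
# Nested arrays standing in for a tree of expressions.
tree = [[:a, [:b, :c]], [:d]]

# Each argument is an index into the next nesting level.
found   = tree.dig(0, 1, 0)
missing = tree.dig(5, 0) # a missing index yields nil instead of raising
```

Digging replaces a chain of index lookups with one call and stops with `nil` as soon as a level is absent.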
246
-
247
- ### Fixed
248
-
249
- - Fixed parsing of quantified conditional expressions (quantifiers were assigned to the wrong expression)
250
- - Fixed scanning and parsing of forward-referring subexpression calls (e.g. `\g<+1>`)
251
- - `Root` and `Sequence` expressions now support the same constructor signature as all other expressions
252
-
253
- ## [1.0.0] - 2018-09-01 - [Janosch Müller](mailto:janosch84@gmail.com)
254
-
255
- This release includes several breaking changes, mostly to character sets, #map and properties.
256
-
257
- ### Changed
258
-
259
- - Changed handling of sets (a.k.a. character classes or "bracket expressions")
260
- * see PR [#55](https://github.com/ammar/regexp_parser/pull/55) / issue [#47](https://github.com/ammar/regexp_parser/issues/47) for details
261
- * sets are now parsed to expression trees like other nestable expressions
262
- * `#scan` now emits the same tokens as outside sets (no longer `:set, :member`)
263
- * `CharacterSet#members` has been removed
264
- * new `Range` and `Intersection` classes represent corresponding syntax features
265
- * a new `PosixClass` expression class represents e.g. `[[:ascii:]]`
266
- * `PosixClass` instances behave like `Property` ones, e.g. support `#negative?`
267
- * `#scan` emits `:(non)posixclass, :<type>` instead of `:set, :char_(non)<type>`
268
- - Changed `Subexpression#map` to act like regular `Enumerable#map`
269
- * the old behavior is available as `Subexpression#flat_map`
270
- * e.g. `parse(/[a]/).map(&:to_s) == ["[a]"]`; used to be `["[a]", "a"]`
271
- - Changed expression emissions for some escape sequences
272
- * `EscapeSequence::Codepoint`, `CodepointList`, `Hex` and `Octal` are now all used
273
- * they already existed, but were all parsed as `EscapeSequence::Literal`
274
- * e.g. `\x97` is now `EscapeSequence::Hex` instead of `EscapeSequence::Literal`
275
- - Changed naming of many property tokens (emitted for `\p{...}`)
276
- * if you work with these tokens, see PR [#56](https://github.com/ammar/regexp_parser/pull/56) for details
277
- * e.g. `:punct_dash` is now `:dash_punctuation`
278
- - Changed `(?m)` and the like to emit as `:options_switch` token (@4ade4d1)
279
- * allows differentiating from group-local `:options`, e.g. `(?m:.)`
280
- - Changed name of `Backreference::..NestLevel` to `..RecursionLevel` (@4184339)
281
- - Changed `Backreference::Number#number` from `String` to `Integer` (@40a2231)
282
-
283
- ### Added
284
-
285
- - Added support for all previously missing properties (about 250)
286
- - Added `Expression::UnicodeProperty#shortcut` (e.g. returns "m" for `\p{mark}`)
287
- - Added `#char(s)` and `#codepoint(s)` methods to all `EscapeSequence` expressions
288
- - Added `#number`/`#name`/`#recursion_level` to all backref/call expressions (@174bf21)
289
- - Added `#number` and `#number_at_level` to capturing group expressions (@40a2231)
290
-
291
- ### Fixed
292
-
293
- - Fixed Ruby version mapping of some properties
294
- - Fixed scanning of some property spellings, e.g. with dashes
295
- - Fixed some incorrect property alias normalizations
296
- - Fixed scanning of codepoint escapes with 6 digits (e.g. `\u{10FFFF}`)
297
- - Fixed scanning of `\R` and `\X` within sets; they act as literals there
298
-
299
- ## [0.5.0] - 2018-04-29 - [Janosch Müller](mailto:janosch84@gmail.com)
300
-
301
- ### Changed
302
-
303
- - Changed handling of Ruby versions (PR [#53](https://github.com/ammar/regexp_parser/pull/53))
304
- * New Ruby versions are now supported by default
305
- * Some deep-lying APIs have changed, which should not affect most users:
306
- * `Regexp::Syntax::VERSIONS` is gone
307
- * Syntax version names have changed from `Regexp::Syntax::Ruby::Vnnn`
308
- to `Regexp::Syntax::Vn_n_n`
309
- * Syntax version classes for Ruby versions without regex feature changes
310
- are no longer predefined and are now only created on demand / lazily
311
- * `Regexp::Syntax::supported?` returns true for any argument >= 1.8.6
312
-
313
- ### Fixed
314
-
315
- - Fixed some use cases of Expression methods #strfregexp and #to_h (@e738107)
316
-
317
- ### Added
318
-
319
- - Added full signature support to collection methods of Expressions (@aa7c55a)
320
-
321
- ## [0.4.13] - 2018-04-04 - [Ammar Ali](mailto:ammarabuali@gmail.com)
322
-
323
- - Added ruby version files for 2.2.10 and 2.3.7
324
-
325
- ## [0.4.12] - 2018-03-30 - [Janosch Müller](mailto:janosch84@gmail.com)
326
-
327
- - Added ruby version files for 2.4.4 and 2.5.1
328
-
329
- ## [0.4.11] - 2018-03-04 - [Janosch Müller](mailto:janosch84@gmail.com)
330
-
331
- - Fixed UnknownSyntaxNameError introduced in v0.4.10 if
332
- the gem's parent dir tree included a 'ruby' dir
333
-
334
- ## [0.4.10] - 2018-03-04 - [Janosch Müller](mailto:janosch84@gmail.com)
335
-
336
- - Added ruby version file for 2.6.0
337
- - Added support for Emoji properties (available in Ruby since 2.5.0)
338
- - Added support for XPosixPunct and Regional_Indicator properties
339
- - Fixed parsing of Unicode 6.0 and 7.0 script properties
340
- - Fixed parsing of the special Assigned property
341
- - Fixed scanning of InCyrillic_Supplement property
342
-
343
- ## [0.4.9] - 2017-12-25 - [Ammar Ali](mailto:ammarabuali@gmail.com)
344
-
345
- - Added ruby version file for 2.5.0
346
-
347
- ## [0.4.8] - 2017-12-18 - [Janosch Müller](mailto:janosch84@gmail.com)
348
-
349
- - Added ruby version files for 2.2.9, 2.3.6, and 2.4.3
350
-
351
- ## [0.4.7] - 2017-10-15 - [Janosch Müller](mailto:janosch84@gmail.com)
352
-
353
- - Fixed a thread safety issue (issue #45)
354
- - Some public class methods that were only reliable for
355
- internal use are now private instance methods (PR #46)
356
- - Improved the usefulness of Expression#options (issue #43) -
357
- #options and derived methods such as #i?, #m? and #x? are now
358
- defined for all Expressions that are affected by such flags.
359
- - Fixed scanning of whitespace following (?x) (commit 5c94bd2)
360
- - Fixed a Parser bug where the #number attribute of traditional
361
- numerical backreferences was not set correctly (commit 851b620)
362
-
363
- ## [0.4.6] - 2017-09-18 - [Janosch Müller](mailto:janosch84@gmail.com)
364
-
365
- - Added Parser support for hex escapes in sets (PR #36)
366
- - Added Parser support for octal escapes (PR #37)
367
- - Added support for cluster types \R and \X (PR #38)
368
- - Added support for more metacontrol notations (PR #39)
369
-
370
- ## [0.4.5] - 2017-09-17 - [Ammar Ali](mailto:ammarabuali@gmail.com)
371
-
372
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
373
- * Support ruby 2.2.7 (PR #42)
374
- - Added ruby version files for 2.2.8, 2.3.5, and 2.4.2
375
-
376
- ## [0.4.4] - 2017-07-10 - [Ammar Ali](mailto:ammarabuali@gmail.com)
377
-
378
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
379
- * Add support for new absence operator (PR #33)
380
- - Thanks to [Bartek Bułat](https://github.com/barthez):
381
- * Add support for Ruby 2.3.4 version (PR #40)
382
-
383
- ## [0.4.3] - 2017-03-24 - [Ammar Ali](mailto:ammarabuali@gmail.com)
384
-
385
- - Added ruby version file for 2.4.1
386
-
387
- ## [0.4.2] - 2017-01-10 - [Ammar Ali](mailto:ammarabuali@gmail.com)
388
-
389
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
390
- * Support ruby 2.4 (PR #30)
391
- * Improve codepoint handling (PR #27)
392
-
393
- ## [0.4.1] - 2016-11-22 - [Ammar Ali](mailto:ammarabuali@gmail.com)
394
-
395
- - Updated ruby version file for 2.3.3
396
-
397
- ## [0.4.0] - 2016-11-20 - [Ammar Ali](mailto:ammarabuali@gmail.com)
398
-
399
- - Added Syntax.supported? method
400
- - Updated ruby versions for latest releases; 2.1.10, 2.2.6, and 2.3.2
401
-
402
- ## [0.3.6] - 2016-06-08 - [Ammar Ali](mailto:ammarabuali@gmail.com)
403
-
404
- - Thanks to [John Backus](https://github.com/backus):
405
- * Remove warnings (PR #26)
406
-
407
- ## [0.3.5] - 2016-05-30 - [Ammar Ali](mailto:ammarabuali@gmail.com)
408
-
409
- - Thanks to [John Backus](https://github.com/backus):
410
- * Fix parsing of /\xFF/n (hex:escape) (PR #24)
411
-
412
- ## [0.3.4] - 2016-05-25 - [Ammar Ali](mailto:ammarabuali@gmail.com)
413
-
414
- - Thanks to [John Backus](https://github.com/backus):
415
- * Fix warnings (PR #19)
416
- - Thanks to [Dana Scheider](https://github.com/danascheider):
417
- * Correct error in README (PR #20)
418
- - Fixed mistyped \h and \H character types (issue #21)
419
- - Added ancestry syntax files for latest rubies (issue #22)
420
-
421
- ## [0.3.3] - 2016-04-26 - [Ammar Ali](mailto:ammarabuali@gmail.com)
422
-
423
- - Thanks to [John Backus](https://github.com/backus):
424
- * Fixed scanning of zero length comments (PR #12)
425
- * Fixed missing escape:codepoint_list syntax token (PR #14)
426
- * Fixed to_s for modified interval quantifiers (PR #17)
427
- - Added a note about MRI implementation quirks to Scanner section
428
-
429
- ## [0.3.2] - 2016-01-01 - [Ammar Ali](mailto:ammarabuali@gmail.com)
430
-
431
- - Updated ruby versions for latest releases; 2.1.8, 2.2.4, and 2.3.0
432
- - Fixed class name for UnknownSyntaxNameError exception
433
- - Added UnicodeBlocks support to the parser.
434
- - Added UnicodeBlocks support to the scanner.
435
- - Added expand_members method to CharacterSet, returns traditional
436
- or unicode property forms of shorthands (\d, \W, \s, etc.)
437
- - Improved meaning and output of %t and %T in strfregexp.
438
- - Added syntax versions for ruby 2.1.4 and 2.1.5 and updated
439
- latest 2.1 version.
440
- - Added to_h methods to Expression, Subexpression, and Quantifier.
441
- - Added traversal methods; traverse, each_expression, and map.
442
- - Added token/type test methods; type?, is?, and one_of?
443
- - Added printing method strfregexp, inspired by strftime.
444
- - Added scanning and parsing of free spacing (x mode) expressions.
445
- - Improved handling of inline options (?mixdau:...)
446
- - Added conditional expressions. Ruby 2.0.
447
- - Added keep (\K) markers. Ruby 2.0.
448
- - Added d, a, and u options. Ruby 2.0.
449
- - Added missing meta sequences to the parser. They were supported by the scanner only.
450
- - Renamed Lexer's method to lex, added an alias to the old name (scan)
451
- - Use #map instead of #each to run the block in Lexer.lex.
452
- - Replaced VERSION.yml file with a constant.
453
- - Updated README
454
- - Update tokens and scanner with new additions in Unicode 7.0.
455
-
456
- ## [0.1.6] - 2014-10-06 - [Ammar Ali](mailto:ammarabuali@gmail.com)
457
-
458
- - Fixed test and gem building rake tasks and extracted the gem
459
- specification from the Rakefile into a .gemspec file.
460
- - Added syntax files for missing ruby 2.x versions. These do not add
461
- extra syntax support, they just make the gem work with the newer
462
- ruby versions.
463
- - Added .travis.yml to project root.
464
- - README:
465
- - Removed note purporting runtime support for ruby 1.8.6.
466
- - Added a section identifying the main unsupported syntax features.
467
- - Added sections for Testing and Building
468
- - Added badges for gem version, Travis CI, and code climate.
469
- - Updated README, fixing broken examples, and converting it from an rdoc file to GitHub's flavor of Markdown.
470
- - Fixed a parser bug where an alternation sequence that contained nested expressions was incorrectly being appended to the parent expression when the nesting was exited. e.g. in /a|(b)c/, c was appended to the root.
471
-
472
- - Fixed a bug where character types were not being correctly scanned within character sets. e.g. in [\d], two tokens were scanned; one for the backslash '\' and one for the 'd'
473
-
474
- ## [0.1.5] - 2014-01-14 - [Ammar Ali](mailto:ammarabuali@gmail.com)
475
-
476
- - Correct ChangeLog.
477
- - Added syntax stubs for ruby versions 2.0 and 2.1
478
- - Added clone methods for deep copying expressions.
479
- - Added optional format argument for to_s on expressions to return the text of the expression with (:full, the default) or without (:base) its quantifier.
480
- - Renamed the :beginning_of_line and :end_of_line tokens to :bol and :eol.
481
- - Fixed a bug where alternations with more than two alternatives and one of them ending in a group were being incorrectly nested.
482
- - Improved EOF handling in general and especially from sequences like hex and control escapes.
483
- - Fixed a bug where named groups with an empty name would return a blank token [].
484
- - Fixed a bug where members of a parent set were being added to its last subset.
485
- - Various code cleanups in scanner.rl
486
- - Fixed a few mutable string bugs by calling dup on the originals.
487
- - Made ruby 1.8.6 the base for all 1.8 syntax, and the 1.8 name a pointer to the latest (1.8.7 at this time)
488
- - Removed look-behind assertions (positive and negative) from 1.8 syntax
489
- - Added control (\cc and \C-c) and meta (\M-c) escapes to 1.8 syntax
490
- - The default syntax is now the one of the running ruby version in both the lexer and the parser.
491
-
492
- ## [0.1.0] - 2010-11-21 - [Ammar Ali](mailto:ammarabuali@gmail.com)
493
-
494
- - Initial release