regexp_parser 2.8.1 → 2.8.3

data/CHANGELOG.md DELETED
@@ -1,691 +0,0 @@
1
- # Changelog
2
-
3
- All notable changes to this project will be documented in this file.
4
-
5
- The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
- and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
-
8
- ## [Unreleased]
9
-
10
- ## [2.8.1] - 2023-06-10 - [Janosch Müller](mailto:janosch84@gmail.com)
11
-
12
- ### Fixed
13
-
14
- - support for extpict unicode property, added in Ruby 2.6
15
- - support for 10 unicode script/block properties added in Ruby 3.2
16
-
17
- ## [2.8.0] - 2023-04-17 - [Janosch Müller](mailto:janosch84@gmail.com)
18
-
19
- ### Added
20
-
21
- - `Regexp::Expression::Shared#ends_at`
22
- * e.g. `parse(/a +/x)[0].ends_at # => 3`
23
- * e.g. `parse(/a +/x)[0].ends_at(include_quantifier = false) # => 1`
24
- - `Regexp::Expression::Shared#{capturing?,comment?}`
25
- * previously only available on capturing and comment groups
26
- - `Regexp::Expression::Shared#{decorative?}`
27
- * true for decorations: comment groups as well as comments and whitespace in x-mode
28
- - `Regexp::Expression::Shared#parent`
29
- - new format argument `:original` for `Regexp::Expression::Base#to_s`
30
- * includes decorative elements between node and its quantifier
31
- * e.g. `parse(/a (?#comment) +/x)[0].to_s(:original) # => "a (?#comment) +"`
32
- * using it is not needed when calling `Root#to_s` as Root can't be quantified
33
- - support calling `Subexpression#{each_expression,flat_map}` with a one-argument block
34
- * in this case, only the expressions are passed to the block, no indices
35
- - support calling test methods at Expression class level
36
- - `capturing?`, `comment?`, `decorative?`, `referential?`, `terminal?`
37
- - e.g. `Regexp::Expression::CharacterSet.terminal? # => false`
38
-
39
- ### Fixed
40
-
41
- - `Regexp::Expression::Shared#full_length` with whitespace before quantifier
42
- * e.g. `parse(/a +/x)[0].full_length` used to yield `2`, now it yields `3`
43
- - `Subexpression#to_s` output with children with whitespace before their quantifier
44
- * e.g. `parse(/a + /x).to_s` used to yield `"a+ "`, now it yields `"a + "`
45
- * calling `#to_s` on sub-nodes still omits such decorative interludes by default
46
- - use new `#to_s` format `:original` to include it
47
- - e.g. `parse(/a + /x)[0].to_s(:original) # => "a +"`
48
- - fixed `Subexpression#te` behaving differently from other expressions
49
- * only `Subexpression#te` used to include the quantifier
50
- * now `#te` is the end index without quantifier, as for other expressions
51
- - fixed `NoMethodError` when calling `#starts_at` or `#ts` on empty sequences
52
- * e.g. `Regexp::Parser.parse(/|/)[0].starts_at`
53
- * e.g. `Regexp::Parser.parse(/[&&]/)[0][0].starts_at`
54
- - fixed nested comment groups breaking local x-options
55
- * e.g. in `/(?x:(?#hello)) /`, the x-option wrongly applied to the whitespace
56
- - fixed nested comment groups breaking conditionals
57
- * e.g. in `/(a)(?(1)b|c(?#hello)d)e/`, the 2nd conditional branch included "e"
58
- - fixed quantifiers after comment groups being mis-assigned to that group
59
- * e.g. in `/a(?#foo){3}/` (matches 'aaa')
60
- - fixed Scanner accepting two cases of invalid Regexp syntax
61
- * unmatched closing parentheses (`)`) and k-backrefs with number 0 (`\k<0>`)
62
- * these are a `SyntaxError` in Ruby, so could only be passed as a String
63
- * they now raise a `Regexp::Scanner::ScannerError`
64
- - fixed some scanner errors not inheriting from `Regexp::Scanner::ScannerError`
65
- - reduced verbosity of inspect / pretty print output
66
-
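Three of the 2.8.0 parser fixes above bring the parse tree in line with how Ruby itself reads these patterns. The following is a minimal sketch that uses only Ruby's built-in `Regexp` class and does not call regexp_parser; the variable names are illustrative.

```ruby
# Plain-Ruby check of three example patterns from the 2.8.0 "Fixed" list.

# A quantifier after a comment group applies to the expression before the group.
comment_quantified = /\Aa(?#foo){3}\z/
puts comment_quantified.match?("aaa") # => true
puts comment_quantified.match?("a")   # => false

# The local x-option ends with its group, so the trailing space is a literal.
local_x = /(?x:(?#hello)) /
puts local_x.match?(" ") # => true
puts local_x.match?("")  # => false

# The "e" after the conditional is not part of the second branch.
conditional = /\A(a)(?(1)b|c(?#hello)d)e\z/
puts conditional.match?("abe") # => true
```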
67
- ## [2.7.0] - 2023-02-08 - [Janosch Müller](mailto:janosch84@gmail.com)
68
-
69
- ### Added
70
-
71
- - `Regexp::Lexer.lex` now streams tokens when called with a block
72
- * it can now take arbitrarily large input, just like `Regexp::Scanner`
73
- * this also slightly improves `Regexp::Parser.parse` performance
74
- * note: `Regexp::Parser.parse` still does not and will not support streaming
75
- - improved performance of `Subexpression#each_expression`
76
- - minor improvements to `Regexp::Scanner` performance
77
- - overall improvement of parse performance: about 10% for large Regexps
78
-
79
- ### Fixed
80
-
81
- - parsing of octal escape sequences in sets, e.g. `[\141]`
82
- * thanks to [Randy Stauner](https://github.com/rwstauner) for the report
83
-
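For reference, this is how Ruby itself treats the octal escape from the fix above. It is a small sketch using only the built-in `Regexp` class, not regexp_parser.

```ruby
# 0o141 is 97, the codepoint of "a", so \141 inside a set stands for "a".
octal_set = /[\141]/

puts 0o141.chr             # => a
puts octal_set.match?("a") # => true
puts octal_set.match?("1") # => false
```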
84
- ## [2.6.2] - 2023-01-19 - [Janosch Müller](mailto:janosch84@gmail.com)
85
-
86
- ### Fixed
87
-
88
- - fixed `SystemStackError` when cloning recursive subexpression calls
89
- * e.g. `Regexp::Parser.parse(/a|b\g<0>/).dup`
90
-
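The pattern in this fix is self-referential: `\g<0>` calls the whole expression again, which is why a naive deep copy of its parse tree can recurse without end. A short sketch of the matching behaviour in plain Ruby (no regexp_parser involved):

```ruby
# \g<0> re-enters the whole pattern: any number of "b"s followed by one "a".
recursive = /a|b\g<0>/

puts "bba"[recursive]         # => bba
puts "bbb"[recursive].inspect # => nil
```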
91
- ## [2.6.1] - 2022-11-16 - [Janosch Müller](mailto:janosch84@gmail.com)
92
-
93
- ### Fixed
94
-
95
- - fixed scanning of two negative lookbehind edge cases
96
- * `(?<!x)y>` used to raise a ScannerError
97
- * `(?<!x>)y` used to be misinterpreted as a named group
98
- * thanks to [Sergio Medina](https://github.com/serch) for the report
99
-
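Both edge cases are valid Ruby and contain no named group. The sketch below checks this with the built-in `Regexp` class only; the variable names are illustrative.

```ruby
# A negative lookbehind followed by the literal text "y>".
trailing_angle = /(?<!x)y>/
puts trailing_angle.match?("ay>") # => true
puts trailing_angle.match?("xy>") # => false

# A negative lookbehind for the two characters "x>", not a group named "!x".
angle_in_lookbehind = /(?<!x>)y/
puts angle_in_lookbehind.match?("x>y")  # => false
puts angle_in_lookbehind.match?("y")    # => true
puts angle_in_lookbehind.names.inspect  # => []
```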
100
- ## [2.6.0] - 2022-09-26 - [Janosch Müller](mailto:janosch84@gmail.com)
101
-
102
- ### Fixed
103
-
104
- - fixed `#referenced_expression` for `\g<0>` (was `nil`, is now the `Root` exp)
105
- - fixed `#reference`, `#referenced_expression` for recursion level backrefs
106
- * e.g. `(a)(b)\k<-1+1>`
107
- * `#referenced_expression` was `nil`, now it is the correct `Group` exp
108
- - detect and raise for two more syntax errors when parsing String input
109
- * quantification of option switches (e.g. `(?i)+`)
110
- * invalid references (e.g. `/\k<1>/`)
111
- * these are a `SyntaxError` in Ruby, so could only be passed as a String
112
-
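Because a Regexp literal with these errors does not compile, such input can only reach a parser as a String. The sketch below shows the same situation with Ruby's own `Regexp.new`, which raises `RegexpError` at runtime; `ruby_accepts?` is a hypothetical helper, not part of regexp_parser.

```ruby
# Hypothetical helper: true if Ruby's own regexp compiler accepts the source.
def ruby_accepts?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

puts ruby_accepts?('(?i)+')    # => false (quantified option switch)
puts ruby_accepts?('\k<1>')    # => false (reference to a missing group)
puts ruby_accepts?('(a)\k<1>') # => true
```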
113
- ### Added
114
-
115
- - `Regexp::Expression::Base#human_name`
116
- * returns a nice, human-readable description of the expression
117
- - `Regexp::Expression::Base#optional?`
118
- * returns `true` if the expression is quantified accordingly (e.g. with `*`, `{,n}`)
119
- - added a deprecation warning when calling `#to_re` on set members
120
-
121
- ## [2.5.0] - 2022-05-27 - [Janosch Müller](mailto:janosch84@gmail.com)
122
-
123
- ### Added
124
-
125
- - `Regexp::Expression::Base.construct` and `.token_class` methods
126
- * see the [wiki](https://github.com/ammar/regexp_parser/wiki) for details
127
-
128
- ## [2.4.0] - 2022-05-09 - [Janosch Müller](mailto:janosch84@gmail.com)
129
-
130
- ### Fixed
131
-
132
- - fixed interpretation of `+` and `?` after interval quantifiers (`{n,n}`)
133
- * they used to be treated as reluctant or possessive mode indicators
134
- * however, Ruby does not support these modes for interval quantifiers
135
- * they are now treated as chained quantifiers instead, as Ruby does it
136
- * cf. [#3](https://github.com/ammar/regexp_parser/issues/3)
137
- - fixed `Expression::Base#nesting_level` for some tree rewrite cases
138
- * e.g. the alternatives in `/a|[b]/` had an inconsistent nesting_level
139
- - fixed `Scanner` accepting invalid posix classes, e.g. `[[:foo:]]`
140
- * they raise a `SyntaxError` when used in a Regexp, so could only be passed as String
141
- * they now raise a `Regexp::Scanner::ValidationError` in the `Scanner`
142
-
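The Ruby behaviour that the first and third fixes above follow can be observed directly. This sketch uses a fixed interval (`{2}`) and only the built-in `Regexp` class; it does not call regexp_parser.

```ruby
# `+` after a fixed interval is a second quantifier: a{2}+ acts like (?:a{2})+.
chained_plus = /\Aa{2}+\z/
puts chained_plus.match?("aaaa") # => true
puts chained_plus.match?("aaa")  # => false

# `?` after a fixed interval makes the whole thing optional: (?:a{2})?.
chained_optional = /\Aa{2}?\z/
puts chained_optional.match?("")   # => true
puts chained_optional.match?("aa") # => true
puts chained_optional.match?("a")  # => false

# An unknown posix class is rejected by Ruby's regexp compiler.
begin
  Regexp.new('[[:foo:]]')
  posix_accepted = true
rescue RegexpError
  posix_accepted = false
end
puts posix_accepted # => false
```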
143
- ### Added
144
-
145
- - added `Expression::Base#==` for (deep) comparison of expressions
146
- - added `Expression::Base#parts`
147
- * returns the text elements and subexpressions of an expression
148
- * e.g. `parse(/(a)/)[0].parts # => ["(", #<Literal @text="a"...>, ")"]`
149
- - added `Expression::Base#te` (a.k.a. token end index)
150
- * `Expression::Subexpression` always had `#te`, only terminal nodes lacked it so far
151
- - made some `Expression::Base` methods available on `Quantifier` instances, too
152
- * `#type`, `#type?`, `#is?`, `#one_of?`, `#options`, `#terminal?`
153
- * `#base_length`, `#full_length`, `#starts_at`, `#te`, `#ts`, `#offset`
154
- * `#conditional_level`, `#level`, `#nesting_level`, `#set_level`
155
- * this allows a more unified handling with `Expression::Base` instances
156
- - allowed `Quantifier#initialize` to take a token and options Hash like other nodes
157
- - added a deprecation warning for initializing Quantifiers with 4+ arguments:
158
-
159
- Calling `Expression::Base#quantify` or `Quantifier.new` with 4+ arguments
160
- is deprecated.
161
-
162
- It will no longer be supported in regexp_parser v3.0.0.
163
-
164
- Please pass a Regexp::Token instead, e.g. replace `token, text, min, max, mode`
165
- with `::Regexp::Token.new(:quantifier, token, text)`. min, max, and mode
166
- will be derived automatically.
167
-
168
- Or do `exp.quantifier = Quantifier.construct(token: token, text: str)`.
169
-
170
- This is consistent with how Expression::Base instances are created.
171
-
172
-
173
- ## [2.3.1] - 2022-04-24 - [Janosch Müller](mailto:janosch84@gmail.com)
174
-
175
- ### Fixed
176
-
177
- - removed five nonexistent unicode properties from `Syntax#features`
178
- * these were never supported by Ruby or the `Regexp::Scanner`
179
- * thanks to [Markus Schirp](https://github.com/mbj) for the report
180
-
181
- ## [2.3.0] - 2022-04-08 - [Janosch Müller](mailto:janosch84@gmail.com)
182
-
183
- ### Added
184
-
185
- - improved parsing performance through `Syntax` refactoring
186
- * instead of fresh `Syntax` instances, pre-loaded constants are now re-used
187
- * this approximately doubles the parsing speed for simple regexps
188
- - added methods to `Syntax` classes to show relative feature sets
189
- * e.g. `Regexp::Syntax::V3_2_0.added_features`
190
- - support for new unicode properties of Ruby 3.2 / Unicode 14.0
191
-
192
- ## [2.2.1] - 2022-02-11 - [Janosch Müller](mailto:janosch84@gmail.com)
193
-
194
- ### Fixed
195
-
196
- - fixed Syntax version of absence groups (`(?~...)`)
197
- * the lexer accepted them for any Ruby version
198
- * now they are only recognized for Ruby >= 2.4.1 in which they were introduced
199
- - reduced gem size by excluding specs from package
200
- - removed deprecated `test_files` gemspec setting
201
- - no longer depend on `yaml`/`psych` (except for Ruby <= 2.4)
202
- - no longer depend on `set`
203
- * `set` was removed from the stdlib and made a standalone gem as of Ruby 3
204
- * this made it a hidden/undeclared dependency of `regexp_parser`
205
-
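As a reminder of what an absence group does, here is a small plain-Ruby sketch. It needs a Ruby version that supports `(?~...)` and does not use regexp_parser.

```ruby
# (?~ab) matches any string that does not contain "ab".
no_ab = /\A(?~ab)\z/

puts no_ab.match?("ba")  # => true
puts no_ab.match?("aab") # => false
```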
206
- ## [2.2.0] - 2021-12-04 - [Janosch Müller](mailto:janosch84@gmail.com)
207
-
208
- ### Added
209
-
210
- - added support for 13 new unicode properties introduced in Ruby 3.1.0
211
-
212
- ## [2.1.1] - 2021-02-23 - [Janosch Müller](mailto:janosch84@gmail.com)
213
-
214
- ### Fixed
215
-
216
- - fixed `NameError` when requiring only `'regexp_parser/scanner'` in v2.1.0
217
- * thanks to [Jared White and Sam Ruby](https://github.com/ruby2js/ruby2js) for the report
218
-
219
- ## [2.1.0] - 2021-02-22 - [Janosch Müller](mailto:janosch84@gmail.com)
220
-
221
- ### Added
222
-
223
- - common ancestor for all scanning/parsing/lexing errors
224
- * `Regexp::Parser::Error` can now be rescued as a catch-all
225
- * the following errors (and their many descendants) now inherit from it:
226
- - `Regexp::Expression::Conditional::TooManyBranches`
227
- - `Regexp::Parser::ParserError`
228
- - `Regexp::Scanner::ScannerError`
229
- - `Regexp::Scanner::ValidationError`
230
- - `Regexp::Syntax::SyntaxError`
231
- * it replaces `ArgumentError` in some rare cases (`Regexp::Parser.parse('?')`)
232
- * thanks to [sandstrom](https://github.com/sandstrom) for the cue
233
-
234
- ### Fixed
235
-
236
- - fixed scanning of whole-pattern recursion calls `\g<0>` and `\g'0'`
237
- * a regression in v2.0.1 had caused them to be scanned as literals
238
- - fixed scanning of some backreference and subexpression call edge cases
239
- * e.g. `\k<+1>`, `\g<x-1>`
240
- - fixed tokenization of some escapes in character sets
241
- * `.`, `|`, `{`, `}`, `(`, `)`, `^`, `$`, `?`, `+`, `*`
242
- * all of these correctly emitted `#type` `:literal` and `#token` `:literal` if *not* escaped
243
- * if escaped, they emitted e.g. `#type` `:escape` and `#token` `:group_open` for `[\(]`
244
- * the escaped versions now correctly emit `#type` `:escape` and `#token` `:literal`
245
- - fixed handling of control/metacontrol escapes in character sets
246
- * e.g. `[\cX]`, `[\M-\C-X]`
247
- * they were misread as a bunch of individual literals, escapes, and ranges
248
- - fixed some cases where calling `#dup`/`#clone` on expressions led to shared state
249
-
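The set-related fixes above concern escapes that each stand for a single character. A brief sketch with Ruby's built-in `Regexp` (not regexp_parser) shows the intended meaning of two of the examples:

```ruby
# An escaped "(" inside a set is just the literal character "(".
escaped_paren = /[\(]/
puts escaped_paren.match?("(")  # => true
puts escaped_paren.match?("\\") # => false

# \cX inside a set is the single control character Ctrl-X (0x18).
control_set = /[\cX]/
puts control_set.match?("\x18") # => true
puts control_set.match?("X")    # => false
```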
250
- ## [2.0.3] - 2020-12-28 - [Janosch Müller](mailto:janosch84@gmail.com)
251
-
252
- ### Fixed
253
-
254
- - fixed error when scanning some unlikely and redundant but valid charset patterns
255
- * e.g. `/[[.a-b.]]/`, `/[[=e=]]/`
256
- - fixed ancestry of some error classes related to syntax version lookup
257
- * `NotImplementedError`, `InvalidVersionNameError`, `UnknownSyntaxNameError`
258
- * they now correctly inherit from `Regexp::Syntax::SyntaxError` instead of Ruby's `::SyntaxError`
259
-
260
- ## [2.0.2] - 2020-12-25 - [Janosch Müller](mailto:janosch84@gmail.com)
261
-
262
- ### Fixed
263
-
264
- - fixed `FrozenError` when calling `#to_s` on a frozen `Group::Passive`
265
- * thanks to [Daniel Gollahon](https://github.com/dgollahon)
266
-
267
- ## [2.0.1] - 2020-12-20 - [Janosch Müller](mailto:janosch84@gmail.com)
268
-
269
- ### Fixed
270
-
271
- - fixed error when scanning some group names
272
- * this affected names containing hyphens, digits or multibyte chars, e.g. `/(?<a1>a)/`
273
- * thanks to [Daniel Gollahon](https://github.com/dgollahon) for the report
274
- - fixed error when scanning hex escapes with just one hex digit
275
- * e.g. `/\x0A/` was scanned correctly, but the equivalent `/\xA/` was not
276
- * thanks to [Daniel Gollahon](https://github.com/dgollahon) for the report
277
-
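Both patterns named in these fixes are ordinary Ruby. The following sketch confirms that with the built-in `Regexp` class alone:

```ruby
# A group name may contain digits.
named = /(?<a1>a)/
puts named.names.inspect # => ["a1"]
puts named.match("a")[:a1] # => a

# One-digit and two-digit hex escapes denote the same character (newline).
puts(/\x0A/.match?("\n")) # => true
puts(/\xA/.match?("\n"))  # => true
```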
278
- ## [2.0.0] - 2020-11-25 - [Janosch Müller](mailto:janosch84@gmail.com)
279
-
280
- ### Changed
281
-
282
- - some methods that used to return byte-based indices now return char-based indices
283
- * the returned values have only changed for Regexps that contain multibyte chars
284
- * this is only a breaking change if you used such methods directly AND relied on them pointing to bytes
285
- * affected methods:
286
- * `Regexp::Token` `#length`, `#offset`, `#te`, `#ts`
287
- * `Regexp::Expression::Base` `#full_length`, `#offset`, `#starts_at`, `#te`, `#ts`
288
- * thanks to [Akinori MUSHA](https://github.com/knu) for the report
289
- - removed some deprecated methods/signatures
290
- * these are rarely used and have been showing deprecation warnings for a long time
291
- * `Regexp::Expression::Subexpression.new` with 3 arguments
292
- * `Regexp::Expression::Root.new` without a token argument
293
- * `Regexp::Expression.parsed`
294
-
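The difference between the two kinds of index only shows up with multibyte characters. This sketch illustrates it with plain `String` methods rather than the gem's own `#ts`/`#te`; the sample source is an assumption chosen for illustration.

```ruby
# "ü" is one character but two bytes in UTF-8.
source = "\u00FC+"

puts source.length   # => 2
puts source.bytesize # => 3

char_offset = source.index("+")   # position counted in characters
byte_offset = source.b.index("+") # position counted in bytes
puts char_offset # => 1
puts byte_offset # => 2
```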
295
- ### Added
296
-
297
- - `Regexp::Expression::Base#base_length`
298
- * returns the character count of an expression body, ignoring any quantifier
299
- - pragmatic, experimental support for chained quantifiers
300
- * e.g.: `/^a{10}{4,6}$/` matches exactly 40, 50 or 60 `a`s
301
- * successive quantifiers used to be silently dropped by the parser
302
- * they are now wrapped with passive groups as if they were written `(?:a{10}){4,6}`
303
- * thanks to [calfeld](https://github.com/calfeld) for reporting this a while back
304
-
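The chained-quantifier example above can be checked against its passive-group rewrite in plain Ruby. A minimal sketch using only the built-in `Regexp` class:

```ruby
chained    = /^a{10}{4,6}$/
equivalent = /^(?:a{10}){4,6}$/

# Only 40, 50 and 60 repetitions match; both spellings agree.
[30, 40, 45, 50, 60, 70].each do |count|
  input = "a" * count
  puts "#{count}: #{chained.match?(input)} / #{equivalent.match?(input)}"
end
```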
305
- ### Fixed
306
-
307
- - incorrect encoding output for non-ascii comments
308
- * this led to a crash when calling `#to_s` on parse results containing such comments
309
- * thanks to [Michael Glass](https://github.com/michaelglass) for the report
310
- - some crashes when scanning contrived patterns such as `'\😋'`
311
-
312
- ## [1.8.2] - 2020-10-11 - [Janosch Müller](mailto:janosch84@gmail.com)
313
-
314
- ### Fixed
315
-
316
- - fix `FrozenError` in `Expression::Base#repetitions` on Ruby 3.0
317
- * thanks to [Thomas Walpole](https://github.com/twalpole)
318
- - removed "unknown future version" warning on Ruby 3.0
319
-
320
- ## [1.8.1] - 2020-09-28 - [Janosch Müller](mailto:janosch84@gmail.com)
321
-
322
- ### Fixed
323
-
324
- - fixed scanning of comment-like text in normal mode
325
- * this was an old bug, but had become more prevalent in v1.8.0
326
- * thanks to [Tietew](https://github.com/Tietew) for the report
327
- - specified correct minimum Ruby version in gemspec
328
- * it said 1.9 but really required 2.0 as of v1.8.0
329
-
330
- ## [1.8.0] - 2020-09-20 - [Janosch Müller](mailto:janosch84@gmail.com)
331
-
332
- ### Changed
333
-
334
- - dropped support for running on Ruby 1.9.x
335
-
336
- ### Added
337
-
338
- - regexp flags can now be passed when parsing a `String` as regexp body
339
- * see the [README](/README.md#usage) for details
340
- * thanks to [Owen Stephens](https://github.com/owst)
341
- - bare occurrences of `\g` and `\k` are now allowed and scanned as literal escapes
342
- * matches Onigmo behavior
343
- * thanks to [Marc-André Lafortune](https://github.com/marcandre) for the report
344
-
345
- ### Fixed
346
-
347
- - fixed parsing comments without preceding space or trailing newline in x-mode
348
- * thanks to [Owen Stephens](https://github.com/owst)
349
-
350
- ## [1.7.1] - 2020-06-07 - [Ammar Ali](mailto:ammarabuali@gmail.com)
351
-
352
- ### Fixed
353
-
354
- - Support for literals that include the unescaped delimiters `{`, `}`, and `]`. These
355
- delimiters are informally supported by various regexp engines.
356
-
357
- ## [1.7.0] - 2020-02-23 - [Janosch Müller](mailto:janosch84@gmail.com)
358
-
359
- ### Added
360
-
361
- - `Expression::Base#each_expression` and `#traverse` can now be called without a block
362
- * this returns an `Enumerator` and allows chaining, e.g. `each_expression.select`
363
- * thanks to [Masataka Kuwabara](https://github.com/pocke)
364
-
365
- ### Fixed
366
-
367
- - `MatchLength#each` no longer ignores the given `limit:` when called without a block
368
-
369
- ## [1.6.0] - 2019-06-16 - [Janosch Müller](mailto:janosch84@gmail.com)
370
-
371
- ### Added
372
-
373
- - Added support for 16 new unicode properties introduced in Ruby 2.6.2 and 2.6.3
374
-
375
- ## [1.5.1] - 2019-05-23 - [Janosch Müller](mailto:janosch84@gmail.com)
376
-
377
- ### Fixed
378
-
379
- - Fixed `#options` (and thus `#i?`, `#u?` etc.) not being set for some expressions:
380
- * this affected posix classes as well as alternation, conditional, and intersection branches
381
- * `#options` was already correct for all child expressions of such branches
382
- * this only made an operational difference for posix classes as they respect encoding flags
383
- - Fixed `#options` not respecting all negative options in weird cases like '(?u-m-x)'
384
- - Fixed `Group#option_changes` not accounting for indirectly disabled (overridden) encoding flags
385
- - Fixed `Scanner` allowing negative encoding options if there were no positive options, e.g. '(?-u)'
386
- - Fixed `ScannerError` for some valid meta/control sequences such as '\\C-\\\\'
387
- - Fixed `Expression::Base#match` and `#=~` not working with a single argument
388
-
389
- ## [1.5.0] - 2019-05-14 - [Janosch Müller](mailto:janosch84@gmail.com)
390
-
391
- ### Added
392
-
393
- - Added `#referenced_expression` for backrefs, subexp calls and conditionals
394
- * returns the `Group` expression that is being referenced via name or number
395
- - Added `Expression::Base#repetitions`
396
- * returns a `Range` of allowed repetitions (`1..1` if there is no quantifier)
397
- * like `#quantity` but with a more uniform interface
398
- - Added `Expression::Base#match_length`
399
- * allows inspecting and iterating over the String lengths matched by the Expression
400
-
401
- ### Fixed
402
-
403
- - Fixed `Expression::Base#clone` "direction"
404
- * it used to dup ivars onto the callee, leaving only the clone referencing the original objects
405
- * this will affect you if you call `#eql?`/`#equal?` on expressions or use them as Hash keys
406
- - Fixed `#clone` results for `Sequences`, e.g. alternations and conditionals
407
- * the inner `#text` was cloned onto the `Sequence` and thus duplicated
408
- * e.g. `Regexp::Parser.parse(/(a|bc)/).clone.to_s # => (aa|bcbc)`
409
- - Fixed inconsistent `#to_s` output for `Sequences`
410
- * it used to return only the "specific" text, e.g. "|" for an alternation
411
- * now it includes nested expressions as it does for all other `Subexpressions`
412
- - Fixed quantification of codepoint lists with more than one entry (`\u{62 63 64}+`)
413
- * quantifiers apply only to the last entry, so this token is now split up if quantified
414
-
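The reason for splitting a quantified codepoint list is visible in Ruby's own matching: the quantifier repeats only the last codepoint. A short plain-Ruby sketch (no regexp_parser):

```ruby
# \u{62 63 64} is "b", "c", "d"; the + repeats only the final "d".
list_quantified = /\A\u{62 63 64}+\z/

puts list_quantified.match?("bcd")    # => true
puts list_quantified.match?("bcddd")  # => true
puts list_quantified.match?("bcdbcd") # => false
```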
415
- ## [1.4.0] - 2019-04-02 - [Janosch Müller](mailto:janosch84@gmail.com)
416
-
417
- ### Added
418
-
419
- - Added support for 19 new unicode properties introduced in Ruby 2.6.0
420
-
421
- ## [1.3.0] - 2018-11-14 - [Janosch Müller](mailto:janosch84@gmail.com)
422
-
423
- ### Added
424
-
425
- - `Syntax#features` returns a `Hash` of all types and tokens supported by a given `Syntax`
426
-
427
- ### Fixed
428
-
429
- - Thanks to [Akira Matsuda](https://github.com/amatsuda)
430
- * eliminated warning "assigned but unused variable - testEof"
431
-
432
- ## [1.2.0] - 2018-09-28 - [Janosch Müller](mailto:janosch84@gmail.com)
433
-
434
- ### Added
435
-
436
- - `Subexpression` (branch node) includes `Enumerable`, which allows calling `#select` etc. on its children
437
-
438
- ### Fixed
439
-
440
- - Fixed missing quantifier in `Conditional::Expression` methods `#to_s`, `#to_re`
441
- - `Conditional::Condition` no longer lives outside the recursive `#expressions` tree
442
- * it used to be the only expression stored in a custom ivar, complicating traversal
443
- * its setter and getter (`#condition=`, `#condition`) still work as before
444
-
445
- ## [1.1.0] - 2018-09-17 - [Janosch Müller](mailto:janosch84@gmail.com)
446
-
447
- ### Added
448
-
449
- - Added `Quantifier` methods `#greedy?`, `#possessive?`, `#reluctant?`/`#lazy?`
450
- - Added `Group::Options#option_changes`
451
- * shows the options enabled or disabled by the given options group
452
- * as with all other expressions, `#options` shows the overall active options
453
- - Added `Conditional#reference` and `Condition#reference`, indicating the determinative group
454
- - Added `Subexpression#dig`, acts like [`Array#dig`](http://ruby-doc.org/core-2.5.0/Array.html#method-i-dig)
455
-
456
- ### Fixed
457
-
458
- - Fixed parsing of quantified conditional expressions (quantifiers were assigned to the wrong expression)
459
- - Fixed scanning and parsing of forward-referring subexpression calls (e.g. `\g<+1>`)
460
- - `Root` and `Sequence` expressions now support the same constructor signature as all other expressions
461
-
462
- ## [1.0.0] - 2018-09-01 - [Janosch Müller](mailto:janosch84@gmail.com)
463
-
464
- This release includes several breaking changes, mostly to character sets, #map and properties.
465
-
466
- ### Changed
467
-
468
- - Changed handling of sets (a.k.a. character classes or "bracket expressions")
469
- * see PR [#55](https://github.com/ammar/regexp_parser/pull/55) / issue [#47](https://github.com/ammar/regexp_parser/issues/47) for details
470
- * sets are now parsed to expression trees like other nestable expressions
471
- * `#scan` now emits the same tokens as outside sets (no longer `:set, :member`)
472
- * `CharacterSet#members` has been removed
473
- * new `Range` and `Intersection` classes represent corresponding syntax features
474
- * a new `PosixClass` expression class represents e.g. `[[:ascii:]]`
475
- * `PosixClass` instances behave like `Property` ones, e.g. support `#negative?`
476
- * `#scan` emits `:(non)posixclass, :<type>` instead of `:set, :char_(non)<type>`
477
- - Changed `Subexpression#map` to act like regular `Enumerable#map`
478
- * the old behavior is available as `Subexpression#flat_map`
479
- * e.g. `parse(/[a]/).map(&:to_s) == ["[a]"]`; used to be `["[a]", "a"]`
480
- - Changed expression emissions for some escape sequences
481
- * `EscapeSequence::Codepoint`, `CodepointList`, `Hex` and `Octal` are now all used
482
- * they already existed, but were all parsed as `EscapeSequence::Literal`
483
- * e.g. `\x97` is now `EscapeSequence::Hex` instead of `EscapeSequence::Literal`
484
- - Changed naming of many property tokens (emitted for `\p{...}`)
485
- * if you work with these tokens, see PR [#56](https://github.com/ammar/regexp_parser/pull/56) for details
486
- * e.g. `:punct_dash` is now `:dash_punctuation`
487
- - Changed `(?m)` and the like to emit as `:options_switch` token (@4ade4d1)
488
- * allows differentiating from group-local `:options`, e.g. `(?m:.)`
489
- - Changed name of `Backreference::..NestLevel` to `..RecursionLevel` (@4184339)
490
- - Changed `Backreference::Number#number` from `String` to `Integer` (@40a2231)
491
-
492
- ### Added
493
-
494
- - Added support for all previously missing properties (about 250)
495
- - Added `Expression::UnicodeProperty#shortcut` (e.g. returns "m" for `\p{mark}`)
496
- - Added `#char(s)` and `#codepoint(s)` methods to all `EscapeSequence` expressions
497
- - Added `#number`/`#name`/`#recursion_level` to all backref/call expressions (@174bf21)
498
- - Added `#number` and `#number_at_level` to capturing group expressions (@40a2231)
499
-
500
- ### Fixed
501
-
502
- - Fixed Ruby version mapping of some properties
503
- - Fixed scanning of some property spellings, e.g. with dashes
504
- - Fixed some incorrect property alias normalizations
505
- - Fixed scanning of codepoint escapes with 6 digits (e.g. `\u{10FFFF}`)
506
- - Fixed scanning of `\R` and `\X` within sets; they act as literals there
507
-
508
- ## [0.5.0] - 2018-04-29 - [Janosch Müller](mailto:janosch84@gmail.com)
509
-
510
- ### Changed
511
-
512
- - Changed handling of Ruby versions (PR [#53](https://github.com/ammar/regexp_parser/pull/53))
513
- * New Ruby versions are now supported by default
514
- * Some deep-lying APIs have changed, which should not affect most users:
515
- * `Regexp::Syntax::VERSIONS` is gone
516
- * Syntax version names have changed from `Regexp::Syntax::Ruby::Vnnn`
517
- to `Regexp::Syntax::Vn_n_n`
518
- * Syntax version classes for Ruby versions without regex feature changes
519
- are no longer predefined and are now only created on demand / lazily
520
- * `Regexp::Syntax::supported?` returns true for any argument >= 1.8.6
521
-
522
- ### Fixed
523
-
524
- - Fixed some use cases of Expression methods #strfregexp and #to_h (@e738107)
525
-
526
- ### Added
527
-
528
- - Added full signature support to collection methods of Expressions (@aa7c55a)
529
-
530
- ## [0.4.13] - 2018-04-04 - [Ammar Ali](mailto:ammarabuali@gmail.com)
531
-
532
- - Added ruby version files for 2.2.10 and 2.3.7
533
-
534
- ## [0.4.12] - 2018-03-30 - [Janosch Müller](mailto:janosch84@gmail.com)
535
-
536
- - Added ruby version files for 2.4.4 and 2.5.1
537
-
538
- ## [0.4.11] - 2018-03-04 - [Janosch Müller](mailto:janosch84@gmail.com)
539
-
540
- - Fixed UnknownSyntaxNameError introduced in v0.4.10 if
541
- the gem's parent dir tree included a 'ruby' dir
542
-
543
- ## [0.4.10] - 2018-03-04 - [Janosch Müller](mailto:janosch84@gmail.com)
544
-
545
- - Added ruby version file for 2.6.0
546
- - Added support for Emoji properties (available in Ruby since 2.5.0)
547
- - Added support for XPosixPunct and Regional_Indicator properties
548
- - Fixed parsing of Unicode 6.0 and 7.0 script properties
549
- - Fixed parsing of the special Assigned property
550
- - Fixed scanning of InCyrillic_Supplement property
551
-
552
- ## [0.4.9] - 2017-12-25 - [Ammar Ali](mailto:ammarabuali@gmail.com)
553
-
554
- - Added ruby version file for 2.5.0
555
-
556
- ## [0.4.8] - 2017-12-18 - [Janosch Müller](mailto:janosch84@gmail.com)
557
-
558
- - Added ruby version files for 2.2.9, 2.3.6, and 2.4.3
559
-
560
- ## [0.4.7] - 2017-10-15 - [Janosch Müller](mailto:janosch84@gmail.com)
561
-
562
- - Fixed a thread safety issue (issue #45)
563
- - Some public class methods that were only reliable for
564
- internal use are now private instance methods (PR #46)
565
- - Improved the usefulness of Expression::Base#options (issue #43) -
566
- #options and derived methods such as #i?, #m? and #x? are now
567
- defined for all Expressions that are affected by such flags.
568
- - Fixed scanning of whitespace following (?x) (commit 5c94bd2)
569
- - Fixed a Parser bug where the #number attribute of traditional
570
- numerical backreferences was not set correctly (commit 851b620)
571
-
572
- ## [0.4.6] - 2017-09-18 - [Janosch Müller](mailto:janosch84@gmail.com)
573
-
574
- - Added Parser support for hex escapes in sets (PR #36)
575
- - Added Parser support for octal escapes (PR #37)
576
- - Added support for cluster types \R and \X (PR #38)
577
- - Added support for more metacontrol notations (PR #39)
578
-
579
- ## [0.4.5] - 2017-09-17 - [Ammar Ali](mailto:ammarabuali@gmail.com)
580
-
581
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
582
- * Support ruby 2.2.7 (PR #42)
583
- - Added ruby version files for 2.2.8, 2.3.5, and 2.4.2
584
-
585
- ## [0.4.4] - 2017-07-10 - [Ammar Ali](mailto:ammarabuali@gmail.com)
586
-
587
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
588
- * Add support for new absence operator (PR #33)
589
- - Thanks to [Bartek Bułat](https://github.com/barthez):
590
- * Add support for Ruby 2.3.4 version (PR #40)
591
-
592
- ## [0.4.3] - 2017-03-24 - [Ammar Ali](mailto:ammarabuali@gmail.com)
593
-
594
- - Added ruby version file for 2.4.1
595
-
596
- ## [0.4.2] - 2017-01-10 - [Ammar Ali](mailto:ammarabuali@gmail.com)
597
-
598
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
599
- * Support ruby 2.4 (PR #30)
600
- * Improve codepoint handling (PR #27)
601
-
602
- ## [0.4.1] - 2016-11-22 - [Ammar Ali](mailto:ammarabuali@gmail.com)
603
-
604
- - Updated ruby version file for 2.3.3
605
-
606
- ## [0.4.0] - 2016-11-20 - [Ammar Ali](mailto:ammarabuali@gmail.com)
607
-
608
- - Added Syntax.supported? method
609
- - Updated ruby versions for latest releases; 2.1.10, 2.2.6, and 2.3.2
610
-
611
- ## [0.3.6] - 2016-06-08 - [Ammar Ali](mailto:ammarabuali@gmail.com)
612
-
613
- - Thanks to [John Backus](https://github.com/backus):
614
- * Remove warnings (PR #26)
615
-
616
- ## [0.3.5] - 2016-05-30 - [Ammar Ali](mailto:ammarabuali@gmail.com)
617
-
618
- - Thanks to [John Backus](https://github.com/backus):
619
- * Fix parsing of /\xFF/n (hex:escape) (PR #24)
620
-
621
- ## [0.3.4] - 2016-05-25 - [Ammar Ali](mailto:ammarabuali@gmail.com)
622
-
623
- - Thanks to [John Backus](https://github.com/backus):
624
- * Fix warnings (PR #19)
625
- - Thanks to [Dana Scheider](https://github.com/danascheider):
626
- * Correct error in README (PR #20)
627
- - Fixed mistyped \h and \H character types (issue #21)
628
- - Added ancestry syntax files for latest rubies (issue #22)
629
-
630
- ## [0.3.3] - 2016-04-26 - [Ammar Ali](mailto:ammarabuali@gmail.com)
631
-
632
- - Thanks to [John Backus](https://github.com/backus):
633
- * Fixed scanning of zero length comments (PR #12)
634
- * Fixed missing escape:codepoint_list syntax token (PR #14)
635
- * Fixed to_s for modified interval quantifiers (PR #17)
636
-
637
- ## [0.3.2] - 2016-01-01 - [Ammar Ali](mailto:ammarabuali@gmail.com)
638
-
639
- - Updated ruby versions for latest releases; 2.1.8, 2.2.4, and 2.3.0
640
- - Fixed class name for UnknownSyntaxNameError exception
641
- - Added UnicodeBlocks support to the parser.
642
- - Added UnicodeBlocks support to the scanner.
643
- - Added expand_members method to CharacterSet, returns traditional
644
- or unicode property forms of shorthands (\d, \W, \s, etc.)
645
- - Improved meaning and output of %t and %T in strfregexp.
646
- - Added syntax versions for ruby 2.1.4 and 2.1.5 and updated
647
- latest 2.1 version.
648
- - Added to_h methods to Expression, Subexpression, and Quantifier.
649
- - Added traversal methods; traverse, each_expression, and map.
650
- - Added token/type test methods; type?, is?, and one_of?
651
- - Added printing method strfregexp, inspired by strftime.
652
- - Added scanning and parsing of free spacing (x mode) expressions.
653
- - Improved handling of inline options (?mixdau:...)
654
- - Added conditional expressions. Ruby 2.0.
655
- - Added keep (\K) markers. Ruby 2.0.
656
- - Added d, a, and u options. Ruby 2.0.
657
- - Added missing meta sequences to the parser. They were supported by the scanner only.
658
- - Renamed Lexer's method to lex, added an alias to the old name (scan)
659
- - Use #map instead of #each to run the block in Lexer.lex.
660
- - Replaced VERSION.yml file with a constant.
661
- - Update tokens and scanner with new additions in Unicode 7.0.
662
-
663
- ## [0.1.6] - 2014-10-06 - [Ammar Ali](mailto:ammarabuali@gmail.com)
664
-
665
- - Fixed test and gem building rake tasks and extracted the gem
666
- specification from the Rakefile into a .gemspec file.
667
- - Added syntax files for missing ruby 2.x versions. These do not add
668
- extra syntax support, they just make the gem work with the newer
669
- ruby versions.
670
- - Fixed a parser bug where an alternation sequence that contained nested expressions was incorrectly being appended to the parent expression when the nesting was exited. e.g. in /a|(b)c/, c was appended to the root.
671
- - Fixed a bug where character types were not being correctly scanned within character sets. e.g. in [\d], two tokens were scanned; one for the backslash '\' and one for the 'd'
672
-
673
- ## [0.1.5] - 2014-01-14 - [Ammar Ali](mailto:ammarabuali@gmail.com)
674
-
675
- - Added syntax stubs for ruby versions 2.0 and 2.1
676
- - Added clone methods for deep copying expressions.
677
- - Added optional format argument for to_s on expressions to return the text of the expression with (:full, the default) or without (:base) its quantifier.
678
- - Renamed the :beginning_of_line and :end_of_line tokens to :bol and :eol.
679
- - Fixed a bug where alternations with more than two alternatives and one of them ending in a group were being incorrectly nested.
680
- - Improved EOF handling in general and especially from sequences like hex and control escapes.
681
- - Fixed a bug where named groups with an empty name would return a blank token [].
682
- - Fixed a bug where members of a parent set were being added to its last subset.
683
- - Fixed a few mutable string bugs by calling dup on the originals.
684
- - Made ruby 1.8.6 the base for all 1.8 syntax, and the 1.8 name a pointer to the latest (1.8.7 at this time)
685
- - Removed look-behind assertions (positive and negative) from 1.8 syntax
686
- - Added control (\cc and \C-c) and meta (\M-c) escapes to 1.8 syntax
687
- - The default syntax is now the one of the running ruby version in both the lexer and the parser.
688
-
689
- ## [0.1.0] - 2010-11-21 - [Ammar Ali](mailto:ammarabuali@gmail.com)
690
-
691
- - Initial release