regexp_parser 1.7.0 → 2.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (166)
  1. checksums.yaml +4 -4
  2. data/Gemfile +9 -3
  3. data/LICENSE +1 -1
  4. data/Rakefile +6 -70
  5. data/lib/regexp_parser/error.rb +4 -0
  6. data/lib/regexp_parser/expression/base.rb +76 -0
  7. data/lib/regexp_parser/expression/classes/alternation.rb +1 -1
  8. data/lib/regexp_parser/expression/classes/anchor.rb +0 -2
  9. data/lib/regexp_parser/expression/classes/{backref.rb → backreference.rb} +22 -2
  10. data/lib/regexp_parser/expression/classes/{set → character_set}/range.rb +4 -8
  11. data/lib/regexp_parser/expression/classes/{set.rb → character_set.rb} +4 -8
  12. data/lib/regexp_parser/expression/classes/{type.rb → character_type.rb} +0 -2
  13. data/lib/regexp_parser/expression/classes/conditional.rb +11 -5
  14. data/lib/regexp_parser/expression/classes/{escape.rb → escape_sequence.rb} +15 -7
  15. data/lib/regexp_parser/expression/classes/free_space.rb +5 -5
  16. data/lib/regexp_parser/expression/classes/group.rb +28 -15
  17. data/lib/regexp_parser/expression/classes/keep.rb +2 -0
  18. data/lib/regexp_parser/expression/classes/literal.rb +1 -5
  19. data/lib/regexp_parser/expression/classes/posix_class.rb +5 -5
  20. data/lib/regexp_parser/expression/classes/root.rb +4 -19
  21. data/lib/regexp_parser/expression/classes/{property.rb → unicode_property.rb} +11 -12
  22. data/lib/regexp_parser/expression/methods/construct.rb +41 -0
  23. data/lib/regexp_parser/expression/methods/human_name.rb +43 -0
  24. data/lib/regexp_parser/expression/methods/match_length.rb +11 -7
  25. data/lib/regexp_parser/expression/methods/negative.rb +20 -0
  26. data/lib/regexp_parser/expression/methods/parts.rb +23 -0
  27. data/lib/regexp_parser/expression/methods/printing.rb +26 -0
  28. data/lib/regexp_parser/expression/methods/strfregexp.rb +1 -1
  29. data/lib/regexp_parser/expression/methods/tests.rb +47 -1
  30. data/lib/regexp_parser/expression/methods/traverse.rb +34 -18
  31. data/lib/regexp_parser/expression/quantifier.rb +57 -17
  32. data/lib/regexp_parser/expression/sequence.rb +11 -47
  33. data/lib/regexp_parser/expression/sequence_operation.rb +4 -9
  34. data/lib/regexp_parser/expression/shared.rb +111 -0
  35. data/lib/regexp_parser/expression/subexpression.rb +27 -19
  36. data/lib/regexp_parser/expression.rb +15 -141
  37. data/lib/regexp_parser/lexer.rb +83 -41
  38. data/lib/regexp_parser/parser.rb +372 -429
  39. data/lib/regexp_parser/scanner/char_type.rl +11 -11
  40. data/lib/regexp_parser/scanner/errors/premature_end_error.rb +8 -0
  41. data/lib/regexp_parser/scanner/errors/scanner_error.rb +6 -0
  42. data/lib/regexp_parser/scanner/errors/validation_error.rb +63 -0
  43. data/lib/regexp_parser/scanner/properties/long.csv +651 -0
  44. data/lib/regexp_parser/scanner/properties/short.csv +249 -0
  45. data/lib/regexp_parser/scanner/property.rl +4 -4
  46. data/lib/regexp_parser/scanner/scanner.rl +303 -368
  47. data/lib/regexp_parser/scanner.rb +1423 -1674
  48. data/lib/regexp_parser/syntax/any.rb +2 -7
  49. data/lib/regexp_parser/syntax/base.rb +92 -67
  50. data/lib/regexp_parser/syntax/token/anchor.rb +15 -0
  51. data/lib/regexp_parser/syntax/{tokens → token}/assertion.rb +2 -2
  52. data/lib/regexp_parser/syntax/token/backreference.rb +33 -0
  53. data/lib/regexp_parser/syntax/token/character_set.rb +16 -0
  54. data/lib/regexp_parser/syntax/{tokens → token}/character_type.rb +3 -3
  55. data/lib/regexp_parser/syntax/{tokens → token}/conditional.rb +3 -3
  56. data/lib/regexp_parser/syntax/token/escape.rb +33 -0
  57. data/lib/regexp_parser/syntax/{tokens → token}/group.rb +7 -7
  58. data/lib/regexp_parser/syntax/{tokens → token}/keep.rb +1 -1
  59. data/lib/regexp_parser/syntax/token/meta.rb +20 -0
  60. data/lib/regexp_parser/syntax/{tokens → token}/posix_class.rb +3 -3
  61. data/lib/regexp_parser/syntax/token/quantifier.rb +35 -0
  62. data/lib/regexp_parser/syntax/token/unicode_property.rb +751 -0
  63. data/lib/regexp_parser/syntax/token/virtual.rb +11 -0
  64. data/lib/regexp_parser/syntax/token.rb +45 -0
  65. data/lib/regexp_parser/syntax/version_lookup.rb +19 -36
  66. data/lib/regexp_parser/syntax/versions/1.8.6.rb +13 -20
  67. data/lib/regexp_parser/syntax/versions/1.9.1.rb +10 -17
  68. data/lib/regexp_parser/syntax/versions/1.9.3.rb +3 -10
  69. data/lib/regexp_parser/syntax/versions/2.0.0.rb +8 -15
  70. data/lib/regexp_parser/syntax/versions/2.2.0.rb +3 -9
  71. data/lib/regexp_parser/syntax/versions/2.3.0.rb +3 -9
  72. data/lib/regexp_parser/syntax/versions/2.4.0.rb +3 -9
  73. data/lib/regexp_parser/syntax/versions/2.4.1.rb +2 -8
  74. data/lib/regexp_parser/syntax/versions/2.5.0.rb +3 -9
  75. data/lib/regexp_parser/syntax/versions/2.6.0.rb +3 -9
  76. data/lib/regexp_parser/syntax/versions/2.6.2.rb +3 -9
  77. data/lib/regexp_parser/syntax/versions/2.6.3.rb +3 -9
  78. data/lib/regexp_parser/syntax/versions/3.1.0.rb +4 -0
  79. data/lib/regexp_parser/syntax/versions/3.2.0.rb +4 -0
  80. data/lib/regexp_parser/syntax/versions.rb +3 -1
  81. data/lib/regexp_parser/syntax.rb +8 -6
  82. data/lib/regexp_parser/token.rb +9 -20
  83. data/lib/regexp_parser/version.rb +1 -1
  84. data/lib/regexp_parser.rb +0 -2
  85. data/regexp_parser.gemspec +19 -23
  86. metadata +53 -171
  87. data/CHANGELOG.md +0 -349
  88. data/README.md +0 -470
  89. data/lib/regexp_parser/scanner/properties/long.yml +0 -594
  90. data/lib/regexp_parser/scanner/properties/short.yml +0 -237
  91. data/lib/regexp_parser/syntax/tokens/anchor.rb +0 -15
  92. data/lib/regexp_parser/syntax/tokens/backref.rb +0 -24
  93. data/lib/regexp_parser/syntax/tokens/character_set.rb +0 -13
  94. data/lib/regexp_parser/syntax/tokens/escape.rb +0 -30
  95. data/lib/regexp_parser/syntax/tokens/meta.rb +0 -13
  96. data/lib/regexp_parser/syntax/tokens/quantifier.rb +0 -35
  97. data/lib/regexp_parser/syntax/tokens/unicode_property.rb +0 -675
  98. data/lib/regexp_parser/syntax/tokens.rb +0 -45
  99. data/spec/expression/base_spec.rb +0 -94
  100. data/spec/expression/clone_spec.rb +0 -120
  101. data/spec/expression/conditional_spec.rb +0 -89
  102. data/spec/expression/free_space_spec.rb +0 -27
  103. data/spec/expression/methods/match_length_spec.rb +0 -161
  104. data/spec/expression/methods/match_spec.rb +0 -25
  105. data/spec/expression/methods/strfregexp_spec.rb +0 -224
  106. data/spec/expression/methods/tests_spec.rb +0 -99
  107. data/spec/expression/methods/traverse_spec.rb +0 -161
  108. data/spec/expression/options_spec.rb +0 -128
  109. data/spec/expression/root_spec.rb +0 -9
  110. data/spec/expression/sequence_spec.rb +0 -9
  111. data/spec/expression/subexpression_spec.rb +0 -50
  112. data/spec/expression/to_h_spec.rb +0 -26
  113. data/spec/expression/to_s_spec.rb +0 -100
  114. data/spec/lexer/all_spec.rb +0 -22
  115. data/spec/lexer/conditionals_spec.rb +0 -53
  116. data/spec/lexer/escapes_spec.rb +0 -14
  117. data/spec/lexer/keep_spec.rb +0 -10
  118. data/spec/lexer/literals_spec.rb +0 -89
  119. data/spec/lexer/nesting_spec.rb +0 -99
  120. data/spec/lexer/refcalls_spec.rb +0 -55
  121. data/spec/parser/all_spec.rb +0 -43
  122. data/spec/parser/alternation_spec.rb +0 -88
  123. data/spec/parser/anchors_spec.rb +0 -17
  124. data/spec/parser/conditionals_spec.rb +0 -179
  125. data/spec/parser/errors_spec.rb +0 -30
  126. data/spec/parser/escapes_spec.rb +0 -121
  127. data/spec/parser/free_space_spec.rb +0 -130
  128. data/spec/parser/groups_spec.rb +0 -108
  129. data/spec/parser/keep_spec.rb +0 -6
  130. data/spec/parser/posix_classes_spec.rb +0 -8
  131. data/spec/parser/properties_spec.rb +0 -115
  132. data/spec/parser/quantifiers_spec.rb +0 -51
  133. data/spec/parser/refcalls_spec.rb +0 -112
  134. data/spec/parser/set/intersections_spec.rb +0 -127
  135. data/spec/parser/set/ranges_spec.rb +0 -111
  136. data/spec/parser/sets_spec.rb +0 -178
  137. data/spec/parser/types_spec.rb +0 -18
  138. data/spec/scanner/all_spec.rb +0 -18
  139. data/spec/scanner/anchors_spec.rb +0 -21
  140. data/spec/scanner/conditionals_spec.rb +0 -128
  141. data/spec/scanner/errors_spec.rb +0 -68
  142. data/spec/scanner/escapes_spec.rb +0 -53
  143. data/spec/scanner/free_space_spec.rb +0 -133
  144. data/spec/scanner/groups_spec.rb +0 -52
  145. data/spec/scanner/keep_spec.rb +0 -10
  146. data/spec/scanner/literals_spec.rb +0 -49
  147. data/spec/scanner/meta_spec.rb +0 -18
  148. data/spec/scanner/properties_spec.rb +0 -64
  149. data/spec/scanner/quantifiers_spec.rb +0 -20
  150. data/spec/scanner/refcalls_spec.rb +0 -36
  151. data/spec/scanner/sets_spec.rb +0 -102
  152. data/spec/scanner/types_spec.rb +0 -14
  153. data/spec/spec_helper.rb +0 -15
  154. data/spec/support/runner.rb +0 -42
  155. data/spec/support/shared_examples.rb +0 -77
  156. data/spec/support/warning_extractor.rb +0 -60
  157. data/spec/syntax/syntax_spec.rb +0 -48
  158. data/spec/syntax/syntax_token_map_spec.rb +0 -23
  159. data/spec/syntax/versions/1.8.6_spec.rb +0 -17
  160. data/spec/syntax/versions/1.9.1_spec.rb +0 -10
  161. data/spec/syntax/versions/1.9.3_spec.rb +0 -9
  162. data/spec/syntax/versions/2.0.0_spec.rb +0 -13
  163. data/spec/syntax/versions/2.2.0_spec.rb +0 -9
  164. data/spec/syntax/versions/aliases_spec.rb +0 -37
  165. data/spec/token/token_spec.rb +0 -85
  166. data/lib/regexp_parser/expression/classes/{set → character_set}/intersection.rb +0 -0
metadata CHANGED
@@ -1,80 +1,92 @@
  --- !ruby/object:Gem::Specification
  name: regexp_parser
  version: !ruby/object:Gem::Version
- version: 1.7.0
+ version: 2.9.0
  platform: ruby
  authors:
  - Ammar Ali
- autorequire:
+ - Janosch Müller
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-02-23 00:00:00.000000000 Z
+ date: 2024-01-07 00:00:00.000000000 Z
  dependencies: []
  description: A library for tokenizing, lexing, and parsing Ruby regular expressions.
  email:
  - ammarabuali@gmail.com
+ - janosch84@gmail.com
  executables: []
  extensions: []
  extra_rdoc_files: []
  files:
- - CHANGELOG.md
  - Gemfile
  - LICENSE
- - README.md
  - Rakefile
  - lib/regexp_parser.rb
+ - lib/regexp_parser/error.rb
  - lib/regexp_parser/expression.rb
+ - lib/regexp_parser/expression/base.rb
  - lib/regexp_parser/expression/classes/alternation.rb
  - lib/regexp_parser/expression/classes/anchor.rb
- - lib/regexp_parser/expression/classes/backref.rb
+ - lib/regexp_parser/expression/classes/backreference.rb
+ - lib/regexp_parser/expression/classes/character_set.rb
+ - lib/regexp_parser/expression/classes/character_set/intersection.rb
+ - lib/regexp_parser/expression/classes/character_set/range.rb
+ - lib/regexp_parser/expression/classes/character_type.rb
  - lib/regexp_parser/expression/classes/conditional.rb
- - lib/regexp_parser/expression/classes/escape.rb
+ - lib/regexp_parser/expression/classes/escape_sequence.rb
  - lib/regexp_parser/expression/classes/free_space.rb
  - lib/regexp_parser/expression/classes/group.rb
  - lib/regexp_parser/expression/classes/keep.rb
  - lib/regexp_parser/expression/classes/literal.rb
  - lib/regexp_parser/expression/classes/posix_class.rb
- - lib/regexp_parser/expression/classes/property.rb
  - lib/regexp_parser/expression/classes/root.rb
- - lib/regexp_parser/expression/classes/set.rb
- - lib/regexp_parser/expression/classes/set/intersection.rb
- - lib/regexp_parser/expression/classes/set/range.rb
- - lib/regexp_parser/expression/classes/type.rb
+ - lib/regexp_parser/expression/classes/unicode_property.rb
+ - lib/regexp_parser/expression/methods/construct.rb
+ - lib/regexp_parser/expression/methods/human_name.rb
  - lib/regexp_parser/expression/methods/match.rb
  - lib/regexp_parser/expression/methods/match_length.rb
+ - lib/regexp_parser/expression/methods/negative.rb
  - lib/regexp_parser/expression/methods/options.rb
+ - lib/regexp_parser/expression/methods/parts.rb
+ - lib/regexp_parser/expression/methods/printing.rb
  - lib/regexp_parser/expression/methods/strfregexp.rb
  - lib/regexp_parser/expression/methods/tests.rb
  - lib/regexp_parser/expression/methods/traverse.rb
  - lib/regexp_parser/expression/quantifier.rb
  - lib/regexp_parser/expression/sequence.rb
  - lib/regexp_parser/expression/sequence_operation.rb
+ - lib/regexp_parser/expression/shared.rb
  - lib/regexp_parser/expression/subexpression.rb
  - lib/regexp_parser/lexer.rb
  - lib/regexp_parser/parser.rb
  - lib/regexp_parser/scanner.rb
  - lib/regexp_parser/scanner/char_type.rl
- - lib/regexp_parser/scanner/properties/long.yml
- - lib/regexp_parser/scanner/properties/short.yml
+ - lib/regexp_parser/scanner/errors/premature_end_error.rb
+ - lib/regexp_parser/scanner/errors/scanner_error.rb
+ - lib/regexp_parser/scanner/errors/validation_error.rb
+ - lib/regexp_parser/scanner/properties/long.csv
+ - lib/regexp_parser/scanner/properties/short.csv
  - lib/regexp_parser/scanner/property.rl
  - lib/regexp_parser/scanner/scanner.rl
  - lib/regexp_parser/syntax.rb
  - lib/regexp_parser/syntax/any.rb
  - lib/regexp_parser/syntax/base.rb
- - lib/regexp_parser/syntax/tokens.rb
- - lib/regexp_parser/syntax/tokens/anchor.rb
- - lib/regexp_parser/syntax/tokens/assertion.rb
- - lib/regexp_parser/syntax/tokens/backref.rb
- - lib/regexp_parser/syntax/tokens/character_set.rb
- - lib/regexp_parser/syntax/tokens/character_type.rb
- - lib/regexp_parser/syntax/tokens/conditional.rb
- - lib/regexp_parser/syntax/tokens/escape.rb
- - lib/regexp_parser/syntax/tokens/group.rb
- - lib/regexp_parser/syntax/tokens/keep.rb
- - lib/regexp_parser/syntax/tokens/meta.rb
- - lib/regexp_parser/syntax/tokens/posix_class.rb
- - lib/regexp_parser/syntax/tokens/quantifier.rb
- - lib/regexp_parser/syntax/tokens/unicode_property.rb
+ - lib/regexp_parser/syntax/token.rb
+ - lib/regexp_parser/syntax/token/anchor.rb
+ - lib/regexp_parser/syntax/token/assertion.rb
+ - lib/regexp_parser/syntax/token/backreference.rb
+ - lib/regexp_parser/syntax/token/character_set.rb
+ - lib/regexp_parser/syntax/token/character_type.rb
+ - lib/regexp_parser/syntax/token/conditional.rb
+ - lib/regexp_parser/syntax/token/escape.rb
+ - lib/regexp_parser/syntax/token/group.rb
+ - lib/regexp_parser/syntax/token/keep.rb
+ - lib/regexp_parser/syntax/token/meta.rb
+ - lib/regexp_parser/syntax/token/posix_class.rb
+ - lib/regexp_parser/syntax/token/quantifier.rb
+ - lib/regexp_parser/syntax/token/unicode_property.rb
+ - lib/regexp_parser/syntax/token/virtual.rb
  - lib/regexp_parser/syntax/version_lookup.rb
  - lib/regexp_parser/syntax/versions.rb
  - lib/regexp_parser/syntax/versions/1.8.6.rb
@@ -89,167 +101,37 @@ files:
  - lib/regexp_parser/syntax/versions/2.6.0.rb
  - lib/regexp_parser/syntax/versions/2.6.2.rb
  - lib/regexp_parser/syntax/versions/2.6.3.rb
+ - lib/regexp_parser/syntax/versions/3.1.0.rb
+ - lib/regexp_parser/syntax/versions/3.2.0.rb
  - lib/regexp_parser/token.rb
  - lib/regexp_parser/version.rb
  - regexp_parser.gemspec
- - spec/expression/base_spec.rb
- - spec/expression/clone_spec.rb
- - spec/expression/conditional_spec.rb
- - spec/expression/free_space_spec.rb
- - spec/expression/methods/match_length_spec.rb
- - spec/expression/methods/match_spec.rb
- - spec/expression/methods/strfregexp_spec.rb
- - spec/expression/methods/tests_spec.rb
- - spec/expression/methods/traverse_spec.rb
- - spec/expression/options_spec.rb
- - spec/expression/root_spec.rb
- - spec/expression/sequence_spec.rb
- - spec/expression/subexpression_spec.rb
- - spec/expression/to_h_spec.rb
- - spec/expression/to_s_spec.rb
- - spec/lexer/all_spec.rb
- - spec/lexer/conditionals_spec.rb
- - spec/lexer/escapes_spec.rb
- - spec/lexer/keep_spec.rb
- - spec/lexer/literals_spec.rb
- - spec/lexer/nesting_spec.rb
- - spec/lexer/refcalls_spec.rb
- - spec/parser/all_spec.rb
- - spec/parser/alternation_spec.rb
- - spec/parser/anchors_spec.rb
- - spec/parser/conditionals_spec.rb
- - spec/parser/errors_spec.rb
- - spec/parser/escapes_spec.rb
- - spec/parser/free_space_spec.rb
- - spec/parser/groups_spec.rb
- - spec/parser/keep_spec.rb
- - spec/parser/posix_classes_spec.rb
- - spec/parser/properties_spec.rb
- - spec/parser/quantifiers_spec.rb
- - spec/parser/refcalls_spec.rb
- - spec/parser/set/intersections_spec.rb
- - spec/parser/set/ranges_spec.rb
- - spec/parser/sets_spec.rb
- - spec/parser/types_spec.rb
- - spec/scanner/all_spec.rb
- - spec/scanner/anchors_spec.rb
- - spec/scanner/conditionals_spec.rb
- - spec/scanner/errors_spec.rb
- - spec/scanner/escapes_spec.rb
- - spec/scanner/free_space_spec.rb
- - spec/scanner/groups_spec.rb
- - spec/scanner/keep_spec.rb
- - spec/scanner/literals_spec.rb
- - spec/scanner/meta_spec.rb
- - spec/scanner/properties_spec.rb
- - spec/scanner/quantifiers_spec.rb
- - spec/scanner/refcalls_spec.rb
- - spec/scanner/sets_spec.rb
- - spec/scanner/types_spec.rb
- - spec/spec_helper.rb
- - spec/support/runner.rb
- - spec/support/shared_examples.rb
- - spec/support/warning_extractor.rb
- - spec/syntax/syntax_spec.rb
- - spec/syntax/syntax_token_map_spec.rb
- - spec/syntax/versions/1.8.6_spec.rb
- - spec/syntax/versions/1.9.1_spec.rb
- - spec/syntax/versions/1.9.3_spec.rb
- - spec/syntax/versions/2.0.0_spec.rb
- - spec/syntax/versions/2.2.0_spec.rb
- - spec/syntax/versions/aliases_spec.rb
- - spec/token/token_spec.rb
  homepage: https://github.com/ammar/regexp_parser
  licenses:
  - MIT
  metadata:
- issue_tracker: https://github.com/ammar/regexp_parser/issues
- post_install_message:
- rdoc_options:
- - "--inline-source"
- - "--charset=UTF-8"
+ bug_tracker_uri: https://github.com/ammar/regexp_parser/issues
+ changelog_uri: https://github.com/ammar/regexp_parser/blob/master/CHANGELOG.md
+ homepage_uri: https://github.com/ammar/regexp_parser
+ source_code_uri: https://github.com/ammar/regexp_parser
+ wiki_uri: https://github.com/ammar/regexp_parser/wiki
+ post_install_message:
+ rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 1.9.1
+ version: 2.0.0
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.1.2
- signing_key:
+ rubygems_version: 3.5.0.dev
+ signing_key:
  specification_version: 4
  summary: Scanner, lexer, parser for ruby's regular expressions
- test_files:
- - spec/token/token_spec.rb
- - spec/spec_helper.rb
- - spec/lexer/escapes_spec.rb
- - spec/lexer/keep_spec.rb
- - spec/lexer/all_spec.rb
- - spec/lexer/conditionals_spec.rb
- - spec/lexer/nesting_spec.rb
- - spec/lexer/refcalls_spec.rb
- - spec/lexer/literals_spec.rb
- - spec/parser/escapes_spec.rb
- - spec/parser/properties_spec.rb
- - spec/parser/sets_spec.rb
- - spec/parser/free_space_spec.rb
- - spec/parser/keep_spec.rb
- - spec/parser/all_spec.rb
- - spec/parser/conditionals_spec.rb
- - spec/parser/types_spec.rb
- - spec/parser/anchors_spec.rb
- - spec/parser/alternation_spec.rb
- - spec/parser/posix_classes_spec.rb
- - spec/parser/set/ranges_spec.rb
- - spec/parser/set/intersections_spec.rb
- - spec/parser/errors_spec.rb
- - spec/parser/refcalls_spec.rb
- - spec/parser/groups_spec.rb
- - spec/parser/quantifiers_spec.rb
- - spec/support/warning_extractor.rb
- - spec/support/shared_examples.rb
- - spec/support/runner.rb
- - spec/expression/subexpression_spec.rb
- - spec/expression/methods/match_spec.rb
- - spec/expression/methods/match_length_spec.rb
- - spec/expression/methods/traverse_spec.rb
- - spec/expression/methods/strfregexp_spec.rb
- - spec/expression/methods/tests_spec.rb
- - spec/expression/free_space_spec.rb
- - spec/expression/options_spec.rb
- - spec/expression/to_s_spec.rb
- - spec/expression/root_spec.rb
- - spec/expression/sequence_spec.rb
- - spec/expression/clone_spec.rb
- - spec/expression/to_h_spec.rb
- - spec/expression/conditional_spec.rb
- - spec/expression/base_spec.rb
- - spec/syntax/syntax_spec.rb
- - spec/syntax/syntax_token_map_spec.rb
- - spec/syntax/versions/1.9.3_spec.rb
- - spec/syntax/versions/2.2.0_spec.rb
- - spec/syntax/versions/1.9.1_spec.rb
- - spec/syntax/versions/2.0.0_spec.rb
- - spec/syntax/versions/1.8.6_spec.rb
- - spec/syntax/versions/aliases_spec.rb
- - spec/scanner/escapes_spec.rb
- - spec/scanner/properties_spec.rb
- - spec/scanner/sets_spec.rb
- - spec/scanner/free_space_spec.rb
- - spec/scanner/keep_spec.rb
- - spec/scanner/all_spec.rb
- - spec/scanner/conditionals_spec.rb
- - spec/scanner/types_spec.rb
- - spec/scanner/anchors_spec.rb
- - spec/scanner/meta_spec.rb
- - spec/scanner/errors_spec.rb
- - spec/scanner/refcalls_spec.rb
- - spec/scanner/groups_spec.rb
- - spec/scanner/literals_spec.rb
- - spec/scanner/quantifiers_spec.rb
+ test_files: []
data/CHANGELOG.md DELETED
@@ -1,349 +0,0 @@
1
- ## [Unreleased]
2
-
3
- ### [1.7.0] - 2020-02-23 - [Janosch Müller](mailto:janosch84@gmail.com)
4
-
5
- ### Added
6
-
7
- - `Expression#each_expression` and `1.#traverse` can now be called without a block
8
- * this returns an `Enumerator` and allows chaining, e.g. `each_expression.select`
9
- * thanks to [Masataka Kuwabara](https://github.com/pocke)
10
-
11
- ### Fixed
12
-
13
- - `MatchLength#each` no longer ignores the given `limit:` when called without a block
14
-
15
- ### [1.6.0] - 2019-06-16 - [Janosch Müller](mailto:janosch84@gmail.com)
16
-
17
- ### Added
18
-
19
- - Added support for 16 new unicode properties introduced in Ruby 2.6.2 and 2.6.3
20
-
21
- ### [1.5.1] - 2019-05-23 - [Janosch Müller](mailto:janosch84@gmail.com)
22
-
23
- ### Fixed
24
-
25
- - Fixed `#options` (and thus `#i?`, `#u?` etc.) not being set for some expressions:
26
- * this affected posix classes as well as alternation, conditional, and intersection branches
27
- * `#options` was already correct for all child expressions of such branches
28
- * this only made an operational difference for posix classes as they respect encoding flags
29
- - Fixed `#options` not respecting all negative options in weird cases like '(?u-m-x)'
30
- - Fixed `Group#option_changes` not accounting for indirectly disabled (overridden) encoding flags
31
- - Fixed `Scanner` allowing negative encoding options if there were no positive options, e.g. '(?-u)'
32
- - Fixed `ScannerError` for some valid meta/control sequences such as '\\C-\\\\'
33
- - Fixed `Expression#match` and `#=~` not working with a single argument
34
-
35
- ### [1.5.0] - 2019-05-14 - [Janosch Müller](mailto:janosch84@gmail.com)
36
-
37
- ### Added
38
-
39
- - Added `#referenced_expression` for backrefs, subexp calls and conditionals
40
- * returns the `Group` expression that is being referenced via name or number
41
- - Added `Expression#repetitions`
42
- * returns a `Range` of allowed repetitions (`1..1` if there is no quantifier)
43
- * like `#quantity` but with a more uniform interface
44
- - Added `Expression#match_length`
45
- * allows to inspect and iterate over String lengths matched by the Expression
46
-
47
- ### Fixed
48
-
49
- - Fixed `Expression#clone` "direction"
50
- * it used to dup ivars onto the callee, leaving only the clone referencing the original objects
51
- * this will affect you if you call `#eql?`/`#equal?` on expressions or use them as Hash keys
52
- - Fixed `#clone` results for `Sequences`, e.g. alternations and conditionals
53
- * the inner `#text` was cloned onto the `Sequence` and thus duplicated
54
- * e.g. `Regexp::Parser.parse(/(a|bc)/).clone.to_s # => (aa|bcbc)`
55
- - Fixed inconsistent `#to_s` output for `Sequences`
56
- * it used to return only the "specific" text, e.g. "|" for an alternation
57
- * now it includes nested expressions as it does for all other `Subexpressions`
58
- - Fixed quantification of codepoint lists with more than one entry (`\u{62 63 64}+`)
59
- * quantifiers apply only to the last entry, so this token is now split up if quantified
60
-
61
- ### [1.4.0] - 2019-04-02 - [Janosch Müller](mailto:janosch84@gmail.com)
62
-
63
- ### Added
64
-
65
- - Added support for 19 new unicode properties introduced in Ruby 2.6.0
66
-
67
- ### [1.3.0] - 2018-11-14 - [Janosch Müller](mailto:janosch84@gmail.com)
68
-
69
- ### Added
70
-
71
- - `Syntax#features` returns a `Hash` of all types and tokens supported by a given `Syntax`
72
-
73
- ### Fixed
74
-
75
- - Thanks to [Akira Matsuda](https://github.com/amatsuda)
76
- * eliminated warning "assigned but unused variable - testEof"
77
-
78
- ## [1.2.0] - 2018-09-28 - [Janosch Müller](mailto:janosch84@gmail.com)
79
-
80
- ### Added
81
-
82
- - `Subexpression` (branch node) includes `Enumerable`, allowing to `#select` children etc.
83
-
84
- ### Fixed
85
-
86
- - Fixed missing quantifier in `Conditional::Expression` methods `#to_s`, `#to_re`
87
- - `Conditional::Condition` no longer lives outside the recursive `#expressions` tree
88
- - it used to be the only expression stored in a custom ivar, complicating traversal
89
- - its setter and getter (`#condition=`, `#condition`) still work as before
90
-
91
- ## [1.1.0] - 2018-09-17 - [Janosch Müller](mailto:janosch84@gmail.com)
92
-
93
- ### Added
94
-
95
- - Added `Quantifier` methods `#greedy?`, `#possessive?`, `#reluctant?`/`#lazy?`
96
- - Added `Group::Options#option_changes`
97
- - shows the options enabled or disabled by the given options group
98
- - as with all other expressions, `#options` shows the overall active options
99
- - Added `Conditional#reference` and `Condition#reference`, indicating the determinative group
100
- - Added `Subexpression#dig`, acts like [`Array#dig`](http://ruby-doc.org/core-2.5.0/Array.html#method-i-dig)
101
-
102
- ### Fixed
103
-
104
- - Fixed parsing of quantified conditional expressions (quantifiers were assigned to the wrong expression)
105
- - Fixed scanning and parsing of forward-referring subexpression calls (e.g. `\g<+1>`)
106
- - `Root` and `Sequence` expressions now support the same constructor signature as all other expressions
107
-
108
- ## [1.0.0] - 2018-09-01 - [Janosch Müller](mailto:janosch84@gmail.com)
109
-
110
- This release includes several breaking changes, mostly to character sets, #map and properties.
111
-
112
- ### Changed
113
-
114
- - Changed handling of sets (a.k.a. character classes or "bracket expressions")
115
- * see PR [#55](https://github.com/ammar/regexp_parser/pull/55) / issue [#47](https://github.com/ammar/regexp_parser/issues/47) for details
116
- * sets are now parsed to expression trees like other nestable expressions
117
- * `#scan` now emits the same tokens as outside sets (no longer `:set, :member`)
118
- * `CharacterSet#members` has been removed
119
- * new `Range` and `Intersection` classes represent corresponding syntax features
120
- * a new `PosixClass` expression class represents e.g. `[[:ascii:]]`
121
- * `PosixClass` instances behave like `Property` ones, e.g. support `#negative?`
122
- * `#scan` emits `:(non)posixclass, :<type>` instead of `:set, :char_(non)<type>`
123
- - Changed `Subexpression#map` to act like regular `Enumerable#map`
124
- * the old behavior is available as `Subexpression#flat_map`
125
- * e.g. `parse(/[a]/).map(&:to_s) == ["[a]"]`; used to be `["[a]", "a"]`
126
- - Changed expression emissions for some escape sequences
127
- * `EscapeSequence::Codepoint`, `CodepointList`, `Hex` and `Octal` are now all used
128
- * they already existed, but were all parsed as `EscapeSequence::Literal`
129
- * e.g. `\x97` is now `EscapeSequence::Hex` instead of `EscapeSequence::Literal`
130
- - Changed naming of many property tokens (emitted for `\p{...}`)
131
- * if you work with these tokens, see PR [#56](https://github.com/ammar/regexp_parser/pull/56) for details
132
- * e.g. `:punct_dash` is now `:dash_punctuation`
133
- - Changed `(?m)` and the likes to emit as `:options_switch` token (@4ade4d1)
134
- * allows differentiating from group-local `:options`, e.g. `(?m:.)`
135
- - Changed name of `Backreference::..NestLevel` to `..RecursionLevel` (@4184339)
136
- - Changed `Backreference::Number#number` from `String` to `Integer` (@40a2231)
137
-
138
- ### Added
139
-
140
- - Added support for all previously missing properties (about 250)
141
- - Added `Expression::UnicodeProperty#shortcut` (e.g. returns "m" for `\p{mark}`)
142
- - Added `#char(s)` and `#codepoint(s)` methods to all `EscapeSequence` expressions
143
- - Added `#number`/`#name`/`#recursion_level` to all backref/call expressions (@174bf21)
144
- - Added `#number` and `#number_at_level` to capturing group expressions (@40a2231)
145
-
146
- ### Fixed
147
-
148
- - Fixed Ruby version mapping of some properties
149
- - Fixed scanning of some property spellings, e.g. with dashes
150
- - Fixed some incorrect property alias normalizations
151
- - Fixed scanning of codepoint escapes with 6 digits (e.g. `\u{10FFFF}`)
152
- - Fixed scanning of `\R` and `\X` within sets; they act as literals there
153
-
154
- ## [0.5.0] - 2018-04-29 - [Janosch Müller](mailto:janosch84@gmail.com)
155
-
156
- ### Changed
157
-
158
- - Changed handling of Ruby versions (PR [#53](https://github.com/ammar/regexp_parser/pull/53))
159
- * New Ruby versions are now supported by default
160
- * Some deep-lying APIs have changed, which should not affect most users:
161
- * `Regexp::Syntax::VERSIONS` is gone
162
- * Syntax version names have changed from `Regexp::Syntax::Ruby::Vnnn`
163
- to `Regexp::Syntax::Vn_n_n`
164
- * Syntax version classes for Ruby versions without regex feature changes
165
- are no longer predefined and are now only created on demand / lazily
166
- * `Regexp::Syntax::supported?` returns true for any argument >= 1.8.6
167
-
168
- ### Fixed
169
-
170
- - Fixed some use cases of Expression methods #strfregexp and #to_h (@e738107)
171
-
172
- ### Added
173
-
174
- - Added full signature support to collection methods of Expressions (@aa7c55a)
175
-
- ## [0.4.13] - 2018-04-04 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Added ruby version files for 2.2.10 and 2.3.7
-
- ## [0.4.12] - 2018-03-30 - [Janosch Müller](mailto:janosch84@gmail.com)
-
- - Added ruby version files for 2.4.4 and 2.5.1
-
- ## [0.4.11] - 2018-03-04 - [Janosch Müller](mailto:janosch84@gmail.com)
-
- - Fixed UnknownSyntaxNameError introduced in v0.4.10 if
-   the gem's parent dir tree included a 'ruby' dir
-
- ## [0.4.10] - 2018-03-04 - [Janosch Müller](mailto:janosch84@gmail.com)
-
- - Added ruby version file for 2.6.0
- - Added support for Emoji properties (available in Ruby since 2.5.0)
- - Added support for XPosixPunct and Regional_Indicator properties
- - Fixed parsing of Unicode 6.0 and 7.0 script properties
- - Fixed parsing of the special Assigned property
- - Fixed scanning of InCyrillic_Supplement property
-
- ## [0.4.9] - 2017-12-25 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Added ruby version file for 2.5.0
-
- ## [0.4.8] - 2017-12-18 - [Janosch Müller](mailto:janosch84@gmail.com)
-
- - Added ruby version files for 2.2.9, 2.3.6, and 2.4.3
-
- ## [0.4.7] - 2017-10-15 - [Janosch Müller](mailto:janosch84@gmail.com)
-
- - Fixed a thread safety issue (issue #45)
- - Some public class methods that were only reliable for
-   internal use are now private instance methods (PR #46)
- - Improved the usefulness of Expression#options (issue #43) -
-   #options and derived methods such as #i?, #m? and #x? are now
-   defined for all Expressions that are affected by such flags.
- - Fixed scanning of whitespace following (?x) (commit 5c94bd2)
- - Fixed a Parser bug where the #number attribute of traditional
-   numerical backreferences was not set correctly (commit 851b620)
-
- ## [0.4.6] - 2017-09-18 - [Janosch Müller](mailto:janosch84@gmail.com)
-
- - Added Parser support for hex escapes in sets (PR #36)
- - Added Parser support for octal escapes (PR #37)
- - Added support for cluster types \R and \X (PR #38)
- - Added support for more metacontrol notations (PR #39)
-
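For readers unfamiliar with the cluster types mentioned above, the following sketch shows how `\R` and `\X` behave in Ruby's own regex engine. It uses only core `String#scan`, not the gem, and the constant names are chosen for this example.

```ruby
# \R matches a generic line break and treats "\r\n" as a single unit.
LINEBREAKS = "one\r\ntwo\nthree".scan(/\R/)

# \X matches an extended grapheme cluster, so "e" followed by a
# combining acute accent (U+0301) is scanned as one element.
CLUSTERS = "e\u0301a".scan(/\X/)
```

Both escapes can consume more than one codepoint per match, which is presumably why they are grouped as "cluster types" here.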
- ## [0.4.5] - 2017-09-17 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
-   * Support ruby 2.2.7 (PR #42)
- - Added ruby version files for 2.2.8, 2.3.5, and 2.4.2
-
- ## [0.4.4] - 2017-07-10 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
-   * Add support for new absence operator (PR #33)
- - Thanks to [Bartek Bułat](https://github.com/barthez):
-   * Add support for Ruby 2.3.4 version (PR #40)
-
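The absence operator mentioned above is written `(?~subexp)` in Ruby (2.4.1 and later) and matches any text that does not contain a match for `subexp`. A small sketch using Ruby's own engine rather than the gem, with constant names chosen for this example:

```ruby
# A C-style comment: "/*", then any text that does not contain "*/",
# then the closing "*/".
C_COMMENT = %r{/\*(?~\*/)\*/}

# The match stops at the first "*/" because the absence operator
# cannot extend across it.
FIRST_COMMENT = '/* a */ b */'[C_COMMENT]
```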
- ## [0.4.3] - 2017-03-24 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Added ruby version file for 2.4.1
-
- ## [0.4.2] - 2017-01-10 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Thanks to [Janosch Müller](https://github.com/janosch-x):
-   * Support ruby 2.4 (PR #30)
-   * Improve codepoint handling (PR #27)
-
- ## [0.4.1] - 2016-11-22 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Updated ruby version file for 2.3.3
-
- ## [0.4.0] - 2016-11-20 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Added Syntax.supported? method
- - Updated ruby versions for latest releases; 2.1.10, 2.2.6, and 2.3.2
-
- ## [0.3.6] - 2016-06-08 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Thanks to [John Backus](https://github.com/backus):
-   * Remove warnings (PR #26)
-
- ## [0.3.5] - 2016-05-30 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Thanks to [John Backus](https://github.com/backus):
-   * Fix parsing of /\xFF/n (hex:escape) (PR #24)
-
- ## [0.3.4] - 2016-05-25 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Thanks to [John Backus](https://github.com/backus):
-   * Fix warnings (PR #19)
- - Thanks to [Dana Scheider](https://github.com/danascheider):
-   * Correct error in README (PR #20)
- - Fixed mistyped \h and \H character types (issue #21)
- - Added ancestry syntax files for latest rubies (issue #22)
-
- ## [0.3.3] - 2016-04-26 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Thanks to [John Backus](https://github.com/backus):
-   * Fixed scanning of zero length comments (PR #12)
-   * Fixed missing escape:codepoint_list syntax token (PR #14)
-   * Fixed to_s for modified interval quantifiers (PR #17)
- - Added a note about MRI implementation quirks to Scanner section
-
- ## [0.3.2] - 2016-01-01 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Updated ruby versions for latest releases; 2.1.8, 2.2.4, and 2.3.0
- - Fixed class name for UnknownSyntaxNameError exception
- - Added UnicodeBlocks support to the parser.
- - Added UnicodeBlocks support to the scanner.
- - Added expand_members method to CharacterSet, returns traditional
-   or unicode property forms of shorthands (\d, \W, \s, etc.)
- - Improved meaning and output of %t and %T in strfregexp.
- - Added syntax versions for ruby 2.1.4 and 2.1.5 and updated
-   latest 2.1 version.
- - Added to_h methods to Expression, Subexpression, and Quantifier.
- - Added traversal methods; traverse, each_expression, and map.
- - Added token/type test methods; type?, is?, and one_of?
- - Added printing method strfregexp, inspired by strftime.
- - Added scanning and parsing of free spacing (x mode) expressions.
- - Improved handling of inline options (?mixdau:...)
- - Added conditional expressions. Ruby 2.0.
- - Added keep (\K) markers. Ruby 2.0.
- - Added d, a, and u options. Ruby 2.0.
- - Added missing meta sequences to the parser. They were supported by the scanner only.
- - Renamed Lexer's method to lex, added an alias to the old name (scan)
- - Use #map instead of #each to run the block in Lexer.lex.
- - Replaced VERSION.yml file with a constant.
- - Updated README
- - Update tokens and scanner with new additions in Unicode 7.0.
-
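Three of the Ruby 2.0 constructs listed above (keep markers, conditionals, and the `a`/`u` options) are easier to follow with concrete patterns. The sketch below uses Ruby's own regex engine only, not the gem, and the constant names are chosen for this example.

```ruby
# Keep marker: text before \K must match but is dropped from the result.
KEPT = 'foobar'[/foo\Kbar/]

# Conditional: if group 1 ("a") participated in the match, require "b";
# otherwise require "c".
CONDITIONAL = /\A(a)?(?(1)b|c)\z/

# The a and u options switch \w between ASCII-only and Unicode matching.
ASCII_WORD   = /(?a)\w/
UNICODE_WORD = /(?u)\w/
E_ACUTE      = "\u00E9" # "é", a non-ASCII letter
```

A parser has to track each of these as a distinct expression type: `\K` has no content of its own, a conditional carries a condition plus up to two branches, and the options change how later tokens are interpreted.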
- ## [0.1.6] - 2014-10-06 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Fixed test and gem building rake tasks and extracted the gem
-   specification from the Rakefile into a .gemspec file.
- - Added syntax files for missing ruby 2.x versions. These do not add
-   extra syntax support, they just make the gem work with the newer
-   ruby versions.
- - Added .travis.yml to project root.
- - README:
-   - Removed note purporting runtime support for ruby 1.8.6.
-   - Added a section identifying the main unsupported syntax features.
-   - Added sections for Testing and Building
-   - Added badges for gem version, Travis CI, and code climate.
- - Updated README, fixing broken examples, and converting it from an rdoc file to GitHub's flavor of Markdown.
- - Fixed a parser bug where an alternation sequence that contained nested expressions was incorrectly being appended to the parent expression when the nesting was exited. e.g. in /a|(b)c/, c was appended to the root.
-
- - Fixed a bug where character types were not being correctly scanned within character sets. e.g. in [\d], two tokens were scanned; one for the backslash '\' and one for the 'd'
-
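The two parser and scanner fixes above concern how `/a|(b)c/` and `[\d]` should be structured. The intended structure can be checked against Ruby's own matching behavior, as in this sketch (core Ruby only, not the gem; constant names are chosen for this example).

```ruby
# In /a|(b)c/ the trailing "c" belongs to the second alternative,
# not to the pattern as a whole.
ALTERNATION = /a|(b)c/
FIRST_BRANCH  = ALTERNATION.match('ac')[0] # only the "a" is consumed
SECOND_BRANCH = ALTERNATION.match('bc')[0] # "b" and "c" are consumed together

# In [\d] the set contains one member, the digit type, rather than
# a literal backslash and a literal "d".
DIGIT_SET = /[\d]/
```

A tree that attached `c` to the root would describe a pattern that needs a `c` after either branch, which is not what the engine matches.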
- ## [0.1.5] - 2014-01-14 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Correct ChangeLog.
- - Added syntax stubs for ruby versions 2.0 and 2.1
- - Added clone methods for deep copying expressions.
- - Added optional format argument for to_s on expressions to return the text of the expression with (:full, the default) or without (:base) its quantifier.
- - Renamed the :beginning_of_line and :end_of_line tokens to :bol and :eol.
- - Fixed a bug where alternations with more than two alternatives and one of them ending in a group were being incorrectly nested.
- - Improved EOF handling in general and especially from sequences like hex and control escapes.
- - Fixed a bug where named groups with an empty name would return a blank token [].
- - Fixed a bug where members of a parent set were being added to its last subset.
- - Various code cleanups in scanner.rl
- - Fixed a few mutable string bugs by calling dup on the originals.
- - Made ruby 1.8.6 the base for all 1.8 syntax, and the 1.8 name a pointer to the latest (1.8.7 at this time)
- - Removed look-behind assertions (positive and negative) from 1.8 syntax
- - Added control (\cc and \C-c) and meta (\M-c) escapes to 1.8 syntax
- - The default syntax is now the one of the running ruby version in both the lexer and the parser.
-
- ## [0.1.0] - 2010-11-21 - [Ammar Ali](mailto:ammarabuali@gmail.com)
-
- - Initial release