re2 2.0.0 → 2.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 5cb52f3a9f07caf37fd3638f7746f37d0cdd27aa15fb483998de9ea81eda64d8
- data.tar.gz: 838a9651e01bde380be8567f85ede79ecf21b2b2ea59a8bb0bb0ffecb6839fab
+ metadata.gz: 37e251e652dcc758e2d18f0734cf6fe9af62fa23b3ed127175c27cc439e80451
+ data.tar.gz: 536b7cf0e815c480638a381dfca47677149d0c8d2f711105db34439e0c2d9e09
  SHA512:
- metadata.gz: 34646d238d369101434fe4d6fae3c8c71d0c2386c7da99087e87aada5f2056d84d00aedc2d7f88a4e20dbcf8547e3f19e5e7fdd1af4db14b8b4ce5f310053c2a
- data.tar.gz: '093f89c7bb023a3995266b62b488fd13c64e8c8bd3011558535ccefa871ce3cc857c373a756a770b70b44e5544810f2457da0e297941b19bfc6ad606402a14aa'
+ metadata.gz: 711ba1845bbc96f4a2120ec595f77316d08803bcddb995758871232811b23fa5a2c7cda1949d5fee2f86279ecba694001c530efe7bc29767aa7b5371fe4b5613
+ data.tar.gz: f13f0c6d838775667d3d0ed507397e6dadad55c352e8968d7af8295c5dd7e87f6f987c813042c7edee453ded849c9dbd9cc26d440966efe48eda698d29db616b
data/LICENSE-DEPENDENCIES.txt ADDED
@@ -0,0 +1,237 @@
+ # Vendored Dependency Licenses
+
+ The library re2 (which lives at https://github.com/mudge/re2) may include the source code for RE2 (which lives at https://github.com/google/re2) and Abseil (which lives at https://abseil.io).
+
+ The license terms shipped with RE2 are included here for your convenience:
+
+ // Copyright (c) 2009 The RE2 Authors. All rights reserved.
+ //
+ // Redistribution and use in source and binary forms, with or without
+ // modification, are permitted provided that the following conditions are
+ // met:
+ //
+ // * Redistributions of source code must retain the above copyright
+ // notice, this list of conditions and the following disclaimer.
+ // * Redistributions in binary form must reproduce the above
+ // copyright notice, this list of conditions and the following disclaimer
+ // in the documentation and/or other materials provided with the
+ // distribution.
+ // * Neither the name of Google Inc. nor the names of its
+ // contributors may be used to endorse or promote products derived from
+ // this software without specific prior written permission.
+ //
+ // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ // A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ // OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ // SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ // LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ // THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ The license terms shipped with Abseil are included here for your convenience:
+
+ Apache License
+ Version 2.0, January 2004
+ https://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ https://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
data/LICENSE.txt CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2010-2014, Paul Mucur.
+ Copyright (c) 2010, Paul Mucur.
  All rights reserved.

  Redistribution and use in source and binary forms, with or without
data/README.md CHANGED
@@ -1,18 +1,19 @@
  re2 [![Build Status](https://github.com/mudge/re2/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/mudge/re2/actions)
  ===

- A Ruby binding to [re2][], an "efficient, principled regular expression
- library".
+ Ruby bindings to [RE2][], a "fast, safe, thread-friendly alternative to
+ backtracking regular expression engines like those used in PCRE, Perl, and
+ Python".

- **Current version:** 2.0.0
+ **Current version:** 2.1.0
  **Supported Ruby versions:** 2.6, 2.7, 3.0, 3.1, 3.2
- **Bundled re2 version:** libre2.11 (2023-09-01)
- **Supported re2 versions:** libre2.0 (< 2020-03-02), libre2.1 (2020-03-02), libre2.6 (2020-03-03), libre2.7 (2020-05-01), libre2.8 (2020-07-06), libre2.9 (2020-11-01), libre2.10 (2022-12-01), libre2.11 (2023-07-01)
+ **Bundled RE2 version:** libre2.11 (2023-09-01)
+ **Supported RE2 versions:** libre2.0 (< 2020-03-02), libre2.1 (2020-03-02), libre2.6 (2020-03-03), libre2.7 (2020-05-01), libre2.8 (2020-07-06), libre2.9 (2020-11-01), libre2.10 (2022-12-01), libre2.11 (2023-07-01)

  Installation
  ------------

- The gem comes bundled with a version of [re2][] and will compile itself (and
+ The gem comes bundled with a version of [RE2][] and will compile itself (and
  any dependencies) on install. As compilation can take a while, precompiled
  native gems are available for Linux, Windows and macOS.
@@ -27,10 +28,10 @@ on these platforms:
  - `x86_64-darwin`
  - `x86_64-linux` (requires: glibc >= 2.17)

- If you wish to opt out of using the bundled libraries, you will need re2
+ If you wish to opt out of using the bundled libraries, you will need RE2
  installed as well as a C++ compiler such as [gcc][] (on Debian and Ubuntu, this
- is provided by the [build-essential][] package). If you are using Mac OS X, I
- recommend installing re2 with [Homebrew][] by running the following:
+ is provided by the [build-essential][] package). If you are using macOS, I
+ recommend installing RE2 with [Homebrew][] by running the following:

  $ brew install re2
@@ -38,7 +39,7 @@ If you are using Debian, you can install the [libre2-dev][] package like so:
  $ sudo apt-get install libre2-dev

- Recent versions of re2 require a compiler with C++14 support such as
+ Recent versions of RE2 require a compiler with C++14 support such as
  [clang](http://clang.llvm.org/) 3.4 or [gcc](https://gcc.gnu.org/) 5.

  If you are using a packaged Ruby distribution, make sure you also have the
@@ -47,7 +48,7 @@ on Debian and Ubuntu.
  You can then install the library via RubyGems with `gem install re2 --platform=ruby --
  --enable-system-libraries` or `gem install re2 --platform=ruby -- --enable-system-libraries
- --with-re2-dir=/path/to/re2/prefix` if re2 is not installed in any of the
+ --with-re2-dir=/path/to/re2/prefix` if RE2 is not installed in any of the
  following default locations:

  * `/usr/local`
@@ -66,7 +67,7 @@ Documentation
  Full documentation automatically generated from the latest version is
  available at <http://mudge.name/re2/>.

- Note that re2's regular expression syntax differs from PCRE and Ruby's
+ Note that RE2's regular expression syntax differs from PCRE and Ruby's
  built-in [`Regexp`][Regexp] library, see the [official syntax page][] for more
  details.
@@ -192,6 +193,24 @@ end
  # My name is Alice and I am 42 years old
  ```

+ Encoding
+ --------
+
+ Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ returned in UTF-8 by default or ISO-8859-1 if the `:utf8` option for the
+ `RE2::Regexp` is set to false (any other encoding's behaviour is undefined).
+
+ For backward compatibility: re2 won't automatically convert string inputs to
+ the right encoding so this is the responsibility of the caller, e.g.
+
+ ```ruby
+ # By default, RE2 will process patterns and text as UTF-8
+ RE2(non_utf8_pattern.encode("UTF-8")).match(non_utf8_text.encode("UTF-8"))
+
+ # If the :utf8 option is false, RE2 will process patterns and text as ISO-8859-1
+ RE2(non_latin1_pattern.encode("ISO-8859-1"), :utf8 => false).match(non_latin1_text.encode("ISO-8859-1"))
+ ```
+
  Features
  --------
@@ -238,9 +257,9 @@ Contributions
  * Thanks to [Jason Woods](https://github.com/driskell) who contributed the
  original implementations of `RE2::MatchData#begin` and `RE2::MatchData#end`;
  * Thanks to [Stefano Rivera](https://github.com/stefanor) who first contributed C++11 support;
- * Thanks to [Stan Hu](https://github.com/stanhu) for reporting a bug with empty patterns and `RE2::Regexp#scan`, contributing support for libre2.11 (2023-07-01) and for vendoring re2 and abseil and compiling native gems in 2.0;
+ * Thanks to [Stan Hu](https://github.com/stanhu) for reporting a bug with empty patterns and `RE2::Regexp#scan`, contributing support for libre2.11 (2023-07-01) and for vendoring RE2 and abseil and compiling native gems in 2.0;
  * Thanks to [Sebastian Reitenbach](https://github.com/buzzdeee) for reporting
- the deprecation and removal of the `utf8` encoding option in re2;
+ the deprecation and removal of the `utf8` encoding option in RE2;
  * Thanks to [Sergio Medina](https://github.com/serch) for reporting a bug when
  using `RE2::Scanner#scan` with an invalid regular expression;
  * Thanks to [Pritam Baral](https://github.com/pritambaral) for contributed the
@@ -251,7 +270,19 @@ Contact
  All issues and suggestions should go to [GitHub Issues](https://github.com/mudge/re2/issues).

- [re2]: https://github.com/google/re2
+ License
+ -------
+
+ This library is licensed under the BSD 3-Clause License, see `LICENSE.txt`.
+
+ Dependencies
+ ------------
+
+ The source code of [RE2][] is distributed in the `ruby` platform gem. This code is licensed under the BSD 3-Clause License, see `LICENSE-DEPENDENCIES.txt`.
+
+ The source code of [Abseil][] is distributed in the `ruby` platform gem. This code is licensed under the Apache License 2.0, see `LICENSE-DEPENDENCIES.txt`.
+
+ [RE2]: https://github.com/google/re2
  [gcc]: http://gcc.gnu.org/
  [ruby-dev]: http://packages.debian.org/ruby-dev
  [build-essential]: http://packages.debian.org/build-essential
@@ -260,4 +291,4 @@ All issues and suggestions should go to [GitHub Issues](https://github.com/mudge
  [Homebrew]: http://mxcl.github.com/homebrew
  [libre2-dev]: http://packages.debian.org/search?keywords=libre2-dev
  [official syntax page]: https://github.com/google/re2/wiki/Syntax
-
+ [Abseil]: https://abseil.io
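
The Encoding section added to the README in this release leaves input conversion to the caller. As a minimal sketch of what that could look like in plain Ruby, the helper below normalises a string to whichever encoding RE2 will assume. The name `encode_for_re2` is hypothetical and not part of the gem; it relies only on the standard `String#encode`.

```ruby
# Hypothetical helper, not part of the re2 gem: converts a string to the
# encoding RE2 is documented to assume, following the :utf8 option
# (UTF-8 by default, ISO-8859-1 when :utf8 is false).
def encode_for_re2(string, utf8: true)
  target = utf8 ? Encoding::UTF_8 : Encoding::ISO_8859_1
  return string if string.encoding == target

  # String#encode raises Encoding::UndefinedConversionError when a
  # character has no representation in the target encoding.
  string.encode(target)
end

# "café" stored as ISO-8859-1 bytes (0xE9 is "é" in Latin-1).
latin1_text = [99, 97, 102, 233].pack("C*").force_encoding(Encoding::ISO_8859_1)

utf8_text = encode_for_re2(latin1_text)
latin1_again = encode_for_re2(utf8_text, utf8: false)
```

With the gem installed, the converted strings would then be passed on in the same way as the README example, e.g. `RE2(encode_for_re2(pattern)).match(encode_for_re2(text))`.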
data/ext/re2/re2.cc CHANGED
@@ -39,18 +39,9 @@ using std::vector;
  rb_enc_associate_index(_string, _enc); \
  _string; \
  })
- #define ENCODED_STR_NEW2(str, length, str2) \
- ({ \
- VALUE _string = rb_str_new(str, length); \
- int _enc = rb_enc_get_index(str2); \
- rb_enc_associate_index(_string, _enc); \
- _string; \
- })
  #else
  #define ENCODED_STR_NEW(str, length, encoding) \
  rb_str_new((const char *)str, (long)length)
- #define ENCODED_STR_NEW2(str, length, str2) \
- rb_str_new((const char *)str, (long)length)
  #endif

  #ifdef HAVE_RB_STR_SUBLEN
@@ -284,6 +275,10 @@ static VALUE re2_scanner_rewind(VALUE self) {
  * Scan the given text incrementally for matches, returning an array of
  * matches on each subsequent call. Returns nil if no matches are found.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [Array<String>] the matches.
  * @example
  * s = RE2::Regexp.new('(\w+)').scan("Foo bar baz")
@@ -312,7 +307,6 @@ static VALUE re2_scanner_scan(VALUE self) {
  original_input_size = c->input->size();

  for (i = 0; i < c->number_of_capturing_groups; i++) {
- matches[i] = "";
  argv[i] = &matches[i];
  args[i] = &argv[i];
  }
@@ -504,6 +498,10 @@ static VALUE re2_regexp_allocate(VALUE klass) {
  /*
  * Returns the array of matches.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [Array<String, nil>] the array of matches
  * @example
  * m = RE2::Regexp.new('(\d+)').match("bob 123")
@@ -579,6 +577,10 @@ static VALUE re2_matchdata_named_match(const char* name, VALUE self) {
  /*
  * Retrieve zero, one or more matches by index or name.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [Array<String, nil>, String, Boolean]
  *
  * @overload [](index)
@@ -646,6 +648,10 @@ static VALUE re2_matchdata_to_s(VALUE self) {
  /*
  * Returns a printable version of the match.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [String] a printable version of the match
  * @example
  * m = RE2::Regexp.new('(\d+)').match("bob 123")
@@ -690,6 +696,10 @@ static VALUE re2_matchdata_inspect(VALUE self) {
  /*
  * Returns the array of submatches for pattern matching.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [Array<String, nil>] the array of submatches
  * @example
  * m = RE2::Regexp.new('(\d+)').match("bob 123")
@@ -735,6 +745,10 @@ static VALUE re2_matchdata_deconstruct(VALUE self) {
  * more keys than there are capturing groups. Given keys will populate the hash in
  * order but an invalid name will cause the hash to be immediately returned.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [Hash] a hash of capturing group names to submatches
  * @param [Array<Symbol>, nil] keys an array of Symbol capturing group names or nil to return all names
  * @example
@@ -866,6 +880,10 @@ static VALUE re2_regexp_initialize(int argc, VALUE *argv, VALUE self) {
  /*
  * Returns a printable version of the regular expression +re2+.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [String] a printable version of the regular expression
  * @example
  * re2 = RE2::Regexp.new("woo?")
@@ -889,6 +907,10 @@ static VALUE re2_regexp_inspect(VALUE self) {
  /*
  * Returns a string version of the regular expression +re2+.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [String] a string version of the regular expression
  * @example
  * re2 = RE2::Regexp.new("woo?")
@@ -1116,6 +1138,10 @@ static VALUE re2_regexp_error(VALUE self) {
  * If the RE2 could not be created properly, returns
  * the offending portion of the regexp otherwise returns nil.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [String, nil] the offending portion of the regexp or nil
  */
  static VALUE re2_regexp_error_arg(VALUE self) {
@@ -1212,6 +1238,10 @@ static VALUE re2_regexp_number_of_capturing_groups(VALUE self) {
  /*
  * Returns a hash of names to capturing indices of groups.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @return [Hash] a hash of names to capturing indices
  */
  static VALUE re2_regexp_named_capturing_groups(VALUE self) {
@@ -1242,16 +1272,23 @@ static VALUE re2_regexp_named_capturing_groups(VALUE self) {
  * @return [Boolean, RE2::MatchData]
  *
  * @overload match(text)
- * Returns an {RE2::MatchData} containing the matching
- * pattern and all subpatterns resulting from looking for
- * the regexp in +text+.
+ * Returns an {RE2::MatchData} containing the matching pattern and all
+ * subpatterns resulting from looking for the regexp in +text+ if the pattern
+ * contains capturing groups.
+ *
+ * Returns either true or false indicating whether a successful match was
+ * made if the pattern contains no capturing groups.
  *
  * @param [String] text the text to search
- * @return [RE2::MatchData] the matches
+ * @return [RE2::MatchData] if the pattern contains capturing groups
+ * @return [Boolean] if the pattern does not contain capturing groups
  * @raise [NoMemoryError] if there was not enough memory to allocate the matches
- * @example
+ * @example Matching with capturing groups
  * r = RE2::Regexp.new('w(o)(o)')
  * r.match('woo') #=> #<RE2::MatchData "woo" 1:"o" 2:"o">
+ * @example Matching without capturing groups
+ * r = RE2::Regexp.new('woo')
+ * r.match('woo') #=> true
  *
  * @overload match(text, 0)
  * Returns either true or false indicating whether a
@@ -1390,6 +1427,10 @@ static VALUE re2_regexp_scan(VALUE self, VALUE text) {
  * Returns a copy of +str+ with the first occurrence +pattern+
  * replaced with +rewrite+.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @param [String] str the string to modify
  * @param [String, RE2::Regexp] pattern a regexp matching text to be replaced
  * @param [String] rewrite the string to replace with
@@ -1404,7 +1445,9 @@ static VALUE re2_Replace(VALUE self, VALUE str, VALUE pattern,
  UNUSED(self);
  re2_pattern *p;

- /* Convert all the inputs to be pumped into RE2::Replace. */
+ /* Take a copy of str so it can be modified in-place by
+ * RE2::Replace.
+ */
  string str_as_string(StringValuePtr(str));

  /* Do the replacement. */
@@ -1418,8 +1461,8 @@ static VALUE re2_Replace(VALUE self, VALUE str, VALUE pattern,
  RE2::Replace(&str_as_string, StringValuePtr(pattern),
  StringValuePtr(rewrite));

- return ENCODED_STR_NEW2(str_as_string.data(), str_as_string.size(),
- pattern);
+ return ENCODED_STR_NEW(str_as_string.data(), str_as_string.size(),
+ "UTF-8");
  }

  }
@@ -1427,6 +1470,10 @@ static VALUE re2_Replace(VALUE self, VALUE str, VALUE pattern,
  /*
  * Return a copy of +str+ with +pattern+ replaced by +rewrite+.
  *
+ * Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ * returned in UTF-8 by default or ISO-8859-1 if the :utf8 option for the
+ * RE2::Regexp is set to false (any other encoding's behaviour is undefined).
+ *
  * @param [String] str the string to modify
  * @param [String, RE2::Regexp] pattern a regexp matching text to be replaced
  * @param [String] rewrite the string to replace with
@@ -1440,7 +1487,9 @@ static VALUE re2_GlobalReplace(VALUE self, VALUE str, VALUE pattern,
  VALUE rewrite) {
  UNUSED(self);
  
- /* Convert all the inputs to be pumped into RE2::GlobalReplace. */
+ /* Take a copy of str so it can be modified in-place by
+ * RE2::GlobalReplace.
+ */
  re2_pattern *p;
  string str_as_string(StringValuePtr(str));
  
@@ -1455,8 +1504,8 @@ static VALUE re2_GlobalReplace(VALUE self, VALUE str, VALUE pattern,
  RE2::GlobalReplace(&str_as_string, StringValuePtr(pattern),
  StringValuePtr(rewrite));
  
- return ENCODED_STR_NEW2(str_as_string.data(), str_as_string.size(),
- pattern);
+ return ENCODED_STR_NEW(str_as_string.data(), str_as_string.size(),
+ "UTF-8");
  }
  }
  
@@ -1579,11 +1628,12 @@ static VALUE re2_set_initialize(int argc, VALUE *argv, VALUE self) {
  * set.add("def") #=> 1
  */
  static VALUE re2_set_add(VALUE self, VALUE pattern) {
- Check_Type(pattern, T_STRING);
+ StringValue(pattern);
  re2::StringPiece regex(RSTRING_PTR(pattern), RSTRING_LEN(pattern));
  std::string err;
  re2_set *s;
  Data_Get_Struct(self, re2_set, s);
+
  int index = s->set->Add(regex, &err);
  if (index < 0) {
  rb_raise(rb_eArgError, "str rejected by RE2::Set->Add(): %s", err.c_str());
@@ -1669,7 +1719,8 @@ static VALUE re2_set_match(int argc, VALUE *argv, VALUE self) {
  VALUE str, options, exception_option;
  bool raise_exception = true;
  rb_scan_args(argc, argv, "11", &str, &options);
- Check_Type(str, T_STRING);
+
+ StringValue(str);
  re2::StringPiece data(RSTRING_PTR(str), RSTRING_LEN(str));
  std::vector<int> v;
  re2_set *s;
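The documentation comments added above say that results come back as UTF-8 by default, or as ISO-8859-1 when the RE2::Regexp was created with `:utf8 => false`. As a rough illustration of what that labelling means on the Ruby side, here is a minimal pure-Ruby sketch. It does not call the re2 gem: `tag_result` is a hypothetical stand-in for the extension's string construction, and it assumes the bytes are only relabelled, not transcoded.

```ruby
# Hypothetical stand-in for how the extension labels a result; not re2's API.
def tag_result(bytes, utf8: true)
  encoding = utf8 ? Encoding::UTF_8 : Encoding::ISO_8859_1
  # String#b returns a binary copy, so the caller's string is left untouched.
  bytes.b.force_encoding(encoding)
end

latin1_input = "Foo".encode("ISO-8859-1")

# The input's own encoding is ignored; only the utf8 flag decides the label.
puts tag_result(latin1_input).encoding
puts tag_result(latin1_input, utf8: false).encoding

# The same two bytes read as one character in UTF-8 and two in ISO-8859-1.
e_acute = "\u00E9"
puts tag_result(e_acute).length
puts tag_result(e_acute, utf8: false).length
```

Under this assumption, an input in any other encoding would keep its bytes but be read under the wrong label, which may be why the comment calls that case undefined.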
data/lib/re2/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
  
  module RE2
- VERSION = "2.0.0"
+ VERSION = "2.1.0"
  end
data/re2.gemspec CHANGED
@@ -2,8 +2,8 @@ require_relative 'lib/re2/version'
  
  Gem::Specification.new do |s|
  s.name = "re2"
- s.summary = "Ruby bindings to re2."
- s.description = 'Ruby bindings to re2, "an efficient, principled regular expression library".'
+ s.summary = "Ruby bindings to RE2."
+ s.description = 'Ruby bindings to RE2, "a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python".'
  s.version = RE2::VERSION
  s.authors = ["Paul Mucur", "Stan Hu"]
  s.homepage = "https://github.com/mudge/re2"
@@ -22,6 +22,7 @@ Gem::Specification.new do |s|
  "lib/re2/string.rb",
  "lib/re2/version.rb",
  "LICENSE.txt",
+ "LICENSE-DEPENDENCIES.txt",
  "README.md",
  "Rakefile",
  "re2.gemspec"
@@ -36,8 +37,8 @@ Gem::Specification.new do |s|
  "spec/re2/set_spec.rb",
  "spec/re2/scanner_spec.rb"
  ]
- s.add_development_dependency "rake-compiler", "~> 1.2.1"
- s.add_development_dependency "rake-compiler-dock", "~> 1.3.0"
+ s.add_development_dependency("rake-compiler", "~> 1.2.1")
+ s.add_development_dependency("rake-compiler-dock", "~> 1.3.0")
  s.add_development_dependency("rspec", "~> 3.2")
  s.add_runtime_dependency("mini_portile2", "~> 2.8.4") # keep version in sync with extconf.rb
  end
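The dependency lines above all use the pessimistic `~>` operator. As a small sketch of what those constraints allow, the snippet below checks a few version strings with RubyGems' own `Gem::Requirement`. The `satisfies?` helper and the candidate versions are made up for illustration; only the constraint strings come from the gemspec.

```ruby
require "rubygems"

# Hypothetical helper: true when the version string satisfies the constraint.
def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# "~> 2.8.4" allows later patch releases in the 2.8 series only.
puts satisfies?("~> 2.8.4", "2.8.9")
puts satisfies?("~> 2.8.4", "2.9.0")

# "~> 3.2" allows later minor releases in the 3.x series only.
puts satisfies?("~> 3.2", "3.12.0")
puts satisfies?("~> 3.2", "4.0.0")
```

A three-part constraint pins the minor version and a two-part constraint pins only the major version, which is why the mini_portile2 line is the tighter of the two.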
data/spec/kernel_spec.rb CHANGED
@@ -1,13 +1,13 @@
  RSpec.describe Kernel do
- describe "#RE2" do
+ describe ".RE2" do
  it "returns an RE2::Regexp instance given a pattern" do
  expect(RE2('w(o)(o)')).to be_a(RE2::Regexp)
  end
  
  it "returns an RE2::Regexp instance given a pattern and options" do
  re = RE2('w(o)(o)', :case_sensitive => false)
- expect(re).to be_a(RE2::Regexp)
- expect(re).to_not be_case_sensitive
+
+ expect(re).not_to be_case_sensitive
  end
  end
  end
@@ -10,6 +10,18 @@ RSpec.describe RE2::MatchData do
  a = RE2::Regexp.new('(\d?)(a)(b)').match('ab').to_a
  expect(a).to eq(["ab", nil, "a", "b"])
  end
+
+ it "returns UTF-8 strings if the pattern is UTF-8" do
+ a = RE2::Regexp.new('w(o)(o)').match('woo').to_a
+
+ expect(a.map(&:encoding)).to all eq(Encoding::UTF_8)
+ end
+
+ it "returns ISO-8859-1 strings if the pattern is not UTF-8" do
+ a = RE2::Regexp.new('w(o)(o)', :utf8 => false).match('woo').to_a
+
+ expect(a.map(&:encoding)).to all eq(Encoding::ISO_8859_1)
+ end
  end
  
  describe "#[]" do
@@ -19,6 +31,18 @@ RSpec.describe RE2::MatchData do
  expect(md[2]).to eq("23")
  end
  
+ it "returns a UTF-8 string by numerical index if the pattern is UTF-8" do
+ md = RE2::Regexp.new('(\d)(\d{2})').match("123")
+
+ expect(md[1].encoding).to eq(Encoding::UTF_8)
+ end
+
+ it "returns a ISO-8859-1 string by numerical index if the pattern is not UTF-8" do
+ md = RE2::Regexp.new('(\d)(\d{2})', :utf8 => false).match("123")
+
+ expect(md[1].encoding).to eq(Encoding::ISO_8859_1)
+ end
+
  it "has the whole match as the 0th item" do
  md = RE2::Regexp.new('(\d)(\d{2})').match("123")
  expect(md[0]).to eq("123")
@@ -287,6 +287,12 @@ RSpec.describe RE2::Regexp do
  expect(re.match("My age is 99", 0)).to eq(false)
  end
  
+ it "returns only true or false if the pattern has no capturing groups" do
+ re = RE2::Regexp.new('My name is')
+
+ expect(re.match('My name is Robert Paulson')).to eq(true)
+ end
+
  it "raises an exception when given nil" do
  expect { re.match(nil) }.to raise_error(TypeError)
  end
@@ -21,9 +21,10 @@ RSpec.describe RE2::Scanner do
  end
  
  describe "#scan" do
- it "returns the next array of matches" do
+ it "returns the next array of matches", :aggregate_failures do
  r = RE2::Regexp.new('(\w+)')
  scanner = r.scan("It is a truth universally acknowledged")
+
  expect(scanner.scan).to eq(["It"])
  expect(scanner.scan).to eq(["is"])
  expect(scanner.scan).to eq(["a"])
@@ -33,41 +34,73 @@ RSpec.describe RE2::Scanner do
  expect(scanner.scan).to be_nil
  end
  
+ it "returns UTF-8 matches if the pattern is UTF-8" do
+ r = RE2::Regexp.new('(\w+)')
+ scanner = r.scan("It")
+ matches = scanner.scan
+
+ expect(matches.first.encoding).to eq(Encoding::UTF_8)
+ end
+
+ it "returns ISO-8859-1 matches if the pattern is not UTF-8" do
+ r = RE2::Regexp.new('(\w+)', :utf8 => false)
+ scanner = r.scan("It")
+ matches = scanner.scan
+
+ expect(matches.first.encoding).to eq(Encoding::ISO_8859_1)
+ end
+
+ it "returns multiple capturing groups at a time", :aggregate_failures do
+ r = RE2::Regexp.new('(\w+) (\w+)')
+ scanner = r.scan("It is a truth universally acknowledged")
+
+ expect(scanner.scan).to eq(["It", "is"])
+ expect(scanner.scan).to eq(["a", "truth"])
+ expect(scanner.scan).to eq(["universally", "acknowledged"])
+ expect(scanner.scan).to be_nil
+ end
+
  it "returns an empty array if there are no capturing groups" do
  r = RE2::Regexp.new('\w+')
  scanner = r.scan("Foo bar")
+
  expect(scanner.scan).to eq([])
  end
  
  it "returns nil if there is no match" do
  r = RE2::Regexp.new('\d+')
  scanner = r.scan("Foo bar")
+
  expect(scanner.scan).to be_nil
  end
  
  it "returns nil if the regexp is invalid" do
  r = RE2::Regexp.new('???', :log_errors => false)
  scanner = r.scan("Foo bar")
+
  expect(scanner.scan).to be_nil
  end
  
- it "returns an empty array if the input is empty" do
+ it "returns an empty array if the input is empty", :aggregate_failures do
  r = RE2::Regexp.new("")
  scanner = r.scan("")
+
  expect(scanner.scan).to eq([])
  expect(scanner.scan).to be_nil
  end
  
- it "returns an array of nil with an empty input and capture" do
+ it "returns an array of nil with an empty input and capture", :aggregate_failures do
  r = RE2::Regexp.new("()")
  scanner = r.scan("")
+
  expect(scanner.scan).to eq([nil])
  expect(scanner.scan).to be_nil
  end
  
- it "returns an empty array for every match if the pattern is empty" do
+ it "returns an empty array for every match if the pattern is empty", :aggregate_failures do
  r = RE2::Regexp.new("")
  scanner = r.scan("Foo")
+
  expect(scanner.scan).to eq([])
  expect(scanner.scan).to eq([])
  expect(scanner.scan).to eq([])
@@ -75,9 +108,10 @@ RSpec.describe RE2::Scanner do
  expect(scanner.scan).to be_nil
  end
  
- it "returns an array of nil if the pattern is an empty capturing group" do
+ it "returns an array of nil if the pattern is an empty capturing group", :aggregate_failures do
  r = RE2::Regexp.new("()")
  scanner = r.scan("Foo")
+
  expect(scanner.scan).to eq([nil])
  expect(scanner.scan).to eq([nil])
  expect(scanner.scan).to eq([nil])
@@ -85,9 +119,10 @@ RSpec.describe RE2::Scanner do
  expect(scanner.scan).to be_nil
  end
  
- it "returns array of nils with multiple empty capturing groups" do
+ it "returns array of nils with multiple empty capturing groups", :aggregate_failures do
  r = RE2::Regexp.new("()()()")
  scanner = r.scan("Foo")
+
  expect(scanner.scan).to eq([nil, nil, nil])
  expect(scanner.scan).to eq([nil, nil, nil])
  expect(scanner.scan).to eq([nil, nil, nil])
@@ -95,17 +130,34 @@ RSpec.describe RE2::Scanner do
  expect(scanner.scan).to be_nil
  end
  
- it "supports empty groups with multibyte characters" do
+ it "supports empty groups with multibyte characters", :aggregate_failures do
  r = RE2::Regexp.new("()€")
  scanner = r.scan("€")
+
  expect(scanner.scan).to eq([nil])
  expect(scanner.scan).to be_nil
  end
+
+ it "raises a Type Error if given input that can't be coerced to a String" do
+ r = RE2::Regexp.new('(\w+)')
+
+ expect { r.scan(0) }.to raise_error(TypeError)
+ end
+
+ it "accepts input that can be coerced to a String", :aggregate_failures do
+ r = RE2::Regexp.new('(\w+)')
+ scanner = r.scan(StringLike.new("Hello world"))
+
+ expect(scanner.scan).to eq(["Hello"])
+ expect(scanner.scan).to eq(["world"])
+ expect(scanner.scan).to be_nil
+ end
  end
  
  it "is enumerable" do
  r = RE2::Regexp.new('(\d)')
  scanner = r.scan("There are 1 some 2 numbers 3")
+
  expect(scanner).to be_a(Enumerable)
  end
  
@@ -113,12 +165,8 @@ RSpec.describe RE2::Scanner do
  it "yields each match" do
  r = RE2::Regexp.new('(\d)')
  scanner = r.scan("There are 1 some 2 numbers 3")
- matches = []
- scanner.each do |match|
- matches << match
- end
  
- expect(matches).to eq([["1"], ["2"], ["3"]])
+ expect { |b| scanner.each(&b) }.to yield_successive_args(["1"], ["2"], ["3"])
  end
  
  it "returns an enumerator when not given a block" do
@@ -135,22 +183,28 @@ RSpec.describe RE2::Scanner do
  end
  
  describe "#rewind" do
- it "resets any consumption" do
+ it "resets any consumption", :aggregate_failures do
  r = RE2::Regexp.new('(\d)')
  scanner = r.scan("There are 1 some 2 numbers 3")
+
  expect(scanner.to_enum.first).to eq(["1"])
  expect(scanner.to_enum.first).to eq(["2"])
+
  scanner.rewind
+
  expect(scanner.to_enum.first).to eq(["1"])
  end
  
- it "resets the eof? check" do
+ it "resets the eof? check", :aggregate_failures do
  r = RE2::Regexp.new('(\d)')
  scanner = r.scan("1")
  scanner.scan
- expect(scanner.eof?).to be_truthy
+
+ expect(scanner).to be_eof
+
  scanner.rewind
- expect(scanner.eof?).to be_falsey
+
+ expect(scanner).not_to be_eof
  end
  end
  
@@ -159,7 +213,7 @@ RSpec.describe RE2::Scanner do
  r = RE2::Regexp.new('(\d)')
  scanner = r.scan("1 2 3")
  
- expect(scanner.eof?).to be_falsey
+ expect(scanner).not_to be_eof
  end
  
  it "returns true if the input has been consumed" do
@@ -167,7 +221,7 @@ RSpec.describe RE2::Scanner do
  scanner = r.scan("1")
  scanner.scan
  
- expect(scanner.eof?).to be_truthy
+ expect(scanner).to be_eof
  end
  
  it "returns false if no match is made" do
@@ -175,14 +229,14 @@ RSpec.describe RE2::Scanner do
  scanner = r.scan("a")
  scanner.scan
  
- expect(scanner.eof?).to be_falsey
+ expect(scanner).not_to be_eof
  end
  
  it "returns false with an empty input that has not been scanned" do
  r = RE2::Regexp.new("")
  scanner = r.scan("")
  
- expect(scanner.eof?).to be_falsey
+ expect(scanner).not_to be_eof
  end
  
  it "returns false with an empty input that has not been matched" do
@@ -190,7 +244,7 @@ RSpec.describe RE2::Scanner do
  scanner = r.scan("")
  scanner.scan
  
- expect(scanner.eof?).to be_falsey
+ expect(scanner).not_to be_eof
  end
  
  it "returns true with an empty input that has been matched" do
@@ -198,7 +252,7 @@ RSpec.describe RE2::Scanner do
  scanner = r.scan("")
  scanner.scan
  
- expect(scanner.eof?).to be_truthy
+ expect(scanner).to be_eof
  end
  end
  end
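Several hunks above replace `expect(scanner.eof?).to be_truthy` with `expect(scanner).to be_eof`. RSpec's `be_<name>` matchers work by calling the `<name>?` predicate on the object under test. The sketch below shows that name mapping using only the standard library. `predicate_for` and `predicate_matches?` are hypothetical helpers, and `StringIO` stands in for an RE2::Scanner only because it also responds to `eof?`. Real RSpec matchers do considerably more, such as building failure messages.

```ruby
require "stringio"

# Hypothetical helper: maps a matcher name such as :be_eof to the :eof? predicate.
def predicate_for(matcher_name)
  "#{matcher_name.to_s.delete_prefix('be_')}?".to_sym
end

# Hypothetical helper: calls the predicate and normalises the answer to true/false.
def predicate_matches?(actual, matcher_name)
  actual.public_send(predicate_for(matcher_name)) ? true : false
end

io = StringIO.new("1")
puts predicate_matches?(io, :be_eof)

# Consume the input, as scanner.scan does in the specs above.
io.read
puts predicate_matches?(io, :be_eof)
```

Passing the object itself rather than the result of `eof?` also lets the failure message name the object and the predicate instead of a bare boolean.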
data/spec/re2/set_spec.rb CHANGED
@@ -67,11 +67,17 @@ RSpec.describe RE2::Set do
  end
  end
  
- it "raises an error if given a non-string pattern" do
+ it "raises an error if given a pattern that can't be coerced to a String" do
  set = RE2::Set.new(:unanchored, :log_errors => false)
  
  expect { set.add(0) }.to raise_error(TypeError)
  end
+
+ it "accepts a pattern that can be coerced to a String" do
+ set = RE2::Set.new
+
+ expect(set.add(StringLike.new("abc"))).to eq(0)
+ end
  end
  
  describe "#compile" do
@@ -96,6 +102,24 @@ RSpec.describe RE2::Set do
  expect(set.match("abcdefghi", :exception => false)).to eq([0, 1, 2])
  end
  
+ it "returns an empty array if there is no match" do
+ set = RE2::Set.new
+ set.add("abc")
+ set.compile
+
+ expect(set.match("def", :exception => false)).to be_empty
+ end
+
+ it "returns an empty array if there is no match when :exception is true" do
+ skip "Underlying RE2::Set::Match does not output error information" unless RE2::Set.match_raises_errors?
+
+ set = RE2::Set.new
+ set.add("abc")
+ set.compile
+
+ expect(set.match("def")).to be_empty
+ end
+
  it "raises an error if called before #compile by default" do
  skip "Underlying RE2::Set::Match does not output error information" unless RE2::Set.match_raises_errors?
  
@@ -139,6 +163,22 @@ RSpec.describe RE2::Set do
  
  expect { set.match("", 0) }.to raise_error(TypeError)
  end
+
+ it "raises a Type Error if given input that can't be coerced to a String" do
+ set = RE2::Set.new
+ set.add("abc")
+ set.compile
+
+ expect { set.match(0, :exception => false) }.to raise_error(TypeError)
+ end
+
+ it "accepts input if it can be coerced to a String" do
+ set = RE2::Set.new
+ set.add("abc")
+ set.compile
+
+ expect(set.match(StringLike.new("abcdef"), :exception => false)).to contain_exactly(0)
+ end
  end
  
  def silence_stderr
@@ -12,7 +12,8 @@ RSpec.describe RE2::String do
  
  it "doesn't perform an in-place replacement" do
  string = "My name is Robert Paulson"
- expect(string.re2_sub('Robert', 'Crobert')).to_not equal(string)
+
+ expect(string.re2_sub('Robert', 'Crobert')).not_to equal(string)
  end
  end
  
@@ -23,13 +24,15 @@ RSpec.describe RE2::String do
  
  it "doesn't perform an in-place replacement" do
  string = "My name is Robert Paulson"
- expect(string.re2_gsub('a', 'e')).to_not equal(string)
+
+ expect(string.re2_gsub('a', 'e')).not_to equal(string)
  end
  end
  
  describe "#re2_match" do
- it "delegates to RE2::Regexp#match to perform matches" do
+ it "delegates to RE2::Regexp#match to perform matches", :aggregate_failures do
  md = "My name is Robert Paulson".re2_match('My name is (\S+) (\S+)')
+
  expect(md).to be_a(RE2::MatchData)
  expect(md[0]).to eq("My name is Robert Paulson")
  expect(md[1]).to eq("Robert")
@@ -38,6 +41,7 @@ RSpec.describe RE2::String do
  
  it "supports limiting the number of matches" do
  md = "My name is Robert Paulson".re2_match('My name is (\S+) (\S+)', 0)
+
  expect(md).to eq(true)
  end
  end
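The "doesn't perform an in-place replacement" specs above use RSpec's `equal` matcher, which compares object identity (`equal?`) rather than value (`==`). A minimal plain-Ruby sketch of that distinction follows. `String#sub` stands in for `re2_sub` purely for illustration, and `same_object?` is a hypothetical helper.

```ruby
# Hypothetical helper: true only when both names refer to the very same object.
def same_object?(a, b)
  a.equal?(b)
end

original = "My name is Robert Paulson"

# String#sub stands in for re2_sub here; it returns a new String.
replaced = original.sub("Robert", "Crobert")

# Even when nothing matches, the result is a distinct object with equal content.
untouched = original.sub("Zed", "Zed")

puts same_object?(replaced, original)
puts untouched == original
puts same_object?(untouched, original)
```

Asserting identity rather than equality lets the spec catch a method that mutates and returns its receiver, which a value comparison of the result alone would not detect.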
data/spec/re2_spec.rb CHANGED
@@ -1,5 +1,5 @@
  RSpec.describe RE2 do
- describe "#Replace" do
+ describe ".Replace" do
  it "only replaces the first occurrence of the pattern" do
  expect(RE2.Replace("woo", "o", "a")).to eq("wao")
  end
@@ -15,29 +15,68 @@ RSpec.describe RE2 do
  it "does not perform replacements in-place" do
  name = "Robert"
  replacement = RE2.Replace(name, "R", "Cr")
- expect(name).to_not equal(replacement)
+
+ expect(name).not_to equal(replacement)
  end
  
  it "supports passing an RE2::Regexp as the pattern" do
  re = RE2::Regexp.new('wo{2}')
+
  expect(RE2.Replace("woo", re, "miaow")).to eq("miaow")
  end
  
  it "respects any passed RE2::Regexp's flags" do
  re = RE2::Regexp.new('gOOD MORNING', :case_sensitive => false)
+
  expect(RE2.Replace("Good morning", re, "hi")).to eq("hi")
  end
  
- if String.method_defined?(:encoding)
- it "preserves the original string's encoding" do
- original = "Foo"
- replacement = RE2.Replace(original, "oo", "ah")
- expect(original.encoding).to eq(replacement.encoding)
- end
+ it "supports passing something that can be coerced to a String as input" do
+ expect(RE2.Replace(StringLike.new("woo"), "oo", "ah")).to eq("wah")
+ end
+
+ it "supports passing something that can be coerced to a String as a pattern" do
+ expect(RE2.Replace("woo", StringLike.new("oo"), "ah")).to eq("wah")
+ end
+
+ it "supports passing something that can be coerced to a String as a replacement" do
+ expect(RE2.Replace("woo", "oo", StringLike.new("ah"))).to eq("wah")
+ end
+
+ it "returns UTF-8 strings if the pattern is UTF-8" do
+ original = "Foo".encode("ISO-8859-1")
+ replacement = RE2.Replace(original, "oo", "ah")
+
+ expect(replacement.encoding).to eq(Encoding::UTF_8)
+ end
+
+ it "returns ISO-8859-1 strings if the pattern is not UTF-8" do
+ original = "Foo"
+ replacement = RE2.Replace(original, RE2("oo", :utf8 => false), "ah")
+
+ expect(replacement.encoding).to eq(Encoding::ISO_8859_1)
+ end
+
+ it "returns UTF-8 strings when given a String pattern" do
+ replacement = RE2.Replace("Foo", "oo".encode("ISO-8859-1"), "ah")
+
+ expect(replacement.encoding).to eq(Encoding::UTF_8)
+ end
+
+ it "raises a Type Error for input that can't be converted to String" do
+ expect { RE2.Replace(0, "oo", "ah") }.to raise_error(TypeError)
+ end
+
+ it "raises a Type Error for a non-RE2::Regexp pattern that can't be converted to String" do
+ expect { RE2.Replace("woo", 0, "ah") }.to raise_error(TypeError)
+ end
+
+ it "raises a Type Error for a replacement that can't be converted to String" do
+ expect { RE2.Replace("woo", "oo", 0) }.to raise_error(TypeError)
  end
  end
  
- describe "#GlobalReplace" do
+ describe ".GlobalReplace" do
  it "replaces every occurrence of a pattern" do
  expect(RE2.GlobalReplace("woo", "o", "a")).to eq("waa")
  end
@@ -53,23 +92,78 @@ RSpec.describe RE2 do
  it "does not perform replacement in-place" do
  name = "Robert"
  replacement = RE2.GlobalReplace(name, "(?i)R", "w")
- expect(name).to_not equal(replacement)
+
+ expect(name).not_to equal(replacement)
  end
  
  it "supports passing an RE2::Regexp as the pattern" do
  re = RE2::Regexp.new('wo{2,}')
+
  expect(RE2.GlobalReplace("woowooo", re, "miaow")).to eq("miaowmiaow")
  end
  
  it "respects any passed RE2::Regexp's flags" do
  re = RE2::Regexp.new('gOOD MORNING', :case_sensitive => false)
+
  expect(RE2.GlobalReplace("Good morning Good morning", re, "hi")).to eq("hi hi")
  end
+
+ it "supports passing something that can be coerced to a String as input" do
+ expect(RE2.GlobalReplace(StringLike.new("woo"), "o", "a")).to eq("waa")
+ end
+
+ it "supports passing something that can be coerced to a String as a pattern" do
+ expect(RE2.GlobalReplace("woo", StringLike.new("o"), "a")).to eq("waa")
+ end
+
+ it "supports passing something that can be coerced to a String as a replacement" do
+ expect(RE2.GlobalReplace("woo", "o", StringLike.new("a"))).to eq("waa")
+ end
+
+ it "returns UTF-8 strings if the pattern is UTF-8" do
+ original = "Foo".encode("ISO-8859-1")
+ replacement = RE2.GlobalReplace(original, "oo", "ah")
+
+ expect(replacement.encoding).to eq(Encoding::UTF_8)
+ end
+
+ it "returns ISO-8859-1 strings if the pattern is not UTF-8" do
+ original = "Foo"
+ replacement = RE2.GlobalReplace(original, RE2("oo", :utf8 => false), "ah")
+
+ expect(replacement.encoding).to eq(Encoding::ISO_8859_1)
+ end
+
+ it "returns UTF-8 strings when given a String pattern" do
+ replacement = RE2.GlobalReplace("Foo", "oo".encode("ISO-8859-1"), "ah")
+
+ expect(replacement.encoding).to eq(Encoding::UTF_8)
+ end
+
+ it "raises a Type Error for input that can't be converted to String" do
+ expect { RE2.GlobalReplace(0, "o", "a") }.to raise_error(TypeError)
+ end
+
+ it "raises a Type Error for a non-RE2::Regexp pattern that can't be converted to String" do
+ expect { RE2.GlobalReplace("woo", 0, "a") }.to raise_error(TypeError)
+ end
+
+ it "raises a Type Error for a replacement that can't be converted to String" do
+ expect { RE2.GlobalReplace("woo", "o", 0) }.to raise_error(TypeError)
+ end
  end
  
  describe "#QuoteMeta" do
  it "escapes a string so it can be used as a regular expression" do
  expect(RE2.QuoteMeta("1.5-2.0?")).to eq('1\.5\-2\.0\?')
  end
+
+ it "raises a Type Error for input that can't be converted to String" do
+ expect { RE2.QuoteMeta(0) }.to raise_error(TypeError)
+ end
+
+ it "supports passing something that can be coerced to a String as input" do
+ expect(RE2.QuoteMeta(StringLike.new("1.5"))).to eq('1\.5')
+ end
  end
  end
data/spec/spec_helper.rb CHANGED
@@ -1,5 +1,15 @@
  require "re2"
  
+ # To test passing objects that can be coerced to a String.
+ class StringLike
+ attr_reader :str
+ alias_method :to_str, :str
+
+ def initialize(str)
+ @str = str
+ end
+ end
+
  RSpec.configure do |config|
  config.expect_with :rspec do |expectations|
  expectations.include_chain_clauses_in_custom_matcher_descriptions = true
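The `StringLike` helper above exists because the extension now calls `StringValue` where it previously used `Check_Type(..., T_STRING)`, so any object that implements `to_str` is accepted. A rough pure-Ruby approximation of that coercion is sketched below. `coerce_to_string` is a hypothetical helper built on `String.try_convert`, not something the gem provides, and its error message wording is an assumption.

```ruby
# Same shape as the spec helper: exposes its wrapped string through to_str.
class StringLike
  attr_reader :str
  alias_method :to_str, :str

  def initialize(str)
    @str = str
  end
end

# Hypothetical helper approximating StringValue: accept Strings and anything
# that implements to_str, and raise TypeError for everything else.
def coerce_to_string(obj)
  converted = String.try_convert(obj)
  raise TypeError, "no implicit conversion of #{obj.class} into String" if converted.nil?

  converted
end

puts coerce_to_string("abc")
puts coerce_to_string(StringLike.new("abc"))

begin
  coerce_to_string(0)
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```

The specs in this release exercise both sides of this behaviour: `StringLike` instances are accepted wherever a String is expected, while an Integer such as `0` still raises a `TypeError`.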
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: re2
  version: !ruby/object:Gem::Version
- version: 2.0.0
+ version: 2.1.0
  platform: ruby
  authors:
  - Paul Mucur
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2023-09-13 00:00:00.000000000 Z
+ date: 2023-09-16 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rake-compiler
@@ -67,7 +67,8 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 2.8.4
- description: Ruby bindings to re2, "an efficient, principled regular expression library".
+ description: Ruby bindings to RE2, "a fast, safe, thread-friendly alternative to backtracking
+ regular expression engines like those used in PCRE, Perl, and Python".
  email:
  executables: []
  extensions:
@@ -76,6 +77,7 @@ extra_rdoc_files: []
  files:
  - ".rspec"
  - Gemfile
+ - LICENSE-DEPENDENCIES.txt
  - LICENSE.txt
  - README.md
  - Rakefile
@@ -120,7 +122,7 @@ requirements: []
  rubygems_version: 3.4.10
  signing_key:
  specification_version: 4
- summary: Ruby bindings to re2.
+ summary: Ruby bindings to RE2.
  test_files:
  - spec/spec_helper.rb
  - spec/re2_spec.rb