re2 2.15.0 → 2.23.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 92d0c12dc899d22cf74c00bbda8f89f29665b32d10fcc9a8eb9ca01f671c62f5
- data.tar.gz: 1aea6bbc6cc7168a5a16fbb52b244535ed422707f63ebac59c2a0d1ae81bfbaf
+ metadata.gz: 940bde937a78abda3052eb0957885f3f190310fcd16fd6408cab7796994bd008
+ data.tar.gz: a1847bd0e009e2e4a63395e9452b3ecf39d938922549cc2d8477a9398c1661b6
  SHA512:
- metadata.gz: ff0c9014a05c741b9bc7f3b9961c9e36a97cc4e48f7546779a4ceca60ff225c84a3930acd7daf5d58225aa051b0d6ee475ce778c66ecd3f3643d522bd105b72b
- data.tar.gz: ea0b6da27aa3e568821dc48a452fcd3e4e16600fe66e3964702d37c56e2f9247d16130e95a60c83684530633937b810539cf45d5c0f842f88d4ccc5b35dafa53
+ metadata.gz: 360dc102ebb6ee640bc4c45c8e17f0d22a5b2288c37c40e1288384275214b2c2d72c14ba490ee06472b2ecb9790fbfb5a9a3f7afc35df27aac516b3d4416e302
+ data.tar.gz: 02f43ce9a57cee1b1c3c068f3f66c12c029a0bc22b5af00d837fc1bb2fa5fb90e120dc9bf3287adb3f99302fd0f0dd32af5303fd0e7780f41961f2f7a8429ff5
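The `checksums.yaml` entries above record SHA256 and SHA512 digests for the `metadata.gz` and `data.tar.gz` members of the gem archive. As a minimal sketch of how such digests could be checked in Ruby, assuming the member has already been extracted to disk: the helper name `checksums_match?` and the stand-in payload are hypothetical and are not part of RubyGems or of this gem.

```ruby
require "digest/sha2"
require "tempfile"

# Hypothetical helper (not part of RubyGems or re2): returns true when every
# expected hex digest matches the digest computed from the file at `path`.
def checksums_match?(path, expected)
  actual = {
    "SHA256" => Digest::SHA256.file(path).hexdigest,
    "SHA512" => Digest::SHA512.file(path).hexdigest
  }

  expected.all? { |algorithm, hex| actual.fetch(algorithm) == hex }
end

# Stand-in payload so the sketch runs without a real gem archive.
payload = "stand-in for data.tar.gz"
expected = {
  "SHA256" => Digest::SHA256.hexdigest(payload),
  "SHA512" => Digest::SHA512.hexdigest(payload)
}

Tempfile.create("data.tar.gz") do |file|
  file.binmode
  file.write(payload)
  file.flush

  # First check uses the correct digests, second one a deliberately wrong SHA256.
  puts checksums_match?(file.path, expected)
  puts checksums_match?(file.path, expected.merge("SHA256" => "0" * 64))
end
```

With a real gem, `expected` would hold the values listed under `SHA256:` and `SHA512:` for the member being checked.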
data/README.md CHANGED
@@ -6,8 +6,8 @@ Python".
 
  [![Build Status](https://github.com/mudge/re2/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/mudge/re2/actions)
 
- **Current version:** 2.15.0
- **Bundled RE2 version:** libre2.11 (2024-07-02)
+ **Current version:** 2.23.0
+ **Bundled RE2 version:** libre2.11 (2025-11-05)
 
  ```ruby
  RE2('h.*o').full_match?("hello") #=> true
@@ -224,12 +224,15 @@ the set. After all patterns have been added, the set can be compiled using
  and then
  [`RE2::Set#match`](https://mudge.name/re2/RE2/Set.html#match-instance_method)
  will return an array containing the indices of all the patterns that matched.
+ [`RE2::Set#size`](https://mudge.name/re2/RE2/Set.html#size-instance_method)
+ will return the number of patterns in the set.
 
  ```ruby
  set = RE2::Set.new
  set.add("abc") #=> 0
  set.add("def") #=> 1
  set.add("ghi") #=> 2
+ set.size #=> 3
  set.compile #=> true
  set.match("abcdefghi") #=> [0, 1, 2]
  set.match("ghidefabc") #=> [2, 1, 0]
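The new `RE2::Set#size` is only available when the linked RE2 library has `Set::Size`; elsewhere in this diff the extension adds `RE2::Set.size?` to report that and raises `RE2::Set::UnsupportedError` otherwise. A caller could guard the call roughly as follows. This is a sketch only: `StandInSet` and `pattern_count` are hypothetical names used so the example runs without the gem installed, and with the gem an `RE2::Set` would be passed instead.

```ruby
# Stand-in for RE2::Set so the sketch runs without the gem; the class and its
# internals are hypothetical and only mimic the methods shown in the diff.
class StandInSet
  # Mirrors RE2::Set.size?, which reports whether Set::Size is available.
  def self.size?
    true
  end

  def initialize
    @patterns = []
  end

  # Returns the index of the added pattern, as RE2::Set#add does.
  def add(pattern)
    @patterns << pattern
    @patterns.size - 1
  end

  def size
    @patterns.size
  end
end

# Hypothetical helper: returns the pattern count, or nil when the set's class
# does not report support for size.
def pattern_count(set)
  return nil unless set.class.respond_to?(:size?) && set.class.size?

  set.size
end

set = StandInSet.new
set.add("abc")
set.add("def")
puts pattern_count(set)
```

Returning `nil` rather than rescuing the error is a design choice for the sketch; code that prefers exceptions could call `size` directly and rescue `RE2::Set::UnsupportedError`.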
@@ -257,39 +260,39 @@ RE2(non_latin1_pattern.encode("ISO-8859-1"), utf8: false).match(non_latin1_text.
 
  This gem requires the following to run:
 
- * [Ruby](https://www.ruby-lang.org/en/) 2.6 to 3.4
+ * [Ruby](https://www.ruby-lang.org/en/) 3.1 to 4.0
 
  It supports the following RE2 ABI versions:
 
- * libre2.0 (prior to release 2020-03-02) to libre2.11 (2023-07-01 to 2024-07-02)
+ * libre2.0 (prior to release 2020-03-02) to libre2.11 (2023-07-01 to 2025-11-05)
 
  ### Native gems
 
  Where possible, a pre-compiled native gem will be provided for the following platforms:
 
  * Linux
- * `aarch64-linux`, `arm-linux`, `x86-linux` and `x86_64-linux` (requires [glibc](https://www.gnu.org/software/libc/) 2.29+, RubyGems 3.3.22+ and Bundler 2.3.21+)
+ * `aarch64-linux`, `arm-linux`, and `x86_64-linux` (requires [glibc](https://www.gnu.org/software/libc/) 2.29+, RubyGems 3.3.22+ and Bundler 2.3.21+)
  * [musl](https://musl.libc.org/)-based systems such as [Alpine](https://alpinelinux.org) are supported with Bundler 2.5.6+
- * macOS `x86_64-darwin` and `arm64-darwin`
- * Windows `x64-mingw32` and `x64-mingw-ucrt`
+ * macOS 10.14+ `x86_64-darwin` and `arm64-darwin`
+ * Windows 2022+ `x64-mingw-ucrt`
 
  ### Verifying the gems
 
  SHA256 checksums are included in the [release notes](https://github.com/mudge/re2/releases) for each version and can be checked with `sha256sum`, e.g.
 
  ```console
- $ gem fetch re2 -v 2.14.0
- Fetching re2-2.14.0-arm64-darwin.gem
- Downloaded re2-2.14.0-arm64-darwin
- $ sha256sum re2-2.14.0-arm64-darwin.gem
- 3c922d54a44ac88499f6391bc2f9740559381deaf7f4e49eef5634cf32efc2ce re2-2.14.0-arm64-darwin.gem
+ $ gem fetch re2 -v 2.18.0
+ Fetching re2-2.18.0-arm64-darwin.gem
+ Downloaded re2-2.18.0-arm64-darwin
+ $ sha256sum re2-2.18.0-arm64-darwin.gem
+ 953063f0491420163d3484ed256fe2ff616c777ec66ee20aa5ec1a1a1fc39ff5 re2-2.18.0-arm64-darwin.gem
  ```
 
  [GPG](https://www.gnupg.org/) signatures are attached to each release (the assets ending in `.sig`) and can be verified if you import [our signing key `0x39AC3530070E0F75`](https://mudge.name/39AC3530070E0F75.asc) (or fetch it from a public keyserver, e.g. `gpg --keyserver keyserver.ubuntu.com --recv-key 0x39AC3530070E0F75`):
 
  ```console
- $ gpg --verify re2-2.14.0-arm64-darwin.gem.sig re2-2.14.0-arm64-darwin.gem
- gpg: Signature made Fri 2 Aug 12:39:12 2024 BST
+ $ gpg --verify re2-2.18.0-arm64-darwin.gem.sig re2-2.18.0-arm64-darwin.gem
+ gpg: Signature made Sun 3 Aug 11:02:26 2025 BST
  gpg: using RSA key 702609D9C790F45B577D7BEC39AC3530070E0F75
  gpg: Good signature from "Paul Mucur <mudge@mudge.name>" [unknown]
  gpg: aka "Paul Mucur <paul@ghostcassette.com>" [unknown]
@@ -328,8 +331,8 @@ You will need a full compiler toolchain for compiling Ruby C extensions (see
  Toolchain"](https://nokogiri.org/tutorials/installing_nokogiri.html#appendix-a-the-compiler-toolchain))
  plus the toolchain required for compiling the vendored version of RE2 and its
  dependency [Abseil][] which includes [CMake](https://cmake.org), a compiler
- with C++14 support such as [clang](http://clang.llvm.org/) 3.4 or
- [gcc](https://gcc.gnu.org/) 5 and a recent version of
+ with C++17 support such as [clang](http://clang.llvm.org/) 5 or
+ [gcc](https://gcc.gnu.org/) 8 and a recent version of
  [pkg-config](https://www.freedesktop.org/wiki/Software/pkg-config/). On
  Windows, you'll also need pkgconf 2.1.0+ to avoid [`undefined reference`
  errors](https://github.com/pkgconf/pkgconf/issues/322) when attempting to
@@ -375,6 +378,8 @@ Alternatively, you can set the `RE2_USE_SYSTEM_LIBRARIES` environment variable i
  improvements in 2.4.0.
  * Thanks to [Manuel Jacob](https://github.com/manueljacob) for reporting a bug
  when passing strings with null bytes.
+ * Thanks to [Maciej Gajewski](https://github.com/konieczkow) for helping
+ confirm issues with GC compaction and mutable strings.
 
  ## Contact
 
data/Rakefile CHANGED
@@ -33,7 +33,7 @@ cross_platforms = %w[
  x86_64-linux-musl
  ].freeze
 
- ENV['RUBY_CC_VERSION'] = '3.4.1:3.3.5:3.2.6:3.1.6:3.0.7:2.7.8:2.6.10'
+ RakeCompilerDock.set_ruby_cc_version("~> 3.1", "~> 4.0")
 
  Gem::PackageTask.new(re2_gemspec).define
 
@@ -70,8 +70,12 @@ namespace :gem do
  desc "Compile and build native gem for #{platform} platform"
  task platform do
  RakeCompilerDock.sh <<~SCRIPT, platform: platform, verbose: true
+ wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null &&
+ echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ focal main' | sudo tee /etc/apt/sources.list.d/kitware.list >/dev/null &&
+ sudo apt-get update &&
+ sudo apt-get install -y cmake=3.22.2-0kitware1ubuntu20.04.1 cmake-data=3.22.2-0kitware1ubuntu20.04.1 &&
  gem install bundler --no-document &&
- bundle &&
+ bundle install &&
  bundle exec rake native:#{platform} pkg/#{re2_gemspec.full_name}-#{Gem::Platform.new(platform)}.gem PATH="/usr/local/bin:$PATH"
  SCRIPT
  end
data/dependencies.yml CHANGED
@@ -1,7 +1,7 @@
  ---
  libre2:
- version: '2024-07-02'
- sha256: eb2df807c781601c14a260a507a5bb4509be1ee626024cb45acbd57cb9d4032b
+ version: '2025-11-05'
+ sha256: 87f6029d2f6de8aa023654240a03ada90e876ce9a4676e258dd01ea4c26ffd67
  abseil:
- version: '20240722.0'
- sha256: f50e5ac311a81382da7fa75b97310e4b9006474f9560ac46f54a9967f07d4ae3
+ version: '20250814.1'
+ sha256: 1692f77d1739bacf3f94337188b78583cf09bab7e420d2dc6c5605a4f86785a1
data/ext/re2/extconf.rb CHANGED
@@ -103,6 +103,8 @@ module RE2
  def build_with_vendored_libraries
  message "Building re2 using packaged libraries.\n"
 
+ ENV["MACOSX_DEPLOYMENT_TARGET"] = "10.14"
+
  abseil_recipe, re2_recipe = load_recipes
 
  process_recipe(abseil_recipe) do |recipe|
@@ -129,10 +131,8 @@ module RE2
  end
 
  def build_extension
- # Enable optional warnings but disable deprecated register warning for Ruby 2.6 support
  $CFLAGS << " -Wall -Wextra -funroll-loops"
  $CXXFLAGS << " -Wall -Wextra -funroll-loops"
- $CPPFLAGS << " -Wno-register"
 
  # Pass -x c++ to force gcc to compile the test program
  # as C++ (as it will end in .c by default).
@@ -140,7 +140,6 @@ module RE2
 
  have_library("stdc++")
  have_header("stdint.h")
- have_func("rb_gc_mark_movable") # introduced in Ruby 2.7
 
  minimal_program = <<~SRC
  #include <re2/re2.h>
@@ -152,13 +151,9 @@ module RE2
  end
 
  if re2_requires_version_flag
- # Recent versions of re2 depend directly on abseil, which requires a
- # compiler with C++14 support (see
- # https://github.com/abseil/abseil-cpp/issues/1127 and
- # https://github.com/abseil/abseil-cpp/issues/1431). However, the
- # `std=c++14` flag doesn't appear to suffice; we need at least
- # `std=c++17`.
- abort "Cannot compile re2 with your compiler: recent versions require C++14 support." unless %w[c++20 c++17 c++11 c++0x].any? do |std|
+ # Recent versions of RE2 depend directly on Abseil, which requires a
+ # compiler with C++17 support.
+ abort "Cannot compile re2 with your compiler: recent versions require C++17 support." unless %w[c++20 c++17 c++11 c++0x].any? do |std|
  checking_for("re2 that compiles with #{std} standard") do
  if try_compile(minimal_program, compile_options + " -std=#{std}")
  compile_options << " -std=#{std}"
@@ -217,6 +212,24 @@ module RE2
  $defs.push("-DHAVE_ERROR_INFO_ARGUMENT")
  end
  end
+
+ checking_for("RE2::Set::Size()") do
+ test_re2_set_size = <<~SRC
+ #include <re2/re2.h>
+ #include <re2/set.h>
+
+ int main() {
+ RE2::Set s(RE2::DefaultOptions, RE2::UNANCHORED);
+ s.Size();
+
+ return 0;
+ }
+ SRC
+
+ if try_compile(test_re2_set_size, compile_options)
+ $defs.push("-DHAVE_SET_SIZE")
+ end
+ end
  end
 
  def static_pkg_config(pc_file, pkg_config_paths)
data/ext/re2/re2.cc CHANGED
@@ -125,27 +125,16 @@ static void parse_re2_options(RE2::Options* re2_options, const VALUE options) {
  }
  }
 
- /* For compatibility with Ruby < 2.7 */
- #ifdef HAVE_RB_GC_MARK_MOVABLE
- #define re2_compact_callback(x) (x),
- #else
- #define rb_gc_mark_movable(x) rb_gc_mark(x)
- #define re2_compact_callback(x)
- #endif
-
  static void re2_matchdata_mark(void *ptr) {
  re2_matchdata *m = reinterpret_cast<re2_matchdata *>(ptr);
  rb_gc_mark_movable(m->regexp);
- rb_gc_mark_movable(m->text);
+ rb_gc_mark(m->text);
  }
 
- #ifdef HAVE_RB_GC_MARK_MOVABLE
  static void re2_matchdata_compact(void *ptr) {
  re2_matchdata *m = reinterpret_cast<re2_matchdata *>(ptr);
  m->regexp = rb_gc_location(m->regexp);
- m->text = rb_gc_location(m->text);
  }
- #endif
 
  static void re2_matchdata_free(void *ptr) {
  re2_matchdata *m = reinterpret_cast<re2_matchdata *>(ptr);
@@ -171,7 +160,7 @@ static const rb_data_type_t re2_matchdata_data_type = {
  re2_matchdata_mark,
  re2_matchdata_free,
  re2_matchdata_memsize,
- re2_compact_callback(re2_matchdata_compact)
+ re2_matchdata_compact
  },
  0,
  0,
@@ -183,16 +172,13 @@ static const rb_data_type_t re2_matchdata_data_type = {
  static void re2_scanner_mark(void *ptr) {
  re2_scanner *s = reinterpret_cast<re2_scanner *>(ptr);
  rb_gc_mark_movable(s->regexp);
- rb_gc_mark_movable(s->text);
+ rb_gc_mark(s->text);
  }
 
- #ifdef HAVE_RB_GC_MARK_MOVABLE
  static void re2_scanner_compact(void *ptr) {
  re2_scanner *s = reinterpret_cast<re2_scanner *>(ptr);
  s->regexp = rb_gc_location(s->regexp);
- s->text = rb_gc_location(s->text);
  }
- #endif
 
  static void re2_scanner_free(void *ptr) {
  re2_scanner *s = reinterpret_cast<re2_scanner *>(ptr);
@@ -218,7 +204,7 @@ static const rb_data_type_t re2_scanner_data_type = {
  re2_scanner_mark,
  re2_scanner_free,
  re2_scanner_memsize,
- re2_compact_callback(re2_scanner_compact)
+ re2_scanner_compact
  },
  0,
  0,
@@ -290,10 +276,12 @@ static VALUE re2_matchdata_string(const VALUE self) {
  }
 
  /*
- * Returns the text supplied when incrementally matching with
+ * Returns a frozen copy of the text supplied when incrementally matching with
  * {RE2::Regexp#scan}.
  *
- * @return [String] the original string passed to {RE2::Regexp#scan}
+ * If the text was already a frozen string, returns the original.
+ *
+ * @return [String] a frozen string with the text passed to {RE2::Regexp#scan}
  * @example
  * c = RE2::Regexp.new('(\d+)').scan("foo")
  * c.string #=> "foo"
@@ -338,6 +326,11 @@ static VALUE re2_scanner_rewind(VALUE self) {
  delete c->input;
  c->input = new(std::nothrow) re2::StringPiece(
  RSTRING_PTR(c->text), RSTRING_LEN(c->text));
+ if (c->input == 0) {
+ rb_raise(rb_eNoMemError,
+ "not enough memory to allocate StringPiece for input");
+ }
+
  c->eof = false;
 
  return self;
@@ -424,8 +417,8 @@ static re2::StringPiece *re2_matchdata_find_match(VALUE idx, const VALUE self) {
 
  int id;
 
- if (FIXNUM_P(idx)) {
- id = FIX2INT(idx);
+ if (RB_INTEGER_TYPE_P(idx)) {
+ id = NUM2INT(idx);
  } else if (SYMBOL_P(idx)) {
  const std::map<std::string, int>& groups = p->pattern->NamedCapturingGroups();
  std::map<std::string, int>::const_iterator search = groups.find(rb_id2name(SYM2ID(idx)));
@@ -693,10 +686,10 @@ static VALUE re2_matchdata_aref(int argc, VALUE *argv, const VALUE self) {
  std::string(RSTRING_PTR(idx), RSTRING_LEN(idx)), self);
  } else if (SYMBOL_P(idx)) {
  return re2_matchdata_named_match(rb_id2name(SYM2ID(idx)), self);
- } else if (!NIL_P(rest) || !FIXNUM_P(idx) || FIX2INT(idx) < 0) {
+ } else if (!NIL_P(rest) || !RB_INTEGER_TYPE_P(idx) || NUM2INT(idx) < 0) {
  return rb_ary_aref(argc, argv, re2_matchdata_to_a(self));
  } else {
- return re2_matchdata_nth_match(FIX2INT(idx), self);
+ return re2_matchdata_nth_match(NUM2INT(idx), self);
  }
  }
 
@@ -1433,7 +1426,7 @@ static VALUE re2_regexp_match(int argc, VALUE *argv, const VALUE self) {
  RE2::Anchor anchor = RE2::UNANCHORED;
 
  if (RTEST(options)) {
- if (FIXNUM_P(options)) {
+ if (RB_INTEGER_TYPE_P(options)) {
  n = NUM2INT(options);
 
  if (n < 0) {
@@ -1447,8 +1440,6 @@ static VALUE re2_regexp_match(int argc, VALUE *argv, const VALUE self) {
  VALUE endpos_option = rb_hash_aref(options, ID2SYM(id_endpos));
  if (!NIL_P(endpos_option)) {
  #ifdef HAVE_ENDPOS_ARGUMENT
- Check_Type(endpos_option, T_FIXNUM);
-
  endpos = NUM2INT(endpos_option);
 
  if (endpos < 0) {
@@ -1477,8 +1468,6 @@ static VALUE re2_regexp_match(int argc, VALUE *argv, const VALUE self) {
 
  VALUE submatches_option = rb_hash_aref(options, ID2SYM(id_submatches));
  if (!NIL_P(submatches_option)) {
- Check_Type(submatches_option, T_FIXNUM);
-
  n = NUM2INT(submatches_option);
 
  if (n < 0) {
@@ -1494,8 +1483,6 @@ static VALUE re2_regexp_match(int argc, VALUE *argv, const VALUE self) {
 
  VALUE startpos_option = rb_hash_aref(options, ID2SYM(id_startpos));
  if (!NIL_P(startpos_option)) {
- Check_Type(startpos_option, T_FIXNUM);
-
  startpos = NUM2INT(startpos_option);
 
  if (startpos < 0) {
@@ -1527,37 +1514,43 @@ static VALUE re2_regexp_match(int argc, VALUE *argv, const VALUE self) {
  #endif
  return BOOL2RUBY(matched);
  } else {
+ if (n == INT_MAX) {
+ rb_raise(rb_eRangeError, "number of matches should be < %d", INT_MAX);
+ }
+
  /* Because match returns the whole match as well. */
  n += 1;
 
- VALUE matchdata = rb_class_new_instance(0, 0, re2_cMatchData);
- TypedData_Get_Struct(matchdata, re2_matchdata, &re2_matchdata_data_type, m);
- m->matches = new(std::nothrow) re2::StringPiece[n];
- RB_OBJ_WRITE(matchdata, &m->regexp, self);
- if (!RTEST(rb_obj_frozen_p(text))) {
- text = rb_str_freeze(rb_str_dup(text));
- }
- RB_OBJ_WRITE(matchdata, &m->text, text);
-
- if (m->matches == 0) {
+ re2::StringPiece *matches = new(std::nothrow) re2::StringPiece[n];
+ if (matches == 0) {
  rb_raise(rb_eNoMemError,
  "not enough memory to allocate StringPieces for matches");
  }
 
- m->number_of_matches = n;
+ text = rb_str_new_frozen(text);
 
  #ifdef HAVE_ENDPOS_ARGUMENT
  bool matched = p->pattern->Match(
- re2::StringPiece(RSTRING_PTR(m->text), RSTRING_LEN(m->text)),
- startpos, endpos, anchor, m->matches, n);
+ re2::StringPiece(RSTRING_PTR(text), RSTRING_LEN(text)),
+ startpos, endpos, anchor, matches, n);
  #else
  bool matched = p->pattern->Match(
- re2::StringPiece(RSTRING_PTR(m->text), RSTRING_LEN(m->text)),
- startpos, anchor, m->matches, n);
+ re2::StringPiece(RSTRING_PTR(text), RSTRING_LEN(text)),
+ startpos, anchor, matches, n);
  #endif
  if (matched) {
+ VALUE matchdata = rb_class_new_instance(0, 0, re2_cMatchData);
+ TypedData_Get_Struct(matchdata, re2_matchdata, &re2_matchdata_data_type, m);
+
+ RB_OBJ_WRITE(matchdata, &m->regexp, self);
+ RB_OBJ_WRITE(matchdata, &m->text, text);
+ m->matches = matches;
+ m->number_of_matches = n;
+
  return matchdata;
  } else {
+ delete[] matches;
+
  return Qnil;
  }
  }
@@ -1626,10 +1619,14 @@ static VALUE re2_regexp_scan(const VALUE self, VALUE text) {
  VALUE scanner = rb_class_new_instance(0, 0, re2_cScanner);
  TypedData_Get_Struct(scanner, re2_scanner, &re2_scanner_data_type, c);
 
- c->input = new(std::nothrow) re2::StringPiece(
- RSTRING_PTR(text), RSTRING_LEN(text));
  RB_OBJ_WRITE(scanner, &c->regexp, self);
- RB_OBJ_WRITE(scanner, &c->text, text);
+ RB_OBJ_WRITE(scanner, &c->text, rb_str_new_frozen(text));
+ c->input = new(std::nothrow) re2::StringPiece(
+ RSTRING_PTR(c->text), RSTRING_LEN(c->text));
+ if (c->input == 0) {
+ rb_raise(rb_eNoMemError,
+ "not enough memory to allocate StringPiece for input");
+ }
 
  if (p->pattern->ok()) {
  c->number_of_capturing_groups = p->pattern->NumberOfCapturingGroups();
@@ -1925,21 +1922,19 @@ static VALUE re2_set_add(VALUE self, VALUE pattern) {
  re2_set *s;
  TypedData_Get_Struct(self, re2_set, &re2_set_data_type, s);
 
- /* To prevent the memory of the err string leaking when we call rb_raise,
- * take a copy of it and let it go out of scope.
- */
- char msg[100];
  int index;
+ VALUE msg;
 
  {
  std::string err;
  index = s->set->Add(
  re2::StringPiece(RSTRING_PTR(pattern), RSTRING_LEN(pattern)), &err);
- strlcpy(msg, err.c_str(), sizeof(msg));
+ msg = rb_str_new(err.data(), err.size());
  }
 
  if (index < 0) {
- rb_raise(rb_eArgError, "str rejected by RE2::Set->Add(): %s", msg);
+ rb_raise(rb_eArgError,
+ "str rejected by RE2::Set->Add(): %s", RSTRING_PTR(msg));
  }
 
  return INT2FIX(index);
@@ -1962,6 +1957,26 @@ static VALUE re2_set_compile(VALUE self) {
  return BOOL2RUBY(s->set->Compile());
  }
 
+ /*
+ * Returns the size of the {RE2::Set}.
+ *
+ * @return [Integer] the number of patterns in the set
+ * @example
+ * set = RE2::Set.new
+ * set.add("abc")
+ * set.size #=> 1
+ */
+ static VALUE re2_set_size(VALUE self) {
+ #ifdef HAVE_SET_SIZE
+ re2_set *s;
+ TypedData_Get_Struct(self, re2_set, &re2_set_data_type, s);
+
+ return INT2FIX(s->set->Size());
+ #else
+ rb_raise(re2_eSetUnsupportedError, "current version of RE2::Set does not have Size method");
+ #endif
+ }
+
  /*
  * Returns whether the underlying RE2 version outputs error information from
  * {https://github.com/google/re2/blob/bc0faab533e2b27b85b8ad312abf061e33ed6b5d/re2/set.h#L62-L65
@@ -1978,6 +1993,19 @@ static VALUE re2_set_match_raises_errors_p(VALUE) {
  #endif
  }
 
+ /*
+ * Returns whether the underlying RE2 version has a Set::Size method.
+ *
+ * @return [Boolean] whether the underlying RE2 has a Set::Size method
+ */
+ static VALUE re2_set_size_p(VALUE) {
+ #ifdef HAVE_SET_SIZE
+ return Qtrue;
+ #else
+ return Qfalse;
+ #endif
+ }
+
  /*
  * Matches the given text against patterns in the set, returning an array of
  * integer indices of the matching patterns if matched or an empty array if
@@ -2208,11 +2236,15 @@ extern "C" void Init_re2(void) {
 
  rb_define_singleton_method(re2_cSet, "match_raises_errors?",
  RUBY_METHOD_FUNC(re2_set_match_raises_errors_p), 0);
+ rb_define_singleton_method(re2_cSet, "size?",
+ RUBY_METHOD_FUNC(re2_set_size_p), 0);
  rb_define_method(re2_cSet, "initialize",
  RUBY_METHOD_FUNC(re2_set_initialize), -1);
  rb_define_method(re2_cSet, "add", RUBY_METHOD_FUNC(re2_set_add), 1);
  rb_define_method(re2_cSet, "compile", RUBY_METHOD_FUNC(re2_set_compile), 0);
  rb_define_method(re2_cSet, "match", RUBY_METHOD_FUNC(re2_set_match), -1);
+ rb_define_method(re2_cSet, "size", RUBY_METHOD_FUNC(re2_set_size), 0);
+ rb_define_method(re2_cSet, "length", RUBY_METHOD_FUNC(re2_set_size), 0);
 
  rb_define_module_function(re2_mRE2, "Replace",
  RUBY_METHOD_FUNC(re2_Replace), 3);
data/ext/re2/recipes.rb CHANGED
@@ -9,7 +9,7 @@
  # Released under the BSD Licence, please see LICENSE.txt
 
  PACKAGE_ROOT_DIR = File.expand_path('../..', __dir__)
- REQUIRED_MINI_PORTILE_VERSION = '~> 2.8.7' # keep this version in sync with the one in the gemspec
+ REQUIRED_MINI_PORTILE_VERSION = '~> 2.8.9' # keep this version in sync with the one in the gemspec
 
  def load_recipes
  require 'yaml'
@@ -40,8 +40,8 @@ def build_recipe(name, version)
  MiniPortileCMake.new(name, version).tap do |recipe|
  recipe.target = File.join(PACKAGE_ROOT_DIR, 'ports')
  recipe.configure_options += [
- # abseil needs a C++14 compiler
- '-DCMAKE_CXX_STANDARD=14',
+ # abseil needs a C++17 compiler
+ '-DCMAKE_CXX_STANDARD=17',
  # needed for building the C extension shared library with -fPIC
  '-DCMAKE_POSITION_INDEPENDENT_CODE=ON',
  # ensures pkg-config and installed libraries will be in lib, not lib64
data/lib/re2/version.rb CHANGED
@@ -10,5 +10,5 @@
 
 
  module RE2
- VERSION = "2.15.0"
+ VERSION = "2.23.0"
  end
Binary file
data/re2.gemspec CHANGED
@@ -11,7 +11,7 @@ Gem::Specification.new do |s|
  s.homepage = "https://github.com/mudge/re2"
  s.extensions = ["ext/re2/extconf.rb"]
  s.license = "BSD-3-Clause"
- s.required_ruby_version = ">= 2.6.0"
+ s.required_ruby_version = ">= 3.1.0"
  s.files = [
  "dependencies.yml",
  "ext/re2/extconf.rb",
@@ -40,8 +40,8 @@ Gem::Specification.new do |s|
  "spec/re2/set_spec.rb",
  "spec/re2/scanner_spec.rb"
  ]
- s.add_development_dependency("rake-compiler", "~> 1.2.7")
- s.add_development_dependency("rake-compiler-dock", "~> 1.8.0")
+ s.add_development_dependency("rake-compiler", "~> 1.3.1")
+ s.add_development_dependency("rake-compiler-dock", "~> 1.11.0")
  s.add_development_dependency("rspec", "~> 3.2")
- s.add_runtime_dependency("mini_portile2", "~> 2.8.7") # keep version in sync with extconf.rb
+ s.add_runtime_dependency("mini_portile2", "~> 2.8.9") # keep version in sync with extconf.rb
  end
@@ -133,6 +133,13 @@ RSpec.describe RE2::MatchData do
  expect(md["name"].encoding.name).to eq("ISO-8859-1")
  expect(md[:name].encoding.name).to eq("ISO-8859-1")
  end
+
+ it "supports GC compaction" do
+ md = RE2::Regexp.new('(wo{2})').match('woohoo' * 5)
+ GC.compact
+
+ expect(md[1]).to eq("woo")
+ end
  end
 
  describe "#string" do
@@ -287,6 +294,13 @@ RSpec.describe RE2::MatchData do
 
  expect { md.begin(nil) }.to raise_error(TypeError)
  end
+
+ it "supports GC compaction" do
+ md = RE2::Regexp.new('(wo{2})').match('woohoo' * 5)
+ GC.compact
+
+ expect(md.string[md.begin(0)..-1]).to eq('woohoo' * 5)
+ end
  end
 
  describe "#end" do
@@ -349,6 +363,13 @@ RSpec.describe RE2::MatchData do
 
  expect { md.end(nil) }.to raise_error(TypeError)
  end
+
+ it "supports GC compaction" do
+ md = RE2::Regexp.new('(wo{2})').match('woohoo' * 5)
+ GC.compact
+
+ expect(md.string[0...md.end(0)]).to eq('woo')
+ end
  end
 
  describe "#deconstruct" do
@@ -1,6 +1,10 @@
  # frozen_string_literal: true
 
+ require "rbconfig/sizeof"
+
  RSpec.describe RE2::Regexp do
+ INT_MAX = 2**(RbConfig::SIZEOF.fetch("int") * 8 - 1) - 1
+
  describe "#initialize" do
  it "returns an instance given only a pattern" do
  re = RE2::Regexp.new('woo')
@@ -566,6 +570,12 @@ RSpec.describe RE2::Regexp do
  expect { re.match("one two three", submatches: :invalid) }.to raise_error(TypeError)
  end
 
+ it "raises an exception when given too large a number of submatches" do
+ re = RE2::Regexp.new('(\w+) (\w+) (\w+)')
+
+ expect { re.match("one two three", submatches: INT_MAX) }.to raise_error(RangeError, "number of matches should be < #{INT_MAX}")
+ end
+
  it "defaults to extracting all submatches when given nil", :aggregate_failures do
  re = RE2::Regexp.new('(\w+) (\w+) (\w+)')
  md = re.match("one two three", submatches: nil)
@@ -584,6 +594,13 @@ RSpec.describe RE2::Regexp do
  expect(md[3]).to be_nil
  end
 
+ it "raises an exception if given too large a number of submatches instead of options" do
+ re = RE2::Regexp.new('(\w+) (\w+) (\w+)')
+ md = re.match("one two three", 2)
+
+ expect { re.match("one two three", INT_MAX) }.to raise_error(RangeError, "number of matches should be < #{INT_MAX}")
+ end
+
  it "raises an exception when given invalid options" do
  re = RE2::Regexp.new('(\w+) (\w+) (\w+)')
 
@@ -11,7 +11,39 @@ RSpec.describe RE2::Scanner do
  end
 
  describe "#string" do
- it "returns the original text for the scanner" do
+ it "returns the text for the scanner" do
+ re = RE2::Regexp.new('(\w+)')
+ text = "It is a truth"
+ scanner = re.scan(text)
+
+ expect(scanner.string).to eq("It is a truth")
+ end
+
+ it "returns a frozen string" do
+ re = RE2::Regexp.new('(\w+)')
+ text = "It is a truth"
+ scanner = re.scan(text)
+
+ expect(scanner.string).to be_frozen
+ end
+
+ it "freezes unfrozen strings" do
+ re = RE2::Regexp.new('(\w+)')
+ text = +"It is a truth"
+ scanner = re.scan(text)
+
+ expect(scanner.string).to be_frozen
+ end
+
+ it "copies unfrozen strings" do
+ re = RE2::Regexp.new('(\w+)')
+ text = +"It is a truth"
+ scanner = re.scan(text)
+
+ expect(scanner.string).to_not equal(text)
+ end
+
+ it "does not copy the string if it was already frozen" do
  re = RE2::Regexp.new('(\w+)')
  text = "It is a truth"
  scanner = re.scan(text)
@@ -162,6 +194,23 @@ RSpec.describe RE2::Scanner do
  expect(scanner.scan).to eq(["world"])
  expect(scanner.scan).to be_nil
  end
+
+ it "supports GC compaction" do
+ r = RE2::Regexp.new('(\w+)')
+ scanner = r.scan("Hello world" * 2)
+ GC.compact
+
+ expect(scanner.scan).to eq(["Hello"])
+ end
+
+ it "works even if the original input is mutated" do
+ r = RE2::Regexp.new('(\w+)')
+ text = +"It is a truth universally acknowledged"
+ scanner = r.scan(text)
+ text.upcase!
+
+ expect(scanner.scan).to eq(["It"])
+ end
  end
 
  it "is enumerable" do
data/spec/re2/set_spec.rb CHANGED
@@ -66,10 +66,10 @@ RSpec.describe RE2::Set do
  expect { set.add("???") }.to raise_error(ArgumentError, /str rejected by RE2::Set->Add\(\)/)
  end
 
- it "truncates error messages to 100 characters" do
+ it "includes the full error message" do
  set = RE2::Set.new(:unanchored, log_errors: false)
 
- expect { set.add("(?P<#{'o' * 200}") }.to raise_error(ArgumentError, "str rejected by RE2::Set->Add(): invalid named capture group: (?P<oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo")
+ expect { set.add("(?P<#{'o' * 200}") }.to raise_error(ArgumentError, "str rejected by RE2::Set->Add(): invalid named capture group: (?P<oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo")
  end
 
  it "raises an error if called after #compile" do
@@ -204,6 +204,50 @@ RSpec.describe RE2::Set do
  end
  end
 
+ describe "#size" do
+ it "returns the number of patterns added to the set", :aggregate_failures do
+ skip "Underlying RE2::Set has no Size method" unless RE2::Set.size?
+
+ set = RE2::Set.new
+
+ expect(set.size).to eq(0)
+
+ set.add("abc")
+
+ expect(set.size).to eq(1)
+
+ set.add("def")
+
+ expect(set.size).to eq(2)
+ end
+
+ it "raises an error if RE2 does not support Set::Size" do
+ skip "Underlying RE2::Set has a Size method" if RE2::Set.size?
+
+ set = RE2::Set.new
+
+ expect { set.size }.to raise_error(RE2::Set::UnsupportedError)
+ end
+ end
+
+ describe "#length" do
+ it "is an alias for size" do
+ skip "Underlying RE2::Set has no Size method" unless RE2::Set.size?
+
+ set = RE2::Set.new
+
+ expect(set.length).to eq(0)
+
+ set.add("abc")
+
+ expect(set.length).to eq(1)
+
+ set.add("def")
+
+ expect(set.length).to eq(2)
+ end
+ end
+
  def silence_stderr
  original_stream = STDERR
 
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: re2
 version: !ruby/object:Gem::Version
-  version: 2.15.0
+  version: 2.23.0
 platform: ruby
 authors:
 - Paul Mucur
 - Stan Hu
 bindir: bin
 cert_chain: []
-date: 2025-01-06 00:00:00.000000000 Z
+date: 1980-01-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rake-compiler
@@ -16,28 +16,28 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.2.7
+        version: 1.3.1
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.2.7
+        version: 1.3.1
 - !ruby/object:Gem::Dependency
   name: rake-compiler-dock
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.8.0
+        version: 1.11.0
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 1.8.0
+        version: 1.11.0
 - !ruby/object:Gem::Dependency
   name: rspec
   requirement: !ruby/object:Gem::Requirement
@@ -58,14 +58,14 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 2.8.7
+        version: 2.8.9
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 2.8.7
+        version: 2.8.9
 description: Ruby bindings to RE2, "a fast, safe, thread-friendly alternative to backtracking
   regular expression engines like those used in PCRE, Perl, and Python".
 executables: []
@@ -88,8 +88,8 @@ files:
 - lib/re2/scanner.rb
 - lib/re2/string.rb
 - lib/re2/version.rb
-- ports/archives/20240722.0.tar.gz
-- ports/archives/re2-2024-07-02.tar.gz
+- ports/archives/20250814.1.tar.gz
+- ports/archives/re2-2025-11-05.tar.gz
 - re2.gemspec
 - spec/kernel_spec.rb
 - spec/re2/match_data_spec.rb
@@ -110,23 +110,23 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: 2.6.0
+      version: 3.1.0
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.6.2
+rubygems_version: 4.0.3
 specification_version: 4
 summary: Ruby bindings to RE2.
 test_files:
 - ".rspec"
-- spec/spec_helper.rb
-- spec/re2_spec.rb
 - spec/kernel_spec.rb
-- spec/re2/regexp_spec.rb
 - spec/re2/match_data_spec.rb
-- spec/re2/string_spec.rb
-- spec/re2/set_spec.rb
+- spec/re2/regexp_spec.rb
 - spec/re2/scanner_spec.rb
+- spec/re2/set_spec.rb
+- spec/re2/string_spec.rb
+- spec/re2_spec.rb
+- spec/spec_helper.rb
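
The metadata changes above bump one `~>` (pessimistic) dependency constraint to 2.8.9 and raise the minimum Ruby version to 3.1.0. As a rough illustration of what those constraints admit, the sketch below evaluates them with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The constant names are illustrative only and do not appear in the gem.

```ruby
require "rubygems"

# "~> 2.8.9" allows patch-level updates only: >= 2.8.9 and < 2.9.
RUNTIME_DEP = Gem::Requirement.new("~> 2.8.9")

RUNTIME_DEP.satisfied_by?(Gem::Version.new("2.8.9"))  # true
RUNTIME_DEP.satisfied_by?(Gem::Version.new("2.8.10")) # true
RUNTIME_DEP.satisfied_by?(Gem::Version.new("2.9.0"))  # false

# ">= 3.1.0" is the new required_ruby_version from the metadata.
RUBY_REQ = Gem::Requirement.new(">= 3.1.0")

RUBY_REQ.satisfied_by?(Gem::Version.new("3.0.7")) # false
RUBY_REQ.satisfied_by?(Gem::Version.new("3.1.0")) # true
```

In practice this means a Ruby 3.0 installation that resolved 2.15.0 should not resolve 2.23.0.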
Binary file
Binary file
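
The `set_spec.rb` changes in this diff guard every `#size` assertion with `RE2::Set.size?`, because `#size` raises `RE2::Set::UnsupportedError` when the underlying RE2 library has no `Size` method. Calling code may want the same guard. The sketch below is one possible shape for it: `pattern_count` is a hypothetical helper, not part of the gem, and `UnsupportedSet` is a stand-in class so the example runs without the gem installed.

```ruby
# Hypothetical helper (not part of the re2 gem): returns the number of
# patterns in +set+, or nil when the set's class reports via a `size?`
# class method that size is unsupported.
def pattern_count(set)
  klass = set.class
  return nil unless klass.respond_to?(:size?) && klass.size?

  set.size
end

# Stand-in that mimics a build where size is unavailable. It exists only
# so this sketch is runnable without the re2 gem.
class UnsupportedSet
  def self.size?
    false
  end
end

pattern_count(UnsupportedSet.new) # nil, because size? is false
```

With the gem installed and `RE2::Set.size?` returning true, `pattern_count(RE2::Set.new)` would be expected to return 0, matching the first assertion in the new spec.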