re2 2.4.3-arm64-darwin → 2.6.0.rc1-arm64-darwin

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 43a20c2befed9b069626ffe8e8b870d57f5b4bcbc48923494ea12603ff436d8a
- data.tar.gz: ab8d9c2ea3ab8d73be017d100c42c12b62efbb657ed33958b1ca6323adaf0776
+ metadata.gz: aeea11fe983c4a111eca8d593e830e47fbc88b7f8b6f2d1d1a58c0a45c72b363
+ data.tar.gz: a4e3841f0d06601371e045814e8c3f4381a70dc3a74e6431eb5d308678866f1f
  SHA512:
- metadata.gz: 7ff372cf6389d2e5a0194d7edd77e3942e7d11b26aa13a8d9bb789d53228c086b1e3c9210f327f2f86242f93e751a4160f7fb2da273c97e85036685c32cf8d9e
- data.tar.gz: 5984c45dc42a3898b40c48861183530c987004f3b632c8fa3f77721330acc004ef6df5942dce087718af3b541e3499edf2d6c8564a85dc30f035d89d7dad960e
+ metadata.gz: 91b75520402ee70ef82cba9ccf654e356526588a476a4a2d527b8e71c61e160abc63f4a9840f11a61fc14b9f50d8387948fe46cd2831c41dc7f3fe3c2db44b98
+ data.tar.gz: b962af565015eabf353839cc9f7ced89f33fd9bb7abf1a61aebc1f3752c0b484f7f2e8d932773786c80b0bf473ca2dd13757e9f101b71dbc38a081109e15c4dd
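Reviewer note: the digests above cover `metadata.gz` and `data.tar.gz`, the two members of the `.gem` archive. The following is a minimal sketch of how a recorded SHA256 value could be compared against downloaded bytes. `digest_matches?` is a hypothetical helper name, not part of RubyGems or re2, and the sample input is an empty string standing in for real gem contents.

```ruby
require "digest"

# Hypothetical helper (an assumption for illustration): returns true when the
# SHA256 hex digest of `bytes` equals the digest recorded in checksums.yaml.
def digest_matches?(bytes, recorded_sha256)
  Digest::SHA256.hexdigest(bytes) == recorded_sha256.downcase
end

# SHA256 of the empty string, used only as a stand-in for real file contents.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

puts digest_matches?("", EMPTY_SHA256)         # digest agrees with the input
puts digest_matches?("tampered", EMPTY_SHA256) # digest does not agree
```

In practice the bytes would come from reading `data.tar.gz` or `metadata.gz` out of the unpacked `.gem` file, and the SHA512 entries could be checked the same way with `Digest::SHA512`.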
data/README.md CHANGED
@@ -1,205 +1,245 @@
- re2 [![Build Status](https://github.com/mudge/re2/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/mudge/re2/actions)
- ===
+ # re2 - safer regular expressions in Ruby
 
  Ruby bindings to [RE2][], a "fast, safe, thread-friendly alternative to
  backtracking regular expression engines like those used in PCRE, Perl, and
  Python".
 
- **Current version:** 2.4.3
- **Supported Ruby versions:** 2.6, 2.7, 3.0, 3.1, 3.2
+ [![Build Status](https://github.com/mudge/re2/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/mudge/re2/actions)
+
+ **Current version:** 2.5.0
  **Bundled RE2 version:** libre2.11 (2023-11-01)
- **Supported RE2 versions:** libre2.0 (< 2020-03-02), libre2.1 (2020-03-02), libre2.6 (2020-03-03), libre2.7 (2020-05-01), libre2.8 (2020-07-06), libre2.9 (2020-11-01), libre2.10 (2022-12-01), libre2.11 (2023-07-01)
 
- Installation
- ------------
+ ```ruby
+ RE2('h.*o').full_match?("hello") #=> true
+ RE2('e').full_match?("hello") #=> false
+ RE2('h.*o').partial_match?("hello") #=> true
+ RE2('e').partial_match?("hello") #=> true
+ RE2('(\w+):(\d+)').full_match("ruby:1234")
+ #=> #<RE2::MatchData "ruby:1234" 1:"ruby" 2:"1234">
+ ```
 
- The gem comes bundled with a version of [RE2][] and will compile itself (and
- any dependencies) on install. As compilation can take a while, precompiled
- native gems are available for Linux, Windows and macOS.
+ ## Table of Contents
+
+ * [Why RE2?](#why-re2)
+ * [Usage](#usage)
+ * [Compiling regular expressions](#compiling-regular-expressions)
+ * [Matching interface](#matching-interface)
+ * [Submatch extraction](#submatch-extraction)
+ * [Scanning text incrementally](#scanning-text-incrementally)
+ * [Searching simultaneously](#searching-simultaneously)
+ * [Encoding](#encoding)
+ * [Requirements](#requirements)
+ * [Native gems](#native-gems)
+ * [Installing the `ruby` platform gem](#installing-the-ruby-platform-gem)
+ * [Using system libraries](#using-system-libraries)
+ * [Thanks](#thanks)
+ * [Contact](#contact)
+ * [License](#license)
+ * [Dependencies](#dependencies)
+
+ ## Why RE2?
+
+ While [recent
+ versions](https://www.ruby-lang.org/en/news/2022/12/25/ruby-3-2-0-released/) of
+ Ruby have improved defences against [regular expression denial of service
+ (ReDoS) attacks](https://en.wikipedia.org/wiki/ReDoS), it is still possible for
+ users to craft malicious patterns that take a long time to process by using
+ syntactic features such as [back-references, lookaheads and possessive
+ quantifiers](https://bugs.ruby-lang.org/issues/19104#note-3). RE2 aims to
+ eliminate ReDoS by design:
+
+ > **_Safety is RE2's raison d'être._**
+ >
+ > RE2 was designed and implemented with an explicit goal of being able to
+ > handle regular expressions from untrusted users without risk. One of its
+ > primary guarantees is that the match time is linear in the length of the
+ > input string. It was also written with production concerns in mind: the
+ > parser, the compiler and the execution engines limit their memory usage by
+ > working within a configurable budget – failing gracefully when exhausted –
+ > and they avoid stack overflow by eschewing recursion.
+
+ — [Why RE2?](https://github.com/google/re2/wiki/WhyRE2)
+
+ ## Usage
+
+ Install re2 as a dependency:
 
- In v2.0 and later, precompiled native gems are available for Ruby 2.6 to 3.2
- on these platforms:
+ ```ruby
+ # In your Gemfile
+ gem "re2"
 
- - `aarch64-linux` (requires: glibc >= 2.29)
- - `arm-linux` (requires: glibc >= 2.29)
- - `arm64-darwin`
- - `x64-mingw32` / `x64-mingw-ucrt`
- - `x86-linux` (requires: glibc >= 2.17)
- - `x86_64-darwin`
- - `x86_64-linux` (requires: glibc >= 2.17)
+ # Or without Bundler
+ gem install re2
+ ```
 
- If you wish to opt out of using the bundled libraries, you will need RE2
- installed as well as a C++ compiler such as [gcc][] (on Debian and Ubuntu, this
- is provided by the [build-essential][] package). If you are using macOS, I
- recommend installing RE2 with [Homebrew][] by running the following:
+ Include in your code:
 
- $ brew install re2
+ ```ruby
+ require "re2"
+ ```
 
- If you are using Debian, you can install the [libre2-dev][] package like so:
+ Full API documentation automatically generated from the latest version is
+ available at https://mudge.name/re2/.
 
- $ sudo apt-get install libre2-dev
+ While re2 uses the same naming scheme as Ruby's built-in regular expression
+ library (with [`Regexp`](https://mudge.name/re2/RE2/Regexp.html) and
+ [`MatchData`](https://mudge.name/re2/RE2/MatchData.html)), its API is slightly
+ different:
 
- Recent versions of RE2 require [CMake](https://cmake.org) and a compiler with
- C++14 support such as [clang](http://clang.llvm.org/) 3.4 or
- [gcc](https://gcc.gnu.org/) 5.
+ ### Compiling regular expressions
 
- If you are using a packaged Ruby distribution, make sure you also have the
- Ruby header files installed such as those provided by the [ruby-dev][] package
- on Debian and Ubuntu.
+ > [!WARNING]
+ > RE2's regular expression syntax differs from PCRE and Ruby's built-in
+ > [`Regexp`](https://docs.ruby-lang.org/en/3.2/Regexp.html) library, see the
+ > [official syntax page](https://github.com/google/re2/wiki/Syntax) for more
+ > details.
 
- You can then install the library via RubyGems with `gem install re2 --platform=ruby --
- --enable-system-libraries` or `gem install re2 --platform=ruby -- --enable-system-libraries
- --with-re2-dir=/path/to/re2/prefix` if RE2 is not installed in any of the
- following default locations:
+ The core class is [`RE2::Regexp`](https://mudge.name/re2/RE2/Regexp.html) which
+ takes a regular expression as a string and compiles it internally into an `RE2`
+ object. A global function `RE2` is available to concisely compile a new
+ `RE2::Regexp`:
 
- * `/usr/local`
- * `/opt/homebrew`
- * `/usr`
+ ```ruby
+ re = RE2('(\w+):(\d+)')
+ #=> #<RE2::Regexp /(\w+):(\d+)/>
+ re.ok? #=> true
 
- Alternatively, you can set the `RE2_USE_SYSTEM_LIBRARIES` environment variable instead of passing `--enable-system-libraries` to the `gem` command.
+ re = RE2('abc)def')
+ re.ok? #=> false
+ re.error #=> "missing ): abc(def"
+ ```
 
- If you're using Bundler, you can use the
- [`force_ruby_platform`](https://bundler.io/v2.3/man/gemfile.5.html#FORCE_RUBY_PLATFORM)
- option in your Gemfile.
+ > [!TIP]
+ > Note the use of *single quotes* when passing the regular expression as
+ > a string to `RE2` so that the backslashes aren't interpreted as escapes.
 
- Documentation
- -------------
+ When compiling a regular expression, an optional second argument can be used to change RE2's default options, e.g. stop logging syntax and execution errors to stderr with `log_errors`:
 
- Full documentation automatically generated from the latest version is
- available at <http://mudge.name/re2/>.
+ ```ruby
+ RE2('abc)def', log_errors: false)
+ ```
 
- Note that RE2's regular expression syntax differs from PCRE and Ruby's
- built-in [`Regexp`][Regexp] library, see the [official syntax page][] for more
- details.
+ See the API documentation for [`RE2::Regexp#initialize`](https://mudge.name/re2/RE2/Regexp.html#initialize-instance_method) for all the available options.
 
- Usage
- -----
+ ### Matching interface
 
- While re2 uses the same naming scheme as Ruby's built-in regular expression
- library (with [`Regexp`](http://mudge.name/re2/RE2/Regexp.html) and
- [`MatchData`](http://mudge.name/re2/RE2/MatchData.html)), its API is slightly
- different:
+ There are two main methods for matching: [`RE2::Regexp#full_match?`](https://mudge.name/re2/RE2/Regexp.html#full_match%3F-instance_method) requires the regular expression to match the entire input text, and [`RE2::Regexp#partial_match?`](https://mudge.name/re2/RE2/Regexp.html#partial_match%3F-instance_method) looks for a match for a substring of the input text, returning a boolean to indicate whether a match was successful or not.
+
+ ```ruby
+ RE2('h.*o').full_match?("hello") #=> true
+ RE2('e').full_match?("hello") #=> false
 
- ```console
- $ irb -rubygems
- > require 're2'
- > r = RE2::Regexp.new('w(\d)(\d+)')
- => #<RE2::Regexp /w(\d)(\d+)/>
- > m = r.match("w1234")
- => #<RE2::MatchData "w1234" 1:"1" 2:"234">
- > m[1]
- => "1"
- > m.string
- => "w1234"
- > m.begin(1)
- => 1
- > m.end(1)
- => 2
- > r =~ "w1234"
- => true
- > r !~ "bob"
- => true
- > r.match("bob")
- => nil
+ RE2('h.*o').partial_match?("hello") #=> true
+ RE2('e').partial_match?("hello") #=> true
  ```
 
- As
- [`RE2::Regexp.new`](http://mudge.name/re2/RE2/Regexp.html#initialize-instance_method)
- (or `RE2::Regexp.compile`) can be quite verbose, a helper method has been
- defined against `Kernel` so you can use a shorter version to create regular
- expressions:
+ ### Submatch extraction
 
- ```console
- > RE2('(\d+)')
- => #<RE2::Regexp /(\d+)/>
- ```
+ > [!TIP]
+ > Only extract the number of submatches you need as performance is improved
+ > with fewer submatches (with the best performance when avoiding submatch
+ > extraction altogether).
 
- Note the use of *single quotes* as double quotes will interpret `\d` as `d` as
- in the following example:
+ Both matching methods have a second form that can extract submatches as [`RE2::MatchData`](https://mudge.name/re2/RE2/MatchData.html) objects: [`RE2::Regexp#full_match`](https://mudge.name/re2/RE2/Regexp.html#full_match-instance_method) and [`RE2::Regexp#partial_match`](https://mudge.name/re2/RE2/Regexp.html#partial_match-instance_method).
 
- ```console
- > RE2("(\d+)")
- => #<RE2::Regexp /(d+)/>
+ ```ruby
+ m = RE2('(\w+):(\d+)').full_match("ruby:1234")
+ #=> #<RE2::MatchData "ruby:1234" 1:"ruby" 2:"1234">
+
+ m[0] #=> "ruby:1234"
+ m[1] #=> "ruby"
+ m[2] #=> "1234"
+
+ m = RE2('(\w+):(\d+)').full_match("r")
+ #=> nil
  ```
 
- As of 0.3.0, you can use named groups:
-
- ```console
- > r = RE2::Regexp.new('(?P<name>\w+) (?P<age>\d+)')
- => #<RE2::Regexp /(?P<name>\w+) (?P<age>\d+)/>
- > m = r.match("Bob 40")
- => #<RE2::MatchData "Bob 40" 1:"Bob" 2:"40">
- > m[:name]
- => "Bob"
- > m["age"]
- => "40"
+ `RE2::MatchData` supports retrieving submatches by numeric index or by name if present in the regular expression:
+
+ ```ruby
+ m = RE2('(?P<word>\w+):(?P<number>\d+)').full_match("ruby:1234")
+ #=> #<RE2::MatchData "ruby:1234" 1:"ruby" 2:"1234">
+
+ m["word"] #=> "ruby"
+ m["number"] #=> "1234"
  ```
 
- As of 0.6.0, you can use `RE2::Regexp#scan` to incrementally scan text for
- matches (similar in purpose to Ruby's
- [`String#scan`](http://ruby-doc.org/core-2.0.0/String.html#method-i-scan)).
- Calling `scan` will return an `RE2::Scanner` which is
- [enumerable](http://ruby-doc.org/core-2.0.0/Enumerable.html) meaning you can
- use `each` to iterate through the matches (and even use
- [`Enumerator::Lazy`](http://ruby-doc.org/core-2.0/Enumerator/Lazy.html)):
+ They can also be used with Ruby's [pattern matching](https://docs.ruby-lang.org/en/3.2/syntax/pattern_matching_rdoc.html):
 
  ```ruby
- re = RE2('(\w+)')
- scanner = re.scan("It is a truth universally acknowledged")
- scanner.each do |match|
- puts match
+ case RE2('(\w+):(\d+)').full_match("ruby:1234")
+ in [word, number]
+ puts "Word: #{word}, Number: #{number}"
+ else
+ puts "No match"
  end
+ # Word: ruby, Number: 1234
 
- scanner.rewind
-
- enum = scanner.to_enum
- enum.next #=> ["It"]
- enum.next #=> ["is"]
+ case RE2('(?P<word>\w+):(?P<number>\d+)').full_match("ruby:1234")
+ in word:, number:
+ puts "Word: #{word}, Number: #{number}"
+ else
+ puts "No match"
+ end
+ # Word: ruby, Number: 1234
  ```
 
- As of 1.5.0, you can use `RE2::Set` to match multiple patterns against a
- string. Calling `RE2::Set#add` with a pattern will return an integer index of
- the pattern. After all patterns have been added, the set can be compiled using
- `RE2::Set#compile`, and then `RE2::Set#match` will return an `Array<Integer>`
- containing the indices of all the patterns that matched.
+ By default, both `full_match` and `partial_match` will extract all submatches into the `RE2::MatchData` based on the number of capturing groups in the regular expression. This can be changed by passing an optional second argument when matching:
 
  ```ruby
- set = RE2::Set.new
- set.add("abc") #=> 0
- set.add("def") #=> 1
- set.add("ghi") #=> 2
- set.compile #=> true
- set.match("abcdefghi") #=> [0, 1, 2]
- set.match("ghidefabc") #=> [2, 1, 0]
+ m = RE2('(\w+):(\d+)').full_match("ruby:1234", submatches: 1)
+ => #<RE2::MatchData "ruby:1234" 1:"ruby">
  ```
 
- As of 1.6.0, you can use [Ruby's pattern matching](https://docs.ruby-lang.org/en/3.0/syntax/pattern_matching_rdoc.html) against `RE2::MatchData` with both array patterns and hash patterns:
+ > [!WARNING]
+ > If the regular expression has no capturing groups or you pass `submatches:
+ > 0`, the matching method will behave like its `full_match?` or
+ > `partial_match?` form and only return `true` or `false` rather than
+ > `RE2::MatchData`.
+
+ ### Scanning text incrementally
+
+ If you want to repeatedly match regular expressions from the start of some input text, you can use [`RE2::Regexp#scan`](https://mudge.name/re2/RE2/Regexp.html#scan-instance_method) to return an `Enumerable` [`RE2::Scanner`](https://mudge.name/re2/RE2/Scanner.html) object which will lazily consume matches as you iterate over it:
 
  ```ruby
- case RE2('(\w+) (\d+)').match("Alice 42")
- in [name, age]
- puts "My name is #{name} and I am #{age} years old"
- else
- puts "No match!"
+ scanner = RE2('(\w+)').scan(" one two three 4")
+ scanner.each do |match|
+ puts match.inspect
  end
- # My name is Alice and I am 42 years old
+ # ["one"]
+ # ["two"]
+ # ["three"]
+ # ["4"]
+ ```
 
+ ### Searching simultaneously
 
- case RE2('(?P<name>\w+) (?P<age>\d+)').match("Alice 42")
- in {name:, age:}
- puts "My name is #{name} and I am #{age} years old"
- else
- puts "No match!"
- end
- # My name is Alice and I am 42 years old
+ [`RE2::Set`](https://mudge.name/re2/RE2/Set.html) represents a collection of
+ regular expressions that can be searched for simultaneously. Calling
+ [`RE2::Set#add`](https://mudge.name/re2/RE2/Set.html#add-instance_method) with
+ a regular expression will return the integer index at which it is stored within
+ the set. After all patterns have been added, the set can be compiled using
+ [`RE2::Set#compile`](https://mudge.name/re2/RE2/Set.html#compile-instance_method),
+ and then
+ [`RE2::Set#match`](https://mudge.name/re2/RE2/Set.html#match-instance_method)
+ will return an array containing the indices of all the patterns that matched.
+
+ ```ruby
+ set = RE2::Set.new
+ set.add("abc") #=> 0
+ set.add("def") #=> 1
+ set.add("ghi") #=> 2
+ set.compile #=> true
+ set.match("abcdefghi") #=> [0, 1, 2]
+ set.match("ghidefabc") #=> [2, 1, 0]
  ```
 
- Encoding
- --------
+ ### Encoding
 
- Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
- returned in UTF-8 by default or ISO-8859-1 if the `:utf8` option for the
- `RE2::Regexp` is set to false (any other encoding's behaviour is undefined).
+ > [!WARNING]
+ > Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
+ > returned in UTF-8 by default or ISO-8859-1 if the `:utf8` option for the
+ > `RE2::Regexp` is set to `false` (any other encoding's behaviour is undefined).
 
  For backward compatibility: re2 won't automatically convert string inputs to
  the right encoding so this is the responsibility of the caller, e.g.
@@ -209,51 +249,77 @@ the right encoding so this is the responsibility of the caller, e.g.
  RE2(non_utf8_pattern.encode("UTF-8")).match(non_utf8_text.encode("UTF-8"))
 
  # If the :utf8 option is false, RE2 will process patterns and text as ISO-8859-1
- RE2(non_latin1_pattern.encode("ISO-8859-1"), :utf8 => false).match(non_latin1_text.encode("ISO-8859-1"))
+ RE2(non_latin1_pattern.encode("ISO-8859-1"), utf8: false).match(non_latin1_text.encode("ISO-8859-1"))
  ```
 
- Features
- --------
+ ## Requirements
 
- * Pre-compiling regular expressions with
- [`RE2::Regexp.new(re)`](https://github.com/google/re2/blob/2016-02-01/re2/re2.h#L100),
- `RE2::Regexp.compile(re)` or `RE2(re)` (including specifying options, e.g.
- `RE2::Regexp.new("pattern", :case_sensitive => false)`
+ This gem requires the following to run:
 
- * Extracting matches with `re2.match(text)` (and an exact number of matches
- with `re2.match(text, number_of_matches)` such as `re2.match("123-234", 2)`)
+ * [Ruby](https://www.ruby-lang.org/en/) 2.6 to 3.3
 
- * Extracting matches by name (both with strings and symbols)
+ It supports the following RE2 ABI versions:
 
- * Checking for matches with `re2 =~ text`, `re2 === text` (for use in `case`
- statements) and `re2 !~ text`
+ * libre2.0 (prior to release 2020-03-02) to libre2.11 (2023-07-01 to 2023-11-01)
 
- * Incrementally scanning text with `re2.scan(text)`
+ ### Native gems
 
- * Search a collection of patterns simultaneously with `RE2::Set`
+ Where possible, a pre-compiled native gem will be provided for the following platforms:
 
- * Checking regular expression compilation with `re2.ok?`, `re2.error` and
- `re2.error_arg`
+ * Linux `aarch64-linux` and `arm-linux` (requires [glibc](https://www.gnu.org/software/libc/) 2.29+)
+ * Linux `x86-linux` and `x86_64-linux` (requires [glibc](https://www.gnu.org/software/libc/) 2.17+) including [musl](https://musl.libc.org/)-based systems such as [Alpine](https://alpinelinux.org)
+ * macOS `x86_64-darwin` and `arm64-darwin`
+ * Windows `x64-mingw32` and `x64-mingw-ucrt`
 
- * Checking regular expression "cost" with `re2.program_size`
+ ### Installing the `ruby` platform gem
 
- * Checking the options for an expression with `re2.options` or individually
- with `re2.case_sensitive?`
+ > [!WARNING]
+ > We strongly recommend using the native gems where possible to avoid the need
+ > for compiling the C++ extension and its dependencies which will take longer
+ > and be less reliable.
 
- * Performing a single string replacement with `pattern.replace(replacement,
- original)`
+ If you wish to compile the gem, you will need to explicitly install the `ruby` platform gem:
 
- * Performing a global string replacement with
- `pattern.replace_all(replacement, original)`
+ ```ruby
+ # In your Gemfile with Bundler 2.3.18+
+ gem "re2", force_ruby_platform: true
+
+ # With Bundler 2.1+
+ bundle config set force_ruby_platform true
+
+ # With older versions of Bundler
+ bundle config force_ruby_platform true
+
+ # Without Bundler
+ gem install re2 --platform=ruby
+ ```
+
+ You will need a full compiler toolchain for compiling Ruby C extensions (see
+ [Nokogiri's "The Compiler
+ Toolchain"](https://nokogiri.org/tutorials/installing_nokogiri.html#appendix-a-the-compiler-toolchain))
+ plus the toolchain required for compiling the vendored version of RE2 and its
+ dependency [Abseil][] which includes
+ [CMake](https://cmake.org) and a compiler with C++14 support such as
+ [clang](http://clang.llvm.org/) 3.4 or [gcc](https://gcc.gnu.org/) 5. On
+ Windows, you'll also need pkgconf 2.1.0+ to avoid [`undefined reference`
+ errors](https://github.com/pkgconf/pkgconf/issues/322) when attempting to
+ compile Abseil.
+
+ ### Using system libraries
 
- * Escaping regular expressions with
- [`RE2.escape(unquoted)`](https://github.com/google/re2/blob/2016-02-01/re2/re2.h#L418) and
- `RE2.quote(unquoted)`
+ If you already have RE2 installed, you can instruct the gem not to use its own vendored version:
 
- * Pattern matching with `RE2::MatchData`
+ ```ruby
+ gem install re2 --platform=ruby -- --enable-system-libraries
+
+ # If RE2 is not installed in /usr/local, /usr, or /opt/homebrew:
+ gem install re2 --platform=ruby -- --enable-system-libraries --with-re2-dir=/path/to/re2/prefix
+ ```
 
- Contributions
- -------------
+ Alternatively, you can set the `RE2_USE_SYSTEM_LIBRARIES` environment variable instead of passing `--enable-system-libraries` to the `gem` command.
+
+
+ ## Thanks
 
  * Thanks to [Jason Woods](https://github.com/driskell) who contributed the
  original implementations of `RE2::MatchData#begin` and `RE2::MatchData#end`.
@@ -278,30 +344,21 @@ Contributions
  switch to Ruby's `TypedData` API and the resulting garbage collection
  improvements in 2.4.0.
 
- Contact
- -------
+ ## Contact
 
  All issues and suggestions should go to [GitHub Issues](https://github.com/mudge/re2/issues).
 
- License
- -------
+ ## License
 
  This library is licensed under the BSD 3-Clause License, see `LICENSE.txt`.
 
- Dependencies
- ------------
+ Copyright © 2010, Paul Mucur.
+
+ ### Dependencies
 
  The source code of [RE2][] is distributed in the `ruby` platform gem. This code is licensed under the BSD 3-Clause License, see `LICENSE-DEPENDENCIES.txt`.
 
  The source code of [Abseil][] is distributed in the `ruby` platform gem. This code is licensed under the Apache License 2.0, see `LICENSE-DEPENDENCIES.txt`.
 
  [RE2]: https://github.com/google/re2
- [gcc]: http://gcc.gnu.org/
- [ruby-dev]: http://packages.debian.org/ruby-dev
- [build-essential]: http://packages.debian.org/build-essential
- [Regexp]: http://ruby-doc.org/core/classes/Regexp.html
- [MatchData]: http://ruby-doc.org/core/classes/MatchData.html
- [Homebrew]: http://mxcl.github.com/homebrew
- [libre2-dev]: http://packages.debian.org/search?keywords=libre2-dev
- [official syntax page]: https://github.com/google/re2/wiki/Syntax
  [Abseil]: https://abseil.io
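Reviewer note: both the old and new README text warn that patterns should be passed in single quotes. The sketch below illustrates why, using only plain Ruby strings. Ruby's built-in `Regexp` is used as a stand-in so the snippet runs without the re2 gem installed; the assumption is that `RE2` receives exactly the same string contents.

```ruby
single = '(\w+):(\d+)' # single quotes keep the backslashes
double = "(\w+):(\d+)" # double quotes drop the backslash from \w and \d

puts single # prints (\w+):(\d+)
puts double # prints (w+):(d+)

# Built-in Regexp used as a stand-in for RE2::Regexp (assumption: the pattern
# string reaches the engine unchanged in both cases).
puts Regexp.new(single).match?("ruby:1234") # word and digit classes match
puts Regexp.new(double).match?("ruby:1234") # literal "w" and "d" do not match
```

The double-quoted form silently becomes a pattern for literal `w` and `d` characters, which is the behaviour the removed `RE2("(\d+)")` example in the old README demonstrated.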
data/Rakefile CHANGED
@@ -33,7 +33,7 @@ Gem::PackageTask.new(RE2_GEM_SPEC) do |p|
  p.need_tar = false
  end
 
- CROSS_RUBY_VERSIONS = %w[3.2.0 3.1.0 3.0.0 2.7.0 2.6.0].join(':')
+ CROSS_RUBY_VERSIONS = %w[3.3.0 3.2.0 3.1.0 3.0.0 2.7.0 2.6.0].join(':')
  CROSS_RUBY_PLATFORMS = %w[
  aarch64-linux
  arm-linux
data/ext/re2/extconf.rb CHANGED
@@ -1,7 +1,9 @@
- # re2 (http://github.com/mudge/re2)
- # Ruby bindings to re2, an "efficient, principled regular expression library"
+ # re2 (https://github.com/mudge/re2)
+ # Ruby bindings to RE2, a "fast, safe, thread-friendly alternative to
+ # backtracking regular expression engines like those used in PCRE, Perl, and
+ # Python".
  #
- # Copyright (c) 2010-2012, Paul Mucur (http://mudge.name)
+ # Copyright (c) 2010, Paul Mucur (https://mudge.name)
  # Released under the BSD Licence, please see LICENSE.txt
 
  require 'mkmf'
@@ -271,65 +273,6 @@ def build_with_system_libraries
  build_extension
  end
 
- # pkgconf v1.9.3 on Windows incorrectly sorts the output of `pkg-config
- # --libs --static`, resulting in build failures: https://github.com/pkgconf/pkgconf/issues/268.
- # To work around the issue, store the correct order of abseil flags here and add them manually
- # for Windows.
- #
- # Note that `-ldbghelp` is incorrectly added before `-labsl_symbolize` in abseil:
- # https://github.com/abseil/abseil-cpp/issues/1497
- ABSL_LDFLAGS = %w[
- -labsl_flags
- -labsl_flags_internal
- -labsl_flags_marshalling
- -labsl_flags_reflection
- -labsl_flags_private_handle_accessor
- -labsl_flags_commandlineflag
- -labsl_flags_commandlineflag_internal
- -labsl_flags_config
- -labsl_flags_program_name
- -labsl_cord
- -labsl_cordz_info
- -labsl_cord_internal
- -labsl_cordz_functions
- -labsl_cordz_handle
- -labsl_crc_cord_state
- -labsl_crc32c
- -labsl_crc_internal
- -labsl_crc_cpu_detect
- -labsl_raw_hash_set
- -labsl_hash
- -labsl_city
- -labsl_bad_variant_access
- -labsl_low_level_hash
- -labsl_hashtablez_sampler
- -labsl_exponential_biased
- -labsl_bad_optional_access
- -labsl_str_format_internal
- -labsl_synchronization
- -labsl_graphcycles_internal
- -labsl_kernel_timeout_internal
- -labsl_stacktrace
- -labsl_symbolize
- -ldbghelp
- -labsl_debugging_internal
- -labsl_demangle_internal
- -labsl_malloc_internal
- -labsl_time
- -labsl_civil_time
- -labsl_strings
- -labsl_string_view
- -labsl_strings_internal
- -labsl_base
- -ladvapi32
- -labsl_spinlock_wait
- -labsl_int128
- -labsl_throw_delegate
- -labsl_raw_logging_internal
- -labsl_log_severity
- -labsl_time_zone
- ].freeze
-
  def libflag_to_filename(ldflag)
  case ldflag
  when /\A-l(.+)/
@@ -382,14 +325,7 @@ def add_flag(arg, lib_paths)
  end
 
  def add_static_ldflags(flags, lib_paths)
- static_flags = flags.strip.shellsplit
-
- if MiniPortile.windows?
- static_flags.each { |flag| add_flag(flag, lib_paths) unless ABSL_LDFLAGS.include?(flag) }
- ABSL_LDFLAGS.each { |flag| add_flag(flag, lib_paths) }
- else
- static_flags.each { |flag| add_flag(flag, lib_paths) }
- end
+ flags.strip.shellsplit.each { |flag| add_flag(flag, lib_paths) }
  end
 
  def build_with_vendored_libraries
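Reviewer note: with the Windows-specific `ABSL_LDFLAGS` reordering removed, `add_static_ldflags` now passes every flag through in the order it was given. The sketch below shows what `flags.strip.shellsplit` yields for a sample flag string. `collect_static_flags` and the sample flags are hypothetical stand-ins; the gem's real `add_flag` also takes `lib_paths` and is not reproduced here.

```ruby
require "shellwords"

# Hypothetical stand-in for the gem's add_static_ldflags: instead of calling
# add_flag, it collects each flag so the resulting order can be inspected.
def collect_static_flags(flags)
  collected = []
  flags.strip.shellsplit.each { |flag| collected << flag }
  collected
end

# Made-up sample resembling `pkg-config --libs --static` output.
sample = " -L/opt/re2/lib -lre2 -labsl_base -labsl_strings\n"
p collect_static_flags(sample)
```

The flags come back in their original left-to-right order, so correct link ordering on Windows now depends on pkg-config itself, which fits the README's new pkgconf 2.1.0+ requirement.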