re2 2.4.3-arm-linux → 2.6.0.rc1-arm-linux
- checksums.yaml +4 -4
- data/README.md +249 -192
- data/Rakefile +1 -1
- data/ext/re2/extconf.rb +6 -70
- data/ext/re2/re2.cc +450 -263
- data/ext/re2/recipes.rb +8 -0
- data/lib/2.6/re2.so +0 -0
- data/lib/2.7/re2.so +0 -0
- data/lib/3.0/re2.so +0 -0
- data/lib/3.1/re2.so +0 -0
- data/lib/3.2/re2.so +0 -0
- data/lib/3.3/re2.so +0 -0
- data/lib/re2/regexp.rb +70 -0
- data/lib/re2/scanner.rb +9 -0
- data/lib/re2/string.rb +10 -59
- data/lib/re2/version.rb +10 -1
- data/lib/re2.rb +7 -3
- data/re2.gemspec +3 -2
- data/spec/kernel_spec.rb +2 -2
- data/spec/re2/match_data_spec.rb +64 -25
- data/spec/re2/regexp_spec.rb +492 -113
- data/spec/re2/scanner_spec.rb +3 -8
- data/spec/re2/set_spec.rb +18 -18
- data/spec/re2_spec.rb +4 -4
- metadata +11 -9
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 8b62f672bf51fd588d76dde9157762ce67ffbaa75e405790088a0c1778e55eee
|
4
|
+
data.tar.gz: 6beb436dbcf3de798bf31ba312142d6c00262c852ce69330528666e5d900f204
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 881d7292b8ccaafe67ca6d12fe689efe89b15c6ee06176a70bde71d3a055cb2ff5d6c69f75768f42d926a1e0a3f6b0fe4fe096e9710734c1ef2bc85bd5ff4bef
|
7
|
+
data.tar.gz: 4be32fe0fb1b98d7dff808bfe662e192a260e439ed51df846fc7cd95ae4791e4cc8e1cbc91202ee917ad17e3e46a18d373ffa48cc41037c9aee520ff912dec78
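The digests recorded in `checksums.yaml` cover the `metadata.gz` and `data.tar.gz` archives packaged inside the gem. As a minimal sketch of how such a digest could be checked by hand, the following uses Ruby's standard `digest` library; the `checksum_matches?` helper and the throwaway file are assumptions made for illustration and are not part of re2 or RubyGems.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: true when the file's SHA256 hex digest equals the
# expected value (for example, one copied from checksums.yaml).
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex
end

# Demonstration with a throwaway file rather than a real gem archive.
Tempfile.create("example-archive") do |file|
  file.write("example archive contents")
  file.flush

  expected = Digest::SHA256.hexdigest("example archive contents")
  puts checksum_matches?(file.path, expected)  # true
  puts checksum_matches?(file.path, "0" * 64)  # false
end
```

The same approach should work for the SHA512 entries by swapping in `Digest::SHA512`.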
|
data/README.md
CHANGED
@@ -1,205 +1,245 @@
|
|
1
|
-
re2
|
2
|
-
===
|
1
|
+
# re2 - safer regular expressions in Ruby
|
3
2
|
|
4
3
|
Ruby bindings to [RE2][], a "fast, safe, thread-friendly alternative to
|
5
4
|
backtracking regular expression engines like those used in PCRE, Perl, and
|
6
5
|
Python".
|
7
6
|
|
8
|
-
|
9
|
-
|
7
|
+
[![Build Status](https://github.com/mudge/re2/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/mudge/re2/actions)
|
8
|
+
|
9
|
+
**Current version:** 2.5.0
|
10
10
|
**Bundled RE2 version:** libre2.11 (2023-11-01)
|
11
|
-
**Supported RE2 versions:** libre2.0 (< 2020-03-02), libre2.1 (2020-03-02), libre2.6 (2020-03-03), libre2.7 (2020-05-01), libre2.8 (2020-07-06), libre2.9 (2020-11-01), libre2.10 (2022-12-01), libre2.11 (2023-07-01)
|
12
11
|
|
13
|
-
|
14
|
-
|
12
|
+
```ruby
|
13
|
+
RE2('h.*o').full_match?("hello") #=> true
|
14
|
+
RE2('e').full_match?("hello") #=> false
|
15
|
+
RE2('h.*o').partial_match?("hello") #=> true
|
16
|
+
RE2('e').partial_match?("hello") #=> true
|
17
|
+
RE2('(\w+):(\d+)').full_match("ruby:1234")
|
18
|
+
#=> #<RE2::MatchData "ruby:1234" 1:"ruby" 2:"1234">
|
19
|
+
```
|
15
20
|
|
16
|
-
|
17
|
-
|
18
|
-
|
21
|
+
## Table of Contents
|
22
|
+
|
23
|
+
* [Why RE2?](#why-re2)
|
24
|
+
* [Usage](#usage)
|
25
|
+
* [Compiling regular expressions](#compiling-regular-expressions)
|
26
|
+
* [Matching interface](#matching-interface)
|
27
|
+
* [Submatch extraction](#submatch-extraction)
|
28
|
+
* [Scanning text incrementally](#scanning-text-incrementally)
|
29
|
+
* [Searching simultaneously](#searching-simultaneously)
|
30
|
+
* [Encoding](#encoding)
|
31
|
+
* [Requirements](#requirements)
|
32
|
+
* [Native gems](#native-gems)
|
33
|
+
* [Installing the `ruby` platform gem](#installing-the-ruby-platform-gem)
|
34
|
+
* [Using system libraries](#using-system-libraries)
|
35
|
+
* [Thanks](#thanks)
|
36
|
+
* [Contact](#contact)
|
37
|
+
* [License](#license)
|
38
|
+
* [Dependencies](#dependencies)
|
39
|
+
|
40
|
+
## Why RE2?
|
41
|
+
|
42
|
+
While [recent
|
43
|
+
versions](https://www.ruby-lang.org/en/news/2022/12/25/ruby-3-2-0-released/) of
|
44
|
+
Ruby have improved defences against [regular expression denial of service
|
45
|
+
(ReDoS) attacks](https://en.wikipedia.org/wiki/ReDoS), it is still possible for
|
46
|
+
users to craft malicious patterns that take a long time to process by using
|
47
|
+
syntactic features such as [back-references, lookaheads and possessive
|
48
|
+
quantifiers](https://bugs.ruby-lang.org/issues/19104#note-3). RE2 aims to
|
49
|
+
eliminate ReDoS by design:
|
50
|
+
|
51
|
+
> **_Safety is RE2's raison d'être._**
|
52
|
+
>
|
53
|
+
> RE2 was designed and implemented with an explicit goal of being able to
|
54
|
+
> handle regular expressions from untrusted users without risk. One of its
|
55
|
+
> primary guarantees is that the match time is linear in the length of the
|
56
|
+
> input string. It was also written with production concerns in mind: the
|
57
|
+
> parser, the compiler and the execution engines limit their memory usage by
|
58
|
+
> working within a configurable budget – failing gracefully when exhausted –
|
59
|
+
> and they avoid stack overflow by eschewing recursion.
|
60
|
+
|
61
|
+
— [Why RE2?](https://github.com/google/re2/wiki/WhyRE2)
|
62
|
+
|
63
|
+
## Usage
|
64
|
+
|
65
|
+
Install re2 as a dependency:
|
19
66
|
|
20
|
-
|
21
|
-
|
67
|
+
```ruby
|
68
|
+
# In your Gemfile
|
69
|
+
gem "re2"
|
22
70
|
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
- `x64-mingw32` / `x64-mingw-ucrt`
|
27
|
-
- `x86-linux` (requires: glibc >= 2.17)
|
28
|
-
- `x86_64-darwin`
|
29
|
-
- `x86_64-linux` (requires: glibc >= 2.17)
|
71
|
+
# Or without Bundler
|
72
|
+
gem install re2
|
73
|
+
```
|
30
74
|
|
31
|
-
|
32
|
-
installed as well as a C++ compiler such as [gcc][] (on Debian and Ubuntu, this
|
33
|
-
is provided by the [build-essential][] package). If you are using macOS, I
|
34
|
-
recommend installing RE2 with [Homebrew][] by running the following:
|
75
|
+
Include in your code:
|
35
76
|
|
36
|
-
|
77
|
+
```ruby
|
78
|
+
require "re2"
|
79
|
+
```
|
37
80
|
|
38
|
-
|
81
|
+
Full API documentation automatically generated from the latest version is
|
82
|
+
available at https://mudge.name/re2/.
|
39
83
|
|
40
|
-
|
84
|
+
While re2 uses the same naming scheme as Ruby's built-in regular expression
|
85
|
+
library (with [`Regexp`](https://mudge.name/re2/RE2/Regexp.html) and
|
86
|
+
[`MatchData`](https://mudge.name/re2/RE2/MatchData.html)), its API is slightly
|
87
|
+
different:
|
41
88
|
|
42
|
-
|
43
|
-
C++14 support such as [clang](http://clang.llvm.org/) 3.4 or
|
44
|
-
[gcc](https://gcc.gnu.org/) 5.
|
89
|
+
### Compiling regular expressions
|
45
90
|
|
46
|
-
|
47
|
-
|
48
|
-
|
91
|
+
> [!WARNING]
|
92
|
+
> RE2's regular expression syntax differs from PCRE and Ruby's built-in
|
93
|
+
> [`Regexp`](https://docs.ruby-lang.org/en/3.2/Regexp.html) library, see the
|
94
|
+
> [official syntax page](https://github.com/google/re2/wiki/Syntax) for more
|
95
|
+
> details.
|
49
96
|
|
50
|
-
|
51
|
-
|
52
|
-
|
53
|
-
|
97
|
+
The core class is [`RE2::Regexp`](https://mudge.name/re2/RE2/Regexp.html) which
|
98
|
+
takes a regular expression as a string and compiles it internally into an `RE2`
|
99
|
+
object. A global function `RE2` is available to concisely compile a new
|
100
|
+
`RE2::Regexp`:
|
54
101
|
|
55
|
-
|
56
|
-
|
57
|
-
|
102
|
+
```ruby
|
103
|
+
re = RE2('(\w+):(\d+)')
|
104
|
+
#=> #<RE2::Regexp /(\w+):(\d+)/>
|
105
|
+
re.ok? #=> true
|
58
106
|
|
59
|
-
|
107
|
+
re = RE2('abc)def')
|
108
|
+
re.ok? #=> false
|
109
|
+
re.error #=> "missing ): abc)def"
|
110
|
+
```
|
60
111
|
|
61
|
-
|
62
|
-
|
63
|
-
|
112
|
+
> [!TIP]
|
113
|
+
> Note the use of *single quotes* when passing the regular expression as
|
114
|
+
> a string to `RE2` so that the backslashes aren't interpreted as escapes.
|
64
115
|
|
65
|
-
|
66
|
-
-------------
|
116
|
+
When compiling a regular expression, an optional second argument can be used to change RE2's default options, e.g. stop logging syntax and execution errors to stderr with `log_errors`:
|
67
117
|
|
68
|
-
|
69
|
-
|
118
|
+
```ruby
|
119
|
+
RE2('abc)def', log_errors: false)
|
120
|
+
```
|
70
121
|
|
71
|
-
|
72
|
-
built-in [`Regexp`][Regexp] library, see the [official syntax page][] for more
|
73
|
-
details.
|
122
|
+
See the API documentation for [`RE2::Regexp#initialize`](https://mudge.name/re2/RE2/Regexp.html#initialize-instance_method) for all the available options.
|
74
123
|
|
75
|
-
|
76
|
-
-----
|
124
|
+
### Matching interface
|
77
125
|
|
78
|
-
|
79
|
-
|
80
|
-
|
81
|
-
|
126
|
+
There are two main methods for matching: [`RE2::Regexp#full_match?`](https://mudge.name/re2/RE2/Regexp.html#full_match%3F-instance_method) requires the regular expression to match the entire input text, and [`RE2::Regexp#partial_match?`](https://mudge.name/re2/RE2/Regexp.html#partial_match%3F-instance_method) looks for a match for a substring of the input text, returning a boolean to indicate whether a match was successful or not.
|
127
|
+
|
128
|
+
```ruby
|
129
|
+
RE2('h.*o').full_match?("hello") #=> true
|
130
|
+
RE2('e').full_match?("hello") #=> false
|
82
131
|
|
83
|
-
|
84
|
-
|
85
|
-
> require 're2'
|
86
|
-
> r = RE2::Regexp.new('w(\d)(\d+)')
|
87
|
-
=> #<RE2::Regexp /w(\d)(\d+)/>
|
88
|
-
> m = r.match("w1234")
|
89
|
-
=> #<RE2::MatchData "w1234" 1:"1" 2:"234">
|
90
|
-
> m[1]
|
91
|
-
=> "1"
|
92
|
-
> m.string
|
93
|
-
=> "w1234"
|
94
|
-
> m.begin(1)
|
95
|
-
=> 1
|
96
|
-
> m.end(1)
|
97
|
-
=> 2
|
98
|
-
> r =~ "w1234"
|
99
|
-
=> true
|
100
|
-
> r !~ "bob"
|
101
|
-
=> true
|
102
|
-
> r.match("bob")
|
103
|
-
=> nil
|
132
|
+
RE2('h.*o').partial_match?("hello") #=> true
|
133
|
+
RE2('e').partial_match?("hello") #=> true
|
104
134
|
```
|
105
135
|
|
106
|
-
|
107
|
-
[`RE2::Regexp.new`](http://mudge.name/re2/RE2/Regexp.html#initialize-instance_method)
|
108
|
-
(or `RE2::Regexp.compile`) can be quite verbose, a helper method has been
|
109
|
-
defined against `Kernel` so you can use a shorter version to create regular
|
110
|
-
expressions:
|
136
|
+
### Submatch extraction
|
111
137
|
|
112
|
-
|
113
|
-
>
|
114
|
-
|
115
|
-
|
138
|
+
> [!TIP]
|
139
|
+
> Only extract the number of submatches you need as performance is improved
|
140
|
+
> with fewer submatches (with the best performance when avoiding submatch
|
141
|
+
> extraction altogether).
|
116
142
|
|
117
|
-
|
118
|
-
in the following example:
|
143
|
+
Both matching methods have a second form that can extract submatches as [`RE2::MatchData`](https://mudge.name/re2/RE2/MatchData.html) objects: [`RE2::Regexp#full_match`](https://mudge.name/re2/RE2/Regexp.html#full_match-instance_method) and [`RE2::Regexp#partial_match`](https://mudge.name/re2/RE2/Regexp.html#partial_match-instance_method).
|
119
144
|
|
120
|
-
```
|
121
|
-
|
122
|
-
|
145
|
+
```ruby
|
146
|
+
m = RE2('(\w+):(\d+)').full_match("ruby:1234")
|
147
|
+
#=> #<RE2::MatchData "ruby:1234" 1:"ruby" 2:"1234">
|
148
|
+
|
149
|
+
m[0] #=> "ruby:1234"
|
150
|
+
m[1] #=> "ruby"
|
151
|
+
m[2] #=> "1234"
|
152
|
+
|
153
|
+
m = RE2('(\w+):(\d+)').full_match("r")
|
154
|
+
#=> nil
|
123
155
|
```
|
124
156
|
|
125
|
-
|
126
|
-
|
127
|
-
```
|
128
|
-
|
129
|
-
|
130
|
-
|
131
|
-
|
132
|
-
|
133
|
-
=> "Bob"
|
134
|
-
> m["age"]
|
135
|
-
=> "40"
|
157
|
+
`RE2::MatchData` supports retrieving submatches by numeric index or by name if present in the regular expression:
|
158
|
+
|
159
|
+
```ruby
|
160
|
+
m = RE2('(?P<word>\w+):(?P<number>\d+)').full_match("ruby:1234")
|
161
|
+
#=> #<RE2::MatchData "ruby:1234" 1:"ruby" 2:"1234">
|
162
|
+
|
163
|
+
m["word"] #=> "ruby"
|
164
|
+
m["number"] #=> "1234"
|
136
165
|
```
|
137
166
|
|
138
|
-
|
139
|
-
matches (similar in purpose to Ruby's
|
140
|
-
[`String#scan`](http://ruby-doc.org/core-2.0.0/String.html#method-i-scan)).
|
141
|
-
Calling `scan` will return an `RE2::Scanner` which is
|
142
|
-
[enumerable](http://ruby-doc.org/core-2.0.0/Enumerable.html) meaning you can
|
143
|
-
use `each` to iterate through the matches (and even use
|
144
|
-
[`Enumerator::Lazy`](http://ruby-doc.org/core-2.0/Enumerator/Lazy.html)):
|
167
|
+
They can also be used with Ruby's [pattern matching](https://docs.ruby-lang.org/en/3.2/syntax/pattern_matching_rdoc.html):
|
145
168
|
|
146
169
|
```ruby
|
147
|
-
|
148
|
-
|
149
|
-
|
150
|
-
|
170
|
+
case RE2('(\w+):(\d+)').full_match("ruby:1234")
|
171
|
+
in [word, number]
|
172
|
+
puts "Word: #{word}, Number: #{number}"
|
173
|
+
else
|
174
|
+
puts "No match"
|
151
175
|
end
|
176
|
+
# Word: ruby, Number: 1234
|
152
177
|
|
153
|
-
|
154
|
-
|
155
|
-
|
156
|
-
|
157
|
-
|
178
|
+
case RE2('(?P<word>\w+):(?P<number>\d+)').full_match("ruby:1234")
|
179
|
+
in word:, number:
|
180
|
+
puts "Word: #{word}, Number: #{number}"
|
181
|
+
else
|
182
|
+
puts "No match"
|
183
|
+
end
|
184
|
+
# Word: ruby, Number: 1234
|
158
185
|
```
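Ruby's pattern matching calls `deconstruct` on the matched object for array patterns and `deconstruct_keys` for hash patterns. The sketch below illustrates that protocol with a hypothetical `FakeMatch` class; it is a stand-in for illustration only and does not reflect how `RE2::MatchData` is actually implemented.

```ruby
# Hypothetical stand-in for a match object; not re2's implementation.
class FakeMatch
  def initialize(captures, names)
    @captures = captures
    @names = names
  end

  # Called for array patterns such as `in [word, number]`.
  def deconstruct
    @captures
  end

  # Called for hash patterns such as `in {word:, number:}`.
  def deconstruct_keys(_keys)
    @names.zip(@captures).to_h
  end
end

match = FakeMatch.new(["ruby", "1234"], [:word, :number])

case match
in [word, number]
  puts "Word: #{word}, Number: #{number}"
end

case match
in {word:, number:}
  puts "Word: #{word}, Number: #{number}"
end
```

Both `case` expressions print `Word: ruby, Number: 1234`. Pattern matching requires Ruby 2.7 or later.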
|
159
186
|
|
160
|
-
|
161
|
-
string. Calling `RE2::Set#add` with a pattern will return an integer index of
|
162
|
-
the pattern. After all patterns have been added, the set can be compiled using
|
163
|
-
`RE2::Set#compile`, and then `RE2::Set#match` will return an `Array<Integer>`
|
164
|
-
containing the indices of all the patterns that matched.
|
187
|
+
By default, both `full_match` and `partial_match` will extract all submatches into the `RE2::MatchData` based on the number of capturing groups in the regular expression. This can be changed by passing an optional second argument when matching:
|
165
188
|
|
166
189
|
```ruby
|
167
|
-
|
168
|
-
|
169
|
-
set.add("def") #=> 1
|
170
|
-
set.add("ghi") #=> 2
|
171
|
-
set.compile #=> true
|
172
|
-
set.match("abcdefghi") #=> [0, 1, 2]
|
173
|
-
set.match("ghidefabc") #=> [2, 1, 0]
|
190
|
+
m = RE2('(\w+):(\d+)').full_match("ruby:1234", submatches: 1)
|
191
|
+
#=> #<RE2::MatchData "ruby:1234" 1:"ruby">
|
174
192
|
```
|
175
193
|
|
176
|
-
|
194
|
+
> [!WARNING]
|
195
|
+
> If the regular expression has no capturing groups or you pass `submatches:
|
196
|
+
> 0`, the matching method will behave like its `full_match?` or
|
197
|
+
> `partial_match?` form and only return `true` or `false` rather than
|
198
|
+
> `RE2::MatchData`.
|
199
|
+
|
200
|
+
### Scanning text incrementally
|
201
|
+
|
202
|
+
If you want to repeatedly match regular expressions from the start of some input text, you can use [`RE2::Regexp#scan`](https://mudge.name/re2/RE2/Regexp.html#scan-instance_method) to return an `Enumerable` [`RE2::Scanner`](https://mudge.name/re2/RE2/Scanner.html) object which will lazily consume matches as you iterate over it:
|
177
203
|
|
178
204
|
```ruby
|
179
|
-
|
180
|
-
|
181
|
-
puts
|
182
|
-
else
|
183
|
-
puts "No match!"
|
205
|
+
scanner = RE2('(\w+)').scan(" one two three 4")
|
206
|
+
scanner.each do |match|
|
207
|
+
puts match.inspect
|
184
208
|
end
|
185
|
-
#
|
209
|
+
# ["one"]
|
210
|
+
# ["two"]
|
211
|
+
# ["three"]
|
212
|
+
# ["4"]
|
213
|
+
```
|
186
214
|
|
215
|
+
### Searching simultaneously
|
187
216
|
|
188
|
-
|
189
|
-
|
190
|
-
|
191
|
-
|
192
|
-
|
193
|
-
|
194
|
-
|
217
|
+
[`RE2::Set`](https://mudge.name/re2/RE2/Set.html) represents a collection of
|
218
|
+
regular expressions that can be searched for simultaneously. Calling
|
219
|
+
[`RE2::Set#add`](https://mudge.name/re2/RE2/Set.html#add-instance_method) with
|
220
|
+
a regular expression will return the integer index at which it is stored within
|
221
|
+
the set. After all patterns have been added, the set can be compiled using
|
222
|
+
[`RE2::Set#compile`](https://mudge.name/re2/RE2/Set.html#compile-instance_method),
|
223
|
+
and then
|
224
|
+
[`RE2::Set#match`](https://mudge.name/re2/RE2/Set.html#match-instance_method)
|
225
|
+
will return an array containing the indices of all the patterns that matched.
|
226
|
+
|
227
|
+
```ruby
|
228
|
+
set = RE2::Set.new
|
229
|
+
set.add("abc") #=> 0
|
230
|
+
set.add("def") #=> 1
|
231
|
+
set.add("ghi") #=> 2
|
232
|
+
set.compile #=> true
|
233
|
+
set.match("abcdefghi") #=> [0, 1, 2]
|
234
|
+
set.match("ghidefabc") #=> [2, 1, 0]
|
195
235
|
```
|
196
236
|
|
197
|
-
Encoding
|
198
|
-
--------
|
237
|
+
### Encoding
|
199
238
|
|
200
|
-
|
201
|
-
|
202
|
-
|
239
|
+
> [!WARNING]
|
240
|
+
> Note RE2 only supports UTF-8 and ISO-8859-1 encoding so strings will be
|
241
|
+
> returned in UTF-8 by default or ISO-8859-1 if the `:utf8` option for the
|
242
|
+
> `RE2::Regexp` is set to `false` (any other encoding's behaviour is undefined).
|
203
243
|
|
204
244
|
For backward compatibility: re2 won't automatically convert string inputs to
|
205
245
|
the right encoding so this is the responsibility of the caller, e.g.
|
@@ -209,51 +249,77 @@ the right encoding so this is the responsibility of the caller, e.g.
|
|
209
249
|
RE2(non_utf8_pattern.encode("UTF-8")).match(non_utf8_text.encode("UTF-8"))
|
210
250
|
|
211
251
|
# If the :utf8 option is false, RE2 will process patterns and text as ISO-8859-1
|
212
|
-
RE2(non_latin1_pattern.encode("ISO-8859-1"), :
|
252
|
+
RE2(non_latin1_pattern.encode("ISO-8859-1"), utf8: false).match(non_latin1_text.encode("ISO-8859-1"))
|
213
253
|
```
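Since converting input is left to the caller, it may help to see what `String#encode` does on its own. This is a small sketch using only core Ruby, with illustrative variable names; it does not call re2.

```ruby
# "caf\u00E9" is "café"; the \u escape makes the literal UTF-8.
utf8_text = "caf\u00E9"
latin1_text = utf8_text.encode("ISO-8859-1")

puts utf8_text.encoding    # UTF-8
puts utf8_text.bytesize    # 5 (the accented letter takes two bytes)
puts latin1_text.encoding  # ISO-8859-1
puts latin1_text.bytesize  # 4 (one byte per character)

# Characters with no ISO-8859-1 equivalent cannot be converted.
begin
  "\u65E5\u672C".encode("ISO-8859-1")
rescue Encoding::UndefinedConversionError => e
  puts "Cannot convert: #{e.class}"
end
```

Because the conversion can raise, callers handling arbitrary input may want to rescue the error or pass replacement options to `encode`.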
|
214
254
|
|
215
|
-
|
216
|
-
--------
|
255
|
+
## Requirements
|
217
256
|
|
218
|
-
|
219
|
-
[`RE2::Regexp.new(re)`](https://github.com/google/re2/blob/2016-02-01/re2/re2.h#L100),
|
220
|
-
`RE2::Regexp.compile(re)` or `RE2(re)` (including specifying options, e.g.
|
221
|
-
`RE2::Regexp.new("pattern", :case_sensitive => false)`
|
257
|
+
This gem requires the following to run:
|
222
258
|
|
223
|
-
*
|
224
|
-
with `re2.match(text, number_of_matches)` such as `re2.match("123-234", 2)`)
|
259
|
+
* [Ruby](https://www.ruby-lang.org/en/) 2.6 to 3.3
|
225
260
|
|
226
|
-
|
261
|
+
It supports the following RE2 ABI versions:
|
227
262
|
|
228
|
-
*
|
229
|
-
statements) and `re2 !~ text`
|
263
|
+
* libre2.0 (prior to release 2020-03-02) to libre2.11 (2023-07-01 to 2023-11-01)
|
230
264
|
|
231
|
-
|
265
|
+
### Native gems
|
232
266
|
|
233
|
-
|
267
|
+
Where possible, a pre-compiled native gem will be provided for the following platforms:
|
234
268
|
|
235
|
-
*
|
236
|
-
|
269
|
+
* Linux `aarch64-linux` and `arm-linux` (requires [glibc](https://www.gnu.org/software/libc/) 2.29+)
|
270
|
+
* Linux `x86-linux` and `x86_64-linux` (requires [glibc](https://www.gnu.org/software/libc/) 2.17+) including [musl](https://musl.libc.org/)-based systems such as [Alpine](https://alpinelinux.org)
|
271
|
+
* macOS `x86_64-darwin` and `arm64-darwin`
|
272
|
+
* Windows `x64-mingw32` and `x64-mingw-ucrt`
|
237
273
|
|
238
|
-
|
274
|
+
### Installing the `ruby` platform gem
|
239
275
|
|
240
|
-
|
241
|
-
|
276
|
+
> [!WARNING]
|
277
|
+
> We strongly recommend using the native gems where possible to avoid the need
|
278
|
+
> for compiling the C++ extension and its dependencies which will take longer
|
279
|
+
> and be less reliable.
|
242
280
|
|
243
|
-
|
244
|
-
original)`
|
281
|
+
If you wish to compile the gem, you will need to explicitly install the `ruby` platform gem:
|
245
282
|
|
246
|
-
|
247
|
-
|
283
|
+
```ruby
|
284
|
+
# In your Gemfile with Bundler 2.3.18+
|
285
|
+
gem "re2", force_ruby_platform: true
|
286
|
+
|
287
|
+
# With Bundler 2.1+
|
288
|
+
bundle config set force_ruby_platform true
|
289
|
+
|
290
|
+
# With older versions of Bundler
|
291
|
+
bundle config force_ruby_platform true
|
292
|
+
|
293
|
+
# Without Bundler
|
294
|
+
gem install re2 --platform=ruby
|
295
|
+
```
|
296
|
+
|
297
|
+
You will need a full compiler toolchain for compiling Ruby C extensions (see
|
298
|
+
[Nokogiri's "The Compiler
|
299
|
+
Toolchain"](https://nokogiri.org/tutorials/installing_nokogiri.html#appendix-a-the-compiler-toolchain))
|
300
|
+
plus the toolchain required for compiling the vendored version of RE2 and its
|
301
|
+
dependency [Abseil][] which includes
|
302
|
+
[CMake](https://cmake.org) and a compiler with C++14 support such as
|
303
|
+
[clang](http://clang.llvm.org/) 3.4 or [gcc](https://gcc.gnu.org/) 5. On
|
304
|
+
Windows, you'll also need pkgconf 2.1.0+ to avoid [`undefined reference`
|
305
|
+
errors](https://github.com/pkgconf/pkgconf/issues/322) when attempting to
|
306
|
+
compile Abseil.
|
307
|
+
|
308
|
+
### Using system libraries
|
248
309
|
|
249
|
-
|
250
|
-
[`RE2.escape(unquoted)`](https://github.com/google/re2/blob/2016-02-01/re2/re2.h#L418) and
|
251
|
-
`RE2.quote(unquoted)`
|
310
|
+
If you already have RE2 installed, you can instruct the gem not to use its own vendored version:
|
252
311
|
|
253
|
-
|
312
|
+
```ruby
|
313
|
+
gem install re2 --platform=ruby -- --enable-system-libraries
|
314
|
+
|
315
|
+
# If RE2 is not installed in /usr/local, /usr, or /opt/homebrew:
|
316
|
+
gem install re2 --platform=ruby -- --enable-system-libraries --with-re2-dir=/path/to/re2/prefix
|
317
|
+
```
|
254
318
|
|
255
|
-
|
256
|
-
|
319
|
+
Alternatively, you can set the `RE2_USE_SYSTEM_LIBRARIES` environment variable instead of passing `--enable-system-libraries` to the `gem` command.
|
320
|
+
|
321
|
+
|
322
|
+
## Thanks
|
257
323
|
|
258
324
|
* Thanks to [Jason Woods](https://github.com/driskell) who contributed the
|
259
325
|
original implementations of `RE2::MatchData#begin` and `RE2::MatchData#end`.
|
@@ -278,30 +344,21 @@ Contributions
|
|
278
344
|
switch to Ruby's `TypedData` API and the resulting garbage collection
|
279
345
|
improvements in 2.4.0.
|
280
346
|
|
281
|
-
Contact
|
282
|
-
-------
|
347
|
+
## Contact
|
283
348
|
|
284
349
|
All issues and suggestions should go to [GitHub Issues](https://github.com/mudge/re2/issues).
|
285
350
|
|
286
|
-
License
|
287
|
-
-------
|
351
|
+
## License
|
288
352
|
|
289
353
|
This library is licensed under the BSD 3-Clause License, see `LICENSE.txt`.
|
290
354
|
|
291
|
-
|
292
|
-
|
355
|
+
Copyright © 2010, Paul Mucur.
|
356
|
+
|
357
|
+
### Dependencies
|
293
358
|
|
294
359
|
The source code of [RE2][] is distributed in the `ruby` platform gem. This code is licensed under the BSD 3-Clause License, see `LICENSE-DEPENDENCIES.txt`.
|
295
360
|
|
296
361
|
The source code of [Abseil][] is distributed in the `ruby` platform gem. This code is licensed under the Apache License 2.0, see `LICENSE-DEPENDENCIES.txt`.
|
297
362
|
|
298
363
|
[RE2]: https://github.com/google/re2
|
299
|
-
[gcc]: http://gcc.gnu.org/
|
300
|
-
[ruby-dev]: http://packages.debian.org/ruby-dev
|
301
|
-
[build-essential]: http://packages.debian.org/build-essential
|
302
|
-
[Regexp]: http://ruby-doc.org/core/classes/Regexp.html
|
303
|
-
[MatchData]: http://ruby-doc.org/core/classes/MatchData.html
|
304
|
-
[Homebrew]: http://mxcl.github.com/homebrew
|
305
|
-
[libre2-dev]: http://packages.debian.org/search?keywords=libre2-dev
|
306
|
-
[official syntax page]: https://github.com/google/re2/wiki/Syntax
|
307
364
|
[Abseil]: https://abseil.io
|
data/Rakefile
CHANGED
@@ -33,7 +33,7 @@ Gem::PackageTask.new(RE2_GEM_SPEC) do |p|
|
|
33
33
|
p.need_tar = false
|
34
34
|
end
|
35
35
|
|
36
|
-
CROSS_RUBY_VERSIONS = %w[3.2.0 3.1.0 3.0.0 2.7.0 2.6.0].join(':')
|
36
|
+
CROSS_RUBY_VERSIONS = %w[3.3.0 3.2.0 3.1.0 3.0.0 2.7.0 2.6.0].join(':')
|
37
37
|
CROSS_RUBY_PLATFORMS = %w[
|
38
38
|
aarch64-linux
|
39
39
|
arm-linux
|
data/ext/re2/extconf.rb
CHANGED
@@ -1,7 +1,9 @@
|
|
1
|
-
# re2 (
|
2
|
-
# Ruby bindings to
|
1
|
+
# re2 (https://github.com/mudge/re2)
|
2
|
+
# Ruby bindings to RE2, a "fast, safe, thread-friendly alternative to
|
3
|
+
# backtracking regular expression engines like those used in PCRE, Perl, and
|
4
|
+
# Python".
|
3
5
|
#
|
4
|
-
# Copyright (c) 2010
|
6
|
+
# Copyright (c) 2010, Paul Mucur (https://mudge.name)
|
5
7
|
# Released under the BSD Licence, please see LICENSE.txt
|
6
8
|
|
7
9
|
require 'mkmf'
|
@@ -271,65 +273,6 @@ def build_with_system_libraries
|
|
271
273
|
build_extension
|
272
274
|
end
|
273
275
|
|
274
|
-
# pkgconf v1.9.3 on Windows incorrectly sorts the output of `pkg-config
|
275
|
-
# --libs --static`, resulting in build failures: https://github.com/pkgconf/pkgconf/issues/268.
|
276
|
-
# To work around the issue, store the correct order of abseil flags here and add them manually
|
277
|
-
# for Windows.
|
278
|
-
#
|
279
|
-
# Note that `-ldbghelp` is incorrectly added before `-labsl_symbolize` in abseil:
|
280
|
-
# https://github.com/abseil/abseil-cpp/issues/1497
|
281
|
-
ABSL_LDFLAGS = %w[
|
282
|
-
-labsl_flags
|
283
|
-
-labsl_flags_internal
|
284
|
-
-labsl_flags_marshalling
|
285
|
-
-labsl_flags_reflection
|
286
|
-
-labsl_flags_private_handle_accessor
|
287
|
-
-labsl_flags_commandlineflag
|
288
|
-
-labsl_flags_commandlineflag_internal
|
289
|
-
-labsl_flags_config
|
290
|
-
-labsl_flags_program_name
|
291
|
-
-labsl_cord
|
292
|
-
-labsl_cordz_info
|
293
|
-
-labsl_cord_internal
|
294
|
-
-labsl_cordz_functions
|
295
|
-
-labsl_cordz_handle
|
296
|
-
-labsl_crc_cord_state
|
297
|
-
-labsl_crc32c
|
298
|
-
-labsl_crc_internal
|
299
|
-
-labsl_crc_cpu_detect
|
300
|
-
-labsl_raw_hash_set
|
301
|
-
-labsl_hash
|
302
|
-
-labsl_city
|
303
|
-
-labsl_bad_variant_access
|
304
|
-
-labsl_low_level_hash
|
305
|
-
-labsl_hashtablez_sampler
|
306
|
-
-labsl_exponential_biased
|
307
|
-
-labsl_bad_optional_access
|
308
|
-
-labsl_str_format_internal
|
309
|
-
-labsl_synchronization
|
310
|
-
-labsl_graphcycles_internal
|
311
|
-
-labsl_kernel_timeout_internal
|
312
|
-
-labsl_stacktrace
|
313
|
-
-labsl_symbolize
|
314
|
-
-ldbghelp
|
315
|
-
-labsl_debugging_internal
|
316
|
-
-labsl_demangle_internal
|
317
|
-
-labsl_malloc_internal
|
318
|
-
-labsl_time
|
319
|
-
-labsl_civil_time
|
320
|
-
-labsl_strings
|
321
|
-
-labsl_string_view
|
322
|
-
-labsl_strings_internal
|
323
|
-
-labsl_base
|
324
|
-
-ladvapi32
|
325
|
-
-labsl_spinlock_wait
|
326
|
-
-labsl_int128
|
327
|
-
-labsl_throw_delegate
|
328
|
-
-labsl_raw_logging_internal
|
329
|
-
-labsl_log_severity
|
330
|
-
-labsl_time_zone
|
331
|
-
].freeze
|
332
|
-
|
333
276
|
def libflag_to_filename(ldflag)
|
334
277
|
case ldflag
|
335
278
|
when /\A-l(.+)/
|
@@ -382,14 +325,7 @@ def add_flag(arg, lib_paths)
|
|
382
325
|
end
|
383
326
|
|
384
327
|
def add_static_ldflags(flags, lib_paths)
|
385
|
-
|
386
|
-
|
387
|
-
if MiniPortile.windows?
|
388
|
-
static_flags.each { |flag| add_flag(flag, lib_paths) unless ABSL_LDFLAGS.include?(flag) }
|
389
|
-
ABSL_LDFLAGS.each { |flag| add_flag(flag, lib_paths) }
|
390
|
-
else
|
391
|
-
static_flags.each { |flag| add_flag(flag, lib_paths) }
|
392
|
-
end
|
328
|
+
flags.strip.shellsplit.each { |flag| add_flag(flag, lib_paths) }
|
393
329
|
end
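The simplified `add_static_ldflags` relies on `String#shellsplit` from Ruby's standard `shellwords` library to break a flag string into individual arguments. A minimal sketch of that behaviour follows, using made-up flag strings as an assumption; the real input comes from pkg-config and will differ.

```ruby
require "shellwords"

# Hypothetical flag string, used only for illustration.
flags = "  -L/opt/re2/lib -lre2 -labsl_strings -pthread\n"

# strip removes surrounding whitespace; shellsplit tokenises like a shell.
flags.strip.shellsplit.each { |flag| puts flag }
# -L/opt/re2/lib
# -lre2
# -labsl_strings
# -pthread

# Quoted paths containing spaces stay together as a single argument.
p '-L"/opt/my libs" -lre2'.shellsplit
# ["-L/opt/my libs", "-lre2"]
```

Splitting with shell rules rather than on plain whitespace keeps quoted library paths intact.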
|
394
330
|
|
395
331
|
def build_with_vendored_libraries
|