ahocorasick-rust 1.0.2-aarch64-linux
- checksums.yaml +7 -0
- data/README.md +142 -0
- data/Rakefile +31 -0
- data/ext/rahocorasick/Cargo.lock +303 -0
- data/ext/rahocorasick/Cargo.toml +12 -0
- data/ext/rahocorasick/extconf.rb +6 -0
- data/ext/rahocorasick/src/lib.rs +32 -0
- data/lib/ahocorasick-rust.rb +9 -0
- data/lib/rahocorasick/2.7/rahocorasick.so +0 -0
- data/lib/rahocorasick/3.0/rahocorasick.so +0 -0
- data/lib/rahocorasick/3.1/rahocorasick.so +0 -0
- data/lib/rahocorasick/3.2/rahocorasick.so +0 -0
- data/lib/rahocorasick/3.3/rahocorasick.so +0 -0
- metadata +75 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 0b2a97a53298cc7de937d6174c7159b84b0da1fb6850d5319d35b2120c2fbaf5
+  data.tar.gz: 41ceeef1a12560186e56489085692dafef5d11bc21c283b341bd4aff9fd2b5e7
+SHA512:
+  metadata.gz: 9d5b9738cc084fc45377fc7b5c3da832096ab941d772329c7825e0d6eda0b7534feb704cdf4419fbe597df996ccaedc2b99130e9dbf5e23ab818024d36f97473
+  data.tar.gz: 53e5084626818a65dbc8dbc3a8c7ee3d3fa5a67981f0ffa0556eea7adb63c873fa209bd6457e416a2828f7e8a413ceafe16beafffcaf76e856d3e38b504c139f
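The digests in checksums.yaml can be checked against the archive members they name. Below is a minimal Ruby sketch, assuming the .gem archive (a tar file holding metadata.gz, data.tar.gz and checksums.yaml.gz) has been unpacked and checksums.yaml.gz gunzipped into the same directory. The helper name `mismatched_files` and the directory name `unpacked` are hypothetical and not part of the gem.

```ruby
require 'digest'
require 'yaml'

# Hypothetical helper: returns "ALGO:file" labels for every entry whose
# computed digest differs from the value recorded in checksums.yaml.
def mismatched_files(checksums, dir)
  algorithms = { 'SHA256' => Digest::SHA256, 'SHA512' => Digest::SHA512 }
  checksums.flat_map do |algo, files|
    digest_class = algorithms.fetch(algo)
    files.reject { |name, expected| digest_class.file(File.join(dir, name)).hexdigest == expected }
         .keys
         .map { |name| "#{algo}:#{name}" }
  end
end

# Usage, assuming the unpacked files live in ./unpacked (an assumed path).
if File.exist?('unpacked/checksums.yaml')
  checksums = YAML.safe_load(File.read('unpacked/checksums.yaml'))
  bad = mismatched_files(checksums, 'unpacked')
  puts bad.empty? ? 'all digests match' : "mismatch: #{bad.join(', ')}"
end
```

An empty result means every listed file matched under both algorithms.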
data/README.md
ADDED
@@ -0,0 +1,142 @@
+# Aho-Corasick Rust ✨
+
+[](https://badge.fury.io/rb/ahocorasick-rust)
+
+> **Blazing-fast multi-pattern string matching for Ruby!** (ノ◕ヮ◕)ノ*:・゚✧
+
+`ahocorasick-rust` is a Ruby wrapper for the [Aho-Corasick algorithm](https://github.com/BurntSushi/aho-corasick) implemented in Rust! 🦀💎
+
+## What is Aho-Corasick? 🤔
+
+Aho-Corasick is a powerful string searching algorithm that can find **multiple patterns simultaneously** in a single pass through your text! Unlike traditional string matching that searches for one pattern at a time, Aho-Corasick builds a finite state machine from your dictionary of patterns and matches them all at once.
+
+**Perfect for:**
+- 🔍 Content filtering & moderation
+- 📝 Finding keywords in large documents
+- 🚫 Detecting prohibited words or phrases
+- 🏷️ Multi-pattern text analysis
+- ⚡ Any scenario where you need to search for many patterns efficiently!
+
+**Why this gem rocks:**
+- 🦀 Powered by Rust for maximum speed
+- 💎 Easy Ruby interface
+- 🚀 Up to **67x faster** than pure Ruby implementations
+- ✨ Precompiled binaries for major platforms
+- 🌈 Works with Ruby 2.7+
+
+## Installation 📦
+
+Add this gem to your `Gemfile`:
+
+```ruby
+gem 'ahocorasick-rust'
+```
+
+Then execute:
+
+```bash
+bundle install
+```
+
+Or install it yourself:
+
+```bash
+gem install ahocorasick-rust
+```
+
+## Usage 🎀
+
+It's super simple!
+
+```ruby
+require 'ahocorasick-rust'
+
+# Create a new matcher with your patterns
+animals = ['cat', 'dog', 'bunny', 'fox']
+matcher = AhoCorasickRust.new(animals)
+
+# Search for all patterns in your text - finds them all in one pass! ✨
+text = "The quick brown fox jumps over the lazy dog."
+matcher.lookup(text)
+# => ["fox", "dog"]
+```
+
+**Want more examples?** Check out our [example script](scripts/example.rb) with content filtering, language detection, and more! 🌈
+
+## Benchmark 📊
+
+**Don't just take our word for it - check out these performance numbers!** 🎉
+
+### Test Setup 1
+- Words: 500 patterns
+- Test cases: 2,000
+- Text length: 3,154 chars (avg), 23,676 (max)
+
+```
+                   user     system      total        real
+each&include   6.487059   0.185424   6.672483 (  6.791808)
+ruby_ahoc      4.178672   0.138610   4.317282 (  4.547964)
+rust_ahoc      0.157662   0.004847   0.162509 (  0.166964)
+```
+
+> 🎈 **27.2x faster** than pure Ruby implementation!
+
+### Test Setup 2
+- Words: 500 patterns
+- Test cases: 2,000
+- Text length: 49,162 chars (avg), 10,392,056 (max)
+
+```
+                   user     system      total        real
+each&include  27.903179   0.237389  28.140568 ( 28.563194)
+ruby_ahoc     45.220535   0.363107  45.583642 ( 46.477702)
+rust_ahoc      0.670583   0.007192   0.677775 (  0.686904)
+```
+
+> 🎈 **67.7x faster** than pure Ruby implementation!
+
+The larger your text and the more patterns you have, the more this gem shines! ✨
+
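The quoted speedups can be reproduced from the "real" column of the two benchmark tables above. A small Ruby sketch of that arithmetic follows; the constant and method names are illustrative only, and the figures are copied from the README rather than re-measured.

```ruby
# Wall-clock ("real") seconds copied from the README benchmark tables.
REAL_SECONDS = {
  'Test Setup 1' => { 'each&include' => 6.791808, 'ruby_ahoc' => 4.547964, 'rust_ahoc' => 0.166964 },
  'Test Setup 2' => { 'each&include' => 28.563194, 'ruby_ahoc' => 46.477702, 'rust_ahoc' => 0.686904 }
}.freeze

# Ratio of each competitor's time to the baseline's time, rounded to one decimal.
def speedups(times, baseline: 'rust_ahoc')
  times.reject { |label, _| label == baseline }
       .transform_values { |seconds| (seconds / times.fetch(baseline)).round(1) }
end

REAL_SECONDS.each do |setup, times|
  speedups(times).each do |label, ratio|
    puts format('%-13s %-13s %5.1fx', setup, label, ratio)
  end
end
# The ruby_ahoc rows come out at 27.2x and 67.7x, matching the README's claims.
```

The headline numbers compare against `ruby_ahoc` (the pure Ruby Aho-Corasick implementation), not against the `each&include` loop.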
+## Platform Support 🌍
+
+Precompiled binaries are available for:
+- 🍎 macOS (ARM64 & x86_64)
+- 🐧 Linux (ARM64 & x86_64)
+
+If a precompiled binary isn't available for your platform, the gem will automatically compile the Rust extension during installation.
+
+## Development 🛠️
+
+Want to contribute? Yay! 🎉
+
+```bash
+# Install dependencies
+bundle install
+
+# Compile the extension
+fish -c "bundle exec rake dev compile"
+
+# Run tests
+fish -c "bundle exec rake test"
+
+# Build the gem
+gem build ahocorasick-rust.gemspec
+```
+
+## References 📚
+
+- [Aho-Corasick (Rust)](https://github.com/BurntSushi/aho-corasick) - The amazing Rust implementation we wrap
+- [Aho-Corasick Algorithm](https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm) - Learn about the algorithm
+- [Original Ruby Implementation](https://github.com/ahnick/ahocorasick) - Pure Ruby version for comparison
+
+## Contributing 💝
+
+Bug reports and pull requests are welcome on GitHub at [https://github.com/jetpks/ahocorasick-rust-ruby](https://github.com/jetpks/ahocorasick-rust-ruby)!
+
+## License 📄
+
+This gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+
+---
+
+Made with 💖 and Rust 🦀 by Eric
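The README's "What is Aho-Corasick?" section describes building a finite state machine from the pattern dictionary and matching every pattern in one pass. A compact pure-Ruby sketch of that idea is given below for illustration. The class `SketchAhoCorasick` is hypothetical: it is neither the gem's code nor the wrapped Rust crate. It reports every occurrence, including overlapping ones, whereas the crate's `find_iter` used by the gem yields non-overlapping matches, so results can differ when patterns overlap.

```ruby
# Illustrative Aho-Corasick automaton: a trie plus failure links (hypothetical class).
class SketchAhoCorasick
  def initialize(patterns)
    @patterns = patterns
    @goto = [{}]  # state => { char => next state }
    @fail = [0]   # state => state to fall back to on a mismatch
    @out  = [[]]  # state => indexes of patterns that end in this state
    build_trie
    build_failure_links
  end

  # Scans the text once and returns matched patterns ordered by end position.
  def lookup(text)
    state = 0
    found = []
    text.each_char do |ch|
      state = @fail[state] while state != 0 && !@goto[state].key?(ch)
      state = @goto[state].fetch(ch, 0)
      @out[state].each { |index| found << @patterns[index] }
    end
    found
  end

  private

  # Insert every pattern into the trie, creating states as needed.
  def build_trie
    @patterns.each_with_index do |pattern, index|
      state = 0
      pattern.each_char do |ch|
        unless @goto[state].key?(ch)
          @goto << {}
          @fail << 0
          @out << []
          @goto[state][ch] = @goto.size - 1
        end
        state = @goto[state][ch]
      end
      @out[state] << index
    end
  end

  # Breadth-first pass: each state's failure link points at the longest
  # proper suffix of its path that is also a path in the trie.
  def build_failure_links
    queue = @goto[0].values # depth-1 states keep the root as their fallback
    until queue.empty?
      state = queue.shift
      @goto[state].each do |ch, nxt|
        queue << nxt
        fallback = @fail[state]
        fallback = @fail[fallback] while fallback != 0 && !@goto[fallback].key?(ch)
        @fail[nxt] = @goto[fallback].fetch(ch, 0)
        @out[nxt].concat(@out[@fail[nxt]]) # inherit matches ending at the suffix
      end
    end
  end
end

matcher = SketchAhoCorasick.new(%w[cat dog bunny fox])
p matcher.lookup('The quick brown fox jumps over the lazy dog.')
# => ["fox", "dog"]
```

The text is scanned character by character exactly once; the failure links are what let the scan continue after a mismatch without backing up in the input.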
data/Rakefile
ADDED
@@ -0,0 +1,31 @@
+# frozen_string_literal: true
+
+require 'rake/testtask'
+require 'rake/extensiontask'
+
+CROSS_PLATFORMS = %w[
+  aarch64-linux
+  arm64-darwin
+  x86_64-darwin
+  x86_64-linux
+].freeze
+
+spec = Bundler.load_gemspec('ahocorasick-rust.gemspec')
+
+Rake::ExtensionTask.new('rahocorasick', spec) do |c|
+  c.lib_dir = 'lib/rahocorasick'
+  c.source_pattern = '*.{rs,toml}'
+  c.cross_compile = true
+  c.cross_platform = CROSS_PLATFORMS
+end
+
+Rake::TestTask.new do |t|
+  t.deps << :dev << :compile
+  t.test_files = FileList['test/**/*_test.rb']
+end
+
+task :dev do
+  ENV['RB_SYS_CARGO_PROFILE'] = 'dev'
+end
+
+task default: :test
data/ext/rahocorasick/Cargo.lock
ADDED
@@ -0,0 +1,303 @@
+# This file is automatically @generated by Cargo.
+# It is not intended for manual editing.
+version = 4
+
+[[package]]
+name = "aho-corasick"
+version = "0.7.20"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "cc936419f96fa211c1b9166887b38e5e40b19958e5b895be7c1f93adec7071ac"
+dependencies = [
+ "memchr",
+]
+
+[[package]]
+name = "aho-corasick"
+version = "1.1.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ddd31a130427c27518df266943a5308ed92d4b226cc639f5a8f1002816174301"
+dependencies = [
+ "memchr",
+]
+
+[[package]]
+name = "bindgen"
+version = "0.69.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "271383c67ccabffb7381723dea0672a673f292304fcb45c01cc648c7a8d58088"
+dependencies = [
+ "bitflags",
+ "cexpr",
+ "clang-sys",
+ "itertools",
+ "lazy_static",
+ "lazycell",
+ "proc-macro2",
+ "quote",
+ "regex",
+ "rustc-hash",
+ "shlex",
+ "syn",
+]
+
+[[package]]
+name = "bitflags"
+version = "2.10.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "812e12b5285cc515a9c72a5c1d3b6d46a19dac5acfef5265968c166106e31dd3"
+
+[[package]]
+name = "cexpr"
+version = "0.6.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "6fac387a98bb7c37292057cffc56d62ecb629900026402633ae9160df93a8766"
+dependencies = [
+ "nom",
+]
+
+[[package]]
+name = "cfg-if"
+version = "1.0.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
+
+[[package]]
+name = "clang-sys"
+version = "1.6.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "77ed9a53e5d4d9c573ae844bfac6872b159cb1d1585a83b29e7a64b7eef7332a"
+dependencies = [
+ "glob",
+ "libc",
+ "libloading",
+]
+
+[[package]]
+name = "either"
+version = "1.15.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "48c757948c5ede0e46177b7add2e67155f70e33c07fea8284df6576da70b3719"
+
+[[package]]
+name = "glob"
+version = "0.3.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "d2fabcfbdc87f4758337ca535fb41a6d701b65693ce38287d856d1674551ec9b"
+
+[[package]]
+name = "itertools"
+version = "0.12.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ba291022dbbd398a455acf126c1e341954079855bc60dfdda641363bd6922569"
+dependencies = [
+ "either",
+]
+
+[[package]]
+name = "lazy_static"
+version = "1.4.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
+
+[[package]]
+name = "lazycell"
+version = "1.3.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "830d08ce1d1d941e6b30645f1a0eb5643013d835ce3779a5fc208261dbe10f55"
+
+[[package]]
+name = "libc"
+version = "0.2.139"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "201de327520df007757c1f0adce6e827fe8562fbc28bfd9c15571c66ca1f5f79"
+
+[[package]]
+name = "libloading"
+version = "0.7.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "b67380fd3b2fbe7527a606e18729d21c6f3951633d0500574c4dc22d2d638b9f"
+dependencies = [
+ "cfg-if",
+ "winapi",
+]
+
+[[package]]
+name = "magnus"
+version = "0.8.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "3b36a5b126bbe97eb0d02d07acfeb327036c6319fd816139a49824a83b7f9012"
+dependencies = [
+ "magnus-macros",
+ "rb-sys",
+ "rb-sys-env",
+ "seq-macro",
+]
+
+[[package]]
+name = "magnus-macros"
+version = "0.8.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "47607461fd8e1513cb4f2076c197d8092d921a1ea75bd08af97398f593751892"
+dependencies = [
+ "proc-macro2",
+ "quote",
+ "syn",
+]
+
+[[package]]
+name = "memchr"
+version = "2.5.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "2dffe52ecf27772e601905b7522cb4ef790d2cc203488bbd0e2fe85fcb74566d"
+
+[[package]]
+name = "minimal-lexical"
+version = "0.2.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "68354c5c6bd36d73ff3feceb05efa59b6acb7626617f4962be322a825e61f79a"
+
+[[package]]
+name = "nom"
+version = "7.1.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "d273983c5a657a70a3e8f2a01329822f3b8c8172b73826411a55751e404a0a4a"
+dependencies = [
+ "memchr",
+ "minimal-lexical",
+]
+
+[[package]]
+name = "proc-macro2"
+version = "1.0.103"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "5ee95bc4ef87b8d5ba32e8b7714ccc834865276eab0aed5c9958d00ec45f49e8"
+dependencies = [
+ "unicode-ident",
+]
+
+[[package]]
+name = "quote"
+version = "1.0.42"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "a338cc41d27e6cc6dce6cefc13a0729dfbb81c262b1f519331575dd80ef3067f"
+dependencies = [
+ "proc-macro2",
+]
+
+[[package]]
+name = "rahocorasick"
+version = "0.1.0"
+dependencies = [
+ "aho-corasick 1.1.4",
+ "magnus",
+]
+
+[[package]]
+name = "rb-sys"
+version = "0.9.117"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "f900d1ce4629a2ebffaf5de74bd8f9c1188d4c5ed406df02f97e22f77a006f44"
+dependencies = [
+ "rb-sys-build",
+]
+
+[[package]]
+name = "rb-sys-build"
+version = "0.9.117"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ef1e9c857028f631056bcd6d88cec390c751e343ce2223ddb26d23eb4a151d59"
+dependencies = [
+ "bindgen",
+ "lazy_static",
+ "proc-macro2",
+ "quote",
+ "regex",
+ "shell-words",
+ "syn",
+]
+
+[[package]]
+name = "rb-sys-env"
+version = "0.2.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "08f8d2924cf136a1315e2b4c7460a39f62ef11ee5d522df9b2750fab55b868b6"
+
+[[package]]
+name = "regex"
+version = "1.7.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "48aaa5748ba571fb95cd2c85c09f629215d3a6ece942baa100950af03a34f733"
+dependencies = [
+ "aho-corasick 0.7.20",
+ "memchr",
+ "regex-syntax",
+]
+
+[[package]]
+name = "regex-syntax"
+version = "0.6.28"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "456c603be3e8d448b072f410900c09faf164fbce2d480456f50eea6e25f9c848"
+
+[[package]]
+name = "rustc-hash"
+version = "1.1.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "08d43f7aa6b08d49f382cde6a7982047c3426db949b1424bc4b7ec9ae12c6ce2"
+
+[[package]]
+name = "seq-macro"
+version = "0.3.6"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "1bc711410fbe7399f390ca1c3b60ad0f53f80e95c5eb935e52268a0e2cd49acc"
+
+[[package]]
+name = "shell-words"
+version = "1.1.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "24188a676b6ae68c3b2cb3a01be17fbf7240ce009799bb56d5b1409051e78fde"
+
+[[package]]
+name = "shlex"
+version = "1.1.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "43b2853a4d09f215c24cc5489c992ce46052d359b5109343cbafbf26bc62f8a3"
+
+[[package]]
+name = "syn"
+version = "2.0.110"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "a99801b5bd34ede4cf3fc688c5919368fea4e4814a4664359503e6015b280aea"
+dependencies = [
+ "proc-macro2",
+ "quote",
+ "unicode-ident",
+]
+
+[[package]]
+name = "unicode-ident"
+version = "1.0.6"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "84a22b9f218b40614adcb3f4ff08b703773ad44fa9423e4e0d346d5db86e4ebc"
+
+[[package]]
+name = "winapi"
+version = "0.3.9"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
+dependencies = [
+ "winapi-i686-pc-windows-gnu",
+ "winapi-x86_64-pc-windows-gnu",
+]
+
+[[package]]
+name = "winapi-i686-pc-windows-gnu"
+version = "0.4.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
+
+[[package]]
+name = "winapi-x86_64-pc-windows-gnu"
+version = "0.4.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"
data/ext/rahocorasick/Cargo.toml
ADDED
@@ -0,0 +1,12 @@
+[package]
+edition = "2024"
+name = "rahocorasick"
+version = "0.1.0"
+
+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
+[lib]
+crate-type = ["cdylib"]
+
+[dependencies]
+aho-corasick = "1.1"
+magnus = "0.8"
data/ext/rahocorasick/src/lib.rs
ADDED
@@ -0,0 +1,32 @@
+use aho_corasick::AhoCorasick;
+use magnus::{method, function, prelude::*, Error, Ruby};
+
+#[magnus::wrap(class = "AhoCorasickRust")]
+pub struct AhoCorasickRust {
+    words: Vec<String>,
+    ac: AhoCorasick,
+}
+
+impl AhoCorasickRust {
+    fn new(ruby: &Ruby, words: Vec<String>) -> Result<Self, Error> {
+        let ac = AhoCorasick::new(&words)
+            .map_err(|e| Error::new(ruby.exception_runtime_error(), format!("Failed to build automaton: {}", e)))?;
+        Ok(Self { words, ac })
+    }
+
+    fn lookup(&self, haystack: String) -> Vec<String> {
+        let mut matches = vec![];
+        for mat in self.ac.find_iter(&haystack) {
+            matches.push(self.words[mat.pattern()].clone());
+        }
+        matches
+    }
+}
+
+#[magnus::init]
+fn main(ruby: &Ruby) -> Result<(), Error> {
+    let class = ruby.define_class("AhoCorasickRust", ruby.class_object())?;
+    class.define_singleton_method("new", function!(AhoCorasickRust::new, 1))?;
+    class.define_method("lookup", method!(AhoCorasickRust::lookup, 1))?;
+    Ok(())
+}
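data/lib/rahocorasick/2.7/rahocorasick.so
Binary file
data/lib/rahocorasick/3.0/rahocorasick.so
Binary file
data/lib/rahocorasick/3.1/rahocorasick.so
Binary file
data/lib/rahocorasick/3.2/rahocorasick.so
Binary file
data/lib/rahocorasick/3.3/rahocorasick.so
Binary file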
metadata
ADDED
@@ -0,0 +1,75 @@
+--- !ruby/object:Gem::Specification
+name: ahocorasick-rust
+version: !ruby/object:Gem::Version
+  version: 1.0.2
+platform: aarch64-linux
+authors:
+- Eric
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2025-11-16 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: rb_sys
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 0.9.117
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 0.9.117
+description: A Ruby gem wrapping the legendary Rust Aho-Corasick algorithm! Aho-Corasick
+  is a powerful string searching algorithm that finds multiple patterns simultaneously
+  in a text. Perfect for matching dictionaries, filtering words, and other multi-pattern
+  search tasks at lightning speed! (ノ◕ヮ◕)ノ*:・゚✧
+email:
+- eric@ebj.dev
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- README.md
+- Rakefile
+- ext/rahocorasick/Cargo.lock
+- ext/rahocorasick/Cargo.toml
+- ext/rahocorasick/extconf.rb
+- ext/rahocorasick/src/lib.rs
+- lib/ahocorasick-rust.rb
+- lib/rahocorasick/2.7/rahocorasick.so
+- lib/rahocorasick/3.0/rahocorasick.so
+- lib/rahocorasick/3.1/rahocorasick.so
+- lib/rahocorasick/3.2/rahocorasick.so
+- lib/rahocorasick/3.3/rahocorasick.so
+homepage: https://github.com/jetpks/ahocorasick-rust-ruby
+licenses:
+- MIT
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '2.7'
+  - - "<"
+    - !ruby/object:Gem::Version
+      version: 3.4.dev
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubygems_version: 3.5.23
+signing_key:
+specification_version: 4
+summary: Blazing-fast ✨ Ruby wrapper for the Rust Aho-Corasick string matching algorithm!
+test_files: []
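The `required_ruby_version` entry in the metadata above pairs `>= 2.7` with `< 3.4.dev`, which lines up with the five precompiled `rahocorasick.so` directories (2.7 through 3.3) shipped in this platform gem. A short Ruby sketch of how RubyGems evaluates such a constraint is shown below; the sample version strings are chosen only for illustration.

```ruby
require 'rubygems'

# The same two constraints that appear under required_ruby_version.
requirement = Gem::Requirement.new('>= 2.7', '< 3.4.dev')

# Sample interpreter versions (illustrative, not taken from the gem).
%w[2.6.10 2.7.0 3.3.5 3.4.0].each do |version|
  ok = requirement.satisfied_by?(Gem::Version.new(version))
  puts "Ruby #{version}: #{ok ? 'accepted' : 'rejected'}"
end
# Ruby 2.6.10: rejected
# Ruby 2.7.0: accepted
# Ruby 3.3.5: accepted
# Ruby 3.4.0: rejected
```

Because `3.4.dev` is a prerelease version, it sorts before `3.4.0`, so the upper bound excludes the whole 3.4 series.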