tantiny 0.2.2 → 0.3.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +16 -0
- data/Cargo.toml +1 -1
- data/README.md +40 -10
- data/bin/console +11 -6
- data/ext/Rakefile +1 -1
- data/lib/tantiny/errors.rb +11 -2
- data/lib/tantiny/helpers.rb +10 -0
- data/lib/tantiny/index.rb +97 -19
- data/lib/tantiny/version.rb +1 -1
- data/lib/tantiny.rb +2 -0
- data/sig/tantiny/helpers.rbs +2 -0
- data/sig/tantiny/index.rbs +25 -4
- data/src/helpers.rs +6 -4
- data/src/index.rs +40 -13
- data/src/lib.rs +0 -1
- metadata +20 -7
- data/lib/tantiny.so +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 6d0f8d182809f2ef16cea118c6afa3537ce75088eb660491bbf62f88cabf8f9a
|
4
|
+
data.tar.gz: 1194fe1dcd4167e165a27612aab27dae0c3564f2f3dc629cadd23319780c7e4f
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 220af19fd88b94bfad830f709607fcf394423c23a7c6ad2b25efe459298e26bcddba3bd8516f8a7719bfd532e4b9b4964136be0521c7d7414409063842eadfb1
|
7
|
+
data.tar.gz: 2bca8e05530015a86b97eb7f0f9cc5dd755de5691c104c6247742baf2d11d5c0ee2843fd634acd3ac25823c5d1ddddc0eeda5a6dcdca41e0bbf89143b1104e21
|
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,21 @@
|
|
1
1
|
# Changelog
|
2
2
|
|
3
|
+
## [0.3.0](https://github.com/baygeldin/tantiny/compare/v0.2.2...v0.3.0) (2022-03-17)
|
4
|
+
|
5
|
+
|
6
|
+
### ⚠ BREAKING CHANGES
|
7
|
+
|
8
|
+
* `commit` method is no longer public
|
9
|
+
|
10
|
+
### Features
|
11
|
+
|
12
|
+
* Support multithreaded and multiprocess environments ([053b4a0](https://github.com/baygeldin/tantiny/commit/053b4a0a026ae8fd689d95a8d4f3b1a7b6d6779f))
|
13
|
+
|
14
|
+
|
15
|
+
### Bug Fixes
|
16
|
+
|
17
|
+
* Create the folder at index path when it doesn't exist ([c9446b7](https://github.com/baygeldin/tantiny/commit/c9446b7e949aad40de9ce179707a88915682055c))
|
18
|
+
|
3
19
|
### [0.2.2](https://github.com/baygeldin/tantiny/compare/v0.2.1...v0.2.2) (2022-03-07)
|
4
20
|
|
5
21
|
|
data/Cargo.toml
CHANGED
data/README.md
CHANGED
@@ -1,10 +1,15 @@
|
|
1
|
+
[](https://github.com/baygeldin/tantiny/actions/workflows/build.yml)
|
2
|
+
[](https://rubygems.org/gems/tantiny)
|
3
|
+
[](https://codeclimate.com/github/baygeldin/tantiny/maintainability)
|
4
|
+
[](https://codeclimate.com/github/baygeldin/tantiny/test_coverage)
|
5
|
+
|
1
6
|
# Tantiny
|
2
7
|
|
3
8
|
Need a fast full-text search for your Ruby script, but Solr and Elasticsearch are an overkill? 😏
|
4
9
|
|
5
|
-
You're in the right place. **Tantiny** is a minimalistic full-text search library for Ruby based on [
|
10
|
+
You're in the right place. **Tantiny** is a minimalistic full-text search library for Ruby based on [Tanti**v**y](https://github.com/quickwit-oss/tantivy) (an awesome alternative to Apache Lucene written in Rust). It's great for cases when your task at hand requires a full-text search, but configuring a full-blown distributed search engine would take more time than the task itself. And even if you already use such an engine in your project (which is highly likely, actually), it still might be easier to just use Tantiny instead because unlike Solr and Elasticsearch it doesn't need *anything* to work (no separate server or process or whatever), it's purely embeddable. So, when you find yourself in a situation when using your search engine of choice would be tricky/inconvenient or would require additional setup you can always revert back to a quick and dirty solution that is nonetheless flexible and fast.
|
6
11
|
|
7
|
-
The main philosophy is to provide low-level access Tantivy's inverted index, but with a nice Ruby-esque API, sensible defaults, and additional functionality sprinkled on top
|
12
|
+
Tantiny is not exactly Ruby bindings to Tantivy, but it tries to be close. The main philosophy is to provide low-level access to Tantivy's inverted index, but with a nice Ruby-esque API, sensible defaults, and additional functionality sprinkled on top.
|
8
13
|
|
9
14
|
Take a look at the most basic example:
|
10
15
|
|
@@ -15,7 +20,6 @@ index << { id: 1, description: "Hello World!" }
|
|
15
20
|
index << { id: 2, description: "What's up?" }
|
16
21
|
index << { id: 3, description: "Goodbye World!" }
|
17
22
|
|
18
|
-
index.commit
|
19
23
|
index.reload
|
20
24
|
|
21
25
|
index.search("world") # 1, 3
|
@@ -26,7 +30,7 @@ index.search("world") # 1, 3
|
|
26
30
|
Add this line to your application's Gemfile:
|
27
31
|
|
28
32
|
```ruby
|
29
|
-
gem
|
33
|
+
gem "tantiny"
|
30
34
|
```
|
31
35
|
|
32
36
|
And then execute:
|
@@ -39,6 +43,10 @@ Or install it yourself as:
|
|
39
43
|
|
40
44
|
You don't **have to** have Rust installed on your system since Tantiny will try to download the pre-compiled binaries hosted on GitHub releases during the installation. However, if no pre-compiled binaries were found for your system (which is a combination of platform, architecture, and Ruby version) you will need to [install Rust](https://www.rust-lang.org/tools/install) first.
|
41
45
|
|
46
|
+
⚠️ **IMPORTANT** ⚠️
|
47
|
+
|
48
|
+
Please, make sure to specify the minor version when declaring a dependency on `tantiny`. The API is subject to change, and until it reaches `1.0.0` a bump in the minor version will most likely signify a breaking change.
|
49
|
+
|
42
50
|
## Defining the index
|
43
51
|
|
44
52
|
You have to specify a path to where the index would be stored and a block that defines the schema:
|
@@ -82,6 +90,8 @@ rio_bravo = OpenStruct.new(
|
|
82
90
|
release_date: Date.parse("March 18, 1959")
|
83
91
|
)
|
84
92
|
|
93
|
+
index << rio_bravo
|
94
|
+
|
85
95
|
hanabi = {
|
86
96
|
imdb_id: "tt0119250",
|
87
97
|
type: "/crime/Japan",
|
@@ -92,6 +102,8 @@ hanabi = {
|
|
92
102
|
release_date: Date.parse("December 1, 1998")
|
93
103
|
}
|
94
104
|
|
105
|
+
index << hanabi
|
106
|
+
|
95
107
|
brother = {
|
96
108
|
imdb_id: "tt0118767",
|
97
109
|
type: "/crime/Russia",
|
@@ -102,8 +114,6 @@ brother = {
|
|
102
114
|
release_date: Date.parse("December 12, 1997")
|
103
115
|
}
|
104
116
|
|
105
|
-
index << rio_bravo
|
106
|
-
index << hanabi
|
107
117
|
index << brother
|
108
118
|
```
|
109
119
|
|
@@ -120,12 +130,32 @@ You can also delete it if you want:
|
|
120
130
|
index.delete(rio_bravo.imdb_id)
|
121
131
|
```
|
122
132
|
|
123
|
-
|
133
|
+
### Transactions
|
134
|
+
|
135
|
+
If you need to perform multiple writing operations (i.e. more than one) you should always use `transaction`:
|
124
136
|
|
125
137
|
```ruby
|
126
|
-
index.
|
138
|
+
index.transaction do
|
139
|
+
index << rio_bravo
|
140
|
+
index << hanabi
|
141
|
+
index << brother
|
142
|
+
end
|
127
143
|
```
|
128
144
|
|
145
|
+
Transactions group changes and [commit](https://docs.rs/tantivy/latest/tantivy/struct.IndexWriter.html#method.commit) them to the index in one go. This is *dramatically* more efficient than performing these changes one by one. In fact, all writing operations (i.e. `<<` and `delete`) are wrapped in a transaction implicitly when you call them outside of a transaction, so calling `<<` 10 times outside of a transaction is the same thing as performing 10 separate transactions.
|
146
|
+
|
147
|
+
### Concurrency and thread-safety
|
148
|
+
|
149
|
+
Tantiny is thread-safe, meaning that you can safely share a single instance of the index between threads. You can also spawn separate processes that could write to and read from the same index. However, while reading from the index should be parallel, writing to it is **not**. Whenever you call `transaction` or any other operation that modifies the index (i.e. `<<` and `delete`) it will lock the index for the duration of the operation or wait for another process or thread to release the lock. The only exception to this is when there is another process with an index with an exclusive writer running somewhere in which case the methods that modify the index will fail immediately.
|
150
|
+
|
151
|
+
Thus, it's best to have a single writer process and many reader processes if you want to avoid blocking calls. The proper way to do this is to set `exclusive_writer` to `true` when initializing the index:
|
152
|
+
|
153
|
+
```ruby
|
154
|
+
index = Tantiny::Index.new("/path/to/index", exclusive_writer: true) {}
|
155
|
+
```
|
156
|
+
|
157
|
+
This way the [index writer](https://docs.rs/tantivy/latest/tantivy/struct.IndexWriter.html) will only be acquired once which means the memory for it and indexing threads will only be allocated once as well. Otherwise a new index writer is acquired every time you perform a writing operation.
|
158
|
+
|
129
159
|
## Searching
|
130
160
|
|
131
161
|
Make sure that your index is up-to-date by reloading it first:
|
@@ -190,7 +220,7 @@ All queries can search on multuple fields (except for `facet_query` because it d
|
|
190
220
|
So, the following query:
|
191
221
|
|
192
222
|
```ruby
|
193
|
-
index.term_query(%i[title
|
223
|
+
index.term_query(%i[title description], "hello")
|
194
224
|
```
|
195
225
|
|
196
226
|
Is equivalent to:
|
@@ -296,7 +326,7 @@ You may have noticed that `search` method returns only documents ids. This is by
|
|
296
326
|
|
297
327
|
## Development
|
298
328
|
|
299
|
-
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
|
329
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake build` to build native extensions, and then `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
|
300
330
|
|
301
331
|
We use [conventional commits](https://www.conventionalcommits.org) to automatically generate the CHANGELOG, bump the semantic version, and to publish and release the gem. All you need to do is stick to the convention and [CI will take care of everything else](https://github.com/baygeldin/tantiny/blob/main/.github/workflows/release.yml) for you.
|
302
332
|
|
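The README changes above state that every write outside a transaction is wrapped in its own implicit transaction, so batching writes reduces the number of commits. The following is a minimal sketch of that behaviour only. `FakeIndex` is a hypothetical stand-in that counts commits; it does not use Tantiny, is not thread-safe, and adds an `ensure` clause that the gem's own code does not have.

```ruby
# Hypothetical stand-in for an index: it stores documents and counts commits.
class FakeIndex
  attr_reader :commits, :documents

  def initialize
    @commits = 0
    @documents = []
    @in_transaction = false
  end

  # Nested calls join the outer transaction; only the outermost call commits.
  def transaction
    return yield if @in_transaction

    @in_transaction = true
    begin
      yield
      @commits += 1
    ensure
      @in_transaction = false
    end
    nil
  end

  # Every write goes through `transaction`, explicitly or implicitly.
  def <<(document)
    transaction { @documents << document }
  end
end

# Three writes outside a transaction: one implicit transaction each.
one_by_one = FakeIndex.new
3.times { |i| one_by_one << { id: i } }

# Three writes inside a single transaction: one commit in total.
batched = FakeIndex.new
batched.transaction do
  3.times { |i| batched << { id: i } }
end

puts "one by one: #{one_by_one.commits} commits"
puts "batched:    #{batched.commits} commit"
```

In this sketch both indexes end up with the same three documents; only the number of commits differs.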
data/bin/console
CHANGED
@@ -7,9 +7,13 @@ require "pry"
|
|
7
7
|
require "tantiny"
|
8
8
|
|
9
9
|
path = File.join(__dir__, "../tmp")
|
10
|
-
en_stem = Tantiny::Tokenizer.new(:stemmer, language: :en)
|
11
10
|
|
12
|
-
|
11
|
+
options = {
|
12
|
+
tokenizer: Tantiny::Tokenizer.new(:stemmer, language: :en),
|
13
|
+
exclusive_writer: true,
|
14
|
+
}
|
15
|
+
|
16
|
+
index = Tantiny::Index.new(path, **options) do
|
13
17
|
id :imdb_id
|
14
18
|
facet :category
|
15
19
|
string :title
|
@@ -49,11 +53,12 @@ brother = {
|
|
49
53
|
release_date: Date.parse("December 12, 1997")
|
50
54
|
}
|
51
55
|
|
52
|
-
index
|
53
|
-
index <<
|
54
|
-
index <<
|
56
|
+
index.transaction do
|
57
|
+
index << rio_bravo
|
58
|
+
index << hanabi
|
59
|
+
index << brother
|
60
|
+
end
|
55
61
|
|
56
|
-
index.commit
|
57
62
|
index.reload
|
58
63
|
|
59
64
|
binding.pry
|
data/ext/Rakefile
CHANGED
data/lib/tantiny/errors.rb
CHANGED
@@ -3,9 +3,18 @@
|
|
3
3
|
module Tantiny
|
4
4
|
class TantivyError < StandardError; end
|
5
5
|
|
6
|
-
class
|
6
|
+
class IndexWriterBusyError < StandardError
|
7
7
|
def initialize
|
8
|
-
|
8
|
+
msg = "Failed to acquire an index writer. "\
|
9
|
+
"Is there an active index with an exclusive writer already?"
|
10
|
+
|
11
|
+
super(msg)
|
12
|
+
end
|
13
|
+
end
|
14
|
+
|
15
|
+
class UnexpectedNone < StandardError
|
16
|
+
def initialize(type)
|
17
|
+
super("Didn't expect Option<#{type}> to be empty.")
|
9
18
|
end
|
10
19
|
end
|
11
20
|
|
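The new `IndexWriterBusyError` carries a fixed message, and the index code later in this diff raises it when a low-level error message matches `Failed to acquire Lockfile`. Below is a minimal sketch of that translation pattern and of how a caller might tell the cases apart. `LowLevelError`, `WriterBusyError`, `acquire_writer`, and `describe` are hypothetical names used for illustration, and the example error messages are made up.

```ruby
# Hypothetical stand-ins for the error classes in the diff.
class LowLevelError < StandardError; end

class WriterBusyError < StandardError
  def initialize
    super("Failed to acquire an index writer.")
  end
end

# Re-raise a generic error as a specific one when its message matches;
# any other low-level error is passed through unchanged.
def acquire_writer
  yield
rescue LowLevelError => e
  case e.message
  when /Failed to acquire Lockfile/
    raise WriterBusyError.new
  else
    raise
  end
end

# Caller-side view: distinguish a busy writer from other failures.
def describe
  acquire_writer { yield }
  "acquired"
rescue WriterBusyError
  "busy"
rescue LowLevelError
  "other failure"
end

busy  = describe { raise LowLevelError, "Failed to acquire Lockfile: LockBusy" }
other = describe { raise LowLevelError, "disk full" }
ok    = describe { :writer }

puts busy, other, ok
```

Matching on the message text is fragile if the underlying library rewords its errors, which is worth keeping in mind when relying on this pattern.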
data/lib/tantiny/helpers.rb
CHANGED
@@ -5,5 +5,15 @@ module Tantiny
|
|
5
5
|
def self.timestamp(date)
|
6
6
|
date.to_datetime.iso8601
|
7
7
|
end
|
8
|
+
|
9
|
+
def self.with_lock(lockfile)
|
10
|
+
File.open(lockfile, File::CREAT) do |file|
|
11
|
+
file.flock(File::LOCK_EX)
|
12
|
+
|
13
|
+
yield
|
14
|
+
|
15
|
+
file.flock(File::LOCK_UN)
|
16
|
+
end
|
17
|
+
end
|
8
18
|
end
|
9
19
|
end
|
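The `with_lock` helper added above serialises callers through an exclusive `flock` on a lock file. The sketch below shows the same idea with two threads that each open the lock file separately. `with_file_lock` is a hypothetical name, and the `ensure` clause is an addition that is not in the diff; the temporary directory and the sleep are only there to make the ordering observable.

```ruby
require "tmpdir"

# Hypothetical helper modelled on the with_lock method in the diff.
# The ensure clause (not in the diff) unlocks even if the block raises.
def with_file_lock(lockfile)
  File.open(lockfile, File::CREAT) do |file|
    file.flock(File::LOCK_EX) # blocks until no other handle holds the lock
    begin
      yield
    ensure
      file.flock(File::LOCK_UN)
    end
  end
end

events = []

Dir.mktmpdir do |dir|
  lockfile = File.join(dir, ".example.lock")

  threads = 2.times.map do |n|
    Thread.new do
      with_file_lock(lockfile) do
        events << "enter #{n}"
        sleep 0.05 # hold the lock long enough for the other thread to wait
        events << "leave #{n}"
      end
    end
  end

  threads.each(&:join)
end

puts events
```

Because each thread opens its own file handle, the second caller waits until the first releases the lock, so every "enter" is followed directly by the matching "leave".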
data/lib/tantiny/index.rb
CHANGED
@@ -2,18 +2,18 @@
|
|
2
2
|
|
3
3
|
module Tantiny
|
4
4
|
class Index
|
5
|
-
|
5
|
+
LOCKFILE = ".tantiny.lock"
|
6
|
+
DEFAULT_WRITER_MEMORY = 5_000_000 # 5MB
|
6
7
|
DEFAULT_LIMIT = 10
|
7
8
|
|
8
9
|
def self.new(path, **options, &block)
|
9
|
-
|
10
|
-
default_tokenizer = options[:tokenizer] || Tokenizer.default
|
10
|
+
FileUtils.mkdir_p(path)
|
11
11
|
|
12
|
+
default_tokenizer = options[:tokenizer] || Tokenizer.default
|
12
13
|
schema = Schema.new(default_tokenizer, &block)
|
13
14
|
|
14
15
|
object = __new(
|
15
16
|
path.to_s,
|
16
|
-
index_size,
|
17
17
|
schema.default_tokenizer,
|
18
18
|
schema.field_tokenizers.transform_keys(&:to_s),
|
19
19
|
schema.text_fields.map(&:to_s),
|
@@ -24,15 +24,40 @@ module Tantiny
|
|
24
24
|
schema.facet_fields.map(&:to_s)
|
25
25
|
)
|
26
26
|
|
27
|
-
object.send(:schema
|
27
|
+
object.send(:initialize, path, schema, **options)
|
28
28
|
|
29
29
|
object
|
30
30
|
end
|
31
31
|
|
32
|
+
def initialize(path, schema, **options)
|
33
|
+
@path = path
|
34
|
+
@schema = schema
|
35
|
+
|
36
|
+
@indexer_memory = options[:writer_memory] || DEFAULT_WRITER_MEMORY
|
37
|
+
@exclusive_writer = options[:exclusive_writer] || false
|
38
|
+
|
39
|
+
@active_transaction = Concurrent::ThreadLocalVar.new(false)
|
40
|
+
@transaction_semaphore = Mutex.new
|
41
|
+
|
42
|
+
acquire_index_writer if exclusive_writer?
|
43
|
+
end
|
44
|
+
|
32
45
|
attr_reader :schema
|
33
46
|
|
34
|
-
def
|
35
|
-
|
47
|
+
def transaction
|
48
|
+
if inside_transaction?
|
49
|
+
yield
|
50
|
+
else
|
51
|
+
synchronize do
|
52
|
+
open_transaction!
|
53
|
+
|
54
|
+
yield
|
55
|
+
|
56
|
+
close_transaction!
|
57
|
+
end
|
58
|
+
end
|
59
|
+
|
60
|
+
nil
|
36
61
|
end
|
37
62
|
|
38
63
|
def reload
|
@@ -40,19 +65,23 @@ module Tantiny
|
|
40
65
|
end
|
41
66
|
|
42
67
|
def <<(document)
|
43
|
-
|
44
|
-
|
45
|
-
|
46
|
-
|
47
|
-
|
48
|
-
|
49
|
-
|
50
|
-
|
51
|
-
|
68
|
+
transaction do
|
69
|
+
__add_document(
|
70
|
+
resolve(document, schema.id_field).to_s,
|
71
|
+
slice_document(document, schema.text_fields) { |v| v.to_s },
|
72
|
+
slice_document(document, schema.string_fields) { |v| v.to_s },
|
73
|
+
slice_document(document, schema.integer_fields) { |v| v.to_i },
|
74
|
+
slice_document(document, schema.double_fields) { |v| v.to_f },
|
75
|
+
slice_document(document, schema.date_fields) { |v| Helpers.timestamp(v) },
|
76
|
+
slice_document(document, schema.facet_fields) { |v| v.to_s }
|
77
|
+
)
|
78
|
+
end
|
52
79
|
end
|
53
80
|
|
54
81
|
def delete(id)
|
55
|
-
|
82
|
+
transaction do
|
83
|
+
__delete_document(id.to_s)
|
84
|
+
end
|
56
85
|
end
|
57
86
|
|
58
87
|
def search(query, limit: DEFAULT_LIMIT, **smart_query_options)
|
@@ -79,8 +108,6 @@ module Tantiny
|
|
79
108
|
|
80
109
|
private
|
81
110
|
|
82
|
-
attr_writer :schema
|
83
|
-
|
84
111
|
def slice_document(document, fields, &block)
|
85
112
|
fields.inject({}) do |hash, field|
|
86
113
|
hash.tap { |h| h[field.to_s] = resolve(document, field) }
|
@@ -90,5 +117,56 @@ module Tantiny
|
|
90
117
|
def resolve(document, field)
|
91
118
|
document.is_a?(Hash) ? document[field] : document.send(field)
|
92
119
|
end
|
120
|
+
|
121
|
+
def acquire_index_writer
|
122
|
+
__acquire_index_writer(@indexer_memory)
|
123
|
+
rescue TantivyError => e
|
124
|
+
case e.message
|
125
|
+
when /Failed to acquire Lockfile/
|
126
|
+
raise IndexWriterBusyError.new
|
127
|
+
else
|
128
|
+
raise
|
129
|
+
end
|
130
|
+
end
|
131
|
+
|
132
|
+
def release_index_writer
|
133
|
+
__release_index_writer
|
134
|
+
end
|
135
|
+
|
136
|
+
def commit
|
137
|
+
__commit
|
138
|
+
end
|
139
|
+
|
140
|
+
def open_transaction!
|
141
|
+
acquire_index_writer unless exclusive_writer?
|
142
|
+
|
143
|
+
@active_transaction.value = true
|
144
|
+
end
|
145
|
+
|
146
|
+
def close_transaction!
|
147
|
+
commit
|
148
|
+
|
149
|
+
release_index_writer unless exclusive_writer?
|
150
|
+
|
151
|
+
@active_transaction.value = false
|
152
|
+
end
|
153
|
+
|
154
|
+
def inside_transaction?
|
155
|
+
@active_transaction.value
|
156
|
+
end
|
157
|
+
|
158
|
+
def exclusive_writer?
|
159
|
+
@exclusive_writer
|
160
|
+
end
|
161
|
+
|
162
|
+
def synchronize(&block)
|
163
|
+
@transaction_semaphore.synchronize do
|
164
|
+
Helpers.with_lock(lockfile_path, &block)
|
165
|
+
end
|
166
|
+
end
|
167
|
+
|
168
|
+
def lockfile_path
|
169
|
+
@lockfile_path ||= File.join(@path, LOCKFILE)
|
170
|
+
end
|
93
171
|
end
|
94
172
|
end
|
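The `transaction` method above combines a mutex with a thread-local flag: the flag lets a nested call on the same thread skip the non-reentrant lock, while other threads still have to wait for it. The following is a minimal sketch of that combination, not the gem's implementation. `TransactionGuard` is a hypothetical class, `Thread#thread_variable_get` and `Thread#thread_variable_set` stand in for `Concurrent::ThreadLocalVar` so that no extra gem is needed, and the `ensure` clause is an addition.

```ruby
# Hypothetical sketch of a re-entrant transaction guard.
class TransactionGuard
  def initialize
    @mutex = Mutex.new
  end

  def transaction
    # Already inside a transaction on this thread: do not lock again.
    return yield if inside_transaction?

    @mutex.synchronize do
      Thread.current.thread_variable_set(flag, true)
      begin
        yield
      ensure
        Thread.current.thread_variable_set(flag, false)
      end
    end
    nil
  end

  def inside_transaction?
    Thread.current.thread_variable_get(flag) == true
  end

  private

  # One flag name per guard instance, stored on each thread separately.
  def flag
    @flag ||= :"transaction_guard_#{object_id}"
  end
end

guard = TransactionGuard.new
nested_ok = false
seen_in_other_thread = nil

guard.transaction do
  # A nested call on the same thread must not try to re-lock the mutex.
  guard.transaction { nested_ok = guard.inside_transaction? }

  # Another thread has its own flag, so it does not see this transaction.
  seen_in_other_thread = Thread.new { guard.inside_transaction? }.value
end

puts "nested call ran inside the outer transaction: #{nested_ok}"
puts "other thread saw the transaction: #{seen_in_other_thread}"
puts "flag cleared afterwards: #{!guard.inside_transaction?}"
```

Without the per-thread flag, the nested call would attempt to lock a mutex the thread already holds, which Ruby's `Mutex` rejects with a `ThreadError`.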
data/lib/tantiny/version.rb
CHANGED
data/lib/tantiny.rb
CHANGED
data/sig/tantiny/helpers.rbs
CHANGED
data/sig/tantiny/index.rbs
CHANGED
@@ -1,6 +1,7 @@
|
|
1
1
|
module Tantiny
|
2
2
|
class Index
|
3
|
-
|
3
|
+
LOCKFILE: String
|
4
|
+
DEFAULT_WRITER_MEMORY: Integer
|
4
5
|
DEFAULT_LIMIT: Integer
|
5
6
|
|
6
7
|
def self.new: (
|
@@ -10,7 +11,6 @@ module Tantiny
|
|
10
11
|
|
11
12
|
def self.__new: (
|
12
13
|
String path,
|
13
|
-
Integer index_size,
|
14
14
|
Tokenizer default_tokenizer,
|
15
15
|
Hash[String, Tokenizer] field_tokenizers,
|
16
16
|
Array[String] text_fields,
|
@@ -21,9 +21,16 @@ module Tantiny
|
|
21
21
|
Array[String] facet_fields
|
22
22
|
) -> Index
|
23
23
|
|
24
|
+
def initialize: (
|
25
|
+
String path,
|
26
|
+
Schema schema,
|
27
|
+
**untyped options
|
28
|
+
) -> void
|
29
|
+
|
24
30
|
attr_reader schema: Schema
|
25
31
|
|
26
|
-
def
|
32
|
+
def transaction: () { (*untyped) -> void } -> void
|
33
|
+
|
27
34
|
def reload: () -> void
|
28
35
|
def <<: (untyped document) -> void
|
29
36
|
def delete: (String id) -> void
|
@@ -62,9 +69,12 @@ module Tantiny
|
|
62
69
|
|
63
70
|
def __search: (Query query, Integer limit) -> Array[String]
|
64
71
|
|
72
|
+
def __acquire_index_writer: (Integer overall_memory) -> void
|
73
|
+
def __release_index_writer: () -> void
|
74
|
+
|
65
75
|
private
|
66
76
|
|
67
|
-
|
77
|
+
def commit: () -> void
|
68
78
|
|
69
79
|
def slice_document: (
|
70
80
|
untyped document,
|
@@ -78,5 +88,16 @@ module Tantiny
|
|
78
88
|
) -> Array[String]
|
79
89
|
|
80
90
|
def resolve: (untyped document, Symbol field) -> untyped
|
91
|
+
|
92
|
+
def synchronize: () { (*untyped) -> void } -> void
|
93
|
+
def lockfile_path: () -> String
|
94
|
+
|
95
|
+
def exclusive_writer?: () -> bool
|
96
|
+
def acquire_index_writer: () -> void
|
97
|
+
def release_index_writer: () -> void
|
98
|
+
|
99
|
+
def open_transaction!: () -> void
|
100
|
+
def close_transaction!: () -> void
|
101
|
+
def inside_transaction?: () -> bool
|
81
102
|
end
|
82
103
|
end
|
data/src/helpers.rs
CHANGED
@@ -1,6 +1,5 @@
|
|
1
1
|
use std::collections::HashMap;
|
2
2
|
use rutie::{AnyException, Array, Exception, RString, Hash, Integer, Float, Boolean, Module};
|
3
|
-
use tantivy::schema::{Field};
|
4
3
|
use tantivy::tokenizer::Language;
|
5
4
|
|
6
5
|
// Macro dependencies:
|
@@ -114,12 +113,15 @@ where
|
|
114
113
|
}
|
115
114
|
}
|
116
115
|
|
117
|
-
impl TryUnwrap<
|
118
|
-
fn try_unwrap(self) ->
|
116
|
+
impl<T> TryUnwrap<T> for Option<T> {
|
117
|
+
fn try_unwrap(self) -> T {
|
119
118
|
if let Some(value) = self {
|
120
119
|
value
|
121
120
|
} else {
|
122
|
-
VM::raise_ex(AnyException::new(
|
121
|
+
VM::raise_ex(AnyException::new(
|
122
|
+
"Tantiny::UnexpectedNone",
|
123
|
+
Some(&*format!("{}", std::any::type_name::<T>())))
|
124
|
+
);
|
123
125
|
|
124
126
|
self.unwrap()
|
125
127
|
}
|
data/src/index.rs
CHANGED
@@ -11,9 +11,10 @@ use crate::query::{unwrap_query, RTantinyQuery};
|
|
11
11
|
use crate::tokenizer::{unwrap_tokenizer, RTantinyTokenizer};
|
12
12
|
|
13
13
|
pub struct TantinyIndex {
|
14
|
-
pub(crate) index_writer: IndexWriter,
|
15
|
-
pub(crate) index_reader: IndexReader,
|
16
14
|
pub(crate) schema: Schema,
|
15
|
+
pub(crate) index: Index,
|
16
|
+
pub(crate) index_writer: Option<IndexWriter>,
|
17
|
+
pub(crate) index_reader: IndexReader,
|
17
18
|
}
|
18
19
|
|
19
20
|
scaffold!(RTantinyIndex, TantinyIndex, "Index");
|
@@ -22,6 +23,10 @@ pub(crate) fn unwrap_index(index: &RTantinyIndex) -> &TantinyIndex {
|
|
22
23
|
index.get_data(&*TANTINY_INDEX_WRAPPER)
|
23
24
|
}
|
24
25
|
|
26
|
+
pub(crate) fn unwrap_index_mut(index: &mut RTantinyIndex) -> &mut TantinyIndex {
|
27
|
+
index.get_data_mut(&*TANTINY_INDEX_WRAPPER)
|
28
|
+
}
|
29
|
+
|
25
30
|
#[rustfmt::skip::macros(methods)]
|
26
31
|
methods!(
|
27
32
|
RTantinyIndex,
|
@@ -29,7 +34,6 @@ methods!(
|
|
29
34
|
|
30
35
|
fn new_index(
|
31
36
|
path: RString,
|
32
|
-
index_size: Integer,
|
33
37
|
default_tokenizer: AnyObject,
|
34
38
|
field_tokenizers: Hash,
|
35
39
|
text_fields: Array,
|
@@ -41,7 +45,6 @@ methods!(
|
|
41
45
|
) -> RTantinyIndex {
|
42
46
|
try_unwrap_params!(
|
43
47
|
path: String,
|
44
|
-
index_size: i64,
|
45
48
|
default_tokenizer: RTantinyTokenizer,
|
46
49
|
field_tokenizers: HashMap<String, RTantinyTokenizer>,
|
47
50
|
text_fields: Vec<String>,
|
@@ -103,9 +106,7 @@ methods!(
|
|
103
106
|
tokenizers.register(&field, unwrap_tokenizer(&tokenizer).clone())
|
104
107
|
}
|
105
108
|
|
106
|
-
let
|
107
|
-
.writer(index_size as usize)
|
108
|
-
.try_unwrap();
|
109
|
+
let index_writer = None;
|
109
110
|
|
110
111
|
let index_reader = index
|
111
112
|
.reader_builder()
|
@@ -114,7 +115,7 @@ methods!(
|
|
114
115
|
.try_unwrap();
|
115
116
|
|
116
117
|
klass().wrap_data(
|
117
|
-
TantinyIndex { index_writer, index_reader, schema },
|
118
|
+
TantinyIndex { index, index_writer, index_reader, schema },
|
118
119
|
&*TANTINY_INDEX_WRAPPER
|
119
120
|
)
|
120
121
|
}
|
@@ -138,9 +139,8 @@ methods!(
|
|
138
139
|
facet_fields: HashMap<String, String>
|
139
140
|
);
|
140
141
|
|
141
|
-
|
142
142
|
let internal = unwrap_index(&_itself);
|
143
|
-
let index_writer =
|
143
|
+
let index_writer = internal.index_writer.as_ref().try_unwrap();
|
144
144
|
let schema = &internal.schema;
|
145
145
|
|
146
146
|
let mut doc = Document::default();
|
@@ -191,7 +191,7 @@ methods!(
|
|
191
191
|
try_unwrap_params!(id: String);
|
192
192
|
|
193
193
|
let internal = unwrap_index(&_itself);
|
194
|
-
let index_writer =
|
194
|
+
let index_writer = internal.index_writer.as_ref().unwrap();
|
195
195
|
|
196
196
|
let id_field = internal.schema.get_field("id").try_unwrap();
|
197
197
|
let doc_id = Term::from_field_text(id_field, &id);
|
@@ -201,9 +201,34 @@ methods!(
|
|
201
201
|
NilClass::new()
|
202
202
|
}
|
203
203
|
|
204
|
+
fn acquire_index_writer(
|
205
|
+
overall_memory: Integer
|
206
|
+
) -> NilClass {
|
207
|
+
try_unwrap_params!(overall_memory: i64);
|
208
|
+
|
209
|
+
let internal = unwrap_index_mut(&mut _itself);
|
210
|
+
|
211
|
+
let mut index_writer = internal.index
|
212
|
+
.writer(overall_memory as usize)
|
213
|
+
.try_unwrap();
|
214
|
+
|
215
|
+
internal.index_writer = Some(index_writer);
|
216
|
+
|
217
|
+
NilClass::new()
|
218
|
+
}
|
219
|
+
|
220
|
+
fn release_index_writer() -> NilClass {
|
221
|
+
let internal = unwrap_index_mut(&mut _itself);
|
222
|
+
|
223
|
+
drop(internal.index_writer.as_ref().try_unwrap());
|
224
|
+
internal.index_writer = None;
|
225
|
+
|
226
|
+
NilClass::new()
|
227
|
+
}
|
228
|
+
|
204
229
|
fn commit() -> NilClass {
|
205
|
-
let internal = _itself
|
206
|
-
let index_writer =
|
230
|
+
let internal = unwrap_index_mut(&mut _itself);
|
231
|
+
let index_writer = internal.index_writer.as_mut().try_unwrap();
|
207
232
|
|
208
233
|
index_writer.commit().try_unwrap();
|
209
234
|
|
@@ -254,6 +279,8 @@ pub(super) fn init() {
|
|
254
279
|
klass.def_self("__new", new_index);
|
255
280
|
klass.def("__add_document", add_document);
|
256
281
|
klass.def("__delete_document", delete_document);
|
282
|
+
klass.def("__acquire_index_writer", acquire_index_writer);
|
283
|
+
klass.def("__release_index_writer", release_index_writer);
|
257
284
|
klass.def("__commit", commit);
|
258
285
|
klass.def("__reload", reload);
|
259
286
|
klass.def("__search", search);
|
data/src/lib.rs
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: tantiny
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.3.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Alexander Baygeldin
|
8
|
-
autorequire:
|
8
|
+
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2022-03-
|
11
|
+
date: 2022-03-17 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: ruby-next
|
@@ -66,7 +66,21 @@ dependencies:
|
|
66
66
|
- - "~>"
|
67
67
|
- !ruby/object:Gem::Version
|
68
68
|
version: '13.0'
|
69
|
-
|
69
|
+
- !ruby/object:Gem::Dependency
|
70
|
+
name: concurrent-ruby
|
71
|
+
requirement: !ruby/object:Gem::Requirement
|
72
|
+
requirements:
|
73
|
+
- - "~>"
|
74
|
+
- !ruby/object:Gem::Version
|
75
|
+
version: '1.0'
|
76
|
+
type: :runtime
|
77
|
+
prerelease: false
|
78
|
+
version_requirements: !ruby/object:Gem::Requirement
|
79
|
+
requirements:
|
80
|
+
- - "~>"
|
81
|
+
- !ruby/object:Gem::Version
|
82
|
+
version: '1.0'
|
83
|
+
description:
|
70
84
|
email:
|
71
85
|
- a.baygeldin@gmail.com
|
72
86
|
executables: []
|
@@ -83,7 +97,6 @@ files:
|
|
83
97
|
- ext/Rakefile
|
84
98
|
- lib/.rbnext/3.0/tantiny/schema.rb
|
85
99
|
- lib/tantiny.rb
|
86
|
-
- lib/tantiny.so
|
87
100
|
- lib/tantiny/errors.rb
|
88
101
|
- lib/tantiny/helpers.rb
|
89
102
|
- lib/tantiny/index.rb
|
@@ -113,7 +126,7 @@ metadata:
|
|
113
126
|
documentation_uri: https://github.com/baygeldin/tantiny/blob/master/README.md
|
114
127
|
homepage_uri: https://github.com/baygeldin/tantiny
|
115
128
|
source_code_uri: https://github.com/baygeldin/tantiny
|
116
|
-
post_install_message:
|
129
|
+
post_install_message:
|
117
130
|
rdoc_options: []
|
118
131
|
require_paths:
|
119
132
|
- lib
|
@@ -129,7 +142,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
129
142
|
version: '0'
|
130
143
|
requirements: []
|
131
144
|
rubygems_version: 3.3.7
|
132
|
-
signing_key:
|
145
|
+
signing_key:
|
133
146
|
specification_version: 4
|
134
147
|
summary: Tiny full-text search for Ruby powered by Tantivy.
|
135
148
|
test_files: []
|
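The metadata above adds `concurrent-ruby` as a runtime dependency with the constraint `~> 1.0`, and the README in this release recommends specifying the minor version when depending on `tantiny`. The sketch below uses RubyGems' `Gem::Requirement` to show which versions such constraints accept. The `~> 0.3.0` constraint and the version numbers being checked are assumptions chosen for illustration; they do not come from the diff.

```ruby
require "rubygems"

# "~> 1.0" is the constraint recorded in the gem metadata for concurrent-ruby.
concurrent_ruby = Gem::Requirement.new("~> 1.0")

# "~> 0.3.0" is a hypothetical constraint an application might use for tantiny.
tantiny = Gem::Requirement.new("~> 0.3.0")

def allowed?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

checks = {
  "concurrent-ruby 1.1.9" => allowed?(concurrent_ruby, "1.1.9"),
  "concurrent-ruby 2.0.0" => allowed?(concurrent_ruby, "2.0.0"),
  "tantiny 0.3.5" => allowed?(tantiny, "0.3.5"),
  "tantiny 0.4.0" => allowed?(tantiny, "0.4.0")
}

checks.each { |label, ok| puts "#{label}: #{ok ? 'allowed' : 'rejected'}" }
```

A three-part constraint such as `~> 0.3.0` accepts patch releases only, which fits the README's warning that a minor version bump may be a breaking change before `1.0.0`.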
data/lib/tantiny.so
DELETED
Binary file
|