tantiny 0.2.2 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: e082705a8a556a2bf8ebfd08445ea2917e308a1fdd80e2e2dcf75edaa71bc135
4
- data.tar.gz: 2098370ed2edcf0aa533abf3327ba4513f76fdf1119ba6f007de68d162510897
3
+ metadata.gz: 6d0f8d182809f2ef16cea118c6afa3537ce75088eb660491bbf62f88cabf8f9a
4
+ data.tar.gz: 1194fe1dcd4167e165a27612aab27dae0c3564f2f3dc629cadd23319780c7e4f
5
5
  SHA512:
6
- metadata.gz: 76171526e6677a13874d8cb2b915496eb844ac120404f220440552d00217b059f3ac55f329318ec020249ec4106a2977ee7251a9c526c0fa237077be8401c937
7
- data.tar.gz: 4425ba8f9a882b9e6b2b9eb7c4d0d8aa4a87342249fd0588fd2d82a6f0c8f709fcc2ee5e22e63b66b84256c398decb2a7a207414636cf6d588c64a63c85a496f
6
+ metadata.gz: 220af19fd88b94bfad830f709607fcf394423c23a7c6ad2b25efe459298e26bcddba3bd8516f8a7719bfd532e4b9b4964136be0521c7d7414409063842eadfb1
7
+ data.tar.gz: 2bca8e05530015a86b97eb7f0f9cc5dd755de5691c104c6247742baf2d11d5c0ee2843fd634acd3ac25823c5d1ddddc0eeda5a6dcdca41e0bbf89143b1104e21
data/CHANGELOG.md CHANGED
@@ -1,5 +1,21 @@
1
1
  # Changelog
2
2
 
3
+ ## [0.3.0](https://github.com/baygeldin/tantiny/compare/v0.2.2...v0.3.0) (2022-03-17)
4
+
5
+
6
+ ### ⚠ BREAKING CHANGES
7
+
8
+ * `commit` method is no longer public
9
+
10
+ ### Features
11
+
12
+ * Support multithreaded and multiprocess environments ([053b4a0](https://github.com/baygeldin/tantiny/commit/053b4a0a026ae8fd689d95a8d4f3b1a7b6d6779f))
13
+
14
+
15
+ ### Bug Fixes
16
+
17
+ * Create the folder at index path when it doesn't exist ([c9446b7](https://github.com/baygeldin/tantiny/commit/c9446b7e949aad40de9ce179707a88915682055c))
18
+
3
19
  ### [0.2.2](https://github.com/baygeldin/tantiny/compare/v0.2.1...v0.2.2) (2022-03-07)
4
20
 
5
21
 
data/Cargo.toml CHANGED
@@ -1,6 +1,6 @@
1
1
  [package]
2
2
  name = "tantiny"
3
- version = "0.2.2" # {x-release-please-version}
3
+ version = "0.3.0" # {x-release-please-version}
4
4
  edition = "2021"
5
5
  authors = ["Alexander Baygeldin"]
6
6
  repository = "https://github.com/baygeldin/tantiny"
data/README.md CHANGED
@@ -1,10 +1,15 @@
1
+ [![Build workflow](https://github.com/baygeldin/tantiny/actions/workflows/build.yml/badge.svg)](https://github.com/baygeldin/tantiny/actions/workflows/build.yml)
2
+ [![Tantiny](https://img.shields.io/gem/v/tantiny?color=31c553)](https://rubygems.org/gems/tantiny)
3
+ [![Maintainability](https://api.codeclimate.com/v1/badges/1b466b52d2ba71ab9d80/maintainability)](https://codeclimate.com/github/baygeldin/tantiny/maintainability)
4
+ [![Test Coverage](https://api.codeclimate.com/v1/badges/1b466b52d2ba71ab9d80/test_coverage)](https://codeclimate.com/github/baygeldin/tantiny/test_coverage)
5
+
1
6
  # Tantiny
2
7
 
3
8
  Need a fast full-text search for your Ruby script, but Solr and Elasticsearch are an overkill? 😏
4
9
 
5
- You're in the right place. **Tantiny** is a minimalistic full-text search library for Ruby based on [Tantivy](https://github.com/quickwit-oss/tantivy) (an awesome alternative to Apache Lucene written in Rust). The greatest advantage of using it is that you don't need *anything* to make it work (no separate server or process), it's purely embeddable. So, if your task at hand requires a full-text search, but a full-blown distributed search engine would be an overkill, Tantiny would be perfect for you.
10
+ You're in the right place. **Tantiny** is a minimalistic full-text search library for Ruby based on [Tanti**v**y](https://github.com/quickwit-oss/tantivy) (an awesome alternative to Apache Lucene written in Rust). It's great for cases when your task at hand requires a full-text search, but configuring a full-blown distributed search engine would take more time than the task itself. And even if you already use such an engine in your project (which is highly likely, actually), it still might be easier to just use Tantiny instead because, unlike Solr and Elasticsearch, it doesn't need *anything* to work (no separate server or process or whatever); it's purely embeddable. So, when you find yourself in a situation where using your search engine of choice would be tricky/inconvenient or would require additional setup, you can always fall back to a quick and dirty solution that is nonetheless flexible and fast.
6
11
 
7
- The main philosophy is to provide low-level access Tantivy's inverted index, but with a nice Ruby-esque API, sensible defaults, and additional functionality sprinkled on top (so, Tantiny not exactly bindings to Tantivy, but it tries to be close).
12
+ Tantiny is not exactly Ruby bindings to Tantivy, but it tries to be close. The main philosophy is to provide low-level access to Tantivy's inverted index, but with a nice Ruby-esque API, sensible defaults, and additional functionality sprinkled on top.
8
13
 
9
14
  Take a look at the most basic example:
10
15
 
@@ -15,7 +20,6 @@ index << { id: 1, description: "Hello World!" }
15
20
  index << { id: 2, description: "What's up?" }
16
21
  index << { id: 3, description: "Goodbye World!" }
17
22
 
18
- index.commit
19
23
  index.reload
20
24
 
21
25
  index.search("world") # 1, 3
@@ -26,7 +30,7 @@ index.search("world") # 1, 3
26
30
  Add this line to your application's Gemfile:
27
31
 
28
32
  ```ruby
29
- gem 'tantiny'
33
+ gem "tantiny"
30
34
  ```
31
35
 
32
36
  And then execute:
@@ -39,6 +43,10 @@ Or install it yourself as:
39
43
 
40
44
  You don't **have to** have Rust installed on your system since Tantiny will try to download the pre-compiled binaries hosted on GitHub releases during the installation. However, if no pre-compiled binaries were found for your system (which is a combination of platform, architecture, and Ruby version) you will need to [install Rust](https://www.rust-lang.org/tools/install) first.
41
45
 
46
+ ⚠️ **IMPORTANT** ⚠️
47
+
48
+ Please make sure to specify the minor version when declaring a dependency on `tantiny`. The API is subject to change, and until it reaches `1.0.0` a bump in the minor version will most likely signify a breaking change.
49
+
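One way to follow this advice is RubyGems' pessimistic version operator. The sketch below is not part of the README: it assumes the release you depend on is `0.3.0`, and the `allowed?` helper is a hypothetical name used only to show which versions a constraint such as `~> 0.3.0` would accept.

```ruby
require "rubygems"

# In a Gemfile the constraint would read:
#   gem "tantiny", "~> 0.3.0"
# (0.3.0 is an assumption; substitute the release you actually depend on.)

# Hypothetical helper: does `version` satisfy `constraint`?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

allowed?("~> 0.3.0", "0.3.5") # patch releases within 0.3 are accepted
allowed?("~> 0.3.0", "0.4.0") # the next minor version is rejected
```

Pinning at this level keeps patch updates flowing while a minor bump, which may be breaking before `1.0.0`, has to be opted into explicitly.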
42
50
  ## Defining the index
43
51
 
44
52
  You have to specify a path to where the index would be stored and a block that defines the schema:
@@ -82,6 +90,8 @@ rio_bravo = OpenStruct.new(
82
90
  release_date: Date.parse("March 18, 1959")
83
91
  )
84
92
 
93
+ index << rio_bravo
94
+
85
95
  hanabi = {
86
96
  imdb_id: "tt0119250",
87
97
  type: "/crime/Japan",
@@ -92,6 +102,8 @@ hanabi = {
92
102
  release_date: Date.parse("December 1, 1998")
93
103
  }
94
104
 
105
+ index << hanabi
106
+
95
107
  brother = {
96
108
  imdb_id: "tt0118767",
97
109
  type: "/crime/Russia",
@@ -102,8 +114,6 @@ brother = {
102
114
  release_date: Date.parse("December 12, 1997")
103
115
  }
104
116
 
105
- index << rio_bravo
106
- index << hanabi
107
117
  index << brother
108
118
  ```
109
119
 
@@ -120,12 +130,32 @@ You can also delete it if you want:
120
130
  index.delete(rio_bravo.imdb_id)
121
131
  ```
122
132
 
123
- After that you need to commit the index for the changes to take place:
133
+ ### Transactions
134
+
135
+ If you need to perform multiple writing operations (i.e. more than one) you should always use `transaction`:
124
136
 
125
137
  ```ruby
126
- index.commit
138
+ index.transaction do
139
+ index << rio_bravo
140
+ index << hanabi
141
+ index << brother
142
+ end
127
143
  ```
128
144
 
145
+ Transactions group changes and [commit](https://docs.rs/tantivy/latest/tantivy/struct.IndexWriter.html#method.commit) them to the index in one go. This is *dramatically* more efficient than performing these changes one by one. In fact, all writing operations (i.e. `<<` and `delete`) are wrapped in a transaction implicitly when you call them outside of a transaction, so calling `<<` 10 times outside of a transaction is the same thing as performing 10 separate transactions.
146
+
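The batching behaviour described above can be illustrated with a toy model. This is a minimal sketch, not Tantiny's implementation: `ToyIndex` is a hypothetical class that only counts commits, and it uses a plain thread-local variable where the gem uses `Concurrent::ThreadLocalVar`.

```ruby
# Toy stand-in for an index: it stores documents in memory and counts commits.
class ToyIndex
  attr_reader :commits, :documents

  def initialize
    @commits = 0
    @documents = []
    @pending = []
    @lock = Mutex.new
    @flag = :"toy_index_open_#{object_id}" # per-instance thread-local key
  end

  def transaction
    if Thread.current.thread_variable_get(@flag)
      # Already inside a transaction on this thread: just join it.
      yield
    else
      @lock.synchronize do
        Thread.current.thread_variable_set(@flag, true)
        begin
          yield
          commit
        ensure
          Thread.current.thread_variable_set(@flag, false)
        end
      end
    end
    nil
  end

  # Every write is wrapped in a transaction, explicit or implicit.
  def <<(document)
    transaction { @pending << document }
  end

  private

  def commit
    @documents.concat(@pending)
    @pending.clear
    @commits += 1
  end
end

one_by_one = ToyIndex.new
3.times { |i| one_by_one << { id: i } }   # three implicit transactions

batched = ToyIndex.new
batched.transaction do                     # one explicit transaction
  3.times { |i| batched << { id: i } }
end
```

In this model `one_by_one.commits` ends up at 3 and `batched.commits` at 1, which is the difference the paragraph above describes.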
147
+ ### Concurrency and thread-safety
148
+
149
+ Tantiny is thread-safe, meaning that you can safely share a single instance of the index between threads. You can also spawn separate processes that write to and read from the same index. However, while reading from the index can happen in parallel, writing to it **cannot**. Whenever you call `transaction` or any other operation that modifies the index (i.e. `<<` and `delete`), it will lock the index for the duration of the operation or wait for another process or thread to release the lock. The only exception is when another process holds an index with an exclusive writer, in which case the methods that modify the index will fail immediately.
150
+
151
+ Thus, it's best to have a single writer process and many reader processes if you want to avoid blocking calls. The proper way to do this is to set `exclusive_writer` to `true` when initializing the index:
152
+
153
+ ```ruby
154
+ index = Tantiny::Index.new("/path/to/index", exclusive_writer: true) {}
155
+ ```
156
+
157
+ This way the [index writer](https://docs.rs/tantivy/latest/tantivy/struct.IndexWriter.html) will only be acquired once which means the memory for it and indexing threads will only be allocated once as well. Otherwise a new index writer is acquired every time you perform a writing operation.
158
+
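The cost difference between the two modes can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyWriterIndex`, `write`, and `acquisitions` are hypothetical names, and the "writer" here is a plain array rather than a Tantivy index writer.

```ruby
# Toy model that counts how often a "writer" has to be acquired.
class ToyWriterIndex
  attr_reader :acquisitions

  def initialize(exclusive_writer: false)
    @exclusive_writer = exclusive_writer
    @acquisitions = 0
    @writer = nil
    # An exclusive writer is acquired once, up front, and then kept.
    acquire_writer if @exclusive_writer
  end

  def write(document)
    # Without an exclusive writer, each write acquires and releases its own.
    acquire_writer unless @exclusive_writer
    @writer << document
  ensure
    release_writer unless @exclusive_writer
  end

  private

  def acquire_writer
    @acquisitions += 1
    @writer = []
  end

  def release_writer
    @writer = nil
  end
end

exclusive = ToyWriterIndex.new(exclusive_writer: true)
5.times { |i| exclusive.write(id: i) }  # one acquisition in total

shared = ToyWriterIndex.new
5.times { |i| shared.write(id: i) }     # one acquisition per write
```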
129
159
  ## Searching
130
160
 
131
161
  Make sure that your index is up-to-date by reloading it first:
@@ -190,7 +220,7 @@ All queries can search on multiple fields (except for `facet_query` because it d
190
220
  So, the following query:
191
221
 
192
222
  ```ruby
193
- index.term_query(%i[title, description], "hello")
223
+ index.term_query(%i[title description], "hello")
194
224
  ```
195
225
 
196
226
  Is equivalent to:
@@ -296,7 +326,7 @@ You may have noticed that `search` method returns only documents ids. This is by
296
326
 
297
327
  ## Development
298
328
 
299
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
329
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake build` to build native extensions, and then `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
300
330
 
301
331
  We use [conventional commits](https://www.conventionalcommits.org) to automatically generate the CHANGELOG, bump the semantic version, and to publish and release the gem. All you need to do is stick to the convention and [CI will take care of everything else](https://github.com/baygeldin/tantiny/blob/main/.github/workflows/release.yml) for you.
302
332
 
data/bin/console CHANGED
@@ -7,9 +7,13 @@ require "pry"
7
7
  require "tantiny"
8
8
 
9
9
  path = File.join(__dir__, "../tmp")
10
- en_stem = Tantiny::Tokenizer.new(:stemmer, language: :en)
11
10
 
12
- index = Tantiny::Index.new path, tokenizer: en_stem do
11
+ options = {
12
+ tokenizer: Tantiny::Tokenizer.new(:stemmer, language: :en),
13
+ exclusive_writer: true,
14
+ }
15
+
16
+ index = Tantiny::Index.new(path, **options) do
13
17
  id :imdb_id
14
18
  facet :category
15
19
  string :title
@@ -49,11 +53,12 @@ brother = {
49
53
  release_date: Date.parse("December 12, 1997")
50
54
  }
51
55
 
52
- index << rio_bravo
53
- index << hanabi
54
- index << brother
56
+ index.transaction do
57
+ index << rio_bravo
58
+ index << hanabi
59
+ index << brother
60
+ end
55
61
 
56
- index.commit
57
62
  index.reload
58
63
 
59
64
  binding.pry
data/ext/Rakefile CHANGED
@@ -1,5 +1,5 @@
1
1
  require "thermite/tasks"
2
2
 
3
- project_dir = File.dirname(File.dirname(__FILE__))
3
+ project_dir = File.dirname(__FILE__, 2)
4
4
  Thermite::Tasks.new(cargo_project_path: project_dir, ruby_project_path: project_dir)
5
5
  task default: %w[thermite:build]
@@ -3,9 +3,18 @@
3
3
  module Tantiny
4
4
  class TantivyError < StandardError; end
5
5
 
6
- class UnknownField < StandardError
6
+ class IndexWriterBusyError < StandardError
7
7
  def initialize
8
- super("Can't find the specified field in the schema.")
8
+ msg = "Failed to acquire an index writer. "\
9
+ "Is there an active index with an exclusive writer already?"
10
+
11
+ super(msg)
12
+ end
13
+ end
14
+
15
+ class UnexpectedNone < StandardError
16
+ def initialize(type)
17
+ super("Didn't expect Option<#{type}> to be empty.")
9
18
  end
10
19
  end
11
20
 
@@ -5,5 +5,15 @@ module Tantiny
5
5
  def self.timestamp(date)
6
6
  date.to_datetime.iso8601
7
7
  end
8
+
9
+ def self.with_lock(lockfile)
10
+ File.open(lockfile, File::CREAT) do |file|
11
+ file.flock(File::LOCK_EX)
12
+
13
+ yield
14
+
15
+ file.flock(File::LOCK_UN)
16
+ end
17
+ end
8
18
  end
9
19
  end
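The `with_lock` helper above skips its explicit `LOCK_UN` call if the yielded block raises, and the lock is then released only when `File.open` closes the file. A variant that makes the release explicit and also returns the block's value might look like the sketch below; `with_file_lock` is a hypothetical name, not part of the gem.

```ruby
require "tmpdir"

# Hypothetical variant of a lock helper: takes an exclusive advisory lock,
# yields, and always unlocks in `ensure`, even if the block raises.
def with_file_lock(lockfile)
  File.open(lockfile, File::RDWR | File::CREAT) do |file|
    file.flock(File::LOCK_EX) # blocks until the lock becomes available
    begin
      yield
    ensure
      file.flock(File::LOCK_UN)
    end
  end
end

Dir.mktmpdir do |dir|
  lockfile = File.join(dir, ".example.lock")

  # The block's return value is passed through to the caller.
  with_file_lock(lockfile) { :critical_section_done }
end
```

Because `flock` locks are advisory, this only serializes callers that go through the same helper with the same lock file.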
data/lib/tantiny/index.rb CHANGED
@@ -2,18 +2,18 @@
2
2
 
3
3
  module Tantiny
4
4
  class Index
5
- DEFAULT_INDEX_SIZE = 50_000_000
5
+ LOCKFILE = ".tantiny.lock"
6
+ DEFAULT_WRITER_MEMORY = 5_000_000 # 5MB
6
7
  DEFAULT_LIMIT = 10
7
8
 
8
9
  def self.new(path, **options, &block)
9
- index_size = options[:size] || DEFAULT_INDEX_SIZE
10
- default_tokenizer = options[:tokenizer] || Tokenizer.default
10
+ FileUtils.mkdir_p(path)
11
11
 
12
+ default_tokenizer = options[:tokenizer] || Tokenizer.default
12
13
  schema = Schema.new(default_tokenizer, &block)
13
14
 
14
15
  object = __new(
15
16
  path.to_s,
16
- index_size,
17
17
  schema.default_tokenizer,
18
18
  schema.field_tokenizers.transform_keys(&:to_s),
19
19
  schema.text_fields.map(&:to_s),
@@ -24,15 +24,40 @@ module Tantiny
24
24
  schema.facet_fields.map(&:to_s)
25
25
  )
26
26
 
27
- object.send(:schema=, schema)
27
+ object.send(:initialize, path, schema, **options)
28
28
 
29
29
  object
30
30
  end
31
31
 
32
+ def initialize(path, schema, **options)
33
+ @path = path
34
+ @schema = schema
35
+
36
+ @indexer_memory = options[:writer_memory] || DEFAULT_WRITER_MEMORY
37
+ @exclusive_writer = options[:exclusive_writer] || false
38
+
39
+ @active_transaction = Concurrent::ThreadLocalVar.new(false)
40
+ @transaction_semaphore = Mutex.new
41
+
42
+ acquire_index_writer if exclusive_writer?
43
+ end
44
+
32
45
  attr_reader :schema
33
46
 
34
- def commit
35
- __commit
47
+ def transaction
48
+ if inside_transaction?
49
+ yield
50
+ else
51
+ synchronize do
52
+ open_transaction!
53
+
54
+ yield
55
+
56
+ close_transaction!
57
+ end
58
+ end
59
+
60
+ nil
36
61
  end
37
62
 
38
63
  def reload
@@ -40,19 +65,23 @@ module Tantiny
40
65
  end
41
66
 
42
67
  def <<(document)
43
- __add_document(
44
- resolve(document, schema.id_field).to_s,
45
- slice_document(document, schema.text_fields) { |v| v.to_s },
46
- slice_document(document, schema.string_fields) { |v| v.to_s },
47
- slice_document(document, schema.integer_fields) { |v| v.to_i },
48
- slice_document(document, schema.double_fields) { |v| v.to_f },
49
- slice_document(document, schema.date_fields) { |v| Helpers.timestamp(v) },
50
- slice_document(document, schema.facet_fields) { |v| v.to_s }
51
- )
68
+ transaction do
69
+ __add_document(
70
+ resolve(document, schema.id_field).to_s,
71
+ slice_document(document, schema.text_fields) { |v| v.to_s },
72
+ slice_document(document, schema.string_fields) { |v| v.to_s },
73
+ slice_document(document, schema.integer_fields) { |v| v.to_i },
74
+ slice_document(document, schema.double_fields) { |v| v.to_f },
75
+ slice_document(document, schema.date_fields) { |v| Helpers.timestamp(v) },
76
+ slice_document(document, schema.facet_fields) { |v| v.to_s }
77
+ )
78
+ end
52
79
  end
53
80
 
54
81
  def delete(id)
55
- __delete_document(id.to_s)
82
+ transaction do
83
+ __delete_document(id.to_s)
84
+ end
56
85
  end
57
86
 
58
87
  def search(query, limit: DEFAULT_LIMIT, **smart_query_options)
@@ -79,8 +108,6 @@ module Tantiny
79
108
 
80
109
  private
81
110
 
82
- attr_writer :schema
83
-
84
111
  def slice_document(document, fields, &block)
85
112
  fields.inject({}) do |hash, field|
86
113
  hash.tap { |h| h[field.to_s] = resolve(document, field) }
@@ -90,5 +117,56 @@ module Tantiny
90
117
  def resolve(document, field)
91
118
  document.is_a?(Hash) ? document[field] : document.send(field)
92
119
  end
120
+
121
+ def acquire_index_writer
122
+ __acquire_index_writer(@indexer_memory)
123
+ rescue TantivyError => e
124
+ case e.message
125
+ when /Failed to acquire Lockfile/
126
+ raise IndexWriterBusyError.new
127
+ else
128
+ raise
129
+ end
130
+ end
131
+
132
+ def release_index_writer
133
+ __release_index_writer
134
+ end
135
+
136
+ def commit
137
+ __commit
138
+ end
139
+
140
+ def open_transaction!
141
+ acquire_index_writer unless exclusive_writer?
142
+
143
+ @active_transaction.value = true
144
+ end
145
+
146
+ def close_transaction!
147
+ commit
148
+
149
+ release_index_writer unless exclusive_writer?
150
+
151
+ @active_transaction.value = false
152
+ end
153
+
154
+ def inside_transaction?
155
+ @active_transaction.value
156
+ end
157
+
158
+ def exclusive_writer?
159
+ @exclusive_writer
160
+ end
161
+
162
+ def synchronize(&block)
163
+ @transaction_semaphore.synchronize do
164
+ Helpers.with_lock(lockfile_path, &block)
165
+ end
166
+ end
167
+
168
+ def lockfile_path
169
+ @lockfile_path ||= File.join(@path, LOCKFILE)
170
+ end
93
171
  end
94
172
  end
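The `acquire_index_writer` method in this hunk turns a generic low-level error into a more specific one by matching on its message. The pattern in isolation, as a sketch with hypothetical names (`Sketch`, `LowLevelError`, `WriterBusyError`, `translate_errors`) that stand in for the gem's own classes:

```ruby
module Sketch
  # Stand-in for a generic error raised by the native layer.
  class LowLevelError < StandardError; end

  # Stand-in for a specific, user-facing error.
  class WriterBusyError < StandardError
    def initialize
      super("Failed to acquire an index writer.")
    end
  end

  # Runs the block; lock-related failures are re-raised as WriterBusyError,
  # every other low-level error is re-raised unchanged.
  def self.translate_errors
    yield
  rescue LowLevelError => e
    raise WriterBusyError.new if e.message.match?(/Failed to acquire Lockfile/)
    raise
  end
end

begin
  Sketch.translate_errors do
    raise Sketch::LowLevelError, "Failed to acquire Lockfile: LockBusy"
  end
rescue Sketch::WriterBusyError => e
  e.message # the caller sees the specific error, not the raw message
end
```

Matching on message text is brittle if the upstream wording changes, so callers should treat this as a best-effort translation.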
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Tantiny
4
- VERSION = "0.2.2" # {x-release-please-version}
4
+ VERSION = "0.3.0" # {x-release-please-version}
5
5
  end
data/lib/tantiny.rb CHANGED
@@ -4,6 +4,8 @@ require "ruby-next/language/setup"
4
4
  RubyNext::Language.setup_gem_load_path
5
5
 
6
6
  require "rutie"
7
+ require "concurrent"
8
+ require "fileutils"
7
9
 
8
10
  require "tantiny/version"
9
11
  require "tantiny/errors"
@@ -2,5 +2,7 @@
2
2
  module Tantiny
3
3
  module Helpers
4
4
  def self.timestamp: ((Date | DateTime) date) -> String
5
+
6
+ def self.with_lock: (String lockfile) { (*untyped) -> void } -> void
5
7
  end
6
8
  end
@@ -1,6 +1,7 @@
1
1
  module Tantiny
2
2
  class Index
3
- DEFAULT_INDEX_SIZE: Integer
3
+ LOCKFILE: String
4
+ DEFAULT_WRITER_MEMORY: Integer
4
5
  DEFAULT_LIMIT: Integer
5
6
 
6
7
  def self.new: (
@@ -10,7 +11,6 @@ module Tantiny
10
11
 
11
12
  def self.__new: (
12
13
  String path,
13
- Integer index_size,
14
14
  Tokenizer default_tokenizer,
15
15
  Hash[String, Tokenizer] field_tokenizers,
16
16
  Array[String] text_fields,
@@ -21,9 +21,16 @@ module Tantiny
21
21
  Array[String] facet_fields
22
22
  ) -> Index
23
23
 
24
+ def initialize: (
25
+ String path,
26
+ Schema schema,
27
+ **untyped options
28
+ ) -> void
29
+
24
30
  attr_reader schema: Schema
25
31
 
26
- def commit: () -> void
32
+ def transaction: () { (*untyped) -> void } -> void
33
+
27
34
  def reload: () -> void
28
35
  def <<: (untyped document) -> void
29
36
  def delete: (String id) -> void
@@ -62,9 +69,12 @@ module Tantiny
62
69
 
63
70
  def __search: (Query query, Integer limit) -> Array[String]
64
71
 
72
+ def __acquire_index_writer: (Integer overall_memory) -> void
73
+ def __release_index_writer: () -> void
74
+
65
75
  private
66
76
 
67
- attr_writer schema: Schema
77
+ def commit: () -> void
68
78
 
69
79
  def slice_document: (
70
80
  untyped document,
@@ -78,5 +88,16 @@ module Tantiny
78
88
  ) -> Array[String]
79
89
 
80
90
  def resolve: (untyped document, Symbol field) -> untyped
91
+
92
+ def synchronize: () { (*untyped) -> void } -> void
93
+ def lockfile_path: () -> String
94
+
95
+ def exclusive_writer?: () -> bool
96
+ def acquire_index_writer: () -> void
97
+ def release_index_writer: () -> void
98
+
99
+ def open_transaction!: () -> void
100
+ def close_transaction!: () -> void
101
+ def inside_transaction?: () -> bool
81
102
  end
82
103
  end
data/src/helpers.rs CHANGED
@@ -1,6 +1,5 @@
1
1
  use std::collections::HashMap;
2
2
  use rutie::{AnyException, Array, Exception, RString, Hash, Integer, Float, Boolean, Module};
3
- use tantivy::schema::{Field};
4
3
  use tantivy::tokenizer::Language;
5
4
 
6
5
  // Macro dependencies:
@@ -114,12 +113,15 @@ where
114
113
  }
115
114
  }
116
115
 
117
- impl TryUnwrap<Field> for Option<Field> {
118
- fn try_unwrap(self) -> Field {
116
+ impl<T> TryUnwrap<T> for Option<T> {
117
+ fn try_unwrap(self) -> T {
119
118
  if let Some(value) = self {
120
119
  value
121
120
  } else {
122
- VM::raise_ex(AnyException::new("Tantiny::UnknownField", None));
121
+ VM::raise_ex(AnyException::new(
122
+ "Tantiny::UnexpectedNone",
123
+ Some(&*format!("{}", std::any::type_name::<T>())))
124
+ );
123
125
 
124
126
  self.unwrap()
125
127
  }
data/src/index.rs CHANGED
@@ -11,9 +11,10 @@ use crate::query::{unwrap_query, RTantinyQuery};
11
11
  use crate::tokenizer::{unwrap_tokenizer, RTantinyTokenizer};
12
12
 
13
13
  pub struct TantinyIndex {
14
- pub(crate) index_writer: IndexWriter,
15
- pub(crate) index_reader: IndexReader,
16
14
  pub(crate) schema: Schema,
15
+ pub(crate) index: Index,
16
+ pub(crate) index_writer: Option<IndexWriter>,
17
+ pub(crate) index_reader: IndexReader,
17
18
  }
18
19
 
19
20
  scaffold!(RTantinyIndex, TantinyIndex, "Index");
@@ -22,6 +23,10 @@ pub(crate) fn unwrap_index(index: &RTantinyIndex) -> &TantinyIndex {
22
23
  index.get_data(&*TANTINY_INDEX_WRAPPER)
23
24
  }
24
25
 
26
+ pub(crate) fn unwrap_index_mut(index: &mut RTantinyIndex) -> &mut TantinyIndex {
27
+ index.get_data_mut(&*TANTINY_INDEX_WRAPPER)
28
+ }
29
+
25
30
  #[rustfmt::skip::macros(methods)]
26
31
  methods!(
27
32
  RTantinyIndex,
@@ -29,7 +34,6 @@ methods!(
29
34
 
30
35
  fn new_index(
31
36
  path: RString,
32
- index_size: Integer,
33
37
  default_tokenizer: AnyObject,
34
38
  field_tokenizers: Hash,
35
39
  text_fields: Array,
@@ -41,7 +45,6 @@ methods!(
41
45
  ) -> RTantinyIndex {
42
46
  try_unwrap_params!(
43
47
  path: String,
44
- index_size: i64,
45
48
  default_tokenizer: RTantinyTokenizer,
46
49
  field_tokenizers: HashMap<String, RTantinyTokenizer>,
47
50
  text_fields: Vec<String>,
@@ -103,9 +106,7 @@ methods!(
103
106
  tokenizers.register(&field, unwrap_tokenizer(&tokenizer).clone())
104
107
  }
105
108
 
106
- let mut index_writer = index
107
- .writer(index_size as usize)
108
- .try_unwrap();
109
+ let index_writer = None;
109
110
 
110
111
  let index_reader = index
111
112
  .reader_builder()
@@ -114,7 +115,7 @@ methods!(
114
115
  .try_unwrap();
115
116
 
116
117
  klass().wrap_data(
117
- TantinyIndex { index_writer, index_reader, schema },
118
+ TantinyIndex { index, index_writer, index_reader, schema },
118
119
  &*TANTINY_INDEX_WRAPPER
119
120
  )
120
121
  }
@@ -138,9 +139,8 @@ methods!(
138
139
  facet_fields: HashMap<String, String>
139
140
  );
140
141
 
141
-
142
142
  let internal = unwrap_index(&_itself);
143
- let index_writer = &internal.index_writer;
143
+ let index_writer = internal.index_writer.as_ref().try_unwrap();
144
144
  let schema = &internal.schema;
145
145
 
146
146
  let mut doc = Document::default();
@@ -191,7 +191,7 @@ methods!(
191
191
  try_unwrap_params!(id: String);
192
192
 
193
193
  let internal = unwrap_index(&_itself);
194
- let index_writer = &internal.index_writer;
194
+ let index_writer = internal.index_writer.as_ref().unwrap();
195
195
 
196
196
  let id_field = internal.schema.get_field("id").try_unwrap();
197
197
  let doc_id = Term::from_field_text(id_field, &id);
@@ -201,9 +201,34 @@ methods!(
201
201
  NilClass::new()
202
202
  }
203
203
 
204
+ fn acquire_index_writer(
205
+ overall_memory: Integer
206
+ ) -> NilClass {
207
+ try_unwrap_params!(overall_memory: i64);
208
+
209
+ let internal = unwrap_index_mut(&mut _itself);
210
+
211
+ let mut index_writer = internal.index
212
+ .writer(overall_memory as usize)
213
+ .try_unwrap();
214
+
215
+ internal.index_writer = Some(index_writer);
216
+
217
+ NilClass::new()
218
+ }
219
+
220
+ fn release_index_writer() -> NilClass {
221
+ let internal = unwrap_index_mut(&mut _itself);
222
+
223
+ drop(internal.index_writer.as_ref().try_unwrap());
224
+ internal.index_writer = None;
225
+
226
+ NilClass::new()
227
+ }
228
+
204
229
  fn commit() -> NilClass {
205
- let internal = _itself.get_data_mut(&*TANTINY_INDEX_WRAPPER);
206
- let index_writer = &mut internal.index_writer;
230
+ let internal = unwrap_index_mut(&mut _itself);
231
+ let index_writer = internal.index_writer.as_mut().try_unwrap();
207
232
 
208
233
  index_writer.commit().try_unwrap();
209
234
 
@@ -254,6 +279,8 @@ pub(super) fn init() {
254
279
  klass.def_self("__new", new_index);
255
280
  klass.def("__add_document", add_document);
256
281
  klass.def("__delete_document", delete_document);
282
+ klass.def("__acquire_index_writer", acquire_index_writer);
283
+ klass.def("__release_index_writer", release_index_writer);
257
284
  klass.def("__commit", commit);
258
285
  klass.def("__reload", reload);
259
286
  klass.def("__search", search);
data/src/lib.rs CHANGED
@@ -3,7 +3,6 @@ mod helpers;
3
3
  mod index;
4
4
  #[allow(improper_ctypes_definitions)]
5
5
  mod query;
6
-
7
6
  #[allow(improper_ctypes_definitions)]
8
7
  mod tokenizer;
9
8
 
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: tantiny
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.2
4
+ version: 0.3.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Alexander Baygeldin
8
- autorequire:
8
+ autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2022-03-07 00:00:00.000000000 Z
11
+ date: 2022-03-17 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: ruby-next
@@ -66,7 +66,21 @@ dependencies:
66
66
  - - "~>"
67
67
  - !ruby/object:Gem::Version
68
68
  version: '13.0'
69
- description:
69
+ - !ruby/object:Gem::Dependency
70
+ name: concurrent-ruby
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '1.0'
76
+ type: :runtime
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - "~>"
81
+ - !ruby/object:Gem::Version
82
+ version: '1.0'
83
+ description:
70
84
  email:
71
85
  - a.baygeldin@gmail.com
72
86
  executables: []
@@ -83,7 +97,6 @@ files:
83
97
  - ext/Rakefile
84
98
  - lib/.rbnext/3.0/tantiny/schema.rb
85
99
  - lib/tantiny.rb
86
- - lib/tantiny.so
87
100
  - lib/tantiny/errors.rb
88
101
  - lib/tantiny/helpers.rb
89
102
  - lib/tantiny/index.rb
@@ -113,7 +126,7 @@ metadata:
113
126
  documentation_uri: https://github.com/baygeldin/tantiny/blob/master/README.md
114
127
  homepage_uri: https://github.com/baygeldin/tantiny
115
128
  source_code_uri: https://github.com/baygeldin/tantiny
116
- post_install_message:
129
+ post_install_message:
117
130
  rdoc_options: []
118
131
  require_paths:
119
132
  - lib
@@ -129,7 +142,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
129
142
  version: '0'
130
143
  requirements: []
131
144
  rubygems_version: 3.3.7
132
- signing_key:
145
+ signing_key:
133
146
  specification_version: 4
134
147
  summary: Tiny full-text search for Ruby powered by Tantivy.
135
148
  test_files: []
data/lib/tantiny.so DELETED
Binary file