knitsearch 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/CHANGELOG.md +11 -0
- data/README.md +652 -0
- data/lib/generators/knitsearch/install/install_generator.rb +208 -0
- data/lib/generators/knitsearch/install/templates/migration.rb.tt +7 -0
- data/lib/generators/knitsearch/multisearch_install/multisearch_install_generator.rb +89 -0
- data/lib/knitsearch/document.rb +12 -0
- data/lib/knitsearch/engine.rb +22 -0
- data/lib/knitsearch/fuzzy_corrector.rb +79 -0
- data/lib/knitsearch/has_many_dependent.rb +62 -0
- data/lib/knitsearch/has_many_through_join_dependent.rb +47 -0
- data/lib/knitsearch/has_many_through_target_dependent.rb +54 -0
- data/lib/knitsearch/highlighter.rb +36 -0
- data/lib/knitsearch/levenshtein.rb +35 -0
- data/lib/knitsearch/migration.rb +235 -0
- data/lib/knitsearch/model.rb +613 -0
- data/lib/knitsearch/multisearchable.rb +24 -0
- data/lib/knitsearch/multisearchable_sync.rb +38 -0
- data/lib/knitsearch/query.rb +57 -0
- data/lib/knitsearch/version.rb +5 -0
- data/lib/knitsearch.rb +129 -0
- data/lib/tasks/knitsearch.rake +33 -0
- metadata +125 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: 35a7e641399429a874f032c57bbcd29345099e9ef2e8ac95d3c124d64c96551d
  data.tar.gz: de85797a620e5865008575b4131388f7045ea17ad99d02f60fe5e95c50fd5ff2
SHA512:
  metadata.gz: b2ea03e5df42f40a9dc9c3cb621e1ca4a3e2977ea62b471e802471fe9f800651289fa4569d6460488850fff7edb2926ab58027c5d49ee69cf2f79c71369dd336
  data.tar.gz: 11cabc803a87471a59b6ce66cc37fc58990824885b1e8b264879c178577e237227e8569a7a7bada27a6af8f4e42386f89d07b82535523182039013747a6f9a1d
data/CHANGELOG.md
ADDED
@@ -0,0 +1,11 @@
# Changelog

## v0.1

- `multisearchable(against:)`: global multi-model search macro. Declares which columns to index in the `knitsearches` FTS5 table. `Knitsearch.multisearch(query, limit:)` returns a BM25-ranked polymorphic Document relation. Independent from per-model `searchable_by`; both can coexist on the same model with zero extra sync cost.
- Fix `after_save_commit` callback silently no-op'ing for has_many `associated_against`: bundled callback registration and method definition into an ActiveSupport::Concern's `included do` block to fix callback-chain compilation timing.
- `dictionary: "trigram"` exposes FTS5's built-in trigram tokenizer for substring search. Zero new dependencies. Cannot be combined with `prefix:`.
- `search(query, match: :phrase)` requires tokens to appear as a contiguous, ordered phrase via FTS5's native phrase queries. Cannot be combined with `operator: :or`.
- `search(query, fallback_below: N)`: Searchkick-style two-pass fallback. Runs a strict AND search first; if fewer than N results, automatically retries with `operator: :or` and merges. Returns an `Array` instead of a `Relation` when used. No-op when combined with `operator: :or`.
- `Model.suggest(query, limit: 10, fallback_below: nil)`: autocomplete convenience method. Delegates to `search(..., prefix: true)` with a sensible default limit. BM25-ranked results are chainable `ActiveRecord::Relation`s.
- `searchable_by(associated_against: { assoc: [:column] })`: index fields from belongs_to, has_many, and has_many :through associations. Parent updates cascade to children (belongs_to) or child/join changes refresh parent (has_many, has_many :through). Target updates refresh all parents with that target (has_many :through only).
data/README.md
ADDED
@@ -0,0 +1,652 @@
# Knitsearch

Full-text search for Rails 8 + SQLite. Your search index updates in the same transaction as your row. No separate process, no eventual consistency, no extra infrastructure.

Most search gems make you choose: use your database's native FTS and lose rich text plus associated records, or add Elasticsearch and manage another moving part. Knitsearch keeps you on native FTS and still indexes rich text and associated records, with one line on the model.

**Features that come for free:**

- ActionText rich-text fields. HTML stripped, kept in sync automatically.
- Search by associated model fields (find a Card by its Agenda's name, an Agenda by Card names)
- Multi-model search. One query, polymorphic results, ranked across your whole app.
- Typo tolerance, phrase matching, prefix matching, highlighting, snippets
- BM25 relevance ranking with per-column weights
- One line on the model, one migration

**Query like regular ActiveRecord.** The `.search()` method returns an `ActiveRecord::Relation`, so `.where`, `.includes`, `.pluck` all work without learning a DSL.

## Installation

Add to your Gemfile:

```ruby
gem "knitsearch"
```

Run the install generator with your model and columns:

```sh
bin/rails generate knitsearch:install Article title body
bin/rails db:migrate
```

The generator creates an FTS5 index and three database triggers that keep it in sync on every insert, update, and delete. All updates happen in the same transaction as your row write.

Add one line to the model:

```ruby
class Article < ApplicationRecord
  include Knitsearch::Model
  searchable_by against: { title: "A", body: "B" }
end
```

If the table already has rows, backfill the index once:

```sh
bin/rails knitsearch:backfill[Article]
```

Then search:

```ruby
articles = Article.search("rails sqlite")
articles.each { |a| puts a.title }
```

## When to use Knitsearch

**Good fit:** Rails app, data in SQLite, you want search to commit and roll back with the row. You're indexing rich text, associated records, or both. You don't want to manage a separate search server.

**Reach for something else if:** You need vector or semantic search (use `sqlite-vec` directly or Meilisearch), distributed search across multiple machines (Elasticsearch, OpenSearch), or per-field synonyms as a first-class feature.

## End-to-end example

A blog with articles, authors, and tags. Start by generating the index:

```sh
bin/rails generate knitsearch:install Article title content
bin/rails db:migrate
```

Then edit the migration to add associated fields. Open `db/migrate/[timestamp]_create_articles_search_table.rb` and update the call:

```ruby
class CreateArticlesSearchTable < ActiveRecord::Migration[8.0]
  def change
    reversible do |dir|
      dir.up do
        Knitsearch::Migration.create_searchable_table(
          "articles",
          columns: ["title"],
          rich_text_columns: ["content"],
          associated_against: {
            author: [:name],
            tags: [:name]
          }
        )
      end

      dir.down do
        Knitsearch::Migration.drop_searchable_table("articles")
      end
    end
  end
end
```

Add the model declaration to match:

```ruby
class Article < ApplicationRecord
  has_rich_text :content
  belongs_to :author
  has_many :article_tags
  has_many :tags, through: :article_tags

  include Knitsearch::Model
  searchable_by(
    against: { title: "A", content: "B" },
    associated_against: { author: [:name], tags: [:name] }
  )
end
```

Run migrations and backfill:

```sh
bin/rails db:migrate
bin/rails knitsearch:backfill[Article]
```

Now you can search articles by title, by their rich-text content (HTML stripped automatically), by the author's name, or by tag names. All in one index:

```ruby
# Search by title or content
Article.search("rails framework")

# Also matches by author name or tag
Article.search("john doe") # articles by author John Doe
Article.search("ruby") # articles tagged "ruby"

# Typo tolerance
Article.search("framwork", fuzzy: 1)

# Autocomplete: prefix on the last word, typo-correct the rest
Article.search("jhn do", fuzzy: 1, suggest: true)
# => matches "john doe"

# Phrase matching
Article.search("ruby on rails", match: :phrase)

# Highlight matches
results = Article.search("setup")
results.first.search_highlight(:title)
# => <p>Getting <mark>setup</mark> with Rails</p>

# Extract snippets with context
results.first.search_snippet(:content, 30)
# => <p>...To get <mark>setup</mark> quickly, install...</p>
```

Results are ordered by relevance. Chain any ActiveRecord method:

```ruby
Article.search("rails")
  .where(published: true)
  .includes(:author, :tags)
  .limit(10)
```

When you update an article's title, author, or tags, the search index updates instantly in the same transaction. No drift, no background job, no eventual consistency.

## Querying

The `search` method returns an `ActiveRecord::Relation`. Chain it like any other:

```ruby
Article.search("rails").where(published: true).limit(10).offset(20)
```

### Common queries

Eager-load associations:

```ruby
Article.search("rails").includes(:author)
```

Match either term (default is AND):

```ruby
Article.search("ruby rails", operator: :or)
```

Phrase matching:

```ruby
Article.search("ruby on rails", match: :phrase)
```

Limit results:

```ruby
Article.search("rails", limit: 20)
```

**User input is escaped automatically.** FTS5 syntax characters like `AND`, `OR`, `NOT`, `NEAR`, `*`, `"`, and parentheses become literals. Pass user-typed queries straight in.
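
To make the escaping idea concrete, here is a minimal sketch of how raw input could be turned into a safe FTS5 `MATCH` expression. It is not the gem's implementation: the helper name `build_match`, its keyword arguments, and the whitespace tokenization are assumptions made for illustration. It relies only on documented FTS5 syntax (double-quoted string literals, implicit AND between terms, and a trailing `*` for prefix terms).

```ruby
# Hypothetical helper: turns raw user input into a safe FTS5 MATCH expression.
# Assumption: tokens are split on whitespace; Knitsearch's real rules may differ.
def build_match(query, operator: :and, match: nil, prefix: false)
  tokens = query.to_s.split
  return nil if tokens.empty?

  # Inside double quotes FTS5 reads a string literal, so AND/OR/NOT/NEAR,
  # "*" and parentheses lose their special meaning. Embedded quotes are doubled.
  quote = ->(text) { '"' + text.gsub('"', '""') + '"' }

  # A phrase query is a single quoted string containing every token in order.
  return quote.call(tokens.join(" ")) if match == :phrase

  quoted = tokens.map(&quote)
  # A "*" placed after the closing quote asks FTS5 for a prefix match.
  quoted[-1] += "*" if prefix

  # FTS5 ANDs adjacent terms implicitly; OR must be spelled out.
  quoted.join(operator == :or ? " OR " : " ")
end

puts build_match("ruby AND rails")                # "ruby" "AND" "rails"
puts build_match("ruby rails", operator: :or)     # "ruby" OR "rails"
puts build_match("ruby on rails", match: :phrase) # "ruby on rails"
puts build_match("micheal jo", prefix: true)      # "micheal" "jo"*
```

Quoting every token is the conservative choice: it costs nothing at query time and means an operator keyword typed by a user is searched as an ordinary word.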

### Results and relevance

The `search` method returns results ordered by BM25 relevance, so the most relevant row is first:

```ruby
results = Article.search("rails")
results.count
results.exists?
results.pluck(:title)
```

Empty queries return nothing:

```ruby
Article.search("").to_a # => []
Article.search(nil).to_a # => []
```

### Boosting: make some fields rank higher

By default, all fields are weighted equally. Boost important ones:

```ruby
searchable_by against: { title: "A", body: "B", tags: "C" }
```

`"A"` ranks 2 times higher than `"B"`, which ranks 2 times higher than `"C"`. The buckets map to BM25 multipliers: A=8, B=4, C=2, D=1.

You can also use numeric weights directly:

```ruby
searchable_by against: { title: 10, body: 1 }
```
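
The bucket-to-multiplier mapping can be pictured with a short sketch. The bucket values come from the text above; the helper names `normalize_weight` and `bm25_weights` are hypothetical and only illustrate how letters and numbers might be reduced to one list of floats.

```ruby
# Bucket letters and their multipliers, as stated in the text above.
WEIGHT_BUCKETS = { "A" => 8.0, "B" => 4.0, "C" => 2.0, "D" => 1.0 }.freeze

# Hypothetical helper: accepts a bucket letter or a number and returns a Float.
def normalize_weight(weight)
  return weight.to_f if weight.is_a?(Numeric)

  WEIGHT_BUCKETS.fetch(weight.to_s.upcase) do
    raise ArgumentError, "unknown weight: #{weight.inspect}"
  end
end

# Builds the per-column weight list in declaration order, the shape that
# SQLite's bm25(table, w1, w2, ...) function expects.
def bm25_weights(against)
  against.values.map { |weight| normalize_weight(weight) }
end

p bm25_weights({ title: "A", body: "B", tags: "C" }) # => [8.0, 4.0, 2.0]
p bm25_weights({ title: 10, body: 1 })               # => [10.0, 1.0]
```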

### Typo tolerance

There are two typo-handling APIs because they answer different questions.

**`fuzzy:` rewrites the query.** Use when the user is mid-typing (autocomplete, instant search), where rewriting trailing tokens is the point.

```ruby
Article.search("zuchini", fuzzy: 1) # rewrites to "zucchini" before searching
Article.search("jhon smtih", fuzzy: 1) # corrects each token independently
```

`fuzzy:` is the maximum Levenshtein edit distance (number of single-character changes). Use 1 for most words; 2 for words 8+ characters. `fuzzy: 0` or `fuzzy: nil` disables correction. The corrector preserves your word when it's already in the index at reasonable frequency: `"date"` stays as `"date"` even when `"data"` is more common. Only obvious outliers (a typo with vastly fewer occurrences than its corrected form) get rewritten.

**`suggest_correction` returns a suggestion.** Use for one-shot user searches where preserving intent matters. Returns the corrected string or `nil`; `nil` means the user's spelling was fine and no suggestion is worth showing.

```ruby
suggestion = Article.suggest_correction("zuchini") # => "zucchini"
Article.suggest_correction("zucchini") # => nil

# In a controller:
@suggestion = Article.suggest_correction(params[:q])
@articles = Article.search(params[:q])

# In the view:
# <% if @suggestion %>Did you mean <%= link_to @suggestion, ... %>?<% end %>
```

Each whitespace-separated token is corrected independently against the FTS5 vocab table. Tokens shorter than 3 characters are left alone. Combine with `fallback_below:` to also widen sparse results: correction runs first, then the sparse-results fallback.
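
The correction rules above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the gem's corrector: `VOCAB` stands in for the FTS5 vocab table, `RARE_RATIO` is an invented threshold for "vastly fewer occurrences", and the distance function is the textbook Levenshtein algorithm.

```ruby
# Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Invented threshold: a candidate must be at least this many times more
# frequent than the typed word before the word is rewritten.
RARE_RATIO = 10

# Hypothetical corrector. `vocab` maps indexed terms to occurrence counts.
def correct_token(token, vocab, max_distance)
  return token if token.length < 3 || max_distance.to_i.zero?

  own_count = vocab.fetch(token, 0)
  candidates = vocab.select do |term, count|
    term != token &&
      levenshtein(token, term) <= max_distance &&
      count >= [own_count, 1].max * RARE_RATIO
  end
  return token if candidates.empty?

  # Prefer the most frequent close term.
  candidates.max_by { |_term, count| count }.first
end

# Each whitespace-separated token is corrected independently.
def correct_query(query, vocab, fuzzy: 1)
  query.split.map { |token| correct_token(token, vocab, fuzzy) }.join(" ")
end

VOCAB = { "zucchini" => 40, "data" => 500, "date" => 120, "framework" => 75 }.freeze

puts correct_query("zuchini", VOCAB)  # zucchini
puts correct_query("date", VOCAB)     # date (kept: already frequent enough)
puts correct_query("framwork", VOCAB) # framework
```

With these counts, `"date"` is left alone because `"data"` is only about four times more frequent, well under the assumed ratio.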

### Sparse-results fallback

When a strict AND search returns too few hits, widen automatically:

```ruby
Article.search("zucini", fallback_below: 5)
```

If the AND pass returns fewer than 5 results, the gem retries as OR (and prefix, if enabled). Returns an `Array` rather than a `Relation`. The second pass depends on the first pass's count, so chaining `.where` afterward isn't supported. Filter before the call, or filter in Ruby on the result.
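
A rough sketch of the two-pass control flow, assuming the merge keeps strict hits first and appends loose hits that are not already present (the exact merge order is an assumption). `search_with_fallback`, `toy_search`, and `INDEX` are hypothetical names; the callables stand in for the AND and OR passes.

```ruby
# Hypothetical two-pass search. `strict` and `loose` are callables standing in
# for the AND pass and the OR pass; each returns records ordered by relevance.
def search_with_fallback(strict:, loose:, fallback_below:)
  primary = strict.call.to_a
  return primary if primary.size >= fallback_below

  # Keep strict hits first, then append loose hits not already present.
  (primary + loose.call.to_a).uniq
end

INDEX = {
  1 => "zucchini bread recipe",
  2 => "zucchini soup",
  3 => "banana bread"
}.freeze

# Toy matcher over INDEX: every term must appear (:and) or any term (:or).
def toy_search(terms, mode)
  INDEX.select { |_id, text|
    hits = terms.map { |term| text.include?(term) }
    mode == :and ? hits.all? : hits.any?
  }.keys
end

TERMS = %w[zucchini bread].freeze

p search_with_fallback(
  strict: -> { toy_search(TERMS, :and) },
  loose: -> { toy_search(TERMS, :or) },
  fallback_below: 2
)
# => [1, 2, 3]
```

Because the second pass only runs after the first pass has been counted, the result is a materialized `Array`, which is consistent with the chaining restriction described above.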

### Highlighting and snippets

Wrap matched terms and extract context:

```ruby
results = Article.search("rails", highlight: [:title], snippet: { body: 30 })
results.first.search_highlight(:title) # safe HTML, hits wrapped in <mark>
results.first.search_snippet(:body) # 30-token excerpt with hits marked
```

The `highlight:` option takes an array of column names. The `snippet:` option takes an array (defaults to 20 tokens) or a hash specifying tokens per column. Both return safe HTML.

### Relevance scores

When you use `highlight:` or `snippet:`, results expose their BM25 score:

```ruby
results.first.searchable_score # => 0.342 (lower = more relevant)
```

Useful for ranking results from multiple `search` calls. For example, you can merge per-model searches into a unified list.
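
A minimal sketch of such a merge, assuming each result responds to `searchable_score` and that lower scores rank first. `Hit` is a stand-in struct and `merge_by_score` is a hypothetical helper. Keep in mind that BM25 scores depend on each index's own statistics, so scores from different models are only roughly comparable.

```ruby
# Stand-in for a search result; real records would expose `searchable_score`.
Hit = Struct.new(:type, :title, :searchable_score)

# Hypothetical merge: lower scores are treated as more relevant, as noted above.
def merge_by_score(*result_sets, limit: 10)
  result_sets.flatten(1).sort_by(&:searchable_score).first(limit)
end

ARTICLE_HITS = [
  Hit.new("Article", "Rails on SQLite", 0.21),
  Hit.new("Article", "Rails tips", 0.58)
].freeze
COMMENT_HITS = [Hit.new("Comment", "Great Rails post", 0.34)].freeze

merge_by_score(ARTICLE_HITS, COMMENT_HITS).each do |hit|
  puts "#{hit.type}: #{hit.title} (#{hit.searchable_score})"
end
```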

## Indexing

Each search index is an FTS5 virtual table in your SQLite database, kept in sync by triggers. There's no async worker, no separate connection pool, no eventual-consistency window. The index updates inside the same transaction as the row.

### Adding more models

Run the generator for each model:

```sh
bin/rails generate knitsearch:install Comment body
bin/rails db:migrate
```

Each model gets its own FTS table and triggers.

### Backfill

Triggers only catch writes that happen after the FTS table exists. For pre-existing rows:

```sh
bin/rails knitsearch:backfill[Article]
```

This is synchronous. Run once during quiet hours. For fresh apps that install the gem from day one, backfill is a no-op.

### Reindex

If the column set changes or the index gets corrupted:

```ruby
Article.reindex!
```

Or from the command line:

```sh
bin/rails knitsearch:reindex[Article]
```

For models with ActionText fields, use `Article.knitsearch_backfill!` instead. It repopulates shadow columns and rebuilds the index atomically.

## Autocomplete

Build prefix-based suggestions:

```ruby
Article.suggest("rai")
```

This is a thin wrapper over `search(..., prefix: true)` with a default limit of 10. Returns a chainable `Relation` ranked by BM25. Empty queries return nothing.

Correct typos in completed words while still prefix-expanding what the user is typing:

```ruby
Article.suggest("micheal jo", fuzzy: 1)
```

All tokens except the last are corrected. The last is prefix-expanded. So `"micheal jo"` becomes `"michael" + "jo*"`. Combine with `fallback_below:` to also widen sparse results.
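
The token handling described above can be sketched as follows. `suggest_terms` is a hypothetical helper, and the corrector is a toy lookup table standing in for whatever typo correction is in use.

```ruby
# Hypothetical rewrite step for autocomplete: every token except the last is
# passed through `corrector`; the last token becomes an FTS5 prefix term.
def suggest_terms(query, corrector:)
  tokens = query.to_s.split
  return [] if tokens.empty?

  *completed, typing = tokens
  completed.map { |token| corrector.call(token) } + ["#{typing}*"]
end

# Toy corrector backed by a fixed lookup table (an assumption for this sketch).
FIXES = { "micheal" => "michael" }.freeze
CORRECTOR = ->(token) { FIXES.fetch(token, token) }

p suggest_terms("micheal jo", corrector: CORRECTOR) # => ["michael", "jo*"]
p suggest_terms("rai", corrector: CORRECTOR)        # => ["rai*"]
```

Leaving the last token uncorrected matters because it is usually incomplete, so "fixing" it would fight the user's typing.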

### Enable prefix matching on every search

By default, prefix matching is off. The query `search("perf")` won't match `"performance"`. Enable it per-model:

```ruby
class Article < ApplicationRecord
  include Knitsearch::Model
  searchable_by against: { title: "A", body: "B" },
                using: { fts5: { prefix: true } }
end

Article.search("perf") # now matches "performance", "performing"
```

Defaults to 2 and 3 character prefixes. Customize:

```ruby
using: { fts5: { prefix: [2, 3, 4] } }
```

Prefix indexes are roughly 2 times the size of plain indexes. Cannot be combined with `dictionary: "trigram"` because trigram already covers substring matching.

## Associated fields

Index fields from related records. Updates cascade automatically.

### Setup

To enable associated search, edit the generated migration's `create_searchable_table` call to pass `associated_against:`. For example, if you want a Card to be searchable by its Agenda's name and its Tags' names:

```ruby
# db/migrate/[timestamp]_create_cards_search_table.rb
class CreateCardsSearchTable < ActiveRecord::Migration[8.0]
  def change
    reversible do |dir|
      dir.up do
        Knitsearch::Migration.create_searchable_table(
          "cards",
          columns: ["name", "body"],
          associated_against: {
            agenda: [:name],
            tags: [:name]
          }
        )
      end

      dir.down do
        Knitsearch::Migration.drop_searchable_table("cards")
      end
    end
  end
end
```

Then add the matching `associated_against:` to your model:

```ruby
class Card < ApplicationRecord
  belongs_to :agenda
  has_many :tags
  include Knitsearch::Model
  searchable_by(
    against: { name: "A", body: "B" },
    associated_against: { agenda: [:name], tags: [:name] }
  )
end
```

Run `bin/rails db:migrate`, then `bin/rails knitsearch:backfill[Card]` to index existing rows.

### belongs_to

Search a child record by its parent's fields:

```ruby
class Card < ApplicationRecord
  belongs_to :agenda
  include Knitsearch::Model
  searchable_by(
    against: { name: "A", body: "B" },
    associated_against: { agenda: [:name] }
  )
end

Card.search("real estate") # matches by agenda.name
```

A shadow column on the Card stores the Agenda's name. When the Agenda updates, the Card's index refreshes via `update_all`.

### has_many

Search a parent record by its children's fields:

```ruby
class Agenda < ApplicationRecord
  has_many :cards
  include Knitsearch::Model
  searchable_by(
    against: { name: "A" },
    associated_against: { cards: [:name] }
  )
end

Agenda.search("john smith") # matches by any child card's name
```

A shadow column on the Agenda stores space-separated, concatenated child values. When a Card is created, updated, destroyed, or reassigned, the Agenda's index refreshes automatically.

### has_many :through

```ruby
class Card < ApplicationRecord
  has_many :card_tags
  has_many :tags, through: :card_tags
  include Knitsearch::Model
  searchable_by(
    against: { name: "A" },
    associated_against: { tags: [:name] }
  )
end
```

Join row changes and target updates both refresh the Card's index.

**Note:** The `collection.delete(item)` method uses direct SQL and skips destroy callbacks. This is a Rails limitation, not specific to this gem. Use `collection.destroy(item)` so the parent's shadow column refreshes.

### Weights for associated fields

Default weight is `"C"`. Override per column:

```ruby
associated_against: { agenda: { name: "B", description: "C" } }
```

Polymorphic associations are not supported yet.

## ActionText

Index rich-text fields automatically. HTML is stripped and plain text is kept in sync:

```ruby
class Article < ApplicationRecord
  include Knitsearch::Model
  has_rich_text :content
  searchable_by against: { title: "A", content: "B" }
end
```

The generator detects `has_rich_text` and does three things:

1. Creates a `content_plain_text` shadow column
2. Configures the FTS index to read from the shadow column
3. Installs a `before_save` callback that extracts plain text (strips HTML, removes `<action-text-attachment>` elements, collapses whitespace, unescapes entities) and syncs the shadow column
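
The plain-text extraction in step 3 can be approximated with a few regular expressions. The sketch below is a simplification for illustration only: `plain_text` and `ENTITIES` are hypothetical names, the entity table is deliberately tiny, and regex-based tag stripping is not a substitute for a real HTML parser.

```ruby
# Minimal entity table for the sketch; a real implementation would cover more.
ENTITIES = {
  "&amp;" => "&", "&lt;" => "<", "&gt;" => ">",
  "&quot;" => '"', "&#39;" => "'", "&nbsp;" => " "
}.freeze

# Hypothetical extractor covering the four operations named in step 3.
def plain_text(html)
  text = html.to_s.dup
  # Drop attachment elements together with anything inside them.
  text.gsub!(%r{<action-text-attachment\b[^>]*>.*?</action-text-attachment>}m, " ")
  # Replace remaining tags with a space so adjacent words stay separated.
  text.gsub!(/<[^>]+>/, " ")
  # Unescape the handful of entities listed in ENTITIES.
  text.gsub!(/&(?:amp|lt|gt|quot|#39|nbsp);/) { |entity| ENTITIES[entity] }
  # Collapse runs of whitespace and trim the ends.
  text.gsub(/\s+/, " ").strip
end

SAMPLE_HTML = '<div><h1>Getting   setup</h1><action-text-attachment sgid="x">' \
              '<img src="a.png"></action-text-attachment><p>Tom &amp; Jerry</p></div>'

puts plain_text(SAMPLE_HTML) # Getting setup Tom & Jerry
```

Entities are unescaped only after tags are stripped, so escaped markup such as `&lt;b&gt;` survives as literal text instead of being removed as a tag.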

Highlight and snippet operate on the plain text:

```ruby
Article.search("setup", highlight: [:content])
```

For pre-existing records with rich text, use the model method. The rake task skips ActionText:

```ruby
Article.knitsearch_backfill!
```

## Multi-model search

Search every searchable model in one query, returning polymorphic results ranked by BM25. Useful for global search UI.

Set up once:

```sh
bin/rails generate knitsearch:multisearch_install
bin/rails db:migrate
```

Declare which models are searchable:

```ruby
class Card < ApplicationRecord
  multisearchable against: [:name, :body]
end

class Agenda < ApplicationRecord
  multisearchable against: [:name]
end
```

Query:

```ruby
results = Knitsearch.multisearch("vip")
results.first.searchable_score
records = results.includes(:searchable).map(&:searchable)
# => [Card, Agenda, Card, ...] heterogeneous, BM25-ranked
```

Returns a `Relation` of `Knitsearch::Document`. Each document holds `searchable_type`, `searchable_id`, `content`, and `searchable_score`. Chain like any relation:

```ruby
Knitsearch.multisearch("vip", limit: 10)
Knitsearch.multisearch("vip").where("searchable_type = 'Card'")
```

Backfill existing rows:

```ruby
Card.knitsearch_multisearch_backfill!
Agenda.knitsearch_multisearch_backfill!
```

Per-model and multi-model indexes are independent. Declaring both costs two index writes per row, nothing more.

## Dictionaries

Pick how words are tokenized and matched. Set at install time:

```sh
bin/rails generate knitsearch:install Article title body --dictionary=english
```

Or in the model (must match the migration):

```ruby
searchable_by against: { title: "A", body: "B" },
              using: { fts5: { dictionary: "english" } }
```

| Dictionary | Effect | Tokenizer |
|---|---|---|
| `"simple"` (default) | Case-folded, diacritics removed | `unicode61` |
| `"english"` | English stemming (running becomes run) | `porter` |
| `"trigram"` | Substring matching (mit matches Smith) | `trigram` |

### Trigram tradeoffs

Trigram tokenizes into overlapping 3-character substrings, so any substring is searchable. Useful for last names, product codes, anything where substring matching is natural.

Tradeoffs:
- About 3 times the index size, slower writes
- Query must be at least 3 characters
- Spans whitespace ("d ru" finds "red rubber")
- Cannot combine with `prefix:` because trigram already covers substring matching
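
To see why the 3-character minimum and the whitespace-spanning behavior fall out of the tokenization, here is a small pure-Ruby sketch. `trigrams` and `trigram_match?` are hypothetical helpers that imitate the idea; they are not how SQLite implements its trigram tokenizer.

```ruby
# Overlapping 3-character windows over the downcased text (spaces included).
def trigrams(text)
  chars = text.to_s.downcase.chars
  return [] if chars.length < 3

  chars.each_cons(3).map(&:join)
end

# Simplified containment check: a real trigram index also verifies that the
# matching trigrams are adjacent; here a plain substring test plays that role.
def trigram_match?(query, document)
  grams = trigrams(query)
  return false if grams.empty? # queries under 3 characters produce no trigrams

  doc_grams = trigrams(document)
  grams.all? { |gram| doc_grams.include?(gram) } &&
    document.downcase.include?(query.downcase)
end

p trigrams("Smith")                    # => ["smi", "mit", "ith"]
p trigram_match?("mit", "Smith")       # => true
p trigram_match?("d ru", "red rubber") # => true
p trigram_match?("mi", "Smith")        # => false
```

A five-letter word yields three index entries where a word tokenizer yields one, which gives a feel for where the extra index size comes from.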

To change a dictionary, write a new migration that drops and recreates the FTS table, then run `Model.reindex!`.

## Reference

### Errors

- `Thor::Error` when the generator runs on a non-SQLite adapter
- `Knitsearch::SchemaMismatchError` when a model declares `searchable_by` columns the FTS table doesn't have
- `Knitsearch::ColumnError` when `highlight:` or `snippet:` references a column not in `searchable_by`

All errors inherit from `Knitsearch::Error`.

### Troubleshooting

**Generator says the source table doesn't exist.**
Run `bin/rails db:migrate` first. The generator reads the live schema to validate column names.

**`SchemaMismatchError` after editing `searchable_by`.**
The model declares a column the FTS table doesn't have. Run the generator again with the new columns, then migrate.

**Search returns nothing after install.**
If you added the gem to an existing app, run `bin/rails knitsearch:backfill[Model]`. Triggers only catch writes after the FTS table exists.

**Search returns nothing for a rich-text field.**
Use `Model.knitsearch_backfill!` instead of the rake task. The rake task doesn't populate ActionText shadow columns.

**Generator rejects column names.**
FTS5 column names must be lowercase letters, digits, and underscores. Rename the column or add a shadow column.

**Associated search isn't working.**
Make sure the migration's `associated_against:` hash and the model's `associated_against:` hash match exactly. The keys (association names) and values (column arrays) must be identical. See [Setup](#setup) for an example.

### How it works

The FTS5 table is created with `content='articles'` and `content_rowid='id'`, so it stores only the inverted index and not the source rows. Three triggers (after insert, after delete, after update) fire inside the source table's transaction and keep the index in sync. This is the pattern recommended by SQLite's FTS5 documentation.
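
For readers who have not seen the external-content pattern, the sketch below generates the kind of SQL it involves. The statements follow the shape shown in SQLite's FTS5 documentation; the helper `fts5_sync_sql`, the `_fts` table suffix, and the trigger names are assumptions, and the SQL the gem actually emits may differ. Identifiers are interpolated without sanitizing, which is acceptable only in a sketch.

```ruby
# Hypothetical SQL builder following the external-content pattern from the
# SQLite FTS5 documentation. Table and trigger names are invented here.
def fts5_sync_sql(table, columns)
  fts = "#{table}_fts"
  cols = columns.join(", ")
  new_vals = columns.map { |c| "new.#{c}" }.join(", ")
  old_vals = columns.map { |c| "old.#{c}" }.join(", ")
  # With external content, rows leave the index through a special 'delete' insert.
  remove = "INSERT INTO #{fts}(#{fts}, rowid, #{cols}) VALUES ('delete', old.id, #{old_vals});"
  add = "INSERT INTO #{fts}(rowid, #{cols}) VALUES (new.id, #{new_vals});"

  [
    "CREATE VIRTUAL TABLE #{fts} USING fts5(#{cols}, content='#{table}', content_rowid='id');",
    "CREATE TRIGGER #{table}_ai AFTER INSERT ON #{table} BEGIN #{add} END;",
    "CREATE TRIGGER #{table}_ad AFTER DELETE ON #{table} BEGIN #{remove} END;",
    "CREATE TRIGGER #{table}_au AFTER UPDATE ON #{table} BEGIN #{remove} #{add} END;"
  ]
end

puts fts5_sync_sql("articles", %w[title body])
```

The update trigger removes the old values before inserting the new ones because an external-content index cannot look up what it previously stored for a row.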

A `search` call is a single `SELECT` from the source table, joined to the FTS table on `rowid`, filtered by `MATCH`, ordered by `bm25()`, and limited. No intermediate step. That's why `.where`, `.includes`, `.pluck` all work natively.
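
As an illustration of that query shape, the following sketch assembles a comparable statement as a string. `search_sql` and the `_fts` table suffix are hypothetical, and the query Knitsearch really generates (through ActiveRecord) may look different. The `bm25()` call and its per-column weight arguments are standard FTS5; lower values sort first because FTS5 assigns numerically lower scores to better matches.

```ruby
# Hypothetical query builder. `weights` are the per-column BM25 multipliers in
# column order; the MATCH expression is left as a bind parameter (?).
def search_sql(table, weights:, limit: nil)
  fts = "#{table}_fts"
  parts = [
    "SELECT #{table}.* FROM #{table}",
    "JOIN #{fts} ON #{fts}.rowid = #{table}.id",
    "WHERE #{fts} MATCH ?",
    "ORDER BY bm25(#{fts}, #{weights.join(', ')})"
  ]
  parts << "LIMIT #{Integer(limit)}" if limit
  parts.join(" ")
end

puts search_sql("articles", weights: [8.0, 4.0], limit: 10)
```

Because the whole search is one `SELECT` against the source table, any further conditions can be added to the same statement, which is what makes relation chaining possible.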

### Limitations

- **Trigger overhead on writes.** Every write to a searchable model fires a trigger that updates the FTS table in the same transaction. Negligible for most apps and measurable for high-write workloads. Async indexing is on the roadmap.
- **SQLite only.** The install generator rejects other adapters. Use `pg_search` for PostgreSQL or your database's native FTS.
- **Rails 8.0+, Ruby 3.2+.**
- **Polymorphic associations not supported** by `associated_against:`.

### Roadmap

- Per-language stemming (Spanish, French, etc.) as optional gem dependencies
- Double Metaphone phonetic matching
- Optional async indexing for write-heavy apps
- Vector and semantic search integration (sqlite-vec) for hybrid lexical and embedding ranking

## License

Knitsearch is released under the MIT License.