rails-paradedb 0.1.0 → 0.3.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +61 -6
- data/README.md +255 -168
- data/lib/parade_db/aggregations.rb +112 -1
- data/lib/parade_db/arel/README.md +31 -21
- data/lib/parade_db/arel/builder.rb +309 -25
- data/lib/parade_db/arel/nodes.rb +70 -8
- data/lib/parade_db/arel/predications.rb +90 -34
- data/lib/parade_db/arel/visitor.rb +32 -1
- data/lib/parade_db/diagnostics.rb +78 -0
- data/lib/parade_db/migration_helpers.rb +10 -12
- data/lib/parade_db/model.rb +51 -16
- data/lib/parade_db/proximity.rb +71 -0
- data/lib/parade_db/query.rb +14 -0
- data/lib/parade_db/search_methods.rb +385 -66
- data/lib/parade_db/tokenizer_sql.rb +21 -0
- data/lib/parade_db/version.rb +1 -1
- data/lib/parade_db.rb +42 -2
- metadata +35 -13
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f119945534d0ec4f358a9e05d85529d645dbd71e249a0456e65bbd70a7d63135
+  data.tar.gz: 7a9f56fe0eb2a0ea452697f90beeca5a58ca55b996c199fc08babbb14f10eb0c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 1f98449702f8795645745b81bc344a1264946f8928e7bef804295a3bd15e468677414c85b266b2983ceb46aa558d9494f7933e0f770757cb0e31ff1d5fa39567
+  data.tar.gz: 3c2e6e60585532b13f3f810160ea77684274a1ba9ad7d7e878d037602bdb784edb2dbe773b66a87e21651671be121c3ba948fd7be38ee4e341c13ec151b7d1bd
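The new SHA256 values in `checksums.yaml` can be compared against a downloaded copy of the gem. The following is a minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted from the `.gem` archive into the current directory; `verify_checksum` is a hypothetical helper, not part of RubyGems.

```ruby
require "digest"

# Hypothetical helper: true when the file's SHA256 digest equals the expected hex string.
def verify_checksum(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Expected values copied from the 0.3.0 side of the checksums.yaml diff.
EXPECTED_SHA256 = {
  "metadata.gz" => "f119945534d0ec4f358a9e05d85529d645dbd71e249a0456e65bbd70a7d63135",
  "data.tar.gz" => "7a9f56fe0eb2a0ea452697f90beeca5a58ca55b996c199fc08babbb14f10eb0c"
}.freeze

EXPECTED_SHA256.each do |name, hex|
  next unless File.exist?(name) # skip files that have not been extracted yet

  puts "#{name}: #{verify_checksum(name, hex) ? 'ok' : 'MISMATCH'}"
end
```

A mismatch would usually mean the file is from a different release or was altered in transit.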
data/CHANGELOG.md
CHANGED
@@ -1,13 +1,66 @@
 # Changelog
-<!-- markdownlint-disable MD024 -->

-All notable changes to this project will be documented in this file.
-
-The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
-and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

 ## [Unreleased]

+## [0.3.0] - 2026-03-23
+
+### Removed
+
+- **BREAKING**: Removed `has_paradedb_index` class attribute. It had no
+  effect on library behavior. Remove `self.has_paradedb_index = true`
+  from your models.
+
+### Changed
+
+- **BREAKING**: `near` now accepts a chainable `ParadeDB.proximity(...).within(...)`
+  clause to support the full proximity API
+
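To picture the chainable clause described in the 0.3.0 entry, here is a toy builder in plain Ruby. `ProximitySketch` and its `clauses` reader are hypothetical names used only for illustration; this is not the gem's implementation, and it only mirrors the call shape `proximity(...).within(distance, *terms, ordered:)` that appears in the README part of this diff.

```ruby
# Toy stand-in for a chainable proximity clause (hypothetical, not the gem's code).
class ProximitySketch
  attr_reader :terms, :clauses

  def initialize(*terms, clauses: [])
    @terms = terms.freeze
    @clauses = clauses.freeze
  end

  # Returns a new object, so earlier links in the chain are left unchanged.
  def within(distance, *near_terms, ordered: false)
    unless distance.is_a?(Integer) && distance >= 0
      raise ArgumentError, "distance must be a non-negative Integer"
    end
    raise ArgumentError, "at least one term is required" if near_terms.empty?

    step = { distance: distance, terms: near_terms, ordered: ordered }
    self.class.new(*terms, clauses: clauses + [step])
  end
end

chain = ProximitySketch.new("trail").within(1, "running").within(1, "shoes")
p chain.clauses.map { |clause| clause[:terms] }
```

Returning a fresh object from each `within` call is one way to keep partially built clauses reusable; whether the gem does the same is not stated in the changelog.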
+## [0.2.0] - 2026-03-13
+
+### Added
+
+- Rails 7.2 support and CI coverage
+- New search/query APIs: `regex_phrase`, `phrase_prefix`, `parse`,
+  grouped `aggregate_by`, and `ParadeDB::Query.regex`
+- Expanded snippet support with `with_snippets` and
+  `with_snippet_positions`
+- ParadeDB diagnostics helpers:
+  `paradedb_indexes`, `paradedb_index_segments`,
+  `paradedb_verify_index`, and `paradedb_verify_all_indexes`
+- Additional aggregation helpers:
+  `percentiles`, `histogram`, `date_histogram`, `top_hits`, and
+  `filtered`
+- Support for passing regexes into proximity queries using
+  `ParadeDB.regex_term`
+
+### Changed
+
+- Fuzzy search controls are now flattened across the relation and Arel
+  DSLs with direct `distance`, `prefix`, and
+  `transposition_cost_one` options
+- `matching_all` and `matching_any` now accept explicit `tokenizer:`
+  overrides
+- Runtime index validation now includes index-class discovery, drift
+  checks, indexed-field validation, and model helpers for
+  `paradedb_index_classes`, `paradedb_indexed_fields`,
+  `paradedb_key_field`, and `paradedb_index_name`
+- Facet and aggregation APIs now support `exact:` controls for exact
+  versus windowed execution
+- README, examples, and Arel documentation were expanded to cover the
+  newer query, snippet, aggregation, and diagnostics APIs
+
+### Fixed
+
+- Search/runtime tokenizer handling now renders tokenizer SQL safely and
+  validates unsupported tokenizer and facet combinations earlier
+
+### Removed
+
+- **BREAKING**: `near_regex` has been removed in favor of calling
+  `near` with a regex argument using `ParadeDB.regex_term`
+
 ## [0.1.0] - 2026-02-07

 ### Added
@@ -50,5 +103,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Schema dump/load round-trip for tokenizer configuration and index options
   (including `target_segment_count`)

-[Unreleased]: https://github.com/paradedb/rails-paradedb/compare/v0.
+[Unreleased]: https://github.com/paradedb/rails-paradedb/compare/v0.3.0...HEAD
+[0.3.0]: https://github.com/paradedb/rails-paradedb/releases/tag/v0.3.0
+[0.2.0]: https://github.com/paradedb/rails-paradedb/releases/tag/v0.2.0
 [0.1.0]: https://github.com/paradedb/rails-paradedb/releases/tag/v0.1.0
data/README.md
CHANGED
@@ -1,261 +1,340 @@
+<!-- ParadeDB: Postgres for Search and Analytics -->
+<h1 align="center">
+<a href="https://paradedb.com"><img src="https://github.com/paradedb/paradedb/raw/main/docs/logo/readme.svg" alt="ParadeDB"></a>
+<br>
+</h1>
+
+<p align="center">
+<b>Simple, Elastic-quality search for Postgres</b><br/>
+</p>
+
+<h3 align="center">
+<a href="https://paradedb.com">Website</a> •
+<a href="https://docs.paradedb.com">Docs</a> •
+<a href="https://paradedb.com/slack/">Community</a> •
+<a href="https://paradedb.com/blog/">Blog</a> •
+<a href="https://docs.paradedb.com/changelog/">Changelog</a>
+</h3>
+
+---
+
 # rails-paradedb

 [](https://rubygems.org/gems/rails-paradedb)
-[](https://rubygems.org/gems/rails-paradedb)
+[](https://rubygems.org/gems/rails-paradedb)
+[](https://codecov.io/gh/paradedb/rails-paradedb)
 [](https://github.com/paradedb/rails-paradedb?tab=MIT-1-ov-file#readme)
-[](https://paradedb.com/slack)
 [](https://x.com/paradedb)

-[ParadeDB](https://paradedb.com)
+The official Ruby client for [ParadeDB](https://paradedb.com), built for ActiveRecord.
+Use Elastic-quality full-text search, scoring, snippets, facets, and aggregations directly from Rails.
+
+## Features

-
+- BM25 index management in Rails migrations (`create_paradedb_index`, `remove_bm25_index`, `reindex_bm25`)
+- Chainable ActiveRecord search API (`matching_all`, `matching_any`, `term`, `phrase`, `regex`, `near`, `parse`, and more)
+- Relevance and highlighting (`with_score`, `with_snippet`, `with_snippets`, `with_snippet_positions`)
+- Facets and aggregations (`with_facets`, `facets`, `with_agg`, `facets_agg`, `aggregate_by`)
+- More Like This similarity search (`more_like_this`)
+- Arel integration for advanced query composition with native ParadeDB operators
+- Diagnostics helpers and rake tasks for index health and verification
+- Optional runtime index validation to detect missing/drifted BM25 indexes

 ## Requirements & Compatibility

-| Component |
-
-| Ruby | 3.2+
-| Rails |
-| ParadeDB | 0.
-| PostgreSQL |
+| Component | Supported |
+| ---------- | ------------------------------------------------ |
+| Ruby | 3.2+ |
+| Rails | 7.2+ |
+| ParadeDB | 0.22.0+ |
+| PostgreSQL | 15+ (PostgreSQL adapter with ParadeDB extension) |

-
+Notes:

-
+- CI runs Ruby `3.2` through `4.0` across Rails `7.2` and `8.1` on PostgreSQL `18`.
+- Schema compatibility is checked against every ParadeDB release.
+- The maintained minimum ParadeDB version is `0.22.0`; update `README.md`, `RELEASE.md`, and CI in the same PR whenever that floor changes.

-
+## Installation

 ```ruby
 gem "rails-paradedb"
 ```

-Then run:
-
 ```bash
 bundle install
 ```

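Because the changelog marks breaking changes in both 0.2.0 and 0.3.0, pinning the minor version in the Gemfile is a reasonable precaution. The sketch below uses RubyGems' own `Gem::Requirement` to show which versions a pessimistic constraint such as `~> 0.3.0` would admit. The constraint is a suggestion, not something the README prescribes, and `PINNED` and `allowed?` are hypothetical names.

```ruby
# Gem::Requirement and Gem::Version ship with RubyGems, which Ruby loads by default.
PINNED = Gem::Requirement.new("~> 0.3.0") # equivalent to ">= 0.3.0" and "< 0.4.0"

# Hypothetical helper: would this version satisfy the pinned requirement?
def allowed?(version_string)
  PINNED.satisfied_by?(Gem::Version.new(version_string))
end

%w[0.1.0 0.2.0 0.3.0 0.3.5 0.4.0].each do |candidate|
  puts "#{candidate}: #{allowed?(candidate) ? 'allowed' : 'rejected'}"
end
```

In a Gemfile the same constraint would be written as `gem "rails-paradedb", "~> 0.3.0"`.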
 ## Quick Start

-
+### Prerequisites

-
-class Product < ApplicationRecord
-  include ParadeDB::Model
-end
-```
+Make sure your Rails app uses PostgreSQL and that `pg_search` is installed in the target database:

-
-
-```ruby
-Product.search(:description).matching_all("shoes")
+```sql
+CREATE EXTENSION IF NOT EXISTS pg_search;
 ```

-
-
-- [Quick Start](examples/quickstart/quickstart.rb)
-- [Faceted Search](examples/faceted_search/faceted_search.rb)
-- [Autocomplete](examples/autocomplete/autocomplete.rb)
-- [More Like This](examples/more_like_this/more_like_this.rb)
-- [RAG](examples/rag/rag.rb)
-
-## BM25 Index
-
-Generate an index class and migration:
+### 1. Define Your Model and Index

-```
-
-
+```ruby
+class MockItem < ActiveRecord::Base
+  include ParadeDB::Model

-
+  self.table_name = "mock_items"
+  self.primary_key = "id"
+end

-
-
-  self.table_name = :products
+class MockItemIndex < ParadeDB::Index
+  self.table_name = :mock_items
   self.key_field = :id
-  self.
+  self.index_name = :search_idx
   self.fields = {
-    id:
-    description:
-
-
-
-
-
-
-    "metadata->>'color'": { tokenizer: :literal, alias: "metadata_color" },
-    metadata: { fast: true, expand_dots: false }
+    id: nil,
+    description: nil,
+    category: nil,
+    rating: nil,
+    in_stock: nil,
+    created_at: nil,
+    metadata: nil,
+    weight_range: nil
   }
 end
 ```

-
-
-- `tokenizer` for a single tokenizer entry.
-- `tokenizers` for multiple tokenizer entries on the same source field.
-- `args`, `named_args`, `filters`, `stemmer`, `alias` inside tokenizer entries.
-- field options such as `fast`, `record`, `normalizer`, `expand_dots`.
-
-Create/remove it in a migration:
+### 2. Create the BM25 Index in a Migration

 ```ruby
-class
+class AddMockItemBm25Index < ActiveRecord::Migration[7.2] # use your app's migration version
   def up
-    create_paradedb_index(
+    create_paradedb_index(MockItemIndex, if_not_exists: true)
   end

   def down
-    remove_bm25_index :
+    remove_bm25_index :mock_items, name: :search_idx, if_exists: true
   end
 end
 ```

-
-
-- `create_paradedb_index(index_class_or_name, if_not_exists: false)`
-- `replace_paradedb_index(index_class_or_name)`
-- `add_bm25_index(table, fields:, key_field:, name: nil, index_options: nil, if_not_exists: false)`
-- `remove_bm25_index(table, name: nil, if_exists: false)`
-- `reindex_bm25(table, name: nil, concurrently: false)`
-
-### Index Validation Mode
-
-Runtime index drift validation is controlled by `ParadeDB.index_validation_mode`.
-Default is `:off` (no runtime drift checks).
+### 3. Search

 ```ruby
-
-
-
+MockItem.search(:description).matching_all("running shoes")
+MockItem.search(:description).matching_any("wireless", "bluetooth")
+MockItem.search(:description).term("electronics")
 ```

-## Query
-
-For advanced options, see [ParadeDB Query Builder Documentation](https://docs.paradedb.com/documentation/query-builder/overview) and the runnable scripts in [`examples/`](examples).
+## Query API

 ```ruby
-# Full
-
-
-
-
-
-
-
-#
-
-
-
-
-
-
-
-
-
-
+# Full text
+MockItem.search(:description).matching_all("running shoes")
+MockItem.search(:description).matching_any("wireless bluetooth")
+
+# Query-time tokenizer override
+MockItem.search(:description).matching_any("running shoes", tokenizer: "whitespace")
+MockItem.search(:description).matching_any("running shoes", tokenizer: "whitespace('lowercase=false')")
+
+# Fuzzy options on match/term
+# Note: tokenizer overrides are mutually exclusive with fuzzy options.
+MockItem.search(:description).matching_any("runing shose", distance: 1)
+MockItem.search(:description).matching_all("runing", distance: 1, prefix: true)
+MockItem.search(:description).term("shose", distance: 1, transposition_cost_one: true)
+
+# Other query types
+MockItem.search(:description).phrase("running shoes", slop: 2)
+MockItem.search(:description).phrase("running shoes", tokenizer: "whitespace")
+MockItem.search(:description).phrase(%w[running shoes])
+MockItem.search(:description).regex("run.*")
+MockItem.search(:description).near(ParadeDB.proximity("running").within(3, "shoes"))
+MockItem.search(:description).near(ParadeDB.proximity("running").within(3, "shoes", ordered: true))
+MockItem.search(:description).near(ParadeDB.proximity("hiking", "running").within(2, "shoes"))
+MockItem.search(:description).near(ParadeDB.proximity("running").within(2, "shoes", "sneakers", ordered: true))
+MockItem.search(:description).near(ParadeDB.regex_term("run.*").within(3, "shoes"))
+MockItem.search(:description).near(ParadeDB.proximity("trail").within(1, "running").within(1, "shoes"))
+MockItem.search(:description).near(ParadeDB.proximity("running").within(3, "shoes"), boost: 2.0)
+MockItem.search(:description).near(ParadeDB.proximity("running").within(3, "shoes"), const: 1.0)
+MockItem.search(:description).regex_phrase("run.*", "shoes")
+MockItem.search(:description).phrase_prefix("run", "sh", max_expansion: 100)
+MockItem.search(:description).parse("running AND shoes", lenient: true)
+
+# Match-all / exists / ranges
+MockItem.search(:id).match_all
+MockItem.search(:id).exists
+MockItem.search(:rating).range(gte: 3, lt: 5)
+MockItem.search(:weight_range).range_term("(10, 12]", relation: "Intersects")

 # Similarity
-
+MockItem.more_like_this(42, fields: [:description])
 ```

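The Query API block notes that tokenizer overrides and fuzzy options cannot be combined. A minimal sketch of what such a guard could look like in plain Ruby follows; `FUZZY_KEYS` and `check_match_options!` are hypothetical names, and the gem's actual validation and error class may differ.

```ruby
# Fuzzy option names as listed in the 0.2.0 changelog entry.
FUZZY_KEYS = %i[distance prefix transposition_cost_one].freeze

# Hypothetical guard: reject option hashes that mix tokenizer: with fuzzy options.
def check_match_options!(options)
  fuzzy = options.keys & FUZZY_KEYS
  if options.key?(:tokenizer) && !fuzzy.empty?
    raise ArgumentError, "tokenizer: cannot be combined with #{fuzzy.join(', ')}"
  end

  options
end

check_match_options!(distance: 1, prefix: true) # fuzzy only: accepted
check_match_options!(tokenizer: "whitespace")   # tokenizer only: accepted
```

Failing early in Ruby, before any SQL is built, keeps the error message close to the offending call.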
-##
-
-See [BM25 Scoring](https://docs.paradedb.com/documentation/sorting/score) and [Highlighting](https://docs.paradedb.com/documentation/full-text/highlight) for full function details.
+## Scoring and Highlighting

 ```ruby
-
-
-
-
+results = MockItem.search(:description)
+  .matching_all("shoes")
+  .with_score
+  .order(search_score: :desc)
+
+MockItem.search(:description)
+  .matching_all("shoes")
+  .with_snippet(:description, start_tag: "<b>", end_tag: "</b>", max_chars: 80)
+
+MockItem.search(:description)
+  .matching_all("running")
+  .with_snippets(:description, max_chars: 15, limit: 2, offset: 0, sort_by: :position)
+
+MockItem.search(:description)
+  .matching_all("running")
+  .with_snippet_positions(:description)
 ```

-##
-
-For supported aggregate functions and JSON shapes, see [ParadeDB Aggregations Documentation](https://docs.paradedb.com/documentation/aggregates/overview).
-
-`with_facets(...)` requires:
-
-- an existing ParadeDB predicate
-- `.order(...)`
-- `.limit(...)`
+## Facets and Aggregations

 ```ruby
-# Rows + facets
-relation =
+# Rows + facets (requires order + limit)
+relation = MockItem.search(:description)
+  .matching_all("shoes")
   .with_facets(:category, size: 10)
   .order(:id)
   .limit(10)
+
 rows = relation.to_a
 facets = relation.facets

-# Facets
-
-
-
-
-
-
-
-
-
+# Facets-only aggregate
+MockItem.search(:description).matching_all("shoes").facets(:category)
+
+# Named aggregations
+MockItem.search(:description).matching_all("shoes").facets_agg(
+  docs: ParadeDB::Aggregations.value_count(:id),
+  avg_rating: ParadeDB::Aggregations.avg(:rating)
+)
+
+# Window aggregations + rows
+MockItem.search(:description).matching_all("shoes").with_agg(
+  exact: false,
+  docs: ParadeDB::Aggregations.value_count(:id),
+  stats: ParadeDB::Aggregations.stats(:rating)
+).order(:id).limit(10)
+
+# Grouped aggregations
+MockItem.search(:id).match_all.aggregate_by(
+  :category,
+  docs: ParadeDB::Aggregations.value_count(:id)
+)
 ```

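As a mental model for the grouped aggregation at the end of the block, the plain-Ruby sketch below counts in-memory rows per category. The `Row` struct, `SAMPLE_ROWS`, and `count_by_category` are invented for illustration; the real query runs inside ParadeDB, and the shape of the value that `aggregate_by` returns is not shown in this diff.

```ruby
Row = Struct.new(:id, :category)

SAMPLE_ROWS = [
  Row.new(1, "footwear"),
  Row.new(2, "footwear"),
  Row.new(3, "electronics")
].freeze

# In-memory analogue of grouping by :category and counting rows whose id is present.
def count_by_category(rows)
  rows.group_by(&:category).transform_values do |group|
    group.count { |row| !row.id.nil? }
  end
end

p count_by_category(SAMPLE_ROWS)
```

The database version avoids loading rows into Ruby at all, which is the main reason to prefer it for large result sets.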
-
+If you group by text/JSON fields, index those fields using `:literal` or `:literal_normalized`.

-
+## ActiveRecord and Arel Composition
+
+Use ParadeDB conditions with normal ActiveRecord scopes:

 ```ruby
-
-
-
-
-
+MockItem.search(:description)
+  .matching_all("shoes")
+  .where(in_stock: true)
+  .where(MockItem.arel_table[:rating].gteq(4))
+  .order(created_at: :desc)
 ```

-
+For advanced SQL composition, ParadeDB operators are also available through Arel predications:

-
-
+```ruby
+t = MockItem.arel_table
+MockItem.where(t[:description].pdb_match("running shoes"))
+```
+
+## Diagnostics Helpers

-
+Ruby helpers:

 ```ruby
-
+ParadeDB.paradedb_indexes
+ParadeDB.paradedb_index_segments("search_idx")
+ParadeDB.paradedb_verify_index("search_idx", sample_rate: 0.1)
+ParadeDB.paradedb_verify_all_indexes(index_pattern: "search_idx")
 ```

-
+Availability depends on the installed `pg_search` version.

-
+Repository development tasks (from this repo's `Rakefile`):

-
+```bash
+rake paradedb:diagnostics:indexes
+rake "paradedb:diagnostics:index_segments[search_idx]"
+rake "paradedb:diagnostics:verify_index[search_idx]" SAMPLE_RATE=0.1
+rake paradedb:diagnostics:verify_all_indexes INDEX_PATTERN=search_idx
+```
+
+## Index Validation

-
+By default, index validation is disabled. You can enable runtime checks globally:

-
+```ruby
+# config/initializers/paradedb.rb
+ParadeDB.index_validation_mode = :warn # :warn, :raise, or :off
+```

-
+When enabled, `rails-paradedb` validates that the expected BM25 index exists and can raise
+`ParadeDB::IndexDriftError` or `ParadeDB::IndexClassNotFoundError` depending on mode.

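The three modes can be read as a dispatch on what to do when a check fails. The sketch below is a stand-alone illustration: `handle_validation_failure`, `VALID_MODES`, and `SketchDriftError` are hypothetical, and the gem's real behavior (including which of its error classes it raises) is only described at the level of the README text.

```ruby
require "logger"

# Hypothetical error class standing in for the gem's own drift error.
class SketchDriftError < StandardError; end

VALID_MODES = %i[off warn raise].freeze

# Hypothetical dispatcher: decide what a failed index check does in each mode.
def handle_validation_failure(mode, message, logger: Logger.new($stderr))
  case mode
  when :off   then nil
  when :warn  then logger.warn(message)
  when :raise then raise SketchDriftError, message
  else
    raise ArgumentError, "unknown mode #{mode.inspect}, expected one of #{VALID_MODES.inspect}"
  end
end

handle_validation_failure(:warn, "BM25 index search_idx differs from MockItemIndex")
```

A common pattern is to use the warning mode in production and the raising mode in test and CI, so drift is caught before deploy without breaking live requests; that split is a suggestion, not something the README states.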
-
-- Search terms use Arel's `Nodes.build_quoted()` for type-safe SQL generation
-- This prevents SQL injection while maintaining compatibility with ParadeDB's full-text operators
+## Common Errors

-
+### "No search field set. Call .search(column) first."

-
+```ruby
+# ❌ Missing .search(...)
+MockItem.matching_all("shoes")

-
-
-
+# ✅ Start with .search(column)
+MockItem.search(:description).matching_all("shoes")
+```

-
+### "with_facets requires ORDER BY and LIMIT"

 ```ruby
-#
-
-
-#
+# ❌ Missing order/limit
+MockItem.search(:description).matching_all("shoes").with_facets(:category).to_a
+
+# ✅ Include both
+relation = MockItem.search(:description)
+  .matching_all("shoes")
+  .with_facets(:category)
+  .order(:id)
+  .limit(10)
+relation.to_a
+relation.facets
 ```

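The `with_facets` error suggests a guard that runs before the query is executed. A toy version follows, assuming a relation-like object that exposes `order_values` and `limit_value` (the reader names ActiveRecord relations use). `FacetGuardSketch` and `FakeRelation` are hypothetical and not the gem's code.

```ruby
# Minimal stand-in for a relation; only the two readers the guard needs.
FakeRelation = Struct.new(:order_values, :limit_value)

module FacetGuardSketch
  # Returns the relation when it has both an ORDER BY and a LIMIT, raises otherwise.
  def self.check!(relation)
    missing_order = relation.order_values.empty?
    missing_limit = relation.limit_value.nil?
    return relation unless missing_order || missing_limit

    raise ArgumentError, "with_facets requires ORDER BY and LIMIT"
  end
end

FacetGuardSketch.check!(FakeRelation.new([:id], 10)) # passes
```

Checking both conditions in one place means the caller sees the same message whichever of the two is missing.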
+### "search(:field) is not indexed"
+
+```ruby
+# ❌ Field not in your ParadeDB::Index fields hash
+MockItem.search(:title).matching_all("shoes")
+
+# ✅ Add :title to the index definition, then migrate
+```
+
+## Security
+
+`rails-paradedb` builds SQL through Arel nodes and quoted literals (`Arel::Nodes.build_quoted`)
+rather than manual string interpolation. Tokenizer expressions are validated and search operators are
+rendered through typed nodes, with unit and integration coverage for quoting and edge cases.
+
+## Examples
+
+- [Quick Start](examples/quickstart/quickstart.rb)
+- [Faceted Search](examples/faceted_search/faceted_search.rb)
+- [Autocomplete](examples/autocomplete/autocomplete.rb)
+- [More Like This](examples/more_like_this/more_like_this.rb)
+- [Hybrid RRF](examples/hybrid_rrf/hybrid_rrf.rb)
+- [RAG](examples/rag/rag.rb)
+- [Examples README](examples/README.md)
+
 ## Documentation

 - **ParadeDB Official Docs**: <https://docs.paradedb.com>
@@ -263,19 +342,27 @@ Product.search(:description).matching_all(user_query)

 ## Contributing

-
+See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, test commands, linting, and PR workflow.

 ## Support

-If you're missing a feature or
+If you're missing a feature or found a bug, open a
 [GitHub Issue](https://github.com/paradedb/rails-paradedb/issues/new/choose).

-
+For community support:
+
+- Join the [ParadeDB Slack Community](https://paradedb.com/slack)
+- Ask in [ParadeDB Discussions](https://github.com/paradedb/paradedb/discussions)
+
+For commercial support, contact [sales@paradedb.com](mailto:sales@paradedb.com).
+
+## Acknowledgments

-
-- Ask for help on our [GitHub Discussions](https://github.com/paradedb/paradedb/discussions)
+We would like to thank the following members of the community for their valuable feedback and reviews during the development of this package:

-
+- [Eric Barendt](https://github.com/ebarendt) - Engineering at Modern Treasury
+- [Matthew Higgins](https://github.com/matthuhiggins) - Engineering at Modern Treasury
+- [Patrick Schmitz](https://github.com/bullfight) - Engineering at Modern Treasury

 ## License
