rails-paradedb 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: aca8f9d891571b983de00268880360903e8e391e95e8272e538f67d033092c57
-   data.tar.gz: 6194f019217d69935cbe6a3e1408514f45419cfe94e359240de362b0ecb01639
+   metadata.gz: be1908cb2b7b8e9062ac3a567ac2cb27ee4e90a62e5137ff09aa00345e805576
+   data.tar.gz: 0ee0f049473df3ab660b86a84f494885fc95f1021c7ab258c57d9278083f993e
  SHA512:
-   metadata.gz: 2e9577cbb0203e5b053edeffa1cba33765b8b74c1f8245f0988022d5069a51f667bc9d19bedfeb52f26187c09b3ae32ab43e0f2bf7ad07eba4513b05bf4f601f
-   data.tar.gz: 155fc1086655a107ee6d28704af95b02ed8fc181f9be8d2ec22df34b8e5d47ef944ab89434bdf2e4a7323aec7411787dd09cd01050f6f388b8bf8c6138698975
+   metadata.gz: 9038e6a1fa469c4e0de2d0ba10949074d5c995943d655216626590bca0f912a9e9381f5054ae5f647df7d699cef56586b1491e837d390908038c3a7f2142315b
+   data.tar.gz: 8fbbec4ccec76d6c1591e773dc72e06975b5b176ec7cc91ffcfe09200ccad9b892ecf2ab8c616e95190b180eed31d03df54e62a7b5d90d30b43df070b7eee48c
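The digests in `checksums.yaml` cover the `metadata.gz` and `data.tar.gz` archives packed inside the `.gem` file. A minimal sketch of checking such digests with Ruby's standard `digest` library follows. The helper name `checksums_match?` and the in-memory payload are assumptions for illustration; a real check would read the archive bytes with `File.binread` and compare them to the values listed above.

```ruby
require "digest"

# Hypothetical helper: true only when both hex digests of +bytes+ match.
def checksums_match?(bytes, sha256:, sha512:)
  Digest::SHA256.hexdigest(bytes) == sha256 &&
    Digest::SHA512.hexdigest(bytes) == sha512
end

# Stand-in for the archive contents (assumption: a real check would use
# File.binread("data.tar.gz") here).
payload = "example archive bytes"

# Stand-in for the values recorded in checksums.yaml.
expected = {
  sha256: Digest::SHA256.hexdigest(payload),
  sha512: Digest::SHA512.hexdigest(payload)
}

puts checksums_match?(payload, **expected)       # true
puts checksums_match?(payload + "x", **expected) # false
```

Both digests are compared so that a mismatch in either one fails the check.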
data/CHANGELOG.md CHANGED
@@ -1,13 +1,53 @@
  # Changelog
- <!-- markdownlint-disable MD024 -->
 
- All notable changes to this project will be documented in this file.
-
- The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
- and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
  ## [Unreleased]
 
+ ## [0.2.0] - 2026-03-13
+
+ ### Added
+
+ - Rails 7.2 support and CI coverage
+ - New search/query APIs: `regex_phrase`, `phrase_prefix`, `parse`,
+   grouped `aggregate_by`, and `ParadeDB::Query.regex`
+ - Expanded snippet support with `with_snippets` and
+   `with_snippet_positions`
+ - ParadeDB diagnostics helpers:
+   `paradedb_indexes`, `paradedb_index_segments`,
+   `paradedb_verify_index`, and `paradedb_verify_all_indexes`
+ - Additional aggregation helpers:
+   `percentiles`, `histogram`, `date_histogram`, `top_hits`, and
+   `filtered`
+ - Support for passing regexes into proximity queries using
+   `ParadeDB.regex_term`
+
+ ### Changed
+
+ - Fuzzy search controls are now flattened across the relation and Arel
+   DSLs with direct `distance`, `prefix`, and
+   `transposition_cost_one` options
+ - `matching_all` and `matching_any` now accept explicit `tokenizer:`
+   overrides
+ - Runtime index validation now includes index-class discovery, drift
+   checks, indexed-field validation, and model helpers for
+   `paradedb_index_classes`, `paradedb_indexed_fields`,
+   `paradedb_key_field`, and `paradedb_index_name`
+ - Facet and aggregation APIs now support `exact:` controls for exact
+   versus windowed execution
+ - README, examples, and Arel documentation were expanded to cover the
+   newer query, snippet, aggregation, and diagnostics APIs
+
+ ### Fixed
+
+ - Search/runtime tokenizer handling now renders tokenizer SQL safely and
+   validates unsupported tokenizer and facet combinations earlier
+
+ ### Removed
+
+ - **BREAKING**: `near_regex` has been removed in favor of calling
+   `near` with a regex argument using `ParadeDB.regex_term`
+
  ## [0.1.0] - 2026-02-07
 
  ### Added
@@ -50,5 +90,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - Schema dump/load round-trip for tokenizer configuration and index options
    (including `target_segment_count`)
 
- [Unreleased]: https://github.com/paradedb/rails-paradedb/compare/v0.1.0...HEAD
+ [Unreleased]: https://github.com/paradedb/rails-paradedb/compare/v0.2.0...HEAD
+ [0.2.0]: https://github.com/paradedb/rails-paradedb/releases/tag/v0.2.0
  [0.1.0]: https://github.com/paradedb/rails-paradedb/releases/tag/v0.1.0
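Because 0.2.0 removes `near_regex`, the version constraint in a Gemfile decides whether an update picks up the breaking release. A small sketch using RubyGems' own `Gem::Requirement` and `Gem::Version` classes follows. The two constraint strings are illustrative choices, not recommendations taken from the changelog.

```ruby
require "rubygems"

new_release = Gem::Version.new("0.2.0")

# "~> 0.1.0" allows >= 0.1.0 and < 0.2, so the breaking release is excluded.
patch_only = Gem::Requirement.new("~> 0.1.0")

# "~> 0.1" allows >= 0.1 and < 1.0, so 0.2.0 would be picked up on update.
minor_allowed = Gem::Requirement.new("~> 0.1")

puts patch_only.satisfied_by?(new_release)    # false
puts minor_allowed.satisfied_by?(new_release) # true
```

Under Semantic Versioning, 0.x releases may change the public API in a minor bump, which is why the tighter constraint behaves differently here.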
data/README.md CHANGED
@@ -3,280 +3,214 @@
  [![Gem Version](https://img.shields.io/gem/v/rails-paradedb)](https://rubygems.org/gems/rails-paradedb)
  [![CI](https://github.com/paradedb/rails-paradedb/actions/workflows/ci.yml/badge.svg)](https://github.com/paradedb/rails-paradedb/actions/workflows/ci.yml)
  [![License](https://img.shields.io/github/license/paradedb/rails-paradedb?color=blue)](https://github.com/paradedb/rails-paradedb?tab=MIT-1-ov-file#readme)
- [![Slack URL](https://img.shields.io/badge/Join%20Slack-purple?logo=slack&link=https%3A%2F%2Fjoin.slack.com%2Ft%2Fparadedbcommunity%2Fshared_invite%2Fzt-32abtyjg4-yoYoi~RPh9MSW8tDbl0BQw)](https://join.slack.com/t/paradedbcommunity/shared_invite/zt-32abtyjg4-yoYoi~RPh9MSW8tDbl0BQw)
- [![X URL](https://img.shields.io/twitter/url?url=https%3A%2F%2Ftwitter.com%2Fparadedb&label=Follow%20%40paradedb)](https://x.com/paradedb)
 
- [ParadeDB](https://paradedb.com) simple, Elastic-quality search for Postgres **BM25 full-text** integration for ActiveRecord.
+ ActiveRecord integration for [ParadeDB](https://paradedb.com): BM25 full-text search, scoring, snippets, facets, and aggregations in PostgreSQL.
 
- For complete ParadeDB documentation, see [docs.paradedb.com](https://docs.paradedb.com/).
+ ParadeDB docs: <https://docs.paradedb.com>
 
- ## Requirements & Compatibility
+ ## Requirements
 
- | Component | Version |
- |------------|----------------------------------|
- | Ruby | 3.2+ |
- | Rails | 8.1+ |
- | ParadeDB | 0.21.0+ |
- | PostgreSQL | 17+ (with ParadeDB extension) |
-
- **Note**: This gem requires ActiveRecord with PostgreSQL. The DSL and Arel layer delegate SQL value quoting to `ActiveRecord::Base.connection.quote` for type safety and proper escaping.
+ - Ruby 3.2+
+ - Rails 7.2+
+ - PostgreSQL 17+ with `pg_search` (ParadeDB)
 
  ## Installation
 
- Add to your Gemfile:
-
  ```ruby
  gem "rails-paradedb"
  ```
 
- Then run:
-
  ```bash
  bundle install
  ```
 
  ## Quick Start
 
- Enable ParadeDB on a model:
-
  ```ruby
- class Product < ApplicationRecord
+ class MockItem < ActiveRecord::Base
    include ParadeDB::Model
+
+   self.table_name = "mock_items"
+   self.primary_key = "id"
+   self.has_paradedb_index = true
  end
  ```
 
- Search with a simple query:
-
  ```ruby
- Product.search(:description).matching_all("shoes")
- ```
-
- Check out some examples:
-
- - [Quick Start](examples/quickstart/quickstart.rb)
- - [Faceted Search](examples/faceted_search/faceted_search.rb)
- - [Autocomplete](examples/autocomplete/autocomplete.rb)
- - [More Like This](examples/more_like_this/more_like_this.rb)
- - [RAG](examples/rag/rag.rb)
-
- ## BM25 Index
-
- Generate an index class and migration:
-
- ```bash
- rails g parade_db:index Product description category rating
+ MockItem.search(:description).matching_all("running shoes")
+ MockItem.search(:description).matching_any("wireless", "bluetooth")
+ MockItem.search(:description).term("electronics")
  ```
 
- Or define one manually:
+ ## Index Definition
 
  ```ruby
- class ProductIndex < ParadeDB::Index
-   self.table_name = :products
+ class MockItemIndex < ParadeDB::Index
+   self.table_name = :mock_items
    self.key_field = :id
-   self.index_options = { target_segment_count: 17 }
+   self.index_name = :search_idx
    self.fields = {
-     id: {},
-     description: {
-       tokenizers: [
-         { tokenizer: :literal },
-         { tokenizer: :simple, alias: "description_simple", filters: [:lowercase] }
-       ]
-     },
-     category: { tokenizer: :literal, alias: "category" },
-     "metadata->>'color'": { tokenizer: :literal, alias: "metadata_color" },
-     metadata: { fast: true, expand_dots: false }
+     id: nil,
+     description: nil,
+     category: nil,
+     rating: nil,
+     in_stock: nil,
+     created_at: nil,
+     metadata: nil,
+     weight_range: nil
    }
  end
  ```
 
- Field config supports:
-
- - `tokenizer` for a single tokenizer entry.
- - `tokenizers` for multiple tokenizer entries on the same source field.
- - `args`, `named_args`, `filters`, `stemmer`, `alias` inside tokenizer entries.
- - field options such as `fast`, `record`, `normalizer`, `expand_dots`.
+ For text or JSON fields you plan to use in Top K queries, facets, grouped
+ aggregations, or `top_hits` docvalue fields, use `:literal` or
+ `:literal_normalized`.
 
- Create/remove it in a migration:
+ Create in migration:
 
  ```ruby
- class AddProductBm25Index < ActiveRecord::Migration[8.1]
+ class AddMockItemBm25Index < ActiveRecord::Migration[8.1]
    def up
-     create_paradedb_index(ProductIndex, if_not_exists: true)
+     create_paradedb_index(MockItemIndex, if_not_exists: true)
    end
 
    def down
-     remove_bm25_index :products, name: :products_bm25_idx, if_exists: true
+     remove_bm25_index :mock_items, name: :search_idx, if_exists: true
    end
  end
  ```
 
- Available migration helpers:
-
- - `create_paradedb_index(index_class_or_name, if_not_exists: false)`
- - `replace_paradedb_index(index_class_or_name)`
- - `add_bm25_index(table, fields:, key_field:, name: nil, index_options: nil, if_not_exists: false)`
- - `remove_bm25_index(table, name: nil, if_exists: false)`
- - `reindex_bm25(table, name: nil, concurrently: false)`
-
- ### Index Validation Mode
-
- Runtime index drift validation is controlled by `ParadeDB.index_validation_mode`.
- Default is `:off` (no runtime drift checks).
-
- ```ruby
- ParadeDB.index_validation_mode = :warn # log drift warnings
- ParadeDB.index_validation_mode = :raise # raise ParadeDB::IndexDriftError on drift
- ParadeDB.index_validation_mode = :off # disable drift checks (default)
- ```
-
- ## Query Types
-
- For advanced options, see [ParadeDB Query Builder Documentation](https://docs.paradedb.com/documentation/query-builder/overview) and the runnable scripts in [`examples/`](examples).
+ ## Query API
 
  ```ruby
  # Full-text
- Product.search(:description).matching_all("running shoes")
- Product.search(:description).matching_any("wireless", "bluetooth")
- Product.search(:description).phrase("running shoes", slop: 2)
- Product.search(:description).fuzzy("runing", distance: 2, prefix: true, boost: 1.5)
- Product.search(:description).regex("run.*")
- Product.search(:description).parse("running AND shoes", lenient: true)
-
- # Exact token matching
- Product.search(:category).term("electronics", boost: 2.0)
- Product.search(:category).term_set("electronics", "audio")
-
- # Other predicates
- Product.search(:description).excluding("cheap", "budget")
- Product.search(:description).near("running", "shoes", distance: 3)
- Product.search(:description).phrase_prefix("run", "sh")
- Product.search(:id).match_all
- Product.search(:id).exists
- Product.search(:rating).range(gte: 3, lt: 5)
-
- # Similarity
- Product.more_like_this(42, fields: [:description])
+ MockItem.search(:description).matching_all("running shoes")
+ MockItem.search(:description).matching_any("wireless bluetooth")
+
+ # Query-time tokenizer override
+ MockItem.search(:description).matching_any("running shoes", tokenizer: "whitespace")
+ MockItem.search(:description).matching_any("running shoes", tokenizer: "whitespace('lowercase=false')")
+
+ # Fuzzy options on match/term
+ MockItem.search(:description).matching_any("runing shose", distance: 1)
+ MockItem.search(:description).matching_all("runing", distance: 1, prefix: true)
+ MockItem.search(:description).term("shose", distance: 1, transposition_cost_one: true)
+
+ # Other query types
+ MockItem.search(:description).phrase("running shoes", slop: 2)
+ MockItem.search(:description).phrase("running shoes", tokenizer: "whitespace")
+ MockItem.search(:description).phrase(%w[running shoes])
+ MockItem.search(:description).regex("run.*")
+ MockItem.search(:description).near("running", anchor: "shoes", distance: 3)
+ MockItem.search(:description).near("running", anchor: "shoes", distance: 3, ordered: true)
+ MockItem.search(:description).near(ParadeDB.regex_term("run.*"), anchor: "shoes", distance: 3)
+ MockItem.search(:description).near("running", "trail", anchor: "shoes", distance: 3)
+ MockItem.search(:description).near(ParadeDB.regex_term("run.*"), "trail", anchor: "shoes", distance: 3)
+ MockItem.search(:description).regex_phrase("run.*", "shoes")
+ MockItem.search(:description).phrase_prefix("run", "sh")
+ MockItem.search(:description).phrase_prefix("run", "sh", max_expansion: 100)
+ MockItem.search(:description).parse("running AND shoes", lenient: true)
+ MockItem.search(:description).parse("running shoes", conjunction_mode: true)
+
+ MockItem.search(:id).match_all
+ MockItem.search(:id).exists
+ MockItem.search(:rating).range(gte: 3, lt: 5)
+ MockItem.search(:weight_range).range_term("(10, 12]", relation: "Intersects")
+
+ MockItem.more_like_this(42, fields: [:description])
  ```
 
- ## Annotations
-
- See [BM25 Scoring](https://docs.paradedb.com/documentation/sorting/score) and [Highlighting](https://docs.paradedb.com/documentation/full-text/highlight) for full function details.
+ ## Scoring and Highlighting
 
  ```ruby
- Product.search(:description).matching_all("shoes").with_score
- Product.search(:description).matching_all("shoes").with_snippet(:description, start_tag: "<b>", end_tag: "</b>", max_chars: 80)
- Product.search(:description).matching_all("running").with_snippets(:description, max_chars: 15, limit: 2, offset: 0, sort_by: :position)
- Product.search(:description).matching_all("running").with_snippet_positions(:description)
+ results = MockItem.search(:description)
+   .matching_all("shoes")
+   .with_score
+   .order(search_score: :desc)
+
+ MockItem.search(:description)
+   .matching_all("shoes")
+   .with_snippet(:description, start_tag: "<b>", end_tag: "</b>", max_chars: 80)
+
+ MockItem.search(:description)
+   .matching_all("running")
+   .with_snippets(:description, max_chars: 15, limit: 2, offset: 0, sort_by: :position)
+
+ MockItem.search(:description)
+   .matching_all("running")
+   .with_snippet_positions(:description)
  ```
 
- ## Faceted Search
-
- For supported aggregate functions and JSON shapes, see [ParadeDB Aggregations Documentation](https://docs.paradedb.com/documentation/aggregates/overview).
-
- `with_facets(...)` requires:
-
- - an existing ParadeDB predicate
- - `.order(...)`
- - `.limit(...)`
+ ## Facets and Aggregations
 
  ```ruby
- # Rows + facets
- relation = Product.search(:description).matching_all("shoes")
+ # Rows + facets (requires order + limit)
+ relation = MockItem.search(:description)
+   .matching_all("shoes")
    .with_facets(:category, size: 10)
    .order(:id)
    .limit(10)
  rows = relation.to_a
  facets = relation.facets
 
- # Facets only
- facets_only = Product.search(:description).matching_all("shoes")
-   .facets(:category)
-
- # Named aggregation helpers
- aggs = Product.search(:description).matching_all("shoes")
-   .facets_agg(
-     docs: ParadeDB::Aggregations.value_count(:id),
-     avg_rating: ParadeDB::Aggregations.avg(:rating)
-   )
- ```
+ # Non-exact window facets
+ relation = MockItem.search(:description)
+   .matching_all("shoes")
+   .with_facets(:category, size: 10, exact: false)
+   .order(:id)
+   .limit(10)
 
- ## ActiveRecord Integration
+ # Facets-only aggregate
+ MockItem.search(:description).matching_all("shoes").facets(:category)
 
- ParadeDB scopes compose with regular ActiveRecord chaining:
+ # Named aggregations
+ MockItem.search(:description).matching_all("shoes").facets_agg(
+   docs: ParadeDB::Aggregations.value_count(:id),
+   avg_rating: ParadeDB::Aggregations.avg(:rating)
+ )
 
- ```ruby
- Product.search(:description).matching_all("running")
-   .search(:category).term("footwear")
-   .where(in_stock: true)
-   .order(:id)
-   .limit(10)
+ # Non-exact window named aggregations
+ MockItem.search(:description).matching_all("shoes").with_agg(
+   exact: false,
+   docs: ParadeDB::Aggregations.value_count(:id)
+ ).order(:id).limit(10)
  ```
 
- ### Method Name Conflicts
+ ## Diagnostics Helpers
 
- This gem defines a model class method named `.search`.
- If your application already defines `.search`, rails-paradedb will **not** override it.
-
- Use `.paradedb_search` instead:
+ Ruby helpers:
 
  ```ruby
- Product.paradedb_search(:description).matching_all("shoes")
+ ParadeDB.paradedb_indexes
+ ParadeDB.paradedb_index_segments("search_idx")
+ ParadeDB.paradedb_verify_index("search_idx", sample_rate: 0.1)
+ ParadeDB.paradedb_verify_all_indexes(index_pattern: "search_idx")
  ```
 
- ## Arel Layer
-
- See the dedicated Arel guide: [`lib/parade_db/arel/README.md`](lib/parade_db/arel/README.md).
-
- ## Security
-
- ### SQL Injection Protection
-
- rails-paradedb uses **ActiveRecord's quoting** for all search terms:
-
- **Quoting Strategy:**
-
- - All user input is quoted via `ActiveRecord::Base.connection.quote`
- - Search terms use Arel's `Nodes.build_quoted()` for type-safe SQL generation
- - This prevents SQL injection while maintaining compatibility with ParadeDB's full-text operators
-
- **Implementation Details:**
-
- All values flow through ActiveRecord's connection adapter quoting, which handles:
+ Rake tasks:
 
- - String escaping (`'` → `''`)
- - Type coercion (booleans, numbers)
- - NULL handling
-
- **Safety Guarantee:**
-
- ```ruby
- # Even malicious input is safely escaped
- user_query = "'; DROP TABLE products; --"
- Product.search(:description).matching_all(user_query)
- # The query is escaped and treated as a literal search term
+ ```bash
+ rake paradedb:diagnostics:indexes
+ rake "paradedb:diagnostics:index_segments[search_idx]"
+ rake "paradedb:diagnostics:verify_index[search_idx]" SAMPLE_RATE=0.1
+ rake paradedb:diagnostics:verify_all_indexes INDEX_PATTERN=search_idx
  ```
 
- ## Documentation
-
- - **ParadeDB Official Docs**: <https://docs.paradedb.com>
- - **ParadeDB Website**: <https://paradedb.com>
-
- ## Contributing
-
- Contribution and local development workflow live in [`CONTRIBUTING.md`](CONTRIBUTING.md).
+ Note: availability depends on your installed `pg_search` version.
 
- ## Support
+ ## Examples
 
- If you're missing a feature or have found a bug, please open a
- [GitHub Issue](https://github.com/paradedb/rails-paradedb/issues/new/choose).
-
- To get community support, you can:
+ - [Quick Start](examples/quickstart/quickstart.rb)
+ - [Faceted Search](examples/faceted_search/faceted_search.rb)
+ - [Autocomplete](examples/autocomplete/autocomplete.rb)
+ - [More Like This](examples/more_like_this/more_like_this.rb)
+ - [Hybrid RRF](examples/hybrid_rrf/hybrid_rrf.rb)
+ - [RAG](examples/rag/rag.rb)
 
- - Post a question in the [ParadeDB Slack Community](https://join.slack.com/t/paradedbcommunity/shared_invite/zt-32abtyjg4-yoYoi~RPh9MSW8tDbl0BQw)
- - Ask for help on our [GitHub Discussions](https://github.com/paradedb/paradedb/discussions)
+ ## Contributing
 
- If you need commercial support, please [contact the ParadeDB team](mailto:sales@paradedb.com).
+ See [CONTRIBUTING.md](CONTRIBUTING.md).
 
  ## License
 
- rails-paradedb is licensed under the [MIT License](LICENSE).
+ MIT
@@ -3,6 +3,19 @@
  module ParadeDB
    # Typed helpers for building agg JSON payloads passed to pdb.agg(...).
    module Aggregations
+     FilteredSpec = Struct.new(:spec, :agg_filter, keyword_init: true) do
+       # Backward-compatible reader for code that accessed `filtered_spec.filter`.
+       alias filter agg_filter
+     end
+     FieldTermFilter = Struct.new(
+       :field,
+       :term,
+       :distance,
+       :prefix,
+       :transposition_cost_one,
+       keyword_init: true
+     )
+
      TERMS_ORDER = {
        count_desc: { "_count" => "desc" },
        count_asc: { "_count" => "asc" },
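The `FilteredSpec` struct added in this hunk combines `keyword_init: true` with an `alias` inside the `Struct.new` block. A standalone sketch of how that behaves follows. It copies the struct definition outside the gem's namespace, and the stored values are placeholders rather than the objects the gem actually passes.

```ruby
# Standalone copy of the struct from the hunk, defined at top level for
# illustration instead of inside ParadeDB::Aggregations.
FilteredSpec = Struct.new(:spec, :agg_filter, keyword_init: true) do
  alias filter agg_filter
end

# Placeholder values (assumption): the gem stores a normalized aggregation
# hash and a filter object here.
wrapped = FilteredSpec.new(
  spec: { "avg" => { "field" => "rating" } },
  agg_filter: :placeholder_filter
)

puts wrapped.agg_filter.inspect # :placeholder_filter
puts wrapped.filter.inspect     # :placeholder_filter

# keyword_init: true means positional arguments are rejected.
begin
  FilteredSpec.new({ "avg" => {} }, :placeholder_filter)
rescue ArgumentError => e
  puts "positional arguments rejected: #{e.class}"
end
```

The alias keeps `filter` working as a reader while the member itself is named `agg_filter`.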
@@ -18,7 +31,7 @@ module ParadeDB
 
        specs.each_with_object({}) do |(alias_name, spec), payload|
          alias_key = normalize_alias(alias_name)
-         payload[alias_key] = normalize_spec(spec)
+         payload[alias_key] = normalize_named_spec(spec)
        end
      end
 
@@ -132,11 +145,43 @@ module ParadeDB
        }
      end
 
+     def top_hits(size: nil, from: nil, sort: nil, docvalue_fields: nil)
+       payload = {}
+       payload["size"] = normalize_non_negative_integer(size, "size") unless size.nil?
+       payload["from"] = normalize_non_negative_integer(from, "from") unless from.nil?
+       payload["sort"] = normalize_top_hits_sort(sort) unless sort.nil?
+       payload["docvalue_fields"] = normalize_docvalue_fields(docvalue_fields) unless docvalue_fields.nil?
+       { "top_hits" => payload }
+     end
+
+     def filtered(spec, filter: nil, field: nil, term: nil, distance: nil, prefix: nil, transposition_cost_one: nil)
+       normalized_spec = normalize_spec(spec)
+       normalized_filter = normalize_filter(
+         filter: filter,
+         field: field,
+         term: term,
+         distance: distance,
+         prefix: prefix,
+         transposition_cost_one: transposition_cost_one
+       )
+       FilteredSpec.new(spec: normalized_spec, agg_filter: normalized_filter)
+     end
+
      def metric(name, field)
        { name => { "field" => normalize_field(field) } }
      end
      private_class_method :metric
 
+     def normalize_named_spec(spec)
+       case spec
+       when FilteredSpec
+         FilteredSpec.new(spec: normalize_spec(spec.spec), agg_filter: spec.agg_filter)
+       else
+         normalize_spec(spec)
+       end
+     end
+     private_class_method :normalize_named_spec
+
      def normalize_alias(alias_name)
        value =
          case alias_name
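The new `top_hits` helper only adds a key to the payload when the matching option is given, and hands sort handling to `normalize_top_hits_sort`, which is added further down in this diff. A minimal standalone sketch of the resulting payload shape follows. `TopHitsSketch` and its private-style helpers are hypothetical stand-ins, the integer check is an assumed reading of `normalize_non_negative_integer` (its body is not part of the diff), and field names are simply converted with `to_s`.

```ruby
# Hypothetical stand-in for illustration; not the gem's ParadeDB::Aggregations.
module TopHitsSketch
  module_function

  # Mirrors the shape of the new helper: omitted options stay out of the
  # payload instead of being sent as nil.
  def top_hits(size: nil, from: nil, sort: nil, docvalue_fields: nil)
    payload = {}
    payload["size"] = non_negative_integer(size, "size") unless size.nil?
    payload["from"] = non_negative_integer(from, "from") unless from.nil?
    payload["sort"] = Array(sort).map { |entry| sort_entry(entry) } unless sort.nil?
    payload["docvalue_fields"] = Array(docvalue_fields).map(&:to_s) unless docvalue_fields.nil?
    { "top_hits" => payload }
  end

  # Assumed behaviour: the diff calls normalize_non_negative_integer without
  # showing its body.
  def non_negative_integer(value, name)
    return value if value.is_a?(Integer) && value >= 0

    raise ArgumentError, "#{name} must be a non-negative integer"
  end

  # Each sort entry is a single-pair Hash such as { rating: :desc }.
  def sort_entry(entry)
    unless entry.is_a?(Hash) && entry.size == 1
      raise ArgumentError, "sort entries must be single-pair Hash values"
    end

    field, direction = entry.first
    value = direction.to_s
    raise ArgumentError, "sort direction must be 'asc' or 'desc'" unless %w[asc desc].include?(value)

    { field.to_s => value }
  end
end

p TopHitsSketch.top_hits(size: 3, sort: [{ rating: :desc }], docvalue_fields: %i[id rating])
```

One detail worth knowing: `Array({ rating: :desc })` returns `[[:rating, :desc]]`, so a bare Hash passed as `sort:` fails the Hash check, both in this sketch and, as the diff reads, in `normalize_top_hits_sort`. Passing an array of single-pair hashes avoids that.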
@@ -166,6 +211,32 @@ module ParadeDB
      end
      private_class_method :normalize_spec
 
+     def normalize_filter(filter:, field:, term:, distance:, prefix:, transposition_cost_one:)
+       if filter
+         if !field.nil? || !term.nil?
+           raise ArgumentError, "filtered aggregation accepts either filter: or field/term arguments, not both"
+         end
+         return filter
+       end
+
+       if field.nil? || term.nil?
+         raise ArgumentError, "filtered aggregation requires filter: or both field: and term:"
+       end
+
+       normalized_distance = distance.nil? ? nil : normalize_non_negative_integer(distance, "distance")
+       normalized_prefix = normalize_boolean_option(prefix, "prefix")
+       normalized_transposition = normalize_boolean_option(transposition_cost_one, "transposition_cost_one")
+
+       FieldTermFilter.new(
+         field: normalize_field(field),
+         term: term,
+         distance: normalized_distance,
+         prefix: normalized_prefix,
+         transposition_cost_one: normalized_transposition
+       )
+     end
+     private_class_method :normalize_filter
+
      def normalize_field(field)
        case field
        when Symbol
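`normalize_filter` enforces an either/or contract: callers pass a prebuilt `filter:`, or both `field:` and `term:`, but not a mix. A reduced standalone sketch of that branching follows. `pick_filter` and `FieldTermFilterSketch` are hypothetical names, the fuzzy options are left out, and the field is converted with `to_s` as a simplification.

```ruby
# Hypothetical stand-ins for illustration; not the gem's implementation.
FieldTermFilterSketch = Struct.new(:field, :term, keyword_init: true)

def pick_filter(filter: nil, field: nil, term: nil)
  if filter
    # A prebuilt filter cannot be combined with field/term.
    if !field.nil? || !term.nil?
      raise ArgumentError, "accepts either filter: or field/term arguments, not both"
    end
    return filter
  end

  # Without a prebuilt filter, both halves of the field/term pair are needed.
  if field.nil? || term.nil?
    raise ArgumentError, "requires filter: or both field: and term:"
  end

  FieldTermFilterSketch.new(field: field.to_s, term: term)
end

p pick_filter(filter: :prebuilt)
p pick_filter(field: :category, term: "footwear")

begin
  pick_filter(filter: :prebuilt, field: :category)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

As the hunk is written, only `field` and `term` are checked against `filter:`; `distance`, `prefix`, and `transposition_cost_one` passed alongside a prebuilt filter are returned past without validation.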
@@ -215,6 +286,46 @@ module ParadeDB
      end
      private_class_method :normalize_bounds
 
+     def normalize_top_hits_sort(sort)
+       entries = Array(sort)
+       raise ArgumentError, "top_hits sort must include at least one field" if entries.empty?
+
+       entries.map do |entry|
+         raise ArgumentError, "top_hits sort entries must be Hash values" unless entry.is_a?(Hash)
+         raise ArgumentError, "top_hits sort entries must include exactly one field" unless entry.size == 1
+
+         field, direction = entry.first
+         {
+           normalize_field(field) => normalize_sort_direction(direction)
+         }
+       end
+     end
+     private_class_method :normalize_top_hits_sort
+
+     def normalize_docvalue_fields(fields)
+       values = Array(fields)
+       raise ArgumentError, "top_hits docvalue_fields must include at least one field" if values.empty?
+
+       values.map { |field| normalize_field(field) }
+     end
+     private_class_method :normalize_docvalue_fields
+
+     def normalize_sort_direction(direction)
+       value = direction.to_s
+       return value if %w[asc desc].include?(value)
+
+       raise ArgumentError, "sort direction must be 'asc' or 'desc'"
+     end
+     private_class_method :normalize_sort_direction
+
+     def normalize_boolean_option(value, name)
+       return nil if value.nil?
+       return value if value == true || value == false
+
+       raise ArgumentError, "#{name} must be true, false, or nil"
+     end
+     private_class_method :normalize_boolean_option
+
      def deep_stringify(value)
        case value
        when Hash