neighbor 0.5.2 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: fb8418a71159a849442d146643f732c28d10a5237d94a5b5b7d7466a224a40b5
-   data.tar.gz: 8a7ededb3071fdef4a77bbbdf4596d74ed323327d4d29ea15538e344d25af2f7
+   metadata.gz: 5caf112992a87ac93ef7d05094b9d5beb71470fef6d5dc07da8f4ba74787c381
+   data.tar.gz: 85d47b002533704a0098e41fb2e752405ea8ae50ca3cdb7d5984e0cd93016ab0
  SHA512:
-   metadata.gz: 05b3b8ccd07570f531aef0132a80563aef046b88508e732b2f16fe02f6c0303bd6f6fdb00804a646845086ba7fd133358503a7c7b69b098c02f548434c859575
-   data.tar.gz: 9bcb61d4e1453e30daeab1fff0364c023278405d170b435cdfb262aac0aa0d91622e19471fbf3f3076e4c944d09318dc81cf334fdac7767fa6f72d95c8560907
+   metadata.gz: af55fbcbe28352582379e84b0e831aae677411cf4a45432cd9a1caff7e33857d38362c6bc484a479e9c4e99f634cb094efa4ea10e02e789b4cbd552c72578b6c
+   data.tar.gz: b896e3edda9fc58a7e8aeb8aa71cf409ae5d0f8b30a785642e8380b071a0bff3e6504bc5123711405504ab7f54fb4ac856a2a26f5088bcd8bc4c963b2f4cab10
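
The entries above are SHA-256 and SHA-512 digests of the package's `metadata.gz` and `data.tar.gz` members. As a minimal sketch of how one of them could be checked locally, assuming the member has already been extracted to disk (the helper name `checksum_matches?` and any file path passed to it are hypothetical, not part of the gem or of RubyGems), Ruby's standard `digest` library can recompute the digest:

```ruby
require "digest"

# Hypothetical helper: returns true when the file's SHA-256 hex digest
# equals the expected value (for example, a value copied from checksums.yaml).
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end
```

Calling `checksum_matches?("data.tar.gz", expected)` with the `data.tar.gz` SHA256 value from the `+` side of the hunk would be expected to return `true` for an unmodified 1.0.0 package.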
data/CHANGELOG.md CHANGED
@@ -1,3 +1,13 @@
+ ## 1.0.0 (2026-04-04)
+
+ - Dropped support for Ruby < 3.3 and Active Record < 7.2
+
+ ## 0.6.0 (2025-06-12)
+
+ - Added support for MariaDB 11.8
+ - Dropped experimental support for MariaDB 11.7
+ - Dropped support for Ruby < 3.2 and Active Record < 7.1
+
  ## 0.5.2 (2025-01-05)
 
  - Improved support for Postgres arrays
data/LICENSE.txt CHANGED
@@ -1,6 +1,6 @@
  The MIT License (MIT)
 
- Copyright (c) 2021-2024 Andrew Kane
+ Copyright (c) 2021-2026 Andrew Kane
 
  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
data/README.md CHANGED
@@ -5,9 +5,11 @@ Nearest neighbor search for Rails
  Supports:
 
  - Postgres (cube and pgvector)
- - SQLite (sqlite-vec) - experimental
- - MariaDB 11.7 - experimental
+ - MariaDB 11.8
  - MySQL 9 (searching requires HeatWave) - experimental
+ - SQLite (sqlite-vec) - experimental
+
+ Also available for [Redis](https://github.com/ankane/neighbor-redis) and [S3 Vectors](https://github.com/ankane/neighbor-s3)
 
  [![Build Status](https://github.com/ankane/neighbor/actions/workflows/build.yml/badge.svg)](https://github.com/ankane/neighbor/actions)
 
@@ -56,7 +58,7 @@ rails generate neighbor:sqlite
  Create a migration
 
  ```ruby
- class AddEmbeddingToItems < ActiveRecord::Migration[8.0]
+ class AddEmbeddingToItems < ActiveRecord::Migration[8.1]
    def change
      # cube
      add_column :items, :embedding, :cube
@@ -107,9 +109,9 @@ See the additional docs for:
 
  - [cube](#cube)
  - [pgvector](#pgvector)
- - [sqlite-vec](#sqlite-vec)
  - [MariaDB](#mariadb)
  - [MySQL](#mysql)
+ - [sqlite-vec](#sqlite-vec)
 
  Or check out some [examples](#examples)
 
@@ -174,7 +176,7 @@ The `sparsevec` type can have up to 16,000 non-zero elements, and sparse vectors
  Add an approximate index to speed up queries. Create a migration with:
 
  ```ruby
- class AddIndexToItemsEmbedding < ActiveRecord::Migration[8.0]
+ class AddIndexToItemsEmbedding < ActiveRecord::Migration[8.1]
    def change
      add_index :items, :embedding, using: :hnsw, opclass: :vector_l2_ops
      # or
@@ -202,7 +204,7 @@ Item.connection.execute("SET ivfflat.probes = 3")
  Use the `halfvec` type to store half-precision vectors
 
  ```ruby
- class AddEmbeddingToItems < ActiveRecord::Migration[8.0]
+ class AddEmbeddingToItems < ActiveRecord::Migration[8.1]
    def change
      add_column :items, :embedding, :halfvec, limit: 3 # dimensions
    end
@@ -214,9 +216,9 @@ end
  Index vectors at half precision for smaller indexes
 
  ```ruby
- class AddIndexToItemsEmbedding < ActiveRecord::Migration[8.0]
+ class AddIndexToItemsEmbedding < ActiveRecord::Migration[8.1]
    def change
-     add_index :items, "(embedding::halfvec(3)) vector_l2_ops", using: :hnsw
+     add_index :items, "(embedding::halfvec(3)) halfvec_l2_ops", using: :hnsw
    end
  end
  ```
@@ -232,7 +234,7 @@ Item.nearest_neighbors(:embedding, [0.9, 1.3, 1.1], distance: "euclidean", preci
  Use the `bit` type to store binary vectors
 
  ```ruby
- class AddEmbeddingToItems < ActiveRecord::Migration[8.0]
+ class AddEmbeddingToItems < ActiveRecord::Migration[8.1]
    def change
      add_column :items, :embedding, :bit, limit: 3 # dimensions
    end
@@ -250,7 +252,7 @@ Item.nearest_neighbors(:embedding, "101", distance: "hamming").first(5)
  Use expression indexing for binary quantization
 
  ```ruby
- class AddIndexToItemsEmbedding < ActiveRecord::Migration[8.0]
+ class AddIndexToItemsEmbedding < ActiveRecord::Migration[8.1]
    def change
      add_index :items, "(binary_quantize(embedding)::bit(3)) bit_hamming_ops", using: :hnsw
    end
@@ -262,7 +264,7 @@ end
  Use the `sparsevec` type to store sparse vectors
 
  ```ruby
- class AddEmbeddingToItems < ActiveRecord::Migration[8.0]
+ class AddEmbeddingToItems < ActiveRecord::Migration[8.1]
    def change
      add_column :items, :embedding, :sparsevec, limit: 3 # dimensions
    end
@@ -276,7 +278,7 @@ embedding = Neighbor::SparseVector.new({0 => 0.9, 1 => 1.3, 2 => 1.1}, 3)
  Item.nearest_neighbors(:embedding, embedding, distance: "euclidean").first(5)
  ```
 
- ## sqlite-vec
+ ## MariaDB
 
  ### Distance
 
@@ -284,82 +286,64 @@ Supported values are:
 
  - `euclidean`
  - `cosine`
- - `taxicab`
  - `hamming`
 
- ### Dimensions
+ ### Indexing
 
- For sqlite-vec, it’s a good idea to specify the number of dimensions to ensure all records have the same number.
+ Vector columns must use `null: false` to add a vector index
 
  ```ruby
- class Item < ApplicationRecord
-   has_neighbors :embedding, dimensions: 3
+ class CreateItems < ActiveRecord::Migration[8.1]
+   def change
+     create_table :items do |t|
+       t.vector :embedding, limit: 3, null: false
+       t.index :embedding, type: :vector
+     end
+   end
  end
  ```
 
- ### Virtual Tables
+ ### Binary Vectors
 
- You can also use [virtual tables](https://alexgarcia.xyz/sqlite-vec/features/knn.html)
+ Use the `bigint` type to store binary vectors
 
  ```ruby
- class AddEmbeddingToItems < ActiveRecord::Migration[8.0]
+ class AddEmbeddingToItems < ActiveRecord::Migration[8.1]
    def change
-     # Rails 8+
-     create_virtual_table :items, :vec0, [
-       "id integer PRIMARY KEY AUTOINCREMENT NOT NULL",
-       "embedding float[3] distance_metric=L2"
-     ]
-
-     # Rails < 8
-     execute <<~SQL
-       CREATE VIRTUAL TABLE items USING vec0(
-         id integer PRIMARY KEY AUTOINCREMENT NOT NULL,
-         embedding float[3] distance_metric=L2
-       )
-     SQL
+     add_column :items, :embedding, :bigint
    end
  end
  ```
 
- Use `distance_metric=cosine` for cosine distance
-
- You can optionally ignore any shadow tables that are created
-
- ```ruby
- ActiveRecord::SchemaDumper.ignore_tables += [
-   "items_chunks", "items_rowids", "items_vector_chunks00"
- ]
- ```
+ Note: Binary vectors can have up to 64 dimensions
 
- Get the `k` nearest neighbors
+ Get the nearest neighbors by Hamming distance
 
  ```ruby
- Item.where("embedding MATCH ?", [1, 2, 3].to_s).where(k: 5).order(:distance)
+ Item.nearest_neighbors(:embedding, 5, distance: "hamming").first(5)
  ```
 
- Filter by primary key
+ ## MySQL
 
- ```ruby
- Item.where(id: [2, 3]).where("embedding MATCH ?", [1, 2, 3].to_s).where(k: 5).order(:distance)
- ```
+ ### Distance
 
- ### Int8 Vectors
+ Supported values are:
 
- Use the `type` option for int8 vectors
+ - `euclidean`
+ - `cosine`
+ - `hamming`
 
- ```ruby
- class Item < ApplicationRecord
-   has_neighbors :embedding, dimensions: 3, type: :int8
- end
- ```
+ Note: The `DISTANCE()` function is [only available on HeatWave](https://dev.mysql.com/doc/refman/9.0/en/vector-functions.html)
 
  ### Binary Vectors
 
- Use the `type` option for binary vectors
+ Use the `binary` type to store binary vectors
 
  ```ruby
- class Item < ApplicationRecord
-   has_neighbors :embedding, dimensions: 8, type: :bit
+ class AddEmbeddingToItems < ActiveRecord::Migration[8.1]
+   def change
+     add_column :items, :embedding, :binary
+   end
  end
  ```
 
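
The MariaDB section moved up in the hunk above stores binary vectors in a `bigint` column (up to 64 dimensions) and queries them with an integer such as `5`. As a rough illustration of what Hamming distance means for such values, here is a plain-Ruby sketch; it is independent of how neighbor or MariaDB actually compute the distance, and `hamming_distance` is a hypothetical helper:

```ruby
# Hypothetical helper: count the bits that differ between two
# non-negative integers, each read as a binary vector of up to 64 bits.
def hamming_distance(a, b)
  (a ^ b).to_s(2).count("1")
end

hamming_distance(5, 3) # 0b101 vs 0b011 differ in two positions => 2
```

Under this reading, a stored value of `5` has distance 0 from the query `5`, and values that differ in fewer bit positions rank as nearer neighbors.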
@@ -369,7 +353,7 @@ Get the nearest neighbors by Hamming distance
  Item.nearest_neighbors(:embedding, "\x05", distance: "hamming").first(5)
  ```
 
- ## MariaDB
+ ## sqlite-vec
 
  ### Distance
 
@@ -377,64 +361,82 @@ Supported values are:
 
  - `euclidean`
  - `cosine`
+ - `taxicab`
  - `hamming`
 
- ### Indexing
+ ### Dimensions
 
- Vector columns must use `null: false` to add a vector index
+ For sqlite-vec, it’s a good idea to specify the number of dimensions to ensure all records have the same number.
 
  ```ruby
- class CreateItems < ActiveRecord::Migration[8.0]
-   def change
-     create_table :items do |t|
-       t.vector :embedding, limit: 3, null: false
-       t.index :embedding, type: :vector
-     end
-   end
+ class Item < ApplicationRecord
+   has_neighbors :embedding, dimensions: 3
  end
  ```
 
- ### Binary Vectors
+ ### Virtual Tables
 
- Use the `bigint` type to store binary vectors
+ You can also use [virtual tables](https://alexgarcia.xyz/sqlite-vec/features/knn.html)
 
  ```ruby
- class AddEmbeddingToItems < ActiveRecord::Migration[8.0]
+ class AddEmbeddingToItems < ActiveRecord::Migration[8.1]
    def change
-     add_column :items, :embedding, :bigint
+     # Rails 8+
+     create_virtual_table :items, :vec0, [
+       "id integer PRIMARY KEY AUTOINCREMENT NOT NULL",
+       "embedding float[3] distance_metric=L2"
+     ]
+
+     # Rails < 8
+     execute <<~SQL
+       CREATE VIRTUAL TABLE items USING vec0(
+         id integer PRIMARY KEY AUTOINCREMENT NOT NULL,
+         embedding float[3] distance_metric=L2
+       )
+     SQL
    end
  end
  ```
 
- Note: Binary vectors can have up to 64 dimensions
+ Use `distance_metric=cosine` for cosine distance
 
- Get the nearest neighbors by Hamming distance
+ You can optionally ignore any shadow tables that are created
 
  ```ruby
- Item.nearest_neighbors(:embedding, 5, distance: "hamming").first(5)
+ ActiveRecord::SchemaDumper.ignore_tables += [
+   "items_chunks", "items_rowids", "items_vector_chunks00"
+ ]
  ```
 
- ## MySQL
+ Get the `k` nearest neighbors
 
- ### Distance
+ ```ruby
+ Item.where("embedding MATCH ?", [1, 2, 3].to_s).where(k: 5).order(:distance)
+ ```
 
- Supported values are:
+ Filter by primary key
 
- - `euclidean`
- - `cosine`
- - `hamming`
+ ```ruby
+ Item.where(id: [2, 3]).where("embedding MATCH ?", [1, 2, 3].to_s).where(k: 5).order(:distance)
+ ```
 
- Note: The `DISTANCE()` function is [only available on HeatWave](https://dev.mysql.com/doc/refman/9.0/en/vector-functions.html)
+ ### Int8 Vectors
+
+ Use the `type` option for int8 vectors
+
+ ```ruby
+ class Item < ApplicationRecord
+   has_neighbors :embedding, dimensions: 3, type: :int8
+ end
+ ```
 
  ### Binary Vectors
 
- Use the `binary` type to store binary vectors
+ Use the `type` option for binary vectors
 
  ```ruby
- class AddEmbeddingToItems < ActiveRecord::Migration[8.0]
-   def change
-     add_column :items, :embedding, :binary
-   end
+ class Item < ApplicationRecord
+   has_neighbors :embedding, dimensions: 8, type: :bit
  end
  ```
 
@@ -473,7 +475,7 @@ end
  Create a method to call the [embeddings API](https://platform.openai.com/docs/guides/embeddings)
 
  ```ruby
- def fetch_embeddings(input)
+ def embed(input)
    url = "https://api.openai.com/v1/embeddings"
    headers = {
      "Authorization" => "Bearer #{ENV.fetch("OPENAI_API_KEY")}",
@@ -497,7 +499,7 @@ input = [
    "The cat is purring",
    "The bear is growling"
  ]
- embeddings = fetch_embeddings(input)
+ embeddings = embed(input)
  ```
 
  Store the embeddings
@@ -524,7 +526,7 @@ See the [complete code](examples/openai/example.rb)
  Generate a model
 
  ```sh
- rails generate model Document content:text embedding:bit{1024}
+ rails generate model Document content:text embedding:bit{1536}
  rails db:migrate
  ```
 
@@ -539,15 +541,15 @@ end
  Create a method to call the [embed API](https://docs.cohere.com/reference/embed)
 
  ```ruby
- def fetch_embeddings(input, input_type)
-   url = "https://api.cohere.com/v1/embed"
+ def embed(input, input_type)
+   url = "https://api.cohere.com/v2/embed"
    headers = {
      "Authorization" => "Bearer #{ENV.fetch("CO_API_KEY")}",
      "Content-Type" => "application/json"
    }
    data = {
      texts: input,
-     model: "embed-english-v3.0",
+     model: "embed-v4.0",
      input_type: input_type,
      embedding_types: ["ubinary"]
    }
@@ -565,7 +567,7 @@ input = [
    "The cat is purring",
    "The bear is growling"
  ]
- embeddings = fetch_embeddings(input, "search_document")
+ embeddings = embed(input, "search_document")
  ```
 
  Store the embeddings
@@ -582,7 +584,7 @@ Embed the search query
 
  ```ruby
  query = "forest"
- query_embedding = fetch_embeddings([query], "search_query")[0]
+ query_embedding = embed([query], "search_query")[0]
  ```
 
  And search the documents
@@ -876,7 +878,7 @@ bundle exec rake test:postgresql
  bundle exec rake test:sqlite
 
  # MariaDB
- docker run -e MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=1 -e MARIADB_DATABASE=neighbor_test -p 3307:3306 mariadb:11.7-rc
+ docker run -e MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=1 -e MARIADB_DATABASE=neighbor_test -p 3307:3306 mariadb:11.8
  bundle exec rake test:mariadb
 
  # MySQL
@@ -27,29 +27,13 @@ module Neighbor
    @neighbor_attributes[attribute_name] = {dimensions: dimensions, normalize: normalize, type: type&.to_sym}
  end
 
- if ActiveRecord::VERSION::STRING.to_f >= 7.2
-   decorate_attributes(attribute_names) do |name, cast_type|
-     Neighbor::Attribute.new(cast_type: cast_type, model: self, type: type, attribute_name: name)
-   end
- else
-   attribute_names.each do |attribute_name|
-     attribute attribute_name do |cast_type|
-       Neighbor::Attribute.new(cast_type: cast_type, model: self, type: type, attribute_name: attribute_name)
-     end
-   end
+ decorate_attributes(attribute_names) do |name, cast_type|
+   Neighbor::Attribute.new(cast_type: cast_type, model: self, type: type, attribute_name: name)
  end
 
  if normalize
-   if ActiveRecord::VERSION::STRING.to_f >= 7.1
-     attribute_names.each do |attribute_name|
-       normalizes attribute_name, with: ->(v) { Neighbor::Utils.normalize(v, column_info: columns_hash[attribute_name.to_s]) }
-     end
-   else
-     attribute_names.each do |attribute_name|
-       attribute attribute_name do |cast_type|
-         Neighbor::NormalizedAttribute.new(cast_type: cast_type, model: self, attribute_name: attribute_name)
-       end
-     end
+   attribute_names.each do |attribute_name|
+     normalizes attribute_name, with: ->(v) { Neighbor::Utils.normalize(v, column_info: columns_hash[attribute_name.to_s]) }
    end
  end
 
@@ -15,9 +15,6 @@ module Neighbor
 
    # prevent unknown OID warning
    ActiveRecord::ConnectionAdapters::AbstractMysqlAdapter.singleton_class.prepend(RegisterTypes)
-   if ActiveRecord::VERSION::STRING.to_f < 7.1
-     ActiveRecord::ConnectionAdapters::AbstractMysqlAdapter.register_vector_type(ActiveRecord::ConnectionAdapters::AbstractMysqlAdapter::TYPE_MAP)
-   end
  end
 
  module RegisterTypes
@@ -18,7 +18,7 @@ module Neighbor
  module InstanceMethods
    def configure_connection
      super
-     db = ActiveRecord::VERSION::STRING.to_f >= 7.1 ? @raw_connection : @connection
+     db = @raw_connection
      db.enable_load_extension(1)
      SqliteVec.load(db)
      db.enable_load_extension(0)
@@ -1,3 +1,3 @@
  module Neighbor
-   VERSION = "0.5.2"
+   VERSION = "1.0.0"
  end
metadata CHANGED
@@ -1,13 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: neighbor
  version: !ruby/object:Gem::Version
-   version: 0.5.2
+   version: 1.0.0
  platform: ruby
  authors:
  - Andrew Kane
  bindir: bin
  cert_chain: []
- date: 2025-01-05 00:00:00.000000000 Z
+ date: 1980-01-02 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activerecord
@@ -15,14 +15,14 @@ dependencies:
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: '7'
+         version: '7.2'
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: '7'
+         version: '7.2'
  email: andrew@ankane.org
  executables: []
  extensions: []
@@ -67,14 +67,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
-       version: '3.1'
+       version: '3.3'
  required_rubygems_version: !ruby/object:Gem::Requirement
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubygems_version: 3.6.2
+ rubygems_version: 4.0.3
  specification_version: 4
  summary: Nearest neighbor search for Rails
  test_files: []
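
The metadata hunks above raise the `activerecord` dependency from `>= 7` to `>= 7.2` and `required_ruby_version` from `>= 3.1` to `>= 3.3`. A minimal sketch of how such `>=` constraints evaluate, using RubyGems' own `Gem::Requirement` and `Gem::Version` classes (the helper name `satisfies?` and the sample version strings are illustrative assumptions, not part of the gem):

```ruby
require "rubygems"

# Hypothetical helper: true when `version` meets the requirement string.
def satisfies?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

satisfies?(">= 7.2", "7.1.5") # => false, so an app on Active Record 7.1 cannot resolve 1.0.0
satisfies?(">= 7.2", "8.1.0") # => true
satisfies?(">= 3.3", RUBY_VERSION) # result depends on the running Ruby
```

An application still on an older Ruby or Active Record would need to stay on an earlier neighbor release until it upgrades.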