chromable 0.2.0 → 0.3.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d44fa49957eefd26d00d3fb2315b51f928bd6ecdac7cdc4f849537cac23f036b
- data.tar.gz: ab51d49d9b558e6d6ba936640026af66258567c0750f2ce11c7b6e01a0b8fea1
+ metadata.gz: ad740d4c1d70642aa0e03518c74cb5e2b77a68df923a9e46e0797eb42908c312
+ data.tar.gz: 4cdb45e1599c5f579082cac30ff9c3887adaee9921d2bf73f20b6f96401769cd
  SHA512:
- metadata.gz: 2a6be5bd9ec59535eb33a59fc543baa23954ac43c4bc69b8a6e6d19eaab231d313180ba714781f85342f838bfdd4fd70fcfa32c785f663b880128d6a9425a69e
- data.tar.gz: f63ffea813494e3143c16ed900e190d52866fa11348bac39db894c09ad885e66115b0cb8d0e492f2fb18846b29d8fcb76c4dbdbf3c271e3fd4f6afb417350ee9
+ metadata.gz: 19ee5d2e13a7b61962442766faa52f52930acc221b346ac91609eab01c1c9cb6aad01bfd96c444451925434c39ede78d7b949923a0824ef7ae89276b11c49869
+ data.tar.gz: 8de31a2f2b16402acef0d391e39a1049622dcbc3a69968ce0e8485492e1acbb7dae77ab88ae782bc1ef07ae42a43102929e5a47791a8c5443dd67f758e054c4b
data/README.md CHANGED
@@ -2,6 +2,8 @@
 
  Ruby on Rails integration for ChromaDB based on `chroma-db` gem.
 
+ `chromable` were tested with Ruby 3.2.2 and Rails 7.1.2.
+
  ## Installation
 
  Install `chromable` and add it to the application's Gemfile by executing:
@@ -30,7 +32,7 @@ Then, include `Chromable` module in your model and initialize it:
  class Post < ApplicationRecord
  include Chromable
 
- chromable document: :content, metadata: %i[author category], embedder: :embed
+ chromable document: :content, metadata: %i[author category], embedder: :embed, keep_document: false
 
  def self.embed(text, **options)
  options[:is_query] ||= false
@@ -45,38 +47,43 @@ end
  ```
 
  Where:
- - `document:` is a callable represents the text content you want to embed and store in ChromaDB.
- - `metadata:` is the list of attributes to be passed to ChromaDB as metadata to be used to filter.
- - `embedder:` is a callable defined in the model that returns the embedding representation for the given `text` and `options`.
+ - `document:` is a callable represents the text content you want to embed and store in ChromaDB (e.g. Could be a model attribute).
+ - `metadata:` is a list of callables to be evaluated and passed to ChromaDB as metadata to be used to filter (e.g. Could be an instance method).
+ - `embedder:` is a callable defined at the model level that returns the embedding representation for the given `text` based on some `options`.
+ - `keep_document:` tells chromable to pass the `document:` to ChromaDB and save it or not. It is useful if you just want to have the embeddings in ChromaDB and the rest of the data in your Rails application database to reduce memory footprint. `keep_document:` is `true` by default.
 
  Optionaly you can pass `collection_name:`. If not passed, the plural form of the model name will be used.
 
- The only required option for `chromable` method is `document:`.
+ The only required option for `chromable` is `document:`.
 
  At this point, `chromable` will create, update, and destroy the ChromaDB embeddings for your objects based on Rails `after_save` and `after_destroy` callbacks.
 
- To interact with the ChromaDB collection, `chromable` provides `Model.query` method to query the collection and `Model.collection` method to access the collection directly.
+ To interact with the ChromaDB collection, `chromable` provides:
+ - `Model.query` to query the collection using similar API used in `chroma-db` gem, except for accepting `text:` instead of `query_embeddings:`. Extra arguments will be passed to the `embedder:` as `options`. Behind the scenes, `Model.query` will embed the given `text:`, then query the collection, and return the closest `results:` records.
+ - `Model.collection` to access the collection directly.
+ - `Model.delete_collection` to delete the entire collection.
 
  ```ruby
  puts Post.collection.count # Gets the number of documents inside the collection. Should always match Post.count.
 
  Post.query(
- query: params[:query],
+ text: params[:query],
  results: 20,
  where: chroma_search_filters,
- type: 'query' # `type` here will be passed to `Post.embed` as an option.
+ is_query: true # `is_query` here will be passed to `Post.embed` as an option.
  )
- ```
 
- `Model.query` accepts the same arguments accepted by `chroma-db` gem `query` method. Extra arguments will be passed to the `embedder:`. Behind the scene, `Model.query` will embed the given `query:` text, then query the collection, and return the closest `results:` records.
+ Post.first.neighbors(results: 20) # => [#<Post:0x0000ffff9e0b5f10 id: "0beb0f98, ...>, ...]
+ ```
 
  Also, `chromable` provides the following methods for each model instance:
 
  - `embedding`: Retrieves the instance's ChromaDB embedding object.
  - `upsert_embedding`: Creates or updates the instance's ChromaDB embedding object.
  - `destroy_embedding`: Destroys the instance's ChromaDB embedding object.
+ - `neighbors`: Gets the closest `results:` records to the current record.
 
- All these methods (including `Model.query` and `Model.collection`) are available with `chroma_` prefix, if you have similar methods defined in your model.
+ All these methods (including `Model.query`, `Model.collection`, and `Model.delete_collection`) are available with `chroma_` prefix, if you have similar methods defined in your model.
 
81
88
  ## Development
82
89
 
@@ -86,7 +93,7 @@ To install this gem onto your local machine, run `bundle install`. To release a
 
  ## Contributing
 
- Bug reports and pull requests are welcome on GitHub at https://github.com/AliOsm/chromable. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/[USERNAME]/chromable/blob/main/CODE_OF_CONDUCT.md).
+ Bug reports and pull requests are welcome on GitHub at https://github.com/AliOsm/chromable. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/AliOsm/chromable/blob/main/CODE_OF_CONDUCT.md).
 
91
98
  ## License
92
99
 
@@ -94,4 +101,4 @@ The gem is available as open source under the terms of the [MIT License](https:/
 
  ## Code of Conduct
 
- Everyone interacting in the Chromable project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/chromable/blob/main/CODE_OF_CONDUCT.md).
+ Everyone interacting in the Chromable project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/AliOsm/chromable/blob/main/CODE_OF_CONDUCT.md).
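
Note on the `chroma_` prefix rule in the README change above: the following is a minimal plain-Ruby sketch, not the gem's code, and it needs neither Rails nor ChromaDB. `PrefixedQueries`, `Article`, and `Comment` are hypothetical names, and the method bodies return placeholder strings. It only illustrates that the short alias is added when the class has not already defined a method of that name, while the prefixed method stays available either way.

```ruby
# Hypothetical stand-in for a module that adds prefixed class methods.
module PrefixedQueries
  def self.extended(base)
    class << base
      # Add the short names only when the class has not defined its own.
      alias_method :query, :chroma_query unless method_defined?(:query)
      alias_method :collection, :chroma_collection unless method_defined?(:collection)
    end
  end

  # Placeholder body; a real implementation would talk to ChromaDB.
  def chroma_query(text:)
    "chroma query for #{text}"
  end

  # Placeholder body standing in for a collection lookup.
  def chroma_collection
    'chroma collection'
  end
end

# A class that already defines its own `query` before extending the module.
class Article
  def self.query(text:)
    "custom query for #{text}"
  end

  extend PrefixedQueries
end

# A class with no conflicting methods.
class Comment
  extend PrefixedQueries
end

puts Article.query(text: 'ruby')        # the class's own method is kept
puts Article.chroma_query(text: 'ruby') # the prefixed method is still reachable
puts Comment.query(text: 'ruby')        # the alias points at chroma_query
```

The order matters in this sketch: `Article` defines `query` before `extend`, so the `method_defined?` check sees it and skips the alias.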
data/lib/chromable/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Chromable
- VERSION = '0.2.0'
+ VERSION = '0.3.1'
  end
data/lib/chromable.rb CHANGED
@@ -6,75 +6,122 @@ require_relative 'chromable/version'
  module Chromable
  def self.included(base)
  base.extend ClassMethods
- base.class_attribute :collection_name
- base.class_attribute :document
- base.class_attribute :metadata
- base.class_attribute :embedder
+ base.include InstanceMethods
 
  base.after_save :chroma_upsert_embedding
  base.after_destroy :chroma_destroy_embedding
  end
 
+ # Chromable settings class to hide them from Rails models.
+ class Settings
+ attr_accessor :document, :collection_name, :metadata, :embedder, :keep_document
+
+ def initialize(document:, metadata: nil, embedder: nil, collection_name: nil, keep_document: true)
+ @collection_name = collection_name
+ @document = document
+ @metadata = metadata
+ @embedder = embedder
+ @keep_document = keep_document
+ end
+ end
+
  # Methods to be added to the model class.
  module ClassMethods
- def chromable(document:, metadata: nil, embedder: nil, collection_name: nil)
- self.collection_name = (collection_name.presence || name.underscore.pluralize)
- self.document = document
- self.metadata = metadata
- self.embedder = embedder
+ def self.extended(base)
+ class << base
+ alias_method :collection, :chroma_collection unless method_defined? :collection
+ alias_method :delete_collection, :chroma_delete_collection unless method_defined? :delete_collection
+ alias_method :query, :chroma_query unless method_defined? :query
+ end
+
+ base.cattr_accessor :chromable_settings
+ end
+
+ def chromable(**options)
+ options[:collection_name] ||= name.underscore.pluralize
+
+ self.chromable_settings = Settings.new(**options)
  end
 
  def chroma_collection
- Chroma::Resources::Collection.get_or_create(collection_name)
+ Chroma::Resources::Collection.get_or_create(chromable_settings.collection_name)
  end
 
- def chroma_query( # rubocop:disable Metrics/ParameterLists
- query:,
- results: 10,
- where: {},
- where_document: {},
- include: %w[metadatas documents distances],
- **embedder_options
- )
+ def chroma_delete_collection
+ Chroma::Resources::Collection.delete(chromable_settings.collection_name)
+ end
+
+ def chroma_query(text:, results: 10, where: {}, where_document: {}, **embedder_options)
  find(chroma_collection.query(
- query_embeddings: [send(embedder, query, **embedder_options)],
+ query_embeddings: [send(chromable_settings.embedder, text, **embedder_options)],
  results: results,
  where: where,
  where_document: where_document,
  include: include
  ).map(&:id))
  end
-
- alias collection chroma_collection unless method_defined? :collection
- alias query chroma_query unless method_defined? :query
  end
 
- def chroma_embedding
- self.class.chroma_collection.get(ids: [id])[0]
- end
+ # Methods to be added to the model instances.
+ module InstanceMethods
+ def self.included(base)
+ base.instance_eval do
+ # rubocop:disable Style/Alias
+ alias_method :embedding, :chroma_embedding unless method_defined? :embedding
+ alias_method :upsert_embedding, :chroma_upsert_embedding unless method_defined? :upsert_embedding
+ alias_method :destroy_embedding, :chroma_destroy_embedding unless method_defined? :destroy_embedding
+ alias_method :neighbors, :chroma_neighbors unless method_defined? :neighbors
+ # rubocop:enable Style/Alias
+ end
+ end
 
- def chroma_upsert_embedding
- self.class.chroma_collection.upsert(build_embedding)
- end
+ def chroma_embedding
+ self.class.chroma_collection.get(ids: [id])[0]
+ end
 
- def chroma_destroy_embedding
- self.class.chroma_collection.delete(ids: [id])
- end
+ def chroma_upsert_embedding
+ self.class.chroma_collection.upsert(build_embedding)
+ end
 
- alias embedding chroma_embedding unless method_defined? :embedding
- alias upsert_embedding chroma_upsert_embedding unless method_defined? :upsert_embedding
- alias destroy_embedding chroma_destroy_embedding unless method_defined? :destroy_embedding
+ def chroma_destroy_embedding
+ self.class.chroma_collection.delete(ids: [id])
+ end
+
+ def chroma_neighbors(results: 10, where: {}, where_document: {})
+ collection = self.class.chroma_collection
+
+ embedding = collection.get(ids: [id], include: [:embeddings])[0].embedding
+
+ self.class.find(collection.query(
+ query_embeddings: [embedding],
+ results: results,
+ where: where,
+ where_document: where_document,
+ include: include
+ ).map(&:id))
+ end
+
+ private
+
+ def build_embedding
+ Chroma::Resources::Embedding.new(
+ id: id,
+ document: document_to_embed,
+ embedding: document_embedding,
+ metadata: embedding_metadata
+ )
+ end
 
- private
+ def document_to_embed
+ chromable_settings.keep_document ? send(chromable_settings.document) : nil
+ end
 
- def build_embedding
- document = send(self.class.document)
+ def document_embedding
+ chromable_settings.embedder && self.class.send(chromable_settings.embedder, send(chromable_settings.document))
+ end
 
- Chroma::Resources::Embedding.new(
- id: id,
- document: document,
- embedding: self.class.embedder && self.class.send(self.class.embedder, document),
- metadata: self.class.metadata&.index_with { |attribute| send(attribute) }
- )
+ def embedding_metadata
+ chromable_settings.metadata&.index_with { |attribute| send(attribute) }
+ end
  end
  end
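
Note on `keep_document:` in the change above: the sketch below is a plain-Ruby illustration under stated assumptions, not the gem's implementation. `SketchSettings`, `Note`, and `embedding_payload` are hypothetical names, a `Struct` stands in for the ActiveRecord model, and `Array#to_h` with a block stands in for ActiveSupport's `index_with`. It shows the document text being replaced by `nil` when `keep_document` is `false`, and the metadata hash being built from the listed callables.

```ruby
# Hypothetical stand-in for the settings object; only the fields used here.
class SketchSettings
  attr_reader :document, :metadata, :keep_document

  def initialize(document:, metadata: nil, keep_document: true)
    @document = document
    @metadata = metadata
    @keep_document = keep_document
  end
end

# Hypothetical record; a Struct replaces the ActiveRecord model.
Note = Struct.new(:content, :author)

# Builds the document and metadata parts of the payload for one record.
def embedding_payload(record, settings)
  {
    # The text is replaced by nil when keep_document is false.
    document: settings.keep_document ? record.public_send(settings.document) : nil,
    # Plain-Ruby equivalent of index_with: maps each name to its value on the record.
    metadata: settings.metadata&.to_h { |attribute| [attribute, record.public_send(attribute)] }
  }
end

note = Note.new('Hello', 'Ali')
kept = embedding_payload(note, SketchSettings.new(document: :content, metadata: %i[author]))
dropped = embedding_payload(note, SketchSettings.new(document: :content, keep_document: false))

p kept
p dropped
```

In the diff itself the embedding is still computed from the document text even when the text is not stored, so `keep_document: false` affects only what is saved in ChromaDB.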
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: chromable
  version: !ruby/object:Gem::Version
- version: 0.2.0
+ version: 0.3.1
  platform: ruby
  authors:
  - Ali Hamdi Ali Fadel
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2023-12-11 00:00:00.000000000 Z
+ date: 2023-12-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: chroma-db