chromable 0.2.0 → 0.3.1
- checksums.yaml +4 -4
- data/README.md +20 -13
- data/lib/chromable/version.rb +1 -1
- data/lib/chromable.rb +90 -43
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ad740d4c1d70642aa0e03518c74cb5e2b77a68df923a9e46e0797eb42908c312
+  data.tar.gz: 4cdb45e1599c5f579082cac30ff9c3887adaee9921d2bf73f20b6f96401769cd
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 19ee5d2e13a7b61962442766faa52f52930acc221b346ac91609eab01c1c9cb6aad01bfd96c444451925434c39ede78d7b949923a0824ef7ae89276b11c49869
+  data.tar.gz: 8de31a2f2b16402acef0d391e39a1049622dcbc3a69968ce0e8485492e1acbb7dae77ab88ae782bc1ef07ae42a43102929e5a47791a8c5443dd67f758e054c4b
data/README.md
CHANGED
@@ -2,6 +2,8 @@
 
 Ruby on Rails integration for ChromaDB based on `chroma-db` gem.
 
+`chromable` were tested with Ruby 3.2.2 and Rails 7.1.2.
+
 ## Installation
 
 Install `chromable` and add it to the application's Gemfile by executing:
@@ -30,7 +32,7 @@ Then, include `Chromable` module in your model and initialize it:
 class Post < ApplicationRecord
   include Chromable
 
-  chromable document: :content, metadata: %i[author category], embedder: :embed
+  chromable document: :content, metadata: %i[author category], embedder: :embed, keep_document: false
 
   def self.embed(text, **options)
     options[:is_query] ||= false
@@ -45,38 +47,43 @@ end
 ```
 
 Where:
-- `document:` is a callable represents the text content you want to embed and store in ChromaDB.
-- `metadata:` is
-- `embedder:` is a callable defined
+- `document:` is a callable represents the text content you want to embed and store in ChromaDB (e.g. Could be a model attribute).
+- `metadata:` is a list of callables to be evaluated and passed to ChromaDB as metadata to be used to filter (e.g. Could be an instance method).
+- `embedder:` is a callable defined at the model level that returns the embedding representation for the given `text` based on some `options`.
+- `keep_document:` tells chromable to pass the `document:` to ChromaDB and save it or not. It is useful if you just want to have the embeddings in ChromaDB and the rest of the data in your Rails application database to reduce memory footprint. `keep_document:` is `true` by default.
 
 Optionaly you can pass `collection_name:`. If not passed, the plural form of the model name will be used.
 
-The only required option for `chromable`
+The only required option for `chromable` is `document:`.
 
 At this point, `chromable` will create, update, and destroy the ChromaDB embeddings for your objects based on Rails `after_save` and `after_destroy` callbacks.
 
-To interact with the ChromaDB collection, `chromable` provides
+To interact with the ChromaDB collection, `chromable` provides:
+- `Model.query` to query the collection using similar API used in `chroma-db` gem, except for accepting `text:` instead of `query_embeddings:`. Extra arguments will be passed to the `embedder:` as `options`. Behind the scenes, `Model.query` will embed the given `text:`, then query the collection, and return the closest `results:` records.
+- `Model.collection` to access the collection directly.
+- `Model.delete_collection` to delete the entire collection.
 
 ```ruby
 puts Post.collection.count # Gets the number of documents inside the collection. Should always match Post.count.
 
 Post.query(
-
+  text: params[:query],
   results: 20,
   where: chroma_search_filters,
-
+  is_query: true # `is_query` here will be passed to `Post.embed` as an option.
 )
-```
 
-
+Post.first.neighbors(results: 20) # => [#<Post:0x0000ffff9e0b5f10 id: "0beb0f98, ...>, ...]
+```
 
 Also, `chromable` provides the following methods for each model instance:
 
 - `embedding`: Retrieves the instance's ChromaDB embedding object.
 - `upsert_embedding`: Creates or updates the instance's ChromaDB embedding object.
 - `destroy_embedding`: Destroys the instance's ChromaDB embedding object.
+- `neighbors`: Gets the closest `results:` records to the current record.
 
-All these methods (including `Model.query` and `Model.
+All these methods (including `Model.query`, `Model.collection`, and `Model.delete_collection`) are available with `chroma_` prefix, if you have similar methods defined in your model.
 
 ## Development
 
@@ -86,7 +93,7 @@ To install this gem onto your local machine, run `bundle install`. To release a
 
 ## Contributing
 
-Bug reports and pull requests are welcome on GitHub at https://github.com/AliOsm/chromable. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/
+Bug reports and pull requests are welcome on GitHub at https://github.com/AliOsm/chromable. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/AliOsm/chromable/blob/main/CODE_OF_CONDUCT.md).
 
 ## License
 
@@ -94,4 +101,4 @@ The gem is available as open source under the terms of the [MIT License](https:/
 
 ## Code of Conduct
 
-Everyone interacting in the Chromable project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/
+Everyone interacting in the Chromable project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/AliOsm/chromable/blob/main/CODE_OF_CONDUCT.md).
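The README change above says every helper stays reachable under a `chroma_` prefix when the model already defines the short name. The following is a minimal, self-contained sketch of that conditional-alias pattern, not the gem's code: it uses no Rails or ChromaDB, and `PrefixedFinders`, `Article`, and `LegacyArticle` are hypothetical names chosen only for illustration.

```ruby
# Hypothetical stand-in for a module that is extended onto a model class.
module PrefixedFinders
  # Runs right after `base.extend PrefixedFinders`.
  def self.extended(base)
    class << base
      # Add the short name only when the class does not already respond to it.
      alias_method :query, :chroma_query unless method_defined?(:query)
    end
  end

  # The prefixed method is always available.
  def chroma_query(text:)
    "chroma:#{text}"
  end
end

# No `query` of its own, so the short alias is created.
class Article
  extend PrefixedFinders
end

# Already has `query`, so the alias is skipped and the original is kept.
class LegacyArticle
  def self.query(text:)
    "custom:#{text}"
  end

  extend PrefixedFinders
end

puts Article.query(text: 'ruby')              # prints "chroma:ruby"
puts LegacyArticle.query(text: 'ruby')        # prints "custom:ruby"
puts LegacyArticle.chroma_query(text: 'ruby') # prints "chroma:ruby"
```

In this sketch the model's own method wins either way: if it is defined before the module is extended the alias is skipped, and if it is defined afterwards it simply replaces the alias.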
data/lib/chromable/version.rb
CHANGED
data/lib/chromable.rb
CHANGED
@@ -6,75 +6,122 @@ require_relative 'chromable/version'
 module Chromable
   def self.included(base)
     base.extend ClassMethods
-    base.
-    base.class_attribute :document
-    base.class_attribute :metadata
-    base.class_attribute :embedder
+    base.include InstanceMethods
 
     base.after_save :chroma_upsert_embedding
     base.after_destroy :chroma_destroy_embedding
   end
 
+  # Chromable settings class to hide them from Rails models.
+  class Settings
+    attr_accessor :document, :collection_name, :metadata, :embedder, :keep_document
+
+    def initialize(document:, metadata: nil, embedder: nil, collection_name: nil, keep_document: true)
+      @collection_name = collection_name
+      @document = document
+      @metadata = metadata
+      @embedder = embedder
+      @keep_document = keep_document
+    end
+  end
+
   # Methods to be added to the model class.
   module ClassMethods
-    def
-
-
-
-
+    def self.extended(base)
+      class << base
+        alias_method :collection, :chroma_collection unless method_defined? :collection
+        alias_method :delete_collection, :chroma_delete_collection unless method_defined? :delete_collection
+        alias_method :query, :chroma_query unless method_defined? :query
+      end
+
+      base.cattr_accessor :chromable_settings
+    end
+
+    def chromable(**options)
+      options[:collection_name] ||= name.underscore.pluralize
+
+      self.chromable_settings = Settings.new(**options)
     end
 
     def chroma_collection
-      Chroma::Resources::Collection.get_or_create(collection_name)
+      Chroma::Resources::Collection.get_or_create(chromable_settings.collection_name)
     end
 
-    def
-
-
-
-
-      include: %w[metadatas documents distances],
-      **embedder_options
-    )
+    def chroma_delete_collection
+      Chroma::Resources::Collection.delete(chromable_settings.collection_name)
+    end
+
+    def chroma_query(text:, results: 10, where: {}, where_document: {}, **embedder_options)
       find(chroma_collection.query(
-        query_embeddings: [send(embedder,
+        query_embeddings: [send(chromable_settings.embedder, text, **embedder_options)],
         results: results,
         where: where,
         where_document: where_document,
         include: include
       ).map(&:id))
     end
-
-    alias collection chroma_collection unless method_defined? :collection
-    alias query chroma_query unless method_defined? :query
   end
 
-
-
-
+  # Methods to be added to the model instances.
+  module InstanceMethods
+    def self.included(base)
+      base.instance_eval do
+        # rubocop:disable Style/Alias
+        alias_method :embedding, :chroma_embedding unless method_defined? :embedding
+        alias_method :upsert_embedding, :chroma_upsert_embedding unless method_defined? :upsert_embedding
+        alias_method :destroy_embedding, :chroma_destroy_embedding unless method_defined? :destroy_embedding
+        alias_method :neighbors, :chroma_neighbors unless method_defined? :neighbors
+        # rubocop:enable Style/Alias
+      end
+    end
 
-
-
-
+    def chroma_embedding
+      self.class.chroma_collection.get(ids: [id])[0]
+    end
 
-
-
-
+    def chroma_upsert_embedding
+      self.class.chroma_collection.upsert(build_embedding)
+    end
 
-
-
-
+    def chroma_destroy_embedding
+      self.class.chroma_collection.delete(ids: [id])
+    end
+
+    def chroma_neighbors(results: 10, where: {}, where_document: {})
+      collection = self.class.chroma_collection
+
+      embedding = collection.get(ids: [id], include: [:embeddings])[0].embedding
+
+      self.class.find(collection.query(
+        query_embeddings: [embedding],
+        results: results,
+        where: where,
+        where_document: where_document,
+        include: include
+      ).map(&:id))
+    end
+
+    private
+
+    def build_embedding
+      Chroma::Resources::Embedding.new(
+        id: id,
+        document: document_to_embed,
+        embedding: document_embedding,
+        metadata: embedding_metadata
+      )
+    end
 
-
+    def document_to_embed
+      chromable_settings.keep_document ? send(chromable_settings.document) : nil
+    end
 
-
-
+    def document_embedding
+      chromable_settings.embedder && self.class.send(chromable_settings.embedder, send(chromable_settings.document))
+    end
 
-
-
-
-        embedding: self.class.embedder && self.class.send(self.class.embedder, document),
-        metadata: self.class.metadata&.index_with { |attribute| send(attribute) }
-      )
+    def embedding_metadata
+      chromable_settings.metadata&.index_with { |attribute| send(attribute) }
+    end
   end
 end
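The `build_embedding` helper in the diff above combines three settings. A plain-Ruby sketch of that flow, under stated assumptions, may make the effect of `keep_document: false` easier to see: the document text is still handed to the embedder, but `nil` is stored in its place. `ToySettings`, `ToyPost`, `toy_embed`, and the returned hash are hypothetical stand-ins; no `chroma-db` classes are used, and `to_h` with a block replaces Active Support's `index_with` so the sketch runs without Rails.

```ruby
# Hypothetical stand-in for the settings object, with the same keyword defaults.
class ToySettings
  attr_reader :document, :metadata, :embedder, :keep_document

  def initialize(document:, metadata: nil, embedder: nil, keep_document: true)
    @document = document
    @metadata = metadata
    @embedder = embedder
    @keep_document = keep_document
  end
end

# Hypothetical model; a real one would be an ActiveRecord class.
class ToyPost
  SETTINGS = ToySettings.new(
    document: :content,
    metadata: %i[author],
    embedder: :toy_embed,
    keep_document: false
  )

  attr_reader :id, :content, :author

  def initialize(id:, content:, author:)
    @id = id
    @content = content
    @author = author
  end

  # Toy embedder: a one-dimensional "vector" holding the text length.
  def self.toy_embed(text, **_options)
    [text.length.to_f]
  end

  # Returns a plain hash where the gem would build a Chroma embedding object.
  def build_embedding
    text = public_send(SETTINGS.document)
    {
      id: id,
      # The document is dropped from the payload when keep_document is false.
      document: SETTINGS.keep_document ? text : nil,
      # The embedding is still computed from the document text.
      embedding: SETTINGS.embedder && self.class.public_send(SETTINGS.embedder, text),
      metadata: SETTINGS.metadata&.to_h { |attribute| [attribute, public_send(attribute)] }
    }
  end
end

p ToyPost.new(id: 1, content: 'hello', author: 'ali').build_embedding
```

With these toy values the payload carries an embedding and metadata but no document, which matches the README's description of keeping only embeddings in ChromaDB.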
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: chromable
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.3.1
 platform: ruby
 authors:
 - Ali Hamdi Ali Fadel
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2023-12-
+date: 2023-12-13 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: chroma-db