RubyGems - langchainrb - Versions diffs - 0.6.1 → 0.6.3 - Mend

langchainrb 0.6.1 → 0.6.3

Files changed (15)

checksums.yaml +4 -4
data/CHANGELOG.md +6 -0
data/Gemfile.lock +4 -4
data/README.md +45 -11
data/lib/langchain/active_record/hooks.rb +4 -2
data/lib/langchain/vectorsearch/base.rb +10 -0
data/lib/langchain/vectorsearch/chroma.rb +22 -4
data/lib/langchain/vectorsearch/hnswlib.rb +2 -0
data/lib/langchain/vectorsearch/milvus.rb +4 -0
data/lib/langchain/vectorsearch/pgvector.rb +2 -0
data/lib/langchain/vectorsearch/pinecone.rb +34 -4
data/lib/langchain/vectorsearch/qdrant.rb +17 -5
data/lib/langchain/vectorsearch/weaviate.rb +11 -4
data/lib/langchain/version.rb +1 -1
metadata +4 -4

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 5a6f4e8bb8ecaba6ff4d53bba384bd6338012429a69a0dc7df0a58a476763e7e
-  data.tar.gz: 92211a22fca9664831cf4f395a53dedddafc339ab419780932398c07256b737d
+  metadata.gz: 73f980d6a7dd67d0112038a8266a05f8b5697e05c98e61a94598d38406de7c8b
+  data.tar.gz: 8abc93ad6da8ad05d76ac35eff9aaab963c33549acb94bda4dd83daddeb71f4d
 SHA512:
-  metadata.gz: b5c84f0a9a54f51799c5318cba243457fcfd6f026c71b8f34e58cf60172d476963f25ea8d24c49b35ed93c893adb9e2844443a22dd9e927ab16318850a11419a
-  data.tar.gz: 4664927203ea032f737000c27ec5fa04c96ab606ec8377b4673b48638905b458077d4ab3cb7727fcb98be6c607a37bd318395fd96000a734de213c7d9041a219
+  metadata.gz: 7b5450e51ee732a1e2414e3db5f8a46d113d0b537b561f95556756e2854c9bb9175c898388acc2bb8672b2479e647625d3166580b7b1b25eb6cdc86ff6d42aee
+  data.tar.gz: b5843004533f952782946e6a753aa5306c6ad4a5f97887416f8f10f4192ca1f88d00d30624cd62581022314649e5d291d9c1ab46f2bab31f9455860fc533c83d

data/CHANGELOG.md CHANGED

@@ -1,5 +1,11 @@
 ## [Unreleased]
+## [0.6.3] - 2023-06-25
+- Add #destroy_default_schema() to Langchain::Vectorsearch::* classes
+## [0.6.2] - 2023-06-25
+- Qdrant, Chroma, and Pinecone are supported by ActiveRecord hooks
 ## [0.6.1] - 2023-06-24
 - Adding support to hook vectorsearch into ActiveRecord models

data/Gemfile.lock CHANGED

@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    langchainrb (0.6.1)
+    langchainrb (0.6.3)
       baran (~> 0.1.6)
       colorize (~> 0.8.1)
       json-schema (~> 4.0.0)
@@ -133,7 +133,7 @@ GEM
       faraday (>= 1.0)
       faraday_middleware
       graphql-client
-    graphql (2.0.21)
+    graphql (2.0.23)
     graphql-client (0.18.0)
       activesupport (>= 3.0)
       graphql
@@ -298,7 +298,7 @@ GEM
     tzinfo (2.0.6)
       concurrent-ruby (~> 1.0)
     unicode-display_width (2.4.2)
-    weaviate-ruby (0.8.1)
+    weaviate-ruby (0.8.3)
       faraday (~> 1)
       faraday_middleware (~> 1)
       graphlient (~> 0.6.0)
@@ -346,7 +346,7 @@ DEPENDENCIES
   safe_ruby (~> 1.0.4)
   sequel (~> 5.68.0)
   standardrb
-  weaviate-ruby (~> 0.8.0)
+  weaviate-ruby (~> 0.8.3)
   wikipedia-client (~> 1.17.0)
   yard

data/README.md CHANGED

@@ -35,19 +35,19 @@ require "langchain"
 | Database | Querying           | Storage | Schema Management | Backups | Rails Integration |
 | -------- |:------------------:| -------:| -----------------:| -------:| -----------------:|
-| [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | WIP               |
+| [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | :white_check_mark: |
 | [Hnswlib](https://github.com/nmslib/hnswlib/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | WIP               |
 | [Milvus](https://milvus.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | WIP               |
-| [Pinecone](https://www.pinecone.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | WIP               |
+| [Pinecone](https://www.pinecone.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | :white_check_mark: |
 | [Pgvector](https://github.com/pgvector/pgvector) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | WIP               |
-| [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | WIP               |
+| [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | :white_check_mark: |
 | [Weaviate](https://weaviate.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP     | :white_check_mark: |
 ### Using Vector Search Databases 🔍
 Choose the LLM provider you'll be using (OpenAI or Cohere) and retrieve the API key.
-Add `gem "weaviate-ruby", "~> 0.8.0"`  to your Gemfile.
+Add `gem "weaviate-ruby", "~> 0.8.3"`  to your Gemfile.
 Pick the vector search database you'll be using and instantiate the client:
 ```ruby
@@ -110,6 +110,22 @@ client.ask(
 )
 ```
+## Integrating Vector Search into ActiveRecord models
+```ruby
+class Product < ActiveRecord::Base
+  vectorsearch provider: Langchain::Vectorsearch::Qdrant.new(
+                 api_key: ENV["QDRANT_API_KEY"],
+                 url: ENV["QDRANT_URL"],
+                 index_name: "Products",
+                 llm: Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"])
+               )
+  after_save :upsert_to_vectorsearch
+end
+```
+Additional info [here](https://github.com/andreibondarev/langchainrb/blob/main/lib/langchain/active_record/hooks.rb#L10-L38).
 ### Using Standalone LLMs 🗣️
 Add `gem "ruby-openai", "~> 4.0.0"` to your Gemfile.
@@ -370,15 +386,33 @@ Langchain.logger.level = :info
 Join us in the [Langchain.rb](https://discord.gg/WDARp7J2n8) Discord server.
 ## Core Contributors
-[<img style="border-radius:50%" alt="Andrei Bondarev" src="https://avatars.githubusercontent.com/u/541665?v=4" width="80" height="80" class="avatar">](https://github.com/andreibondarev)
+[<img style="border-radius:50%" alt="Andrei Bondarev" src="https://avatars.githubusercontent.com/u/541665?v=4" width="80" height="80" class="avatar">](https://twitter.com/rushing_andrei)
-## Honorary Contributors
-[<img style="border-radius:50%" alt="Andrei Bondarev" src="https://avatars.githubusercontent.com/u/541665?v=4" width="80" height="80" class="avatar">](https://github.com/andreibondarev)
-[<img style="border-radius:50%" alt="Rafael Figueiredo" src="https://avatars.githubusercontent.com/u/35845775?v=4" width="80" height="80" class="avatar">](https://github.com/rafaelqfigueiredo)
-[<img style="border-radius:50%" alt="Ricky Chilcott" src="https://avatars.githubusercontent.com/u/445759?v=4" width="80" height="80" class="avatar">](https://github.com/rickychilcott)
+## Contributors
 [<img style="border-radius:50%" alt="Alex Chaplinsky" src="https://avatars.githubusercontent.com/u/695947?v=4" width="80" height="80" class="avatar">](https://github.com/alchaplinsky)
-(Criteria for becoming an Honorary Contributor or Core Contributor is pending...)
+[<img style="border-radius:50%" alt="Josh Nichols" src="https://avatars.githubusercontent.com/u/159?v=4" width="80" height="80" class="avatar">](https://github.com/technicalpickles)
+[<img style="border-radius:50%" alt="Matt Lindsey" src="https://avatars.githubusercontent.com/u/5638339?v=4" width="80" height="80" class="avatar">](https://github.com/mattlindsey)
+[<img style="border-radius:50%" alt="Ricky Chilcott" src="https://avatars.githubusercontent.com/u/445759?v=4" width="80" height="80" class="avatar">](https://github.com/rickychilcott)
+[<img style="border-radius:50%" alt="Moeki Kawakami" src="https://avatars.githubusercontent.com/u/72325947?v=4" width="80" height="80" class="avatar">](https://github.com/moekidev)
+[<img style="border-radius:50%" alt="Jens Stmrs" src="https://avatars.githubusercontent.com/u/3492669?v=4" width="80" height="80" class="avatar">](https://github.com/faustus7)
+[<img style="border-radius:50%" alt="Rafael Figueiredo" src="https://avatars.githubusercontent.com/u/35845775?v=4" width="80" height="80" class="avatar">](https://github.com/rafaelqfigueiredo)
+[<img style="border-radius:50%" alt="Piero Dotti" src="https://avatars.githubusercontent.com/u/5167659?v=4" width="80" height="80" class="avatar">](https://github.com/ProGM)
+[<img style="border-radius:50%" alt="Michał Ciemięga" src="https://avatars.githubusercontent.com/u/389828?v=4" width="80" height="80" class="avatar">](https://github.com/zewelor)
+[<img style="border-radius:50%" alt="Bruno Bornsztein" src="https://avatars.githubusercontent.com/u/3760?v=4" width="80" height="80" class="avatar">](https://github.com/bborn)
+[<img style="border-radius:50%" alt="Tim Williams" src="https://avatars.githubusercontent.com/u/1192351?v=4" width="80" height="80" class="avatar">](https://github.com/timrwilliams)
+[<img style="border-radius:50%" alt="Zhenhang Tung" src="https://avatars.githubusercontent.com/u/8170159?v=4" width="80" height="80" class="avatar">](https://github.com/ZhenhangTung)
+[<img style="border-radius:50%" alt="Hama" src="https://avatars.githubusercontent.com/u/38002468?v=4" width="80" height="80" class="avatar">](https://github.com/akmhmgc)
+[<img style="border-radius:50%" alt="Josh Weir" src="https://avatars.githubusercontent.com/u/10720337?v=4" width="80" height="80" class="avatar">](https://github.com/joshweir)
+[<img style="border-radius:50%" alt="Arthur Hess" src="https://avatars.githubusercontent.com/u/446035?v=4" width="80" height="80" class="avatar">](https://github.com/arthurhess)
+[<img style="border-radius:50%" alt="Jin Shen" src="https://avatars.githubusercontent.com/u/54917718?v=4" width="80" height="80" class="avatar">](https://github.com/jacshen-ebay)
+[<img style="border-radius:50%" alt="Earle Bunao" src="https://avatars.githubusercontent.com/u/4653624?v=4" width="80" height="80" class="avatar">](https://github.com/erbunao)
+[<img style="border-radius:50%" alt="Maël H." src="https://avatars.githubusercontent.com/u/61985678?v=4" width="80" height="80" class="avatar">](https://github.com/mael-ha)
+[<img style="border-radius:50%" alt="Chris O. Adebiyi" src="https://avatars.githubusercontent.com/u/62605573?v=4" width="80" height="80" class="avatar">](https://github.com/oluvvafemi)
+[<img style="border-radius:50%" alt="Aaron Breckenridge" src="https://avatars.githubusercontent.com/u/201360?v=4" width="80" height="80" class="avatar">](https://github.com/breckenedge)
+## Star History
+[![Star History Chart](https://api.star-history.com/svg?repos=andreibondarev/langchainrb&type=Date)](https://star-history.com/#andreibondarev/langchainrb&Date)
 ## Contributing

data/lib/langchain/active_record/hooks.rb CHANGED

@@ -35,7 +35,7 @@ module Langchain
     # Query the vector search provider
     #     Recipe.similarity_search("carnivore dish")
     # Delete the default schema to start over
-    #     Recipe.class_variable_get(:@@provider).client.schema.delete class_name: "Recipes"
+    #     Recipe.class_variable_get(:@@provider).destroy_default_schema
     #
     module Hooks
       def self.included(base)
@@ -87,7 +87,9 @@ module Langchain
             query: query,
             k: k
           )
-          ids = records.map { |record| record.dig("__id") }
+          # We use "__id" when Weaviate is the provider
+          ids = records.map { |record| record.dig("id") || record.dig("__id") }
           where(id: ids)
         end
       end
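
The ID fallback introduced in `similarity_search` above can be tried in isolation. This is a minimal sketch, assuming result records are plain hashes; the `extract_ids` helper and the sample records are hypothetical and not part of the gem:

```ruby
# Hypothetical helper mirroring the fallback in the hunk above:
# prefer "id", and fall back to "__id" (the key used when Weaviate is the provider).
def extract_ids(records)
  records.map { |record| record.dig("id") || record.dig("__id") }
end

# Made-up sample records for illustration only
id_keyed = [{"id" => 1, "payload" => {"content" => "a"}}]
underscore_keyed = [{"__id" => "2", "content" => "b"}]

extract_ids(id_keyed)         # => [1]
extract_ids(underscore_keyed) # => ["2"]
```

A record carrying neither key yields `nil` in the resulting array, which `where(id: ids)` would then receive as-is.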

data/lib/langchain/vectorsearch/base.rb CHANGED

@@ -103,11 +103,21 @@ module Langchain::Vectorsearch
       raise NotImplementedError, "#{self.class.name} does not support creating a default schema"
     end
+    # Method supported by Vectorsearch DB to delete the default schema
+    def destroy_default_schema
+      raise NotImplementedError, "#{self.class.name} does not support deleting a default schema"
+    end
     # Method supported by Vectorsearch DB to add a list of texts to the index
     def add_texts(...)
       raise NotImplementedError, "#{self.class.name} does not support adding texts"
     end
+    # Method supported by Vectorsearch DB to update a list of texts to the index
+    def update_texts(...)
+      raise NotImplementedError, "#{self.class.name} does not support updating texts"
+    end
     # Method supported by Vectorsearch DB to search for similar texts in the index
     def similarity_search(...)
       raise NotImplementedError, "#{self.class.name} does not support similarity search"
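
The base-class pattern in this hunk (every optional operation raises `NotImplementedError` until a provider overrides it) can be sketched as follows. The class names `SketchBase` and `SketchProvider` are hypothetical stand-ins, not classes from the gem:

```ruby
# Hypothetical stand-in for the base class: each operation raises by default.
class SketchBase
  def destroy_default_schema
    raise NotImplementedError, "#{self.class.name} does not support deleting a default schema"
  end

  def update_texts(...)
    raise NotImplementedError, "#{self.class.name} does not support updating texts"
  end
end

# Hypothetical provider that overrides only one of the two methods.
class SketchProvider < SketchBase
  def destroy_default_schema
    :deleted
  end
end

provider = SketchProvider.new
provider.destroy_default_schema # => :deleted

begin
  provider.update_texts(texts: ["a"], ids: [1])
rescue NotImplementedError => e
  e.message # => "SketchProvider does not support updating texts"
end
```

Note that `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` would not catch it; callers need to name it explicitly as above.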

data/lib/langchain/vectorsearch/chroma.rb CHANGED

@@ -32,11 +32,10 @@ module Langchain::Vectorsearch
     # Add a list of texts to the index
     # @param texts [Array] The list of texts to add
     # @return [Hash] The response from the server
-    def add_texts(texts:)
-      embeddings = Array(texts).map do |text|
+    def add_texts(texts:, ids: [])
+      embeddings = Array(texts).map.with_index do |text, i|
         ::Chroma::Resources::Embedding.new(
-          # TODO: Add support for passing your own IDs
-          id: SecureRandom.uuid,
+          id: ids[i] ? ids[i].to_s : SecureRandom.uuid,
           embedding: llm.embed(text: text),
           # TODO: Add support for passing metadata
           metadata: [], # metadatas[index],
@@ -48,12 +47,31 @@ module Langchain::Vectorsearch
       collection.add(embeddings)
     end
+    def update_texts(texts:, ids:)
+      embeddings = Array(texts).map.with_index do |text, i|
+        ::Chroma::Resources::Embedding.new(
+          id: ids[i].to_s,
+          embedding: llm.embed(text: text),
+          # TODO: Add support for passing metadata
+          metadata: [], # metadatas[index],
+          document: text # Do we actually need to store the whole original document?
+        )
+      end
+      collection.update(embeddings)
+    end
     # Create the collection with the default schema
     # @return [Hash] The response from the server
     def create_default_schema
       ::Chroma::Resources::Collection.create(index_name)
     end
+    # TODO: Uncomment and add the spec
+    # def destroy_default_schema
+    #   ::Chroma::Resources::Collection.delete(index_name)
+    # end
     # Search for similar texts
     # @param query [String] The text to search for
     # @param k [Integer] The number of results to return
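
The ID handling that `add_texts` gains in this hunk (use the caller's ID when one is supplied, otherwise generate a UUID) can be sketched on its own. The `resolve_ids` helper is hypothetical and only illustrates the expression `ids[i] ? ids[i].to_s : SecureRandom.uuid`:

```ruby
require "securerandom"

# Hypothetical helper: returns one ID per text, taking the caller-supplied
# ID (as a string) where present and a random UUID otherwise.
def resolve_ids(texts, ids = [])
  Array(texts).map.with_index do |_text, i|
    ids[i] ? ids[i].to_s : SecureRandom.uuid
  end
end

resolved = resolve_ids(["first text", "second text"], [42])
resolved[0] # => "42"
resolved[1] # a random UUID string, different on every run
```

Because the lookup is positional, a shorter `ids` array leaves the trailing texts with generated UUIDs.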

data/lib/langchain/vectorsearch/hnswlib.rb CHANGED

@@ -53,6 +53,8 @@ module Langchain::Vectorsearch
       client.save_index(path_to_index)
     end
+    # TODO: Add update_texts method
     #
     # Search for similar texts
     #

data/lib/langchain/vectorsearch/milvus.rb CHANGED

@@ -39,6 +39,8 @@ module Langchain::Vectorsearch
       )
     end
+    # TODO: Add update_texts method
     # Create default schema
     # @return [Hash] The response from the server
     def create_default_schema
@@ -77,6 +79,8 @@ module Langchain::Vectorsearch
       )
     end
+    # TODO: Add destroy_default_schema method
     def similarity_search(query:, k: 4)
       embedding = llm.embed(text: query)

data/lib/langchain/vectorsearch/pgvector.rb CHANGED

@@ -69,6 +69,8 @@ module Langchain::Vectorsearch
       )
     end
+    # TODO: Add destroy_default_schema method
     # Search for similar texts in the index
     # @param query [String] The text to search for
     # @param k [Integer] The number of top results to return

data/lib/langchain/vectorsearch/pinecone.rb CHANGED

@@ -33,14 +33,14 @@ module Langchain::Vectorsearch
     # Add a list of texts to the index
     # @param texts [Array] The list of texts to add
+    # @param ids [Array] The list of IDs to add
     # @param namespace [String] The namespace to add the texts to
     # @param metadata [Hash] The metadata to use for the texts
     # @return [Hash] The response from the server
-    def add_texts(texts:, namespace: "", metadata: nil)
-      vectors = texts.map do |text|
+    def add_texts(texts:, ids: [], namespace: "", metadata: nil)
+      vectors = texts.map.with_index do |text, i|
         {
-          # TODO: Allows passing in your own IDs
-          id: SecureRandom.uuid,
+          id: ids[i] ? ids[i].to_s : SecureRandom.uuid,
           metadata: metadata || {content: text},
           values: llm.embed(text: text)
         }
@@ -51,6 +51,24 @@ module Langchain::Vectorsearch
       index.upsert(vectors: vectors, namespace: namespace)
     end
+    # Update a list of texts in the index
+    # @param texts [Array] The list of texts to update
+    # @param ids [Array] The list of IDs to update
+    # @param namespace [String] The namespace to update the texts in
+    # @param metadata [Hash] The metadata to use for the texts
+    # @return [Array] The response from the server
+    def update_texts(texts:, ids:, namespace: "", metadata: nil)
+      texts.map.with_index do |text, i|
+        # Pinecone::Vector#update ignore args when it is empty
+        index.update(
+          namespace: namespace,
+          id: ids[i].to_s,
+          values: llm.embed(text: text),
+          set_metadata: metadata
+        )
+      end
+    end
     # Create the index with the default schema
     # @return [Hash] The response from the server
     def create_default_schema
@@ -61,6 +79,12 @@ module Langchain::Vectorsearch
       )
     end
+    # Delete the index
+    # @return [Hash] The response from the server
+    def destroy_default_schema
+      client.delete_index(index_name)
+    end
     # Search for similar texts
     # @param query [String] The text to search for
     # @param k [Integer] The number of results to return
@@ -122,5 +146,11 @@ module Langchain::Vectorsearch
       llm.chat(prompt: prompt)
     end
+    # Pinecone index
+    # @return [Object] The Pinecone index
+    private def index
+      client.index(index_name)
+    end
   end
 end

data/lib/langchain/vectorsearch/qdrant.rb CHANGED

@@ -32,11 +32,11 @@ module Langchain::Vectorsearch
     # Add a list of texts to the index
     # @param texts [Array] The list of texts to add
     # @return [Hash] The response from the server
-    def add_texts(texts:)
+    def add_texts(texts:, ids:)
       batch = {ids: [], vectors: [], payloads: []}
-      Array(texts).each do |text|
-        batch[:ids].push(SecureRandom.uuid)
+      Array(texts).each_with_index do |text, i|
+        batch[:ids].push(ids[i] || SecureRandom.uuid)
         batch[:vectors].push(llm.embed(text: text))
         batch[:payloads].push({content: text})
       end
@@ -47,6 +47,16 @@ module Langchain::Vectorsearch
       )
     end
+    def update_texts(texts:, ids:)
+      add_texts(texts: texts, ids: ids)
+    end
+    # Deletes the default schema
+    # @return [Hash] The response from the server
+    def destroy_default_schema
+      client.collections.delete(collection_name: index_name)
+    end
     # Create the index with the default schema
     # @return [Hash] The response from the server
     def create_default_schema
@@ -83,12 +93,14 @@ module Langchain::Vectorsearch
       embedding:,
       k: 4
     )
-      client.points.search(
+      response = client.points.search(
         collection_name: index_name,
         limit: k,
         vector: embedding,
-        with_payload: true
+        with_payload: true,
+        with_vector: true
       )
+      response.dig("result")
     end
     # Ask a question and return the answer

data/lib/langchain/vectorsearch/weaviate.rb CHANGED

@@ -5,7 +5,7 @@ module Langchain::Vectorsearch
     #
     # Wrapper around Weaviate
     #
-    # Gem requirements: gem "weaviate-ruby", "~> 0.8.0"
+    # Gem requirements: gem "weaviate-ruby", "~> 0.8.3"
     #
     # Usage:
     # weaviate = Langchain::Vectorsearch::Weaviate.new(url:, api_key:, index_name:, llm:, llm_api_key:)
@@ -35,7 +35,7 @@ module Langchain::Vectorsearch
     # Add a list of texts to the index
     # @param texts [Array] The list of texts to add
     # @return [Hash] The response from the server
-    def add_texts(texts:, ids:)
+    def add_texts(texts:, ids: [])
       client.objects.batch_create(
         objects: weaviate_objects(texts, ids)
       )
@@ -72,6 +72,7 @@ module Langchain::Vectorsearch
     end
     # Create default schema
+    # @return [Hash] The response from the server
     def create_default_schema
       client.schema.create(
         class_name: index_name,
@@ -84,6 +85,12 @@ module Langchain::Vectorsearch
       )
     end
+    # Delete the index
+    # @return [Boolean] Whether the index was deleted
+    def destroy_default_schema
+      client.schema.delete(class_name: index_name)
+    end
     # Return documents similar to the query
     # @param query [String] The query to search for
     # @param k [Integer|String] The number of results to return
@@ -127,13 +134,13 @@ module Langchain::Vectorsearch
     private
-    def weaviate_objects(texts, ids)
+    def weaviate_objects(texts, ids = [])
       Array(texts).map.with_index do |text, i|
         weaviate_object(text, ids[i])
       end
     end
-    def weaviate_object(text, id)
+    def weaviate_object(text, id = nil)
       {
         class: index_name,
         properties: {

data/lib/langchain/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Langchain
-  VERSION = "0.6.1"
+  VERSION = "0.6.3"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: langchainrb
 version: !ruby/object:Gem::Version
-  version: 0.6.1
+  version: 0.6.3
 platform: ruby
 authors:
 - Andrei Bondarev
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2023-06-24 00:00:00.000000000 Z
+date: 2023-06-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: baran
@@ -436,14 +436,14 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.8.0
+        version: 0.8.3
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.8.0
+        version: 0.8.3
 - !ruby/object:Gem::Dependency
   name: wikipedia-client
   requirement: !ruby/object:Gem::Requirement