langchainrb 0.7.5 → 0.8.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f4c388275b83a0e4260f4ae9271f4c164a8d34ea5ea9585916d91e7e9c17c980
- data.tar.gz: 8daa400de3ed80bb3fb9c53cc19ef4d56f137c2aa157bd268dbda488d0fca432
+ metadata.gz: 5dd13c5aae47af13fe248636ed88bd40d0e241291ab5c3dc2d5925dcc742af37
+ data.tar.gz: b190f73403a77b4ea4d1f9869423546d584df32785ae342a01d9a72ee5fe04fd
  SHA512:
- metadata.gz: 4bae87c050be6a8fa011c1ae5de4b119abac498669f2e63ca1829e11b7b5ecca7610330be670d24fd6cb98c2e2599c593e9922378985efc586d76c124efb865e
- data.tar.gz: 2a39b084c6a239aeb0de22bfc87629d2f2909b23eabfcf71a835a1f1624d84afe3ea106afdafb8f1fb301b7934d73abc7253c9b8bd3f6c9b170231ebb5af0936
+ metadata.gz: 81dd80f49173e3d711a713b6dd365addf04129cb0f6c015d6909200a709780e30c39888f0bccba72035e03c17a0b01a4d1456e6431473149d9969907435f18c1
+ data.tar.gz: 748f841cf01b802e81bc6f6ecf8aaea5ab13593363afadc7c9634446c169812064dd41af3e58e87068a224972be85f00b1e3c2669a99e1406819507c86b1a15c
data/CHANGELOG.md CHANGED
@@ -1,5 +1,14 @@
  ## [Unreleased]

+ ## [0.8.1]
+ - Support for Epsilla vector DB
+ - Fully functioning Google Vertex AI LLM
+ - Bug fixes
+
+ ## [0.8.0]
+ - [BREAKING] Updated llama_cpp.rb to 0.9.4. The model file format used by the underlying llama.cpp library has changed to GGUF. llama.cpp ships with scripts to convert existing files and GGUF format models can be downloaded from HuggingFace.
+ - Introducing Langchain::LLM::GoogleVertexAi LLM provider
+
  ## [0.7.5] - 2023-11-13
  - Fixes

data/README.md CHANGED
@@ -1,3 +1,6 @@
+ # Please fill out the [Ruby AI Survey 2023](https://docs.google.com/forms/d/1dH_0js1wpEyh1YqPTOxU3b5fXj76sb5lYp12lVoNNZE/edit).
+ Results will be anonymized and shared!
+
  💎🔗 Langchain.rb
  ---
  ⚡ Building LLM-powered applications in Ruby ⚡
@@ -53,23 +56,24 @@ require "langchain"
  Langchain.rb wraps all supported LLMs in a unified interface allowing you to easily swap out and test out different models.

  #### Supported LLMs and features:
- | LLM providers | embed() | complete() | chat() | summarize() | Notes |
- | -------- |:------------------:| :-------: | :-----------------: | :-------: | :----------------- |
- | [OpenAI](https://openai.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | ❌ | Including Azure OpenAI |
- | [AI21](https://ai21.com/) | ❌ | :white_check_mark: | ❌ | :white_check_mark: | |
- | [Anthropic](https://milvus.io/) | ❌ | :white_check_mark: | ❌ | ❌ | |
- | [AWS Bedrock](https://aws.amazon.com/bedrock) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | Provides AWS, Cohere, AI21, Antropic and Stability AI models |
- | [Cohere](https://www.pinecone.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
- | [GooglePalm](https://ai.google/discover/palm2/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
- | [HuggingFace](https://huggingface.co/) | :white_check_mark: | ❌ | ❌ | ❌ | |
- | [Ollama](https://ollama.ai/) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | |
- | [Replicate](https://replicate.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
+ | LLM providers | embed() | complete() | chat() | summarize() | Notes |
+ | -------- |:------------------:| :-------: | :-----------------: | :-------: | :----------------- |
+ | [OpenAI](https://openai.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ❌ | Including Azure OpenAI |
+ | [AI21](https://ai21.com/?utm_source=langchainrb&utm_medium=github) | ❌ | ✅ | ❌ | ✅ | |
+ | [Anthropic](https://anthropic.com/?utm_source=langchainrb&utm_medium=github) | ❌ | ✅ | ❌ | ❌ | |
+ | [AWS Bedrock](https://aws.amazon.com/bedrock?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ❌ | ❌ | Provides AWS, Cohere, AI21, Antropic and Stability AI models |
+ | [Cohere](https://cohere.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |
+ | [GooglePalm](https://ai.google/discover/palm2?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |
+ | [Google Vertex AI](https://cloud.google.com/vertex-ai?utm_source=langchainrb&utm_medium=github) | ✅ | ❌ | ❌ | ❌ | |
+ | [HuggingFace](https://huggingface.co/?utm_source=langchainrb&utm_medium=github) | ✅ | | ❌ | ❌ | |
+ | [Ollama](https://ollama.ai/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | | | |
+ | [Replicate](https://replicate.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |

  #### Using standalone LLMs:

  #### OpenAI

- Add `gem "ruby-openai", "~> 5.2.0"` to your Gemfile.
+ Add `gem "ruby-openai", "~> 6.1.0"` to your Gemfile.

  ```ruby
  llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
@@ -86,22 +90,22 @@ llm.embed(text: "foo bar")

  Generate a text completion:
  ```ruby
- llm.complete(prompt: "What is the meaning of life?")
+ llm.complete(prompt: "What is the meaning of life?").completion
  ```

  Generate a chat completion:
  ```ruby
- llm.chat(prompt: "Hey! How are you?")
+ llm.chat(prompt: "Hey! How are you?").completion
  ```

  Summarize the text:
  ```ruby
- llm.complete(text: "...")
+ llm.summarize(text: "...").completion
  ```

  You can use any other LLM by invoking the same interface:
  ```ruby
- llm = Langchain::LLM::GooglePalm.new(...)
+ llm = Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"], default_options: { ... })
  ```
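Because every call now returns a response wrapper rather than a bare string, the same `.completion` accessor works across providers. A minimal, hedged sketch of swapping providers behind this unified interface (assumes both API keys are set; accessors are the ones shown above):

```ruby
# Illustrative only: iterate over two providers through the same interface
[
  Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"]),
  Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"])
].each do |llm|
  puts llm.complete(prompt: "What is the meaning of life?").completion
end
```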

  ### Prompt Management
@@ -247,7 +251,7 @@ Then parse the llm response:

  ```ruby
  llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
- llm_response = llm.chat(prompt: prompt_text)
+ llm_response = llm.chat(prompt: prompt_text).completion
  parser.parse(llm_response)
  # {
  #   "name" => "Kim Ji-hyun",
@@ -303,15 +307,17 @@ Langchain.rb provides a convenient unified interface on top of supported vectors

  #### Supported vector search databases and features:

- | Database | Open-source | Cloud offering |
- | -------- |:------------------:| :------------: |
- | [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: |
- | [Hnswlib](https://github.com/nmslib/hnswlib/) | :white_check_mark: | |
- | [Milvus](https://milvus.io/) | :white_check_mark: | :white_check_mark: Zilliz Cloud |
- | [Pinecone](https://www.pinecone.io/) | | :white_check_mark: |
- | [Pgvector](https://github.com/pgvector/pgvector) | :white_check_mark: | :white_check_mark: |
- | [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: |
- | [Weaviate](https://weaviate.io/) | :white_check_mark: | :white_check_mark: |
+ | Database | Open-source | Cloud offering |
+ | -------- |:------------------:| :------------: |
+ | [Chroma](https://trychroma.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Epsilla](https://epsilla.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Hnswlib](https://github.com/nmslib/hnswlib/?utm_source=langchainrb&utm_medium=github) | ✅ | |
+ | [Milvus](https://milvus.io/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ Zilliz Cloud |
+ | [Pinecone](https://www.pinecone.io/?utm_source=langchainrb&utm_medium=github) | | ✅ |
+ | [Pgvector](https://github.com/pgvector/pgvector/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Qdrant](https://qdrant.tech/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Weaviate](https://weaviate.io/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Elasticsearch](https://www.elastic.co/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |

  ### Using Vector Search Databases 🔍

@@ -337,11 +343,13 @@ client = Langchain::Vectorsearch::Weaviate.new(
  You can instantiate any other supported vector search database:
  ```ruby
  client = Langchain::Vectorsearch::Chroma.new(...) # `gem "chroma-db", "~> 0.6.0"`
+ client = Langchain::Vectorsearch::Epsilla.new(...) # `gem "epsilla-ruby", "~> 0.0.3"`
  client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"`
  client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.2"`
  client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
  client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
- client = Langchain::Vectorsearch::Qdrant.new(...) # `gem"qdrant-ruby", "~> 0.9.3"`
+ client = Langchain::Vectorsearch::Qdrant.new(...) # `gem "qdrant-ruby", "~> 0.9.3"`
+ client = Langchain::Vectorsearch::Elasticsearch.new(...) # `gem "elasticsearch", "~> 8.2.0"`
  ```

  Create the default schema:
@@ -4,7 +4,7 @@ module Langchain::LLM
  # LLM interface for Azure OpenAI Service APIs: https://learn.microsoft.com/en-us/azure/ai-services/openai/
  #
  # Gem requirements:
- # gem "ruby-openai", "~> 5.2.0"
+ # gem "ruby-openai", "~> 6.1.0"
  #
  # Usage:
  # openai = Langchain::LLM::Azure.new(api_key:, llm_options: {}, embedding_deployment_url: chat_deployment_url:)
@@ -131,7 +131,7 @@ module Langchain::LLM
  prompt: prompt,
  temperature: @defaults[:temperature],
  # Most models have a context length of 2048 tokens (except for the newest models, which support 4096).
- max_tokens: 2048
+ max_tokens: 256
  )
  end

lib/langchain/llm/google_vertex_ai.rb ADDED
@@ -0,0 +1,149 @@
+ # frozen_string_literal: true
+
+ module Langchain::LLM
+   #
+   # Wrapper around the Google Vertex AI APIs: https://cloud.google.com/vertex-ai?hl=en
+   #
+   # Gem requirements:
+   #     gem "google-apis-aiplatform_v1", "~> 0.7"
+   #
+   # Usage:
+   #     google_palm = Langchain::LLM::GoogleVertexAi.new(project_id: ENV["GOOGLE_VERTEX_AI_PROJECT_ID"])
+   #
+   class GoogleVertexAi < Base
+     DEFAULTS = {
+       temperature: 0.1, # 0.1 is the default in the API, quite low ("grounded")
+       max_output_tokens: 1000,
+       top_p: 0.8,
+       top_k: 40,
+       dimension: 768,
+       completion_model_name: "text-bison", # Optional: text-bison@001
+       embeddings_model_name: "textembedding-gecko"
+     }.freeze
+
+     # Google Cloud has a project id and a specific region of deployment.
+     # For GenAI-related things, a safe choice is us-central1.
+     attr_reader :project_id, :client, :region
+
+     def initialize(project_id:, default_options: {})
+       depends_on "google-apis-aiplatform_v1"
+
+       @project_id = project_id
+       @region = default_options.fetch :region, "us-central1"
+
+       @client = Google::Apis::AiplatformV1::AiplatformService.new
+
+       # TODO: Adapt for other regions; Pass it in via the constructor
+       # For the moment only us-central1 available so no big deal.
+       @client.root_url = "https://#{@region}-aiplatform.googleapis.com/"
+       @client.authorization = Google::Auth.get_application_default
+
+       @defaults = DEFAULTS.merge(default_options)
+     end
+
+     #
+     # Generate an embedding for a given text
+     #
+     # @param text [String] The text to generate an embedding for
+     # @return [Langchain::LLM::GoogleVertexAiResponse] Response object
+     #
+     def embed(text:)
+       content = [{content: text}]
+       request = Google::Apis::AiplatformV1::GoogleCloudAiplatformV1PredictRequest.new(instances: content)
+
+       api_path = "projects/#{@project_id}/locations/us-central1/publishers/google/models/#{@defaults[:embeddings_model_name]}"
+
+       # puts("api_path: #{api_path}")
+
+       response = client.predict_project_location_publisher_model(api_path, request)
+
+       Langchain::LLM::GoogleVertexAiResponse.new(response.to_h, model: @defaults[:embeddings_model_name])
+     end
+
+     #
+     # Generate a completion for a given prompt
+     #
+     # @param prompt [String] The prompt to generate a completion for
+     # @param params extra parameters passed to GooglePalmAPI::Client#generate_text
+     # @return [Langchain::LLM::GooglePalmResponse] Response object
+     #
+     def complete(prompt:, **params)
+       default_params = {
+         prompt: prompt,
+         temperature: @defaults[:temperature],
+         top_k: @defaults[:top_k],
+         top_p: @defaults[:top_p],
+         max_output_tokens: @defaults[:max_output_tokens],
+         model: @defaults[:completion_model_name]
+       }
+
+       if params[:stop_sequences]
+         default_params[:stop_sequences] = params.delete(:stop_sequences)
+       end
+
+       if params[:max_output_tokens]
+         default_params[:max_output_tokens] = params.delete(:max_output_tokens)
+       end
+
+       # to be tested
+       temperature = params.delete(:temperature) || @defaults[:temperature]
+       max_output_tokens = default_params.fetch(:max_output_tokens, @defaults[:max_output_tokens])
+
+       default_params.merge!(params)
+
+       # response = client.generate_text(**default_params)
+       request = Google::Apis::AiplatformV1::GoogleCloudAiplatformV1PredictRequest.new \
+         instances: [{
+           prompt: prompt # key used to be :content, changed to :prompt
+         }],
+         parameters: {
+           temperature: temperature,
+           maxOutputTokens: max_output_tokens,
+           topP: 0.8,
+           topK: 40
+         }
+
+       response = client.predict_project_location_publisher_model \
+         "projects/#{project_id}/locations/us-central1/publishers/google/models/#{@defaults[:completion_model_name]}",
+         request
+
+       Langchain::LLM::GoogleVertexAiResponse.new(response, model: default_params[:model])
+     end
+
+     #
+     # Generate a summarization for a given text
+     #
+     # @param text [String] The text to generate a summarization for
+     # @return [String] The summarization
+     #
+     # TODO(ricc): add params for Temp, topP, topK, MaxTokens and have it default to these 4 values.
+     def summarize(text:)
+       prompt_template = Langchain::Prompt.load_from_path(
+         file_path: Langchain.root.join("langchain/llm/prompts/summarize_template.yaml")
+       )
+       prompt = prompt_template.format(text: text)
+
+       complete(
+         prompt: prompt,
+         # For best temperature, topP, topK, MaxTokens for summarization: see
+         # https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-summarization
+         temperature: 0.2,
+         top_p: 0.95,
+         top_k: 40,
+         # Most models have a context length of 2048 tokens (except for the newest models, which support 4096).
+         max_output_tokens: 256
+       )
+     end
+
+     def chat(...)
+       # https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-chat
+       # Chat params: https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-chat
+       # "temperature": 0.3,
+       # "maxDecodeSteps": 200,
+       # "topP": 0.8,
+       # "topK": 40
+       raise NotImplementedError, "coming soon for Vertex AI.."
+     end
+   end
+ end
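For orientation, here is a minimal, hedged sketch of calling the new provider. It assumes Application Default Credentials are already configured (for example via `gcloud auth application-default login`) and relies only on the constructor and response accessors shown in this diff:

```ruby
require "langchain"

# Assumes GOOGLE_VERTEX_AI_PROJECT_ID is set and the google-apis-aiplatform_v1 gem is installed
llm = Langchain::LLM::GoogleVertexAi.new(project_id: ENV["GOOGLE_VERTEX_AI_PROJECT_ID"])

llm.embed(text: "Ruby is a programmer's best friend").embedding   # => Array of 768 floats
llm.complete(prompt: "Name three Ruby web frameworks").completion # => String
```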
@@ -22,7 +22,7 @@ module Langchain::LLM
  # @param n_ctx [Integer] The number of context tokens to use
  # @param n_threads [Integer] The CPU number of threads to use
  # @param seed [Integer] The seed to use
- def initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: -1)
+ def initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: 0)
  depends_on "llama_cpp"

  @model_path = model_path
@@ -33,30 +33,25 @@ module Langchain::LLM
  end

  # @param text [String] The text to embed
- # @param n_threads [Integer] The number of CPU threads to use
  # @return [Array<Float>] The embedding
- def embed(text:, n_threads: nil)
+ def embed(text:)
  # contexts are kinda stateful when it comes to embeddings, so allocate one each time
  context = embedding_context

- embedding_input = context.tokenize(text: text, add_bos: true)
+ embedding_input = @model.tokenize(text: text, add_bos: true)
  return unless embedding_input.size.positive?

- n_threads ||= self.n_threads
-
- context.eval(tokens: embedding_input, n_past: 0, n_threads: n_threads)
- context.embeddings
+ context.eval(tokens: embedding_input, n_past: 0)
+ Langchain::LLM::LlamaCppResponse.new(context, model: context.model.desc)
  end

  # @param prompt [String] The prompt to complete
  # @param n_predict [Integer] The number of tokens to predict
- # @param n_threads [Integer] The number of CPU threads to use
  # @return [String] The completed prompt
- def complete(prompt:, n_predict: 128, n_threads: nil)
- n_threads ||= self.n_threads
+ def complete(prompt:, n_predict: 128)
  # contexts do not appear to be stateful when it comes to completion, so re-use the same one
  context = completion_context
- ::LLaMACpp.generate(context, prompt, n_threads: n_threads, n_predict: n_predict)
+ ::LLaMACpp.generate(context, prompt, n_predict: n_predict)
  end

  private
@@ -71,23 +66,30 @@ module Langchain::LLM

  context_params.seed = seed
  context_params.n_ctx = n_ctx
- context_params.n_gpu_layers = n_gpu_layers
+ context_params.n_threads = n_threads
  context_params.embedding = embeddings

  context_params
  end

+ def build_model_params
+ model_params = ::LLaMACpp::ModelParams.new
+ model_params.n_gpu_layers = n_gpu_layers
+
+ model_params
+ end
+
  def build_model(embeddings: false)
  return @model if defined?(@model)
- @model = ::LLaMACpp::Model.new(model_path: model_path, params: build_context_params(embeddings: embeddings))
+ @model = ::LLaMACpp::Model.new(model_path: model_path, params: build_model_params)
  end

  def build_completion_context
- ::LLaMACpp::Context.new(model: build_model)
+ ::LLaMACpp::Context.new(model: build_model, params: build_context_params(embeddings: false))
  end

  def build_embedding_context
- ::LLaMACpp::Context.new(model: build_model(embeddings: true))
+ ::LLaMACpp::Context.new(model: build_model, params: build_context_params(embeddings: true))
  end

  def completion_context
@@ -4,7 +4,7 @@ module Langchain::LLM
  # LLM interface for OpenAI APIs: https://platform.openai.com/overview
  #
  # Gem requirements:
- # gem "ruby-openai", "~> 5.2.0"
+ # gem "ruby-openai", "~> 6.1.0"
  #
  # Usage:
  # openai = Langchain::LLM::OpenAI.new(api_key:, llm_options: {})
@@ -29,7 +29,6 @@ module Langchain::LLM
  LENGTH_VALIDATOR = Langchain::Utils::TokenLength::OpenAIValidator

  attr_accessor :functions
- attr_accessor :response_chunks

  def initialize(api_key:, llm_options: {}, default_options: {})
  depends_on "ruby-openai", req: "openai"
@@ -137,6 +136,7 @@ module Langchain::LLM

  response = with_api_error_handling { client.chat(parameters: parameters) }
  response = response_from_chunks if block
+ reset_response_chunks
  Langchain::LLM::OpenAIResponse.new(response)
  end

@@ -158,6 +158,12 @@ module Langchain::LLM

  private

+ attr_reader :response_chunks
+
+ def reset_response_chunks
+ @response_chunks = []
+ end
+
  def is_legacy_model?(model)
  LEGACY_COMPLETION_MODELS.any? { |legacy_model| model.include?(legacy_model) }
  end
@@ -242,18 +248,18 @@ module Langchain::LLM
  end

  def response_from_chunks
- @response_chunks.first&.slice("id", "object", "created", "model")&.merge(
+ grouped_chunks = @response_chunks.group_by { |chunk| chunk.dig("choices", 0, "index") }
+ final_choices = grouped_chunks.map do |index, chunks|
  {
- "choices" => [
- {
- "message" => {
- "role" => "assistant",
- "content" => @response_chunks.map { |chunk| chunk.dig("choices", 0, "delta", "content") }.join
- }
- }
- ]
+ "index" => index,
+ "message" => {
+ "role" => "assistant",
+ "content" => chunks.map { |chunk| chunk.dig("choices", 0, "delta", "content") }.join
+ },
+ "finish_reason" => chunks.last.dig("choices", 0, "finish_reason")
  }
- )
+ end
+ @response_chunks.first&.slice("id", "object", "created", "model")&.merge({"choices" => final_choices})
  end
  end
  end
lib/langchain/llm/response/google_vertex_ai_response.rb ADDED
@@ -0,0 +1,33 @@
+ # frozen_string_literal: true
+
+ module Langchain::LLM
+   class GoogleVertexAiResponse < BaseResponse
+     attr_reader :prompt_tokens
+
+     def initialize(raw_response, model: nil)
+       @prompt_tokens = prompt_tokens
+       super(raw_response, model: model)
+     end
+
+     def completion
+       # completions&.dig(0, "output")
+       raw_response.predictions[0]["content"]
+     end
+
+     def embedding
+       embeddings.first
+     end
+
+     def completions
+       raw_response.predictions.map { |p| p["content"] }
+     end
+
+     def total_tokens
+       raw_response.dig(:predictions, 0, :embeddings, :statistics, :token_count)
+     end
+
+     def embeddings
+       [raw_response.dig(:predictions, 0, :embeddings, :values)]
+     end
+   end
+ end
lib/langchain/llm/response/llama_cpp_response.rb ADDED
@@ -0,0 +1,13 @@
+ # frozen_string_literal: true
+
+ module Langchain::LLM
+   class LlamaCppResponse < BaseResponse
+     def embedding
+       embeddings
+     end
+
+     def embeddings
+       raw_response.embeddings
+     end
+   end
+ end
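Tying this to the 0.8.0 GGUF breaking change noted in the changelog: a hedged sketch of the updated LlamaCpp flow, assuming a locally converted GGUF model file (the path and env var are illustrative):

```ruby
# Assumes `gem "llama_cpp", "~> 0.9.4"` and a GGUF model produced by llama.cpp's convert scripts
llm = Langchain::LLM::LlamaCpp.new(
  model_path: ENV["LLAMA_CPP_MODEL_PATH"], # e.g. "./models/llama-2-7b.Q4_K_M.gguf"
  n_ctx: 2048,
  n_threads: 4
)

llm.complete(prompt: "Hello!", n_predict: 64) # => String
llm.embed(text: "Hello!")                     # => Langchain::LLM::LlamaCppResponse
```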
@@ -58,7 +58,7 @@ module Langchain::OutputParsers
  completion: completion,
  error: e
  )
- )
+ ).completion
  parser.parse(new_completion)
  end

@@ -33,7 +33,7 @@ module Langchain::Prompt
  when ".json"
  config = JSON.parse(File.read(file_path))
  when ".yaml", ".yml"
- config = YAML.safe_load(File.read(file_path))
+ config = YAML.safe_load_file(file_path)
  else
  raise ArgumentError, "Got unsupported file type #{file_path.extname}"
  end
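For context on the `safe_load_file` change: prompt templates are loaded from JSON or YAML through this same entry point. A hedged sketch, using the `load_from_path`/`format` calls that appear elsewhere in this diff (the file path is illustrative):

```ruby
# Assumes a YAML prompt template with a {text} input variable at this path
prompt = Langchain::Prompt.load_from_path(file_path: "prompts/summarize_template.yaml")
prompt.format(text: "Ruby 3.3 adds a new JIT compiler...") # => String ready to send to an LLM
```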
@@ -15,7 +15,8 @@ module Langchain
  # Source:
  # https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
  "gpt-4-1106-preview" => 4096,
- "gpt-4-vision-preview" => 4096
+ "gpt-4-vision-preview" => 4096,
+ "gpt-3.5-turbo-1106" => 4096
  }

  TOKEN_LIMITS = {
@@ -26,6 +27,7 @@ module Langchain
  "gpt-3.5-turbo" => 4096,
  "gpt-3.5-turbo-0301" => 4096,
  "gpt-3.5-turbo-0613" => 4096,
+ "gpt-3.5-turbo-1106" => 16385,
  "gpt-3.5-turbo-16k" => 16384,
  "gpt-3.5-turbo-16k-0613" => 16384,
  "text-davinci-003" => 4097,
@@ -7,6 +7,7 @@ module Langchain::Vectorsearch
  # == Available vector databases
  #
  # - {Langchain::Vectorsearch::Chroma}
+ # - {Langchain::Vectorsearch::Epsilla}
  # - {Langchain::Vectorsearch::Elasticsearch}
  # - {Langchain::Vectorsearch::Hnswlib}
  # - {Langchain::Vectorsearch::Milvus}
@@ -29,10 +30,11 @@ module Langchain::Vectorsearch
  # )
  #
  # # You can instantiate other supported vector databases the same way:
+ # epsilla = Langchain::Vectorsearch::Epsilla.new(...)
  # milvus = Langchain::Vectorsearch::Milvus.new(...)
  # qdrant = Langchain::Vectorsearch::Qdrant.new(...)
  # pinecone = Langchain::Vectorsearch::Pinecone.new(...)
- # chrome = Langchain::Vectorsearch::Chroma.new(...)
+ # chroma = Langchain::Vectorsearch::Chroma.new(...)
  # pgvector = Langchain::Vectorsearch::Pgvector.new(...)
  #
  # == Schema Creation
@@ -46,6 +46,9 @@ module Langchain::Vectorsearch
  super(llm: llm)
  end

+ # Add a list of texts to the index
+ # @param texts [Array<String>] The list of texts to add
+ # @return [Elasticsearch::Response] from the Elasticsearch server
  def add_texts(texts: [])
  body = texts.map do |text|
  [
@@ -57,6 +60,10 @@ module Langchain::Vectorsearch
  es_client.bulk(body: body)
  end

+ # Update a list of texts in the index
+ # @param texts [Array<String>] The list of texts to update
+ # @param ids [Array<Integer>] The ids of the texts to update, in the same order as the texts
+ # @return [Elasticsearch::Response] from the Elasticsearch server
  def update_texts(texts: [], ids: [])
  body = texts.map.with_index do |text, i|
  [
@@ -68,6 +75,8 @@ module Langchain::Vectorsearch
  es_client.bulk(body: body)
  end

+ # Create the index with the default schema
+ # @return [Elasticsearch::Response] Index creation
  def create_default_schema
  es_client.indices.create(
  index: index_name,
@@ -75,6 +84,8 @@ module Langchain::Vectorsearch
  )
  end

+ # Deletes the default schema
+ # @return [Elasticsearch::Response] Index deletion
  def delete_default_schema
  es_client.indices.delete(
  index: index_name
@@ -116,10 +127,30 @@ module Langchain::Vectorsearch
  }
  end

- # TODO: Implement this
- # def ask()
- # end
+ # Ask a question and return the answer
+ # @param question [String] The question to ask
+ # @param k [Integer] The number of results to have in context
+ # @yield [String] Stream responses back one String at a time
+ # @return [String] The answer to the question
+ def ask(question:, k: 4, &block)
+ search_results = similarity_search(query: question, k: k)

+ context = search_results.map do |result|
+ result[:input]
+ end.join("\n---\n")
+
+ prompt = generate_rag_prompt(question: question, context: context)
+
+ response = llm.chat(prompt: prompt, &block)
+ response.context = context
+ response
+ end
+
+ # Search for similar texts
+ # @param text [String] The text to search for
+ # @param k [Integer] The number of results to return
+ # @param query [Hash] Elasticsearch query that needs to be used while searching (Optional)
+ # @return [Elasticsearch::Response] The response from the server
  def similarity_search(text: "", k: 10, query: {})
  if text.empty? && query.empty?
  raise "Either text or query should pass as an argument"
@@ -134,6 +165,11 @@ module Langchain::Vectorsearch
  es_client.search(body: {query: query, size: k}).body
  end

+ # Search for similar texts by embedding
+ # @param embedding [Array<Float>] The embedding to search for
+ # @param k [Integer] The number of results to return
+ # @param query [Hash] Elasticsearch query that needs to be used while searching (Optional)
+ # @return [Elasticsearch::Response] The response from the server
  def similarity_search_by_vector(embedding: [], k: 10, query: {})
  if embedding.empty? && query.empty?
  raise "Either embedding or query should pass as an argument"
lib/langchain/vectorsearch/epsilla.rb ADDED
@@ -0,0 +1,143 @@
+ # frozen_string_literal: true
+
+ require "securerandom"
+ require "json"
+ require "timeout"
+ require "uri"
+
+ module Langchain::Vectorsearch
+   class Epsilla < Base
+     #
+     # Wrapper around Epsilla client library
+     #
+     # Gem requirements:
+     #     gem "epsilla-ruby", "~> 0.0.3"
+     #
+     # Usage:
+     #     epsilla = Langchain::Vectorsearch::Epsilla.new(url:, db_name:, db_path:, index_name:, llm:)
+     #
+     # Initialize Epsilla client
+     # @param url [String] URL to connect to the Epsilla db instance, protocol://host:port
+     # @param db_name [String] The name of the database to use
+     # @param db_path [String] The path to the database to use
+     # @param index_name [String] The name of the Epsilla table to use
+     # @param llm [Object] The LLM client to use
+     def initialize(url:, db_name:, db_path:, index_name:, llm:)
+       depends_on "epsilla-ruby", req: "epsilla"
+
+       uri = URI.parse(url)
+       protocol = uri.scheme
+       host = uri.host
+       port = uri.port
+
+       @client = ::Epsilla::Client.new(protocol, host, port)
+
+       Timeout.timeout(5) do
+         status_code, response = @client.database.load_db(db_name, db_path)
+
+         if status_code != 200
+           if status_code == 500 && response["message"].include?("already loaded")
+             Langchain.logger.info("Database already loaded")
+           else
+             raise "Failed to load database: #{response}"
+           end
+         end
+       end
+
+       @client.database.use_db(db_name)
+
+       @db_name = db_name
+       @db_path = db_path
+       @table_name = index_name
+
+       @vector_dimension = llm.default_dimension
+
+       super(llm: llm)
+     end
+
+     # Create a table using the index_name passed in the constructor
+     def create_default_schema
+       status_code, response = @client.database.create_table(@table_name, [
+         {"name" => "ID", "dataType" => "STRING", "primaryKey" => true},
+         {"name" => "Doc", "dataType" => "STRING"},
+         {"name" => "Embedding", "dataType" => "VECTOR_FLOAT", "dimensions" => @vector_dimension}
+       ])
+       raise "Failed to create table: #{response}" if status_code != 200
+
+       response
+     end
+
+     # Drop the table using the index_name passed in the constructor
+     def destroy_default_schema
+       status_code, response = @client.database.drop_table(@table_name)
+       raise "Failed to drop table: #{response}" if status_code != 200
+
+       response
+     end
+
+     # Add a list of texts to the database
+     # @param texts [Array<String>] The list of texts to add
+     # @param ids [Array<String>] The unique ids to add to the index, in the same order as the texts; if nil, it will be random uuids
+     def add_texts(texts:, ids: nil)
+       validated_ids = ids
+       if ids.nil?
+         validated_ids = texts.map { SecureRandom.uuid }
+       elsif ids.length != texts.length
+         raise "The number of ids must match the number of texts"
+       end
+
+       data = texts.map.with_index do |text, idx|
+         {Doc: text, Embedding: llm.embed(text: text).embedding, ID: validated_ids[idx]}
+       end
+
+       status_code, response = @client.database.insert(@table_name, data)
+       raise "Failed to insert texts: #{response}" if status_code != 200
+       response
+     end
+
+     # Search for similar texts
+     # @param query [String] The text to search for
+     # @param k [Integer] The number of results to return
+     # @return [String] The response from the server
+     def similarity_search(query:, k: 4)
+       embedding = llm.embed(text: query).embedding
+
+       similarity_search_by_vector(
+         embedding: embedding,
+         k: k
+       )
+     end
+
+     # Search for entries by embedding
+     # @param embedding [Array<Float>] The embedding to search for
+     # @param k [Integer] The number of results to return
+     # @return [String] The response from the server
+     def similarity_search_by_vector(embedding:, k: 4)
+       status_code, response = @client.database.query(@table_name, "Embedding", embedding, ["Doc"], k, false)
+       raise "Failed to do similarity search: #{response}" if status_code != 200
+
+       data = JSON.parse(response)["result"]
+       data.map { |result| result["Doc"] }
+     end
+
+     # Ask a question and return the answer
+     # @param question [String] The question to ask
+     # @param k [Integer] The number of results to have in context
+     # @yield [String] Stream responses back one String at a time
+     # @return [String] The answer to the question
+     def ask(question:, k: 4, &block)
+       search_results = similarity_search(query: question, k: k)
+
+       context = search_results.map do |result|
+         result.to_s
+       end
+       context = context.join("\n---\n")
+
+       prompt = generate_rag_prompt(question: question, context: context)
+
+       response = llm.chat(prompt: prompt, &block)
+       response.context = context
+       response
+     end
+   end
+ end
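A hedged end-to-end sketch of the new Epsilla store, mirroring the constructor documented in the class above; the URL, database name, and storage path are illustrative values:

```ruby
epsilla = Langchain::Vectorsearch::Epsilla.new(
  url: "http://localhost:8888",      # illustrative Epsilla endpoint
  db_name: "langchain",              # illustrative database name
  db_path: "/tmp/langchain_epsilla", # illustrative storage path
  index_name: "Documents",
  llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
)

epsilla.create_default_schema
epsilla.add_texts(texts: ["Epsilla is a vector database."])
epsilla.similarity_search(query: "vector database", k: 1) # => ["Epsilla is a vector database."]
```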
@@ -44,14 +44,14 @@ module Langchain::Vectorsearch
  # Add a list of texts to the index
  # @param texts [Array<String>] The list of texts to add
  # @return [Hash] The response from the server
- def add_texts(texts:, ids: [])
+ def add_texts(texts:, ids: [], payload: {})
  batch = {ids: [], vectors: [], payloads: []}

  Array(texts).each_with_index do |text, i|
  id = ids[i] || SecureRandom.uuid
  batch[:ids].push(id)
  batch[:vectors].push(llm.embed(text: text).embedding)
- batch[:payloads].push({content: text})
+ batch[:payloads].push({content: text}.merge(payload))
  end

  client.points.upsert(
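The new `payload:` keyword merges a single hash into every upserted point alongside the generated `:content` key. A hedged sketch, with the Qdrant constructor options assumed for illustration:

```ruby
qdrant = Langchain::Vectorsearch::Qdrant.new(
  url: ENV["QDRANT_URL"], api_key: ENV["QDRANT_API_KEY"], index_name: "documents", # assumed options
  llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
)

qdrant.add_texts(
  texts: ["Ruby was created by Yukihiro Matsumoto."],
  payload: {source: "wikipedia", lang: "en"} # merged into each point's payload
)
```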
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Langchain
- VERSION = "0.7.5"
+ VERSION = "0.8.1"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: langchainrb
  version: !ruby/object:Gem::Version
- version: 0.7.5
+ version: 0.8.1
  platform: ruby
  authors:
  - Andrei Bondarev
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2023-11-13 00:00:00.000000000 Z
+ date: 2023-12-07 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: baran
@@ -276,6 +276,20 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 8.2.0
+ - !ruby/object:Gem::Dependency
+ name: epsilla-ruby
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 0.0.4
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 0.0.4
  - !ruby/object:Gem::Dependency
  name: eqn
  requirement: !ruby/object:Gem::Requirement
@@ -290,6 +304,20 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 1.6.5
+ - !ruby/object:Gem::Dependency
+ name: google-apis-aiplatform_v1
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.7'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.7'
  - !ruby/object:Gem::Dependency
  name: google_palm_api
  requirement: !ruby/object:Gem::Requirement
@@ -366,14 +394,14 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 0.3.7
+ version: 0.9.4
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 0.3.7
+ version: 0.9.4
  - !ruby/object:Gem::Dependency
  name: nokogiri
  requirement: !ruby/object:Gem::Requirement
@@ -506,14 +534,14 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 5.2.0
+ version: 6.1.0
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 5.2.0
+ version: 6.1.0
  - !ruby/object:Gem::Dependency
  name: safe_ruby
  requirement: !ruby/object:Gem::Requirement
@@ -619,6 +647,7 @@ files:
  - lib/langchain/llm/base.rb
  - lib/langchain/llm/cohere.rb
  - lib/langchain/llm/google_palm.rb
+ - lib/langchain/llm/google_vertex_ai.rb
  - lib/langchain/llm/hugging_face.rb
  - lib/langchain/llm/llama_cpp.rb
  - lib/langchain/llm/ollama.rb
@@ -631,7 +660,9 @@ files:
  - lib/langchain/llm/response/base_response.rb
  - lib/langchain/llm/response/cohere_response.rb
  - lib/langchain/llm/response/google_palm_response.rb
+ - lib/langchain/llm/response/google_vertex_ai_response.rb
  - lib/langchain/llm/response/hugging_face_response.rb
+ - lib/langchain/llm/response/llama_cpp_response.rb
  - lib/langchain/llm/response/ollama_response.rb
  - lib/langchain/llm/response/openai_response.rb
  - lib/langchain/llm/response/replicate_response.rb
@@ -671,6 +702,7 @@ files:
  - lib/langchain/vectorsearch/base.rb
  - lib/langchain/vectorsearch/chroma.rb
  - lib/langchain/vectorsearch/elasticsearch.rb
+ - lib/langchain/vectorsearch/epsilla.rb
  - lib/langchain/vectorsearch/hnswlib.rb
  - lib/langchain/vectorsearch/milvus.rb
  - lib/langchain/vectorsearch/pgvector.rb