langchainrb 0.7.5 → 0.8.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f4c388275b83a0e4260f4ae9271f4c164a8d34ea5ea9585916d91e7e9c17c980
- data.tar.gz: 8daa400de3ed80bb3fb9c53cc19ef4d56f137c2aa157bd268dbda488d0fca432
+ metadata.gz: 5dd13c5aae47af13fe248636ed88bd40d0e241291ab5c3dc2d5925dcc742af37
+ data.tar.gz: b190f73403a77b4ea4d1f9869423546d584df32785ae342a01d9a72ee5fe04fd
  SHA512:
- metadata.gz: 4bae87c050be6a8fa011c1ae5de4b119abac498669f2e63ca1829e11b7b5ecca7610330be670d24fd6cb98c2e2599c593e9922378985efc586d76c124efb865e
- data.tar.gz: 2a39b084c6a239aeb0de22bfc87629d2f2909b23eabfcf71a835a1f1624d84afe3ea106afdafb8f1fb301b7934d73abc7253c9b8bd3f6c9b170231ebb5af0936
+ metadata.gz: 81dd80f49173e3d711a713b6dd365addf04129cb0f6c015d6909200a709780e30c39888f0bccba72035e03c17a0b01a4d1456e6431473149d9969907435f18c1
+ data.tar.gz: 748f841cf01b802e81bc6f6ecf8aaea5ab13593363afadc7c9634446c169812064dd41af3e58e87068a224972be85f00b1e3c2669a99e1406819507c86b1a15c
data/CHANGELOG.md CHANGED
@@ -1,5 +1,14 @@
  ## [Unreleased]

+ ## [0.8.1]
+ - Support for Epsilla vector DB
+ - Fully functioning Google Vertex AI LLM
+ - Bug fixes
+
+ ## [0.8.0]
+ - [BREAKING] Updated llama_cpp.rb to 0.9.4. The model file format used by the underlying llama.cpp library has changed to GGUF. llama.cpp ships with scripts to convert existing files and GGUF format models can be downloaded from HuggingFace.
+ - Introducing Langchain::LLM::GoogleVertexAi LLM provider
+
  ## [0.7.5] - 2023-11-13
  - Fixes

data/README.md CHANGED
@@ -1,3 +1,6 @@
+ # Please fill out the [Ruby AI Survey 2023](https://docs.google.com/forms/d/1dH_0js1wpEyh1YqPTOxU3b5fXj76sb5lYp12lVoNNZE/edit).
+ Results will be anonymized and shared!
+
  💎🔗 Langchain.rb
  ---
  ⚡ Building LLM-powered applications in Ruby ⚡
@@ -53,23 +56,24 @@ require "langchain"
  Langchain.rb wraps all supported LLMs in a unified interface allowing you to easily swap out and test out different models.

  #### Supported LLMs and features:
- | LLM providers | embed() | complete() | chat() | summarize() | Notes |
- | -------- |:------------------:| :-------: | :-----------------: | :-------: | :----------------- |
- | [OpenAI](https://openai.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | ❌ | Including Azure OpenAI |
- | [AI21](https://ai21.com/) | ❌ | :white_check_mark: | ❌ | :white_check_mark: | |
- | [Anthropic](https://milvus.io/) | ❌ | :white_check_mark: | ❌ | ❌ | |
- | [AWS Bedrock](https://aws.amazon.com/bedrock) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | Provides AWS, Cohere, AI21, Antropic and Stability AI models |
- | [Cohere](https://www.pinecone.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
- | [GooglePalm](https://ai.google/discover/palm2/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
- | [HuggingFace](https://huggingface.co/) | :white_check_mark: | ❌ | ❌ | ❌ | |
- | [Ollama](https://ollama.ai/) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | |
- | [Replicate](https://replicate.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
+ | LLM providers | embed() | complete() | chat() | summarize() | Notes |
+ | -------- |:------------------:| :-------: | :-----------------: | :-------: | :----------------- |
+ | [OpenAI](https://openai.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ❌ | Including Azure OpenAI |
+ | [AI21](https://ai21.com/?utm_source=langchainrb&utm_medium=github) | ❌ | ✅ | ❌ | ✅ | |
+ | [Anthropic](https://anthropic.com/?utm_source=langchainrb&utm_medium=github) | ❌ | ✅ | ❌ | ❌ | |
+ | [AWS Bedrock](https://aws.amazon.com/bedrock?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ❌ | ❌ | Provides AWS, Cohere, AI21, Antropic and Stability AI models |
+ | [Cohere](https://cohere.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |
+ | [GooglePalm](https://ai.google/discover/palm2?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |
+ | [Google Vertex AI](https://cloud.google.com/vertex-ai?utm_source=langchainrb&utm_medium=github) | ✅ | ❌ | ❌ | ❌ | |
+ | [HuggingFace](https://huggingface.co/?utm_source=langchainrb&utm_medium=github) | ✅ | | ❌ | ❌ | |
+ | [Ollama](https://ollama.ai/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | | | |
+ | [Replicate](https://replicate.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |

  #### Using standalone LLMs:

  #### OpenAI

- Add `gem "ruby-openai", "~> 5.2.0"` to your Gemfile.
+ Add `gem "ruby-openai", "~> 6.1.0"` to your Gemfile.

  ```ruby
  llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
@@ -86,22 +90,22 @@ llm.embed(text: "foo bar")

  Generate a text completion:
  ```ruby
- llm.complete(prompt: "What is the meaning of life?")
+ llm.complete(prompt: "What is the meaning of life?").completion
  ```

  Generate a chat completion:
  ```ruby
- llm.chat(prompt: "Hey! How are you?")
+ llm.chat(prompt: "Hey! How are you?").completion
  ```

  Summarize the text:
  ```ruby
- llm.complete(text: "...")
+ llm.summarize(text: "...").completion
  ```

  You can use any other LLM by invoking the same interface:
  ```ruby
- llm = Langchain::LLM::GooglePalm.new(...)
+ llm = Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"], default_options: { ... })
  ```
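Because every call now returns a response wrapper rather than a bare string, the same `.completion` accessor works across providers. A minimal, hedged sketch of swapping providers behind this unified interface (assumes both API keys are set; accessors are the ones shown above):

```ruby
# Illustrative only: iterate over two providers through the same interface
[
  Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"]),
  Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"])
].each do |llm|
  puts llm.complete(prompt: "What is the meaning of life?").completion
end
```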

  ### Prompt Management
@@ -247,7 +251,7 @@ Then parse the llm response:

  ```ruby
  llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
- llm_response = llm.chat(prompt: prompt_text)
+ llm_response = llm.chat(prompt: prompt_text).completion
  parser.parse(llm_response)
  # {
  #   "name" => "Kim Ji-hyun",
@@ -303,15 +307,17 @@ Langchain.rb provides a convenient unified interface on top of supported vectors

  #### Supported vector search databases and features:

- | Database | Open-source | Cloud offering |
- | -------- |:------------------:| :------------: |
- | [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: |
- | [Hnswlib](https://github.com/nmslib/hnswlib/) | :white_check_mark: | |
- | [Milvus](https://milvus.io/) | :white_check_mark: | :white_check_mark: Zilliz Cloud |
- | [Pinecone](https://www.pinecone.io/) | | :white_check_mark: |
- | [Pgvector](https://github.com/pgvector/pgvector) | :white_check_mark: | :white_check_mark: |
- | [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: |
- | [Weaviate](https://weaviate.io/) | :white_check_mark: | :white_check_mark: |
+ | Database | Open-source | Cloud offering |
+ | -------- |:------------------:| :------------: |
+ | [Chroma](https://trychroma.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Epsilla](https://epsilla.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Hnswlib](https://github.com/nmslib/hnswlib/?utm_source=langchainrb&utm_medium=github) | ✅ | |
+ | [Milvus](https://milvus.io/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ Zilliz Cloud |
+ | [Pinecone](https://www.pinecone.io/?utm_source=langchainrb&utm_medium=github) | | ✅ |
+ | [Pgvector](https://github.com/pgvector/pgvector/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Qdrant](https://qdrant.tech/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Weaviate](https://weaviate.io/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
+ | [Elasticsearch](https://www.elastic.co/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |

  ### Using Vector Search Databases 🔍

@@ -337,11 +343,13 @@ client = Langchain::Vectorsearch::Weaviate.new(
  You can instantiate any other supported vector search database:
  ```ruby
  client = Langchain::Vectorsearch::Chroma.new(...) # `gem "chroma-db", "~> 0.6.0"`
+ client = Langchain::Vectorsearch::Epsilla.new(...) # `gem "epsilla-ruby", "~> 0.0.3"`
  client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"`
  client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.2"`
  client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
  client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
- client = Langchain::Vectorsearch::Qdrant.new(...) # `gem"qdrant-ruby", "~> 0.9.3"`
+ client = Langchain::Vectorsearch::Qdrant.new(...) # `gem "qdrant-ruby", "~> 0.9.3"`
+ client = Langchain::Vectorsearch::Elasticsearch.new(...) # `gem "elasticsearch", "~> 8.2.0"`
  ```

  Create the default schema:
@@ -4,7 +4,7 @@ module Langchain::LLM
  # LLM interface for Azure OpenAI Service APIs: https://learn.microsoft.com/en-us/azure/ai-services/openai/
  #
  # Gem requirements:
- # gem "ruby-openai", "~> 5.2.0"
+ # gem "ruby-openai", "~> 6.1.0"
  #
  # Usage:
  # openai = Langchain::LLM::Azure.new(api_key:, llm_options: {}, embedding_deployment_url: chat_deployment_url:)
@@ -131,7 +131,7 @@ module Langchain::LLM
  prompt: prompt,
  temperature: @defaults[:temperature],
  # Most models have a context length of 2048 tokens (except for the newest models, which support 4096).
- max_tokens: 2048
+ max_tokens: 256
  )
  end

lib/langchain/llm/google_vertex_ai.rb ADDED
@@ -0,0 +1,149 @@
+ # frozen_string_literal: true
+
+ module Langchain::LLM
+   #
+   # Wrapper around the Google Vertex AI APIs: https://cloud.google.com/vertex-ai?hl=en
+   #
+   # Gem requirements:
+   #     gem "google-apis-aiplatform_v1", "~> 0.7"
+   #
+   # Usage:
+   #     google_palm = Langchain::LLM::GoogleVertexAi.new(project_id: ENV["GOOGLE_VERTEX_AI_PROJECT_ID"])
+   #
+   class GoogleVertexAi < Base
+     DEFAULTS = {
+       temperature: 0.1, # 0.1 is the default in the API, quite low ("grounded")
+       max_output_tokens: 1000,
+       top_p: 0.8,
+       top_k: 40,
+       dimension: 768,
+       completion_model_name: "text-bison", # Optional: text-bison@001
+       embeddings_model_name: "textembedding-gecko"
+     }.freeze
+
+     # Google Cloud has a project id and a specific region of deployment.
+     # For GenAI-related things, a safe choice is us-central1.
+     attr_reader :project_id, :client, :region
+
+     def initialize(project_id:, default_options: {})
+       depends_on "google-apis-aiplatform_v1"
+
+       @project_id = project_id
+       @region = default_options.fetch :region, "us-central1"
+
+       @client = Google::Apis::AiplatformV1::AiplatformService.new
+
+       # TODO: Adapt for other regions; Pass it in via the constructor
+       # For the moment only us-central1 available so no big deal.
+       @client.root_url = "https://#{@region}-aiplatform.googleapis.com/"
+       @client.authorization = Google::Auth.get_application_default
+
+       @defaults = DEFAULTS.merge(default_options)
+     end
+
+     #
+     # Generate an embedding for a given text
+     #
+     # @param text [String] The text to generate an embedding for
+     # @return [Langchain::LLM::GoogleVertexAiResponse] Response object
+     #
+     def embed(text:)
+       content = [{content: text}]
+       request = Google::Apis::AiplatformV1::GoogleCloudAiplatformV1PredictRequest.new(instances: content)
+
+       api_path = "projects/#{@project_id}/locations/us-central1/publishers/google/models/#{@defaults[:embeddings_model_name]}"
+
+       # puts("api_path: #{api_path}")
+
+       response = client.predict_project_location_publisher_model(api_path, request)
+
+       Langchain::LLM::GoogleVertexAiResponse.new(response.to_h, model: @defaults[:embeddings_model_name])
+     end
+
+     #
+     # Generate a completion for a given prompt
+     #
+     # @param prompt [String] The prompt to generate a completion for
+     # @param params extra parameters passed to GooglePalmAPI::Client#generate_text
+     # @return [Langchain::LLM::GooglePalmResponse] Response object
+     #
+     def complete(prompt:, **params)
+       default_params = {
+         prompt: prompt,
+         temperature: @defaults[:temperature],
+         top_k: @defaults[:top_k],
+         top_p: @defaults[:top_p],
+         max_output_tokens: @defaults[:max_output_tokens],
+         model: @defaults[:completion_model_name]
+       }
+
+       if params[:stop_sequences]
+         default_params[:stop_sequences] = params.delete(:stop_sequences)
+       end
+
+       if params[:max_output_tokens]
+         default_params[:max_output_tokens] = params.delete(:max_output_tokens)
+       end
+
+       # to be tested
+       temperature = params.delete(:temperature) || @defaults[:temperature]
+       max_output_tokens = default_params.fetch(:max_output_tokens, @defaults[:max_output_tokens])
+
+       default_params.merge!(params)
+
+       # response = client.generate_text(**default_params)
+       request = Google::Apis::AiplatformV1::GoogleCloudAiplatformV1PredictRequest.new \
+         instances: [{
+           prompt: prompt # key used to be :content, changed to :prompt
+         }],
+         parameters: {
+           temperature: temperature,
+           maxOutputTokens: max_output_tokens,
+           topP: 0.8,
+           topK: 40
+         }
+
+       response = client.predict_project_location_publisher_model \
+         "projects/#{project_id}/locations/us-central1/publishers/google/models/#{@defaults[:completion_model_name]}",
+         request
+
+       Langchain::LLM::GoogleVertexAiResponse.new(response, model: default_params[:model])
+     end
+
+     #
+     # Generate a summarization for a given text
+     #
+     # @param text [String] The text to generate a summarization for
+     # @return [String] The summarization
+     #
+     # TODO(ricc): add params for Temp, topP, topK, MaxTokens and have it default to these 4 values.
+     def summarize(text:)
+       prompt_template = Langchain::Prompt.load_from_path(
+         file_path: Langchain.root.join("langchain/llm/prompts/summarize_template.yaml")
+       )
+       prompt = prompt_template.format(text: text)
+
+       complete(
+         prompt: prompt,
+         # For best temperature, topP, topK, MaxTokens for summarization: see
+         # https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-summarization
+         temperature: 0.2,
+         top_p: 0.95,
+         top_k: 40,
+         # Most models have a context length of 2048 tokens (except for the newest models, which support 4096).
+         max_output_tokens: 256
+       )
+     end
+
+     def chat(...)
+       # https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-chat
+       # Chat params: https://cloud.google.com/vertex-ai/docs/samples/aiplatform-sdk-chat
+       # "temperature": 0.3,
+       # "maxDecodeSteps": 200,
+       # "topP": 0.8,
+       # "topK": 40
+       raise NotImplementedError, "coming soon for Vertex AI.."
+     end
+   end
+ end
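For orientation, here is a minimal, hedged sketch of calling the new provider. It assumes Application Default Credentials are already configured (for example via `gcloud auth application-default login`) and relies only on the constructor and response accessors shown in this diff:

```ruby
require "langchain"

# Assumes GOOGLE_VERTEX_AI_PROJECT_ID is set and the google-apis-aiplatform_v1 gem is installed
llm = Langchain::LLM::GoogleVertexAi.new(project_id: ENV["GOOGLE_VERTEX_AI_PROJECT_ID"])

llm.embed(text: "Ruby is a programmer's best friend").embedding   # => Array of 768 floats
llm.complete(prompt: "Name three Ruby web frameworks").completion # => String
```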
@@ -22,7 +22,7 @@ module Langchain::LLM
  # @param n_ctx [Integer] The number of context tokens to use
  # @param n_threads [Integer] The CPU number of threads to use
  # @param seed [Integer] The seed to use
- def initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: -1)
+ def initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: 0)
  depends_on "llama_cpp"

  @model_path = model_path
@@ -33,30 +33,25 @@ module Langchain::LLM
  end

  # @param text [String] The text to embed
- # @param n_threads [Integer] The number of CPU threads to use
  # @return [Array<Float>] The embedding
- def embed(text:, n_threads: nil)
+ def embed(text:)
  # contexts are kinda stateful when it comes to embeddings, so allocate one each time
  context = embedding_context

- embedding_input = context.tokenize(text: text, add_bos: true)
+ embedding_input = @model.tokenize(text: text, add_bos: true)
  return unless embedding_input.size.positive?

- n_threads ||= self.n_threads
-
- context.eval(tokens: embedding_input, n_past: 0, n_threads: n_threads)
- context.embeddings
+ context.eval(tokens: embedding_input, n_past: 0)
+ Langchain::LLM::LlamaCppResponse.new(context, model: context.model.desc)
  end

  # @param prompt [String] The prompt to complete
  # @param n_predict [Integer] The number of tokens to predict
- # @param n_threads [Integer] The number of CPU threads to use
  # @return [String] The completed prompt
- def complete(prompt:, n_predict: 128, n_threads: nil)
- n_threads ||= self.n_threads
+ def complete(prompt:, n_predict: 128)
  # contexts do not appear to be stateful when it comes to completion, so re-use the same one
  context = completion_context
- ::LLaMACpp.generate(context, prompt, n_threads: n_threads, n_predict: n_predict)
+ ::LLaMACpp.generate(context, prompt, n_predict: n_predict)
  end

  private
@@ -71,23 +66,30 @@ module Langchain::LLM

  context_params.seed = seed
  context_params.n_ctx = n_ctx
- context_params.n_gpu_layers = n_gpu_layers
+ context_params.n_threads = n_threads
  context_params.embedding = embeddings

  context_params
  end

+ def build_model_params
+ model_params = ::LLaMACpp::ModelParams.new
+ model_params.n_gpu_layers = n_gpu_layers
+
+ model_params
+ end
+
  def build_model(embeddings: false)
  return @model if defined?(@model)
- @model = ::LLaMACpp::Model.new(model_path: model_path, params: build_context_params(embeddings: embeddings))
+ @model = ::LLaMACpp::Model.new(model_path: model_path, params: build_model_params)
  end

  def build_completion_context
- ::LLaMACpp::Context.new(model: build_model)
+ ::LLaMACpp::Context.new(model: build_model, params: build_context_params(embeddings: false))
  end

  def build_embedding_context
- ::LLaMACpp::Context.new(model: build_model(embeddings: true))
+ ::LLaMACpp::Context.new(model: build_model, params: build_context_params(embeddings: true))
  end

  def completion_context
@@ -4,7 +4,7 @@ module Langchain::LLM
  # LLM interface for OpenAI APIs: https://platform.openai.com/overview
  #
  # Gem requirements:
- # gem "ruby-openai", "~> 5.2.0"
+ # gem "ruby-openai", "~> 6.1.0"
  #
  # Usage:
  # openai = Langchain::LLM::OpenAI.new(api_key:, llm_options: {})
@@ -29,7 +29,6 @@ module Langchain::LLM
  LENGTH_VALIDATOR = Langchain::Utils::TokenLength::OpenAIValidator

  attr_accessor :functions
- attr_accessor :response_chunks

  def initialize(api_key:, llm_options: {}, default_options: {})
  depends_on "ruby-openai", req: "openai"
@@ -137,6 +136,7 @@ module Langchain::LLM

  response = with_api_error_handling { client.chat(parameters: parameters) }
  response = response_from_chunks if block
+ reset_response_chunks
  Langchain::LLM::OpenAIResponse.new(response)
  end

@@ -158,6 +158,12 @@ module Langchain::LLM

  private

+ attr_reader :response_chunks
+
+ def reset_response_chunks
+ @response_chunks = []
+ end
+
  def is_legacy_model?(model)
  LEGACY_COMPLETION_MODELS.any? { |legacy_model| model.include?(legacy_model) }
  end
@@ -242,18 +248,18 @@ module Langchain::LLM
  end

  def response_from_chunks
- @response_chunks.first&.slice("id", "object", "created", "model")&.merge(
+ grouped_chunks = @response_chunks.group_by { |chunk| chunk.dig("choices", 0, "index") }
+ final_choices = grouped_chunks.map do |index, chunks|
  {
- "choices" => [
- {
- "message" => {
- "role" => "assistant",
- "content" => @response_chunks.map { |chunk| chunk.dig("choices", 0, "delta", "content") }.join
- }
- }
- ]
+ "index" => index,
+ "message" => {
+ "role" => "assistant",
+ "content" => chunks.map { |chunk| chunk.dig("choices", 0, "delta", "content") }.join
+ },
+ "finish_reason" => chunks.last.dig("choices", 0, "finish_reason")
  }
- )
+ end
+ @response_chunks.first&.slice("id", "object", "created", "model")&.merge({"choices" => final_choices})
  end
  end
  end
lib/langchain/llm/response/google_vertex_ai_response.rb ADDED
@@ -0,0 +1,33 @@
+ # frozen_string_literal: true
+
+ module Langchain::LLM
+   class GoogleVertexAiResponse < BaseResponse
+     attr_reader :prompt_tokens
+
+     def initialize(raw_response, model: nil)
+       @prompt_tokens = prompt_tokens
+       super(raw_response, model: model)
+     end
+
+     def completion
+       # completions&.dig(0, "output")
+       raw_response.predictions[0]["content"]
+     end
+
+     def embedding
+       embeddings.first
+     end
+
+     def completions
+       raw_response.predictions.map { |p| p["content"] }
+     end
+
+     def total_tokens
+       raw_response.dig(:predictions, 0, :embeddings, :statistics, :token_count)
+     end
+
+     def embeddings
+       [raw_response.dig(:predictions, 0, :embeddings, :values)]
+     end
+   end
+ end
lib/langchain/llm/response/llama_cpp_response.rb ADDED
@@ -0,0 +1,13 @@
+ # frozen_string_literal: true
+
+ module Langchain::LLM
+   class LlamaCppResponse < BaseResponse
+     def embedding
+       embeddings
+     end
+
+     def embeddings
+       raw_response.embeddings
+     end
+   end
+ end
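Tying this to the 0.8.0 GGUF breaking change noted in the changelog: a hedged sketch of the updated LlamaCpp flow, assuming a locally converted GGUF model file (the path and env var are illustrative):

```ruby
# Assumes `gem "llama_cpp", "~> 0.9.4"` and a GGUF model produced by llama.cpp's convert scripts
llm = Langchain::LLM::LlamaCpp.new(
  model_path: ENV["LLAMA_CPP_MODEL_PATH"], # e.g. "./models/llama-2-7b.Q4_K_M.gguf"
  n_ctx: 2048,
  n_threads: 4
)

llm.complete(prompt: "Hello!", n_predict: 64) # => String
llm.embed(text: "Hello!")                     # => Langchain::LLM::LlamaCppResponse
```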
@@ -58,7 +58,7 @@ module Langchain::OutputParsers
  completion: completion,
  error: e
  )
- )
+ ).completion
  parser.parse(new_completion)
  end

@@ -33,7 +33,7 @@ module Langchain::Prompt
  when ".json"
  config = JSON.parse(File.read(file_path))
  when ".yaml", ".yml"
- config = YAML.safe_load(File.read(file_path))
+ config = YAML.safe_load_file(file_path)
  else
  raise ArgumentError, "Got unsupported file type #{file_path.extname}"
  end
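For context on the `safe_load_file` change: prompt templates are loaded from JSON or YAML through this same entry point. A hedged sketch, using the `load_from_path`/`format` calls that appear elsewhere in this diff (the file path is illustrative):

```ruby
# Assumes a YAML prompt template with a {text} input variable at this path
prompt = Langchain::Prompt.load_from_path(file_path: "prompts/summarize_template.yaml")
prompt.format(text: "Ruby 3.3 adds a new JIT compiler...") # => String ready to send to an LLM
```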
@@ -15,7 +15,8 @@ module Langchain
  # Source:
  # https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
  "gpt-4-1106-preview" => 4096,
- "gpt-4-vision-preview" => 4096
+ "gpt-4-vision-preview" => 4096,
+ "gpt-3.5-turbo-1106" => 4096
  }

  TOKEN_LIMITS = {
@@ -26,6 +27,7 @@ module Langchain
  "gpt-3.5-turbo" => 4096,
  "gpt-3.5-turbo-0301" => 4096,
  "gpt-3.5-turbo-0613" => 4096,
+ "gpt-3.5-turbo-1106" => 16385,
  "gpt-3.5-turbo-16k" => 16384,
  "gpt-3.5-turbo-16k-0613" => 16384,
  "text-davinci-003" => 4097,
@@ -7,6 +7,7 @@ module Langchain::Vectorsearch
  # == Available vector databases
  #
  # - {Langchain::Vectorsearch::Chroma}
+ # - {Langchain::Vectorsearch::Epsilla}
  # - {Langchain::Vectorsearch::Elasticsearch}
  # - {Langchain::Vectorsearch::Hnswlib}
  # - {Langchain::Vectorsearch::Milvus}
@@ -29,10 +30,11 @@ module Langchain::Vectorsearch
  # )
  #
  # # You can instantiate other supported vector databases the same way:
+ # epsilla = Langchain::Vectorsearch::Epsilla.new(...)
  # milvus = Langchain::Vectorsearch::Milvus.new(...)
  # qdrant = Langchain::Vectorsearch::Qdrant.new(...)
  # pinecone = Langchain::Vectorsearch::Pinecone.new(...)
- # chrome = Langchain::Vectorsearch::Chroma.new(...)
+ # chroma = Langchain::Vectorsearch::Chroma.new(...)
  # pgvector = Langchain::Vectorsearch::Pgvector.new(...)
  #
  # == Schema Creation
@@ -46,6 +46,9 @@ module Langchain::Vectorsearch
  super(llm: llm)
  end

+ # Add a list of texts to the index
+ # @param texts [Array<String>] The list of texts to add
+ # @return [Elasticsearch::Response] from the Elasticsearch server
  def add_texts(texts: [])
  body = texts.map do |text|
  [
@@ -57,6 +60,10 @@ module Langchain::Vectorsearch
  es_client.bulk(body: body)
  end

+ # Update a list of texts in the index
+ # @param texts [Array<String>] The list of texts to update
+ # @param ids [Array<Integer>] The ids of the texts to update, in the same order as the texts
+ # @return [Elasticsearch::Response] from the Elasticsearch server
  def update_texts(texts: [], ids: [])
  body = texts.map.with_index do |text, i|
  [
@@ -68,6 +75,8 @@ module Langchain::Vectorsearch
  es_client.bulk(body: body)
  end

+ # Create the index with the default schema
+ # @return [Elasticsearch::Response] Index creation
  def create_default_schema
  es_client.indices.create(
  index: index_name,
@@ -75,6 +84,8 @@ module Langchain::Vectorsearch
  )
  end

+ # Deletes the default schema
+ # @return [Elasticsearch::Response] Index deletion
  def delete_default_schema
  es_client.indices.delete(
  index: index_name
@@ -116,10 +127,30 @@ module Langchain::Vectorsearch
  }
  end

- # TODO: Implement this
- # def ask()
- # end
+ # Ask a question and return the answer
+ # @param question [String] The question to ask
+ # @param k [Integer] The number of results to have in context
+ # @yield [String] Stream responses back one String at a time
+ # @return [String] The answer to the question
+ def ask(question:, k: 4, &block)
+ search_results = similarity_search(query: question, k: k)

+ context = search_results.map do |result|
+ result[:input]
+ end.join("\n---\n")
+
+ prompt = generate_rag_prompt(question: question, context: context)
+
+ response = llm.chat(prompt: prompt, &block)
+ response.context = context
+ response
+ end
+
+ # Search for similar texts
+ # @param text [String] The text to search for
+ # @param k [Integer] The number of results to return
+ # @param query [Hash] Elasticsearch query that needs to be used while searching (Optional)
+ # @return [Elasticsearch::Response] The response from the server
  def similarity_search(text: "", k: 10, query: {})
  if text.empty? && query.empty?
  raise "Either text or query should pass as an argument"
@@ -134,6 +165,11 @@ module Langchain::Vectorsearch
  es_client.search(body: {query: query, size: k}).body
  end

+ # Search for similar texts by embedding
+ # @param embedding [Array<Float>] The embedding to search for
+ # @param k [Integer] The number of results to return
+ # @param query [Hash] Elasticsearch query that needs to be used while searching (Optional)
+ # @return [Elasticsearch::Response] The response from the server
  def similarity_search_by_vector(embedding: [], k: 10, query: {})
  if embedding.empty? && query.empty?
  raise "Either embedding or query should pass as an argument"
lib/langchain/vectorsearch/epsilla.rb ADDED
@@ -0,0 +1,143 @@
+ # frozen_string_literal: true
+
+ require "securerandom"
+ require "json"
+ require "timeout"
+ require "uri"
+
+ module Langchain::Vectorsearch
+   class Epsilla < Base
+     #
+     # Wrapper around Epsilla client library
+     #
+     # Gem requirements:
+     #     gem "epsilla-ruby", "~> 0.0.3"
+     #
+     # Usage:
+     #     epsilla = Langchain::Vectorsearch::Epsilla.new(url:, db_name:, db_path:, index_name:, llm:)
+     #
+     # Initialize Epsilla client
+     # @param url [String] URL to connect to the Epsilla db instance, protocol://host:port
+     # @param db_name [String] The name of the database to use
+     # @param db_path [String] The path to the database to use
+     # @param index_name [String] The name of the Epsilla table to use
+     # @param llm [Object] The LLM client to use
+     def initialize(url:, db_name:, db_path:, index_name:, llm:)
+       depends_on "epsilla-ruby", req: "epsilla"
+
+       uri = URI.parse(url)
+       protocol = uri.scheme
+       host = uri.host
+       port = uri.port
+
+       @client = ::Epsilla::Client.new(protocol, host, port)
+
+       Timeout.timeout(5) do
+         status_code, response = @client.database.load_db(db_name, db_path)
+
+         if status_code != 200
+           if status_code == 500 && response["message"].include?("already loaded")
+             Langchain.logger.info("Database already loaded")
+           else
+             raise "Failed to load database: #{response}"
+           end
+         end
+       end
+
+       @client.database.use_db(db_name)
+
+       @db_name = db_name
+       @db_path = db_path
+       @table_name = index_name
+
+       @vector_dimension = llm.default_dimension
+
+       super(llm: llm)
+     end
+
+     # Create a table using the index_name passed in the constructor
+     def create_default_schema
+       status_code, response = @client.database.create_table(@table_name, [
+         {"name" => "ID", "dataType" => "STRING", "primaryKey" => true},
+         {"name" => "Doc", "dataType" => "STRING"},
+         {"name" => "Embedding", "dataType" => "VECTOR_FLOAT", "dimensions" => @vector_dimension}
+       ])
+       raise "Failed to create table: #{response}" if status_code != 200
+
+       response
+     end
+
+     # Drop the table using the index_name passed in the constructor
+     def destroy_default_schema
+       status_code, response = @client.database.drop_table(@table_name)
+       raise "Failed to drop table: #{response}" if status_code != 200
+
+       response
+     end
+
+     # Add a list of texts to the database
+     # @param texts [Array<String>] The list of texts to add
+     # @param ids [Array<String>] The unique ids to add to the index, in the same order as the texts; if nil, it will be random uuids
+     def add_texts(texts:, ids: nil)
+       validated_ids = ids
+       if ids.nil?
+         validated_ids = texts.map { SecureRandom.uuid }
+       elsif ids.length != texts.length
+         raise "The number of ids must match the number of texts"
+       end
+
+       data = texts.map.with_index do |text, idx|
+         {Doc: text, Embedding: llm.embed(text: text).embedding, ID: validated_ids[idx]}
+       end
+
+       status_code, response = @client.database.insert(@table_name, data)
+       raise "Failed to insert texts: #{response}" if status_code != 200
+       response
+     end
+
+     # Search for similar texts
+     # @param query [String] The text to search for
+     # @param k [Integer] The number of results to return
+     # @return [String] The response from the server
+     def similarity_search(query:, k: 4)
+       embedding = llm.embed(text: query).embedding
+
+       similarity_search_by_vector(
+         embedding: embedding,
+         k: k
+       )
+     end
+
+     # Search for entries by embedding
+     # @param embedding [Array<Float>] The embedding to search for
+     # @param k [Integer] The number of results to return
+     # @return [String] The response from the server
+     def similarity_search_by_vector(embedding:, k: 4)
+       status_code, response = @client.database.query(@table_name, "Embedding", embedding, ["Doc"], k, false)
+       raise "Failed to do similarity search: #{response}" if status_code != 200
+
+       data = JSON.parse(response)["result"]
+       data.map { |result| result["Doc"] }
+     end
+
+     # Ask a question and return the answer
+     # @param question [String] The question to ask
+     # @param k [Integer] The number of results to have in context
+     # @yield [String] Stream responses back one String at a time
+     # @return [String] The answer to the question
+     def ask(question:, k: 4, &block)
+       search_results = similarity_search(query: question, k: k)
+
+       context = search_results.map do |result|
+         result.to_s
+       end
+       context = context.join("\n---\n")
+
+       prompt = generate_rag_prompt(question: question, context: context)
+
+       response = llm.chat(prompt: prompt, &block)
+       response.context = context
+       response
+     end
+   end
+ end
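A hedged end-to-end sketch of the new Epsilla store, mirroring the constructor documented in the class above; the URL, database name, and storage path are illustrative values:

```ruby
epsilla = Langchain::Vectorsearch::Epsilla.new(
  url: "http://localhost:8888",      # illustrative Epsilla endpoint
  db_name: "langchain",              # illustrative database name
  db_path: "/tmp/langchain_epsilla", # illustrative storage path
  index_name: "Documents",
  llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
)

epsilla.create_default_schema
epsilla.add_texts(texts: ["Epsilla is a vector database."])
epsilla.similarity_search(query: "vector database", k: 1) # => ["Epsilla is a vector database."]
```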
@@ -44,14 +44,14 @@ module Langchain::Vectorsearch
  # Add a list of texts to the index
  # @param texts [Array<String>] The list of texts to add
  # @return [Hash] The response from the server
- def add_texts(texts:, ids: [])
+ def add_texts(texts:, ids: [], payload: {})
  batch = {ids: [], vectors: [], payloads: []}

  Array(texts).each_with_index do |text, i|
  id = ids[i] || SecureRandom.uuid
  batch[:ids].push(id)
  batch[:vectors].push(llm.embed(text: text).embedding)
- batch[:payloads].push({content: text})
+ batch[:payloads].push({content: text}.merge(payload))
  end

  client.points.upsert(
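The new `payload:` keyword merges a single hash into every upserted point alongside the generated `:content` key. A hedged sketch, with the Qdrant constructor options assumed for illustration:

```ruby
qdrant = Langchain::Vectorsearch::Qdrant.new(
  url: ENV["QDRANT_URL"], api_key: ENV["QDRANT_API_KEY"], index_name: "documents", # assumed options
  llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
)

qdrant.add_texts(
  texts: ["Ruby was created by Yukihiro Matsumoto."],
  payload: {source: "wikipedia", lang: "en"} # merged into each point's payload
)
```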
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Langchain
- VERSION = "0.7.5"
+ VERSION = "0.8.1"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: langchainrb
  version: !ruby/object:Gem::Version
- version: 0.7.5
+ version: 0.8.1
  platform: ruby
  authors:
  - Andrei Bondarev
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2023-11-13 00:00:00.000000000 Z
+ date: 2023-12-07 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: baran
@@ -276,6 +276,20 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 8.2.0
+ - !ruby/object:Gem::Dependency
+ name: epsilla-ruby
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 0.0.4
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 0.0.4
  - !ruby/object:Gem::Dependency
  name: eqn
  requirement: !ruby/object:Gem::Requirement
@@ -290,6 +304,20 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: 1.6.5
+ - !ruby/object:Gem::Dependency
+ name: google-apis-aiplatform_v1
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.7'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '0.7'
  - !ruby/object:Gem::Dependency
  name: google_palm_api
  requirement: !ruby/object:Gem::Requirement
@@ -366,14 +394,14 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 0.3.7
+ version: 0.9.4
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 0.3.7
+ version: 0.9.4
  - !ruby/object:Gem::Dependency
  name: nokogiri
  requirement: !ruby/object:Gem::Requirement
@@ -506,14 +534,14 @@ dependencies:
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 5.2.0
+ version: 6.1.0
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: 5.2.0
+ version: 6.1.0
  - !ruby/object:Gem::Dependency
  name: safe_ruby
  requirement: !ruby/object:Gem::Requirement
@@ -619,6 +647,7 @@ files:
  - lib/langchain/llm/base.rb
  - lib/langchain/llm/cohere.rb
  - lib/langchain/llm/google_palm.rb
+ - lib/langchain/llm/google_vertex_ai.rb
  - lib/langchain/llm/hugging_face.rb
  - lib/langchain/llm/llama_cpp.rb
  - lib/langchain/llm/ollama.rb
@@ -631,7 +660,9 @@ files:
  - lib/langchain/llm/response/base_response.rb
  - lib/langchain/llm/response/cohere_response.rb
  - lib/langchain/llm/response/google_palm_response.rb
+ - lib/langchain/llm/response/google_vertex_ai_response.rb
  - lib/langchain/llm/response/hugging_face_response.rb
+ - lib/langchain/llm/response/llama_cpp_response.rb
  - lib/langchain/llm/response/ollama_response.rb
  - lib/langchain/llm/response/openai_response.rb
  - lib/langchain/llm/response/replicate_response.rb
@@ -671,6 +702,7 @@ files:
  - lib/langchain/vectorsearch/base.rb
  - lib/langchain/vectorsearch/chroma.rb
  - lib/langchain/vectorsearch/elasticsearch.rb
+ - lib/langchain/vectorsearch/epsilla.rb
  - lib/langchain/vectorsearch/hnswlib.rb
  - lib/langchain/vectorsearch/milvus.rb
  - lib/langchain/vectorsearch/pgvector.rb