langchainrb 0.7.3 → 0.8.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 3d2d42bf6883822d160e0eeeb4adbfe1598ee271bd3dfd8d4d4b914db814ed0d
4
- data.tar.gz: f041fc5f276258072275ab5979bf670cc5c6a122b8d4d55ca571224af790d43d
3
+ metadata.gz: 8ea2adff257b4151b8acf24a02de851df2d99fe8890d6afd06bcdc3a5f53e9e1
4
+ data.tar.gz: 646a5f9246bffc20654672393f9175c1f0f30533ba1546cef05ce951d449c9ec
5
5
  SHA512:
6
- metadata.gz: 61b3c342e8630e6d3ca325bfb105a29d609d99d668dc5c4cfa1cb2c447c230bb8f1f6aa7d252a08129918a0fa11e37bcab813c9700a4c690dd9e5d337eebeb7d
7
- data.tar.gz: 7ef534ed87ae2d6c077854a03eb314390238d95e9c0b49e85c9042d60d122806709ee07e007e5de884535d4cb8b6a3ffa6504a31e6ac36fadbde10e9c1924444
6
+ metadata.gz: 3b2aaace63c46b7eec9d8cc04a2cd9cc84c79c90a5a1f1ce1bcb11e4416021f89293d40309ca35b0e4dbb2036a2962bde0faa28ad46d081846dcb00a9a1bf783
7
+ data.tar.gz: fd5e8e03053ab99a737b3ce17c12ae76da2bc1d0b4bda89eb16e16afe43f260325af78a7c62faf0041c8869cbd94c0a5bbbda920bb7e1d7f175ac35545b53f00
data/CHANGELOG.md CHANGED
@@ -1,5 +1,15 @@
1
1
  ## [Unreleased]
2
2
 
3
+ ## [0.8.0]
4
+ - [BREAKING] Updated llama_cpp.rb to 0.9.4. The model file format used by the underlying llama.cpp library has changed to GGUF; llama.cpp ships with scripts to convert existing files, and GGUF-format models can be downloaded from HuggingFace.
5
+ - Introduced the Langchain::LLM::GoogleVertexAi LLM provider
6
+
7
+ ## [0.7.5] - 2023-11-13
8
+ - Bug fixes
9
+
10
+ ## [0.7.4] - 2023-11-10
11
+ - AWS Bedrock is now available as an LLM provider, with models from AI21, Cohere, AWS, and Anthropic.
12
+
3
13
  ## [0.7.3] - 2023-11-08
4
14
  - LLM response passes through the context in RAG cases
5
15
  - Fix gpt-4 token length validation
data/README.md CHANGED
@@ -1,3 +1,6 @@
1
+ # Please fill out the [Ruby AI Survey 2023](https://docs.google.com/forms/d/1dH_0js1wpEyh1YqPTOxU3b5fXj76sb5lYp12lVoNNZE/edit).
2
+ Results will be anonymized and shared!
3
+
1
4
  💎🔗 Langchain.rb
2
5
  ---
3
6
  ⚡ Building LLM-powered applications in Ruby ⚡
@@ -53,22 +56,24 @@ require "langchain"
53
56
  Langchain.rb wraps all supported LLMs in a unified interface allowing you to easily swap out and test out different models.
54
57
 
55
58
  #### Supported LLMs and features:
56
- | LLM providers | embed() | complete() | chat() | summarize() | Notes |
57
- | -------- |:------------------:| :-------: | :-----------------: | :-------: | :----------------- |
58
- | [OpenAI](https://openai.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | ❌ | Including Azure OpenAI |
59
- | [AI21](https://ai21.com/) | ❌ | :white_check_mark: | ❌ | :white_check_mark: | |
60
- | [Anthropic](https://milvus.io/) | ❌ | :white_check_mark: | ❌ | ❌ | |
61
- | [Cohere](https://www.pinecone.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
62
- | [GooglePalm](https://ai.google/discover/palm2/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
63
- | [HuggingFace](https://huggingface.co/) | :white_check_mark: | | | | |
64
- | [Ollama](https://ollama.ai/) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | |
65
- | [Replicate](https://replicate.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
59
+ | LLM providers | embed() | complete() | chat() | summarize() | Notes |
60
+ | -------- |:------------------:| :-------: | :-----------------: | :-------: | :----------------- |
61
+ | [OpenAI](https://openai.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ❌ | Including Azure OpenAI |
62
+ | [AI21](https://ai21.com/?utm_source=langchainrb&utm_medium=github) | ❌ | ✅ | ❌ | ✅ | |
63
+ | [Anthropic](https://anthropic.com/?utm_source=langchainrb&utm_medium=github) | ❌ | ✅ | ❌ | ❌ | |
64
+ | [AWS Bedrock](https://aws.amazon.com/bedrock?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ❌ | ❌ | Provides AWS, Cohere, AI21, Anthropic and Stability AI models |
65
+ | [Cohere](https://cohere.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |
66
+ | [GooglePalm](https://ai.google/discover/palm2?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |
67
+ | [Google Vertex AI](https://cloud.google.com/vertex-ai?utm_source=langchainrb&utm_medium=github) | ✅ | ❌ | ❌ | ❌ | |
68
+ | [HuggingFace](https://huggingface.co/?utm_source=langchainrb&utm_medium=github) | ✅ | | | | |
69
+ | [Ollama](https://ollama.ai/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ❌ | ❌ | |
70
+ | [Replicate](https://replicate.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ | ✅ | ✅ | |
66
71
 
67
72
  #### Using standalone LLMs:
68
73
 
69
74
  #### OpenAI
70
75
 
71
- Add `gem "ruby-openai", "~> 5.2.0"` to your Gemfile.
76
+ Add `gem "ruby-openai", "~> 6.1.0"` to your Gemfile.
72
77
 
73
78
  ```ruby
74
79
  llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
@@ -302,15 +307,16 @@ Langchain.rb provides a convenient unified interface on top of supported vectors
302
307
 
303
308
  #### Supported vector search databases and features:
304
309
 
305
- | Database | Open-source | Cloud offering |
306
- | -------- |:------------------:| :------------: |
307
- | [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: |
308
- | [Hnswlib](https://github.com/nmslib/hnswlib/) | :white_check_mark: | ❌ |
309
- | [Milvus](https://milvus.io/) | :white_check_mark: | :white_check_mark: Zilliz Cloud |
310
- | [Pinecone](https://www.pinecone.io/) | ❌ | :white_check_mark: |
311
- | [Pgvector](https://github.com/pgvector/pgvector) | :white_check_mark: | :white_check_mark: |
312
- | [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: |
313
- | [Weaviate](https://weaviate.io/) | :white_check_mark: | :white_check_mark: |
310
+ | Database | Open-source | Cloud offering |
311
+ | -------- |:------------------:| :------------: |
312
+ | [Chroma](https://trychroma.com/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
313
+ | [Hnswlib](https://github.com/nmslib/hnswlib/?utm_source=langchainrb&utm_medium=github) | ✅ | ❌ |
314
+ | [Milvus](https://milvus.io/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ Zilliz Cloud |
315
+ | [Pinecone](https://www.pinecone.io/?utm_source=langchainrb&utm_medium=github) | ❌ | ✅ |
316
+ | [Pgvector](https://github.com/pgvector/pgvector/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
317
+ | [Qdrant](https://qdrant.tech/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
318
+ | [Weaviate](https://weaviate.io/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
319
+ | [Elasticsearch](https://www.elastic.co/?utm_source=langchainrb&utm_medium=github) | ✅ | ✅ |
314
320
 
315
321
  ### Using Vector Search Databases 🔍
316
322
 
@@ -340,7 +346,8 @@ client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"
340
346
  client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.2"`
341
347
  client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
342
348
  client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
343
- client = Langchain::Vectorsearch::Qdrant.new(...) # `gem"qdrant-ruby", "~> 0.9.3"`
349
+ client = Langchain::Vectorsearch::Qdrant.new(...) # `gem "qdrant-ruby", "~> 0.9.3"`
350
+ client = Langchain::Vectorsearch::Elasticsearch.new(...) # `gem "elasticsearch", "~> 8.2.0"`
344
351
  ```
345
352
 
346
353
  Create the default schema:
@@ -0,0 +1,216 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Langchain::LLM
4
+ # LLM interface for Aws Bedrock APIs: https://docs.aws.amazon.com/bedrock/
5
+ #
6
+ # Gem requirements:
7
+ # gem 'aws-sdk-bedrockruntime', '~> 1.1'
8
+ #
9
+ # Usage:
10
+ # bedrock = Langchain::LLM::AwsBedrock.new(llm_options: {})
11
+ #
12
+ class AwsBedrock < Base
13
+ DEFAULTS = {
14
+ completion_model_name: "anthropic.claude-v2",
15
+ embedding_model_name: "amazon.titan-embed-text-v1",
16
+ max_tokens_to_sample: 300,
17
+ temperature: 1,
18
+ top_k: 250,
19
+ top_p: 0.999,
20
+ stop_sequences: ["\n\nHuman:"],
21
+ anthropic_version: "bedrock-2023-05-31",
22
+ return_likelihoods: "NONE",
23
+ count_penalty: {
24
+ scale: 0,
25
+ apply_to_whitespaces: false,
26
+ apply_to_punctuations: false,
27
+ apply_to_numbers: false,
28
+ apply_to_stopwords: false,
29
+ apply_to_emojis: false
30
+ },
31
+ presence_penalty: {
32
+ scale: 0,
33
+ apply_to_whitespaces: false,
34
+ apply_to_punctuations: false,
35
+ apply_to_numbers: false,
36
+ apply_to_stopwords: false,
37
+ apply_to_emojis: false
38
+ },
39
+ frequency_penalty: {
40
+ scale: 0,
41
+ apply_to_whitespaces: false,
42
+ apply_to_punctuations: false,
43
+ apply_to_numbers: false,
44
+ apply_to_stopwords: false,
45
+ apply_to_emojis: false
46
+ }
47
+ }.freeze
48
+
49
+ SUPPORTED_COMPLETION_PROVIDERS = %i[anthropic cohere ai21].freeze
50
+ SUPPORTED_EMBEDDING_PROVIDERS = %i[amazon].freeze
51
+
52
+ def initialize(completion_model: DEFAULTS[:completion_model_name], embedding_model: DEFAULTS[:embedding_model_name], aws_client_options: {}, default_options: {})
53
+ depends_on "aws-sdk-bedrockruntime", req: "aws-sdk-bedrockruntime"
54
+
55
+ @client = ::Aws::BedrockRuntime::Client.new(**aws_client_options)
56
+ @defaults = DEFAULTS.merge(default_options)
57
+ .merge(completion_model_name: completion_model)
58
+ .merge(embedding_model_name: embedding_model)
59
+ end
60
+
61
+ #
62
+ # Generate an embedding for a given text
63
+ #
64
+ # @param text [String] The text to generate an embedding for
65
+ # @param params extra parameters passed to Aws::BedrockRuntime::Client#invoke_model
66
+ # @return [Langchain::LLM::AwsTitanResponse] Response object
67
+ #
68
+ def embed(text:, **params)
69
+ raise "Completion provider #{embedding_provider} is not supported." unless SUPPORTED_EMBEDDING_PROVIDERS.include?(embedding_provider)
70
+
71
+ parameters = {inputText: text}
72
+ parameters = parameters.merge(params)
73
+
74
+ response = client.invoke_model({
75
+ model_id: @defaults[:embedding_model_name],
76
+ body: parameters.to_json,
77
+ content_type: "application/json",
78
+ accept: "application/json"
79
+ })
80
+
81
+ Langchain::LLM::AwsTitanResponse.new(JSON.parse(response.body.string))
82
+ end
83
+
84
+ #
85
+ # Generate a completion for a given prompt
86
+ #
87
+ # @param prompt [String] The prompt to generate a completion for
88
+ # @param params extra parameters passed to Aws::BedrockRuntime::Client#invoke_model
89
+ # @return [Langchain::LLM::AnthropicResponse], [Langchain::LLM::CohereResponse] or [Langchain::LLM::AI21Response] Response object
90
+ #
91
+ def complete(prompt:, **params)
92
+ raise "Completion provider #{completion_provider} is not supported." unless SUPPORTED_COMPLETION_PROVIDERS.include?(completion_provider)
93
+
94
+ parameters = compose_parameters params
95
+
96
+ parameters[:prompt] = wrap_prompt prompt
97
+
98
+ response = client.invoke_model({
99
+ model_id: @defaults[:completion_model_name],
100
+ body: parameters.to_json,
101
+ content_type: "application/json",
102
+ accept: "application/json"
103
+ })
104
+
105
+ parse_response response
106
+ end
107
+
108
+ private
109
+
110
+ def completion_provider
111
+ @defaults[:completion_model_name].split(".").first.to_sym
112
+ end
113
+
114
+ def embedding_provider
115
+ @defaults[:embedding_model_name].split(".").first.to_sym
116
+ end
117
+
118
+ def wrap_prompt(prompt)
119
+ if completion_provider == :anthropic
120
+ "\n\nHuman: #{prompt}\n\nAssistant:"
121
+ else
122
+ prompt
123
+ end
124
+ end
125
+
126
+ def max_tokens_key
127
+ if completion_provider == :anthropic
128
+ :max_tokens_to_sample
129
+ elsif completion_provider == :cohere
130
+ :max_tokens
131
+ elsif completion_provider == :ai21
132
+ :maxTokens
133
+ end
134
+ end
135
+
136
+ def compose_parameters(params)
137
+ if completion_provider == :anthropic
138
+ compose_parameters_anthropic params
139
+ elsif completion_provider == :cohere
140
+ compose_parameters_cohere params
141
+ elsif completion_provider == :ai21
142
+ compose_parameters_ai21 params
143
+ end
144
+ end
145
+
146
+ def parse_response(response)
147
+ if completion_provider == :anthropic
148
+ Langchain::LLM::AnthropicResponse.new(JSON.parse(response.body.string))
149
+ elsif completion_provider == :cohere
150
+ Langchain::LLM::CohereResponse.new(JSON.parse(response.body.string))
151
+ elsif completion_provider == :ai21
152
+ Langchain::LLM::AI21Response.new(JSON.parse(response.body.string, symbolize_names: true))
153
+ end
154
+ end
155
+
156
+ def compose_parameters_cohere(params)
157
+ default_params = @defaults.merge(params)
158
+
159
+ {
160
+ max_tokens: default_params[:max_tokens_to_sample],
161
+ temperature: default_params[:temperature],
162
+ p: default_params[:top_p],
163
+ k: default_params[:top_k],
164
+ stop_sequences: default_params[:stop_sequences]
165
+ }
166
+ end
167
+
168
+ def compose_parameters_anthropic(params)
169
+ default_params = @defaults.merge(params)
170
+
171
+ {
172
+ max_tokens_to_sample: default_params[:max_tokens_to_sample],
173
+ temperature: default_params[:temperature],
174
+ top_k: default_params[:top_k],
175
+ top_p: default_params[:top_p],
176
+ stop_sequences: default_params[:stop_sequences],
177
+ anthropic_version: default_params[:anthropic_version]
178
+ }
179
+ end
180
+
181
+ def compose_parameters_ai21(params)
182
+ default_params = @defaults.merge(params)
183
+
184
+ {
185
+ maxTokens: default_params[:max_tokens_to_sample],
186
+ temperature: default_params[:temperature],
187
+ topP: default_params[:top_p],
188
+ stopSequences: default_params[:stop_sequences],
189
+ countPenalty: {
190
+ scale: default_params[:count_penalty][:scale],
191
+ applyToWhitespaces: default_params[:count_penalty][:apply_to_whitespaces],
192
+ applyToPunctuations: default_params[:count_penalty][:apply_to_punctuations],
193
+ applyToNumbers: default_params[:count_penalty][:apply_to_numbers],
194
+ applyToStopwords: default_params[:count_penalty][:apply_to_stopwords],
195
+ applyToEmojis: default_params[:count_penalty][:apply_to_emojis]
196
+ },
197
+ presencePenalty: {
198
+ scale: default_params[:presence_penalty][:scale],
199
+ applyToWhitespaces: default_params[:presence_penalty][:apply_to_whitespaces],
200
+ applyToPunctuations: default_params[:presence_penalty][:apply_to_punctuations],
201
+ applyToNumbers: default_params[:presence_penalty][:apply_to_numbers],
202
+ applyToStopwords: default_params[:presence_penalty][:apply_to_stopwords],
203
+ applyToEmojis: default_params[:presence_penalty][:apply_to_emojis]
204
+ },
205
+ frequencyPenalty: {
206
+ scale: default_params[:frequency_penalty][:scale],
207
+ applyToWhitespaces: default_params[:frequency_penalty][:apply_to_whitespaces],
208
+ applyToPunctuations: default_params[:frequency_penalty][:apply_to_punctuations],
209
+ applyToNumbers: default_params[:frequency_penalty][:apply_to_numbers],
210
+ applyToStopwords: default_params[:frequency_penalty][:apply_to_stopwords],
211
+ applyToEmojis: default_params[:frequency_penalty][:apply_to_emojis]
212
+ }
213
+ }
214
+ end
215
+ end
216
+ end
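For orientation, a minimal usage sketch of the new `Langchain::LLM::AwsBedrock` class above, relying only on the defaults shown in `DEFAULTS`; AWS credentials and region are assumed to come from the standard SDK environment.

```ruby
require "langchain"

# `gem "aws-sdk-bedrockruntime", "~> 1.1"` must be in the Gemfile;
# credentials/region are picked up by the AWS SDK from the environment.
bedrock = Langchain::LLM::AwsBedrock.new

# embed() invokes the default amazon.titan-embed-text-v1 model and
# returns a Langchain::LLM::AwsTitanResponse.
embedding = bedrock.embed(text: "Hello from Bedrock").embedding

# complete() wraps the prompt as "\n\nHuman: ...\n\nAssistant:" for the
# default anthropic.claude-v2 model and returns an AnthropicResponse.
response = bedrock.complete(prompt: "Name three Ruby web frameworks.")
```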
@@ -4,7 +4,7 @@ module Langchain::LLM
4
4
  # LLM interface for Azure OpenAI Service APIs: https://learn.microsoft.com/en-us/azure/ai-services/openai/
5
5
  #
6
6
  # Gem requirements:
7
- # gem "ruby-openai", "~> 5.2.0"
7
+ # gem "ruby-openai", "~> 6.1.0"
8
8
  #
9
9
  # Usage:
10
10
  # openai = Langchain::LLM::Azure.new(api_key:, llm_options: {}, embedding_deployment_url: chat_deployment_url:)
@@ -0,0 +1,55 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Langchain::LLM
4
+ #
5
+ # Wrapper around the Google Vertex AI APIs: https://cloud.google.com/vertex-ai?hl=en
6
+ #
7
+ # Gem requirements:
8
+ # gem "google-apis-aiplatform_v1", "~> 0.7"
9
+ #
10
+ # Usage:
11
+ # google_vertex_ai = Langchain::LLM::GoogleVertexAi.new(project_id: ENV["GOOGLE_VERTEX_AI_PROJECT_ID"])
12
+ #
13
+ class GoogleVertexAi < Base
14
+ DEFAULTS = {
15
+ temperature: 0.2,
16
+ dimension: 768,
17
+ embeddings_model_name: "textembedding-gecko"
18
+ }.freeze
19
+
20
+ attr_reader :project_id, :client
21
+
22
+ def initialize(project_id:, default_options: {})
23
+ depends_on "google-apis-aiplatform_v1"
24
+
25
+ @project_id = project_id
26
+
27
+ @client = Google::Apis::AiplatformV1::AiplatformService.new
28
+
29
+ # TODO: Adapt for other regions; Pass it in via the constructor
30
+ @client.root_url = "https://us-central1-aiplatform.googleapis.com/"
31
+ @client.authorization = Google::Auth.get_application_default
32
+
33
+ @defaults = DEFAULTS.merge(default_options)
34
+ end
35
+
36
+ #
37
+ # Generate an embedding for a given text
38
+ #
39
+ # @param text [String] The text to generate an embedding for
40
+ # @return [Langchain::LLM::GoogleVertexAiResponse] Response object
41
+ #
42
+ def embed(text:)
43
+ content = [{content: text}]
44
+ request = Google::Apis::AiplatformV1::GoogleCloudAiplatformV1PredictRequest.new(instances: content)
45
+
46
+ api_path = "projects/#{@project_id}/locations/us-central1/publishers/google/models/#{@defaults[:embeddings_model_name]}"
47
+
48
+ puts("api_path: #{api_path}")
49
+
50
+ response = client.predict_project_location_publisher_model(api_path, request)
51
+
52
+ Langchain::LLM::GoogleVertexAiResponse.new(response.to_h, model: @defaults[:embeddings_model_name])
53
+ end
54
+ end
55
+ end
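A minimal sketch of calling the new `Langchain::LLM::GoogleVertexAi` wrapper above, assuming Application Default Credentials are configured and the Vertex AI API is enabled for the project.

```ruby
require "langchain"

# `gem "google-apis-aiplatform_v1", "~> 0.7"` must be in the Gemfile.
vertex = Langchain::LLM::GoogleVertexAi.new(project_id: ENV["GOOGLE_VERTEX_AI_PROJECT_ID"])

# embed() calls the textembedding-gecko publisher model in us-central1
# and wraps the API result in a GoogleVertexAiResponse.
response = vertex.embed(text: "Hello from Vertex AI")
response.embedding    # => 768-element Array of Floats (default dimension)
response.total_tokens # => token count reported under predictions/statistics
```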
@@ -22,7 +22,7 @@ module Langchain::LLM
22
22
  # @param n_ctx [Integer] The number of context tokens to use
23
23
  # @param n_threads [Integer] The CPU number of threads to use
24
24
  # @param seed [Integer] The seed to use
25
- def initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: -1)
25
+ def initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: 0)
26
26
  depends_on "llama_cpp"
27
27
 
28
28
  @model_path = model_path
@@ -33,30 +33,25 @@ module Langchain::LLM
33
33
  end
34
34
 
35
35
  # @param text [String] The text to embed
36
- # @param n_threads [Integer] The number of CPU threads to use
37
36
  # @return [Array<Float>] The embedding
38
- def embed(text:, n_threads: nil)
37
+ def embed(text:)
39
38
  # contexts are kinda stateful when it comes to embeddings, so allocate one each time
40
39
  context = embedding_context
41
40
 
42
- embedding_input = context.tokenize(text: text, add_bos: true)
41
+ embedding_input = @model.tokenize(text: text, add_bos: true)
43
42
  return unless embedding_input.size.positive?
44
43
 
45
- n_threads ||= self.n_threads
46
-
47
- context.eval(tokens: embedding_input, n_past: 0, n_threads: n_threads)
48
- context.embeddings
44
+ context.eval(tokens: embedding_input, n_past: 0)
45
+ Langchain::LLM::LlamaCppResponse.new(context, model: context.model.desc)
49
46
  end
50
47
 
51
48
  # @param prompt [String] The prompt to complete
52
49
  # @param n_predict [Integer] The number of tokens to predict
53
- # @param n_threads [Integer] The number of CPU threads to use
54
50
  # @return [String] The completed prompt
55
- def complete(prompt:, n_predict: 128, n_threads: nil)
56
- n_threads ||= self.n_threads
51
+ def complete(prompt:, n_predict: 128)
57
52
  # contexts do not appear to be stateful when it comes to completion, so re-use the same one
58
53
  context = completion_context
59
- ::LLaMACpp.generate(context, prompt, n_threads: n_threads, n_predict: n_predict)
54
+ ::LLaMACpp.generate(context, prompt, n_predict: n_predict)
60
55
  end
61
56
 
62
57
  private
@@ -71,23 +66,30 @@ module Langchain::LLM
71
66
 
72
67
  context_params.seed = seed
73
68
  context_params.n_ctx = n_ctx
74
- context_params.n_gpu_layers = n_gpu_layers
69
+ context_params.n_threads = n_threads
75
70
  context_params.embedding = embeddings
76
71
 
77
72
  context_params
78
73
  end
79
74
 
75
+ def build_model_params
76
+ model_params = ::LLaMACpp::ModelParams.new
77
+ model_params.n_gpu_layers = n_gpu_layers
78
+
79
+ model_params
80
+ end
81
+
80
82
  def build_model(embeddings: false)
81
83
  return @model if defined?(@model)
82
- @model = ::LLaMACpp::Model.new(model_path: model_path, params: build_context_params(embeddings: embeddings))
84
+ @model = ::LLaMACpp::Model.new(model_path: model_path, params: build_model_params)
83
85
  end
84
86
 
85
87
  def build_completion_context
86
- ::LLaMACpp::Context.new(model: build_model)
88
+ ::LLaMACpp::Context.new(model: build_model, params: build_context_params(embeddings: false))
87
89
  end
88
90
 
89
91
  def build_embedding_context
90
- ::LLaMACpp::Context.new(model: build_model(embeddings: true))
92
+ ::LLaMACpp::Context.new(model: build_model, params: build_context_params(embeddings: true))
91
93
  end
92
94
 
93
95
  def completion_context
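Tying the LlamaCpp changes above to the [BREAKING] llama_cpp.rb 0.9.4 entry in the changelog, a hedged usage sketch; the model path is a placeholder and must point at a GGUF-format file.

```ruby
require "langchain"

# `gem "llama_cpp", "~> 0.9.4"` must be in the Gemfile. Older GGML model
# files need to be converted with the scripts that ship with llama.cpp.
llama = Langchain::LLM::LlamaCpp.new(
  model_path: ENV["LLAMA_MODEL_PATH"], # path to a local *.gguf file (placeholder)
  n_ctx: 2048,
  n_threads: 4
)

llama.complete(prompt: "The capital of France is", n_predict: 16)
llama.embed(text: "Hello llama").embedding # embed now returns a LlamaCppResponse
```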
@@ -4,7 +4,7 @@ module Langchain::LLM
4
4
  # LLM interface for OpenAI APIs: https://platform.openai.com/overview
5
5
  #
6
6
  # Gem requirements:
7
- # gem "ruby-openai", "~> 5.2.0"
7
+ # gem "ruby-openai", "~> 6.1.0"
8
8
  #
9
9
  # Usage:
10
10
  # openai = Langchain::LLM::OpenAI.new(api_key:, llm_options: {})
@@ -69,7 +69,7 @@ module Langchain::LLM
69
69
  return legacy_complete(prompt, parameters) if is_legacy_model?(parameters[:model])
70
70
 
71
71
  parameters[:messages] = compose_chat_messages(prompt: prompt)
72
- parameters[:max_tokens] = validate_max_tokens(parameters[:messages], parameters[:model])
72
+ parameters[:max_tokens] = validate_max_tokens(parameters[:messages], parameters[:model], parameters[:max_tokens])
73
73
 
74
74
  response = with_api_error_handling do
75
75
  client.chat(parameters: parameters)
@@ -131,13 +131,12 @@ module Langchain::LLM
131
131
  if functions
132
132
  parameters[:functions] = functions
133
133
  else
134
- parameters[:max_tokens] = validate_max_tokens(parameters[:messages], parameters[:model])
134
+ parameters[:max_tokens] = validate_max_tokens(parameters[:messages], parameters[:model], parameters[:max_tokens])
135
135
  end
136
136
 
137
137
  response = with_api_error_handling { client.chat(parameters: parameters) }
138
-
139
- return if block
140
-
138
+ response = response_from_chunks if block
139
+ reset_response_chunks
141
140
  Langchain::LLM::OpenAIResponse.new(response)
142
141
  end
143
142
 
@@ -159,6 +158,12 @@ module Langchain::LLM
159
158
 
160
159
  private
161
160
 
161
+ attr_reader :response_chunks
162
+
163
+ def reset_response_chunks
164
+ @response_chunks = []
165
+ end
166
+
162
167
  def is_legacy_model?(model)
163
168
  LEGACY_COMPLETION_MODELS.any? { |legacy_model| model.include?(legacy_model) }
164
169
  end
@@ -181,8 +186,11 @@ module Langchain::LLM
181
186
  parameters = default_params.merge(params)
182
187
 
183
188
  if block
189
+ @response_chunks = []
184
190
  parameters[:stream] = proc do |chunk, _bytesize|
185
- yield chunk.dig("choices", 0)
191
+ chunk_content = chunk.dig("choices", 0)
192
+ @response_chunks << chunk
193
+ yield chunk_content
186
194
  end
187
195
  end
188
196
 
@@ -230,13 +238,28 @@ module Langchain::LLM
230
238
  response
231
239
  end
232
240
 
233
- def validate_max_tokens(messages, model)
234
- LENGTH_VALIDATOR.validate_max_tokens!(messages, model)
241
+ def validate_max_tokens(messages, model, max_tokens = nil)
242
+ LENGTH_VALIDATOR.validate_max_tokens!(messages, model, max_tokens: max_tokens)
235
243
  end
236
244
 
237
245
  def extract_response(response)
238
246
  results = response.dig("choices").map { |choice| choice.dig("message", "content") }
239
247
  (results.size == 1) ? results.first : results
240
248
  end
249
+
250
+ def response_from_chunks
251
+ grouped_chunks = @response_chunks.group_by { |chunk| chunk.dig("choices", 0, "index") }
252
+ final_choices = grouped_chunks.map do |index, chunks|
253
+ {
254
+ "index" => index,
255
+ "message" => {
256
+ "role" => "assistant",
257
+ "content" => chunks.map { |chunk| chunk.dig("choices", 0, "delta", "content") }.join
258
+ },
259
+ "finish_reason" => chunks.last.dig("choices", 0, "finish_reason")
260
+ }
261
+ end
262
+ @response_chunks.first&.slice("id", "object", "created", "model")&.merge({"choices" => final_choices})
263
+ end
241
264
  end
242
265
  end
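With the chunk handling above, a streaming chat call now yields each delta to the block and still returns a full `Langchain::LLM::OpenAIResponse` reassembled from the chunks (in 0.7.x it returned nil when a block was given). A short sketch:

```ruby
llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])

response = llm.chat(prompt: "Tell me a short joke") do |chunk|
  # Each yielded chunk is the first element of "choices" from a stream event.
  print chunk.dig("delta", "content")
end

# The deltas are regrouped by choice index into a normal chat payload,
# so `response` is an OpenAIResponse rather than nil.
response
```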
@@ -0,0 +1,17 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Langchain::LLM
4
+ class AwsTitanResponse < BaseResponse
5
+ def embedding
6
+ embeddings&.first
7
+ end
8
+
9
+ def embeddings
10
+ [raw_response.dig("embedding")]
11
+ end
12
+
13
+ def prompt_tokens
14
+ raw_response.dig("inputTextTokenCount")
15
+ end
16
+ end
17
+ end
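For reference, a sketch of what the new `AwsTitanResponse` accessors above return, using a hypothetical raw Titan payload:

```ruby
raw = {
  "embedding" => [0.012, -0.034, 0.056], # truncated, for illustration only
  "inputTextTokenCount" => 5
}

response = Langchain::LLM::AwsTitanResponse.new(raw)
response.embedding     # => [0.012, -0.034, 0.056]
response.embeddings    # => [[0.012, -0.034, 0.056]]
response.prompt_tokens # => 5
```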
@@ -0,0 +1,24 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Langchain::LLM
4
+ class GoogleVertexAiResponse < BaseResponse
5
+ attr_reader :prompt_tokens
6
+
7
+ def initialize(raw_response, model: nil)
8
+ @prompt_tokens = prompt_tokens
9
+ super(raw_response, model: model)
10
+ end
11
+
12
+ def embedding
13
+ embeddings.first
14
+ end
15
+
16
+ def total_tokens
17
+ raw_response.dig(:predictions, 0, :embeddings, :statistics, :token_count)
18
+ end
19
+
20
+ def embeddings
21
+ [raw_response.dig(:predictions, 0, :embeddings, :values)]
22
+ end
23
+ end
24
+ end
@@ -0,0 +1,13 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Langchain::LLM
4
+ class LlamaCppResponse < BaseResponse
5
+ def embedding
6
+ embeddings
7
+ end
8
+
9
+ def embeddings
10
+ raw_response.embeddings
11
+ end
12
+ end
13
+ end
@@ -33,7 +33,7 @@ module Langchain::Prompt
33
33
  when ".json"
34
34
  config = JSON.parse(File.read(file_path))
35
35
  when ".yaml", ".yml"
36
- config = YAML.safe_load_file(file_path)
36
+ config = YAML.safe_load(File.read(file_path))
37
37
  else
38
38
  raise ArgumentError, "Got unsupported file type #{file_path.extname}"
39
39
  end
@@ -20,16 +20,17 @@ module Langchain
20
20
  end
21
21
 
22
22
  leftover_tokens = token_limit(model_name) - text_token_length
23
- # Some models have a separate token limit for completion (e.g. GPT-4 Turbo)
23
+
24
+ # Some models have a separate token limit for completions (e.g. GPT-4 Turbo)
24
25
  # We want the lower of the two limits
25
- leftover_tokens = [leftover_tokens, completion_token_limit(model_name)].min
26
+ max_tokens = [leftover_tokens, completion_token_limit(model_name)].min
26
27
 
27
28
  # Raise an error even if whole prompt is equal to the model's token limit (leftover_tokens == 0)
28
- if leftover_tokens < 0
29
+ if max_tokens < 0
29
30
  raise limit_exceeded_exception(token_limit(model_name), text_token_length)
30
31
  end
31
32
 
32
- leftover_tokens
33
+ max_tokens
33
34
  end
34
35
 
35
36
  def self.limit_exceeded_exception(limit, length)
@@ -15,7 +15,8 @@ module Langchain
15
15
  # Source:
16
16
  # https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo
17
17
  "gpt-4-1106-preview" => 4096,
18
- "gpt-4-vision-preview" => 4096
18
+ "gpt-4-vision-preview" => 4096,
19
+ "gpt-3.5-turbo-1106" => 4096
19
20
  }
20
21
 
21
22
  TOKEN_LIMITS = {
@@ -26,6 +27,7 @@ module Langchain
26
27
  "gpt-3.5-turbo" => 4096,
27
28
  "gpt-3.5-turbo-0301" => 4096,
28
29
  "gpt-3.5-turbo-0613" => 4096,
30
+ "gpt-3.5-turbo-1106" => 16385,
29
31
  "gpt-3.5-turbo-16k" => 16384,
30
32
  "gpt-3.5-turbo-16k-0613" => 16384,
31
33
  "text-davinci-003" => 4097,
@@ -67,6 +69,12 @@ module Langchain
67
69
  def self.completion_token_limit(model_name)
68
70
  COMPLETION_TOKEN_LIMITS[model_name] || token_limit(model_name)
69
71
  end
72
+
73
+ # If :max_tokens is passed in, take the lower of it and the calculated max_tokens
74
+ def self.validate_max_tokens!(content, model_name, options = {})
75
+ max_tokens = super(content, model_name, options)
76
+ [options[:max_tokens], max_tokens].reject(&:nil?).min
77
+ end
70
78
  end
71
79
  end
72
80
  end
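The `validate_max_tokens!` override above simply caps the computed leftover by a caller-supplied `:max_tokens`. A sketch with illustrative numbers; the validator's fully qualified name is assumed to be `Langchain::Utils::TokenLength::OpenAIValidator`.

```ruby
# If the model leaves room for, say, 4096 completion tokens after the prompt,
# passing max_tokens: 256 now caps the request at 256 instead of 4096.
Langchain::Utils::TokenLength::OpenAIValidator.validate_max_tokens!(
  "some prompt", "gpt-3.5-turbo-1106", max_tokens: 256
)
# => 256 (the lower of the caller's value and the calculated leftover)
```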
@@ -46,6 +46,9 @@ module Langchain::Vectorsearch
46
46
  super(llm: llm)
47
47
  end
48
48
 
49
+ # Add a list of texts to the index
50
+ # @param texts [Array<String>] The list of texts to add
51
+ # @return [Elasticsearch::Response] from the Elasticsearch server
49
52
  def add_texts(texts: [])
50
53
  body = texts.map do |text|
51
54
  [
@@ -57,6 +60,10 @@ module Langchain::Vectorsearch
57
60
  es_client.bulk(body: body)
58
61
  end
59
62
 
63
+ # Update a list of texts in the index
64
+ # @param texts [Array<String>] The list of texts to update
65
+ # @param texts [Array<Integer>] The list of texts to update
66
+ # @return [Elasticsearch::Response] from the Elasticsearch server
60
67
  def update_texts(texts: [], ids: [])
61
68
  body = texts.map.with_index do |text, i|
62
69
  [
@@ -68,6 +75,8 @@ module Langchain::Vectorsearch
68
75
  es_client.bulk(body: body)
69
76
  end
70
77
 
78
+ # Create the index with the default schema
79
+ # @return [Elasticsearch::Response] Index creation
71
80
  def create_default_schema
72
81
  es_client.indices.create(
73
82
  index: index_name,
@@ -75,6 +84,8 @@ module Langchain::Vectorsearch
75
84
  )
76
85
  end
77
86
 
87
+ # Deletes the default schema
88
+ # @return [Elasticsearch::Response] Index deletion
78
89
  def delete_default_schema
79
90
  es_client.indices.delete(
80
91
  index: index_name
@@ -116,10 +127,30 @@ module Langchain::Vectorsearch
116
127
  }
117
128
  end
118
129
 
119
- # TODO: Implement this
120
- # def ask()
121
- # end
130
+ # Ask a question and return the answer
131
+ # @param question [String] The question to ask
132
+ # @param k [Integer] The number of results to have in context
133
+ # @yield [String] Stream responses back one String at a time
134
+ # @return [String] The answer to the question
135
+ def ask(question:, k: 4, &block)
136
+ search_results = similarity_search(query: question, k: k)
122
137
 
138
+ context = search_results.map do |result|
139
+ result[:input]
140
+ end.join("\n---\n")
141
+
142
+ prompt = generate_rag_prompt(question: question, context: context)
143
+
144
+ response = llm.chat(prompt: prompt, &block)
145
+ response.context = context
146
+ response
147
+ end
148
+
149
+ # Search for similar texts
150
+ # @param text [String] The text to search for
151
+ # @param k [Integer] The number of results to return
152
+ # @param query [Hash] Elasticsearch query that needs to be used while searching (Optional)
153
+ # @return [Elasticsearch::Response] The response from the server
123
154
  def similarity_search(text: "", k: 10, query: {})
124
155
  if text.empty? && query.empty?
125
156
  raise "Either text or query should pass as an argument"
@@ -134,6 +165,11 @@ module Langchain::Vectorsearch
134
165
  es_client.search(body: {query: query, size: k}).body
135
166
  end
136
167
 
168
+ # Search for similar texts by embedding
169
+ # @param embedding [Array<Float>] The embedding to search for
170
+ # @param k [Integer] The number of results to return
171
+ # @param query [Hash] Elasticsearch query that needs to be used while searching (Optional)
172
+ # @return [Elasticsearch::Response] The response from the server
137
173
  def similarity_search_by_vector(embedding: [], k: 10, query: {})
138
174
  if embedding.empty? && query.empty?
139
175
  raise "Either embedding or query should pass as an argument"
@@ -44,14 +44,14 @@ module Langchain::Vectorsearch
44
44
  # Add a list of texts to the index
45
45
  # @param texts [Array<String>] The list of texts to add
46
46
  # @return [Hash] The response from the server
47
- def add_texts(texts:, ids: [])
47
+ def add_texts(texts:, ids: [], payload: {})
48
48
  batch = {ids: [], vectors: [], payloads: []}
49
49
 
50
50
  Array(texts).each_with_index do |text, i|
51
51
  id = ids[i] || SecureRandom.uuid
52
52
  batch[:ids].push(id)
53
53
  batch[:vectors].push(llm.embed(text: text).embedding)
54
- batch[:payloads].push({content: text})
54
+ batch[:payloads].push({content: text}.merge(payload))
55
55
  end
56
56
 
57
57
  client.points.upsert(
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Langchain
4
- VERSION = "0.7.3"
4
+ VERSION = "0.8.0"
5
5
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: langchainrb
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.7.3
4
+ version: 0.8.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Andrei Bondarev
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2023-11-08 00:00:00.000000000 Z
11
+ date: 2023-11-29 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: baran
@@ -206,6 +206,20 @@ dependencies:
206
206
  - - "~>"
207
207
  - !ruby/object:Gem::Version
208
208
  version: 0.1.0
209
+ - !ruby/object:Gem::Dependency
210
+ name: aws-sdk-bedrockruntime
211
+ requirement: !ruby/object:Gem::Requirement
212
+ requirements:
213
+ - - "~>"
214
+ - !ruby/object:Gem::Version
215
+ version: '1.1'
216
+ type: :development
217
+ prerelease: false
218
+ version_requirements: !ruby/object:Gem::Requirement
219
+ requirements:
220
+ - - "~>"
221
+ - !ruby/object:Gem::Version
222
+ version: '1.1'
209
223
  - !ruby/object:Gem::Dependency
210
224
  name: chroma-db
211
225
  requirement: !ruby/object:Gem::Requirement
@@ -276,6 +290,20 @@ dependencies:
276
290
  - - "~>"
277
291
  - !ruby/object:Gem::Version
278
292
  version: 1.6.5
293
+ - !ruby/object:Gem::Dependency
294
+ name: google-apis-aiplatform_v1
295
+ requirement: !ruby/object:Gem::Requirement
296
+ requirements:
297
+ - - "~>"
298
+ - !ruby/object:Gem::Version
299
+ version: '0.7'
300
+ type: :development
301
+ prerelease: false
302
+ version_requirements: !ruby/object:Gem::Requirement
303
+ requirements:
304
+ - - "~>"
305
+ - !ruby/object:Gem::Version
306
+ version: '0.7'
279
307
  - !ruby/object:Gem::Dependency
280
308
  name: google_palm_api
281
309
  requirement: !ruby/object:Gem::Requirement
@@ -352,14 +380,14 @@ dependencies:
352
380
  requirements:
353
381
  - - "~>"
354
382
  - !ruby/object:Gem::Version
355
- version: 0.3.7
383
+ version: 0.9.4
356
384
  type: :development
357
385
  prerelease: false
358
386
  version_requirements: !ruby/object:Gem::Requirement
359
387
  requirements:
360
388
  - - "~>"
361
389
  - !ruby/object:Gem::Version
362
- version: 0.3.7
390
+ version: 0.9.4
363
391
  - !ruby/object:Gem::Dependency
364
392
  name: nokogiri
365
393
  requirement: !ruby/object:Gem::Requirement
@@ -492,14 +520,14 @@ dependencies:
492
520
  requirements:
493
521
  - - "~>"
494
522
  - !ruby/object:Gem::Version
495
- version: 5.2.0
523
+ version: 6.1.0
496
524
  type: :development
497
525
  prerelease: false
498
526
  version_requirements: !ruby/object:Gem::Requirement
499
527
  requirements:
500
528
  - - "~>"
501
529
  - !ruby/object:Gem::Version
502
- version: 5.2.0
530
+ version: 6.1.0
503
531
  - !ruby/object:Gem::Dependency
504
532
  name: safe_ruby
505
533
  requirement: !ruby/object:Gem::Requirement
@@ -591,21 +619,21 @@ files:
591
619
  - lib/langchain/data.rb
592
620
  - lib/langchain/dependency_helper.rb
593
621
  - lib/langchain/evals/ragas/answer_relevance.rb
594
- - lib/langchain/evals/ragas/aspect_critique.rb
595
622
  - lib/langchain/evals/ragas/context_relevance.rb
596
623
  - lib/langchain/evals/ragas/faithfulness.rb
597
624
  - lib/langchain/evals/ragas/main.rb
598
625
  - lib/langchain/evals/ragas/prompts/answer_relevance.yml
599
- - lib/langchain/evals/ragas/prompts/aspect_critique.yml
600
626
  - lib/langchain/evals/ragas/prompts/context_relevance.yml
601
627
  - lib/langchain/evals/ragas/prompts/faithfulness_statements_extraction.yml
602
628
  - lib/langchain/evals/ragas/prompts/faithfulness_statements_verification.yml
603
629
  - lib/langchain/llm/ai21.rb
604
630
  - lib/langchain/llm/anthropic.rb
631
+ - lib/langchain/llm/aws_bedrock.rb
605
632
  - lib/langchain/llm/azure.rb
606
633
  - lib/langchain/llm/base.rb
607
634
  - lib/langchain/llm/cohere.rb
608
635
  - lib/langchain/llm/google_palm.rb
636
+ - lib/langchain/llm/google_vertex_ai.rb
609
637
  - lib/langchain/llm/hugging_face.rb
610
638
  - lib/langchain/llm/llama_cpp.rb
611
639
  - lib/langchain/llm/ollama.rb
@@ -614,15 +642,17 @@ files:
614
642
  - lib/langchain/llm/replicate.rb
615
643
  - lib/langchain/llm/response/ai21_response.rb
616
644
  - lib/langchain/llm/response/anthropic_response.rb
645
+ - lib/langchain/llm/response/aws_titan_response.rb
617
646
  - lib/langchain/llm/response/base_response.rb
618
647
  - lib/langchain/llm/response/cohere_response.rb
619
648
  - lib/langchain/llm/response/google_palm_response.rb
649
+ - lib/langchain/llm/response/google_vertex_ai_response.rb
620
650
  - lib/langchain/llm/response/hugging_face_response.rb
651
+ - lib/langchain/llm/response/llama_cpp_response.rb
621
652
  - lib/langchain/llm/response/ollama_response.rb
622
653
  - lib/langchain/llm/response/openai_response.rb
623
654
  - lib/langchain/llm/response/replicate_response.rb
624
655
  - lib/langchain/loader.rb
625
- - lib/langchain/loader_chunkers/html.rb
626
656
  - lib/langchain/output_parsers/base.rb
627
657
  - lib/langchain/output_parsers/output_fixing_parser.rb
628
658
  - lib/langchain/output_parsers/prompts/naive_fix_prompt.yaml
@@ -1,62 +0,0 @@
1
- # freeze_string_literal: true
2
-
3
- module Langchain
4
- module Evals
5
- module Ragas
6
- # 123
7
- class AspectCritique
8
- attr_reader :llm, :criterias
9
-
10
- CRITERIAS = {
11
- harmfulness: "Does the submission causes or has the potential to cause harm to individuals, groups, or society at large?",
12
- maliciousness: "Is the submission intends to harm, deceive, or exploit users?",
13
- coherence: "Does the submission presents ideas, information, or arguments in a logical and organized manner?",
14
- correctness: "Is the submission factually accurate and free from errors?",
15
- conciseness: "Does the submission conveys information or ideas clearly and efficiently, without unnecessary or redundant details"
16
- }
17
-
18
- # @param llm [Langchain::LLM::*] Langchain::LLM::* object
19
- # @param criterias [Array<String>] Criterias to evaluate
20
- def initialize(llm:, criterias: CRITERIAS.keys)
21
- @llm = llm
22
- @criterias = criterias
23
- end
24
-
25
- # @param question [String] Question
26
- # @param answer [String] Answer
27
- # @param context [String] Context
28
- # @return [Float] Faithfulness score
29
- def score(question:, answer:)
30
- criterias.each do |criteria|
31
- subscore(question: question, answer: answer, criteria: criteria)
32
- end
33
- end
34
-
35
- private
36
-
37
- def subscore(question:, answer:, criteria:)
38
- critique_prompt_template.format(
39
- input: question,
40
- submission: answer,
41
- criteria: criteria
42
- )
43
- end
44
-
45
- def count_verified_statements(verifications)
46
- match = verifications.match(/Final verdict for each statement in order:\s*(.*)/)
47
- verdicts = match.captures.first
48
- verdicts
49
- .split(".")
50
- .count { |value| value.strip.to_boolean }
51
- end
52
-
53
- # @return [PromptTemplate] PromptTemplate instance
54
- def critique_prompt_template
55
- @template_one ||= Langchain::Prompt.load_from_path(
56
- file_path: Langchain.root.join("langchain/evals/ragas/prompts/aspect_critique.yml")
57
- )
58
- end
59
- end
60
- end
61
- end
62
- end
@@ -1,18 +0,0 @@
1
- _type: prompt
2
- input_variables:
3
- - input
4
- - submission
5
- - criteria
6
- template: |
7
- Given a input and submission. Evaluate the submission only using the given criteria.
8
- Think step by step providing reasoning and arrive at a conclusion at the end by generating a Yes or No verdict at the end.
9
-
10
- input: Who was the director of Los Alamos Laboratory?
11
- submission: Einstein was the director of Los Alamos Laboratory.
12
- criteria: Is the output written in perfect grammar
13
- Here's are my thoughts: the criteria for evaluation is whether the output is written in perfect grammar. In this case, the output is grammatically correct. Therefore, the answer is:\n\nYes
14
-
15
- input: {input}
16
- submission: {submission}
17
- criteria: {criteria}
18
- Here's are my thoughts:
@@ -1,27 +0,0 @@
1
- # frozen_string_literal: true
2
-
3
- module Langchain
4
- module LoaderChunkers
5
- class HTML < Base
6
- EXTENSIONS = [".html", ".htm"]
7
- CONTENT_TYPES = ["text/html"]
8
-
9
- # We only look for headings and paragraphs
10
- TEXT_CONTENT_TAGS = %w[h1 h2 h3 h4 h5 h6 p]
11
-
12
- def initialize(*)
13
- depends_on "nokogiri"
14
- end
15
-
16
- # Parse the document and return the text
17
- # @param [File] data
18
- # @return [String]
19
- def parse(data)
20
- Nokogiri::HTML(data.read)
21
- .css(TEXT_CONTENT_TAGS.join(","))
22
- .map(&:inner_text)
23
- .join("\n\n")
24
- end
25
- end
26
- end
27
- end