langchainrb 0.7.2 → 0.7.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +10 -0
- data/README.md +158 -320
- data/lib/langchain/agent/agents.md +54 -0
- data/lib/langchain/llm/aws_bedrock.rb +216 -0
- data/lib/langchain/llm/openai.rb +26 -9
- data/lib/langchain/llm/response/aws_titan_response.rb +17 -0
- data/lib/langchain/llm/response/base_response.rb +3 -0
- data/lib/langchain/utils/token_length/ai21_validator.rb +1 -0
- data/lib/langchain/utils/token_length/base_validator.rb +6 -2
- data/lib/langchain/utils/token_length/cohere_validator.rb +1 -0
- data/lib/langchain/utils/token_length/google_palm_validator.rb +1 -0
- data/lib/langchain/utils/token_length/openai_validator.rb +20 -0
- data/lib/langchain/vectorsearch/chroma.rb +3 -1
- data/lib/langchain/vectorsearch/milvus.rb +3 -1
- data/lib/langchain/vectorsearch/pgvector.rb +3 -1
- data/lib/langchain/vectorsearch/pinecone.rb +3 -1
- data/lib/langchain/vectorsearch/qdrant.rb +3 -1
- data/lib/langchain/vectorsearch/weaviate.rb +4 -2
- data/lib/langchain/version.rb +1 -1
- metadata +19 -5
- data/lib/langchain/evals/ragas/critique.rb +0 -62
- data/lib/langchain/evals/ragas/prompts/critique.yml +0 -18
- data/lib/langchain/loader_chunkers/html.rb +0 -27
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f4c388275b83a0e4260f4ae9271f4c164a8d34ea5ea9585916d91e7e9c17c980
+  data.tar.gz: 8daa400de3ed80bb3fb9c53cc19ef4d56f137c2aa157bd268dbda488d0fca432
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4bae87c050be6a8fa011c1ae5de4b119abac498669f2e63ca1829e11b7b5ecca7610330be670d24fd6cb98c2e2599c593e9922378985efc586d76c124efb865e
+  data.tar.gz: 2a39b084c6a239aeb0de22bfc87629d2f2909b23eabfcf71a835a1f1624d84afe3ea106afdafb8f1fb301b7934d73abc7253c9b8bd3f6c9b170231ebb5af0936
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,15 @@
 ## [Unreleased]

+## [0.7.5] - 2023-11-13
+- Fixes
+
+## [0.7.4] - 2023-11-10
+- AWS Bedrock is available as an LLM provider. Available models from AI21, Cohere, AWS, and Anthropic.
+
+## [0.7.3] - 2023-11-08
+- LLM response passes through the context in RAG cases
+- Fix gpt-4 token length validation
+
 ## [0.7.2] - 2023-11-02
 - Azure OpenAI LLM support
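The 0.7.4 entry above adds AWS Bedrock behind the gem's unified LLM interface. As an illustrative sketch only (the classes below are hypothetical stand-ins, not langchainrb's actual API), duck typing is what lets callers swap one provider for another:

```ruby
# Hypothetical stand-ins for two LLM providers; not the gem's real classes.
class FakeOpenAI
  def complete(prompt:)
    "openai says: #{prompt}"
  end
end

class FakeBedrock
  def complete(prompt:)
    "bedrock says: #{prompt}"
  end
end

# Callers depend only on the shared method signature, so any provider
# object that responds to complete(prompt:) can be dropped in.
def answer(llm, question)
  llm.complete(prompt: question)
end

puts answer(FakeOpenAI.new, "Hi")   # openai says: Hi
puts answer(FakeBedrock.new, "Hi")  # bedrock says: Hi
```

This is the design that the "Supported LLMs and features" table in the README diff below relies on: each provider implements the same `embed`/`complete`/`chat`/`summarize` surface where supported.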
data/README.md
CHANGED
@@ -1,6 +1,6 @@
 💎🔗 Langchain.rb
 ---
-⚡ Building applications
+⚡ Building LLM-powered applications in Ruby ⚡

 For deep Rails integration see: [langchainrb_rails](https://github.com/andreibondarev/langchainrb_rails) gem.

@@ -11,21 +11,24 @@ Available for paid consulting engagements! [Email me](mailto:andrei@sourcelabs.i
 [](http://rubydoc.info/gems/langchainrb)
 [](https://github.com/andreibondarev/langchainrb/blob/main/LICENSE.txt)
 [](https://discord.gg/WDARp7J2n8)
+[](https://twitter.com/rushing_andrei)

-
+## Use Cases
+* Retrieval Augmented Generation (RAG) and vector search
+* Chat bots
+* [AI agents](https://github.com/andreibondarev/langchainrb/tree/main/lib/langchain/agent/agents.md)

-##
+## Table of Contents

 - [Installation](#installation)
 - [Usage](#usage)
-  - [
-  - [
-  - [
-  - [
-  - [
-  - [Loaders](#loaders-)
-  - [Examples](#examples)
+  - [Large Language Models (LLMs)](#large-language-models-llms)
+  - [Prompt Management](#prompt-management)
+  - [Output Parsers](#output-parsers)
+  - [Building RAG](#building-retrieval-augment-generation-rag-system)
+  - [Building chat bots](#building-chat-bots)
   - [Evaluations](#evaluations-evals)
+  - [Examples](#examples)
 - [Logging](#logging)
 - [Development](#development)
 - [Discord](#discord)
@@ -46,264 +49,66 @@ If bundler is not being used to manage dependencies, install the gem by executin
 require "langchain"
 ```

-
-
-| Database | Querying | Storage | Schema Management | Backups | Rails Integration |
-| -------- |:------------------:| -------:| -----------------:| -------:| -----------------:|
-| [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
-| [Hnswlib](https://github.com/nmslib/hnswlib/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | WIP |
-| [Milvus](https://milvus.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
-| [Pinecone](https://www.pinecone.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
-| [Pgvector](https://github.com/pgvector/pgvector) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
-| [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
-| [Weaviate](https://weaviate.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
-
-### Using Vector Search Databases 🔍
-
-Choose the LLM provider you'll be using (OpenAI or Cohere) and retrieve the API key.
-
-Add `gem "weaviate-ruby", "~> 0.8.3"` to your Gemfile.
-
-Pick the vector search database you'll be using and instantiate the client:
-```ruby
-client = Langchain::Vectorsearch::Weaviate.new(
-  url: ENV["WEAVIATE_URL"],
-  api_key: ENV["WEAVIATE_API_KEY"],
-  index_name: "",
-  llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
-)
-
-# You can instantiate any other supported vector search database:
-client = Langchain::Vectorsearch::Chroma.new(...) # `gem "chroma-db", "~> 0.6.0"`
-client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"`
-client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.2"`
-client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
-client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
-client = Langchain::Vectorsearch::Qdrant.new(...) # `gem"qdrant-ruby", "~> 0.9.3"`
-```
-
-```ruby
-# Creating the default schema
-client.create_default_schema
-```
-
-```ruby
-# Store plain texts in your vector search database
-client.add_texts(
-  texts: [
-    "Begin by preheating your oven to 375°F (190°C). Prepare four boneless, skinless chicken breasts by cutting a pocket into the side of each breast, being careful not to cut all the way through. Season the chicken with salt and pepper to taste. In a large skillet, melt 2 tablespoons of unsalted butter over medium heat. Add 1 small diced onion and 2 minced garlic cloves, and cook until softened, about 3-4 minutes. Add 8 ounces of fresh spinach and cook until wilted, about 3 minutes. Remove the skillet from heat and let the mixture cool slightly.",
-    "In a bowl, combine the spinach mixture with 4 ounces of softened cream cheese, 1/4 cup of grated Parmesan cheese, 1/4 cup of shredded mozzarella cheese, and 1/4 teaspoon of red pepper flakes. Mix until well combined. Stuff each chicken breast pocket with an equal amount of the spinach mixture. Seal the pocket with a toothpick if necessary. In the same skillet, heat 1 tablespoon of olive oil over medium-high heat. Add the stuffed chicken breasts and sear on each side for 3-4 minutes, or until golden brown."
-  ]
-)
-```
-```ruby
-# Store the contents of your files in your vector search database
-my_pdf = Langchain.root.join("path/to/my.pdf")
-my_text = Langchain.root.join("path/to/my.txt")
-my_docx = Langchain.root.join("path/to/my.docx")
-
-client.add_data(paths: [my_pdf, my_text, my_docx])
-```
-```ruby
-# Retrieve similar documents based on the query string passed in
-client.similarity_search(
-  query:,
-  k: # number of results to be retrieved
-)
-```
-```ruby
-# Retrieve similar documents based on the query string passed in via the [HyDE technique](https://arxiv.org/abs/2212.10496)
-client.similarity_search_with_hyde()
-```
-```ruby
-# Retrieve similar documents based on the embedding passed in
-client.similarity_search_by_vector(
-  embedding:,
-  k: # number of results to be retrieved
-)
-```
-```ruby
-# Q&A-style querying based on the question passed in
-client.ask(
-  question:
-)
-```
-
-## Integrating Vector Search into ActiveRecord models
-```ruby
-class Product < ActiveRecord::Base
-  vectorsearch provider: Langchain::Vectorsearch::Qdrant.new(
-    api_key: ENV["QDRANT_API_KEY"],
-    url: ENV["QDRANT_URL"],
-    index_name: "Products",
-    llm: Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"])
-  )
+## Large Language Models (LLMs)
+Langchain.rb wraps all supported LLMs in a unified interface allowing you to easily swap out and test out different models.

-
-
-
+#### Supported LLMs and features:
+| LLM providers | embed() | complete() | chat() | summarize() | Notes |
+| -------- |:------------------:| :-------: | :-----------------: | :-------: | :----------------- |
+| [OpenAI](https://openai.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | ❌ | Including Azure OpenAI |
+| [AI21](https://ai21.com/) | ❌ | :white_check_mark: | ❌ | :white_check_mark: | |
+| [Anthropic](https://milvus.io/) | ❌ | :white_check_mark: | ❌ | ❌ | |
+| [AWS Bedrock](https://aws.amazon.com/bedrock) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | Provides AWS, Cohere, AI21, Antropic and Stability AI models |
+| [Cohere](https://www.pinecone.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
+| [GooglePalm](https://ai.google/discover/palm2/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
+| [HuggingFace](https://huggingface.co/) | :white_check_mark: | ❌ | ❌ | ❌ | |
+| [Ollama](https://ollama.ai/) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | |
+| [Replicate](https://replicate.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |

-
-```ruby
-# Retrieve similar products based on the query string passed in
-Product.similarity_search(
-  query:,
-  k: # number of results to be retrieved
-)
-```
-```ruby
-# Q&A-style querying based on the question passed in
-Product.ask(
-  question:
-)
-```
-
-Additional info [here](https://github.com/andreibondarev/langchainrb/blob/main/lib/langchain/active_record/hooks.rb#L10-L38).
-
-### Using Standalone LLMs 🗣️
-
-Add `gem "ruby-openai", "~> 4.0.0"` to your Gemfile.
+#### Using standalone LLMs:

 #### OpenAI
-```ruby
-openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
-```
-You can pass additional parameters to the constructor, it will be passed to the OpenAI client:
-```ruby
-openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"], llm_options: {uri_base: "http://localhost:1234"}) )
-```
-```ruby
-openai.embed(text: "foo bar")
-```
-```ruby
-openai.complete(prompt: "What is the meaning of life?")
-```
-
-##### Open AI Function calls support
-
-Conversation support
-
-```ruby
-chat = Langchain::Conversation.new(llm: openai)
-```
-```ruby
-chat.set_context("You are the climate bot")
-chat.set_functions(functions)
-```

-qdrant:
-
-```ruby
-client.llm.functions = functions
-```
-
-#### Azure
 Add `gem "ruby-openai", "~> 5.2.0"` to your Gemfile.

 ```ruby
-
-  api_key: ENV["AZURE_API_KEY"],
-  llm_options: {
-    api_type: :azure,
-    api_version: "2023-03-15-preview"
-  },
-  embedding_deployment_url: ENV.fetch("AZURE_EMBEDDING_URI"),
-  chat_deployment_url: ENV.fetch("AZURE_CHAT_URI")
-)
-```
-where `AZURE_EMBEDDING_URI` is e.g. `https://custom-domain.openai.azure.com/openai/deployments/gpt-35-turbo` and `AZURE_CHAT_URI` is e.g. `https://custom-domain.openai.azure.com/openai/deployments/ada-2`
-
-You can pass additional parameters to the constructor, it will be passed to the Azure client:
-```ruby
-azure = Langchain::LLM::Azure.new(
-  api_key: ENV["AZURE_API_KEY"],
-  llm_options: {
-    api_type: :azure,
-    api_version: "2023-03-15-preview",
-    request_timeout: 240 # Optional
-  },
-  embedding_deployment_url: ENV.fetch("AZURE_EMBEDDING_URI"),
-  chat_deployment_url: ENV.fetch("AZURE_CHAT_URI")
-)
-```
-```ruby
-azure.embed(text: "foo bar")
-```
-```ruby
-azure.complete(prompt: "What is the meaning of life?")
-```
-
-#### Cohere
-Add `gem "cohere-ruby", "~> 0.9.6"` to your Gemfile.
-
-```ruby
-cohere = Langchain::LLM::Cohere.new(api_key: ENV["COHERE_API_KEY"])
-```
-```ruby
-cohere.embed(text: "foo bar")
-```
-```ruby
-cohere.complete(prompt: "What is the meaning of life?")
-```
-
-#### HuggingFace
-Add `gem "hugging-face", "~> 0.3.2"` to your Gemfile.
-```ruby
-hugging_face = Langchain::LLM::HuggingFace.new(api_key: ENV["HUGGING_FACE_API_KEY"])
-```
-
-#### Replicate
-Add `gem "replicate-ruby", "~> 0.2.2"` to your Gemfile.
-```ruby
-replicate = Langchain::LLM::Replicate.new(api_key: ENV["REPLICATE_API_KEY"])
+llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
 ```
-
-#### Google PaLM (Pathways Language Model)
-Add `"google_palm_api", "~> 0.1.3"` to your Gemfile.
+You can pass additional parameters to the constructor, it will be passed to the OpenAI client:
 ```ruby
-
+llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"], llm_options: { ... })
 ```

-
-Add `gem "ai21", "~> 0.2.1"` to your Gemfile.
+Generate vector embeddings:
 ```ruby
-
+llm.embed(text: "foo bar")
 ```

-
-Add `gem "anthropic", "~> 0.1.0"` to your Gemfile.
+Generate a text completion:
 ```ruby
-
+llm.complete(prompt: "What is the meaning of life?")
 ```

+Generate a chat completion:
 ```ruby
-
+llm.chat(prompt: "Hey! How are you?")
 ```

-
+Summarize the text:
 ```ruby
-
+llm.complete(text: "...")
 ```

+You can use any other LLM by invoking the same interface:
 ```ruby
-
-```
-```ruby
-ollama.embed(text: "Hello world!")
+llm = Langchain::LLM::GooglePalm.new(...)
 ```

-###
+### Prompt Management

 #### Prompt Templates

-Create a prompt with
-
-```ruby
-prompt = Langchain::Prompt::PromptTemplate.new(template: "Tell me a {adjective} joke.", input_variables: ["adjective"])
-prompt.format(adjective: "funny") # "Tell me a funny joke."
-```
-
-Create a prompt with multiple input variables:
+Create a prompt with input variables:

 ```ruby
 prompt = Langchain::Prompt::PromptTemplate.new(template: "Tell me a {adjective} joke about {content}.", input_variables: ["adjective", "content"])
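The `PromptTemplate` usage in the hunk above fills `{variable}` placeholders with supplied values. A minimal, standalone sketch of that substitution in plain Ruby (independent of the gem's actual implementation):

```ruby
# Substitute {name} placeholders in a template string with values from a
# hash, raising KeyError if a variable is missing — roughly what a
# prompt template's format step does.
def format_prompt(template, vars)
  template.gsub(/\{(\w+)\}/) { vars.fetch(Regexp.last_match(1).to_sym).to_s }
end

formatted = format_prompt(
  "Tell me a {adjective} joke about {content}.",
  adjective: "funny", content: "parrots"
)
puts formatted # Tell me a funny joke about parrots.
```

Using `fetch` rather than `[]` makes a missing input variable fail loudly instead of silently producing an empty slot.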
@@ -384,7 +189,8 @@ prompt = Langchain::Prompt.load_from_path(file_path: "spec/fixtures/prompt/promp
 prompt.input_variables #=> ["adjective", "content"]
 ```

-
+
+### Output Parsers

 Parse LLM text responses into structured output, such as JSON.

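The Output Parsers section above is about turning free-form LLM text into structured data. A hedged, standalone sketch of the core idea (the gem's parsers are schema-driven; this toy version just extracts and parses the first `{...}` span it finds):

```ruby
require "json"

# Pull a JSON object out of surrounding LLM chatter and parse it.
# Raises ArgumentError when no JSON object is present.
def parse_llm_json(text)
  json = text[/\{.*\}/m] or raise ArgumentError, "no JSON object found"
  JSON.parse(json)
end

reply = 'Sure! Here is your data: {"language": "Ruby", "year": 1995}'
parse_llm_json(reply) # => {"language"=>"Ruby", "year"=>1995}
```

Real output parsers go further: they validate the parsed hash against a schema and, as in the `fix_parser` example referenced above, can ask the LLM to repair malformed output.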
@@ -484,93 +290,147 @@ fix_parser.parse(llm_response)

 See [here](https://github.com/andreibondarev/langchainrb/tree/main/examples/create_and_manage_prompt_templates_using_structured_output_parser.rb) for a concrete example

-
-
+## Building Retrieval Augment Generation (RAG) system
+RAG is a methodology that assists LLMs generate accurate and up-to-date information.
+A typical RAG workflow follows the 3 steps below:
+1. Relevant knowledge (or data) is retrieved from the knowledge base (typically a vector search DB)
+2. A prompt, containing retrieved knowledge above, is constructed.
+3. LLM receives the prompt above to generate a text completion.
+Most common use-case for a RAG system is powering Q&A systems where users pose natural language questions and receive answers in natural language.

-
+### Vector search databases
+Langchain.rb provides a convenient unified interface on top of supported vectorsearch databases that make it easy to configure your index, add data, query and retrieve from it.

-
+#### Supported vector search databases and features:

-
-
-
+| Database | Open-source | Cloud offering |
+| -------- |:------------------:| :------------: |
+| [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: |
+| [Hnswlib](https://github.com/nmslib/hnswlib/) | :white_check_mark: | ❌ |
+| [Milvus](https://milvus.io/) | :white_check_mark: | :white_check_mark: Zilliz Cloud |
+| [Pinecone](https://www.pinecone.io/) | ❌ | :white_check_mark: |
+| [Pgvector](https://github.com/pgvector/pgvector) | :white_check_mark: | :white_check_mark: |
+| [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: |
+| [Weaviate](https://weaviate.io/) | :white_check_mark: | :white_check_mark: |

-
+### Using Vector Search Databases 🔍

-
-  llm: openai,
-  tools: [search_tool, calculator]
-)
-```
+Pick the vector search database you'll be using, add the gem dependency and instantiate the client:
 ```ruby
-
-#=> "Approximately 2,945 soccer fields would be needed to cover the distance between NYC and DC in a straight line."
+gem "weaviate-ruby", "~> 0.8.9"
 ```

-
+Choose and instantiate the LLM provider you'll be using to generate embeddings
+```ruby
+llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
+```

-
+```ruby
+client = Langchain::Vectorsearch::Weaviate.new(
+  url: ENV["WEAVIATE_URL"],
+  api_key: ENV["WEAVIATE_API_KEY"],
+  index_name: "Documents",
+  llm: llm
+)
+```

+You can instantiate any other supported vector search database:
 ```ruby
-
+client = Langchain::Vectorsearch::Chroma.new(...) # `gem "chroma-db", "~> 0.6.0"`
+client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"`
+client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.2"`
+client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
+client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
+client = Langchain::Vectorsearch::Qdrant.new(...) # `gem"qdrant-ruby", "~> 0.9.3"`
+```

-
+Create the default schema:
+```ruby
+client.create_default_schema
 ```
+
+Add plain text data to your vector search database:
 ```ruby
-
-
+client.add_texts(
+  texts: [
+    "Begin by preheating your oven to 375°F (190°C). Prepare four boneless, skinless chicken breasts by cutting a pocket into the side of each breast, being careful not to cut all the way through. Season the chicken with salt and pepper to taste. In a large skillet, melt 2 tablespoons of unsalted butter over medium heat. Add 1 small diced onion and 2 minced garlic cloves, and cook until softened, about 3-4 minutes. Add 8 ounces of fresh spinach and cook until wilted, about 3 minutes. Remove the skillet from heat and let the mixture cool slightly.",
+    "In a bowl, combine the spinach mixture with 4 ounces of softened cream cheese, 1/4 cup of grated Parmesan cheese, 1/4 cup of shredded mozzarella cheese, and 1/4 teaspoon of red pepper flakes. Mix until well combined. Stuff each chicken breast pocket with an equal amount of the spinach mixture. Seal the pocket with a toothpick if necessary. In the same skillet, heat 1 tablespoon of olive oil over medium-high heat. Add the stuffed chicken breasts and sear on each side for 3-4 minutes, or until golden brown."
+  ]
+)
 ```

-
-
+Or use the file parsers to load, parse and index data into your database:
+```ruby
+my_pdf = Langchain.root.join("path/to/my.pdf")
+my_text = Langchain.root.join("path/to/my.txt")
+my_docx = Langchain.root.join("path/to/my.docx")

-
+client.add_data(paths: [my_pdf, my_text, my_docx])
+```
+Supported file formats: docx, html, pdf, text, json, jsonl, csv, xlsx.

-
+Retrieve similar documents based on the query string passed in:
+```ruby
+client.similarity_search(
+  query:,
+  k: # number of results to be retrieved
+)
+```

-
-
-
-
-| "ruby_code_interpreter" | Interprets Ruby expressions | | `gem "safe_ruby", "~> 1.0.4"` |
-| "google_search" | A wrapper around Google Search | `ENV["SERPAPI_API_KEY"]` (https://serpapi.com/manage-api-key) | `gem "google_search_results", "~> 2.0.0"` |
-| "weather" | Calls Open Weather API to retrieve the current weather | `ENV["OPEN_WEATHER_API_KEY"]` (https://home.openweathermap.org/api_keys) | `gem "open-weather-ruby-client", "~> 0.3.0"` |
-| "wikipedia" | Calls Wikipedia API to retrieve the summary | | `gem "wikipedia-client", "~> 1.17.0"` |
+Retrieve similar documents based on the query string passed in via the [HyDE technique](https://arxiv.org/abs/2212.10496):
+```ruby
+client.similarity_search_with_hyde()
+```

-
+Retrieve similar documents based on the embedding passed in:
+```ruby
+client.similarity_search_by_vector(
+  embedding:,
+  k: # number of results to be retrieved
+)
+```

-
+RAG-based querying
+```ruby
+client.ask(
+  question:
+)
+```

-
+## Building chat bots

-
+### Conversation class

+Choose and instantiate the LLM provider you'll be using:
 ```ruby
-Langchain::
+llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
 ```
-
-or
-
+Instantiate the Conversation class:
 ```ruby
-Langchain::
+chat = Langchain::Conversation.new(llm: llm)
 ```

-
+(Optional) Set the conversation context:
+```ruby
+chat.set_context("You are a chatbot from the future")
+```

+Exchange messages with the LLM
+```ruby
+chat.message("Tell me about future technologies")
+```

-
-
-
-
-
-
-| JSON | Langchain::Processors::JSON | |
-| JSONL | Langchain::Processors::JSONL | |
-| csv | Langchain::Processors::CSV | |
-| xlsx | Langchain::Processors::Xlsx | `gem "roo", "~> 2.10.0"` |
+To stream the chat response:
+```ruby
+chat = Langchain::Conversation.new(llm: llm) do |chunk|
+  print(chunk)
+end
+```

-
-
+Open AI Functions support
+```ruby
+chat.set_functions(functions)
+```

 ## Evaluations (Evals)
 The Evaluations module is a collection of tools that can be used to evaluate and track the performance of the output products by LLM and your RAG (Retrieval Augmented Generation) pipelines.
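The RAG section in the hunk above describes retrieval as step 1 of the workflow. A toy, self-contained sketch of that retrieval step — ranking documents by cosine similarity between embedding vectors (the embeddings are hard-coded here; a real pipeline would obtain them from the LLM's `embed` call):

```ruby
# Cosine similarity between two equal-length numeric vectors.
def cosine(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

# Tiny "knowledge base": document text => pretend embedding.
docs = {
  "chicken recipe" => [0.9, 0.1, 0.0],
  "ruby tutorial"  => [0.1, 0.9, 0.2]
}

query_embedding = [0.85, 0.15, 0.05] # pretend this came from an embed call
best = docs.max_by { |_text, emb| cosine(query_embedding, emb) }
puts best.first # chicken recipe
```

Steps 2 and 3 then stuff the retrieved text into a prompt and hand that prompt to the LLM, which is what `client.ask(question:)` bundles together.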
@@ -598,13 +458,16 @@ ragas.score(answer: "", question: "", context: "")
 # }
 ```

+## Examples
+Additional examples available: [/examples](https://github.com/andreibondarev/langchainrb/tree/main/examples)
+
 ## Logging

 LangChain.rb uses standard logging mechanisms and defaults to `:warn` level. Most messages are at info level, but we will add debug or warn statements as needed.
 To show all log messages:

 ```ruby
-Langchain.logger.level = :
+Langchain.logger.level = :debug
 ```

 ## Development
@@ -618,31 +481,6 @@ Langchain.logger.level = :info
 ## Discord
 Join us in the [Langchain.rb](https://discord.gg/WDARp7J2n8) Discord server.

-## Core Contributors
-[<img style="border-radius:50%" alt="Andrei Bondarev" src="https://avatars.githubusercontent.com/u/541665?v=4" width="80" height="80" class="avatar">](https://twitter.com/rushing_andrei)
-
-## Contributors
-[<img style="border-radius:50%" alt="Alex Chaplinsky" src="https://avatars.githubusercontent.com/u/695947?v=4" width="80" height="80" class="avatar">](https://github.com/alchaplinsky)
-[<img style="border-radius:50%" alt="Josh Nichols" src="https://avatars.githubusercontent.com/u/159?v=4" width="80" height="80" class="avatar">](https://github.com/technicalpickles)
-[<img style="border-radius:50%" alt="Matt Lindsey" src="https://avatars.githubusercontent.com/u/5638339?v=4" width="80" height="80" class="avatar">](https://github.com/mattlindsey)
-[<img style="border-radius:50%" alt="Ricky Chilcott" src="https://avatars.githubusercontent.com/u/445759?v=4" width="80" height="80" class="avatar">](https://github.com/rickychilcott)
-[<img style="border-radius:50%" alt="Moeki Kawakami" src="https://avatars.githubusercontent.com/u/72325947?v=4" width="80" height="80" class="avatar">](https://github.com/moekidev)
-[<img style="border-radius:50%" alt="Jens Stmrs" src="https://avatars.githubusercontent.com/u/3492669?v=4" width="80" height="80" class="avatar">](https://github.com/faustus7)
-[<img style="border-radius:50%" alt="Rafael Figueiredo" src="https://avatars.githubusercontent.com/u/35845775?v=4" width="80" height="80" class="avatar">](https://github.com/rafaelqfigueiredo)
-[<img style="border-radius:50%" alt="Piero Dotti" src="https://avatars.githubusercontent.com/u/5167659?v=4" width="80" height="80" class="avatar">](https://github.com/ProGM)
-[<img style="border-radius:50%" alt="Michał Ciemięga" src="https://avatars.githubusercontent.com/u/389828?v=4" width="80" height="80" class="avatar">](https://github.com/zewelor)
-[<img style="border-radius:50%" alt="Bruno Bornsztein" src="https://avatars.githubusercontent.com/u/3760?v=4" width="80" height="80" class="avatar">](https://github.com/bborn)
-[<img style="border-radius:50%" alt="Tim Williams" src="https://avatars.githubusercontent.com/u/1192351?v=4" width="80" height="80" class="avatar">](https://github.com/timrwilliams)
-[<img style="border-radius:50%" alt="Zhenhang Tung" src="https://avatars.githubusercontent.com/u/8170159?v=4" width="80" height="80" class="avatar">](https://github.com/ZhenhangTung)
-[<img style="border-radius:50%" alt="Hama" src="https://avatars.githubusercontent.com/u/38002468?v=4" width="80" height="80" class="avatar">](https://github.com/akmhmgc)
-[<img style="border-radius:50%" alt="Josh Weir" src="https://avatars.githubusercontent.com/u/10720337?v=4" width="80" height="80" class="avatar">](https://github.com/joshweir)
-[<img style="border-radius:50%" alt="Arthur Hess" src="https://avatars.githubusercontent.com/u/446035?v=4" width="80" height="80" class="avatar">](https://github.com/arthurhess)
-[<img style="border-radius:50%" alt="Jin Shen" src="https://avatars.githubusercontent.com/u/54917718?v=4" width="80" height="80" class="avatar">](https://github.com/jacshen-ebay)
-[<img style="border-radius:50%" alt="Earle Bunao" src="https://avatars.githubusercontent.com/u/4653624?v=4" width="80" height="80" class="avatar">](https://github.com/erbunao)
-[<img style="border-radius:50%" alt="Maël H." src="https://avatars.githubusercontent.com/u/61985678?v=4" width="80" height="80" class="avatar">](https://github.com/mael-ha)
-[<img style="border-radius:50%" alt="Chris O. Adebiyi" src="https://avatars.githubusercontent.com/u/62605573?v=4" width="80" height="80" class="avatar">](https://github.com/oluvvafemi)
-[<img style="border-radius:50%" alt="Aaron Breckenridge" src="https://avatars.githubusercontent.com/u/201360?v=4" width="80" height="80" class="avatar">](https://github.com/breckenedge)
-
 ## Star History

 [](https://star-history.com/#andreibondarev/langchainrb&Date)