langchainrb 0.7.2 → 0.7.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-   metadata.gz: 49f95a7d3bf92523a3bb74ffd9c1cff35c258c4ecb9523e75b3be4ffdf333359
-   data.tar.gz: a114fc925963757330e83e9287314b1c363206a31293e788ab8f7cc5f8e82249
+   metadata.gz: f4c388275b83a0e4260f4ae9271f4c164a8d34ea5ea9585916d91e7e9c17c980
+   data.tar.gz: 8daa400de3ed80bb3fb9c53cc19ef4d56f137c2aa157bd268dbda488d0fca432
 SHA512:
-   metadata.gz: e0fb4076645a2ba09e0e9012fa2ec84260c5294f59628284baace34ad98b4dc2621c29217890aba7995d21288b68b0eab96a4ad4ba74beb1c41d8e79c296539d
-   data.tar.gz: 2d681b82119d4c4356011bcba6f5590429abdb3bea3049ab4c50ba720320493a64838bc08c6b9b8f16d2b2bd71d445795ae56923074a47b26e9948873460a250
+   metadata.gz: 4bae87c050be6a8fa011c1ae5de4b119abac498669f2e63ca1829e11b7b5ecca7610330be670d24fd6cb98c2e2599c593e9922378985efc586d76c124efb865e
+   data.tar.gz: 2a39b084c6a239aeb0de22bfc87629d2f2909b23eabfcf71a835a1f1624d84afe3ea106afdafb8f1fb301b7934d73abc7253c9b8bd3f6c9b170231ebb5af0936
data/CHANGELOG.md CHANGED
@@ -1,5 +1,15 @@
 ## [Unreleased]

+ ## [0.7.5] - 2023-11-13
+ - Fixes
+
+ ## [0.7.4] - 2023-11-10
+ - AWS Bedrock is available as an LLM provider. Available models from AI21, Cohere, AWS, and Anthropic.
+
+ ## [0.7.3] - 2023-11-08
+ - LLM response passes through the context in RAG cases
+ - Fix gpt-4 token length validation
+
 ## [0.7.2] - 2023-11-02
 - Azure OpenAI LLM support

data/README.md CHANGED
@@ -1,6 +1,6 @@
 💎🔗 Langchain.rb
 ---
- ⚡ Building applications with LLMs through composability
+ ⚡ Building LLM-powered applications in Ruby

 For deep Rails integration see: [langchainrb_rails](https://github.com/andreibondarev/langchainrb_rails) gem.

@@ -11,21 +11,24 @@ Available for paid consulting engagements! [Email me](mailto:andrei@sourcelabs.i
 [![Docs](http://img.shields.io/badge/yard-docs-blue.svg)](http://rubydoc.info/gems/langchainrb)
 [![License](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/andreibondarev/langchainrb/blob/main/LICENSE.txt)
 [![](https://dcbadge.vercel.app/api/server/WDARp7J2n8?compact=true&style=flat)](https://discord.gg/WDARp7J2n8)
+ [![X](https://img.shields.io/twitter/url/https/twitter.com/cloudposse.svg?style=social&label=Follow%20%40rushing_andrei)](https://twitter.com/rushing_andrei)

- Langchain.rb is a library that's an abstraction layer on top many emergent AI, ML and other DS tools. The goal is to abstract complexity and difficult concepts to make building AI/ML-supercharged applications approachable for traditional software engineers.
+ ## Use Cases
+ * Retrieval Augmented Generation (RAG) and vector search
+ * Chat bots
+ * [AI agents](https://github.com/andreibondarev/langchainrb/tree/main/lib/langchain/agent/agents.md)

- ## Explore Langchain.rb
+ ## Table of Contents

 - [Installation](#installation)
 - [Usage](#usage)
- - [Vector Search Databases](#using-vector-search-databases-)
- - [Standalone LLMs](#using-standalone-llms-️)
- - [Prompts](#using-prompts-)
- - [Output Parsers](#using-output-parsers)
- - [Agents](#using-agents-)
- - [Loaders](#loaders-)
- - [Examples](#examples)
+ - [Large Language Models (LLMs)](#large-language-models-llms)
+ - [Prompt Management](#prompt-management)
+ - [Output Parsers](#output-parsers)
+ - [Building RAG](#building-retrieval-augment-generation-rag-system)
+ - [Building chat bots](#building-chat-bots)
 - [Evaluations](#evaluations-evals)
+ - [Examples](#examples)
 - [Logging](#logging)
 - [Development](#development)
 - [Discord](#discord)
@@ -46,264 +49,66 @@ If bundler is not being used to manage dependencies, install the gem by executin
 require "langchain"
 ```

- #### Supported vector search databases and features:
-
- | Database | Querying | Storage | Schema Management | Backups | Rails Integration |
- | -------- |:------------------:| -------:| -----------------:| -------:| -----------------:|
- | [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
- | [Hnswlib](https://github.com/nmslib/hnswlib/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | WIP |
- | [Milvus](https://milvus.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
- | [Pinecone](https://www.pinecone.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
- | [Pgvector](https://github.com/pgvector/pgvector) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
- | [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
- | [Weaviate](https://weaviate.io/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | WIP | :white_check_mark: |
-
- ### Using Vector Search Databases 🔍
-
- Choose the LLM provider you'll be using (OpenAI or Cohere) and retrieve the API key.
-
- Add `gem "weaviate-ruby", "~> 0.8.3"` to your Gemfile.
-
- Pick the vector search database you'll be using and instantiate the client:
- ```ruby
- client = Langchain::Vectorsearch::Weaviate.new(
-   url: ENV["WEAVIATE_URL"],
-   api_key: ENV["WEAVIATE_API_KEY"],
-   index_name: "",
-   llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
- )
-
- # You can instantiate any other supported vector search database:
- client = Langchain::Vectorsearch::Chroma.new(...) # `gem "chroma-db", "~> 0.6.0"`
- client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"`
- client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.2"`
- client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
- client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
- client = Langchain::Vectorsearch::Qdrant.new(...) # `gem "qdrant-ruby", "~> 0.9.3"`
- ```
-
- ```ruby
- # Creating the default schema
- client.create_default_schema
- ```
-
- ```ruby
- # Store plain texts in your vector search database
- client.add_texts(
-   texts: [
-     "Begin by preheating your oven to 375°F (190°C). Prepare four boneless, skinless chicken breasts by cutting a pocket into the side of each breast, being careful not to cut all the way through. Season the chicken with salt and pepper to taste. In a large skillet, melt 2 tablespoons of unsalted butter over medium heat. Add 1 small diced onion and 2 minced garlic cloves, and cook until softened, about 3-4 minutes. Add 8 ounces of fresh spinach and cook until wilted, about 3 minutes. Remove the skillet from heat and let the mixture cool slightly.",
-     "In a bowl, combine the spinach mixture with 4 ounces of softened cream cheese, 1/4 cup of grated Parmesan cheese, 1/4 cup of shredded mozzarella cheese, and 1/4 teaspoon of red pepper flakes. Mix until well combined. Stuff each chicken breast pocket with an equal amount of the spinach mixture. Seal the pocket with a toothpick if necessary. In the same skillet, heat 1 tablespoon of olive oil over medium-high heat. Add the stuffed chicken breasts and sear on each side for 3-4 minutes, or until golden brown."
-   ]
- )
- ```
- ```ruby
- # Store the contents of your files in your vector search database
- my_pdf = Langchain.root.join("path/to/my.pdf")
- my_text = Langchain.root.join("path/to/my.txt")
- my_docx = Langchain.root.join("path/to/my.docx")
-
- client.add_data(paths: [my_pdf, my_text, my_docx])
- ```
- ```ruby
- # Retrieve similar documents based on the query string passed in
- client.similarity_search(
-   query:,
-   k: # number of results to be retrieved
- )
- ```
- ```ruby
- # Retrieve similar documents based on the query string passed in via the [HyDE technique](https://arxiv.org/abs/2212.10496)
- client.similarity_search_with_hyde()
- ```
- ```ruby
- # Retrieve similar documents based on the embedding passed in
- client.similarity_search_by_vector(
-   embedding:,
-   k: # number of results to be retrieved
- )
- ```
- ```ruby
- # Q&A-style querying based on the question passed in
- client.ask(
-   question:
- )
- ```
-
- ## Integrating Vector Search into ActiveRecord models
- ```ruby
- class Product < ActiveRecord::Base
-   vectorsearch provider: Langchain::Vectorsearch::Qdrant.new(
-     api_key: ENV["QDRANT_API_KEY"],
-     url: ENV["QDRANT_URL"],
-     index_name: "Products",
-     llm: Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"])
-   )
+ ## Large Language Models (LLMs)
+ Langchain.rb wraps all supported LLMs in a unified interface, allowing you to easily swap out and test different models.

-   after_save :upsert_to_vectorsearch
- end
- ```
+ #### Supported LLMs and features:
+ | LLM providers | embed() | complete() | chat() | summarize() | Notes |
+ | -------- |:------------------:| :-------: | :-----------------: | :-------: | :----------------- |
+ | [OpenAI](https://openai.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | ❌ | Including Azure OpenAI |
+ | [AI21](https://ai21.com/) | ❌ | :white_check_mark: | ❌ | :white_check_mark: | |
+ | [Anthropic](https://www.anthropic.com/) | ❌ | :white_check_mark: | ❌ | ❌ | |
+ | [AWS Bedrock](https://aws.amazon.com/bedrock) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | Provides AWS, Cohere, AI21, Anthropic and Stability AI models |
+ | [Cohere](https://cohere.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
+ | [GooglePalm](https://ai.google/discover/palm2/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |
+ | [HuggingFace](https://huggingface.co/) | :white_check_mark: | ❌ | ❌ | ❌ | |
+ | [Ollama](https://ollama.ai/) | :white_check_mark: | :white_check_mark: | ❌ | ❌ | |
+ | [Replicate](https://replicate.com/) | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | |

- ### Exposed ActiveRecord methods
- ```ruby
- # Retrieve similar products based on the query string passed in
- Product.similarity_search(
-   query:,
-   k: # number of results to be retrieved
- )
- ```
- ```ruby
- # Q&A-style querying based on the question passed in
- Product.ask(
-   question:
- )
- ```
-
- Additional info [here](https://github.com/andreibondarev/langchainrb/blob/main/lib/langchain/active_record/hooks.rb#L10-L38).
-
- ### Using Standalone LLMs 🗣️
-
- Add `gem "ruby-openai", "~> 4.0.0"` to your Gemfile.
+ #### Using standalone LLMs:

 #### OpenAI
- ```ruby
- openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
- ```
- You can pass additional parameters to the constructor, it will be passed to the OpenAI client:
- ```ruby
- openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"], llm_options: {uri_base: "http://localhost:1234"}) )
- ```
- ```ruby
- openai.embed(text: "foo bar")
- ```
- ```ruby
- openai.complete(prompt: "What is the meaning of life?")
- ```
-
- ##### Open AI Function calls support
-
- Conversation support
-
- ```ruby
- chat = Langchain::Conversation.new(llm: openai)
- ```
- ```ruby
- chat.set_context("You are the climate bot")
- chat.set_functions(functions)
- ```

- qdrant:
-
- ```ruby
- client.llm.functions = functions
- ```
-
- #### Azure
 Add `gem "ruby-openai", "~> 5.2.0"` to your Gemfile.

 ```ruby
- azure = Langchain::LLM::Azure.new(
-   api_key: ENV["AZURE_API_KEY"],
-   llm_options: {
-     api_type: :azure,
-     api_version: "2023-03-15-preview"
-   },
-   embedding_deployment_url: ENV.fetch("AZURE_EMBEDDING_URI"),
-   chat_deployment_url: ENV.fetch("AZURE_CHAT_URI")
- )
- ```
- where `AZURE_EMBEDDING_URI` is e.g. `https://custom-domain.openai.azure.com/openai/deployments/gpt-35-turbo` and `AZURE_CHAT_URI` is e.g. `https://custom-domain.openai.azure.com/openai/deployments/ada-2`
-
- You can pass additional parameters to the constructor, it will be passed to the Azure client:
- ```ruby
- azure = Langchain::LLM::Azure.new(
-   api_key: ENV["AZURE_API_KEY"],
-   llm_options: {
-     api_type: :azure,
-     api_version: "2023-03-15-preview",
-     request_timeout: 240 # Optional
-   },
-   embedding_deployment_url: ENV.fetch("AZURE_EMBEDDING_URI"),
-   chat_deployment_url: ENV.fetch("AZURE_CHAT_URI")
- )
- ```
- ```ruby
- azure.embed(text: "foo bar")
- ```
- ```ruby
- azure.complete(prompt: "What is the meaning of life?")
- ```
-
- #### Cohere
- Add `gem "cohere-ruby", "~> 0.9.6"` to your Gemfile.
-
- ```ruby
- cohere = Langchain::LLM::Cohere.new(api_key: ENV["COHERE_API_KEY"])
- ```
- ```ruby
- cohere.embed(text: "foo bar")
- ```
- ```ruby
- cohere.complete(prompt: "What is the meaning of life?")
- ```
-
- #### HuggingFace
- Add `gem "hugging-face", "~> 0.3.2"` to your Gemfile.
- ```ruby
- hugging_face = Langchain::LLM::HuggingFace.new(api_key: ENV["HUGGING_FACE_API_KEY"])
- ```
-
- #### Replicate
- Add `gem "replicate-ruby", "~> 0.2.2"` to your Gemfile.
- ```ruby
- replicate = Langchain::LLM::Replicate.new(api_key: ENV["REPLICATE_API_KEY"])
75
+ llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
259
76
  ```
260
-
261
- #### Google PaLM (Pathways Language Model)
262
- Add `"google_palm_api", "~> 0.1.3"` to your Gemfile.
77
+ You can pass additional parameters to the constructor, it will be passed to the OpenAI client:
263
78
  ```ruby
264
- google_palm = Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"])
79
+ llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"], llm_options: { ... })
265
80
  ```
266
81
 
267
- #### AI21
268
- Add `gem "ai21", "~> 0.2.1"` to your Gemfile.
82
+ Generate vector embeddings:
269
83
  ```ruby
270
- ai21 = Langchain::LLM::AI21.new(api_key: ENV["AI21_API_KEY"])
84
+ llm.embed(text: "foo bar")
271
85
  ```
272
86
 
273
- #### Anthropic
274
- Add `gem "anthropic", "~> 0.1.0"` to your Gemfile.
87
+ Generate a text completion:
275
88
  ```ruby
276
- anthropic = Langchain::LLM::Anthropic.new(api_key: ENV["ANTHROPIC_API_KEY"])
89
+ llm.complete(prompt: "What is the meaning of life?")
277
90
  ```
278
91
 
92
+ Generate a chat completion:
279
93
  ```ruby
280
- anthropic.complete(prompt: "What is the meaning of life?")
94
+ llm.chat(prompt: "Hey! How are you?")
281
95
  ```
282
96
 
283
- #### Ollama
97
+ Summarize the text:
284
98
  ```ruby
285
- ollama = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"])
99
+ llm.complete(text: "...")
286
100
  ```
287
101
 
102
+ You can use any other LLM by invoking the same interface:
288
103
  ```ruby
289
- ollama.complete(prompt: "What is the meaning of life?")
290
- ```
291
- ```ruby
292
- ollama.embed(text: "Hello world!")
104
+ llm = Langchain::LLM::GooglePalm.new(...)
293
105
  ```
294
106
 
295
- ### Using Prompts 📋
107
+ ### Prompt Management
296
108
 
297
109
  #### Prompt Templates
298
110
 
299
- Create a prompt with one input variable:
300
-
301
- ```ruby
302
- prompt = Langchain::Prompt::PromptTemplate.new(template: "Tell me a {adjective} joke.", input_variables: ["adjective"])
303
- prompt.format(adjective: "funny") # "Tell me a funny joke."
304
- ```
305
-
306
- Create a prompt with multiple input variables:
111
+ Create a prompt with input variables:
307
112
 
308
113
  ```ruby
309
114
  prompt = Langchain::Prompt::PromptTemplate.new(template: "Tell me a {adjective} joke about {content}.", input_variables: ["adjective", "content"])
@@ -384,7 +189,8 @@ prompt = Langchain::Prompt.load_from_path(file_path: "spec/fixtures/prompt/promp
 prompt.input_variables #=> ["adjective", "content"]
 ```

- ### Using Output Parsers
+
+ ### Output Parsers

 Parse LLM text responses into structured output, such as JSON.

@@ -484,93 +290,147 @@ fix_parser.parse(llm_response)

 See [here](https://github.com/andreibondarev/langchainrb/tree/main/examples/create_and_manage_prompt_templates_using_structured_output_parser.rb) for a concrete example

- ### Using Agents 🤖
- Agents are semi-autonomous bots that can respond to user questions and use available to them Tools to provide informed replies. They break down problems into series of steps and define Actions (and Action Inputs) along the way that are executed and fed back to them as additional information. Once an Agent decides that it has the Final Answer it responds with it.
+ ## Building Retrieval Augment Generation (RAG) system
+ RAG is a methodology that helps LLMs generate accurate and up-to-date information.
+ A typical RAG workflow follows the 3 steps below:
+ 1. Relevant knowledge (or data) is retrieved from the knowledge base (typically a vector search DB)
+ 2. A prompt, containing the retrieved knowledge above, is constructed.
+ 3. The LLM receives the prompt above to generate a text completion.
+ The most common use case for a RAG system is powering Q&A systems where users pose natural language questions and receive answers in natural language.

- #### ReAct Agent
+ ### Vector search databases
+ Langchain.rb provides a convenient unified interface on top of supported vectorsearch databases that makes it easy to configure your index, add data, query and retrieve from it.

- Add `gem "ruby-openai"`, `gem "eqn"`, and `gem "google_search_results"` to your Gemfile
+ #### Supported vector search databases and features:

- ```ruby
- search_tool = Langchain::Tool::GoogleSearch.new(api_key: ENV["SERPAPI_API_KEY"])
- calculator = Langchain::Tool::Calculator.new
+ | Database | Open-source | Cloud offering |
+ | -------- |:------------------:| :------------: |
+ | [Chroma](https://trychroma.com/) | :white_check_mark: | :white_check_mark: |
+ | [Hnswlib](https://github.com/nmslib/hnswlib/) | :white_check_mark: | ❌ |
+ | [Milvus](https://milvus.io/) | :white_check_mark: | :white_check_mark: Zilliz Cloud |
+ | [Pinecone](https://www.pinecone.io/) | ❌ | :white_check_mark: |
+ | [Pgvector](https://github.com/pgvector/pgvector) | :white_check_mark: | :white_check_mark: |
+ | [Qdrant](https://qdrant.tech/) | :white_check_mark: | :white_check_mark: |
+ | [Weaviate](https://weaviate.io/) | :white_check_mark: | :white_check_mark: |

- openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
+ ### Using Vector Search Databases 🔍

- agent = Langchain::Agent::ReActAgent.new(
-   llm: openai,
-   tools: [search_tool, calculator]
- )
- ```
+ Pick the vector search database you'll be using, add the gem dependency and instantiate the client:
 ```ruby
- agent.run(question: "How many full soccer fields would be needed to cover the distance between NYC and DC in a straight line?")
- #=> "Approximately 2,945 soccer fields would be needed to cover the distance between NYC and DC in a straight line."
+ gem "weaviate-ruby", "~> 0.8.9"
 ```

- #### SQL-Query Agent
+ Choose and instantiate the LLM provider you'll be using to generate embeddings:
+ ```ruby
+ llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
+ ```

- Add `gem "sequel"` to your Gemfile
+ ```ruby
+ client = Langchain::Vectorsearch::Weaviate.new(
+   url: ENV["WEAVIATE_URL"],
+   api_key: ENV["WEAVIATE_API_KEY"],
+   index_name: "Documents",
+   llm: llm
+ )
+ ```

+ You can instantiate any other supported vector search database:
 ```ruby
- database = Langchain::Tool::Database.new(connection_string: "postgres://user:password@localhost:5432/db_name")
+ client = Langchain::Vectorsearch::Chroma.new(...) # `gem "chroma-db", "~> 0.6.0"`
+ client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"`
+ client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.2"`
+ client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
+ client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
+ client = Langchain::Vectorsearch::Qdrant.new(...) # `gem "qdrant-ruby", "~> 0.9.3"`
+ ```

- agent = Langchain::Agent::SQLQueryAgent.new(llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"]), db: database)
+ Create the default schema:
+ ```ruby
+ client.create_default_schema
 ```
+
+ Add plain text data to your vector search database:
 ```ruby
- agent.run(question: "How many users have a name with length greater than 5 in the users table?")
- #=> "14 users have a name with length greater than 5 in the users table."
+ client.add_texts(
+   texts: [
+     "Begin by preheating your oven to 375°F (190°C). Prepare four boneless, skinless chicken breasts by cutting a pocket into the side of each breast, being careful not to cut all the way through. Season the chicken with salt and pepper to taste. In a large skillet, melt 2 tablespoons of unsalted butter over medium heat. Add 1 small diced onion and 2 minced garlic cloves, and cook until softened, about 3-4 minutes. Add 8 ounces of fresh spinach and cook until wilted, about 3 minutes. Remove the skillet from heat and let the mixture cool slightly.",
+     "In a bowl, combine the spinach mixture with 4 ounces of softened cream cheese, 1/4 cup of grated Parmesan cheese, 1/4 cup of shredded mozzarella cheese, and 1/4 teaspoon of red pepper flakes. Mix until well combined. Stuff each chicken breast pocket with an equal amount of the spinach mixture. Seal the pocket with a toothpick if necessary. In the same skillet, heat 1 tablespoon of olive oil over medium-high heat. Add the stuffed chicken breasts and sear on each side for 3-4 minutes, or until golden brown."
+   ]
+ )
 ```

- #### Demo
- ![May-12-2023 13-09-13](https://github.com/andreibondarev/langchainrb/assets/541665/6bad4cd9-976c-420f-9cf9-b85bf84f7eaf)
+ Or use the file parsers to load, parse and index data into your database:
+ ```ruby
+ my_pdf = Langchain.root.join("path/to/my.pdf")
+ my_text = Langchain.root.join("path/to/my.txt")
+ my_docx = Langchain.root.join("path/to/my.docx")

- ![May-12-2023 13-07-45](https://github.com/andreibondarev/langchainrb/assets/541665/9aacdcc7-4225-4ea0-ab96-7ee48826eb9b)
+ client.add_data(paths: [my_pdf, my_text, my_docx])
+ ```
+ Supported file formats: docx, html, pdf, text, json, jsonl, csv, xlsx.

- #### Available Tools 🛠️
+ Retrieve similar documents based on the query string passed in:
+ ```ruby
+ client.similarity_search(
+   query:,
+   k: # number of results to be retrieved
+ )
+ ```

- | Name | Description | ENV Requirements | Gem Requirements |
- | ------------ | :------------------------------------------------: | :-----------------------------------------------------------: | :---------------------------------------: |
- | "calculator" | Useful for getting the result of a math expression | | `gem "eqn", "~> 1.6.5"` |
- | "database" | Useful for querying a SQL database | | `gem "sequel", "~> 5.68.0"` |
- | "ruby_code_interpreter" | Interprets Ruby expressions | | `gem "safe_ruby", "~> 1.0.4"` |
- | "google_search" | A wrapper around Google Search | `ENV["SERPAPI_API_KEY"]` (https://serpapi.com/manage-api-key) | `gem "google_search_results", "~> 2.0.0"` |
- | "weather" | Calls Open Weather API to retrieve the current weather | `ENV["OPEN_WEATHER_API_KEY"]` (https://home.openweathermap.org/api_keys) | `gem "open-weather-ruby-client", "~> 0.3.0"` |
- | "wikipedia" | Calls Wikipedia API to retrieve the summary | | `gem "wikipedia-client", "~> 1.17.0"` |
+ Retrieve similar documents based on the query string passed in via the [HyDE technique](https://arxiv.org/abs/2212.10496):
+ ```ruby
+ client.similarity_search_with_hyde()
+ ```

- #### Loaders 🚚
+ Retrieve similar documents based on the embedding passed in:
+ ```ruby
+ client.similarity_search_by_vector(
+   embedding:,
+   k: # number of results to be retrieved
+ )
+ ```

- Need to read data from various sources? Load it up.
+ RAG-based querying:
+ ```ruby
+ client.ask(
+   question:
+ )
+ ```

- ##### Usage
+ ## Building chat bots

- Just call `Langchan::Loader.load` with the path to the file or a URL you want to load.
+ ### Conversation class

+ Choose and instantiate the LLM provider you'll be using:
 ```ruby
- Langchain::Loader.load('/path/to/file.pdf')
+ llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
 ```
-
- or
-
+ Instantiate the Conversation class:
 ```ruby
- Langchain::Loader.load('https://www.example.com/file.pdf')
+ chat = Langchain::Conversation.new(llm: llm)
 ```

- ##### Supported Formats
+ (Optional) Set the conversation context:
+ ```ruby
+ chat.set_context("You are a chatbot from the future")
+ ```

+ Exchange messages with the LLM:
+ ```ruby
+ chat.message("Tell me about future technologies")
+ ```

- | Format | Pocessor | Gem Requirements |
- | ------ | ---------------------------- | :--------------------------: |
- | docx | Langchain::Processors::Docx | `gem "docx", "~> 0.8.0"` |
- | html | Langchain::Processors::HTML | `gem "nokogiri", "~> 1.13"` |
- | pdf | Langchain::Processors::PDF | `gem "pdf-reader", "~> 1.4"` |
- | text | Langchain::Processors::Text | |
- | JSON | Langchain::Processors::JSON | |
- | JSONL | Langchain::Processors::JSONL | |
- | csv | Langchain::Processors::CSV | |
- | xlsx | Langchain::Processors::Xlsx | `gem "roo", "~> 2.10.0"` |
+ To stream the chat response:
+ ```ruby
+ chat = Langchain::Conversation.new(llm: llm) do |chunk|
+   print(chunk)
+ end
+ ```

- ## Examples
- Additional examples available: [/examples](https://github.com/andreibondarev/langchainrb/tree/main/examples)
+ OpenAI Functions support:
+ ```ruby
+ chat.set_functions(functions)
+ ```

 ## Evaluations (Evals)
 The Evaluations module is a collection of tools that can be used to evaluate and track the performance of the output products by LLM and your RAG (Retrieval Augmented Generation) pipelines.
@@ -598,13 +458,16 @@ ragas.score(answer: "", question: "", context: "")
 # }
 ```

+ ## Examples
+ Additional examples available: [/examples](https://github.com/andreibondarev/langchainrb/tree/main/examples)
+
 ## Logging

 LangChain.rb uses standard logging mechanisms and defaults to `:warn` level. Most messages are at info level, but we will add debug or warn statements as needed.
 To show all log messages:

 ```ruby
- Langchain.logger.level = :info
+ Langchain.logger.level = :debug
 ```

 ## Development
@@ -618,31 +481,6 @@ Langchain.logger.level = :info
 ## Discord
 Join us in the [Langchain.rb](https://discord.gg/WDARp7J2n8) Discord server.

- ## Core Contributors
- [<img style="border-radius:50%" alt="Andrei Bondarev" src="https://avatars.githubusercontent.com/u/541665?v=4" width="80" height="80" class="avatar">](https://twitter.com/rushing_andrei)
-
- ## Contributors
- [<img style="border-radius:50%" alt="Alex Chaplinsky" src="https://avatars.githubusercontent.com/u/695947?v=4" width="80" height="80" class="avatar">](https://github.com/alchaplinsky)
- [<img style="border-radius:50%" alt="Josh Nichols" src="https://avatars.githubusercontent.com/u/159?v=4" width="80" height="80" class="avatar">](https://github.com/technicalpickles)
- [<img style="border-radius:50%" alt="Matt Lindsey" src="https://avatars.githubusercontent.com/u/5638339?v=4" width="80" height="80" class="avatar">](https://github.com/mattlindsey)
- [<img style="border-radius:50%" alt="Ricky Chilcott" src="https://avatars.githubusercontent.com/u/445759?v=4" width="80" height="80" class="avatar">](https://github.com/rickychilcott)
- [<img style="border-radius:50%" alt="Moeki Kawakami" src="https://avatars.githubusercontent.com/u/72325947?v=4" width="80" height="80" class="avatar">](https://github.com/moekidev)
- [<img style="border-radius:50%" alt="Jens Stmrs" src="https://avatars.githubusercontent.com/u/3492669?v=4" width="80" height="80" class="avatar">](https://github.com/faustus7)
- [<img style="border-radius:50%" alt="Rafael Figueiredo" src="https://avatars.githubusercontent.com/u/35845775?v=4" width="80" height="80" class="avatar">](https://github.com/rafaelqfigueiredo)
- [<img style="border-radius:50%" alt="Piero Dotti" src="https://avatars.githubusercontent.com/u/5167659?v=4" width="80" height="80" class="avatar">](https://github.com/ProGM)
- [<img style="border-radius:50%" alt="Michał Ciemięga" src="https://avatars.githubusercontent.com/u/389828?v=4" width="80" height="80" class="avatar">](https://github.com/zewelor)
- [<img style="border-radius:50%" alt="Bruno Bornsztein" src="https://avatars.githubusercontent.com/u/3760?v=4" width="80" height="80" class="avatar">](https://github.com/bborn)
- [<img style="border-radius:50%" alt="Tim Williams" src="https://avatars.githubusercontent.com/u/1192351?v=4" width="80" height="80" class="avatar">](https://github.com/timrwilliams)
- [<img style="border-radius:50%" alt="Zhenhang Tung" src="https://avatars.githubusercontent.com/u/8170159?v=4" width="80" height="80" class="avatar">](https://github.com/ZhenhangTung)
- [<img style="border-radius:50%" alt="Hama" src="https://avatars.githubusercontent.com/u/38002468?v=4" width="80" height="80" class="avatar">](https://github.com/akmhmgc)
- [<img style="border-radius:50%" alt="Josh Weir" src="https://avatars.githubusercontent.com/u/10720337?v=4" width="80" height="80" class="avatar">](https://github.com/joshweir)
- [<img style="border-radius:50%" alt="Arthur Hess" src="https://avatars.githubusercontent.com/u/446035?v=4" width="80" height="80" class="avatar">](https://github.com/arthurhess)
- [<img style="border-radius:50%" alt="Jin Shen" src="https://avatars.githubusercontent.com/u/54917718?v=4" width="80" height="80" class="avatar">](https://github.com/jacshen-ebay)
- [<img style="border-radius:50%" alt="Earle Bunao" src="https://avatars.githubusercontent.com/u/4653624?v=4" width="80" height="80" class="avatar">](https://github.com/erbunao)
- [<img style="border-radius:50%" alt="Maël H." src="https://avatars.githubusercontent.com/u/61985678?v=4" width="80" height="80" class="avatar">](https://github.com/mael-ha)
- [<img style="border-radius:50%" alt="Chris O. Adebiyi" src="https://avatars.githubusercontent.com/u/62605573?v=4" width="80" height="80" class="avatar">](https://github.com/oluvvafemi)
- [<img style="border-radius:50%" alt="Aaron Breckenridge" src="https://avatars.githubusercontent.com/u/201360?v=4" width="80" height="80" class="avatar">](https://github.com/breckenedge)
-
 ## Star History

 [![Star History Chart](https://api.star-history.com/svg?repos=andreibondarev/langchainrb&type=Date)](https://star-history.com/#andreibondarev/langchainrb&Date)