ragit-0.7-py3-none-any.whl → ragit-0.7.1-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,170 +0,0 @@
- Metadata-Version: 2.2
- Name: ragit
- Version: 0.7
- Home-page: https://github.com/stsfaroz/ragit
- Author: Salman Faroz
- License: MIT
- Description-Content-Type: text/markdown
- Requires-Dist: sentence-transformers>=3.4.1
- Requires-Dist: pandas>=2.2.3
- Requires-Dist: chromadb>=0.6.3
- Requires-Dist: setuptools>=75.8.0
- Requires-Dist: wheel>=0.45.1
- Requires-Dist: twine>=6.1.0
- Dynamic: author
- Dynamic: description
- Dynamic: description-content-type
- Dynamic: home-page
- Dynamic: license
- Dynamic: requires-dist
-
-
- # Ragit
- 🚀 Smart, Fast, Scalable Search 🚀
-
- **ragit** is a lightweight Python library that simplifies the management of vector databases. With **ragit**, you can easily create, update, query, and manage your vector database, all from CSV files containing text data.
-
- ## Features
-
- - **Create a Vector Database:** Build your database from a CSV file with two required columns: `id` and `text`.
- - **Add New Entries:** Insert additional entries from CSV files or add them individually.
- - **Similarity Search:** Find nearby texts using various distance metrics (e.g., cosine, L2) with similarity scores.
- - **Data Retrieval:** Fetch entries by IDs or exact text matches.
- - **Deletion:** Remove single entries or entire collections when needed.
-
- ## CSV File Format
- ragit expects your CSV file to have exactly two columns: `id` and `text`. **Note:** Each `id` must be unique.
-
- ## Example CSV (`data.csv`):
-
- ```csv
- id,text
- 1,The quick brown fox jumps over the lazy dog.
- 2,Another sample entry for testing.
- ```
-
- ## Usage
- Below are some examples that demonstrate how to use `ragit`. The examples cover creating a database, adding entries, performing similarity searches, and more.
-
- ### 1. Importing and Initializing
- First, import the `VectorDBManager` class from `ragit` and initialize it:
-
- ```python
- from ragit import VectorDBManager
-
- # Initialize the vector database manager with a custom persistence directory and model
- db_manager = VectorDBManager(
-     persist_directory="./my_vector_db",  # optional; default: "./vector_db"
-     provider="sentence_transformer",     # optional; default: "sentence_transformer"
-     model_name="all-mpnet-base-v2"       # optional; default: "all-mpnet-base-v2"
- )
- ```
-
- ### 2. Creating a Database
- Create a new collection (named `my_collection`) from your CSV file. In this example, `distance_metric` is set to `"cosine"` (available options: `l2`, `cosine`, `ip`, `l1`):
-
- ```python
- db_manager.create_database(
-     csv_path="data.csv",
-     collection_name="my_collection",
-     distance_metric="cosine"  # optional; default: "l2"
- )
- ```
- ### Reloading Your Database
-
- To reuse an existing vector database, initialize `VectorDBManager` with the same parameters that were used when the database was created.
-
- ```python
- from ragit import VectorDBManager
-
- db_manager = VectorDBManager(
-     persist_directory="./my_vector_db",
-     provider="sentence_transformer",
-     model_name="all-mpnet-base-v2"
- )
- ```
-
- ### 3. Adding a Single Entry
- Add an individual entry to the collection:
-
- ```python
- db_manager.add_single_row(
-     id_="101",
-     text="This is a new test entry for the database.",
-     collection_name="my_collection"
- )
- ```
-
- ### 4. Adding Multiple Entries from CSV
- You can also add multiple entries from a CSV file. This function skips any entries that already exist in the collection:
-
- ```python
- stats = db_manager.add_values_from_csv(
-     csv_path="data.csv",
-     collection_name="my_collection"
- )
- print(f"Added {stats['new_entries_added']} new entries")
- ```
-
- ### 5. Retrieving Collection Information
- Fetch and display information about your collection:
-
- ```python
- info = db_manager.get_collection_info("my_collection")
- print(f"Collection size: {info['count']} entries")
- ```
-
- ### 6. Performing a Similarity Search
- Find texts that are similar to your query. In this example, the query text is "ai", and the search is filtered using the string "Artificial intelligence". The top 2 results are returned:
-
- ```python
- results = db_manager.find_nearby_texts(
-     text="ai",
-     collection_name="my_collection",
-     k=2,
-     search_string="Artificial intelligence"  # optional
- )
-
- print("Results:")
- for item in results:
-     print(f"\nID: {item['id']}")
-     print(f"Text: {item['text']}")
-     print(f"Similarity: {item['similarity']}%")
-     print(f"Distance ({item['metric']}): {item['raw_distance']}")
- ```
-
- ### 7. Fetching Texts by IDs
- Retrieve text entries for a list of IDs:
-
- ```python
- ids_to_fetch = ["1", "2", "3"]
- texts = db_manager.get_by_ids(ids_to_fetch, "my_collection")
- print("Texts:", texts)
- ```
-
- ### 8. Deleting a Row / Collection
-
- Remove an entry from the collection by its ID:
-
- ```python
- db_manager.delete_entry_by_id(
-     id_="1",
-     collection_name="my_collection"
- )
- ```
-
-
- Delete an entire collection. **Note:** You must pass `confirmation="yes"` to proceed with deletion.
-
- ```python
- db_manager.delete_collection(
-     collection_name="my_collection",
-     confirmation="yes"
- )
- ```
-
- ## Contributing
- Contributions are welcome! If you encounter any issues or have suggestions for improvements, please feel free to open an issue or submit a pull request on GitHub.
-
- ## License
- This project is licensed under the MIT License. See the `LICENSE` file for details.
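Note on the CSV rules in the removed README: it states that ragit expects a CSV with exactly two columns, `id` and `text`, and that every `id` must be unique. A minimal sketch of checking a file against those two rules before loading it might look like the following. It uses only the standard library; `validate_ragit_csv` is a hypothetical helper name and is not part of ragit's API, and accepting the two columns in either order is an assumption.

```python
import csv
import io


def validate_ragit_csv(handle):
    """Return a list of problems found in a CSV intended for ragit.

    Hypothetical helper (not part of ragit). It checks only the two rules
    stated in the README: the columns are `id` and `text`, and ids are unique.
    """
    reader = csv.DictReader(handle)
    problems = []
    # Assumption: column order does not matter, only the set of names.
    if sorted(reader.fieldnames or []) != ["id", "text"]:
        problems.append(f"expected columns 'id' and 'text', got {reader.fieldnames}")
        return problems
    seen = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row["id"] in seen:
            problems.append(f"line {line_no}: duplicate id {row['id']!r}")
        seen.add(row["id"])
    return problems


sample = io.StringIO(
    "id,text\n"
    "1,The quick brown fox jumps over the lazy dog.\n"
    "2,Another sample entry for testing.\n"
    "2,A row that reuses an id.\n"
)
print(validate_ragit_csv(sample))
```

An empty result list means both rules hold; the sample above deliberately reuses id `2`, so one problem is reported for its last row.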
@@ -1,6 +0,0 @@
- ragit/__init__.py,sha256=GECJxYFL_0PMy6tbcVFpW9Fhe1JiI2uXH4iJWhUHpKs,48
- ragit/main.py,sha256=f2kDfZPxP26DBvzmP7aF6VhnNAE1hC-ZONU5ZH6RVBM,11774
- ragit-0.7.dist-info/METADATA,sha256=xYd5dnzwTFkXCGTCHMDakiER_xfDr8VjQBntOFlwL6M,5165
- ragit-0.7.dist-info/WHEEL,sha256=In9FTNxeP60KnTkGw7wk6mJPYd_dQSjEZmXdBdMCI-8,91
- ragit-0.7.dist-info/top_level.txt,sha256=pkPbG7yrw61wt9_y_xcLE2vq2a55fzockASD0yq0g4s,6
- ragit-0.7.dist-info/RECORD,,
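
Note on the hunk above: its entries have the form `path,sha256=<digest>,<size>`, which matches the wheel RECORD format, so this hunk appears to be the removed `ragit-0.7.dist-info/RECORD`. In that format the digest is the file's SHA-256 hash encoded as URL-safe base64 with the trailing `=` padding stripped, and the last field is the file size in bytes. A minimal sketch of building such an entry, for example to compare an unpacked file against its RECORD line, is shown below. `record_entry` is a hypothetical helper name, and the path and bytes in the usage line are made up for illustration.

```python
import base64
import hashlib


def record_entry(path, data):
    """Build a RECORD-style line for a file's bytes (hypothetical helper)."""
    digest = hashlib.sha256(data).digest()
    # URL-safe base64 without the trailing '=' padding, as used in RECORD.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# Illustrative input only; these bytes are not taken from the package.
print(record_entry("example/file.txt", b"hello\n"))
```

To check a real file, read it in binary mode and compare the returned line with the matching RECORD entry. The RECORD file itself is listed without a hash or size, as in the last line of the hunk.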