ai-parrot 0.3.0-cp311-cp311-manylinux_2_28_x86_64.whl → 0.3.3-cp311-cp311-manylinux_2_28_x86_64.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Note: this version of ai-parrot has been flagged as a potentially problematic release.

ai_parrot-0.3.3.dist-info/METADATA ADDED
@@ -0,0 +1,318 @@
1
+ Metadata-Version: 2.1
2
+ Name: ai-parrot
3
+ Version: 0.3.3
4
+ Summary: Live Chatbots based on Langchain chatbots and Agents Integrated into Navigator Framework or used into aiohttp applications.
5
+ Home-page: https://github.com/phenobarbital/ai-parrot
6
+ Author: Jesus Lara
7
+ Author-email: jesuslara@phenobarbital.info
8
+ License: MIT
9
+ Project-URL: Source, https://github.com/phenobarbital/ai-parrot
10
+ Project-URL: Tracker, https://github.com/phenobarbital/ai-parrot/issues
11
+ Project-URL: Documentation, https://github.com/phenobarbital/ai-parrot/
12
+ Project-URL: Funding, https://paypal.me/phenobarbital
13
+ Project-URL: Say Thanks!, https://saythanks.io/to/phenobarbital
14
+ Keywords: asyncio,asyncpg,aioredis,aiomcache,langchain,chatbot,agents
15
+ Platform: POSIX
16
+ Classifier: Development Status :: 4 - Beta
17
+ Classifier: Intended Audience :: Developers
18
+ Classifier: Operating System :: POSIX :: Linux
19
+ Classifier: Environment :: Web Environment
20
+ Classifier: License :: OSI Approved :: MIT License
21
+ Classifier: Topic :: Software Development :: Build Tools
22
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
23
+ Classifier: Programming Language :: Python :: 3.9
24
+ Classifier: Programming Language :: Python :: 3.10
25
+ Classifier: Programming Language :: Python :: 3.11
26
+ Classifier: Programming Language :: Python :: 3.12
27
+ Classifier: Programming Language :: Python :: 3 :: Only
28
+ Classifier: Framework :: AsyncIO
29
+ Requires-Python: >=3.10.12
30
+ Description-Content-Type: text/markdown
31
+ License-File: LICENSE
32
+ Requires-Dist: Cython==3.0.11
33
+ Requires-Dist: accelerate==0.34.2
34
+ Requires-Dist: langchain>=0.2.6
35
+ Requires-Dist: langchain-community>=0.2.6
36
+ Requires-Dist: langchain-core>=0.2.32
37
+ Requires-Dist: langchain-experimental==0.0.62
38
+ Requires-Dist: langchainhub==0.1.15
39
+ Requires-Dist: langchain-text-splitters==0.2.2
40
+ Requires-Dist: langchain-huggingface==0.0.3
41
+ Requires-Dist: huggingface-hub==0.23.5
42
+ Requires-Dist: llama-index==0.10.20
43
+ Requires-Dist: llama-cpp-python==0.2.56
44
+ Requires-Dist: bitsandbytes==0.43.3
45
+ Requires-Dist: Cartopy==0.22.0
46
+ Requires-Dist: chromadb==0.4.24
47
+ Requires-Dist: datasets==2.18.0
48
+ Requires-Dist: faiss-cpu==1.8.0
49
+ Requires-Dist: fastavro==1.9.4
50
+ Requires-Dist: gunicorn==21.2.0
51
+ Requires-Dist: jq==1.7.0
52
+ Requires-Dist: rank-bm25==0.2.2
53
+ Requires-Dist: matplotlib==3.8.3
54
+ Requires-Dist: numba==0.59.0
55
+ Requires-Dist: querysource>=3.12.10
56
+ Requires-Dist: safetensors>=0.4.3
57
+ Requires-Dist: sentence-transformers==3.0.1
58
+ Requires-Dist: tabulate==0.9.0
59
+ Requires-Dist: tiktoken==0.7.0
60
+ Requires-Dist: tokenizers==0.19.1
61
+ Requires-Dist: unstructured==0.14.3
62
+ Requires-Dist: unstructured-client==0.18.0
63
+ Requires-Dist: youtube-transcript-api==0.6.2
64
+ Requires-Dist: selenium==4.18.1
65
+ Requires-Dist: webdriver-manager==4.0.1
66
+ Requires-Dist: transitions==0.9.0
67
+ Requires-Dist: sentencepiece==0.2.0
68
+ Requires-Dist: duckduckgo-search==5.3.0
69
+ Requires-Dist: google-search-results==2.4.2
70
+ Requires-Dist: google-api-python-client>=2.86.0
71
+ Requires-Dist: gdown==5.1.0
72
+ Requires-Dist: weasyprint==61.2
73
+ Requires-Dist: markdown2==2.4.13
74
+ Requires-Dist: fastembed==0.3.4
75
+ Requires-Dist: yfinance==0.2.40
76
+ Requires-Dist: youtube-search==2.1.2
77
+ Requires-Dist: wikipedia==1.4.0
78
+ Requires-Dist: mediawikiapi==1.2
79
+ Requires-Dist: pyowm==3.3.0
80
+ Requires-Dist: O365==2.0.35
81
+ Requires-Dist: stackapi==0.3.1
82
+ Requires-Dist: timm==1.0.9
83
+ Requires-Dist: torchvision==0.19.1
84
+ Provides-Extra: all
85
+ Requires-Dist: langchain-milvus==0.1.1; extra == "all"
86
+ Requires-Dist: milvus==2.3.5; extra == "all"
87
+ Requires-Dist: pymilvus==2.4.4; extra == "all"
88
+ Requires-Dist: groq==0.11.0; extra == "all"
89
+ Requires-Dist: langchain-groq==0.1.4; extra == "all"
90
+ Requires-Dist: llama-index-llms-huggingface==0.2.7; extra == "all"
91
+ Requires-Dist: langchain-google-vertexai==1.0.8; extra == "all"
92
+ Requires-Dist: langchain-google-genai==1.0.8; extra == "all"
93
+ Requires-Dist: google-generativeai==0.7.2; extra == "all"
94
+ Requires-Dist: vertexai==1.60.0; extra == "all"
95
+ Requires-Dist: google-cloud-aiplatform>=1.60.0; extra == "all"
96
+ Requires-Dist: grpc-google-iam-v1==0.13.0; extra == "all"
97
+ Requires-Dist: langchain-openai==0.1.21; extra == "all"
98
+ Requires-Dist: openai==1.40.8; extra == "all"
99
+ Requires-Dist: llama-index-llms-openai==0.1.11; extra == "all"
100
+ Requires-Dist: langchain-anthropic==0.1.23; extra == "all"
101
+ Requires-Dist: anthropic==0.34.0; extra == "all"
102
+ Provides-Extra: analytics
103
+ Requires-Dist: annoy==1.17.3; extra == "analytics"
104
+ Requires-Dist: gradio-tools==0.0.9; extra == "analytics"
105
+ Requires-Dist: gradio-client==0.2.9; extra == "analytics"
106
+ Requires-Dist: streamlit==1.37.1; extra == "analytics"
107
+ Requires-Dist: simsimd==4.3.1; extra == "analytics"
108
+ Requires-Dist: opencv-python==4.10.0.84; extra == "analytics"
109
+ Provides-Extra: anthropic
110
+ Requires-Dist: langchain-anthropic==0.1.11; extra == "anthropic"
111
+ Requires-Dist: anthropic==0.25.2; extra == "anthropic"
112
+ Provides-Extra: crew
113
+ Requires-Dist: colbert-ai==0.2.19; extra == "crew"
114
+ Requires-Dist: vanna==0.3.4; extra == "crew"
115
+ Requires-Dist: crewai[tools]==0.28.8; extra == "crew"
116
+ Provides-Extra: google
117
+ Requires-Dist: langchain-google-vertexai==1.0.10; extra == "google"
118
+ Requires-Dist: langchain-google-genai==1.0.10; extra == "google"
119
+ Requires-Dist: vertexai==1.65.0; extra == "google"
120
+ Provides-Extra: groq
121
+ Requires-Dist: groq==0.11.0; extra == "groq"
122
+ Requires-Dist: langchain-groq==0.1.9; extra == "groq"
123
+ Provides-Extra: hunggingfaces
124
+ Requires-Dist: llama-index-llms-huggingface==0.2.7; extra == "hunggingfaces"
125
+ Provides-Extra: loaders
126
+ Requires-Dist: pymupdf==1.24.4; extra == "loaders"
127
+ Requires-Dist: pymupdf4llm==0.0.1; extra == "loaders"
128
+ Requires-Dist: pdf4llm==0.0.6; extra == "loaders"
129
+ Requires-Dist: PyPDF2==3.0.1; extra == "loaders"
130
+ Requires-Dist: pdfminer.six==20231228; extra == "loaders"
131
+ Requires-Dist: pdfplumber==0.11.0; extra == "loaders"
132
+ Requires-Dist: GitPython==3.1.42; extra == "loaders"
133
+ Requires-Dist: opentelemetry-sdk==1.24.0; extra == "loaders"
134
+ Requires-Dist: rapidocr-onnxruntime==1.3.15; extra == "loaders"
135
+ Requires-Dist: pytesseract==0.3.10; extra == "loaders"
136
+ Requires-Dist: python-docx==1.1.0; extra == "loaders"
137
+ Requires-Dist: python-pptx==0.6.23; extra == "loaders"
138
+ Requires-Dist: docx2txt==0.8; extra == "loaders"
139
+ Requires-Dist: pytube==15.0.0; extra == "loaders"
140
+ Requires-Dist: pydub==0.25.1; extra == "loaders"
141
+ Requires-Dist: markdownify==0.12.1; extra == "loaders"
142
+ Requires-Dist: yt-dlp==2024.4.9; extra == "loaders"
143
+ Requires-Dist: moviepy==1.0.3; extra == "loaders"
144
+ Requires-Dist: mammoth==1.7.1; extra == "loaders"
145
+ Requires-Dist: paddlepaddle==2.6.1; extra == "loaders"
146
+ Requires-Dist: paddlepaddle-gpu==2.6.1; extra == "loaders"
147
+ Requires-Dist: paddleocr==2.8.1; extra == "loaders"
148
+ Requires-Dist: ftfy==6.2.3; extra == "loaders"
149
+ Requires-Dist: librosa==0.10.1; extra == "loaders"
150
+ Requires-Dist: XlsxWriter==3.2.0; extra == "loaders"
151
+ Requires-Dist: xformers==0.0.27.post2; extra == "loaders"
152
+ Provides-Extra: milvus
153
+ Requires-Dist: langchain-milvus>=0.1.4; extra == "milvus"
154
+ Requires-Dist: milvus==2.3.5; extra == "milvus"
155
+ Requires-Dist: pymilvus==2.4.6; extra == "milvus"
156
+ Provides-Extra: openai
157
+ Requires-Dist: langchain-openai==0.1.21; extra == "openai"
158
+ Requires-Dist: openai==1.40.3; extra == "openai"
159
+ Requires-Dist: llama-index-llms-openai==0.1.11; extra == "openai"
160
+ Requires-Dist: tiktoken==0.7.0; extra == "openai"
161
+ Provides-Extra: qdrant
162
+ Requires-Dist: qdrant-client==1.8.0; extra == "qdrant"
163
+
164
+ # AI Parrot: Python package for creating Chatbots
165
+ This is an open-source Python package for creating Chatbots based on Langchain and Navigator.
166
+ This README provides instructions for installation, development, testing, and releasing Parrot.
167
+
168
+ ## Installation
169
+
170
+ **Creating a virtual environment:**
171
+
172
+ This is recommended for development and isolation from system-wide libraries.
173
+ Run the following command in your terminal:
174
+
175
+ Debian-based systems installation:
176
+ ```
177
+ sudo apt install gcc python3.11-venv python3.11-full python3.11-dev libmemcached-dev zlib1g-dev build-essential libffi-dev unixodbc unixodbc-dev libsqliteodbc libev4 libev-dev
178
+ ```
179
+
180
+ For Qdrant installation:
181
+ ```
182
+ docker pull qdrant/qdrant
183
+ docker run -d -p 6333:6333 -p 6334:6334 --name qdrant -v $(pwd)/qdrant_storage:/qdrant/storage:z qdrant/qdrant
184
+ ```
185
+
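To confirm the container is reachable before wiring it into Parrot, you can hit Qdrant's REST port directly (a minimal check; the exact JSON returned depends on your Qdrant version):

```bash
# Root endpoint reports the running Qdrant version
curl http://localhost:6333
# List collections (empty on a fresh instance)
curl http://localhost:6333/collections
```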
186
+ For VertexAI, create a folder named "google" inside the "env" directory and copy the JSON credentials file into it.
187
+
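A minimal sketch of that layout (the service-account file name below is only a placeholder; use the JSON key you downloaded from Google Cloud):

```bash
mkdir -p env/google
cp ~/Downloads/my-service-account.json env/google/credentials.json
```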
188
+ ```bash
189
+ make venv
190
+ ```
191
+
192
+ This will create a virtual environment named `.venv`. To activate it, run:
193
+
194
+ ```bash
195
+ source .venv/bin/activate # Linux/macOS
196
+ ```
197
+
198
+ Once activated, install Parrot within the virtual environment:
199
+
200
+ ```bash
201
+ make install
202
+ ```
203
+ The output will remind you to activate the virtual environment before development.
204
+
205
+ **Optional** (for developers):
206
+ ```bash
207
+ pip install -e .
208
+ ```
209
+
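If you install a released wheel from PyPI instead of a local checkout, the optional extras declared in the metadata above (such as `openai`, `google`, `groq`, `qdrant`, or `all`) can be pulled in with the standard pip extras syntax:

```bash
# Pick only the extras you need; "all" pulls in every optional backend
pip install "ai-parrot[openai,qdrant]"
```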
210
+ ## Start the HTTP server
211
+ ```bash
212
+ python run.py
213
+ ```
214
+
215
+ ## Development Setup
216
+
217
+ This section explains how to set up your development environment:
218
+
219
+ 1. **Install development requirements:**
220
+
221
+ ```bash
222
+ make setup
223
+ ```
224
+
225
+ This installs the development dependencies (linters, test runners, etc.) listed in the `docs/requirements-dev.txt` file.
226
+
227
+ 2. **Install Parrot in editable mode:**
228
+
229
+ This allows you to make changes to the code and test them without reinstalling:
230
+
231
+ ```bash
232
+ make dev
233
+ ```
234
+
235
+ This uses `flit` to install Parrot in editable mode.
236
+
237
+
238
+ ### Usage (Replace with actual usage instructions)
239
+
240
+ *Once you have set up your development environment, you can start using Parrot.*
241
+
242
+ #### Test with Code ChatBOT
243
+ * Set the following environment variables (a shell example follows this list):
244
+ [google]
245
+ GOOGLE_API_KEY=apikey
246
+ GOOGLE_CREDENTIALS_FILE=.../credentials.json
247
+ VERTEX_PROJECT_ID=vertex-project
248
+ VERTEX_REGION=region
249
+
250
+ * Run the chatbot:
251
+ ```bash
252
+ python examples/test_agent.py
253
+ ```
254
+
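As referenced above, one way to provide those settings is to export them in the shell before running the example (values here are placeholders; the `[google]` header suggests they can also live in an INI-style section of the project's configuration):

```bash
export GOOGLE_API_KEY=apikey
export GOOGLE_CREDENTIALS_FILE=/path/to/credentials.json   # placeholder path
export VERTEX_PROJECT_ID=vertex-project
export VERTEX_REGION=us-east1                              # use your GCP region
```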
255
+ ### Testing
256
+
257
+ To run the test suite:
258
+
259
+ ```bash
260
+ make test
261
+ ```
262
+
263
+ This will run tests using `coverage` to report on code coverage.
264
+
265
+
266
+ ### Code Formatting
267
+
268
+ To format the code with black:
269
+
270
+ ```bash
271
+ make format
272
+ ```
273
+
274
+
275
+ ### Linting
276
+
277
+ To lint the code for style and potential errors:
278
+
279
+ ```bash
280
+ make lint
281
+ ```
282
+
283
+ This uses `pylint` and `black` to check for issues.
284
+
285
+
286
+ ### Releasing a New Version
287
+
288
+ This section outlines the steps for releasing a new version of Parrot:
289
+
290
+ 1. **Ensure everything is clean and tested:**
291
+
292
+ ```bash
293
+ make release
294
+ ```
295
+
296
+ This runs `lint`, `test`, and `clean` tasks before proceeding.
297
+
298
+ 2. **Publish the package:**
299
+
300
+ ```bash
301
+ make release
302
+ ```
303
+
304
+ This uses `flit` to publish the package to a repository like PyPI. You'll need to have publishing credentials configured for `flit`.
305
+
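For reference, `flit` takes its upload credentials from `~/.pypirc` or from environment variables; a minimal token-based sketch (the token value is a placeholder) looks like:

```bash
export FLIT_USERNAME=__token__
export FLIT_PASSWORD=pypi-<your-api-token>
flit publish
```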
306
+
307
+ ### Cleaning Up
308
+
309
+ To remove the virtual environment:
310
+
311
+ ```bash
312
+ make distclean
313
+ ```
314
+
315
+
316
+ ### Contributing
317
+
318
+ We welcome contributions to Parrot! Please refer to the CONTRIBUTING.md file for guidelines on how to contribute.
ai_parrot-0.3.3.dist-info/RECORD CHANGED
@@ -1,10 +1,10 @@
1
1
  parrot/__init__.py,sha256=eTkAkHeJ5BBDG2fxrXA4M37ODBJoS1DQYpeBAWL2xeI,387
2
2
  parrot/conf.py,sha256=-9bVGC7Rf-6wpIg6-ojvU4S_G1wBLUCVDt46KEGHEhM,4257
3
- parrot/exceptions.cpython-311-x86_64-linux-gnu.so,sha256=gDwsnUlOlwphVU97XaqG5e7BJs_PWPKdwgwDsjyVZIg,361200
3
+ parrot/exceptions.cpython-311-x86_64-linux-gnu.so,sha256=VNyBh3uLxGQgB0l1bkWjQDqYUN2ZAvRmV12AqQijV9Q,361184
4
4
  parrot/manager.py,sha256=NhzXoWxSgtoWHpmYP8cV2Ujq_SlvCbQYQBaohAeL2TM,5935
5
5
  parrot/models.py,sha256=RsVQCqhSXBKRPcu-BCga9Y1wyvENFXDCuq3_ObIKvAo,13452
6
6
  parrot/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
7
- parrot/version.py,sha256=FGfo76wlKkd0zQ9C4qQseeRf_HKXNQSmd2Lfu8iOctk,373
7
+ parrot/version.py,sha256=pbGrvnHWVk2vkgFh0ab5xc4-svi5oC7IvapZz06YLpM,373
8
8
  parrot/chatbots/__init__.py,sha256=ypskCnME0xUv6psBEGCEyXCrD0J0ULHSllpVmSxqb4A,200
9
9
  parrot/chatbots/abstract.py,sha256=CmDn3k4r9uKImOZRN4L9zxLbCdC-1MPUAorDlfZT-kA,26421
10
10
  parrot/chatbots/asktroc.py,sha256=gyWzyvpAnmXwXd-3DEKoIJtAxt6NnP5mUZdZbkFky8s,604
@@ -56,7 +56,7 @@ parrot/loaders/excel.py,sha256=Y1agxm-jG4AgsA2wlPP3p8uBH40wYW1KM2ycTTLKUm4,12441
56
56
  parrot/loaders/github.py,sha256=CscyUIqoHTytqCbRUUTcV3QSxI8XoDntq5aTU0vdhzQ,2593
57
57
  parrot/loaders/image.py,sha256=A9KCXXoGuhDoyeJaascY7Q1ZK12Kf1ggE1drzJjS3AU,3946
58
58
  parrot/loaders/json.py,sha256=6B43k591OpvoJLbsJa8CxJue_lAt713SCdldn8bFW3c,1481
59
- parrot/loaders/pdf.py,sha256=flGlUf9dLAD2Uh8MkvLP27OU1nvroeHU2HM5a3rBH3M,7996
59
+ parrot/loaders/pdf.py,sha256=wGwFnsUmeQqtqk3L2vYt2DkW09LUODUJN-xLjuAa-do,17826
60
60
  parrot/loaders/pdfchapters.py,sha256=YhA8Cdx3qXBR0vuTVnQ12XgH1DXT_rp1Tawzh4V2U3o,5637
61
61
  parrot/loaders/pdffn.py,sha256=gA-vJEWUiIUwbMxP8Nmvlzlcb39DVV69vGKtSzavUoI,4004
62
62
  parrot/loaders/pdfimages.py,sha256=4Q_HKiAee_hALBsG2qF7PpMgKP1AivHXhmcsCkUa9eE,7899
@@ -68,7 +68,7 @@ parrot/loaders/repo.py,sha256=vBqBAnwU6p3_DCvI9DVhi1Bs8iCDYHwFGp0P9zvGRyw,3737
68
68
  parrot/loaders/rtd.py,sha256=oKOC9Qn3iwulYx5BEvXy4_kusKRsy5RLYNHS-e5p-1k,1988
69
69
  parrot/loaders/txt.py,sha256=AeGroWffFT--7TYlTSTr5Av5zAr8YIp1fxt8r5qdi-A,2802
70
70
  parrot/loaders/video.py,sha256=pl5Ho69bp5vrWMqg5tLbsnHUus1LByTDoL6NPk57Ays,2929
71
- parrot/loaders/videolocal.py,sha256=QjCoiDREkpSyyVQ8yDZefzV2g24Gz4VUF4Eiei6v-dY,3791
71
+ parrot/loaders/videolocal.py,sha256=3EASzbettSO2tboTe3GndR4p6Nihwj6HGZoiPXekYo0,4302
72
72
  parrot/loaders/vimeo.py,sha256=zOvOOIiaZr_bRswJFI7uIMKISgALOxcSim9ZRUFY1Fc,4114
73
73
  parrot/loaders/web.py,sha256=3x06JNpfTGFtvSBPAEBVoVdZkpVXePcJeMtj61B2xJk,8867
74
74
  parrot/loaders/web_base.py,sha256=5SjQddT0Vhne8C9s30iU3Ex_9O1PJ8kyDmy8EdhGBo0,4380
@@ -94,17 +94,17 @@ parrot/tools/wikipedia.py,sha256=oadBTRAupu2dKThEORSHqqVs4u0G9lWOltbP6vSZgPE,199
94
94
  parrot/tools/zipcode.py,sha256=knScSvKgK7bHxyLcBkZFiMs65e-PlYU2_YhG6mohcjU,6440
95
95
  parrot/utils/__init__.py,sha256=vkBIvfl9-0NRLd76MIZk4s49PjtF_dW5imLTv_UOKxM,101
96
96
  parrot/utils/toml.py,sha256=CVyqDdAEyOj6AHfNpyQe4IUvLP_SSXlbHROYPeadLvU,302
97
- parrot/utils/types.cpython-311-x86_64-linux-gnu.so,sha256=kdox48-JUzj92QP6amGOCTIEQhrBUMn6qzrhX1u17CY,791912
97
+ parrot/utils/types.cpython-311-x86_64-linux-gnu.so,sha256=jghuq8bBlgGDjkb88Efi5l9cgR5KZL_qO7yxglGNsTA,791256
98
98
  parrot/utils/uv.py,sha256=Mb09bsi13hhi3xQDBjEhCf-U1wherXl-K4-BLcSvqtc,308
99
99
  parrot/utils/parsers/__init__.py,sha256=l82uIu07QvSJ8Xt0d_seei9n7UUL8PE-YFGBTyNbxSI,62
100
- parrot/utils/parsers/toml.cpython-311-x86_64-linux-gnu.so,sha256=vdQTxL4AyxinDpoDVk0Syx-ycDL02OmXESJOtiVFl0A,451056
100
+ parrot/utils/parsers/toml.cpython-311-x86_64-linux-gnu.so,sha256=gEnv6QGF6DtxExEdVTdNx48j90wPYKBLyCH1UCRj4MQ,451088
101
101
  resources/users/__init__.py,sha256=sdXUV7h0Oogcdru1RrQxbm9_RcMjftf0zTWqvxBVpO8,151
102
102
  resources/users/handlers.py,sha256=BGzqBvPY_OaIF_nONWX4b_B5OyyBrdGuSihIsdlFwjk,291
103
103
  resources/users/models.py,sha256=glk7Emv7QCi6i32xRFDrGc8UwK23_LPg0XUOJoHnwRU,6799
104
104
  settings/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
105
105
  settings/settings.py,sha256=9ueEvyLNurUX-AaIeRPV8GKX1c4YjDLbksUAeqEq6Ck,1854
106
- ai_parrot-0.3.0.dist-info/LICENSE,sha256=vRKOoa7onTsLNvSzJtGtMaNhWWh8B3YAT733Tlu6M4o,1070
107
- ai_parrot-0.3.0.dist-info/METADATA,sha256=84sQiAphAjSB9xEqCw1DWLUfwbDciz-1hp8cekb_-MA,10410
108
- ai_parrot-0.3.0.dist-info/WHEEL,sha256=tFO7F0mawMNWa_NzTDA1ygqZBeMykVNIr04O5Zxk1TE,113
109
- ai_parrot-0.3.0.dist-info/top_level.txt,sha256=qHoO4BhYDfeTkyKnciZSQtn5FSLN3Q-P5xCTkyvbuxg,26
110
- ai_parrot-0.3.0.dist-info/RECORD,,
106
+ ai_parrot-0.3.3.dist-info/LICENSE,sha256=vRKOoa7onTsLNvSzJtGtMaNhWWh8B3YAT733Tlu6M4o,1070
107
+ ai_parrot-0.3.3.dist-info/METADATA,sha256=LHLvoMsy1VvMlC33Kl2RKhwnFgj40lMpPDVmVWYj1m8,10624
108
+ ai_parrot-0.3.3.dist-info/WHEEL,sha256=UQ-0qXN3LQUffjrV43_e_ZXj2pgORBqTmXipnkj0E8I,113
109
+ ai_parrot-0.3.3.dist-info/top_level.txt,sha256=qHoO4BhYDfeTkyKnciZSQtn5FSLN3Q-P5xCTkyvbuxg,26
110
+ ai_parrot-0.3.3.dist-info/RECORD,,
ai_parrot-0.3.3.dist-info/WHEEL CHANGED
@@ -1,5 +1,5 @@
1
1
  Wheel-Version: 1.0
2
- Generator: setuptools (72.2.0)
2
+ Generator: setuptools (74.1.2)
3
3
  Root-Is-Purelib: false
4
4
  Tag: cp311-cp311-manylinux_2_28_x86_64
5
5
 
parrot/loaders/pdf.py CHANGED
@@ -2,10 +2,26 @@ from collections.abc import Callable
2
2
  from pathlib import Path, PurePath
3
3
  from typing import Any
4
4
  from io import BytesIO
5
+ import re
6
+ import ftfy
5
7
  import fitz
6
8
  import pytesseract
9
+ from paddleocr import PaddleOCR
10
+ import torch
11
+ import cv2
12
+ from transformers import (
13
+ # DonutProcessor,
14
+ # VisionEncoderDecoderModel,
15
+ # VisionEncoderDecoderConfig,
16
+ # ViTImageProcessor,
17
+ # AutoTokenizer,
18
+ LayoutLMv3ForTokenClassification,
19
+ LayoutLMv3Processor
20
+ )
21
+ from pdf4llm import to_markdown
7
22
  from PIL import Image
8
23
  from langchain.docstore.document import Document
24
+ from navconfig import config
9
25
  from .basepdf import BasePDF
10
26
 
11
27
 
@@ -31,6 +47,29 @@ class PDFLoader(BasePDF):
31
47
  **kwargs
32
48
  )
33
49
  self.parse_images = kwargs.get('parse_images', False)
50
+ self.page_as_images = kwargs.get('page_as_images', False)
51
+ if self.page_as_images is True:
52
+ # # Load the processor and model from Hugging Face
53
+ # self.image_processor = DonutProcessor.from_pretrained(
54
+ # "naver-clova-ix/donut-base-finetuned-docvqa"
55
+ # )
56
+ # self.image_model = VisionEncoderDecoderModel.from_pretrained(
57
+ # "naver-clova-ix/donut-base-finetuned-docvqa",
58
+
59
+ # )
60
+ # Load the processor and model from Hugging Face
61
+ self.image_processor = LayoutLMv3Processor.from_pretrained(
62
+ "microsoft/layoutlmv3-base",
63
+ apply_ocr=True
64
+ )
65
+ self.image_model = LayoutLMv3ForTokenClassification.from_pretrained(
66
+ # "microsoft/layoutlmv3-base-finetuned-funsd"
67
+ "HYPJUDY/layoutlmv3-base-finetuned-funsd"
68
+ )
69
+ # Set device to GPU if available
70
+ self.image_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
71
+ self.image_model.to(self.image_device)
72
+
34
73
  # Table Settings:
35
74
  self.table_settings = {
36
75
  #"vertical_strategy": "text",
@@ -42,6 +81,134 @@ class PDFLoader(BasePDF):
42
81
  if table_settings:
43
82
  self.table_settings.update(table_settings)
44
83
 
84
+ def explain_image(self, image_path):
85
+ """Function to explain the image."""
86
+ # with open(image_path, "rb") as image_file:
87
+ # image_content = image_file.read()
88
+
89
+ # Open the image
90
+ image = cv2.imread(image_path)
91
+ task_prompt = "<s_docvqa><s_question>{user_input}</s_question><s_answer>"
92
+ question = "Extract Questions about Happily Greet"
93
+ prompt = task_prompt.replace("{user_input}", question)
94
+
95
+ decoder_input_ids = self.image_processor.tokenizer(
96
+ prompt,
97
+ add_special_tokens=False,
98
+ return_tensors="pt",
99
+ ).input_ids
100
+
101
+ pixel_values = self.image_processor(
102
+ image,
103
+ return_tensors="pt"
104
+ ).pixel_values
105
+
106
+ # Send inputs to the appropriate device
107
+ pixel_values = pixel_values.to(self.image_device)
108
+ decoder_input_ids = decoder_input_ids.to(self.image_device)
109
+
110
+ outputs = self.image_model.generate(
111
+ pixel_values,
112
+ decoder_input_ids=decoder_input_ids,
113
+ max_length=self.image_model.decoder.config.max_position_embeddings,
114
+ pad_token_id=self.image_processor.tokenizer.pad_token_id,
115
+ eos_token_id=self.image_processor.tokenizer.eos_token_id,
116
+ bad_words_ids=[[self.image_processor.tokenizer.unk_token_id]],
117
+ # use_cache=True
118
+ return_dict_in_generate=True,
119
+ )
120
+
121
+ sequence = self.image_processor.batch_decode(outputs.sequences)[0]
122
+
123
+
124
+ sequence = sequence.replace(
125
+ self.image_processor.tokenizer.eos_token, ""
126
+ ).replace(
127
+ self.image_processor.tokenizer.pad_token, ""
128
+ )
129
+ # remove first task start token
130
+ sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()
131
+ # Print the extracted sequence
132
+ print("Extracted Text:", sequence)
133
+
134
+ print(self.image_processor.token2json(sequence))
135
+
136
+ # Format the output as Markdown (optional step)
137
+ markdown_text = self.format_as_markdown(sequence)
138
+ print("Markdown Format:\n", markdown_text)
139
+
140
+ return None
141
+
142
+ def convert_to_markdown(self, text):
143
+ """
144
+ Convert the cleaned text into a markdown format.
145
+ You can enhance this function to detect tables, headings, etc.
146
+ """
147
+ # For example, we can identify sections or headers and format them in Markdown
148
+ markdown_text = text
149
+ # Detect headings and bold them
150
+ markdown_text = re.sub(r"(^.*Scorecard.*$)", r"## \1", markdown_text)
151
+ # Convert lines with ":" to a list item (rough approach)
152
+ markdown_text = re.sub(r"(\w+):", r"- **\1**:", markdown_text)
153
+ # Return the markdown formatted text
154
+ return markdown_text
155
+
156
+ def clean_tokenized_text(self, tokenized_text):
157
+ """
158
+ Clean the tokenized text by fixing encoding issues and formatting, preserving line breaks.
159
+ """
160
+ # Fix encoding issues using ftfy
161
+ cleaned_text = ftfy.fix_text(tokenized_text)
162
+
163
+ # Remove <s> and </s> tags (special tokens)
164
+ cleaned_text = cleaned_text.replace("<s>", "").replace("</s>", "")
165
+
166
+ # Replace special characters like 'Ġ' and fix multiple spaces, preserving new lines
167
+ cleaned_text = cleaned_text.replace("Ġ", " ")
168
+
169
+ # Avoid collapsing line breaks, but still normalize multiple spaces
170
+ # Replace multiple spaces with a single space, but preserve line breaks
171
+ cleaned_text = re.sub(r" +", " ", cleaned_text)
172
+
173
+ return cleaned_text.strip()
174
+
175
+ def extract_page_text(self, image_path) -> str:
176
+ # Open the image
177
+ image = Image.open(image_path).convert("RGB")
178
+
179
+ # Processor handles the OCR internally, no need for words or boxes
180
+ encoding = self.image_processor(image, return_tensors="pt", truncation=True)
181
+ encoding = {k: v.to(self.image_device) for k, v in encoding.items()}
182
+
183
+ # Forward pass
184
+ outputs = self.image_model(**encoding)
185
+ logits = outputs.logits
186
+
187
+ # Get predictions
188
+ predictions = logits.argmax(-1).squeeze().tolist()
189
+ labels = [self.image_model.config.id2label[pred] for pred in predictions]
190
+
191
+ # Get the words and boxes from the processor's OCR step
192
+ words = self.image_processor.tokenizer.convert_ids_to_tokens(
193
+ encoding['input_ids'].squeeze().tolist()
194
+ )
195
+ boxes = encoding['bbox'].squeeze().tolist()
196
+
197
+ # Combine words and labels, preserving line breaks based on vertical box position
198
+ extracted_text = ""
199
+ last_box = None
200
+ for word, label, box in zip(words, labels, boxes):
201
+ if label != 'O':
202
+ # Check if the current word is on a new line based on the vertical position of the box
203
+ if last_box and abs(box[1] - last_box[1]) > 10: # A threshold for line breaks
204
+ extracted_text += "\n" # Add a line break
205
+
206
+ extracted_text += f"{word} "
207
+ last_box = box
208
+ cleaned_text = self.clean_tokenized_text(extracted_text)
209
+ markdown_text = self.convert_to_markdown(cleaned_text)
210
+ return markdown_text
211
+
45
212
  def _load_pdf(self, path: Path) -> list:
46
213
  """
47
214
  Load a PDF file using the Fitz library.
@@ -56,6 +223,32 @@ class PDFLoader(BasePDF):
56
223
  self.logger.info(f"Loading PDF file: {path}")
57
224
  pdf = fitz.open(str(path)) # Open the PDF file
58
225
  docs = []
226
+ try:
227
+ md_text = to_markdown(pdf) # get markdown for all pages
228
+ _meta = {
229
+ "url": f'{path}',
230
+ "source": f"{path.name}",
231
+ "filename": path.name,
232
+ "type": 'pdf',
233
+ "question": '',
234
+ "answer": '',
235
+ "source_type": self._source_type,
236
+ "data": {},
237
+ "summary": '',
238
+ "document_meta": {
239
+ "title": pdf.metadata.get("title", ""),
240
+ "creationDate": pdf.metadata.get("creationDate", ""),
241
+ "author": pdf.metadata.get("author", ""),
242
+ }
243
+ }
244
+ docs.append(
245
+ Document(
246
+ page_content=md_text,
247
+ metadata=_meta
248
+ )
249
+ )
250
+ except Exception:
251
+ pass
59
252
  for page_number in range(pdf.page_count):
60
253
  page = pdf[page_number]
61
254
  text = page.get_text()
@@ -79,12 +272,7 @@ class PDFLoader(BasePDF):
79
272
  "summary": summary,
80
273
  "document_meta": {
81
274
  "title": pdf.metadata.get("title", ""),
82
- # "subject": pdf.metadata.get("subject", ""),
83
- # "keywords": pdf.metadata.get("keywords", ""),
84
275
  "creationDate": pdf.metadata.get("creationDate", ""),
85
- # "modDate": pdf.metadata.get("modDate", ""),
86
- # "producer": pdf.metadata.get("producer", ""),
87
- # "creator": pdf.metadata.get("creator", ""),
88
276
  "author": pdf.metadata.get("author", ""),
89
277
  }
90
278
  }
@@ -96,9 +284,10 @@ class PDFLoader(BasePDF):
96
284
  )
97
285
  # Extract images and use OCR to get text from each image
98
286
  # second: images
287
+ file_name = path.stem.replace(' ', '_').replace('.', '').lower()
99
288
  if self.parse_images is True:
289
+ # extract any images in page:
100
290
  image_list = page.get_images(full=True)
101
- file_name = path.stem.replace(' ', '_').replace('.', '').lower()
102
291
  for img_index, img in enumerate(image_list):
103
292
  xref = img[0]
104
293
  base_image = pdf.extract_image(xref)
@@ -181,7 +370,68 @@ class PDFLoader(BasePDF):
181
370
  )
182
371
  except Exception as exc:
183
372
  print(exc)
373
+ # fourth: page as image
374
+ if self.page_as_images is True:
375
+ # Convert the page to a Pixmap (which is an image)
376
+ mat = fitz.Matrix(2, 2)
377
+ pix = page.get_pixmap(dpi=300, matrix=mat) # Increase DPI for better resolution
378
+ img_name = f'{file_name}_page_{page_num}.png'
379
+ img_path = self._imgdir.joinpath(img_name)
380
+ if img_path.exists():
381
+ img_path.unlink(missing_ok=True)
382
+ self.logger.notice(
383
+ f"Saving Page {page_number} as Image on {img_path}"
384
+ )
385
+ pix.save(
386
+ img_path
387
+ )
388
+ # TODO passing the image to a AI visual to get explanation
389
+ # Get the extracted text from the image
390
+ text = self.extract_page_text(img_path)
391
+ url = f'/static/images/{img_name}'
392
+ image_meta = {
393
+ "url": url,
394
+ "source": f"{path.name} Page.#{page_num}",
395
+ "filename": path.name,
396
+ "index": f"{path.name}:{page_num}",
397
+ "question": '',
398
+ "answer": '',
399
+ "type": 'page',
400
+ "data": {},
401
+ "summary": '',
402
+ "document_meta": {
403
+ "image_name": img_name,
404
+ "page_number": f"{page_number}"
405
+ },
406
+ "source_type": self._source_type
407
+ }
408
+ docs.append(
409
+ Document(page_content=text, metadata=image_meta)
410
+ )
184
411
  pdf.close()
185
412
  return docs
186
413
  else:
187
414
  return []
415
+
416
+ def get_ocr(self, img_path) -> list:
417
+ # Initialize PaddleOCR with table recognition
418
+ self.ocr_model = PaddleOCR(
419
+ lang='en',
420
+ det_model_dir=None,
421
+ rec_model_dir=None,
422
+ rec_char_dict_path=None,
423
+ table=True,
424
+ # use_angle_cls=True,
425
+ # use_gpu=True
426
+ )
427
+ result = self.ocr_model.ocr(img_path, cls=True)
428
+
429
+ # extract tables:
430
+ # The result contains the table structure and content
431
+ tables = []
432
+ for line in result:
433
+ if 'html' in line[1]:
434
+ html_table = line[1]['html']
435
+ tables.append(html_table)
436
+
437
+ print('TABLES > ', tables)
parrot/loaders/videolocal.py CHANGED
@@ -105,3 +105,16 @@ class VideoLocalLoader(BaseVideoLoader):
105
105
  if set(item.parts).isdisjoint(self.skip_directories):
106
106
  documents.extend(self.load_video(item))
107
107
  return self.split_documents(documents)
108
+
109
+ def extract(self) -> list:
110
+ documents = []
111
+ if self.path.is_file():
112
+ docs = self.load_video(self.path)
113
+ documents.extend(docs)
114
+ if self.path.is_dir():
115
+ # iterate over the files in the directory
116
+ for ext in self._extension:
117
+ for item in self.path.glob(f'*{ext}'):
118
+ if set(item.parts).isdisjoint(self.skip_directories):
119
+ documents.extend(self.load_video(item))
120
+ return documents
parrot/version.py CHANGED
@@ -3,7 +3,7 @@
3
3
  __title__ = "ai-parrot"
4
4
  __description__ = "Live Chatbots based on Langchain chatbots and Agents \
5
5
  Integrated into Navigator Framework or used into aiohttp applications."
6
- __version__ = "0.3.0"
6
+ __version__ = "0.3.3"
7
7
  __author__ = "Jesus Lara"
8
8
  __author_email__ = "jesuslarag@gmail.com"
9
9
  __license__ = "MIT"
ai_parrot-0.3.0.dist-info/METADATA DELETED
@@ -1,319 +0,0 @@
1
- Metadata-Version: 2.1
2
- Name: ai-parrot
3
- Version: 0.3.0
4
- Summary: Live Chatbots based on Langchain chatbots and Agents Integrated into Navigator Framework or used into aiohttp applications.
5
- Home-page: https://github.com/phenobarbital/ai-parrot
6
- Author: Jesus Lara
7
- Author-email: jesuslara@phenobarbital.info
8
- License: MIT
9
- Project-URL: Source, https://github.com/phenobarbital/ai-parrot
10
- Project-URL: Tracker, https://github.com/phenobarbital/ai-parrot/issues
11
- Project-URL: Documentation, https://github.com/phenobarbital/ai-parrot/
12
- Project-URL: Funding, https://paypal.me/phenobarbital
13
- Project-URL: Say Thanks!, https://saythanks.io/to/phenobarbital
14
- Keywords: asyncio,asyncpg,aioredis,aiomcache,langchain,chatbot,agents
15
- Platform: POSIX
16
- Classifier: Development Status :: 4 - Beta
17
- Classifier: Intended Audience :: Developers
18
- Classifier: Operating System :: POSIX :: Linux
19
- Classifier: Environment :: Web Environment
20
- Classifier: License :: OSI Approved :: MIT License
21
- Classifier: Topic :: Software Development :: Build Tools
22
- Classifier: Topic :: Software Development :: Libraries :: Python Modules
23
- Classifier: Programming Language :: Python :: 3.9
24
- Classifier: Programming Language :: Python :: 3.10
25
- Classifier: Programming Language :: Python :: 3.11
26
- Classifier: Programming Language :: Python :: 3.12
27
- Classifier: Programming Language :: Python :: 3 :: Only
28
- Classifier: Framework :: AsyncIO
29
- Requires-Python: >=3.10.12
30
- Description-Content-Type: text/markdown
31
- License-File: LICENSE
32
- Requires-Dist: Cython ==3.0.9
33
- Requires-Dist: pymupdf ==1.24.4
34
- Requires-Dist: pymupdf4llm ==0.0.1
35
- Requires-Dist: pdf4llm ==0.0.6
36
- Requires-Dist: PyPDF2 ==3.0.1
37
- Requires-Dist: pdfminer.six ==20231228
38
- Requires-Dist: pdfplumber ==0.11.0
39
- Requires-Dist: bitsandbytes ==0.43.0
40
- Requires-Dist: Cartopy ==0.22.0
41
- Requires-Dist: chromadb ==0.4.24
42
- Requires-Dist: contourpy ==1.2.0
43
- Requires-Dist: datasets ==2.18.0
44
- Requires-Dist: faiss-cpu ==1.8.0
45
- Requires-Dist: fastavro ==1.9.4
46
- Requires-Dist: GitPython ==3.1.42
47
- Requires-Dist: gunicorn ==21.2.0
48
- Requires-Dist: jq ==1.7.0
49
- Requires-Dist: rank-bm25 ==0.2.2
50
- Requires-Dist: matplotlib ==3.8.3
51
- Requires-Dist: numba ==0.59.0
52
- Requires-Dist: opentelemetry-sdk ==1.24.0
53
- Requires-Dist: rapidocr-onnxruntime ==1.3.15
54
- Requires-Dist: pytesseract ==0.3.10
55
- Requires-Dist: python-docx ==1.1.0
56
- Requires-Dist: python-pptx ==0.6.23
57
- Requires-Dist: docx2txt ==0.8
58
- Requires-Dist: pytube ==15.0.0
59
- Requires-Dist: pydub ==0.25.1
60
- Requires-Dist: markdownify ==0.12.1
61
- Requires-Dist: librosa ==0.10.1
62
- Requires-Dist: yt-dlp ==2024.4.9
63
- Requires-Dist: moviepy ==1.0.3
64
- Requires-Dist: safetensors ==0.4.2
65
- Requires-Dist: sentence-transformers ==2.6.1
66
- Requires-Dist: tabulate ==0.9.0
67
- Requires-Dist: tiktoken ==0.7.0
68
- Requires-Dist: tokenizers ==0.19.1
69
- Requires-Dist: unstructured ==0.14.3
70
- Requires-Dist: unstructured-client ==0.18.0
71
- Requires-Dist: uvloop ==0.19.0
72
- Requires-Dist: XlsxWriter ==3.2.0
73
- Requires-Dist: youtube-transcript-api ==0.6.2
74
- Requires-Dist: selenium ==4.18.1
75
- Requires-Dist: webdriver-manager ==4.0.1
76
- Requires-Dist: transitions ==0.9.0
77
- Requires-Dist: sentencepiece ==0.2.0
78
- Requires-Dist: duckduckgo-search ==5.3.0
79
- Requires-Dist: google-search-results ==2.4.2
80
- Requires-Dist: google-api-python-client >=2.86.0
81
- Requires-Dist: gdown ==5.1.0
82
- Requires-Dist: weasyprint ==61.2
83
- Requires-Dist: markdown2 ==2.4.13
84
- Requires-Dist: xformers ==0.0.25.post1
85
- Requires-Dist: fastembed ==0.3.4
86
- Requires-Dist: mammoth ==1.7.1
87
- Requires-Dist: accelerate ==0.29.3
88
- Requires-Dist: langchain >=0.2.6
89
- Requires-Dist: langchain-community >=0.2.6
90
- Requires-Dist: langchain-core ==0.2.32
91
- Requires-Dist: langchain-experimental ==0.0.62
92
- Requires-Dist: langchainhub ==0.1.15
93
- Requires-Dist: langchain-text-splitters ==0.2.2
94
- Requires-Dist: huggingface-hub ==0.23.5
95
- Requires-Dist: llama-index ==0.10.20
96
- Requires-Dist: llama-cpp-python ==0.2.56
97
- Requires-Dist: asyncdb[all] >=2.7.10
98
- Requires-Dist: querysource >=3.10.1
99
- Requires-Dist: yfinance ==0.2.40
100
- Requires-Dist: youtube-search ==2.1.2
101
- Requires-Dist: wikipedia ==1.4.0
102
- Requires-Dist: mediawikiapi ==1.2
103
- Requires-Dist: wikibase-rest-api-client ==0.2.0
104
- Requires-Dist: asknews ==0.7.30
105
- Requires-Dist: pyowm ==3.3.0
106
- Requires-Dist: O365 ==2.0.35
107
- Requires-Dist: langchain-huggingface ==0.0.3
108
- Requires-Dist: stackapi ==0.3.1
109
- Provides-Extra: all
110
- Requires-Dist: langchain-milvus ==0.1.1 ; extra == 'all'
111
- Requires-Dist: milvus ==2.3.5 ; extra == 'all'
112
- Requires-Dist: pymilvus ==2.4.4 ; extra == 'all'
113
- Requires-Dist: groq ==0.6.0 ; extra == 'all'
114
- Requires-Dist: langchain-groq ==0.1.4 ; extra == 'all'
115
- Requires-Dist: llama-index-llms-huggingface ==0.2.7 ; extra == 'all'
116
- Requires-Dist: langchain-google-vertexai ==1.0.8 ; extra == 'all'
117
- Requires-Dist: langchain-google-genai ==1.0.8 ; extra == 'all'
118
- Requires-Dist: google-generativeai ==0.7.2 ; extra == 'all'
119
- Requires-Dist: vertexai ==1.60.0 ; extra == 'all'
120
- Requires-Dist: google-cloud-aiplatform >=1.60.0 ; extra == 'all'
121
- Requires-Dist: grpc-google-iam-v1 ==0.13.0 ; extra == 'all'
122
- Requires-Dist: langchain-openai ==0.1.21 ; extra == 'all'
123
- Requires-Dist: openai ==1.40.8 ; extra == 'all'
124
- Requires-Dist: llama-index-llms-openai ==0.1.11 ; extra == 'all'
125
- Requires-Dist: langchain-anthropic ==0.1.23 ; extra == 'all'
126
- Requires-Dist: anthropic ==0.34.0 ; extra == 'all'
127
- Provides-Extra: analytics
128
- Requires-Dist: annoy ==1.17.3 ; extra == 'analytics'
129
- Requires-Dist: gradio-tools ==0.0.9 ; extra == 'analytics'
130
- Requires-Dist: gradio-client ==0.2.9 ; extra == 'analytics'
131
- Requires-Dist: streamlit ==1.37.1 ; extra == 'analytics'
132
- Requires-Dist: simsimd ==4.3.1 ; extra == 'analytics'
133
- Requires-Dist: opencv-python ==4.10.0.84 ; extra == 'analytics'
134
- Provides-Extra: anthropic
135
- Requires-Dist: langchain-anthropic ==0.1.11 ; extra == 'anthropic'
136
- Requires-Dist: anthropic ==0.25.2 ; extra == 'anthropic'
137
- Provides-Extra: crew
138
- Requires-Dist: colbert-ai ==0.2.19 ; extra == 'crew'
139
- Requires-Dist: vanna ==0.3.4 ; extra == 'crew'
140
- Requires-Dist: crewai[tools] ==0.28.8 ; extra == 'crew'
141
- Provides-Extra: google
142
- Requires-Dist: langchain-google-vertexai ==1.0.4 ; extra == 'google'
143
- Requires-Dist: langchain-google-genai ==1.0.4 ; extra == 'google'
144
- Requires-Dist: google-generativeai ==0.5.4 ; extra == 'google'
145
- Requires-Dist: vertexai ==1.49.0 ; extra == 'google'
146
- Requires-Dist: google-cloud-aiplatform ==1.49.0 ; extra == 'google'
147
- Requires-Dist: grpc-google-iam-v1 ==0.13.0 ; extra == 'google'
148
- Provides-Extra: groq
149
- Requires-Dist: groq ==0.6.0 ; extra == 'groq'
150
- Requires-Dist: langchain-groq ==0.1.4 ; extra == 'groq'
151
- Provides-Extra: hunggingfaces
152
- Requires-Dist: llama-index-llms-huggingface ==0.2.7 ; extra == 'hunggingfaces'
153
- Provides-Extra: milvus
154
- Requires-Dist: langchain-milvus ==0.1.1 ; extra == 'milvus'
155
- Requires-Dist: milvus ==2.3.5 ; extra == 'milvus'
156
- Requires-Dist: pymilvus ==2.4.4 ; extra == 'milvus'
157
- Provides-Extra: openai
158
- Requires-Dist: langchain-openai ==0.1.21 ; extra == 'openai'
159
- Requires-Dist: openai ==1.40.3 ; extra == 'openai'
160
- Requires-Dist: llama-index-llms-openai ==0.1.11 ; extra == 'openai'
161
- Requires-Dist: tiktoken ==0.7.0 ; extra == 'openai'
162
- Provides-Extra: qdrant
163
- Requires-Dist: qdrant-client ==1.8.0 ; extra == 'qdrant'
164
-
165
- # AI Parrot: Python package for creating Chatbots
166
- This is an open-source Python package for creating Chatbots based on Langchain and Navigator.
167
- This README provides instructions for installation, development, testing, and releasing Parrot.
168
-
169
- ## Installation
170
-
171
- **Creating a virtual environment:**
172
-
173
- This is recommended for development and isolation from system-wide libraries.
174
- Run the following command in your terminal:
175
-
176
- Debian-based systems installation:
177
- ```
178
- sudo apt install gcc python3.11-venv python3.11-full python3.11-dev libmemcached-dev zlib1g-dev build-essential libffi-dev unixodbc unixodbc-dev libsqliteodbc libev4 libev-dev
179
- ```
180
-
181
- For Qdrant installation:
182
- ```
183
- docker pull qdrant/qdrant
184
- docker run -d -p 6333:6333 -p 6334:6334 --name qdrant -v $(pwd)/qdrant_storage:/qdrant/storage:z qdrant/qdrant
185
- ```
186
-
187
- For VertexAI, creates a folder on "env" called "google" and copy the JSON credentials file into it.
188
-
189
- ```bash
190
- make venv
191
- ```
192
-
193
- This will create a virtual environment named `.venv`. To activate it, run:
194
-
195
- ```bash
196
- source .venv/bin/activate # Linux/macOS
197
- ```
198
-
199
- Once activated, install Parrot within the virtual environment:
200
-
201
- ```bash
202
- make install
203
- ```
204
- The output will remind you to activate the virtual environment before development.
205
-
206
- **Optional** (for developers):
207
- ```bash
208
- pip install -e .
209
- ```
210
-
211
- ## Start http server
212
- ```bash
213
- python run.py
214
- ```
215
-
216
- ## Development Setup
217
-
218
- This section explains how to set up your development environment:
219
-
220
- 1. **Install development requirements:**
221
-
222
- ```bash
223
- make setup
224
- ```
225
-
226
- This installs development dependencies like linters and test runners mentioned in the `docs/requirements-dev.txt` file.
227
-
228
- 2. **Install Parrot in editable mode:**
229
-
230
- This allows you to make changes to the code and test them without reinstalling:
231
-
232
- ```bash
233
- make dev
234
- ```
235
-
236
- This uses `flit` to install Parrot in editable mode.
237
-
238
-
239
- ### Usage (Replace with actual usage instructions)
240
-
241
- *Once you have set up your development environment, you can start using Parrot.*
242
-
243
- #### Test with Code ChatBOT
244
- * Set environment variables for:
245
- [google]
246
- GOOGLE_API_KEY=apikey
247
- GOOGLE_CREDENTIALS_FILE=.../credentials.json
248
- VERTEX_PROJECT_ID=vertex-project
249
- VERTEX_REGION=region
250
-
251
- * Run the chatbot:
252
- ```bash
253
- python examples/test_agent.py
254
- ```
255
-
256
- ### Testing
257
-
258
- To run the test suite:
259
-
260
- ```bash
261
- make test
262
- ```
263
-
264
- This will run tests using `coverage` to report on code coverage.
265
-
266
-
267
- ### Code Formatting
268
-
269
- To format the code with black:
270
-
271
- ```bash
272
- make format
273
- ```
274
-
275
-
276
- ### Linting
277
-
278
- To lint the code for style and potential errors:
279
-
280
- ```bash
281
- make lint
282
- ```
283
-
284
- This uses `pylint` and `black` to check for issues.
285
-
286
-
287
- ### Releasing a New Version
288
-
289
- This section outlines the steps for releasing a new version of Parrot:
290
-
291
- 1. **Ensure everything is clean and tested:**
292
-
293
- ```bash
294
- make release
295
- ```
296
-
297
- This runs `lint`, `test`, and `clean` tasks before proceeding.
298
-
299
- 2. **Publish the package:**
300
-
301
- ```bash
302
- make release
303
- ```
304
-
305
- This uses `flit` to publish the package to a repository like PyPI. You'll need to have publishing credentials configured for `flit`.
306
-
307
-
308
- ### Cleaning Up
309
-
310
- To remove the virtual environment:
311
-
312
- ```bash
313
- make distclean
314
- ```
315
-
316
-
317
- ### Contributing
318
-
319
- We welcome contributions to Parrot! Please refer to the CONTRIBUTING.md file for guidelines on how to contribute.