cohere-compass-sdk 1.3.2__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- cohere_compass_sdk-1.3.2/LICENSE +21 -0
- cohere_compass_sdk-1.3.2/PKG-INFO +354 -0
- cohere_compass_sdk-1.3.2/README.md +336 -0
- cohere_compass_sdk-1.3.2/cohere_compass/__init__.py +48 -0
- cohere_compass_sdk-1.3.2/cohere_compass/clients/__init__.py +3 -0
- cohere_compass_sdk-1.3.2/cohere_compass/clients/access_control.py +637 -0
- cohere_compass_sdk-1.3.2/cohere_compass/clients/compass.py +1184 -0
- cohere_compass_sdk-1.3.2/cohere_compass/clients/parser.py +423 -0
- cohere_compass_sdk-1.3.2/cohere_compass/clients/rbac.py +327 -0
- cohere_compass_sdk-1.3.2/cohere_compass/constants.py +37 -0
- cohere_compass_sdk-1.3.2/cohere_compass/exceptions.py +51 -0
- cohere_compass_sdk-1.3.2/cohere_compass/models/__init__.py +39 -0
- cohere_compass_sdk-1.3.2/cohere_compass/models/access_control.py +131 -0
- cohere_compass_sdk-1.3.2/cohere_compass/models/config.py +260 -0
- cohere_compass_sdk-1.3.2/cohere_compass/models/datasources.py +70 -0
- cohere_compass_sdk-1.3.2/cohere_compass/models/documents.py +252 -0
- cohere_compass_sdk-1.3.2/cohere_compass/models/rbac.py +133 -0
- cohere_compass_sdk-1.3.2/cohere_compass/models/search.py +109 -0
- cohere_compass_sdk-1.3.2/cohere_compass/py.typed +0 -0
- cohere_compass_sdk-1.3.2/cohere_compass/utils.py +152 -0
- cohere_compass_sdk-1.3.2/pyproject.toml +60 -0
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 Cohere
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1,354 @@
+Metadata-Version: 2.1
+Name: cohere-compass-sdk
+Version: 1.3.2
+Summary: Cohere Compass SDK
+Requires-Python: >=3.9,<4.0
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Requires-Dist: Deprecated (>=1.2.18,<2.0.0)
+Requires-Dist: fsspec (>=2024.6.1)
+Requires-Dist: joblib (==1.4.2)
+Requires-Dist: pydantic (>=2.6.3)
+Requires-Dist: requests (>=2.25.0,<3.0.0)
+Requires-Dist: tenacity (>=8.2.3,<9.0.0)
+Description-Content-Type: text/markdown
+
+# Cohere Compass SDK
+
+[Checked with pyright](https://microsoft.github.io/pyright/)
+
+The Compass SDK is a Python library that allows you to parse documents and insert them
+into a Compass index.
+
+In order to parse documents, the Compass SDK relies on the Compass Parser API, which is
+a RESTful API that receives files and returns parsed documents. This requires a hosted
+Compass server.
+
+The Compass SDK provides a `CompassParserClient` that lets you interact with the parser
+API from your Python code in a convenient manner. The `CompassParserClient` provides
+methods to parse single and multiple files, as well as entire folders, and supports
+multiple file types (e.g., `pdf`, `docx`, `json`, `csv`, etc.) as well as different file
+systems (e.g., local, S3, GCS, etc.).
+
+To insert parsed documents into a `Compass` index, the Compass SDK provides a
+`CompassClient` class that lets you interact with a Compass API server. The Compass API
+is also a RESTful API that lets you create, delete, and search documents in a Compass
+index. To install a Compass API service, please refer to the [Compass
+documentation](https://github.com/cohere-ai/compass).
+
+## Table of Contents
+
+<!--
+Do NOT remove the line below; it is used by markdown-toc to automatically generate the
+Table of Contents.
+
+To update the Table of Contents, execute the following command in the repo root dir:
+
+markdown-toc -i README.md
+
+If you don't have the markdown-toc tool, you can install it with:
+
+npm i -g markdown-toc # use sudo if you use a system-wide node installation.
+-->
+
+<!-- toc -->
+
+- [Getting Started](#getting-started)
+- [Local Development](#local-development)
+  - [Create Python Virtual Environment](#create-python-virtual-environment)
+  - [Running Tests Locally](#running-tests-locally)
+    - [VSCode Users](#vscode-users)
+  - [Pre-commit](#pre-commit)
+
+<!-- tocstop -->
+
+## Getting Started
+
+Fill in your API URL, parser URL, bearer token, and path to test data below for an end-to-end run
+of parsing and searching.
+
+### Installation
+
+```bash
+pip install git+https://github.com/cohere-ai/cohere-compass-sdk.git
+```
+
+```python
+from cohere_compass.clients.compass import CompassClient
+from cohere_compass.clients.parser import CompassParserClient
+from cohere_compass.models.config import MetadataStrategy, MetadataConfig
+
+api_url = "<COMPASS_URL>"
+parser_url = "<PARSER URL>"
+bearer_token = "<PASS BEARER TOKEN IF ANY OTHERWISE LEAVE IT BLANK>"
+
+index = "test-index"
+data_to_index = "<PATH_TO_TEST_DATA>"
+
+# Parse the files before indexing
+parsing_client = CompassParserClient(parser_url=parser_url)
+metadata_config = MetadataConfig(
+    metadata_strategy=MetadataStrategy.No_Metadata,
+    commandr_extractable_attributes=["date", "link", "page_title", "authors"]
+)
+
+docs_to_index = parsing_client.process_folder(folder_path=data_to_index, metadata_config=metadata_config, recursive=True)
+
+# Create index and insert files
+compass_client = CompassClient(index_url=api_url, bearer_token=bearer_token)
+compass_client.create_index(index_name=index)
+results = compass_client.insert_docs(index_name=index, docs=docs_to_index)
+
+result = compass_client.search_chunks(index_name=index, query="test", top_k=1)
+print(f"Results preview: \n {result.hits} ... \n \n ")
+```
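The placeholders in the example above are typed directly into the script. As a minimal sketch of an alternative, the settings could be read from environment variables instead. The variable names `COMPASS_API_URL`, `COMPASS_PARSER_URL`, and `COMPASS_BEARER_TOKEN` and the helper `load_compass_settings` are assumptions made for illustration; they are not part of the SDK.

```python
import os
from typing import Dict, Mapping


def load_compass_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect connection settings from environment variables.

    The variable names are hypothetical; choose whatever fits your deployment.
    """
    api_url = env.get("COMPASS_API_URL", "")
    parser_url = env.get("COMPASS_PARSER_URL", "")
    if not api_url or not parser_url:
        raise ValueError("COMPASS_API_URL and COMPASS_PARSER_URL must both be set")
    # The README says to leave the token blank when none is needed,
    # so a missing variable becomes an empty string.
    bearer_token = env.get("COMPASS_BEARER_TOKEN", "")
    return {"api_url": api_url, "parser_url": parser_url, "bearer_token": bearer_token}


# Demonstration with an explicit mapping instead of the real environment.
settings = load_compass_settings(
    {
        "COMPASS_API_URL": "http://localhost:8080",
        "COMPASS_PARSER_URL": "http://localhost:8081",
    }
)
print(settings)
```

The returned values can then be passed to the `parser_url`, `index_url`, and `bearer_token` arguments used in the example above.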
+
+### Adding filters to documents
+
+#### Adding filter via dict
+
+```python
+from cohere_compass.clients.compass import CompassClient
+from cohere_compass.clients.parser import CompassParserClient
+from cohere_compass.models.search import SearchFilter
+
+api_url = "<COMPASS_URL>"
+parser_url = "<PARSER URL>"
+data_to_index = "<PATH_TO_TEST_DATA>"
+index = "test-index"
+bearer_token = "<PASS BEARER TOKEN IF ANY OTHERWISE LEAVE IT BLANK>"
+
+parsing_client = CompassParserClient(parser_url=parser_url)
+custom_context_dict = {
+    "doc_purpose": "demo"
+}
+
+docs_to_index = parsing_client.process_folder(folder_path=data_to_index, recursive=True, custom_context=custom_context_dict)
+
+compass_client = CompassClient(index_url=api_url, bearer_token=bearer_token)
+filter = SearchFilter(type=SearchFilter.FilterType.EQ, field="content.doc_purpose", value="demo")
+result = compass_client.search_chunks(index_name=index, query="*", filters=[filter])
+print(f"Results preview: \n {result.hits} ... \n \n ")
+```
+
+#### Adding filter via function
+
+```python
+from cohere_compass.clients.compass import CompassClient
+from cohere_compass.clients.parser import CompassParserClient
+from cohere_compass.models.search import SearchFilter
+from cohere_compass.models.documents import CompassDocument
+
+api_url = "<COMPASS_URL>"
+parser_url = "<PARSER URL>"
+data_to_index = "<PATH_TO_TEST_DATA>"
+index = "test-index"
+bearer_token = "<PASS BEARER TOKEN IF ANY OTHERWISE LEAVE IT BLANK>"
+
+parsing_client = CompassParserClient(parser_url=parser_url)
+
+def custom_context_fn(input: CompassDocument):
+    content = input.content
+    if len(input.chunks) > 2:
+        content["new_doc_field"] = "more_than_two_chunks"
+    else:
+        content["new_doc_field"] = "less_than_two_chunks"
+    return content
+
+
+docs_to_index = parsing_client.process_folder(folder_path=data_to_index, recursive=True, custom_context=custom_context_fn)
+
+compass_client = CompassClient(index_url=api_url, bearer_token=bearer_token)
+filter = SearchFilter(type=SearchFilter.FilterType.EQ, field="content.new_doc_field", value="less_than_two_chunks")
+result = compass_client.search_chunks(index_name=index, query="*", filters=[filter])
+print(f"Results preview: \n {result.hits} ... \n \n ")
+```
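To see what a custom context function produces without a parser server, the logic can be exercised on a stand-in object. The sketch below is only an illustration: `StandInDocument` is a hypothetical substitute that mimics just the `content` and `chunks` attributes used above, not the SDK's `CompassDocument`. Unlike the README function, it copies the content dict rather than mutating it in place.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class StandInDocument:
    """Hypothetical stand-in exposing only `content` and `chunks`."""

    content: Dict[str, Any] = field(default_factory=dict)
    chunks: List[str] = field(default_factory=list)


def chunk_count_context(doc: StandInDocument) -> Dict[str, Any]:
    """Return the document content plus a label derived from the chunk count."""
    content = dict(doc.content)  # copy so the input document is left untouched
    if len(doc.chunks) > 2:
        content["new_doc_field"] = "more_than_two_chunks"
    else:
        # Note: a document with exactly two chunks also lands here.
        content["new_doc_field"] = "less_than_two_chunks"
    return content


short_doc = StandInDocument(content={"title": "a"}, chunks=["c1", "c2"])
long_doc = StandInDocument(content={"title": "b"}, chunks=["c1", "c2", "c3"])
print(chunk_count_context(short_doc)["new_doc_field"])  # less_than_two_chunks
print(chunk_count_context(long_doc)["new_doc_field"])  # more_than_two_chunks
```

The label written here is the value that the `content.new_doc_field` search filter in the example above compares against, so the two strings have to match exactly.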
+
+### RBAC
+
+```python
+from cohere_compass.clients.access_control import CompassRootClient
+from cohere_compass.models.access_control import Group, Permission, Policy, Role, User
+from requests.exceptions import HTTPError
+
+ROOT_BEARER_TOKEN = "<ROOT_BEARER_TOKEN>"
+API_URL = "<API_URL>"
+compass_root = CompassRootClient(API_URL, ROOT_BEARER_TOKEN)
+
+user = User(user_name="<USER_NAME>")
+group = Group(group_name="<GROUP_NAME>")
+role = Role(role_name="<ROLE_NAME>")
+indexes = ["<ALLOWED_INDEX or REGEX>"]
+permission = Permission.WRITE  # or Permission.READ
+
+try:
+    # Create Users
+    users = compass_root.create_users([user])
+
+    # Create Groups
+    groups = compass_root.create_groups([group])
+
+    # Add Users to a Group
+    memberships = compass_root.add_members_to_group(group.group_name, [user.user_name])
+
+    # Add Policies and Create a Role
+    role.policies = [
+        Policy(permission=Permission.READ, indexes=indexes),
+    ]
+    roles = compass_root.create_roles([role])
+
+    # Update Role Policies
+    role.policies = [
+        Policy(permission=Permission.READ, indexes=indexes),
+        Policy(permission=Permission.WRITE, indexes=indexes),
+    ]
+    role = compass_root.update_role(role)
+
+    # Assign Roles to a Group
+    role_assignments = compass_root.add_roles_to_group(group.group_name, [role.role_name])
+
+    # Token for the user to access the indexes
+    USER_TO_TOKENS = {user.name: user.token for user in users}
+except HTTPError as e:
+    if e.response.status_code == 409:
+        print("An entity already exists", e.response.json())
+```
+
+### Reading RBAC Information
+
+```python
+from cohere_compass.clients.access_control import CompassRootClient
+from cohere_compass.models.access_control import Group, Role, User, PageDirection
+from requests.exceptions import HTTPError
+
+ROOT_BEARER_TOKEN = "<ROOT_BEARER_TOKEN>"
+API_URL = "<API_URL>"
+compass_root = CompassRootClient(API_URL, ROOT_BEARER_TOKEN)
+
+user = User(user_name="<USER_NAME>")
+group = Group(group_name="<GROUP_NAME>")
+role = Role(role_name="<ROLE_NAME>")
+
+# List all Users in the RBAC system
+# First page
+user_page = compass_root.get_users_page()
+# Subsequent pages
+user_page = compass_root.get_users_page(page_info=user_page.page_info, direction=PageDirection.NEXT)
+
+# List all Groups in the RBAC system
+# First page
+group_page = compass_root.get_groups_page()
+# Subsequent pages
+group_page = compass_root.get_groups_page(page_info=group_page.page_info, direction=PageDirection.NEXT)
+
+# List all Roles in the RBAC system
+# First page
+role_page = compass_root.get_roles_page()
+# Subsequent pages
+role_page = compass_root.get_roles_page(page_info=role_page.page_info, direction=PageDirection.NEXT)
+
+# Get the Group Details (all data + first page each of Users who are Members and Roles Assigned)
+detailed_group = compass_root.get_detailed_group(group.group_name)
+
+# Get pages of Group's User Memberships
+# First page
+memberships = compass_root.get_group_members_page(group.group_name)
+# Subsequent pages (can use the users_page_info from details)
+memberships = compass_root.get_group_members_page(group.group_name, page_info=memberships.page_info, direction=PageDirection.NEXT)
+
+# Get pages of Group's Roles Assignments
+# First page
+role_assignments = compass_root.get_group_roles_page(group.group_name)
+# Subsequent pages (can use the role_page_info from details)
+role_assignments = compass_root.get_group_roles_page(group.group_name, page_info=role_assignments.page_info, direction=PageDirection.NEXT)
+
+# Get the User Details (all data + first page of Groups that the User is a Member of)
+detailed_user = compass_root.get_detailed_user(user.user_name)
+
+# Get pages of User's Group Memberships
+# First page
+group_memberships = compass_root.get_user_groups_page(user.user_name)
+# Subsequent pages (can use the group_page_info from details)
+group_memberships = compass_root.get_user_groups_page(user.user_name, page_info=group_memberships.page_info, direction=PageDirection.NEXT)
+
+# Get the Roles Details (all data + first page of Groups the Role is Assigned to)
+detailed_role = compass_root.get_detailed_role(role.role_name)
+
+# Get pages of Role's Group Assignments
+group_assignments = compass_root.get_role_groups_page(role.role_name)
+# Subsequent pages (can use the group_page_info from details)
+group_assignments = compass_root.get_role_groups_page(role.role_name, page_info=group_assignments.page_info, direction=PageDirection.NEXT)
+
+# Filtering any Page type query, exemplified on Users Page, but works with all.
+user_page = compass_root.get_users_page(filter="<SOME_NAME_OR_NAME_PARTIAL>")
+```
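The example above fetches a first page and then one following page for each query. Walking through every page follows the same pattern in a loop. The sketch below shows one possible shape for such a loop using a generic helper. The helper `iter_pages`, the `FakePage` class, and its `next_cursor` field are hypothetical stand-ins so the code runs without a Compass server; they do not describe the SDK's page objects, and how to detect the last page in the real SDK is left as a caller-supplied function because the README does not show it.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, TypeVar

P = TypeVar("P")


def iter_pages(
    fetch_first: Callable[[], P],
    fetch_next: Callable[[P], P],
    has_more: Callable[[P], bool],
) -> Iterator[P]:
    """Yield the first page, then keep fetching while `has_more` says so."""
    page = fetch_first()
    yield page
    while has_more(page):
        page = fetch_next(page)
        yield page


# Stand-in data source, used only to make the sketch runnable.
@dataclass
class FakePage:
    users: List[str]
    next_cursor: Optional[int]  # hypothetical; not the SDK's page_info


_DATA = ["alice", "bob", "carol", "dave", "erin"]


def fake_get_users_page(cursor: int = 0, size: int = 2) -> FakePage:
    """Return `size` users starting at `cursor`, plus the next cursor if any."""
    end = cursor + size
    return FakePage(_DATA[cursor:end], end if end < len(_DATA) else None)


all_users = [
    name
    for page in iter_pages(
        fetch_first=fake_get_users_page,
        fetch_next=lambda p: fake_get_users_page(p.next_cursor),
        has_more=lambda p: p.next_cursor is not None,
    )
    for name in page.users
]
print(all_users)
```

With the root client, `fetch_first` would wrap a call such as `get_users_page()` and `fetch_next` would wrap the `page_info=..., direction=PageDirection.NEXT` call shown above, while `has_more` would need to inspect whatever the returned page exposes for that purpose.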
+
+### Deleting RBAC
+
+```python
+from cohere_compass.clients.access_control import CompassRootClient
+from cohere_compass.models.access_control import Group, Role, User
+
+ROOT_BEARER_TOKEN = "<ROOT_BEARER_TOKEN>"
+API_URL = "<API_URL>"
+compass_root = CompassRootClient(API_URL, ROOT_BEARER_TOKEN)
+
+user = User(user_name="<USER_NAME>")
+group = Group(group_name="<GROUP_NAME>")
+role = Role(role_name="<ROLE_NAME>")
+
+# Remove Roles from a Group
+removed_roles = compass_root.remove_roles_from_group(group.group_name, [role.role_name])
+
+# Remove Members from a Group
+removed_members = compass_root.remove_members_from_group(group.group_name, [user.user_name])
+
+# Delete Roles
+deleted_roles = compass_root.delete_roles([role.role_name])
+
+# Delete Groups
+deleted_groups = compass_root.delete_groups([group.group_name])
+
+# Delete Users
+deleted_users = compass_root.delete_users([user.user_name])
+```
+
+## Local Development
+
+### Create Python Virtual Environment
+
+We use Poetry to manage our Python environment. To create the virtual environment, use
+the following command:
+
+```bash
+poetry install
+```
+
+### Running Tests Locally
+
+We use `pytest` for testing. Run the tests with the following command:
+
+```bash
+poetry run python -m pytest
+```
+
+#### VSCode Users
+
+We provide a `.vscode` folder for developers who prefer to use VSCode. You just need
+to open the folder in VSCode, and VSCode should pick up our settings.
+
+### Pre-commit
+
+We love and appreciate coding standards, so we enforce them in our code base.
+However, without automation, enforcing coding standards usually results in a lot of
+frustration for developers when they publish Pull Requests and our linters complain. So,
+we automate our formatting and linting with [pre-commit](https://pre-commit.com/). All
+you need to do is install our `pre-commit` hook so the code gets formatted automatically
+when you commit your changes locally:
+
+```bash
+pip install pre-commit && pre-commit install
+```
+