rusty-graph 0.3.1__cp311-cp311-macosx_10_12_x86_64.whl
rusty_graph/__init__.py
ADDED
|
Binary file
|
|
@@ -0,0 +1,303 @@
Metadata-Version: 2.4
Name: rusty_graph
Version: 0.3.1
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
License-File: LICENSE
Summary: A high-performance graph database with Python bindings written in Rust.
Author: Kristian dF Kollsgård <kkollsg@gmail.com>
Author-email: Kristian dF Kollsgård <kkollsg@gmail.com>
License: MIT
Requires-Python: >=3.8
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Source Code, https://github.com/kkollsga/rusty_graph

# Rusty Graph Python Library

A high-performance graph database library with Python bindings written in Rust.

Rusty Graph is a Rust-based library for building high-performance knowledge graphs from Python. It is designed for aggregating and merging data from SQL databases, turning relational tables into structured knowledge graphs. It pairs Rust's efficiency with Python's flexibility, and is aimed at data scientists and developers who use knowledge graphs in data-driven applications.

## Key Features
- **Efficient Data Integration:** Import and merge data from SQL databases to construct knowledge graphs, with performance and scalability in mind.
- **High-Performance Operations:** Graph operations run in Rust, which makes Rusty Graph suited to large-scale data.
- **Python Compatibility:** Use Rusty Graph directly in Python projects, including Python-based data analysis and machine learning pipelines.
- **Flexible Graph Manipulation:** Create, modify, and query knowledge graphs that model complex data structures and relationships.

## Direct Download and Install
Users can download the .whl file directly from the repository and install it using pip.
- *Note that the wheel linked below (0.2.13) is only compatible with Python 3.12 on win_amd64.*
- *The library is still in alpha, so the functionality is very limited.*
```sh
# Quote the URL so the shell does not treat "?" as a glob character
pip install "https://github.com/kkollsga/rusty_graph/blob/main/wheels/rusty_graph-0.2.13-cp312-cp312-win_amd64.whl?raw=true"
# or upgrade an already installed library
pip install --upgrade "https://github.com/kkollsga/rusty_graph/blob/main/wheels/rusty_graph-0.2.13-cp312-cp312-win_amd64.whl?raw=true"
```
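
Whether a given wheel fits your interpreter can be read off its filename, which follows the standard wheel naming scheme (`name-version[-build]-python-abi-platform.whl`). The helper below is a minimal sketch using only the standard library; `parse_wheel_filename` is a hypothetical name for illustration and is not part of Rusty Graph or pip.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


tags = parse_wheel_filename("rusty_graph-0.2.13-cp312-cp312-win_amd64.whl")
print(tags["python"], tags["platform"])
```

Here `cp312` means CPython 3.12 and `win_amd64` means 64-bit Windows, which matches the compatibility note above.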
## Table of Contents

- [Installation](#installation)
- [Basic Usage](#basic-usage)
- [Working with Nodes](#working-with-nodes)
- [Creating Connections](#creating-connections)
- [Filtering and Querying](#filtering-and-querying)
- [Traversing the Graph](#traversing-the-graph)
- [Statistics and Calculations](#statistics-and-calculations)
- [Saving and Loading](#saving-and-loading)
- [Performance Tips](#performance-tips)

## Installation

```bash
pip install rusty-graph
```

## Basic Usage

```python
import rusty_graph
import pandas as pd

# Create a new knowledge graph
graph = rusty_graph.KnowledgeGraph()

# Create some data using pandas
users_df = pd.DataFrame({
    'user_id': [1001, 1002, 1003],
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [28, 35, 42]
})

# Add nodes to the graph
graph.add_nodes(
    data=users_df,
    node_type='User',
    unique_id_field='user_id',
    node_title_field='name'
)

# View graph schema
print(graph.get_schema())
```

## Working with Nodes

### Adding Nodes

```python
# Add products to graph
products_df = pd.DataFrame({
    'product_id': [101, 102, 103],
    'title': ['Laptop', 'Phone', 'Tablet'],
    'price': [999.99, 699.99, 349.99],
    'stock': [45, 120, 30]
})

graph.add_nodes(
    data=products_df,
    node_type='Product',
    unique_id_field='product_id',
    node_title_field='title',
    # Optional: specify which columns to include
    # (only columns that exist in products_df are listed here)
    columns=['product_id', 'title', 'price', 'stock'],
    # Optional: how to handle conflicts with existing nodes
    conflict_handling='update'  # Options: 'update', 'replace', 'skip', 'preserve'
)
```

### Retrieving Nodes

```python
# Get all products
products = graph.type_filter('Product')

# Get node information
product_nodes = products.get_nodes()
print(product_nodes)

# Get specific properties
prices = products.get_properties(['price', 'stock'])
print(prices)

# Get only titles
titles = products.get_titles()
print(titles)
```

## Creating Connections

```python
# Purchase data
purchases_df = pd.DataFrame({
    'user_id': [1001, 1001, 1002],
    'product_id': [101, 103, 102],
    'date': ['2023-01-15', '2023-02-10', '2023-01-20'],
    'quantity': [1, 2, 1]
})

# Create connections
graph.add_connections(
    data=purchases_df,
    connection_type='PURCHASED',
    source_type='User',
    source_id_field='user_id',
    target_type='Product',
    target_id_field='product_id',
    # Optional additional fields to include
    columns=['date', 'quantity']
)

# Create connections from currently selected nodes
users = graph.type_filter('User')
products = graph.type_filter('Product')
# This would connect all users to all products with a 'VIEWED' connection
users.selection_to_new_connections(connection_type='VIEWED')
```

## Filtering and Querying

### Basic Filtering

```python
# Filter by exact match
expensive_products = graph.type_filter('Product').filter({'price': 999.99})

# Filter using operators
affordable_products = graph.type_filter('Product').filter({
    'price': {'<': 500.0}
})

# Multiple conditions
popular_affordable = graph.type_filter('Product').filter({
    'price': {'<': 500.0},
    'stock': {'>': 50}
})

# In operator
selected_products = graph.type_filter('Product').filter({
    'product_id': {'in': [101, 103]}
})
```
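
To make the shape of these filter dictionaries concrete, here is a pure-Python sketch of how such conditions could be evaluated against plain dicts. It is not Rusty Graph's implementation: the `matches` helper and the `OPERATORS` table are hypothetical, and it assumes that multiple conditions are combined with a logical AND.

```python
import operator

# Only the operators that appear in the examples above
OPERATORS = {
    '<': operator.lt,
    '>': operator.gt,
    '>=': operator.ge,
    'in': lambda value, options: value in options,
}


def matches(node: dict, conditions: dict) -> bool:
    """Return True if the node satisfies every condition (assumed AND semantics)."""
    for field, condition in conditions.items():
        if field not in node:
            return False
        value = node[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {'<': 500.0}
            if not all(OPERATORS[op](value, operand) for op, operand in condition.items()):
                return False
        elif value != condition:
            # Plain value means exact match
            return False
    return True


# Same sample products as in "Adding Nodes"
products = [
    {'product_id': 101, 'title': 'Laptop', 'price': 999.99, 'stock': 45},
    {'product_id': 102, 'title': 'Phone', 'price': 699.99, 'stock': 120},
    {'product_id': 103, 'title': 'Tablet', 'price': 349.99, 'stock': 30},
]

affordable = [p['title'] for p in products if matches(p, {'price': {'<': 500.0}})]
popular_affordable = [
    p['title'] for p in products
    if matches(p, {'price': {'<': 500.0}, 'stock': {'>': 50}})
]
selected = [p['title'] for p in products if matches(p, {'product_id': {'in': [101, 103]}})]

print(affordable, popular_affordable, selected)
```

Under the AND assumption, the combined price and stock filter selects nothing from the three sample products, because the only product under 500.0 (the Tablet) has a stock of 30.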

### Sorting Results

```python
# Sort by a single field (ascending by default)
sorted_products = graph.type_filter('Product').sort('price')

# Sort explicitly by direction
expensive_first = graph.type_filter('Product').sort('price', ascending=False)

# Sort by multiple fields
sorted_complex = graph.type_filter('Product').sort([
    ('stock', False),  # Highest stock first
    ('price', True)    # Then by price, lowest first
])
```
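
The multi-field form takes `(field, ascending)` pairs in priority order. As an illustration of the ordering this implies, the sketch below reproduces it on plain dicts with Python's stable `sorted()`. The `sort_nodes` helper is hypothetical, and the `Monitor` row is invented here only to create a tie on `stock`.

```python
def sort_nodes(nodes, keys):
    """Sort dicts by several (field, ascending) pairs; the first pair has top priority."""
    result = list(nodes)
    # Apply the keys from lowest to highest priority. sorted() is stable,
    # so ties on a later pass keep the order set by the earlier passes.
    for field, ascending in reversed(keys):
        result = sorted(result, key=lambda node: node[field], reverse=not ascending)
    return result


products = [
    {'title': 'Laptop', 'price': 999.99, 'stock': 45},
    {'title': 'Phone', 'price': 699.99, 'stock': 120},
    {'title': 'Tablet', 'price': 349.99, 'stock': 30},
    {'title': 'Monitor', 'price': 199.99, 'stock': 120},  # hypothetical row, ties with Phone on stock
]

ordered = sort_nodes(products, [('stock', False), ('price', True)])
titles = [p['title'] for p in ordered]
print(titles)
```

The two products with a stock of 120 come first, and the cheaper one leads because price breaks the tie.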

### Limiting Results

```python
# Get at most 5 nodes per group
limited_products = graph.type_filter('Product').max_nodes(5)
```

## Traversing the Graph

```python
# Find products purchased by a specific user
alice = graph.type_filter('User').filter({'name': 'Alice'})
alice_products = alice.traverse(
    connection_type='PURCHASED',
    direction='outgoing'
)

# Access the resulting products
alice_product_data = alice_products.get_nodes()

# Filter the traversal target nodes
expensive_purchases = alice.traverse(
    connection_type='PURCHASED',
    filter_target={'price': {'>=': 500.0}},
    sort_target='price',
    max_nodes=10
)

# Get connection information
connection_data = alice.get_connections(include_node_properties=True)
```
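
Conceptually, an outgoing traversal follows connections from the selected nodes to their targets. The sketch below shows that idea on the sample purchase data using plain Python structures. It is an illustration only, not how Rusty Graph stores or walks the graph; the `traverse` function here is a hypothetical stand-in, and `'incoming'` is an assumed counterpart to the `'outgoing'` value shown above.

```python
# Sample data from the earlier sections
products = {
    101: {'title': 'Laptop', 'price': 999.99},
    102: {'title': 'Phone', 'price': 699.99},
    103: {'title': 'Tablet', 'price': 349.99},
}
# PURCHASED connections as (source user_id, target product_id) pairs
purchased = [(1001, 101), (1001, 103), (1002, 102)]


def traverse(selected_ids, edges, direction='outgoing'):
    """Return the ids reached from the selected nodes along the given edges."""
    if direction == 'outgoing':
        return [target for source, target in edges if source in selected_ids]
    if direction == 'incoming':  # assumed counterpart, not confirmed by the README
        return [source for source, target in edges if target in selected_ids]
    raise ValueError(f"unknown direction: {direction!r}")


alice_ids = {1001}
alice_product_ids = traverse(alice_ids, purchased)
alice_titles = [products[pid]['title'] for pid in alice_product_ids]

# Equivalent of filtering the traversal targets on price >= 500.0
expensive_titles = [
    products[pid]['title'] for pid in alice_product_ids
    if products[pid]['price'] >= 500.0
]
print(alice_titles, expensive_titles)
```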

## Statistics and Calculations

### Basic Statistics

```python
# Get statistics for a property
price_stats = graph.type_filter('Product').statistics('price')
print(price_stats)

# Calculate unique values
# (assumes the Product nodes carry a 'category' property,
# which the sample data above does not include)
unique_categories = graph.type_filter('Product').unique_values(
    property='category',
    # Store result in node property
    store_as='category_list',
    max_length=10
)
```

### Custom Calculations

```python
# Simple calculation: tax inclusive price
with_tax = graph.type_filter('Product').calculate(
    expression='price * 1.1',
    store_as='price_with_tax'
)

# Aggregate calculations per group
user_spending = graph.type_filter('User').traverse('PURCHASED').calculate(
    expression='sum(price * quantity)',
    store_as='total_spent'
)

# Count operations
products_per_user = graph.type_filter('User').traverse('PURCHASED').count(
    store_as='product_count',
    group_by_parent=True
)
```
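
As a sanity check on what the per-group aggregates should produce, the arithmetic can be worked by hand on the sample data. The sketch below does this in plain Python. It assumes `sum(price * quantity)` multiplies each purchased product's price by the quantity on the connection and sums per user; how Rusty Graph resolves `quantity` inside the expression is not confirmed by this README.

```python
from collections import defaultdict

# Product prices and purchases from the earlier sections
prices = {101: 999.99, 102: 699.99, 103: 349.99}
purchases = [  # (user_id, product_id, quantity)
    (1001, 101, 1),
    (1001, 103, 2),
    (1002, 102, 1),
]

total_spent = defaultdict(float)
product_count = defaultdict(int)
for user_id, product_id, quantity in purchases:
    total_spent[user_id] += prices[product_id] * quantity  # sum(price * quantity)
    product_count[user_id] += 1                            # count grouped by parent user

for user_id in sorted(total_spent):
    print(user_id, round(total_spent[user_id], 2), product_count[user_id])
```

For Alice (1001) this is 999.99 × 1 + 349.99 × 2 = 1699.97 across two products, and for Bob (1002) it is 699.99 for one product. Charlie (1003) has no purchases and does not appear.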

## Saving and Loading

```python
# Save graph to file
graph.save("my_graph.bin")

# Load graph from file
loaded_graph = rusty_graph.load("my_graph.bin")
```

## Performance Tips

1. **Batch Operations**: Add nodes and connections in batches rather than individually.

2. **Specify Columns**: When adding nodes or connections, explicitly specify which columns to include to reduce memory usage.

3. **Use Indexing**: Filter on node type before applying other filters.

4. **Avoid Overloading**: Keep the number of properties per node reasonable; every additional property increases memory usage.

5. **Conflict Handling**: Choose the appropriate conflict handling strategy:
   - Use `'update'` to merge new properties with existing ones
   - Use `'replace'` for a complete overwrite
   - Use `'skip'` to avoid any changes to existing nodes
   - Use `'preserve'` to only add missing properties

6. **Connection Direction**: Specify the direction in traversals when possible to improve performance.

7. **Limit Results**: Use `max_nodes()` to limit result size when working with large datasets.
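
The four conflict handling strategies in tip 5 can be pictured as different ways of merging two property dictionaries. The sketch below models them in plain Python. It is an interpretation of the descriptions above rather than the library's code: `resolve_conflict` is a hypothetical helper, the 899.99 price is an invented value, and it assumes that `'update'` lets incoming values win when a property exists on both sides.

```python
def resolve_conflict(existing: dict, incoming: dict, strategy: str) -> dict:
    """Return the properties a node would have after a conflicting insert."""
    if strategy == 'update':
        # Merge; incoming values win on overlap (assumption)
        return {**existing, **incoming}
    if strategy == 'replace':
        # Complete overwrite; existing-only properties are dropped
        return dict(incoming)
    if strategy == 'skip':
        # Existing node is left untouched
        return dict(existing)
    if strategy == 'preserve':
        # Only properties missing from the existing node are added
        return {**incoming, **existing}
    raise ValueError(f"unknown strategy: {strategy!r}")


existing = {'title': 'Laptop', 'price': 999.99}
incoming = {'price': 899.99, 'stock': 45}  # hypothetical new row for the same product

for strategy in ('update', 'replace', 'skip', 'preserve'):
    print(strategy, resolve_conflict(existing, incoming, strategy))
```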
@@ -0,0 +1,6 @@
rusty_graph-0.3.1.dist-info/METADATA,sha256=NVxPJF4pu3IYDgIkzCMES55H7QBCH1qlXZ8PSIZTwwc,9100
rusty_graph-0.3.1.dist-info/WHEEL,sha256=ORfdk0V4zlq1A4oYU5gQf1Wo26Lq3YYJA44NI-YXBrE,106
rusty_graph-0.3.1.dist-info/licenses/LICENSE,sha256=rpMbqF0kOM1XAviOJRrR8UYQsNx0QPAzbf5b4RE358g,932
rusty_graph/__init__.py,sha256=_Fds04T5qV95XgyZm7qIPfLghgoCZi-_hDbw-e_18oA,127
rusty_graph/rusty_graph.cpython-311-darwin.so,sha256=X5f9nVGXPDc7Mw2QYtQExrmuiOKBZLQW7BkkzH2Cuds,1252620
rusty_graph-0.3.1.dist-info/RECORD,,
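
Each RECORD entry above has the form `path,sha256=<digest>,<size>`, where the digest is the SHA-256 of the file encoded as URL-safe Base64 with the trailing `=` padding removed. The sketch below shows how such an entry can be recomputed with the standard library to verify an unpacked wheel; `record_hash` and `record_line` are hypothetical helper names.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """Return the RECORD-style hash string for the given file contents."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def record_line(path: str, data: bytes) -> str:
    """Build a full RECORD entry: path, hash, and size in bytes."""
    return f"{path},{record_hash(data)},{len(data)}"


# Example with in-memory bytes; for a real check, read the installed file in binary mode
print(record_line("example/empty.py", b""))
```

Comparing the recomputed line with the one stored in RECORD confirms that a file was not altered after the wheel was built.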
@@ -0,0 +1,7 @@
MIT License

Copyright (c) 2024 Kristian de Figueiredo Kollsgård

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.