graphembed-rs 0.1.0__cp311-cp311-macosx_11_0_arm64.whl → 0.1.1__cp311-cp311-macosx_11_0_arm64.whl
- graphembed_rs/__init__.py +5 -0
- graphembed_rs/graphembed_rs.cpython-311-darwin.so +0 -0
- {graphembed_rs-0.1.0.dist-info → graphembed_rs-0.1.1.dist-info}/METADATA +71 -34
- graphembed_rs-0.1.1.dist-info/RECORD +8 -0
- graphembed_rs-0.1.1.dist-info/licenses/LICENSE-MIT +25 -0
- graphembed/__init__.py +0 -5
- graphembed/graphembed.cpython-311-darwin.so +0 -0
- graphembed_rs-0.1.0.dist-info/RECORD +0 -7
- {graphembed → graphembed_rs}/__init__.pyi +0 -0
- {graphembed → graphembed_rs}/py.typed +0 -0
- {graphembed_rs-0.1.0.dist-info → graphembed_rs-0.1.1.dist-info}/WHEEL +0 -0
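Note on the rename: the file list above shows the importable package directory moving from `graphembed/` to `graphembed_rs/`, while the README text added in 0.1.1 still shows `import graphembed as ge`. The sketch below is one way a caller might cope with both wheels. It assumes the top-level module name follows the directory name in each wheel, which the diff does not confirm because the `__init__.py` contents are not shown. `load_bindings` is a hypothetical helper, not part of the package.

```python
import importlib


def load_bindings(candidates=("graphembed_rs", "graphembed")):
    """Return the first importable module among `candidates`, or None.

    Assumption: 0.1.1 exposes `graphembed_rs` and 0.1.0 exposes `graphembed`,
    inferred only from the directory names in the wheel file list.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


ge = load_bindings()
if ge is None:
    print("no graphembed bindings installed")
else:
    print("loaded bindings from module:", ge.__name__)
```

Trying the newer name first keeps the fallback harmless once only 0.1.1 or later is installed.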
Binary file
|
@@ -1,30 +1,86 @@
|
|
1
1
|
Metadata-Version: 2.4
|
2
2
|
Name: graphembed_rs
|
3
|
-
Version: 0.1.0
|
3
|
+
Version: 0.1.1
|
4
|
+
License-File: LICENSE-MIT
|
4
5
|
Summary: Python bindings for the high‑performance Rust graph/network embedding library graphembed
|
5
6
|
Keywords: graph,embedding,hash
|
6
7
|
Author: Jianshu Zhao
|
7
|
-
Author-email: jeanpierre.both@gmail.com
|
8
|
+
Author-email: jeanpierre.both@gmail.com
|
8
9
|
License: MIT OR Apache-2.0
|
9
10
|
Requires-Python: >=3.8
|
10
11
|
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
|
11
12
|
Project-URL: Source Code, https://github.com/jean-pierreBoth/graphembed
|
12
13
|
|
13
|
-
|
14
|
+
[](http://bioconda.github.io/recipes/graphembed/README.html)
|
15
|
+

|
16
|
+

|
17
|
+

|
18
|
+

|
19
|
+
[](https://anaconda.org/bioconda/graphembed)
|
14
20
|
|
15
|
-
This crate provides ,as a library and an executable,embedding of directed or undirected graphs with positively weighted edges.
|
16
21
|
|
22
|
+
# GraphEmbed: Efficient and Robust Network Embedding via High-Order Proximity Preservation or Recursive Sketching
|
17
23
|
|
18
|
-
|
24
|
+
This crate provides an executable and a library for embedding of directed or undirected graphs with positively weighted edges. We engineered and optimized current network embedding algorithms for large-scale network embedding, especially biological network. This crate was developed by [Jianshu Zhao](https://gitlab.com/Jianshu_Zhao) and Jean-Pierre Both [jpboth](https://gitlab.com/jpboth). We have a copy here in [Github](https://github.com/jianshu93/graphembed)
|
19
25
|
|
20
|
-
- To complement the embeddings we provide also core decomposition of graphs (see the module **structure**). We give try to analyze how Orkut communities are preserved through an embedding. (See Notebooks directory).
|
21
26
|
|
22
|
-
-
|
23
|
-
The algorithm is based on an extension of the hashing strategy used in the module **nodesketch**.
|
24
|
-
In the undirected case, this module also computes a global embedding vector for the whole graph. **It is still in an early version**.
|
27
|
+
- For simple graphs, without data attached to nodes, there are 2 modules **nodesketch** and **atp**. A simple executable with a validation option based on link prediction is also provided.
|
25
28
|
|
26
|
-
##
|
29
|
+
## Quick Install
|
30
|
+
|
31
|
+
### Pre-built binaries on Linux
|
32
|
+
```bash
|
33
|
+
wget https://gitlab.com/-/project/64961144/uploads/ea72ca007e9e4899e0c830e708f52939/graphembed_Linux_x86-64_v0.1.4.zip
|
34
|
+
unzip graphembed_Linux_x86-64_v0.1.4.zip
|
35
|
+
chmod a+x ./graphembed
|
36
|
+
./graphembed -h
|
37
|
+
```
|
38
|
+
|
39
|
+
### Bioconda on Linux
|
40
|
+
```bash
|
41
|
+
conda install -c conda-forge -c bioconda graphembed
|
42
|
+
```
|
43
|
+
|
44
|
+
### Homebrew on MacOS
|
45
|
+
```bash
|
46
|
+
brew tap jianshu93/graphembed
|
47
|
+
brew update
|
48
|
+
brew install graphembed
|
49
|
+
```
|
50
|
+
|
51
|
+
|
52
|
+
### In Python (Please install python first)
|
53
|
+
```bash
|
54
|
+
pip install graphembed_rs
|
55
|
+
|
56
|
+
### or you can build from source (Linux) after installing maturin
|
57
|
+
git clone https://gitlab.com/Jianshu_Zhao/graphembed
|
58
|
+
cd graphembed
|
59
|
+
pip install maturin
|
60
|
+
### note: for macOS, you need to change the line "features = ["pyo3/extension-module", "intel-mkl-static", "simdeez_f"]" in pyporject.toml to "features = ["pyo3/extension-module","openblas-system","stdsimd"]"
|
61
|
+
maturin develop --release
|
62
|
+
|
63
|
+
#### Prepare some data
|
64
|
+
wget https://gitlab.com/-/project/64961144/uploads/4e341383d62d86d1dd66e668e91b2c07/BlogCatalog.txt
|
65
|
+
```
|
66
|
+
|
67
|
+
```python
|
68
|
+
import os
|
69
|
+
os.environ["RUST_LOG"] = "graphembed=info"
|
70
|
+
import graphembed as ge
|
71
|
+
help(ge)
|
72
|
+
### HOPE
|
73
|
+
ge.embed_hope_rank("BlogCatalog.txt", target_rank=128, nbiter=4)
|
74
|
+
|
75
|
+
### Sketching
|
76
|
+
### sketching only
|
77
|
+
ge.embed_sketching("BlogCatalog.txt", decay=0.3, dim=128, nbiter=5, symetric=True, output="embedding_output")
|
78
|
+
### validate accuracy
|
79
|
+
auc_scores = ge.validate_sketching("BlogCatalog.txt",decay=0.3, dim=128, nbiter=3, nbpass=1, skip_frac=0.2,symetric=True, centric=True)
|
80
|
+
print("Standard AUC per pass:", auc_scores)
|
81
|
+
```
|
27
82
|
|
83
|
+
## Methods
|
28
84
|
### The embedding algorithms used in this crate are based on the following papers
|
29
85
|
|
30
86
|
- **nodesketch**
|
@@ -57,29 +113,12 @@ Source node are related to left singular vectors and target nodes to the right o
|
|
57
113
|
The similarity measure is the dot product, so it is not a norm.
|
58
114
|
The svd is approximated by randomization as described in Halko-Tropp 2011 as implemented in the [annembed crate](https://crates.io/crates/annembed).
|
59
115
|
|
60
|
-
### The core decomposition algorithms
|
61
|
-
|
62
|
-
- **Density-friendly decomposition**
|
63
|
-
|
64
|
-
*Large Scale decomposition via convex programming 2017*
|
65
|
-
M.Danisch T.H Hubert Chan and M.Sozio
|
66
|
-
|
67
|
-
The decomposition of the graph in maximally dense groups of nodes is implemented and used to assess the quality of the embeddings in a structural way. See module *validation* and the comments on the embedding of the *Orkut* graph where we can use the community data provided with the graph to analyze the behaviour of embedded edge lengths.
|
68
|
-
|
69
|
-
In particular it is shown that :
|
70
|
-
- embedding of edges internal to a community are consistently smaller than embedded edges crossing a block frontier.
|
71
|
-
- The transition probabilities of edge from one block to another are similar (low kullback divergence) in the original graph and in the embedded graph.
|
72
|
-
|
73
|
-
See results in [orkut.md](./orkut.md) and examples directory together with a small Rust notebook in directory [Notebooks](./Notebooks/orkutrs.ipynb)
|
74
|
-
|
75
116
|
## Validation
|
76
117
|
|
77
118
|
Validation of embeddings is assessed via standard Auc with random deletion of edges. See documentation in the *link* module and *embed* binary.
|
78
119
|
We give also a variation based on centric quality assessment as explained at [cauc](http://github.com/jean-pierreBoth/linkauc)
|
79
120
|
## Some data sets
|
80
121
|
|
81
|
-
### Without labels
|
82
|
-
|
83
122
|
Small datasets are given in the Data subdirectory (with 7z compression) to run tests.
|
84
123
|
Larger datasets can be downloaded from the SNAP data collections <https://snap.stanford.edu/data>
|
85
124
|
|
@@ -132,13 +171,11 @@ A preliminary of node centric quality estimation is provided in the validation m
|
|
132
171
|
- The munmun_twitter_social graph shows that treating a directed graph as an undirected graph give significantly different results in terms of link prediction AUC.
|
133
172
|
|
134
173
|
|
135
|
-
|
136
|
-
|
137
174
|
## Generalized Svd
|
138
175
|
|
139
176
|
An implementation of Generalized Svd comes as a by-product in module [gsvd](./src/atp/gsvd.rs).
|
140
177
|
|
141
|
-
## Installation and Usage
|
178
|
+
## Detailed Installation and Usage
|
142
179
|
|
143
180
|
### Installation
|
144
181
|
|
@@ -161,21 +198,21 @@ so to the required dimension to get a valid embedding in $R^{n}$.
|
|
161
198
|
- The *embed* module takes embedding and possibly validation commands (link prediction task) in one directive.
|
162
199
|
The general syntax is :
|
163
200
|
|
164
|
-
|
201
|
+
graphembed file_description [validation_command --validation_arguments] sketching mode --embedding_arguments
|
165
202
|
for example:
|
166
203
|
|
167
204
|
For a symetric graph we get:
|
168
205
|
|
169
206
|
- just embedding:
|
170
|
-
|
207
|
+
graphembed --csv ./Data/Graphs/Orkut/com-orkut.ungraph.txt --symetric sketching --decay 0.2 --dim 200 --nbiter
|
171
208
|
|
172
209
|
- embedding and validation:
|
173
210
|
|
174
|
-
|
211
|
+
graphembed --csv ./Data/Graphs/Orkut/com-orkut.ungraph.txt --symetric validation --nbpass 5 --skip 0.15 sketching --decay 0.2 --dim 200 --nbiter 5
|
175
212
|
|
176
213
|
For an asymetric graph we get
|
177
214
|
|
178
|
-
|
215
|
+
graphembed --csv ./Data/Graphs/asymetric.csv validation --nbpass 5 --skip 0.15 sketching --decay 0.2 --dim 200 --nbiter 5
|
179
216
|
|
180
217
|
|
181
218
|
More details can be found in docs of the embed module. Use cargo doc --no-dep --bin embed (and cargo doc --no-dep) as usual.
|
@@ -0,0 +1,8 @@
|
|
1
|
+
graphembed_rs-0.1.1.dist-info/METADATA,sha256=uW8W9UwifiE6FViIKpp1I7d5e6ygVuo6AA2O72MogLU,10890
|
2
|
+
graphembed_rs-0.1.1.dist-info/WHEEL,sha256=wsVBlw9xyAuHecZeOYqJ_tA7emUKfXYOn-_180uZRi4,104
|
3
|
+
graphembed_rs-0.1.1.dist-info/licenses/LICENSE-MIT,sha256=ndZ12D28O4UkfOeoa6HP9E7IKyYG4iH79iQ6WiLs9bc,1077
|
4
|
+
graphembed_rs/__init__.py,sha256=R2D0If_-sN__21LBYNod0CNgVo2dCd2RqM11AStM3X0,135
|
5
|
+
graphembed_rs/__init__.pyi,sha256=3_KBFG4g9akylo32CHlm9bZStcLwxIY2X4si21ilD3w,1626
|
6
|
+
graphembed_rs/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
|
7
|
+
graphembed_rs/graphembed_rs.cpython-311-darwin.so,sha256=mjlLMuiQ50lMh9GVwzoSr7gEqrKcvxdMoWOE1XNeqEw,5106384
|
8
|
+
graphembed_rs-0.1.1.dist-info/RECORD,,
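Note on reading the RECORD entries: each line is `path,algorithm=digest,size`. In the wheel format the digest is the URL-safe base64 encoding of the raw hash with the trailing `=` padding removed, and RECORD lists itself with empty hash and size fields. The following is a minimal sketch of parsing and checking one entry. The helper names `record_digest` and `parse_record_line` are illustrative and come from neither the package nor any packaging tool.

```python
import base64
import csv
import hashlib
import io


def record_digest(data: bytes) -> str:
    """SHA-256 of `data`, encoded the way wheel RECORD files store it."""
    raw = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def parse_record_line(line: str):
    """Split one RECORD row into (path, algorithm, digest, size or None)."""
    path, hash_field, size = next(csv.reader(io.StringIO(line)))
    algorithm, _, digest = hash_field.partition("=")
    return path, algorithm, digest, (int(size) if size else None)


# The py.typed entry from the 0.1.1 RECORD above: a zero-byte file.
entry = "graphembed_rs/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0"
path, algorithm, digest, size = parse_record_line(entry)

# An empty file has zero bytes, so its digest can be checked without the wheel.
print(path, algorithm, size, record_digest(b"") == digest)
```

The same comparison applied to the bytes of the `.so` file would show whether an unpacked wheel matches the published entry. The differing digests and sizes for the extension module between 0.1.0 and 0.1.1 indicate the binary changed even though the diff reports it as `+0 -0`.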
|
@@ -0,0 +1,25 @@
|
|
1
|
+
Copyright (c) 2022 jean-pierre.both and Jianshu Zhao
|
2
|
+
|
3
|
+
Permission is hereby granted, free of charge, to any
|
4
|
+
person obtaining a copy of this software and associated
|
5
|
+
documentation files (the "Software"), to deal in the
|
6
|
+
Software without restriction, including without
|
7
|
+
limitation the rights to use, copy, modify, merge,
|
8
|
+
publish, distribute, sublicense, and/or sell copies of
|
9
|
+
the Software, and to permit persons to whom the Software
|
10
|
+
is furnished to do so, subject to the following
|
11
|
+
conditions:
|
12
|
+
|
13
|
+
The above copyright notice and this permission notice
|
14
|
+
shall be included in all copies or substantial portions
|
15
|
+
of the Software.
|
16
|
+
|
17
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF
|
18
|
+
ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
|
19
|
+
TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
|
20
|
+
PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
|
21
|
+
SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
|
22
|
+
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
23
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
|
24
|
+
IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
|
25
|
+
DEALINGS IN THE SOFTWARE.
|
graphembed/__init__.py
DELETED
Binary file
|
@@ -1,7 +0,0 @@
|
|
1
|
-
graphembed_rs-0.1.0.dist-info/METADATA,sha256=1bBA8fy75z8I6YGAsPAMoB-67zpAHW66cHywSb-hPj8,9901
|
2
|
-
graphembed_rs-0.1.0.dist-info/WHEEL,sha256=wsVBlw9xyAuHecZeOYqJ_tA7emUKfXYOn-_180uZRi4,104
|
3
|
-
graphembed/__init__.py,sha256=RCcLraveWf-myTsDQGePMYq-scNNfz-3Mv1baSbgAmM,123
|
4
|
-
graphembed/__init__.pyi,sha256=3_KBFG4g9akylo32CHlm9bZStcLwxIY2X4si21ilD3w,1626
|
5
|
-
graphembed/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
|
6
|
-
graphembed/graphembed.cpython-311-darwin.so,sha256=02KwhQh5VZBhzUc7zl-JdfWhSjb1sQ40V4KSbqSsi9o,5158016
|
7
|
-
graphembed_rs-0.1.0.dist-info/RECORD,,
|
File without changes
|
File without changes
|
File without changes
|