risk-network 0.0.9b46__tar.gz → 0.0.10__tar.gz

Files changed (48)
  1. {risk_network-0.0.9b46 → risk_network-0.0.10}/PKG-INFO +28 -47
  2. risk_network-0.0.10/README.md +83 -0
  3. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/__init__.py +1 -1
  4. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/annotations/annotations.py +6 -1
  5. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/neighborhoods/domains.py +9 -12
  6. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk_network.egg-info/PKG-INFO +28 -47
  7. risk_network-0.0.9b46/README.md +0 -102
  8. {risk_network-0.0.9b46 → risk_network-0.0.10}/LICENSE +0 -0
  9. {risk_network-0.0.9b46 → risk_network-0.0.10}/MANIFEST.in +0 -0
  10. {risk_network-0.0.9b46 → risk_network-0.0.10}/pyproject.toml +0 -0
  11. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/annotations/__init__.py +0 -0
  12. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/annotations/io.py +0 -0
  13. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/log/__init__.py +0 -0
  14. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/log/console.py +0 -0
  15. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/log/parameters.py +0 -0
  16. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/neighborhoods/__init__.py +0 -0
  17. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/neighborhoods/api.py +0 -0
  18. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/neighborhoods/community.py +0 -0
  19. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/neighborhoods/neighborhoods.py +0 -0
  20. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/__init__.py +0 -0
  21. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/geometry.py +0 -0
  22. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/graph/__init__.py +0 -0
  23. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/graph/api.py +0 -0
  24. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/graph/graph.py +0 -0
  25. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/graph/summary.py +0 -0
  26. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/io.py +0 -0
  27. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/__init__.py +0 -0
  28. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/api.py +0 -0
  29. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/canvas.py +0 -0
  30. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/contour.py +0 -0
  31. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/labels.py +0 -0
  32. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/network.py +0 -0
  33. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/plotter.py +0 -0
  34. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/utils/colors.py +0 -0
  35. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/network/plotter/utils/layout.py +0 -0
  36. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/risk.py +0 -0
  37. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/stats/__init__.py +0 -0
  38. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/stats/permutation/__init__.py +0 -0
  39. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/stats/permutation/permutation.py +0 -0
  40. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/stats/permutation/test_functions.py +0 -0
  41. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/stats/significance.py +0 -0
  42. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk/stats/stat_tests.py +0 -0
  43. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk_network.egg-info/SOURCES.txt +0 -0
  44. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk_network.egg-info/dependency_links.txt +0 -0
  45. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk_network.egg-info/requires.txt +0 -0
  46. {risk_network-0.0.9b46 → risk_network-0.0.10}/risk_network.egg-info/top_level.txt +0 -0
  47. {risk_network-0.0.9b46 → risk_network-0.0.10}/setup.cfg +0 -0
  48. {risk_network-0.0.9b46 → risk_network-0.0.10}/setup.py +0 -0
{risk_network-0.0.9b46 → risk_network-0.0.10}/PKG-INFO
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.2
2
2
  Name: risk-network
3
- Version: 0.0.9b46
3
+ Version: 0.0.10
4
4
  Summary: A Python package for biological network analysis
5
5
  Author: Ira Horecka
6
6
  Author-email: Ira Horecka <ira89@icloud.com>
@@ -726,92 +726,73 @@ Dynamic: requires-python
726
726
  ![License](https://img.shields.io/badge/license-GPLv3-purple)
727
727
  [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.xxxxxxx.svg)](https://doi.org/10.5281/zenodo.xxxxxxx)
728
728
  ![Downloads](https://img.shields.io/pypi/dm/risk-network)
729
- ![Platforms](https://img.shields.io/badge/platform-linux%20%7C%20macos%20%7C%20windows-lightgrey)
729
+ ![Tests](https://github.com/riskportal/network/actions/workflows/ci.yml/badge.svg)
730
730
 
731
- **RISK** (Regional Inference of Significant Kinships) is a next-generation tool designed to streamline the analysis of biological and non-biological networks. RISK enhances network analysis with its modular architecture, extensive file format support, and advanced clustering algorithms. It simplifies the creation of publication-quality figures, making it an important tool for researchers across disciplines.
731
+ **RISK** (Regional Inference of Significant Kinships) is a next-generation tool for biological network annotation and visualization. RISK integrates community detection-based clustering, rigorous statistical enrichment analysis, and a modular framework to uncover biologically meaningful relationships and generate high-resolution visualizations. RISK supports diverse data formats and is optimized for large-scale network analysis, making it a valuable resource for researchers in systems biology and beyond.
732
732
 
733
733
  ## Documentation and Tutorial
734
734
 
735
- - **Documentation**: Comprehensive documentation is available [here](Documentation link).
736
- - **Tutorial**: An interactive Jupyter notebook tutorial can be found [here](https://github.com/riskportal/network-tutorial).
737
- We highly recommend new users to consult the documentation and tutorial early on to fully leverage RISK's capabilities.
735
+ An interactive Jupyter notebook tutorial can be found [here](https://github.com/riskportal/network-tutorial). We highly recommend new users to consult the documentation and tutorial early on to fully utilize RISK's capabilities.
738
736
 
739
737
  ## Installation
740
738
 
741
- RISK is compatible with Python 3.8 and later versions and operates on all major operating systems. Install RISK via pip:
739
+ RISK is compatible with Python 3.8 or later and runs on all major operating systems. To install the latest version of RISK, run:
742
740
 
743
741
  ```bash
744
- pip install risk-network
742
+ pip install risk-network --upgrade
745
743
  ```
746
744
 
747
745
  ## Features
748
746
 
749
- - **Comprehensive Network Analysis**: Analyze biological networks such as protein–protein interaction (PPI) and gene regulatory networks, as well as non-biological networks.
750
- - **Advanced Clustering Algorithms**: Utilize algorithms like Louvain, Markov Clustering, Spinglass, and more to identify key functional modules.
751
- - **Flexible Visualization**: Generate clear, publication-quality figures with customizable node and edge attributes, including colors, shapes, sizes, and labels.
752
- - **Efficient Data Handling**: Optimized for large datasets, supporting multiple file formats such as JSON, CSV, TSV, Excel, Cytoscape, and GPickle.
753
- - **Statistical Analysis**: Integrated statistical tests, including hypergeometric, permutation, and Poisson tests, to assess the significance of enriched regions.
747
+ - **Comprehensive Network Analysis**: Analyze biological networks (e.g., protein–protein interaction and genetic interaction networks) as well as non-biological networks.
748
+ - **Advanced Clustering Algorithms**: Supports Louvain, Leiden, Markov Clustering, Greedy Modularity, Label Propagation, Spinglass, and Walktrap for identifying structured network regions.
749
+ - **Flexible Visualization**: Produce customizable, high-resolution network visualizations with kernel density estimate overlays, adjustable node and edge attributes, and export options in SVG, PNG, and PDF formats.
750
+ - **Efficient Data Handling**: Supports multiple input/output formats, including JSON, CSV, TSV, Excel, Cytoscape, and GPickle.
751
+ - **Statistical Analysis**: Assess functional enrichment using hypergeometric, permutation, binomial, chi-squared, Poisson, and z-score tests, ensuring statistical adaptability across datasets.
754
752
  - **Cross-Domain Applicability**: Suitable for network analysis across biological and non-biological domains, including social and communication networks.
755
753
 
756
754
  ## Example Usage
757
755
 
758
- We applied RISK to a *Saccharomyces cerevisiae* protein–protein interaction network, revealing both established and novel functional relationships. The visualization below highlights key biological processes such as ribosomal assembly and mitochondrial organization.
756
+ We applied RISK to a *Saccharomyces cerevisiae* protein–protein interaction network from Michaelis et al. (2023), filtering for proteins with six or more interactions to emphasize core functional relationships. RISK identified compact, statistically enriched clusters corresponding to biological processes such as ribosomal assembly and mitochondrial organization.
759
757
 
760
- ![RISK Main Figure](https://i.imgur.com/5OP3Hqe.jpeg)
758
+ [![Figure 1](https://i.imgur.com/lJHJrJr.jpeg)](https://i.imgur.com/lJHJrJr.jpeg)
761
759
 
762
- RISK successfully detected both known and novel functional clusters within the yeast interactome. Clusters related to Golgi transport and actin nucleation were clearly defined and closely located, showcasing RISK's ability to map well-characterized interactions. Additionally, RISK identified links between mRNA processing pathways and vesicle trafficking proteins, consistent with recent studies demonstrating the role of vesicles in mRNA localization and stability.
760
+ This figure highlights RISK’s capability to detect both established and novel functional modules within the yeast interactome.
763
761
 
764
762
  ## Citation
765
763
 
766
- If you use RISK in your research, please cite the following:
764
+ If you use RISK in your research, please cite:
767
765
 
768
- **Horecka**, *et al.*, "RISK: a next-generation tool for biological network annotation and visualization", **[Journal Name]**, 2024. DOI: [10.1234/zenodo.xxxxxxx](https://doi.org/10.1234/zenodo.xxxxxxx)
766
+ **Horecka et al.**, "RISK: a next-generation tool for biological network annotation and visualization", **Bioinformatics**, 2025. DOI: [10.1234/zenodo.xxxxxxx](https://doi.org/10.1234/zenodo.xxxxxxx)
769
767
 
770
768
  ## Software Architecture and Implementation
771
769
 
772
- RISK features a streamlined, modular architecture designed to meet diverse research needs. Each module focuses on a specific task—such as network input/output, statistical analysis, or visualization—ensuring ease of adaptation and extension. This design enhances flexibility and reduces development overhead for users integrating RISK into their workflows.
770
+ RISK features a streamlined, modular architecture designed to meet diverse research needs. It includes dedicated modules for:
773
771
 
774
- ### Supported Data Formats
775
-
776
- - **Input/Output**: JSON, CSV, TSV, Excel, Cytoscape, GPickle.
777
- - **Visualization Outputs**: SVG, PNG, PDF.
778
-
779
- ### Clustering Algorithms
780
-
781
- - **Available Algorithms**:
782
- - Greedy Modularity
783
- - Label Propagation
784
- - Louvain
785
- - Markov Clustering
786
- - Spinglass
787
- - Walktrap
788
- - **Distance Metrics**: Supports both spherical and Euclidean distance metrics.
789
-
790
- ### Statistical Tests
791
-
792
- - **Hypergeometric Test**
793
- - **Permutation Test** (single- or multi-process modes)
794
- - **Poisson Test**
772
+ - **Data I/O**: Supports JSON, CSV, TSV, Excel, Cytoscape, and GPickle formats.
773
+ - **Clustering**: Supports multiple clustering methods, including Louvain, Leiden, Markov Clustering, Greedy Modularity, Label Propagation, Spinglass, and Walktrap. Provides flexible distance metrics tailored to network structure.
774
+ - **Statistical Analysis**: Provides a suite of tests for overrepresentation analysis of annotations.
775
+ - **Visualization**: Offers customizable, high-resolution output in multiple formats, including SVG, PNG, and PDF.
795
776
 
796
777
  ## Performance and Efficiency
797
778
 
798
- In benchmarking tests using the yeast interactome network, RISK demonstrated substantial improvements over previous tools in both computational performance and memory efficiency. RISK processed the dataset approximately **3.25 times faster**, reducing CPU time by **69%**, and required **25% less peak memory usage**, underscoring its efficient utilization of computational resources.
779
+ Benchmarking results demonstrate that RISK efficiently scales to networks exceeding hundreds of thousands of edges, maintaining low execution times and optimal memory usage across statistical tests.
799
780
 
800
781
  ## Contributing
801
782
 
802
- We welcome contributions from the community. Please use the following resources:
783
+ We welcome contributions from the community:
803
784
 
804
- - [Issues Tracker](https://github.com/irahorecka/risk/issues)
805
- - [Source Code](https://github.com/irahorecka/risk/tree/main/risk)
785
+ - [Issues Tracker](https://github.com/riskportal/network/issues)
786
+ - [Source Code](https://github.com/riskportal/network/tree/main/risk)
806
787
 
807
788
  ## Support
808
789
 
809
- If you encounter issues or have suggestions for new features, please use the [Issues Tracker](https://github.com/irahorecka/risk/issues) on GitHub.
790
+ If you encounter issues or have suggestions for new features, please use the [Issues Tracker](https://github.com/riskportal/network/issues) on GitHub.
810
791
 
811
792
  ## License
812
793
 
813
- RISK is freely available as open-source software under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).
794
+ RISK is open source under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).
814
795
 
815
796
  ---
816
797
 
817
- **Note**: For detailed documentation and to access the interactive tutorial, please visit the links provided in the [Documentation and Tutorial](#documentation-and-tutorial) section.
798
+ **Note**: For detailed documentation and to access the interactive tutorial, please visit the links above.
risk_network-0.0.10/README.md
@@ -0,0 +1,83 @@
1
+ # RISK Network
2
+
3
+ <p align="center">
4
+ <img src="https://i.imgur.com/8TleEJs.png" width="50%" />
5
+ </p>
6
+
7
+ <br>
8
+
9
+ ![Python](https://img.shields.io/badge/python-3.8%2B-yellow)
10
+ [![pypiv](https://img.shields.io/pypi/v/risk-network.svg)](https://pypi.python.org/pypi/risk-network)
11
+ ![License](https://img.shields.io/badge/license-GPLv3-purple)
12
+ [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.xxxxxxx.svg)](https://doi.org/10.5281/zenodo.xxxxxxx)
13
+ ![Downloads](https://img.shields.io/pypi/dm/risk-network)
14
+ ![Tests](https://github.com/riskportal/network/actions/workflows/ci.yml/badge.svg)
15
+
16
+ **RISK** (Regional Inference of Significant Kinships) is a next-generation tool for biological network annotation and visualization. RISK integrates community detection-based clustering, rigorous statistical enrichment analysis, and a modular framework to uncover biologically meaningful relationships and generate high-resolution visualizations. RISK supports diverse data formats and is optimized for large-scale network analysis, making it a valuable resource for researchers in systems biology and beyond.
17
+
18
+ ## Documentation and Tutorial
19
+
20
+ An interactive Jupyter notebook tutorial can be found [here](https://github.com/riskportal/network-tutorial). We highly recommend new users to consult the documentation and tutorial early on to fully utilize RISK's capabilities.
21
+
22
+ ## Installation
23
+
24
+ RISK is compatible with Python 3.8 or later and runs on all major operating systems. To install the latest version of RISK, run:
25
+
26
+ ```bash
27
+ pip install risk-network --upgrade
28
+ ```
29
+
30
+ ## Features
31
+
32
+ - **Comprehensive Network Analysis**: Analyze biological networks (e.g., protein–protein interaction and genetic interaction networks) as well as non-biological networks.
33
+ - **Advanced Clustering Algorithms**: Supports Louvain, Leiden, Markov Clustering, Greedy Modularity, Label Propagation, Spinglass, and Walktrap for identifying structured network regions.
34
+ - **Flexible Visualization**: Produce customizable, high-resolution network visualizations with kernel density estimate overlays, adjustable node and edge attributes, and export options in SVG, PNG, and PDF formats.
35
+ - **Efficient Data Handling**: Supports multiple input/output formats, including JSON, CSV, TSV, Excel, Cytoscape, and GPickle.
36
+ - **Statistical Analysis**: Assess functional enrichment using hypergeometric, permutation, binomial, chi-squared, Poisson, and z-score tests, ensuring statistical adaptability across datasets.
37
+ - **Cross-Domain Applicability**: Suitable for network analysis across biological and non-biological domains, including social and communication networks.
38
+
39
+ ## Example Usage
40
+
41
+ We applied RISK to a *Saccharomyces cerevisiae* protein–protein interaction network from Michaelis et al. (2023), filtering for proteins with six or more interactions to emphasize core functional relationships. RISK identified compact, statistically enriched clusters corresponding to biological processes such as ribosomal assembly and mitochondrial organization.
42
+
43
+ [![Figure 1](https://i.imgur.com/lJHJrJr.jpeg)](https://i.imgur.com/lJHJrJr.jpeg)
44
+
45
+ This figure highlights RISK’s capability to detect both established and novel functional modules within the yeast interactome.
46
+
47
+ ## Citation
48
+
49
+ If you use RISK in your research, please cite:
50
+
51
+ **Horecka et al.**, "RISK: a next-generation tool for biological network annotation and visualization", **Bioinformatics**, 2025. DOI: [10.1234/zenodo.xxxxxxx](https://doi.org/10.1234/zenodo.xxxxxxx)
52
+
53
+ ## Software Architecture and Implementation
54
+
55
+ RISK features a streamlined, modular architecture designed to meet diverse research needs. It includes dedicated modules for:
56
+
57
+ - **Data I/O**: Supports JSON, CSV, TSV, Excel, Cytoscape, and GPickle formats.
58
+ - **Clustering**: Supports multiple clustering methods, including Louvain, Leiden, Markov Clustering, Greedy Modularity, Label Propagation, Spinglass, and Walktrap. Provides flexible distance metrics tailored to network structure.
59
+ - **Statistical Analysis**: Provides a suite of tests for overrepresentation analysis of annotations.
60
+ - **Visualization**: Offers customizable, high-resolution output in multiple formats, including SVG, PNG, and PDF.
61
+
62
+ ## Performance and Efficiency
63
+
64
+ Benchmarking results demonstrate that RISK efficiently scales to networks exceeding hundreds of thousands of edges, maintaining low execution times and optimal memory usage across statistical tests.
65
+
66
+ ## Contributing
67
+
68
+ We welcome contributions from the community:
69
+
70
+ - [Issues Tracker](https://github.com/riskportal/network/issues)
71
+ - [Source Code](https://github.com/riskportal/network/tree/main/risk)
72
+
73
+ ## Support
74
+
75
+ If you encounter issues or have suggestions for new features, please use the [Issues Tracker](https://github.com/riskportal/network/issues) on GitHub.
76
+
77
+ ## License
78
+
79
+ RISK is open source under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).
80
+
81
+ ---
82
+
83
+ **Note**: For detailed documentation and to access the interactive tutorial, please visit the links above.
{risk_network-0.0.9b46 → risk_network-0.0.10}/risk/__init__.py
@@ -7,4 +7,4 @@ RISK: Regional Inference of Significant Kinships
7
7
 
8
8
  from risk.risk import RISK
9
9
 
10
- __version__ = "0.0.9-beta.46"
10
+ __version__ = "0.0.10"
{risk_network-0.0.9b46 → risk_network-0.0.10}/risk/annotations/annotations.py
@@ -53,7 +53,7 @@ def ensure_nltk_resource(resource: str) -> None:
53
53
  print(f"Unzipped '{resource}' successfully.")
54
54
  break # Stop after unzipping the first found ZIP.
55
55
 
56
- # Final check: Try to load the resource one last time.
56
+ # Final check: Try to check resource one last time. If it fails, rai
57
57
  try:
58
58
  nltk.data.find(resource_path)
59
59
  print(f"Resource '{resource}' is now available.")
@@ -62,6 +62,11 @@ def ensure_nltk_resource(resource: str) -> None:
62
62
 
63
63
 
64
64
  # Ensure the NLTK stopwords and WordNet resources are available
65
+ # punkt is known to have issues with the default download method, so we use a custom function if it fails
66
+ try:
67
+ ensure_nltk_resource("punkt")
68
+ except LookupError:
69
+ nltk.download("punkt")
65
70
  ensure_nltk_resource("stopwords")
66
71
  ensure_nltk_resource("wordnet")
67
72
  # Use NLTK's stopwords - load all languages
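Note on the hunk above: the `punkt` lookup is wrapped in `try`/`except LookupError`, with `nltk.download("punkt")` as the fallback. Despite the wording of the added comment, the custom `ensure_nltk_resource` runs first and the default download runs only if it fails. A minimal sketch of that fallback pattern follows, assuming the custom check signals a missing resource by raising `LookupError`. `ensure_with_fallback`, `check`, and `download` are hypothetical names that do not appear in the package; plain callables stand in for the NLTK functions so the sketch runs without NLTK installed.

```python
from typing import Callable, List


def ensure_with_fallback(
    resource: str,
    check: Callable[[str], None],
    download: Callable[[str], object],
) -> str:
    """Try the custom check first; fall back to a plain download on LookupError.

    Hypothetical helper for illustration only. In the package, the roles of
    `check` and `download` are played by ensure_nltk_resource and nltk.download.
    """
    try:
        check(resource)
        return "custom"
    except LookupError:
        download(resource)
        return "fallback"


# Stand-ins for the real functions, used only to exercise both branches.
downloaded: List[str] = []


def failing_check(resource: str) -> None:
    raise LookupError(f"Resource '{resource}' not found")


def passing_check(resource: str) -> None:
    return None


first = ensure_with_fallback("punkt", failing_check, downloaded.append)
second = ensure_with_fallback("stopwords", passing_check, downloaded.append)
```

Only `LookupError` triggers the fallback, so any other exception raised by the custom check would still propagate at import time.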
{risk_network-0.0.9b46 → risk_network-0.0.10}/risk/neighborhoods/domains.py
@@ -42,7 +42,7 @@ def define_domains(
42
42
  Args:
43
43
  top_annotations (pd.DataFrame): DataFrame of top annotations data for the network nodes.
44
44
  significant_neighborhoods_significance (np.ndarray): The binary significance matrix below alpha.
45
- linkage_criterion (str): The clustering criterion for defining groups.
45
+ linkage_criterion (str): The clustering criterion for defining groups. Choose "off" to disable clustering.
46
46
  linkage_method (str): The linkage method for clustering. Choose "auto" to optimize.
47
47
  linkage_metric (str): The linkage metric for clustering. Choose "auto" to optimize.
48
48
  linkage_threshold (float, str): The threshold for clustering. Choose "auto" to optimize.
@@ -249,23 +249,21 @@ def _optimize_silhouette_across_linkage_and_metrics(
249
249
  # Some linkage methods and metrics may not work with certain data
250
250
  try:
251
251
  Z = linkage(m, method=method, metric=metric)
252
- except (ValueError, LinAlgError):
253
- # If linkage fails, set a default threshold (a float) and a very poor score
254
- current_threshold = 0.0
255
- score = -float("inf")
256
- else:
257
- # Only optimize silhouette score if the threshold is "auto"
258
252
  if linkage_threshold == "auto":
259
- threshold, score = _find_best_silhouette_score(Z, m, metric, linkage_criterion)
253
+ try:
254
+ threshold, score = _find_best_silhouette_score(Z, m, metric, linkage_criterion)
255
+ except (ValueError, LinAlgError):
256
+ continue # Skip to the next combination
260
257
  current_threshold = threshold
261
258
  else:
262
- # Use the provided threshold without optimization
263
259
  score = silhouette_score(
264
260
  m,
265
261
  fcluster(Z, linkage_threshold * np.max(Z[:, 2]), criterion=linkage_criterion),
266
262
  metric=metric,
267
263
  )
268
264
  current_threshold = linkage_threshold
265
+ except (ValueError, LinAlgError):
266
+ continue # Skip to the next combination
269
267
 
270
268
  if score > best_overall_score:
271
269
  best_overall_score = score
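Note on the hunk above: a failing linkage used to be recorded with a score of negative infinity, while the new code skips the combination with `continue`. The outer `try` now also covers the `silhouette_score` call in the fixed-threshold branch. The control flow can be sketched as below, assuming a scorer that raises `ValueError` for unsupported combinations. `best_combination`, `score_fn`, and `toy_score` are hypothetical names for illustration and are not part of the package.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple


def best_combination(
    methods: Iterable[str],
    metrics: Iterable[str],
    score_fn: Callable[[str, str], float],
) -> Tuple[Optional[str], Optional[str], float]:
    """Return the (method, metric) pair with the highest score.

    Combinations whose scoring raises ValueError are skipped, so they can
    never be selected as the best result.
    """
    best_method: Optional[str] = None
    best_metric: Optional[str] = None
    best_score = float("-inf")
    for method, metric in product(methods, metrics):
        try:
            score = score_fn(method, metric)
        except ValueError:
            continue  # Skip to the next combination
        if score > best_score:
            best_method, best_metric, best_score = method, metric, score
    return best_method, best_metric, best_score


def toy_score(method: str, metric: str) -> float:
    # Hypothetical scorer: rejects Ward linkage with a non-Euclidean metric.
    if method == "ward" and metric != "euclidean":
        raise ValueError("ward requires the euclidean metric")
    base = {"ward": 0.6, "average": 0.4}[method]
    return base + (0.1 if metric == "cosine" else 0.0)


result = best_combination(["ward", "average"], ["euclidean", "cosine"], toy_score)
all_failed = best_combination(["ward"], ["cosine"], toy_score)
```

If every combination fails, the sketch returns `None` for both names, a case the caller has to handle explicitly.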
@@ -290,7 +288,6 @@ def _find_best_silhouette_score(
290
288
  linkage_criterion: str,
291
289
  lower_bound: float = 0.001,
292
290
  upper_bound: float = 1.0,
293
- resolution: float = 0.001,
294
291
  ) -> Tuple[float, float]:
295
292
  """Find the best silhouette score using binary search.
296
293
 
@@ -301,7 +298,6 @@ def _find_best_silhouette_score(
301
298
  linkage_criterion (str): Clustering criterion.
302
299
  lower_bound (float, optional): Lower bound for search. Defaults to 0.001.
303
300
  upper_bound (float, optional): Upper bound for search. Defaults to 1.0.
304
- resolution (float, optional): Desired resolution for the best threshold. Defaults to 0.001.
305
301
 
306
302
  Returns:
307
303
  Tuple[float, float]:
@@ -310,6 +306,7 @@ def _find_best_silhouette_score(
310
306
  """
311
307
  best_score = -np.inf
312
308
  best_threshold = None
309
+ minimum_linkage_threshold = 1e-6
313
310
 
314
311
  # Test lower bound
315
312
  max_d_lower = np.max(Z[:, 2]) * lower_bound
@@ -338,7 +335,7 @@ def _find_best_silhouette_score(
338
335
  lower_bound = (lower_bound + upper_bound) / 2
339
336
 
340
337
  # Binary search loop
341
- while upper_bound - lower_bound > resolution:
338
+ while upper_bound - lower_bound > minimum_linkage_threshold:
342
339
  mid_threshold = (upper_bound + lower_bound) / 2
343
340
  max_d_mid = np.max(Z[:, 2]) * mid_threshold
344
341
  clusters_mid = fcluster(Z, max_d_mid, criterion=linkage_criterion)
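Note on the hunks above: the `resolution` parameter (default 0.001) is removed from `_find_best_silhouette_score`, and the loop now stops at a hard-coded `minimum_linkage_threshold` of 1e-6. A rough sketch of what that does to the number of halving steps follows, assuming the loop starts from the default bounds 0.001 and 1.0. The real function appears to narrow the bounds before entering the loop, so actual counts may differ. `count_halvings` is a hypothetical helper and does not appear in the package.

```python
def count_halvings(lower_bound: float, upper_bound: float, tolerance: float) -> int:
    """Count iterations of an interval-halving loop that stops at `tolerance`."""
    steps = 0
    while upper_bound - lower_bound > tolerance:
        mid = (upper_bound + lower_bound) / 2
        # A real search would score `mid` and keep the better half; here the
        # upper half is always discarded because only the loop count matters.
        upper_bound = mid
        steps += 1
    return steps


old_steps = count_halvings(0.001, 1.0, 0.001)  # previous `resolution` default
new_steps = count_halvings(0.001, 1.0, 1e-6)   # new `minimum_linkage_threshold`
```

Under these assumptions the finer tolerance takes 20 halvings instead of 10, so each call evaluates about twice as many candidate thresholds in exchange for a more precise threshold.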
{risk_network-0.0.9b46 → risk_network-0.0.10}/risk_network.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.2
2
2
  Name: risk-network
3
- Version: 0.0.9b46
3
+ Version: 0.0.10
4
4
  Summary: A Python package for biological network analysis
5
5
  Author: Ira Horecka
6
6
  Author-email: Ira Horecka <ira89@icloud.com>
@@ -726,92 +726,73 @@ Dynamic: requires-python
726
726
  ![License](https://img.shields.io/badge/license-GPLv3-purple)
727
727
  [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.xxxxxxx.svg)](https://doi.org/10.5281/zenodo.xxxxxxx)
728
728
  ![Downloads](https://img.shields.io/pypi/dm/risk-network)
729
- ![Platforms](https://img.shields.io/badge/platform-linux%20%7C%20macos%20%7C%20windows-lightgrey)
729
+ ![Tests](https://github.com/riskportal/network/actions/workflows/ci.yml/badge.svg)
730
730
 
731
- **RISK** (Regional Inference of Significant Kinships) is a next-generation tool designed to streamline the analysis of biological and non-biological networks. RISK enhances network analysis with its modular architecture, extensive file format support, and advanced clustering algorithms. It simplifies the creation of publication-quality figures, making it an important tool for researchers across disciplines.
731
+ **RISK** (Regional Inference of Significant Kinships) is a next-generation tool for biological network annotation and visualization. RISK integrates community detection-based clustering, rigorous statistical enrichment analysis, and a modular framework to uncover biologically meaningful relationships and generate high-resolution visualizations. RISK supports diverse data formats and is optimized for large-scale network analysis, making it a valuable resource for researchers in systems biology and beyond.
732
732
 
733
733
  ## Documentation and Tutorial
734
734
 
735
- - **Documentation**: Comprehensive documentation is available [here](Documentation link).
736
- - **Tutorial**: An interactive Jupyter notebook tutorial can be found [here](https://github.com/riskportal/network-tutorial).
737
- We highly recommend new users to consult the documentation and tutorial early on to fully leverage RISK's capabilities.
735
+ An interactive Jupyter notebook tutorial can be found [here](https://github.com/riskportal/network-tutorial). We highly recommend new users to consult the documentation and tutorial early on to fully utilize RISK's capabilities.
738
736
 
739
737
  ## Installation
740
738
 
741
- RISK is compatible with Python 3.8 and later versions and operates on all major operating systems. Install RISK via pip:
739
+ RISK is compatible with Python 3.8 or later and runs on all major operating systems. To install the latest version of RISK, run:
742
740
 
743
741
  ```bash
744
- pip install risk-network
742
+ pip install risk-network --upgrade
745
743
  ```
746
744
 
747
745
  ## Features
748
746
 
749
- - **Comprehensive Network Analysis**: Analyze biological networks such as protein–protein interaction (PPI) and gene regulatory networks, as well as non-biological networks.
750
- - **Advanced Clustering Algorithms**: Utilize algorithms like Louvain, Markov Clustering, Spinglass, and more to identify key functional modules.
751
- - **Flexible Visualization**: Generate clear, publication-quality figures with customizable node and edge attributes, including colors, shapes, sizes, and labels.
752
- - **Efficient Data Handling**: Optimized for large datasets, supporting multiple file formats such as JSON, CSV, TSV, Excel, Cytoscape, and GPickle.
753
- - **Statistical Analysis**: Integrated statistical tests, including hypergeometric, permutation, and Poisson tests, to assess the significance of enriched regions.
747
+ - **Comprehensive Network Analysis**: Analyze biological networks (e.g., protein–protein interaction and genetic interaction networks) as well as non-biological networks.
748
+ - **Advanced Clustering Algorithms**: Supports Louvain, Leiden, Markov Clustering, Greedy Modularity, Label Propagation, Spinglass, and Walktrap for identifying structured network regions.
749
+ - **Flexible Visualization**: Produce customizable, high-resolution network visualizations with kernel density estimate overlays, adjustable node and edge attributes, and export options in SVG, PNG, and PDF formats.
750
+ - **Efficient Data Handling**: Supports multiple input/output formats, including JSON, CSV, TSV, Excel, Cytoscape, and GPickle.
751
+ - **Statistical Analysis**: Assess functional enrichment using hypergeometric, permutation, binomial, chi-squared, Poisson, and z-score tests, ensuring statistical adaptability across datasets.
754
752
  - **Cross-Domain Applicability**: Suitable for network analysis across biological and non-biological domains, including social and communication networks.
755
753
 
756
754
  ## Example Usage
757
755
 
758
- We applied RISK to a *Saccharomyces cerevisiae* protein–protein interaction network, revealing both established and novel functional relationships. The visualization below highlights key biological processes such as ribosomal assembly and mitochondrial organization.
756
+ We applied RISK to a *Saccharomyces cerevisiae* protein–protein interaction network from Michaelis et al. (2023), filtering for proteins with six or more interactions to emphasize core functional relationships. RISK identified compact, statistically enriched clusters corresponding to biological processes such as ribosomal assembly and mitochondrial organization.
759
757
 
760
- ![RISK Main Figure](https://i.imgur.com/5OP3Hqe.jpeg)
758
+ [![Figure 1](https://i.imgur.com/lJHJrJr.jpeg)](https://i.imgur.com/lJHJrJr.jpeg)
761
759
 
762
- RISK successfully detected both known and novel functional clusters within the yeast interactome. Clusters related to Golgi transport and actin nucleation were clearly defined and closely located, showcasing RISK's ability to map well-characterized interactions. Additionally, RISK identified links between mRNA processing pathways and vesicle trafficking proteins, consistent with recent studies demonstrating the role of vesicles in mRNA localization and stability.
760
+ This figure highlights RISK’s capability to detect both established and novel functional modules within the yeast interactome.
763
761
 
764
762
  ## Citation
765
763
 
766
- If you use RISK in your research, please cite the following:
764
+ If you use RISK in your research, please cite:
767
765
 
768
- **Horecka**, *et al.*, "RISK: a next-generation tool for biological network annotation and visualization", **[Journal Name]**, 2024. DOI: [10.1234/zenodo.xxxxxxx](https://doi.org/10.1234/zenodo.xxxxxxx)
766
+ **Horecka et al.**, "RISK: a next-generation tool for biological network annotation and visualization", **Bioinformatics**, 2025. DOI: [10.1234/zenodo.xxxxxxx](https://doi.org/10.1234/zenodo.xxxxxxx)
769
767
 
770
768
  ## Software Architecture and Implementation
771
769
 
772
- RISK features a streamlined, modular architecture designed to meet diverse research needs. Each module focuses on a specific task—such as network input/output, statistical analysis, or visualization—ensuring ease of adaptation and extension. This design enhances flexibility and reduces development overhead for users integrating RISK into their workflows.
+ RISK features a streamlined, modular architecture designed to meet diverse research needs. It includes dedicated modules for:
 
- ### Supported Data Formats
-
- - **Input/Output**: JSON, CSV, TSV, Excel, Cytoscape, GPickle.
- - **Visualization Outputs**: SVG, PNG, PDF.
-
- ### Clustering Algorithms
-
- - **Available Algorithms**:
- - Greedy Modularity
- - Label Propagation
- - Louvain
- - Markov Clustering
- - Spinglass
- - Walktrap
- - **Distance Metrics**: Supports both spherical and Euclidean distance metrics.
-
- ### Statistical Tests
-
- - **Hypergeometric Test**
- - **Permutation Test** (single- or multi-process modes)
- - **Poisson Test**
+ - **Data I/O**: Supports JSON, CSV, TSV, Excel, Cytoscape, and GPickle formats.
+ - **Clustering**: Supports multiple clustering methods, including Louvain, Leiden, Markov Clustering, Greedy Modularity, Label Propagation, Spinglass, and Walktrap. Provides flexible distance metrics tailored to network structure.
+ - **Statistical Analysis**: Provides a suite of tests for overrepresentation analysis of annotations.
+ - **Visualization**: Offers customizable, high-resolution output in multiple formats, including SVG, PNG, and PDF.
 
  ## Performance and Efficiency
 
- In benchmarking tests using the yeast interactome network, RISK demonstrated substantial improvements over previous tools in both computational performance and memory efficiency. RISK processed the dataset approximately **3.25 times faster**, reducing CPU time by **69%**, and required **25% less peak memory usage**, underscoring its efficient utilization of computational resources.
+ Benchmarking results demonstrate that RISK efficiently scales to networks exceeding hundreds of thousands of edges, maintaining low execution times and optimal memory usage across statistical tests.
 
  ## Contributing
 
- We welcome contributions from the community. Please use the following resources:
+ We welcome contributions from the community:
 
- - [Issues Tracker](https://github.com/irahorecka/risk/issues)
- - [Source Code](https://github.com/irahorecka/risk/tree/main/risk)
+ - [Issues Tracker](https://github.com/riskportal/network/issues)
+ - [Source Code](https://github.com/riskportal/network/tree/main/risk)
 
  ## Support
 
- If you encounter issues or have suggestions for new features, please use the [Issues Tracker](https://github.com/irahorecka/risk/issues) on GitHub.
+ If you encounter issues or have suggestions for new features, please use the [Issues Tracker](https://github.com/riskportal/network/issues) on GitHub.
 
  ## License
 
- RISK is freely available as open-source software under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).
+ RISK is open source under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).
 
  ---
 
- **Note**: For detailed documentation and to access the interactive tutorial, please visit the links provided in the [Documentation and Tutorial](#documentation-and-tutorial) section.
+ **Note**: For detailed documentation and to access the interactive tutorial, please visit the links above.
@@ -1,102 +0,0 @@
- # RISK Network
-
- <p align="center">
- <img src="https://i.imgur.com/8TleEJs.png" width="50%" />
- </p>
-
- <br>
-
- ![Python](https://img.shields.io/badge/python-3.8%2B-yellow)
- [![pypiv](https://img.shields.io/pypi/v/risk-network.svg)](https://pypi.python.org/pypi/risk-network)
- ![License](https://img.shields.io/badge/license-GPLv3-purple)
- [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.xxxxxxx.svg)](https://doi.org/10.5281/zenodo.xxxxxxx)
- ![Downloads](https://img.shields.io/pypi/dm/risk-network)
- ![Platforms](https://img.shields.io/badge/platform-linux%20%7C%20macos%20%7C%20windows-lightgrey)
-
- **RISK** (Regional Inference of Significant Kinships) is a next-generation tool designed to streamline the analysis of biological and non-biological networks. RISK enhances network analysis with its modular architecture, extensive file format support, and advanced clustering algorithms. It simplifies the creation of publication-quality figures, making it an important tool for researchers across disciplines.
-
- ## Documentation and Tutorial
-
- - **Documentation**: Comprehensive documentation is available [here](Documentation link).
- - **Tutorial**: An interactive Jupyter notebook tutorial can be found [here](https://github.com/riskportal/network-tutorial).
- We highly recommend new users to consult the documentation and tutorial early on to fully leverage RISK's capabilities.
-
- ## Installation
-
- RISK is compatible with Python 3.8 and later versions and operates on all major operating systems. Install RISK via pip:
-
- ```bash
- pip install risk-network
- ```
-
- ## Features
-
- - **Comprehensive Network Analysis**: Analyze biological networks such as protein–protein interaction (PPI) and gene regulatory networks, as well as non-biological networks.
- - **Advanced Clustering Algorithms**: Utilize algorithms like Louvain, Markov Clustering, Spinglass, and more to identify key functional modules.
- - **Flexible Visualization**: Generate clear, publication-quality figures with customizable node and edge attributes, including colors, shapes, sizes, and labels.
- - **Efficient Data Handling**: Optimized for large datasets, supporting multiple file formats such as JSON, CSV, TSV, Excel, Cytoscape, and GPickle.
- - **Statistical Analysis**: Integrated statistical tests, including hypergeometric, permutation, and Poisson tests, to assess the significance of enriched regions.
- - **Cross-Domain Applicability**: Suitable for network analysis across biological and non-biological domains, including social and communication networks.
-
- ## Example Usage
-
- We applied RISK to a *Saccharomyces cerevisiae* protein–protein interaction network, revealing both established and novel functional relationships. The visualization below highlights key biological processes such as ribosomal assembly and mitochondrial organization.
-
- ![RISK Main Figure](https://i.imgur.com/5OP3Hqe.jpeg)
-
- RISK successfully detected both known and novel functional clusters within the yeast interactome. Clusters related to Golgi transport and actin nucleation were clearly defined and closely located, showcasing RISK's ability to map well-characterized interactions. Additionally, RISK identified links between mRNA processing pathways and vesicle trafficking proteins, consistent with recent studies demonstrating the role of vesicles in mRNA localization and stability.
-
- ## Citation
-
- If you use RISK in your research, please cite the following:
-
- **Horecka**, *et al.*, "RISK: a next-generation tool for biological network annotation and visualization", **[Journal Name]**, 2024. DOI: [10.1234/zenodo.xxxxxxx](https://doi.org/10.1234/zenodo.xxxxxxx)
-
- ## Software Architecture and Implementation
-
- RISK features a streamlined, modular architecture designed to meet diverse research needs. Each module focuses on a specific task—such as network input/output, statistical analysis, or visualization—ensuring ease of adaptation and extension. This design enhances flexibility and reduces development overhead for users integrating RISK into their workflows.
-
- ### Supported Data Formats
-
- - **Input/Output**: JSON, CSV, TSV, Excel, Cytoscape, GPickle.
- - **Visualization Outputs**: SVG, PNG, PDF.
-
- ### Clustering Algorithms
-
- - **Available Algorithms**:
- - Greedy Modularity
- - Label Propagation
- - Louvain
- - Markov Clustering
- - Spinglass
- - Walktrap
- - **Distance Metrics**: Supports both spherical and Euclidean distance metrics.
-
- ### Statistical Tests
-
- - **Hypergeometric Test**
- - **Permutation Test** (single- or multi-process modes)
- - **Poisson Test**
-
- ## Performance and Efficiency
-
- In benchmarking tests using the yeast interactome network, RISK demonstrated substantial improvements over previous tools in both computational performance and memory efficiency. RISK processed the dataset approximately **3.25 times faster**, reducing CPU time by **69%**, and required **25% less peak memory usage**, underscoring its efficient utilization of computational resources.
-
- ## Contributing
-
- We welcome contributions from the community. Please use the following resources:
-
- - [Issues Tracker](https://github.com/irahorecka/risk/issues)
- - [Source Code](https://github.com/irahorecka/risk/tree/main/risk)
-
- ## Support
-
- If you encounter issues or have suggestions for new features, please use the [Issues Tracker](https://github.com/irahorecka/risk/issues) on GitHub.
-
- ## License
-
- RISK is freely available as open-source software under the [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).
-
- ---
-
- **Note**: For detailed documentation and to access the interactive tutorial, please visit the links provided in the [Documentation and Tutorial](#documentation-and-tutorial) section.
File without changes
File without changes
File without changes