PyPI - scdataloader - Versions diffs - 1.0.5__tar.gz → 1.1.3__tar.gz - Mend

Files changed (16)

{scdataloader-1.0.5 → scdataloader-1.1.3}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: scdataloader
-Version: 1.0.5
+Version: 1.1.3
 Summary: a dataloader for single cell data in lamindb
 Home-page: https://github.com/jkobject/scDataLoader
 License: GPL3
@@ -13,7 +13,7 @@ Classifier: Programming Language :: Python :: 3.10
 Provides-Extra: dev
 Requires-Dist: anndata
 Requires-Dist: biomart
-Requires-Dist: bionty (==0.48.0)
+Requires-Dist: bionty (==0.49.0)
 Requires-Dist: black (>=23.10.1,<24.0.0) ; extra == "dev"
 Requires-Dist: cellxgene-census
 Requires-Dist: coverage (>=7.3.2,<8.0.0) ; extra == "dev"
@@ -23,7 +23,7 @@ Requires-Dist: flake8 (>=6.1.0,<7.0.0) ; extra == "dev"
 Requires-Dist: gitchangelog (>=3.0.4,<4.0.0) ; extra == "dev"
 Requires-Dist: ipykernel
 Requires-Dist: isort (>=5.12.0,<6.0.0) ; extra == "dev"
-Requires-Dist: lamindb (==0.75.1)
+Requires-Dist: lamindb (==0.76.3)
 Requires-Dist: leidenalg
 Requires-Dist: lightning
 Requires-Dist: matplotlib
@@ -82,13 +82,41 @@ I needed to create this Data Loader for my PhD project. I am using it to load &
 ```bash
 pip install scdataloader
+# or
+pip install scDataLoader[dev] # for dev dependencies
+lamin login <email> --key <API-key>
+lamin init --storage [folder-name-where-lamin-data-will-be-stored] --schema bionty
 ```
-### Install it locally and run the notebooks:
+if you start with lamin and had to do a `lamin init`, you will also need to populate your ontologies. This is because scPRINT is using ontologies to define its cell types, diseases, sexes, ethnicities, etc.
+you can do it manually or with our function:
+```python
+from scdataloader.utils import populate_my_ontology
+populate_my_ontology() #to populate everything (recommended) (can take 2-10mns)
+populate_my_ontology( #the minimum for scprint to run some inferences (denoising, grn inference)
+organisms: List[str] = ["NCBITaxon:10090", "NCBITaxon:9606"],
+    sex: List[str] = ["PATO:0000384", "PATO:0000383"],
+    celltypes = None,
+    ethnicities = None,
+    assays = None,
+    tissues = None,
+    diseases = None,
+    dev_stages = None,
+)
+```
+### Dev install
+If you want to use the latest version of scDataLoader and work on the code yourself use `git clone` and `pip -e` instead of `pip install`.
 ```bash
 git clone https://github.com/jkobject/scDataLoader.git
-pip install -e scDataLoader
+pip install -e scDataLoader[dev]
 ```
 ## Usage

{scdataloader-1.0.5 → scdataloader-1.1.3}/README.md RENAMED

@@ -41,13 +41,41 @@ I needed to create this Data Loader for my PhD project. I am using it to load &
 ```bash
 pip install scdataloader
+# or
+pip install scDataLoader[dev] # for dev dependencies
+lamin login <email> --key <API-key>
+lamin init --storage [folder-name-where-lamin-data-will-be-stored] --schema bionty
 ```
-### Install it locally and run the notebooks:
+if you start with lamin and had to do a `lamin init`, you will also need to populate your ontologies. This is because scPRINT is using ontologies to define its cell types, diseases, sexes, ethnicities, etc.
+you can do it manually or with our function:
+```python
+from scdataloader.utils import populate_my_ontology
+populate_my_ontology() #to populate everything (recommended) (can take 2-10mns)
+populate_my_ontology( #the minimum for scprint to run some inferences (denoising, grn inference)
+organisms: List[str] = ["NCBITaxon:10090", "NCBITaxon:9606"],
+    sex: List[str] = ["PATO:0000384", "PATO:0000383"],
+    celltypes = None,
+    ethnicities = None,
+    assays = None,
+    tissues = None,
+    diseases = None,
+    dev_stages = None,
+)
+```
+### Dev install
+If you want to use the latest version of scDataLoader and work on the code yourself use `git clone` and `pip -e` instead of `pip install`.
 ```bash
 git clone https://github.com/jkobject/scDataLoader.git
-pip install -e scDataLoader
+pip install -e scDataLoader[dev]
 ```
 ## Usage
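
A note on the README snippet added in this hunk: the second `populate_my_ontology(...)` call is written in signature style (`organisms: List[str] = [...]`), which Python rejects when used as a call. The sketch below shows the keyword-argument form instead. `populate_my_ontology_stub` is a hypothetical stand-in that only echoes its arguments so the example runs without lamindb; its parameter names are copied from the README, and its `None` defaults are placeholders rather than the package's real defaults.

```python
from typing import Dict, List, Optional


def populate_my_ontology_stub(
    organisms: Optional[List[str]] = None,
    sex: Optional[List[str]] = None,
    celltypes: Optional[List[str]] = None,
    ethnicities: Optional[List[str]] = None,
    assays: Optional[List[str]] = None,
    tissues: Optional[List[str]] = None,
    diseases: Optional[List[str]] = None,
    dev_stages: Optional[List[str]] = None,
) -> Dict[str, Optional[List[str]]]:
    """Hypothetical stand-in: returns the arguments it was called with.

    The parameter names mirror the README; the defaults are placeholders.
    """
    return {
        "organisms": organisms,
        "sex": sex,
        "celltypes": celltypes,
        "ethnicities": ethnicities,
        "assays": assays,
        "tissues": tissues,
        "diseases": diseases,
        "dev_stages": dev_stages,
    }


# Keyword-argument form of the "minimum" call described in the README:
# values are passed as name=value, with no type annotations.
minimal = populate_my_ontology_stub(
    organisms=["NCBITaxon:10090", "NCBITaxon:9606"],
    sex=["PATO:0000384", "PATO:0000383"],
    celltypes=None,
    ethnicities=None,
    assays=None,
    tissues=None,
    diseases=None,
    dev_stages=None,
)

# List which categories were actually requested.
print(sorted(name for name, value in minimal.items() if value is not None))
```

With the package installed and a lamin instance initialised, the same keyword arguments would presumably be passed to `scdataloader.utils.populate_my_ontology` itself; that has not been verified here.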

{scdataloader-1.0.5 → scdataloader-1.1.3}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "scdataloader"
-version = "1.0.5"
+version = "1.1.3"
 description = "a dataloader for single cell data in lamindb"
 authors = ["jkobject"]
 license = "GPL3"
@@ -10,8 +10,8 @@ keywords = ["scRNAseq", "dataloader", "pytorch", "lamindb", "scPRINT"]
 [tool.poetry.dependencies]
 python = "3.10.*"
-lamindb = "0.75.1"
-bionty = "0.48.0"
+lamindb = "0.76.3"
+bionty = "0.49.0"
 cellxgene-census = "*"
 torch = "*"
 lightning = "*"
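
Because `lamindb` and `bionty` are both pinned to exact versions, an environment carried over from 1.0.5 needs the two packages moved together. The following is a minimal sketch for checking an environment against the new pins using only the standard library. `PINS` and `check_pins` are hypothetical names chosen for illustration and are not part of scdataloader.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict

# Exact pins taken from the 1.1.3 side of the hunk above.
PINS: Dict[str, str] = {"lamindb": "0.76.3", "bionty": "0.49.0"}


def check_pins(
    pins: Dict[str, str],
    get_version: Callable[[str], str] = version,
) -> Dict[str, str]:
    """Return {package: problem} for every pin the environment does not meet.

    Versions are compared as plain strings, which is enough for exact pins
    but does not normalise equivalent spellings of the same version.
    """
    problems: Dict[str, str] = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            problems[name] = f"not installed (expected {wanted})"
            continue
        if found != wanted:
            problems[name] = f"found {found} (expected {wanted})"
    return problems


if __name__ == "__main__":
    # Report each mismatch; prints nothing when every pin is satisfied.
    for name, problem in check_pins(PINS).items():
        print(f"{name}: {problem}")
```

The `get_version` parameter exists only so the lookup can be swapped out, for example with a dictionary lookup when trying the function without the packages installed.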

scdataloader-1.1.3/scdataloader/VERSION ADDED

@@ -0,0 +1 @@
+1.1.3

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/collator.py RENAMED

@@ -161,7 +161,10 @@ class Collator:
                 raise ValueError("how must be either most expr or random expr")
             if (
                 (self.add_zero_genes > 0) or (self.max_len > len(nnz_loc))
-            ) and self.how not in ["all", "some"]:
+            ) and self.how not in [
+                "all",
+                "some",
+            ]:
                 zero_loc = np.where(expr == 0)[0]
                 zero_loc = zero_loc[
                     np.random.choice(

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/utils.py RENAMED

@@ -433,7 +433,7 @@ def populate_my_ontology(
         names = bt.Phenotype.public().df().index if not sex else sex
         records = [
             bt.Phenotype.from_source(
-                ontology_id=i,
+                ontology_id=i, source=bt.PublicSource.filter(name="pato").first()
             )
             for i in names
         ]
@@ -472,9 +472,12 @@ def populate_my_ontology(
         names = bt.DevelopmentalStage.public(organism="mouse").df().index
         records = [
-            bt.DevelopmentalStage.from_source(ontology_id=i) for i in names.tolist()
+            bt.DevelopmentalStage.from_source(
+                ontology_id=i,
+                source=bt.PublicSource.filter(organism="mouse", name="mmusdv").first(),
+            )
+            for i in names.tolist()
         ]
-        records[-4] = records[-4][0]
         ln.save(records)
     # Disease
     if diseases is not None:

scdataloader-1.0.5/scdataloader/VERSION DELETED

@@ -1 +0,0 @@
-1.0.5

{scdataloader-1.0.5 → scdataloader-1.1.3}/LICENSE RENAMED

File without changes

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/__init__.py RENAMED

File without changes

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/__main__.py RENAMED

File without changes

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/base.py RENAMED

File without changes

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/config.py RENAMED

File without changes

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/data.py RENAMED

File without changes

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/datamodule.py RENAMED

File without changes

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/mapped.py RENAMED

File without changes

{scdataloader-1.0.5 → scdataloader-1.1.3}/scdataloader/preprocess.py RENAMED

File without changes
