scdataloader 1.0.6__tar.gz → 1.2.1__tar.gz

Files changed (61)
  1. scdataloader-1.2.1/.cursorignore +21 -0
  2. scdataloader-1.2.1/.github/ISSUE_TEMPLATE/bug_report.md +31 -0
  3. scdataloader-1.2.1/.github/ISSUE_TEMPLATE/feature_request.md +20 -0
  4. scdataloader-1.2.1/.github/PULL_REQUEST_TEMPLATE.md +15 -0
  5. scdataloader-1.2.1/.github/dependabot.yml +6 -0
  6. scdataloader-1.2.1/.github/release_message.sh +3 -0
  7. scdataloader-1.2.1/.github/rename_project.sh +36 -0
  8. scdataloader-1.2.1/.github/workflows/main.yml +43 -0
  9. scdataloader-1.2.1/.github/workflows/release.yml +48 -0
  10. scdataloader-1.2.1/.gitignore +133 -0
  11. scdataloader-1.2.1/ABOUT_THIS_TEMPLATE.md +70 -0
  12. scdataloader-1.2.1/CONTRIBUTING.md +113 -0
  13. scdataloader-1.2.1/Containerfile +5 -0
  14. scdataloader-1.2.1/HISTORY.md +296 -0
  15. scdataloader-1.2.1/MANIFEST.in +5 -0
  16. scdataloader-1.2.1/Makefile +95 -0
  17. scdataloader-1.2.1/PKG-INFO +299 -0
  18. scdataloader-1.2.1/README.md +260 -0
  19. scdataloader-1.2.1/docs/collator.md +4 -0
  20. scdataloader-1.2.1/docs/datamodule.md +4 -0
  21. scdataloader-1.2.1/docs/dataset.md +7 -0
  22. scdataloader-1.0.6/README.md → scdataloader-1.2.1/docs/index.md +5 -5
  23. scdataloader-1.2.1/docs/notebooks/1_download_and_preprocess.ipynb +901 -0
  24. scdataloader-1.2.1/docs/notebooks/2_create_dataloader.ipynb +341 -0
  25. scdataloader-1.2.1/docs/preprocess.md +13 -0
  26. scdataloader-1.2.1/docs/scdataloader.drawio.png +0 -0
  27. scdataloader-1.2.1/docs/utils.md +4 -0
  28. scdataloader-1.2.1/mkdocs.yml +36 -0
  29. scdataloader-1.2.1/notebooks/additional.py +24 -0
  30. scdataloader-1.2.1/notebooks/finalize_data_loader.ipynb +1611 -0
  31. scdataloader-1.2.1/notebooks/finalize_data_loaderv2.ipynb +2436 -0
  32. scdataloader-1.2.1/notebooks/onto_rel.ipynb +519 -0
  33. scdataloader-1.2.1/notebooks/prepare_dataset.py +96 -0
  34. scdataloader-1.2.1/notebooks/rel_onto_tissues_age.ipynb +494 -0
  35. scdataloader-1.2.1/notebooks/reset_lamin.py +26 -0
  36. scdataloader-1.2.1/notebooks/speed_dataloader.ipynb +191 -0
  37. scdataloader-1.2.1/notebooks/work_on_dataloader_onto part 2.ipynb +4629 -0
  38. scdataloader-1.2.1/notebooks/work_on_dataloader_onto part 3.ipynb +6106 -0
  39. scdataloader-1.2.1/notebooks/work_on_dataloader_onto.ipynb +3496 -0
  40. scdataloader-1.2.1/poetry.lock +5061 -0
  41. scdataloader-1.2.1/pyproject.toml +67 -0
  42. scdataloader-1.2.1/scdataloader/VERSION +1 -0
  43. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/__main__.py +5 -3
  44. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/collator.py +8 -3
  45. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/data.py +41 -17
  46. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/datamodule.py +13 -13
  47. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/preprocess.py +71 -56
  48. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/utils.py +77 -58
  49. scdataloader-1.2.1/tests/__init__.py +0 -0
  50. scdataloader-1.2.1/tests/conftest.py +26 -0
  51. scdataloader-1.2.1/tests/test.h5ad +0 -0
  52. scdataloader-1.2.1/tests/test_base.py +64 -0
  53. scdataloader-1.2.1/uv.lock +3157 -0
  54. scdataloader-1.0.6/PKG-INFO +0 -899
  55. scdataloader-1.0.6/pyproject.toml +0 -62
  56. scdataloader-1.0.6/scdataloader/VERSION +0 -1
  57. scdataloader-1.0.6/scdataloader/mapped.py +0 -540
  58. {scdataloader-1.0.6 → scdataloader-1.2.1}/LICENSE +0 -0
  59. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/__init__.py +1 -1
  60. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/base.py +0 -0
  61. {scdataloader-1.0.6 → scdataloader-1.2.1}/scdataloader/config.py +0 -0
@@ -0,0 +1,21 @@
1
+ *.json
2
+ *.csv
3
+ *.txt
4
+ *.md
5
+ *.ipynb
6
+ *.Rmd
7
+ *.Rproj
8
+ *.Rproj.user
9
+ *.parquet
10
+ *.ckpt
11
+ *.h5ad
12
+ site/
13
+ *.lamindb
14
+ *.lndb
15
+ *.out
16
+ *.html
17
+ *.pdf
18
+ *.npy
19
+ *.npz
20
+ *.whl
21
+ *.gz
@@ -0,0 +1,31 @@
1
+ ---
2
+ name: Bug report
3
+ about: Create a report to help us improve
4
+ title: ''
5
+ labels: bug, help wanted
6
+ assignees: ''
7
+
8
+ ---
9
+
10
+ **Describe the bug**
11
+ A clear and concise description of what the bug is.
12
+
13
+ **To Reproduce**
14
+ Steps to reproduce the behavior:
15
+ 1. Go to '...'
16
+ 2. Click on '....'
17
+ 3. Scroll down to '....'
18
+ 4. See error
19
+
20
+ **Expected behavior**
21
+ A clear and concise description of what you expected to happen.
22
+
23
+ **Screenshots**
24
+ If applicable, add screenshots to help explain your problem.
25
+
26
+ **Desktop (please complete the following information):**
27
+ - OS: [e.g. iOS]
28
+ - Version [e.g. 22]
29
+
30
+ **Additional context**
31
+ Add any other context about the problem here.
@@ -0,0 +1,20 @@
1
+ ---
2
+ name: Feature request
3
+ about: Suggest an idea for this project
4
+ title: ''
5
+ labels: enhancement, question
6
+ assignees: ''
7
+
8
+ ---
9
+
10
+ **Is your feature request related to a problem? Please describe.**
11
+ A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
12
+
13
+ **Describe the solution you'd like**
14
+ A clear and concise description of what you want to happen.
15
+
16
+ **Describe alternatives you've considered**
17
+ A clear and concise description of any alternative solutions or features you've considered.
18
+
19
+ **Additional context**
20
+ Add any other context or screenshots about the feature request here.
@@ -0,0 +1,15 @@
1
+ ### Summary :memo:
2
+ _Write an overview about it._
3
+
4
+ ### Details
5
+ _Describe your changes in more detail._
6
+ 1. (...)
7
+ 2. (...)
8
+
9
+ ### Bugfixes :bug: (delete if there weren't any)
10
+ -
11
+
12
+ ### Checks
13
+ - [ ] Closed #798
14
+ - [ ] Tested Changes
15
+ - [ ] Stakeholder Approval
@@ -0,0 +1,6 @@
1
+ version: 2
2
+ updates:
3
+ - package-ecosystem: "github-actions"
4
+ directory: "/"
5
+ schedule:
6
+ interval: "weekly"
@@ -0,0 +1,3 @@
1
+ #!/usr/bin/env bash
2
+ previous_tag=$(git tag --sort=-creatordate | sed -n 2p)
3
+ git shortlog "${previous_tag}.." | sed 's/^./ &/'
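Annotation on the hunk above (`.github/release_message.sh`): the script takes the second line of the tag list sorted newest first, then prints the shortlog since that tag with every non-empty line indented by one space. The Python sketch below only mirrors the two `sed` steps on in-memory strings. The helper names `previous_tag` and `indent_nonempty` are hypothetical, are not part of the package, and do not call git.

```python
from typing import List


def previous_tag(tags_newest_first: List[str]) -> str:
    """Mirror `sed -n 2p`: return the second entry, or '' if there is none."""
    return tags_newest_first[1] if len(tags_newest_first) > 1 else ""


def indent_nonempty(text: str) -> str:
    """Mirror `sed 's/^./ &/'`: prefix each non-empty line with one space."""
    return "\n".join(" " + line if line else line for line in text.split("\n"))


if __name__ == "__main__":
    # Hypothetical tag list, newest first, as `--sort=-creatordate` would order it.
    print(previous_tag(["1.2.1", "1.2.0", "1.0.6"]))
    print(indent_nonempty("Author (2):\n      first commit\n\n"))
```

One consequence visible in the sketch: with fewer than two tags the previous tag is empty, so the shell script would run `git shortlog ".."` rather than fail early.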
@@ -0,0 +1,36 @@
1
+ #!/usr/bin/env bash
2
+ while getopts a:n:u:d: flag
3
+ do
4
+ case "${flag}" in
5
+ a) author=${OPTARG};;
6
+ n) name=${OPTARG};;
7
+ u) urlname=${OPTARG};;
8
+ d) description=${OPTARG};;
9
+ esac
10
+ done
11
+
12
+ echo "Author: $author";
13
+ echo "Project Name: $name";
14
+ echo "Project URL name: $urlname";
15
+ echo "Description: $description";
16
+
17
+ echo "Renaming project..."
18
+
19
+ original_author="jkobject"
20
+ original_name="scdataloader"
21
+ original_urlname="scDataLoader"
22
+ original_description="Awesome scdataloader created by jkobject"
23
+ # for filename in $(find . -name "*.*")
24
+ for filename in $(git ls-files)
25
+ do
26
+ sed -i "s/$original_author/$author/g" $filename
27
+ sed -i "s/$original_name/$name/g" $filename
28
+ sed -i "s/$original_urlname/$urlname/g" $filename
29
+ sed -i "s/$original_description/$description/g" $filename
30
+ echo "Renamed $filename"
31
+ done
32
+
33
+ mv scdataloader $name
34
+
35
+ # This command runs only once on GHA!
36
+ rm -rf .github/template.yml
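Annotation on the hunk above (`.github/rename_project.sh`): the four `sed` substitutions run in sequence on each file, so the order matters. The Python sketch below replays that sequence on a single string, under the assumption that the patterns behave as literal text (they contain no regular-expression metacharacters). The function name `rename_text` and the sample values are hypothetical.

```python
def rename_text(text: str, author: str, name: str, urlname: str, description: str) -> str:
    """Apply the script's four substitutions in the same order as the shell loop."""
    replacements = [
        ("jkobject", author),
        ("scdataloader", name),
        ("scDataLoader", urlname),
        ("Awesome scdataloader created by jkobject", description),
    ]
    for old, new in replacements:
        text = text.replace(old, new)  # literal, case-sensitive replacement
    return text


if __name__ == "__main__":
    original = "Awesome scdataloader created by jkobject"
    print(rename_text(original, "alice", "mypkg", "MyPkg", "My tool"))
```

In this replay the description pattern no longer matches by the time it is applied, because the author and project name inside it have already been rewritten. If the shell script behaves the same way, moving the description substitution first would avoid that.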
@@ -0,0 +1,43 @@
1
+ # This is a basic workflow to help you get started with Actions
2
+
3
+ name: CI
4
+
5
+ # Controls when the workflow will run
6
+ on:
7
+ # Triggers the workflow on push or pull request events but only for the main branch
8
+ push:
9
+ branches: [main]
10
+ pull_request:
11
+ branches: [main]
12
+
13
+ # Allows you to run this workflow manually from the Actions tab
14
+ workflow_dispatch:
15
+
16
+ jobs:
17
+ ci:
18
+ strategy:
19
+ fail-fast: true
20
+ matrix:
21
+ python-version: ["3.10"]
22
+ os: [ubuntu-latest]
23
+ runs-on: ${{ matrix.os }}
24
+ steps:
25
+ - uses: actions/checkout@v4
26
+ - name: Install uv
27
+ uses: astral-sh/setup-uv@v3
28
+ - uses: actions/setup-python@v5
29
+ with:
30
+ python-version: ${{ matrix.python-version }}
31
+ - name: Install project
32
+ run: make virtualenv
33
+ continue-on-error: false
34
+ - name: Run linter
35
+ run: make lint
36
+ continue-on-error: false
37
+ - name: Run tests
38
+ run: make test
39
+ - name: "Upload coverage to Codecov"
40
+ uses: codecov/codecov-action@v4
41
+ with:
42
+ fail_ci_if_error: true
43
+ token: ${{ secrets.CODECOV_TOKEN }}
@@ -0,0 +1,48 @@
1
+ name: Upload Python Package
2
+
3
+ on:
4
+ push:
5
+ # Sequence of patterns matched against refs/tags
6
+ tags:
7
+ - "*" # Push events for any tag, e.g. v1.0, v20.15.10
8
+
9
+ # Allows you to run this workflow manually from the Actions tab
10
+ workflow_dispatch:
11
+
12
+ jobs:
13
+ release:
14
+ name: Create Release
15
+ runs-on: ubuntu-latest
16
+ permissions:
17
+ contents: write
18
+ steps:
19
+ - uses: actions/checkout@v4
20
+ with:
21
+ # by default, it uses a depth of 1
22
+ # this fetches all history so that we can read each commit
23
+ fetch-depth: 0
24
+ - name: Generate Changelog
25
+ run: .github/release_message.sh > release_message.md
26
+ - name: Release
27
+ uses: softprops/action-gh-release@v1
28
+ with:
29
+ body_path: release_message.md
30
+
31
+ deploy:
32
+ needs: release
33
+ strategy:
34
+ fail-fast: true
35
+ matrix:
36
+ python-version: ["3.10"]
37
+ os: [ubuntu-latest]
38
+ runs-on: ${{ matrix.os }}
39
+ steps:
40
+ - uses: actions/checkout@v4
41
+ - name: Install uv
42
+ uses: astral-sh/setup-uv@v3
43
+ - uses: actions/setup-python@v5
44
+ with:
45
+ python-version: ${{ matrix.python-version }}
46
+ - name: Build and publish
47
+ run: |
48
+ uv build && uv publish -u __token__ -p ${{ secrets.POETRY_TOKEN }}
@@ -0,0 +1,133 @@
1
+ # Byte-compiled / optimized / DLL files
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+
6
+ # C extensions
7
+ *.so
8
+
9
+ # Distribution / packaging
10
+ .Python
11
+ build/
12
+ develop-eggs/
13
+ dist/
14
+ downloads/
15
+ eggs/
16
+ .eggs/
17
+ lib/
18
+ lib64/
19
+ parts/
20
+ sdist/
21
+ var/
22
+ wheels/
23
+ pip-wheel-metadata/
24
+ share/python-wheels/
25
+ *.egg-info/
26
+ .installed.cfg
27
+ *.egg
28
+ MANIFEST
29
+
30
+ # PyInstaller
31
+ # Usually these files are written by a python script from a template
32
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
33
+ *.manifest
34
+ *.spec
35
+
36
+ # Installer logs
37
+ pip-log.txt
38
+ pip-delete-this-directory.txt
39
+
40
+ # Unit test / coverage reports
41
+ htmlcov/
42
+ .tox/
43
+ .nox/
44
+ .coverage
45
+ .coverage.*
46
+ .cache
47
+ nosetests.xml
48
+ coverage.xml
49
+ *.cover
50
+ *.py,cover
51
+ .hypothesis/
52
+ .pytest_cache/
53
+
54
+ # Translations
55
+ *.mo
56
+ *.pot
57
+
58
+ # Django stuff:
59
+ *.log
60
+ local_settings.py
61
+ db.sqlite3
62
+ db.sqlite3-journal
63
+
64
+ # Flask stuff:
65
+ instance/
66
+ .webassets-cache
67
+
68
+ # Scrapy stuff:
69
+ .scrapy
70
+
71
+ # Sphinx documentation
72
+ docs/_build/
73
+
74
+ # PyBuilder
75
+ target/
76
+
77
+ # Jupyter Notebook
78
+ .ipynb_checkpoints
79
+
80
+ # IPython
81
+ profile_default/
82
+ ipython_config.py
83
+
84
+ # pyenv
85
+ .python-version
86
+
87
+ # pipenv
88
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
90
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
91
+ # install all needed dependencies.
92
+ #Pipfile.lock
93
+
94
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95
+ __pypackages__/
96
+
97
+ # Celery stuff
98
+ celerybeat-schedule
99
+ celerybeat.pid
100
+
101
+ # SageMath parsed files
102
+ *.sage.py
103
+
104
+ # Environments
105
+ .env
106
+ .venv
107
+ env/
108
+ venv/
109
+ ENV/
110
+ env.bak/
111
+ venv.bak/
112
+
113
+ # Spyder project settings
114
+ .spyderproject
115
+ .spyproject
116
+
117
+ # Rope project settings
118
+ .ropeproject
119
+
120
+ # mkdocs documentation
121
+ /site
122
+
123
+ # mypy
124
+ .mypy_cache/
125
+ .dmypy.json
126
+ dmypy.json
127
+
128
+ # Pyre type checker
129
+ .pyre/
130
+
131
+ # templates
132
+ .github/templates/*
133
+ .DS_Store
@@ -0,0 +1,70 @@
1
+ ## Structure
2
+
3
+ Let's take a look at the structure of this template:
4
+
5
+ ```text
6
+ ├── Containerfile # The file to build a container using buildah or docker
7
+ ├── CONTRIBUTING.md # Onboarding instructions for new contributors
8
+ ├── docs # Documentation site (add more .md files here)
9
+ │   └── index.md # The index page for the docs site
10
+ ├── .github # Github metadata for repository
11
+ │   ├── release_message.sh # A script to generate a release message
12
+ │   └── workflows # The CI pipeline for Github Actions
13
+ ├── .gitignore # A list of files to ignore when pushing to Github
14
+ ├── HISTORY.md # Auto generated list of changes to the project
15
+ ├── LICENSE # The license for the project
16
+ ├── Makefile # A collection of utilities to manage the project
17
+ ├── MANIFEST.in # A list of files to include in a package
18
+ ├── mkdocs.yml # Configuration for documentation site
19
+ ├── scdataloader # The main python package for the project
20
+ │   ├── base.py # The base module for the project
21
+ │   ├── __init__.py # This tells Python that this is a package
22
+ │   ├── __main__.py # The entry point for the project
23
+ │   └── VERSION # The version for the project is kept in a static file
24
+ ├── README.md # The main readme for the project
25
+ ├── setup.py # The setup.py file for installing and packaging the project
26
+ ├── requirements.txt # An empty file to hold the requirements for the project
27
+ ├── requirements-test.txt # List of requirements for testing and development
28
+ ├── setup.py # The setup.py file for installing and packaging the project
29
+ └── tests # Unit tests for the project (add more test files here)
30
+     ├── conftest.py # Configuration, hooks and fixtures for pytest
31
+     ├── __init__.py # This tells Python that this is a test package
32
+     └── test_base.py # The base test case for the project
33
+ ```
34
+
35
+ ### Why include `tests`, `history` and `Containerfile` in the release?
36
+
37
+ The `MANIFEST.in` file lists the files to include in the release. Once the
38
+ project is released to PyPI, all the files listed in `MANIFEST.in` are included,
39
+ even if they are static or not related to Python.
40
+
41
+ Some build systems, such as RPM, DEB, and AUR for some Linux distributions, as well as
42
+ internal repackaging systems, tend to run the tests before packaging.
43
+
44
+ The Containerfile can be useful to provide a safer execution environment for
45
+ the project when running on a testing environment.
46
+
47
+ I added those files to make it easier for packaging in different formats.
48
+
49
+ ## The Makefile
50
+
51
+ All the utilities for the template and project are in the Makefile:
52
+
53
+ ```bash
54
+ ❯ make
55
+ Usage: make <target>
56
+
57
+ Targets:
58
+ help: ## Show the help.
59
+ install: ## Install the project in dev mode.
60
+ fmt: ## Format code using black & isort.
61
+ lint: ## Run pep8, black, mypy linters.
62
+ test: lint ## Run tests and generate coverage report.
63
+ watch: ## Run tests on every change.
64
+ clean: ## Clean unused files.
65
+ virtualenv: ## Create a virtual environment.
66
+ release: ## Create a new tag for release.
67
+ docs: ## Build the documentation.
68
+ switch-to-poetry: ## Switch to poetry package manager.
69
+ init: ## Initialize the project based on an application template.
70
+ ```
@@ -0,0 +1,113 @@
1
+ # How to develop on this project
2
+
3
+ scdataloader welcomes contributions from the community.
4
+
5
+ **You need PYTHON3!**
6
+
7
+ These instructions are for Unix-like systems (Linux, macOS, BSD, etc.).
8
+ ## Setting up your own fork of this repo.
9
+
10
+ - On the GitHub interface, click the `Fork` button.
11
+ - Clone your fork of this repo. `git clone git@github.com:YOUR_GIT_USERNAME/scDataLoader.git`
12
+ - Enter the directory `cd scDataLoader`
13
+ - Add upstream repo `git remote add upstream https://github.com/jkobject/scDataLoader`
14
+
15
+ ## Setting up your own virtual environment
16
+
17
+ Run `make virtualenv` to create a virtual environment,
18
+ then activate it with `source .venv/bin/activate`.
19
+
20
+ ## Install the project in develop mode
21
+
22
+ Run `make install` to install the project in develop mode.
23
+
24
+ ## Run the tests to ensure everything is working
25
+
26
+ Run `make test` to run the tests.
27
+
28
+ ## Create a new branch to work on your contribution
29
+
30
+ Run `git checkout -b my_contribution`
31
+
32
+ ## Make your changes
33
+
34
+ Edit the files using your preferred editor. (We recommend Vim or VS Code.)
35
+
36
+ ## Format the code
37
+
38
+ Run `make fmt` to format the code.
39
+
40
+ ## Run the linter
41
+
42
+ Run `make lint` to run the linter.
43
+
44
+ ## Test your changes
45
+
46
+ Run `make test` to run the tests.
47
+
48
+ Ensure the code coverage report shows `100%` coverage; add tests to your PR as needed.
49
+
50
+ ## Build the docs locally
51
+
52
+ Run `make docs` to build the docs.
53
+
54
+ Ensure your new changes are documented.
55
+
56
+ ## Commit your changes
57
+
58
+ This project uses [conventional git commit messages](https://www.conventionalcommits.org/en/v1.0.0/).
59
+
60
+ Example: `fix(package): update setup.py arguments 🎉` (emojis are fine too)
61
+
62
+ ## Push your changes to your fork
63
+
64
+ Run `git push origin my_contribution`
65
+
66
+ ## Submit a pull request
67
+
68
+ On the GitHub interface, click the `Pull Request` button.
69
+
70
+ Wait for CI to run; one of the developers will then review your PR.
71
+ ## Makefile utilities
72
+
73
+ This project comes with a `Makefile` that contains a number of useful utilities.
74
+
75
+ ```bash
76
+ ❯ make
77
+ Usage: make <target>
78
+
79
+ Targets:
80
+ help: ## Show the help.
81
+ install: ## Install the project in dev mode.
82
+ fmt: ## Format code using black & isort.
83
+ lint: ## Run pep8, black, mypy linters.
84
+ test: lint ## Run tests and generate coverage report.
85
+ watch: ## Run tests on every change.
86
+ clean: ## Clean unused files.
87
+ virtualenv: ## Create a virtual environment.
88
+ release: ## Create a new tag for release.
89
+ docs: ## Build the documentation.
90
+ switch-to-poetry: ## Switch to poetry package manager.
91
+ init: ## Initialize the project based on an application template.
92
+ ```
93
+
94
+ ## Making a new release
95
+
96
+ This project uses [semantic versioning](https://semver.org/) and tags releases with `X.Y.Z`.
97
+ Every time a new tag is created and pushed to the remote repo, GitHub Actions will
98
+ automatically create a new release on GitHub and trigger a release on PyPI.
99
+
100
+ For this to work, you need to set up a secret called `PIPY_API_TOKEN` in the project's Settings > Secrets;
101
+ this token can be generated on [pypi.org](https://pypi.org/account/).
102
+
103
+ To trigger a new release, all you need to do is:
104
+
105
+ 1. If you have changes to add to the repo
106
+ * Make your changes following the steps described above.
107
+ * Commit your changes following the [conventional git commit messages](https://www.conventionalcommits.org/en/v1.0.0/).
108
+ 2. Run the tests to ensure everything is working.
109
+ 3. Run `make release` to create a new tag and push it to the remote repo.
110
+
111
+ `make release` will ask you for the version number to use for the tag, e.g. type `0.1.1` when prompted.
112
+
113
+ > **CAUTION**: `make release` will change local changelog files and commit all the unstaged changes you have.
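Annotation on the hunk above (`CONTRIBUTING.md`): the release section says tags follow `X.Y.Z`. As a minimal sketch, a tag could be checked against that shape before running `make release`. The pattern below assumes exactly three dot-separated numeric components with no leading zeros and no pre-release suffix, which is narrower than full semantic versioning. `is_release_tag` is a hypothetical helper, not something the project ships.

```python
import re

# Assumption: a release tag is exactly MAJOR.MINOR.PATCH, digits only.
TAG_PATTERN = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def is_release_tag(tag: str) -> bool:
    """Return True when the whole string has the X.Y.Z shape."""
    return TAG_PATTERN.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ["0.1.1", "1.2.1", "v1.0", "1.2"]:
        print(candidate, is_release_tag(candidate))
```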
@@ -0,0 +1,5 @@
1
+ FROM python:3.7-slim
2
+ COPY . /app
3
+ WORKDIR /app
4
+ RUN pip install .
5
+ CMD ["scdataloader"]