nci-cidc-api-modules 1.1.24__tar.gz → 1.2.21__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. nci_cidc_api_modules-1.2.21/MANIFEST.in +2 -0
  2. nci_cidc_api_modules-1.1.24/README.md → nci_cidc_api_modules-1.2.21/PKG-INFO +50 -36
  3. nci_cidc_api_modules-1.1.24/PKG-INFO → nci_cidc_api_modules-1.2.21/README.md +8 -76
  4. nci_cidc_api_modules-1.2.21/cidc_api/config/db.py +58 -0
  5. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/config/logging.py +5 -2
  6. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/config/settings.py +9 -1
  7. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/models/__init__.py +2 -0
  8. nci_cidc_api_modules-1.2.21/cidc_api/models/data.py +15 -0
  9. nci_cidc_api_modules-1.2.21/cidc_api/models/db/base_orm.py +25 -0
  10. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/models/files/details.py +31 -0
  11. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/models/files/facets.py +77 -0
  12. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/models/migrations.py +12 -39
  13. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/models/models.py +564 -54
  14. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/models/schemas.py +1 -0
  15. nci_cidc_api_modules-1.2.21/cidc_api/models/types.py +1439 -0
  16. nci_cidc_api_modules-1.2.21/cidc_api/shared/email_layout.html +258 -0
  17. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/shared/emails.py +33 -3
  18. nci_cidc_api_modules-1.2.21/cidc_api/shared/file_handling.py +141 -0
  19. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/shared/gcloud_client.py +116 -15
  20. nci_cidc_api_modules-1.2.21/cidc_api/shared/utils.py +11 -0
  21. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/nci_cidc_api_modules.egg-info/PKG-INFO +34 -60
  22. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/nci_cidc_api_modules.egg-info/SOURCES.txt +6 -3
  23. nci_cidc_api_modules-1.2.21/nci_cidc_api_modules.egg-info/requires.txt +24 -0
  24. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/pyproject.toml +7 -2
  25. nci_cidc_api_modules-1.2.21/requirements.modules.txt +26 -0
  26. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/setup.py +3 -2
  27. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/tests/test_api.py +22 -8
  28. nci_cidc_api_modules-1.1.24/MANIFEST.in +0 -1
  29. nci_cidc_api_modules-1.1.24/cidc_api/config/db.py +0 -61
  30. nci_cidc_api_modules-1.1.24/cidc_api/csms/__init__.py +0 -1
  31. nci_cidc_api_modules-1.1.24/cidc_api/csms/auth.py +0 -105
  32. nci_cidc_api_modules-1.1.24/cidc_api/models/csms_api.py +0 -872
  33. nci_cidc_api_modules-1.1.24/nci_cidc_api_modules.egg-info/requires.txt +0 -22
  34. nci_cidc_api_modules-1.1.24/requirements.modules.txt +0 -22
  35. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/LICENSE +0 -0
  36. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/config/__init__.py +0 -0
  37. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/config/secrets.py +0 -0
  38. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/models/files/__init__.py +0 -0
  39. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/shared/__init__.py +0 -0
  40. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/shared/auth.py +0 -0
  41. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/shared/jose.py +0 -0
  42. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/cidc_api/shared/rest_utils.py +0 -0
  43. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/nci_cidc_api_modules.egg-info/dependency_links.txt +0 -0
  44. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/nci_cidc_api_modules.egg-info/not-zip-safe +0 -0
  45. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/nci_cidc_api_modules.egg-info/top_level.txt +0 -0
  46. {nci_cidc_api_modules-1.1.24 → nci_cidc_api_modules-1.2.21}/setup.cfg +0 -0
@@ -0,0 +1,2 @@
+ include requirements.modules.txt
+ include cidc_api/shared/email_layout.html
@@ -1,3 +1,45 @@
+ Metadata-Version: 2.4
+ Name: nci_cidc_api_modules
+ Version: 1.2.21
+ Summary: SQLAlchemy data models and configuration tools used in the NCI CIDC API
+ Home-page: https://github.com/NCI-CIDC/cidc-api-gae
+ License: MIT license
+ Requires-Python: >=3.13
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: certifi>=2025.10.5
+ Requires-Dist: cloud-sql-python-connector[pg8000]>=1.18.5
+ Requires-Dist: flask>=3.1.2
+ Requires-Dist: flask-migrate>=4.1.0
+ Requires-Dist: flask-sqlalchemy>=3.1.1
+ Requires-Dist: google-auth==2.41.1
+ Requires-Dist: google-api-python-client>=2.185.0
+ Requires-Dist: google-cloud-bigquery>=3.38.0
+ Requires-Dist: google-cloud-pubsub>=2.32.0
+ Requires-Dist: google-cloud-secret-manager>=2.25.0
+ Requires-Dist: google-cloud-storage>=3.4.1
+ Requires-Dist: jinja2>=3.1.6
+ Requires-Dist: marshmallow>=4.0.1
+ Requires-Dist: marshmallow-sqlalchemy>=1.4.2
+ Requires-Dist: numpy>=2.3.4
+ Requires-Dist: packaging>=25.0
+ Requires-Dist: pandas>=2.3.3
+ Requires-Dist: pyarrow>=22.0.0
+ Requires-Dist: python-dotenv>=1.2.1
+ Requires-Dist: requests>=2.32.5
+ Requires-Dist: sqlalchemy>=2.0.44
+ Requires-Dist: sqlalchemy-mixins~=2.0.5
+ Requires-Dist: werkzeug>=3.1.3
+ Requires-Dist: nci-cidc-schemas==0.28.9
+ Dynamic: description
+ Dynamic: description-content-type
+ Dynamic: home-page
+ Dynamic: license
+ Dynamic: license-file
+ Dynamic: requires-dist
+ Dynamic: requires-python
+ Dynamic: summary
+
  # NCI CIDC API <!-- omit in TOC -->
 
  The next generation of the CIDC API, reworked to use Google Cloud-managed services. This API is built with the Flask REST API framework backed by Google Cloud SQL, running on Google App Engine.
@@ -21,7 +63,7 @@ The next generation of the CIDC API, reworked to use Google Cloud-managed servic
 
  ## Install Python dependencies
 
- Python versions tested include 3.9 and 3.10. The current App Engine is using version 3.9 (see [app.prod.yaml](./app.prod.yaml)). You can use https://github.com/pyenv/pyenv to manage your python versions. Homebrew will also work, but you will have to be specific when you install packages with pip outside of virtual environments. On that note, it is recommended that you install your python dependencies in an isolated environment. For example,
+ Use Python version 3.13.
 
  ```bash
  # make a virtual environment in the current directory called "venv"
@@ -166,52 +208,24 @@ FLASK_APP=cidc_api.app:app flask db upgrade
 
  ### Connecting to a Cloud SQL database instance
 
- Install the [Cloud SQL Proxy](https://cloud.google.com/sql/docs/mysql/quickstart-proxy-test):
-
- ```bash
- sudo curl -o /usr/local/bin/cloud-sql-proxy https://storage.googleapis.com/cloud-sql-connectors/cloud-sql-proxy/v2.15.1/cloud-sql-proxy.darwin.amd64
- sudo chmod +x /usr/local/bin/cloud-sql-proxy
- mkdir ~/.cloudsql
- chmod 770 ~/.cloudsql
- ```
-
- Proxy to the dev Cloud SQL instance:
-
- ```bash
- cloud-sql-proxy --auto-iam-authn --address 127.0.0.1 --port 5432 nih-nci-cimac-cidc-dev2:us-east4:cidc-postgresql-dev2 &
- ```
-
- If you want to run the proxy alongside a postgres instance on localhost listening on 5432, change the port for the proxy to another port instead like 5433.
- If you experience auth errors, make sure your google cloud sdk is authenticated.
+ Make sure you are authenticated to gcloud:
 
  ```bash
  gcloud auth login
  gcloud auth application-default login
  ```
 
- To point an API running on localhost to the remote Postgres database, edit your `.env` file and comment out `POSTGRES_URI` and uncomment all environment variables prefixed with `CLOUD_SQL_`. Change CLOUD_SQL_SOCKET_DIR to contain a reference to your home directory. Restart your local API instance, and it will connect to the staging Cloud SQL instance via the local proxy.
-
- If you wish to connect to the staging Cloud SQL instance via the postgres REPL, download and run the CIDC sql proxy tool (a wrapper for `cloud_sql_proxy`):
-
- ```bash
- # Download the proxy
- curl https://raw.githubusercontent.com/NCI-CIDC/cidc-devops/master/scripts/cidc_sql_proxy.sh -o /usr/local/bin/cidc_sql_proxy
-
- # Prepare the proxy
- chmod +x /usr/local/bin/cidc_sql_proxy
- cidc_sql_proxy install
-
- # Run the proxy
- cidc_sql_proxy staging # or cidc_sql_proxy prod
- ```
+ In your .env file, comment out `POSTGRES_URI` and uncomment
+ `CLOUD_SQL_INSTANCE_NAME`, `CLOUD_SQL_DB_USER`, and `CLOUD_SQL_DB_NAME`. Replace `CLOUD_SQL_DB_USER` with your NIH email.
 
- ### Running database migrations
+ ### Creating/Running database migrations
 
  This project uses [`Flask Migrate`](https://flask-migrate.readthedocs.io/en/latest/) for managing database migrations. To create a new migration and upgrade the database specified in your `.env` config:
 
  ```bash
  export FLASK_APP=cidc_api/app.py
- # Generate the migration script
+ # First, make your changes to the model(s)
+ # Then, let Flask automatically generate the db change. Double-check the migration script!
  flask db migrate -m "<a message describing the changes in this migration>"
  # Apply changes to the database
  flask db upgrade
@@ -383,7 +397,7 @@ API authentication relies on _identity tokens_ generated by Auth0 to verify that
 
  - It is a well-formatted JWT.
  - It has not yet expired.
- - Its cryptographic signature is valid.
+ - Its cryptographic signature is valid.
 
  JWTs are a lot like passports - they convey personal information, they’re issued by a trusted entity, and they expire after a certain time. Moreover, like passports, JWTs **can be stolen** and used to impersonate someone. As such, JWTs should be kept private and treated sort of like short-lived passwords.
 
@@ -1,43 +1,3 @@
- Metadata-Version: 2.4
- Name: nci_cidc_api_modules
- Version: 1.1.24
- Summary: SQLAlchemy data models and configuration tools used in the NCI CIDC API
- Home-page: https://github.com/NCI-CIDC/cidc-api-gae
- License: MIT license
- Requires-Python: >=3.9
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: werkzeug==3.0.6
- Requires-Dist: flask==3.0.3
- Requires-Dist: flask-migrate==3.1.0
- Requires-Dist: flask-sqlalchemy==3.0.2
- Requires-Dist: sqlalchemy==1.4.54
- Requires-Dist: marshmallow==3.19.0
- Requires-Dist: marshmallow-sqlalchemy==0.22.3
- Requires-Dist: google-cloud-storage==2.18.0
- Requires-Dist: google-cloud-secret-manager==2.20.1
- Requires-Dist: google-cloud-pubsub==2.22.0
- Requires-Dist: google-cloud-bigquery==3.18.0
- Requires-Dist: google-api-python-client==2.64.0
- Requires-Dist: google-auth==2.32.0
- Requires-Dist: packaging>=20.0.0
- Requires-Dist: pyarrow==14.0.1
- Requires-Dist: numpy<2,>=1.16.5
- Requires-Dist: pandas==1.5.3
- Requires-Dist: python-dotenv==0.10.3
- Requires-Dist: requests==2.32.3
- Requires-Dist: jinja2==3.1.6
- Requires-Dist: certifi==2024.7.4
- Requires-Dist: nci-cidc-schemas==0.27.16
- Dynamic: description
- Dynamic: description-content-type
- Dynamic: home-page
- Dynamic: license
- Dynamic: license-file
- Dynamic: requires-dist
- Dynamic: requires-python
- Dynamic: summary
-
  # NCI CIDC API <!-- omit in TOC -->
 
  The next generation of the CIDC API, reworked to use Google Cloud-managed services. This API is built with the Flask REST API framework backed by Google Cloud SQL, running on Google App Engine.
@@ -61,7 +21,7 @@ The next generation of the CIDC API, reworked to use Google Cloud-managed servic
 
  ## Install Python dependencies
 
- Python versions tested include 3.9 and 3.10. The current App Engine is using version 3.9 (see [app.prod.yaml](./app.prod.yaml)). You can use https://github.com/pyenv/pyenv to manage your python versions. Homebrew will also work, but you will have to be specific when you install packages with pip outside of virtual environments. On that note, it is recommended that you install your python dependencies in an isolated environment. For example,
+ Use Python version 3.13.
 
  ```bash
  # make a virtual environment in the current directory called "venv"
@@ -206,52 +166,24 @@ FLASK_APP=cidc_api.app:app flask db upgrade
 
  ### Connecting to a Cloud SQL database instance
 
- Install the [Cloud SQL Proxy](https://cloud.google.com/sql/docs/mysql/quickstart-proxy-test):
-
- ```bash
- sudo curl -o /usr/local/bin/cloud-sql-proxy https://storage.googleapis.com/cloud-sql-connectors/cloud-sql-proxy/v2.15.1/cloud-sql-proxy.darwin.amd64
- sudo chmod +x /usr/local/bin/cloud-sql-proxy
- mkdir ~/.cloudsql
- chmod 770 ~/.cloudsql
- ```
-
- Proxy to the dev Cloud SQL instance:
-
- ```bash
- cloud-sql-proxy --auto-iam-authn --address 127.0.0.1 --port 5432 nih-nci-cimac-cidc-dev2:us-east4:cidc-postgresql-dev2 &
- ```
-
- If you want to run the proxy alongside a postgres instance on localhost listening on 5432, change the port for the proxy to another port instead like 5433.
- If you experience auth errors, make sure your google cloud sdk is authenticated.
+ Make sure you are authenticated to gcloud:
 
  ```bash
  gcloud auth login
  gcloud auth application-default login
  ```
 
- To point an API running on localhost to the remote Postgres database, edit your `.env` file and comment out `POSTGRES_URI` and uncomment all environment variables prefixed with `CLOUD_SQL_`. Change CLOUD_SQL_SOCKET_DIR to contain a reference to your home directory. Restart your local API instance, and it will connect to the staging Cloud SQL instance via the local proxy.
-
- If you wish to connect to the staging Cloud SQL instance via the postgres REPL, download and run the CIDC sql proxy tool (a wrapper for `cloud_sql_proxy`):
-
- ```bash
- # Download the proxy
- curl https://raw.githubusercontent.com/NCI-CIDC/cidc-devops/master/scripts/cidc_sql_proxy.sh -o /usr/local/bin/cidc_sql_proxy
-
- # Prepare the proxy
- chmod +x /usr/local/bin/cidc_sql_proxy
- cidc_sql_proxy install
-
- # Run the proxy
- cidc_sql_proxy staging # or cidc_sql_proxy prod
- ```
+ In your .env file, comment out `POSTGRES_URI` and uncomment
+ `CLOUD_SQL_INSTANCE_NAME`, `CLOUD_SQL_DB_USER`, and `CLOUD_SQL_DB_NAME`. Replace `CLOUD_SQL_DB_USER` with your NIH email.
 
- ### Running database migrations
+ ### Creating/Running database migrations
 
  This project uses [`Flask Migrate`](https://flask-migrate.readthedocs.io/en/latest/) for managing database migrations. To create a new migration and upgrade the database specified in your `.env` config:
 
  ```bash
  export FLASK_APP=cidc_api/app.py
- # Generate the migration script
+ # First, make your changes to the model(s)
+ # Then, let Flask automatically generate the db change. Double-check the migration script!
  flask db migrate -m "<a message describing the changes in this migration>"
  # Apply changes to the database
  flask db upgrade
@@ -423,7 +355,7 @@ API authentication relies on _identity tokens_ generated by Auth0 to verify that
 
  - It is a well-formatted JWT.
  - It has not yet expired.
- - Its cryptographic signature is valid.
+ - Its cryptographic signature is valid.
 
  JWTs are a lot like passports - they convey personal information, they’re issued by a trusted entity, and they expire after a certain time. Moreover, like passports, JWTs **can be stolen** and used to impersonate someone. As such, JWTs should be kept private and treated sort of like short-lived passwords.
 
@@ -0,0 +1,58 @@
+ from os import environ
+
+ from flask import Flask
+ from flask_sqlalchemy import SQLAlchemy
+ from flask_migrate import Migrate, upgrade
+ from sqlalchemy.engine.url import URL
+ from sqlalchemy.orm import declarative_base
+ from google.cloud.sql.connector import Connector, IPTypes
+
+ from .secrets import get_secrets_manager
+
+ db = SQLAlchemy()
+ BaseModel = db.Model
+
+ connector = Connector()
+
+
+ def getconn():
+     return connector.connect(
+         environ.get("CLOUD_SQL_INSTANCE_NAME"),
+         "pg8000",
+         user=environ.get("CLOUD_SQL_DB_USER"),
+         password="xxxxx",
+         db=environ.get("CLOUD_SQL_DB_NAME"),
+         enable_iam_auth=True,
+         ip_type=IPTypes.PUBLIC,
+     )
+
+
+ def init_db(app: Flask):
+     """Connect `app` to the database and run migrations"""
+     db.init_app(app)
+     Migrate(app, db, app.config["MIGRATIONS_PATH"])
+     with app.app_context():
+         upgrade(app.config["MIGRATIONS_PATH"])
+
+
+ def get_sqlalchemy_database_uri(testing: bool = False) -> str:
+     """Get the PostgreSQL DB URI from environment variables"""
+
+     db_uri = environ.get("POSTGRES_URI")
+     if testing:
+         # Connect to the test database
+         db_uri = environ.get("TEST_POSTGRES_URI", "fake-conn-string")
+     elif not db_uri:
+         db_uri = f"postgresql+pg8000://{environ.get('CLOUD_SQL_DB_USER')}:xxx@/{environ.get('CLOUD_SQL_DB_NAME')}"
+
+     assert db_uri
+
+     return db_uri
+
+
+ # Use SQLALCHEMY_ENGINE_OPTIONS to connect to the cloud but use uri for local db
+ def cloud_connector(testing: bool = False):
+     if not testing and not environ.get("POSTGRES_URI"):
+         return {"creator": getconn}
+     else:
+         return {}
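The new `cidc_api/config/db.py` above wires Flask-SQLAlchemy to the Cloud SQL Python Connector: when `POSTGRES_URI` is unset (and not testing), `cloud_connector()` hands SQLAlchemy a `creator` callable instead of relying on the URI alone. A minimal sketch of how these pieces fit together, assuming a hypothetical app factory and that the `CLOUD_SQL_*` environment variables are set:

```python
from flask import Flask

from cidc_api.config.db import cloud_connector, db, get_sqlalchemy_database_uri

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = get_sqlalchemy_database_uri()
# With POSTGRES_URI unset, this is {"creator": getconn}: SQLAlchemy invokes
# getconn() for every new DBAPI connection, so pg8000 authenticates via IAM
# and the placeholder password in the URI is never used.
app.config["SQLALCHEMY_ENGINE_OPTIONS"] = cloud_connector()
db.init_app(app)
```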
@@ -2,7 +2,7 @@ import sys
  import logging
  from typing import Optional
 
- from .settings import IS_GUNICORN, ENV
+ from .settings import IS_GUNICORN, ENV, TESTING
 
  # TODO: consider adding custom formatting that automatically adds request context
  # to all logs, like who the requesting user is and what URL they're accessing, e.g.
@@ -19,7 +19,10 @@ def get_logger(name: Optional[str]) -> logging.Logger:
          logger.setLevel(gunicorn_logger.level)
      else:
          handler = logging.StreamHandler(sys.stdout)
-         handler.setFormatter(logging.Formatter("[%(asctime)s] [%(threadName)s] [%(levelname)s]: %(message)s"))
+         formatter = logging.Formatter("[%(asctime)s] [%(threadName)s] [%(levelname)s] [%(name)s]: %(message)s")
+         handler.setFormatter(formatter)
          logger.addHandler(handler)
          logger.setLevel(logging.DEBUG if ENV == "dev" else logging.INFO)
+     if not TESTING:
+         logger.propagate = False
      return logger
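A short sketch of what the logging changes do in practice (the logger name here is illustrative, not from the package):

```python
from cidc_api.config.logging import get_logger

logger = get_logger("cidc_api.example")  # hypothetical module name
# The new formatter includes the logger name in each record, e.g.:
# [2025-01-01 12:00:00,000] [MainThread] [INFO] [cidc_api.example]: connected
logger.info("connected")
# Outside tests, propagate=False stops records from also reaching the root
# logger, which would otherwise print each line twice.
```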
@@ -10,7 +10,7 @@ from os import environ, path, mkdir
 
  from dotenv import load_dotenv
 
- from .db import get_sqlalchemy_database_uri
+ from .db import get_sqlalchemy_database_uri, cloud_connector
  from .secrets import get_secrets_manager
 
  load_dotenv()
@@ -54,6 +54,7 @@ else:
 
  ### Configure Flask-SQLAlchemy ###
  SQLALCHEMY_DATABASE_URI = get_sqlalchemy_database_uri(TESTING)
+ SQLALCHEMY_ENGINE_OPTIONS = cloud_connector(TESTING)
  SQLALCHEMY_TRACK_MODIFICATIONS = False
  SQLALCHEMY_ECHO = False  # Set to True to emit all compiled sql statements
 
@@ -70,6 +71,7 @@ GOOGLE_INTAKE_BUCKET = environ["GOOGLE_INTAKE_BUCKET"]
  GOOGLE_UPLOAD_BUCKET = environ["GOOGLE_UPLOAD_BUCKET"]
  GOOGLE_UPLOAD_TOPIC = environ["GOOGLE_UPLOAD_TOPIC"]
  GOOGLE_ACL_DATA_BUCKET = environ["GOOGLE_ACL_DATA_BUCKET"]
+ GOOGLE_CLINICAL_DATA_BUCKET = environ["GOOGLE_CLINICAL_DATA_BUCKET"]
  GOOGLE_EPHEMERAL_BUCKET = environ["GOOGLE_EPHEMERAL_BUCKET"]
  GOOGLE_UPLOAD_ROLE = environ["GOOGLE_UPLOAD_ROLE"]
  GOOGLE_LISTER_ROLE = environ["GOOGLE_LISTER_ROLE"]
@@ -80,6 +82,8 @@ GOOGLE_PATIENT_SAMPLE_TOPIC = environ["GOOGLE_PATIENT_SAMPLE_TOPIC"]
  GOOGLE_EMAILS_TOPIC = environ["GOOGLE_EMAILS_TOPIC"]
  GOOGLE_ARTIFACT_UPLOAD_TOPIC = environ["GOOGLE_ARTIFACT_UPLOAD_TOPIC"]
  GOOGLE_GRANT_DOWNLOAD_PERMISSIONS_TOPIC = environ["GOOGLE_GRANT_DOWNLOAD_PERMISSIONS_TOPIC"]
+ GOOGLE_HL_CLINICAL_VALIDATION_TOPIC = environ["GOOGLE_HL_CLINICAL_VALIDATION_TOPIC"]
+ GOOGLE_DL_CLINICAL_VALIDATION_TOPIC = environ["GOOGLE_DL_CLINICAL_VALIDATION_TOPIC"]
  GOOGLE_AND_OPERATOR = " && "
  GOOGLE_OR_OPERATOR = " || "
 
@@ -104,5 +108,9 @@ else:
  IS_EMAIL_ON = environ.get("IS_EMAIL_ON")
 
 
+ # notification emails
+ CIDC_CLINICAL_DATA_EMAIL = environ.get("CIDC_CLINICAL_DATA_EMAIL")
+ CIDC_ADMIN_EMAIL = environ.get("CIDC_ADMIN_EMAIL")
+
  # Accumulate all constants defined in this file in a single dictionary
  SETTINGS = {k: v for k, v in globals().items() if k.isupper()}
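The new bucket, topic, and email constants surface through the `SETTINGS` dict like every other UPPERCASE name in `settings.py`. A sketch, assuming the required environment variables are set; how the API consumes `SETTINGS` is not shown in this diff:

```python
from cidc_api.config.settings import SETTINGS

# Any UPPERCASE module-level constant is collected automatically.
assert "GOOGLE_CLINICAL_DATA_BUCKET" in SETTINGS
assert "CIDC_ADMIN_EMAIL" in SETTINGS
# e.g., a Flask app could load the whole dict at once (assumed usage):
# app.config.update(SETTINGS)
```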
@@ -1,3 +1,5 @@
  from .models import *
  from .files import *
  from .schemas import *
+
+ from cidc_api.models.db.base_orm import BaseORM
@@ -0,0 +1,15 @@
+ from cidc_api.models.pydantic.stage2 import all_models
+
+ standard_data_categories = [model.__data_category__ for model in all_models if hasattr(model, "__data_category__")]
+
+
+ # A class to hold the representation of a trial's dataset all at once
+ class Dataset(dict):
+     def __init__(self, *args, **kwargs):
+         super().__init__(*args, **kwargs)
+         for data_category in standard_data_categories:
+             self[data_category] = []
+
+
+ # Maps data categories like "treatment" to their associated pydantic model
+ data_category_to_model = {model.__data_category__: model for model in all_models if hasattr(model, "__data_category__")}
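A sketch of how the new `Dataset` container behaves; `"treatment"` is the category named in the module's own comment:

```python
from cidc_api.models.data import Dataset, data_category_to_model

ds = Dataset()
# Every standard data category is pre-seeded with an empty list, so callers
# can append records without first checking whether the key exists.
assert all(records == [] for records in ds.values())

# Look up the pydantic model that validates records for a given category.
TreatmentModel = data_category_to_model.get("treatment")
```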
@@ -0,0 +1,25 @@
+ from typing import Self
+
+ from sqlalchemy_mixins import SerializeMixin, ReprMixin
+
+ from cidc_api.config.db import db
+
+
+ class BaseORM(db.Model, ReprMixin, SerializeMixin):
+     __abstract__ = True
+     __repr__ = ReprMixin.__repr__
+
+     def merge(self, d: dict) -> Self:
+         """Merge keys and values from dict d into this model, overwriting as necessary."""
+         for key, value in d.items():
+             setattr(self, key, value)
+         return self
+
+     def clone(self) -> "BaseORM":
+         """Clones a SQLAlchemy ORM object, excluding primary keys."""
+         mapper = self.__mapper__
+         new_instance = self.__class__()
+         for column in mapper.columns:
+             if not column.primary_key:
+                 setattr(new_instance, column.key, getattr(self, column.key))
+         return new_instance
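A sketch of `merge()` and `clone()` on a hypothetical table (not part of the package):

```python
from sqlalchemy import Column, Integer, String

from cidc_api.models.db.base_orm import BaseORM


class Widget(BaseORM):  # hypothetical model for illustration only
    __tablename__ = "widgets"
    id = Column(Integer, primary_key=True)
    name = Column(String)


w = Widget(id=1, name="original")
w.merge({"name": "renamed"})  # overwrites attributes in place and returns self
copy = w.clone()              # copies every column except primary keys
assert copy.id is None and copy.name == "renamed"
```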
@@ -993,4 +993,35 @@ details_dict = {
          "",
          "",
      ),
+     # scrna
+     "/scrnaseq/samples_metadata.csv": FileDetails("source", "", ""),
+     "/scrnaseq/read_1.gz": FileDetails("source", "", ""),
+     "/scrnaseq/read_2.gz": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/samples_metadata.csv": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/config.yaml": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/R_package_versions.csv": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/integration.rds": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/integration_heatmap_plots.zip": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/integration_markers.zip": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/integration_split_percent_plots.zip": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/integration_split_umap_plots.zip": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/integration_umap_plots.zip": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/clustering.rds": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/report.html": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/star_sorted_by_cord.bam": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/star_sorted_by_cord.bam.bai": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/log_final.out": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/log.out": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/log_progress.out": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/sj_out.tab": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/barcodes.stats": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_features.stats": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_summary.csv": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_umi_per_cell_sorted.txt": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_filtered_features.tsv": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_filtered_barcodes.tsv": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_filtered_matrix.mtx": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_raw_features.tsv": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_raw_barcodes.tsv": FileDetails("source", "", ""),
+     "/scrnaseq_analysis/gene_raw_matrix.mtx": FileDetails("source", "", ""),
  }
@@ -346,6 +346,36 @@ assay_facets: Facets = {
              "H and E file from MIBI analysis",
          ),
      },
+     "scRNA": {
+         "Samples Metadata": FacetConfig(["/scrnaseq/samples_metadata.csv"], "Sample metadata for scRNA run"),
+         "Read 1 gz": FacetConfig(["/scrnaseq/read_1.gz"], "Gz file for read 1"),
+         "Read 2 gz": FacetConfig(["/scrnaseq/read_2.gz"], "Gz file for read 2"),
+     },
+     "Visium": {
+         "Samples Metadata": FacetConfig(["/visium/samples_metadata.csv"], "Sample metadata for visium run"),
+         "Read 1 fastq gz": FacetConfig(["/visium/R1_001.fastq.gz"], "Gz file for read 1"),
+         "Read 2 fastq gz": FacetConfig(["/visium/R2_001.fastq.gz"], "Gz file for read 2"),
+         "loupe alignment file": FacetConfig(["/visium/loupe_alignment_file.json"]),
+         "brightfield image": FacetConfig(["/visium/brightfield.tiff"]),
+         "dark image": FacetConfig(["/visium/dark_image.tiff"]),
+         "colorized image": FacetConfig(["/visium/colorized.tiff"]),
+         "cytassist image": FacetConfig(["/visium/cytassist.tiff"]),
+     },
+     "NULISA": {
+         "Metadata": FacetConfig(["/nulisa/metadata.csv"], "Metadata for NULISA run"),
+         "NPQ File": FacetConfig(["/nulisa/npq_file.csv"], "NPQ file for NULISA run"),
+         "Raw Counts File": FacetConfig(["/nulisa/raw_counts_file.csv"], "Raw counts file for NULISA run"),
+     },
+     "MALDI Glycan": {
+         "Metadata": FacetConfig(["/maldi_glycan/metadata.tsv"], "Metadata for a MALDI Glycan run"),
+         "Molecular Assignments File": FacetConfig(
+             ["/maldi_glycan/molecular_assignments.tsv"], "Molecular Assignments file for a MALDI Glycan run"
+         ),
+         "IBD File": FacetConfig(["/maldi_glycan/ibd_file.ibd"], "IBD file for MALDI Glycan run"),
+         "IMZML File": FacetConfig(["/maldi_glycan/imzml_file.imzml"], "IMZML file for MALDI Glycan run"),
+         "Channels": FacetConfig(["/maldi_glycan/channels.csv"], "Channels csv file for MALDI Glycan run"),
+         "Tiff Zip": FacetConfig(["/maldi_glycan/tiff.zip"], "Tiff zip for MALDI Glycan run"),
+     },
      "mIHC": {
          "Samples Report": FacetConfig(["/mihc/sample_report.csv"], "Samples report for mIHC run"),
          "Multitiffs": FacetConfig(["/mihc/multitiffs.tar.gz"], "Multi Tiffs file from mIHC run"),
@@ -409,6 +439,11 @@
              "Analysis files for all samples run on the Olink platform in the trial.",
          ),
      },
+     "Olink HT": {
+         "Batch-Level Combined File": FacetConfig(["/olink_ht/batch_level_combined_file.xlsx"]),
+         "Study-Level Combined File": FacetConfig(["/olink_ht/study_level_combined_file.xlsx"]),
+         "Npx Run File": FacetConfig(["/olink_ht/npx_run_file.xlsx"]),
+     },
      "IHC": {
          "Images": FacetConfig(["/ihc/ihc_image."]),
          "Combined Markers": FacetConfig(["csv|ihc marker combined"]),
@@ -549,6 +584,48 @@ analysis_ready_facets = {
      "WES Analysis": FacetConfig(["/wes/analysis/report.tar.gz"]),
      "TCR": FacetConfig(["/tcr_analysis/report_trial.tar.gz"]),
      "mIF": FacetConfig(["/mif/roi_/cell_seg_data.txt"]),
+     "scRNA": FacetConfig(
+         [
+             "/scrnaseq_analysis/samples_metadata.csv",
+             "/scrnaseq_analysis/config.yaml",
+             "/scrnaseq_analysis/R_package_versions.csv",
+             "/scrnaseq_analysis/integration.rds",
+             "/scrnaseq_analysis/integration_heatmap_plots.zip",
+             "/scrnaseq_analysis/integration_markers.zip",
+             "/scrnaseq_analysis/integration_split_percent_plots.zip",
+             "/scrnaseq_analysis/integration_split_umap_plots.zip",
+             "/scrnaseq_analysis/integration_umap_plots.zip",
+             "/scrnaseq_analysis/clustering.rds",
+             "/scrnaseq_analysis/report.html",
+             "/scrnaseq_analysis/star_sorted_by_cord.bam",
+             "/scrnaseq_analysis/star_sorted_by_cord.bam.bai",
+             "/scrnaseq_analysis/log_final.out",
+             "/scrnaseq_analysis/log.out",
+             "/scrnaseq_analysis/log_progress.out",
+             "/scrnaseq_analysis/sj_out.tab",
+             "/scrnaseq_analysis/barcodes.stats",
+             "/scrnaseq_analysis/gene_features.stats",
+             "/scrnaseq_analysis/gene_summary.csv",
+             "/scrnaseq_analysis/gene_umi_per_cell_sorted.txt",
+             "/scrnaseq_analysis/gene_filtered_features.tsv",
+             "/scrnaseq_analysis/gene_filtered_barcodes.tsv",
+             "/scrnaseq_analysis/gene_filtered_matrix.mtx",
+             "/scrnaseq_analysis/gene_raw_features.tsv",
+             "/scrnaseq_analysis/gene_raw_barcodes.tsv",
+             "/scrnaseq_analysis/gene_raw_matrix.mtx",
+         ]
+     ),
+     "Visium": FacetConfig(
+         [
+             "/visium_analysis/samples_metadata.csv",
+             "/visium_analysis/config.yaml",
+             "/visium_analysis/R_package_versions.csv",
+             "/visium_analysis/merged.rds",
+             "/visium_analysis/spatial_variable_features.rds",
+             "/visium_analysis/report.html",
+             "/visium_analysis/visium_spaceranger_output.zip",
+         ]
+     ),
  }
 
  facets_dict: Dict[str, Facets] = {
@@ -91,15 +91,11 @@ def migration_session():
          session.close()
 
 
- def run_metadata_migration(
-     metadata_migration: Callable[[dict], MigrationResult], use_upload_jobs_table: bool
- ):
+ def run_metadata_migration(metadata_migration: Callable[[dict], MigrationResult], use_upload_jobs_table: bool):
      """Migrate trial metadata, upload job patches, and downloadable files according to `metadata_migration`"""
      with migration_session() as (session, task_queue):
          try:
-             _run_metadata_migration(
-                 metadata_migration, use_upload_jobs_table, task_queue, session
-             )
+             _run_metadata_migration(metadata_migration, use_upload_jobs_table, task_queue, session)
          except:
              traceback.print_exc()
              raise
@@ -122,9 +118,7 @@ class ManifestUploads(CommonColumns):
      __tablename__ = "manifest_uploads"
 
 
- def _select_successful_assay_uploads(
-     use_upload_jobs_table: bool, session: Session
- ) -> List[UploadJobs]:
+ def _select_successful_assay_uploads(use_upload_jobs_table: bool, session: Session) -> List[UploadJobs]:
      if use_upload_jobs_table:
          return (
              session.query(UploadJobs)
@@ -133,21 +127,12 @@ def _select_successful_assay_uploads(
              .all()
          )
 
-     return (
-         session.query(AssayUploads)
-         .filter_by(status=UploadJobStatus.MERGE_COMPLETED.value)
-         .with_for_update()
-         .all()
-     )
+     return session.query(AssayUploads).filter_by(status=UploadJobStatus.MERGE_COMPLETED.value).with_for_update().all()
 
 
- def _select_manifest_uploads(
-     use_upload_jobs_table: bool, session: Session
- ) -> List[UploadJobs]:
+ def _select_manifest_uploads(use_upload_jobs_table: bool, session: Session) -> List[UploadJobs]:
      if use_upload_jobs_table:
-         return (
-             session.query(UploadJobs).filter_by(multifile=False).with_for_update().all()
-         )
+         return session.query(UploadJobs).filter_by(multifile=False).with_for_update().all()
 
      return session.query(ManifestUploads).with_for_update().all()
 
@@ -188,21 +173,15 @@ def _run_metadata_migration(
 
          # Regenerate additional metadata from the migrated clinical trial
          # metadata object.
-         print(
-             f"Regenerating additional metadata for artifact with uuid {artifact['upload_placeholder']}"
-         )
+         print(f"Regenerating additional metadata for artifact with uuid {artifact['upload_placeholder']}")
          artifact_path = uuid_path_map[artifact["upload_placeholder"]]
-         df.additional_metadata = get_source(
-             migration.result, artifact_path, skip_last=True
-         )[1]
+         df.additional_metadata = get_source(migration.result, artifact_path, skip_last=True)[1]
 
          # If the GCS URI has changed, rename the blob
          # makes call to bucket.rename_blob
          new_gcs_uri = artifact["object_url"]
          if old_gcs_uri != new_gcs_uri:
-             print(
-                 f"Encountered GCS data bucket artifact URI to update: {old_gcs_uri}"
-             )
+             print(f"Encountered GCS data bucket artifact URI to update: {old_gcs_uri}")
              renamer = PieceOfWork(
                  partial(
                      rename_gcs_blob,
@@ -220,9 +199,7 @@
              gcs_tasks.schedule(renamer)
 
      # Migrate all assay upload successes
-     successful_assay_uploads = _select_successful_assay_uploads(
-         use_upload_jobs_table, session
-     )
+     successful_assay_uploads = _select_successful_assay_uploads(use_upload_jobs_table, session)
      for upload in successful_assay_uploads:
          print(f"Running metadata migration for assay upload: {upload.id}")
          if use_upload_jobs_table:
@@ -248,9 +225,7 @@
          if old_target_uri in migration.file_updates:
              new_target_uri = migration.file_updates[old_target_uri]["object_url"]
              if old_target_uri != new_target_uri:
-                 print(
-                     f"Encountered GCS upload bucket artifact URI to update: {old_upload_uri}"
-                 )
+                 print(f"Encountered GCS upload bucket artifact URI to update: {old_upload_uri}")
                  new_upload_uri = "/".join([new_target_uri, upload_timestamp])
                  renamer = PieceOfWork(
                      partial(
@@ -325,7 +300,5 @@ def republish_artifact_uploads():
      with migration_session() as (session, _):
          files = session.query(DownloadableFiles).all()
          for f in files:
-             print(
-                 f"Publishing to 'artifact_upload' topic for downloadable file with in bucket url {f.object_url}"
-             )
+             print(f"Publishing to 'artifact_upload' topic for downloadable file with in bucket url {f.object_url}")
              publish_artifact_upload(f.object_url)