CytoTable 0.0.3.tar.gz → 0.0.4.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: CytoTable
-Version: 0.0.3
+Version: 0.0.4
 Summary: Transform CellProfiler and DeepProfiler data for processing image-based profiling readouts with Pycytominer and other Cytomining tools.
 Home-page: https://github.com/cytomining/CytoTable
 License: BSD-3-Clause License
@@ -13,10 +13,11 @@ Classifier: Programming Language :: Python :: 3.8
 Classifier: Programming Language :: Python :: 3.9
 Classifier: Programming Language :: Python :: 3.10
 Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
 Requires-Dist: cloudpathlib[all] (>=0.15.0,<0.16.0)
 Requires-Dist: duckdb (>=0.8.0)
 Requires-Dist: parsl (>=2023.9.25)
-Requires-Dist: pyarrow (>=13.0.0,<14.0.0)
+Requires-Dist: pyarrow (>=13.0.0)
 Project-URL: Documentation, https://cytomining.github.io/CytoTable/
 Project-URL: Repository, https://github.com/cytomining/CytoTable
 Description-Content-Type: text/markdown
@@ -30,10 +31,17 @@ _Diagram showing data flow relative to this project._
 
 ## Summary
 
-CytoTable enables single-cell morphology data analysis by cleaning and transforming CellProfiler (`.csv` or `.sqlite`), cytominer-database (`.sqlite`), and DeepProfiler (`.npz`) output data at scale.
+CytoTable enables single-cell morphology data analysis by cleaning and transforming CellProfiler (`.csv` or `.sqlite`), cytominer-database (`.sqlite`), and DeepProfiler (`.npz`), and other sources such as IN Carta data output data at scale.
 CytoTable creates parquet files for both independent analysis and for input into [Pycytominer](https://github.com/cytomining/pycytominer).
 The Parquet files will have a unified and documented data model, including referenceable schema where appropriate (for validation within Pycytominer or other projects).
 
+The name for the project is inspired from:
+
+- __Cyto__: "1. (biology) cell." ([Wiktionary: Cyto-](https://en.wiktionary.org/wiki/cyto-))
+- __Table__:
+  - "1. Furniture with a top surface to accommodate a variety of uses."
+  - "3.1. A matrix or grid of data arranged in rows and columns." <br> ([Wiktionary: Table](https://en.wiktionary.org/wiki/table))
+
 ## Installation
 
 Install CytoTable from [PyPI](https://pypi.org/) or from source:
@@ -1,6 +1,10 @@
 """
 __init__.py for cytotable
 """
+
+# note: version data is maintained by poetry-dynamic-versioning (do not edit)
+__version__ = "0.0.4"
+
 from .convert import convert
 from .exceptions import (
     CytoTableException,
@@ -0,0 +1,74 @@
+"""
+CytoTable: constants - storing various constants to be used throughout cytotable.
+"""
+
+import multiprocessing
+import os
+from typing import cast
+
+from cytotable.utils import _get_cytotable_version
+
+# read max threads from environment if necessary
+# max threads will be used with default Parsl config and Duckdb
+MAX_THREADS = (
+    multiprocessing.cpu_count()
+    if "CYTOTABLE_MAX_THREADS" not in os.environ
+    else int(cast(int, os.environ.get("CYTOTABLE_MAX_THREADS")))
+)
+
+# enables overriding default memory mapping behavior with pyarrow memory mapping
+CYTOTABLE_ARROW_USE_MEMORY_MAPPING = (
+    os.environ.get("CYTOTABLE_ARROW_USE_MEMORY_MAPPING", "1") == "1"
+)
+
+DDB_DATA_TYPE_SYNONYMS = {
+    "real": ["float32", "float4", "float"],
+    "double": ["float64", "float8", "numeric", "decimal"],
+    "integer": ["int32", "int4", "int", "signed"],
+    "bigint": ["int64", "int8", "long"],
+}
+
+# A reference dictionary for SQLite affinity and storage class types
+# See more here: https://www.sqlite.org/datatype3.html#affinity_name_examples
+SQLITE_AFFINITY_DATA_TYPE_SYNONYMS = {
+    "integer": [
+        "int",
+        "integer",
+        "tinyint",
+        "smallint",
+        "mediumint",
+        "bigint",
+        "unsigned big int",
+        "int2",
+        "int8",
+    ],
+    "text": [
+        "character",
+        "varchar",
+        "varying character",
+        "nchar",
+        "native character",
+        "nvarchar",
+        "text",
+        "clob",
+    ],
+    "blob": ["blob"],
+    "real": [
+        "real",
+        "double",
+        "double precision",
+        "float",
+    ],
+    "numeric": [
+        "numeric",
+        "decimal",
+        "boolean",
+        "date",
+        "datetime",
+    ],
+}
+
+CYTOTABLE_DEFAULT_PARQUET_METADATA = {
+    "data-producer": "https://github.com/cytomining/CytoTable",
+    "data-producer-version": str(_get_cytotable_version()),
+}
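
Both environment toggles in the new constants module are evaluated at import time, so they only take effect when set before `cytotable` is imported. A minimal sketch (the values here are illustrative, not defaults):

```python
import os

# assumption: set these before importing cytotable, since
# cytotable.constants evaluates them at import time
os.environ["CYTOTABLE_MAX_THREADS"] = "4"  # cap threads for DuckDB + Parsl defaults
os.environ["CYTOTABLE_ARROW_USE_MEMORY_MAPPING"] = "0"  # disable Arrow memory mapping

from cytotable.constants import CYTOTABLE_ARROW_USE_MEMORY_MAPPING, MAX_THREADS

print(MAX_THREADS)  # 4
print(CYTOTABLE_ARROW_USE_MEMORY_MAPPING)  # False
```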
@@ -75,7 +75,9 @@ def _get_table_columns_and_types(source: Dict[str, Any]) -> List[Dict[str, str]]
             segment_type as column_dtype
         FROM pragma_storage_info('column_details')
         /* avoid duplicate entries in the form of VALIDITY segment_types */
-        WHERE segment_type != 'VALIDITY';
+        WHERE segment_type != 'VALIDITY'
+        /* explicitly order the columns by their id to avoid inconsistent results */
+        ORDER BY column_id ASC;
     """
 
     # attempt to read the data to parquet from duckdb
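
`pragma_storage_info` reports one row per storage segment, and DuckDB does not guarantee row order, which is what the added `ORDER BY column_id ASC` pins down. A standalone sketch of the same query shape (the table here is illustrative):

```python
import duckdb

with duckdb.connect() as ddb:
    # stand-in for the table whose columns CytoTable inspects
    ddb.execute("CREATE TABLE column_details AS SELECT 1 AS col_a, 'x' AS col_b;")
    rows = ddb.execute(
        """
        SELECT DISTINCT column_id, column_name, segment_type
        FROM pragma_storage_info('column_details')
        WHERE segment_type != 'VALIDITY'
        ORDER BY column_id ASC;
        """
    ).fetchall()
    # columns now come back in declaration order: col_a (id 0), then col_b (id 1)
    print(rows)
```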
@@ -302,7 +304,11 @@ def _source_chunk_to_parquet(
     from cloudpathlib import AnyPath
     from pyarrow import parquet
 
-    from cytotable.utils import _duckdb_reader, _sqlite_mixed_type_query_to_parquet
+    from cytotable.utils import (
+        _duckdb_reader,
+        _sqlite_mixed_type_query_to_parquet,
+        _write_parquet_table_with_metadata,
+    )
 
     # attempt to build dest_path
     source_dest_path = (
@@ -315,7 +321,7 @@ def _source_chunk_to_parquet(
     select_columns = ",".join(
         [
             # here we cast the column to the specified type ensure the colname remains the same
-            f"CAST({column['column_name']} AS {column['column_dtype']}) AS {column['column_name']}"
+            f"CAST(\"{column['column_name']}\" AS {column['column_dtype']}) AS \"{column['column_name']}\""
            for column in source["columns"]
        ]
    )
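
Wrapping the column name in double quotes lets identifiers that are not bare words, such as the IN Carta preset's `OBJECT ID` or `WELL LABEL`, pass through the CAST unchanged. A small illustration:

```python
import duckdb

with duckdb.connect() as ddb:
    ddb.execute('CREATE TABLE t AS SELECT 1 AS "OBJECT ID";')
    # an unquoted identifier containing a space would be a parse error;
    # double quotes preserve the exact column name through the cast
    print(ddb.execute('SELECT CAST("OBJECT ID" AS BIGINT) AS "OBJECT ID" FROM t;').fetchall())
```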
@@ -339,7 +345,7 @@ def _source_chunk_to_parquet(
         # read data with chunk size + offset
         # and export to parquet
         with _duckdb_reader() as ddb_reader:
-            parquet.write_table(
+            _write_parquet_table_with_metadata(
                table=ddb_reader.execute(
                    f"""
                    {base_query}
@@ -358,7 +364,7 @@ def _source_chunk_to_parquet(
             "Mismatch Type Error" in str(e)
             and str(AnyPath(source["source_path"]).suffix).lower() == ".sqlite"
         ):
-            parquet.write_table(
+            _write_parquet_table_with_metadata(
                # here we use sqlite instead of duckdb to extract
                # data for special cases where column and value types
                # may not align (which is valid functionality in SQLite).
@@ -410,14 +416,28 @@ def _prepend_column_name(
         Path to the modified file.
     """
 
+    import logging
     import pathlib
 
     import pyarrow.parquet as parquet
 
-    from cytotable.utils import CYTOTABLE_ARROW_USE_MEMORY_MAPPING
+    from cytotable.constants import CYTOTABLE_ARROW_USE_MEMORY_MAPPING
+    from cytotable.utils import _write_parquet_table_with_metadata
+
+    logger = logging.getLogger(__name__)
 
     targets = tuple(metadata) + tuple(compartments)
 
+    # if we have no targets or metadata to work from, return the table unchanged
+    if len(targets) == 0:
+        logger.warning(
+            msg=(
+                "Skipping column name prepend operations "
+                "because no compartments or metadata were provided."
+            )
+        )
+        return table_path
+
     table = parquet.read_table(
         source=table_path, memory_map=CYTOTABLE_ARROW_USE_MEMORY_MAPPING
     )
@@ -499,7 +519,7 @@ def _prepend_column_name(
     updated_column_names.append(column_name)
 
     # perform table column name updates
-    parquet.write_table(
+    _write_parquet_table_with_metadata(
        table=table.rename_columns(updated_column_names), where=table_path
    )
 
@@ -564,13 +584,18 @@ def _concat_source_group(
         Updated dictionary containing concatenated sources.
     """
 
+    import errno
     import pathlib
 
     import pyarrow as pa
     import pyarrow.parquet as parquet
 
+    from cytotable.constants import (
+        CYTOTABLE_ARROW_USE_MEMORY_MAPPING,
+        CYTOTABLE_DEFAULT_PARQUET_METADATA,
+    )
     from cytotable.exceptions import SchemaException
-    from cytotable.utils import CYTOTABLE_ARROW_USE_MEMORY_MAPPING
+    from cytotable.utils import _write_parquet_table_with_metadata
 
     # build a result placeholder
     concatted: List[Dict[str, Any]] = [
@@ -600,7 +625,9 @@ def _concat_source_group(
     destination_path.parent.mkdir(parents=True, exist_ok=True)
 
     # build the schema for concatenation writer
-    writer_schema = pa.schema(common_schema)
+    writer_schema = pa.schema(common_schema).with_metadata(
+        CYTOTABLE_DEFAULT_PARQUET_METADATA
+    )
 
     # build a parquet file writer which will be used to append files
     # as a single concatted parquet file, referencing the first file's schema
@@ -638,7 +665,7 @@ def _concat_source_group(
         pathlib.Path(pathlib.Path(source["table"][0]).parent).rmdir()
     except OSError as os_err:
         # raise only if we don't have a dir not empty errno
-        if os_err.errno != 66:
+        if os_err.errno != errno.ENOTEMPTY:
             raise
 
     # return the concatted parquet filename
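
Replacing the magic number is a portability fix: `ENOTEMPTY` is 39 on Linux but 66 on macOS and the BSDs, so the literal `66` only matched the error on some platforms:

```python
import errno

# the numeric value of ENOTEMPTY varies by platform (39 on Linux, 66 on macOS/BSD),
# so comparing against the symbolic constant works everywhere
print(errno.ENOTEMPTY, errno.errorcode[errno.ENOTEMPTY])
```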
@@ -713,7 +740,7 @@ def _join_source_chunk(
 
     import pyarrow.parquet as parquet
 
-    from cytotable.utils import _duckdb_reader
+    from cytotable.utils import _duckdb_reader, _write_parquet_table_with_metadata
 
     # Attempt to read the data to parquet file
     # using duckdb for extraction and pyarrow for
@@ -757,7 +784,7 @@ def _join_source_chunk(
     )
 
     # write the result
-    parquet.write_table(
+    _write_parquet_table_with_metadata(
        table=result,
        where=result_file_path,
    )
@@ -797,7 +824,11 @@ def _concat_join_sources(
 
     import pyarrow.parquet as parquet
 
-    from cytotable.utils import CYTOTABLE_ARROW_USE_MEMORY_MAPPING
+    from cytotable.constants import (
+        CYTOTABLE_ARROW_USE_MEMORY_MAPPING,
+        CYTOTABLE_DEFAULT_PARQUET_METADATA,
+    )
+    from cytotable.utils import _write_parquet_table_with_metadata
 
     # remove the unjoined concatted compartments to prepare final dest_path usage
     # (we now have joined results)
@@ -811,7 +842,7 @@ def _concat_join_sources(
         shutil.rmtree(path=dest_path)
 
     # write the concatted result as a parquet file
-    parquet.write_table(
+    _write_parquet_table_with_metadata(
        table=pa.concat_tables(
            tables=[
                parquet.read_table(
@@ -826,7 +857,9 @@ def _concat_join_sources(
     # build a parquet file writer which will be used to append files
     # as a single concatted parquet file, referencing the first file's schema
     # (all must be the same schema)
-    writer_schema = parquet.read_schema(join_sources[0])
+    writer_schema = parquet.read_schema(join_sources[0]).with_metadata(
+        CYTOTABLE_DEFAULT_PARQUET_METADATA
+    )
     with parquet.ParquetWriter(str(dest_path), writer_schema) as writer:
        for table_path in join_sources:
            writer.write_table(
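
`with_metadata` returns a copy of the schema with its key/value metadata replaced (Arrow schemas are immutable), so every file appended through the `ParquetWriter` carries the producer tags. A minimal sketch:

```python
import pyarrow as pa

schema = pa.schema([pa.field("x", pa.int64())])
tagged = schema.with_metadata(
    {"data-producer": "https://github.com/cytomining/CytoTable"}
)

print(schema.metadata)  # None: the original schema is untouched
print(tagged.metadata)  # {b'data-producer': b'https://github.com/cytomining/CytoTable'}
```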
@@ -1,5 +1,5 @@
 """
-Presets for common pycytominer-transform configurations.
+Presets for common CytoTable configurations.
 """
 
 config = {
@@ -204,7 +204,35 @@ config = {
         AND nuclei.Nuclei_ObjectNumber = cytoplasm.Metadata_Cytoplasm_Parent_Nuclei
         """,
     },
+    "in-carta": {
+        # version specifications using related references
+        "CONFIG_SOURCE_VERSION": {
+            "in-carta": "v1.17.0412545",
+        },
+        # names of source table compartments (for ex. cells.csv, etc.)
+        "CONFIG_NAMES_COMPARTMENTS": tuple(),
+        # names of source table metadata (for ex. image.csv, etc.)
+        "CONFIG_NAMES_METADATA": tuple(),
+        # column names in any compartment or metadata tables which contain
+        # unique names to avoid renaming
+        "CONFIG_IDENTIFYING_COLUMNS": (
+            "OBJECT ID",
+            "Row",
+            "Column",
+            "FOV",
+            "WELL LABEL",
+            "Z",
+            "T",
+        ),
+        # chunk size to use for join operations to help with possible performance issues
+        # note: this number is an estimate and may need changes contingent on data
+        # and system used by this library.
+        "CONFIG_CHUNK_SIZE": 1000,
+        # compartment and metadata joins performed using DuckDB SQL
+        # and modified at runtime as needed
+        "CONFIG_JOINS": "",
+    },
 }
 """
-Configuration presets for pycytominer-transform
+Configuration presets for CytoTable
 """
@@ -47,6 +47,7 @@ def _build_path(
 def _get_source_filepaths(
     path: Union[pathlib.Path, AnyPath],
     targets: List[str],
+    source_datatype: Optional[str] = None,
 ) -> Dict[str, List[Dict[str, Any]]]:
     """
     Gather dataset of filepaths from a provided directory path.
@@ -56,19 +57,27 @@ def _get_source_filepaths(
         Either a directory path to seek filepaths within or a path directly to a file.
     targets: List[str]:
         Compartment and metadata names to seek within the provided path.
+    source_datatype: Optional[str]: (Default value = None)
+        The source datatype (extension) to use for reading the tables.
 
     Returns:
         Dict[str, List[Dict[str, Any]]]
             Data structure which groups related files based on the compartments.
     """
 
+    import os
     import pathlib
 
     from cloudpathlib import AnyPath
 
-    from cytotable.exceptions import NoInputDataException
+    from cytotable.exceptions import DatatypeException, NoInputDataException
     from cytotable.utils import _cache_cloudpath_to_local, _duckdb_reader
 
+    if (targets is None or targets == []) and source_datatype is None:
+        raise DatatypeException(
+            f"A source_datatype must be specified when using undefined compartments and metadata names."
+        )
+
     # gathers files from provided path using compartments + metadata as a filter
     sources = [
        # build source_paths for all files
@@ -85,6 +94,7 @@ def _get_source_filepaths(
             # ensure the subpaths meet certain specifications
             if (
                 targets is None
+                or targets == []
                 # checks for name of the file from targets (compartment + metadata names)
                 or str(subpath.stem).lower() in [target.lower() for target in targets]
                 # checks for sqlite extension (which may include compartment + metadata names)
@@ -134,21 +144,38 @@ def _get_source_filepaths(
 
     # group files together by similar filename for later data operations
     grouped_sources = {}
-    for unique_source in set(source["source_path"].name for source in sources):
-        grouped_sources[unique_source.capitalize()] = [
-            # case for files besides sqlite
-            source if source["source_path"].suffix.lower() != ".sqlite"
-            # if we have sqlite entries, update the source_path to the parent
-            # (the parent table database file) as grouped key name will now
-            # encapsulate the table name details.
-            else {
-                "source_path": source["source_path"].parent,
-                "table_name": source["table_name"],
-            }
-            for source in sources
-            # focus only on entries which include the unique_source name
-            if source["source_path"].name == unique_source
-        ]
+
+    # if we have no targets, create a single group inferred from a common prefix and suffix
+    # note: this may apply for scenarios where no compartments or metadata are
+    # provided as input to CytoTable operations.
+    if targets is None or targets == []:
+        # gather a common prefix to use for the group
+        common_prefix = os.path.commonprefix(
+            [
+                source["source_path"].stem
+                for source in sources
+                if source["source_path"].suffix == f".{source_datatype}"
+            ]
+        )
+        grouped_sources[f"{common_prefix}.{source_datatype}"] = sources
+
+    # otherwise, use the unique names in the paths to determine source grouping
+    else:
+        for unique_source in set(source["source_path"].name for source in sources):
+            grouped_sources[unique_source.capitalize()] = [
+                # case for files besides sqlite
+                source if source["source_path"].suffix.lower() != ".sqlite"
+                # if we have sqlite entries, update the source_path to the parent
+                # (the parent table database file) as grouped key name will now
+                # encapsulate the table name details.
+                else {
+                    "source_path": source["source_path"].parent,
+                    "table_name": source["table_name"],
+                }
+                for source in sources
+                # focus only on entries which include the unique_source name
+                if source["source_path"].name == unique_source
+            ]
 
     return grouped_sources
 
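`os.path.commonprefix` compares strings character by character rather than by path component, which is why it can run directly on file stems to name the single group:

```python
import os

stems = ["experiment_plate1", "experiment_plate2", "experiment_plate3"]
# plain string-prefix comparison, not path-aware
print(os.path.commonprefix(stems))  # 'experiment_plate'
# the group key would then be 'experiment_plate.csv' for source_datatype='csv'
```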
@@ -190,7 +217,7 @@ def _infer_source_datatype(
         raise DatatypeException(
             (
                 f"Unable to find source datatype {source_datatype} "
-                "within files. Detected datatypes: {suffixes}"
+                f"within files. Detected datatypes: {suffixes}"
             )
         )
 
@@ -270,7 +297,9 @@ def _gather_sources(
     source_path = _build_path(path=source_path, **kwargs)
 
     # gather filepaths which will be used as the basis for this work
-    sources = _get_source_filepaths(path=source_path, targets=targets)
+    sources = _get_source_filepaths(
+        path=source_path, targets=targets, source_datatype=source_datatype
+    )
 
     # infer or validate the source datatype based on source filepaths
     source_datatype = _infer_source_datatype(
@@ -3,13 +3,13 @@ Utility functions for CytoTable
 """
 
 import logging
-import multiprocessing
 import os
 import pathlib
-from typing import Any, Dict, Union, cast
+from typing import Any, Dict, Optional, Union, cast
 
 import duckdb
 import parsl
+import pyarrow as pa
 from cloudpathlib import AnyPath, CloudPath
 from cloudpathlib.exceptions import InvalidPrefixError
 from parsl.app.app import AppBase
@@ -19,67 +19,6 @@ from parsl.executors import HighThroughputExecutor
 
 logger = logging.getLogger(__name__)
 
-# read max threads from environment if necessary
-# max threads will be used with default Parsl config and Duckdb
-MAX_THREADS = (
-    multiprocessing.cpu_count()
-    if "CYTOTABLE_MAX_THREADS" not in os.environ
-    else int(cast(int, os.environ.get("CYTOTABLE_MAX_THREADS")))
-)
-
-# enables overriding default memory mapping behavior with pyarrow memory mapping
-CYTOTABLE_ARROW_USE_MEMORY_MAPPING = (
-    os.environ.get("CYTOTABLE_ARROW_USE_MEMORY_MAPPING", "1") == "1"
-)
-
-DDB_DATA_TYPE_SYNONYMS = {
-    "real": ["float32", "float4", "float"],
-    "double": ["float64", "float8", "numeric", "decimal"],
-    "integer": ["int32", "int4", "int", "signed"],
-    "bigint": ["int64", "int8", "long"],
-}
-
-# A reference dictionary for SQLite affinity and storage class types
-# See more here: https://www.sqlite.org/datatype3.html#affinity_name_examples
-SQLITE_AFFINITY_DATA_TYPE_SYNONYMS = {
-    "integer": [
-        "int",
-        "integer",
-        "tinyint",
-        "smallint",
-        "mediumint",
-        "bigint",
-        "unsigned big int",
-        "int2",
-        "int8",
-    ],
-    "text": [
-        "character",
-        "varchar",
-        "varying character",
-        "nchar",
-        "native character",
-        "nvarchar",
-        "text",
-        "clob",
-    ],
-    "blob": ["blob"],
-    "real": [
-        "real",
-        "double",
-        "double precision",
-        "float",
-    ],
-    "numeric": [
-        "numeric",
-        "decimal",
-        "boolean",
-        "date",
-        "datetime",
-    ],
-}
-
-
 # reference the original init
 original_init = AppBase.__init__
 
@@ -198,6 +137,10 @@ def _duckdb_reader() -> duckdb.DuckDBPyConnection:
         duckdb.DuckDBPyConnection
     """
 
+    import duckdb
+
+    from cytotable.constants import MAX_THREADS
+
     return duckdb.connect().execute(
         # note: we use an f-string here to
         # dynamically configure threads as appropriate
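
The thread count imported here feeds DuckDB's connection settings; a minimal sketch of the pattern the comment describes, assuming a plain `SET threads` statement rather than CytoTable's exact setup string:

```python
import duckdb

MAX_THREADS = 4  # stand-in for cytotable.constants.MAX_THREADS

# execute() returns the connection, so the configured connection can be chained or returned
ddb = duckdb.connect().execute(f"SET threads TO {MAX_THREADS};")
print(ddb.execute("SELECT current_setting('threads');").fetchone())  # (4,)
```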
@@ -252,20 +195,25 @@ def _sqlite_mixed_type_query_to_parquet(
 
     import pyarrow as pa
 
+    from cytotable.constants import SQLITE_AFFINITY_DATA_TYPE_SYNONYMS
     from cytotable.exceptions import DatatypeException
-    from cytotable.utils import SQLITE_AFFINITY_DATA_TYPE_SYNONYMS
 
     # open sqlite3 connection
     with sqlite3.connect(source_path) as conn:
         cursor = conn.cursor()
 
-        # gather table column details including datatype
+        # Gather table column details including datatype.
+        # Note: uses SQLite pragma for table information.
+        # See the following for more information:
+        # https://sqlite.org/pragma.html#pragma_table_info
        cursor.execute(
            f"""
            SELECT :table_name as table_name,
                   name as column_name,
                   type as column_type
-            FROM pragma_table_info(:table_name);
+            FROM pragma_table_info(:table_name)
+            /* explicit column ordering by 'cid' */
+            ORDER BY cid ASC;
            """,
            {"table_name": table_name},
        )
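
`pragma_table_info` exposes SQLite's column catalog, and `cid` is the zero-based column id, so ordering by it returns columns in declaration order. A standalone sketch:

```python
import sqlite3

with sqlite3.connect(":memory:") as conn:
    conn.execute("CREATE TABLE example (id INTEGER, name TEXT, score REAL);")
    rows = conn.execute(
        """
        SELECT name AS column_name, type AS column_type
        FROM pragma_table_info('example')
        ORDER BY cid ASC;
        """
    ).fetchall()
    print(rows)  # [('id', 'INTEGER'), ('name', 'TEXT'), ('score', 'REAL')]
```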
@@ -384,6 +332,9 @@ def _arrow_type_cast_if_specified(
         Dict[str, str]
             A potentially data type updated dictionary of column information
     """
+
+    from cytotable.constants import DDB_DATA_TYPE_SYNONYMS
+
     # for casting to new float type
     if "float" in data_type_cast_map.keys() and column["column_dtype"] in [
         "REAL",
@@ -453,3 +404,56 @@ def _expand_path(
         modifed_path = modifed_path.expanduser()
 
     return modifed_path.resolve()
+
+
+def _get_cytotable_version() -> str:
+    """
+    Seeks the current version of CytoTable using either pkg_resources
+    or dunamai to determine the current version being used.
+
+    Returns:
+        str
+            A string representing the version of CytoTable currently being used.
+    """
+
+    try:
+        # attempt to gather the development version from dunamai
+        # for scenarios where cytotable from source is used.
+        import dunamai
+
+        return dunamai.Version.from_any_vcs().serialize()
+    except (RuntimeError, ModuleNotFoundError):
+        # else grab a static version from __init__.py
+        # for scenarios where the built/packaged cytotable is used.
+        import cytotable
+
+        return cytotable.__version__
+
+
+def _write_parquet_table_with_metadata(table: pa.Table, **kwargs) -> None:
+    """
+    Adds metadata to parquet output from CytoTable.
+    Note: this mostly wraps pyarrow.parquet.write_table
+    https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html
+
+    Args:
+        table: pa.Table:
+            Pyarrow table to be serialized as parquet table.
+        **kwargs: Any:
+            kwargs provided to this function roughly align with
+            pyarrow.parquet.write_table. The following might be
+            examples of what to expect here:
+            - where: str or pyarrow.NativeFile
+    """
+
+    from pyarrow import parquet
+
+    from cytotable.constants import CYTOTABLE_DEFAULT_PARQUET_METADATA
+    from cytotable.utils import _get_cytotable_version
+
+    parquet.write_table(
+        table=table.replace_schema_metadata(
+            metadata=CYTOTABLE_DEFAULT_PARQUET_METADATA
+        ),
+        **kwargs,
+    )
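
The metadata stamped by `replace_schema_metadata` lands in the Parquet file's key/value store, where any Arrow reader can recover it. A round-trip sketch (output path illustrative):

```python
import pyarrow as pa
from pyarrow import parquet

table = pa.table({"x": [1, 2, 3]})
parquet.write_table(
    table=table.replace_schema_metadata(
        {"data-producer": "https://github.com/cytomining/CytoTable"}
    ),
    where="example.parquet",
)

# readers recover the tags from the file-level schema metadata
print(parquet.read_schema("example.parquet").metadata)
```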
@@ -1,6 +1,7 @@
 [tool.poetry]
 name = "CytoTable"
-version = "0.0.3"
+# note: version data is maintained by poetry-dynamic-versioning (do not edit)
+version = "0.0.4"
 description = "Transform CellProfiler and DeepProfiler data for processing image-based profiling readouts with Pycytominer and other Cytomining tools."
 authors = ["Cytomining Community"]
 license = "BSD-3-Clause License"
@@ -10,14 +11,25 @@ repository = "https://github.com/cytomining/CytoTable"
 documentation = "https://cytomining.github.io/CytoTable/"
 keywords = ["python", "cellprofiler","single-cell-analysis", "way-lab"]
 
+[tool.poetry-dynamic-versioning]
+enable = false
+style = "pep440"
+vcs = "git"
+
+[build-system]
+requires = ["poetry-core>=1.0.0", "poetry-dynamic-versioning>=1.0.0,<2.0.0"]
+build-backend = "poetry_dynamic_versioning.backend"
+
+[tool.setuptools_scm]
+
 [tool.poetry.dependencies]
 python = ">=3.8,<3.13"
-pyarrow = "^13.0.0"
+pyarrow = ">=13.0.0"
 cloudpathlib = {extras = ["all"], version = "^0.15.0"}
 duckdb = ">=0.8.0"
 parsl = ">=2023.9.25"
 
-[tool.poetry.dev-dependencies]
+[tool.poetry.group.dev.dependencies]
 pytest = "^7.4.0"
 pytest-cov = "^4.1.0"
 Sphinx = "^6.0.0"
@@ -27,10 +39,7 @@ moto = {extras = ["s3", "server"], version = "^4.0.0"}
 cffconvert = "^2.0.0"
 cytominer-database = "^0.3.4"
 pycytominer = { git = "https://github.com/cytomining/pycytominer.git", rev = "09b2c79aa94908e3520f0931a844db4fba7fd3fb" }
-
-[build-system]
-requires = ["poetry-core"]
-build-backend = "poetry.core.masonry.api"
+dunamai = "^1.19.0"
 
 [tool.vulture]
 min_confidence = 80
@@ -7,10 +7,17 @@ _Diagram showing data flow relative to this project._
 
 ## Summary
 
-CytoTable enables single-cell morphology data analysis by cleaning and transforming CellProfiler (`.csv` or `.sqlite`), cytominer-database (`.sqlite`), and DeepProfiler (`.npz`) output data at scale.
+CytoTable enables single-cell morphology data analysis by cleaning and transforming CellProfiler (`.csv` or `.sqlite`), cytominer-database (`.sqlite`), and DeepProfiler (`.npz`), and other sources such as IN Carta data output data at scale.
 CytoTable creates parquet files for both independent analysis and for input into [Pycytominer](https://github.com/cytomining/pycytominer).
 The Parquet files will have a unified and documented data model, including referenceable schema where appropriate (for validation within Pycytominer or other projects).
 
+The name for the project is inspired from:
+
+- __Cyto__: "1. (biology) cell." ([Wiktionary: Cyto-](https://en.wiktionary.org/wiki/cyto-))
+- __Table__:
+  - "1. Furniture with a top surface to accommodate a variety of uses."
+  - "3.1. A matrix or grid of data arranged in rows and columns." <br> ([Wiktionary: Table](https://en.wiktionary.org/wiki/table))
+
 ## Installation
 
 Install CytoTable from [PyPI](https://pypi.org/) or from source: