masster 0.5.9-py3-none-any.whl → 0.5.11-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of masster might be problematic.

@@ -1264,7 +1264,7 @@ def find_ms2(self, **kwargs):
 
     # Log completion
     self.logger.success(
-        f"MS2 linking completed. Total features with MS2 data: {c}",
+        f"MS2 linking completed. Features with MS2 data: {c}.",
    )
    self.features_df = features_df
 
masster/sample/sample.py CHANGED
@@ -1,35 +1,98 @@
 """
-sample.py
+sample.py - Mass Spectrometry Sample Analysis Module
 
-This module provides tools for processing and analyzing Data-Dependent Acquisition (DDA) mass spectrometry data.
-It defines the `Sample` class, which offers methods to load, process, analyze, and visualize mass spectrometry data
-from various file formats, including mzML, Thermo RAW, and Sciex WIFF formats.
+This module provides comprehensive tools for processing and analyzing Data-Dependent Acquisition (DDA)
+mass spectrometry data. It defines the `Sample` class, which offers methods to load, process, analyze,
+and visualize mass spectrometry data from various file formats.
+
+Supported File Formats:
+- mzML (standard XML format for mass spectrometry data)
+- Thermo RAW (native Thermo Fisher Scientific format)
+- Sciex WIFF (native Sciex format)
+- Sample5 (MASSter's native HDF5-based format for optimized storage)
 
 Key Features:
-- **File Handling**: Load and save data in multiple formats.
-- **Feature Detection**: Detect and process mass spectrometry features.
-- **Spectrum Analysis**: Retrieve and analyze MS1/MS2 spectra.
-- **Visualization**: Generate interactive and static plots for spectra and chromatograms.
-- **Statistics**: Compute and export detailed DDA run statistics.
-
-Dependencies:
-- `pyopenms`: For file handling and feature detection.
-- `polars` and `pandas`: For data manipulation.
-- `numpy`: For numerical computations.
-- `bokeh`, `panel`, `holoviews`, `datashader`: For interactive visualizations.
+- **File Handling**: Load and save data in multiple formats with automatic format detection
+- **Feature Detection**: Detect and process mass spectrometry features using advanced algorithms
+- **Spectrum Analysis**: Retrieve and analyze MS1/MS2 spectra with comprehensive metadata
+- **Adduct Detection**: Find and annotate adducts and in-source fragments
+- **Isotope Analysis**: Detect and process isotopic patterns
+- **Chromatogram Extraction**: Extract and analyze chromatograms (EIC, BPC, TIC)
+- **Visualization**: Generate interactive and static plots for spectra, chromatograms, and 2D maps
+- **Statistics**: Compute and export detailed DDA run statistics and quality metrics
+- **Data Export**: Export processed data to various formats (XLSX, MGF, etc.)
+- **Memory Management**: Efficient handling of large datasets with on-disk storage options
+
+Core Dependencies:
+- `pyopenms`: OpenMS library for file handling and feature detection algorithms
+- `polars`: High-performance data manipulation and analysis
+- `numpy`: Numerical computations and array operations
+- `bokeh`, `panel`, `holoviews`, `datashader`: Interactive visualizations and dashboards
+- `h5py`: HDF5 file format support for Sample5 files
 
 Classes:
-- `Sample`: Main class for handling DDA data, providing methods for data import, processing, and visualization.
-
-Example Usage:
-```python
-from masster.sample import Sample
+    Sample: Main class for handling DDA mass spectrometry data, providing methods for
+        data import, processing, analysis, and visualization.
 
-sample = Sample(file="example.mzML")
-sample.find_features()
-sample.plot_2d()
-```
+Typical Workflow:
+    1. Load mass spectrometry data file
+    2. Detect features using find_features()
+    3. Optionally find MS2 spectra with find_ms2()
+    4. Analyze and visualize results
+    5. Export processed data
 
+Example Usage:
+    Basic analysis workflow:
+
+    ```python
+    from masster.sample import Sample
+
+    # Load a mass spectrometry file
+    sample = Sample(filename="experiment.mzML")
+
+    # Detect features
+    sample.find_features()
+
+    # Find MS2 spectra for features
+    sample.find_ms2()
+
+    # Generate 2D visualization
+    sample.plot_2d()
+
+    # Export results
+    sample.export_features("features.xlsx")
+    ```
+
+    Advanced usage with custom parameters:
+
+    ```python
+    from masster.sample import Sample
+    from masster.sample.defaults import sample_defaults, find_features_defaults
+
+    # Create custom parameters
+    params = sample_defaults(log_level="DEBUG", label="My Experiment")
+    ff_params = find_features_defaults(noise_threshold_int=1000)
+
+    # Initialize with custom parameters
+    sample = Sample(params=params)
+    sample.load("data.raw")
+
+    # Feature detection with custom parameters
+    sample.find_features(params=ff_params)
+
+    # Generate comprehensive statistics
+    stats = sample.get_dda_stats()
+    sample.plot_dda_stats()
+    ```
+
+Notes:
+    - The Sample class maintains processing history and parameters for reproducibility
+    - Large files can be processed with on-disk storage to manage memory usage
+    - All visualizations are interactive by default and can be exported as static images
+    - The module supports both individual sample analysis and batch processing workflows
+
+Version: Part of the MASSter mass spectrometry analysis framework
+Author: Zamboni Lab, ETH Zurich
 """
 
 import importlib
@@ -49,16 +112,12 @@ from masster.sample.defaults.get_spectrum_def import get_spectrum_defaults
 
 # Sample-specific imports - keeping these private, only for internal use
 from masster.sample.h5 import _load_sample5
-# from masster.sample.h5 import _load_sample5_study
 from masster.sample.h5 import _save_sample5
-# from masster.sample.helpers import _delete_ms2
 from masster.sample.helpers import _estimate_memory_usage
 from masster.sample.helpers import _get_scan_uids
 from masster.sample.helpers import _get_feature_uids
-# from masster.sample.helpers import _features_sync - made internal only
 from masster.sample.adducts import find_adducts
 from masster.sample.adducts import _get_adducts
-# Removed _get_adducts - only used in study modules
 from masster.sample.helpers import features_delete
 from masster.sample.helpers import features_filter
 from masster.sample.helpers import features_select
@@ -70,23 +129,17 @@ from masster.sample.helpers import get_eic
 from masster.sample.helpers import set_source
 from masster.sample.helpers import _recreate_feature_map
 from masster.sample.helpers import _get_feature_map
-# Load functions - keeping only specific ones needed for external API
-# from masster.sample.load import _load_featureXML - made internal only
-# from masster.sample.load import _load_ms2data - made internal only
-# from masster.sample.load import _load_mzML - made internal only
-# from masster.sample.load import _load_raw - made internal only
-# from masster.sample.load import _load_wiff - made internal only
 from masster.sample.load import chrom_extract
 from masster.sample.load import _index_file
 from masster.sample.load import load
 from masster.sample.load import load_noms1
-from masster.sample.load import _load_ms1  # Renamed from load_study
+from masster.sample.load import _load_ms1
 from masster.sample.load import sanitize
 from masster.sample.plot import plot_2d
 from masster.sample.plot import plot_2d_oracle
 from masster.sample.plot import plot_dda_stats
 from masster.sample.plot import plot_chrom
-from masster.sample.plot import plot_features_stats  # Renamed from plot_feature_stats
+from masster.sample.plot import plot_features_stats
 from masster.sample.plot import plot_ms2_cycle
 from masster.sample.plot import plot_ms2_eic
 from masster.sample.plot import plot_ms2_q1
@@ -113,7 +166,6 @@ from masster.sample.save import export_features
 from masster.sample.save import export_mgf
 from masster.sample.save import export_xlsx
 from masster.sample.save import save
-# Removed internal-only import: _save_featureXML
 
 
 class Sample:
@@ -402,6 +454,7 @@ class Sample:
             f"{base_modname}.chromatogram",
             f"{base_modname}.spectrum",
             f"{base_modname}.logger",
+            f"{base_modname}.lib",
         ]
 
         # Add study submodules
@@ -414,17 +467,9 @@ class Sample:
             ):
                 study_modules.append(module_name)
 
-        """ # Add parameters submodules
-        parameters_modules = []
-        parameters_module_prefix = f"{base_modname}.parameters."
-        for module_name in sys.modules:
-            if module_name.startswith(parameters_module_prefix) and module_name != current_module:
-                parameters_modules.append(module_name)
-        """
-
         all_modules_to_reload = (
             core_modules + sample_modules + study_modules
-        )  # + parameters_modules
+        )
 
         # Reload all discovered modules
         for full_module_name in all_modules_to_reload:
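For context, the reload logic this hunk tidies up is the standard importlib hot-reload pattern: collect module names, then re-execute whichever ones are already in `sys.modules`. A minimal standalone sketch (module names are illustrative, not masster's exact code):

```python
import importlib
import sys

base_modname = "masster"

# Core submodules to refresh; ".lib" is the entry this release adds.
core_modules = [
    f"{base_modname}.chromatogram",
    f"{base_modname}.spectrum",
    f"{base_modname}.logger",
    f"{base_modname}.lib",
]

# Discover already-imported study submodules dynamically.
study_modules = [
    name for name in sys.modules if name.startswith(f"{base_modname}.study.")
]

for full_module_name in core_modules + study_modules:
    module = sys.modules.get(full_module_name)
    if module is not None:
        # Re-executes the module's source in place; existing references
        # keep pointing at the same (now refreshed) module object.
        importlib.reload(module)
```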
@@ -466,8 +511,6 @@ class Sample:
         else:
             str += "Features: 0\n"
             str += "Features with MS2 spectra: 0\n"
-
-        # estimate memory usage
         mem_usage = self._estimate_memory_usage()
         str += f"Estimated memory usage: {mem_usage:.2f} MB\n"
 
masster/study/export.py CHANGED
@@ -496,7 +496,7 @@ def export_mgf(self, **kwargs):
                 # Write END IONS
                 f.write("END IONS\n\n")
 
-    self.logger.info(f"Exported {len(mgf_data)} spectra to {filename}")
+    self.logger.success(f"Exported {len(mgf_data)} spectra to {filename}")
 
 
 def export_mztab(self, filename: str | None = None, include_mgf=True, **kwargs) -> None:
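The `info` → `success` switches in this file (and below) suggest a loguru-style logger: loguru defines a SUCCESS level (severity 25) between INFO (20) and WARNING (30), so completion messages can be filtered and colored apart from routine progress. A minimal sketch with plain loguru, assuming masster's logger wrapper behaves the same way:

```python
from loguru import logger

logger.info("Writing MGF blocks...")     # routine progress, INFO (20)
logger.success("Exported 1234 spectra")  # completed milestone, SUCCESS (25)
```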
@@ -1183,7 +1183,7 @@ def export_mztab(self, filename: str | None = None, include_mgf=True, **kwargs)
             for line in mgf_lines:
                 f.write(line + "\n")
 
-    self.logger.info(f"Exported mzTab-M to {filename}")
+    self.logger.success(f"Exported mzTab-M to {filename}")
 
 
 def export_xlsx(self, filename: str | None = None) -> None:
@@ -1311,7 +1311,7 @@ def export_xlsx(self, filename: str | None = None) -> None:
                     f"Written worksheet '{sheet_name}' with shape {data.shape}",
                 )
 
-        self.logger.info(f"Study exported to {filename}")
+        self.logger.success(f"Study exported to {filename}")
 
     except Exception as e:
         self.logger.error(f"Error writing Excel file: {e}")
@@ -1424,8 +1424,6 @@ def export_parquet(self, filename: str | None = None) -> None:
 
     # Report results
     if exported_files:
-        self.logger.info(f"Study exported to {len(exported_files)} Parquet files:")
-        for file_path in exported_files:
-            self.logger.info(f"  - {file_path}")
+        self.logger.success(f"Study exported to {len(exported_files)} Parquet files.")
     else:
         self.logger.error("No Parquet files were created - no data available to export")
masster/study/h5.py CHANGED
@@ -818,9 +818,35 @@ def _reorder_columns_by_schema(
 
 
 def _create_dataframe_with_objects(data: dict, object_columns: list) -> pl.DataFrame:
     """Create DataFrame handling Object columns properly."""
+    # First check all data for numpy object arrays and move them to object columns
+    additional_object_cols = []
+    for k, v in data.items():
+        if k not in object_columns and hasattr(v, 'dtype') and str(v.dtype) == 'object':
+            # This is a numpy object array that should be treated as object
+            additional_object_cols.append(k)
+            object_columns.append(k)
+
+    if additional_object_cols:
+        # Re-run reconstruction for these columns
+        for col in additional_object_cols:
+            data[col] = _reconstruct_object_column(data[col], col)
+
     object_data = {k: v for k, v in data.items() if k in object_columns}
     regular_data = {k: v for k, v in data.items() if k not in object_columns}
 
+    # Final check: ensure no numpy object arrays in regular_data
+    problematic_cols = []
+    for k, v in regular_data.items():
+        if hasattr(v, 'dtype') and str(v.dtype) == 'object':
+            problematic_cols.append(k)
+
+    if problematic_cols:
+        # Move these to object_data
+        for col in problematic_cols:
+            object_data[col] = _reconstruct_object_column(regular_data[col], col)
+            del regular_data[col]
+            object_columns.append(col)
+
     # Determine expected length from regular data or first object column
     expected_length = None
     if regular_data:
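The dtype check added here guards against ragged per-feature payloads (spectra, chromatograms) that come back from HDF5 as numpy object arrays, which polars cannot ingest as regular columns. A standalone sketch of the failure mode and the routing, with a plain list-wrap standing in for `_reconstruct_object_column`:

```python
import numpy as np
import polars as pl

data = {
    "mz": np.array([100.05, 200.10]),                      # regular float column
    "chrom": np.array([[1.0, 2.0], [3.0]], dtype=object),  # ragged -> object dtype
}

# Same test as the hunk: numpy object arrays get routed to object columns.
object_columns = [
    k for k, v in data.items() if hasattr(v, "dtype") and str(v.dtype) == "object"
]
assert object_columns == ["chrom"]

# Object columns must be built explicitly as pl.Object series; handing the
# raw object array to pl.DataFrame would raise or mis-infer the type.
df = pl.DataFrame({
    "mz": data["mz"],
    "chrom": pl.Series("chrom", list(data["chrom"]), dtype=pl.Object),
})
```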
@@ -1103,11 +1129,18 @@ def _load_dataframe_from_group(
                 logger.info(f"Loading extra column '{col}' not in schema for {df_name}")
                 column_data = group[col][:]
 
-                # Try to determine if this should be treated as an object column
-                # by checking if the data looks like JSON strings
-                if len(column_data) > 0 and isinstance(column_data[0], bytes):
+                # Check if this is a known object column by name
+                known_object_columns = {"ms1_spec", "chrom", "ms2_scans", "ms2_specs", "spec", "adducts", "iso"}
+                is_known_object = col in known_object_columns
+
+                if is_known_object:
+                    # Known object column, always reconstruct
+                    data[col] = _reconstruct_object_column(column_data, col)
+                    if col not in object_columns:
+                        object_columns.append(col)
+                elif len(column_data) > 0 and isinstance(column_data[0], bytes):
                     try:
-                        # Check if it looks like JSON
+                        # Check if it looks like JSON for unknown columns
                         test_decode = column_data[0].decode("utf-8")
                         if test_decode.startswith("[") or test_decode.startswith("{"):
                             # Looks like JSON, treat as object column
@@ -1165,9 +1198,29 @@ def _load_dataframe_from_group(
             logger.debug(
                 f"Object column '{col}': length={len(data[col]) if data[col] is not None else 'None'}",
             )
+
+        # Debug: check for problematic data types in all columns before DataFrame creation
+        for col, values in data.items():
+            if hasattr(values, 'dtype') and str(values.dtype) == 'object':
+                logger.warning(f"Column '{col}' has numpy object dtype but is not in object_columns: {object_columns}")
+                if col not in object_columns:
+                    object_columns.append(col)
+
         df = _create_dataframe_with_objects(data, object_columns)
     else:
-        df = pl.DataFrame(data)
+        # Debug: check for problematic data types when no object columns are expected
+        for col, values in data.items():
+            if hasattr(values, 'dtype') and str(values.dtype) == 'object':
+                logger.warning(f"Column '{col}' has numpy object dtype but no object_columns specified!")
+                # Treat as object column
+                if object_columns is None:
+                    object_columns = []
+                object_columns.append(col)
+
+        if object_columns:
+            df = _create_dataframe_with_objects(data, object_columns)
+        else:
+            df = pl.DataFrame(data)
 
     # Clean null values and apply schema
     df = _clean_string_nulls(df)
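For unknown columns the loader still falls back to the JSON sniff shown above. Roughly, the heuristic in isolation, with `json.loads` standing in for `_reconstruct_object_column`:

```python
import json

def looks_like_json(column_data) -> bool:
    # Mirrors the hunk's test: first cell is bytes and decodes to text
    # starting with '[' or '{'.
    if len(column_data) == 0 or not isinstance(column_data[0], bytes):
        return False
    try:
        text = column_data[0].decode("utf-8")
    except UnicodeDecodeError:
        return False
    return text.startswith("[") or text.startswith("{")

cells = [b'[{"mz": 100.05, "inty": 120000.0}]', b"[]"]
if looks_like_json(cells):
    decoded = [json.loads(c) for c in cells]
```

Checking only the first cell keeps the sniff cheap, at the cost of misclassifying columns whose first row is atypical; the named allowlist added in this release avoids that for the columns masster knows about.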
@@ -1738,9 +1791,7 @@ def _save_study5(self, filename):
                 )
                 pbar.update(1)
 
-    self.logger.success(f"Study saved successfully to {filename}")
-    self.logger.debug(f"Save completed for {filename}")
-    self.logger.debug(f"Save completed for {filename}")
+    self.logger.success(f"Study saved to {filename}")
 
 
 def _load_study5(self, filename=None):
@@ -1859,7 +1910,7 @@ def _load_study5(self, filename=None):
                 )
             else:
                 self.logger.debug(
-                    "Successfully updated parameters from loaded history",
+                    "Updated parameters from loaded history",
                 )
         else:
             self.logger.debug(
@@ -2093,8 +2144,8 @@ def _load_study5(self, filename=None):
         # Ensure the column is Int64 type
         self.samples_df = self.samples_df.cast({"map_id": pl.Int64})
 
-        self.logger.info(
-            f"Successfully migrated {sample_count} samples to indexed map_id format (0 to {sample_count - 1})",
+        self.logger.debug(
+            f"Sanitized {sample_count} samples to indexed map_id format (0 to {sample_count - 1})",
         )
 
         # Sanitize null feature_id and consensus_id values with new UIDs (same method as merge)
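The migration being logged here maps each sample onto a sequential Int64 map_id. A hypothetical sketch of that reindexing in polars (not masster's exact code; `with_row_index` is the recent-polars name for row numbering):

```python
import polars as pl

samples_df = pl.DataFrame({"sample_name": ["s1", "s2", "s3"]})

# Assign 0..n-1 as map_id and store it as Int64, matching the schema cast
# shown in the hunk above.
samples_df = (
    samples_df
    .with_row_index("map_id")   # UInt32 by default
    .cast({"map_id": pl.Int64})
)
```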
@@ -2218,7 +2269,7 @@ def _sanitize_nulls(self):
             pl.Series("feature_id", feature_ids, dtype=pl.Utf8)
         )
 
-        self.logger.debug(f"Successfully sanitized {null_feature_ids} feature_id values")
+        self.logger.debug(f"Sanitized {null_feature_ids} feature_id values")
 
     # Sanitize consensus_df consensus_id column
     if hasattr(self, 'consensus_df') and self.consensus_df is not None and not self.consensus_df.is_empty():
@@ -2244,8 +2295,8 @@ def _sanitize_nulls(self):
         self.consensus_df = self.consensus_df.with_columns(
             pl.Series("consensus_id", consensus_ids, dtype=pl.Utf8)
         )
-
-        self.logger.debug(f"Successfully sanitized {null_consensus_ids} consensus_id values")
+
+        self.logger.debug(f"Sanitized {null_consensus_ids} consensus_id values")
 
     # Sanitize rt_original in features_df by replacing null or NaN values with rt values
     if hasattr(self, 'features_df') and self.features_df is not None and not self.features_df.is_empty():
@@ -2262,4 +2313,4 @@ def _sanitize_nulls(self):
             .otherwise(pl.col("rt_original"))
             .alias("rt_original")
         )
-        self.logger.debug(f"Successfully sanitized {null_or_nan_rt_original} rt_original values")
+        self.logger.debug(f"Sanitized {null_or_nan_rt_original} rt_original values")
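The surrounding context visible in this hunk is a standard polars when/then/otherwise backfill. Made self-contained, it looks like this:

```python
import polars as pl

features_df = pl.DataFrame({
    "rt": [10.0, 20.0, 30.0],
    "rt_original": [None, float("nan"), 29.5],
})

# Replace null or NaN rt_original values with the aligned rt value.
features_df = features_df.with_columns(
    pl.when(pl.col("rt_original").is_null() | pl.col("rt_original").is_nan())
    .then(pl.col("rt"))
    .otherwise(pl.col("rt_original"))
    .alias("rt_original")
)
# rt_original is now [10.0, 20.0, 29.5]
```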
masster/study/helpers.py CHANGED
@@ -1630,7 +1630,7 @@ def restore_features(self, samples=None, maps=False):
             self.logger.error(f"Failed to load sample {sample_name}: {e}")
             continue
 
-    self.logger.info(
+    self.logger.success(
         f"Completed restoring columns {columns_to_update} from {len(sample_uids)} samples",
     )
 
@@ -2663,7 +2663,7 @@ def features_filter(
     removed_count = initial_count - final_count
 
     self.logger.info(
-        f"Filtered features: kept {final_count:,}, removed {removed_count:,}"
+        f"Filtered features. Kept: {final_count:,}. Removed: {removed_count:,}."
     )
 
 
@@ -2940,6 +2940,7 @@ def features_delete(self, features):
 
 def consensus_select(
     self,
+    uid=None,
     mz=None,
     rt=None,
     inty_mean=None,
@@ -2956,14 +2957,12 @@ def consensus_select(
     rt_delta_mean=None,
     id_top_score=None,
     identified=None,
-    # New adduct filter parameters
     adduct_top=None,
     adduct_charge_top=None,
     adduct_mass_neutral_top=None,
     adduct_mass_shift_top=None,
     adduct_group=None,
     adduct_of=None,
-    # New identification filter parameters
     id_top_name=None,
     id_top_class=None,
     id_top_adduct=None,
@@ -2976,6 +2975,11 @@ def consensus_select(
     OPTIMIZED VERSION: Enhanced performance with lazy evaluation, vectorized operations, and efficient filtering.
 
     Parameters:
+        uid: consensus UID filter with flexible formats:
+            - None: include all consensus features (default)
+            - int: single specific consensus_uid
+            - tuple: range of consensus_uids (consensus_uid_min, consensus_uid_max)
+            - list: specific list of consensus_uid values
         mz: m/z filter with flexible formats:
             - float: m/z value ± default tolerance (uses study.parameters.eic_mz_tol)
             - tuple (mz_min, mz_max): range where mz_max > mz_min
@@ -3023,7 +3027,7 @@ def consensus_select(
         return pl.DataFrame()
 
     # Early return optimization - check if any filters are provided
-    filter_params = [mz, rt, inty_mean, consensus_uid, consensus_id, number_samples,
+    filter_params = [uid, mz, rt, inty_mean, consensus_uid, consensus_id, number_samples,
                      number_ms2, quality, bl, chrom_coherence_mean, chrom_prominence_mean,
                      chrom_prominence_scaled_mean, chrom_height_scaled_mean,
                      rt_delta_mean, id_top_score, identified,
@@ -3044,6 +3048,21 @@ def consensus_select(
     warnings = []
 
     # Build all filter conditions efficiently
+    # Handle uid parameter first (consensus_uid filter with flexible formats)
+    if uid is not None:
+        if isinstance(uid, int):
+            # Single specific consensus_uid
+            filter_conditions.append(pl.col("consensus_uid") == uid)
+        elif isinstance(uid, tuple) and len(uid) == 2:
+            # Range of consensus_uids (consensus_uid_min, consensus_uid_max)
+            min_uid, max_uid = uid
+            filter_conditions.append((pl.col("consensus_uid") >= min_uid) & (pl.col("consensus_uid") <= max_uid))
+        elif isinstance(uid, list):
+            # Specific list of consensus_uid values
+            filter_conditions.append(pl.col("consensus_uid").is_in(uid))
+        else:
+            self.logger.warning(f"Invalid uid parameter type: {type(uid)}. Expected int, tuple, or list.")
+
     if mz is not None:
         if isinstance(mz, tuple) and len(mz) == 2:
             if mz[1] < mz[0]:
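The new uid branch builds one polars condition per accepted shape. A standalone sketch of the same dispatch against a toy consensus table:

```python
import polars as pl

consensus_df = pl.DataFrame({
    "consensus_uid": [1, 42, 150, 300],
    "mz": [101.0, 202.0, 303.0, 404.0],
})

def uid_condition(uid):
    # Same three accepted shapes as consensus_select's uid parameter.
    if isinstance(uid, int):
        return pl.col("consensus_uid") == uid
    if isinstance(uid, tuple) and len(uid) == 2:
        lo, hi = uid
        return (pl.col("consensus_uid") >= lo) & (pl.col("consensus_uid") <= hi)
    if isinstance(uid, list):
        return pl.col("consensus_uid").is_in(uid)
    raise TypeError(f"Expected int, tuple, or list, got {type(uid)}")

print(consensus_df.filter(uid_condition((100, 300))))  # uids 150 and 300
print(consensus_df.filter(uid_condition([1, 42])))     # uids 1 and 42
```

In the study API this corresponds to calls like `study.consensus_select(uid=(100, 300))`.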
masster/study/load.py CHANGED
@@ -139,7 +139,7 @@ def add(
             f"No files found in {folder}. Please check the folder path or file patterns.",
         )
     else:
-        self.logger.debug(f"Successfully added {counter} samples to the study.")
+        self.logger.debug(f"Added {counter} samples to the study.")
 
     # Return a simple summary to suppress marimo's automatic object display
     return f"Added {counter} samples to study"
@@ -2055,169 +2055,6 @@ def _sanitize(self):
     except Exception as e:
         self.logger.error(f"Failed to recreate sanitized DataFrame: {e}")
 
-'''
-def _load_features(self):
-    """
-    Load features by reconstructing FeatureMaps from the processed features_df data.
-
-    This ensures that the loaded FeatureMaps contain the same processed features
-    as stored in features_df, rather than loading raw features from .featureXML files
-    which may not match the processed data after filtering, alignment, etc.
-    """
-    import polars as pl
-    import pyopenms as oms
-    from tqdm import tqdm
-    from datetime import datetime
-
-    self.features_maps = []
-
-    # Check if features_df exists and is not empty
-    if self.features_df is None:
-        self.logger.warning("features_df is None. Falling back to XML loading.")
-        self._load_features_from_xml()
-        return
-
-    if len(self.features_df) == 0:
-        self.logger.warning("features_df is empty. Falling back to XML loading.")
-        self._load_features_from_xml()
-        return
-
-    # If we get here, we should use the new method
-    self.logger.debug("Reconstructing FeatureMaps from features_df.")
-
-    tdqm_disable = self.log_level not in ["TRACE", "DEBUG", "INFO"]
-
-    # Process each sample in order
-    for sample_index, row_dict in tqdm(
-        enumerate(self.samples_df.iter_rows(named=True)),
-        total=len(self.samples_df),
-        desc=f"{datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3]} | INFO | {self.log_label}Reconstruct FeatureMaps from DataFrame",
-        disable=tdqm_disable,
-    ):
-        sample_uid = row_dict["sample_uid"]
-        sample_name = row_dict["sample_name"]
-
-        # Get features for this sample from features_df
-        sample_features = self.features_df.filter(pl.col("sample_uid") == sample_uid)
-
-        # Create new FeatureMap
-        feature_map = oms.FeatureMap()
-
-        # Convert DataFrame features to OpenMS Features
-        # Keep track of next available feature_id for this sample
-        next_feature_id = 1
-        used_feature_ids = set()
-
-        # First pass: collect existing feature_ids to avoid conflicts
-        for feature_row in sample_features.iter_rows(named=True):
-            if feature_row["feature_id"] is not None:
-                used_feature_ids.add(int(feature_row["feature_id"]))
-
-        # Find the next available feature_id
-        while next_feature_id in used_feature_ids:
-            next_feature_id += 1
-
-        for feature_row in sample_features.iter_rows(named=True):
-            feature = oms.Feature()
-
-            # Set properties from DataFrame (handle missing values gracefully)
-            try:
-                # Skip features with missing critical data
-                if feature_row["mz"] is None:
-                    self.logger.warning("Skipping feature due to missing mz")
-                    continue
-                if feature_row["rt"] is None:
-                    self.logger.warning("Skipping feature due to missing rt")
-                    continue
-                if feature_row["inty"] is None:
-                    self.logger.warning("Skipping feature due to missing inty")
-                    continue
-
-                # Handle missing feature_id by generating a new one
-                if feature_row["feature_id"] is None:
-                    feature_id = next_feature_id
-                    next_feature_id += 1
-                    self.logger.debug(f"Generated new feature_id {feature_id} for feature with missing ID")
-                else:
-                    feature_id = int(feature_row["feature_id"])
-
-                feature.setUniqueId(feature_id)
-                feature.setMZ(float(feature_row["mz"]))
-                feature.setRT(float(feature_row["rt"]))
-                feature.setIntensity(float(feature_row["inty"]))
-
-                # Handle optional fields that might be None
-                if feature_row.get("quality") is not None:
-                    feature.setOverallQuality(float(feature_row["quality"]))
-                if feature_row.get("charge") is not None:
-                    feature.setCharge(int(feature_row["charge"]))
-
-                # Add to feature map
-                feature_map.push_back(feature)
-            except (ValueError, TypeError) as e:
-                self.logger.warning(f"Skipping feature due to conversion error: {e}")
-                continue
-
-        self.features_maps.append(feature_map)
-
-    self.logger.debug(
-        f"Successfully reconstructed {len(self.features_maps)} FeatureMaps from features_df.",
-    )
-'''
-
-'''
-def _load_features_from_xml(self):
-    """
-    Original load_features method that loads from .featureXML files.
-    Used as fallback when features_df is not available.
-    """
-    self.features_maps = []
-    self.logger.debug("Loading features from featureXML files.")
-    tdqm_disable = self.log_level not in ["TRACE", "DEBUG", "INFO"]
-    for _index, row_dict in tqdm(
-        enumerate(self.samples_df.iter_rows(named=True)),
-        total=len(self.samples_df),
-        desc=f"{datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3]} | INFO | {self.log_label}Load feature maps from XML",
-        disable=tdqm_disable,
-    ):
-        if self.folder is not None:
-            filename = os.path.join(
-                self.folder,
-                row_dict["sample_name"] + ".featureXML",
-            )
-        else:
-            filename = os.path.join(
-                os.getcwd(),
-                row_dict["sample_name"] + ".featureXML",
-            )
-        # check if file exists
-        if not os.path.exists(filename):
-            filename = row_dict["sample_path"].replace(".sample5", ".featureXML")
-
-        if not os.path.exists(filename):
-            self.features_maps.append(None)
-            continue
-
-        fh = oms.FeatureXMLFile()
-        fm = oms.FeatureMap()
-        fh.load(filename, fm)
-        self.features_maps.append(fm)
-    self.logger.debug("Features loaded successfully.")
-'''
-'''
-def _load_consensusXML(self, filename="alignment.consensusXML"):
-    """
-    Load a consensus map from a file.
-    """
-    if not os.path.exists(filename):
-        self.logger.error(f"File {filename} does not exist.")
-        return
-    fh = oms.ConsensusXMLFile()
-    self.consensus_map = oms.ConsensusMap()
-    fh.load(filename, self.consensus_map)
-    self.logger.debug(f"Loaded consensus map from {filename}.")
-'''
-
 def _add_samples_batch(
     self,
     files,