PyPI - moducomp - Versions diffs - 0.7.13__tar.gz → 0.7.16__tar.gz - Mend

moducomp 0.7.13__tar.gz → 0.7.16__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

{moducomp-0.7.13 → moducomp-0.7.16}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: moducomp
-Version: 0.7.13
+Version: 0.7.16
 Summary: moducomp: metabolic module completeness and complementarity for microbiomes.
 Keywords: bioinformatics,microbiome,metabolic,kegg,genomics
 Author-email: "Juan C. Villada" <jvillada@lbl.gov>
@@ -62,6 +62,14 @@ pixi global install \
 `moducomp` needs the eggNOG-mapper database to run. The primary (recommended) way to download it is using the `download_eggnog_data.py` wrapper, which mirrors the upstream downloader behavior. For upstream details, see the eggNOG-mapper setup guide: [eggNOG-mapper database setup](https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.13#user-content-Setup).
+```bash
+download_eggnog_data.py
+```
+By default, the data are stored in `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`, and `moducomp` will auto-detect that location without needing `EGGNOG_DATA_DIR`.
+To use a custom location:
 ```bash
 export EGGNOG_DATA_DIR="/path/to/eggnog-data"
 download_eggnog_data.py --eggnog-data-dir "$EGGNOG_DATA_DIR"
@@ -69,8 +77,6 @@ download_eggnog_data.py --eggnog-data-dir "$EGGNOG_DATA_DIR"
 # moducomp download-eggnog-data --eggnog-data-dir "$EGGNOG_DATA_DIR"
 ```
-If `EGGNOG_DATA_DIR` is not set, the downloader defaults to `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`.
 ### Quick test
 Small test data sets ship with `moducomp`. After installation you can confirm the pipeline by running:
@@ -147,7 +153,7 @@ This section lists all CLI options implemented today, along with their default v
 | `--calculate-complementarity`, `-c` | `0` | Complementarity size to compute (0 disables). |
 | `--adapt-headers/--no-adapt-headers` | `false` | Adapt FASTA headers to `genome|protein_N`. |
 | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after completion. |
-| `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `fullmem` | Run eggNOG-mapper without `--dbmem` to reduce RAM. |
+| `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `lowmem` | Run eggNOG-mapper without `--dbmem` to reduce RAM. |
 | `--verbose/--quiet` | `false` | Enable verbose progress output. |
 | `--validate/--no-validate` | `validate` | Run post-run validation checks. |
 | `--validate-report/--no-validate-report` | `validate-report` | Write `validation_report.json` in the output directory. |
@@ -253,7 +259,7 @@ moducomp validate /path/to/output --strict
 ### ⚠️ Important note 2
-`moducomp` is specifically designed for large scale analysis of microbiomes with hundreds of members, and works on Linux systems with at least **64GB of RAM**. Nevertheless, it can be run on **smaller systems with less RAM, using the flag `--lowmem` (`--low-mem`) when running the `pipeline` command**. The `test` command uses low-memory mode by default and can be switched to full memory with `--fullmem` (`--full-mem`).
+`moducomp` is specifically designed for large scale analysis of microbiomes with hundreds of members, and works on Linux systems with at least **64GB of RAM**. For robustness, **low-memory mode is now the default** for `pipeline` and `test`. If you have ample RAM and want full-memory mode, add `--fullmem` (`--full-mem`).
 ### Notes on bundled test data
@@ -290,7 +296,7 @@ moducomp pipeline \
     --ncpus <number_of_cpus_to_use> \
     --calculate-complementarity <N>  # 0 to disable, 2 for 2-member, 3 for 3-member complementarity.
     # Optional flags:
-    # --lowmem/--fullmem          # Optional: Use low-mem if you have less than 64GB of RAM (default is full mem)
+    # --fullmem                  # Optional: Use full-mem if you have ample RAM (default is low-mem)
     # --adapt-headers             # If your FASTA headers need modification
     # --del-tmp/--keep-tmp        # Delete or keep temporary files
     # --eggnog-data-dir /path     # If EGGNOG_DATA_DIR is not set
@@ -343,8 +349,11 @@ moducomp pipeline ./large_genome_collection ./output_large --ncpus 32 --calculat
 # For moderate datasets with verbose output
 moducomp analyze-ko-matrix ./ko_matrix.csv ./output_moderate --ncpus 16 --calculate-complementarity 2 --verbose
-# For systems with limited memory
-moducomp pipeline ./genomes ./output_lowmem --ncpus 8 --lowmem --calculate-complementarity 2
+# For systems with limited memory (default behavior)
+moducomp pipeline ./genomes ./output_lowmem --ncpus 8 --calculate-complementarity 2
+# For systems with ample RAM
+moducomp pipeline ./genomes ./output_fullmem --ncpus 8 --fullmem --calculate-complementarity 2
 ```
 ## Expected outputs

{moducomp-0.7.13 → moducomp-0.7.16}/README.md RENAMED

@@ -37,6 +37,14 @@ pixi global install \
 `moducomp` needs the eggNOG-mapper database to run. The primary (recommended) way to download it is using the `download_eggnog_data.py` wrapper, which mirrors the upstream downloader behavior. For upstream details, see the eggNOG-mapper setup guide: [eggNOG-mapper database setup](https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.13#user-content-Setup).
+```bash
+download_eggnog_data.py
+```
+By default, the data are stored in `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`, and `moducomp` will auto-detect that location without needing `EGGNOG_DATA_DIR`.
+To use a custom location:
 ```bash
 export EGGNOG_DATA_DIR="/path/to/eggnog-data"
 download_eggnog_data.py --eggnog-data-dir "$EGGNOG_DATA_DIR"
@@ -44,8 +52,6 @@ download_eggnog_data.py --eggnog-data-dir "$EGGNOG_DATA_DIR"
 # moducomp download-eggnog-data --eggnog-data-dir "$EGGNOG_DATA_DIR"
 ```
-If `EGGNOG_DATA_DIR` is not set, the downloader defaults to `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`.
 ### Quick test
 Small test data sets ship with `moducomp`. After installation you can confirm the pipeline by running:
@@ -122,7 +128,7 @@ This section lists all CLI options implemented today, along with their default v
 | `--calculate-complementarity`, `-c` | `0` | Complementarity size to compute (0 disables). |
 | `--adapt-headers/--no-adapt-headers` | `false` | Adapt FASTA headers to `genome|protein_N`. |
 | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after completion. |
-| `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `fullmem` | Run eggNOG-mapper without `--dbmem` to reduce RAM. |
+| `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `lowmem` | Run eggNOG-mapper without `--dbmem` to reduce RAM. |
 | `--verbose/--quiet` | `false` | Enable verbose progress output. |
 | `--validate/--no-validate` | `validate` | Run post-run validation checks. |
 | `--validate-report/--no-validate-report` | `validate-report` | Write `validation_report.json` in the output directory. |
@@ -228,7 +234,7 @@ moducomp validate /path/to/output --strict
 ### ⚠️ Important note 2
-`moducomp` is specifically designed for large scale analysis of microbiomes with hundreds of members, and works on Linux systems with at least **64GB of RAM**. Nevertheless, it can be run on **smaller systems with less RAM, using the flag `--lowmem` (`--low-mem`) when running the `pipeline` command**. The `test` command uses low-memory mode by default and can be switched to full memory with `--fullmem` (`--full-mem`).
+`moducomp` is specifically designed for large scale analysis of microbiomes with hundreds of members, and works on Linux systems with at least **64GB of RAM**. For robustness, **low-memory mode is now the default** for `pipeline` and `test`. If you have ample RAM and want full-memory mode, add `--fullmem` (`--full-mem`).
 ### Notes on bundled test data
@@ -265,7 +271,7 @@ moducomp pipeline \
     --ncpus <number_of_cpus_to_use> \
     --calculate-complementarity <N>  # 0 to disable, 2 for 2-member, 3 for 3-member complementarity.
     # Optional flags:
-    # --lowmem/--fullmem          # Optional: Use low-mem if you have less than 64GB of RAM (default is full mem)
+    # --fullmem                  # Optional: Use full-mem if you have ample RAM (default is low-mem)
     # --adapt-headers             # If your FASTA headers need modification
     # --del-tmp/--keep-tmp        # Delete or keep temporary files
     # --eggnog-data-dir /path     # If EGGNOG_DATA_DIR is not set
@@ -318,8 +324,11 @@ moducomp pipeline ./large_genome_collection ./output_large --ncpus 32 --calculat
 # For moderate datasets with verbose output
 moducomp analyze-ko-matrix ./ko_matrix.csv ./output_moderate --ncpus 16 --calculate-complementarity 2 --verbose
-# For systems with limited memory
-moducomp pipeline ./genomes ./output_lowmem --ncpus 8 --lowmem --calculate-complementarity 2
+# For systems with limited memory (default behavior)
+moducomp pipeline ./genomes ./output_lowmem --ncpus 8 --calculate-complementarity 2
+# For systems with ample RAM
+moducomp pipeline ./genomes ./output_fullmem --ncpus 8 --fullmem --calculate-complementarity 2
 ```
 ## Expected outputs

{moducomp-0.7.13 → moducomp-0.7.16}/moducomp/__init__.py RENAMED

@@ -2,7 +2,7 @@
 moducomp: metabolic module completeness and complementarity for microbiomes.
 """
-__version__ = "0.7.13"
+__version__ = "0.7.16"
 __author__ = "Juan C. Villada"
 __email__ = "jvillada@lbl.gov"
 __title__ = "moducomp"

{moducomp-0.7.13 → moducomp-0.7.16}/moducomp/moducomp.py RENAMED

@@ -145,15 +145,23 @@ def require_eggnog_data_dir(eggnog_data_dir: Optional[str], logger: Optional[log
     if eggnog_data_dir:
         os.environ["EGGNOG_DATA_DIR"] = eggnog_data_dir
-    env_value = os.environ.get("EGGNOG_DATA_DIR", "")
-    if not env_value.strip():
-        message = (
-            "EGGNOG_DATA_DIR is required to run eggNOG-mapper. "
-            "Set the EGGNOG_DATA_DIR environment variable or pass --eggnog-data-dir. "
-            "Download the data with: download_eggnog_data.py or moducomp download-eggnog-data"
-        )
-        emit_error(message, logger)
-        raise typer.Exit(1)
+    env_value = os.environ.get("EGGNOG_DATA_DIR", "").strip()
+    if not env_value:
+        default_dir = default_eggnog_data_dir()
+        if default_dir.exists() and default_dir.is_dir() and any(default_dir.iterdir()):
+            os.environ["EGGNOG_DATA_DIR"] = str(default_dir)
+            env_value = str(default_dir)
+            if logger:
+                logger.info("EGGNOG_DATA_DIR not set; using default %s", env_value)
+        else:
+            message = (
+                "EGGNOG_DATA_DIR is required to run eggNOG-mapper. "
+                "Set the EGGNOG_DATA_DIR environment variable or pass --eggnog-data-dir. "
+                f"Default location is {default_dir}. "
+                "Download the data with: download_eggnog_data.py or moducomp download-eggnog-data"
+            )
+            emit_error(message, logger)
+            raise typer.Exit(1)
     data_dir = Path(env_value).expanduser().resolve()
     if not data_dir.exists() or not data_dir.is_dir():
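
The hunk above changes `require_eggnog_data_dir` so that an unset `EGGNOG_DATA_DIR` falls back to a populated default directory before failing. A minimal standalone sketch of that resolution order follows. The body of `default_eggnog_data_dir()` is not shown in this diff, so the version here is an assumption reconstructed from the README path `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`. `resolve_eggnog_data_dir` is a hypothetical name, and it returns `None` where the real function calls `emit_error` and raises `typer.Exit(1)`.

```python
from pathlib import Path
from typing import Mapping, Optional


def default_eggnog_data_dir(env: Mapping[str, str]) -> Path:
    """Assumed default: ${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog."""
    base = env.get("XDG_DATA_HOME", "").strip()
    root = Path(base).expanduser() if base else Path.home() / ".local" / "share"
    return root / "moducomp" / "eggnog"


def resolve_eggnog_data_dir(
    cli_value: Optional[str], env: Mapping[str, str]
) -> Optional[Path]:
    """Return the data directory to use, or None when nothing usable is found.

    Order: explicit --eggnog-data-dir value, then EGGNOG_DATA_DIR, then the
    default directory, which only counts if it exists and is non-empty.
    Explicit values are returned without an existence check here; the real
    function validates them in the lines that follow this hunk.
    """
    if cli_value:
        return Path(cli_value).expanduser()
    env_value = env.get("EGGNOG_DATA_DIR", "").strip()
    if env_value:
        return Path(env_value).expanduser()
    default_dir = default_eggnog_data_dir(env)
    if default_dir.is_dir() and any(default_dir.iterdir()):
        return default_dir
    return None
```

Passing the environment as a mapping (for example `os.environ`) keeps the sketch free of side effects, whereas the function in the diff writes the chosen path back into `os.environ["EGGNOG_DATA_DIR"]`.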
@@ -3264,7 +3272,7 @@ def pipeline(
                  help="Complementarity size to compute (0 disables).",
              ),
              lowmem: bool = typer.Option(
-                 False,
+                 True,
                  "--lowmem/--fullmem",
                  "--low-mem/--full-mem",
                  help="Run eggNOG-mapper with reduced memory footprint by omitting --dbmem.",
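
The default flip above means `--dbmem` is no longer passed to eggNOG-mapper unless `--fullmem` is given. The sketch below illustrates that mapping only. `build_emapper_command` is a hypothetical helper, and the argument list is an assumption: the diff does not show how `moducomp` actually assembles its `emapper.py` call.

```python
from typing import List


def build_emapper_command(
    fasta_path: str, out_prefix: str, ncpus: int, lowmem: bool = True
) -> List[str]:
    """Build an illustrative emapper.py argument list.

    lowmem=True (the 0.7.16 default) omits --dbmem, so eggNOG-mapper reads the
    annotation database from disk instead of loading it into RAM.
    """
    cmd = ["emapper.py", "-i", fasta_path, "-o", out_prefix, "--cpu", str(ncpus)]
    if not lowmem:
        # Full-memory mode: ask eggNOG-mapper to load the database into memory.
        cmd.append("--dbmem")
    return cmd
```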
@@ -3441,6 +3449,7 @@ def _run_pipeline_core(
     tmp_emapper_output_dir = f"{get_tmp_dir(savedir)}/emapper_output"
     tmp_emapper_file = f"{tmp_emapper_output_dir}/emapper_out.emapper.annotations"
     ko_matrix_path = f"{savedir}/kos_matrix.csv"
+    kpct_outprefix = "output_give_completeness"
     # Process annotations and create KO matrix
     if os.path.exists(ko_matrix_path):
@@ -3509,7 +3518,6 @@ def _run_pipeline_core(
         )
     else:
         # Set up KPCT processing
-        kpct_outprefix = "output_give_completeness"
         kpct_input_file = os.path.join(savedir, "ko_file_for_kpct.txt")
         # Check if KPCT output already exists
@@ -3594,7 +3602,7 @@ def _run_pipeline_core(
         try:
             validate(
                 savedir=savedir,
-                mode="ko-matrix",
+                mode="pipeline",
                 calculate_complementarity=calculate_complementarity,
                 kpct_outprefix=kpct_outprefix,
                 strict=validate_strict,
@@ -4143,24 +4151,32 @@ def analyze_ko_matrix(
         # Generate final resource usage summary
         log_final_resource_summary(resource_log_file, start_time, logger, verbose)
-        # Display pipeline completion summary
-        display_pipeline_completion_summary(start_time, savedir, logger, verbose)
         if run_validation:
             logger.info("Running post-run validation checks.")
             report_path = None
             if validation_report:
                 report_path = os.path.join(savedir, "validation_report.json")
-            validate(
-                savedir=savedir,
-                mode="ko-matrix",
-                calculate_complementarity=calculate_complementarity,
-                kpct_outprefix=kpct_outprefix,
-                strict=validate_strict,
-                report=report_path,
-                verbose=verbose,
-                log_level=log_level,
-            )
+            try:
+                validate(
+                    savedir=savedir,
+                    mode="ko-matrix",
+                    calculate_complementarity=calculate_complementarity,
+                    kpct_outprefix=kpct_outprefix,
+                    strict=validate_strict,
+                    report=report_path,
+                    verbose=verbose,
+                    log_level=log_level,
+                )
+            except typer.Exit as exc:
+                if logger:
+                    logger.error("Validation failed with exit code %s.", exc.exit_code)
+                    logger.error("Outputs written to: %s", savedir)
+                    if report_path:
+                        logger.error("Validation report: %s", report_path)
+                raise
+        # Display pipeline completion summary
+        display_pipeline_completion_summary(start_time, savedir, logger, verbose)
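
The hunk above wraps `validate(...)` in a `try`/`except typer.Exit`, logs where the outputs and report were written, re-raises, and only then shows the completion summary. A minimal sketch of that log-and-re-raise ordering, under stated assumptions: `ExitSignal` is a stand-in for `typer.Exit` so the example needs no third-party import, and `finish_run` and its callable parameters are hypothetical names.

```python
import logging
from typing import Callable, Optional


class ExitSignal(Exception):
    """Stand-in for typer.Exit; carries an exit_code attribute."""

    def __init__(self, exit_code: int = 0) -> None:
        super().__init__(exit_code)
        self.exit_code = exit_code


def finish_run(
    validate: Callable[[], None],
    savedir: str,
    report_path: Optional[str],
    logger: logging.Logger,
    show_summary: Callable[[], None],
) -> None:
    """Run validation first; show the completion summary only if it passes."""
    try:
        validate()
    except ExitSignal as exc:
        # Log where to look, then let the original exit code propagate.
        logger.error("Validation failed with exit code %s.", exc.exit_code)
        logger.error("Outputs written to: %s", savedir)
        if report_path:
            logger.error("Validation report: %s", report_path)
        raise
    show_summary()
```

Because the summary call sits after the `try` block, a failed validation never prints a completion summary, which matches the reordering in the diff.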
     except Exception as e:
         if logger:
@@ -4745,7 +4761,7 @@ def validate(
             )
     # Complementarity checks
-    comp_pattern = re.compile(r"module_completeness_complementarity_(\\d+)member\\.tsv$")
+    comp_pattern = re.compile(r"module_completeness_complementarity_(\d+)member\.tsv$")
     comp_files: Dict[int, Path] = {}
     for file_path in Path(savedir).glob("module_completeness_complementarity_*member.tsv"):
         match = comp_pattern.match(file_path.name)
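
The last hunk fixes doubled backslashes inside a raw string. In a raw string, `\\d` reaches the regex engine as an escaped backslash followed by the letter `d`, so the 0.7.13 pattern could not match real output file names. A short check of both patterns against a file name of the shape used by the `glob` call above:

```python
import re

# 0.7.13 pattern: "\\d+" means a literal backslash followed by one or more "d".
old_pattern = re.compile(r"module_completeness_complementarity_(\\d+)member\\.tsv$")
# 0.7.16 pattern: "\d+" matches digits and "\." matches a literal dot.
new_pattern = re.compile(r"module_completeness_complementarity_(\d+)member\.tsv$")

name = "module_completeness_complementarity_2member.tsv"
old_match = old_pattern.match(name)  # None: the file name contains no backslash
new_match = new_pattern.match(name)
member_count = int(new_match.group(1)) if new_match else None
```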

{moducomp-0.7.13 → moducomp-0.7.16}/pixi.lock RENAMED

@@ -1340,8 +1340,8 @@ packages:
   timestamp: 1737229717596
 - pypi: ./
   name: moducomp
-  version: 0.7.9
-  sha256: 69bbe817500b6ae5011deb9c4ec4639f8a4c8b54a275b58ac1e8b3ad2d84e798
+  version: 0.7.14
+  sha256: 026a6159ce9247e5ce3136eee254748b471c239b82283298602c7c42837348be
   requires_dist:
   - typer>=0.9.1,<0.10.0
   - pandas>=1.5,<2.3

{moducomp-0.7.13 → moducomp-0.7.16}/recipe.yaml RENAMED

@@ -1,5 +1,5 @@
 context:
-  version: 0.7.13
+  version: 0.7.16
 package:
   name: moducomp
@@ -7,7 +7,7 @@ package:
 source:
 - url: https://pypi.org/packages/source/m/moducomp/moducomp-${{ version }}.tar.gz
-  sha256: bc3be76ad45937642876f843266362e1a664c4052f7a38b331284341e0cb515c
+  sha256: 917f8ebcba65b5607985fa5fdd7fc0823cd3edb401cc34e2efa0d6f9e650a62b
 build:
   script:

{moducomp-0.7.13 → moducomp-0.7.16}/.gitignore RENAMED

File without changes

{moducomp-0.7.13 → moducomp-0.7.16}/LICENSE.txt RENAMED

File without changes

{moducomp-0.7.13 → moducomp-0.7.16}/moducomp/__main__.py RENAMED

File without changes

{moducomp-0.7.13 → moducomp-0.7.16}/moducomp/data/test_genomes/IMG2562617132.faa RENAMED

File without changes

{moducomp-0.7.13 → moducomp-0.7.16}/moducomp/data/test_genomes/IMG2568526683.faa RENAMED

File without changes

{moducomp-0.7.13 → moducomp-0.7.16}/moducomp/data/test_genomes/IMG2740892217.faa RENAMED

File without changes

{moducomp-0.7.13 → moducomp-0.7.16}/pyproject.toml RENAMED

File without changes
