PyPI - moducomp - Versions diffs - 0.7.11__tar.gz → 0.7.13__tar.gz - Mend

moducomp 0.7.11tar.gz → 0.7.13tar.gz


Files changed (13)

{moducomp-0.7.11 → moducomp-0.7.13}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: moducomp
-Version: 0.7.11
+Version: 0.7.13
 Summary: moducomp: metabolic module completeness and complementarity for microbiomes.
 Keywords: bioinformatics,microbiome,metabolic,kegg,genomics
 Author-email: "Juan C. Villada" <jvillada@lbl.gov>
@@ -38,6 +38,7 @@ Project-URL: Repository, https://github.com/NeLLi-team/moducomp
 - Tracks and reports the actual proteins that are responsible for the completion of the module in the combination of N genomes.
 - **Automatic resource monitoring** with timestamped logs tracking CPU usage, memory consumption, and runtime for reproducibility.
 - **Consistent logging to stdout/stderr** with a per-command resource summary emitted at the end of each run.
+- **Built-in validation (`moducomp validate`)** for scientific consistency checks across annotations, KO matrices, KPCT outputs, and complementarity reports.
 ## Installation (Recommended)
@@ -59,16 +60,16 @@ pixi global install \
 ## Setup data (required)
-`moducomp` needs the eggNOG-mapper database to run. Download it once:
+`moducomp` needs the eggNOG-mapper database to run. The primary (recommended) way to download it is using the `download_eggnog_data.py` wrapper, which mirrors the upstream downloader behavior. For upstream details, see the eggNOG-mapper setup guide: [eggNOG-mapper database setup](https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.13#user-content-Setup).
 ```bash
 export EGGNOG_DATA_DIR="/path/to/eggnog-data"
-moducomp download-eggnog-data --eggnog-data-dir "$EGGNOG_DATA_DIR"
-# or the standalone script:
-# download_eggnog_data.py
+download_eggnog_data.py --eggnog-data-dir "$EGGNOG_DATA_DIR"
+# equivalent:
+# moducomp download-eggnog-data --eggnog-data-dir "$EGGNOG_DATA_DIR"
 ```
-If `EGGNOG_DATA_DIR` is not set, `moducomp download-eggnog-data` defaults to `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`.
+If `EGGNOG_DATA_DIR` is not set, the downloader defaults to `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`.
 ### Quick test
@@ -148,6 +149,9 @@ This section lists all CLI options implemented today, along with their default v
 | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after completion. |
 | `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `fullmem` | Run eggNOG-mapper without `--dbmem` to reduce RAM. |
 | `--verbose/--quiet` | `false` | Enable verbose progress output. |
+| `--validate/--no-validate` | `validate` | Run post-run validation checks. |
+| `--validate-report/--no-validate-report` | `validate-report` | Write `validation_report.json` in the output directory. |
+| `--validate-strict/--validate-lenient` | `lenient` | Treat validation warnings as failures when strict. |
 | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
 | `--eggnog-data-dir` | `EGGNOG_DATA_DIR` | Path to eggNOG-mapper data (sets `EGGNOG_DATA_DIR`). |
@@ -162,6 +166,9 @@ This section lists all CLI options implemented today, along with their default v
 | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after the test completes. |
 | `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `lowmem` | Low-memory mode is the default for tests. |
 | `--verbose/--quiet` | `verbose` | Verbose output is the default for tests. |
+| `--validate/--no-validate` | `validate` | Run post-run validation checks. |
+| `--validate-report/--no-validate-report` | `validate-report` | Write `validation_report.json` in the output directory. |
+| `--validate-strict/--validate-lenient` | `lenient` | Treat validation warnings as failures when strict. |
 | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
 | `--eggnog-data-dir` | `EGGNOG_DATA_DIR` | Path to eggNOG-mapper data (sets `EGGNOG_DATA_DIR`). |
@@ -174,6 +181,21 @@ This section lists all CLI options implemented today, along with their default v
 | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after completion. |
 | `--ncpus`, `-n` | `16` | CPU cores for KPCT parallel processing. |
 | `--verbose/--quiet` | `false` | Enable verbose progress output. |
+| `--validate/--no-validate` | `validate` | Run post-run validation checks. |
+| `--validate-report/--no-validate-report` | `validate-report` | Write `validation_report.json` in the output directory. |
+| `--validate-strict/--validate-lenient` | `lenient` | Treat validation warnings as failures when strict. |
+| `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
+#### `validate` command (positional args: `savedir`)
+| Option | Default | Description |
+| --- | --- | --- |
+| `--mode` | `auto` | Validation mode: `auto`, `pipeline`, or `ko-matrix`. |
+| `--calculate-complementarity`, `-c` | `auto-detect` | Expected complementarity size (0 disables). |
+| `--kpct-outprefix` | `output_give_completeness` | KPCT output prefix used during analysis. |
+| `--strict/--lenient` | `lenient` | Treat warnings as failures when strict. |
+| `--report` | _none_ | Write JSON validation report to this path. |
+| `--verbose/--quiet` | `false` | Enable verbose progress output. |
 | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
 #### `download-eggnog-data` command
@@ -198,6 +220,33 @@ This section lists all CLI options implemented today, along with their default v
 - For KPCT parallel processing, the system creates the same number of chunks as CPU cores specified
 - Example: `--ncpus 8` will use 8 cores and create 8 chunks for optimal parallel processing
+### Validation (QC)
+Use the built-in validator to check scientific consistency across outputs after a run. The validator compares:
+- KO sets and counts between eggNOG-mapper annotations and `kos_matrix.csv`
+- KO sets between `kos_matrix.csv` and `ko_file_for_kpct.txt`
+- KPCT contigs vs pathways outputs
+- Module completeness ranges and combination naming
+- Complementarity reports versus module completeness values
+- Protein provenance fields (pipeline mode) or placeholders (KO-matrix mode)
+Example:
+```bash
+# Validation runs by default after pipeline/analyze/test.
+# Use --no-validate to disable or --no-validate-report to skip JSON output.
+# When validation reports errors (or warnings in strict mode), the command exits non-zero.
+# Validate a pipeline run and write a JSON report
+moducomp validate /path/to/output --mode pipeline --report /path/to/output/validation_report.json
+# Validate KO-matrix mode outputs (non-default KPCT prefix)
+moducomp validate /path/to/output --mode ko-matrix --kpct-outprefix my_prefix
+# Treat warnings as failures
+moducomp validate /path/to/output --strict
+```
 ### ⚠️ Important note 1
 **Prepare FAA files**: Ensure FAA headers are in the form `>genomeName|proteinId`, or use the `--adapt-headers` option to format your headers into `>fileName_prefix|protein_id_counter`.
@@ -211,7 +260,7 @@ This section lists all CLI options implemented today, along with their default v
 You can override the bundled data location with `MODUCOMP_DATA_DIR`.
 When working from source, the bundled test genomes live at `moducomp/data/test_genomes`.
-`download_eggnog_data.py` is provided by eggnog-mapper and is available in the Pixi environment (or via `pixi global` installs).
+`download_eggnog_data.py` is exposed by `moducomp` as a convenience wrapper for the eggnog-mapper downloader and is available in the Pixi environment (including `pixi global` installs).
 Pixi task (supports passing a custom location):
@@ -298,15 +347,38 @@ moducomp analyze-ko-matrix ./ko_matrix.csv ./output_moderate --ncpus 16 --calcul
 moducomp pipeline ./genomes ./output_lowmem --ncpus 8 --lowmem --calculate-complementarity 2
 ```
-## Output files
+## Expected outputs
+The sections below describe the expected output files, naming conventions, and the column-level meaning of each file. These details are the same for `moducomp pipeline` and `moducomp test` (pipeline mode), and the subset noted for `moducomp analyze-ko-matrix` (KO-matrix mode).
+**Naming conventions**
+Genome identifiers are stored as `taxon_oid`. In pipeline mode, ModuComp expects protein headers in the format `genome_id|protein_id`. If you set `--adapt-headers`, ModuComp rewrites headers to `>genomeName|protein_N`, where `genomeName` is the FAA filename stem. Combination identifiers use `__` (double underscore), for example `GenomeA__GenomeB`, and `n_members` in `module_completeness.tsv` records the size of each combination.
+**Pipeline mode outputs (`moducomp pipeline`, `moducomp test`)**
+- `emapper_out.emapper.annotations`: Full eggNOG-mapper annotations. The `#query` column must match `genome_id|protein_id`. `KEGG_ko` entries are prefixed `ko:KXXXXX` and are converted to `KXXXXX` for downstream matrices.
+- `kos_matrix.csv`: Genome × KO count matrix. Columns: `taxon_oid` followed by KO IDs (e.g., `K00001`). Values are integer protein counts per KO.
+- `ko_file_for_kpct.txt`: KPCT input file. Each line starts with `taxon_oid` followed by the set of KO IDs present in that genome or combination. If `--calculate-complementarity` is `N>=2`, combinations up to `N` are included as `GenomeA__GenomeB`.
+- `output_give_completeness_contigs.with_weights.tsv`: KPCT module results per genome/combination. Columns: `contig` (genome/combination ID), `module_accession`, `completeness` (0–100), `pathway_name`, `pathway_class`, `matching_ko` (KO weights), `missing_ko`.
+- `output_give_completeness_pathways.with_weights.tsv`: Same rows and order as the contigs file, but without the `contig` column. This is provided for compatibility with legacy tools; prefer the contigs file when you need genome-level provenance.
+- `module_completeness.tsv`: Pivoted module completeness matrix. Columns: `n_members`, `taxon_oid`, followed by KEGG module IDs (`M00001`, …). Values are numeric percentages in the range 0–100.
+- `module_completeness_complementarity_Nmember.tsv`: Complementarity report for `N`-member combinations (only when `--calculate-complementarity N` is set). Columns: `taxon_oid_1..N`, `completeness_taxon_oid_1..N`, `module_id`, `module_name`, `pathway_class`, `matching_ko`, `proteins_taxon_oid_1..N`. Protein fields list contributing proteins per KO (from eggNOG-mapper) as `{'KXXXXX': 'genome|protein'}`.
+- `logs/moducomp.log`: Detailed run log with structured progress messages and per-command resource summaries.
+- `logs/resource_usage_YYYYMMDD_HHMMSS.log`: Resource monitoring log capturing wall time, CPU time, CPU utilization, peak RAM, and exit code for each monitored command.
+- `tmp/` (only if `--keep-tmp`): Intermediate files such as `merged_genomes.faa`, `emapper_output/`, and KPCT chunk outputs.
+- `validation_report.json` (default when validation is enabled): JSON report produced by the validator.
-`moducomp` generates several output files in the specified output directory:
+**KO-matrix mode outputs (`moducomp analyze-ko-matrix`)**
-- **`kos_matrix.csv`**: Matrix of KO counts for each genome
-- **`module_completeness.tsv`**: Module completeness scores for individual genomes and combinations
-- **`module_completeness_complementarity_Nmember.tsv`**: Complementarity reports (if requested)
-- **`logs/resource_usage_YYYYMMDD_HHMMSS.log`**: Resource monitoring log with CPU, memory, and runtime metrics for reproducibility
-- **`logs/moducomp.log`**: Detailed pipeline execution log with a per-command resource summary at the end of the run
+- `kos_matrix.csv`: A copy of the input KO matrix (same format as above).
+- `ko_file_for_kpct.txt`: KPCT input generated from the KO matrix. If `--calculate-complementarity` is set, combination lines are added using `GenomeA__GenomeB` identifiers.
+- `output_give_completeness_contigs.with_weights.tsv`: KPCT module results per genome/combination (same format as pipeline mode).
+- `output_give_completeness_pathways.with_weights.tsv`: Same rows as the contigs file, without the `contig` column.
+- `module_completeness.tsv`: Module completeness matrix (same format as pipeline mode).
+- `module_completeness_complementarity_Nmember.tsv`: Complementarity report. Protein contribution columns are filled with `No protein data available for <genome>` because no eggNOG-mapper annotations are available in KO-matrix mode.
+- `logs/moducomp.log` and `logs/resource_usage_YYYYMMDD_HHMMSS.log`: Standard run logs and resource summaries.
+- `validation_report.json` (default when validation is enabled): JSON report produced by the validator.
 ## Citation
 Villada, JC. & Schulz, F. (2025). Assessment of metabolic module completeness of genomes and metabolic complementarity in microbiomes with `moducomp` . `moducomp` (v0.5.1) Zenodo. https://doi.org/10.5281/zenodo.16116092
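
The "Naming conventions" and `emapper_out.emapper.annotations` entries added in the hunk above describe two string conventions: `KEGG_ko` values carry a `ko:` prefix that is dropped for downstream matrices, and combination identifiers join genome IDs with `__`. The sketch below illustrates both under stated assumptions; it is not code from `moducomp`. The function names `normalize_kegg_ko` and `split_combination` are hypothetical, and the comma separator and `-` placeholder for empty cells are assumptions about the annotation format.

```python
def normalize_kegg_ko(cell: str) -> list[str]:
    """Convert a KEGG_ko cell such as 'ko:K00001,ko:K00002' to bare KO IDs."""
    # Assumption: multiple KOs are comma-separated and '-' marks "no KO assigned".
    if cell in ("", "-"):
        return []
    return [item.strip().removeprefix("ko:") for item in cell.split(",") if item.strip()]


def split_combination(taxon_oid: str) -> list[str]:
    """Split a combination ID such as 'GenomeA__GenomeB' into member genome IDs."""
    # Assumption: individual genome IDs never contain '__' themselves.
    return taxon_oid.split("__")


print(normalize_kegg_ko("ko:K00001,ko:K00002"))  # ['K00001', 'K00002']
print(split_combination("GenomeA__GenomeB"))     # ['GenomeA', 'GenomeB']
print(len(split_combination("GenomeA")))         # 1, i.e. a single-genome row
```

The length of the split list corresponds to what the README calls `n_members`. `str.removeprefix` requires Python 3.9 or later.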

{moducomp-0.7.11 → moducomp-0.7.13}/README.md RENAMED

@@ -13,6 +13,7 @@
 - Tracks and reports the actual proteins that are responsible for the completion of the module in the combination of N genomes.
 - **Automatic resource monitoring** with timestamped logs tracking CPU usage, memory consumption, and runtime for reproducibility.
 - **Consistent logging to stdout/stderr** with a per-command resource summary emitted at the end of each run.
+- **Built-in validation (`moducomp validate`)** for scientific consistency checks across annotations, KO matrices, KPCT outputs, and complementarity reports.
 ## Installation (Recommended)
@@ -34,16 +35,16 @@ pixi global install \
 ## Setup data (required)
-`moducomp` needs the eggNOG-mapper database to run. Download it once:
+`moducomp` needs the eggNOG-mapper database to run. The primary (recommended) way to download it is using the `download_eggnog_data.py` wrapper, which mirrors the upstream downloader behavior. For upstream details, see the eggNOG-mapper setup guide: [eggNOG-mapper database setup](https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v2.1.5-to-v2.1.13#user-content-Setup).
 ```bash
 export EGGNOG_DATA_DIR="/path/to/eggnog-data"
-moducomp download-eggnog-data --eggnog-data-dir "$EGGNOG_DATA_DIR"
-# or the standalone script:
-# download_eggnog_data.py
+download_eggnog_data.py --eggnog-data-dir "$EGGNOG_DATA_DIR"
+# equivalent:
+# moducomp download-eggnog-data --eggnog-data-dir "$EGGNOG_DATA_DIR"
 ```
-If `EGGNOG_DATA_DIR` is not set, `moducomp download-eggnog-data` defaults to `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`.
+If `EGGNOG_DATA_DIR` is not set, the downloader defaults to `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog`.
 ### Quick test
@@ -123,6 +124,9 @@ This section lists all CLI options implemented today, along with their default v
 | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after completion. |
 | `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `fullmem` | Run eggNOG-mapper without `--dbmem` to reduce RAM. |
 | `--verbose/--quiet` | `false` | Enable verbose progress output. |
+| `--validate/--no-validate` | `validate` | Run post-run validation checks. |
+| `--validate-report/--no-validate-report` | `validate-report` | Write `validation_report.json` in the output directory. |
+| `--validate-strict/--validate-lenient` | `lenient` | Treat validation warnings as failures when strict. |
 | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
 | `--eggnog-data-dir` | `EGGNOG_DATA_DIR` | Path to eggNOG-mapper data (sets `EGGNOG_DATA_DIR`). |
@@ -137,6 +141,9 @@ This section lists all CLI options implemented today, along with their default v
 | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after the test completes. |
 | `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `lowmem` | Low-memory mode is the default for tests. |
 | `--verbose/--quiet` | `verbose` | Verbose output is the default for tests. |
+| `--validate/--no-validate` | `validate` | Run post-run validation checks. |
+| `--validate-report/--no-validate-report` | `validate-report` | Write `validation_report.json` in the output directory. |
+| `--validate-strict/--validate-lenient` | `lenient` | Treat validation warnings as failures when strict. |
 | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
 | `--eggnog-data-dir` | `EGGNOG_DATA_DIR` | Path to eggNOG-mapper data (sets `EGGNOG_DATA_DIR`). |
@@ -149,6 +156,21 @@ This section lists all CLI options implemented today, along with their default v
 | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after completion. |
 | `--ncpus`, `-n` | `16` | CPU cores for KPCT parallel processing. |
 | `--verbose/--quiet` | `false` | Enable verbose progress output. |
+| `--validate/--no-validate` | `validate` | Run post-run validation checks. |
+| `--validate-report/--no-validate-report` | `validate-report` | Write `validation_report.json` in the output directory. |
+| `--validate-strict/--validate-lenient` | `lenient` | Treat validation warnings as failures when strict. |
+| `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
+#### `validate` command (positional args: `savedir`)
+| Option | Default | Description |
+| --- | --- | --- |
+| `--mode` | `auto` | Validation mode: `auto`, `pipeline`, or `ko-matrix`. |
+| `--calculate-complementarity`, `-c` | `auto-detect` | Expected complementarity size (0 disables). |
+| `--kpct-outprefix` | `output_give_completeness` | KPCT output prefix used during analysis. |
+| `--strict/--lenient` | `lenient` | Treat warnings as failures when strict. |
+| `--report` | _none_ | Write JSON validation report to this path. |
+| `--verbose/--quiet` | `false` | Enable verbose progress output. |
 | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
 #### `download-eggnog-data` command
@@ -173,6 +195,33 @@ This section lists all CLI options implemented today, along with their default v
 - For KPCT parallel processing, the system creates the same number of chunks as CPU cores specified
 - Example: `--ncpus 8` will use 8 cores and create 8 chunks for optimal parallel processing
+### Validation (QC)
+Use the built-in validator to check scientific consistency across outputs after a run. The validator compares:
+- KO sets and counts between eggNOG-mapper annotations and `kos_matrix.csv`
+- KO sets between `kos_matrix.csv` and `ko_file_for_kpct.txt`
+- KPCT contigs vs pathways outputs
+- Module completeness ranges and combination naming
+- Complementarity reports versus module completeness values
+- Protein provenance fields (pipeline mode) or placeholders (KO-matrix mode)
+Example:
+```bash
+# Validation runs by default after pipeline/analyze/test.
+# Use --no-validate to disable or --no-validate-report to skip JSON output.
+# When validation reports errors (or warnings in strict mode), the command exits non-zero.
+# Validate a pipeline run and write a JSON report
+moducomp validate /path/to/output --mode pipeline --report /path/to/output/validation_report.json
+# Validate KO-matrix mode outputs (non-default KPCT prefix)
+moducomp validate /path/to/output --mode ko-matrix --kpct-outprefix my_prefix
+# Treat warnings as failures
+moducomp validate /path/to/output --strict
+```
 ### ⚠️ Important note 1
 **Prepare FAA files**: Ensure FAA headers are in the form `>genomeName|proteinId`, or use the `--adapt-headers` option to format your headers into `>fileName_prefix|protein_id_counter`.
@@ -186,7 +235,7 @@ This section lists all CLI options implemented today, along with their default v
 You can override the bundled data location with `MODUCOMP_DATA_DIR`.
 When working from source, the bundled test genomes live at `moducomp/data/test_genomes`.
-`download_eggnog_data.py` is provided by eggnog-mapper and is available in the Pixi environment (or via `pixi global` installs).
+`download_eggnog_data.py` is exposed by `moducomp` as a convenience wrapper for the eggnog-mapper downloader and is available in the Pixi environment (including `pixi global` installs).
 Pixi task (supports passing a custom location):
@@ -273,15 +322,38 @@ moducomp analyze-ko-matrix ./ko_matrix.csv ./output_moderate --ncpus 16 --calcul
 moducomp pipeline ./genomes ./output_lowmem --ncpus 8 --lowmem --calculate-complementarity 2
 ```
-## Output files
+## Expected outputs
+The sections below describe the expected output files, naming conventions, and the column-level meaning of each file. These details are the same for `moducomp pipeline` and `moducomp test` (pipeline mode), and the subset noted for `moducomp analyze-ko-matrix` (KO-matrix mode).
+**Naming conventions**
+Genome identifiers are stored as `taxon_oid`. In pipeline mode, ModuComp expects protein headers in the format `genome_id|protein_id`. If you set `--adapt-headers`, ModuComp rewrites headers to `>genomeName|protein_N`, where `genomeName` is the FAA filename stem. Combination identifiers use `__` (double underscore), for example `GenomeA__GenomeB`, and `n_members` in `module_completeness.tsv` records the size of each combination.
+**Pipeline mode outputs (`moducomp pipeline`, `moducomp test`)**
+- `emapper_out.emapper.annotations`: Full eggNOG-mapper annotations. The `#query` column must match `genome_id|protein_id`. `KEGG_ko` entries are prefixed `ko:KXXXXX` and are converted to `KXXXXX` for downstream matrices.
+- `kos_matrix.csv`: Genome × KO count matrix. Columns: `taxon_oid` followed by KO IDs (e.g., `K00001`). Values are integer protein counts per KO.
+- `ko_file_for_kpct.txt`: KPCT input file. Each line starts with `taxon_oid` followed by the set of KO IDs present in that genome or combination. If `--calculate-complementarity` is `N>=2`, combinations up to `N` are included as `GenomeA__GenomeB`.
+- `output_give_completeness_contigs.with_weights.tsv`: KPCT module results per genome/combination. Columns: `contig` (genome/combination ID), `module_accession`, `completeness` (0–100), `pathway_name`, `pathway_class`, `matching_ko` (KO weights), `missing_ko`.
+- `output_give_completeness_pathways.with_weights.tsv`: Same rows and order as the contigs file, but without the `contig` column. This is provided for compatibility with legacy tools; prefer the contigs file when you need genome-level provenance.
+- `module_completeness.tsv`: Pivoted module completeness matrix. Columns: `n_members`, `taxon_oid`, followed by KEGG module IDs (`M00001`, …). Values are numeric percentages in the range 0–100.
+- `module_completeness_complementarity_Nmember.tsv`: Complementarity report for `N`-member combinations (only when `--calculate-complementarity N` is set). Columns: `taxon_oid_1..N`, `completeness_taxon_oid_1..N`, `module_id`, `module_name`, `pathway_class`, `matching_ko`, `proteins_taxon_oid_1..N`. Protein fields list contributing proteins per KO (from eggNOG-mapper) as `{'KXXXXX': 'genome|protein'}`.
+- `logs/moducomp.log`: Detailed run log with structured progress messages and per-command resource summaries.
+- `logs/resource_usage_YYYYMMDD_HHMMSS.log`: Resource monitoring log capturing wall time, CPU time, CPU utilization, peak RAM, and exit code for each monitored command.
+- `tmp/` (only if `--keep-tmp`): Intermediate files such as `merged_genomes.faa`, `emapper_output/`, and KPCT chunk outputs.
+- `validation_report.json` (default when validation is enabled): JSON report produced by the validator.
-`moducomp` generates several output files in the specified output directory:
+**KO-matrix mode outputs (`moducomp analyze-ko-matrix`)**
-- **`kos_matrix.csv`**: Matrix of KO counts for each genome
-- **`module_completeness.tsv`**: Module completeness scores for individual genomes and combinations
-- **`module_completeness_complementarity_Nmember.tsv`**: Complementarity reports (if requested)
-- **`logs/resource_usage_YYYYMMDD_HHMMSS.log`**: Resource monitoring log with CPU, memory, and runtime metrics for reproducibility
-- **`logs/moducomp.log`**: Detailed pipeline execution log with a per-command resource summary at the end of the run
+- `kos_matrix.csv`: A copy of the input KO matrix (same format as above).
+- `ko_file_for_kpct.txt`: KPCT input generated from the KO matrix. If `--calculate-complementarity` is set, combination lines are added using `GenomeA__GenomeB` identifiers.
+- `output_give_completeness_contigs.with_weights.tsv`: KPCT module results per genome/combination (same format as pipeline mode).
+- `output_give_completeness_pathways.with_weights.tsv`: Same rows as the contigs file, without the `contig` column.
+- `module_completeness.tsv`: Module completeness matrix (same format as pipeline mode).
+- `module_completeness_complementarity_Nmember.tsv`: Complementarity report. Protein contribution columns are filled with `No protein data available for <genome>` because no eggNOG-mapper annotations are available in KO-matrix mode.
+- `logs/moducomp.log` and `logs/resource_usage_YYYYMMDD_HHMMSS.log`: Standard run logs and resource summaries.
+- `validation_report.json` (default when validation is enabled): JSON report produced by the validator.
 ## Citation
 Villada, JC. & Schulz, F. (2025). Assessment of metabolic module completeness of genomes and metabolic complementarity in microbiomes with `moducomp` . `moducomp` (v0.5.1) Zenodo. https://doi.org/10.5281/zenodo.16116092
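
One of the checks listed under "Validation (QC)" in the hunks above is that KO sets agree between `kos_matrix.csv` and `ko_file_for_kpct.txt`. The sketch below shows what such a comparison could look like for single-genome rows. It is an illustration under stated assumptions, not the validator shipped with `moducomp`: the helper names are hypothetical, the KPCT input is assumed to be whitespace-separated, and a KO is assumed to be "present" when its count is greater than zero.

```python
import csv
import io


def kos_from_matrix(csv_text: str) -> dict[str, set[str]]:
    """Map each taxon_oid to the set of KOs with a non-zero count."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        genome = row.pop("taxon_oid")
        # Assumption: values are integer protein counts, as the README states.
        result[genome] = {ko for ko, count in row.items() if int(count) > 0}
    return result


def kos_from_kpct(text: str) -> dict[str, set[str]]:
    """Map the first field of each line to the set of KO IDs that follow it."""
    result = {}
    for line in text.splitlines():
        fields = line.split()  # assumption: whitespace-separated fields
        if fields:
            result[fields[0]] = set(fields[1:])
    return result


def mismatched_genomes(matrix: dict[str, set[str]], kpct: dict[str, set[str]]) -> list[str]:
    """Return genomes whose KO set differs between the two sources."""
    return sorted(g for g, kos in matrix.items() if kpct.get(g) != kos)


# Small in-memory example (made-up data, for illustration only).
matrix_text = "taxon_oid,K00001,K00002\nGenomeA,2,0\nGenomeB,0,1\n"
kpct_text = "GenomeA\tK00001\nGenomeB\tK00001\n"

print(mismatched_genomes(kos_from_matrix(matrix_text), kos_from_kpct(kpct_text)))
# ['GenomeB']
```

Combination rows such as `GenomeA__GenomeB` appear only in the KPCT input, so this sketch iterates over the matrix rows and ignores them.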

{moducomp-0.7.11 → moducomp-0.7.13}/moducomp/__init__.py RENAMED

@@ -2,7 +2,7 @@
 moducomp: metabolic module completeness and complementarity for microbiomes.
 """
-__version__ = "0.7.11"
+__version__ = "0.7.13"
 __author__ = "Juan C. Villada"
 __email__ = "jvillada@lbl.gov"
 __title__ = "moducomp"
