moducomp-0.7.8-py3-none-any.whl → moducomp-0.7.10-py3-none-any.whl

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: moducomp
- Version: 0.7.8
+ Version: 0.7.10
  Summary: moducomp: metabolic module completeness and complementarity for microbiomes.
  Keywords: bioinformatics,microbiome,metabolic,kegg,genomics
  Author-email: "Juan C. Villada" <jvillada@lbl.gov>
@@ -37,6 +37,7 @@ Project-URL: Repository, https://github.com/NeLLi-team/moducomp
  - Generation of complementarity reports highlighting modules completed through genome partnerships.
  - Tracks and reports the actual proteins that are responsible for the completion of the module in the combination of N genomes.
  - **Automatic resource monitoring** with timestamped logs tracking CPU usage, memory consumption, and runtime for reproducibility.
+ - **Consistent logging to stdout/stderr** with a per-command resource summary emitted at the end of each run.
 
  ## Installation (Recommended)
 
@@ -77,6 +78,8 @@ Small test data sets ship with `moducomp`. After installation you can confirm th
  moducomp test --ncpus 16 --calculate-complementarity 2 --eggnog-data-dir "$EGGNOG_DATA_DIR"
  ```
 
+ The test command runs in low-memory mode by default. If you have plenty of RAM and want full-memory mode, add `--fullmem` (or `--full-mem`).
+
  ### Developer install (Pixi)
 
  If you want to download the code and develop locally:
  If you want to download the code and develop locally:
@@ -105,6 +108,82 @@ You should see the command line help without errors.
 
  `moducomp` provides two main commands: `pipeline` and `analyze-ko-matrix`. You can run these commands using Pixi tasks defined in `pyproject.toml` or directly within the Pixi environment.
 
+ ### Pipeline overview
+
+ The diagram below shows the main stages executed by ModuComp.
+
+ ```mermaid
+ graph TD
+ A([Start run]) --> B[Initialize logging and resource monitoring]
+ B --> C{Input type}
+ C -->|pipeline| D[Validate genome directory]
+ C -->|analyze-ko-matrix| H[Load existing KO matrix]
+ D --> E[Prepare genomes: adapt headers or copy to tmp]
+ E --> F[Merge genomes into single FAA]
+ F --> G[Run eggNOG-mapper (if needed)]
+ G --> H[Create KO matrix (`kos_matrix.csv`)]
+ H --> I[Convert KO matrix to KPCT input]
+ I --> J[Run KPCT (parallel with fallback)]
+ J --> K[Create module completeness matrix]
+ K --> L{Complementarity requested?}
+ L -->|Yes| M[Generate complementarity report(s)]
+ L -->|No| N[Skip]
+ M --> O[Write outputs + logs]
+ N --> O
+ O --> P[Optional cleanup of `tmp/`]
+ P --> Q([Pipeline complete])
+ ```
+
+ ### CLI options and defaults
+
+ This section lists all CLI options implemented today, along with their default values.
+
+ #### `pipeline` command (positional args: `genomedir`, `savedir`)
+
+ | Option | Default | Description |
+ | --- | --- | --- |
+ | `--ncpus`, `-n` | `16` | Number of CPU cores to use for eggNOG-mapper and KPCT. |
+ | `--calculate-complementarity`, `-c` | `0` | Complementarity size to compute (0 disables). |
+ | `--adapt-headers/--no-adapt-headers` | `false` | Adapt FASTA headers to `genome|protein_N`. |
+ | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after completion. |
+ | `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `fullmem` | Run eggNOG-mapper without `--dbmem` to reduce RAM. |
+ | `--verbose/--quiet` | `false` | Enable verbose progress output. |
+ | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
+ | `--eggnog-data-dir` | `EGGNOG_DATA_DIR` | Path to eggNOG-mapper data (sets `EGGNOG_DATA_DIR`). |
+
+ #### `test` command (bundled test genomes)
+
+ | Option | Default | Description |
+ | --- | --- | --- |
+ | `--output-dir`, `-o` | `output_test_moducomp_<DATETIME>` | Output directory for test run. |
+ | `--ncpus`, `-n` | `2` | CPU cores for the test run. |
+ | `--calculate-complementarity`, `-c` | `2` | Complementarity size to compute (0 disables). |
+ | `--adapt-headers/--no-adapt-headers` | `false` | Adapt FASTA headers before the test. |
+ | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after the test completes. |
+ | `--lowmem/--fullmem` (`--low-mem/--full-mem`) | `lowmem` | Low-memory mode is the default for tests. |
+ | `--verbose/--quiet` | `verbose` | Verbose output is the default for tests. |
+ | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
+ | `--eggnog-data-dir` | `EGGNOG_DATA_DIR` | Path to eggNOG-mapper data (sets `EGGNOG_DATA_DIR`). |
+
+ #### `analyze-ko-matrix` command (positional args: `kos_matrix`, `savedir`)
+
+ | Option | Default | Description |
+ | --- | --- | --- |
+ | `--calculate-complementarity`, `-c` | `0` | Complementarity size to compute (0 disables). |
+ | `--kpct-outprefix` | `output_give_completeness` | Prefix for KPCT output files. |
+ | `--del-tmp/--keep-tmp` | `true` | Delete temporary files after completion. |
+ | `--ncpus`, `-n` | `16` | CPU cores for KPCT parallel processing. |
+ | `--verbose/--quiet` | `false` | Enable verbose progress output. |
+ | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
+
+ #### `download-eggnog-data` command
+
+ | Option | Default | Description |
+ | --- | --- | --- |
+ | `--eggnog-data-dir` | `${XDG_DATA_HOME:-~/.local/share}/moducomp/eggnog` | Destination for eggNOG-mapper data (sets `EGGNOG_DATA_DIR`). |
+ | `--log-level`, `-l` | `INFO` | Logging level: `DEBUG`, `INFO`, `WARNING`, `ERROR`. |
+ | `--verbose/--quiet` | `verbose` | Stream downloader output to the console. |
+
  ### Performance and parallel processing
 
  `moducomp` includes **parallel processing capabilities** for the KPCT (KEGG Pathways Completeness Tool) analysis, which can significantly improve performance for large datasets:
@@ -125,7 +204,7 @@ You should see the command line help without errors.
 
  ### ⚠️ Important note 2
 
- `moducomp` is specifically designed for large scale analysis of microbiomes with hundreds of members, and works on Linux systems with at least **64GB of RAM**. Nevertheless, it can be run on **smaller systems with less RAM, using the flag `--lowmem` when running the `pipeline` command**.
+ `moducomp` is specifically designed for large scale analysis of microbiomes with hundreds of members, and works on Linux systems with at least **64GB of RAM**. Nevertheless, it can be run on **smaller systems with less RAM, using the flag `--lowmem` (`--low-mem`) when running the `pipeline` command**. The `test` command uses low-memory mode by default and can be switched to full memory with `--fullmem` (`--full-mem`).
 
  ### Notes on bundled test data
 
 
@@ -162,9 +241,9 @@ moducomp pipeline \
  --ncpus <number_of_cpus_to_use> \
  --calculate-complementarity <N> # 0 to disable, 2 for 2-member, 3 for 3-member complementarity.
  # Optional flags:
- # --lowmem # Optional: Use this if you have less than 64GB of RAM
+ # --lowmem/--fullmem # Optional: Use low-mem if you have less than 64GB of RAM (default is full mem)
  # --adapt-headers # If your FASTA headers need modification
- # --del-tmp # To delete temporary files
+ # --del-tmp/--keep-tmp # Delete or keep temporary files
  # --eggnog-data-dir /path # If EGGNOG_DATA_DIR is not set
  # --verbose # Enable verbose output with detailed progress information
  ```
@@ -183,7 +262,7 @@ moducomp analyze-ko-matrix \
  --calculate-complementarity <N> # 0 to disable, 2 for 2-member, 3 for 3-member complementarity.
 
  # Optional flags:
- # --del-tmp false
+ # --keep-tmp # Keep temporary files
  # --verbose # Enable verbose output with detailed progress information
  ```
 
@@ -227,7 +306,7 @@ moducomp pipeline ./genomes ./output_lowmem --ncpus 8 --lowmem --calculate-compl
  - **`module_completeness.tsv`**: Module completeness scores for individual genomes and combinations
  - **`module_completeness_complementarity_Nmember.tsv`**: Complementarity reports (if requested)
  - **`logs/resource_usage_YYYYMMDD_HHMMSS.log`**: Resource monitoring log with CPU, memory, and runtime metrics for reproducibility
- - **`logs/moducomp.log`**: Detailed pipeline execution log
+ - **`logs/moducomp.log`**: Detailed pipeline execution log with a per-command resource summary at the end of the run
 
  ## Citation
  Villada, JC. & Schulz, F. (2025). Assessment of metabolic module completeness of genomes and metabolic complementarity in microbiomes with `moducomp` . `moducomp` (v0.5.1) Zenodo. https://doi.org/10.5281/zenodo.16116092
@@ -0,0 +1,11 @@
+ moducomp/__init__.py,sha256=KXXHUQxz2yNFGf1_6cHMtl1fr2gbXjF6UEIzns9QBTM,659
+ moducomp/__main__.py,sha256=1O2pv6IGjUgqnbqsiMLtVqjxWQpRtZUjp8LDljZ1bsI,185
+ moducomp/moducomp.py,sha256=R4_mXvfpe_ojfDKibduMvgkTC1QDn4sFUt9TFc9xVUw,142734
+ moducomp/data/test_genomes/IMG2562617132.faa,sha256=gZPh-08pMRdAWJRr3__TbnU1F68CdkDb3gxtpaCLTTc,356863
+ moducomp/data/test_genomes/IMG2568526683.faa,sha256=PxFJwe-68UGw7il1hGlNhZt4-2WzzxXxGE1GTskDnow,343109
+ moducomp/data/test_genomes/IMG2740892217.faa,sha256=WsId4sIPxENbqF6tYFouAgDCy6T0SXNY6TywxBNe-3E,548954
+ moducomp-0.7.10.dist-info/entry_points.txt,sha256=dwt0_w7Ex9p1vhfp2fl4WXJLBh50u9fXTRNlAOJkAd4,114
+ moducomp-0.7.10.dist-info/licenses/LICENSE.txt,sha256=pt0cfIq9Wop21KDZYyQgP0M1YWYvKG0PomA5cUDC4TI,1536
+ moducomp-0.7.10.dist-info/WHEEL,sha256=_2ozNFCLWc93bK4WKHCO-eDUENDlo-dgc9cU3qokYO4,82
+ moducomp-0.7.10.dist-info/METADATA,sha256=_WnWpR9pSOpKcgNOolcf1uZaUYLgg0udk1YrYxqu0A4,14726
+ moducomp-0.7.10.dist-info/RECORD,,
@@ -1,11 +0,0 @@
- moducomp/__init__.py,sha256=QmRH9aaqfY1UP0hmv177LVOcfm5-8PF3fBLWN6Mh0Lo,658
- moducomp/__main__.py,sha256=1O2pv6IGjUgqnbqsiMLtVqjxWQpRtZUjp8LDljZ1bsI,185
- moducomp/moducomp.py,sha256=Tt4iR9P18NGo8MlFfQ5hEq7kHo1BvQpn86emUgqpOLU,138425
- moducomp/data/test_genomes/IMG2562617132.faa,sha256=gZPh-08pMRdAWJRr3__TbnU1F68CdkDb3gxtpaCLTTc,356863
- moducomp/data/test_genomes/IMG2568526683.faa,sha256=PxFJwe-68UGw7il1hGlNhZt4-2WzzxXxGE1GTskDnow,343109
- moducomp/data/test_genomes/IMG2740892217.faa,sha256=WsId4sIPxENbqF6tYFouAgDCy6T0SXNY6TywxBNe-3E,548954
- moducomp-0.7.8.dist-info/entry_points.txt,sha256=dwt0_w7Ex9p1vhfp2fl4WXJLBh50u9fXTRNlAOJkAd4,114
- moducomp-0.7.8.dist-info/licenses/LICENSE.txt,sha256=pt0cfIq9Wop21KDZYyQgP0M1YWYvKG0PomA5cUDC4TI,1536
- moducomp-0.7.8.dist-info/WHEEL,sha256=_2ozNFCLWc93bK4WKHCO-eDUENDlo-dgc9cU3qokYO4,82
- moducomp-0.7.8.dist-info/METADATA,sha256=Ias2JiXgJluXxTFNLPhOpKJuIYJxsb4T2D5DwqVBD-c,10474
- moducomp-0.7.8.dist-info/RECORD,,
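
The two RECORD listings above can be compared mechanically. Each row has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with the `=` padding removed. The sketch below is illustrative only: the helper names `record_digest`, `parse_record_line`, and `changed_paths` are hypothetical and not part of `moducomp`, and the parser assumes paths need no CSV quoting, which holds for the rows shown here but not for every wheel.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest: URL-safe base64 of SHA-256, '=' padding stripped."""
    raw = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def parse_record_line(line: str):
    """Split one RECORD row into (path, digest, size).

    Assumes the path contains no quoted commas. The RECORD file's own row
    has empty digest and size fields, so size is returned as None there.
    """
    path, digest, size = line.strip().rsplit(",", 2)
    return path, digest, int(size) if size else None


def changed_paths(old_rows, new_rows):
    """List paths present in both listings whose digests differ."""
    old = {path: digest for path, digest, _ in map(parse_record_line, old_rows)}
    new = {path: digest for path, digest, _ in map(parse_record_line, new_rows)}
    return sorted(p for p in old.keys() & new.keys() if old[p] != new[p])


if __name__ == "__main__":
    old_rows = [
        "moducomp/__init__.py,sha256=QmRH9aaqfY1UP0hmv177LVOcfm5-8PF3fBLWN6Mh0Lo,658",
        "moducomp/__main__.py,sha256=1O2pv6IGjUgqnbqsiMLtVqjxWQpRtZUjp8LDljZ1bsI,185",
    ]
    new_rows = [
        "moducomp/__init__.py,sha256=KXXHUQxz2yNFGf1_6cHMtl1fr2gbXjF6UEIzns9QBTM,659",
        "moducomp/__main__.py,sha256=1O2pv6IGjUgqnbqsiMLtVqjxWQpRtZUjp8LDljZ1bsI,185",
    ]
    print(changed_paths(old_rows, new_rows))
```

Files under the versioned `dist-info` directory never match by path across releases, so a comparison like this only reports changes in the package's own modules and data files.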