pywombat 1.0.2__tar.gz → 1.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {pywombat-1.0.2 → pywombat-1.1.0}/CHANGELOG.md +57 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/PKG-INFO +143 -48
- {pywombat-1.0.2 → pywombat-1.1.0}/README.md +139 -47
- {pywombat-1.0.2 → pywombat-1.1.0}/pyproject.toml +9 -1
- {pywombat-1.0.2 → pywombat-1.1.0}/src/pywombat/cli.py +369 -60
- pywombat-1.1.0/uv.lock +276 -0
- pywombat-1.0.2/uv.lock +0 -127
- {pywombat-1.0.2 → pywombat-1.1.0}/.github/copilot-instructions.md +0 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/.github/workflows/publish.yml +0 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/.gitignore +0 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/.python-version +0 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/QUICKSTART.md +0 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/examples/README.md +0 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/examples/de_novo_mutations.yml +0 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/examples/rare_variants_high_impact.yml +0 -0
- {pywombat-1.0.2 → pywombat-1.1.0}/src/pywombat/__init__.py +0 -0

@@ -5,6 +5,63 @@ All notable changes to PyWombat will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.1.0] - 2026-02-05
+
+### Added
+
+- **Memory-Optimized Two-Step Workflow**: New `wombat prepare` command for preprocessing large files
+  - Converts TSV/TSV.gz to Parquet format with pre-expanded INFO fields
+  - Processes files in chunks (50k rows default) to handle files of any size
+  - Applies memory-efficient data types (Categorical, UInt32, etc.)
+  - Reduces file size by ~30% compared to gzipped TSV
+
+- **Parquet Input Support**: `wombat filter` now accepts both TSV and Parquet input
+  - Auto-detects input format (TSV, TSV.gz, or Parquet)
+  - Pre-filtering optimization: Applies expression filters BEFORE melting samples
+  - Reduces memory usage by 95%+ for large files (e.g., 200GB → 1.2GB for 38-sample, 4.2M variant dataset)
+  - Processing time improved from minutes/OOM to <1 second for filtered datasets
+
+- **Subcommand Architecture**: Converted CLI to use Click groups
+  - `wombat prepare`: Preprocess TSV to Parquet
+  - `wombat filter`: Process and filter data (replaces old direct command)
+  - **Breaking Change**: Old syntax `wombat input.tsv` no longer works, use `wombat filter input.tsv`
+
+- **Test Suite**: Added comprehensive pytest test suite
+  - 22 tests covering CLI structure, prepare command, and filter command
+  - Test fixtures for creating synthetic test data
+  - Integration tests with real data validation
+  - Added pytest and pytest-cov to dev dependencies
+
+### Changed
+
+- **CLI Architecture**: Restructured from single command to group-based subcommands
+- **Filter Command**: Now applies expression filters before melting when using Parquet input
+- **Sample Column Detection**: Improved heuristics to work with both TSV and Parquet formats
+- **Documentation**: Updated README with two-step workflow examples and memory comparison table
+
+### Fixed
+
+- **INFO Field Extraction**: Fixed column index detection in `prepare` command (was using hardcoded index)
+- **Type Casting**: Added explicit `.cast(pl.Utf8)` to preserve string types when all values are NULL
+- **Parquet Processing**: Fixed `format_bcftools_tsv_minimal` to work without `(null)` column
+
+### Performance
+
+- **Memory Usage**: 95%+ reduction for large files with expression filters
+  - Example: 38 samples, 4.2M variants
+    - Before: 200GB+ (OOM failure)
+    - After: ~1.2GB peak memory
+- **Processing Speed**: <1 second for filtered datasets (vs minutes or failure before)
+- **Pre-filtering**: Expression filters applied before melting reduces data expansion
+
+### Documentation
+
+- Added memory optimization workflow section to README
+- Added performance comparison table showing memory/time improvements
+- Updated all examples to use new `wombat filter` syntax
+- Added section explaining when to use `prepare` command
+- Documented two-step workflow benefits and use cases
+
 ## [1.0.1] - 2026-01-24
 
 ### Added
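
The "Memory-Optimized Two-Step Workflow" and "Type Casting" entries above describe the core of the new prepare step. A minimal Polars sketch of that idea, assuming generic column names such as `CHROM`, `POS`, and a raw `INFO` string; the real `wombat prepare` expands every INFO key and works on explicit ~50k-row chunks:

```python
import polars as pl

def prepare_sketch(tsv_path: str, parquet_path: str) -> None:
    lf = pl.scan_csv(tsv_path, separator="\t")
    lf = lf.with_columns(
        pl.col("CHROM").cast(pl.Categorical),   # low-cardinality column -> Categorical
        pl.col("POS").cast(pl.UInt32),          # positions fit comfortably in UInt32
        # Pull one INFO key out into its own column; the explicit Utf8 cast keeps
        # the dtype stable even when every extracted value happens to be NULL.
        pl.col("INFO").str.extract(r"(?:^|;)AF=([^;]+)", 1).cast(pl.Utf8).alias("AF"),
    )
    lf.sink_parquet(parquet_path)               # streaming write, bounded memory

prepare_sketch("input.tsv", "prepared.parquet")
```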

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pywombat
-Version: 1.0.2
+Version: 1.1.0
 Summary: A CLI tool for processing and filtering bcftools tabulated TSV files with pedigree support
 Project-URL: Homepage, https://github.com/bourgeron-lab/pywombat
 Project-URL: Repository, https://github.com/bourgeron-lab/pywombat

@@ -18,6 +18,9 @@ Requires-Dist: click>=8.1.0
 Requires-Dist: polars>=0.19.0
 Requires-Dist: pyyaml>=6.0
 Requires-Dist: tqdm>=4.67.1
+Provides-Extra: dev
+Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
+Requires-Dist: pytest>=7.0.0; extra == 'dev'
 Description-Content-Type: text/markdown
 
 # PyWombat 🦘
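
The new `dev` extra pulls in pytest for the test suite mentioned in the changelog. A hypothetical test of the subcommand layout using Click's `CliRunner`, assuming the Click group is exposed as `cli` in `pywombat.cli` (the actual symbol name is not shown in this diff):

```python
from click.testing import CliRunner
from pywombat.cli import cli  # assumed name of the Click group; adjust to the real export

def test_filter_subcommand_exists():
    runner = CliRunner()
    result = runner.invoke(cli, ["filter", "--help"])
    assert result.exit_code == 0
    assert "Usage" in result.output
```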

@@ -29,14 +32,15 @@ A high-performance CLI tool for processing and filtering bcftools tabulated TSV
 
 ## Features
 
-✨ **Fast Processing**: Uses Polars for efficient data handling
-🔬 **Quality Filtering**: Configurable depth, quality, and VAF thresholds
-👨👩👧 **Pedigree Support**: Trio and family analysis with parent genotypes
-🧬 **De Novo Detection**: Sex-chromosome-aware DNM identification
-📊 **Flexible Output**: TSV, compressed TSV, or Parquet formats
-🎯 **Expression Filters**: Complex filtering with logical expressions
-🏷️ **Boolean Flag Support**: INFO field flags (PASS, DB, etc.) extracted as True/False columns
-⚡ **
+✨ **Fast Processing**: Uses Polars for efficient data handling
+🔬 **Quality Filtering**: Configurable depth, quality, and VAF thresholds
+👨👩👧 **Pedigree Support**: Trio and family analysis with parent genotypes
+🧬 **De Novo Detection**: Sex-chromosome-aware DNM identification
+📊 **Flexible Output**: TSV, compressed TSV, or Parquet formats
+🎯 **Expression Filters**: Complex filtering with logical expressions
+🏷️ **Boolean Flag Support**: INFO field flags (PASS, DB, etc.) extracted as True/False columns
+⚡ **Memory Optimized**: Two-step workflow for large files (prepare → filter)
+💾 **Parquet Support**: Pre-process large files for repeated, memory-efficient analysis
 
 ---
 

@@ -47,17 +51,37 @@ A high-performance CLI tool for processing and filtering bcftools tabulated TSV
 Use `uvx` to run PyWombat without installation:
 
 ```bash
-# Basic
-uvx pywombat input.tsv -o output
+# Basic filtering
+uvx pywombat filter input.tsv -o output
 
-# With
-uvx pywombat input.tsv -F examples/rare_variants_high_impact.yml -o output
+# With filter configuration
+uvx pywombat filter input.tsv -F examples/rare_variants_high_impact.yml -o output
 
 # De novo mutation detection
-uvx pywombat input.tsv --pedigree pedigree.tsv \
+uvx pywombat filter input.tsv --pedigree pedigree.tsv \
   -F examples/de_novo_mutations.yml -o denovo
 ```
 
+### For Large Files (>1GB or >50 samples)
+
+Use the two-step workflow for memory-efficient processing:
+
+```bash
+# Step 1: Prepare (one-time preprocessing)
+uvx pywombat prepare input.tsv.gz -o prepared.parquet
+
+# Step 2: Filter (fast, memory-efficient, can be run multiple times)
+uvx pywombat filter prepared.parquet \
+  -p pedigree.tsv \
+  -F config.yml \
+  -o filtered
+```
+
+**Benefits:**
+- Pre-expands INFO fields once (saves time on repeated filtering)
+- Applies filters before melting samples (reduces memory by 95%+)
+- Parquet format enables fast columnar access
+
 ### Installation for Development/Repeated Use
 
 ```bash

@@ -69,7 +93,7 @@ cd pywombat
 uv sync
 
 # Run with uv run
-uv run wombat input.tsv -o output
+uv run wombat filter input.tsv -o output
 ```
 
 ---

@@ -114,25 +138,62 @@ chr1 100 A T 2 0.5 30 true Sample2 1/1 18 99
 
 ---
 
-##
+## Commands
+
+PyWombat has two main commands:
+
+### `wombat prepare` - Preprocess Large Files
+
+Converts TSV/TSV.gz to optimized Parquet format with pre-expanded INFO fields:
+
+```bash
+# Basic usage
+wombat prepare input.tsv.gz -o prepared.parquet
+
+# With verbose output
+wombat prepare input.tsv.gz -o prepared.parquet -v
+
+# Adjust chunk size for memory constraints
+wombat prepare input.tsv.gz -o prepared.parquet --chunk-size 25000
+```
+
+**What it does:**
+- Extracts all INFO fields (VEP_*, AF, etc.) as separate columns
+- Keeps samples in wide format (not melted yet)
+- Writes memory-efficient Parquet format
+- Processes in chunks to handle files of any size
+
+**When to use:**
+- Files >1GB or >50 samples
+- Large families (>10 members)
+- Running multiple filter configurations
+- Repeated analysis of the same dataset
+
+### `wombat filter` - Process and Filter Data
 
-
+Transforms and filters variant data (works with both TSV and Parquet input):
 
 ```bash
-#
-
+# Basic filtering (TSV input)
+wombat filter input.tsv -o output
+
+# From prepared Parquet (faster, more memory-efficient)
+wombat filter prepared.parquet -o output
+
+# With filter configuration
+wombat filter input.tsv -F config.yml -o output
 
-#
-
+# With pedigree
+wombat filter input.tsv -p pedigree.tsv -o output
 
 # Compressed output
-
+wombat filter input.tsv -o output -f tsv.gz
 
-# Parquet
-
+# Parquet output
+wombat filter input.tsv -o output -f parquet
 
 # With verbose output
-
+wombat filter input.tsv -o output -v
 ```
 
 ### With Pedigree (Trio/Family Analysis)
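
The format auto-detection described above amounts to branching on the input's extension before handing the file to Polars. A simplified sketch, assuming detection is purely suffix-based (pywombat's real heuristics may differ):

```python
import polars as pl

def load_variants(path: str) -> pl.DataFrame:
    """Load a prepared Parquet file or a raw bcftools TSV/TSV.gz export."""
    if path.endswith(".parquet"):
        return pl.read_parquet(path)          # columnar, INFO fields already expanded
    # Polars decompresses gzipped CSV/TSV transparently when reading from a path.
    return pl.read_csv(path, separator="\t")
```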

@@ -140,7 +201,7 @@ uvx pywombat input.tsv -o output --verbose
 Add parent genotype information for inheritance analysis:
 
 ```bash
-
+wombat filter input.tsv --pedigree pedigree.tsv -o output
 ```
 
 **Pedigree File Format** (tab-separated):
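
Conceptually, attaching parent genotypes is a self-join of the melted table against itself through the pedigree. A rough sketch with illustrative column names (`SAMPLE`, `GT`, `sample_id`, `father_id`, `mother_id`); pywombat's actual pedigree columns follow the format documented in the README:

```python
import polars as pl

def add_parent_genotypes(variants: pl.DataFrame, pedigree: pl.DataFrame) -> pl.DataFrame:
    # variants: one row per (variant, sample); pedigree: sample_id, father_id, mother_id
    out = variants.join(pedigree, left_on="SAMPLE", right_on="sample_id", how="left")
    keyed = variants.select(["CHROM", "POS", "REF", "ALT", "SAMPLE", "GT"])
    for parent in ("father", "mother"):
        parent_gt = keyed.rename({"SAMPLE": f"{parent}_id", "GT": f"{parent}_GT"})
        out = out.join(parent_gt, on=["CHROM", "POS", "REF", "ALT", f"{parent}_id"], how="left")
    return out
```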

@@ -178,7 +239,7 @@ PyWombat supports two types of filtering:
 Filter for ultra-rare, high-impact variants:
 
 ```bash
-
+wombat filter input.tsv \
   -F examples/rare_variants_high_impact.yml \
   -o rare_variants
 ```

@@ -210,7 +271,7 @@ expression: "VEP_CANONICAL = YES & VEP_IMPACT = HIGH & VEP_LoF = HC & VEP_LoF_fl
 Identify de novo mutations in trio data:
 
 ```bash
-
+wombat filter input.tsv \
   --pedigree pedigree.tsv \
   -F examples/de_novo_mutations.yml \
   -o denovo
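
On autosomes, a de novo call reduces to "the child carries an ALT allele while both parents are homozygous reference". A simplified Polars version of that rule, reusing the hypothetical `father_GT`/`mother_GT` columns from the join sketch above; the real filter is additionally sex-chromosome-aware and applies depth/quality thresholds:

```python
import polars as pl

HOM_REF = ["0/0", "0|0"]
HAS_ALT = ["0/1", "1/0", "0|1", "1|0", "1/1", "1|1"]

def flag_possible_dnm(df: pl.DataFrame) -> pl.DataFrame:
    rule = (
        pl.col("GT").is_in(HAS_ALT)
        & pl.col("father_GT").is_in(HOM_REF)
        & pl.col("mother_GT").is_in(HOM_REF)
    )
    return df.with_columns(rule.alias("possible_dnm"))
```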

@@ -290,7 +351,7 @@ expression: "VEP_IMPACT = HIGH & VEP_CANONICAL = YES & gnomad_AF < 0.01 & CADD_P
 Inspect specific variants for troubleshooting:
 
 ```bash
-
+wombat filter input.tsv \
   -F config.yml \
   --debug chr11:70486013
 ```

@@ -309,20 +370,20 @@ Shows:
 ### TSV (Default)
 
 ```bash
-
-
+wombat filter input.tsv -o output         # Creates output.tsv
+wombat filter input.tsv -o output -f tsv  # Same as above
 ```
 
 ### Compressed TSV
 
 ```bash
-
+wombat filter input.tsv -o output -f tsv.gz  # Creates output.tsv.gz
 ```
 
 ### Parquet (Fastest for Large Files)
 
 ```bash
-
+wombat filter input.tsv -o output -f parquet  # Creates output.parquet
 ```
 
 **When to use Parquet:**

@@ -340,7 +401,7 @@ uvx pywombat input.tsv -o output -f parquet  # Creates output.parquet
 
 ```bash
 # Step 1: Filter for rare, high-impact variants
-
+wombat filter cohort.tsv \
   -F examples/rare_variants_high_impact.yml \
   -o rare_variants
 

@@ -352,24 +413,34 @@ uvx pywombat cohort.tsv \
 
 ```bash
 # Identify de novo mutations in autism cohort
-
+wombat filter autism_trios.tsv \
   --pedigree autism_pedigree.tsv \
   -F examples/de_novo_mutations.yml \
   -o autism_denovo \
-
+  -v
 
 # Review output for genes in autism risk lists
 ```
 
-### 3. Multi-Family
+### 3. Large Multi-Family Analysis (Memory-Optimized)
 
 ```bash
-#
-
+# Step 1: Prepare once (preprocesses INFO fields)
+wombat prepare large_cohort.tsv.gz -o prepared.parquet -v
+
+# Step 2: Filter with different configurations (fast, memory-efficient)
+wombat filter prepared.parquet \
   --pedigree families_pedigree.tsv \
   -F examples/rare_variants_high_impact.yml \
   -o families_rare_variants \
-
+  -v
+
+# Step 3: Run additional filters without re-preparing
+wombat filter prepared.parquet \
+  --pedigree families_pedigree.tsv \
+  -F examples/de_novo_mutations.yml \
+  -o families_denovo \
+  -v
 ```
 
 ### 4. Custom Expression Filter

@@ -389,7 +460,7 @@ expression: "VEP_IMPACT = HIGH & (gnomad_AF < 0.0001 | gnomad_AF = null)"
 Apply:
 
 ```bash
-
+wombat filter input.tsv -F custom_filter.yml -o output
 ```
 
 ---
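
The example expression in the hunk header above, `VEP_IMPACT = HIGH & (gnomad_AF < 0.0001 | gnomad_AF = null)`, has a direct Polars equivalent, which is handy for sanity-checking what a config will match. This shows the semantics only, not pywombat's expression parser:

```python
import polars as pl

def apply_custom_filter(df: pl.DataFrame) -> pl.DataFrame:
    # VEP_IMPACT = HIGH & (gnomad_AF < 0.0001 | gnomad_AF = null)
    keep = (pl.col("VEP_IMPACT") == "HIGH") & (
        (pl.col("gnomad_AF") < 0.0001) | pl.col("gnomad_AF").is_null()
    )
    return df.filter(keep)
```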

@@ -464,7 +535,7 @@ bcftools query -HH \
   annotated.split.bcf > annotated.tsv
 
 # 4. Process with PyWombat
-
+wombat filter annotated.tsv -F examples/rare_variants_high_impact.yml -o output
 ```
 
 **Why split-vep is required:**

@@ -481,7 +552,7 @@ For production workflows, these commands can be piped together:
 # Efficient pipeline (single pass through data)
 bcftools +split-vep -c - -p VEP_ input.vcf.gz | \
 bcftools query -HH -f '%CHROM\t%POS\t%REF\t%ALT\t%FILTER\t%INFO[\t%GT:%DP:%GQ:%AD]\n' | \
-
+  wombat filter - -F config.yml -o output
 ```
 
 **Note**: For multiple filter configurations, it's more efficient to save the intermediate TSV file rather than regenerating it each time.

@@ -517,11 +588,31 @@ Each configuration file is fully documented with:
 
 ## Performance Tips
 
-
-
+### For Large Files (>1GB or >50 samples)
+
+1. **Use the two-step workflow**: `wombat prepare` → `wombat filter`
+   - Reduces memory usage by 95%+ (4.2M variants → ~100 after early filtering)
+   - Pre-expands INFO fields once, reuse for multiple filter configurations
+   - Example: 38-sample family with 4.2M variants processes in <1 second with ~1.2GB RAM
+
+2. **Parquet format benefits**:
+   - Columnar storage enables selective column loading
+   - Pre-filtering before melting (expression filters applied before expanding to per-sample rows)
+   - 30% smaller file size vs gzipped TSV
+
+### For All Files
+
 3. **Pre-filter with bcftools**: Filter by region/gene before PyWombat
 4. **Compressed input**: PyWombat handles `.gz` files natively
-5. **
+5. **Use verbose mode** (`-v`): Monitor progress and filtering statistics
+
+### Memory Comparison
+
+| Approach | 38 samples, 4.2M variants | Memory | Time |
+|----------|---------------------------|--------|------|
+| Direct TSV | ❌ OOM (>200GB) | 200+ GB | Failed |
+| TSV with chunking | ⚠️ Slow | ~30GB | ~3 min |
+| **Parquet + pre-filter** | ✅ **Optimal** | **~1.2GB** | **<1 sec** |
 
 ---
 
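
The "pre-filtering before melting" tip above is where most of the memory saving comes from: dropping variants while the table is still one row per variant means the wide-to-long expansion only multiplies the few rows that survive. A sketch of that ordering with Polars, using `unpivot` for the melt and a placeholder variant-level predicate (column names illustrative; Polars ≥ 1.0 calls this `unpivot`, older versions `melt`):

```python
import polars as pl

def prefilter_then_melt(df: pl.DataFrame, sample_cols: list[str]) -> pl.DataFrame:
    # 1. Variant-level filter while the table is still wide (one row per variant).
    kept = df.filter((pl.col("VEP_IMPACT") == "HIGH") & (pl.col("gnomad_AF") < 0.0001))
    # 2. Only the surviving rows are expanded to one row per (variant, sample).
    return kept.unpivot(
        on=sample_cols,                       # e.g. ["Sample1", "Sample2", ...]
        index=[c for c in kept.columns if c not in sample_cols],
        variable_name="SAMPLE",
        value_name="GT_FIELDS",               # raw GT:DP:GQ:AD string per sample
    )
```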

@@ -588,11 +679,15 @@ pywombat/
 
 **Issue**: Memory errors on large files
 
-- **Solution**:
+- **Solution**: Use the two-step workflow: `wombat prepare` then `wombat filter` for 95%+ memory reduction
+
+**Issue**: Command not found after upgrading
+
+- **Solution**: PyWombat now uses subcommands - use `wombat filter` instead of just `wombat`
 
 ### Getting Help
 
-1. Check `--help` for command options: `
+1. Check `--help` for command options: `wombat --help` or `wombat filter --help`
 2. Review example configurations in [`examples/`](examples/)
 3. Use `--debug` mode to inspect specific variants
 4. Use `--verbose` to see filtering steps