bioflowkit 0.2.0__tar.gz → 0.3.0__tar.gz
This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
- bioflowkit-0.3.0/.gitattributes +20 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/candidate-smoke-test.yml +5 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/ci.yml +17 -13
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/docs.yml +5 -0
- bioflowkit-0.3.0/.github/workflows/nfcore-concordance.yml +75 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/nightly-smoke.yml +15 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/release.yml +5 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.gitignore +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/CHANGELOG.md +385 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/PKG-INFO +31 -16
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/README.md +30 -15
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/__init__.py +3 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/__init__.py +2 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/db.py +25 -5
- bioflowkit-0.3.0/bioflow/cli/provenance.py +82 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/recipe.py +22 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/update.py +7 -6
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/db.py +117 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/doctor.py +35 -17
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/ncbi.py +10 -3
- bioflowkit-0.3.0/bioflow/core/provenance.py +422 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/runner.py +145 -10
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/io.py +1 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/llm/__init__.py +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/__init__.py +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/ani_matrix.py +9 -5
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/epigenomics/atac_seq.py +7 -5
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/epigenomics/chip_seq.py +13 -10
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/genome_assembly/eukaryote_assembly.py +16 -9
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/metagenomics/metagenome_assembly.py +7 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/metagenomics/metagenomics_profile.py +5 -4
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/methylation/bismark_wgbs.py +68 -9
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/proteomics/proteomics_dda.py +7 -4
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/rnaseq_deg/rnaseq_deg.py +79 -14
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/single_cell/scrna_seq.py +1 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/variant_calling/germline_variants.py +46 -13
- bioflowkit-0.3.0/bioflow/recipes/variant_calling/joint_genotyping.py +277 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/__init__.py +43 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_paths.py +35 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_stage.py +22 -1
- bioflowkit-0.3.0/conda-recipe/meta.yaml +86 -0
- bioflowkit-0.3.0/data/test/cafe_small/README.md +21 -0
- bioflowkit-0.3.0/data/test/cafe_small/families.tsv +61 -0
- bioflowkit-0.3.0/data/test/cafe_small/tree.nwk +1 -0
- bioflowkit-0.3.0/data/test/genomes_small/README.md +28 -0
- bioflowkit-0.3.0/data/test/genomes_small/genome1.fna +78 -0
- bioflowkit-0.3.0/data/test/genomes_small/genome2.fna +78 -0
- bioflowkit-0.3.0/data/test/gwas_small/README.md +22 -0
- bioflowkit-0.3.0/data/test/gwas_small/gene_presence_absence.csv +13 -0
- bioflowkit-0.3.0/data/test/gwas_small/traits.csv +11 -0
- bioflowkit-0.3.0/data/test/methyl_small/README.md +29 -0
- bioflowkit-0.3.0/data/test/methyl_small/genome.fa +79 -0
- bioflowkit-0.3.0/data/test/methyl_small/sample01_R1.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/methyl_small/sample01_R2.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/phix_small/README.md +41 -0
- bioflowkit-0.3.0/data/test/phix_small/reference.fa +79 -0
- bioflowkit-0.3.0/data/test/phix_small/sim_R1.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/phix_small/sim_R2.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/phylo_small/README.md +25 -0
- bioflowkit-0.3.0/data/test/phylo_small/gene_presence_absence.csv +9 -0
- bioflowkit-0.3.0/data/test/phylo_small/gffs/g1.ffn +67 -0
- bioflowkit-0.3.0/data/test/phylo_small/gffs/g1.gff +100 -0
- bioflowkit-0.3.0/data/test/phylo_small/gffs/g2.ffn +67 -0
- bioflowkit-0.3.0/data/test/phylo_small/gffs/g2.gff +100 -0
- bioflowkit-0.3.0/data/test/phylo_small/gffs/g3.ffn +72 -0
- bioflowkit-0.3.0/data/test/phylo_small/gffs/g3.gff +101 -0
- bioflowkit-0.3.0/data/test/phylo_small/gffs/g4.ffn +67 -0
- bioflowkit-0.3.0/data/test/phylo_small/gffs/g4.gff +101 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/README.md +22 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/ctl1_R1.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/ctl1_R2.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/ctl2_R1.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/ctl2_R2.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/transcriptome.fa +552 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/trt1_R1.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/trt1_R2.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/trt2_R1.fastq.gz +0 -0
- bioflowkit-0.3.0/data/test/rnaseq_small/trt2_R2.fastq.gz +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/MAINTAINER.md +42 -1
- bioflowkit-0.3.0/docs/benchmarks/nfcore-concordance.md +90 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/index.md +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/install.md +6 -0
- bioflowkit-0.3.0/docs/reference/e2e-coverage.md +68 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/reference/recipes.md +38 -21
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/reference/tools.md +50 -50
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/mkdocs.yml +3 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/pyproject.toml +1 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/bedtools.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/bowtie2.yaml +5 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/bwa_mem2.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/minimap2.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/samtools.yaml +9 -5
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/abyss.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/canu.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/flye.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/hifiasm.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/masurca.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/medaka.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/megahit.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/nextdenovo.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/nextpolish.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/pilon.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/racon.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/raven.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/shasta.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/unicycler.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/verkko.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/busco.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/checkm2.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/compleasm.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/gfastats.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/merqury.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/abricate.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/cafe5.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/diamond.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/fastani.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/iqtree.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/mafft.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/mash.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/panaroo.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/roary.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/scoary.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/skani.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/deg/deseq2.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/deg/edger.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/deg/limma_voom.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/enrichment/clusterprofiler.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/enrichment/enrichr.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/enrichment/gseapy.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/enrichment/topgo.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/bismark.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/deeptools.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/homer.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/macs3.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/methylkit.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/methylpy.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/picard.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/tobias.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/antismash.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/dbcan.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/eggnog_mapper.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/gtdbtk.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/interproscan.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/bracken.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/humann3.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/kneaddata.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/kraken2.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/lefse.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/maxbin2.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/metabat2.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/metaphlan4.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/comet.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/maxquant.yaml +10 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/msconvert.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/openms.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/percolator.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/xtandem.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/cutadapt.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/fastqc.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/filtlong.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/multiqc.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/nanoplot.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/seqkit.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/trimgalore.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/repeat/earlgrey.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/repeat/repeatmasker.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/repeat/repeatmodeler.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/hisat2.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/kallisto.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/rsem.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/salmon.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/star.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/stringtie.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/subread.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/bustools.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/cellranger.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/harmony.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/monocle3.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/scanpy.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/scrublet.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/seurat.yaml +2 -1
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/starsolo.yaml +1 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/augustus.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/bakta.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/braker3.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/liftoff.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/variant_calling/bcftools.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/variant_calling/freebayes.yaml +4 -3
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/variant_calling/gatk4.yaml +3 -2
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/variant_calling/snpeff.yaml +3 -2
- bioflowkit-0.3.0/scripts/compare_nfcore.py +234 -0
- bioflowkit-0.3.0/scripts/gen_methyl_fixture.py +89 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/pin_digests.py +12 -2
- bioflowkit-0.3.0/scripts/refresh_tags.py +199 -0
- bioflowkit-0.3.0/tests/integration/test_full_pipeline_e2e.py +343 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/test_recipe_smoke_matrix.py +31 -1
- bioflowkit-0.3.0/tests/unit/test_compare_nfcore.py +170 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_db.py +45 -0
- bioflowkit-0.3.0/tests/unit/test_docker_timeout.py +130 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_failure_report.py +2 -1
- bioflowkit-0.3.0/tests/unit/test_gpu_podman.py +203 -0
- bioflowkit-0.3.0/tests/unit/test_provenance.py +232 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipes_per_pipeline.py +7 -6
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipes_per_pipeline_e2e.py +32 -3
- bioflowkit-0.3.0/tests/unit/test_resource_clamp.py +54 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_run_resume.py +1 -0
- bioflowkit-0.3.0/tests/unit/test_unsafe_paths.py +71 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/CODE_OF_CONDUCT.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/CONTRIBUTING.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/LICENSE +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/SECURITY.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/README.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_ananatis_019464615_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_ananatis_019464615_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_ananatis_019464615_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_aquatica_900095885_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_aquatica_900095885_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_aquatica_900095885_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_chrysanthemi_000023565_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_chrysanthemi_000023565_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_chrysanthemi_000023565_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dadantii_003049785_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dadantii_003049785_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dadantii_003049785_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dianthicola_003403135_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dianthicola_003403135_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dianthicola_003403135_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_fangzhongdai_002812485_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_fangzhongdai_002812485_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_fangzhongdai_002812485_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_lacustris_003934295_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_lacustris_003934295_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_lacustris_003934295_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_oryzae_020406815_2.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_oryzae_020406815_2.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_oryzae_020406815_2.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_parazeae_000025065_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_parazeae_000025065_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_parazeae_000025065_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_poaceiphila_007858975_2.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_poaceiphila_007858975_2.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_poaceiphila_007858975_2.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_solani_001644705_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_solani_001644705_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_solani_001644705_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_undicola_000784735_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_undicola_000784735_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_undicola_000784735_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_zeae_002887555_1.card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_zeae_002887555_1.plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_zeae_002887555_1.vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/_summary_card.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/_summary_plasmidfinder.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/_summary_vfdb.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/results/Base_clade_results.txt +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/results/Base_family_likelihoods.txt +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/results/Base_family_results.txt +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/results/Base_results.txt +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/vfdb_counts.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/eggnog/cog_counts_by_bucket.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/eggnog/cog_fractions_by_bucket.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_card.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_full_boxplot.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_full_card.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_full_plasmidfinder.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_full_vfdb.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_plasmidfinder.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_vfdb.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/ani_full_heatmap.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/ani_heatmap.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/cafe_hcp_detail.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/cafe_vfdb_tree.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/cog_delta.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/cog_stacked.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/pangenome_curve.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/pangenome_full_curve.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/pangenome_full_pie.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/pangenome_pie.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_full_is_dianthicola.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_full_is_solani.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_full_soft_rot.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_full_vascular_wilt.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_soft_rot.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_vascular_wilt.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/solani_island_gc.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/solani_island_synteny.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/tree_ani_nj.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/tree_full_with_vfdb.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/tree_ml_iqtree.png +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/phylogeny/ani_nj.nwk +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/phylogeny/iqtree.treefile +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/phylogeny_full/iqtree_full.treefile +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary/top25_soft_rot.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary/top25_vascular_wilt.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary/traits.csv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/top30_is_dianthicola.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/top30_is_solani.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/top30_soft_rot.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/top30_vascular_wilt.tsv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/traits.csv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/summary.html +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/__main__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/_app.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/doctor.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/hw.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/llm.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/ncbi.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/pipelines.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/setup.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/approve.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/checkpoint.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/compatibility.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/dag.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/hardware.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/logger.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/planner.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/registry.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/report.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/llm/audit.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/atac_seq.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/chip_seq.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/genome_assembly.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/metagenomics.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/methylation.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/proteomics.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/rnaseq_deg.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/scrna_seq.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/variant_calling.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/amr_vf_catalogue.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/cafe_evolution.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/cog_enrichment.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/download_taxon.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/gwas.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/pangenome.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/phylogeny.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/epigenomics/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/genome_assembly/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/genome_assembly/prokaryote_assembly.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/metagenomics/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/methylation/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/proteomics/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/rnaseq_deg/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/single_cell/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/variant_calling/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/report.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_cache.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_hashing.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_parallel.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_pipeline.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_result.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_runtime.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/R1.fastq.gz +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/R2.fastq.gz +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/real_R1.fastq.gz +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/real_R2.fastq.gz +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/reference.fa +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/rnaseq_toy/R1.fastq.gz +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/rnaseq_toy/genome.fa +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/rnaseq_toy/genome.gtf +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/rnaseq_toy/samples.csv +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docker/core/Dockerfile +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docker/docker-compose.yml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/DESIGN.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/architecture.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/maintainer/UPDATE_CADENCES.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/maintainer/cowork_schedule_prompt.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/maintainer/quarterly_audit_prompt.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/maintainer/research_prompt.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/quickstart.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/cache_demo.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_atac_seq.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_chip_seq.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_custom.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_eukaryote_hifi.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_metagenomics.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_methylation.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_prokaryote_short.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_proteomics.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_recommend.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_rnaseq.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_scrna_seq.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/parallel_demo.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/pectobacterium_demo.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/pipeline_demo.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/recipes_quickstart.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/stage_demo.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/README.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/atac_seq_standard.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/chip_seq_standard.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/eukaryote_denovo_hifi.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/eukaryote_denovo_hybrid.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/eukaryote_resequencing.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/metagenomics_kraken2_standard.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/metagenomics_metaphlan4_standard.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/methylation_bismark_wgbs.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/prokaryote_denovo_hybrid.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/prokaryote_denovo_short.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/proteomics_msfragger_dda.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/rnaseq_deseq2_standard.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/scrna_seq_10x_scanpy.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/scrna_seq_10x_seurat.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/schema.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/bwa.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/spades.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/quast.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/fragpipe.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/msfragger.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/fastp.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/prokka.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/gen_docs.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-cron-daily.sh +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-cron-weekly.sh +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-cron.sh +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-daily.ps1 +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-weekly.ps1 +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-windows.ps1 +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/e2e/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/e2e/test_prokaryote_short.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/e2e/test_rnaseq.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/fixtures/hypo_assembler.yaml +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/__init__.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/test_docker_backend.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/test_recipe_real_data.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/test_sdk_real_docker.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_approve.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_benchmark.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_bugfixes.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_crossplatform.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_dag.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_digest_pinning.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_doctor.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_freshness_check.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_interactive.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_io.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_llm.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_llm_audit.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_llm_diagnose.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_llm_setup.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_ncbi.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_planner.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_planner_eukaryote.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_planner_rnaseq.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipe_cli_args.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipe_registry_alignment.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipes.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipes_cookbook.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recommend.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_registry_resolver.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_registry_sanity.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_release_watch.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_report.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_report_builder.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_runner.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_cache.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_external_mounts.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_parallel.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_pipeline.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_retry.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_stage.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_streaming.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_skeleton.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_update_auto.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/update/REGISTRY_CHANGELOG.md +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/update/benchmark.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/update/freshness_check.py +0 -0
- {bioflowkit-0.2.0 → bioflowkit-0.3.0}/update/release_watch.py +0 -0
|
@@ -0,0 +1,20 @@
|
|
|
1
|
+
# Test fixtures are read by Linux containers (FastANI, CAFE5, …). Some
|
|
2
|
+
# tool parsers don't strip a stray CR, so a CRLF checkout on Windows
|
|
3
|
+
# silently corrupts them (CAFE5 read the last column header as "D\r" and
|
|
4
|
+
# reported "D was not found"). Force LF for text fixtures everywhere,
|
|
5
|
+
# and mark the compressed reads binary so git never touches them.
|
|
6
|
+
data/test/**/*.fna text eol=lf
|
|
7
|
+
data/test/**/*.fa text eol=lf
|
|
8
|
+
data/test/**/*.fasta text eol=lf
|
|
9
|
+
data/test/**/*.tsv text eol=lf
|
|
10
|
+
data/test/**/*.csv text eol=lf
|
|
11
|
+
data/test/**/*.nwk text eol=lf
|
|
12
|
+
data/test/**/*.txt text eol=lf
|
|
13
|
+
data/test/**/*.gff text eol=lf
|
|
14
|
+
data/test/**/*.ffn text eol=lf
|
|
15
|
+
data/test/**/*.gz binary
|
|
16
|
+
|
|
17
|
+
# Registry + recipe shell commands also end up inside Linux containers;
|
|
18
|
+
# keep them LF regardless of host.
|
|
19
|
+
*.yaml text eol=lf
|
|
20
|
+
registry/** text eol=lf
|
|
@@ -17,6 +17,11 @@ permissions:
|
|
|
17
17
|
contents: read
|
|
18
18
|
pull-requests: write # for the summary comment
|
|
19
19
|
|
|
20
|
+
# Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
|
|
21
|
+
# the v4/v5 actions stop emitting Node 20 deprecation warnings.
|
|
22
|
+
env:
|
|
23
|
+
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
|
|
24
|
+
|
|
20
25
|
jobs:
|
|
21
26
|
smoke:
|
|
22
27
|
name: Validate + smoke-test changed candidates
|
|
@@ -6,6 +6,11 @@ on:
|
|
|
6
6
|
pull_request:
|
|
7
7
|
branches: [main]
|
|
8
8
|
|
|
9
|
+
# Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
|
|
10
|
+
# the v4/v5 actions stop emitting Node 20 deprecation warnings.
|
|
11
|
+
env:
|
|
12
|
+
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
|
|
13
|
+
|
|
9
14
|
jobs:
|
|
10
15
|
unit-tests:
|
|
11
16
|
name: Unit tests (Python ${{ matrix.python-version }})
|
|
@@ -47,12 +52,11 @@ jobs:
|
|
|
47
52
|
run: ruff check .
|
|
48
53
|
|
|
49
54
|
typecheck:
|
|
50
|
-
#
|
|
51
|
-
#
|
|
52
|
-
#
|
|
53
|
-
name: Type check (mypy
|
|
55
|
+
# Blocking: the bioflow package type-checks clean under mypy
|
|
56
|
+
# (--ignore-missing-imports for the un-stubbed docker / anthropic /
|
|
57
|
+
# openai SDKs). Keep it that way — a new type error fails CI.
|
|
58
|
+
name: Type check (mypy)
|
|
54
59
|
runs-on: ubuntu-latest
|
|
55
|
-
continue-on-error: true
|
|
56
60
|
|
|
57
61
|
steps:
|
|
58
62
|
- uses: actions/checkout@v4
|
|
@@ -66,7 +70,7 @@ jobs:
|
|
|
66
70
|
run: pip install -e ".[dev]" && pip install types-PyYAML types-requests
|
|
67
71
|
|
|
68
72
|
- name: Run mypy
|
|
69
|
-
run: mypy bioflow --ignore-missing-imports
|
|
73
|
+
run: mypy bioflow --ignore-missing-imports
|
|
70
74
|
|
|
71
75
|
registry-schema:
|
|
72
76
|
name: Validate registry YAMLs
|
|
@@ -108,12 +112,12 @@ jobs:
|
|
|
108
112
|
EOF
|
|
109
113
|
|
|
110
114
|
digest-audit:
|
|
111
|
-
#
|
|
112
|
-
#
|
|
113
|
-
#
|
|
114
|
-
|
|
115
|
+
# Now blocking: every *active* tool must carry an image_digest
|
|
116
|
+
# (deprecated tools are skipped — their upstream images are gone by
|
|
117
|
+
# definition). This keeps the registry fully content-addressed so a
|
|
118
|
+
# silently-retagged or GC'd upstream image can never change results.
|
|
119
|
+
name: Container digest pin audit
|
|
115
120
|
runs-on: ubuntu-latest
|
|
116
|
-
continue-on-error: true
|
|
117
121
|
|
|
118
122
|
steps:
|
|
119
123
|
- uses: actions/checkout@v4
|
|
@@ -126,5 +130,5 @@ jobs:
|
|
|
126
130
|
- name: Install dependencies
|
|
127
131
|
run: pip install pyyaml
|
|
128
132
|
|
|
129
|
-
- name:
|
|
130
|
-
run: python scripts/pin_digests.py --audit
|
|
133
|
+
- name: Require every active tool to be digest-pinned
|
|
134
|
+
run: python scripts/pin_digests.py --audit
|
|
@@ -26,6 +26,11 @@ concurrency:
|
|
|
26
26
|
group: pages
|
|
27
27
|
cancel-in-progress: true
|
|
28
28
|
|
|
29
|
+
# Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
|
|
30
|
+
# the v3/v4/v5 actions stop emitting Node 20 deprecation warnings.
|
|
31
|
+
env:
|
|
32
|
+
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
|
|
33
|
+
|
|
29
34
|
jobs:
|
|
30
35
|
build:
|
|
31
36
|
runs-on: ubuntu-latest
|
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
name: nf-core concordance
|
|
2
|
+
|
|
3
|
+
# Manually-dispatched concordance benchmark: run bioflow and the matching
|
|
4
|
+
# nf-core pipeline on a golden dataset, then score their agreement with
|
|
5
|
+
# scripts/compare_nfcore.py.
|
|
6
|
+
#
|
|
7
|
+
# This is NOT a per-PR gate. A real run needs tens of GB of references
|
|
8
|
+
# and hours of compute, so it expects a runner that already has them
|
|
9
|
+
# staged (a self-hosted runner, or a large GitHub runner the maintainer
|
|
10
|
+
# provisions). It is run deliberately before a release and its JSON
|
|
11
|
+
# output is published with the release notes.
|
|
12
|
+
#
|
|
13
|
+
# See docs/benchmarks/nfcore-concordance.md for the datasets, method, and
|
|
14
|
+
# acceptance thresholds.
|
|
15
|
+
|
|
16
|
+
on:
|
|
17
|
+
workflow_dispatch:
|
|
18
|
+
inputs:
|
|
19
|
+
comparison:
|
|
20
|
+
description: "Which comparison to run"
|
|
21
|
+
type: choice
|
|
22
|
+
options: [vcf, counts]
|
|
23
|
+
default: vcf
|
|
24
|
+
bioflow_output:
|
|
25
|
+
description: "Path to the bioflow output (VCF or counts TSV) on the runner"
|
|
26
|
+
required: true
|
|
27
|
+
reference_output:
|
|
28
|
+
description: "Path to the nf-core output on the runner"
|
|
29
|
+
required: true
|
|
30
|
+
threshold:
|
|
31
|
+
description: "Min Jaccard (vcf) or Spearman rho (counts) to pass"
|
|
32
|
+
required: false
|
|
33
|
+
default: "0.90"
|
|
34
|
+
|
|
35
|
+
# Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch.
|
|
36
|
+
env:
|
|
37
|
+
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
|
|
38
|
+
|
|
39
|
+
jobs:
|
|
40
|
+
score:
|
|
41
|
+
name: Score concordance
|
|
42
|
+
# The default GitHub runner cannot hold iGenomes references; point this
|
|
43
|
+
# at a self-hosted runner that has them staged. Left as ubuntu-latest
|
|
44
|
+
# so the workflow is valid and the *scoring* step is exercisable on
|
|
45
|
+
# pre-staged outputs.
|
|
46
|
+
runs-on: ubuntu-latest
|
|
47
|
+
steps:
|
|
48
|
+
- uses: actions/checkout@v4
|
|
49
|
+
|
|
50
|
+
- uses: actions/setup-python@v5
|
|
51
|
+
with:
|
|
52
|
+
python-version: "3.12"
|
|
53
|
+
|
|
54
|
+
- name: Score bioflow vs nf-core
|
|
55
|
+
run: |
|
|
56
|
+
if [ "${{ inputs.comparison }}" = "vcf" ]; then
|
|
57
|
+
python scripts/compare_nfcore.py vcf \
|
|
58
|
+
--bioflow "${{ inputs.bioflow_output }}" \
|
|
59
|
+
--reference "${{ inputs.reference_output }}" \
|
|
60
|
+
--out concordance.json \
|
|
61
|
+
--min-jaccard "${{ inputs.threshold }}"
|
|
62
|
+
else
|
|
63
|
+
python scripts/compare_nfcore.py counts \
|
|
64
|
+
--bioflow "${{ inputs.bioflow_output }}" \
|
|
65
|
+
--reference "${{ inputs.reference_output }}" \
|
|
66
|
+
--out concordance.json \
|
|
67
|
+
--min-rho "${{ inputs.threshold }}"
|
|
68
|
+
fi
|
|
69
|
+
|
|
70
|
+
- name: Upload concordance report
|
|
71
|
+
if: always()
|
|
72
|
+
uses: actions/upload-artifact@v4
|
|
73
|
+
with:
|
|
74
|
+
name: nfcore-concordance
|
|
75
|
+
path: concordance.json
|
|
@@ -10,6 +10,11 @@ on:
|
|
|
10
10
|
- cron: "0 3 * * *"
|
|
11
11
|
workflow_dispatch:
|
|
12
12
|
|
|
13
|
+
# Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
|
|
14
|
+
# the v4/v5 actions stop emitting Node 20 deprecation warnings.
|
|
15
|
+
env:
|
|
16
|
+
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
|
|
17
|
+
|
|
13
18
|
jobs:
|
|
14
19
|
smoke-matrix:
|
|
15
20
|
name: Recipe smoke matrix
|
|
@@ -43,12 +48,20 @@ jobs:
|
|
|
43
48
|
-v -m docker \
|
|
44
49
|
--junitxml=reports/smoke.xml
|
|
45
50
|
|
|
46
|
-
- name:
|
|
51
|
+
- name: Run full-pipeline e2e (all 9 recipes with committed fixtures)
|
|
52
|
+
env:
|
|
53
|
+
BIOFLOW_LOG_LEVEL: INFO
|
|
54
|
+
run: |
|
|
55
|
+
python -m pytest tests/integration/test_full_pipeline_e2e.py \
|
|
56
|
+
-v -m docker \
|
|
57
|
+
--junitxml=reports/full_e2e.xml
|
|
58
|
+
|
|
59
|
+
- name: Upload junit reports
|
|
47
60
|
if: always()
|
|
48
61
|
uses: actions/upload-artifact@v4
|
|
49
62
|
with:
|
|
50
63
|
name: smoke-junit
|
|
51
|
-
path: reports
|
|
64
|
+
path: reports/*.xml
|
|
52
65
|
|
|
53
66
|
- name: Fail the job on red results
|
|
54
67
|
if: failure()
|
|
@@ -22,6 +22,11 @@ on:
|
|
|
22
22
|
permissions:
|
|
23
23
|
contents: read
|
|
24
24
|
|
|
25
|
+
# Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
|
|
26
|
+
# the v4/v5 actions stop emitting Node 20 deprecation warnings.
|
|
27
|
+
env:
|
|
28
|
+
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
|
|
29
|
+
|
|
25
30
|
jobs:
|
|
26
31
|
build:
|
|
27
32
|
name: Build sdist + wheel
|
|
@@ -12,6 +12,391 @@ ship bug fixes only. Breaking changes to the documented public API
|
|
|
12
12
|
|
|
13
13
|
---
|
|
14
14
|
|
|
15
|
+
## [Unreleased]
|
|
16
|
+
|
|
17
|
+
## [0.3.0] — 2026-06-18
|
|
18
|
+
|
|
19
|
+
### Fixed — aligner images had no samtools (5 recipes broke at alignment)
|
|
20
|
+
The plain single-tool aligner BioContainers (`bwa`, `bowtie2`,
|
|
21
|
+
`minimap2`) do **not** bundle samtools, yet five recipes ran
|
|
22
|
+
`aligner | samtools sort` / `samtools index` inside them — so every one
|
|
23
|
+
failed at its alignment stage with `samtools: command not found`. The
|
|
24
|
+
smoke matrix only exercises each recipe's *first* stage, so this stayed
|
|
25
|
+
invisible. (See #2.) Each aligner stage now uses an image that carries
|
|
26
|
+
both tools:
|
|
27
|
+
- **germline_variants**, **joint_genotyping**: a mulled
|
|
28
|
+
`bwa 0.7.19 + samtools 1.22` image. A new **`prepare_reference`**
|
|
29
|
+
stage builds the BWA index + `.fai` + `.dict` **once** (in the cohort
|
|
30
|
+
recipe, before the per-sample fan-out — previously each parallel
|
|
31
|
+
sample raced to index the *shared* reference), and the gatk call stage
|
|
32
|
+
drops its own `samtools` calls (`MarkDuplicates --CREATE_INDEX`
|
|
33
|
+
indexes the dedup BAM; the gatk4 image ships no samtools either).
|
|
34
|
+
Verified end-to-end on phiX: prep → bwa-mem → sort/index →
|
|
35
|
+
MarkDuplicates → HaplotypeCaller now produces a valid VCF.
|
|
36
|
+
- **chip_seq**, **atac_seq**: `staphb/bowtie2:2.5.4` (bundles samtools).
|
|
37
|
+
- **metagenome_assembly**: a mulled `minimap2 2.31 + samtools 1.23` image.
|
|
38
|
+
- `registry/tools/alignment/bowtie2.yaml` had the same broken
|
|
39
|
+
assumption (and `samtools.yaml` documented it as fact) — both fixed.
|
|
40
|
+
|
|
41
|
+
### Fixed — static audit of the never-e2e'd recipes (round 1)
|
|
42
|
+
A static pass over the 10 recipes that have no committed full-e2e fixture
|
|
43
|
+
(they need external reference data) turned up several latent defects:
|
|
44
|
+
- **proteomics_dda**: the percolator FDR cut used ``awk -F\t``, but bash
|
|
45
|
+
strips the backslash before awk sees it, so the field separator
|
|
46
|
+
became the literal letter ``t`` instead of a tab, and
|
|
47
|
+
``passing_psms.tsv`` was filtered on garbage columns. Now ``-F"\t"``,
|
|
48
|
+
which reaches awk as a real tab (verified).
|
|
49
|
+
- **eukaryote_assembly**: the docstring advertised ``polish=False`` to
|
|
50
|
+
skip Medaka for HiFi reads, but no such parameter existed. Added it
|
|
51
|
+
(the ``polish`` stage is renamed ``polish_consensus`` to free the
|
|
52
|
+
name); ``assess`` already falls back to Flye's ``assembly.fasta``.
|
|
53
|
+
- **chip_seq**: docstring promised ``--ctrl-r1 / --ctrl-r2`` raw-control
|
|
54
|
+
alignment the recipe never implemented — corrected to document the
|
|
55
|
+
actual ``--ctrl-bam`` (pre-aligned control) input.
|
|
56
|
+
- **metagenomics_profile**: ``bioflow db fetch`` example used a
|
|
57
|
+
catalog key that doesn't exist (``kraken2_standard`` →
|
|
58
|
+
``kraken2_standard_8gb``) and now notes Bracken's ``kmer_distrib`` need.
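
The `awk -F\t` defect in the first bullet can be reproduced without a shell. The following is a minimal sketch, assuming Python's `shlex` in POSIX mode is an adequate stand-in for bash word splitting; it is not bioflow code, and the awk program and the `psms.tsv` filename are hypothetical.

```python
import shlex

# Unquoted: POSIX word splitting consumes the backslash, leaving "-Ft",
# so awk would split fields on the letter "t" (psms.tsv is a made-up name).
unquoted = shlex.split(r"awk -F\t '{print $1}' psms.tsv")

# Double-quoted: the backslash survives word splitting, so awk receives the
# two characters backslash + "t" and interprets them as a tab itself.
quoted = shlex.split(r'''awk -F"\t" '{print $1}' psms.tsv''')

print(unquoted[1])  # -Ft
print(quoted[1])    # -F\t
```

Inside double quotes a POSIX shell only treats the backslash as special before a small set of characters, and `t` is not one of them, which is why the quoted form reaches awk intact.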
|
|
59
|
+
|
|
60
|
+
### Docs — strict build fixed + e2e-coverage page
|
|
61
|
+
- `mkdocs build --strict` was aborting: three docs links pointed at repo
|
|
62
|
+
files outside the `docs/` tree (`conda-recipe/meta.yaml`,
|
|
63
|
+
`scripts/compare_nfcore.py`, the nf-core workflow), which strict mode
|
|
64
|
+
flags as unresolved. Re-pointed them at GitHub blob URLs.
|
|
65
|
+
- New **`reference/e2e-coverage.md`** documents which 9 recipes have a
|
|
66
|
+
committed full end-to-end fixture and which 10 are gated on external
|
|
67
|
+
reference data (with the `bioflow db fetch` key for each), plus the one
|
|
68
|
+
utility recipe. Regenerated `reference/recipes.md` / `tools.md` from
|
|
69
|
+
the registry (now 20 recipes; `joint_genotyping` was missing and image
|
|
70
|
+
tags were stale).
|
|
71
|
+
|
|
72
|
+
### Fixed — ani_matrix broken for genomes outside the workspace
|
|
73
|
+
- **Bug a full e2e caught**: FastANI reads genome paths from a *list
|
|
74
|
+
file*, not the command, so the SDK's command-path translator and
|
|
75
|
+
auto-mount never applied to them — every external genome failed with
|
|
76
|
+
`Could not open <host path>`. Since genomes normally live outside the
|
|
77
|
+
output workspace, this broke the recipe's primary documented use.
|
|
78
|
+
- New SDK helper **`stage_input(path)`** copies an external file into the
|
|
79
|
+
active workspace (always mounted at `/work`) and returns its container
|
|
80
|
+
path — the clean primitive for any recipe that feeds a tool a list
|
|
81
|
+
file of paths. Also exported **`container_path(path)`**.
|
|
82
|
+
- `ani_matrix` now stages genomes via `stage_input` and writes container
|
|
83
|
+
paths into the FastANI list; verified end-to-end (genome1 vs genome2 =
|
|
84
|
+
99.5% ANI).
|
|
85
|
+
|
|
86
|
+
### Added — full e2e for the comparative-genomics recipes
|
|
87
|
+
- `tests/integration/test_full_pipeline_e2e.py` gains real end-to-end
|
|
88
|
+
tests for **amr_vf_catalogue** (ABRicate fan-out, bundled DBs),
|
|
89
|
+
**ani_matrix** (all-vs-all FastANI), **pangenome** (Prokka × N →
|
|
90
|
+
Roary), and **gwas** (Scoary on a synthetic Roary GPA + phenotype,
|
|
91
|
+
recovers a planted association). Fixtures:
|
|
92
|
+
`data/test/genomes_small/` (phiX174 + a 25-SNP variant) and
|
|
93
|
+
`data/test/gwas_small/` (12-gene × 10-sample GPA). Recipes validated
|
|
94
|
+
end-to-end: 1 (prokaryote) → 9 (see below).
|
|
95
|
+
- **cafe_evolution** (CAFE5 gene-family expansion/contraction) added as
|
|
96
|
+
the 6th, on `data/test/cafe_small/` (ultrametric 4-taxon tree + 60
|
|
97
|
+
families).
|
|
98
|
+
- **phylogeny** (single-copy core → MAFFT × N → IQ-TREE) added as the
|
|
99
|
+
7th, on `data/test/phylo_small/` (Prokka GFF + CDS + Roary GPA for 4
|
|
100
|
+
phiX strains; IQ-TREE recovers a 4-taxon ML tree).
|
|
101
|
+
- **rnaseq_deg** (fastp → Salmon → DESeq2 → enrichment + MultiQC) added
|
|
102
|
+
as the 8th, on `data/test/rnaseq_small/` (60 synthetic transcripts, 4
|
|
103
|
+
samples, 10 transcripts planted ~4× up in the treated group). DESeq2
|
|
104
|
+
recovers the planted signal (`tx0001` log2FC ≈ 2) and the run finishes
|
|
105
|
+
in seconds. The sample sheet is built by the test at run time so no
|
|
106
|
+
machine-specific paths are committed.
|
|
107
|
+
- **methylation_wgbs** (TrimGalore → Bismark → methylKit) added as the
|
|
108
|
+
9th, on `data/test/methyl_small/` (phiX174 + 3,000 synthetic
|
|
109
|
+
directional bisulfite read pairs, ~70% CpG-methylated). The reads
|
|
110
|
+
map at 100% and Bismark produces a real cytosine report; the genome
|
|
111
|
+
is **not** committed pre-prepared — the new `bismark_prep` stage
|
|
112
|
+
(below) bisulfite-converts it at run time, so no version-tied bowtie2
|
|
113
|
+
index lands in git. Regenerated deterministically by
|
|
114
|
+
`scripts/gen_methyl_fixture.py`.
|
|
115
|
+
|
|
116
|
+
### Added — methylation_wgbs prepares its genome (matches the docs)
|
|
117
|
+
- The recipe's docstring promised automatic genome preparation, but the
|
|
118
|
+
pipeline had no such stage — it silently required a pre-prepared
|
|
119
|
+
`Bisulfite_Genome/` directory, so running from a plain reference FASTA
|
|
120
|
+
failed. A new **`bismark_prep`** stage now runs
|
|
121
|
+
`bismark_genome_preparation` when `--bismark-genome` is a FASTA (or a
|
|
122
|
+
directory holding one); an already-prepared directory is detected and
|
|
123
|
+
used directly, skipping preparation. `methylation_wgbs` is now 4
|
|
124
|
+
stages (trim → bismark_prep → bismark → methylkit).
|
|
125
|
+
|
|
126
|
+
### Fixed — methylKit CpG-report glob (shell ate the regex escape)
|
|
127
|
+
- `methylkit_dmr` matched the cytosine report with `pattern='…txt(\.gz)?$'`,
|
|
128
|
+
but the `\.` escape was stripped by the shell before R parsed the
|
|
129
|
+
string, so R aborted with *"'\.' is an unrecognized escape in character
|
|
130
|
+
string"* and the whole recipe failed at the final stage. Replaced with
|
|
131
|
+
a `[.]` character class (no backslash to escape), which matches the
|
|
132
|
+
report — with or without a `.gz` suffix — robustly.
|
|
133
|
+
|
|
134
|
+
### Fixed — rnaseq_deg DESeq2 step (two latent bugs)
|
|
135
|
+
- The `deseq2_diff` stage required **tximport**, which the
|
|
136
|
+
`bioconductor-deseq2` BioContainer does not ship — every run failed
|
|
137
|
+
with "there is no package called 'tximport'". Rewritten to assemble
|
|
138
|
+
the count matrix in base R straight from each sample's `quant.sf`
|
|
139
|
+
(`NumReads`, rounded) and feed `DESeqDataSetFromMatrix`, dropping the
|
|
140
|
+
tximport dependency entirely.
|
|
141
|
+
- A second, masked bug: `samples$sample_id` inside the `Rscript -e "…"`
|
|
142
|
+
body (run via `sh -c`) was shell-expanded because the `$` was
|
|
143
|
+
unescaped, so the `file.path(...)` of `quant.sf` paths was wrong.
|
|
144
|
+
Escaped to `\$sample_id`. The pipeline now also **fails fast** if the
|
|
145
|
+
DESeq2 stage exits non-zero (the downstream Enrichr step tolerates an
|
|
146
|
+
empty gene list, which previously masked a broken DEG table).
|
|
147
|
+
|
|
148
|
+
### Fixed — LF line endings for container-read fixtures
|
|
149
|
+
- A new `.gitattributes` pins text test fixtures (and registry YAMLs) to
|
|
150
|
+
LF. CAFE5 doesn't strip a trailing CR, so a CRLF checkout on Windows
|
|
151
|
+
made it read the last species column as `D\r` and fail with "D was not
|
|
152
|
+
found in gene family …". FASTA parsers tolerated the CR, but the
|
|
153
|
+
table parser did not — LF is now enforced so fixtures are safe on
|
|
154
|
+
every host.
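
The failure mode described in this entry is easy to see in isolation. This is a small sketch with an invented header row (the column names are illustrative, not taken from the real fixture), showing how a parser that strips only the newline keeps the carriage return in the last column name.

```python
# A tab-separated header row as it looks after a CRLF checkout
# (illustrative data, not the committed fixture).
crlf_line = "Desc\tFamily ID\tA\tB\tC\tD\r\n"

# Stripping only the newline, as a naive parser might, keeps the CR.
naive_columns = crlf_line.rstrip("\n").split("\t")

# Stripping both CR and LF recovers the intended column name.
clean_columns = crlf_line.rstrip("\r\n").split("\t")

print(repr(naive_columns[-1]))  # 'D\r'
print(repr(clean_columns[-1]))  # 'D'
```

Pinning the fixtures to LF in `.gitattributes` avoids relying on every downstream tool to do the second kind of stripping.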
|
|
155
|
+
|
|
156
|
+
### Fixed — bounded stdout retention (no orchestrator OOM on chatty tools)
|
|
157
|
+
- `DockerBackend.run` accumulated **every** stdout line in memory; a
|
|
158
|
+
tool that emits millions of lines (Roary, IQ-TREE) could OOM the
|
|
159
|
+
orchestrator. It now retains only the trailing `_STDOUT_TAIL_LINES`
|
|
160
|
+
(5000) via a bounded `deque` for the diagnostic `CommandResult.stdout`
|
|
161
|
+
— every line still streams live to `log_callback`, and real artifacts
|
|
162
|
+
go to files in the workspace.
|
|
163
|
+
- Tests: +1 (`test_docker_timeout.py`) asserting the tail is kept.
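
The retention strategy described above can be sketched with a bounded `collections.deque`. This is an illustration only: `consume_stream` and its parameters are hypothetical names, and the real `DockerBackend.run` is not shown in this diff section.

```python
from collections import deque
from typing import Callable, Iterable

TAIL_LINES = 5000  # mirrors the value quoted in the changelog entry


def consume_stream(
    lines: Iterable[str],
    log_callback: Callable[[str], None],
    tail_lines: int = TAIL_LINES,
) -> str:
    """Stream every line to the callback but keep only the trailing lines."""
    tail = deque(maxlen=tail_lines)  # old lines are dropped automatically
    for line in lines:
        log_callback(line)  # live streaming is unaffected by the bound
        tail.append(line)
    return "\n".join(tail)


seen = []
stdout = consume_stream((f"line {i}" for i in range(20_000)), seen.append, tail_lines=3)
print(len(seen))            # 20000
print(stdout.splitlines())  # ['line 19997', 'line 19998', 'line 19999']
```

Memory use is bounded by `tail_lines` regardless of how chatty the tool is, while the callback still observes the complete stream.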
|
|
164
|
+
|
|
165
|
+
### Fixed — clear error for shell-unsafe external input filenames
|
|
166
|
+
- An external input file whose **basename** contained a space or shell
|
|
167
|
+
metacharacter silently corrupted the recipe's command — bioflow mounts
|
|
168
|
+
the file's parent at the space-free `/inputs/<n>` and splices the
|
|
169
|
+
basename in unquoted, and it can't be quoted generically because many
|
|
170
|
+
recipes wrap the whole command in `bash -c '…'`. (A spaced *directory*
|
|
171
|
+
was already fine — only the basename survives into the command.)
|
|
172
|
+
- `_collect_external_mounts` now raises an actionable `ValueError`
|
|
173
|
+
naming the offending characters and telling the user to rename /
|
|
174
|
+
symlink to a safe name.
|
|
175
|
+
- Tests: +12 (`tests/unit/test_unsafe_paths.py`), incl. confirmation
|
|
176
|
+
that spaced *directories* and workspace-internal paths are unaffected.
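
A check of this kind might look like the sketch below. The whitelist, the function name `check_basename`, and the error wording are assumptions made for illustration; the actual rule inside `_collect_external_mounts` is not visible in this section.

```python
import re
from pathlib import PurePosixPath

# Assumption: a conservative whitelist; bioflow's real rule may differ.
_SAFE_CHAR = re.compile(r"[A-Za-z0-9._+-]")


def check_basename(path: str) -> str:
    """Return the basename if it is safe to splice unquoted into a shell command."""
    name = PurePosixPath(path).name
    bad = sorted({ch for ch in name if not _SAFE_CHAR.fullmatch(ch)})
    if bad:
        raise ValueError(
            f"unsafe character(s) {bad!r} in input filename {name!r}; "
            "rename the file or symlink it to a safe name"
        )
    return name


# Only the basename is inspected, so a spaced parent directory passes.
print(check_basename("/data/my genomes/sample_1.fastq.gz"))  # sample_1.fastq.gz
```

Validating only the basename matches the entry's observation that the parent directory is mounted at a space-free path and never appears in the command.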
|
|
177
|
+
|
|
178
|
+
### Fixed — stage_timeout now actually bounds runtime
|
|
179
|
+
- **Latent bug**: `run_plan(stage_timeout=…)` never worked. The log
|
|
180
|
+
loop (`container.logs(stream=True, follow=True)`) blocks until the
|
|
181
|
+
container exits, so the subsequent `container.wait(timeout=…)` — a
|
|
182
|
+
docker-py HTTP read timeout, not a runtime cap — could never fire for
|
|
183
|
+
a runaway container; it would hang forever.
|
|
184
|
+
- `DockerBackend.run` now starts a watchdog `threading.Timer` that
|
|
185
|
+
`container.kill()`s the stage when the timeout elapses, returning the
|
|
186
|
+
conventional exit code **124** with a clear message.
|
|
187
|
+
- Tests: +3 (`tests/unit/test_docker_timeout.py`, fake-container based,
|
|
188
|
+
deterministic).
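
The watchdog pattern described in this entry can be sketched with a fake container, in the same spirit as the fake-container tests it mentions. `FakeContainer` and `run_with_timeout` are hypothetical names and do not use docker-py; only `threading.Timer` and the exit code 124 come from the entry itself.

```python
import threading

TIMEOUT_EXIT_CODE = 124  # conventional "timed out" exit code


class FakeContainer:
    """Stand-in for a container handle (not docker-py)."""

    def __init__(self, runtime_s: float) -> None:
        self._runtime_s = runtime_s
        self._killed = threading.Event()

    def kill(self) -> None:
        self._killed.set()

    def wait(self) -> int:
        """Block until the fake workload finishes or is killed."""
        killed = self._killed.wait(timeout=self._runtime_s)
        return 137 if killed else 0


def run_with_timeout(container: FakeContainer, timeout_s: float) -> int:
    timed_out = threading.Event()

    def on_timeout() -> None:
        timed_out.set()
        container.kill()  # unblocks wait() even if the workload never exits

    watchdog = threading.Timer(timeout_s, on_timeout)
    watchdog.daemon = True
    watchdog.start()
    try:
        exit_code = container.wait()
    finally:
        watchdog.cancel()  # harmless if the timer has already fired
    return TIMEOUT_EXIT_CODE if timed_out.is_set() else exit_code


print(run_with_timeout(FakeContainer(runtime_s=0.05), timeout_s=1.0))  # 0
print(run_with_timeout(FakeContainer(runtime_s=5.0), timeout_s=0.1))   # 124
```

The timer runs independently of the blocking call, so the cap holds even when the blocking call itself has no runtime limit.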
|
|
189
|
+
|
|
190
|
+
### Fixed — DockerBackend now clamps CPU/RAM to host capacity
|
|
191
|
+
- **Bug the full-pipeline e2e caught**: a stage declaring `cpu=8` (e.g.
|
|
192
|
+
SPAdes in `prokaryote_assembly`) failed to even *start* on any host
|
|
193
|
+
with fewer cores — Docker rejects a container whose `--cpus` exceeds
|
|
194
|
+
the host count ("range of CPUs is from 0.01 to N.00"), so all 3 retry
|
|
195
|
+
attempts died instantly. Passed locally (12 cores) but failed on the
|
|
196
|
+
4-core CI runner — and would hit any user on a small workstation.
|
|
197
|
+
- `DockerBackend.run` now clamps the requested CPU to the host core count
|
|
198
|
+
and RAM to ~90% of host memory (`_clamp_resources`), so an
|
|
199
|
+
over-ambitious resource request degrades to "use what's available"
|
|
200
|
+
instead of crashing.
|
|
201
|
+
- Tests: +5 (`tests/unit/test_resource_clamp.py`).
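
The clamping behaviour can be sketched as follows. The signature of `clamp_resources` here is an assumption for illustration, and host memory is passed in explicitly because this section does not show how the real `_clamp_resources` detects it.

```python
import os
from typing import Optional, Tuple

GIB = 1024 ** 3


def clamp_resources(
    cpu: int,
    mem_bytes: int,
    host_cpus: Optional[int] = None,
    host_mem_bytes: Optional[int] = None,
    mem_percent: int = 90,
) -> Tuple[int, int]:
    """Reduce a stage's CPU/RAM request to what the host can provide."""
    if host_cpus is None:
        host_cpus = os.cpu_count() or 1  # cpu_count() may return None
    cpu = max(1, min(cpu, host_cpus))
    if host_mem_bytes is not None:
        # Integer arithmetic keeps the cap exact for large byte counts.
        mem_bytes = min(mem_bytes, host_mem_bytes * mem_percent // 100)
    return cpu, mem_bytes


# An 8-core / 32 GiB request on a 4-core / 16 GiB host.
cpu, mem = clamp_resources(8, 32 * GIB, host_cpus=4, host_mem_bytes=16 * GIB)
print(cpu)                # 4
print(mem < 16 * GIB)     # True
```

A request that already fits the host passes through unchanged, so well-sized stages are unaffected.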
|
|
202
|
+
|
|
203
|
+
### Added — first full-pipeline end-to-end test
|
|
204
|
+
- `tests/integration/test_full_pipeline_e2e.py`: runs the **entire**
|
|
205
|
+
`prokaryote_assembly` recipe (fastp → SPAdes → QUAST → Prokka) against
|
|
206
|
+
real BioContainers — the first time a complete pipeline (not just a
|
|
207
|
+
first stage, as the smoke matrix does) is validated end-to-end.
|
|
208
|
+
- New `data/test/phix_small/` fixture: phiX174 (`NC_001422.1`, 5386 bp) +
|
|
209
|
+
1000 wgsim-simulated 150 bp pairs (~56×, seed 42 → deterministic).
|
|
210
|
+
phiX assembles into a single ~5.4 kb contig in <1 min, so the whole
|
|
211
|
+
chain finishes in ~45 s.
|
|
212
|
+
- Asserts real data flow: assembled contig length 4.5–6 kb, QUAST
|
|
213
|
+
`report.tsv`, and Prokka annotation with ≥1 CDS. Verified locally
|
|
214
|
+
(phiX → 1 contig 5377 bp, 6 CDS).
|
|
215
|
+
- Wired into the nightly-smoke workflow as a second step.
|
|
216
|
+
|
|
217
|
+
## [0.2.1] — 2026-06-12
|
|
218
|
+
|
|
219
|
+
> **Why upgrade from 0.2.0**: the `v0.2.0` tag predated the registry
|
|
220
|
+
> freshness fix, so the 0.2.0 wheel bundled a registry whose
|
|
221
|
+
> `quay.io/biocontainers/*` tags had been garbage-collected from Quay —
|
|
222
|
+
> `rnaseq_deg`, `chip_seq`, `atac_seq`, `eukaryote_assembly`,
|
|
223
|
+
> `metagenome_assembly`, and others would fail at run time with "image
|
|
224
|
+
> not found". **0.2.1 ships the repaired, 107/107-pinned registry**, so
|
|
225
|
+
> recipes actually pull their containers. Everything below landed after
|
|
226
|
+
> the 0.2.0 cut and is new to PyPI users here.
|
|
227
|
+
|
|
228
|
+
### Changed — nightly smoke matrix expanded (3 → 5 recipes)
|
|
229
|
+
- Added real-container smoke cases for **chip_seq** (TrimGalore — a
|
|
230
|
+
container family shared by chip/atac/methylation, previously
|
|
231
|
+
unexercised) and **germline_variants** (validates the variant recipe's
|
|
232
|
+
fastp wiring). All five pass against real BioContainers locally
|
|
233
|
+
(~90 s), addressing more of the "522 mostly-mock tests" critique with
|
|
234
|
+
genuine end-to-end coverage.
|
|
235
|
+
|
|
236
|
+
### Changed — mypy type-checking is now blocking
|
|
237
|
+
- Fixed all 33 mypy errors across 7 modules (Rich `TaskID` optionals in
|
|
238
|
+
the progress bars, `dict[str, Any]` run kwargs, anthropic/openai/docker
|
|
239
|
+
SDK union + missing-stub noise, a loop-variable shadow in
|
|
240
|
+
`cli/update.py`). `mypy bioflow --ignore-missing-imports` now reports
|
|
241
|
+
**0 errors**.
|
|
242
|
+
- CI's `typecheck` job drops `continue-on-error` — a new type error now
|
|
243
|
+
fails the build, closing the "advisory only" gap.
|
|
244
|
+
|
|
245
|
+
### Added — nf-core concordance benchmark (harness + methodology)
|
|
246
|
+
- `scripts/compare_nfcore.py` (new, stdlib-only): scores agreement
|
|
247
|
+
between a bioflow output and the matching nf-core output —
|
|
248
|
+
**Jaccard + genotype concordance** on normalised VCF sites
|
|
249
|
+
(vs nf-core/sarek), and **Spearman ρ** of per-gene counts
|
|
250
|
+
(vs nf-core/rnaseq). Optional `--min-jaccard` / `--min-rho` gate for
|
|
251
|
+
CI.
|
|
252
|
+
- `docs/benchmarks/nfcore-concordance.md`: golden datasets
|
|
253
|
+
(GIAB HG002 chr20; nf-core/rnaseq chr22), method, and initial
|
|
254
|
+
acceptance thresholds.
|
|
255
|
+
- `.github/workflows/nfcore-concordance.yml`: manually-dispatched job
|
|
256
|
+
(not a per-PR gate — a full run needs staged references).
|
|
257
|
+
- Honesty note: bioflow ships the *scoring* half (committed + tested);
|
|
258
|
+
the *production* half (running both pipelines on a machine with the
|
|
259
|
+
references) is operator-run and documented.
|
|
260
|
+
- Tests: +13 (`tests/unit/test_compare_nfcore.py`).
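
The scoring metrics named above can be sketched with the standard library alone. This is an illustrative sketch, not the contents of `scripts/compare_nfcore.py`: the function names, the `(chrom, pos, ref, alt)` site key, and the genotype strings are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Site = Tuple[str, int, str, str]  # (chrom, pos, ref, alt); hypothetical site key


def jaccard(a: set, b: set) -> float:
    """Intersection over union; two empty sets count as full agreement."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def genotype_concordance(a: Dict[Site, str], b: Dict[Site, str]) -> float:
    """Fraction of shared sites whose genotype strings are identical."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[s] == b[s] for s in shared) / len(shared)


def _average_ranks(values: List[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: List[float], y: List[float]) -> float:
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    sx = sum((p - mx) ** 2 for p in rx) ** 0.5
    sy = sum((q - my) ** 2 for q in ry) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # a constant vector has no defined rank correlation
    return cov / (sx * sy)
```

A CI gate in the spirit of `--min-jaccard` / `--min-rho` would then compare these scores against a threshold and exit non-zero on a miss.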
|
|
261
|
+
|
|
262
|
+
### Added — GPU passthrough + Podman runtime
|
|
263
|
+
- `@stage(..., gpu=True)` (and a tool YAML's `resources.gpu`) now attach
|
|
264
|
+
all host GPUs to that stage's container via a Docker `DeviceRequest`
|
|
265
|
+
(the API equivalent of `--gpus all`); needs the NVIDIA Container
|
|
266
|
+
Toolkit, and degrades to a warning on CPU-only hosts rather than
|
|
267
|
+
failing. Threaded through `Stage`, the `@stage` decorator, and the
|
|
268
|
+
preset `run_plan` path.
|
|
269
|
+
- `DockerBackend` works with **Podman**: it honours `BIOFLOW_DOCKER_HOST`
|
|
270
|
+
/ `DOCKER_HOST` (point it at the Podman API socket) and an optional
|
|
271
|
+
`base_url`, and reads `BIOFLOW_CONTAINER_RUNTIME`.
|
|
272
|
+
- `bioflow doctor` recognises Podman as a Docker alternative — the
|
|
273
|
+
`docker_cli` / `docker_daemon` checks fall back to `podman` and report
|
|
274
|
+
which runtime they found.
|
|
275
|
+
- Tests: +10 (`tests/unit/test_gpu_podman.py`). Backend `run()` gains a
|
|
276
|
+
`gpu` kwarg (the `ContainerBackend` protocol + MockBackend updated).
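
As a rough illustration of the socket selection described above, the sketch below resolves a daemon URL from an explicit `base_url` and the two environment variables. The precedence order (explicit argument, then `BIOFLOW_DOCKER_HOST`, then `DOCKER_HOST`) and the function name are assumptions; the changelog does not state how `DockerBackend` orders them.

```python
import os
from typing import Mapping, Optional


def resolve_docker_host(
    base_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the container API endpoint; None means 'use the client default'."""
    env = os.environ if env is None else env
    if base_url:
        return base_url  # assumed: an explicit argument wins over the environment
    for name in ("BIOFLOW_DOCKER_HOST", "DOCKER_HOST"):
        value = env.get(name)
        if value:
            return value
    return None
```

For rootless Podman, the value would typically be the user's API socket, for example `unix:///run/user/1000/podman/podman.sock`.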
|
|
277
|
+
|
|
278
|
+
### Added — reference-DB catalog expansion + refgenie manifest
|
|
279
|
+
- `bioflow/core/db.py` catalog gains the references real recipes need:
|
|
280
|
+
GATK known-sites `dbsnp_grch38` + `mills_indels_grch38` (BQSR/VQSR),
|
|
281
|
+
`encode_blacklist_grch38` (ChIP/ATAC peak filtering), and
|
|
282
|
+
`gencode_grch38` (STAR/Salmon/featureCounts annotation).
|
|
283
|
+
- Catalog entries now carry `genome` + `asset` tags, and a new
|
|
284
|
+
`refgenie_manifest()` / `bioflow db manifest` emits a
|
|
285
|
+
[refgenie](https://refgenie.databio.org/)-compatible JSON mapping
|
|
286
|
+
`<genome>/<asset>` → catalogued DB, so labs already standardised on
|
|
287
|
+
refgenie can see which existing assets satisfy a bioflow requirement.
|
|
288
|
+
- Tests: +5 (`tests/unit/test_db.py`).
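
The `<genome>/<asset>` mapping can be pictured with a small sketch. The entry layout, the tag values, and the function name below are hypothetical; the real `refgenie_manifest()` output and the catalog schema in `bioflow/core/db.py` may differ.

```python
import json
from typing import Dict, Iterable, Mapping

# Hypothetical catalog entries; only the DB names come from the changelog.
CATALOG = [
    {"name": "dbsnp_grch38", "genome": "grch38", "asset": "dbsnp"},
    {"name": "gencode_grch38", "genome": "grch38", "asset": "gencode_gtf"},
    {"name": "untagged_db"},  # no genome/asset tags, so it is left out
]


def manifest_sketch(catalog: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Map '<genome>/<asset>' to the catalogued DB name for tagged entries."""
    return {
        f"{entry['genome']}/{entry['asset']}": entry["name"]
        for entry in catalog
        if entry.get("genome") and entry.get("asset")
    }


if __name__ == "__main__":
    print(json.dumps(manifest_sketch(CATALOG), indent=2, sort_keys=True))
```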
|
|
289
|
+
|
|
290
|
+
### Added — rnaseq_deg depth: GO enrichment + MultiQC
|
|
291
|
+
- `rnaseq_deg` extended from 4 → 6 stages:
|
|
292
|
+
- **`enrich_go`** — GO enrichment on the significant DEGs via gseapy's
|
|
293
|
+
Enrichr query. Symbol-based, so no organism-specific OrgDb package
|
|
294
|
+
is needed; an awk numeric-regex guard on the `padj` column avoids the
|
|
295
|
+
classic "NA-treated-as-0" false positive.
|
|
296
|
+
- **`multiqc_report`** — aggregates every per-sample fastp + Salmon
|
|
297
|
+
report into a single MultiQC HTML.
|
|
298
|
+
- DAG-shape test updated; recipe still runs end-to-end under MockBackend.
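
The `padj` guard is implemented in awk in the recipe; the Python analogue below only illustrates the idea of rejecting non-numeric values before any numeric comparison. The regular expression, the 0.05 cutoff, and the function name are assumptions for the example.

```python
import re
from typing import Iterable, List, Tuple

# Plain decimals with an optional exponent; "NA", "NaN" and "" do not match.
_NUMERIC = re.compile(r"[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?")


def significant_genes(rows: Iterable[Tuple[str, str]], alpha: float = 0.05) -> List[str]:
    """Keep genes whose padj is a real number below alpha."""
    keep = []
    for gene, padj in rows:
        text = padj.strip()
        if not _NUMERIC.fullmatch(text):
            continue  # skip instead of letting a non-number count as 0
        if float(text) < alpha:
            keep.append(gene)
    return keep
```

Without the guard, a tool that coerces `NA` to `0` would report every untested gene as significant.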
|
|
299
|
+
|
|
300
|
+
### Added — `joint_genotyping` recipe (GATK cohort best practice)
|
|
301
|
+
- New 7-stage recipe (`bioflow/recipes/variant_calling/joint_genotyping.py`)
|
|
302
|
+
implementing the canonical GATK **joint-genotyping** workflow for
|
|
303
|
+
cohorts, where `germline_variants` only does single-sample direct
|
|
304
|
+
calling:
|
|
305
|
+
- **per sample (fan-out)**: fastp → BWA-MEM → MarkDuplicates →
|
|
306
|
+
HaplotypeCaller `-ERC GVCF`
|
|
307
|
+
- **cohort (converge)**: CombineGVCFs → GenotypeGVCFs →
|
|
308
|
+
best-practice hard filtering (separate SNP / INDEL filters) → SnpEff
|
|
309
|
+
- Takes a `sample_id,fastq_r1,fastq_r2` sample sheet and uses `.starmap`
|
|
310
|
+
to run the per-sample stages in parallel before converging on the
|
|
311
|
+
joint steps — the production pattern reviewers expect for population
|
|
312
|
+
and family studies, and a worked example of bioflow's fan-out.
|
|
313
|
+
- Recipe count 19 → 20 (12 per-pipeline). Tests: +3 (registration, DAG
|
|
314
|
+
shape, MockBackend e2e).
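
The fan-out / converge shape can be sketched with the standard library. This does not use bioflow's `.starmap`; `ThreadPoolExecutor` stands in for it, and the two stage functions are placeholders that only return file names.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def read_sample_sheet(text: str) -> List[Tuple[str, str, str]]:
    """Parse a sample_id,fastq_r1,fastq_r2 sheet into argument tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(r["sample_id"], r["fastq_r1"], r["fastq_r2"]) for r in reader]


def call_sample(sample_id: str, r1: str, r2: str) -> str:
    # Placeholder for fastp -> BWA-MEM -> MarkDuplicates -> HaplotypeCaller -ERC GVCF.
    return f"{sample_id}.g.vcf.gz"


def joint_genotype(gvcfs: List[str]) -> Dict[str, object]:
    # Placeholder for CombineGVCFs -> GenotypeGVCFs -> hard filtering -> SnpEff.
    return {"gvcfs": gvcfs, "cohort_vcf": "cohort.vcf.gz"}


def run_cohort(sheet_text: str) -> Dict[str, object]:
    samples = read_sample_sheet(sheet_text)
    with ThreadPoolExecutor() as pool:
        # Fan out: one task per sample; map() keeps the sheet order.
        gvcfs = list(pool.map(lambda args: call_sample(*args), samples))
    # Converge: the joint steps see every per-sample GVCF at once.
    return joint_genotype(gvcfs)
```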
|
|
315
|
+
|
|
316
|
+
### Added — run provenance (RO-Crate + PROV-style JSON)
|
|
317
|
+
- `bioflow/core/provenance.py` (new): every recipe run records, per
|
|
318
|
+
stage, the container **image + content digest**, the exact
|
|
319
|
+
**command**, every **input file's SHA-256 + size**, **start/end**
|
|
320
|
+
timestamps, exit code, and the bioflow version.
|
|
321
|
+
- At the end of a run the workspace gains two self-describing files:
|
|
322
|
+
- `provenance.json` — flat, human-readable run record
|
|
323
|
+
- `ro-crate-metadata.json` — an [RO-Crate 1.1](https://www.researchobject.org/ro-crate/)
|
|
324
|
+
research object (the de-facto packaging standard for computational
|
|
325
|
+
workflow runs), so the output directory is consumable by reviewers
|
|
326
|
+
and downstream tools directly.
|
|
327
|
+
- Wired into `bioflow recipe run` (on by default; `--no-provenance` to
|
|
328
|
+
skip). For SDK callers it is opt-in and **zero-cost when off**: the hot path pays

|
|
329
|
+
nothing unless a recorder is installed, and provenance errors degrade
|
|
330
|
+
to warnings rather than aborting the science.
|
|
331
|
+
- New `bioflow provenance show <workspace>` command (+ `--json`) renders
|
|
332
|
+
the recorded run: per-stage image, pinned digest, exit status, and
|
|
333
|
+
input hashes. Builds directly on the digest-pinning work — pinned
|
|
334
|
+
tools show their `sha256:…` in the provenance.
|
|
335
|
+
- Tests: +14 (`tests/unit/test_provenance.py`); verified end-to-end
|
|
336
|
+
against real Docker (digest resolved from the local image, RO-Crate
|
|
337
|
+
validates structurally).
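
A per-stage record of the kind listed above might be assembled as follows. The field names and helper names are hypothetical and are not taken from `bioflow/core/provenance.py`; only the recorded quantities (image, digest, command, input SHA-256 and size, timestamps, exit code) come from the changelog.

```python
import hashlib
import json
import os
import tempfile
from typing import Dict, List


def file_fingerprint(path: str, chunk_size: int = 1 << 20) -> Dict[str, object]:
    """Stream a file once to get its SHA-256 and byte size."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
            size += len(block)
    return {"path": path, "sha256": digest.hexdigest(), "size": size}


def stage_record(stage: str, image: str, image_digest: str, command: List[str],
                 input_paths: List[str], exit_code: int,
                 started: str, ended: str) -> Dict[str, object]:
    """Build one JSON-serialisable stage entry (hypothetical layout)."""
    return {
        "stage": stage,
        "image": image,
        "image_digest": image_digest,
        "command": command,
        "inputs": [file_fingerprint(p) for p in input_paths],
        "started": started,
        "ended": ended,
        "exit_code": exit_code,
    }
```

Reading in fixed-size chunks keeps memory flat for multi-gigabyte FASTQ inputs.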
|
|
338
|
+
|
|
339
|
+
### Fixed — registry freshness (stale BioContainer tags)
|
|
340
|
+
- **Discovery**: an audit during digest-pinning found that ~half the
|
|
341
|
+
registry's `quay.io/biocontainers/*` image tags had 404'd. Quay
|
|
342
|
+
rotates each package's `--<buildhash>_<n>` build suffix and garbage-
|
|
343
|
+
collects the old ones, so dozens of recipes (chip_seq, atac_seq,
|
|
344
|
+
eukaryote_assembly, metagenome_assembly, rnaseq_deg's Salmon stage,
|
|
345
|
+
…) would have failed at run time with "image not found" — unrelated
|
|
346
|
+
to the user's data.
|
|
347
|
+
- `scripts/refresh_tags.py` (new): audits every Quay BioContainer
|
|
348
|
+
reference, and with `--apply` rewrites any dead tag to the newest
|
|
349
|
+
*same-version* build (never changing the upstream software version).
|
|
350
|
+
Non-Quay images and versions that have left Quay entirely are
|
|
351
|
+
reported for manual review.
|
|
352
|
+
- Applied: **34 registry tool YAMLs** + **7 recipe-hardcoded images**
|
|
353
|
+
(bowtie2, bracken, flye, macs3, medaka, metabat2, tobias) re-pointed
|
|
354
|
+
to live tags. Salmon's `1.10.3--hb950928_0` → `1.10.3--h45fbf2d_5`
|
|
355
|
+
fixed separately (it broke the rnaseq_deg quant stage).
|
|
356
|
+
- Verified: the bumped images pull + run (e.g. `bowtie2 2.5.4`), and
|
|
357
|
+
the nightly smoke matrix is green.
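
The same-version rewrite rule can be sketched as below. This is not the logic of `scripts/refresh_tags.py`: treating the highest `_<n>` build number as "newest" is an assumption, and the second live tag in the usage note is invented for illustration.

```python
from typing import Iterable, Optional, Tuple


def parse_tag(tag: str) -> Optional[Tuple[str, str, int]]:
    """Split '<version>--<buildhash>_<n>' into its three parts."""
    version, sep, suffix = tag.partition("--")
    if not sep:
        return None
    buildhash, sep, number = suffix.rpartition("_")
    if not sep or not number.isdigit():
        return None
    return version, buildhash, int(number)


def pick_replacement(dead_tag: str, live_tags: Iterable[str]) -> Optional[str]:
    """Return a live tag with the same upstream version, or None if the version is gone."""
    parsed = parse_tag(dead_tag)
    if parsed is None:
        return None
    version = parsed[0]
    candidates = []
    for tag in live_tags:
        live = parse_tag(tag)
        if live is not None and live[0] == version:
            candidates.append((live[2], tag))  # assumed: higher build number is newer
    return max(candidates)[1] if candidates else None
```

With live tags `1.10.3--h45fbf2d_5`, `1.10.3--h0000000_2` (made up) and `1.10.2--h0000000_0` (made up), the dead Salmon tag `1.10.3--hb950928_0` maps to `1.10.3--h45fbf2d_5`. A `None` result corresponds to the cases reported for manual review.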
|
|
358
|
+
|
|
359
|
+
### Added — full digest pinning (registry now 100% content-addressed)
|
|
360
|
+
- Digest coverage raised from 5/110 to **107/107 active tools** in two
|
|
361
|
+
passes. The second pass resolved the 17 hold-outs:
|
|
362
|
+
- **Version changes** to the newest BioContainer of the same tool where
|
|
363
|
+
the pinned version had left Quay: DESeq2 1.44→1.50.2, edgeR
|
|
364
|
+
4.2→4.8.2, limma 3.60→3.66, clusterProfiler 4.12→4.18.4, topGO
|
|
365
|
+
2.56→2.62, HOMER 4.11.1→5.1, Scanpy 1.10.1→1.7.2 (1.10 was never on
|
|
366
|
+
Quay biocontainers), Comet 2024020→2026011, Percolator 3.06.1→3.7.1,
|
|
367
|
+
InterProScan 5.67→5.59.
|
|
368
|
+
- **Image-source switches**: methylKit and monocle3 moved off the
|
|
369
|
+
multi-GB `bioconductor/bioconductor_full:RELEASE_3_18` (gone) to the
|
|
370
|
+
dedicated `bioconductor-methylkit` / `r-monocle3` BioContainers;
|
|
371
|
+
Cell Ranger tag `7.2.0`→`v7.2.0`; Seurat `5.0.1`→`5.0.0`.
|
|
372
|
+
- **Deprecations**: MaxQuant joins MSFragger + FragPipe as
|
|
373
|
+
`deprecated: true` — proprietary tools whose images are gone and
|
|
374
|
+
which the `proteomics_dda` recipe does not use (it runs the
|
|
375
|
+
open-source msconvert → Comet → Percolator stack). The audit now
|
|
376
|
+
skips deprecated tools.
|
|
377
|
+
- **CI `digest-audit` is now blocking** (was advisory): every active
|
|
378
|
+
tool must carry an `image_digest`, so the registry can never silently
|
|
379
|
+
rot again.
|
|
380
|
+
- `scripts/refresh_tags.py` also reports `version_gone` and non-Quay
|
|
381
|
+
images for these manual cases.
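
The blocking rule ("every active tool must carry an `image_digest`", with deprecated tools skipped) reduces to a small check. The dictionary layout and function name below are hypothetical stand-ins for the parsed tool YAMLs, and the digest values are placeholders.

```python
from typing import Dict, List, Mapping


def tools_missing_digest(tools: Mapping[str, Mapping[str, object]]) -> List[str]:
    """Names of active tools without a sha256 image digest."""
    missing = []
    for name, spec in tools.items():
        if spec.get("deprecated"):
            continue  # deprecated tools are exempt from the audit
        digest = str(spec.get("image_digest") or "")
        if not digest.startswith("sha256:"):
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    registry: Dict[str, Dict[str, object]] = {
        "salmon": {"image_digest": "sha256:placeholder"},
        "maxquant": {"deprecated": True},
        "newtool": {},
    }
    failures = tools_missing_digest(registry)
    # A blocking CI job would exit non-zero when this list is not empty.
    raise SystemExit(1 if failures else 0)
```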
|
|
382
|
+
|
|
383
|
+
### Added — Bioconda recipe (prep)
|
|
384
|
+
- `conda-recipe/meta.yaml`: noarch-python Bioconda recipe (only
|
|
385
|
+
bioflow's pure-Python stack is a conda dep; tools run as Docker
|
|
386
|
+
containers). Submission walkthrough in `docs/MAINTAINER.md` Part 7.
|
|
387
|
+
Gated on the real-PyPI publish (the recipe sources the PyPI sdist).
|
|
388
|
+
|
|
389
|
+
### Changed — CI
|
|
390
|
+
- All workflows opt into the Node 24 runtime
|
|
391
|
+
(`FORCE_JAVASCRIPT_ACTIONS_TO_NODE24`) ahead of the 2026-06-16 forced
|
|
392
|
+
switch, silencing the Node 20 deprecation warnings.
|
|
393
|
+
|
|
394
|
+
### Fixed — nightly smoke
|
|
395
|
+
- `rnaseq_deg.qc_one` smoke assertion now matches the stage's real output
|
|
396
|
+
names (`<sample_id>_R1.clean.fq.gz`), fixing a false CI failure.
|
|
397
|
+
|
|
398
|
+
---
|
|
399
|
+
|
|
15
400
|
## [0.2.0] — 2026-06-05
|
|
16
401
|
|
|
17
402
|
First PyPI release. Three months of 0.1.x work consolidated into a
|