bioflowkit 0.2.0__tar.gz → 0.3.0__tar.gz

This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.
Files changed (470)
  1. bioflowkit-0.3.0/.gitattributes +20 -0
  2. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/candidate-smoke-test.yml +5 -0
  3. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/ci.yml +17 -13
  4. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/docs.yml +5 -0
  5. bioflowkit-0.3.0/.github/workflows/nfcore-concordance.yml +75 -0
  6. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/nightly-smoke.yml +15 -2
  7. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.github/workflows/release.yml +5 -0
  8. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/.gitignore +1 -0
  9. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/CHANGELOG.md +385 -0
  10. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/PKG-INFO +31 -16
  11. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/README.md +30 -15
  12. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/__init__.py +3 -1
  13. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/__init__.py +2 -0
  14. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/db.py +25 -5
  15. bioflowkit-0.3.0/bioflow/cli/provenance.py +82 -0
  16. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/recipe.py +22 -0
  17. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/update.py +7 -6
  18. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/db.py +117 -0
  19. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/doctor.py +35 -17
  20. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/ncbi.py +10 -3
  21. bioflowkit-0.3.0/bioflow/core/provenance.py +422 -0
  22. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/runner.py +145 -10
  23. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/io.py +1 -1
  24. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/llm/__init__.py +3 -2
  25. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/__init__.py +1 -0
  26. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/ani_matrix.py +9 -5
  27. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/epigenomics/atac_seq.py +7 -5
  28. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/epigenomics/chip_seq.py +13 -10
  29. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/genome_assembly/eukaryote_assembly.py +16 -9
  30. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/metagenomics/metagenome_assembly.py +7 -3
  31. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/metagenomics/metagenomics_profile.py +5 -4
  32. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/methylation/bismark_wgbs.py +68 -9
  33. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/proteomics/proteomics_dda.py +7 -4
  34. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/rnaseq_deg/rnaseq_deg.py +79 -14
  35. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/single_cell/scrna_seq.py +1 -1
  36. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/variant_calling/germline_variants.py +46 -13
  37. bioflowkit-0.3.0/bioflow/recipes/variant_calling/joint_genotyping.py +277 -0
  38. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/__init__.py +43 -0
  39. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_paths.py +35 -1
  40. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_stage.py +22 -1
  41. bioflowkit-0.3.0/conda-recipe/meta.yaml +86 -0
  42. bioflowkit-0.3.0/data/test/cafe_small/README.md +21 -0
  43. bioflowkit-0.3.0/data/test/cafe_small/families.tsv +61 -0
  44. bioflowkit-0.3.0/data/test/cafe_small/tree.nwk +1 -0
  45. bioflowkit-0.3.0/data/test/genomes_small/README.md +28 -0
  46. bioflowkit-0.3.0/data/test/genomes_small/genome1.fna +78 -0
  47. bioflowkit-0.3.0/data/test/genomes_small/genome2.fna +78 -0
  48. bioflowkit-0.3.0/data/test/gwas_small/README.md +22 -0
  49. bioflowkit-0.3.0/data/test/gwas_small/gene_presence_absence.csv +13 -0
  50. bioflowkit-0.3.0/data/test/gwas_small/traits.csv +11 -0
  51. bioflowkit-0.3.0/data/test/methyl_small/README.md +29 -0
  52. bioflowkit-0.3.0/data/test/methyl_small/genome.fa +79 -0
  53. bioflowkit-0.3.0/data/test/methyl_small/sample01_R1.fastq.gz +0 -0
  54. bioflowkit-0.3.0/data/test/methyl_small/sample01_R2.fastq.gz +0 -0
  55. bioflowkit-0.3.0/data/test/phix_small/README.md +41 -0
  56. bioflowkit-0.3.0/data/test/phix_small/reference.fa +79 -0
  57. bioflowkit-0.3.0/data/test/phix_small/sim_R1.fastq.gz +0 -0
  58. bioflowkit-0.3.0/data/test/phix_small/sim_R2.fastq.gz +0 -0
  59. bioflowkit-0.3.0/data/test/phylo_small/README.md +25 -0
  60. bioflowkit-0.3.0/data/test/phylo_small/gene_presence_absence.csv +9 -0
  61. bioflowkit-0.3.0/data/test/phylo_small/gffs/g1.ffn +67 -0
  62. bioflowkit-0.3.0/data/test/phylo_small/gffs/g1.gff +100 -0
  63. bioflowkit-0.3.0/data/test/phylo_small/gffs/g2.ffn +67 -0
  64. bioflowkit-0.3.0/data/test/phylo_small/gffs/g2.gff +100 -0
  65. bioflowkit-0.3.0/data/test/phylo_small/gffs/g3.ffn +72 -0
  66. bioflowkit-0.3.0/data/test/phylo_small/gffs/g3.gff +101 -0
  67. bioflowkit-0.3.0/data/test/phylo_small/gffs/g4.ffn +67 -0
  68. bioflowkit-0.3.0/data/test/phylo_small/gffs/g4.gff +101 -0
  69. bioflowkit-0.3.0/data/test/rnaseq_small/README.md +22 -0
  70. bioflowkit-0.3.0/data/test/rnaseq_small/ctl1_R1.fastq.gz +0 -0
  71. bioflowkit-0.3.0/data/test/rnaseq_small/ctl1_R2.fastq.gz +0 -0
  72. bioflowkit-0.3.0/data/test/rnaseq_small/ctl2_R1.fastq.gz +0 -0
  73. bioflowkit-0.3.0/data/test/rnaseq_small/ctl2_R2.fastq.gz +0 -0
  74. bioflowkit-0.3.0/data/test/rnaseq_small/transcriptome.fa +552 -0
  75. bioflowkit-0.3.0/data/test/rnaseq_small/trt1_R1.fastq.gz +0 -0
  76. bioflowkit-0.3.0/data/test/rnaseq_small/trt1_R2.fastq.gz +0 -0
  77. bioflowkit-0.3.0/data/test/rnaseq_small/trt2_R1.fastq.gz +0 -0
  78. bioflowkit-0.3.0/data/test/rnaseq_small/trt2_R2.fastq.gz +0 -0
  79. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/MAINTAINER.md +42 -1
  80. bioflowkit-0.3.0/docs/benchmarks/nfcore-concordance.md +90 -0
  81. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/index.md +3 -2
  82. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/install.md +6 -0
  83. bioflowkit-0.3.0/docs/reference/e2e-coverage.md +68 -0
  84. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/reference/recipes.md +38 -21
  85. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/reference/tools.md +50 -50
  86. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/mkdocs.yml +3 -0
  87. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/pyproject.toml +1 -1
  88. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/bedtools.yaml +3 -2
  89. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/bowtie2.yaml +5 -1
  90. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/bwa_mem2.yaml +3 -2
  91. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/minimap2.yaml +3 -2
  92. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/samtools.yaml +9 -5
  93. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/abyss.yaml +4 -3
  94. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/canu.yaml +3 -2
  95. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/flye.yaml +4 -3
  96. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/hifiasm.yaml +3 -2
  97. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/masurca.yaml +4 -3
  98. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/medaka.yaml +4 -3
  99. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/megahit.yaml +3 -2
  100. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/nextdenovo.yaml +4 -3
  101. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/nextpolish.yaml +4 -3
  102. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/pilon.yaml +3 -2
  103. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/racon.yaml +3 -2
  104. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/raven.yaml +4 -3
  105. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/shasta.yaml +4 -3
  106. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/unicycler.yaml +4 -3
  107. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/verkko.yaml +3 -2
  108. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/busco.yaml +3 -2
  109. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/checkm2.yaml +3 -2
  110. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/compleasm.yaml +3 -2
  111. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/gfastats.yaml +4 -3
  112. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/merqury.yaml +3 -2
  113. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/abricate.yaml +1 -0
  114. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/cafe5.yaml +1 -0
  115. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/diamond.yaml +3 -2
  116. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/fastani.yaml +1 -0
  117. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/iqtree.yaml +1 -0
  118. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/mafft.yaml +3 -2
  119. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/mash.yaml +4 -3
  120. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/panaroo.yaml +3 -2
  121. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/roary.yaml +1 -0
  122. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/scoary.yaml +3 -2
  123. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/comparative_genomics/skani.yaml +4 -3
  124. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/deg/deseq2.yaml +4 -3
  125. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/deg/edger.yaml +4 -3
  126. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/deg/limma_voom.yaml +4 -3
  127. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/enrichment/clusterprofiler.yaml +4 -3
  128. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/enrichment/enrichr.yaml +4 -3
  129. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/enrichment/gseapy.yaml +4 -3
  130. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/enrichment/topgo.yaml +4 -3
  131. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/bismark.yaml +1 -0
  132. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/deeptools.yaml +1 -0
  133. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/homer.yaml +2 -1
  134. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/macs3.yaml +2 -1
  135. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/methylkit.yaml +2 -1
  136. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/methylpy.yaml +4 -3
  137. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/picard.yaml +3 -2
  138. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/epigenomics/tobias.yaml +2 -1
  139. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/antismash.yaml +3 -2
  140. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/dbcan.yaml +3 -2
  141. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/eggnog_mapper.yaml +3 -2
  142. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/gtdbtk.yaml +3 -2
  143. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/func_annot/interproscan.yaml +4 -3
  144. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/bracken.yaml +2 -1
  145. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/humann3.yaml +2 -1
  146. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/kneaddata.yaml +1 -0
  147. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/kraken2.yaml +1 -0
  148. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/lefse.yaml +1 -0
  149. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/maxbin2.yaml +4 -3
  150. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/metabat2.yaml +4 -3
  151. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/metagenomics/metaphlan4.yaml +1 -0
  152. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/comet.yaml +4 -3
  153. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/maxquant.yaml +10 -1
  154. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/msconvert.yaml +1 -0
  155. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/openms.yaml +4 -3
  156. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/percolator.yaml +2 -1
  157. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/xtandem.yaml +3 -2
  158. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/cutadapt.yaml +4 -3
  159. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/fastqc.yaml +3 -2
  160. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/filtlong.yaml +4 -3
  161. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/multiqc.yaml +3 -2
  162. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/nanoplot.yaml +3 -2
  163. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/seqkit.yaml +3 -2
  164. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/trimgalore.yaml +1 -0
  165. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/repeat/earlgrey.yaml +4 -3
  166. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/repeat/repeatmasker.yaml +3 -2
  167. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/repeat/repeatmodeler.yaml +3 -2
  168. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/hisat2.yaml +3 -2
  169. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/kallisto.yaml +4 -3
  170. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/rsem.yaml +4 -3
  171. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/salmon.yaml +4 -3
  172. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/star.yaml +3 -2
  173. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/stringtie.yaml +4 -3
  174. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/rnaseq_align/subread.yaml +3 -2
  175. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/bustools.yaml +4 -3
  176. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/cellranger.yaml +2 -1
  177. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/harmony.yaml +4 -3
  178. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/monocle3.yaml +2 -1
  179. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/scanpy.yaml +2 -1
  180. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/scrublet.yaml +4 -3
  181. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/seurat.yaml +2 -1
  182. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/single_cell/starsolo.yaml +1 -0
  183. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/augustus.yaml +4 -3
  184. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/bakta.yaml +3 -2
  185. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/braker3.yaml +3 -2
  186. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/liftoff.yaml +3 -2
  187. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/variant_calling/bcftools.yaml +3 -2
  188. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/variant_calling/freebayes.yaml +4 -3
  189. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/variant_calling/gatk4.yaml +3 -2
  190. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/variant_calling/snpeff.yaml +3 -2
  191. bioflowkit-0.3.0/scripts/compare_nfcore.py +234 -0
  192. bioflowkit-0.3.0/scripts/gen_methyl_fixture.py +89 -0
  193. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/pin_digests.py +12 -2
  194. bioflowkit-0.3.0/scripts/refresh_tags.py +199 -0
  195. bioflowkit-0.3.0/tests/integration/test_full_pipeline_e2e.py +343 -0
  196. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/test_recipe_smoke_matrix.py +31 -1
  197. bioflowkit-0.3.0/tests/unit/test_compare_nfcore.py +170 -0
  198. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_db.py +45 -0
  199. bioflowkit-0.3.0/tests/unit/test_docker_timeout.py +130 -0
  200. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_failure_report.py +2 -1
  201. bioflowkit-0.3.0/tests/unit/test_gpu_podman.py +203 -0
  202. bioflowkit-0.3.0/tests/unit/test_provenance.py +232 -0
  203. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipes_per_pipeline.py +7 -6
  204. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipes_per_pipeline_e2e.py +32 -3
  205. bioflowkit-0.3.0/tests/unit/test_resource_clamp.py +54 -0
  206. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_run_resume.py +1 -0
  207. bioflowkit-0.3.0/tests/unit/test_unsafe_paths.py +71 -0
  208. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/CODE_OF_CONDUCT.md +0 -0
  209. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/CONTRIBUTING.md +0 -0
  210. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/LICENSE +0 -0
  211. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/SECURITY.md +0 -0
  212. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/README.md +0 -0
  213. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_ananatis_019464615_1.card.tsv +0 -0
  214. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_ananatis_019464615_1.plasmidfinder.tsv +0 -0
  215. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_ananatis_019464615_1.vfdb.tsv +0 -0
  216. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_aquatica_900095885_1.card.tsv +0 -0
  217. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_aquatica_900095885_1.plasmidfinder.tsv +0 -0
  218. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_aquatica_900095885_1.vfdb.tsv +0 -0
  219. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_chrysanthemi_000023565_1.card.tsv +0 -0
  220. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_chrysanthemi_000023565_1.plasmidfinder.tsv +0 -0
  221. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_chrysanthemi_000023565_1.vfdb.tsv +0 -0
  222. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dadantii_003049785_1.card.tsv +0 -0
  223. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dadantii_003049785_1.plasmidfinder.tsv +0 -0
  224. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dadantii_003049785_1.vfdb.tsv +0 -0
  225. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dianthicola_003403135_1.card.tsv +0 -0
  226. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dianthicola_003403135_1.plasmidfinder.tsv +0 -0
  227. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_dianthicola_003403135_1.vfdb.tsv +0 -0
  228. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_fangzhongdai_002812485_1.card.tsv +0 -0
  229. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_fangzhongdai_002812485_1.plasmidfinder.tsv +0 -0
  230. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_fangzhongdai_002812485_1.vfdb.tsv +0 -0
  231. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_lacustris_003934295_1.card.tsv +0 -0
  232. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_lacustris_003934295_1.plasmidfinder.tsv +0 -0
  233. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_lacustris_003934295_1.vfdb.tsv +0 -0
  234. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_oryzae_020406815_2.card.tsv +0 -0
  235. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_oryzae_020406815_2.plasmidfinder.tsv +0 -0
  236. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_oryzae_020406815_2.vfdb.tsv +0 -0
  237. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_parazeae_000025065_1.card.tsv +0 -0
  238. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_parazeae_000025065_1.plasmidfinder.tsv +0 -0
  239. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_parazeae_000025065_1.vfdb.tsv +0 -0
  240. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_poaceiphila_007858975_2.card.tsv +0 -0
  241. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_poaceiphila_007858975_2.plasmidfinder.tsv +0 -0
  242. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_poaceiphila_007858975_2.vfdb.tsv +0 -0
  243. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_solani_001644705_1.card.tsv +0 -0
  244. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_solani_001644705_1.plasmidfinder.tsv +0 -0
  245. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_solani_001644705_1.vfdb.tsv +0 -0
  246. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_undicola_000784735_1.card.tsv +0 -0
  247. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_undicola_000784735_1.plasmidfinder.tsv +0 -0
  248. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_undicola_000784735_1.vfdb.tsv +0 -0
  249. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_zeae_002887555_1.card.tsv +0 -0
  250. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_zeae_002887555_1.plasmidfinder.tsv +0 -0
  251. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/D_zeae_002887555_1.vfdb.tsv +0 -0
  252. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/_summary_card.tsv +0 -0
  253. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/_summary_plasmidfinder.tsv +0 -0
  254. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/abricate/_summary_vfdb.tsv +0 -0
  255. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/results/Base_clade_results.txt +0 -0
  256. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/results/Base_family_likelihoods.txt +0 -0
  257. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/results/Base_family_results.txt +0 -0
  258. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/results/Base_results.txt +0 -0
  259. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/cafe/vfdb_counts.tsv +0 -0
  260. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/eggnog/cog_counts_by_bucket.tsv +0 -0
  261. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/eggnog/cog_fractions_by_bucket.tsv +0 -0
  262. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_card.png +0 -0
  263. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_full_boxplot.png +0 -0
  264. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_full_card.png +0 -0
  265. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_full_plasmidfinder.png +0 -0
  266. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_full_vfdb.png +0 -0
  267. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_plasmidfinder.png +0 -0
  268. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/abricate_vfdb.png +0 -0
  269. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/ani_full_heatmap.png +0 -0
  270. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/ani_heatmap.png +0 -0
  271. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/cafe_hcp_detail.png +0 -0
  272. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/cafe_vfdb_tree.png +0 -0
  273. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/cog_delta.png +0 -0
  274. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/cog_stacked.png +0 -0
  275. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/pangenome_curve.png +0 -0
  276. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/pangenome_full_curve.png +0 -0
  277. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/pangenome_full_pie.png +0 -0
  278. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/pangenome_pie.png +0 -0
  279. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_full_is_dianthicola.png +0 -0
  280. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_full_is_solani.png +0 -0
  281. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_full_soft_rot.png +0 -0
  282. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_full_vascular_wilt.png +0 -0
  283. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_soft_rot.png +0 -0
  284. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/scoary_vascular_wilt.png +0 -0
  285. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/solani_island_gc.png +0 -0
  286. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/solani_island_synteny.png +0 -0
  287. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/tree_ani_nj.png +0 -0
  288. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/tree_full_with_vfdb.png +0 -0
  289. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/figures/tree_ml_iqtree.png +0 -0
  290. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/phylogeny/ani_nj.nwk +0 -0
  291. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/phylogeny/iqtree.treefile +0 -0
  292. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/phylogeny_full/iqtree_full.treefile +0 -0
  293. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary/top25_soft_rot.tsv +0 -0
  294. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary/top25_vascular_wilt.tsv +0 -0
  295. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary/traits.csv +0 -0
  296. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/top30_is_dianthicola.tsv +0 -0
  297. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/top30_is_solani.tsv +0 -0
  298. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/top30_soft_rot.tsv +0 -0
  299. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/top30_vascular_wilt.tsv +0 -0
  300. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/scoary_full/traits.csv +0 -0
  301. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/analysis/dickeya/summary.html +0 -0
  302. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/__main__.py +0 -0
  303. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/_app.py +0 -0
  304. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/doctor.py +0 -0
  305. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/hw.py +0 -0
  306. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/llm.py +0 -0
  307. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/ncbi.py +0 -0
  308. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/pipelines.py +0 -0
  309. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/cli/setup.py +0 -0
  310. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/__init__.py +0 -0
  311. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/approve.py +0 -0
  312. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/checkpoint.py +0 -0
  313. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/compatibility.py +0 -0
  314. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/dag.py +0 -0
  315. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/hardware.py +0 -0
  316. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/logger.py +0 -0
  317. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/planner.py +0 -0
  318. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/registry.py +0 -0
  319. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/core/report.py +0 -0
  320. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/llm/audit.py +0 -0
  321. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/__init__.py +0 -0
  322. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/atac_seq.py +0 -0
  323. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/chip_seq.py +0 -0
  324. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/genome_assembly.py +0 -0
  325. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/metagenomics.py +0 -0
  326. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/methylation.py +0 -0
  327. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/proteomics.py +0 -0
  328. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/rnaseq_deg.py +0 -0
  329. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/scrna_seq.py +0 -0
  330. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/pipelines/variant_calling.py +0 -0
  331. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/__init__.py +0 -0
  332. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/amr_vf_catalogue.py +0 -0
  333. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/cafe_evolution.py +0 -0
  334. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/cog_enrichment.py +0 -0
  335. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/download_taxon.py +0 -0
  336. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/gwas.py +0 -0
  337. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/pangenome.py +0 -0
  338. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/comparative_genomics/phylogeny.py +0 -0
  339. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/epigenomics/__init__.py +0 -0
  340. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/genome_assembly/__init__.py +0 -0
  341. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/genome_assembly/prokaryote_assembly.py +0 -0
  342. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/metagenomics/__init__.py +0 -0
  343. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/methylation/__init__.py +0 -0
  344. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/proteomics/__init__.py +0 -0
  345. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/rnaseq_deg/__init__.py +0 -0
  346. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/single_cell/__init__.py +0 -0
  347. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/recipes/variant_calling/__init__.py +0 -0
  348. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/report.py +0 -0
  349. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_cache.py +0 -0
  350. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_hashing.py +0 -0
  351. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_parallel.py +0 -0
  352. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_pipeline.py +0 -0
  353. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_result.py +0 -0
  354. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/bioflow/sdk/_runtime.py +0 -0
  355. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/R1.fastq.gz +0 -0
  356. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/R2.fastq.gz +0 -0
  357. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/real_R1.fastq.gz +0 -0
  358. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/real_R2.fastq.gz +0 -0
  359. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/ecoli_small/reference.fa +0 -0
  360. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/rnaseq_toy/R1.fastq.gz +0 -0
  361. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/rnaseq_toy/genome.fa +0 -0
  362. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/rnaseq_toy/genome.gtf +0 -0
  363. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/data/test/rnaseq_toy/samples.csv +0 -0
  364. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docker/core/Dockerfile +0 -0
  365. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docker/docker-compose.yml +0 -0
  366. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/DESIGN.md +0 -0
  367. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/architecture.md +0 -0
  368. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/maintainer/UPDATE_CADENCES.md +0 -0
  369. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/maintainer/cowork_schedule_prompt.md +0 -0
  370. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/maintainer/quarterly_audit_prompt.md +0 -0
  371. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/maintainer/research_prompt.md +0 -0
  372. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/docs/quickstart.md +0 -0
  373. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/cache_demo.py +0 -0
  374. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_atac_seq.yaml +0 -0
  375. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_chip_seq.yaml +0 -0
  376. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_custom.yaml +0 -0
  377. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_eukaryote_hifi.yaml +0 -0
  378. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_metagenomics.yaml +0 -0
  379. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_methylation.yaml +0 -0
  380. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_prokaryote_short.yaml +0 -0
  381. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_proteomics.yaml +0 -0
  382. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_recommend.yaml +0 -0
  383. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_rnaseq.yaml +0 -0
  384. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/config_scrna_seq.yaml +0 -0
  385. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/parallel_demo.py +0 -0
  386. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/pectobacterium_demo.py +0 -0
  387. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/pipeline_demo.py +0 -0
  388. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/recipes_quickstart.py +0 -0
  389. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/examples/stage_demo.py +0 -0
  390. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/README.md +0 -0
  391. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/atac_seq_standard.yaml +0 -0
  392. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/chip_seq_standard.yaml +0 -0
  393. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/eukaryote_denovo_hifi.yaml +0 -0
  394. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/eukaryote_denovo_hybrid.yaml +0 -0
  395. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/eukaryote_resequencing.yaml +0 -0
  396. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/metagenomics_kraken2_standard.yaml +0 -0
  397. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/metagenomics_metaphlan4_standard.yaml +0 -0
  398. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/methylation_bismark_wgbs.yaml +0 -0
  399. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/prokaryote_denovo_hybrid.yaml +0 -0
  400. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/prokaryote_denovo_short.yaml +0 -0
  401. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/proteomics_msfragger_dda.yaml +0 -0
  402. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/rnaseq_deseq2_standard.yaml +0 -0
  403. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/scrna_seq_10x_scanpy.yaml +0 -0
  404. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/presets/scrna_seq_10x_seurat.yaml +0 -0
  405. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/schema.yaml +0 -0
  406. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/alignment/bwa.yaml +0 -0
  407. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly/spades.yaml +0 -0
  408. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/assembly_qc/quast.yaml +0 -0
  409. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/fragpipe.yaml +0 -0
  410. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/proteomics/msfragger.yaml +0 -0
  411. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/qc/fastp.yaml +0 -0
  412. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/registry/tools/struct_annot/prokka.yaml +0 -0
  413. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/gen_docs.py +0 -0
  414. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-cron-daily.sh +0 -0
  415. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-cron-weekly.sh +0 -0
  416. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-cron.sh +0 -0
  417. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-daily.ps1 +0 -0
  418. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-weekly.ps1 +0 -0
  419. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/scripts/install-schedule-windows.ps1 +0 -0
  420. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/__init__.py +0 -0
  421. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/e2e/__init__.py +0 -0
  422. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/e2e/test_prokaryote_short.py +0 -0
  423. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/e2e/test_rnaseq.py +0 -0
  424. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/fixtures/hypo_assembler.yaml +0 -0
  425. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/__init__.py +0 -0
  426. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/test_docker_backend.py +0 -0
  427. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/test_recipe_real_data.py +0 -0
  428. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/integration/test_sdk_real_docker.py +0 -0
  429. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_approve.py +0 -0
  430. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_benchmark.py +0 -0
  431. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_bugfixes.py +0 -0
  432. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_crossplatform.py +0 -0
  433. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_dag.py +0 -0
  434. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_digest_pinning.py +0 -0
  435. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_doctor.py +0 -0
  436. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_freshness_check.py +0 -0
  437. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_interactive.py +0 -0
  438. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_io.py +0 -0
  439. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_llm.py +0 -0
  440. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_llm_audit.py +0 -0
  441. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_llm_diagnose.py +0 -0
  442. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_llm_setup.py +0 -0
  443. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_ncbi.py +0 -0
  444. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_planner.py +0 -0
  445. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_planner_eukaryote.py +0 -0
  446. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_planner_rnaseq.py +0 -0
  447. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipe_cli_args.py +0 -0
  448. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipe_registry_alignment.py +0 -0
  449. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipes.py +0 -0
  450. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recipes_cookbook.py +0 -0
  451. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_recommend.py +0 -0
  452. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_registry_resolver.py +0 -0
  453. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_registry_sanity.py +0 -0
  454. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_release_watch.py +0 -0
  455. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_report.py +0 -0
  456. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_report_builder.py +0 -0
  457. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_runner.py +0 -0
  458. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_cache.py +0 -0
  459. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_external_mounts.py +0 -0
  460. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_parallel.py +0 -0
  461. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_pipeline.py +0 -0
  462. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_retry.py +0 -0
  463. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_stage.py +0 -0
  464. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_sdk_streaming.py +0 -0
  465. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_skeleton.py +0 -0
  466. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/tests/unit/test_update_auto.py +0 -0
  467. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/update/REGISTRY_CHANGELOG.md +0 -0
  468. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/update/benchmark.py +0 -0
  469. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/update/freshness_check.py +0 -0
  470. {bioflowkit-0.2.0 → bioflowkit-0.3.0}/update/release_watch.py +0 -0
@@ -0,0 +1,20 @@
1
+ # Test fixtures are read by Linux containers (FastANI, CAFE5, …). Some
2
+ # tool parsers don't strip a stray CR, so a CRLF checkout on Windows
3
+ # silently corrupts them (CAFE5 read the last column header as "D\r" and
4
+ # reported "D was not found"). Force LF for text fixtures everywhere,
5
+ # and mark the compressed reads binary so git never touches them.
6
+ data/test/**/*.fna text eol=lf
7
+ data/test/**/*.fa text eol=lf
8
+ data/test/**/*.fasta text eol=lf
9
+ data/test/**/*.tsv text eol=lf
10
+ data/test/**/*.csv text eol=lf
11
+ data/test/**/*.nwk text eol=lf
12
+ data/test/**/*.txt text eol=lf
13
+ data/test/**/*.gff text eol=lf
14
+ data/test/**/*.ffn text eol=lf
15
+ data/test/**/*.gz binary
16
+
17
+ # Registry + recipe shell commands also end up inside Linux containers;
18
+ # keep them LF regardless of host.
19
+ *.yaml text eol=lf
20
+ registry/** text eol=lf
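The CRLF failure described in the comments above can be reproduced in a few lines. This is a minimal sketch, not CAFE5's actual parser: the header layout and the `parse_header` name are assumptions made for illustration. It shows how a tab-splitting reader that strips only the newline leaves the stray CR glued to the last column name.

```python
def parse_header(line: str) -> list[str]:
    """Split a tab-separated header, stripping only the trailing LF."""
    return line.rstrip("\n").split("\t")


# Same header as checked out with LF and with CRLF line endings.
lf_header = parse_header("Desc\tFamily ID\tA\tB\tC\tD\n")
crlf_header = parse_header("Desc\tFamily ID\tA\tB\tC\tD\r\n")

print(repr(lf_header[-1]))    # 'D'
print(repr(crlf_header[-1]))  # 'D\r'
print("D" in crlf_header)     # False: a lookup for species "D" now fails
```

With `eol=lf` forced, the checked-out bytes match the first case on every host, so the lookup succeeds.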
@@ -17,6 +17,11 @@ permissions:
17
17
  contents: read
18
18
  pull-requests: write # for the summary comment
19
19
 
20
+ # Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
21
+ # the v4/v5 actions stop emitting Node 20 deprecation warnings.
22
+ env:
23
+ FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
24
+
20
25
  jobs:
21
26
  smoke:
22
27
  name: Validate + smoke-test changed candidates
@@ -6,6 +6,11 @@ on:
6
6
  pull_request:
7
7
  branches: [main]
8
8
 
9
+ # Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
10
+ # the v4/v5 actions stop emitting Node 20 deprecation warnings.
11
+ env:
12
+ FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
13
+
9
14
  jobs:
10
15
  unit-tests:
11
16
  name: Unit tests (Python ${{ matrix.python-version }})
@@ -47,12 +52,11 @@ jobs:
47
52
  run: ruff check .
48
53
 
49
54
  typecheck:
50
- # Advisory only: mypy issues do not block merging. Provides
51
- # visibility while we gradually add type annotations. Will be
52
- # upgraded to a blocking check when the error count reaches 0.
53
- name: Type check (mypy, advisory)
55
+ # Blocking: the bioflow package type-checks clean under mypy
56
+ # (--ignore-missing-imports for the un-stubbed docker / anthropic /
57
+ # openai SDKs). Keep it that way: a new type error fails CI.
58
+ name: Type check (mypy)
54
59
  runs-on: ubuntu-latest
55
- continue-on-error: true
56
60
 
57
61
  steps:
58
62
  - uses: actions/checkout@v4
@@ -66,7 +70,7 @@ jobs:
66
70
  run: pip install -e ".[dev]" && pip install types-PyYAML types-requests
67
71
 
68
72
  - name: Run mypy
69
- run: mypy bioflow --ignore-missing-imports || true
73
+ run: mypy bioflow --ignore-missing-imports
70
74
 
71
75
  registry-schema:
72
76
  name: Validate registry YAMLs
@@ -108,12 +112,12 @@ jobs:
108
112
  EOF
109
113
 
110
114
  digest-audit:
111
- # Advisory until enough tools are pinned: prints the missing-digest
112
- # count and lists the first ~30 unpinned tools. Upgrade to a blocking
113
- # check (drop `|| true`) once the bulk of the registry is pinned.
114
- name: Container digest pin audit (advisory)
115
+ # Now blocking: every *active* tool must carry an image_digest
116
+ # (deprecated tools are skipped; their upstream images are gone by
117
+ # definition). This keeps the registry fully content-addressed so a
118
+ # silently-retagged or GC'd upstream image can never change results.
119
+ name: Container digest pin audit
115
120
  runs-on: ubuntu-latest
116
- continue-on-error: true
117
121
 
118
122
  steps:
119
123
  - uses: actions/checkout@v4
@@ -126,5 +130,5 @@ jobs:
126
130
  - name: Install dependencies
127
131
  run: pip install pyyaml
128
132
 
129
- - name: Count missing image_digest entries
130
- run: python scripts/pin_digests.py --audit || true
133
+ - name: Require every active tool to be digest-pinned
134
+ run: python scripts/pin_digests.py --audit
@@ -26,6 +26,11 @@ concurrency:
26
26
  group: pages
27
27
  cancel-in-progress: true
28
28
 
29
+ # Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
30
+ # the v3/v4/v5 actions stop emitting Node 20 deprecation warnings.
31
+ env:
32
+ FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
33
+
29
34
  jobs:
30
35
  build:
31
36
  runs-on: ubuntu-latest
@@ -0,0 +1,75 @@
1
+ name: nf-core concordance
2
+
3
+ # Manually-dispatched concordance benchmark: run bioflow and the matching
4
+ # nf-core pipeline on a golden dataset, then score their agreement with
5
+ # scripts/compare_nfcore.py.
6
+ #
7
+ # This is NOT a per-PR gate. A real run needs tens of GB of references
8
+ # and hours of compute, so it expects a runner that already has them
9
+ # staged (a self-hosted runner, or a large GitHub runner the maintainer
10
+ # provisions). It is run deliberately before a release and its JSON
11
+ # output is published with the release notes.
12
+ #
13
+ # See docs/benchmarks/nfcore-concordance.md for the datasets, method, and
14
+ # acceptance thresholds.
15
+
16
+ on:
17
+ workflow_dispatch:
18
+ inputs:
19
+ comparison:
20
+ description: "Which comparison to run"
21
+ type: choice
22
+ options: [vcf, counts]
23
+ default: vcf
24
+ bioflow_output:
25
+ description: "Path to the bioflow output (VCF or counts TSV) on the runner"
26
+ required: true
27
+ reference_output:
28
+ description: "Path to the nf-core output on the runner"
29
+ required: true
30
+ threshold:
31
+ description: "Min Jaccard (vcf) or Spearman rho (counts) to pass"
32
+ required: false
33
+ default: "0.90"
34
+
35
+ # Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch.
36
+ env:
37
+ FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
38
+
39
+ jobs:
40
+ score:
41
+ name: Score concordance
42
+ # The default GitHub runner cannot hold iGenomes references; point this
43
+ # at a self-hosted runner that has them staged. Left as ubuntu-latest
44
+ # so the workflow is valid and the *scoring* step is exercisable on
45
+ # pre-staged outputs.
46
+ runs-on: ubuntu-latest
47
+ steps:
48
+ - uses: actions/checkout@v4
49
+
50
+ - uses: actions/setup-python@v5
51
+ with:
52
+ python-version: "3.12"
53
+
54
+ - name: Score bioflow vs nf-core
55
+ run: |
56
+ if [ "${{ inputs.comparison }}" = "vcf" ]; then
57
+ python scripts/compare_nfcore.py vcf \
58
+ --bioflow "${{ inputs.bioflow_output }}" \
59
+ --reference "${{ inputs.reference_output }}" \
60
+ --out concordance.json \
61
+ --min-jaccard "${{ inputs.threshold }}"
62
+ else
63
+ python scripts/compare_nfcore.py counts \
64
+ --bioflow "${{ inputs.bioflow_output }}" \
65
+ --reference "${{ inputs.reference_output }}" \
66
+ --out concordance.json \
67
+ --min-rho "${{ inputs.threshold }}"
68
+ fi
69
+
70
+ - name: Upload concordance report
71
+ if: always()
72
+ uses: actions/upload-artifact@v4
73
+ with:
74
+ name: nfcore-concordance
75
+ path: concordance.json
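The two scores this workflow gates on can be sketched with the standard library alone. This is an illustration of the metrics, not the contents of `scripts/compare_nfcore.py`: the function names, the site-tuple layout, and the toy data are all assumptions.

```python
from statistics import mean


def jaccard(a: set, b: set) -> float:
    """Intersection over union of two sets of variant sites."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def _ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y) -> float:
    """Pearson correlation of the ranks (undefined for constant input)."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy VCF sites keyed as (chrom, pos, ref, alt); two of four distinct sites agree.
sites_a = {("chr20", 100, "A", "G"), ("chr20", 200, "C", "T"), ("chr20", 300, "G", "A")}
sites_b = {("chr20", 100, "A", "G"), ("chr20", 200, "C", "T"), ("chr20", 400, "T", "C")}
site_score = jaccard(sites_a, sites_b)

# Toy per-gene counts with identical ordering but different magnitudes.
count_score = spearman_rho([10, 20, 30, 40], [1, 5, 7, 100])

threshold = 0.90  # the workflow's default input
print(site_score, site_score >= threshold)    # 0.5 False
print(count_score, count_score >= threshold)  # 1.0 True
```

Spearman's rho depends only on rank order, which is why it suits count tables from two pipelines whose absolute values differ but whose per-gene ordering should agree.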
@@ -10,6 +10,11 @@ on:
10
10
  - cron: "0 3 * * *"
11
11
  workflow_dispatch:
12
12
 
13
+ # Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
14
+ # the v4/v5 actions stop emitting Node 20 deprecation warnings.
15
+ env:
16
+ FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
17
+
13
18
  jobs:
14
19
  smoke-matrix:
15
20
  name: Recipe smoke matrix
@@ -43,12 +48,20 @@ jobs:
43
48
  -v -m docker \
44
49
  --junitxml=reports/smoke.xml
45
50
 
46
- - name: Upload junit report
51
+ - name: Run full-pipeline e2e (all 9 recipes with committed fixtures)
52
+ env:
53
+ BIOFLOW_LOG_LEVEL: INFO
54
+ run: |
55
+ python -m pytest tests/integration/test_full_pipeline_e2e.py \
56
+ -v -m docker \
57
+ --junitxml=reports/full_e2e.xml
58
+
59
+ - name: Upload junit reports
47
60
  if: always()
48
61
  uses: actions/upload-artifact@v4
49
62
  with:
50
63
  name: smoke-junit
51
- path: reports/smoke.xml
64
+ path: reports/*.xml
52
65
 
53
66
  - name: Fail the job on red results
54
67
  if: failure()
@@ -22,6 +22,11 @@ on:
22
22
  permissions:
23
23
  contents: read
24
24
 
25
+ # Opt into the Node 24 runtime ahead of the 2026-06-16 forced switch so
26
+ # the v4/v5 actions stop emitting Node 20 deprecation warnings.
27
+ env:
28
+ FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: "true"
29
+
25
30
  jobs:
26
31
  build:
27
32
  name: Build sdist + wheel
@@ -98,3 +98,4 @@ build/
98
98
 
99
99
  # MkDocs build output
100
100
  site/
101
+ data/test/**/_e2e_out/
@@ -12,6 +12,391 @@ ship bug fixes only. Breaking changes to the documented public API
12
12
 
13
13
  ---
14
14
 
15
+ ## [Unreleased]
16
+
17
+ ## [0.3.0] — 2026-06-18
18
+
19
+ ### Fixed — aligner images had no samtools (5 recipes broke at alignment)
20
+ The plain single-tool aligner BioContainers (`bwa`, `bowtie2`,
21
+ `minimap2`) do **not** bundle samtools, yet five recipes ran
22
+ `aligner | samtools sort` / `samtools index` inside them — so every one
23
+ failed at its alignment stage with `samtools: command not found`. The
24
+ smoke matrix only exercises each recipe's *first* stage, so this stayed
25
+ invisible. (See #2.) Each aligner stage now uses an image that carries
26
+ both tools:
27
+ - **germline_variants**, **joint_genotyping**: a mulled
28
+ `bwa 0.7.19 + samtools 1.22` image. A new **`prepare_reference`**
29
+ stage builds the BWA index + `.fai` + `.dict` **once** (in the cohort
30
+ recipe, before the per-sample fan-out — previously each parallel
31
+ sample raced to index the *shared* reference), and the gatk call stage
32
+ drops its own `samtools` calls (`MarkDuplicates --CREATE_INDEX`
33
+ indexes the dedup BAM; the gatk4 image ships no samtools either).
34
+ Verified end-to-end on phiX: prep → bwa-mem → sort/index →
35
+ MarkDuplicates → HaplotypeCaller now produces a valid VCF.
36
+ - **chip_seq**, **atac_seq**: `staphb/bowtie2:2.5.4` (bundles samtools).
37
+ - **metagenome_assembly**: a mulled `minimap2 2.31 + samtools 1.23` image.
38
+ - `registry/tools/alignment/bowtie2.yaml` had the same broken
39
+ assumption (and `samtools.yaml` documented it as fact) — both fixed.
40
+
41
+ ### Fixed — static audit of the never-e2e'd recipes (round 1)
42
+ A static pass over the 10 recipes that have no committed full-e2e fixture
43
+ (they need external reference data) turned up several latent defects:
44
+ - **proteomics_dda**: the percolator FDR cut used ``awk -F\t``, whose
45
+ backslash bash strips before awk sees it — so the field separator
46
+ became the literal letter ``t`` instead of a tab, and
47
+ ``passing_psms.tsv`` was filtered on garbage columns. Now ``-F"\t"``,
48
+ which reaches awk as a real tab (verified).
49
+ - **eukaryote_assembly**: the docstring advertised ``polish=False`` to
50
+ skip Medaka for HiFi reads, but no such parameter existed. Added it
51
+ (the ``polish`` stage is renamed ``polish_consensus`` to free the
52
+ name); ``assess`` already falls back to Flye's ``assembly.fasta``.
53
+ - **chip_seq**: docstring promised ``--ctrl-r1 / --ctrl-r2`` raw-control
54
+ alignment the recipe never implemented — corrected to document the
55
+ actual ``--ctrl-bam`` (pre-aligned control) input.
56
+ - **metagenomics_profile**: ``bioflow db fetch`` example used a
57
+ catalog key that doesn't exist (``kraken2_standard`` →
58
+ ``kraken2_standard_8gb``) and now notes Bracken's ``kmer_distrib`` need.
59
+
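The `awk -F\t` defect in the proteomics_dda item can be demonstrated without a shell. The sketch below uses Python's `shlex`, which follows POSIX word-splitting rules closely enough to stand in for bash here; the awk program and the `psms.tsv` file name are placeholders, not the recipe's real command.

```python
import shlex

# Unquoted: the shell consumes the backslash, so awk receives "-Ft".
unquoted = shlex.split(r"awk -F\t '{print $1}' psms.tsv")

# Double-quoted: the backslash survives, so awk receives "-F\t"
# and interprets the escape itself as a tab.
quoted = shlex.split(r"""awk -F"\t" '{print $1}' psms.tsv""")

print(unquoted[1])  # -Ft
print(quoted[1])    # -F\t
```

Inside double quotes a backslash is only special before a few characters (such as `"` and `\`), so `\t` passes through untouched.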
60
+ ### Docs — strict build fixed + e2e-coverage page
61
+ - `mkdocs build --strict` was aborting: three docs links pointed at repo
62
+ files outside the `docs/` tree (`conda-recipe/meta.yaml`,
63
+ `scripts/compare_nfcore.py`, the nf-core workflow), which strict mode
64
+ flags as unresolved. Re-pointed them at GitHub blob URLs.
65
+ - New **`reference/e2e-coverage.md`** documents which 9 recipes have a
66
+ committed full end-to-end fixture and which 10 are gated on external
67
+ reference data (with the `bioflow db fetch` key for each), plus the one
68
+ utility recipe. Regenerated `reference/recipes.md` / `tools.md` from
69
+ the registry (now 20 recipes; `joint_genotyping` was missing and image
70
+ tags were stale).
71
+
72
+ ### Fixed — ani_matrix broken for genomes outside the workspace
73
+ - **Bug a full e2e caught**: FastANI reads genome paths from a *list
74
+ file*, not the command, so the SDK's command-path translator and
75
+ auto-mount never applied to them — every external genome failed with
76
+ `Could not open <host path>`. Since genomes normally live outside the
77
+ output workspace, this broke the recipe's primary documented use.
78
+ - New SDK helper **`stage_input(path)`** copies an external file into the
79
+ active workspace (always mounted at `/work`) and returns its container
80
+ path — the clean primitive for any recipe that feeds a tool a list
81
+ file of paths. Also exported **`container_path(path)`**.
82
+ - `ani_matrix` now stages genomes via `stage_input` and writes container
83
+ paths into the FastANI list; verified end-to-end (genome1 vs genome2 =
84
+ 99.5% ANI).
85
+
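The staging idea behind `stage_input` can be sketched as follows. This is a hypothetical reimplementation for illustration only: the real helper takes just `path` and resolves the active workspace itself, while this sketch takes the workspace explicitly, and the `staged_inputs` subdirectory name is invented. Only the `/work` mount point comes from the changelog.

```python
import shutil
import tempfile
from pathlib import Path, PurePosixPath

CONTAINER_WORKDIR = PurePosixPath("/work")  # workspace mount point per the changelog


def stage_input_sketch(path: Path, workspace: Path) -> str:
    """Copy an external file into the workspace and return its container path."""
    staged_dir = workspace / "staged_inputs"  # hypothetical subdirectory
    staged_dir.mkdir(parents=True, exist_ok=True)
    target = staged_dir / path.name
    shutil.copy2(path, target)
    return str(CONTAINER_WORKDIR / target.relative_to(workspace).as_posix())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    external = root / "elsewhere" / "genome1.fna"  # lives outside the workspace
    external.parent.mkdir()
    external.write_text(">g1\nACGT\n")
    workspace = root / "workspace"
    workspace.mkdir()

    # The list file handed to the tool holds container paths, not host paths.
    container_paths = [stage_input_sketch(external, workspace)]
    (workspace / "genomes.txt").write_text("\n".join(container_paths) + "\n")
    staged_exists = (workspace / "staged_inputs" / "genome1.fna").exists()

print(container_paths)  # ['/work/staged_inputs/genome1.fna']
```

Copying costs disk space, but it makes the list file valid inside the container without adding per-file mounts.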
86
+ ### Added — full e2e for the comparative-genomics recipes
87
+ - `tests/integration/test_full_pipeline_e2e.py` gains real end-to-end
88
+ tests for **amr_vf_catalogue** (ABRicate fan-out, bundled DBs),
89
+ **ani_matrix** (all-vs-all FastANI), **pangenome** (Prokka × N →
90
+ Roary), and **gwas** (Scoary on a synthetic Roary GPA + phenotype,
91
+ recovers a planted association). Fixtures:
92
+ `data/test/genomes_small/` (phiX174 + a 25-SNP variant) and
93
+ `data/test/gwas_small/` (12-gene × 10-sample GPA). Recipes validated
94
+ end-to-end: 1 (prokaryote) → 9 (see below).
95
+ - **cafe_evolution** (CAFE5 gene-family expansion/contraction) added as
96
+ the 6th, on `data/test/cafe_small/` (ultrametric 4-taxon tree + 60
97
+ families).
98
+ - **phylogeny** (single-copy core → MAFFT × N → IQ-TREE) added as the
99
+ 7th, on `data/test/phylo_small/` (Prokka GFF + CDS + Roary GPA for 4
100
+ phiX strains; IQ-TREE recovers a 4-taxon ML tree).
101
+ - **rnaseq_deg** (fastp → Salmon → DESeq2 → enrichment + MultiQC) added
102
+ as the 8th, on `data/test/rnaseq_small/` (60 synthetic transcripts, 4
103
+ samples, 10 transcripts planted ~4× up in the treated group). DESeq2
104
+ recovers the planted signal (`tx0001` log2FC ≈ 2) and the run finishes
105
+ in seconds. The sample sheet is built by the test at run time so no
106
+ machine-specific paths are committed.
107
+ - **methylation_wgbs** (TrimGalore → Bismark → methylKit) added as the
108
+ 9th, on `data/test/methyl_small/` (phiX174 + 3,000 synthetic
109
+ directional bisulfite read pairs, ~70 % CpG-methylated). The reads
110
+ map at 100 % and Bismark produces a real cytosine report; the genome
111
+ is **not** committed pre-prepared — the new `bismark_prep` stage
112
+ (below) bisulfite-converts it at run time, so no version-tied bowtie2
113
+ index lands in git. Regenerated deterministically by
114
+ `scripts/gen_methyl_fixture.py`.
115
+
116
+ ### Added — methylation_wgbs prepares its genome (matches the docs)
117
+ - The recipe's docstring promised automatic genome preparation, but the
118
+ pipeline had no such stage — it silently required a pre-prepared
119
+ `Bisulfite_Genome/` directory, so running from a plain reference FASTA
120
+ failed. A new **`bismark_prep`** stage now runs
121
+ `bismark_genome_preparation` when `--bismark-genome` is a FASTA (or a
122
+ directory holding one); an already-prepared directory is detected and
123
+ used directly, skipping preparation. `methylation_wgbs` is now 4
124
+ stages (trim → bismark_prep → bismark → methylkit).
125
+
126
+ ### Fixed — methylKit CpG-report glob (shell ate the regex escape)
127
+ - `methylkit_dmr` matched the cytosine report with `pattern='…txt(\.gz)?$'`,
128
+ but the `\.` escape was stripped by the shell before R parsed the
129
+ string, so R aborted with *"'\.' is an unrecognized escape in character
130
+ string"* and the whole recipe failed at the final stage. Replaced with
131
+ a `[.]` character class (no backslash to escape), which matches the
132
+ report — with or without a `.gz` suffix — robustly.
133
+
134
+ ### Fixed — rnaseq_deg DESeq2 step (two latent bugs)
135
+ - The `deseq2_diff` stage required **tximport**, which the
136
+ `bioconductor-deseq2` BioContainer does not ship — every run failed
137
+ with "there is no package called 'tximport'". Rewritten to assemble
138
+ the count matrix in base R straight from each sample's `quant.sf`
139
+ (`NumReads`, rounded) and feed `DESeqDataSetFromMatrix`, dropping the
140
+ tximport dependency entirely.
141
+ - A second, masked bug: `samples$sample_id` inside the `Rscript -e "…"`
142
+ body (run via `sh -c`) was shell-expanded because the `$` was
143
+ unescaped, so the `file.path(...)` of `quant.sf` paths was wrong.
144
+ Escaped to `\$sample_id`. The pipeline now also **fails fast** if the
145
+ DESeq2 stage exits non-zero (the downstream Enrichr step tolerates an
146
+ empty gene list, which previously masked a broken DEG table).
147
+
148
+ ### Fixed — LF line endings for container-read fixtures
149
+ - A new `.gitattributes` pins text test fixtures (and registry YAMLs) to
150
+ LF. CAFE5 doesn't strip a trailing CR, so a CRLF checkout on Windows
151
+ made it read the last species column as `D\r` and fail with "D was not
152
+ found in gene family …". FASTA parsers tolerated the CR, but the
153
+ table parser did not — LF is now enforced so fixtures are safe on
154
+ every host.
155
+
156
+ ### Fixed — bounded stdout retention (no orchestrator OOM on chatty tools)
157
+ - `DockerBackend.run` accumulated **every** stdout line in memory; a
158
+ tool that emits millions of lines (Roary, IQ-TREE) could OOM the
159
+ orchestrator. It now retains only the trailing `_STDOUT_TAIL_LINES`
160
+ (5000) via a bounded `deque` for the diagnostic `CommandResult.stdout`
161
+ — every line still streams live to `log_callback`, and real artifacts
162
+ go to files in the workspace.
163
+ - Tests: +1 (`test_docker_timeout.py`) asserting the tail is kept.
164
+
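The bounded-retention pattern described above is small enough to show in full. This is a sketch of the technique, not the body of `DockerBackend.run`; the function name and the generated log lines are assumptions, while the 5000-line limit is the value quoted in the entry.

```python
from collections import deque

STDOUT_TAIL_LINES = 5000  # limit quoted in the changelog entry


def stream_with_tail(lines, log_callback, tail_lines=STDOUT_TAIL_LINES):
    """Forward every line to the callback but keep only the trailing ones."""
    tail = deque(maxlen=tail_lines)  # old lines fall off the left automatically
    for line in lines:
        log_callback(line)
        tail.append(line)
    return "\n".join(tail)


seen = []
stdout = stream_with_tail((f"line {i}" for i in range(20_000)), seen.append)
kept = stdout.split("\n")

print(len(seen), len(kept), kept[0])  # 20000 5000 line 15000
```

Memory is now bounded by the tail length rather than by how much the tool prints, while the live log still receives everything.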
165
+ ### Fixed — clear error for shell-unsafe external input filenames
166
+ - An external input file whose **basename** contained a space or shell
167
+ metacharacter silently corrupted the recipe's command — bioflow mounts
168
+ the file's parent at the space-free `/inputs/<n>` and splices the
169
+ basename in unquoted, and it can't be quoted generically because many
170
+ recipes wrap the whole command in `bash -c '…'`. (A spaced *directory*
171
+ was already fine — only the basename survives into the command.)
172
+ - `_collect_external_mounts` now raises an actionable `ValueError`
173
+ naming the offending characters and telling the user to rename /
174
+ symlink to a safe name.
175
+ - Tests: +12 (`tests/unit/test_unsafe_paths.py`), incl. confirmation
176
+ that spaced *directories* and workspace-internal paths are unaffected.
177
+
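A basename check of the kind this entry describes might look like the sketch below. The allow-list, the function names, and the error wording are assumptions; the source only states that `_collect_external_mounts` raises a `ValueError` naming the offending characters.

```python
import re
from pathlib import PurePosixPath

# Assumed allow-list: letters, digits, dot, underscore, plus, hyphen.
_SAFE_BASENAME = re.compile(r"[A-Za-z0-9._+-]+")


def check_basename(path: str) -> None:
    """Raise ValueError if the file name (not its directory) is shell-unsafe."""
    name = PurePosixPath(path).name
    if not _SAFE_BASENAME.fullmatch(name):
        bad = sorted({c for c in name if not _SAFE_BASENAME.fullmatch(c)})
        raise ValueError(
            f"unsafe characters {bad!r} in input filename {name!r}; "
            "rename or symlink it to a safe name"
        )


def is_rejected(path: str) -> bool:
    try:
        check_basename(path)
    except ValueError:
        return True
    return False


# A spaced directory is fine because only the basename reaches the command.
print(is_rejected("/data/my runs/sample_R1.fastq.gz"))  # False
print(is_rejected("/data/runs/sample 1.fastq.gz"))      # True
print(is_rejected("/data/runs/reads;rm.fastq"))         # True
```

Rejecting early trades a little convenience for a clear message instead of a silently corrupted command.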
178
+ ### Fixed — stage_timeout now actually bounds runtime
179
+ - **Latent bug**: `run_plan(stage_timeout=…)` never worked. The log
180
+ loop (`container.logs(stream=True, follow=True)`) blocks until the
181
+ container exits, so the subsequent `container.wait(timeout=…)` — a
182
+ docker-py HTTP read timeout, not a runtime cap — could never fire for
183
+ a runaway container; it would hang forever.
184
+ - `DockerBackend.run` now starts a watchdog `threading.Timer` that
185
+ `container.kill()`s the stage when the timeout elapses, returning the
186
+ conventional exit code **124** with a clear message.
187
+ - Tests: +3 (`tests/unit/test_docker_timeout.py`, fake-container based,
188
+ deterministic).
189
+
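The watchdog approach can be sketched with a fake container, much as the entry says the unit tests do. Everything here except `threading.Timer` and exit code 124 is an assumption: `FakeContainer`, `run_with_watchdog`, and the 137 code returned for a killed container are illustrative stand-ins, not bioflow's code.

```python
import threading

TIMEOUT_EXIT_CODE = 124  # conventional "timed out" code named in the changelog


class FakeContainer:
    """Stand-in for a container: wait() blocks until it finishes or is killed."""

    def __init__(self, runtime_s: float):
        self._runtime_s = runtime_s
        self._killed = threading.Event()

    def kill(self) -> None:
        self._killed.set()

    def wait(self) -> int:
        killed = self._killed.wait(timeout=self._runtime_s)
        return 137 if killed else 0


def run_with_watchdog(container, timeout_s: float) -> int:
    timed_out = threading.Event()

    def _on_timeout() -> None:
        timed_out.set()
        container.kill()  # unblocks the waiting caller

    watchdog = threading.Timer(timeout_s, _on_timeout)
    watchdog.daemon = True
    watchdog.start()
    try:
        exit_code = container.wait()
    finally:
        watchdog.cancel()  # no-op if the timer already fired
    return TIMEOUT_EXIT_CODE if timed_out.is_set() else exit_code


fast = run_with_watchdog(FakeContainer(runtime_s=0.05), timeout_s=2.0)
slow = run_with_watchdog(FakeContainer(runtime_s=5.0), timeout_s=0.1)
print(fast, slow)  # 0 124
```

The timer runs on its own thread, so it can fire even while the main thread is blocked reading logs or waiting for exit.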
190
+ ### Fixed — DockerBackend now clamps CPU/RAM to host capacity
191
+ - **Bug the full-pipeline e2e caught**: a stage declaring `cpu=8` (e.g.
192
+ SPAdes in `prokaryote_assembly`) failed to even *start* on any host
193
+ with fewer cores — Docker rejects a container whose `--cpus` exceeds
194
+ the host count ("range of CPUs is from 0.01 to N.00"), so all 3 retry
195
+ attempts died instantly. Passed locally (12 cores) but failed on the
196
+ 4-core CI runner — and would hit any user on a small workstation.
197
+ - `DockerBackend.run` now clamps the requested CPU to the host core count
198
+ and RAM to ~90% of host memory (`_clamp_resources`), so an
199
+ over-ambitious resource request degrades to "use what's available"
200
+ instead of crashing.
201
+ - Tests: +5 (`tests/unit/test_resource_clamp.py`).
202
+
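The clamping rule reads naturally as a pure function. The sketch below is an assumption about shape, not the actual `_clamp_resources`: the signature is invented, and host memory is passed in explicitly because the source does not say how bioflow measures it. The 90 % fraction is taken from the entry.

```python
import os

GIB = 1024 ** 3


def clamp_resources(cpu, mem_bytes, host_cpus=None, host_mem_bytes=None,
                    mem_fraction=0.9):
    """Cap a stage's CPU and RAM request at what the host can provide."""
    host_cpus = host_cpus or os.cpu_count() or 1
    clamped_cpu = min(cpu, host_cpus)
    clamped_mem = mem_bytes
    if host_mem_bytes is not None:
        clamped_mem = min(mem_bytes, int(host_mem_bytes * mem_fraction))
    return clamped_cpu, clamped_mem


# An 8-core / 32 GiB request on a 4-core / 16 GiB runner is scaled down.
big_cpu, big_mem = clamp_resources(8, 32 * GIB, host_cpus=4, host_mem_bytes=16 * GIB)

# A request that already fits is returned unchanged.
small = clamp_resources(2, 4 * GIB, host_cpus=4, host_mem_bytes=16 * GIB)

print(big_cpu, big_mem < 16 * GIB)  # 4 True
print(small == (2, 4 * GIB))        # True
```

Clamping means a recipe tuned on a large workstation still starts on a small CI runner, only slower.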
203
+ ### Added — first full-pipeline end-to-end test
204
+ - `tests/integration/test_full_pipeline_e2e.py`: runs the **entire**
205
+ `prokaryote_assembly` recipe (fastp → SPAdes → QUAST → Prokka) against
206
+ real BioContainers — the first time a complete pipeline (not just a
207
+ first stage, as the smoke matrix does) is validated end-to-end.
208
+ - New `data/test/phix_small/` fixture: phiX174 (`NC_001422.1`, 5386 bp) +
209
+ 1000 wgsim-simulated 150 bp pairs (~56×, seed 42 → deterministic).
210
+ phiX assembles into a single ~5.4 kb contig in <1 min, so the whole
211
+ chain finishes in ~45 s.
212
+ - Asserts real data flow: assembled contig length 4.5–6 kb, QUAST
213
+ `report.tsv`, and Prokka annotation with ≥1 CDS. Verified locally
214
+ (phiX → 1 contig 5377 bp, 6 CDS).
215
+ - Wired into the nightly-smoke workflow as a second step.
216
+
217
+ ## [0.2.1] — 2026-06-12
218
+
219
+ > **Why upgrade from 0.2.0**: the `v0.2.0` tag predated the registry
220
+ > freshness fix, so the 0.2.0 wheel bundled a registry whose
221
+ > `quay.io/biocontainers/*` tags had been garbage-collected from Quay —
222
+ > `rnaseq_deg`, `chip_seq`, `atac_seq`, `eukaryote_assembly`,
223
+ > `metagenome_assembly`, and others would fail at run time with "image
224
+ > not found". **0.2.1 ships the repaired, 107/107-pinned registry**, so
225
+ > recipes actually pull their containers. Everything below landed after
226
+ > the 0.2.0 cut and is new to PyPI users here.
227
+
228
+ ### Changed — nightly smoke matrix expanded (3 → 5 recipes)
229
+ - Added real-container smoke cases for **chip_seq** (TrimGalore — a
230
+ container family shared by chip/atac/methylation, previously
231
+ unexercised) and **germline_variants** (validates the variant recipe's
232
+ fastp wiring). All five pass against real BioContainers locally
233
+ (~90 s) — more of the "522 mostly-mock tests" critique closed with
234
+ genuine end-to-end coverage.
235
+
236
+ ### Changed — mypy type-checking is now blocking
237
+ - Fixed all 33 mypy errors across 7 modules (Rich `TaskID` optionals in
238
+ the progress bars, `dict[str, Any]` run kwargs, anthropic/openai/docker
239
+ SDK union + missing-stub noise, a loop-variable shadow in
240
+ `cli/update.py`). `mypy bioflow --ignore-missing-imports` now reports
241
+ **0 errors**.
242
+ - CI's `typecheck` job drops `continue-on-error` — a new type error now
243
+ fails the build, closing the "advisory only" gap.
244
+
245
+ ### Added — nf-core concordance benchmark (harness + methodology)
246
+ - `scripts/compare_nfcore.py` (new, stdlib-only): scores agreement
247
+ between a bioflow output and the matching nf-core output —
248
+ **Jaccard + genotype concordance** on normalised VCF sites
249
+ (vs nf-core/sarek), and **Spearman ρ** of per-gene counts
250
+ (vs nf-core/rnaseq). Optional `--min-jaccard` / `--min-rho` gate for
251
+ CI.
252
+ - `docs/benchmarks/nfcore-concordance.md`: golden datasets
253
+ (GIAB HG002 chr20; nf-core/rnaseq chr22), method, and initial
254
+ acceptance thresholds.
255
+ - `.github/workflows/nfcore-concordance.yml`: manually-dispatched job
256
+ (not a per-PR gate — a full run needs staged references).
257
+ - Honesty note: bioflow ships the *scoring* half (committed + tested);
258
+ the *production* half (running both pipelines on a machine with the
259
+ references) is operator-run and documented.
260
+ - Tests: +13 (`tests/unit/test_compare_nfcore.py`).
261
+
262
+ ### Added — GPU passthrough + Podman runtime
263
+ - `@stage(..., gpu=True)` (and a tool YAML's `resources.gpu`) now attach
264
+ all host GPUs to that stage's container via a Docker `DeviceRequest`
265
+ (the API equivalent of `--gpus all`); needs the NVIDIA Container
266
+ Toolkit, and degrades to a warning on CPU-only hosts rather than
267
+ failing. Threaded through `Stage`, the `@stage` decorator, and the
268
+ preset `run_plan` path.
269
+ - `DockerBackend` works with **Podman**: it honours `BIOFLOW_DOCKER_HOST`
270
+ / `DOCKER_HOST` (point it at the Podman API socket) and an optional
271
+ `base_url`, and reads `BIOFLOW_CONTAINER_RUNTIME`.
272
+ - `bioflow doctor` recognises Podman as a Docker alternative — the
273
+ `docker_cli` / `docker_daemon` checks fall back to `podman` and report
274
+ which runtime they found.
275
+ - Tests: +10 (`tests/unit/test_gpu_podman.py`). Backend `run()` gains a
276
+ `gpu` kwarg (the `ContainerBackend` protocol + MockBackend updated).
277
+
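As a rough sketch of the two mechanisms described above: the first helper builds the `DeviceRequests` payload that corresponds to `--gpus all`, and the second picks a daemon address. Both function names are hypothetical, and the precedence order (explicit `base_url`, then `BIOFLOW_DOCKER_HOST`, then `DOCKER_HOST`) is an assumption rather than documented behaviour.

```python
import os


def gpu_device_requests(gpu: bool) -> list:
    """Return a HostConfig.DeviceRequests list; empty means no GPUs attached."""
    if not gpu:
        return []
    # Count=-1 requests every available GPU, which is what `--gpus all` asks for.
    return [{"Count": -1, "Capabilities": [["gpu"]]}]


def resolve_docker_host(env=None, base_url=None):
    """Pick the API endpoint to connect to (assumed precedence, see lead-in)."""
    env = os.environ if env is None else env
    return (
        base_url
        or env.get("BIOFLOW_DOCKER_HOST")
        or env.get("DOCKER_HOST")
        or None  # fall back to the client library's default socket
    )
```

With the Docker SDK for Python, the same request is usually written as `docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])`. For rootless Podman, the endpoint is typically a socket such as `unix:///run/user/1000/podman/podman.sock` (an example path; it varies by host).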
278
+ ### Added — reference-DB catalog expansion + refgenie manifest
279
+ - `bioflow/core/db.py` catalog gains the references real recipes need:
280
+ GATK known-sites `dbsnp_grch38` + `mills_indels_grch38` (BQSR/VQSR),
281
+ `encode_blacklist_grch38` (ChIP/ATAC peak filtering), and
282
+ `gencode_grch38` (STAR/Salmon/featureCounts annotation).
283
+ - Catalog entries now carry `genome` + `asset` tags, and a new
284
+ `refgenie_manifest()` / `bioflow db manifest` emits a
285
+ [refgenie](https://refgenie.databio.org/)-compatible JSON mapping
286
+ `<genome>/<asset>` → catalogued DB, so labs already standardised on
287
+ refgenie can see which existing assets satisfy a bioflow requirement.
288
+ - Tests: +5 (`tests/unit/test_db.py`).
289
+
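The `<genome>/<asset>` mapping can be pictured with a small sketch. The catalog layout, the tag values, and the function name below are hypothetical; the real `refgenie_manifest()` output may carry more fields.

```python
import json


def build_manifest(catalog: dict) -> dict:
    """Map '<genome>/<asset>' to the name of the catalogued DB that provides it."""
    manifest = {}
    for db_name, entry in catalog.items():
        genome, asset = entry.get("genome"), entry.get("asset")
        if genome and asset:  # entries without both tags are left out
            manifest[f"{genome}/{asset}"] = db_name
    return manifest


# Illustrative entries only: the tag values are assumptions, not the shipped ones.
example_catalog = {
    "gencode_grch38": {"genome": "hg38", "asset": "gencode_gtf"},
    "encode_blacklist_grch38": {"genome": "hg38", "asset": "blacklist"},
    "untagged_db": {},
}

manifest_json = json.dumps(build_manifest(example_catalog), indent=2, sort_keys=True)
```

A lab can then compare the keys of this mapping with the assets it already manages in refgenie.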
290
+ ### Added — rnaseq_deg depth: GO enrichment + MultiQC
291
+ - `rnaseq_deg` extended from 4 → 6 stages:
292
+ - **`enrich_go`** — GO enrichment on the significant DEGs via gseapy's
293
+ Enrichr query. Symbol-based, so no organism-specific OrgDb package
294
+ is needed; an awk numeric-regex guard on the `padj` column avoids the
295
+ classic "NA-treated-as-0" false positive.
296
+ - **`multiqc_report`** — aggregates every per-sample fastp + Salmon
297
+ report into a single MultiQC HTML.
298
+ - DAG-shape test updated; recipe still runs end-to-end under MockBackend.
299
+
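The `padj` guard is implemented in awk in the recipe; the same idea looks like this in Python. It is a sketch under assumptions: the column names `gene` and `padj`, the tab delimiter, and the 0.05 cut-off are illustrative.

```python
import csv
import io
import re

# Accepts plain and scientific notation; rejects "NA", "NaN", and empty cells.
NUMERIC = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")


def significant_genes(tsv_text, alpha=0.05, gene_col="gene", padj_col="padj"):
    """Return genes whose padj is a real number below alpha."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    genes = []
    for row in reader:
        padj = (row.get(padj_col) or "").strip()
        # Check the text first: coercing "NA" to 0 would wrongly pass the filter.
        if NUMERIC.match(padj) and float(padj) < alpha:
            genes.append(row[gene_col])
    return genes
```

Testing the text before converting it is what keeps genes with a missing adjusted p-value out of the enrichment input.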
300
+ ### Added — `joint_genotyping` recipe (GATK cohort best practice)
301
+ - New 7-stage recipe (`bioflow/recipes/variant_calling/joint_genotyping.py`)
302
+ implementing the canonical GATK **joint-genotyping** workflow for
303
+ cohorts, where `germline_variants` only does single-sample direct
304
+ calling:
305
+ - **per sample (fan-out)**: fastp → BWA-MEM → MarkDuplicates →
306
+ HaplotypeCaller `-ERC GVCF`
307
+ - **cohort (converge)**: CombineGVCFs → GenotypeGVCFs →
308
+ best-practice hard filtering (separate SNP / INDEL filters) → SnpEff
309
+ - Takes a `sample_id,fastq_r1,fastq_r2` sample sheet and uses `.starmap`
310
+ to run the per-sample stages in parallel before converging on the
311
+ joint steps — the production pattern reviewers expect for population
312
+ and family studies, and a worked example of bioflow's fan-out.
313
+ - Recipe count 19 → 20 (12 per-pipeline). Tests: +3 (registration, DAG
314
+ shape, MockBackend e2e).
315
+
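The fan-out and converge shape can be sketched without bioflow itself. The version below uses `concurrent.futures` in place of bioflow's `.starmap`, and both stage functions are stand-ins that only return file names, so everything except the sample-sheet columns is an assumption.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def read_sample_sheet(text):
    """Parse a sample_id,fastq_r1,fastq_r2 sheet into a list of tuples."""
    rows = csv.DictReader(io.StringIO(text))
    return [(r["sample_id"], r["fastq_r1"], r["fastq_r2"]) for r in rows]


def call_sample(sample_id, fastq_r1, fastq_r2):
    """Stand-in for fastp -> BWA-MEM -> MarkDuplicates -> HaplotypeCaller -ERC GVCF."""
    return f"{sample_id}.g.vcf.gz"


def joint_genotype(gvcfs):
    """Stand-in for CombineGVCFs -> GenotypeGVCFs -> hard filtering -> SnpEff."""
    return {"inputs": sorted(gvcfs), "output": "cohort.filtered.ann.vcf.gz"}


def run_cohort(sheet_text, max_workers=4):
    samples = read_sample_sheet(sheet_text)
    # Fan-out: one independent task per sample.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        gvcfs = list(pool.map(lambda sample: call_sample(*sample), samples))
    # Converge: the joint steps need every per-sample GVCF.
    return joint_genotype(gvcfs)
```

The point of the split is that the per-sample work is independent and can run in parallel, while genotyping happens once across the whole cohort.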
316
+ ### Added — run provenance (RO-Crate + PROV-style JSON)
317
+ - `bioflow/core/provenance.py` (new): every recipe run records, per
318
+ stage, the container **image + content digest**, the exact
319
+ **command**, every **input file's SHA-256 + size**, **start/end**
320
+ timestamps, exit code, and the bioflow version.
321
+ - At the end of a run the workspace gains two self-describing files:
322
+ - `provenance.json` — flat, human-readable run record
323
+ - `ro-crate-metadata.json` — an [RO-Crate 1.1](https://www.researchobject.org/ro-crate/)
324
+ research object (the de-facto packaging standard for computational
325
+ workflow runs), so the output directory is consumable by reviewers
326
+ and downstream tools directly.
327
+ - Wired into `bioflow recipe run` (on by default; `--no-provenance` to
328
+ skip). In the SDK it is opt-in and **zero-cost when off**: the hot path pays
329
+ nothing unless a recorder is installed, and provenance errors degrade
330
+ to warnings rather than aborting the science.
331
+ - New `bioflow provenance show <workspace>` command (+ `--json`) renders
332
+ the recorded run: per-stage image, pinned digest, exit status, and
333
+ input hashes. Builds directly on the digest-pinning work — pinned
334
+ tools show their `sha256:…` in the provenance.
335
+ - Tests: +14 (`tests/unit/test_provenance.py`); verified end-to-end
336
+ against real Docker (digest resolved from the local image, RO-Crate
337
+ validates structurally).
338
+
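To make the per-stage record concrete, here is a minimal sketch of hashing an input file and assembling one stage entry. The key names and the function signatures are hypothetical and are not taken from `bioflow/core/provenance.py`.

```python
import hashlib
import os
from datetime import datetime, timezone


def file_fingerprint(path, chunk_size=1 << 20):
    """SHA-256 and size of a file, read in chunks so large inputs stay cheap."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "path": str(path),
        "sha256": digest.hexdigest(),
        "size": os.path.getsize(path),
    }


def stage_record(name, image, image_digest, command, input_paths, exit_code,
                 started, ended):
    """One flat, JSON-serialisable entry per stage (illustrative field names)."""
    return {
        "stage": name,
        "image": image,
        "image_digest": image_digest,
        "command": command,
        "inputs": [file_fingerprint(p) for p in input_paths],
        "exit_code": exit_code,
        "started": started.astimezone(timezone.utc).isoformat(),
        "ended": ended.astimezone(timezone.utc).isoformat(),
    }
```

A list of such records plus the tool version is enough to serialise with `json.dumps`. The RO-Crate file describes the same run in JSON-LD and is not reproduced here.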
339
+ ### Fixed — registry freshness (stale BioContainer tags)
340
+ - **Discovery**: an audit during digest-pinning found that ~half the
341
+ registry's `quay.io/biocontainers/*` image tags had 404'd. Quay
342
+ rotates each package's `--<buildhash>_<n>` build suffix and garbage-
343
+ collects the old ones, so dozens of recipes (chip_seq, atac_seq,
344
+ eukaryote_assembly, metagenome_assembly, rnaseq_deg's Salmon stage,
345
+ …) would have failed at run time with "image not found" — unrelated
346
+ to the user's data.
347
+ - `scripts/refresh_tags.py` (new): audits every Quay BioContainer
348
+ reference, and with `--apply` rewrites any dead tag to the newest
349
+ *same-version* build (never changing the upstream software version).
350
+ Non-Quay images and versions that have left Quay entirely are
351
+ reported for manual review.
352
+ - Applied: **34 registry tool YAMLs** + **7 recipe-hardcoded images**
353
+ (bowtie2, bracken, flye, macs3, medaka, metabat2, tobias) re-pointed
354
+ to live tags. Salmon's `1.10.3--hb950928_0` → `1.10.3--h45fbf2d_5`
355
+ fixed separately (it broke the rnaseq_deg quant stage).
356
+ - Verified: the bumped images pull + run (e.g. `bowtie2 2.5.4`), and
357
+ the nightly smoke matrix is green.
358
+
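The same-version rewrite rule can be illustrated as follows. This is a sketch, not `scripts/refresh_tags.py`: it assumes tags look like `<version>--<buildhash>_<n>` and treats the highest build number `n` as "newest", which may differ from how the real script orders builds.

```python
def split_tag(tag):
    """Split '<version>--<buildhash>_<n>' into (version, buildhash, n)."""
    version, sep, build = tag.partition("--")
    if not sep:
        return version, None, -1  # no build suffix at all
    buildhash, _, number = build.rpartition("_")
    return version, buildhash, int(number) if number.isdigit() else -1


def replacement_tag(dead_tag, live_tags):
    """Newest live build of the same upstream version, or None if it is gone."""
    version = split_tag(dead_tag)[0]
    candidates = [t for t in live_tags if split_tag(t)[0] == version]
    if not candidates:
        return None  # version has left the registry: report for manual review
    return max(candidates, key=lambda t: split_tag(t)[2])
```

Matching on the version prefix is what guarantees that only the build changes and the upstream software version stays the same.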
359
+ ### Added — full digest pinning (registry now 100% content-addressed)
360
+ - Digest coverage raised from 5/110 → **107/107 active tools** in two
361
+ passes. The second pass resolved the 17 hold-outs:
362
+ - **Version changes** to the newest available BioContainer of the same tool where
363
+ the pinned version had left Quay: DESeq2 1.44→1.50.2, edgeR
364
+ 4.2→4.8.2, limma 3.60→3.66, clusterProfiler 4.12→4.18.4, topGO
365
+ 2.56→2.62, HOMER 4.11.1→5.1, Scanpy 1.10.1→1.7.2 (1.10 was never on
366
+ Quay biocontainers), Comet 2024020→2026011, Percolator 3.06.1→3.7.1,
367
+ InterProScan 5.67→5.59.
368
+ - **Image-source switches**: methylKit and monocle3 moved off the
369
+ multi-GB `bioconductor/bioconductor_full:RELEASE_3_18` (gone) to the
370
+ dedicated `bioconductor-methylkit` / `r-monocle3` BioContainers;
371
+ Cell Ranger tag `7.2.0`→`v7.2.0`; Seurat `5.0.1`→`5.0.0`.
372
+ - **Deprecations**: MaxQuant joins MSFragger + FragPipe as
373
+ `deprecated: true` — proprietary tools whose images are gone and
374
+ which the `proteomics_dda` recipe does not use (it runs the
375
+ open-source msconvert → Comet → Percolator stack). The audit now
376
+ skips deprecated tools.
377
+ - **CI `digest-audit` is now blocking** (was advisory): every active
378
+ tool must carry an `image_digest`, so the registry can never silently
379
+ rot again.
380
+ - `scripts/refresh_tags.py` also reports `version_gone` and non-Quay
381
+ images for these manual cases.
382
+
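A blocking digest audit reduces to one rule: every tool that is not deprecated must have an `image_digest`. The sketch below assumes the registry has already been loaded into a dict of tool specs; that layout and the function names are hypothetical, while the `image_digest` and `deprecated` keys come from the entry above.

```python
def missing_digests(tools: dict) -> list:
    """Names of active tools that lack a pinned image digest."""
    return sorted(
        name
        for name, spec in tools.items()
        if not spec.get("deprecated", False) and not spec.get("image_digest")
    )


def audit_exit_code(tools: dict) -> int:
    """0 when every active tool is pinned, 1 otherwise (a CI job fails on 1)."""
    return 1 if missing_digests(tools) else 0
```

Skipping deprecated tools is what lets the count read 107/107 even though the registry holds more YAML files than that.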
383
+ ### Added — Bioconda recipe (prep)
384
+ - `conda-recipe/meta.yaml`: noarch-python Bioconda recipe (only
385
+ bioflow's pure-Python stack is a conda dep; tools run as Docker
386
+ containers). Submission walkthrough in `docs/MAINTAINER.md` Part 7.
387
+ Gated on the real-PyPI publish (the recipe sources the PyPI sdist).
388
+
389
+ ### Changed — CI
390
+ - All workflows opt into the Node 24 runtime
391
+ (`FORCE_JAVASCRIPT_ACTIONS_TO_NODE24`) ahead of the 2026-06-16 forced
392
+ switch, silencing the Node 20 deprecation warnings.
393
+
394
+ ### Fixed — nightly smoke
395
+ - `rnaseq_deg.qc_one` smoke assertion now matches the stage's real output
396
+ names (`<sample_id>_R1.clean.fq.gz`), fixing a false CI failure.
397
+
398
+ ---
399
+
15
400
  ## [0.2.0] — 2026-06-05
16
401
 
17
402
  First PyPI release. Three months of 0.1.x work consolidated into a