bio-analyze-plot 0.1.0a0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (97)
  1. bio_analyze_plot-0.1.0a0/.gitignore +26 -0
  2. bio_analyze_plot-0.1.0a0/CHANGELOG.md +0 -0
  3. bio_analyze_plot-0.1.0a0/PKG-INFO +273 -0
  4. bio_analyze_plot-0.1.0a0/README.md +249 -0
  5. bio_analyze_plot-0.1.0a0/metadata/bar.json +169 -0
  6. bio_analyze_plot-0.1.0a0/metadata/bar_api.json +190 -0
  7. bio_analyze_plot-0.1.0a0/metadata/bar_cli.json +170 -0
  8. bio_analyze_plot-0.1.0a0/metadata/box.json +149 -0
  9. bio_analyze_plot-0.1.0a0/metadata/box_api.json +150 -0
  10. bio_analyze_plot-0.1.0a0/metadata/box_cli.json +150 -0
  11. bio_analyze_plot-0.1.0a0/metadata/chromosome_api.json +120 -0
  12. bio_analyze_plot-0.1.0a0/metadata/chromosome_cli.json +110 -0
  13. bio_analyze_plot-0.1.0a0/metadata/gsea.json +159 -0
  14. bio_analyze_plot-0.1.0a0/metadata/gsea_api.json +140 -0
  15. bio_analyze_plot-0.1.0a0/metadata/gsea_cli.json +160 -0
  16. bio_analyze_plot-0.1.0a0/metadata/heatmap.json +99 -0
  17. bio_analyze_plot-0.1.0a0/metadata/heatmap_api.json +120 -0
  18. bio_analyze_plot-0.1.0a0/metadata/heatmap_cli.json +120 -0
  19. bio_analyze_plot-0.1.0a0/metadata/line.json +169 -0
  20. bio_analyze_plot-0.1.0a0/metadata/line_api.json +200 -0
  21. bio_analyze_plot-0.1.0a0/metadata/line_cli.json +170 -0
  22. bio_analyze_plot-0.1.0a0/metadata/pca.json +109 -0
  23. bio_analyze_plot-0.1.0a0/metadata/pca_api.json +120 -0
  24. bio_analyze_plot-0.1.0a0/metadata/pca_cli.json +150 -0
  25. bio_analyze_plot-0.1.0a0/metadata/pie.json +119 -0
  26. bio_analyze_plot-0.1.0a0/metadata/pie_api.json +110 -0
  27. bio_analyze_plot-0.1.0a0/metadata/pie_cli.json +120 -0
  28. bio_analyze_plot-0.1.0a0/metadata/scatter.json +129 -0
  29. bio_analyze_plot-0.1.0a0/metadata/scatter_api.json +130 -0
  30. bio_analyze_plot-0.1.0a0/metadata/scatter_cli.json +130 -0
  31. bio_analyze_plot-0.1.0a0/metadata/volcano.json +99 -0
  32. bio_analyze_plot-0.1.0a0/metadata/volcano_api.json +120 -0
  33. bio_analyze_plot-0.1.0a0/metadata/volcano_cli.json +100 -0
  34. bio_analyze_plot-0.1.0a0/pyproject.toml +39 -0
  35. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/__init__.py +35 -0
  36. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/cli.py +578 -0
  37. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/__init__.py +30 -0
  38. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/bar.py +282 -0
  39. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/base.py +147 -0
  40. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/box.py +137 -0
  41. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/chromosome.py +219 -0
  42. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/gsea.py +248 -0
  43. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/heatmap.py +143 -0
  44. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/line.py +254 -0
  45. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/pca.py +176 -0
  46. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/pie.py +119 -0
  47. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/scatter.py +197 -0
  48. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/plots/volcano.py +158 -0
  49. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/theme.py +349 -0
  50. bio_analyze_plot-0.1.0a0/src/bio_analyze_plot/volcano.py +84 -0
  51. bio_analyze_plot-0.1.0a0/tests/conftest.py +139 -0
  52. bio_analyze_plot-0.1.0a0/tests/debug_fonts.py +41 -0
  53. bio_analyze_plot-0.1.0a0/tests/test_bar/test_bar_plot_custom_error_bars.png +0 -0
  54. bio_analyze_plot-0.1.0a0/tests/test_bar/test_bar_plot_generation.png +0 -0
  55. bio_analyze_plot-0.1.0a0/tests/test_bar/test_bar_plot_sd.png +0 -0
  56. bio_analyze_plot-0.1.0a0/tests/test_bar/test_bar_plot_se.png +0 -0
  57. bio_analyze_plot-0.1.0a0/tests/test_bar/test_bar_plot_significance.png +0 -0
  58. bio_analyze_plot-0.1.0a0/tests/test_bar.py +83 -0
  59. bio_analyze_plot-0.1.0a0/tests/test_box/test_box_plot_generation.png +0 -0
  60. bio_analyze_plot-0.1.0a0/tests/test_box/test_box_plot_significance.png +0 -0
  61. bio_analyze_plot-0.1.0a0/tests/test_box/test_box_plot_swarm.png +0 -0
  62. bio_analyze_plot-0.1.0a0/tests/test_box.py +67 -0
  63. bio_analyze_plot-0.1.0a0/tests/test_chinese_font.py +44 -0
  64. bio_analyze_plot-0.1.0a0/tests/test_chromosome/test_chromosome_plot_custom_colors.png +0 -0
  65. bio_analyze_plot-0.1.0a0/tests/test_chromosome/test_chromosome_plot_generation.png +0 -0
  66. bio_analyze_plot-0.1.0a0/tests/test_chromosome/test_chromosome_plot_max_chroms.png +0 -0
  67. bio_analyze_plot-0.1.0a0/tests/test_chromosome/test_chromosome_plot_sorting.png +0 -0
  68. bio_analyze_plot-0.1.0a0/tests/test_chromosome.py +49 -0
  69. bio_analyze_plot-0.1.0a0/tests/test_cli.py +103 -0
  70. bio_analyze_plot-0.1.0a0/tests/test_cli_sheet.py +45 -0
  71. bio_analyze_plot-0.1.0a0/tests/test_custom_theme.py +54 -0
  72. bio_analyze_plot-0.1.0a0/tests/test_customization.py +83 -0
  73. bio_analyze_plot-0.1.0a0/tests/test_gsea/test_gsea_no_border.png +0 -0
  74. bio_analyze_plot-0.1.0a0/tests/test_gsea/test_gsea_no_metric.png +0 -0
  75. bio_analyze_plot-0.1.0a0/tests/test_gsea/test_gsea_with_metric.png +0 -0
  76. bio_analyze_plot-0.1.0a0/tests/test_gsea.py +69 -0
  77. bio_analyze_plot-0.1.0a0/tests/test_heatmap/test_heatmap_plot_generation.png +0 -0
  78. bio_analyze_plot-0.1.0a0/tests/test_heatmap.py +32 -0
  79. bio_analyze_plot-0.1.0a0/tests/test_latex.py +35 -0
  80. bio_analyze_plot-0.1.0a0/tests/test_line/test_line_plot_generation.png +0 -0
  81. bio_analyze_plot-0.1.0a0/tests/test_line.py +22 -0
  82. bio_analyze_plot-0.1.0a0/tests/test_pca/test_pca_plot.png +0 -0
  83. bio_analyze_plot-0.1.0a0/tests/test_pca/test_pca_plot_tidy.png +0 -0
  84. bio_analyze_plot-0.1.0a0/tests/test_pca.py +56 -0
  85. bio_analyze_plot-0.1.0a0/tests/test_pca_cluster.py +38 -0
  86. bio_analyze_plot-0.1.0a0/tests/test_pie/test_pie_plot.png +0 -0
  87. bio_analyze_plot-0.1.0a0/tests/test_pie/test_pie_plot_explode.png +0 -0
  88. bio_analyze_plot-0.1.0a0/tests/test_pie/test_pie_plot_explode_list.png +0 -0
  89. bio_analyze_plot-0.1.0a0/tests/test_pie.py +50 -0
  90. bio_analyze_plot-0.1.0a0/tests/test_plot.py +6 -0
  91. bio_analyze_plot-0.1.0a0/tests/test_scatter/test_scatter_basic.png +0 -0
  92. bio_analyze_plot-0.1.0a0/tests/test_scatter/test_scatter_complex.png +0 -0
  93. bio_analyze_plot-0.1.0a0/tests/test_scatter/test_scatter_ellipse.png +0 -0
  94. bio_analyze_plot-0.1.0a0/tests/test_scatter.py +69 -0
  95. bio_analyze_plot-0.1.0a0/tests/test_themes_exhaustive.py +85 -0
  96. bio_analyze_plot-0.1.0a0/tests/test_volcano/test_volcano_plot_generation.png +0 -0
  97. bio_analyze_plot-0.1.0a0/tests/test_volcano.py +23 -0
bio_analyze_plot-0.1.0a0/.gitignore
@@ -0,0 +1,26 @@
+ __pycache__/
+ *.py[cod]
+ *.pyd
+ *.pyo
+ *.so
+ .Python
+ .venv/
+ env/
+ venv/
+
+ build/
+ dist/
+ *.egg-info/
+ .pytest_cache/
+ .ruff_cache/
+ .mypy_cache/
+
+ .idea/
+ .vscode/
+
+ uv.lock
+
+ *.log
+ output
+ .trae/
+ *.xml
bio_analyze_plot-0.1.0a0/CHANGELOG.md
File without changes
bio_analyze_plot-0.1.0a0/PKG-INFO
@@ -0,0 +1,273 @@
+ Metadata-Version: 2.4
+ Name: bio-analyze-plot
+ Version: 0.1.0a0
+ Summary: Publication-ready plotting module for bio-analyze.
+ Author: qww
+ License: GPL-3.0
+ Requires-Python: <3.15,>=3.9
+ Requires-Dist: bio-analyze-core>=0.1.0a0
+ Requires-Dist: matplotlib>=3.8.0
+ Requires-Dist: numpy>=1.20.0
+ Requires-Dist: openpyxl>=3.1.0
+ Requires-Dist: pandas>=2.0.0
+ Requires-Dist: scikit-learn>=1.3.0
+ Requires-Dist: scipy>=1.10.0
+ Requires-Dist: seaborn>=0.13.0
+ Requires-Dist: statannotations>=0.6.0
+ Requires-Dist: typer>=0.9.0
+ Provides-Extra: dev
+ Requires-Dist: allure-pytest; extra == 'dev'
+ Requires-Dist: pillow; extra == 'dev'
+ Requires-Dist: pytest; extra == 'dev'
+ Requires-Dist: pytest-regressions; extra == 'dev'
+ Description-Content-Type: text/markdown
+
+ # bio-analyze-plot
+
+ **bio-analyze-plot** is the professional plotting module in the `bio-analyze` toolbox. Built on `matplotlib` and `seaborn`, it aims to generate publication-ready statistical charts and supports one-click switching between journal themes like `Nature` and `Science`.
+
+ ## ✨ Features
+
+ - **Publication-Ready Themes**: Built-in `nature`, `science`, and `default` themes that automatically adjust fonts, font sizes, line widths, and color palettes.
+ - **Wide Data Support**: Supports `.csv`, `.tsv`, `.txt`, `.xlsx`, and `.xls` formats. Specific Excel sheets can be targeted via `--sheet`.
+ - **Multi-Format Export**: Supports various image formats including `png`, `pdf`, `svg`, `jpg`, and `tiff`.
+ - **LaTeX Support**: Automatically parses LaTeX formulas in axis labels (e.g., `$y = \sin(x)$`).
+ - **Unified CLI**: All charts can be invoked through a unified command-line interface.
+
+ ## 📊 Supported Plots
+
+ ### 1. Volcano Plot
+
+ Used to display the distribution of Differentially Expressed Genes (DEGs), intuitively showing significantly up-regulated and down-regulated genes.
+
+ - **Command**: `volcano`
+ - **Key Parameters**:
+   - `-x`: log2 Fold Change column name (Default: "log2FoldChange")
+   - `-y`: P-value column name (Default: "pvalue")
+   - `--fc-cutoff`: Fold Change threshold
+   - `--p-cutoff`: P-value threshold
+   - `--labels`: Custom labels (e.g., `{"up": "Up", "down": "Down", "ns": "NS"}`)
+
+ ### 2. Bar Plot
+
+ Supports bar charts with error bars (SD/SE/CI) and significance markers.
+
+ - **Command**: `bar`
+ - **Key Parameters**:
+   - `--error-bar-type`: Error bar type. Options: `SD` (Standard Deviation), `SE` (Standard Error), `CI` (Confidence Interval).
+   - `--error-bar-ci`: Confidence level when type is `CI` (Default: 95).
+   - `--significance`: Specify group pairs for significance annotation, e.g., "Control,Treated".
+   - `--test`: Significance test method. Supports `t-test_ind`, `t-test_welch`, `Mann-Whitney`, etc.
+   - `--text-format`: Significance marker format (`star`, `full`, `simple`, `pvalue`).
+
+ ### 3. Box Plot
+
+ Displays data distribution, supporting overlaid SwarmPlot scatter points and significance markers.
+
+ - **Command**: `box`
+ - **Key Parameters**:
+   - `-x`: Grouping column (Categorical)
+   - `-y`: Value column (Numerical)
+   - `--hue`: Color grouping column
+   - `--add-swarm`: Whether to overlay a Swarmplot to show all data points.
+   - `--significance`: Group pairs for significance annotation.
+
+ ### 4. Heatmap
+
+ Used to display clustered heatmaps of gene expression or other matrix data.
+
+ - **Command**: `heatmap`
+ - **Key Parameters**:
+   - `--cluster-rows` / `--cluster-cols`: Whether to cluster rows/columns.
+   - `--z-score`: Perform Z-score normalization on rows (0) or columns (1).
+
+ ### 5. PCA Plot
+
+ Displays the distribution of samples in principal component space, supporting automatic clustering ellipses.
+
+ - **Command**: `pca`
+ - **Key Parameters**:
+   - `--transpose`: Whether to transpose the matrix (if input is Genes x Samples, it usually needs to be transposed to Samples x Genes).
+   - `--hue`: Sample grouping column.
+   - `--cluster`: Whether to display clustering confidence ellipses.
+
+ ### 6. Line Plot
+
+ Used to display time series or trend data, supporting smooth fitting and error bars.
+
+ - **Command**: `line`
+ - **Key Parameters**:
+   - `--hue`: Grouping column; different groups are shown in different colors.
+   - `--smooth`: Enable smooth curve fitting (B-spline).
+   - `--smooth-points`: Number of interpolation points for smoothing (Default: 300).
+   - `--error-bar-type`: Error bar type (`SD`, `SE`, `CI`).
+   - `--error-bar-ci`: Confidence interval size.
+   - `--error-bar-capsize`: Width of the error bar caps.
+   - `--markers`: Display markers for original data points.
+
+ ### 7. Scatter Plot
+
+ Displays the relationship between two variables, supporting confidence ellipses.
+
+ - **Command**: `scatter`
+ - **Key Parameters**:
+   - `--x`, `--y`: X/Y axis column names.
+   - `--hue`: Color grouping column.
+   - `--size`: Column to map point sizes.
+   - `--style`: Column to map point styles/shapes.
+   - `--add-ellipse`: Draw confidence ellipses for each group.
+   - `--ellipse-std`: Standard deviation multiplier for the ellipse (Default: 2.0).
+
+ ### 8. Pie Chart
+
+ Displays the proportions of categorical data.
+
+ - **Command**: `pie`
+ - **Key Parameters**:
+   - `--explode`: Distance to explode sectors.
+   - `--autopct`: Percentage display format (Default: "%1.1f%%").
+
+ ### 9. Chromosome Distribution Plot
+
+ Displays the distribution density of Reads across whole-genome chromosomes.
+
+ - **Command**: `chromosome`
+ - **Description**: Typically used with the `rna_seq` pipeline to show read coverage on positive and negative strands.
+
+ ### 10. GSEA Enrichment Plot
+
+ Displays the Enrichment Score trend of GSEA analysis.
+
+ - **Command**: `gsea`
+ - **Key Parameters**:
+   - `--rank`: Rank value column name.
+   - `--score`: Running ES column name.
+   - `--nes`: Normalized Enrichment Score (displayed in the title).
+   - `--pvalue` / `--fdr`: Statistical significance metrics.
+
+ ## 🎨 Themes
+
+ Supports customizing plotting themes via JSON files or Python code.
+
+ ### Using Built-in Themes
+
+ ```bash
+ # Use Nature style
+ bioanalyze plot volcano result.csv --theme nature
+
+ # Use Science style
+ bioanalyze plot volcano result.csv --theme science
+ ```
+
+ ### Custom Themes (JSON)
+
+ Create `my_theme.json`:
+
+ ```json
+ {
+   "name": "dark_presentation",
+   "style": "darkgrid",
+   "context": "talk",
+   "font": "Arial",
+   "rc_params": {
+     "lines.linewidth": 2.5,
+     "axes.labelsize": 14
+   }
+ }
+ ```
+
+ Usage: `bioanalyze plot volcano ... --theme ./my_theme.json`
+
+ ## 📦 Python API
+
+ All charts can be invoked directly via Python classes, supporting more flexible customization.
+
+ ### Basic Usage
+
+ ```python
+ import pandas as pd
+ from bio_analyze_plot.plots import VolcanoPlot, PCAPlot
+
+ # 1. Plot Volcano
+ df = pd.read_csv("de_results.csv")
+ plotter = VolcanoPlot(theme="nature")
+ plotter.plot(
+     data=df,
+     x="log2FoldChange",
+     y="padj",
+     fc_cutoff=1.5,
+     p_cutoff=0.05,
+     title="Differential Expression",
+     output="volcano.pdf"
+ )
+
+ # 2. Plot PCA
+ counts = pd.read_csv("counts.csv", index_col=0)
+ pca = PCAPlot(theme="science")
+ pca.plot(
+     data=counts,
+     transpose=True, # If input is Genes x Samples
+     hue=["Control", "Control", "Treat", "Treat"], # Sample grouping
+     cluster=True, # Draw confidence ellipses
+     output="pca.png"
+ )
+ ```
+
+ ### Chart Class Index
+
+ #### `VolcanoPlot`
+ - **plot() Parameters**:
+   - `data` (DataFrame): Data source.
+   - `x`, `y` (str): Column names.
+   - `log_y` (bool): Whether to apply -log10 to y (Default: True).
+   - `fc_cutoff`, `p_cutoff` (float): Threshold lines.
+   - `labels` (dict): Legend labels (e.g., `{"up": "Up", "down": "Down", "ns": "NS"}`).
+
+ #### `HeatmapPlot`
+ - **plot() Parameters**:
+   - `cluster_rows`, `cluster_cols` (bool): Whether to cluster.
+   - `z_score` (int): 0=Row standardization, 1=Column standardization, None=No standardization.
+   - `cmap` (str): Colormap (Default: "vlag").
+
+ #### `BoxPlot`
+ - **plot() Parameters**:
+   - `significance` (list[tuple]): Pairs for significance markers (e.g., `[("Ctrl", "Treat")]`).
+   - `test` (str): Test method (Default: "t-test_ind").
+   - `add_swarm` (bool): Whether to overlay scatter points.
+
+ #### `LinePlot`
+ - **plot() Parameters**:
+   - `smooth` (bool): Enable smooth curve.
+   - `error_bar_type` (str): Error bar type.
+   - `markers` (bool/list): Data point markers.
+
+ #### `ScatterPlot`
+ - **plot() Parameters**:
+   - `add_ellipse` (bool): Draw confidence ellipses.
+   - `ellipse_std` (float): Ellipse standard deviation.
+   - `style`, `size` (str): Style/size mapping columns.
+
+ #### `PCAPlot` (Inherits from ScatterPlot)
+ - **plot() Parameters**:
+   - `transpose` (bool): Whether to transpose the input matrix.
+   - `n_components` (int): Number of principal components.
+   - `cluster` (bool): Whether to draw confidence ellipses.
+
+ #### `ChromosomeDistributionPlot`
+ - **plot() Parameters**:
+   - `chrom_col`, `pos_col`: Chromosome and position column names.
+   - `pos_counts_col`, `neg_counts_col`: Positive and negative strand count columns.
+   - `max_chroms` (int): Maximum number of chromosomes to display.
+
+ #### `GSEAPlot`
+ - **plot() Parameters**:
+   - `rank`, `score`: Rank and score data columns.
+   - `nes`, `pvalue`, `fdr`: Statistical metrics (displayed in the plot).
+
+ ## 💻 Development
+
+ Unit test outputs are located in the `packages/plot/tests/output` directory.
+
+ ```bash
+ pytest packages/plot/tests
+ ```
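Note on the volcano plot described in the PKG-INFO text above: the `--fc-cutoff` and `--p-cutoff` thresholds, together with the `log_y` option, suggest a classification rule along the following lines. This is a dependency-free sketch, not code taken from the package. The function names `classify_point` and `volcano_coordinates` are hypothetical, and the comparison rule itself (p-value below the cutoff and |log2FC| at or above the cutoff) is an assumption made for illustration.

```python
import math


def classify_point(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Return 'up', 'down', or 'ns' for one gene (assumed rule, not the package API)."""
    if p_value < p_cutoff and log2_fc >= fc_cutoff:
        return "up"
    if p_value < p_cutoff and log2_fc <= -fc_cutoff:
        return "down"
    return "ns"


def volcano_coordinates(rows, fc_cutoff=1.0, p_cutoff=0.05):
    """Map (log2FC, p-value) pairs to (x, -log10(p), status) triples."""
    return [
        (fc, -math.log10(p), classify_point(fc, p, fc_cutoff, p_cutoff))
        for fc, p in rows
    ]


# Toy input: (log2FoldChange, p-value) pairs, invented for this example.
rows = [(2.3, 0.001), (-1.8, 0.01), (0.4, 0.5), (3.0, 0.2)]
points = volcano_coordinates(rows, fc_cutoff=1.5, p_cutoff=0.05)
for x, y, status in points:
    print(f"{x:+.1f}  {y:.2f}  {status}")
```

The last toy row has a large fold change but a p-value above the cutoff, so under this rule it stays in the `ns` group; both thresholds must be met for a point to count as up- or down-regulated.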
bio_analyze_plot-0.1.0a0/README.md
@@ -0,0 +1,249 @@
+ # bio-analyze-plot
+
+ **bio-analyze-plot** is the professional plotting module in the `bio-analyze` toolbox. Built on `matplotlib` and `seaborn`, it aims to generate publication-ready statistical charts and supports one-click switching between journal themes like `Nature` and `Science`.
+
+ ## ✨ Features
+
+ - **Publication-Ready Themes**: Built-in `nature`, `science`, and `default` themes that automatically adjust fonts, font sizes, line widths, and color palettes.
+ - **Wide Data Support**: Supports `.csv`, `.tsv`, `.txt`, `.xlsx`, and `.xls` formats. Specific Excel sheets can be targeted via `--sheet`.
+ - **Multi-Format Export**: Supports various image formats including `png`, `pdf`, `svg`, `jpg`, and `tiff`.
+ - **LaTeX Support**: Automatically parses LaTeX formulas in axis labels (e.g., `$y = \sin(x)$`).
+ - **Unified CLI**: All charts can be invoked through a unified command-line interface.
+
+ ## 📊 Supported Plots
+
+ ### 1. Volcano Plot
+
+ Used to display the distribution of Differentially Expressed Genes (DEGs), intuitively showing significantly up-regulated and down-regulated genes.
+
+ - **Command**: `volcano`
+ - **Key Parameters**:
+   - `-x`: log2 Fold Change column name (Default: "log2FoldChange")
+   - `-y`: P-value column name (Default: "pvalue")
+   - `--fc-cutoff`: Fold Change threshold
+   - `--p-cutoff`: P-value threshold
+   - `--labels`: Custom labels (e.g., `{"up": "Up", "down": "Down", "ns": "NS"}`)
+
+ ### 2. Bar Plot
+
+ Supports bar charts with error bars (SD/SE/CI) and significance markers.
+
+ - **Command**: `bar`
+ - **Key Parameters**:
+   - `--error-bar-type`: Error bar type. Options: `SD` (Standard Deviation), `SE` (Standard Error), `CI` (Confidence Interval).
+   - `--error-bar-ci`: Confidence level when type is `CI` (Default: 95).
+   - `--significance`: Specify group pairs for significance annotation, e.g., "Control,Treated".
+   - `--test`: Significance test method. Supports `t-test_ind`, `t-test_welch`, `Mann-Whitney`, etc.
+   - `--text-format`: Significance marker format (`star`, `full`, `simple`, `pvalue`).
+
+ ### 3. Box Plot
+
+ Displays data distribution, supporting overlaid SwarmPlot scatter points and significance markers.
+
+ - **Command**: `box`
+ - **Key Parameters**:
+   - `-x`: Grouping column (Categorical)
+   - `-y`: Value column (Numerical)
+   - `--hue`: Color grouping column
+   - `--add-swarm`: Whether to overlay a Swarmplot to show all data points.
+   - `--significance`: Group pairs for significance annotation.
+
+ ### 4. Heatmap
+
+ Used to display clustered heatmaps of gene expression or other matrix data.
+
+ - **Command**: `heatmap`
+ - **Key Parameters**:
+   - `--cluster-rows` / `--cluster-cols`: Whether to cluster rows/columns.
+   - `--z-score`: Perform Z-score normalization on rows (0) or columns (1).
+
+ ### 5. PCA Plot
+
+ Displays the distribution of samples in principal component space, supporting automatic clustering ellipses.
+
+ - **Command**: `pca`
+ - **Key Parameters**:
+   - `--transpose`: Whether to transpose the matrix (if input is Genes x Samples, it usually needs to be transposed to Samples x Genes).
+   - `--hue`: Sample grouping column.
+   - `--cluster`: Whether to display clustering confidence ellipses.
+
+ ### 6. Line Plot
+
+ Used to display time series or trend data, supporting smooth fitting and error bars.
+
+ - **Command**: `line`
+ - **Key Parameters**:
+   - `--hue`: Grouping column; different groups are shown in different colors.
+   - `--smooth`: Enable smooth curve fitting (B-spline).
+   - `--smooth-points`: Number of interpolation points for smoothing (Default: 300).
+   - `--error-bar-type`: Error bar type (`SD`, `SE`, `CI`).
+   - `--error-bar-ci`: Confidence interval size.
+   - `--error-bar-capsize`: Width of the error bar caps.
+   - `--markers`: Display markers for original data points.
+
+ ### 7. Scatter Plot
+
+ Displays the relationship between two variables, supporting confidence ellipses.
+
+ - **Command**: `scatter`
+ - **Key Parameters**:
+   - `--x`, `--y`: X/Y axis column names.
+   - `--hue`: Color grouping column.
+   - `--size`: Column to map point sizes.
+   - `--style`: Column to map point styles/shapes.
+   - `--add-ellipse`: Draw confidence ellipses for each group.
+   - `--ellipse-std`: Standard deviation multiplier for the ellipse (Default: 2.0).
+
+ ### 8. Pie Chart
+
+ Displays the proportions of categorical data.
+
+ - **Command**: `pie`
+ - **Key Parameters**:
+   - `--explode`: Distance to explode sectors.
+   - `--autopct`: Percentage display format (Default: "%1.1f%%").
+
+ ### 9. Chromosome Distribution Plot
+
+ Displays the distribution density of Reads across whole-genome chromosomes.
+
+ - **Command**: `chromosome`
+ - **Description**: Typically used with the `rna_seq` pipeline to show read coverage on positive and negative strands.
+
+ ### 10. GSEA Enrichment Plot
+
+ Displays the Enrichment Score trend of GSEA analysis.
+
+ - **Command**: `gsea`
+ - **Key Parameters**:
+   - `--rank`: Rank value column name.
+   - `--score`: Running ES column name.
+   - `--nes`: Normalized Enrichment Score (displayed in the title).
+   - `--pvalue` / `--fdr`: Statistical significance metrics.
+
+ ## 🎨 Themes
+
+ Supports customizing plotting themes via JSON files or Python code.
+
+ ### Using Built-in Themes
+
+ ```bash
+ # Use Nature style
+ bioanalyze plot volcano result.csv --theme nature
+
+ # Use Science style
+ bioanalyze plot volcano result.csv --theme science
+ ```
+
+ ### Custom Themes (JSON)
+
+ Create `my_theme.json`:
+
+ ```json
+ {
+   "name": "dark_presentation",
+   "style": "darkgrid",
+   "context": "talk",
+   "font": "Arial",
+   "rc_params": {
+     "lines.linewidth": 2.5,
+     "axes.labelsize": 14
+   }
+ }
+ ```
+
+ Usage: `bioanalyze plot volcano ... --theme ./my_theme.json`
+
+ ## 📦 Python API
+
+ All charts can be invoked directly via Python classes, supporting more flexible customization.
+
+ ### Basic Usage
+
+ ```python
+ import pandas as pd
+ from bio_analyze_plot.plots import VolcanoPlot, PCAPlot
+
+ # 1. Plot Volcano
+ df = pd.read_csv("de_results.csv")
+ plotter = VolcanoPlot(theme="nature")
+ plotter.plot(
+     data=df,
+     x="log2FoldChange",
+     y="padj",
+     fc_cutoff=1.5,
+     p_cutoff=0.05,
+     title="Differential Expression",
+     output="volcano.pdf"
+ )
+
+ # 2. Plot PCA
+ counts = pd.read_csv("counts.csv", index_col=0)
+ pca = PCAPlot(theme="science")
+ pca.plot(
+     data=counts,
+     transpose=True, # If input is Genes x Samples
+     hue=["Control", "Control", "Treat", "Treat"], # Sample grouping
+     cluster=True, # Draw confidence ellipses
+     output="pca.png"
+ )
+ ```
+
+ ### Chart Class Index
+
+ #### `VolcanoPlot`
+ - **plot() Parameters**:
+   - `data` (DataFrame): Data source.
+   - `x`, `y` (str): Column names.
+   - `log_y` (bool): Whether to apply -log10 to y (Default: True).
+   - `fc_cutoff`, `p_cutoff` (float): Threshold lines.
+   - `labels` (dict): Legend labels (e.g., `{"up": "Up", "down": "Down", "ns": "NS"}`).
+
+ #### `HeatmapPlot`
+ - **plot() Parameters**:
+   - `cluster_rows`, `cluster_cols` (bool): Whether to cluster.
+   - `z_score` (int): 0=Row standardization, 1=Column standardization, None=No standardization.
+   - `cmap` (str): Colormap (Default: "vlag").
+
+ #### `BoxPlot`
+ - **plot() Parameters**:
+   - `significance` (list[tuple]): Pairs for significance markers (e.g., `[("Ctrl", "Treat")]`).
+   - `test` (str): Test method (Default: "t-test_ind").
+   - `add_swarm` (bool): Whether to overlay scatter points.
+
+ #### `LinePlot`
+ - **plot() Parameters**:
+   - `smooth` (bool): Enable smooth curve.
+   - `error_bar_type` (str): Error bar type.
+   - `markers` (bool/list): Data point markers.
+
+ #### `ScatterPlot`
+ - **plot() Parameters**:
+   - `add_ellipse` (bool): Draw confidence ellipses.
+   - `ellipse_std` (float): Ellipse standard deviation.
+   - `style`, `size` (str): Style/size mapping columns.
+
+ #### `PCAPlot` (Inherits from ScatterPlot)
+ - **plot() Parameters**:
+   - `transpose` (bool): Whether to transpose the input matrix.
+   - `n_components` (int): Number of principal components.
+   - `cluster` (bool): Whether to draw confidence ellipses.
+
+ #### `ChromosomeDistributionPlot`
+ - **plot() Parameters**:
+   - `chrom_col`, `pos_col`: Chromosome and position column names.
+   - `pos_counts_col`, `neg_counts_col`: Positive and negative strand count columns.
+   - `max_chroms` (int): Maximum number of chromosomes to display.
+
+ #### `GSEAPlot`
+ - **plot() Parameters**:
+   - `rank`, `score`: Rank and score data columns.
+   - `nes`, `pvalue`, `fdr`: Statistical metrics (displayed in the plot).
+
+ ## 💻 Development
+
+ Unit test outputs are located in the `packages/plot/tests/output` directory.
+
+ ```bash
+ pytest packages/plot/tests
+ ```
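Note on the `--error-bar-type` options (`SD`, `SE`, `CI`) listed in the README above: the three quantities can be sketched with the standard library alone. The helper name `error_bar_half_width` is hypothetical, and the `CI` branch uses a normal approximation as a simplifying assumption. The package may compute its intervals differently (seaborn, for instance, bootstraps confidence intervals by default).

```python
import math
import statistics


def error_bar_half_width(values, kind="SD", ci=95):
    """Half-width of an error bar for one group (hypothetical helper, not the package API)."""
    sd = statistics.stdev(values)        # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(len(values))     # standard error of the mean
    if kind == "SD":
        return sd
    if kind == "SE":
        return se
    if kind == "CI":
        # Two-sided normal quantile for the requested confidence level (assumption:
        # normal approximation rather than a t-based or bootstrap interval).
        z = statistics.NormalDist().inv_cdf(0.5 + ci / 200)
        return z * se
    raise ValueError(f"unknown error bar type: {kind!r}")


# Toy replicate measurements for one group, invented for this example.
group = [4.0, 5.0, 6.0, 5.0]
for kind in ("SD", "SE", "CI"):
    print(kind, round(error_bar_half_width(group, kind), 4))
```

For small groups a t-based interval is wider than this normal approximation, so the sketch understates `CI` bars when the number of replicates is small.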