hammock-plot 0.4__tar.gz → 1.0__tar.gz
- {hammock_plot-0.4 → hammock_plot-1.0}/PKG-INFO +95 -28
- {hammock_plot-0.4 → hammock_plot-1.0}/README.md +93 -13
- {hammock_plot-0.4 → hammock_plot-1.0}/hammock_plot/__init__.py +2 -1
- hammock_plot-1.0/hammock_plot/figure.py +536 -0
- hammock_plot-1.0/hammock_plot/main.py +331 -0
- hammock_plot-1.0/hammock_plot/shapes.py +168 -0
- hammock_plot-1.0/hammock_plot/unibar.py +573 -0
- hammock_plot-1.0/hammock_plot/utils.py +140 -0
- hammock_plot-1.0/hammock_plot/value.py +35 -0
- {hammock_plot-0.4 → hammock_plot-1.0}/hammock_plot.egg-info/PKG-INFO +96 -29
- {hammock_plot-0.4 → hammock_plot-1.0}/hammock_plot.egg-info/SOURCES.txt +6 -1
- {hammock_plot-0.4 → hammock_plot-1.0}/setup.py +2 -2
- hammock_plot-0.4/hammock_plot/hammock_plot.py +0 -751
- {hammock_plot-0.4 → hammock_plot-1.0}/LICENSE +0 -0
- {hammock_plot-0.4 → hammock_plot-1.0}/hammock_plot.egg-info/dependency_links.txt +0 -0
- {hammock_plot-0.4 → hammock_plot-1.0}/hammock_plot.egg-info/requires.txt +0 -0
- {hammock_plot-0.4 → hammock_plot-1.0}/hammock_plot.egg-info/top_level.txt +0 -0
- {hammock_plot-0.4 → hammock_plot-1.0}/setup.cfg +0 -0
--- hammock_plot-0.4/PKG-INFO
+++ hammock_plot-1.0/PKG-INFO
@@ -1,6 +1,6 @@
-Metadata-Version: 2.
+Metadata-Version: 2.1
 Name: hammock_plot
-Version: 0
+Version: 1.0
 Summary: Hammock - visualization of categorical or mixed categorical/continuous data
 Home-page: https://github.com/TianchengY/hammock_plot
 Author: Tiancheng Yang
@@ -12,19 +12,6 @@ Classifier: Intended Audience :: Science/Research
 Requires-Python: >=3.6
 Description-Content-Type: text/markdown
 License-File: LICENSE
-Requires-Dist: matplotlib
-Requires-Dist: numpy
-Requires-Dist: pandas
-Dynamic: author
-Dynamic: author-email
-Dynamic: classifier
-Dynamic: description
-Dynamic: description-content-type
-Dynamic: home-page
-Dynamic: license-file
-Dynamic: requires-dist
-Dynamic: requires-python
-Dynamic: summary
 
 # Hammock plot
 
@@ -66,7 +53,7 @@ We import the diabetes dataset:
 ```python
 import hammock_plot
 import pandas as pd
-df = pd.read_csv('
+df = pd.read_csv('./data/data_asthma.csv')
 ```
 
 Minimal example of a hammock plot:
@@ -77,14 +64,22 @@ ax = hammock.plot(var=var)
 ```
 <img src="image/asthma_minimal.png" alt="Minimal example for a Hammock plot" width="600"/>
 
+The labels for the numerical variables aren't as desired; we would like the labels directly drawn on the data. We specify that we want no levels for our numerical variables.
+
+```python
+numeric_levels = {"comorbidities": None, "hospitalizations": None}
+ax = hammock.plot(var=var, numerical_var_levels=numeric_levels)
+```
+
+<img src="image/asthma_levels.png" alt="Hammock plot" width="600"/>
+
 The ordering of the child-adolescent-adult variable is not in the desired order; adult should not be in the middle. We now specify a specific order, child-adolescent-adult.
 
 ```python
-
-
-value_order = {"group": group_dict}
+group_order = ["child", "adolescent", "adult"]
+value_order = {"group": group_order}
 hammock = hammock_plot.Hammock(data_df = df)
-ax = hammock.plot(var=var, value_order=value_order )
+ax = hammock.plot(var=var, value_order=value_order, numerical_var_levels=numeric_levels)
 ```
 
 <!--- to restrict image size, I am using a an html command, rather than the standard  --->
@@ -94,7 +89,7 @@ ax = hammock.plot(var=var, value_order=value_order )
 We highlight observations with comorbidities=0 in red:
 
 ```python
-ax = hammock.plot(var=var
+ax = hammock.plot(var=var ,hi_var="comorbidities", hi_value=[0], colors=["red"], numerical_var_levels=numeric_levels)
 ```
 
 <!---  --->
@@ -108,14 +103,14 @@ We import the diabetes dataset:
 ```python
 import hammock_plot
 import pandas as pd
-df = pd.read_csv('
+df = pd.read_csv('./data/data_diabetes.csv')
 ```
 
 The three variables represent different ordinal scales for satisfaction. We are checking for missing values:
 ```python
 var = ["sataces","satcomm","satrate"]
 hammock = hammock_plot.Hammock(data_df = df)
-ax = hammock.plot(var=var, missing=True)
+ax = hammock.plot(var=var, missing=True, min_bar_height=0.2,numerical_var_levels={"sataces": None, "satcomm": None, "satrate": None})
 ```
 
 <img src="image/diabetes.png" alt="Hammock plot for the Diabetes Data" width="600"/>
@@ -123,7 +118,75 @@ ax = hammock.plot(var=var, missing=True)
 The missing value category is shown at the bottom for each variable. We find missing values for all 3 variables, but fewest for the last one. We also see a phenomenon called "top coding", where
 satisfied respondents simply choose the highest value.
 
+### Example value_order for the Shakespeare data
+
+We import the Shakespeare dataset:
+
+```python
+import hammock_plot
+import pandas as pd
+df = pd.read_csv('./data/data_shakespeare.csv')
+```
+
+We use `speaker_order` to order the values of the variables `speaker1` and `speaker2` according to the social class hierarchy.
+```python
+var_lst = ["type","speaker1","speaker2","sex1"]
+color_lst = ["red","yellow","green"]
+hi_value = ["Beggars","Citizens","Gentry"]
+
+speaker_order=["Beggars", "Royalty", "Nobility", "Gentry", "Citizens", "Yeomanry"]
+
+hammock = hammock_plot.Hammock(data_df = df)
+ax = hammock.plot(var=var_lst,hi_var = "speaker1", hi_value=hi_value,color=color_lst, bar_width=0.6,missing=True,
+value_order ={"speaker1":speaker_order,"speaker2":speaker_order} )
+```
+
+<img src="image/shakespeare_order.png" alt="Hammock plot for the Shakespeare data, with value_order specified" width="600"/>
+
+### Example same_scale using Shakespeare data
+We can accomplish similar results using `same_scale`.
+```python
+hammock = hammock_plot.Hammock(data_df = df)
+ax = hammock.plot(var=var_lst,hi_var = "speaker1", hi_value=hi_value,color=color_lst, bar_width=0.6,missing=True,
+value_order ={"speaker1":speaker_order}, same_scale=["speaker1", "speaker2"] )
+```
+<img src="image/shakespeare_scale.png" alt="Hammock plot for the Shakespeare data, with same_scale specified" width="600"/>
+
+### Example numerical_display_type using penguin data
+
+We import the penguin dataset:
+
+```python
+import hammock_plot
+import pandas as pd
+df = pd.read_csv('./data/data_penguins.csv')
+```
+
+We use `numerical_display_type` to control how we want to display our numerical data.
 
+```python
+hammock = hammock_plot.Hammock(df)
+ax = hammock.plot(
+var= ["species", "island", "bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"],
+hi_var="island",
+hi_value=["Torgersen"],
+missing=True,
+numerical_display_type={"bill_length_mm":"box", "bill_depth_mm": "rugplot", "flipper_length_mm": "violin", "body_mass_g":"box"},
+)
+```
+<img src="image/penguin_display_violin.png" alt="Hammock plot for the penguin data, demonstrating numerical_display_type" width="600"/>
+
+Box plots support multiple highlight values. Violin plots only support one highlight value.
+```python
+ax = hammock.plot(
+var= ["species", "island", "bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"],
+hi_var="island",
+hi_value=["Torgersen", "Biscoe"],
+missing=True,
+numerical_display_type={"bill_length_mm":"box", "bill_depth_mm": "box", "flipper_length_mm": "box", "body_mass_g":"box"},
+)
+```
+<img src="image/penguin_display_types.png" alt="Hammock plot for the penguin data, demonstrating numerical_display_type with multiple highlighting" width="600"/>
 
 ## API Reference
 
@@ -134,21 +197,25 @@ satisfied respondents simply choose the highest value.
 | Category | Parameter | Type | Description |
 | --- | :-------- | :------- | :------------------------- |
 | General | `var` | `List[str]` | List of variables to display. |
-| | `value_order` | `Dict[str,
+| | `value_order` | `Dict[str, List[int]]` | If specified, the order of the values in the plot follows the order of values in the list supplied in the dictionary. Only applicable to categorical variables. |
+| | `numerical_var_levels` | `Dict[str, int \| None]` | Specifies the number of subdivisions in the y-axis for numerical variables. Example: {"NumericalVarname": 9, "NumericalVarname2": None}. Default is 7. |
+| | `numerical_display_type` | `Dict[str, str]` | Specifies the type of plot (rugplot, box plot, violin plot) for numerical variable display. Example: {"NumericalVarname": "rugplot", "NumericalVarname2": "violin", "NumericalVarname3": "box"}. Default is "rugplot". |
 | | `missing` | `bool` | Whether or not to add a category for missing values at the bottom of the plot. If False, observations that have a missing value for any variable in the data frame (even those not used in the hammock plot) are removed. Default is False. |
 | | `label` | `bool` | Whether or not to display labels between the plotting segments |
+| | `unibar`| `bool` | Whether or not to display unibars between the plotting segments |
 | Highlighting | `hi_var` | `str` | Variable to be highlighted. Default is none. |
-| | `hi_value` | `List[str or int]` |
+| | `hi_value` | `List[str or int] or str or int` | Value(s) of `hi_var` to be highlighted. You can highlight one or multiple values. You can also pass an expression as a string (e.g. "x>1 and (x>5 or x<4)") to specify a range for a numeric `hi_var`. |
 | | `hi_box` | `str` | Controls how highlighted values are displayed within category labels. Options are "vertical" for vertically stacked color segments or "horizontal" for horizontally split color segments. Default is "vertical".|
 | | `hi_missing` | `bool` | Whether or not missing values for `hi_var` should be highlighted. |
 | | `color` | `List[str]` | List of colors corresponding to the list of values to be highlighted. Each color can be specified as a plain color name (e.g., `"red"`, `"yellow"`) or in the format `"color=alpha"` (e.g., `"red=0.5"`) to control transparency/intensity, where `alpha` is a decimal between 0 and 1. The default highlight color list is `["red", "green", "yellow", "lightblue", "orange", "gray", "brown", "olive", "pink", "cyan", "magenta"]`. |
 | | `default_color` | `str` | Default color of plotting elements for boxes that are not highlighted. Default is "blue" |
-| Manipulating Spacing and Layout | `
-| | `space` | `float` |
+| Manipulating Spacing and Layout | `uni_fraction` | `float` | Fraction of vertical space that should be populated by data. Adjusts the height of the data points. Default is 0.08. |
+| | `space` | `float` | Fraction of horizontal space allocated to labels/univariate bars rather than to connecting boxes. Default is 0.3. |
 | | `label_options` | `Dict[str, Dict[str, Any]]` | Manipulates the size and look of the labels. Args following the options in the website: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.text.html Example:{"ExampleVarname":{"fontsize":12,"fontstyle":"italic","fontweight":"black","color":"b"}} Default is None. |
 | | `height` | `float` | Height of the plot in inches. Default is 10. |
 | | `width` | `float` | Width of the plot in inches. Default is 15. Caution: Width too narrow may distort the plot. |
-| |
+| | `alpha` | `float` | Alpha value for the colours in the plot. Float from 0-1. Default is 0.7. |
+| | `min_bar_height` | `float` | Minimal bar height. Bars representing only a tiny fraction of the data may be so narrow that they are invisible in a plot. The default value tries to ensure this does not happen. Default is 0.1. |
 | Other options | `shape` | `str` | Shape of the boxes. "rectangle" (default) or "parallelogram". |
 | | `same_scale` | `List[str]` | List of variables that have the same scale. Default is None. |
 | | `display_figure` | `bool` | Whether or not to display the figure. This can be useful if you just want to save the plots. Default is 'True'. |
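Note on the new `hi_value` row: it gives "x>1 and (x>5 or x<4)" as an example expression for a numeric `hi_var`. The sketch below only illustrates which values that example expression would select. It is plain Python, does not call `hammock_plot`, and the helper name `matches_example` is hypothetical; how the library itself parses the string is not visible in this diff.

```python
def matches_example(x):
    """Hypothetical stand-in for the README's example expression
    "x>1 and (x>5 or x<4)"; not part of hammock_plot."""
    return x > 1 and (x > 5 or x < 4)


# Sample numeric values for an imagined hi_var column.
values = [0, 1, 2, 3, 4, 5, 6, 7]

# Keep only the values the expression is true for.
selected = [x for x in values if matches_example(x)]
print(selected)
```

Under this reading, the expression picks out two separate ranges (values between 1 and 4, and values above 5), which is presumably why a string expression is offered in addition to a list of values.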
--- hammock_plot-0.4/README.md
+++ hammock_plot-1.0/README.md
@@ -38,7 +38,7 @@ We import the diabetes dataset:
 ```python
 import hammock_plot
 import pandas as pd
-df = pd.read_csv('
+df = pd.read_csv('./data/data_asthma.csv')
 ```
 
 Minimal example of a hammock plot:
@@ -49,14 +49,22 @@ ax = hammock.plot(var=var)
 ```
 <img src="image/asthma_minimal.png" alt="Minimal example for a Hammock plot" width="600"/>
 
+The labels for the numerical variables aren't as desired; we would like the labels directly drawn on the data. We specify that we want no levels for our numerical variables.
+
+```python
+numeric_levels = {"comorbidities": None, "hospitalizations": None}
+ax = hammock.plot(var=var, numerical_var_levels=numeric_levels)
+```
+
+<img src="image/asthma_levels.png" alt="Hammock plot" width="600"/>
+
 The ordering of the child-adolescent-adult variable is not in the desired order; adult should not be in the middle. We now specify a specific order, child-adolescent-adult.
 
 ```python
-
-
-value_order = {"group": group_dict}
+group_order = ["child", "adolescent", "adult"]
+value_order = {"group": group_order}
 hammock = hammock_plot.Hammock(data_df = df)
-ax = hammock.plot(var=var, value_order=value_order )
+ax = hammock.plot(var=var, value_order=value_order, numerical_var_levels=numeric_levels)
 ```
 
 <!--- to restrict image size, I am using a an html command, rather than the standard  --->
@@ -66,7 +74,7 @@ ax = hammock.plot(var=var, value_order=value_order )
 We highlight observations with comorbidities=0 in red:
 
 ```python
-ax = hammock.plot(var=var
+ax = hammock.plot(var=var ,hi_var="comorbidities", hi_value=[0], colors=["red"], numerical_var_levels=numeric_levels)
 ```
 
 <!---  --->
@@ -80,14 +88,14 @@ We import the diabetes dataset:
 ```python
 import hammock_plot
 import pandas as pd
-df = pd.read_csv('
+df = pd.read_csv('./data/data_diabetes.csv')
 ```
 
 The three variables represent different ordinal scales for satisfaction. We are checking for missing values:
 ```python
 var = ["sataces","satcomm","satrate"]
 hammock = hammock_plot.Hammock(data_df = df)
-ax = hammock.plot(var=var, missing=True)
+ax = hammock.plot(var=var, missing=True, min_bar_height=0.2,numerical_var_levels={"sataces": None, "satcomm": None, "satrate": None})
 ```
 
 <img src="image/diabetes.png" alt="Hammock plot for the Diabetes Data" width="600"/>
@@ -95,7 +103,75 @@ ax = hammock.plot(var=var, missing=True)
 The missing value category is shown at the bottom for each variable. We find missing values for all 3 variables, but fewest for the last one. We also see a phenomenon called "top coding", where
 satisfied respondents simply choose the highest value.
 
+### Example value_order for the Shakespeare data
+
+We import the Shakespeare dataset:
+
+```python
+import hammock_plot
+import pandas as pd
+df = pd.read_csv('./data/data_shakespeare.csv')
+```
+
+We use `speaker_order` to order the values of the variables `speaker1` and `speaker2` according to the social class hierarchy.
+```python
+var_lst = ["type","speaker1","speaker2","sex1"]
+color_lst = ["red","yellow","green"]
+hi_value = ["Beggars","Citizens","Gentry"]
+
+speaker_order=["Beggars", "Royalty", "Nobility", "Gentry", "Citizens", "Yeomanry"]
+
+hammock = hammock_plot.Hammock(data_df = df)
+ax = hammock.plot(var=var_lst,hi_var = "speaker1", hi_value=hi_value,color=color_lst, bar_width=0.6,missing=True,
+value_order ={"speaker1":speaker_order,"speaker2":speaker_order} )
+```
+
+<img src="image/shakespeare_order.png" alt="Hammock plot for the Shakespeare data, with value_order specified" width="600"/>
+
+### Example same_scale using Shakespeare data
+We can accomplish similar results using `same_scale`.
+```python
+hammock = hammock_plot.Hammock(data_df = df)
+ax = hammock.plot(var=var_lst,hi_var = "speaker1", hi_value=hi_value,color=color_lst, bar_width=0.6,missing=True,
+value_order ={"speaker1":speaker_order}, same_scale=["speaker1", "speaker2"] )
+```
+<img src="image/shakespeare_scale.png" alt="Hammock plot for the Shakespeare data, with same_scale specified" width="600"/>
+
+### Example numerical_display_type using penguin data
+
+We import the penguin dataset:
+
+```python
+import hammock_plot
+import pandas as pd
+df = pd.read_csv('./data/data_penguins.csv')
+```
+
+We use `numerical_display_type` to control how we want to display our numerical data.
 
+```python
+hammock = hammock_plot.Hammock(df)
+ax = hammock.plot(
+var= ["species", "island", "bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"],
+hi_var="island",
+hi_value=["Torgersen"],
+missing=True,
+numerical_display_type={"bill_length_mm":"box", "bill_depth_mm": "rugplot", "flipper_length_mm": "violin", "body_mass_g":"box"},
+)
+```
+<img src="image/penguin_display_violin.png" alt="Hammock plot for the penguin data, demonstrating numerical_display_type" width="600"/>
+
+Box plots support multiple highlight values. Violin plots only support one highlight value.
+```python
+ax = hammock.plot(
+var= ["species", "island", "bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"],
+hi_var="island",
+hi_value=["Torgersen", "Biscoe"],
+missing=True,
+numerical_display_type={"bill_length_mm":"box", "bill_depth_mm": "box", "flipper_length_mm": "box", "body_mass_g":"box"},
+)
+```
+<img src="image/penguin_display_types.png" alt="Hammock plot for the penguin data, demonstrating numerical_display_type with multiple highlighting" width="600"/>
 
 ## API Reference
 
@@ -106,21 +182,25 @@ satisfied respondents simply choose the highest value.
 | Category | Parameter | Type | Description |
 | --- | :-------- | :------- | :------------------------- |
 | General | `var` | `List[str]` | List of variables to display. |
-| | `value_order` | `Dict[str,
+| | `value_order` | `Dict[str, List[int]]` | If specified, the order of the values in the plot follows the order of values in the list supplied in the dictionary. Only applicable to categorical variables. |
+| | `numerical_var_levels` | `Dict[str, int \| None]` | Specifies the number of subdivisions in the y-axis for numerical variables. Example: {"NumericalVarname": 9, "NumericalVarname2": None}. Default is 7. |
+| | `numerical_display_type` | `Dict[str, str]` | Specifies the type of plot (rugplot, box plot, violin plot) for numerical variable display. Example: {"NumericalVarname": "rugplot", "NumericalVarname2": "violin", "NumericalVarname3": "box"}. Default is "rugplot". |
 | | `missing` | `bool` | Whether or not to add a category for missing values at the bottom of the plot. If False, observations that have a missing value for any variable in the data frame (even those not used in the hammock plot) are removed. Default is False. |
 | | `label` | `bool` | Whether or not to display labels between the plotting segments |
+| | `unibar`| `bool` | Whether or not to display unibars between the plotting segments |
 | Highlighting | `hi_var` | `str` | Variable to be highlighted. Default is none. |
-| | `hi_value` | `List[str or int]` |
+| | `hi_value` | `List[str or int] or str or int` | Value(s) of `hi_var` to be highlighted. You can highlight one or multiple values. You can also pass an expression as a string (e.g. "x>1 and (x>5 or x<4)") to specify a range for a numeric `hi_var`. |
 | | `hi_box` | `str` | Controls how highlighted values are displayed within category labels. Options are "vertical" for vertically stacked color segments or "horizontal" for horizontally split color segments. Default is "vertical".|
 | | `hi_missing` | `bool` | Whether or not missing values for `hi_var` should be highlighted. |
 | | `color` | `List[str]` | List of colors corresponding to the list of values to be highlighted. Each color can be specified as a plain color name (e.g., `"red"`, `"yellow"`) or in the format `"color=alpha"` (e.g., `"red=0.5"`) to control transparency/intensity, where `alpha` is a decimal between 0 and 1. The default highlight color list is `["red", "green", "yellow", "lightblue", "orange", "gray", "brown", "olive", "pink", "cyan", "magenta"]`. |
 | | `default_color` | `str` | Default color of plotting elements for boxes that are not highlighted. Default is "blue" |
-| Manipulating Spacing and Layout | `
-| | `space` | `float` |
+| Manipulating Spacing and Layout | `uni_fraction` | `float` | Fraction of vertical space that should be populated by data. Adjusts the height of the data points. Default is 0.08. |
+| | `space` | `float` | Fraction of horizontal space allocated to labels/univariate bars rather than to connecting boxes. Default is 0.3. |
 | | `label_options` | `Dict[str, Dict[str, Any]]` | Manipulates the size and look of the labels. Args following the options in the website: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.text.html Example:{"ExampleVarname":{"fontsize":12,"fontstyle":"italic","fontweight":"black","color":"b"}} Default is None. |
 | | `height` | `float` | Height of the plot in inches. Default is 10. |
 | | `width` | `float` | Width of the plot in inches. Default is 15. Caution: Width too narrow may distort the plot. |
-| |
+| | `alpha` | `float` | Alpha value for the colours in the plot. Float from 0-1. Default is 0.7. |
+| | `min_bar_height` | `float` | Minimal bar height. Bars representing only a tiny fraction of the data may be so narrow that they are invisible in a plot. The default value tries to ensure this does not happen. Default is 0.1. |
 | Other options | `shape` | `str` | Shape of the boxes. "rectangle" (default) or "parallelogram". |
 | | `same_scale` | `List[str]` | List of variables that have the same scale. Default is None. |
 | | `display_figure` | `bool` | Whether or not to display the figure. This can be useful if you just want to save the plots. Default is 'True'. |
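Note on the `color` row: it accepts either a plain color name or a `"color=alpha"` string such as `"red=0.5"`. A minimal sketch of how such a string could be split is shown below. The function `parse_color_spec` is hypothetical and written only for illustration; it is not taken from the package source, and the library's own parsing may differ (for example in how it treats a missing alpha).

```python
def parse_color_spec(spec):
    """Split a "color" or "color=alpha" string into (name, alpha).

    Hypothetical helper, not part of hammock_plot. Returns alpha=None
    when no "=alpha" suffix is given, leaving the choice of a default
    to the caller.
    """
    name, sep, alpha_text = spec.partition("=")
    name = name.strip()
    if not sep:
        return name, None
    alpha = float(alpha_text)
    # The README describes alpha as a decimal between 0 and 1.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError(f"alpha must be between 0 and 1, got {alpha}")
    return name, alpha


specs = ["red", "yellow", "red=0.5"]
parsed = [parse_color_spec(s) for s in specs]
print(parsed)
```

Returning `None` for a missing alpha, rather than substituting a number, keeps the sketch from guessing at the library's behaviour; the table lists a separate `alpha` parameter with default 0.7, which a caller could apply in that case.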