pandas-plots 0.10.1__tar.gz → 0.11.1__tar.gz

@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: pandas-plots
3
- Version: 0.10.1
3
+ Version: 0.11.1
4
4
  Summary: A collection of helper for table handling and vizualization
5
5
  Home-page: https://github.com/smeisegeier/pandas-plots
6
6
  Author: smeisegeier
@@ -29,7 +29,7 @@ Requires-Dist: requests>=2.31.0
29
29
 
30
30
  # pandas-plots
31
31
 
32
- ![PyPI - Version](https://img.shields.io/pypi/v/pandas-plots) ![GitHub last commit](https://img.shields.io/github/last-commit/smeisegeier/pandas-plots?logo=github) ![GitHub License](https://img.shields.io/github/license/smeisegeier/pandas-plots?logo=github) ![py3.10](https://img.shields.io/badge/python-3.10-blue.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIj4KICA8ZGVmcz4KICAgIDxsaW5lYXJHcmFkaWVudCBpZD0icHlZZWxsb3ciIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2ZlNSIgb2Zmc2V0PSIwLjYiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2RhMSIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogICAgPGxpbmVhckdyYWRpZW50IGlkPSJweUJsdWUiIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzY5ZiIgb2Zmc2V0PSIwLjQiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzQ2OCIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogIDwvZGVmcz4KCiAgPHBhdGggZD0iTTI3LDE2YzAtNyw5LTEzLDI0LTEzYzE1LDAsMjMsNiwyMywxM2wwLDIyYzAsNy01LDEyLTExLDEybC0yNCwwYy04LDAtMTQsNi0xNCwxNWwwLDEwbC05LDBjLTgsMC0xMy05LTEzLTI0YzAtMTQsNS0yMywxMy0yM2wzNSwwbDAtM2wtMjQsMGwwLTlsMCwweiBNODgsNTB2MSIgZmlsbD0idXJsKCNweUJsdWUpIi8+CiAgPHBhdGggZD0iTTc0LDg3YzAsNy04LDEzLTIzLDEzYy0xNSwwLTI0LTYtMjQtMTNsMC0yMmMwLTcsNi0xMiwxMi0xMmwyNCwwYzgsMCwxNC03LDE0LTE1bDAtMTBsOSwwYzcsMCwxMyw5LDEzLDIzYzAsMTUtNiwyNC0xMywyNGwtMzUsMGwwLDNsMjMsMGwwLDlsMCwweiBNMTQwLDUwdjEiIGZpbGw9InVybCgjcHlZZWxsb3cpIi8+CgogIDxjaXJjbGUgcj0iNCIgY3g9IjY0IiBjeT0iODgiIGZpbGw9IiNGRkYiLz4KICA8Y2lyY2xlIHI9IjQiIGN4PSIzNyIgY3k9IjE1IiBmaWxsPSIjRkZGIi8+Cjwvc3ZnPgo=)
32
+ ![PyPI - Version](https://img.shields.io/pypi/v/pandas-plots) ![GitHub last commit](https://img.shields.io/github/last-commit/smeisegeier/pandas-plots?logo=github) ![GitHub License](https://img.shields.io/github/license/smeisegeier/pandas-plots?logo=github) ![py3.10](https://img.shields.io/badge/python-3.10_|_3.11_|_3.12-blue.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIj4KICA8ZGVmcz4KICAgIDxsaW5lYXJHcmFkaWVudCBpZD0icHlZZWxsb3ciIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2ZlNSIgb2Zmc2V0PSIwLjYiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2RhMSIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogICAgPGxpbmVhckdyYWRpZW50IGlkPSJweUJsdWUiIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzY5ZiIgb2Zmc2V0PSIwLjQiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzQ2OCIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogIDwvZGVmcz4KCiAgPHBhdGggZD0iTTI3LDE2YzAtNyw5LTEzLDI0LTEzYzE1LDAsMjMsNiwyMywxM2wwLDIyYzAsNy01LDEyLTExLDEybC0yNCwwYy04LDAtMTQsNi0xNCwxNWwwLDEwbC05LDBjLTgsMC0xMy05LTEzLTI0YzAtMTQsNS0yMywxMy0yM2wzNSwwbDAtM2wtMjQsMGwwLTlsMCwweiBNODgsNTB2MSIgZmlsbD0idXJsKCNweUJsdWUpIi8+CiAgPHBhdGggZD0iTTc0LDg3YzAsNy04LDEzLTIzLDEzYy0xNSwwLTI0LTYtMjQtMTNsMC0yMmMwLTcsNi0xMiwxMi0xMmwyNCwwYzgsMCwxNC03LDE0LTE1bDAtMTBsOSwwYzcsMCwxMyw5LDEzLDIzYzAsMTUtNiwyNC0xMywyNGwtMzUsMGwwLDNsMjMsMGwwLDlsMCwweiBNMTQwLDUwdjEiIGZpbGw9InVybCgjcHlZZWxsb3cpIi8+CgogIDxjaXJjbGUgcj0iNCIgY3g9IjY0IiBjeT0iODgiIGZpbGw9IiNGRkYiLz4KICA8Y2lyY2xlIHI9IjQiIGN4PSIzNyIgY3k9IjE1IiBmaWxsPSIjRkZGIi8+Cjwvc3ZnPgo=)
33
33
 
34
34
  ## usage
35
35
 
@@ -98,8 +98,9 @@ tbl.show_num_df(
98
98
  - `mean_confidence_interval()` calculates mean and confidence interval for a series
99
99
  - `wrap_text()` formats strings or lists to a given width to fit nicely on the screen
100
100
  - `replace_delimiter_outside_quotes()` when manual import of csv files is needed: replaces delimiters only outside of quotes
101
- - 🆕 `create_barcode_from_url()` creates a barcode from a given URL
102
- - 🆕 `add_datetime_col()` adds a datetime columns to a dataframe
101
+ - `create_barcode_from_url()` creates a barcode from a given URL
102
+ - `add_datetime_col()` adds a datetime columns to a dataframe
103
+ - 🆕 `show_package_version` prints version of a list of packages
103
104
 
104
105
  > note: theme setting can be controlled through all functions by setting the environment variable `THEME` to either light or dark
105
106
 
@@ -154,3 +155,7 @@ _df, _details = ven.show_venn3(
154
155
  ```
155
156
 
156
157
  ![venn](https://github.com/smeisegeier/pandas-plots/blob/main/img/2024-02-19-20-49-52.png?raw=true)
158
+
159
+ ## tags
160
+
161
+ #pandas, #plotly, #visualizations, #statistics
@@ -1,6 +1,6 @@
1
1
  # pandas-plots
2
2
 
3
- ![PyPI - Version](https://img.shields.io/pypi/v/pandas-plots) ![GitHub last commit](https://img.shields.io/github/last-commit/smeisegeier/pandas-plots?logo=github) ![GitHub License](https://img.shields.io/github/license/smeisegeier/pandas-plots?logo=github) ![py3.10](https://img.shields.io/badge/python-3.10-blue.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIj4KICA8ZGVmcz4KICAgIDxsaW5lYXJHcmFkaWVudCBpZD0icHlZZWxsb3ciIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2ZlNSIgb2Zmc2V0PSIwLjYiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2RhMSIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogICAgPGxpbmVhckdyYWRpZW50IGlkPSJweUJsdWUiIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzY5ZiIgb2Zmc2V0PSIwLjQiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzQ2OCIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogIDwvZGVmcz4KCiAgPHBhdGggZD0iTTI3LDE2YzAtNyw5LTEzLDI0LTEzYzE1LDAsMjMsNiwyMywxM2wwLDIyYzAsNy01LDEyLTExLDEybC0yNCwwYy04LDAtMTQsNi0xNCwxNWwwLDEwbC05LDBjLTgsMC0xMy05LTEzLTI0YzAtMTQsNS0yMywxMy0yM2wzNSwwbDAtM2wtMjQsMGwwLTlsMCwweiBNODgsNTB2MSIgZmlsbD0idXJsKCNweUJsdWUpIi8+CiAgPHBhdGggZD0iTTc0LDg3YzAsNy04LDEzLTIzLDEzYy0xNSwwLTI0LTYtMjQtMTNsMC0yMmMwLTcsNi0xMiwxMi0xMmwyNCwwYzgsMCwxNC03LDE0LTE1bDAtMTBsOSwwYzcsMCwxMyw5LDEzLDIzYzAsMTUtNiwyNC0xMywyNGwtMzUsMGwwLDNsMjMsMGwwLDlsMCwweiBNMTQwLDUwdjEiIGZpbGw9InVybCgjcHlZZWxsb3cpIi8+CgogIDxjaXJjbGUgcj0iNCIgY3g9IjY0IiBjeT0iODgiIGZpbGw9IiNGRkYiLz4KICA8Y2lyY2xlIHI9IjQiIGN4PSIzNyIgY3k9IjE1IiBmaWxsPSIjRkZGIi8+Cjwvc3ZnPgo=)
3
+ ![PyPI - Version](https://img.shields.io/pypi/v/pandas-plots) ![GitHub last commit](https://img.shields.io/github/last-commit/smeisegeier/pandas-plots?logo=github) ![GitHub License](https://img.shields.io/github/license/smeisegeier/pandas-plots?logo=github) ![py3.10](https://img.shields.io/badge/python-3.10_|_3.11_|_3.12-blue.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxMDAgMTAwIj4KICA8ZGVmcz4KICAgIDxsaW5lYXJHcmFkaWVudCBpZD0icHlZZWxsb3ciIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2ZlNSIgb2Zmc2V0PSIwLjYiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iI2RhMSIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogICAgPGxpbmVhckdyYWRpZW50IGlkPSJweUJsdWUiIGdyYWRpZW50VHJhbnNmb3JtPSJyb3RhdGUoNDUpIj4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzY5ZiIgb2Zmc2V0PSIwLjQiLz4KICAgICAgPHN0b3Agc3RvcC1jb2xvcj0iIzQ2OCIgb2Zmc2V0PSIxIi8+CiAgICA8L2xpbmVhckdyYWRpZW50PgogIDwvZGVmcz4KCiAgPHBhdGggZD0iTTI3LDE2YzAtNyw5LTEzLDI0LTEzYzE1LDAsMjMsNiwyMywxM2wwLDIyYzAsNy01LDEyLTExLDEybC0yNCwwYy04LDAtMTQsNi0xNCwxNWwwLDEwbC05LDBjLTgsMC0xMy05LTEzLTI0YzAtMTQsNS0yMywxMy0yM2wzNSwwbDAtM2wtMjQsMGwwLTlsMCwweiBNODgsNTB2MSIgZmlsbD0idXJsKCNweUJsdWUpIi8+CiAgPHBhdGggZD0iTTc0LDg3YzAsNy04LDEzLTIzLDEzYy0xNSwwLTI0LTYtMjQtMTNsMC0yMmMwLTcsNi0xMiwxMi0xMmwyNCwwYzgsMCwxNC03LDE0LTE1bDAtMTBsOSwwYzcsMCwxMyw5LDEzLDIzYzAsMTUtNiwyNC0xMywyNGwtMzUsMGwwLDNsMjMsMGwwLDlsMCwweiBNMTQwLDUwdjEiIGZpbGw9InVybCgjcHlZZWxsb3cpIi8+CgogIDxjaXJjbGUgcj0iNCIgY3g9IjY0IiBjeT0iODgiIGZpbGw9IiNGRkYiLz4KICA8Y2lyY2xlIHI9IjQiIGN4PSIzNyIgY3k9IjE1IiBmaWxsPSIjRkZGIi8+Cjwvc3ZnPgo=)
4
4
 
5
5
  ## usage
6
6
 
@@ -69,8 +69,9 @@ tbl.show_num_df(
69
69
  - `mean_confidence_interval()` calculates mean and confidence interval for a series
70
70
  - `wrap_text()` formats strings or lists to a given width to fit nicely on the screen
71
71
  - `replace_delimiter_outside_quotes()` when manual import of csv files is needed: replaces delimiters only outside of quotes
72
- - 🆕 `create_barcode_from_url()` creates a barcode from a given URL
73
- - 🆕 `add_datetime_col()` adds a datetime columns to a dataframe
72
+ - `create_barcode_from_url()` creates a barcode from a given URL
73
+ - `add_datetime_col()` adds a datetime columns to a dataframe
74
+ - 🆕 `show_package_version` prints version of a list of packages
74
75
 
75
76
  > note: theme setting can be controlled through all functions by setting the environment variable `THEME` to either light or dark
76
77
 
@@ -125,3 +126,7 @@ _df, _details = ven.show_venn3(
125
126
  ```
126
127
 
127
128
  ![venn](https://github.com/smeisegeier/pandas-plots/blob/main/img/2024-02-19-20-49-52.png?raw=true)
129
+
130
+ ## tags
131
+
132
+ #pandas, #plotly, #visualizations, #statistics
@@ -1,6 +1,6 @@
1
1
  [metadata]
2
2
  name = pandas-plots
3
- version = 0.10.1
3
+ version = 0.11.1
4
4
  author = smeisegeier
5
5
  author_email = dexterDSDo@googlemail.com
6
6
  description = A collection of helper for table handling and vizualization
@@ -1,6 +1,7 @@
1
1
  import pandas as pd
2
2
  import numpy as np
3
3
  import scipy.stats
4
+ import importlib.metadata as md
4
5
 
5
6
  from io import BytesIO
6
7
  from matplotlib import pyplot as plt
@@ -8,7 +9,7 @@ from PIL import Image
8
9
  import requests
9
10
  import re
10
11
 
11
- from tenacity import retry
12
+ # from devtools import debug
12
13
 
13
14
  URL_REGEX = r"^(?:http|ftp)s?://" # https://stackoverflow.com/a/1617386
14
15
 
@@ -125,53 +126,67 @@ def replace_delimiter_outside_quotes(
125
126
 
126
127
 
127
128
  def wrap_text(
128
- text: str | list, max_items_in_line: int = 70, sep: bool = True, apo: bool = False
129
+ text: str | list,
130
+ max_items_in_line: int = 70,
131
+ use_sep: bool = True,
132
+ use_apo: bool = False,
129
133
  ):
130
134
  """
131
135
  A function that wraps text into lines with a maximum number of items per line.
136
+ Important: enclose this function in a print() statement to print the text
132
137
 
133
138
  Args:
134
139
  text (str | list): The input text or list of words to be wrapped.
135
140
  max_items_in_line (int): The maximum number of items allowed in each line.
136
- sep (bool, optional): Whether to include a comma separator between items. Defaults to True.
137
- apo (bool, optional): Whether to enclose each word in single quotes. Defaults to False.
141
+ use_sep (bool, optional): When list: Whether to include a comma separator between items. Defaults to True.
142
+ use_apo (bool, optional): When list: Whether to enclose each word in single quotes. Defaults to False.
143
+ Returns: the wrapped text
138
144
  """
139
145
 
140
- # * check if text is string, then strip and build word list
146
+ # * check if text is string
141
147
  is_text = isinstance(text, str)
142
148
  if is_text:
149
+ # ! when splitting the text later by blanks, newlines are not correctly handled
150
+ # * to detect them, they must be followed by a blank:
151
+ pattern = r'(\n)(?=\S)' # * lookahead: newline directly followed by a non-blank character
152
+ # * add blank after these newlines
153
+ new_text = re.sub(pattern, r"\1 ", text)
154
+ text=new_text
155
+
156
+ # * then strip and build word list
143
157
  text = (
144
158
  text.replace(",", "")
145
159
  .replace("'", "")
146
160
  .replace("[", "")
147
161
  .replace("]", "")
162
+ # * use explicit blanks to prevent newline split
148
163
  .split(" ")
149
164
  )
150
165
 
151
- # * start
166
+ # * loop setup
152
167
  i = 0
153
168
  line = ""
154
-
155
169
  # * loop through words
156
170
  out = ""
157
171
  for word in text:
158
- apo_s = "'" if apo else ""
159
- sep_s = "," if sep and not is_text else ""
172
+ apo_s = "'" if use_apo and not is_text else ""
173
+ sep_s = "," if use_sep and not is_text else ""
160
174
  word_s = f"{apo_s}{str(word)}{apo_s}{sep_s}"
161
175
  # * inc counter
162
176
  i = i + len(word_s)
163
177
  # * construct print line
164
178
  line = line + word_s + " "
165
- # * reset if counter exceeds limit
166
- if i >= max_items_in_line:
179
+ # * reset if counter exceeds limit, or if word ends with newline
180
+ if i >= max_items_in_line or str(word).endswith("\n"):
167
181
  out = out + line + "\n"
168
182
  line = ""
169
183
  i = 0
170
184
  # else:
171
- # * on short lists no reset happens, trigger manually
172
- out = line if not out else out
173
- # * cut last newline
174
- return f"[{out[:-1]}]"
185
+ # * on short lists no line reset happens, so just print the line
186
+ # * else add last line
187
+ out = line if not out else out + line
188
+ # * cut off last newline
189
+ return f"[{out[:-1].strip()}]"
175
190
 
176
191
 
177
192
  def create_barcode_from_url(
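Note on the `wrap_text` hunk above: the new regex pads every newline that is directly followed by a non-blank character, so the later `split(" ")` keeps the newline attached to the preceding word and the loop can force a line break there. A minimal sketch of just that step, using the same pattern as the diff; `split_keeping_newlines` is a hypothetical helper name and not part of pandas-plots.

```python
import re

# Same pattern as in the hunk: a newline followed by a non-whitespace character.
# The lookahead (?=\S) checks the next character without consuming it.
PATTERN = r"(\n)(?=\S)"


def split_keeping_newlines(text: str) -> list[str]:
    """Hypothetical helper: pad such newlines with a blank, then split on blanks."""
    padded = re.sub(PATTERN, r"\1 ", text)
    # Splitting on an explicit blank (not on any whitespace) keeps "\n" in the words.
    return padded.split(" ")


words = split_keeping_newlines("first line\nsecond line")
print(words)
```

A word that still ends with `"\n"` is what the new `str(word).endswith("\n")` condition in the loop reacts to.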
@@ -211,21 +226,24 @@ def create_barcode_from_url(
211
226
  # plt.axis('off') # Turn off axis numbers
212
227
  plt.show()
213
228
 
229
+
214
230
  def add_datetime_columns(df: pd.DataFrame, date_column: str = None) -> pd.DataFrame:
215
- df_= df.copy()
231
+ df_ = df.copy()
216
232
  if not date_column:
217
- date_column = [col for col in df_.columns if pd.api.types.is_datetime64_any_dtype(df_[col])][0]
233
+ date_column = [
234
+ col for col in df_.columns if pd.api.types.is_datetime64_any_dtype(df_[col])
235
+ ][0]
218
236
  else:
219
237
  df_[date_column] = pd.to_datetime(df_[date_column])
220
238
 
221
239
  if not date_column or not pd.api.types.is_datetime64_any_dtype(df_[date_column]):
222
240
  print("❌ No datetime column found")
223
241
  return
224
-
242
+
225
243
  if [col for col in df_.columns if "YYYY-WW" in col]:
226
244
  print("❌ Added datetime columns already exist")
227
245
  return
228
-
246
+
229
247
  print(f"⏳ Adding datetime columns basing off of: {date_column}")
230
248
 
231
249
  df_["YYYY"] = df_[date_column].dt.year
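Note on `add_datetime_columns`: the column detection and the ISO-week label can be tried in isolation. The sketch below reuses the expressions from the diff on a made-up two-row frame; the column name `ts` and the sample dates are assumptions for illustration only.

```python
import pandas as pd

# Made-up sample data; "ts" is an illustrative column name.
df = pd.DataFrame({"ts": pd.to_datetime(["2024-01-01", "2024-12-30"]), "n": [1, 2]})

# Pick the first datetime-typed column, as the function does when no name is given.
date_column = [
    col for col in df.columns if pd.api.types.is_datetime64_any_dtype(df[col])
][0]

# ISO calendar year and week, zero-padded to two digits.
iso = df[date_column].dt.isocalendar()
df["YYYY-WW"] = iso.year.astype(str) + "-W" + iso.week.astype(str).str.zfill(2)

print(df["YYYY-WW"].tolist())
```

2024-12-30 already belongs to ISO week 1 of 2025, which is why the label is built from the ISO year rather than from `dt.year`.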
@@ -235,9 +253,33 @@ def add_datetime_columns(df: pd.DataFrame, date_column: str = None) -> pd.DataFr
235
253
  df_["YYYY-MM"] = df_[date_column].dt.to_period("M").astype(str)
236
254
  df_["YYYYQ"] = df_[date_column].dt.to_period("Q").astype(str)
237
255
  df_["YYYY-WW"] = (
238
- df_[date_column].dt.isocalendar().year.astype(str) + "-W" +
239
- df_[date_column].dt.isocalendar().week.astype(str).str.zfill(2)
256
+ df_[date_column].dt.isocalendar().year.astype(str)
257
+ + "-W"
258
+ + df_[date_column].dt.isocalendar().week.astype(str).str.zfill(2)
240
259
  )
241
- df_["DDD"] = df_[date_column].dt.weekday.map({0: "Mon", 1: "Tue", 2: "Wed", 3: "Thu", 4: "Fri", 5: "Sat", 6: "Sun"})
242
-
260
+ df_["DDD"] = df_[date_column].dt.weekday.map(
261
+ {0: "Mon", 1: "Tue", 2: "Wed", 3: "Thu", 4: "Fri", 5: "Sat", 6: "Sun"}
262
+ )
263
+
243
264
  return df_
265
+
266
+
267
+ def show_package_version(packages: list[str] = ["pandas","numpy","duckdb","pandas-plots", "connection_helper"], sep: str = " | ") -> None:
268
+     """
269
+     Display the versions of the specified packages.
270
+
271
+     Parameters:
272
+         packages (list[str], optional): A list of package names. Defaults to ["pandas","numpy","duckdb","pandas-plots", "connection_helper"].
273
+         sep (str, optional): The separator to use when joining the package names and versions. Defaults to " | ".
274
+
275
+     Returns:
276
+         None
277
+     """
278
+     items = []
279
+     for item in packages:
280
+         try:
281
+             version = md.version(item)
282
+             items.append(f"📦 {item}: {version}")
283
+         except md.PackageNotFoundError:
284
+             items.append(f"❌ {item}: Package not found")
285
+     print(sep.join(items))
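Note on `show_package_version`: the lookup relies on `importlib.metadata.version()`, which raises `PackageNotFoundError` for distributions that are not installed. The sketch below is a variation, not the released function: `describe_packages` is a hypothetical name, it returns the joined string instead of printing it, and the package names passed in are arbitrary examples.

```python
import importlib.metadata as md


def describe_packages(packages: list[str], sep: str = " | ") -> str:
    """Hypothetical variant: build the version line and return it."""
    items = []
    for name in packages:
        try:
            items.append(f"{name}: {md.version(name)}")
        except md.PackageNotFoundError:
            # Not installed in the current environment.
            items.append(f"{name}: not found")
    return sep.join(items)


line = describe_packages(["pip", "surely-not-installed-pkg-xyz"])
print(line)
```

Returning the string makes the helper easier to test. The released function also uses a list as a default argument; that is safe as long as the list is only read, which is the case in the hunk above.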
@@ -125,7 +125,7 @@ def plot_stacked_bars(
125
125
  Parameters:
126
126
  - df: pd.DataFrame - The DataFrame containing the data to plot.
127
127
  - top_n_index: int = 0 - The number of top indexes to include in the plot.
128
- - top_n_index: int = 0 - The number of top colors to include in the plot. WARNING: this forces distribution to 100% on a subset
128
+ - top_n_color: int = 0 - The number of top colors to include in the plot. WARNING: this forces distribution to 100% on a subset
129
129
  - dropna: bool = False - Whether to include NULL values in the plot.
130
130
  - swap: bool = False - Whether to swap the x-axis and y-axis.
131
131
  - normalize: bool = False - Whether to normalize the values.
@@ -1,4 +1,5 @@
1
1
  import warnings
2
+
2
3
  warnings.filterwarnings("ignore")
3
4
 
4
5
  import math
@@ -14,14 +15,18 @@ from plotly.subplots import make_subplots
14
15
  from scipy import stats
15
16
 
16
17
  from .hlp import wrap_text
18
+
17
19
  # from devtools import debug
20
+
18
21
  pd.options.display.colheader_justify = "right"
19
22
  # pd.options.mode.chained_assignment = None
20
23
 
21
24
  TOTAL_LITERAL = Literal[
22
25
  "sum", "mean", "median", "min", "max", "std", "var", "skew", "kurt"
23
26
  ]
24
- KPI_LITERAL = Literal["rag_abs","rag_rel", "min_max_xy", "max_min_xy", "min_max_x", "max_min_x"]
27
+ KPI_LITERAL = Literal[
28
+ "rag_abs", "rag_rel", "min_max_xy", "max_min_xy", "min_max_x", "max_min_x"
29
+ ]
25
30
 
26
31
 
27
32
  def describe_df(
@@ -108,7 +113,7 @@ def describe_df(
108
113
  is_str = df.loc[:, col].dtype.kind == "O"
109
114
  # * wrap output
110
115
  print(
111
- f"{_h} {wrap_text(_u[:top_n_uniques], max_items_in_line=70, apo=is_str)}"
116
+ f"{_h} {wrap_text(_u[:top_n_uniques], max_items_in_line=70, use_apo=is_str)}"
112
117
  )
113
118
  # print(f"{_h} {_u[:top_n_uniques]}")
114
119
  else:
@@ -130,14 +135,16 @@ def describe_df(
130
135
  # ! *** PLOTS ***
131
136
  if not use_plot:
132
137
  return
133
-
138
+
134
139
  # * reduce column names len if selected
135
140
  if top_n_chars_in_columns > 0:
136
141
  # * minimum 10 chars, or display is cluttered
137
- top_n_chars_in_columns = 10 if top_n_chars_in_columns < 10 else top_n_chars_in_columns
142
+ top_n_chars_in_columns = (
143
+ 10 if top_n_chars_in_columns < 10 else top_n_chars_in_columns
144
+ )
138
145
  col_list = []
139
146
  for i, col in enumerate(df.columns):
140
- col_list.append(col[:top_n_chars_in_columns]+"_"+str(i).zfill(3))
147
+ col_list.append(col[:top_n_chars_in_columns] + "_" + str(i).zfill(3))
141
148
  df.columns = col_list
142
149
 
143
150
  # * respect fig_offset to exclude unwanted plots from maintenance columns
@@ -183,7 +190,7 @@ def describe_df(
183
190
  else s[:top_n_chars_in_index]
184
191
  )
185
192
  x = [_cut(item) for item in x]
186
-
193
+
187
194
  figsub = px.bar(
188
195
  x=x,
189
196
  y=y,
@@ -219,6 +226,7 @@ def pivot_df(
219
226
  kpi_shape: Literal["squad", "circle"] = "squad",
220
227
  ) -> pd.DataFrame:
221
228
  """
229
+ DEPR: This function is deprecated and will be removed in the future.
222
230
  A function to pivot a DataFrame based on specified parameters and return the result as a new DataFrame.
223
231
 
224
232
  Args:
@@ -318,7 +326,7 @@ def pivot_df(
318
326
  heatmap_axis=heatmap_axis,
319
327
  kpi_mode=kpi_mode,
320
328
  kpi_rag_list=kpi_rag_list,
321
- kpi_shape=kpi_shape
329
+ kpi_shape=kpi_shape,
322
330
  )
323
331
 
324
332
 
@@ -359,12 +367,15 @@ def show_num_df(
359
367
  - max_min_x: max value green, min valued red for x axis
360
368
  - kpi_rag_list: a list of floats indicating the thresholds for rag lights. The list should have 2 elements.
361
369
  - kpi_shape: a Literal indicating the shape of the KPIs ["squad", "circle"]
370
+ - show_as_pct: a boolean indicating whether to show value as percentage (only advised on values ~1)
362
371
 
363
372
  The function returns a styled representation of the DataFrame.
364
373
  """
365
374
  # * ensure arguments match parameter definition
366
375
  if any([df[col].dtype.kind not in ["i", "u", "f"] for col in df.columns]) == True:
367
- print(f"❌ table must contain numeric data only. Maybe you forgot to convert this table with pivot or pivot_table first?")
376
+ print(
377
+ f"❌ table must contain numeric data only. Maybe you forgot to convert this table with pivot or pivot_table first?"
378
+ )
368
379
  return
369
380
 
370
381
  if (
@@ -383,16 +394,16 @@ def show_num_df(
383
394
  print(f"❌ kpi_mode '{kpi_mode}' not supported")
384
395
  return
385
396
 
386
- if (kpi_mode and kpi_mode.startswith("rag")) and (not isinstance(kpi_rag_list, abc.Iterable)
387
- or len(kpi_rag_list) != 2
388
- ):
397
+ if (kpi_mode and kpi_mode.startswith("rag")) and (
398
+ not isinstance(kpi_rag_list, abc.Iterable) or len(kpi_rag_list) != 2
399
+ ):
389
400
  print(f"❌ kpi_rag_list must be a list of 2 if kpi_mode is set")
390
401
  return
391
-
402
+
392
403
  if kpi_mode == "rag_rel":
393
404
  # * transform values into percentiles
394
405
  if all(i <= 1 and i >= 0 for i in kpi_rag_list):
395
- kpi_rag_list = [int(i*100) for i in kpi_rag_list]
406
+ kpi_rag_list = [int(i * 100) for i in kpi_rag_list]
396
407
  else:
397
408
  print(f"❌ kpi_list for relative mode must be between 0 and 1")
398
409
  return
@@ -415,17 +426,21 @@ def show_num_df(
415
426
  df_.loc["Total"] = df_.agg(total_mode, axis=0)
416
427
  if total_mode and total_axis in ["y", "xy"]:
417
428
  df_.loc[:, "Total"] = df_.agg(total_mode, axis=1)
418
-
429
+
419
430
  # hack
420
431
  # * column sum values are distorted by totals, these must be rendered out
421
- col_divider = 2 if (total_axis in ["x", "xy"] and pct_axis == "x" and total_mode=="sum") else 1
432
+ col_divider = (
433
+ 2
434
+ if (total_axis in ["x", "xy"] and pct_axis == "x" and total_mode == "sum")
435
+ else 1
436
+ )
422
437
  col_sum = df_.sum() / col_divider
423
-
438
+
424
439
  # * min values are unaffected
425
440
  col_min = df_.min()
426
441
 
427
442
  # * max values are affected by totals, ignore total row if present
428
- last_row = -1 if (total_axis in ["x", "xy"] and total_mode=="sum") else None
443
+ last_row = -1 if (total_axis in ["x", "xy"] and total_mode == "sum") else None
429
444
  col_max = df_[:last_row].max()
430
445
 
431
446
  # * derive style
@@ -449,41 +464,44 @@ def show_num_df(
449
464
  # align="zero",
450
465
  )
451
466
 
452
-
453
467
  def get_kpi(val: float, col: str) -> str:
454
468
  """
455
469
  Function to calculate and return the appropriate icon based on the given value and key performance indicator (KPI) mode.
456
-
470
+
457
471
  Parameters:
458
472
  val (float): The value to be evaluated.
459
473
  col (str): The column associated with the value.
460
-
474
+
461
475
  Returns:
462
476
  str: The appropriate icon based on the value and KPI mode.
463
477
  """
478
+
479
+ # * no icon if no mode. (or Total column, but total index cannot be located)
464
480
  if not kpi_mode:
481
+ # if not kpi_mode or col == "Total":
465
482
  return ""
483
+
484
+
466
485
 
467
486
  dict_icons = {
468
487
  "squad": {
469
- "light":["🟩", "🟨", "🟥", "⬜"],
470
- "dark":["🟩", "🟨", "🟥", "⬛"]
471
- },
488
+ "light": ["🟩", "🟨", "🟥", "⬜"],
489
+ "dark": ["🟩", "🟨", "🟥", "⬛"],
490
+ },
472
491
  "circle": {
473
- "light":["🟢", "🟡", "🔴", "⚪"],
474
- "dark":["🟢", "🟡", "🔴", "⚫"]
475
- },
492
+ "light": ["🟢", "🟡", "🔴", "⚪"],
493
+ "dark": ["🟢", "🟡", "🔴", "⚫"],
494
+ },
476
495
  }
477
496
  icons = dict_icons[kpi_shape][theme]
478
-
479
497
  # * transform values into percentiles if relative mode
480
- kpi_rag_list_= kpi_rag_list
481
- if kpi_mode=="rag_rel":
498
+ kpi_rag_list_ = kpi_rag_list
499
+ if kpi_mode == "rag_rel":
482
500
  # * get both percentile thresholds
483
501
  pcntl_1 = np.percentile(df_orig, kpi_rag_list[0])
484
502
  pcntl_2 = np.percentile(df_orig, kpi_rag_list[1])
485
503
  kpi_rag_list_ = [pcntl_1, pcntl_2]
486
-
504
+
487
505
  # * for rag mode, both rel and abs
488
506
  if kpi_mode.startswith("rag"):
489
507
  # * get fitting icon
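Note on `rag_rel`: the two fractions in `kpi_rag_list` are first scaled to integer percentages with `int(i * 100)` and then turned into value thresholds with `np.percentile`. A minimal sketch of that conversion on made-up data; the sample values and the variable names `values`, `pct` and `thresholds` are assumptions.

```python
import numpy as np

# Made-up sample standing in for the original table values.
values = list(range(1, 11))

# Fractions between 0 and 1, as required for relative mode.
kpi_rag_list = [0.2, 0.8]

# Scale to percentages, as in the hunk above.
pct = [int(i * 100) for i in kpi_rag_list]

# Value thresholds at those percentiles (linear interpolation by default).
thresholds = np.percentile(values, pct)
print(pct, thresholds)

# int() truncates, so a fraction such as 0.29 becomes 28 because of
# floating point rounding; round() would give 29.
print(int(0.29 * 100), round(0.29 * 100))
```

Whether that truncation matters depends on the thresholds passed in; fractions such as 0.2 or 0.8 are unaffected.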
@@ -500,39 +518,26 @@ def show_num_df(
500
518
  else icons[1] if val > kpi_rag_list_[1] else icons[2]
501
519
  )
502
520
  return icon
503
-
521
+
504
522
  # * for min/max mode, get min and max either from table or column
505
523
  # ! care for max values
506
524
  min_ = tbl_min if kpi_mode.endswith("_xy") else col_min[col]
507
525
  max_ = tbl_max if kpi_mode.endswith("_xy") else col_max[col]
508
526
 
509
- # * omit Total column for min/max
510
- if col=="Total":
511
- return ""
512
-
513
527
  # * calculate order of icons
514
- if kpi_mode.startswith( "min_max"):
515
- result= (
516
- icons[0]
517
- if val == min_
518
- else icons[2] if val == max_ else icons[3]
519
- )
528
+ if kpi_mode.startswith("min_max"):
529
+ result = icons[0] if val == min_ else icons[2] if val == max_ else icons[3]
520
530
  elif kpi_mode.startswith("max_min"):
521
- result= (
522
- icons[0]
523
- if val == max_
524
- else icons[2] if val == min_ else icons[3]
525
- )
531
+ result = icons[0] if val == max_ else icons[2] if val == min_ else icons[3]
526
532
  else:
527
- # * no matching mode founf
528
- result=""
529
-
533
+ # * no matching mode found
534
+ result = ""
530
535
  return result
531
536
 
532
537
  # * all cell formatting in one place
533
538
  def format_cell(val, col):
534
539
  """
535
- A function to format a cell value based on the sum and percentage axis.
540
+ A function to format a cell value based on the sum and percentage axis.
536
541
  Parameters:
537
542
  - val: The value of the cell.
538
543
  - col: The column index of the cell.
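Note on the min/max branch of `get_kpi`: the chained conditional expressions pick the first icon for one extreme, the third icon for the other, and the neutral fourth icon for everything else. The sketch below isolates that logic; `min_max_icon` is a hypothetical standalone function, and the icon list is the "squad"/"light" set from the diff.

```python
# Icon order as in the diff: green, yellow, red, neutral.
ICONS = ["🟩", "🟨", "🟥", "⬜"]


def min_max_icon(val: float, min_: float, max_: float, kpi_mode: str) -> str:
    """Hypothetical standalone version of the min/max icon choice."""
    if kpi_mode.startswith("min_max"):
        # The minimum is marked green, the maximum red.
        return ICONS[0] if val == min_ else ICONS[2] if val == max_ else ICONS[3]
    if kpi_mode.startswith("max_min"):
        # The maximum is marked green, the minimum red.
        return ICONS[0] if val == max_ else ICONS[2] if val == min_ else ICONS[3]
    # No matching mode.
    return ""


print([min_max_icon(v, 1, 9, "min_max_x") for v in (1, 5, 9)])
```

Since the `col == "Total"` early return was removed in this release, a Total column is evaluated like any other column when a min/max mode is active.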
@@ -540,8 +545,8 @@ def show_num_df(
540
545
  Returns a formatted string for the cell value.
541
546
  """
542
547
  # * calc sum depending on pct_axis
543
- sum_=tbl_sum if pct_axis=="xy" else col_sum[col] if pct_axis=="x" else val
544
- val_rel= 0 if sum_== 0 else val / sum_
548
+ sum_ = tbl_sum if pct_axis == "xy" else col_sum[col] if pct_axis == "x" else val
549
+ val_rel = 0 if sum_ == 0 else val / sum_
545
550
 
546
551
  # * get kpi icon
547
552
  kpi = get_kpi(val, col=col)
@@ -556,14 +561,11 @@ def show_num_df(
556
561
  if pct_axis:
557
562
  return f'{val:_.{precision}f} <span style="color: {color_pct}">({val_rel:.1%}) {kpi}</span>'
558
563
  if show_as_pct:
559
- return f'{val:.{precision}%} {kpi}'
564
+ return f"{val:.{precision}%} {kpi}"
560
565
  return f"{val:_.{precision}f} {kpi}"
561
566
 
562
- # * formatter is now unified, col wise
563
- formatter = {
564
- col: lambda x, col=col: format_cell(x, col=col)
565
- for col in df_.columns
566
- }
567
+ # * formatter is a dict comprehension, only accepts column names
568
+ formatter = {col: lambda x, col=col: format_cell(x, col=col) for col in df_.columns}
567
569
 
568
570
  # ? pct_axis y is not implemented, needs row wise formatting
569
571
  # row_sums = _df.sum(axis=1) / divider
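Note on the `formatter` comprehension: the `col=col` default argument is what binds each lambda to its own column. Without it, every lambda would read the loop variable after the comprehension has finished and see the last column only. A minimal sketch with a stand-in `format_cell`; the column names and the simplified formatter are assumptions.

```python
columns = ["a", "b"]


def format_cell(val, col):
    """Stand-in for the real cell formatter."""
    return f"{col}:{val}"


# Late binding: every lambda looks up `col` when called, after the loop ended.
late = {col: lambda x: format_cell(x, col=col) for col in columns}

# Early binding through a default argument, as in the diff.
bound = {col: lambda x, col=col: format_cell(x, col=col) for col in columns}

print(late["a"](1), bound["a"](1))
```

`functools.partial(format_cell, col=col)` would be an equivalent way to bind the column, at the cost of one more import.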
@@ -572,6 +574,7 @@ def show_num_df(
572
574
  # }
573
575
 
574
576
  # * apply formatter
577
+ # debug(formatter)
575
578
  out.format(formatter=formatter)
576
579
 
577
580
  # * apply fonts for cells
@@ -231,7 +231,6 @@ def show_venn2(
231
231
  a_label: str,
232
232
  b_set: set,
233
233
  b_label: str,
234
- # theme: Literal["light", "dark"] = "dark",
235
234
  max_set_len: int = 100,
236
235
  max_line_width: int = 120,
237
236
  alpha: float = 0.7,
File without changes