imsciences 0.8__tar.gz → 0.9__tar.gz


This version of imsciences might be problematic.

@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: imsciences
3
- Version: 0.8
3
+ Version: 0.9
4
4
  Summary: IMS Data Processing Package
5
5
  Author: IMS
6
6
  Author-email: cam@im-sciences.com
@@ -36,97 +36,97 @@ The **IMSciences package** is a Python library designed to process incoming data
36
36
 
37
37
  ---
38
38
 
39
- ## Table of Contents
39
+ Table of Contents
40
+ =================
40
41
 
41
- 1. [Data Processing](#data-processing)
42
- 2. [Data Pulling](#data-pulling)
43
- 3. [Installation](#installation)
44
- 4. [Useage](#useage)
45
- 5. [License](#license)
42
+ 1. [Data Processing](#Data-Processing)
43
+ 2. [Data Pulling](#Data-Pulling)
44
+ 3. [Installation](#Installation)
45
+ 4. [Useage](#Useage)
46
+ 5. [License](#License)
46
47
 
47
48
  ---
48
49
 
49
50
  ## Data Processing
50
51
 
51
-
52
- ## 1. `get_wd_levels`
52
+ ## 1. get_wd_levels
53
53
  - **Description**: Get the working directory with the option of moving up parents.
54
54
  - **Usage**: `get_wd_levels(levels)`
55
55
  - **Example**: `get_wd_levels(0)`
56
56
 
57
57
  ---
58
58
 
59
- ## 2. `remove_rows`
59
+ ## 2. remove_rows
60
60
  - **Description**: Removes a specified number of rows from a pandas DataFrame.
61
61
  - **Usage**: `remove_rows(data_frame, num_rows_to_remove)`
62
62
  - **Example**: `remove_rows(df, 2)`
63
63
 
64
64
  ---
65
65
 
66
- ## 3. `aggregate_daily_to_wc_long`
66
+ ## 3. aggregate_daily_to_wc_long
67
67
  - **Description**: Aggregates daily data into weekly data, grouping and summing specified columns, starting on a specified day of the week.
68
68
  - **Usage**: `aggregate_daily_to_wc_long(df, date_column, group_columns, sum_columns, wc, aggregation='sum')`
69
69
  - **Example**: `aggregate_daily_to_wc_long(df, 'date', ['platform'], ['cost', 'impressions', 'clicks'], 'mon', 'average')`
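
The daily-to-weekly aggregation described above can be approximated in plain pandas. This is a minimal sketch, not the package's implementation: the helper name `weekly_sum_by_group` and the `week_commencing` column are hypothetical, and the week is assumed to start on Monday.

```python
import pandas as pd

def weekly_sum_by_group(df, date_column, group_columns, sum_columns):
    """Sum daily rows into weeks commencing Monday, per group (illustrative only)."""
    out = df.copy()
    dates = pd.to_datetime(out[date_column])
    # Step each date back to the Monday of its week (Monday has weekday 0).
    out["week_commencing"] = (
        dates - pd.to_timedelta(dates.dt.weekday, unit="D")
    ).dt.normalize()
    keys = ["week_commencing"] + list(group_columns)
    return out.groupby(keys, as_index=False)[list(sum_columns)].sum()

# Fourteen days of sample data starting on Monday 2024-01-01.
daily = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=14, freq="D"),
    "platform": ["meta"] * 14,
    "cost": [1.0] * 14,
})
weekly = weekly_sum_by_group(daily, "date", ["platform"], ["cost"])
```

With this sample input, `weekly` holds one row per week and platform, with `cost` summed over the seven days in each week.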
70
70
 
71
71
  ---
72
72
 
73
- ## 4. `convert_monthly_to_daily`
73
+ ## 4. convert_monthly_to_daily
74
74
  - **Description**: Converts monthly data in a DataFrame to daily data by expanding and dividing the numeric values.
75
75
  - **Usage**: `convert_monthly_to_daily(df, date_column, divide)`
76
76
  - **Example**: `convert_monthly_to_daily(df, 'date')`
77
77
 
78
78
  ---
79
79
 
80
- ## 5. `plot_two`
80
+ ## 5. plot_two
81
81
  - **Description**: Plots specified columns from two different DataFrames using a shared date column. Useful for comparing data.
82
82
  - **Usage**: `plot_two(df1, col1, df2, col2, date_column, same_axis=True)`
83
83
  - **Example**: `plot_two(df1, 'cost', df2, 'cost', 'obs', True)`
84
84
 
85
85
  ---
86
86
 
87
- ## 6. `remove_nan_rows`
87
+ ## 6. remove_nan_rows
88
88
  - **Description**: Removes rows from a DataFrame where the specified column has NaN values.
89
89
  - **Usage**: `remove_nan_rows(df, col_to_remove_rows)`
90
90
  - **Example**: `remove_nan_rows(df, 'date')`
91
91
 
92
92
  ---
93
93
 
94
- ## 7. `filter_rows`
94
+ ## 7. filter_rows
95
95
  - **Description**: Filters the DataFrame based on whether the values in a specified column are in a provided list.
96
96
  - **Usage**: `filter_rows(df, col_to_filter, list_of_filters)`
97
97
  - **Example**: `filter_rows(df, 'country', ['UK', 'IE'])`
98
98
 
99
99
  ---
100
100
 
101
- ## 8. `plot_one`
101
+ ## 8. plot_one
102
102
  - **Description**: Plots a specified column from a DataFrame.
103
103
  - **Usage**: `plot_one(df1, col1, date_column)`
104
104
  - **Example**: `plot_one(df, 'Spend', 'OBS')`
105
105
 
106
106
  ---
107
107
 
108
- ## 9. `week_of_year_mapping`
108
+ ## 9. week_of_year_mapping
109
109
  - **Description**: Converts a week column in `yyyy-Www` or `yyyy-ww` format to week commencing date.
110
110
  - **Usage**: `week_of_year_mapping(df, week_col, start_day_str)`
111
111
  - **Example**: `week_of_year_mapping(df, 'week', 'mon')`
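
For a single value, the conversion from a `yyyy-Www` string to a week commencing date can be sketched with the standard library alone. The helper `iso_week_to_monday` is hypothetical; it assumes ISO week numbering and a Monday start, and may differ from how the package treats other start days or the `yyyy-ww` format.

```python
from datetime import date

def iso_week_to_monday(week_str):
    """Return the Monday that starts the ISO week given as 'yyyy-Www'."""
    year, week = week_str.split("-W")
    # ISO weekday 1 is Monday; fromisocalendar requires Python 3.8+.
    return date.fromisocalendar(int(year), int(week), 1)

monday = iso_week_to_monday("2022-W20")
```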
112
112
 
113
113
  ---
114
114
 
115
- ## 10. `exclude_rows`
115
+ ## 10. exclude_rows
116
116
  - **Description**: Removes rows from a DataFrame based on whether the values in a specified column are not in a provided list.
117
117
  - **Usage**: `exclude_rows(df, col_to_filter, list_of_filters)`
118
118
  - **Example**: `exclude_rows(df, 'week', ['2022-W20', '2022-W21'])`
119
119
 
120
120
  ---
121
121
 
122
- ## 11. `rename_cols`
122
+ ## 11. rename_cols
123
123
  - **Description**: Renames columns in a pandas DataFrame.
124
124
  - **Usage**: `rename_cols(df, name)`
125
125
  - **Example**: `rename_cols(df, 'ame_facebook')`
126
126
 
127
127
  ---
128
128
 
129
- ## 12. `merge_new_and_old`
129
+ ## 12. merge_new_and_old
130
130
  - **Description**: Creates a new DataFrame with two columns: one for dates and one for merged numeric values.
131
131
  - Merges numeric values from specified columns in the old and new DataFrames based on a given cutoff date.
132
132
  - **Usage**: `merge_new_and_old(old_df, old_col, new_df, new_col, cutoff_date, date_col_name='OBS')`
@@ -134,21 +134,21 @@ The **IMSciences package** is a Python library designed to process incoming data
134
134
 
135
135
  ---
136
136
 
137
- ## 13. `merge_dataframes_on_date`
137
+ ## 13. merge_dataframes_on_date
138
138
  - **Description**: Merge a list of DataFrames on a common column.
139
139
  - **Usage**: `merge_dataframes_on_date(dataframes, common_column='OBS', merge_how='outer')`
140
140
  - **Example**: `merge_dataframes_on_date([df1, df2, df3], common_column='OBS', merge_how='outer')`
141
141
 
142
142
  ---
143
143
 
144
- ## 14. `merge_and_update_dfs`
144
+ ## 14. merge_and_update_dfs
145
145
  - **Description**: Merges two dataframes on a key column, updates the first dataframe's columns with the second's where available, and returns a dataframe sorted by the key column.
146
146
  - **Usage**: `merge_and_update_dfs(df1, df2, key_column)`
147
147
  - **Example**: `merge_and_update_dfs(processed_facebook, finalised_meta, 'OBS')`
148
148
 
149
149
  ---
150
150
 
151
- ## 15. `convert_us_to_uk_dates`
151
+ ## 15. convert_us_to_uk_dates
152
152
  - **Description**: Convert a DataFrame column with mixed date formats to datetime.
153
153
  - **Usage**: `convert_us_to_uk_dates(df, date_col)`
154
154
  - **Example**: `convert_us_to_uk_dates(df, 'date')`
@@ -162,189 +162,189 @@ The **IMSciences package** is a Python library designed to process incoming data
162
162
 
163
163
  ---
164
164
 
165
- ## 17. `pivot_table`
165
+ ## 17. pivot_table
166
166
  - **Description**: Dynamically pivots a DataFrame based on specified columns.
167
167
  - **Usage**: `pivot_table(df, index_col, columns, values_col, filters_dict=None, fill_value=0, aggfunc='sum', margins=False, margins_name='Total', datetime_trans_needed=True, reverse_header_order=False, fill_missing_weekly_dates=False, week_commencing='W-MON')`
168
168
  - **Example**: `pivot_table(df, 'OBS', 'Channel Short Names', 'Value', filters_dict={'Master Include': ' == 1', 'OBS': ' >= datetime(2019,9,9)', 'Metric Short Names': ' == spd'}, fill_value=0, aggfunc='sum', margins=False, margins_name='Total', datetime_trans_needed=True, reverse_header_order=True, fill_missing_weekly_dates=True, week_commencing='W-MON')`
169
169
 
170
170
  ---
171
171
 
172
- ## 18. `apply_lookup_table_for_columns`
172
+ ## 18. apply_lookup_table_for_columns
173
173
  - **Description**: Equivalent of XLOOKUP in Excel. Allows mapping of a dictionary of substrings within a column.
174
174
  - **Usage**: `apply_lookup_table_for_columns(df, col_names, to_find_dict, if_not_in_dict='Other', new_column_name='Mapping')`
175
175
  - **Example**: `apply_lookup_table_for_columns(df, col_names, {'spend': 'spd', 'clicks': 'clk'}, if_not_in_dict='Other', new_column_name='Metrics Short')`
176
176
 
177
177
  ---
178
178
 
179
- ## 19. `aggregate_daily_to_wc_wide`
179
+ ## 19. aggregate_daily_to_wc_wide
180
180
  - **Description**: Aggregates daily data into weekly data, grouping and summing specified columns, starting on a specified day of the week.
181
181
  - **Usage**: `aggregate_daily_to_wc_wide(df, date_column, group_columns, sum_columns, wc, aggregation='sum', include_totals=False)`
182
182
  - **Example**: `aggregate_daily_to_wc_wide(df, 'date', ['platform'], ['cost', 'impressions', 'clicks'], 'mon', 'average', True)`
183
183
 
184
184
  ---
185
185
 
186
- ## 20. `merge_cols_with_seperator`
186
+ ## 20. merge_cols_with_seperator
187
187
  - **Description**: Merges multiple columns in a DataFrame into one column with a separator `_`. Useful for lookup tables.
188
188
  - **Usage**: `merge_cols_with_seperator(df, col_names, seperator='_', output_column_name='Merged', starting_prefix_str=None, ending_prefix_str=None)`
189
189
  - **Example**: `merge_cols_with_seperator(df, ['Campaign', 'Product'], seperator='|', output_column_name='Merged Columns', starting_prefix_str='start_', ending_prefix_str='_end')`
190
190
 
191
191
  ---
192
192
 
193
- ## 21. `check_sum_of_df_cols_are_equal`
193
+ ## 21. check_sum_of_df_cols_are_equal
194
194
  - **Description**: Checks if the sum of two columns in two DataFrames are the same, and provides the sums and differences.
195
195
  - **Usage**: `check_sum_of_df_cols_are_equal(df_1, df_2, cols_1, cols_2)`
196
196
  - **Example**: `check_sum_of_df_cols_are_equal(df_1, df_2, 'Media Cost', 'Spend')`
197
197
 
198
198
  ---
199
199
 
200
- ## 22. `convert_2_df_cols_to_dict`
200
+ ## 22. convert_2_df_cols_to_dict
201
201
  - **Description**: Creates a dictionary using two columns in a DataFrame.
202
202
  - **Usage**: `convert_2_df_cols_to_dict(df, key_col, value_col)`
203
203
  - **Example**: `convert_2_df_cols_to_dict(df, 'Campaign', 'Channel')`
204
204
 
205
205
  ---
206
206
 
207
- ## 23. `create_FY_and_H_columns`
207
+ ## 23. create_FY_and_H_columns
208
208
  - **Description**: Creates financial year, half-year, and financial half-year columns.
209
209
  - **Usage**: `create_FY_and_H_columns(df, index_col, start_date, starting_FY, short_format='No', half_years='No', combined_FY_and_H='No')`
210
210
  - **Example**: `create_FY_and_H_columns(df, 'Week (M-S)', '2022-10-03', 'FY2023', short_format='Yes', half_years='Yes', combined_FY_and_H='Yes')`
211
211
 
212
212
  ---
213
213
 
214
- ## 24. `keyword_lookup_replacement`
214
+ ## 24. keyword_lookup_replacement
215
215
  - **Description**: Updates chosen values in a specified column of the DataFrame based on a lookup dictionary.
216
216
  - **Usage**: `keyword_lookup_replacement(df, col, replacement_rows, cols_to_merge, replacement_lookup_dict, output_column_name='Updated Column')`
217
217
  - **Example**: `keyword_lookup_replacement(df, 'channel', 'Paid Search Generic', ['channel', 'segment', 'product'], qlik_dict_for_channel, output_column_name='Channel New')`
218
218
 
219
219
  ---
220
220
 
221
- ## 25. `create_new_version_of_col_using_LUT`
221
+ ## 25. create_new_version_of_col_using_LUT
222
222
  - **Description**: Creates a new column in a DataFrame by mapping values from an old column using a lookup table.
223
223
  - **Usage**: `create_new_version_of_col_using_LUT(df, keys_col, value_col, dict_for_specific_changes, new_col_name='New Version of Old Col')`
224
224
  - **Example**: `create_new_version_of_col_using_LUT(df, 'Campaign Name', 'Campaign Type', search_campaign_name_retag_lut, 'Campaign Name New')`
225
225
 
226
226
  ---
227
227
 
228
- ## 26. `convert_df_wide_2_long`
228
+ ## 26. convert_df_wide_2_long
229
229
  - **Description**: Converts a DataFrame from wide to long format.
230
230
  - **Usage**: `convert_df_wide_2_long(df, value_cols, variable_col_name='Stacked', value_col_name='Value')`
231
231
  - **Example**: `convert_df_wide_2_long(df, ['Media Cost', 'Impressions', 'Clicks'], variable_col_name='Metric')`
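
A wide-to-long reshape of this kind corresponds closely to `DataFrame.melt` in pandas. The sketch below uses made-up sample data and assumes that any column not listed as a value column (here `OBS`) is kept as an identifier; the package's own handling may differ.

```python
import pandas as pd

# Sample wide table: one row per date, one column per metric.
wide = pd.DataFrame({
    "OBS": ["2024-01-01", "2024-01-08"],
    "Media Cost": [100.0, 120.0],
    "Clicks": [10, 12],
})

# Stack the metric columns into (Metric, Value) pairs, keeping OBS as the identifier.
long = wide.melt(
    id_vars=["OBS"],
    value_vars=["Media Cost", "Clicks"],
    var_name="Metric",
    value_name="Value",
)
```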
232
232
 
233
233
  ---
234
234
 
235
- ## 27. `manually_edit_data`
235
+ ## 27. manually_edit_data
236
236
  - **Description**: Enables manual updates to DataFrame cells by applying filters and editing a column.
237
237
  - **Usage**: `manually_edit_data(df, filters_dict, col_to_change, new_value, change_in_existing_df_col='No', new_col_to_change_name='New', manual_edit_col_name=None, add_notes='No', existing_note_col_name=None, note=None)`
238
238
  - **Example**: `manually_edit_data(df, {'OBS': ' <= datetime(2023,1,23)', 'File_Name': ' == France media'}, 'Master Include', 1, change_in_existing_df_col='Yes', new_col_to_change_name='Master Include', manual_edit_col_name='Manual Changes')`
239
239
 
240
240
  ---
241
241
 
242
- ## 28. `format_numbers_with_commas`
242
+ ## 28. format_numbers_with_commas
243
243
  - **Description**: Formats numeric data into numbers with commas and specified decimal places.
244
244
  - **Usage**: `format_numbers_with_commas(df, decimal_length_chosen=2)`
245
245
  - **Example**: `format_numbers_with_commas(df, 1)`
246
246
 
247
247
  ---
248
248
 
249
- ## 29. `filter_df_on_multiple_conditions`
249
+ ## 29. filter_df_on_multiple_conditions
250
250
  - **Description**: Filters a DataFrame based on multiple conditions from a dictionary.
251
251
  - **Usage**: `filter_df_on_multiple_conditions(df, filters_dict)`
252
252
  - **Example**: `filter_df_on_multiple_conditions(df, {'OBS': ' <= datetime(2023,1,23)', 'File_Name': ' == France media'})`
253
253
 
254
254
  ---
255
255
 
256
- ## 30. `read_and_concatenate_files`
256
+ ## 30. read_and_concatenate_files
257
257
  - **Description**: Reads and concatenates all files of a specified type in a folder.
258
258
  - **Usage**: `read_and_concatenate_files(folder_path, file_type='csv')`
259
259
  - **Example**: `read_and_concatenate_files(folder_path, file_type='csv')`
260
260
 
261
261
  ---
262
262
 
263
- ## 31. `remove_zero_values`
263
+ ## 31. remove_zero_values
264
264
  - **Description**: Removes rows with zero values in a specified column.
265
265
  - **Usage**: `remove_zero_values(data_frame, column_to_filter)`
266
266
  - **Example**: `remove_zero_values(df, 'Funeral_Delivery')`
267
267
 
268
268
  ---
269
269
 
270
- ## 32. `upgrade_outdated_packages`
270
+ ## 32. upgrade_outdated_packages
271
271
  - **Description**: Upgrades all outdated packages in the environment.
272
272
  - **Usage**: `upgrade_outdated_packages()`
273
273
  - **Example**: `upgrade_outdated_packages()`
274
274
 
275
275
  ---
276
276
 
277
- ## 33. `convert_mixed_formats_dates`
277
+ ## 33. convert_mixed_formats_dates
278
278
  - **Description**: Converts a mix of US and UK date formats to datetime.
279
279
  - **Usage**: `convert_mixed_formats_dates(df, date_col)`
280
280
  - **Example**: `convert_mixed_formats_dates(df, 'OBS')`
281
281
 
282
282
  ---
283
283
 
284
- ## 34. `fill_weekly_date_range`
284
+ ## 34. fill_weekly_date_range
285
285
  - **Description**: Fills in missing weeks with zero values.
286
286
  - **Usage**: `fill_weekly_date_range(df, date_column, freq)`
287
287
  - **Example**: `fill_weekly_date_range(df, 'OBS', 'W-MON')`
288
288
 
289
289
  ---
290
290
 
291
- ## 35. `add_prefix_and_suffix`
291
+ ## 35. add_prefix_and_suffix
292
292
  - **Description**: Adds prefixes and/or suffixes to column headers.
293
293
  - **Usage**: `add_prefix_and_suffix(df, prefix='', suffix='', date_col=None)`
294
294
  - **Example**: `add_prefix_and_suffix(df, prefix='media_', suffix='_spd', date_col='obs')`
295
295
 
296
296
  ---
297
297
 
298
- ## 36. `create_dummies`
298
+ ## 36. create_dummies
299
299
  - **Description**: Converts time series into binary indicators based on a threshold.
300
300
  - **Usage**: `create_dummies(df, date_col=None, dummy_threshold=0, add_total_dummy_col='No', total_col_name='total')`
301
301
  - **Example**: `create_dummies(df, date_col='obs', dummy_threshold=100, add_total_dummy_col='Yes', total_col_name='med_total_dum')`
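
Thresholding a time series into binary indicators can be illustrated as follows. This is a sketch under stated assumptions, not the package's code: the sample data is invented, and a value is assumed to count as "on" only when it is strictly greater than the threshold.

```python
import pandas as pd

spend = pd.DataFrame({
    "obs": pd.to_datetime(["2024-01-01", "2024-01-08", "2024-01-15"]),
    "tv": [0, 250, 80],
    "radio": [150, 0, 0],
})

threshold = 100  # assumed comparison: value > threshold maps to 1, otherwise 0
value_cols = [c for c in spend.columns if c != "obs"]

dummies = spend.copy()
dummies[value_cols] = (spend[value_cols] > threshold).astype(int)
```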
302
302
 
303
303
  ---
304
304
 
305
- ## 37. `replace_substrings`
305
+ ## 37. replace_substrings
306
306
  - **Description**: Replaces substrings in a column of strings using a dictionary and can change column values to lowercase.
307
307
  - **Usage**: `replace_substrings(df, column, replacements, to_lower=False, new_column=None)`
308
308
  - **Example**: `replace_substrings(df, 'Influencer Handle', replacement_dict, to_lower=True, new_column='Short Version')`
309
309
 
310
310
  ---
311
311
 
312
- ## 38. `add_total_column`
312
+ ## 38. add_total_column
313
313
  - **Description**: Sums all columns (excluding a specified column) to create a total column.
314
314
  - **Usage**: `add_total_column(df, exclude_col=None, total_col_name='Total')`
315
315
  - **Example**: `add_total_column(df, exclude_col='obs', total_col_name='total_media_spd')`
316
316
 
317
317
  ---
318
318
 
319
- ## 39. `apply_lookup_table_based_on_substring`
319
+ ## 39. apply_lookup_table_based_on_substring
320
320
  - **Description**: Maps substrings in a column to values using a lookup dictionary.
321
321
  - **Usage**: `apply_lookup_table_based_on_substring(df, column_name, category_dict, new_col_name='Category', other_label='Other')`
322
322
  - **Example**: `apply_lookup_table_based_on_substring(df, 'Campaign Name', campaign_dict, new_col_name='Campaign Name Short', other_label='Full Funnel')`
323
323
 
324
324
  ---
325
325
 
326
- ## 40. `compare_overlap`
326
+ ## 40. compare_overlap
327
327
  - **Description**: Compares matching rows and columns in two DataFrames and outputs the differences.
328
328
  - **Usage**: `compare_overlap(df1, df2, date_col)`
329
329
  - **Example**: `compare_overlap(df_1, df_2, 'obs')`
330
330
 
331
331
  ---
332
332
 
333
- ## 41. `week_commencing_2_week_commencing_conversion`
333
+ ## 41. week_commencing_2_week_commencing_conversion
334
334
  - **Description**: Converts a week commencing column to a different start day.
335
335
  - **Usage**: `week_commencing_2_week_commencing_conversion(df, date_col, week_commencing='sun')`
336
336
  - **Example**: `week_commencing_2_week_commencing_conversion(df, 'obs', week_commencing='mon')`
337
337
 
338
338
  ---
339
339
 
340
- ## 42. `plot_chart`
340
+ ## 42. plot_chart
341
341
  - **Description**: Plots various chart types including line, area, scatter, and bar.
342
342
  - **Usage**: `plot_chart(df, date_col, value_cols, chart_type='line', title='Chart', x_title='Date', y_title='Values', **kwargs)`
343
343
  - **Example**: `plot_chart(df, 'obs', df.cols, chart_type='line', title='Spend Over Time', x_title='Date', y_title='Spend')`
344
344
 
345
345
  ---
346
346
 
347
- ## 43. `plot_two_with_common_cols`
347
+ ## 43. plot_two_with_common_cols
348
348
  - **Description**: Plots charts for two DataFrames based on common column names.
349
349
  - **Usage**: `plot_two_with_common_cols(df1, df2, date_column, same_axis=True)`
350
350
  - **Example**: `plot_two_with_common_cols(df_1, df_2, date_column='obs')`
@@ -412,7 +412,7 @@ The **IMSciences package** is a Python library designed to process incoming data
412
412
  Install the IMS package via pip:
413
413
 
414
414
  ```bash
415
- pip install ims-package
415
+ pip install imsciences
416
416
  ```
417
417
 
418
418
  ---