imsciences 0.6.3.2__tar.gz → 0.8.1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {imsciences-0.6.3.2 → imsciences-0.8.1}/PKG-INFO +122 -72
- {imsciences-0.6.3.2 → imsciences-0.8.1}/README.md +119 -70
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences/__init__.py +0 -1
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences/datafunctions.py +384 -236
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences.egg-info/PKG-INFO +122 -72
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences.egg-info/requires.txt +2 -0
- {imsciences-0.6.3.2 → imsciences-0.8.1}/setup.py +2 -3
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences/unittesting.py +0 -0
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences.egg-info/PKG-INFO-IMS-24Ltp-3 +0 -0
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences.egg-info/SOURCES.txt +0 -0
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences.egg-info/dependency_links.txt +0 -0
- {imsciences-0.6.3.2 → imsciences-0.8.1}/imsciences.egg-info/top_level.txt +0 -0
- {imsciences-0.6.3.2 → imsciences-0.8.1}/setup.cfg +0 -0

@@ -1,10 +1,9 @@
 Metadata-Version: 2.1
 Name: imsciences
-Version: 0.6.3.2
+Version: 0.8.1
 Summary: IMS Data Processing Package
 Author: IMS
 Author-email: cam@im-sciences.com
-License: MIT
 Keywords: python,data processing,apis
 Classifier: Development Status :: 3 - Alpha
 Classifier: Intended Audience :: Developers
@@ -21,93 +20,113 @@ Requires-Dist: requests_cache
 Requires-Dist: geopy
 Requires-Dist: plotly
 Requires-Dist: bs4
+Requires-Dist: yfinance
+Requires-Dist: holidays

The remainder of the diff rewrites the package's long description. The updated 0.8.1 README follows.

# IMS Package Documentation

The **IMSciences package** is a Python library designed to process incoming data into a format tailored for econometrics projects, particularly those utilising weekly time series data. This package offers a suite of functions for efficient data manipulation and analysis.

---

## Key Features

- Seamless data processing for econometrics workflows.
- Aggregation, filtering, and transformation of time series data.
- Integration with external data sources such as FRED, the Bank of England, the ONS, and the OECD.

---

Table of Contents
=================

1. `Data Processing <#data-processing>`_
2. `Data Pulling <#data-pulling>`_
3. `Installation <#installation>`_
4. `Usage <#usage>`_
5. `License <#license>`_

---

## Data Processing

## 1. get_wd_levels
- **Description**: Gets the working directory, with the option of moving up a chosen number of parent directories.
- **Usage**: `get_wd_levels(levels)`
- **Example**: `get_wd_levels(0)`

---

## 2. remove_rows
- **Description**: Removes a specified number of rows from a pandas DataFrame.
- **Usage**: `remove_rows(data_frame, num_rows_to_remove)`
- **Example**: `remove_rows(df, 2)`

---

## 3. aggregate_daily_to_wc_long
- **Description**: Aggregates daily data into weekly data, grouping and summing specified columns, starting on a specified day of the week.
- **Usage**: `aggregate_daily_to_wc_long(df, date_column, group_columns, sum_columns, wc, aggregation='sum')`
- **Example**: `aggregate_daily_to_wc_long(df, 'date', ['platform'], ['cost', 'impressions', 'clicks'], 'mon', 'average')`
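
For context, the sketch below shows the underlying week-commencing aggregation in plain pandas, assuming daily rows with `date`, `platform`, and `cost` columns (toy data, not taken from the package); the package call itself is the one documented above.

```python
import pandas as pd

# Toy daily data (hypothetical values).
daily = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=14, freq="D"),
    "platform": ["meta"] * 7 + ["tiktok"] * 7,
    "cost": range(14),
})

# Snap each date back to its Monday week-commencing, then group and sum -
# roughly what a ('mon', 'sum') call is documented to produce.
daily["week_commencing"] = daily["date"] - pd.to_timedelta(daily["date"].dt.weekday, unit="D")
weekly = daily.groupby(["week_commencing", "platform"], as_index=False)["cost"].sum()
print(weekly)
```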

---

## 4. convert_monthly_to_daily
- **Description**: Converts monthly data in a DataFrame to daily data by expanding and dividing the numeric values.
- **Usage**: `convert_monthly_to_daily(df, date_column, divide)`
- **Example**: `convert_monthly_to_daily(df, 'date')`

---

## 5. plot_two
- **Description**: Plots specified columns from two different DataFrames using a shared date column. Useful for comparing data.
- **Usage**: `plot_two(df1, col1, df2, col2, date_column, same_axis=True)`
- **Example**: `plot_two(df1, 'cost', df2, 'cost', 'obs', True)`

---

## 6. remove_nan_rows
- **Description**: Removes rows from a DataFrame where the specified column has NaN values.
- **Usage**: `remove_nan_rows(df, col_to_remove_rows)`
- **Example**: `remove_nan_rows(df, 'date')`

---

## 7. filter_rows
- **Description**: Filters the DataFrame based on whether the values in a specified column are in a provided list.
- **Usage**: `filter_rows(df, col_to_filter, list_of_filters)`
- **Example**: `filter_rows(df, 'country', ['UK', 'IE'])`

---

## 8. plot_one
- **Description**: Plots a specified column from a DataFrame.
- **Usage**: `plot_one(df1, col1, date_column)`
- **Example**: `plot_one(df, 'Spend', 'OBS')`

---

## 9. week_of_year_mapping
- **Description**: Converts a week column in `yyyy-Www` or `yyyy-ww` format to a week-commencing date.
- **Usage**: `week_of_year_mapping(df, week_col, start_day_str)`
- **Example**: `week_of_year_mapping(df, 'week', 'mon')`

---

## 10. exclude_rows
- **Description**: Removes rows from a DataFrame, keeping only those whose values in a specified column are not in a provided list.
- **Usage**: `exclude_rows(df, col_to_filter, list_of_filters)`
- **Example**: `exclude_rows(df, 'week', ['2022-W20', '2022-W21'])`

---

## 11. rename_cols
- **Description**: Renames columns in a pandas DataFrame.
- **Usage**: `rename_cols(df, name)`
- **Example**: `rename_cols(df, 'ame_facebook')`

---

## 12. merge_new_and_old
- **Description**: Creates a new DataFrame with two columns: one for dates and one for merged numeric values.
- Merges numeric values from specified columns in the old and new DataFrames based on a given cutoff date.
- **Usage**: `merge_new_and_old(old_df, old_col, new_df, new_col, cutoff_date, date_col_name='OBS')`
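
As an illustration of the splice described above, here is a minimal plain-pandas sketch that keeps the old series up to a cutoff date and the new series after it; the column names and the inclusive cutoff are assumptions, not taken from the package.

```python
import pandas as pd

old_df = pd.DataFrame({"OBS": pd.date_range("2024-01-01", periods=6, freq="W-MON"),
                       "old_spend": [10, 20, 30, 40, 50, 60]})
new_df = pd.DataFrame({"OBS": pd.date_range("2024-01-01", periods=6, freq="W-MON"),
                       "new_spend": [11, 21, 31, 41, 51, 61]})
cutoff = pd.Timestamp("2024-01-22")

# Old values up to and including the cutoff, new values after it.
old_part = old_df.loc[old_df["OBS"] <= cutoff, ["OBS", "old_spend"]].rename(columns={"old_spend": "value"})
new_part = new_df.loc[new_df["OBS"] > cutoff, ["OBS", "new_spend"]].rename(columns={"new_spend": "value"})
spliced = pd.concat([old_part, new_part]).sort_values("OBS").reset_index(drop=True)
```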

---

## 13. merge_dataframes_on_date
- **Description**: Merges a list of DataFrames on a common column.
- **Usage**: `merge_dataframes_on_date(dataframes, common_column='OBS', merge_how='outer')`
- **Example**: `merge_dataframes_on_date([df1, df2, df3], common_column='OBS', merge_how='outer')`

---

## 14. merge_and_update_dfs
- **Description**: Merges two DataFrames on a key column, updates the first DataFrame's columns with the second's where available, and returns a DataFrame sorted by the key column.
- **Usage**: `merge_and_update_dfs(df1, df2, key_column)`
- **Example**: `merge_and_update_dfs(processed_facebook, finalised_meta, 'OBS')`

---

## 15. convert_us_to_uk_dates
- **Description**: Converts a DataFrame column with mixed date formats to datetime.
- **Usage**: `convert_us_to_uk_dates(df, date_col)`
- **Example**: `convert_us_to_uk_dates(df, 'date')`

---

*(Entry 16 is unchanged in this diff and not shown.)*

## 17. pivot_table
- **Description**: Dynamically pivots a DataFrame based on specified columns.
- **Usage**: `pivot_table(df, index_col, columns, values_col, filters_dict=None, fill_value=0, aggfunc='sum', margins=False, margins_name='Total', datetime_trans_needed=True, reverse_header_order=False, fill_missing_weekly_dates=False, week_commencing='W-MON')`
- **Example**: `pivot_table(df, 'OBS', 'Channel Short Names', 'Value', filters_dict={'Master Include': ' == 1', 'OBS': ' >= datetime(2019,9,9)', 'Metric Short Names': ' == spd'}, fill_value=0, aggfunc='sum', margins=False, margins_name='Total', datetime_trans_needed=True, reverse_header_order=True, fill_missing_weekly_dates=True, week_commencing='W-MON')`
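
A minimal plain-pandas sketch of the core pivot (index, columns, values, aggfunc, fill_value); the package's extra options such as `filters_dict`, `margins`, and `fill_missing_weekly_dates` are not reproduced here, and the toy data is made up.

```python
import pandas as pd

long_df = pd.DataFrame({
    "OBS": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-08", "2024-01-08"]),
    "Channel Short Names": ["search", "social", "search", "social"],
    "Value": [100, 50, 120, 60],
})

# One row per OBS, one column per channel, summed and zero-filled.
wide = pd.pivot_table(long_df, index="OBS", columns="Channel Short Names",
                      values="Value", aggfunc="sum", fill_value=0).reset_index()
```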

---

## 18. apply_lookup_table_for_columns
- **Description**: Excel XLOOKUP equivalent. Maps substrings found within a column to values using a lookup dictionary.
- **Usage**: `apply_lookup_table_for_columns(df, col_names, to_find_dict, if_not_in_dict='Other', new_column_name='Mapping')`
- **Example**: `apply_lookup_table_for_columns(df, col_names, {'spend': 'spd', 'clicks': 'clk'}, if_not_in_dict='Other', new_column_name='Metrics Short')`
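
The sketch below shows the substring-to-value mapping the description implies, written in plain pandas with a hypothetical helper; only the dictionary semantics (`{'spend': 'spd', 'clicks': 'clk'}`, default `'Other'`) are taken from the example above.

```python
import pandas as pd

df = pd.DataFrame({"Metric Names": ["media spend gbp", "link clicks", "video views"]})
to_find_dict = {"spend": "spd", "clicks": "clk"}

def lookup_by_substring(text, mapping, default="Other"):
    # Return the value of the first key that appears as a substring of `text`.
    for key, value in mapping.items():
        if key in text:
            return value
    return default

df["Metrics Short"] = df["Metric Names"].apply(lambda s: lookup_by_substring(s, to_find_dict))
```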

---

## 19. aggregate_daily_to_wc_wide
- **Description**: Aggregates daily data into weekly data, grouping and summing specified columns, starting on a specified day of the week.
- **Usage**: `aggregate_daily_to_wc_wide(df, date_column, group_columns, sum_columns, wc, aggregation='sum', include_totals=False)`
- **Example**: `aggregate_daily_to_wc_wide(df, 'date', ['platform'], ['cost', 'impressions', 'clicks'], 'mon', 'average', True)`

---

## 20. merge_cols_with_seperator
- **Description**: Merges multiple columns in a DataFrame into one column with a separator (default `_`). Useful for lookup tables.
- **Usage**: `merge_cols_with_seperator(df, col_names, seperator='_', output_column_name='Merged', starting_prefix_str=None, ending_prefix_str=None)`
- **Example**: `merge_cols_with_seperator(df, ['Campaign', 'Product'], seperator='|', output_column_name='Merged Columns', starting_prefix_str='start_', ending_prefix_str='_end')`

---

## 21. check_sum_of_df_cols_are_equal
- **Description**: Checks whether the sums of two columns in two DataFrames are equal, and reports the sums and their difference.
- **Usage**: `check_sum_of_df_cols_are_equal(df_1, df_2, cols_1, cols_2)`
- **Example**: `check_sum_of_df_cols_are_equal(df_1, df_2, 'Media Cost', 'Spend')`

---

## 22. convert_2_df_cols_to_dict
- **Description**: Creates a dictionary from two columns of a DataFrame, one as keys and one as values.
- **Usage**: `convert_2_df_cols_to_dict(df, key_col, value_col)`
- **Example**: `convert_2_df_cols_to_dict(df, 'Campaign', 'Channel')`
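
In plain pandas this is a one-liner, sketched below with made-up values; duplicate keys keep the last row's value.

```python
import pandas as pd

df = pd.DataFrame({"Campaign": ["Brand AlwaysOn", "Spring Sale"],
                   "Channel": ["Social", "Search"]})

# Equivalent of the documented call: keys from one column, values from the other.
campaign_to_channel = dict(zip(df["Campaign"], df["Channel"]))
```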

---

## 23. create_FY_and_H_columns
- **Description**: Creates financial year, half-year, and financial half-year columns.
- **Usage**: `create_FY_and_H_columns(df, index_col, start_date, starting_FY, short_format='No', half_years='No', combined_FY_and_H='No')`
- **Example**: `create_FY_and_H_columns(df, 'Week (M-S)', '2022-10-03', 'FY2023', short_format='Yes', half_years='Yes', combined_FY_and_H='Yes')`

---

## 24. keyword_lookup_replacement
- **Description**: Updates chosen values in a specified column of the DataFrame based on a lookup dictionary.
- **Usage**: `keyword_lookup_replacement(df, col, replacement_rows, cols_to_merge, replacement_lookup_dict, output_column_name='Updated Column')`
- **Example**: `keyword_lookup_replacement(df, 'channel', 'Paid Search Generic', ['channel', 'segment', 'product'], qlik_dict_for_channel, output_column_name='Channel New')`

---

## 25. create_new_version_of_col_using_LUT
- **Description**: Creates a new column in a DataFrame by mapping values from an old column using a lookup table.
- **Usage**: `create_new_version_of_col_using_LUT(df, keys_col, value_col, dict_for_specific_changes, new_col_name='New Version of Old Col')`
- **Example**: `create_new_version_of_col_using_LUT(df, 'Campaign Name', 'Campaign Type', search_campaign_name_retag_lut, 'Campaign Name New')`

---

## 26. convert_df_wide_2_long
- **Description**: Converts a DataFrame from wide to long format.
- **Usage**: `convert_df_wide_2_long(df, value_cols, variable_col_name='Stacked', value_col_name='Value')`
- **Example**: `convert_df_wide_2_long(df, ['Media Cost', 'Impressions', 'Clicks'], variable_col_name='Metric')`
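
The underlying reshape is pandas' `melt`; the sketch below mirrors the `value_cols`, `variable_col_name`, and `value_col_name` parameters with toy data.

```python
import pandas as pd

wide = pd.DataFrame({
    "OBS": pd.to_datetime(["2024-01-01", "2024-01-08"]),
    "Media Cost": [100, 120],
    "Impressions": [5000, 6000],
    "Clicks": [40, 55],
})

# Wide-to-long: one row per (OBS, metric) pair.
long_df = wide.melt(id_vars="OBS",
                    value_vars=["Media Cost", "Impressions", "Clicks"],
                    var_name="Metric", value_name="Value")
```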

---

## 27. manually_edit_data
- **Description**: Enables manual updates to DataFrame cells by applying filters and editing a column.
- **Usage**: `manually_edit_data(df, filters_dict, col_to_change, new_value, change_in_existing_df_col='No', new_col_to_change_name='New', manual_edit_col_name=None, add_notes='No', existing_note_col_name=None, note=None)`
- **Example**: `manually_edit_data(df, {'OBS': ' <= datetime(2023,1,23)', 'File_Name': ' == France media'}, 'Master Include', 1, change_in_existing_df_col='Yes', new_col_to_change_name='Master Include', manual_edit_col_name='Manual Changes')`

---

## 28. format_numbers_with_commas
- **Description**: Formats numeric data into numbers with commas and specified decimal places.
- **Usage**: `format_numbers_with_commas(df, decimal_length_chosen=2)`
- **Example**: `format_numbers_with_commas(df, 1)`

---

## 29. filter_df_on_multiple_conditions
- **Description**: Filters a DataFrame based on multiple conditions from a dictionary.
- **Usage**: `filter_df_on_multiple_conditions(df, filters_dict)`
- **Example**: `filter_df_on_multiple_conditions(df, {'OBS': ' <= datetime(2023,1,23)', 'File_Name': ' == France media'})`

---

## 30. read_and_concatenate_files
- **Description**: Reads and concatenates all files of a specified type in a folder.
- **Usage**: `read_and_concatenate_files(folder_path, file_type='csv')`
- **Example**: `read_and_concatenate_files(folder_path, file_type='csv')`

---

## 31. remove_zero_values
- **Description**: Removes rows with zero values in a specified column.
- **Usage**: `remove_zero_values(data_frame, column_to_filter)`
- **Example**: `remove_zero_values(df, 'Funeral_Delivery')`

---

## 32. upgrade_outdated_packages
- **Description**: Upgrades all outdated packages in the environment.
- **Usage**: `upgrade_outdated_packages()`
- **Example**: `upgrade_outdated_packages()`

---

## 33. convert_mixed_formats_dates
- **Description**: Converts a mix of US and UK date formats to datetime.
- **Usage**: `convert_mixed_formats_dates(df, date_col)`
- **Example**: `convert_mixed_formats_dates(df, 'OBS')`

---

## 34. fill_weekly_date_range
- **Description**: Fills in missing weeks with zero values.
- **Usage**: `fill_weekly_date_range(df, date_column, freq)`
- **Example**: `fill_weekly_date_range(df, 'OBS', 'W-MON')`
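
A plain-pandas sketch of the same idea: build the full `W-MON` range between the first and last observation and zero-fill any weeks that are missing (toy data).

```python
import pandas as pd

df = pd.DataFrame({"OBS": pd.to_datetime(["2024-01-01", "2024-01-15"]),  # 2024-01-08 missing
                   "spend": [100, 80]})

full_range = pd.date_range(df["OBS"].min(), df["OBS"].max(), freq="W-MON")
filled = (df.set_index("OBS")
            .reindex(full_range, fill_value=0)
            .rename_axis("OBS")
            .reset_index())
```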

---

## 35. add_prefix_and_suffix
- **Description**: Adds prefixes and/or suffixes to column headers.
- **Usage**: `add_prefix_and_suffix(df, prefix='', suffix='', date_col=None)`
- **Example**: `add_prefix_and_suffix(df, prefix='media_', suffix='_spd', date_col='obs')`

---

## 36. create_dummies
- **Description**: Converts time series into binary indicators based on a threshold.
- **Usage**: `create_dummies(df, date_col=None, dummy_threshold=0, add_total_dummy_col='No', total_col_name='total')`
- **Example**: `create_dummies(df, date_col='obs', dummy_threshold=100, add_total_dummy_col='Yes', total_col_name='med_total_dum')`

---

## 37. replace_substrings
- **Description**: Replaces substrings in a column of strings using a dictionary and can change column values to lowercase.
- **Usage**: `replace_substrings(df, column, replacements, to_lower=False, new_column=None)`
- **Example**: `replace_substrings(df, 'Influencer Handle', replacement_dict, to_lower=True, new_column='Short Version')`

---

## 38. add_total_column
- **Description**: Sums all columns (excluding a specified column) to create a total column.
- **Usage**: `add_total_column(df, exclude_col=None, total_col_name='Total')`
- **Example**: `add_total_column(df, exclude_col='obs', total_col_name='total_media_spd')`

---

## 39. apply_lookup_table_based_on_substring
- **Description**: Maps substrings in a column to values using a lookup dictionary.
- **Usage**: `apply_lookup_table_based_on_substring(df, column_name, category_dict, new_col_name='Category', other_label='Other')`
- **Example**: `apply_lookup_table_based_on_substring(df, 'Campaign Name', campaign_dict, new_col_name='Campaign Name Short', other_label='Full Funnel')`

---

## 40. compare_overlap
- **Description**: Compares matching rows and columns in two DataFrames and outputs the differences.
- **Usage**: `compare_overlap(df1, df2, date_col)`
- **Example**: `compare_overlap(df_1, df_2, 'obs')`

---

## 41. week_commencing_2_week_commencing_conversion
- **Description**: Converts a week-commencing column to a different start day.
- **Usage**: `week_commencing_2_week_commencing_conversion(df, date_col, week_commencing='sun')`
- **Example**: `week_commencing_2_week_commencing_conversion(df, 'obs', week_commencing='mon')`
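
The date arithmetic behind such a conversion looks like the sketch below, which relabels Sunday-commencing weeks to the following Monday; whether the package snaps forward or backward is not documented here, so treat the direction as an assumption.

```python
import pandas as pd

df = pd.DataFrame({"obs": pd.to_datetime(["2023-12-31", "2024-01-07"])})  # Sunday week-commencing labels

target = 0  # pandas weekday numbering: Monday=0 ... Sunday=6
shift = (target - df["obs"].dt.weekday) % 7
df["obs_mon"] = df["obs"] + pd.to_timedelta(shift, unit="D")
```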

---

## 42. plot_chart
- **Description**: Plots various chart types, including line, area, scatter, and bar.
- **Usage**: `plot_chart(df, date_col, value_cols, chart_type='line', title='Chart', x_title='Date', y_title='Values', **kwargs)`
- **Example**: `plot_chart(df, 'obs', df.cols, chart_type='line', title='Spend Over Time', x_title='Date', y_title='Spend')`

---

## 43. plot_two_with_common_cols
- **Description**: Plots charts for two DataFrames based on common column names.
- **Usage**: `plot_two_with_common_cols(df1, df2, date_column, same_axis=True)`
- **Example**: `plot_two_with_common_cols(df_1, df_2, date_column='obs')`

---

## Data Pulling

## 1. pull_fred_data
- **Description**: Fetches data from FRED using series ID tokens.
- **Usage**: `pull_fred_data(week_commencing, series_id_list)`
- **Example**: `pull_fred_data('mon', ['GPDIC1', 'Y057RX1Q020SBEA', 'GCEC1', 'ND000333Q', 'Y006RX1Q020SBEA'])`

---

## 2. pull_boe_data
- **Description**: Fetches and processes Bank of England interest rate data.
- **Usage**: `pull_boe_data(week_commencing)`
- **Example**: `pull_boe_data('mon')`
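
These pulls are exposed through the `datapull` class shown in the Usage section below; a minimal sketch follows (network access required, and the method-on-instance calling pattern is assumed from that section).

```python
from imsciences import *

pull = datapull()

# Signatures as documented above; series IDs are the FRED codes from the example.
fred_df = pull.pull_fred_data("mon", ["GPDIC1", "Y057RX1Q020SBEA"])
boe_df = pull.pull_boe_data("mon")
```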

---

## 3. pull_oecd
- **Description**: Fetches macroeconomic data from the OECD for a specified country.
- **Usage**: `pull_oecd(country='GBR', week_commencing='mon', start_date='2020-01-01')`
- **Example**: `pull_oecd('GBR', 'mon', '2000-01-01')`

---

## 4. get_google_mobility_data
- **Description**: Fetches Google Mobility data for the specified country.
- **Usage**: `get_google_mobility_data(country, wc)`
- **Example**: `get_google_mobility_data('United Kingdom', 'mon')`

---

## 5. pull_seasonality
- **Description**: Generates combined dummy variables for seasonality, trends, and COVID lockdowns.
- **Usage**: `pull_seasonality(week_commencing, start_date, countries)`
- **Example**: `pull_seasonality('mon', '2020-01-01', ['US', 'GB'])`

---

## 6. pull_weather
- **Description**: Fetches and processes historical weather data for the specified country.
- **Usage**: `pull_weather(week_commencing, country)`
- **Example**: `pull_weather('mon', 'GBR')`

---

## 7. pull_macro_ons_uk
- **Description**: Fetches and processes time series data from the Beta ONS API.
- **Usage**: `pull_macro_ons_uk(additional_list, week_commencing, sector)`
- **Example**: `pull_macro_ons_uk(['HBOI'], 'mon', 'fast_food')`

---

## 8. pull_yfinance
- **Description**: Fetches and processes time series data from Yahoo Finance.
- **Usage**: `pull_yfinance(tickers, week_start_day)`
- **Example**: `pull_yfinance(['^FTMC', '^IXIC'], 'mon')`

## Installation

Install the IMS package via pip:

```bash
pip install imsciences
```

---

## Usage

```python
from imsciences import *

ims = dataprocessing()
ims_pull = datapull()
```
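
As a rough end-to-end sketch built only from the calls documented above (assuming the pulled frames share an `OBS` week-commencing column, which this README does not guarantee):

```python
from imsciences import *

ims = dataprocessing()
ims_pull = datapull()

# Pull seasonality dummies and a FRED macro series, then merge them on the date column.
seasonality = ims_pull.pull_seasonality("mon", "2020-01-01", ["GB"])
macro = ims_pull.pull_fred_data("mon", ["GPDIC1"])
combined = ims.merge_dataframes_on_date([seasonality, macro], common_column="OBS", merge_how="outer")
```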

---

## License

This project is licensed under the MIT License.

---