imsciences 0.6.1.9__tar.gz → 0.6.2.1__tar.gz

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: imsciences
-Version: 0.6.1.9
+Version: 0.6.2.1
 Summary: IMS Data Processing Package
 Author: IMS
 Author-email: cam@im-sciences.com
@@ -1431,40 +1431,37 @@ class dataprocessing:
 
         return df
 
-    def apply_lookup_table_based_on_substring(df, column_name, category_dict, new_col_name='Category', other_label='Other'):
+    def apply_lookup_table_based_on_substring(self, df, column_name, category_dict, new_col_name='Category', other_label='Other'):
         """
         Categorizes text in a specified DataFrame column by applying a lookup table based on substrings.
 
-        Parameters:
-        - df (pd.DataFrame): The DataFrame containing the column to categorize.
-        - column_name (str): The name of the column in the DataFrame that contains the text data to categorize.
-        - category_dict (dict): A dictionary where keys are substrings to search for in the text and values are
-          the categories to assign when a substring is found.
-        - new_col_name (str): The name of the new column to be created in the DataFrame, which will hold the
-          resulting categories. Default is 'Category'.
+        Args:
+            df (pd.DataFrame): The DataFrame containing the column to categorize.
+            column_name (str): The name of the column in the DataFrame that contains the text data to categorize.
+            category_dict (dict): A dictionary where keys are substrings to search for in the text and values are the categories to assign when a substring is found.
+            new_col_name (str, optional): The name of the new column to be created in the DataFrame, which will hold the resulting categories. Default is 'Category'.
+            other_label (str, optional): The name given to category if no substring from the dictionary is found in the cell
 
         Returns:
-        - pd.DataFrame: The original DataFrame with an additional column containing the assigned categories.
+            pd.DataFrame: The original DataFrame with an additional column containing the assigned categories.
         """
 
-        def categorize_text(text, category_dict):
+        def categorize_text(text):
             """
             Assigns a category to a single text string based on the presence of substrings from a dictionary.
 
-            Parameters:
-            - text (str): The text string to categorize.
-            - category_dict (dict): A dictionary where keys are substrings to search for in the text and
-              values are the categories to assign if a substring is found.
+            Args:
+                text (str): The text string to categorize.
 
             Returns:
-            - str: The category assigned based on the first matching substring found in the text. If no
-              matching substring is found, returns 'Full Funnel'.
+                str: The category assigned based on the first matching substring found in the text. If no
+                matching substring is found, returns other_name.
             """
             for key, category in category_dict.items():
                 if key.lower() in text.lower(): # Check if the substring is in the text (case-insensitive)
                     return category
             return other_label # Default category if no match is found
-
+
         # Apply the categorize_text function to each element in the specified column
         df[new_col_name] = df[column_name].apply(categorize_text)
         return df
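
The hunk above makes two signature changes. In 0.6.1.9 the inner `categorize_text` required two arguments, but the unchanged `.apply(categorize_text)` call passes only the cell value, so the old form would presumably raise a `TypeError`. In 0.6.2.1 the helper takes `text` alone and reads `category_dict` and `other_label` from the enclosing scope. The sketch below isolates that closure pattern in plain Python. `make_categorizer` and the sample lookup are hypothetical names used only for illustration and are not part of imsciences.

```python
def make_categorizer(category_dict, other_label='Other'):
    """Hypothetical helper (not in imsciences): builds a one-argument categorizer."""

    def categorize_text(text):
        # category_dict and other_label come from the enclosing scope,
        # mirroring the 0.6.2.1 version of the inner function.
        for key, category in category_dict.items():
            if key.lower() in text.lower():  # case-insensitive substring match
                return category
        return other_label  # no key matched

    return categorize_text


# Illustrative data only; dict insertion order decides which key wins.
lookup = {'brand': 'Upper Funnel', 'retarget': 'Lower Funnel'}
categorize = make_categorizer(lookup)

samples = ['Brand Awareness UK', 'retargeting_display', 'Misc', 'Brand retargeting']
labels = [categorize(text) for text in samples]
print(labels)
```

Because the first matching key wins, 'Brand retargeting' is labelled by the 'brand' entry rather than 'retarget'. A one-argument function of this shape is what `Series.apply` expects, which is why the closure form works with the unchanged `df[column_name].apply(categorize_text)` line.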
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: imsciences
-Version: 0.6.1.9
+Version: 0.6.2.1
 Summary: IMS Data Processing Package
 Author: IMS
 Author-email: cam@im-sciences.com
@@ -8,7 +8,7 @@ def read_md(file_name):
             return f.read()
     return ''
 
-VERSION = '0.6.1.9'
+VERSION = '0.6.2.1'
 DESCRIPTION = 'IMS Data Processing Package'
 LONG_DESCRIPTION = read_md('README.md') # Reading from README.md
 
File without changes