PyPI - idscrub - Versions diffs - 1.0.1__tar.gz → 1.1.1__tar.gz - Mend

idscrub 1.0.1tar.gz → 1.1.1tar.gz


Files changed (37)

{idscrub-1.0.1 → idscrub-1.1.1}/.pre-commit-config.yaml RENAMED

@@ -12,7 +12,7 @@ repos:
 # Mandatory internal hooks
 - repo: https://github.com/uktrade/github-standards
-  rev: v1.2.1  # update periodically with pre-commit autoupdate
+  rev: v1.3.1  # update periodically with pre-commit autoupdate
   hooks:
     - id: run-security-scan
       verbose: false

{idscrub-1.0.1 → idscrub-1.1.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: idscrub
-Version: 1.0.1
+Version: 1.1.1
 Author: Department for Business and Trade
 Requires-Python: >=3.12
 Description-Content-Type: text/markdown
@@ -8,7 +8,7 @@ License-File: LICENSE
 Requires-Dist: ipykernel>=7.1.0
 Requires-Dist: ipywidgets
 Requires-Dist: numpy>=2.3.4
-Requires-Dist: pandas>=2.3.3
+Requires-Dist: pandas<3.0
 Requires-Dist: phonenumbers>=9.0.18
 Requires-Dist: pip>=25.3
 Requires-Dist: spacy-transformers>=1.3.9
@@ -19,12 +19,17 @@ Provides-Extra: trf
 Requires-Dist: en_core_web_trf; extra == "trf"
 Dynamic: license-file
+![Development](https://img.shields.io/badge/status-development-orange)
 # idscrub 🧽✨
 * Names and other personally identifying information are often present in text, even if they are not clearly visible or requested.
 * This information may need to be removed prior to further analysis in many cases.
 * `idscrub` identifies and removes (*✨scrubs✨*) personal data from text using [regular expressions](https://en.wikipedia.org/wiki/Regular_expression) and [named-entity recognition](https://en.wikipedia.org/wiki/Named-entity_recognition).
+> [!IMPORTANT]
+> * This package is undergoing frequent internal development. Major updates will be made public periodically.
 ## Installation
 `idscrub` can be installed using `pip` into a Python **>=3.12** environment. Example:
@@ -45,7 +50,7 @@ Basic usage example (see [basic_usage.ipynb](https://github.com/uktrade/idscrub/
 from idscrub import IDScrub
 scrub = IDScrub(['Our names are Hamish McDonald, L. Salah, and Elena Suárez.', 'My number is +441111111111 and I live at AA11 1AA.'])x
-scrubbed_texts = scrub.scrub(scrub_methods=['spacy_persons', 'uk_phone_numbers', 'uk_postcodes'])
+scrubbed_texts = scrub.scrub(scrub_methods=['spacy_entities', 'uk_phone_numbers', 'uk_postcodes'])
 print(scrubbed_texts)
@@ -57,17 +62,18 @@ Personal data can either be scrubbed as methods with arguments for extra customi
 | Argument                | Scrubs                                                                 |
 |-------------------------|------------------------------------------------------------------------|
-| `all`                  | All supported personal data types (see `IDScrub.all()` for further customisation)                                      |
-| `spacy_persons`        | Person names detected by spaCy's `en_core_web_trf` (or other user-selected spaCy models)                                    |
-| `huggingface_persons`  | Person names detected by user-selected HuggingFace models                        |
-| `email_addresses`      | Email addresses                                                       |
-| `titles`               | Titles (e.g., Mr., Mrs., Dr.)                                         |
-| `handles`              | Social media handles (e.g., @username)                                |
-| `ip_addresses`         | IP addresses                                                          |
-| `uk_postcodes`         | UK postal codes                                                       |
-| `uk_phone_numbers`     | UK phone numbers                                                      |
-| `google_phone_numbers` | Phone numbers detected by Google’s [phonenumbers](https://github.com/daviddrysdale/python-phonenumbers) |
-| `presidio`             | Entities supported by [Microsoft Presidio](https://microsoft.github.io/presidio/) (e.g., names, URLs, NHS numbers, IBAN codes) |
+| `all`                  | All supported personal data types (see `IDScrub.all()` for further customisation) |
+| `spacy_entities`        | Entities detected by spaCy's `en_core_web_trf` or other user-selected spaCy models (e.g. persons (names), organisations) |
+| `presidio_entities`     | Entities supported by [Microsoft Presidio](https://microsoft.github.io/presidio/) (e.g. persons (names), URLs, NHS numbers, IBAN codes) |
+| `huggingface_entities`  | Entities detected by user-selected HuggingFace models |
+| `email_addresses`      | Email addresses (e.g. john@email.com)   |
+| `titles`               | Titles (e.g. Mr., Mrs., Dr.)    |
+| `handles`              | Social media handles (e.g. @username)  |
+| `ip_addresses`         | IP addresses (e.g. 8.8.8.8)  |
+| `uk_postcodes`         | UK postal codes (e.g. SW1A 2AA) |
+| `uk_addresses`         | UK addresses (e.g. 10 Downing Street)  |
+| `uk_phone_numbers`     | UK phone numbers (e.g. +441111111111) |
+| `google_phone_numbers` | Phone numbers detected by Google's [phonenumbers](https://github.com/daviddrysdale/python-phonenumbers) |
 ## Considerations before use

{idscrub-1.0.1 → idscrub-1.1.1}/README.md RENAMED

@@ -1,9 +1,14 @@
+![Development](https://img.shields.io/badge/status-development-orange)
 # idscrub 🧽✨
 * Names and other personally identifying information are often present in text, even if they are not clearly visible or requested.
 * This information may need to be removed prior to further analysis in many cases.
 * `idscrub` identifies and removes (*✨scrubs✨*) personal data from text using [regular expressions](https://en.wikipedia.org/wiki/Regular_expression) and [named-entity recognition](https://en.wikipedia.org/wiki/Named-entity_recognition).
+> [!IMPORTANT]
+> * This package is undergoing frequent internal development. Major updates will be made public periodically.
 ## Installation
 `idscrub` can be installed using `pip` into a Python **>=3.12** environment. Example:
@@ -24,7 +29,7 @@ Basic usage example (see [basic_usage.ipynb](https://github.com/uktrade/idscrub/
 from idscrub import IDScrub
 scrub = IDScrub(['Our names are Hamish McDonald, L. Salah, and Elena Suárez.', 'My number is +441111111111 and I live at AA11 1AA.'])x
-scrubbed_texts = scrub.scrub(scrub_methods=['spacy_persons', 'uk_phone_numbers', 'uk_postcodes'])
+scrubbed_texts = scrub.scrub(scrub_methods=['spacy_entities', 'uk_phone_numbers', 'uk_postcodes'])
 print(scrubbed_texts)
@@ -36,17 +41,18 @@ Personal data can either be scrubbed as methods with arguments for extra customi
 | Argument                | Scrubs                                                                 |
 |-------------------------|------------------------------------------------------------------------|
-| `all`                  | All supported personal data types (see `IDScrub.all()` for further customisation)                                      |
-| `spacy_persons`        | Person names detected by spaCy's `en_core_web_trf` (or other user-selected spaCy models)                                    |
-| `huggingface_persons`  | Person names detected by user-selected HuggingFace models                        |
-| `email_addresses`      | Email addresses                                                       |
-| `titles`               | Titles (e.g., Mr., Mrs., Dr.)                                         |
-| `handles`              | Social media handles (e.g., @username)                                |
-| `ip_addresses`         | IP addresses                                                          |
-| `uk_postcodes`         | UK postal codes                                                       |
-| `uk_phone_numbers`     | UK phone numbers                                                      |
-| `google_phone_numbers` | Phone numbers detected by Google’s [phonenumbers](https://github.com/daviddrysdale/python-phonenumbers) |
-| `presidio`             | Entities supported by [Microsoft Presidio](https://microsoft.github.io/presidio/) (e.g., names, URLs, NHS numbers, IBAN codes) |
+| `all`                  | All supported personal data types (see `IDScrub.all()` for further customisation) |
+| `spacy_entities`        | Entities detected by spaCy's `en_core_web_trf` or other user-selected spaCy models (e.g. persons (names), organisations) |
+| `presidio_entities`     | Entities supported by [Microsoft Presidio](https://microsoft.github.io/presidio/) (e.g. persons (names), URLs, NHS numbers, IBAN codes) |
+| `huggingface_entities`  | Entities detected by user-selected HuggingFace models |
+| `email_addresses`      | Email addresses (e.g. john@email.com)   |
+| `titles`               | Titles (e.g. Mr., Mrs., Dr.)    |
+| `handles`              | Social media handles (e.g. @username)  |
+| `ip_addresses`         | IP addresses (e.g. 8.8.8.8)  |
+| `uk_postcodes`         | UK postal codes (e.g. SW1A 2AA) |
+| `uk_addresses`         | UK addresses (e.g. 10 Downing Street)  |
+| `uk_phone_numbers`     | UK phone numbers (e.g. +441111111111) |
+| `google_phone_numbers` | Phone numbers detected by Google's [phonenumbers](https://github.com/daviddrysdale/python-phonenumbers) |
 ## Considerations before use

{idscrub-1.0.1 → idscrub-1.1.1}/idscrub/scrub.py RENAMED

@@ -453,6 +453,24 @@ class IDScrub:
         return self.scrub_regex(pattern, replacement_text, label=label)
+    def uk_addresses(self, replacement_text: str = "[ADDRESS]", label: str = "uk_address") -> list[str]:
+        """
+        Removes addresses.
+        e.g. `10 Downing Street` scrubbed
+        Args:
+            replacement_text (str): The replacement text for the removed text.
+            label (str): Label for the personal data removed.
+        Returns:
+            list[str]: The input list of text with postcodes replaced.
+        """
+        self.logger.info("Scrubbing addresses using regex...")
+        pattern = r"(?i)\b(?:flat\s+\w+,\s*)?\d+[a-z]?(?:[-–/]\d+[a-z]?)?\s+[a-z][a-z'’\- ]+\s+(street|st|road|rd|avenue|ave|lane|ln|close|cl|drive|dr|way|walk|gardens|gdns|place|pl|mews|court|ct|crescent|cres|terrace|ter)\b"
+        return self.scrub_regex(pattern, replacement_text, label)
     def claimants(self, replacement_text="[CLAIMANT]", label: str = "claimant") -> list[str]:
         """
         Removes claimant names from employment tribunal texts.
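
Note on the `uk_addresses` hunk above: the behaviour of the new pattern can be checked in isolation. The sketch below copies the regular expression verbatim from the added lines and applies it with the standard-library `re.sub`. The helper `scrub_addresses` is a hypothetical stand-in for `IDScrub.scrub_regex`, whose internals are not shown in this diff, so it illustrates the pattern only and not the package's actual replacement or logging logic.

```python
import re

# Pattern copied verbatim from the `uk_addresses` method added in 1.1.1.
UK_ADDRESS_PATTERN = (
    r"(?i)\b(?:flat\s+\w+,\s*)?\d+[a-z]?(?:[-–/]\d+[a-z]?)?\s+[a-z][a-z'’\- ]+\s+"
    r"(street|st|road|rd|avenue|ave|lane|ln|close|cl|drive|dr|way|walk|gardens|gdns|"
    r"place|pl|mews|court|ct|crescent|cres|terrace|ter)\b"
)


def scrub_addresses(texts, replacement_text="[ADDRESS]"):
    """Hypothetical stand-in for IDScrub.scrub_regex: a plain re.sub per text."""
    return [re.sub(UK_ADDRESS_PATTERN, replacement_text, text) for text in texts]


examples = [
    "I live at 10 Downing Street.",
    "Send it to Flat 2, 14a King's Road today.",
    "No address here.",
]

for original, scrubbed in zip(examples, scrub_addresses(examples)):
    print(f"{original!r} -> {scrubbed!r}")
```

The pattern requires a house number followed by at least one word and a recognised street suffix, so text without a leading number is left untouched.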
@@ -528,64 +546,86 @@ class IDScrub:
         return model
-    def spacy_persons(
+    def spacy_entities(
         self,
         model_name: str = "en_core_web_trf",
+        entities: list[str] = ["PERSON", "ORG", "NORP"],
+        replacement_map: str = {"PERSON": "[PERSON]", "ORG": "[ORG]", "NORP": "[NORP]"},
+        label_prefix: str = None,
         n_process: int = 1,
         batch_size: int = 1000,
-        replacement_text: str = "[PERSON]",
-        label: str = "person",
     ) -> list[str]:
         """
-        Remove PERSON entities using a Spacy model.
+        Remove SpaCy entities using a given SpaCy model.
+        Documentation for entity labels: https://spacy.io/models/en#en_core_web_trf
         Note: only "en_core_web_trf" has been evaluated.
         Args:
             model_name (str): Name of Spacy model. Only `en_core_web_trf` has been evaluated.
+            entities (list[str]): Which SpaCy entities to scrub (based on SpaCy entity keys).
+            replacement_map (str): The replacement texts for the removed text. Index will match `entities`.
+            label_prefix (str): Prefix for the Spacy entity removed, e.g. `{label}_person`.
             n_process (int): Number of parallel processes.
             batch_size (int): The number of texts in each batch.
-            replacement_text (str): The replacement text for the removed text.
-            label (str): Label for the personal data removed.
         Returns:
             list[str]: The input list of text with PERSON entities scrubbed.
         """
-        self.logger.info(f"Scrubbing names using SpaCy model `{model_name}`...")
-        texts = self.get_texts()
+        self.logger.info(
+            f"Scrubbing SpaCy entities `{', '.join(str(entitity) for entitity in entities)}` using SpaCy model `{model_name}`..."
+        )
-        if self.replacement_text:
-            replacement_text = self.replacement_text
+        texts = self.get_texts()
         cleaned_texts = []
+        labels = []
         nlp = self.get_spacy_model(model_name)
         stripped_texts = [s.strip() if s.isspace() else s for s in texts]
         documents = nlp.pipe(stripped_texts, n_process=n_process, batch_size=batch_size)
         for i, (ids, doc, stripped_text) in tqdm(
-            enumerate((zip(self.text_ids, documents, stripped_texts))), total=len(texts)
+            enumerate(zip(self.text_ids, documents, stripped_texts)), total=len(texts)
         ):
-            if stripped_text == "":
+            if not stripped_text:
                 cleaned_texts.append(texts[i])
                 continue
-            # Collect person entities
-            person_entities = [
-                ent for ent in doc.ents if ent.label_ == "PERSON" and ent.text not in {"PERSON", "HANDLE"}
-            ]
-            self.scrubbed_data.extend({self.text_id_name: ids, label: ent.text} for ent in person_entities)
+            all_found_entities = []
+            for entity_type in entities:
+                found = [
+                    ent for ent in doc.ents if ent.label_ == entity_type and ent.text not in {entity_type, "HANDLE"}
+                ]
+                for ent in found:
+                    label = ent.label_.lower()
+                    if label_prefix:
+                        label = f"{label_prefix}_{label}"
+                    labels.append(label)
+                    self.scrubbed_data.append({self.text_id_name: ids, label: ent.text})
+                if self.replacement_text:
+                    all_found_entities.extend((ent.start_char, ent.end_char, self.replacement_text) for ent in found)
+                elif replacement_map:
+                    all_found_entities.extend(
+                        (ent.start_char, ent.end_char, replacement_map.get(entity_type)) for ent in found
+                    )
+                else:
+                    all_found_entities.extend((ent.start_char, ent.end_char, f"[{entity_type}]") for ent in found)
-            # Remove person entities
             cleaned = stripped_text
-            for ent in sorted(person_entities, key=lambda x: [x.start_char], reverse=True):
-                cleaned = cleaned[: ent.start_char] + replacement_text + cleaned[ent.end_char :]
+            for start, end, repl in sorted(all_found_entities, key=lambda x: x[0], reverse=True):
+                cleaned = cleaned[:start] + repl + cleaned[end:]
             cleaned_texts.append(cleaned)
         self.cleaned_texts = cleaned_texts
-        self.log_message(label)
+        for label in set(labels):
+            self.log_message(label)
         return cleaned_texts
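
Note on the `spacy_entities` hunk above: the new loop collects `(start_char, end_char, replacement)` tuples for every entity type and then applies them sorted by start offset in reverse. A minimal sketch of why the reverse order matters follows. It uses hand-written offsets in place of spaCy output, and the function names `replace_spans` and `replace_spans_forward` are illustrative and do not appear in the package.

```python
def replace_spans(text, spans):
    """Apply (start, end, replacement) spans from right to left.

    Replacing the right-most span first keeps the character offsets of
    every span to its left valid, even when the replacement text has a
    different length from the text it replaces.
    """
    cleaned = text
    for start, end, repl in sorted(spans, key=lambda span: span[0], reverse=True):
        cleaned = cleaned[:start] + repl + cleaned[end:]
    return cleaned


def replace_spans_forward(text, spans):
    """Same replacement applied left to right, shown only for contrast."""
    cleaned = text
    for start, end, repl in sorted(spans, key=lambda span: span[0]):
        cleaned = cleaned[:start] + repl + cleaned[end:]
    return cleaned


text = "Alice works at Acme."
# Hand-written character offsets standing in for spaCy entity spans.
spans = [(0, 5, "[PERSON]"), (15, 19, "[ORG]")]

print(replace_spans(text, spans))          # offsets stay aligned
print(replace_spans_forward(text, spans))  # second span lands on shifted text
```

In the forward variant the first replacement lengthens the string, so the second span's offsets no longer point at the organisation name.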
@@ -600,7 +640,7 @@ class IDScrub:
         Note: No Hugging Face models have been evaluated for performance.
         Args:
-            hf_model_path (str): Path to the Hugging Face model on the DBT mirror.
+            hf_model_path (str): Path to the Hugging Face model.
             Only `dbmdz/bert-large-cased-finetuned-conll03-english` has been evaluated.
             download_directory (str): Directory in which to save the model.
             Default is current working directory.
@@ -624,20 +664,21 @@ class IDScrub:
         return tokenizer
-    def huggingface_persons(
+    def huggingface_entities(
         self,
         hf_model_path: str = "dbmdz/bert-large-cased-finetuned-conll03-english",
         download_directory: str = f"{DOWNLOAD_DIR}/huggingface/",
+        entity="PER",
         replacement_text: str = "[PERSON]",
         label: str = "person",
         batch_size: int = 8,
     ) -> list[str]:
         """
-        Remove PERSON entities using a Hugging Face model.
+        Remove entities using a Hugging Face model. Default is a PERSON entity identifier.
         Note: No Hugging Face models have been evaluated for performance.
         Args:
-            hf_model_path (str): Path to the Hugging Face model on the DBT mirror.
+            hf_model_path (str): Path to the Hugging Face model.
             Only `dbmdz/bert-large-cased-finetuned-conll03-english` has been tested.
             download_directory (str): Directory in which to save the model.
             Default is current working directory.
@@ -679,7 +720,7 @@ class IDScrub:
                 continue
             person_entities = [
-                ent for ent in entities if ent["entity_group"] == "PER" and ent["word"] not in {"HANDLE", "PERSON"}
+                ent for ent in entities if ent["entity_group"] == entity and ent["word"] not in {"HANDLE", entity}
             ]
             self.scrubbed_data.extend({self.text_id_name: ids, label: ent["word"]} for ent in person_entities)
@@ -695,10 +736,10 @@ class IDScrub:
         return cleaned_texts
-    def presidio(
+    def presidio_entities(
         self,
         model_name: str = "en_core_web_trf",
-        entities_to_scrub: list[str] = [
+        entities: list[str] = [
             "PERSON",
             "UK_NINO",
             "UK_NHS",
@@ -718,15 +759,18 @@ class IDScrub:
         Args:
             model_name (str): spaCy model to use
-            entities_to_scrub (list[str]): Entity types to scrub (e.g. ["PERSON", "IP_ADDRESS"])
+            entities (list[str]): Entity types to scrub (e.g. ["PERSON", "IP_ADDRESS"])
             replacement_map (dict): Mapping of entity_type to replacement string (e.g. {'PERSON': '[PERSON]'})
             label_prefix (str): Prefix for the Presidio personal data type removed, e.g. `{label}_person`.
+            Useful if you wish to identify this having being scrubbed by Presidio.
         Returns:
             list[str]: The input list of text with entities replaced.
         """
-        self.logger.info("Scrubbing using Presidio...")
+        self.logger.info(
+            f"Scrubbing Presidio entities `{', '.join(str(entitity) for entitity in entities)}` using SpaCy model `{model_name}`..."
+        )
         texts = self.get_texts()
@@ -744,7 +788,7 @@ class IDScrub:
         anonymizer = AnonymizerEngine()
         cleaned_texts = []
-        unique_labels = []
+        all_labels = []
         stripped_texts = [s.strip() if s.isspace() else s for s in texts]
@@ -754,14 +798,15 @@ class IDScrub:
                 continue
             results = analyzer.analyze(text=stripped_text, language="en")
-            results = [r for r in results if r.entity_type in entities_to_scrub]
+            results = [r for r in results if r.entity_type in entities]
             if label_prefix:
                 labels = [f"{label_prefix}_{res.entity_type.lower()}" for res in results]
             else:
                 labels = [f"{res.entity_type.lower()}" for res in results]
-            unique_labels.append(list(set(labels)))
+            for label in labels:
+                all_labels.append(label)
             self.scrubbed_data.extend(
                 {self.text_id_name: ids, label: stripped_text[res.start : res.end]}
@@ -788,9 +833,8 @@ class IDScrub:
         self.cleaned_texts = cleaned_texts
-        for label in unique_labels:
-            if label:
-                self.log_message(label[0])
+        for label in set(all_labels):
+            self.log_message(label)
         return cleaned_texts
@@ -810,6 +854,7 @@ class IDScrub:
         self.handles()
         self.ip_addresses()
         self.uk_phone_numbers()
+        self.uk_addresses()
         self.uk_postcodes()
         self.titles()
@@ -820,7 +865,8 @@ class IDScrub:
         custom_regex_patterns: list = None,
         custom_replacement_texts: list[str] = None,
         model_name: str = "en_core_web_trf",
-        presidio_entities_to_scrub: list[str] = [
+        spacy_entities: list[str] = ["PERSON", "ORG", "NORP"],
+        presidio_entities: list[str] = [
             "PERSON",
             "EMAIL_ADDRESS",
             "UK_NINO",
@@ -857,8 +903,8 @@ class IDScrub:
                 custom_replacement_texts=custom_replacement_texts,
             )
-        self.presidio(model_name=model_name, entities_to_scrub=presidio_entities_to_scrub)
-        self.spacy_persons(model_name=model_name, n_process=n_process, batch_size=batch_size)
+        self.presidio_entities(model_name=model_name, entities=presidio_entities)
+        self.spacy_entities(model_name=model_name, entities=spacy_entities, n_process=n_process, batch_size=batch_size)
         self.google_phone_numbers()
         self.all_regex()

{idscrub-1.0.1 → idscrub-1.1.1}/idscrub.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: idscrub
-Version: 1.0.1
+Version: 1.1.1
 Author: Department for Business and Trade
 Requires-Python: >=3.12
 Description-Content-Type: text/markdown
@@ -8,7 +8,7 @@ License-File: LICENSE
 Requires-Dist: ipykernel>=7.1.0
 Requires-Dist: ipywidgets
 Requires-Dist: numpy>=2.3.4
-Requires-Dist: pandas>=2.3.3
+Requires-Dist: pandas<3.0
 Requires-Dist: phonenumbers>=9.0.18
 Requires-Dist: pip>=25.3
 Requires-Dist: spacy-transformers>=1.3.9
@@ -19,12 +19,17 @@ Provides-Extra: trf
 Requires-Dist: en_core_web_trf; extra == "trf"
 Dynamic: license-file
+![Development](https://img.shields.io/badge/status-development-orange)
 # idscrub 🧽✨
 * Names and other personally identifying information are often present in text, even if they are not clearly visible or requested.
 * This information may need to be removed prior to further analysis in many cases.
 * `idscrub` identifies and removes (*✨scrubs✨*) personal data from text using [regular expressions](https://en.wikipedia.org/wiki/Regular_expression) and [named-entity recognition](https://en.wikipedia.org/wiki/Named-entity_recognition).
+> [!IMPORTANT]
+> * This package is undergoing frequent internal development. Major updates will be made public periodically.
 ## Installation
 `idscrub` can be installed using `pip` into a Python **>=3.12** environment. Example:
@@ -45,7 +50,7 @@ Basic usage example (see [basic_usage.ipynb](https://github.com/uktrade/idscrub/
 from idscrub import IDScrub
 scrub = IDScrub(['Our names are Hamish McDonald, L. Salah, and Elena Suárez.', 'My number is +441111111111 and I live at AA11 1AA.'])x
-scrubbed_texts = scrub.scrub(scrub_methods=['spacy_persons', 'uk_phone_numbers', 'uk_postcodes'])
+scrubbed_texts = scrub.scrub(scrub_methods=['spacy_entities', 'uk_phone_numbers', 'uk_postcodes'])
 print(scrubbed_texts)
@@ -57,17 +62,18 @@ Personal data can either be scrubbed as methods with arguments for extra customi
 | Argument                | Scrubs                                                                 |
 |-------------------------|------------------------------------------------------------------------|
-| `all`                  | All supported personal data types (see `IDScrub.all()` for further customisation)                                      |
-| `spacy_persons`        | Person names detected by spaCy's `en_core_web_trf` (or other user-selected spaCy models)                                    |
-| `huggingface_persons`  | Person names detected by user-selected HuggingFace models                        |
-| `email_addresses`      | Email addresses                                                       |
-| `titles`               | Titles (e.g., Mr., Mrs., Dr.)                                         |
-| `handles`              | Social media handles (e.g., @username)                                |
-| `ip_addresses`         | IP addresses                                                          |
-| `uk_postcodes`         | UK postal codes                                                       |
-| `uk_phone_numbers`     | UK phone numbers                                                      |
-| `google_phone_numbers` | Phone numbers detected by Google’s [phonenumbers](https://github.com/daviddrysdale/python-phonenumbers) |
-| `presidio`             | Entities supported by [Microsoft Presidio](https://microsoft.github.io/presidio/) (e.g., names, URLs, NHS numbers, IBAN codes) |
+| `all`                  | All supported personal data types (see `IDScrub.all()` for further customisation) |
+| `spacy_entities`        | Entities detected by spaCy's `en_core_web_trf` or other user-selected spaCy models (e.g. persons (names), organisations) |
+| `presidio_entities`     | Entities supported by [Microsoft Presidio](https://microsoft.github.io/presidio/) (e.g. persons (names), URLs, NHS numbers, IBAN codes) |
+| `huggingface_entities`  | Entities detected by user-selected HuggingFace models |
+| `email_addresses`      | Email addresses (e.g. john@email.com)   |
+| `titles`               | Titles (e.g. Mr., Mrs., Dr.)    |
+| `handles`              | Social media handles (e.g. @username)  |
+| `ip_addresses`         | IP addresses (e.g. 8.8.8.8)  |
+| `uk_postcodes`         | UK postal codes (e.g. SW1A 2AA) |
+| `uk_addresses`         | UK addresses (e.g. 10 Downing Street)  |
+| `uk_phone_numbers`     | UK phone numbers (e.g. +441111111111) |
+| `google_phone_numbers` | Phone numbers detected by Google's [phonenumbers](https://github.com/daviddrysdale/python-phonenumbers) |
 ## Considerations before use

{idscrub-1.0.1 → idscrub-1.1.1}/idscrub.egg-info/SOURCES.txt RENAMED

@@ -27,8 +27,8 @@ test/test_huggingface.py
 test/test_id.py
 test/test_label.py
 test/test_log.py
-test/test_persidio.py
 test/test_phonenumbers.py
+test/test_presidio.py
 test/test_regex.py
 test/test_scrub.py
 test/test_spacy.py

{idscrub-1.0.1 → idscrub-1.1.1}/idscrub.egg-info/requires.txt RENAMED

@@ -1,7 +1,7 @@
 ipykernel>=7.1.0
 ipywidgets
 numpy>=2.3.4
-pandas>=2.3.3
+pandas<3.0
 phonenumbers>=9.0.18
 pip>=25.3
 spacy-transformers>=1.3.9
