idscrub-1.0.0-py3-none-any.whl → idscrub-1.0.1-py3-none-any.whl

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: idscrub
- Version: 1.0.0
+ Version: 1.0.1
  Author: Department for Business and Trade
  Requires-Python: >=3.12
  Description-Content-Type: text/markdown
@@ -51,6 +51,23 @@ print(scrubbed_texts)
 
  # Output: ['Our names are [PERSON], [PERSON], and [PERSON].', 'My number is [PHONENO] and I live at [POSTCODE].']
  ```
+ ## Personal data types supported
+
+ Personal data can either be scrubbed as methods with arguments for extra customisation, e.g. `IDScrub.google_phone_numbers(region="GB")`, or as a string arguments with default configurations (see above). The method name and its string representation are the same.
+
+ | Argument | Scrubs |
+ |-------------------------|------------------------------------------------------------------------|
+ | `all` | All supported personal data types (see `IDScrub.all()` for further customisation) |
+ | `spacy_persons` | Person names detected by spaCy's `en_core_web_trf` (or other user-selected spaCy models) |
+ | `huggingface_persons` | Person names detected by user-selected HuggingFace models |
+ | `email_addresses` | Email addresses |
+ | `titles` | Titles (e.g., Mr., Mrs., Dr.) |
+ | `handles` | Social media handles (e.g., @username) |
+ | `ip_addresses` | IP addresses |
+ | `uk_postcodes` | UK postal codes |
+ | `uk_phone_numbers` | UK phone numbers |
+ | `google_phone_numbers` | Phone numbers detected by Google’s [phonenumbers](https://github.com/daviddrysdale/python-phonenumbers) |
+ | `presidio` | Entities supported by [Microsoft Presidio](https://microsoft.github.io/presidio/) (e.g., names, URLs, NHS numbers, IBAN codes) |
 
  ## Considerations before use
 
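The README text added in this hunk says that each scrubbing method and its string argument share the same name. One way such a design can work is name-based dispatch, sketched below in plain Python. This is an illustration only: `MiniScrub`, its `scrub` method, the `replacement` parameter, and both regular expressions are hypothetical stand-ins, not idscrub's actual implementation or API.

```python
import re


class MiniScrub:
    """Hypothetical stand-in (not idscrub): demonstrates name-based dispatch."""

    def __init__(self, texts):
        self.texts = list(texts)

    def email_addresses(self, replacement="[EMAIL]"):
        # Deliberately simple pattern; real email detection is more involved.
        pattern = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
        self.texts = [pattern.sub(replacement, text) for text in self.texts]
        return self.texts

    def ip_addresses(self, replacement="[IPADDRESS]"):
        # Matches dotted-quad IPv4 shapes only, with no range validation.
        pattern = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
        self.texts = [pattern.sub(replacement, text) for text in self.texts]
        return self.texts

    def scrub(self, methods):
        # Each string is resolved to the method of the same name and
        # called with its default arguments.
        for name in methods:
            method = getattr(self, name, None)
            if name.startswith("_") or name == "scrub" or not callable(method):
                raise ValueError(f"Unknown scrub method: {name!r}")
            method()
        return self.texts


# String arguments use the defaults ...
by_name = MiniScrub(["Mail a.b@example.com from 10.0.0.1"])
print(by_name.scrub(["email_addresses", "ip_addresses"]))

# ... while calling the method directly allows customisation.
by_call = MiniScrub(["Mail a.b@example.com"])
print(by_call.email_addresses(replacement="<email>"))
```

Resolving the string with `getattr` keeps the two entry points in sync: a newly added method becomes available as a string argument without a separate lookup table.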
@@ -1,7 +1,7 @@
  idscrub/__init__.py,sha256=cRugJv27q1q--bl-VNLpfiScJb_ROlUxyLFhaF55S1w,38
  idscrub/locations.py,sha256=7fMNOcGMYe7sX8TrfhMW6oYGAlc1WVYVQKQbpxE3pqo,217
  idscrub/scrub.py,sha256=VqVqcChbbxMEKJR6Aci971dqG-RmD48otrp9sG2dX0o,34443
- idscrub-1.0.0.dist-info/licenses/LICENSE,sha256=JJnuf10NSx7YXglte1oH_N9ZP3AcWR_Y8irvQb_wnsg,1090
+ idscrub-1.0.1.dist-info/licenses/LICENSE,sha256=JJnuf10NSx7YXglte1oH_N9ZP3AcWR_Y8irvQb_wnsg,1090
  notebooks/basic_usage.ipynb,sha256=XTBxdtu2F0S99V2lntUEeFj6SN4GRVm4qKvqOhs7nec,38777
  test/conftest.py,sha256=y-pwGXpdg7bbFc36HtE3wQtZkeI0JM77fcMYjej5veY,557
  test/test_all.py,sha256=ifuXAI0Hq3ETNXzdITjNGCnuFyozhN5TpJC2hOtA2bM,1103
@@ -16,7 +16,7 @@ test/test_phonenumbers.py,sha256=hZsXgwhn5R-7426TTWwCH9gWQwhyHtjLUstN10jnX6c,607
  test/test_regex.py,sha256=zuq8g_8F_P5oCA2ChU5wUIFEWjT9LSYB0S_U1rBpTn4,4388
  test/test_scrub.py,sha256=MWpan5cWIGeNPJCvTwtYe-iZeoIjS_fZMIg46ZVrkJo,1377
  test/test_spacy.py,sha256=KHalx16GYHmCaQUU1O5bLMP95SLTu1007fJK1oq__v4,932
- idscrub-1.0.0.dist-info/METADATA,sha256=fo7FUBAHDei63EWPRUrfNS05p3bnZWSY2GPVrho0vjo,5403
- idscrub-1.0.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- idscrub-1.0.0.dist-info/top_level.txt,sha256=D4EEodXGCjGiX35ObiBTmjjBAdouN-eCvH-LezGGtks,23
- idscrub-1.0.0.dist-info/RECORD,,
+ idscrub-1.0.1.dist-info/METADATA,sha256=mRpiv1ew3UV0ch6-ldQoLC744RVQ-wVS--KgBg2OpmI,7201
+ idscrub-1.0.1.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ idscrub-1.0.1.dist-info/top_level.txt,sha256=D4EEodXGCjGiX35ObiBTmjjBAdouN-eCvH-LezGGtks,23
+ idscrub-1.0.1.dist-info/RECORD,,
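Each RECORD line in the hunks above has the form `path,sha256=<digest>,<size>`. In the wheel format the digest is the file's SHA-256 hash encoded as URL-safe Base64 with the trailing `=` padding removed, and the size is in bytes; the RECORD file lists itself with both fields empty. A minimal sketch of computing such an entry with the standard library follows. The helper name `record_entry` and the sample path are illustrative assumptions, not part of idscrub or of any packaging tool.

```python
import base64
import hashlib


def record_entry(path: str, data: bytes) -> str:
    """Build a RECORD-style line for the given file path and contents."""
    digest = hashlib.sha256(data).digest()
    # URL-safe Base64, with the '=' padding stripped as RECORD expects.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# Illustrative path and contents only.
print(record_entry("example/__init__.py", b""))
```

Comparing such a line against the published RECORD shows whether a file's contents changed between versions. In this diff, LICENSE, WHEEL, and top_level.txt keep their hashes, while METADATA grows from 5403 to 7201 bytes.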