datacloak 0.1.0__tar.gz
- datacloak-0.1.0/.gitignore +42 -0
- datacloak-0.1.0/CHANGELOG.md +23 -0
- datacloak-0.1.0/LICENSE +21 -0
- datacloak-0.1.0/PKG-INFO +364 -0
- datacloak-0.1.0/README.md +327 -0
- datacloak-0.1.0/datacloak/__init__.py +177 -0
- datacloak-0.1.0/datacloak/cli.py +222 -0
- datacloak-0.1.0/datacloak/detectors/__init__.py +40 -0
- datacloak-0.1.0/datacloak/detectors/aadhaar.py +53 -0
- datacloak-0.1.0/datacloak/detectors/base.py +97 -0
- datacloak-0.1.0/datacloak/detectors/credit_card.py +60 -0
- datacloak-0.1.0/datacloak/detectors/email.py +50 -0
- datacloak-0.1.0/datacloak/detectors/ifsc.py +57 -0
- datacloak-0.1.0/datacloak/detectors/ip_address.py +86 -0
- datacloak-0.1.0/datacloak/detectors/mobile.py +60 -0
- datacloak-0.1.0/datacloak/detectors/pan.py +57 -0
- datacloak-0.1.0/datacloak/detectors/upi.py +64 -0
- datacloak-0.1.0/datacloak/file_scanner.py +272 -0
- datacloak-0.1.0/datacloak/masker.py +196 -0
- datacloak-0.1.0/datacloak/py.typed +0 -0
- datacloak-0.1.0/datacloak/reporter.py +126 -0
- datacloak-0.1.0/datacloak/scanner.py +76 -0
- datacloak-0.1.0/pyproject.toml +127 -0
- datacloak-0.1.0/tests/__init__.py +1 -0
- datacloak-0.1.0/tests/test_aadhaar.py +60 -0
- datacloak-0.1.0/tests/test_email.py +62 -0
- datacloak-0.1.0/tests/test_file_scanner.py +131 -0
- datacloak-0.1.0/tests/test_integration.py +127 -0
- datacloak-0.1.0/tests/test_masking.py +126 -0
- datacloak-0.1.0/tests/test_mobile.py +61 -0
- datacloak-0.1.0/tests/test_other_detectors.py +136 -0
- datacloak-0.1.0/tests/test_pan.py +54 -0
- datacloak-0.1.0/tests/test_upi.py +17 -0
|
@@ -0,0 +1,42 @@
|
|
|
1
|
+
# Python
|
|
2
|
+
__pycache__/
|
|
3
|
+
*.py[cod]
|
|
4
|
+
*.pyo
|
|
5
|
+
*.pyd
|
|
6
|
+
*.so
|
|
7
|
+
*.egg
|
|
8
|
+
*.egg-info/
|
|
9
|
+
dist/
|
|
10
|
+
build/
|
|
11
|
+
.eggs/
|
|
12
|
+
.env
|
|
13
|
+
.venv/
|
|
14
|
+
venv/
|
|
15
|
+
env/
|
|
16
|
+
|
|
17
|
+
# Testing / Coverage
|
|
18
|
+
.pytest_cache/
|
|
19
|
+
.coverage
|
|
20
|
+
coverage.xml
|
|
21
|
+
htmlcov/
|
|
22
|
+
|
|
23
|
+
# Type checking
|
|
24
|
+
.mypy_cache/
|
|
25
|
+
|
|
26
|
+
# Linting
|
|
27
|
+
.ruff_cache/
|
|
28
|
+
|
|
29
|
+
# IDE
|
|
30
|
+
.idea/
|
|
31
|
+
.vscode/
|
|
32
|
+
*.swp
|
|
33
|
+
*.swo
|
|
34
|
+
|
|
35
|
+
# OS
|
|
36
|
+
.DS_Store
|
|
37
|
+
Thumbs.db
|
|
38
|
+
|
|
39
|
+
# Packaging
|
|
40
|
+
*.whl
|
|
41
|
+
*.tar.gz
|
|
42
|
+
MANIFEST
|
|
@@ -0,0 +1,23 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to DataCloak will be documented in this file.
|
|
4
|
+
|
|
5
|
+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
|
6
|
+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
|
+
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
## [0.1.0] - 2025-06-01
|
|
11
|
+
|
|
12
|
+
### Added
|
|
13
|
+
- Initial release of DataCloak
|
|
14
|
+
- Built-in detectors: Aadhaar, PAN, Indian Mobile, Email, UPI ID, Credit Card (Luhn-validated), IFSC, IPv4/IPv6
|
|
15
|
+
- Three masking modes: `partial`, `full`, `hash`
|
|
16
|
+
- `scan()` API for structured PII detection without modification
|
|
17
|
+
- `scan_file()` supporting `.txt`, `.log`, and `.csv` files
|
|
18
|
+
- Extensible `FileHandler` interface for adding new file format support
|
|
19
|
+
- JSON report generation with risk-level classification
|
|
20
|
+
- `datacloak` CLI with `scan`, `mask`, and `report` commands
|
|
21
|
+
- Custom detector framework via `BaseDetector`
|
|
22
|
+
- Full pytest test suite (>90% coverage)
|
|
23
|
+
- Typed codebase (PEP 561 compliant)
|
datacloak-0.1.0/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 DataCloak Contributors
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
datacloak-0.1.0/PKG-INFO
ADDED
|
@@ -0,0 +1,364 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: datacloak
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: Privacy protection library for detecting and masking PII in text, logs, and files.
|
|
5
|
+
Project-URL: Homepage, https://github.com/datacloak/datacloak
|
|
6
|
+
Project-URL: Documentation, https://datacloak.readthedocs.io
|
|
7
|
+
Project-URL: Repository, https://github.com/datacloak/datacloak
|
|
8
|
+
Project-URL: Bug Tracker, https://github.com/datacloak/datacloak/issues
|
|
9
|
+
Project-URL: Changelog, https://github.com/datacloak/datacloak/blob/main/CHANGELOG.md
|
|
10
|
+
Author: DataCloak Contributors
|
|
11
|
+
License: MIT
|
|
12
|
+
License-File: LICENSE
|
|
13
|
+
Keywords: aadhaar,compliance,data-masking,gdpr,india,pan,pii,privacy,redaction,security
|
|
14
|
+
Classifier: Development Status :: 3 - Alpha
|
|
15
|
+
Classifier: Intended Audience :: Developers
|
|
16
|
+
Classifier: Intended Audience :: Information Technology
|
|
17
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
18
|
+
Classifier: Operating System :: OS Independent
|
|
19
|
+
Classifier: Programming Language :: Python :: 3
|
|
20
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
21
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
22
|
+
Classifier: Topic :: Security
|
|
23
|
+
Classifier: Topic :: Software Development :: Libraries :: Python Modules
|
|
24
|
+
Classifier: Topic :: Text Processing :: General
|
|
25
|
+
Classifier: Typing :: Typed
|
|
26
|
+
Requires-Python: >=3.11
|
|
27
|
+
Requires-Dist: click>=8.1
|
|
28
|
+
Provides-Extra: dev
|
|
29
|
+
Requires-Dist: build; extra == 'dev'
|
|
30
|
+
Requires-Dist: hatchling; extra == 'dev'
|
|
31
|
+
Requires-Dist: mypy>=1.10; extra == 'dev'
|
|
32
|
+
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
|
|
33
|
+
Requires-Dist: pytest>=8.0; extra == 'dev'
|
|
34
|
+
Requires-Dist: ruff>=0.4; extra == 'dev'
|
|
35
|
+
Requires-Dist: twine; extra == 'dev'
|
|
36
|
+
Description-Content-Type: text/markdown
|
|
37
|
+
|
|
38
|
+
# DataCloak
|
|
39
|
+
|
|
40
|
+
> **Privacy protection for Python applications**: detect and mask PII in text, logs, files, and data pipelines.
|
|
41
|
+
|
|
47
|
+
|
|
48
|
+
---
|
|
49
|
+
|
|
50
|
+
DataCloak is a production-ready Python library for **automatically detecting and masking Personally Identifiable Information (PII)**, built for India-first compliance use cases (Aadhaar, PAN, UPI) while covering universal types like email and credit cards.
|
|
51
|
+
|
|
52
|
+
## Features
|
|
53
|
+
|
|
54
|
+
| Capability | Description |
|
|
55
|
+
|---|---|
|
|
56
|
+
| **8 built-in detectors** | Aadhaar, PAN, Mobile, Email, UPI ID, Credit Card (Luhn), IFSC, IPv4/IPv6 |
|
|
57
|
+
| **3 masking modes** | `partial` (keep trailing chars), `full` (redaction tags), `hash` (SHA-256) |
|
|
58
|
+
| **File scanning** | `.txt`, `.log`, `.csv`; extensible to PDF, DOCX, and more |
|
|
59
|
+
| **Structured scan** | Returns findings dict without modifying original text |
|
|
60
|
+
| **JSON reports** | Risk-level classified reports with per-type counts |
|
|
61
|
+
| **CLI** | `datacloak scan / mask / report` commands |
|
|
62
|
+
| **Pluggable detectors** | Subclass `BaseDetector` to add your own PII types |
|
|
63
|
+
|
|
64
|
+
---
|
|
65
|
+
|
|
66
|
+
## Installation
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
pip install datacloak
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
Requires Python 3.11+.
|
|
73
|
+
|
|
74
|
+
---
|
|
75
|
+
|
|
76
|
+
## Quick Start
|
|
77
|
+
|
|
78
|
+
```python
|
|
79
|
+
from datacloak import mask, scan
|
|
80
|
+
|
|
81
|
+
text = """
|
|
82
|
+
Aadhaar: 2345 6789 0123
|
|
83
|
+
PAN: ABCPE1234F
|
|
84
|
+
Email: alice@example.com
|
|
85
|
+
Phone: 9876543210
|
|
86
|
+
"""
|
|
87
|
+
|
|
88
|
+
# Partial masking (default): keeps trailing characters visible
|
|
89
|
+
print(mask(text))
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
**Output:**
|
|
93
|
+
```
|
|
94
|
+
Aadhaar: XXXX XXXX 0123
|
|
95
|
+
PAN: XXXXX1234F
|
|
96
|
+
Email: a***@example.com
|
|
97
|
+
Phone: ******3210
|
|
98
|
+
```
|
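The partial mode shown above can be approximated with a few lines of standard Python. This is a minimal sketch, not DataCloak's actual implementation: `mask_tail` and its `visible` and `fill` parameters are hypothetical names, and the rule (mask every letter or digit except the last few, leave separators alone) is inferred from the Aadhaar, PAN, and phone outputs only. The email output follows a different rule and is not covered here.

```python
def mask_tail(value: str, visible: int = 4, fill: str = "*") -> str:
    """Mask every letter/digit except the last `visible` ones; keep separators."""
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - visible, 0)  # how many leading characters to hide
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(fill)
            to_mask -= 1
        else:
            out.append(ch)  # separators and the visible tail pass through
    return "".join(out)


print(mask_tail("9876543210"))                       # ******3210
print(mask_tail("2345 6789 0123", fill="X"))         # XXXX XXXX 0123
print(mask_tail("ABCPE1234F", visible=5, fill="X"))  # XXXXX1234F
```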
|
99
|
+
|
|
100
|
+
---
|
|
101
|
+
|
|
102
|
+
## Usage Guide
|
|
103
|
+
|
|
104
|
+
### 1. Masking Modes
|
|
105
|
+
|
|
106
|
+
```python
|
|
107
|
+
from datacloak import mask
|
|
108
|
+
|
|
109
|
+
text = "Contact: alice@example.com | Phone: 9876543210"
|
|
110
|
+
|
|
111
|
+
# Partial: show trailing characters (default)
|
|
112
|
+
mask(text, mode="partial")
|
|
113
|
+
# → 'Contact: a***@example.com | Phone: ******3210'
|
|
114
|
+
|
|
115
|
+
# Full: replace with descriptive redaction tags
|
|
116
|
+
mask(text, mode="full")
|
|
117
|
+
# → 'Contact: [EMAIL_REDACTED] | Phone: [PHONE_REDACTED]'
|
|
118
|
+
|
|
119
|
+
# Hash: SHA-256 digest (deterministic and one-way; the same input always yields the same token)
|
|
120
|
+
mask(text, mode="hash")
|
|
121
|
+
# → 'Contact: [HASH:142d78e466cacab3] | Phone: [HASH:7619ee8cea49187f]'
|
|
122
|
+
```
|
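To illustrate what a hash-mode token might look like, here is a small sketch using only `hashlib`. It assumes the token is the first 16 hex characters of an unsalted SHA-256 digest of the UTF-8 value, which is a guess based on the `[HASH:…]` examples above; `hash_token` is a hypothetical helper, and DataCloak may normalize or salt values differently.

```python
import hashlib


def hash_token(value: str, length: int = 16) -> str:
    """Return a deterministic, truncated SHA-256 token for `value`."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    # Truncation length is an assumption taken from the README examples.
    return f"[HASH:{digest[:length]}]"


token_a = hash_token("alice@example.com")
token_b = hash_token("alice@example.com")
print(token_a == token_b)  # True: equal inputs map to equal tokens
```

Because the mapping is deterministic, two records containing the same value can still be joined after masking. The flip side is that unsalted hashes of low-entropy values such as phone numbers can be recovered by enumeration.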
|
123
|
+
|
|
124
|
+
### 2. Scan Without Masking
|
|
125
|
+
|
|
126
|
+
```python
|
|
127
|
+
from datacloak import scan
|
|
128
|
+
|
|
129
|
+
findings = scan("Send invoice to billing@acme.com, call 9876543210")
|
|
130
|
+
print(findings)
|
|
131
|
+
# {
|
|
132
|
+
# "email": ["billing@acme.com"],
|
|
133
|
+
# "phone": ["9876543210"]
|
|
134
|
+
# }
|
|
135
|
+
```
|
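The shape of the returned dictionary can be reproduced with a short regex-based sketch. The patterns below are simplified stand-ins chosen for illustration, not the ones DataCloak ships, and `scan_sketch` and `PATTERNS` are hypothetical names.

```python
import re

# Simplified illustrative patterns (assumptions, not DataCloak's own).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"(?<!\d)[6-9]\d{9}(?!\d)"),  # 10 digits, starts 6-9
}


def scan_sketch(text: str) -> dict[str, list[str]]:
    """Return {pii_type: [matches]} without modifying the input text."""
    findings: dict[str, list[str]] = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:  # omit types with no hits, as in the example above
            findings[name] = matches
    return findings


print(scan_sketch("Send invoice to billing@acme.com, call 9876543210"))
# {'email': ['billing@acme.com'], 'phone': ['9876543210']}
```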
|
136
|
+
|
|
137
|
+
### 3. File Scanning
|
|
138
|
+
|
|
139
|
+
```python
|
|
140
|
+
from datacloak import scan_file
|
|
141
|
+
|
|
142
|
+
# Scan a plain text or log file
|
|
143
|
+
result = scan_file("application.log")
|
|
144
|
+
|
|
145
|
+
# Scan a CSV (each cell is scanned individually)
|
|
146
|
+
result = scan_file("customers.csv")
|
|
147
|
+
|
|
148
|
+
print(result.summary)
|
|
149
|
+
# {"email": 142, "phone": 38, "aadhaar": 5}
|
|
150
|
+
|
|
151
|
+
print(result.findings[0])
|
|
152
|
+
# FileFinding(email='alice@example.com' @customers.csv:line=2)
|
|
153
|
+
```
|
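Cell-by-cell CSV scanning can be pictured as a generator that yields one chunk per cell together with its location. The sketch below uses the standard `csv` module; the `(line, column, text)` tuple layout mirrors the three-value tuples yielded by the PDF handler example elsewhere in this README, but `csv_chunks` itself is a hypothetical helper and not part of DataCloak.

```python
import csv
import io


def csv_chunks(stream):
    """Yield (line_number, column_name, cell_text) for every non-empty cell."""
    reader = csv.DictReader(stream)
    for row in reader:
        for column, cell in row.items():
            # Skip empty cells and the None/list values DictReader produces
            # for ragged rows.
            if isinstance(cell, str) and cell:
                yield reader.line_num, column, cell


sample = io.StringIO("name,email\nAlice,alice@example.com\nBob,\n")
for chunk in csv_chunks(sample):
    print(chunk)
# (2, 'name', 'Alice')
# (2, 'email', 'alice@example.com')
# (3, 'name', 'Bob')
```

Each chunk could then be passed to a text scanner, so that a finding can be reported with the line it came from.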
|
154
|
+
|
|
155
|
+
### 4. Report Generation
|
|
156
|
+
|
|
157
|
+
```python
|
|
158
|
+
from datacloak import report
|
|
159
|
+
|
|
160
|
+
r = report(text, source_label="user_input")
|
|
161
|
+
print(r.to_json())
|
|
162
|
+
```
|
|
163
|
+
|
|
164
|
+
```json
|
|
165
|
+
{
|
|
166
|
+
"generated_at": "2025-06-01T10:23:00+00:00",
|
|
167
|
+
"source": "user_input",
|
|
168
|
+
"total_findings": 4,
|
|
169
|
+
"summary": {
|
|
170
|
+
"aadhaar": 1,
|
|
171
|
+
"email": 1,
|
|
172
|
+
"phone": 1,
|
|
173
|
+
"pan": 1
|
|
174
|
+
},
|
|
175
|
+
"details": { ... },
|
|
176
|
+
"risk_level": "MEDIUM"
|
|
177
|
+
}
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
Save to disk:
|
|
181
|
+
```python
|
|
182
|
+
r.save("pii_report.json")
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
---
|
|
186
|
+
|
|
187
|
+
## Command-Line Interface
|
|
188
|
+
|
|
189
|
+
DataCloak ships a full CLI via `datacloak`:
|
|
190
|
+
|
|
191
|
+
```bash
|
|
192
|
+
# Scan a file and display findings as a table
|
|
193
|
+
datacloak scan customers.txt
|
|
194
|
+
|
|
195
|
+
# Scan with JSON output
|
|
196
|
+
datacloak scan --format json customers.txt
|
|
197
|
+
|
|
198
|
+
# Mask a file (writes customers.masked.txt by default)
|
|
199
|
+
datacloak mask customers.txt
|
|
200
|
+
|
|
201
|
+
# Mask with full-redaction mode, specify output file
|
|
202
|
+
datacloak mask customers.txt --mode full --output clean.txt
|
|
203
|
+
|
|
204
|
+
# Print masked output to stdout (pipe-friendly)
|
|
205
|
+
datacloak mask customers.txt --stdout | grep "REDACTED"
|
|
206
|
+
|
|
207
|
+
# Generate a JSON report
|
|
208
|
+
datacloak report customers.txt
|
|
209
|
+
|
|
210
|
+
# Save report to file
|
|
211
|
+
datacloak report customers.txt --output report.json
|
|
212
|
+
|
|
213
|
+
# Verbose logging
|
|
214
|
+
datacloak -v scan customers.txt
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
---
|
|
218
|
+
|
|
219
|
+
## Writing a Custom Detector
|
|
220
|
+
|
|
221
|
+
DataCloak's detector framework is designed for extension. Subclass `BaseDetector`, set `_pattern`, and optionally override `_validate()`:
|
|
222
|
+
|
|
223
|
+
```python
|
|
224
|
+
import re
|
|
225
|
+
from datacloak.detectors import BaseDetector, Detection
|
|
226
|
+
from datacloak import scan
|
|
227
|
+
|
|
228
|
+
class PassportDetector(BaseDetector):
|
|
229
|
+
name = "indian_passport"
|
|
230
|
+
description = "Indian Passport Number (A-Z followed by 7 digits)"
|
|
231
|
+
_pattern = re.compile(r"\b[A-Z]\d{7}\b")
|
|
232
|
+
|
|
233
|
+
# Use alongside built-in detectors
|
|
234
|
+
from datacloak.detectors import DEFAULT_DETECTORS
|
|
235
|
+
|
|
236
|
+
my_detectors = DEFAULT_DETECTORS + [PassportDetector()]
|
|
237
|
+
findings = scan("Passport: A1234567", detectors=my_detectors)
|
|
238
|
+
# {"indian_passport": ["A1234567"]}
|
|
239
|
+
```
|
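The optional `_validate()` hook is not exercised in the example above. The following self-contained sketch shows how a pattern-plus-validation design of this kind typically works. `SketchDetector` is a stand-in written for this illustration and is not DataCloak's `BaseDetector`; the `find()` method name and the all-zero rejection rule are assumptions.

```python
import re


class SketchDetector:
    """Stand-in base class: regex match first, then a validation hook."""

    name = "generic"
    _pattern = re.compile(r"(?!)")  # never matches; subclasses override

    def _validate(self, candidate: str) -> bool:
        return True  # accept every regex match by default

    def find(self, text: str) -> list[str]:
        return [
            m.group(0)
            for m in self._pattern.finditer(text)
            if self._validate(m.group(0))
        ]


class PassportSketch(SketchDetector):
    name = "indian_passport"
    _pattern = re.compile(r"\b[A-Z]\d{7}\b")

    def _validate(self, candidate: str) -> bool:
        # Illustrative rule: reject obviously fake all-zero serials.
        return candidate[1:] != "0000000"


print(PassportSketch().find("Real: A1234567, placeholder: A0000000"))
# ['A1234567']
```

Keeping the regex permissive and pushing stricter checks into the validation step keeps each pattern readable while still filtering false positives.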
|
240
|
+
|
|
241
|
+
### Adding a new file format
|
|
242
|
+
|
|
243
|
+
```python
|
|
244
|
+
from datacloak.file_scanner import FileHandler, register_handler
|
|
245
|
+
from pathlib import Path
|
|
246
|
+
|
|
247
|
+
class PDFHandler(FileHandler):
|
|
248
|
+
extensions = (".pdf",)
|
|
249
|
+
|
|
250
|
+
def extract_chunks(self, path: Path):
|
|
251
|
+
# Use any PDF library (pdfplumber, PyMuPDF, etc.)
|
|
252
|
+
import pdfplumber
|
|
253
|
+
with pdfplumber.open(path) as pdf:
|
|
254
|
+
for page_num, page in enumerate(pdf.pages, start=1):
|
|
255
|
+
text = page.extract_text() or ""
|
|
256
|
+
yield page_num, None, text
|
|
257
|
+
|
|
258
|
+
register_handler(PDFHandler())
|
|
259
|
+
|
|
260
|
+
# Now scan_file("document.pdf") works automatically
|
|
261
|
+
```
|
|
262
|
+
|
|
263
|
+
---
|
|
264
|
+
|
|
265
|
+
## Supported PII Types
|
|
266
|
+
|
|
267
|
+
| Detector | Example | Validation |
|
|
268
|
+
|---|---|---|
|
|
269
|
+
| `aadhaar` | `2345 6789 0123` | 12 digits, starts 2-9, space/hyphen/plain |
|
|
270
|
+
| `pan` | `ABCPE1234F` | AAAAA9999A format, valid entity code |
|
|
271
|
+
| `phone` | `9876543210`, `+91 9876543210` | 10 digits, starts 6-9, optional country code |
|
|
272
|
+
| `email` | `alice@example.com` | RFC-5321 compliant |
|
|
273
|
+
| `upi_id` | `user@okaxis` | VPA format, non-email handles only |
|
|
274
|
+
| `credit_card` | `4111 1111 1111 1111` | 13-19 digits, Luhn algorithm validated |
|
|
275
|
+
| `ifsc` | `HDFC0001234` | 4-alpha + 0 + 6-alphanumeric |
|
|
276
|
+
| `ip_address` | `192.168.1.1`, `::1` | IPv4 (range-validated) and IPv6 |
|
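The Luhn check mentioned for `credit_card` is a standard checksum and can be sketched in a few lines. This is a generic implementation of the algorithm with the 13-19 digit length rule from the table; `luhn_valid` is a hypothetical name and the code is not taken from DataCloak.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True
print(luhn_valid("4111 1111 1111 1112"))  # False
```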
|
277
|
+
|
|
278
|
+
---
|
|
279
|
+
|
|
280
|
+
## Running Tests
|
|
281
|
+
|
|
282
|
+
```bash
|
|
283
|
+
# Clone the repo
|
|
284
|
+
git clone https://github.com/datacloak/datacloak.git
|
|
285
|
+
cd datacloak
|
|
286
|
+
|
|
287
|
+
# Install dev dependencies
|
|
288
|
+
pip install -e ".[dev]"
|
|
289
|
+
|
|
290
|
+
# Run tests
|
|
291
|
+
pytest
|
|
292
|
+
|
|
293
|
+
# Run with coverage
|
|
294
|
+
pytest --cov=datacloak --cov-report=term-missing
|
|
295
|
+
```
|
|
296
|
+
|
|
297
|
+
Target: **≥ 90% coverage**.
|
|
298
|
+
|
|
299
|
+
---
|
|
300
|
+
|
|
301
|
+
## Architecture
|
|
302
|
+
|
|
303
|
+
```
|
|
304
|
+
datacloak/
|
|
305
|
+
├── __init__.py  # Public API: mask(), scan(), report(), scan_file()
|
|
306
|
+
├── detectors/
|
|
307
|
+
│   ├── __init__.py  # Exports all detectors + DEFAULT_DETECTORS registry
|
|
308
|
+
│   ├── base.py  # BaseDetector, Detection dataclass
|
|
309
|
+
│   ├── aadhaar.py
|
|
310
|
+
│   ├── pan.py
|
|
311
|
+
│   ├── mobile.py
|
|
312
|
+
│   ├── email.py
|
|
313
|
+
│   ├── upi.py
|
|
314
|
+
│   ├── credit_card.py  # Luhn validation
|
|
315
|
+
│   ├── ifsc.py
|
|
316
|
+
│   └── ip_address.py  # IPv4 + IPv6
|
|
317
|
+
├── masker.py  # mask_text(), masking modes logic
|
|
318
|
+
├── scanner.py  # scan_text(), scan_summary()
|
|
319
|
+
├── file_scanner.py  # scan_file(), mask_file(), FileHandler interface
|
|
320
|
+
├── reporter.py  # Report dataclass, generate_report_*()
|
|
321
|
+
└── cli.py  # Click CLI: scan, mask, report commands
|
|
322
|
+
```
|
|
323
|
+
|
|
324
|
+
**Design principles applied:**
|
|
325
|
+
- Single Responsibility: each detector, masker, scanner, and reporter is self-contained
|
|
326
|
+
- Open/Closed: extend via `BaseDetector` or `FileHandler` without modifying core
|
|
327
|
+
- Liskov Substitution: any `BaseDetector` subclass drops in transparently
|
|
328
|
+
- Dependency Injection: all public functions accept `detectors=` for testability
|
|
329
|
+
- Logging: structured `logging` throughout, silent by default (NullHandler)
|
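The "silent by default" logging principle follows the usual library convention: attach a `NullHandler` to the library logger and let the application opt in. The sketch below shows both sides using only the standard `logging` module. The logger name `datacloak_sketch` and the `enable_debug_logging` helper are hypothetical; the logger name DataCloak actually uses is not stated here.

```python
import logging

# Library side: a NullHandler keeps the logger quiet unless the app opts in.
logger = logging.getLogger("datacloak_sketch")
logger.addHandler(logging.NullHandler())


def enable_debug_logging(name: str = "datacloak_sketch") -> logging.Handler:
    """Application side: route the library's records to stderr at DEBUG."""
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    lib_logger = logging.getLogger(name)
    lib_logger.addHandler(handler)
    lib_logger.setLevel(logging.DEBUG)
    return handler


handler = enable_debug_logging()
logger.debug("detector registry loaded")  # now visible on stderr
```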
|
330
|
+
|
|
331
|
+
---
|
|
332
|
+
|
|
333
|
+
## Publishing to PyPI
|
|
334
|
+
|
|
335
|
+
See [PUBLISHING.md](PUBLISHING.md) for a complete step-by-step guide.
|
|
336
|
+
|
|
337
|
+
```bash
|
|
338
|
+
# Quick summary
|
|
339
|
+
pip install build twine
|
|
340
|
+
python -m build
|
|
341
|
+
twine upload dist/*
|
|
342
|
+
```
|
|
343
|
+
|
|
344
|
+
---
|
|
345
|
+
|
|
346
|
+
## License
|
|
347
|
+
|
|
348
|
+
[MIT License](LICENSE). Copyright © 2025 DataCloak Contributors.
|
|
349
|
+
|
|
350
|
+
---
|
|
351
|
+
|
|
352
|
+
## Contributing
|
|
353
|
+
|
|
354
|
+
Contributions, issues, and feature requests are welcome! Please read the contributing guide and open a pull request.
|
|
355
|
+
|
|
356
|
+
1. Fork the repository
|
|
357
|
+
2. Create a feature branch: `git checkout -b feat/my-detector`
|
|
358
|
+
3. Write your code and tests
|
|
359
|
+
4. Run `pytest` and ensure coverage stays ≥ 90%
|
|
360
|
+
5. Open a pull request
|
|
361
|
+
|
|
362
|
+
---
|
|
363
|
+
|
|
364
|
+
*DataCloak: because privacy is not optional.*
|