shareclean 0.2.0__tar.gz

Files changed (52)
  1. shareclean-0.2.0/CHANGELOG.md +49 -0
  2. shareclean-0.2.0/CODE_OF_CONDUCT.md +23 -0
  3. shareclean-0.2.0/CONTRIBUTING.md +50 -0
  4. shareclean-0.2.0/LICENSE +21 -0
  5. shareclean-0.2.0/MANIFEST.in +11 -0
  6. shareclean-0.2.0/PKG-INFO +263 -0
  7. shareclean-0.2.0/README.md +230 -0
  8. shareclean-0.2.0/ROADMAP.md +22 -0
  9. shareclean-0.2.0/SECURITY.md +46 -0
  10. shareclean-0.2.0/docs/assets/shareclean-demo.svg +43 -0
  11. shareclean-0.2.0/docs/detection-rules.md +49 -0
  12. shareclean-0.2.0/docs/release-process.md +84 -0
  13. shareclean-0.2.0/pyproject.toml +61 -0
  14. shareclean-0.2.0/setup.cfg +4 -0
  15. shareclean-0.2.0/src/shareclean/__init__.py +4 -0
  16. shareclean-0.2.0/src/shareclean/__main__.py +5 -0
  17. shareclean-0.2.0/src/shareclean/cli.py +308 -0
  18. shareclean-0.2.0/src/shareclean/config.py +312 -0
  19. shareclean-0.2.0/src/shareclean/detectors.py +234 -0
  20. shareclean-0.2.0/src/shareclean/io_utils.py +124 -0
  21. shareclean-0.2.0/src/shareclean/models.py +65 -0
  22. shareclean-0.2.0/src/shareclean/py.typed +1 -0
  23. shareclean-0.2.0/src/shareclean/redactor.py +116 -0
  24. shareclean-0.2.0/src/shareclean/report.py +102 -0
  25. shareclean-0.2.0/src/shareclean/selectors.py +94 -0
  26. shareclean-0.2.0/src/shareclean.egg-info/PKG-INFO +263 -0
  27. shareclean-0.2.0/src/shareclean.egg-info/SOURCES.txt +50 -0
  28. shareclean-0.2.0/src/shareclean.egg-info/dependency_links.txt +1 -0
  29. shareclean-0.2.0/src/shareclean.egg-info/entry_points.txt +2 -0
  30. shareclean-0.2.0/src/shareclean.egg-info/requires.txt +3 -0
  31. shareclean-0.2.0/src/shareclean.egg-info/top_level.txt +1 -0
  32. shareclean-0.2.0/tests/__init__.py +0 -0
  33. shareclean-0.2.0/tests/fixtures/ci_cd/pipeline.env +2 -0
  34. shareclean-0.2.0/tests/fixtures/cloud/tokens.txt +2 -0
  35. shareclean-0.2.0/tests/fixtures/corpus_manifest.json +60 -0
  36. shareclean-0.2.0/tests/fixtures/databases/uris.txt +2 -0
  37. shareclean-0.2.0/tests/fixtures/expected_cleaned_log.txt +10 -0
  38. shareclean-0.2.0/tests/fixtures/false_positives/safe.txt +4 -0
  39. shareclean-0.2.0/tests/fixtures/generic/mixed.txt +5 -0
  40. shareclean-0.2.0/tests/fixtures/generic/private_key.txt +3 -0
  41. shareclean-0.2.0/tests/fixtures/logs/app.log +3 -0
  42. shareclean-0.2.0/tests/fixtures/saas/tokens.txt +2 -0
  43. shareclean-0.2.0/tests/fixtures/sample_log.txt +10 -0
  44. shareclean-0.2.0/tests/fixtures/yaml_json_env/config.yml +2 -0
  45. shareclean-0.2.0/tests/test_cli.py +227 -0
  46. shareclean-0.2.0/tests/test_config.py +166 -0
  47. shareclean-0.2.0/tests/test_detectors.py +119 -0
  48. shareclean-0.2.0/tests/test_fixtures.py +46 -0
  49. shareclean-0.2.0/tests/test_io_utils.py +337 -0
  50. shareclean-0.2.0/tests/test_properties.py +61 -0
  51. shareclean-0.2.0/tests/test_redactor.py +109 -0
  52. shareclean-0.2.0/tests/test_report.py +80 -0
@@ -0,0 +1,49 @@
1
+ # Changelog
2
+
3
+ All notable changes to ShareClean will be documented in this file.
4
+
5
+ This project follows a simple release format inspired by Keep a Changelog and uses semantic versioning for public releases.
6
+
7
+ ## 0.2.0 - 2026-07-02
8
+
9
+ ### Added
10
+
11
+ - Stable `SC###` detector IDs, categories, severities, and 1-based location ranges.
12
+ - Versioned JSON report schema `1.0` with privacy-preserving `source` labels.
13
+ - `--fail-on` and `--ignore-for-check` selectors for CI check policies.
14
+ - Project config support for `pyproject.toml` and `.shareclean.toml`, including profiles and environment variable overrides.
15
+ - `shareclean config show` for inspecting effective non-sensitive configuration.
16
+ - PEM private-key block detection.
17
+ - Fake-secret fixture corpus with manifest-driven regression tests.
18
+ - GitHub Release workflow for TestPyPI and PyPI Trusted Publishing without long-lived API tokens.
19
+
20
+ ### Changed
21
+
22
+ - Bumped package and CLI version to `0.2.0`.
23
+ - `pipx install shareclean` is now the intended install path after PyPI publication.
24
+ - Reports no longer include filenames or full input paths by default.
25
+ - Overlapping detections now emit one finding by severity and detector specificity.
26
+ - `--no-email` is now a deprecated alias for `--no-redact-email`.
27
+
28
+ ### Verified
29
+
30
+ - PyPI and TestPyPI package-name preflight returned 404 for `shareclean` on 2026-07-02, so the planned public package name appeared available before release workflow implementation.
31
+
32
+ ## 0.1.1 - 2026-07-01
33
+
34
+ ### Added
35
+
36
+ - `--redaction-label TEXT` for customizing the generic `[REDACTED]` label used by passwords, API keys, Bearer tokens, and connection string passwords.
37
+ - Interactive playground support for trying custom redaction labels in the browser demo.
38
+
39
+ ## 0.1.0 - 2026-07-01
40
+
41
+ ### Added
42
+
43
+ - Standard-library-only `shareclean` CLI.
44
+ - Redaction rules for key-value secrets, connection string passwords, Bearer tokens, JWT-like tokens, email addresses, local user paths, and opt-in private IP addresses.
45
+ - Human-readable and JSON reports that exclude original secret values.
46
+ - `--check`, `--output`, `--report`, `--report-format`, `--no-email`, and `--redact-private-ip` CLI options.
47
+ - Unit, integration, fixture-based, and property-style tests.
48
+ - Cross-platform line-ending preservation for file, stdin, and stdout workflows.
49
+ - GitHub-ready documentation, CI, security policy, and contribution guide.
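Note on the 0.2.0 "Changed" entry about overlapping detections: the changelog says one finding is emitted "by severity and detector specificity" but does not show how. The sketch below is one plausible reading, not the code in `detectors.py`. The `Candidate` class, the numeric `specificity` rank, the severity ranks, and the greedy selection strategy are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Severity names appear in the README rule table; the numeric ranks are assumed.
SEVERITY_RANK = {"medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Candidate:
    """Hypothetical detector match; holds offsets only, never the matched text."""
    rule_id: str
    severity: str
    specificity: int  # assumed tie-breaker: larger means more specific
    start: int        # inclusive offset
    end: int          # exclusive offset


def resolve_overlaps(candidates):
    """Keep one candidate per overlapping group: severity first, then specificity."""
    ranked = sorted(
        candidates,
        key=lambda c: (-SEVERITY_RANK[c.severity], -c.specificity, c.start),
    )
    kept = []
    for cand in ranked:
        # Half-open ranges are disjoint when one ends before the other starts.
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda c: c.start)


matches = [
    Candidate("SC001", "high", 1, 10, 40),      # key-value secret
    Candidate("SC004", "critical", 2, 18, 30),  # connection-string password inside it
    Candidate("SC005", "medium", 1, 50, 65),    # unrelated email address
]
winners = [c.rule_id for c in resolve_overlaps(matches)]
```

With these inputs the `critical` connection-string match wins over the `high` key-value match it overlaps, and the non-overlapping email match is kept as well.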
@@ -0,0 +1,23 @@
1
+ # Code of Conduct
2
+
3
+ ShareClean aims to be a practical, respectful project for developers who care about safer debugging and sharing workflows.
4
+
5
+ ## Expected Behavior
6
+
7
+ - Be respectful and constructive.
8
+ - Assume good intent, but be clear when something is unsafe or incorrect.
9
+ - Use fake examples when discussing secrets, credentials, logs, or customer data.
10
+ - Keep feedback focused on the work and its impact.
11
+
12
+ ## Unacceptable Behavior
13
+
14
+ - Harassment, insults, threats, or discriminatory language.
15
+ - Posting real credentials, private logs, personal data, or customer data.
16
+ - Publicly disclosing a suspected vulnerability before the maintainer has had a reasonable chance to respond.
17
+ - Repeatedly derailing technical discussions.
18
+
19
+ ## Reporting
20
+
21
+ For conduct concerns, open a minimal issue asking for maintainer contact without posting private details.
22
+
23
+ For security concerns, follow [SECURITY.md](SECURITY.md).
@@ -0,0 +1,50 @@
1
+ # Contributing
2
+
3
+ Thanks for helping make ShareClean safer and more useful.
4
+
5
+ ## Development Setup
6
+
7
+ ```bash
8
+ git clone https://github.com/OmarH-creator/ShareClean.git
9
+ cd ShareClean
10
+ python -m pip install -e .
11
+ python -m unittest discover -s tests -v
12
+ ```
13
+
14
+ No third-party runtime or test dependencies are required.
15
+
16
+ ## Project Principles
17
+
18
+ - Keep ShareClean local-first: no network calls, accounts, telemetry, or remote scanning.
19
+ - Preserve debugging context wherever possible. Redact the sensitive value, not the whole line.
20
+ - Never store or print original matched secret values in `Finding`, reports, exceptions, or test output.
21
+ - Use clearly fake test values only.
22
+ - Prefer precise, readable standard-library code over broad dependencies.
23
+ - Add tests for every detector or behavior change.
24
+
25
+ ## Adding Or Changing A Detector
26
+
27
+ 1. Add or update the rule in `src/shareclean/detectors.py`.
28
+ 2. Keep rule order intentional. More specific rules should run before broader rules.
29
+ 3. Add positive and negative detector tests.
30
+ 4. Add redactor or property-style tests when behavior spans multiple modules.
31
+ 5. Confirm reports never include the raw matched value.
32
+
33
+ Run:
34
+
35
+ ```bash
36
+ python -m unittest discover -s tests -v
37
+ python -m compileall -q src tests
38
+ ```
39
+
40
+ ## Pull Request Checklist
41
+
42
+ - Tests pass locally.
43
+ - New behavior is covered by tests.
44
+ - Documentation is updated when CLI behavior or detection behavior changes.
45
+ - Examples use fake values only.
46
+ - The PR does not add telemetry, network behavior, or unnecessary dependencies.
47
+
48
+ ## Security Issues
49
+
50
+ Do not put real secrets or production logs in GitHub issues or pull requests. Read [SECURITY.md](SECURITY.md) first.
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Omar Hassan Elsayed
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,11 @@
1
+ include README.md
2
+ include LICENSE
3
+ include CHANGELOG.md
4
+ include SECURITY.md
5
+ include CONTRIBUTING.md
6
+ include CODE_OF_CONDUCT.md
7
+ include ROADMAP.md
8
+ recursive-include docs *.md
9
+ recursive-include docs/assets *.svg
10
+ recursive-include tests *.py
11
+ recursive-include tests/fixtures *.txt *.json *.env *.log *.yml
@@ -0,0 +1,263 @@
1
+ Metadata-Version: 2.4
2
+ Name: shareclean
3
+ Version: 0.2.0
4
+ Summary: Local-first CLI for sanitizing logs, stack traces, and text before sharing.
5
+ Author: Omar Hassan Elsayed
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://omarh-creator.github.io/ShareClean/
8
+ Project-URL: Repository, https://github.com/OmarH-creator/ShareClean
9
+ Project-URL: Documentation, https://omarh-creator.github.io/ShareClean/
10
+ Project-URL: Issues, https://github.com/OmarH-creator/ShareClean/issues
11
+ Project-URL: Changelog, https://github.com/OmarH-creator/ShareClean/blob/main/CHANGELOG.md
12
+ Project-URL: Security, https://github.com/OmarH-creator/ShareClean/security
13
+ Keywords: cli,logs,privacy,redaction,security,secrets
14
+ Classifier: Development Status :: 3 - Alpha
15
+ Classifier: Environment :: Console
16
+ Classifier: Intended Audience :: Developers
17
+ Classifier: Operating System :: OS Independent
18
+ Classifier: Programming Language :: Python :: 3
19
+ Classifier: Programming Language :: Python :: 3 :: Only
20
+ Classifier: Programming Language :: Python :: 3.10
21
+ Classifier: Programming Language :: Python :: 3.11
22
+ Classifier: Programming Language :: Python :: 3.12
23
+ Classifier: Programming Language :: Python :: 3.13
24
+ Classifier: Topic :: Security
25
+ Classifier: Topic :: Software Development :: Debuggers
26
+ Classifier: Topic :: Text Processing
27
+ Classifier: Typing :: Typed
28
+ Requires-Python: >=3.10
29
+ Description-Content-Type: text/markdown
30
+ License-File: LICENSE
31
+ Requires-Dist: tomli>=2; python_version < "3.11"
32
+ Dynamic: license-file
33
+
34
+ # ShareClean
35
+
36
+ [![CI](https://github.com/OmarH-creator/ShareClean/actions/workflows/ci.yml/badge.svg)](https://github.com/OmarH-creator/ShareClean/actions/workflows/ci.yml)
37
+ ![Python](https://img.shields.io/badge/python-3.10%2B-blue)
38
+ ![License](https://img.shields.io/badge/license-MIT-green)
39
+ [![Release](https://img.shields.io/github/v/release/OmarH-creator/ShareClean)](https://github.com/OmarH-creator/ShareClean/releases)
40
+ [![Live demo](https://img.shields.io/badge/live-demo-4fd1c5)](https://omarh-creator.github.io/ShareClean/)
41
+
42
+ Local-first Python CLI for sanitizing logs, stack traces, config snippets, and terminal output before you paste them into GitHub issues, support tickets, Slack, or AI chats.
43
+
44
+ ShareClean detects common sensitive values, replaces only the risky portion, and reports safe metadata without storing or printing the original secret. It makes no network calls and sends no telemetry.
45
+
46
+ [Try the interactive browser playground](https://omarh-creator.github.io/ShareClean/) to see the redaction rules before installing.
47
+
48
+ ![ShareClean browser playground demo](docs/assets/shareclean-showcase.gif)
49
+
50
+ Browser playground shown for illustration; real workflows run locally through the CLI.
51
+
52
+ ## Install
53
+
54
+ With `pipx`:
55
+
56
+ ```bash
57
+ pipx install shareclean
58
+ ```
59
+
60
+ From a local checkout:
61
+
62
+ ```bash
63
+ python -m pip install -e .
64
+ ```
65
+
66
+ Run without installing from the repository root:
67
+
68
+ ```bash
69
+ python -m shareclean --help
70
+ ```
71
+
72
+ ## Quick Start
73
+
74
+ ```bash
75
+ shareclean app.log
76
+ shareclean app.log --output app.cleaned.log
77
+ shareclean app.log --report
78
+ shareclean app.log --report --report-format json
79
+ shareclean app.log --check
80
+ shareclean app.log --check --fail-on severity:high
81
+ shareclean app.log --check --fail-on category:token,rule:SC004
82
+ shareclean app.log --check --ignore-for-check category:pii_email
83
+ ```
84
+
85
+ `--check` exits `1` only for findings selected by the check policy and never writes sanitized text to stdout.
86
+
87
+ ## Configuration
88
+
89
+ ShareClean supports committed project policy in either `pyproject.toml` or `.shareclean.toml`.
90
+
91
+ ```toml
92
+ [tool.shareclean]
93
+ redact_email = true
94
+ redact_private_ip = false
95
+ redaction_label = "[REDACTED]"
96
+ profile = "default"
97
+
98
+ [tool.shareclean.profiles.ci]
99
+ redact_email = true
100
+ redact_private_ip = true
101
+ fail_on = ["severity:high"]
102
+ ```
103
+
104
+ For `.shareclean.toml`, omit the `tool.shareclean` prefix:
105
+
106
+ ```toml
107
+ redact_email = true
108
+ redact_private_ip = false
109
+
110
+ [profiles.ci]
111
+ redact_private_ip = true
112
+ fail_on = ["severity:high"]
113
+ ```
114
+
115
+ Config location:
116
+
117
+ 1. `--config PATH`
118
+ 2. Nearest project directory containing `.shareclean.toml` or a `pyproject.toml` with `[tool.shareclean]`
119
+ 3. Defaults
120
+
121
+ Auto-discovery walks upward from the current directory until the Git root or filesystem root. It uses only the nearest config directory and never merges parent configs. If `.shareclean.toml` and ShareClean config in `pyproject.toml` exist in the same selected directory, ShareClean exits `2`.
122
+
123
+ Config precedence:
124
+
125
+ 1. CLI flags
126
+ 2. Environment variables
127
+ 3. Selected profile values
128
+ 4. Base project config
129
+ 5. Defaults
130
+
131
+ Environment variables:
132
+
133
+ - `SHARECLEAN_REDACT_EMAIL`
134
+ - `SHARECLEAN_REDACT_PRIVATE_IP`
135
+ - `SHARECLEAN_REDACTION_LABEL`
136
+ - `SHARECLEAN_PROFILE`
137
+ - `SHARECLEAN_FAIL_ON`
138
+ - `SHARECLEAN_IGNORE_FOR_CHECK`
139
+
140
+ Boolean environment values accept `true`, `1`, `yes`, `on`, `false`, `0`, `no`, and `off`.
141
+
142
+ Inspect effective configuration without reading input:
143
+
144
+ ```bash
145
+ shareclean config show
146
+ ```
147
+
148
+ ## Detection Rules
149
+
150
+ | Rule ID | Detector | Category | Severity |
151
+ |---|---|---|---|
152
+ | `SC001` | Key-value secret | `credential` | `high` |
153
+ | `SC002` | Bearer token | `token` | `high` |
154
+ | `SC003` | JWT-like token | `token` | `high` |
155
+ | `SC004` | Connection-string password | `connection_string` | `critical` |
156
+ | `SC005` | Email address | `pii_email` | `medium` |
157
+ | `SC006` | Local user path | `pii_path` | `medium` |
158
+ | `SC007` | Private IP address | `internal_network` | `medium` |
159
+ | `SC008` | PEM private-key block | `private_key` | `critical` |
160
+
161
+ Private IP detection is off by default; enable it with `--redact-private-ip` or config.
162
+
163
+ When detectors overlap on the same text range, ShareClean emits one finding using the highest-severity rule. If severities match, it uses the most specific detector.
164
+
165
+ ## JSON Reports
166
+
167
+ JSON reports use schema version `1.0` and do not include filenames, paths, matched values, hashes, source snippets, or masked previews.
168
+
169
+ ```json
170
+ {
171
+ "schema_version": "1.0",
172
+ "source": "file",
173
+ "summary": {
174
+ "findings": 1,
175
+ "by_category": {
176
+ "credential": 1
177
+ },
178
+ "by_severity": {
179
+ "high": 1
180
+ }
181
+ },
182
+ "findings": [
183
+ {
184
+ "rule_id": "SC001",
185
+ "category": "credential",
186
+ "severity": "high",
187
+ "location": {
188
+ "start": {
189
+ "line": 1,
190
+ "column": 10
191
+ },
192
+ "end": {
193
+ "line": 1,
194
+ "column": 27
195
+ }
196
+ },
197
+ "replacement": "[REDACTED]"
198
+ }
199
+ ]
200
+ }
201
+ ```
202
+
203
+ Locations are 1-based. End positions are exclusive. Columns count Unicode code points after treating CRLF as one LF newline for location purposes.
204
+
205
+ ## CLI Reference
206
+
207
+ ```text
208
+ usage: shareclean [-h] [--version] [--check] [--output FILE] [--report]
209
+ [--report-format {text,json}] [--config FILE]
210
+ [--profile NAME] [--redact-email] [--no-redact-email]
211
+ [--redact-private-ip] [--no-redact-private-ip]
212
+ [--redaction-label TEXT] [--fail-on SELECTORS]
213
+ [--ignore-for-check SELECTORS]
214
+ [FILE]
215
+ ```
216
+
217
+ `--no-email` remains as a deprecated alias for `--no-redact-email`.
218
+
219
+ Exit codes:
220
+
221
+ | Code | Meaning |
222
+ |---:|---|
223
+ | `0` | Completed successfully |
224
+ | `1` | Selected findings detected in `--check` mode |
225
+ | `2` | User, I/O, config, or selector error |
226
+ | `3` | Unexpected internal error |
227
+
228
+ ## Safety Model
229
+
230
+ ShareClean is intentionally local and transparent:
231
+
232
+ - No network calls
233
+ - No cloud processing
234
+ - No telemetry
235
+ - No account or API key required
236
+ - Original matched secret values are not stored in findings or reports
237
+ - Input files are never modified in place
238
+
239
+ ## Coverage And Limitations
240
+
241
+ ShareClean is pattern-based. It can miss unusual formats and can redact benign text that resembles a secret. It is not a replacement for repository secret scanners, source-history scanning, or DLP systems.
242
+
243
+ The test corpus under `tests/fixtures/` uses only fake values and is split into generic, cloud, database, CI/CD, SaaS, log, YAML/JSON/env, and false-positive packs. Bug reports that change detection should add a regression fixture using clearly fake data.
244
+
245
+ ## Development
246
+
247
+ Run the test suite:
248
+
249
+ ```bash
250
+ python -m unittest discover -s tests -v
251
+ ```
252
+
253
+ Run packaging checks:
254
+
255
+ ```bash
256
+ python -m compileall -q src tests
257
+ python -m build
258
+ python -m twine check dist/*
259
+ ```
260
+
261
+ ## License
262
+
263
+ ShareClean is released under the [MIT License](LICENSE).
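Note on the "Config precedence" and "Environment variables" sections above: the five-layer order (CLI flags, environment variables, selected profile, base project config, defaults) and the accepted boolean spellings can be sketched as follows. This is a minimal illustration, not the contents of `config.py`. The function names, the already-normalized `env` dictionary, the default values, and the case-insensitive parsing are assumptions.

```python
from collections import ChainMap

# Accepted spellings are taken from the README.
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool(raw):
    """Parse a boolean environment value using the documented spellings."""
    value = raw.strip().lower()  # trimming and case-folding are assumptions
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {raw!r}")


def effective_config(cli, env, profile, base, defaults):
    """Merge layers; ChainMap returns the first mapping that defines a key."""
    return dict(ChainMap(cli, env, profile, base, defaults))


# Assumed defaults; only "private IP off by default" is stated in the README.
defaults = {
    "redact_email": True,
    "redact_private_ip": False,
    "redaction_label": "[REDACTED]",
}
base = {"redact_private_ip": False}
profile = {"redact_private_ip": True, "fail_on": ["severity:high"]}
env = {"redact_email": parse_bool("off")}  # e.g. SHARECLEAN_REDACT_EMAIL=off
cli = {"redaction_label": "[HIDDEN]"}

config = effective_config(cli, env, profile, base, defaults)
```

Here the profile overrides the base value for `redact_private_ip`, the environment overrides the default for `redact_email`, and the CLI label wins over everything else.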
@@ -0,0 +1,230 @@
1
+ # ShareClean
2
+
3
+ [![CI](https://github.com/OmarH-creator/ShareClean/actions/workflows/ci.yml/badge.svg)](https://github.com/OmarH-creator/ShareClean/actions/workflows/ci.yml)
4
+ ![Python](https://img.shields.io/badge/python-3.10%2B-blue)
5
+ ![License](https://img.shields.io/badge/license-MIT-green)
6
+ [![Release](https://img.shields.io/github/v/release/OmarH-creator/ShareClean)](https://github.com/OmarH-creator/ShareClean/releases)
7
+ [![Live demo](https://img.shields.io/badge/live-demo-4fd1c5)](https://omarh-creator.github.io/ShareClean/)
8
+
9
+ Local-first Python CLI for sanitizing logs, stack traces, config snippets, and terminal output before you paste them into GitHub issues, support tickets, Slack, or AI chats.
10
+
11
+ ShareClean detects common sensitive values, replaces only the risky portion, and reports safe metadata without storing or printing the original secret. It makes no network calls and sends no telemetry.
12
+
13
+ [Try the interactive browser playground](https://omarh-creator.github.io/ShareClean/) to see the redaction rules before installing.
14
+
15
+ ![ShareClean browser playground demo](docs/assets/shareclean-showcase.gif)
16
+
17
+ Browser playground shown for illustration; real workflows run locally through the CLI.
18
+
19
+ ## Install
20
+
21
+ With `pipx`:
22
+
23
+ ```bash
24
+ pipx install shareclean
25
+ ```
26
+
27
+ From a local checkout:
28
+
29
+ ```bash
30
+ python -m pip install -e .
31
+ ```
32
+
33
+ Run without installing from the repository root:
34
+
35
+ ```bash
36
+ python -m shareclean --help
37
+ ```
38
+
39
+ ## Quick Start
40
+
41
+ ```bash
42
+ shareclean app.log
43
+ shareclean app.log --output app.cleaned.log
44
+ shareclean app.log --report
45
+ shareclean app.log --report --report-format json
46
+ shareclean app.log --check
47
+ shareclean app.log --check --fail-on severity:high
48
+ shareclean app.log --check --fail-on category:token,rule:SC004
49
+ shareclean app.log --check --ignore-for-check category:pii_email
50
+ ```
51
+
52
+ `--check` exits `1` only for findings selected by the check policy and never writes sanitized text to stdout.
53
+
54
+ ## Configuration
55
+
56
+ ShareClean supports committed project policy in either `pyproject.toml` or `.shareclean.toml`.
57
+
58
+ ```toml
59
+ [tool.shareclean]
60
+ redact_email = true
61
+ redact_private_ip = false
62
+ redaction_label = "[REDACTED]"
63
+ profile = "default"
64
+
65
+ [tool.shareclean.profiles.ci]
66
+ redact_email = true
67
+ redact_private_ip = true
68
+ fail_on = ["severity:high"]
69
+ ```
70
+
71
+ For `.shareclean.toml`, omit the `tool.shareclean` prefix:
72
+
73
+ ```toml
74
+ redact_email = true
75
+ redact_private_ip = false
76
+
77
+ [profiles.ci]
78
+ redact_private_ip = true
79
+ fail_on = ["severity:high"]
80
+ ```
81
+
82
+ Config location:
83
+
84
+ 1. `--config PATH`
85
+ 2. Nearest project directory containing `.shareclean.toml` or a `pyproject.toml` with `[tool.shareclean]`
86
+ 3. Defaults
87
+
88
+ Auto-discovery walks upward from the current directory until the Git root or filesystem root. It uses only the nearest config directory and never merges parent configs. If `.shareclean.toml` and ShareClean config in `pyproject.toml` exist in the same selected directory, ShareClean exits `2`.
89
+
90
+ Config precedence:
91
+
92
+ 1. CLI flags
93
+ 2. Environment variables
94
+ 3. Selected profile values
95
+ 4. Base project config
96
+ 5. Defaults
97
+
98
+ Environment variables:
99
+
100
+ - `SHARECLEAN_REDACT_EMAIL`
101
+ - `SHARECLEAN_REDACT_PRIVATE_IP`
102
+ - `SHARECLEAN_REDACTION_LABEL`
103
+ - `SHARECLEAN_PROFILE`
104
+ - `SHARECLEAN_FAIL_ON`
105
+ - `SHARECLEAN_IGNORE_FOR_CHECK`
106
+
107
+ Boolean environment values accept `true`, `1`, `yes`, `on`, `false`, `0`, `no`, and `off`.
108
+
109
+ Inspect effective configuration without reading input:
110
+
111
+ ```bash
112
+ shareclean config show
113
+ ```
114
+
115
+ ## Detection Rules
116
+
117
+ | Rule ID | Detector | Category | Severity |
118
+ |---|---|---|---|
119
+ | `SC001` | Key-value secret | `credential` | `high` |
120
+ | `SC002` | Bearer token | `token` | `high` |
121
+ | `SC003` | JWT-like token | `token` | `high` |
122
+ | `SC004` | Connection-string password | `connection_string` | `critical` |
123
+ | `SC005` | Email address | `pii_email` | `medium` |
124
+ | `SC006` | Local user path | `pii_path` | `medium` |
125
+ | `SC007` | Private IP address | `internal_network` | `medium` |
126
+ | `SC008` | PEM private-key block | `private_key` | `critical` |
127
+
128
+ Private IP detection is off by default; enable it with `--redact-private-ip` or config.
129
+
130
+ When detectors overlap on the same text range, ShareClean emits one finding using the highest-severity rule. If severities match, it uses the most specific detector.
131
+
132
+ ## JSON Reports
133
+
134
+ JSON reports use schema version `1.0` and do not include filenames, paths, matched values, hashes, source snippets, or masked previews.
135
+
136
+ ```json
137
+ {
138
+ "schema_version": "1.0",
139
+ "source": "file",
140
+ "summary": {
141
+ "findings": 1,
142
+ "by_category": {
143
+ "credential": 1
144
+ },
145
+ "by_severity": {
146
+ "high": 1
147
+ }
148
+ },
149
+ "findings": [
150
+ {
151
+ "rule_id": "SC001",
152
+ "category": "credential",
153
+ "severity": "high",
154
+ "location": {
155
+ "start": {
156
+ "line": 1,
157
+ "column": 10
158
+ },
159
+ "end": {
160
+ "line": 1,
161
+ "column": 27
162
+ }
163
+ },
164
+ "replacement": "[REDACTED]"
165
+ }
166
+ ]
167
+ }
168
+ ```
169
+
170
+ Locations are 1-based. End positions are exclusive. Columns count Unicode code points after treating CRLF as one LF newline for location purposes.
171
+
172
+ ## CLI Reference
173
+
174
+ ```text
175
+ usage: shareclean [-h] [--version] [--check] [--output FILE] [--report]
176
+ [--report-format {text,json}] [--config FILE]
177
+ [--profile NAME] [--redact-email] [--no-redact-email]
178
+ [--redact-private-ip] [--no-redact-private-ip]
179
+ [--redaction-label TEXT] [--fail-on SELECTORS]
180
+ [--ignore-for-check SELECTORS]
181
+ [FILE]
182
+ ```
183
+
184
+ `--no-email` remains as a deprecated alias for `--no-redact-email`.
185
+
186
+ Exit codes:
187
+
188
+ | Code | Meaning |
189
+ |---:|---|
190
+ | `0` | Completed successfully |
191
+ | `1` | Selected findings detected in `--check` mode |
192
+ | `2` | User, I/O, config, or selector error |
193
+ | `3` | Unexpected internal error |
194
+
195
+ ## Safety Model
196
+
197
+ ShareClean is intentionally local and transparent:
198
+
199
+ - No network calls
200
+ - No cloud processing
201
+ - No telemetry
202
+ - No account or API key required
203
+ - Original matched secret values are not stored in findings or reports
204
+ - Input files are never modified in place
205
+
206
+ ## Coverage And Limitations
207
+
208
+ ShareClean is pattern-based. It can miss unusual formats and can redact benign text that resembles a secret. It is not a replacement for repository secret scanners, source-history scanning, or DLP systems.
209
+
210
+ The test corpus under `tests/fixtures/` uses only fake values and is split into generic, cloud, database, CI/CD, SaaS, log, YAML/JSON/env, and false-positive packs. Bug reports that change detection should add a regression fixture using clearly fake data.
211
+
212
+ ## Development
213
+
214
+ Run the test suite:
215
+
216
+ ```bash
217
+ python -m unittest discover -s tests -v
218
+ ```
219
+
220
+ Run packaging checks:
221
+
222
+ ```bash
223
+ python -m compileall -q src tests
224
+ python -m build
225
+ python -m twine check dist/*
226
+ ```
227
+
228
+ ## License
229
+
230
+ ShareClean is released under the [MIT License](LICENSE).
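Note on the "JSON Reports" location rules above (1-based positions, exclusive end, columns counted in Unicode code points with CRLF treated as one newline): the conversion from a character offset to a `line`/`column` pair could look roughly like this. The helper name `to_location` is hypothetical, and the sketch assumes offsets are measured in the CRLF-normalized text; the package may compute this differently.

```python
def to_location(text, offset):
    """Map a code-point offset in CRLF-normalized text to a 1-based (line, column)."""
    normalized = text.replace("\r\n", "\n")  # CRLF counts as a single newline
    before = normalized[:offset]
    line = before.count("\n") + 1
    # rfind returns -1 when there is no earlier newline, so the line starts at 0.
    line_start = before.rfind("\n") + 1
    column = offset - line_start + 1
    return line, column


# Clearly fake sample input with Windows line endings.
sample = "user=demo\r\npassword=fake-value-123\r\n"
normalized = sample.replace("\r\n", "\n")
value = "fake-value-123"
start_offset = normalized.index(value)
end_offset = start_offset + len(value)  # exclusive end

start = to_location(sample, start_offset)
end = to_location(sample, end_offset)
```

Python strings index by code point, so no extra handling is needed for non-ASCII text in this sketch. Because the end is exclusive, `end` points one column past the last character of the value.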
@@ -0,0 +1,22 @@
1
+ # Roadmap
2
+
3
+ ShareClean is intentionally small: local-first, standard-library-only, and focused on making text safer to share. These are the most likely next improvements.
4
+
5
+ ## Near Term
6
+
7
+ - Add more targeted detectors for common cloud and SaaS token shapes while keeping examples fake.
8
+ - Add optional allowlist support for values users intentionally want to preserve.
9
+ - Add optional SARIF output for code-scanning style integrations.
10
+ - Add repository-relative source paths as an explicit opt-in report mode for CI.
11
+ - Add `--report-format jsonl` for batch and stream processing.
12
+
13
+ ## Maybe Later
14
+
15
+ - More fixture packs for common log formats.
16
+
17
+ ## Non-Goals
18
+
19
+ - No telemetry.
20
+ - No network scanning.
21
+ - No credential validation against external services.
22
+ - No claim that ShareClean replaces dedicated secret scanners such as repository scanners or data loss prevention systems.