shareclean 0.2.0__tar.gz
- shareclean-0.2.0/CHANGELOG.md +49 -0
- shareclean-0.2.0/CODE_OF_CONDUCT.md +23 -0
- shareclean-0.2.0/CONTRIBUTING.md +50 -0
- shareclean-0.2.0/LICENSE +21 -0
- shareclean-0.2.0/MANIFEST.in +11 -0
- shareclean-0.2.0/PKG-INFO +263 -0
- shareclean-0.2.0/README.md +230 -0
- shareclean-0.2.0/ROADMAP.md +22 -0
- shareclean-0.2.0/SECURITY.md +46 -0
- shareclean-0.2.0/docs/assets/shareclean-demo.svg +43 -0
- shareclean-0.2.0/docs/detection-rules.md +49 -0
- shareclean-0.2.0/docs/release-process.md +84 -0
- shareclean-0.2.0/pyproject.toml +61 -0
- shareclean-0.2.0/setup.cfg +4 -0
- shareclean-0.2.0/src/shareclean/__init__.py +4 -0
- shareclean-0.2.0/src/shareclean/__main__.py +5 -0
- shareclean-0.2.0/src/shareclean/cli.py +308 -0
- shareclean-0.2.0/src/shareclean/config.py +312 -0
- shareclean-0.2.0/src/shareclean/detectors.py +234 -0
- shareclean-0.2.0/src/shareclean/io_utils.py +124 -0
- shareclean-0.2.0/src/shareclean/models.py +65 -0
- shareclean-0.2.0/src/shareclean/py.typed +1 -0
- shareclean-0.2.0/src/shareclean/redactor.py +116 -0
- shareclean-0.2.0/src/shareclean/report.py +102 -0
- shareclean-0.2.0/src/shareclean/selectors.py +94 -0
- shareclean-0.2.0/src/shareclean.egg-info/PKG-INFO +263 -0
- shareclean-0.2.0/src/shareclean.egg-info/SOURCES.txt +50 -0
- shareclean-0.2.0/src/shareclean.egg-info/dependency_links.txt +1 -0
- shareclean-0.2.0/src/shareclean.egg-info/entry_points.txt +2 -0
- shareclean-0.2.0/src/shareclean.egg-info/requires.txt +3 -0
- shareclean-0.2.0/src/shareclean.egg-info/top_level.txt +1 -0
- shareclean-0.2.0/tests/__init__.py +0 -0
- shareclean-0.2.0/tests/fixtures/ci_cd/pipeline.env +2 -0
- shareclean-0.2.0/tests/fixtures/cloud/tokens.txt +2 -0
- shareclean-0.2.0/tests/fixtures/corpus_manifest.json +60 -0
- shareclean-0.2.0/tests/fixtures/databases/uris.txt +2 -0
- shareclean-0.2.0/tests/fixtures/expected_cleaned_log.txt +10 -0
- shareclean-0.2.0/tests/fixtures/false_positives/safe.txt +4 -0
- shareclean-0.2.0/tests/fixtures/generic/mixed.txt +5 -0
- shareclean-0.2.0/tests/fixtures/generic/private_key.txt +3 -0
- shareclean-0.2.0/tests/fixtures/logs/app.log +3 -0
- shareclean-0.2.0/tests/fixtures/saas/tokens.txt +2 -0
- shareclean-0.2.0/tests/fixtures/sample_log.txt +10 -0
- shareclean-0.2.0/tests/fixtures/yaml_json_env/config.yml +2 -0
- shareclean-0.2.0/tests/test_cli.py +227 -0
- shareclean-0.2.0/tests/test_config.py +166 -0
- shareclean-0.2.0/tests/test_detectors.py +119 -0
- shareclean-0.2.0/tests/test_fixtures.py +46 -0
- shareclean-0.2.0/tests/test_io_utils.py +337 -0
- shareclean-0.2.0/tests/test_properties.py +61 -0
- shareclean-0.2.0/tests/test_redactor.py +109 -0
- shareclean-0.2.0/tests/test_report.py +80 -0
|
@@ -0,0 +1,49 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to ShareClean will be documented in this file.
|
|
4
|
+
|
|
5
|
+
This project follows a simple release format inspired by Keep a Changelog and uses semantic versioning for public releases.
|
|
6
|
+
|
|
7
|
+
## 0.2.0 - 2026-07-02
|
|
8
|
+
|
|
9
|
+
### Added
|
|
10
|
+
|
|
11
|
+
- Stable `SC###` detector IDs, categories, severities, and 1-based location ranges.
|
|
12
|
+
- Versioned JSON report schema `1.0` with privacy-preserving `source` labels.
|
|
13
|
+
- `--fail-on` and `--ignore-for-check` selectors for CI check policies.
|
|
14
|
+
- Project config support for `pyproject.toml` and `.shareclean.toml`, including profiles and environment variable overrides.
|
|
15
|
+
- `shareclean config show` for inspecting effective non-sensitive configuration.
|
|
16
|
+
- PEM private-key block detection.
|
|
17
|
+
- Fake-secret fixture corpus with manifest-driven regression tests.
|
|
18
|
+
- GitHub Release workflow for TestPyPI and PyPI Trusted Publishing without long-lived API tokens.
|
|
19
|
+
|
|
20
|
+
### Changed
|
|
21
|
+
|
|
22
|
+
- Bumped package and CLI version to `0.2.0`.
|
|
23
|
+
- `pipx install shareclean` is now the intended install path after PyPI publication.
|
|
24
|
+
- Reports no longer include filenames or full input paths by default.
|
|
25
|
+
- Overlapping detections now emit a single finding, selected by severity and then by detector specificity.
|
|
26
|
+
- `--no-email` is now a deprecated alias for `--no-redact-email`.
|
|
27
|
+
|
|
28
|
+
### Verified
|
|
29
|
+
|
|
30
|
+
- PyPI and TestPyPI package-name preflight returned 404 for `shareclean` on 2026-07-02, so the planned public package name appeared available before release workflow implementation.
|
|
31
|
+
|
|
32
|
+
## 0.1.1 - 2026-07-01
|
|
33
|
+
|
|
34
|
+
### Added
|
|
35
|
+
|
|
36
|
+
- `--redaction-label TEXT` for customizing the generic `[REDACTED]` label used by passwords, API keys, Bearer tokens, and connection string passwords.
|
|
37
|
+
- Interactive playground support for trying custom redaction labels in the browser demo.
|
|
38
|
+
|
|
39
|
+
## 0.1.0 - 2026-07-01
|
|
40
|
+
|
|
41
|
+
### Added
|
|
42
|
+
|
|
43
|
+
- Standard-library-only `shareclean` CLI.
|
|
44
|
+
- Redaction rules for key-value secrets, connection string passwords, Bearer tokens, JWT-like tokens, email addresses, local user paths, and opt-in private IP addresses.
|
|
45
|
+
- Human-readable and JSON reports that exclude original secret values.
|
|
46
|
+
- `--check`, `--output`, `--report`, `--report-format`, `--no-email`, and `--redact-private-ip` CLI options.
|
|
47
|
+
- Unit, integration, fixture-based, and property-style tests.
|
|
48
|
+
- Cross-platform line-ending preservation for file, stdin, and stdout workflows.
|
|
49
|
+
- GitHub-ready documentation, CI, security policy, and contribution guide.
|
|
@@ -0,0 +1,23 @@
|
|
|
1
|
+
# Code of Conduct
|
|
2
|
+
|
|
3
|
+
ShareClean aims to be a practical, respectful project for developers who care about safer debugging and sharing workflows.
|
|
4
|
+
|
|
5
|
+
## Expected Behavior
|
|
6
|
+
|
|
7
|
+
- Be respectful and constructive.
|
|
8
|
+
- Assume good intent, but be clear when something is unsafe or incorrect.
|
|
9
|
+
- Use fake examples when discussing secrets, credentials, logs, or customer data.
|
|
10
|
+
- Keep feedback focused on the work and its impact.
|
|
11
|
+
|
|
12
|
+
## Unacceptable Behavior
|
|
13
|
+
|
|
14
|
+
- Harassment, insults, threats, or discriminatory language.
|
|
15
|
+
- Posting real credentials, private logs, personal data, or customer data.
|
|
16
|
+
- Publicly disclosing a suspected vulnerability before the maintainer has had a reasonable chance to respond.
|
|
17
|
+
- Repeatedly derailing technical discussions.
|
|
18
|
+
|
|
19
|
+
## Reporting
|
|
20
|
+
|
|
21
|
+
For conduct concerns, open a minimal issue asking for maintainer contact without posting private details.
|
|
22
|
+
|
|
23
|
+
For security concerns, follow [SECURITY.md](SECURITY.md).
|
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
# Contributing
|
|
2
|
+
|
|
3
|
+
Thanks for helping make ShareClean safer and more useful.
|
|
4
|
+
|
|
5
|
+
## Development Setup
|
|
6
|
+
|
|
7
|
+
```bash
|
|
8
|
+
git clone https://github.com/OmarH-creator/ShareClean.git
|
|
9
|
+
cd ShareClean
|
|
10
|
+
python -m pip install -e .
|
|
11
|
+
python -m unittest discover -s tests -v
|
|
12
|
+
```
|
|
13
|
+
|
|
14
|
+
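No third-party test dependencies are required. The only runtime dependency is the `tomli` backport, which is installed only on Python versions older than 3.11.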
|
|
15
|
+
|
|
16
|
+
## Project Principles
|
|
17
|
+
|
|
18
|
+
- Keep ShareClean local-first: no network calls, accounts, telemetry, or remote scanning.
|
|
19
|
+
- Preserve debugging context wherever possible. Redact the sensitive value, not the whole line.
|
|
20
|
+
- Never store or print original matched secret values in `Finding`, reports, exceptions, or test output.
|
|
21
|
+
- Use clearly fake test values only.
|
|
22
|
+
- Prefer precise, readable standard-library code over broad dependencies.
|
|
23
|
+
- Add tests for every detector or behavior change.
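
The "redact the value, not the line" and "never store the matched secret" principles can be sketched as follows. This is a minimal illustration, not the code in `src/shareclean/redactor.py`: the `KEY_VALUE` pattern, the `Finding` fields, and `redact_line` are all hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: a key name, a separator, then the secret value.
KEY_VALUE = re.compile(r"(?i)\b(password|api_key|token)(\s*[=:]\s*)(\S+)")


@dataclass(frozen=True)
class Finding:
    """Safe metadata only: positions and the replacement, never the secret."""
    rule_id: str
    start: int  # 0-based offset of the value within the line
    end: int    # exclusive
    replacement: str


def redact_line(line: str, label: str = "[REDACTED]") -> tuple[str, list[Finding]]:
    findings: list[Finding] = []

    def replace(match: re.Match) -> str:
        # Record where the value was, but not what it was.
        findings.append(Finding("SC001", match.start(3), match.end(3), label))
        # Keep the key and separator so the line stays useful for debugging.
        return f"{match.group(1)}{match.group(2)}{label}"

    return KEY_VALUE.sub(replace, line), findings
```

Only the third capture group is replaced, so surrounding context such as the key name and neighbouring fields survives in the cleaned output.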
|
|
24
|
+
|
|
25
|
+
## Adding Or Changing A Detector
|
|
26
|
+
|
|
27
|
+
1. Add or update the rule in `src/shareclean/detectors.py`.
|
|
28
|
+
2. Keep rule order intentional. More specific rules should run before broader rules.
|
|
29
|
+
3. Add positive and negative detector tests.
|
|
30
|
+
4. Add redactor or property-style tests when behavior spans multiple modules.
|
|
31
|
+
5. Confirm reports never include the raw matched value.
|
|
32
|
+
|
|
33
|
+
Run:
|
|
34
|
+
|
|
35
|
+
```bash
|
|
36
|
+
python -m unittest discover -s tests -v
|
|
37
|
+
python -m compileall -q src tests
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
## Pull Request Checklist
|
|
41
|
+
|
|
42
|
+
- Tests pass locally.
|
|
43
|
+
- New behavior is covered by tests.
|
|
44
|
+
- Documentation is updated when CLI behavior or detection behavior changes.
|
|
45
|
+
- Examples use fake values only.
|
|
46
|
+
- The PR does not add telemetry, network behavior, or unnecessary dependencies.
|
|
47
|
+
|
|
48
|
+
## Security Issues
|
|
49
|
+
|
|
50
|
+
Do not put real secrets or production logs in GitHub issues or pull requests. Read [SECURITY.md](SECURITY.md) first.
|
shareclean-0.2.0/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Omar Hassan Elsayed
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
|
@@ -0,0 +1,11 @@
|
|
|
1
|
+
include README.md
|
|
2
|
+
include LICENSE
|
|
3
|
+
include CHANGELOG.md
|
|
4
|
+
include SECURITY.md
|
|
5
|
+
include CONTRIBUTING.md
|
|
6
|
+
include CODE_OF_CONDUCT.md
|
|
7
|
+
include ROADMAP.md
|
|
8
|
+
recursive-include docs *.md
|
|
9
|
+
recursive-include docs/assets *.svg
|
|
10
|
+
recursive-include tests *.py
|
|
11
|
+
recursive-include tests/fixtures *.txt *.json *.env *.log *.yml
|
|
@@ -0,0 +1,263 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: shareclean
|
|
3
|
+
Version: 0.2.0
|
|
4
|
+
Summary: Local-first CLI for sanitizing logs, stack traces, and text before sharing.
|
|
5
|
+
Author: Omar Hassan Elsayed
|
|
6
|
+
License-Expression: MIT
|
|
7
|
+
Project-URL: Homepage, https://omarh-creator.github.io/ShareClean/
|
|
8
|
+
Project-URL: Repository, https://github.com/OmarH-creator/ShareClean
|
|
9
|
+
Project-URL: Documentation, https://omarh-creator.github.io/ShareClean/
|
|
10
|
+
Project-URL: Issues, https://github.com/OmarH-creator/ShareClean/issues
|
|
11
|
+
Project-URL: Changelog, https://github.com/OmarH-creator/ShareClean/blob/main/CHANGELOG.md
|
|
12
|
+
Project-URL: Security, https://github.com/OmarH-creator/ShareClean/security
|
|
13
|
+
Keywords: cli,logs,privacy,redaction,security,secrets
|
|
14
|
+
Classifier: Development Status :: 3 - Alpha
|
|
15
|
+
Classifier: Environment :: Console
|
|
16
|
+
Classifier: Intended Audience :: Developers
|
|
17
|
+
Classifier: Operating System :: OS Independent
|
|
18
|
+
Classifier: Programming Language :: Python :: 3
|
|
19
|
+
Classifier: Programming Language :: Python :: 3 :: Only
|
|
20
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
21
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
22
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
23
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
24
|
+
Classifier: Topic :: Security
|
|
25
|
+
Classifier: Topic :: Software Development :: Debuggers
|
|
26
|
+
Classifier: Topic :: Text Processing
|
|
27
|
+
Classifier: Typing :: Typed
|
|
28
|
+
Requires-Python: >=3.10
|
|
29
|
+
Description-Content-Type: text/markdown
|
|
30
|
+
License-File: LICENSE
|
|
31
|
+
Requires-Dist: tomli>=2; python_version < "3.11"
|
|
32
|
+
Dynamic: license-file
|
|
33
|
+
|
|
34
|
+
# ShareClean
|
|
35
|
+
|
|
36
|
+
[](https://github.com/OmarH-creator/ShareClean/actions/workflows/ci.yml)
|
|
37
|
+

|
|
38
|
+

|
|
39
|
+
[](https://github.com/OmarH-creator/ShareClean/releases)
|
|
40
|
+
[](https://omarh-creator.github.io/ShareClean/)
|
|
41
|
+
|
|
42
|
+
Local-first Python CLI for sanitizing logs, stack traces, config snippets, and terminal output before you paste them into GitHub issues, support tickets, Slack, or AI chats.
|
|
43
|
+
|
|
44
|
+
ShareClean detects common sensitive values, replaces only the risky portion, and reports safe metadata without storing or printing the original secret. It makes no network calls and sends no telemetry.
|
|
45
|
+
|
|
46
|
+
[Try the interactive browser playground](https://omarh-creator.github.io/ShareClean/) to see the redaction rules before installing.
|
|
47
|
+
|
|
48
|
+

|
|
49
|
+
|
|
50
|
+
Browser playground shown for illustration; real workflows run locally through the CLI.
|
|
51
|
+
|
|
52
|
+
## Install
|
|
53
|
+
|
|
54
|
+
With `pipx`:
|
|
55
|
+
|
|
56
|
+
```bash
|
|
57
|
+
pipx install shareclean
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
From a local checkout:
|
|
61
|
+
|
|
62
|
+
```bash
|
|
63
|
+
python -m pip install -e .
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
Run from the repository root without installing:
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
python -m shareclean --help
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
## Quick Start
|
|
73
|
+
|
|
74
|
+
```bash
|
|
75
|
+
shareclean app.log
|
|
76
|
+
shareclean app.log --output app.cleaned.log
|
|
77
|
+
shareclean app.log --report
|
|
78
|
+
shareclean app.log --report --report-format json
|
|
79
|
+
shareclean app.log --check
|
|
80
|
+
shareclean app.log --check --fail-on severity:high
|
|
81
|
+
shareclean app.log --check --fail-on category:token,rule:SC004
|
|
82
|
+
shareclean app.log --check --ignore-for-check category:pii_email
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
`--check` exits `1` only for findings selected by the check policy and never writes sanitized text to stdout.
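
The selector strings used above (`severity:high`, `category:token,rule:SC004`) read as comma-separated `kind:value` pairs. The sketch below shows one way such strings could be parsed and matched against a finding. It is an assumption-based illustration: the `Selector` class, `parse_selectors`, `matches`, the set of known kinds, and the "any selector matches" semantics are hypothetical and may differ from `src/shareclean/selectors.py`.

```python
from dataclasses import dataclass

# Selector kinds inferred from the Quick Start examples (an assumption).
KNOWN_KINDS = {"severity", "category", "rule"}

# Which finding field each selector kind is compared against (hypothetical).
FIELD_FOR_KIND = {"severity": "severity", "category": "category", "rule": "rule_id"}


@dataclass(frozen=True)
class Selector:
    kind: str
    value: str


def parse_selectors(text: str) -> list[Selector]:
    """Parse 'kind:value,kind:value' into Selector objects."""
    selectors = []
    for part in text.split(","):
        kind, sep, value = part.strip().partition(":")
        if not sep or kind not in KNOWN_KINDS or not value:
            raise ValueError(f"invalid selector: {part.strip()!r}")
        selectors.append(Selector(kind, value))
    return selectors


def matches(finding: dict, selectors: list[Selector]) -> bool:
    """Return True when at least one selector matches the finding."""
    return any(finding.get(FIELD_FOR_KIND[s.kind]) == s.value for s in selectors)
```

Rejecting malformed selectors up front fits the documented behaviour of reporting selector errors separately from findings.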
|
|
86
|
+
|
|
87
|
+
## Configuration
|
|
88
|
+
|
|
89
|
+
ShareClean supports committed project policy in either `pyproject.toml` or `.shareclean.toml`.
|
|
90
|
+
|
|
91
|
+
```toml
|
|
92
|
+
[tool.shareclean]
|
|
93
|
+
redact_email = true
|
|
94
|
+
redact_private_ip = false
|
|
95
|
+
redaction_label = "[REDACTED]"
|
|
96
|
+
profile = "default"
|
|
97
|
+
|
|
98
|
+
[tool.shareclean.profiles.ci]
|
|
99
|
+
redact_email = true
|
|
100
|
+
redact_private_ip = true
|
|
101
|
+
fail_on = ["severity:high"]
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
For `.shareclean.toml`, omit the `tool.shareclean` prefix:
|
|
105
|
+
|
|
106
|
+
```toml
|
|
107
|
+
redact_email = true
|
|
108
|
+
redact_private_ip = false
|
|
109
|
+
|
|
110
|
+
[profiles.ci]
|
|
111
|
+
redact_private_ip = true
|
|
112
|
+
fail_on = ["severity:high"]
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
Config location:
|
|
116
|
+
|
|
117
|
+
1. `--config PATH`
|
|
118
|
+
2. Nearest project directory containing `.shareclean.toml` or a `pyproject.toml` with `[tool.shareclean]`
|
|
119
|
+
3. Defaults
|
|
120
|
+
|
|
121
|
+
Auto-discovery walks upward from the current directory until the Git root or filesystem root. It uses only the nearest config directory and never merges parent configs. If `.shareclean.toml` and ShareClean config in `pyproject.toml` exist in the same selected directory, ShareClean exits `2`.
|
|
122
|
+
|
|
123
|
+
Config precedence:
|
|
124
|
+
|
|
125
|
+
1. CLI flags
|
|
126
|
+
2. Environment variables
|
|
127
|
+
3. Selected profile values
|
|
128
|
+
4. Base project config
|
|
129
|
+
5. Defaults
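
The precedence order above amounts to a first-match lookup across five layers. A minimal sketch using `collections.ChainMap` follows; the `resolve` function and the `DEFAULTS` values are assumptions made for illustration (only "private IP redaction is off by default" and the `[REDACTED]` label are stated in this document), and each layer is assumed to contain only the keys that were explicitly set.

```python
from collections import ChainMap

# Assumed defaults for illustration; the email default is a guess.
DEFAULTS = {
    "redact_email": True,
    "redact_private_ip": False,
    "redaction_label": "[REDACTED]",
}


def resolve(cli: dict, env: dict, profile: dict, base: dict) -> dict:
    """Merge config layers; earlier layers win over later ones."""
    # ChainMap returns the value from the first mapping that has the key.
    return dict(ChainMap(cli, env, profile, base, DEFAULTS))


effective = resolve(
    cli={"redact_private_ip": False},            # explicit CLI flag
    env={"redact_private_ip": True},             # environment override
    profile={"fail_on": ["severity:high"]},      # selected profile
    base={"redaction_label": "<hidden>"},        # base project config
)
```

In this example the CLI value for `redact_private_ip` wins over the environment variable, while `fail_on` and `redaction_label` fall through from the profile and base config.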
|
|
130
|
+
|
|
131
|
+
Environment variables:
|
|
132
|
+
|
|
133
|
+
- `SHARECLEAN_REDACT_EMAIL`
|
|
134
|
+
- `SHARECLEAN_REDACT_PRIVATE_IP`
|
|
135
|
+
- `SHARECLEAN_REDACTION_LABEL`
|
|
136
|
+
- `SHARECLEAN_PROFILE`
|
|
137
|
+
- `SHARECLEAN_FAIL_ON`
|
|
138
|
+
- `SHARECLEAN_IGNORE_FOR_CHECK`
|
|
139
|
+
|
|
140
|
+
Boolean environment values accept `true`, `1`, `yes`, `on`, `false`, `0`, `no`, and `off`.
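
A parser for these boolean values might look like the sketch below. `parse_bool` is a hypothetical helper, and the case-insensitive matching and whitespace trimming are assumptions that the text above does not confirm.

```python
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool(raw: str) -> bool:
    """Map an environment variable string to a boolean, or raise ValueError."""
    value = raw.strip().lower()  # assumption: matching ignores case and padding
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {raw!r}")
```

Raising on anything outside the two sets avoids silently treating a typo such as `ture` as false.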
|
|
141
|
+
|
|
142
|
+
Inspect effective configuration without reading input:
|
|
143
|
+
|
|
144
|
+
```bash
|
|
145
|
+
shareclean config show
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
## Detection Rules
|
|
149
|
+
|
|
150
|
+
| Rule ID | Detector | Category | Severity |
|---|---|---|---|
| `SC001` | Key-value secret | `credential` | `high` |
| `SC002` | Bearer token | `token` | `high` |
| `SC003` | JWT-like token | `token` | `high` |
| `SC004` | Connection-string password | `connection_string` | `critical` |
| `SC005` | Email address | `pii_email` | `medium` |
| `SC006` | Local user path | `pii_path` | `medium` |
| `SC007` | Private IP address | `internal_network` | `medium` |
| `SC008` | PEM private-key block | `private_key` | `critical` |
|
|
160
|
+
|
|
161
|
+
Private IP detection is off by default; enable it with `--redact-private-ip` or config.
|
|
162
|
+
|
|
163
|
+
When detectors overlap on the same text range, ShareClean emits one finding using the highest-severity rule. If severities match, it uses the most specific detector.
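
One way to realise this overlap rule is a greedy pass that considers candidates from highest to lowest priority and keeps each one only if it does not overlap something already kept. The sketch below is an assumption-based illustration: the `Candidate` class, the numeric `specificity` field, and `resolve_overlaps` are hypothetical, and how ShareClean actually measures detector specificity is not stated here.

```python
from dataclasses import dataclass

# Severity levels taken from the rule table; the numeric ranks are assumed.
SEVERITY_RANK = {"medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Candidate:
    rule_id: str
    severity: str
    specificity: int  # hypothetical: larger means a more specific detector
    start: int        # 0-based offset
    end: int          # exclusive


def resolve_overlaps(candidates: list[Candidate]) -> list[Candidate]:
    """Keep one finding per overlapping region, preferring severity then specificity."""
    def priority(c: Candidate) -> tuple[int, int]:
        return (SEVERITY_RANK[c.severity], c.specificity)

    kept: list[Candidate] = []
    for candidate in sorted(candidates, key=priority, reverse=True):
        # Two half-open ranges are disjoint when one ends before the other starts.
        if all(candidate.end <= k.start or candidate.start >= k.end for k in kept):
            kept.append(candidate)
    return sorted(kept, key=lambda c: c.start)
```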
|
|
164
|
+
|
|
165
|
+
## JSON Reports
|
|
166
|
+
|
|
167
|
+
JSON reports use schema version `1.0` and do not include filenames, paths, matched values, hashes, source snippets, or masked previews.
|
|
168
|
+
|
|
169
|
+
```json
{
  "schema_version": "1.0",
  "source": "file",
  "summary": {
    "findings": 1,
    "by_category": {
      "credential": 1
    },
    "by_severity": {
      "high": 1
    }
  },
  "findings": [
    {
      "rule_id": "SC001",
      "category": "credential",
      "severity": "high",
      "location": {
        "start": {
          "line": 1,
          "column": 10
        },
        "end": {
          "line": 1,
          "column": 27
        }
      },
      "replacement": "[REDACTED]"
    }
  ]
}
```
|
|
202
|
+
|
|
203
|
+
Locations are 1-based. End positions are exclusive. Columns count Unicode code points after treating CRLF as one LF newline for location purposes.
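
The location convention above can be illustrated by converting a character offset into a 1-based line and column. This is a sketch, not the project's implementation: `normalize_newlines` and `location` are hypothetical helpers, and offsets are assumed to refer to the text after CRLF has been collapsed to LF. Python strings index by Unicode code point, which matches the column definition.

```python
def normalize_newlines(text: str) -> str:
    """Treat CRLF as a single LF for location purposes."""
    return text.replace("\r\n", "\n")


def location(normalized: str, offset: int) -> dict:
    """Convert a 0-based offset into a 1-based line and column."""
    line = normalized.count("\n", 0, offset) + 1
    # rfind returns -1 on the first line, so line_start becomes 0.
    line_start = normalized.rfind("\n", 0, offset) + 1
    return {"line": line, "column": offset - line_start + 1}


text = normalize_newlines("user=a\r\npassword=hunter2-fake\r\n")
start_offset = text.index("hunter2-fake")
end_offset = start_offset + len("hunter2-fake")  # exclusive end

span = {"start": location(text, start_offset), "end": location(text, end_offset)}
```

Because the end is exclusive, `end.column - start.column` equals the length of the redacted value when both positions are on the same line.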
|
|
204
|
+
|
|
205
|
+
## CLI Reference
|
|
206
|
+
|
|
207
|
+
```text
usage: shareclean [-h] [--version] [--check] [--output FILE] [--report]
                  [--report-format {text,json}] [--config FILE]
                  [--profile NAME] [--redact-email] [--no-redact-email]
                  [--redact-private-ip] [--no-redact-private-ip]
                  [--redaction-label TEXT] [--fail-on SELECTORS]
                  [--ignore-for-check SELECTORS]
                  [FILE]
```
|
|
216
|
+
|
|
217
|
+
`--no-email` remains as a deprecated alias for `--no-redact-email`.
|
|
218
|
+
|
|
219
|
+
Exit codes:
|
|
220
|
+
|
|
221
|
+
| Code | Meaning |
|---:|---|
| `0` | Completed successfully |
| `1` | Selected findings detected in `--check` mode |
| `2` | User, I/O, config, or selector error |
| `3` | Unexpected internal error |
|
|
227
|
+
|
|
228
|
+
## Safety Model
|
|
229
|
+
|
|
230
|
+
ShareClean is intentionally local and transparent:
|
|
231
|
+
|
|
232
|
+
- No network calls
|
|
233
|
+
- No cloud processing
|
|
234
|
+
- No telemetry
|
|
235
|
+
- No account or API key required
|
|
236
|
+
- Original matched secret values are not stored in findings or reports
|
|
237
|
+
- Input files are never modified in place
|
|
238
|
+
|
|
239
|
+
## Coverage And Limitations
|
|
240
|
+
|
|
241
|
+
ShareClean is pattern-based. It can miss unusual formats and can redact benign text that resembles a secret. It is not a replacement for repository secret scanners, source-history scanning, or DLP systems.
|
|
242
|
+
|
|
243
|
+
The test corpus under `tests/fixtures/` uses only fake values and is split into generic, cloud, database, CI/CD, SaaS, log, YAML/JSON/env, and false-positive packs. Bug reports that change detection should add a regression fixture using clearly fake data.
|
|
244
|
+
|
|
245
|
+
## Development
|
|
246
|
+
|
|
247
|
+
Run the test suite:
|
|
248
|
+
|
|
249
|
+
```bash
|
|
250
|
+
python -m unittest discover -s tests -v
|
|
251
|
+
```
|
|
252
|
+
|
|
253
|
+
Run packaging checks:
|
|
254
|
+
|
|
255
|
+
```bash
|
|
256
|
+
python -m compileall -q src tests
|
|
257
|
+
python -m build
|
|
258
|
+
python -m twine check dist/*
|
|
259
|
+
```
|
|
260
|
+
|
|
261
|
+
## License
|
|
262
|
+
|
|
263
|
+
ShareClean is released under the [MIT License](LICENSE).
|
|
@@ -0,0 +1,230 @@
|
|
|
1
|
+
# ShareClean
|
|
2
|
+
|
|
3
|
+
[](https://github.com/OmarH-creator/ShareClean/actions/workflows/ci.yml)
|
|
4
|
+

|
|
5
|
+

|
|
6
|
+
[](https://github.com/OmarH-creator/ShareClean/releases)
|
|
7
|
+
[](https://omarh-creator.github.io/ShareClean/)
|
|
8
|
+
|
|
9
|
+
Local-first Python CLI for sanitizing logs, stack traces, config snippets, and terminal output before you paste them into GitHub issues, support tickets, Slack, or AI chats.
|
|
10
|
+
|
|
11
|
+
ShareClean detects common sensitive values, replaces only the risky portion, and reports safe metadata without storing or printing the original secret. It makes no network calls and sends no telemetry.
|
|
12
|
+
|
|
13
|
+
[Try the interactive browser playground](https://omarh-creator.github.io/ShareClean/) to see the redaction rules before installing.
|
|
14
|
+
|
|
15
|
+

|
|
16
|
+
|
|
17
|
+
Browser playground shown for illustration; real workflows run locally through the CLI.
|
|
18
|
+
|
|
19
|
+
## Install
|
|
20
|
+
|
|
21
|
+
With `pipx`:
|
|
22
|
+
|
|
23
|
+
```bash
|
|
24
|
+
pipx install shareclean
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
From a local checkout:
|
|
28
|
+
|
|
29
|
+
```bash
|
|
30
|
+
python -m pip install -e .
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
Run from the repository root without installing:
|
|
34
|
+
|
|
35
|
+
```bash
|
|
36
|
+
python -m shareclean --help
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
## Quick Start
|
|
40
|
+
|
|
41
|
+
```bash
|
|
42
|
+
shareclean app.log
|
|
43
|
+
shareclean app.log --output app.cleaned.log
|
|
44
|
+
shareclean app.log --report
|
|
45
|
+
shareclean app.log --report --report-format json
|
|
46
|
+
shareclean app.log --check
|
|
47
|
+
shareclean app.log --check --fail-on severity:high
|
|
48
|
+
shareclean app.log --check --fail-on category:token,rule:SC004
|
|
49
|
+
shareclean app.log --check --ignore-for-check category:pii_email
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
`--check` exits `1` only for findings selected by the check policy and never writes sanitized text to stdout.
|
|
53
|
+
|
|
54
|
+
## Configuration
|
|
55
|
+
|
|
56
|
+
ShareClean supports committed project policy in either `pyproject.toml` or `.shareclean.toml`.
|
|
57
|
+
|
|
58
|
+
```toml
|
|
59
|
+
[tool.shareclean]
|
|
60
|
+
redact_email = true
|
|
61
|
+
redact_private_ip = false
|
|
62
|
+
redaction_label = "[REDACTED]"
|
|
63
|
+
profile = "default"
|
|
64
|
+
|
|
65
|
+
[tool.shareclean.profiles.ci]
|
|
66
|
+
redact_email = true
|
|
67
|
+
redact_private_ip = true
|
|
68
|
+
fail_on = ["severity:high"]
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
For `.shareclean.toml`, omit the `tool.shareclean` prefix:
|
|
72
|
+
|
|
73
|
+
```toml
|
|
74
|
+
redact_email = true
|
|
75
|
+
redact_private_ip = false
|
|
76
|
+
|
|
77
|
+
[profiles.ci]
|
|
78
|
+
redact_private_ip = true
|
|
79
|
+
fail_on = ["severity:high"]
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
Config location:
|
|
83
|
+
|
|
84
|
+
1. `--config PATH`
|
|
85
|
+
2. Nearest project directory containing `.shareclean.toml` or a `pyproject.toml` with `[tool.shareclean]`
|
|
86
|
+
3. Defaults
|
|
87
|
+
|
|
88
|
+
Auto-discovery walks upward from the current directory until the Git root or filesystem root. It uses only the nearest config directory and never merges parent configs. If `.shareclean.toml` and ShareClean config in `pyproject.toml` exist in the same selected directory, ShareClean exits `2`.
|
|
89
|
+
|
|
90
|
+
Config precedence:
|
|
91
|
+
|
|
92
|
+
1. CLI flags
|
|
93
|
+
2. Environment variables
|
|
94
|
+
3. Selected profile values
|
|
95
|
+
4. Base project config
|
|
96
|
+
5. Defaults
|
|
97
|
+
|
|
98
|
+
Environment variables:
|
|
99
|
+
|
|
100
|
+
- `SHARECLEAN_REDACT_EMAIL`
|
|
101
|
+
- `SHARECLEAN_REDACT_PRIVATE_IP`
|
|
102
|
+
- `SHARECLEAN_REDACTION_LABEL`
|
|
103
|
+
- `SHARECLEAN_PROFILE`
|
|
104
|
+
- `SHARECLEAN_FAIL_ON`
|
|
105
|
+
- `SHARECLEAN_IGNORE_FOR_CHECK`
|
|
106
|
+
|
|
107
|
+
Boolean environment values accept `true`, `1`, `yes`, `on`, `false`, `0`, `no`, and `off`.
|
|
108
|
+
|
|
109
|
+
Inspect effective configuration without reading input:
|
|
110
|
+
|
|
111
|
+
```bash
|
|
112
|
+
shareclean config show
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
## Detection Rules
|
|
116
|
+
|
|
117
|
+
| Rule ID | Detector | Category | Severity |
|---|---|---|---|
| `SC001` | Key-value secret | `credential` | `high` |
| `SC002` | Bearer token | `token` | `high` |
| `SC003` | JWT-like token | `token` | `high` |
| `SC004` | Connection-string password | `connection_string` | `critical` |
| `SC005` | Email address | `pii_email` | `medium` |
| `SC006` | Local user path | `pii_path` | `medium` |
| `SC007` | Private IP address | `internal_network` | `medium` |
| `SC008` | PEM private-key block | `private_key` | `critical` |
|
|
127
|
+
|
|
128
|
+
Private IP detection is off by default; enable it with `--redact-private-ip` or config.
|
|
129
|
+
|
|
130
|
+
When detectors overlap on the same text range, ShareClean emits one finding using the highest-severity rule. If severities match, it uses the most specific detector.
|
|
131
|
+
|
|
132
|
+
## JSON Reports
|
|
133
|
+
|
|
134
|
+
JSON reports use schema version `1.0` and do not include filenames, paths, matched values, hashes, source snippets, or masked previews.
|
|
135
|
+
|
|
136
|
+
```json
{
  "schema_version": "1.0",
  "source": "file",
  "summary": {
    "findings": 1,
    "by_category": {
      "credential": 1
    },
    "by_severity": {
      "high": 1
    }
  },
  "findings": [
    {
      "rule_id": "SC001",
      "category": "credential",
      "severity": "high",
      "location": {
        "start": {
          "line": 1,
          "column": 10
        },
        "end": {
          "line": 1,
          "column": 27
        }
      },
      "replacement": "[REDACTED]"
    }
  ]
}
```
|
|
169
|
+
|
|
170
|
+
Locations are 1-based. End positions are exclusive. Columns count Unicode code points after treating CRLF as one LF newline for location purposes.
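
A consumer of this report format might filter findings by severity before deciding what to do next. The sketch below assumes only the fields shown in the sample report; `findings_at_levels` is a hypothetical helper, and rejecting unknown schema versions is a design choice of the example rather than documented behaviour.

```python
import json

SAMPLE_REPORT = """
{
  "schema_version": "1.0",
  "source": "file",
  "summary": {"findings": 1, "by_category": {"credential": 1}, "by_severity": {"high": 1}},
  "findings": [
    {
      "rule_id": "SC001",
      "category": "credential",
      "severity": "high",
      "location": {"start": {"line": 1, "column": 10}, "end": {"line": 1, "column": 27}},
      "replacement": "[REDACTED]"
    }
  ]
}
"""


def findings_at_levels(report_text: str, levels: tuple[str, ...] = ("high", "critical")) -> list[dict]:
    """Return findings whose severity is in `levels`."""
    report = json.loads(report_text)
    if report.get("schema_version") != "1.0":
        # Refuse to guess at the layout of an unknown schema version.
        raise ValueError("unsupported report schema version")
    return [f for f in report["findings"] if f["severity"] in levels]


selected = findings_at_levels(SAMPLE_REPORT)
```

Since the report carries no matched values or paths, the result can be logged or attached to a CI job without leaking the original secret.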
|
|
171
|
+
|
|
172
|
+
## CLI Reference
|
|
173
|
+
|
|
174
|
+
```text
usage: shareclean [-h] [--version] [--check] [--output FILE] [--report]
                  [--report-format {text,json}] [--config FILE]
                  [--profile NAME] [--redact-email] [--no-redact-email]
                  [--redact-private-ip] [--no-redact-private-ip]
                  [--redaction-label TEXT] [--fail-on SELECTORS]
                  [--ignore-for-check SELECTORS]
                  [FILE]
```
|
|
183
|
+
|
|
184
|
+
`--no-email` remains as a deprecated alias for `--no-redact-email`.
|
|
185
|
+
|
|
186
|
+
Exit codes:
|
|
187
|
+
|
|
188
|
+
| Code | Meaning |
|---:|---|
| `0` | Completed successfully |
| `1` | Selected findings detected in `--check` mode |
| `2` | User, I/O, config, or selector error |
| `3` | Unexpected internal error |
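
A wrapper script that calls the CLI could map these exit codes to readable outcomes. The sketch below is illustrative only: the `ExitCode` enum and `interpret` function are hypothetical names, and the member names are invented labels for the four documented codes.

```python
import enum


class ExitCode(enum.IntEnum):
    OK = 0
    FINDINGS = 1
    USAGE_ERROR = 2
    INTERNAL_ERROR = 3


MESSAGES = {
    ExitCode.OK: "completed successfully",
    ExitCode.FINDINGS: "selected findings detected in --check mode",
    ExitCode.USAGE_ERROR: "user, I/O, config, or selector error",
    ExitCode.INTERNAL_ERROR: "unexpected internal error",
}


def interpret(returncode: int) -> str:
    """Describe a process return code using the documented exit codes."""
    try:
        return MESSAGES[ExitCode(returncode)]
    except ValueError:
        # Anything outside 0..3 is not documented, so report it as-is.
        return f"unrecognized exit code {returncode}"
```

Keeping code `1` distinct from `2` and `3` lets a CI job tell "secrets were found" apart from "the check itself could not run".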
|
|
194
|
+
|
|
195
|
+
## Safety Model
|
|
196
|
+
|
|
197
|
+
ShareClean is intentionally local and transparent:
|
|
198
|
+
|
|
199
|
+
- No network calls
|
|
200
|
+
- No cloud processing
|
|
201
|
+
- No telemetry
|
|
202
|
+
- No account or API key required
|
|
203
|
+
- Original matched secret values are not stored in findings or reports
|
|
204
|
+
- Input files are never modified in place
|
|
205
|
+
|
|
206
|
+
## Coverage And Limitations
|
|
207
|
+
|
|
208
|
+
ShareClean is pattern-based. It can miss unusual formats and can redact benign text that resembles a secret. It is not a replacement for repository secret scanners, source-history scanning, or DLP systems.
|
|
209
|
+
|
|
210
|
+
The test corpus under `tests/fixtures/` uses only fake values and is split into generic, cloud, database, CI/CD, SaaS, log, YAML/JSON/env, and false-positive packs. Bug reports that change detection should add a regression fixture using clearly fake data.
|
|
211
|
+
|
|
212
|
+
## Development
|
|
213
|
+
|
|
214
|
+
Run the test suite:
|
|
215
|
+
|
|
216
|
+
```bash
|
|
217
|
+
python -m unittest discover -s tests -v
|
|
218
|
+
```
|
|
219
|
+
|
|
220
|
+
Run packaging checks:
|
|
221
|
+
|
|
222
|
+
```bash
|
|
223
|
+
python -m compileall -q src tests
|
|
224
|
+
python -m build
|
|
225
|
+
python -m twine check dist/*
|
|
226
|
+
```
|
|
227
|
+
|
|
228
|
+
## License
|
|
229
|
+
|
|
230
|
+
ShareClean is released under the [MIT License](LICENSE).
|
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
# Roadmap
|
|
2
|
+
|
|
3
|
+
ShareClean is intentionally small: local-first, standard-library-only, and focused on making text safer to share. These are the most likely next improvements.
|
|
4
|
+
|
|
5
|
+
## Near Term
|
|
6
|
+
|
|
7
|
+
- Add more targeted detectors for common cloud and SaaS token shapes while keeping examples fake.
|
|
8
|
+
- Add optional allowlist support for values users intentionally want to preserve.
|
|
9
|
+
- Add optional SARIF output for code-scanning style integrations.
|
|
10
|
+
- Add repository-relative source paths as an explicit opt-in report mode for CI.
|
|
11
|
+
- Add `--report-format jsonl` for batch and stream processing.
|
|
12
|
+
|
|
13
|
+
## Maybe Later
|
|
14
|
+
|
|
15
|
+
- More fixture packs for common log formats.
|
|
16
|
+
|
|
17
|
+
## Non-Goals
|
|
18
|
+
|
|
19
|
+
- No telemetry.
|
|
20
|
+
- No network scanning.
|
|
21
|
+
- No credential validation against external services.
|
|
22
|
+
- No claim that ShareClean replaces dedicated secret scanners such as repository scanners or data loss prevention systems.
|