dumpling-cli 0.1.0__py3-none-win_amd64.whl

@@ -0,0 +1,207 @@
+ Metadata-Version: 2.4
+ Name: dumpling-cli
+ Version: 0.1.0
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Environment :: Console
+ Classifier: Intended Audience :: Developers
+ Classifier: Operating System :: MacOS
+ Classifier: Operating System :: Microsoft :: Windows
+ Classifier: Operating System :: POSIX :: Linux
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3 :: Only
+ Classifier: Programming Language :: Rust
+ Classifier: Topic :: Database
+ Classifier: Topic :: Security
+ Classifier: Topic :: Software Development :: Libraries
+ Classifier: Topic :: Utilities
+ Summary: Static anonymizer for Postgres plain SQL dumps produced by pg_dump.
+ Keywords: postgres,sql,anonymization,cli,rust
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
+
+ ## Dumpling
+
+ Static anonymizer for Postgres plain SQL dumps produced by `pg_dump`. It scans `INSERT` and `COPY FROM stdin` statements and replaces sensitive row data based on configurable rules.
+
+ ### Install / Build
+
+ ```bash
+ cargo build --release
+ ./target/release/dumpling --help
+ ```
+
+ ### Python package build (maturin)
+
+ This repository now includes Python distribution metadata so Dumpling can be
+ published as a pip-installable CLI package (distribution name:
+ `dumpling-cli`).
+
+ ```bash
+ # Build a wheel locally (add --sdist to also build a source distribution)
+ maturin build --release
+
+ # Install from local source (requires maturin as PEP 517 backend)
+ pip install .
+ ```
+
+ After install, the CLI command remains:
+
+ ```bash
+ dumpling --help
+ ```
+
+ ### Project automation
+
+ - **Lint:** `.github/workflows/ci.yml` runs `cargo fmt` and `cargo clippy` only (fast signal).
+ - **Test:** `.github/workflows/tests.yml` runs `cargo test --all-targets --all-features`.
+ - **Platform compatibility (latest):** `.github/workflows/platform-compat-latest.yml` runs cross-platform build checks on latest runner images.
+ - **Platform compatibility (matrix):** `.github/workflows/platform-compat-matrix.yml` is a manual, explicit-version matrix for legacy compatibility checks over time.
+ - **Docs:** `.github/workflows/docs.yml` builds this repo's mdBook docs and deploys them from `main` to GitHub Pages.
+ - **Publish:** `.github/workflows/publish.yml` builds wheels/sdist via `maturin`, publishes to PyPI from tags, and supports manual TestPyPI publication.
+ - **Release:** `.github/workflows/release.yml` publishes tagged releases (`v*.*.*`) with checksummed Linux artifacts.
+
+ ### Docs
+
+ ```bash
+ mdbook build
+ ```
+
+ Primary docs live under `docs/src/`, including the [release process](docs/src/releasing.md).
+
+ ### Usage
+
+ ```bash
+ dumpling -i dump.sql -o sanitized.sql # read from file, write to file
+ dumpling -i dump.sql --in-place # overwrite the input file (atomic swap)
+ cat dump.sql | dumpling > sanitized.sql # stream from stdin to stdout
+ dumpling -i dump.sql -c .dumplingconf # use explicit config path
+ dumpling --check -i dump.sql # exit 1 if changes would occur, no output
+ dumpling --stats -i dump.sql -o out.sql # print summary to stderr
+ dumpling --report report.json -i dump.sql # write detailed JSON report of changes/drops
+ dumpling --include-table '^public\.' -i dump.sql -o out.sql
+ dumpling --exclude-table '^audit\.' -i dump.sql -o out.sql
+ dumpling --allow-ext dmp -i data.dmp # restrict processing to specific extensions
+ ```
+
+ Configuration is loaded in this order:
+
+ 1) `--config <path>` if provided
+ 2) `.dumplingconf` in the current directory
+ 3) `pyproject.toml` `[tool.dumpling]` section
+
+ If no configuration is found, Dumpling performs a no-op transformation.
+
+ ### Configuration (TOML)
+
+ Both `.dumplingconf` and `[tool.dumpling]` inside `pyproject.toml` use the same schema:
+
+ ```toml
+ # Optional global salt for strategies that support it (e.g. hash)
+ salt = "mysalt"
+
+ # Rules are keyed by either "table" or "schema.table"
+ [rules."public.users"]
+ email = { strategy = "email" }
+ name = { strategy = "name" }
+ ssn = { strategy = "hash", as_string = true } # SHA-256 of original (salted)
+ age = { strategy = "int_range", min = 18, max = 90 }
+
+ [rules."orders"]
+ credit_card = { strategy = "redact", as_string = true }
+ ```
+
+ Supported strategies:
+
+ - `null`: set field to SQL NULL
+ - `redact`: replace with `REDACTED` (string)
+ - `uuid`: random UUIDv4-like string
+ - `hash`: SHA-256 hex of original value; supports per-column `salt` and global `salt`
+ - `email`: random-looking email at `example.com`
+ - `name`, `first_name`, `last_name`: simple placeholder names
+ - `phone`: simple US-like phone number `(xxx) xxx-xxxx`
+ - `int_range`: random integer in `[min, max]`
+ - `string`: random alphanumeric string, `length = 12` by default
+ - `date_fuzz`: shifts a date by a random number of days in `[min_days, max_days]` (defaults: `-30..30`)
+ - `time_fuzz`: shifts a time-of-day by a random number of seconds in `[min_seconds, max_seconds]` with 24h wraparound (defaults: `-300..300`)
+ - `datetime_fuzz`: shifts a timestamp/timestamptz by a random number of seconds in `[min_seconds, max_seconds]` (defaults: `-86400..86400`)
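
To make the `hash` strategy concrete, here is a minimal Python sketch of a salted SHA-256 hex digest. The README does not say how the salt is combined with the value, so two things here are assumptions: that a per-column salt takes precedence over the global one, and that the salt is prepended to the original value. The helper name `hash_value` is hypothetical.

```python
import hashlib


def hash_value(original: str, column_salt: str = "", global_salt: str = "") -> str:
    """SHA-256 hex digest of the original value with an optional salt.

    Assumption: a per-column salt overrides the global one, and the salt is
    prepended to the value. Dumpling's real combination rule may differ.
    """
    salt = column_salt or global_salt
    return hashlib.sha256((salt + original).encode("utf-8")).hexdigest()


digest = hash_value("123-45-6789", global_salt="mysalt")
print(len(digest))  # a SHA-256 hex digest is always 64 characters long
```

Because the same input and salt always give the same digest, equal values in different rows or tables still map to equal anonymized values under this scheme.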
+
+ Common options:
+
+ - `as_string`: if true, forces the anonymized value to be rendered as a quoted SQL string literal. By default Dumpling preserves the original quoting where possible.
+ - `min_days`/`max_days`: used by `date_fuzz`
+ - `min_seconds`/`max_seconds`: used by `time_fuzz` and `datetime_fuzz`
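
The fuzz strategies and their range options can be illustrated with a short Python sketch. It is not Dumpling's code: the function names mirror the strategy names for readability only, and the use of `random.Random` with a fixed seed is an assumption chosen to show reproducible shifts.

```python
import random
from datetime import date, time, timedelta


def date_fuzz(value: date, rng: random.Random, min_days: int = -30, max_days: int = 30) -> date:
    """Shift a date by a random whole number of days in [min_days, max_days]."""
    return value + timedelta(days=rng.randint(min_days, max_days))


def time_fuzz(value: time, rng: random.Random, min_seconds: int = -300, max_seconds: int = 300) -> time:
    """Shift a time of day by a random number of seconds, wrapping at 24h."""
    total = value.hour * 3600 + value.minute * 60 + value.second
    total = (total + rng.randint(min_seconds, max_seconds)) % 86400  # wrap past midnight
    return time(total // 3600, (total % 3600) // 60, total % 60)


rng = random.Random(42)  # fixed seed so repeated runs give the same shifts
print(date_fuzz(date(2024, 1, 15), rng))
print(time_fuzz(time(23, 59, 30), rng))
```

The modulo by 86400 is what implements the 24h wraparound: a shift of +60 seconds applied to 23:59:30 lands on 00:00:30 rather than overflowing.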
+
+ ### Input format
+
+ This tool targets the plain-text SQL format from `pg_dump`, handling:
+
+ - `INSERT INTO schema.table (col1, col2, ...) VALUES (...), (...), ...;`
+ - `COPY schema.table (col1, col2, ...) FROM stdin; ... \.` (tab-delimited with `\N` as NULL)
+
+ Other `pg_dump` formats (custom/binary/directory) are not supported.
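
As a rough illustration of the `COPY ... FROM stdin` data format described above, the following Python sketch splits one data line into fields and recognises the `\.` terminator. Both helper names are hypothetical, and the sketch deliberately skips decoding of backslash escapes inside fields, which a real parser would need.

```python
from typing import List, Optional


def parse_copy_row(line: str) -> List[Optional[str]]:
    r"""Split one COPY text-format data line into fields, mapping \N to None.

    Simplification: backslash escapes inside a field are left as-is.
    """
    fields = line.rstrip("\n").split("\t")
    return [None if field == r"\N" else field for field in fields]


def is_copy_terminator(line: str) -> bool:
    r"""A COPY data section ends with a line that contains only \. """
    return line.rstrip("\n") == "\\."


print(parse_copy_row("1\talice@example.com\t\\N\n"))
```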
+
+ ### Row filtering (retain/delete)
+
+ You can retain or delete rows for specific tables using explicit predicate lists. Semantics:
+
+ - If `retain` is non-empty, a row is kept only if it matches at least one of its predicates.
+ - Regardless of `retain`, a row is dropped if it matches any predicate in `delete`.
+
+ Predicates support these operators on a column:
+
+ - `eq`, `neq` (string compare; case-insensitive if `case_insensitive = true`)
+ - `in`, `not_in` (list of values, string compare)
+ - `like`, `ilike` (SQL-like: `%` and `_`)
+ - `regex`, `iregex` (Rust regex; `iregex` is case-insensitive)
+ - `lt`, `lte`, `gt`, `gte` (numeric compare; values parsed as numbers)
+ - `is_null`, `not_null` (no value needed)
+
+ Example:
+
+ ```toml
+ [row_filters."public.users"]
+ retain = [
+ { column = "country", op = "eq", value = "US" },
+ { column = "email", op = "ilike", value = "%@myco.com" }
+ ]
+ delete = [
+ { column = "is_admin", op = "eq", value = "true" },
+ { column = "email", op = "ilike", value = "%@example.com" }
+ ]
+ ```
+
+ Row filtering works for both `INSERT ... VALUES (...)` and `COPY ... FROM stdin` rows.
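
The retain/delete semantics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Dumpling's implementation: only `eq`, `like`, and `ilike` are covered, NULL values are assumed never to match a comparison, and the names `like_to_regex`, `matches`, and `keep_row` are hypothetical.

```python
import re
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def like_to_regex(pattern: str) -> str:
    """Translate SQL LIKE wildcards: % matches any run, _ matches one character."""
    parts = []
    for ch in pattern:
        parts.append(".*" if ch == "%" else "." if ch == "_" else re.escape(ch))
    return "".join(parts)


def matches(row: Row, pred: dict) -> bool:
    """Evaluate one predicate; only eq, like, and ilike are sketched here."""
    value = row.get(pred["column"])
    if value is None:
        return False  # assumption: NULL never matches a comparison
    if pred["op"] == "eq":
        return value == pred["value"]
    if pred["op"] in ("like", "ilike"):
        flags = re.IGNORECASE if pred["op"] == "ilike" else 0
        return re.fullmatch(like_to_regex(pred["value"]), value, flags | re.DOTALL) is not None
    raise ValueError(f"operator not covered by this sketch: {pred['op']}")


def keep_row(row: Row, retain: List[dict], delete: List[dict]) -> bool:
    """Apply the retain/delete semantics described above."""
    if retain and not any(matches(row, p) for p in retain):
        return False  # retain is non-empty and nothing matched
    return not any(matches(row, p) for p in delete)  # delete always wins
```

With the example configuration above, a US user with `is_admin = "true"` passes `retain` but is still dropped, because `delete` is checked regardless of `retain`.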
+
+ ### Conditional per-column cases (first-match-wins)
+
+ Define default strategies in `rules."<table>"` and add ordered per-column cases in `column_cases."<table>"."<column>"`. For each row, for each column, Dumpling applies the first matching case; if none match, it uses the default from `rules`.
+
+ Example:
+
+ ```toml
+ [rules."public.users"]
+ email = { strategy = "hash", as_string = true } # default
+ name = { strategy = "name" }
+
+ [[column_cases."public.users".email]]
+ when.any = [{ column = "is_admin", op = "eq", value = "true" }]
+ strategy = { strategy = "redact", as_string = true }
+
+ [[column_cases."public.users".email]]
+ when.any = [{ column = "country", op = "in", values = ["DE","FR","GB"] }]
+ strategy = { strategy = "hash", salt = "eu-salt", as_string = true }
+ ```
+
+ Notes:
+ - `when.any` is OR, `when.all` is AND; you can use either or both. If both are empty, the case matches unconditionally.
+ - First-match-wins per column; there is no merge/replace or fallthrough flag.
+ - Row filtering (`row_filters`) is evaluated before cases; deleted rows are not transformed.
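
The first-match-wins selection can be sketched in Python as below. Predicates are modelled as plain callables to keep the example short, and the names `case_matches` and `pick_strategy` are hypothetical. One point is an assumption rather than something the README states: when both `when.any` and `when.all` are given, this sketch requires both to hold.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]
Predicate = Callable[[Row], bool]


def case_matches(row: Row, any_of: List[Predicate], all_of: List[Predicate]) -> bool:
    """when.any is OR, when.all is AND; two empty lists match unconditionally.

    Assumption: when both lists are non-empty, both conditions must hold.
    """
    any_ok = not any_of or any(p(row) for p in any_of)
    all_ok = all(p(row) for p in all_of)  # all() of an empty list is True
    return any_ok and all_ok


def pick_strategy(row: Row, cases: List[dict], default: dict) -> dict:
    """Return the strategy of the first matching case, else the rules default."""
    for case in cases:
        if case_matches(row, case.get("any", []), case.get("all", [])):
            return case["strategy"]
    return default


# The email cases from the TOML example above, in the same order.
email_default = {"strategy": "hash", "as_string": True}
email_cases = [
    {"any": [lambda r: r.get("is_admin") == "true"],
     "strategy": {"strategy": "redact", "as_string": True}},
    {"any": [lambda r: r.get("country") in ("DE", "FR", "GB")],
     "strategy": {"strategy": "hash", "salt": "eu-salt", "as_string": True}},
]
```

An admin located in Germany matches both cases, but only the first one (`redact`) is applied, since there is no fallthrough.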
+
+ ### Notes
+
+ - This is a streaming transformer; memory usage stays small even for big dumps.
+ - For best results, configure strategies compatible with column data types. If you hash an integer column, Dumpling will render a string which Postgres can usually coerce, but explicit `as_string = false` may help in some cases.
+ - If you switch runtimes/branches frequently and see test DB migration issues in your project, remember you can run tests with `pytest --create-db` (project convention).
+ - Deterministic anonymization for tests: pass `--seed <u64>` or set env `DUMPLING_SEED` to make fuzz strategies reproducible across runs.
+
+
@@ -0,0 +1,5 @@
+ dumpling_cli-0.1.0.data/scripts/dumpling.exe,sha256=6vvLUVqdiCpBYYxN_vbl-LWnfEMzpgILCuZAqTphCGg,3210752
+ dumpling_cli-0.1.0.dist-info/METADATA,sha256=29V0898AhGpcxzdVumYBAE9cfmJwW79qnR-GZCQLPxI,8668
+ dumpling_cli-0.1.0.dist-info/WHEEL,sha256=uJOc2U-Q1x95AlblQcqMRb3iR4QnPtdI7X2ycPN99rM,94
+ dumpling_cli-0.1.0.dist-info/sboms/dumpling.cyclonedx.json,sha256=0Blmq2tyPTgp0vNrUIFsg906AnCWSbtgTL9cq1dFUGI,90520
+ dumpling_cli-0.1.0.dist-info/RECORD,,
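
Each entry above has the form `path,sha256=<digest>,<size>`, where the digest is the urlsafe base64 encoding of the file's SHA-256 hash with the `=` padding removed. A minimal Python sketch for reading and recomputing such entries follows. The helper names are hypothetical, and splitting on the last two commas is a simplification that ignores CSV quoting of unusual paths.

```python
import base64
import hashlib
from typing import Optional, Tuple


def record_digest(data: bytes) -> str:
    """urlsafe base64 of the SHA-256 digest with '=' padding removed."""
    raw = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def parse_record_line(line: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Split 'path,sha256=<digest>,<size>' into (path, digest, size)."""
    path, hash_field, size = line.rsplit(",", 2)
    digest = hash_field.split("=", 1)[1] if hash_field else None
    return path, digest, int(size) if size else None
```

To check a file extracted from the wheel, compare `record_digest(contents)` and `len(contents)` against the parsed entry. The `RECORD` file itself is listed with an empty hash and size, which the parser returns as `None`.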
@@ -0,0 +1,4 @@
+ Wheel-Version: 1.0
+ Generator: maturin (1.12.6)
+ Root-Is-Purelib: false
+ Tag: py3-none-win_amd64