data_redactor 0.7.2-aarch64-linux → 0.8.0-aarch64-linux
- checksums.yaml +4 -4
- data/CHANGELOG.md +9 -0
- data/lib/data_redactor/3.0/data_redactor.so +0 -0
- data/lib/data_redactor/3.1/data_redactor.so +0 -0
- data/lib/data_redactor/3.2/data_redactor.so +0 -0
- data/lib/data_redactor/3.3/data_redactor.so +0 -0
- data/lib/data_redactor/3.4/data_redactor.so +0 -0
- data/lib/data_redactor/4.0/data_redactor.so +0 -0
- data/lib/data_redactor/version.rb +1 -1
- data/lib/data_redactor.rb +74 -0
- data/readme.md +31 -1
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 508508ec72fc9ab5eb574a60619fa95961c682902adfcedcf8ae29f57b246e1d
+  data.tar.gz: dd1bcabcaec602a9719d30fbc9c3bbc532aa9b4d59f9103bc8120a8cf327a39c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: fa665cc51f93155c58bded6bcf737a09787dc1607232d94324994503588e783ee516c39936a2dc980623794c85e52537a30958db431ce29219a39d77852dbd70
+  data.tar.gz: 83762eb09df4b13b6c1a87347b3969571480ebf03bff16e88cb82e769a4aa0c9257a6bd1375ee3b5a2969111aa8cf812d255947580a8670680686354d06e68ca
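The digests in checksums.yaml can be compared against the files inside an unpacked .gem archive. The following is a minimal sketch, not RubyGems' own verification code: the helper name `verify_sha256` is hypothetical, and the YAML and bytes below are stand-in data (the SHA-256 of the string "abc"), not the digests from this release.

```ruby
require "digest"
require "yaml"

# Stand-in checksums file with the same layout as a gem's checksums.yaml.
# The digest is SHA-256("abc"), used here only for illustration.
CHECKSUMS_YAML = <<~YAML
  ---
  SHA256:
    data.tar.gz: ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
YAML

# Hypothetical helper: true when the SHA-256 of `bytes` equals the digest
# recorded for `name` in the parsed checksums hash.
def verify_sha256(checksums, name, bytes)
  expected = checksums.fetch("SHA256").fetch(name).to_s
  Digest::SHA256.hexdigest(bytes) == expected
end

checksums = YAML.safe_load(CHECKSUMS_YAML)
puts verify_sha256(checksums, "data.tar.gz", "abc")      # matching bytes
puts verify_sha256(checksums, "data.tar.gz", "tampered") # different bytes
```

For a real check, `bytes` would come from something like `File.binread("data.tar.gz")` after extracting the .gem file, which is a plain tar archive.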
data/CHANGELOG.md
CHANGED
@@ -7,6 +7,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+- `DataRedactor.redact_deep(data, only:, except:, placeholder:)` — recursively redacts every String value in a nested Hash/Array structure. Non-string scalars (Integer, Float, nil, Boolean) and Hash keys are passed through unchanged. Returns a deep copy; never mutates the input. Raises `ArgumentError` on circular references.
+- `DataRedactor.redact_json(json_string, only:, except:, placeholder:)` — parses JSON, redacts via `redact_deep`, and returns valid JSON. Raises `JSON::ParserError` on invalid input.
+- HashiCorp Vault service tokens (`hvs.` prefix, 90–120 chars) — pattern `hashicorp_vault_service_token`
+- HashiCorp Vault batch tokens (`hvb.` prefix, 138–300 chars) — pattern `hashicorp_vault_batch_token`
+- HashiCorp Terraform Cloud API tokens (`<14-char-id>.atlasv1.<token>`) — pattern `hashicorp_terraform_api_token`
+
+All three HashiCorp patterns are tagged `:credentials` and do not require word-boundary wrapping (distinctive prefixes eliminate false positives).
+
 ## [0.7.2] - 2026-05-09
 
 **Supersedes 0.7.1, which has been yanked from RubyGems.**
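To make the token shapes in the changelog concrete, here is a rough sketch of what prefix-anchored matching could look like. These are not the gem's actual patterns (those live in the C extension and are POSIX ERE). The character classes, the reading of the length ranges as counting characters after the prefix, and the names `SKETCH_PATTERNS` and `sketch_match` are all assumptions made for illustration.

```ruby
# Illustrative only: NOT the patterns shipped in data_redactor.
# Assumptions: token bodies use [A-Za-z0-9_-], and the changelog's
# length ranges count the characters that follow the prefix.
SKETCH_PATTERNS = {
  hashicorp_vault_service_token: /hvs\.[A-Za-z0-9_-]{90,120}/,
  hashicorp_vault_batch_token:   /hvb\.[A-Za-z0-9_-]{138,300}/,
  hashicorp_terraform_api_token: /[A-Za-z0-9]{14}\.atlasv1\.[A-Za-z0-9_-]+/
}.freeze

# Returns the names of the sketch patterns that match somewhere in `text`.
def sketch_match(text)
  SKETCH_PATTERNS.select { |_name, re| re.match?(text) }.keys
end

# Fabricated token-shaped strings, not real credentials.
fake_service = "hvs." + "a" * 95
fake_terraform = "abcdefghij1234.atlasv1." + "x" * 64

p sketch_match("token=#{fake_service}")
p sketch_match("TFE_TOKEN=#{fake_terraform}")
p sketch_match("no secrets here")
```

Because each shape carries a literal marker (`hvs.`, `hvb.`, `.atlasv1.`), the sketch needs no `\b` anchors to avoid matching ordinary words, which is the point the changelog makes about word-boundary wrapping.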
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
data/lib/data_redactor.rb
CHANGED
@@ -1,4 +1,5 @@
 require "set"
+require "json"
 require_relative "data_redactor/version"
 require_relative "data_redactor/data_redactor" # loads the compiled .so
 
@@ -161,6 +162,54 @@ module DataRedactor
     result
   end
 
+  # Recursively redact every String value in a nested Hash/Array structure.
+  #
+  # Walks the structure depth-first. Only String leaves are passed through
+  # {redact}; all other leaf types (Integer, Float, nil, Symbol, Boolean)
+  # are copied unchanged. Hash keys are never modified.
+  #
+  # Returns a deep copy — the original structure is never mutated.
+  #
+  # @param data [Hash, Array, String, Object] the structure to walk.
+  #   Any type is accepted; non-String scalars are returned as-is.
+  # @param only [Symbol, String, Array, nil] forwarded to {redact}.
+  # @param except [Symbol, String, Array, nil] forwarded to {redact}.
+  # @param placeholder [String, :tagged, :hash] forwarded to {redact}.
+  # @return [Hash, Array, String, Object] a new structure of the same shape
+  #   with all String leaves redacted.
+  # @raise [ArgumentError] if the structure contains a circular reference.
+  #
+  # @example Rails params
+  #   safe = DataRedactor.redact_deep(params.to_h)
+  #
+  # @example Mixed filter
+  #   DataRedactor.redact_deep(payload, only: :credentials, placeholder: :tagged)
+  def redact_deep(data, only: nil, except: nil, placeholder: PLACEHOLDER_DEFAULT)
+    _walk(data, only: only, except: except, placeholder: placeholder, seen: Set.new)
+  end
+
+  # Parse +json_string+, redact every String value in the resulting structure,
+  # and return valid JSON.
+  #
+  # Delegates traversal to {redact_deep}. All keyword arguments are forwarded
+  # to {redact}.
+  #
+  # @param json_string [String] valid JSON input.
+  # @param only [Symbol, String, Array, nil] forwarded to {redact}.
+  # @param except [Symbol, String, Array, nil] forwarded to {redact}.
+  # @param placeholder [String, :tagged, :hash] forwarded to {redact}.
+  # @return [String] a JSON string with all String values redacted.
+  # @raise [JSON::ParserError] if +json_string+ is not valid JSON.
+  #
+  # @example
+  #   DataRedactor.redact_json('{"email":"alice@example.com","count":3}')
+  #   # => '{"email":"[REDACTED]","count":3}'
+  def redact_json(json_string, only: nil, except: nil, placeholder: PLACEHOLDER_DEFAULT)
+    parsed = JSON.parse(json_string)
+    redacted = redact_deep(parsed, only: only, except: except, placeholder: placeholder)
+    JSON.generate(redacted)
+  end
+
   # Register a custom redaction pattern.
   #
   # Patterns must be valid POSIX ERE. Ruby-only syntax (+\d+, +\s+, +\w+,
@@ -317,6 +366,31 @@ module DataRedactor
     bits
   end
 
+  # @api private
+  # Depth-first recursive walker for {redact_deep}.
+  # +seen+ is a Set of object_ids already on the current traversal stack,
+  # used to detect circular references.
+  def _walk(node, only:, except:, placeholder:, seen:)
+    case node
+    when String
+      redact(node, only: only, except: except, placeholder: placeholder)
+    when Hash
+      raise ArgumentError, "redact_deep: circular reference detected" if seen.include?(node.object_id)
+      seen.add(node.object_id)
+      result = node.transform_values { |v| _walk(v, only: only, except: except, placeholder: placeholder, seen: seen) }
+      seen.delete(node.object_id)
+      result
+    when Array
+      raise ArgumentError, "redact_deep: circular reference detected" if seen.include?(node.object_id)
+      seen.add(node.object_id)
+      result = node.map { |v| _walk(v, only: only, except: except, placeholder: placeholder, seen: seen) }
+      seen.delete(node.object_id)
+      result
+    else
+      node
+    end
+  end
+
   # @api private
   def pattern_enabled?(name, tag_bit, only_present, only_bits, only_names,
                        except_bits, except_names)
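The walker added in this file tracks only the containers on the current traversal stack, so a container that is reachable by two different paths is accepted and only a true cycle raises. The following standalone sketch illustrates that behaviour without the compiled extension. `fake_redact` is a stand-in for `DataRedactor.redact` that replaces anything shaped like an email address, `walk_sketch` is a hypothetical name, and the `ensure` block is a variation that the diff itself does not use.

```ruby
require "json"
require "set"

# Stand-in for DataRedactor.redact (assumption: the real method calls the
# C extension). Replaces anything that looks like an email address.
def fake_redact(str)
  str.gsub(/[^\s@]+@[^\s@]+/, "[REDACTED]")
end

# Sketch of the on-stack cycle check: an object_id is added when a container
# is entered and removed when it is left, so revisiting a shared (but not
# circular) container is allowed.
def walk_sketch(node, seen = Set.new)
  case node
  when String
    fake_redact(node)
  when Hash, Array
    raise ArgumentError, "circular reference detected" if seen.include?(node.object_id)
    seen.add(node.object_id)
    begin
      if node.is_a?(Hash)
        node.transform_values { |v| walk_sketch(v, seen) }
      else
        node.map { |v| walk_sketch(v, seen) }
      end
    ensure
      seen.delete(node.object_id)
    end
  else
    node # Integer, Float, nil, Symbol, true/false pass through
  end
end

# The same array is referenced twice: no cycle, so no error.
shared = ["alice@example.com"]
input = { "a" => shared, "b" => shared, "count" => 3 }
copy = walk_sketch(input)
p copy
p input # the original structure is left as it was

# An array that contains itself is a real cycle.
loop_array = []
loop_array << loop_array
begin
  walk_sketch(loop_array)
rescue ArgumentError => e
  puts e.message
end

# The JSON variant is parse, walk, then generate.
puts JSON.generate(walk_sketch(JSON.parse('{"email":"alice@example.com","count":3}')))
```

Tracking every container ever visited, rather than only those on the stack, would reject the `shared` case above even though it serialises without trouble.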
data/readme.md
CHANGED
@@ -103,6 +103,36 @@ DataRedactor.scan(text, except: :network)
 DataRedactor.scan(text, only: :contact, except: ["email"])
 ```
 
+### Hash / JSON traversal
+
+Redact every string value inside a nested Hash or Array — useful for params hashes, Sidekiq job payloads, webhook bodies, and anything that isn't a flat string:
+
+```ruby
+# Hash — returns a deep copy, never mutates the input
+result = DataRedactor.redact_deep({
+  "user" => { "email" => "alice@example.com" },
+  "count" => 3,
+  "tags" => ["admin", "alice@example.com"]
+})
+# => { "user" => { "email" => "[REDACTED]" }, "count" => 3, "tags" => ["admin", "[REDACTED]"] }
+
+# Hash keys are never touched — only values are redacted
+# Non-string scalars (Integer, Float, nil, Boolean) pass through unchanged
+
+# Accepts the same filters as redact
+DataRedactor.redact_deep(params, only: :credentials)
+DataRedactor.redact_deep(payload, except: :network, placeholder: :tagged)
+```
+
+```ruby
+# JSON string — parse → redact_deep → re-serialise
+safe_json = DataRedactor.redact_json('{"email":"alice@example.com","count":3}')
+# => '{"email":"[REDACTED]","count":3}'
+
+# Raises JSON::ParserError on invalid input
+DataRedactor.redact_json("not json") # => JSON::ParserError
+```
+
 ### Custom patterns
 
 Teams often have internal IDs that the gem can't ship. Register them at boot:
@@ -179,7 +209,7 @@ Pass an empty subset (e.g. `scrub: [:headers]`) to opt out of body wrapping. For
 
 > **Body wrapping is buffering.** The middleware reads the entire response body into memory before scanning. For streaming endpoints (SSE, large file downloads, Rack::Hijack) use `scrub: [:headers]` and rely on the Logger formatter for application logs instead.
 
-## Detected patterns (
+## Detected patterns (88 total)
 
 The table below is a representative sample. Use `DataRedactor.pattern_names` for the canonical, machine-readable list — it stays in sync with the C extension automatically.
 