data_redactor 0.11.0-x86_64-linux-musl → 0.13.0-x86_64-linux-musl
- checksums.yaml +4 -4
- data/CHANGELOG.md +25 -1
- data/README.md +4 -4
- data/lib/data_redactor/3.0/data_redactor.so +0 -0
- data/lib/data_redactor/3.1/data_redactor.so +0 -0
- data/lib/data_redactor/3.2/data_redactor.so +0 -0
- data/lib/data_redactor/3.3/data_redactor.so +0 -0
- data/lib/data_redactor/3.4/data_redactor.so +0 -0
- data/lib/data_redactor/4.0/data_redactor.so +0 -0
- data/lib/data_redactor/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 56e8ce3c962e337b03a7b8fab9da6eee08a5d3d65323425f7f9afc6105b68876
+data.tar.gz: 0ed77d4a8620bb1f8e03c386a201a99640cd7539ed321dc4e5d310ab4ce3b58b
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: b0b64c56150d14b34adcd81e6ce0204d4cc486aeb07cfbaafe067c8738a5233011a73aedaf6e4702f5dd309b13fce1844ec8d91a7f15704783053a4c60825b6d
+data.tar.gz: b4cf6871059703476b2906984aec3d7dabb7ec86cf7912bbee33ed23dab1f7e81f988e82253b7b2c0af706cdc6ee53051822500d863e5bb06c2e9f50286f50ae
data/CHANGELOG.md
CHANGED
@@ -7,6 +7,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.13.0] - 2026-06-13
+
+### Changed
+- **Custom-pattern registration is now thread-safe.** `add_pattern`,
+`remove_pattern`, and `clear_custom_patterns!` are guarded by a mutex shared
+with the `redact`/`scan` custom-pattern loop, so patterns may be registered,
+removed, or cleared from any thread at any time — including at runtime from a
+request handler — without coordinating with in-flight redactions. The previous
+"register custom patterns at boot only" caveat is lifted. (The C extension now
+links `-lpthread` on glibc; no-op on musl and macOS where pthread is in libc.)
+- **`redact` releases the GVL for large inputs.** The v19 engine's per-scan
+mutable state (NFA scratch and the lazy DFA cache) moved into per-thread
+storage, making the engine re-entrant. `redact` now releases the GVL
+(`rb_thread_call_without_gvl`) around the built-in scan for inputs above a few
+KB, so a large redaction on one thread no longer blocks other Ruby threads.
+Small inputs keep the GVL. No public API change; output is byte-for-byte
+identical (verified by a differential gate over ~6000 inputs). The per-thread
+DFA cache's allocation floor was tuned so this adds ~0.86 MB per scanning
+thread (down from a naive ~3.2 MB), with no throughput change. Per-thread scan
+state is freed at thread exit (via a `pthread_key` destructor), so processes
+that churn many short-lived scanning threads do not accumulate dead caches —
+RSS stays flat across thousands of threads.
+
 ## [0.11.0] - 2026-06-10
 
 ### Added
@@ -232,7 +255,8 @@ features as 0.7.1 plus the pipeline fix.
 - `DataRedactor.redact(text)` module function returning the input with every match replaced by `[REDACTED]`.
 - RSpec suite with one example per pattern.
 
-[Unreleased]: https://github.com/danielefrisanco/data_redactor/compare/v0.
+[Unreleased]: https://github.com/danielefrisanco/data_redactor/compare/v0.13.0...HEAD
+[0.13.0]: https://github.com/danielefrisanco/data_redactor/compare/v0.11.0...v0.13.0
 [0.11.0]: https://github.com/danielefrisanco/data_redactor/compare/v0.10.1...v0.11.0
 [0.10.1]: https://github.com/danielefrisanco/data_redactor/compare/v0.10.0...v0.10.1
 [0.10.0]: https://github.com/danielefrisanco/data_redactor/compare/v0.9.0...v0.10.0
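Reviewer note: the 0.13.0 entry says that writers (`add_pattern`, `remove_pattern`, `clear_custom_patterns!`) and the `redact`/`scan` custom-pattern loop share one mutex. The sketch below models that locking discipline in pure Ruby so it can be run without the gem. It is an assumption-laden illustration, not the gem's C implementation: `SketchRegistry`, its method signatures, and the sample patterns are all hypothetical.

```ruby
# Hypothetical pure-Ruby model of a mutex-guarded custom-pattern registry.
# This is NOT the gem's C code; every name here is illustrative only.
class SketchRegistry
  PLACEHOLDER = "[REDACTED]"

  def initialize
    @lock = Mutex.new
    @patterns = {} # name => Regexp
  end

  # Writers take the lock around the mutation.
  def add_pattern(name, regexp)
    @lock.synchronize { @patterns[name] = regexp }
  end

  def remove_pattern(name)
    @lock.synchronize { @patterns.delete(name) }
  end

  def clear!
    @lock.synchronize { @patterns.clear }
  end

  # Readers take the same lock around the custom-pattern loop, so the loop
  # never observes the table while a writer is in the middle of changing it.
  def redact(text)
    @lock.synchronize do
      @patterns.values.reduce(text) { |acc, re| acc.gsub(re, PLACEHOLDER) }
    end
  end
end

registry = SketchRegistry.new
registry.add_pattern(:employee_id, /EMP-\d{6}/)

# One thread keeps registering and removing patterns at runtime...
writer = Thread.new do
  200.times do |i|
    registry.add_pattern(:"tmp_#{i}", /TMP#{i}-\d+/)
    registry.remove_pattern(:"tmp_#{i}")
  end
end

# ...while four threads redact concurrently.
readers = 4.times.map do
  Thread.new { 200.times.map { registry.redact("id EMP-123456 seen") } }
end

writer.join
results = readers.flat_map(&:value).uniq
puts results.inspect # => ["id [REDACTED] seen"]
```

Holding the lock for the whole loop is the simplest choice and matches the changelog's description. It is cheap only as long as registration stays rare, since every redaction briefly takes the same lock.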
data/README.md
CHANGED
@@ -19,7 +19,7 @@ It ships **88 built-in patterns** across 15+ countries, grouped into tags
 (`:credentials`, `:financial`, `:contact`, ...) so you can redact only what you
 care about. Beyond plain strings it can walk nested Hashes, Arrays, and JSON,
 audit a payload without mutating it (`scan`), and plug into Logger, Rails, and
-Rack. You can also register your own patterns at boot.
+Rack. You can also register your own patterns — at boot or at runtime from any thread.
 
 ### Use cases
 
@@ -161,7 +161,7 @@ DataRedactor.redact_json("not json") # => JSON::ParserError
 
 ### Custom patterns
 
-Teams often have internal IDs that the gem can't ship. Register them at boot:
+Teams often have internal IDs that the gem can't ship. Register them at boot — or at runtime from any thread (registration is thread-safe, see [Thread safety](#thread-safety)):
 
 ```ruby
 # String (POSIX ERE) or Regexp — both accepted
@@ -571,9 +571,9 @@ All C-side buffers are heap-allocated with `malloc`/`strdup` and freed before th
 
 ## Thread safety
 
-`DataRedactor.redact` and `DataRedactor.scan` are safe to call concurrently from multiple threads. The v19 engine
+`DataRedactor.redact` and `DataRedactor.scan` are safe to call concurrently from multiple threads. The v19 engine keeps its compiled patterns immutable and shared (read-only after `mm_init()` at load time) and all per-scan mutable state — NFA scratch and the lazy DFA cache — in per-thread storage, so concurrent scans never touch each other's state. For inputs above a few KB, `redact` **releases the GVL** (`rb_thread_call_without_gvl`) around the built-in scan, so a large redaction on one thread no longer blocks other Ruby threads from running. Small inputs keep the GVL (the release bookkeeping would cost more than the scan). Each call allocates its own working buffers. A thread's per-thread state is freed automatically when the thread exits, so processes that spawn many short-lived scanning threads do not accumulate memory.
 
-`DataRedactor.add_pattern`, `remove_pattern`, and `clear_custom_patterns!`
+`DataRedactor.add_pattern`, `remove_pattern`, and `clear_custom_patterns!` are also thread-safe: the shared custom-pattern array is guarded by a mutex that writers take around the mutation and `redact`/`scan` take around their custom-pattern loop. You can register, remove, or clear custom patterns from any thread at any time — including from request handlers in a running server — without coordinating with in-flight redactions. (Registration is still a rare operation; the lock is uncontended in practice.)
 
 ## Versioning
 
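Reviewer note: the new "Thread safety" text describes immutable shared patterns plus per-thread mutable scan state. A rough pure-Ruby analogy of that split is sketched below. It uses Ruby thread variables in place of the C extension's `pthread_key` storage, and every name in it (`SHARED_PATTERNS`, `scratch_for_current_thread`, `sketch_redact`, the `:sketch_scan_cache` key, the sample pattern) is hypothetical. It does not model the GVL release, which can only happen in the C extension.

```ruby
# Hypothetical model: compiled patterns are frozen and shared by all threads,
# while mutable scan state lives in storage owned by the calling thread.
SHARED_PATTERNS = [/\b\d{3}-\d{2}-\d{4}\b/].freeze

# Lazily create this thread's scratch state on first use.
def scratch_for_current_thread
  thread = Thread.current
  unless thread.thread_variable?(:sketch_scan_cache)
    thread.thread_variable_set(:sketch_scan_cache, { scans: 0 })
  end
  thread.thread_variable_get(:sketch_scan_cache)
end

def sketch_redact(text)
  scratch = scratch_for_current_thread
  scratch[:scans] += 1 # mutation touches only the calling thread's state
  SHARED_PATTERNS.reduce(text) { |acc, re| acc.gsub(re, "[REDACTED]") }
end

# Thread n performs n + 1 scans and then reports its own counter.
counts = 3.times.map do |n|
  Thread.new do
    (n + 1).times { sketch_redact("ssn 123-45-6789") }
    scratch_for_current_thread[:scans]
  end
end.map(&:value)

puts counts.inspect # => [1, 2, 3]
```

Each thread sees only its own counter, which is the property the README relies on when it says concurrent scans never touch each other's state. In this model the scratch hash becomes garbage once its thread object is collected, loosely mirroring the destructor-based cleanup the diff describes.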
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file