data_redactor 0.16.0-arm64-darwin → 0.17.0-arm64-darwin
- checksums.yaml +4 -4
- data/CHANGELOG.md +20 -1
- data/README.md +22 -1
- data/lib/data_redactor/3.1/data_redactor.bundle +0 -0
- data/lib/data_redactor/3.2/data_redactor.bundle +0 -0
- data/lib/data_redactor/3.3/data_redactor.bundle +0 -0
- data/lib/data_redactor/3.4/data_redactor.bundle +0 -0
- data/lib/data_redactor/4.0/data_redactor.bundle +0 -0
- data/lib/data_redactor/integrations/ruby_llm.rb +120 -0
- data/lib/data_redactor/version.rb +1 -1
- metadata +2 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3385fb2002595de048f01063e071f0412d980a6fb3fc0bb0582825101f8e844e
+  data.tar.gz: 6697ddbdbe27edfa1c42be5139b8b7839df461474c20ff7db116873a77d5afdc
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 69c893d101b200460bba95371a14296d1e1146d4b3a646bbc10011809bbebdefb91a762ed7724079011b4f45c39e01007474f41c47e7c6b5d6fba878cb9166fc
+  data.tar.gz: 31b8f13784c02607e90e6ec7b2d5e2e7897aaa214e78a91c0de75225676759056ec0aa220e28140a9a8c105d31d373c2837689083abcb227a347ba99e3465201
data/CHANGELOG.md
CHANGED
@@ -7,6 +7,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [0.17.0] - 2026-06-21
+
+### Added
+- **Transparent `ruby_llm` integration (opt-in monkeypatch).**
+  `require "data_redactor/integrations/ruby_llm"` then
+  `DataRedactor::Integrations::RubyLLM.install!` prepends a small patch onto
+  `RubyLLM::Protocol#render`, so **every** outbound request is deep-redacted
+  before it is posted — no per-call `.redact`. One hook covers all providers
+  (Anthropic, OpenAI, Gemini, Bedrock, Responses) and scrubs the user prompt,
+  system prompt, tool definitions, and any file/command-output that an agent fed
+  back as a tool result (all inlined as strings in the payload). Forwards
+  `only:`/`except:`/`placeholder:`; idempotent; fails fast at `install!` if an
+  unsupported `ruby_llm` version is loaded or `Protocol#render` is missing.
+  **Limitation:** base64 attachments (PDFs/images/audio) and URL-referenced
+  files are not redacted — the secret bytes are encoded or remote, so patterns
+  cannot see them. This is a monkeypatch on internal API and is version-pinned;
+  the clean alternative remains per-call `DataRedactor.redact` before `chat.ask`.
+
 ## [0.16.0] - 2026-06-21
 
 ### Added
@@ -336,7 +354,8 @@ features as 0.7.1 plus the pipeline fix.
 - `DataRedactor.redact(text)` module function returning the input with every match replaced by `[REDACTED]`.
 - RSpec suite with one example per pattern.
 
-[Unreleased]: https://github.com/danielefrisanco/data_redactor/compare/v0.
+[Unreleased]: https://github.com/danielefrisanco/data_redactor/compare/v0.17.0...HEAD
+[0.17.0]: https://github.com/danielefrisanco/data_redactor/compare/v0.16.0...v0.17.0
 [0.16.0]: https://github.com/danielefrisanco/data_redactor/compare/v0.15.0...v0.16.0
 [0.15.0]: https://github.com/danielefrisanco/data_redactor/compare/v0.14.1...v0.15.0
 [0.14.1]: https://github.com/danielefrisanco/data_redactor/compare/v0.14.0...v0.14.1
data/README.md
CHANGED
@@ -358,7 +358,28 @@ chat.ask(DataRedactor.redact(user_input))
 # the model receives: "My card is [REDACTED] and my email is [REDACTED]"
 ```
 
-Wrap each prompt (and any `with_instructions` system prompt) in `DataRedactor.redact` before passing it to `ask`. This is a per-call step you opt into
+Wrap each prompt (and any `with_instructions` system prompt) in `DataRedactor.redact` before passing it to `ask`. This is a per-call step you opt into, and it's the recommended approach.
+
+#### Transparent mode (every request, no per-call wrapping)
+
+If you'd rather redact **every** outbound request automatically — including the system prompt, tool definitions, and any file contents or shell-command output an agent feeds back as a tool result — opt into the monkeypatch:
+
+```ruby
+require "ruby_llm"
+require "data_redactor/integrations/ruby_llm"
+
+DataRedactor::Integrations::RubyLLM.install! # once, at boot
+
+chat = RubyLLM.chat(model: "claude-opus-4-8")
+chat.ask("my card is 4111111111111111") # sent as "my card is [REDACTED]"
+```
+
+`install!` prepends a patch onto `RubyLLM::Protocol#render` — the one point where every provider (Anthropic, OpenAI, Gemini, Bedrock, Responses) has assembled its final request — and deep-redacts the payload before it's posted. It forwards `only:`/`except:`/`placeholder:`, is idempotent, and **fails fast** at `install!` if an unsupported `ruby_llm` version is loaded or the internal API has moved (so it never silently leaks).
+
+Two caveats, by design:
+
+- **It's a monkeypatch on RubyLLM internals**, pinned to a supported version range. Prefer per-call `DataRedactor.redact` (above) unless you specifically need transparency. RubyLLM does not yet expose a public request hook ([crmne/ruby_llm#765](https://github.com/crmne/ruby_llm/issues/765) tracks the connection-middleware hook that would let us drop the patch).
+- **Base64 attachments** (PDFs, images, audio sent inline) and **URL-referenced files** are not redacted — the sensitive bytes are encoded or remote, so patterns cannot see them.
 
 ## Detected patterns (89 total)
 
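The README text above describes prepending a patch onto a `render` method and deep-redacting its return value. A minimal sketch of that technique follows. It does not use RubyLLM or data_redactor at all: `FakeProtocol`, `RedactingRender`, `install_patch!`, `deep_redact` and the naive 16-digit `REDACT_PATTERN` are hypothetical stand-ins chosen only to make the mechanism runnable in isolation.

```ruby
# Hypothetical stand-ins throughout; none of these names are RubyLLM or
# data_redactor APIs.
REDACT_PATTERN = /\b\d{16}\b/ # naive 16-digit match, for demonstration only

# Recursively replace matches in every String leaf of a Hash/Array payload.
def deep_redact(value, placeholder: "[REDACTED]")
  case value
  when String then value.gsub(REDACT_PATTERN, placeholder)
  when Array  then value.map { |v| deep_redact(v, placeholder: placeholder) }
  when Hash   then value.transform_values { |v| deep_redact(v, placeholder: placeholder) }
  else value
  end
end

# Stand-in for a class whose #render assembles the final request Hash.
class FakeProtocol
  def render(prompt, system: nil)
    { system: system, messages: [{ role: "user", content: prompt }] }
  end
end

# Prepended module: run the original render via super, then scrub the result.
module RedactingRender
  def render(*args, **kwargs)
    deep_redact(super)
  end
end

# Idempotent install: prepend only if the patch is not already an ancestor.
def install_patch!
  return if FakeProtocol.ancestors.include?(RedactingRender)

  FakeProtocol.prepend(RedactingRender)
end

install_patch!
install_patch! # second call is a no-op

payload = FakeProtocol.new.render("my card is 4111111111111111", system: "be brief")
puts payload[:messages].first[:content] # prints "my card is [REDACTED]"
```

Because the prepended module sits in front of the class in the method lookup chain, callers of `render` need no changes, which is the "transparent" property the README refers to.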
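data/lib/data_redactor/3.1/data_redactor.bundle
Binary file
data/lib/data_redactor/3.2/data_redactor.bundle
Binary file
data/lib/data_redactor/3.3/data_redactor.bundle
Binary file
data/lib/data_redactor/3.4/data_redactor.bundle
Binary file
data/lib/data_redactor/4.0/data_redactor.bundle
Binary file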
data/lib/data_redactor/integrations/ruby_llm.rb
ADDED
@@ -0,0 +1,120 @@
+require "data_redactor"
+
+module DataRedactor
+  module Integrations
+    # Transparent outbound redaction for the `ruby_llm` gem (crmne/ruby_llm).
+    #
+    # Calling {install!} prepends a small module onto `RubyLLM::Protocol` that
+    # deep-redacts the **rendered request payload** before it is posted to any
+    # provider. `Protocol#render` is the single point where every provider
+    # (Anthropic, OpenAI/chat_completions, Gemini, Bedrock/Converse, Responses)
+    # has assembled its final request Hash, so one hook covers them all without
+    # knowing any provider-specific shape.
+    #
+    # Because the payload is walked with {DataRedactor.redact_deep}, this scrubs
+    # **every String leaf** in the request: the user prompt, the system prompt,
+    # tool definitions, and — crucially — any file contents or shell-command
+    # output that an agent fed back in as a tool result, since those are already
+    # inlined as strings in `messages` by the time `render` runs.
+    #
+    # This is a monkeypatch (a `prepend` onto a private internal class). It is
+    # opt-in and pinned: {install!} raises unless a supported `ruby_llm` version
+    # is loaded and `RubyLLM::Protocol#render` still exists, so an upstream
+    # refactor fails loudly at install time rather than silently leaking data.
+    # Prefer this only when you need redaction to be *transparent*; otherwise
+    # redact per call with {DataRedactor.redact} before `chat.ask`.
+    #
+    # ## What is NOT redacted
+    # - **Base64 attachments** (PDFs, images, audio sent inline as base64) — the
+    #   sensitive bytes are encoded, so patterns cannot see into them.
+    # - **URL-referenced files/images** — the content lives on a remote server
+    #   and never enters the payload.
+    #
+    # @example Make every ruby_llm request redacted, app-wide
+    #   require "data_redactor/integrations/ruby_llm"
+    #   DataRedactor::Integrations::RubyLLM.install!
+    #
+    #   chat = RubyLLM.chat(model: "claude-opus-4-8")
+    #   chat.ask("my card is 4111111111111111") # sent as "my card is [REDACTED]"
+    #
+    # @example Scope the redaction with the usual filters
+    #   DataRedactor::Integrations::RubyLLM.install!(only: [:financial, :contact])
+    module RubyLLM
+      module_function
+
+      # ruby_llm versions whose `Protocol#render` chokepoint this integration
+      # has been verified against. Bump (and re-verify) on each ruby_llm release.
+      SUPPORTED_VERSION = "~> 1.16"
+
+      # Prepend the redaction patch onto `RubyLLM::Protocol`. Idempotent: a
+      # second call with the patch already installed is a no-op (the filter
+      # options from the first successful install are kept).
+      #
+      # The `only:`/`except:`/`placeholder:` filters are captured here and
+      # applied to every subsequent request.
+      #
+      # @param only [Symbol, String, Array, nil] forwarded to {DataRedactor.redact_deep}.
+      # @param except [Symbol, String, Array, nil] forwarded to {DataRedactor.redact_deep}.
+      # @param placeholder [String, Symbol] forwarded to {DataRedactor.redact_deep}.
+      # @return [void]
+      # @raise [RuntimeError] if `ruby_llm` is not loaded, the loaded version is
+      #   outside {SUPPORTED_VERSION}, or `RubyLLM::Protocol#render` is missing
+      #   (i.e. an upstream refactor moved the chokepoint).
+      def install!(only: nil, except: nil, placeholder: DataRedactor::PLACEHOLDER_DEFAULT)
+        ensure_compatible!
+
+        @options = { only: only, except: except, placeholder: placeholder }
+        return if installed?
+
+        ::RubyLLM::Protocol.prepend(PayloadPatch)
+      end
+
+      # @return [Boolean] whether the redaction patch is currently on
+      #   `RubyLLM::Protocol`.
+      def installed?
+        defined?(::RubyLLM::Protocol) &&
+          ::RubyLLM::Protocol.ancestors.include?(PayloadPatch)
+      end
+
+      # @!visibility private
+      # @return [Hash] the filter options captured at {install!}.
+      def options
+        @options ||= { only: nil, except: nil, placeholder: DataRedactor::PLACEHOLDER_DEFAULT }
+      end
+
+      # @!visibility private
+      def ensure_compatible!
+        unless defined?(::RubyLLM::VERSION)
+          raise "data_redactor ruby_llm integration: require \"ruby_llm\" before calling install!"
+        end
+
+        unless Gem::Requirement.new(SUPPORTED_VERSION).satisfied_by?(Gem::Version.new(::RubyLLM::VERSION))
+          raise "data_redactor ruby_llm integration supports ruby_llm #{SUPPORTED_VERSION}, " \
+                "got #{::RubyLLM::VERSION}. Check for a newer data_redactor or pin ruby_llm."
+        end
+
+        unless ::RubyLLM::Protocol.method_defined?(:render) || ::RubyLLM::Protocol.private_method_defined?(:render)
+          raise "data_redactor ruby_llm integration: RubyLLM::Protocol#render not found — " \
+                "the upstream request-rendering API changed. This integration needs an update."
+        end
+      end
+
+      # Prepended onto `RubyLLM::Protocol`. `Protocol#complete` calls
+      # `payload = render(...)` and then posts that payload, so redacting the
+      # return value of `render` redacts the request without touching anything
+      # else in the send path.
+      module PayloadPatch
+        def render(*args, **kwargs)
+          payload = super
+          opts = DataRedactor::Integrations::RubyLLM.options
+          DataRedactor.redact_deep(
+            payload,
+            only: opts[:only],
+            except: opts[:except],
+            placeholder: opts[:placeholder]
+          )
+        end
+      end
+    end
+  end
+end
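The `ensure_compatible!` check in the file above gates installation on a `"~> 1.16"` requirement. As a rough illustration of what that pin accepts, the sketch below evaluates the same requirement string with RubyGems' `Gem::Requirement` and `Gem::Version`. The helper name `supported?` and the sample version numbers are hypothetical and are not taken from the gem.

```ruby
require "rubygems" # normally loaded already; explicit here for clarity

# Same requirement string as SUPPORTED_VERSION in the file above.
SUPPORTED = Gem::Requirement.new("~> 1.16")

# Hypothetical helper: true when the version string falls inside the pin.
def supported?(version_string)
  SUPPORTED.satisfied_by?(Gem::Version.new(version_string))
end

# Sample versions are illustrative only.
%w[1.15.9 1.16.0 1.17.2 2.0.0].each do |v|
  puts "#{v}: #{supported?(v) ? 'accepted' : 'rejected'}"
end
# prints:
#   1.15.9: rejected
#   1.16.0: accepted
#   1.17.2: accepted
#   2.0.0: rejected
```

The pessimistic operator with a two-segment version allows any release from 1.16 up to, but not including, 2.0, so a new minor release passes the gate while a new major release raises at install time.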
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: data_redactor
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.17.0
 platform: arm64-darwin
 authors:
 - Daniele Frisanco
@@ -120,6 +120,7 @@ files:
 - lib/data_redactor/integrations/openai.rb
 - lib/data_redactor/integrations/rack.rb
 - lib/data_redactor/integrations/rails.rb
+- lib/data_redactor/integrations/ruby_llm.rb
 - lib/data_redactor/name_pattern.rb
 - lib/data_redactor/refinements.rb
 - lib/data_redactor/version.rb