ez_logs_agent 0.1.10 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 2a7479b12ee7dd0814929516f0474d7e05e393410d86a1c6d56117c854945809
-   data.tar.gz: 54af5890799ca5614373dae0e66d1b422b3abab3dee7d9d0f95c7f23039ffdd7
+   metadata.gz: fbc336d9f93ad71ed33b7c76b07965e751eb6b197ae74a4e29a700c6498b2d26
+   data.tar.gz: 1429316f73489ca63f6925d6e5e6bb97a73e6911aefc3f832ec8be125fa9a74d
  SHA512:
-   metadata.gz: a68790e445b19ba92f2df2d0733fe6ae7e6d9c43f2ff854eed01fced091cb7d62aa1c0377429c44d3250abc515d5a1bb45056f1087eb0e2b8d491461fdd94843
-   data.tar.gz: f4208dc88f566f28e1ec24288533468fb3f99d268a77d4f3030da19fc4fc77fe8e36a39052516ddc94def977f03012d8bc157fccb319161a06d4889f52cd37e5
+   metadata.gz: 11b8ec5a85a5792c7d4ae0b90b01d46f9e738ba5e043df790b80f81db78da3e19b94611102acfd564388104852a4edf6b91d759f6a4310ffb405a4e97cb38ad0
+   data.tar.gz: acb11f65d81105396c16a5e189aa7437b607becbe46f8910c84828410546ec2ac19752af3f8a22414027623e375fbbdca52b4716c2670c97bd4119d432619ff3
data/CHANGELOG.md CHANGED
@@ -2,6 +2,56 @@
 
  All notable changes to this project will be documented in this file.
 
+ ## [0.2.0] — 2026-06-05
+
+ ### Added
+ - **Bulk database operations are now captured.** Adds a fourth event
+   `source_type` — `bulk_database` — for the four ActiveRecord operations
+   that bypass per-row callbacks: `delete_all`, `update_all`, `insert_all`,
+   `upsert_all`. Implemented via a narrowly-filtered
+   `ActiveSupport::Notifications.subscribe("sql.active_record")` subscription
+   (`Capturers::BulkDatabaseCapturer`) — not a replacement for the existing
+   callback-based DatabaseCapturer; the two run side by side under the
+   same `capture_database` config flag.
+
+   The new wire shape carries `model_class`, `operation`, `row_count`,
+   `where_template` + sanitized `where_binds`, plus an operation-specific
+   field (`set` for update_all, `columns` for insert_all/upsert_all).
+   Insert/upsert ship column NAMES only — no values, per the product
+   decision that bulk-row PII shouldn't ride the wire.
+
+   Cascade case (`dependent: :delete_all` on a parent destroy) is
+   captured automatically, since it produces the same SQL shape — the
+   reader sees the parent destroy AND the cascade as sibling rows on the
+   timeline.
+
+   Resource attribution uses a `resource_id: "bulk:<row_count>"` sentinel,
+   since individual row IDs are not knowable from the SQL without
+   changing the customer's operation (which would violate the read-only
+   principle).
+
+ ### Changed
+ - `Sanitizer::SENSITIVE_PATTERNS` and the previous in-class
+   `DatabaseCapturer::SENSITIVE_PATTERNS` (which had drifted) are now a
+   single source of truth: `EzLogsAgent::SensitivePatterns::PATTERNS`.
+   Both capturers + the new BulkDatabaseCapturer consult the same list.
+   The merged list is the UNION of the previous two — no patterns
+   removed; `passwd`, `pwd`, `cvv`, `cvc`, `pem`, `cipher`, `nonce`,
+   `salt`, `digest`, `signature`, `hmac` all continue to be masked.
+ - `encrypts :foo` introspection (Rails 7+ `model.class.encrypted_attributes`)
+   is now exposed as the standalone `EzLogsAgent::EncryptedAttributes`
+   module, so both the per-row and bulk capturers consult it via the
+   same path. Behavior unchanged for per-row.
+
+ ### Limits (documented)
+ - Raw `connection.execute(sql)` calls are not captured (no
+   notifications fire under `sql.active_record` with a recognizable
+   shape). Use the typed bulk methods to get visibility.
+ - Specific row IDs affected by a bulk op are not captured — only the
+   filter rule (WHERE columns + values) and row count. For per-row
+   detail, use `find_each(&:destroy)` style which fires per-row
+   callbacks.
+
  ## [0.1.10] — 2026-06-05
 
  ### Fixed
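The `bulk:<row_count>` sentinel described in the 0.2.0 changelog entry can be illustrated with a small standalone sketch. This is not the gem's code: `bulk_resource_id` and `bulk_sentinel?` are hypothetical helper names, and the `"unknown"` fallback for a missing row count is taken from the `build_resource_ids` method that appears later in this diff.

```ruby
# Sketch only: hypothetical helpers that mirror the "bulk:<row_count>" sentinel.

# Builds the sentinel id. A non-Integer row_count (for example nil) falls
# back to "unknown" so the id is never nil.
def bulk_resource_id(row_count)
  count = row_count.is_a?(Integer) ? row_count.to_s : "unknown"
  "bulk:#{count}"
end

# Hypothetical display-side check: a sentinel id does not point at a
# single record, so it should not be rendered as an entity link.
def bulk_sentinel?(resource_id)
  resource_id.to_s.start_with?("bulk:")
end

resource = { resource_type: "Order", resource_id: bulk_resource_id(50_000) }
puts resource.inspect
puts bulk_sentinel?(resource[:resource_id])
```

The sentinel keeps the resource entry non-nil while making it obvious to a consumer that no single row is being referenced.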
data/README.md CHANGED
@@ -218,17 +218,29 @@ Sidekiq and ActiveJob executions:
 
  ActiveRecord create, update, and destroy operations:
 
- **What's captured:**
+ **What's captured (per-row, via `after_create` / `after_update` / `after_destroy`):**
  - Model class name, record ID, operation type (create/update/destroy)
  - For creates: initial attribute values
  - For updates: one meaningful attribute change (e.g., `status: pending → shipped`)
  - For destroys: final attribute values before deletion
  - Correlation ID (automatically inherited from the current request or job)
 
+ **What's captured (bulk, via `ActiveSupport::Notifications` on `sql.active_record`):**
+ - The four ActiveRecord operations that bypass per-row callbacks:
+   `delete_all`, `update_all`, `insert_all`, `upsert_all`.
+ - For each: model class, row count, the filter rule (WHERE columns +
+   sanitized bind values), and operation-specific detail (SET hash for
+   `update_all`, column names for `insert_all` / `upsert_all` — values
+   are NOT shipped for bulk inserts).
+ - `dependent: :delete_all` cascades during a parent destroy.
+
  **What's NOT captured:**
  - SELECT queries (read operations don't change data)
  - Schema migrations (Rails internal operations)
- - Bulk operations (e.g., `update_all`, `delete_all`)
+ - Raw `connection.execute(sql)` calls (no recognizable model class)
+ - Individual row IDs affected by bulk operations (only the filter rule
+   and row count — pulling the IDs would require modifying the
+   customer's SQL, which violates the read-only principle).
 
  **Automatic exclusions (no configuration needed):**
  - `sessions` — Session store updates
@@ -777,18 +789,23 @@ Common validation errors:
 
  3. **Look for database capture registration in logs:**
     ```
-    [Railtie] Database capture installed
+    [Railtie] Database capture installed (per-row + bulk)
     ```
 
  4. **Verify models aren't excluded:**
     Check `config.excluded_tables` to ensure your tables aren't being filtered out.
 
- 5. **Remember: Only create/update/destroy are captured:**
-    - ✅ `User.create(...)` — Captured
-    - ✅ `user.update(...)` — Captured
-    - ✅ `user.destroy` — Captured
+ 5. **What is and isn't captured:**
+    - ✅ `User.create(...)` — Captured (per-row callback)
+    - ✅ `user.update(...)` — Captured (per-row callback)
+    - ✅ `user.destroy` — Captured (per-row callback)
+    - ✅ `User.update_all(...)` — Captured as a `bulk_database` event
+      (filter + SET clause + row count; no per-row IDs)
+    - ✅ `User.where(...).delete_all` — Captured as a `bulk_database` event
+    - ✅ `User.insert_all(...)` / `upsert_all(...)` — Captured (columns + count)
     - ❌ `User.find(...)` — NOT captured (read-only)
-    - ❌ `User.update_all(...)` — NOT captured (bulk operation)
+    - ❌ `User.connection.execute("DELETE FROM ...")` — NOT captured
+      (raw SQL bypasses the typed AR API we hook)
 
  ---
 
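The README mentions "sanitized bind values" for bulk events. As a rough illustration of masking by column name, here is a minimal sketch. It assumes substring, case-insensitive matching (as the removed `SENSITIVE_PATTERNS` comment later in this diff describes) and uses only four fragments visible in that hunk. `SKETCH_PATTERNS`, `sensitive_column?`, and `mask_binds` are hypothetical names, not the gem's API, and encrypted-attribute checks are left out.

```ruby
# Sketch only: a hypothetical subset of sensitive name fragments. The gem's
# full list (EzLogsAgent::SensitivePatterns::PATTERNS) is not reproduced here.
SKETCH_PATTERNS = %w[password token secret api_key].freeze

# Substring, case-insensitive match on the column name.
def sensitive_column?(column)
  name = column.to_s.downcase
  SKETCH_PATTERNS.any? { |fragment| name.include?(fragment) }
end

# Masks the value of every bind whose column looks sensitive. Binds with
# no attributed column (column: nil) pass through unchanged.
def mask_binds(binds)
  binds.map do |bind|
    column = bind[:column]
    value = column && sensitive_column?(column) ? "[FILTERED]" : bind[:value]
    { column: column, value: value }
  end
end

binds = [
  { column: "status", value: "cart" },
  { column: "reset_token", value: "abc123" },
  { column: nil, value: 7 }
]
masked = mask_binds(binds)
puts masked.inspect
```

Masking is keyed on the column name rather than the value, which is why the parser's column attribution matters: a bind without a column cannot be matched against the name list.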
@@ -0,0 +1,302 @@
+ # frozen_string_literal: true
+
+ module EzLogsAgent
+   # Pure-functional parser for the four ActiveRecord bulk operations that
+   # bypass per-row callbacks: delete_all, update_all, insert_all, upsert_all.
+   # Used by BulkDatabaseCapturer to turn an AS::Notifications "sql.active_record"
+   # payload into a structured wire shape the server can humanize.
+   #
+   # ## What it extracts
+   #
+   # For delete_all:
+   #   { operation: :delete_all, where_template: "status = $1", where_binds: [{column:, value:}] }
+   #
+   # For update_all:
+   #   { operation: :update_all, set: {"status" => "paid"}, where_template:, where_binds: }
+   #
+   # For insert_all / upsert_all:
+   #   { operation: :insert_all|:upsert_all, columns: ["name", "email"] }
+   #   (No values — per the product decision; column SHAPE only.)
+   #
+   # For anything else (subqueries, joins, raw SQL the regex can't parse,
+   # malformed binds): returns { unparseable: true }. BulkDatabaseCapturer
+   # falls back to shipping row_count + operation + model_class with no
+   # template / binds / set — the timeline still reads "Bulk delete: 50,000
+   # orders" minus the WHERE detail.
+   #
+   # ## Why regex (not Arel / pg_query)
+   #
+   # Both Arel and pg_query are adapter-specific and either slow or heavy
+   # to add as a runtime dependency on every customer host app. Regex on
+   # the standardized AR-emitted SQL string is fast (sub-millisecond on
+   # typical statements) and adapter-tolerant. The graceful "unparseable"
+   # branch covers the long tail.
+   #
+   # ## Adapter quoting handled
+   #
+   # - PostgreSQL: "orders"."status" = $1 (double-quoted, $N placeholders)
+   # - SQLite: "orders"."status" = ? (double-quoted, ? placeholders)
+   # - MySQL: `orders`.`status` = ? (backticks, ? placeholders)
+   module BulkSqlParser
+     # Loose match for an identifier wrapped in any of the three quote
+     # styles AR uses. Captures the unquoted name.
+     IDENTIFIER = /["`]([^"`]+)["`]/.freeze
+
+     # Either a "qualified" or "bare" identifier reference, used in
+     # WHERE/SET clause columns. AR prefixes the table name in some
+     # paths and not in others — accept both. Captures just the column.
+     COLUMN = /(?:#{IDENTIFIER}\.)?#{IDENTIFIER}/.freeze
+
+     # A bind placeholder — Postgres uses $N (1-indexed), SQLite/MySQL use ?.
+     # The order of placeholders in the SQL corresponds 1:1 with the order
+     # of values in `binds` / `type_casted_binds`.
+     PLACEHOLDER = /(?:\$\d+|\?)/.freeze
+
+     module_function
+
+     # @param sql [String] Raw SQL string from `payload[:sql]`.
+     # @param type_casted_binds [Array] `payload[:type_casted_binds]` —
+     #   already-typecast values in placeholder order. May be nil for
+     #   raw SQL paths.
+     # @return [Hash] Always returns a Hash. Either the parsed structure
+     #   (keys above) or `{ unparseable: true }`.
+     def parse(sql:, type_casted_binds:)
+       return { unparseable: true } if sql.nil? || sql.empty?
+
+       sql_stripped = sql.strip
+
+       case sql_stripped
+       when /\ADELETE FROM /i
+         parse_delete(sql_stripped, type_casted_binds || [])
+       when /\AUPDATE /i
+         parse_update(sql_stripped, type_casted_binds || [])
+       when /\AINSERT INTO /i
+         parse_insert(sql_stripped)
+       else
+         { unparseable: true }
+       end
+     rescue StandardError
+       # If the regex engine, binds zipping, or any sub-parse step raises,
+       # ship unparseable rather than crash the capture handler. The
+       # BulkDatabaseCapturer's own rescue would also catch this, but
+       # defending here means the rest of the capturer sees a uniform
+       # return shape.
+       { unparseable: true }
+     end
+
+     # Returns the symbolic operation name we expect downstream. Detected
+     # from SQL shape, independent of the `payload[:name]` Rails version
+     # variance (see plan §"insert_all/upsert_all payload :name varies").
+     # Upsert detection requires `ON CONFLICT ... DO UPDATE` (PG/SQLite) or
+     # `ON DUPLICATE KEY` (MySQL); any other INSERT counts as insert_all.
+     #
+     # @return [Symbol, nil] :delete_all, :update_all, :insert_all,
+     #   :upsert_all, or nil if not a bulk op.
+     def detect_operation(sql)
+       return nil if sql.nil?
+
+       sql_up = sql.lstrip.upcase
+
+       return :delete_all if sql_up.start_with?("DELETE FROM ")
+       return :update_all if sql_up.start_with?("UPDATE ")
+
+       if sql_up.start_with?("INSERT INTO ")
+         # Disambiguate insert_all vs upsert_all:
+         # - insert_all on PG/SQLite emits `ON CONFLICT DO NOTHING` (still insert).
+         # - upsert_all on PG/SQLite emits `ON CONFLICT ... DO UPDATE SET ...`.
+         # - upsert_all on MySQL emits `ON DUPLICATE KEY UPDATE ...`.
+         if (sql_up.include?("ON CONFLICT") && sql_up.include?("DO UPDATE")) ||
+            sql_up.include?("ON DUPLICATE KEY")
+           return :upsert_all
+         end
+
+         :insert_all
+       end
+     end
+
+     # --- DELETE FROM "orders" WHERE "orders"."status" = $1 ---
+     def parse_delete(sql, type_casted_binds)
+       where_sql = extract_where(sql)
+       template, binds = build_template_and_binds(where_sql, type_casted_binds)
+
+       { operation: :delete_all, where_template: template, where_binds: binds }
+     end
+
+     # --- UPDATE "orders" SET "status" = $1 WHERE "orders"."status" = $2 ---
+     # SET binds come first in placeholder order, then WHERE binds.
+     def parse_update(sql, type_casted_binds)
+       set_sql = extract_set(sql)
+       where_sql = extract_where(sql)
+       return { unparseable: true } if set_sql.nil?
+
+       set_pairs, set_bind_count = parse_set_assignments(set_sql)
+       set_binds = type_casted_binds.first(set_bind_count)
+       where_binds_raw = type_casted_binds.drop(set_bind_count)
+
+       set_hash = zip_set_values(set_pairs, set_binds)
+       where_template, where_binds = build_template_and_binds(where_sql, where_binds_raw)
+
+       {
+         operation: :update_all,
+         set: set_hash,
+         where_template: where_template,
+         where_binds: where_binds
+       }
+     end
+
+     # --- INSERT INTO "users" ("name","email") VALUES (...), (...) ---
+     def parse_insert(sql)
+       operation = detect_operation(sql)
+       return { unparseable: true } unless operation == :insert_all || operation == :upsert_all
+
+       columns = extract_insert_columns(sql)
+       return { unparseable: true } if columns.nil? || columns.empty?
+
+       { operation: operation, columns: columns }
+     end
+
+     # Returns the WHERE-clause SQL (without the keyword), or nil if absent.
+     # Stops at end-of-string, RETURNING, ORDER BY, LIMIT (AR rarely emits
+     # these on bulk DML but cheap insurance).
+     def extract_where(sql)
+       match = sql.match(/\sWHERE\s+(.+?)(?:\s+(?:RETURNING|ORDER BY|LIMIT)\b|\z)/i)
+       match && match[1].strip
+     end
+
+     # Returns the SET-clause SQL between "SET" and "WHERE"/end-of-string.
+     def extract_set(sql)
+       match = sql.match(/\sSET\s+(.+?)(?:\s+WHERE\s+|\s+RETURNING\b|\z)/i)
+       match && match[1].strip
+     end
+
+     # SET clause is comma-separated `"col" = <placeholder|literal>` pairs.
+     # AR inlines non-symbol values as literals (no placeholder) — we still
+     # surface those in the result so the reader sees what changed.
+     # Returns [[col_name, placeholder_or_literal_str], ...] + total bind count.
+     def parse_set_assignments(set_sql)
+       pairs = []
+       bind_count = 0
+
+       split_top_level_commas(set_sql).each do |assignment|
+         m = assignment.match(/\A#{COLUMN}\s*=\s*(.+)\z/)
+         next unless m
+
+         col = m[2] || m[1]
+         rhs = m[3].strip
+         pairs << [col, rhs]
+         bind_count += 1 if rhs.match?(PLACEHOLDER)
+       end
+
+       [pairs, bind_count]
+     end
+
+     # Walks set_pairs in order, taking the next bind for each placeholder
+     # RHS and pulling the literal text for non-placeholder RHS (e.g.,
+     # `updated_at = '2026-06-05 12:00:00'`).
+     def zip_set_values(set_pairs, set_binds)
+       bind_idx = 0
+       set_pairs.each_with_object({}) do |(col, rhs), acc|
+         if rhs.match?(PLACEHOLDER)
+           acc[col] = set_binds[bind_idx]
+           bind_idx += 1
+         else
+           # Strip surrounding quotes to surface the actual value.
+           acc[col] = rhs.gsub(/\A['"]|['"]\z/, "")
+         end
+       end
+     end
+
+     # Splits on commas that are NOT inside (parens) or quotes. Bulk DML
+     # SET / WHERE almost never nests, but defensive against function
+     # calls like `coalesce(x, 0)`.
+     def split_top_level_commas(str)
+       result = []
+       depth = 0
+       in_quote = false
+       quote_char = nil
+       buffer = +""
+
+       str.each_char do |ch|
+         if in_quote
+           buffer << ch
+           in_quote = false if ch == quote_char
+         elsif ch == "'" || ch == '"' || ch == "`"
+           in_quote = true
+           quote_char = ch
+           buffer << ch
+         elsif ch == "("
+           depth += 1
+           buffer << ch
+         elsif ch == ")"
+           depth -= 1
+           buffer << ch
+         elsif ch == "," && depth.zero?
+           result << buffer.strip unless buffer.empty?
+           buffer = +""
+         else
+           buffer << ch
+         end
+       end
+       result << buffer.strip unless buffer.empty?
+       result
+     end
+
+     # Given a WHERE clause SQL and the binds in placeholder order, walk
+     # the placeholders and pair each one with the column to its LEFT
+     # (the standard `"table"."col" = $1` shape AR emits). Returns the
+     # template (placeholder-preserved) and the binds tagged with column.
+     #
+     # Unrecognized shapes (subselects, joins, NULL checks without binds)
+     # leave the template intact but produce no column for the bind, in
+     # which case the bind ships with column: nil. The display layer
+     # handles nil-column binds.
+     def build_template_and_binds(where_sql, type_casted_binds)
+       return [nil, []] if where_sql.nil?
+
+       template = where_sql
+       binds = []
+       bind_index = 0
+
+       # Walk the template scanning each placeholder, looking backward
+       # for the nearest column identifier to its left.
+       template.scan(PLACEHOLDER) do
+         match_data = Regexp.last_match
+         next unless match_data
+
+         column = nearest_column_left_of(template, match_data.begin(0))
+         value = type_casted_binds[bind_index]
+         binds << { column: column, value: value }
+         bind_index += 1
+       end
+
+       [template, binds]
+     end
+
+     # Finds the most recent column identifier to the left of `pos` in
+     # the WHERE clause. Used to attribute each bind to its column name
+     # so we can mask sensitive values by name downstream.
+     def nearest_column_left_of(where_sql, pos)
+       prefix = where_sql[0...pos]
+       # Look for the last identifier before the operator (=, <, >, etc.)
+       match = prefix.match(/#{COLUMN}\s*(?:=|<>|!=|<=|>=|<|>|LIKE|IN|IS)\s*\z/i)
+       return nil unless match
+
+       # COLUMN has two capture groups: [table, col] or [_, col].
+       match[2] || match[1]
+     end
+
+     # --- INSERT INTO "users" ("name","email") VALUES ... ---
+     # Extracts the ordered column list. Returns nil if the open paren
+     # column list isn't present (e.g., INSERT ... DEFAULT VALUES).
+     def extract_insert_columns(sql)
+       # First parenthesized list after the table name.
+       m = sql.match(/\AINSERT INTO\s+#{IDENTIFIER}\s*\(([^)]+)\)/i)
+       return nil unless m
+
+       columns_block = m[2]
+       columns_block.split(",").map do |col|
+         col.strip.gsub(/\A["`]|["`]\z/, "")
+       end.reject(&:empty?)
+     end
+   end
+ end
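To make the placeholder-to-column attribution in the parser above easier to follow, here is a standalone sketch of the same idea. The three regexes are copied from the diff. The `ParserSketch` module, the `OPERATOR` constant, and the `attribute_binds` method are hypothetical names introduced for illustration; the real module splits this work across `build_template_and_binds` and `nearest_column_left_of`.

```ruby
# Sketch only: pairs each bind placeholder in a WHERE clause with the
# quoted column immediately to its left, following the approach in the diff.
module ParserSketch
  IDENTIFIER = /["`]([^"`]+)["`]/.freeze
  COLUMN = /(?:#{IDENTIFIER}\.)?#{IDENTIFIER}/.freeze
  PLACEHOLDER = /(?:\$\d+|\?)/.freeze
  # Hypothetical constant: the operator alternation from the diff, extracted.
  OPERATOR = /(?:=|<>|!=|<=|>=|<|>|LIKE|IN|IS)/i.freeze

  module_function

  def attribute_binds(where_sql, values)
    binds = []
    where_sql.scan(PLACEHOLDER) do
      # Position of the current placeholder within the WHERE clause.
      position = Regexp.last_match.begin(0)
      prefix = where_sql[0...position]
      # Group 2 is always the column; group 1 is the optional table name.
      match = prefix.match(/#{COLUMN}\s*#{OPERATOR}\s*\z/)
      binds << { column: match && match[2], value: values[binds.length] }
    end
    binds
  end
end

where_sql = %("orders"."status" = $1 AND "orders"."total" > $2)
result = ParserSketch.attribute_binds(where_sql, ["cart", 100])
puts result.inspect
```

A shape the regex cannot attribute, such as a function call wrapped around the column, yields `column: nil` in this sketch, which matches the nil-column behaviour the parser comments describe.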
@@ -0,0 +1,368 @@
+ # frozen_string_literal: true
+
+ require "active_support/notifications"
+
+ module EzLogsAgent
+   module Capturers
+     # Captures bulk SQL operations that bypass ActiveRecord lifecycle
+     # callbacks: delete_all, update_all, insert_all, upsert_all.
+     #
+     # ## Why this exists
+     #
+     # DatabaseCapturer (the per-row sibling) hooks after_create/_update/_destroy
+     # to capture rich per-record context (saved_changes, encrypted_attributes,
+     # display_name). That model breaks for bulk ops — Rails issues a single
+     # UPDATE/DELETE/INSERT statement against the database WITHOUT instantiating
+     # records, so the callbacks never fire. Customer code like
+     #
+     #   Order.where(status: "cart").delete_all
+     #   Library.where(closed: true).update_all(status: "active")
+     #   User.insert_all([{name: "a"}, {name: "b"}])
+     #
+     # plus `dependent: :delete_all` cascades during a parent destroy, are all
+     # invisible to the callback-based path. This capturer fills the gap.
+     #
+     # ## How it works
+     #
+     # Subscribes to "sql.active_record" — the standard Rails instrumentation
+     # API every observability tool uses (Datadog APM, AppSignal, Skylight).
+     # On every SQL statement the host app runs, we get a payload with
+     # the raw SQL, binds, name, and row_count. We filter aggressively
+     # to ONLY four operations (delete_all / update_all / insert_all /
+     # upsert_all) by SQL shape detection (BulkSqlParser.detect_operation),
+     # then parse + sanitize + ship.
+     #
+     # ## Dedup vs DatabaseCapturer
+     #
+     # Per-row CRUD (`user.save`, `order.destroy`) fires `after_*` callbacks
+     # AND produces an `sql.active_record` notification with a singular name
+     # ("User Update", "Order Destroy"). DatabaseCapturer captures these via
+     # callbacks; this capturer ignores them because their SQL shape is NOT
+     # one of the four bulk operations. Mutually exclusive — no double-capture.
+     #
+     # Cascade case: `Company has_many :orders, dependent: :delete_all` issues
+     # a single DELETE for the children. Callbacks don't fire on the children
+     # (delete_all bypasses them by design), but this capturer catches the
+     # bulk DELETE. The parent's `after_destroy` is captured separately by
+     # DatabaseCapturer. Both events share the request's correlation_id and
+     # land under the same Action shell. Reader sees parent + cascade as
+     # sibling rows on the timeline — the right narrative.
+     #
+     # ## Wire shape (matches server EventIngest expectations)
+     #
+     #   {
+     #     source_type: "bulk_database",
+     #     source_data: {
+     #       model_class: "Order",
+     #       operation: "delete_all" | "update_all" | "insert_all" | "upsert_all",
+     #       row_count: 50000,
+     #       where_template: "\"orders\".\"status\" = $1",
+     #       where_binds: [{column: "status", value: "cart"}],
+     #       set: {"status" => "paid"}, # only update_all
+     #       columns: ["name", "email"] # only insert_all / upsert_all
+     #     },
+     #     correlation_id: ...,
+     #     resource_ids: [{resource_type: "Order", resource_id: "bulk:50000"}],
+     #     outcome: "success",
+     #     duration_ms: <finish - start>
+     #   }
+     #
+     # The "bulk:<count>" sentinel resource_id is required because the server's
+     # ResourceAggregationStage drops entries with nil resource_id. The
+     # display layer detects the sentinel and renders "Order (50,000 rows)"
+     # without a clickable entity link.
+     module BulkDatabaseCapturer
+       # AR's `payload[:name]` convention for the four bulk operations
+       # (verified against Rails 7.0–8.0 + SQLite/PG/MySQL):
+       #
+       #   delete_all → "<Model> Delete All"
+       #   update_all → "<Model> Update All"
+       #   insert_all → "<Model> Insert" (or "<Model> Bulk Insert" on older PG)
+       #   upsert_all → "<Model> Upsert" (or "<Model> Bulk Upsert" on older PG)
+       #
+       # Per-row CRUD uses singular operation verbs:
+       #   user.save (new) → "<Model> Create"
+       #   user.update → "<Model> Update" (no " All")
+       #   user.destroy → "<Model> Destroy"
+       #
+       # So the four bulk shapes are uniquely identified by either:
+       #   - ending in " All" (covers Delete All / Update All), OR
+       #   - the words Insert / Upsert (which are NEVER used for per-row CRUD
+       #     — per-row inserts are tagged "Create", per-row updates "Update").
+       #
+       # SQL shape detection (BulkSqlParser.detect_operation) is the actual
+       # authority — this filter is only a sub-µs pre-pass to skip non-bulk
+       # notifications without parsing SQL.
+       BULK_NAME_HINT = / All\z| (Bulk )?(Insert|Upsert)\z/.freeze
+
+       class << self
+         attr_reader :subscriber
+
+         # Installs the AS::Notifications subscription. Idempotent — calling
+         # twice is a no-op (would otherwise produce double-events because
+         # AS::Notifications.subscribe is itself NOT idempotent).
+         #
+         # Called from Railtie.install_database_capturer alongside the
+         # per-row DatabaseCapturer.install. Both gated by the same
+         # `capture_database` configuration flag — no new toggle.
+         def install
+           return if @installed
+
+           @subscriber = ::ActiveSupport::Notifications.subscribe("sql.active_record") do |*args|
+             payload = args.last
+             event_name = args.first
+             started = args[1]
+             finished = args[2]
+             handle_notification(event_name, started, finished, payload)
+           end
+           @installed = true
+         end
+
+         # Removes the subscription. Specs use this between examples to
+         # avoid leaked subscribers; production never calls it.
+         def uninstall!
+           ::ActiveSupport::Notifications.unsubscribe(@subscriber) if @subscriber
+           @subscriber = nil
+           @installed = false
+         end
+
+         # Per-notification entry point. Wraps everything in `rescue Exception`
+         # because an AS::N handler that raises pollutes the host's subscriber
+         # chain and (depending on the chain order) can break OTHER observability
+         # tools listening on the same channel. Hard rule: bulk capture failures
+         # never propagate.
+         def handle_notification(_event_name, started, finished, payload)
+           return unless capture_enabled?
+           return unless eligible_payload?(payload)
+
+           operation = ::EzLogsAgent::BulkSqlParser.detect_operation(payload[:sql])
+           return unless operation
+
+           model_class = resolve_model_class(payload[:sql])
+           return if model_class.nil?
+           return if table_excluded?(model_class)
+
+           parse_result = ::EzLogsAgent::BulkSqlParser.parse(
+             sql: payload[:sql],
+             type_casted_binds: payload[:type_casted_binds]
+           )
+
+           source_data = build_source_data(
+             operation: operation,
+             model_class: model_class,
+             row_count: payload[:row_count],
+             parse_result: parse_result
+           )
+
+           duration_ms = ((finished - started) * 1000).to_i
+
+           event = ::EzLogsAgent::EventBuilder.build(
+             source_type: :bulk_database,
+             source_data: source_data,
+             outcome: :success,
+             correlation_id: ::EzLogsAgent::Correlation.current,
+             resource_ids: build_resource_ids(model_class, source_data[:row_count]),
+             context: nil,
+             duration_ms: duration_ms
+           )
+
+           ::EzLogsAgent::Buffer.push(event)
+         rescue Exception => e # rubocop:disable Lint/RescueException
+           # See class comment: a raise from an AS::N handler hurts other
+           # subscribers, so we swallow EVERYTHING (not just StandardError).
+           # Logged at error level so a regression surfaces in customer
+           # debug output, but never re-raised.
+           safe_log_error("handle_notification", e)
+         end
+
+         # Fast pre-filter — checks the name field WITHOUT touching SQL.
+         # Returns false for the vast majority of notifications (per-row
+         # CRUD, SCHEMA, TRANSACTION, internal lookups).
+         #
+         # @param payload [Hash, nil]
+         # @return [Boolean]
+         def eligible_payload?(payload)
+           return false unless payload.is_a?(Hash)
+
+           name = payload[:name].to_s
+           return false if name.empty?
+
+           BULK_NAME_HINT.match?(name)
+         end
+
+         # Looks up the model class from the SQL's table name. Returns nil
+         # for SQL we can't attribute (raw multi-table queries, anonymous
+         # adapter SQL, schema introspection). Skipping these is correct —
+         # we'd have nothing to display anyway.
+         def resolve_model_class(sql)
+           return nil if sql.nil?
+
+           table = extract_table_name(sql)
+           return nil if table.nil?
+
+           ::ActiveRecord::Base.descendants.find do |klass|
+             klass.respond_to?(:table_name) && klass.table_name == table && !klass.abstract_class?
+           end
+         rescue StandardError
+           nil
+         end
+
+         # Extracts the unquoted table name from the FROM / INTO / UPDATE
+         # clause. Handles all three identifier-quote styles (PG/SQLite/
+         # MySQL). Returns nil on unparseable SQL.
+         def extract_table_name(sql)
+           # DELETE FROM "table"
+           if (m = sql.match(/\ADELETE FROM\s+["`]?([^"`\s]+)["`]?/i))
+             return m[1]
+           end
+           # UPDATE "table"
+           if (m = sql.match(/\AUPDATE\s+["`]?([^"`\s]+)["`]?/i))
+             return m[1]
+           end
+           # INSERT INTO "table"
+           if (m = sql.match(/\AINSERT INTO\s+["`]?([^"`\s]+)["`]?/i))
+             return m[1]
+           end
+
+           nil
+         end
+
+         # Builds the source_data hash from the parser result, applying
+         # encrypted_attributes drop + sensitive-pattern masking on
+         # column-keyed values.
+         def build_source_data(operation:, model_class:, row_count:, parse_result:)
+           base = {
+             model_class: model_class.name,
+             operation: operation.to_s,
+             row_count: row_count
+           }
+
+           return base if parse_result[:unparseable]
+
+           if (set = parse_result[:set])
+             base[:set] = mask_set_hash(set, model_class)
+           end
+
+           if (template = parse_result[:where_template])
+             base[:where_template] = template
+             base[:where_binds] = mask_where_binds(parse_result[:where_binds], model_class)
+           end
+
+           if (columns = parse_result[:columns])
+             base[:columns] = filter_columns(columns, model_class)
+           end
+
+           base
+         end
+
+         # Walks `{ column => value }` from update_all SET, masking values
+         # whose column is encrypted OR matches a sensitive pattern. Date /
+         # Time / BigDecimal values get JSON-formatted so they don't collapse
+         # to "[Object]" downstream.
+         def mask_set_hash(set, model_class)
+           set.each_with_object({}) do |(col, value), acc|
+             acc[col] =
+               if ::EzLogsAgent::EncryptedAttributes.attribute?(model_class, col)
+                 "[FILTERED]"
+               elsif ::EzLogsAgent::SensitivePatterns.match?(col)
+                 "[FILTERED]"
+               else
+                 format_value_for_json(value)
+               end
+           end
+         end
+
+         # Walks the array of {column:, value:} bind entries from the WHERE
+         # parser, same masking rules as mask_set_hash. Binds whose column
+         # is nil (the parser couldn't attribute them) ride through with
+         # the formatted value — display falls back to template substitution.
+         def mask_where_binds(binds, model_class)
+           (binds || []).map do |bind|
+             col = bind[:column]
+             value = bind[:value]
+             masked_value =
+               if col && (::EzLogsAgent::EncryptedAttributes.attribute?(model_class, col) ||
+                          ::EzLogsAgent::SensitivePatterns.match?(col))
+                 "[FILTERED]"
+               else
+                 format_value_for_json(value)
+               end
+
+             { column: col, value: masked_value }
+           end
+         end
+
+         # For insert_all / upsert_all, we ship column names ONLY (no
+         # values — product decision). Sensitive column names still need
+         # masking so the column LIST itself doesn't hint "this table has
+         # a `password` column". Drop sensitive columns from the displayed
+         # list; replace with the literal marker so the count remains true.
+         def filter_columns(columns, model_class)
+           columns.map do |col|
+             if ::EzLogsAgent::EncryptedAttributes.attribute?(model_class, col)
+               "[FILTERED]"
+             elsif ::EzLogsAgent::SensitivePatterns.match?(col)
+               "[FILTERED]"
+             else
+               col
+             end
+           end
+         end
+
312
+ # Builds the sentinel resource entry. row_count may be nil (Rails
313
+ # < 7 didn't ship it; some adapters still don't) — fall back to
314
+ # "bulk" so the entry is non-nil and the server-side
315
+ # ResourceAggregationStage doesn't drop it.
316
+ def build_resource_ids(model_class, row_count)
317
+ count_str = row_count.is_a?(Integer) ? row_count.to_s : "unknown"
318
+ [{ resource_type: model_class.name, resource_id: "bulk:#{count_str}" }]
319
+ end
+
+ # Mirrors DatabaseCapturer's same-named guard. capture_database = false
+ # disables both capturers in one switch.
+ def capture_enabled?
+ ::EzLogsAgent.configuration.capture_database
+ rescue StandardError
+ false
+ end
+
+ # Uses DatabaseCapturer's existing all_excluded_tables list: one
+ # config knob, both capturers obey it.
+ def table_excluded?(model_class)
+ return false unless model_class.respond_to?(:table_name)
+
+ ::EzLogsAgent.configuration.all_excluded_tables.include?(model_class.table_name)
+ rescue StandardError
+ false
+ end
+
+ # Same formatter as DatabaseCapturer. Keeps Date / Time / BigDecimal
+ # from collapsing to "[Object]" when they reach Sanitizer / wire.
+ def format_value_for_json(value)
+ case value
+ when ::Time, ::DateTime
+ value.iso8601
+ when ::Date
+ value.to_s
+ when ::BigDecimal
+ value.to_f
+ when ::Array
+ value.map { |v| format_value_for_json(v) }
+ else
+ value
+ end
+ end
+
+ def safe_log_error(stage, exception)
+ ::EzLogsAgent::Logger.error(
+ "[BulkDatabaseCapturer] #{stage} failed: #{exception.class} - #{exception.message}"
+ )
+ rescue StandardError
+ # Even logging can fail in pathological boot states. We've done
+ # everything reasonable; drop the event silently.
+ nil
+ end
+ end
+ end
+ end
+ end
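
As a rough illustration of the masking and sentinel helpers in this file, the sketch below re-creates `mask_where_binds` and `build_resource_ids` outside the gem. `sensitive_column?` and its three-entry pattern list are hypothetical stand-ins: the real code consults `EncryptedAttributes` and `SensitivePatterns`, takes a model class rather than a name, and also runs unmasked values through `format_value_for_json`. Treat it as an approximation of the behavior, not the shipped implementation.

```ruby
# Hypothetical stand-in for the gem's EncryptedAttributes / SensitivePatterns checks.
def sensitive_column?(col)
  name = col.to_s.downcase
  %w[password token secret].any? { |pattern| name.include?(pattern) }
end

# Masks bind values whose column name looks sensitive. Binds with a nil
# column (not attributed by the parser) pass through unchanged.
def mask_where_binds(binds)
  (binds || []).map do |bind|
    col = bind[:column]
    value = col && sensitive_column?(col) ? "[FILTERED]" : bind[:value]
    { column: col, value: value }
  end
end

# Builds the single sentinel resource entry. A non-Integer row count
# yields the id "bulk:unknown".
def build_resource_ids(model_name, row_count)
  count_str = row_count.is_a?(Integer) ? row_count.to_s : "unknown"
  [{ resource_type: model_name, resource_id: "bulk:#{count_str}" }]
end

binds = [
  { column: "reset_token", value: "abc123" },
  { column: "status", value: "active" },
  { column: nil, value: 42 }
]
masked = mask_where_binds(binds)
ids = build_resource_ids("User", nil)
```

Under these assumptions, only the `reset_token` bind is replaced with `"[FILTERED]"`, and a missing row count still produces a non-nil resource id.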
@@ -60,37 +60,9 @@ module EzLogsAgent
  # Previously we filtered them out, but this loses important context.
  # FOREIGN_KEY_PATTERN = /_id\z/ # Removed January 2026
 
- # Patterns for sensitive data to ignore.
- #
- # The first source of truth is `record.class.encrypted_attributes`
- # (Rails 7+ `encrypts :foo` declaration); see encrypted_attribute?.
- # If the host app encrypted it, we never capture it.
- #
- # This list is the secondary defense: column names that frequently
- # carry sensitive material even when the host app didn't declare
- # `encrypts` (legacy code, manual hashing, externally-generated
- # material). Matching is substring + case-insensitive.
- SENSITIVE_PATTERNS = %w[
- password
- token
- secret
- api_key
- credit_card
- ssn
- social_security
- encrypted
- private_key
- public_key
- signing_key
- pem
- cipher
- nonce
- salt
- digest
- signature
- hmac
- ].freeze
-
+ # Sensitive-attribute name pattern denylist (secondary defense after
+ # `encrypts :foo` introspection) lives in EzLogsAgent::SensitivePatterns;
+ # see sensitive_attribute? below.
 
  @installed = false
  @callbacks_registered = false
@@ -386,35 +358,25 @@ module EzLogsAgent
  end
 
  # Checks whether the host app declared `encrypts :<attribute>` on
- # this model's class. Available since Rails 7.0 via
- # ActiveRecord::Encryption::EncryptableRecord#encrypted_attributes.
- #
- # Safe across host Rails versions: returns false if the API isn't
- # present (older Rails, non-AR records).
+ # this model's class. Delegates to EncryptedAttributes (single
+ # source of truth shared with BulkDatabaseCapturer, which only has
+ # the class, not an instance, for bulk operations).
  #
  # @param attribute [String] The attribute name (already to_s'd)
  # @param model [ActiveRecord::Base] The model instance
  # @return [Boolean]
  def encrypted_attribute?(attribute, model)
- klass = model.class
- return false unless klass.respond_to?(:encrypted_attributes)
-
- encrypted = klass.encrypted_attributes
- return false if encrypted.nil? || encrypted.empty?
-
- encrypted.map(&:to_s).include?(attribute)
- rescue StandardError
- false
+ EzLogsAgent::EncryptedAttributes.attribute?(model.class, attribute)
  end
 
  # Checks if attribute name contains sensitive patterns.
- # Secondary check see SENSITIVE_PATTERNS comment.
+ # Delegates to SensitivePatterns (single source of truth shared
+ # with Sanitizer and BulkDatabaseCapturer).
  #
  # @param attribute [String] The attribute name
  # @return [Boolean]
  def sensitive_attribute?(attribute)
- attr_lower = attribute.downcase
- SENSITIVE_PATTERNS.any? { |pattern| attr_lower.include?(pattern) }
+ EzLogsAgent::SensitivePatterns.match?(attribute)
  end
 
  # Checks if both values are scalar types
@@ -0,0 +1,45 @@
+ # frozen_string_literal: true
+
+ module EzLogsAgent
+ # Primary defense against capturing encrypted columns: read the host
+ # app's `encrypts :foo` declarations (Rails 7+ ActiveRecord::Encryption)
+ # and drop those attributes from anywhere we'd ship them on the wire.
+ #
+ # Two callers:
+ # - DatabaseCapturer (per-record callbacks): has a model INSTANCE,
+ # but its earlier inline check already read `record.class`.
+ # - BulkDatabaseCapturer (AS::Notifications path): has only the model
+ # CLASS (the bulk SQL never instantiated a record). So this module
+ # takes a class, not an instance, and both call sites converge.
+ #
+ # The Rails API is `ModelClass.encrypted_attributes` (Symbol array).
+ # Available since Rails 7.0; for older Rails or non-AR classes this
+ # module returns false, which is the fail-open default for the encrypts
+ # check. The pattern-based fallback in SensitivePatterns is the second
+ # layer of defense for hosts that don't (or can't) declare encrypts.
+ module EncryptedAttributes
+ module_function
+
+ # @param model_class [Class, nil] The AR class
+ # @param attribute [String, Symbol, nil] The attribute name
+ # @return [Boolean] true iff the host app declared `encrypts :<attribute>`
+ # on `model_class` (or an ancestor). False on any error, missing API,
+ # or empty list; see comment about fail-open semantics above.
+ def attribute?(model_class, attribute)
+ return false if model_class.nil? || attribute.nil?
+ return false unless model_class.respond_to?(:encrypted_attributes)
+
+ declared = model_class.encrypted_attributes
+ return false if declared.nil? || declared.empty?
+
+ attribute_str = attribute.to_s
+ declared.any? { |declared_attr| declared_attr.to_s == attribute_str }
+ rescue StandardError
+ # Same rescue policy as the previous inline check in DatabaseCapturer:
+ # if introspection raises (host app monkey-patched the API, weird
+ # AR class hierarchy), fall through to the pattern-based fallback
+ # rather than crash the capture path.
+ false
+ end
+ end
+ end
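
To make the fail-open behavior of `EncryptedAttributes.attribute?` concrete, here is a minimal standalone re-creation of the lookup. The three model classes are hypothetical: in a real Rails 7+ app, `encrypted_attributes` is populated by `encrypts` declarations, which this sketch only imitates with a hand-written class method.

```ruby
# Standalone re-creation of the lookup shown above (not loaded from the gem).
def encrypted_attribute?(model_class, attribute)
  return false if model_class.nil? || attribute.nil?
  return false unless model_class.respond_to?(:encrypted_attributes)

  declared = model_class.encrypted_attributes
  return false if declared.nil? || declared.empty?

  declared.any? { |declared_attr| declared_attr.to_s == attribute.to_s }
rescue StandardError
  false # fail-open: the name-pattern check is expected to run next
end

# Hypothetical model classes used only for this illustration.
declaring_model = Class.new { def self.encrypted_attributes; [:ssn]; end }
plain_model     = Class.new
raising_model   = Class.new { def self.encrypted_attributes; raise "boom"; end }

results = {
  declared_symbol: encrypted_attribute?(declaring_model, :ssn),
  declared_string: encrypted_attribute?(declaring_model, "ssn"),
  undeclared:      encrypted_attribute?(declaring_model, "email"),
  no_api:          encrypted_attribute?(plain_model, "ssn"),
  raising:         encrypted_attribute?(raising_model, "ssn"),
  nil_class:       encrypted_attribute?(nil, "ssn")
}
```

Only the first two entries come back `true`. A class without the API, a raising API, and a nil class all yield `false`, which is why the name-pattern denylist has to act as the second layer.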
@@ -24,7 +24,10 @@ module EzLogsAgent
  SENSITIVE_KEYS = %w[password token secret api_key credit_card].freeze
 
  # Valid source types
- VALID_SOURCE_TYPES = %w[http_request background_job database_callback].freeze
+ # bulk_database covers AR bulk ops that bypass per-row callbacks
+ # (delete_all, update_all, insert_all, upsert_all) captured via
+ # ActiveSupport::Notifications. See Capturers::BulkDatabaseCapturer.
+ VALID_SOURCE_TYPES = %w[http_request background_job database_callback bulk_database].freeze
 
  # Valid outcome values
  VALID_OUTCOMES = %w[success failure].freeze
@@ -218,10 +218,13 @@ module EzLogsAgent
  EzLogsAgent::Logger.error("[Railtie] Failed to install ActiveJob capturer: #{e.class} - #{e.message}")
  end
 
- # Install Database capturer
+ # Install Database capturers (per-row + bulk).
  #
- # Database capturer installs ActiveRecord lifecycle callbacks
- # (after_create, after_update, after_destroy) for all models.
+ # Two capturers, one switch (`capture_database`):
+ # - DatabaseCapturer: per-row CRUD via after_create / _update / _destroy.
+ # - BulkDatabaseCapturer: bulk SQL via ActiveSupport::Notifications
+ # ("sql.active_record"), narrowly filtered to delete_all / update_all /
+ # insert_all / upsert_all. Catches what callbacks can't see.
  #
  # @return [void]
  def self.install_database_capturer
@@ -240,8 +243,9 @@ module EzLogsAgent
  return if @database_capturer_installed
 
  EzLogsAgent::Capturers::DatabaseCapturer.install
+ EzLogsAgent::Capturers::BulkDatabaseCapturer.install
  @database_capturer_installed = true
- EzLogsAgent::Logger.debug("[Railtie] Database capture installed")
+ EzLogsAgent::Logger.debug("[Railtie] Database capture installed (per-row + bulk)")
  rescue StandardError => e
  EzLogsAgent::Logger.error("[Railtie] Failed to install database capturer: #{e.class} - #{e.message}")
  end
@@ -21,18 +21,11 @@ module EzLogsAgent
  # The module is pure (no I/O, no state), so it's safe to call from
  # any thread.
  module Sanitizer
- # Default sensitive-key patterns. Matched case-insensitively as
- # SUBSTRINGS of the key, so `customer_password` matches `password`.
- SENSITIVE_PATTERNS = %w[
- password passwd pwd
- token access_token refresh_token api_token auth_token
- secret api_secret client_secret
- api_key apikey private_key privatekey secret_key secretkey
- credential auth authorization
- encrypted encrypted_data
- ssn social_security
- credit_card card_number cvv cvc
- ].freeze
+ # Sensitive-key pattern list. Delegates to SensitivePatterns (single
+ # source of truth shared with DatabaseCapturer / BulkDatabaseCapturer).
+ # Kept as a constant alias for backwards compatibility: code that
+ # used `Sanitizer::SENSITIVE_PATTERNS` continues to work.
+ SENSITIVE_PATTERNS = EzLogsAgent::SensitivePatterns::PATTERNS
 
  # Hard ceiling for nested object recursion. Deeper structures
  # collapse to the literal string "[Object]".
@@ -79,18 +72,13 @@ module EzLogsAgent
 
  # Check whether a key matches a sensitive pattern. Public so the
  # HTTP middleware can short-circuit early on identical keys.
+ # Delegates to SensitivePatterns (single source of truth, also
+ # consulted by DatabaseCapturer and BulkDatabaseCapturer).
  #
  # @param key [String, Symbol]
  # @return [Boolean]
  def sensitive_key?(key)
- key_lower = key.to_s.downcase
- return true if SENSITIVE_PATTERNS.any? { |pattern| key_lower.include?(pattern) }
-
- user_patterns = EzLogsAgent.configuration.excluded_graphql_variable_keys || []
- user_patterns.any? { |pattern| key_lower.include?(pattern.to_s.downcase) }
- rescue
- # Defensive: when in doubt, treat as sensitive.
- true
+ EzLogsAgent::SensitivePatterns.match?(key)
  end
 
  private
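
Because the two removed lists (DatabaseCapturer's and Sanitizer's) are both replaced by the unified `PATTERNS` list in `sensitive_patterns.rb`, each caller may now mask names it previously let through. The sketch below is one way to estimate that from the lists in this diff alone: it treats a unified pattern as "widening" for a caller when it is absent from that caller's old list and no old pattern is a substring of it. The variable names and the `widening` lambda are illustrative and not part of the gem.

```ruby
# Lists copied from the removed and added lines of this diff.
old_database_list = %w[
  password token secret api_key credit_card ssn social_security encrypted
  private_key public_key signing_key pem cipher nonce salt digest signature hmac
]
old_sanitizer_list = %w[
  password passwd pwd token access_token refresh_token api_token auth_token
  secret api_secret client_secret api_key apikey private_key privatekey
  secret_key secretkey credential auth authorization encrypted encrypted_data
  ssn social_security credit_card card_number cvv cvc
]
unified_list = %w[
  password passwd pwd token access_token refresh_token api_token auth_token
  secret api_secret client_secret api_key apikey private_key privatekey
  secret_key secretkey public_key signing_key credential auth authorization
  encrypted encrypted_data pem cipher nonce salt digest signature hmac
  ssn social_security credit_card card_number cvv cvc
]

# A unified pattern widens matching for a caller only if none of that
# caller's old patterns is already a substring of it.
widening = lambda do |old_list|
  (unified_list - old_list).reject { |pat| old_list.any? { |old| pat.include?(old) } }
end

new_for_database  = widening.call(old_database_list)
new_for_sanitizer = widening.call(old_sanitizer_list)
```

By this measure the per-row capturer gains patterns such as `auth` and `credential`, while the Sanitizer gains the key-material patterns such as `salt` and `digest`.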
@@ -0,0 +1,64 @@
+ # frozen_string_literal: true
+
+ module EzLogsAgent
+ # Single source of truth for the agent's sensitive-key denylist. Used by
+ # every capture path that needs to mask a value based on its column /
+ # parameter / argument name (Sanitizer for HTTP params + job args,
+ # DatabaseCapturer for AR attributes, BulkDatabaseCapturer for SQL
+ # WHERE binds + SET values).
+ #
+ # This is a NAME-pattern denylist: the secondary defense, separate from
+ # the primary defense (Rails `encrypts :foo` introspection via
+ # `model.class.encrypted_attributes`, handled in EncryptedAttributes).
+ # Use both together: the encrypts check catches what the host app
+ # declared, this list catches what got past the declaration (legacy
+ # columns, manual hashing, externally-generated material).
+ #
+ # Matching rules:
+ # - Case-insensitive
+ # - Substring (so `customer_password` matches `password`)
+ # - User-extensible via `EzLogsAgent.configuration.excluded_graphql_variable_keys`
+ module SensitivePatterns
+ # Union of every column / key name we treat as sensitive. Curated
+ # from RFC 7468 / OWASP top sensitive-data categories plus
+ # ActiveRecord conventions. Keep this list narrow but defensive:
+ # adding a pattern is cheap; removing one is a backwards-incompatible
+ # behavior change for customer data on the wire.
+ PATTERNS = %w[
+ password passwd pwd
+ token access_token refresh_token api_token auth_token
+ secret api_secret client_secret
+ api_key apikey private_key privatekey secret_key secretkey
+ public_key signing_key
+ credential auth authorization
+ encrypted encrypted_data
+ pem cipher nonce salt digest signature hmac
+ ssn social_security
+ credit_card card_number cvv cvc
+ ].freeze
+
+ module_function
+
+ # @param key [String, Symbol, nil]
+ # @return [Boolean] true if the key matches a sensitive pattern OR
+ # matches a user-configured pattern in `excluded_graphql_variable_keys`
+ def match?(key)
+ return false if key.nil?
+
+ key_lower = key.to_s.downcase
+ return true if PATTERNS.any? { |pattern| key_lower.include?(pattern) }
+
+ # Direct configuration access: any raise here propagates to the
+ # rescue below and we fail-closed. Wrapping the access in its own
+ # rescue would silently fall back to "no extra patterns" on a
+ # config bug and the outer rescue would never fire, which means
+ # the broken-config path becomes "leak", not "mask".
+ user_patterns = EzLogsAgent.configuration.excluded_graphql_variable_keys || []
+ user_patterns.any? { |pattern| key_lower.include?(pattern.to_s.downcase) }
+ rescue StandardError
+ # Defensive: if configuration access raises, treat as sensitive.
+ # Better to over-mask than to leak.
+ true
+ end
+ end
+ end
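
The matching rules of `SensitivePatterns.match?` (case-insensitive, substring, fail-closed on configuration errors) can be sketched in isolation as follows. `sketch_patterns` holds only a small subset of the real list, and `user_patterns_source` is a hypothetical callable standing in for `EzLogsAgent.configuration.excluded_graphql_variable_keys`. Neither name exists in the gem.

```ruby
# Subset of the unified list, for illustration only.
def sketch_patterns
  %w[password token secret auth salt digest]
end

# `user_patterns_source` is a hypothetical stand-in for the gem's
# configuration lookup; it may return nil, an array, or raise.
def sensitive_match?(key, user_patterns_source = -> { [] })
  return false if key.nil?

  key_lower = key.to_s.downcase
  return true if sketch_patterns.any? { |pattern| key_lower.include?(pattern) }

  user_patterns = user_patterns_source.call || []
  user_patterns.any? { |pattern| key_lower.include?(pattern.to_s.downcase) }
rescue StandardError
  true # fail-closed: a broken config masks rather than leaks
end

checks = {
  mixed_case:    sensitive_match?("Customer_Password"),
  substring_hit: sensitive_match?("author_id"),
  plain_column:  sensitive_match?("status"),
  nil_key:       sensitive_match?(nil),
  user_pattern:  sensitive_match?("tenant_ref", -> { ["Tenant"] }),
  broken_config: sensitive_match?("status", -> { raise "bad config" })
}
```

One consequence of substring matching is worth keeping in mind: a name like `author_id` is masked because it contains `auth`, so an unexpected `[FILTERED]` value may simply mean the column name overlaps a pattern.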
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module EzLogsAgent
- VERSION = "0.1.10"
+ VERSION = "0.2.0"
  end
data/lib/ez_logs_agent.rb CHANGED
@@ -8,6 +8,9 @@ require_relative "ez_logs_agent/correlation"
  require_relative "ez_logs_agent/actor_validator"
  require_relative "ez_logs_agent/actor"
  require_relative "ez_logs_agent/user_agent_detector"
+ require_relative "ez_logs_agent/sensitive_patterns"
+ require_relative "ez_logs_agent/encrypted_attributes"
+ require_relative "ez_logs_agent/bulk_sql_parser"
  require_relative "ez_logs_agent/sanitizer"
  require_relative "ez_logs_agent/event_builder"
  require_relative "ez_logs_agent/resource_extractor"
@@ -19,6 +22,7 @@ require_relative "ez_logs_agent/middleware/http_request"
  require_relative "ez_logs_agent/capturers/job_capturer"
  require_relative "ez_logs_agent/capturers/active_job_capturer"
  require_relative "ez_logs_agent/capturers/database_capturer"
+ require_relative "ez_logs_agent/capturers/bulk_database_capturer"
 
  # Load Railtie only when Rails is present
  require_relative "ez_logs_agent/railtie" if defined?(Rails::Railtie)
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: ez_logs_agent
  version: !ruby/object:Gem::Version
- version: 0.1.10
+ version: 0.2.0
  platform: ruby
  authors:
  - dezsirazvan
@@ -123,12 +123,15 @@ files:
  - lib/ez_logs_agent/actor.rb
  - lib/ez_logs_agent/actor_validator.rb
  - lib/ez_logs_agent/buffer.rb
+ - lib/ez_logs_agent/bulk_sql_parser.rb
  - lib/ez_logs_agent/capturers/active_job_capturer.rb
+ - lib/ez_logs_agent/capturers/bulk_database_capturer.rb
  - lib/ez_logs_agent/capturers/database_capturer.rb
  - lib/ez_logs_agent/capturers/job_capturer.rb
  - lib/ez_logs_agent/configuration.rb
  - lib/ez_logs_agent/configuration_validator.rb
  - lib/ez_logs_agent/correlation.rb
+ - lib/ez_logs_agent/encrypted_attributes.rb
  - lib/ez_logs_agent/event_builder.rb
  - lib/ez_logs_agent/flush_scheduler.rb
  - lib/ez_logs_agent/logger.rb
@@ -137,6 +140,7 @@ files:
  - lib/ez_logs_agent/resource_extractor.rb
  - lib/ez_logs_agent/retry_sender.rb
  - lib/ez_logs_agent/sanitizer.rb
+ - lib/ez_logs_agent/sensitive_patterns.rb
  - lib/ez_logs_agent/transport.rb
  - lib/ez_logs_agent/user_agent_detector.rb
  - lib/ez_logs_agent/version.rb