ez_logs_agent 0.1.10 → 0.2.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +88 -0
- data/README.md +25 -8
- data/lib/ez_logs_agent/buffer.rb +14 -0
- data/lib/ez_logs_agent/bulk_sql_parser.rb +312 -0
- data/lib/ez_logs_agent/capturers/active_job_capturer.rb +28 -3
- data/lib/ez_logs_agent/capturers/bulk_database_capturer.rb +578 -0
- data/lib/ez_logs_agent/capturers/database_capturer.rb +46 -58
- data/lib/ez_logs_agent/encrypted_attributes.rb +45 -0
- data/lib/ez_logs_agent/event_builder.rb +4 -1
- data/lib/ez_logs_agent/railtie.rb +8 -4
- data/lib/ez_logs_agent/sanitizer.rb +8 -20
- data/lib/ez_logs_agent/sensitive_patterns.rb +82 -0
- data/lib/ez_logs_agent/version.rb +1 -1
- data/lib/ez_logs_agent.rb +4 -0
- metadata +5 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 96fd8717fab4330e842769b9a4ae03902ccb2487e24c418d54601c99862da306
|
|
4
|
+
data.tar.gz: e0967b4bfe22b2abb5946c213378a615f3d399a9c0b8d8dd3f002f731bda9444
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: bc502ab7e6ec65dab0f691005e4e0a71a23182a184c2ce1a188933ebe01db1155b3e4e500a59d23e7275c1a836450317a63e262d0d0babbf5123120be1718809
|
|
7
|
+
data.tar.gz: cc13f82eece53f6deb53c07ad324cf74fce412d06112abfa86c5ca76a829ed01a1cb421a63125489d969ee2e21efb3d9cb03f5b65412db9792330edca30be271
|
data/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,94 @@
|
|
|
2
2
|
|
|
3
3
|
All notable changes to this project will be documented in this file.
|
|
4
4
|
|
|
5
|
+
## [0.2.1] — 2026-06-05
|
|
6
|
+
|
|
7
|
+
### Fixed
|
|
8
|
+
- **Bulk-op row counts on Postgres.** The PG adapter does not populate
|
|
9
|
+
`payload[:row_count]` for plain `DELETE`/`UPDATE` notifications, so
|
|
10
|
+
the first 0.2.0 builds shipped `row_count: 0` for the cases that
|
|
11
|
+
matter most. `BulkDatabaseCapturer` now prepends a tiny shim onto
|
|
12
|
+
`ActiveRecord::Relation#delete_all` / `#update_all` that stashes the
|
|
13
|
+
returned row count and back-fills it onto the just-pushed event via
|
|
14
|
+
`Buffer.peek_last`. No change to the wire shape — `row_count` now
|
|
15
|
+
carries the real number.
|
|
16
|
+
- **Model resolution in Rails dev mode.** `ActiveRecord::Base.descendants`
|
|
17
|
+
only sees eager-loaded models. The resolver now falls through to
|
|
18
|
+
`safe_constantize` of the classified table name and verifies the
|
|
19
|
+
reconstructed class actually owns that table, so bulk ops on
|
|
20
|
+
lazy-autoloaded models in dev / test no longer silently drop.
|
|
21
|
+
- **Rails Query Log Tags noise in `where_template`.** The parser now
|
|
22
|
+
strips `/*application='X',action='Y'*/` comments before extracting
|
|
23
|
+
the WHERE clause, so the humanized filter line on the server reads
|
|
24
|
+
cleanly instead of leaking the framework's instrumentation tags.
|
|
25
|
+
|
|
26
|
+
### Changed
|
|
27
|
+
- **Framework-rewrite filter narrowed.** The 0.2.0 filter that swallowed
|
|
28
|
+
Rails-generated `SET col = NULL` writes was too aggressive — it also
|
|
29
|
+
hid deliberate "null this column out" operations the customer wrote.
|
|
30
|
+
The filter now only drops `COALESCE`-shaped counter bumps and
|
|
31
|
+
empty-`SET` shells; honest `SET col = NULL` writes are captured and
|
|
32
|
+
render as "Cleared col" on the timeline.
|
|
33
|
+
- **Hot-path performance.** `capture_jobs`, `capture_database`, the
|
|
34
|
+
excluded-tables / excluded-job-classes / display-name maps, and the
|
|
35
|
+
user-extended sensitive-key patterns are now memoized at install
|
|
36
|
+
time across `ActiveJobCapturer`, `DatabaseCapturer`,
|
|
37
|
+
`BulkDatabaseCapturer`, and `SensitivePatterns`. The
|
|
38
|
+
`sql.active_record` subscriber is also tuned: 5-arity block to avoid
|
|
39
|
+
splat allocation, cheap `end_with?` name-prefilter ahead of any
|
|
40
|
+
parsing, and an early bail before `safe_constantize`. Bulk capture
|
|
41
|
+
overhead measured below the noise floor on a 10 ms reference query.
|
|
42
|
+
|
|
43
|
+
## [0.2.0] — 2026-06-05
|
|
44
|
+
|
|
45
|
+
### Added
|
|
46
|
+
- **Bulk database operations are now captured.** Adds a fourth event
|
|
47
|
+
`source_type` — `bulk_database` — for the four ActiveRecord operations
|
|
48
|
+
that bypass per-row callbacks: `delete_all`, `update_all`, `insert_all`,
|
|
49
|
+
`upsert_all`. Implemented via a narrowly-filtered
|
|
50
|
+
`ActiveSupport::Notifications.subscribe("sql.active_record")` subscription
|
|
51
|
+
(`Capturers::BulkDatabaseCapturer`) — not a replacement for the existing
|
|
52
|
+
callback-based DatabaseCapturer; the two run side by side under the
|
|
53
|
+
same `capture_database` config flag.
|
|
54
|
+
|
|
55
|
+
The new wire shape carries `model_class`, `operation`, `row_count`,
|
|
56
|
+
`where_template` + sanitized `where_binds`, plus an operation-specific
|
|
57
|
+
field (`set` for update_all, `columns` for insert_all/upsert_all).
|
|
58
|
+
Insert/upsert ship column NAMES only — no values, per the product
|
|
59
|
+
decision that bulk-row PII shouldn't ride the wire.
|
|
60
|
+
|
|
61
|
+
Cascade case (`dependent: :delete_all` on a parent destroy) is
|
|
62
|
+
captured automatically, since it produces the same SQL shape — the
|
|
63
|
+
reader sees the parent destroy AND the cascade as sibling rows on the
|
|
64
|
+
timeline.
|
|
65
|
+
|
|
66
|
+
Resource attribution uses a `resource_id: "bulk:<row_count>"` sentinel,
|
|
67
|
+
since individual row IDs are not knowable from the SQL without
|
|
68
|
+
changing the customer's operation (which would violate the read-only
|
|
69
|
+
principle).
|
|
70
|
+
|
|
71
|
+
### Changed
|
|
72
|
+
- `Sanitizer::SENSITIVE_PATTERNS` and the previous in-class
|
|
73
|
+
`DatabaseCapturer::SENSITIVE_PATTERNS` (which had drifted) are now a
|
|
74
|
+
single source of truth: `EzLogsAgent::SensitivePatterns::PATTERNS`.
|
|
75
|
+
Both capturers + the new BulkDatabaseCapturer consult the same list.
|
|
76
|
+
The merged list is the UNION of the previous two — no patterns
|
|
77
|
+
removed; `passwd`, `pwd`, `cvv`, `cvc`, `pem`, `cipher`, `nonce`,
|
|
78
|
+
`salt`, `digest`, `signature`, `hmac` all continue to be masked.
|
|
79
|
+
- `encrypts :foo` introspection (Rails 7+ `model.class.encrypted_attributes`)
|
|
80
|
+
is now exposed as the standalone `EzLogsAgent::EncryptedAttributes`
|
|
81
|
+
module, so both the per-row and bulk capturers consult it via the
|
|
82
|
+
same path. Behavior unchanged for per-row.
|
|
83
|
+
|
|
84
|
+
### Limits (documented)
|
|
85
|
+
- Raw `connection.execute(sql)` calls are not captured (no
|
|
86
|
+
notifications fire under `sql.active_record` with a recognizable
|
|
87
|
+
shape). Use the typed bulk methods to get visibility.
|
|
88
|
+
- Specific row IDs affected by a bulk op are not captured — only the
|
|
89
|
+
filter rule (WHERE columns + values) and row count. For per-row
|
|
90
|
+
detail, use `find_each(&:destroy)` style which fires per-row
|
|
91
|
+
callbacks.
|
|
92
|
+
|
|
5
93
|
## [0.1.10] — 2026-06-05
|
|
6
94
|
|
|
7
95
|
### Fixed
|
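For orientation, the `bulk_database` wire shape described in the 0.2.0 entry can be pictured as a plain Ruby hash. This is only an illustrative sketch. The field names come from the changelog text, but the `build_bulk_event` helper, the flat layout, and the sample values are assumptions, not the gem's actual event builder.

```ruby
# Hypothetical helper: assembles a hash with the fields the changelog names.
# The flat layout and the helper itself are assumptions for illustration.
def build_bulk_event(model_class:, operation:, row_count:,
                     where_template: nil, where_binds: [], **extra)
  {
    source_type: "bulk_database",
    model_class: model_class,
    operation: operation,
    row_count: row_count,
    # Sentinel from the changelog: individual row IDs are not knowable.
    resource_id: "bulk:#{row_count}",
    where_template: where_template,
    where_binds: where_binds
  }.merge(extra) # operation-specific field, e.g. set: or columns:
end

event = build_bulk_event(
  model_class: "Order",
  operation: :update_all,
  row_count: 3,
  where_template: '"orders"."status" = $2',
  where_binds: [{ column: "status", value: "pending" }],
  set: { "status" => "paid" }
)
```

For an `insert_all` or `upsert_all` event the extra field would presumably be `columns:` with names only, since the changelog states that values are not shipped.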
data/README.md
CHANGED
|
@@ -218,17 +218,29 @@ Sidekiq and ActiveJob executions:
|
|
|
218
218
|
|
|
219
219
|
ActiveRecord create, update, and destroy operations:
|
|
220
220
|
|
|
221
|
-
**What's captured:**
|
|
221
|
+
**What's captured (per-row, via `after_create` / `after_update` / `after_destroy`):**
|
|
222
222
|
- Model class name, record ID, operation type (create/update/destroy)
|
|
223
223
|
- For creates: initial attribute values
|
|
224
224
|
- For updates: one meaningful attribute change (e.g., `status: pending → shipped`)
|
|
225
225
|
- For destroys: final attribute values before deletion
|
|
226
226
|
- Correlation ID (automatically inherited from the current request or job)
|
|
227
227
|
|
|
228
|
+
**What's captured (bulk, via `ActiveSupport::Notifications` on `sql.active_record`):**
|
|
229
|
+
- The four ActiveRecord operations that bypass per-row callbacks:
|
|
230
|
+
`delete_all`, `update_all`, `insert_all`, `upsert_all`.
|
|
231
|
+
- For each: model class, row count, the filter rule (WHERE columns +
|
|
232
|
+
sanitized bind values), and operation-specific detail (SET hash for
|
|
233
|
+
`update_all`, column names for `insert_all` / `upsert_all` — values
|
|
234
|
+
are NOT shipped for bulk inserts).
|
|
235
|
+
- `dependent: :delete_all` cascades during a parent destroy.
|
|
236
|
+
|
|
228
237
|
**What's NOT captured:**
|
|
229
238
|
- SELECT queries (read operations don't change data)
|
|
230
239
|
- Schema migrations (Rails internal operations)
|
|
231
|
-
-
|
|
240
|
+
- Raw `connection.execute(sql)` calls (no recognizable model class)
|
|
241
|
+
- Individual row IDs affected by bulk operations (only the filter rule
|
|
242
|
+
and row count — pulling the IDs would require modifying the
|
|
243
|
+
customer's SQL, which violates the read-only principle).
|
|
232
244
|
|
|
233
245
|
**Automatic exclusions (no configuration needed):**
|
|
234
246
|
- `sessions` — Session store updates
|
|
@@ -777,18 +789,23 @@ Common validation errors:
|
|
|
777
789
|
|
|
778
790
|
3. **Look for database capture registration in logs:**
|
|
779
791
|
```
|
|
780
|
-
[Railtie] Database capture installed
|
|
792
|
+
[Railtie] Database capture installed (per-row + bulk)
|
|
781
793
|
```
|
|
782
794
|
|
|
783
795
|
4. **Verify models aren't excluded:**
|
|
784
796
|
Check `config.excluded_tables` to ensure your tables aren't being filtered out.
|
|
785
797
|
|
|
786
|
-
5. **
|
|
787
|
-
- ✅ `User.create(...)` — Captured
|
|
788
|
-
- ✅ `user.update(...)` — Captured
|
|
789
|
-
- ✅ `user.destroy` — Captured
|
|
798
|
+
5. **What is and isn't captured:**
|
|
799
|
+
- ✅ `User.create(...)` — Captured (per-row callback)
|
|
800
|
+
- ✅ `user.update(...)` — Captured (per-row callback)
|
|
801
|
+
- ✅ `user.destroy` — Captured (per-row callback)
|
|
802
|
+
- ✅ `User.update_all(...)` — Captured as a `bulk_database` event
|
|
803
|
+
(filter + SET clause + row count; no per-row IDs)
|
|
804
|
+
- ✅ `User.where(...).delete_all` — Captured as a `bulk_database` event
|
|
805
|
+
- ✅ `User.insert_all(...)` / `upsert_all(...)` — Captured (columns + count)
|
|
790
806
|
- ❌ `User.find(...)` — NOT captured (read-only)
|
|
791
|
-
- ❌ `User.
|
|
807
|
+
- ❌ `User.connection.execute("DELETE FROM ...")` — NOT captured
|
|
808
|
+
(raw SQL bypasses the typed AR API we hook)
|
|
792
809
|
|
|
793
810
|
---
|
|
794
811
|
|
data/lib/ez_logs_agent/buffer.rb
CHANGED
|
@@ -59,6 +59,20 @@ module EzLogsAgent
|
|
|
59
59
|
# Best effort, ignore failures
|
|
60
60
|
end
|
|
61
61
|
|
|
62
|
+
# Returns the most recently pushed event WITHOUT removing it.
|
|
63
|
+
# Used by BulkDatabaseCapturer to backfill the affected-row count
|
|
64
|
+
# after the relation method's return value is known (Rails'
|
|
65
|
+
# payload[:row_count] is unreliable for plain DELETE on PG).
|
|
66
|
+
# Mutating the returned hash in place is intentional and safe —
|
|
67
|
+
# the event hasn't been flushed yet.
|
|
68
|
+
# @return [Hash, nil] The last event, or nil if empty.
|
|
69
|
+
def peek_last
|
|
70
|
+
@monitor.synchronize { @queue.last }
|
|
71
|
+
rescue => error
|
|
72
|
+
log_error("[Buffer] peek_last failed: #{error.message}")
|
|
73
|
+
nil
|
|
74
|
+
end
|
|
75
|
+
|
|
62
76
|
private
|
|
63
77
|
|
|
64
78
|
def max_size
|
|
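The back-fill that `peek_last` enables can be sketched in isolation. The class below is a toy stand-in, not the gem's `Buffer`: `SketchBuffer`, its `push` method, and the sample event are assumptions made for illustration. Only the `@monitor.synchronize { @queue.last }` body mirrors the diff above.

```ruby
require "monitor"

# Toy stand-in for the agent's buffer; only push and peek_last are modelled.
class SketchBuffer
  def initialize
    @monitor = Monitor.new
    @queue = []
  end

  def push(event)
    @monitor.synchronize { @queue << event }
  end

  # Returns the newest event without removing it (nil when empty).
  def peek_last
    @monitor.synchronize { @queue.last }
  end
end

buffer = SketchBuffer.new
# The notification handler pushes the event before the row count is known.
buffer.push({ operation: :delete_all, row_count: 0 })

# Later, the relation method returns the affected-row count.
affected_rows = 42
last = buffer.peek_last
last[:row_count] = affected_rows if last
```

Because `peek_last` hands back the queued object itself rather than a copy, the in-place mutation is what a later flush would see. Returning a `dup` would silently lose the corrected count.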
data/lib/ez_logs_agent/bulk_sql_parser.rb
ADDED
|
@@ -0,0 +1,312 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module EzLogsAgent
|
|
4
|
+
# Pure-functional parser for the four ActiveRecord bulk operations that
|
|
5
|
+
# bypass per-row callbacks: delete_all, update_all, insert_all, upsert_all.
|
|
6
|
+
# Used by BulkDatabaseCapturer to turn an AS::Notifications "sql.active_record"
|
|
7
|
+
# payload into a structured wire shape the server can humanize.
|
|
8
|
+
#
|
|
9
|
+
# ## What it extracts
|
|
10
|
+
#
|
|
11
|
+
# For delete_all:
|
|
12
|
+
# { operation: :delete_all, where_template: "status = $1", where_binds: [{column:, value:}] }
|
|
13
|
+
#
|
|
14
|
+
# For update_all:
|
|
15
|
+
# { operation: :update_all, set: {"status" => "paid"}, where_template:, where_binds: }
|
|
16
|
+
#
|
|
17
|
+
# For insert_all / upsert_all:
|
|
18
|
+
# { operation: :insert_all|:upsert_all, columns: ["name", "email"] }
|
|
19
|
+
# (No values — per the product decision; column SHAPE only.)
|
|
20
|
+
#
|
|
21
|
+
# For anything else (subqueries, joins, raw SQL the regex can't parse,
|
|
22
|
+
# malformed binds): returns { unparseable: true }. BulkDatabaseCapturer
|
|
23
|
+
# falls back to shipping row_count + operation + model_class with no
|
|
24
|
+
# template / binds / set — the timeline still reads "Bulk delete: 50,000
|
|
25
|
+
# orders" minus the WHERE detail.
|
|
26
|
+
#
|
|
27
|
+
# ## Why regex (not Arel / pg_query)
|
|
28
|
+
#
|
|
29
|
+
# Both Arel and pg_query are adapter-specific and either slow or heavy
|
|
30
|
+
# to add as a runtime dependency on every customer host app. Regex on
|
|
31
|
+
# the standardized AR-emitted SQL string is fast (sub-millisecond on
|
|
32
|
+
# typical statements) and adapter-tolerant. The graceful "unparseable"
|
|
33
|
+
# branch covers the long tail.
|
|
34
|
+
#
|
|
35
|
+
# ## Adapter quoting handled
|
|
36
|
+
#
|
|
37
|
+
# - PostgreSQL: "orders"."status" = $1 (double-quoted, $N placeholders)
|
|
38
|
+
# - SQLite: "orders"."status" = ? (double-quoted, ? placeholders)
|
|
39
|
+
# - MySQL: `orders`.`status` = ? (backticks, ? placeholders)
|
|
40
|
+
module BulkSqlParser
|
|
41
|
+
# Loose match for an identifier wrapped in any of the three quote
|
|
42
|
+
# styles AR uses. Captures the unquoted name.
|
|
43
|
+
IDENTIFIER = /["`]([^"`]+)["`]/.freeze
|
|
44
|
+
|
|
45
|
+
# Either a "qualified" or "bare" identifier reference, used in
|
|
46
|
+
# WHERE/SET clause columns. AR prefixes the table name in some
|
|
47
|
+
# paths and not in others — accept both. Captures just the column.
|
|
48
|
+
COLUMN = /(?:#{IDENTIFIER}\.)?#{IDENTIFIER}/.freeze
|
|
49
|
+
|
|
50
|
+
# A bind placeholder — Postgres uses $N (1-indexed), SQLite/MySQL use ?.
|
|
51
|
+
# The order of placeholders in the SQL corresponds 1:1 with the order
|
|
52
|
+
# of values in `binds` / `type_casted_binds`.
|
|
53
|
+
PLACEHOLDER = /(?:\$\d+|\?)/.freeze
|
|
54
|
+
|
|
55
|
+
module_function
|
|
56
|
+
|
|
57
|
+
# @param sql [String] Raw SQL string from `payload[:sql]`.
|
|
58
|
+
# @param type_casted_binds [Array] `payload[:type_casted_binds]` —
|
|
59
|
+
# already-typecast values in placeholder order. May be nil for
|
|
60
|
+
# raw SQL paths.
|
|
61
|
+
# @return [Hash] Always returns a Hash. Either the parsed structure
|
|
62
|
+
# (keys above) or `{ unparseable: true }`.
|
|
63
|
+
def parse(sql:, type_casted_binds:)
|
|
64
|
+
return { unparseable: true } if sql.nil? || sql.empty?
|
|
65
|
+
|
|
66
|
+
sql_stripped = strip_query_log_tags(sql.strip)
|
|
67
|
+
|
|
68
|
+
case sql_stripped
|
|
69
|
+
when /\ADELETE FROM /i
|
|
70
|
+
parse_delete(sql_stripped, type_casted_binds || [])
|
|
71
|
+
when /\AUPDATE /i
|
|
72
|
+
parse_update(sql_stripped, type_casted_binds || [])
|
|
73
|
+
when /\AINSERT INTO /i
|
|
74
|
+
parse_insert(sql_stripped)
|
|
75
|
+
else
|
|
76
|
+
{ unparseable: true }
|
|
77
|
+
end
|
|
78
|
+
rescue StandardError
|
|
79
|
+
# If the regex engine, binds zipping, or any sub-parse step raises,
|
|
80
|
+
# ship unparseable rather than crash the capture handler. The
|
|
81
|
+
# BulkDatabaseCapturer's own rescue would also catch this, but
|
|
82
|
+
# defending here means the rest of the capturer sees a uniform
|
|
83
|
+
# return shape.
|
|
84
|
+
{ unparseable: true }
|
|
85
|
+
end
|
|
86
|
+
|
|
87
|
+
# Strip Rails 7+ Query Log Tags (`/*application='X',action='Y'*/`)
|
|
88
|
+
# AND any other trailing SQL comments. They land at the end of every
|
|
89
|
+
# statement when `config.active_record.query_log_tags_enabled = true`
|
|
90
|
+
# and are pure noise on the timeline — they leak the host app's name
|
|
91
|
+
# and controller into the user-visible filter line. Removing them at
|
|
92
|
+
# parse time means we never ship them on the wire.
|
|
93
|
+
def strip_query_log_tags(sql)
|
|
94
|
+
sql.gsub(%r{/\*.*?\*/}m, "").rstrip
|
|
95
|
+
end
|
|
96
|
+
|
|
97
|
+
# Returns the symbolic operation name we expect downstream. Detected
|
|
98
|
+
# from SQL shape, independent of the `payload[:name]` Rails version
|
|
99
|
+
# variance (see plan §"insert_all/upsert_all payload :name varies").
|
|
100
|
+
# Upsert detection requires the `ON CONFLICT` (PG) or `ON DUPLICATE KEY`
|
|
101
|
+
# (MySQL) clause — bare INSERTs are insert_all.
|
|
102
|
+
#
|
|
103
|
+
# @return [Symbol, nil] :delete_all, :update_all, :insert_all,
|
|
104
|
+
# :upsert_all, or nil if not a bulk op.
|
|
105
|
+
def detect_operation(sql)
|
|
106
|
+
return nil if sql.nil?
|
|
107
|
+
|
|
108
|
+
sql_up = sql.lstrip.upcase
|
|
109
|
+
|
|
110
|
+
return :delete_all if sql_up.start_with?("DELETE FROM ")
|
|
111
|
+
return :update_all if sql_up.start_with?("UPDATE ")
|
|
112
|
+
|
|
113
|
+
if sql_up.start_with?("INSERT INTO ")
|
|
114
|
+
# Disambiguate insert_all vs upsert_all:
|
|
115
|
+
# - insert_all on PG/SQLite emits `ON CONFLICT DO NOTHING` (still insert).
|
|
116
|
+
# - upsert_all on PG/SQLite emits `ON CONFLICT ... DO UPDATE SET ...`.
|
|
117
|
+
# - upsert_all on MySQL emits `ON DUPLICATE KEY UPDATE ...`.
|
|
118
|
+
if (sql_up.include?("ON CONFLICT") && sql_up.include?("DO UPDATE")) ||
|
|
119
|
+
sql_up.include?("ON DUPLICATE KEY")
|
|
120
|
+
return :upsert_all
|
|
121
|
+
end
|
|
122
|
+
|
|
123
|
+
:insert_all
|
|
124
|
+
end
|
|
125
|
+
end
|
|
126
|
+
|
|
127
|
+
# --- DELETE FROM "orders" WHERE "orders"."status" = $1 ---
|
|
128
|
+
def parse_delete(sql, type_casted_binds)
|
|
129
|
+
where_sql = extract_where(sql)
|
|
130
|
+
template, binds = build_template_and_binds(where_sql, type_casted_binds)
|
|
131
|
+
|
|
132
|
+
{ operation: :delete_all, where_template: template, where_binds: binds }
|
|
133
|
+
end
|
|
134
|
+
|
|
135
|
+
# --- UPDATE "orders" SET "status" = $1 WHERE "orders"."status" = $2 ---
|
|
136
|
+
# SET binds come first in placeholder order, then WHERE binds.
|
|
137
|
+
def parse_update(sql, type_casted_binds)
|
|
138
|
+
set_sql = extract_set(sql)
|
|
139
|
+
where_sql = extract_where(sql)
|
|
140
|
+
return { unparseable: true } if set_sql.nil?
|
|
141
|
+
|
|
142
|
+
set_pairs, set_bind_count = parse_set_assignments(set_sql)
|
|
143
|
+
set_binds = type_casted_binds.first(set_bind_count)
|
|
144
|
+
where_binds_raw = type_casted_binds.drop(set_bind_count)
|
|
145
|
+
|
|
146
|
+
set_hash = zip_set_values(set_pairs, set_binds)
|
|
147
|
+
where_template, where_binds = build_template_and_binds(where_sql, where_binds_raw)
|
|
148
|
+
|
|
149
|
+
{
|
|
150
|
+
operation: :update_all,
|
|
151
|
+
set: set_hash,
|
|
152
|
+
where_template: where_template,
|
|
153
|
+
where_binds: where_binds
|
|
154
|
+
}
|
|
155
|
+
end
|
|
156
|
+
|
|
157
|
+
# --- INSERT INTO "users" ("name","email") VALUES (...), (...) ---
|
|
158
|
+
def parse_insert(sql)
|
|
159
|
+
operation = detect_operation(sql)
|
|
160
|
+
return { unparseable: true } unless operation == :insert_all || operation == :upsert_all
|
|
161
|
+
|
|
162
|
+
columns = extract_insert_columns(sql)
|
|
163
|
+
return { unparseable: true } if columns.nil? || columns.empty?
|
|
164
|
+
|
|
165
|
+
{ operation: operation, columns: columns }
|
|
166
|
+
end
|
|
167
|
+
|
|
168
|
+
# Returns the WHERE-clause SQL (without the keyword), or nil if absent.
|
|
169
|
+
# Stops at end-of-string, RETURNING, ORDER BY, LIMIT (AR rarely emits
|
|
170
|
+
# these on bulk DML but cheap insurance).
|
|
171
|
+
def extract_where(sql)
|
|
172
|
+
match = sql.match(/\sWHERE\s+(.+?)(?:\s+(?:RETURNING|ORDER BY|LIMIT)\b|\z)/i)
|
|
173
|
+
match && match[1].strip
|
|
174
|
+
end
|
|
175
|
+
|
|
176
|
+
# Returns the SET-clause SQL between "SET" and "WHERE"/end-of-string.
|
|
177
|
+
def extract_set(sql)
|
|
178
|
+
match = sql.match(/\sSET\s+(.+?)(?:\s+WHERE\s+|\s+RETURNING\b|\z)/i)
|
|
179
|
+
match && match[1].strip
|
|
180
|
+
end
|
|
181
|
+
|
|
182
|
+
# SET clause is comma-separated `"col" = <placeholder|literal>` pairs.
|
|
183
|
+
# AR inlines non-symbol values as literals (no placeholder) — we still
|
|
184
|
+
# surface those in the result so the reader sees what changed.
|
|
185
|
+
# Returns [[col_name, placeholder_or_literal_str], ...] + total bind count.
|
|
186
|
+
def parse_set_assignments(set_sql)
|
|
187
|
+
pairs = []
|
|
188
|
+
bind_count = 0
|
|
189
|
+
|
|
190
|
+
split_top_level_commas(set_sql).each do |assignment|
|
|
191
|
+
m = assignment.match(/\A#{COLUMN}\s*=\s*(.+)\z/)
|
|
192
|
+
next unless m
|
|
193
|
+
|
|
194
|
+
col = m[2] || m[1]
|
|
195
|
+
rhs = m[3].strip
|
|
196
|
+
pairs << [col, rhs]
|
|
197
|
+
bind_count += 1 if rhs.match?(PLACEHOLDER)
|
|
198
|
+
end
|
|
199
|
+
|
|
200
|
+
[pairs, bind_count]
|
|
201
|
+
end
|
|
202
|
+
|
|
203
|
+
# Walks set_pairs in order, taking the next bind for each placeholder
|
|
204
|
+
# RHS and pulling the literal text for non-placeholder RHS (e.g.,
|
|
205
|
+
# `updated_at = '2026-06-05 12:00:00'`).
|
|
206
|
+
def zip_set_values(set_pairs, set_binds)
|
|
207
|
+
bind_idx = 0
|
|
208
|
+
set_pairs.each_with_object({}) do |(col, rhs), acc|
|
|
209
|
+
if rhs.match?(PLACEHOLDER)
|
|
210
|
+
acc[col] = set_binds[bind_idx]
|
|
211
|
+
bind_idx += 1
|
|
212
|
+
else
|
|
213
|
+
# Strip surrounding quotes to surface the actual value.
|
|
214
|
+
acc[col] = rhs.gsub(/\A['"]|['"]\z/, "")
|
|
215
|
+
end
|
|
216
|
+
end
|
|
217
|
+
end
|
|
218
|
+
|
|
219
|
+
# Splits on commas that are NOT inside (parens) or quotes. Bulk DML
|
|
220
|
+
# SET / WHERE almost never nests, but defensive against function
|
|
221
|
+
# calls like `coalesce(x, 0)`.
|
|
222
|
+
def split_top_level_commas(str)
|
|
223
|
+
result = []
|
|
224
|
+
depth = 0
|
|
225
|
+
in_quote = false
|
|
226
|
+
quote_char = nil
|
|
227
|
+
buffer = +""
|
|
228
|
+
|
|
229
|
+
str.each_char do |ch|
|
|
230
|
+
if in_quote
|
|
231
|
+
buffer << ch
|
|
232
|
+
in_quote = false if ch == quote_char
|
|
233
|
+
elsif ch == "'" || ch == '"' || ch == "`"
|
|
234
|
+
in_quote = true
|
|
235
|
+
quote_char = ch
|
|
236
|
+
buffer << ch
|
|
237
|
+
elsif ch == "("
|
|
238
|
+
depth += 1
|
|
239
|
+
buffer << ch
|
|
240
|
+
elsif ch == ")"
|
|
241
|
+
depth -= 1
|
|
242
|
+
buffer << ch
|
|
243
|
+
elsif ch == "," && depth.zero?
|
|
244
|
+
result << buffer.strip unless buffer.empty?
|
|
245
|
+
buffer = +""
|
|
246
|
+
else
|
|
247
|
+
buffer << ch
|
|
248
|
+
end
|
|
249
|
+
end
|
|
250
|
+
result << buffer.strip unless buffer.empty?
|
|
251
|
+
result
|
|
252
|
+
end
|
|
253
|
+
|
|
254
|
+
# Given a WHERE clause SQL and the binds in placeholder order, walk
|
|
255
|
+
# the placeholders and pair each one with the column to its LEFT
|
|
256
|
+
# (the standard `"table"."col" = $1` shape AR emits). Returns the
|
|
257
|
+
# template (placeholder-preserved) and the binds tagged with column.
|
|
258
|
+
#
|
|
259
|
+
# Unrecognized shapes (subselects, joins, NULL checks without binds)
|
|
260
|
+
# leave the template intact but produce no column for the bind, in
|
|
261
|
+
# which case the bind ships with column: nil. The display layer
|
|
262
|
+
# handles nil-column binds.
|
|
263
|
+
def build_template_and_binds(where_sql, type_casted_binds)
|
|
264
|
+
return [nil, []] if where_sql.nil?
|
|
265
|
+
|
|
266
|
+
template = where_sql
|
|
267
|
+
binds = []
|
|
268
|
+
bind_index = 0
|
|
269
|
+
|
|
270
|
+
# Walk the template scanning each placeholder, looking backward
|
|
271
|
+
# for the nearest column identifier to its left.
|
|
272
|
+
template.scan(PLACEHOLDER) do
|
|
273
|
+
match_data = Regexp.last_match
|
|
274
|
+
next unless match_data
|
|
275
|
+
|
|
276
|
+
column = nearest_column_left_of(template, match_data.begin(0))
|
|
277
|
+
value = type_casted_binds[bind_index]
|
|
278
|
+
binds << { column: column, value: value }
|
|
279
|
+
bind_index += 1
|
|
280
|
+
end
|
|
281
|
+
|
|
282
|
+
[template, binds]
|
|
283
|
+
end
|
|
284
|
+
|
|
285
|
+
# Finds the most recent column identifier to the left of `pos` in
|
|
286
|
+
# the WHERE clause. Used to attribute each bind to its column name
|
|
287
|
+
# so we can mask sensitive values by name downstream.
|
|
288
|
+
def nearest_column_left_of(where_sql, pos)
|
|
289
|
+
prefix = where_sql[0...pos]
|
|
290
|
+
# Look for the last identifier before the operator (=, <, >, etc.)
|
|
291
|
+
match = prefix.match(/#{COLUMN}\s*(?:=|<>|!=|<=|>=|<|>|LIKE|IN|IS)\s*\z/i)
|
|
292
|
+
return nil unless match
|
|
293
|
+
|
|
294
|
+
# COLUMN has two capture groups: [table, col] or [_, col].
|
|
295
|
+
match[2] || match[1]
|
|
296
|
+
end
|
|
297
|
+
|
|
298
|
+
# --- INSERT INTO "users" ("name","email") VALUES ... ---
|
|
299
|
+
# Extracts the ordered column list. Returns nil if the open paren
|
|
300
|
+
# column list isn't present (e.g., INSERT ... DEFAULT VALUES).
|
|
301
|
+
def extract_insert_columns(sql)
|
|
302
|
+
# First parenthesized list after the table name.
|
|
303
|
+
m = sql.match(/\AINSERT INTO\s+#{IDENTIFIER}\s*\(([^)]+)\)/i)
|
|
304
|
+
return nil unless m
|
|
305
|
+
|
|
306
|
+
columns_block = m[2]
|
|
307
|
+
columns_block.split(",").map do |col|
|
|
308
|
+
col.strip.gsub(/\A["`]|["`]\z/, "")
|
|
309
|
+
end.reject(&:empty?)
|
|
310
|
+
end
|
|
311
|
+
end
|
|
312
|
+
end
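The comma splitting used for the SET clause is easier to follow on a concrete input. The function below is a simplified re-statement of the same idea (track paren depth and an open quote, split only at depth zero), written as a standalone sketch. The name `split_top_level` and the sample SQL are assumptions, and the control flow is condensed relative to the diff above.

```ruby
# Splits on commas that sit outside parentheses and outside quotes.
def split_top_level(str)
  parts = []
  depth = 0
  quote = nil
  current = +""

  str.each_char do |ch|
    if quote
      # Inside a quoted run: copy everything, close on the matching quote.
      quote = nil if ch == quote
      current << ch
    elsif ["'", '"', "`"].include?(ch)
      quote = ch
      current << ch
    elsif ch == "," && depth.zero?
      parts << current.strip unless current.strip.empty?
      current = +""
    else
      depth += 1 if ch == "("
      depth -= 1 if ch == ")"
      current << ch
    end
  end

  parts << current.strip unless current.strip.empty?
  parts
end

set_sql = %q{"views" = COALESCE("views", 0) + 1, "note" = 'a, b', "status" = $1}
parts = split_top_level(set_sql)
```

Here the comma inside `COALESCE(...)` and the comma inside the string literal `'a, b'` are both kept, so `parts` holds three assignments rather than five fragments.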
|
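The comment stripping and the insert/upsert disambiguation in the parser above can be tried out without Rails. This is a minimal sketch that restates the two rules as free-standing functions. The names `strip_sql_comments` and `classify` and the sample statements are assumptions, not part of the gem.

```ruby
# Same regex as the diff above: drop /* ... */ comments, then trailing space.
def strip_sql_comments(sql)
  sql.gsub(%r{/\*.*?\*/}m, "").rstrip
end

# Simplified classifier following the ON CONFLICT / ON DUPLICATE KEY rule.
def classify(sql)
  up = sql.lstrip.upcase
  return :delete_all if up.start_with?("DELETE FROM ")
  return :update_all if up.start_with?("UPDATE ")
  return nil unless up.start_with?("INSERT INTO ")

  upsert = (up.include?("ON CONFLICT") && up.include?("DO UPDATE")) ||
           up.include?("ON DUPLICATE KEY")
  upsert ? :upsert_all : :insert_all
end

tagged = %q{DELETE FROM "orders" WHERE "orders"."status" = $1 /*application='Shop',action='purge'*/}
clean = strip_sql_comments(tagged)

pg_insert = %q{INSERT INTO "users" ("name","email") VALUES ($1, $2) ON CONFLICT DO NOTHING}
pg_upsert = %q{INSERT INTO "users" ("name","email") VALUES ($1, $2) ON CONFLICT ("email") DO UPDATE SET "name"=excluded."name"}
my_upsert = "INSERT INTO `users` (`name`) VALUES (?) ON DUPLICATE KEY UPDATE `name`=VALUES(`name`)"

kinds = [clean, pg_insert, pg_upsert, my_upsert].map { |sql| classify(sql) }
```

Note that `ON CONFLICT DO NOTHING` still classifies as `:insert_all`, because the rule requires `DO UPDATE` alongside `ON CONFLICT` before calling a statement an upsert.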
|
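How each bind value gets paired with the column to its left may be clearer with a worked input. The sketch below re-creates the idea behind `build_template_and_binds` and `nearest_column_left_of` as a single standalone function. The constant and method names here (`IDENT`, `COL`, `PLACEHOLDER_RE`, `OPERATORS`, `attribute_binds`) and the sample WHERE clauses are assumptions for illustration.

```ruby
IDENT = /["`]([^"`]+)["`]/                 # quoted identifier, captures the name
COL = /(?:#{IDENT}\.)?#{IDENT}/            # optional table qualifier, then column
PLACEHOLDER_RE = /(?:\$\d+|\?)/            # $N (PostgreSQL) or ? (SQLite/MySQL)
OPERATORS = /(?:=|<>|!=|<=|>=|<|>|LIKE|IN|IS)/i

# Walks the placeholders in order and tags each value with the column
# that immediately precedes "<operator> <placeholder>", or nil if none does.
def attribute_binds(where_sql, values)
  binds = []
  where_sql.scan(PLACEHOLDER_RE) do
    pos = Regexp.last_match.begin(0)
    m = where_sql[0...pos].match(/#{COL}\s*#{OPERATORS}\s*\z/i)
    column = m && (m[2] || m[1])
    binds << { column: column, value: values[binds.size] }
  end
  binds
end

simple = attribute_binds(
  %q{"orders"."status" = $1 AND "orders"."total" > $2},
  ["pending", 100]
)

# A function call around the column breaks the "column operator placeholder"
# shape, so the bind is kept but its column is nil.
wrapped = attribute_binds(%q{LOWER("orders"."email") = $1}, ["a@example.com"])
```

The nil-column case matters for masking: a value whose column cannot be identified cannot be matched against sensitive key names, so downstream code has to decide how to treat it.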
data/lib/ez_logs_agent/capturers/active_job_capturer.rb
CHANGED
|
@@ -51,6 +51,25 @@ module EzLogsAgent
|
|
|
51
51
|
def install
|
|
52
52
|
return unless defined?(ActiveJob)
|
|
53
53
|
|
|
54
|
+
# Memoize config values that the per-job hot path reads on
|
|
55
|
+
# every execute. Without this, capture_execution dispatches
|
|
56
|
+
# into EzLogsAgent.configuration twice per job (once for
|
|
57
|
+
# capture_jobs, once for all_excluded_job_classes). On
|
|
58
|
+
# job-heavy apps that's measurable. Runtime mutations need
|
|
59
|
+
# uninstall! + install.
|
|
60
|
+
@capture_enabled =
|
|
61
|
+
begin
|
|
62
|
+
EzLogsAgent.configuration.capture_jobs
|
|
63
|
+
rescue StandardError
|
|
64
|
+
false
|
|
65
|
+
end
|
|
66
|
+
@excluded_job_classes =
|
|
67
|
+
begin
|
|
68
|
+
EzLogsAgent.configuration.all_excluded_job_classes.dup.freeze
|
|
69
|
+
rescue StandardError
|
|
70
|
+
[].freeze
|
|
71
|
+
end
|
|
72
|
+
|
|
54
73
|
install_serialization_hooks unless @serialization_installed
|
|
55
74
|
|
|
56
75
|
ActiveJob::Base.before_enqueue do |job|
|
|
@@ -101,7 +120,11 @@ module EzLogsAgent
|
|
|
101
120
|
# @param block [Proc] The job execution block
|
|
102
121
|
# @return [Object] The result of the job execution
|
|
103
122
|
def capture_execution(job, block)
|
|
104
|
-
|
|
123
|
+
# Memoized at install time for hot-path perf. If we haven't
|
|
124
|
+
# installed yet (specs that test capture_execution directly),
|
|
125
|
+
# fall back to a live config read so the behavior matches.
|
|
126
|
+
enabled = defined?(@capture_enabled) ? @capture_enabled : EzLogsAgent.configuration.capture_jobs
|
|
127
|
+
return block.call unless enabled
|
|
105
128
|
|
|
106
129
|
if sidekiq_adapter?(job)
|
|
107
130
|
EzLogsAgent::Logger.debug("[ActiveJobCapturer] Skipping capture (Sidekiq adapter)")
|
|
@@ -196,8 +219,10 @@ module EzLogsAgent
|
|
|
196
219
|
# @param job [ActiveJob::Base] The job instance
|
|
197
220
|
# @return [Boolean] true if excluded, false otherwise
|
|
198
221
|
def excluded_job_class?(job)
|
|
199
|
-
|
|
200
|
-
|
|
222
|
+
# Memoized at install time for perf. Fall back to a live read
|
|
223
|
+
# for specs that don't call install (see capture_execution).
|
|
224
|
+
excluded = defined?(@excluded_job_classes) ? @excluded_job_classes : EzLogsAgent.configuration.all_excluded_job_classes
|
|
225
|
+
excluded.include?(job.class.name)
|
|
201
226
|
rescue StandardError
|
|
202
227
|
false
|
|
203
228
|
end
|
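The install-time memoization with a `defined?` fallback, as used in the hunks above, can be shown in a small standalone form. `SketchConfig` and `SketchCapturer` are hypothetical stand-ins for `EzLogsAgent.configuration` and the capturer. The sketch only illustrates the pattern and its trade-off, not the gem's code.

```ruby
# Hypothetical config object standing in for EzLogsAgent.configuration.
SketchConfig = Struct.new(:capture_jobs)

class SketchCapturer
  def initialize(config)
    @config = config
  end

  # Snapshot the flag once; later config changes are not seen until reinstall.
  def install
    @capture_enabled = @config.capture_jobs
  end

  def enabled?
    # defined? distinguishes "never installed" from a memoized false,
    # which a plain `@capture_enabled ||= ...` would not.
    defined?(@capture_enabled) ? @capture_enabled : @config.capture_jobs
  end
end

config = SketchConfig.new(true)
capturer = SketchCapturer.new(config)
before_install = capturer.enabled?  # live read: true

capturer.install
config.capture_jobs = false
after_mutation = capturer.enabled?  # memoized: still true

off_config = SketchConfig.new(false)
off_capturer = SketchCapturer.new(off_config)
off_capturer.install
off_config.capture_jobs = true
memoized_false = off_capturer.enabled? # memoized false survives the change
```

The stale `true` after the config change is the cost the diff's comment points at: runtime mutations need an uninstall followed by a fresh install to take effect.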