dead_bro 0.2.24 → 0.2.25

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 2123fe4aad1331c3a33afdf2e5f9cffcfac75f5893982522569da053fc4ea403
4
- data.tar.gz: a00a450442279e95799fb6ddd3e2d55b4b18e8ed26926f55c09ea076f12d02eb
3
+ metadata.gz: aec6a4f779a49726e225b3d01e9e1a6516c56639a451837615a8d8c08cb59e8a
4
+ data.tar.gz: 186f7fd09ef2b4c3d2ed9390c1278a9f5bbc577a0e05572e63cea7461689fccc
5
5
  SHA512:
6
- metadata.gz: ba0645dca3b3552e6c8b4aaef11e8f008e7822ba5d67ff9b8133c05bfe21b0a719252f655f7a893e99369aafe4c6e2ac3c886a372bc353b48c044d22b83d7f36
7
- data.tar.gz: dffa7888b43a8642fb9ee32ec9334f8cb13dab4c176353171a533780d086eca99257198e0f9a4d0609cc6897a5d51a7591213da3442929c82b2bdf00aa65d738
6
+ metadata.gz: 2b7028050d353c623587cd3f083b2b8666fcd05abfa68b7cf05d7bdda5979f29118bdabcbb37c0165a10eca829e30305ff061874c08d9a45e65d89193acd3ad3
7
+ data.tar.gz: 2f16e4716988194147e087396dc419c4d20fdb4bd2a958d38f866e158d76e2f233d0187a08eb54c6b7f40a9eb9637dc0079e857257cb9d3f2091a05b9490968b
data/CHANGELOG.md CHANGED
@@ -3,6 +3,26 @@
3
3
  ### Added
4
4
  - Monitor thread now sends a synchronous heartbeat on startup before the first collection tick. This ensures remote settings — including `monitor_enabled` — are applied from the very first reporting cycle, so Sidekiq workers and other non-web processes that have not yet sent any metrics still receive the correct configuration immediately on boot rather than waiting up to 60 seconds for the first scheduled tick.
5
5
 
6
+ ## [0.2.25] - 2026-06-14
7
+
8
+ ### Added
9
+ - **Memory diagnostics for "what is allocating all this memory?"** A request can grow RSS by hundreds of MB while instantiating only a few thousand ActiveRecord objects — the existing AR object count cannot explain it because the memory lives in transient strings/hashes (e.g. deserialized Elasticsearch responses, large JSON response bodies), not AR models. These additions localize that growth. They are organised into two performance tiers so the default path stays fast:
10
+
11
+ **Under `memory_tracking_enabled` (~0.1ms overhead, on by default):**
12
+ - **Retained-vs-transient GC signals.** `GcTracker` now enriches the per-request `gc_pressure` payload with three additional `GC.stat`-derived fields:
13
+ - `heap_live_slots_growth` — net change in live heap slots over the request. A small value alongside a large `allocated_objects` means the memory was *transient* (allocated then reclaimed by GC, with RSS held by allocator fragmentation); a large value means objects were *retained* — the real leak signal. This reframes a large RSS delta that the previous metrics could not characterise.
14
+ - `malloc_increase_bytes` / `oldmalloc_increase_bytes` — request-end gauges of memory malloc'd outside the Ruby object heap (large strings/buffers), pointing at off-heap pressure such as parsed response bodies.
15
+ - These fields are captured only when `memory_tracking_enabled`; the base GC pressure fields (`minor_gc_runs`, `major_gc_runs`, `allocated_objects`, `gc_time_ms`) remain always-on and unchanged.
16
+ - **Per-phase allocation attribution.** New `MemoryPhaseTracker` charges each request's object allocations to the phase that produced them — `sql`, `view`, or `elasticsearch` — emitted as a new `allocation_phases` field on the request payload (e.g. `{ elasticsearch: 412_000, sql: 9_000, view: 2_500 }`). Attribution is **exclusive**: a `sql.active_record` event nested inside a view render is charged only to `:sql`, never double-counted into `:view`, via a thread-local stack that pauses the parent phase while a child is active. Whatever isn't captured by an instrumented phase is controller/application code and is derivable on the backend as `gc_pressure.allocated_objects` minus the sum of the buckets. Overhead is two single-key `GC.stat(:total_allocated_objects)` reads per instrumented event (no hash allocation); listeners are installed at boot but no-op via a thread-local check unless a request opts in.
17
+
18
+ **Under `allocation_tracking_enabled` + the new `allocation_sample_rate` (~2–5ms overhead, off by default):**
19
+ - **By-bytes object-type breakdown.** New `AllocationSourceSampler` produces `memsize_by_type` — total retained bytes summed per Ruby class via `ObjectSpace.memsize_of`. This catches the "death by a million small strings" pattern (each object well under any size threshold, but enormous in aggregate) that the existing >1MB single-object scan structurally misses.
20
+ - **Allocation-source attribution.** `allocation_sources` reports the top allocation sites (`file:line`) by retained bytes, using `ObjectSpace.trace_object_allocations`. This is the gold-standard "this line allocated 300MB" answer. The expensive heap walk is additionally gated on actual memory growth (`memory_growth_mb >= 50` by default), so even with the flag on, a request that didn't move memory pays nothing for the walk. Sampled object counts/bytes are reported alongside `allocation_sample_rate` so consumers can extrapolate to the full heap.
21
+ - **`allocation_sample_rate` configuration (default `100`, remote-manageable).** When allocation tracking is enabled, the heavy per-request work now runs on this percentage of requests, so the cost can be capped across traffic instead of being all-or-nothing. The decision is made once at request start (`Configuration#allocation_tracking_active?`) and cached in a thread-local so the matching stop agrees with the start.
22
+
23
+ ### Changed
24
+ - **`objspace` is now loaded only under `allocation_tracking_enabled`.** `ObjectSpace.memsize_of`, `trace_object_allocations_*`, and `allocation_source*` live in the `objspace` stdlib extension, which is not loaded by default. It is now required exclusively via `AllocationSourceSampler` (loaded only under the allocation flag), keeping the heavyweight extension off the default and memory-tracking paths. As a consequence, the gem's pre-existing large-object scan (in `MemoryTrackingSubscriber` / `DeadBro.analyze`), which also depends on `ObjectSpace.memsize_of`, likewise functions only when allocation tracking is enabled — consistent with it already living behind that flag.
25
+
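The "unattributed" remainder described in the changelog entry above can be sketched on the consumer side as follows. This is a minimal illustration, not code from the gem: the helper name `unattributed_allocations` and the total of `450_000` are assumptions, while the phase counts are the example values quoted in the entry.

```ruby
# Hypothetical consumer-side helper; not part of the gem.
# allocated_objects stands in for gc_pressure.allocated_objects and
# allocation_phases for the new allocation_phases payload field.
def unattributed_allocations(allocated_objects, allocation_phases)
  attributed = allocation_phases.values.sum
  # Clamp at zero in case the two counters were read at slightly different moments.
  [allocated_objects - attributed, 0].max
end

phases = {elasticsearch: 412_000, sql: 9_000, view: 2_500} # example from the changelog
TOTAL_ALLOCATED = 450_000 # assumed value, for illustration only
UNATTRIBUTED = unattributed_allocations(TOTAL_ALLOCATED, phases)
```

Under these assumed numbers, the remainder is what controller and application code allocated outside the instrumented `sql`, `view` and `elasticsearch` phases.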
6
26
  ## [0.2.21] - 2026-06-02
7
27
 
8
28
  ### Added
data/README.md CHANGED
@@ -263,15 +263,16 @@ Usually managed in the dashboard; Ruby example:
263
263
  DeadBro.configure do |config|
264
264
  config.memory_tracking_enabled = true # Enable lightweight memory tracking (default: true)
265
265
  config.allocation_tracking_enabled = false # Enable detailed allocation tracking (default: false)
266
-
266
+ config.allocation_sample_rate = 100 # % of requests that pay for allocation tracking when enabled (1-100, default: 100)
267
+
267
268
  # Sampling configuration
268
269
  config.sample_rate = 100 # Percentage of requests to track (1-100, default: 100)
269
270
  end
270
271
  ```
271
272
 
272
273
  **Performance Impact:**
273
- - **Lightweight mode**: ~0.1ms overhead per request
274
- - **Allocation tracking**: ~2-5ms overhead per request (only enable when needed)
274
+ - **Lightweight mode** (`memory_tracking_enabled`, ~0.1ms overhead per request): RSS before/after, GC pressure, the retained-vs-transient signals (`heap_live_slots_growth`, `malloc_increase_bytes`), and per-phase allocation attribution (`allocation_phases` — which of `sql`/`view`/`elasticsearch` allocated the request's objects).
275
+ - **Allocation tracking** (`allocation_tracking_enabled`, ~2-5ms overhead per request; only enable when needed): adds by-bytes object-type breakdown (`memsize_by_type`) and allocation-source attribution (`allocation_sources`, `file:line`) using the `objspace` extension, which is loaded only on this path. Use `allocation_sample_rate` to spread that cost across a fraction of traffic.
275
276
 
276
277
  ## Job Tracking
277
278
 
@@ -0,0 +1,140 @@
1
+ # frozen_string_literal: true
2
+
3
+ # ObjectSpace.memsize_of / trace_object_allocations_* / allocation_source* all
4
+ # live in the `objspace` stdlib extension, which is NOT loaded by default.
5
+ # Requiring it only defines the methods — it does not start any tracing or add
6
+ # runtime overhead until we explicitly call trace_object_allocations_start.
7
+ begin
8
+ require "objspace"
9
+ rescue LoadError
10
+ # Not available on this Ruby — available? will report false.
11
+ end
12
+
13
+ module DeadBro
14
+ # Deep, opt-in memory diagnostics that answer "what code allocated this?".
15
+ # Only active when allocation tracking is on (see Configuration
16
+ # #allocation_tracking_active?), because turning on object allocation tracing
17
+ # adds ~2-5ms of per-request overhead.
18
+ #
19
+ # Two complementary breakdowns, both produced from a single ObjectSpace walk
20
+ # after the request finishes:
21
+ #
22
+ # * by_type_bytes — total *retained bytes* per Ruby class. This catches the
23
+ # "death by a million small strings" pattern (e.g. a deserialized
24
+ # Elasticsearch response) that MemoryTrackingSubscriber's >1MB
25
+ # single-object scan structurally misses, because it sums bytes per type
26
+ # instead of flagging individually-large objects.
27
+ #
28
+ # * by_source — top allocation sites (file:line) by retained bytes. This is
29
+ # the gold-standard "this line allocated 300MB" attribution, available
30
+ # because trace_object_allocations was running for the request.
31
+ module AllocationSourceSampler
32
+ # Fraction of live objects inspected during the post-request walk. Reported
33
+ # back as sample_rate so the consumer can extrapolate to the full heap.
34
+ SAMPLE_RATE = 0.10
35
+ MAX_RESULTS = 20
36
+ # Only report a type if its sampled retained bytes clear this floor (keeps
37
+ # the breakdown to things that actually matter).
38
+ LARGE_TYPE_MIN_BYTES = 100_000
39
+ # Skip the walk entirely below this growth — no point profiling a request
40
+ # that didn't move memory. Gates the expensive path even when the flag is on.
41
+ DEFAULT_MIN_GROWTH_MB = 50
42
+
43
+ def self.available?
44
+ defined?(ObjectSpace) &&
45
+ ObjectSpace.respond_to?(:trace_object_allocations_start) &&
46
+ ObjectSpace.respond_to?(:memsize_of)
47
+ rescue StandardError
48
+ false
49
+ end
50
+
51
+ # Begin recording allocation source locations. Must be called before the
52
+ # request allocates the objects we want to attribute.
53
+ def self.start
54
+ return unless available?
55
+ ObjectSpace.trace_object_allocations_start
56
+ rescue StandardError
57
+ # Best-effort only.
58
+ end
59
+
60
+ # Stop and discard recorded allocation data. Call this AFTER analyze, since
61
+ # clearing wipes the source locations analyze reads.
62
+ def self.stop
63
+ return unless available?
64
+ ObjectSpace.trace_object_allocations_stop
65
+ ObjectSpace.trace_object_allocations_clear
66
+ rescue StandardError
67
+ # Best-effort only.
68
+ end
69
+
70
+ # Walk live objects once and build the two breakdowns. Returns {} when
71
+ # tracing is unavailable, or {skipped: ...} when growth was below threshold.
72
+ def self.analyze(memory_growth_mb: nil, min_growth_mb: DEFAULT_MIN_GROWTH_MB)
73
+ return {} unless available?
74
+ if memory_growth_mb && memory_growth_mb < min_growth_mb
75
+ return {skipped: "memory_growth_below_threshold", min_growth_mb: min_growth_mb}
76
+ end
77
+
78
+ by_type = Hash.new { |h, k| h[k] = {count: 0, bytes: 0} }
79
+ by_source = Hash.new { |h, k| h[k] = {count: 0, bytes: 0} }
80
+
81
+ ObjectSpace.each_object do |obj|
82
+ next unless rand < SAMPLE_RATE
83
+
84
+ size = begin
85
+ ObjectSpace.memsize_of(obj)
86
+ rescue StandardError
87
+ 0
88
+ end
89
+ next unless size && size > 0
90
+
91
+ klass = begin
92
+ obj.class.name
93
+ rescue StandardError
94
+ nil
95
+ end || "Unknown"
96
+ type_bucket = by_type[klass]
97
+ type_bucket[:count] += 1
98
+ type_bucket[:bytes] += size
99
+
100
+ file = begin
101
+ ObjectSpace.allocation_sourcefile(obj)
102
+ rescue StandardError
103
+ nil
104
+ end
105
+ next unless file
106
+
107
+ line = begin
108
+ ObjectSpace.allocation_sourceline(obj)
109
+ rescue StandardError
110
+ nil
111
+ end
112
+ source_bucket = by_source["#{file}:#{line}"]
113
+ source_bucket[:count] += 1
114
+ source_bucket[:bytes] += size
115
+ end
116
+
117
+ {
118
+ sample_rate: SAMPLE_RATE,
119
+ by_type_bytes: top_by_bytes(by_type, LARGE_TYPE_MIN_BYTES),
120
+ by_source: top_by_bytes(by_source, 0)
121
+ }
122
+ rescue StandardError
123
+ {}
124
+ end
125
+
126
+ def self.top_by_bytes(hash, min_bytes)
127
+ hash.select { |_, v| v[:bytes] >= min_bytes }
128
+ .sort_by { |_, v| -v[:bytes] }
129
+ .first(MAX_RESULTS)
130
+ .map do |key, v|
131
+ {
132
+ name: key,
133
+ count: v[:count],
134
+ bytes: v[:bytes],
135
+ mb: (v[:bytes] / 1_000_000.0).round(2)
136
+ }
137
+ end
138
+ end
139
+ end
140
+ end
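The allocation-source attribution in the file above can be tried in isolation with a small sketch like the one below. It uses only the `objspace` calls the sampler itself relies on. The helper name `allocation_sites` is hypothetical, and unlike the sampler it inspects only the objects returned by the block rather than walking the whole heap.

```ruby
require "objspace"

# Hypothetical helper (not part of the gem): trace allocations made inside the
# block, then group the objects the block returns by their "file:line" site.
def allocation_sites
  ObjectSpace.trace_object_allocations_start
  retained = yield
  sites = Hash.new { |h, k| h[k] = {count: 0, bytes: 0} }
  retained.each do |obj|
    file = ObjectSpace.allocation_sourcefile(obj)
    next unless file # object was allocated before tracing started
    site = sites["#{file}:#{ObjectSpace.allocation_sourceline(obj)}"]
    site[:count] += 1
    site[:bytes] += ObjectSpace.memsize_of(obj)
  end
  sites
ensure
  # Read the source locations first; clearing wipes them.
  ObjectSpace.trace_object_allocations_stop
  ObjectSpace.trace_object_allocations_clear
end

# All 1,000 retained strings are created on this single line.
SITES = allocation_sites { Array.new(1_000) { |i| "row-#{i}" * 50 } }
```

The ordering mirrors the comment on `stop` in the source: the locations are read before the trace data is cleared.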
@@ -10,7 +10,7 @@ module DeadBro
10
10
  :circuit_breaker_retry_timeout, :disk_paths, :interfaces_ignore
11
11
 
12
12
  # Remote-managed settings (overwritten by backend JSON `settings` on successful API responses)
13
- attr_accessor :memory_tracking_enabled, :allocation_tracking_enabled,
13
+ attr_accessor :memory_tracking_enabled, :allocation_tracking_enabled, :allocation_sample_rate,
14
14
  :sample_rate, :slow_query_threshold_ms, :explain_analyze_enabled,
15
15
  :monitor_enabled, :enable_db_stats, :enable_process_stats, :enable_system_stats,
16
16
  :max_sql_queries_to_send, :max_logs_to_send
@@ -55,7 +55,7 @@ module DeadBro
55
55
  ].freeze
56
56
 
57
57
  REMOTE_SETTING_KEYS = %w[
58
- enabled sample_rate memory_tracking_enabled allocation_tracking_enabled
58
+ enabled sample_rate memory_tracking_enabled allocation_tracking_enabled allocation_sample_rate
59
59
  explain_analyze_enabled slow_query_threshold_ms max_sql_queries_to_send max_logs_to_send
60
60
  excluded_controllers excluded_jobs exclusive_controllers exclusive_jobs
61
61
  monitor_enabled enable_db_stats enable_process_stats enable_system_stats
@@ -79,6 +79,10 @@ module DeadBro
79
79
  @sample_rate = 100
80
80
  @memory_tracking_enabled = true
81
81
  @allocation_tracking_enabled = false
82
+ # When allocation tracking is on, the heavy per-request work (object-space
83
+ # sampling, allocation-source tracing) runs on this % of requests so the
84
+ # ~2-5ms overhead can be capped without turning the feature fully off.
85
+ @allocation_sample_rate = 100
82
86
  @explain_analyze_enabled = false
83
87
  @slow_query_threshold_ms = 500
84
88
  @max_sql_queries_to_send = 500
@@ -142,7 +146,7 @@ module DeadBro
142
146
  next unless REMOTE_SETTING_KEYS.include?(k)
143
147
 
144
148
  case k
145
- when "sample_rate", "slow_query_threshold_ms", "max_sql_queries_to_send", "max_logs_to_send"
149
+ when "sample_rate", "allocation_sample_rate", "slow_query_threshold_ms", "max_sql_queries_to_send", "max_logs_to_send"
146
150
  send(:"#{k}=", value.to_i)
147
151
  when "enabled", "memory_tracking_enabled", "allocation_tracking_enabled", "explain_analyze_enabled",
148
152
  "monitor_enabled", "enable_db_stats", "enable_process_stats", "enable_system_stats"
@@ -239,6 +243,20 @@ module DeadBro
239
243
  rand(1..100) <= sample_rate
240
244
  end
241
245
 
246
+ # Per-request decision: should this request pay for the heavy allocation
247
+ # tracking (object-space sampling + allocation-source tracing)? Combines the
248
+ # on/off flag with allocation_sample_rate. Decide once at request start and
249
+ # reuse the cached result for the matching stop, so start/stop agree.
250
+ def allocation_tracking_active?
251
+ return false unless allocation_tracking_enabled
252
+
253
+ rate = allocation_sample_rate.to_i
254
+ return true if rate >= 100
255
+ return false if rate <= 0
256
+
257
+ rand(1..100) <= rate
258
+ end
259
+
242
260
  # Returns the configured sample_rate only (no ENV fallback). Use DeadBro.configure or remote settings.
243
261
  def resolve_sample_rate
244
262
  @sample_rate
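The boundary behaviour of the per-request sampling decision added in this file can be sketched as a standalone function. This is an illustrative copy under assumptions, not the gem's method: the keyword arguments and the injectable `roll` parameter are hypothetical, added so the edges can be checked deterministically.

```ruby
# Hypothetical standalone version of the decision. `roll` defaults to a random
# draw in 1..100 and can be injected for deterministic checks.
def allocation_tracking_active?(enabled:, rate:, roll: rand(1..100))
  return false unless enabled

  rate = rate.to_i
  return true if rate >= 100
  return false if rate <= 0

  roll <= rate
end

# Decide once at request start and cache the answer, so the matching stop
# reads the same value instead of rolling again. The key name is made up.
Thread.current[:demo_alloc_active] = allocation_tracking_active?(enabled: true, rate: 25)
```

Caching matters because a second independent roll at request end could disagree with the first, leaving tracing started but never stopped, or the reverse.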
@@ -111,17 +111,29 @@ module DeadBro
111
111
  return {} unless req
112
112
 
113
113
  params = req.params || {}
114
- sensitive_keys = %w[password password_confirmation token secret key authorization api_key]
115
- filtered = params.dup
116
- sensitive_keys.each do |k|
117
- filtered.delete(k)
118
- filtered.delete(k.to_sym)
119
- end
120
- JSON.parse(JSON.dump(filtered)) # ensure JSON-safe
114
+ # Redact at every nesting level (e.g. user[password]) before serializing.
115
+ JSON.parse(JSON.dump(redact_sensitive(params)))
121
116
  rescue
122
117
  {}
123
118
  end
124
119
 
120
+ # Matches a key segment so nested/prefixed/suffixed sensitive keys are caught
121
+ # without redacting innocent keys like passenger_count.
122
+ SENSITIVE_SEGMENT_RE = /(?:\A|[_\-\[])(password|passwd|secret|token|api_?key|access_?key|auth|authorization|credential|ssn|credit_?card|card_?number|cvv|cvc)(?:\z|[_\-\]])/i
123
+
124
+ def redact_sensitive(value)
125
+ case value
126
+ when Hash
127
+ value.each_with_object({}) do |(k, v), memo|
128
+ memo[k] = SENSITIVE_SEGMENT_RE.match?(k.to_s) ? "[FILTERED]" : redact_sensitive(v)
129
+ end
130
+ when Array
131
+ value.map { |v| redact_sensitive(v) }
132
+ else
133
+ value
134
+ end
135
+ end
136
+
125
137
  def truncate(str, max)
126
138
  return str if str.nil? || str.length <= max
127
139
  str[0..(max - 1)]
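To see which keys the segment pattern above catches, the redaction can be exercised outside the gem. The sketch below copies the regex and the recursive method from the diff into a plain script; the sample `params` hash is invented for illustration.

```ruby
# Regex and method copied from the diff above; only the sample data is made up.
SENSITIVE_SEGMENT_RE = /(?:\A|[_\-\[])(password|passwd|secret|token|api_?key|access_?key|auth|authorization|credential|ssn|credit_?card|card_?number|cvv|cvc)(?:\z|[_\-\]])/i

def redact_sensitive(value)
  case value
  when Hash
    value.each_with_object({}) do |(k, v), memo|
      # A sensitive key hides its whole value; otherwise recurse into the value.
      memo[k] = SENSITIVE_SEGMENT_RE.match?(k.to_s) ? "[FILTERED]" : redact_sensitive(v)
    end
  when Array
    value.map { |v| redact_sensitive(v) }
  else
    value
  end
end

params = {
  "user" => {"email" => "a@example.com", "password" => "hunter2"},
  "access_token" => "abc",
  "passenger_count" => 3,
  "items" => [{"client_secret" => "s"}, {"author" => "x"}]
}
REDACTED = redact_sensitive(params)
```

Here the nested `password`, `access_token` and `client_secret` are replaced with `"[FILTERED]"`, while `passenger_count` and `author` pass through because the sensitive word is not a delimited segment of those keys.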
@@ -19,12 +19,27 @@ module DeadBro
19
19
  def self.snapshot
20
20
  return {} unless defined?(GC) && GC.respond_to?(:stat)
21
21
  stat = GC.stat
22
- {
22
+ base = {
23
23
  minor_gc_count: stat[:minor_gc_count] || 0,
24
24
  major_gc_count: stat[:major_gc_count] || 0,
25
25
  total_allocated_objects: stat[:total_allocated_objects] || 0,
26
26
  gc_time_ns: GC.respond_to?(:total_time) ? GC.total_time : nil
27
27
  }
28
+
29
+ # Memory-tracking enrichment (a few extra GC.stat reads). Only the base
30
+ # GC pressure fields above are truly always-on.
31
+ if memory_tracking_enabled?
32
+ # Live heap slots — net retained objects. Comparing this delta against
33
+ # allocated_objects separates transient churn from genuine retention.
34
+ base[:heap_live_slots] = stat[:heap_live_slots] || 0
35
+ # Bytes malloc'd outside the Ruby object heap (big strings/buffers, e.g.
36
+ # parsed JSON response bodies). These are point-in-time gauges reset by
37
+ # GC, so we report the request-end value rather than a diff.
38
+ base[:malloc_increase_bytes] = stat[:malloc_increase_bytes] || 0
39
+ base[:oldmalloc_increase_bytes] = stat[:oldmalloc_increase_bytes] || 0
40
+ end
41
+
42
+ base
28
43
  rescue
29
44
  {}
30
45
  end
@@ -34,14 +49,33 @@ module DeadBro
34
49
  gc_time_ms = if before[:gc_time_ns] && after[:gc_time_ns]
35
50
  ((after[:gc_time_ns] - before[:gc_time_ns]) / 1_000_000.0).round(3)
36
51
  end
37
- {
52
+ result = {
38
53
  minor_gc_runs: (after[:minor_gc_count] || 0) - (before[:minor_gc_count] || 0),
39
54
  major_gc_runs: (after[:major_gc_count] || 0) - (before[:major_gc_count] || 0),
40
55
  allocated_objects: (after[:total_allocated_objects] || 0) - (before[:total_allocated_objects] || 0),
41
56
  gc_time_ms: gc_time_ms
42
57
  }
58
+
59
+ # Present only when the enrichment was captured (memory tracking enabled).
60
+ if after.key?(:heap_live_slots) || before.key?(:heap_live_slots)
61
+ # Net change in live slots over the request. A small value alongside a
62
+ # large allocated_objects means the memory was transient (reclaimed by
63
+ # GC); a large value means objects were retained — the real leak signal.
64
+ result[:heap_live_slots_growth] = (after[:heap_live_slots] || 0) - (before[:heap_live_slots] || 0)
65
+ # Off-heap malloc pressure pending at request end (see snapshot).
66
+ result[:malloc_increase_bytes] = after[:malloc_increase_bytes] || 0
67
+ result[:oldmalloc_increase_bytes] = after[:oldmalloc_increase_bytes] || 0
68
+ end
69
+
70
+ result
43
71
  rescue
44
72
  {}
45
73
  end
74
+
75
+ def self.memory_tracking_enabled?
76
+ DeadBro.configuration.memory_tracking_enabled
77
+ rescue
78
+ false
79
+ end
46
80
  end
47
81
  end
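The retained-versus-transient reading of these fields can be sketched as follows. The snapshot and diff helpers follow the shape of the code above in simplified form. `retention_ratio` is a hypothetical heuristic added for illustration; the gem reports the raw fields and leaves interpretation to the backend.

```ruby
# Simplified snapshot using GC.stat keys that the diff above also reads.
def gc_snapshot
  stat = GC.stat
  {
    total_allocated_objects: stat[:total_allocated_objects] || 0,
    heap_live_slots: stat[:heap_live_slots] || 0,
    malloc_increase_bytes: stat[:malloc_increase_bytes] || 0
  }
end

def gc_diff(before, after)
  {
    allocated_objects: after[:total_allocated_objects] - before[:total_allocated_objects],
    heap_live_slots_growth: after[:heap_live_slots] - before[:heap_live_slots],
    # Point-in-time gauge: report the end value rather than a difference.
    malloc_increase_bytes: after[:malloc_increase_bytes]
  }
end

# Hypothetical heuristic: the share of allocated objects still live at the end.
# Near 0.0 suggests transient churn; near 1.0 suggests retention.
def retention_ratio(diff)
  allocated = diff[:allocated_objects]
  return 0.0 unless allocated.positive?

  (diff[:heap_live_slots_growth].to_f / allocated).clamp(0.0, 1.0)
end

before = {total_allocated_objects: 1_000, heap_live_slots: 500, malloc_increase_bytes: 0}
after = {total_allocated_objects: 101_000, heap_live_slots: 600, malloc_increase_bytes: 4_096}
SYNTHETIC_DIFF = gc_diff(before, after)
```

With these synthetic snapshots, 100,000 objects were allocated but live slots grew by only 100, which is the transient pattern the comments above describe.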
@@ -0,0 +1,131 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "active_support/notifications"
4
+
5
+ module DeadBro
6
+ # Attributes a request's object allocations to the phase that produced them —
7
+ # e.g. "92% of this request's allocations happened during Elasticsearch".
8
+ #
9
+ # This is the low-overhead companion to GcTracker (active under memory_tracking_enabled): GcTracker tells
10
+ # you *how many* objects a request allocated and whether they were retained;
11
+ # MemoryPhaseTracker tells you *where* they were allocated, so a 400MB request
12
+ # can be localized to ES deserialization vs view rendering vs controller code
13
+ # without running a full allocation profiler.
14
+ #
15
+ # Attribution is *exclusive*: a sql.active_record event nested inside a view
16
+ # render is charged only to :sql, not to both. A thread-local stack records
17
+ # the allocation counter when each phase becomes active; entering a child
18
+ # phase flushes the parent's accumulated delta and pauses it; leaving the
19
+ # child resumes the parent. Whatever isn't captured by an instrumented phase
20
+ # stays "unattributed" (controller/application code) and is derivable on the
21
+ # backend as gc_pressure.allocated_objects minus the sum of these buckets.
22
+ module MemoryPhaseTracker
23
+ THREAD_KEY = :dead_bro_memory_phases
24
+
25
+ # ActiveSupport event name => phase bucket. Each maps to a coarse phase so
26
+ # the breakdown stays readable (all view render events collapse to :view).
27
+ EVENT_PHASES = {
28
+ "sql.active_record" => :sql,
29
+ "render_template.action_view" => :view,
30
+ "render_partial.action_view" => :view,
31
+ "render_collection.action_view" => :view,
32
+ "render_layout.action_view" => :view,
33
+ "request.elasticsearch" => :elasticsearch,
34
+ "request.elastic_transport" => :elasticsearch
35
+ }.freeze
36
+
37
+ # Bridges ActiveSupport's evented-listener protocol (start/finish) onto our
38
+ # enter/leave accounting. Registered once per event name at boot.
39
+ class Listener
40
+ def initialize(phase)
41
+ @phase = phase
42
+ end
43
+
44
+ def start(_name, _id, _payload)
45
+ MemoryPhaseTracker.enter(@phase)
46
+ end
47
+
48
+ def finish(_name, _id, _payload)
49
+ MemoryPhaseTracker.leave(@phase)
50
+ end
51
+ end
52
+
53
+ def self.subscribe!
54
+ return if @subscribed
55
+ @subscribed = true
56
+ return unless allocation_counter_available?
57
+
58
+ EVENT_PHASES.each do |event_name, phase|
59
+ ActiveSupport::Notifications.subscribe(event_name, Listener.new(phase))
60
+ end
61
+ rescue StandardError
62
+ # Never raise from instrumentation install.
63
+ end
64
+
65
+ def self.start_request_tracking
66
+ Thread.current[THREAD_KEY] = {buckets: Hash.new(0), stack: []}
67
+ end
68
+
69
+ # Returns { sql: n, view: n, elasticsearch: n } of objects allocated
70
+ # exclusively within each phase, omitting phases that allocated nothing.
71
+ def self.stop_request_tracking
72
+ state = Thread.current[THREAD_KEY]
73
+ return {} unless state.is_a?(Hash)
74
+
75
+ buckets = state[:buckets]
76
+ buckets.each_with_object({}) do |(phase, count), result|
77
+ result[phase] = count if count.positive?
78
+ end
79
+ ensure
80
+ Thread.current[THREAD_KEY] = nil
81
+ end
82
+
83
+ def self.enter(phase)
84
+ state = Thread.current[THREAD_KEY]
85
+ return unless state.is_a?(Hash)
86
+
87
+ now = allocated_objects
88
+ stack = state[:stack]
89
+ if (parent = stack.last)
90
+ # Pause the parent: bank what it allocated up to this point.
91
+ state[:buckets][parent[:phase]] += now - parent[:checkpoint]
92
+ end
93
+ stack << {phase: phase, checkpoint: now}
94
+ rescue StandardError
95
+ # Best-effort only.
96
+ end
97
+
98
+ def self.leave(phase)
99
+ state = Thread.current[THREAD_KEY]
100
+ return unless state.is_a?(Hash)
101
+
102
+ stack = state[:stack]
103
+ frame = stack.pop
104
+ return unless frame
105
+
106
+ now = allocated_objects
107
+ state[:buckets][frame[:phase]] += now - frame[:checkpoint]
108
+ # Resume the parent from this point so the child's allocations aren't
109
+ # double-counted into it.
110
+ if (parent = stack.last)
111
+ parent[:checkpoint] = now
112
+ end
113
+ rescue StandardError
114
+ # Best-effort only.
115
+ end
116
+
117
+ # Single-key GC.stat returns just the Integer (no hash allocation), so this
118
+ # is cheap enough to call on every instrumented event boundary.
119
+ def self.allocated_objects
120
+ GC.stat(:total_allocated_objects)
121
+ rescue StandardError
122
+ 0
123
+ end
124
+
125
+ def self.allocation_counter_available?
126
+ defined?(GC) && GC.respond_to?(:stat) && !GC.stat(:total_allocated_objects).nil?
127
+ rescue StandardError
128
+ false
129
+ end
130
+ end
131
+ end
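The exclusive-attribution stack in the file above can be traced by hand with a dependency-free sketch. The class name `ExclusivePhaseLedger` and the injectable counter are assumptions made so the arithmetic is deterministic. The gem itself reads `GC.stat(:total_allocated_objects)` and keeps its state in a thread-local.

```ruby
# Hypothetical, dependency-free sketch of exclusive phase attribution.
class ExclusivePhaseLedger
  attr_reader :buckets

  # counter: any callable returning a monotonically increasing Integer.
  def initialize(counter = -> { GC.stat(:total_allocated_objects) })
    @counter = counter
    @buckets = Hash.new(0)
    @stack = []
  end

  def enter(phase)
    now = @counter.call
    if (parent = @stack.last)
      # Pause the parent: bank what it allocated up to this point.
      @buckets[parent[:phase]] += now - parent[:checkpoint]
    end
    @stack << {phase: phase, checkpoint: now}
  end

  def leave
    frame = @stack.pop
    return unless frame

    now = @counter.call
    @buckets[frame[:phase]] += now - frame[:checkpoint]
    # Resume the parent from here so the child's objects are not counted twice.
    if (parent = @stack.last)
      parent[:checkpoint] = now
    end
  end
end

# Scripted counter readings stand in for the allocation counter.
ticks = [100, 130, 180, 200].each
ledger = ExclusivePhaseLedger.new(-> { ticks.next })
ledger.enter(:view) # counter reads 100
ledger.enter(:sql)  # counter reads 130: view banks 30
ledger.leave        # counter reads 180: sql banks 50
ledger.leave        # counter reads 200: view banks another 20
DEMO_BUCKETS = ledger.buckets
```

In this scripted run the nested SQL work is charged 50 objects and the view 50 (30 before the query plus 20 after), so the buckets sum to the 100 objects allocated overall with no double counting.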
@@ -44,6 +44,11 @@ if defined?(Rails) && defined?(Rails::Railtie)
44
44
  require "dead_bro/ar_object_tracker"
45
45
  DeadBro::ArObjectTracker.subscribe!
46
46
 
47
+ # Install per-phase allocation attribution. Listeners are cheap and
48
+ # no-op unless a request opts in (gated on memory_tracking_enabled).
49
+ require "dead_bro/memory_phase_tracker"
50
+ DeadBro::MemoryPhaseTracker.subscribe!
51
+
47
52
  # Install view rendering tracking
48
53
  require "dead_bro/view_rendering_subscriber"
49
54
  DeadBro::ViewRenderingSubscriber.subscribe!(client: shared_client)
@@ -53,10 +58,13 @@ if defined?(Rails) && defined?(Rails::Railtie)
53
58
  require "dead_bro/memory_leak_detector"
54
59
  DeadBro::MemoryLeakDetector.initialize_history
55
60
 
56
- # Install detailed memory tracking only if enabled
61
+ # Install detailed memory + allocation-source tracking only if
62
+ # enabled. This is also where `objspace` gets loaded (via the
63
+ # sampler), so the heavyweight extension stays off the default path.
57
64
  if DeadBro.configuration.allocation_tracking_enabled
58
65
  require "dead_bro/memory_tracking_subscriber"
59
66
  DeadBro::MemoryTrackingSubscriber.subscribe!(client: shared_client)
67
+ require "dead_bro/allocation_source_sampler"
60
68
  end
61
69
 
62
70
  # Install job tracking if ActiveJob is available
@@ -46,9 +46,16 @@ module DeadBro
46
46
  DeadBro::LightweightMemoryTracker.start_request_tracking
47
47
  end
48
48
 
49
- # Start detailed memory tracking when allocation tracking is enabled
50
- if DeadBro.configuration.allocation_tracking_enabled && defined?(DeadBro::MemoryTrackingSubscriber)
51
- DeadBro::MemoryTrackingSubscriber.start_request_tracking
49
+ # Decide once whether this request pays for heavy allocation tracking
50
+ # (flag + per-request sampling). Cache the decision so the matching stop
51
+ # in Subscriber agrees with this start.
52
+ alloc_active = DeadBro.configuration.allocation_tracking_active?
53
+ Thread.current[:dead_bro_alloc_active] = alloc_active
54
+
55
+ # Start detailed memory + allocation-source tracking when active
56
+ if alloc_active
57
+ DeadBro::MemoryTrackingSubscriber.start_request_tracking if defined?(DeadBro::MemoryTrackingSubscriber)
58
+ DeadBro::AllocationSourceSampler.start if defined?(DeadBro::AllocationSourceSampler)
52
59
  end
53
60
 
54
61
  # Start Elasticsearch tracking for this request
@@ -64,6 +71,11 @@ module DeadBro
64
71
  # Start GC pressure tracking — snapshot before any app code runs
65
72
  DeadBro::GcTracker.start_request_tracking if defined?(DeadBro::GcTracker)
66
73
 
74
+ # Start per-phase allocation attribution (~0.1ms; under memory tracking)
75
+ if DeadBro.configuration.memory_tracking_enabled && defined?(DeadBro::MemoryPhaseTracker)
76
+ DeadBro::MemoryPhaseTracker.start_request_tracking
77
+ end
78
+
67
79
  # Start AR object instantiation counting for this request
68
80
  DeadBro::ArObjectTracker.start_request_tracking if defined?(DeadBro::ArObjectTracker)
69
81
 
@@ -110,6 +122,13 @@ module DeadBro
110
122
  # Bypass stop_request_tracking intentionally — cleanup only, no return value needed here.
111
123
  Thread.current[DeadBro::ArObjectTracker::THREAD_KEY] = nil if defined?(DeadBro::ArObjectTracker)
112
124
  Thread.current[DeadBro::CpuTracker::THREAD_KEY] = nil if defined?(DeadBro::CpuTracker)
125
+ Thread.current[DeadBro::MemoryPhaseTracker::THREAD_KEY] = nil if defined?(DeadBro::MemoryPhaseTracker)
126
+ # Safety net: ensure allocation tracing is never left running across
127
+ # requests (Subscriber normally stops it after analyzing).
128
+ if Thread.current[:dead_bro_alloc_active]
129
+ DeadBro::AllocationSourceSampler.stop if defined?(DeadBro::AllocationSourceSampler)
130
+ end
131
+ Thread.current[:dead_bro_alloc_active] = nil
113
132
  Thread.current[DeadBro::TRACKING_START_TIME_KEY] = nil
114
133
  end
115
134
 
@@ -79,10 +79,34 @@ module DeadBro
79
79
  view_events = DeadBro::ViewRenderingSubscriber.stop_request_tracking
80
80
  view_performance = DeadBro::ViewRenderingSubscriber.analyze_view_performance(view_events)
81
81
 
82
- # Stop memory tracking and get collected memory data
83
- if DeadBro.configuration.allocation_tracking_enabled && defined?(DeadBro::MemoryTrackingSubscriber)
82
+ # Per-phase allocation attribution (under memory tracking): which phase
83
+ # allocated the request's objects (sql / view / elasticsearch).
84
+ allocation_phases = if DeadBro.configuration.memory_tracking_enabled && defined?(DeadBro::MemoryPhaseTracker)
85
+ DeadBro::MemoryPhaseTracker.stop_request_tracking
86
+ else
87
+ {}
88
+ end
89
+
90
+ # Stop memory tracking and get collected memory data. The decision to do
91
+ # heavy allocation tracking was made (and sampled) at request start.
92
+ if Thread.current[:dead_bro_alloc_active] && defined?(DeadBro::MemoryTrackingSubscriber)
84
93
  detailed_memory = DeadBro::MemoryTrackingSubscriber.stop_request_tracking
85
94
  memory_performance = DeadBro::MemoryTrackingSubscriber.analyze_memory_performance(detailed_memory)
95
+
96
+ # Allocation-source + by-bytes-type diagnostics. Read the trace data
97
+ # before stopping the sampler (stop clears the source locations).
98
+ if defined?(DeadBro::AllocationSourceSampler)
99
+ growth = (detailed_memory[:memory_after].to_f - detailed_memory[:memory_before].to_f)
100
+ source_analysis = DeadBro::AllocationSourceSampler.analyze(memory_growth_mb: growth)
101
+ DeadBro::AllocationSourceSampler.stop
102
+ if source_analysis.is_a?(Hash) && source_analysis.any?
103
+ memory_performance[:allocation_sources] = source_analysis[:by_source]
104
+ memory_performance[:memsize_by_type] = source_analysis[:by_type_bytes]
105
+ memory_performance[:allocation_sample_rate] = source_analysis[:sample_rate]
106
+ memory_performance[:allocation_sources_skipped] = source_analysis[:skipped] if source_analysis[:skipped]
107
+ end
108
+ end
109
+
86
110
  # Keep memory_events compact and user-friendly (no large raw arrays)
87
111
  memory_events = {
88
112
  memory_before: detailed_memory[:memory_before],
@@ -189,6 +213,7 @@ module DeadBro
189
213
  view_performance: view_performance,
190
214
  memory_events: memory_events,
191
215
  memory_performance: memory_performance,
216
+ allocation_phases: allocation_phases,
192
217
  rack_duration_ms: rack_duration_ms,
193
218
  queue_duration_ms: Thread.current[:dead_bro_queue_duration_ms],
194
219
  db_connection_wait_ms: db_connection_stats[:wait_ms],
@@ -212,9 +237,12 @@ module DeadBro
212
237
  DeadBro::ElasticsearchSubscriber.stop_request_tracking if defined?(DeadBro::ElasticsearchSubscriber)
213
238
  DeadBro::ViewRenderingSubscriber.stop_request_tracking if defined?(DeadBro::ViewRenderingSubscriber)
214
239
  DeadBro::LightweightMemoryTracker.stop_request_tracking if defined?(DeadBro::LightweightMemoryTracker)
215
- if DeadBro.configuration.allocation_tracking_enabled && defined?(DeadBro::MemoryTrackingSubscriber)
216
- DeadBro::MemoryTrackingSubscriber.stop_request_tracking
240
+ DeadBro::MemoryPhaseTracker.stop_request_tracking if defined?(DeadBro::MemoryPhaseTracker)
241
+ if Thread.current[:dead_bro_alloc_active]
242
+ DeadBro::MemoryTrackingSubscriber.stop_request_tracking if defined?(DeadBro::MemoryTrackingSubscriber)
243
+ DeadBro::AllocationSourceSampler.stop if defined?(DeadBro::AllocationSourceSampler)
217
244
  end
245
+ Thread.current[:dead_bro_alloc_active] = nil
218
246
  Thread.current[:dead_bro_http_events] = nil
219
247
  DeadBro::DbConnectionSubscriber.stop_request_tracking if defined?(DeadBro::DbConnectionSubscriber)
220
248
  DeadBro::GcTracker.stop_request_tracking if defined?(DeadBro::GcTracker)
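The cleanup hunk above replaces a configuration check with a per-thread flag, `Thread.current[:dead_bro_alloc_active]`, and resets it afterwards. Presumably this lets cleanup mirror what was actually started for the current request. A minimal sketch of that pattern follows; `FakeTracker`, `start_request`, `cleanup_request`, and the `:sketch_alloc_active` key are hypothetical names chosen for illustration.

```ruby
# Hypothetical stand-in for an expensive tracker; not part of dead_bro.
class FakeTracker
  attr_reader :stop_calls

  def initialize
    @stop_calls = 0
  end

  def stop
    @stop_calls += 1
  end
end

TRACKER = FakeTracker.new

# Request start: record, per thread, whether the expensive tracker was started.
def start_request(sampled:)
  Thread.current[:sketch_alloc_active] = sampled
end

# Cleanup: consult the per-thread flag, then clear it so the next request
# served by this thread starts from a clean state.
def cleanup_request
  TRACKER.stop if Thread.current[:sketch_alloc_active]
  Thread.current[:sketch_alloc_active] = nil
end

start_request(sampled: true)
cleanup_request   # stops the tracker once

start_request(sampled: false)
cleanup_request   # tracker was never started, so nothing to stop
```

Clearing the flag matters on servers that reuse threads across requests: a stale truthy value would otherwise trigger a stop for a request that never started tracking.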
@@ -247,14 +275,11 @@ module DeadBro
247
275
  # Remove router-provided keys that we already send at top-level
248
276
  router_keys = %w[controller action format]
249
277
 
250
- # Filter out sensitive parameters
251
- sensitive_keys = %w[password password_confirmation token secret key]
252
-
253
278
  filtered = params.dup
254
279
  router_keys.each { |k| filtered.delete(k) || filtered.delete(k.to_sym) }
255
- filtered = filtered.except(*sensitive_keys, *sensitive_keys.map(&:to_sym)) if filtered.respond_to?(:except)
256
280
 
257
- # Truncate deeply to keep payload small and safe
281
+ # Truncate deeply to keep payload small and safe. truncate_value also redacts
282
+ # sensitive keys at every nesting level (e.g. user[password]).
258
283
  truncate_value(filtered)
259
284
  rescue
260
285
  {}
@@ -264,7 +289,18 @@ module DeadBro
264
289
  str.to_s.gsub("\x00", "")
265
290
  end
266
291
 
267
- # Recursively truncate values to reasonable sizes to avoid huge payloads
292
+ # Matched against a key segment (delimited by start/end, underscore, dash, or
293
+ # bracket) so nested and prefixed/suffixed keys are caught — e.g. user[password],
294
+ # access_token, client_secret — without redacting innocent keys like
295
+ # passenger_count or cardinality.
296
+ SENSITIVE_SEGMENT_RE = /(?:\A|[_\-\[])(password|passwd|secret|token|api_?key|access_?key|auth|authorization|credential|ssn|credit_?card|card_?number|cvv|cvc)(?:\z|[_\-\]])/i
297
+
298
+ def self.sensitive_key?(key)
299
+ SENSITIVE_SEGMENT_RE.match?(key.to_s)
300
+ end
301
+
302
+ # Recursively truncate values to reasonable sizes to avoid huge payloads, and
303
+ # redact values whose key looks sensitive at any nesting level.
268
304
  def self.truncate_value(value, max_str: 200, max_array: 20, max_hash_keys: 30)
269
305
  case value
270
306
  when String
@@ -277,7 +313,7 @@ module DeadBro
277
313
  when Hash
278
314
  entries = value.to_a[0, max_hash_keys]
279
315
  entries.each_with_object({}) do |(k, v), memo|
280
- memo[k] = truncate_value(v, max_str: max_str, max_array: max_array, max_hash_keys: max_hash_keys)
316
+ memo[k] = sensitive_key?(k) ? "[FILTERED]" : truncate_value(v, max_str: max_str, max_array: max_array, max_hash_keys: max_hash_keys)
281
317
  end
282
318
  else
283
319
  (value.to_s.length > max_str) ? value.to_s[0, max_str] + "…" : value.to_s
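To show how the segment-based pattern behaves, here is a minimal sketch that reuses the regular expression from the hunk above inside a simplified recursive walk. `RedactionSketch` and `redact` are hypothetical names, and the walk leaves out the truncation limits that `truncate_value` applies.

```ruby
# Illustrative sketch only: the module and the redact method are hypothetical,
# not the gem's internal layout. The regex is copied from the diff above.
module RedactionSketch
  SENSITIVE_SEGMENT_RE = /(?:\A|[_\-\[])(password|passwd|secret|token|api_?key|access_?key|auth|authorization|credential|ssn|credit_?card|card_?number|cvv|cvc)(?:\z|[_\-\]])/i

  def self.sensitive_key?(key)
    SENSITIVE_SEGMENT_RE.match?(key.to_s)
  end

  # Simplified walk: redacts by key at every nesting level, without truncation.
  def self.redact(value)
    case value
    when Hash
      value.each_with_object({}) do |(k, v), memo|
        memo[k] = sensitive_key?(k) ? "[FILTERED]" : redact(v)
      end
    when Array
      value.map { |v| redact(v) }
    else
      value
    end
  end
end

params = {
  "user" => {"email" => "a@example.com", "password" => "hunter2"},
  "access_token" => "abc123",
  "passenger_count" => 3
}
redacted = RedactionSketch.redact(params)
```

With this input, the nested `password` and the top-level `access_token` are replaced by `"[FILTERED]"`, while `email` and `passenger_count` pass through unchanged. One consequence of the pattern as written: a bare `key` parameter, which the removed `sensitive_keys` list covered, does not match, since only `api_key`/`apikey` and `access_key`/`accesskey` style segments are listed.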
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module DeadBro
4
- VERSION = "0.2.24"
4
+ VERSION = "0.2.25"
5
5
  end
data/lib/dead_bro.rb CHANGED
@@ -19,6 +19,8 @@ module DeadBro
19
19
  autoload :MemoryLeakDetector, "dead_bro/memory_leak_detector"
20
20
  autoload :LightweightMemoryTracker, "dead_bro/lightweight_memory_tracker"
21
21
  autoload :GcTracker, "dead_bro/gc_tracker"
22
+ autoload :MemoryPhaseTracker, "dead_bro/memory_phase_tracker"
23
+ autoload :AllocationSourceSampler, "dead_bro/allocation_source_sampler"
22
24
  autoload :ArObjectTracker, "dead_bro/ar_object_tracker"
23
25
  autoload :CpuTracker, "dead_bro/cpu_tracker"
24
26
  autoload :MemoryHelpers, "dead_bro/memory_helpers"
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: dead_bro
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.24
4
+ version: 0.2.25
5
5
  platform: ruby
6
6
  authors:
7
7
  - Emanuel Comsa
@@ -20,6 +20,7 @@ files:
20
20
  - CHANGELOG.md
21
21
  - README.md
22
22
  - lib/dead_bro.rb
23
+ - lib/dead_bro/allocation_source_sampler.rb
23
24
  - lib/dead_bro/ar_object_tracker.rb
24
25
  - lib/dead_bro/cache_subscriber.rb
25
26
  - lib/dead_bro/circuit_breaker.rb
@@ -47,6 +48,7 @@ files:
47
48
  - lib/dead_bro/memory_details.rb
48
49
  - lib/dead_bro/memory_helpers.rb
49
50
  - lib/dead_bro/memory_leak_detector.rb
51
+ - lib/dead_bro/memory_phase_tracker.rb
50
52
  - lib/dead_bro/memory_tracking_subscriber.rb
51
53
  - lib/dead_bro/monitor.rb
52
54
  - lib/dead_bro/railtie.rb