dead_bro 0.2.19 → 0.2.21
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +42 -0
- data/lib/dead_bro/cache_subscriber.rb +5 -2
- data/lib/dead_bro/client.rb +25 -17
- data/lib/dead_bro/collectors/process_info.rb +1 -0
- data/lib/dead_bro/configuration.rb +4 -4
- data/lib/dead_bro/elasticsearch_subscriber.rb +8 -5
- data/lib/dead_bro/error_middleware.rb +1 -0
- data/lib/dead_bro/http_instrumentation.rb +7 -1
- data/lib/dead_bro/job_subscriber.rb +2 -0
- data/lib/dead_bro/monitor.rb +14 -2
- data/lib/dead_bro/railtie.rb +1 -1
- data/lib/dead_bro/redis_subscriber.rb +17 -2
- data/lib/dead_bro/sql_subscriber.rb +5 -0
- data/lib/dead_bro/subscriber.rb +2 -0
- data/lib/dead_bro/version.rb +1 -1
- data/lib/dead_bro/view_rendering_subscriber.rb +11 -4
- data/lib/dead_bro.rb +24 -0
- metadata +2 -3
- data/FEATURES.md +0 -333
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 96a0bc8cf707ea8af5d3bcd0be76fa02e28d09031c8a7897a311754cda8f7440
+  data.tar.gz: 4a1407455415636db2858a28f0f77a8bb530d45fa916a359f85ae22f5f4294bb
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 493d8d79e5eedd0d0443ddc3922ace02ed2f606220e7cff07e07484c9a628a85b54a5160551cb816630e06db34886e22ebae5c6c95e3697784b67a8c71b5b1ad
+  data.tar.gz: 715064a097fadfb5ef97b96be9086f760f91efa004d3d339978f87ab87b6dce935ba5111c60c92a9f1849ec5a17a80e83efbd9f9cab031a963392f03a1055b2e
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,41 @@
 ## [Unreleased]
 
+### Added
+- Monitor thread now sends a synchronous heartbeat on startup before the first collection tick. This ensures remote settings — including `monitor_enabled` — are applied from the very first reporting cycle, so Sidekiq workers and other non-web processes that have not yet sent any metrics still receive the correct configuration immediately on boot rather than waiting up to 60 seconds for the first scheduled tick.
+
+## [0.2.21] - 2026-06-02
+
+### Added
+- **Per-span timing for the Request Trace view.** Every instrumented event now includes a `start_offset_ms` field — the wall-clock milliseconds from rack entry to when that event started. This powers the waterfall visualisation in the DeadBro dashboard without any configuration changes.
+  - `SqlSubscriber`: `start_offset_ms` is computed from `TRACKING_START_TIME_KEY` using the `started` timestamp provided by `ActiveSupport::Notifications`. Stored on both the raw per-query hash and on the first-occurrence aggregate entry (so the bar is positioned at the actual time the query first ran, not a fabricated cumulative offset).
+  - `ViewRenderingSubscriber`: same pattern applied to template, partial, and collection render events. `start_offset_ms` is captured for the first render of each unique identifier and stored on the aggregate.
+  - `CacheSubscriber`: `start_offset_ms` derived from the `started` `ActiveSupport::Notifications` timestamp, added to every cache event hash and passed through `build_event`.
+  - `RedisSubscriber`: `wall_start = Time.now` is captured at the entry point of each instrumented block (`call`, `call_pipeline`, `call_multi`). The monotonic clock is still used for duration accuracy; `wall_start` is used only for the offset relative to `TRACKING_START_TIME_KEY`. Also applied to the `ActiveSupport::Notifications` fallback path in `install_notifications_subscription!`.
+  - `ElasticsearchSubscriber`: `start_offset_ms` added to `build_event`; the `record` method (called from `HttpInstrumentation` for Net::HTTP-based ES requests) now accepts a `start_offset_ms:` keyword argument. The `ActiveSupport::Notifications` subscription path (`request.elasticsearch` / `request.elastic_transport`) computes it from `started`.
+  - `HttpInstrumentation`: `wall_start = Time.now` captured alongside the existing monotonic `start_time`. `start_offset_ms` is included in the HTTP outgoing payload and forwarded to `ElasticsearchSubscriber.record` for requests routed to an ES host.
+
+## [0.2.20] - 2026-05-29
+
+### Added
+- Monitor thread sends a synchronous heartbeat immediately on startup (before the first scheduled collection tick) so that remote settings — including `monitor_enabled` — are applied from the very first reporting cycle. Sidekiq workers and other non-web processes now receive the correct configuration on boot rather than waiting up to 60 seconds for the first tick.
+- `gem_version` field added to every heartbeat payload so the dashboard can display and compare the running gem version per application.
+- `process_kind` included in all system monitor payloads, linking server metrics to the correct process type.
+- `post_heartbeat` now accepts a `sync: true` keyword for situations that require a blocking network call before proceeding (used by the monitor startup path).
+
+## [0.2.19] - 2026-05-28
+
+### Added
+- **Error fingerprinting**: every unhandled exception payload now includes a stable `fingerprint` string derived from the exception class, a normalised version of the message (numeric IDs and UUIDs stripped), and the top application stack frame. Identical errors that differ only in record IDs or UUIDs produce the same fingerprint, enabling reliable grouping and deduplication on the server.
+- `DeadBro.process_kind` auto-detects the type of the current Ruby process by inspecting `$PROGRAM_NAME` and `/proc/self/cmdline`: returns `"web"` (Puma/Passenger/Unicorn/Falcon), `"worker"` (Sidekiq/GoodJob/SolidQueue/DelayedJob), `"console"`, `"task"`, or `"app"` as a fallback. The value is memoised after the first call.
+- `process_kind` included in error event payloads so the backend knows whether an exception came from a web request or a background worker.
+
+## [0.2.18] - 2026-05-27
+
+### Added
+- **N+1 detection in the gem**: SQL queries are normalised (bind parameters, numeric literals, and `IN (...)` lists replaced with `?`) and counted per request. When the same normalised query fires 5 or more times, it is flagged as `n_plus_one: true` on its aggregate entry. A backtrace is captured exactly at the N+1 threshold rather than on every execution, keeping overhead low while still pointing to the callsite.
+- **SQL aggregation**: instead of shipping a raw array of every query, the gem now groups queries by normalised SQL and sends one aggregate entry per unique pattern with `count`, `total_duration_ms`, `min_duration_ms`, `max_duration_ms`, `total_allocations`, and `cached_count`. This reduces payload size on N+1-heavy requests and makes the SQL breakdown directly usable without server-side grouping.
+- **View rendering aggregation**: template, partial, and collection renders are aggregated per identifier (last three path segments). Each entry carries `count`, `total_duration_ms`, `min_duration_ms`, `max_duration_ms`, `rendered_at_min/max`, and cache hit counts. Aggregation happens on the thread-local stack so there is no GC pressure from intermediate arrays.
+
 ## [0.2.17] - 2026-05-25
 
 ### Added
@@ -10,6 +46,12 @@
 ### Added
 - `ArObjectTracker`: subscribes to Rails' built-in `instantiation.active_record` notification to count the total number of ActiveRecord model instances hydrated during each request or background job. The count is reported as `ar_instantiation_count` in every payload. Uses a thread-local counter with an idempotent `subscribe!` guard, matching the same start/stop lifecycle as `GcTracker`. No monkey-patching required — Rails emits this event natively with a `record_count` field that accumulates correctly across batch loads.
 
+## [0.2.15] - 2026-05-24
+
+### Added
+- **`GcTracker`**: records a GC snapshot at the start and end of every request and background job. Reports `gc_minor_runs`, `gc_major_runs`, `gc_allocated_objects`, `gc_time_ms`, and `heap_pages_increase` as a `gc_pressure` hash in every payload. Uses `GC.stat` and `GC::Profiler` with an idempotent subscribe guard; overhead is negligible when no GC cycles occur.
+- **`SqlAllocListener`**: measures GC allocation deltas per SQL event by snapshotting `GC.stat[:total_allocated_objects]` in the notification `start` callback and diffing in `finish`. The delta is stored by notification ID and merged into the corresponding query's `allocations` field, allowing the dashboard to surface allocation-heavy queries independently of their duration.
+
 ## [0.2.14] - 2026-05-23
 
 ### Added
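The 0.2.18 changelog entries describe SQL normalisation, per-pattern aggregation, and the N+1 flag at 5 executions. The following is a minimal sketch of that idea, not the gem's actual code: the names `normalize_sql`, `aggregate_queries`, and `N_PLUS_ONE_THRESHOLD` are hypothetical, the regular expressions are assumptions, and `IN (...)` lists are collapsed to `IN (?)` here for readability.

```ruby
# Hypothetical threshold, taken from the changelog's "5 or more times".
N_PLUS_ONE_THRESHOLD = 5

# Replace IN lists, string literals, bind parameters and numeric literals
# with "?" so that queries differing only in values share one key.
# The exact patterns are assumptions made for this sketch.
def normalize_sql(sql)
  sql.to_s
    .gsub(/\bIN\s*\([^)]*\)/i, "IN (?)")
    .gsub(/'(?:[^']|'')*'/, "?")
    .gsub(/\$\d+/, "?")
    .gsub(/\b\d+(?:\.\d+)?\b/, "?")
    .gsub(/\s+/, " ")
    .strip
end

# Group raw query hashes ({sql:, duration_ms:}) by normalised SQL and
# return one aggregate entry per unique pattern.
def aggregate_queries(queries)
  aggregates = {}
  queries.each do |query|
    key = normalize_sql(query[:sql])
    agg = (aggregates[key] ||= {
      sql: key, count: 0, total_duration_ms: 0.0,
      min_duration_ms: nil, max_duration_ms: nil, n_plus_one: false
    })
    duration = query[:duration_ms]
    agg[:count] += 1
    agg[:total_duration_ms] += duration
    agg[:min_duration_ms] = [agg[:min_duration_ms], duration].compact.min
    agg[:max_duration_ms] = [agg[:max_duration_ms], duration].compact.max
    # Flag the pattern once it reaches the threshold.
    agg[:n_plus_one] = true if agg[:count] >= N_PLUS_ONE_THRESHOLD
  end
  aggregates.values
end
```

Keying on the normalised string is what makes the payload shrink on N+1-heavy requests: five lookups that differ only in an ID become a single entry with `count: 5`.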
data/lib/dead_bro/cache_subscriber.rb
CHANGED
@@ -24,7 +24,9 @@ module DeadBro
         next unless Thread.current[THREAD_LOCAL_KEY]
 
         duration_ms = ((finished - started) * 1000.0).round(2)
-
+        tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
+        start_offset_ms = tracking_start ? ((started - tracking_start) * 1000.0).round(2) : nil
+        event = build_event(name, data, duration_ms, start_offset_ms)
         if event && should_continue_tracking?
           Thread.current[THREAD_LOCAL_KEY] << event
         end
@@ -63,12 +65,13 @@ module DeadBro
       true
     end
 
-    def self.build_event(name, data, duration_ms)
+    def self.build_event(name, data, duration_ms, start_offset_ms = nil)
       return nil unless data.is_a?(Hash)
 
       {
         event: name,
         duration_ms: duration_ms,
+        start_offset_ms: start_offset_ms,
         key: safe_key(data[:key]),
         keys_count: safe_keys_count(data[:keys]),
         hit: infer_hit(name, data),
data/lib/dead_bro/client.rb
CHANGED
@@ -32,18 +32,33 @@ module DeadBro
       nil
     end
 
-    def post_heartbeat
+    def post_heartbeat(sync: false)
       return if @configuration.api_key.nil?
 
       @configuration.last_heartbeat_attempt_at = Time.now.utc
-      body = {event: "heartbeat", payload: {}, sent_at: Time.now.utc.iso8601, revision: @configuration.resolve_deploy_id, gem_version: DeadBro::VERSION}
-
-
-
-
-
-
-
+      body = {event: "heartbeat", payload: {rails_env: DeadBro.env}, sent_at: Time.now.utc.iso8601, revision: @configuration.resolve_deploy_id, gem_version: DeadBro::VERSION}
+
+      if sync
+        # Called from the monitor thread on startup — run inline so settings are
+        # applied before the first collection tick.
+        uri = URI.parse(metrics_endpoint_url)
+        http = Net::HTTP.new(uri.host, uri.port)
+        http.use_ssl = (uri.scheme == "https")
+        http.open_timeout = @configuration.open_timeout
+        http.read_timeout = @configuration.read_timeout
+        request = Net::HTTP::Post.new(uri.request_uri)
+        request["Content-Type"] = "application/json"
+        request["Authorization"] = "Bearer #{@configuration.api_key}"
+        request.body = JSON.dump(body)
+        perform_request(http, request, event_name: "heartbeat", apply_settings: true)
+      else
+        dispatch_request(
+          url: metrics_endpoint_url,
+          body: body,
+          event_name: "heartbeat",
+          apply_settings: true
+        )
+      end
 
       nil
     end
@@ -52,7 +67,7 @@ module DeadBro
       return if @configuration.api_key.nil?
       return unless @configuration.enabled
       return if @configuration.skip_tracking?
-      return unless @configuration.
+      return unless @configuration.monitor_enabled
       return if circuit_open?
 
       body = {payload: payload, sent_at: Time.now.utc.iso8601, revision: @configuration.resolve_deploy_id, gem_version: DeadBro::VERSION}
@@ -206,12 +221,5 @@ module DeadBro
       )
     end
 
-    def log_debug(message)
-      if defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger
-        Rails.logger.debug(message)
-      else
-        $stdout.puts(message)
-      end
-    end
   end
 end
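The `sync:` keyword added to `post_heartbeat` selects between a blocking call and the existing dispatch path. The sketch below shows only that control-flow pattern, with no network access. The `dispatch` helper is hypothetical, and the assumption that the non-sync path runs on a background thread is inferred from the removed FEATURES.md ("Asynchronous Metrics Posting"), not from the body of `dispatch_request`, which this diff does not show.

```ruby
# Run the given work inline when sync is true; otherwise hand it to a
# background thread and return that thread immediately.
def dispatch(sync: false, &work)
  if sync
    work.call
    nil
  else
    Thread.new { work.call }
  end
end

# Usage: a startup path that needs the result before continuing passes
# sync: true; regular reporting keeps the non-blocking default.
log = []
dispatch(sync: true) { log << :initial_settings_fetched }
worker = dispatch { log << :regular_heartbeat_sent }
worker.join
```

The blocking variant matters only where a later step depends on the response, as with the monitor thread that needs `monitor_enabled` before its first tick.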
data/lib/dead_bro/configuration.rb
CHANGED
@@ -12,7 +12,7 @@ module DeadBro
     # Remote-managed settings (overwritten by backend JSON `settings` on successful API responses)
     attr_accessor :memory_tracking_enabled, :allocation_tracking_enabled,
       :sample_rate, :slow_query_threshold_ms, :explain_analyze_enabled,
-      :
+      :monitor_enabled, :enable_db_stats, :enable_process_stats, :enable_system_stats,
       :max_sql_queries_to_send, :max_logs_to_send
 
     # Readers for exclusion lists. Writers are defined below so we can compile
@@ -58,7 +58,7 @@ module DeadBro
       enabled sample_rate memory_tracking_enabled allocation_tracking_enabled
       explain_analyze_enabled slow_query_threshold_ms max_sql_queries_to_send max_logs_to_send
       excluded_controllers excluded_jobs exclusive_controllers exclusive_jobs
-
+      monitor_enabled enable_db_stats enable_process_stats enable_system_stats
     ].freeze
 
     def initialize
@@ -87,7 +87,7 @@ module DeadBro
       self.excluded_jobs = []
       self.exclusive_controllers = []
       self.exclusive_jobs = []
-      @
+      @monitor_enabled = false
       @enable_db_stats = false
       @enable_process_stats = false
       @enable_system_stats = false
@@ -145,7 +145,7 @@ module DeadBro
         when "sample_rate", "slow_query_threshold_ms", "max_sql_queries_to_send", "max_logs_to_send"
           send(:"#{k}=", value.to_i)
         when "enabled", "memory_tracking_enabled", "allocation_tracking_enabled", "explain_analyze_enabled",
-          "
+          "monitor_enabled", "enable_db_stats", "enable_process_stats", "enable_system_stats"
           send(:"#{k}=", !!value)
         when "excluded_controllers", "excluded_jobs", "exclusive_controllers", "exclusive_jobs"
           send(:"#{k}=", Array(value).map(&:to_s))
data/lib/dead_bro/elasticsearch_subscriber.rb
CHANGED
@@ -13,12 +13,12 @@ module DeadBro
     end
 
     # Called by HttpInstrumentation when it detects a Net::HTTP request to an ES host.
-    def self.record(method:, path:, status:, duration_ms:)
+    def self.record(method:, path:, status:, duration_ms:, start_offset_ms: nil)
       events = Thread.current[THREAD_LOCAL_KEY]
       return unless events
       return unless should_continue_tracking?
 
-      events << build_event(method, path, status, duration_ms)
+      events << build_event(method, path, status, duration_ms, start_offset_ms)
     rescue
     end
 
@@ -120,20 +120,23 @@ module DeadBro
             duration_ms = ((finished - started) * 1000.0).round(2)
             method = payload[:method].to_s.upcase
             path = payload[:path].to_s
-
+            tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
+            start_offset_ms = tracking_start ? ((started - tracking_start) * 1000.0).round(2) : nil
+            events << build_event(method, path, payload[:status], duration_ms, start_offset_ms)
           rescue
           end
         end
       rescue
       end
 
-      def build_event(method, path, status, duration_ms)
+      def build_event(method, path, status, duration_ms, start_offset_ms = nil)
         {
           method: method.to_s.upcase,
           path: sanitize_path(path),
           operation: extract_operation(method, path),
           status: status,
-          duration_ms: duration_ms
+          duration_ms: duration_ms,
+          start_offset_ms: start_offset_ms
         }
       end
     end
data/lib/dead_bro/http_instrumentation.rb
CHANGED
@@ -21,6 +21,7 @@ module DeadBro
       mod = Module.new do
         define_method(:request) do |req, body = nil, &block|
           start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+          wall_start = Time.now
           response = nil
           error = nil
           begin
@@ -46,6 +47,9 @@ module DeadBro
               # must still be tracked; only skip the deadbro backend itself.
               skip_instrumentation = !is_es_host && uri && (uri.to_s.include?("localhost") || uri.to_s.include?("aberatii.com"))
 
+              tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
+              start_offset_ms = tracking_start ? ((wall_start - tracking_start) * 1000.0).round(2) : nil
+
               if is_es_host
                 # Route to elasticsearch subscriber instead of http_outgoing
                 if Thread.current[DeadBro::ElasticsearchSubscriber::THREAD_LOCAL_KEY]
@@ -54,7 +58,8 @@ module DeadBro
                     method: req.method,
                     path: path,
                     status: response && response.code.to_i,
-                    duration_ms: duration_ms
+                    duration_ms: duration_ms,
+                    start_offset_ms: start_offset_ms
                   )
                 end
               elsif !skip_instrumentation
@@ -67,6 +72,7 @@ module DeadBro
                   path: (uri && uri.path) || req.path,
                   status: response && response.code.to_i,
                   duration_ms: duration_ms,
+                  start_offset_ms: start_offset_ms,
                   exception: error && error.class.name
                 }
                 if Thread.current[THREAD_LOCAL_KEY] && DeadBro::HttpInstrumentation.should_continue_tracking?
data/lib/dead_bro/job_subscriber.rb
CHANGED
@@ -115,6 +115,7 @@ module DeadBro
           sql_queries: sql_queries,
           rails_env: DeadBro.env,
           host: safe_host,
+          process_kind: DeadBro.process_kind,
           memory_usage: memory_usage_mb,
           gc_stats: gc_stats,
           memory_events: memory_events,
@@ -213,6 +214,7 @@ module DeadBro
           backtrace: Array(exception&.backtrace).first(50),
           rails_env: DeadBro.env,
           host: safe_host,
+          process_kind: DeadBro.process_kind,
           memory_usage: memory_usage_mb,
           gc_stats: gc_stats,
           memory_events: memory_events,
data/lib/dead_bro/monitor.rb
CHANGED
@@ -24,6 +24,16 @@ module DeadBro
       @running = true
       @thread = Thread.new do
         Thread.current.abort_on_exception = false
+
+        # Fetch initial settings before the first collection tick so processes
+        # that haven't yet posted any metrics (e.g. Sidekiq at boot) still get
+        # monitor_enabled and other remote settings from the backend.
+        begin
+          @client.post_heartbeat(sync: true)
+        rescue => e
+          log_error("Error fetching initial settings: #{e.message}")
+        end
+
         loop do
           break unless @running
 
@@ -59,6 +69,7 @@ module DeadBro
         environment: DeadBro.env,
         host: process_hostname,
         pid: Process.pid,
+        process_kind: DeadBro.process_kind,
         current_time: Time.now.utc.iso8601,
         jobs: DeadBro::Collectors::Jobs.collect,
         network: DeadBro::Collectors::Network.collect
@@ -86,10 +97,11 @@ module DeadBro
     end
 
     def log_error(message)
+      msg = "[DeadBro::Monitor] #{message}"
       if defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger
-        Rails.logger.error(
+        Rails.logger.error(msg)
       else
-
+        warn(msg)
       end
     end
 
data/lib/dead_bro/railtie.rb
CHANGED
@@ -68,7 +68,7 @@ if defined?(Rails) && defined?(Rails::Railtie)
       end
 
       # Always start the monitor thread. The thread runs every 60s but
-      # post_monitor_stats skips the HTTP POST when
+      # post_monitor_stats skips the HTTP POST when monitor_enabled
       # is false, so the backend can toggle monitoring on/off mid-process.
       require "dead_bro/monitor"
       DeadBro.monitor = DeadBro::Monitor.new(client: shared_client)
data/lib/dead_bro/redis_subscriber.rb
CHANGED
@@ -63,6 +63,7 @@ module DeadBro
       def record_redis_command(command)
         return yield unless Thread.current[RedisSubscriber::THREAD_LOCAL_KEY]
 
+        wall_start = Time.now
         start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
         error = nil
         begin
@@ -77,12 +78,15 @@ module DeadBro
 
           begin
             cmd_info = extract_command_info(command)
+            tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
+            start_offset_ms = tracking_start ? ((wall_start - tracking_start) * 1000.0).round(2) : nil
             event = {
               event: "redis.command",
               command: cmd_info[:command],
               key: cmd_info[:key],
               args_count: cmd_info[:args_count],
               duration_ms: duration_ms,
+              start_offset_ms: start_offset_ms,
               db: safe_db(@db),
               error: error ? error.class.name : nil
             }
@@ -98,6 +102,7 @@ module DeadBro
       def record_redis_pipeline(pipeline)
         return yield unless Thread.current[RedisSubscriber::THREAD_LOCAL_KEY]
 
+        wall_start = Time.now
         start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
         begin
           result = yield
@@ -108,10 +113,13 @@ module DeadBro
 
           begin
             commands_count = pipeline.commands&.length || 0
+            tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
+            start_offset_ms = tracking_start ? ((wall_start - tracking_start) * 1000.0).round(2) : nil
             event = {
               event: "redis.pipeline",
               commands_count: commands_count,
               duration_ms: duration_ms,
+              start_offset_ms: start_offset_ms,
               db: safe_db(@db)
             }
 
@@ -126,6 +134,7 @@ module DeadBro
       def record_redis_multi(multi)
         return yield unless Thread.current[RedisSubscriber::THREAD_LOCAL_KEY]
 
+        wall_start = Time.now
         start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
         begin
           result = yield
@@ -136,10 +145,13 @@ module DeadBro
 
           begin
             commands_count = multi.commands&.length || 0
+            tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
+            start_offset_ms = tracking_start ? ((wall_start - tracking_start) * 1000.0).round(2) : nil
             event = {
               event: "redis.multi",
               commands_count: commands_count,
               duration_ms: duration_ms,
+              start_offset_ms: start_offset_ms,
               db: safe_db(@db)
             }
 
@@ -201,7 +213,9 @@ module DeadBro
       ActiveSupport::Notifications.subscribe(/\Aredis\..+\z/) do |name, started, finished, _unique_id, data|
         next unless Thread.current[THREAD_LOCAL_KEY]
         duration_ms = ((finished - started) * 1000.0).round(2)
-
+        tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
+        start_offset_ms = tracking_start ? ((started - tracking_start) * 1000.0).round(2) : nil
+        event = build_event(name, data, duration_ms, start_offset_ms)
         if event && should_continue_tracking?
           Thread.current[THREAD_LOCAL_KEY] << event
         end
@@ -239,7 +253,7 @@ module DeadBro
       true
     end
 
-    def self.build_event(name, data, duration_ms)
+    def self.build_event(name, data, duration_ms, start_offset_ms = nil)
       cmd = extract_command(data)
       {
         event: name.to_s,
@@ -247,6 +261,7 @@ module DeadBro
         key: cmd[:key],
         args_count: cmd[:args_count],
         duration_ms: duration_ms,
+        start_offset_ms: start_offset_ms,
         db: safe_db(data[:db])
       }
     rescue
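The Redis hunks above capture two clocks per instrumented block: `Time.now` for positioning the event on the trace and `Process::CLOCK_MONOTONIC` for measuring how long it took. A minimal sketch of that split follows; the `timed` helper is a hypothetical name introduced for illustration and does not appear in the gem.

```ruby
# Run a block and return [result, wall_start, duration_ms].
# - wall_start (Time) can be compared with another Time, such as a
#   request's tracking start, to position the event.
# - duration_ms comes from the monotonic clock, which is unaffected by
#   system clock adjustments during the block.
def timed
  wall_start = Time.now
  mono_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  mono_end = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  [result, wall_start, ((mono_end - mono_start) * 1000.0).round(2)]
end

# Usage: the wall-clock start is subtracted from the tracking start to get
# an offset, while the duration is reported as measured.
tracking_start = Time.now
value, wall_start, duration_ms = timed { [3, 1, 2].sort }
start_offset_ms = ((wall_start - tracking_start) * 1000.0).round(2)
```

Monotonic readings have no defined relationship to wall-clock time, so they cannot be subtracted from a `Time` stored at rack entry; that is presumably why the diff adds `wall_start` instead of reusing `start_time`.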
data/lib/dead_bro/sql_subscriber.rb
CHANGED
@@ -106,6 +106,9 @@ module DeadBro
         duration_ms = ((finished - started) * 1000.0).round(2)
         original_sql = data[:sql]
 
+        tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
+        start_offset_ms = tracking_start ? ((started - tracking_start) * 1000.0).round(2) : nil
+
         threshold = begin
           DeadBro.configuration.slow_query_threshold_ms
         rescue
@@ -133,6 +136,7 @@ module DeadBro
           sql: sanitized_sql,
           name: data[:name],
           duration_ms: duration_ms,
+          start_offset_ms: start_offset_ms,
           cached: data[:cached] || false,
           connection_id: data[:connection_id],
           trace: captured_trace,
@@ -167,6 +171,7 @@ module DeadBro
             total_allocations: allocations || 0,
             cached_count: (data[:cached] ? 1 : 0),
             n_plus_one: false,
+            start_offset_ms: start_offset_ms,
             backtrace: captured_trace,
             explain_plan: nil
           }
data/lib/dead_bro/subscriber.rb
CHANGED
@@ -134,6 +134,7 @@ module DeadBro
           duration_ms: duration_ms,
           rails_env: DeadBro.env,
           host: safe_host,
+          process_kind: DeadBro.process_kind,
           params: safe_params(data),
           user_agent: safe_user_agent(data),
           user_id: extract_user_id(data),
@@ -166,6 +167,7 @@ module DeadBro
           db_runtime_ms: data[:db_runtime],
           host: safe_host,
           rails_env: DeadBro.env,
+          process_kind: DeadBro.process_kind,
           params: safe_params(data),
           user_agent: safe_user_agent(data),
           user_id: extract_user_id(data),
data/lib/dead_bro/version.rb
CHANGED
data/lib/dead_bro/view_rendering_subscriber.rb
CHANGED
@@ -14,24 +14,30 @@ module DeadBro
 
     def self.subscribe!(client: Client.new)
       ActiveSupport::Notifications.subscribe(RENDER_TEMPLATE_EVENT) do |_name, started, finished, _uid, data|
+        tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
         add_view_event(type: "template", identifier: safe_identifier(data[:identifier]),
           duration_ms: ((finished - started) * 1000.0).round(2),
-          rendered_at: Time.now.utc.to_i
+          rendered_at: Time.now.utc.to_i,
+          start_offset_ms: tracking_start ? ((started - tracking_start) * 1000.0).round(2) : nil)
       end
 
       ActiveSupport::Notifications.subscribe(RENDER_PARTIAL_EVENT) do |_name, started, finished, _uid, data|
+        tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
         add_view_event(type: "partial", identifier: safe_identifier(data[:identifier]),
           duration_ms: ((finished - started) * 1000.0).round(2),
           cache_key: data[:cache_key],
-          rendered_at: Time.now.utc.to_i
+          rendered_at: Time.now.utc.to_i,
+          start_offset_ms: tracking_start ? ((started - tracking_start) * 1000.0).round(2) : nil)
       end
 
       ActiveSupport::Notifications.subscribe(RENDER_COLLECTION_EVENT) do |_name, started, finished, _uid, data|
+        tracking_start = Thread.current[DeadBro::TRACKING_START_TIME_KEY]
         add_view_event(type: "collection", identifier: safe_identifier(data[:identifier]),
           duration_ms: ((finished - started) * 1000.0).round(2),
           collection_count: (data[:count] || 0).to_i,
           collection_cached_count: (data[:cached_count] || 0).to_i,
-          rendered_at: Time.now.utc.to_i
+          rendered_at: Time.now.utc.to_i,
+          start_offset_ms: tracking_start ? ((started - tracking_start) * 1000.0).round(2) : nil)
       end
     rescue
     end
@@ -80,7 +86,8 @@ module DeadBro
           rendered_at_max: rendered_at,
           cache_hit_count: (view_info[:cache_key] ? 1 : 0),
           collection_count: view_info[:collection_count].to_i,
-          collection_cached_count: view_info[:collection_cached_count].to_i
+          collection_cached_count: view_info[:collection_cached_count].to_i,
+          start_offset_ms: view_info[:start_offset_ms]
         }
       end
     end
data/lib/dead_bro.rb
CHANGED
@@ -127,6 +127,30 @@ module DeadBro
     "development"
   end
 
+  def self.process_kind
+    @process_kind ||= begin
+      fingerprint = "#{$PROGRAM_NAME} #{process_command_line}".downcase
+      case
+      when fingerprint.match?(/sidekiq|good_job|solid_queue|delayed_job/) then "worker"
+      when fingerprint.match?(/puma|passenger|unicorn|falcon/) then "web"
+      when fingerprint.include?("console") then "console"
+      when fingerprint.include?("rake") then "task"
+      else "app"
+      end
+    end
+  rescue
+    "app"
+  end
+
+  def self.process_command_line
+    # /proc/self/cmdline is Linux-only; on macOS/Windows the fallback to
+    # $PROGRAM_NAME is used, which may miss some process fingerprints (e.g.
+    # a Sidekiq invocation that uses a wrapper script).
+    File.readable?("/proc/self/cmdline") ? File.read("/proc/self/cmdline").tr("\0", " ").strip : $PROGRAM_NAME.to_s
+  rescue
+    $PROGRAM_NAME.to_s
+  end
+
   # Returns the monitor instance
   def self.monitor
     @monitor
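To see how the matching order in `process_kind` resolves concrete command lines, the classification can be pulled out into a pure function. This is a sketch for illustration only: `classify_process` is a hypothetical name, and unlike the gem's method it takes its inputs as arguments instead of reading `$PROGRAM_NAME` and `/proc/self/cmdline`, and it does not memoise.

```ruby
# Classify a process from its program name and full command line, using
# the same patterns and the same precedence as the hunk above.
def classify_process(program_name, command_line = "")
  fingerprint = "#{program_name} #{command_line}".downcase
  if fingerprint.match?(/sidekiq|good_job|solid_queue|delayed_job/)
    "worker"
  elsif fingerprint.match?(/puma|passenger|unicorn|falcon/)
    "web"
  elsif fingerprint.include?("console")
    "console"
  elsif fingerprint.include?("rake")
    "task"
  else
    "app"
  end
end

# Usage with a few representative invocations.
classify_process("sidekiq 7.2 myapp [0 of 5 busy]")  # worker branch
classify_process("puma", "puma 6.4 (tcp://0.0.0.0:3000)") # web branch
classify_process("bin/rails", "bin/rails console")   # console branch
classify_process("rake", "rake db:migrate")          # task branch
classify_process("ruby", "ruby script.rb")           # fallback branch
```

Because the worker patterns are checked first, a command line that mentions both a job runner and a web server is reported as `"worker"`. The checks are substring matches, so any path or argument containing `rake` or `console` also triggers those branches.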
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: dead_bro
 version: !ruby/object:Gem::Version
-  version: 0.2.
+  version: 0.2.21
 platform: ruby
 authors:
 - Emanuel Comsa
@@ -18,7 +18,6 @@ extensions: []
 extra_rdoc_files: []
 files:
 - CHANGELOG.md
-- FEATURES.md
 - README.md
 - lib/dead_bro.rb
 - lib/dead_bro/ar_object_tracker.rb
@@ -75,7 +74,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
   - !ruby/object:Gem::Version
     version: '0'
 requirements: []
-rubygems_version: 4.0.
+rubygems_version: 4.0.10
 specification_version: 4
 summary: Minimal APM for Rails apps.
 test_files: []
data/FEATURES.md
DELETED
@@ -1,333 +0,0 @@
-# ApmBro Feature List
-
-A comprehensive feature list for comparing ApmBro with other APM (Application Performance Monitoring) tools.
-
-## Core Architecture
-
-- **Rails Integration**: Automatic subscription to Rails events via ActiveSupport::Notifications
-- **Zero-Configuration Setup**: Works out of the box with minimal configuration
-- **Asynchronous Metrics Posting**: Non-blocking HTTP requests using background threads
-- **Thread-Local Storage**: Per-request metric collection using thread-local variables
-- **Circuit Breaker Pattern**: Built-in circuit breaker to prevent cascading failures when APM endpoint is down
-- **Deploy Tracking**: Automatic deploy ID resolution from multiple sources (Rails settings, ENV vars, Heroku, Git)
-
-## Request Tracking
-
-### Controller Action Monitoring
-- **Automatic Tracking**: Tracks all controller actions automatically
-- **Request Duration**: Measures total request processing time
-- **HTTP Method & Path**: Captures HTTP method and request path
-- **Status Codes**: Tracks HTTP response status codes
-- **View Runtime**: Separate tracking of view rendering time
-- **Database Runtime**: Separate tracking of database query time
-- **Request Parameters**: Captures request parameters (with sensitive data filtering)
-- **User Agent**: Tracks user agent strings
-- **User ID Extraction**: Extracts authenticated user ID (supports Warden)
-- **Environment Context**: Tracks Rails environment (development, staging, production)
-
-### Request Sampling
-- **Configurable Sample Rate**: Percentage-based sampling (1-100%)
-- **Random Sampling**: Each request has random chance of being tracked
-- **Consistent Per-Request**: Sampling decision applies to all metrics for a request
-- **Error Override**: Errors are always tracked regardless of sampling
-- **Cost Optimization**: Reduces data volume and costs for high-traffic applications
-
-### Exclusion Rules
-- **Controller Exclusion**: Exclude entire controllers from tracking
-- **Action Exclusion**: Exclude specific controller#action combinations
-- **Wildcard Support**: Pattern matching with `*` wildcards (e.g., `Admin::*`, `Admin::*#*`)
|
|
39
|
-
- **Job Exclusion**: Exclude specific background jobs from tracking
|
|
40
|
-
- **Flexible Configuration**: Configure via initializer, Rails settings, or environment variables
|
|
41
|
-
|
|
42
|
-
## SQL Query Tracking
|
|
43
|
-
|
|
44
|
-
### Query Details
|
|
45
|
-
- **Full SQL Tracking**: Captures all SQL queries executed during requests and jobs
|
|
46
|
-
- **Query Sanitization**: Automatically sanitizes SQL to remove sensitive data
|
|
47
|
-
- **Query Name**: Tracks query names (e.g., "User Load", "User Update")
|
|
48
|
-
- **Duration Measurement**: Precise query execution time in milliseconds
|
|
49
|
-
- **Cache Detection**: Identifies cached queries
|
|
50
|
-
- **Connection ID**: Tracks database connection ID
|
|
51
|
-
- **Call Stack Traces**: Full backtrace showing where queries were executed
|
|
52
|
-
- **Object Allocations**: Optional tracking of object allocations per query
|
|
53
|
-
|
|
54
|
-
### Query Performance Analysis
|
|
55
|
-
- **Slow Query Detection**: Configurable threshold for identifying slow queries
|
|
56
|
-
- **EXPLAIN ANALYZE**: Automatic execution plan capture for slow queries
|
|
57
|
-
- **Background Execution**: EXPLAIN ANALYZE runs in separate thread (non-blocking)
|
|
58
|
-
- **Multi-Database Support**: Works with PostgreSQL, MySQL, SQLite, and others
|
|
59
|
-
- **Smart Filtering**: Automatically skips transaction queries (BEGIN, COMMIT, ROLLBACK)
|
|
60
|
-
- **Execution Plan Details**:
|
|
61
|
-
- PostgreSQL: Full EXPLAIN ANALYZE with buffer usage statistics
|
|
62
|
-
- MySQL: EXPLAIN ANALYZE with actual execution times
|
|
63
|
-
- SQLite: EXPLAIN QUERY PLAN output
|
|
64
|
-
- **Query Optimization Insights**: Helps identify missing indexes, full table scans, JOIN issues
|
|
65
|
-
|
|
66
|
-
## View Rendering Tracking
|
|
67
|
-
|
|
68
|
-
### View Performance
|
|
69
|
-
- **Template Rendering**: Tracks main template rendering
|
|
70
|
-
- **Partial Rendering**: Tracks partial template rendering with cache key information
|
|
71
|
-
- **Collection Rendering**: Tracks collection rendering (partials in loops)
|
|
72
|
-
- **Rendering Duration**: Precise timing for each view component
|
|
73
|
-
- **Virtual Path Tracking**: Tracks view virtual paths
|
|
74
|
-
- **Layout Information**: Captures layout usage
|
|
75
|
-
|
|
76
|
-
### View Analysis
|
|
77
|
-
- **Slow View Detection**: Identifies the slowest rendering views
|
|
78
|
-
- **Frequency Analysis**: Tracks most frequently rendered views
|
|
79
|
-
- **Cache Hit Rate**: Calculates cache hit rates for partials
|
|
80
|
-
- **Collection Cache Analysis**: Tracks cache hit rates for collection rendering
|
|
81
|
-
- **Performance Metrics**:
|
|
82
|
-
- Total views rendered per request
|
|
83
|
-
- Total view rendering duration
|
|
84
|
-
- Average view rendering duration
|
|
85
|
-
- Breakdown by view type (template, partial, collection)
|
|
86
|
-
|
|
87
|
-
## Memory Tracking & Leak Detection
|
|
88
|
-
|
|
89
|
-
### Lightweight Memory Tracking (Default)
|
|
90
|
-
- **Memory Usage Monitoring**: Tracks memory consumption per request using GC stats
|
|
91
|
-
- **Memory Growth Tracking**: Measures memory growth during request processing
|
|
92
|
-
- **GC Statistics**: Tracks garbage collection count and heap pages
|
|
93
|
-
- **Minimal Performance Impact**: ~0.1ms overhead per request
|
|
94
|
-
- **Memory Before/After**: Captures memory state at request start and end
|
|
95
|
-
|
|
96
|
-
### Detailed Allocation Tracking (Optional)
|
|
97
|
-
- **Object Allocation Tracking**: Detailed tracking of object allocations (disabled by default)
|
|
98
|
-
- **Allocation Sampling**: Configurable sampling rate for allocations
|
|
99
|
-
- **Large Object Detection**: Identifies objects larger than 1MB threshold
|
|
100
|
-
- **Memory Snapshots**: Periodic memory snapshots during request processing
|
|
101
|
-
- **Object Count Tracking**: Tracks object counts before and after requests
|
|
102
|
-
- **Performance Impact**: ~2-5ms overhead per request (only when enabled)
|
|
103
|
-
|
|
104
|
-
### Memory Leak Detection
|
|
105
|
-
- **Pattern Detection**: Detects growing memory patterns over time
|
|
106
|
-
- **GC Efficiency Analysis**: Monitors garbage collection effectiveness
|
|
107
|
-
- **Heap Page Tracking**: Tracks heap page growth
|
|
108
|
-
- **Request Correlation**: Correlates memory growth with specific controllers/actions
|
|
109
|
-
|
|
110
|
-
## Background Job Tracking
|
|
111
|
-
|
|
112
|
-
### Job Execution Monitoring
|
|
113
|
-
- **ActiveJob Integration**: Automatic tracking when ActiveJob is available
|
|
114
|
-
- **Job Class Tracking**: Tracks job class names
|
|
115
|
-
- **Job ID**: Captures unique job identifiers
|
|
116
|
-
- **Queue Name**: Tracks which queue processed the job
|
|
117
|
-
- **Job Arguments**: Captures job arguments (with sensitive data filtering)
|
|
118
|
-
- **Duration Measurement**: Precise job execution time in milliseconds
|
|
119
|
-
- **Status Tracking**: Tracks job status (completed or failed)
|
|
120
|
-
|
|
121
|
-
### Job Error Tracking
|
|
122
|
-
- **Exception Capture**: Captures exceptions from failed jobs
|
|
123
|
-
- **Exception Class**: Tracks exception class names
|
|
124
|
-
- **Exception Messages**: Captures exception messages (truncated to 1000 chars)
|
|
125
|
-
- **Backtraces**: Full exception backtraces (first 50 lines)
|
|
126
|
-
- **SQL Query Context**: Includes SQL queries executed during failed jobs
|
|
127
|
-
- **Memory Context**: Includes memory usage during job execution
|
|
128
|
-
|
|
129
|
-
### Job SQL Tracking
|
|
130
|
-
- **SQL Query Tracking**: Tracks all SQL queries executed during job processing
|
|
131
|
-
- **Query Details**: Same detailed SQL tracking as request tracking
|
|
132
|
-
- **Query Context**: Full context of database operations in background jobs
|
|
133
|
-
|
|
134
|
-
## Cache Tracking
|
|
135
|
-
|
|
136
|
-
### Cache Operations
|
|
137
|
-
- **Read Operations**: Tracks cache read operations
|
|
138
|
-
- **Write Operations**: Tracks cache write operations
|
|
139
|
-
- **Delete Operations**: Tracks cache delete operations
|
|
140
|
-
- **Existence Checks**: Tracks cache existence checks
|
|
141
|
-
- **Fetch Operations**: Tracks cache fetch with hit/miss detection
|
|
142
|
-
- **Multi-Read Operations**: Tracks cache read_multi operations
|
|
143
|
-
- **Multi-Write Operations**: Tracks cache write_multi operations
|
|
144
|
-
- **Cache Generation**: Tracks cache generation events
|
|
145
|
-
|
|
146
|
-
### Cache Analysis
|
|
147
|
-
- **Cache Hit Detection**: Identifies cache hits vs misses
|
|
148
|
-
- **Cache Key Tracking**: Tracks cache keys (truncated to 200 chars)
|
|
149
|
-
- **Store Information**: Identifies which cache store was used
|
|
150
|
-
- **Namespace Tracking**: Tracks cache namespaces
|
|
151
|
-
- **Duration Measurement**: Precise timing for each cache operation
|
|
152
|
-
- **Hit Rate Calculation**: Calculates cache hit rates per request
|
|
153
|
-
|
|
154
|
-
## Redis Tracking
|
|
155
|
-
|
|
156
|
-
### Redis Command Tracking
|
|
157
|
-
- **Command Monitoring**: Tracks all Redis commands executed
|
|
158
|
-
- **Command Name**: Captures Redis command names (GET, SET, etc.)
|
|
159
|
-
- **Key Tracking**: Tracks Redis keys (truncated to 200 chars)
|
|
160
|
-
- **Argument Count**: Tracks number of arguments per command
|
|
161
|
-
- **Database Selection**: Tracks which Redis database is used
|
|
162
|
-
- **Duration Measurement**: Precise timing for each Redis command
|
|
163
|
-
- **Error Tracking**: Captures Redis command errors
|
|
164
|
-
|
|
165
|
-
### Advanced Redis Features
|
|
166
|
-
- **Pipeline Support**: Tracks Redis pipeline operations with command counts
|
|
167
|
-
- **Multi/Transaction Support**: Tracks Redis MULTI/EXEC transactions
|
|
168
|
-
- **ActiveSupport Integration**: Subscribes to ActiveSupport::Notifications for Redis events
|
|
169
|
-
- **Client Instrumentation**: Direct instrumentation of Redis::Client for comprehensive coverage
|
|
170
|
-
|
|
171
|
-
## Error Tracking
|
|
172
|
-
|
|
173
|
-
### Exception Handling
|
|
174
|
-
- **Automatic Exception Capture**: Captures exceptions from controller actions
|
|
175
|
-
- **Exception Class**: Tracks exception class names
|
|
176
|
-
- **Exception Messages**: Captures exception messages (truncated to 1000 chars)
|
|
177
|
-
- **Full Backtraces**: Captures complete exception backtraces (first 50 lines)
|
|
178
|
-
- **Request Context**: Includes full request context with exceptions
|
|
179
|
-
- **Error Flagging**: Errors are marked and always tracked (even with sampling)
|
|
180
|
-
|
|
181
|
-
### Error Context
|
|
182
|
-
- **Controller/Action**: Identifies where the error occurred
|
|
183
|
-
- **Request Parameters**: Includes request parameters at time of error
|
|
184
|
-
- **User Information**: Includes user ID if available
|
|
185
|
-
- **SQL Queries**: Includes SQL queries executed before error
|
|
186
|
-
- **Memory State**: Includes memory usage at time of error
|
|
187
|
-
- **Log Messages**: Includes application logs captured during request
|
|
188
|
-
|
|
189
|
-
## HTTP Instrumentation
|
|
190
|
-
|
|
191
|
-
### Outgoing HTTP Tracking
|
|
192
|
-
- **HTTP Request Tracking**: Tracks outgoing HTTP requests (via middleware)
|
|
193
|
-
- **Request Context**: Captures HTTP request details
|
|
194
|
-
- **Response Context**: Captures HTTP response details
|
|
195
|
-
- **Duration Measurement**: Tracks HTTP request duration
|
|
196
|
-
|
|
197
|
-
## Configuration & Flexibility
|
|
198
|
-
|
|
199
|
-
### Configuration Options
|
|
200
|
-
- **API Key Management**: Multiple sources (config, Rails credentials, ENV)
|
|
201
|
-
- **Endpoint Configuration**: Configurable endpoint URL
|
|
202
|
-
- **Timeout Settings**: Configurable open and read timeouts
|
|
203
|
-
- **Enable/Disable Toggle**: Can be enabled/disabled via configuration
|
|
204
|
-
- **Environment Detection**: Automatic Rails environment detection
|
|
205
|
-
|
|
206
|
-
### Circuit Breaker Configuration
|
|
207
|
-
- **Failure Threshold**: Configurable failure threshold (default: 3)
|
|
208
|
-
- **Recovery Timeout**: Configurable recovery timeout (default: 60 seconds)
|
|
209
|
-
- **Retry Timeout**: Configurable retry timeout (default: 300 seconds)
|
|
210
|
-
- **Enable/Disable**: Can enable/disable circuit breaker
|
|
211
|
-
|
|
212
|
-
### Memory Tracking Configuration
|
|
213
|
-
- **Memory Tracking Toggle**: Enable/disable memory tracking
|
|
214
|
-
- **Allocation Tracking Toggle**: Enable/disable detailed allocation tracking
|
|
215
|
-
- **Sampling Configuration**: Configurable request sampling rate
|
|
216
|
-
|
|
217
|
-
### Query Analysis Configuration
|
|
218
|
-
- **Slow Query Threshold**: Configurable threshold in milliseconds (default: 500ms)
|
|
219
|
-
- **EXPLAIN ANALYZE Toggle**: Enable/disable automatic EXPLAIN ANALYZE
|
|
220
|
-
|
|
221
|
-
## Data Safety & Privacy
|
|
222
|
-
|
|
223
|
-
### Data Sanitization
|
|
224
|
-
- **SQL Sanitization**: Automatically sanitizes SQL queries
|
|
225
|
-
- **Parameter Filtering**: Filters sensitive parameters (password, token, secret, key)
|
|
226
|
-
- **Argument Truncation**: Limits and truncates job arguments
|
|
227
|
-
- **Key Truncation**: Truncates cache and Redis keys to 200 characters
|
|
228
|
-
- **Value Truncation**: Recursively truncates nested values to prevent huge payloads
|
|
229
|
-
- **String Limits**: Limits string values (e.g., user agent to 200 chars, messages to 1000 chars)
|
|
230
|
-
|
|
231
|
-
### Data Limits
|
|
232
|
-
- **Array Limits**: Limits array sizes (e.g., first 10 job arguments, first 5 array elements)
|
|
233
|
-
- **Hash Limits**: Limits hash key counts (e.g., first 20 hash keys, first 30 params)
|
|
234
|
-
- **Backtrace Limits**: Limits backtraces to first 50 lines
|
|
235
|
-
- **Allocation Limits**: Limits allocations tracked per request (max 1000)
|
|
236
|
-
|
|
237
|
-
## Performance & Reliability
|
|
238
|
-
|
|
239
|
-
### Performance Optimizations
|
|
240
|
-
- **Asynchronous Posting**: Non-blocking HTTP requests
|
|
241
|
-
- **Lightweight Default Mode**: Minimal overhead in default configuration
|
|
242
|
-
- **Sampling Support**: Reduces data volume for high-traffic applications
|
|
243
|
-
- **Thread-Local Storage**: Efficient per-request data collection
|
|
244
|
-
- **Background EXPLAIN**: EXPLAIN ANALYZE runs in background thread
|
|
245
|
-
|
|
246
|
-
### Reliability Features
|
|
247
|
-
- **Circuit Breaker**: Prevents cascading failures
|
|
248
|
-
- **Error Handling**: Comprehensive error handling to prevent instrumentation failures
|
|
249
|
-
- **Graceful Degradation**: Continues working even if some features fail
|
|
250
|
-
- **Timeout Protection**: Configurable timeouts prevent hanging requests
|
|
251
|
-
|
|
252
|
-
## Integration & Compatibility
|
|
253
|
-
|
|
254
|
-
### Framework Support
|
|
255
|
-
- **Rails Integration**: Full Rails integration via Railtie
|
|
256
|
-
- **ActiveSupport Notifications**: Uses ActiveSupport::Notifications for event subscription
|
|
257
|
-
- **ActiveRecord Integration**: Tracks ActiveRecord SQL queries
|
|
258
|
-
- **ActiveJob Integration**: Tracks ActiveJob background jobs
|
|
259
|
-
- **ActionView Integration**: Tracks ActionView rendering
|
|
260
|
-
|
|
261
|
-
### Database Support
|
|
262
|
-
- **PostgreSQL**: Full support with EXPLAIN ANALYZE
|
|
263
|
-
- **MySQL**: Full support with EXPLAIN ANALYZE
|
|
264
|
-
- **SQLite**: Full support with EXPLAIN QUERY PLAN
|
|
265
|
-
- **Other Databases**: Basic support with standard EXPLAIN
|
|
266
|
-
|
|
267
|
-
### Cache Store Support
|
|
268
|
-
- **All Cache Stores**: Works with any Rails cache store
|
|
269
|
-
- **Multi-Store Support**: Tracks cache operations across different stores
|
|
270
|
-
|
|
271
|
-
### Redis Support
|
|
272
|
-
- **Redis Gem**: Works with redis gem
|
|
273
|
-
- **Client Instrumentation**: Direct instrumentation of Redis::Client
|
|
274
|
-
- **Pipeline Support**: Tracks Redis pipelines
|
|
275
|
-
- **Transaction Support**: Tracks Redis MULTI/EXEC transactions
|
|
276
|
-
|
|
277
|
-
## Logging & Debugging
|
|
278
|
-
|
|
279
|
-
### Application Logging
|
|
280
|
-
- **Log Capture**: Captures application logs during request processing
|
|
281
|
-
- **Log Context**: Includes logs in metric payloads
|
|
282
|
-
- **Debug Logging**: Optional debug logging for skipped requests
|
|
283
|
-
|
|
284
|
-
## Deployment & Environment
|
|
285
|
-
|
|
286
|
-
### Deploy Tracking
|
|
287
|
-
- **Deploy ID Resolution**: Multiple sources for deploy identification (`Configuration#deploy_id=` wins when set, then ENV in `Configuration::DEPLOY_REVISION_ENV_KEYS` order—including `DEAD_BRO_DEPLOY_ID`, git/CI vars, `DD_VERSION`, etc.), otherwise a **per-process UUID** (fine for single dyno/process; unusable alone for fleets like ECS replicas)
|
|
288
|
-
- **Revision Tracking**: Includes deploy/revision ID in all metric payloads
|
|
289
|
-
|
|
290
|
-
### Environment Support
|
|
291
|
-
- **Rails Environment**: Automatic Rails environment detection
|
|
292
|
-
- **Rack Environment**: Fallback to RACK_ENV or RAILS_ENV
|
|
293
|
-
- **Environment Context**: Includes environment in all metric payloads
|
|
294
|
-
|
|
295
|
-
## Data Collection & Transmission
|
|
296
|
-
|
|
297
|
-
### Metric Payload Structure
|
|
298
|
-
- **Structured Data**: Well-structured JSON payloads
|
|
299
|
-
- **Event Names**: Descriptive event names for different metric types
|
|
300
|
-
- **Timestamp Tracking**: ISO8601 timestamps for all metrics
|
|
301
|
-
- **Metadata**: Rich metadata including environment, host, deploy ID
|
|
302
|
-
|
|
303
|
-
### HTTP Client
|
|
304
|
-
- **HTTPS Support**: Secure HTTPS communication
|
|
305
|
-
- **Bearer Token Auth**: API key authentication via Bearer tokens
|
|
306
|
-
- **JSON Encoding**: JSON-encoded payloads
|
|
307
|
-
- **Custom Headers**: Proper Content-Type and Authorization headers
|
|
308
|
-
|
|
309
|
-
## Comparison-Ready Features
|
|
310
|
-
|
|
311
|
-
### Unique Differentiators
|
|
312
|
-
1. **Automatic EXPLAIN ANALYZE**: Background execution plan capture for slow queries
|
|
313
|
-
2. **Lightweight Memory Tracking**: Low-overhead memory monitoring by default
|
|
314
|
-
3. **Comprehensive Cache Tracking**: Detailed cache operation tracking
|
|
315
|
-
4. **Redis Instrumentation**: Full Redis command tracking including pipelines
|
|
316
|
-
5. **View Rendering Analysis**: Detailed view performance analysis with cache hit rates
|
|
317
|
-
6. **Flexible Exclusion Rules**: Wildcard support for controller/job exclusion
|
|
318
|
-
7. **Request Sampling**: Configurable percentage-based sampling
|
|
319
|
-
8. **Circuit Breaker**: Built-in resilience for APM endpoint failures
|
|
320
|
-
9. **Multi-Source Configuration**: Flexible configuration from multiple sources
|
|
321
|
-
10. **Deploy Tracking**: Automatic deploy ID resolution from multiple sources
|
|
322
|
-
|
|
323
|
-
### Standard APM Features
|
|
324
|
-
- Request/response tracking
|
|
325
|
-
- SQL query tracking
|
|
326
|
-
- Error tracking
|
|
327
|
-
- Background job tracking
|
|
328
|
-
- Memory tracking
|
|
329
|
-
- Performance metrics
|
|
330
|
-
- Exception handling
|
|
331
|
-
- User context
|
|
332
|
-
- Environment tracking
|
|
333
|
-
|
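Note on the removed "Exclusion Rules" section: FEATURES.md listed `*` wildcard patterns such as `Admin::*` and `Admin::*#*` without showing how a pattern is matched against a `Controller#action` string. One plausible reading is sketched below, assuming `*` matches any run of characters and the whole string must match. The helpers `exclusion_regex` and `excluded?` are hypothetical illustrations, not dead_bro's implementation.

```ruby
# Hypothetical helper, not dead_bro's code: compiles an exclusion pattern
# such as "Admin::*#*" into an anchored Regexp where "*" matches anything.
def exclusion_regex(pattern)
  # Escape every regex metacharacter, then turn the escaped "*" back into ".*".
  escaped = Regexp.escape(pattern).gsub("\\*", ".*")
  Regexp.new("\\A#{escaped}\\z")
end

# True when the target ("Controller#action" or a job class name) matches
# at least one of the configured patterns.
def excluded?(target, patterns)
  patterns.any? { |pattern| exclusion_regex(pattern).match?(target) }
end

patterns = ["Admin::*#*", "HealthController#show"]
puts excluded?("Admin::UsersController#index", patterns)
puts excluded?("UsersController#index", patterns)
```

Anchoring with `\A` and `\z` keeps a pattern like `HealthController#show` from also matching `Api::HealthController#show`; whether the gem anchors its patterns this way is not stated in the removed file.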