e11y 0.1.0
- checksums.yaml +7 -0
- data/.rspec +4 -0
- data/.rubocop.yml +69 -0
- data/CHANGELOG.md +26 -0
- data/CODE_OF_CONDUCT.md +64 -0
- data/LICENSE.txt +21 -0
- data/README.md +179 -0
- data/Rakefile +37 -0
- data/benchmarks/run_all.rb +33 -0
- data/config/README.md +83 -0
- data/config/loki-local-config.yaml +35 -0
- data/config/prometheus.yml +15 -0
- data/docker-compose.yml +78 -0
- data/docs/00-ICP-AND-TIMELINE.md +483 -0
- data/docs/01-SCALE-REQUIREMENTS.md +858 -0
- data/docs/ADR-001-architecture.md +2617 -0
- data/docs/ADR-002-metrics-yabeda.md +1395 -0
- data/docs/ADR-003-slo-observability.md +3337 -0
- data/docs/ADR-004-adapter-architecture.md +2385 -0
- data/docs/ADR-005-tracing-context.md +1372 -0
- data/docs/ADR-006-security-compliance.md +4143 -0
- data/docs/ADR-007-opentelemetry-integration.md +1385 -0
- data/docs/ADR-008-rails-integration.md +1911 -0
- data/docs/ADR-009-cost-optimization.md +2993 -0
- data/docs/ADR-010-developer-experience.md +2166 -0
- data/docs/ADR-011-testing-strategy.md +1836 -0
- data/docs/ADR-012-event-evolution.md +958 -0
- data/docs/ADR-013-reliability-error-handling.md +2750 -0
- data/docs/ADR-014-event-driven-slo.md +1533 -0
- data/docs/ADR-015-middleware-order.md +1061 -0
- data/docs/ADR-016-self-monitoring-slo.md +1234 -0
- data/docs/API-REFERENCE-L28.md +914 -0
- data/docs/COMPREHENSIVE-CONFIGURATION.md +2366 -0
- data/docs/IMPLEMENTATION_NOTES.md +2804 -0
- data/docs/IMPLEMENTATION_PLAN.md +1971 -0
- data/docs/IMPLEMENTATION_PLAN_ARCHITECTURE.md +586 -0
- data/docs/PLAN.md +148 -0
- data/docs/QUICK-START.md +934 -0
- data/docs/README.md +296 -0
- data/docs/design/00-memory-optimization.md +593 -0
- data/docs/guides/MIGRATION-L27-L28.md +692 -0
- data/docs/guides/PERFORMANCE-BENCHMARKS.md +434 -0
- data/docs/guides/README.md +44 -0
- data/docs/prd/01-overview-vision.md +440 -0
- data/docs/use_cases/README.md +119 -0
- data/docs/use_cases/UC-001-request-scoped-debug-buffering.md +813 -0
- data/docs/use_cases/UC-002-business-event-tracking.md +1953 -0
- data/docs/use_cases/UC-003-pattern-based-metrics.md +1627 -0
- data/docs/use_cases/UC-004-zero-config-slo-tracking.md +728 -0
- data/docs/use_cases/UC-005-sentry-integration.md +759 -0
- data/docs/use_cases/UC-006-trace-context-management.md +905 -0
- data/docs/use_cases/UC-007-pii-filtering.md +2648 -0
- data/docs/use_cases/UC-008-opentelemetry-integration.md +1153 -0
- data/docs/use_cases/UC-009-multi-service-tracing.md +1043 -0
- data/docs/use_cases/UC-010-background-job-tracking.md +1018 -0
- data/docs/use_cases/UC-011-rate-limiting.md +1906 -0
- data/docs/use_cases/UC-012-audit-trail.md +2301 -0
- data/docs/use_cases/UC-013-high-cardinality-protection.md +2127 -0
- data/docs/use_cases/UC-014-adaptive-sampling.md +1940 -0
- data/docs/use_cases/UC-015-cost-optimization.md +735 -0
- data/docs/use_cases/UC-016-rails-logger-migration.md +785 -0
- data/docs/use_cases/UC-017-local-development.md +867 -0
- data/docs/use_cases/UC-018-testing-events.md +1081 -0
- data/docs/use_cases/UC-019-tiered-storage-migration.md +562 -0
- data/docs/use_cases/UC-020-event-versioning.md +708 -0
- data/docs/use_cases/UC-021-error-handling-retry-dlq.md +956 -0
- data/docs/use_cases/UC-022-event-registry.md +648 -0
- data/docs/use_cases/backlog.md +226 -0
- data/e11y.gemspec +76 -0
- data/lib/e11y/adapters/adaptive_batcher.rb +207 -0
- data/lib/e11y/adapters/audit_encrypted.rb +239 -0
- data/lib/e11y/adapters/base.rb +580 -0
- data/lib/e11y/adapters/file.rb +224 -0
- data/lib/e11y/adapters/in_memory.rb +216 -0
- data/lib/e11y/adapters/loki.rb +333 -0
- data/lib/e11y/adapters/otel_logs.rb +203 -0
- data/lib/e11y/adapters/registry.rb +141 -0
- data/lib/e11y/adapters/sentry.rb +230 -0
- data/lib/e11y/adapters/stdout.rb +108 -0
- data/lib/e11y/adapters/yabeda.rb +370 -0
- data/lib/e11y/buffers/adaptive_buffer.rb +339 -0
- data/lib/e11y/buffers/base_buffer.rb +40 -0
- data/lib/e11y/buffers/request_scoped_buffer.rb +246 -0
- data/lib/e11y/buffers/ring_buffer.rb +267 -0
- data/lib/e11y/buffers.rb +14 -0
- data/lib/e11y/console.rb +122 -0
- data/lib/e11y/current.rb +48 -0
- data/lib/e11y/event/base.rb +894 -0
- data/lib/e11y/event/value_sampling_config.rb +84 -0
- data/lib/e11y/events/base_audit_event.rb +43 -0
- data/lib/e11y/events/base_payment_event.rb +33 -0
- data/lib/e11y/events/rails/cache/delete.rb +21 -0
- data/lib/e11y/events/rails/cache/read.rb +23 -0
- data/lib/e11y/events/rails/cache/write.rb +22 -0
- data/lib/e11y/events/rails/database/query.rb +45 -0
- data/lib/e11y/events/rails/http/redirect.rb +21 -0
- data/lib/e11y/events/rails/http/request.rb +26 -0
- data/lib/e11y/events/rails/http/send_file.rb +21 -0
- data/lib/e11y/events/rails/http/start_processing.rb +26 -0
- data/lib/e11y/events/rails/job/completed.rb +22 -0
- data/lib/e11y/events/rails/job/enqueued.rb +22 -0
- data/lib/e11y/events/rails/job/failed.rb +22 -0
- data/lib/e11y/events/rails/job/scheduled.rb +23 -0
- data/lib/e11y/events/rails/job/started.rb +22 -0
- data/lib/e11y/events/rails/log.rb +56 -0
- data/lib/e11y/events/rails/view/render.rb +23 -0
- data/lib/e11y/events.rb +18 -0
- data/lib/e11y/instruments/active_job.rb +201 -0
- data/lib/e11y/instruments/rails_instrumentation.rb +141 -0
- data/lib/e11y/instruments/sidekiq.rb +175 -0
- data/lib/e11y/logger/bridge.rb +205 -0
- data/lib/e11y/metrics/cardinality_protection.rb +172 -0
- data/lib/e11y/metrics/cardinality_tracker.rb +134 -0
- data/lib/e11y/metrics/registry.rb +234 -0
- data/lib/e11y/metrics/relabeling.rb +226 -0
- data/lib/e11y/metrics.rb +102 -0
- data/lib/e11y/middleware/audit_signing.rb +174 -0
- data/lib/e11y/middleware/base.rb +140 -0
- data/lib/e11y/middleware/event_slo.rb +167 -0
- data/lib/e11y/middleware/pii_filter.rb +266 -0
- data/lib/e11y/middleware/pii_filtering.rb +280 -0
- data/lib/e11y/middleware/rate_limiting.rb +214 -0
- data/lib/e11y/middleware/request.rb +163 -0
- data/lib/e11y/middleware/routing.rb +157 -0
- data/lib/e11y/middleware/sampling.rb +254 -0
- data/lib/e11y/middleware/slo.rb +168 -0
- data/lib/e11y/middleware/trace_context.rb +131 -0
- data/lib/e11y/middleware/validation.rb +118 -0
- data/lib/e11y/middleware/versioning.rb +132 -0
- data/lib/e11y/middleware.rb +12 -0
- data/lib/e11y/pii/patterns.rb +90 -0
- data/lib/e11y/pii.rb +13 -0
- data/lib/e11y/pipeline/builder.rb +155 -0
- data/lib/e11y/pipeline/zone_validator.rb +110 -0
- data/lib/e11y/pipeline.rb +12 -0
- data/lib/e11y/presets/audit_event.rb +65 -0
- data/lib/e11y/presets/debug_event.rb +34 -0
- data/lib/e11y/presets/high_value_event.rb +51 -0
- data/lib/e11y/presets.rb +19 -0
- data/lib/e11y/railtie.rb +138 -0
- data/lib/e11y/reliability/circuit_breaker.rb +216 -0
- data/lib/e11y/reliability/dlq/file_storage.rb +277 -0
- data/lib/e11y/reliability/dlq/filter.rb +117 -0
- data/lib/e11y/reliability/retry_handler.rb +207 -0
- data/lib/e11y/reliability/retry_rate_limiter.rb +117 -0
- data/lib/e11y/sampling/error_spike_detector.rb +225 -0
- data/lib/e11y/sampling/load_monitor.rb +161 -0
- data/lib/e11y/sampling/stratified_tracker.rb +92 -0
- data/lib/e11y/sampling/value_extractor.rb +82 -0
- data/lib/e11y/self_monitoring/buffer_monitor.rb +79 -0
- data/lib/e11y/self_monitoring/performance_monitor.rb +97 -0
- data/lib/e11y/self_monitoring/reliability_monitor.rb +146 -0
- data/lib/e11y/slo/event_driven.rb +150 -0
- data/lib/e11y/slo/tracker.rb +119 -0
- data/lib/e11y/version.rb +9 -0
- data/lib/e11y.rb +283 -0
- metadata +452 -0
|
@@ -0,0 +1,905 @@
|
|
|
1
|
+
# UC-006: Trace Context Management
|
|
2
|
+
|
|
3
|
+
**Status:** MVP Feature
|
|
4
|
+
**Complexity:** Intermediate
|
|
5
|
+
**Setup Time:** 15 minutes
|
|
6
|
+
**Target Users:** Backend Developers, SRE, DevOps
|
|
7
|
+
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
## 📋 Overview
|
|
11
|
+
|
|
12
|
+
### Problem Statement
|
|
13
|
+
|
|
14
|
+
**Current Approach (Disconnected Logs):**
|
|
15
|
+
```ruby
|
|
16
|
+
# Request 1:
|
|
17
|
+
Rails.logger.info "Order created: 123"
|
|
18
|
+
Rails.logger.info "Payment processed: $99"
|
|
19
|
+
|
|
20
|
+
# Request 2:
|
|
21
|
+
Rails.logger.info "Order created: 456"
|
|
22
|
+
|
|
23
|
+
# Request 1:
|
|
24
|
+
Rails.logger.info "Email sent for order 123"
|
|
25
|
+
|
|
26
|
+
# Problem: Can't tell which logs belong to the same request!
|
|
27
|
+
# All logs are mixed together chronologically
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
### E11y Solution
|
|
31
|
+
|
|
32
|
+
**Automatic trace correlation:**
|
|
33
|
+
```ruby
|
|
34
|
+
# Request 1 (trace_id: abc-123)
|
|
35
|
+
Events::OrderCreated.track(order_id: '123')
|
|
36
|
+
# → { trace_id: 'abc-123', event: 'order.created' }
|
|
37
|
+
|
|
38
|
+
Events::PaymentProcessed.track(order_id: '123', amount: 99)
|
|
39
|
+
# → { trace_id: 'abc-123', event: 'payment.processed' }
|
|
40
|
+
|
|
41
|
+
# Request 2 (trace_id: def-456)
|
|
42
|
+
Events::OrderCreated.track(order_id: '456')
|
|
43
|
+
# → { trace_id: 'def-456', event: 'order.created' }
|
|
44
|
+
|
|
45
|
+
# Request 1 background job (trace_id: abc-123 - PRESERVED!)
|
|
46
|
+
Events::EmailSent.track(order_id: '123')
|
|
47
|
+
# → { trace_id: 'abc-123', event: 'email.sent' }
|
|
48
|
+
|
|
49
|
+
# In Grafana/Loki:
|
|
50
|
+
# {trace_id="abc-123"} → Shows COMPLETE request timeline across services
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
---
|
|
54
|
+
|
|
55
|
+
## 🎯 Features
|
|
56
|
+
|
|
57
|
+
> **Implementation:** See [ADR-005: Tracing Context](../ADR-005-tracing-context.md) for complete architecture, including [Section 3: Current (Thread-Local Storage)](../ADR-005-tracing-context.md#3-current-thread-local-storage), [Section 4: Trace ID Generation](../ADR-005-tracing-context.md#4-trace-id-generation-idgenerator), and [Section 5: W3C Trace Context](../ADR-005-tracing-context.md#5-w3c-trace-context).
|
|
58
|
+
|
|
59
|
+
### 1. Automatic Trace ID Propagation
|
|
60
|
+
|
|
61
|
+
**Rails Request Integration:**
|
|
62
|
+
```ruby
|
|
63
|
+
# config/initializers/e11y.rb
|
|
64
|
+
E11y.configure do |config|
|
|
65
|
+
# Auto-extract trace_id from various sources
|
|
66
|
+
config.trace_id do
|
|
67
|
+
# Priority order (first found wins):
|
|
68
|
+
|
|
69
|
+
# 1. Rails request ID (default)
|
|
70
|
+
from_rails_request_id true
|
|
71
|
+
|
|
72
|
+
# 2. HTTP headers (OpenTelemetry / W3C Trace Context)
|
|
73
|
+
from_http_headers ['traceparent', 'X-Request-ID', 'X-Trace-ID']
|
|
74
|
+
|
|
75
|
+
# 3. Current.request_id (Rails CurrentAttributes)
|
|
76
|
+
from_current_attributes :request_id
|
|
77
|
+
|
|
78
|
+
# 4. Thread local (for background jobs)
|
|
79
|
+
from_thread_local :trace_id
|
|
80
|
+
|
|
81
|
+
# 5. Generate new if none found
|
|
82
|
+
generator -> { SecureRandom.uuid }
|
|
83
|
+
end
|
|
84
|
+
end
|
|
85
|
+
```
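
The priority order above amounts to a first-match-wins lookup with a generator as the last resort. A minimal sketch in plain Ruby, assuming hypothetical names (`TraceIdResolverSketch`, the lambda sources, the `context` hash layout); it illustrates the idea and is not E11y's actual implementation:

```ruby
require "securerandom"

# Hypothetical resolver: tries each source in order, then falls back to a generator.
class TraceIdResolverSketch
  def initialize(sources:, generator: -> { SecureRandom.uuid })
    @sources = sources
    @generator = generator
  end

  # Returns the first non-empty value produced by a source, else a generated ID.
  def resolve(context)
    @sources.each do |source|
      value = source.call(context)
      return value unless value.nil? || value.to_s.empty?
    end
    @generator.call
  end
end

# Assumed context layout: { request_id: String, headers: Hash }
resolver = TraceIdResolverSketch.new(
  sources: [
    ->(ctx) { ctx[:request_id] },                 # 1. Rails request ID
    ->(ctx) { ctx.dig(:headers, "X-Trace-ID") },  # 2. HTTP header
    ->(_ctx) { Thread.current[:trace_id] }        # 3. thread-local (background jobs)
  ]
)

resolver.resolve({ headers: { "X-Trace-ID" => "hdr-1" } })
```

Because every source is just a callable, reordering the array changes the priority without touching the lookup loop.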
|
|
86
|
+
|
|
87
|
+
**How it works:**
|
|
88
|
+
```ruby
|
|
89
|
+
# In controller, trace_id automatically extracted
|
|
90
|
+
class OrdersController < ApplicationController
|
|
91
|
+
def create
|
|
92
|
+
# E11y automatically uses request.uuid as trace_id
|
|
93
|
+
Events::OrderCreated.track(order_id: params[:id])
|
|
94
|
+
# → trace_id = request.uuid (e.g., "abc-123-def")
|
|
95
|
+
|
|
96
|
+
# All events in this request share same trace_id
|
|
97
|
+
Events::PaymentProcessed.track(order_id: params[:id], amount: 99)
|
|
98
|
+
# → trace_id = "abc-123-def" (same!)
|
|
99
|
+
end
|
|
100
|
+
end
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
---
|
|
104
|
+
|
|
105
|
+
### 2. Background Job Propagation
|
|
106
|
+
|
|
107
|
+
> **Implementation:** See [ADR-005 Section 6.2: Job Propagator](../ADR-005-tracing-context.md#62-job-propagator-sidekiqactivejob) for Sidekiq/ActiveJob integration details.
|
|
108
|
+
|
|
109
|
+
**Problem:** Background jobs lose trace_id context
|
|
110
|
+
|
|
111
|
+
**Solution:** Automatic propagation via Sidekiq middleware
|
|
112
|
+
|
|
113
|
+
```ruby
|
|
114
|
+
# lib/e11y/integrations/sidekiq.rb (built-in)
|
|
115
|
+
module E11y
|
|
116
|
+
module Integrations
|
|
117
|
+
class SidekiqMiddleware
|
|
118
|
+
def call(worker, job, queue)
|
|
119
|
+
# Extract trace_id from job payload
|
|
120
|
+
trace_id = job['trace_id']
|
|
121
|
+
|
|
122
|
+
# Set thread-local trace_id
|
|
123
|
+
E11y::TraceId.with_trace_id(trace_id) do
|
|
124
|
+
yield # Execute job
|
|
125
|
+
end
|
|
126
|
+
end
|
|
127
|
+
end
|
|
128
|
+
end
|
|
129
|
+
end
|
|
130
|
+
|
|
131
|
+
# Sidekiq configuration (auto-configured by E11y)
|
|
132
|
+
Sidekiq.configure_server do |config|
|
|
133
|
+
config.server_middleware do |chain|
|
|
134
|
+
chain.add E11y::Integrations::SidekiqMiddleware
|
|
135
|
+
end
|
|
136
|
+
end
|
|
137
|
+
|
|
138
|
+
Sidekiq.configure_client do |config|
|
|
139
|
+
config.client_middleware do |chain|
|
|
140
|
+
chain.add E11y::Integrations::SidekiqClientMiddleware
|
|
141
|
+
end
|
|
142
|
+
end
|
|
143
|
+
```
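
The client middleware registered above is not shown in this document. One way the two halves could fit together is sketched below, assuming a hypothetical thread-local key and hypothetical class names; only the `call` signatures follow Sidekiq's middleware convention, and E11y's real classes may differ:

```ruby
# Hypothetical sketch; module name, class names and the thread-local key are assumptions.
module TracePropagationSketch
  KEY = :sketch_trace_id

  # Client side: runs when a job is enqueued and stamps the payload.
  class ClientMiddleware
    def call(_job_class, job, _queue, _redis_pool = nil)
      job["trace_id"] ||= Thread.current[KEY]
      yield
    end
  end

  # Server side: runs around job execution, restores the ID, then cleans up.
  class ServerMiddleware
    def call(_worker, job, _queue)
      previous = Thread.current[KEY]
      Thread.current[KEY] = job["trace_id"]
      yield
    ensure
      # Worker threads are reused, so never leak one job's ID into the next.
      Thread.current[KEY] = previous
    end
  end
end
```

The `ensure` clause matters because Sidekiq reuses worker threads: without it, a job enqueued without a trace ID could inherit the ID of whatever ran before it.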
|
|
144
|
+
|
|
145
|
+
**Usage (automatic!):**
|
|
146
|
+
```ruby
|
|
147
|
+
# In controller (trace_id: abc-123)
|
|
148
|
+
class OrdersController < ApplicationController
|
|
149
|
+
def create
|
|
150
|
+
order = Order.create!(order_params) # strong parameters; raw `params` would raise ForbiddenAttributesError
|
|
151
|
+
|
|
152
|
+
# Enqueue job (trace_id automatically passed)
|
|
153
|
+
SendOrderConfirmationJob.perform_later(order.id)
|
|
154
|
+
# → Job payload includes: { 'trace_id' => 'abc-123' }
|
|
155
|
+
|
|
156
|
+
render json: order
|
|
157
|
+
end
|
|
158
|
+
end
|
|
159
|
+
|
|
160
|
+
# In job (trace_id: abc-123 - PRESERVED!)
|
|
161
|
+
class SendOrderConfirmationJob < ApplicationJob
|
|
162
|
+
def perform(order_id)
|
|
163
|
+
order = Order.find(order_id)
|
|
164
|
+
|
|
165
|
+
# trace_id is automatically restored!
|
|
166
|
+
Events::EmailSending.track(order_id: order.id)
|
|
167
|
+
# → trace_id = 'abc-123' (same as original request!)
|
|
168
|
+
|
|
169
|
+
UserMailer.order_confirmation(order).deliver_now
|
|
170
|
+
|
|
171
|
+
Events::EmailSent.track(order_id: order.id)
|
|
172
|
+
# → trace_id = 'abc-123' (still same!)
|
|
173
|
+
end
|
|
174
|
+
end
|
|
175
|
+
|
|
176
|
+
# Timeline in Grafana:
|
|
177
|
+
# 10:00:00.000 [abc-123] order.created (controller)
|
|
178
|
+
# 10:00:00.050 [abc-123] payment.processed (controller)
|
|
179
|
+
# 10:00:03.100 [abc-123] email.sending (job, 3 seconds later)
|
|
180
|
+
# 10:00:03.200 [abc-123] email.sent (job)
|
|
181
|
+
# → Complete trace across HTTP request + background job!
|
|
182
|
+
```
|
|
183
|
+
|
|
184
|
+
---
|
|
185
|
+
|
|
186
|
+
### 3. Cross-Service Propagation
|
|
187
|
+
|
|
188
|
+
> **Implementation:** See [ADR-005 Section 6.1: HTTP Propagator](../ADR-005-tracing-context.md#61-http-propagator-outgoing-requests) for outgoing request integration (Faraday, HTTP.rb).
|
|
189
|
+
|
|
190
|
+
**Microservices scenario:**
|
|
191
|
+
```ruby
|
|
192
|
+
# Service A: API Gateway
|
|
193
|
+
class OrdersController < ApplicationController
|
|
194
|
+
def create
|
|
195
|
+
# trace_id: abc-123 (from HTTP request)
|
|
196
|
+
Events::OrderReceived.track(order_id: params[:id])
|
|
197
|
+
|
|
198
|
+
# Call Service B (Payment Service)
|
|
199
|
+
response = HTTP
|
|
200
|
+
.headers('X-Trace-ID' => E11y::TraceId.current) # Propagate!
|
|
201
|
+
.post('http://payment-service/process', json: { order_id: params[:id] })
|
|
202
|
+
|
|
203
|
+
Events::OrderCreated.track(order_id: params[:id])
|
|
204
|
+
render json: { status: 'ok' }
|
|
205
|
+
end
|
|
206
|
+
end
|
|
207
|
+
|
|
208
|
+
# Service B: Payment Service
|
|
209
|
+
class PaymentsController < ApplicationController
|
|
210
|
+
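def process_payment # routed from /process; an action named `process` would override AbstractController::Base#process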
|
|
211
|
+
# trace_id: abc-123 (extracted from X-Trace-ID header!)
|
|
212
|
+
Events::PaymentProcessing.track(order_id: params[:order_id])
|
|
213
|
+
|
|
214
|
+
# Process payment...
|
|
215
|
+
|
|
216
|
+
Events::PaymentSucceeded.track(order_id: params[:order_id])
|
|
217
|
+
render json: { status: 'paid' }
|
|
218
|
+
end
|
|
219
|
+
end
|
|
220
|
+
|
|
221
|
+
# Timeline in Grafana:
|
|
222
|
+
# Service A:
|
|
223
|
+
# 10:00:00.000 [abc-123] order.received
|
|
224
|
+
# 10:00:00.200 [abc-123] order.created
|
|
225
|
+
#
|
|
226
|
+
# Service B:
|
|
227
|
+
# 10:00:00.050 [abc-123] payment.processing ← Same trace_id!
|
|
228
|
+
# 10:00:00.150 [abc-123] payment.succeeded ← Same trace_id!
|
|
229
|
+
#
|
|
230
|
+
# → Complete distributed trace across 2 services!
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
---
|
|
234
|
+
|
|
235
|
+
### 4. OpenTelemetry Integration
|
|
236
|
+
|
|
237
|
+
**W3C Trace Context support:**
|
|
238
|
+
```ruby
|
|
239
|
+
# config/initializers/e11y.rb
|
|
240
|
+
E11y.configure do |config|
|
|
241
|
+
config.trace_id do
|
|
242
|
+
# Parse W3C traceparent header
|
|
243
|
+
# Format: 00-{trace_id}-{span_id}-{flags}
|
|
244
|
+
# Example: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
|
|
245
|
+
|
|
246
|
+
from_http_headers ['traceparent']
|
|
247
|
+
parser :w3c_trace_context # Built-in parser
|
|
248
|
+
end
|
|
249
|
+
end
|
|
250
|
+
|
|
251
|
+
# Automatic parsing:
|
|
252
|
+
# HTTP Header: traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
|
|
253
|
+
# E11y extracts: trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
|
|
254
|
+
```
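
The `traceparent` format quoted above is small enough to parse with one regular expression. A minimal sketch, assuming a hypothetical `TraceparentSketch` helper; the built-in `:w3c_trace_context` parser may behave differently, for example in how it treats future versions:

```ruby
# Hypothetical helper; not E11y's built-in parser.
module TraceparentSketch
  # version(2 hex) - trace_id(32 hex) - span_id(16 hex) - flags(2 hex)
  PATTERN = /\A(?<version>\h{2})-(?<trace_id>\h{32})-(?<span_id>\h{16})-(?<flags>\h{2})\z/

  # Returns a hash of fields, or nil when the header is malformed.
  def self.parse(header)
    match = PATTERN.match(header.to_s.strip.downcase)
    return nil unless match
    return nil if match[:version] == "ff"          # reserved, invalid version
    return nil if match[:trace_id] == "0" * 32     # all-zero trace ID is invalid
    return nil if match[:span_id] == "0" * 16      # all-zero span ID is invalid

    {
      trace_id: match[:trace_id],
      span_id: match[:span_id],
      sampled: (match[:flags].to_i(16) & 0x01) == 1 # lowest flag bit = sampled
    }
  end
end

TraceparentSketch.parse("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
```

Note that the flags field already carries a sampled bit, so an upstream sampling decision can be read from the same header.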
|
|
255
|
+
|
|
256
|
+
**Span creation (optional):**
|
|
257
|
+
```ruby
|
|
258
|
+
# Create OpenTelemetry span from E11y event
|
|
259
|
+
Events::PaymentProcessed.track(
|
|
260
|
+
order_id: '123',
|
|
261
|
+
amount: 99,
|
|
262
|
+
create_span: true # ← Creates OTel span
|
|
263
|
+
) do
|
|
264
|
+
process_payment # Span duration measured
|
|
265
|
+
end
|
|
266
|
+
|
|
267
|
+
# Result:
|
|
268
|
+
# - E11y event with duration_ms
|
|
269
|
+
# - OpenTelemetry span with same trace_id
|
|
270
|
+
# - Automatic parent-child span relationship
|
|
271
|
+
```
|
|
272
|
+
|
|
273
|
+
---
|
|
274
|
+
|
|
275
|
+
### 5. Manual Trace Management
|
|
276
|
+
|
|
277
|
+
**Override trace_id:**
|
|
278
|
+
```ruby
|
|
279
|
+
# Sometimes you want to use custom trace_id
|
|
280
|
+
E11y::TraceId.with_trace_id('custom-trace-123') do
|
|
281
|
+
Events::OrderCreated.track(order_id: '456')
|
|
282
|
+
# → trace_id = 'custom-trace-123'
|
|
283
|
+
|
|
284
|
+
Events::PaymentProcessed.track(order_id: '456', amount: 99)
|
|
285
|
+
# → trace_id = 'custom-trace-123'
|
|
286
|
+
end
|
|
287
|
+
|
|
288
|
+
# Outside block, trace_id reverts to original
|
|
289
|
+
Events::UserLoggedIn.track(user_id: '789')
|
|
290
|
+
# → trace_id = original request trace_id
|
|
291
|
+
```
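
The revert-after-block behaviour shown above can be sketched with a save-and-restore around `yield`. This is a hypothetical stand-in for `E11y::TraceId`, assuming thread-local storage under an invented key:

```ruby
# Hypothetical stand-in for E11y::TraceId; the storage key is an assumption.
module TraceIdSketch
  KEY = :trace_id_sketch

  def self.current
    Thread.current[KEY]
  end

  # Sets the trace ID for the duration of the block, then restores the
  # previous value, even if the block raises.
  def self.with_trace_id(trace_id)
    previous = current
    Thread.current[KEY] = trace_id
    yield
  ensure
    Thread.current[KEY] = previous
  end
end

Thread.current[TraceIdSketch::KEY] = "request-1"
TraceIdSketch.with_trace_id("custom-trace-123") do
  TraceIdSketch.current # "custom-trace-123" inside the block
end
TraceIdSketch.current   # back to "request-1"
```

Because each call remembers only its own previous value, nested calls unwind in the right order.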
|
|
292
|
+
|
|
293
|
+
**Explicit trace_id:**
|
|
294
|
+
```ruby
|
|
295
|
+
# Override for single event
|
|
296
|
+
Events::OrderCreated.track(
|
|
297
|
+
order_id: '123',
|
|
298
|
+
trace_id: 'explicit-trace-456' # ← Explicit override
|
|
299
|
+
)
|
|
300
|
+
# → trace_id = 'explicit-trace-456'
|
|
301
|
+
|
|
302
|
+
# Next event reverts to automatic
|
|
303
|
+
Events::PaymentProcessed.track(order_id: '123', amount: 99)
|
|
304
|
+
# → trace_id = automatic (from request)
|
|
305
|
+
```
|
|
306
|
+
|
|
307
|
+
---
|
|
308
|
+
|
|
309
|
+
## 💻 Implementation Examples
|
|
310
|
+
|
|
311
|
+
### Example 1: Complete Request Timeline
|
|
312
|
+
|
|
313
|
+
```ruby
|
|
314
|
+
# app/controllers/orders_controller.rb
|
|
315
|
+
class OrdersController < ApplicationController
|
|
316
|
+
def create
|
|
317
|
+
# 1. Validate input
|
|
318
|
+
Events::OrderValidationStarted.track(params: sanitized_params)
|
|
319
|
+
|
|
320
|
+
begin
|
|
321
|
+
validate_order_params!
|
|
322
|
+
Events::OrderValidationSucceeded.track
|
|
323
|
+
rescue ValidationError => e
|
|
324
|
+
Events::OrderValidationFailed.track(error: e.message, severity: :error)
|
|
325
|
+
return render json: { error: e.message }, status: :unprocessable_entity
|
|
326
|
+
end
|
|
327
|
+
|
|
328
|
+
# 2. Create order
|
|
329
|
+
order = Order.create!(order_params)
|
|
330
|
+
Events::OrderCreated.track(order_id: order.id, amount: order.total)
|
|
331
|
+
|
|
332
|
+
# 3. Process payment
|
|
333
|
+
Events::PaymentProcessing.track(order_id: order.id, amount: order.total)
|
|
334
|
+
|
|
335
|
+
begin
|
|
336
|
+
payment = PaymentService.charge(order)
|
|
337
|
+
Events::PaymentSucceeded.track(
|
|
338
|
+
order_id: order.id,
|
|
339
|
+
transaction_id: payment.id,
|
|
340
|
+
severity: :success
|
|
341
|
+
)
|
|
342
|
+
rescue PaymentError => e
|
|
343
|
+
Events::PaymentFailed.track(
|
|
344
|
+
order_id: order.id,
|
|
345
|
+
error: e.message,
|
|
346
|
+
severity: :error
|
|
347
|
+
)
|
|
348
|
+
raise
|
|
349
|
+
end
|
|
350
|
+
|
|
351
|
+
# 4. Enqueue background jobs
|
|
352
|
+
SendOrderConfirmationJob.perform_later(order.id)
|
|
353
|
+
UpdateInventoryJob.perform_later(order.id)
|
|
354
|
+
|
|
355
|
+
Events::OrderCompleted.track(order_id: order.id, severity: :success)
|
|
356
|
+
|
|
357
|
+
render json: order
|
|
358
|
+
end
|
|
359
|
+
end
|
|
360
|
+
|
|
361
|
+
# Grafana query: {trace_id="abc-123"}
|
|
362
|
+
# Result:
|
|
363
|
+
# 10:00:00.000 order.validation.started
|
|
364
|
+
# 10:00:00.010 order.validation.succeeded
|
|
365
|
+
# 10:00:00.020 order.created
|
|
366
|
+
# 10:00:00.030 payment.processing
|
|
367
|
+
# 10:00:00.150 payment.succeeded
|
|
368
|
+
# 10:00:00.160 order.completed
|
|
369
|
+
# 10:00:02.000 email.sending (background job)
|
|
370
|
+
# 10:00:03.500 email.sent (background job)
|
|
371
|
+
# 10:00:04.000 inventory.updated (background job)
|
|
372
|
+
```
|
|
373
|
+
|
|
374
|
+
---
|
|
375
|
+
|
|
376
|
+
### Example 2: Distributed Trace Across Services
|
|
377
|
+
|
|
378
|
+
```ruby
|
|
379
|
+
# Service A: API Gateway
|
|
380
|
+
class OrdersController < ApplicationController
|
|
381
|
+
def create
|
|
382
|
+
trace_id = E11y::TraceId.current # abc-123
|
|
383
|
+
|
|
384
|
+
Events::OrderReceived.track(order_id: params[:id])
|
|
385
|
+
|
|
386
|
+
# Call Payment Service (with trace propagation)
|
|
387
|
+
payment_response = call_payment_service(trace_id, params)
|
|
388
|
+
|
|
389
|
+
# Call Inventory Service (with trace propagation)
|
|
390
|
+
inventory_response = call_inventory_service(trace_id, params)
|
|
391
|
+
|
|
392
|
+
Events::OrderCreated.track(order_id: params[:id])
|
|
393
|
+
|
|
394
|
+
render json: { status: 'ok' }
|
|
395
|
+
end
|
|
396
|
+
|
|
397
|
+
private
|
|
398
|
+
|
|
399
|
+
def call_payment_service(trace_id, params)
|
|
400
|
+
HTTP
|
|
401
|
+
.headers('X-Trace-ID' => trace_id)
|
|
402
|
+
.post('http://payment-service/charge', json: params)
|
|
403
|
+
end
|
|
404
|
+
|
|
405
|
+
def call_inventory_service(trace_id, params)
|
|
406
|
+
HTTP
|
|
407
|
+
.headers('X-Trace-ID' => trace_id)
|
|
408
|
+
.post('http://inventory-service/reserve', json: params)
|
|
409
|
+
end
|
|
410
|
+
end
|
|
411
|
+
|
|
412
|
+
# Service B: Payment Service
|
|
413
|
+
class PaymentsController < ApplicationController
|
|
414
|
+
def charge
|
|
415
|
+
# trace_id automatically extracted from X-Trace-ID header
|
|
416
|
+
Events::PaymentReceived.track(amount: params[:amount])
|
|
417
|
+
|
|
418
|
+
# Process payment...
|
|
419
|
+
|
|
420
|
+
Events::PaymentCharged.track(
|
|
421
|
+
transaction_id: transaction.id,
|
|
422
|
+
amount: params[:amount],
|
|
423
|
+
severity: :success
|
|
424
|
+
)
|
|
425
|
+
|
|
426
|
+
render json: { status: 'charged' }
|
|
427
|
+
end
|
|
428
|
+
end
|
|
429
|
+
|
|
430
|
+
# Service C: Inventory Service
|
|
431
|
+
class InventoryController < ApplicationController
|
|
432
|
+
def reserve
|
|
433
|
+
# trace_id automatically extracted from X-Trace-ID header
|
|
434
|
+
Events::InventoryReserveRequested.track(items: params[:items])
|
|
435
|
+
|
|
436
|
+
# Reserve inventory...
|
|
437
|
+
|
|
438
|
+
Events::InventoryReserved.track(
|
|
439
|
+
items: params[:items],
|
|
440
|
+
severity: :success
|
|
441
|
+
)
|
|
442
|
+
|
|
443
|
+
render json: { status: 'reserved' }
|
|
444
|
+
end
|
|
445
|
+
end
|
|
446
|
+
|
|
447
|
+
# Timeline in Grafana: {trace_id="abc-123"}
|
|
448
|
+
# 10:00:00.000 [api-gateway] order.received
|
|
449
|
+
# 10:00:00.010 [payment-service] payment.received
|
|
450
|
+
# 10:00:00.015 [inventory-service] inventory.reserve.requested
|
|
451
|
+
# 10:00:00.100 [payment-service] payment.charged
|
|
452
|
+
# 10:00:00.120 [inventory-service] inventory.reserved
|
|
453
|
+
# 10:00:00.150 [api-gateway] order.created
|
|
454
|
+
# → Complete distributed trace!
|
|
455
|
+
```
|
|
456
|
+
|
|
457
|
+
---
|
|
458
|
+
|
|
459
|
+
### Example 3: Nested Trace Contexts
|
|
460
|
+
|
|
461
|
+
```ruby
|
|
462
|
+
# Sometimes you need to track sub-operations separately
|
|
463
|
+
class BulkOrderProcessor
|
|
464
|
+
def process(orders)
|
|
465
|
+
# Parent trace_id from current request
|
|
466
|
+
parent_trace_id = E11y::TraceId.current
|
|
467
|
+
|
|
468
|
+
Events::BulkProcessingStarted.track(
|
|
469
|
+
order_count: orders.count,
|
|
470
|
+
trace_id: parent_trace_id
|
|
471
|
+
)
|
|
472
|
+
|
|
473
|
+
orders.each do |order|
|
|
474
|
+
# Create child trace for each order
|
|
475
|
+
child_trace_id = "#{parent_trace_id}-order-#{order.id}"
|
|
476
|
+
|
|
477
|
+
E11y::TraceId.with_trace_id(child_trace_id) do
|
|
478
|
+
Events::OrderProcessing.track(order_id: order.id)
|
|
479
|
+
|
|
480
|
+
process_single_order(order)
|
|
481
|
+
|
|
482
|
+
Events::OrderProcessed.track(order_id: order.id, severity: :success)
|
|
483
|
+
end
|
|
484
|
+
end
|
|
485
|
+
|
|
486
|
+
# Back to parent trace
|
|
487
|
+
Events::BulkProcessingCompleted.track(
|
|
488
|
+
order_count: orders.count,
|
|
489
|
+
trace_id: parent_trace_id,
|
|
490
|
+
severity: :success
|
|
491
|
+
)
|
|
492
|
+
end
|
|
493
|
+
end
|
|
494
|
+
|
|
495
|
+
# Timeline:
|
|
496
|
+
# [parent-123] bulk.processing.started (order_count: 3)
|
|
497
|
+
# [parent-123-order-1] order.processing
|
|
498
|
+
# [parent-123-order-1] order.processed
|
|
499
|
+
# [parent-123-order-2] order.processing
|
|
500
|
+
# [parent-123-order-2] order.processed
|
|
501
|
+
# [parent-123-order-3] order.processing
|
|
502
|
+
# [parent-123-order-3] order.processed
|
|
503
|
+
# [parent-123] bulk.processing.completed
|
|
504
|
+
```
|
|
505
|
+
|
|
506
|
+
---
|
|
507
|
+
|
|
508
|
+
### 6. Trace-Consistent Sampling Integration
|
|
509
|
+
|
|
510
|
+
> **Implementation:** See [ADR-005 Section 7: Sampling Decisions](../ADR-005-tracing-context.md#7-sampling-decisions-trace-consistent-sampling) for trace-consistent sampling architecture.
|
|
511
|
+
|
|
512
|
+
**Critical Feature:** Sampling decisions must be consistent across trace boundaries
|
|
513
|
+
|
|
514
|
+
**See:** [UC-014: Adaptive Sampling - Strategy 7: Trace-Consistent Sampling](./UC-014-adaptive-sampling.md#strategy-7-trace-consistent-sampling)
|
|
515
|
+
|
|
516
|
+
**Why it matters:**
|
|
517
|
+
|
|
518
|
+
```ruby
|
|
519
|
+
# ❌ PROBLEM: Inconsistent sampling breaks distributed traces
|
|
520
|
+
#
|
|
521
|
+
# HTTP request (trace_id: abc-123):
|
|
522
|
+
# → Sampled at 10% → NOT sampled
|
|
523
|
+
#
|
|
524
|
+
# Background job (trace_id: abc-123):
|
|
525
|
+
# → Sampled at 10% independently → MAYBE sampled
|
|
526
|
+
#
|
|
527
|
+
# RESULT: Job event exists, but parent HTTP event is missing!
|
|
528
|
+
# → Can't understand context (orphaned event)
|
|
529
|
+
```
|
|
530
|
+
|
|
531
|
+
**Solution:** Propagate sample decision with trace_id
|
|
532
|
+
|
|
533
|
+
```ruby
|
|
534
|
+
# config/initializers/e11y.rb
|
|
535
|
+
E11y.configure do |config|
|
|
536
|
+
config.trace_id do
|
|
537
|
+
# Enable trace context propagation
|
|
538
|
+
from_rails_request_id true
|
|
539
|
+
from_http_headers ['traceparent', 'X-Trace-ID']
|
|
540
|
+
|
|
541
|
+
# ✅ CRITICAL: Propagate sample decision
|
|
542
|
+
propagate_sample_decision true
|
|
543
|
+
|
|
544
|
+
# Include in outbound HTTP requests
|
|
545
|
+
propagate_via_headers [
|
|
546
|
+
'X-Trace-ID', # Trace ID
|
|
547
|
+
'X-E11y-Sampled' # Sample decision (true/false)
|
|
548
|
+
]
|
|
549
|
+
end
|
|
550
|
+
|
|
551
|
+
config.adaptive_sampling do
|
|
552
|
+
# Enable trace-consistent sampling
|
|
553
|
+
trace_consistent do
|
|
554
|
+
enabled true
|
|
555
|
+
propagate_decision true
|
|
556
|
+
sample_decision_key 'e11y_sampled' # Metadata key for jobs
|
|
557
|
+
|
|
558
|
+
# Sample entire trace if ANY event is error
|
|
559
|
+
sample_on_error true
|
|
560
|
+
end
|
|
561
|
+
end
|
|
562
|
+
end
|
|
563
|
+
```
|
|
564
|
+
|
|
565
|
+
**How trace context + sampling work together:**
|
|
566
|
+
|
|
567
|
+
```ruby
|
|
568
|
+
# === HTTP REQUEST (trace_id: abc-123) ===
|
|
569
|
+
class OrdersController < ApplicationController
|
|
570
|
+
def create
|
|
571
|
+
# 1. Trace context extracted from request
|
|
572
|
+
# → E11y::Current.trace_id = 'abc-123'
|
|
573
|
+
|
|
574
|
+
# 2. Sample decision made at entry point
|
|
575
|
+
# → rand returned 0.05, below the 0.1 sample rate → SAMPLED!
|
|
576
|
+
# → E11y::Current.sampled = true
|
|
577
|
+
|
|
578
|
+
Events::OrderCreated.track(order_id: '123')
|
|
579
|
+
# → trace_id: 'abc-123', sampled: true → TRACKED
|
|
580
|
+
|
|
581
|
+
# 3. Enqueue job (both trace_id + sample decision propagated)
|
|
582
|
+
SendEmailJob.perform_later('123') # positional, to match `perform(order_id)` in the job below
|
|
583
|
+
# Job metadata: {
|
|
584
|
+
# trace_id: 'abc-123', ← From E11y::Current.trace_id
|
|
585
|
+
# e11y_sampled: true ← From E11y::Current.sampled
|
|
586
|
+
# }
|
|
587
|
+
|
|
588
|
+
Events::OrderCompleted.track(order_id: '123')
|
|
589
|
+
# → trace_id: 'abc-123', sampled: true → TRACKED
|
|
590
|
+
end
|
|
591
|
+
end
|
|
592
|
+
|
|
593
|
+
# === BACKGROUND JOB ===
|
|
594
|
+
class SendEmailJob < ApplicationJob
|
|
595
|
+
def perform(order_id)
|
|
596
|
+
# 4. Trace context + sample decision restored from job metadata
|
|
597
|
+
# → E11y::Current.trace_id = 'abc-123'
|
|
598
|
+
# → E11y::Current.sampled = true
|
|
599
|
+
|
|
600
|
+
Events::EmailSending.track(order_id: order_id)
|
|
601
|
+
# → trace_id: 'abc-123', sampled: true → TRACKED (consistent!)
|
|
602
|
+
|
|
603
|
+
send_email(order_id)
|
|
604
|
+
|
|
605
|
+
Events::EmailSent.track(order_id: order_id)
|
|
606
|
+
# → trace_id: 'abc-123', sampled: true → TRACKED
|
|
607
|
+
end
|
|
608
|
+
end
|
|
609
|
+
|
|
610
|
+
# RESULT in Loki: Complete trace!
|
|
611
|
+
# {trace_id="abc-123"}
|
|
612
|
+
# 10:00:00.000 order.created
|
|
613
|
+
# 10:00:00.050 order.completed
|
|
614
|
+
# 10:00:02.000 email.sending
|
|
615
|
+
# 10:00:03.500 email.sent
|
|
616
|
+
```
|
|
617
|
+
|
|
618
|
+
**Cross-service example:**
|
|
619
|
+
|
|
620
|
+
```ruby
|
|
621
|
+
# Service A: API Gateway
|
|
622
|
+
class OrdersController < ApplicationController
|
|
623
|
+
def create
|
|
624
|
+
# trace_id: abc-123, sampled: true
|
|
625
|
+
Events::OrderReceived.track(order_id: '123')
|
|
626
|
+
|
|
627
|
+
# Call Service B (propagate BOTH trace_id + sample decision)
|
|
628
|
+
response = HTTP
|
|
629
|
+
.headers(
|
|
630
|
+
'X-Trace-ID' => E11y::Current.trace_id, # ← Trace ID
|
|
631
|
+
'X-E11y-Sampled' => E11y::Current.sampled # ← Sample decision
|
|
632
|
+
)
|
|
633
|
+
.post('http://payment-service/charge', json: { order_id: '123' })
|
|
634
|
+
|
|
635
|
+
Events::OrderCreated.track(order_id: '123')
|
|
636
|
+
end
|
|
637
|
+
end
|
|
638
|
+
|
|
639
|
+
# Service B: Payment Service
|
|
640
|
+
class PaymentsController < ApplicationController
|
|
641
|
+
before_action :extract_trace_context
|
|
642
|
+
|
|
643
|
+
def charge
|
|
644
|
+
# trace_id: abc-123 (from header)
|
|
645
|
+
# sampled: true (from header)
|
|
646
|
+
|
|
647
|
+
Events::PaymentProcessing.track(order_id: params[:order_id])
|
|
648
|
+
# → Tracked (consistent with Service A!)
|
|
649
|
+
|
|
650
|
+
process_payment
|
|
651
|
+
|
|
652
|
+
Events::PaymentSucceeded.track(order_id: params[:order_id])
|
|
653
|
+
# → Tracked
|
|
654
|
+
end
|
|
655
|
+
|
|
656
|
+
private
|
|
657
|
+
|
|
658
|
+
def extract_trace_context
|
|
659
|
+
# E11y automatically extracts from headers:
|
|
660
|
+
# X-Trace-ID → E11y::Current.trace_id
|
|
661
|
+
# X-E11y-Sampled → E11y::Current.sampled
|
|
662
|
+
end
|
|
663
|
+
end
|
|
664
|
+
|
|
665
|
+
# RESULT: Complete distributed trace!
|
|
666
|
+
# [Service A] order.received
|
|
667
|
+
# [Service A] order.created
|
|
668
|
+
# [Service B] payment.processing ← Same trace_id + sampled!
|
|
669
|
+
# [Service B] payment.succeeded
|
|
670
|
+
```
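
The `extract_trace_context` hook above is described only in comments. As a rough sketch of what such an extraction could look like, assuming plain string headers and a hypothetical `TraceContext` struct (neither the struct nor this standalone helper is part of E11y's documented API):

```ruby
# Hypothetical value object for illustration; not an E11y class.
TraceContext = Struct.new(:trace_id, :sampled)

# Reads the two propagation headers used in the example above.
# Returns nil when no trace ID was propagated.
def extract_trace_context(headers)
  trace_id = headers['X-Trace-ID']
  return nil if trace_id.nil? || trace_id.empty?

  # Header values arrive as strings, so the boolean is parsed explicitly.
  # A missing or unrecognized value is treated as "not sampled".
  sampled = headers['X-E11y-Sampled'].to_s.strip.downcase == 'true'
  TraceContext.new(trace_id, sampled)
end

context = extract_trace_context('X-Trace-ID' => 'abc-123', 'X-E11y-Sampled' => 'true')
puts "trace_id=#{context.trace_id} sampled=#{context.sampled}"
```

Treating a missing `X-E11y-Sampled` header as `false` is a design choice made for this sketch; a real service might instead fall back to its own sampling decision.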
|
|
671
|
+
|
|
672
|
+
**Exception: sample_on_error**
|
|
673
|
+
|
|
674
|
+
```ruby
|
|
675
|
+
# Scenario: Request initially NOT sampled, but error occurs
|
|
676
|
+
#
|
|
677
|
+
# 1. HTTP request: trace_id = 'abc-123', sampled = false
|
|
678
|
+
# 2. Events buffered (not sent)
|
|
679
|
+
# 3. Payment error occurs!
|
|
680
|
+
# 4. sample_on_error = true → Override: sampled = true
|
|
681
|
+
# 5. Flush buffer → Send all events
|
|
682
|
+
# 6. Job metadata updated: e11y_sampled = true
|
|
683
|
+
# 7. Job tracks all events
|
|
684
|
+
#
|
|
685
|
+
# RESULT: Complete error trace (even though initially not sampled!)
|
|
686
|
+
|
|
687
|
+
E11y.configure do |config|
|
|
688
|
+
config.adaptive_sampling do
|
|
689
|
+
trace_consistent do
|
|
690
|
+
enabled true
|
|
691
|
+
|
|
692
|
+
# ✅ Override sample decision on error
|
|
693
|
+
sample_on_error true
|
|
694
|
+
|
|
695
|
+
# This works with request-scoped debug buffering
|
|
696
|
+
# See: UC-001 (Request-Scoped Debug Buffering)
|
|
697
|
+
end
|
|
698
|
+
end
|
|
699
|
+
end
|
|
700
|
+
```
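
The buffer-then-flush behaviour in steps 1 to 5 can be modelled in a few lines. This is a simplified sketch, not E11y's implementation: `RequestBuffer`, `track`, and `error!` are hypothetical names chosen for illustration.

```ruby
# Hypothetical model of request-scoped buffering with sample_on_error.
class RequestBuffer
  attr_reader :sent

  def initialize(sampled:, sample_on_error: true)
    @sampled = sampled
    @sample_on_error = sample_on_error
    @pending = [] # events held back while the request is not sampled
    @sent = []    # events that would be delivered to adapters
  end

  def sampled?
    @sampled
  end

  # Sampled requests send immediately; unsampled requests only buffer.
  def track(event_name)
    if @sampled
      @sent << event_name
    else
      @pending << event_name
    end
  end

  # On error, optionally override the decision and flush the buffer
  # so the events leading up to the failure are not lost.
  def error!(event_name)
    if @sample_on_error && !@sampled
      @sampled = true
      @sent.concat(@pending)
      @pending.clear
    end
    track(event_name)
  end
end

buffer = RequestBuffer.new(sampled: false)
buffer.track('order.received')
buffer.track('payment.processing')
buffer.error!('payment.failed')
p buffer.sent
```

After `error!`, `sampled?` returns `true`, which is the value a job enqueued later in the same request would inherit (step 6).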
|
|
701
|
+
|
|
702
|
+
**Best practices:**
|
|
703
|
+
|
|
704
|
+
1. **Always propagate sample decision with trace_id**
|
|
705
|
+
```ruby
|
|
706
|
+
# ✅ GOOD: Both trace_id + sampled
|
|
707
|
+
HTTP.headers(
|
|
708
|
+
'X-Trace-ID' => E11y::Current.trace_id,
|
|
709
|
+
'X-E11y-Sampled' => E11y::Current.sampled
|
|
710
|
+
).post(url, json: data)
|
|
711
|
+
```
|
|
712
|
+
|
|
713
|
+
2. **Use trace-consistent sampling in production**
|
|
714
|
+
```ruby
|
|
715
|
+
# ✅ GOOD: Prevents incomplete traces
|
|
716
|
+
config.adaptive_sampling.trace_consistent.enabled = true
|
|
717
|
+
```
|
|
718
|
+
|
|
719
|
+
3. **Always sample critical patterns (override trace decision)**
|
|
720
|
+
```ruby
|
|
721
|
+
# ✅ GOOD: Critical events always tracked
|
|
722
|
+
config.adaptive_sampling.always_sample event_patterns: ['payment.*', 'security.*']
|
|
723
|
+
```
|
|
724
|
+
|
|
725
|
+
**See also:**
|
|
726
|
+
- **[UC-014: Adaptive Sampling - Strategy 7](./UC-014-adaptive-sampling.md#strategy-7-trace-consistent-sampling)** - Detailed implementation
|
|
727
|
+
- **[UC-001: Request-Scoped Debug Buffering](./UC-001-request-scoped-debug-buffering.md)** - How `sample_on_error` works
|
|
728
|
+
|
|
729
|
+
---
|
|
730
|
+
|
|
731
|
+
## 🔧 Configuration API
|
|
732
|
+
|
|
733
|
+
### Full Configuration
|
|
734
|
+
|
|
735
|
+
```ruby
|
|
736
|
+
# config/initializers/e11y.rb
|
|
737
|
+
E11y.configure do |config|
|
|
738
|
+
config.trace_id do
|
|
739
|
+
# === SOURCE PRIORITY (first found wins) ===
|
|
740
|
+
|
|
741
|
+
# 1. Rails request UUID
|
|
742
|
+
from_rails_request_id true
|
|
743
|
+
|
|
744
|
+
# 2. HTTP headers (W3C Trace Context, custom headers)
|
|
745
|
+
from_http_headers [
|
|
746
|
+
'traceparent', # W3C Trace Context (OpenTelemetry)
|
|
747
|
+
'X-Request-ID', # Common Rails default
|
|
748
|
+
'X-Trace-ID', # Custom header
|
|
749
|
+
'X-Amzn-Trace-Id' # AWS X-Ray
|
|
750
|
+
]
|
|
751
|
+
|
|
752
|
+
# 3. Rails CurrentAttributes
|
|
753
|
+
from_current_attributes :request_id
|
|
754
|
+
|
|
755
|
+
# 4. Thread-local storage (background jobs)
|
|
756
|
+
from_thread_local :trace_id
|
|
757
|
+
|
|
758
|
+
# 5. Custom extractor
|
|
759
|
+
custom_extractor -> {
|
|
760
|
+
# Example: Extract from Sentry context
|
|
761
|
+
Sentry.get_current_scope&.transaction&.trace_id
|
|
762
|
+
}
|
|
763
|
+
|
|
764
|
+
# 6. Generator (last resort)
|
|
765
|
+
generator -> { SecureRandom.uuid }
|
|
766
|
+
|
|
767
|
+
# === PARSING ===
|
|
768
|
+
|
|
769
|
+
# W3C Trace Context parser (00-{trace_id}-{span_id}-{flags})
|
|
770
|
+
parser :w3c_trace_context
|
|
771
|
+
|
|
772
|
+
# OR custom parser
|
|
773
|
+
parser ->(header_value) {
|
|
774
|
+
# Extract trace_id from custom format
|
|
775
|
+
header_value.split('-').first
|
|
776
|
+
}
|
|
777
|
+
|
|
778
|
+
# === PROPAGATION ===
|
|
779
|
+
|
|
780
|
+
# Which HTTP headers to set when making outbound requests
|
|
781
|
+
propagate_via_headers ['X-Trace-ID', 'traceparent']
|
|
782
|
+
|
|
783
|
+
# Include in all E11y events
|
|
784
|
+
include_in_events true # Default
|
|
785
|
+
|
|
786
|
+
# Include in logs (Rails.logger)
|
|
787
|
+
include_in_logs true
|
|
788
|
+
|
|
789
|
+
# Include in Sentry
|
|
790
|
+
include_in_sentry true
|
|
791
|
+
end
|
|
792
|
+
end
|
|
793
|
+
```
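
For reference, the `00-{trace_id}-{span_id}-{flags}` layout handled by the `:w3c_trace_context` parser can be parsed as sketched below. This is an illustrative stand-alone function, not the gem's parser; `parse_traceparent` and `TRACEPARENT_PATTERN` are hypothetical names, and the sketch handles only the four-field layout shown above.

```ruby
# version (2 hex) - trace-id (32 hex) - parent/span-id (16 hex) - flags (2 hex)
TRACEPARENT_PATTERN = /\A(\h{2})-(\h{32})-(\h{16})-(\h{2})\z/

# Returns a hash with the parsed fields, or nil for a malformed header.
def parse_traceparent(header_value)
  match = TRACEPARENT_PATTERN.match(header_value.to_s.strip.downcase)
  return nil unless match

  version, trace_id, span_id, flags = match.captures

  # W3C Trace Context treats version "ff" and all-zero IDs as invalid.
  return nil if version == 'ff'
  return nil if trace_id == '0' * 32 || span_id == '0' * 16

  {
    trace_id: trace_id,
    span_id: span_id,
    # The lowest bit of the flags byte is the "sampled" flag.
    sampled: (flags.to_i(16) & 0x01) == 1
  }
end

parsed = parse_traceparent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01')
puts parsed[:trace_id]
puts parsed[:sampled]
```

Because the sampled flag travels inside `traceparent`, services that propagate this header may not need a separate sample-decision header; whether E11y reads the flag this way is not stated in this document.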
|
|
794
|
+
|
|
795
|
+
---
|
|
796
|
+
|
|
797
|
+
## 📊 Monitoring
|
|
798
|
+
|
|
799
|
+
### Trace ID Coverage
|
|
800
|
+
|
|
801
|
+
```ruby
|
|
802
|
+
# Self-monitoring metric
|
|
803
|
+
E11y.configure do |config|
|
|
804
|
+
config.self_monitoring do
|
|
805
|
+
# Track events with vs without trace_id
|
|
806
|
+
counter :events_with_trace_id_total
|
|
807
|
+
counter :events_without_trace_id_total
|
|
808
|
+
|
|
809
|
+
# Alert if too many events lack trace_id
|
|
810
|
+
# events_without_trace_id_total / (with + without) > 5% → alert
|
|
811
|
+
end
|
|
812
|
+
end
|
|
813
|
+
```
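
The 5% alert condition compares the two counters as a ratio. A minimal sketch of that calculation, assuming the counter values have already been read from the metrics backend (the function names and the `threshold` parameter are hypothetical):

```ruby
# Share of events that were emitted without a trace_id (0.0 to 1.0).
def missing_trace_id_ratio(with_trace_id:, without_trace_id:)
  total = with_trace_id + without_trace_id
  return 0.0 if total.zero? # no events yet, nothing to alert on

  without_trace_id.to_f / total
end

# True when the share of events lacking a trace_id exceeds the threshold.
def trace_id_coverage_alert?(with_trace_id:, without_trace_id:, threshold: 0.05)
  ratio = missing_trace_id_ratio(with_trace_id: with_trace_id,
                                 without_trace_id: without_trace_id)
  ratio > threshold
end

puts trace_id_coverage_alert?(with_trace_id: 940, without_trace_id: 60)
puts trace_id_coverage_alert?(with_trace_id: 990, without_trace_id: 10)
```

In practice this comparison would more likely live in an alerting rule over a time window than in application code; the Ruby version only illustrates the arithmetic.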
|
|
814
|
+
|
|
815
|
+
---
|
|
816
|
+
|
|
817
|
+
## 🧪 Testing
|
|
818
|
+
|
|
819
|
+
```ruby
|
|
820
|
+
# spec/e11y/trace_id_spec.rb
|
|
821
|
+
RSpec.describe 'Trace ID Management' do
|
|
822
|
+
it 'extracts trace_id from Rails request' do
|
|
823
|
+
get '/orders', headers: { 'X-Request-ID' => 'test-trace-123' }
|
|
824
|
+
|
|
825
|
+
# Verify event has correct trace_id
|
|
826
|
+
event = E11y::Buffer.pop
|
|
827
|
+
expect(event[:trace_id]).to eq('test-trace-123')
|
|
828
|
+
end
|
|
829
|
+
|
|
830
|
+
it 'propagates trace_id to background jobs' do
|
|
831
|
+
# Set trace_id in request
|
|
832
|
+
E11y::TraceId.with_trace_id('request-trace-456') do
|
|
833
|
+
# Enqueue job
|
|
834
|
+
TestJob.perform_later
|
|
835
|
+
end
|
|
836
|
+
|
|
837
|
+
# Verify job has same trace_id
|
|
838
|
+
job_payload = Sidekiq::Queue.new.first
|
|
839
|
+
expect(job_payload['trace_id']).to eq('request-trace-456')
|
|
840
|
+
end
|
|
841
|
+
|
|
842
|
+
it 'generates trace_id if none found' do
|
|
843
|
+
# No request context, no headers
|
|
844
|
+
Events::TestEvent.track(foo: 'bar')
|
|
845
|
+
|
|
846
|
+
event = E11y::Buffer.pop
|
|
847
|
+
expect(event[:trace_id]).to match(/\A[0-9a-f-]{36}\z/) # UUID format (anchored to the whole string)
|
|
848
|
+
end
|
|
849
|
+
end
|
|
850
|
+
```
|
|
851
|
+
|
|
852
|
+
---
|
|
853
|
+
|
|
854
|
+
## 💡 Best Practices
|
|
855
|
+
|
|
856
|
+
### ✅ DO
|
|
857
|
+
|
|
858
|
+
**1. Always propagate trace_id in HTTP calls**
|
|
859
|
+
```ruby
|
|
860
|
+
# ✅ GOOD: Propagate trace_id
|
|
861
|
+
HTTP.headers('X-Trace-ID' => E11y::TraceId.current).post(url, json: data)
|
|
862
|
+
```
|
|
863
|
+
|
|
864
|
+
**2. Use nested traces for sub-operations**
|
|
865
|
+
```ruby
|
|
866
|
+
# ✅ GOOD: Parent-child relationship
|
|
867
|
+
child_trace = "#{parent_trace}-operation-#{id}"
|
|
868
|
+
```
|
|
869
|
+
|
|
870
|
+
**3. Include trace_id in error logs**
|
|
871
|
+
```ruby
|
|
872
|
+
# ✅ GOOD: Easy to correlate
|
|
873
|
+
Rails.logger.error "Payment failed (trace_id: #{E11y::TraceId.current})"
|
|
874
|
+
```
|
|
875
|
+
|
|
876
|
+
---
|
|
877
|
+
|
|
878
|
+
### ❌ DON'T
|
|
879
|
+
|
|
880
|
+
**1. Don't generate new trace_id in background jobs**
|
|
881
|
+
```ruby
|
|
882
|
+
# ❌ BAD: Loses correlation
|
|
883
|
+
E11y::TraceId.with_trace_id(SecureRandom.uuid) do # ← DON'T!
|
|
884
|
+
perform_work
|
|
885
|
+
end
|
|
886
|
+
|
|
887
|
+
# ✅ GOOD: Use propagated trace_id (automatic!)
|
|
888
|
+
perform_work # Trace ID already set by middleware
|
|
889
|
+
```
|
|
890
|
+
|
|
891
|
+
---
|
|
892
|
+
|
|
893
|
+
## 🔗 Related Use Cases
|
|
894
|
+
|
|
895
|
+
- **[UC-014: Adaptive Sampling - Strategy 7](./UC-014-adaptive-sampling.md#strategy-7-trace-consistent-sampling)** - Trace-consistent sampling implementation
|
|
896
|
+
- **[UC-001: Request-Scoped Debug Buffering](./UC-001-request-scoped-debug-buffering.md)** - How `sample_on_error` works with buffering
|
|
897
|
+
- **[UC-005: Sentry Integration](./UC-005-sentry-integration.md)** - Trace correlation with Sentry
|
|
898
|
+
- **[UC-008: OpenTelemetry Integration](./UC-008-opentelemetry-integration.md)** - Full OTel support
|
|
899
|
+
- **[UC-009: Multi-Service Tracing](./UC-009-multi-service-tracing.md)** - Cross-service trace propagation
|
|
900
|
+
|
|
901
|
+
---
|
|
902
|
+
|
|
903
|
+
**Document Version:** 1.0
|
|
904
|
+
**Last Updated:** January 12, 2026
|
|
905
|
+
**Status:** ✅ Complete
|