datadog-statsd-schema 0.1.2 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/README.md CHANGED
@@ -1,638 +1,502 @@
1
1
  [![RSpec and Rubocop](https://github.com/kigster/datadog-statsd-schema/actions/workflows/ruby.yml/badge.svg)](https://github.com/kigster/datadog-statsd-schema/actions/workflows/ruby.yml)
2
2
 
3
- # Datadog::Statsd::Schema
3
+ # Datadog StatsD Schema
4
4
 
5
- ## Stop the Metric Madness (And Save Your Budget) 💸
5
+ A Ruby gem that provides comprehensive schema definition, validation, and cost analysis for Datadog StatsD metrics. This library helps teams prevent metric explosion, control costs, and maintain consistent metric naming conventions.
6
6
 
7
- *"With great StatsD power comes great billing responsibility"*
7
+ ## Features
8
8
 
9
- Every engineering team starts the same way with [Datadog custom metrics](https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/?tab=ruby): a few innocent calls to `statsd.increment('user.signup')`, maybe a `statsd.gauge('queue.size', 42)`. Life is good. Metrics are flowing. Dashboards are pretty.
9
+ - **Schema Definition**: Define metric schemas with type safety and validation
10
+ - **Tag Management**: Centralized tag definitions with inheritance and validation
11
+ - **Cost Analysis**: Analyze potential custom metric costs before deployment
12
+ - **Metric Validation**: Runtime validation of metrics against defined schemas
13
+ - **CLI Tools**: Command-line interface for schema analysis and validation
14
+ - **Global Configuration**: Centralized configuration for tags and StatsD clients
10
15
 
11
- Then reality hits. Your Datadog bill explodes 🚀 because:
12
-
13
- - **Marketing** added `statsd.increment('clicks', tags: { campaign_id: campaign.id })` across 10,000 campaigns
14
- - **DevOps** thought `statsd.gauge('memory', tags: { container_id: container.uuid })` was a great idea
15
- - **Frontend** started tracking `statsd.timing('page.load', tags: { user_id: current_user.id })` for 2 million users
16
- - **Everyone** has their own creative naming conventions: `user_signups`, `user.sign.ups`, `users::signups`, `Users.Signups`
16
+ ## Installation
17
17
 
18
- **Congratulations!** 🎉 You now have 50,000+ custom metrics, each [costing real money](https://docs.datadoghq.com/account_management/billing/custom_metrics/), most providing zero actionable insights.
18
+ Add this line to your application's Gemfile:
19
19
 
20
- This gem exists to prevent that chaos (and save your engineering budget).
20
+ ```ruby
21
+ gem 'datadog-statsd-schema'
22
+ ```
21
23
 
22
- ## The Solution: Schema-Driven Metrics
24
+ And then execute:
23
25
 
24
- This gem wraps [dogstatsd-ruby](https://github.com/DataDog/dogstatsd-ruby) with two superpowers:
26
+ ```bash
27
+ bundle install
28
+ ```
25
29
 
26
- 1. **🏷️ Intelligent Tag Merging** - Even without schemas, get consistent tagging across your application
27
- 2. **📋 Schema Validation** - Define your metrics upfront, validate everything, prevent metric explosion
30
+ Or install it yourself as:
28
31
 
29
- Let's see how this works, starting simple and building up...
32
+ ```bash
33
+ gem install datadog-statsd-schema
34
+ ```
30
35
 
31
- ## Quick Start: Better Tags Without Schemas
36
+ ## Quick Start
32
37
 
33
- Even before you define schemas, the `Emitter` class immediately improves your metrics with intelligent tag merging:
38
+ ### Basic Schema Definition
34
39
 
35
40
  ```ruby
36
41
  require 'datadog/statsd/schema'
37
42
 
38
- # Configure global tags that apply to ALL metrics
43
+ # Define your metrics schema
44
+ schema = Datadog::Statsd::Schema.new do
45
+ namespace :web do
46
+ tags do
47
+ tag :environment, values: %w[production staging development]
48
+ tag :service, values: %w[api web worker]
49
+ tag :region, values: %w[us-east-1 us-west-2]
50
+ end
51
+
52
+ metrics do
53
+ counter :requests_total do
54
+ description "Total HTTP requests"
55
+ tags required: [:environment, :service], allowed: [:region]
56
+ end
57
+
58
+ gauge :memory_usage do
59
+ description "Memory usage in bytes"
60
+ tags required: [:environment], allowed: [:service, :region]
61
+ end
62
+
63
+ distribution :request_duration do
64
+ description "Request processing time in milliseconds"
65
+ tags required: [:environment, :service]
66
+ end
67
+ end
68
+ end
69
+ end
70
+ ```
71
+
72
+ ### Using the Emitter with Schema Validation
73
+
74
+ ```ruby
75
+ # Configure global settings
39
76
  Datadog::Statsd::Schema.configure do |config|
40
- config.tags = { env: 'production', service: 'web-app', version: '1.2.3' }
41
77
  config.statsd = Datadog::Statsd.new('localhost', 8125)
78
+ config.schema = schema
79
+ config.tags = { environment: 'production' }
42
80
  end
43
81
 
44
- # Create an emitter for your authentication service
45
- auth_emitter = Datadog::Statsd::Emitter.new(
46
- 'AuthService', # Automatically becomes emitter:auth_service tag
47
- tags: { feature: 'user_auth' } # These tags go on every metric from this emitter
82
+ # Create an emitter with validation
83
+ emitter = Datadog::Statsd::Emitter.new(
84
+ schema: schema,
85
+ validation_mode: :strict # :strict, :warn, or :disabled
48
86
  )
49
87
 
50
- # Send a metric - watch the tag magic happen
51
- auth_emitter.increment('login.success', tags: { method: 'oauth' })
88
+ # Send metrics with automatic validation
89
+ emitter.increment('web.requests_total', tags: { service: 'api', region: 'us-east-1' })
90
+ emitter.gauge('web.memory_usage', 512_000_000, tags: { service: 'api' })
91
+ emitter.distribution('web.request_duration', 45.2, tags: { service: 'api' })
52
92
  ```
53
93
 
54
- **What actually gets sent to Datadog:**
55
- ```ruby
56
- # Metric: auth_service.login.success
57
- # Tags: {
58
- # env: 'production', # From global config
59
- # service: 'web-app', # From global config
60
- # version: '1.2.3', # From global config
61
- # emitter: 'auth_service', # Auto-generated from first argument
62
- # feature: 'user_auth', # From emitter constructor
63
- # method: 'oauth' # From method call
64
- # }
65
- ```
94
+ ## CLI Usage
66
95
 
67
- **Tag Precedence (method tags win):**
68
- - Method-level tags override emitter tags
69
- - Emitter tags override global tags
70
- - Global tags are always included
96
+ The gem provides a command-line interface for analyzing schemas and understanding their cost implications.
71
97
 
72
- This alone prevents the "different tag patterns everywhere" problem. But we're just getting started...
98
+ ### Installation
73
99
 
74
- ## Schema Power: Design Your Metrics, Then Code
100
+ After installing the gem, the `dss` (Datadog StatsD Schema) command will be available:
75
101
 
76
- Here's where this gem really shines. Instead of letting developers create metrics willy-nilly, you define them upfront:
102
+ ```bash
103
+ dss --help
104
+ ```
105
+
106
+ ### Schema Analysis
107
+
108
+ Create a schema file (e.g., `metrics_schema.rb`):
77
109
 
78
110
  ```ruby
79
- # Define what metrics you actually want
80
- user_metrics_schema = Datadog::Statsd::Schema.new do
81
- namespace :users do
82
- # Define the tags you'll actually use (not infinite user_ids!)
83
- tags do
84
- tag :signup_method, values: %w[email oauth google github]
85
- tag :plan_type, values: %w[free premium enterprise]
86
- tag :feature_flag, values: %w[enabled disabled]
87
- end
88
-
111
+ namespace :web do
112
+ tags do
113
+ tag :environment, values: %w[production staging development]
114
+ tag :service, values: %w[api web worker]
115
+ tag :region, values: %w[us-east-1 us-west-2 eu-west-1]
116
+ end
117
+
118
+ namespace :requests do
89
119
  metrics do
90
- # Define exactly which metrics exist and their constraints
91
- counter :signups do
92
- description "New user registrations"
93
- tags required: [:signup_method], allowed: [:plan_type, :feature_flag]
120
+ counter :total do
121
+ description "Total HTTP requests"
122
+ tags required: [:environment, :service], allowed: [:region]
94
123
  end
95
-
96
- gauge :active_sessions do
97
- description "Currently logged in users"
98
- tags allowed: [:plan_type]
124
+
125
+ distribution :duration do
126
+ description "Request processing time in milliseconds"
127
+ inherit_tags "web.requests.total"
99
128
  end
100
129
  end
101
130
  end
131
+
132
+ metrics do
133
+ gauge :memory_usage do
134
+ description "Memory usage in bytes"
135
+ tags required: [:environment], allowed: [:service]
136
+ end
137
+ end
102
138
  end
139
+ ```
103
140
 
104
- # Create an emitter bound to this schema
105
- user_emitter = Datadog::Statsd::Emitter.new(
106
- 'UserService',
107
- schema: user_metrics_schema,
108
- validation_mode: :strict # Explode on invalid metrics (good for development)
109
- )
141
+ Analyze the schema to understand metric costs:
110
142
 
111
- # This works - follows the schema
112
- user_emitter.increment('signups', tags: { signup_method: 'oauth', plan_type: 'premium' })
143
+ ```bash
144
+ dss analyze --file metrics_schema.rb --color
145
+ ```
113
146
 
114
- # This explodes 💥 - 'facebook' not in allowed signup_method values
115
- user_emitter.increment('signups', tags: { signup_method: 'facebook' })
147
+ **Output:**
116
148
 
117
- # This explodes 💥 - 'user_registrations' metric doesn't exist in schema
118
- user_emitter.increment('user_registrations')
149
+ ![output](./docs/img/dss-analyze.png)
119
150
 
120
- # This explodes 💥 - missing required tag signup_method
121
- user_emitter.increment('signups', tags: { plan_type: 'free' })
122
- ```
151
+ This analysis shows that your schema will generate **342 custom metrics** across **16 unique metric names**. Understanding this before deployment helps prevent unexpected Datadog billing surprises.
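+
+ One way to arrive at these totals, assuming the analyzer treats every allowed tag as always present and applies the per-type expansion multipliers listed under "Cost Control and Best Practices" below (1 time series per counter combination, 5 per gauge, 10 per distribution):
+
+ ```ruby
+ # web.requests.total (counter): 3 environments * 3 services * 3 regions
+ 3 * 3 * 3 * 1   # => 27 time series, 1 metric name
+
+ # web.requests.duration (distribution, inherits the same tags)
+ 3 * 3 * 3 * 10  # => 270 time series, 10 metric names
+
+ # web.memory_usage (gauge): 3 environments * 3 services
+ 3 * 3 * 5       # => 45 time series, 5 metric names
+
+ # Total: 27 + 270 + 45 = 342 time series across 1 + 10 + 5 = 16 metric names
+ ```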
123
152
 
124
- **Schema validation catches:**
125
- - ❌ Metrics that don't exist
126
- - ❌ Wrong metric types (counter vs gauge vs distribution)
127
- - ❌ Missing required tags
128
- - ❌ Invalid tag values
129
- - ❌ Tags that aren't allowed on specific metrics
153
+ ## Advanced Features
130
154
 
131
- ## Progressive Examples: Real-World Schemas
155
+ ### Tag Inheritance
132
156
 
133
- ### E-commerce Application Metrics
157
+ Metrics can inherit tag configurations from other metrics to reduce duplication:
134
158
 
135
159
  ```ruby
136
- ecommerce_schema = Datadog::Statsd::Schema.new do
137
- # Global transformers for consistent naming
138
- transformers do
139
- underscore: ->(text) { text.underscore }
140
- downcase: ->(text) { text.downcase }
141
- end
142
-
143
- namespace :ecommerce do
144
- tags do
145
- # Finite set of product categories (not product IDs!)
146
- tag :category, values: %w[electronics clothing books home_garden]
147
-
148
- # Payment methods you actually support
149
- tag :payment_method, values: %w[credit_card paypal apple_pay]
150
-
151
- # Order status progression
152
- tag :status, values: %w[pending processing shipped delivered cancelled]
153
-
154
- # A/B test groups (not test IDs!)
155
- tag :checkout_flow, values: %w[single_page multi_step express]
156
- end
157
-
158
- namespace :orders do
159
- metrics do
160
- counter :created do
161
- description "New orders placed"
162
- tags required: [:category], allowed: [:payment_method, :checkout_flow]
163
- end
164
-
165
- counter :completed do
166
- description "Successfully processed orders"
167
- inherit_tags: "ecommerce.orders.created" # Reuse tag definition
168
- tags required: [:status]
169
- end
170
-
171
- distribution :value do
172
- description "Order value distribution in cents"
173
- units "cents"
174
- tags required: [:category], allowed: [:payment_method]
175
- end
176
-
177
- gauge :processing_queue_size do
178
- description "Orders waiting to be processed"
179
- # No tags - just a simple queue size metric
180
- end
181
- end
160
+ namespace :api do
161
+ metrics do
162
+ counter :requests_total do
163
+ tags required: [:environment, :service], allowed: [:region]
182
164
  end
183
-
184
- namespace :inventory do
185
- metrics do
186
- gauge :stock_level do
187
- description "Current inventory levels"
188
- tags required: [:category]
189
- end
190
-
191
- counter :restocked do
192
- description "Inventory replenishment events"
193
- tags required: [:category]
194
- end
195
- end
165
+
166
+ # Inherits environment, service (required) and region (allowed) from requests_total
167
+ distribution :request_duration do
168
+ inherit_tags "api.requests_total"
169
+ tags required: [:endpoint] # Adds endpoint as additional required tag
196
170
  end
197
171
  end
198
172
  end
199
-
200
- # Usage in your order processing service
201
- order_processor = Datadog::Statsd::Emitter.new(
202
- 'OrderProcessor',
203
- schema: ecommerce_schema,
204
- metric: 'ecommerce.orders', # Prefix for all metrics from this emitter
205
- tags: { checkout_flow: 'single_page' }
206
- )
207
-
208
- # Process an order - clean, validated metrics
209
- order_processor.increment('created', tags: {
210
- category: 'electronics',
211
- payment_method: 'credit_card'
212
- })
213
-
214
- order_processor.distribution('value', 15_99, tags: {
215
- category: 'electronics',
216
- payment_method: 'credit_card'
217
- })
218
-
219
- order_processor.gauge('processing_queue_size', 12)
220
173
  ```
221
174
 
222
- ### API Performance Monitoring
175
+ ### Nested Namespaces
176
+
177
+ Organize metrics hierarchically with nested namespaces:
223
178
 
224
179
  ```ruby
225
- api_schema = Datadog::Statsd::Schema.new do
226
- namespace :api do
180
+ namespace :application do
181
+ tags do
182
+ tag :environment, values: %w[prod staging dev]
183
+ end
184
+
185
+ namespace :database do
227
186
  tags do
228
- # HTTP methods you actually handle
229
- tag :method, values: %w[GET POST PUT PATCH DELETE]
230
-
231
- # Standardized controller names (transformed to snake_case)
232
- tag :controller,
233
- values: %r{^[a-z_]+$}, # Regex validation
234
- transform: [:underscore, :downcase]
235
-
236
- # Standard HTTP status code ranges
237
- tag :status_class, values: %w[2xx 3xx 4xx 5xx]
238
- tag :status_code,
239
- type: :integer,
240
- validate: ->(code) { (100..599).include?(code) }
241
-
242
- # Feature flags for A/B testing
243
- tag :feature_version, values: %w[v1 v2 experimental]
187
+ tag :table_name, values: %w[users orders products]
244
188
  end
245
-
246
- namespace :requests do
247
- metrics do
248
- counter :total do
249
- description "Total API requests"
250
- tags required: [:method, :controller],
251
- allowed: [:status_class, :feature_version]
252
- end
253
-
254
- distribution :duration do
255
- description "Request processing time"
256
- units "milliseconds"
257
- inherit_tags: "api.requests.total"
258
- tags required: [:status_code]
259
- end
260
-
261
- histogram :response_size do
262
- description "Response payload size distribution"
263
- units "bytes"
264
- tags required: [:method, :controller]
265
- end
266
- end
267
- end
268
-
269
- namespace :errors do
270
- metrics do
271
- counter :total do
272
- description "API errors by type"
273
- tags required: [:controller, :status_code]
274
- end
275
- end
189
+
190
+ metrics do
191
+ counter :queries_total
192
+ distribution :query_duration
276
193
  end
277
194
  end
278
- end
279
195
 
280
- # Usage in Rails controller concern
281
- class ApplicationController < ActionController::Base
282
- before_action :setup_metrics
283
- after_action :track_request
284
-
285
- private
286
-
287
- def setup_metrics
288
- @api_metrics = Datadog::Statsd::Emitter.new(
289
- self.class.name,
290
- schema: api_schema,
291
- metric: 'api',
292
- validation_mode: Rails.env.production? ? :warn : :strict
293
- )
294
- end
295
-
296
- def track_request
297
- controller_name = self.class.name.gsub('Controller', '').underscore
298
-
299
- @api_metrics.increment('requests.total', tags: {
300
- method: request.method,
301
- controller: controller_name,
302
- status_class: "#{response.status.to_s[0]}xx"
303
- })
304
-
305
- @api_metrics.distribution('requests.duration',
306
- request_duration_ms,
307
- tags: {
308
- method: request.method,
309
- controller: controller_name,
310
- status_code: response.status
311
- }
312
- )
196
+ namespace :cache do
197
+ tags do
198
+ tag :cache_type, values: %w[redis memcached]
199
+ end
200
+
201
+ metrics do
202
+ counter :hits_total
203
+ counter :misses_total
204
+ end
313
205
  end
314
206
  end
315
207
  ```
316
208
 
317
- ## Validation Modes: From Development to Production
209
+ ### Validation Modes
318
210
 
319
- The gem supports different validation strategies for different environments:
211
+ Control how validation errors are handled:
320
212
 
321
213
  ```ruby
322
- # Development: Explode on any schema violations
323
- dev_emitter = Datadog::Statsd::Emitter.new(
324
- 'MyService',
325
- schema: my_schema,
326
- validation_mode: :strict # Raises exceptions
327
- )
214
+ # Strict mode: Raises exceptions on validation failures
215
+ emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :strict)
328
216
 
329
- # Staging: Log warnings but continue
330
- staging_emitter = Datadog::Statsd::Emitter.new(
331
- 'MyService',
332
- schema: my_schema,
333
- validation_mode: :warn # Prints to stderr, continues execution
334
- )
217
+ # Warn mode: Logs warnings but continues execution
218
+ emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :warn)
335
219
 
336
- # Production: Drop invalid metrics silently
337
- prod_emitter = Datadog::Statsd::Emitter.new(
338
- 'MyService',
339
- schema: my_schema,
340
- validation_mode: :drop # Silently drops invalid metrics
341
- )
342
-
343
- # Emergency: Turn off validation entirely
344
- emergency_emitter = Datadog::Statsd::Emitter.new(
345
- 'MyService',
346
- schema: my_schema,
347
- validation_mode: :off # No validation at all
348
- )
220
+ # Disabled: No validation (production default)
221
+ emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :disabled)
349
222
  ```
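+
+ For example (a minimal sketch reusing the `web` schema from the Quick Start; the exact error classes and messages are not shown here):
+
+ ```ruby
+ # 'backend' is not among the allowed values for the :service tag
+ strict = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :strict)
+ strict.increment('web.requests_total', tags: { service: 'backend' })
+ # => raises a validation error
+
+ relaxed = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :warn)
+ relaxed.increment('web.requests_total', tags: { service: 'backend' })
+ # => logs a warning and continues execution
+ ```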
350
223
 
351
- ## Best Practices: Designing Schemas That Scale
224
+ ### Global Configuration
352
225
 
353
- ### 🎯 Design Metrics Before Code
226
+ Set up global defaults for your application:
354
227
 
355
228
  ```ruby
356
- # Good: Design session like this
357
- session_schema = Datadog::Statsd::Schema.new do
358
- namespace :user_sessions do
359
- tags do
360
- tag :session_type, values: %w[web mobile api]
361
- tag :auth_method, values: %w[password oauth sso]
362
- tag :plan_tier, values: %w[free premium enterprise]
363
- end
364
-
365
- metrics do
366
- counter :started do
367
- description "User sessions initiated"
368
- tags required: [:session_type], allowed: [:auth_method, :plan_tier]
369
- end
370
-
371
- counter :ended do
372
- description "User sessions terminated"
373
- tags required: [:session_type, :auth_method]
374
- end
375
-
376
- distribution :duration do
377
- description "How long sessions last"
378
- units "minutes"
379
- tags required: [:session_type]
380
- end
381
- end
382
- end
229
+ Datadog::Statsd::Schema.configure do |config|
230
+ config.statsd = Datadog::Statsd.new(
231
+ ENV['DATADOG_AGENT_HOST'] || 'localhost',
232
+ ENV['DATADOG_AGENT_PORT'] || 8125
233
+ )
234
+ config.schema = schema
235
+ config.tags = {
236
+ environment: ENV['RAILS_ENV'],
237
+ service: 'my-application',
238
+ version: ENV['APP_VERSION']
239
+ }
383
240
  end
384
241
 
385
- # Bad: Don't do this
386
- statsd.increment('user_login', tags: { user_id: user.id }) # Infinite cardinality!
387
- statsd.increment('session_start_web_premium_oauth') # Explosion of metric names!
388
- statsd.gauge('active_users_on_mobile_free_plan_from_usa', 1000) # Way too specific!
242
+ # These global tags are automatically added to all metrics
243
+ emitter = Datadog::Statsd::Emitter.new
244
+ emitter.increment('user.signup') # Automatically includes global tags
389
245
  ```
390
246
 
391
- ### 🏷️ Tag Strategy: Finite and Purposeful
247
+ ## Cost Control and Best Practices
248
+
249
+ ### Understanding Metric Expansion
250
+
251
+ Different metric types create different numbers of time series:
252
+
253
+ - **Counter/Set**: 1 time series per unique tag combination
254
+ - **Gauge**: 5 time series (count, min, max, sum, avg)
255
+ - **Distribution/Histogram**: 10 time series (count, min, max, sum, avg, p50, p75, p90, p95, p99)
256
+
257
+ ### Tag Value Limits
258
+
259
+ Be mindful of tag cardinality:
392
260
 
393
261
  ```ruby
394
- # Good: Finite tag values that enable grouping/filtering
395
- tag :plan_type, values: %w[free premium enterprise]
396
- tag :region, values: %w[us-east us-west eu-central ap-southeast]
397
- tag :feature_flag, values: %w[enabled disabled control]
398
-
399
- # Bad: Infinite or high-cardinality tags
400
- tag :user_id # Millions of possible values!
401
- tag :session_id # Unique every time!
402
- tag :timestamp # Infinite values!
403
- tag :request_path # Thousands of unique URLs!
262
+ # High cardinality - avoid
263
+ tag :user_id, type: :string # Could be millions of values
264
+
265
+ # Better approach - use bucketing
266
+ tag :user_tier, values: %w[free premium enterprise]
267
+ tag :user_cohort, values: %w[new_user returning_user power_user]
404
268
  ```
405
269
 
406
- ### 📊 Metric Types: Choose Wisely
270
+ > [!CAUTION]
271
+ > Be mindful of the number of tags and tag values your schema allows.
272
+
273
+ ### Schema Validation
274
+
275
+ Always validate your schema before deployment:
407
276
 
408
277
  ```ruby
409
- namespace :email_service do
410
- metrics do
411
- # ✅ Use counters for events that happen
412
- counter :sent do
413
- description "Emails successfully sent"
414
- end
415
-
416
- # ✅ Use gauges for current state/levels
417
- gauge :queue_size do
418
- description "Emails waiting to be sent"
419
- end
420
-
421
- # ✅ Use distributions for value analysis (careful - creates 10 metrics!)
422
- distribution :delivery_time do
423
- description "Time from send to delivery"
424
- units "seconds"
425
- end
426
-
427
- # ⚠️ Use histograms sparingly (creates 5 metrics each)
428
- histogram :processing_time do
429
- description "Email processing duration"
430
- units "milliseconds"
431
- end
432
-
433
- # ⚠️ Use sets very carefully (tracks unique values)
434
- set :unique_recipients do
435
- description "Unique email addresses receiving mail"
436
- end
437
- end
278
+ # Check for common issues
279
+ errors = schema.validate
280
+ if errors.any?
281
+ puts "Schema validation errors:"
282
+ errors.each { |error| puts " - #{error}" }
438
283
  end
439
284
  ```
440
285
 
441
- ### 🔄 Schema Evolution: Plan for Change
286
+ ## Integration Examples
287
+
288
+ ### Sidekiq Job Monitoring
289
+
290
+ Imagine that we are building a Rails application and prefer to roll our own tracking of jobs performed, failed, and succeeded, as well as their duration.
291
+
292
+ > [!TIP]
293
+ > A very similar approach would work for tracking, e.g., requests coming into `ApplicationController` subclasses.
294
+
295
+ First, let's initialize the schema from a file (we'll dive into the schema a bit later):
442
296
 
443
297
  ```ruby
444
- # ✅ Good: Use inherit_tags to reduce duplication
445
- base_schema = Datadog::Statsd::Schema.new do
446
- namespace :payments do
447
- tags do
448
- tag :payment_method, values: %w[card bank_transfer crypto]
449
- tag :currency, values: %w[USD EUR GBP JPY]
450
- tag :region, values: %w[north_america europe asia]
451
- end
452
-
453
- metrics do
454
- counter :initiated do
455
- description "Payment attempts started"
456
- tags required: [:payment_method], allowed: [:currency, :region]
457
- end
458
-
459
- counter :completed do
460
- description "Successful payments"
461
- inherit_tags: "payments.initiated" # Reuses the tag configuration
462
- end
463
-
464
- counter :failed do
465
- description "Failed payment attempts"
466
- inherit_tags: "payments.initiated"
467
- tags required: [:failure_reason] # Add specific tags as needed
468
- end
469
- end
470
- end
298
+ # config/initializers/datadog_statsd.rb
299
+ SIDEKIQ_SCHEMA = Datadog::Statsd::Schema.load_file(Rails.root.join('config/metrics/sidekiq.rb'))
300
+
301
+ Datadog::Statsd::Schema.configure do |config|
302
+ config.statsd = Datadog::Statsd.new
303
+ config.schema = SIDEKIQ_SCHEMA
304
+ config.tags = {
305
+ environment: Rails.env,
306
+ service: 'my-rails-app',
307
+ version: ENV['DEPLOY_SHA']
308
+ }
471
309
  end
472
310
  ```
473
311
 
474
- ### 🏗️ Namespace Organization
312
+ #### Adding StatsD Tracking to a Worker
313
+
314
+ In this example, a job monitors itself by submitting the relevant StatsD metrics:
475
315
 
476
316
  ```ruby
477
- # ✅ Good: Hierarchical organization by domain
478
- app_schema = Datadog::Statsd::Schema.new do
479
- namespace :ecommerce do
480
- namespace :orders do
481
- # Order-related metrics
482
- end
483
-
484
- namespace :inventory do
485
- # Stock and fulfillment metrics
486
- end
317
+ class OrderProcessingJob
318
+ QUEUE = 'orders'.freeze
319
+
320
+ include Sidekiq::Job
321
+ sidekiq_options queue: QUEUE
322
+
323
+ def perform(order_id)
324
+ start_time = Time.current
487
325
 
488
- namespace :payments do
489
- # Payment processing metrics
326
+ begin
327
+ process_order(order_id)
328
+ emitter.increment('order_processing.success')
329
+ rescue => error
330
+ emitter.increment(
331
+ 'order_processing.failure',
332
+ tags: { error_type: error.class.name }
333
+ )
334
+ raise
335
+ ensure
336
+ duration = Time.current - start_time
337
+ emitter.distribution(
338
+ 'order_processing.duration',
339
+ duration * 1000
340
+ )
490
341
  end
491
342
  end
492
-
493
- namespace :infrastructure do
494
- namespace :database do
495
- # DB performance metrics
496
- end
497
-
498
- namespace :cache do
499
- # Redis/Memcached metrics
500
- end
343
+
344
+ # Create an instance of an Emitter equipped with our metric
345
+ # prefix and the tags.
346
+ def emitter
347
+ @emitter ||= Datadog::Statsd::Emitter.new(
348
+ self,
349
+ metric: 'job',
350
+ tags: { queue: QUEUE }
351
+ )
501
352
  end
502
353
  end
503
-
504
- # ❌ Bad: Flat namespace chaos
505
- # orders.created
506
- # orders_completed
507
- # order::cancelled
508
- # INVENTORY_LOW
509
- # db.query.time
510
- # cache_hits
511
354
  ```
512
355
 
513
- ## Advanced Features
356
+ The above Emitter will generate the following metrics:
514
357
 
515
- ### Global Configuration
358
+ * `job.order_processing.success` (counter)
359
+ * `job.order_processing.failure` (counter)
360
+ * `job.order_processing.duration.count`
361
+ * `job.order_processing.duration.min`
362
+ * `job.order_processing.duration.max`
363
+ * `job.order_processing.duration.sum`
364
+ * `job.order_processing.duration.avg`
516
365
 
517
- ```ruby
518
- # Set up global configuration in your initializer
519
- Datadog::Statsd::Schema.configure do |config|
520
- # Global tags applied to ALL metrics
521
- config.tags = {
522
- env: Rails.env,
523
- service: 'web-app',
524
- version: ENV['GIT_SHA']&.first(7),
525
- datacenter: ENV['DATACENTER'] || 'us-east-1'
526
- }
527
-
528
- # The actual StatsD client
529
- config.statsd = Datadog::Statsd.new(
530
- ENV['STATSD_HOST'] || 'localhost',
531
- ENV['STATSD_PORT'] || 8125,
532
- namespace: ENV['STATSD_NAMESPACE'],
533
- tags: [], # Don't double-up tags here
534
- delay_serialization: true
535
- )
536
- end
537
- ```
538
366
 
539
- ### Tag Transformers
367
+ However, you can see that doing this in each job is not practical. The first question on our mind, then, is: how do we make this behavior apply automatically to every job we create?
368
+
369
+ #### Tracking All Sidekiq Workers At Once
370
+
371
+ The question posed above is: can we come up with a class design pattern that lets us write this code once and forget about it?
372
+
373
+ **Let's take Ruby's metaprogramming for a spin.**
374
+
375
+ One of the most flexible ways to add functionality to all jobs is to create a module that job classes include *instead of* the implementation-specific `Sidekiq::Job`.
376
+
377
+ So let's create our own module, call it `BackgroundWorker`, and include it into our job classes instead. Once it exists, we'd like our job classes to look like this:
540
378
 
541
379
  ```ruby
542
- schema_with_transforms = Datadog::Statsd::Schema.new do
543
- transformers do
544
- underscore: ->(text) { text.underscore }
545
- downcase: ->(text) { text.downcase }
546
- truncate: ->(text) { text.first(20) }
547
- end
380
+ class OrderProcessingJob
381
+ include BackgroundWorker
382
+ sidekiq_options queue: 'orders'
548
383
 
549
- namespace :user_actions do
550
- tags do
551
- # Controller names get normalized automatically
552
- tag :controller,
553
- values: %r{^[a-z_]+$},
554
- transform: [:underscore, :downcase] # Applied in order
555
-
556
- # Action names also get cleaned up
557
- tag :action,
558
- values: %w[index show create update destroy],
559
- transform: [:downcase]
560
- end
384
+ def perform(order_id)
385
+ # perform the work for the given order ID
561
386
  end
562
387
  end
563
-
564
- # "UserSettingsController" becomes "user_settings_controller"
565
- # "CreateUser" becomes "create_user"
566
388
  ```
567
389
 
568
- ### Complex Validation
390
+ So our module, when included, should:
391
+
392
+ * include `Sidekiq::Job` as well
393
+ * define the `emitter` method so that it's available to all Job instances
394
+ * wrap the `perform` method in an exception-handling block that emits the corresponding metrics, as in the example above.
395
+
396
+ The only tricky part is the last one: wrapping the `perform` method in shared code. This used to require monkey patching, but no more. These days Ruby gives us `Module#prepend`, which does exactly what we need.
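+
+ If `Module#prepend` is unfamiliar, here is a minimal, self-contained illustration (the class and module names are made up): a prepended module sits *in front of* the class in the ancestor chain, so its `perform` runs first and `super` dispatches to the class's own `perform`.
+
+ ```ruby
+ module Instrumented
+   def perform(*args)
+     puts "before"
+     super # calls MyJob#perform
+     puts "after"
+   end
+ end
+
+ class MyJob
+   prepend Instrumented
+
+   def perform(*args)
+     puts "working on #{args.inspect}"
+   end
+ end
+
+ MyJob.ancestors.first(2) # => [Instrumented, MyJob]
+ MyJob.new.perform(42)    # prints "before", "working on [42]", then "after"
+ ```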
397
+
398
+ The final adjustment we'd like to make concerns metric naming.
399
+
400
+ While metrics such as:
401
+
402
+ * `job.order_processing.success` (counter)
403
+ * `job.order_processing.failure` (counter)
404
+
405
+ are easy to understand, a question arises: do we really need to embed the job's class name into the metric name, or is there a better way?
406
+
407
+ The truth is, there is! Why create seven unique metrics **per job** when we can submit the same metrics for all jobs, tagged with the job's class name as the "emitter" source?
408
+
409
+ #### A Module to Include into Background Workers
410
+
411
+ So, without further ado, here we go:
569
412
 
570
413
  ```ruby
571
- advanced_schema = Datadog::Statsd::Schema.new do
572
- namespace :financial do
573
- tags do
574
- # Custom validation with lambdas
575
- tag :amount_bucket,
576
- validate: ->(value) { %w[small medium large].include?(value) }
577
-
578
- # Regex validation for IDs
579
- tag :transaction_type,
580
- values: %r{^[A-Z]{2,4}_[0-9]{3}$} # Like "AUTH_001", "REFUND_042"
581
-
582
- # Type validation
583
- tag :user_segment,
584
- type: :integer,
585
- validate: ->(segment) { (1..10).include?(segment) }
414
+ module BackgroundWorker
415
+ class << self
416
+ def included(klass)
417
+ klass.include(Sidekiq::Job)
418
+ klass.prepend(InstanceMethods)
419
+ end
420
+
421
+ module InstanceMethods
422
+ def perform(...)
423
+ start_time = Time.current
424
+ tags = {}
425
+ error = nil
426
+
427
+ begin
428
+ super(...)
429
+ emitter.increment("success")
430
+ rescue => error
431
+ tags.merge!(error_type: error.class.name)
432
+ emitter.increment("failure", tags:)
433
+ raise
434
+ ensure
435
+ duration = Time.current - start_time
436
+ emitter.distribution(
437
+ "duration",
438
+ duration * 1000,
439
+ tags:
440
+ )
441
+ end
442
+ end
443
+
444
+ def emitter
445
+ @emitter ||= Datadog::Statsd::Emitter.new(
446
+ metric: 'sidekiq.job',
447
+ tags: {
448
+ queue: self.class.sidekiq_options["queue"], # sidekiq_options lives on the class and uses string keys
449
+ job: self.class.name
450
+ }
451
+ )
452
+ end
586
453
  end
587
454
  end
588
455
  end
589
456
  ```
590
457
 
591
- ### Loading Schemas from Files
458
+ > [!TIP]
459
+ > In a nutshell, we created a reusable module that, when included into any job class, provides reliable tracking of job successes, failures, and duration. The duration can be graphed for successful jobs only by excluding data points that carry the `error_type` tag.
592
460
 
593
- ```ruby
594
- # config/metrics_schema.rb
595
- Datadog::Statsd::Schema.new do
596
- namespace :my_app do
597
- # ... schema definition
598
- end
599
- end
461
+ So the above strategy will generate the following metrics for **all** jobs:
600
462
 
601
- # In your application
602
- schema = Datadog::Statsd::Schema.load_file('config/metrics_schema.rb')
603
- ```
604
464
 
605
- ## Installation
465
+ * `sidekiq.job.success` (counter)
466
+ * `sidekiq.job.failure` (counter)
467
+ * `sidekiq.job.duration.count`
468
+ * `sidekiq.job.duration.min`
469
+ * `sidekiq.job.duration.max`
470
+ * `sidekiq.job.duration.sum`
471
+ * `sidekiq.job.duration.avg`
606
472
 
607
- Add to your Gemfile:
473
+ that will be tagged with the following tags:
608
474
 
609
- ```ruby
610
- gem 'datadog-statsd-schema'
611
- ```
475
+ * `queue: ... `
476
+ * `job: { 'OrderProcessingJob' | ... }`
477
+ * `environment: { "production" | "staging" | "development" }`
478
+ * `service: 'my-rails-app'`
479
+ * `version: { "git-sha" }`
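+
+ Finally, the `config/metrics/sidekiq.rb` file loaded by the initializer above was promised "a bit later" but never shown. Here is a minimal sketch of what it could contain, assuming `Schema.load_file` accepts the same bare DSL used for the `dss analyze` schema file earlier; the tag values are illustrative:
+
+ ```ruby
+ # config/metrics/sidekiq.rb (hypothetical)
+ namespace :sidekiq do
+   namespace :job do
+     tags do
+       tag :queue, values: %w[default orders mailers]
+       tag :job, type: :string        # one value per job class
+       tag :error_type, type: :string # exception class name, present only on failures
+     end
+
+     metrics do
+       counter :success do
+         description "Jobs that completed successfully"
+         tags required: [:queue, :job]
+       end
+
+       counter :failure do
+         description "Jobs that raised an exception"
+         tags required: [:queue, :job], allowed: [:error_type]
+       end
+
+       distribution :duration do
+         description "Job duration in milliseconds"
+         tags required: [:queue, :job], allowed: [:error_type]
+       end
+     end
+   end
+ end
+ ```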
480
+
481
+ ## Development
612
482
 
613
- Or install directly:
483
+ After checking out the repo, run:
614
484
 
615
485
  ```bash
616
- gem install datadog-statsd-schema
486
+ bin/setup # Install dependencies
487
+ bundle exec rspec # Run Specs
488
+ bundle exec rubocop # Run Rubocop
617
489
  ```
618
490
 
619
- ## The Bottom Line
620
-
621
- This gem transforms Datadog custom metrics from a "wild west" free-for-all into a disciplined, cost-effective observability strategy:
622
-
623
- - **🎯 Intentional Metrics**: Define what you measure before you measure it
624
- - **💰 Cost Control**: Prevent infinite cardinality and metric explosion
625
- - **🏷️ Consistent Tagging**: Global and hierarchical tag management
626
- - **🔍 Better Insights**: Finite tag values enable proper aggregation and analysis
627
- - **👥 Team Alignment**: Schema serves as documentation and contract
491
+ To install this gem onto your local machine:
628
492
 
629
- Stop the metric madness. Start with a schema.
630
-
631
- ---
493
+ ```bash
494
+ bundle exec rake install
495
+ ```
632
496
 
633
497
  ## Contributing
634
498
 
635
- Bug reports and pull requests are welcome on GitHub at [https://github.com/kigster/datadog-statsd-schema](https://github.com/kigster/datadog-statsd-schema)
499
+ Bug reports and pull requests are welcome on GitHub at https://github.com/kigster/datadog-statsd-schema.
636
500
 
637
501
  ## License
638
502