datadog-statsd-schema 0.1.2 → 0.2.0

data/README.md CHANGED
@@ -1,638 +1,410 @@
1
1
  [![RSpec and Rubocop](https://github.com/kigster/datadog-statsd-schema/actions/workflows/ruby.yml/badge.svg)](https://github.com/kigster/datadog-statsd-schema/actions/workflows/ruby.yml)
2
2
 
3
- # Datadog::Statsd::Schema
3
+ # Datadog StatsD Schema
4
4
 
5
- ## Stop the Metric Madness (And Save Your Budget) 💸
5
+ A Ruby gem that provides comprehensive schema definition, validation, and cost analysis for Datadog StatsD metrics. This library helps teams prevent metric explosion, control costs, and maintain consistent metric naming conventions.
6
6
 
7
- *"With great StatsD power comes great billing responsibility"*
7
+ ## Features
8
8
 
9
- Every engineering team starts the same way with [Datadog custom metrics](https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/?tab=ruby): a few innocent calls to `statsd.increment('user.signup')`, maybe a `statsd.gauge('queue.size', 42)`. Life is good. Metrics are flowing. Dashboards are pretty.
9
+ - **Schema Definition**: Define metric schemas with type safety and validation
10
+ - **Tag Management**: Centralized tag definitions with inheritance and validation
11
+ - **Cost Analysis**: Analyze potential custom metric costs before deployment
12
+ - **Metric Validation**: Runtime validation of metrics against defined schemas
13
+ - **CLI Tools**: Command-line interface for schema analysis and validation
14
+ - **Global Configuration**: Centralized configuration for tags and StatsD clients
10
15
 
11
- Then reality hits. Your Datadog bill explodes 🚀 because:
12
-
13
- - **Marketing** added `statsd.increment('clicks', tags: { campaign_id: campaign.id })` across 10,000 campaigns
14
- - **DevOps** thought `statsd.gauge('memory', tags: { container_id: container.uuid })` was a great idea
15
- - **Frontend** started tracking `statsd.timing('page.load', tags: { user_id: current_user.id })` for 2 million users
16
- - **Everyone** has their own creative naming conventions: `user_signups`, `user.sign.ups`, `users::signups`, `Users.Signups`
16
+ ## Installation
17
17
 
18
- **Congratulations!** 🎉 You now have 50,000+ custom metrics, each [costing real money](https://docs.datadoghq.com/account_management/billing/custom_metrics/), most providing zero actionable insights.
18
+ Add this line to your application's Gemfile:
19
19
 
20
- This gem exists to prevent that chaos (and save your engineering budget).
20
+ ```ruby
21
+ gem 'datadog-statsd-schema'
22
+ ```
21
23
 
22
- ## The Solution: Schema-Driven Metrics
24
+ And then execute:
23
25
 
24
- This gem wraps [dogstatsd-ruby](https://github.com/DataDog/dogstatsd-ruby) with two superpowers:
26
+ ```bash
27
+ bundle install
28
+ ```
25
29
 
26
- 1. **🏷️ Intelligent Tag Merging** - Even without schemas, get consistent tagging across your application
27
- 2. **📋 Schema Validation** - Define your metrics upfront, validate everything, prevent metric explosion
30
+ Or install it yourself as:
28
31
 
29
- Let's see how this works, starting simple and building up...
32
+ ```bash
33
+ gem install datadog-statsd-schema
34
+ ```
30
35
 
31
- ## Quick Start: Better Tags Without Schemas
36
+ ## Quick Start
32
37
 
33
- Even before you define schemas, the `Emitter` class immediately improves your metrics with intelligent tag merging:
38
+ ### Basic Schema Definition
34
39
 
35
40
  ```ruby
36
41
  require 'datadog/statsd/schema'
37
42
 
38
- # Configure global tags that apply to ALL metrics
43
+ # Define your metrics schema
44
+ schema = Datadog::Statsd::Schema.new do
45
+ namespace :web do
46
+ tags do
47
+ tag :environment, values: %w[production staging development]
48
+ tag :service, values: %w[api web worker]
49
+ tag :region, values: %w[us-east-1 us-west-2]
50
+ end
51
+
52
+ metrics do
53
+ counter :requests_total do
54
+ description "Total HTTP requests"
55
+ tags required: [:environment, :service], allowed: [:region]
56
+ end
57
+
58
+ gauge :memory_usage do
59
+ description "Memory usage in bytes"
60
+ tags required: [:environment], allowed: [:service, :region]
61
+ end
62
+
63
+ distribution :request_duration do
64
+ description "Request processing time in milliseconds"
65
+ tags required: [:environment, :service]
66
+ end
67
+ end
68
+ end
69
+ end
70
+ ```
71
+
72
+ ### Using the Emitter with Schema Validation
73
+
74
+ ```ruby
75
+ # Configure global settings
39
76
  Datadog::Statsd::Schema.configure do |config|
40
- config.tags = { env: 'production', service: 'web-app', version: '1.2.3' }
41
77
  config.statsd = Datadog::Statsd.new('localhost', 8125)
78
+ config.schema = schema
79
+ config.tags = { environment: 'production' }
42
80
  end
43
81
 
44
- # Create an emitter for your authentication service
45
- auth_emitter = Datadog::Statsd::Emitter.new(
46
- 'AuthService', # Automatically becomes emitter:auth_service tag
47
- tags: { feature: 'user_auth' } # These tags go on every metric from this emitter
82
+ # Create an emitter with validation
83
+ emitter = Datadog::Statsd::Emitter.new(
84
+ schema: schema,
85
+ validation_mode: :strict # :strict, :warn, or :disabled
48
86
  )
49
87
 
50
- # Send a metric - watch the tag magic happen
51
- auth_emitter.increment('login.success', tags: { method: 'oauth' })
88
+ # Send metrics with automatic validation
89
+ emitter.increment('web.requests_total', tags: { service: 'api', region: 'us-east-1' })
90
+ emitter.gauge('web.memory_usage', 512_000_000, tags: { service: 'api' })
91
+ emitter.distribution('web.request_duration', 45.2, tags: { service: 'api' })
52
92
  ```
53
93
 
54
- **What actually gets sent to Datadog:**
55
- ```ruby
56
- # Metric: auth_service.login.success
57
- # Tags: {
58
- # env: 'production', # From global config
59
- # service: 'web-app', # From global config
60
- # version: '1.2.3', # From global config
61
- # emitter: 'auth_service', # Auto-generated from first argument
62
- # feature: 'user_auth', # From emitter constructor
63
- # method: 'oauth' # From method call
64
- # }
65
- ```
94
+ ## CLI Usage
95
+
96
+ The gem provides a command-line interface for analyzing schemas and understanding their cost implications.
97
+
98
+ ### Installation
66
99
 
67
- **Tag Precedence (method tags win):**
68
- - Method-level tags override emitter tags
69
- - Emitter tags override global tags
70
- - Global tags are always included
100
+ After installing the gem, the `dss` (Datadog StatsD Schema) command will be available:
71
101
 
72
- This alone prevents the "different tag patterns everywhere" problem. But we're just getting started...
102
+ ```bash
103
+ dss --help
104
+ ```
73
105
 
74
- ## Schema Power: Design Your Metrics, Then Code
106
+ ### Schema Analysis
75
107
 
76
- Here's where this gem really shines. Instead of letting developers create metrics willy-nilly, you define them upfront:
108
+ Create a schema file (e.g., `metrics_schema.rb`):
77
109
 
78
110
  ```ruby
79
- # Define what metrics you actually want
80
- user_metrics_schema = Datadog::Statsd::Schema.new do
81
- namespace :users do
82
- # Define the tags you'll actually use (not infinite user_ids!)
83
- tags do
84
- tag :signup_method, values: %w[email oauth google github]
85
- tag :plan_type, values: %w[free premium enterprise]
86
- tag :feature_flag, values: %w[enabled disabled]
87
- end
88
-
111
+ namespace :web do
112
+ tags do
113
+ tag :environment, values: %w[production staging development]
114
+ tag :service, values: %w[api web worker]
115
+ tag :region, values: %w[us-east-1 us-west-2 eu-west-1]
116
+ end
117
+
118
+ namespace :requests do
89
119
  metrics do
90
- # Define exactly which metrics exist and their constraints
91
- counter :signups do
92
- description "New user registrations"
93
- tags required: [:signup_method], allowed: [:plan_type, :feature_flag]
120
+ counter :total do
121
+ description "Total HTTP requests"
122
+ tags required: [:environment, :service], allowed: [:region]
94
123
  end
95
-
96
- gauge :active_sessions do
97
- description "Currently logged in users"
98
- tags allowed: [:plan_type]
124
+
125
+ distribution :duration do
126
+ description "Request processing time in milliseconds"
127
+ inherit_tags "web.requests.total"
99
128
  end
100
129
  end
101
130
  end
102
- end
103
-
104
- # Create an emitter bound to this schema
105
- user_emitter = Datadog::Statsd::Emitter.new(
106
- 'UserService',
107
- schema: user_metrics_schema,
108
- validation_mode: :strict # Explode on invalid metrics (good for development)
109
- )
110
131
 
111
- # This works - follows the schema
112
- user_emitter.increment('signups', tags: { signup_method: 'oauth', plan_type: 'premium' })
132
+ metrics do
133
+ gauge :memory_usage do
134
+ description "Memory usage in bytes"
135
+ tags required: [:environment], allowed: [:service]
136
+ end
137
+ end
138
+ end
139
+ ```
113
140
 
114
- # This explodes 💥 - 'facebook' not in allowed signup_method values
115
- user_emitter.increment('signups', tags: { signup_method: 'facebook' })
141
+ Analyze the schema to understand metric costs:
116
142
 
117
- # This explodes 💥 - 'user_registrations' metric doesn't exist in schema
118
- user_emitter.increment('user_registrations')
143
+ ```bash
144
+ dss analyze --file metrics_schema.rb
145
+ ```
119
146
 
120
- # This explodes 💥 - missing required tag signup_method
121
- user_emitter.increment('signups', tags: { plan_type: 'free' })
147
+ **Output:**
148
+ ```
149
+ ┌──────────────────────────────────────────────────────────────────────────────────────────────┐
150
+ │ Detailed Metric Analysis: │
151
+ └──────────────────────────────────────────────────────────────────────────────────────────────┘
152
+
153
+ • gauge('web.memory_usage')
154
+ Expanded names:
155
+ • web.memory_usage.count
156
+ • web.memory_usage.min
157
+ • web.memory_usage.max
158
+ • web.memory_usage.sum
159
+ • web.memory_usage.avg
160
+
161
+ Unique tags: 2
162
+ Total tag values: 6
163
+ Possible combinations: 45
164
+
165
+ ──────────────────────────────────────────────────────────────────────────────────────────────
166
+
167
+ • counter('web.requests.total')
168
+
169
+ Unique tags: 3
170
+ Total tag values: 9
171
+ Possible combinations: 27
172
+
173
+ ──────────────────────────────────────────────────────────────────────────────────────────────
174
+
175
+ • distribution('web.requests.duration')
176
+ Expanded names:
177
+ • web.requests.duration.count
178
+ • web.requests.duration.min
179
+ • web.requests.duration.max
180
+ • web.requests.duration.sum
181
+ • web.requests.duration.avg
182
+ • web.requests.duration.p50
183
+ • web.requests.duration.p75
184
+ • web.requests.duration.p90
185
+ • web.requests.duration.p95
186
+ • web.requests.duration.p99
187
+
188
+ Unique tags: 3
189
+ Total tag values: 9
190
+ Possible combinations: 270
191
+
192
+ ──────────────────────────────────────────────────────────────────────────────────────────────
193
+ ┌──────────────────────────────────────────────────────────────────────────────────────────────┐
194
+ │ Schema Analysis Results: │
195
+ │ SUMMARY │
196
+ └──────────────────────────────────────────────────────────────────────────────────────────────┘
197
+
198
+ Total unique metrics: 16
199
+ Total possible custom metric combinations: 342
122
200
  ```
123
201
 
124
- **Schema validation catches:**
125
- - ❌ Metrics that don't exist
126
- - ❌ Wrong metric types (counter vs gauge vs distribution)
127
- - ❌ Missing required tags
128
- - ❌ Invalid tag values
129
- - ❌ Tags that aren't allowed on specific metrics
202
+ This analysis shows that your schema can generate up to **342 custom metrics** across **16 unique metric names** if every permitted tag combination is actually emitted. Knowing this upper bound before deployment helps prevent unexpected Datadog bills.
130
203
 
131
- ## Progressive Examples: Real-World Schemas
204
+ ## Advanced Features
205
+
206
+ ### Tag Inheritance
132
207
 
133
- ### E-commerce Application Metrics
208
+ Metrics can inherit tag configurations from other metrics to reduce duplication:
134
209
 
135
210
  ```ruby
136
- ecommerce_schema = Datadog::Statsd::Schema.new do
137
- # Global transformers for consistent naming
138
- transformers do
139
- underscore: ->(text) { text.underscore }
140
- downcase: ->(text) { text.downcase }
141
- end
142
-
143
- namespace :ecommerce do
144
- tags do
145
- # Finite set of product categories (not product IDs!)
146
- tag :category, values: %w[electronics clothing books home_garden]
147
-
148
- # Payment methods you actually support
149
- tag :payment_method, values: %w[credit_card paypal apple_pay]
150
-
151
- # Order status progression
152
- tag :status, values: %w[pending processing shipped delivered cancelled]
153
-
154
- # A/B test groups (not test IDs!)
155
- tag :checkout_flow, values: %w[single_page multi_step express]
156
- end
157
-
158
- namespace :orders do
159
- metrics do
160
- counter :created do
161
- description "New orders placed"
162
- tags required: [:category], allowed: [:payment_method, :checkout_flow]
163
- end
164
-
165
- counter :completed do
166
- description "Successfully processed orders"
167
- inherit_tags: "ecommerce.orders.created" # Reuse tag definition
168
- tags required: [:status]
169
- end
170
-
171
- distribution :value do
172
- description "Order value distribution in cents"
173
- units "cents"
174
- tags required: [:category], allowed: [:payment_method]
175
- end
176
-
177
- gauge :processing_queue_size do
178
- description "Orders waiting to be processed"
179
- # No tags - just a simple queue size metric
180
- end
181
- end
211
+ namespace :api do
212
+ metrics do
213
+ counter :requests_total do
214
+ tags required: [:environment, :service], allowed: [:region]
182
215
  end
183
-
184
- namespace :inventory do
185
- metrics do
186
- gauge :stock_level do
187
- description "Current inventory levels"
188
- tags required: [:category]
189
- end
190
-
191
- counter :restocked do
192
- description "Inventory replenishment events"
193
- tags required: [:category]
194
- end
195
- end
216
+
217
+ # Inherits environment, service (required) and region (allowed) from requests_total
218
+ distribution :request_duration do
219
+ inherit_tags "api.requests_total"
220
+ tags required: [:endpoint] # Adds endpoint as additional required tag
196
221
  end
197
222
  end
198
223
  end
199
-
200
- # Usage in your order processing service
201
- order_processor = Datadog::Statsd::Emitter.new(
202
- 'OrderProcessor',
203
- schema: ecommerce_schema,
204
- metric: 'ecommerce.orders', # Prefix for all metrics from this emitter
205
- tags: { checkout_flow: 'single_page' }
206
- )
207
-
208
- # Process an order - clean, validated metrics
209
- order_processor.increment('created', tags: {
210
- category: 'electronics',
211
- payment_method: 'credit_card'
212
- })
213
-
214
- order_processor.distribution('value', 15_99, tags: {
215
- category: 'electronics',
216
- payment_method: 'credit_card'
217
- })
218
-
219
- order_processor.gauge('processing_queue_size', 12)
220
224
  ```
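Conceptually, inheritance merges the parent's tag lists with any tags the child declares itself. The sketch below illustrates that merge with plain hashes. `inherit_tags_from` is a hypothetical helper written for this note, and the gem's actual resolution rules may differ.

```ruby
# Hypothetical model of a metric's tag configuration (not the gem's classes).
requests_total = { required: %i[environment service], allowed: %i[region] }

# Combine a parent's tag lists with the child's own additions.
def inherit_tags_from(parent, required: [], allowed: [])
  merged_required = (parent[:required] + required).uniq
  # A tag promoted to required no longer needs to appear under allowed.
  merged_allowed = (parent[:allowed] + allowed).uniq - merged_required
  { required: merged_required, allowed: merged_allowed }
end

# Mirrors the example above: inherit from requests_total, then add :endpoint.
request_duration = inherit_tags_from(requests_total, required: %i[endpoint])
p request_duration
```

Under these assumptions the child ends up requiring `environment`, `service`, and `endpoint`, with `region` still optional.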
221
225
 
222
- ### API Performance Monitoring
226
+ ### Nested Namespaces
227
+
228
+ Organize metrics hierarchically with nested namespaces:
223
229
 
224
230
  ```ruby
225
- api_schema = Datadog::Statsd::Schema.new do
226
- namespace :api do
231
+ namespace :application do
232
+ tags do
233
+ tag :environment, values: %w[prod staging dev]
234
+ end
235
+
236
+ namespace :database do
227
237
  tags do
228
- # HTTP methods you actually handle
229
- tag :method, values: %w[GET POST PUT PATCH DELETE]
230
-
231
- # Standardized controller names (transformed to snake_case)
232
- tag :controller,
233
- values: %r{^[a-z_]+$}, # Regex validation
234
- transform: [:underscore, :downcase]
235
-
236
- # Standard HTTP status code ranges
237
- tag :status_class, values: %w[2xx 3xx 4xx 5xx]
238
- tag :status_code,
239
- type: :integer,
240
- validate: ->(code) { (100..599).include?(code) }
241
-
242
- # Feature flags for A/B testing
243
- tag :feature_version, values: %w[v1 v2 experimental]
238
+ tag :table_name, values: %w[users orders products]
244
239
  end
245
-
246
- namespace :requests do
247
- metrics do
248
- counter :total do
249
- description "Total API requests"
250
- tags required: [:method, :controller],
251
- allowed: [:status_class, :feature_version]
252
- end
253
-
254
- distribution :duration do
255
- description "Request processing time"
256
- units "milliseconds"
257
- inherit_tags: "api.requests.total"
258
- tags required: [:status_code]
259
- end
260
-
261
- histogram :response_size do
262
- description "Response payload size distribution"
263
- units "bytes"
264
- tags required: [:method, :controller]
265
- end
266
- end
267
- end
268
-
269
- namespace :errors do
270
- metrics do
271
- counter :total do
272
- description "API errors by type"
273
- tags required: [:controller, :status_code]
274
- end
275
- end
240
+
241
+ metrics do
242
+ counter :queries_total
243
+ distribution :query_duration
276
244
  end
277
245
  end
278
- end
279
246
 
280
- # Usage in Rails controller concern
281
- class ApplicationController < ActionController::Base
282
- before_action :setup_metrics
283
- after_action :track_request
284
-
285
- private
286
-
287
- def setup_metrics
288
- @api_metrics = Datadog::Statsd::Emitter.new(
289
- self.class.name,
290
- schema: api_schema,
291
- metric: 'api',
292
- validation_mode: Rails.env.production? ? :warn : :strict
293
- )
294
- end
295
-
296
- def track_request
297
- controller_name = self.class.name.gsub('Controller', '').underscore
298
-
299
- @api_metrics.increment('requests.total', tags: {
300
- method: request.method,
301
- controller: controller_name,
302
- status_class: "#{response.status.to_s[0]}xx"
303
- })
304
-
305
- @api_metrics.distribution('requests.duration',
306
- request_duration_ms,
307
- tags: {
308
- method: request.method,
309
- controller: controller_name,
310
- status_code: response.status
311
- }
312
- )
247
+ namespace :cache do
248
+ tags do
249
+ tag :cache_type, values: %w[redis memcached]
250
+ end
251
+
252
+ metrics do
253
+ counter :hits_total
254
+ counter :misses_total
255
+ end
313
256
  end
314
257
  end
315
258
  ```
316
259
 
317
- ## Validation Modes: From Development to Production
260
+ ### Validation Modes
318
261
 
319
- The gem supports different validation strategies for different environments:
262
+ Control how validation errors are handled:
320
263
 
321
264
  ```ruby
322
- # Development: Explode on any schema violations
323
- dev_emitter = Datadog::Statsd::Emitter.new(
324
- 'MyService',
325
- schema: my_schema,
326
- validation_mode: :strict # Raises exceptions
327
- )
328
-
329
- # Staging: Log warnings but continue
330
- staging_emitter = Datadog::Statsd::Emitter.new(
331
- 'MyService',
332
- schema: my_schema,
333
- validation_mode: :warn # Prints to stderr, continues execution
334
- )
265
+ # Strict mode: Raises exceptions on validation failures
266
+ emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :strict)
335
267
 
336
- # Production: Drop invalid metrics silently
337
- prod_emitter = Datadog::Statsd::Emitter.new(
338
- 'MyService',
339
- schema: my_schema,
340
- validation_mode: :drop # Silently drops invalid metrics
341
- )
268
+ # Warn mode: Logs warnings but continues execution
269
+ emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :warn)
342
270
 
343
- # Emergency: Turn off validation entirely
344
- emergency_emitter = Datadog::Statsd::Emitter.new(
345
- 'MyService',
346
- schema: my_schema,
347
- validation_mode: :off # No validation at all
348
- )
271
+ # Disabled: No validation (production default)
272
+ emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :disabled)
349
273
  ```
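To make the difference between the modes concrete, here is a minimal sketch in plain Ruby. It does not use the gem's internals: `SchemaViolation`, `check_tags`, and the hard-coded `REQUIRED` and `ALLOWED` lists are hypothetical stand-ins for whatever the emitter does when it validates a metric.

```ruby
# Hypothetical stand-ins, not the gem's classes.
class SchemaViolation < StandardError; end

REQUIRED = %i[environment service].freeze
ALLOWED  = %i[region].freeze

# Returns true when the tags satisfy the rules; otherwise reacts per mode.
def check_tags(tags, mode: :strict)
  return true if mode == :disabled # skip validation entirely

  missing = REQUIRED - tags.keys
  unknown = tags.keys - REQUIRED - ALLOWED
  return true if missing.empty? && unknown.empty?

  message = "missing: #{missing.inspect}, not allowed: #{unknown.inspect}"
  raise SchemaViolation, message if mode == :strict

  warn "[metrics] #{message}" # :warn reports the problem and carries on
  false
end

check_tags({ environment: 'production', service: 'api' })  # passes
check_tags({ service: 'api', user_id: 42 }, mode: :warn)   # prints a warning
```

The point of the sketch is the control flow: `:strict` surfaces problems as exceptions during development, `:warn` keeps the application running while still reporting them, and `:disabled` skips the checks altogether.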
350
274
 
351
- ## Best Practices: Designing Schemas That Scale
275
+ ### Global Configuration
352
276
 
353
- ### 🎯 Design Metrics Before Code
277
+ Set up global defaults for your application:
354
278
 
355
279
  ```ruby
356
- # Good: Design session like this
357
- session_schema = Datadog::Statsd::Schema.new do
358
- namespace :user_sessions do
359
- tags do
360
- tag :session_type, values: %w[web mobile api]
361
- tag :auth_method, values: %w[password oauth sso]
362
- tag :plan_tier, values: %w[free premium enterprise]
363
- end
364
-
365
- metrics do
366
- counter :started do
367
- description "User sessions initiated"
368
- tags required: [:session_type], allowed: [:auth_method, :plan_tier]
369
- end
370
-
371
- counter :ended do
372
- description "User sessions terminated"
373
- tags required: [:session_type, :auth_method]
374
- end
375
-
376
- distribution :duration do
377
- description "How long sessions last"
378
- units "minutes"
379
- tags required: [:session_type]
380
- end
381
- end
382
- end
280
+ Datadog::Statsd::Schema.configure do |config|
281
+ config.statsd = Datadog::Statsd.new(
282
+ ENV['DATADOG_AGENT_HOST'] || 'localhost',
283
+ ENV.fetch('DATADOG_AGENT_PORT', 8125).to_i
284
+ )
285
+ config.schema = schema
286
+ config.tags = {
287
+ environment: ENV['RAILS_ENV'],
288
+ service: 'my-application',
289
+ version: ENV['APP_VERSION']
290
+ }
383
291
  end
384
292
 
385
- # Bad: Don't do this
386
- statsd.increment('user_login', tags: { user_id: user.id }) # Infinite cardinality!
387
- statsd.increment('session_start_web_premium_oauth') # Explosion of metric names!
388
- statsd.gauge('active_users_on_mobile_free_plan_from_usa', 1000) # Way too specific!
293
+ # These global tags are automatically added to all metrics
294
+ emitter = Datadog::Statsd::Emitter.new
295
+ emitter.increment('user.signup') # Automatically includes global tags
389
296
  ```
390
297
 
391
- ### 🏷️ Tag Strategy: Finite and Purposeful
298
+ ## Cost Control and Best Practices
392
299
 
393
- ```ruby
394
- # ✅ Good: Finite tag values that enable grouping/filtering
395
- tag :plan_type, values: %w[free premium enterprise]
396
- tag :region, values: %w[us-east us-west eu-central ap-southeast]
397
- tag :feature_flag, values: %w[enabled disabled control]
398
-
399
- # ❌ Bad: Infinite or high-cardinality tags
400
- tag :user_id # Millions of possible values!
401
- tag :session_id # Unique every time!
402
- tag :timestamp # Infinite values!
403
- tag :request_path # Thousands of unique URLs!
404
- ```
300
+ ### Understanding Metric Expansion
405
301
 
406
- ### 📊 Metric Types: Choose Wisely
302
+ Different metric types create different numbers of time series:
407
303
 
408
- ```ruby
409
- namespace :email_service do
410
- metrics do
411
- # ✅ Use counters for events that happen
412
- counter :sent do
413
- description "Emails successfully sent"
414
- end
415
-
416
- # ✅ Use gauges for current state/levels
417
- gauge :queue_size do
418
- description "Emails waiting to be sent"
419
- end
420
-
421
- # ✅ Use distributions for value analysis (careful - creates 10 metrics!)
422
- distribution :delivery_time do
423
- description "Time from send to delivery"
424
- units "seconds"
425
- end
426
-
427
- # ⚠️ Use histograms sparingly (creates 5 metrics each)
428
- histogram :processing_time do
429
- description "Email processing duration"
430
- units "milliseconds"
431
- end
432
-
433
- # ⚠️ Use sets very carefully (tracks unique values)
434
- set :unique_recipients do
435
- description "Unique email addresses receiving mail"
436
- end
437
- end
438
- end
439
- ```
304
+ - **Counter/Set**: 1 time series per unique tag combination
305
+ - **Gauge**: 5 time series per unique tag combination (count, min, max, sum, avg)
306
+ - **Distribution/Histogram**: 10 time series per unique tag combination (count, min, max, sum, avg, p50, p75, p90, p95, p99)
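As a rough check, the figures in the `dss analyze` output shown earlier can be reproduced with a few lines of plain Ruby. This is a sketch, not the gem's implementation: `EXPANSIONS`, `TAG_VALUES`, and `series_count` are hypothetical names, and the multipliers are simply the ones listed above.

```ruby
# Illustrative only: plain Ruby, independent of the gem's API.
# Expansion factors per metric type, as listed above.
EXPANSIONS = { counter: 1, set: 1, gauge: 5, distribution: 10, histogram: 10 }.freeze

# Tag values from the example schema in "Schema Analysis".
TAG_VALUES = {
  environment: %w[production staging development],
  service: %w[api web worker],
  region: %w[us-east-1 us-west-2 eu-west-1]
}.freeze

# Upper bound on time series for one metric: the product of the value counts
# of every tag it may carry, multiplied by the expansion factor of its type.
def series_count(type, tags)
  combinations = tags.map { |tag| TAG_VALUES.fetch(tag).size }.reduce(1, :*)
  combinations * EXPANSIONS.fetch(type)
end

counts = {
  'web.memory_usage' => series_count(:gauge, %i[environment service]),
  'web.requests.total' => series_count(:counter, %i[environment service region]),
  'web.requests.duration' => series_count(:distribution, %i[environment service region])
}

counts.each { |name, count| puts format('%-24s %d', name, count) }
puts "total: #{counts.values.sum}"
```

This yields 45, 27, and 270, which sum to the 342 combinations reported in the summary. Like the CLI figures, it treats every allowed tag as always present, so it is an upper bound rather than a prediction.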
307
+
308
+ ### Tag Value Limits
440
309
 
441
- ### 🔄 Schema Evolution: Plan for Change
310
+ Be mindful of tag cardinality:
442
311
 
443
312
  ```ruby
444
- # Good: Use inherit_tags to reduce duplication
445
- base_schema = Datadog::Statsd::Schema.new do
446
- namespace :payments do
447
- tags do
448
- tag :payment_method, values: %w[card bank_transfer crypto]
449
- tag :currency, values: %w[USD EUR GBP JPY]
450
- tag :region, values: %w[north_america europe asia]
451
- end
452
-
453
- metrics do
454
- counter :initiated do
455
- description "Payment attempts started"
456
- tags required: [:payment_method], allowed: [:currency, :region]
457
- end
458
-
459
- counter :completed do
460
- description "Successful payments"
461
- inherit_tags: "payments.initiated" # Reuses the tag configuration
462
- end
463
-
464
- counter :failed do
465
- description "Failed payment attempts"
466
- inherit_tags: "payments.initiated"
467
- tags required: [:failure_reason] # Add specific tags as needed
468
- end
469
- end
470
- end
471
- end
313
+ # High cardinality - avoid
314
+ tag :user_id, type: :string # Could be millions of values
315
+
316
+ # Better approach - use bucketing
317
+ tag :user_tier, values: %w[free premium enterprise]
318
+ tag :user_cohort, values: %w[new_user returning_user power_user]
472
319
  ```
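One way to apply bucketing is to map the raw, unbounded attribute to a small fixed set of values before tagging. The sketch below assumes a hypothetical `user_cohort_for` helper and made-up thresholds. Only the cohort names come from the example above.

```ruby
# The three values declared for the :user_cohort tag in the example above.
USER_COHORTS = %w[new_user returning_user power_user].freeze

# Hypothetical helper: collapses an unbounded value (a user's order count,
# assumed to be non-negative) into one of the declared cohort values.
def user_cohort_for(order_count)
  case order_count
  when 0 then 'new_user'
  when 1..9 then 'returning_user'
  else 'power_user'
  end
end

tags = { user_tier: 'premium', user_cohort: user_cohort_for(12) }
raise ArgumentError, 'unknown cohort' unless USER_COHORTS.include?(tags[:user_cohort])

puts tags.inspect
```

However many users there are, the tag can only ever take three values, so the number of time series stays bounded.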
473
320
 
474
- ### 🏗️ Namespace Organization
321
+ ### Schema Validation
322
+
323
+ Always validate your schema before deployment:
475
324
 
476
325
  ```ruby
477
- # Good: Hierarchical organization by domain
478
- app_schema = Datadog::Statsd::Schema.new do
479
- namespace :ecommerce do
480
- namespace :orders do
481
- # Order-related metrics
482
- end
483
-
484
- namespace :inventory do
485
- # Stock and fulfillment metrics
486
- end
487
-
488
- namespace :payments do
489
- # Payment processing metrics
490
- end
491
- end
492
-
493
- namespace :infrastructure do
494
- namespace :database do
495
- # DB performance metrics
496
- end
497
-
498
- namespace :cache do
499
- # Redis/Memcached metrics
500
- end
501
- end
326
+ # Check for common issues
327
+ errors = schema.validate
328
+ if errors.any?
329
+ puts "Schema validation errors:"
330
+ errors.each { |error| puts " - #{error}" }
502
331
  end
503
-
504
- # ❌ Bad: Flat namespace chaos
505
- # orders.created
506
- # orders_completed
507
- # order::cancelled
508
- # INVENTORY_LOW
509
- # db.query.time
510
- # cache_hits
511
332
  ```
512
333
 
513
- ## Advanced Features
334
+ ## Integration Examples
514
335
 
515
- ### Global Configuration
336
+ ### Rails Integration
516
337
 
517
338
  ```ruby
518
- # Set up global configuration in your initializer
339
+ # config/initializers/datadog_statsd.rb
340
+ schema = Datadog::Statsd::Schema.load_file(Rails.root.join('config/metrics_schema.rb'))
341
+
519
342
  Datadog::Statsd::Schema.configure do |config|
520
- # Global tags applied to ALL metrics
343
+ config.statsd = Datadog::Statsd.new
344
+ config.schema = schema
521
345
  config.tags = {
522
- env: Rails.env,
523
- service: 'web-app',
524
- version: ENV['GIT_SHA']&.first(7),
525
- datacenter: ENV['DATACENTER'] || 'us-east-1'
346
+ environment: Rails.env,
347
+ service: 'my-rails-app'
526
348
  }
527
-
528
- # The actual StatsD client
529
- config.statsd = Datadog::Statsd.new(
530
- ENV['STATSD_HOST'] || 'localhost',
531
- ENV['STATSD_PORT'] || 8125,
532
- namespace: ENV['STATSD_NAMESPACE'],
533
- tags: [], # Don't double-up tags here
534
- delay_serialization: true
535
- )
536
349
  end
537
- ```
538
350
 
539
- ### Tag Transformers
351
+ # app/controllers/application_controller.rb
352
+ class ApplicationController < ActionController::Base
353
+ before_action :setup_metrics
540
354
 
541
- ```ruby
542
- schema_with_transforms = Datadog::Statsd::Schema.new do
543
- transformers do
544
- underscore: ->(text) { text.underscore }
545
- downcase: ->(text) { text.downcase }
546
- truncate: ->(text) { text.first(20) }
547
- end
548
-
549
- namespace :user_actions do
550
- tags do
551
- # Controller names get normalized automatically
552
- tag :controller,
553
- values: %r{^[a-z_]+$},
554
- transform: [:underscore, :downcase] # Applied in order
555
-
556
- # Action names also get cleaned up
557
- tag :action,
558
- values: %w[index show create update destroy],
559
- transform: [:downcase]
560
- end
355
+ private
356
+
357
+ def setup_metrics
358
+ @metrics = Datadog::Statsd::Emitter.new(
359
+ validation_mode: Rails.env.production? ? :disabled : :warn
360
+ )
561
361
  end
562
362
  end
563
-
564
- # "UserSettingsController" becomes "user_settings_controller"
565
- # "CreateUser" becomes "create_user"
566
363
  ```
567
364
 
568
- ### Complex Validation
365
+ ### Background Job Monitoring
569
366
 
570
367
  ```ruby
571
- advanced_schema = Datadog::Statsd::Schema.new do
572
- namespace :financial do
573
- tags do
574
- # Custom validation with lambdas
575
- tag :amount_bucket,
576
- validate: ->(value) { %w[small medium large].include?(value) }
577
-
578
- # Regex validation for IDs
579
- tag :transaction_type,
580
- values: %r{^[A-Z]{2,4}_[0-9]{3}$} # Like "AUTH_001", "REFUND_042"
581
-
582
- # Type validation
583
- tag :user_segment,
584
- type: :integer,
585
- validate: ->(segment) { (1..10).include?(segment) }
368
+ class OrderProcessingJob
369
+ def perform(order_id)
370
+ metrics = Datadog::Statsd::Emitter.new
371
+
372
+ start_time = Time.current
373
+
374
+ begin
375
+ process_order(order_id)
376
+ metrics.increment('jobs.order_processing.success', tags: { queue: 'orders' })
377
+ rescue => error
378
+ metrics.increment('jobs.order_processing.failure',
379
+ tags: { queue: 'orders', error_type: error.class.name })
380
+ raise
381
+ ensure
382
+ duration = Time.current - start_time
383
+ metrics.distribution('jobs.order_processing.duration', duration * 1000,
384
+ tags: { queue: 'orders' })
586
385
  end
587
386
  end
588
387
  end
589
388
  ```
590
389
 
591
- ### Loading Schemas from Files
592
-
593
- ```ruby
594
- # config/metrics_schema.rb
595
- Datadog::Statsd::Schema.new do
596
- namespace :my_app do
597
- # ... schema definition
598
- end
599
- end
390
+ ## Development
600
391
 
601
- # In your application
602
- schema = Datadog::Statsd::Schema.load_file('config/metrics_schema.rb')
603
- ```
392
+ After checking out the repo, run:
604
393
 
605
- ## Installation
606
-
607
- Add to your Gemfile:
608
-
609
- ```ruby
610
- gem 'datadog-statsd-schema'
394
+ ```bash
395
+ bin/setup # Install dependencies
396
+ bundle exec rake spec # Run tests
611
397
  ```
612
398
 
613
- Or install directly:
399
+ To install this gem onto your local machine:
614
400
 
615
401
  ```bash
616
- gem install datadog-statsd-schema
402
+ bundle exec rake install
617
403
  ```
618
404
 
619
- ## The Bottom Line
620
-
621
- This gem transforms Datadog custom metrics from a "wild west" free-for-all into a disciplined, cost-effective observability strategy:
622
-
623
- - **🎯 Intentional Metrics**: Define what you measure before you measure it
624
- - **💰 Cost Control**: Prevent infinite cardinality and metric explosion
625
- - **🏷️ Consistent Tagging**: Global and hierarchical tag management
626
- - **🔍 Better Insights**: Finite tag values enable proper aggregation and analysis
627
- - **👥 Team Alignment**: Schema serves as documentation and contract
628
-
629
- Stop the metric madness. Start with a schema.
630
-
631
- ---
632
-
633
405
  ## Contributing
634
406
 
635
- Bug reports and pull requests are welcome on GitHub at [https://github.com/kigster/datadog-statsd-schema](https://github.com/kigster/datadog-statsd-schema)
407
+ Bug reports and pull requests are welcome on GitHub at https://github.com/kigster/datadog-statsd-schema.
636
408
 
637
409
  ## License
638
410