datadog-statsd-schema 0.1.1 → 0.1.2

data/README.md CHANGED
@@ -2,354 +2,633 @@
2
2
 
3
3
  # Datadog::Statsd::Schema
4
4
 
5
- This is a wrapper around the [dogstatsd-ruby](https://github.com/DataDog/dogstatsd-ruby) gem that sends custom metrics via StatsD, with an additional layer of validation based on configurable schemas. Schemas can validate allowed metric names and their associated tags and tag values. This approach can guide an organization towards a clear, declarative approach to metrics and their tags, and then to emitting them from within the application with the assurance that any invalid value will raise an exception.
5
+ ## Stop the Metric Madness (And Save Your Budget) 💸
6
6
 
7
- We invite you to explore some of the provided [examples](./examples/README.md) which can be run from project's root, and are described in the linked README.
7
+ *"With great StatsD power comes great billing responsibility"*
8
8
 
9
- ## Introduction
9
+ Every engineering team starts the same way with [Datadog custom metrics](https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/?tab=ruby): a few innocent calls to `statsd.increment('user.signup')`, maybe a `statsd.gauge('queue.size', 42)`. Life is good. Metrics are flowing. Dashboards are pretty.
10
10
 
11
- This is an extension to gem [dogstatsd-ruby](https://github.com/DataDog/dogstatsd-ruby) which enhances the original with a robust schema definition for both the custom metrics being sent, and the tags allowed (or required) to attach to the metric.
11
+ Then reality hits. Your Datadog bill explodes 🚀 because:
12
12
 
13
- There are several interfaces to `Datadog::Statsd` instance — you can use the class methods of `Datadog::Statsd::Emitter`, and pass the typical statsd methods. But you can also use an instance of this class, which adds a number of features and powerful shortcuts.
13
+ - **Marketing** added `statsd.increment('clicks', tags: { campaign_id: campaign.id })` across 10,000 campaigns
14
+ - **DevOps** thought `statsd.gauge('memory', tags: { container_id: container.uuid })` was a great idea
15
+ - **Frontend** started tracking `statsd.timing('page.load', tags: { user_id: current_user.id })` for 2 million users
16
+ - **Everyone** has their own creative naming conventions: `user_signups`, `user.sign.ups`, `users::signups`, `Users.Signups`
14
17
 
15
- If you do not pass the schema argument to the emitter, it acts as a wrapper around a `Datadog::Statsd` instance: it merges the global and local tags and concatenates metric names, so it is quite useful on its own.
18
+ **Congratulations!** 🎉 You now have 50,000+ custom metrics, each [costing real money](https://docs.datadoghq.com/account_management/billing/custom_metrics/), most providing zero actionable insights.
16
19
 
17
- But the real power comes from defining a Schema of metrics and tags, and providing the schema to the Emitter as a constructor argument. In that case every metric send will be validated against the schema.
18
-
19
- ## Metric Types
20
+ This gem exists to prevent that chaos (and save your engineering budget).
20
21
 
21
- > For more information about the metrics, please see the [Datadog Documentation](https://docs.datadoghq.com/metrics/custom_metrics/dogstatsd_metrics_submission/?tab=ruby).
22
+ ## The Solution: Schema-Driven Metrics
22
23
 
23
- There are 5 total metric types you can send with Statsd, and it's important to understand the differences:
24
+ This gem wraps [dogstatsd-ruby](https://github.com/DataDog/dogstatsd-ruby) with two superpowers:
24
25
 
25
- * `COUNT` (eg, `Datadog::Statsd::Emitter.increment('emails.sent', by: 2)`)
26
- * `GAUGE` (eg, `Datadog::Statsd::Emitter.gauge('users.on.site', 100)`)
27
- * `HISTOGRAM` (eg, `Datadog::Statsd::Emitter.histogram('page.load.time', 100)`)
28
- * `DISTRIBUTION` (eg,`Datadog::Statsd::Emitter.distribution('page.load.time', 100)`)
29
- * `SET` (eg, `Datadog::Statsd::Emitter.set('users.unique', '12345')`)
26
+ 1. **🏷️ Intelligent Tag Merging** - Even without schemas, get consistent tagging across your application
27
+ 2. **📋 Schema Validation** - Define your metrics upfront, validate everything, prevent metric explosion
30
28
 
31
- NOTE: `HISTOGRAM` converts your metric into FIVE separate metrics (with suffixes `.max`, `.median`, `.avg`, `.count`, `.95percentile`), while `DISTRIBUTION` explodes into TEN separate metrics (see the documentation). Do NOT use `SET` unless you know what you are doing.
29
+ Let's see how this works, starting simple and building up...
32
30
 
33
- You can send metrics via class methods of `Datadog::Statsd::Emitter`, or by instantiating the class.
31
+ ## Quick Start: Better Tags Without Schemas
34
32
 
35
- ## Sending Metrics
33
+ Even before you define schemas, the `Emitter` class immediately improves your metrics with intelligent tag merging:
36
34
 
37
- ### Class Methods
35
+ ```ruby
36
+ require 'datadog/statsd/schema'
37
+
38
+ # Configure global tags that apply to ALL metrics
39
+ Datadog::Statsd::Schema.configure do |config|
40
+ config.tags = { env: 'production', service: 'web-app', version: '1.2.3' }
41
+ config.statsd = Datadog::Statsd.new('localhost', 8125)
42
+ end
43
+
44
+ # Create an emitter for your authentication service
45
+ auth_emitter = Datadog::Statsd::Emitter.new(
46
+ 'AuthService', # Automatically becomes emitter:auth_service tag
47
+ tags: { feature: 'user_auth' } # These tags go on every metric from this emitter
48
+ )
38
49
 
39
- This is the most straightforward way of using this gem. You can just pass your metric names and tags to the standard operations on Statsd, just like so:
50
+ # Send a metric - watch the tag magic happen
51
+ auth_emitter.increment('login.success', tags: { method: 'oauth' })
52
+ ```
40
53
 
54
+ **What actually gets sent to Datadog:**
41
55
  ```ruby
42
- require 'datadog/statsd'
43
- require 'datadog/statsd/schema'
44
-
45
- Datadog::Statsd::Emitter.increment(
46
- 'marathon.started.total',
47
- by: 7,
48
- tags: {
49
- course: "sf-marathon",
50
- length: 26.212,
51
- units: "miles"
52
- },
53
- schema: ....
54
- )
56
+ # Metric: auth_service.login.success
57
+ # Tags: {
58
+ # env: 'production', # From global config
59
+ # service: 'web-app', # From global config
60
+ # version: '1.2.3', # From global config
61
+ # emitter: 'auth_service', # Auto-generated from first argument
62
+ # feature: 'user_auth', # From emitter constructor
63
+ # method: 'oauth' # From method call
64
+ # }
55
65
  ```
56
66
 
57
- As you can see, the API is identical to `Datadog::Statsd`. The main difference is that, if you provide a schema argument, the metric `marathon.started.total` must be pre-declared using the schema DSL language. In addition, the metric type ("count") and all of the tags and their possible values must be predeclared in the schema. Schema does support opening up a tag to any number of values, but that is not recommended.
67
+ **Tag Precedence (method tags win):**
68
+ - Method-level tags override emitter tags
69
+ - Emitter tags override global tags
70
+ - Global tags are always included
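
The precedence rules above behave like successive hash merges. The sketch below is only an illustration in plain Ruby and does not call the gem; the conflicting `env` and `feature` values are invented here to show which layer wins.

```ruby
# Illustration only: plain Ruby hashes, not the gem's internals.
global_tags  = { env: 'production', service: 'web-app', version: '1.2.3' }
emitter_tags = { emitter: 'auth_service', feature: 'user_auth', env: 'staging' }
method_tags  = { method: 'oauth', feature: 'sso' }

# Later merges win, so method tags override emitter tags,
# which in turn override global tags.
merged = global_tags.merge(emitter_tags).merge(method_tags)

puts merged.inspect
```

Here `env` ends up as `'staging'` (emitter beats global) and `feature` as `'sso'` (method beats emitter), while untouched global tags such as `service` pass through unchanged.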
58
71
 
59
- So let's look at a more elaborate use case.
72
+ This alone prevents the "different tag patterns everywhere" problem. But we're just getting started...
60
73
 
61
- ### Defining Schema
74
+ ## Schema Power: Design Your Metrics, Then Code
62
75
 
63
- Below is an example of configuring the gem by creating a schema using the provided DSL. This can be a single global schema or assigned to a specific Statsd Sender, although you can have any number of Senders of type `Datadog::Statsd::Emitter` that map to a new connection and new defaults.
76
+ Here's where this gem really shines. Instead of letting developers create metrics willy-nilly, you define them upfront:
64
77
 
65
78
  ```ruby
66
- require 'etc'
67
- require 'git'
79
+ # Define what metrics you actually want
80
+ user_metrics_schema = Datadog::Statsd::Schema.new do
81
+ namespace :users do
82
+ # Define the tags you'll actually use (not infinite user_ids!)
83
+ tags do
84
+ tag :signup_method, values: %w[email oauth google github]
85
+ tag :plan_type, values: %w[free premium enterprise]
86
+ tag :feature_flag, values: %w[enabled disabled]
87
+ end
88
+
89
+ metrics do
90
+ # Define exactly which metrics exist and their constraints
91
+ counter :signups do
92
+ description "New user registrations"
93
+ tags required: [:signup_method], allowed: [:plan_type, :feature_flag]
94
+ end
95
+
96
+ gauge :active_sessions do
97
+ description "Currently logged in users"
98
+ tags allowed: [:plan_type]
99
+ end
100
+ end
101
+ end
102
+ end
68
103
 
69
- require 'datadog/statsd'
70
- require 'datadog/statsd/schema'
104
+ # Create an emitter bound to this schema
105
+ user_emitter = Datadog::Statsd::Emitter.new(
106
+ 'UserService',
107
+ schema: user_metrics_schema,
108
+ validation_mode: :strict # Explode on invalid metrics (good for development)
109
+ )
71
110
 
72
- # Define the global statsd instance that we'll use to send data through
73
- $statsd = ::Datadog::Statsd.new(
74
- 'localhost', 8125,
75
- delay_serialization: true
76
- )
111
+ # This works - follows the schema
112
+ user_emitter.increment('signups', tags: { signup_method: 'oauth', plan_type: 'premium' })
77
113
 
78
- # Configure the schema with global tags and the above-created Statsd instance
79
- Datadog::Statsd::Schema.configure do |config|
80
- # This configures the global tags that will be attached to all methods
81
- config.tags = {
82
- env: "development",
83
- arch: Etc.uname[:machine],
84
- version: Git.open('.').object('HEAD').sha
85
- }
86
-
87
- config.statsd = $statsd
88
- end
114
+ # This explodes 💥 - 'facebook' not in allowed signup_method values
115
+ user_emitter.increment('signups', tags: { signup_method: 'facebook' })
89
116
 
90
- # Now we'll create a Schema using the provided Schema DSL:
91
- schema = Datadog.schema do
92
- # Transformers can be attached to the tags, and applied before the tags are submitted
93
- # or validated.
94
- transformers do
95
- underscore: ->(text) { text.underscore },
96
- downcase: ->(text) { text.downcase }
97
- end
117
+ # This explodes 💥 - 'user_registrations' metric doesn't exist in schema
118
+ user_emitter.increment('user_registrations')
98
119
 
99
- namespace :marathon do
100
- tags do
101
- tag :course,
102
- values: ["san francisco", "boston", "new york"],
103
- transform: %i[downcase underscore],
120
+ # This explodes 💥 - missing required tag signup_method
121
+ user_emitter.increment('signups', tags: { plan_type: 'free' })
122
+ ```
104
123
 
105
- tag :marathon_type, values: %w[half full]
106
- tag :status, values: %w[finished no-show incomplete]
107
- tag :sponsorship, values: %w[nike cocacola redbull]
108
- end
124
+ **Schema validation catches:**
125
+ - ❌ Metrics that don't exist
126
+ - ❌ Wrong metric types (counter vs gauge vs distribution)
127
+ - ❌ Missing required tags
128
+ - ❌ Invalid tag values
129
+ - ❌ Tags that aren't allowed on specific metrics
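
To make the tag checks in the list above concrete, here is a minimal plain-Ruby sketch of that kind of check. It is not the gem's implementation: the `signups_rules` hash and the `tag_errors` helper are hypothetical names that only mirror the `users.signups` definition shown earlier.

```ruby
# Hypothetical rule set mirroring the users.signups metric defined above.
signups_rules = {
  required: %i[signup_method],
  allowed: %i[plan_type feature_flag],
  values: {
    signup_method: %w[email oauth google github],
    plan_type: %w[free premium enterprise],
    feature_flag: %w[enabled disabled]
  }
}

# Returns a list of human-readable problems; an empty list means the tags pass.
def tag_errors(rules, tags)
  errors = []

  missing = rules[:required] - tags.keys
  errors << "missing required tags: #{missing.join(', ')}" unless missing.empty?

  unknown = tags.keys - rules[:required] - rules[:allowed]
  errors << "tags not allowed: #{unknown.join(', ')}" unless unknown.empty?

  tags.each do |name, value|
    permitted = rules[:values][name]
    next if permitted.nil? || permitted.include?(value.to_s)

    errors << "invalid value #{value.inspect} for tag #{name}"
  end

  errors
end

ok          = tag_errors(signups_rules, { signup_method: 'oauth', plan_type: 'premium' })
bad_value   = tag_errors(signups_rules, { signup_method: 'facebook' })
missing_tag = tag_errors(signups_rules, { plan_type: 'free' })

puts ok.inspect, bad_value.inspect, missing_tag.inspect
```

Collecting every problem into a list, rather than stopping at the first one, is a design choice of this sketch; it lets a caller decide later whether to raise, warn, or drop.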
109
130
 
110
- metrics do
111
- # This defines a single metric "marathon.started.total"
112
- namespace :started do
113
- counter :total do
114
- description "Incrementing - the total number of people who were registered for this marathon"
115
- tags required: %i[ course marathon_type ],
116
- allowed: %i[ sponsorship ]
117
- end
118
- end
131
+ ## Progressive Examples: Real-World Schemas
132
+
133
+ ### E-commerce Application Metrics
119
134
 
120
- # defines two metrics: a counter metric named "marathon.finished.total" and
121
- # a distribution metric "marathon.finished.duration"
122
- namespace :finished do
123
- counter :total, inherit_tags: "marathon.started.total" do
124
- description "The number of people who finished a given marathon"
125
- tags required: %i[ status ]
126
- end
127
-
128
- distribution :duration, units: "minutes", inherit_tags: "marathon.finished.total" do
129
- description "The distribution of all finish times registered."
130
- end
131
- end
135
+ ```ruby
136
+ ecommerce_schema = Datadog::Statsd::Schema.new do
137
+ # Global transformers for consistent naming
138
+ transformers do
139
+ underscore: ->(text) { text.underscore }
140
+ downcase: ->(text) { text.downcase }
141
+ end
142
+
143
+ namespace :ecommerce do
144
+ tags do
145
+ # Finite set of product categories (not product IDs!)
146
+ tag :category, values: %w[electronics clothing books home_garden]
147
+
148
+ # Payment methods you actually support
149
+ tag :payment_method, values: %w[credit_card paypal apple_pay]
150
+
151
+ # Order status progression
152
+ tag :status, values: %w[pending processing shipped delivered cancelled]
153
+
154
+ # A/B test groups (not test IDs!)
155
+ tag :checkout_flow, values: %w[single_page multi_step express]
156
+ end
157
+
158
+ namespace :orders do
159
+ metrics do
160
+ counter :created do
161
+ description "New orders placed"
162
+ tags required: [:category], allowed: [:payment_method, :checkout_flow]
163
+ end
164
+
165
+ counter :completed do
166
+ description "Successfully processed orders"
167
+ inherit_tags: "ecommerce.orders.created" # Reuse tag definition
168
+ tags required: [:status]
169
+ end
170
+
171
+ distribution :value do
172
+ description "Order value distribution in cents"
173
+ units "cents"
174
+ tags required: [:category], allowed: [:payment_method]
175
+ end
176
+
177
+ gauge :processing_queue_size do
178
+ description "Orders waiting to be processed"
179
+ # No tags - just a simple queue size metric
180
+ end
181
+ end
182
+ end
183
+
184
+ namespace :inventory do
185
+ metrics do
186
+ gauge :stock_level do
187
+ description "Current inventory levels"
188
+ tags required: [:category]
189
+ end
190
+
191
+ counter :restocked do
192
+ description "Inventory replenishment events"
193
+ tags required: [:category]
194
+ end
132
195
  end
133
196
  end
134
197
  end
198
+ end
199
+
200
+ # Usage in your order processing service
201
+ order_processor = Datadog::Statsd::Emitter.new(
202
+ 'OrderProcessor',
203
+ schema: ecommerce_schema,
204
+ metric: 'ecommerce.orders', # Prefix for all metrics from this emitter
205
+ tags: { checkout_flow: 'single_page' }
206
+ )
135
207
 
136
- my_sender = Datadog.emitter(
137
- metric: 'marathon',
138
- schema: schema,
139
- validation_mode: :strict,
140
- tags: { marathon_type: :full, course: "san-francisco" }
141
- )
208
+ # Process an order - clean, validated metrics
209
+ order_processor.increment('created', tags: {
210
+ category: 'electronics',
211
+ payment_method: 'credit_card'
212
+ })
213
+
214
+ order_processor.distribution('value', 15_99, tags: {
215
+ category: 'electronics',
216
+ payment_method: 'credit_card'
217
+ })
142
218
 
143
- my_sender.increment('started.total', by: 43579) # register all participants at start
144
- # time passes, first runners start to arrive
145
- my_sender.increment('finished.total') # register one at a time
146
- my_sender.distribution('finished.duration', 33.21, tags: { sponsorship: 'nike' })
147
- ...
148
- my_sender.increment('finished.total')
149
- my_sender.distribution('finished.duration', 35.09, tags: { sponsorship: "redbull" })
219
+ order_processor.gauge('processing_queue_size', 12)
150
220
  ```
151
221
 
152
- In this case, the schema will validate that the metrics are named `marathon.finished.total` and `marathon.finished.duration`, and that their tags are appropriately defined.
222
+ ### API Performance Monitoring
153
223
 
154
224
  ```ruby
155
- finish_sender = Datadog.emitter(
156
- schema: schema,
157
- validation_mode: :warn,
158
- metric: "marathon.finished",
159
- tags: { marathon_type: :full, course: "san-francisco" }
160
- )
161
- finish_sender.increment("total")
162
- finish_sender.distribution("duration", 34)
225
+ api_schema = Datadog::Statsd::Schema.new do
226
+ namespace :api do
227
+ tags do
228
+ # HTTP methods you actually handle
229
+ tag :method, values: %w[GET POST PUT PATCH DELETE]
230
+
231
+ # Standardized controller names (transformed to snake_case)
232
+ tag :controller,
233
+ values: %r{^[a-z_]+$}, # Regex validation
234
+ transform: [:underscore, :downcase]
235
+
236
+ # Standard HTTP status code ranges
237
+ tag :status_class, values: %w[2xx 3xx 4xx 5xx]
238
+ tag :status_code,
239
+ type: :integer,
240
+ validate: ->(code) { (100..599).include?(code) }
241
+
242
+ # Feature flags for A/B testing
243
+ tag :feature_version, values: %w[v1 v2 experimental]
244
+ end
245
+
246
+ namespace :requests do
247
+ metrics do
248
+ counter :total do
249
+ description "Total API requests"
250
+ tags required: [:method, :controller],
251
+ allowed: [:status_class, :feature_version]
252
+ end
253
+
254
+ distribution :duration do
255
+ description "Request processing time"
256
+ units "milliseconds"
257
+ inherit_tags: "api.requests.total"
258
+ tags required: [:status_code]
259
+ end
260
+
261
+ histogram :response_size do
262
+ description "Response payload size distribution"
263
+ units "bytes"
264
+ tags required: [:method, :controller]
265
+ end
266
+ end
267
+ end
268
+
269
+ namespace :errors do
270
+ metrics do
271
+ counter :total do
272
+ description "API errors by type"
273
+ tags required: [:controller, :status_code]
274
+ end
275
+ end
276
+ end
277
+ end
278
+ end
279
+
280
+ # Usage in Rails controller concern
281
+ class ApplicationController < ActionController::Base
282
+ before_action :setup_metrics
283
+ after_action :track_request
284
+
285
+ private
286
+
287
+ def setup_metrics
288
+ @api_metrics = Datadog::Statsd::Emitter.new(
289
+ self.class.name,
290
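+ schema: api_schema, # NOTE: a top-level local variable is not visible inside this method; expose the schema via a constant or helper method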
291
+ metric: 'api',
292
+ validation_mode: Rails.env.production? ? :warn : :strict
293
+ )
294
+ end
295
+
296
+ def track_request
297
+ controller_name = self.class.name.gsub('Controller', '').underscore
298
+
299
+ @api_metrics.increment('requests.total', tags: {
300
+ method: request.method,
301
+ controller: controller_name,
302
+ status_class: "#{response.status.to_s[0]}xx"
303
+ })
304
+
305
+ @api_metrics.distribution('requests.duration',
306
+ request_duration_ms, # placeholder: the application must supply the elapsed time in milliseconds
307
+ tags: {
308
+ method: request.method,
309
+ controller: controller_name,
310
+ status_code: response.status
311
+ }
312
+ )
313
+ end
314
+ end
163
315
  ```
164
316
 
165
- The above code will transmit the following metric, with the following tags:
317
+ ## Validation Modes: From Development to Production
318
+
319
+ The gem supports different validation strategies for different environments:
166
320
 
167
321
  ```ruby
168
- $statsd.increment(
169
- "marathon.finished.total",
170
- tags: { marathon_type: :full, course: "san-francisco" }
322
+ # Development: Explode on any schema violations
323
+ dev_emitter = Datadog::Statsd::Emitter.new(
324
+ 'MyService',
325
+ schema: my_schema,
326
+ validation_mode: :strict # Raises exceptions
171
327
  )
172
328
 
173
- $statsd.distribution(
174
- "marathon.finished.duration",
175
- tags: { marathon_type: :full, course: "san-francisco" }
329
+ # Staging: Log warnings but continue
330
+ staging_emitter = Datadog::Statsd::Emitter.new(
331
+ 'MyService',
332
+ schema: my_schema,
333
+ validation_mode: :warn # Prints to stderr, continues execution
176
334
  )
177
- ```
178
335
 
179
- ### Validation Mode
336
+ # Production: Drop invalid metrics silently
337
+ prod_emitter = Datadog::Statsd::Emitter.new(
338
+ 'MyService',
339
+ schema: my_schema,
340
+ validation_mode: :drop # Silently drops invalid metrics
341
+ )
180
342
 
181
- There are four validation modes you can pass to an emitter to accompany a schema:
343
+ # Emergency: Turn off validation entirely
344
+ emergency_emitter = Datadog::Statsd::Emitter.new(
345
+ 'MyService',
346
+ schema: my_schema,
347
+ validation_mode: :off # No validation at all
348
+ )
349
+ ```
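
One way to picture the four modes is as a single dispatch on what to do with a metric that failed validation. The sketch below is a hypothetical helper, not the gem's code. In particular, it assumes that `:warn` still sends the metric after logging, which the text above does not state explicitly, and it treats `:off` as "send regardless".

```ruby
# Hypothetical dispatcher: decides the fate of a metric that failed validation.
# Returns :sent or :dropped, and raises in :strict mode.
def handle_violation(mode, message)
  case mode
  when :strict
    raise ArgumentError, message
  when :warn
    warn("[metrics] #{message}") # Kernel#warn writes to stderr
    :sent                        # assumption: the metric is still emitted
  when :drop
    :dropped
  when :off
    :sent
  else
    raise ArgumentError, "unknown validation mode: #{mode.inspect}"
  end
end

puts handle_violation(:drop, 'unknown metric user_registrations')
```

Rejecting unknown modes loudly keeps a typo such as `:strcit` from silently disabling validation.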
182
350
 
183
- 1. `:strict` raises an exception when anything out of the ordinary is passed
184
- 2. `:warn` — print to stderr and continue
185
- 3. `:drop` — drop this metric
186
- 4. `:off` — no validation, as if schema was not even passed.
351
+ ## Best Practices: Designing Schemas That Scale
187
352
 
188
- ### An Example Tracking Web Performance
353
+ ### 🎯 Design Metrics Before Code
189
354
 
190
355
  ```ruby
191
- Datadog::Statsd::Schema.configure do |config|
192
- config.statsd = $statsd
193
- config.schema = Datadog::Statsd::Schema.new do
194
- namespace "web" do
195
- namespace "request" do
196
- tags do
197
- tag :uri,
198
- values: %r{.*}
199
-
200
- tag :logged_in,
201
- values: %w[logged_in logged_out]
202
-
203
- tag :billing_plan,
204
- values: %w[premium trial free]
205
-
206
- tag :controller,
207
- values: %r{[a-z.]*},
208
- transform: [ :underscore, :downcase ]
209
-
210
- tag :action,
211
- values: %r{[a-z.]*},
212
- transform: [ :underscore, :downcase ]
213
-
214
- tag :method,
215
- values: %i[get post put patch delete head options trace connect],
216
- transform: [ :downcase ]
217
-
218
- tag :status_code,
219
- type: :integer,
220
- validate: ->(code) { (100..599).include?(code) }
221
- end
222
-
223
- metrics do
224
- # This distribution allows tracking of the latency of the request.
225
- distribution :duration do
226
- description "HTTP request processing time in milliseconds"
227
- tags allowed: %w[controller action method status_code region]
228
- required: %w[controller]
229
- end
230
-
231
- # This counter allows tracking the frequency of each controller/action
232
- counter :total, inherit_tags: :duration do
233
- description "Total number of requests received"
234
- end
235
- end
236
- end
356
+ # Good: design the session metrics up front, like this
357
+ session_schema = Datadog::Statsd::Schema.new do
358
+ namespace :user_sessions do
359
+ tags do
360
+ tag :session_type, values: %w[web mobile api]
361
+ tag :auth_method, values: %w[password oauth sso]
362
+ tag :plan_tier, values: %w[free premium enterprise]
363
+ end
364
+
365
+ metrics do
366
+ counter :started do
367
+ description "User sessions initiated"
368
+ tags required: [:session_type], allowed: [:auth_method, :plan_tier]
369
+ end
370
+
371
+ counter :ended do
372
+ description "User sessions terminated"
373
+ tags required: [:session_type, :auth_method]
374
+ end
375
+
376
+ distribution :duration do
377
+ description "How long sessions last"
378
+ units "minutes"
379
+ tags required: [:session_type]
237
380
  end
238
381
  end
239
382
  end
240
- ```
383
+ end
241
384
 
385
+ # ❌ Bad: Don't do this
386
+ statsd.increment('user_login', tags: { user_id: user.id }) # Infinite cardinality!
387
+ statsd.increment('session_start_web_premium_oauth') # Explosion of metric names!
388
+ statsd.gauge('active_users_on_mobile_free_plan_from_usa', 1000) # Way too specific!
389
+ ```
242
390
 
243
- Let's say this monitor only tracks requests from logged in premium users, then you can provide those tags here, and they will be sent together with individual invocations:
391
+ ### 🏷️ Tag Strategy: Finite and Purposeful
244
392
 
245
393
  ```ruby
246
- # We'll use the shorthand version to create this Emitter.
247
- # It's equivalent to *Datadog::Statsd::Emitter.new*
248
- traffic_monitor = Datadog.emitter(
249
- self,
250
- metric: "web.request",
251
- tags: { billing_plan: :premium, logged_in: :logged_in }
252
- )
253
-
254
- traffic_monitor.increment('total', tags: { uri: '/home/settings', method: :get } )
255
- traffic_monitor.distribution('duration', tags: { uri: '/home/settings', method: :get } )
256
-
257
- traffic_monitor.increment('total', tags: { uri: '/app/calendar', method: :post} )
258
- traffic_monitor.distribution('duration', tags: { uri: '/app/calendar', method: :post } )
394
+ # Good: Finite tag values that enable grouping/filtering
395
+ tag :plan_type, values: %w[free premium enterprise]
396
+ tag :region, values: %w[us-east us-west eu-central ap-southeast]
397
+ tag :feature_flag, values: %w[enabled disabled control]
398
+
399
+ # Bad: Infinite or high-cardinality tags
400
+ tag :user_id # Millions of possible values!
401
+ tag :session_id # Unique every time!
402
+ tag :timestamp # Infinite values!
403
+ tag :request_path # Thousands of unique URLs!
259
404
  ```
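
The difference between finite and unbounded tags is easiest to see as arithmetic. Datadog counts each unique combination of metric name and tag values as a separate custom metric, so the product of the per-tag value counts gives an upper bound for one metric name. The helper below is a hypothetical name for illustration. The counts come from the tag lists above, plus the 2 million users mentioned in the introduction.

```ruby
# Upper bound on distinct tag combinations for a single metric name:
# the product of the number of possible values per tag.
def max_tag_combinations(values_per_tag)
  values_per_tag.values.reduce(1, :*)
end

# plan_type (3) x region (4) x feature_flag (3)
finite = max_tag_combinations({ plan_type: 3, region: 4, feature_flag: 3 })

# Swap region and feature_flag for a user_id tag with 2 million users
unbounded = max_tag_combinations({ plan_type: 3, user_id: 2_000_000 })

puts finite    # => 36
puts unbounded # => 6000000
```

Only combinations that are actually emitted are counted, so real numbers are usually lower, but the bound shows why a single high-cardinality tag dominates everything else.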
260
-
261
- The above code will send two metrics: `web.request.total` as a counter, tagged with: `{ billing_plan: :premium, logged_in: :logged_in, uri: '/home/settings' }` and the second time for the `uri: '/app/calendar'`.
262
-
263
- ### Emitter
264
-
265
- You can create instances of this class and use the instance to emit custom metrics. You may want to do this, instead of using the class methods directly, for two reasons:
266
405
 
267
- 1. You want to send metrics from several places in the codebase, but have them share the "emitter" tag (which identifies the source class or object emitting the metric), or any other tags for that matter.
406
+ ### 📊 Metric Types: Choose Wisely
268
407
 
269
- 2. You want to send metrics with a different sample rate than the defaults.
408
+ ```ruby
409
+ namespace :email_service do
410
+ metrics do
411
+ # ✅ Use counters for events that happen
412
+ counter :sent do
413
+ description "Emails successfully sent"
414
+ end
415
+
416
+ # ✅ Use gauges for current state/levels
417
+ gauge :queue_size do
418
+ description "Emails waiting to be sent"
419
+ end
420
+
421
+ # ✅ Use distributions for value analysis (careful - each can count as up to 10 custom metrics once percentiles are enabled!)
422
+ distribution :delivery_time do
423
+ description "Time from send to delivery"
424
+ units "seconds"
425
+ end
426
+
427
+ # ⚠️ Use histograms sparingly (creates 5 metrics each)
428
+ histogram :processing_time do
429
+ description "Email processing duration"
430
+ units "milliseconds"
431
+ end
432
+
433
+ # ⚠️ Use sets very carefully (tracks unique values)
434
+ set :unique_recipients do
435
+ description "Unique email addresses receiving mail"
436
+ end
437
+ end
438
+ end
439
+ ```
270
440
 
271
- In both cases, you can create an instance of this class and use it to emit metrics.
441
+ ### 🔄 Schema Evolution: Plan for Change
272
442
 
273
- #### Naming Metrics
443
+ ```ruby
444
+ # ✅ Good: Use inherit_tags to reduce duplication
445
+ base_schema = Datadog::Statsd::Schema.new do
446
+ namespace :payments do
447
+ tags do
448
+ tag :payment_method, values: %w[card bank_transfer crypto]
449
+ tag :currency, values: %w[USD EUR GBP JPY]
450
+ tag :region, values: %w[north_america europe asia]
451
+ end
452
+
453
+ metrics do
454
+ counter :initiated do
455
+ description "Payment attempts started"
456
+ tags required: [:payment_method], allowed: [:currency, :region]
457
+ end
458
+
459
+ counter :completed do
460
+ description "Successful payments"
461
+ inherit_tags: "payments.initiated" # Reuses the tag configuration
462
+ end
463
+
464
+ counter :failed do
465
+ description "Failed payment attempts"
466
+ inherit_tags: "payments.initiated"
467
+ tags required: [:failure_reason] # Add specific tags as needed (note: :failure_reason is not declared in the tags block above)
468
+ end
469
+ end
470
+ end
471
+ end
472
+ ```
274
473
 
275
- Please remember that naming *IS* important. Good naming is self-documenting, easy to slice the data by, and easy to understand and analyze. Keep the number of unique metric names down, number of tags down, and the number of possible tag values should always be finite. If in doubt, set a tag, instead of creating a new metric.
474
+ ### 🏗️ Namespace Organization
276
475
 
277
- #### Example — Tracking email delivery
476
+ ```ruby
477
+ # ✅ Good: Hierarchical organization by domain
478
+ app_schema = Datadog::Statsd::Schema.new do
479
+ namespace :ecommerce do
480
+ namespace :orders do
481
+ # Order-related metrics
482
+ end
483
+
484
+ namespace :inventory do
485
+ # Stock and fulfillment metrics
486
+ end
487
+
488
+ namespace :payments do
489
+ # Payment processing metrics
490
+ end
491
+ end
492
+
493
+ namespace :infrastructure do
494
+ namespace :database do
495
+ # DB performance metrics
496
+ end
497
+
498
+ namespace :cache do
499
+ # Redis/Memcached metrics
500
+ end
501
+ end
502
+ end
503
+
504
+ # ❌ Bad: Flat namespace chaos
505
+ # orders.created
506
+ # orders_completed
507
+ # order::cancelled
508
+ # INVENTORY_LOW
509
+ # db.query.time
510
+ # cache_hits
511
+ ```
278
512
 
279
- Imagine that we want to track email delivery. But we have many types of emails that we send. Instead of creating new metric for each new email type, use the tag "email_type" to specify what type of email it is.
513
+ ## Advanced Features
280
514
 
281
- Keep metric name list short, eg: "emails.queued", "emails.sent", "emails.delivered" are good metrics as they define a distinctly unique events. However, should you want to differentiate between different types of emails, you could theoretically do the following: (BAD EXAMPLE, DO NOT FOLLOW) — "emails.sent.welcome", "emails.sent.payment". But this example conflates two distinct events into a single metric. Instead, we should use tags to set event properties, such as what type of email that is.
515
+ ### Global Configuration
282
516
 
283
517
  ```ruby
284
-
285
- emails_emitter = Datadog.emitter(
286
- self,
287
- metric: 'emails'
288
- )
289
-
290
- emails_emitter.increment('queued.total')
291
- emails_emitter.increment('delivered.total', by: count)
292
- emails_emitter.gauge('queue.size', EmailQueue.size)
518
+ # Set up global configuration in your initializer
519
+ Datadog::Statsd::Schema.configure do |config|
520
+ # Global tags applied to ALL metrics
521
+ config.tags = {
522
+ env: Rails.env,
523
+ service: 'web-app',
524
+ version: ENV['GIT_SHA']&.first(7),
525
+ datacenter: ENV['DATACENTER'] || 'us-east-1'
526
+ }
527
+
528
+ # The actual StatsD client
529
+ config.statsd = Datadog::Statsd.new(
530
+ ENV['STATSD_HOST'] || 'localhost',
531
+ (ENV['STATSD_PORT'] || 8125).to_i, # ENV values are strings; coerce the port to an integer
532
+ namespace: ENV['STATSD_NAMESPACE'],
533
+ tags: [], # Don't double-up tags here
534
+ delay_serialization: true
535
+ )
536
+ end
293
537
  ```
294
538
 
295
- #### What's the Emitter Constructor Arguments?
539
+ ### Tag Transformers
296
540
 
297
- The first argument to `Emitter.new()` or `Datadog.emitter()` (the two are equivalent) is an object, a string, or a class that is converted to a tag called `emitter`. This is the source class or object that sent the metric. The same metric may come from various places in your code, and the `emitter` tag allows you to differentiate between them.
541
+ ```ruby
542
+ schema_with_transforms = Datadog::Statsd::Schema.new do
543
+ transformers do
544
+ underscore: ->(text) { text.underscore }
545
+ downcase: ->(text) { text.downcase }
546
+ truncate: ->(text) { text.first(20) }
547
+ end
548
+
549
+ namespace :user_actions do
550
+ tags do
551
+ # Controller names get normalized automatically
552
+ tag :controller,
553
+ values: %r{^[a-z_]+$},
554
+ transform: [:underscore, :downcase] # Applied in order
555
+
556
+ # Action names also get cleaned up
557
+ tag :action,
558
+ values: %w[index show create update destroy],
559
+ transform: [:downcase]
560
+ end
561
+ end
562
+ end
298
563
 
299
- Subsequent arguments are hash arguments.
564
+ # "UserSettingsController" becomes "user_settings_controller"
565
+ # "Create" becomes "create" (the :action tag only applies :downcase)
566
+ ```
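
"Applied in order" means each transformer receives the output of the previous one. The sketch below shows that chaining in plain Ruby, without the gem or ActiveSupport. The `transformers` hash and `apply_transforms` lambda are stand-ins, and the regex-based `underscore` is a simplification that does not handle acronyms such as `HTTPServer`.

```ruby
# Stand-in transformers written without ActiveSupport.
transformers = {
  underscore: ->(text) { text.gsub(/([a-z\d])([A-Z])/, '\1_\2') },
  downcase: ->(text) { text.downcase }
}

# Apply the named transformers left to right, feeding each result to the next.
apply_transforms = lambda do |value, names|
  names.reduce(value.to_s) { |text, name| transformers.fetch(name).call(text) }
end

controller = apply_transforms.call('UserSettingsController', %i[underscore downcase])
reversed   = apply_transforms.call('UserSettingsController', %i[downcase underscore])
action     = apply_transforms.call('CreateUser', %i[downcase])

puts controller # => user_settings_controller
puts reversed   # => usersettingscontroller
puts action     # => createuser
```

Order matters: downcasing first removes the capital letters that the underscore step relies on. A tag that only applies `:downcase` also never gains underscores.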
300
567
 
301
- * `metric` — The (optional) name of the metric to track. If set to, eg. `emails`, then any subsequent method sending metric will prepend `emails.` to it, for example:
568
+ ### Complex Validation
302
569
 
303
570
  ```ruby
304
- emitter.increment('sent.total', by: 3)
571
+ advanced_schema = Datadog::Statsd::Schema.new do
572
+ namespace :financial do
573
+ tags do
574
+ # Custom validation with lambdas
575
+ tag :amount_bucket,
576
+ validate: ->(value) { %w[small medium large].include?(value) }
577
+
578
+ # Regex validation for IDs
579
+ tag :transaction_type,
580
+ values: %r{^[A-Z]{2,4}_[0-9]{3}$} # Like "AUTH_001", "VOID_042"
581
+
582
+ # Type validation
583
+ tag :user_segment,
584
+ type: :integer,
585
+ validate: ->(segment) { (1..10).include?(segment) }
586
+ end
587
+ end
588
+ end
305
589
  ```
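
The three validation styles above can be tried in isolation with plain Ruby, which is a quick way to check a pattern before putting it in a schema. This sketch does not use the gem, and the local names are illustrative only.

```ruby
transaction_type = /^[A-Z]{2,4}_[0-9]{3}$/
strict_type      = /\A[A-Z]{2,4}_[0-9]{3}\z/ # anchors to the whole string, not just one line

auth_ok   = 'AUTH_001'.match?(transaction_type)   # true
refund_ok = 'REFUND_042'.match?(transaction_type) # false: six letters exceed {2,4}

# In Ruby, ^ and $ match at line boundaries, so a multi-line value can slip through.
multiline = "AUTH_001\nextra"
loose_hit  = multiline.match?(transaction_type) # true
strict_hit = multiline.match?(strict_type)      # false

# Type check first, then the custom rule, mirroring `type: :integer` plus `validate:`.
in_range      = ->(segment) { (1..10).include?(segment) }
valid_segment = ->(value) { value.is_a?(Integer) && in_range.call(value) }

puts [auth_ok, refund_ok, loose_hit, strict_hit].inspect
puts [valid_segment.call(7), valid_segment.call('7'), valid_segment.call(11)].inspect
```

If tag values can come from user input, `\A` and `\z` are the safer anchors for this reason.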
306
590
 
307
- Will actually increment the metric `emails.sent.total`.
308
-
309
- #### Other Examples
591
+ ### Loading Schemas from Files
310
592
 
311
593
  ```ruby
312
-
313
- Datadog.emitter(self)
314
- .increment('emails.sent', by: 2)
315
-
316
- Datadog.emitter(ab_test: { 'login_test_2025' => 'control' })
317
- .increment('users.logged_in')
318
- # => tags: { ab_test_name: 'login_test_2025',
319
- # ab_test_group: 'control' }
320
-
321
- Datadog.emitter(SessionsController, metric: 'users')
322
- .gauge('logged_in', 100)
323
-
324
- sessions = Datadog.emitter(SessionsController, metric: 'users')
325
- # => tags: { emitter: "sessions_controller" }
326
- sessions.gauge('active', 100)
327
- sessions.distribution('active.total', 114)
594
+ # config/metrics_schema.rb
595
+ Datadog::Statsd::Schema.new do
596
+ namespace :my_app do
597
+ # ... schema definition
598
+ end
599
+ end
600
+
601
+ # In your application
602
+ schema = Datadog::Statsd::Schema.load_file('config/metrics_schema.rb')
328
603
  ```
329
604
 
330
605
  ## Installation
331
606
 
607
+ Add to your Gemfile:
332
608
 
333
- ```bash
334
- bundle add datadog-statsd-schema
609
+ ```ruby
610
+ gem 'datadog-statsd-schema'
335
611
  ```
336
612
 
337
- If bundler is not being used to manage dependencies, install the gem by executing:
613
+ Or install directly:
338
614
 
339
615
  ```bash
340
616
  gem install datadog-statsd-schema
341
617
  ```
342
618
 
343
- ## Usage
619
+ ## The Bottom Line
344
620
 
345
- 1. Define your metrics and tagging schema
346
- 2. Create as many "emitters" as necessary and start sending!
621
+ This gem transforms Datadog custom metrics from a "wild west" free-for-all into a disciplined, cost-effective observability strategy:
347
622
 
348
- ## Development
623
+ - **🎯 Intentional Metrics**: Define what you measure before you measure it
624
+ - **💰 Cost Control**: Prevent infinite cardinality and metric explosion
625
+ - **🏷️ Consistent Tagging**: Global and hierarchical tag management
626
+ - **🔍 Better Insights**: Finite tag values enable proper aggregation and analysis
627
+ - **👥 Team Alignment**: Schema serves as documentation and contract
349
628
 
350
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
629
+ Stop the metric madness. Start with a schema.
351
630
 
352
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
631
+ ---
353
632
 
354
633
  ## Contributing
355
634