datadog-statsd-schema 0.1.2 → 0.2.0
- checksums.yaml +4 -4
- data/.rubocop.yml +4 -0
- data/.rubocop_todo.yml +27 -22
- data/README.md +286 -514
- data/examples/schema/example_marathon.rb +29 -0
- data/exe/dss +8 -0
- data/lib/datadog/statsd/schema/analyzer.rb +397 -0
- data/lib/datadog/statsd/schema/cli.rb +16 -0
- data/lib/datadog/statsd/schema/commands/analyze.rb +52 -0
- data/lib/datadog/statsd/schema/commands.rb +14 -0
- data/lib/datadog/statsd/schema/namespace.rb +1 -1
- data/lib/datadog/statsd/schema/version.rb +1 -1
- data/lib/datadog/statsd/schema.rb +2 -0
- metadata +23 -4
- data/exe/datadog-statsd-schema +0 -3
data/README.md
CHANGED
@@ -1,638 +1,410 @@
|
|
1
1
|
[](https://github.com/kigster/datadog-statsd-schema/actions/workflows/ruby.yml)
|
2
2
|
|
3
|
-
# Datadog
|
3
|
+
# Datadog StatsD Schema
|
4
4
|
|
5
|
-
|
5
|
+
A Ruby gem that provides comprehensive schema definition, validation, and cost analysis for Datadog StatsD metrics. This library helps teams prevent metric explosion, control costs, and maintain consistent metric naming conventions.
|
6
6
|
|
7
|
-
|
7
|
+
## Features
|
8
8
|
|
9
|
-
|
9
|
+
- **Schema Definition**: Define metric schemas with type safety and validation
|
10
|
+
- **Tag Management**: Centralized tag definitions with inheritance and validation
|
11
|
+
- **Cost Analysis**: Analyze potential custom metric costs before deployment
|
12
|
+
- **Metric Validation**: Runtime validation of metrics against defined schemas
|
13
|
+
- **CLI Tools**: Command-line interface for schema analysis and validation
|
14
|
+
- **Global Configuration**: Centralized configuration for tags and StatsD clients
|
10
15
|
|
11
|
-
|
12
|
-
|
13
|
-
- **Marketing** added `statsd.increment('clicks', tags: { campaign_id: campaign.id })` across 10,000 campaigns
|
14
|
-
- **DevOps** thought `statsd.gauge('memory', tags: { container_id: container.uuid })` was a great idea
|
15
|
-
- **Frontend** started tracking `statsd.timing('page.load', tags: { user_id: current_user.id })` for 2 million users
|
16
|
-
- **Everyone** has their own creative naming conventions: `user_signups`, `user.sign.ups`, `users::signups`, `Users.Signups`
|
16
|
+
## Installation
|
17
17
|
|
18
|
-
|
18
|
+
Add this line to your application's Gemfile:
|
19
19
|
|
20
|
-
|
20
|
+
```ruby
|
21
|
+
gem 'datadog-statsd-schema'
|
22
|
+
```
|
21
23
|
|
22
|
-
|
24
|
+
And then execute:
|
23
25
|
|
24
|
-
|
26
|
+
```bash
|
27
|
+
bundle install
|
28
|
+
```
|
25
29
|
|
26
|
-
|
27
|
-
2. **📋 Schema Validation** - Define your metrics upfront, validate everything, prevent metric explosion
|
30
|
+
Or install it yourself as:
|
28
31
|
|
29
|
-
|
32
|
+
```bash
|
33
|
+
gem install datadog-statsd-schema
|
34
|
+
```
|
30
35
|
|
31
|
-
## Quick Start
|
36
|
+
## Quick Start
|
32
37
|
|
33
|
-
|
38
|
+
### Basic Schema Definition
|
34
39
|
|
35
40
|
```ruby
|
36
41
|
require 'datadog/statsd/schema'
|
37
42
|
|
38
|
-
#
|
43
|
+
# Define your metrics schema
|
44
|
+
schema = Datadog::Statsd::Schema.new do
|
45
|
+
namespace :web do
|
46
|
+
tags do
|
47
|
+
tag :environment, values: %w[production staging development]
|
48
|
+
tag :service, values: %w[api web worker]
|
49
|
+
tag :region, values: %w[us-east-1 us-west-2]
|
50
|
+
end
|
51
|
+
|
52
|
+
metrics do
|
53
|
+
counter :requests_total do
|
54
|
+
description "Total HTTP requests"
|
55
|
+
tags required: [:environment, :service], allowed: [:region]
|
56
|
+
end
|
57
|
+
|
58
|
+
gauge :memory_usage do
|
59
|
+
description "Memory usage in bytes"
|
60
|
+
tags required: [:environment], allowed: [:service, :region]
|
61
|
+
end
|
62
|
+
|
63
|
+
distribution :request_duration do
|
64
|
+
description "Request processing time in milliseconds"
|
65
|
+
tags required: [:environment, :service]
|
66
|
+
end
|
67
|
+
end
|
68
|
+
end
|
69
|
+
end
|
70
|
+
```
|
71
|
+
|
72
|
+
### Using the Emitter with Schema Validation
|
73
|
+
|
74
|
+
```ruby
|
75
|
+
# Configure global settings
|
39
76
|
Datadog::Statsd::Schema.configure do |config|
|
40
|
-
config.tags = { env: 'production', service: 'web-app', version: '1.2.3' }
|
41
77
|
config.statsd = Datadog::Statsd.new('localhost', 8125)
|
78
|
+
config.schema = schema
|
79
|
+
config.tags = { environment: 'production' }
|
42
80
|
end
|
43
81
|
|
44
|
-
# Create an emitter
|
45
|
-
|
46
|
-
|
47
|
-
|
82
|
+
# Create an emitter with validation
|
83
|
+
emitter = Datadog::Statsd::Emitter.new(
|
84
|
+
schema: schema,
|
85
|
+
validation_mode: :strict # :strict, :warn, or :disabled
|
48
86
|
)
|
49
87
|
|
50
|
-
# Send
|
51
|
-
|
88
|
+
# Send metrics with automatic validation
|
89
|
+
emitter.increment('web.requests_total', tags: { service: 'api', region: 'us-east-1' })
|
90
|
+
emitter.gauge('web.memory_usage', 512_000_000, tags: { service: 'api' })
|
91
|
+
emitter.distribution('web.request_duration', 45.2, tags: { service: 'api' })
|
52
92
|
```
|
53
93
|
|
54
|
-
|
55
|
-
|
56
|
-
|
57
|
-
|
58
|
-
|
59
|
-
# service: 'web-app', # From global config
|
60
|
-
# version: '1.2.3', # From global config
|
61
|
-
# emitter: 'auth_service', # Auto-generated from first argument
|
62
|
-
# feature: 'user_auth', # From emitter constructor
|
63
|
-
# method: 'oauth' # From method call
|
64
|
-
# }
|
65
|
-
```
|
94
|
+
## CLI Usage
|
95
|
+
|
96
|
+
The gem provides a command-line interface for analyzing schemas and understanding their cost implications.
|
97
|
+
|
98
|
+
### Installation
|
66
99
|
|
67
|
-
|
68
|
-
- Method-level tags override emitter tags
|
69
|
-
- Emitter tags override global tags
|
70
|
-
- Global tags are always included
|
100
|
+
After installing the gem, the `dss` (Datadog StatsD Schema) command will be available:
|
71
101
|
|
72
|
-
|
102
|
+
```bash
|
103
|
+
dss --help
|
104
|
+
```
|
73
105
|
|
74
|
-
|
106
|
+
### Schema Analysis
|
75
107
|
|
76
|
-
|
108
|
+
Create a schema file (e.g., `metrics_schema.rb`):
|
77
109
|
|
78
110
|
```ruby
|
79
|
-
|
80
|
-
|
81
|
-
|
82
|
-
|
83
|
-
|
84
|
-
|
85
|
-
|
86
|
-
|
87
|
-
end
|
88
|
-
|
111
|
+
namespace :web do
|
112
|
+
tags do
|
113
|
+
tag :environment, values: %w[production staging development]
|
114
|
+
tag :service, values: %w[api web worker]
|
115
|
+
tag :region, values: %w[us-east-1 us-west-2 eu-west-1]
|
116
|
+
end
|
117
|
+
|
118
|
+
namespace :requests do
|
89
119
|
metrics do
|
90
|
-
|
91
|
-
|
92
|
-
|
93
|
-
tags required: [:signup_method], allowed: [:plan_type, :feature_flag]
|
120
|
+
counter :total do
|
121
|
+
description "Total HTTP requests"
|
122
|
+
tags required: [:environment, :service], allowed: [:region]
|
94
123
|
end
|
95
|
-
|
96
|
-
|
97
|
-
description "
|
98
|
-
|
124
|
+
|
125
|
+
distribution :duration do
|
126
|
+
description "Request processing time in milliseconds"
|
127
|
+
inherit_tags "web.requests.total"
|
99
128
|
end
|
100
129
|
end
|
101
130
|
end
|
102
|
-
end
|
103
|
-
|
104
|
-
# Create an emitter bound to this schema
|
105
|
-
user_emitter = Datadog::Statsd::Emitter.new(
|
106
|
-
'UserService',
|
107
|
-
schema: user_metrics_schema,
|
108
|
-
validation_mode: :strict # Explode on invalid metrics (good for development)
|
109
|
-
)
|
110
131
|
|
111
|
-
|
112
|
-
|
132
|
+
metrics do
|
133
|
+
gauge :memory_usage do
|
134
|
+
description "Memory usage in bytes"
|
135
|
+
tags required: [:environment], allowed: [:service]
|
136
|
+
end
|
137
|
+
end
|
138
|
+
end
|
139
|
+
```
|
113
140
|
|
114
|
-
|
115
|
-
user_emitter.increment('signups', tags: { signup_method: 'facebook' })
|
141
|
+
Analyze the schema to understand metric costs:
|
116
142
|
|
117
|
-
|
118
|
-
|
143
|
+
```bash
|
144
|
+
dss analyze --file metrics_schema.rb
|
145
|
+
```
|
119
146
|
|
120
|
-
|
121
|
-
|
147
|
+
**Output:**
|
148
|
+
```
|
149
|
+
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
|
150
|
+
│ Detailed Metric Analysis: │
|
151
|
+
└──────────────────────────────────────────────────────────────────────────────────────────────┘
|
152
|
+
|
153
|
+
• gauge('web.memory_usage')
|
154
|
+
Expanded names:
|
155
|
+
• web.memory_usage.count
|
156
|
+
• web.memory_usage.min
|
157
|
+
• web.memory_usage.max
|
158
|
+
• web.memory_usage.sum
|
159
|
+
• web.memory_usage.avg
|
160
|
+
|
161
|
+
Unique tags: 2
|
162
|
+
Total tag values: 6
|
163
|
+
Possible combinations: 45
|
164
|
+
|
165
|
+
──────────────────────────────────────────────────────────────────────────────────────────────
|
166
|
+
|
167
|
+
• counter('web.requests.total')
|
168
|
+
|
169
|
+
Unique tags: 3
|
170
|
+
Total tag values: 9
|
171
|
+
Possible combinations: 27
|
172
|
+
|
173
|
+
──────────────────────────────────────────────────────────────────────────────────────────────
|
174
|
+
|
175
|
+
• distribution('web.requests.duration')
|
176
|
+
Expanded names:
|
177
|
+
• web.requests.duration.count
|
178
|
+
• web.requests.duration.min
|
179
|
+
• web.requests.duration.max
|
180
|
+
• web.requests.duration.sum
|
181
|
+
• web.requests.duration.avg
|
182
|
+
• web.requests.duration.p50
|
183
|
+
• web.requests.duration.p75
|
184
|
+
• web.requests.duration.p90
|
185
|
+
• web.requests.duration.p95
|
186
|
+
• web.requests.duration.p99
|
187
|
+
|
188
|
+
Unique tags: 3
|
189
|
+
Total tag values: 9
|
190
|
+
Possible combinations: 270
|
191
|
+
|
192
|
+
──────────────────────────────────────────────────────────────────────────────────────────────
|
193
|
+
┌──────────────────────────────────────────────────────────────────────────────────────────────┐
|
194
|
+
│ Schema Analysis Results: │
|
195
|
+
│ SUMMARY │
|
196
|
+
└──────────────────────────────────────────────────────────────────────────────────────────────┘
|
197
|
+
|
198
|
+
Total unique metrics: 16
|
199
|
+
Total possible custom metric combinations: 342
|
122
200
|
```
|
123
201
|
|
124
|
-
**
|
125
|
-
- ❌ Metrics that don't exist
|
126
|
-
- ❌ Wrong metric types (counter vs gauge vs distribution)
|
127
|
-
- ❌ Missing required tags
|
128
|
-
- ❌ Invalid tag values
|
129
|
-
- ❌ Tags that aren't allowed on specific metrics
|
202
|
+
This analysis shows that your schema can generate up to **342 custom metrics** across **16 unique metric names**. Knowing this before deployment helps prevent Datadog billing surprises.
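
The totals in the output above can be re-derived by hand. The following is a minimal stand-alone sketch of that arithmetic, not the gem's analyzer: `EXPANSIONS`, `TAG_VALUES`, `METRICS`, and `combinations` are hypothetical names introduced here, and the per-type expansion counts are taken from the report itself.

```ruby
# Hypothetical sketch: reproduces the numbers printed by `dss analyze` above.
# None of these constants or methods are part of datadog-statsd-schema.

# Number of expanded names per metric type, as shown in the report.
EXPANSIONS = { counter: 1, gauge: 5, distribution: 10 }.freeze

# Tag values from metrics_schema.rb.
TAG_VALUES = {
  environment: %w[production staging development],
  service:     %w[api web worker],
  region:      %w[us-east-1 us-west-2 eu-west-1]
}.freeze

# Each metric with its type and the tags it may carry (required + allowed).
METRICS = {
  "web.memory_usage"      => { type: :gauge,        tags: %i[environment service] },
  "web.requests.total"    => { type: :counter,      tags: %i[environment service region] },
  "web.requests.duration" => { type: :distribution, tags: %i[environment service region] }
}.freeze

# Tag-value combinations multiplied by the number of expanded names.
def combinations(metric)
  tag_combos = metric[:tags].map { |tag| TAG_VALUES.fetch(tag).size }.reduce(1, :*)
  tag_combos * EXPANSIONS.fetch(metric[:type])
end

per_metric  = METRICS.transform_values { |metric| combinations(metric) }
total_names = METRICS.values.sum { |metric| EXPANSIONS.fetch(metric[:type]) }

per_metric.each { |name, count| puts "#{name}: #{count}" }
puts "Total unique metrics: #{total_names}"
puts "Total possible custom metric combinations: #{per_metric.values.sum}"
```

The gauge contributes 3 × 3 × 5 = 45, the counter 3 × 3 × 3 × 1 = 27, and the distribution 3 × 3 × 3 × 10 = 270, which sum to 342.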
|
130
203
|
|
131
|
-
##
|
204
|
+
## Advanced Features
|
205
|
+
|
206
|
+
### Tag Inheritance
|
132
207
|
|
133
|
-
|
208
|
+
Metrics can inherit tag configurations from other metrics to reduce duplication:
|
134
209
|
|
135
210
|
```ruby
|
136
|
-
|
137
|
-
|
138
|
-
|
139
|
-
|
140
|
-
downcase: ->(text) { text.downcase }
|
141
|
-
end
|
142
|
-
|
143
|
-
namespace :ecommerce do
|
144
|
-
tags do
|
145
|
-
# Finite set of product categories (not product IDs!)
|
146
|
-
tag :category, values: %w[electronics clothing books home_garden]
|
147
|
-
|
148
|
-
# Payment methods you actually support
|
149
|
-
tag :payment_method, values: %w[credit_card paypal apple_pay]
|
150
|
-
|
151
|
-
# Order status progression
|
152
|
-
tag :status, values: %w[pending processing shipped delivered cancelled]
|
153
|
-
|
154
|
-
# A/B test groups (not test IDs!)
|
155
|
-
tag :checkout_flow, values: %w[single_page multi_step express]
|
156
|
-
end
|
157
|
-
|
158
|
-
namespace :orders do
|
159
|
-
metrics do
|
160
|
-
counter :created do
|
161
|
-
description "New orders placed"
|
162
|
-
tags required: [:category], allowed: [:payment_method, :checkout_flow]
|
163
|
-
end
|
164
|
-
|
165
|
-
counter :completed do
|
166
|
-
description "Successfully processed orders"
|
167
|
-
inherit_tags: "ecommerce.orders.created" # Reuse tag definition
|
168
|
-
tags required: [:status]
|
169
|
-
end
|
170
|
-
|
171
|
-
distribution :value do
|
172
|
-
description "Order value distribution in cents"
|
173
|
-
units "cents"
|
174
|
-
tags required: [:category], allowed: [:payment_method]
|
175
|
-
end
|
176
|
-
|
177
|
-
gauge :processing_queue_size do
|
178
|
-
description "Orders waiting to be processed"
|
179
|
-
# No tags - just a simple queue size metric
|
180
|
-
end
|
181
|
-
end
|
211
|
+
namespace :api do
|
212
|
+
metrics do
|
213
|
+
counter :requests_total do
|
214
|
+
tags required: [:environment, :service], allowed: [:region]
|
182
215
|
end
|
183
|
-
|
184
|
-
|
185
|
-
|
186
|
-
|
187
|
-
|
188
|
-
tags required: [:category]
|
189
|
-
end
|
190
|
-
|
191
|
-
counter :restocked do
|
192
|
-
description "Inventory replenishment events"
|
193
|
-
tags required: [:category]
|
194
|
-
end
|
195
|
-
end
|
216
|
+
|
217
|
+
# Inherits environment, service (required) and region (allowed) from requests_total
|
218
|
+
distribution :request_duration do
|
219
|
+
inherit_tags "api.requests_total"
|
220
|
+
tags required: [:endpoint] # Adds endpoint as additional required tag
|
196
221
|
end
|
197
222
|
end
|
198
223
|
end
|
199
|
-
|
200
|
-
# Usage in your order processing service
|
201
|
-
order_processor = Datadog::Statsd::Emitter.new(
|
202
|
-
'OrderProcessor',
|
203
|
-
schema: ecommerce_schema,
|
204
|
-
metric: 'ecommerce.orders', # Prefix for all metrics from this emitter
|
205
|
-
tags: { checkout_flow: 'single_page' }
|
206
|
-
)
|
207
|
-
|
208
|
-
# Process an order - clean, validated metrics
|
209
|
-
order_processor.increment('created', tags: {
|
210
|
-
category: 'electronics',
|
211
|
-
payment_method: 'credit_card'
|
212
|
-
})
|
213
|
-
|
214
|
-
order_processor.distribution('value', 15_99, tags: {
|
215
|
-
category: 'electronics',
|
216
|
-
payment_method: 'credit_card'
|
217
|
-
})
|
218
|
-
|
219
|
-
order_processor.gauge('processing_queue_size', 12)
|
220
224
|
```
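
As an illustration of the merge described in the comments above, here is a minimal sketch in plain Ruby. It assumes inherited and locally declared tags are simply unioned; `TagSpec` and `inherit_tags` are hypothetical names, and the gem's actual resolution logic may differ.

```ruby
# Hypothetical sketch of tag inheritance; not the gem's implementation.
TagSpec = Struct.new(:required, :allowed, keyword_init: true)

# Build a child spec from a parent spec plus the child's own additions.
def inherit_tags(parent, required: [], allowed: [])
  TagSpec.new(
    required: (parent.required + required).uniq,
    allowed:  (parent.allowed + allowed).uniq
  )
end

requests_total   = TagSpec.new(required: %i[environment service], allowed: %i[region])
request_duration = inherit_tags(requests_total, required: %i[endpoint])

puts "required: #{request_duration.required.inspect}"
puts "allowed:  #{request_duration.allowed.inspect}"
```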
|
221
225
|
|
222
|
-
###
|
226
|
+
### Nested Namespaces
|
227
|
+
|
228
|
+
Organize metrics hierarchically with nested namespaces:
|
223
229
|
|
224
230
|
```ruby
|
225
|
-
|
226
|
-
|
231
|
+
namespace :application do
|
232
|
+
tags do
|
233
|
+
tag :environment, values: %w[prod staging dev]
|
234
|
+
end
|
235
|
+
|
236
|
+
namespace :database do
|
227
237
|
tags do
|
228
|
-
|
229
|
-
tag :method, values: %w[GET POST PUT PATCH DELETE]
|
230
|
-
|
231
|
-
# Standardized controller names (transformed to snake_case)
|
232
|
-
tag :controller,
|
233
|
-
values: %r{^[a-z_]+$}, # Regex validation
|
234
|
-
transform: [:underscore, :downcase]
|
235
|
-
|
236
|
-
# Standard HTTP status code ranges
|
237
|
-
tag :status_class, values: %w[2xx 3xx 4xx 5xx]
|
238
|
-
tag :status_code,
|
239
|
-
type: :integer,
|
240
|
-
validate: ->(code) { (100..599).include?(code) }
|
241
|
-
|
242
|
-
# Feature flags for A/B testing
|
243
|
-
tag :feature_version, values: %w[v1 v2 experimental]
|
238
|
+
tag :table_name, values: %w[users orders products]
|
244
239
|
end
|
245
|
-
|
246
|
-
|
247
|
-
|
248
|
-
|
249
|
-
description "Total API requests"
|
250
|
-
tags required: [:method, :controller],
|
251
|
-
allowed: [:status_class, :feature_version]
|
252
|
-
end
|
253
|
-
|
254
|
-
distribution :duration do
|
255
|
-
description "Request processing time"
|
256
|
-
units "milliseconds"
|
257
|
-
inherit_tags: "api.requests.total"
|
258
|
-
tags required: [:status_code]
|
259
|
-
end
|
260
|
-
|
261
|
-
histogram :response_size do
|
262
|
-
description "Response payload size distribution"
|
263
|
-
units "bytes"
|
264
|
-
tags required: [:method, :controller]
|
265
|
-
end
|
266
|
-
end
|
267
|
-
end
|
268
|
-
|
269
|
-
namespace :errors do
|
270
|
-
metrics do
|
271
|
-
counter :total do
|
272
|
-
description "API errors by type"
|
273
|
-
tags required: [:controller, :status_code]
|
274
|
-
end
|
275
|
-
end
|
240
|
+
|
241
|
+
metrics do
|
242
|
+
counter :queries_total
|
243
|
+
distribution :query_duration
|
276
244
|
end
|
277
245
|
end
|
278
|
-
end
|
279
246
|
|
280
|
-
|
281
|
-
|
282
|
-
|
283
|
-
|
284
|
-
|
285
|
-
|
286
|
-
|
287
|
-
|
288
|
-
|
289
|
-
self.class.name,
|
290
|
-
schema: api_schema,
|
291
|
-
metric: 'api',
|
292
|
-
validation_mode: Rails.env.production? ? :warn : :strict
|
293
|
-
)
|
294
|
-
end
|
295
|
-
|
296
|
-
def track_request
|
297
|
-
controller_name = self.class.name.gsub('Controller', '').underscore
|
298
|
-
|
299
|
-
@api_metrics.increment('requests.total', tags: {
|
300
|
-
method: request.method,
|
301
|
-
controller: controller_name,
|
302
|
-
status_class: "#{response.status.to_s[0]}xx"
|
303
|
-
})
|
304
|
-
|
305
|
-
@api_metrics.distribution('requests.duration',
|
306
|
-
request_duration_ms,
|
307
|
-
tags: {
|
308
|
-
method: request.method,
|
309
|
-
controller: controller_name,
|
310
|
-
status_code: response.status
|
311
|
-
}
|
312
|
-
)
|
247
|
+
namespace :cache do
|
248
|
+
tags do
|
249
|
+
tag :cache_type, values: %w[redis memcached]
|
250
|
+
end
|
251
|
+
|
252
|
+
metrics do
|
253
|
+
counter :hits_total
|
254
|
+
counter :misses_total
|
255
|
+
end
|
313
256
|
end
|
314
257
|
end
|
315
258
|
```
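
Judging by names such as `web.requests.total` in the analysis output, the full metric name appears to be the dotted path of its enclosing namespaces. A small sketch of that flattening follows; `TREE` and `full_names` are hypothetical and only mirror the example schema above.

```ruby
# Hypothetical sketch: flatten nested namespaces into dotted metric names.
# TREE mirrors the example schema; it is not produced by the gem.
TREE = {
  application: {
    database: %i[queries_total query_duration],
    cache:    %i[hits_total misses_total]
  }
}.freeze

# Walk the tree, carrying the namespace path down to each metric.
def full_names(node, prefix = [])
  case node
  when Hash  then node.flat_map { |namespace, child| full_names(child, prefix + [namespace]) }
  when Array then node.map { |metric| (prefix + [metric]).join(".") }
  else []
  end
end

puts full_names(TREE)
```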
|
316
259
|
|
317
|
-
|
260
|
+
### Validation Modes
|
318
261
|
|
319
|
-
|
262
|
+
Control how validation errors are handled:
|
320
263
|
|
321
264
|
```ruby
|
322
|
-
#
|
323
|
-
|
324
|
-
'MyService',
|
325
|
-
schema: my_schema,
|
326
|
-
validation_mode: :strict # Raises exceptions
|
327
|
-
)
|
328
|
-
|
329
|
-
# Staging: Log warnings but continue
|
330
|
-
staging_emitter = Datadog::Statsd::Emitter.new(
|
331
|
-
'MyService',
|
332
|
-
schema: my_schema,
|
333
|
-
validation_mode: :warn # Prints to stderr, continues execution
|
334
|
-
)
|
265
|
+
# Strict mode: Raises exceptions on validation failures
|
266
|
+
emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :strict)
|
335
267
|
|
336
|
-
#
|
337
|
-
|
338
|
-
'MyService',
|
339
|
-
schema: my_schema,
|
340
|
-
validation_mode: :drop # Silently drops invalid metrics
|
341
|
-
)
|
268
|
+
# Warn mode: Logs warnings but continues execution
|
269
|
+
emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :warn)
|
342
270
|
|
343
|
-
#
|
344
|
-
|
345
|
-
'MyService',
|
346
|
-
schema: my_schema,
|
347
|
-
validation_mode: :off # No validation at all
|
348
|
-
)
|
271
|
+
# Disabled: No validation (production default)
|
272
|
+
emitter = Datadog::Statsd::Emitter.new(schema: schema, validation_mode: :disabled)
|
349
273
|
```
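
To make the difference between the three modes concrete, the sketch below shows one way a caller could branch on the mode once a violation has been detected. `SchemaViolation` and `handle_violation` are hypothetical names for illustration; the gem's own error classes and logging are not shown in this README.

```ruby
# Hypothetical sketch of mode handling; not the gem's implementation.
class SchemaViolation < StandardError; end

# Decide what to do with a validation failure for the given mode.
# on_warn is injectable so the warning sink can be swapped out.
def handle_violation(mode, message, on_warn: ->(text) { warn(text) })
  case mode
  when :strict
    raise SchemaViolation, message
  when :warn
    on_warn.call("[schema] #{message}")
    :warned
  when :disabled
    :ignored
  else
    raise ArgumentError, "unknown validation mode: #{mode.inspect}"
  end
end

puts handle_violation(:disabled, "missing required tag :environment")
```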
|
350
274
|
|
351
|
-
|
275
|
+
### Global Configuration
|
352
276
|
|
353
|
-
|
277
|
+
Set up global defaults for your application:
|
354
278
|
|
355
279
|
```ruby
|
356
|
-
|
357
|
-
|
358
|
-
|
359
|
-
|
360
|
-
|
361
|
-
|
362
|
-
|
363
|
-
|
364
|
-
|
365
|
-
|
366
|
-
|
367
|
-
description "User sessions initiated"
|
368
|
-
tags required: [:session_type], allowed: [:auth_method, :plan_tier]
|
369
|
-
end
|
370
|
-
|
371
|
-
counter :ended do
|
372
|
-
description "User sessions terminated"
|
373
|
-
tags required: [:session_type, :auth_method]
|
374
|
-
end
|
375
|
-
|
376
|
-
distribution :duration do
|
377
|
-
description "How long sessions last"
|
378
|
-
units "minutes"
|
379
|
-
tags required: [:session_type]
|
380
|
-
end
|
381
|
-
end
|
382
|
-
end
|
280
|
+
Datadog::Statsd::Schema.configure do |config|
|
281
|
+
config.statsd = Datadog::Statsd.new(
|
282
|
+
ENV['DATADOG_AGENT_HOST'] || 'localhost',
|
283
|
+
ENV['DATADOG_AGENT_PORT'] || 8125
|
284
|
+
)
|
285
|
+
config.schema = schema
|
286
|
+
config.tags = {
|
287
|
+
environment: ENV['RAILS_ENV'],
|
288
|
+
service: 'my-application',
|
289
|
+
version: ENV['APP_VERSION']
|
290
|
+
}
|
383
291
|
end
|
384
292
|
|
385
|
-
#
|
386
|
-
|
387
|
-
|
388
|
-
statsd.gauge('active_users_on_mobile_free_plan_from_usa', 1000) # Way too specific!
|
293
|
+
# These global tags are automatically added to all metrics
|
294
|
+
emitter = Datadog::Statsd::Emitter.new
|
295
|
+
emitter.increment('user.signup') # Automatically includes global tags
|
389
296
|
```
|
390
297
|
|
391
|
-
|
298
|
+
## Cost Control and Best Practices
|
392
299
|
|
393
|
-
|
394
|
-
# ✅ Good: Finite tag values that enable grouping/filtering
|
395
|
-
tag :plan_type, values: %w[free premium enterprise]
|
396
|
-
tag :region, values: %w[us-east us-west eu-central ap-southeast]
|
397
|
-
tag :feature_flag, values: %w[enabled disabled control]
|
398
|
-
|
399
|
-
# ❌ Bad: Infinite or high-cardinality tags
|
400
|
-
tag :user_id # Millions of possible values!
|
401
|
-
tag :session_id # Unique every time!
|
402
|
-
tag :timestamp # Infinite values!
|
403
|
-
tag :request_path # Thousands of unique URLs!
|
404
|
-
```
|
300
|
+
### Understanding Metric Expansion
|
405
301
|
|
406
|
-
|
302
|
+
Different metric types create different numbers of time series:
|
407
303
|
|
408
|
-
|
409
|
-
|
410
|
-
|
411
|
-
|
412
|
-
|
413
|
-
description "Emails successfully sent"
|
414
|
-
end
|
415
|
-
|
416
|
-
# ✅ Use gauges for current state/levels
|
417
|
-
gauge :queue_size do
|
418
|
-
description "Emails waiting to be sent"
|
419
|
-
end
|
420
|
-
|
421
|
-
# ✅ Use distributions for value analysis (careful - creates 10 metrics!)
|
422
|
-
distribution :delivery_time do
|
423
|
-
description "Time from send to delivery"
|
424
|
-
units "seconds"
|
425
|
-
end
|
426
|
-
|
427
|
-
# ⚠️ Use histograms sparingly (creates 5 metrics each)
|
428
|
-
histogram :processing_time do
|
429
|
-
description "Email processing duration"
|
430
|
-
units "milliseconds"
|
431
|
-
end
|
432
|
-
|
433
|
-
# ⚠️ Use sets very carefully (tracks unique values)
|
434
|
-
set :unique_recipients do
|
435
|
-
description "Unique email addresses receiving mail"
|
436
|
-
end
|
437
|
-
end
|
438
|
-
end
|
439
|
-
```
|
304
|
+
- **Counter/Set**: 1 time series per unique tag combination
|
305
|
+
- **Gauge**: 5 time series per unique tag combination (count, min, max, sum, avg)
|
306
|
+
- **Distribution/Histogram**: 10 time series per unique tag combination (count, min, max, sum, avg, p50, p75, p90, p95, p99)
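
The list above can be expressed as a lookup from metric type to name suffixes. This is a sketch under the assumption that the suffixes are exactly those listed; `SUFFIXES` and `expanded_names` are hypothetical names, not part of the gem.

```ruby
# Hypothetical sketch: expanded metric names per type, following the list above.
PERCENTILE_SUFFIXES = %w[count min max sum avg p50 p75 p90 p95 p99].freeze

SUFFIXES = {
  counter:      [],
  set:          [],
  gauge:        %w[count min max sum avg],
  distribution: PERCENTILE_SUFFIXES,
  histogram:    PERCENTILE_SUFFIXES
}.freeze

# Counters and sets keep their own name; other types fan out by suffix.
def expanded_names(type, name)
  suffixes = SUFFIXES.fetch(type)
  suffixes.empty? ? [name] : suffixes.map { |suffix| "#{name}.#{suffix}" }
end

puts expanded_names(:gauge, "web.memory_usage")
```

Multiplying the length of this list by the number of tag-value combinations gives the per-metric figure reported by `dss analyze`.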
|
307
|
+
|
308
|
+
### Tag Value Limits
|
440
309
|
|
441
|
-
|
310
|
+
Be mindful of tag cardinality:
|
442
311
|
|
443
312
|
```ruby
|
444
|
-
#
|
445
|
-
|
446
|
-
|
447
|
-
|
448
|
-
|
449
|
-
|
450
|
-
tag :region, values: %w[north_america europe asia]
|
451
|
-
end
|
452
|
-
|
453
|
-
metrics do
|
454
|
-
counter :initiated do
|
455
|
-
description "Payment attempts started"
|
456
|
-
tags required: [:payment_method], allowed: [:currency, :region]
|
457
|
-
end
|
458
|
-
|
459
|
-
counter :completed do
|
460
|
-
description "Successful payments"
|
461
|
-
inherit_tags: "payments.initiated" # Reuses the tag configuration
|
462
|
-
end
|
463
|
-
|
464
|
-
counter :failed do
|
465
|
-
description "Failed payment attempts"
|
466
|
-
inherit_tags: "payments.initiated"
|
467
|
-
tags required: [:failure_reason] # Add specific tags as needed
|
468
|
-
end
|
469
|
-
end
|
470
|
-
end
|
471
|
-
end
|
313
|
+
# High cardinality - avoid
|
314
|
+
tag :user_id, type: :string # Could be millions of values
|
315
|
+
|
316
|
+
# Better approach - use bucketing
|
317
|
+
tag :user_tier, values: %w[free premium enterprise]
|
318
|
+
tag :user_cohort, values: %w[new_user returning_user power_user]
|
472
319
|
```
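
Bucketing means mapping an unbounded value onto one of the few allowed tag values before the metric is emitted. A minimal sketch, assuming a non-negative session count is available per user; the `user_cohort` helper and its thresholds are hypothetical and chosen only for illustration.

```ruby
# Hypothetical sketch: collapse a per-user number into a low-cardinality tag value.
# The thresholds (1 and 20) are illustrative assumptions, not recommendations.
def user_cohort(sessions_last_30_days)
  if sessions_last_30_days <= 1
    "new_user"
  elsif sessions_last_30_days < 20
    "returning_user"
  else
    "power_user"
  end
end

# The resulting tag has three possible values regardless of how many users exist.
tags = { user_cohort: user_cohort(7) }
puts tags.inspect
```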
|
473
320
|
|
474
|
-
###
|
321
|
+
### Schema Validation
|
322
|
+
|
323
|
+
Always validate your schema before deployment:
|
475
324
|
|
476
325
|
```ruby
|
477
|
-
#
|
478
|
-
|
479
|
-
|
480
|
-
|
481
|
-
|
482
|
-
end
|
483
|
-
|
484
|
-
namespace :inventory do
|
485
|
-
# Stock and fulfillment metrics
|
486
|
-
end
|
487
|
-
|
488
|
-
namespace :payments do
|
489
|
-
# Payment processing metrics
|
490
|
-
end
|
491
|
-
end
|
492
|
-
|
493
|
-
namespace :infrastructure do
|
494
|
-
namespace :database do
|
495
|
-
# DB performance metrics
|
496
|
-
end
|
497
|
-
|
498
|
-
namespace :cache do
|
499
|
-
# Redis/Memcached metrics
|
500
|
-
end
|
501
|
-
end
|
326
|
+
# Check for common issues
|
327
|
+
errors = schema.validate
|
328
|
+
if errors.any?
|
329
|
+
puts "Schema validation errors:"
|
330
|
+
errors.each { |error| puts " - #{error}" }
|
502
331
|
end
|
503
|
-
|
504
|
-
# ❌ Bad: Flat namespace chaos
|
505
|
-
# orders.created
|
506
|
-
# orders_completed
|
507
|
-
# order::cancelled
|
508
|
-
# INVENTORY_LOW
|
509
|
-
# db.query.time
|
510
|
-
# cache_hits
|
511
332
|
```
|
512
333
|
|
513
|
-
##
|
334
|
+
## Integration Examples
|
514
335
|
|
515
|
-
###
|
336
|
+
### Rails Integration
|
516
337
|
|
517
338
|
```ruby
|
518
|
-
#
|
339
|
+
# config/initializers/datadog_statsd.rb
|
340
|
+
schema = Datadog::Statsd::Schema.load_file(Rails.root.join('config/metrics_schema.rb'))
|
341
|
+
|
519
342
|
Datadog::Statsd::Schema.configure do |config|
|
520
|
-
|
343
|
+
config.statsd = Datadog::Statsd.new
|
344
|
+
config.schema = schema
|
521
345
|
config.tags = {
|
522
|
-
|
523
|
-
service: '
|
524
|
-
version: ENV['GIT_SHA']&.first(7),
|
525
|
-
datacenter: ENV['DATACENTER'] || 'us-east-1'
|
346
|
+
environment: Rails.env,
|
347
|
+
service: 'my-rails-app'
|
526
348
|
}
|
527
|
-
|
528
|
-
# The actual StatsD client
|
529
|
-
config.statsd = Datadog::Statsd.new(
|
530
|
-
ENV['STATSD_HOST'] || 'localhost',
|
531
|
-
ENV['STATSD_PORT'] || 8125,
|
532
|
-
namespace: ENV['STATSD_NAMESPACE'],
|
533
|
-
tags: [], # Don't double-up tags here
|
534
|
-
delay_serialization: true
|
535
|
-
)
|
536
349
|
end
|
537
|
-
```
|
538
350
|
|
539
|
-
|
351
|
+
# app/controllers/application_controller.rb
|
352
|
+
class ApplicationController < ActionController::Base
|
353
|
+
before_action :setup_metrics
|
540
354
|
|
541
|
-
|
542
|
-
|
543
|
-
|
544
|
-
|
545
|
-
|
546
|
-
|
547
|
-
end
|
548
|
-
|
549
|
-
namespace :user_actions do
|
550
|
-
tags do
|
551
|
-
# Controller names get normalized automatically
|
552
|
-
tag :controller,
|
553
|
-
values: %r{^[a-z_]+$},
|
554
|
-
transform: [:underscore, :downcase] # Applied in order
|
555
|
-
|
556
|
-
# Action names also get cleaned up
|
557
|
-
tag :action,
|
558
|
-
values: %w[index show create update destroy],
|
559
|
-
transform: [:downcase]
|
560
|
-
end
|
355
|
+
private
|
356
|
+
|
357
|
+
def setup_metrics
|
358
|
+
@metrics = Datadog::Statsd::Emitter.new(
|
359
|
+
validation_mode: Rails.env.production? ? :disabled : :warn
|
360
|
+
)
|
561
361
|
end
|
562
362
|
end
|
563
|
-
|
564
|
-
# "UserSettingsController" becomes "user_settings_controller"
|
565
|
-
# "CreateUser" becomes "create_user"
|
566
363
|
```
|
567
364
|
|
568
|
-
###
|
365
|
+
### Background Job Monitoring
|
569
366
|
|
570
367
|
```ruby
|
571
|
-
|
572
|
-
|
573
|
-
|
574
|
-
|
575
|
-
|
576
|
-
|
577
|
-
|
578
|
-
|
579
|
-
|
580
|
-
|
581
|
-
|
582
|
-
|
583
|
-
|
584
|
-
|
585
|
-
|
368
|
+
class OrderProcessingJob
|
369
|
+
def perform(order_id)
|
370
|
+
metrics = Datadog::Statsd::Emitter.new
|
371
|
+
|
372
|
+
start_time = Time.current
|
373
|
+
|
374
|
+
begin
|
375
|
+
process_order(order_id)
|
376
|
+
metrics.increment('jobs.order_processing.success', tags: { queue: 'orders' })
|
377
|
+
rescue => error
|
378
|
+
metrics.increment('jobs.order_processing.failure',
|
379
|
+
tags: { queue: 'orders', error_type: error.class.name })
|
380
|
+
raise
|
381
|
+
ensure
|
382
|
+
duration = Time.current - start_time
|
383
|
+
metrics.distribution('jobs.order_processing.duration', duration * 1000,
|
384
|
+
tags: { queue: 'orders' })
|
586
385
|
end
|
587
386
|
end
|
588
387
|
end
|
589
388
|
```
|
590
389
|
|
591
|
-
|
592
|
-
|
593
|
-
```ruby
|
594
|
-
# config/metrics_schema.rb
|
595
|
-
Datadog::Statsd::Schema.new do
|
596
|
-
namespace :my_app do
|
597
|
-
# ... schema definition
|
598
|
-
end
|
599
|
-
end
|
390
|
+
## Development
|
600
391
|
|
601
|
-
|
602
|
-
schema = Datadog::Statsd::Schema.load_file('config/metrics_schema.rb')
|
603
|
-
```
|
392
|
+
After checking out the repo, run:
|
604
393
|
|
605
|
-
|
606
|
-
|
607
|
-
|
608
|
-
|
609
|
-
```ruby
|
610
|
-
gem 'datadog-statsd-schema'
|
394
|
+
```bash
|
395
|
+
bin/setup # Install dependencies
|
396
|
+
bundle exec rake spec # Run tests
|
611
397
|
```
|
612
398
|
|
613
|
-
|
399
|
+
To install this gem onto your local machine:
|
614
400
|
|
615
401
|
```bash
|
616
|
-
|
402
|
+
bundle exec rake install
|
617
403
|
```
|
618
404
|
|
619
|
-
## The Bottom Line
|
620
|
-
|
621
|
-
This gem transforms Datadog custom metrics from a "wild west" free-for-all into a disciplined, cost-effective observability strategy:
|
622
|
-
|
623
|
-
- **🎯 Intentional Metrics**: Define what you measure before you measure it
|
624
|
-
- **💰 Cost Control**: Prevent infinite cardinality and metric explosion
|
625
|
-
- **🏷️ Consistent Tagging**: Global and hierarchical tag management
|
626
|
-
- **🔍 Better Insights**: Finite tag values enable proper aggregation and analysis
|
627
|
-
- **👥 Team Alignment**: Schema serves as documentation and contract
|
628
|
-
|
629
|
-
Stop the metric madness. Start with a schema.
|
630
|
-
|
631
|
-
---
|
632
|
-
|
633
405
|
## Contributing
|
634
406
|
|
635
|
-
Bug reports and pull requests are welcome on GitHub at
|
407
|
+
Bug reports and pull requests are welcome on GitHub at https://github.com/kigster/datadog-statsd-schema.
|
636
408
|
|
637
409
|
## License
|
638
410
|
|