statsd-instrument 3.8.0 → 3.9.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 7ce05bd8d34026227e2960ccabca96119580dc2d5737fff81a2bdfd8ea18f826
4
- data.tar.gz: d619b08700bb735922673013d7e6314f32d76e8744dcfa07966a35595aa27cf4
3
+ metadata.gz: 436100176f41a4557dfad3169208a34581a4a51ed03532857ce0192ef0586f95
4
+ data.tar.gz: 888b9d6dcab8741aede0db4044f9a4d5ce0eb146ce016037b5ee278296f1f2a1
5
5
  SHA512:
6
- metadata.gz: 42317b00c680ffc079e89bad712225b8a9656faedf1f8743d9031c26070ae05af0c2ef8a10f1200d94a708023e6cd667d91fbf64a972fecb7d97d327edc31f22
7
- data.tar.gz: 540c9a8bccc54633f40e9b2830748e764d6bf5034791bcdacb611be6a7fc615e66a9fa138e2ce58f4a6461f46bb40114fb7507f3f4357d86fccc5d9d6614d244
6
+ metadata.gz: 181b505253e000ae9f4b457716e4715227dbdc81d85357bedc335296af681d2975e4da0062abd4028b87d37462cd3ae5280f1033886a1b96a9bf566c32cf7930
7
+ data.tar.gz: cbc3e3cb9ed41abfda94075eeb4e25e29445f42447af826616cc3545419be2e4ceeab273183bbb56af3e166066ef5bb02ed19ac09b4467782c0184e5251e454f
@@ -0,0 +1,14 @@
1
+ ## ✅ What
2
+ <!-- A brief description of the changes in this PR. -->
3
+
4
+ ## 🤔 Why
5
+ <!-- A brief description of the reason for these changes. -->
6
+
7
+ ## 👩🔬 How to validate
8
+ <!-- Step-by-step instructions for how reviewers can verify these changes work as expected. -->
9
+
10
+ ## Checklist
11
+
12
+ - [ ] I documented the changes in the CHANGELOG file.
13
+ <!-- If this is a user-facing change, you must update the CHANGELOG file. OR -->
14
+ <!-- - [ ] This change is not user-facing and does not require a CHANGELOG update. -->
@@ -8,7 +8,7 @@ jobs:
8
8
  runs-on: ubuntu-latest
9
9
 
10
10
  steps:
11
- - uses: actions/checkout@v3
11
+ - uses: actions/checkout@v4
12
12
 
13
13
  - name: Set up Ruby
14
14
  uses: ruby/setup-ruby@v1
@@ -21,7 +21,7 @@ jobs:
21
21
  - name: Run throughput benchmark on branch
22
22
  run: benchmark/local-udp-throughput
23
23
 
24
- - uses: actions/checkout@v3
24
+ - uses: actions/checkout@v4
25
25
  with:
26
26
  ref: 'main'
27
27
 
@@ -8,7 +8,7 @@ jobs:
8
8
  runs-on: ubuntu-latest
9
9
 
10
10
  steps:
11
- - uses: actions/checkout@v3
11
+ - uses: actions/checkout@v4
12
12
 
13
13
  - name: Set up Ruby
14
14
  uses: ruby/setup-ruby@v1
@@ -9,14 +9,14 @@ jobs:
9
9
  strategy:
10
10
  fail-fast: false
11
11
  matrix:
12
- ruby: ['2.6', '2.7', '3.0', '3.1', '3.2', '3.3', 'ruby-head', 'jruby-9.3.7.0', 'truffleruby-22.2.0']
12
+ ruby: ['2.6', '2.7', '3.0', '3.1', '3.2', '3.3', 'ruby-head', 'jruby-9.4.8.0', 'truffleruby-22.3.1']
13
13
  # Windows and macOS builds started failing, so they are disabled for now
14
14
  # platform: [windows-2019, macOS-10.14, ubuntu-18.04]
15
15
  # exclude:
16
16
  # ...
17
17
 
18
18
  steps:
19
- - uses: actions/checkout@v3
19
+ - uses: actions/checkout@v4
20
20
 
21
21
  - name: Set up Ruby
22
22
  uses: ruby/setup-ruby@v1
data/.ruby-version CHANGED
@@ -1 +1 @@
1
- 3.3.0
1
+ 3.3.3
data/CHANGELOG.md CHANGED
@@ -6,6 +6,24 @@ section below.
6
6
 
7
7
  ## Unreleased changes
8
8
 
9
+ ## Version 3.9.1
10
+
11
+ - [#378](https://github.com/Shopify/statsd-instrument/pull/378) - Respect sampling rate when aggregation is enabled, just for timing metrics.
12
+ Not respecting the sampling rate incurs a performance penalty, because more metrics are sent than expected.
13
+ Moreover, it overloads the StatsD server, which has to send out and process more metrics than expected.
14
+
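As a rough illustration of what respecting the sampling rate means for a timing metric, here is a minimal Ruby sketch. It is not the library's implementation: `sampled?` and `timing_datagram` are hypothetical helpers, and the random source is injected only so the behavior can be checked deterministically.

```ruby
# Hypothetical helpers illustrating client-side sampling; not statsd-instrument's code.

# Returns true when an event should be emitted for the given sample rate.
# `random` is any callable returning a float in [0, 1).
def sampled?(sample_rate, random: -> { rand })
  sample_rate >= 1.0 || random.call < sample_rate
end

# Formats a timing datagram, appending the sample rate (|@rate) when it is
# below 1 so the receiver knows the metric was sampled.
def timing_datagram(name, milliseconds, sample_rate: 1.0)
  line = "#{name}:#{milliseconds}|ms"
  line += "|@#{sample_rate}" if sample_rate < 1.0
  line
end

# With a 10% sample rate, only draws below 0.1 lead to a datagram.
draws = [0.05, 0.5, 0.95].each
random = -> { draws.next }
sent = 3.times.count { sampled?(0.1, random: random) }
puts sent
puts timing_datagram("my.timer", 12, sample_rate: 0.1)
```

When the rate is ignored, every one of the three events above would be sent, which is the extra load the changelog entry describes.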
15
+ ## Version 3.9.0
16
+
17
+ - Introduced an experimental aggregation feature to improve the efficiency of metrics reporting by aggregating
18
+ multiple metric events into a single sample. This reduces the number of network requests and can significantly
19
+ decrease the overhead associated with high-frequency metric reporting. To enable metric aggregation, set the
20
+ `STATSD_ENABLE_AGGREGATION` environment variable to true. More information on this feature is available in the README.
21
+ - Added support for sending StatsD via Unix domain sockets. This feature is enabled by
22
+ setting the `STATSD_SOCKET` environment variable to the path of the Unix domain socket.
23
+ - :warning: **Possible breaking change**: We removed/renamed some classes and now Sinks are generic, so the classes `UDPSink` and `UDPBatchedSink` are now called
24
+ `StatsD::Instrument::Sink` and `StatsD::Instrument::BatchedSink` respectively.
25
+ If you used those internal classes, you will need to update your code to use the new classes.
26
+
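To show the Unix domain socket transport in isolation, the sketch below sends one StatsD line over a datagram socket using only Ruby's standard library. The socket path and metric name are made up for the example, and `send_over_uds` is a hypothetical helper rather than the library's sink.

```ruby
require "socket"
require "tmpdir"

# Sends one StatsD line over a Unix datagram socket and returns what the
# receiver read. Hypothetical helper; not statsd-instrument's sink.
def send_over_uds(line)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "statsd.sock") # made-up socket path for this sketch

    # Stand-in for a StatsD agent listening on the socket.
    receiver = Socket.new(:UNIX, :DGRAM)
    receiver.bind(Addrinfo.unix(path, :DGRAM))

    sender = Socket.new(:UNIX, :DGRAM)
    sender.connect(Addrinfo.unix(path, :DGRAM))
    sender.send(line, 0)

    received = receiver.recv(1024)
    sender.close
    receiver.close
    received
  end
end

puts send_over_uds("my.counter:1|c")
```

Unlike UDP, a full receive buffer on a Unix datagram socket can block the writer, which is why the benchmark later in this diff drains its receiver in a background thread.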
9
27
  ## Version 3.8.0
10
28
 
11
29
  - UDP batching will now track statistics about its own batching performance, and
data/Gemfile CHANGED
@@ -11,3 +11,10 @@ gem "yard"
11
11
  gem "rubocop", ">= 1.0"
12
12
  gem "rubocop-shopify", require: false
13
13
  gem "benchmark-ips"
14
+ gem "dogstatsd-ruby", "~> 5.0", require: false
15
+ platform :mri do
16
+ # only if Ruby is MRI && >= 3.2
17
+ if Gem::Version.new(RUBY_VERSION) >= Gem::Version.new("3.2")
18
+ gem "vernier", require: false
19
+ end
20
+ end
data/README.md CHANGED
@@ -65,6 +65,52 @@ The following environment variables are supported:
65
65
  - `statsd_instrument.batched_udp_sink.avg_batch_length`: The average number of statsd lines per batch.
66
66
 
67
67
 
68
+ ### Experimental aggregation feature
69
+
70
+ The aggregation feature is currently experimental and aims to improve the efficiency of metrics reporting by aggregating
71
+ multiple metric events into a single sample. This reduces the number of network requests and can significantly decrease the overhead
72
+ associated with high-frequency metric reporting.
73
+
74
+ This means that instead of sending each metric event individually, the library will aggregate multiple events into a single sample and send it to the StatsD server.
75
+ Example:
76
+
77
+ Instead of sending counters in multiple packets like this:
78
+ ```
79
+ my.counter:1|c
80
+ my.counter:1|c
81
+ my.counter:1|c
82
+ ```
83
+
84
+ The library will aggregate them into a single packet like this:
85
+ ```
86
+ my.counter:3|c
87
+ ```
88
+
89
+ and for histograms/distributions:
90
+ ```
91
+ my.histogram:1|h
92
+ my.histogram:2|h
93
+ my.histogram:3|h
94
+ ```
95
+
96
+ The library will aggregate them into a single packet like this:
97
+ ```
98
+ my.histogram:1:2:3|h
99
+ ```
100
+
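The two examples above can be sketched in a few lines of Ruby. This only illustrates the idea and is not the library's aggregator: `aggregate_lines` is a hypothetical helper, and it assumes counters are summed while histogram values are joined with `:` as shown above.

```ruby
# Hypothetical helper illustrating aggregation; not statsd-instrument's aggregator.
# Takes [name, value, type] events and returns one datagram line per metric.
def aggregate_lines(events)
  events
    .group_by { |name, _value, type| [name, type] }
    .map do |(name, type), grouped|
      values = grouped.map { |_name, value, _type| value }
      # Counters collapse to their sum; other types keep every value, packed with ":".
      payload = type == "c" ? values.sum : values.join(":")
      "#{name}:#{payload}|#{type}"
    end
end

events = [
  ["my.counter", 1, "c"],
  ["my.counter", 1, "c"],
  ["my.counter", 1, "c"],
  ["my.histogram", 1, "h"],
  ["my.histogram", 2, "h"],
  ["my.histogram", 3, "h"],
]
puts aggregate_lines(events)
```

Six events become two lines here, so the saving grows with how often the same metric name is emitted between flushes.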
101
+ #### Enabling Aggregation
102
+
103
+ To enable metric aggregation, set the following environment variables:
104
+
105
+ - `STATSD_ENABLE_AGGREGATION`: Set this to `true` to enable the experimental aggregation feature. Aggregation is disabled by default.
106
+ - `STATSD_AGGREGATION_INTERVAL`: Specifies the interval (in seconds) at which aggregated metrics are flushed and sent to the StatsD server.
107
+ For example, setting this to `2` will aggregate and send metrics every 2 seconds. Two seconds is also the default value if this environment variable is not set.
108
+
109
+ Please note that since aggregation is an experimental feature, it should be used with caution in production environments.
110
+
111
+ > [!WARNING]
112
+ > This feature is only compatible with Datadog Agent versions >=6.25.0 and <7.0.0, or versions >=7.25.0.
113
+
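For readers who want to see how the two settings above fit together, here is a small Ruby sketch that reads them from an environment hash. `aggregation_settings` is a hypothetical reader written for this note, using the variable names and the 2-second default as the README states them; the library's own parsing may differ.

```ruby
# Hypothetical reader for the two settings described in the README;
# statsd-instrument's own parsing may differ.
AggregationSettings = Struct.new(:enabled, :interval)

def aggregation_settings(env = ENV)
  AggregationSettings.new(
    # Disabled unless the variable is explicitly set to "true".
    env.fetch("STATSD_ENABLE_AGGREGATION", "false") == "true",
    # Flush interval in seconds, defaulting to 2 as documented.
    Float(env.fetch("STATSD_AGGREGATION_INTERVAL", 2)),
  )
end

settings = aggregation_settings(
  "STATSD_ENABLE_AGGREGATION" => "true",
  "STATSD_AGGREGATION_INTERVAL" => "5",
)
puts "enabled=#{settings.enabled} interval=#{settings.interval}s"
```

Passing a plain hash instead of `ENV` keeps the sketch easy to try without changing the process environment.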
68
114
  ## StatsD keys
69
115
 
70
116
  StatsD keys look like 'admin.logins.api.success'. Dots are used as namespace separators.
data/Rakefile CHANGED
@@ -2,6 +2,7 @@
2
2
 
3
3
  require "bundler/gem_tasks"
4
4
  require "rake/testtask"
5
+ require "rubocop/rake_task"
5
6
 
6
7
  Rake::TestTask.new("test") do |t|
7
8
  t.ruby_opts << "-r rubygems"
@@ -9,4 +10,14 @@ Rake::TestTask.new("test") do |t|
9
10
  t.test_files = FileList["test/**/*_test.rb"]
10
11
  end
11
12
 
13
+ RuboCop::RakeTask.new(:lint) do |task|
14
+ task.options = ["-D", "-S", "-E"]
15
+ end
16
+
17
+ RuboCop::RakeTask.new(:lint_fix) do |task|
18
+ task.options = ["-a"]
19
+ end
20
+
21
+ task lf: :lint_fix
22
+
12
23
  task(default: :test)
@@ -6,28 +6,113 @@ require "benchmark/ips"
6
6
  require "tmpdir"
7
7
  require "socket"
8
8
  require "statsd-instrument"
9
+ require "datadog/statsd"
10
+ require "forwardable"
11
+ require "vernier"
12
+
13
+ class DatadogShim
14
+ extend Forwardable
15
+
16
+ def_delegator :@client, :close
17
+ # This is a shim to make the Datadog client compatible with the StatsD client
18
+ # interface. It's not a complete implementation, but it's enough to run the
19
+ # benchmarks.
20
+ # @param [Datadog::Statsd] client
21
+ def initialize(client)
22
+ @client = client
23
+ end
24
+
25
+ class NullSink
26
+ def flush(blocking: false)
27
+ end
28
+ end
29
+
30
+ def sink
31
+ @sink ||= NullSink.new
32
+ end
33
+
34
+ def increment(stat, value = 1, tags: nil)
35
+ @client.increment(stat, value: value, tags: tags)
36
+ end
37
+
38
+ def measure(stat, value = nil, tags: nil, &block)
39
+ @client.time(stat, value: value, tags: tags, &block)
40
+ end
41
+
42
+ def histogram(stat, value = nil, tags: nil, &block)
43
+ @client.histogram(stat, value: value, tags: tags, &block)
44
+ end
45
+
46
+ def gauge(stat, value, tags: nil)
47
+ @client.gauge(stat, value: value, tags: tags)
48
+ end
49
+
50
+ def set(stat, value, tags: nil)
51
+ @client.set(stat, value: value, tags: tags)
52
+ end
53
+
54
+ def event(title, text, tags: nil)
55
+ @client.event(title, text, tags: tags)
56
+ end
57
+
58
+ def service_check(name, status, tags: nil)
59
+ @client.service_check(name, status, tags: tags)
60
+ end
61
+ end
9
62
 
10
63
  def send_metrics(client)
11
64
  client.increment("StatsD.increment", 10)
12
65
  client.measure("StatsD.measure") { 1 + 1 }
13
66
  client.gauge("StatsD.gauge", 12.0, tags: ["foo:bar", "quc"])
14
- client.set("StatsD.set", "value", tags: { foo: "bar", baz: "quc" })
15
- client.event("StasD.event", "12345")
16
- client.service_check("StatsD.service_check", "ok")
17
67
  end
18
68
 
69
+ def send_metrics_high_cardinality(client)
70
+ SERIES_COUNT.times do |i|
71
+ tags = ["series:#{i}", "foo:bar", "baz:quc"]
72
+ client.increment("StatsD.increment", 10, tags: tags)
73
+ client.measure("StatsD.measure", tags: tags) { 1 + 1 }
74
+ client.gauge("StatsD.gauge", 12.0, tags: tags)
75
+ end
76
+ end
77
+
78
+ SOCKET_PATH = File.join(Dir.pwd, "tmp/metric.sock")
19
79
  THREAD_COUNT = Integer(ENV.fetch("THREAD_COUNT", 5))
20
- EVENTS_PER_ITERATION = 6
21
- ITERATIONS = 50_000
22
- def benchmark_implementation(name, env = {})
80
+ EVENTS_PER_ITERATION = 3
81
+ ITERATIONS = Integer(ENV.fetch("ITERATIONS", 10_000))
82
+ SERIES_COUNT = Integer(ENV.fetch("SERIES_COUNT", 0))
83
+ ENABLE_PROFILING = ENV.key?("ENABLE_PROFILING")
84
+ UDS_MAX_SEND_SIZE = 32_768
85
+
86
+ LOG_DIR = File.join(Dir.tmpdir, "statsd-instrument-benchmarks")
87
+ FileUtils.mkdir_p(LOG_DIR)
88
+ puts "Logs are stored in #{LOG_DIR}"
89
+
90
+ def benchmark_implementation(name, env = {}, datadog_client = false)
23
91
  intermediate_results_filename = "#{Dir.tmpdir}/statsd-instrument-benchmarks/"
24
- log_filename = "#{Dir.tmpdir}/statsd-instrument-benchmarks/#{File.basename($PROGRAM_NAME)}-#{name}.log"
92
+ log_filename = File.join(LOG_DIR, "#{File.basename($PROGRAM_NAME)}-#{name}.log".tr(" ", "_"))
25
93
  FileUtils.mkdir_p(File.dirname(intermediate_results_filename))
94
+ FileUtils.mkdir_p(File.dirname(log_filename))
26
95
 
27
96
  # Set up an UDP listener to which we can send StatsD packets
28
97
  receiver = UDPSocket.new
29
98
  receiver.bind("localhost", 0)
30
99
 
100
+ FileUtils.mkdir_p(File.dirname(SOCKET_PATH))
101
+ FileUtils.rm_f(SOCKET_PATH)
102
+ receiver_uds = Socket.new(Socket::AF_UNIX, Socket::SOCK_DGRAM)
103
+ receiver_uds.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, true)
104
+ receiver_uds.setsockopt(Socket::SOL_SOCKET, Socket::SO_RCVBUF, UDS_MAX_SEND_SIZE * THREAD_COUNT)
105
+ receiver_uds.bind(Socket.pack_sockaddr_un(SOCKET_PATH))
106
+ # with UDS we have to take data out of the socket, otherwise it will fill up
107
+ # and we will block writing to it (which is what we are testing)
108
+ consume = Thread.new do
109
+ loop do
110
+ receiver_uds.recv(32768)
111
+ rescue
112
+ # Ignored
113
+ end
114
+ end
115
+
31
116
  log_file = File.open(log_filename, "w+", level: Logger::WARN)
32
117
  StatsD.logger = Logger.new(log_file)
33
118
 
@@ -37,23 +122,103 @@ def benchmark_implementation(name, env = {})
37
122
  "STATSD_ENV" => "production",
38
123
  ).merge(env)).client
39
124
 
40
- puts "===== #{name} throughtput (#{THREAD_COUNT} threads) ====="
125
+ if datadog_client
126
+ statsd = Datadog::Statsd.new(receiver.addr[2], receiver.addr[1], **env)
127
+ udp_client = DatadogShim.new(statsd)
128
+ end
129
+
130
+ series = SERIES_COUNT.zero? ? 1 : SERIES_COUNT
131
+ events_sent = THREAD_COUNT * EVENTS_PER_ITERATION * ITERATIONS * series
132
+ puts "===== #{name} throughput (#{THREAD_COUNT} threads) - total events: #{events_sent} ====="
133
+ start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
41
134
  threads = THREAD_COUNT.times.map do
42
135
  Thread.new do
43
136
  count = ITERATIONS
44
137
  while (count -= 1) > 0
45
- send_metrics(udp_client)
138
+ if SERIES_COUNT.zero?
139
+ send_metrics(udp_client)
140
+ else
141
+ send_metrics_high_cardinality(udp_client)
142
+ end
46
143
  end
47
144
  end
48
145
  end
49
- start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
146
+
50
147
  threads.each(&:join)
148
+ udp_client.shutdown if udp_client.respond_to?(:shutdown)
149
+ if datadog_client
150
+ udp_client.close
151
+ end
152
+
51
153
  duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
52
- events_sent = THREAD_COUNT * EVENTS_PER_ITERATION * ITERATIONS
53
- puts "events: #{(events_sent / duration).round(1)}/s"
154
+
155
+ consume.kill
54
156
  receiver.close
55
- udp_client.shutdown if udp_client.respond_to?(:shutdown)
157
+ receiver_uds.close
158
+
159
+ series = SERIES_COUNT.zero? ? 1 : SERIES_COUNT
160
+ events_sent = THREAD_COUNT * EVENTS_PER_ITERATION * ITERATIONS * series
161
+ puts "events: #{(events_sent / duration).round(1).to_s.reverse.gsub(/(\d{3})(?=\d)/, '\\1,').reverse}/s"
56
162
  end
57
163
 
164
+ if ENABLE_PROFILING
165
+ Vernier.start_profile(out: "tmp/benchmark_profile_udp_sync.json")
166
+ end
58
167
  benchmark_implementation("UDP sync", "STATSD_BUFFER_CAPACITY" => "0")
168
+ if ENABLE_PROFILING
169
+ Vernier.stop_profile
170
+ end
171
+
172
+ if ENABLE_PROFILING
173
+ Vernier.start_profile(out: "tmp/benchmark_profile_udp_async.json")
174
+ end
59
175
  benchmark_implementation("UDP batched")
176
+ if ENABLE_PROFILING
177
+ Vernier.stop_profile
178
+ end
179
+
180
+ if ENABLE_PROFILING
181
+ Vernier.start_profile(out: "tmp/benchmark_profile_uds_small_packet.json")
182
+ end
183
+ benchmark_implementation("UDS batched with small packet", "STATSD_SOCKET_PATH" => SOCKET_PATH)
184
+ if ENABLE_PROFILING
185
+ Vernier.stop_profile
186
+ end
187
+
188
+ if ENABLE_PROFILING
189
+ Vernier.start_profile(out: "tmp/benchmark_profile_uds_batched_async.json")
190
+ end
191
+ benchmark_implementation(
192
+ "UDS batched with jumbo packet",
193
+ "STATSD_SOCKET_PATH" => SOCKET_PATH,
194
+ "STATSD_MAX_PACKET_SIZE" => UDS_MAX_SEND_SIZE.to_s,
195
+ )
196
+ if ENABLE_PROFILING
197
+ Vernier.stop_profile
198
+ end
199
+
200
+ if ENABLE_PROFILING
201
+ Vernier.start_profile(out: "tmp/benchmark_udp_batched_with_aggregation.json")
202
+ end
203
+ benchmark_implementation(
204
+ "UDP batched with aggregation and 5 second interval",
205
+ "STATSD_ENABLE_AGGREGATION" => "true",
206
+ "STATSD_AGGREGATION_FLUSH_INTERVAL" => "5",
207
+ )
208
+ if ENABLE_PROFILING
209
+ Vernier.stop_profile
210
+ end
211
+
212
+ if ENABLE_PROFILING
213
+ Vernier.start_profile(out: "tmp/benchmark_uds_with_aggregation.json")
214
+ end
215
+ benchmark_implementation(
216
+ "UDS batched with aggregation and 5 second interval",
217
+ "STATSD_ENABLE_AGGREGATION" => "true",
218
+ "STATSD_AGGREGATION_FLUSH_INTERVAL" => "5",
219
+ "STATSD_SOCKET_PATH" => SOCKET_PATH,
220
+ "STATSD_MAX_PACKET_SIZE" => UDS_MAX_SEND_SIZE.to_s,
221
+ )
222
+ if ENABLE_PROFILING
223
+ Vernier.stop_profile
224
+ end
@@ -27,9 +27,10 @@ def benchmark_implementation(name, env = {})
27
27
  %x(git rev-parse --abbrev-ref HEAD).rstrip
28
28
  end
29
29
 
30
- intermediate_results_filename = "#{Dir.tmpdir}/statsd-instrument-benchmarks/#{File.basename($PROGRAM_NAME)}-#{name}"
31
- log_filename = "#{Dir.tmpdir}/statsd-instrument-benchmarks/#{File.basename($PROGRAM_NAME)}-#{name}.log"
32
- FileUtils.mkdir_p(File.dirname(intermediate_results_filename))
30
+ log_dir = "#{Dir.tmpdir}/statsd-instrument-benchmarks"
31
+ intermediate_results_filename = File.join(log_dir, "#{File.basename($PROGRAM_NAME)}-#{name}")
32
+ log_filename = File.join(log_dir, "#{File.basename($PROGRAM_NAME)}-#{name}.log")
33
+ FileUtils.mkdir_p(log_dir)
33
34
 
34
35
  # Set up an UDP listener to which we can send StatsD packets
35
36
  receiver = UDPSocket.new
@@ -69,7 +70,7 @@ def benchmark_implementation(name, env = {})
69
70
  File.unlink(intermediate_results_filename)
70
71
  end
71
72
 
72
- log_file.close
73
+ # log_file.close
73
74
  logs = File.read(log_filename)
74
75
  unless logs.empty?
75
76
  puts
@@ -81,3 +82,4 @@ end
81
82
 
82
83
  benchmark_implementation("UDP sync", "STATSD_BUFFER_CAPACITY" => "0")
83
84
  benchmark_implementation("UDP batched")
85
+ benchmark_implementation("UDP batched with aggregation", "STATSD_ENABLE_AGGREGATION" => "true")