hastur 1.2.8

@@ -0,0 +1,9 @@
1
+ *.swp
2
+ *.gem
3
+ .bundle
4
+ Gemfile.lock
5
+ pkg/*
6
+ doc/*
7
+ yardoc/*
8
+ coverage
9
+ .yardoc/*
@@ -0,0 +1 @@
1
+ --no-private
data/Gemfile ADDED
@@ -0,0 +1,3 @@
1
+ source "http://rubygems.org"
2
+
3
+ gemspec
data/README ADDED
@@ -0,0 +1,120 @@
1
+ What Is It?
2
+ -----------
3
+
4
+ Hastur is a monitoring system written by Ooyala. It uses Cassandra
5
+ for time series storage, resulting in remarkable power, flexibility
6
+ and scalability.
7
+
8
+ Hastur works hard to make it easy to add your data and easy to get it
9
+ back at full resolution. For instance, it makes it easy to query in
10
+ big batches from a REST server, build a dashboard of metrics, show
11
+ errors in production or email you when an error rate gets too high.
12
+
13
+ This gem helps you get your data into Hastur. See the "hastur-server"
14
+ gem for the back end, and for how to get your data back out.
15
+
16
+ How Do I Use It?
17
+ ----------------
18
+
19
+ Install this gem ("gem install hastur") or add it to your app's
20
+ Gemfile and run "bundle".
21
+
22
+ Add Hastur calls to your application, such as:
23
+
24
+ Hastur.counter "my.thing.to.count" # Add 1 to my.thing.to.count
25
+ Hastur.gauge "other.thing.foo_latency", 371.1 # Record a latency of 371.1
26
+
27
+ You can find extensive per-method documentation in the source code, or
28
+ see "Is It Documented?" below for friendly HTML documentation.
29
+
30
+ This is enough to instrument your application code, but you'll need to
31
+ install a local daemon and have a back-end collector for it to talk
32
+ to. See the hastur-server gem for specifics.
33
+
34
+ Hastur allows you to send at regular intervals using Hastur.every,
35
+ which will call a block from a background thread:
36
+
37
+ @total = 0
38
+ Hastur.every(:minute) { Hastur.gauge("total.counting.so.far", @total) }
39
+ loop { sleep 1; @total += 1 } # Count one per second, send it once per minute
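To make the interval idea concrete, here is a minimal sketch of how a once-per-interval scheduler could work. It is not the gem's implementation: the class name IntervalScheduler and its tick method are hypothetical, and the loop passes in fake clock values instead of sleeping in a background thread so the result is deterministic.

```ruby
# Hypothetical, simplified stand-in for a background scheduler.
class IntervalScheduler
  INTERVALS = { :five_secs => 5, :minute => 60, :hour => 60 * 60, :day => 60 * 60 * 24 }

  def initialize
    @blocks    = Hash.new { |hash, key| hash[key] = [] }
    @last_time = Hash.new(Time.at(0))
  end

  # Register a block to run once per interval.
  def every(interval, &block)
    raise ArgumentError, "unknown interval: #{interval}" unless INTERVALS.key?(interval)
    @blocks[interval] << block
  end

  # Run every block whose interval has elapsed as of `now`.
  def tick(now = Time.now)
    INTERVALS.each do |interval, seconds|
      next if now - @last_time[interval] < seconds
      @last_time[interval] = now
      @blocks[interval].each(&:call)
    end
  end
end

sent = []
total = 0
scheduler = IntervalScheduler.new
scheduler.every(:minute) { sent << total }

start = Time.at(1_000_000)
120.times do |second|
  total += 1                      # count one per second
  scheduler.tick(start + second)  # a real thread would wake up about once a second
end
```

Over the two simulated minutes the block fires twice, once on the first tick and once sixty seconds later, which matches the "count every second, send once per minute" pattern above.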
40
+
41
+ The YARD documentation (see below) has far more specifics.
42
+
43
+ Is It Documented?
44
+ -----------------
45
+
46
+ We use YARD. "gem install yard redcarpet", then type "yardoc" from
47
+ this source directory. This will generate documentation -- point a
48
+ browser at "doc/index.html" for the top-level view.
49
+
50
+ Mechanism
51
+ ---------
52
+
53
+ Your messages are automatically timestamped in microseconds, labeled
54
+ and converted to a JSON structure for transport and storage.
55
+
56
+ Hastur sends the JSON over a local UDP socket to a local "Hastur
57
+ Agent", a daemon that forwards your data to the shared Hastur servers.
58
+ That means that your application will never slow down for Hastur --
59
+ failed sends become no-ops. Note that local UDP won't randomly drop
60
+ packets like internet UDP, though you can lose them if there's no
61
+ Hastur Agent running.
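As a rough illustration of the client side of this mechanism, the sketch below builds a message hash, serializes it to JSON and sends it over local UDP. It uses Ruby's standard json library rather than multi_json, and a throwaway receiver socket on an ephemeral port stands in for the Hastur Agent. The helper name send_message is hypothetical, and the field values are made up for the example.

```ruby
require "json"
require "socket"

# Stand-in for the Hastur Agent: a UDP socket bound to an ephemeral local port.
agent = UDPSocket.new
agent.bind("127.0.0.1", 0)
agent_port = agent.addr[1]

# Hypothetical helper: serialize a message and fire it at the local agent.
# Any failure is swallowed, so a failed send becomes a no-op for the caller.
def send_message(message, port)
  socket = UDPSocket.new
  socket.send(JSON.generate(message), 0, "127.0.0.1", port)
  true
rescue StandardError
  false
ensure
  socket.close if socket
end

message = {
  :type      => :gauge,
  :name      => "other.thing.foo_latency",
  :value     => 371.1,
  :timestamp => (Time.now.to_f * 1_000_000).to_i, # microseconds since the epoch
  :labels    => { :app => "example" }
}

sent = send_message(message, agent_port)

# Wait up to two seconds for the datagram instead of blocking forever.
ready = IO.select([agent], nil, nil, 2)
received = ready ? JSON.parse(agent.recvfrom(65_535).first) : nil
agent.close
```

Because the sender never waits for a reply, the application pays only the cost of one local datagram per message.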
62
+
63
+ The Hastur Agent forwards the messages to Hastur Routers over ZeroMQ
64
+ (see "http://0mq.org"). The routers send it to the sinks, which
65
+ preprocess your data, index it and write it to Cassandra. They also
66
+ forward to the syndicators for the streaming interface (e.g. to email
67
+ you if there's a problem).
68
+
69
+ Cassandra is a highly scalable clustered key-value store inspired
70
+ somewhat by Amazon Dynamo. It's a lot of the "secret sauce" that
71
+ makes Hastur interesting.
72
+
73
+ Hints and Tips
74
+ --------------
75
+
76
+ 1. You can retrieve messages with the same name prefix all together from
77
+ the REST API (for instance: "my.thing.*"). It's usually a good idea
78
+ to give metrics the same prefix if you will retrieve them at the same
79
+ time. This prefix syntax is very efficient for Cassandra. That's why
80
+ we made it easy to use.
81
+
82
+ 2. Every call allows you to pass labels - a one-level string-to-string
83
+ hash of tags about what that call means and what data goes with it.
84
+ For instance, you might call:
85
+
86
+ Hastur.gauge "my.thing.total_latency", 317.4, :now, :units => "usec"
87
+
88
+ Eventually you'll be able to query messages by label through the REST
89
+ interface, but for now that's inconvenient. However, it's easy to
90
+ subscribe to labels in the streaming interface. So labels are a
91
+ powerful way to mark data as being interesting to alert you about.
92
+
93
+ For example:
94
+
95
+ Hastur.gauge "my.thing.total_latency", 317.4, :now, :severity => "omg!"
96
+
97
+ It's easy to subscribe to any latency with a severity label in the
98
+ streaming interface, which would let you calculate pretty well how bad the
99
+ overall latency is. See the hastur-server gem for details of the
100
+ trigger interface.
101
+
102
+ 3. You can group multiple messages together by giving them the same
103
+ timestamp. For instance:
104
+
105
+ ts = Hastur.timestamp
106
+ Hastur.gauge "my.thing.latency1", val1, ts
107
+ Hastur.gauge "my.thing.latency2", val2, ts
108
+ Hastur.counter "my.thing.counter371", 1, ts
109
+
110
+ This makes it easy to query all events with exactly that timestamp
111
+ and the same prefix ("my.thing.*"), and otherwise to make sure they're
112
+ exactly the same.
113
+
114
+ Do *not* give multiple messages the same name *and* the same
115
+ timestamp. Hastur will only store a single event with the same name
116
+ and timestamp from the same node. If you give several of them the
117
+ same name and timestamp, you'll lose all but one.
118
+
119
+ Keep in mind that timestamps are in microseconds -- you're not limited
120
+ to one event with the same name per second.
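The grouping and uniqueness rules in tip 3 can be pictured with a toy model in which storage is keyed by name and timestamp. This is only a sketch: the in-memory hash and the record lambda are hypothetical, and the choice that the last write wins is an assumption, since the text above only says that all but one message is lost.

```ruby
# Toy model of storage keyed by (name, timestamp).
# Which duplicate survives is an assumption (last write wins here).
store = {}
record = lambda { |name, value, ts| store[[name, ts]] = value }

ts = (Time.now.to_f * 1_000_000).to_i            # microseconds since the epoch
record.call("my.thing.latency1", 12.5, ts)
record.call("my.thing.latency2", 40.0, ts)
record.call("my.thing.latency1", 99.9, ts)       # same name and timestamp: replaces the first
record.call("my.thing.latency1", 13.1, ts + 1)   # one microsecond later: stored separately

# Everything under the prefix with exactly the shared timestamp.
grouped = store.select { |(name, stamp), _| name.start_with?("my.thing.") && stamp == ts }
```

Four calls leave three stored entries, and the query on the shared timestamp returns the two distinct names.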
@@ -0,0 +1,27 @@
1
+ require "bundler/gem_tasks"
2
+ require "rake/testtask"
3
+
4
+ namespace "test" do
5
+ desc "Run all unit tests"
6
+ Rake::TestTask.new(:units) do |t|
7
+ t.libs += ["test"]
8
+ t.test_files = Dir["test/*_test.rb"]
9
+ t.verbose = true
10
+ end
11
+ end
12
+
13
+ inclusion_tests = Dir["test/inclusion/*_test.rb"]
14
+
15
+ inclusion_tests.each do |test_filename|
16
+ test_name = test_filename.split("/")[-1].sub(/_test\.rb$/, "").gsub("_", " ")
17
+
18
+ desc "Hastur #{test_name} inclusion test"
19
+ task "test:inclusion:#{test_name}" do
20
+ system("ruby", "-I.", test_filename)
21
+ raise "Test #{test_name} failed!" unless $?.success?
22
+ end
23
+
24
+ task "test:inclusions" => "test:inclusion:#{test_name}"
25
+ end
26
+
27
+ task "test" => [ "test:units", "test:inclusions" ]
@@ -0,0 +1,127 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "hastur"
4
+ require "chronic"
5
+ require "trollop"
6
+
7
+ opts = Trollop::options do
8
+ banner <<EOS
9
+ hastur is a command-line program to send Hastur metrics.
10
+
11
+ Usage:
12
+ hastur [options] <type> [<name> [<value>]]
13
+
14
+ Examples:
15
+ hastur counter things.to.do 4 --label app=MyApp type=todo
16
+ hastur heartbeat slow.cron.job
17
+ hastur mark script.ran.after.failed.job --label env=development activity=debugging
18
+ hastur gauge old.gauge 37.1 --time "3 months ago Saturday at 5pm"
19
+
20
+ Options:
21
+ EOS
22
+ opt :time, "Timestamp to send", :type => String
23
+ opt :label, "Labels to send", :type => :strings, :multi => true
24
+ opt :print, "Print the call args", :type => :boolean, :default => false
25
+ end
26
+
27
+ Trollop::die "you must give a type!" if ARGV.empty?
28
+ Type = ARGV.shift.downcase
29
+
30
+ # Args:
31
+ # - mark: name, value, timestamp, labels
32
+ # - counter: name, increment, timestamp, labels
33
+ # - gauge: name, value, timestamp, labels
34
+ # - event: name, subject, body, attn, timestamp, labels
35
+ # - heartbeat: name, value, timeout, timestamp, labels
36
+
37
+ TYPES = {
38
+ "mark" => {
39
+ :name => true,
40
+ :value => :maybe,
41
+ },
42
+ "gauge" => {
43
+ :name => true,
44
+ :value => true,
45
+ },
46
+ "counter" => {
47
+ :name => true,
48
+ :value => :maybe,
49
+ },
50
+ "heartbeat" => {
51
+ :name => :maybe,
52
+ :value => :maybe,
53
+ :timeout => :maybe,
54
+ }
55
+ }
56
+
57
+ Trollop::die "Type must be one of: #{TYPES.keys.join(', ')}" unless TYPES[Type]
58
+
59
+ #
60
+ # This method tries to evaluate a string as Ruby and, if it can't,
61
+ # dies saying so.
62
+ #
63
+ # @param [String] value_string The code to evaluate
64
+ # @param [String] description What the value will be used as
65
+ # @return The result of the eval
66
+ # @raise TrollopException Your string didn't parse or raised an exception
67
+ #
68
+ def try_eval(value_string, description)
69
+ begin
70
+ value = eval(value_string)
71
+ rescue Exception
72
+ Trollop::die "Your #{description} (#{value_string}) didn't run as Ruby: #{$!.message}"
73
+ end
74
+ value
75
+ end
76
+
77
+ #
78
+ # Try to get an argument by name if this message type supports it.
79
+ #
80
+ def try_get_arg(args, arg_name, message_type)
81
+ # Doesn't allow this arg type? Return quietly.
82
+ return unless TYPES[Type][arg_name]
83
+
84
+ if ARGV.size > 0
85
+ # If the arg is here and TYPES[Type][arg_name] is true or maybe, use it.
86
+ if block_given?
87
+ args << yield(ARGV.shift, arg_name, message_type)
88
+ else
89
+ args << try_eval(ARGV.shift, arg_name)
90
+ end
91
+ elsif TYPES[Type][arg_name] == :maybe
92
+ args << nil
93
+ else
94
+ Trollop::die "You must give a #{arg_name} for a metric of type #{Type}"
95
+ end
96
+ end
97
+
98
+ ##############################
99
+ # Build the argument list
100
+ ##############################
101
+ args = [ Type ]
102
+
103
+ try_get_arg(args, :name, Type) { |arg, _, _| arg.to_s }
104
+ try_get_arg(args, :value, Type)
105
+ # TODO(noah): add timeout for heartbeat
106
+
107
+ # Time is next to last
108
+ time = Time.now
109
+ if opts[:time]
110
+ time = Chronic.parse opts[:time]
111
+ end
112
+ args << time
113
+
114
+ # Labels is last
115
+ labels = {}
116
+ if opts[:label]
117
+ opts[:label].flatten.each do |item|
118
+ name, value = item.split("=", 2) # split on the first "=" only, so values may contain "="
119
+ labels[name] = try_eval(value, "label value")
120
+ end
121
+ end
122
+
123
+ args << labels
124
+
125
+ puts "Hastur.send *#{args.inspect}" if opts[:print]
126
+
127
+ Hastur.send(*args)
@@ -0,0 +1,10 @@
1
+ #!/bin/bash
2
+
3
+ : ${REPO_ROOT:="$WORKSPACE"}
4
+ source $HOME/.rvm/scripts/rvm
5
+
6
+ cd $REPO_ROOT/hastur
7
+ rvm --create use 1.9.3@hastur
8
+ gem install --no-rdoc --no-ri bundler
9
+ bundle install
10
+ COVERAGE=true bundle exec rake --trace test:units
@@ -0,0 +1,4 @@
1
+ source :rubygems
2
+
3
+ gem "goliath"
4
+ gem "em-http-request"
@@ -0,0 +1 @@
1
+ Here is a short example of how to log GET requests using Hastur in a transparent proxy.
@@ -0,0 +1,93 @@
1
+ require "rubygems"
2
+ require "goliath"
3
+ require "em-synchrony/em-http"
4
+ require "time"
5
+
6
+ $LOAD_PATH << File.join(File.dirname(__FILE__), "../lib")
7
+ require "hastur/eventmachine"
8
+
9
+ class GetProxy < Goliath::API
10
+ use Goliath::Rack::Params
11
+
12
+ attr_reader :backend
13
+ def initialize
14
+ ::ARGV.each_with_index do |arg,idx|
15
+ if arg == "--backend"
16
+ @backend = ::ARGV[idx + 1]
17
+ ::ARGV.slice! idx, 2
18
+ break
19
+ end
20
+ end
21
+
22
+ unless @backend
23
+ raise "Initialization error: could not determine backend server, try --backend <url>"
24
+ end
25
+
26
+ super
27
+ end
28
+
29
+ def response(env)
30
+ url = "#{@backend}#{env['REQUEST_PATH']}"
31
+ start = Hastur.timestamp
32
+ http = EM::HttpRequest.new(url).get :query => params
33
+ done = Hastur.timestamp
34
+
35
+ uri = URI.parse url
36
+
37
+ # Hastur was designed to be queried ground-up using labels. Liberal use
38
+ # of labels is recommended. We add labels as we need them.
39
+ labels = { :scheme => uri.scheme,
40
+ :host => uri.host,
41
+ :port => uri.port
42
+ }
43
+
44
+ case http.response_header.status
45
+ when 300..307
46
+ # Marks are interesting, but non-critical points. Value defaults to nil,
47
+ # timestamp defaults to 'now'.
48
+ Hastur.mark(
49
+ "test.proxy.3xx", # name
50
+ nil, # value
51
+ start, # timestamp
52
+ :status => :moved # label
53
+ )
54
+ labels[:status] = "3xx"
55
+ when 400..417
56
+ # Log is used for low priority data that will be buffered and batched by
57
+ # Hastur. Severity is optional and irrelevant to delivery.
58
+ Hastur.log(
59
+ "test.proxy.4xx", # name
60
+ { # data
61
+ :path => uri.path,
62
+ :query => uri.query,
63
+ },
64
+ start # timestamp
65
+ )
66
+ labels[:status] = "4xx"
67
+ when 500..505
68
+ # Event is serious business. Hastur will punish the little elves cranking
69
+ # in its bowels mercilessly to get this out and about ASAP.
70
+ Hastur.event(
71
+ "test.proxy.5xx", # name
72
+ "Internal Server Error", # subject
73
+ nil, # body
74
+ ["devnull@ooyala.com"], # attn
75
+ start, # timestamp
76
+ :path => uri.path, # labels
77
+ :query => uri.query # labels
78
+ )
79
+ labels[:status] = "5xx"
80
+ end
81
+
82
+ # Gauges are used to track values.
83
+ Hastur.gauge(
84
+ # Use . to separate namespaces in Hastur.
85
+ "test.proxy.latencies.ttr", # name
86
+ done.to_f - start.to_f, # value
87
+ start, # timestamp
88
+ labels # labels
89
+ )
90
+
91
+ [http.response_header.status, http.response_header, http.response]
92
+ end
93
+ end
@@ -0,0 +1,28 @@
1
+ # -*- encoding: utf-8 -*-
2
+ $:.push File.expand_path("../lib", __FILE__)
3
+ require "hastur/version"
4
+
5
+ Gem::Specification.new do |s|
6
+ s.name = "hastur"
7
+ s.version = Hastur::VERSION
8
+ s.platform = Gem::Platform::RUBY
9
+ s.authors = ["Viet Nguyen"]
10
+ s.email = ["viet@ooyala.com"]
11
+ s.homepage = "http://www.ooyala.com"
12
+ s.description = "Hastur API client gem"
13
+ s.summary = "A gem used to communicate with the Hastur Client through UDP."
14
+ s.rubyforge_project = "hastur"
15
+
16
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
17
+
18
+ s.add_development_dependency "yard"
19
+ s.add_development_dependency "mocha"
20
+ s.add_development_dependency "minitest"
21
+ s.add_development_dependency "simplecov" if RUBY_VERSION[/^1\.9/]
22
+ s.add_development_dependency "rake"
23
+ s.add_runtime_dependency "multi_json", "~>1.3.2"
24
+ s.add_runtime_dependency "chronic"
25
+
26
+ s.files = `git ls-files`.split("\n")
27
+ s.require_paths = ["lib"]
28
+ end
@@ -0,0 +1,2 @@
1
+ require "hastur/api"
2
+ Hastur.start
@@ -0,0 +1,732 @@
1
+ require "multi_json"
2
+ require "socket"
3
+ require "date"
4
+ require "thread"
5
+
6
+ require "hastur/version"
7
+
8
+ #
9
+ # Hastur API gem that allows services/apps to easily publish
10
+ # correct Hastur-commands to their local machine's UDP sockets.
11
+ # Bare minimum for all JSON packets is to have :type key/values to
12
+ # map to a hastur message type, which the router uses for sink delivery.
13
+ #
14
+ module Hastur
15
+ extend self
16
+
17
+ # TODO(noah): Change all instance variables to use Hastur.variable
18
+ # and add attr_reader/attr_accessor for them if appropriate.
19
+ # Right now you could use a mix of Hastur.variable and including
20
+ # the Hastur module and get two full sets of Hastur stuff.
21
+ # This will only matter if people include Hastur directly,
22
+ # which we haven't documented as possible.
23
+
24
+ class << self
25
+ attr_accessor :mutex
26
+ end
27
+
28
+ Hastur.mutex ||= Mutex.new
29
+
30
+ SECS_2100 = 4102444800
31
+ MILLI_SECS_2100 = 4102444800000
32
+ MICRO_SECS_2100 = 4102444800000000
33
+ NANO_SECS_2100 = 4102444800000000000
34
+
35
+ SECS_1971 = 31536000
36
+ MILLI_SECS_1971 = 31536000000
37
+ MICRO_SECS_1971 = 31536000000000
38
+ NANO_SECS_1971 = 31536000000000000
39
+
40
+ PLUGIN_INTERVALS = [ :five_minutes, :thirty_minutes, :hourly, :daily, :monthly ]
41
+
42
+ #
43
+ # Prevents starting a background thread under any circumstances.
44
+ #
45
+ def no_background_thread!
46
+ @prevent_background_thread = true
47
+ end
48
+
49
+ START_OPTS = [
50
+ :background_thread
51
+ ]
52
+
53
+ #
54
+ # Start Hastur's background thread and/or do process registration
55
+ # or neither, according to what options are set.
56
+ #
57
+ # @param [Hash] opts The options for features
58
+ # @option opts [boolean] :background_thread Whether to start a background thread
59
+ #
60
+ def start(opts = {})
61
+ bad_keys = opts.keys - START_OPTS
62
+ raise "Unknown options to Hastur.start: #{bad_keys.join(", ")}!" unless bad_keys.empty?
63
+
64
+ unless @prevent_background_thread ||
65
+ (opts.has_key?(:background_thread) && !opts[:background_thread])
66
+ start_background_thread
67
+ end
68
+
69
+ @process_registration_done = true
70
+ register_process Hastur.app_name, {}
71
+ end
72
+
73
+ #
74
+ # Starts a background thread that will execute blocks of code every so often.
75
+ #
76
+ def start_background_thread
77
+ if @prevent_background_thread
78
+ raise "You can't start a background thread! Somebody called .no_background_thread! already."
79
+ end
80
+
81
+ return if @bg_thread
82
+
83
+ @intervals = [:five_secs, :minute, :hour, :day]
84
+ @interval_values = [5, 60, 60*60, 60*60*24 ]
85
+ __reset_bg_thread__
86
+ end
87
+
88
+ #
89
+ # This should ordinarily only be for testing. It kills the
90
+ # background thread so that automatic heartbeats and .every() blocks
91
+ # don't happen. If you restart the background thread, all your
92
+ # .every() blocks go away, but the process heartbeat is restarted.
93
+ #
94
+ def kill_background_thread
95
+ __kill_bg_thread__
96
+ end
97
+
98
+ #
99
+ # Returns whether the background thread is currently running.
100
+ # @todo Debug this.
101
+ #
102
+ def background_thread?
103
+ @bg_thread && @bg_thread.alive?
104
+ end
105
+
106
+ #
107
+ # Best effort to make all timestamps be Hastur timestamps, 64 bit
108
+ # numbers that represent the total number of microseconds since Jan
109
+ # 1, 1970 at midnight UTC. Accepts second, millisecond, microsecond or nanosecond
110
+ # timestamps and Ruby times. You can also give :now or nil for Time.now.
111
+ #
112
+ # @param timestamp The timestamp as a Fixnum, Float or Time. Defaults to Time.now.
113
+ # @return [Fixnum] Number of microseconds since Jan 1, 1970 midnight UTC
114
+ # @raise RuntimeError Unable to validate timestamp format
115
+ #
116
+ def epoch_usec(timestamp=Time.now)
117
+ timestamp = Time.now if timestamp.nil? || timestamp == :now
118
+
119
+ case timestamp
120
+ when Time
121
+ (timestamp.to_f*1000000).to_i
122
+ when DateTime
123
+ # Ruby 1.8.7 doesn't have a DateTime#to_time or DateTime#to_f method.
124
+ # For right now, declare failure.
125
+ raise "Ruby DateTime objects are not yet supported!"
126
+ when SECS_1971..SECS_2100
127
+ timestamp * 1000000
128
+ when MILLI_SECS_1971..MILLI_SECS_2100
129
+ timestamp * 1000
130
+ when MICRO_SECS_1971..MICRO_SECS_2100
131
+ timestamp
132
+ when NANO_SECS_1971..NANO_SECS_2100
133
+ timestamp / 1000
134
+ else
135
+ raise "Unable to validate timestamp: #{timestamp}"
136
+ end
137
+ end
138
+
139
+ alias :timestamp :epoch_usec
140
+
141
+ #
142
+ # Attempts to determine the application name, or uses an
143
+ # application-provided one, if set. In order, Hastur checks:
144
+ #
145
+ # * User-provided app name via Hastur.app_name=
146
+ # * HASTUR_APP_NAME environment variable
147
+ # * ::HASTUR_APP_NAME Ruby constant
148
+ # * Ecology.application, if set
149
+ # * File.basename($0)
150
+ #
151
+ # @return [String] The application name, or best guess at same
152
+ #
153
+ def app_name
154
+ return @app_name if @app_name
155
+
156
+ return @app_name = ENV['HASTUR_APP_NAME'] if ENV['HASTUR_APP_NAME']
157
+
158
+ top_level = ::HASTUR_APP_NAME rescue nil
159
+ return @app_name = top_level if top_level
160
+
161
+ eco = Ecology rescue nil
162
+ return @app_name = Ecology.application if eco
163
+
164
+ @app_name = File.basename $0
165
+ end
166
+ alias application app_name
167
+
168
+ #
169
+ # Set the application name that Hastur registers as.
170
+ #
171
+ # @param [String] new_name The new application name.
172
+ #
173
+ def app_name=(new_name)
174
+ old_name = @app_name
175
+
176
+ @app_name = new_name
177
+
178
+ if @process_registration_done
179
+ err_str = "You changed the application name from #{old_name} to " +
180
+ "#{new_name} after the process was registered!"
181
+ STDERR.puts err_str
182
+ Hastur.log err_str
183
+ end
184
+ end
185
+ alias application= app_name=
186
+
187
+ #
188
+ # Add default labels which will be sent back with every Hastur
189
+ # message sent by this process. The labels will be sent back with
190
+ # the same constant value each time that is specified in the labels
191
+ # hash.
192
+ #
193
+ # This is a useful way to send back information that won't change
194
+ # during the run, or that will change only occasionally like
195
+ # resource usage, server information, deploy environment, etc. The
196
+ # same kind of information can be sent back using info_process(), so
197
+ # consider which way makes more sense for your case.
198
+ #
199
+ # @param [Hash] new_default_labels A hash of new labels to send.
200
+ #
201
+ def add_default_labels(new_default_labels)
202
+ @default_labels ||= {}
203
+
204
+ @default_labels.merge! new_default_labels
205
+ end
206
+
207
+ #
208
+ # Remove default labels which will be sent back with every Hastur
209
+ # message sent by this process. This cannot remove the three
210
+ # automatic defaults (application, pid, tid). Keys that have not
211
+ # been added cannot be removed, and so will be silently ignored (no
212
+ # exception will be raised).
213
+ #
214
+ # @param [Array<String> or multiple strings] default_label_keys Keys to stop sending
215
+ #
216
+ def remove_default_label_names(*default_label_keys)
217
+ keys_to_remove = default_label_keys.flatten
218
+
219
+ keys_to_remove.each { |key| @default_labels.delete(key) } if @default_labels
220
+ end
221
+
222
+ #
223
+ # Reset the default labels which will be sent back with every Hastur
224
+ # message sent by this process. After this, only the automatic
225
+ # default labels (process ID, thread ID, application name) will be
226
+ # sent, plus of course the ones specified for the specific Hastur
227
+ # message call.
228
+ #
229
+ def reset_default_labels
230
+ @default_labels = {}
231
+ end
232
+
233
+ #
234
+ # Reset Hastur module for tests. This removes all settings and
235
+ # kills the background thread, resetting Hastur to its initial
236
+ # pre-start condition.
237
+ #
238
+ def reset
239
+ __kill_bg_thread__
240
+ @app_name = nil
241
+ @prevent_background_thread = nil
242
+ @process_registration_done = nil
243
+ @udp_port = nil
244
+ @__delivery_method__ = nil
245
+ @scheduled_blocks = nil
246
+ @last_time = nil
247
+ @intervals = nil
248
+ @interval_values = nil
249
+ @default_labels = nil
250
+ @message_name_prefix = nil
251
+ end
252
+
253
+ #
254
+ # Set a message-name prefix for all message types that have names.
255
+ # It will be prepended automatically for those message types' names.
256
+ # A nil value will be treated as the empty string. Plugin names
257
+ # don't count as message names for these purposes, and will not be
258
+ # prefixed.
259
+ #
260
+ def message_name_prefix=(value)
261
+ @message_name_prefix = value
262
+ end
263
+
264
+ def message_name_prefix
265
+ @message_name_prefix || ""
266
+ end
267
+
268
+ protected
269
+
270
+ #
271
+ # Sends a compound data structure to Hastur. This is protected and for
272
+ # internal use only at the moment and is used for system statistics
273
+ # that are automatically collected by Hastur Agent.
274
+ #
275
+ # @param [String] name The compound stat name
276
+ # @param [Hash,Array] value compound value
277
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
278
+ # @param [Hash] labels Any additional data labels to send
279
+ #
280
+ def compound(name, value=[], timestamp=:now, labels={})
281
+ send_to_udp :type => :compound,
282
+ :name => message_name_prefix + (name || ""),
283
+ :value => value,
284
+ :timestamp => epoch_usec(timestamp),
285
+ :labels => default_labels.merge(labels)
286
+ end
287
+
288
+ #
289
+ # Returns the default labels for any UDP message that ships.
290
+ #
291
+ def default_labels
292
+ pid = Process.pid
293
+ thread = Thread.current
294
+ unless thread[:tid]
295
+ thread[:tid] = thread_id(thread)
296
+ end
297
+
298
+ {
299
+ :pid => pid,
300
+ :tid => thread[:tid],
301
+ :app => app_name,
302
+ }
303
+ end
304
+
305
+ #
306
+ # This is a convenience function because the Ruby
307
+ # thread API has no accessor for the thread ID,
308
+ # but includes it in "to_s" (buh?)
309
+ #
310
+ def thread_id(thread)
311
+ return 0 if thread == Thread.main
312
+
313
+ str = thread.to_s
314
+
315
+ # Thread#to_s embeds a hexadecimal ID, so match hex digits and parse as base 16.
316
+ match = str.match(/(0x[0-9a-f]+)/i)
317
+ return nil unless match
318
+ match[1].to_i(16)
319
+ end
320
+
321
+ #
322
+ # Get the UDP port.
323
+ #
324
+ # @return The UDP port. Defaults to 8125.
325
+ #
326
+ def udp_port
327
+ @udp_port || 8125
328
+ end
329
+
330
+ private
331
+
332
+ #
333
+ # Sends a message unmolested to the HASTUR_UDP_PORT on 127.0.0.1
334
+ #
335
+ # @param m The message to send
336
+ # @todo Combine this with __send_to_udp__
337
+ #
338
+ def send_to_udp(m)
339
+ if @__delivery_method__
340
+ @__delivery_method__.call(m)
341
+ else
342
+ __send_to_udp__(m)
343
+ end
344
+ end
345
+
346
+ def __send_to_udp__(m)
347
+ begin
348
+ u = ::UDPSocket.new
349
+ mj = MultiJson.dump m
350
+ u.send mj, 0, "127.0.0.1", udp_port
351
+ rescue Errno::EMSGSIZE => e
352
+ return if @no_recurse
353
+ @no_recurse = true
354
+ err = "Message too long to send via Hastur UDP Socket. " +
355
+ "Backtrace: #{e.backtrace.inspect} " + "(Truncated) Message: #{mj}"
356
+ Hastur.log err
357
+ @no_recurse = false
358
+ rescue Exception => e
359
+ return if @no_recurse
360
+ @no_recurse = true
361
+ err = "Exception sending via Hastur UDP Socket. " + "Exception: #{e.message} " +
362
+ "Backtrace: #{e.backtrace.inspect} " + "(Truncated) Message: #{mj}"
363
+ Hastur.log err
364
+ @no_recurse = false
365
+
366
+ end
367
+ end
368
+
369
+ #
370
+ # Kills the background thread if it's running.
371
+ #
372
+ def __kill_bg_thread__
373
+ if @bg_thread
374
+ @bg_thread.kill
375
+ @bg_thread = nil
376
+ end
377
+ end
378
+
379
+ #
380
+ # Resets Hastur's background thread, removing all scheduled
381
+ # callbacks and resetting the times for all intervals. This is TEST
382
+ # MODE ONLY and will do TERRIBLE THINGS IF CALLED IN PRODUCTION.
383
+ #
384
+ def __reset_bg_thread__
385
+ if @prevent_background_thread
386
+ raise "You can't start a background thread! Somebody called .no_background_thread! already."
387
+ end
388
+
389
+ __kill_bg_thread__
390
+
391
+ @last_time ||= Hash.new
392
+
393
+ Hastur.mutex.synchronize do
394
+ @scheduled_blocks ||= Hash.new
395
+
396
+ # initialize all of the scheduling hashes
397
+ @intervals.each do |interval|
398
+ @last_time[interval] = Time.at(0)
399
+ @scheduled_blocks[interval] = []
400
+ end
401
+ end
402
+
403
+ # add a heartbeat background job
404
+ every :minute do
405
+ heartbeat("process_heartbeat")
406
+ end
407
+
408
+ # define a thread that will schedule and execute all of the background jobs.
409
+ # it is not very accurate on the scheduling, but should not be a problem
410
+ @bg_thread = Thread.new do
411
+ begin
412
+ loop do
413
+ # for each of the interval buckets
414
+ curr_time = Time.now
415
+
416
+ @intervals.each_with_index do |interval, idx|
417
+ to_call = []
418
+
419
+ # Don't need to dup this because we never change the old
420
+ # array, only reassign a new one.
421
+ Hastur.mutex.synchronize { to_call = @scheduled_blocks[interval] }
422
+
423
+ # execute the scheduled items if time is up
424
+ if curr_time - @last_time[ interval ] >= @interval_values[idx]
425
+ @last_time[interval] = curr_time
426
+ to_call.each(&:call)
427
+ end
428
+ end
429
+
430
+ # TODO(noah): increase this time?
431
+ sleep 1 # rest
432
+ end
433
+ rescue Exception => e
434
+ STDERR.puts e.inspect
435
+ end
436
+ end
437
+ end
438
+
439
+ public
440
+
441
+ #
442
+ # Set delivery method to the given proc/block. The block is saved
443
+ # and called with each message to be sent. If no block is given or
444
+ # if this method is not called, the delivery method defaults to
445
+ # sending over the configured UDP port.
446
+ #
447
+ def deliver_with(&block)
448
+ @__delivery_method__ = block
449
+ end
450
+
451
+ #
452
+ # Set the UDP port. Defaults to 8125
453
+ #
454
+ # @param [Fixnum] new_port The new port number.
455
+ #
456
+ def udp_port=(new_port)
457
+ @udp_port = new_port
458
+ end
459
+
460
+ #
461
+ # Sends a 'mark' stat to Hastur. A mark gives the time that
462
+ # an interesting event occurred even with no value attached.
463
+ # You can also use a mark to send back string-valued stats
464
+ # that might otherwise be gauges -- "Green", "Yellow",
465
+ # "Red" or similar.
466
+ #
467
+ # It is different from a Hastur event because it happens at
468
+ # stat priority -- it can be batched or slightly delayed,
469
+ # and doesn't have an end-to-end acknowledgement included.
470
+ #
471
+ # @param [String] name The mark name
472
+ # @param [String] value An optional string value
473
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
474
+ # @param [Hash] labels Any additional data labels to send
475
+ #
476
+ def mark(name, value = nil, timestamp=:now, labels={})
477
+ send_to_udp :type => :mark,
478
+ :name => message_name_prefix + (name || ""),
479
+ :value => value,
480
+ :timestamp => epoch_usec(timestamp),
481
+ :labels => default_labels.merge(labels)
482
+ end
483
+
484
+ #
485
+ # Sends a 'counter' stat to Hastur. Counters are linear,
486
+ # and are sent as deltas (differences). Sending a
487
+ # value of 1 adds 1 to the counter.
488
+ #
489
+ # @param [String] name The counter name
490
+ # @param [Fixnum] value Amount to increment the counter by
491
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
492
+ # @param [Hash] labels Any additional data labels to send
493
+ #
494
+ def counter(name, value=1, timestamp=:now, labels={})
495
+ send_to_udp :type => :counter,
496
+ :name => message_name_prefix + (name || ""),
497
+ :value => value,
498
+ :timestamp => epoch_usec(timestamp),
499
+ :labels => default_labels.merge(labels)
500
+ end
501
+
502
+ #
503
+ # Sends a 'gauge' stat to Hastur. A gauge's value may or may
504
+ # not be on a linear scale. It is sent as an exact value, not
505
+ # a difference.
506
+ #
507
+ # @param [String] name The gauge name
508
+ # @param value The value of the gauge as a Fixnum or Float
509
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
510
+ # @param [Hash] labels Any additional data labels to send
511
+ #
512
+ def gauge(name, value, timestamp=:now, labels={})
513
+ send_to_udp :type => :gauge,
514
+ :name => message_name_prefix + (name || ""),
515
+ :value => value,
516
+ :timestamp => epoch_usec(timestamp),
517
+ :labels => default_labels.merge(labels)
518
+ end
519
+
520
+ #
521
+ # Sends an event to Hastur. An event is high-priority and never buffered,
522
+ # and will be sent preferentially to stats or heartbeats. It includes
523
+ # an end-to-end acknowledgement to ensure arrival, but is expensive
524
+ # to store, send and query.
525
+ #
526
+ # 'Attn' is a mechanism to describe the system or component in which the
527
+ # event occurs and who would care about it. Obvious values to include in the
528
+ # array include user logins, email addresses, team names, and server, library
529
+ # or component names. This allows making searches like "what events should I
530
+ # worry about?" or "what events have recently occurred on the Rails server?"
531
+ #
532
+ # @param [String] name The name of the event (ex: "bad.log.line")
533
+ # @param [String] subject The subject or message for this specific event (ex "Got bad log line: @#$#@garbage@#$#@")
534
+ # @param [String] body An optional body with details of the event. A stack trace or email body would go here.
535
+ # @param [Array] attn The relevant components or teams for this event. Web hooks or email addresses would go here.
536
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
537
+ # @param [Hash] labels Any additional data labels to send
538
+ #
539
+ def event(name, subject=nil, body=nil, attn=[], timestamp=:now, labels={})
540
+ send_to_udp :type => :event,
541
+ :name => message_name_prefix + (name || ""),
542
+ :subject => subject.to_s[0...3_072],
543
+ :body => body.to_s[0...3_072],
544
+ :attn => [ attn ].flatten,
545
+ :timestamp => epoch_usec(timestamp),
546
+ :labels => default_labels.merge(labels)
547
+ end
548
+
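The field handling in `event` can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper named `normalize_event_fields` and a constant `MAX_EVENT_FIELD` (neither is part of the gem); it only reproduces the `to_s[0...3_072]` truncation and the `[ attn ].flatten` normalization shown above.

```ruby
# Hypothetical helper, not part of the gem: mirrors the field handling in Hastur.event.
MAX_EVENT_FIELD = 3_072

def normalize_event_fields(subject, body, attn)
  {
    # nil becomes "", and anything longer than the limit is cut (in characters).
    :subject => subject.to_s[0...MAX_EVENT_FIELD],
    :body    => body.to_s[0...MAX_EVENT_FIELD],
    # A lone String and a nested Array both end up as one flat Array.
    :attn    => [attn].flatten
  }
end

fields = normalize_event_fields("x" * 5_000, nil, "ops@example.com")
puts fields[:subject].length
puts fields[:attn].inspect
```

Because of the flatten step, callers can pass either a single address or an array of them as `attn`.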
549
+ #
550
+ # Sends a log line to Hastur. A log line is of relatively low
551
+ # priority, comparable to stats, and is allowed to be buffered or
552
+ # batched while higher-priority data is sent first.
553
+ #
554
+ # Severity can be included in the data field with the tag
555
+ # "severity" if desired.
556
+ #
557
+ # @param [String] subject The subject or message for this specific log (ex: "Got bad input: @#$#@garbage@#$#@")
558
+ # @param [Hash] data Additional JSON-able data to be sent
559
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
560
+ # @param [Hash] labels Any additional data labels to send
561
+ #
562
+ def log(subject=nil, data={}, timestamp=:now, labels={})
563
+ send_to_udp :type => :log,
564
+ :subject => subject.to_s[0...7_168],
565
+ :data => data,
566
+ :timestamp => epoch_usec(timestamp),
567
+ :labels => default_labels.merge(labels)
568
+ end
569
+
570
+ #
571
+ # Sends a process registration to Hastur. This indicates that the
572
+ # process is currently running, and that heartbeats should be sent
573
+ # for some time afterward.
574
+ #
575
+ # @param [String] name The name of the application or best guess
576
+ # @param [Hash] data The additional data to include with the registration
577
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
578
+ # @param [Hash] labels Any additional data labels to send
579
+ #
580
+ def register_process(name = app_name, data = {}, timestamp = :now, labels = {})
581
+ send_to_udp :type => :reg_process,
582
+ :data => { "language" => "ruby", "version" => Hastur::VERSION }.merge(data),
583
+ :timestamp => epoch_usec(timestamp),
584
+ :labels => default_labels.merge(labels)
585
+ end
586
+
587
+ #
588
+ # Sends freeform process information to Hastur. This can be
589
+ # supplemental information about resources like memory, loaded gems,
590
+ # Ruby version, files open and whatnot. It can be additional
591
+ # configuration or deployment information like environment
592
+ # (dev/staging/prod), software or component version, etc. It can be
593
+ # information about the application as deployed, as run, or as it is
594
+ # currently running.
595
+ #
596
+ # The default labels contain application name and process ID to
597
+ # match this information with the process registration and similar
598
+ # details.
599
+ #
600
+ # Any number of these can be sent as information changes or is
601
+ # superseded. However, if information changes constantly or needs
602
+ # to be graphed or alerted on, send that separately as a metric or
603
+ # event. Info_process messages are freeform and not readily
604
+ # separable or graphable.
605
+ #
606
+ # @param [String] tag The tag or title of this chunk of process info
607
+ # @param [Hash] data The detailed data being sent
608
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
609
+ # @param [Hash] labels Any additional data labels to send
610
+ #
611
+ def info_process(tag, data = {}, timestamp = :now, labels = {})
612
+ send_to_udp :type => :info_process,
613
+ :tag => tag,
614
+ :data => data,
615
+ :timestamp => epoch_usec(timestamp),
616
+ :labels => default_labels.merge(labels)
617
+ end
618
+
619
+ #
620
+ # This sends back freeform data about the agent or host that Hastur
621
+ # is running on. Sample uses include what libraries or packages are
622
+ # installed and available, or the total installed memory.
623
+ #
624
+ # Any number of these can be sent as information changes or is
625
+ # superseded. However, if information changes constantly or needs
626
+ # to be graphed or alerted on, send that separately as a metric or
627
+ # event. Info_agent messages are freeform and not readily separable
628
+ # or graphable.
629
+ #
630
+ # @param [String] tag The tag or title of this chunk of agent info
631
+ # @param [Hash] data The detailed data being sent
632
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
633
+ # @param [Hash] labels Any additional data labels to send
634
+ #
635
+ def info_agent(tag, data = {}, timestamp = :now, labels = {})
636
+ send_to_udp :type => :info_agent,
637
+ :tag => tag,
638
+ :data => data,
639
+ :timestamp => epoch_usec(timestamp),
640
+ :labels => default_labels.merge(labels)
641
+ end
642
+
643
+ #
644
+ # Sends a plugin registration to Hastur. A plugin is a program on the host machine which
645
+ # can be run to determine status of the machine, an application or anything else interesting.
646
+ #
647
+ # This registration tells Hastur to begin scheduling runs
648
+ # of the plugin and report back on the resulting status codes or crashes.
649
+ #
650
+ # @param [String] name The name of the plugin, and of the heartbeat sent back
651
+ # @param [String] plugin_path The path on the local file system to this plugin executable
652
+ # @param [Array] plugin_args The array of arguments to pass to the plugin executable
653
+ # @param [Symbol] plugin_interval The interval to run the plugin. The scheduling will be slightly approximate. One of: PLUGIN_INTERVALS
654
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
655
+ # @param [Hash] labels Any additional data labels to send
656
+ #
657
+ def register_plugin(name, plugin_path, plugin_args, plugin_interval, timestamp=:now, labels={})
658
+ unless PLUGIN_INTERVALS.include?(plugin_interval)
659
+ raise "Interval must be one of: #{PLUGIN_INTERVALS.join(', ')}"
660
+ end
661
+ send_to_udp :type => :reg_pluginv1,
662
+ :plugin_path => plugin_path,
663
+ :plugin_args => plugin_args,
664
+ :interval => plugin_interval,
665
+ :plugin => name,
666
+ :timestamp => epoch_usec(timestamp),
667
+ :labels => default_labels.merge(labels)
668
+ end
669
+
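The interval check in `register_plugin` raises a plain `RuntimeError` when the interval is not in the list. A minimal sketch of that guard follows, assuming a placeholder list `EXAMPLE_INTERVALS` and a hypothetical helper `check_interval!`; the real `PLUGIN_INTERVALS` values are defined elsewhere in the gem and are not shown here.

```ruby
# Placeholder list for illustration only; the real PLUGIN_INTERVALS lives elsewhere in the gem.
EXAMPLE_INTERVALS = [:hourly, :daily]

# Hypothetical helper that applies the same guard as register_plugin.
def check_interval!(interval)
  unless EXAMPLE_INTERVALS.include?(interval)
    raise "Interval must be one of: #{EXAMPLE_INTERVALS.join(', ')}"
  end
  interval
end

check_interval!(:daily)        # passes, the Symbol is in the list

begin
  check_interval!("daily")     # a String is not a Symbol, so this is rejected
rescue RuntimeError => e
  puts e.message
end
```

Since membership is tested with `include?`, the interval must be passed as a Symbol; an equivalent String does not match.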
670
+ #
671
+ # Sends a heartbeat to Hastur. A heartbeat is a periodic
672
+ # message which indicates that a host, application or
673
+ # service is currently running. It is higher priority
674
+ # than a statistic and should not be batched, but is
675
+ # lower priority than an event and does not include an
676
+ # end-to-end acknowledgement.
677
+ #
678
+ # Plugin results are sent as a heartbeat with the
679
+ # plugin's name as the heartbeat name.
680
+ #
681
+ # @param [String] name The name of the heartbeat.
682
+ # @param value The value of the heartbeat as a Fixnum or Float
683
+ # @param [Float] timeout How long in seconds to expect to wait, at maximum, before the next heartbeat. If this is nil, don't worry if it doesn't arrive.
684
+ # @param timestamp The timestamp as a Fixnum, Float, Time or :now
685
+ # @param [Hash] labels Any additional data labels to send
686
+ #
687
+ def heartbeat(name="application.heartbeat", value=nil, timeout = nil, timestamp=:now, labels={})
688
+ send_to_udp :name => message_name_prefix + (name || ""),
689
+ :type => :hb_process,
690
+ :value => value,
691
+ :timestamp => epoch_usec(timestamp),
692
+ :labels => default_labels.merge(labels)
693
+ end
694
+
695
+ #
696
+ # Run the block and report its runtime back to Hastur as a gauge.
697
+ #
698
+ # @param [String] name The name of the gauge.
699
+ # @example
700
+ # Hastur.time("foo.bar") { fib 10 }
701
+ # Hastur.time "foo.bar", Time.now, :from => "over there" do fib(100) end
702
+ #
703
+ def time(name, timestamp=nil, labels={})
704
+ started = Time.now
705
+ ret = yield
706
+ ended = Time.now
707
+ gauge name, ended - started, timestamp || started, labels
708
+ ret
709
+ end
710
+
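To see what `time` reports without a running agent, the method can be exercised against a stand-in. This is a sketch under stated assumptions: `FakeHastur` is a hypothetical module, not part of the gem, whose `gauge` records into memory instead of sending over UDP. Its `time` body is copied from the method above.

```ruby
# Stand-in for Hastur, for illustration only: gauges are kept in memory.
module FakeHastur
  extend self

  def recorded
    @recorded ||= []
  end

  def gauge(name, value, timestamp = :now, labels = {})
    recorded << { :name => name, :value => value, :timestamp => timestamp, :labels => labels }
  end

  # Same logic as Hastur.time: run the block, report elapsed seconds, return the block's value.
  def time(name, timestamp = nil, labels = {})
    started = Time.now
    ret = yield
    ended = Time.now
    gauge name, ended - started, timestamp || started, labels
    ret
  end
end

result = FakeHastur.time("foo.bar") { (1..10).reduce(:+) }
puts result
puts FakeHastur.recorded.last[:name]
```

The gauge value is the elapsed wall-clock time in seconds as a Float, and when no timestamp is given the gauge is stamped with the block's start time.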
711
+ #
712
+ # Runs a block of code once per interval.
713
+ # Use this method to report statistics at a fixed time interval.
714
+ #
715
+ # @param [Symbol] interval How often to run. One of [:five_secs, :minute, :hour, :day]
716
+ # @yield [] A block which will send Hastur messages, called periodically
717
+ #
718
+ def every(interval, &block)
719
+ if @prevent_background_thread
720
+ log("You called .every(), but background threads are specifically prevented.")
721
+ end
722
+
723
+ unless @intervals.include?(interval)
724
+ raise "Interval must be one of these: #{@intervals}, you gave #{interval.inspect}"
725
+ end
726
+
727
+ # Don't add to existing array. += will create a new array. Then
728
+ # when we save a reference to the old array and iterate through
729
+ # it, it won't change midway.
730
+ Hastur.mutex.synchronize { @scheduled_blocks[interval] += [ block ] }
731
+ end
732
+ end
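The comment in `every` about `+=` describes a copy-on-write pattern: each registration builds a new array, so a reference taken earlier never changes while it is being iterated. A minimal sketch of the idea, assuming a hypothetical `BlockRegistry` class that is not part of the gem:

```ruby
# Hypothetical class for illustration; it isolates the copy-on-write idea used by Hastur.every.
class BlockRegistry
  def initialize
    @mutex = Mutex.new
    @scheduled_blocks = { :five_secs => [], :minute => [], :hour => [], :day => [] }
  end

  # += builds a new Array and reassigns the hash slot; the old Array is left untouched.
  def add(interval, &block)
    @mutex.synchronize { @scheduled_blocks[interval] += [block] }
  end

  # Returns a reference to the current Array for that interval.
  def snapshot(interval)
    @mutex.synchronize { @scheduled_blocks[interval] }
  end
end

registry = BlockRegistry.new
registry.add(:minute) { :first }
snap = registry.snapshot(:minute)   # a runner would iterate over this reference
registry.add(:minute) { :second }   # a later registration does not alter snap

puts snap.size
puts registry.snapshot(:minute).size
```

Using `<<` instead of `+=` would mutate the array in place, so a thread iterating over an earlier reference could see blocks appear mid-iteration.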