apprentice 0.0.5 → 0.0.6
- checksums.yaml +4 -4
- data/README.md +26 -10
- data/apprentice.gemspec +2 -2
- data/lib/apprentice.rb +21 -2
- data/lib/apprentice/checker.rb +50 -2
- data/lib/apprentice/checks/galera.rb +101 -1
- data/lib/apprentice/checks/mysql.rb +127 -0
- data/lib/apprentice/configuration.rb +80 -9
- data/lib/apprentice/server.rb +16 -3
- data/lib/apprentice/version.rb +2 -1
- data/ruby-apprentice.default +9 -1
- data/ruby-apprentice.init +5 -5
- metadata +5 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 5677b66a1b063db82716d77a866bed546a2e873b
|
4
|
+
data.tar.gz: 3cc0c846e58fccd80625fe2ed1728fc402a6d1f4
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: dadebfde96c397883370a51b3f3f20199245f78dd7dc451eac833fb4bbc3ce40677ad0aeb4b76a41fe90127dbab57eebaffe82ad4a34e3935e73f02c06058220
|
7
|
+
data.tar.gz: a210337e0c220d29f93bfc3e8d66d382c37bcf2ad304e68dfe735911627c98b386a98cdad19a0227742cba03d1b1da01374a2d2040fb17219d88b556de5eaf3c
|
data/README.md
CHANGED
@@ -1,6 +1,6 @@
|
|
1
1
|
# Apprentice
|
2
2
|
|
3
|
-
Apprentice is tiny server application that determines the
|
3
|
+
Apprentice is a tiny server application (under 300 lines of Ruby code) that determines the integrity of a running [MariaDB/MySQL slave](https://mariadb.com/kb/en/replication-overview/) or [MariaDB Galera master-master cluster member](https://mariadb.com/kb/en/what-is-mariadb-galera-cluster/) and responds to HTTP requests on a pre-defined port, depending on the state of the server it is checking on.
|
4
4
|
|
5
5
|
## How does it work?
|
6
6
|
|
@@ -10,13 +10,20 @@ You can find out about the syntax by running `apprentice --help`:
|
|
10
10
|
Usage: apprentice [options]
|
11
11
|
|
12
12
|
Specific options:
|
13
|
-
-s, --server SERVER
|
14
|
-
-u, --user USER USER to connect the server with
|
13
|
+
-s, --server SERVER SERVER to connect to
|
14
|
+
-u, --user USER USER to connect to the server with
|
15
15
|
-p, --password PASSWORD PASSWORD to use
|
16
|
+
-t, --type TYPE TYPE of server. Must be either "galera" or "mysql".
|
16
17
|
-i, --ip IP Local IP to bind to
|
18
|
+
(default: 0.0.0.0)
|
17
19
|
--port PORT Local PORT to use
|
18
|
-
|
19
|
-
--
|
20
|
+
(default: 3307)
|
21
|
+
--sql_port PORT Port of MariaDB/MySQL server to connect to
|
22
|
+
(default: 3306)
|
23
|
+
--[no-]accept-donor Accept galera cluster state "Donor/Desynced" as valid
|
24
|
+
(default: false)
|
25
|
+
--threshold SECONDS MariaDB/MySQL slave lag threshold
|
26
|
+
(default: 120)
|
20
27
|
|
21
28
|
Common options:
|
22
29
|
-h, --help Show this message
|
@@ -25,7 +32,7 @@ You can find out about the syntax by running `apprentice --help`:
|
|
25
32
|
|
26
33
|
## What it does
|
27
34
|
|
28
|
-
It determines whether or not the server it is connected to is alive and ready to serve connections to clients. Furthermore, it also determines whether said server is a healthy
|
35
|
+
It determines whether or not the server it is connected to is alive and ready to serve connections to clients. Furthermore, it also determines whether said server is healthy enough to serve connections, i.e. that it doesn't suffer from slave lag and hasn't separated from the cluster.
|
29
36
|
|
30
37
|
## What it doesn't do
|
31
38
|
|
@@ -35,7 +42,16 @@ It determines whether or not the server it is connected to is alive and ready to
|
|
35
42
|
* *`503 Service Unavailable`*: The server is unavailable and not ready for connections
|
36
43
|
|
37
44
|
## What's it checking exactly?
|
45
|
+
### MariaDB/MySQL
|
46
|
+
Apprentice checks the following variables:
|
47
|
+
|
48
|
+
* **Slave_IO_Running**: Indicates whether a slave is actually replicating from its master. If this is set to "No" or even "nil" the server is considered unfit for serving client connections.
|
49
|
+
* **Seconds_Behind_Master**: Indicates how far (in seconds) the slave is behind its master's state. A lag above 120 seconds is widely considered too stale for serving valid data. The lower the threshold, the higher the risk of Apprentice returning a negative result.
|
50
|
+
* *Note*: Generally, MariaDB/MySQL slaves lag a little (even if only by fractions of a second to a few seconds). A threshold value below 30 - 60 (depending on your setup) would probably be too conservative. However, YMMV.
|
51
|
+
|
52
|
+
For Apprentice to be able to check on the mentioned variables the user you specify on the command line needs [the 'REPLICATION CLIENT' privileges](http://dev.mysql.com/doc/refman/5.0/en/privileges-provided.html#priv_replication-client) granted within the given server. Otherwise Apprentice is going to return a negative result.
|
38
53
|
|
54
|
+
### Galera
|
39
55
|
Apprentice checks the following variables:
|
40
56
|
|
41
57
|
* **wsrep_cluster_size**: A cluster size below 2 is considered an error since there must never be one single server inside a cluster setup.
|
@@ -44,7 +60,8 @@ Apprentice checks the following variables:
|
|
44
60
|
* *Note*: The value `2` indicates the server in question is currently being used as a donor to another member of the cluster and might be exhibiting slow-downs and/or erratic behaviour due to elevated network traffic and disc IO. For further explanation please [consult the MariaDB documentation](https://mariadb.com/kb/en/what-is-mariadb-galera-cluster/).
|
45
61
|
|
46
62
|
## That's great and all, but what gives?
|
47
|
-
By itself, Apprentice doesn't do
|
63
|
+
By itself, Apprentice doesn't do anything all that useful. However, it accommodates [HAProxy's httpchk method](http://cbonte.github.io/haproxy-dconv/configuration-1.4.html#option%20httpchk) quite nicely, making it possible to let HAProxy not only balance connections among a large pool of MariaDB/MySQL slave nodes or cluster members but also check on their respective "health" while doing so.
|
64
|
+
Usually, HAProxy would only be able to establish a connection to a server without checking on its consistency. Apprentice does that job for you and helps HAProxy make the right decision on which servers to let a client gain access to.
|
48
65
|
|
49
66
|
## Goodies
|
50
67
|
|
@@ -53,13 +70,12 @@ I've included an init.d script, `ruby-apprentice.init` which you may use in orde
|
|
53
70
|
|
54
71
|
$ mv ruby-apprentice.init /etc/init.d/ruby-apprentice
|
55
72
|
$ chmod +x /etc/init.d/ruby-apprentice
|
56
|
-
$ mv ruby-apprentice.defaults /etc/defaults/
|
73
|
+
$ mv ruby-apprentice.default /etc/default/ruby-apprentice
|
57
74
|
|
58
75
|
Now you just need to add the relevant information for starting Apprentice. The defaults file is pretty self explanatory.
|
59
76
|
|
60
77
|
## TODO
|
61
78
|
|
62
|
-
* Write better (r)docs. I'm sorry for the abysmal state they're in right now
|
63
|
-
* Be a lot more forgiving when it comes to SQL connection errors/reconnects/server going awol.
|
64
79
|
* Finish the rspec definitions. Sorry for missing out on those as well.
|
80
|
+
* Maybe integrate a logger
|
65
81
|
* Write a better init script
|
data/apprentice.gemspec
CHANGED
@@ -8,8 +8,8 @@ Gem::Specification.new do |spec|
|
|
8
8
|
spec.version = Apprentice::VERSION
|
9
9
|
spec.authors = 'Moritz Heiber'
|
10
10
|
spec.email = %w{moritz.heiber@gmail.com}
|
11
|
-
spec.description = 'A MariaDB cluster integrity checker'
|
12
|
-
spec.summary = '
|
11
|
+
spec.description = 'A MariaDB/MySQL slave lag and cluster integrity checker'
|
12
|
+
spec.summary = 'Checks a given server for consistency and replication status'
|
13
13
|
spec.homepage = 'http://github.com/moritzheiber/apprentice'
|
14
14
|
spec.license = 'MIT'
|
15
15
|
|
data/lib/apprentice.rb
CHANGED
@@ -3,15 +3,34 @@ require 'apprentice/configuration'
|
|
3
3
|
require 'apprentice/version'
|
4
4
|
require 'apprentice/server'
|
5
5
|
|
6
|
+
# The main Apprentice module including all other modules and classes
|
6
7
|
module Apprentice
|
8
|
+
|
9
|
+
# This defines the sentinel, i.e. tiny server, Apprentice uses to communicate with e.g. HAProxy's httpchk method.
|
7
10
|
class Sentinel
|
8
|
-
include Configuration
|
9
|
-
include Server
|
11
|
+
include Configuration #:nodoc:
|
12
|
+
include Server #:nodoc:
|
10
13
|
|
14
|
+
# This depends on the Configuration module since it uses the Configuration#get_config method.
|
15
|
+
#
|
16
|
+
# ==== Return value
|
17
|
+
#
|
18
|
+
# * <tt>@options</tt> - sets the instance variable <tt>@options</tt>, which is used inside #run to start the EventMachine server
|
11
19
|
def initialize
|
12
20
|
@options = get_config
|
13
21
|
end
|
14
22
|
|
23
|
+
# Starts the EventMachine server
|
24
|
+
#
|
25
|
+
# === Special conditions
|
26
|
+
#
|
27
|
+
# We are trapping the signals <tt>INT</tt> and <tt>TERM</tt> here in order to shut down the EventMachine gracefully.
|
28
|
+
#
|
29
|
+
# ==== Attributes
|
30
|
+
#
|
31
|
+
# * <tt>@options.ip</tt> - The server binds to this specific ip
|
32
|
+
# * <tt>@options.port</tt> - The server uses this specific port to expose its limited HTTP interface to the world
|
33
|
+
# * <tt>@options</tt> - Gets passed to the server as a whole to be used with Server::EventServer#initialize
|
15
34
|
def run
|
16
35
|
EM.run do
|
17
36
|
Signal.trap('INT') { EventMachine.stop }
|
data/lib/apprentice/checker.rb
CHANGED
@@ -1,9 +1,37 @@
|
|
1
|
+
# Contains all the relevant methods for checking on a server's state
|
2
|
+
#
|
3
|
+
# Conditionally includes either MariaDB/MySQL or Galera related checking code
|
1
4
|
module Checker
|
2
|
-
require 'apprentice/checks/galera'
|
3
|
-
include Galera
|
4
5
|
|
6
|
+
# HTTP response codes and their respective return value
|
7
|
+
#
|
8
|
+
# We're constructing our dumb HTTP response handler using these
|
5
9
|
CODES = {200 => 'OK',503 => 'Service Unavailable'}
|
6
10
|
|
11
|
+
case @type
|
12
|
+
when 'galera'
|
13
|
+
require 'apprentice/checks/galera'
|
14
|
+
include Galera
|
15
|
+
when 'mysql'
|
16
|
+
require 'apprentice/checks/mysql'
|
17
|
+
include Mysql_Checks
|
18
|
+
end
|
19
|
+
|
20
|
+
# Format our HTTP/1.1 response properly without using arbitrary line breaks.
|
21
|
+
#
|
22
|
+
# ==== Attributes
|
23
|
+
#
|
24
|
+
# * +texts+ - A hash containing all text responses returned from run_checks.
|
25
|
+
#
|
26
|
+
# ==== Return values
|
27
|
+
#
|
28
|
+
# * +value+ - The comprehensive text returned with a HTTP response.
|
29
|
+
#
|
30
|
+
# ==== Examples
|
31
|
+
#
|
32
|
+
# t = ['Something', 'Something else']
|
33
|
+
# response = format_text(t)
|
34
|
+
# response.inspect # => 'Something\r\nSomething else\r\n'
|
7
35
|
def format_text(texts)
|
8
36
|
value = ''
|
9
37
|
if !texts.empty?
|
@@ -14,6 +42,26 @@ module Checker
|
|
14
42
|
return value
|
15
43
|
end
|
16
44
|
|
45
|
+
# Generates the actual output returned by the Server::EventServer class.
|
46
|
+
#
|
47
|
+
# It's valid HTTP/1.1 and should be understood by almost any browser. Certainly by HAProxy's httpchk.
|
48
|
+
#
|
49
|
+
# ==== Attributes
|
50
|
+
#
|
51
|
+
# * +code+ - The HTTP code for the returned response
|
52
|
+
# * +text+ - Formatted text to be returned with the response
|
53
|
+
#
|
54
|
+
# ==== Return values
|
55
|
+
#
|
56
|
+
# * String - A HTTP response string
|
57
|
+
#
|
58
|
+
# ==== Examples
|
59
|
+
#
|
60
|
+
# code = 503
|
61
|
+
# text = 'Something is wrong'
|
62
|
+
#
|
63
|
+
# response = generate_response(code, text)
|
64
|
+
# response.inspect # => 'HTTP/1.1 503 Service Unavailable\r\nContent-type: text/plain\r\nContent-length: 18\r\n\r\nSomething is wrong\r\n'
|
17
65
|
def generate_response(code = 503, text)
|
18
66
|
"HTTP/1.1 #{code} #{CODES[code]}\r\nContent-type: text/plain\r\nContent-length: #{text.length}\r\n\r\n#{text}"
|
19
67
|
end
|
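The two helpers in the hunk above (`format_text` and `generate_response`) can be sketched in isolation as follows. This is a minimal sketch, not the gem's code: `ResponseSketch` is a hypothetical module name, and counting `Content-length` with `bytesize` rather than `length` is an assumption made here so that multi-byte text would still yield a correct header.

```ruby
# Hypothetical stand-alone sketch; ResponseSketch is not part of the apprentice gem.
module ResponseSketch
  # Same two status codes the CODES constant in the diff defines.
  CODES = { 200 => 'OK', 503 => 'Service Unavailable' }.freeze

  # Terminate every message with CRLF so each one sits on its own line.
  def self.format_text(texts)
    texts.map { |t| "#{t}\r\n" }.join
  end

  # Build a minimal HTTP/1.1 response. Content-length counts bytes
  # (an assumption of this sketch; the diff uses String#length).
  def self.generate_response(code, text)
    "HTTP/1.1 #{code} #{CODES.fetch(code)}\r\n" \
      "Content-type: text/plain\r\n" \
      "Content-length: #{text.bytesize}\r\n" \
      "\r\n" \
      "#{text}"
  end
end

body = ResponseSketch.format_text(['Slave IO is not running.'])
puts ResponseSketch.generate_response(503, body)
```

A health checker such as HAProxy's httpchk mainly looks at the status line, so the body is only there for humans reading the response.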
data/lib/apprentice/checks/galera.rb
CHANGED
@@ -1,6 +1,22 @@
|
|
1
|
+
# Contains Galera specific methods for checking cluster member consistency
|
1
2
|
module Galera
|
2
|
-
STATES = {1 => 'Joining',2 => 'Donor/Desynced',3 => 'Joined',4 => 'Synced'}
|
3
3
|
|
4
|
+
# Galera knows {a couple of different states}[http://www.percona.com/doc/percona-xtradb-cluster/wsrep-status-index.html#wsrep_local_state].
|
5
|
+
# This constant describes their respective meaning for user feedback and, possibly, logging purposes.
|
6
|
+
STATES = {1 => 'Joining', 2 => 'Donor/Desynced', 3 => 'Joined', 4 => 'Synced'}
|
7
|
+
|
8
|
+
# Gets the actual status from the Galera cluster member using the Mysql2 gem.
|
9
|
+
# Notice that we're using the EventMachine-enabled Mysql2::Client.
|
10
|
+
#
|
11
|
+
# Right now it only returns the relevant error output and continues working afterwards.
|
12
|
+
#
|
13
|
+
# Nothing is mentioned about explicitly closing a client connection in the Mysql2 docs,
|
14
|
+
# however, we need to be careful with the amount of connections we're using since we might
|
15
|
+
# find ourselves in an environment where the number of connections is constrained to very few.
|
16
|
+
#
|
17
|
+
# ==== Return values
|
18
|
+
#
|
19
|
+
# * @status - Contains a hash of all the relevant wsrep_* variables to be examined by #run_checks
|
4
20
|
def get_galera_status
|
5
21
|
begin
|
6
22
|
client = Mysql2::Client.new(
|
@@ -12,16 +28,34 @@ module Galera
|
|
12
28
|
)
|
13
29
|
result = client.query "SHOW STATUS LIKE 'wsrep_%';"
|
14
30
|
if result.count > 0
|
31
|
+
|
32
|
+
# We need to do some conversion here in order to get a usable hash
|
15
33
|
result.each do |r|
|
16
34
|
@status.merge!(Hash[*r])
|
17
35
|
end
|
18
36
|
end
|
19
37
|
client.close
|
20
38
|
rescue Exception => message
|
39
|
+
# FIXME Properly handle exception
|
21
40
|
puts message
|
22
41
|
end
|
23
42
|
end
|
24
43
|
|
44
|
+
# Returns the relevant status HTTP code accompanied by a useful user feedback text
|
45
|
+
#
|
46
|
+
# ==== Attributes
|
47
|
+
#
|
48
|
+
# * @status - Should contain a hash with the relevant information to determine the
|
49
|
+
# cluster member status. Also see #get_galera_status.
|
50
|
+
#
|
51
|
+
# ==== Return values
|
52
|
+
#
|
53
|
+
# * +response+ - A hash containing a HTTP <tt>:code</tt> and a <tt>:text</tt> to return to the user
|
54
|
+
#
|
55
|
+
# ==== Example
|
56
|
+
#
|
57
|
+
# @status = {'wsrep_cluster_size' => 4 }
|
58
|
+
# response = self.run_checks # => {:code => 503, :text => 'Some text'}
|
25
59
|
def run_checks
|
26
60
|
get_galera_status
|
27
61
|
unless @status.empty?
|
@@ -42,16 +76,82 @@ module Galera
|
|
42
76
|
end
|
43
77
|
end
|
44
78
|
|
79
|
+
# Checks whether the cluster size as reported by the member is above 1.
|
80
|
+
# Any value below 2 is considered bad, as a cluster, by definition, should consist of at least
|
81
|
+
# 2 members connected to each other.
|
82
|
+
#
|
83
|
+
# A cluster size of 1 might also indicate a split-brain situation.
|
84
|
+
#
|
85
|
+
# ==== Return values
|
86
|
+
#
|
87
|
+
# * +true+ or +false+ - depending on the value of <tt>@status['wsrep_cluster_size']</tt>
|
88
|
+
#
|
89
|
+
# ==== Examples
|
90
|
+
#
|
91
|
+
# @status = Hash.new
|
92
|
+
#
|
93
|
+
# @status['wsrep_cluster_size'] = 3
|
94
|
+
# r = check_cluster_size
|
95
|
+
# r.inspect # => true
|
96
|
+
#
|
97
|
+
# @status['wsrep_cluster_size'] = 1
|
98
|
+
# r = check_cluster_size
|
99
|
+
# r.inspect # => false
|
45
100
|
def check_cluster_size
|
46
101
|
return true if Integer(@status['wsrep_cluster_size']) > 1
|
47
102
|
false
|
48
103
|
end
|
49
104
|
|
105
|
+
# Checks whether the cluster replication is running and active.
|
106
|
+
# If this returns false the <tt>'wsrep_ready'</tt> status variable is set to <tt>'OFF'</tt> and thus the server is not an active
|
107
|
+
# member of a running cluster.
|
108
|
+
#
|
109
|
+
# ==== Return values
|
110
|
+
#
|
111
|
+
# * +true+ or +false+ - depending on the value of <tt>@status['wsrep_ready']</tt>
|
112
|
+
#
|
113
|
+
# ==== Examples
|
114
|
+
#
|
115
|
+
# @status = Hash.new
|
116
|
+
#
|
117
|
+
# @status['wsrep_ready'] = 'ON'
|
118
|
+
# r = check_ready_state
|
119
|
+
# r.inspect # => true
|
120
|
+
#
|
121
|
+
# @status['wsrep_ready'] = 'OFF'
|
122
|
+
# r = check_ready_state
|
123
|
+
# r.inspect # => false
|
50
124
|
def check_ready_state
|
51
125
|
return true if @status['wsrep_ready'] == 'ON'
|
52
126
|
false
|
53
127
|
end
|
54
128
|
|
129
|
+
# Checks how the cluster member sees itself in terms of status
|
130
|
+
#
|
131
|
+
# Valid states, read from the <tt>'wsrep_local_state'</tt> variable and depending on the configuration, are <tt>4</tt>, meaning <tt>Synced</tt>, or <tt>2</tt>,
|
132
|
+
# meaning <tt>Donor/Desynced</tt>, if the option <tt>--accept-donor</tt> was passed at runtime.
|
133
|
+
#
|
134
|
+
# ==== Return values
|
135
|
+
#
|
136
|
+
# * +true+ or +false+ - depending on the value of <tt>@status['wsrep_local_state']</tt>
|
137
|
+
#
|
138
|
+
# ==== Examples
|
139
|
+
#
|
140
|
+
# @status = Hash.new
|
141
|
+
# @donor_allowed = false
|
142
|
+
#
|
143
|
+
# @status['wsrep_local_state'] = 4
|
144
|
+
# r = check_local_state
|
145
|
+
# r.inspect # => true
|
146
|
+
#
|
147
|
+
# @status['wsrep_local_state'] = 2
|
148
|
+
# r = check_local_state
|
149
|
+
# r.inspect # => false
|
150
|
+
#
|
151
|
+
# @donor_allowed = true
|
152
|
+
# @status['wsrep_local_state'] = 2
|
153
|
+
# r = check_local_state
|
154
|
+
# r.inspect # => true
|
55
155
|
def check_local_state
|
56
156
|
s = Integer(@status['wsrep_local_state'])
|
57
157
|
return true if s == 4 || (s == 2 && @donor_allowed)
|
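The three Galera checks documented in this hunk can be sketched as pure functions over a status hash, which makes them easy to try without a database. This is an illustrative sketch only: `GaleraSketch` and its method names are hypothetical, and the status values are passed as strings on the assumption that this is how `SHOW STATUS` rows arrive.

```ruby
# Hypothetical stand-alone sketch of the checks described above.
module GaleraSketch
  SYNCED = 4 # wsrep_local_state value for "Synced"
  DONOR  = 2 # wsrep_local_state value for "Donor/Desynced"

  # A cluster needs at least two members to be considered healthy.
  def self.cluster_size_ok?(status)
    Integer(status.fetch('wsrep_cluster_size')) > 1
  end

  # The member must report itself as ready.
  def self.ready?(status)
    status['wsrep_ready'] == 'ON'
  end

  # Synced is always fine; Donor/Desynced only when explicitly accepted.
  def self.local_state_ok?(status, accept_donor: false)
    state = Integer(status.fetch('wsrep_local_state'))
    state == SYNCED || (state == DONOR && accept_donor)
  end

  # All three checks must pass.
  def self.healthy?(status, accept_donor: false)
    cluster_size_ok?(status) && ready?(status) &&
      local_state_ok?(status, accept_donor: accept_donor)
  end
end

status = { 'wsrep_cluster_size' => '3', 'wsrep_ready' => 'ON', 'wsrep_local_state' => '2' }
puts GaleraSketch.healthy?(status)
puts GaleraSketch.healthy?(status, accept_donor: true)
```

Passing `accept_donor` as an argument mirrors the `--[no-]accept-donor` switch without relying on instance state.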
data/lib/apprentice/checks/mysql.rb
ADDED
@@ -0,0 +1,127 @@
|
|
1
|
+
# Contains MariaDB/MySQL specific methods for checking on slave health
|
2
|
+
module Mysql_Checks
|
3
|
+
|
4
|
+
# Gets the actual status from the MariaDB/MySQL slave using the Mysql2 gem.
|
5
|
+
# Notice that we're using the EventMachine-enabled Mysql2::Client.
|
6
|
+
#
|
7
|
+
# Right now it only returns the relevant error output and continues working afterwards.
|
8
|
+
#
|
9
|
+
# Nothing is mentioned about explicitly closing a client connection in the Mysql2 docs,
|
10
|
+
# however, we need to be careful with the amount of connections we're using since we might
|
11
|
+
# find ourselves in an environment where the number of connections is constrained to very few.
|
12
|
+
#
|
13
|
+
# ==== Return values
|
14
|
+
#
|
15
|
+
# * @status - Contains a hash of all the relevant replication related variables to be examined by #run_checks
|
16
|
+
def get_mysql_status
|
17
|
+
begin
|
18
|
+
client = Mysql2::Client.new(
|
19
|
+
host: @server,
|
20
|
+
port: @sql_port,
|
21
|
+
username: @user,
|
22
|
+
password: @password
|
23
|
+
)
|
24
|
+
result = client.query 'SHOW SLAVE STATUS;'
|
25
|
+
if result.count > 0
|
26
|
+
result.each do |key, state|
|
27
|
+
@status[key] = state
|
28
|
+
end
|
29
|
+
end
|
30
|
+
client.close
|
31
|
+
rescue Exception => message
|
32
|
+
puts message
|
33
|
+
end
|
34
|
+
end
|
35
|
+
|
36
|
+
# Get the value of <tt>'Slave_IO_Running'</tt>, which, obviously, should be <tt>Yes</tt> since otherwise
|
37
|
+
# it would mean the slave is not replicated properly and/or has stopped because of an error.
|
38
|
+
#
|
39
|
+
# ==== Attributes
|
40
|
+
#
|
41
|
+
# * <tt>@status</tt> - Uses the <tt>'Slave_IO_Running'</tt> key inside the hash.
|
42
|
+
#
|
43
|
+
# ==== Return values
|
44
|
+
#
|
45
|
+
# +true+ or +false+ - depending on whether or not the slave's replication thread is running.
|
46
|
+
#
|
47
|
+
# ==== Examples
|
48
|
+
#
|
49
|
+
# @status = Hash.new
|
50
|
+
# @status['Slave_IO_Running'] = 'Yes'
|
51
|
+
#
|
52
|
+
# r = check_slave_io
|
53
|
+
# r.inspect # => true
|
54
|
+
#
|
55
|
+
# @status['Slave_IO_Running'] = 'No'
|
56
|
+
#
|
57
|
+
# r = check_slave_io
|
58
|
+
# r.inspect # => false
|
59
|
+
def check_slave_io
|
60
|
+
return true if @status['Slave_IO_Running'] == 'Yes'
|
61
|
+
false
|
62
|
+
end
|
63
|
+
|
64
|
+
# Get the value of <tt>'Seconds_Behind_Master'</tt>, which indicates the amount of time in seconds
|
65
|
+
# the slave is behind the master's instruction set received via the replication thread. This should
|
66
|
+
# always be as close to zero as possible (or even zero). If this value is beyond <tt>@threshold</tt>
|
67
|
+
# constantly you will need to think about changing your setup to accommodate the traffic coming in
|
68
|
+
# from the master.
|
69
|
+
#
|
70
|
+
# ==== Attributes
|
71
|
+
#
|
72
|
+
# * <tt>@status</tt> - Uses the <tt>'Seconds_Behind_Master'</tt> key inside the hash
|
73
|
+
# * <tt>@threshold</tt> - The globally defined threshold after which the slave is considered to be too far behind to still be an active member. The default is 120 seconds.
|
74
|
+
#
|
75
|
+
# ==== Return values
|
76
|
+
#
|
77
|
+
# +true+ or +false+ - depending on whether or not the slave's replication thread is behind <tt>@threshold</tt>
|
78
|
+
#
|
79
|
+
# ==== Examples
|
80
|
+
#
|
81
|
+
# @status = Hash.new
|
82
|
+
# @status['Seconds_Behind_Master'] = 30 # with @threshold = 120
|
83
|
+
#
|
84
|
+
# r = check_seconds_behind
|
85
|
+
# r.inspect # => true
|
86
|
+
#
|
87
|
+
# @status['Seconds_Behind_Master'] = 140
|
88
|
+
#
|
89
|
+
# r = check_seconds_behind
|
90
|
+
# r.inspect # => nil
|
91
|
+
def check_seconds_behind
|
92
|
+
return true if Integer(@status['Seconds_Behind_Master']) < @threshold
|
93
|
+
end
|
94
|
+
|
95
|
+
# Returns the relevant status HTTP code accompanied by a useful user feedback text
|
96
|
+
#
|
97
|
+
# ==== Attributes
|
98
|
+
#
|
99
|
+
# * @status - Should contain a hash with the relevant information to determine the
|
100
|
+
# slave status. Also see #get_mysql_status.
|
101
|
+
#
|
102
|
+
# ==== Return values
|
103
|
+
#
|
104
|
+
# * +response+ - A hash containing a HTTP <tt>:code</tt> and a <tt>:text</tt> to return to the user
|
105
|
+
#
|
106
|
+
# ==== Example
|
107
|
+
#
|
108
|
+
# @status = {'Seconds_Behind_Master' => 140 }
|
109
|
+
# response = self.run_checks
|
110
|
+
# response.inspect # => {:code => 503, :text => 'Some text'}
|
111
|
+
def run_checks
|
112
|
+
get_mysql_status
|
113
|
+
unless @status.empty?
|
114
|
+
response = {code: 200, text: []}
|
115
|
+
if !check_slave_io
|
116
|
+
response[:text] << 'Slave IO is not running.'
|
117
|
+
end
|
118
|
+
if !check_seconds_behind
|
119
|
+
response[:text] << "Slave is #{@status['Seconds_Behind_Master']} seconds behind. Threshold is #{@threshold}"
|
120
|
+
end
|
121
|
+
response[:code] = 503 unless response[:text].empty?
|
122
|
+
return response
|
123
|
+
else
|
124
|
+
return {code: 503, text: ['Unable to determine slave status']}
|
125
|
+
end
|
126
|
+
end
|
127
|
+
end
|
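The slave checks in this file can be sketched without a database connection. The sketch below is not the gem's implementation: `SlaveLagSketch` is a hypothetical module, and it adds two behaviours that are assumptions of this example rather than something the diff shows, namely treating a missing (`NULL`) `Seconds_Behind_Master` as a failed check and coercing the threshold with `Integer()` in case it arrives from the command line as a string.

```ruby
# Hypothetical stand-alone sketch; not part of the apprentice gem.
module SlaveLagSketch
  # The IO thread must be running for replication to make progress.
  def self.io_running?(status)
    status['Slave_IO_Running'] == 'Yes'
  end

  # Always returns true or false. A nil lag is treated as "too far behind"
  # (an assumption of this sketch).
  def self.within_threshold?(status, threshold)
    lag = status['Seconds_Behind_Master']
    return false if lag.nil?

    Integer(lag) < Integer(threshold)
  end

  # Same response shape as in the diff: a :code and a list of :text messages.
  def self.run_checks(status, threshold: 120)
    return { code: 503, text: ['Unable to determine slave status'] } if status.empty?

    text = []
    text << 'Slave IO is not running.' unless io_running?(status)
    unless within_threshold?(status, threshold)
      lag = status['Seconds_Behind_Master'].inspect
      text << "Slave is #{lag} seconds behind. Threshold is #{threshold}"
    end
    { code: text.empty? ? 200 : 503, text: text }
  end
end

p SlaveLagSketch.run_checks({ 'Slave_IO_Running' => 'Yes', 'Seconds_Behind_Master' => 140 })
```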
data/lib/apprentice/configuration.rb
CHANGED
@@ -1,13 +1,41 @@
|
|
1
1
|
require 'optparse'
|
2
2
|
require 'ostruct'
|
3
3
|
|
4
|
+
# This module contains all the command line configuration methods
|
4
5
|
module Configuration
|
6
|
+
|
7
|
+
# Reads ARGV with OptionParser and returns an OpenStruct object with the parsed values
|
8
|
+
#
|
9
|
+
# ==== Default values
|
10
|
+
#
|
11
|
+
# * +ip+ - By default Apprentice binds to 0.0.0.0.
|
12
|
+
# * +port+ - The port Apprentice binds to. It defaults to 3307.
|
13
|
+
# * +sql_port+ - The port of the MariaDB/MySQL server Apprentice connects to. Defaults to 3306.
|
14
|
+
# * +threshold+ - The acceptable slave lag in seconds. Defaults to 120 seconds. It only applies when the type is set to 'mysql'.
|
15
|
+
# * +accept_donor+ - If passed, cluster members in the state '2' aka "Donor/Desynced" are accepted as valid client providers. Defaults to false, which is recommended.
|
16
|
+
#
|
17
|
+
# ==== Attributes
|
18
|
+
#
|
19
|
+
# * +ARGV+
|
20
|
+
#
|
21
|
+
# ==== Return values
|
22
|
+
#
|
23
|
+
# * +options+ - OpenStruct object containing all options passed with ARGV
|
24
|
+
#
|
25
|
+
# ==== Example
|
26
|
+
#
|
27
|
+
# ARGV = "--user user --password password --server server"
|
28
|
+
# opt = get_config
|
29
|
+
# opt.user # => 'user'
|
30
|
+
# opt.password # => 'password'
|
31
|
+
# opt.server # => 'server'
|
5
32
|
def get_config
|
6
33
|
options = OpenStruct.new
|
7
34
|
options.ip = '0.0.0.0'
|
8
35
|
options.port = 3307
|
9
36
|
options.sql_port = 3306
|
10
37
|
options.accept_donor = false
|
38
|
+
options.threshold = 120
|
11
39
|
|
12
40
|
opt_parser = OptionParser.new do |opts|
|
13
41
|
opts.banner = "Usage: apprentice [options]\n"
|
@@ -15,20 +43,29 @@ module Configuration
|
|
15
43
|
opts.separator 'Specific options:'
|
16
44
|
|
17
45
|
opts.on('-s SERVER', '--server SERVER',
|
18
|
-
'
|
46
|
+
'SERVER to connect to') { |s| options.server = s }
|
19
47
|
opts.on('-u USER', '--user USER',
|
20
|
-
'USER to connect the server with') { |u| options.user = u }
|
48
|
+
'USER to connect to the server with') { |u| options.user = u }
|
21
49
|
opts.on('-p PASSWORD', '--password PASSWORD',
|
22
50
|
'PASSWORD to use') { |p| options.password = p }
|
51
|
+
opts.on('-t TYPE', '--type TYPE',
|
52
|
+
'TYPE of server. Must be either "galera" or "mysql".') { |t| options.type = t }
|
23
53
|
|
24
54
|
opts.on('-i', '--ip IP',
|
25
|
-
'Local IP to bind to'
|
55
|
+
'Local IP to bind to',
|
56
|
+
"(default: #{options.ip})") { |i| options.ip = i }
|
26
57
|
opts.on('--port PORT',
|
27
|
-
'Local PORT to use'
|
58
|
+
'Local PORT to use',
|
59
|
+
"(default: #{options.port})") { |p| options.port = p }
|
28
60
|
opts.on('--sql_port PORT',
|
29
|
-
'Port of
|
61
|
+
'Port of MariaDB/MySQL server to connect to',
|
62
|
+
"(default: #{options.sql_port})") { |p| options.sql_port = p }
|
30
63
|
opts.on('--[no-]accept-donor',
|
31
|
-
'Accept cluster state "Donor/Desynced" as valid'
|
64
|
+
'Accept galera cluster state "Donor/Desynced" as valid',
|
65
|
+
"(default: #{options.accept_donor})") { |ad| options.accept_donor = ad }
|
66
|
+
opts.on('--threshold SECONDS',
|
67
|
+
'MariaDB/MySQL slave lag threshold',
|
68
|
+
"(default: #{options.threshold})") { |tr| options.threshold = tr }
|
32
69
|
|
33
70
|
opts.separator ''
|
34
71
|
opts.separator 'Common options:'
|
@@ -44,10 +81,19 @@ module Configuration
|
|
44
81
|
end
|
45
82
|
|
46
83
|
begin
|
47
|
-
ARGV << 's-h' if ARGV.size < 3
|
48
84
|
opt_parser.parse!(ARGV)
|
49
|
-
|
50
|
-
|
85
|
+
|
86
|
+
# We need four variables:
|
87
|
+
# * user: a valid mysql user
|
88
|
+
# * password: the corresponding password
|
89
|
+
# * server: the server to connect to
|
90
|
+
# * type: either mysql or galera, depending on the setup
|
91
|
+
unless options.server &&
|
92
|
+
options.user &&
|
93
|
+
options.password &&
|
94
|
+
check_type(options.type)
|
95
|
+
$stderr.puts 'Error: you have to specify a user, a password, a server to connect to'
|
96
|
+
$stderr.puts 'and a valid type. It can be either "galera" or "mysql".'
|
51
97
|
$stderr.puts 'Try -h/--help for more options'
|
52
98
|
exit
|
53
99
|
end
|
@@ -57,4 +103,29 @@ module Configuration
|
|
57
103
|
exit
|
58
104
|
end
|
59
105
|
end
|
106
|
+
|
107
|
+
# Check the user input for a valid type
|
108
|
+
#
|
109
|
+
# ==== Attributes
|
110
|
+
#
|
111
|
+
# * +type+ - the type extracted from ARGV
|
112
|
+
#
|
113
|
+
# ==== Return values
|
114
|
+
#
|
115
|
+
# Either true or false, depending on whether the input provided
|
116
|
+
# matches either 'mysql' or 'galera'
|
117
|
+
#
|
118
|
+
# ==== Example
|
119
|
+
#
|
120
|
+
# r = check_type('mysql')
|
121
|
+
# r.inspect # => 'true'
|
122
|
+
#
|
123
|
+
# r = check_type('something else')
|
124
|
+
# r.inspect # => 'false'
|
125
|
+
def check_type(type)
|
126
|
+
%w{galera mysql}.each do |t|
|
127
|
+
return true if t == type
|
128
|
+
end
|
129
|
+
false
|
130
|
+
end
|
60
131
|
end
|
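The option handling above validates the type by hand and stores numeric options as given. As an alternative, `OptionParser` can coerce and validate values itself. The following is a minimal sketch under stated assumptions: `parse_options` is a hypothetical helper, it returns a plain Hash instead of an OpenStruct, and it covers only a subset of the options from the diff.

```ruby
require 'optparse'

# Hypothetical helper; a sketch, not the gem's get_config.
def parse_options(argv)
  # Defaults taken from the diff above.
  options = { port: 3307, sql_port: 3306, threshold: 120, accept_donor: false }

  parser = OptionParser.new do |opts|
    # An Array restricts the argument to the listed values.
    opts.on('-t', '--type TYPE', %w[galera mysql],
            'TYPE of server') { |t| options[:type] = t }
    # Integer makes OptionParser convert the argument and reject non-numbers.
    opts.on('--port PORT', Integer,
            'Local PORT to use') { |p| options[:port] = p }
    opts.on('--threshold SECONDS', Integer,
            'MariaDB/MySQL slave lag threshold') { |s| options[:threshold] = s }
    opts.on('--[no-]accept-donor',
            'Accept galera cluster state "Donor/Desynced" as valid') { |a| options[:accept_donor] = a }
  end

  parser.parse!(argv.dup) # work on a copy so the caller's array is untouched
  options
end

p parse_options(%w[--type mysql --threshold 60])
```

With this approach an unknown type or a non-numeric threshold raises `OptionParser::InvalidArgument`, which the caller can rescue to print the usage message.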
data/lib/apprentice/server.rb
CHANGED
@@ -1,12 +1,14 @@
|
|
1
|
+
# Main server module consisting of all server related methods and classes
|
1
2
|
module Server
|
3
|
+
|
4
|
+
# The actual EM::Connection instance referenced by the EventServer class.
|
5
|
+
# Notice that we use Mysql2::Client::EM instead of the regular Mysql2::Client class.
|
2
6
|
class EventServer < EM::Connection
|
3
7
|
require 'apprentice/checker'
|
4
8
|
require 'mysql2/em'
|
5
9
|
include Checker
|
6
10
|
|
7
|
-
|
8
|
-
|
9
|
-
def initialize(options)
|
11
|
+
def initialize(options) #:nodoc:
|
10
12
|
@ip = options.ip
|
11
13
|
@port = options.port
|
12
14
|
@sql_port = options.sql_port
|
@@ -14,9 +16,20 @@ module Server
|
|
14
16
|
@user = options.user
|
15
17
|
@password = options.password
|
16
18
|
@donor_allowed = options.donor_allowed
|
19
|
+
@type = options.type
|
20
|
+
@threshold = options.threshold
|
17
21
|
@status = {}
|
18
22
|
end
|
19
23
|
|
24
|
+
# Take the raw data received on @port and initiate the checks against the server located at @server
|
25
|
+
#
|
26
|
+
# ==== Special conditions
|
27
|
+
#
|
28
|
+
# We are sending something to our client with #send_data inside the function, depending on what #run_checks returned to us during the function call.
|
29
|
+
#
|
30
|
+
# ==== Attributes
|
31
|
+
#
|
32
|
+
# * +data+ - We receive the actual HTTP request but since we're not a full blown HTTP server we don't actually use it to any extent
|
20
33
|
def receive_data(data)
|
21
34
|
response = run_checks
|
22
35
|
response_text = format_text(response[:text])
|
data/lib/apprentice/version.rb
CHANGED
data/ruby-apprentice.default
CHANGED
@@ -1,9 +1,17 @@
|
|
1
1
|
# Set to true to start the service
|
2
2
|
START=false
|
3
3
|
|
4
|
-
# MariaDB host
|
4
|
+
# MariaDB/MySQL host
|
5
5
|
DBHOST=''
|
6
6
|
# Username which shall be used to check the status
|
7
7
|
DBUSER=''
|
8
8
|
# Password
|
9
9
|
DBPASSWORD=''
|
10
|
+
# Type of server
|
11
|
+
# This should be either 'mysql' for MariaDB/MySQL slave lag detection
|
12
|
+
# or 'galera' for cluster member consistency checking
|
13
|
+
TYPE=''
|
14
|
+
# You can specify any other arguments you want to
|
15
|
+
# Example: '--threshold 60' for an accepted slave lag of 60 seconds
|
16
|
+
# For more options see 'apprentice --help'
|
17
|
+
EXTRA_ARGS=''
|
data/ruby-apprentice.init
CHANGED
@@ -5,7 +5,7 @@
|
|
5
5
|
# Required-Stop:
|
6
6
|
# Default-Start: 2 3 4 5
|
7
7
|
# Default-Stop: 0 1 6
|
8
|
-
# Short-Description: a MariaDB cluster integrity checker
|
8
|
+
# Short-Description: a MariaDB/MySQL slave lag and cluster integrity checker
|
9
9
|
### END INIT INFO
|
10
10
|
|
11
11
|
NAME="`basename ${0/.sh/}`"
|
@@ -30,7 +30,7 @@ do_start()
|
|
30
30
|
if [ ! "${START}" = "true" ]; then
|
31
31
|
log_failure_msg "this service is disabled. Enable it in /etc/default/$NAME"
|
32
32
|
return 2
|
33
|
-
elif [ ! "${DBHOST}" ] || [ ! "${DBPASSWORD}" ] || [ ! ${DBUSER} ] ; then
|
33
|
+
elif [ ! "${DBHOST}" ] || [ ! "${DBPASSWORD}" ] || [ ! ${DBUSER} ] || [ ! ${TYPE} ] ; then
|
34
34
|
log_failure_msg "Missing variables inside defaults file."
|
35
35
|
return 2
|
36
36
|
fi
|
@@ -41,13 +41,13 @@ do_start()
|
|
41
41
|
chown $USER:$GROUP "$pidfile_dirname"
|
42
42
|
chmod 0750 "$pidfile_dirname"
|
43
43
|
|
44
|
-
DAEMON_ARGS="--password ${DBPASSWORD} --user ${DBUSER} --server ${DBHOST} ${EXTRA_ARGS}"
|
44
|
+
DAEMON_ARGS="--password ${DBPASSWORD} --user ${DBUSER} --server ${DBHOST} --type ${TYPE} ${EXTRA_ARGS}"
|
45
45
|
|
46
46
|
start-stop-daemon --start --background --make-pidfile --quiet \
|
47
|
-
|
47
|
+
--user ${USER} --group ${GROUP} \
|
48
48
|
--pidfile ${PIDFILE} --exec ${DAEMON} --test > /dev/null || return 1
|
49
49
|
start-stop-daemon --start --background --make-pidfile --quiet \
|
50
|
-
|
50
|
+
--user ${USER} --group ${GROUP} \
|
51
51
|
--pidfile ${PIDFILE} --exec ${DAEMON} -- ${DAEMON_ARGS} || return 2
|
52
52
|
log_end_msg $?
|
53
53
|
}
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: apprentice
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
4
|
+
version: 0.0.6
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Moritz Heiber
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2013-09-
|
11
|
+
date: 2013-09-13 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|
@@ -38,7 +38,7 @@ dependencies:
|
|
38
38
|
- - '>='
|
39
39
|
- !ruby/object:Gem::Version
|
40
40
|
version: '0'
|
41
|
-
description: A MariaDB cluster integrity checker
|
41
|
+
description: A MariaDB/MySQL slave lag and cluster integrity checker
|
42
42
|
email:
|
43
43
|
- moritz.heiber@gmail.com
|
44
44
|
executables:
|
@@ -57,6 +57,7 @@ files:
|
|
57
57
|
- lib/apprentice.rb
|
58
58
|
- lib/apprentice/checker.rb
|
59
59
|
- lib/apprentice/checks/galera.rb
|
60
|
+
- lib/apprentice/checks/mysql.rb
|
60
61
|
- lib/apprentice/configuration.rb
|
61
62
|
- lib/apprentice/server.rb
|
62
63
|
- lib/apprentice/version.rb
|
@@ -87,7 +88,7 @@ rubyforge_project:
|
|
87
88
|
rubygems_version: 2.0.7
|
88
89
|
signing_key:
|
89
90
|
specification_version: 4
|
90
|
-
summary:
|
91
|
+
summary: Checks a given server for consistency and replication status
|
91
92
|
test_files:
|
92
93
|
- spec/lib/apprentice_spec.rb
|
93
94
|
- spec/spec_helper.rb
|