skylight 0.3.0 → 0.3.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +32 -0
- data/lib/skylight.rb +14 -8
- data/lib/skylight/api.rb +0 -2
- data/lib/skylight/version.rb +1 -1
- metadata +3 -4
- data/too-many-sockets.md +0 -62
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 4005d9f17790586799b47b937adeaf32049239cb
+data.tar.gz: b58afb77fd648bfbdc904a305161fb57a47ceeb1
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: e7c5939f477c3902f1684b20fb5030aef6bdc7554dd395589792dcbd6421224b75a60a9fd0e8d4292296afede7c3e582e8cf399af866e67ad8d2f7b855ccbfd1
+data.tar.gz: 9c5a136bc94db3c8bc51c80ec4ae191a23d6cb5e9eded4dc0e5bdcb1a60e5e1dd006bad9965bb9c9402195b8684849cb4783ec4160b705bbab9800ec2d586789
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,35 @@
+## 0.3.1 (March 8, 2014)
+
+* Fix requires to allow CLI to function without native extension.
+
+## 0.3.0 (February 28, 2014)
+
+* Native Rust agent
+* Send exceptions occurring during HTTP requests to the client.
+* Warn users when skylight is potentially disabled incorrectly.
+* Update SQL Lexer to 0.0.6
+* Log the backtraces of unhandled exceptions
+* Add support for disabling GC tracking
+* Add support for disabling agent
+
+## 0.2.7 (February 26, 2014)
+
+* Disable annotations to reduce memory load.
+
+## 0.2.6 (February 25, 2014)
+
+* `inspect` even whitelisted payload props
+* Ignore Errno::EINTR for 'ps' call
+
+## 0.2.5 (February 21, 2014)
+
+* Revert "Update SqlLexer to 0.0.4"
+
+## 0.2.4 (February 20, 2014)
+
+* Whitelist process action annotation keys.
+* Update SqlLexer to 0.0.4
+
 ## 0.2.3 (December 20, 2013)
 
 * Fix SQL lexing for comments, arrays, double-colon casting, and multiple queries
data/lib/skylight.rb
CHANGED
@@ -13,10 +13,22 @@ rescue LoadError
 raise if ENV.key?("SKYLIGHT_REQUIRED")
 end
 
+module Skylight
+TRACE_ENV_KEY = 'SKYLIGHT_ENABLE_TRACE_LOGS'.freeze
+
+autoload :Api, 'skylight/api'
+autoload :CLI, 'skylight/cli'
+autoload :Config, 'skylight/config'
+
+module Util
+autoload :Logging, 'skylight/util/logging'
+autoload :HTTP, 'skylight/util/http'
+end
+end
+
 if has_native_ext
 
 module Skylight
-TRACE_ENV_KEY = 'SKYLIGHT_ENABLE_TRACE_LOGS'.freeze
 STANDALONE_ENV_KEY = 'SKYLIGHT_STANDALONE'.freeze
 STANDALONE_ENV_VAL = 'server'.freeze
 
@@ -32,7 +44,6 @@ module Skylight
 require 'skylight/vm/gc'
 end
 
-autoload :Config, 'skylight/config'
 autoload :GC, 'skylight/gc'
 autoload :Helpers, 'skylight/helpers'
 autoload :Instrumenter, 'skylight/instrumenter'
@@ -175,9 +186,4 @@ module Skylight
 end
 end
 
-end
-
-module Skylight
-autoload :Api, 'skylight/api'
-autoload :CLI, 'skylight/cli'
-end
+end
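Read together, the lib/skylight.rb hunks move TRACE_ENV_KEY and the Config autoload out of the `if has_native_ext` branch, and move the Api and CLI autoloads from the bottom of the file into the same unconditional module at the top, next to new Util::Logging and Util::HTTP autoloads. That appears to be what the 0.3.1 CHANGELOG entry about the CLI working without the native extension refers to. A minimal Ruby sketch of the pattern follows. DemoAgent, its files, and the temporary directory are hypothetical stand-ins for illustration, not Skylight's real layout.

```ruby
require "tmpdir"

# Hypothetical stand-in for a gem's lib directory (not Skylight's real files).
LIB_DIR = Dir.mktmpdir
Dir.mkdir(File.join(LIB_DIR, "demo_agent"))
File.write(File.join(LIB_DIR, "demo_agent", "cli.rb"), <<~RUBY)
  module DemoAgent
    class CLI
      def self.run
        "cli ok"
      end
    end
  end
RUBY
$LOAD_PATH.unshift(LIB_DIR)

# Assumption for the sketch: the native extension failed to load.
has_native_ext = false

module DemoAgent
  # Registered unconditionally, so the constant resolves with or
  # without the native extension.
  autoload :CLI, "demo_agent/cli"
end

if has_native_ext
  module DemoAgent
    # Only pieces that really need the extension are registered here.
    autoload :Instrumenter, "demo_agent/instrumenter"
  end
end
```

With has_native_ext set to false, DemoAgent::CLI still resolves on first reference, while DemoAgent::Instrumenter is never registered.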
data/lib/skylight/api.rb
CHANGED
data/lib/skylight/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: skylight
 version: !ruby/object:Gem::Version
-version: 0.3.
+version: 0.3.1
 platform: ruby
 authors:
 - Tilde, Inc.
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-
+date: 2014-03-09 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: activesupport
@@ -139,7 +139,6 @@ files:
 - lib/sql_lexer.rb
 - lib/sql_lexer/lexer.rb
 - lib/sql_lexer/version.rb
-- too-many-sockets.md
 homepage: http://www.skylight.io
 licenses: []
 metadata: {}
@@ -159,7 +158,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.2.
+rubygems_version: 2.2.1
 signing_key:
 specification_version: 4
 summary: Skylight is a ruby application monitoring tool. Currently in closed beta.
data/too-many-sockets.md
DELETED
@@ -1,62 +0,0 @@
-We received a notification (Zendesk #280) from a customer that they were unable to SSH
-into their box, and that their service provider discovered that Skylight
-was triggering socket buffer limits.
-
-The customer immediately disabled Skylight, which fixed the bug for him.
-
-Customer Information:
-
-* Using Skylight 0.2.3
-* Ruby 2.0.0p353
-* Also had New Relic installed (Bug #46)
-* Recently upgraded to Rails 4
-* Passenger Enterprise with a max-pool size of 12 and zero-downtime deploys
-* He had a maxsockbuf of 32MB
-
-The customer also reported 134 open sockets in the server process.
-
-Separately, we discovered that New Relic was making our payloads
-extremely large.
-
-Our working hypothesis was that the large New Relic payloads were
-triggering the max buffer condition, which was then cascading into more
-failures.
-
-Upon further investigation, we discovered a few things:
-
-* If the client hits a kernel buffer limit (via EWOULDBLOCK that lasts
-more than 5s), it closes down its socket
-* The server expects to recover from this situation by getting the
-client socket in its read list from IO.select. It would then try to
-read from it, get an EOF, and close the socket.
-* If the server does not get the client socket in its read list for some
-(unknown) reason, this would result in an ever-growing list of sockets
-on the server side.
-
-We are not sure why the sockets would not appear in the read list, but
-we hypothesize that when the OpenVZ limit is reached, the kernel no
-longer includes the socket in the read list, causing the server to never
-close the socket.
-
-Mitigation strategies:
-
-* The New Relic fix (#46) should reduce the likelihood of encountering
-this in the first place
-* We want to add a Hello heartbeat from the client. If the server
-doesn't receive a message every 1m (or 2m, TBD), it will close the
-socket even if it's not in the read list.
-
-We are worried that if the server process ever gets stuck, this
-condition can occur. We are also considering a server heartbeat back to
-the clients, so they can take corrective action if the agent gets stuck.
-
-Some mitigations for this situation:
-
-* Kill the server process when a respawn is necessary. This will ensure
-that a stuck agent process doesn't duplicate into N servers.
-* Before connecting to the server socket, pre-check the server process
-with kill -0.
-
-For next time:
-* Request skylight.log for easier debugging
-* Form a best practices for customer bug reporting on doc site
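The deleted too-many-sockets.md notes propose two mitigations that are easy to picture in code: closing client sockets that have sent nothing for a heartbeat interval, even if IO.select never reports them as readable, and pre-checking the server process with kill -0 before connecting. The Ruby sketch below illustrates both ideas under stated assumptions. The helper names process_alive? and reap_idle_clients, the last_seen bookkeeping hash, and the 60-second default are hypothetical and are not taken from Skylight's code.

```ruby
require "socket"

# Hypothetical helper: the Ruby equivalent of `kill -0 <pid>`.
# Signal 0 delivers nothing; it only checks existence and permission.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but is owned by another user
end

# Hypothetical helper: close and forget clients that have been silent
# for longer than `timeout` seconds. `last_seen` maps each client IO to
# the Time of its last message (for example, a Hello heartbeat).
# The caller is assumed to update last_seen whenever a message arrives.
def reap_idle_clients(last_seen, now: Time.now, timeout: 60)
  idle = last_seen.select { |_io, seen| now - seen > timeout }.keys
  idle.each do |io|
    io.close unless io.closed?
    last_seen.delete(io)
  end
  idle
end
```

Because the reaper works from timestamps rather than from the read list, it would still release a socket that the kernel never reports as readable, which is the failure the notes hypothesize.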