skylight 0.2.6 → 0.2.7
- checksums.yaml +4 -4
- data/lib/skylight/messages/span.rb +1 -1
- data/lib/skylight/version.rb +1 -1
- metadata +10 -11
- data/too-many-sockets.md +0 -62
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: edae1bae15bae8f3c6570992aeef722d03a6daba
+ data.tar.gz: 0bf27cb52e79314929c6d1e11e38956dcc93fa77
  SHA512:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: 30f1914554e84ef1ec3f7e728787a36cc12b8aa469798ec18bc8ba3f142bc6c305fd98f2917d09eb058acbf691282ff113cc1aa302ca22db2c5c5af8b026560c
+ data.tar.gz: 292f7e33dc69706101430e2f0e6c21712d00526c7d789f7ca7bcaa02e1f622501c26ef493de776acbe7f448032ec0909f8f58dac60ad66d7e94df56cf3690fa7
data/lib/skylight/messages/span.rb
CHANGED
@@ -136,7 +136,7 @@ module Skylight
  category: category,
  title: title,
  description: description),
- annotations:
+ annotations: nil,
  started_at: @started_at,
  duration: duration && duration > 0 ? duration : nil,
  children: @children > 0 ? @children : nil)
data/lib/skylight/version.rb
CHANGED
metadata
CHANGED
@@ -1,27 +1,27 @@
  --- !ruby/object:Gem::Specification
  name: skylight
  version: !ruby/object:Gem::Version
- version: 0.2.
+ version: 0.2.7
  platform: ruby
  authors:
  - Tilde, Inc.
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-02-
+ date: 2014-02-26 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activesupport
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - -
+ - - '>='
  - !ruby/object:Gem::Version
  version: 3.0.0
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
- - -
+ - - '>='
  - !ruby/object:Gem::Version
  version: 3.0.0
  description:
@@ -32,9 +32,6 @@ executables:
  extensions: []
  extra_rdoc_files: []
  files:
- - CHANGELOG.md
- - README.md
- - bin/skylight
  - lib/skylight.rb
  - lib/skylight/api.rb
  - lib/skylight/cli.rb
@@ -142,7 +139,9 @@ files:
  - lib/sql_lexer.rb
  - lib/sql_lexer/lexer.rb
  - lib/sql_lexer/version.rb
- -
+ - CHANGELOG.md
+ - README.md
+ - bin/skylight
  homepage: http://www.skylight.io
  licenses: []
  metadata: {}
@@ -152,17 +151,17 @@ require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
- - -
+ - - '>='
  - !ruby/object:Gem::Version
  version: 1.9.2
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
- - -
+ - - '>='
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.
+ rubygems_version: 2.0.3
  signing_key:
  specification_version: 4
  summary: Skylight is a ruby application monitoring tool. Currently in closed beta.
data/too-many-sockets.md
DELETED
@@ -1,62 +0,0 @@
- We received a notification (Zendesk #280) from a customer that they were unable to SSH
- into their box, and that their service provider discovered that Skylight
- was triggering socket buffer limits.
-
- The customer immediately disabled Skylight, which fixed the bug for him.
-
- Customer Information:
-
- * Using Skylight 0.2.3
- * Ruby 2.0.0p353
- * Also had New Relic installed (Bug #46)
- * Recently upgraded to Rails 4
- * Passenger Enterprise with a max-pool size of 12 and zero-downtime deploys
- * He had a maxsockbuf of 32MB
-
- The customer also reported 134 open sockets in the server process.
-
- Separately, we discovered that New Relic was making our payloads
- extremely large.
-
- Our working hypothesis was that the large New Relic payloads were
- triggering the max buffer condition, which was then cascading into more
- failures.
-
- Upon further investigation, we discovered a few things:
-
- * If the client hits a kernel buffer limit (via EWOULDBLOCK that lasts
- more than 5s), it closes down its socket
- * The server expects to recover from this situation by getting the
- client socket in its read list from IO.select. It would then try to
- read from it, get an EOF, and close the socket.
- * If the server does not get the client socket in its read list for some
- (unknown) reason, this would result in an ever-growing list of sockets
- on the server side.
-
- We are not sure why the sockets would not appear in the read list, but
- we hypothesize that when the OpenVZ limit is reached, the kernel no
- longer includes the socket in the read list, causing the server to never
- close the socket.
-
- Mitigation strategies:
-
- * The New Relic fix (#46) should reduce the likelihood of encountering
- this in the first place
- * We want to add a Hello heartbeat from the client. If the server
- doesn't receive a message every 1m (or 2m, TBD), it will close the
- socket even if it's not in the read list.
-
- We are worried that if the server process ever gets stuck, this
- condition can occur. We are also considering a server heartbeat back to
- the clients, so they can take corrective action if the agent gets stuck.
-
- Some mitigations for this situation:
-
- * Kill the server process when a respawn is necessary. This will ensure
- that a stuck agent process doesn't duplicate into N servers.
- * Before connecting to the server socket, pre-check the server process
- with kill -0.
-
- For next time:
- * Request skylight.log for easier debugging
- * Form a best practices for customer bug reporting on doc site
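The heartbeat and `kill -0` mitigations described in the deleted too-many-sockets.md notes could look roughly like the following. This is a minimal sketch and not Skylight's agent code: `sweep_clients`, `process_alive?`, `HEARTBEAT_TIMEOUT`, and the 60-second value (the notes say "1m (or 2m, TBD)") are all assumptions made for illustration.

```ruby
require "socket"

# Hypothetical timeout in seconds; the notes leave the real value undecided.
HEARTBEAT_TIMEOUT = 60

# One pass over the server's client sockets (hypothetical helper).
# Closes sockets that report EOF via IO.select, and also closes sockets that
# have been silent longer than `timeout`, even if select never reports them.
# `last_seen` maps each socket to the time of its last message.
# Returns the sockets that remain open.
def sweep_clients(clients, last_seen,
                  now: Process.clock_gettime(Process::CLOCK_MONOTONIC),
                  timeout: HEARTBEAT_TIMEOUT)
  clients.each { |sock| last_seen[sock] ||= now } # start the clock for new clients

  readable, = IO.select(clients, nil, nil, 0) # poll without blocking
  (readable || []).each do |sock|
    data = sock.read_nonblock(4096, exception: false)
    if data.nil?
      sock.close               # EOF: the client shut its end down
    elsif data != :wait_readable
      last_seen[sock] = now    # any message counts as a heartbeat
    end
  end

  clients.reject do |sock|
    stale = !sock.closed? && (now - last_seen[sock] > timeout)
    sock.close if stale        # silent for too long: close it anyway
    last_seen.delete(sock) if sock.closed?
    sock.closed?
  end
end

# Equivalent of `kill -0 PID`: checks that a process exists without
# actually delivering a signal to it (hypothetical helper).
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end
```

The sweep tracks the last message time per socket rather than relying only on the read list, so a socket the kernel never reports as readable is still closed once the timeout passes.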