flamingo 0.1 → 0.2.0

data/README.md CHANGED
@@ -2,9 +2,13 @@ Flamingo
  ========
  Flamingo is a resque-based system for handling the Twitter Streaming API.
 
- This is *early alpha* code. There will be a lot of change and things like tests
- coming in the future. That said, it does work so give it a try if you have the
- need.
+ This is *early alpha* code. Parts of it are graceful, like the curve of a
+ flamingo's neck: it capably processes the multiple high-volume sample and filter
+ streams that power tweetreach.com. Many parts of it are ungainly, like a
+ flamingo's knees: this is early code, and it will change rapidly. And parts of
+ it are mired in muck, like a flamingo's feet: it has too few tests, and surely
+ some configuration we forgot to tell you about. That said, it does work: give it
+ a try if you have the need.
 
  Dependencies
  ------------
@@ -13,6 +17,8 @@ Dependencies
  * sinatra
  * twitter-stream
  * yajl-ruby
+ * active_support
+ * redis-namespace
 
  By default, the `resque` gem installs the latest 2.x `redis` gem, so if
  you are using Redis 1.x, you may want to swap it out.
@@ -30,44 +36,133 @@ Getting Started
30
36
  1. Install the gem
31
37
  sudo gem install flamingo
32
38
 
33
- 2. Create a config file (see `examples/flamingo.yml`) with at least a username and password
39
+ 2. Create a config file (see `examples/flamingo.yml`) with at least a username
40
+ and password. You can store this in ~/flamingo.yml or specify it on the
41
+ commandline (see below)
34
42
 
35
- username: USERNAME
43
+ username: SCREEN_NAME
36
44
  password: PASSWORD
37
- stream: filter
45
+
46
+ # should be "filter" or "sample", probably.
47
+ # Set the track terms for "filter" from the flamingo console (see README)
48
+ stream: filter
49
+
38
50
  logging:
39
- dest: /YOUR/LOG/PATH.LOG
40
- level: LOGLEVEL
51
+ dest: /tmp/flamingo.log
52
+ level: DEBUG
53
+
54
+ redis:
55
+ host: 0.0.0.0:6379
56
+ web:
57
+ host: 0.0.0.0:4711
41
58
 
42
59
  `LOGLEVEL` is one of the following:
43
60
  `DEBUG` < `INFO` < `WARN` < `ERROR` < `FATAL` < `UNKNOWN`
44
61
 
45
- 3. Start the Redis server
62
+ 3. Start the Redis server, and (optionally) open the resque web dashboard:
46
63
 
47
64
  $ redis-server
65
+ $ resque-web
66
+
67
+ 4. To set your tracking terms and start the queue subscription, jump into the `flamingo` client (installed during `gem install`):
48
68
 
49
- 4. Configure tracking using `flamingo` client (installed during `gem install`)
69
+ $ flamingo path/to/flamingo.yml
70
+
71
+ 5. This is a regular-old irb console, so anything ruby goes. First, register the terms you'd like to search on. This doesn't have a direct effect: it just pokes the values into the database so that the wader knows what to listen for.
50
72
 
51
- $ flamingo
52
73
  >> s = Stream.get(:filter)
53
- >> s.params[:track] = %w(FOO BAR BAZ)
54
- >> Subscription.new('YOUR_QUEUE').save
74
+ >> s.params[:track] = ["@cheaptweet", "austin", "#etsy"]
75
+
76
+ For now, use those three actual terms -- they'll give you a nice, testable receipt rate that is neither too slow ('...is this thing on?') nor torrential (you can watch the stream from your terminal window). Also note that you don't have to escape the tracking terms: twitter-stream will handle all that.
77
+
78
+ 6. Your second task from the flamingo console is to route the incoming tweets onto a queue -- in this case the EXAMPLE queue. This is used by the flamingod we'll start next but has no direct effect now.
79
+
80
+ >> Subscription.new('EXAMPLE').save
55
81
 
56
- 5. Start the Flamingo Daemon (`flamingod` installed during `gem install`)
82
+ 7. Start the Flamingo Daemon (`flamingod` installed during `gem install`), and also start watching its log file:
57
83
 
58
- $ flamingod -c your/config/file.yml
84
+ $ flamingod -c path/to/flamingo.yml
85
+ $ tail -f /tmp/flamingo.log
59
86
 
87
+ If things go well, you'll see something like
88
+
89
+ [2010-07-20 05:58:07, INFO] - Loaded config file from flamingo.yml
90
+ [2010-07-20 05:58:07, INFO] - Flamingod starting children
91
+ [2010-07-20 05:58:07, INFO] - Flamingod starting new wader
92
+ [2010-07-20 05:58:07, INFO] - Flamingod starting new dispatcher
93
+ [2010-07-20 05:58:07, INFO] - Flamingod starting new web server
94
+ [2010-07-20 05:58:07, INFO] - Starting wader on pid=91008 under pid=91003
95
+ [2010-07-20 05:58:07, INFO] - Starting dispatcher on pid=91009 under pid=91003
96
+ [2010-07-20 05:58:12, INFO] - Listening on stream: /1/statuses/filter.json?track=%23etsy,austin,cheaptweet
97
+ ... short initial delay ....
98
+ [2010-07-20 05:58:42, DEBUG] - Wader dispatched event
99
+ [2010-07-20 05:58:42, DEBUG] - Put job on subscription queue EXAMPLE for {"text":"If you ever visit Austin make sure to go to Torchy's Tacos",...
100
+
101
+ On the resque-web dashboard, you should see a queue come up called EXAMPLE, with jobs accruing. There will only be 0 of 0 workers working: let's fix that.
60
102
 
61
- 6. Consume events with a resque worker
103
+ 8. You'll consume those events with a resque worker, something like the following but more audacious:
62
104
 
63
105
  class HandleFlamingoEvent
64
-
65
106
  # type: One of "tweet" or "delete"
66
107
  # event: a hash of the json data from twitter
67
108
  def self.perform(type,event)
68
- # Do stuff with the data
109
+ # Do stuff with the data, probably something more interesting than this:
110
+ puts [type, event].inspect
69
111
  end
70
-
71
112
  end
113
+
114
+ 9. Start the worker task (see `examples/Rakefile`):
72
115
 
73
- $ QUEUE=YOUR_QUEUE rake resque:work
116
+ $ QUEUE=EXAMPLE rake -f examples/Rakefile resque:work
117
+
118
+ Two things should now happen:
119
+ * The pent-up jobs from the EXAMPLE queue should spray across your console
120
+ * The resque dashboard should show the queue being emptied as a result
121
+
122
+
123
+
124
+ Overview
125
+ --------
126
+
127
+ Flamingo uses EventMachine, sinatra and the twitter-stream API library to
128
+ efficiently route and process stream and dispatch events. Here are the
129
+ components of the flamingo flock:
130
+
131
+ *flamingo daemon (flamingod)*
132
+
133
+ Coordinates the wader process (initiates stream request, pushes each response
134
+ into the queue), the Sinatra webserver (handles subscriptions and changing
135
+ stream parameters), and a set of dispatchers (routes responses).
136
+
137
+ You can control flamingod with the following signals:
138
+
139
+ * TERM and INT will kill the flamingod parent process, and signal each child with TERM
140
+ * USR1 will restart the wader gracefully. This is used to change stream parameters
141
+
142
+ *wader*
143
+
144
+ The wader process starts the stream and dispatches stream responses as they arrive into a Resque queue.
145
+
146
+ *web server*
147
+
148
+ The flamingo webserver code creates and manages stream requests using a
149
+ lightweight Sinatra responder.
150
+
151
+ *workers*
152
+
153
+ This is the part you write. These are standard resque workers, living on one or
154
+ many machines, doing anything that your heart can imagine and your fingers can
155
+ code.
156
+
157
+
158
+ TODO
159
+ -----
160
+ * OAuth instructions
161
+
162
+
163
+ Flamingo
164
+ --------
165
+
166
+ Here is a photo of a flamingo:
167
+
168
+ ![Flamingo!](http://farm4.static.flickr.com/3438/3302580937_0ec540b73e_z_d.jpg "Flamingo Photo by William Warby, CC-BY License: http://www.flickr.com/photos/wwarby/3302580937 :: photo taken 21 Feb 2009 in Dagnall, England.")
@@ -1,15 +1,22 @@
- # Simple example that reads from a subscription queue and writes the events
- # to STDOUT
+ #
+ # Simple example: reads from a subscription queue, writes the events to STDOUT
+ #
  # Usage (from this directory):
- # $ QUEUE=YOUR_QUEUE rake resque:work
+ # $ QUEUE=EXAMPLE rake resque:work
 
  require 'rubygems'
  require 'resque/tasks'
 
  class HandleFlamingoEvent
-
-   def self.perform(type,event)
-     puts type, event
+
+   #
+   # type: One of "tweet" or "delete"
+   # event: a hash of the json data from twitter
+   #
+   #def self.perform(type, event_info, event)
+   def self.perform(type, event)
+     # Do stuff with the data, probably something more interesting than this:
+     puts [type, event].inspect
    end
-
- end
+
+ end
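A worker along the lines of the `HandleFlamingoEvent` example might branch on the event type rather than just printing it. The following is a minimal sketch and not part of the gem: the `summarize` helper is hypothetical, and the `text` and `status` field names are assumptions about the Twitter payload. The type is compared as a string because resque serializes job arguments as JSON.

```ruby
# Hypothetical worker sketch; the field names below are assumptions about the payload.
class HandleFlamingoEvent
  # type:  "tweet" or "delete" (a string once the job has round-tripped through JSON)
  # event: a hash of the JSON data from Twitter, with string keys
  def self.perform(type, event)
    puts summarize(type, event)
  end

  # Builds a one-line description so the branching can be exercised without a queue.
  def self.summarize(type, event)
    case type.to_s
    when "tweet"
      "tweet: #{event["text"]}"
    when "delete"
      status = event["status"] || {}
      "delete: status #{status["id"]}"
    else
      "unhandled #{type}: #{event.inspect}"
    end
  end
end
```

Keeping the formatting in a separate method means the logic can be checked by calling `HandleFlamingoEvent.summarize` directly, without Redis or a running worker.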
@@ -1,10 +1,22 @@
- username: SCREEN_NAME
- password: PASSWORD
- stream: filter
+ username: SCREEN_NAME
+ password: PASSWORD
+
+ # either "filter" or "sample"
+ # For filter, set the terms to track in the flamingo console (see README.md)
+ stream: filter
+
+ # Point the logs where you like.
+ # Should change the log level from DEBUG to INFO before you deploy: allowed levels are
+ # DEBUG < INFO < WARN < ERROR < FATAL < UNKNOWN
  logging:
-   dest: /your/log/file/here.log
-   level: INFO
+   dest: /tmp/flamingo.log
+   level: DEBUG
+
+ # Where is the redis server the flamingod processes should connect to?
  redis:
-   host: 0.0.0.0:6379
+   host: 0.0.0.0:6379
+
+ # What port and interface should the flamingod web_server listen on?
+ # use 0.0.0.0 for all interfaces, 127.0.0.1 to listen on only localhost
  web:
-   host: 0.0.0.0:4711
+   host: 0.0.0.0:4711
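To make the shape of this config concrete, here is a small sketch of reading it with Ruby's standard YAML library and splitting the `host:port` values. It is an illustration only: `split_host_port` is a hypothetical helper, and nothing here claims to match how flamingo itself loads the file.

```ruby
require "yaml"

# Sample text mirrors examples/flamingo.yml; the helper name is hypothetical.
SAMPLE_CONFIG = <<~YAML
  username: SCREEN_NAME
  password: PASSWORD
  stream: filter
  logging:
    dest: /tmp/flamingo.log
    level: DEBUG
  redis:
    host: 0.0.0.0:6379
  web:
    host: 0.0.0.0:4711
YAML

# Splits a "host:port" string into its two parts, with the port as an Integer.
def split_host_port(value)
  host, port = value.split(":", 2)
  [host, Integer(port)]
end

config = YAML.safe_load(SAMPLE_CONFIG)
redis_host, redis_port = split_host_port(config["redis"]["host"])
web_host, web_port = split_host_port(config["web"]["host"])
puts "redis at #{redis_host}:#{redis_port}, web at #{web_host}:#{web_port}"
```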
@@ -17,6 +17,7 @@ require 'flamingo/stream_params'
  require 'flamingo/stream'
  require 'flamingo/subscription'
  require 'flamingo/wader'
+ require 'flamingo/daemon/trap_keeper'
  require 'flamingo/daemon/pid_file'
  require 'flamingo/daemon/child_process'
  require 'flamingo/daemon/dispatcher_process'
@@ -37,6 +38,10 @@ module Flamingo
      logger.info "Loaded config file from #{config_file}"
    end
 
+   def config=(config)
+     @config = config
+   end
+
    def config
      @config
    end
@@ -98,6 +103,10 @@ module Flamingo
      @logger ||= new_logger
    end
 
+   def logger=(logger)
+     @logger = logger
+   end
+
    private
    def root_dir
      File.expand_path(File.dirname(__FILE__)+'/..')
@@ -1,6 +1,10 @@
  module Flamingo
    module Daemon
      class ChildProcess
+
+       # For process-scoping of traps
+       include TrapKeeper
+
        attr_accessor :pid
 
        def kill(sig)
@@ -8,6 +12,17 @@ module Flamingo
        end
        alias_method :signal, :kill
 
+       def running?
+         # Borrowed from daemons gem
+         Process.kill(0, pid)
+         return true
+       rescue Errno::ESRCH
+         return false
+       rescue ::Exception
+         # for example on EPERM (process exists but does not belong to us)
+         return true
+       end
+
        def start
          self.pid = fork { run }
        end
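The `running?` method relies on the convention that sending signal 0 checks whether a process exists without actually delivering a signal. The sketch below shows that probe in isolation. `process_running?` is a hypothetical standalone helper, and rescuing only `Errno::EPERM` instead of `::Exception` is a choice made for this sketch, not what the diff does.

```ruby
# Hypothetical helper; mirrors the signal-0 probe used by ChildProcess#running?.
def process_running?(pid)
  Process.kill(0, pid)   # signal 0 performs the existence check only
  true
rescue Errno::ESRCH      # no such process
  false
rescue Errno::EPERM      # process exists but belongs to another user
  true
end

child = fork { sleep }          # child that blocks until it is signaled
alive_before = process_running?(child)
Process.kill("TERM", child)
Process.wait(child)             # reap it so the pid no longer names a live process
alive_after = process_running?(child)
puts "before=#{alive_before} after=#{alive_after}"
```

The narrower rescue keeps unrelated failures (a `nil` pid, for instance) visible instead of reporting them as "still running".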
@@ -1,7 +1,25 @@
  module Flamingo
    module Daemon
+     #
+     # Flamingod is the main overseer of the Flamingo flock.
+     #
+     # Starts three sets of children:
+     #
+     # * A wader process: initiates stream request, pushes each response into the queue
+     # * A Sinatra server: lightweight responder to create and manage subscriptions
+     # * A set of dispatchers: worker processes that handle each stream response.
+     #
+     # You can control the flamingod with the following signals:
+     #
+     # * TERM and INT will kill the flamingod parent process, and signal each
+     #   child with TERM
+     # * USR1 will restart the wader gracefully.
+     #
      class Flamingod
-
+
+       # For process-scoping of traps
+       include TrapKeeper
+
        def exit_signaled?
          @exit_signaled
        end
@@ -40,14 +58,27 @@ module Flamingo
        end
 
        def restart_wader
-         Flamingo.logger.info "Flamingod restarting wader pid=#{@wader.pid} with SIGINT"
-         @wader.kill("INT")
+         if @wader
+           Flamingo.logger.info "Flamingod restarting wader pid=#{@wader.pid} with SIGINT"
+           @wader.kill("INT")
+         else
+           Flamingo.logger.info "Wader is not started. Attempting to start new wader."
+           @wader = start_new_wader
+         end
        end
 
        def signal_children(sig)
          pids = (children.map {|c| c.pid}).join(",")
          Flamingo.logger.info "Flamingod sending SIG#{sig} to pids=#{pids}"
-         children.each {|child| child.signal(sig) }
+         children.each do |child|
+           if child.running?
+             begin
+               child.signal(sig)
+             rescue => e
+               Flamingo.logger.info "Failure sending SIG#{sig} to child #{child.pid}: #{e}"
+             end
+           end
+         end
        end
 
        def terminate!
@@ -57,7 +88,7 @@ module Flamingo
        end
 
        def children
-         [@wader,@web_server] + @dispatchers
+         ([@wader,@web_server] + @dispatchers).compact
        end
 
        def start_children
@@ -67,12 +98,17 @@ module Flamingo
          @web_server = start_new_web_server
        end
 
+       #
+       # Unless signaled externally, waits in an endless loop. If any child
+       # process terminates, it restarts that process.
+       # TODO Needs intelligent behavior so we don't get endless loops
        def wait_on_children()
          until exit_signaled?
            child_pid = Process.wait(-1)
+           child_status = $?
            unless exit_signaled?
-             if @wader.pid == child_pid
-               @wader = start_new_wader
+             if @wader && @wader.pid == child_pid
+               handle_wader_exit(child_status)
              elsif @web_server.pid == child_pid
                @web_server = start_new_web_server
              elsif (to_delete = @dispatchers.find{|d| d.pid == child_pid})
@@ -84,6 +120,17 @@ module Flamingo
            end
          end
        end
+
+       def handle_wader_exit(status)
+         if WaderProcess.fatal_exit?(status)
+           Flamingo.logger.error "Wader exited with status "+
+             "#{status.exitstatus} and cannot be automatically restarted"
+           $stderr.write("Wader exited with fatal error. Check the log.")
+           terminate!
+         else
+           @wader = start_new_wader
+         end
+       end
 
        def run_as_daemon
          pid_file = PidFile.new
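The TODO on `wait_on_children` notes that a child which keeps dying would be restarted forever. One possible guard, offered only as a sketch and not something this diff implements, is a sliding-window restart budget. The `RestartBudget` class, its limit and its window are all hypothetical.

```ruby
# Hypothetical helper, not part of flamingo: allows at most `limit` restarts
# within any `window` seconds.
class RestartBudget
  def initialize(limit, window)
    @limit = limit
    @window = window
    @stamps = []
  end

  # Records a restart attempt at time `now` (in seconds) and returns true when
  # it fits the budget, false when the caller should back off or give up.
  def allow?(now = Time.now.to_f)
    @stamps.reject! { |stamp| now - stamp >= @window }
    return false if @stamps.size >= @limit
    @stamps << now
    true
  end
end

budget = RestartBudget.new(3, 60)
results = [0, 1, 2, 3, 61].map { |t| budget.allow?(t) }
puts results.inspect
```

A supervisor loop could consult `allow?` before calling something like `start_new_wader`, and terminate or sleep when it returns false. Passing the clock value in explicitly keeps the class easy to test.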
@@ -0,0 +1,22 @@
+ module Flamingo
+   module Daemon
+     module TrapKeeper
+
+       # Use instead of Kernel.trap to ensure that only the process that
+       # originally registered the trap has its block executed. This is necessary
+       # for cases where we fork after setting up traps since the child process
+       # gets the traps from the parent.
+       def trap(signal,&block)
+         owner_pid = Process.pid
+         Kernel.trap(signal) do
+           if Process.pid == owner_pid
+             block.call
+           end
+         end
+       end
+
+       module_function :trap
+
+     end
+   end
+ end
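The idea behind `TrapKeeper` is that a forked child inherits its parent's signal handlers, so the handler records the pid that registered it and does nothing in any other process. The sketch below demonstrates that guard without involving real signals. `pid_scoped` is a hypothetical helper written for this illustration.

```ruby
# Hypothetical helper illustrating the TrapKeeper idea: remember the pid that
# built the handler and ignore calls made from any other process.
def pid_scoped(&block)
  owner_pid = Process.pid
  lambda do |*_args|
    block.call if Process.pid == owner_pid
  end
end

reader, writer = IO.pipe
handler = pid_scoped { writer.write("ran in #{Process.pid}\n") }

child = fork do
  handler.call      # inherited by the fork, but the pid differs, so nothing is written
  exit!(0)
end
Process.wait(child)

handler.call        # same pid that created it, so the block runs
writer.close
output = reader.read
puts output

# The same lambda could be registered as a handler, e.g. Signal.trap("USR1", handler)
```

Only one line reaches the pipe, written by the parent, even though both processes invoked the handler.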
@@ -1,6 +1,29 @@
  module Flamingo
    module Daemon
      class WaderProcess < ChildProcess
+
+       # Exit codes
+       EXIT_CLEAN = 0
+
+       # Non-fatal exit code - For transient network errors where a retry is
+       # likely to resolve the problem
+       EXIT_UNKNOWN_ERROR = 001
+       EXIT_MAX_RECONNECTS = 002
+       EXIT_SERVER_UNAVAILABLE = 003
+
+       # 1XX is a fatal exit code - Human intervention or a configuration change
+       # is necessary to get the wader started
+       EXIT_FATAL_RANGE = 100..199
+       EXIT_AUTHENTICATION = 100
+       EXIT_UNKNOWN_STREAM = 101
+       EXIT_INVALID_PARAMS = 102
+
+       class << self
+         def fatal_exit?(status)
+           status && EXIT_FATAL_RANGE.include?(status.exitstatus)
+         end
+       end
+
        def register_signal_handlers
          trap("INT") { stop }
        end
@@ -16,13 +39,39 @@ module Flamingo
 
          @wader = Flamingo::Wader.new(screen_name,password,stream)
          Flamingo.logger.info "Starting wader on pid=#{Process.pid} under pid=#{Process.ppid}"
-         @wader.run
-         Flamingo.logger.info "Wader pid=#{Process.pid} stopped"
+
+         exit_code = EXIT_CLEAN
+         begin
+           @wader.run
+         rescue => e
+           exit_code = error_exit_code(e)
+         end
+
+         Flamingo.logger.info "Wader pid=#{Process.pid} exited with code #{exit_code}"
+         exit(exit_code)
        end
 
        def stop
          @wader.stop
        end
+
+       private
+       def error_exit_code(ex)
+         case ex
+         when Flamingo::Wader::AuthenticationError
+           then EXIT_AUTHENTICATION
+         when Flamingo::Wader::UnknownStreamError
+           then EXIT_UNKNOWN_STREAM
+         when Flamingo::Wader::InvalidParametersError
+           then EXIT_INVALID_PARAMS
+         when Flamingo::Wader::MaxReconnectsExceededError
+           then EXIT_MAX_RECONNECTS
+         when Flamingo::Wader::ServerUnavailableError
+           then EXIT_SERVER_UNAVAILABLE
+         else
+           EXIT_UNKNOWN_ERROR
+         end
+       end
      end
    end
  end
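The wader communicates with its parent purely through its exit code: the child calls `exit(code)` and the parent reads `Process::Status#exitstatus` after `Process.wait`. The sketch below exercises that round trip with throwaway children. The constants copy values from the diff, while `exit_status_of` and the top-level `fatal_exit?` are hypothetical helpers for illustration.

```ruby
# Constants mirror the diff; the helpers and child processes are illustrative only.
EXIT_FATAL_RANGE = 100..199
EXIT_AUTHENTICATION = 100
EXIT_MAX_RECONNECTS = 2

# True when the child's exit status falls in the fatal 1XX range.
def fatal_exit?(status)
  !status.nil? && EXIT_FATAL_RANGE.include?(status.exitstatus)
end

# Forks a child that exits immediately with `code` and returns its Process::Status.
def exit_status_of(code)
  pid = fork { exit!(code) }
  Process.wait(pid)
  $?
end

fatal = fatal_exit?(exit_status_of(EXIT_AUTHENTICATION))
transient = fatal_exit?(exit_status_of(EXIT_MAX_RECONNECTS))
puts "authentication failure fatal? #{fatal}; max reconnects fatal? #{transient}"
```

In this scheme a fatal code would lead the supervisor to stop, while any other code would lead to a restart.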
@@ -1,40 +1,41 @@
1
1
  module Flamingo
2
2
  class DispatchEvent
3
-
3
+
4
4
  @queue = :flamingo
5
5
  @parser = Yajl::Parser.new(:symbolize_keys => true)
6
-
6
+
7
7
  class << self
8
-
8
+
9
+ #
10
+ # TODO Track stats including: tweets per second and last tweet time
11
+ # TODO Provide some first-level check for repeated status ids
12
+ # TODO Consider subscribers for receiving particular terms - do the heavy
13
+ # lifting of parsing tweets and delivering them to particular subscribers
14
+ # TODO Consider window of tweets (approx 3 seconds) and sort before
15
+ # dispatching to improve in-order delivery (helps with "k-sorted")
16
+ #
9
17
  def perform(event_json)
10
- #TODO Track stats including: tweets per second and last tweet time
11
- #TODO Provide some first-level check for repeated status ids
12
- #TODO Consider subscribers for receiving particular terms - do the heavy
13
- # lifting of parsing tweets and delivering them to particular subscribers
14
- #TODO Consider window of tweets (approx 3 seconds) and sort before
15
- # dispatching to improve in-order delivery (helps with "k-sorted")
16
18
  type, event = typed_event(parse(event_json))
17
- # Flamingo.logger.info Flamingo.router.destinations(type,event).inspect
18
19
  Subscription.all.each do |sub|
19
20
  Resque::Job.create(sub.name, "HandleFlamingoEvent", type, event)
20
21
  Flamingo.logger.debug "Put job on subscription queue #{sub.name} for #{event_json}"
21
22
  end
22
23
  end
23
-
24
+
24
25
  def parse(json)
25
26
  @parser.parse(json)
26
27
  end
27
-
28
+
28
29
  def typed_event(event)
29
30
  if event[:delete]
30
- [:delete,event[:delete]]
31
+ [:delete, event[:delete]]
31
32
  elsif event[:link]
32
- [:link,event[:link]]
33
+ [:link, event[:link]]
33
34
  else
34
- [:tweet,event]
35
+ [:tweet, event]
35
36
  end
36
37
  end
37
-
38
+
38
39
  end
39
40
  end
40
41
  end
@@ -1,41 +1,42 @@
1
1
  module Flamingo
2
-
2
+ #
3
+ # Facade for redis:
4
+ # database object that behaves like a hash
5
+ #
3
6
  class StreamParams
4
-
5
7
  include Enumerable
6
-
7
8
  attr_accessor :stream_name
8
-
9
+
9
10
  def initialize(stream_name)
10
11
  self.stream_name = stream_name
11
12
  end
12
-
13
+
13
14
  def set(key,*values)
14
15
  delete(key)
15
16
  add(key,*values)
16
17
  end
17
-
18
+
18
19
  def []=(key,values)
19
20
  values = [values] unless values.is_a?(Array)
20
21
  set(key,*values)
21
22
  end
22
-
23
+
23
24
  def add(key,*values)
24
25
  values.each do |value|
25
26
  Flamingo.redis.sadd redis_key(key), value
26
27
  end
27
28
  end
28
-
29
+
29
30
  def remove(key,*values)
30
31
  values.each do |value|
31
32
  Flamingo.redis.srem redis_key(key), value
32
33
  end
33
34
  end
34
-
35
+
35
36
  def delete(key)
36
37
  Flamingo.redis.del redis_key(key)
37
38
  end
38
-
39
+
39
40
  def get(key)
40
41
  Flamingo.redis.smembers redis_key(key)
41
42
  end
@@ -53,22 +54,20 @@ module Flamingo
53
54
  h
54
55
  end
55
56
  end
56
-
57
+
57
58
  def each
58
59
  keys.each do |key|
59
60
  yield(key,get(key))
60
61
  end
61
62
  end
62
-
63
+
63
64
  private
64
65
  def redis_key_pattern
65
66
  "streams/#{stream_name}?*"
66
67
  end
67
-
68
+
68
69
  def redis_key(key)
69
70
  "streams/#{stream_name}?#{key}"
70
71
  end
71
-
72
72
  end
73
-
74
- end
73
+ end
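`StreamParams` stores each parameter as a Redis set under a key of the form `streams/<stream>?<param>`, which is why assigning a value replaces the old members and why duplicates collapse. The sketch below imitates that behaviour against an in-memory stand-in. Both `FakeRedis` and `SketchParams` are hypothetical classes written for this illustration; the real class talks to `Flamingo.redis`.

```ruby
require "set"

# In-memory stand-in for the few Redis set commands involved. Illustrative only.
class FakeRedis
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  def sadd(key, value)
    @sets[key].add(value)
  end

  def smembers(key)
    @sets.key?(key) ? @sets[key].to_a : []
  end

  def del(key)
    @sets.delete(key)
  end
end

# Cut-down facade: each param is a set keyed by stream name and param name.
class SketchParams
  def initialize(redis, stream_name)
    @redis = redis
    @stream_name = stream_name
  end

  # Replaces the stored set, as StreamParams#set does with delete followed by add.
  def []=(key, values)
    @redis.del(redis_key(key))
    Array(values).each { |value| @redis.sadd(redis_key(key), value) }
  end

  def [](key)
    @redis.smembers(redis_key(key))
  end

  private

  def redis_key(key)
    "streams/#{@stream_name}?#{key}"
  end
end

params = SketchParams.new(FakeRedis.new, :filter)
params[:track] = ["austin", "#etsy"]
params[:track] = ["@cheaptweet", "austin", "austin"]
puts params[:track].sort.inspect
```

Because the values live in a set, member order is not guaranteed, so callers that care about order should sort the result.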
@@ -1,9 +1,9 @@
1
1
  module Flamingo
2
-
2
+ #
3
+ # Track stream subscriptions in the Redis db.
4
+ #
3
5
  class Subscription
4
-
5
6
  class << self
6
-
7
7
  def all
8
8
  Flamingo.redis.smembers("subscriptions").map do |name|
9
9
  new(name)
@@ -15,25 +15,22 @@ module Flamingo
15
15
  Subscription.new(name)
16
16
  end
17
17
  end
18
-
19
18
  end
20
-
19
+
21
20
  attr_accessor :name
22
-
23
21
  def initialize(name)
24
- self.name = name
22
+ self.name = name
25
23
  end
26
-
24
+
27
25
  def save
28
26
  Flamingo.logger.info("Adding #{name} to subscriptions")
29
27
  Flamingo.redis.sadd("subscriptions",name)
30
28
  end
31
-
29
+
32
30
  def delete
33
31
  Flamingo.logger.info("Removing #{name} from subscriptions")
34
32
  Flamingo.redis.srem("subscriptions",name)
35
33
  end
36
-
34
+
37
35
  end
38
-
39
36
  end
@@ -1,3 +1,3 @@
  module Flamingo
-   Version = VERSION = '0.1'
+   Version = VERSION = '0.2.0'
  end
@@ -1,57 +1,149 @@
1
1
  module Flamingo
2
2
  class Wader
3
3
 
4
- attr_accessor :screen_name, :password, :stream, :connection
4
+ class WaderError < StandardError
5
+ end
6
+
7
+ class HttpStatusError < WaderError
8
+
9
+ attr_accessor :code
10
+
11
+ def initialize(message,code)
12
+ super(message)
13
+ self.code = code
14
+ end
15
+ end
16
+
17
+ # Errors from certain HTTP Statuses
18
+ class AuthenticationError < HttpStatusError; end
19
+ class UnknownStreamError < HttpStatusError; end
20
+ class InvalidParametersError < HttpStatusError; end
5
21
 
22
+ # Fatal error from too many reconnection attempts
23
+ class MaxReconnectsExceededError < WaderError; end
24
+
25
+ # Raised if the server is just not available, e.g. Twitter is down
26
+ class ServerUnavailableError < WaderError; end
27
+
28
+ attr_accessor :screen_name, :password, :stream, :connection,
29
+ :server_unavailable_max_retries,
30
+ :server_unavailable_wait,
31
+ :server_unavailable_retries
32
+
6
33
  def initialize(screen_name,password,stream)
7
34
  self.screen_name = screen_name
8
35
  self.password = password
9
36
  self.stream = stream
37
+ self.server_unavailable_max_retries = 5
38
+ self.server_unavailable_wait = 60
10
39
  end
11
-
40
+
41
+ #
42
+ # The main EventMachine run loop
43
+ #
44
+ # Start the stream listener (using twitter-stream, http://github.com/voloko/twitter-stream)
45
+ # Listen for responses and errors;
46
+ # dispatch each for later handling
47
+ #
12
48
  def run
13
- EventMachine::run do
14
- self.connection = stream.connect(:auth=>"#{screen_name}:#{password}")
15
- Flamingo.logger.info("Listening on stream: #{stream.path}")
16
-
17
- connection.each_item do |event_json|
18
- dispatch_event(event_json)
49
+ self.server_unavailable_retries = 0
50
+ begin
51
+ connect_and_run
52
+ rescue => e
53
+ # This is largely to get around a bug in Twitter-Stream that should
54
+ # be fixed in the next release. If the server is just not there on
55
+ # the first try, it blows up. Hopefully this code can be removed after
56
+ # that release.
57
+ Flamingo.logger.warn "Failure initiating connection. Most likely "+
58
+ "because server is unavailable.\n#{e}\n#{e.backtrace.join("\n\t")}"
59
+ if server_unavailable_retries < server_unavailable_max_retries
60
+ sleep(server_unavailable_wait)
61
+ self.server_unavailable_retries += 1
62
+ retry
63
+ else
64
+ raise ServerUnavailableError.new
19
65
  end
20
-
21
- connection.on_error do |message|
22
- dispatch_error(:generic,message)
23
- end
24
-
25
- connection.on_reconnect do |timeout, retries|
26
- dispatch_error(:reconnection,
27
- "Will reconnect after #{timeout}. Retry \##{retries}",
28
- {:timeout=>timeout,:retries=>retries}
29
- )
30
- end
31
-
32
- connection.on_max_reconnects do |timeout, retries|
33
- dispatch_error(:fatal,
34
- "Failed to reconnect after #{retries} retries",
35
- {:timeout=>timeout,:retries=>retries}
36
- )
37
- end
38
- end
66
+ end
67
+ raise @error if @error
39
68
  end
40
-
69
+
70
+ def retries
71
+ if connection
72
+ # This is weird but necessary because twitter-stream increments the
73
+ # reconnect_retries a bit oddly. They are incremented prior to the
74
+ # actual reconnect which means that the last reconnect_retries value
75
+ # is 1 more than the real value.
76
+ rs = connection.reconnect_retries
77
+ rs == 0 ? 0 : rs - 1
78
+ else
79
+ 0
80
+ end
81
+ end
82
+
41
83
  def stop
42
- connection.stop
84
+ if connection
85
+ connection.stop
86
+ end
43
87
  EM.stop
44
88
  end
45
89
 
46
90
  private
47
- def dispatch_event(event_json)
48
- Flamingo.logger.debug "Wader dispatched event"
49
- Resque.enqueue(Flamingo::DispatchEvent,event_json)
91
+ def connect_and_run
92
+ EventMachine::run do
93
+ self.connection = stream.connect(:auth=>"#{screen_name}:#{password}")
94
+ Flamingo.logger.info("Listening on stream: #{stream.path}")
95
+
96
+ connection.each_item do |event_json|
97
+ dispatch_event(event_json)
98
+ end
99
+
100
+ connection.on_error do |message|
101
+ handle_connection_error(message)
102
+ end
103
+
104
+ connection.on_reconnect do |timeout, retries|
105
+ Flamingo.logger.warn "Failed to connect. Will reconnect after "+
106
+ "#{timeout}. Retry \##{retries}"
107
+ end
108
+
109
+ connection.on_max_reconnects do |timeout, retries|
110
+ stop_and_raise!(MaxReconnectsExceededError.new(
111
+ "Failed to reconnect after #{retries-1} retries"
112
+ ))
113
+ end
114
+ end
115
+ end
116
+
117
+ # Decides what to do with specific connection errors. For explanations
118
+ # of various HTTP status codes from the Streaming API, see:
119
+ # http://dev.twitter.com/pages/streaming_api_response_codes
120
+ def handle_connection_error(message)
121
+ code = connection.code # HTTP status code
122
+ if [401,403].include?(code)
123
+ stop_and_raise!(AuthenticationError.new(message,code))
124
+ elsif code == 404
125
+ stop_and_raise!(UnknownStreamError.new(message,code))
126
+ elsif [406,413,416].include?(code)
127
+ stop_and_raise!(InvalidParametersError.new(message,code))
128
+ elsif code && code > 0
129
+ Flamingo.logger.warn "Received non-fatal HTTP status #{code} with "+
130
+ "message \"#{message}\". Will retry."
131
+ else
132
+ Flamingo.logger.warn "Unknown connection error: #{message}. "+
133
+ "Will retry."
134
+ end
50
135
  end
51
136
 
52
- def dispatch_error(type,message,data={})
53
- Flamingo.logger.error "Received error: #{message}"
54
- Resque.enqueue(Flamingo::DispatchError,type,message,data)
137
+ def stop_and_raise!(error)
138
+ Flamingo.logger.error "Stopping wader due to error: #{error}"
139
+ stop
140
+ @error = error
141
+ end
142
+
143
+ def dispatch_event(event_json)
144
+ Flamingo.logger.debug "Wader dispatched event"
145
+ Resque.enqueue(Flamingo::DispatchEvent, event_json)
55
146
  end
147
+
56
148
  end
57
149
  end
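`Wader#run` wraps the connection attempt in a `begin`/`rescue`/`retry` loop with a capped retry count and a fixed wait between attempts. The sketch below isolates that control flow. `with_retries` and `GaveUpError` are hypothetical names, and the block stands in for `connect_and_run`.

```ruby
# Hypothetical error raised once the retry budget is exhausted.
class GaveUpError < StandardError; end

# Runs the block, retrying up to max_retries times after a failure and
# sleeping `wait` seconds between attempts. Yields the retry count so far.
def with_retries(max_retries, wait)
  retries = 0
  begin
    yield retries
  rescue StandardError => e
    if retries >= max_retries
      raise GaveUpError, "gave up after #{retries} retries: #{e.message}"
    end
    sleep(wait)
    retries += 1
    retry
  end
end

attempts = 0
result = with_retries(5, 0) do |retry_number|
  attempts += 1
  raise IOError, "server unavailable" if retry_number < 2
  "connected on attempt #{attempts}"
end
puts result
```

The diff uses 5 retries and a 60 second wait; the zero wait here only keeps the example quick to run.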
metadata CHANGED
@@ -1,12 +1,13 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: flamingo
3
3
  version: !ruby/object:Gem::Version
4
- hash: 9
4
+ hash: 23
5
5
  prerelease: false
6
6
  segments:
7
7
  - 0
8
- - 1
9
- version: "0.1"
8
+ - 2
9
+ - 0
10
+ version: 0.2.0
10
11
  platform: ruby
11
12
  authors:
12
13
  - Hayes Davis
@@ -15,7 +16,7 @@ autorequire:
15
16
  bindir: bin
16
17
  cert_chain: []
17
18
 
18
- date: 2010-07-19 00:00:00 -05:00
19
+ date: 2010-08-01 00:00:00 -05:00
19
20
  default_executable:
20
21
  dependencies:
21
22
  - !ruby/object:Gem::Dependency
@@ -130,6 +131,22 @@ dependencies:
130
131
  version: 2.1.0
131
132
  type: :runtime
132
133
  version_requirements: *id007
134
+ - !ruby/object:Gem::Dependency
135
+ name: mockingbird
136
+ prerelease: false
137
+ requirement: &id008 !ruby/object:Gem::Requirement
138
+ none: false
139
+ requirements:
140
+ - - ">="
141
+ - !ruby/object:Gem::Version
142
+ hash: 27
143
+ segments:
144
+ - 0
145
+ - 1
146
+ - 0
147
+ version: 0.1.0
148
+ type: :development
149
+ version_requirements: *id008
133
150
  description: " Flamingo makes it easy to wade through the Twitter Streaming API by \n handling all connectivity and resource management for you. You just tell \n it what to track and consume the information in a resque queue. \n\n Flamingo isn't a traditional ruby gem. You don't require it into your code.\n Instead, it's designed to run as a daemon like redis or mysql. It provides \n a REST interface to change the parameters sent to the Twitter Streaming \n resource. All events from the streaming API are placed on a resque job \n queue where your application can process them.\n\n"
134
151
  email: hayes@appozite.com
135
152
  executables:
@@ -148,6 +165,7 @@ files:
148
165
  - lib/flamingo/daemon/dispatcher_process.rb
149
166
  - lib/flamingo/daemon/flamingod.rb
150
167
  - lib/flamingo/daemon/pid_file.rb
168
+ - lib/flamingo/daemon/trap_keeper.rb
151
169
  - lib/flamingo/daemon/wader_process.rb
152
170
  - lib/flamingo/daemon/web_server_process.rb
153
171
  - lib/flamingo/dispatch_error.rb