girl_friday 0.9.1 → 0.9.2

data/.gitignore CHANGED
@@ -2,3 +2,4 @@
2
2
  .bundle
3
3
  Gemfile.lock
4
4
  pkg/*
5
+ .rbxdb/
data/.rvmrc CHANGED
@@ -1,2 +1,3 @@
1
- rvm use rbx
1
+ rvm use jruby@gf --create
2
2
  export RBXOPT=-Xrbc.db=~/.rbxdb
3
+ export JRUBY_OPTS='--1.9 -X+O -J-Djruby.launch.inproc=false'
data/.travis.yml ADDED
@@ -0,0 +1,8 @@
1
+ rvm:
2
+ - 1.9.2
3
+ - jruby
4
+ - rbx-2.0
5
+ branches:
6
+ only:
7
+ - master
8
+ env: JRUBY_OPTS='--1.9 -X+O -J-Djruby.launch.inproc=false'
data/Gemfile CHANGED
@@ -6,3 +6,4 @@ gemspec
6
6
  # Needed for testing only!
7
7
  gem 'minitest'
8
8
  gem 'redis'
9
+ gem 'flexmock-minitest'
data/History.md CHANGED
@@ -1,14 +1,24 @@
1
1
  Changes
2
2
  ================
3
3
 
4
+ 0.9.2
5
+ ---------
6
+
7
+ * Remove use of weakrefs to track queue instances, use ObjectSpace
8
+ instead.
9
+ * Add support for Batch operations, providing an easy way to fan out
10
+ operations and then collect results when completed.
11
+ * Added WorkQueue.immediate! and WorkQueue.queue! to switch background processing off and back on respectively. Nice to use when testing. (jc00ke, ryanlecompte)
12
+ * Added some ajax updates to the girl\_friday status server. (jc00ke)
13
+
4
14
  0.9.1
5
15
  ---------
6
16
 
7
17
  * Lazy initialize the worker actors to avoid dead thread problems with Unicorn forking processes.
8
- * Add initial pass at girl_friday Rack server (see wiki). It's awful looking, trust me, help wanted.
18
+ * Add initial pass at girl\_friday Rack server (see wiki). It's awful looking, trust me, help wanted.
9
19
 
10
20
 
11
21
  0.9.0
12
22
  ---------
13
23
 
14
- * Initial release
24
+ * Initial release
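The 0.9.2 entry above mentions tracking queue instances through ObjectSpace rather than weakrefs. This diff does not show how girl_friday does that, so the following is only a minimal sketch of the general technique, with a hypothetical `ToyQueue` class standing in for the real queue class. Note that some runtimes (JRuby, for example) may need ObjectSpace enabled for this to work.

```ruby
# Hypothetical stand-in for a work queue class; not girl_friday's own code.
class ToyQueue
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Keep strong references so the instances stay alive for the lookup.
queues = [ToyQueue.new(:email), ToyQueue.new(:images)]

# ObjectSpace.each_object walks the live objects of the given class,
# so no separate registry of weak references has to be maintained.
live = ObjectSpace.each_object(ToyQueue).map(&:name)
puts live.sort.inspect
```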
data/README.md CHANGED
@@ -1,29 +1,29 @@
1
- girl_friday
1
+ girl\_friday
2
2
  ====================
3
3
 
4
- Have a task you want to get done sometime soon but don't want to do it yourself? Give it to girl_friday! From wikipedia:
4
+ Have a task you want to get done sometime soon but don't want to do it yourself? Give it to girl\_friday! From wikipedia:
5
5
 
6
6
  > The term Man Friday has become an idiom, still in mainstream usage, to describe an especially faithful servant or
7
7
  > one's best servant or right-hand man. The female equivalent is Girl Friday. The title of the movie His Girl Friday
8
8
  > alludes to it and may have popularized it.
9
9
 
10
- girl_friday is a Ruby library for performing asynchronous tasks. Often times you don't want to block a web response by performing some task, like sending an email, so you can just use this gem to perform it in the background. It works with any Ruby application, including Rails 3 applications.
10
+ girl\_friday is a Ruby library for performing asynchronous tasks. Often times you don't want to block a web response by performing some task, like sending an email, so you can just use this gem to perform it in the background. It works with any Ruby application, including Rails 3 applications.
11
11
 
12
12
 
13
13
  Installation
14
14
  ------------------
15
15
 
16
- We recommend using [JRuby 1.6+](http://jruby.org) or [Rubinius 2.0+](http://rubini.us) with girl_friday. Both are excellent options for executing Ruby these days.
16
+ We recommend using [JRuby 1.6+](http://jruby.org) or [Rubinius 2.0+](http://rubini.us) with girl\_friday. Both are excellent options for executing Ruby these days.
17
17
 
18
18
  gem install girl_friday
19
19
 
20
- girl_friday does not support Ruby 1.8 (MRI) because of its poor threading support. Ruby 1.9 will work reasonably well if you use gems that release the GIL for network I/O (mysql2 is a good example of this, do **not** use the original mysql gem).
20
+ girl\_friday does not support Ruby 1.8 (MRI) because of its poor threading support. Ruby 1.9 will work reasonably well if you use gems that release the GIL for network I/O (mysql2 is a good example of this, do **not** use the original mysql gem).
21
21
 
22
22
 
23
23
  Usage
24
24
  --------------------
25
25
 
26
- Put girl_friday in your Gemfile:
26
+ Put girl\_friday in your Gemfile:
27
27
 
28
28
  gem 'girl_friday'
29
29
 
@@ -46,17 +46,19 @@ The msg parameter to push is just a Hash whose contents are completely up to you
46
46
 
47
47
  Your message processing block should **not** access any instance data or variables outside of the block. That's shared mutable state and dangerous to touch! I also strongly recommend your queue processor block be **VERY** short, ideally just a method call or two. You can unit test those methods easily but not the processor block itself.
48
48
 
49
+ You can call `GirlFriday::WorkQueue.immediate!` to process jobs immediately, which is helpful when testing. `GirlFriday::WorkQueue.queue!` will revert this & jobs will be processed by actors.
50
+
49
51
 
50
52
  More Detail
51
53
  --------------------
52
54
 
53
- Please see the [girl_friday wiki](https://github.com/mperham/girl_friday/wiki) for more detail and advanced options and tuning. You'll find details on queue persistence with Redis, implementing clean shutdown, querying runtime metrics and SO MUCH MORE!
55
+ Please see the [girl\_friday wiki](https://github.com/mperham/girl_friday/wiki) for more detail and advanced options and tuning. You'll find details on queue persistence with Redis, implementing clean shutdown, querying runtime metrics and SO MUCH MORE!
54
56
 
55
57
 
56
58
  Thanks
57
59
  --------------------
58
60
 
59
- [Carbon Five](http://carbonfive.com), I write and maintain girl_friday on their clock.
61
+ [Carbon Five](http://carbonfive.com), I write and maintain girl\_friday on their clock.
60
62
 
61
63
  This gem contains a copy of the Rubinius Actor API, modified to work on any Ruby VM. Thanks to Evan Phoenix, MenTaLguY and the Rubinius project for permission to use and distribute this code.
62
64
 
@@ -64,4 +66,4 @@ This gem contains a copy of the Rubinius Actor API, modified to work on any Ruby
64
66
  Author
65
67
  --------------------
66
68
 
67
- Mike Perham, [@mperham](https://twitter.com/mperham), [mikeperham.com](http://mikeperham.com)
69
+ Mike Perham, [@mperham](https://twitter.com/mperham), [mikeperham.com](http://mikeperham.com)
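The README change above adds `GirlFriday::WorkQueue.immediate!` and `GirlFriday::WorkQueue.queue!`. As a rough illustration of what such a switch does, here is a self-contained sketch using a hypothetical `ToyWorkQueue`. The method names mirror the README, but the implementation is an assumption and is not taken from the gem.

```ruby
# Hypothetical toy queue illustrating an immediate!/queue! switch.
# Plain threads stand in for girl_friday's actors.
class ToyWorkQueue
  @immediate = false

  class << self
    def immediate!
      @immediate = true
    end

    def queue!
      @immediate = false
    end

    def immediate?
      @immediate
    end
  end

  def initialize(&processor)
    @processor = processor
    @threads = []
  end

  # Runs the job inline in immediate mode, otherwise on a background thread.
  def push(msg)
    if self.class.immediate?
      @processor.call(msg)
    else
      @threads << Thread.new { @processor.call(msg) }
    end
  end

  # Waits for background jobs; only needed in queued mode.
  def drain
    @threads.each(&:join)
    @threads.clear
  end
end

seen = []
queue = ToyWorkQueue.new { |msg| seen << msg[:id] }

ToyWorkQueue.immediate!
queue.push(:id => 1)
immediate_seen = seen.dup # the job has already run, no waiting required

ToyWorkQueue.queue!
queue.push(:id => 2)
queue.drain
```

In a test suite the immediate mode is the useful part: assertions can follow `push` directly, without sleeping or polling for the background work to finish.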
data/TODO.md CHANGED
@@ -3,4 +3,3 @@ TODO
3
3
 
4
4
  - rufus-scheduler integration for scheduled tasks
5
5
  - web admin UI to surface status() metrics
6
- - nicer project homepage
data/examples/batch.rb ADDED
@@ -0,0 +1,42 @@
1
+ require 'girl_friday'
2
+ require 'open-uri'
3
+ require 'benchmark'
4
+ require 'nokogiri'
5
+
6
+ class UrlProcessor
7
+ URLS = %w(http://www.bing.com http://www.google.com http://www.yahoo.com)
8
+
9
+ def parallel
10
+ batch = GirlFriday::Batch.new(URLS, :size => 3) do |url|
11
+ html = open(url)
12
+ doc = Nokogiri::HTML(html.read)
13
+ doc.css('span').count
14
+ end
15
+ p URLS.zip(batch.results)
16
+ end
17
+
18
+ def serial
19
+ results = URLS.map do |url|
20
+ html = open(url)
21
+ doc = Nokogiri::HTML(html.read)
22
+ doc.css('span').count
23
+ end
24
+ p URLS.zip(results)
25
+ end
26
+ end
27
+
28
+ # Expected output:
29
+ # [["http://www.bing.com", 24], ["http://www.google.com", 8], ["http://www.yahoo.com", 172]]
30
+ #
31
+ # Benchmark results:
32
+ # serial 1.231000 0.000000 1.231000 ( 1.231000)
33
+ # parallel 0.447000 0.000000 0.447000 ( 0.447000)
34
+
35
+ processor = UrlProcessor.new
36
+ Benchmark.bm(25) do |x|
37
+ %w(serial parallel).each do |op|
38
+ x.report(op) do
39
+ processor.send(op.to_sym)
40
+ end
41
+ end
42
+ end
@@ -0,0 +1,89 @@
1
+ require 'open-uri'
2
+ require 'nokogiri'
3
+ require 'girl_friday'
4
+
5
+
6
+ ##
7
+ # In this example, we use girl_friday to implement a processing pipeline
8
+ # for scraping large images from a website. Given a URL, we want to fetch
9
+ # the HTML for that URL, find all the images, download those images, discard images
10
+ # which do not meet a size heuristic and save the ones that match. This
11
+ # processing is I/O-heavy and perfect for breaking into many threads.
12
+ #
13
+ # A processing pipeline is just a series of linked processing steps.
14
+ # We create a girl_friday queue for each step, sized appropriately for how few/many parallel worker threads we want for that step.
15
+ # The process_xxx methods implement the actual logic for the step.
16
+ # The finish_xxx methods pass the result to the next step in the pipeline.
17
+ #
18
+ class ImagePipeline
19
+ def initialize
20
+ @download_html = GirlFriday::Queue.new(:download_html, :size => 5, &method(:process_html))
21
+ @extract_imgs = GirlFriday::Queue.new(:extract, :size => 2, &method(:process_extract))
22
+ @download_imgs = GirlFriday::Queue.new(:download_imgs, :size => 10, &method(:process_imgs))
23
+ @thumb = GirlFriday::Queue.new(:thumb_imgs, :size => 5, &method(:process_thumb))
24
+ end
25
+
26
+ def process(url)
27
+ log "Pushing #{url}"
28
+ @download_html.push({ :url => url }, &method(:finish_html))
29
+ end
30
+
31
+ private
32
+
33
+ def process_html(msg)
34
+ msg.merge(:htmlfile => open(msg[:url]))
35
+ end
36
+
37
+ def finish_html(result)
38
+ @extract_imgs.push(result, &method(:finish_extract))
39
+ end
40
+
41
+ def process_extract(msg)
42
+ doc = Nokogiri::HTML(msg[:htmlfile].read)
43
+ doc.css('img[src]').map{|n| n['src']}.select { |url| url =~ /^#{msg[:url]}/ }
44
+ end
45
+
46
+ def finish_extract(result)
47
+ result.each do |imgurl|
48
+ @download_imgs.push(imgurl, &method(:finish_imgs))
49
+ end
50
+ end
51
+
52
+ def process_imgs(msg)
53
+ log "Fetching image: #{msg}"
54
+ imgfile = open msg
55
+ return if imgfile.size < 20_000 # ignore images less than 20k
56
+ result = `identify #{imgfile.path}`
57
+ log "Image: #{result}"
58
+ return unless result =~ /(\d+)x(\d+)\+0\+0/
59
+ return if Integer($1) + Integer($2) < 500
60
+ # Passed all our heuristics, pass it on!
61
+ imgfile
62
+ end
63
+
64
+ def finish_imgs(result)
65
+ return unless result
66
+ @thumb.push(result, &method(:finish_thumb))
67
+ end
68
+
69
+ def process_thumb(msg)
70
+ FileUtils.cp msg.path, Time.now.to_f.to_s
71
+ msg.path
72
+ end
73
+
74
+ def finish_thumb(result)
75
+ log "Finished image at #{result}"
76
+ end
77
+
78
+ def log(msg)
79
+ print "#{Thread.current}: #{msg}\n"
80
+ end
81
+ end
82
+
83
+
84
+ pipeline = ImagePipeline.new
85
+ pipeline.process 'http://blog.carbonfive.com'
86
+
87
+ loop do
88
+ sleep 1
89
+ end
data/girl_friday.gemspec CHANGED
@@ -17,4 +17,5 @@ Gem::Specification.new do |s|
17
17
  s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
18
18
  s.require_paths = ["lib"]
19
19
  s.add_development_dependency 'sinatra', '~> 1.0'
20
+ s.add_development_dependency 'rake'
20
21
  end
@@ -53,7 +53,7 @@ class Actor
53
53
  @@registered_lock = Queue.new
54
54
  @@registered = {}
55
55
  @@registered_lock << nil
56
-
56
+
57
57
  def current
58
58
  Thread.current[:__current_actor__] ||= private_new
59
59
  end
@@ -113,7 +113,7 @@ class Actor
113
113
  recipient.notify_exited(current, reason)
114
114
  self
115
115
  end
116
-
116
+
117
117
  # Link the current Actor to another one.
118
118
  def link(actor)
119
119
  current = self.current
@@ -121,7 +121,7 @@ class Actor
121
121
  actor.notify_link current
122
122
  self
123
123
  end
124
-
124
+
125
125
  # Unlink the current Actor from another one
126
126
  def unlink(actor)
127
127
  current = self.current
@@ -259,12 +259,19 @@ class Actor
259
259
  begin
260
260
  raise @interrupts.shift unless @interrupts.empty?
261
261
 
262
- for i in 0...(@mailbox.size)
263
- message = @mailbox[i]
262
+ if @mailbox.size > 0
263
+ message = @mailbox.shift
264
264
  action = filter.action_for(message)
265
- if action
266
- @mailbox.delete_at(i)
267
- break
265
+ unless action
266
+ @mailbox << message
267
+ for i in 1...(@mailbox.size)
268
+ message = @mailbox[i]
269
+ action = filter.action_for(message)
270
+ if action
271
+ @mailbox.delete_at(i)
272
+ break
273
+ end
274
+ end
268
275
  end
269
276
  end
270
277
 
@@ -303,7 +310,7 @@ class Actor
303
310
  action.call message
304
311
  end
305
312
  end
306
-
313
+
307
314
  # Notify this actor that it's now linked to the given one; this is not
308
315
  # intended to be used directly except by actor implementations. Most
309
316
  # users will want to use Actor.link instead.
@@ -322,7 +329,7 @@ class Actor
322
329
  actor.notify_exited(self, exit_reason) unless alive
323
330
  self
324
331
  end
325
-
332
+
326
333
  # Notify this actor that it's now unlinked from the given one; this is
327
334
  # not intended to be used directly except by actor implementations. Most
328
335
  # users will want to use Actor.unlink instead.
@@ -337,7 +344,7 @@ class Actor
337
344
  end
338
345
  self
339
346
  end
340
-
347
+
341
348
  # Notify this actor that one of the Actors it's linked to has exited;
342
349
  # this is not intended to be used directly except by actor implementations.
343
350
  # Most users will want to use Actor.send_exit instead.
@@ -399,7 +406,7 @@ class Actor
399
406
  end
400
407
  end
401
408
  private :check_thread
402
-
409
+
403
410
  def _trap_exit=(value) #:nodoc:
404
411
  check_thread
405
412
  @lock.pop
@@ -410,7 +417,7 @@ class Actor
410
417
  @lock << nil
411
418
  end
412
419
  end
413
-
420
+
414
421
  def _trap_exit #:nodoc:
415
422
  check_thread
416
423
  @lock.pop
@@ -0,0 +1,47 @@
1
+ module GirlFriday
2
+
3
+ ##
4
+ # Batch represents a set of operations which can be processed
5
+ # concurrently. Asking for the results of the batch acts as a barrier:
6
+ # the calling thread will block until all operations have completed.
7
+ # Results are guaranteed to be returned in the
8
+ # same order as the operations are given.
9
+ #
10
+ # Internally a girl_friday queue is created which limits the
11
+ # number of concurrent operations based on the :size option.
12
+ #
13
+ # TODO Errors are not handled well at all.
14
+ class Batch
15
+ def initialize(enumerable, options, &block)
16
+ @queue = GirlFriday::Queue.new(:batch, options, &block)
17
+ @complete = 0
18
+ @size = enumerable.count
19
+ @results = Array.new(@size)
20
+ @lock = Mutex.new
21
+ @condition = ConditionVariable.new
22
+ start(enumerable)
23
+ end
24
+
25
+ def results(timeout=nil)
26
+ @lock.synchronize do
27
+ @condition.wait(@lock, timeout) if @complete != @size
28
+ @results
29
+ end
30
+ end
31
+
32
+ private
33
+
34
+ def start(operations)
35
+ operations.each_with_index do |packet, index|
36
+ @queue.push(packet) do |result|
37
+ @lock.synchronize do
38
+ @complete += 1
39
+ @results[index] = result
40
+ @condition.signal if @complete == @size
41
+ end
42
+ end
43
+ end
44
+ end
45
+
46
+ end
47
+ end
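The Batch comments above describe a barrier: the caller blocks until every operation has completed, and results come back in input order. The same pattern can be sketched without girl_friday. Here plain threads stand in for the queue, `ordered_batch` is a hypothetical helper, and the wait is written as a loop rather than the single `if` used in the class above.

```ruby
# Standalone sketch of the barrier pattern; not the gem's implementation.
def ordered_batch(items)
  results  = Array.new(items.size)
  complete = 0
  lock     = Mutex.new
  done     = ConditionVariable.new

  items.each_with_index do |item, index|
    Thread.new do
      value = yield(item)
      lock.synchronize do
        results[index] = value # storing by index preserves input order
        complete += 1
        done.signal if complete == items.size
      end
    end
  end

  lock.synchronize do
    # Re-check the condition after every wakeup before returning.
    done.wait(lock) until complete == items.size
  end
  results
end

squares = ordered_batch([3, 1, 2]) do |n|
  sleep(n * 0.01) # make the operations finish out of order on purpose
  n * n
end
```

Like the class above, this sketch does not handle errors: an exception raised inside a worker thread would leave the caller waiting.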
@@ -15,7 +15,7 @@ module GirlFriday
15
15
  class ErrorHandler
16
16
  class Hoptoad
17
17
  def handle(ex)
18
- HoptoadNotifier.notify(ex)
18
+ HoptoadNotifier.notify_or_ignore(ex)
19
19
  end
20
20
  end
21
21
  end
@@ -10,16 +10,16 @@ module GirlFriday
10
10
  @backlog << work
11
11
  end
12
12
  alias_method :<<, :push
13
-
13
+
14
14
  def pop
15
- @backlog.pop
15
+ @backlog.shift
16
16
  end
17
-
17
+
18
18
  def size
19
19
  @backlog.size
20
20
  end
21
21
  end
22
-
22
+
23
23
  class Redis
24
24
  def initialize(name, options)
25
25
  @opts = options
@@ -53,4 +53,3 @@ module GirlFriday
53
53
  end
54
54
  end
55
55
  end
56
-
@@ -16,24 +16,23 @@ module GirlFriday
16
16
  set :views, "#{basedir}/views"
17
17
  set :public, "#{basedir}/public"
18
18
  set :static, true
19
-
19
+
20
20
  helpers do
21
21
  include Rack::Utils
22
22
  alias_method :h, :escape_html
23
23
 
24
- def dashboard(stats)
25
- if stats[:busy] == stats[:pool_size] && stats[:backlog] < stats[:pool_size]
26
- ['#ffc', 'Busy']
27
- elsif stats[:busy] == stats[:pool_size] && stats[:backlog] >= stats[:pool_size]
28
- ['#fcc', 'Busy and Backlogged']
29
- else
30
- ['white', 'OK']
31
- end
24
+ def url_path(*path_parts)
25
+ [path_prefix, path_parts].join('/').squeeze('/')
26
+ end
27
+ alias_method :u, :url_path
28
+
29
+ def path_prefix
30
+ request.env['SCRIPT_NAME']
32
31
  end
33
32
  end
34
-
35
- get '/' do
36
- redirect "#{request.env['REQUEST_URI']}/status"
33
+
34
+ get '/?' do
35
+ redirect url_path('status')
37
36
  end
38
37
 
39
38
  get '/status' do
@@ -46,4 +45,4 @@ module GirlFriday
46
45
  GirlFriday.status.to_json
47
46
  end
48
47
  end
49
- end
48
+ end