kthxbye 1.0.0

Files changed (41)
  1. data/.document +5 -0
  2. data/.gitignore +33 -0
  3. data/DESIGN.textile +81 -0
  4. data/Gemfile +21 -0
  5. data/Gemfile.lock +42 -0
  6. data/LICENSE +20 -0
  7. data/README.textile +91 -0
  8. data/Rakefile +53 -0
  9. data/VERSION +1 -0
  10. data/config.ru +7 -0
  11. data/lib/kthxbye.rb +151 -0
  12. data/lib/kthxbye/config.rb +35 -0
  13. data/lib/kthxbye/exceptions.rb +4 -0
  14. data/lib/kthxbye/failure.rb +62 -0
  15. data/lib/kthxbye/helper.rb +42 -0
  16. data/lib/kthxbye/job.rb +127 -0
  17. data/lib/kthxbye/version.rb +5 -0
  18. data/lib/kthxbye/web_interface.rb +117 -0
  19. data/lib/kthxbye/web_interface/public/application.js +16 -0
  20. data/lib/kthxbye/web_interface/public/awesome-buttons.css +108 -0
  21. data/lib/kthxbye/web_interface/public/jquery.js +154 -0
  22. data/lib/kthxbye/web_interface/public/style.css +128 -0
  23. data/lib/kthxbye/web_interface/views/error.haml +5 -0
  24. data/lib/kthxbye/web_interface/views/failed.haml +26 -0
  25. data/lib/kthxbye/web_interface/views/hash.haml +6 -0
  26. data/lib/kthxbye/web_interface/views/layout.haml +33 -0
  27. data/lib/kthxbye/web_interface/views/overview.haml +2 -0
  28. data/lib/kthxbye/web_interface/views/queues.haml +31 -0
  29. data/lib/kthxbye/web_interface/views/set.haml +4 -0
  30. data/lib/kthxbye/web_interface/views/stats.haml +32 -0
  31. data/lib/kthxbye/web_interface/views/view_backtrace.haml +8 -0
  32. data/lib/kthxbye/web_interface/views/workers.haml +24 -0
  33. data/lib/kthxbye/web_interface/views/working.haml +19 -0
  34. data/lib/kthxbye/worker.rb +221 -0
  35. data/test/helper.rb +18 -0
  36. data/test/redis-test.conf +115 -0
  37. data/test/test_failure.rb +51 -0
  38. data/test/test_helper.rb +86 -0
  39. data/test/test_kthxbye.rb +213 -0
  40. data/test/test_worker.rb +148 -0
  41. metadata +364 -0
@@ -0,0 +1,5 @@
1
+ README.rdoc
2
+ lib/**/*.rb
3
+ bin/*
4
+ features/**/*.feature
5
+ LICENSE
@@ -0,0 +1,33 @@
1
+ # rcov generated
2
+ coverage
3
+
4
+ # rdoc generated
5
+ rdoc
6
+
7
+ # yard generated
8
+ doc
9
+ .yardoc
10
+
11
+ # bundler
12
+ .bundle
13
+
14
+ # jeweler generated
15
+ pkg
16
+ ## MAC OS
17
+ .DS_Store
18
+
19
+ ## TEXTMATE
20
+ *.tmproj
21
+ tmtags
22
+
23
+ ## EMACS
24
+ *~
25
+ \#*
26
+ .\#*
27
+
28
+ ## VIM
29
+ *.swp
30
+ *.swo
31
+
32
+ ## PROJECT::SPECIFIC
33
+ *.gemspec
@@ -0,0 +1,81 @@
1
+ h1. Queues
2
+
3
+ There are a number of important queues that we employ to track data:
4
+
5
+ # queues - responsible for pushing/popping jobs. Only stores a unique id. Is
6
+ pushed and popped, therefore a fairly "volatile" dataset.
7
+ # data-store:[name] - hash responsible for the different types of job data.
8
+ Stored by job id. This data is persisted until a job is successful or failed
9
+ too many times. Then the job will be dropped and failure recorded.
10
+ # result-store:[name] - hash responsible for storing the results by job id.
11
+ Data will remain here until removed. Can be fetched multiple times if desired
12
+ providing a cached retrieval.
13
+ # failure-store:[name] - hash responsible for storing any failed jobs. Stored
14
+ until manually cleared.
15
+
16
+ In addition there is a unique id counter that gets incremented as the job queue
17
+ grows. It is stored at "unique_id"
18
+
19
+ On top of these queues, there is also a set of stores set up to keep track of
20
+ workers and processes actively being worked.
21
+
22
+ # workers - set that tracks all the registered workers. All workers not in this list
23
+ should be destroyed.
24
+ # working - set of workers that are actively working a job.
25
+
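The key layout described above can be sketched with plain Ruby hashes standing in for Redis. This is only an illustration under stated assumptions: the class name @SketchStore@ and its methods are hypothetical, the list key follows the @queue:[name]@ form used in lib/kthxbye.rb, and nothing here talks to a real Redis server.

```ruby
# In-memory stand-in for the Redis keys named in the design notes.
# SketchStore is a hypothetical name, not part of the gem.
class SketchStore
  attr_reader :lists, :hashes, :counters

  def initialize
    @lists    = Hash.new { |h, k| h[k] = [] }  # "queue:[name]" => job ids
    @hashes   = Hash.new { |h, k| h[k] = {} }  # "data-store:[name]", "result-store:[name]"
    @counters = Hash.new(0)                    # "unique_id" counter
  end

  # Store the payload by id, then push only the id onto the queue.
  def enqueue(queue, payload)
    id = (@counters["unique_id"] += 1)
    @hashes["data-store:#{queue}"][id] = payload
    @lists["queue:#{queue}"] << id
    id
  end

  # Pop the oldest id and look up its payload; nil when the queue is empty.
  def pop(queue)
    id = @lists["queue:#{queue}"].shift
    id && [id, @hashes["data-store:#{queue}"][id]]
  end

  # On success the job data is dropped and the result kept until removed.
  def complete(queue, id, result)
    @hashes["data-store:#{queue}"].delete(id)
    @hashes["result-store:#{queue}"][id] = result
  end
end
```

The queue list holds nothing but ids, which is why it can be treated as volatile while the hashes carry the data that must survive.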
26
+ h1. Jobs
27
+
28
+ I decided to give the job class a twofold purpose. One, it would be the job
29
+ storage mechanism through which a job would be placed on the queue for later
30
+ execution by workers and two, it would actually do the job execution, even
31
+ though the worker is the one that calls it. This way all job related tasks
32
+ can be abstracted to the job class, while the worker can busy itself about
33
+ handling the control of the job execution and not the job execution itself.
34
+
35
+ Jobs will also be responsible for storing the state of the result and any
36
+ failures that come from the job execution.
37
+
38
+ A job can only live on one queue.
39
+
40
+ A job will be tried 3 times before being taken out of the work queue and placed in the
41
+ failure store.
42
+
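The three-attempt rule could look roughly like the following. It is a sketch, not the code in lib/kthxbye/job.rb: the method name @run_with_retries@, the @failures@ array standing in for the failure store, and the shape of the failure record are all assumptions made for illustration.

```ruby
# Run a callable up to max_attempts times. On final failure, record the
# error in `failures` (a stand-in for the failure store) and return nil.
def run_with_retries(job, failures, max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    job.call
  rescue StandardError => e
    retry if attempts < max_attempts   # re-runs the begin block
    failures << { :error => e.message, :attempts => attempts }
    nil
  end
end
```

A job that succeeds on a later attempt returns its value as usual; only a job that fails every time ends up in the failure record.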
43
+ h1. Workers
44
+
45
+ There was a big struggle in the beginning to decide how to work queued jobs.
46
+ I originally wanted to do a more cooperative scheduling schema but found that
47
+ would not end up taking multiple cores into consideration. Moreover, it is
48
+ wicked hard to preempt fibers in a meaningful way without digging really deep
49
+ into other code as well in order to retrofit them for Fibers or EventMachine.
50
+ So I simply didn't.
51
+
52
+ Instead I went with Ruby's Process library and decided to work at making it
53
+ work as much as possible around a Unix-style processing queue so that in one
54
+ fell swoop, we could use multiple cores as well as gain true concurrency.
55
+ One more design decision is that we will not attempt to run multiple processes-
56
+ per-worker. This way we can control the number of processes running by how many
57
+ workers we choose to run, rather than by how many processes a worker is allowed
58
+ to spawn.
59
+
60
+ This presents a slight problem however. Each worker then will involve two
61
+ processes. One parent/control-loop and one child/job-processor. This means in
62
+ the end, we have more processes running at once, possibly chewing up more
63
+ resources than might be ultimately necessary. This will hopefully be overcome
64
+ by suspending the parent worker while the child process runs. This way one core
65
+ is not chewed up by an essentially idle process. Hopefully this can be benchmarked
66
+ to figure out if it's faster to run one-to-many parent/children or one-to-one.
67
+
68
+ A worker can work multiple queues
69
+
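The one-parent, one-child arrangement can be sketched with Ruby's @fork@ and @Process.wait@. This is a minimal illustration assuming a Unix-like platform where @fork@ is available; the method name @work_one@ and the pipe used to return the result are assumptions, and the real worker in lib/kthxbye/worker.rb may be organised differently.

```ruby
# Parent/control-loop side of a single job: fork a child to do the work,
# then block until it is done, so the parent is idle while the child runs.
def work_one(job)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(job.call.to_s)  # child runs the job and reports the result
    writer.close
    exit!(0)                     # skip at_exit handlers inherited from the parent
  end
  writer.close
  result = reader.read           # blocks until the child closes its end of the pipe
  reader.close
  Process.wait(pid)              # reap the child; sets $?
  [result, $?.exitstatus]
end
```

Because the parent is blocked in @read@ and @Process.wait@ rather than spinning, it should not occupy a core while the child is busy, which is the behaviour the paragraph above hopes for.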
70
+ h1. Web Interface
71
+
72
+ Intended to have a web frontend to control/observe workers, jobs and queues.
73
+
74
+ h1. Job Observer Widget
75
+
76
+ Intended to have a JS widget that will indicate when a job has been computed
77
+ and the results are ready or a job has failed. Potential application for a
78
+ Node.js implementation (using something like
79
+ http://github.com/fictorial/redis-node-client). Should be easily embeddable in
80
+ any page for simple notification.
81
+
data/Gemfile ADDED
@@ -0,0 +1,21 @@
1
+ source "http://rubygems.org"
2
+ # Add dependencies required to use your gem here.
3
+ # Example:
4
+ # gem "activesupport", ">= 2.3.5"
5
+
6
+ # Add dependencies to develop your gem here.
7
+ # Include everything needed to run rake, tests, features, etc.
8
+ group :development do
9
+ gem "shoulda", ">= 0"
10
+ gem "bundler", "~> 1.0.0"
11
+ gem "jeweler", "~> 1.5.0.pre3"
12
+ gem "rcov", ">= 0"
13
+
14
+ end
15
+
16
+ gem "redis"
17
+ gem "yajl-ruby"
18
+ gem "json"
19
+ gem "mail"
20
+ gem 'i18n'
21
+ gem 'sinatra'
@@ -0,0 +1,42 @@
1
+ GEM
2
+ remote: http://rubygems.org/
3
+ specs:
4
+ activesupport (3.0.0)
5
+ git (1.2.5)
6
+ i18n (0.4.1)
7
+ jeweler (1.5.0.pre3)
8
+ bundler (~> 1.0.0)
9
+ git (>= 1.2.5)
10
+ rake
11
+ json (1.4.6)
12
+ mail (2.2.6.1)
13
+ activesupport (>= 2.3.6)
14
+ mime-types
15
+ treetop (>= 1.4.5)
16
+ mime-types (1.16)
17
+ polyglot (0.3.1)
18
+ rack (1.2.1)
19
+ rake (0.8.7)
20
+ rcov (0.9.9)
21
+ redis (2.0.10)
22
+ shoulda (2.11.3)
23
+ sinatra (1.0)
24
+ rack (>= 1.0)
25
+ treetop (1.4.8)
26
+ polyglot (>= 0.3.1)
27
+ yajl-ruby (0.7.7)
28
+
29
+ PLATFORMS
30
+ ruby
31
+
32
+ DEPENDENCIES
33
+ bundler (~> 1.0.0)
34
+ i18n
35
+ jeweler (~> 1.5.0.pre3)
36
+ json
37
+ mail
38
+ rcov
39
+ redis
40
+ shoulda
41
+ sinatra
42
+ yajl-ruby
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009 Luke van der Hoeven
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,91 @@
1
+ h1. kthxbye
2
+
3
+ Kthxbye is the answer to a fairly unique-yet-common problem: Background job
4
+ processing when we care about the result.
5
+
6
+ There are a number of projects I can think of where a job takes longer than
7
+ a user ought to wait for a result (due to server timeout length or out of
8
+ simple app responsiveness courtesy to the user) and yet I need the result of the
9
+ operation to be returned to the user.
10
+
11
+ Here's a real-world example. I work with a set of legacy Oracle databases that
12
+ store much of our business logic as PL/SQL procedures. Yes, this is not "The
13
+ Rails Way™" but it's the only way for the company I work for right now.
14
+ Many of the procedures that I run as part of several of the applications I
15
+ support can take on average one minute or more with a standard deviation of
16
+ almost 2 minutes (with a forced timeout of 5 minutes). That's kinda a long time
17
+ to sit and wait on a web app.
18
+
19
+ <img src="http://img.skitch.com/20100901-gadna641fj4wdeswgj74y2pssq.png" alt="RLA - IWP Analysis"/>
20
+
21
+ We don't really want users sitting waiting for up to 5 minutes (when it forces
22
+ failure) unable to do anything or (even worse) hitting refresh or the action
23
+ again. Especially bad when this can mean the HTTP server is getting backed up
24
+ as more and more people run these long running processes.
25
+
26
+ Moreover, the users need to get a response from the completed job before moving
27
+ on. Most job processors (DJ, Resque) are set up for running jobs that do not
28
+ require the result to be returned to the web app (think mass-mailers, queue
29
+ population, image resizing). They just run and the output goes to a database,
30
+ an inbox or a file server.
31
+
32
+ h2. Enter Kthxbye
33
+
34
+ Kthxbye is an attempt to solve this problem. It is based heavily off of
35
+ "Resque":http://github.com/defunkt/resque. So why not just an addition to Resque?
36
+ I needed some hack time with Redis on my own as I've never used it before...
37
+
38
+ bq. I can learn any language or tool in a matter of days if you give me
39
+ 1. a good manual
40
+ 2. an even better project to work on.
41
+
42
+ _- Prof. Shumacher_
43
+
44
+ This project accomplishes both those goals. This is an attempt to learn
45
+ something, using Resque and Redis docs as a manual, while at the same time
46
+ creating a much needed solution to a problem.
47
+
48
+ The idea is to be able to do the following:
49
+
50
+ # dummy job class
51
+ class MyJob
52
+ def self.perform(data)
53
+ puts "Do something with #{data}"
54
+ data.gsub(/hello/i, "Goodbye")
55
+ end
56
+ end
57
+
58
+ # setup options, then connect
59
+ Kthxbye::Config.setup(:redis_server => 'localhost', :redis_port => 8080)
60
+
61
+ # each enqueued job returns a unique id to poll with
62
+ unique_id = Kthxbye.enqueue("jobs", MyJob, "Hello World")
63
+
64
+ # ... code code code ...
65
+
66
+ # polls queue every 5 seconds
67
+ computed_value = Kthxbye.poll("jobs", unique_id, 5)
68
+
69
+ and then in some other world, on some other machine, (that still has knowledge of MyJob)
70
+
71
+ # inits with queue
72
+ worker = Kthxbye::Worker.new("jobs")
73
+
74
+ # connects to queue and runs jobs found there
75
+ worker.run
76
+
77
+ Pretty... damn... simple.
78
+
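What polling for a result might involve can be sketched without Redis. The helper below is hypothetical: @poll_for@, its @max_tries@ limit, and the plain Hash standing in for the result store are assumptions for illustration, not the signature or behaviour of @Kthxbye.poll@.

```ruby
# Check a result store for `id` every `interval` seconds, giving up after
# `max_tries` checks. Returns the stored result, or nil if none appeared.
def poll_for(results, id, interval = 5, max_tries = 10)
  max_tries.times do
    return results[id] if results.key?(id)
    sleep interval
  end
  nil
end
```

Capping the number of checks is a design choice of this sketch, so that a job which never finishes cannot block the caller forever.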
79
+ h2. Note on Patches/Pull Requests
80
+
81
+ * Fork the project.
82
+ * Make your feature addition or bug fix.
83
+ * Add tests for it. This is important so I don't break it in a
84
+ future version unintentionally.
85
+ * Commit, do not mess with rakefile, version, or history.
86
+ (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
87
+ * Send me a pull request. Bonus points for topic branches.
88
+
89
+ h2. Copyright
90
+
91
+ Copyright (c) 2010 Luke van der Hoeven. See LICENSE for details.
@@ -0,0 +1,53 @@
1
+ require 'rubygems'
2
+ require 'bundler'
3
+ begin
4
+ Bundler.setup(:default, :development)
5
+ rescue Bundler::BundlerError => e
6
+ $stderr.puts e.message
7
+ $stderr.puts "Run `bundle install` to install missing gems"
8
+ exit e.status_code
9
+ end
10
+ require 'rake'
11
+
12
+ require 'jeweler'
13
+ Jeweler::Tasks.new do |gem|
14
+ gem.name = "kthxbye"
15
+ gem.summary = %Q{Async processing + results notification}
16
+ gem.description = %Q{Kthxbye is the answer to a fairly unique-yet-common problem: Background job processing when we care about the result.}
17
+ gem.email = "hungerandthirst@gmail.com"
18
+ gem.homepage = "http://github.com/plukevdh/kthxbye"
19
+ gem.authors = ["Luke van der Hoeven"]
20
+ gem.add_development_dependency "shoulda", ">= 0"
21
+ gem.add_development_dependency "bundler", "~> 1.0.0"
22
+ gem.add_development_dependency "jeweler", "~> 1.5.0.pre3"
23
+ gem.add_development_dependency "rcov", ">= 0"
24
+ gem.add_dependency "redis", "~> 2.0.5"
25
+ gem.add_dependency "yajl-ruby", ">= 0.7.7"
26
+ gem.add_dependency "json", "~> 1.4.6"
27
+ end
28
+ Jeweler::RubygemsDotOrgTasks.new
29
+ require 'rake/testtask'
30
+ Rake::TestTask.new(:test) do |test|
31
+ test.libs << 'lib' << 'test'
32
+ test.pattern = 'test/**/test_*.rb'
33
+ test.verbose = true
34
+ end
35
+
36
+ require 'rcov/rcovtask'
37
+ Rcov::RcovTask.new do |test|
38
+ test.libs << 'test'
39
+ test.pattern = 'test/**/test_*.rb'
40
+ test.verbose = true
41
+ end
42
+
43
+ task :default => :test
44
+
45
+ require 'rake/rdoctask'
46
+ Rake::RDocTask.new do |rdoc|
47
+ version = File.exist?('VERSION') ? File.read('VERSION') : ""
48
+
49
+ rdoc.rdoc_dir = 'rdoc'
50
+ rdoc.title = "kthxbye #{version}"
51
+ rdoc.rdoc_files.include('README*')
52
+ rdoc.rdoc_files.include('lib/**/*.rb')
53
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 1.0.0
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ $LOAD_PATH.unshift ::File.expand_path(::File.dirname(__FILE__) + '/lib')
4
+ require 'kthxbye'
5
+ require 'kthxbye/web_interface'
6
+
7
+ Kthxbye::WebInterface.run! :host => "localhost", :port => 4567
@@ -0,0 +1,151 @@
1
+ require 'redis'
2
+
3
+ begin
4
+ require 'yajl'
5
+ rescue LoadError
6
+ require 'json'
7
+ end
8
+
9
+ $LOAD_PATH << './lib'
10
+
11
+ require 'kthxbye/config'
12
+ require 'kthxbye/helper'
13
+ require 'kthxbye/job'
14
+ require 'kthxbye/worker'
15
+ require 'kthxbye/failure'
16
+
17
+ require 'kthxbye/version'
18
+
19
+ require 'kthxbye/exceptions'
20
+
21
+ module Kthxbye
22
+ include Helper
23
+ extend self
24
+
25
+ #takes in an existing redis instance or simply connects a new instance
26
+ def connect( redis_instance=nil )
27
+ @redis = ( redis_instance || Redis.new( :host => Config.options[:redis_server], :port => Config.options[:redis_port] ) )
28
+ end
29
+
30
+ def redis
31
+ return @redis if @redis
32
+ Config.setup
33
+ self.connect
34
+ self.redis
35
+ end
36
+
37
+ def keys
38
+ redis.keys("*")
39
+ end
40
+
41
+ # creates a job for the given class (with args) on the given queue
42
+ def enqueue(queue, klass, *args)
43
+ Job.create(queue, klass, *args)
44
+ end
45
+
46
+ # gets the size of a given queue
47
+ def size(queue)
48
+ redis.llen("queue:#{queue}").to_i
49
+ end
50
+
51
+ # gets the latest job off the given queue
52
+ # returns a Job object
53
+ def salvage(q)
54
+ id = redis.lpop( "queue:#{q}" )
55
+ if id
56
+ payload = decode( redis.hget( "data-store:#{q}", id ) )
57
+ return Job.new(id, q, payload)
58
+ else
59
+ log "No jobs found in #{q}"
60
+ return nil
61
+ end
62
+ end
63
+
64
+ # lets us peek at the data to be run with a job
65
+ # can lookup an entire queue or for a specific job id
66
+ def peek(store, queue, id=nil)
67
+ if id
68
+ decode( redis.hget( "#{store}-store:#{queue}", id ) )
69
+ else
70
+ all = redis.hgetall( "#{store}-store:#{queue}" )
71
+ results = {}
72
+ all.each {|k,v| results[k] = decode( v ) }
73
+ return results
74
+ end
75
+ end
76
+
77
+ # handles a few of our dynamic methods
78
+ def method_missing(name, *args)
79
+ method_name = name.id2name
80
+ if method_name =~ /^(data|result)_peek$/
81
+ Kthxbye.send(:peek, $1, *args)
82
+ else
83
+ super
84
+ end
85
+
86
+ end
87
+
88
+ # returns all the queues Kthxbye knows about
89
+ def queues
90
+ redis.smembers( :queues ).sort
91
+ end
92
+
93
+ # registers the queue in our "known queues" list
94
+ def register_queue(queue)
95
+ redis.sadd(:queues, queue) unless redis.sismember(:queues, queue)
96
+ end
97
+
98
+ # Removes the queue from the active queue listing, does not delete queue
99
+ # will lead to phantom queues. use delete_queue for complete removal of queue
100
+ def unregister_queue(queue)
101
+ redis.srem(:queues, queue)
102
+ end
103
+
104
+ # Completely removes queue: unregisters it then deletes it
105
+ # should return true in all cases
106
+ def delete_queue(queue)
107
+ unregister_queue(queue)
108
+ redis.del( "queue:#{queue}" ) || true
109
+ end
110
+
111
+ # returns all our registered workers
112
+ def workers
113
+ workers = redis.smembers( :workers )
114
+ workers.map {|x| Worker.find( x ) }
115
+ end
116
+
117
+ # returns all our active workers and the job they are working
118
+ def working
119
+ workers = redis.smembers( :working )
120
+ data = []
121
+ workers.each do |w_id|
122
+ data << [w_id, decode( redis.get("worker:#{w_id}") )]
123
+ end
124
+ return data
125
+ end
126
+
127
+ # returns either the job results for a specific job (if id specified)
128
+ # or all the results for all the jobs on a queue
129
+ def job_results(queue, id=nil)
130
+ if id
131
+ decode( redis.hget( "result-store:#{queue}", id ) )
132
+ else
133
+ Array( redis.hgetall( "result-store:#{queue}" ) )
134
+ end
135
+ end
136
+
137
+ def inspect
138
+ {
139
+ :version => Version,
140
+ :keys => keys.size,
141
+ :workers => workers.size,
142
+ :working => working.size,
143
+ :queues => queues.size,
144
+ :failed => Failure.count,
145
+ :pending => queues.inject(0) {|m,o| m + size(o)}
146
+ }
147
+ end
148
+ end
149
+
150
+
151
+
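The dynamic @data_peek@ / @result_peek@ dispatch in the module above can be tried in isolation with the sketch below. The module name @PeekSketch@ and its @STORES@ hash are hypothetical stand-ins for the Redis-backed stores, and @respond_to_missing?@ is an addition that the original file does not define (it requires Ruby 1.9.2 or later).

```ruby
# Stand-alone illustration of routing *_peek calls through method_missing.
module PeekSketch
  extend self

  # Hypothetical in-memory data in place of the Redis hashes.
  STORES = {
    "data"   => { "1" => "payload" },
    "result" => { "1" => "done" }
  }

  def peek(store, id)
    STORES.fetch(store)[id]
  end

  # data_peek(id) and result_peek(id) are forwarded to peek(store, id).
  def method_missing(name, *args)
    if name.to_s =~ /^(data|result)_peek$/
      peek($1, *args)
    else
      super
    end
  end

  # Keeps respond_to? consistent with the methods handled above.
  def respond_to_missing?(name, include_private = false)
    name.to_s =~ /^(data|result)_peek$/ ? true : super
  end
end
```

Any other unknown method still falls through to @super@ and raises @NoMethodError@, matching the @else@ branch in the original.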