kthxbye 1.0.0

Files changed (41)
  1. data/.document +5 -0
  2. data/.gitignore +33 -0
  3. data/DESIGN.textile +81 -0
  4. data/Gemfile +21 -0
  5. data/Gemfile.lock +42 -0
  6. data/LICENSE +20 -0
  7. data/README.textile +91 -0
  8. data/Rakefile +53 -0
  9. data/VERSION +1 -0
  10. data/config.ru +7 -0
  11. data/lib/kthxbye.rb +151 -0
  12. data/lib/kthxbye/config.rb +35 -0
  13. data/lib/kthxbye/exceptions.rb +4 -0
  14. data/lib/kthxbye/failure.rb +62 -0
  15. data/lib/kthxbye/helper.rb +42 -0
  16. data/lib/kthxbye/job.rb +127 -0
  17. data/lib/kthxbye/version.rb +5 -0
  18. data/lib/kthxbye/web_interface.rb +117 -0
  19. data/lib/kthxbye/web_interface/public/application.js +16 -0
  20. data/lib/kthxbye/web_interface/public/awesome-buttons.css +108 -0
  21. data/lib/kthxbye/web_interface/public/jquery.js +154 -0
  22. data/lib/kthxbye/web_interface/public/style.css +128 -0
  23. data/lib/kthxbye/web_interface/views/error.haml +5 -0
  24. data/lib/kthxbye/web_interface/views/failed.haml +26 -0
  25. data/lib/kthxbye/web_interface/views/hash.haml +6 -0
  26. data/lib/kthxbye/web_interface/views/layout.haml +33 -0
  27. data/lib/kthxbye/web_interface/views/overview.haml +2 -0
  28. data/lib/kthxbye/web_interface/views/queues.haml +31 -0
  29. data/lib/kthxbye/web_interface/views/set.haml +4 -0
  30. data/lib/kthxbye/web_interface/views/stats.haml +32 -0
  31. data/lib/kthxbye/web_interface/views/view_backtrace.haml +8 -0
  32. data/lib/kthxbye/web_interface/views/workers.haml +24 -0
  33. data/lib/kthxbye/web_interface/views/working.haml +19 -0
  34. data/lib/kthxbye/worker.rb +221 -0
  35. data/test/helper.rb +18 -0
  36. data/test/redis-test.conf +115 -0
  37. data/test/test_failure.rb +51 -0
  38. data/test/test_helper.rb +86 -0
  39. data/test/test_kthxbye.rb +213 -0
  40. data/test/test_worker.rb +148 -0
  41. metadata +364 -0
@@ -0,0 +1,5 @@
1
+ README.rdoc
2
+ lib/**/*.rb
3
+ bin/*
4
+ features/**/*.feature
5
+ LICENSE
@@ -0,0 +1,33 @@
1
+ # rcov generated
2
+ coverage
3
+
4
+ # rdoc generated
5
+ rdoc
6
+
7
+ # yard generated
8
+ doc
9
+ .yardoc
10
+
11
+ # bundler
12
+ .bundle
13
+
14
+ # jeweler generated
15
+ pkg
16
+ ## MAC OS
17
+ .DS_Store
18
+
19
+ ## TEXTMATE
20
+ *.tmproj
21
+ tmtags
22
+
23
+ ## EMACS
24
+ *~
25
+ \#*
26
+ .\#*
27
+
28
+ ## VIM
29
+ *.swp
30
+ *.swo
31
+
32
+ ## PROJECT::SPECIFIC
33
+ *.gemspec
@@ -0,0 +1,81 @@
1
+ h1. Queues
2
+
3
+ There are a number of important queues that we employ to track data:
4
+
5
+ # queues - responsible for pushing/popping jobs. only stores a unique id. Is
6
+ pushed and popped, therefore a fairly "volatile" dataset.
7
+ # data-store:[name] - hash responsible for the different types of job data.
8
+ Stored by job id. This data is persisted until a job is successful or failed
9
+ too many times. Then the job will be dropped and failure recorded.
10
+ # result-store:[name] - hash responsible for storing the results by job id.
11
+ Data will remain here until removed. Can be fetched multiple times if desired
12
+ providing a cached retrieval.
13
+ # failure-store:[name] - hash responsible for storing any failed jobs. Stored
14
+ until manually cleared.
15
+
16
+ In addition there is a unique id counter that gets incremented as the job queue
17
+ grows. It is stored at "unique_id"
18
+
19
+ On top of these queues, there is also a set of stores set up to keep track of
20
+ workers and processes actively being worked.
21
+
22
+ # workers - set that tracks all the registered workers. All workers not in this list
23
+ should be destroyed.
24
+ # working - set of workers that are actively working a job.
25
+
26
+ h1. Jobs
27
+
28
+ I decided to make jobs a two fold purpose vehicle. One, it would be the job
29
+ storage mechanism through which a job would be placed on the queue for later
30
+ execution by workers and two, it would actually do the job execution, even
31
+ though the worker is the one that calls it. This way all job related tasks
32
+ can be abstracted to the job class, while the worker can busy itself about
33
+ handling the control of the job execution and not the job execution itself.
34
+
35
+ Jobs will also be responsible for storing the state of the result and any
36
+ failures that come from the job execution.
37
+
38
+ A job can only live on one queue.
39
+
40
+ Job will try 3 times before being taken out of the work queue and placed in the
41
+ failure queue
42
+
43
+ h1. Workers
44
+
45
+ There was a big struggle in the beginning to decide how to work queued jobs.
46
+ I originally wanted to do a more cooperative scheduling schema but found that
47
+ would not end up taking multiple cores into consideration. Moreover, it is
48
+ wicked hard to preempt fibers in a meaningful way without digging really deep
49
+ into other code as well in order to retrofit them for Fibers or EventMachine.
50
+ So I simply didn't.
51
+
52
+ Instead I went with Ruby's Process library and decided to work at making it
53
+ work as much as possible around a Unix-style processing queue so that in one
54
+ fell swoop, we could use multiple cores as well as gain true concurrency.
55
+ One more design decision is that we will not attempt to run multiple processes-
56
+ per-worker. This way we can control the number of processes running by how many
57
+ workers we choose to run, rather than by how many processes a worker is allowed
58
+ to spawn.
59
+
60
+ This presents a slight problem however. Each worker then will involve two
61
+ processes. One parent/control-loop and one child/job-processor. This means in
62
+ the end, we have more processes running at once, possibly chewing up more
63
+ resources than might be ultimately necessary. This will hopefully be overcome
64
+ by suspending the parent worker while the child process runs. This way one core
65
+ is not chewed up by an essentially idle process. Hopefully this can be benchmarked
66
+ to figure out if it's faster to run one-to-many parent/children or one-to-one.
67
+
68
+ A worker can work multiple queues
69
+
70
+ h1. Web Interface
71
+
72
+ Intended to have a web frontend to control/observe workers, jobs and queues.
73
+
74
+ h1. Job Observer Widget
75
+
76
+ Intended to have a JS widget that will indicate when a job has been computed
77
+ and the results are ready or a job has failed. Potential application for a
78
+ Node.js implementation (using something like
79
+ http://github.com/fictorial/redis-node-client). Should be easily embeddable in
80
+ any page for simple notification.
81
+
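The store layout and retry rule that DESIGN.textile describes can be sketched in plain Ruby. This is a minimal in-memory illustration under stated assumptions, not the gem's code: `SketchStore`, `work_one`, and `MAX_ATTEMPTS` are hypothetical names, and Hash/Array objects stand in for the Redis hashes and lists.

```ruby
# In-memory stand-in for the layout in DESIGN.textile (hypothetical names).
# Hash and Array objects replace the Redis hashes and lists.
class SketchStore
  MAX_ATTEMPTS = 3 # DESIGN.textile: a job is tried 3 times before it is failed

  attr_reader :queues, :data_store, :result_store, :failure_store

  def initialize
    @unique_id     = 0                              # the "unique_id" counter
    @queues        = Hash.new { |h, k| h[k] = [] }  # queue name -> list of job ids
    @data_store    = Hash.new { |h, k| h[k] = {} }  # queue name -> { id => payload }
    @result_store  = Hash.new { |h, k| h[k] = {} }  # queue name -> { id => result }
    @failure_store = Hash.new { |h, k| h[k] = {} }  # queue name -> { id => error }
    @attempts      = Hash.new(0)                    # id -> failed attempts so far
  end

  # Stores the payload by id and pushes only the id onto the queue.
  def enqueue(queue, klass, *args)
    id = (@unique_id += 1)
    @data_store[queue][id] = { klass: klass, args: args }
    @queues[queue].push(id)
    id
  end

  # Pops one id, runs the job, and records either the result or the failure.
  def work_one(queue)
    id = @queues[queue].shift
    return nil unless id

    job = @data_store[queue][id]
    begin
      @result_store[queue][id] = job[:klass].perform(*job[:args])
      @data_store[queue].delete(id)
    rescue StandardError => e
      @attempts[id] += 1
      if @attempts[id] >= MAX_ATTEMPTS
        @failure_store[queue][id] = e.message
        @data_store[queue].delete(id)
      else
        @queues[queue].push(id) # retry later
      end
    end
    id
  end
end
```

Keeping only ids on the queue and the payload in a separate hash mirrors the split between "queues" and "data-store" above; how the gem itself counts attempts is not shown in this part of the diff.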
data/Gemfile ADDED
@@ -0,0 +1,21 @@
1
+ source "http://rubygems.org"
2
+ # Add dependencies required to use your gem here.
3
+ # Example:
4
+ # gem "activesupport", ">= 2.3.5"
5
+
6
+ # Add dependencies to develop your gem here.
7
+ # Include everything needed to run rake, tests, features, etc.
8
+ group :development do
9
+ gem "shoulda", ">= 0"
10
+ gem "bundler", "~> 1.0.0"
11
+ gem "jeweler", "~> 1.5.0.pre3"
12
+ gem "rcov", ">= 0"
13
+
14
+ end
15
+
16
+ gem "redis"
17
+ gem "yajl-ruby"
18
+ gem "json"
19
+ gem "mail"
20
+ gem 'i18n'
21
+ gem 'sinatra'
@@ -0,0 +1,42 @@
1
+ GEM
2
+ remote: http://rubygems.org/
3
+ specs:
4
+ activesupport (3.0.0)
5
+ git (1.2.5)
6
+ i18n (0.4.1)
7
+ jeweler (1.5.0.pre3)
8
+ bundler (~> 1.0.0)
9
+ git (>= 1.2.5)
10
+ rake
11
+ json (1.4.6)
12
+ mail (2.2.6.1)
13
+ activesupport (>= 2.3.6)
14
+ mime-types
15
+ treetop (>= 1.4.5)
16
+ mime-types (1.16)
17
+ polyglot (0.3.1)
18
+ rack (1.2.1)
19
+ rake (0.8.7)
20
+ rcov (0.9.9)
21
+ redis (2.0.10)
22
+ shoulda (2.11.3)
23
+ sinatra (1.0)
24
+ rack (>= 1.0)
25
+ treetop (1.4.8)
26
+ polyglot (>= 0.3.1)
27
+ yajl-ruby (0.7.7)
28
+
29
+ PLATFORMS
30
+ ruby
31
+
32
+ DEPENDENCIES
33
+ bundler (~> 1.0.0)
34
+ i18n
35
+ jeweler (~> 1.5.0.pre3)
36
+ json
37
+ mail
38
+ rcov
39
+ redis
40
+ shoulda
41
+ sinatra
42
+ yajl-ruby
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009 Luke van der Hoeven
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,91 @@
1
+ h1. kthxbye
2
+
3
+ Kthxbye is the answer to a fairly unique-yet-common problem: Background job
4
+ processing when we care about the result.
5
+
6
+ There are a number of projects I can think of where a job takes longer than
7
+ a user ought to wait for a result (due to server timeout length or out of
8
+ simple app responsiveness courtesy to the user) and yet I need the result of the
9
+ operation to be returned to the user.
10
+
11
+ Here's a real-world example. I work with a set of legacy Oracle databases that
12
+ stores much of our business logic as PLSQL procedures. Yes, this is not "The
13
+ Rails Way™" but it's the only way for the company I work for right now.
14
+ Many of the procedures that I run as part of several of the applications I
15
+ support can take on average one minute or more with a standard deviation of
16
+ almost 2 minutes (with a forced timeout of 5 minutes). That's kinda a long time
17
+ to sit and wait on a web app.
18
+
19
+ <img src="http://img.skitch.com/20100901-gadna641fj4wdeswgj74y2pssq.png" alt="RLA - IWP Analysis"/>
20
+
21
+ We don't really want users sitting waiting for up to 5 minutes (when it forces
22
+ failure) unable to do anything or (even worse) hitting refresh or the action
23
+ again. Especially bad when this can mean the HTTP server is getting backed up
24
+ as more and more people run these long running processes.
25
+
26
+ Moreover, the users need to get a response from the completed job before moving
27
+ on. Most job processors (DJ, Resque) are set up for running jobs that do not
28
+ require the result to be returned to the web app (think mass-mailers, queue
29
+ population, image resizing). They just run and the output goes to a database,
30
+ an inbox or a file server.
31
+
32
+ h2. Enter Kthxbye
33
+
34
+ Kthxbye is an attempt to solve this problem. It is based heavily off of
35
+ "Resque":http://github.com/defunkt/resque and why not an addition to Resque?
36
+ I needed some hack time with Redis on my own as I've never used it before...
37
+
38
+ bq. I can learn any language or tool in a matter of days if you give me
39
+ 1. a good manual
40
+ 2. an even better project to work on.
41
+
42
+ _- Prof. Shumacher_
43
+
44
+ This project accomplishes both those goals. This is an attempt to learn
45
+ something, using Resque and Redis docs as a manual, while at the same time
46
+ creating a much needed solution to a problem.
47
+
48
+ The idea is to be able to do the following:
49
+
50
+ # dummy job class
51
+ class MyJob
52
+ def self.perform(data)
53
+ puts "Do something with #{data}"
54
+ data.gsub(/hello/i, "Goodbye")
55
+ end
56
+ end
57
+
58
+ # setup options, then connect
59
+ Kthxbye::Config.setup(:redis_server => 'localhost', :redis_port => 8080)
60
+
61
+ # each enqueued job returns a unique id to poll with
62
+ unique_id = Kthxbye.enqueue("jobs", MyJob, "Hello World")
63
+
64
+ # ... code code code ...
65
+
66
+ # polls queue every 5 seconds
67
+ computed_value = Kthxbye.poll("jobs", unique_id, 5)
68
+
69
+ and then in some other world, on some other machine, (that still has knowledge of MyJob)
70
+
71
+ # inits with queue
72
+ worker = Kthxbye::Worker.new("jobs")
73
+
74
+ # connects to queue and runs jobs found there
75
+ worker.run
76
+
77
+ Pretty... damn... simple.
78
+
79
+ h2. Note on Patches/Pull Requests
80
+
81
+ * Fork the project.
82
+ * Make your feature addition or bug fix.
83
+ * Add tests for it. This is important so I don't break it in a
84
+ future version unintentionally.
85
+ * Commit, do not mess with rakefile, version, or history.
86
+ (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
87
+ * Send me a pull request. Bonus points for topic branches.
88
+
89
+ h2. Copyright
90
+
91
+ Copyright (c) 2010 Luke van der Hoeven. See LICENSE for details.
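The README shows `Kthxbye.poll("jobs", unique_id, 5)` as the intended way to wait for a result. The idea of polling at a fixed interval can be sketched as below. This is an illustration only: `poll_for_result` and its `max_attempts` parameter are hypothetical and are not claimed to match the gem's own `poll`.

```ruby
# Hypothetical polling helper: calls the block until it returns a non-nil
# value or the attempt limit is reached, sleeping between attempts.
def poll_for_result(interval, max_attempts: 10)
  max_attempts.times do |attempt|
    result = yield
    return result unless result.nil?

    # Do not sleep after the final attempt.
    sleep(interval) if attempt < max_attempts - 1
  end
  nil # gave up without a result
end
```

Under the assumption that a result lookup returns nil until the job has finished, a caller could write something like `poll_for_result(5) { Kthxbye.job_results("jobs", unique_id) }`. The attempt limit keeps a web request from waiting forever on a job that never completes.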
@@ -0,0 +1,53 @@
1
+ require 'rubygems'
2
+ require 'bundler'
3
+ begin
4
+ Bundler.setup(:default, :development)
5
+ rescue Bundler::BundlerError => e
6
+ $stderr.puts e.message
7
+ $stderr.puts "Run `bundle install` to install missing gems"
8
+ exit e.status_code
9
+ end
10
+ require 'rake'
11
+
12
+ require 'jeweler'
13
+ Jeweler::Tasks.new do |gem|
14
+ gem.name = "kthxbye"
15
+ gem.summary = %Q{Async processing + results notification}
16
+ gem.description = %Q{Kthxbye is the answer to a fairly unique-yet-common problem: Background job processing when we care about the result.}
17
+ gem.email = "hungerandthirst@gmail.com"
18
+ gem.homepage = "http://github.com/plukevdh/kthxbye"
19
+ gem.authors = ["Luke van der Hoeven"]
20
+ gem.add_development_dependency "shoulda", ">= 0"
21
+ gem.add_development_dependency "bundler", "~> 1.0.0"
22
+ gem.add_development_dependency "jeweler", "~> 1.5.0.pre3"
23
+ gem.add_development_dependency "rcov", ">= 0"
24
+ gem.add_dependency "redis", "~> 2.0.5"
25
+ gem.add_dependency "yajl-ruby", ">= 0.7.7"
26
+ gem.add_dependency "json", "~> 1.4.6"
27
+ end
28
+ Jeweler::RubygemsDotOrgTasks.new
29
+ require 'rake/testtask'
30
+ Rake::TestTask.new(:test) do |test|
31
+ test.libs << 'lib' << 'test'
32
+ test.pattern = 'test/**/test_*.rb'
33
+ test.verbose = true
34
+ end
35
+
36
+ require 'rcov/rcovtask'
37
+ Rcov::RcovTask.new do |test|
38
+ test.libs << 'test'
39
+ test.pattern = 'test/**/test_*.rb'
40
+ test.verbose = true
41
+ end
42
+
43
+ task :default => :test
44
+
45
+ require 'rake/rdoctask'
46
+ Rake::RDocTask.new do |rdoc|
47
+ version = File.exist?('VERSION') ? File.read('VERSION') : ""
48
+
49
+ rdoc.rdoc_dir = 'rdoc'
50
+ rdoc.title = "kthxbye #{version}"
51
+ rdoc.rdoc_files.include('README*')
52
+ rdoc.rdoc_files.include('lib/**/*.rb')
53
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 1.0.0
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ $LOAD_PATH.unshift ::File.expand_path(::File.dirname(__FILE__) + '/lib')
4
+ require 'kthxbye'
5
+ require 'kthxbye/web_interface'
6
+
7
+ Kthxbye::WebInterface.run! :host => "localhost", :port => 4567
@@ -0,0 +1,151 @@
1
+ require 'redis'
2
+
3
+ begin
4
+ require 'yajl'
5
+ rescue LoadError
6
+ require 'json'
7
+ end
8
+
9
+ $LOAD_PATH << './lib'
10
+
11
+ require 'kthxbye/config'
12
+ require 'kthxbye/helper'
13
+ require 'kthxbye/job'
14
+ require 'kthxbye/worker'
15
+ require 'kthxbye/failure'
16
+
17
+ require 'kthxbye/version'
18
+
19
+ require 'kthxbye/exceptions'
20
+
21
+ module Kthxbye
22
+ include Helper
23
+ extend self
24
+
25
+ #takes in an existing redis instance or simply connects a new instance
26
+ def connect( redis_instance=nil )
27
+ @redis = ( redis_instance || Redis.new( :host => Config.options[:redis_server], :port => Config.options[:redis_port] ) )
28
+ end
29
+
30
+ def redis
31
+ return @redis if @redis
32
+ Config.setup
33
+ self.connect
34
+ self.redis
35
+ end
36
+
37
+ def keys
38
+ redis.keys("*")
39
+ end
40
+
41
+ # creates a job for klass with the given args and places it on the given queue
42
+ def enqueue(queue, klass, *args)
43
+ Job.create(queue, klass, *args)
44
+ end
45
+
46
+ # gets the size of a given queue
47
+ def size(queue)
48
+ redis.llen("queue:#{queue}").to_i
49
+ end
50
+
51
+ # gets the latest job off the given queue
52
+ # returns a Job object
53
+ def salvage(q)
54
+ id = redis.lpop( "queue:#{q}" )
55
+ if id
56
+ payload = decode( redis.hget( "data-store:#{q}", id ) )
57
+ return Job.new(id, q, payload)
58
+ else
59
+ log "No jobs found in #{q}"
60
+ return nil
61
+ end
62
+ end
63
+
64
+ # lets us peek at the data to be run with a job
65
+ # can lookup an entire queue or for a specific job id
66
+ def peek(store, queue, id=nil)
67
+ if id
68
+ decode( redis.hget( "#{store}-store:#{queue}", id ) )
69
+ else
70
+ all = redis.hgetall( "#{store}-store:#{queue}" )
71
+ results = {}
72
+ all.each {|k,v| results[k] = decode( v ) }
73
+ return results
74
+ end
75
+ end
76
+
77
+ # handles a few of our dynamic methods
78
+ def method_missing(name, *args)
79
+ method_name = name.id2name
80
+ if method_name =~ /^(data|result)_peek$/
81
+ Kthxbye.send(:peek, $1, *args)
82
+ else
83
+ super
84
+ end
85
+
86
+ end
87
+
88
+ # returns all the queues Kthxbye knows about
89
+ def queues
90
+ redis.smembers( :queues ).sort
91
+ end
92
+
93
+ # registers the queue in our "known queues" list
94
+ def register_queue(queue)
95
+ redis.sadd(:queues, queue) unless redis.sismember(:queues, queue)
96
+ end
97
+
98
+ # Removes the queue from the active queue listing but does not delete the queue itself,
99
+ # which can leave phantom queues behind. Use delete_queue for complete removal of a queue
100
+ def unregister_queue(queue)
101
+ redis.srem(:queues, queue)
102
+ end
103
+
104
+ # Completely removes queue: unregisters it then deletes it
105
+ # should return true in all cases
106
+ def delete_queue(queue)
107
+ unregister_queue(queue)
108
+ redis.del( "queue:#{queue}" ) || true
109
+ end
110
+
111
+ # returns all our registered workers
112
+ def workers
113
+ workers = redis.smembers( :workers )
114
+ workers.map {|x| Worker.find( x ) }
115
+ end
116
+
117
+ # returns all our active workers and the job they are working
118
+ def working
119
+ workers = redis.smembers( :working )
120
+ data = []
121
+ workers.each do |w_id|
122
+ data << [w_id, decode( redis.get("worker:#{w_id}") )]
123
+ end
124
+ return data
125
+ end
126
+
127
+ # returns either the job results for a specific job (if id specified)
128
+ # or all the results for all the jobs on a queue
129
+ def job_results(queue, id=nil)
130
+ if id
131
+ decode( redis.hget( "result-store:#{queue}", id ) )
132
+ else
133
+ Array( redis.hgetall( "result-store:#{queue}" ) )
134
+ end
135
+ end
136
+
137
+ def inspect
138
+ {
139
+ :version => Version,
140
+ :keys => keys.size,
141
+ :workers => workers.size,
142
+ :working => working.size,
143
+ :queues => queues.size,
144
+ :failed => Failure.count,
145
+ :pending => queues.inject(0) {|m,o| m + size(o)}
146
+ }
147
+ end
148
+ end
149
+
150
+
151
+
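The `method_missing` hook in lib/kthxbye.rb routes `data_peek` and `result_peek` to `peek`. A common Ruby idiom is to pair such a hook with `respond_to_missing?` so that `respond_to?` agrees with it. The sketch below illustrates that pairing in isolation: `PeekSketch` and `PEEK_PATTERN` are hypothetical names, and `peek` here only echoes its arguments instead of reading from Redis.

```ruby
# Stand-alone illustration of method_missing paired with respond_to_missing?.
# PeekSketch is hypothetical; peek echoes its arguments instead of hitting Redis.
module PeekSketch
  extend self

  PEEK_PATTERN = /\A(data|result)_peek\z/

  def peek(store, queue, id = nil)
    [store, queue, id]
  end

  # Routes data_peek / result_peek to peek with the store name prepended.
  def method_missing(name, *args)
    match = PEEK_PATTERN.match(name.to_s)
    return peek(match[1], *args) if match

    super
  end

  # Keeps respond_to? consistent with the dynamic methods above.
  def respond_to_missing?(name, include_private = false)
    PEEK_PATTERN.match?(name.to_s) || super
  end
end
```

With this in place, `PeekSketch.respond_to?(:data_peek)` is true while unrelated names still raise `NoMethodError` through `super`.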