
kthxbye 1.0.0


Files changed (41)

data/.document +5 -0
data/.gitignore +33 -0
data/DESIGN.textile +81 -0
data/Gemfile +21 -0
data/Gemfile.lock +42 -0
data/LICENSE +20 -0
data/README.textile +91 -0
data/Rakefile +53 -0
data/VERSION +1 -0
data/config.ru +7 -0
data/lib/kthxbye.rb +151 -0
data/lib/kthxbye/config.rb +35 -0
data/lib/kthxbye/exceptions.rb +4 -0
data/lib/kthxbye/failure.rb +62 -0
data/lib/kthxbye/helper.rb +42 -0
data/lib/kthxbye/job.rb +127 -0
data/lib/kthxbye/version.rb +5 -0
data/lib/kthxbye/web_interface.rb +117 -0
data/lib/kthxbye/web_interface/public/application.js +16 -0
data/lib/kthxbye/web_interface/public/awesome-buttons.css +108 -0
data/lib/kthxbye/web_interface/public/jquery.js +154 -0
data/lib/kthxbye/web_interface/public/style.css +128 -0
data/lib/kthxbye/web_interface/views/error.haml +5 -0
data/lib/kthxbye/web_interface/views/failed.haml +26 -0
data/lib/kthxbye/web_interface/views/hash.haml +6 -0
data/lib/kthxbye/web_interface/views/layout.haml +33 -0
data/lib/kthxbye/web_interface/views/overview.haml +2 -0
data/lib/kthxbye/web_interface/views/queues.haml +31 -0
data/lib/kthxbye/web_interface/views/set.haml +4 -0
data/lib/kthxbye/web_interface/views/stats.haml +32 -0
data/lib/kthxbye/web_interface/views/view_backtrace.haml +8 -0
data/lib/kthxbye/web_interface/views/workers.haml +24 -0
data/lib/kthxbye/web_interface/views/working.haml +19 -0
data/lib/kthxbye/worker.rb +221 -0
data/test/helper.rb +18 -0
data/test/redis-test.conf +115 -0
data/test/test_failure.rb +51 -0
data/test/test_helper.rb +86 -0
data/test/test_kthxbye.rb +213 -0
data/test/test_worker.rb +148 -0
metadata +364 -0

data/.document ADDED

@@ -0,0 +1,5 @@
+README.rdoc
+lib/**/*.rb
+bin/*
+features/**/*.feature
+LICENSE

data/.gitignore ADDED

@@ -0,0 +1,33 @@
+# rcov generated
+coverage
+# rdoc generated
+rdoc
+# yard generated
+doc
+.yardoc
+# bundler
+.bundle
+# jeweler generated
+pkg
+## MAC OS
+.DS_Store
+## TEXTMATE
+*.tmproj
+tmtags
+## EMACS
+*~
+\#*
+.\#*
+## VIM
+*.swp
+*.swo
+## PROJECT::SPECIFIC
+*.gemspec

data/DESIGN.textile ADDED

@@ -0,0 +1,81 @@
+h1. Queues
+There are a number of important queues that we employ to track data:
+# queues - responsible for pushing/popping jobs. only stores a unique id. Is
+pushed and popped, therefore a fairly "volatile" dataset.
+# data-store:[name] - hash responsible for the different types of job data.
+Stored by job id. This data is persisted until a job is successful or failed
+too many times. Then the job will be dropped and failure recorded.
+# result-store:[name] - hash responsible for storing the results by job id.
+Data will remain here until removed. Can be fetched multiple times if desired
+providing a cached retrieval.
+# failure-store:[name] - hash responsible for storing any failed jobs. Stored
+until manually cleared.
+In addition there is a unique id counter that gets incremented as the job queue
+grows. It is stored at "unique_id".
+On top of these queues, there is also a set of stores set up to keep track of
+workers and processes actively being worked.
+# workers - set that tracks all the registered workers. All workers not in this list
+should be destroyed.
+# working - set of workers that are actively working a job.
+h1. Jobs
+I decided to give the job class a twofold purpose. One, it would be the job
+storage mechanism through which a job would be placed on the queue for later
+execution by workers and two, it would actually do the job execution, even
+though the worker is the one that calls it. This way all job related tasks
+can be abstracted to the job class, while the worker can busy itself about
+handling the control of the job execution and not the job execution itself.
+Jobs will also be responsible for storing the state of the result and any
+failures that come from the job execution.
+A job can only live on one queue.
+A job will be tried 3 times before being taken out of the work queue and placed in the
+failure queue.
+h1. Workers
+There was a big struggle in the beginning to decide how to work queued jobs.
+I originally wanted to do a more cooperative scheduling schema but found that
+would not end up taking multiple cores into consideration. Moreover, it is
+wicked hard to preempt fibers in a meaningful way without digging really deep
+into other code as well in order to retrofit them for Fibers or EventMachine.
+So I simply didn't.
+Instead I went with Ruby's Process library and decided to work at making it
+work as much as possible around a Unix-style processing queue so that in one
+fell swoop, we could use multiple cores as well as gain true concurrency.
+One more design decision is that we will not attempt to run multiple processes-
+per-worker. This way we can control the number of processes running by how many
+workers we choose to run, rather than by how many processes a worker is allowed
+to spawn.
+This presents a slight problem however. Each worker then will involve two
+processes. One parent/control-loop and one child/job-processor. This means in
+the end, we have more processes running at once, possibly chewing up more
+resources than might be ultimately necessary. This will hopefully be overcome
+by suspending the parent worker while the child process runs. This way one core
+is not chewed up by an essentially idle process. Hopefully this can be benchmarked
+to figure out if it's faster to run one-to-many parent/children or one-to-one.
+A worker can work multiple queues.
+h1. Web Interface
+Intended to have a web frontend to control/observe workers, jobs and queues.
+h1. Job Observer Widget
+Intended to have a JS widget that will indicate when a job has been computed
+and the results are ready or a job has failed. Potential application for a
+Node.js implementation (using something like
+http://github.com/fictorial/redis-node-client). Should be easily embeddable in
+any page for simple notification.
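
The key layout described in DESIGN.textile can be sketched without Redis. The following is a minimal in-memory illustration, not the gem's implementation: the `MiniStore` class and its methods are hypothetical, and retrying inline is an assumption. Only the key prefixes (`queue:`, `data-store:`, `result-store:`, `failure-store:`) and the 3-attempt limit come from the files in this diff.

```ruby
# Hypothetical in-memory stand-in for the Redis keys described in DESIGN.textile.
# Only the key prefixes and the 3-attempt limit come from the source.
class MiniStore
  MAX_ATTEMPTS = 3 # a job is tried 3 times before it is recorded as failed

  attr_reader :lists, :hashes

  def initialize
    @counter = 0                             # stands in for the "unique_id" counter
    @lists   = Hash.new { |h, k| h[k] = [] } # "queue:[name]" lists holding only ids
    @hashes  = Hash.new { |h, k| h[k] = {} } # "*-store:[name]" hashes keyed by job id
  end

  # Store the payload by id, then push only the id onto the queue.
  def enqueue(queue, payload)
    id = (@counter += 1)
    @hashes["data-store:#{queue}"][id] = payload
    @lists["queue:#{queue}"] << id
    id
  end

  # Pop one id, run the block on its payload, and record a result or a failure.
  def work(queue)
    id = @lists["queue:#{queue}"].shift
    return nil unless id

    payload  = @hashes["data-store:#{queue}"][id]
    attempts = 0
    begin
      attempts += 1
      @hashes["result-store:#{queue}"][id] = yield(payload)
    rescue StandardError => e
      retry if attempts < MAX_ATTEMPTS # assumption: retries happen inline
      @hashes["failure-store:#{queue}"][id] = e.message
    end
    @hashes["data-store:#{queue}"].delete(id) # job data is dropped either way
    id
  end
end

store = MiniStore.new
id = store.enqueue("jobs", "Hello World")
store.work("jobs") { |data| data.gsub(/hello/i, "Goodbye") }
puts store.hashes["result-store:jobs"][id]
```

Keeping only ids on the list and the payloads in a separate hash is what lets a result be fetched repeatedly by id after the queue entry itself is gone.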

data/Gemfile ADDED

@@ -0,0 +1,21 @@
+source "http://rubygems.org"
+# Add dependencies required to use your gem here.
+# Example:
+#   gem "activesupport", ">= 2.3.5"
+# Add dependencies to develop your gem here.
+# Include everything needed to run rake, tests, features, etc.
+group :development do
+  gem "shoulda", ">= 0"
+  gem "bundler", "~> 1.0.0"
+  gem "jeweler", "~> 1.5.0.pre3"
+  gem "rcov", ">= 0"
+end
+gem "redis"
+gem "yajl-ruby"
+gem "json"
+gem "mail"
+gem 'i18n'
+gem 'sinatra'

data/Gemfile.lock ADDED

@@ -0,0 +1,42 @@
+GEM
+  remote: http://rubygems.org/
+  specs:
+    activesupport (3.0.0)
+    git (1.2.5)
+    i18n (0.4.1)
+    jeweler (1.5.0.pre3)
+      bundler (~> 1.0.0)
+      git (>= 1.2.5)
+      rake
+    json (1.4.6)
+    mail (2.2.6.1)
+      activesupport (>= 2.3.6)
+      mime-types
+      treetop (>= 1.4.5)
+    mime-types (1.16)
+    polyglot (0.3.1)
+    rack (1.2.1)
+    rake (0.8.7)
+    rcov (0.9.9)
+    redis (2.0.10)
+    shoulda (2.11.3)
+    sinatra (1.0)
+      rack (>= 1.0)
+    treetop (1.4.8)
+      polyglot (>= 0.3.1)
+    yajl-ruby (0.7.7)
+PLATFORMS
+  ruby
+DEPENDENCIES
+  bundler (~> 1.0.0)
+  i18n
+  jeweler (~> 1.5.0.pre3)
+  json
+  mail
+  rcov
+  redis
+  shoulda
+  sinatra
+  yajl-ruby

data/LICENSE ADDED

@@ -0,0 +1,20 @@
+Copyright (c) 2009 Luke van der Hoeven
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

data/README.textile ADDED

@@ -0,0 +1,91 @@
+h1. kthxbye
+Kthxbye is the answer to a fairly unique-yet-common problem: Background job
+processing when we care about the result.
+There are a number of projects I can think of where a job takes longer than
+a user ought to be waiting for a result (due to server timeout length or out of
+simple app responsiveness courtesy to the user) and yet I need the result of the
+operation to be returned to the user.
+Here's a real-world example. I work with a set of legacy Oracle databases that
+stores much of our business logic as PLSQL procedures. Yes, this is not "The
+Rails Way™" but it's the only way for the company I work for right now.
+Many of the procedures that I run as part of several of the applications I
+support can take on average one minute or more with a standard deviation of
+almost 2 minutes (with a forced timeout of 5 minutes). That's kinda a long time
+to sit and wait on a web app.
+<img src="http://img.skitch.com/20100901-gadna641fj4wdeswgj74y2pssq.png" alt="RLA - IWP Analysis"/>
+We don't really want users sitting waiting for up to 5 minutes (when it forces
+failure) unable to do anything or (even worse) hitting refresh or the action
+again. Especially bad when this can mean the HTTP server is getting backed up
+as more and more people run these long running processes.
+Moreover, the users need to get a response from the completed job before moving
+on. Most job processors (DJ, Resque) are set up for running jobs that do not
+require the result to be returned to the web app (think mass-mailers, queue
+population, image resizing). They just run and the output goes to a database,
+an inbox or a file server.
+h2. Enter Kthxbye
+Kthxbye is an attempt to solve this problem. It is based heavily off of
+"Resque":http://github.com/defunkt/resque and why not an addition to Resque?
+I needed some hack time with Redis on my own as I've never used it before...
+bq. I can learn any language or tool in a matter of days if you give me
+  1. a good manual
+  2. an even better project to work on.
+_- Prof. Shumacher_
+This project accomplishes both those goals. This is an attempt to learn
+something, using Resque and Redis docs as a manual, while at the same time
+creating a much needed solution to a problem.
+The idea is to be able to do the following:
+    # dummy job class
+    class MyJob
+      def self.perform(data)
+        puts "Do something with #{data}"
+        data.gsub(/hello/i, "Goodbye")
+      end
+    end
+    # setup options, then connect
+    Kthxbye::Config.setup(:redis_server => 'localhost', :redis_port => 8080)
+    # each enqueued job returns a unique id to poll with
+    unique_id = Kthxbye.enqueue("jobs", MyJob, "Hello World")
+    # ... code code code ...
+    # polls queue every 5 seconds
+    computed_value = Kthxbye.poll("jobs", unique_id, 5)
+and then in some other world, on some other machine, (that still has knowledge of MyJob)
+    # inits with queue
+    worker = Kthxbye::Worker.new("jobs")
+    # connects to queue and runs jobs found there
+    worker.run
+Pretty... damn... simple.
+h2. Note on Patches/Pull Requests
+* Fork the project.
+* Make your feature addition or bug fix.
+* Add tests for it. This is important so I don't break it in a
+  future version unintentionally.
+* Commit, do not mess with rakefile, version, or history.
+  (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
+* Send me a pull request. Bonus points for topic branches.
+h2. Copyright
+Copyright (c) 2010 Luke van der Hoeven. See LICENSE for details.
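
The README's usage example polls for a result at a fixed interval. How the gem implements that call is not visible in the files shown here, so the following is only a sketch of what such a polling loop could look like. The `poll_until` helper, its keyword arguments, and the timeout behaviour are all hypothetical.

```ruby
# Hypothetical polling helper; not part of the gem.
# Calls the block every `interval` seconds until it returns a non-nil value,
# or gives up once `timeout` seconds have passed and returns nil.
def poll_until(interval: 5, timeout: 300)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result unless result.nil?
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Usage: the block stands in for a lookup in the result store.
attempts = 0
value = poll_until(interval: 0.01, timeout: 1) do
  attempts += 1
  attempts >= 3 ? "Goodbye World" : nil
end
puts value
```

A monotonic clock is used for the deadline so that wall-clock adjustments cannot shorten or extend the wait.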

data/Rakefile ADDED

@@ -0,0 +1,53 @@
+require 'rubygems'
+require 'bundler'
+begin
+  Bundler.setup(:default, :development)
+rescue Bundler::BundlerError => e
+  $stderr.puts e.message
+  $stderr.puts "Run `bundle install` to install missing gems"
+  exit e.status_code
+end
+require 'rake'
+require 'jeweler'
+Jeweler::Tasks.new do |gem|
+  gem.name = "kthxbye"
+  gem.summary = %Q{Async processing + results notification}
+  gem.description = %Q{Kthxbye is the answer to a fairly unique-yet-common problem: Background job processing when we care about the result.}
+  gem.email = "hungerandthirst@gmail.com"
+  gem.homepage = "http://github.com/plukevdh/kthxbye"
+  gem.authors = ["Luke van der Hoeven"]
+  gem.add_development_dependency "shoulda", ">= 0"
+  gem.add_development_dependency "bundler", "~> 1.0.0"
+  gem.add_development_dependency "jeweler", "~> 1.5.0.pre3"
+  gem.add_development_dependency "rcov", ">= 0"
+  gem.add_dependency "redis", "~> 2.0.5"
+  gem.add_dependency "yajl-ruby", ">= 0.7.7"
+  gem.add_dependency "json", "~> 1.4.6"
+end
+Jeweler::RubygemsDotOrgTasks.new
+require 'rake/testtask'
+Rake::TestTask.new(:test) do |test|
+  test.libs << 'lib' << 'test'
+  test.pattern = 'test/**/test_*.rb'
+  test.verbose = true
+end
+require 'rcov/rcovtask'
+Rcov::RcovTask.new do |test|
+  test.libs << 'test'
+  test.pattern = 'test/**/test_*.rb'
+  test.verbose = true
+end
+task :default => :test
+require 'rake/rdoctask'
+Rake::RDocTask.new do |rdoc|
+  version = File.exist?('VERSION') ? File.read('VERSION') : ""
+  rdoc.rdoc_dir = 'rdoc'
+  rdoc.title = "kthxbye #{version}"
+  rdoc.rdoc_files.include('README*')
+  rdoc.rdoc_files.include('lib/**/*.rb')
+end

data/VERSION ADDED

@@ -0,0 +1 @@
+1.0.0

data/config.ru ADDED

@@ -0,0 +1,7 @@
+#!/usr/bin/env ruby
+$LOAD_PATH.unshift ::File.expand_path(::File.dirname(__FILE__) + '/lib')
+require 'kthxbye'
+require 'kthxbye/web_interface'
+Kthxbye::WebInterface.run! :host => "localhost", :port => 4567

data/lib/kthxbye.rb ADDED

@@ -0,0 +1,151 @@
+require 'redis'
+begin
+  require 'yajl'
+rescue LoadError
+  require 'json'
+end
+$LOAD_PATH << './lib'
+require 'kthxbye/config'
+require 'kthxbye/helper'
+require 'kthxbye/job'
+require 'kthxbye/worker'
+require 'kthxbye/failure'
+require 'kthxbye/version'
+require 'kthxbye/exceptions'
+module Kthxbye
+  include Helper
+  extend self
+  #takes in an existing redis instance or simply connects a new instance
+  def connect( redis_instance=nil )
+    @redis = ( redis_instance || Redis.new( :host => Config.options[:redis_server], :port => Config.options[:redis_port] ) )
+  end
+  def redis
+    return @redis if @redis
+    Config.setup
+    self.connect
+    self.redis
+  end
+  def keys
+    redis.keys("*")
+  end
+  # enqueues a job of the given class with its arguments on the given queue
+  def enqueue(queue, klass, *args)
+    Job.create(queue, klass, *args)
+  end
+  # gets the size of a given queue
+  def size(queue)
+    redis.llen("queue:#{queue}").to_i
+  end
+  # gets the latest job off the given queue
+  # returns a Job object
+  def salvage(q)
+    id = redis.lpop( "queue:#{q}" )
+    if id
+      payload = decode( redis.hget( "data-store:#{q}", id ) )
+      return Job.new(id, q, payload)
+    else
+      log "No jobs found in #{q}"
+      return nil
+    end
+  end
+  # lets us peek at the data to be run with a job
+  # can lookup an entire queue or for a specific job id
+  def peek(store, queue, id=nil)
+    if id
+      decode( redis.hget( "#{store}-store:#{queue}", id ) )
+    else
+      all = redis.hgetall( "#{store}-store:#{queue}" )
+      results = {}
+      all.each {|k,v| results[k] = decode( v ) }
+      return results
+    end
+  end
+  # handles a few of our dynamic methods
+  def method_missing(name, *args)
+    method_name = name.id2name
+    if method_name =~ /^(data|result)_peek$/
+      Kthxbye.send(:peek, $1, *args)
+    else
+      super
+    end
+  end
+  # returns all the queues Kthxbye knows about
+  def queues
+    redis.smembers( :queues ).sort
+  end
+  # registers the queue in our "known queues" list
+  def register_queue(queue)
+    redis.sadd(:queues, queue) unless redis.sismember(:queues, queue)
+  end
+  # Removes the queue from the active queue listing but does not delete the queue itself,
+  # which will lead to phantom queues. Use delete_queue for complete removal of a queue.
+  def unregister_queue(queue)
+    redis.srem(:queues, queue)
+  end
+  # Completely removes queue: unregisters it then deletes it
+  # should return true in all cases
+  def delete_queue(queue)
+    unregister_queue(queue)
+    redis.del( "queue:#{queue}" ) || true
+  end
+  # returns all our registered workers
+  def workers
+    workers = redis.smembers( :workers )
+    workers.map {|x| Worker.find( x ) }
+  end
+  # returns all our active workers and the job they are working
+  def working
+    workers = redis.smembers( :working )
+    data = []
+    workers.each do |w_id|
+      data << [w_id, decode( redis.get("worker:#{w_id}") )]
+    end
+    return data
+  end
+  # returns either the job results for a specific job (if id specified)
+  # or all the results for all the jobs on a queue
+  def job_results(queue, id=nil)
+    if id
+      decode( redis.hget( "result-store:#{queue}", id ) )
+    else
+      Array( redis.hgetall( "result-store:#{queue}" ) )
+    end
+  end
+  def inspect
+    {
+      :version => Version,
+      :keys => keys.size,
+      :workers => workers.size,
+      :working => working.size,
+      :queues => queues.size,
+      :failed => Failure.count,
+      :pending => queues.inject(0) {|m,o| m + size(o)}
+    }
+  end
+end
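
The `method_missing` hook in lib/kthxbye.rb routes `data_peek` and `result_peek` to `peek` with the store name filled in. The sketch below reproduces that dispatch pattern in isolation. The `PeekDemo` module and its `STORES` constant are hypothetical stand-ins for the Redis hashes, and the `respond_to_missing?` method is an addition that the original file does not have.

```ruby
# Hypothetical, self-contained demo of the dynamic *_peek dispatch.
module PeekDemo
  extend self

  # Stand-in for the "data-store:[queue]" and "result-store:[queue]" hashes.
  STORES = {
    "data"   => { "jobs" => { "1" => "Hello World" } },
    "result" => { "jobs" => { "1" => "Goodbye World" } }
  }.freeze

  # Returns one entry when an id is given, otherwise the whole queue's hash.
  def peek(store, queue, id = nil)
    entries = STORES.fetch(store).fetch(queue, {})
    id ? entries[id] : entries
  end

  # data_peek(...) becomes peek("data", ...); result_peek(...) becomes peek("result", ...).
  def method_missing(name, *args)
    if name.to_s =~ /\A(data|result)_peek\z/
      peek($1, *args)
    else
      super
    end
  end

  # Not in the original file: keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    name.to_s.match?(/\A(data|result)_peek\z/) || super
  end
end

puts PeekDemo.data_peek("jobs", "1")
puts PeekDemo.result_peek("jobs").inspect
```

Defining `respond_to_missing?` alongside `method_missing` is a common Ruby convention so that `respond_to?(:data_peek)` reports the dynamic methods truthfully.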