RubyGems - updater - Versions diffs - 0.9.3 → 0.9.3.1 - Mend

updater 0.9.3 → 0.9.3.1


Files changed (8)

data/README.markdown CHANGED

@@ -2,12 +2,15 @@ Updater
 =======
 Updater is a background job queue processor.
-It works a bit like delayed_job or rescue,
+It works a bit like
+[delayed_job](http://github.com/tobi/delayed_job)
+or [resque](http://github.com/defunkt/resque),
 processing jobs in the background to allow user facing processes to stay responsive.
 It also allows jobs to be scheduled and run at a particular time.
-It is intended to work with a number of different ORM layers,
-but at the moment only DataMapper is implemented.
-Native support for MongoDB and ActiveRecord are planed.
+It is intended to work with a number of different ORM layers.
+At the moment only DataMapper and MongoDB are implemented.
+Native support for ActiveRecord is planned.
 Get it on GemCutter with `gem install updater`.
 Main Features
@@ -17,11 +20,11 @@ Main Features
 * Intelligent Automatic Scaling
 * Does not poll the database
 * Uses minimal resources when not under load
-* Flexible Job Chaining allows for intelligent error handling and rescheduling
+* Flexible Job Chaining allows for intelligent error handling and rescheduling (see Below)
 * Powerful configuration with intelligent defaults
-* Comes with `rake` tasks
+* Comes with `rake` tasks and binary.
-These feature are as of ver 0.9.1.  See the change log for addational features
+These features are as of ver 0.9.4.  See the change log for additional features
 Use Cases
 ---------
@@ -29,12 +32,12 @@ Use Cases
 Web based applications face two restrictions on their functionality
 as it pertains to data processing tasks.
 First, the application will only run code in response to a user request.
-Updater handles the case of actions or events that need to be triggered
+Updater handles the case of actions or events need to be triggered
 without a request coming in to the web application.
 Second, web applications, particularly those under heavy load,
 need to handle as many requests as possible in a given time frame.
 Updater allows data processing and communication tasks
-to happen outside of the request response cycle
+to happen outside of the request/response cycle
 and makes it possible to move these tasks onto dedicated hardware
 Updater is also useful for circumstances where code needs to be run
@@ -46,7 +49,7 @@ Jobs that are regular and repeating can be run
 more consistently and with fewer resources with `cron`.
 Updater should be considered when the application generates
 a large number of one time events,
-and/or the events need to be regularly manipulated bu the application.
+and/or the events need to be regularly manipulated by the application.
 Updater is also not the optimal solution if the only goal
 is to offload large numbers of immediate tasks.
@@ -57,7 +60,7 @@ by Chris Wanstrath.
 Resque lacks a number of Updater's more powerful features,
 and as of this writing we are not aware of any ability in resque
 to set the time the job is run.
-But rescue does offer much higher potential throughput, and
+But resque does offer much higher potential throughput, and
 a more robust queue structure backed by the Redis key-value store.
 Using Updater
@@ -69,44 +72,62 @@ Initial Installation
 Updater comes packaged as a gem and is published on GemCutter.
 Get the latest version with `gem install updater`
+The Updater source code is located at:
+[http://github.com/antarestrader/Updater](http://github.com/antarestrader/Updater)
 Concepts
 --------
 Updater is not complex to use but it does, of necessity, have a number of *moving parts*.
 In order to use updater successfully two tasks must be accomplished:
 your application (referred to as the client) must be able to add jobs to the queue,
-and a separate process (the server) must be setup and run
-which will preform the actions specified in those jobs.
+and a separate process (called the server, worker or job processor)
+must be set up and run -- potentially on separate hardware.
+It will perform the actions specified in those jobs.
 Jobs are stored in a data store that is shared between client and server.
 Usually this data store will be a table in your database.
-Other data stores are possible, but require significantly configuration.
+Other data stores are possible, but require significantly more configuration.
 Updater is designed to have a minimal impact on the load of the data store,
 and it therefore should be a reasonable solution for most applications.
 (For a discussion of when this is not a reasonable solution see
 [the Resque Blog Post](http://github.com/blog/542-introducing-resque))
-Updater is very flexible about what can be run in as a background job.
+Updater is *very* flexible about what can be run as a background job,
+and this distinguishes it from other background job processors.
 The Job will call a single method on either a Class or an instance.
 The method must be public.
-Any arguments can be passed to the method so long as the can be marshaled.
+Any arguments can be passed to the method so long as they can be marshaled.
 It is important to keep in mind that the job will be run in a completely separate process.
 Any global state will have to be recreated,
 and data must be persisted in some form in order to be seen by the client.
 For web applications this is usually not an issue.
 Calling a class method is fairly straightforward,
-but calling a method on in instance take a little more work.
+but calling a method on an instance takes a little more work.
 Instances must be somehow persisted in the client
 then reinstantiated on the worker process.
-The assumption is that this will be don through the ORM and data store.
-Each ORM adapter in Updater lists a defaults methods
+The assumption is that this will be done through the ORM and data store.
+Each ORM adapter in Updater lists default methods
 for retrieving particular instances from the data store.
 When an instance is scheduled as the target of a job,
 its class and id will be stored in the `updates` table.
-When the job is run it will first use the it to pull the instance out of the data store,
+When the job is run,
+it will first use this class to pull the instance out of the data store,
 then call the appropriate method on that instance.
+(*Notes on nomenclature*:
+Jobs which run methods on a class are referred to throughout the documentation as "class type jobs",
+while jobs which run methods on instances are called "instance type jobs."
+The *target* of a job is the class or instance upon which the method is called.
+A "conforming instance" is an instance of some class
+which is persisted in the datastore
+and can be found by calling the default `finder_method` on its class
+using the value returned by the default `finder_id` method.
+ActiveRecord or DataMapper model instances are conforming instances
+when updater is configured to use that ORM.
+)
 Client Setup
 ------------
@@ -117,7 +138,12 @@ The `client_setup` method is responsible for establishing interprocess communica
 and selecting the correct ORM adapter for Updater.
 It does this using the configuration file discussed later in this document.
 This method can take an optional hash that will override the options in the configuration file.
-It can also be passes a `:logger` option which will use the passed in logger instance for updater logging.
+It can also be passed a `:logger` option.
+With some ORM/datastore choices (only MongoDB at the moment)
+it will also be necessary to pass the datastore connection to
+`Updater::ORM::<<OrmCklass>>.setup`.
+See the Updater documentation for your ORM/datastore.
 Scheduling Jobs
 ---------------
@@ -140,7 +166,7 @@ If this is unspesified `:preform` is asumed (a la Resque)
 Either leave this blank, or set to `[]` to call without arguments.
 All members of the array must be marshalable.
-**options**: a hash of extra information, details can be found in the Options section.
+**options**: a hash of extra information, details can be found in the Options section of Updater::Update#at.
 We intend to add a module that can be included into a target class
 that will allow scheduling in the same general manner as delayed_job.
@@ -165,17 +191,20 @@ The configuration itself is a ERb interpreted YAML file.
 This is of use in limiting repetition,
 and in changing options based on the environment (test/development/production)
+**Warning:** in its standard configuration,
+the config file will be read by the server to determine how to boot the app.
+This has the unfortunate side effect that the framework's settings
+will not be available when this file is processed by ERb.
 Please see the options section for details about the various options in this file.
 Starting Workers (Server)
 -------------------------
-In the parlance of background job processing,
-a process that executes jobs is known as a worker.
 The recommended way to start workers is through a rake task.
 First, include `updater/tasks` in your application's Rakefile.
 This will add start, stop and monitor tasks into the `updater` namespace.
-`start` will use the options in your configuration file to start  a worker process.
+`start` will use the options in your configuration file to start a worker process.
 Likewise, `stop` will shut that process down.
 The monitor task will start an http server
 that you can use to monitor and control the job queue and workers.
@@ -186,3 +215,304 @@ which monitors the work load and starts or stops individual workers as needed
 within the limits established in the configuration file.
 You should, therefore, only need to use `start` once.
+Options:
+--------
+Options may be set in configuration file or passed in at runtime.
+### General Configuration ###
+*   `:orm`
+    A string representing the ORM layer to use.
+    The default is `datamapper` but this value should be set by all users of
+    versions < 1.0.0 as the default may change to `activerecord` once that ORM is implemented.
+    Currently Updater supports `datamapper` and `mongodb`.
+    Support for `activerecord` (>=3.0.0 only) will be implemented sometime after Rails 3 is released.
+    Support for Redis is under investigation, patches welcome.
+*   `:pid_file`
+    This file will be created by the server and read by the client.
+    Process signals are used as an alternate means of communication between client and server,
+    and rake tasks make use of this file to start and stop the server.
+    The default is `ROOT\updater.pid` where ROOT is the location of the config file
+    or failing that the current working directory.
+*   `:database`
+    A hash of options passed to the Updater::ORM and used to establish a connection to the datastore,
+    and for other ORM specific setup.  See the Updater documentation for your chosen ORM.
+*   `:config_file`
+    Sets an alternate path to the config file.  Obviously useless in the actual config file,
+    this option can nonetheless be passed directly to client and server setup methods as
+    an extended option.  (See the cascade test.)  It can also be set by the command line binary
+    using the `-c` option.
+### Server Setup Options ###
+*   `:timeout`
+    Used only by the server,
+    this is the length of time (in seconds) that the server will wait before killing a worker.
+    It should be set to twice the length of the longest running job chain.
+    Because the master worker process will kill off jobs that run too long,
+    it is suggested that long jobs either be broken into smaller pieces using chains,
+    placed in a special long running job queue,
+    or forked off the worker process.
+*   `:workers`
+    This sets the maximum number of workers a single master server process may start.
+    Each worker type has its own default, the recommended default `fork_worker` uses 3.
+    The defaults are *very* conservative, and so long as there are sufficient hardware
+    resources, values of 20 or more are not out of the question.
+    The master worker process implements a rather sophisticated heuristic
+    that adjusts the number of workers actually spun up to match the current load.
+    **Note:** It is likely that this option will be replaced by :max_workers before
+    version 1.0, and that a :min_workers option will be added with a default of 1.
+    Updater ignores unknown options so it is safe to set :min_workers and :max_workers
+    in anticipation of this change.
+*   `:worker` (note singular)
+    This option is a string which tells Updater which kind of worker to use.
+    This option is only used by the server.
+    Options are `fork` or `thread` with a `simple` planned
+    either before 1.0 or 1.2 depending on what the author needs.
+    The default is 'fork' which is *strongly* recommended in production,
+    but is not compatible with Microsoft Windows, and *may* be sub-optimal with JRuby.
+    Windows users **must** set this option to `thread`.
+*   `:models`
+    This is actually an array of file names that the Server will require in order.
+    Many users will simply put a single file that loads their whole framework here.
+    (eg. `config/environment.rb` for Rails)
+    These files must allow the server to set up a Ruby environment in which all possible
+    job targets can be found, and the methods on those targets can be run.
+    An application that makes only minimal use of Updater,
+    and whose target classes and methods are carefully written,
+    might be able to only require a subset of the full application,
+    thus saving on system resources and improving start times.
+### Logging ###
+*   `:logger`
+    An instance of a Ruby Logger or another object that uses the same interface.
+    See Also :log_file and :log_level, which this option supersedes.
+*   `:log_file`
+    The file to which Updater will log its actions.
+    Most logging is done by the server.
+    If no file is given STDOUT is assumed.
+    Note that if the `:logger` option is set, this option is ignored.
+*   `:log_level`
+    One of the standard logging levels (failure error warn info debug).
+    Updater will accept either symbols or strings and will automatically upcase this value.
+    The default value is `warn`.
+    Note that if the `:logger` option is set, this option is ignored.
+    It should be noted that the server produces a prodigious amount of data at the debug level.
+    (several MB per day without any jobs; several MB per minute under load)
+    We therefore strongly recommend that the server log level not be set below info without cause.
+    The client on the other hand is quite safe even at the debug level in development and staging environments.
+### IPC ###
+Any or all or none of these options may be given.
+If the option is not given the communications channel will not be used.
+The server will listen on all channels given,
+while clients will communicate on the "best" only.
+Options are listed from "best" to "worst."
+If the client cannot use any of these options,
+it will use process signals as a last resort.
+These methods of communication merely signal to a worker process that a job
+has been placed in the data store.  The client and server still must have access
+to the same datastore.
+*   `:socket`
+    The path to a UNIX socket.
+    The server will create and listen on this socket, clients can connect to it.
+    This option is only viable for a server running on the same machine as the client,
+    and will not work on Windows.
+*   `:udp`
+    The port number for UDP communications.
+    This is the preferred option for a cluster configuration.
+    **Security Notice:** Updater makes no effort to verify the authenticity of
+    network connections.  Administrators should configure network topology and firewalls
+    to ensure that only intended clients can communicate with the Updater server.
+*   `:tcp`
+    The port number for TCP communications.
+    This is the preferred option for VPN connections between remote locations.
+    **Security Notice:** Updater makes no effort to verify the authenticity of
+    network connections.  Administrators should configure network topology and firewalls
+    to ensure that only intended clients can communicate with the Updater server.
+*   `:host`
+    The host name for UDP and TCP connections.
+    The default is 'localhost'.
+    See security warnings above.
+*   `:remote` (client only) (**Pending**)
+    This is the url of a server monitor.
+    This is the preferred option for remote operations over an unsecured network.
+    On an unsecured network, authentication becomes necessary.
+    The server core is not equipped for authentication.
+    Instead, a monitor server is started.
+    This monitor has a secured connection to the worker master process using one of the methods above.
+    The monitor receives HTTP POST requests from authenticated clients,
+    and translates them into job-ready notifications.
+*   `:sockets` (note plural)
+    Generally for internal use.
+    This is an array of established Socket connections
+    that are passed directly to the worker master process.
+    The server will listen for new connections on these sockets.
+    This cannot be set in the configuration file,
+    it may only be passed as an option to Updater::Setup#start.
+Chained Jobs:
+=============
+One of the most exciting features of Updater is Job Chaining.
+Each job has three queues
+(`:success`, `:ensure` and `:failure`)
+that point to other jobs in the queue.
+These jobs are run after the initial job completes
+depending on whether the job finished without raising an error.
+Jobs can in this way form a tree
+(processed depth first)
+of related tasks.
+This allows for code reuse,
+and extreme flexibility when it comes to tasks such as
+error handling, logging, auditing, and the like.
+Updater will eventually come with a standard library of chained jobs
+which will be found in the Updater::Chains class.
+(TODO: Chains are being written for the 0.9 version in response to developer needs.
+Watch point releases for new chained methods.)
+Adding Chained Jobs
+-------------------
+Jobs can be created with chained jobs by passing
+`:success`, `:ensure` and/or `:failure`
+as options to any of the job queuing methods.
+The value of these keys can be a job, an array of jobs,
+or a hash where keys are jobs and values are options passed into the `__params__` argument (see below).
+(*Notes on nomenclature*:
+An initial job is one that was scheduled and run in the regular fashion
+and not as a result of any chain.
+A chained job is a job that is run by another job in response to a chain.
+)
+Example:
+    # Assume self is a conforming instance
+    # Create a job to chain into
+    logging_job = Updater::Update.chained(MyLoggingClass,:log_errors,[:__job__,:__params__])
+    # Create a job that will call this job in the case of an error
+    Updater::Update.immidiate(
+        self,
+        :some_method_that_might_fail,
+        [val1,val2],
+        :failure=>{logging_job=>{:message=>"an Epic Fail"}}
+      )
+    # [...]
+    class MyLoggingClass
+      def self.log_errors(job,options)
+        logger.error "There was #{options[:message] || "failure"} while processing a job:  \n %s" % job.error.message
+        logger.debug job.error.backtrace.join("\n")
+      end
+    end
+Here, the worker will recreate `self` by pulling its information from the datastore.
+The worker will then send `:some_method_that_might_fail` to that instance with `val1` and `val2`.
+If `:some_method_that_might_fail` raises an error,
+the worker will then run `logging_job`.
+This job will send :log_errors to the `MyLoggingClass` class replacing `:__job__` with the instance of the job that failed,
+and `:__params__` with `{:message=>"an Epic Fail"}`.
+`MyLoggingClass` can use the first argument to get the error that `:some_method_that_might_fail` raised.
+Chained methods can also be added after a job is created by inserting them into the appropriate array.
+Notice however that an immidiate job may have already run before you have the chance to add a chained job.
+Example:
+    # Similar to above
+    # Create a job to chain into
+    logging_job = Updater::Update.chained(MyLoggingClass,:log_errors,[:__job__,:__params__])
+    # Create a job that will call this job in the case of an error
+    initial_job = Updater::Update.in(
+        5.minutes,
+        self,
+        :some_method_that_might_fail,
+        [val1,val2])
+    initial_job.failure << logging_job
+Writing Chained Jobs
+--------------------
+It is intended that chained jobs be reused.
+The examples above created a new job to be chained for each initial job.
+This is inefficient and would fill the datastore with unnecessary repetition.
+Instead, chained jobs should be placed into the datastore on first use,
+then referred to by each new initial job.
+To facilitate this Updater implements three special fields in the arguments list
+which are replaced with metadata before a job is called:
+* `__job__`: replaced with the instance of Updater::Update that chained into
+  this job.  If the job failed (that is, raised an error while being run), this
+  instance will contain an error field with that error.
+* `__params__`: this is an optional field of a chain instance.  It allows the
+  chaining job to set specific options for the chained job to use. For example
+  a chained job that reschedules the original job might take an option
+  defining how frequently the job is rescheduled.  This would be passed in
+  the params field.  (See example in Updater::Chained -- Pending!)
+* `__self__`:  this is simply set to the instance of Updater::Update that is
+  calling the method.  This might be useful for both chained and original
+  jobs that find a need to manipulate or inspect the job that called them.
+  Without this field, it would be impossible for a method to consistently
+  determine whether it had been run from a background job or invoked
+  directly by the app.
+Chained jobs can take advantage of these parameters to respond appropriately without
+having to have a new chain job for each initial job.
+Example: We could replace the `logging_job` above like this
+    class MyLoggingClass
+      def self.logging_job
+        # We will memoize this value so we don't have to hit the datastore each time.
+        # If the job is already in the datastore, we will find it and use it,
+        # Otherwise, we will create it from scratch.
+        @logging_job ||= Updater::Update.for(self,'logging') || Updater::Update.chained(self,:log_errors,[:__job__,:__params__], :name=>'logging')
+      end
+      def self.log_errors(job,options)
+        # [...] As above
+      end
+    end
+    # [...]
+    Updater::Update.immidiate(
+        self,
+        :some_method_that_might_fail,
+        [val1,val2],
+        :failure=>{MyLoggingClass.logging_job=>{:message=>"an Epic Fail"}}
+      )
+See Also: Once it is started, see the example in Updater::Chains -- pending
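
The chaining behavior described in the README additions above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `ToyJob`, `Flaky`, and `ErrorLog` are hypothetical names invented here, nothing is persisted, and none of this is Updater's actual implementation. It shows the three chains, the depth-first order, and the substitution of `:__job__`, `:__params__`, and `:__self__` in the argument list.

```ruby
# Toy model of job chaining. NOT the Updater API; every name is hypothetical.
class ToyJob
  attr_reader :error
  attr_accessor :params

  def initialize(target, method_name, args = [])
    @target = target
    @method_name = method_name
    @args = args
    @chains = { success: [], failure: [], ensure: [] }
  end

  # Attach a chained job, optionally with params handed to it at run time.
  def chain(kind, job, chain_params = nil)
    @chains.fetch(kind) << [job, chain_params]
    self
  end

  # Run the target method, then the matching chain, then the :ensure chain.
  # Chained jobs run to completion before the next one starts (depth first).
  def run(previous = nil)
    @error = nil
    begin
      @target.public_send(@method_name, *substitute(previous))
    rescue StandardError => e
      @error = e
    end
    run_chain(@error ? :failure : :success)
    run_chain(:ensure)
    @error.nil?
  end

  private

  # Replace the special placeholders with metadata about the calling job.
  def substitute(previous)
    @args.map do |arg|
      case arg
      when :__job__    then previous
      when :__params__ then params
      when :__self__   then self
      else arg
      end
    end
  end

  def run_chain(kind)
    @chains.fetch(kind).each do |job, chain_params|
      job.params = chain_params
      job.run(self)
    end
  end
end

class Flaky
  def self.explode(value)
    raise ArgumentError, "bad value #{value}"
  end
end

class ErrorLog
  ENTRIES = []

  def self.log_errors(job, options)
    ENTRIES << "#{options[:message]}: #{job.error.message}"
  end
end

logging_job = ToyJob.new(ErrorLog, :log_errors, [:__job__, :__params__])
initial = ToyJob.new(Flaky, :explode, [42])
initial.chain(:failure, logging_job, { message: "an Epic Fail" })
initial.run
puts ErrorLog::ENTRIES.first
```

Because the logging job receives its message through the params slot, the same chained job can be attached to any number of initial jobs, which is the reuse the README argues for.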

data/Rakefile CHANGED

@@ -10,8 +10,8 @@ GEM_NAME = "updater"
 GEM_VERSION = File.read(VERSION_FILE).strip
 AUTHOR = "John F. Miller"
 EMAIL = "emperor@antarestrader.com"
-HOMEPAGE = "http://blog.antarestrader.com"
-SUMMARY = "A Gem for queuing methods for later calling which is ORM Agnostic, and has advanced Error Handling"
+HOMEPAGE = "http://github.com/antarestrader/Updater"
+SUMMARY = "A job queue which is ORM Agnostic and has advanced Error Handling"
 spec = Gem::Specification.new do |s|
   s.name = GEM_NAME

data/VERSION CHANGED

@@ -1 +1 @@
-0.9.3
+0.9.3.1

data/lib/updater/fork_worker.rb CHANGED

@@ -382,7 +382,7 @@ module Updater
     # would be to wait until it is ready then run the next job the wake and run it.  There are two difficulties here
     # the first is the need to let the master process know that the worker is alive and has not hung.  We use a
     # heartbeat file discriptor which we periodically change ctimes on by changing its access mode.  This is
-    # modeled the technique used in the Unicorn web server.  Our difficult is that we must be prepaired for a
+    # modeled on the technique used in the Unicorn web server.  Our difficulty is that we must be prepared for a
     # much less consistent load than a web server.  Within a single application there may be periods where jobs
     # pile up and others where there is a completely empty queue for hours or days.  There is also the issue of
     # how long a job may take to run.  Jobs should generally be kept on the order of +timeout+ seconds.
@@ -396,7 +396,7 @@ module Updater
     # the pipe every time one is present.  The +smoke_pipe+ method handles this by attempting to remove a
     # charactor from the pipe when it is called.
     def wait_for(delay)
-      return unless @continue
+      return unless @continue #we're dead go back to run and break out of the main loop
       delay ||= 356*24*60*60 #delay will be nil if there are no jobs.  Wait a really long time in that case.
       if delay <= 0 #more jobs are immidiatly availible
         smoke_pipe(@stream)
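
The comments in this hunk describe a worker that sleeps until either a delay expires or a character arrives on a pipe. A minimal sketch of that idea, assuming nothing about the real `fork_worker.rb` beyond what the comments say: the method names and return symbols below are hypothetical, and only `IO.pipe`, `IO.select`, and `read_nonblock` are real Ruby APIs.

```ruby
# Hypothetical sketch of "sleep until woken or timed out" using a pipe.
reader, writer = IO.pipe

# Remove at most one character from the pipe without blocking.
def drain_one(io)
  io.read_nonblock(1)
rescue IO::WaitReadable, EOFError
  nil
end

# Wait up to +delay+ seconds for a wake-up character on +reader+.
# A nil delay means "no jobs scheduled", so wait for a very long time.
def wait_for(reader, delay)
  delay ||= 365 * 24 * 60 * 60
  if delay <= 0
    drain_one(reader) # a job is already due; just clear any pending wake-up
    return :run_now
  end
  ready = IO.select([reader], nil, nil, delay)
  return :timed_out unless ready

  drain_one(reader)
  :woken
end

first = wait_for(reader, 0.05) # nothing has been written yet
writer.write(".")              # a client signals that a job was added
second = wait_for(reader, 5)   # returns as soon as the byte is readable
third = wait_for(reader, 0)    # non-positive delay: do not sleep at all
puts [first, second, third].inspect
```

Draining a single character per call keeps one wake-up per notification, so several queued notifications each get their own pass through the loop.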

data/lib/updater/orm/mongo.rb CHANGED

@@ -6,7 +6,7 @@ module Updater
   module ORM
     class Mongo
-      FINDER= :get
+      FINDER= :find_one
       ID=:_id
       def initialize(hash = {})
@@ -183,10 +183,12 @@ module Updater
         # * :port - the port to connect to.  Default: 27017
         # * :username/:password - if these are present, they will be used to authenticate against the database
         def setup(options)
-          logger ||= options[:logger]
-          raise ArgumentError, "Must spesify the name of a databas when setting up Mongo driver" unless options[:database]
+          logger ||= options[:logger] || Update.logger
+          raise ArgumentError, "Must spesify the name of a database when setting up Mongo driver" unless options[:database]
           if options[:database].kind_of? ::Mongo::DB
             @db = options[:database]
+            options[:database] = @db.name
+            logger.info "Updater is using already established connection to #{@db.name}"
           else
             logger.info "Attempting to connect to mongodb at #{[options[:host] || "localhost", options[:port] || 27017].join(':')} database: \"#{options[:database]}\""
             @db = ::Mongo::Connection.new(options[:host] || "localhost", options[:port] || 27017).db(options[:database].to_s)
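
The `setup` change above accepts either a database name or an already opened database object, and writes the plain name back into the options hash. The pattern can be sketched without the Mongo driver. In this sketch `FakeDB` and `resolve_database` are hypothetical stand-ins, and the "connection" is simulated rather than opened.

```ruby
require "logger"

# Stand-in for an opened database handle (hypothetical; not the Mongo driver).
FakeDB = Struct.new(:name)

# Accept either a handle or a name under options[:database].
# Whichever was given, options[:database] ends up holding the plain name.
def resolve_database(options, default_logger = Logger.new(nil))
  logger = options[:logger] || default_logger
  raise ArgumentError, "Must specify the name of a database" unless options[:database]

  if options[:database].is_a?(FakeDB)
    db = options[:database]
    options[:database] = db.name
    logger.info "using already established connection to #{db.name}"
  else
    host = options[:host] || "localhost"
    port = options[:port] || 27017
    logger.info "would connect to #{host}:#{port} database #{options[:database]}"
    db = FakeDB.new(options[:database].to_s)
  end
  db
end

opts = { database: FakeDB.new("jobs") }
handle = resolve_database(opts)
puts opts[:database]
```

Normalizing the option to a name means later code that logs or serializes the options never has to care which form the caller supplied.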

data/lib/updater/setup.rb CHANGED

@@ -39,7 +39,7 @@ module Updater
     ROOT = File.dirname(self.config_file || Dir.pwd)
-    #extended used for clients who wnat to override parameters
+    #extended used for clients who want to override parameters
     def initialize(file_or_hash, extended = {})
       @options = file_or_hash.kind_of?(Hash) ? file_or_hash : load_file(file_or_hash)
       @options.merge!(extended)
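
The constructor above takes either a hash or a file, then merges caller overrides on top, and the README states that the file is ERb-interpreted YAML. A minimal sketch of that flow using only the standard library follows. `load_options` is a hypothetical helper, and the use of string keys with `symbolize_names` is an assumption made here for portability, not a statement about the file format Updater expects.

```ruby
require "erb"
require "yaml"
require "tempfile"

# Hypothetical helper: build an options hash from a Hash or an ERb/YAML file,
# then let the caller's overrides win.
def load_options(file_or_hash, extended = {})
  options =
    if file_or_hash.is_a?(Hash)
      file_or_hash.dup
    else
      rendered = ERB.new(File.read(file_or_hash)).result
      YAML.safe_load(rendered, symbolize_names: true) || {}
    end
  options.merge(extended)
end

config = Tempfile.new(["updater", ".yml"])
config.write(<<~'CONFIG')
  orm: datamapper
  workers: <%= 1 + 2 %>
  log_level: warn
CONFIG
config.flush

options = load_options(config.path, { log_level: "debug" })
puts options.inspect
config.close!
```

ERb runs before the YAML parser sees the text, so any Ruby expression in the file is evaluated in whatever environment is loaded at that moment, which is why the README warns that framework settings may not be available yet.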

data/lib/updater/update.rb CHANGED

@@ -2,11 +2,19 @@ module Updater
   class TargetMissingError < StandardError
   end
-  #the basic class that drives updater
+  #The basic class that drives Updater. See Readme for usage information.
   class Update
     # Contains the Error class after an error is caught in +run+. Not stored to the database
     attr_reader :error
+    # Contains the underlying ORM instance (eg. ORM::Datamapper or ORM Mongo)
     attr_reader :orm
+    # In order to reduce the proliferation of chained jobs in the queue,
+    # job chain requests are allowed a params value that will pass
+    # specific values to a chained method.  When a chained instance is
+    # created, the job processor will set this value.  It will then be sent
+    # to the target method in place of '__params__'.  See #sub_args
     attr_accessor :params
     #Run the action on this target, completing any chained actions
@@ -34,25 +42,34 @@ module Updater
       ret
     end
+    #see if this method was intended for the underlying ORM layer.
     def method_missing(method, *args)
       @orm.send(method,*args)
     end
+    # Determines and, if necessary, finds or creates the target for this instance.
+    #
+    # Warning: This value is intentionally NOT memoized.  For instance type targets, it will result in a call to the datastore
+    # (or the recreation of an object) on EACH invocation.  Methods that need to refer to the target more than once should
+    # take care to store this value locally after initial retrieval.
     def target
       target = @orm.finder.nil? ? @orm.target : @orm.target.send(@orm.finder,@orm.finder_args)
       raise TargetMissingError, "Target missing --Class:'#{@orm.target}' Finder:'#{@orm.finder}', Args:'#{@orm.finder_args.inspect}'" unless target
       target
     end
+    # orm_inst must be set to an instance of the class Update.orm
     def initialize(orm_inst)
-      raise ArgumentError if orm_inst.nil?
+      raise ArgumentError if orm_inst.nil? || !orm_inst.kind_of?(orm)
       @orm = orm_inst
     end
+    #Jobs may be named to make them easier to find
     def name=(n)
       @orm.name=n
     end
+    #Jobs may be named to make them easier to find
     def name
       @orm.name
     end
@@ -66,6 +83,7 @@ module Updater
       id = other.id
     end
+    # If this is true, the job will NOT be removed after it is run.  This is usually true for chained Jobs.
     def persistant?
       @orm.persistant
     end
@@ -77,7 +95,31 @@ module Updater
     end
   private
+    # == Use and Purpose
+    # Takes a previous job and the original array of arguments from the data store.
+    # It replaces three special values with meta information from Updater.  This is
+    # done to allow chained jobs to respond to specific conditions in the originating
+    # job.
+    #
+    # ==Substitutions
+    # The following strings are replaced with meta information from the calling job
+    # as described below:
+    #
+    # * '__job__': replaced with the instance of Updater::Update that chained into
+    #   this job.  If the job failed (that is, raised an error while being run), this
+    #   instance will contain an error field with that error.
+    # * '__params__': this is an optional field of a chain instance.  It allows the
+    #   chaining job to set specific options for the chained job to use. For example
+    #   a chained job that reschedules the original job might take an option
+    #   defining how frequently the job is rescheduled.  This would be passed in
+    #   the params field.  (See example in Updater::Chained -- Pending!)
+    # * '__self__':  this is simply set to the instance of Updater::Update that is
+    #   calling the method.  This might be useful for both chained and original
+    #   jobs that find a need to manipulate or inspect the job that called them.
+    #   Without this field, it would be impossible for a method to consistently
+    #   determine whether it had been run from a background job or invoked
+    #   directly by the app.
     def sub_args(job,a)
       a.map do |e|
         begin
@@ -101,22 +143,49 @@ module Updater
       end# map
     end #def
+    # Invoked by the runner with the name of a chain (:success, :failure, :ensure),
+    # this method takes each chained job and runs it to completion. (Depth First Search of the chain tree)
     def run_chain(name)
       chains = @orm.send(name)
       return unless chains
       chains.each do |job|
         job.run(self)
       end
-    rescue NameError
-      puts @orm.inspect
+    rescue NameError
+      # There have been a number of bugs caused by the @orm instance not being what was expected when
+      # the ORM layer returned a chain.  This error, if produced, will propagate to the worker where it is caught
+      # and logged, but to prevent a complete crash of the system, it is then ignored and the next job is run.
+      # This is here to help catch and debug this type of error in ORM layers, particularly 3rd party ORMs.
+      self.class.logger.error "Something is wrong with the ORM value in a chained call \n From (%s:%s):\n%s" % [__FILE__,__LINE__,@orm.inspect]
       raise
     end
     class << self
-      #This attribute must be set to some ORM that will persist the data
+      # This attribute must be set to some ORM that will persist the data.  The value is normally set
+      # using one of the methods in Updater::Setup.
       attr_accessor :orm
+      # This is the application level default method to call on a class in order to find/create a target
+      # instance (e.g. find, get, find_one).  In most circumstances the ORM layer defines an
+      # appropriate default and this does not need to be explicitly set.
+      #
+      # MongoDB is one significant exception to this rule.  The Updater Mongo ORM layer uses the
+      # 10gen MongoDB driver directly without an ORM such as Mongoid or Mongo_Mapper.  If the
+      # application uses one of these ORMs, #finder_method and #finder_id should be explicitly set.
+      attr_accessor :finder_method
+      # This is the application level default method to call on an instance type target.  It should
+      # return a value to be passed to the #finder_method (above) in order to retrieve the instance
+      # from the datastore (e.g. id).  In most circumstances the ORM layer defines an
+      # appropriate default and this does not need to be explicitly set.
+      #
+      # MongoDB is one significant exception to this rule.  The Updater Mongo ORM layer uses the
+      # 10gen MongoDB driver directly without an ORM such as Mongoid or Mongo_Mapper.  If the
+      # application uses one of these ORMs, #finder_method and #finder_id should be explicitly set.
+      attr_accessor :finder_id
       #remove once Bug is discovered
       def orm=(input)
         raise ArgumentError, "Must set ORM to and appropriate class" unless input.kind_of? Class
@@ -126,8 +195,11 @@ module Updater
       # This is an open IO socket that will be written to when a job is scheduled. If it is unset
       # then @pid is signaled instead.
       attr_accessor :socket
+      # Instance of a conforming logger.  This will be created if it is not explicitly set.
       attr_writer :logger
+      # Returns the logger instance.  If it has not been set, a new Logger will be created pointing to STDOUT
       def logger
         @logger ||= Logger.new(STDOUT)
       end
@@ -146,6 +218,7 @@ module Updater
         clear_locks(worker)
       end
+      #Ensure that a worker no longer holds any locks.
       def clear_locks(worker); @orm.clear_locks(worker); end
       # Request that the target be sent the method with args at the given time.
@@ -193,25 +266,34 @@ module Updater
       # they are set.  See +for+ for examples
       #
       # :failure, :success, :ensure <Updater::Update instance> another request to be run when the request completes.  Usually these
-      # valuses will be created with the +chained+ method.  As an alternative a hash with keys of Updater::Update instances and
+      # values will be created with the +chained+ method.
+      # As an alternative a Hash (OrderedHash in ruby 1.8) with keys of Updater::Update instances and
       # values of Hash may be used.  The hash will be substituted for the '__param__' argument if/when the chained method is called.
       #
       # :persistant <true|false> if true the object will not be destroyed after the completion of its run.  By default
       # this is false except when time is nil.
       #
+      # ===Note:
+      #
+      # Unless finder_args is passed, a non-class target will be asked for its ID value using #finder_id
+      # or if that is not set, then the default value defined in the ORM layer.  Particularly for MongoDB
+      # it is important that #finder_id be set to an appropriate value since the Updater ORM layer uses
+      # the low level MongoDB driver instead of a more feature complete ORM like Mongoid.
+      #
       # == Examples
       #
       #    Updater.at(Chronic.parse('tomorrow'),Foo,:bar,[]) # will run Foo.bar() tomorrow at midnight
       #
       #    f = Foo.create
       #    u = Updater.at(Chronic.parse('2 hours from now'),f,:bar,[]) # will run Foo.get(f.id).bar in 2 hours
+      # == See Also
+      #
+      # +in+, +immidiate+ and +chain+ which share the same arguments and options but treat time differently
       def at(t,target,method = nil,args=[],options={})
         hash = Hash.new
         hash[:time] = t.to_i unless t.nil?
-        hash[:target],hash[:finder],hash[:finder_args] = target_for(target)
-        hash[:finder] = options[:finder] || hash[:finder]
-        hash[:finder_args] = options[:finder_args] || hash[:finder_args]
+        hash[:target],hash[:finder],hash[:finder_args] = target_for(target, options)
         hash[:method] = method || :perform
         hash[:method_args] = args
@@ -283,7 +365,7 @@ module Updater
             #The time class used by Updater.  See time=
       def time
-        @@time ||= Time
+        @time ||= Time
       end
       # By default Updater will use the system time (Time class) to get the current time.  The application
@@ -291,7 +373,7 @@ module Updater
       # allows us to substitute a custom class for Time.  This class must respond with an integer or Time to
       # the #now method.
       def time=(klass)
-        @@time = klass
+        @time = klass
       end
       # A filter for all requests that are ready to run, that is they requested to be run before or at time.now
@@ -318,7 +400,7 @@ module Updater
       end
       #Remove all scheduled jobs.  Mostly intended for testing, but may also be useful in cases of crashes
-      #or system corruption
+      #or system corruption.  Removes all pending jobs.
       def clear_all
         @orm.clear_all
       end
@@ -334,20 +416,22 @@ module Updater
       #in another way.
       def pid=(p)
         return @pid = nil unless p #tricky assignment in return
-        @pid = Integer("#{p}")
-        Process::kill 0, @pid
+        @pid = Integer("#{p}") #safety check that prevents a corrupted PID file from crashing the system
+        Process::kill 0, @pid #check that the process exists
         @pid
       rescue Errno::ESRCH, ArgumentError
         @pid = nil
         raise ArgumentError, "PID was invalid"
       end
+      # The PID of the worker process
       def pid
         @pid
       end
     private
       def signal_worker
+        # TODO: If worker process goes down or has to be reset, try to reconnect
         if @socket
           @socket.write '.'
         elsif @pid
@@ -356,12 +440,15 @@ module Updater
       end
       # Given some instance return the information needed to recreate that target
-      def target_for(inst)
+      def target_for(inst,options = {})
         return [inst, nil, nil] if (inst.kind_of?(Class) || inst.kind_of?(Module))
-        [inst.class,@orm::FINDER,inst.send(orm::ID)]
+        [ inst.class, #target's class
+          options[:finder] || @finder_method || orm::FINDER, #method to call on target's class to find/create target
+          options[:finder_args] || inst.send(@finder_id || orm::ID) #value to pass to above method
+        ]
       end
-    end
+    end # class << self
   end #class Update
 end #Module Updater
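
The fallback order that the revised `target_for` applies (per-call options, then the application-level `finder_method`/`finder_id`, then the ORM layer's constants) can be tried in isolation. The following is only a minimal sketch of that precedence, not the gem's code: `FakeORM`, `Widget`, and `resolve_target` are hypothetical names, and the `FINDER`/`ID` constants stand in for whatever defaults a real ORM layer defines.

```ruby
# Hypothetical stand-in for an ORM layer that defines default finder constants.
module FakeORM
  FINDER = :get
  ID     = :id
end

# Hypothetical application model with two possible lookup keys.
Widget = Struct.new(:id, :slug)

# Mirrors the precedence shown in the diff: per-call options win, then the
# application-level defaults, then the ORM layer's constants.
def resolve_target(inst, orm, options = {}, finder_method = nil, finder_id = nil)
  # A class or module is its own target, so no finder information is needed.
  return [inst, nil, nil] if inst.is_a?(Module)

  [
    inst.class,                                               # target's class
    options[:finder] || finder_method || orm::FINDER,         # lookup method
    options[:finder_args] || inst.send(finder_id || orm::ID)  # lookup argument
  ]
end

widget = Widget.new(42, "blue-widget")

ORM_DEFAULTS  = resolve_target(widget, FakeORM)
APP_DEFAULTS  = resolve_target(widget, FakeORM, {}, :find_one, :slug)
CALL_OVERRIDE = resolve_target(widget, FakeORM,
                               { :finder => :first, :finder_args => [1] },
                               :find_one, :slug)
CLASS_TARGET  = resolve_target(Widget, FakeORM)
```

With no overrides the ORM constants are used; setting the application-level values switches the lookup to `find_one` with the `slug` value; and explicit `:finder`/`:finder_args` options take priority over both. A class target short-circuits and carries no finder information at all.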

metadata CHANGED

@@ -6,7 +6,8 @@ version: !ruby/object:Gem::Version
   - 0
   - 9
   - 3
-  version: 0.9.3
+  - 1
+  version: 0.9.3.1
 platform: ruby
 authors:
 - John F. Miller
@@ -14,7 +15,7 @@ autorequire:
 bindir: bin
 cert_chain: []
-date: 2010-08-25 00:00:00 -07:00
+date: 2010-08-26 00:00:00 -07:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
@@ -77,7 +78,7 @@ dependencies:
         version: 0.2.3
   type: :development
   version_requirements: *id004
-description: A Gem for queuing methods for later calling which is ORM Agnostic, and has advanced Error Handling
+description: A job queue which is ORM Agnostic and has advanced Error Handling
 email: emperor@antarestrader.com
 executables: []
@@ -124,7 +125,7 @@ files:
 - spec/errors_spec.rb
 - bin/updater
 has_rdoc: true
-homepage: http://blog.antarestrader.com
+homepage: http://github.com/antarestrader/Updater
 licenses: []
 post_install_message:
@@ -154,6 +155,6 @@ rubyforge_project:
 rubygems_version: 1.3.7
 signing_key:
 specification_version: 3
-summary: A Gem for queuing methods for later calling which is ORM Agnostic, and has advanced Error Handling
+summary: A job queue which is ORM Agnostic and has advanced Error Handling
 test_files: []