updater 0.9.3 → 0.9.3.1
- data/README.markdown +355 -25
- data/Rakefile +2 -2
- data/VERSION +1 -1
- data/lib/updater/fork_worker.rb +2 -2
- data/lib/updater/orm/mongo.rb +5 -3
- data/lib/updater/setup.rb +1 -1
- data/lib/updater/update.rb +105 -18
- metadata +6 -5
data/README.markdown
CHANGED
@@ -2,12 +2,15 @@ Updater
|
|
2
2
|
=======
|
3
3
|
|
4
4
|
Updater is a background job queue processor.
|
5
|
-
It works a bit like
|
5
|
+
It works a bit like
|
6
|
+
[delayed_job](http://github.com/tobi/delayed_job)
|
7
|
+
or [resque](http://github.com/defunkt/resque),
|
6
8
|
processing jobs in the background to allow user facing processes to stay responsive.
|
7
9
|
It also allows jobs to be scheduled and run at a particular time.
|
8
|
-
It is intended to work with a number of different ORM layers
|
9
|
-
|
10
|
-
Native support for
|
10
|
+
It is intended to work with a number of different ORM layers.
|
11
|
+
At the moment only DataMapper and MongoDB are implemented.
|
12
|
+
Native support for ActiveRecord is planned.
|
13
|
+
|
11
14
|
Get it on GemCutter with `gem install updater`.
|
12
15
|
|
13
16
|
Main Features
|
@@ -17,11 +20,11 @@ Main Features
|
|
17
20
|
* Intelligent Automatic Scaling
|
18
21
|
* Does not poll the database
|
19
22
|
* Uses minimal resources when not under load
|
20
|
-
* Flexible Job Chaining allows for intelligent error handling and rescheduling
|
23
|
+
* Flexible Job Chaining allows for intelligent error handling and rescheduling (see below)
|
21
24
|
* Powerful configuration with intelligent defaults
|
22
|
-
* Comes with `rake` tasks
|
25
|
+
* Comes with `rake` tasks and binary.
|
23
26
|
|
24
|
-
These feature are as of ver 0.9.
|
27
|
+
These features are as of ver 0.9.4. See the change log for additional features.
|
25
28
|
|
26
29
|
Use Cases
|
27
30
|
---------
|
@@ -29,12 +32,12 @@ Use Cases
|
|
29
32
|
Web based applications face two restrictions on their functionality
|
30
33
|
as it pertains to data processing tasks.
|
31
34
|
First, the application will only run code in response to a user request.
|
32
|
-
Updater handles the case of actions or events
|
35
|
+
Updater handles the case where actions or events need to be triggered
|
33
36
|
without a request coming in to the web application.
|
34
37
|
Second, web applications, particularly those under heavy load,
|
35
38
|
need to handle as many requests as possible in a given time frame.
|
36
39
|
Updater allows data processing and communication tasks
|
37
|
-
to happen outside of the request
|
40
|
+
to happen outside of the request/response cycle
|
38
41
|
and makes it possible to move these tasks onto dedicated hardware
|
39
42
|
|
40
43
|
Updater is also useful for circumstances where code needs to be run
|
@@ -46,7 +49,7 @@ Jobs that are regular and repeating can be run
|
|
46
49
|
more consistently and with fewer resources with `cron`.
|
47
50
|
Updater should be considered when the application generates
|
48
51
|
a large number of one time events,
|
49
|
-
and/or the events need to be regularly manipulated
|
52
|
+
and/or the events need to be regularly manipulated by the application.
|
50
53
|
|
51
54
|
Updater is also not the optimal solution if the only goal
|
52
55
|
is to offload large numbers of immediate tasks.
|
@@ -57,7 +60,7 @@ by Chris Wanstrath.
|
|
57
60
|
Resque lacks a number of Updater's more powerful features,
|
58
61
|
and as of this writing we are not aware of any ability in resque
|
59
62
|
to set the time the job is run.
|
60
|
-
But
|
63
|
+
But resque does offer much higher potential throughput, and
|
61
64
|
a more robust queue structure backed by the Redis key-value store.
|
62
65
|
|
63
66
|
Using Updater
|
@@ -69,44 +72,62 @@ Initial Installation
|
|
69
72
|
Updater comes packaged as a gem and is published on GemCutter.
|
70
73
|
Get the latest version with `gem install updater`
|
71
74
|
|
75
|
+
The Updater source code is located at:
|
76
|
+
[http://github.com/antarestrader/Updater](http://github.com/antarestrader/Updater)
|
77
|
+
|
72
78
|
Concepts
|
73
79
|
--------
|
74
80
|
|
75
81
|
Updater is not complex to use but it does, of necessity, have a number of *moving parts*.
|
76
82
|
In order to use updater successfully two tasks must be accomplished:
|
77
83
|
your application (referred to as the client) must be able to add jobs to the queue,
|
78
|
-
and a separate process (the server
|
79
|
-
|
84
|
+
and a separate process (called the server, worker or job processor)
|
85
|
+
must be set up and run, potentially on separate hardware.
|
86
|
+
It will perform the actions specified in those jobs.
|
80
87
|
|
81
88
|
Jobs are stored in a data store that is shared between client and server.
|
82
89
|
Usually this data store will be a table in your database.
|
83
|
-
Other data stores are possible, but require significantly configuration.
|
90
|
+
Other data stores are possible, but require significantly more configuration.
|
84
91
|
Updater is designed to have a minimal impact on the load of the data store,
|
85
92
|
and it therefore should be a reasonable solution for most applications.
|
86
93
|
(For a discussion of when this is not a reasonable solution see
|
87
94
|
[the Resque Blog Post](http://github.com/blog/542-introducing-resque))
|
88
95
|
|
89
|
-
Updater is very flexible about what can be run in as a background job
|
96
|
+
Updater is *very* flexible about what can be run as a background job,
|
97
|
+
and this distinguishes it from other background job processors.
|
90
98
|
The Job will call a single method on either a Class or an instance.
|
91
99
|
The method must be public.
|
92
|
-
Any arguments can be passed to the method so long as
|
100
|
+
Any arguments can be passed to the method so long as they can be marshaled.
|
93
101
|
It is important to keep in mind that the job will be run in a completely separate process.
|
94
102
|
Any global state will have to be recreated,
|
95
103
|
and data must be persisted in some form in order to be seen by the client.
|
96
104
|
For web applications this is usually not an issue.
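The marshaling requirement can be checked ahead of time with Ruby's standard `Marshal` module. The following is only a sketch: `marshalable?` is a hypothetical helper and is not part of Updater.

```ruby
# Hypothetical helper (not part of Updater): returns true when +args+
# survive a Marshal round trip unchanged, as job arguments must.
def marshalable?(args)
  Marshal.load(Marshal.dump(args)) == args
rescue TypeError
  # Marshal.dump raises TypeError for procs, IO objects and similar values.
  false
end

puts marshalable?([1, "two", { :three => 3.0 }])  # plain data round-trips
puts marshalable?([proc { 42 }])                  # a Proc cannot be dumped
```

Running a check like this in the client, before a job is scheduled, surfaces unsuitable arguments early instead of in the worker process.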
|
97
105
|
|
98
106
|
Calling a class method is fairly straightforward,
|
99
|
-
but calling a method on
|
107
|
+
but calling a method on an instance takes a little more work.
|
100
108
|
Instances must be somehow persisted in the client
|
101
109
|
then reinstantiated on the worker process.
|
102
|
-
The assumption is that this will be
|
103
|
-
Each ORM adapter in Updater lists
|
110
|
+
The assumption is that this will be done through the ORM and data store.
|
111
|
+
Each ORM adapter in Updater lists default methods
|
104
112
|
for retrieving particular instances from the data store.
|
105
113
|
When an instance is scheduled as the target of a job,
|
106
114
|
its class and id will be stored in the `updates` table.
|
107
|
-
When the job is run
|
115
|
+
When the job is run,
|
116
|
+
it will first use this class to pull the instance out of the data store,
|
108
117
|
then call the appropriate method on that instance.
|
109
118
|
|
119
|
+
(*Notes on nomenclature*:
|
120
|
+
Jobs which run methods on a class are referred to throughout the documentation as "class type jobs",
|
121
|
+
while jobs which run methods on instances are called "instance type jobs."
|
122
|
+
The *target* of a job is the class or instance upon which the method is called.
|
123
|
+
A "conforming instance" is an instance of some class
|
124
|
+
which is persisted in the datastore
|
125
|
+
and can be found by calling the default `finder_method` on its class
|
126
|
+
using the value returned by the default `finder_id` method.
|
127
|
+
ActiveRecord or DataMapper model instances are conforming instances
|
128
|
+
when updater is configured to use that ORM.
|
129
|
+
)
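The idea of a conforming instance can be sketched without any ORM. In the example below, `Invoice`, its in-memory `STORE`, and the helpers `describe_target` and `rebuild_target` are all hypothetical stand-ins; a real application would rely on the finder method and id method supplied by its ORM adapter.

```ruby
# Illustrative stand-in for an ORM model; a real application would use a
# DataMapper or MongoDB-backed class instead.
class Invoice
  STORE = {}  # in-memory "data store", keyed by id

  attr_reader :id

  def initialize(id)
    @id = id
    STORE[id] = self
  end

  # Plays the role of the ORM's default finder method.
  def self.get(id)
    STORE[id]
  end
end

# What the client records for an instance type job: the class and the id.
def describe_target(instance, finder_id = :id)
  [instance.class, instance.send(finder_id)]
end

# What the worker does before calling the job's method: find the instance again.
def rebuild_target(klass, id, finder_method = :get)
  found = klass.send(finder_method, id)
  raise "Target missing: #{klass} #{id}" unless found
  found
end

invoice = Invoice.new(42)
klass, id = describe_target(invoice)
puts rebuild_target(klass, id).equal?(invoice)
```

Only the class and the id cross the process boundary, which is why any state that is not persisted in the data store is lost by the time the job runs.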
|
130
|
+
|
110
131
|
Client Setup
|
111
132
|
------------
|
112
133
|
|
@@ -117,7 +138,12 @@ The `client_setup` method is responsible for establishing interprocess communica
|
|
117
138
|
and selecting the correct ORM adapter for Updater.
|
118
139
|
It does this using the configuration file discussed later in this document.
|
119
140
|
This method can take an optional hash that will override the options in the configuration file.
|
120
|
-
It can also be passes a `:logger` option
|
141
|
+
It can also be passed a `:logger` option.
|
142
|
+
|
143
|
+
With some ORM/datastore choices (only MongoDB at the moment)
|
144
|
+
it will also be necessary to pass the datastore connection to
|
145
|
+
`Updater::ORM::<<OrmClass>>.setup`.
|
146
|
+
See the Updater documentation for your ORM/datastore.
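The way an optional hash overrides values from the configuration file can be pictured as a plain hash merge. This is a sketch under that assumption; `cascade_options` and the option values are illustrative and not taken from Updater's API.

```ruby
# Hypothetical helper: options passed at setup time win over options
# that were loaded from the configuration file.
def cascade_options(file_options, extended = {})
  file_options.merge(extended)
end

# Values as they might come from the configuration file (illustrative).
from_file = { :orm => "datamapper", :log_level => "warn", :workers => 3 }

options = cascade_options(from_file, :log_level => "debug")
puts options[:log_level]  # the value passed at runtime
puts options[:orm]        # untouched keys keep the file's value
```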
|
121
147
|
|
122
148
|
Scheduling Jobs
|
123
149
|
---------------
|
@@ -140,7 +166,7 @@ If this is unspesified `:preform` is asumed (a la Resque)
|
|
140
166
|
Either leave this blank, or set to `[]` to call without arguments.
|
141
167
|
All members of the array must be marshalable.
|
142
168
|
|
143
|
-
**options**: a hash of extra information, details can be found in the Options section.
|
169
|
+
**options**: a hash of extra information, details can be found in the Options section of Updater::Update#at.
|
144
170
|
|
145
171
|
We intend to add a module that can be included into a target class
|
146
172
|
that will allow scheduling in the same general manner as delayed_job.
|
@@ -165,17 +191,20 @@ The configuration itself is a ERb interpreted YAML file.
|
|
165
191
|
This is of use in limiting repetition,
|
166
192
|
and in changing options based on the environment (test/development/production)
|
167
193
|
|
194
|
+
**Warning:** in its standard configuration,
|
195
|
+
the config file will be read by the server to determine how to boot the app.
|
196
|
+
This has the unfortunate side effect that the framework's settings
|
197
|
+
will not be available when this file is processed by ERb.
|
198
|
+
|
168
199
|
Please see the options section for details about the various options in this file.
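An ERb interpreted YAML file of this kind can be processed with Ruby's standard library alone. The sketch below makes several assumptions: the `UPDATER_ENV` environment variable, the option values, and the `load_config` helper are invented for illustration, and the keys are written as plain strings because this document does not show how Updater expects them to be spelled.

```ruby
require 'erb'
require 'yaml'

# Illustrative configuration text; in practice this would live in a file.
CONFIG_TEXT = <<-CONFIG
orm: datamapper
worker: fork
workers: <%= ENV['UPDATER_ENV'] == 'production' ? 10 : 3 %>
log_level: <%= ENV['UPDATER_ENV'] == 'production' ? 'warn' : 'debug' %>
CONFIG

# Hypothetical helper: run the text through ERb first, then parse it as YAML.
def load_config(text)
  YAML.load(ERB.new(text).result)
end

puts load_config(CONFIG_TEXT).inspect
```

Because the file is evaluated before the application is booted, the ERb tags here consult only the process environment and not any framework setting.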
|
169
200
|
|
170
201
|
Starting Workers (Server)
|
171
202
|
-------------------------
|
172
203
|
|
173
|
-
In the parlance of background job processing,
|
174
|
-
a process that executes jobs is known as a worker.
|
175
204
|
The recommended way to start workers is through a rake task.
|
176
205
|
First, include `updater/tasks` in your application's Rakefile.
|
177
206
|
This will add start, stop and monitor tasks into the `updater` namespace.
|
178
|
-
`start` will use the options in your configuration file to start
|
207
|
+
`start` will use the options in your configuration file to start a worker process.
|
179
208
|
Likewise, `stop` will shut that process down.
|
180
209
|
The monitor task will start an http server
|
181
210
|
that you can use to monitor and control the job queue and workers.
|
@@ -186,3 +215,304 @@ which monitors the work load and starts or stops individual workers as needed
|
|
186
215
|
within the limits established in the configuration file.
|
187
216
|
You should, therefore, only need to use `start` once.
|
188
217
|
|
218
|
+
Options:
|
219
|
+
--------
|
220
|
+
|
221
|
+
Options may be set in the configuration file or passed in at runtime.
|
222
|
+
|
223
|
+
### General Configuration ###
|
224
|
+
|
225
|
+
* `:orm`
|
226
|
+
A string representing the ORM layer to use.
|
227
|
+
The default is `datamapper`, but this value should be set by all users of
|
228
|
+
versions < 1.0.0 as the default may change to `activerecord` once that ORM is implemented.
|
229
|
+
Currently Updater supports `datamapper` and `mongodb`.
|
230
|
+
Support for `activerecord` (>=3.0.0 only) will be implemented sometime after Rails 3 is released.
|
231
|
+
Support for Redis is under investigation, patches welcome.
|
232
|
+
|
233
|
+
* `:pid_file`
|
234
|
+
This file will be created by the server and read by the client.
|
235
|
+
Process signals are used as an alternate means of communication between client and server,
|
236
|
+
and rake tasks make use of this file to start and stop the server.
|
237
|
+
The default is `ROOT\updater.pid` where ROOT is the location of the config file
|
238
|
+
or failing that the current working directory.
|
239
|
+
|
240
|
+
* `:database`
|
241
|
+
A hash of options passed to the Updater::ORM and used to establish a connection to the datastore,
|
242
|
+
and for other ORM specific setup. See the Updater documentation for your chosen ORM.
|
243
|
+
|
244
|
+
* `:config_file`
|
245
|
+
Sets an alternate path to the config file. Obviously useless in the actual config file,
|
246
|
+
this option can nonetheless be passed directly to client and server setup methods as
|
247
|
+
an extended option. (See the cascade test.) It can also be set by the command line binary
|
248
|
+
using the `-c` option.
|
249
|
+
|
250
|
+
### Server Setup Options ###
|
251
|
+
|
252
|
+
* `:timeout`
|
253
|
+
Used only by the server,
|
254
|
+
this is the length of time (in seconds) that the server will wait before killing a worker.
|
255
|
+
It should be set to twice the length of the longest running job chain.
|
256
|
+
|
257
|
+
Because the master worker process will kill off jobs that run too long,
|
258
|
+
it is suggested that long jobs either be broken into smaller pieces using chains,
|
259
|
+
placed in a special long running job queue,
|
260
|
+
or forked off the worker process.
|
261
|
+
|
262
|
+
* `:workers`
|
263
|
+
This sets the maximum number of workers a single master server process may start.
|
264
|
+
Each worker type has its own default; the recommended default `fork_worker` uses 3.
|
265
|
+
The defaults are *very* conservative, and so long as there are sufficient hardware
|
266
|
+
resources, values of 20 or more are not out of the question.
|
267
|
+
|
268
|
+
The master worker process implements a rather sophisticated heuristic
|
269
|
+
that adjusts the number of workers actually spun up to match the current load.
|
270
|
+
|
271
|
+
**Note:** It is likely that this option will be replaced by :max_workers before
|
272
|
+
version 1.0, and that a :min_workers option will be added with a default of 1.
|
273
|
+
Updater ignores unknown options so it is safe to set :min_workers and :max_workers
|
274
|
+
in anticipation of this change.
|
275
|
+
|
276
|
+
* `:worker` (note singular)
|
277
|
+
This option is a string which tells Updater which kind of worker to use.
|
278
|
+
This option is only used by the server.
|
279
|
+
Options are `fork` or `thread`, with a `simple` worker planned
|
280
|
+
either before 1.0 or 1.2 depending on what the author needs.
|
281
|
+
The default is 'fork' which is *strongly* recommended in production,
|
282
|
+
but is not compatible with Microsoft Windows, and *may* be sub-optimal with JRuby.
|
283
|
+
|
284
|
+
Windows users **must** set this option to `thread`.
|
285
|
+
|
286
|
+
* `:models`
|
287
|
+
This is actually an array of file names that the Server will require in order.
|
288
|
+
Many users will simply put a single file that loads their whole framework here.
|
289
|
+
(eg. `config/environment.rb` for Rails)
|
290
|
+
|
291
|
+
These files must allow the server to set up a Ruby environment in which all possible
|
292
|
+
job targets can be found, and the methods on those targets can be run.
|
293
|
+
An application that makes only minimal use of Updater,
|
294
|
+
and whose target classes and methods are carefully written,
|
295
|
+
might be able to only require a subset of the full application,
|
296
|
+
thus saving on system resources and improving start times.
|
297
|
+
|
298
|
+
### Logging ###
|
299
|
+
|
300
|
+
* `:logger`
|
301
|
+
An instance of a Ruby Logger or another object that uses the same interface.
|
302
|
+
See also :log_file and :log_level, which this option supersedes.
|
303
|
+
|
304
|
+
* `:log_file`
|
305
|
+
The file to which Updater will log its actions.
|
306
|
+
Most logging is done by the server.
|
307
|
+
If no file is given STDOUT is assumed.
|
308
|
+
Note that if the `:logger` option is set, this option is ignored.
|
309
|
+
|
310
|
+
* `:log_level`
|
311
|
+
One of the standard logging levels (failure error warn info debug).
|
312
|
+
Updater will accept either symbols or strings and will automatically upcase this value.
|
313
|
+
The default value is `warn`.
|
314
|
+
Note that if the `:logger` option is set, this option is ignored.
|
315
|
+
|
316
|
+
It should be noted that the server produces a prodigious amount of data at the debug level.
|
317
|
+
(several MB per day without any jobs; several MB per minute under load)
|
318
|
+
We therefore strongly recommend that the server log level not be set below info without cause.
|
319
|
+
The client on the other hand is quite safe even at the debug level in development and staging environments.
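The symbol-or-string handling described for `:log_level` can be sketched with Ruby's standard `Logger`. `logger_level` is a hypothetical helper, not Updater's own code. How Updater maps the `failure` level is not shown in this document, so it is left out here; the standard `Logger` has no constant by that name.

```ruby
require 'logger'

# Hypothetical helper: accept a symbol or string, upcase it, and look up the
# matching constant on Ruby's standard Logger (DEBUG, INFO, WARN, ERROR, ...).
def logger_level(value)
  Logger.const_get(value.to_s.upcase)
end

logger = Logger.new($stdout)
logger.level = logger_level(:warn)

logger.info "suppressed at the warn level"
logger.warn "written to the log"
```

A logger built this way could also be handed over as the `:logger` option, in which case `:log_file` and `:log_level` are ignored.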
|
320
|
+
|
321
|
+
### IPC ###
|
322
|
+
|
323
|
+
Any or all or none of these options may be given.
|
324
|
+
If the option is not given the communications channel will not be used.
|
325
|
+
The server will listen on all channels given,
|
326
|
+
while clients will communicate on the "best" only.
|
327
|
+
Options are listed from "best" to "worst."
|
328
|
+
If the client cannot use any of these options,
|
329
|
+
it will use process signals as a last resort.
|
330
|
+
|
331
|
+
These methods of communication merely signal to a worker process that a job
|
332
|
+
has been placed in the data store. The client and server still must have access
|
333
|
+
to the same datastore.
|
334
|
+
|
335
|
+
* `:socket`
|
336
|
+
The path to a UNIX socket.
|
337
|
+
The server will create and listen on this socket, clients can connect to it.
|
338
|
+
This option is only viable for a server running on the same machine as the client,
|
339
|
+
and will not work on Windows.
|
340
|
+
|
341
|
+
* `:udp`
|
342
|
+
The port number for UDP communications.
|
343
|
+
This is the preferred option for a cluster configuration.
|
344
|
+
|
345
|
+
**Security Notice:** Updater makes no effort to verify the authenticity of
|
346
|
+
network connections. Administrators should configure network topology and firewalls
|
347
|
+
to ensure that only intended clients can communicate with the Updater server.
|
348
|
+
|
349
|
+
* `:tcp`
|
350
|
+
The port number for TCP communications.
|
351
|
+
This is the preferred option for VPN connections between remote locations.
|
352
|
+
|
353
|
+
**Security Notice:** Updater makes no effort to verify the authenticity of
|
354
|
+
network connections. Administrators should configure network topology and firewalls
|
355
|
+
to ensure that only intended clients can communicate with the Updater server.
|
356
|
+
|
357
|
+
* `:host`
|
358
|
+
The host name for UDP and TCP connections.
|
359
|
+
The default is 'localhost'.
|
360
|
+
See security warnings above.
|
361
|
+
|
362
|
+
* `:remote` (client only) (**Pending**)
|
363
|
+
This is the url of a server monitor.
|
364
|
+
This is the preferred option for remote operations over an unsecured network.
|
365
|
+
|
366
|
+
On an unsecured network, authentication becomes necessary.
|
367
|
+
The server core is not equipped for authentication.
|
368
|
+
Instead, a monitor server is started.
|
369
|
+
This monitor has a secured connection to the worker master process using one of the methods above.
|
370
|
+
The monitor receives HTTP POST requests from authenticated clients,
|
371
|
+
and translates them into job-ready notifications.
|
372
|
+
|
373
|
+
* `:sockets` (note plural)
|
374
|
+
Generally for internal use.
|
375
|
+
This is an array of established Socket connections
|
376
|
+
that are passed directly to the worker master process.
|
377
|
+
The server will listen for new connections on these sockets.
|
378
|
+
This cannot be set in the configuration file,
|
379
|
+
it may only be passed as an option to Updater::Setup#start.
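The rule that clients use only the "best" configured channel, with process signals as the last resort, can be sketched as a simple ordered lookup. This is an illustration of the rule as stated in this section, not Updater's implementation; `CHANNEL_PREFERENCE` and `best_channel` are hypothetical names, and the port and path values are invented.

```ruby
# Channel options in the order this section lists them, "best" first.
# :host is a modifier rather than a channel and :remote is still pending,
# so neither appears here.
CHANNEL_PREFERENCE = [:socket, :udp, :tcp]

# Hypothetical helper: pick the first configured channel, falling back to
# process signals when none is set.
def best_channel(options)
  CHANNEL_PREFERENCE.find { |name| options[name] } || :signal
end

puts best_channel(:tcp => 7001, :udp => 7000)
puts best_channel(:socket => "/tmp/updater.sock", :tcp => 7001)
puts best_channel({})
```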
|
380
|
+
|
381
|
+
|
382
|
+
Chained Jobs:
|
383
|
+
=============
|
384
|
+
|
385
|
+
One of the most exciting features of Updater is Job Chaining.
|
386
|
+
Each job has three queues
|
387
|
+
(`:success`, `:ensure` and `:failure`)
|
388
|
+
that point to other jobs in the queue.
|
389
|
+
These jobs are run after the initial job completes
|
390
|
+
depending on whether the job finished without raising an error.
|
391
|
+
Jobs can in this way form a tree
|
392
|
+
(processed depth first)
|
393
|
+
of related tasks.
|
394
|
+
This allows for code reuse,
|
395
|
+
and extreme flexibility when it comes to tasks such as
|
396
|
+
error handling, logging, auditing, and the like.
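The depth-first processing of a chain tree can be pictured with a small stand-alone model. `SketchJob` below is not Updater's implementation. In particular, running the `:ensure` chain after the `:success` or `:failure` chain is an assumption made for this sketch.

```ruby
# Minimal stand-in for a job with three chains. This is NOT Updater's
# implementation, only an illustration of the traversal order.
class SketchJob
  attr_reader :name, :success, :failure, :ensure_chain

  def initialize(name, &action)
    @name = name
    @action = action
    @success, @failure, @ensure_chain = [], [], []
  end

  # Runs the action, then the matching chains, depth first.
  # Appends the name of every job run to +log+ and returns it.
  def run(log = [])
    log << name
    begin
      @action.call if @action
      failed = false
    rescue StandardError
      failed = true
    end
    (failed ? failure : success).each { |job| job.run(log) }
    ensure_chain.each { |job| job.run(log) }
    log
  end
end

notify  = SketchJob.new("notify")
audit   = SketchJob.new("audit")
report  = SketchJob.new("report")
cleanup = SketchJob.new("cleanup")
report.success << audit            # a chained job may have chains of its own

import = SketchJob.new("import") { raise "boom" }
import.success << notify
import.failure << report
import.ensure_chain << cleanup

puts import.run.join(" -> ")
```

With the failing `import` job above this prints `import -> report -> audit -> cleanup`: the failure chain is followed all the way down before the ensure chain runs, and `notify` is never reached.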
|
397
|
+
|
398
|
+
Updater will eventually come with a standard library of chained jobs
|
399
|
+
which will be found in the Updater::Chains class.
|
400
|
+
(TODO: Chains are being written for the 0.9 version in response to developer needs.
|
401
|
+
watch point releases for new chained methods)
|
402
|
+
|
403
|
+
Adding Chained Jobs
|
404
|
+
-------------------
|
405
|
+
|
406
|
+
Jobs can be created with chained jobs by passing
|
407
|
+
`:success`, `:ensure` and/or `:failure`
|
408
|
+
as options to any of the job queuing methods.
|
409
|
+
The value of these keys can be a job, an array of jobs,
|
410
|
+
or a hash where keys are jobs and values are options passed into the `__params__` argument (see below).
|
411
|
+
|
412
|
+
(*Notes on nomenclature*:
|
413
|
+
An initial job is one that was scheduled and run in the regular fashion
|
414
|
+
and not as a result of any chain.
|
415
|
+
A chained job is a job that is run by another job in response to a chain.
|
416
|
+
)
|
417
|
+
|
418
|
+
Example:
|
419
|
+
|
420
|
+
# Assume self is a conforming instance
|
421
|
+
# Create a job to chain into
|
422
|
+
logging_job = Updater::Update.chained(MyLoggingClass,:log_errors,[:__job__,:__params__])
|
423
|
+
# Create a job that will call this job in the case of an error
|
424
|
+
Updater::Update.immidiate(
|
425
|
+
self,
|
426
|
+
:some_method_that_might_fail,
|
427
|
+
[val1,val2],
|
428
|
+
:failure=>{logging_job=>{:message=>"an Epic Fail"}}
|
429
|
+
)
|
430
|
+
|
431
|
+
# [...]
|
432
|
+
|
433
|
+
class MyLoggingClass
|
434
|
+
def self.log_errors(job,options)
|
435
|
+
logger.error "There was #{options[:message] || "failure"} while processing a job: \n %s" % job.error.message
|
436
|
+
logger.debug job.error.backtrace.join("\n")
|
437
|
+
end
|
438
|
+
end
|
439
|
+
|
440
|
+
Here, the worker will recreate `self` by pulling its information from the datastore.
|
441
|
+
The worker will then send `:some_method_that_might_fail` to that instance with `val1` and `val2`.
|
442
|
+
If `:some_method_that_might_fail` raises an error,
|
443
|
+
the worker will then run `logging_job`.
|
444
|
+
This job will send `:log_errors` to the `MyLoggingClass` class, replacing `:__job__` with the instance of the job that failed,
|
445
|
+
and `:__params__` with `{:message=>"an Epic Fail"}`.
|
446
|
+
`MyLoggingClass` can use the first argument to get the error that `:some_method_that_might_fail` raised.
|
447
|
+
|
448
|
+
Chained methods can also be added after a job is created by inserting them into the appropriate array.
|
449
|
+
Notice however that an immediate job may have already run before you have the chance to add a chained job.
|
450
|
+
|
451
|
+
Example:
|
452
|
+
|
453
|
+
# Similar to above
|
454
|
+
# Create a job to chain into
|
455
|
+
logging_job = Updater::Update.chained(MyLoggingClass,:log_errors,[:__job__,:__params__])
|
456
|
+
# Create a job that will call this job in the case of an error
|
457
|
+
initial_job = Updater::Update.in(
|
458
|
+
5.minutes,
|
459
|
+
self,
|
460
|
+
:some_method_that_might_fail,
|
461
|
+
[val1,val2])
|
462
|
+
initial_job.failure << logging_job
|
463
|
+
|
464
|
+
Writing Chained Jobs
|
465
|
+
--------------------
|
466
|
+
|
467
|
+
It is intended that chained jobs be reused.
|
468
|
+
The examples above created a new job to be chained for each initial job.
|
469
|
+
This is inefficient and would fill the datastore with unnecessary repetition.
|
470
|
+
Instead, chained jobs should be placed into the datastore on first use,
|
471
|
+
then referred to by each new initial job.
|
472
|
+
|
473
|
+
To facilitate this Updater implements three special fields in the arguments list
|
474
|
+
which are replaced with metadata before a job is called:
|
475
|
+
|
476
|
+
* `__job__`: replaced with the instance of Updater::Update that chained into
|
477
|
+
this job. If the job failed (that is, raised an error while being run), this
|
478
|
+
instance will contain an error field with that error.
|
479
|
+
* `__params__`: this is an optional field of a chain instance. It allows the
|
480
|
+
chaining job to set specific options for the chained job to use. For example
|
481
|
+
a chained job that reschedules the original job might take an option
|
482
|
+
defining how frequently the job is rescheduled. This would be passed in
|
483
|
+
the params field. (See example in Updater::Chained -- Pending!)
|
484
|
+
* `__self__`: this is simply set to the instance of Updater::Update that is
|
485
|
+
calling the method. This might be useful for both chained and original
|
486
|
+
jobs that find a need to manipulate or inspect the job that called them.
|
487
|
+
Without this field, it would be impossible for a method to consistently
|
488
|
+
determine whether it had been run from a background job or invoked
|
489
|
+
directly by the app.
|
490
|
+
|
491
|
+
Chained jobs can take advantage of these parameters to respond appropriately without
|
492
|
+
having to have a new chain job for each initial job.
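The substitution of the three special fields can be sketched as a pass over the stored argument list. `substitute_args` is a hypothetical re-creation written for this example; Updater performs the real substitution in a private method whose signature and details may differ. Both symbol and string spellings are accepted here because the examples in this document use symbols while the source comments speak of strings.

```ruby
# Hypothetical re-creation of the substitution step (not Updater's code).
# +job+ is the job that chained into this one, +params+ the chain's options,
# and +current+ the job that is calling the method.
def substitute_args(args, job, params, current)
  replacements = { "__job__" => job, "__params__" => params, "__self__" => current }
  args.map do |arg|
    key = arg.to_s if arg.is_a?(Symbol) || arg.is_a?(String)
    replacements.key?(key) ? replacements[key] : arg
  end
end

stored = [:__job__, :__params__, "literal value"]
p substitute_args(stored, :the_failed_job, { :message => "an Epic Fail" }, :this_job)
```

Because the placeholders are resolved only when the job is called, a single stored chained job can serve any number of initial jobs.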
|
493
|
+
|
494
|
+
Example: We could replace the `logging_job` above like this
|
495
|
+
|
496
|
+
class MyLoggingClass
|
497
|
+
def self.logging_job
|
498
|
+
# We will memoize this value so we don't have to hit the datastore each time.
|
499
|
+
# If the job is already in the datastore, we will find it and use it,
|
500
|
+
# Otherwise, we will create it from scratch.
|
501
|
+
@logging_job ||= Updater::Update.for(self,'logging') || Updater::Update.chained(self,:log_errors,[:__job__,:__params__], :name=>'logging')
|
502
|
+
end
|
503
|
+
|
504
|
+
def self.log_errors(job, options)
|
505
|
+
# [...] As above
|
506
|
+
end
|
507
|
+
end
|
508
|
+
|
509
|
+
# [...]
|
510
|
+
|
511
|
+
Updater::Update.immidiate(
|
512
|
+
self,
|
513
|
+
:some_method_that_might_fail,
|
514
|
+
[val1,val2],
|
515
|
+
:failure=>{MyLoggingClass.logging_job=>{:message=>"an Epic Fail"}}
|
516
|
+
)
|
517
|
+
|
518
|
+
See Also: Once it is started, see the example in Updater::Chains -- pending
|
data/Rakefile
CHANGED
@@ -10,8 +10,8 @@ GEM_NAME = "updater"
|
|
10
10
|
GEM_VERSION = File.read(VERSION_FILE).strip
|
11
11
|
AUTHOR = "John F. Miller"
|
12
12
|
EMAIL = "emperor@antarestrader.com"
|
13
|
-
HOMEPAGE = "http://
|
14
|
-
SUMMARY = "A
|
13
|
+
HOMEPAGE = "http://github.com/antarestrader/Updater"
|
14
|
+
SUMMARY = "A job queue which is ORM Agnostic and has advanced Error Handling"
|
15
15
|
|
16
16
|
spec = Gem::Specification.new do |s|
|
17
17
|
s.name = GEM_NAME
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
0.9.3
|
1
|
+
0.9.3.1
|
data/lib/updater/fork_worker.rb
CHANGED
@@ -382,7 +382,7 @@ module Updater
|
|
382
382
|
# would be to wait until it is ready then run the next job the wake and run it. There are two difficulties here
|
383
383
|
# the first is the need to let the master process know that the worker is alive and has not hung. We use a
|
384
384
|
# heartbeat file descriptor which we periodically change ctimes on by changing its access mode. This is
|
385
|
-
# modeled the technique used in the Unicorn web server. Our difficult is that we must be prepaired for a
|
385
|
+
# modeled on the technique used in the Unicorn web server. Our difficulty is that we must be prepared for a
|
386
386
|
# much less consistent load than a web server. Within a single application there may be periods where jobs
|
387
387
|
# pile up and others where there is a completely empty queue for hours or days. There is also the issue of
|
388
388
|
# how long a job may take to run. Jobs should generally be kept on the order of +timeout+ seconds.
|
@@ -396,7 +396,7 @@ module Updater
|
|
396
396
|
# the pipe every time one is present. The +smoke_pipe+ method handles this by attempting to remove a
|
397
397
|
# character from the pipe when it is called.
|
398
398
|
def wait_for(delay)
|
399
|
-
return unless @continue
|
399
|
+
return unless @continue #we're dead go back to run and break out of the main loop
|
400
400
|
delay ||= 356*24*60*60 #delay will be nil if there are no jobs. Wait a really long time in that case.
|
401
401
|
if delay <= 0 #more jobs are immediately available
|
402
402
|
smoke_pipe(@stream)
|
data/lib/updater/orm/mongo.rb
CHANGED
@@ -6,7 +6,7 @@ module Updater
|
|
6
6
|
module ORM
|
7
7
|
class Mongo
|
8
8
|
|
9
|
-
FINDER= :
|
9
|
+
FINDER= :find_one
|
10
10
|
ID=:_id
|
11
11
|
|
12
12
|
def initialize(hash = {})
|
@@ -183,10 +183,12 @@ module Updater
|
|
183
183
|
# * :port - the port to connect to. Default: 27017
|
184
184
|
# * :username/:password - if these are present, they will be used to authenticate against the database
|
185
185
|
def setup(options)
|
186
|
-
logger ||= options[:logger]
|
187
|
-
raise ArgumentError, "Must spesify the name of a
|
186
|
+
logger ||= options[:logger] || Update.logger
|
187
|
+
raise ArgumentError, "Must specify the name of a database when setting up Mongo driver" unless options[:database]
|
188
188
|
if options[:database].kind_of? ::Mongo::DB
|
189
189
|
@db = options[:database]
|
190
|
+
options[:database] = @db.name
|
191
|
+
logger.info "Updater is using already established connection to #{@db.name}"
|
190
192
|
else
|
191
193
|
logger.info "Attempting to connect to mongodb at #{[options[:host] || "localhost", options[:port] || 27017].join(':')} database: \"#{options[:database]}\""
|
192
194
|
@db = ::Mongo::Connection.new(options[:host] || "localhost", options[:port] || 27017).db(options[:database].to_s)
|
data/lib/updater/setup.rb
CHANGED
@@ -39,7 +39,7 @@ module Updater
|
|
39
39
|
|
40
40
|
ROOT = File.dirname(self.config_file || Dir.pwd)
|
41
41
|
|
42
|
-
#extended used for clients who
|
42
|
+
#extended used for clients who want to override parameters
|
43
43
|
def initialize(file_or_hash, extended = {})
|
44
44
|
@options = file_or_hash.kind_of?(Hash) ? file_or_hash : load_file(file_or_hash)
|
45
45
|
@options.merge!(extended)
|
data/lib/updater/update.rb
CHANGED
@@ -2,11 +2,19 @@ module Updater
|
|
2
2
|
class TargetMissingError < StandardError
|
3
3
|
end
|
4
4
|
|
5
|
-
#
|
5
|
+
#The basic class that drives Updater. See Readme for usage information.
|
6
6
|
class Update
|
7
7
|
# Contains the Error class after an error is caught in +run+. Not stored to the database
|
8
8
|
attr_reader :error
|
9
|
+
|
10
|
+
# Contains the underlying ORM instance (eg. ORM::Datamapper or ORM::Mongo)
|
9
11
|
attr_reader :orm
|
12
|
+
|
13
|
+
# In order to reduce the proliferation of chained jobs in the queue,
|
14
|
+
# job chain requests are allowed a params value that will pass
|
15
|
+
# specific values to a chained method. When a chained instance is
|
16
|
+
# created, the job processor will set this value. It will then be sent
|
17
|
+
# to the target method in place of '__params__'. See #sub_args
|
10
18
|
attr_accessor :params
|
11
19
|
|
12
20
|
#Run the action on this target completing any chained actions
|
@@ -34,25 +42,34 @@ module Updater
|
|
34
42
|
ret
|
35
43
|
end
|
36
44
|
|
45
|
+
#see if this method was intended for the underlying ORM layer.
|
37
46
|
def method_missing(method, *args)
|
38
47
|
@orm.send(method,*args)
|
39
48
|
end
|
40
49
|
|
50
|
+
# Determines and, if necessary, finds or creates the target for this instance.
|
51
|
+
#
|
52
|
+
# Warning: This value is intentionally NOT memoized. For instance type targets, it will result in a call to the datastore
|
53
|
+
# (or the recreation of an object) on EACH invocation. Methods that need to refer to the target more than once should
|
54
|
+
# take care to store this value locally after initial retrieval.
|
41
55
|
def target
|
42
56
|
target = @orm.finder.nil? ? @orm.target : @orm.target.send(@orm.finder,@orm.finder_args)
|
43
57
|
raise TargetMissingError, "Target missing --Class:'#{@orm.target}' Finder:'#{@orm.finder}', Args:'#{@orm.finder_args.inspect}'" unless target
|
44
58
|
target
|
45
59
|
end
|
46
60
|
|
61
|
+
# orm_inst must be set to an instance of the class Update.orm
|
47
62
|
def initialize(orm_inst)
|
48
|
-
raise ArgumentError if orm_inst.nil?
|
63
|
+
raise ArgumentError if orm_inst.nil? || !orm_inst.kind_of?(orm)
|
49
64
|
@orm = orm_inst
|
50
65
|
end
|
51
66
|
|
67
|
+
#Jobs may be named to make them easier to find
|
52
68
|
def name=(n)
|
53
69
|
@orm.name=n
|
54
70
|
end
|
55
71
|
|
72
|
+
#Jobs may be named to make them easier to find
|
56
73
|
def name
|
57
74
|
@orm.name
|
58
75
|
end
|
@@ -66,6 +83,7 @@ module Updater
|
|
66
83
|
id = other.id
|
67
84
|
end
|
68
85
|
|
86
|
+
# If this is true, the job will NOT be removed after it is run. This is usually true for chained Jobs.
|
69
87
|
def persistant?
|
70
88
|
@orm.persistant
|
71
89
|
end
|
@@ -77,7 +95,31 @@ module Updater
|
|
77
95
|
end
|
78
96
|
|
79
97
|
private
|
80
|
-
|
98
|
+
|
99
|
+
# == Use and Purpose
|
100
|
+
# Takes a previous job and the original array of arguments from the data store.
|
101
|
+
# It replaces three special values with meta information from Updater. This is
|
102
|
+
# done to allow chained jobs to respond to specific conditions in the originating
|
103
|
+
# job.
|
104
|
+
#
|
105
|
+
# ==Substitutions
|
106
|
+
# The following strings are replaced with meta information from the calling job
|
107
|
+
# as described below:
|
108
|
+
#
|
109
|
+
# * '__job__': replaced with the instance of Updater::Update that chained into
|
110
|
+
# this job. If the job failed (that is, raised an error while being run), this
|
111
|
+
# instance will contain an error field with that error.
|
112
|
+
# * '__params__': this is an optional field of a chain instance. It allows the
|
113
|
+
# chaining job to set specific options for the chained job to use. For example
|
114
|
+
# a chained job that reschedules the original job might take an option
|
115
|
+
# defining how frequently the job is rescheduled. This would be passed in
|
116
|
+
# the params field. (See example in Updater::Chained -- Pending!)
|
117
|
+
# * '__self__': this is simply set to the instance of Updater::Update that is
|
118
|
+
# calling the method. This might be useful for both chained and original
|
119
|
+
# jobs that find a need to manipulate or inspect the job that called them.
|
120
|
+
# Without this field, it would be impossible for a method to consistently
|
121
|
+
# determine whether it had been run from a background job or invoked
|
122
|
+
# directly by the app.
|
81
123
|
def sub_args(job,a)
|
82
124
|
a.map do |e|
|
83
125
|
begin
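The body of `sub_args` is not shown in this hunk, so the following is only an illustration of the substitution the comment describes. It is a standalone sketch, not the gem's implementation. `substitute_args` and its parameters are hypothetical names, and the marker strings are taken from the comment's list ('__job__', '__params__', '__self__').

```ruby
# Hypothetical, standalone illustration of the substitution described above.
# calling_job : the job that chained into the current one
# current_job : the job whose arguments are being prepared
# params      : the optional params value attached to the chain
# args        : the stored argument array
def substitute_args(calling_job, current_job, params, args)
  args.map do |arg|
    case arg
    when '__job__'    then calling_job
    when '__params__' then params
    when '__self__'   then current_job
    else arg # anything else is passed through unchanged
    end
  end
end

result = substitute_args(:previous_job, :this_job, { :delay => 60 },
                         ['__job__', 42, '__params__', '__self__', 'plain'])
```

Only exact marker strings are replaced. Other values, including ordinary strings, reach the target method untouched.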
|
@@ -101,22 +143,49 @@ module Updater
|
|
101
143
|
end# map
|
102
144
|
end #def
|
103
145
|
|
146
|
+
# Invoked by the runner with the name of a chain (:success, :failure, :ensure),
|
147
|
+
# this method takes each chained job and runs it to completion. (Depth First Search of the chain tree)
|
104
148
|
def run_chain(name)
|
105
149
|
chains = @orm.send(name)
|
106
150
|
return unless chains
|
107
151
|
chains.each do |job|
|
108
152
|
job.run(self)
|
109
153
|
end
|
110
|
-
rescue NameError
|
111
|
-
|
154
|
+
rescue NameError
|
155
|
+
# There have been a number of bugs caused by the @orm instance not being what was expected when
|
156
|
+
# the ORM layer returned a chain. This error, if produced, will propagate to the worker where it is caught
|
157
|
+
# and logged, but to prevent a complete crash of the system, it is then ignored and the next job is run.
|
158
|
+
# This is here to help catch and debug this type of error in ORM layers, particularly 3rd party ORMs.
|
159
|
+
self.class.logger.error "Something is wrong with the ORM value in a chained call \n From (%s:%s):\n%s" % [__FILE__,__LINE__,@orm.inspect]
|
112
160
|
raise
|
113
161
|
end
|
114
162
|
|
115
163
|
class << self
|
116
164
|
|
117
|
-
#This attribute must be set to some ORM that will persist the data
|
165
|
+
# This attribute must be set to some ORM that will persist the data. The value is normally set
|
166
|
+
# using one of the methods in Updater::Setup.
|
118
167
|
attr_accessor :orm
|
119
168
|
|
169
|
+
# This is the application level default method to call on a class in order to find/create a target
|
170
|
+
# instance. (e.g find, get, find_one, etc...). In most circumstances the ORM layer defines an
|
171
|
+
# appropriate default and this does not need to be explicitly set.
|
172
|
+
#
|
173
|
+
# MongoDB is one significant exception to this rule. The Updater Mongo ORM layer uses the
|
174
|
+
# 10gen MongoDB driver directly without an ORM such as Mongoid or Mongo_Mapper. If the
|
175
|
+
# application uses one of these ORMs #finder_method and #finder_id should be explicitly set.
|
176
|
+
attr_accessor :finder_method
|
177
|
+
|
178
|
+
# This is the application level default method to call on an instance type target. It should
|
179
|
+
# return a value to be passed to the #finder_method (above) in order to retrieve the instance
|
180
|
+
# from the datastore (e.g. id). In most circumstances the ORM layer defines an
|
181
|
+
# appropriate default and this does not need to be explicitly set.
|
182
|
+
#
|
183
|
+
# MongoDB is one significant exception to this rule. The Updater Mongo ORM layer uses the
|
184
|
+
# 10gen MongoDB driver directly without an ORM such as Mongoid or Mongo_Mapper. If the
|
185
|
+
# application uses one of these ORMs #finder_method and #finder_id should be explicitly set.
|
186
|
+
attr_accessor :finder_id
|
187
|
+
|
188
|
+
|
120
189
|
#remove once Bug is discovered
|
121
190
|
def orm=(input)
|
122
191
|
raise ArgumentError, "Must set ORM to and appropriate class" unless input.kind_of? Class
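The `run_chain` change in this hunk logs the state of the ORM value and then re-raises, so the worker still sees the error. A minimal sketch of that log-and-re-raise pattern follows, assuming a plain Hash of callables in place of the ORM layer. `ChainRunnerSketch` and its constructor are hypothetical.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in: chains are a Hash of name => array of callables.
class ChainRunnerSketch
  def initialize(chains, logger)
    @chains = chains
    @logger = logger
  end

  def run_chain(name)
    jobs = @chains[name]
    return unless jobs
    jobs.each { |job| job.call(self) }
  rescue NameError => e
    # NoMethodError is a subclass of NameError, so a missing method lands here too.
    @logger.error "Something is wrong in chain #{name.inspect}: #{e.message}"
    raise # bare raise re-raises the same exception for the caller to handle
  end
end

log    = StringIO.new
runner = ChainRunnerSketch.new({ :success => [lambda { |r| r.no_such_method }] },
                               Logger.new(log))
raised = begin
  runner.run_chain(:success)
  false
rescue NameError
  true
end
```

The error is recorded once at the point where the context is known, and the caller keeps the decision about whether to continue.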
|
@@ -126,8 +195,11 @@ module Updater
|
|
126
195
|
# This is an open IO socket that will be written to when a job is scheduled. If it is unset
|
127
196
|
# then @pid is signaled instead.
|
128
197
|
attr_accessor :socket
|
198
|
+
|
199
|
+
# Instance of a conforming logger. This will be created if it is not explicitly set.
|
129
200
|
attr_writer :logger
|
130
201
|
|
202
|
+
# Returns the logger instance. If it has not been set, a new Logger will be created pointing to STDOUT
|
131
203
|
def logger
|
132
204
|
@logger ||= Logger.new(STDOUT)
|
133
205
|
end
|
@@ -146,6 +218,7 @@ module Updater
|
|
146
218
|
clear_locks(worker)
|
147
219
|
end
|
148
220
|
|
221
|
+
#Ensure that a worker no longer holds any locks.
|
149
222
|
def clear_locks(worker); @orm.clear_locks(worker); end
|
150
223
|
|
151
224
|
# Request that the target be sent the method with args at the given time.
|
@@ -193,25 +266,34 @@ module Updater
|
|
193
266
|
# they are set. See +for+ for examples
|
194
267
|
#
|
195
268
|
# :failure, :success, :ensure <Updater::Update instance> another request to be run when the request completes. Usually these
|
196
|
-
# values will be created with the +chained+ method.
|
269
|
+
# values will be created with the +chained+ method.
|
270
|
+
# As an alternative a Hash (OrderedHash in ruby 1.8) with keys of Updater::Update instances and
|
197
271
|
# values of Hash may be used. The hash will be substituted for the '__param__' argument if/when the chained method is called.
|
198
272
|
#
|
199
273
|
# :persistant <true|false> if true the object will not be destroyed after the completion of its run. By default
|
200
274
|
# this is false except when time is nil.
|
201
275
|
#
|
276
|
+
# ===Note:
|
277
|
+
#
|
278
|
+
# Unless finder_args is passed, a non-class target will be asked for its ID value using #finder_id
|
279
|
+
# or if that is not set, then the default value defined in the ORM layer. Particularly for MongoDB
|
280
|
+
# it is important that #finder_id be set to an appropriate value since the Updater ORM layer uses
|
281
|
+
# the low level MongoDB driver instead of a more feature complete ORM like Mongoid.
|
282
|
+
#
|
202
283
|
# == Examples
|
203
284
|
#
|
204
285
|
# Updater.at(Chronic.parse('tomorrow'),Foo,:bar,[]) # will run Foo.bar() tomorrow at midnight
|
205
286
|
#
|
206
287
|
# f = Foo.create
|
207
288
|
# u = Updater.at(Chronic.parse('2 hours from now'),f,:bar,[]) # will run Foo.get(f.id).bar in 2 hours
|
289
|
+
# == See Also
|
290
|
+
#
|
291
|
+
# +in+, +immidiate+ and +chain+ which share the same arguments and options but treat time differently
|
208
292
|
def at(t,target,method = nil,args=[],options={})
|
209
293
|
hash = Hash.new
|
210
294
|
hash[:time] = t.to_i unless t.nil?
|
211
295
|
|
212
|
-
hash[:target],hash[:finder],hash[:finder_args] = target_for(target)
|
213
|
-
hash[:finder] = options[:finder] || hash[:finder]
|
214
|
-
hash[:finder_args] = options[:finder_args] || hash[:finder_args]
|
296
|
+
hash[:target],hash[:finder],hash[:finder_args] = target_for(target, options)
|
215
297
|
|
216
298
|
hash[:method] = method || :perform
|
217
299
|
hash[:method_args] = args
|
@@ -283,7 +365,7 @@ module Updater
|
|
283
365
|
|
284
366
|
#The time class used by Updater. See time=
|
285
367
|
def time
|
286
|
-
|
368
|
+
@time ||= Time
|
287
369
|
end
|
288
370
|
|
289
371
|
# By default Updater will use the system time (Time class) to get the current time. The application
|
@@ -291,7 +373,7 @@ module Updater
|
|
291
373
|
# allows us to substitute a custom class for Time. This class must respond with an Integer or Time to
|
292
374
|
# the #now method.
|
293
375
|
def time=(klass)
|
294
|
-
|
376
|
+
@time = klass
|
295
377
|
end
|
296
378
|
|
297
379
|
# A filter for all requests that are ready to run, that is they requested to be run before or at time.now
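The `time` and `time=` accessors restored in the two hunks above let an application swap in its own clock. A minimal sketch of that idea follows, assuming only that the substitute responds to `now` with a Time or Integer. `FrozenClock`, `SchedulerSketch` and `ready?` are hypothetical names used for illustration.

```ruby
# Hypothetical clock that always reports the same instant (useful in tests).
class FrozenClock
  def self.now
    Time.at(1_282_780_800)
  end
end

# Hypothetical holder that mirrors the time / time= pair from the diff.
class SchedulerSketch
  class << self
    def time
      @time ||= Time # fall back to the system clock when nothing was set
    end

    def time=(klass)
      @time = klass
    end

    # Illustrative helper: is a job scheduled at +scheduled_at+ due yet?
    def ready?(scheduled_at)
      scheduled_at.to_i <= time.now.to_i # to_i works for both Time and Integer
    end
  end
end

SchedulerSketch.time = FrozenClock
```

With the frozen clock installed, `SchedulerSketch.ready?` gives the same answer on every run, which is the main reason to substitute the clock.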
|
@@ -318,7 +400,7 @@ module Updater
|
|
318
400
|
end
|
319
401
|
|
320
402
|
#Remove all scheduled jobs. Mostly intended for testing, but may also be useful in cases of crashes
|
321
|
-
#or system corruption
|
403
|
+
#or system corruption. removes all pending jobs.
|
322
404
|
def clear_all
|
323
405
|
@orm.clear_all
|
324
406
|
end
|
@@ -334,20 +416,22 @@ module Updater
|
|
334
416
|
#in another way.
|
335
417
|
def pid=(p)
|
336
418
|
return @pid = nil unless p #tricky assignment in return
|
337
|
-
@pid = Integer("#{p}")
|
338
|
-
Process::kill 0, @pid
|
419
|
+
@pid = Integer("#{p}") #safety check that prevents a corrupted PID file from crashing the system
|
420
|
+
Process::kill 0, @pid #check that the process exists
|
339
421
|
@pid
|
340
422
|
rescue Errno::ESRCH, ArgumentError
|
341
423
|
@pid = nil
|
342
424
|
raise ArgumentError, "PID was invalid"
|
343
425
|
end
|
344
426
|
|
427
|
+
# The PID of the worker process
|
345
428
|
def pid
|
346
429
|
@pid
|
347
430
|
end
|
348
431
|
|
349
432
|
private
|
350
433
|
def signal_worker
|
434
|
+
# TODO: If worker process goes down or has to be reset, try to reconnect
|
351
435
|
if @socket
|
352
436
|
@socket.write '.'
|
353
437
|
elsif @pid
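The `pid=` setter in this hunk does two checks: the value must parse as an integer, and a process with that PID must exist. The same two steps can be sketched as a standalone function. `validated_pid` is a hypothetical name, and the sketch assumes a platform where signal 0 is supported.

```ruby
# Hypothetical standalone version of the validation performed by pid=.
def validated_pid(raw)
  return nil unless raw
  pid = Integer("#{raw}") # raises ArgumentError for input such as "12x"
  Process.kill(0, pid)    # signal 0 delivers nothing; Errno::ESRCH means no such process
  pid
rescue Errno::ESRCH, ArgumentError
  # Note: Errno::EPERM (process exists but belongs to another user) is not
  # rescued here, matching the rescue list shown in the diff.
  raise ArgumentError, "PID was invalid"
end

own_pid = validated_pid(Process.pid.to_s)
```

Passing the current process's own PID succeeds, while unparsable input is reported as an `ArgumentError`.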
|
@@ -356,12 +440,15 @@ module Updater
|
|
356
440
|
end
|
357
441
|
|
358
442
|
# Given some instance return the information needed to recreate that target
|
359
|
-
def target_for(inst)
|
443
|
+
def target_for(inst,options = {})
|
360
444
|
return [inst, nil, nil] if (inst.kind_of?(Class) || inst.kind_of?(Module))
|
361
|
-
[inst.class
|
445
|
+
[ inst.class, #target's class
|
446
|
+
options[:finder] || @finder_method || orm::FINDER, #method to call on target's class to find/create target
|
447
|
+
options[:finder_args] || inst.send(@finder_id || orm::ID) #value to pass to above method
|
448
|
+
]
|
362
449
|
end
|
363
450
|
|
364
|
-
end
|
451
|
+
end # class << self
|
365
452
|
end #class Update
|
366
453
|
|
367
454
|
end #Module Updater
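The rewritten `target_for` resolves the finder and its argument in a fixed order: per-call options first, then the application-level defaults, then the ORM's constants. A minimal sketch of that fallback chain follows. `OrmSketch`, its constant values, and the extra `finder_method` and `finder_id` parameters are assumptions of this sketch. The gem reads those values from `@finder_method`, `@finder_id` and the configured `orm`.

```ruby
# Hypothetical ORM defaults; the real values are defined by each ORM layer.
module OrmSketch
  FINDER = :get
  ID     = :id
end

def target_for(inst, options = {}, finder_method = nil, finder_id = nil)
  # Classes and modules are their own target and need no finder.
  return [inst, nil, nil] if inst.kind_of?(Class) || inst.kind_of?(Module)
  [
    inst.class,                                                     # target's class
    options[:finder] || finder_method || OrmSketch::FINDER,         # how to find it again
    options[:finder_args] || inst.send(finder_id || OrmSketch::ID)  # what to pass to the finder
  ]
end

Foo = Struct.new(:id, :slug)
foo = Foo.new(7, "seven")

by_default = target_for(foo)
by_options = target_for(foo, { :finder => :find_by_slug, :finder_args => "seven" })
by_app     = target_for(foo, {}, :find_one, :slug)
```

Because the third element uses `||`, the instance is only asked for its ID when no `:finder_args` option was supplied.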
|
metadata
CHANGED
@@ -6,7 +6,8 @@ version: !ruby/object:Gem::Version
|
|
6
6
|
- 0
|
7
7
|
- 9
|
8
8
|
- 3
|
9
|
-
|
9
|
+
- 1
|
10
|
+
version: 0.9.3.1
|
10
11
|
platform: ruby
|
11
12
|
authors:
|
12
13
|
- John F. Miller
|
@@ -14,7 +15,7 @@ autorequire:
|
|
14
15
|
bindir: bin
|
15
16
|
cert_chain: []
|
16
17
|
|
17
|
-
date: 2010-08-
|
18
|
+
date: 2010-08-26 00:00:00 -07:00
|
18
19
|
default_executable:
|
19
20
|
dependencies:
|
20
21
|
- !ruby/object:Gem::Dependency
|
@@ -77,7 +78,7 @@ dependencies:
|
|
77
78
|
version: 0.2.3
|
78
79
|
type: :development
|
79
80
|
version_requirements: *id004
|
80
|
-
description: A
|
81
|
+
description: A job queue which is ORM Agnostic and has advanced Error Handling
|
81
82
|
email: emperor@antarestrader.com
|
82
83
|
executables: []
|
83
84
|
|
@@ -124,7 +125,7 @@ files:
|
|
124
125
|
- spec/errors_spec.rb
|
125
126
|
- bin/updater
|
126
127
|
has_rdoc: true
|
127
|
-
homepage: http://
|
128
|
+
homepage: http://github.com/antarestrader/Updater
|
128
129
|
licenses: []
|
129
130
|
|
130
131
|
post_install_message:
|
@@ -154,6 +155,6 @@ rubyforge_project:
|
|
154
155
|
rubygems_version: 1.3.7
|
155
156
|
signing_key:
|
156
157
|
specification_version: 3
|
157
|
-
summary: A
|
158
|
+
summary: A job queue which is ORM Agnostic and has advanced Error Handling
|
158
159
|
test_files: []
|
159
160
|
|