updater 0.9.3 → 0.9.3.1
- data/README.markdown +355 -25
- data/Rakefile +2 -2
- data/VERSION +1 -1
- data/lib/updater/fork_worker.rb +2 -2
- data/lib/updater/orm/mongo.rb +5 -3
- data/lib/updater/setup.rb +1 -1
- data/lib/updater/update.rb +105 -18
- metadata +6 -5
data/README.markdown
CHANGED
@@ -2,12 +2,15 @@ Updater
|
|
2
2
|
=======
|
3
3
|
|
4
4
|
Updater is a background job queue processor.
|
5
|
-
It works a bit like
|
5
|
+
It works a bit like
|
6
|
+
[delayed_job](http://github.com/tobi/delayed_job)
|
7
|
+
or [resque](http://github.com/defunkt/resque),
|
6
8
|
processing jobs in the background to allow user facing processes to stay responsive.
|
7
9
|
It also allows jobs to be scheduled and run at a particular time.
|
8
|
-
It is intended to work with a number of different ORM layers
|
9
|
-
|
10
|
-
Native support for
|
10
|
+
It is intended to work with a number of different ORM layers.
|
11
|
+
At the moment only DataMapper and MongoDB are implemented.
|
12
|
+
Native support for ActiveRecord is planned.
|
13
|
+
|
11
14
|
Get it on GemCutter with `gem install updater`.
|
12
15
|
|
13
16
|
Main Features
|
@@ -17,11 +20,11 @@ Main Features
|
|
17
20
|
* Intelligent Automatic Scaling
|
18
21
|
* Does not poll the database
|
19
22
|
* Uses minimal resources when not under load
|
20
|
-
* Flexible Job Chaining allows for intelligent error handling and rescheduling
|
23
|
+
* Flexible Job Chaining allows for intelligent error handling and rescheduling (see Below)
|
21
24
|
* Powerful configuration with intelligent defaults
|
22
|
-
* Comes with `rake` tasks
|
25
|
+
* Comes with `rake` tasks and binary.
|
23
26
|
|
24
|
-
These feature are as of ver 0.9.
|
27
|
+
These features are as of ver 0.9.4. See the change log for additional features.
|
25
28
|
|
26
29
|
Use Cases
|
27
30
|
---------
|
@@ -29,12 +32,12 @@ Use Cases
|
|
29
32
|
Web based applications face two restrictions on their functionality
|
30
33
|
as it pertains to data processing tasks.
|
31
34
|
First, the application will only run code in response to a user request.
|
32
|
-
Updater handles the case of actions or events
|
35
|
+
Updater handles the case where actions or events need to be triggered
|
33
36
|
without a request coming in to the web application.
|
34
37
|
Second, web applications, particularly those under heavy load,
|
35
38
|
need to handle as many requests as possible in a given time frame.
|
36
39
|
Updater allows data processing and communication tasks
|
37
|
-
to happen outside of the request
|
40
|
+
to happen outside of the request/response cycle
|
38
41
|
and makes it possible to move these tasks onto dedicated hardware
|
39
42
|
|
40
43
|
Updater is also useful for circumstances where code needs to be run
|
@@ -46,7 +49,7 @@ Jobs that are regular and repeating can be run
|
|
46
49
|
more consistently and with fewer resources with `cron`.
|
47
50
|
Updater should be considered when the application generates
|
48
51
|
a large number of one time events,
|
49
|
-
and/or the events need to be regularly manipulated
|
52
|
+
and/or the events need to be regularly manipulated by the application.
|
50
53
|
|
51
54
|
Updater is also not the optimal solution if the only goal
|
52
55
|
is to offload large numbers of immediate tasks.
|
@@ -57,7 +60,7 @@ by Chris Wanstrath.
|
|
57
60
|
Resque lacks a number of Updater's more powerful features,
|
58
61
|
and as of this writing we are not aware of any ability in resque
|
59
62
|
to set the time the job is run.
|
60
|
-
But
|
63
|
+
But resque does offer much higher potential throughput, and
|
61
64
|
a more robust queue structure backed by the Redis key-value store.
|
62
65
|
|
63
66
|
Using Updater
|
@@ -69,44 +72,62 @@ Initial Installation
|
|
69
72
|
Updater comes packaged as a gem and is published on GemCutter.
|
70
73
|
Get the latest version with `gem install updater`
|
71
74
|
|
75
|
+
The Updater source code is located at:
|
76
|
+
[http://github.com/antarestrader/Updater](http://github.com/antarestrader/Updater)
|
77
|
+
|
72
78
|
Concepts
|
73
79
|
--------
|
74
80
|
|
75
81
|
Updater is not complex to use but it does, of necessity, have a number of *moving parts*.
|
76
82
|
In order to use updater successfully two tasks must be accomplished:
|
77
83
|
your application (referred to as the client) must be able to add jobs to the queue,
|
78
|
-
and a separate process (the server
|
79
|
-
|
84
|
+
and a separate process (called the server, worker or job processor)
|
85
|
+
must be set up and run -- potentially on separate hardware.
|
86
|
+
It will perform the actions specified in those jobs.
|
80
87
|
|
81
88
|
Jobs are stored in a data store that is shared between client and server.
|
82
89
|
Usually this data store will be a table in your database.
|
83
|
-
Other data stores are possible, but require significantly configuration.
|
90
|
+
Other data stores are possible, but require significantly more configuration.
|
84
91
|
Updater is designed to have a minimal impact on the load of the data store,
|
85
92
|
and it therefore should be a reasonable solution for most applications.
|
86
93
|
(For a discussion of when this is not a reasonable solution see
|
87
94
|
[the Resque Blog Post](http://github.com/blog/542-introducing-resque))
|
88
95
|
|
89
|
-
Updater is very flexible about what can be run in as a background job
|
96
|
+
Updater is *very* flexible about what can be run as a background job,
|
97
|
+
and this distinguishes it from other background job processors.
|
90
98
|
The Job will call a single method on either a Class or an instance.
|
91
99
|
The method must be public.
|
92
|
-
Any arguments can be passed to the method so long as
|
100
|
+
Any arguments can be passed to the method so long as they can be marshaled.
|
93
101
|
It is important to keep in mind that the job will be run in a completely separate process.
|
94
102
|
Any global state will have to be recreated,
|
95
103
|
and data must be persisted in some form in order to be seen by the client.
|
96
104
|
For web applications this is usually not an issue.
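A quick way to check whether an argument list can be stored with a job is to round-trip it through Ruby's standard `Marshal` module. The sketch below assumes nothing about Updater's internals; `marshalable?` is a hypothetical helper and not part of the gem.

```ruby
# Hypothetical helper: true when the argument list survives a Marshal round trip.
def marshalable?(args)
  Marshal.load(Marshal.dump(args)) == args
rescue TypeError
  # Marshal raises TypeError for objects it cannot dump (Procs, IO objects, ...).
  false
end

marshalable?([1, "two", :three, { :four => 4.0 }])  # plain data can be dumped
marshalable?([proc { 42 }])                          # a Proc cannot
```

Running a check like this in the client makes a marshaling problem show up where the job is scheduled rather than later in the worker.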
|
97
105
|
|
98
106
|
Calling a class method is fairly straightforward,
|
99
|
-
but calling a method on
|
107
|
+
but calling a method on an instance takes a little more work.
|
100
108
|
Instances must be somehow persisted in the client
|
101
109
|
then reinstantiated on the worker process.
|
102
|
-
The assumption is that this will be
|
103
|
-
Each ORM adapter in Updater lists
|
110
|
+
The assumption is that this will be done through the ORM and data store.
|
111
|
+
Each ORM adapter in Updater lists default methods
|
104
112
|
for retrieving particular instances from the data store.
|
105
113
|
When an instance is scheduled as the target of a job,
|
106
114
|
its class and id will be stored in the `updates` table.
|
107
|
-
When the job is run
|
115
|
+
When the job is run,
|
116
|
+
it will first use this class to pull the instance out of the data store,
|
108
117
|
then call the appropriate method on that instance.
|
109
118
|
|
119
|
+
(*Notes on nomenclature*:
|
120
|
+
Jobs which run methods on a class are referred to throughout the documentation as "class type jobs",
|
121
|
+
while jobs which run methods on instances are called "instance type jobs."
|
122
|
+
The *target* of a job is the class or instance upon which the method is called.
|
123
|
+
A "conforming instance" is an instance of some class
|
124
|
+
which is persisted in the datastore
|
125
|
+
and can be found by calling the default `finder_method` on its class
|
126
|
+
using the value returned by the default `finder_id` method.
|
127
|
+
ActiveRecord or DataMapper model instances are conforming instances
|
128
|
+
when updater is configured to use that ORM.
|
129
|
+
)
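The round trip for an instance type job can be sketched in plain Ruby. Everything here is illustrative: `Account`, its in-memory `STORE`, `describe_target` and `run_job` are hypothetical stand-ins, with `get` playing the role of the default `finder_method` and `id` the role of the default `finder_id`.

```ruby
# Hypothetical stand-in for an ORM model backed by an in-memory store.
class Account
  STORE = {}
  attr_reader :id, :balance

  def initialize(id, balance)
    @id = id
    @balance = balance
    STORE[id] = self
  end

  # Plays the role of the adapter's default finder method.
  def self.get(id)
    STORE[id]
  end

  def credit(amount)
    @balance += amount
  end
end

# What the client stores: the class name and the id, not the object itself.
def describe_target(instance, finder_id = :id)
  { :target => instance.class.name, :finder_args => instance.send(finder_id) }
end

# What the worker does: find the instance again, then call the method on it.
def run_job(record, method, args, finder = :get)
  target = Object.const_get(record[:target]).send(finder, record[:finder_args])
  raise "Target missing" unless target
  target.send(method, *args)
end

record = describe_target(Account.new(7, 100))
run_job(record, :credit, [25])
```

Only the class name and id cross the process boundary, which is why the instance has to be persisted before the job is scheduled.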
|
130
|
+
|
110
131
|
Client Setup
|
111
132
|
------------
|
112
133
|
|
@@ -117,7 +138,12 @@ The `client_setup` method is responsible for establishing interprocess communica
|
|
117
138
|
and selecting the correct ORM adapter for Updater.
|
118
139
|
It does this using the configuration file discussed later in this document.
|
119
140
|
This method can take an optional hash that will override the options in the configuration file.
|
120
|
-
It can also be passes a `:logger` option
|
141
|
+
It can also be passed a `:logger` option.
|
142
|
+
|
143
|
+
With some ORM/datastore choices (only MongoDB at the moment)
|
144
|
+
it will also be necessary to pass the datastore connection to
|
145
|
+
`Updater::ORM::<<OrmClass>>.setup`.
|
146
|
+
See the Updater documentation for your ORM/datastore.
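The override behaviour of the optional hash amounts to a hash merge in which the caller's values win. A minimal sketch, with hypothetical option values:

```ruby
# Options as they might come out of the configuration file (hypothetical values).
file_options = { :orm => 'datamapper', :log_level => 'warn', :pid_file => 'tmp/updater.pid' }

# Options passed directly by the caller.
extended = { :log_level => 'debug' }

# Hash#merge keeps every key from the file and lets the caller's values win.
effective = file_options.merge(extended)
```

Keys that appear only in the file are untouched, so a caller needs to pass just the options it wants to change.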
|
121
147
|
|
122
148
|
Scheduling Jobs
|
123
149
|
---------------
|
@@ -140,7 +166,7 @@ If this is unspesified `:preform` is asumed (a la Resque)
|
|
140
166
|
Either leave this blank, or set to `[]` to call without arguments.
|
141
167
|
All members of the array must be marshalable.
|
142
168
|
|
143
|
-
**options**: a hash of extra information, details can be found in the Options section.
|
169
|
+
**options**: a hash of extra information, details can be found in the Options section of Updater::Update#at.
|
144
170
|
|
145
171
|
We intend to add a module that can be included into a target class
|
146
172
|
that will allow scheduling in the same general manner as delayed_job.
|
@@ -165,17 +191,20 @@ The configuration itself is a ERb interpreted YAML file.
|
|
165
191
|
This is of use in limiting repetition,
|
166
192
|
and in changing options based on the environment (test/development/production)
|
167
193
|
|
194
|
+
**Warning:** in its standard configuration,
|
195
|
+
the config file will be read by the server to determine how to boot the app.
|
196
|
+
This has the unfortunate side effect that the framework's settings
|
197
|
+
will not be available when this file is processed by ERb.
|
198
|
+
|
168
199
|
Please see the options section for details about the various options in this file.
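To illustrate how an ERb interpreted YAML file can vary by environment, here is a small sketch that uses only Ruby's standard `erb` and `yaml` libraries. The template layout, the key spelling and the `load_config` helper are assumptions made for illustration; consult the gem for the exact file format it expects. Because framework settings may not be loaded yet, the environment name is taken from an environment variable.

```ruby
require 'erb'
require 'yaml'

# Hypothetical configuration template; the keys mirror options described in
# this README, but the exact layout Updater expects is an assumption.
TEMPLATE = <<-CONFIG
orm: datamapper
workers: <%= env == 'production' ? 10 : 3 %>
log_level: <%= env == 'production' ? 'warn' : 'debug' %>
CONFIG

# Render the ERb tags first, then parse the result as YAML.
def load_config(template, env)
  rendered = ERB.new(template).result(binding)
  YAML.safe_load(rendered)
end

config = load_config(TEMPLATE, ENV['RACK_ENV'] || 'development')
```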
|
169
200
|
|
170
201
|
Starting Workers (Server)
|
171
202
|
-------------------------
|
172
203
|
|
173
|
-
In the parlance of background job processing,
|
174
|
-
a process that executes jobs is known as a worker.
|
175
204
|
The recommended way to start workers is through a rake task.
|
176
205
|
First, include `updater/tasks` in your application's Rakefile.
|
177
206
|
This will add start, stop and monitor tasks into the `updater` namespace.
|
178
|
-
`start` will use the options in your configuration file to start
|
207
|
+
`start` will use the options in your configuration file to start a worker process.
|
179
208
|
Likewise, `stop` will shut that process down.
|
180
209
|
The monitor task will start an http server
|
181
210
|
that you can use to monitor and control the job queue and workers.
|
@@ -186,3 +215,304 @@ which monitors the work load and starts or stops individual workers as needed
|
|
186
215
|
within the limits established in the configuration file.
|
187
216
|
You should, therefore, only need to use `start` once.
|
188
217
|
|
218
|
+
Options:
|
219
|
+
--------
|
220
|
+
|
221
|
+
Options may be set in configuration file or passed in at runtime.
|
222
|
+
|
223
|
+
### General Configuration ###
|
224
|
+
|
225
|
+
* `:orm`
|
226
|
+
A string representing the ORM layer to use.
|
227
|
+
the default is `datamapper` but this value should be set by all users of
|
228
|
+
versions < 1.0.0 as the default may change to `activerecord` once that ORM is implemented.
|
229
|
+
Currently Updater supports `datamapper` and `mongodb`.
|
230
|
+
Support for `activerecord` (>=3.0.0 only) will be implemented sometime after Rails 3 is released.
|
231
|
+
Support for Redis is under investigation, patches welcome.
|
232
|
+
|
233
|
+
* `:pid_file`
|
234
|
+
This file will be created by the server and read by the client.
|
235
|
+
Process signals are used as an alternate means of communication between client and server,
|
236
|
+
and rake tasks make use of this file to start and stop the server.
|
237
|
+
The default is `ROOT\updater.pid` where ROOT is the location of the config file
|
238
|
+
or, failing that, the current working directory.
|
239
|
+
|
240
|
+
* `:database`
|
241
|
+
A hash of options passed to the Updater::ORM and used to establish a connection to the datastore,
|
242
|
+
and for other ORM specific setup. See the Updater documentation for your chosen ORM.
|
243
|
+
|
244
|
+
* `:config_file`
|
245
|
+
Sets an alternate path to the config file. Obviously useless in the actual config file,
|
246
|
+
this option can nonetheless be passed directly to client and server setup methods as
|
247
|
+
an extended option. (See the cascade test.) It can also be set by the command line binary
|
248
|
+
using the `-c` option.
|
249
|
+
|
250
|
+
### Server Setup Options ###
|
251
|
+
|
252
|
+
* `:timeout`
|
253
|
+
Used only by the server,
|
254
|
+
this is the length of time (in seconds) that the server will wait before killing a worker.
|
255
|
+
It should be set to twice the length of the longest running job chain.
|
256
|
+
|
257
|
+
Because the master worker process will kill off jobs that run too long,
|
258
|
+
it is suggested that long jobs either be broken into smaller pieces using chains,
|
259
|
+
placed in a special long running job queue,
|
260
|
+
or forked off the worker process.
|
261
|
+
|
262
|
+
* `:workers`
|
263
|
+
This sets the maximum number of workers a single master server process may start.
|
264
|
+
Each worker type has its own default; the recommended default `fork_worker` uses 3.
|
265
|
+
The defaults are *very* conservative, and so long as there are sufficient hardware
|
266
|
+
resources, values of 20 or more are not out of the question.
|
267
|
+
|
268
|
+
The master worker process implements a rather sophisticated heuristic
|
269
|
+
that adjusts the number of workers actually spun up to match the current load.
|
270
|
+
|
271
|
+
**Note:** It is likely that this option will be replaced by :max_workers before
|
272
|
+
version 1.0, and that a :min_workers option will be added with a default of 1.
|
273
|
+
Updater ignores unknown options so it is safe to set :min_workers and :max_workers
|
274
|
+
in anticipation of this change.
|
275
|
+
|
276
|
+
* `:worker` (note singular)
|
277
|
+
This option is a string which tells Updater which kind of worker to use.
|
278
|
+
This option is only used by the server.
|
279
|
+
Options are `fork` or `thread` with a `simple` planned
|
280
|
+
either before 1.0 or 1.2 depending on what the author needs.
|
281
|
+
The default is 'fork' which is *strongly* recommended in production,
|
282
|
+
but is not compatible with Microsoft Windows, and *may* be sub-optimal with JRuby.
|
283
|
+
|
284
|
+
Windows users **must** set this option to `thread`.
|
285
|
+
|
286
|
+
* `:models`
|
287
|
+
This is actually an array of file names that the Server will require in order.
|
288
|
+
Many users will simply put a single file that loads their whole framework here.
|
289
|
+
(eg. `config/environment.rb` for Rails)
|
290
|
+
|
291
|
+
These files must allow the server to set up a Ruby environment in which all possible
|
292
|
+
job targets can be found, and the methods on those targets can be run.
|
293
|
+
An application that makes only minimal use of Updater,
|
294
|
+
and whose target classes and methods are carefully written,
|
295
|
+
might be able to only require a subset of the full application,
|
296
|
+
thus saving on system resources and improving start times.
|
297
|
+
|
298
|
+
### Logging ###
|
299
|
+
|
300
|
+
* `:logger`
|
301
|
+
An instance of a Ruby Logger or another object that uses the same interface.
|
302
|
+
See also `:log_file` and `:log_level`, which this option supersedes.
|
303
|
+
|
304
|
+
* `:log_file`
|
305
|
+
The file to which Updater will log its actions.
|
306
|
+
Most logging is done by the server.
|
307
|
+
If no file is given, STDOUT is assumed.
|
308
|
+
Note that if the `:logger` option is set, this option is ignored.
|
309
|
+
|
310
|
+
* `:log_level`
|
311
|
+
One of the standard logging levels (failure error warn info debug).
|
312
|
+
Updater will accept either symbols or strings and will automatically upcase this value.
|
313
|
+
The default value is `warn`.
|
314
|
+
Note that if the `:logger` option is set, this option is ignored.
|
315
|
+
|
316
|
+
It should be noted that the server produces a prodigious amount of data at the debug level.
|
317
|
+
(several MB per day without any jobs; several MB per minute under load)
|
318
|
+
We therefore strongly recommend that the server log level not be set below info without cause.
|
319
|
+
The client on the other hand is quite safe even at the debug level in development and staging environments.
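Accepting a symbol or a string and upcasing it can be sketched with Ruby's standard `Logger`. The `logger_level` helper is hypothetical, and it only handles names that exist as `Logger` constants (`debug`, `info`, `warn`, `error`, `fatal`); how Updater itself maps level names is not shown in this document.

```ruby
require 'logger'

# Map a symbol or string such as :warn or "debug" onto a Logger constant.
# Falls back to WARN, the default named above, when no level is given.
def logger_level(level)
  Logger.const_get((level || :warn).to_s.upcase)
end

logger = Logger.new($stdout)
logger.level = logger_level(:info)
```

An object built this way could also be handed over through the `:logger` option, in which case `:log_file` and `:log_level` are ignored as described above.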
|
320
|
+
|
321
|
+
### IPC ###
|
322
|
+
|
323
|
+
Any or all or none of these options may be given.
|
324
|
+
If the option is not given the communications channel will not be used.
|
325
|
+
The server will listen on all channels given,
|
326
|
+
while clients will communicate on the "best" only.
|
327
|
+
Options are listed from "best" to "worst."
|
328
|
+
If the client cannot use any of these options,
|
329
|
+
it will use process signals as a last resort.
|
330
|
+
|
331
|
+
These methods of communication merely signal to a worker process that a job
|
332
|
+
has been placed in the data store. The client and server still must have access
|
333
|
+
to the same datastore.
|
334
|
+
|
335
|
+
* `:socket`
|
336
|
+
The path to a UNIX socket.
|
337
|
+
The server will create and listen on this socket, clients can connect to it.
|
338
|
+
This option is only viable for a server running on the same machine as the client,
|
339
|
+
and will not work on Windows.
|
340
|
+
|
341
|
+
* `:udp`
|
342
|
+
The port number for UDP communications.
|
343
|
+
This is the preferred option for a cluster configuration.
|
344
|
+
|
345
|
+
**Security Notice:** Updater makes no effort to verify the authenticity of
|
346
|
+
network connections. Administrators should configure network topology and firewalls
|
347
|
+
to ensure that only intended clients can communicate with the Updater server.
|
348
|
+
|
349
|
+
* `:tcp`
|
350
|
+
The port number for TCP communications.
|
351
|
+
This is the preferred option for VPN connections between remote locations.
|
352
|
+
|
353
|
+
**Security Notice:** Updater makes no effort to verify the authenticity of
|
354
|
+
network connections. Administrators should configure network topology and firewalls
|
355
|
+
to ensure that only intended clients can communicate with the Updater server.
|
356
|
+
|
357
|
+
* `:host`
|
358
|
+
The host name for UDP and TCP connections.
|
359
|
+
The default is 'localhost'.
|
360
|
+
See security warnings above.
|
361
|
+
|
362
|
+
* `:remote` (client only) (**Pending**)
|
363
|
+
This is the url of a server monitor.
|
364
|
+
This is the preferred option for remote operations over an unsecured network.
|
365
|
+
|
366
|
+
On an unsecured network, authentication becomes necessary.
|
367
|
+
The server core is not equipped for authentication.
|
368
|
+
Instead, a monitor server is started.
|
369
|
+
This monitor has a secured connection to the worker master process using one of the methods above.
|
370
|
+
The monitor receives HTTP POST requests from authenticated clients,
|
371
|
+
and translates them into job-ready notifications.
|
372
|
+
|
373
|
+
* `:sockets` (note plural)
|
374
|
+
Generally for internal use.
|
375
|
+
This is an array of established Socket connections
|
376
|
+
that are passed directly to the worker master process.
|
377
|
+
The server will listen for new connections on these sockets.
|
378
|
+
This cannot be set in the configuration file,
|
379
|
+
it may only be passed as an option to Updater::Setup#start.
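The "best channel" rule for clients can be pictured as a walk down the option list in the order given above. This is a sketch under assumptions: `CHANNEL_PREFERENCE` and `best_channel` are hypothetical names, and the ordering is simply the order in which the options are documented here.

```ruby
# Channels in the order the options are listed, "best" first.
# :host is left out because it modifies :udp and :tcp rather than naming a channel.
CHANNEL_PREFERENCE = [:socket, :udp, :tcp, :remote]

# Hypothetical helper: pick the first configured channel, or fall back to signals.
def best_channel(options)
  CHANNEL_PREFERENCE.find { |key| options[key] } || :signal
end

best_channel(:udp => 7777, :tcp => 7778)   # the client would pick UDP here
best_channel(:pid_file => 'updater.pid')   # nothing configured: process signals
```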
|
380
|
+
|
381
|
+
|
382
|
+
Chained Jobs:
|
383
|
+
=============
|
384
|
+
|
385
|
+
One of the most exciting features of Updater is Job Chaining.
|
386
|
+
Each job has three queues
|
387
|
+
(`:success`, `:ensure` and `:failure`)
|
388
|
+
that point to other jobs in the queue.
|
389
|
+
These jobs are run after the initial job completes
|
390
|
+
depending on whether the job finished without raising an error.
|
391
|
+
Jobs can in this way form a tree
|
392
|
+
(processed depth first)
|
393
|
+
of related tasks.
|
394
|
+
This allows for code reuse,
|
395
|
+
and extreme flexibility when it comes to tasks such as
|
396
|
+
error handling, logging, auditing, and the like.
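The way the three chains form a tree can be modelled with a toy class. This is not Updater's implementation: `ToyJob` and its behaviour are assumptions made for illustration, including the assumption that the `:ensure` chain runs whether or not the job raised.

```ruby
# Toy model of a job with three chains, for illustration only.
class ToyJob
  attr_reader :success, :failure, :ensure_chain, :error

  def initialize(name, log, &work)
    @name, @log, @work = name, log, work
    @success, @failure, @ensure_chain = [], [], []
  end

  # Run this job, then descend into the matching chains before returning,
  # which gives a depth-first walk of the tree.
  def run
    @log << @name
    begin
      @work.call
      @success.each(&:run)
    rescue StandardError => e
      @error = e
      @failure.each(&:run)
    end
    @ensure_chain.each(&:run)   # assumed to run in both cases
  end
end

log = []
cleanup = ToyJob.new('cleanup', log) { }
report  = ToyJob.new('report', log) { }
risky   = ToyJob.new('risky', log) { raise 'boom' }
risky.failure << report
risky.ensure_chain << cleanup
risky.run
```

After `risky.run`, `log` records the order in which the jobs ran and `risky.error` holds the exception, which is the information a chained error handler needs.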
|
397
|
+
|
398
|
+
Updater will eventually come with a standard library of chained jobs
|
399
|
+
which will be found in the Updater::Chains class.
|
400
|
+
(TODO: Chains are being written for the 0.9 version in response to developer needs.
|
401
|
+
watch point releases for new chained methods)
|
402
|
+
|
403
|
+
Adding Chained Jobs
|
404
|
+
-------------------
|
405
|
+
|
406
|
+
Jobs can be created with chained jobs by passing
|
407
|
+
`:success`, `:ensure` and/or `:failure`
|
408
|
+
as options to any of the job queuing methods.
|
409
|
+
The value of these keys can be a job, an array of jobs,
|
410
|
+
or a hash where keys are jobs and values are options passed into the `__params__` argument (see below).
|
411
|
+
|
412
|
+
(*Notes on nomenclature*:
|
413
|
+
An initial job is one that was scheduled and run in the regular fashion
|
414
|
+
and not as a result of any chain.
|
415
|
+
A chained job is a job that is run by another job in response to a chain.
|
416
|
+
)
|
417
|
+
|
418
|
+
Example:
|
419
|
+
|
420
|
+
# Assume self is a conforming instance
|
421
|
+
# Create a job to chain into
|
422
|
+
logging_job = Updater::Update.chained(MyLoggingClass,:log_errors,[:__job__,:__params__])
|
423
|
+
# Create a job that will call this job in the case of an error
|
424
|
+
Updater::Update.immidiate(
|
425
|
+
self,
|
426
|
+
:some_method_that_might_fail,
|
427
|
+
[val1,val2],
|
428
|
+
:failure=>{logging_job=>{:message=>"an Epic Fail"}}
|
429
|
+
)
|
430
|
+
|
431
|
+
# [...]
|
432
|
+
|
433
|
+
class MyLoggingClass
|
434
|
+
def self.log_errors(job,options)
|
435
|
+
logger.error "There was #{options[:message] || 'failure'} while processing a job: \n %s" % job.error.message
|
436
|
+
logger.debug job.error.backtrace.join("\n")
|
437
|
+
end
|
438
|
+
end
|
439
|
+
|
440
|
+
Here, the worker will recreate `self` by pulling its information from the datastore.
|
441
|
+
The worker will then send `:some_method_that_might_fail` to that instance with `val1` and `val2`.
|
442
|
+
If `:some_method_that_might_fail` raises an error,
|
443
|
+
the worker will then run `logging_job`.
|
444
|
+
This job will send :log_errors to the `MyLoggingClass` class replacing `:__job__` with the instance of the job that failed,
|
445
|
+
and `:__params__` replaced with `{:message=>"an Epic Fail"}`.
|
446
|
+
`MyLoggingClass` can use the first argument to get the error that `:some_method_that_might_fail` raised.
|
447
|
+
|
448
|
+
Chained methods can also be added after a job is created by inserting them into the appropriate array.
|
449
|
+
Notice, however, that an immediate job may have already run before you have the chance to add a chained job.
|
450
|
+
|
451
|
+
Example:
|
452
|
+
|
453
|
+
# Similar to above
|
454
|
+
# Create a job to chain into
|
455
|
+
logging_job = Updater::Update.chained(MyLoggingClass,:log_errors,[:__job__,:__params__])
|
456
|
+
# Create a job that will call this job in the case of an error
|
457
|
+
initial_job = Updater::Update.in(
|
458
|
+
5.minutes,
|
459
|
+
self,
|
460
|
+
:some_method_that_might_fail,
|
461
|
+
[val1,val2])
|
462
|
+
initial_job.failure << logging_job
|
463
|
+
|
464
|
+
Writing Chained Jobs
|
465
|
+
--------------------
|
466
|
+
|
467
|
+
It is intended that chained jobs be reused.
|
468
|
+
The examples above created a new job to be chained for each initial job.
|
469
|
+
This is inefficient and would fill the datastore with unnecessary repetition.
|
470
|
+
Instead, chained jobs should be placed into the datastore on first use,
|
471
|
+
then referred to by each new initial job.
|
472
|
+
|
473
|
+
To facilitate this, Updater implements three special fields in the arguments list
|
474
|
+
which are replaced with metadata before a job is called:
|
475
|
+
|
476
|
+
* `__job__`: replaced with the instance of Updater::Update that chained into
|
477
|
+
this job. If the job failed (that is, raised an error while being run), this
|
478
|
+
instance will contain an error field with that error.
|
479
|
+
* `__params__`: this is an optional field of a chain instance. It allows the
|
480
|
+
chaining job to set specific options for the chained job to use. For example
|
481
|
+
a chained job that reschedules the original job might take an option
|
482
|
+
defining how frequently the job is rescheduled. This would be passed in
|
483
|
+
the params field. (See example in Updater::Chained -- Pending!)
|
484
|
+
* `__self__`: this is simply set to the instance of Updater::Update that is
|
485
|
+
calling the method. This might be useful for both chained and original
|
486
|
+
jobs that need to manipulate or inspect the job that called them.
|
487
|
+
Without this field, it would be impossible for a method to consistently
|
488
|
+
determine whether it had been run from a background job or invoked
|
489
|
+
directly by the app.
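The substitution of these three fields can be sketched as a simple pass over the stored argument list. `substitute_args` is a hypothetical helper written for illustration, not the gem's own method; it accepts the placeholders as either symbols or strings, since both spellings appear in the documentation.

```ruby
# Hypothetical version of the substitution step: walk the stored argument
# list and swap the three placeholders for live values.
def substitute_args(args, previous_job, params, current_job)
  args.map do |arg|
    case arg.to_s
    when '__job__'    then previous_job
    when '__params__' then params
    when '__self__'   then current_job
    else arg
    end
  end
end

stored = [:__job__, :__params__, 'unchanged']
substitute_args(stored, :failed_job, { :message => 'an Epic Fail' }, :logging_job)
```

Because the placeholders are stored rather than the live values, one chained job in the datastore can serve any number of initial jobs.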
|
490
|
+
|
491
|
+
Chained jobs can take advantage of these parameters to respond appropriately without
|
492
|
+
having to have a new chain job for each initial job.
|
493
|
+
|
494
|
+
Example: We could replace the `logging_job` above like this
|
495
|
+
|
496
|
+
class MyLoggingClass
|
497
|
+
def self.logging_job
|
498
|
+
# We will memoize this value so we don't have to hit the datastore each time.
|
499
|
+
# If the job is already in the datastore, we will find it and use it,
|
500
|
+
# Otherwise, we will create it from scratch.
|
501
|
+
@logging_job ||= Updater::Update.for(self,'logging') || Updater::Update.chained(self,:log_errors,[:__job__,:__params__], :name=>'logging')
|
502
|
+
end
|
503
|
+
|
504
|
+
def self.log_errors(job,options)
|
505
|
+
# [...] As above
|
506
|
+
end
|
507
|
+
end
|
508
|
+
|
509
|
+
# [...]
|
510
|
+
|
511
|
+
Updater::Update.immidiate(
|
512
|
+
self,
|
513
|
+
:some_method_that_might_fail,
|
514
|
+
[val1,val2],
|
515
|
+
:failure=>{MyLoggingClass.logging_job=>{:message=>"an Epic Fail"}}
|
516
|
+
)
|
517
|
+
|
518
|
+
See Also: Once it is started, see the example in Updater::Chains -- pending
|
data/Rakefile
CHANGED
@@ -10,8 +10,8 @@ GEM_NAME = "updater"
|
|
10
10
|
GEM_VERSION = File.read(VERSION_FILE).strip
|
11
11
|
AUTHOR = "John F. Miller"
|
12
12
|
EMAIL = "emperor@antarestrader.com"
|
13
|
-
HOMEPAGE = "http://
|
14
|
-
SUMMARY = "A
|
13
|
+
HOMEPAGE = "http://github.com/antarestrader/Updater"
|
14
|
+
SUMMARY = "A job queue which is ORM Agnostic and has advanced Error Handling"
|
15
15
|
|
16
16
|
spec = Gem::Specification.new do |s|
|
17
17
|
s.name = GEM_NAME
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
0.9.3
|
1
|
+
0.9.3.1
|
data/lib/updater/fork_worker.rb
CHANGED
@@ -382,7 +382,7 @@ module Updater
|
|
382
382
|
# would be to wait until it is ready then run the next job the wake and run it. There are two difficulties here
|
383
383
|
# the first is the need to let the master process know that the worker is alive and has not hung. We use a
|
384
384
|
# heartbeat file discriptor which we periodically change ctimes on by changing its access mode. This is
|
385
|
-
# modeled the technique used in the Unicorn web server. Our difficult is that we must be prepaired for a
|
385
|
+
# modeled on the technique used in the Unicorn web server. Our difficult is that we must be prepaired for a
|
386
386
|
# much less consistant load then a web server. Within a single application there may be periods where jobs
|
387
387
|
# pile up and others where there is a compleatly empty queue for hours or days. There is also the issue of
|
388
388
|
# how long a job may take to run. Jobs should generally be kept on the order of +timeout+ seconds.
|
@@ -396,7 +396,7 @@ module Updater
|
|
396
396
|
# the pipe every time one is present. The +smoke_pipe+ method handles this by attempting to remove a
|
397
397
|
# charactor from the pipe when it is called.
|
398
398
|
def wait_for(delay)
|
399
|
-
return unless @continue
|
399
|
+
return unless @continue #we're dead go back to run and break out of the main loop
|
400
400
|
delay ||= 356*24*60*60 #delay will be nil if there are no jobs. Wait a really long time in that case.
|
401
401
|
if delay <= 0 #more jobs are immidiatly availible
|
402
402
|
smoke_pipe(@stream)
|
data/lib/updater/orm/mongo.rb
CHANGED
@@ -6,7 +6,7 @@ module Updater
|
|
6
6
|
module ORM
|
7
7
|
class Mongo
|
8
8
|
|
9
|
-
FINDER= :
|
9
|
+
FINDER= :find_one
|
10
10
|
ID=:_id
|
11
11
|
|
12
12
|
def initialize(hash = {})
|
@@ -183,10 +183,12 @@ module Updater
|
|
183
183
|
# * :port - the port to connect to. Default: 27017
|
184
184
|
# * :username/:password - if these are present, they will be used to authenticate against the database
|
185
185
|
def setup(options)
|
186
|
-
logger ||= options[:logger]
|
187
|
-
raise ArgumentError, "Must spesify the name of a
|
186
|
+
logger ||= options[:logger] || Update.logger
|
187
|
+
raise ArgumentError, "Must spesify the name of a database when setting up Mongo driver" unless options[:database]
|
188
188
|
if options[:database].kind_of? ::Mongo::DB
|
189
189
|
@db = options[:database]
|
190
|
+
options[:database] = @db.name
|
191
|
+
logger.info "Updater is using already established connection to #{@db.name}"
|
190
192
|
else
|
191
193
|
logger.info "Attempting to connect to mongodb at #{[options[:host] || "localhost", options[:port] || 27017].join(':')} database: \"#{options[:database]}\""
|
192
194
|
@db = ::Mongo::Connection.new(options[:host] || "localhost", options[:port] || 27017).db(options[:database].to_s)
|
data/lib/updater/setup.rb
CHANGED
@@ -39,7 +39,7 @@ module Updater
|
|
39
39
|
|
40
40
|
ROOT = File.dirname(self.config_file || Dir.pwd)
|
41
41
|
|
42
|
-
#extended used for clients who
|
42
|
+
#extended used for clients who want to override parameters
|
43
43
|
def initialize(file_or_hash, extended = {})
|
44
44
|
@options = file_or_hash.kind_of?(Hash) ? file_or_hash : load_file(file_or_hash)
|
45
45
|
@options.merge!(extended)
|
data/lib/updater/update.rb
CHANGED
@@ -2,11 +2,19 @@ module Updater
|
|
2
2
|
class TargetMissingError < StandardError
|
3
3
|
end
|
4
4
|
|
5
|
-
#
|
5
|
+
#The basic class that drives Updater. See Readme for usage information.
|
6
6
|
class Update
|
7
7
|
# Contains the Error class after an error is caught in +run+. Not stored to the database
|
8
8
|
attr_reader :error
|
9
|
+
|
10
|
+
# Contains the underlying ORM instance (e.g. ORM::Datamapper or ORM::Mongo)
|
9
11
|
attr_reader :orm
|
12
|
+
|
13
|
+
# In order to reduce the proliferation of chained jobs in the queue,
|
14
|
+
# jobs chain request are allowed a params value that will pass
|
15
|
+
# specific values to a chained method. When a chained instance is
|
16
|
+
# created, the job processor will set this value. It will then be sent
|
17
|
+
# to the target method in place of '__params__'. See #sub_args
|
10
18
|
attr_accessor :params
|
11
19
|
|
12
20
|
#Run the action on this traget compleating any chained actions
|
@@ -34,25 +42,34 @@ module Updater
|
|
34
42
|
ret
|
35
43
|
end
|
36
44
|
|
45
|
+
#see if this method was intended for the underlying ORM layer.
|
37
46
|
def method_missing(method, *args)
|
38
47
|
@orm.send(method,*args)
|
39
48
|
end
|
40
49
|
|
50
|
+
# Determines and, if necessary, finds or creates the target for this instance.
|
51
|
+
#
|
52
|
+
# Warning: This value is intentionally NOT memoized. For instance type targets, it will result in a call to the datastore
|
53
|
+
# (or the recreation of an object) on EACH invocation. Methods that need to refer to the target more than once should
|
54
|
+
# take care to store this value locally after initial retrieval.
|
41
55
|
def target
|
42
56
|
target = @orm.finder.nil? ? @orm.target : @orm.target.send(@orm.finder,@orm.finder_args)
|
43
57
|
raise TargetMissingError, "Target missing --Class:'#{@orm.target}' Finder:'#{@orm.finder}', Args:'#{@orm.finder_args.inspect}'" unless target
|
44
58
|
target
|
45
59
|
end
|
46
60
|
|
61
|
+
# orm_inst must be set to an instance of the class Update.orm
|
47
62
|
def initialize(orm_inst)
|
48
|
-
raise ArgumentError if orm_inst.nil?
|
63
|
+
raise ArgumentError if orm_inst.nil? || !orm_inst.kind_of?(orm)
|
49
64
|
@orm = orm_inst
|
50
65
|
end
|
51
66
|
|
67
|
+
#Jobs may be named to make them easier to find
|
52
68
|
def name=(n)
|
53
69
|
@orm.name=n
|
54
70
|
end
|
55
71
|
|
72
|
+
#Jobs may be named to make them easier to find
|
56
73
|
def name
|
57
74
|
@orm.name
|
58
75
|
end
|
@@ -66,6 +83,7 @@ module Updater
|
|
66
83
|
id = other.id
|
67
84
|
end
|
68
85
|
|
86
|
+
# If this is true, the job will NOT be removed after it is run. This is usually true for chained Jobs.
|
69
87
|
def persistant?
|
70
88
|
@orm.persistant
|
71
89
|
end
|
@@ -77,7 +95,31 @@ module Updater
|
|
77
95
|
end
|
78
96
|
|
79
97
|
private
|
80
|
-
|
98
|
+
|
99
|
+
# == Use and Purpose
|
100
|
+
# Takes a previous job and the original array of arguments from the data store.
|
101
|
+
# It replaces three special values with meta information from Updater. This is
|
102
|
+
# done to allow chained jobs to respond to specific conditions in the originating
|
103
|
+
# job.
|
104
|
+
#
|
105
|
+
# ==Substitutions
|
106
|
+
# The following strings are replaced with meta information from the calling job
|
107
|
+
# as described below:
|
108
|
+
#
|
109
|
+
# * '__job__': replaced with the instance of Updater::Update that chained into
|
110
|
+
# this job. If the job failed (that is, raised an error while being run), this
|
111
|
+
# instance will contain an error field with that error.
|
112
|
+
# * '__params__': this is an optional field of a chain instance. It allows the
|
113
|
+
# chaining job to set specific options for the chained job to use. For example
|
114
|
+
# a chained job that reschedules the original job might take an option
|
115
|
+
# defining how frequently the job is rescheduled. This would be passed in
|
116
|
+
# the params field. (See example in Updater::Chained -- Pending!)
|
117
|
+
# * '__self__': this is simply set to the instance of Updater::Update that is
|
118
|
+
# calling the method. This might be useful for both chained and original
|
119
|
+
# jobs that find a need to manipulate or inspect the job that called them.
|
120
|
+
# Without this field, it would be impossible for a method to consistently
|
121
|
+
# determine whether it had been run from a background job or invoked
|
122
|
+
# directly by the app.
|
81
123
|
def sub_args(job,a)
|
82
124
|
a.map do |e|
|
83
125
|
begin
|
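The substitution rules documented above for `sub_args` can be illustrated with a small stand-alone sketch. This is not the gem's implementation (the body of the method is elided in this diff); `SketchJob` is a hypothetical stand-in for `Updater::Update`, and the placeholder strings are taken from the comment block.

```ruby
# Hypothetical stand-in for Updater::Update; only placeholder substitution is sketched.
class SketchJob
  attr_accessor :params

  # Swap the three documented placeholder strings for the calling job,
  # the params value, or this instance; leave every other argument alone.
  def sub_args(job, args)
    args.map do |e|
      case e
      when '__job__'    then job
      when '__params__' then params
      when '__self__'   then self
      else e
      end
    end
  end
end

caller_job = SketchJob.new            # the job that chained into this one
chained    = SketchJob.new            # the chained job being run
chained.params = { :retry_in => 300 } # assumed example option

result = chained.sub_args(caller_job, ['__job__', '__params__', '__self__', 42])
```

Here `result` holds the calling job, the params hash, the chained job itself, and the untouched literal, in that order.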
@@ -101,22 +143,49 @@ module Updater
|
|
101
143
|
end# map
|
102
144
|
end #def
|
103
145
|
|
146
|
+
# Invoked by the runner with the name of a chain (:success, :failure, :ensure),
|
147
|
+
# this method takes each chained job and runs it to completion. (Depth First Search of the chain tree)
|
104
148
|
def run_chain(name)
|
105
149
|
chains = @orm.send(name)
|
106
150
|
return unless chains
|
107
151
|
chains.each do |job|
|
108
152
|
job.run(self)
|
109
153
|
end
|
110
|
-
rescue NameError
|
111
|
-
|
154
|
+
rescue NameError
|
155
|
+
# There have been a number of bugs caused by the @orm instance not being what was expected when
|
156
|
+
# the ORM layer returned a chain. This error, if produced, will propagate to the worker where it is caught
|
157
|
+
# and logged, but to prevent a complete crash of the system, it is then ignored and the next job is run.
|
158
|
+
# This is here to help catch and debug this type of error in ORM layers, particularly 3rd party ORMs.
|
159
|
+
self.class.logger.error "Something is wrong with the ORM value in a chained call \n From (%s:%s):\n%s" % [__FILE__,__LINE__,@orm.inspect]
|
112
160
|
raise
|
113
161
|
end
|
114
162
|
|
115
163
|
class << self
|
116
164
|
|
117
|
-
#This attribute must be set to some ORM that will persist the data
|
165
|
+
# This attribute must be set to some ORM that will persist the data. The value is normally set
|
166
|
+
# using one of the methods in Updater::Setup.
|
118
167
|
attr_accessor :orm
|
119
168
|
|
169
|
+
# This is the application level default method to call on a class in order to find/create a target
|
170
|
+
# instance (e.g. find, get, find_one, etc.). In most circumstances the ORM layer defines an
|
171
|
+
# appropriate default and this does not need to be explicitly set.
|
172
|
+
#
|
173
|
+
# MongoDB is one significant exception to this rule. The Updater Mongo ORM layer uses the
|
174
|
+
# 10gen MongoDB driver directly without an ORM such as Mongoid or Mongo_Mapper. If the
|
175
|
+
# application uses one of these ORMs, #finder_method and #finder_id should be explicitly set.
|
176
|
+
attr_accessor :finder_method
|
177
|
+
|
178
|
+
# This is the application level default method to call on an instance type target. It should
|
179
|
+
# return a value to be passed to the #finder_method (above) in order to retrieve the instance
|
180
|
+
# from the datastore (e.g. id). In most circumstances the ORM layer defines an
|
181
|
+
# appropriate default and this does not need to be explicitly set.
|
182
|
+
#
|
183
|
+
# MongoDB is one significant exception to this rule. The Updater Mongo ORM layer uses the
|
184
|
+
# 10gen MongoDB driver directly without an ORM such as Mongoid or Mongo_Mapper. If the
|
185
|
+
# application uses one of these ORMs, #finder_method and #finder_id should be explicitly set.
|
186
|
+
attr_accessor :finder_id
|
187
|
+
|
188
|
+
|
120
189
|
#remove once Bug is discovered
|
121
190
|
def orm=(input)
|
122
191
|
raise ArgumentError, "Must set ORM to and appropriate class" unless input.kind_of? Class
|
@@ -126,8 +195,11 @@ module Updater
|
|
126
195
|
# This is an open IO socket that will be written to when a job is scheduled. If it is unset
|
127
196
|
# then @pid is signaled instead.
|
128
197
|
attr_accessor :socket
|
198
|
+
|
199
|
+
# Instance of a conforming logger. This will be created if it is not explicitly set.
|
129
200
|
attr_writer :logger
|
130
201
|
|
202
|
+
# Returns the logger instance. If it has not been set, a new Logger will be created pointing to STDOUT
|
131
203
|
def logger
|
132
204
|
@logger ||= Logger.new(STDOUT)
|
133
205
|
end
|
@@ -146,6 +218,7 @@ module Updater
|
|
146
218
|
clear_locks(worker)
|
147
219
|
end
|
148
220
|
|
221
|
+
#Ensure that a worker no longer holds any locks.
|
149
222
|
def clear_locks(worker); @orm.clear_locks(worker); end
|
150
223
|
|
151
224
|
# Request that the target be sent the method with args at the given time.
|
@@ -193,25 +266,34 @@ module Updater
|
|
193
266
|
# they are set. See +for+ for examples
|
194
267
|
#
|
195
268
|
# :failure, :success, :ensure <Updater::Update instance> another request to be run when the request completes. Usually these
|
196
|
-
# valuses will be created with the +chained+ method.
|
269
|
+
# values will be created with the +chained+ method.
|
270
|
+
# As an alternative a Hash (OrderedHash in ruby 1.8) with keys of Updater::Update instances and
|
197
271
|
# values of Hash may be used. The hash will be substituted for the '__param__' argument if/when the chained method is called.
|
198
272
|
#
|
199
273
|
# :persistant <true|false> if true the object will not be destroyed after the completion of its run. By default
|
200
274
|
# this is false except when time is nil.
|
201
275
|
#
|
276
|
+
# ===Note:
|
277
|
+
#
|
278
|
+
# Unless finder_args is passed, a non-class target will be asked for its ID value using #finder_id
|
279
|
+
# or if that is not set, then the default value defined in the ORM layer. Particularly for MongoDB
|
280
|
+
# it is important that #finder_id be set to an appropriate value since the Updater ORM layer uses
|
281
|
+
# the low level MongoDB driver instead of a more feature complete ORM like Mongoid.
|
282
|
+
#
|
202
283
|
# == Examples
|
203
284
|
#
|
204
285
|
# Updater.at(Chronic.parse('tomorrow'),Foo,:bar,[]) # will run Foo.bar() tomorrow at noon (Chronic's default guess for 'tomorrow')
|
205
286
|
#
|
206
287
|
# f = Foo.create
|
207
288
|
# u = Updater.at(Chronic.parse('2 hours from now'),f,:bar,[]) # will run Foo.get(f.id).bar in 2 hours
|
289
|
+
# == See Also
|
290
|
+
#
|
291
|
+
# +in+, +immidiate+ and +chain+ which share the same arguments and options but treat time differently
|
208
292
|
def at(t,target,method = nil,args=[],options={})
|
209
293
|
hash = Hash.new
|
210
294
|
hash[:time] = t.to_i unless t.nil?
|
211
295
|
|
212
|
-
hash[:target],hash[:finder],hash[:finder_args] = target_for(target)
|
213
|
-
hash[:finder] = options[:finder] || hash[:finder]
|
214
|
-
hash[:finder_args] = options[:finder_args] || hash[:finder_args]
|
296
|
+
hash[:target],hash[:finder],hash[:finder_args] = target_for(target, options)
|
215
297
|
|
216
298
|
hash[:method] = method || :perform
|
217
299
|
hash[:method_args] = args
|
@@ -283,7 +365,7 @@ module Updater
|
|
283
365
|
|
284
366
|
#The time class used by Updater. See time=
|
285
367
|
def time
|
286
|
-
|
368
|
+
@time ||= Time
|
287
369
|
end
|
288
370
|
|
289
371
|
# By default Updater will use the system time (Time class) to get the current time. The application
|
@@ -291,7 +373,7 @@ module Updater
|
|
291
373
|
# allows us to substitute a custom class for Time. This class must respond with an integer or Time to
|
292
374
|
# the #now method.
|
293
375
|
def time=(klass)
|
294
|
-
|
376
|
+
@time = klass
|
295
377
|
end
|
296
378
|
|
297
379
|
# A filter for all requests that are ready to run, that is they requested to be run before or at time.now
|
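The `time` / `time=` pair restored in this hunk lets an application swap in its own clock. A minimal sketch of that idea follows, assuming a hypothetical `FrozenClock` class and a stand-in `SketchUpdate` class that copies only the two accessors shown in the diff; it does not load the gem.

```ruby
# Hypothetical fixed clock, e.g. for tests. It only needs to respond to .now
# with a Time (or an integer), as the comment in the diff requires.
class FrozenClock
  def self.now
    Time.at(1_282_780_800)
  end
end

# Stand-in for the class-level accessors shown in the diff.
class SketchUpdate
  class << self
    # Falls back to the system Time class when nothing has been set.
    def time
      @time ||= Time
    end

    def time=(klass)
      @time = klass
    end
  end
end

default_clock = SketchUpdate.time        # Time, because nothing was set yet
SketchUpdate.time = FrozenClock
frozen_now = SketchUpdate.time.now.to_i  # the fixed timestamp from FrozenClock
```

Before this change both method bodies were empty, so `time` returned nil; the `||=` default is what makes the system clock the fallback.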
@@ -318,7 +400,7 @@ module Updater
|
|
318
400
|
end
|
319
401
|
|
320
402
|
#Remove all scheduled jobs. Mostly intended for testing, but may also be useful in cases of crashes
|
321
|
-
#or system corruption
|
403
|
+
#or system corruption. Removes all pending jobs.
|
322
404
|
def clear_all
|
323
405
|
@orm.clear_all
|
324
406
|
end
|
@@ -334,20 +416,22 @@ module Updater
|
|
334
416
|
#in another way.
|
335
417
|
def pid=(p)
|
336
418
|
return @pid = nil unless p #tricky assignment in return
|
337
|
-
@pid = Integer("#{p}")
|
338
|
-
Process::kill 0, @pid
|
419
|
+
@pid = Integer("#{p}") #safety check that prevents a corrupted PID file from crashing the system
|
420
|
+
Process::kill 0, @pid #check that the process exists
|
339
421
|
@pid
|
340
422
|
rescue Errno::ESRCH, ArgumentError
|
341
423
|
@pid = nil
|
342
424
|
raise ArgumentError, "PID was invalid"
|
343
425
|
end
|
344
426
|
|
427
|
+
# The PID of the worker process
|
345
428
|
def pid
|
346
429
|
@pid
|
347
430
|
end
|
348
431
|
|
349
432
|
private
|
350
433
|
def signal_worker
|
434
|
+
# TODO: If worker process goes down or has to be reset, try to reconnect
|
351
435
|
if @socket
|
352
436
|
@socket.write '.'
|
353
437
|
elsif @pid
|
@@ -356,12 +440,15 @@ module Updater
|
|
356
440
|
end
|
357
441
|
|
358
442
|
# Given some instance return the information needed to recreate that target
|
359
|
-
def target_for(inst)
|
443
|
+
def target_for(inst,options = {})
|
360
444
|
return [inst, nil, nil] if (inst.kind_of?(Class) || inst.kind_of?(Module))
|
361
|
-
[inst.class
|
445
|
+
[ inst.class, #target's class
|
446
|
+
options[:finder] || @finder_method || orm::FINDER, #method to call on target's class to find/create target
|
447
|
+
options[:finder_args] || inst.send(@finder_id || orm::ID) #value to pass to above method
|
448
|
+
]
|
362
449
|
end
|
363
450
|
|
364
|
-
end
|
451
|
+
end # class << self
|
365
452
|
end #class Update
|
366
453
|
|
367
454
|
end #Module Updater
|
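The new `target_for` resolves the finder and its argument in three steps: per-call options first, then the application-level `finder_method` / `finder_id`, then the ORM layer's `FINDER` / `ID` constants. The sketch below reproduces that precedence with hypothetical stand-ins (`SketchORM`, `Record`, `SketchScheduler`); the constant values `:get` and `:id` are assumptions for illustration, not the gem's actual defaults.

```ruby
# Hypothetical ORM layer; FINDER and ID stand in for orm::FINDER and orm::ID.
module SketchORM
  FINDER = :get
  ID     = :id
end

# Hypothetical application model with two possible lookup keys.
Record = Struct.new(:id, :slug)

class SketchScheduler
  class << self
    attr_accessor :finder_method, :finder_id

    def orm
      SketchORM
    end

    # Same shape as target_for in the diff: [class, finder, finder_args].
    def target_for(inst, options = {})
      return [inst, nil, nil] if inst.kind_of?(Class) || inst.kind_of?(Module)
      [
        inst.class,
        options[:finder] || @finder_method || orm::FINDER,
        options[:finder_args] || inst.send(@finder_id || orm::ID)
      ]
    end
  end
end

rec = Record.new(7, 'seven')
defaults = SketchScheduler.target_for(rec)        # ORM-layer defaults

SketchScheduler.finder_method = :find_one         # application-level overrides
SketchScheduler.finder_id     = :slug
app_level = SketchScheduler.target_for(rec)

per_call = SketchScheduler.target_for(rec, :finder => :first, :finder_args => 99)
class_target = SketchScheduler.target_for(String) # class targets need no finder
```

This is the mechanism the MongoDB note in the comments relies on: an application using an ORM on top of the raw driver sets the two application-level values once instead of passing options on every call.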
metadata
CHANGED
@@ -6,7 +6,8 @@ version: !ruby/object:Gem::Version
|
|
6
6
|
- 0
|
7
7
|
- 9
|
8
8
|
- 3
|
9
|
-
|
9
|
+
- 1
|
10
|
+
version: 0.9.3.1
|
10
11
|
platform: ruby
|
11
12
|
authors:
|
12
13
|
- John F. Miller
|
@@ -14,7 +15,7 @@ autorequire:
|
|
14
15
|
bindir: bin
|
15
16
|
cert_chain: []
|
16
17
|
|
17
|
-
date: 2010-08-
|
18
|
+
date: 2010-08-26 00:00:00 -07:00
|
18
19
|
default_executable:
|
19
20
|
dependencies:
|
20
21
|
- !ruby/object:Gem::Dependency
|
@@ -77,7 +78,7 @@ dependencies:
|
|
77
78
|
version: 0.2.3
|
78
79
|
type: :development
|
79
80
|
version_requirements: *id004
|
80
|
-
description: A
|
81
|
+
description: A job queue which is ORM Agnostic and has advanced Error Handling
|
81
82
|
email: emperor@antarestrader.com
|
82
83
|
executables: []
|
83
84
|
|
@@ -124,7 +125,7 @@ files:
|
|
124
125
|
- spec/errors_spec.rb
|
125
126
|
- bin/updater
|
126
127
|
has_rdoc: true
|
127
|
-
homepage: http://
|
128
|
+
homepage: http://github.com/antarestrader/Updater
|
128
129
|
licenses: []
|
129
130
|
|
130
131
|
post_install_message:
|
@@ -154,6 +155,6 @@ rubyforge_project:
|
|
154
155
|
rubygems_version: 1.3.7
|
155
156
|
signing_key:
|
156
157
|
specification_version: 3
|
157
|
-
summary: A
|
158
|
+
summary: A job queue which is ORM Agnostic and has advanced Error Handling
|
158
159
|
test_files: []
|
159
160
|
|