bjj 1.0.3
- data/.gitignore +1 -0
- data/HISTORY +74 -0
- data/README +308 -0
- data/Rakefile +18 -0
- data/TODO +40 -0
- data/VERSION.yml +4 -0
- data/bin/bj +679 -0
- data/install.rb +214 -0
- data/lib/bj.rb +87 -0
- data/lib/bj/api.rb +164 -0
- data/lib/bj/attributes.rb +120 -0
- data/lib/bj/bj.rb +72 -0
- data/lib/bj/errors.rb +4 -0
- data/lib/bj/joblist.rb +112 -0
- data/lib/bj/logger.rb +50 -0
- data/lib/bj/runner.rb +359 -0
- data/lib/bj/stdext.rb +86 -0
- data/lib/bj/table.rb +384 -0
- data/lib/bj/util.rb +133 -0
- metadata +103 -0
data/.gitignore
ADDED
@@ -0,0 +1 @@
*.gem
data/HISTORY
ADDED
@@ -0,0 +1,74 @@
1.0.3:
  - env wasn't properly unpacked by runner, added YAML.load(job.env), thanks
    Chris Wanstrath
  - made operations that generate migrations, etc, verbose - they dump
    stdout and stderr to console when running

1.0.2:
  - Bj now (should) auto detect the correct rake command on windows as
    "rake.bat" not "rake". see Bj.which_rake and Bj.rake for impl.

1.0.1:
  - fixed name collision with 'record.attributes = hash' ar mass
    assignment method (thx jon guymon)
  - added new sponsor: http://igicom.com/

0.0.5:
  - use full path to ruby for plugin mode
  - plugin correctly installs bin -->> script
  - plugin install uses --force
  - properly quote paths in windows (spaces)
  - switch win signal to ABRT (was INT)
  - background job registration now uses ppid to pin the subprocess to a
    parent
  - use ppid to detect parent death and exit in event loop
  - don't use gem dependencies in plugin as they are broken when loading from
    multiple gem repos
  - added a small amount of drb magic that allows signals to work across
    processes even on windows (see http://drawohara.com/post/22540307)

0.0.4:
  - basic functionality in windows
  - several small bug fixes

0.0.3:
  - *many* small bug fixes

  - plugin install should now pick up dependencies from plugin dir, last
    release had LOAD_PATH/gem issues and was picking up globally installed
    gems

  - automatic management of the background processing can be turned off if you
    want to manage your own processes

  - all jobs are automatically restartable unless submitted with

      :restartable => false

    this means that, should a runner ever die, upon restart any jobs that were
    mid-process will automatically be restarted

  - signal based parent lifeline moved out of thread and into event loop

  - :lock => true added to a few AR finds to support true write serializable
    transaction isolation when the db supports it

  - all migrations now use :force => true

  - running 'bj setup' will always generate a new migration, even if you've
    already run it before. this allows easy version upgrades.

  - a few commands would blow up on windows because they weren't prefixed with
    'ruby'. gotta love the lack of #shebang line on windoze...

  - ./script/bj is searched first before system path env var

  - a default PATH is provided for whacky systems without one

  - database.yml is filtered through ERB ala rails

0.0.2:
  - path bug fixes

0.0.1:
  - initial release
data/README
ADDED
@@ -0,0 +1,308 @@
NAME
  bj

SYNOPSIS
  bj (migration_code|generate_migration|migrate|setup|plugin|run|submit|list|set|config|pid) [options]+

DESCRIPTION
  ________________________________
  Overview
  --------------------------------

    Backgroundjob (Bj) is a brain dead simple zero admin background priority queue
    for Rails. Bj is robust, platform independent (including windows), and
    supports internal or external management of the background runner process.

    Jobs can be submitted to the queue directly using the api or from the command
    line using ./script/bj:

      api:
        Bj.submit 'cat /etc/passwd'

      command line:
        bj submit cat /etc/passwd

    Bj's priority queue lives in the database and is therefore durable - your jobs
    will live across an app crash or machine reboot. The job management is
    comprehensive, capturing stdout, stderr, exit_status, and temporal statistics
    about each job:

      jobs = Bj.submit array_of_commands, :priority => 42

      ...

      jobs.each do |job|
        if job.finished?
          p job.stdout
          p job.stderr
          p job.exit_status
          p job.started_at
          p job.finished_at
        end
      end

    In addition the background runner process logs all commands run and their
    exit_status to a log named using the following convention:

      rails_root/log/bj.#{ HOSTNAME }.#{ RAILS_ENV }.log

    Bj allows you to submit jobs to multiple databases; for instance, if your
    application is running in development mode you may do:

      Bj.in :production do
        Bj.submit 'my_job.exe'
      end

    Bj manages the ever growing list of jobs run by automatically archiving them
    into another table (by default jobs > 24 hrs old are archived) to prevent the
    jobs table from becoming bloated and huge.

    All Bj's tables are namespaced and accessible via the Bj module:

      Bj.table.job.find(:all)         # jobs table
      Bj.table.job_archive.find(:all) # archived jobs
      Bj.table.config.find(:all)      # configuration and runner state

    Bj always arranges for submitted jobs to run with a current working directory
    of RAILS_ROOT and with the correct RAILS_ENV setting. For example, if you
    submit a job in production it will have ENV['RAILS_ENV'] == 'production'.

    When Bj manages the background runner it will never outlive the rails
    application - it is started and stopped on demand as the rails app is started
    and stopped. This is also true for ./script/console - Bj will automatically
    fire off the background runner to process jobs submitted using the console.

    Bj ensures that only one background process is running for your application -
    firing up three mongrels or fcgi processes will result in only one background
    runner being started. Note that the number of background runners does not
    determine throughput - that is determined primarily by the nature of the jobs
    themselves and how much work they perform per process.

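    The log naming convention from the Overview can be sketched in a few lines
    of Ruby. The helper name bj_log_path is hypothetical and is not part of Bj;
    it only assembles the documented pattern from a rails root, an environment
    name, and the local hostname.

```ruby
require "socket"

# Hypothetical helper (not part of Bj): builds the runner log path following
# the documented convention rails_root/log/bj.HOSTNAME.RAILS_ENV.log.
def bj_log_path(rails_root, rails_env, hostname: Socket.gethostname)
  File.join(rails_root, "log", "bj.#{hostname}.#{rails_env}.log")
end

puts bj_log_path("/var/www/app", "production", hostname: "web1")
```

    Passing the hostname explicitly, as above, makes it easy to work out which
    file to inspect on another machine in a cluster.
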
  ________________________________
  Architecture
  --------------------------------

    If one ignores platform specific details the design of Bj is quite simple: the
    main Rails application submits jobs to a table stored in the database. The act
    of submitting triggers exactly one of two things to occur:

      1) a new long running background runner to be started

      2) an existing background runner to be signaled

    The background runner refuses to run two copies of itself for a given
    hostname/rails_env combination. For example you may only have one background
    runner processing jobs on localhost in development mode.

    The background runner, under normal circumstances, is managed by Bj itself -
    you need do nothing to start, monitor, or stop it - it just works. However,
    some people will prefer to manage their own background process, see the
    'External Runner' section below for more on this.

    The runner simply processes each job in a highest priority oldest-in fashion,
    capturing stdout, stderr, exit_status, etc. and storing the information back
    into the database while logging its actions. When there are no jobs to run
    the runner goes to sleep for 42 seconds; however this sleep is interruptible,
    such as when the runner is signaled that a new job has been submitted, so
    under normal circumstances there will be zero lag between job submission and
    job running for an empty queue.

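    The two ideas in the last paragraph, "highest priority oldest-in" selection
    and an interruptible sleep, can be illustrated with a minimal sketch. It is
    not taken from Bj's source: the Job struct, next_job, and InterruptibleSleep
    are hypothetical names, and the self-pipe technique used for waking the
    sleeper is one common approach, assumed here for illustration only.

```ruby
# Illustrative stand-in for a row in the jobs table (hypothetical fields).
Job = Struct.new(:id, :priority, :submitted_at)

# Highest priority first; among equal priorities the oldest submission wins.
# Returns nil when the list is empty.
def next_job(jobs)
  jobs.min_by { |job| [-job.priority, job.submitted_at] }
end

# A sleep that another thread or a signal handler can cut short.
# Writing a byte to the pipe makes IO.select return before the timeout.
class InterruptibleSleep
  def initialize
    @reader, @writer = IO.pipe
  end

  # Sleeps for up to `seconds`; returns true if woken early, false on timeout.
  def nap(seconds)
    ready = IO.select([@reader], nil, nil, seconds)
    return false unless ready
    @reader.read_nonblock(1024, exception: false) # drain pending wake-ups
    true
  end

  # Wakes a current or future call to nap.
  def wake
    @writer.write_nonblock(".", exception: false)
  end
end

jobs = [Job.new(1, 0, 100), Job.new(2, 42, 300), Job.new(3, 42, 200)]
puts next_job(jobs).id # job 3: priority 42, submitted before job 2
```

    In a runner loop built this way, the idle branch would call nap(42) and the
    "new job submitted" handler would call wake, so an empty queue picks up new
    work without waiting out the full sleep.
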
  ________________________________
  External Runner / Clustering
  --------------------------------

    For the paranoid control freaks out there (myself included) it is quite
    possible to manage and monitor the runner process manually. This can be
    desirable in production setups where monitoring software may kill leaking
    rails apps periodically.

    Recalling that Bj will only allow one copy of itself to process jobs per
    hostname/rails_env pair we can simply do something like this in cron:

      cmd = bj run --forever \
                   --rails_env=development \
                   --rails_root=/Users/ahoward/rails_root

      */15 * * * * $cmd

    This will simply attempt to start the background runner every 15 minutes if,
    and only if, it's not *already* running.

    In addition to this you'll want to tell Bj not to manage the runner itself
    using

      Bj.config["production.no_tickle"] = true

    Note that, for clustering setups, it's as simple as adding a crontab and config
    entry like this for each host. Because Bj throttles background runners per
    hostname this will allow one runner per hostname - making it quite simple to
    cluster three nodes behind a besieged rails application.

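    The "one runner per hostname/rails_env pair" rule is what makes the cron
    entry safe to fire repeatedly. How Bj enforces it is not described here, so
    the following is only a sketch of the general pattern, using an advisory
    file lock as an assumed mechanism. The helper acquire_runner_lock and the
    lock file name are hypothetical.

```ruby
require "socket"
require "tmpdir"

# Hypothetical helper, not part of Bj. Returns the open lock file when this
# process is the only holder for the hostname/rails_env pair, or nil when
# another holder already exists.
def acquire_runner_lock(rails_env, dir: Dir.tmpdir, hostname: Socket.gethostname)
  path = File.join(dir, "bj.#{hostname}.#{rails_env}.lock")
  file = File.open(path, File::RDWR | File::CREAT, 0o644)
  if file.flock(File::LOCK_EX | File::LOCK_NB)
    file.truncate(0)
    file.write(Process.pid.to_s) # record the holder for debugging
    file.flush
    file # keep a reference: closing the file releases the lock
  else
    file.close
    nil
  end
end

lock = acquire_runner_lock("development")
puts(lock ? "runner may start" : "runner already active, exiting")
```

    A second invocation for the same pair gets nil and can exit quietly, which
    mirrors the "if, and only if, it's not already running" behaviour described
    above. The lock disappears automatically when the holding process exits.
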
  ________________________________
  Designing Jobs
  --------------------------------

    Bj runs its jobs as command line applications. It ensures that all jobs run
    in RAILS_ROOT so it's quite natural to apply a pattern such as

      mkdir ./jobs
      edit ./jobs/background_job_to_run

      ...

      Bj.submit "./jobs/background_job_to_run"

    If you need to run your jobs under an entire rails environment you'll need to
    do this:

      Bj.submit "./script/runner ./jobs/background_job_to_run"

    Obviously "./script/runner" loads the rails environment for you. It's worth
    noting that this happens for each job and that this is by design: the reason
    is that most rails applications leak memory like a sieve so, if one were to
    spawn a long running process that used the application code base you'd have a
    lovely doubling of memory usage on your app servers. Loading the rails
    environment for each background job costs a little time and a little cpu,
    but a lot less memory. A future version of Bj will provide a way to load
    the rails environment once and to process background jobs in this environment,
    but anyone wanting to use this in production will be required to duct tape
    their entire chest and have a team of oxen rip off the tape without screaming
    to prove steelyness of spirit and profound understanding of the other side.

    Don't forget that you can submit jobs with command line arguments:

      Bj.submit "./jobs/a.rb 1 foobar --force"

    and that you can do powerful things by passing stdin to a job that powers
    through a list of work. For instance, assume a "./jobs/bulkmail" job
    resembling

      STDIN.each do |line|
        address = line.strip
        mail_message_to address
      end

    then you could

      stdin = [
        "foo@bar.com",
        "bar@foo.com",
        "ara.t.howard@codeforpeople.com",
      ]

      Bj.submit "./script/runner ./jobs/bulkmail", :stdin => stdin

    and all those emails would be sent in the background.

    Bj's power is putting jobs in the background in a simple and robust fashion.
    It's your task to build intelligent jobs that leverage batch processing, and
    other, possibilities. The upshot of building tasks this way is that they are
    quite easy to test before submitting them from inside your application.

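    The closing point about testability can be made concrete. A minimal sketch,
    assuming the work loop of a stdin-driven job is written against any IO
    object: the same code can then be exercised with a StringIO before it is
    ever submitted. The method each_address is a hypothetical name. Joining the
    address list with newlines is also an assumption; how Bj itself turns the
    :stdin value into text is not shown here.

```ruby
require "stringio"

# Hypothetical job body: yields one cleaned-up address per non-blank input line.
# Accepts any IO-like object, so STDIN and StringIO both work.
def each_address(io)
  return enum_for(:each_address, io) unless block_given?

  io.each_line do |line|
    address = line.strip
    next if address.empty?
    yield address
  end
end

addresses = ["foo@bar.com", "bar@foo.com"]

# One address per line, the shape the job above expects on its stdin.
stdin_text = addresses.join("\n") + "\n"

each_address(StringIO.new(stdin_text)) do |address|
  puts "would mail #{address}"
end
```

    Inside the real job script the loop would be driven by STDIN instead, for
    example each_address(STDIN) { |address| mail_message_to address }.
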
  ________________________________
  Install
  --------------------------------

    Bj can be installed two ways: as a plugin or via rubygems

      plugin:
        1) ./script/plugin install http://codeforpeople.rubyforge.org/svn/rails/plugins/bj
        2) ./script/bj setup

      gem:
        1) sudo gem install bj
        2) add "require 'bj'" to config/environment.rb
        3) bj setup

  ________________________________
  Api
  --------------------------------

    submit jobs for background processing. 'jobs' can be a string or array of
    strings. options are applied to each job in the 'jobs', and the list of
    submitted jobs is always returned. options (string or symbol) can be

      :rails_env => production|development|key_in_database_yml
                    when given this keyword causes bj to submit jobs to the
                    specified database. default is RAILS_ENV.

      :priority => any number, including negative ones. default is zero.

      :tag => a tag added to the job. simply makes searching easier.

      :env => a hash specifying any additional environment vars the background
              process should have.

      :stdin => any stdin the background process should have. must respond_to
                to_s

    eg:

      jobs = Bj.submit 'echo foobar', :tag => 'simple job'

      jobs = Bj.submit '/bin/cat', :stdin => 'in the hat', :priority => 42

      jobs = Bj.submit './script/runner ./scripts/a.rb', :rails_env => 'production'

      jobs = Bj.submit './script/runner /dev/stdin',
                       :stdin => 'p RAILS_ENV',
                       :tag => 'dynamic ruby code'

      jobs = Bj.submit array_of_commands, :priority => 451

    when jobs are run, they are run in RAILS_ROOT. various attributes are
    available *only* once the job has finished. you can check whether or not a
    job is finished by using the #finished? method, which simply does a reload and
    checks to see if the exit_status is non-nil.

    eg:

      jobs = Bj.submit list_of_jobs, :tag => 'important'
      ...

      jobs.each do |job|
        if job.finished?
          p job.exit_status
          p job.stdout
          p job.stderr
        end
      end

    See lib/bj/api.rb for more details.

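    Since attributes such as exit_status only become available once a job has
    finished, callers that need results have to poll. The following is a
    minimal sketch of such a loop. wait_for_jobs is a hypothetical helper, not
    part of the Bj api, and it relies only on the finished? predicate described
    above. FakeJob is a stand-in so that the sketch runs without Rails or a
    database.

```ruby
# Hypothetical helper: polls until every job reports finished? or the timeout
# elapses. Returns true when all jobs finished, false on timeout.
def wait_for_jobs(jobs, timeout: 60, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  pending = jobs.dup
  loop do
    pending.reject!(&:finished?)
    return true if pending.empty?
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Stand-in for a job record: it "finishes" after a fixed number of polls.
FakeJob = Struct.new(:exit_status, :polls_left) do
  def finished?
    self.polls_left -= 1 if polls_left > 0
    self.exit_status = 0 if polls_left.zero?
    !exit_status.nil?
  end
end

jobs = [FakeJob.new(nil, 2), FakeJob.new(nil, 1)]
if wait_for_jobs(jobs, timeout: 5, interval: 0.01)
  jobs.each { |job| p job.exit_status }
end
```

    With real jobs each finished? call reloads the record, so a polling
    interval of a second or more keeps the database load modest.
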
  ________________________________
  Sponsors
  --------------------------------
    http://quintess.com/
    http://www.engineyard.com/
    http://igicom.com/
    http://eparklabs.com/

    http://your_company.com/ <<-- (targeted marketing aimed at *you*)

  ________________________________
  Version
  --------------------------------
    1.0.1

PARAMETERS
  --rails_root=rails_root, -R (0 ~> rails_root=)
      the rails_root will be guessed unless you set this
  --rails_env=rails_env, -E (0 ~> rails_env=development)
      set the rails_env
  --log=log, -l (0 ~> log=STDERR)
      set the logfile
  --help, -h

AUTHOR
  ara.t.howard@gmail.com

URIS
  http://codeforpeople.com/lib/ruby/
  http://rubyforge.org/projects/codeforpeople/
  http://codeforpeople.rubyforge.org/svn/rails/plugins/

data/Rakefile
ADDED
@@ -0,0 +1,18 @@
begin
  require 'jeweler'
  Jeweler::Tasks.new do |gemspec|
    gemspec.name = "bjj"
    gemspec.summary = "Backgroundjob"
    gemspec.description = "Universal capture of stdout and stderr and handling of child process pid for windows, *nix, etc."
    gemspec.email = "hahn@netseven.it"
    gemspec.homepage = "http://codeforpeople.com/lib/ruby/"
    # Note: this second assignment replaces the description set above.
    gemspec.description = "Special version for JRuby compatibility. Backgroundjob (Bj) is a brain dead simple zero admin background priority queue for Rails. Bj is robust, platform independent (including windows), and supports internal or external management of the background runner process."
    gemspec.authors = ["Ara T. Howard (ara.t.howard@noaa.gov)", "Daniel Hahn (JRuby modifications)"]
    gemspec.add_dependency 'main', '>= 2.6.0'
    gemspec.add_dependency 'systemu_j', '>= 1.2.0'
    gemspec.add_dependency 'orderedhash', '>= 0.0.3'
  end
  Jeweler::GemcutterTasks.new
rescue LoadError
  puts "Jeweler not available. Install it with: sudo gem install technicalpickles-jeweler -s http://gems.github.com"
end
data/TODO
ADDED
@@ -0,0 +1,40 @@

? the whole gem_path thing is still fubar

? commands need quoting, esp for windows, "c:\Documents And..." etc

- signals not operating properly on windows, non critical error tho...

- need to figure out how to cache connections for Bj.in(...)

- ttl will be added. maxing it out will cause auto-resubmission (Steve Midgley)

- is having the runner thread try forever to start the process the best thing?

- allow easy way to run ruby code. perhaps ./script/runner 'eval STDIN.read'
  is good enough

- allow easy way to run ruby code that persists

- allow specification of runner on submit (--runner)

- allow specification of tags a runner will consume (--tag)

- flesh out the cli interface - it's a test only at this point

- test in windows

================================================================================

X ./script/console submission hangs on windows
X default PATH setting
X install issues for dave? - gem_path...
X main only loaded for (bin|script)/bj
X make it possible to declare externally managed runners
X restartable will be added. true by default (Steve Midgley)
X do the lifeline inline with the loop
X need to address the serializable writer issue (:lock => true ??)
X migrations use --force
X i forgot to add "#{ Bj.ruby } ... " to the generate command
X ./script/bj must be found in path before c:/.....bin/bj
X make sure database.yml is loaded via YAML::load(ERB.new(File.read("config/database.yml")).result)
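The last completed item, loading database.yml through ERB before parsing it as YAML, can be sketched as follows. The helper names are hypothetical and this is not Bj's own loading code; it only shows the two-step render-then-parse pattern with the standard erb and yaml libraries.

```ruby
require "erb"
require "yaml"

# Hypothetical helper: renders any ERB tags in the text first, then parses
# the rendered result as YAML.
def load_database_config(text)
  YAML.load(ERB.new(text).result)
end

# Convenience wrapper for a file on disk; the caller supplies the path.
def load_database_config_file(path)
  load_database_config(File.read(path))
end

text = <<~YML
  development:
    adapter: sqlite3
    pool: <%= 2 + 3 %>
YML

p load_database_config(text)["development"]["pool"]
```

Rendering before parsing is what allows values such as pool sizes or credentials to be computed, for example from environment variables, at load time.
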
data/VERSION.yml
ADDED