beaneater 0.1.0

data/.gitignore ADDED
@@ -0,0 +1,17 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
data/.yardopts ADDED
@@ -0,0 +1,8 @@
1
+ --output-dir doc/
2
+ --readme README.md
3
+ --title Beaneater
4
+ --markup-provider=redcarpet
5
+ --markup=markdown
6
+ --protected
7
+ --no-private
8
+ lib/beaneater/**/*.rb
data/Gemfile ADDED
@@ -0,0 +1,10 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in beaneater.gemspec
4
+ gemspec
5
+
6
+ group :development do
7
+ gem 'redcarpet', '~> 1'
8
+ gem 'github-markup'
9
+ gem 'yard'
10
+ end
data/LICENSE.txt ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2012 Nico Taing
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,399 @@
1
+ # Beaneater
2
+
3
+ Beaneater is the best way to interact with beanstalkd from within Ruby.
4
+ [Beanstalkd](http://kr.github.com/beanstalkd/) is a simple, fast work queue. Its interface is generic, but was
5
+ originally designed for reducing the latency of page views in high-volume web applications by
6
+ running time-consuming tasks asynchronously. Read the
7
+ [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for more details.
8
+
9
+
10
+ ## Why Beanstalk?
11
+
12
+ Ilya Grigorik has an excellent blog post
13
+ [Scalable Work Queues with Beanstalk](http://www.igvita.com/2010/05/20/scalable-work-queues-with-beanstalk/) and
14
+ Adam Wiggins posted [an excellent comparison](http://adam.heroku.com/past/2010/4/24/beanstalk_a_simple_and_fast_queueing_backend/).
15
+
16
+ You will quickly see that **beanstalkd** is an underrated but incredible project that is extremely well-suited as a job queue.
17
+ It is significantly better suited for this task than Redis or a database. Beanstalk is a simple
17
+ and very fast work queue service rolled into a single binary - it is the memcached of work queues.
19
+ Originally built to power the backend for the 'Causes' Facebook app, it is a mature and production ready open source project.
20
+ [PostRank](http://www.postrank.com) uses beanstalk to reliably process millions of jobs a day.
21
+
22
+ A single instance of Beanstalk is perfectly capable of handling thousands of jobs a second (or more, depending on your job size)
23
+ because it is an in-memory, event-driven system. Powered by libevent under the hood,
24
+ it requires zero setup (launch and forget, à la memcached), optional log based persistence, an easily parsed ASCII protocol,
25
+ and a rich set of tools for job management that go well beyond a simple FIFO work queue.
26
+
27
+ Beanstalkd supports the following features out of the box:
28
+
29
+ | Feature | Description |
30
+ | ------- | ------------------------------- |
31
+ | **Parallel Queues** | Supports multiple work queues created on demand. |
32
+ | **Reliable** | Beanstalk’s reserve, work, delete cycle ensures reliable processing. |
33
+ | **Scheduling** | Delay enqueuing jobs by a specified interval to schedule processing later |
34
+ | **Fast** | Processes thousands of jobs per second; **significantly** faster than alternatives. |
35
+ | **Priorities** | Specify priority so important jobs can be processed quickly. |
36
+ | **Persistence** | Jobs are stored in memory for speed, but logged to disk for safe keeping. |
37
+ | **Federation** | Horizontal scalability provided through federation by the client. |
38
+ | **Error Handling** | Bury any job which causes an error for later debugging and inspection.|
39
+
40
+ Keep in mind that these features are supported out of the box with beanstalk and require no special code within this gem to support.
41
+ In the end, **beanstalk is the ideal job queue** while also being ridiculously easy to install and set up.
42
+
43
+ ## Installation
44
+
45
+ Install beanstalkd:
46
+
47
+ Mac OS
48
+
49
+ ```
50
+ brew update
51
+ brew install beanstalkd
52
+ beanstalkd -p 11300
53
+ ```
54
+
55
+ Ubuntu
56
+
57
+ ```
58
+ apt-get install beanstalkd
59
+ beanstalkd -p 11300
60
+ ```
61
+
62
+ Install beaneater as a gem:
63
+
64
+ ```
65
+ gem install beaneater
66
+ ```
67
+
68
+ or add this to your Gemfile:
69
+
70
+ ```ruby
71
+ # Gemfile
72
+ gem 'beaneater'
73
+ ```
74
+
75
+ and run `bundle install` to install the dependency.
76
+
77
+ ## Usage
78
+
79
+ ### Connection
80
+
81
+ To interact with a beanstalk queue, first establish a connection by providing a set of addresses:
82
+
83
+ ```ruby
84
+ @beanstalk = Beaneater::Pool.new(['10.0.1.5:11300'])
85
+ ```
86
+
87
+ You can conversely close and dispose of a pool at any time with:
88
+
89
+ ```ruby
90
+ @beanstalk.close
91
+ ```
92
+
93
+ ### Tubes
94
+
95
+ Beanstalkd has one or more tubes which can contain any number of jobs.
96
+ Jobs can be inserted (put) into the used tube and pulled out (reserved) from watched tubes.
97
+ Each tube consists of a _ready_, _delayed_, and _buried_ queue for jobs.
98
+
99
+ When a client connects, its watch list is initially just the tube named `default`.
100
+ Tube names are at most 200 bytes. When a client names a tube to use and that tube does not exist, it is created automatically.
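
As a rough illustration of the naming rules, the sketch below checks a candidate tube name before it is sent to the server. The `valid_tube_name?` helper and both constants are hypothetical and not part of Beaneater; the allowed character set follows the beanstalk protocol document (letters, digits and `-+/;.$_()`, not starting with a hyphen).

```ruby
# Hypothetical helper, not part of Beaneater: checks a tube name against the
# naming rules described in the beanstalk protocol document.
MAX_TUBE_NAME_BYTES = 200
TUBE_NAME_PATTERN = %r{\A[A-Za-z0-9+/;.$_()][A-Za-z0-9\-+/;.$_()]*\z}

def valid_tube_name?(name)
  name.is_a?(String) &&
    name.bytesize <= MAX_TUBE_NAME_BYTES &&
    TUBE_NAME_PATTERN.match?(name)
end

valid_tube_name?('some-tube-here') # => true
valid_tube_name?('-leading-hyphen') # => false
valid_tube_name?('a' * 201)         # => false
```

Checking on the client side only saves a round trip; the server remains the authority on what it accepts.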
101
+
102
+ To interact with a tube, first `find` the tube:
103
+
104
+ ```ruby
105
+ @tube = @beanstalk.tubes.find "some-tube-here"
106
+ # => <Tube name='some-tube-here'>
107
+ ```
108
+
109
+ To reserve jobs from beanstalk, you will need to 'watch' certain tubes:
110
+
111
+ ```ruby
112
+ # Watch only the tubes listed below (!)
113
+ @beanstalk.tubes.watch!('some-tube')
114
+ # Append tubes to existing set of watched tubes
115
+ @beanstalk.tubes.watch('another-tube')
116
+ # You can also ignore tubes that have been watched previously
117
+ @beanstalk.tubes.ignore('some-tube')
118
+ ```
119
+
120
+ You can easily get a list of all, used or watched tubes:
121
+
122
+ ```ruby
123
+ # The list-tubes command returns a list of all existing tubes
124
+ @beanstalk.tubes.all
125
+ # => [<Tube name='foo'>, <Tube name='bar'>]
126
+
127
+ # Returns the tube currently being used by the client (for insertion)
128
+ @beanstalk.tubes.used
129
+ # => <Tube name='bar'>
130
+
131
+ # Returns a list of tubes currently being watched by the client (for consumption)
132
+ @beanstalk.tubes.watched
133
+ # => [<Tube name='foo'>]
134
+ ```
135
+
136
+ You can also temporarily 'pause' the execution of a tube by specifying the time:
137
+
138
+ ```ruby
139
+ tube = @beanstalk.tubes["some-tube-here"]
140
+ tube.pause(3) # pauses tube for 3 seconds
141
+ ```
142
+
143
+ or even clear the tube of all jobs:
144
+
145
+ ```ruby
146
+ tube = @beanstalk.tubes["some-tube-here"]
147
+ tube.clear # tube will now be empty
148
+ ```
149
+
150
+ In summary, each beanstalk client manages two separate concerns: which tube newly created jobs are put into,
151
+ and which tube(s) jobs are reserved from. Accordingly, there are two separate sets of functions for these concerns:
152
+
153
+ * **use** and **using** affect where 'put' places jobs
154
+ * **watch** and **watching** control where reserve takes jobs from
155
+
156
+ Note that these concerns are fully orthogonal: for example, when you 'use' a tube, it is not automatically 'watched'.
157
+ Neither does 'watching' a tube affect the tube you are 'using'.
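
To make the separation concrete, here is a toy in-memory model of the two per-client settings. `ToyClient` is a made-up class for illustration only; it does not talk to beanstalkd and is not how Beaneater is implemented.

```ruby
# Toy in-memory model (not Beaneater's API) of the two per-client settings.
class ToyClient
  attr_reader :used, :watched

  def initialize
    @used = 'default'        # tube that `put` inserts into
    @watched = ['default']   # tubes that `reserve` pulls from
  end

  def use(tube)
    @used = tube             # leaves the watch list untouched
  end

  def watch(tube)
    @watched |= [tube]       # leaves the used tube untouched
  end

  def ignore(tube)
    # The server refuses to ignore the last watched tube, so keep at least one.
    @watched -= [tube] if @watched.size > 1
  end
end

client = ToyClient.new
client.use('emails')
client.watch('reports')
client.used     # => "emails"
client.watched  # => ["default", "reports"]
```

Note that after `use('emails')` the client still reserves only from `default` and `reports`.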
158
+
159
+ ### Jobs
160
+
161
+ A job in beanstalk gets inserted by a client and includes the 'body' and job metadata.
162
+ Each job is enqueued into a tube and later reserved and processed. Here is a picture of the typical job lifecycle:
163
+
164
+ ```
165
+ put reserve delete
166
+ -----> [READY] ---------> [RESERVED] --------> *poof*
167
+ ```
168
+
169
+ Unless it is currently **reserved** by a worker, a job at any given time is in one of three states: **ready**, **delayed**, or **buried**:
170
+
171
+ | State | Description |
172
+ | ------- | ------------------------------- |
173
+ | ready | waiting to be `reserved` and processed after being `put` onto a tube. |
174
+ | delayed | waiting to become `ready` after the specified delay. |
175
+ | buried | waiting to be kicked, usually after job fails to process |
176
+
177
+ In addition, there are several actions that can be performed on a given job:
178
+
179
+ * **reserve** which locks a job from the ready queue for processing.
180
+ * **touch** which extends the time before a job is autoreleased back to ready.
181
+ * **release** which places a reserved job back onto the ready queue.
182
+ * **delete** which removes a job from beanstalk.
183
+ * **bury** which places a reserved job into the buried state.
184
+ * **kick** which places a buried job from the buried queue back to ready.
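
The actions above can be read as a small state machine. The following sketch encodes a simplified version of it as a lookup table; it is an illustration of the lifecycle rather than Beaneater code, and it deliberately leaves out details such as releasing with a delay. The names `TRANSITIONS` and `next_state` are invented for this example.

```ruby
# Simplified transition table for the job lifecycle described above.
# Illustration only; not taken from Beaneater or beanstalkd.
TRANSITIONS = {
  [:ready,    :reserve] => :reserved,
  [:reserved, :touch]   => :reserved,
  [:reserved, :release] => :ready,
  [:reserved, :bury]    => :buried,
  [:reserved, :delete]  => :deleted,
  [:buried,   :kick]    => :ready,
}.freeze

def next_state(state, action)
  TRANSITIONS.fetch([state, action]) do
    raise ArgumentError, "cannot #{action} a #{state} job"
  end
end

state = :ready
[:reserve, :bury, :kick, :reserve, :delete].each do |action|
  state = next_state(state, action)
end
state # => :deleted
```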
185
+
186
+ You can insert a job onto a beanstalk tube using the `put` command:
187
+
188
+ ```ruby
189
+ @tube.put "job-data-here"
190
+ ```
191
+
192
+ Beanstalkd can only store strings as job bodies, but you can easily encode your data into a string:
193
+
194
+ ```ruby
195
+ @tube.put({:foo => 'bar'}.to_json)
196
+ ```
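
For completeness, here is a minimal round trip using Ruby's standard `json` library, with no beanstalkd involved. The payload keys are made up for the example, and `body` stands in for what a worker would later read from the job.

```ruby
require 'json'

# Producer side: serialize a hash into a string body.
payload = { 'user_id' => 42, 'action' => 'send_welcome_email' }
body = JSON.generate(payload)

# Consumer side: `body` stands in for the string read back from the job.
decoded = JSON.parse(body)
decoded['user_id'] # => 42
```

Keep in mind that symbol keys come back as strings after parsing, which is why the example uses string keys throughout.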
197
+
198
+ Each job has various metadata associated such as `priority`, `delay`, and `ttr` which can be
199
+ specified as part of the `put` command:
200
+
201
+ ```ruby
202
+ # defaults are priority 0, delay of 0 and ttr of 120 seconds
203
+ @tube.put "job-data-here", :pri => 1000, :delay => 50, :ttr => 200
204
+ ```
205
+
206
+ The `priority` argument is an integer < 2**32. Jobs with smaller priority values are reserved before jobs with larger values.
207
+ The `delay` argument is an integer number of seconds to wait before putting the job in the ready queue.
208
+ The `ttr` argument is the time to run: an integer number of seconds to allow a worker to run this job.
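
As a small worked example of the priority rule, the sketch below orders a few ready jobs the way a worker would receive them: lowest priority value first, oldest job first among equal priorities. The `ReadyJob` struct and the job bodies are invented for illustration and do not involve beanstalkd.

```ruby
# Toy model of a ready queue: lowest priority value first, and
# first-in-first-out among jobs that share a priority.
ReadyJob = Struct.new(:id, :pri, :body)

jobs = [
  ReadyJob.new(1, 1000, 'resize-image'),
  ReadyJob.new(2, 0,    'charge-card'),
  ReadyJob.new(3, 1000, 'send-newsletter'),
]

reserve_order = jobs.sort_by { |job| [job.pri, job.id] }.map(&:body)
# => ["charge-card", "resize-image", "send-newsletter"]
```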
209
+
210
+ ### Processing Jobs (Manually)
211
+
212
+ In order to process jobs, the client should first specify the intended tubes to be watched. If not specified,
213
+ this will default to watching just the `default` tube.
214
+
215
+ ```ruby
216
+ @beanstalk = Beaneater::Pool.new(['10.0.1.5:11300'])
217
+ @beanstalk.tubes.watch!('tube-name', 'other-tube')
218
+ ```
219
+
220
+ Next you can use the `reserve` command which will return the first available job within the watched tubes:
221
+
222
+ ```ruby
223
+ job = @beanstalk.tubes.reserve
224
+ # => <Beaneater::Job id=5 body="foo">
225
+ puts job.body
226
+ # prints 'foo'
227
+ print job.stats.state # => 'reserved'
228
+ ```
229
+
230
+ By default, reserve will wait indefinitely for the next job. If you want to specify a timeout,
231
+ simply pass that in seconds into the command:
232
+
233
+ ```ruby
234
+ job = @beanstalk.tubes.reserve(5) # wait 5 secs for a job, then return
235
+ # => <Beaneater::Job id=5 body="foo">
236
+ ```
237
+
238
+ You can 'release' a reserved job back onto the ready queue to retry later:
239
+
240
+ ```ruby
241
+ job = @beanstalk.tubes.reserve
242
+ # ...job fails transiently...
243
+ job.release :delay => 5
244
+ print job.stats.state # => 'delayed'
245
+ ```
246
+
247
+ You can also 'delete' jobs that are finished:
248
+
249
+ ```ruby
250
+ job = @beanstalk.tubes.reserve
251
+ job.touch # extends ttr for job
252
+ # ...process job...
253
+ job.delete
254
+ ```
255
+
256
+ Beanstalk jobs can also be buried if they fail, rather than being deleted:
257
+
258
+ ```ruby
259
+ job = @beanstalk.tubes.reserve
260
+ # ...job fails...
261
+ job.bury
262
+ print job.stats.state # => 'buried'
263
+ ```
264
+
265
+ Burying a job means that the job is pulled out of the queue into a special 'holding' area for later inspection or reuse.
266
+ To reanimate this job later, you can 'kick' buried jobs back into being ready:
267
+
268
+ ```ruby
269
+ @beanstalk.tubes['some-tube'].kick(3)
270
+ ```
271
+
272
+ This kicks 3 buried jobs for 'some-tube' back into the 'ready' state. Jobs can also be
273
+ inspected using the 'peek' commands. To find and peek at a particular job based on the id:
274
+
275
+ ```ruby
276
+ @beanstalk.jobs.find(123)
277
+ # => <Beaneater::Job id=123 body="foo">
278
+ ```
279
+
280
+ or you can peek at jobs within a tube:
281
+
282
+ ```ruby
283
+ @tube = @beanstalk.tubes.find('foo')
284
+ @tube.peek(:ready)
285
+ # => <Beaneater::Job id=123 body="ready">
286
+ @tube.peek(:buried)
287
+ # => <Beaneater::Job id=456 body="buried">
288
+ @tube.peek(:delayed)
289
+ # => <Beaneater::Job id=789 body="delayed">
290
+ ```
291
+
292
+ When dealing with jobs there are a few other useful commands available:
293
+
294
+ ```ruby
295
+ job = @beanstalk.tubes.reserve
296
+ print job.tube # => "some-tube-name"
297
+ print job.reserved? # => true
298
+ print job.exists? # => true
299
+ job.delete
300
+ print job.exists? # => false
301
+ ```
302
+
303
+ ### Processing Jobs (Automatically)
304
+
305
+ Instead of using `watch` and `reserve`, you can also use the higher level `register` and `process!` methods to
306
+ process jobs. First you can 'register' how to handle jobs from various tubes:
307
+
308
+ ```ruby
309
+ @beanstalk.jobs.register('some-tube', :retry_on => [SomeError]) do |job|
310
+ do_something(job)
311
+ end
312
+
313
+ @beanstalk.jobs.register('other-tube') do |job|
314
+ do_something_else(job)
315
+ end
316
+ ```
317
+
318
+ Once you have registered the handlers for known tubes, calling `process!` will begin a
319
+ loop processing jobs as defined by the registered processor blocks:
320
+
321
+ ```ruby
322
+ @beanstalk.jobs.process!
323
+ ```
324
+
325
+ Processing runs the following steps:
326
+
327
+ 1. Watch all registered tubes
328
+ 1. Reserve the next job
329
+ 1. Once job is reserved, invoke the registered handler based on the tube name
330
+ 1. If no exceptions occur, delete the job (success)
331
+ 1. If 'retry_on' exceptions occur, call 'release' (retry)
332
+ 1. If other exception occurs, call 'bury' (error)
333
+ 1. Repeat steps 2-6
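
The decision made for each reserved job in the steps above can be sketched in plain Ruby. Everything here is a stand-in written for this example: `SketchJob`, `TransientError`, `HANDLERS`, `register` and `process_one` are hypothetical names, and the code runs in memory without beanstalkd. It is not Beaneater's actual implementation.

```ruby
# In-memory stand-in for a reserved job; records what was done with it.
class SketchJob
  attr_reader :tube, :body, :outcome

  def initialize(tube, body)
    @tube = tube
    @body = body
    @outcome = nil
  end

  def delete;  @outcome = :deleted;  end
  def release; @outcome = :released; end
  def bury;    @outcome = :buried;   end
end

class TransientError < StandardError; end

# tube name => { block: handler, retry_on: [exception classes] }
HANDLERS = {}

def register(tube, retry_on: [], &block)
  HANDLERS[tube] = { block: block, retry_on: retry_on }
end

# Handles one already-reserved job and returns what happened to it.
def process_one(job)
  handler = HANDLERS.fetch(job.tube)
  begin
    handler[:block].call(job)
    job.delete                      # no exception: success
  rescue StandardError => e
    if handler[:retry_on].any? { |klass| e.is_a?(klass) }
      job.release                   # listed in retry_on: retry later
    else
      job.bury                      # anything else: keep for inspection
    end
  end
  job.outcome
end

register('some-tube', retry_on: [TransientError]) do |job|
  raise TransientError if job.body == 'flaky'
end
register('other-tube') { |job| raise 'boom' if job.body == 'broken' }

process_one(SketchJob.new('some-tube', 'ok'))       # => :deleted
process_one(SketchJob.new('some-tube', 'flaky'))    # => :released
process_one(SketchJob.new('other-tube', 'broken'))  # => :buried
```

A real worker would wrap `process_one` in a loop that reserves the next job from the watched tubes.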
334
+
335
+ The `process!` command is ideally suited for a beanstalk job processing daemon.
336
+
337
+ ### Handling Errors
338
+
339
+ While using Beaneater, certain errors may be encountered. Errors are encountered when
340
+ a command is sent to beanstalk and something unexpected happens. The most common errors
341
+ are listed below:
342
+
343
+ | Errors | Description |
344
+ | -------------------- | ------------- |
345
+ | Beaneater::NotConnected | Client connection to beanstalk cannot be established. |
346
+ | Beaneater::InvalidTubeName | Specified tube name for use or watch is not valid. |
347
+ | Beaneater::NotFoundError | Specified job or tube could not be found. |
348
+ | Beaneater::TimedOutError | Job could not be reserved within time specified. |
349
+
350
+ There are other exceptions that are less common such as `OutOfMemoryError`, `DrainingError`,
351
+ `DeadlineSoonError`, `InternalError`, `BadFormatError`, `UnknownCommandError`,
352
+ `ExpectedCRLFError`, `JobTooBigError`, `NotIgnoredError`. Be sure to check the
353
+ [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for more information.
354
+
355
+
356
+ ### Stats
357
+
358
+ Beanstalk has plenty of commands for introspecting the state of the queues and jobs. To get stats for
359
+ beanstalk overall:
360
+
361
+ ```ruby
362
+ # Get overall stats about the job processing that has occurred
363
+ print @beanstalk.stats
364
+ # => { 'current_connections': 1, 'current_jobs_buried': 0, ... }
365
+ print @beanstalk.stats.current_connections
366
+ # => 1
367
+ ```
368
+
369
+ For stats on a particular tube:
370
+
371
+ ```ruby
372
+ # Get statistical information about the specified tube if it exists
373
+ print @beanstalk.tubes['some_tube_name'].stats
374
+ # => { 'current_jobs_ready': 0, 'current_jobs_reserved': 0, ... }
375
+ ```
376
+
377
+ For stats on an individual job:
378
+
379
+ ```ruby
380
+ # Get statistical information about the specified job if it exists
381
+ print @beanstalk.jobs[some_job_id].stats
382
+ # => {'age': 0, 'id': 2, 'state': 'reserved', 'tube': 'default', ... }
383
+ ```
384
+
385
+ Be sure to check the [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for
386
+ more details about the stats commands.
387
+
388
+ ## Resources
389
+
390
+ Other resources that are helpful when learning about beanstalk:
391
+
392
+ * [Beanstalkd homepage](http://kr.github.com/beanstalkd/)
393
+ * [beanstalk on github](https://github.com/kr/beanstalkd)
394
+ * [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md)
395
+
396
+ ## Contributors
397
+
398
+ - [Nico Taing](https://github.com/Nico-Taing) - Creator and co-maintainer
399
+ - [Nathan Esquenazi](https://github.com/nesquena) - Contributor and co-maintainer