beaneater 0.1.0
- data/.gitignore +17 -0
- data/.yardopts +8 -0
- data/Gemfile +10 -0
- data/LICENSE.txt +22 -0
- data/README.md +399 -0
- data/REF +23 -0
- data/Rakefile +23 -0
- data/TODO +2 -0
- data/beaneater.gemspec +24 -0
- data/examples/demo.rb +96 -0
- data/lib/beaneater.rb +10 -0
- data/lib/beaneater/connection.rb +110 -0
- data/lib/beaneater/errors.rb +73 -0
- data/lib/beaneater/job.rb +2 -0
- data/lib/beaneater/job/collection.rb +91 -0
- data/lib/beaneater/job/record.rb +174 -0
- data/lib/beaneater/pool.rb +141 -0
- data/lib/beaneater/pool_command.rb +71 -0
- data/lib/beaneater/stats.rb +55 -0
- data/lib/beaneater/stats/fast_struct.rb +96 -0
- data/lib/beaneater/stats/stat_struct.rb +39 -0
- data/lib/beaneater/tube.rb +2 -0
- data/lib/beaneater/tube/collection.rb +134 -0
- data/lib/beaneater/tube/record.rb +158 -0
- data/lib/beaneater/version.rb +4 -0
- data/test/beaneater_test.rb +115 -0
- data/test/connection_test.rb +64 -0
- data/test/errors_test.rb +26 -0
- data/test/job_test.rb +213 -0
- data/test/jobs_test.rb +107 -0
- data/test/pool_command_test.rb +68 -0
- data/test/pool_test.rb +154 -0
- data/test/stat_struct_test.rb +41 -0
- data/test/stats_test.rb +42 -0
- data/test/test_helper.rb +21 -0
- data/test/tube_test.rb +164 -0
- data/test/tubes_test.rb +153 -0
- metadata +181 -0
data/.gitignore
ADDED
data/.yardopts
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,22 @@
|
|
1
|
+
Copyright (c) 2012 Nico Taing
|
2
|
+
|
3
|
+
MIT License
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
6
|
+
a copy of this software and associated documentation files (the
|
7
|
+
"Software"), to deal in the Software without restriction, including
|
8
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
9
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
10
|
+
permit persons to whom the Software is furnished to do so, subject to
|
11
|
+
the following conditions:
|
12
|
+
|
13
|
+
The above copyright notice and this permission notice shall be
|
14
|
+
included in all copies or substantial portions of the Software.
|
15
|
+
|
16
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
17
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
18
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
19
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
20
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
21
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
22
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,399 @@
|
|
1
|
+
# Beaneater
|
2
|
+
|
3
|
+
Beaneater is the best way to interact with beanstalkd from within Ruby.
|
4
|
+
[Beanstalkd](http://kr.github.com/beanstalkd/) is a simple, fast work queue. Its interface is generic, but was
|
5
|
+
originally designed for reducing the latency of page views in high-volume web applications by
|
6
|
+
running time-consuming tasks asynchronously. Read the
|
7
|
+
[beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for more details.
|
8
|
+
|
9
|
+
|
10
|
+
## Why Beanstalk?
|
11
|
+
|
12
|
+
Ilya Grigorik has an excellent blog post
|
13
|
+
[Scalable Work Queues with Beanstalk](http://www.igvita.com/2010/05/20/scalable-work-queues-with-beanstalk/) and
|
14
|
+
Adam Wiggins posted [an excellent comparison](http://adam.heroku.com/past/2010/4/24/beanstalk_a_simple_and_fast_queueing_backend/).
|
15
|
+
|
16
|
+
You will quickly see that **beanstalkd** is an underrated but incredible project that is extremely well-suited as a job queue.
|
17
|
+
It is significantly better suited for this task than Redis or a database. Beanstalk is a simple
|
18
|
+
and very fast work queue service rolled into a single binary; it is the memcached of work queues.
|
19
|
+
Originally built to power the backend for the 'Causes' Facebook app, it is a mature and production-ready open source project.
|
20
|
+
[PostRank](http://www.postrank.com) uses beanstalk to reliably process millions of jobs a day.
|
21
|
+
|
22
|
+
A single instance of Beanstalk is perfectly capable of handling thousands of jobs a second (or more, depending on your job size)
|
23
|
+
because it is an in-memory, event-driven system. Powered by libevent under the hood,
|
24
|
+
it requires zero setup (launch and forget, à la memcached) and offers optional log-based persistence, an easily parsed ASCII protocol,
|
25
|
+
and a rich set of tools for job management that go well beyond a simple FIFO work queue.
|
26
|
+
|
27
|
+
Beanstalkd supports the following features out of the box:
|
28
|
+
|
29
|
+
| Feature | Description |
|
30
|
+
| ------- | ------------------------------- |
|
31
|
+
| **Parallel Queues** | Supports multiple work queues created on demand. |
|
32
|
+
| **Reliable** | Beanstalk’s reserve, work, delete cycle ensures reliable processing. |
|
33
|
+
| **Scheduling** | Delay enqueuing jobs by a specified interval to schedule processing later. |
|
34
|
+
| **Fast** | Processes thousands of jobs per second; **significantly** faster than alternatives. |
|
35
|
+
| **Priorities** | Specify priority so important jobs can be processed quickly. |
|
36
|
+
| **Persistence** | Jobs are stored in memory for speed, but logged to disk for safe keeping. |
|
37
|
+
| **Federation** | Horizontal scalability provided through federation by the client. |
|
38
|
+
| **Error Handling** | Bury any job which causes an error for later debugging and inspection. |
|
39
|
+
|
40
|
+
Keep in mind that these features are supported out of the box with beanstalk and require no special code within this gem to support.
|
41
|
+
In the end, **beanstalk is the ideal job queue** while also being ridiculously easy to install and set up.
|
42
|
+
|
43
|
+
## Installation
|
44
|
+
|
45
|
+
Install beanstalkd:
|
46
|
+
|
47
|
+
Mac OS
|
48
|
+
|
49
|
+
```
|
50
|
+
brew update
|
51
|
+
brew install beanstalkd
|
52
|
+
beanstalkd -p 11300
|
53
|
+
```
|
54
|
+
|
55
|
+
Ubuntu
|
56
|
+
|
57
|
+
```
|
58
|
+
apt-get install beanstalkd
|
59
|
+
beanstalkd -p 11300
|
60
|
+
```
|
61
|
+
|
62
|
+
Install beaneater as a gem:
|
63
|
+
|
64
|
+
```
|
65
|
+
gem install beaneater
|
66
|
+
```
|
67
|
+
|
68
|
+
or add this to your Gemfile:
|
69
|
+
|
70
|
+
```ruby
|
71
|
+
# Gemfile
|
72
|
+
gem 'beaneater'
|
73
|
+
```
|
74
|
+
|
75
|
+
and run `bundle install` to install the dependency.
|
76
|
+
|
77
|
+
## Usage
|
78
|
+
|
79
|
+
### Connection
|
80
|
+
|
81
|
+
To interact with a beanstalk queue, first establish a connection by providing a set of addresses:
|
82
|
+
|
83
|
+
```ruby
|
84
|
+
@beanstalk = Beaneater::Pool.new(['10.0.1.5:11300'])
|
85
|
+
```
|
86
|
+
|
87
|
+
You can conversely close and dispose of a pool at any time with:
|
88
|
+
|
89
|
+
```ruby
|
90
|
+
@beanstalk.close
|
91
|
+
```
|
92
|
+
|
93
|
+
### Tubes
|
94
|
+
|
95
|
+
Beanstalkd has one or more tubes which can contain any number of jobs.
|
96
|
+
Jobs can be inserted (put) into the used tube and pulled out (reserved) from watched tubes.
|
97
|
+
Each tube consists of a _ready_, _delayed_, and _buried_ queue for jobs.
|
98
|
+
|
99
|
+
When a client connects, its watch list is initially just the tube named `default`.
|
100
|
+
Tube names are at most 200 bytes. If the tube you specify does not exist, it will be created automatically.
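The 200-byte limit counts bytes, not characters, so a client-side check should use `bytesize`. A minimal sketch of such a check follows; `tube_name_fits?` and `MAX_TUBE_NAME_BYTES` are hypothetical names for illustration and are not part of Beaneater:

```ruby
# Hypothetical pre-flight check for the tube name length limit.
MAX_TUBE_NAME_BYTES = 200

# True when the name is non-empty and no longer than 200 bytes.
def tube_name_fits?(name)
  bytes = name.to_s.bytesize
  bytes.positive? && bytes <= MAX_TUBE_NAME_BYTES
end

puts tube_name_fits?('some-tube-here') # true
puts tube_name_fits?('x' * 201)        # false
```

The sketch only covers length; the beanstalk protocol document also restricts which characters a name may contain.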
|
101
|
+
|
102
|
+
To interact with a tube, first `find` the tube:
|
103
|
+
|
104
|
+
```ruby
|
105
|
+
@tube = @beanstalk.tubes.find "some-tube-here"
|
106
|
+
# => <Tube name='some-tube-here'>
|
107
|
+
```
|
108
|
+
|
109
|
+
To reserve jobs from beanstalk, you will need to 'watch' certain tubes:
|
110
|
+
|
111
|
+
```ruby
|
112
|
+
# Watch only the tubes listed below (!)
|
113
|
+
@beanstalk.tubes.watch!('some-tube')
|
114
|
+
# Append tubes to existing set of watched tubes
|
115
|
+
@beanstalk.tubes.watch('another-tube')
|
116
|
+
# You can also ignore tubes that have been watched previously
|
117
|
+
@beanstalk.tubes.ignore('some-tube')
|
118
|
+
```
|
119
|
+
|
120
|
+
You can easily get a list of all, used or watched tubes:
|
121
|
+
|
122
|
+
```ruby
|
123
|
+
# The list-tubes command returns a list of all existing tubes
|
124
|
+
@beanstalk.tubes.all
|
125
|
+
# => [<Tube name='foo'>, <Tube name='bar'>]
|
126
|
+
|
127
|
+
# Returns the tube currently being used by the client (for insertion)
|
128
|
+
@beanstalk.tubes.used
|
129
|
+
# => <Tube name='bar'>
|
130
|
+
|
131
|
+
# Returns a list of tubes currently being watched by the client (for consumption)
|
132
|
+
@beanstalk.tubes.watched
|
133
|
+
# => [<Tube name='foo'>]
|
134
|
+
```
|
135
|
+
|
136
|
+
You can also temporarily 'pause' the execution of a tube by specifying the time:
|
137
|
+
|
138
|
+
```ruby
|
139
|
+
tube = @beanstalk.tubes["some-tube-here"]
|
140
|
+
tube.pause(3) # pauses tube for 3 seconds
|
141
|
+
```
|
142
|
+
|
143
|
+
or even clear the tube of all jobs:
|
144
|
+
|
145
|
+
```ruby
|
146
|
+
tube = @beanstalk.tubes["some-tube-here"]
|
147
|
+
tube.clear # tube will now be empty
|
148
|
+
```
|
149
|
+
|
150
|
+
In summary, each beanstalk client manages two separate concerns: which tube newly created jobs are put into,
|
151
|
+
and which tube(s) jobs are reserved from. Accordingly, there are two separate sets of functions for these concerns:
|
152
|
+
|
153
|
+
* **use** and **using** affect where 'put' places jobs
|
154
|
+
* **watch** and **watching** control where reserve takes jobs from
|
155
|
+
|
156
|
+
Note that these concerns are fully orthogonal: for example, when you 'use' a tube, it is not automatically 'watched'.
|
157
|
+
Neither does 'watching' a tube affect the tube you are 'using'.
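The separation between the used tube and the watch list can be pictured with a small in-memory model. This is a sketch only: `ToyClient` is a hypothetical class that mimics the bookkeeping described above and is not how Beaneater is implemented.

```ruby
require 'set'

# Toy model of per-client tube state (hypothetical, no network involved).
class ToyClient
  attr_reader :used, :watched

  def initialize
    @used = 'default'               # where `put` would place jobs
    @watched = Set.new(['default']) # where `reserve` would take jobs from
  end

  # Change the tube used for insertion; the watch list is untouched.
  def use(tube)
    @used = tube
    self
  end

  # Add a tube to the watch list; the used tube is untouched.
  def watch(tube)
    @watched << tube
    self
  end
end

client = ToyClient.new
client.use('emails')
puts client.used                 # emails
puts client.watched.to_a.inspect # ["default"]
```

Calling `use` leaves the watch list at `default`, and calling `watch` would leave the used tube at `emails`, which is the orthogonality the text describes.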
|
158
|
+
|
159
|
+
### Jobs
|
160
|
+
|
161
|
+
A job in beanstalk gets inserted by a client and includes the 'body' and job metadata.
|
162
|
+
Each job is enqueued into a tube and later reserved and processed. Here is a picture of the typical job lifecycle:
|
163
|
+
|
164
|
+
```
|
165
|
+
put reserve delete
|
166
|
+
-----> [READY] ---------> [RESERVED] --------> *poof*
|
167
|
+
```
|
168
|
+
|
169
|
+
At any given time, a job is either **reserved** by a worker or waiting in one of three states: **ready**, **delayed**, or **buried**:
|
170
|
+
|
171
|
+
| State | Description |
|
172
|
+
| ------- | ------------------------------- |
|
173
|
+
| ready | waiting to be `reserved` and processed after being `put` onto a tube. |
|
174
|
+
| delayed | waiting to become `ready` after the specified delay. |
|
175
|
+
| buried | waiting to be kicked, usually after job fails to process |
|
176
|
+
|
177
|
+
In addition, there are several actions that can be performed on a given job:
|
178
|
+
|
179
|
+
* **reserve** which locks a job from the ready queue for processing.
|
180
|
+
* **touch** which extends the time before a job is autoreleased back to ready.
|
181
|
+
* **release** which places a reserved job back onto the ready queue.
|
182
|
+
* **delete** which removes a job from beanstalk.
|
183
|
+
* **bury** which places a reserved job into the buried state.
|
184
|
+
* **kick** which moves a buried job from the buried queue back to ready.
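The actions listed above can be read as transitions in a small state machine. The sketch below is a simplified, hypothetical model (`ToyJob` is not a Beaneater class): it leaves out delays, so `release` always returns the job to ready.

```ruby
# Toy state machine for the job lifecycle (hypothetical, no beanstalkd involved).
class ToyJob
  # action => { state the action applies to => resulting state }
  TRANSITIONS = {
    reserve: { ready: :reserved },
    touch:   { reserved: :reserved },
    release: { reserved: :ready },
    bury:    { reserved: :buried },
    kick:    { buried: :ready },
    delete:  { ready: :deleted, reserved: :deleted, buried: :deleted }
  }.freeze

  attr_reader :state

  def initialize
    @state = :ready # a job put without a delay starts out ready
  end

  # Apply an action, raising if it is not valid in the current state.
  def apply(action)
    @state = TRANSITIONS.fetch(action).fetch(@state) {
      raise ArgumentError, "cannot #{action} a #{@state} job"
    }
    self
  end
end

job = ToyJob.new
job.apply(:reserve).apply(:bury).apply(:kick)
puts job.state # ready
```

Keeping the transitions in one table makes invalid sequences, such as burying a job that was never reserved, fail loudly instead of silently.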
|
185
|
+
|
186
|
+
You can insert a job onto a beanstalk tube using the `put` command:
|
187
|
+
|
188
|
+
```ruby
|
189
|
+
@tube.put "job-data-here"
|
190
|
+
```
|
191
|
+
|
192
|
+
Beanstalkd can only store strings as job bodies, but you can easily encode your data into a string:
|
193
|
+
|
194
|
+
```ruby
|
195
|
+
@tube.put({:foo => 'bar'}.to_json)
|
196
|
+
```
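Note that `to_json` is only available once Ruby's `json` library has been loaded, and that the worker has to decode the string again. A minimal round-trip sketch using only the standard library; `encode_job` and `decode_job` are hypothetical helper names:

```ruby
require 'json'

# Producer side: serialize a Ruby hash into the string that would be `put`.
def encode_job(payload)
  JSON.generate(payload)
end

# Worker side: turn the job body string back into a hash.
# Symbol keys come back as strings after the round trip.
def decode_job(body)
  JSON.parse(body)
end

body = encode_job({ :foo => 'bar' })
puts body                    # {"foo":"bar"}
puts decode_job(body)['foo'] # bar
```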
|
197
|
+
|
198
|
+
Each job has various metadata associated such as `priority`, `delay`, and `ttr` which can be
|
199
|
+
specified as part of the `put` command:
|
200
|
+
|
201
|
+
```ruby
|
202
|
+
# defaults are priority 0, delay of 0 and ttr of 120 seconds
|
203
|
+
@tube.put "job-data-here", :pri => 1000, :delay => 50, :ttr => 200
|
204
|
+
```
|
205
|
+
|
206
|
+
The `priority` argument is an integer < 2**32. Jobs with a smaller priority take precedence over jobs with larger priorities.
|
207
|
+
The `delay` argument is an integer number of seconds to wait before putting the job in the ready queue.
|
208
|
+
The `ttr` argument is the time to run: an integer number of seconds a worker is allowed to spend on this job.
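To illustrate the priority rule, here is a toy ready queue in which the smallest priority value is reserved first. `ToyReadyQueue` is a hypothetical in-memory model, not Beaneater code; it ignores `delay` and `ttr`, and breaking ties by insertion order is an assumption of the sketch.

```ruby
# Toy ready queue: smaller priority value wins (hypothetical model).
class ToyReadyQueue
  MAX_PRIORITY = 2**32 - 1

  def initialize
    @jobs = []
    @next_id = 1
  end

  # Adds a job and returns the id assigned to it.
  def put(body, pri: 0)
    unless pri.is_a?(Integer) && pri.between?(0, MAX_PRIORITY)
      raise ArgumentError, 'priority must be an integer in 0...2**32'
    end
    id = @next_id
    @next_id += 1
    @jobs << { id: id, pri: pri, body: body }
    id
  end

  # Removes and returns the body of the most urgent job, or nil if empty.
  def reserve
    job = @jobs.min_by { |j| [j[:pri], j[:id]] }
    return nil unless job
    @jobs.delete(job)
    job[:body]
  end
end

queue = ToyReadyQueue.new
queue.put('low urgency', pri: 1000)
queue.put('urgent', pri: 10)
puts queue.reserve # urgent
puts queue.reserve # low urgency
```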
|
209
|
+
|
210
|
+
### Processing Jobs (Manually)
|
211
|
+
|
212
|
+
In order to process jobs, the client should first specify the intended tubes to be watched. If not specified,
|
213
|
+
this will default to watching just the `default` tube.
|
214
|
+
|
215
|
+
```ruby
|
216
|
+
@beanstalk = Beaneater::Pool.new(['10.0.1.5:11300'])
|
217
|
+
@beanstalk.tubes.watch!('tube-name', 'other-tube')
|
218
|
+
```
|
219
|
+
|
220
|
+
Next you can use the `reserve` command which will return the first available job within the watched tubes:
|
221
|
+
|
222
|
+
```ruby
|
223
|
+
job = @beanstalk.tubes.reserve
|
224
|
+
# => <Beaneater::Job id=5 body="job-data-here">
|
225
|
+
puts job.body
|
226
|
+
# prints 'job-data-here'
|
227
|
+
print job.stats.state # => 'reserved'
|
228
|
+
```
|
229
|
+
|
230
|
+
By default, reserve will wait indefinitely for the next job. If you want to specify a timeout,
|
231
|
+
simply pass that in seconds into the command:
|
232
|
+
|
233
|
+
```ruby
|
234
|
+
job = @beanstalk.tubes.reserve(5) # wait 5 secs for a job, then return
|
235
|
+
# => <Beaneater::Job id=5 body="foo">
|
236
|
+
```
|
237
|
+
|
238
|
+
You can 'release' a reserved job back onto the ready queue to retry later:
|
239
|
+
|
240
|
+
```ruby
|
241
|
+
job = @beanstalk.tubes.reserve
|
242
|
+
# ...job hits a transient failure...
|
243
|
+
job.release :delay => 5
|
244
|
+
print job.stats.state # => 'delayed'
|
245
|
+
```
|
246
|
+
|
247
|
+
You can also 'delete' jobs that are finished:
|
248
|
+
|
249
|
+
```ruby
|
250
|
+
job = @beanstalk.tubes.reserve
|
251
|
+
job.touch # extends ttr for job
|
252
|
+
# ...process job...
|
253
|
+
job.delete
|
254
|
+
```
|
255
|
+
|
256
|
+
Beanstalk jobs can also be buried if they fail, rather than being deleted:
|
257
|
+
|
258
|
+
```ruby
|
259
|
+
job = @beanstalk.tubes.reserve
|
260
|
+
# ...job fails...
|
261
|
+
job.bury
|
262
|
+
print job.stats.state # => 'buried'
|
263
|
+
```
|
264
|
+
|
265
|
+
Burying a job means that the job is pulled out of the queue into a special 'holding' area for later inspection or reuse.
|
266
|
+
To reanimate this job later, you can 'kick' buried jobs back into being ready:
|
267
|
+
|
268
|
+
```ruby
|
269
|
+
@beanstalk.tubes['some-tube'].kick(3)
|
270
|
+
```
|
271
|
+
|
272
|
+
This kicks 3 buried jobs for 'some-tube' back into the 'ready' state. Jobs can also be
|
273
|
+
inspected using the 'peek' commands. To find and peek at a particular job based on the id:
|
274
|
+
|
275
|
+
```ruby
|
276
|
+
@beanstalk.jobs.find(123)
|
277
|
+
# => <Beaneater::Job id=123 body="foo">
|
278
|
+
```
|
279
|
+
|
280
|
+
or you can peek at jobs within a tube:
|
281
|
+
|
282
|
+
```ruby
|
283
|
+
@tube = @beanstalk.tubes.find('foo')
|
284
|
+
@tube.peek(:ready)
|
285
|
+
# => <Beaneater::Job id=123 body="ready">
|
286
|
+
@tube.peek(:buried)
|
287
|
+
# => <Beaneater::Job id=456 body="buried">
|
288
|
+
@tube.peek(:delayed)
|
289
|
+
# => <Beaneater::Job id=789 body="delayed">
|
290
|
+
```
|
291
|
+
|
292
|
+
When dealing with jobs there are a few other useful commands available:
|
293
|
+
|
294
|
+
```ruby
|
295
|
+
job = @beanstalk.tubes.reserve
|
296
|
+
print job.tube # => "some-tube-name"
|
297
|
+
print job.reserved? # => true
|
298
|
+
print job.exists? # => true
|
299
|
+
job.delete
|
300
|
+
print job.exists? # => false
|
301
|
+
```
|
302
|
+
|
303
|
+
### Processing Jobs (Automatically)
|
304
|
+
|
305
|
+
Instead of using `watch` and `reserve`, you can also use the higher level `register` and `process!` methods to
|
306
|
+
process jobs. First you can 'register' how to handle jobs from various tubes:
|
307
|
+
|
308
|
+
```ruby
|
309
|
+
@beanstalk.jobs.register('some-tube', :retry_on => [SomeError]) do |job|
|
310
|
+
do_something(job)
|
311
|
+
end
|
312
|
+
|
313
|
+
@beanstalk.jobs.register('other-tube') do |job|
|
314
|
+
do_something_else(job)
|
315
|
+
end
|
316
|
+
```
|
317
|
+
|
318
|
+
Once you have registered the handlers for known tubes, calling `process!` will begin a
|
319
|
+
loop processing jobs as defined by the registered processor blocks:
|
320
|
+
|
321
|
+
```ruby
|
322
|
+
@beanstalk.jobs.process!
|
323
|
+
```
|
324
|
+
|
325
|
+
Processing runs the following steps:
|
326
|
+
|
327
|
+
1. Watch all registered tubes
|
328
|
+
1. Reserve the next job
|
329
|
+
1. Once job is reserved, invoke the registered handler based on the tube name
|
330
|
+
1. If no exceptions occur, delete the job (success)
|
331
|
+
1. If 'retry_on' exceptions occur, call 'release' (retry)
|
332
|
+
1. If other exception occurs, call 'bury' (error)
|
333
|
+
1. Repeat steps 2-6
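The per-job part of these steps (invoke the handler, then delete, release, or bury) can be sketched without a server. Everything below is a hypothetical stand-in: `ToyProcessJob`, `RetryableError`, and `process_one` only mirror the decision logic listed above and are not Beaneater's implementation.

```ruby
# Toy job that only records what was done to it (hypothetical stand-in).
class ToyProcessJob
  attr_reader :tube, :body, :outcome

  def initialize(tube, body)
    @tube = tube
    @body = body
    @outcome = nil
  end

  def delete
    @outcome = :deleted
  end

  def release
    @outcome = :released
  end

  def bury
    @outcome = :buried
  end
end

class RetryableError < StandardError; end

# Handle one reserved job: delete on success, release on a retryable
# error, bury on any other error. Returns the recorded outcome.
def process_one(job, handlers)
  entry = handlers.fetch(job.tube)
  begin
    entry[:block].call(job)
    job.delete
  rescue *entry[:retry_on]
    job.release
  rescue StandardError
    job.bury
  end
  job.outcome
end

handlers = {
  'some-tube' => {
    retry_on: [RetryableError],
    block: lambda { |job|
      raise RetryableError, 'try again' if job.body == 'flaky'
      raise 'broken' if job.body == 'bad'
    }
  }
}

puts process_one(ToyProcessJob.new('some-tube', 'ok'), handlers)    # deleted
puts process_one(ToyProcessJob.new('some-tube', 'flaky'), handlers) # released
puts process_one(ToyProcessJob.new('some-tube', 'bad'), handlers)   # buried
```

Listing the retryable classes before the generic `rescue StandardError` is what lets a transient failure be retried while unexpected errors are set aside for inspection.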
|
334
|
+
|
335
|
+
The `process!` command is ideally suited for a beanstalk job processing daemon.
|
336
|
+
|
337
|
+
### Handling Errors
|
338
|
+
|
339
|
+
Beaneater raises an error when a command is sent to beanstalk and
|
340
|
+
something unexpected happens. The most common errors
|
341
|
+
are listed below:
|
342
|
+
|
343
|
+
| Errors | Description |
|
344
|
+
| -------------------- | ------------- |
|
345
|
+
| Beaneater::NotConnected | Client connection to beanstalk cannot be established. |
|
346
|
+
| Beaneater::InvalidTubeName | Specified tube name for use or watch is not valid. |
|
347
|
+
| Beaneater::NotFoundError | Specified job or tube could not be found. |
|
348
|
+
| Beaneater::TimedOutError | Job could not be reserved within time specified. |
|
349
|
+
|
350
|
+
There are other exceptions that are less common such as `OutOfMemoryError`, `DrainingError`,
|
351
|
+
`DeadlineSoonError`, `InternalError`, `BadFormatError`, `UnknownCommandError`,
|
352
|
+
`ExpectedCRLFError`, `JobTooBigError`, `NotIgnoredError`. Be sure to check the
|
353
|
+
[beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for more information.
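One common pattern is to treat a reserve timeout as "the queue is idle" rather than as a failure. The sketch below shows that pattern with stand-ins so it runs without the gem or a server: `ToyTimedOutError`, `ToyTubes`, and `drain` are hypothetical. With the real gem, the assumption is that you would pass the `Beaneater::TimedOutError` class from the table above as `timeout_error`.

```ruby
# Stand-in for the timeout error, so the sketch runs on its own.
class ToyTimedOutError < StandardError; end

# Hypothetical tube collection whose reserve raises once it is empty.
class ToyTubes
  def initialize(bodies)
    @bodies = bodies.dup
  end

  def reserve(timeout = nil)
    raise ToyTimedOutError, "no job within #{timeout.inspect} seconds" if @bodies.empty?
    @bodies.shift
  end
end

# Reserve and yield jobs until the source times out.
# Returns the number of jobs handled.
def drain(tubes, timeout, timeout_error: ToyTimedOutError)
  handled = 0
  loop do
    yield tubes.reserve(timeout)
    handled += 1
  end
rescue timeout_error
  handled
end

count = drain(ToyTubes.new(%w[a b]), 1) { |body| puts body }
puts count # 2
```

Only the timeout class is rescued, so any other error still propagates to the caller.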
|
354
|
+
|
355
|
+
|
356
|
+
### Stats
|
357
|
+
|
358
|
+
Beanstalk has plenty of commands for introspecting the state of the queues and jobs. To get stats for
|
359
|
+
beanstalk overall:
|
360
|
+
|
361
|
+
```ruby
|
362
|
+
# Get overall stats about the job processing that has occurred
|
363
|
+
print @beanstalk.stats
|
364
|
+
# => { 'current_connections': 1, 'current_jobs_buried': 0, ... }
|
365
|
+
print @beanstalk.stats.current_connections
|
366
|
+
# => 1
|
367
|
+
```
|
368
|
+
|
369
|
+
For stats on a particular tube:
|
370
|
+
|
371
|
+
```ruby
|
372
|
+
# Get statistical information about the specified tube if it exists
|
373
|
+
print @beanstalk.tubes['some_tube_name'].stats
|
374
|
+
# => { 'current_jobs_ready': 0, 'current_jobs_reserved': 0, ... }
|
375
|
+
```
|
376
|
+
|
377
|
+
For stats on an individual job:
|
378
|
+
|
379
|
+
```ruby
|
380
|
+
# Get statistical information about the specified job if it exists
|
381
|
+
print @beanstalk.jobs[some_job_id].stats
|
382
|
+
# => {'age': 0, 'id': 2, 'state': 'reserved', 'tube': 'default', ... }
|
383
|
+
```
|
384
|
+
|
385
|
+
Be sure to check the [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for
|
386
|
+
more details about the stats commands.
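The protocol document describes stats responses as YAML dictionaries with hyphenated keys such as `current-jobs-ready`, while the examples above use underscored names. A minimal sketch of that key conversion follows; `parse_stats` is a hypothetical helper, the sample payload is made up for illustration, and the parser only handles flat `key: value` lines.

```ruby
# Parse a flat "key: value" stats payload into a hash with underscored
# symbol keys. Integer-looking values are converted to Integer.
def parse_stats(payload)
  payload.each_line.with_object({}) do |line, stats|
    key, value = line.strip.split(': ', 2)
    next if key.nil? || value.nil? # skips the leading '---' and blank lines
    stats[key.tr('-', '_').to_sym] = value.match?(/\A-?\d+\z/) ? value.to_i : value
  end
end

# Made-up payload in the shape the protocol document describes.
SAMPLE_STATS = <<~YAML
  ---
  current-jobs-ready: 3
  current-jobs-buried: 0
  name: default
YAML

stats = parse_stats(SAMPLE_STATS)
puts stats[:current_jobs_ready] # 3
puts stats[:name]               # default
```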
|
387
|
+
|
388
|
+
## Resources
|
389
|
+
|
390
|
+
Other resources that are helpful when learning about beanstalk:
|
391
|
+
|
392
|
+
* [Beanstalkd homepage](http://kr.github.com/beanstalkd/)
|
393
|
+
* [beanstalk on github](https://github.com/kr/beanstalkd)
|
394
|
+
* [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md)
|
395
|
+
|
396
|
+
## Contributors
|
397
|
+
|
398
|
+
- [Nico Taing](https://github.com/Nico-Taing) - Creator and co-maintainer
|
399
|
+
- [Nathan Esquenazi](https://github.com/nesquena) - Contributor and co-maintainer
|