beaneater 0.1.0
- data/.gitignore +17 -0
- data/.yardopts +8 -0
- data/Gemfile +10 -0
- data/LICENSE.txt +22 -0
- data/README.md +399 -0
- data/REF +23 -0
- data/Rakefile +23 -0
- data/TODO +2 -0
- data/beaneater.gemspec +24 -0
- data/examples/demo.rb +96 -0
- data/lib/beaneater.rb +10 -0
- data/lib/beaneater/connection.rb +110 -0
- data/lib/beaneater/errors.rb +73 -0
- data/lib/beaneater/job.rb +2 -0
- data/lib/beaneater/job/collection.rb +91 -0
- data/lib/beaneater/job/record.rb +174 -0
- data/lib/beaneater/pool.rb +141 -0
- data/lib/beaneater/pool_command.rb +71 -0
- data/lib/beaneater/stats.rb +55 -0
- data/lib/beaneater/stats/fast_struct.rb +96 -0
- data/lib/beaneater/stats/stat_struct.rb +39 -0
- data/lib/beaneater/tube.rb +2 -0
- data/lib/beaneater/tube/collection.rb +134 -0
- data/lib/beaneater/tube/record.rb +158 -0
- data/lib/beaneater/version.rb +4 -0
- data/test/beaneater_test.rb +115 -0
- data/test/connection_test.rb +64 -0
- data/test/errors_test.rb +26 -0
- data/test/job_test.rb +213 -0
- data/test/jobs_test.rb +107 -0
- data/test/pool_command_test.rb +68 -0
- data/test/pool_test.rb +154 -0
- data/test/stat_struct_test.rb +41 -0
- data/test/stats_test.rb +42 -0
- data/test/test_helper.rb +21 -0
- data/test/tube_test.rb +164 -0
- data/test/tubes_test.rb +153 -0
- metadata +181 -0
data/.gitignore
ADDED
data/.yardopts
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,22 @@
|
|
1
|
+
Copyright (c) 2012 Nico Taing
|
2
|
+
|
3
|
+
MIT License
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining
|
6
|
+
a copy of this software and associated documentation files (the
|
7
|
+
"Software"), to deal in the Software without restriction, including
|
8
|
+
without limitation the rights to use, copy, modify, merge, publish,
|
9
|
+
distribute, sublicense, and/or sell copies of the Software, and to
|
10
|
+
permit persons to whom the Software is furnished to do so, subject to
|
11
|
+
the following conditions:
|
12
|
+
|
13
|
+
The above copyright notice and this permission notice shall be
|
14
|
+
included in all copies or substantial portions of the Software.
|
15
|
+
|
16
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
|
17
|
+
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
|
18
|
+
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
19
|
+
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
20
|
+
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
21
|
+
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
22
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,399 @@
|
|
1
|
+
# Beaneater
|
2
|
+
|
3
|
+
Beaneater is the best way to interact with beanstalkd from within Ruby.
|
4
|
+
[Beanstalkd](http://kr.github.com/beanstalkd/) is a simple, fast work queue. Its interface is generic, but was
|
5
|
+
originally designed for reducing the latency of page views in high-volume web applications by
|
6
|
+
running time-consuming tasks asynchronously. Read the
|
7
|
+
[beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for more details.
|
8
|
+
|
9
|
+
|
10
|
+
## Why Beanstalk?
|
11
|
+
|
12
|
+
Ilya Grigorik has an excellent blog post
|
13
|
+
[Scalable Work Queues with Beanstalk](http://www.igvita.com/2010/05/20/scalable-work-queues-with-beanstalk/) and
|
14
|
+
Adam Wiggins posted [an excellent comparison](http://adam.heroku.com/past/2010/4/24/beanstalk_a_simple_and_fast_queueing_backend/).
|
15
|
+
|
16
|
+
You will quickly see that **beanstalkd** is an underrated but incredible project that is extremely well-suited as a job queue.
|
17
|
+
It is significantly better suited for this task than Redis or a database. Beanstalk is a simple,
|
18
|
+
very fast work queue service rolled into a single binary - it is the memcached of work queues.
|
19
|
+
Originally built to power the backend for the 'Causes' Facebook app, it is a mature and production-ready open source project.
|
20
|
+
[PostRank](http://www.postrank.com) uses beanstalk to reliably process millions of jobs a day.
|
21
|
+
|
22
|
+
A single instance of Beanstalk is perfectly capable of handling thousands of jobs a second (or more, depending on your job size)
|
23
|
+
because it is an in-memory, event-driven system. Powered by libevent under the hood,
|
24
|
+
it requires zero setup (launch and forget, à la memcached) and offers optional log-based persistence, an easily parsed ASCII protocol,
|
25
|
+
and a rich set of tools for job management that go well beyond a simple FIFO work queue.
|
26
|
+
|
27
|
+
Beanstalkd supports the following features out of the box:
|
28
|
+
|
29
|
+
| Feature | Description |
|
30
|
+
| ------- | ------------------------------- |
|
31
|
+
| **Parallel Queues** | Supports multiple work queues created on demand. |
|
32
|
+
| **Reliable** | Beanstalk’s reserve, work, delete cycle ensures reliable processing. |
|
33
|
+
| **Scheduling** | Delay enqueuing jobs by a specified interval to schedule processing later. |
|
34
|
+
| **Fast** | Processes thousands of jobs per second; **significantly** faster than alternatives. |
|
35
|
+
| **Priorities** | Specify priority so important jobs can be processed quickly. |
|
36
|
+
| **Persistence** | Jobs are stored in memory for speed, but logged to disk for safe keeping. |
|
37
|
+
| **Federation** | Horizontal scalability provided through federation by the client. |
|
38
|
+
| **Error Handling** | Bury any job which causes an error for later debugging and inspection. |
|
39
|
+
|
40
|
+
Keep in mind that these features are supported out of the box with beanstalk and require no special code within this gem to support.
|
41
|
+
In the end, **beanstalk is the ideal job queue** while also being ridiculously easy to install and set up.
|
42
|
+
|
43
|
+
## Installation
|
44
|
+
|
45
|
+
Install beanstalkd:
|
46
|
+
|
47
|
+
Mac OS
|
48
|
+
|
49
|
+
```
|
50
|
+
brew update
|
51
|
+
brew install beanstalkd
|
52
|
+
beanstalkd -p 11300
|
53
|
+
```
|
54
|
+
|
55
|
+
Ubuntu
|
56
|
+
|
57
|
+
```
|
58
|
+
apt-get install beanstalkd
|
59
|
+
beanstalkd -p 11300
|
60
|
+
```
|
61
|
+
|
62
|
+
Install beaneater as a gem:
|
63
|
+
|
64
|
+
```
|
65
|
+
gem install beaneater
|
66
|
+
```
|
67
|
+
|
68
|
+
or add this to your Gemfile:
|
69
|
+
|
70
|
+
```ruby
|
71
|
+
# Gemfile
|
72
|
+
gem 'beaneater'
|
73
|
+
```
|
74
|
+
|
75
|
+
and run `bundle install` to install the dependency.
|
76
|
+
|
77
|
+
## Usage
|
78
|
+
|
79
|
+
### Connection
|
80
|
+
|
81
|
+
To interact with a beanstalk queue, first establish a connection by providing a set of addresses:
|
82
|
+
|
83
|
+
```ruby
|
84
|
+
@beanstalk = Beaneater::Pool.new(['10.0.1.5:11300'])
|
85
|
+
```
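
Each address in the list is a `host:port` string. As a rough illustration of that format only, the sketch below splits such a string apart; `parse_address` and `DEFAULT_BEANSTALKD_PORT` are hypothetical names invented here, not part of beaneater, and the fallback to port 11300 (beanstalkd's conventional port) is an assumption about how a client might behave.

```ruby
# Hypothetical helper, not beaneater's own code: split "host:port" into its
# parts, falling back to beanstalkd's conventional port when none is given.
DEFAULT_BEANSTALKD_PORT = 11300

def parse_address(address)
  host, port = address.split(':', 2)
  [host, Integer(port || DEFAULT_BEANSTALKD_PORT)]
end

p parse_address('10.0.1.5:11300')
p parse_address('localhost')
```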
|
86
|
+
|
87
|
+
You can conversely close and dispose of a pool at any time with:
|
88
|
+
|
89
|
+
```ruby
|
90
|
+
@beanstalk.close
|
91
|
+
```
|
92
|
+
|
93
|
+
### Tubes
|
94
|
+
|
95
|
+
Beanstalkd has one or more tubes which can contain any number of jobs.
|
96
|
+
Jobs can be inserted (put) into the used tube and pulled out (reserved) from watched tubes.
|
97
|
+
Each tube consists of a _ready_, _delayed_, and _buried_ queue for jobs.
|
98
|
+
|
99
|
+
When a client connects, its watch list is initially just the tube named `default`.
|
100
|
+
Tube names are at most 200 bytes. If a named tube does not exist, it will be automatically created.
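
To make the naming limit concrete, here is a minimal client-side check. It is a sketch only: `valid_tube_name?` and `TUBE_NAME_PATTERN` are hypothetical names, and the allowed character set is taken from the naming rules in the beanstalkd protocol document rather than from beaneater, which may validate names differently.

```ruby
# Hypothetical client-side check, based on the naming rules in the
# beanstalkd protocol document; beaneater itself may validate differently.
# Allowed: letters, digits and - + / ; . $ _ ( ), not starting with a hyphen.
TUBE_NAME_PATTERN = %r{\A[A-Za-z0-9+/;.$_()][A-Za-z0-9\-+/;.$_()]*\z}

def valid_tube_name?(name)
  name.bytesize <= 200 && TUBE_NAME_PATTERN.match?(name)
end

puts valid_tube_name?('some-tube-here')
puts valid_tube_name?('-starts-with-hyphen')
puts valid_tube_name?('x' * 201)
```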
|
101
|
+
|
102
|
+
To interact with a tube, first `find` the tube:
|
103
|
+
|
104
|
+
```ruby
|
105
|
+
@tube = @beanstalk.tubes.find "some-tube-here"
|
106
|
+
# => <Tube name='some-tube-here'>
|
107
|
+
```
|
108
|
+
|
109
|
+
To reserve jobs from beanstalk, you will need to 'watch' certain tubes:
|
110
|
+
|
111
|
+
```ruby
|
112
|
+
# Watch only the tubes listed below (!)
|
113
|
+
@beanstalk.tubes.watch!('some-tube')
|
114
|
+
# Append tubes to existing set of watched tubes
|
115
|
+
@beanstalk.tubes.watch('another-tube')
|
116
|
+
# You can also ignore tubes that have been watched previously
|
117
|
+
@beanstalk.tubes.ignore('some-tube')
|
118
|
+
```
|
119
|
+
|
120
|
+
You can easily get a list of all, used or watched tubes:
|
121
|
+
|
122
|
+
```ruby
|
123
|
+
# The list-tubes command returns a list of all existing tubes
|
124
|
+
@beanstalk.tubes.all
|
125
|
+
# => [<Tube name='foo'>, <Tube name='bar'>]
|
126
|
+
|
127
|
+
# Returns the tube currently being used by the client (for insertion)
|
128
|
+
@beanstalk.tubes.used
|
129
|
+
# => <Tube name='bar'>
|
130
|
+
|
131
|
+
# Returns a list of tubes currently being watched by the client (for consumption)
|
132
|
+
@beanstalk.tubes.watched
|
133
|
+
# => [<Tube name='foo'>]
|
134
|
+
```
|
135
|
+
|
136
|
+
You can also temporarily 'pause' the execution of a tube by specifying the time:
|
137
|
+
|
138
|
+
```ruby
|
139
|
+
tube = @beanstalk.tubes["some-tube-here"]
|
140
|
+
tube.pause(3) # pauses tube for 3 seconds
|
141
|
+
```
|
142
|
+
|
143
|
+
or even clear the tube of all jobs:
|
144
|
+
|
145
|
+
```ruby
|
146
|
+
tube = @beanstalk.tubes["some-tube-here"]
|
147
|
+
tube.clear # tube will now be empty
|
148
|
+
```
|
149
|
+
|
150
|
+
In summary, each beanstalk client manages two separate concerns: which tube newly created jobs are put into,
|
151
|
+
and which tube(s) jobs are reserved from. Accordingly, there are two separate sets of functions for these concerns:
|
152
|
+
|
153
|
+
* **use** and **using** affect where 'put' places jobs
|
154
|
+
* **watch** and **watching** control where reserve takes jobs from
|
155
|
+
|
156
|
+
Note that these concerns are fully orthogonal: for example, when you 'use' a tube, it is not automatically 'watched'.
|
157
|
+
Neither does 'watching' a tube affect the tube you are 'using'.
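
The independence of the two concerns can be pictured with a small in-memory model. This is a toy sketch, not beaneater code: `ToyClient` is a hypothetical class that only tracks the two pieces of state, assuming (as the beanstalkd protocol describes) that a fresh client both uses and watches `default`.

```ruby
require 'set'

# Toy model of one client's tube state (illustrative; not beaneater code).
class ToyClient
  attr_reader :used, :watched

  def initialize
    @used = 'default'          # where `put` would place jobs
    @watched = Set['default']  # where `reserve` would take jobs from
  end

  # Changing the used tube leaves the watch list untouched.
  def use(tube)
    @used = tube
  end

  # Adding a watched tube leaves the used tube untouched.
  def watch(tube)
    @watched << tube
  end
end

client = ToyClient.new
client.use('emails')
client.watch('images')
puts client.used
puts client.watched.to_a.inspect
```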
|
158
|
+
|
159
|
+
### Jobs
|
160
|
+
|
161
|
+
A job in beanstalk gets inserted by a client and includes the 'body' and job metadata.
|
162
|
+
Each job is enqueued into a tube and later reserved and processed. Here is a picture of the typical job lifecycle:
|
163
|
+
|
164
|
+
```
|
165
|
+
put reserve delete
|
166
|
+
-----> [READY] ---------> [RESERVED] --------> *poof*
|
167
|
+
```
|
168
|
+
|
169
|
+
A job that is not currently reserved by a worker is in one of three states: **ready**, **delayed**, or **buried**:
|
170
|
+
|
171
|
+
| State | Description |
|
172
|
+
| ------- | ------------------------------- |
|
173
|
+
| ready | waiting to be `reserved` and processed after being `put` onto a tube. |
|
174
|
+
| delayed | waiting to become `ready` after the specified delay. |
|
175
|
+
| buried | waiting to be kicked, usually after job fails to process |
|
176
|
+
|
177
|
+
In addition, there are several actions that can be performed on a given job:
|
178
|
+
|
179
|
+
* **reserve** which locks a job from the ready queue for processing.
|
180
|
+
* **touch** which extends the time before a job is autoreleased back to ready.
|
181
|
+
* **release** which places a reserved job back onto the ready queue.
|
182
|
+
* **delete** which removes a job from beanstalk.
|
183
|
+
* **bury** which places a reserved job into the buried state.
|
184
|
+
* **kick** which places a buried job from the buried queue back to ready.
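
The actions above can be read as transitions between job states. The sketch below encodes just those transitions as a lookup table; it is a simplified toy model (the real state lives inside beanstalkd, `TRANSITIONS` and `next_state` are hypothetical names, and cases such as releasing with a delay or deleting a job that is not reserved are left out).

```ruby
# Toy transition table (illustrative only; the real state lives in beanstalkd).
# Keys are [current_state, action]; values are the resulting state.
TRANSITIONS = {
  [:ready,    :reserve] => :reserved,
  [:reserved, :touch]   => :reserved,
  [:reserved, :release] => :ready,
  [:reserved, :bury]    => :buried,
  [:reserved, :delete]  => :deleted,
  [:buried,   :kick]    => :ready
}.freeze

def next_state(state, action)
  TRANSITIONS.fetch([state, action]) do
    raise ArgumentError, "cannot #{action} a #{state} job"
  end
end

# Walk one job through reserve, bury and kick.
state = :ready
[:reserve, :bury, :kick].each { |action| state = next_state(state, action) }
puts state
```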
|
185
|
+
|
186
|
+
You can insert a job onto a beanstalk tube using the `put` command:
|
187
|
+
|
188
|
+
```ruby
|
189
|
+
@tube.put "job-data-here"
|
190
|
+
```
|
191
|
+
|
192
|
+
Beanstalkd can only store strings as job bodies, but you can easily encode your data into a string:
|
193
|
+
|
194
|
+
```ruby
|
195
|
+
@tube.put({:foo => 'bar'}.to_json)
|
196
|
+
```
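
Note that `to_json` needs `require 'json'`, and that the worker has to decode the body again after reserving the job. A minimal round-trip sketch using only the standard library follows; `encode_job` and `decode_job` are hypothetical helper names, not beaneater methods.

```ruby
require 'json'

# Hypothetical helpers (not part of beaneater): serialize a payload to the
# string that would be passed to `put`, and parse it back on the worker side.
def encode_job(payload)
  JSON.generate(payload)
end

def decode_job(body)
  JSON.parse(body)
end

body = encode_job(:foo => 'bar')
# `body` is a plain String and could be handed to @tube.put(body)
payload = decode_job(body)
puts payload['foo']
```

Keys come back as strings after decoding, so a worker would read `payload['foo']` rather than `payload[:foo]`.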
|
197
|
+
|
198
|
+
Each job has various metadata associated such as `priority`, `delay`, and `ttr` which can be
|
199
|
+
specified as part of the `put` command:
|
200
|
+
|
201
|
+
```ruby
|
202
|
+
# defaults are priority 0, delay of 0 and ttr of 120 seconds
|
203
|
+
@tube.put "job-data-here", :pri => 1000, :delay => 50, :ttr => 200
|
204
|
+
```
|
205
|
+
|
206
|
+
The `priority` argument is an integer < 2**32. Jobs with a smaller priority take precedence over jobs with larger priorities.
|
207
|
+
The `delay` argument is an integer number of seconds to wait before putting the job in the ready queue.
|
208
|
+
The `ttr` argument is the time to run: an integer number of seconds to allow a worker to run this job.
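
As an illustration of the priority rule, the toy model below sorts ready jobs the way a reserve would pick them: smallest `pri` first. It is not beaneater code; `ToyJob` and `reserve_order` are hypothetical names, and breaking ties by job id (first in, first out) is an assumption about beanstalkd's behavior.

```ruby
# A toy model (not beaneater code) of how the ready queue orders jobs:
# lowest :pri value first, and first-in-first-out among equal priorities
# (the tie-break by id is an assumption).
ToyJob = Struct.new(:id, :body, :pri)

def reserve_order(jobs)
  jobs.sort_by { |job| [job.pri, job.id] }
end

jobs = [
  ToyJob.new(1, 'send newsletter', 1000),
  ToyJob.new(2, 'reset password', 0),
  ToyJob.new(3, 'resize image', 1000)
]
puts reserve_order(jobs).map(&:body).inspect
```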
|
209
|
+
|
210
|
+
### Processing Jobs (Manually)
|
211
|
+
|
212
|
+
In order to process jobs, the client should first specify the intended tubes to be watched. If not specified,
|
213
|
+
this will default to watching just the `default` tube.
|
214
|
+
|
215
|
+
```ruby
|
216
|
+
@beanstalk = Beaneater::Pool.new(['10.0.1.5:11300'])
|
217
|
+
@beanstalk.tubes.watch!('tube-name', 'other-tube')
|
218
|
+
```
|
219
|
+
|
220
|
+
Next you can use the `reserve` command which will return the first available job within the watched tubes:
|
221
|
+
|
222
|
+
```ruby
|
223
|
+
job = @beanstalk.tubes.reserve
|
224
|
+
# => <Beaneater::Job id=5 body="job-data-here">
|
225
|
+
puts job.body
|
226
|
+
# prints 'job-data-here'
|
227
|
+
print job.stats.state # => 'reserved'
|
228
|
+
```
|
229
|
+
|
230
|
+
By default, reserve will wait indefinitely for the next job. If you want to specify a timeout,
|
231
|
+
simply pass that in seconds into the command:
|
232
|
+
|
233
|
+
```ruby
|
234
|
+
job = @beanstalk.tubes.reserve(5) # wait 5 secs for a job, then return
|
235
|
+
# => <Beaneater::Job id=5 body="foo">
|
236
|
+
```
|
237
|
+
|
238
|
+
You can 'release' a reserved job back onto the ready queue to retry later:
|
239
|
+
|
240
|
+
```ruby
|
241
|
+
job = @beanstalk.tubes.reserve
|
242
|
+
# ...job fails with a transient error...
|
243
|
+
job.release :delay => 5
|
244
|
+
print job.stats.state # => 'delayed'
|
245
|
+
```
|
246
|
+
|
247
|
+
You can also 'delete' jobs that are finished:
|
248
|
+
|
249
|
+
```ruby
|
250
|
+
job = @beanstalk.tubes.reserve
|
251
|
+
job.touch # extends ttr for job
|
252
|
+
# ...process job...
|
253
|
+
job.delete
|
254
|
+
```
|
255
|
+
|
256
|
+
Beanstalk jobs can also be buried if they fail, rather than being deleted:
|
257
|
+
|
258
|
+
```ruby
|
259
|
+
job = @beanstalk.tubes.reserve
|
260
|
+
# ...job fails...
|
261
|
+
job.bury
|
262
|
+
print job.stats.state # => 'buried'
|
263
|
+
```
|
264
|
+
|
265
|
+
Burying a job means that the job is pulled out of the queue into a special 'holding' area for later inspection or reuse.
|
266
|
+
To reanimate this job later, you can 'kick' buried jobs back into being ready:
|
267
|
+
|
268
|
+
```ruby
|
269
|
+
@beanstalk.tubes['some-tube'].kick(3)
|
270
|
+
```
|
271
|
+
|
272
|
+
This kicks 3 buried jobs for 'some-tube' back into the 'ready' state. Jobs can also be
|
273
|
+
inspected using the 'peek' commands. To find and peek at a particular job based on the id:
|
274
|
+
|
275
|
+
```ruby
|
276
|
+
@beanstalk.jobs.find(123)
|
277
|
+
# => <Beaneater::Job id=123 body="foo">
|
278
|
+
```
|
279
|
+
|
280
|
+
or you can peek at jobs within a tube:
|
281
|
+
|
282
|
+
```ruby
|
283
|
+
@tube = @beanstalk.tubes.find('foo')
|
284
|
+
@tube.peek(:ready)
|
285
|
+
# => <Beaneater::Job id=123 body="ready">
|
286
|
+
@tube.peek(:buried)
|
287
|
+
# => <Beaneater::Job id=456 body="buried">
|
288
|
+
@tube.peek(:delayed)
|
289
|
+
# => <Beaneater::Job id=789 body="delayed">
|
290
|
+
```
|
291
|
+
|
292
|
+
When dealing with jobs there are a few other useful commands available:
|
293
|
+
|
294
|
+
```ruby
|
295
|
+
job = @beanstalk.tubes.reserve
|
296
|
+
print job.tube # => "some-tube-name"
|
297
|
+
print job.reserved? # => true
|
298
|
+
print job.exists? # => true
|
299
|
+
job.delete
|
300
|
+
print job.exists? # => false
|
301
|
+
```
|
302
|
+
|
303
|
+
### Processing Jobs (Automatically)
|
304
|
+
|
305
|
+
Instead of using `watch` and `reserve`, you can also use the higher level `register` and `process!` methods to
|
306
|
+
process jobs. First you can 'register' how to handle jobs from various tubes:
|
307
|
+
|
308
|
+
```ruby
|
309
|
+
@beanstalk.jobs.register('some-tube', :retry_on => [SomeError]) do |job|
|
310
|
+
do_something(job)
|
311
|
+
end
|
312
|
+
|
313
|
+
@beanstalk.jobs.register('other-tube') do |job|
|
314
|
+
do_something_else(job)
|
315
|
+
end
|
316
|
+
```
|
317
|
+
|
318
|
+
Once you have registered the handlers for known tubes, calling `process!` will begin a
|
319
|
+
loop processing jobs as defined by the registered processor blocks:
|
320
|
+
|
321
|
+
```ruby
|
322
|
+
@beanstalk.jobs.process!
|
323
|
+
```
|
324
|
+
|
325
|
+
Processing runs the following steps:
|
326
|
+
|
327
|
+
1. Watch all registered tubes
|
328
|
+
1. Reserve the next job
|
329
|
+
1. Once job is reserved, invoke the registered handler based on the tube name
|
330
|
+
1. If no exceptions occur, delete the job (success)
|
331
|
+
1. If 'retry_on' exceptions occur, call 'release' (retry)
|
332
|
+
1. If other exception occurs, call 'bury' (error)
|
333
|
+
1. Repeat steps 2-6
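
The success, retry and error branches in the steps above can be sketched as a single decision function. This is an illustration of the logic only, not the gem's implementation: `outcome_for` and `TransientError` are hypothetical names, and the function returns a symbol instead of actually calling `delete`, `release` or `bury` on a job.

```ruby
# Stand-in for an exception class that would be listed under :retry_on.
TransientError = Class.new(StandardError)

# Illustrative only: decides what the loop would do with one job.
# `handler` and `retry_on` mirror the arguments given to `register`.
def outcome_for(job, handler, retry_on = [])
  handler.call(job)
  :delete                                   # no exception: success
rescue StandardError => e
  # A listed exception means retry; anything else is treated as an error.
  retry_on.any? { |klass| e.is_a?(klass) } ? :release : :bury
end

ok    = ->(job) { job.upcase }
flaky = ->(_job) { raise TransientError, 'try again later' }
bad   = ->(_job) { raise 'boom' }

puts outcome_for('payload', ok)
puts outcome_for('payload', flaky, [TransientError])
puts outcome_for('payload', bad, [TransientError])
```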
|
334
|
+
|
335
|
+
The `process!` command is ideally suited for a beanstalk job processing daemon.
|
336
|
+
|
337
|
+
### Handling Errors
|
338
|
+
|
339
|
+
Beaneater raises an error when a command is sent to beanstalk and
|
340
|
+
something unexpected happens. The most common errors
|
341
|
+
are listed below:
|
342
|
+
|
343
|
+
| Errors | Description |
|
344
|
+
| -------------------- | ------------- |
|
345
|
+
| Beaneater::NotConnected | Client connection to beanstalk cannot be established. |
|
346
|
+
| Beaneater::InvalidTubeName | Specified tube name for use or watch is not valid. |
|
347
|
+
| Beaneater::NotFoundError | Specified job or tube could not be found. |
|
348
|
+
| Beaneater::TimedOutError | Job could not be reserved within time specified. |
|
349
|
+
|
350
|
+
There are other exceptions that are less common such as `OutOfMemoryError`, `DrainingError`,
|
351
|
+
`DeadlineSoonError`, `InternalError`, `BadFormatError`, `UnknownCommandError`,
|
352
|
+
`ExpectedCRLFError`, `JobTooBigError`, `NotIgnoredError`. Be sure to check the
|
353
|
+
[beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for more information.
|
354
|
+
|
355
|
+
|
356
|
+
### Stats
|
357
|
+
|
358
|
+
Beanstalk has plenty of commands for introspecting the state of the queues and jobs. To get stats for
|
359
|
+
beanstalk overall:
|
360
|
+
|
361
|
+
```ruby
|
362
|
+
# Get overall stats about the job processing that has occurred
|
363
|
+
print @beanstalk.stats
|
364
|
+
# => { 'current_connections': 1, 'current_jobs_buried': 0, ... }
|
365
|
+
print @beanstalk.stats.current_connections
|
366
|
+
# => 1
|
367
|
+
```
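
The beanstalkd protocol returns stats as a YAML dictionary with hyphenated keys such as `current-connections`, while the accessor shown above uses underscores. The sketch below shows one way such a mapping could be done; `normalize_stats` is a hypothetical helper and the sample payload is made up for illustration, so this is not a description of how beaneater implements its stats objects.

```ruby
require 'yaml'

# Hypothetical helper: parse a YAML stats payload and turn its hyphenated
# keys into underscore-style symbols.
def normalize_stats(yaml_text)
  YAML.safe_load(yaml_text).each_with_object({}) do |(key, value), stats|
    stats[key.tr('-', '_').to_sym] = value
  end
end

# Made-up sample in the shape the protocol describes.
sample = "---\ncurrent-connections: 1\ncurrent-jobs-buried: 0\n"
stats = normalize_stats(sample)
puts stats[:current_connections]
```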
|
368
|
+
|
369
|
+
For stats on a particular tube:
|
370
|
+
|
371
|
+
```ruby
|
372
|
+
# Get statistical information about the specified tube if it exists
|
373
|
+
print @beanstalk.tubes['some_tube_name'].stats
|
374
|
+
# => { 'current_jobs_ready': 0, 'current_jobs_reserved': 0, ... }
|
375
|
+
```
|
376
|
+
|
377
|
+
For stats on an individual job:
|
378
|
+
|
379
|
+
```ruby
|
380
|
+
# Get statistical information about the specified job if it exists
|
381
|
+
print @beanstalk.jobs[some_job_id].stats
|
382
|
+
# => {'age': 0, 'id': 2, 'state': 'reserved', 'tube': 'default', ... }
|
383
|
+
```
|
384
|
+
|
385
|
+
Be sure to check the [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for
|
386
|
+
more details about the stats commands.
|
387
|
+
|
388
|
+
## Resources
|
389
|
+
|
390
|
+
Other resources that are helpful when learning about beanstalk:
|
391
|
+
|
392
|
+
* [Beanstalkd homepage](http://kr.github.com/beanstalkd/)
|
393
|
+
* [beanstalk on github](https://github.com/kr/beanstalkd)
|
394
|
+
* [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md)
|
395
|
+
|
396
|
+
## Contributors
|
397
|
+
|
398
|
+
- [Nico Taing](https://github.com/Nico-Taing) - Creator and co-maintainer
|
399
|
+
- [Nathan Esquenazi](https://github.com/nesquena) - Contributor and co-maintainer
|