beaneater 0.1.0

data/.gitignore ADDED
@@ -0,0 +1,17 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
data/.yardopts ADDED
@@ -0,0 +1,8 @@
1
+ --output-dir doc/
2
+ --readme README.md
3
+ --title Beaneater
4
+ --markup-provider=redcarpet
5
+ --markup=markdown
6
+ --protected
7
+ --no-private
8
+ lib/beaneater/**/*.rb
data/Gemfile ADDED
@@ -0,0 +1,10 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in beaneater.gemspec
4
+ gemspec
5
+
6
+ group :development do
7
+ gem 'redcarpet', '~> 1'
8
+ gem 'github-markup'
9
+ gem 'yard'
10
+ end
data/LICENSE.txt ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2012 Nico Taing
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,399 @@
1
+ # Beaneater
2
+
3
+ Beaneater is the best way to interact with beanstalkd from within Ruby.
4
+ [Beanstalkd](http://kr.github.com/beanstalkd/) is a simple, fast work queue. Its interface is generic, but was
5
+ originally designed for reducing the latency of page views in high-volume web applications by
6
+ running time-consuming tasks asynchronously. Read the
7
+ [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for more details.
8
+
9
+
10
+ ## Why Beanstalk?
11
+
12
+ Ilya Grigorik has an excellent blog post
13
+ [Scalable Work Queues with Beanstalk](http://www.igvita.com/2010/05/20/scalable-work-queues-with-beanstalk/) and
14
+ Adam Wiggins posted [an excellent comparison](http://adam.heroku.com/past/2010/4/24/beanstalk_a_simple_and_fast_queueing_backend/).
15
+
16
+ You will quickly see that **beanstalkd** is an underrated but incredible project that is extremely well-suited as a job queue,
17
+ and significantly better suited for this task than Redis or a database. Beanstalk is a simple
18
+ and very fast work queue service rolled into a single binary; it is the memcached of work queues.
19
+ Originally built to power the backend for the 'Causes' Facebook app, it is a mature and production ready open source project.
20
+ [PostRank](http://www.postrank.com) uses beanstalk to reliably process millions of jobs a day.
21
+
22
+ A single instance of Beanstalk is perfectly capable of handling thousands of jobs a second (or more, depending on your job size)
23
+ because it is an in-memory, event-driven system. Powered by libevent under the hood,
24
+ it requires zero setup (launch and forget, à la memcached) and offers optional log-based persistence, an easily parsed ASCII protocol,
25
+ and a rich set of tools for job management that go well beyond a simple FIFO work queue.
26
+
27
+ Beanstalkd supports the following features out of the box:
28
+
29
+ | Feature | Description |
30
+ | ------- | ------------------------------- |
31
+ | **Parallel Queues** | Supports multiple work queues created on demand. |
32
+ | **Reliable** | Beanstalk’s reserve, work, delete cycle ensures reliable processing. |
33
+ | **Scheduling** | Delay enqueuing jobs by a specified interval to schedule processing later. |
34
+ | **Fast** | Processes thousands of jobs per second; **significantly** faster than alternatives. |
35
+ | **Priorities** | Specify priority so important jobs can be processed quickly. |
36
+ | **Persistence** | Jobs are stored in memory for speed, but logged to disk for safe keeping. |
37
+ | **Federation** | Horizontal scalability provided through federation by the client. |
38
+ | **Error Handling** | Bury any job which causes an error for later debugging and inspection.|
39
+
40
+ Keep in mind that these features are supported out of the box with beanstalk and require no special code within this gem to support.
41
+ In the end, **beanstalk is the ideal job queue** while also being ridiculously easy to install and set up.
42
+
43
+ ## Installation
44
+
45
+ Install beanstalkd:
46
+
47
+ Mac OS
48
+
49
+ ```
50
+ brew update
51
+ brew install beanstalkd
52
+ beanstalkd -p 11300
53
+ ```
54
+
55
+ Ubuntu
56
+
57
+ ```
58
+ apt-get install beanstalkd
59
+ beanstalkd -p 11300
60
+ ```
61
+
62
+ Install beaneater as a gem:
63
+
64
+ ```
65
+ gem install beaneater
66
+ ```
67
+
68
+ or add this to your Gemfile:
69
+
70
+ ```ruby
71
+ # Gemfile
72
+ gem 'beaneater'
73
+ ```
74
+
75
+ and run `bundle install` to install the dependency.
76
+
77
+ ## Usage
78
+
79
+ ### Connection
80
+
81
+ To interact with a beanstalk queue, first establish a connection by providing a set of addresses:
82
+
83
+ ```ruby
84
+ @beanstalk = Beaneater::Pool.new(['10.0.1.5:11300'])
85
+ ```
86
+
87
+ You can conversely close and dispose of a pool at any time with:
88
+
89
+ ```ruby
90
+ @beanstalk.close
91
+ ```
92
+
93
+ ### Tubes
94
+
95
+ Beanstalkd has one or more tubes which can contain any number of jobs.
96
+ Jobs can be inserted (put) into the used tube and pulled out (reserved) from watched tubes.
97
+ Each tube consists of a _ready_, _delayed_, and _buried_ queue for jobs.
98
+
99
+ When a client connects, its watch list is initially just the tube named `default`.
100
+ Tube names are at most 200 bytes. If a tube you name does not exist, it will be automatically created.
101
+
102
+ To interact with a tube, first `find` the tube:
103
+
104
+ ```ruby
105
+ @tube = @beanstalk.tubes.find "some-tube-here"
106
+ # => <Tube name='some-tube-here'>
107
+ ```
108
+
109
+ To reserve jobs from beanstalk, you will need to 'watch' certain tubes:
110
+
111
+ ```ruby
112
+ # Watch only the tubes listed below (!)
113
+ @beanstalk.tubes.watch!('some-tube')
114
+ # Append tubes to existing set of watched tubes
115
+ @beanstalk.tubes.watch('another-tube')
116
+ # You can also ignore tubes that have been watched previously
117
+ @beanstalk.tubes.ignore('some-tube')
118
+ ```
119
+
120
+ You can easily get a list of all, used or watched tubes:
121
+
122
+ ```ruby
123
+ # The list-tubes command returns a list of all existing tubes
124
+ @beanstalk.tubes.all
125
+ # => [<Tube name='foo'>, <Tube name='bar'>]
126
+
127
+ # Returns the tube currently being used by the client (for insertion)
128
+ @beanstalk.tubes.used
129
+ # => <Tube name='bar'>
130
+
131
+ # Returns a list of tubes currently being watched by the client (for consumption)
132
+ @beanstalk.tubes.watched
133
+ # => [<Tube name='foo'>]
134
+ ```
135
+
136
+ You can also temporarily 'pause' the execution of a tube by specifying the time:
137
+
138
+ ```ruby
139
+ tube = @beanstalk.tubes["some-tube-here"]
140
+ tube.pause(3) # pauses tube for 3 seconds
141
+ ```
142
+
143
+ or even clear the tube of all jobs:
144
+
145
+ ```ruby
146
+ tube = @beanstalk.tubes["some-tube-here"]
147
+ tube.clear # tube will now be empty
148
+ ```
149
+
150
+ In summary, each beanstalk client manages two separate concerns: which tube newly created jobs are put into,
151
+ and which tube(s) jobs are reserved from. Accordingly, there are two separate sets of functions for these concerns:
152
+
153
+ * **use** and **using** affect where 'put' places jobs
154
+ * **watch** and **watching** control where reserve takes jobs from
155
+
156
+ Note that these concerns are fully orthogonal: for example, when you 'use' a tube, it is not automatically 'watched'.
157
+ Neither does 'watching' a tube affect the tube you are 'using'.
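The separation between the two concerns can be sketched with a small in-memory model. This is only an illustration: `ClientTubes` is a hypothetical class, not part of Beaneater, and it assumes (per the beanstalk protocol) that a fresh client both uses and watches the `default` tube.

```ruby
require 'set'

# Hypothetical model of the per-client tube state; not Beaneater's implementation.
class ClientTubes
  attr_reader :used, :watched

  def initialize
    @used = 'default'            # tube that `put` inserts into
    @watched = Set['default']    # tubes that `reserve` pulls from
  end

  # Changes only the tube used for insertion.
  def use(name)
    @used = name
  end

  # Appends tubes to the existing watch list.
  def watch(*names)
    @watched.merge(names)
  end

  # Replaces the watch list with exactly the given tubes.
  def watch!(*names)
    @watched = Set.new(names)
  end

  # Removes a tube from the watch list.
  def ignore(name)
    @watched.delete(name)
  end
end

client = ClientTubes.new
client.use('emails')       # the watch list is untouched
client.watch!('images')    # the used tube is untouched
```

After these two calls the model's `used` tube is `emails` while its watch list contains only `images`, which mirrors the orthogonality described above.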
158
+
159
+ ### Jobs
160
+
161
+ A job in beanstalk gets inserted by a client and includes the 'body' and job metadata.
162
+ Each job is enqueued into a tube and later reserved and processed. Here is a picture of the typical job lifecycle:
163
+
164
+ ```
165
+    put            reserve               delete
166
+   -----> [READY] ---------> [RESERVED] --------> *poof*
167
+ ```
168
+
169
+ Apart from being **reserved** by a worker, a job at any given time is in one of three states: **ready**, **delayed**, or **buried**:
170
+
171
+ | State | Description |
172
+ | ------- | ------------------------------- |
173
+ | ready | waiting to be `reserved` and processed after being `put` onto a tube. |
174
+ | delayed | waiting to become `ready` after the specified delay. |
175
+ | buried | waiting to be kicked, usually after the job fails to process. |
176
+
177
+ In addition, several actions can be performed on a given job:
178
+
179
+ * **reserve** which locks a job from the ready queue for processing.
180
+ * **touch** which extends the time before a job is autoreleased back to ready.
181
+ * **release** which places a reserved job back onto the ready queue.
182
+ * **delete** which removes a job from beanstalk.
183
+ * **bury** which places a reserved job into the buried state.
184
+ * **kick** which places a buried job from the buried queue back to ready.
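One way to read the list above is as a small state machine. The sketch below is an in-memory illustration only: `TRANSITIONS` and `next_state` are hypothetical names, a plain `release` is modelled as returning to ready (a release with a delay would go to delayed instead), and allowing `delete` from every state is an assumption rather than something this README states.

```ruby
# Hypothetical transition table: state => { command => next state }.
TRANSITIONS = {
  :ready    => { :reserve => :reserved, :delete => :deleted },
  :reserved => { :touch => :reserved, :release => :ready,
                 :bury => :buried, :delete => :deleted },
  :delayed  => { :delete => :deleted },
  :buried   => { :kick => :ready, :delete => :deleted }
}

# Returns the state a job would move to, or raises if the command
# does not apply to a job in the given state.
def next_state(state, command)
  TRANSITIONS.fetch(state).fetch(command) do
    raise ArgumentError, "cannot #{command} a #{state} job"
  end
end

# Walk one job through a failure, a kick, and a successful second attempt.
state = :ready
[:reserve, :bury, :kick, :reserve, :delete].each do |command|
  state = next_state(state, command)
end
```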
185
+
186
+ You can insert a job onto a beanstalk tube using the `put` command:
187
+
188
+ ```ruby
189
+ @tube.put "job-data-here"
190
+ ```
191
+
192
+ Beanstalkd can only store strings as job bodies, but you can easily encode your data into a string:
193
+
194
+ ```ruby
195
+ @tube.put({:foo => 'bar'}.to_json)
196
+ ```
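As a rough sketch of the full round trip, using only Ruby's standard `json` library: the producer serializes the hash before calling `put`, and the worker parses the string it later reads from `job.body`. The `payload`, `body`, and `decoded` names are illustrative and no beanstalkd connection is involved.

```ruby
require 'json'

# Producer side: build the string that would be handed to `put`.
payload = { :foo => 'bar' }
body = payload.to_json

# Worker side: parse the string that would come back from `job.body`.
decoded = JSON.parse(body)
```

Note that symbol keys do not survive the trip: after parsing, the hash is keyed by the string `"foo"`.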
197
+
198
+ Each job has various metadata associated such as `priority`, `delay`, and `ttr` which can be
199
+ specified as part of the `put` command:
200
+
201
+ ```ruby
202
+ # defaults are priority 0, delay of 0 and ttr of 120 seconds
203
+ @tube.put "job-data-here", :pri => 1000, :delay => 50, :ttr => 200
204
+ ```
205
+
206
+ The `priority` argument is an integer < 2**32. Jobs with a smaller priority take precedence over jobs with larger priorities.
207
+ The `delay` argument is an integer number of seconds to wait before putting the job in the ready queue.
208
+ The `ttr` argument is the time to run: an integer number of seconds that a worker is allowed to run this job.
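To make the priority rule concrete, here is a small in-memory sketch of the order in which ready jobs would be handed out. `ReadyJob` and its `seq` field are hypothetical stand-ins, and the sketch assumes that jobs with equal priority are served in insertion order.

```ruby
# Illustrative stand-in for jobs waiting in a tube's ready queue.
# `seq` records insertion order; it is not a beanstalkd field.
ReadyJob = Struct.new(:body, :pri, :seq)

ready = [
  ReadyJob.new('send-newsletter', 1000, 0),
  ReadyJob.new('charge-card',       10, 1),
  ReadyJob.new('resize-image',    1000, 2)
]

# Smaller priority values come first; ties fall back to insertion order.
reserve_order = ready.sort_by { |job| [job.pri, job.seq] }.map(&:body)
```

Here `charge-card` comes out first despite being inserted second, because 10 is a smaller (more urgent) priority than 1000.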
209
+
210
+ ### Processing Jobs (Manually)
211
+
212
+ In order to process jobs, the client should first specify the intended tubes to be watched. If not specified,
213
+ this will default to watching just the `default` tube.
214
+
215
+ ```ruby
216
+ @beanstalk = Beaneater::Pool.new(['10.0.1.5:11300'])
217
+ @beanstalk.tubes.watch!('tube-name', 'other-tube')
218
+ ```
219
+
220
+ Next you can use the `reserve` command which will return the first available job within the watched tubes:
221
+
222
+ ```ruby
223
+ job = @beanstalk.tubes.reserve
224
+ # => <Beaneater::Job id=5 body="job-data-here">
225
+ puts job.body
226
+ # prints 'job-data-here'
227
+ print job.stats.state # => 'reserved'
228
+ ```
229
+
230
+ By default, reserve will wait indefinitely for the next job. If you want to specify a timeout,
231
+ simply pass that in seconds into the command:
232
+
233
+ ```ruby
234
+ job = @beanstalk.tubes.reserve(5) # wait 5 secs for a job, then return
235
+ # => <Beaneater::Job id=5 body="foo">
236
+ ```
237
+
238
+ You can 'release' a reserved job back onto the ready queue to retry later:
239
+
240
+ ```ruby
241
+ job = @beanstalk.tubes.reserve
242
+ # ...job fails transiently...
243
+ job.release :delay => 5
244
+ print job.stats.state # => 'delayed'
245
+ ```
246
+
247
+ You can also 'delete' jobs that are finished:
248
+
249
+ ```ruby
250
+ job = @beanstalk.tubes.reserve
251
+ job.touch # extends ttr for job
252
+ # ...process job...
253
+ job.delete
254
+ ```
255
+
256
+ Beanstalk jobs can also be buried if they fail, rather than being deleted:
257
+
258
+ ```ruby
259
+ job = @beanstalk.tubes.reserve
260
+ # ...job fails...
261
+ job.bury
262
+ print job.stats.state # => 'buried'
263
+ ```
264
+
265
+ Burying a job means that the job is pulled out of the queue into a special 'holding' area for later inspection or reuse.
266
+ To reanimate this job later, you can 'kick' buried jobs back into being ready:
267
+
268
+ ```ruby
269
+ @beanstalk.tubes['some-tube'].kick(3)
270
+ ```
271
+
272
+ This kicks 3 buried jobs for 'some-tube' back into the 'ready' state. Jobs can also be
273
+ inspected using the 'peek' commands. To find and peek at a particular job based on the id:
274
+
275
+ ```ruby
276
+ @beanstalk.jobs.find(123)
277
+ # => <Beaneater::Job id=123 body="foo">
278
+ ```
279
+
280
+ or you can peek at jobs within a tube:
281
+
282
+ ```ruby
283
+ @tube = @beanstalk.tubes.find('foo')
284
+ @tube.peek(:ready)
285
+ # => <Beaneater::Job id=123 body="ready">
286
+ @tube.peek(:buried)
287
+ # => <Beaneater::Job id=456 body="buried">
288
+ @tube.peek(:delayed)
289
+ # => <Beaneater::Job id=789 body="delayed">
290
+ ```
291
+
292
+ When dealing with jobs there are a few other useful commands available:
293
+
294
+ ```ruby
295
+ job = @beanstalk.tubes.reserve
296
+ print job.tube # => "some-tube-name"
297
+ print job.reserved? # => true
298
+ print job.exists? # => true
299
+ job.delete
300
+ print job.exists? # => false
301
+ ```
302
+
303
+ ### Processing Jobs (Automatically)
304
+
305
+ Instead of using `watch` and `reserve`, you can also use the higher level `register` and `process` methods to
306
+ process jobs. First you can 'register' how to handle jobs from various tubes:
307
+
308
+ ```ruby
309
+ @beanstalk.jobs.register('some-tube', :retry_on => [SomeError]) do |job|
310
+   do_something(job)
311
+ end
312
+
313
+ @beanstalk.jobs.register('other-tube') do |job|
314
+   do_something_else(job)
315
+ end
316
+ ```
317
+
318
+ Once you have registered the handlers for known tubes, calling `process!` will begin a
319
+ loop processing jobs as defined by the registered processor blocks:
320
+
321
+ ```ruby
322
+ @beanstalk.jobs.process!
323
+ ```
324
+
325
+ Processing runs the following steps:
326
+
327
+ 1. Watch all registered tubes
328
+ 1. Reserve the next job
329
+ 1. Once job is reserved, invoke the registered handler based on the tube name
330
+ 1. If no exceptions occur, delete the job (success)
331
+ 1. If 'retry_on' exceptions occur, call 'release' (retry)
332
+ 1. If other exception occurs, call 'bury' (error)
333
+ 1. Repeat steps 2-6
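The decision made for each reserved job in the steps above can be sketched without a beanstalkd server. Everything here is a hypothetical stand-in rather than Beaneater's code: `FakeJob`, `FlakyError`, `HANDLERS`, `register`, and `outcome_for` exist only for this illustration, and the returned symbols name the command that would be sent.

```ruby
# Hypothetical error class and in-memory job; neither belongs to Beaneater.
class FlakyError < StandardError; end
FakeJob = Struct.new(:tube, :body)

# tube name => { :retry_on => [exception classes], :block => handler }
HANDLERS = {}

def register(tube, options = {}, &block)
  HANDLERS[tube] = { :retry_on => options.fetch(:retry_on, []), :block => block }
end

# Runs one job through its handler and reports which command would follow.
def outcome_for(job)
  handler = HANDLERS.fetch(job.tube)
  begin
    handler[:block].call(job)
    :delete                                   # no exception: success
  rescue StandardError => e
    retryable = handler[:retry_on].any? { |klass| e.is_a?(klass) }
    retryable ? :release : :bury              # retry, or set aside as an error
  end
end

register('some-tube', :retry_on => [FlakyError]) do |job|
  raise FlakyError if job.body == 'flaky'
end

register('other-tube') do |job|
  raise ArgumentError, 'bad input' if job.body.empty?
end

outcomes = [
  FakeJob.new('some-tube', 'ok'),
  FakeJob.new('some-tube', 'flaky'),
  FakeJob.new('other-tube', '')
].map { |job| outcome_for(job) }
```

The three sample jobs map to `:delete`, `:release`, and `:bury` respectively, matching steps 4 to 6.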
334
+
335
+ The `process!` command is ideally suited for a beanstalk job processing daemon.
336
+
337
+ ### Handling Errors
338
+
339
+ While using Beaneater, you may encounter errors. These are raised when
340
+ a command is sent to beanstalk and something unexpected happens. The most common errors
341
+ are listed below:
342
+
343
+ | Errors | Description |
344
+ | -------------------- | ------------- |
345
+ | Beaneater::NotConnected | Client connection to beanstalk cannot be established. |
346
+ | Beaneater::InvalidTubeName | Specified tube name for use or watch is not valid. |
347
+ | Beaneater::NotFoundError | Specified job or tube could not be found. |
348
+ | Beaneater::TimedOutError | Job could not be reserved within time specified. |
349
+
350
+ There are other exceptions that are less common such as `OutOfMemoryError`, `DrainingError`,
351
+ `DeadlineSoonError`, `InternalError`, `BadFormatError`, `UnknownCommandError`,
352
+ `ExpectedCRLFError`, `JobTooBigError`, `NotIgnoredError`. Be sure to check the
353
+ [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for more information.
354
+
355
+
356
+ ### Stats
357
+
358
+ Beanstalk has plenty of commands for introspecting the state of the queues and jobs. To get stats for
359
+ beanstalk overall:
360
+
361
+ ```ruby
362
+ # Get overall stats about the job processing that has occurred
363
+ print @beanstalk.stats
364
+ # => { 'current_connections': 1, 'current_jobs_buried': 0, ... }
365
+ print @beanstalk.stats.current_connections
366
+ # => 1
367
+ ```
368
+
369
+ For stats on a particular tube:
370
+
371
+ ```ruby
372
+ # Get statistical information about the specified tube if it exists
373
+ print @beanstalk.tubes['some_tube_name'].stats
374
+ # => { 'current_jobs_ready': 0, 'current_jobs_reserved': 0, ... }
375
+ ```
376
+
377
+ For stats on an individual job:
378
+
379
+ ```ruby
380
+ # Get statistical information about the specified job if it exists
381
+ print @beanstalk.jobs[some_job_id].stats
382
+ # => {'age': 0, 'id': 2, 'state': 'reserved', 'tube': 'default', ... }
383
+ ```
384
+
385
+ Be sure to check the [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md) for
386
+ more details about the stats commands.
387
+
388
+ ## Resources
389
+
390
+ Other resources that are helpful when learning about beanstalk:
391
+
392
+ * [Beanstalkd homepage](http://kr.github.com/beanstalkd/)
393
+ * [beanstalk on github](https://github.com/kr/beanstalkd)
394
+ * [beanstalk protocol](https://github.com/kr/beanstalkd/blob/master/doc/protocol.md)
395
+
396
+ ## Contributors
397
+
398
+ - [Nico Taing](https://github.com/Nico-Taing) - Creator and co-maintainer
399
+ - [Nathan Esquenazi](https://github.com/nesquena) - Contributor and co-maintainer