typhoeus 0.2.4 → 0.3.2

@@ -1,372 +0,0 @@
1
- h1. Typhoeus
2
-
3
- "http://github.com/pauldix/typhoeus/tree/master":http://github.com/pauldix/typhoeus/tree/master
4
-
5
- "the mailing list":http://groups.google.com/group/typhoeus
6
-
7
- Thanks to my employer "kgbweb":http://kgbweb.com for allowing me to release this as open source. Btw, we're hiring and we work on cool stuff like this every day. Get a hold of me if you rock at rails/js/html/css or if you have experience in search, information retrieval, and machine learning.
8
-
9
- I also wanted to thank Todd A. Fisher. I ripped a good chunk of the c libcurl-multi code from his update to Curb. Awesome stuff Todd!
10
-
11
- h2. Summary
12
-
13
- Like a modern code version of the mythical beast with 100 serpent heads, Typhoeus runs HTTP requests in parallel while cleanly encapsulating handling logic. To be a little more specific, it's a library for accessing web services in Ruby. It's specifically designed for building RESTful service oriented architectures in Ruby that need to be fast enough to process calls to multiple services within the client's HTTP request/response life cycle.
14
-
15
- Some of the awesome features are parallel request execution, memoization of request responses (so you don't make the same request multiple times in a single group), built-in support for caching responses to memcached (or whatever), and mocking capability baked in. It uses libcurl and libcurl-multi to work this speedy magic. I wrote the C bindings myself so it's yet another Ruby libcurl library, but with some extra awesomeness added in.
16
-
17
- h2. Installation
18
-
19
- Typhoeus requires you to have a current version of libcurl installed. I've tested this with 7.19.4 and higher.
20
- <pre>
21
- gem install typhoeus
22
- </pre>
23
- If you're on Debian or Ubuntu and getting errors while trying to install, it could be because you don't have the latest version of libcurl installed. Do this to fix:
24
- <pre>
25
- sudo apt-get install libcurl4-gnutls-dev
26
- </pre>
27
- If you're stuck with a very old system version of curl that you can't get rid of for some reason, you can install a newer one in a user directory and point the gem at it during installation like so:
28
- <pre>
29
- gem install typhoeus --source http://gemcutter.org -- --with-curl=/usr/local/curl/7.19.7/
30
- </pre>
31
-
32
- -Another problem could be if you are running Mac Ports and you have libcurl installed through there. You need to uninstall it for Typhoeus to work! The version in Mac Ports is old and doesn't play nice. You should "download curl":http://curl.haxx.se/download.html and build from source. Then you'll have to install the gem again.- The current version of curl in Mac Ports (7.21.2) works just fine.
33
-
34
- If you're still having issues, please let me know on "the mailing list":http://groups.google.com/group/typhoeus.
35
-
36
- There's one other thing you should know. The Easy object (which is just a libcurl thing) allows you to set timeout values in milliseconds. However, for this to work you need to build libcurl with c-ares support built in.
37
-
38
- h2. Usage
39
-
40
- *Deprecation Warning!*
41
- The old version of Typhoeus used a module that you included in your class to get functionality. That interface has been deprecated. Here is the new interface.
42
-
43
- The primary interface for Typhoeus consists of three classes: Request, Response, and Hydra. Request represents an HTTP request, Response represents an HTTP response, and Hydra manages making parallel HTTP connections.
44
-
45
- <pre>
46
- require 'rubygems'
47
- require 'typhoeus'
48
- require 'json'
49
-
50
- # the request object
51
- request = Typhoeus::Request.new("http://www.pauldix.net",
52
- :body => "this is a request body",
53
- :method => :post,
54
- :headers => {:Accept => "text/html"},
55
- :timeout => 100, # milliseconds
56
- :cache_timeout => 60, # seconds
57
- :params => {:field1 => "a field"})
58
- # we can see from this that the first argument is the url. the second is a set of options.
59
- # the options are all optional. The default for :method is :get. Timeout is measured in milliseconds.
60
- # cache_timeout is measured in seconds.
61
-
62
- # Run the request via Hydra.
63
- hydra = Typhoeus::Hydra.new
64
- hydra.queue(request)
65
- hydra.run
66
-
67
- # the response object will be set after the request is run
68
- response = request.response
69
- response.code # http status code
70
- response.time # time in seconds the request took
71
- response.headers # the http headers
72
- response.headers_hash # http headers put into a hash
73
- response.body # the response body
74
- </pre>
75
-
76
- *Making Quick Requests*
77
- The request object has some convenience methods for performing single HTTP requests. The arguments are the same as those you pass into the request constructor.
78
-
79
- <pre>
80
- response = Typhoeus::Request.get("http://www.pauldix.net")
81
- response = Typhoeus::Request.head("http://www.pauldix.net")
82
- response = Typhoeus::Request.put("http://localhost:3000/posts/1", :body => "whoo, a body")
83
- response = Typhoeus::Request.post("http://localhost:3000/posts", :params => {:title => "test post", :content => "this is my test"})
84
- response = Typhoeus::Request.delete("http://localhost:3000/posts/1")
85
- </pre>
86
-
87
- *Handling file uploads*
88
- A File object can be passed as a param for a POST request to handle uploading files to the server. Typhoeus will upload the file as the original file name and use Mime::Types to set the content type.
89
-
90
- <pre>
91
- response = Typhoeus::Request.post("http://localhost:3000/posts",
92
- :params => {
93
- :title => "test post", :content => "this is my test",
94
- :file => File.open("thesis.txt","r")
95
- }
96
- )
97
- </pre>
98
-
99
- *Making Parallel Requests*
100
-
101
- <pre>
102
- # Generally, you should be running requests through hydra. Here is how that looks
103
- hydra = Typhoeus::Hydra.new
104
-
105
- first_request = Typhoeus::Request.new("http://localhost:3000/posts/1.json")
106
- first_request.on_complete do |response|
107
- post = JSON.parse(response.body)
108
- third_request = Typhoeus::Request.new(post["links"].first) # JSON.parse returns a Hash; this assumes the post JSON has a "links" array of urls
109
- third_request.on_complete do |response|
110
- # do something with that
111
- end
112
- hydra.queue third_request
113
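- post # the block's last value becomes handled_response (an explicit return inside a block raises LocalJumpError)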
114
- end
115
- second_request = Typhoeus::Request.new("http://localhost:3000/users/1.json")
116
- second_request.on_complete do |response|
117
- JSON.parse(response.body)
118
- end
119
- hydra.queue first_request
120
- hydra.queue second_request
121
- hydra.run # this is a blocking call that returns once all requests are complete
122
-
123
- first_request.handled_response # the value returned from the on_complete block
124
- second_request.handled_response # the value returned from the on_complete block (parsed JSON)
125
- </pre>
126
-
127
- The execution of that code goes something like this. The first and second requests are built and queued. When hydra is run the first and second requests run in parallel. When the first request completes, the third request is then built and queued up. The moment it is queued Hydra starts executing it. Meanwhile the second request would continue to run (or it could have completed before the first). Once the third request is done, hydra.run returns.
128
-
129
- *Specifying Max Concurrency*
130
-
131
- Hydra also limits how many requests run in parallel. Things get flaky if you try to make too many requests at the same time. The built-in limit is 200. When more requests than that are queued up, hydra holds the extras and starts them as others finish. You can raise or lower the concurrency limit through the Hydra constructor.
132
-
133
- <pre>
134
- hydra = Typhoeus::Hydra.new(:max_concurrency => 20) # keep from killing some servers
135
- </pre>
136
-
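To make the "save them for later and start them as others finish" behavior concrete, here is a minimal sketch in plain Ruby. It is not how Hydra is implemented (Hydra uses libcurl-multi, not threads); @run_bounded@ and everything inside it are hypothetical names used only for illustration.

```ruby
require 'thread'

# Hypothetical helper, not part of Typhoeus: runs every job while keeping at
# most max_concurrency of them in flight, and reports the observed peak.
def run_bounded(jobs, max_concurrency)
  queue = Queue.new
  jobs.each { |job| queue << job }

  lock = Mutex.new
  in_flight = 0
  peak = 0
  results = []

  workers = Array.new(max_concurrency) do
    Thread.new do
      loop do
        job = begin
          queue.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break
        end
        lock.synchronize do
          in_flight += 1
          peak = in_flight if in_flight > peak
        end
        value = job.call
        lock.synchronize do
          in_flight -= 1
          results << value
        end
      end
    end
  end

  workers.each(&:join)
  [results, peak]
end

# Ten fake "requests" that each take 10ms, limited to three at a time.
jobs = (1..10).map { |i| lambda { sleep 0.01; i * 2 } }
results, peak = run_bounded(jobs, 3)
```

All ten jobs complete, but @peak@ never exceeds the limit passed in, which is the same contract @:max_concurrency@ describes.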
137
- *Memoization*
138
- Hydra memoizes requests within a single run call. You can also disable memoization.
139
-
140
- <pre>
141
- hydra = Typhoeus::Hydra.new
142
- 2.times do
143
- r = Typhoeus::Request.new("http://localhost:3000/users/1")
144
- hydra.queue r
145
- end
146
- hydra.run # this will result in a single request being issued. However, the on_complete handlers of both will be called.
147
- hydra.disable_memoization
148
- 2.times do
149
- r = Typhoeus::Request.new("http://localhost:3000/users/1")
150
- hydra.queue r
151
- end
152
- hydra.run # this will result in two requests.
153
- </pre>
154
-
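The idea behind per-run memoization can be sketched without Typhoeus at all. The class below is a hypothetical stand-in (@MemoizingQueue@ and its methods are invented for this example, and the "fetch" is just a block), assuming requests are considered identical when their URLs match.

```ruby
# Hypothetical stand-in for a queue that memoizes by URL within one run call.
class MemoizingQueue
  attr_reader :fetch_count

  def initialize(&fetcher)
    @fetcher = fetcher # block that performs the "request" for a URL
    @queued = []       # [url, handler] pairs in queue order
    @fetch_count = 0
  end

  def queue(url, &handler)
    @queued << [url, handler]
  end

  def run
    memo = {} # lives only for the duration of this run call
    @queued.each do |url, handler|
      unless memo.key?(url)
        @fetch_count += 1
        memo[url] = @fetcher.call(url)
      end
      handler.call(memo[url]) # every handler still fires
    end
    @queued.clear
  end
end

handled = []
q = MemoizingQueue.new { |url| "body of #{url}" }
2.times { q.queue("http://localhost:3000/users/1") { |body| handled << body } }
q.run
```

After the run, @q.fetch_count@ is 1 while @handled@ holds two entries: one fetch, two handler calls.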
155
- *Caching*
156
- Hydra includes built in support for creating cache getters and setters. In the following example, if there is a cache hit, the cached object is passed to the on_complete handler of the request object.
157
-
158
- <pre>
159
- hydra = Typhoeus::Hydra.new
160
- hydra.cache_setter do |request|
161
- @cache.set(request.cache_key, request.response, request.cache_timeout)
162
- end
163
-
164
- hydra.cache_getter do |request|
165
- @cache.get(request.cache_key) rescue nil
166
- end
167
- </pre>
168
-
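The @@cache@ object in that example is whatever store you supply. For trying the getter and setter out locally, a small in-memory store with the same @set(key, value, timeout)@ and @get(key)@ shape might look like the sketch below. @MemoryCache@ is a hypothetical class, not something Typhoeus ships, and the injectable clock exists only to make expiry easy to demonstrate.

```ruby
# Hypothetical in-memory store with the same set/get shape as @cache above.
class MemoryCache
  def initialize(clock = lambda { Time.now.to_f })
    @clock = clock # callable returning the current time in seconds
    @entries = {}  # key => [value, expires_at]
  end

  def set(key, value, timeout)
    @entries[key] = [value, @clock.call + timeout]
  end

  def get(key)
    value, expires_at = @entries[key]
    return nil if expires_at.nil? # never stored
    if @clock.call >= expires_at
      @entries.delete(key)        # expired, so drop it
      return nil
    end
    value
  end
end

# Drive the clock by hand to show a hit followed by an expiry.
now = 1000.0
cache = MemoryCache.new(lambda { now })
cache.set("users/1", "cached response", 60)
fresh = cache.get("users/1")
now += 61
stale = cache.get("users/1")
```

Here @fresh@ is the stored value and @stale@ is nil, since 61 seconds have passed against a 60 second timeout.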
169
- *Direct Stubbing*
170
- Hydra allows you to stub out specific urls and patterns to avoid hitting remote servers while testing.
171
-
172
- <pre>
173
- hydra = Typhoeus::Hydra.new
174
- response = Typhoeus::Response.new(:code => 200, :headers => "", :body => '{"name" : "paul"}', :time => 0.3) # double-quoted keys so JSON.parse accepts the body
175
- hydra.stub(:get, "http://localhost:3000/users/1").and_return(response)
176
-
177
- request = Typhoeus::Request.new("http://localhost:3000/users/1")
178
- request.on_complete do |response|
179
- JSON.parse(response.body)
180
- end
181
- hydra.queue request
182
- hydra.run
183
- </pre>
184
-
185
- The queued request will hit the stub. The on_complete handler will be called and will be passed the response object. You can also specify a regex to match urls.
186
-
187
- <pre>
188
- hydra.stub(:get, /http\:\/\/localhost\:3000\/users\/.*/).and_return(response)
189
- # any requests for a user will be stubbed out with the pre built response.
190
- </pre>
191
-
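It can help to check a stub pattern against a few urls before relying on it. The sketch below uses only Ruby's own regex matching and assumes the pattern is applied as an ordinary unanchored match; the sample urls are made up.

```ruby
user_url = /http\:\/\/localhost\:3000\/users\/.*/

candidates = [
  "http://localhost:3000/users/1",
  "http://localhost:3000/users/42/posts",
  "http://localhost:3000/posts/1",
  "http://localhost:3001/users/1"
]

# Keep only the urls the pattern would catch.
matches = candidates.select { |url| url =~ user_url }
```

Only the first two urls match. Because the pattern is not anchored, it also matches urls that merely contain that prefix, so start it with @\A@ if that matters for your tests.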
192
- *The Singleton*
193
- All of the quick requests are done using the singleton hydra object. If you want to enable caching or stubbing on the quick requests, set those options on the singleton.
194
-
195
- <pre>
196
- hydra = Typhoeus::Hydra.hydra
197
- hydra.stub(:get, "http://localhost:3000/users")
198
- </pre>
199
-
200
- *Timeouts*
201
-
202
- No exceptions are raised on HTTP timeouts. You can check whether a request timed out with the following methods:
203
-
204
- <pre>
205
- easy.timed_out? # for a raw Easy handle
206
- response.timed_out? # for a Response handle
207
- </pre>
208
-
209
- *Basic Authentication*
210
-
211
- <pre>
212
- response = Typhoeus::Request.get("http://twitter.com/statuses/followers.json",
213
- :username => username, :password => password)
214
- </pre>
215
-
216
- *SSL*
217
- SSL comes built in to libcurl so it's in Typhoeus as well. If you pass in a url with "https" it should just work assuming that you have your "cert bundle":http://curl.haxx.se/docs/caextract.html in order and the server is verifiable. You must also have libcurl built with SSL support enabled. You can check that by doing this:
218
-
219
- <pre>
220
- Typhoeus::Easy.new.curl_version # output should include OpenSSL/...
221
- </pre>
222
-
223
- Even with libcurl built against OpenSSL, you may still have a messed up cert bundle, or you may be hitting a non-verifiable SSL server. In either case you'll have to disable peer verification to make SSL work, like this:
224
-
225
- <pre>
226
- Typhoeus::Request.get("https://mail.google.com/mail", :disable_ssl_peer_verification => true)
227
- </pre>
228
-
229
- *LibCurl*
230
- Typhoeus also has a more raw libcurl interface. These are the Easy and Multi objects. If you're into accessing just the raw libcurl style, those are your best bet.
231
-
232
- However, by using this raw interface, you do not get access to Hydra-specific features, such as stubbing/mocking.
233
-
234
- SSL Certs can be provided to the Easy interface:
235
-
236
- <pre>
237
- e = Typhoeus::Easy.new
238
- e.url = "https://example.com/action"
239
- e.ssl_cacert = "ca_file.cer"
240
- e.ssl_cert = "acert.crt"
241
- e.ssl_key = "akey.key"
242
- [...]
243
- e.perform
244
- </pre>
245
-
246
- or directly to a Typhoeus::Request :
247
-
248
- <pre>
249
- e = Typhoeus::Request.get("https://example.com/action",
250
- :ssl_cacert => "ca_file.cer",
251
- :ssl_cert => "acert.crt",
252
- :ssl_key => "akey.key",
253
- [...]
254
- )
255
- </pre>
256
-
257
- h2. Advanced authentication
258
-
259
- Thanks for the authentication piece and this description go to Oleg Ivanov (morhekil). The major reason to start this fork was the need to perform NTLM authentication in Ruby, and libcurl's other authentication methods were made possible as a result. Now you can use them via the Typhoeus::Easy interface with the following API.
260
-
261
- <pre>
262
- e = Typhoeus::Easy.new
263
- e.auth = {
264
- :username => 'username',
265
- :password => 'password',
266
- :method => Typhoeus::Easy::AUTH_TYPES[:CURLAUTH_NTLM]
267
- }
268
- e.url = "http://example.com/auth_ntlm"
269
- e.method = :get
270
- e.perform
271
- </pre>
272
-
273
- *Other authentication types*
274
-
275
- The following authentication types are available:
276
- * CURLAUTH_BASIC
277
- * CURLAUTH_DIGEST
278
- * CURLAUTH_GSSNEGOTIATE
279
- * CURLAUTH_NTLM
280
- * CURLAUTH_DIGEST_IE
281
- * CURLAUTH_AUTO
282
-
283
- The last one (CURLAUTH_AUTO) is really a combination of all previous methods and is provided by Typhoeus for convenience. When you set authentication to auto, Typhoeus will retrieve the given URL first and examine its headers to confirm what auth types are supported by the server. Then it will select the strongest of the available auth methods and send the second request using the selected authentication method.
284
-
285
- *Authentication via the quick request interface*
286
-
287
- There's also an easy way to perform any kind of authentication via the quick request interface:
288
-
289
- <pre>
290
- e = Typhoeus::Request.get("http://example.com",
291
- :username => 'username',
292
- :password => 'password',
293
- :auth_method => :ntlm)
294
- </pre>
295
-
296
- All methods listed above are available in a shorter form - :basic, :digest, :gssnegotiate, :ntlm, :digest_ie, :auto.
297
-
298
- *Query of available auth types*
299
-
300
- After the initial request you can get the authentication types available on the server via the Typhoeus::Easy#auth_methods call. It will return a number
301
- that you'll need to decode yourself. Please refer to the easy.rb source code to see the numeric values of the different auth types.
302
-
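Decoding that number is a matter of testing bits. The sketch below assumes the returned value is a bitmask using the CURLAUTH_* values from libcurl's curl.h (1, 2, 4, 8, 16); @AUTH_BITS@ and @decode_auth_methods@ are hypothetical names, and the table should be checked against AUTH_TYPES in easy.rb before you rely on it.

```ruby
# Bit values as defined for CURLAUTH_* in libcurl's curl.h. This table is an
# assumption for illustration; confirm it against AUTH_TYPES in easy.rb.
AUTH_BITS = {
  :basic        => 1,
  :digest       => 2,
  :gssnegotiate => 4,
  :ntlm         => 8,
  :digest_ie    => 16
}

# Hypothetical helper: returns the names whose bit is set in the mask.
def decode_auth_methods(mask)
  AUTH_BITS.select { |_name, bit| (mask & bit) != 0 }.keys
end

decoded = decode_auth_methods(10) # 10 = 2 | 8
```

With a mask of 10 the helper returns @[:digest, :ntlm]@, meaning the server offered Digest and NTLM.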
303
- h2. Verbose debug output
304
-
305
- Sometimes it's useful to see verbose output from curl. You may now enable it:
306
-
307
- <pre>
308
- e = Typhoeus::Easy.new
309
- e.verbose = 1
310
- </pre>
311
-
312
- or using the quick request:
313
-
314
- <pre>
315
- e = Typhoeus::Request.get("http://example.com", :verbose => true)
316
- </pre>
317
-
318
- Just remember that libcurl prints its debug output to the console (to STDERR), so you'll need to run your scripts from the console to see it.
319
-
320
- h2. Benchmarks
321
-
322
- I set up a benchmark to test how the parallel performance works vs Ruby's built-in Net::HTTP. The setup was a local evented HTTP server that would take a request, sleep for 500 milliseconds and then issue a blank response. I set up the client to call this 20 times. Here are the results:
323
-
324
- <pre>
325
- net::http 0.030000 0.010000 0.040000 ( 10.054327)
326
- typhoeus 0.020000 0.070000 0.090000 ( 0.508817)
327
- </pre>
328
-
329
- We can see from this that Net::HTTP performs as expected, taking 10 seconds to run 20 500ms requests. Typhoeus only takes 500ms (the time of the response that took the longest). One other thing to note is that Typhoeus keeps a pool of libcurl Easy handles to use. For this benchmark I warmed the pool first. So if you test this out it may be a bit slower until the Easy handle pool has enough in it to run all the simultaneous requests. For some reason the easy handles can take quite some time to allocate.
330
-
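The two wall-clock figures follow from simple arithmetic: serial time is the sum of the request latencies, parallel time is roughly the slowest single request. A quick back-of-the-envelope check, ignoring connection and handle setup overhead:

```ruby
calls = 20
latency = 0.5 # seconds the test server sleeps per request

latencies = Array.new(calls, latency)

serial_estimate = latencies.inject(0.0) { |sum, t| sum + t } # one after another
parallel_estimate = latencies.max                            # all in flight at once
```

That gives 10.0 seconds for the serial case and 0.5 seconds for the parallel case, which lines up with the measured 10.05s and 0.51s.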
331
- h2. Running the specs
332
-
333
- Running the specs requires a couple of Sinatra servers to be booted. Do this:
334
-
335
- <pre>
336
- ruby spec/servers/app.rb -p 3000
337
- ruby spec/servers/app.rb -p 3001
338
- ruby spec/servers/app.rb -p 3002
339
- rake spec
340
- </pre>
341
-
342
- h2. Next Steps
343
-
344
- * Add in ability to keep-alive requests and reuse them within hydra.
345
- * Add support for automatic retry, exponential back-off, and queuing for later.
346
-
347
- h2. LICENSE
348
-
349
- (The MIT License)
350
-
351
- Copyright (c) 2009:
352
-
353
- "Paul Dix":http://pauldix.net
354
-
355
- Permission is hereby granted, free of charge, to any person obtaining
356
- a copy of this software and associated documentation files (the
357
- 'Software'), to deal in the Software without restriction, including
358
- without limitation the rights to use, copy, modify, merge, publish,
359
- distribute, sublicense, and/or sell copies of the Software, and to
360
- permit persons to whom the Software is furnished to do so, subject to
361
- the following conditions:
362
-
363
- The above copyright notice and this permission notice shall be
364
- included in all copies or substantial portions of the Software.
365
-
366
- THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
367
- EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
368
- MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
369
- IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
370
- CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
371
- TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
372
- SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/VERSION DELETED
@@ -1 +0,0 @@
1
- 0.2.4
@@ -1,25 +0,0 @@
1
- require File.dirname(__FILE__) + '/../lib/typhoeus.rb'
2
- require 'rubygems'
3
- require 'ruby-prof'
4
-
5
- calls = 20
6
- @klass = Class.new do
7
- include Typhoeus
8
- end
9
-
10
- Typhoeus.init_easy_objects
11
-
12
- RubyProf.start
13
-
14
- responses = []
15
- calls.times do |i|
16
- responses << @klass.get("http://127.0.0.1:3000/#{i}")
17
- end
18
-
19
- responses.each {|r| }#raise unless r.response_body == "whatever"}
20
-
21
- result = RubyProf.stop
22
-
23
- # Print a flat profile to text
24
- printer = RubyProf::FlatPrinter.new(result)
25
- printer.print(STDOUT, 0)
@@ -1,35 +0,0 @@
1
- require 'rubygems'
2
- require File.dirname(__FILE__) + '/../lib/typhoeus.rb'
3
- require 'open-uri'
4
- require 'benchmark'
5
- include Benchmark
6
-
7
-
8
- calls = 20
9
- @klass = Class.new do
10
- include Typhoeus
11
- end
12
-
13
- Typhoeus.init_easy_object_pool
14
-
15
- benchmark do |t|
16
- t.report("net::http") do
17
- responses = []
18
-
19
- calls.times do |i|
20
- responses << open("http://127.0.0.1:3000/#{i}").read
21
- end
22
-
23
- responses.each {|r| raise unless r == "whatever"}
24
- end
25
-
26
- t.report("typhoeus") do
27
- responses = []
28
-
29
- calls.times do |i|
30
- responses << @klass.get("http://127.0.0.1:3000/#{i}")
31
- end
32
-
33
- responses.each {|r| raise unless r.body == "whatever"}
34
- end
35
- end
@@ -1,12 +0,0 @@
1
- require 'rubygems'
2
- require File.dirname(__FILE__) + '/../lib/typhoeus.rb'
3
-
4
-
5
- response = Typhoeus::Request.post(
6
- "http://video-feed.local",
7
- :params => {
8
- :file => File.new("file.rb")
9
- }
10
- )
11
-
12
- puts response.inspect