chuckle 1.0.0

data/.gitignore ADDED
@@ -0,0 +1,6 @@
1
+ *.gem
2
+ .bundle
3
+ .rake-complete-cache
4
+ Gemfile.lock
5
+ pkg/*
6
+ rdoc
data/Gemfile ADDED
@@ -0,0 +1,2 @@
1
+ source "http://rubygems.org"
2
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2013 Adam Doppelt
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,64 @@
1
+ # Welcome to chuckle
2
+
3
+ Chuckle is an http client with a disk-based cache. HTTP responses are cached on disk in a way that makes them easy to find and debug. The cache can be shared between machines. Features:
4
+
5
+ * Disk caching, with expiration
6
+ * Cookies (also cached)
7
+ * Robust support for timeouts and retries (also cached)
8
+
9
+ This gem is an extraction from [Dwellable](http://dwellable.com). We use it for tasks such as:
10
+
11
+ * Crawling content from partner web sites, including sites that require cookies
12
+ * Downloading photo sets from partner web sites
13
+ * Broken link detection
14
+ * Wrapping slow or rate-limited APIs with caching
15
+ * Sharing cached http responses between developers or machines
16
+
17
+ ## Install
18
+
19
+ ```sh
20
+ gem install chuckle
21
+ ```
22
+
23
+ ## Example Usage
24
+
25
+ ```ruby
26
+ require "chuckle"
27
+
28
+ client = Chuckle::Client.new
29
+
30
+ # This makes a network request. The response will be cached in
31
+ # ~/.chuckle/www.google.com/_root_
32
+ #
33
+ # => Chuckle::Response http://www.google.com code=200
34
+ p client.get("http://www.google.com")
35
+
36
+ # Since the response is now cached, no network is required here.
37
+ #
38
+ # => Chuckle::Response http://www.google.com code=200
39
+ p client.get("http://www.google.com")
40
+ ```
41
+
42
+ ## Options
43
+
44
+ Pass these to `Chuckle::Client.new`:
45
+
46
+ * **cache_dir** (~/.chuckle) - where chuckle should cache files. If HOME doesn't exist or isn't writable, it'll use `/tmp/chuckle` instead
47
+ * **cache_errors** (true) - false to not cache errors on disk (timeouts, http status >= 400, etc.)
48
+ * **cookies** (false) - true to turn on cookie support
49
+ * **expires_in** (:never) - time in seconds after which cache files should expire, or `:never` to never expire
50
+ * **nretries** (2) - number of times to retry a failing request
51
+ * **rate_limit** (1) - number of seconds between requests
52
+ * **timeout** (30) - timeout per request, in seconds. Note that if `nretries` is 2 and `timeout` is 30, a failing request makes up to three attempts and could take 90 seconds to truly fail.
53
+ * **user_agent** - the user agent. Defaults to the IE9 user agent.
54
+ * **verbose** (false) - if true, prints each request before fetching. Only prints network requests.
55
+
56
+
57
+
58
+
59
+ ## Limitations
60
+
61
+ * Only supports GET and POST.
62
+ * Cookies aren't intended to be cached, but we do it anyway. This works fine for our use case, where we crawl a site all at once and then clear the cache before re-crawling. Also, cookies are cached on a per-host basis so subdomains and wildcard cookies will be a problem. Use caution!
63
+ * The cache naming scheme isn't perfect. It's theoretically possible for two URLs to map to the same cache filename, though we haven't seen this happen in the wild.
64
+ * Chuckle shells out to [curl](http://curl.haxx.se/) for each request. Curl is rock solid and has great support for cookies, timeouts, compression, retries, etc. Spawning a process per request makes Chuckle slower than other http clients, though network speed and rate limiting dwarf all such considerations.
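The 90-second figure in the `timeout` option is simple arithmetic: the initial attempt plus each retry, every one bounded by `timeout`. A minimal sketch, assuming every attempt runs to the full timeout and ignoring any pause curl takes between retries (`worst_case_seconds` is a hypothetical helper, not part of Chuckle):

```ruby
# Hypothetical helper, not part of Chuckle: upper bound on how long a
# request can take before it finally fails, assuming every attempt
# runs to the full timeout.
def worst_case_seconds(nretries, timeout)
  attempts = nretries + 1 # the initial attempt plus each retry
  attempts * timeout
end

puts worst_case_seconds(2, 30) # the defaults: prints 90
```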
data/Rakefile ADDED
@@ -0,0 +1,42 @@
1
+ require "bundler"
2
+ require "bundler/setup"
3
+ require "rake/testtask"
4
+ require "rdoc/task"
5
+
6
+ #
7
+ # gem
8
+ #
9
+
10
+ task gem: :build
11
+ task :build do
12
+ system "gem build --quiet chuckle.gemspec"
13
+ end
14
+
15
+ task install: :build do
16
+ system "sudo gem install --quiet chuckle-#{Chuckle::VERSION}.gem"
17
+ end
18
+
19
+ task release: :build do
20
+ system "git tag -a #{Chuckle::VERSION} -m 'Tagging #{Chuckle::VERSION}'"
21
+ system "git push --tags"
22
+ system "gem push chuckle-#{Chuckle::VERSION}.gem"
23
+ end
24
+
25
+ #
26
+ # test
27
+ #
28
+
29
+ Rake::TestTask.new(:test) do |test|
30
+ test.libs << "test"
31
+ end
32
+ task default: :test
33
+
34
+ #
35
+ # rdoc
36
+ #
37
+
38
+ RDoc::Task.new do |rdoc|
39
+ rdoc.rdoc_dir = "rdoc"
40
+ rdoc.title = "chuckle #{Chuckle::VERSION}"
41
+ rdoc.rdoc_files.include("lib/**/*.rb")
42
+ end
data/chuckle.gemspec ADDED
@@ -0,0 +1,28 @@
1
+ $LOAD_PATH << File.expand_path("../lib", __FILE__)
2
+
3
+ require "chuckle/version"
4
+
5
+ Gem::Specification.new do |s|
6
+ s.name = "chuckle"
7
+ s.version = Chuckle::VERSION
8
+ s.platform = Gem::Platform::RUBY
9
+ s.required_ruby_version = ">= 1.9.0"
10
+ s.authors = ["Adam Doppelt"]
11
+ s.email = ["amd@gurge.com"]
12
+ s.homepage = "http://github.com/gurgeous/chuckle"
13
+ s.summary = "Chuckle - caching http client that wraps curl."
14
+ s.description = "A caching http client that wraps curl."
15
+
16
+ s.rubyforge_project = "chuckle"
17
+
18
+ s.add_development_dependency "awesome_print"
19
+ s.add_development_dependency "json"
20
+ s.add_development_dependency "minitest", "~> 5.0"
21
+ s.add_development_dependency "rake"
22
+ s.add_development_dependency "rdoc"
23
+
24
+ s.files = `git ls-files`.split("\n")
25
+ s.test_files = `git ls-files -- test/*`.split("\n")
26
+ s.executables = `git ls-files -- bin/*`.split("\n").map { |i| File.basename(i) }
27
+ s.require_paths = ["lib"]
28
+ end
data/lib/chuckle.rb ADDED
@@ -0,0 +1,12 @@
1
+ require "chuckle/cache"
2
+ require "chuckle/cookie_jar"
3
+ require "chuckle/error"
4
+ require "chuckle/options"
5
+ require "chuckle/request"
6
+ require "chuckle/response"
7
+ require "chuckle/util"
8
+ require "chuckle/version"
9
+
10
+ # these last
11
+ require "chuckle/curl"
12
+ require "chuckle/client"
data/lib/chuckle/cache.rb ADDED
@@ -0,0 +1,89 @@
1
+ require "fileutils"
2
+
3
+ module Chuckle
4
+ class Cache
5
+ attr_accessor :hits, :misses
6
+
7
+ def initialize(client)
8
+ @client = client
9
+
10
+ self.hits = self.misses = 0
11
+ end
12
+
13
+ def get(request)
14
+ if !exists?(request) || expired?(request)
15
+ self.misses += 1
16
+ return
17
+ end
18
+ self.hits += 1
19
+ Response.new(request)
20
+ end
21
+
22
+ def set(request, curl)
23
+ %w(body headers).each do |i|
24
+ src, dst = curl.send("#{i}_path"), request.send("#{i}_path")
25
+ FileUtils.mkdir_p(File.dirname(dst))
26
+ FileUtils.mv(src, dst)
27
+ end
28
+ Response.new(request)
29
+ end
30
+
31
+ def clear(request)
32
+ Util.rm_if_necessary(request.headers_path)
33
+ Util.rm_if_necessary(request.body_path)
34
+ end
35
+
36
+ def exists?(request)
37
+ File.exist?(request.body_path)
38
+ end
39
+
40
+ def expired?(request)
41
+ return false if @client.expires_in == :never
42
+ return false if !exists?(request)
43
+ if File.stat(request.body_path).mtime + @client.expires_in < Time.now
44
+ clear(request)
45
+ true
46
+ end
47
+ end
48
+
49
+ def body_path(request)
50
+ uri = request.uri
51
+
52
+ # calculate body_path
53
+ s = @client.cache_dir
54
+ s = "#{s}/#{pathify(uri.host)}"
55
+ s = "#{s}/#{pathify(uri.path)}"
56
+ if uri.query
57
+ q = "?#{uri.query}"
58
+ s = "#{s}#{pathify(q)}"
59
+ end
60
+ if body = request.body
61
+ s = "#{s},#{pathify(body)}"
62
+ end
63
+
64
+ # shorten long paths to md5 checksum
65
+ if s.length > 250
66
+ dir, base = File.dirname(s), File.basename(s)
67
+ s = "#{dir}/#{Util.md5(base)}"
68
+ end
69
+
70
+ s
71
+ end
72
+
73
+ protected
74
+
75
+ # turn s into a string that can be a path
76
+ def pathify(s)
77
+ s = s.gsub(/^\//, "")
78
+ s = s.gsub("..", ",")
79
+ s = s.gsub(/[?\/&]/, ",")
80
+ s = s.gsub(/[^A-Za-z0-9_.,=%-]/) do |i|
81
+ hex = i.unpack("H2").first
82
+ "%#{hex}"
83
+ end
84
+ s = s.downcase
85
+ s = "_root_" if s.empty?
86
+ s
87
+ end
88
+ end
89
+ end
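The README's warning that two URLs can map to the same cache filename follows from `pathify` above: it lowercases its input and folds `?`, `/` and `&` into commas. A standalone sketch that copies the method body so it runs without the gem (the sample paths are made up for illustration):

```ruby
# Standalone copy of the pathify logic shown above, so the example
# runs without the chuckle gem installed.
def pathify(s)
  s = s.gsub(/^\//, "")            # drop the leading slash
  s = s.gsub("..", ",")            # no parent-directory segments
  s = s.gsub(/[?\/&]/, ",")        # separators all become commas
  s = s.gsub(/[^A-Za-z0-9_.,=%-]/) { |c| "%#{c.unpack("H2").first}" }
  s = s.downcase
  s.empty? ? "_root_" : s
end

p pathify("")       # => "_root_"
p pathify("/a/b")   # => "a,b"
p pathify("/A&b")   # => "a,b" (collides with the path above)
p pathify("/x y")   # => "x%20y"
```

Two different paths on the same host that reduce to the same string would share one cache file, which is the collision the README describes.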
data/lib/chuckle/client.rb ADDED
@@ -0,0 +1,89 @@
1
+ require "uri"
2
+
3
+ module Chuckle
4
+ class Client
5
+ include Chuckle::Options
6
+
7
+ attr_accessor :options, :cache
8
+
9
+ def initialize(options = {})
10
+ self.options = DEFAULT_OPTIONS.merge(options)
11
+ self.cache = Cache.new(self)
12
+ sanity!
13
+ end
14
+
15
+ #
16
+ # main entry points
17
+ #
18
+
19
+ def create_request(uri, body = nil)
20
+ uri = URI.parse(uri.to_s) if !uri.is_a?(URI)
21
+ Request.new(self, uri, body)
22
+ end
23
+
24
+ def get(uri)
25
+ run(create_request(uri))
26
+ end
27
+
28
+ def post(uri, body)
29
+ body = case body
30
+ when Hash
31
+ Util.hash_to_query(body)
32
+ else
33
+ body.to_s
34
+ end
35
+ run(create_request(uri, body))
36
+ end
37
+
38
+ def run(request)
39
+ response = cache.get(request) || curl(request)
40
+ raise_errors(response)
41
+ response
42
+ end
43
+
44
+ def inspect #:nodoc:
45
+ self.class.name
46
+ end
47
+
48
+ protected
49
+
50
+ # make sure curl command exists
51
+ def sanity!
52
+ system("which curl > /dev/null")
53
+ raise "Chuckle requires curl. Please install it." if $? != 0
54
+ end
55
+
56
+ def curl(request)
57
+ vputs request.uri
58
+ rate_limit!(request)
59
+ cache.set(request, Curl.new(request))
60
+ end
61
+
62
+ def raise_errors(response)
63
+ # raise errors if necessary
64
+ error = if response.curl_exit_code
65
+ "curl exit code #{response.curl_exit_code}"
66
+ elsif response.code >= 400
67
+ "http status #{response.code}"
68
+ end
69
+ return if !error
70
+
71
+ if !cache_errors?
72
+ cache.clear(response.request)
73
+ end
74
+ raise Error.new("#{Error}, #{error}", response)
75
+ end
76
+
77
+ def vputs(s)
78
+ puts "chuckle: #{s}" if verbose?
79
+ end
80
+
81
+ def rate_limit!(request)
82
+ return if !request.uri.host
83
+ @last_request ||= Time.at(0)
84
+ delay = (@last_request + rate_limit) - Time.now
85
+ sleep(delay) if delay > 0
86
+ @last_request = Time.now
87
+ end
88
+ end
89
+ end
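The rate limiting in `Client#rate_limit!` above comes down to one subtraction: the earliest time the next request may start, minus the current time. A minimal sketch of that calculation as a pure function, so it can be checked without sleeping (`rate_limit_delay` is a hypothetical name, not part of Chuckle):

```ruby
# Hypothetical helper mirroring Client#rate_limit! above: how many
# seconds to pause before the next request may go out.
def rate_limit_delay(last_request, interval, now = Time.now)
  delay = (last_request + interval) - now # Time - Time gives a Float
  delay > 0 ? delay : 0
end

last = Time.at(1_000)
p rate_limit_delay(last, 1, last + 0.25) # part of the interval remains
p rate_limit_delay(last, 1, last + 5)    # interval already passed: 0
```

Because only cache misses reach curl, cached responses never pay this delay.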
data/lib/chuckle/cookie_jar.rb ADDED
@@ -0,0 +1,24 @@
1
+ module Chuckle
2
+ class CookieJar
3
+ PATH = "/_chuckle_cookies.txt"
4
+
5
+ def initialize(request)
6
+ @request = request
7
+ end
8
+
9
+ def bogus_request
10
+ @bogus_request ||= Request.new(@request.client, @request.uri + PATH)
11
+ end
12
+
13
+ def path
14
+ bogus_request.body_path
15
+ end
16
+
17
+ def preflight
18
+ # expire the cookie jar if necessary
19
+ bogus_request.client.cache.expired?(bogus_request)
20
+ # mkdir
21
+ FileUtils.mkdir_p(File.dirname(path))
22
+ end
23
+ end
24
+ end
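`CookieJar#bogus_request` above relies on `URI#+`, which resolves an absolute path against the request URI and so discards the original path and query. A small sketch of that resolution, which also shows why the jar is per-host (`cookie_jar_uri` and the example hosts are made up for illustration):

```ruby
require "uri"

# Hypothetical helper: the URI whose cache path holds the cookie jar.
# The default path is the same value as CookieJar::PATH above.
def cookie_jar_uri(url, path = "/_chuckle_cookies.txt")
  URI.parse(url) + path
end

puts cookie_jar_uri("http://example.com/a/b?x=1")
puts cookie_jar_uri("http://www.example.com/")
```

The two results differ only in host, so `example.com` and `www.example.com` get separate jars. That is the subdomain caveat the README mentions.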
data/lib/chuckle/curl.rb ADDED
@@ -0,0 +1,90 @@
1
+ require "fileutils"
2
+
3
+ module Chuckle
4
+ class Curl
5
+ def initialize(request)
6
+ @request = request
7
+ run
8
+ end
9
+
10
+ # tmp path for response headers
11
+ def headers_path
12
+ @headers_path ||= Util.tmp_path
13
+ end
14
+
15
+ # tmp path for response body
16
+ def body_path
17
+ @body_path ||= Util.tmp_path
18
+ end
19
+
20
+ def run
21
+ # note: explicitly use Kernel.system to allow for mocking
22
+ command = command(@request)
23
+ Kernel.system(*command)
24
+
25
+ # capture exit code, bail on INT
26
+ exit_code = $?.to_i / 256
27
+ if $?.termsig == Signal.list["INT"]
28
+ Process.kill(:INT, $$)
29
+ end
30
+
31
+ # create tmp files if there were errors
32
+ if !File.exist?(body_path)
33
+ FileUtils.touch(body_path)
34
+ end
35
+ if exit_code != 0
36
+ IO.write(headers_path, Curl.exit_code_to_headers(exit_code))
37
+ end
38
+ end
39
+
40
+ def self.exit_code_to_headers(exit_code)
41
+ "exit_code #{exit_code}"
42
+ end
43
+
44
+ def self.exit_code_from_headers(headers)
45
+ if exit_code = headers[/^exit_code (\d+)/, 1]
46
+ exit_code.to_i
47
+ end
48
+ end
49
+
50
+ protected
51
+
52
+ # the command line for this request, based on the request and the
53
+ # options from client
54
+ def command(request)
55
+ client = request.client
56
+
57
+ command = ["curl"]
58
+ command << "--silent"
59
+ command << "--compressed"
60
+
61
+ command += [ "--user-agent", client.user_agent]
62
+ command += ["--max-time", client.timeout]
63
+ command += ["--retry", client.nretries]
64
+ command += ["--location", "--max-redirs", 3]
65
+
66
+ if request.body
67
+ command += ["--data-binary", request.body]
68
+ command += ["--header", "Content-Type: application/x-www-form-urlencoded"]
69
+ end
70
+
71
+ if client.cookies?
72
+ cookie_jar.preflight
73
+ command += ["--cookie", cookie_jar.path]
74
+ command += ["--cookie-jar", cookie_jar.path]
75
+ end
76
+
77
+ command += ["--dump-header", headers_path]
78
+ command += ["--output", body_path]
79
+
80
+ command << request.uri
81
+
82
+ command = command.map(&:to_s)
83
+ command
84
+ end
85
+
86
+ def cookie_jar
87
+ @cookie_jar ||= CookieJar.new(@request)
88
+ end
89
+ end
90
+ end
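`Curl#command` above builds an argument array and passes it to `Kernel.system` with a splat, so no shell is involved. A trimmed-down sketch of the same idea; the flags are real curl options taken from the code above, while `curl_argv`, its option hash and the example URL are assumptions made for this example:

```ruby
# Hypothetical, trimmed-down version of Curl#command above. The flags
# are real curl options; the method name and option hash are made up
# for this example.
def curl_argv(url, opts = {})
  argv = ["curl", "--silent", "--compressed"]
  argv += ["--max-time", opts.fetch(:timeout, 30)]
  argv += ["--retry", opts.fetch(:nretries, 2)]
  argv += ["--location", "--max-redirs", 3]
  argv += ["--data-binary", opts[:body]] if opts[:body]
  argv << url
  argv.map(&:to_s) # every argument is passed to curl as a string
end

argv = curl_argv("http://example.com/?q=a b&x=$HOME", body: "k=v")
p argv
# system(*argv) would hand each element to curl untouched: no shell
# runs, so spaces, "&" and "$" in the URL or body need no quoting.
```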
data/lib/chuckle/error.rb ADDED
@@ -0,0 +1,16 @@
1
+ module Chuckle
2
+ class Error < StandardError
3
+ CURL_TIMEOUT = 28
4
+
5
+ attr_accessor :response
6
+
7
+ def initialize(msg, response)
8
+ super(msg)
9
+ self.response = response
10
+ end
11
+
12
+ def timeout?
13
+ response.curl_exit_code == CURL_TIMEOUT
14
+ end
15
+ end
16
+ end
data/lib/chuckle/options.rb ADDED
@@ -0,0 +1,71 @@
1
+ module Chuckle
2
+ module Options
3
+ DEFAULT_OPTIONS = {
4
+ cache_dir: nil,
5
+ cache_errors: true,
6
+ cookies: false,
7
+ expires_in: :never,
8
+ nretries: 2,
9
+ rate_limit: 1,
10
+ timeout: 30,
11
+ user_agent: "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
12
+ verbose: false,
13
+ }
14
+
15
+ # cache root directory
16
+ def cache_dir
17
+ @cache_dir ||= begin
18
+ dir = options[:cache_dir]
19
+ dir ||= begin
20
+ if home = ENV["HOME"]
21
+ if File.exist?(home) && File.stat(home).writable?
22
+ "#{home}/.chuckle"
23
+ end
24
+ end
25
+ end
26
+ dir ||= "/tmp/chuckle"
27
+ dir
28
+ end
29
+ end
30
+
31
+ # should errors be cached?
32
+ def cache_errors?
33
+ options[:cache_errors]
34
+ end
35
+
36
+ # are cookies enabled?
37
+ def cookies?
38
+ options[:cookies]
39
+ end
40
+
41
+ # number of seconds to cache responses and cookies, or :never
42
+ def expires_in
43
+ options[:expires_in]
44
+ end
45
+
46
+ # number of retries to attempt
47
+ def nretries
48
+ options[:nretries]
49
+ end
50
+
51
+ # number of seconds between requests
52
+ def rate_limit
53
+ options[:rate_limit]
54
+ end
55
+
56
+ # timeout per retry
57
+ def timeout
58
+ options[:timeout]
59
+ end
60
+
61
+ # user agent
62
+ def user_agent
63
+ options[:user_agent]
64
+ end
65
+
66
+ # verbose output?
67
+ def verbose?
68
+ options[:verbose]
69
+ end
70
+ end
71
+ end
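The fallback order in `Options#cache_dir` above is: the explicit option, then `$HOME/.chuckle` if HOME exists and is writable, then `/tmp/chuckle`. A standalone sketch of that decision, with the environment passed in so the fallback can be exercised directly (`default_cache_dir` and the sample paths are hypothetical):

```ruby
# Hypothetical standalone version of Options#cache_dir above. `env`
# stands in for ENV so the fallback can be exercised directly.
def default_cache_dir(explicit = nil, env = ENV)
  return explicit if explicit
  home = env["HOME"]
  if home && File.exist?(home) && File.stat(home).writable?
    "#{home}/.chuckle"
  else
    "/tmp/chuckle"
  end
end

puts default_cache_dir("/var/cache/chuckle")          # explicit option wins
puts default_cache_dir(nil, {})                       # HOME unset
puts default_cache_dir(nil, "HOME" => "/no/such/dir") # HOME missing on disk
```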
data/lib/chuckle/request.rb ADDED
@@ -0,0 +1,34 @@
1
+ module Chuckle
2
+ class Request
3
+ attr_accessor :client, :uri, :body
4
+
5
+ def initialize(client, uri, body = nil)
6
+ self.client = client
7
+ self.uri = uri
8
+ self.body = body
9
+ end
10
+
11
+ def headers_path
12
+ @headers_path ||= begin
13
+ dir, base = File.dirname(body_path), File.basename(body_path)
14
+ "#{dir}/head/#{base}"
15
+ end
16
+ end
17
+
18
+ def body_path
19
+ @body_path ||= client.cache.body_path(self)
20
+ end
21
+
22
+ def to_s #:nodoc:
23
+ inspect
24
+ end
25
+
26
+ def inspect #:nodoc:
27
+ s = "#{self.class} #{uri}"
28
+ if body
29
+ s = "#{s} (#{body.length} bytes)"
30
+ end
31
+ s
32
+ end
33
+ end
34
+ end
data/lib/chuckle/response.rb ADDED
@@ -0,0 +1,45 @@
1
+ module Chuckle
2
+ class Response
3
+ attr_accessor :request, :curl_exit_code, :uri, :code
4
+
5
+ def initialize(request)
6
+ self.request = request
7
+ self.uri = request.uri
8
+ parse
9
+ end
10
+
11
+ def headers
12
+ @headers ||= File.read(request.headers_path)
13
+ end
14
+
15
+ def body
16
+ @body ||= File.read(request.body_path)
17
+ end
18
+
19
+ def to_s #:nodoc:
20
+ inspect
21
+ end
22
+
23
+ def inspect #:nodoc:
24
+ "#{self.class} #{uri} code=#{code}"
25
+ end
26
+
27
+ protected
28
+
29
+ def parse
30
+ self.curl_exit_code = Curl.exit_code_from_headers(headers)
31
+
32
+ locations = headers.scan(/^Location: ([^\r\n]+)/m).flatten
33
+ if !locations.empty?
34
+ location = locations.last
35
+ # some buggy servers do this. sigh.
36
+ location = location.gsub(" ", "%20")
37
+ self.uri = URI.parse(location)
38
+ end
39
+
40
+ codes = headers.scan(/^HTTP\/\d\.\d (\d+).*?\r\n\r\n/m).flatten
41
+ codes = codes.map(&:to_i)
42
+ self.code = codes.last
43
+ end
44
+ end
45
+ end
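When curl follows redirects, the file written by `--dump-header` holds one header block per hop, and `Response#parse` above keeps the last status code and the last `Location`. A minimal sketch using the same two regular expressions on a hand-written header dump (`final_code_and_location` is a hypothetical helper):

```ruby
# Hypothetical helper using the same two regexes as Response#parse
# above. `headers` is the text curl writes via --dump-header, which
# holds one header block per hop when redirects are followed.
def final_code_and_location(headers)
  codes = headers.scan(/^HTTP\/\d\.\d (\d+).*?\r\n\r\n/m).flatten.map(&:to_i)
  locations = headers.scan(/^Location: ([^\r\n]+)/m).flatten
  [codes.last, locations.last]
end

dump = "HTTP/1.1 302 FOUND\r\nLocation: http://one\r\n\r\n" \
       "HTTP/1.0 200 OK\r\n\r\n"
p final_code_and_location(dump) # => [200, "http://one"]
```

The status-line pattern expects a `major.minor` version, so a line such as `HTTP/2 200` would not match; the sketch shares that behavior with the original.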
data/lib/chuckle/util.rb ADDED
@@ -0,0 +1,33 @@
1
+ require "cgi"
2
+ require "digest/md5"
3
+ require "tempfile"
4
+ module Chuckle
5
+ module Util
6
+ extend self
7
+
8
+ def hash_to_query(hash)
9
+ q = hash.map do |key, value|
10
+ key = CGI.escape(key.to_s)
11
+ value = CGI.escape(value.to_s)
12
+ "#{key}=#{value}"
13
+ end
14
+ q.sort.join("&")
15
+ end
16
+
17
+ def md5(s)
18
+ Digest::MD5.hexdigest(s.to_s)
19
+ end
20
+
21
+ def rm_if_necessary(path)
22
+ File.unlink(path) if File.exist?(path)
23
+ end
24
+
25
+ def tmp_path
26
+ Tempfile.open("chuckle") do |f|
27
+ path = f.path
28
+ f.unlink
29
+ path
30
+ end
31
+ end
32
+ end
33
+ end
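`Util.hash_to_query` above sorts the encoded pairs before joining them. Since the POST body becomes part of the cache path, the sort means two hashes with the same entries in a different order produce the same cache file. A standalone copy to illustrate, using the same data as the test suite's `QUERY`:

```ruby
require "cgi"

# Standalone copy of Util.hash_to_query above, with the require it
# depends on.
def hash_to_query(hash)
  pairs = hash.map do |key, value|
    "#{CGI.escape(key.to_s)}=#{CGI.escape(value.to_s)}"
  end
  pairs.sort.join("&") # sorted so equal hashes give equal cache keys
end

a = hash_to_query("b" => "12", "a" => "34", "x y" => "56")
b = hash_to_query("x y" => "56", "a" => "34", "b" => "12")
puts a      # a=34&b=12&x+y=56
puts a == b # true
```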
data/lib/chuckle/version.rb ADDED
@@ -0,0 +1,4 @@
1
+ module Chuckle
2
+ # Gem version
3
+ VERSION = "1.0.0"
4
+ end
data/test/helper.rb ADDED
@@ -0,0 +1,93 @@
1
+ require "awesome_print"
2
+ require "chuckle"
3
+ require "json"
4
+ require "minitest/autorun"
5
+ require "minitest/pride"
6
+
7
+ module Helper
8
+ CACHE_DIR = "/tmp/_chuckle_tests"
9
+ URL = "http://chuckle"
10
+ QUERY = { "b" => "12", "a" => "34", "x y" => "56" }
11
+
12
+ #
13
+ # fake responses
14
+ #
15
+
16
+ HTTP_200 = <<-EOF.gsub(/(^|\n) +/, "\\1")
17
+ HTTP/1.1 200 OK
18
+
19
+ hello
20
+ EOF
21
+
22
+ HTTP_200_ALTERNATE = <<-EOF.gsub(/(^|\n) +/, "\\1")
23
+ HTTP/1.1 200 OK
24
+
25
+ alternate
26
+ EOF
27
+
28
+ HTTP_302 = <<-EOF.gsub(/(^|\n) +/, "\\1")
29
+ HTTP/1.1 302 FOUND
30
+ Location: http://one
31
+
32
+ HTTP/1.0 200 OK
33
+
34
+ hello
35
+ EOF
36
+
37
+ HTTP_302_2 = <<-EOF.gsub(/(^|\n) +/, "\\1")
38
+ HTTP/1.1 302 FOUND
39
+ Location: http://one
40
+
41
+ HTTP/1.1 302 FOUND
42
+ Location: http://two
43
+
44
+ HTTP/1.0 200 OK
45
+
46
+ hello
47
+ EOF
48
+
49
+ HTTP_404 = <<-EOF.gsub(/(^|\n) +/, "\\1")
50
+ HTTP/1.1 404 Not Found
51
+
52
+ EOF
53
+
54
+ def setup
55
+ # clear the cache before each test
56
+ FileUtils.rm_rf(CACHE_DIR)
57
+ end
58
+
59
+ # create a new client, with options
60
+ def client(options = {})
61
+ @client ||= begin
62
+ options = options.merge(cache_dir: CACHE_DIR)
63
+ Chuckle::Client.new(options)
64
+ end
65
+ end
66
+
67
+ # pretend to be curl by stubbing Kernel.system
68
+ def mcurl(response, exit_code = 0, &block)
69
+ # divide response into headers/body
70
+ sep = response.rindex("\n\n") + "\n\n".length
71
+ body = response[sep..-1]
72
+ headers = response[0, sep].gsub("\n", "\r\n")
73
+
74
+ # a lambda that pretends to be curl
75
+ fake_system = lambda do |*command|
76
+ tmp_headers = command[command.index("--dump-header") + 1]
77
+ tmp_body = command[command.index("--output") + 1]
78
+ IO.write(tmp_headers, headers)
79
+ IO.write(tmp_body, body)
80
+ `(exit #{exit_code})`
81
+ end
82
+
83
+ # stub out Kernel.system
84
+ Kernel.stub(:system, fake_system) { yield }
85
+ end
86
+
87
+ def assert_if_system(&block)
88
+ fake_system = lambda do |*command|
89
+ assert false, "system called with #{command.inspect}"
90
+ end
91
+ Kernel.stub(:system, fake_system) { yield }
92
+ end
93
+ end
data/test/test_cache.rb ADDED
@@ -0,0 +1,184 @@
1
+ require "helper"
2
+
3
+ class TestCache < Minitest::Test
4
+ include Helper
5
+
6
+ def after_setup
7
+ client(expires_in: 10)
8
+ end
9
+
10
+ # exists? and expired? (and clear)
11
+ def test_predicates
12
+ request = client.create_request(URL)
13
+ assert !client.cache.exists?(request), "uncached uri said it was cached"
14
+
15
+ mcurl(HTTP_200) do
16
+ client.run(request)
17
+ end
18
+ assert client.cache.exists?(request), "cache said it wasn't cached"
19
+ assert !client.cache.expired?(request), "cache said it was expired"
20
+
21
+ client.cache.clear(request)
22
+ assert !client.cache.exists?(request), "still cached after clear"
23
+ end
24
+
25
+ # cache expiration
26
+ def test_expiry
27
+ request = client.create_request(URL)
28
+ response = mcurl(HTTP_200) do
29
+ client.run(request)
30
+ end
31
+ assert_equal "hello\n", response.body
32
+
33
+ # make it look old
34
+ tm = Time.now - (client.expires_in + 9999)
35
+ path = request.body_path
36
+ File.utime(tm, tm, path)
37
+ assert client.cache.expired?(request), "#{path} was supposed to be expired"
38
+
39
+ # make sure we get the new body
40
+ response = mcurl(HTTP_200_ALTERNATE) do
41
+ client.run(request)
42
+ end
43
+ assert_equal "alternate\n", response.body
44
+ assert_equal 2, client.cache.misses
45
+ end
46
+
47
+ #
48
+ # requests and errors
49
+ #
50
+
51
+ def test_200
52
+ # cache miss
53
+ mcurl(HTTP_200) do
54
+ client.get(URL)
55
+ end
56
+
57
+ # cache hit
58
+ response = assert_if_system do
59
+ client.get(URL)
60
+ end
61
+ assert_equal 200, response.code
62
+ assert_equal URI.parse(URL), response.uri
63
+ assert_equal "hello\n", response.body
64
+ assert_equal 1, client.cache.hits
65
+ assert_equal 1, client.cache.misses
66
+ end
67
+
68
+ def test_404
69
+ # cache miss
70
+ begin
71
+ mcurl(HTTP_404) do
72
+ client.get(URL)
73
+ end
74
+ rescue Chuckle::Error => e
75
+ end
76
+
77
+ # cache hit
78
+ e = assert_raises Chuckle::Error do
79
+ assert_if_system do
80
+ client.get(URL)
81
+ end
82
+ end
83
+ assert_equal 404, e.response.code
84
+ assert_equal 1, client.cache.hits
85
+ assert_equal 1, client.cache.misses
86
+ end
87
+
88
+ def test_timeout
89
+ # cache miss
90
+ begin
91
+ mcurl(HTTP_404, Chuckle::Error::CURL_TIMEOUT) do
92
+ client.get(URL)
93
+ end
94
+ rescue Chuckle::Error => e
95
+ end
96
+
97
+ # cache hit
98
+ e = assert_raises Chuckle::Error do
99
+ assert_if_system do
100
+ client.get(URL)
101
+ end
102
+ end
103
+ assert e.timeout?, "exception didn't indicate timeout"
104
+ assert_equal 1, client.cache.hits
105
+ assert_equal 1, client.cache.misses
106
+ end
107
+
108
+ def test_post
109
+ # cache miss
110
+ mcurl(HTTP_200) do
111
+ client.post(URL, QUERY)
112
+ end
113
+
114
+ # cache hit
115
+ response = assert_if_system do
116
+ client.post(URL, QUERY)
117
+ end
118
+ assert_equal 200, response.code
119
+ assert_equal URI.parse(URL), response.uri
120
+ assert_equal "hello\n", response.body
121
+ assert_equal 1, client.cache.hits
122
+ assert_equal 1, client.cache.misses
123
+ end
124
+
125
+ def test_long_url
126
+ words = %w(the quick brown fox jumped over the lazy dog)
127
+ query = (1..100).map { words[rand(words.length)] }
128
+ query = query.each_slice(2).map { |i| i.join("=") }.join("&")
129
+
130
+ # cache miss
131
+ mcurl(HTTP_200) do
132
+ client.post(URL, query)
133
+ end
134
+
135
+ # cache hit
136
+ response = assert_if_system do
137
+ client.post(URL, query)
138
+ end
139
+
140
+ # make sure it turned into a checksum
141
+ assert response.request.body_path =~ /[a-f0-9]{32}$/, "body_path wasn't an md5 checksum"
142
+ assert_equal 200, response.code
143
+ assert_equal URI.parse(URL), response.uri
144
+ assert_equal "hello\n", response.body
145
+ assert_equal 1, client.cache.hits
146
+ assert_equal 1, client.cache.misses
147
+ end
148
+
149
+ def test_query
150
+ url = "#{URL}?abc=def"
151
+
152
+ # cache miss
153
+ mcurl(HTTP_200) do
154
+ client.get(url)
155
+ end
156
+
157
+ # cache hit
158
+ response = assert_if_system do
159
+ client.get(url)
160
+ end
161
+
162
+ # make sure the query made it into the cache path
163
+ assert response.request.body_path =~ /abc=def$/, "body_path didn't contain query"
164
+ assert_equal 200, response.code
165
+ assert_equal URI.parse(url), response.uri
166
+ assert_equal "hello\n", response.body
167
+ assert_equal 1, client.cache.hits
168
+ assert_equal 1, client.cache.misses
169
+ end
170
+
171
+ def test_nocache_errors
172
+ # turn off error caching
173
+ client = Chuckle::Client.new(cache_dir: CACHE_DIR, expires_in: 10, cache_errors: false)
174
+
175
+ # cache misses
176
+ 2.times do
177
+ begin
178
+ mcurl(HTTP_404) { client.get(URL) }
179
+ rescue Chuckle::Error => e
180
+ end
181
+ end
182
+ assert_equal 2, client.cache.misses
183
+ end
184
+ end
data/test/test_network.rb ADDED
@@ -0,0 +1,56 @@
1
+ require "helper"
2
+
3
+ # these actually use the network, and get skipped unless ENV["NETWORK"].
4
+ class TestNetwork < Minitest::Test
5
+ include Helper
6
+
7
+ def after_setup
8
+ skip if !ENV["NETWORK"]
9
+ end
10
+
11
+ def test_get
12
+ response = client.get("http://httpbin.org/get")
13
+ assert_equal 200, response.code
14
+ end
15
+
16
+ def test_timeout
17
+ e = assert_raises Chuckle::Error do
18
+ client(nretries: 0, timeout: 2).get("http://httpbin.org/delay/3")
19
+ end
20
+ assert e.timeout?, "exception didn't indicate timeout"
21
+ end
22
+
23
+ def test_post
24
+ response = client.post("http://httpbin.org/post", QUERY)
25
+ assert_equal QUERY, JSON.parse(response.body)["form"]
26
+ end
27
+
28
+ def test_cookies
29
+ cookies = { "a" => "b" }
30
+
31
+ client(cookies: true, expires_in: 60) # set options
32
+
33
+ request = client.create_request("http://httpbin.org/get")
34
+ cookie_jar = Chuckle::CookieJar.new(request).path
35
+
36
+ # make sure there are no cookies after the GET
37
+ client.run(request)
38
+ assert !File.exist?(cookie_jar), "cookie jar shouldn't exist yet"
39
+
40
+ # make sure there ARE cookies after a Set-Cookie
41
+ client.get("http://httpbin.org/cookies/set?#{Chuckle::Util.hash_to_query(cookies)}")
42
+ assert File.exist?(cookie_jar), "cookie jar SHOULD exist now"
43
+
44
+ # make sure cookies come back from the server
45
+ response = client.get("http://httpbin.org/cookies")
46
+ assert_equal cookies, JSON.parse(response.body)["cookies"]
47
+
48
+ # Finally, test cache expiry on cookie_jar. Note that this has to
49
+ # be an un-cached URL, otherwise the cookie_jar never gets
50
+ # checked!
51
+ tm = Time.now - (client.expires_in + 9999)
52
+ File.utime(tm, tm, cookie_jar)
53
+ client.get("http://httpbin.org/robots.txt")
54
+ assert !File.exist?(cookie_jar), "cookie jar should've expired"
55
+ end
56
+ end
data/test/test_requests.rb ADDED
@@ -0,0 +1,55 @@
1
+ require "helper"
2
+
3
+ class TestRequests < Minitest::Test
4
+ include Helper
5
+
6
+ def test_200
7
+ response = mcurl(HTTP_200) { client.get(URL) }
8
+ assert_equal 200, response.code
9
+ assert_equal URI.parse(URL), response.uri
10
+ assert_equal "hello\n", response.body
11
+ end
12
+
13
+ def test_302
14
+ response = mcurl(HTTP_302) { client.get(URL) }
15
+ assert_equal 200, response.code
16
+ assert_equal URI.parse("http://one"), response.uri
17
+ assert_equal "hello\n", response.body
18
+ end
19
+
20
+ def test_302_2
21
+ response = mcurl(HTTP_302_2) { client.get(URL) }
22
+ assert_equal 200, response.code
23
+ assert_equal URI.parse("http://two"), response.uri
24
+ assert_equal "hello\n", response.body
25
+ end
26
+
27
+ def test_404
28
+ e = assert_raises Chuckle::Error do
29
+ mcurl(HTTP_404) do
30
+ client.get(URL)
31
+ end
32
+ end
33
+ assert_equal 404, e.response.code
34
+ end
35
+
36
+ def test_timeout
37
+ e = assert_raises Chuckle::Error do
38
+ mcurl(HTTP_404, Chuckle::Error::CURL_TIMEOUT) do
39
+ client.get(URL)
40
+ end
41
+ end
42
+ assert e.timeout?, "exception didn't indicate timeout"
43
+ end
44
+
45
+ def test_post
46
+ # just test hash_to_query first
47
+ assert_equal "a=34&b=12&x+y=56", Chuckle::Util.hash_to_query(QUERY)
48
+
49
+ response = mcurl(HTTP_200) { client.post(URL, QUERY) }
50
+ assert_equal Chuckle::Util.hash_to_query(QUERY), response.request.body
51
+ assert_equal 200, response.code
52
+ assert_equal URI.parse(URL), response.uri
53
+ assert_equal "hello\n", response.body
54
+ end
55
+ end
metadata ADDED
@@ -0,0 +1,153 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: chuckle
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ prerelease:
6
+ platform: ruby
7
+ authors:
8
+ - Adam Doppelt
9
+ autorequire:
10
+ bindir: bin
11
+ cert_chain: []
12
+ date: 2013-05-17 00:00:00.000000000 Z
13
+ dependencies:
14
+ - !ruby/object:Gem::Dependency
15
+ name: awesome_print
16
+ requirement: !ruby/object:Gem::Requirement
17
+ none: false
18
+ requirements:
19
+ - - ! '>='
20
+ - !ruby/object:Gem::Version
21
+ version: '0'
22
+ type: :development
23
+ prerelease: false
24
+ version_requirements: !ruby/object:Gem::Requirement
25
+ none: false
26
+ requirements:
27
+ - - ! '>='
28
+ - !ruby/object:Gem::Version
29
+ version: '0'
30
+ - !ruby/object:Gem::Dependency
31
+ name: json
32
+ requirement: !ruby/object:Gem::Requirement
33
+ none: false
34
+ requirements:
35
+ - - ! '>='
36
+ - !ruby/object:Gem::Version
37
+ version: '0'
38
+ type: :development
39
+ prerelease: false
40
+ version_requirements: !ruby/object:Gem::Requirement
41
+ none: false
42
+ requirements:
43
+ - - ! '>='
44
+ - !ruby/object:Gem::Version
45
+ version: '0'
46
+ - !ruby/object:Gem::Dependency
47
+ name: minitest
48
+ requirement: !ruby/object:Gem::Requirement
49
+ none: false
50
+ requirements:
51
+ - - ~>
52
+ - !ruby/object:Gem::Version
53
+ version: '5.0'
54
+ type: :development
55
+ prerelease: false
56
+ version_requirements: !ruby/object:Gem::Requirement
57
+ none: false
58
+ requirements:
59
+ - - ~>
60
+ - !ruby/object:Gem::Version
61
+ version: '5.0'
62
+ - !ruby/object:Gem::Dependency
63
+ name: rake
64
+ requirement: !ruby/object:Gem::Requirement
65
+ none: false
66
+ requirements:
67
+ - - ! '>='
68
+ - !ruby/object:Gem::Version
69
+ version: '0'
70
+ type: :development
71
+ prerelease: false
72
+ version_requirements: !ruby/object:Gem::Requirement
73
+ none: false
74
+ requirements:
75
+ - - ! '>='
76
+ - !ruby/object:Gem::Version
77
+ version: '0'
78
+ - !ruby/object:Gem::Dependency
79
+ name: rdoc
80
+ requirement: !ruby/object:Gem::Requirement
81
+ none: false
82
+ requirements:
83
+ - - ! '>='
84
+ - !ruby/object:Gem::Version
85
+ version: '0'
86
+ type: :development
87
+ prerelease: false
88
+ version_requirements: !ruby/object:Gem::Requirement
89
+ none: false
90
+ requirements:
91
+ - - ! '>='
92
+ - !ruby/object:Gem::Version
93
+ version: '0'
94
+ description: A caching http client that wraps curl.
95
+ email:
96
+ - amd@gurge.com
97
+ executables: []
98
+ extensions: []
99
+ extra_rdoc_files: []
100
+ files:
101
+ - .gitignore
102
+ - Gemfile
103
+ - LICENSE
104
+ - README.md
105
+ - Rakefile
106
+ - chuckle.gemspec
107
+ - lib/chuckle.rb
108
+ - lib/chuckle/cache.rb
109
+ - lib/chuckle/client.rb
110
+ - lib/chuckle/cookie_jar.rb
111
+ - lib/chuckle/curl.rb
112
+ - lib/chuckle/error.rb
113
+ - lib/chuckle/options.rb
114
+ - lib/chuckle/request.rb
115
+ - lib/chuckle/response.rb
116
+ - lib/chuckle/util.rb
117
+ - lib/chuckle/version.rb
118
+ - test/helper.rb
119
+ - test/test_cache.rb
120
+ - test/test_network.rb
121
+ - test/test_requests.rb
122
+ homepage: http://github.com/gurgeous/chuckle
123
+ licenses: []
124
+ post_install_message:
125
+ rdoc_options: []
126
+ require_paths:
127
+ - lib
128
+ required_ruby_version: !ruby/object:Gem::Requirement
129
+ none: false
130
+ requirements:
131
+ - - ! '>='
132
+ - !ruby/object:Gem::Version
133
+ version: 1.9.0
134
+ required_rubygems_version: !ruby/object:Gem::Requirement
135
+ none: false
136
+ requirements:
137
+ - - ! '>='
138
+ - !ruby/object:Gem::Version
139
+ version: '0'
140
+ segments:
141
+ - 0
142
+ hash: 2600014295267267585
143
+ requirements: []
144
+ rubyforge_project: chuckle
145
+ rubygems_version: 1.8.24
146
+ signing_key:
147
+ specification_version: 3
148
+ summary: Chuckle - caching http client that wraps curl.
149
+ test_files:
150
+ - test/helper.rb
151
+ - test/test_cache.rb
152
+ - test/test_network.rb
153
+ - test/test_requests.rb