RubyGems - josh-rack-cache - Versions diffs - 0.5.1 - Mend

josh-rack-cache 0.5.1

Files changed (40)

data/CHANGES +167 -0
data/COPYING +18 -0
data/README +110 -0
data/Rakefile +137 -0
data/TODO +27 -0
data/doc/configuration.markdown +112 -0
data/doc/faq.markdown +141 -0
data/doc/index.markdown +121 -0
data/doc/layout.html.erb +34 -0
data/doc/license.markdown +24 -0
data/doc/rack-cache.css +362 -0
data/doc/server.ru +34 -0
data/doc/storage.markdown +164 -0
data/example/sinatra/app.rb +25 -0
data/example/sinatra/views/index.erb +44 -0
data/lib/rack/cache.rb +45 -0
data/lib/rack/cache/appengine.rb +52 -0
data/lib/rack/cache/cachecontrol.rb +193 -0
data/lib/rack/cache/context.rb +253 -0
data/lib/rack/cache/entitystore.rb +339 -0
data/lib/rack/cache/key.rb +52 -0
data/lib/rack/cache/metastore.rb +407 -0
data/lib/rack/cache/options.rb +150 -0
data/lib/rack/cache/request.rb +33 -0
data/lib/rack/cache/response.rb +267 -0
data/lib/rack/cache/storage.rb +62 -0
data/rack-cache.gemspec +70 -0
data/test/cache_test.rb +38 -0
data/test/cachecontrol_test.rb +139 -0
data/test/context_test.rb +774 -0
data/test/entitystore_test.rb +230 -0
data/test/key_test.rb +50 -0
data/test/metastore_test.rb +302 -0
data/test/options_test.rb +77 -0
data/test/pony.jpg +0 -0
data/test/request_test.rb +19 -0
data/test/response_test.rb +178 -0
data/test/spec_setup.rb +237 -0
data/test/storage_test.rb +94 -0
metadata +118 -0

data/doc/server.ru ADDED

@@ -0,0 +1,34 @@
+# Rackup config that serves the contents of Rack::Cache's
+# doc directory. The documentation is rebuilt on each request.
+# Rewrites URLs like conventional web server configs.
+class Rewriter < Struct.new(:app)
+  def call(env)
+    if env['PATH_INFO'] =~ /\/$/
+      env['PATH_INFO'] += 'index.html'
+    elsif env['PATH_INFO'] !~ /\.\w+$/
+      env['PATH_INFO'] += '.html'
+    end
+    app.call(env)
+  end
+end
+# Rebuilds documentation on each request.
+class DocBuilder < Struct.new(:app)
+  def call(env)
+    if env['PATH_INFO'] !~ /\.(css|js|gif|jpg|png|ico)$/
+      env['rack.errors'] << "*** rebuilding documentation (rake -s doc)\n"
+      system "rake -s doc"
+    end
+    app.call(env)
+  end
+end
+use Rack::CommonLogger
+use DocBuilder
+use Rewriter
+use Rack::Static, :root => File.dirname(__FILE__), :urls => ["/"]
+run(lambda{|env| [404,{},['<h1>Not Found</h1>']]})
+# vim: ft=ruby
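
The rewriting rules in `Rewriter` can be tried without a running server. The following is a minimal sketch that repeats the class from the hunk above and feeds it a stand-in app; `ECHO_APP` and `rewritten_path` are hypothetical names introduced here for illustration and are not part of the package.

```ruby
# Same rewriting rules as the Rewriter middleware in server.ru, exercised
# against a stand-in app that simply reports the path it was handed.
class Rewriter < Struct.new(:app)
  def call(env)
    if env['PATH_INFO'] =~ /\/$/
      env['PATH_INFO'] += 'index.html'   # directory-style URL
    elsif env['PATH_INFO'] !~ /\.\w+$/
      env['PATH_INFO'] += '.html'        # extensionless URL
    end
    app.call(env)
  end
end

# Hypothetical downstream app: echoes the rewritten PATH_INFO as the body.
ECHO_APP = lambda { |env| [200, {}, [env['PATH_INFO']]] }

# Hypothetical helper: runs one path through the middleware and returns
# the path the downstream app actually saw.
def rewritten_path(app, path)
  Rewriter.new(app).call('PATH_INFO' => path)[2].first
end

puts rewritten_path(ECHO_APP, '/')               # /index.html
puts rewritten_path(ECHO_APP, '/storage')        # /storage.html
puts rewritten_path(ECHO_APP, '/rack-cache.css') # /rack-cache.css
```

Paths that already carry an extension pass through untouched, which is why `DocBuilder` can separately decide to skip the rebuild for static assets.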

data/doc/storage.markdown ADDED

@@ -0,0 +1,164 @@
+Storage
+=======
+__Rack::Cache__ runs within each of your backend application processes and does not
+rely on a single intermediary process like most types of proxy cache
+implementations. Because of this, the storage subsystem affects not only
+where cache data is stored but also whether the cache is properly distributed
+between multiple backend processes. It is highly recommended that you read and
+understand the following before choosing a storage implementation.
+Storage Areas
+-------------
+__Rack::Cache__ stores cache entries in two separate configurable storage
+areas: a _MetaStore_ and an _EntityStore_.
+The _MetaStore_ keeps high level information about each cache entry, including
+the request/response headers and other status information. When a request is
+received, the core caching logic uses this meta information to determine whether
+a fresh cache entry exists that can satisfy the request.
+The _EntityStore_ is where the actual response body content is stored. When a
+response is entered into the cache, a SHA1 digest of the response body content
+is calculated and used as a key. The entries stored in the MetaStore reference
+their response bodies using this SHA1 key.
+Separating request/response meta-data from response content has a few important
+advantages:
+  * Different storage types can be used for meta and entity storage. For
+    example, it may be desirable to use memcached to store meta information
+    while using the filesystem for entity storage.
+  * Cache entry meta-data may be retrieved quickly without also retrieving
+    response bodies. This avoids significant overhead when the cache misses
+    or only requires validation.
+  * Multiple different responses may include the same exact response body. In
+    these cases, the actual body content is stored once and referenced from
+    each of the meta store entries.
+You should consider how the meta and entity stores differ when choosing a storage
+implementation. The MetaStore does not require nearly as much memory as the
+EntityStore and is accessed much more frequently. The EntityStore can grow quite
+large and raw performance is less of a concern. Using a memory based storage
+implementation (`heap` or `memcached`) for the MetaStore is strongly advised,
+while a disk based storage implementation (`file`) is often satisfactory for
+the EntityStore and uses much less memory.
+Storage Configuration
+---------------------
+The MetaStore and EntityStore used for a particular request are determined by
+inspecting the `rack-cache.metastore` and `rack-cache.entitystore` Rack env
+variables. The value of each variable is a URI that identifies the storage
+type and location (URI formats are documented in the following section).
+The `heap:/` storage is assumed if either storage type is not explicitly
+provided. This storage type has significant drawbacks for most types of
+deployments so explicit configuration is advised.
+The default metastore and entitystore values can be specified when the
+__Rack::Cache__ object is added to the Rack middleware pipeline as follows:
+    use Rack::Cache,
+      :metastore => 'file:/var/cache/rack/meta',
+      :entitystore => 'file:/var/cache/rack/body'
+Alternatively, the `rack-cache.metastore` and `rack-cache.entitystore`
+variables may be set in the Rack environment by an upstream component.
+Storage Implementations
+-----------------------
+__Rack::Cache__ includes meta and entity storage implementations backed by local
+process memory ("heap storage"), the file system ("disk storage"), and
+memcached. This section includes information on configuring __Rack::Cache__ to
+use a specific storage implementation as well as pros and cons of each.
+### Heap Storage
+Uses local process memory to store cached entries.
+    use Rack::Cache,
+      :metastore   => 'heap:/',
+      :entitystore => 'heap:/'
+The heap storage backend is simple, fast, and mostly useless. All cache
+information is stored in each backend application's local process memory (using
+a normal Hash, in fact), which means that data cached under one backend is
+invisible to all other backends. This leads to low cache hit rates and excessive
+memory use, the magnitude of which is a function of the number of backends in
+use. Further, the heap storage provides no mechanism for purging unused entries
+so memory use is guaranteed to exceed that available, given enough time and
+utilization.
+Use of heap storage is recommended only for testing purposes or for very
+simple/single-backend deployment scenarios where the number of resources served
+is small and well understood.
+### Disk Storage
+Stores cached entries on the filesystem.
+    use Rack::Cache,
+      :metastore   => 'file:/var/cache/rack/meta',
+      :entitystore => 'file:/var/cache/rack/body'
+The URI may specify an absolute, relative, or home-rooted path:
+  * `file:/storage/path` - absolute path to storage directory.
+  * `file:storage/path` - relative path to storage directory, rooted at the
+    process's current working directory (`Dir.pwd`).
+  * `file:~user/storage/path` - path to storage directory, rooted at the
+    specified user's home directory.
+  * `file:~/storage/path` - path to storage directory, rooted at the current
+    user's home directory.
+File system storage is simple, requires no special daemons or libraries, has a
+tiny memory footprint, and allows multiple backends to share a single cache; it
+is one of the slower storage implementations, however. Its use is recommended in
+cases where memory is limited or in environments where more complex storage
+backends (e.g., memcached) are not available. In many cases, it may be
+acceptable (and even optimal) to use file system storage for the entitystore and
+a more performant storage implementation (e.g., memcached) for the metastore.
+__NOTE:__ When both the metastore and entitystore are configured to use file
+system storage, they should be set to different paths to prevent any chance of
+collision.
+### Memcached Storage
+Stores cached entries in a remote [memcached](http://www.danga.com/memcached/)
+instance.
+    use Rack::Cache,
+      :metastore   => 'memcached://localhost:11211/meta',
+      :entitystore => 'memcached://localhost:11211/body'
+The URI must specify the host and port of a remote memcached daemon. The path
+portion is an optional (but recommended) namespace that is prepended to each
+cache key.
+The memcached storage backend requires either the `memcache-client` or
+`memcached` libraries. By default, the `memcache-client` library is used;
+require the `memcached` library explicitly to use it instead.
+    gem install memcache-client
+Memcached storage is reasonably fast and allows multiple backends to share a
+single cache. It is also the only storage implementation that allows the cache
+to reside somewhere other than the local machine. The memcached daemon stores
+all data in local process memory so using it for the entitystore can result in
+heavy memory usage. It is by far the best option for the metastore in
+deployments with multiple backend application processes since it allows the
+cache to be properly distributed and provides fast access to the
+meta-information required to perform cache logic. Memcached is considerably more
+complex than the other storage implementations, requiring a separate daemon
+process and extra libraries. Still, its use is recommended in all cases where
+you can get away with it.
+[e]: http://blog.evanweaver.com/files/doc/fauna/memcached/files/README.html
+[f]: http://blog.evanweaver.com/articles/2008/01/21/b-the-fastest-u-can-b-memcached/
+[l]: http://tangent.org/552/libmemcached.html
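
The split between meta and entity storage described in storage.markdown can be sketched with two toy in-memory classes. This is an illustration only: `ToyEntityStore` and `ToyMetaStore` are hypothetical names and do not reflect the real interfaces in `entitystore.rb` or `metastore.rb`. The only detail taken from the document is that bodies are keyed by a SHA1 digest of their content.

```ruby
require 'digest/sha1'

# Toy entity store: bodies are keyed by the SHA1 digest of their content,
# so identical bodies are stored only once.
class ToyEntityStore
  def initialize
    @bodies = {}
  end

  # Stores the body under its SHA1 digest and returns that key.
  def write(body)
    key = Digest::SHA1.hexdigest(body)
    @bodies[key] ||= body
    key
  end

  def read(key)
    @bodies[key]
  end

  def size
    @bodies.size
  end
end

# Toy meta store: keeps headers plus the digest that points at the body.
class ToyMetaStore
  def initialize(entity_store)
    @entity_store = entity_store
    @entries = {}
  end

  def store(url, headers, body)
    @entries[url] = { :headers => headers, :digest => @entity_store.write(body) }
  end

  # Headers can be inspected without touching the entity store.
  def headers(url)
    entry = @entries[url]
    entry && entry[:headers]
  end

  def body(url)
    entry = @entries[url]
    entry && @entity_store.read(entry[:digest])
  end
end

entities = ToyEntityStore.new
meta = ToyMetaStore.new(entities)
meta.store('/a', { 'Cache-Control' => 'max-age=60' }, 'hello')
meta.store('/b', { 'Cache-Control' => 'max-age=120' }, 'hello')

puts entities.size    # 1: both entries share a single stored body
puts meta.body('/b')  # hello
```

Because freshness checks only need `headers`, a miss or a validation request never has to load the body, which is the overhead saving the document describes.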

data/example/sinatra/app.rb ADDED

@@ -0,0 +1,25 @@
+require 'sinatra'
+require 'rack/cache'
+use Rack::Cache do
+  set :verbose, true
+  set :metastore,   'heap:/'
+  set :entitystore, 'heap:/'
+  on :receive do
+    pass! if request.url =~ /favicon/
+  end
+end
+before do
+  last_modified $updated_at ||= Time.now
+end
+get '/' do
+  erb :index
+end
+put '/' do
+  $updated_at = nil
+  redirect '/'
+end
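
The example app relies on `Last-Modified` validation: while `$updated_at` stays the same, a conditional request can be answered without a fresh body, and the `PUT` handler resets the timestamp to force a new response. The sketch below shows the comparison involved in general terms. `conditional_status` is a hypothetical helper, not Sinatra's or Rack::Cache's implementation.

```ruby
require 'time'

# Hypothetical helper: decides the status for a conditional GET, given the
# resource's Last-Modified time and the request's If-Modified-Since header.
def conditional_status(last_modified, if_modified_since)
  return 200 if if_modified_since.nil?
  since = Time.httpdate(if_modified_since)
  # HTTP dates have one-second resolution, so compare whole seconds.
  last_modified.to_i <= since.to_i ? 304 : 200
rescue ArgumentError
  200 # unparseable header: fall back to a full response
end

updated_at = Time.utc(2009, 1, 1, 12, 0, 0)
header     = updated_at.httpdate

puts conditional_status(updated_at, nil)         # 200
puts conditional_status(updated_at, header)      # 304
puts conditional_status(updated_at + 60, header) # 200
```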

data/example/sinatra/views/index.erb ADDED

@@ -0,0 +1,44 @@
+<html>
+  <head>
+    <title>Sample Rack::Cache Sinatra app</title>
+    <style type="text/css" media="screen">
+      body {
+        font-family: Georgia;
+        font-size: 24px;
+        text-align: center;
+      }
+      #headers {
+        font-size: 16px;
+      }
+      input {
+        font-size: 24px;
+        cursor: pointer;
+      }
+    </style>
+  </head>
+  <body>
+    <h1>Last updated at: <%= $updated_at.strftime('%l:%M:%S%P') %></h1>
+    <p>
+      <form action="/" method="post">
+        <input type="hidden" name="_method" value="PUT">
+        <input type="submit" value="Expire the cache.">
+      </form>
+    </p>
+    <div id="headers">
+      <h3>Headers:</h3>
+      <% response.headers.each do |key, value| %>
+        <p><%= key %>: <%= value %></p>
+      <% end %>
+      <h3>Params:</h3>
+      <% params.each do |key, value| %>
+        <p><%= key %>: <%= value || '(blank)' %></p>
+      <% end %>
+    </div>
+  </body>
+</html>

data/lib/rack/cache.rb ADDED

@@ -0,0 +1,45 @@
+require 'rack'
+# = HTTP Caching For Rack
+#
+# Rack::Cache is suitable as a quick, drop-in component to enable HTTP caching
+# for Rack-enabled applications that produce freshness (+Expires+, +Cache-Control+)
+# and/or validation (+Last-Modified+, +ETag+) information.
+#
+# * Standards-based (RFC 2616 compliance)
+# * Freshness/expiration based caching and validation
+# * Supports HTTP Vary
+# * Portable: 100% Ruby / works with any Rack-enabled framework
+# * Disk, memcached, and heap memory storage backends
+#
+# === Usage
+#
+# Create with default options:
+#   require 'rack/cache'
+#   Rack::Cache.new(app, :verbose => true, :entitystore => 'file:cache')
+#
+# Within a rackup file (or with Rack::Builder):
+#   require 'rack/cache'
+#   use Rack::Cache do
+#     set :verbose, true
+#     set :metastore, 'memcached://localhost:11211/meta'
+#     set :entitystore, 'file:/var/cache/rack'
+#   end
+#   run app
+module Rack::Cache
+  autoload :Request,      'rack/cache/request'
+  autoload :Response,     'rack/cache/response'
+  autoload :Context,      'rack/cache/context'
+  autoload :Storage,      'rack/cache/storage'
+  autoload :CacheControl, 'rack/cache/cachecontrol'
+  # Create a new Rack::Cache middleware component that fetches resources from
+  # the specified backend application. The +options+ Hash can be used to
+  # specify default configuration values (see attributes defined in
+  # Rack::Cache::Options for possible key/values). When a block is given, it
+  # is executed within the context of the newly created Rack::Cache::Context
+  # object.
+  def self.new(backend, options={}, &b)
+    Context.new(backend, options, &b)
+  end
+end

data/lib/rack/cache/appengine.rb ADDED

@@ -0,0 +1,52 @@
+require 'base64'
+module Rack::Cache::AppEngine
+  module MC
+    require 'java'
+    import com.google.appengine.api.memcache.Expiration;
+    import com.google.appengine.api.memcache.MemcacheService;
+    import com.google.appengine.api.memcache.MemcacheServiceFactory;
+    import com.google.appengine.api.memcache.Stats;
+    Service = MemcacheServiceFactory.getMemcacheService
+  end unless defined?(Rack::Cache::AppEngine::MC)
+  class MemCache
+    def initialize(options = {})
+      @cache = MC::Service
+      @cache.namespace = options[:namespace] if options[:namespace]
+    end
+    def contains?(key)
+      MC::Service.contains(key)
+    end
+    def get(key)
+      value = MC::Service.get(key)
+      Marshal.load(Base64.decode64(value)) if value
+    end
+    def put(key, value, ttl = nil)
+      expiration = ttl ? MC::Expiration.byDeltaSeconds(ttl) : nil
+      value = Base64.encode64(Marshal.dump(value)).gsub(/\n/, '')
+      MC::Service.put(key, value, expiration)
+    end
+    def namespace
+      MC::Service.getNamespace
+    end
+    def namespace=(value)
+      MC::Service.setNamespace(value.to_s)
+    end
+    def delete(key)
+      MC::Service.delete(key)
+    end
+  end
+end
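
The `put`/`get` pair in `MemCache` serializes Ruby objects with `Marshal`, wraps the bytes in Base64, and strips the line breaks that `Base64.encode64` inserts. A minimal sketch of that round trip follows, with a plain Hash standing in for the App Engine memcache service so that it runs outside JRuby. `ToyMemCache` and its `raw` method are hypothetical names used only for this illustration.

```ruby
require 'base64'

# Mirrors the encoding used by MemCache#put and MemCache#get. A Hash
# replaces the App Engine service (an assumption made for this sketch).
class ToyMemCache
  def initialize
    @store = {}
  end

  def put(key, value)
    # encode64 adds a newline every 60 characters; strip them so the
    # stored value is a single line.
    @store[key] = Base64.encode64(Marshal.dump(value)).gsub(/\n/, '')
  end

  def get(key)
    value = @store[key]
    # Marshal.load should only ever be given data this cache wrote itself.
    Marshal.load(Base64.decode64(value)) if value
  end

  # Exposes the encoded string, for inspection only.
  def raw(key)
    @store[key]
  end
end

cache = ToyMemCache.new
entry = { 'status' => 200, 'body' => 'x' * 200 }
cache.put('k', entry)

puts cache.get('k') == entry       # true
puts cache.raw('k').include?("\n") # false
```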

data/lib/rack/cache/cachecontrol.rb ADDED

@@ -0,0 +1,193 @@
+module Rack
+  module Cache
+    # Parses a Cache-Control header and exposes the directives as a Hash.
+    # Directives that do not have values are set to +true+.
+    class CacheControl < Hash
+      def initialize(value=nil)
+        parse(value)
+      end
+      # Indicates that the response MAY be cached by any cache, even if it
+      # would normally be non-cacheable or cacheable only within a non-
+      # shared cache.
+      #
+      # A response may be considered public without this directive if the
+      # private directive is not set and the request does not include an
+      # Authorization header.
+      def public?
+        self['public']
+      end
+      # Indicates that all or part of the response message is intended for
+      # a single user and MUST NOT be cached by a shared cache. This
+      # allows an origin server to state that the specified parts of the
+      # response are intended for only one user and are not a valid
+      # response for requests by other users. A private (non-shared) cache
+      # MAY cache the response.
+      #
+      # Note: This usage of the word private only controls where the
+      # response may be cached, and cannot ensure the privacy of the
+      # message content.
+      def private?
+        self['private']
+      end
+      # When set in a response, a cache MUST NOT use the response to satisfy a
+      # subsequent request without successful revalidation with the origin
+      # server. This allows an origin server to prevent caching even by caches
+      # that have been configured to return stale responses to client requests.
+      #
+      # Note that this does not necessarily imply that the response may not be
+      # stored by the cache, only that the cache cannot serve it without first
+      # making a conditional GET request with the origin server.
+      #
+      # When set in a request, the server MUST NOT use a cached copy for its
+      # response. This has quite different semantics compared to the no-cache
+      # directive on responses. When the client specifies no-cache, it causes
+      # an end-to-end reload, forcing each cache to update their cached copies.
+      def no_cache?
+        self['no-cache']
+      end
+      # Indicates that the response MUST NOT be stored under any circumstances.
+      #
+      # The purpose of the no-store directive is to prevent the
+      # inadvertent release or retention of sensitive information (for
+      # example, on backup tapes). The no-store directive applies to the
+      # entire message, and MAY be sent either in a response or in a
+      # request. If sent in a request, a cache MUST NOT store any part of
+      # either this request or any response to it. If sent in a response,
+      # a cache MUST NOT store any part of either this response or the
+      # request that elicited it. This directive applies to both non-
+      # shared and shared caches. "MUST NOT store" in this context means
+      # that the cache MUST NOT intentionally store the information in
+      # non-volatile storage, and MUST make a best-effort attempt to
+      # remove the information from volatile storage as promptly as
+      # possible after forwarding it.
+      #
+      # The purpose of this directive is to meet the stated requirements
+      # of certain users and service authors who are concerned about
+      # accidental releases of information via unanticipated accesses to
+      # cache data structures. While the use of this directive might
+      # improve privacy in some cases, we caution that it is NOT in any
+      # way a reliable or sufficient mechanism for ensuring privacy. In
+      # particular, malicious or compromised caches might not recognize or
+      # obey this directive, and communications networks might be
+      # vulnerable to eavesdropping.
+      def no_store?
+        self['no-store']
+      end
+      # The expiration time of an entity MAY be specified by the origin
+      # server using the Expires header (see section 14.21). Alternatively,
+      # it MAY be specified using the max-age directive in a response. When
+      # the max-age cache-control directive is present in a cached response,
+      # the response is stale if its current age is greater than the age
+      # value given (in seconds) at the time of a new request for that
+      # resource. The max-age directive on a response implies that the
+      # response is cacheable (i.e., "public") unless some other, more
+      # restrictive cache directive is also present.
+      #
+      # If a response includes both an Expires header and a max-age
+      # directive, the max-age directive overrides the Expires header, even
+      # if the Expires header is more restrictive. This rule allows an origin
+      # server to provide, for a given response, a longer expiration time to
+      # an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache. This might be
+      # useful if certain HTTP/1.0 caches improperly calculate ages or
+      # expiration times, perhaps due to desynchronized clocks.
+      #
+      # Many HTTP/1.0 cache implementations will treat an Expires value that
+      # is less than or equal to the response Date value as being equivalent
+      # to the Cache-Control response directive "no-cache". If an HTTP/1.1
+      # cache receives such a response, and the response does not include a
+      # Cache-Control header field, it SHOULD consider the response to be
+      # non-cacheable in order to retain compatibility with HTTP/1.0 servers.
+      #
+      # When the max-age directive is included in the request, it indicates
+      # that the client is willing to accept a response whose age is no
+      # greater than the specified time in seconds.
+      def max_age
+        self['max-age'].to_i  if key?('max-age')
+      end
+      # If a response includes an s-maxage directive, then for a shared
+      # cache (but not for a private cache), the maximum age specified by
+      # this directive overrides the maximum age specified by either the
+      # max-age directive or the Expires header. The s-maxage directive
+      # also implies the semantics of the proxy-revalidate directive, i.e.,
+      # that the shared cache must not use the entry after it becomes stale
+      # to respond to a subsequent request without first revalidating it with
+      # the origin server. The s-maxage directive is always ignored by a
+      # private cache.
+      def shared_max_age
+        self['s-maxage'].to_i  if key?('s-maxage')
+      end
+      alias_method :s_maxage, :shared_max_age
+      # Because a cache MAY be configured to ignore a server's specified
+      # expiration time, and because a client request MAY include a max-
+      # stale directive (which has a similar effect), the protocol also
+      # includes a mechanism for the origin server to require revalidation
+      # of a cache entry on any subsequent use. When the must-revalidate
+      # directive is present in a response received by a cache, that cache
+      # MUST NOT use the entry after it becomes stale to respond to a
+      # subsequent request without first revalidating it with the origin
+      # server. (I.e., the cache MUST do an end-to-end revalidation every
+      # time, if, based solely on the origin server's Expires or max-age
+      # value, the cached response is stale.)
+      #
+      # The must-revalidate directive is necessary to support reliable
+      # operation for certain protocol features. In all circumstances an
+      # HTTP/1.1 cache MUST obey the must-revalidate directive; in
+      # particular, if the cache cannot reach the origin server for any
+      # reason, it MUST generate a 504 (Gateway Timeout) response.
+      #
+      # Servers SHOULD send the must-revalidate directive if and only if
+      # failure to revalidate a request on the entity could result in
+      # incorrect operation, such as a silently unexecuted financial
+      # transaction. Recipients MUST NOT take any automated action that
+      # violates this directive, and MUST NOT automatically provide an
+      # unvalidated copy of the entity if revalidation fails.
+      def must_revalidate?
+        self['must-revalidate']
+      end
+      # The proxy-revalidate directive has the same meaning as the must-
+      # revalidate directive, except that it does not apply to non-shared
+      # user agent caches. It can be used on a response to an
+      # authenticated request to permit the user's cache to store and
+      # later return the response without needing to revalidate it (since
+      # it has already been authenticated once by that user), while still
+      # requiring proxies that service many users to revalidate each time
+      # (in order to make sure that each user has been authenticated).
+      # Note that such authenticated responses also need the public cache
+      # control directive in order to allow them to be cached at all.
+      def proxy_revalidate?
+        self['proxy-revalidate']
+      end
+      def to_s
+        bools, vals = [], []
+        each do |key,value|
+          if value == true
+            bools << key
+          elsif value
+            vals << "#{key}=#{value}"
+          end
+        end
+        (bools.sort + vals.sort).join(', ')
+      end
+    private
+      def parse(value)
+        return  if value.nil? || value.empty?
+        value.delete(' ').split(',').inject(self) do |hash,part|
+          next hash if part.empty?  # skip empty segments such as "a,,b"
+          name, arg = part.split('=', 2)
+          hash[name.downcase] = (arg || true) unless name.empty?
+          hash
+        end
+      end
+    end
+  end
+end
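
The parsing step and the precedence rules quoted in the comments above can be illustrated without loading the gem. The sketch below restates the parse logic as a plain function and adds a small helper for the lifetime a shared cache would apply. `parse_cache_control` and `shared_ttl` are hypothetical names, and skipping empty segments is a defensive choice made for this sketch.

```ruby
# Standalone restatement of the parse step: strip spaces, split on commas,
# lowercase directive names, and map valueless directives to true.
def parse_cache_control(value)
  directives = {}
  return directives if value.nil? || value.empty?
  value.delete(' ').split(',').each do |part|
    next if part.empty? # tolerate inputs such as "a,,b"
    name, arg = part.split('=', 2)
    directives[name.downcase] = (arg || true) unless name.empty?
  end
  directives
end

# Lifetime in seconds for a shared cache: s-maxage wins over max-age,
# which wins over a lifetime derived from Expires (passed in by the
# caller as expires_in, already converted to seconds).
def shared_ttl(directives, expires_in = nil)
  if directives.key?('s-maxage')
    directives['s-maxage'].to_i
  elsif directives.key?('max-age')
    directives['max-age'].to_i
  else
    expires_in
  end
end

directives = parse_cache_control('public, max-age=600, s-maxage=60')
puts directives['public']   # true
puts directives['max-age']  # 600
puts shared_ttl(directives) # 60
```

Directive values stay as strings after parsing and are converted with `to_i` only when read, the same approach the `max_age` and `shared_max_age` accessors take.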