RubyGems - rtomayko-rack-cache - Versions diffs - 0.3.0 → 0.3.9 - Mend

rtomayko-rack-cache 0.3.0 → 0.3.9

This diff compares the contents of two publicly released package versions as they appear in their public registry. It is provided for informational purposes only.

Files changed (41)

data/CHANGES +41 -0
data/README +0 -1
data/TODO +14 -10
data/doc/configuration.markdown +7 -153
data/doc/index.markdown +1 -3
data/example/sinatra/app.rb +25 -0
data/example/sinatra/views/index.erb +44 -0
data/lib/rack/cache.rb +5 -11
data/lib/rack/cache/cachecontrol.rb +193 -0
data/lib/rack/cache/context.rb +188 -51
data/lib/rack/cache/entitystore.rb +10 -4
data/lib/rack/cache/key.rb +52 -0
data/lib/rack/cache/metastore.rb +52 -16
data/lib/rack/cache/options.rb +29 -13
data/lib/rack/cache/request.rb +11 -15
data/lib/rack/cache/response.rb +221 -30
data/lib/rack/cache/storage.rb +1 -2
data/rack-cache.gemspec +9 -14
data/test/cache_test.rb +4 -1
data/test/cachecontrol_test.rb +139 -0
data/test/context_test.rb +198 -169
data/test/entitystore_test.rb +12 -11
data/test/key_test.rb +50 -0
data/test/metastore_test.rb +57 -14
data/test/options_test.rb +11 -0
data/test/request_test.rb +19 -0
data/test/response_test.rb +164 -23
data/test/spec_setup.rb +6 -0
metadata +13 -19
data/lib/rack/cache/config.rb +0 -65
data/lib/rack/cache/config/busters.rb +0 -16
data/lib/rack/cache/config/default.rb +0 -133
data/lib/rack/cache/config/no-cache.rb +0 -13
data/lib/rack/cache/core.rb +0 -299
data/lib/rack/cache/headers.rb +0 -325
data/lib/rack/utils/environment_headers.rb +0 -78
data/test/config_test.rb +0 -66
data/test/core_test.rb +0 -84
data/test/environment_headers_test.rb +0 -69
data/test/headers_test.rb +0 -298
data/test/logging_test.rb +0 -45

data/CHANGES CHANGED

@@ -1,3 +1,44 @@
+## 0.4.0 / Unreleased
+  * Ruby 1.9.1 / Rack 1.0 compatible.
+  * Invalidate cache entries that match the request URL on non-GET/HEAD
+    requests. i.e., POST, PUT, DELETE cause matching cache entries to
+    be invalidated. The cache entry is validated with the backend using
+    a conditional GET the next time it's requested.
+  * Implement "Cache-Control: max-age=N" request directive by forcing
+    validation when the max-age provided exceeds the age of the cache
+    entry.
+  * Properly implement "Cache-Control: no-cache" request directive by
+    performing a full reload. RFC 2616 states that when "no-cache" is
+    present in the request, the cache MUST NOT serve a stored response even
+    after successful validation. This is slightly different from the
+    "no-cache" directive in responses, which indicates that the cache must
+    first validate its entry with the origin. Previously, we implemented
+    "no-cache" on requests by passing so no new cache entry would be stored
+    based on the response. Now we treat it as a forced miss and enter the
+    response into the cache if it's cacheable.
+  * Assume identical semantics for the "Pragma: no-cache" request header
+    as the "Cache-Control: no-cache" directive described above.
+  * Less crazy logging. When the verbose option is set, a single log entry
+    is written with a comma separated list of trace events. For example, if
+    the cache was stale but validated, the following log entry would be
+    written: "cache: stale, valid, store". When the verbose option is false,
+    no logging occurs.
+  * Added "X-Rack-Cache" response header with the same comma separated trace
+    value as described above. This gives some visibility into how the cache
+    processed the request.
+  * Add support for canonicalized cache keys, as well as custom cache key
+    generators, which are specified in the options as :cache_key as either
+    any object that has a call() or as a block. Cache key generators get
+    passed a request object and return a cache key string.
 ## 0.3.0 / December 2008
   * Add support for public and private cache control directives. Responses
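The last 0.4.0 entry above says a cache key generator is any object with a `call()` that receives a request and returns a key string. The following is a minimal sketch of that contract, not the library's own key code: `SketchRequest` and `CanonicalKey` are hypothetical names, and the request is modelled as a plain Struct with only the fields the sketch needs.

```ruby
require 'uri'

# Hypothetical stand-in for the request object handed to a key generator.
# Only the three fields used below are modelled (assumption for this sketch).
SketchRequest = Struct.new(:host, :path, :query_string)

# Hypothetical key generator: any object that responds to call(request)
# and returns a String fits the contract described in the changelog.
class CanonicalKey
  def call(request)
    # Sort the decoded query pairs so parameter order does not change the key.
    pairs = URI.decode_www_form(request.query_string.to_s).sort
    key = "#{request.host}#{request.path}"
    key += "?#{URI.encode_www_form(pairs)}" unless pairs.empty?
    key
  end
end

generator = CanonicalKey.new
first  = generator.call(SketchRequest.new('example.org', '/feed', 'b=2&a=1'))
second = generator.call(SketchRequest.new('example.org', '/feed', 'a=1&b=2'))
puts first            # prints: example.org/feed?a=1&b=2
puts first == second  # prints: true
```

Per the changelog entry, such an object would be supplied through the `:cache_key` option. Sorting the parameters means two URLs that differ only in query order share one cache entry.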

data/README CHANGED

@@ -12,7 +12,6 @@ validation (Last-Modified, ETag) information:
   * Cache-Control: public, private, max-age, s-maxage, must-revalidate,
     and proxy-revalidate.
   * Portable: 100% Ruby / works with any Rack-enabled framework
-  * Configuration language for advanced caching policies
   * Disk, memcached, and heap memory storage backends
 For more information about Rack::Cache features and usage, see:

data/TODO CHANGED

@@ -1,21 +1,25 @@
 ## 0.4
-  - liberal, conservative, sane caching configs
+  - Move breakers.rb configuration file into rack-contrib as a middleware
+    component.
+  - Add docs on using Rack::Cache with Rails 2.3 or link to one of the
+    existing tutorials on this.
   - Sample apps: Rack, Rails, Sinatra, Merb, etc.
-  - busters.rb and no-cache.rb doc and tests
-  - Canonicalized URL for cache key:
-    - sorts params by key, then value
-    - urlencodes /[^ A-Za-z0-9_.-]/ host, path, and param key/value
-  - Custom cache keys
-  - Cache invalidation on PUT, POST, DELETE.
-    - Invalidate at the request URI; or, anything that's "near" the request URI.
-    - Invalidate at the URI of the Location or Content-Location response header.
 ## Backlog
+  - Use Bacon instead of test/spec
+  - Work with both memcache and memcached gems (memcached hasn't built on MacOS
+    for some time now).
+  - Fast path pass processing. We do a lot more than necessary just to determine
+    that the response should be passed through untouched.
+  - Don't purge/remove cache entries when invalidating. The entries should be
+    marked as stale and be forced revalidated on the next request instead of
+    being removed entirely.
   - Add missing Expires header if we have a max-age.
-  - Purge/invalidate specific cache entries
   - Purge/invalidate everything
+  - Invalidate at the URI of the Location or Content-Location response header
+    on POST, PUT, or DELETE that results in a redirect.
   - Maximum size of cached entity
   - Last-Modified factor: requests that have a Last-Modified header but no Expires
     header have a TTL assigned based on the last modified age of the response:

data/doc/configuration.markdown CHANGED

@@ -1,51 +1,17 @@
-Configuration Language
-======================
+Configuration
+=============
 __Rack::Cache__ includes a configuration system that can be used to specify
 fairly sophisticated cache policy on a global or per-request basis.
-  - [Synopsis](#synopsis)
-  - [Setting Cache Options](#setopt)
-  - [Cache Option Reference](#options)
-  - [Configuration Machinery - Events and Transitions](#machinery)
-  - [Importing Configuration](#import)
-  - [Default Configuration Machinery](#default)
-  - [Notes](#notes)
-<a id='synopsis'></a>
-Synopsis
---------
-    use Rack::Cache do
-      # set cache related options
-      set :verbose, true
-      set :metastore,   'memcached://localhost:11211'
-      set :entitystore, 'file:/var/cache/rack/body'
-      # override events / transitions
-      on :receive do
-        pass! if request.url =~ %r|/dontcache/|
-        error! 402 if request.referrer =~ /digg.com/
-      end
-      on :miss do
-        trace 'missed: %s', request.url
-      end
-      # bring in other configuration machinery
-      import 'rack/cache/config/breakers'
-      import 'mycacheconfig'
-    end
 <a id='setopt'></a>
 Setting Cache Options
 ---------------------
-Cache options can be set when the __Rack::Cache__ object is created; or by using
-the `set` method within a configuration block; or by setting a
-`rack-cache.<option>` variable in __Rack__'s __Environment__.
+Cache options can be set when the __Rack::Cache__ object is created,
+or by setting a `rack-cache.<option>` variable in __Rack__'s
+__Environment__.
 When the __Rack::Cache__ object is instantiated:
@@ -54,14 +20,6 @@ When the __Rack::Cache__ object is instantiated:
       :metastore => 'memcached://localhost:11211/',
       :entitystore => 'file:/var/cache/rack'
-Using the `set` method within __Rack::Cache__'s configuration context:
-    use Rack::Cache do
-      set :verbose, true
-      set :metastore, 'memcached://localhost:11211/'
-      set :entitystore, 'file:/var/cache/rack'
-    end
 Using __Rack__'s __Environment__:
     env.merge!(
@@ -123,110 +81,6 @@ If any of these headers are present in the request, the response is considered
 private and will not be cached _unless_ the response is explicitly marked public
 (e.g., `Cache-Control: public`).
-<a id='machinery'></a>
-Configuration Machinery - Events and Transitions
-------------------------------------------------
-The configuration machinery is built around a series of interceptable events and
-transitions controlled by a simple configuration language. The following diagram
-shows each state (interceptable event) along with their possible transitions:
-<p class='center'>
-<img src='events.png' alt='Events and Transitions Diagram' />
-</p>
-Custom logic can be layered onto the `receive`, `hit`, `miss`, `fetch`, `store`,
-`deliver`, and `pass` events by passing a block to the `on` method:
-    on :fetch do
-      trace 'fetched %p from backend application', request.url
-    end
-Here, the `trace` method writes a message to the `rack.errors` stream when a
-response is fetched from the backend application. The `request` object is a
-[__Rack::Cache::Request__](./api/classes/Rack/Cache/Request) that can be
-inspected (and modified) to determine what action should be taken next.
-Event blocks are capable of performing more interesting operations:
-  * Transition to a different event or override default caching logic.
-  * Modify the request, response, cache entry, or Rack environment options.
-  * Set the `metastore` or `entitystore` options to select a different storage
-    mechanism / location dynamically.
-  * Collect statistics or log request/response/cache information.
-When an event is triggered, the blocks associated with the event are executed in
-reverse/FILO order (i.e., the block declared last runs first) until a
-_transitioning statement_ is encountered. Transitioning statements are suffixed
-with a bang character (e.g, `pass!`, `store!`, `error!`) and cause the current
-event to halt and the machine to transition to the subsequent event; control is
-not returned to the original event. The [default configuration](#default)
-includes documentation on available transitions for each event.
-The `next` statement can be used to exit an event block without transitioning
-to another event. Subsequent event blocks are executed until a transitioning
-statement is encountered:
-    on :fetch do
-      next if response.freshness_information?
-      if request.url =~ /\/feed$/
-        trace 'feed will expire in fifteen minutes'
-        response.ttl = 15 * 60
-      end
-    end
-<a id='import'></a>
-Importing Configuration
------------------------
-Since caching logic can be layered, it's possible to separate various bits of
-cache policy into files for organization and reuse.
-    use Rack::Cache do
-      import 'rack/cache/config/busters'
-      import 'mycacheconfig'
-      # more stuff here
-    end
-The `busters` and `mycacheconfig` configuration files are normal Ruby source
-files (i.e., they have a `.rb` extension) situated on the `$LOAD_PATH` - the
-`import` statement works like Ruby's `require` statement but the contents of the
-files are evaluated in the context of the configuration machinery, as if
-specified directly in the configuration block.
-The `rack/cache/config/busters.rb` file makes a good example. It hooks into the
-`fetch` event and adds an impractically long expiration lifetime to any response
-that includes a cache busting query string:
-<%= File.read('lib/rack/cache/config/busters.rb').gsub(/^/, '    ') %>
-<a id='default'></a>
-Default Configuration Machinery
--------------------------------
-The `rack/cache/config/default.rb` file is imported when the __Rack::Cache__
-object is instantiated and before any custom configuration code is executed.
-It's useful to understand this configuration because it drives the default
-transitioning logic.
-<%= File.read('lib/rack/cache/config/default.rb').gsub(/^/, '    ') %>
-<a id='notes'></a>
-Notes
------
-The configuration language was inspired by [Varnish][var]'s
-[VCL configuration language][vcl].
-[var]: http://varnish.projects.linpro.no/
-  "Varnish HTTP accelerator"
+### `cache_key`
-[vcl]: http://tomayko.com/man/vcl
-  "VCL(7) -- Varnish Configuration Language Manual Page"
+TODO: Document custom cache keys
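The revised "Setting Cache Options" text in this file says options can be set per request through a `rack-cache.<option>` variable in the Rack environment. As a rough illustration, a small middleware could write such a key before the cache sees the request. `VerboseCacheForDebug` is a hypothetical name, the `/debug` prefix is arbitrary, and the sketch assumes the middleware sits ahead of Rack::Cache in the stack. It uses plain Ruby only and does not load Rack or rack-cache.

```ruby
# Hypothetical middleware: enables the verbose option for requests under
# /debug by writing a rack-cache.<option> key into the Rack environment.
class VerboseCacheForDebug
  def initialize(app)
    @app = app
  end

  def call(env)
    # 'rack-cache.verbose' follows the rack-cache.<option> naming pattern.
    env['rack-cache.verbose'] = true if env['PATH_INFO'].to_s.start_with?('/debug')
    @app.call(env)
  end
end

# Stand-in downstream app that reports the option value it received.
echo = lambda { |env| [200, { 'Content-Type' => 'text/plain' }, [env['rack-cache.verbose'].inspect]] }
app  = VerboseCacheForDebug.new(echo)

puts app.call({ 'PATH_INFO' => '/debug/ping' })[2].first  # prints: true
puts app.call({ 'PATH_INFO' => '/' })[2].first            # prints: nil
```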

data/doc/index.markdown CHANGED

@@ -7,7 +7,6 @@ for [Rack][]-based applications that produce freshness (`Expires`,
   * Validation
   * Vary support
   * Portable: 100% Ruby / works with any [Rack][]-enabled framework.
-  * [Configuration language][config] for advanced caching policies.
   * Disk, memcached, and heap memory [storage backends][storage].
 Status
@@ -52,8 +51,7 @@ caching.
 More
 ----
-  * [Configuration Language Documentation][config] - how to customize cache
-    policy using the simple event-based configuration system.
+  * [Configuration and Options][config] - how to set cache options.
   * [Cache Storage Documentation][storage] - detailed information on the various
     storage implementations available in __Rack::Cache__ and how to choose the one

data/example/sinatra/app.rb ADDED

@@ -0,0 +1,25 @@
+require 'sinatra'
+require 'rack/cache'
+use Rack::Cache do
+  set :verbose, true
+  set :metastore,   'heap:/'
+  set :entitystore, 'heap:/'
+  on :receive do
+    pass! if request.url =~ /favicon/
+  end
+end
+before do
+  last_modified $updated_at ||= Time.now
+end
+get '/' do
+  erb :index
+end
+put '/' do
+  $updated_at = nil
+  redirect '/'
+end

data/example/sinatra/views/index.erb ADDED

@@ -0,0 +1,44 @@
+<html>
+  <head>
+    <title>Sample Rack::Cache Sinatra app</title>
+    <style type="text/css" media="screen">
+      body {
+        font-family: Georgia;
+        font-size: 24px;
+        text-align: center;
+      }
+      #headers {
+        font-size: 16px;
+      }
+      input {
+        font-size: 24px;
+        cursor: pointer;
+      }
+    </style>
+  </head>
+  <body>
+    <h1>Last updated at: <%= $updated_at.strftime('%l:%m:%S%P') %></h1>
+    <p>
+      <form action="/" method="post">
+        <input type="hidden" name="_method" value="PUT">
+        <input type="submit" value="Expire the cache.">
+      </form>
+    </p>
+    <div id="headers">
+      <h3>Headers:</h3>
+      <% response.headers.each do |key, value| %>
+        <p><%= key %>: <%= value %></p>
+      <% end %>
+      <h3>Params:</h3>
+      <% params.each do |key, value| %>
+        <p><%= key %>: <%= value || '(blank)' %></p>
+      <% end %>
+    </div>
+  </body>
+</html>

data/lib/rack/cache.rb CHANGED

@@ -1,10 +1,5 @@
-require 'fileutils'
-require 'time'
 require 'rack'
-module Rack #:nodoc:
-end
 # = HTTP Caching For Rack
 #
 # Rack::Cache is suitable as a quick, drop-in component to enable HTTP caching
@@ -15,7 +10,6 @@ end
 # * Freshness/expiration based caching and validation
 # * Supports HTTP Vary
 # * Portable: 100% Ruby / works with any Rack-enabled framework
-# * VCL-like configuration language for advanced caching policies
 # * Disk, memcached, and heap memory storage backends
 #
 # === Usage
@@ -32,12 +26,12 @@ end
 #     set :entitystore, 'file:/var/cache/rack'
 #   end
 #   run app
-#
 module Rack::Cache
-  require 'rack/cache/request'
-  require 'rack/cache/response'
-  require 'rack/cache/context'
-  require 'rack/cache/storage'
+  autoload :Request,      'rack/cache/request'
+  autoload :Response,     'rack/cache/response'
+  autoload :Context,      'rack/cache/context'
+  autoload :Storage,      'rack/cache/storage'
+  autoload :CacheControl, 'rack/cache/cachecontrol'
   # Create a new Rack::Cache middleware component that fetches resources from
   # the specified backend application. The +options+ Hash can be used to

data/lib/rack/cache/cachecontrol.rb ADDED

@@ -0,0 +1,193 @@
+module Rack
+  module Cache
+    # Parses a Cache-Control header and exposes the directives as a Hash.
+    # Directives that do not have values are set to +true+.
+    class CacheControl < Hash
+      def initialize(value=nil)
+        parse(value)
+      end
+      # Indicates that the response MAY be cached by any cache, even if it
+      # would normally be non-cacheable or cacheable only within a non-
+      # shared cache.
+      #
+      # A response may be considered public without this directive if the
+      # private directive is not set and the request does not include an
+      # Authorization header.
+      def public?
+        self['public']
+      end
+      # Indicates that all or part of the response message is intended for
+      # a single user and MUST NOT be cached by a shared cache. This
+      # allows an origin server to state that the specified parts of the
+      # response are intended for only one user and are not a valid
+      # response for requests by other users. A private (non-shared) cache
+      # MAY cache the response.
+      #
+      # Note: This usage of the word private only controls where the
+      # response may be cached, and cannot ensure the privacy of the
+      # message content.
+      def private?
+        self['private']
+      end
+      # When set in a response, a cache MUST NOT use the response to satisfy a
+      # subsequent request without successful revalidation with the origin
+      # server. This allows an origin server to prevent caching even by caches
+      # that have been configured to return stale responses to client requests.
+      #
+      # Note that this does not necessary imply that the response may not be
+      # stored by the cache, only that the cache cannot serve it without first
+      # making a conditional GET request with the origin server.
+      #
+      # When set in a request, the server MUST NOT use a cached copy for its
+      # response. This has quite different semantics compared to the no-cache
+      # directive on responses. When the client specifies no-cache, it causes
+      # an end-to-end reload, forcing each cache to update their cached copies.
+      def no_cache?
+        self['no-cache']
+      end
+      # Indicates that the response MUST NOT be stored under any circumstances.
+      #
+      # The purpose of the no-store directive is to prevent the
+      # inadvertent release or retention of sensitive information (for
+      # example, on backup tapes). The no-store directive applies to the
+      # entire message, and MAY be sent either in a response or in a
+      # request. If sent in a request, a cache MUST NOT store any part of
+      # either this request or any response to it. If sent in a response,
+      # a cache MUST NOT store any part of either this response or the
+      # request that elicited it. This directive applies to both non-
+      # shared and shared caches. "MUST NOT store" in this context means
+      # that the cache MUST NOT intentionally store the information in
+      # non-volatile storage, and MUST make a best-effort attempt to
+      # remove the information from volatile storage as promptly as
+      # possible after forwarding it.
+      #
+      # The purpose of this directive is to meet the stated requirements
+      # of certain users and service authors who are concerned about
+      # accidental releases of information via unanticipated accesses to
+      # cache data structures. While the use of this directive might
+      # improve privacy in some cases, we caution that it is NOT in any
+      # way a reliable or sufficient mechanism for ensuring privacy. In
+      # particular, malicious or compromised caches might not recognize or
+      # obey this directive, and communications networks might be
+      # vulnerable to eavesdropping.
+      def no_store?
+        self['no-store']
+      end
+      # The expiration time of an entity MAY be specified by the origin
+      # server using the Expires header (see section 14.21). Alternatively,
+      # it MAY be specified using the max-age directive in a response. When
+      # the max-age cache-control directive is present in a cached response,
+      # the response is stale if its current age is greater than the age
+      # value given (in seconds) at the time of a new request for that
+      # resource. The max-age directive on a response implies that the
+      # response is cacheable (i.e., "public") unless some other, more
+      # restrictive cache directive is also present.
+      #
+      # If a response includes both an Expires header and a max-age
+      # directive, the max-age directive overrides the Expires header, even
+      # if the Expires header is more restrictive. This rule allows an origin
+      # server to provide, for a given response, a longer expiration time to
+      # an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache. This might be
+      # useful if certain HTTP/1.0 caches improperly calculate ages or
+      # expiration times, perhaps due to desynchronized clocks.
+      #
+      # Many HTTP/1.0 cache implementations will treat an Expires value that
+      # is less than or equal to the response Date value as being equivalent
+      # to the Cache-Control response directive "no-cache". If an HTTP/1.1
+      # cache receives such a response, and the response does not include a
+      # Cache-Control header field, it SHOULD consider the response to be
+      # non-cacheable in order to retain compatibility with HTTP/1.0 servers.
+      #
+      # When the max-age directive is included in the request, it indicates
+      # that the client is willing to accept a response whose age is no
+      # greater than the specified time in seconds.
+      def max_age
+        self['max-age'].to_i  if key?('max-age')
+      end
+      # If a response includes an s-maxage directive, then for a shared
+      # cache (but not for a private cache), the maximum age specified by
+      # this directive overrides the maximum age specified by either the
+      # max-age directive or the Expires header. The s-maxage directive
+      # also implies the semantics of the proxy-revalidate directive. i.e.,
+      # that the shared cache must not use the entry after it becomes stale
+      # to respond to a subsequent request without first revalidating it with
+      # the origin server. The s-maxage directive is always ignored by a
+      # private cache.
+      def shared_max_age
+        self['s-maxage'].to_i  if key?('s-maxage')
+      end
+      alias_method :s_maxage, :shared_max_age
+      # Because a cache MAY be configured to ignore a server's specified
+      # expiration time, and because a client request MAY include a max-
+      # stale directive (which has a similar effect), the protocol also
+      # includes a mechanism for the origin server to require revalidation
+      # of a cache entry on any subsequent use. When the must-revalidate
+      # directive is present in a response received by a cache, that cache
+      # MUST NOT use the entry after it becomes stale to respond to a
+      # subsequent request without first revalidating it with the origin
+      # server. (I.e., the cache MUST do an end-to-end revalidation every
+      # time, if, based solely on the origin server's Expires or max-age
+      # value, the cached response is stale.)
+      #
+      # The must-revalidate directive is necessary to support reliable
+      # operation for certain protocol features. In all circumstances an
+      # HTTP/1.1 cache MUST obey the must-revalidate directive; in
+      # particular, if the cache cannot reach the origin server for any
+      # reason, it MUST generate a 504 (Gateway Timeout) response.
+      #
+      # Servers SHOULD send the must-revalidate directive if and only if
+      # failure to revalidate a request on the entity could result in
+      # incorrect operation, such as a silently unexecuted financial
+      # transaction. Recipients MUST NOT take any automated action that
+      # violates this directive, and MUST NOT automatically provide an
+      # unvalidated copy of the entity if revalidation fails.
+      def must_revalidate?
+        self['must-revalidate']
+      end
+      # The proxy-revalidate directive has the same meaning as the must-
+      # revalidate directive, except that it does not apply to non-shared
+      # user agent caches. It can be used on a response to an
+      # authenticated request to permit the user's cache to store and
+      # later return the response without needing to revalidate it (since
+      # it has already been authenticated once by that user), while still
+      # requiring proxies that service many users to revalidate each time
+      # (in order to make sure that each user has been authenticated).
+      # Note that such authenticated responses also need the public cache
+      # control directive in order to allow them to be cached at all.
+      def proxy_revalidate?
+        self['proxy-revalidate']
+      end
+      def to_s
+        bools, vals = [], []
+        each do |key,value|
+          if value == true
+            bools << key
+          elsif value
+            vals << "#{key}=#{value}"
+          end
+        end
+        (bools.sort + vals.sort).join(', ')
+      end
+    private
+      def parse(value)
+        return  if value.nil? || value.empty?
+        value.delete(' ').split(',').inject(self) do |hash,part|
+          name, value = part.split('=', 2)
+          hash[name.downcase] = (value || true) unless name.empty?
+          hash
+        end
+      end
+    end
+  end
+end
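To make the parsing and `to_s` rules in the new `CacheControl` class easier to follow, here is a standalone restatement that runs without loading rack-cache. The helper names `parse_cache_control` and `format_cache_control` are hypothetical. One deliberate difference from the diff is that this sketch skips empty segments (for example from a doubled comma), rather than reproducing the class's behaviour for them.

```ruby
# Standalone illustration of the parsing rule: directives without a value
# map to true, all others keep their String value. Names are lowercased.
def parse_cache_control(header)
  directives = {}
  header.to_s.delete(' ').split(',').each do |part|
    name, value = part.split('=', 2)
    next if name.nil? || name.empty?  # guard added in this sketch only
    directives[name.downcase] = value || true
  end
  directives
end

# Same ordering as to_s in the diff: sorted flags first, then sorted
# name=value pairs, joined with ", ".
def format_cache_control(directives)
  flags  = directives.select { |_, v| v == true }.keys.sort
  valued = directives.reject { |_, v| v == true || !v }.map { |k, v| "#{k}=#{v}" }.sort
  (flags + valued).join(', ')
end

cc = parse_cache_control('Public, max-age=600, s-maxage=60, no-cache')
puts cc['max-age'].to_i        # prints: 600
puts format_cache_control(cc)  # prints: no-cache, public, max-age=600, s-maxage=60
```

Note that values stay Strings after parsing, which is why the accessors in the diff (`max_age`, `shared_max_age`) call `to_i` on read.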