rack-cache 0.2.0
- data/CHANGES +27 -0
- data/COPYING +18 -0
- data/README +96 -0
- data/Rakefile +144 -0
- data/TODO +40 -0
- data/doc/configuration.markdown +224 -0
- data/doc/events.dot +27 -0
- data/doc/faq.markdown +133 -0
- data/doc/index.markdown +113 -0
- data/doc/layout.html.erb +33 -0
- data/doc/license.markdown +24 -0
- data/doc/rack-cache.css +362 -0
- data/doc/storage.markdown +162 -0
- data/lib/rack/cache.rb +51 -0
- data/lib/rack/cache/config.rb +65 -0
- data/lib/rack/cache/config/busters.rb +16 -0
- data/lib/rack/cache/config/default.rb +134 -0
- data/lib/rack/cache/config/no-cache.rb +13 -0
- data/lib/rack/cache/context.rb +95 -0
- data/lib/rack/cache/core.rb +271 -0
- data/lib/rack/cache/entitystore.rb +224 -0
- data/lib/rack/cache/headers.rb +237 -0
- data/lib/rack/cache/metastore.rb +309 -0
- data/lib/rack/cache/options.rb +119 -0
- data/lib/rack/cache/request.rb +37 -0
- data/lib/rack/cache/response.rb +76 -0
- data/lib/rack/cache/storage.rb +50 -0
- data/lib/rack/utils/environment_headers.rb +78 -0
- data/rack-cache.gemspec +74 -0
- data/test/cache_test.rb +35 -0
- data/test/config_test.rb +66 -0
- data/test/context_test.rb +465 -0
- data/test/core_test.rb +84 -0
- data/test/entitystore_test.rb +176 -0
- data/test/environment_headers_test.rb +71 -0
- data/test/headers_test.rb +215 -0
- data/test/logging_test.rb +45 -0
- data/test/metastore_test.rb +210 -0
- data/test/options_test.rb +64 -0
- data/test/pony.jpg +0 -0
- data/test/response_test.rb +37 -0
- data/test/spec_setup.rb +189 -0
- data/test/storage_test.rb +94 -0
- metadata +120 -0
data/doc/storage.markdown
ADDED
@@ -0,0 +1,162 @@

Storage
=======

__Rack::Cache__ runs within each of your backend application processes and does not
rely on a single intermediary process like most types of proxy cache
implementations. Because of this, the storage subsystem has implications not
only for where cache data is stored but also for whether the cache is properly
distributed across multiple backend processes. It is highly recommended that you
read and understand the following before choosing a storage implementation.

Storage Areas
-------------

__Rack::Cache__ stores cache entries in two separate configurable storage
areas: a _MetaStore_ and an _EntityStore_.

The _MetaStore_ keeps high-level information about each cache entry, including
the request/response headers and other status information. When a request is
received, the core caching logic uses this meta information to determine whether
a fresh cache entry exists that can satisfy the request.

The _EntityStore_ is where the actual response body content is stored. When a
response is entered into the cache, a SHA1 digest of the response body content
is calculated and used as a key. The entries stored in the MetaStore reference
their response bodies using this SHA1 key.
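The content-addressed keying scheme can be sketched in plain Ruby. This is a simplified illustration only; `TinyEntityStore` is a made-up class, not rack-cache's actual EntityStore API:

```ruby
require 'digest/sha1'

# Hypothetical in-memory entity store illustrating SHA1-keyed bodies.
class TinyEntityStore
  def initialize
    @bodies = {}
  end

  # Store a response body; returns the SHA1 key a MetaStore entry would record.
  def write(body)
    key = Digest::SHA1.hexdigest(body)
    @bodies[key] = body # identical bodies collapse to a single entry
    key
  end

  # Look a body back up by its SHA1 key.
  def read(key)
    @bodies[key]
  end
end

store = TinyEntityStore.new
key = store.write('Hello, world!')
# Writing the same body again yields the same key, so it is stored only once.
store.write('Hello, world!') == key  # => true
```

Because the key is derived purely from the body content, two different cached responses with identical bodies naturally share one entity entry.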

Separating request/response meta-data from response content has a few important
advantages:

  * Different storage types can be used for meta and entity storage. For
    example, it may be desirable to use memcached to store meta information
    while using the filesystem for entity storage.

  * Cache entry meta-data may be retrieved quickly without also retrieving
    response bodies. This avoids significant overhead when the cache misses
    or only requires validation.

  * Multiple different responses may include the same exact response body. In
    these cases, the actual body content is stored once and referenced from
    each of the meta store entries.

You should consider how the meta and entity stores differ when choosing a storage
implementation. The MetaStore does not require nearly as much memory as the
EntityStore and is accessed much more frequently. The EntityStore can grow quite
large, and raw performance is less of a concern. Using a memory-based storage
implementation (`heap` or `memcached`) for the MetaStore is strongly advised,
while a disk-based storage implementation (`file`) is often satisfactory for
the EntityStore and uses much less memory.
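Putting that recommendation together, a mixed configuration along these lines (using the same example URIs shown elsewhere in this document) is a reasonable starting point:

```ruby
# Fast, distributed meta storage; cheap disk-backed entity storage.
set :metastore,   'memcached://localhost:11211/meta'
set :entitystore, 'file:/var/cache/rack/body'
```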

Storage Configuration
---------------------

The MetaStore and EntityStore used for a particular request are determined by
inspecting the `rack-cache.metastore` and `rack-cache.entitystore` Rack env
variables. The value of these variables is a URI that identifies the storage
type and location (URI formats are documented in the following section).

The `heap:/` storage is assumed if either storage type is not explicitly
provided. This storage type has significant drawbacks for most types of
deployments, so explicit configuration is advised.

The default metastore and entitystore values can be specified when the
__Rack::Cache__ object is added to the Rack middleware pipeline as follows:

    use Rack::Cache do
      set :metastore, 'file:/var/cache/rack/meta'
      set :entitystore, 'file:/var/cache/rack/body'
    end

Alternatively, the `rack-cache.metastore` and `rack-cache.entitystore`
variables may be set in the Rack environment by an upstream component.
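As a sketch of that upstream approach, a small middleware could choose the stores before Rack::Cache runs. `StoreSelector` and its default URIs are hypothetical names for illustration; only the `rack-cache.*` env keys come from this document:

```ruby
# Hypothetical middleware that selects storage per request by writing the
# rack-cache.metastore / rack-cache.entitystore env variables before the
# request reaches Rack::Cache.
class StoreSelector
  def initialize(app)
    @app = app
  end

  def call(env)
    # ||= leaves any value set even further upstream untouched.
    env['rack-cache.metastore']   ||= 'heap:/'
    env['rack-cache.entitystore'] ||= 'file:/var/cache/rack/body'
    @app.call(env)
  end
end
```

It would sit above Rack::Cache in the pipeline (`use StoreSelector` before `use Rack::Cache`).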

Storage Implementations
-----------------------

__Rack::Cache__ includes meta and entity storage implementations backed by local
process memory ("heap storage"), the file system ("disk storage"), and
memcached. This section includes information on configuring __Rack::Cache__ to
use a specific storage implementation, as well as the pros and cons of each.

### Heap Storage

Uses local process memory to store cached entries.

    set :metastore, 'heap:/'
    set :entitystore, 'heap:/'

The heap storage backend is simple, fast, and mostly useless. All cache
information is stored in each backend application's local process memory (using
a normal Hash, in fact), which means that data cached under one backend is
invisible to all other backends. This leads to low cache hit rates and excessive
memory use, the magnitude of which is a function of the number of backends in
use. Further, heap storage provides no mechanism for purging unused entries,
so memory use is guaranteed to exceed that available, given enough time and
utilization.

Use of heap storage is recommended only for testing purposes or for very
simple/single-backend deployment scenarios where the number of resources served
is small and well understood.

### Disk Storage

Stores cached entries on the filesystem.

    set :metastore, 'file:/var/cache/rack/meta'
    set :entitystore, 'file:/var/cache/rack/body'

The URI may specify an absolute, relative, or home-rooted path:

  * `file:/storage/path` - absolute path to storage directory.
  * `file:storage/path` - relative path to storage directory, rooted at the
    process's current working directory (`Dir.pwd`).
  * `file:~user/storage/path` - path to storage directory, rooted at the
    specified user's home directory.
  * `file:~/storage/path` - path to storage directory, rooted at the current
    user's home directory.

File system storage is simple, requires no special daemons or libraries, has a
tiny memory footprint, and allows multiple backends to share a single cache; it
is one of the slower storage implementations, however. Its use is recommended in
cases where memory is limited or in environments where more complex storage
backends (e.g., memcached) are not available. In many cases, it may be
acceptable (and even optimal) to use file system storage for the entitystore and
a more performant storage implementation (e.g., memcached) for the metastore.

__NOTE:__ When both the metastore and entitystore are configured to use file
system storage, they should be set to different paths to prevent any chance of
collision.

### Memcached Storage

Stores cached entries in a remote [memcached](http://www.danga.com/memcached/)
instance.

    set :metastore, 'memcached://localhost:11211/meta'
    set :entitystore, 'memcached://localhost:11211/body'

The URI must specify the host and port of a remote memcached daemon. The path
portion is an optional (but recommended) namespace that is prepended to each
cache key.

The memcached storage backend requires [Evan Weaver's memcached client library][e].
This is a [fast][f] client implementation built on the SWIG/[libmemcached][l] C
library. The library may be installed as a gem as follows:

    sudo gem install memcached --no-rdoc --no-ri

Memcached storage is reasonably fast and allows multiple backends to share a
single cache. It is also the only storage implementation that allows the cache
to reside somewhere other than the local machine. The memcached daemon stores
all data in local process memory, so using it for the entitystore can result in
heavy memory usage. It is by far the best option for the metastore in
deployments with multiple backend application processes, since it allows the
cache to be properly distributed and provides fast access to the
meta-information required to perform cache logic. Memcached is considerably more
complex than the other storage implementations, requiring a separate daemon
process and extra libraries. Still, its use is recommended in all cases where
you can get away with it.

[e]: http://blog.evanweaver.com/files/doc/fauna/memcached/files/README.html
[f]: http://blog.evanweaver.com/articles/2008/01/21/b-the-fastest-u-can-b-memcached/
[l]: http://tangent.org/552/libmemcached.html
data/lib/rack/cache.rb
ADDED
@@ -0,0 +1,51 @@
require 'fileutils'
require 'time'
require 'rack'

module Rack #:nodoc:
end

# = HTTP Caching For Rack
#
# Rack::Cache is suitable as a quick, drop-in component to enable HTTP caching
# for Rack-enabled applications that produce freshness (+Expires+, +Cache-Control+)
# and/or validation (+Last-Modified+, +ETag+) information.
#
# * Standards-based (RFC 2616 compliance)
# * Freshness/expiration based caching and validation
# * Supports HTTP Vary
# * Portable: 100% Ruby / works with any Rack-enabled framework
# * VCL-like configuration language for advanced caching policies
# * Disk, memcached, and heap memory storage backends
#
# === Usage
#
# Create with default options:
#   require 'rack/cache'
#   Rack::Cache.new(app, :verbose => true, :entitystore => 'file:cache')
#
# Within a rackup file (or with Rack::Builder):
#   require 'rack/cache'
#   use Rack::Cache do
#     set :verbose, true
#     set :metastore, 'memcached://localhost:11211/meta'
#     set :entitystore, 'file:/var/cache/rack'
#   end
#   run app
#
module Rack::Cache
  require 'rack/cache/request'
  require 'rack/cache/response'
  require 'rack/cache/context'
  require 'rack/cache/storage'

  # Create a new Rack::Cache middleware component that fetches resources from
  # the specified backend application. The +options+ Hash can be used to
  # specify default configuration values (see attributes defined in
  # Rack::Cache::Options for possible key/values). When a block is given, it
  # is executed within the context of the newly created Rack::Cache::Context
  # object.
  def self.new(backend, options={}, &b)
    Context.new(backend, options, &b)
  end
end
data/lib/rack/cache/config.rb
ADDED
@@ -0,0 +1,65 @@

require 'set'

module Rack::Cache
  # Provides cache configuration methods. This module is included in the cache
  # context object.
  module Config
    # Evaluate a block of configuration code within the scope of the receiver.
    def configure(&block)
      instance_eval(&block) if block_given?
    end

    # Import the configuration file specified. This has the same basic semantics
    # as Ruby's built-in +require+ statement but always evaluates the source
    # file within the scope of the receiver. The file may exist anywhere on the
    # $LOAD_PATH.
    def import(file)
      return false if imported_features.include?(file)
      path = add_file_extension(file, 'rb')
      if path = locate_file_on_load_path(path)
        source = File.read(path)
        imported_features.add(file)
        instance_eval source, path, 1
        true
      else
        raise LoadError, 'no such file to load -- %s' % [file]
      end
    end

  private

    # Load the default configuration and evaluate the block provided within
    # the scope of the receiver.
    def initialize_config(&block)
      import 'rack/cache/config/default'
      configure(&block)
    end

    # Set of files that have been imported.
    def imported_features
      @imported_features ||= Set.new
    end

    # Attempt to expand +file+ to a full path by possibly adding an .rb
    # extension and traversing the $LOAD_PATH looking for matches.
    def locate_file_on_load_path(file)
      if file[0,1] == '/'
        file if File.exist?(file)
      else
        $LOAD_PATH.
          map { |base| File.join(base, file) }.
          detect { |p| File.exist?(p) }
      end
    end

    # Add an extension to the filename provided if the file doesn't
    # already have an extension.
    def add_file_extension(file, extension='rb')
      if file =~ /\.\w+$/
        file
      else
        "#{file}.#{extension}"
      end
    end
  end
end
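As a usage sketch of `import`, a site-specific policy file could be pulled in from anywhere on `$LOAD_PATH`. The file name here is hypothetical:

```ruby
use Rack::Cache do
  # Evaluates my/cache/policy.rb (found on $LOAD_PATH) in the context of this
  # cache object, at most once; the .rb extension is added automatically.
  import 'my/cache/policy'
end
```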
data/lib/rack/cache/config/busters.rb
ADDED
@@ -0,0 +1,16 @@

# Adds a very long max-age response header when the requested url
# looks like it includes a cache busting timestamp. Cache busting
# URLs look like this:
#   http://HOST/PATH?DIGITS
#
# DIGITS is typically the number of seconds since some epoch but
# this can theoretically be any set of digits. Example:
#   http://example.com/css/foo.css?7894387283
#
on :fetch do
  next if response.freshness_information?
  if request.url =~ /\?\d+$/
    trace 'adding huge max-age to response for cache busting URL'
    response.ttl = 100000000000000
  end
end
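The pattern `/\?\d+$/` only matches URLs whose entire query string is digits, which is what distinguishes a cache-busting URL from an ordinary parameterized one. A quick check in plain Ruby:

```ruby
BUSTER = /\?\d+$/  # same pattern as the :fetch hook above

urls = [
  'http://example.com/css/foo.css?7894387283',  # digits-only query: matches
  'http://example.com/css/foo.css?v=789',       # key=value query: no match
  'http://example.com/css/foo.css',             # no query at all: no match
]
urls.each { |u| puts "#{u} => #{!!(u =~ BUSTER)}" }
```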
data/lib/rack/cache/config/default.rb
ADDED
@@ -0,0 +1,134 @@

# Called at the beginning of request processing, after the complete
# request has been fully received. Its purpose is to decide whether or
# not to serve the request and how to do it.
#
# The request should not be modified.
#
# Possible transitions from receive:
#
# * pass! - pass the request to the backend and the response upstream,
#   bypassing all caching features.
#
# * lookup! - attempt to locate the entry in the cache. Control will
#   pass to the +hit+, +miss+, or +fetch+ event based on the result of
#   the cache lookup.
#
# * error! - return the error code specified, abandoning the request.
#
on :receive do
  pass! unless request.method? 'GET', 'HEAD'
  pass! if request.header? 'Cookie', 'Authorization', 'Expect'
  lookup!
end

# Called upon entering pass mode. The request is sent to the backend,
# and the backend's response is sent to the client, but is not entered
# into the cache. The event is triggered immediately after the response
# is received from the backend but before it has been sent upstream.
#
# Possible transitions from pass:
#
# * finish! - deliver the response upstream.
#
# * error! - return the error code specified, abandoning the request.
#
on :pass do
  finish!
end

# Called after a cache lookup when no matching entry is found in the
# cache. Its purpose is to decide whether or not to attempt to retrieve
# the response from the backend and in what manner.
#
# Possible transitions from miss:
#
# * fetch! - retrieve the requested document from the backend with
#   caching features enabled.
#
# * pass! - pass the request to the backend and the response upstream,
#   bypassing all caching features.
#
# * error! - return the error code specified and abandon the request.
#
# The default configuration transfers control to the fetch event.
on :miss do
  fetch!
end

# Called after a cache lookup when the requested document is found in
# the cache and is fresh.
#
# Possible transitions from hit:
#
# * deliver! - transfer control to the deliver event, sending the cached
#   response upstream.
#
# * pass! - abandon the cache entry and transfer to pass mode. The
#   original request is sent to the backend and the response sent
#   upstream, bypassing all caching features.
#
# * error! - return the error code specified and abandon the request.
#
on :hit do
  deliver!
end

# Called after a document is successfully retrieved from the backend
# application or after a cache entry is validated with the backend.
# During validation, the original request is used as a template for a
# conditional GET request with the backend. The +original_response+
# object contains the response as received from the backend and +entry+
# is set to the cached response that triggered validation.
#
# Possible transitions from fetch:
#
# * store! - store the fetched response in the cache or, when
#   validating, update the cached response with validated results.
#
# * deliver! - deliver the response upstream without entering it
#   into the cache.
#
# * error! - return the error code specified and abandon the request.
#
on :fetch do
  store! if response.cacheable?
  deliver!
end

# Called immediately before an entry is written to the underlying
# cache. The +entry+ object may be modified.
#
# Possible transitions from store:
#
# * persist! - commit the object to cache and transfer control to
#   the deliver event.
#
# * deliver! - transfer control to the deliver event without committing
#   the object to cache.
#
# * error! - return the error code specified and abandon the request.
#
on :store do
  entry.ttl = default_ttl if entry.ttl.nil?
  trace 'store backend response in cache (ttl: %ds)', entry.ttl
  persist!
end

# Called immediately before +response+ is delivered upstream. +response+
# may be modified at this point but the changes will not affect the
# cache since the entry has already been persisted.
#
# * finish! - complete processing and send the response upstream.
#
# * error! - return the error code specified and abandon the request.
#
on :deliver do
  finish!
end

# Called when an error! transition is triggered. The +response+ has the
# error code, headers, and body that will be delivered upstream and
# may be modified if needed.
on :error do
  finish!
end
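These defaults can be overridden from a configuration block, since the same event names are available there. For instance, a deployment might attach a fallback lifetime at fetch time; the 60-second policy below is hypothetical, while `on :fetch`, `response.ttl`, and `response.freshness_information?` appear elsewhere in this gem:

```ruby
use Rack::Cache do
  on :fetch do
    # Give responses that carry no freshness information a short,
    # conservative lifetime instead of relying solely on default_ttl.
    response.ttl = 60 unless response.freshness_information?
  end
end
```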
data/lib/rack/cache/config/no-cache.rb
ADDED
@@ -0,0 +1,13 @@

# The default configuration ignores the `Cache-Control: no-cache` directive on
# requests. Per RFC 2616, the presence of the no-cache directive should cause
# intermediaries to process requests as if no cached version were available.
# However, this directive is most often targeted at shared proxy caches, not
# gateway caches, and so we've chosen to break with the spec in our default
# configuration.
#
# Import 'rack/cache/config/no-cache' to enable standards-based
# processing.

on :receive do
  pass! if request.header['Cache-Control'] =~ /no-cache/
end