rack-cache 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release: this version of rack-cache has been flagged as possibly problematic.
- data/CHANGES +27 -0
- data/COPYING +18 -0
- data/README +96 -0
- data/Rakefile +144 -0
- data/TODO +40 -0
- data/doc/configuration.markdown +224 -0
- data/doc/events.dot +27 -0
- data/doc/faq.markdown +133 -0
- data/doc/index.markdown +113 -0
- data/doc/layout.html.erb +33 -0
- data/doc/license.markdown +24 -0
- data/doc/rack-cache.css +362 -0
- data/doc/storage.markdown +162 -0
- data/lib/rack/cache.rb +51 -0
- data/lib/rack/cache/config.rb +65 -0
- data/lib/rack/cache/config/busters.rb +16 -0
- data/lib/rack/cache/config/default.rb +134 -0
- data/lib/rack/cache/config/no-cache.rb +13 -0
- data/lib/rack/cache/context.rb +95 -0
- data/lib/rack/cache/core.rb +271 -0
- data/lib/rack/cache/entitystore.rb +224 -0
- data/lib/rack/cache/headers.rb +237 -0
- data/lib/rack/cache/metastore.rb +309 -0
- data/lib/rack/cache/options.rb +119 -0
- data/lib/rack/cache/request.rb +37 -0
- data/lib/rack/cache/response.rb +76 -0
- data/lib/rack/cache/storage.rb +50 -0
- data/lib/rack/utils/environment_headers.rb +78 -0
- data/rack-cache.gemspec +74 -0
- data/test/cache_test.rb +35 -0
- data/test/config_test.rb +66 -0
- data/test/context_test.rb +465 -0
- data/test/core_test.rb +84 -0
- data/test/entitystore_test.rb +176 -0
- data/test/environment_headers_test.rb +71 -0
- data/test/headers_test.rb +215 -0
- data/test/logging_test.rb +45 -0
- data/test/metastore_test.rb +210 -0
- data/test/options_test.rb +64 -0
- data/test/pony.jpg +0 -0
- data/test/response_test.rb +37 -0
- data/test/spec_setup.rb +189 -0
- data/test/storage_test.rb +94 -0
- metadata +120 -0
data/doc/storage.markdown
ADDED
@@ -0,0 +1,162 @@
Storage
=======

__Rack::Cache__ runs within each of your backend application processes and does not
rely on a single intermediary process like most types of proxy cache
implementations. Because of this, the storage subsystem has implications on not
only where cache data is stored but whether the cache is properly distributed
between multiple backend processes. It is highly recommended that you read and
understand the following before choosing a storage implementation.

Storage Areas
-------------

__Rack::Cache__ stores cache entries in two separate configurable storage
areas: a _MetaStore_ and an _EntityStore_.

The _MetaStore_ keeps high level information about each cache entry, including
the request/response headers and other status information. When a request is
received, the core caching logic uses this meta information to determine whether
a fresh cache entry exists that can satisfy the request.

The _EntityStore_ is where the actual response body content is stored. When a
response is entered into the cache, a SHA1 digest of the response body content
is calculated and used as a key. The entries stored in the MetaStore reference
their response bodies using this SHA1 key.

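To make the keying scheme concrete, here is a minimal sketch of how a body maps
to an entity key; it is not the library's own code, and `entity_key_for` is a
made-up helper name:

    require 'digest/sha1'

    # Hypothetical helper: the entity store keys each body by the SHA1 hex
    # digest of its content, so identical bodies share a single stored copy.
    def entity_key_for(body)
      Digest::SHA1.hexdigest(body)
    end

    entity_key_for("<h1>Hello</h1>")   # => a 40-character hex key
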
Separating request/response meta-data from response content has a few important
advantages:

  * Different storage types can be used for meta and entity storage. For
    example, it may be desirable to use memcached to store meta information
    while using the filesystem for entity storage.

  * Cache entry meta-data may be retrieved quickly without also retrieving
    response bodies. This avoids significant overhead when the cache misses
    or only requires validation.

  * Multiple different responses may include the same exact response body. In
    these cases, the actual body content is stored once and referenced from
    each of the meta store entries.

You should consider how the meta and entity stores differ when choosing a storage
implementation. The MetaStore does not require nearly as much memory as the
EntityStore and is accessed much more frequently. The EntityStore can grow quite
large and raw performance is less of a concern. Using a memory based storage
implementation (`heap` or `memcached`) for the MetaStore is strongly advised,
while a disk based storage implementation (`file`) is often satisfactory for
the EntityStore and uses much less memory.

Storage Configuration
---------------------

The MetaStore and EntityStore used for a particular request are determined by
inspecting the `rack-cache.metastore` and `rack-cache.entitystore` Rack env
variables. The value of each variable is a URI that identifies the storage
type and location (URI formats are documented in the following section).

The `heap:/` storage is assumed if either storage type is not explicitly
provided. This storage type has significant drawbacks for most types of
deployments, so explicit configuration is advised.

The default metastore and entitystore values can be specified when the
__Rack::Cache__ object is added to the Rack middleware pipeline as follows:

    use Rack::Cache do
      set :metastore, 'file:/var/cache/rack/meta'
      set :entitystore, 'file:/var/cache/rack/body'
    end

Alternatively, the `rack-cache.metastore` and `rack-cache.entitystore`
variables may be set in the Rack environment by an upstream component.

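For example, an upstream middleware could choose the stores per request by
writing those two env keys before __Rack::Cache__ runs. This is a hedged
sketch; the `CacheStoreSelector` class and `MyApp` are invented for
illustration, and only the documented env keys are relied on:

    # Hypothetical upstream middleware that selects the stores per request.
    class CacheStoreSelector
      def initialize(app)
        @app = app
      end

      def call(env)
        # These are the two env keys Rack::Cache inspects.
        env['rack-cache.metastore']   = 'memcached://localhost:11211/meta'
        env['rack-cache.entitystore'] = 'file:/var/cache/rack/body'
        @app.call(env)
      end
    end

    use CacheStoreSelector   # must sit upstream of Rack::Cache
    use Rack::Cache
    run MyApp.new            # placeholder application
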
Storage Implementations
-----------------------

__Rack::Cache__ includes meta and entity storage implementations backed by local
process memory ("heap storage"), the file system ("disk storage"), and
memcached. This section includes information on configuring __Rack::Cache__ to
use a specific storage implementation as well as pros and cons of each.

### Heap Storage

Uses local process memory to store cached entries.

    set :metastore, 'heap:/'
    set :entitystore, 'heap:/'

The heap storage backend is simple, fast, and mostly useless. All cache
information is stored in each backend application's local process memory (using
a normal Hash, in fact), which means that data cached under one backend is
invisible to all other backends. This leads to low cache hit rates and excessive
memory use, the magnitude of which is a function of the number of backends in
use. Further, the heap storage provides no mechanism for purging unused entries,
so memory use is guaranteed to exceed that available, given enough time and
utilization.

Use of heap storage is recommended only for testing purposes or for very
simple/single-backend deployment scenarios where the number of resources served
is small and well understood.

### Disk Storage

Stores cached entries on the filesystem.

    set :metastore, 'file:/var/cache/rack/meta'
    set :entitystore, 'file:/var/cache/rack/body'

The URI may specify an absolute, relative, or home-rooted path:

  * `file:/storage/path` - absolute path to storage directory.
  * `file:storage/path` - relative path to storage directory, rooted at the
    process's current working directory (`Dir.pwd`).
  * `file:~user/storage/path` - path to storage directory, rooted at the
    specified user's home directory.
  * `file:~/storage/path` - path to storage directory, rooted at the current
    user's home directory.

File system storage is simple, requires no special daemons or libraries, has a
tiny memory footprint, and allows multiple backends to share a single cache; it
is one of the slower storage implementations, however. Its use is recommended in
cases where memory is limited or in environments where more complex storage
backends (e.g., memcached) are not available. In many cases, it may be
acceptable (and even optimal) to use file system storage for the entitystore and
a more performant storage implementation (e.g., memcached) for the metastore, as
in the sketch below.

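A minimal sketch of that mixed configuration, reusing the URI formats documented
in this file (host, port, and paths are placeholders):

    use Rack::Cache do
      set :metastore, 'memcached://localhost:11211/meta'
      set :entitystore, 'file:/var/cache/rack/body'
    end
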
__NOTE:__ When both the metastore and entitystore are configured to use file
system storage, they should be set to different paths to prevent any chance of
collision.

### Memcached Storage

Stores cached entries in a remote [memcached](http://www.danga.com/memcached/)
instance.

    set :metastore, 'memcached://localhost:11211/meta'
    set :entitystore, 'memcached://localhost:11211/body'

The URI must specify the host and port of a remote memcached daemon. The path
portion is an optional (but recommended) namespace that is prepended to each
cache key.

The memcached storage backend requires [Evan Weaver's memcached client library][e].
This is a [fast][f] client implementation built on the SWIG/[libmemcached][l] C
library. The library may be installed via RubyGems as follows:

    sudo gem install memcached --no-rdoc --no-ri

Memcached storage is reasonably fast and allows multiple backends to share a
single cache. It is also the only storage implementation that allows the cache
to reside somewhere other than the local machine. The memcached daemon stores
all data in local process memory, so using it for the entitystore can result in
heavy memory usage. It is by far the best option for the metastore in
deployments with multiple backend application processes, since it allows the
cache to be properly distributed and provides fast access to the
meta-information required to perform cache logic. Memcached is considerably more
complex than the other storage implementations, requiring a separate daemon
process and extra libraries. Still, its use is recommended in all cases where
you can get away with it.

[e]: http://blog.evanweaver.com/files/doc/fauna/memcached/files/README.html
[f]: http://blog.evanweaver.com/articles/2008/01/21/b-the-fastest-u-can-b-memcached/
[l]: http://tangent.org/552/libmemcached.html
data/lib/rack/cache.rb
ADDED
@@ -0,0 +1,51 @@
require 'fileutils'
require 'time'
require 'rack'

module Rack #:nodoc:
end

# = HTTP Caching For Rack
#
# Rack::Cache is suitable as a quick, drop-in component to enable HTTP caching
# for Rack-enabled applications that produce freshness (+Expires+, +Cache-Control+)
# and/or validation (+Last-Modified+, +ETag+) information.
#
# * Standards-based (RFC 2616 compliance)
# * Freshness/expiration based caching and validation
# * Supports HTTP Vary
# * Portable: 100% Ruby / works with any Rack-enabled framework
# * VCL-like configuration language for advanced caching policies
# * Disk, memcached, and heap memory storage backends
#
# === Usage
#
# Create with default options:
#   require 'rack/cache'
#   Rack::Cache.new(app, :verbose => true, :entitystore => 'file:cache')
#
# Within a rackup file (or with Rack::Builder):
#   require 'rack/cache'
#   use Rack::Cache do
#     set :verbose, true
#     set :metastore, 'memcached://localhost:11211/meta'
#     set :entitystore, 'file:/var/cache/rack'
#   end
#   run app
#
module Rack::Cache
  require 'rack/cache/request'
  require 'rack/cache/response'
  require 'rack/cache/context'
  require 'rack/cache/storage'

  # Create a new Rack::Cache middleware component that fetches resources from
  # the specified backend application. The +options+ Hash can be used to
  # specify default configuration values (see attributes defined in
  # Rack::Cache::Options for possible key/values). When a block is given, it
  # is executed within the context of the newly created Rack::Cache::Context
  # object.
  def self.new(backend, options={}, &b)
    Context.new(backend, options, &b)
  end
end
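Pulling the usage comments above together, a complete rackup file might look
like the following sketch (`MyApp` is a placeholder application, and the store
URIs are examples from the storage documentation):

    # config.ru -- illustrative only
    require 'rack/cache'

    use Rack::Cache do
      set :verbose,     true
      set :metastore,   'heap:/'
      set :entitystore, 'file:/var/cache/rack/body'
    end

    run MyApp.new
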
data/lib/rack/cache/config.rb
ADDED
@@ -0,0 +1,65 @@
require 'set'

module Rack::Cache
  # Provides cache configuration methods. This module is included in the cache
  # context object.

  module Config
    # Evaluate a block of configuration code within the scope of the receiver.
    def configure(&block)
      instance_eval(&block) if block_given?
    end

    # Import the configuration file specified. This has the same basic semantics
    # as Ruby's built-in +require+ statement but always evaluates the source
    # file within the scope of the receiver. The file may exist anywhere on the
    # $LOAD_PATH.
    def import(file)
      return false if imported_features.include?(file)
      path = add_file_extension(file, 'rb')
      if path = locate_file_on_load_path(path)
        source = File.read(path)
        imported_features.add(file)
        instance_eval source, path, 1
        true
      else
        raise LoadError, 'no such file to load -- %s' % [file]
      end
    end

    private
    # Load the default configuration and evaluate the block provided within
    # the scope of the receiver.
    def initialize_config(&block)
      import 'rack/cache/config/default'
      configure(&block)
    end

    # Set of files that have been imported.
    def imported_features
      @imported_features ||= Set.new
    end

    # Attempt to expand +file+ to a full path by possibly adding an .rb
    # extension and traversing the $LOAD_PATH looking for matches.
    def locate_file_on_load_path(file)
      if file[0,1] == '/'
        file if File.exist?(file)
      else
        $LOAD_PATH.
          map { |base| File.join(base, file) }.
          detect { |p| File.exist?(p) }
      end
    end

    # Add an extension to the filename provided if the file doesn't
    # already have an extension.
    def add_file_extension(file, extension='rb')
      if file =~ /\.\w+$/
        file
      else
        "#{file}.#{extension}"
      end
    end
  end
end
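Because `import` resolves names against the `$LOAD_PATH` and evaluates the file
in the configuration context, site-specific rules can live in their own files.
A hedged sketch ( `my_app/cache_rules` is a made-up path; the busters file is
the one included in this release and shown below):

    use Rack::Cache do
      import 'rack/cache/config/busters'   # shipped with rack-cache
      import 'my_app/cache_rules'          # hypothetical file on the $LOAD_PATH
    end
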
data/lib/rack/cache/config/busters.rb
ADDED
@@ -0,0 +1,16 @@
# Adds a very long max-age response header when the requested url
# looks like it includes a cache busting timestamp. Cache busting
# URLs look like this:
#   http://HOST/PATH?DIGITS
#
# DIGITS is typically the number of seconds since some epoch but
# this can theoretically be any set of digits. Example:
#   http://example.com/css/foo.css?7894387283
#
on :fetch do
  next if response.freshness_information?
  if request.url =~ /\?\d+$/
    trace 'adding huge max-age to response for cache busting URL'
    response.ttl = 100000000000000
  end
end
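To make the rule concrete, this is how the `/\?\d+$/` test behaves on a couple
of illustrative URLs (the second URL is invented for contrast):

    'http://example.com/css/foo.css?7894387283' =~ /\?\d+$/   # matches: query string is all digits
    'http://example.com/css/foo.css?v=20090101' =~ /\?\d+$/   # nil: digits do not directly follow the "?"
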
data/lib/rack/cache/config/default.rb
ADDED
@@ -0,0 +1,134 @@
# Called at the beginning of request processing, after the complete
# request has been fully received. Its purpose is to decide whether or
# not to serve the request and how to do it.
#
# The request should not be modified.
#
# Possible transitions from receive:
#
#   * pass! - pass the request to the backend and the response upstream,
#     bypassing all caching features.
#
#   * lookup! - attempt to locate the entry in the cache. Control will
#     pass to the +hit+, +miss+, or +fetch+ event based on the result of
#     the cache lookup.
#
#   * error! - return the error code specified, abandoning the request.
#
on :receive do
  pass! unless request.method? 'GET', 'HEAD'
  pass! if request.header? 'Cookie', 'Authorization', 'Expect'
  lookup!
end

# Called upon entering pass mode. The request is sent to the backend,
# and the backend's response is sent to the client, but is not entered
# into the cache. The event is triggered immediately after the response
# is received from the backend but before it has been sent upstream.
#
# Possible transitions from pass:
#
#   * finish! - deliver the response upstream.
#
#   * error! - return the error code specified, abandoning the request.
#
on :pass do
  finish!
end

# Called after a cache lookup when no matching entry is found in the
# cache. Its purpose is to decide whether or not to attempt to retrieve
# the response from the backend and in what manner.
#
# Possible transitions from miss:
#
#   * fetch! - retrieve the requested document from the backend with
#     caching features enabled.
#
#   * pass! - pass the request to the backend and the response upstream,
#     bypassing all caching features.
#
#   * error! - return the error code specified and abandon request.
#
# The default configuration transfers control to the fetch event.
on :miss do
  fetch!
end

# Called after a cache lookup when the requested document is found in
# the cache and is fresh.
#
# Possible transitions from hit:
#
#   * deliver! - transfer control to the deliver event, sending the cached
#     response upstream.
#
#   * pass! - abandon the cache entry and transfer to pass mode. The
#     original request is sent to the backend and the response sent
#     upstream, bypassing all caching features.
#
#   * error! - return the error code specified and abandon request.
#
on :hit do
  deliver!
end

# Called after a document is successfully retrieved from the backend
# application or after a cache entry is validated with the backend.
# During validation, the original request is used as a template for a
# conditional GET request with the backend. The +original_response+
# object contains the response as received from the backend and +entry+
# is set to the cached response that triggered validation.
#
# Possible transitions from fetch:
#
#   * store! - store the fetched response in the cache or, when
#     validating, update the cached response with validated results.
#
#   * deliver! - deliver the response upstream without entering it
#     into the cache.
#
#   * error! - return the error code specified and abandon request.
#
on :fetch do
  store! if response.cacheable?
  deliver!
end

# Called immediately before an entry is written to the underlying
# cache. The +entry+ object may be modified.
#
# Possible transitions from store:
#
#   * persist! - commit the object to cache and transfer control to
#     the deliver event.
#
#   * deliver! - transfer control to the deliver event without committing
#     the object to cache.
#
#   * error! - return the error code specified and abandon request.
#
on :store do
  entry.ttl = default_ttl if entry.ttl.nil?
  trace 'store backend response in cache (ttl: %ds)', entry.ttl
  persist!
end

# Called immediately before +response+ is delivered upstream. +response+
# may be modified at this point but the changes will not affect the
# cache since the entry has already been persisted.
#
#   * finish! - complete processing and send the response upstream.
#
#   * error! - return the error code specified and abandon request.
#
on :deliver do
  finish!
end

# Called when an error! transition is triggered. The +response+ has the
# error code, headers, and body that will be delivered to upstream and
# may be modified if needed.
on :error do
  finish!
end
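These defaults can be adjusted from a configuration block using the same event
language. The following is only a hedged sketch: the admin-path rule and the
five-minute TTL are invented for illustration, and it assumes handlers
registered here run alongside the defaults above:

    use Rack::Cache do
      on :receive do
        pass! if request.url =~ %r{/admin}   # hypothetical: never cache admin URLs
      end

      on :store do
        entry.ttl = 300 if entry.ttl.nil?    # hypothetical five-minute default
        persist!
      end
    end
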
data/lib/rack/cache/config/no-cache.rb
ADDED
@@ -0,0 +1,13 @@
# The default configuration ignores the `Cache-Control: no-cache` directive on
# requests. Per RFC 2616, the presence of the no-cache directive should cause
# intermediaries to process requests as if no cached version were available.
# However, this directive is most often targeted at shared proxy caches, not
# gateway caches, and so we've chosen to break with the spec in our default
# configuration.
#
# Import 'rack/cache/config/no-cache' to enable standards-based
# processing.

on :receive do
  pass! if request.header['Cache-Control'] =~ /no-cache/
end
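As the comment above notes, restoring the standards-based behaviour is a matter
of pulling this file in through the `import` mechanism described earlier:

    use Rack::Cache do
      import 'rack/cache/config/no-cache'
    end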