josh-rack-cache 0.5.1

@@ -0,0 +1,34 @@
1
+ # Rackup config that serves the contents of Rack::Cache's
2
+ # doc directory. The documentation is rebuilt on each request.
3
+
4
+ # Rewrites URLs like conventional web server configs.
5
+ class Rewriter < Struct.new(:app)
6
+   def call(env)
7
+     if env['PATH_INFO'] =~ /\/$/
8
+       env['PATH_INFO'] += 'index.html'
9
+     elsif env['PATH_INFO'] !~ /\.\w+$/
10
+       env['PATH_INFO'] += '.html'
11
+     end
12
+     app.call(env)
13
+   end
14
+ end
15
+
16
+ # Rebuilds documentation on each request.
17
+ class DocBuilder < Struct.new(:app)
18
+   def call(env)
19
+     if env['PATH_INFO'] !~ /\.(css|js|gif|jpg|png|ico)$/
20
+       env['rack.errors'] << "*** rebuilding documentation (rake -s doc)\n"
21
+       system "rake -s doc"
22
+     end
23
+     app.call(env)
24
+   end
25
+ end
26
+
27
+ use Rack::CommonLogger
28
+ use DocBuilder
29
+ use Rewriter
30
+ use Rack::Static, :root => File.dirname(__FILE__), :urls => ["/"]
31
+
32
+ run(lambda{|env| [404,{'Content-Type' => 'text/html'},['<h1>Not Found</h1>']]})
33
+
34
+ # vim: ft=ruby
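The Rewriter rules in the config above can be tried outside Rack with a small pure function. This is only a sketch: `rewrite_path` is a hypothetical helper that is not part of the config, and it copies the two regular expressions from `Rewriter#call` without mutating a Rack env.

```ruby
# Hypothetical helper mirroring Rewriter#call; not part of the rackup config.
def rewrite_path(path)
  if path =~ /\/$/
    # Directory-style URLs are served from their index page.
    path + 'index.html'
  elsif path !~ /\.\w+$/
    # Extensionless URLs are assumed to be HTML pages.
    path + '.html'
  else
    # Anything with an extension (css, png, ...) is left alone.
    path
  end
end

rewrite_path('/')                # => "/index.html"
rewrite_path('/storage')         # => "/storage.html"
rewrite_path('/rack-cache.css')  # => "/rack-cache.css"
```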
@@ -0,0 +1,164 @@
1
+ Storage
2
+ =======
3
+
4
+ __Rack::Cache__ runs within each of your backend application processes and does not
5
+ rely on a single intermediary process like most types of proxy cache
6
+ implementations. Because of this, the storage subsystem determines not
7
+ only where cache data is stored but also whether the cache is properly distributed
8
+ between multiple backend processes. It is highly recommended that you read and
9
+ understand the following before choosing a storage implementation.
10
+
11
+ Storage Areas
12
+ -------------
13
+
14
+ __Rack::Cache__ stores cache entries in two separate configurable storage
15
+ areas: a _MetaStore_ and an _EntityStore_.
16
+
17
+ The _MetaStore_ keeps high level information about each cache entry, including
18
+ the request/response headers and other status information. When a request is
19
+ received, the core caching logic uses this meta information to determine whether
20
+ a fresh cache entry exists that can satisfy the request.
21
+
22
+ The _EntityStore_ is where the actual response body content is stored. When a
23
+ response is entered into the cache, a SHA1 digest of the response body content
24
+ is calculated and used as a key. The entries stored in the MetaStore reference
25
+ their response bodies using this SHA1 key.
26
+
27
+ Separating request/response meta-data from response content has a few important
28
+ advantages:
29
+
30
+ * Different storage types can be used for meta and entity storage. For
31
+ example, it may be desirable to use memcached to store meta information
32
+ while using the filesystem for entity storage.
33
+
34
+ * Cache entry meta-data may be retrieved quickly without also retrieving
35
+ response bodies. This avoids significant overhead when the cache misses
36
+ or only requires validation.
37
+
38
+ * Multiple different responses may include the same exact response body. In
39
+ these cases, the actual body content is stored once and referenced from
40
+ each of the meta store entries.
41
+
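The relationship between the two stores can be sketched in a few lines of plain Ruby. Everything here is illustrative: `SketchEntityStore` and the `body_key` field are hypothetical names, and the in-memory Hash stands in for whatever storage backend is configured. The point is only that bodies are keyed by their SHA1 digest, so identical bodies are stored once.

```ruby
require 'digest/sha1'

# Hypothetical in-memory stand-in for an entity store; not Rack::Cache's class.
class SketchEntityStore
  def initialize
    @bodies = {}
  end

  # Stores the body under its SHA1 hex digest and returns that key.
  def write(body)
    key = Digest::SHA1.hexdigest(body)
    @bodies[key] ||= body
    key
  end

  def read(key)
    @bodies[key]
  end

  def size
    @bodies.size
  end
end

entities = SketchEntityStore.new
meta = {}

# Two different URLs with the same body share one entity store entry.
meta['/a'] = { 'status' => 200, 'body_key' => entities.write('hello') }
meta['/b'] = { 'status' => 200, 'body_key' => entities.write('hello') }
```

A lookup reads the small meta entry first and touches the entity store only when the body is actually needed.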
42
+ You should consider how the meta and entity stores differ when choosing a storage
43
+ implementation. The MetaStore does not require nearly as much memory as the
44
+ EntityStore and is accessed much more frequently. The EntityStore can grow quite
45
+ large and raw performance is less of a concern. Using a memory based storage
46
+ implementation (`heap` or `memcached`) for the MetaStore is strongly advised,
47
+ while a disk based storage implementation (`file`) is often satisfactory for
48
+ the EntityStore and uses much less memory.
49
+
50
+ Storage Configuration
51
+ ---------------------
52
+
53
+ The MetaStore and EntityStore used for a particular request are determined by
54
+ inspecting the `rack-cache.metastore` and `rack-cache.entitystore` Rack env
55
+ variables. The value of these variables is a URI that identifies the storage
56
+ type and location (URI formats are documented in the following section).
57
+
58
+ The `heap:/` storage is assumed if either storage type is not explicitly
59
+ provided. This storage type has significant drawbacks for most types of
60
+ deployments so explicit configuration is advised.
61
+
62
+ The default metastore and entitystore values can be specified when the
63
+ __Rack::Cache__ object is added to the Rack middleware pipeline as follows:
64
+
65
+     use Rack::Cache,
66
+       :metastore   => 'file:/var/cache/rack/meta',
67
+       :entitystore => 'file:/var/cache/rack/body'
68
+
69
+ Alternatively, the `rack-cache.metastore` and `rack-cache.entitystore`
70
+ variables may be set in the Rack environment by an upstream component.
71
+
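An upstream component that sets these variables could look roughly like the following. This is a sketch under assumptions: `CacheStoreSelector` is a hypothetical middleware name, and the storage URIs are placeholders. It only writes the two env keys named above before passing the request along.

```ruby
# Hypothetical upstream middleware; the class name and URIs are illustrative.
class CacheStoreSelector
  def initialize(app, metastore, entitystore)
    @app = app
    @metastore = metastore
    @entitystore = entitystore
  end

  def call(env)
    # Components further down the pipeline read these keys from the env.
    env['rack-cache.metastore']   = @metastore
    env['rack-cache.entitystore'] = @entitystore
    @app.call(env)
  end
end
```

In a rackup file it would be added with `use CacheStoreSelector, 'memcached://localhost:11211/meta', 'file:/var/cache/rack/body'` ahead of `use Rack::Cache`, so that it runs first.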
72
+ Storage Implementations
73
+ -----------------------
74
+
75
+ __Rack::Cache__ includes meta and entity storage implementations backed by local
76
+ process memory ("heap storage"), the file system ("disk storage"), and
77
+ memcached. This section includes information on configuring __Rack::Cache__ to
78
+ use a specific storage implementation as well as pros and cons of each.
79
+
80
+ ### Heap Storage
81
+
82
+ Uses local process memory to store cached entries.
83
+
84
+     use Rack::Cache,
85
+       :metastore   => 'heap:/',
86
+       :entitystore => 'heap:/'
87
+
88
+ The heap storage backend is simple, fast, and mostly useless. All cache
89
+ information is stored in each backend application's local process memory (using
90
+ a normal Hash, in fact), which means that data cached under one backend is
91
+ invisible to all other backends. This leads to low cache hit rates and excessive
92
+ memory use, the magnitude of which is a function of the number of backends in
93
+ use. Further, the heap storage provides no mechanism for purging unused entries
94
+ so memory use is guaranteed to exceed that available, given enough time and
95
+ utilization.
96
+
97
+ Use of heap storage is recommended only for testing purposes or for very
98
+ simple/single-backend deployment scenarios where the number of resources served
99
+ is small and well understood.
100
+
101
+ ### Disk Storage
102
+
103
+ Stores cached entries on the filesystem.
104
+
105
+     use Rack::Cache,
106
+       :metastore   => 'file:/var/cache/rack/meta',
107
+       :entitystore => 'file:/var/cache/rack/body'
108
+
109
+ The URI may specify an absolute, relative, or home-rooted path:
110
+
111
+ * `file:/storage/path` - absolute path to storage directory.
112
+ * `file:storage/path` - relative path to storage directory, rooted at the
113
+ process's current working directory (`Dir.pwd`).
114
+ * `file:~user/storage/path` - path to storage directory, rooted at the
115
+ specified user's home directory.
116
+ * `file:~/storage/path` - path to storage directory, rooted at the current
117
+ user's home directory.
118
+
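The path forms listed above behave much like Ruby's own `File.expand_path`. The sketch below assumes that equivalence and is not taken from the storage code: `resolve_storage_path` is a hypothetical helper that strips the `file:` scheme and expands the remainder.

```ruby
# Hypothetical helper: strips the "file:" scheme and expands what remains.
# Assumes File.expand_path semantics match the path forms listed above.
def resolve_storage_path(uri)
  path = uri.sub(/\Afile:/, '')
  File.expand_path(path)
end

resolve_storage_path('file:/storage/path')  # absolute path, returned as-is
resolve_storage_path('file:storage/path')   # rooted at Dir.pwd
# 'file:~/storage/path' and 'file:~user/storage/path' would expand to the
# matching home directory, provided that user exists on the machine.
```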
119
+ File system storage is simple, requires no special daemons or libraries, has a
120
+ tiny memory footprint, and allows multiple backends to share a single cache; it
121
+ is one of the slower storage implementations, however. Its use is recommended in
122
+ cases where memory is limited or in environments where more complex storage
123
+ backends (e.g., memcached) are not available. In many cases, it may be
124
+ acceptable (and even optimal) to use file system storage for the entitystore and
125
+ a more performant storage implementation (e.g., memcached) for the metastore.
126
+
127
+ __NOTE:__ When both the metastore and entitystore are configured to use file
128
+ system storage, they should be set to different paths to prevent any chance of
129
+ collision.
130
+
131
+ ### Memcached Storage
132
+
133
+ Stores cached entries in a remote [memcached](http://www.danga.com/memcached/)
134
+ instance.
135
+
136
+     use Rack::Cache,
137
+       :metastore   => 'memcached://localhost:11211/meta',
138
+       :entitystore => 'memcached://localhost:11211/body'
139
+
140
+ The URI must specify the host and port of a remote memcached daemon. The path
141
+ portion is an optional (but recommended) namespace that is prepended to each
142
+ cache key.
143
+
144
+ The memcached storage backend requires either the `memcache-client` or
145
+ `memcached` library. By default, the `memcache-client` library is used;
146
+ require the `memcached` library explicitly to use it instead.
147
+
148
+     gem install memcache-client
149
+
150
+ Memcached storage is reasonably fast and allows multiple backends to share a
151
+ single cache. It is also the only storage implementation that allows the cache
152
+ to reside somewhere other than the local machine. The memcached daemon stores
153
+ all data in local process memory so using it for the entitystore can result in
154
+ heavy memory usage. It is by far the best option for the metastore in
155
+ deployments with multiple backend application processes since it allows the
156
+ cache to be properly distributed and provides fast access to the
157
+ meta-information required to perform cache logic. Memcached is considerably more
158
+ complex than the other storage implementations, requiring a separate daemon
159
+ process and extra libraries. Still, its use is recommended in all cases where
160
+ you can get away with it.
161
+
162
+ [e]: http://blog.evanweaver.com/files/doc/fauna/memcached/files/README.html
163
+ [f]: http://blog.evanweaver.com/articles/2008/01/21/b-the-fastest-u-can-b-memcached/
164
+ [l]: http://tangent.org/552/libmemcached.html
@@ -0,0 +1,25 @@
1
+ require 'sinatra'
2
+ require 'rack/cache'
3
+
4
+ use Rack::Cache do
5
+   set :verbose, true
6
+   set :metastore, 'heap:/'
7
+   set :entitystore, 'heap:/'
8
+
9
+   on :receive do
10
+     pass! if request.url =~ /favicon/
11
+   end
12
+ end
13
+
14
+ before do
15
+   last_modified $updated_at ||= Time.now
16
+ end
17
+
18
+ get '/' do
19
+   erb :index
20
+ end
21
+
22
+ put '/' do
23
+   $updated_at = nil
24
+   redirect '/'
25
+ end
@@ -0,0 +1,44 @@
1
+ <html>
2
+ <head>
3
+ <title>Sample Rack::Cache Sinatra app</title>
4
+ <style type="text/css" media="screen">
5
+ body {
6
+ font-family: Georgia;
7
+ font-size: 24px;
8
+ text-align: center;
9
+ }
10
+
11
+ #headers {
12
+ font-size: 16px;
13
+ }
14
+
15
+ input {
16
+ font-size: 24px;
17
+ cursor: pointer;
18
+ }
19
+ </style>
20
+ </head>
21
+ <body>
22
+ <h1>Last updated at: <%= $updated_at.strftime('%l:%M:%S%P') %></h1>
23
+
24
+ <p>
25
+ <form action="/" method="post">
26
+ <input type="hidden" name="_method" value="PUT">
27
+ <input type="submit" value="Expire the cache.">
28
+ </form>
29
+ </p>
30
+
31
+ <div id="headers">
32
+ <h3>Headers:</h3>
33
+
34
+ <% response.headers.each do |key, value| %>
35
+ <p><%= key %>: <%= value %></p>
36
+ <% end %>
37
+
38
+ <h3>Params:</h3>
39
+ <% params.each do |key, value| %>
40
+ <p><%= key %>: <%= value || '(blank)' %></p>
41
+ <% end %>
42
+ </div>
43
+ </body>
44
+ </html>
@@ -0,0 +1,45 @@
1
+ require 'rack'
2
+
3
+ # = HTTP Caching For Rack
4
+ #
5
+ # Rack::Cache is suitable as a quick, drop-in component to enable HTTP caching
6
+ # for Rack-enabled applications that produce freshness (+Expires+, +Cache-Control+)
7
+ # and/or validation (+Last-Modified+, +ETag+) information.
8
+ #
9
+ # * Standards-based (RFC 2616 compliance)
10
+ # * Freshness/expiration based caching and validation
11
+ # * Supports HTTP Vary
12
+ # * Portable: 100% Ruby / works with any Rack-enabled framework
13
+ # * Disk, memcached, and heap memory storage backends
14
+ #
15
+ # === Usage
16
+ #
17
+ # Create with default options:
18
+ # require 'rack/cache'
19
+ # Rack::Cache.new(app, :verbose => true, :entitystore => 'file:cache')
20
+ #
21
+ # Within a rackup file (or with Rack::Builder):
22
+ # require 'rack/cache'
23
+ # use Rack::Cache do
24
+ # set :verbose, true
25
+ # set :metastore, 'memcached://localhost:11211/meta'
26
+ # set :entitystore, 'file:/var/cache/rack'
27
+ # end
28
+ # run app
29
+ module Rack::Cache
30
+ autoload :Request, 'rack/cache/request'
31
+ autoload :Response, 'rack/cache/response'
32
+ autoload :Context, 'rack/cache/context'
33
+ autoload :Storage, 'rack/cache/storage'
34
+ autoload :CacheControl, 'rack/cache/cachecontrol'
35
+
36
+ # Create a new Rack::Cache middleware component that fetches resources from
37
+ # the specified backend application. The +options+ Hash can be used to
38
+ # specify default configuration values (see attributes defined in
39
+ # Rack::Cache::Options for possible key/values). When a block is given, it
40
+ # is executed within the context of the newly created Rack::Cache::Context
41
+ # object.
42
+ def self.new(backend, options={}, &b)
43
+ Context.new(backend, options, &b)
44
+ end
45
+ end
@@ -0,0 +1,52 @@
1
+ require 'base64'
2
+
3
+ module Rack::Cache::AppEngine
4
+
5
+ module MC
6
+ require 'java'
7
+
8
+ import com.google.appengine.api.memcache.Expiration;
9
+ import com.google.appengine.api.memcache.MemcacheService;
10
+ import com.google.appengine.api.memcache.MemcacheServiceFactory;
11
+ import com.google.appengine.api.memcache.Stats;
12
+
13
+ Service = MemcacheServiceFactory.getMemcacheService
14
+ end unless defined?(Rack::Cache::AppEngine::MC)
15
+
16
+ class MemCache
17
+
18
+ def initialize(options = {})
19
+ @cache = MC::Service
20
+ @cache.namespace = options[:namespace] if options[:namespace]
21
+ end
22
+
23
+ def contains?(key)
24
+ MC::Service.contains(key)
25
+ end
26
+
27
+ def get(key)
28
+ value = MC::Service.get(key)
29
+ Marshal.load(Base64.decode64(value)) if value
30
+ end
31
+
32
+ def put(key, value, ttl = nil)
33
+ expiration = ttl ? MC::Expiration.byDeltaSeconds(ttl) : nil
34
+ value = Base64.encode64(Marshal.dump(value)).gsub(/\n/, '')
35
+ MC::Service.put(key, value, expiration)
36
+ end
37
+
38
+ def namespace
39
+ MC::Service.getNamespace
40
+ end
41
+
42
+ def namespace=(value)
43
+ MC::Service.setNamespace(value.to_s)
44
+ end
45
+
46
+ def delete(key)
47
+ MC::Service.delete(key)
48
+ end
49
+
50
+ end
51
+
52
+ end
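The value encoding used by `put` and `get` above can be exercised without the App Engine service. The following is a sketch: `encode_entry` and `decode_entry` are hypothetical helpers, and `Array#pack('m0')` is swapped in for `Base64.encode64(...).gsub(/\n/, '')` because it produces the same newline-free Base64 text without needing the `base64` library.

```ruby
# Hypothetical helpers mirroring the put/get encoding, minus the memcache service.

# Marshals a Ruby object and encodes it as a single-line Base64 string.
def encode_entry(value)
  [Marshal.dump(value)].pack('m0')
end

# Reverses encode_entry. Marshal.load should only see data this process wrote.
def decode_entry(encoded)
  Marshal.load(encoded.unpack1('m'))
end

entry = { 'status' => 200, 'headers' => { 'Cache-Control' => 'public, max-age=600' } }
encoded = encode_entry(entry)
decoded = decode_entry(encoded)
```

Encoding to text keeps the stored value a plain string, at the cost of roughly a third more bytes than the raw marshalled form.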
@@ -0,0 +1,193 @@
1
+ module Rack
2
+ module Cache
3
+
4
+ # Parses a Cache-Control header and exposes the directives as a Hash.
5
+ # Directives that do not have values are set to +true+.
6
+ class CacheControl < Hash
7
+ def initialize(value=nil)
8
+ parse(value)
9
+ end
10
+
11
+ # Indicates that the response MAY be cached by any cache, even if it
12
+ # would normally be non-cacheable or cacheable only within a non-
13
+ # shared cache.
14
+ #
15
+ # A response may be considered public without this directive if the
16
+ # private directive is not set and the request does not include an
17
+ # Authorization header.
18
+ def public?
19
+ self['public']
20
+ end
21
+
22
+ # Indicates that all or part of the response message is intended for
23
+ # a single user and MUST NOT be cached by a shared cache. This
24
+ # allows an origin server to state that the specified parts of the
25
+ # response are intended for only one user and are not a valid
26
+ # response for requests by other users. A private (non-shared) cache
27
+ # MAY cache the response.
28
+ #
29
+ # Note: This usage of the word private only controls where the
30
+ # response may be cached, and cannot ensure the privacy of the
31
+ # message content.
32
+ def private?
33
+ self['private']
34
+ end
35
+
36
+ # When set in a response, a cache MUST NOT use the response to satisfy a
37
+ # subsequent request without successful revalidation with the origin
38
+ # server. This allows an origin server to prevent caching even by caches
39
+ # that have been configured to return stale responses to client requests.
40
+ #
41
+ # Note that this does not necessarily imply that the response may not be
42
+ # stored by the cache, only that the cache cannot serve it without first
43
+ # making a conditional GET request with the origin server.
44
+ #
45
+ # When set in a request, the server MUST NOT use a cached copy for its
46
+ # response. This has quite different semantics compared to the no-cache
47
+ # directive on responses. When the client specifies no-cache, it causes
48
+ # an end-to-end reload, forcing each cache to update its cached copy.
49
+ def no_cache?
50
+ self['no-cache']
51
+ end
52
+
53
+ # Indicates that the response MUST NOT be stored under any circumstances.
54
+ #
55
+ # The purpose of the no-store directive is to prevent the
56
+ # inadvertent release or retention of sensitive information (for
57
+ # example, on backup tapes). The no-store directive applies to the
58
+ # entire message, and MAY be sent either in a response or in a
59
+ # request. If sent in a request, a cache MUST NOT store any part of
60
+ # either this request or any response to it. If sent in a response,
61
+ # a cache MUST NOT store any part of either this response or the
62
+ # request that elicited it. This directive applies to both non-
63
+ # shared and shared caches. "MUST NOT store" in this context means
64
+ # that the cache MUST NOT intentionally store the information in
65
+ # non-volatile storage, and MUST make a best-effort attempt to
66
+ # remove the information from volatile storage as promptly as
67
+ # possible after forwarding it.
68
+ #
69
+ # The purpose of this directive is to meet the stated requirements
70
+ # of certain users and service authors who are concerned about
71
+ # accidental releases of information via unanticipated accesses to
72
+ # cache data structures. While the use of this directive might
73
+ # improve privacy in some cases, we caution that it is NOT in any
74
+ # way a reliable or sufficient mechanism for ensuring privacy. In
75
+ # particular, malicious or compromised caches might not recognize or
76
+ # obey this directive, and communications networks might be
77
+ # vulnerable to eavesdropping.
78
+ def no_store?
79
+ self['no-store']
80
+ end
81
+
82
+ # The expiration time of an entity MAY be specified by the origin
83
+ # server using the Expires header (see section 14.21). Alternatively,
84
+ # it MAY be specified using the max-age directive in a response. When
85
+ # the max-age cache-control directive is present in a cached response,
86
+ # the response is stale if its current age is greater than the age
87
+ # value given (in seconds) at the time of a new request for that
88
+ # resource. The max-age directive on a response implies that the
89
+ # response is cacheable (i.e., "public") unless some other, more
90
+ # restrictive cache directive is also present.
91
+ #
92
+ # If a response includes both an Expires header and a max-age
93
+ # directive, the max-age directive overrides the Expires header, even
94
+ # if the Expires header is more restrictive. This rule allows an origin
95
+ # server to provide, for a given response, a longer expiration time to
96
+ # an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache. This might be
97
+ # useful if certain HTTP/1.0 caches improperly calculate ages or
98
+ # expiration times, perhaps due to desynchronized clocks.
99
+ #
100
+ # Many HTTP/1.0 cache implementations will treat an Expires value that
101
+ # is less than or equal to the response Date value as being equivalent
102
+ # to the Cache-Control response directive "no-cache". If an HTTP/1.1
103
+ # cache receives such a response, and the response does not include a
104
+ # Cache-Control header field, it SHOULD consider the response to be
105
+ # non-cacheable in order to retain compatibility with HTTP/1.0 servers.
106
+ #
107
+ # When the max-age directive is included in the request, it indicates
108
+ # that the client is willing to accept a response whose age is no
109
+ # greater than the specified time in seconds.
110
+ def max_age
111
+ self['max-age'].to_i if key?('max-age')
112
+ end
113
+
114
+ # If a response includes an s-maxage directive, then for a shared
115
+ # cache (but not for a private cache), the maximum age specified by
116
+ # this directive overrides the maximum age specified by either the
117
+ # max-age directive or the Expires header. The s-maxage directive
118
+ # also implies the semantics of the proxy-revalidate directive, i.e.,
119
+ # that the shared cache must not use the entry after it becomes stale
120
+ # to respond to a subsequent request without first revalidating it with
121
+ # the origin server. The s-maxage directive is always ignored by a
122
+ # private cache.
123
+ def shared_max_age
124
+ self['s-maxage'].to_i if key?('s-maxage')
125
+ end
126
+ alias_method :s_maxage, :shared_max_age
127
+
128
+ # Because a cache MAY be configured to ignore a server's specified
129
+ # expiration time, and because a client request MAY include a max-
130
+ # stale directive (which has a similar effect), the protocol also
131
+ # includes a mechanism for the origin server to require revalidation
132
+ # of a cache entry on any subsequent use. When the must-revalidate
133
+ # directive is present in a response received by a cache, that cache
134
+ # MUST NOT use the entry after it becomes stale to respond to a
135
+ # subsequent request without first revalidating it with the origin
136
+ # server. (I.e., the cache MUST do an end-to-end revalidation every
137
+ # time, if, based solely on the origin server's Expires or max-age
138
+ # value, the cached response is stale.)
139
+ #
140
+ # The must-revalidate directive is necessary to support reliable
141
+ # operation for certain protocol features. In all circumstances an
142
+ # HTTP/1.1 cache MUST obey the must-revalidate directive; in
143
+ # particular, if the cache cannot reach the origin server for any
144
+ # reason, it MUST generate a 504 (Gateway Timeout) response.
145
+ #
146
+ # Servers SHOULD send the must-revalidate directive if and only if
147
+ # failure to revalidate a request on the entity could result in
148
+ # incorrect operation, such as a silently unexecuted financial
149
+ # transaction. Recipients MUST NOT take any automated action that
150
+ # violates this directive, and MUST NOT automatically provide an
151
+ # unvalidated copy of the entity if revalidation fails.
152
+ def must_revalidate?
153
+ self['must-revalidate']
154
+ end
155
+
156
+ # The proxy-revalidate directive has the same meaning as the must-
157
+ # revalidate directive, except that it does not apply to non-shared
158
+ # user agent caches. It can be used on a response to an
159
+ # authenticated request to permit the user's cache to store and
160
+ # later return the response without needing to revalidate it (since
161
+ # it has already been authenticated once by that user), while still
162
+ # requiring proxies that service many users to revalidate each time
163
+ # (in order to make sure that each user has been authenticated).
164
+ # Note that such authenticated responses also need the public cache
165
+ # control directive in order to allow them to be cached at all.
166
+ def proxy_revalidate?
167
+ self['proxy-revalidate']
168
+ end
169
+
170
+ def to_s
171
+ bools, vals = [], []
172
+ each do |key,value|
173
+ if value == true
174
+ bools << key
175
+ elsif value
176
+ vals << "#{key}=#{value}"
177
+ end
178
+ end
179
+ (bools.sort + vals.sort).join(', ')
180
+ end
181
+
182
+ private
183
+ def parse(value)
184
+ return if value.nil? || value.empty?
185
+ value.delete(' ').split(',').inject(self) do |hash,part|
186
+ name, value = part.split('=', 2)
187
+ hash[name.downcase] = (value || true) unless name.nil? || name.empty?
188
+ hash
189
+ end
190
+ end
191
+ end
192
+ end
193
+ end
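The parsing and serialization rules in `CacheControl` can be tried in isolation with two standalone functions. This is a sketch, not the class itself: `parse_cache_control` and `format_cache_control` are hypothetical names that follow the same steps as `parse` and `to_s` above, with an added guard that skips empty parts such as the gap in `"public,,max-age=600"`.

```ruby
# Hypothetical standalone helpers; not part of Rack::Cache's API.

# Splits a Cache-Control header into a Hash. Valueless directives map to true.
def parse_cache_control(header)
  directives = {}
  return directives if header.nil? || header.empty?
  header.delete(' ').split(',').each do |part|
    next if part.empty? # tolerate stray commas
    name, value = part.split('=', 2)
    directives[name.downcase] = (value || true) unless name.empty?
  end
  directives
end

# Serializes the Hash back to a header: sorted flags first, then sorted pairs.
def format_cache_control(directives)
  bools, vals = [], []
  directives.each do |key, value|
    if value == true
      bools << key
    elsif value
      vals << "#{key}=#{value}"
    end
  end
  (bools.sort + vals.sort).join(', ')
end

parsed = parse_cache_control('max-age=600, Public,, no-cache')
# => {"max-age"=>"600", "public"=>true, "no-cache"=>true}
format_cache_control(parsed)
# => "no-cache, public, max-age=600"
```

Note that directive values stay strings after parsing, which is why accessors such as `max_age` call `to_i` on them.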