repertoire-faceting 0.7.3 → 0.7.4
- checksums.yaml +4 -4
- data/FAQ +68 -0
- data/TODO +3 -0
- data/ext/Makefile +1 -1
- data/ext/bytea/faceting_bytea.control +1 -1
- data/ext/signature/faceting.control +1 -1
- data/ext/varbit/faceting_varbit.control +1 -1
- data/lib/repertoire-faceting/adapters/postgresql_adapter.rb +17 -0
- data/lib/repertoire-faceting/controller.rb +30 -10
- data/lib/repertoire-faceting/facets/abstract_facet.rb +9 -0
- data/lib/repertoire-faceting/facets/basic_facet.rb +1 -1
- data/lib/repertoire-faceting/facets/nested_facet.rb +2 -2
- data/lib/repertoire-faceting/model.rb +36 -1
- data/lib/repertoire-faceting/version.rb +1 -1
- metadata +5 -5
- /data/ext/{faceting--0.7.2.sql → faceting--0.7.4.sql} +0 -0
- /data/ext/{faceting_bytea--0.7.2.sql → faceting_bytea--0.7.4.sql} +0 -0
- /data/ext/{faceting_varbit--0.7.2.sql → faceting_varbit--0.7.4.sql} +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6d22b7d7430b564a39b9dd864f7da3fd9fc28003
+  data.tar.gz: 5c51fc9c8de95977c82cf4d1e03ec1d53af3fe2a
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 270ea2290d57503544d37200f525f875f32798b95bc72d414f2ba2f45eb956ed3bf56690005f8bb93a83f111d43a2f9dcc79752dc78a81adacd77dfdc2085239
+  data.tar.gz: a49a2376295d12ba0f1cb6151ef94bff18ccd1aa352ec75e5189eababc8d8ee33233e88233c5ef0542878397f7a552d2ecf938c7164fb8d7c98132bdfb8652e7
data/FAQ
CHANGED
@@ -20,6 +20,67 @@
 In cases where you do not have superuser access to the deployment host (e.g. Heroku) and so cannot run "rake db:faceting:extensions:install", you can get use the connection's "faceting_api_sql" method to load the API by hand. See the repertoire-faceting-example application's migrations for a concrete example.
 
 
+*Q* Caching for the faceted browser.
+
+*A* Because the opening page of a faceted search is also the most computationally-intensive to produce (it integrates
+counts across the entire dataset), the faceted browser is radically faster when configured to cache results. As of version
+7.4, the controller mixin sets HTTP cache headers in responses to reasonable defaults.
+
+The short story is that models should have an updated_at column, which is used to compute cache keys. For models with many
+items, you may wish to add an index on updated_at.
+
+The detailed story:
+
+- Requests without an HTTP etag header are processed normally
+
+- For results queries, a cache key is built that combines the model's updated_at column with a count of the number of rows
+  in the table. If you use Rails' built-in timestamps, this will expire the cache after updating and deleting items.
+
+- For facet counts on indexed facets, the most recent index refresh is used to construct a cache key. This ensures caches
+  expire when you call "<Model>.index_facets" to refresh the indices (and not before).
+
+- For facet counts on unindexed facets, the model table's cache key is used.
+
+- If the model table has no updated_at column, caching falls back to Rails default. (Your query will execute every time.)
+
+This arrangement caches the most commonly accessed (and slowest) queries in the user's browser. Hence the first
+request from a new session will load more slowly than consecutive ones. To speed up access across all sessions,
+configure your Rails app to use an intermediate server cache, e.g. with Rack::Cache and memcached or Varnish.
+
+If you choose to over-ride the results web-service in your own controller, you can easily reuse Repertoire Faceting's
+cache settings by checking the value of <Model>.facet_cache_key:
+
+    class TodoslistController < ApplicationController
+      include Repertoire::Faceting::Controller
+      ...
+      def results
+        filter = params[:filter] || {}
+        if stale?(base.facet_cache_key)
+          @results = base.refine(filter)
+          respond_with @results
+        end
+      end
+      ...
+
+See https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works for context.
+
+*Caveats* The following caveats apply to Rails HTTP header based caching in general:
+
+- You must make sure your ActiveRecord associations touch the parent table's updated_at column when updating, or you
+  may receive stale data. For example, if you are faceting over todos stored in a joined table:
+
+    class Todo < ActiveRecord::Base
+      belongs_to :todolist, touch: true
+    end
+
+- It is recommended to expire all cache keys on deploying a new version of any application. However, as of Rails 4.0.5, there
+  is still no standard convention for versioning apps. It may be possible to automate this in Heroku. See
+  http://stackoverflow.com/questions/8792716/reflecting-heroku-push-version-within-the-app.
+
+- Because the default strategy caches items in the user agent's browser, the first load of a faceted browser session will be
+  slower than the rest. Installing a server-side cache like Rack::Cache will alleviate this.
+
+
 == About facet indexing and the signature SQL type
 
 *Q* What's the scalability of this thing?
@@ -32,6 +93,13 @@
 *A* Make sure the facet indices aren't empty. Running '<Model>.index_facets([])' from the Rails console will drop them all.
 
 
+*Q* My facet count values are out of date with the model.
+
+*A* If your facets are indexed, the indexes must be refreshed from the model table periodically. Running '<Model>.index_facets'
+(no arguments) in the Rails console will refresh them. In a production environment, run this periodically for each indexed model
+via a cron task.
+
+
 *Q* Can I facet over multiple models?
 
 *A* Not currently. However, this may be possible using an ActiveRecord polymorphic relation on the main model.
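The caching FAQ entry above describes a conditional-GET round trip: the server derives a validator from the row count and the newest updated_at value, and answers a repeat request with "304 Not Modified" while that validator is unchanged. The following is a minimal plain-Ruby sketch of that idea, not Rails' own implementation. The names CacheKey, etag_for and respond are hypothetical, and the MD5 digest is only an illustrative way to turn the key into an opaque string.

```ruby
require "digest"
require "time"

# Hypothetical stand-in for the hash returned by <Model>.facet_cache_key:
# etag holds the row count, last_modified the newest updated_at value.
CacheKey = Struct.new(:etag, :last_modified)

# Derive an opaque, quoted validator string from the key (illustrative only).
def etag_for(key)
  digest = Digest::MD5.hexdigest("#{key.etag}-#{key.last_modified.utc.iso8601}")
  "\"#{digest}\""
end

# Return [status, headers]: 304 when the client's validator still matches,
# 200 with a fresh validator otherwise.
def respond(key, if_none_match)
  etag = etag_for(key)
  return [304, { "ETag" => etag }] if if_none_match == etag
  [200, { "ETag" => etag, "Last-Modified" => key.last_modified.httpdate }]
end

key = CacheKey.new(42, Time.utc(2014, 6, 26))
status, headers = respond(key, nil)                 # first request: no validator sent
repeat_status, _ = respond(key, headers["ETag"])    # repeat request: validator matches

changed = CacheKey.new(43, Time.utc(2014, 6, 27))   # a row was added in the meantime
changed_status, _ = respond(changed, headers["ETag"])
```

Because the count is part of the key, deleting a row changes the validator even when the newest timestamp stays the same, which is the reason the FAQ gives for combining the two values.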
data/TODO
CHANGED
@@ -16,6 +16,9 @@ DESIRED FEATURES / IMPROVEMENTS.
 
 TODO
 
+-- adapter tests for existence of materialized views too often
+-- need a way to merge version number into etags?
+
 -- add info on CORS to FAQ DONE
 -- ensure pushState respects repertoire.defaults.path_prefix DONE
 
data/ext/Makefile
CHANGED
data/lib/repertoire-faceting/adapters/postgresql_adapter.rb
CHANGED
@@ -65,6 +65,23 @@ module Repertoire
   Float(result)
 end
 
+#
+# Methods for detecting table content changes
+#
+# (If a later version of PostgreSQL can hashcode a table/timestamp in the system catalog,
+# switch to use that instead.)
+#
+def stat_table(table_name, column="updated_at")
+  sql = "SELECT COUNT(#{column}), MAX(#{column}) AS timestamp FROM #{table_name}"
+  result = select_one(sql)
+  result = HashWithIndifferentAccess.new({
+    :count => Integer(result["count"]),
+    :timestamp => Time.parse(result["timestamp"])
+  })
+
+  result
+end
+
 #
 # Methods for running facet value counts
 #
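To make the behaviour of the stat_table query concrete, here is a small in-memory sketch of what "SELECT COUNT(col), MAX(col)" yields. It runs without a database. The helper name stat_rows and the sample rows are hypothetical. Two assumptions are worth stating: SQL COUNT(col) skips NULL values, and MAX over zero non-NULL rows yields NULL. In plain Ruby, Time.parse(nil) raises a TypeError, so this sketch guards against a nil timestamp; whether the adapter method itself ever sees that case depends on the table and driver.

```ruby
require "time"

# In-memory stand-in for "SELECT COUNT(col), MAX(col) AS timestamp FROM table".
# Like SQL COUNT(col), rows whose column is nil (NULL) are not counted.
def stat_rows(rows, column = "updated_at")
  values = rows.map { |row| row[column] }.compact
  raw = { "count" => values.size.to_s, "timestamp" => values.max }

  # MAX() over zero non-NULL rows is NULL, so guard before parsing.
  timestamp = raw["timestamp"] && Time.parse(raw["timestamp"])
  { :count => Integer(raw["count"]), :timestamp => timestamp }
end

rows = [
  { "updated_at" => "2014-06-25 10:00:00 UTC" },
  { "updated_at" => "2014-06-26 09:30:00 UTC" },
  { "updated_at" => nil }   # a row that was never timestamped
]

stats = stat_rows(rows)   # counts two rows, newest timestamp wins
empty = stat_rows([])     # zero rows: count 0, timestamp nil
```

The string comparison in values.max only finds the newest timestamp because all sample values share one fixed-width format; a real database compares timestamp values directly.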
data/lib/repertoire-faceting/controller.rb
CHANGED
@@ -35,8 +35,11 @@ module Repertoire
 # when 'alphanumeric' then ["#{facet} ASC"]
 # when 'count' then ["count DESC", "#{facet} ASC"]
 # end
-#
-#
+#
+# if stale?(base.facet_cache_key, :public => true)
+# @counts = base.refine(filter).order(sorting).count(facet)
+# render :json => @counts.to_a
+# end
 # end
 #
 module Controller
@@ -44,26 +47,43 @@ module Repertoire
 # Web-service to return value, count pairs for a given facet, given existing filter refinements
 # on other facets in the context. Over-ride this method if you need to specify additional
 # query params for faceting.
+#
+# Public HTTP cache headers are set, in the following order:
+# - by the facet index table (if present)
+# - by the facet model table (if it has an updated_at column)
+# - otherwise, no HTTP cache header is set
+#
 def counts
   facet = params[:facet]
   filter = params[:filter] || {}
   raise "Unkown facet #{facet}" unless base.facet?(facet)
-
-  @counts = base.refine(filter).count(facet)
 
-
+  if stale?(base.facet_cache_key(facet), :public => true)
+
+    @counts = base.refine(filter).count(facet)
+    render :json => @counts.to_a
+
+  end
 end
 
 # Web-service to return the results of a query, given existing filter requirements. Over-ride
 # this method if you need to specify additional query parms for faceting results.
+#
+# Private HTTP cache headers are set:
+# - by the facet model table (if it has an updated_at column)
+# - otherwise, no HTTP cache header is set
+#
 def results
   filter = params[:filter] || {}
-
-  @results = base.refine(filter).to_a
 
-
-
-
+  if stale?(base.facet_cache_key)
+
+    @results = base.refine(filter).to_a
+
+    respond_to do |format|
+      format.html { render @results, :layout => false }
+      format.json { render :json => @results }
+    end
   end
 end
 
data/lib/repertoire-faceting/facets/abstract_facet.rb
CHANGED
@@ -62,6 +62,15 @@ module Repertoire
   indexed_facets.map(&:to_sym).include?(facet_name)
 end
 
+# Returns a row count and timestamp for the facet's index table,
+# or nil if it is unindexed.
+#
+# If a future PostgreSQL timestamps materialized view refreshes
+# via the system catalogs, that value should be used instead.
+def stat_table
+  connection.stat_table(facet_index_name) if facet_indexed?
+end
+
 protected
 
 # Return a facet's index table name
data/lib/repertoire-faceting/facets/basic_facet.rb
CHANGED
@@ -34,7 +34,7 @@ module Repertoire #:nodoc:
 def create_index
   col = group_values.first
   rel = only(:where, :joins, :group)
-  sql = rel.select(["#{col} AS #{facet_name}", "facet.signature(#{table_name}.#{faceting_id})"]).to_sql
+  sql = rel.select(["#{col} AS #{facet_name}", "facet.signature(#{table_name}.#{faceting_id})", "now() AS updated_at"]).to_sql
 
   connection.create_materialized_view(facet_index_name, sql)
 end
data/lib/repertoire-faceting/facets/nested_facet.rb
CHANGED
@@ -50,12 +50,12 @@ module Repertoire
   end
   rel = rel.group(columns[0..level])
 
-  queries << rel.select(["#{level+1} AS level", "facet.signature(#{table_name}.#{faceting_id})"]).to_sql
+  queries << rel.select(["#{level+1} AS level", "facet.signature(#{table_name}.#{faceting_id})", "now() AS updated_at"]).to_sql
 end
 
 # Root of tree
 empty_cols = columns.map { |col| "NULL AS #{col}"}
-queries << only(:where).select(empty_cols + ["0 AS level", "facet.signature(#{table_name}.#{faceting_id})"]).to_sql
+queries << only(:where).select(empty_cols + ["0 AS level", "facet.signature(#{table_name}.#{faceting_id})", "now() AS updated_at"]).to_sql
 
 # Give the fullest index first (i.e. leaves of the tree), so the database
 # can resolve types before encountering any NULL values (i.e. values of
data/lib/repertoire-faceting/model.rb
CHANGED
@@ -11,7 +11,7 @@ module Repertoire
   base.singleton_class.delegate :refine, :minimum, :nils, :reorder, :to_sql, :to => :scoped_all
 
   base.class_attribute(:facets)
-  base.facets =
+  base.facets = HashWithIndifferentAccess.new
 end
 
 #
@@ -254,6 +254,41 @@ module Repertoire
   connection.signature_wastage(table_name, signature_column)
 end
 
+# Returns the row count and most recent update timestamp for a model table; or nil if
+# there is no updated_at field.
+def stat_table(timestamp_column = nil)
+  timestamp_column ||= "updated_at"
+  connection.stat_table(table_name) if column_names.include?(timestamp_column)
+end
+
+# Return a key suitable for use in HTTP headers, either for the base model table or one of its
+# facets.
+#
+# If the name of an indexed facet is provided, the timestamp of the facet index table is used
+# to construct the facet's cache key. This ensures that facet counts expire when the facet is
+# re-indexed (and not before).
+#
+# If the name of an unindexed facet is given, a cache key for the entire model table is provided.
+# This ensures that faceted browsers over live data expire facet count caches when the base
+# model table is updated.
+#
+# Calling facet_cache_key with no arguments returns a cache key for the entire model table. The
+# key is a combination of the most recent update_at column value and the table row count. If
+# the model has no updated_at attribute, caching is disabled.
+#
+# See the FAQ for additional information.
+def facet_cache_key(facet = nil)
+  facet = facet.try(:to_s)
+  result = nil
+
+  stats = facets[facet].stat_table if facet_names.include?(facet)
+  stats ||= stat_table
+
+  result = { :etag => stats[:count], :last_modified => stats[:timestamp] } if stats.present?
+
+  result
+end
+
 # Once clients have migrated to Rails 4, delete and replace with 'all' where this is called
 #
 # c.f. http://stackoverflow.com/questions/18198963/with-rails-4-model-scoped-is-deprecated-but-model-all-cant-replace-it
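The fallback order documented for facet_cache_key (index table first, then model table, then no key at all) can be exercised without Rails. The sketch below is a plain-Ruby illustration under stated assumptions: FacetStub and cache_key_for are hypothetical names, the stats hashes are made-up sample data, and only the order of the fallbacks is taken from the hunk above.

```ruby
# Hypothetical stand-in for a facet object. index_stats is nil when the
# facet has no index table, mirroring "nil if it is unindexed".
class FacetStub
  def initialize(index_stats)
    @index_stats = index_stats
  end

  def stat_table
    @index_stats
  end
end

# facets:      Hash of facet name (String) => FacetStub
# model_stats: stats for the model table, or nil when it has no updated_at
def cache_key_for(facets, model_stats, facet = nil)
  name  = facet && facet.to_s
  stats = facets[name].stat_table if facets.key?(name)
  stats ||= model_stats                      # unindexed facet, or no facet given
  return nil if stats.nil? || stats.empty?   # no timestamp column: caching disabled

  { :etag => stats[:count], :last_modified => stats[:timestamp] }
end

facets = {
  "genre" => FacetStub.new({ :count => 12, :timestamp => Time.utc(2014, 6, 20) }),
  "year"  => FacetStub.new(nil)
}
model_stats = { :count => 500, :timestamp => Time.utc(2014, 6, 26) }

indexed   = cache_key_for(facets, model_stats, :genre)   # uses the index table stats
unindexed = cache_key_for(facets, model_stats, "year")   # falls back to the model table
whole     = cache_key_for(facets, model_stats)           # no facet: model table
disabled  = cache_key_for(facets, nil, "year")           # nothing to key on
```

Keying an indexed facet on its index table rather than the model table is what lets cached counts stay valid until the next re-index, even if the underlying rows have changed in the meantime.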
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: repertoire-faceting
 version: !ruby/object:Gem::Version
-  version: 0.7.
+  version: 0.7.4
 platform: ruby
 authors:
 - Christopher York
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-06-
+date: 2014-06-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rails
@@ -83,9 +83,9 @@ files:
 - ext/bytea/faceting_bytea.control
 - ext/common/util.sql
 - ext/extconf.rb
-- ext/faceting--0.7.
-- ext/faceting_bytea--0.7.
-- ext/faceting_varbit--0.7.
+- ext/faceting--0.7.4.sql
+- ext/faceting_bytea--0.7.4.sql
+- ext/faceting_varbit--0.7.4.sql
 - ext/signature/faceting.control
 - ext/signature/signature.c
 - ext/signature/signature.o
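/data/ext/{faceting--0.7.2.sql → faceting--0.7.4.sql}
File without changes
/data/ext/{faceting_bytea--0.7.2.sql → faceting_bytea--0.7.4.sql}
File without changes
/data/ext/{faceting_varbit--0.7.2.sql → faceting_varbit--0.7.4.sql}
File without changes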