repertoire-faceting 0.7.3 → 0.7.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 9f33047f3e67ebe050a869ec98d8f90042c284de
-   data.tar.gz: 66d4aa841d23714ede620a6ea56f2947f107a97c
+   metadata.gz: 6d22b7d7430b564a39b9dd864f7da3fd9fc28003
+   data.tar.gz: 5c51fc9c8de95977c82cf4d1e03ec1d53af3fe2a
  SHA512:
-   metadata.gz: 5d86d643b9c881419b6dfe4f3608a56160193775057fc4f645f4b4367f10416cc5968e7f85a0f8059acc9a6c0f5cdc7ebf01a571d1bdc74e6661aa253f6a15e3
-   data.tar.gz: 890a6a07e726450554e058065f0e1fd2471805b8b4e6a751d87d564b5bc3dee58567176b56e8c445fc3df7f12009a9882d7eb2f854966d7facd0c82dce447675
+   metadata.gz: 270ea2290d57503544d37200f525f875f32798b95bc72d414f2ba2f45eb956ed3bf56690005f8bb93a83f111d43a2f9dcc79752dc78a81adacd77dfdc2085239
+   data.tar.gz: a49a2376295d12ba0f1cb6151ef94bff18ccd1aa352ec75e5189eababc8d8ee33233e88233c5ef0542878397f7a552d2ecf938c7164fb8d7c98132bdfb8652e7
data/FAQ CHANGED
@@ -20,6 +20,67 @@
  In cases where you do not have superuser access to the deployment host (e.g. Heroku) and so cannot run "rake db:faceting:extensions:install", you can use the connection's "faceting_api_sql" method to load the API by hand. See the repertoire-faceting-example application's migrations for a concrete example.
  
  
+ *Q* Caching for the faceted browser.
+
+ *A* Because the opening page of a faceted search is also the most computationally intensive to produce (it integrates
+ counts across the entire dataset), the faceted browser is radically faster when configured to cache results. As of version
+ 0.7.4, the controller mixin sets HTTP cache headers in responses to reasonable defaults.
+
+ The short story is that models should have an updated_at column, which is used to compute cache keys. For models with many
+ items, you may wish to add an index on updated_at.
+
+ The detailed story:
+
+ - Requests without an HTTP etag header are processed normally.
+
+ - For results queries, a cache key is built that combines the model's updated_at column with a count of the number of rows
+   in the table. If you use Rails' built-in timestamps, this will expire the cache after updating and deleting items.
+
+ - For facet counts on indexed facets, the most recent index refresh is used to construct a cache key. This ensures caches
+   expire when you call "<Model>.index_facets" to refresh the indices (and not before).
+
+ - For facet counts on unindexed facets, the model table's cache key is used.
+
+ - If the model table has no updated_at column, caching falls back to the Rails default. (Your query will execute every time.)
+
+ This arrangement caches the most commonly accessed (and slowest) queries in the user's browser. Hence the first
+ request from a new session will load more slowly than subsequent ones. To speed up access across all sessions,
+ configure your Rails app to use an intermediate server cache, e.g. with Rack::Cache and memcached or Varnish.
+
+ If you choose to override the results web-service in your own controller, you can easily reuse Repertoire Faceting's
+ cache settings by checking the value of <Model>.facet_cache_key:
+
+     class TodoslistController < ApplicationController
+       include Repertoire::Faceting::Controller
+       ...
+       def results
+         filter = params[:filter] || {}
+         if stale?(base.facet_cache_key)
+           @results = base.refine(filter)
+           respond_with @results
+         end
+       end
+       ...
+
+ See https://signalvnoise.com/posts/3113-how-key-based-cache-expiration-works for context.
+
+ *Caveats* The following caveats apply to Rails HTTP header based caching in general:
+
+ - You must make sure your ActiveRecord associations touch the parent table's updated_at column when updating, or you
+   may receive stale data. For example, if you are faceting over todos stored in a joined table:
+
+     class Todo < ActiveRecord::Base
+       belongs_to :todolist, touch: true
+     end
+
+ - It is recommended to expire all cache keys on deploying a new version of any application. However, as of Rails 4.0.5, there
+   is still no standard convention for versioning apps. It may be possible to automate this in Heroku. See
+   http://stackoverflow.com/questions/8792716/reflecting-heroku-push-version-within-the-app.
+
+ - Because the default strategy caches items in the user agent's browser, the first load of a faceted browser session will be
+   slower than the rest. Installing a server-side cache like Rack::Cache will alleviate this.
+
+
  == About facet indexing and the signature SQL type
  
  *Q* What's the scalability of this thing?
@@ -32,6 +93,13 @@
  *A* Make sure the facet indices aren't empty. Running '<Model>.index_facets([])' from the Rails console will drop them all.
  
  
+ *Q* My facet count values are out of date with the model.
+
+ *A* If your facets are indexed, the indexes must be refreshed from the model table periodically. Running '<Model>.index_facets'
+ (no arguments) in the Rails console will refresh them. In a production environment, run this periodically for each indexed model
+ via a cron task.
+
+
  *Q* Can I facet over multiple models?
  
  *A* Not currently. However, this may be possible using an ActiveRecord polymorphic relation on the main model.
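
Note on the FAQ's cache key description: the key pairs a row count with the newest updated_at value, and each half seems to cover a case the other misses. The plain-Ruby sketch below illustrates this under stated assumptions; `Row` and `table_cache_key` are hypothetical stand-ins for a model table and the gem's key computation, not part of repertoire-faceting.

```ruby
require "time"

# Hypothetical stand-in for a model table: each row only carries updated_at.
Row = Struct.new(:updated_at)

# Mirrors the idea behind "SELECT COUNT(updated_at), MAX(updated_at)".
# Returns nil when there is nothing to build a key from.
def table_cache_key(rows)
  stamps = rows.map(&:updated_at).compact
  return nil if stamps.empty?
  { :etag => stamps.size, :last_modified => stamps.max }
end

t0   = Time.utc(2014, 6, 26, 12, 0, 0)
rows = [Row.new(t0), Row.new(t0 + 60), Row.new(t0 + 120)]
before = table_cache_key(rows)

# Deleting an older row leaves the newest timestamp unchanged, but the count moves.
after_delete = table_cache_key(rows.reject { |r| r.updated_at == t0 })

# Updating a row leaves the count unchanged, but the newest timestamp moves.
updated = rows.dup
updated[0] = Row.new(t0 + 300)
after_update = table_cache_key(updated)
```

In this sketch a delete is only detectable through the count and an update only through the timestamp, which is presumably why the FAQ combines the two.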
data/TODO CHANGED
@@ -16,6 +16,9 @@ DESIRED FEATURES / IMPROVEMENTS.
  
  TODO
  
+ -- adapter tests for existence of materialized views too often
+ -- need a way to merge version number into etags?
+
  -- add info on CORS to FAQ DONE
  -- ensure pushState respects repertoire.defaults.path_prefix DONE
  
data/ext/Makefile CHANGED
@@ -8,7 +8,7 @@
  #
  #-------------------------------------------------------------------------
  
- API_VERSION = 0.7.3
+ API_VERSION = 0.7.4
  
  MODULES = signature/signature
  EXTENSION = signature/faceting \
@@ -2,5 +2,5 @@
  
  comment = 'API for faceted indexing and queries (based on plv8 + bytea bitmaps)'
  requires = 'plv8, plpgsql'
- default_version = '0.7.3'
+ default_version = '0.7.4'
  schema = 'facet'
@@ -2,5 +2,5 @@
  
  comment = 'API for faceted indexing and queries (based on custom C bitmap type)'
  requires = plpgsql
- default_version = '0.7.3'
+ default_version = '0.7.4'
  schema = 'facet'
@@ -3,5 +3,5 @@
  comment = 'API for faceted indexing and queries (based on builtin VARBIT bit strings)'
  requires = plpgsql
  superuser = false
- default_version = '0.7.3'
+ default_version = '0.7.4'
  schema = 'facet'
@@ -65,6 +65,23 @@ module Repertoire
    Float(result)
  end
  
+ #
+ # Methods for detecting table content changes
+ #
+ # (If a later version of PostgreSQL can hashcode a table/timestamp in the system catalog,
+ # switch to use that instead.)
+ #
+ def stat_table(table_name, column="updated_at")
+   sql = "SELECT COUNT(#{column}), MAX(#{column}) AS timestamp FROM #{table_name}"
+   result = select_one(sql)
+   result = HashWithIndifferentAccess.new({
+     :count => Integer(result["count"]),
+     :timestamp => Time.parse(result["timestamp"])
+   })
+
+   result
+ end
+
  #
  # Methods for running facet value counts
  #
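
Note on the stat_table hunk above: MAX over an empty table yields NULL, so the row handed to Time.parse would presumably carry a nil timestamp. The sketch below shows one way such a row could be guarded. It assumes the adapter returns string values (as the Integer(...) and Time.parse calls in the hunk suggest); `parse_table_stats` is a hypothetical helper, not part of the gem.

```ruby
require "time"

# Hypothetical helper: converts the raw row from the COUNT/MAX query into a
# stats hash, or returns nil when the table supplied no timestamp.
def parse_table_stats(row)
  return nil if row.nil? || row["timestamp"].nil?
  { :count => Integer(row["count"]), :timestamp => Time.parse(row["timestamp"]) }
end

populated = parse_table_stats("count" => "3", "timestamp" => "2014-06-26 12:00:00 UTC")
empty     = parse_table_stats("count" => "0", "timestamp" => nil)
```

Returning nil for the empty case matches how the callers in this diff already treat a missing stats value: no cache key is produced.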
@@ -35,8 +35,11 @@ module Repertoire
  #   when 'alphanumeric' then ["#{facet} ASC"]
  #   when 'count' then ["count DESC", "#{facet} ASC"]
  #   end
- #   @counts = base.refine(filter).order(sorting).count(facet)
- #   render :json => @counts.to_a
+ #
+ #   if stale?(base.facet_cache_key, :public => true)
+ #     @counts = base.refine(filter).order(sorting).count(facet)
+ #     render :json => @counts.to_a
+ #   end
  # end
  #
  module Controller
@@ -44,26 +47,43 @@ module Repertoire
  # Web-service to return value, count pairs for a given facet, given existing filter refinements
  # on other facets in the context. Over-ride this method if you need to specify additional
  # query params for faceting.
+ #
+ # Public HTTP cache headers are set, in the following order:
+ # - by the facet index table (if present)
+ # - by the facet model table (if it has an updated_at column)
+ # - otherwise, no HTTP cache header is set
+ #
  def counts
    facet = params[:facet]
    filter = params[:filter] || {}
    raise "Unknown facet #{facet}" unless base.facet?(facet)
-
-   @counts = base.refine(filter).count(facet)
  
-   render :json => @counts.to_a
+   if stale?(base.facet_cache_key(facet), :public => true)
+
+     @counts = base.refine(filter).count(facet)
+     render :json => @counts.to_a
+
+   end
  end
  
  # Web-service to return the results of a query, given existing filter requirements. Over-ride
  # this method if you need to specify additional query params for faceting results.
+ #
+ # Private HTTP cache headers are set:
+ # - by the facet model table (if it has an updated_at column)
+ # - otherwise, no HTTP cache header is set
+ #
  def results
    filter = params[:filter] || {}
-
-   @results = base.refine(filter).to_a
  
-   respond_to do |format|
-     format.html { render @results, :layout => false }
-     format.json { render :json => @results }
+   if stale?(base.facet_cache_key)
+
+     @results = base.refine(filter).to_a
+
+     respond_to do |format|
+       format.html { render @results, :layout => false }
+       format.json { render :json => @results }
+     end
    end
  end
  
@@ -62,6 +62,15 @@ module Repertoire
    indexed_facets.map(&:to_sym).include?(facet_name)
  end
  
+ # Returns a row count and timestamp for the facet's index table,
+ # or nil if it is unindexed.
+ #
+ # If a future PostgreSQL timestamps materialized view refreshes
+ # via the system catalogs, that value should be used instead.
+ def stat_table
+   connection.stat_table(facet_index_name) if facet_indexed?
+ end
+
  protected
  
  # Return a facet's index table name
@@ -34,7 +34,7 @@ module Repertoire #:nodoc:
  def create_index
    col = group_values.first
    rel = only(:where, :joins, :group)
-   sql = rel.select(["#{col} AS #{facet_name}", "facet.signature(#{table_name}.#{faceting_id})"]).to_sql
+   sql = rel.select(["#{col} AS #{facet_name}", "facet.signature(#{table_name}.#{faceting_id})", "now() AS updated_at"]).to_sql
  
    connection.create_materialized_view(facet_index_name, sql)
  end
@@ -50,12 +50,12 @@ module Repertoire
    end
    rel = rel.group(columns[0..level])
  
-   queries << rel.select(["#{level+1} AS level", "facet.signature(#{table_name}.#{faceting_id})"]).to_sql
+   queries << rel.select(["#{level+1} AS level", "facet.signature(#{table_name}.#{faceting_id})", "now() AS updated_at"]).to_sql
  end
  
  # Root of tree
  empty_cols = columns.map { |col| "NULL AS #{col}"}
- queries << only(:where).select(empty_cols + ["0 AS level", "facet.signature(#{table_name}.#{faceting_id})"]).to_sql
+ queries << only(:where).select(empty_cols + ["0 AS level", "facet.signature(#{table_name}.#{faceting_id})", "now() AS updated_at"]).to_sql
  
  # Give the fullest index first (i.e. leaves of the tree), so the database
  # can resolve types before encountering any NULL values (i.e. values of
@@ -11,7 +11,7 @@ module Repertoire
    base.singleton_class.delegate :refine, :minimum, :nils, :reorder, :to_sql, :to => :scoped_all
  
    base.class_attribute(:facets)
-   base.facets = {}
+   base.facets = HashWithIndifferentAccess.new
  end
  
  #
@@ -254,6 +254,41 @@ module Repertoire
    connection.signature_wastage(table_name, signature_column)
  end
  
+ # Returns the row count and most recent update timestamp for a model table; or nil if
+ # there is no updated_at field.
+ def stat_table(timestamp_column = nil)
+   timestamp_column ||= "updated_at"
+   # Pass the column through so a custom timestamp column is actually the one queried.
+   connection.stat_table(table_name, timestamp_column) if column_names.include?(timestamp_column)
+ end
+
263
+
264
+ # Return a key suitable for use in HTTP headers, either for the base model table or one of its
265
+ # facets.
266
+ #
267
+ # If the name of an indexed facet is provided, the timestamp of the facet index table is used
268
+ # to construct the facet's cache key. This ensures that facet counts expire when the facet is
269
+ # re-indexed (and not before).
270
+ #
271
+ # If the name of an unindexed facet is given, a cache key for the entire model table is provided.
272
+ # This ensures that faceted browsers over live data expire facet count caches when the base
273
+ # model table is updated.
274
+ #
275
+ # Calling facet_cache_key with no arguments returns a cache key for the entire model table. The
276
+ # key is a combination of the most recent update_at column value and the table row count. If
277
+ # the model has no updated_at attribute, caching is disabled.
278
+ #
279
+ # See the FAQ for additional information.
280
+ def facet_cache_key(facet = nil)
281
+ facet = facet.try(:to_s)
282
+ result = nil
283
+
284
+ stats = facets[facet].stat_table if facet_names.include?(facet)
285
+ stats ||= stat_table
286
+
287
+ result = { :etag => stats[:count], :last_modified => stats[:timestamp] } if stats.present?
288
+
289
+ result
290
+ end
291
+
257
292
  # Once clients have migrated to Rails 4, delete and replace with 'all' where this is called
258
293
  #
259
294
  # c.f. http://stackoverflow.com/questions/18198963/with-rails-4-model-scoped-is-deprecated-but-model-all-cant-replace-it
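
Note on facet_cache_key above: the lookup order it documents (index table stats for an indexed facet, otherwise model table stats, otherwise no key) can be sketched in plain Ruby as follows. `cache_key_for`, `index_stats` and `model_stats` are hypothetical stand-ins for the gem's facet and model objects, used only to make the fallback visible.

```ruby
# index_stats maps indexed facet names to their index table stats;
# model_stats is the model table's stats (nil when it has no updated_at).
def cache_key_for(facet, index_stats, model_stats)
  stats = index_stats[facet.to_s] if facet   # nil for unindexed or unknown facets
  stats ||= model_stats                      # fall back to the model table
  return nil if stats.nil?                   # no key: caching is disabled
  { :etag => stats[:count], :last_modified => stats[:timestamp] }
end

indexed_at = Time.utc(2014, 6, 25, 3, 0, 0)    # last index refresh (illustrative)
changed_at = Time.utc(2014, 6, 26, 12, 0, 0)   # last model update (illustrative)

index_stats = { "genre" => { :count => 12, :timestamp => indexed_at } }
model_stats = { :count => 500, :timestamp => changed_at }

indexed_key   = cache_key_for(:genre, index_stats, model_stats)
unindexed_key = cache_key_for(:year, index_stats, model_stats)
table_key     = cache_key_for(nil, index_stats, model_stats)
no_key        = cache_key_for(:year, index_stats, nil)
```

The indexed facet keeps its older timestamp until the next refresh, while the unindexed facet and the bare table key follow the model table.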
@@ -1,5 +1,5 @@
  module Repertoire
    module Faceting #:nodoc:
-     VERSION = "0.7.3"
+     VERSION = "0.7.4"
    end
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: repertoire-faceting
  version: !ruby/object:Gem::Version
-   version: 0.7.3
+   version: 0.7.4
  platform: ruby
  authors:
  - Christopher York
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-06-19 00:00:00.000000000 Z
+ date: 2014-06-26 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rails
@@ -83,9 +83,9 @@ files:
  - ext/bytea/faceting_bytea.control
  - ext/common/util.sql
  - ext/extconf.rb
- - ext/faceting--0.7.2.sql
- - ext/faceting_bytea--0.7.2.sql
- - ext/faceting_varbit--0.7.2.sql
+ - ext/faceting--0.7.4.sql
+ - ext/faceting_bytea--0.7.4.sql
+ - ext/faceting_varbit--0.7.4.sql
  - ext/signature/faceting.control
  - ext/signature/signature.c
  - ext/signature/signature.o
File without changes