blind_index 2.0.0 → 2.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: f656fa1765df9bf2bcfa9d994ccb9b5cf0504f27f7524d159323126462d973e0
-   data.tar.gz: 88d5f2cd786f840e75540a204dbe7fbb74794de954978d90cdc69297078cf752
+   metadata.gz: 80c561a0a96707de1176ae314d4d884e03390de9b8c4e23049d3649d7576937e
+   data.tar.gz: ea819c68b4d1a44c492799225d3250a86186cca1a355a3ab2e65a02069ae4062
  SHA512:
-   metadata.gz: 139d3d8f3aca413d0ee5045fe3212e6ed3327cdb6d0c60cb7eb2b314b4eb849abb42dcf17e386e23fb5521db2875b6c502b143fc46dc3305ad38151688b610be
-   data.tar.gz: 0f5df7f99a1b79f2eab0bc9ff959b9fd21c238e772e09265ef88690678134f539968d9aebe508b88e9881657563013cd53031947d6a6322e08c7e29d95f45902
+   metadata.gz: 44835258443127734b6940287e1768935884c86b53d90bb5c39a9a5db372b937649f81e5a6f16d9e4605cb78b19824759b019461491176cb4b2e0bfd1330858d
+   data.tar.gz: ddeeb0f625335d49e86ab0a5ff2350a7ae8c5b74ffd0322d7504763818b05ee6c648ea0d85b395c2f197266adfcb18015b37e556d6cbee04aea32f943d0123cd
data/CHANGELOG.md CHANGED
@@ -1,4 +1,26 @@
- ## 2.0.0 (2019-02-10)
+ ## 2.2.0 (2020-09-07)
+
+ - Added support for `where` with table in Active Record 5.2+
+
+ ## 2.1.1 (2020-08-14)
+
+ - Fixed `version` option
+
+ ## 2.1.0 (2020-07-06)
+
+ - Improved performance of uniqueness validations
+ - Fixed deprecation warnings in Ruby 2.7 with Mongoid
+
+ ## 2.0.2 (2020-06-01)
+
+ - Improved error message for bad key length
+ - Fixed `backfill` method with relations for Mongoid
+
+ ## 2.0.1 (2020-02-14)
+
+ - Added `BlindIndex.backfill` method
+
+ ## 2.0.0 (2020-02-10)

  - Blind indexes are updated immediately instead of in a `before_validation` callback
  - Better Lockbox integration - no need to generate a separate key
data/README.md CHANGED
@@ -10,7 +10,7 @@ Learn more about [securing sensitive data in Rails](https://ankane.org/sensitive

  ## How It Works

- We use [this approach](https://paragonie.com/blog/2017/05/building-searchable-encrypted-databases-with-php-and-sql) by Scott Arciszewski. To summarize, we compute a keyed hash of the sensitive data and store it in a column. To query, we apply the keyed hash function to the value we’re searching and then perform a database search. This results in performant queries for exact matches. `LIKE` queries are not possible, but you can index expressions.
+ We use [this approach](https://paragonie.com/blog/2017/05/building-searchable-encrypted-databases-with-php-and-sql) by Scott Arciszewski. To summarize, we compute a keyed hash of the sensitive data and store it in a column. To query, we apply the keyed hash function to the value we’re searching and then perform a database search. This results in performant queries for exact matches. Efficient `LIKE` queries are [not possible](#like-ilike-and-full-text-searching), but you can index expressions.

  ## Leakage

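The keyed-hash idea described in the "How It Works" hunk above can be sketched in plain Ruby. This is only an illustration under stated assumptions: it uses HMAC-SHA256 from Ruby's `openssl` library as a stand-in keyed hash, which is not necessarily the algorithm the gem uses, and `DEMO_KEY`, `keyed_hash`, and `find_rows` are hypothetical names that do not appear in the gem.

```ruby
require "openssl"

# Hypothetical demo key; a real key would be random and kept secret.
DEMO_KEY = "0" * 32

# Keyed hash of a value, hex-encoded so it could be stored in a string column.
# HMAC-SHA256 is an assumption made for this sketch.
def keyed_hash(value)
  OpenSSL::HMAC.hexdigest("SHA256", DEMO_KEY, value)
end

# In-memory stand-in for a table that stores only the keyed hash.
ROWS = ["test@example.org", "other@example.org"].map do |email|
  { email_bidx: keyed_hash(email) }
end

# To query, hash the search term the same way and compare for equality.
def find_rows(email)
  ROWS.select { |row| row[:email_bidx] == keyed_hash(email) }
end

puts find_rows("test@example.org").length
puts find_rows("missing@example.org").length
```

Because only equality of hashes can be tested, a partial term such as `"test"` produces an unrelated hash, which is why substring matching does not work against the stored column.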
@@ -26,13 +26,13 @@ Add this line to your application’s Gemfile:
  gem 'blind_index'
  ```

- ## Getting Started
+ ## Prep

  Your model should already be set up with Lockbox or attr_encrypted. The examples are for a `User` model with `encrypts :email` or `attr_encrypted :email`. See the full examples for [Lockbox](https://ankane.org/securing-user-emails-lockbox) and [attr_encrypted](https://ankane.org/securing-user-emails-in-rails) if needed.

  Also, if you use attr_encrypted, [generate a key](#key-generation).

- ---
+ ## Getting Started

  Create a migration to add a column for the blind index

@@ -60,10 +60,7 @@ end
  Backfill existing records

  ```ruby
- User.unscoped.where(email_bidx: nil).find_each do |user|
-   user.compute_email_bidx
-   user.save(validate: false)
- end
+ BlindIndex.backfill(User)
  ```

  And query away
@@ -72,9 +69,19 @@ And query away
  User.where(email: "test@example.org")
  ```

+ ## Expressions
+
+ You can apply expressions to attributes before indexing and searching. This gives you the the ability to perform case-insensitive searches and more.
+
+ ```ruby
+ class User < ApplicationRecord
+   blind_index :email, expression: ->(v) { v.downcase }
+ end
+ ```
+
  ## Validations

- To prevent duplicates, use:
+ You can use blind indexes for uniqueness validations.

  ```ruby
  class User < ApplicationRecord
@@ -82,15 +89,27 @@ class User < ApplicationRecord
  end
  ```

- We also recommend adding a unique index to the blind index column through a database migration.
+ We recommend adding a unique index to the blind index column through a database migration.

- ## Expressions
+ ```ruby
+ add_index :users, :email_bidx, unique: true
+ ```

- You can apply expressions to attributes before indexing and searching. This gives you the the ability to perform case-insensitive searches and more.
+ For `allow_blank: true`, use:
+
+ ```ruby
+ class User < ApplicationRecord
+   blind_index :email, expression: ->(v) { v.presence }
+   validates :email, uniqueness: {allow_blank: true}
+ end
+ ```
+
+ For `case_sensitive: false`, use:

  ```ruby
  class User < ApplicationRecord
    blind_index :email, expression: ->(v) { v.downcase }
+   validates :email, uniqueness: true # for best performance, leave out {case_sensitive: false}
  end
  ```

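To illustrate why a `downcase` expression makes a plain uniqueness check behave case-insensitively, here is a small sketch. It assumes HMAC-SHA256 as a stand-in keyed hash and uses hypothetical names (`DEMO_BIDX_KEY`, `DEMO_EXPRESSION`, `demo_bidx`, `demo_uniqueness`); it does not call the gem or Active Record.

```ruby
require "openssl"
require "set"

# Hypothetical demo key for illustration only.
DEMO_BIDX_KEY = "0" * 32

# Stand-in for the `expression` option: transform the value before hashing.
DEMO_EXPRESSION = ->(v) { v.downcase }

def demo_bidx(value)
  OpenSSL::HMAC.hexdigest("SHA256", DEMO_BIDX_KEY, DEMO_EXPRESSION.call(value))
end

# Classify each email as :unique or :duplicate based on its hashed, normalized form.
def demo_uniqueness(emails)
  seen = Set.new
  # Set#add? returns nil when the element is already present
  emails.map { |email| seen.add?(demo_bidx(email)) ? :unique : :duplicate }
end

p demo_uniqueness(["Test@Example.org", "test@example.org", "other@example.org"])
```

Since the normalization happens before hashing, two values that differ only by case produce the same stored value, so an ordinary equality check on the stored column is enough.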
@@ -115,10 +134,7 @@ end
  Backfill existing records

  ```ruby
- User.unscoped.where(email_ci_bidx: nil).find_each do |user|
-   user.compute_email_ci_bidx
-   user.save(validate: false)
- end
+ BlindIndex.backfill(User, columns: [:email_ci_bidx])
  ```

  And query away
@@ -168,10 +184,7 @@ end
  This allows you to backfill records while still querying the unencrypted field.

  ```ruby
- User.unscoped.where(email_bidx: nil).find_each do |user|
-   user.compute_migrated_email_bidx
-   user.save(validate: false)
- end
+ BlindIndex.backfill(User)
  ```

  Once that completes, you can remove the `migrating` option.
@@ -196,10 +209,7 @@ end
  This will keep the new column synced going forward. Next, backfill the data:

  ```ruby
- User.unscoped.where(email_bidx_v2: nil).find_each do |user|
-   user.compute_rotated_email_bidx
-   user.save(validate: false)
- end
+ BlindIndex.backfill(User, columns: [:email_bidx_v2])
  ```

  Then update your model
@@ -279,6 +289,30 @@ or create `config/initializers/blind_index.rb` with something like
  BlindIndex.master_key = Rails.application.credentials.blind_index_master_key
  ```

+ ## LIKE, ILIKE, and Full-Text Searching
+
+ Unfortunately, blind indexes can’t be used for `LIKE`, `ILIKE`, or full-text searching. Instead, records must be loaded, decrypted, and searched in memory.
+
+ For `LIKE`, use:
+
+ ```ruby
+ User.select { |u| u.email.include?("value") }
+ ```
+
+ For `ILIKE`, use:
+
+ ```ruby
+ User.select { |u| u.email =~ /value/i }
+ ```
+
+ For full-text or fuzzy searching, use a gem like [FuzzyMatch](https://github.com/seamusabshere/fuzzy_match):
+
+ ```ruby
+ FuzzyMatch.new(User.all, read: :email).find("value")
+ ```
+
+ If the number of records is large, try to find a way to narrow it down. An [expression index](#expressions) is one way to do this, but leaks which records have the same value of the expression, so use it carefully.
+

  ## Reference

  Set default options in an initializer with:
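The in-memory filtering described in the new "LIKE, ILIKE, and Full-Text Searching" README section can be tried without a database. The sketch below uses a hypothetical `DemoUser` struct in place of decrypted model instances; `demo_like` and `demo_ilike` are illustrative helper names, not part of the gem.

```ruby
# Hypothetical stand-in for model instances whose email has already been decrypted.
DemoUser = Struct.new(:email)

DEMO_USERS = [
  DemoUser.new("alice@example.org"),
  DemoUser.new("Bob@Example.org"),
  DemoUser.new("carol@test.com")
].freeze

# Rough equivalent of LIKE '%term%': case-sensitive substring match.
def demo_like(users, term)
  users.select { |u| u.email.include?(term) }
end

# Rough equivalent of ILIKE '%term%': case-insensitive match.
def demo_ilike(users, term)
  pattern = /#{Regexp.escape(term)}/i
  users.select { |u| u.email =~ pattern }
end

p demo_like(DEMO_USERS, "example").map(&:email)
p demo_ilike(DEMO_USERS, "example").map(&:email)
```

Every record is visited on each search, so the cost grows linearly with the number of records, which is why the README text suggests narrowing the candidate set first.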
@@ -448,3 +482,5 @@ cd blind_index
  bundle install
  bundle exec rake test
  ```
+
+ For security issues, send an email to the address on [this page](https://github.com/ankane).
data/lib/blind_index.rb CHANGED
@@ -4,6 +4,7 @@ require "openssl"
  require "argon2/kdf"

  # modules
+ require "blind_index/backfill"
  require "blind_index/key_generator"
  require "blind_index/model"
  require "blind_index/version"
@@ -116,17 +117,21 @@ module BlindIndex
      key
    end

-   def self.decode_key(key)
+   def self.decode_key(key, name: "Key")
      # decode hex key
      if key.encoding != Encoding::BINARY && key =~ /\A[0-9a-f]{64}\z/i
        key = [key].pack("H*")
      end

-     raise BlindIndex::Error, "Key must use binary encoding" if key.encoding != Encoding::BINARY
-     raise BlindIndex::Error, "Key must be 32 bytes" if key.bytesize != 32
+     raise BlindIndex::Error, "#{name} must be 32 bytes (64 hex digits)" if key.bytesize != 32
+     raise BlindIndex::Error, "#{name} must use binary encoding" if key.encoding != Encoding::BINARY

      key
    end
+
+   def self.backfill(relation, columns: nil, batch_size: 1000)
+     Backfill.new(relation, columns: columns, batch_size: batch_size).perform
+   end
  end

  ActiveSupport.on_load(:active_record) do
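The key-decoding change in the hunk above can be exercised in isolation. The following is a simplified stand-in that mirrors the new check order and messages; `DemoKeyError` and `demo_decode_key` are hypothetical names used so the sketch runs without the gem.

```ruby
# Hypothetical error class standing in for the gem's own error type.
class DemoKeyError < StandardError; end

def demo_decode_key(key, name: "Key")
  # 64 hex digits decode to 32 raw bytes; pack("H*") returns a binary string
  if key.encoding != Encoding::BINARY && key =~ /\A[0-9a-f]{64}\z/i
    key = [key].pack("H*")
  end

  # length is checked first, so a short or malformed key reports the size problem
  raise DemoKeyError, "#{name} must be 32 bytes (64 hex digits)" if key.bytesize != 32
  raise DemoKeyError, "#{name} must use binary encoding" if key.encoding != Encoding::BINARY

  key
end

puts demo_decode_key("ab" * 32).bytesize

begin
  demo_decode_key("too short", name: "Master key")
rescue DemoKeyError => e
  puts e.message
end
```

With the length check first, a key that is simply too short gets a message that mentions the expected size, and the `name:` keyword lets the caller say which key was wrong.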
@@ -136,17 +141,18 @@ ActiveSupport.on_load(:active_record) do
    ActiveRecord::TableMetadata.prepend(BlindIndex::Extensions::TableMetadata)
    ActiveRecord::DynamicMatchers::Method.prepend(BlindIndex::Extensions::DynamicMatchers)

-   unless ActiveRecord::VERSION::STRING.start_with?("5.1.")
+   unless ActiveRecord::VERSION::STRING.to_f == 5.1
      ActiveRecord::Validations::UniquenessValidator.prepend(BlindIndex::Extensions::UniquenessValidator)
    end
- end

- if defined?(Mongoid)
-   # TODO find better ActiveModel hook
-   require "active_model/callbacks"
-   ActiveModel::Callbacks.include(BlindIndex::Model)
+   if ActiveRecord::VERSION::STRING.to_f >= 5.2
+     ActiveRecord::PredicateBuilder.prepend(BlindIndex::Extensions::PredicateBuilder)
+   end
+ end

+ ActiveSupport.on_load(:mongoid) do
    require "blind_index/mongoid"
+   Mongoid::Document::ClassMethods.include(BlindIndex::Model)
    Mongoid::Criteria.prepend(BlindIndex::Mongoid::Criteria)
    Mongoid::Validatable::UniquenessValidator.prepend(BlindIndex::Mongoid::UniquenessValidator)
  end
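The version checks in the hunk above rely on `String#to_f`, which reads the leading `major.minor` part of a version string and ignores everything from the second dot onward. A quick sketch with hypothetical version strings (the real value comes from `ActiveRecord::VERSION::STRING`; `demo_version_number` is an illustrative helper):

```ruby
# Convert a version string to a float the same way the hunk above does.
def demo_version_number(version)
  version.to_f # parsing stops at the second dot
end

["5.1.7", "5.2.4", "6.0.3"].each do |version|
  number = demo_version_number(version)
  puts "#{version}: is_5_1=#{number == 5.1} at_least_5_2=#{number >= 5.2}"
end
```

One caveat of this style of comparison is that a two-digit minor version such as a hypothetical `"5.10.0"` would also parse as `5.1`.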
data/lib/blind_index/backfill.rb ADDED
@@ -0,0 +1,113 @@
+ module BlindIndex
+   class Backfill
+     attr_reader :blind_indexes
+
+     def initialize(relation, batch_size:, columns:)
+       @relation = relation
+       @transaction = @relation.respond_to?(:transaction)
+       @batch_size = batch_size
+       @blind_indexes = @relation.blind_indexes
+       filter_columns!(columns) if columns
+     end
+
+     def perform
+       each_batch do |records|
+         backfill_records(records)
+       end
+     end
+
+     private
+
+     # modify in-place
+     def filter_columns!(columns)
+       columns = Array(columns).map(&:to_s)
+       blind_indexes.select! { |_, v| columns.include?(v[:bidx_attribute]) }
+       bad_columns = columns - blind_indexes.map { |_, v| v[:bidx_attribute] }
+       raise ArgumentError, "Bad column: #{bad_columns.first}" if bad_columns.any?
+     end
+
+     def build_relation
+       # build relation
+       relation = @relation
+
+       if defined?(ActiveRecord::Base) && relation.is_a?(ActiveRecord::Base)
+         relation = relation.unscoped
+       end
+
+       # convert from possible class to ActiveRecord::Relation or Mongoid::Criteria
+       relation = relation.all
+
+       attributes = blind_indexes.map { |_, v| v[:bidx_attribute] }
+
+       if defined?(ActiveRecord::Relation) && relation.is_a?(ActiveRecord::Relation)
+         base_relation = relation.unscoped
+         or_relation = relation.unscoped
+
+         attributes.each_with_index do |attribute, i|
+           or_relation =
+             if i == 0
+               base_relation.where(attribute => nil)
+             else
+               or_relation.or(base_relation.where(attribute => nil))
+             end
+         end
+
+         relation.merge(or_relation)
+       else
+         relation.merge(relation.unscoped.or(attributes.map { |a| {a => nil} }))
+       end
+     end
+
+     def each_batch
+       relation = build_relation
+
+       if relation.respond_to?(:find_in_batches)
+         relation.find_in_batches(batch_size: @batch_size) do |records|
+           yield records
+         end
+       else
+         # https://github.com/karmi/tire/blob/master/lib/tire/model/import.rb
+         # use cursor for Mongoid
+         records = []
+         relation.all.each do |record|
+           records << record
+           if records.length == @batch_size
+             yield records
+             records = []
+           end
+         end
+         yield records if records.any?
+       end
+     end
+
+     def backfill_records(records)
+       # do expensive blind index computation outside of transaction
+       records.each do |record|
+         blind_indexes.each do |k, v|
+           record.send("compute_#{k}_bidx") if !record.send(v[:bidx_attribute])
+         end
+       end
+
+       # don't need to save records that went from nil => nil
+       records.select! { |r| r.changed? }
+
+       if records.any?
+         with_transaction do
+           records.each do |record|
+             record.save!(validate: false)
+           end
+         end
+       end
+     end
+
+     def with_transaction
+       if @transaction
+         @relation.transaction do
+           yield
+         end
+       else
+         yield
+       end
+     end
+   end
+ end
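The cursor-based fallback in `each_batch` above (the branch taken when the relation has no `find_in_batches`) is a plain accumulate-and-flush loop. A minimal sketch of that pattern over any enumerable, using the hypothetical helper name `demo_each_batch`:

```ruby
# Yield records from an enumerable cursor in fixed-size batches.
def demo_each_batch(cursor, batch_size:)
  records = []
  cursor.each do |record|
    records << record
    if records.length == batch_size
      yield records
      records = [] # start a fresh array so yielded batches are not mutated later
    end
  end
  # flush the final, possibly smaller, batch
  yield records if records.any?
end

batches = []
demo_each_batch(1..5, batch_size: 2) { |batch| batches << batch }
p batches
```

Only one batch is held in memory at a time, which suits a streaming cursor; for an in-memory array, `each_slice` from Ruby's `Enumerable` gives the same grouping.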
data/lib/blind_index/extensions.rb CHANGED
@@ -1,22 +1,24 @@
  module BlindIndex
    module Extensions
      module TableMetadata
-       def resolve_column_aliases(hash)
-         new_hash = super
-         if has_blind_indexes?
-           hash.each do |key, _|
-             if key.respond_to?(:to_sym) && (bi = klass.blind_indexes[key.to_sym]) && !new_hash[key].is_a?(ActiveRecord::StatementCache::Substitute)
-               value = new_hash.delete(key)
-               new_hash[bi[:bidx_attribute]] =
-                 if value.is_a?(Array)
-                   value.map { |v| BlindIndex.generate_bidx(v, **bi) }
-                 else
-                   BlindIndex.generate_bidx(value, **bi)
-                 end
+       if ActiveRecord::VERSION::STRING.to_f < 5.2
+         def resolve_column_aliases(hash)
+           new_hash = super
+           if has_blind_indexes?
+             hash.each_key do |key|
+               if key.respond_to?(:to_sym) && (bi = klass.blind_indexes[key.to_sym]) && !new_hash[key].is_a?(ActiveRecord::StatementCache::Substitute)
+                 value = new_hash.delete(key)
+                 new_hash[bi[:bidx_attribute]] =
+                   if value.is_a?(Array)
+                     value.map { |v| BlindIndex.generate_bidx(v, **bi) }
+                   else
+                     BlindIndex.generate_bidx(value, **bi)
+                   end
+               end
              end
            end
+           new_hash
          end
-         new_hash
        end

        # memoize for performance
@@ -28,11 +30,36 @@ module BlindIndex
        end
      end

+     module PredicateBuilder
+       # https://github.com/rails/rails/commit/56f30962b84fc53b76001301fb830c1594fd377e
+       def build(attribute, value, *args)
+         if table.has_blind_indexes? && (bi = table.send(:klass).blind_indexes[attribute.name.to_sym]) && !value.is_a?(ActiveRecord::StatementCache::Substitute)
+           attribute = attribute.relation[bi[:bidx_attribute]]
+           value =
+             if value.is_a?(Array)
+               value.map { |v| BlindIndex.generate_bidx(v, **bi) }
+             else
+               BlindIndex.generate_bidx(value, **bi)
+             end
+         end
+
+         super(attribute, value, *args)
+       end
+     end
+
      module UniquenessValidator
-       if ActiveRecord::VERSION::STRING >= "5.2"
+       def validate_each(record, attribute, value)
+         klass = record.class
+         if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
+           value = record.read_attribute_for_validation(bi[:bidx_attribute])
+         end
+         super(record, attribute, value)
+       end
+
+       # change attribute name here instead of validate_each for better error message
+       if ActiveRecord::VERSION::STRING.to_f >= 5.2
          def build_relation(klass, attribute, value)
            if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
-             value = BlindIndex.generate_bidx(value, **bi)
              attribute = bi[:bidx_attribute]
            end
            super(klass, attribute, value)
@@ -40,7 +67,6 @@ module BlindIndex
        else
          def build_relation(klass, table, attribute, value)
            if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
-             value = BlindIndex.generate_bidx(value, **bi)
              attribute = bi[:bidx_attribute]
            end
            super(klass, table, attribute, value)
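The extension modules in this file all follow the same shape: a module is prepended to an existing class, rewrites the attribute name and value, then calls `super` so the original method runs with the substituted arguments. A self-contained sketch of that pattern, with hypothetical classes (`DemoBuilder`, `DemoBlindIndexBuilder`) and a placeholder transform instead of a real keyed hash:

```ruby
# Hypothetical stand-in for a query builder class.
class DemoBuilder
  def build(attribute, value)
    "#{attribute} = #{value.inspect}"
  end
end

# Prepended module: swaps in the blind index column and transformed value,
# then defers to the original implementation via super.
module DemoBlindIndexBuilder
  INDEXED = { email: "email_bidx" }.freeze

  def build(attribute, value)
    if (bidx_attribute = INDEXED[attribute.to_sym])
      attribute = bidx_attribute
      value = value.is_a?(Array) ? value.map { |v| demo_hash(v) } : demo_hash(value)
    end
    super(attribute, value)
  end

  private

  # Placeholder transform for illustration; NOT a real keyed hash.
  def demo_hash(value)
    "hashed(#{value})"
  end
end

DemoBuilder.prepend(DemoBlindIndexBuilder)

builder = DemoBuilder.new
puts builder.build(:email, "test@example.org")
puts builder.build(:name, "Test")
```

Because the module sits in front of the class in the method lookup chain, attributes without a blind index pass through unchanged, and the original method never needs to know about the substitution.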
data/lib/blind_index/key_generator.rb CHANGED
@@ -11,7 +11,7 @@ module BlindIndex
        raise ArgumentError, "Missing field for key generation" if bidx_attribute.to_s.empty?

        c = "\x7E"*32
-       root_key = hkdf(BlindIndex.decode_key(@master_key), salt: table.to_s, info: "#{c}#{bidx_attribute}", length: 32, hash: "sha384")
+       root_key = hkdf(BlindIndex.decode_key(@master_key, name: "Master key"), salt: table.to_s, info: "#{c}#{bidx_attribute}", length: 32, hash: "sha384")
        hash_hmac("sha256", pack([table, bidx_attribute, bidx_attribute]), root_key)
      end

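The `root_key` line above derives a separate key per table and column from one master key. The sketch below reproduces only that derivation step, assuming `OpenSSL::KDF.hkdf` from Ruby's `openssl` library (available with OpenSSL 1.1.0 or later) as a stand-in for the gem's own `hkdf` helper, whose implementation is not shown in this diff. `demo_root_key` and the all-zero master key are hypothetical, and the final `hash_hmac` step is left out.

```ruby
require "openssl"

# Hypothetical 32-byte master key for illustration only.
DEMO_MASTER_KEY = ["00" * 32].pack("H*")

# Derive a per-table, per-column key; the salt/info layout follows the hunk above.
def demo_root_key(master_key, table:, bidx_attribute:)
  c = "\x7E" * 32
  OpenSSL::KDF.hkdf(
    master_key,
    salt: table.to_s,
    info: "#{c}#{bidx_attribute}",
    length: 32,
    hash: "sha384"
  )
end

users_key = demo_root_key(DEMO_MASTER_KEY, table: "users", bidx_attribute: "email_bidx")
orders_key = demo_root_key(DEMO_MASTER_KEY, table: "orders", bidx_attribute: "email_bidx")

puts users_key.bytesize
puts users_key == orders_key
```

Deriving keys this way means the same plaintext stored in two different tables or columns does not produce the same blind index value.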
data/lib/blind_index/model.rb CHANGED
@@ -10,7 +10,7 @@ module BlindIndex
          # check here so we validate rotate options as well
          unknown_keywords = options.keys - [:algorithm, :attribute, :bidx_attribute,
            :callback, :cost, :encode, :expression, :insecure_key, :iterations, :key,
-           :legacy, :master_key, :size, :slow]
+           :legacy, :master_key, :size, :slow, :version]
          raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?

          attribute = options[:attribute] || name
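The allow-list check above explains the 2.1.1 changelog entry: until `:version` was added to the list, passing it was rejected as an unknown keyword. A small sketch of the same check, with a shortened, hypothetical allow-list (`DEMO_ALLOWED_OPTIONS`) and helper name (`demo_check_options!`):

```ruby
# Hypothetical, shortened allow-list for illustration.
DEMO_ALLOWED_OPTIONS = [:key, :expression, :version].freeze

# Raise if any option key is not in the allow-list; otherwise return the options.
def demo_check_options!(options)
  unknown_keywords = options.keys - DEMO_ALLOWED_OPTIONS
  raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?
  options
end

demo_check_options!(version: 2) # accepted

begin
  demo_check_options!(versoin: 2, colour: "red")
rescue ArgumentError => e
  puts e.message
end
```

Array subtraction keeps the order of the offending keys, so the error message lists them in the order the caller wrote them.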
@@ -65,7 +65,7 @@ module BlindIndex
            end

            define_method method_name do
-             self.send("#{bidx_attribute}=", self.class.send(class_method_name, send(attribute)))
+             send("#{bidx_attribute}=", self.class.send(class_method_name, send(attribute)))
            end

            if callback
data/lib/blind_index/mongoid.rb CHANGED
@@ -26,9 +26,9 @@ module BlindIndex

                criterion[bidx_key] =
                  if value.is_a?(Array)
-                   value.map { |v| BlindIndex.generate_bidx(v, bi) }
+                   value.map { |v| BlindIndex.generate_bidx(v, **bi) }
                  else
-                   BlindIndex.generate_bidx(value, bi)
+                   BlindIndex.generate_bidx(value, **bi)
                  end
            end
          end
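The switch from `bi` to `**bi` above lines up with the changelog entry about Ruby 2.7 deprecation warnings with Mongoid: passing a hash positionally to a method that expects keyword arguments warns on Ruby 2.7 and raises `ArgumentError` on Ruby 3.0 and later. A sketch with a hypothetical method (`demo_generate`) standing in for the generator:

```ruby
# Hypothetical method with keyword parameters.
def demo_generate(value, key:, expression: nil)
  value = expression.call(value) if expression
  "#{key}:#{value}"
end

options = { key: "k1", expression: ->(v) { v.downcase } }

# The double splat turns the hash into keyword arguments explicitly.
puts demo_generate("Test@Example.org", **options)
```

The explicit double splat behaves the same on older and newer Ruby versions, which makes it the safer form when a stored options hash is forwarded to a keyword-taking method.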
@@ -39,9 +39,18 @@ module BlindIndex
      end

      module UniquenessValidator
+       def validate_each(record, attribute, value)
+         klass = record.class
+         if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
+           value = record.read_attribute_for_validation(bi[:bidx_attribute])
+         end
+         super(record, attribute, value)
+       end
+
+       # change attribute name here instead of validate_each for better error message
        def create_criteria(base, document, attribute, value)
-         if base.respond_to?(:blind_indexes) && (bi = base.blind_indexes[attribute])
-           value = BlindIndex.generate_bidx(value, bi)
+         klass = document.class
+         if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
            attribute = bi[:bidx_attribute]
          end
          super(base, document, attribute, value)
data/lib/blind_index/version.rb CHANGED
@@ -1,3 +1,3 @@
  module BlindIndex
-   VERSION = "2.0.0"
+   VERSION = "2.2.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: blind_index
  version: !ruby/object:Gem::Version
-   version: 2.0.0
+   version: 2.2.0
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-02-10 00:00:00.000000000 Z
+ date: 2020-09-08 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activesupport
@@ -174,6 +174,7 @@ files:
  - LICENSE.txt
  - README.md
  - lib/blind_index.rb
+ - lib/blind_index/backfill.rb
  - lib/blind_index/extensions.rb
  - lib/blind_index/key_generator.rb
  - lib/blind_index/model.rb