blind_index 2.0.0 → 2.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: f656fa1765df9bf2bcfa9d994ccb9b5cf0504f27f7524d159323126462d973e0
-   data.tar.gz: 88d5f2cd786f840e75540a204dbe7fbb74794de954978d90cdc69297078cf752
+   metadata.gz: 80c561a0a96707de1176ae314d4d884e03390de9b8c4e23049d3649d7576937e
+   data.tar.gz: ea819c68b4d1a44c492799225d3250a86186cca1a355a3ab2e65a02069ae4062
  SHA512:
-   metadata.gz: 139d3d8f3aca413d0ee5045fe3212e6ed3327cdb6d0c60cb7eb2b314b4eb849abb42dcf17e386e23fb5521db2875b6c502b143fc46dc3305ad38151688b610be
-   data.tar.gz: 0f5df7f99a1b79f2eab0bc9ff959b9fd21c238e772e09265ef88690678134f539968d9aebe508b88e9881657563013cd53031947d6a6322e08c7e29d95f45902
+   metadata.gz: 44835258443127734b6940287e1768935884c86b53d90bb5c39a9a5db372b937649f81e5a6f16d9e4605cb78b19824759b019461491176cb4b2e0bfd1330858d
+   data.tar.gz: ddeeb0f625335d49e86ab0a5ff2350a7ae8c5b74ffd0322d7504763818b05ee6c648ea0d85b395c2f197266adfcb18015b37e556d6cbee04aea32f943d0123cd
@@ -1,4 +1,26 @@
- ## 2.0.0 (2019-02-10)
+ ## 2.2.0 (2020-09-07)
+
+ - Added support for `where` with table in Active Record 5.2+
+
+ ## 2.1.1 (2020-08-14)
+
+ - Fixed `version` option
+
+ ## 2.1.0 (2020-07-06)
+
+ - Improved performance of uniqueness validations
+ - Fixed deprecation warnings in Ruby 2.7 with Mongoid
+
+ ## 2.0.2 (2020-06-01)
+
+ - Improved error message for bad key length
+ - Fixed `backfill` method with relations for Mongoid
+
+ ## 2.0.1 (2020-02-14)
+
+ - Added `BlindIndex.backfill` method
+
+ ## 2.0.0 (2020-02-10)

  - Blind indexes are updated immediately instead of in a `before_validation` callback
  - Better Lockbox integration - no need to generate a separate key
data/README.md CHANGED
@@ -10,7 +10,7 @@ Learn more about [securing sensitive data in Rails](https://ankane.org/sensitive

  ## How It Works

- We use [this approach](https://paragonie.com/blog/2017/05/building-searchable-encrypted-databases-with-php-and-sql) by Scott Arciszewski. To summarize, we compute a keyed hash of the sensitive data and store it in a column. To query, we apply the keyed hash function to the value we’re searching and then perform a database search. This results in performant queries for exact matches. `LIKE` queries are not possible, but you can index expressions.
+ We use [this approach](https://paragonie.com/blog/2017/05/building-searchable-encrypted-databases-with-php-and-sql) by Scott Arciszewski. To summarize, we compute a keyed hash of the sensitive data and store it in a column. To query, we apply the keyed hash function to the value we’re searching and then perform a database search. This results in performant queries for exact matches. Efficient `LIKE` queries are [not possible](#like-ilike-and-full-text-searching), but you can index expressions.

  ## Leakage

@@ -26,13 +26,13 @@ Add this line to your application’s Gemfile:
  gem 'blind_index'
  ```

- ## Getting Started
+ ## Prep

  Your model should already be set up with Lockbox or attr_encrypted. The examples are for a `User` model with `encrypts :email` or `attr_encrypted :email`. See the full examples for [Lockbox](https://ankane.org/securing-user-emails-lockbox) and [attr_encrypted](https://ankane.org/securing-user-emails-in-rails) if needed.

  Also, if you use attr_encrypted, [generate a key](#key-generation).

- ---
+ ## Getting Started

  Create a migration to add a column for the blind index

@@ -60,10 +60,7 @@ end
  Backfill existing records

  ```ruby
- User.unscoped.where(email_bidx: nil).find_each do |user|
-   user.compute_email_bidx
-   user.save(validate: false)
- end
+ BlindIndex.backfill(User)
  ```

  And query away
@@ -72,9 +69,19 @@ And query away
  User.where(email: "test@example.org")
  ```

+ ## Expressions
+
+ You can apply expressions to attributes before indexing and searching. This gives you the the ability to perform case-insensitive searches and more.
+
+ ```ruby
+ class User < ApplicationRecord
+   blind_index :email, expression: ->(v) { v.downcase }
+ end
+ ```
+
  ## Validations

- To prevent duplicates, use:
+ You can use blind indexes for uniqueness validations.

  ```ruby
  class User < ApplicationRecord
@@ -82,15 +89,27 @@ class User < ApplicationRecord
  end
  ```

- We also recommend adding a unique index to the blind index column through a database migration.
+ We recommend adding a unique index to the blind index column through a database migration.

- ## Expressions
+ ```ruby
+ add_index :users, :email_bidx, unique: true
+ ```

- You can apply expressions to attributes before indexing and searching. This gives you the the ability to perform case-insensitive searches and more.
+ For `allow_blank: true`, use:
+
+ ```ruby
+ class User < ApplicationRecord
+   blind_index :email, expression: ->(v) { v.presence }
+   validates :email, uniqueness: {allow_blank: true}
+ end
+ ```
+
+ For `case_sensitive: false`, use:

  ```ruby
  class User < ApplicationRecord
    blind_index :email, expression: ->(v) { v.downcase }
+   validates :email, uniqueness: true # for best performance, leave out {case_sensitive: false}
  end
  ```

@@ -115,10 +134,7 @@ end
  Backfill existing records

  ```ruby
- User.unscoped.where(email_ci_bidx: nil).find_each do |user|
-   user.compute_email_ci_bidx
-   user.save(validate: false)
- end
+ BlindIndex.backfill(User, columns: [:email_ci_bidx])
  ```

  And query away
@@ -168,10 +184,7 @@ end
  This allows you to backfill records while still querying the unencrypted field.

  ```ruby
- User.unscoped.where(email_bidx: nil).find_each do |user|
-   user.compute_migrated_email_bidx
-   user.save(validate: false)
- end
+ BlindIndex.backfill(User)
  ```

  Once that completes, you can remove the `migrating` option.
@@ -196,10 +209,7 @@ end
  This will keep the new column synced going forward. Next, backfill the data:

  ```ruby
- User.unscoped.where(email_bidx_v2: nil).find_each do |user|
-   user.compute_rotated_email_bidx
-   user.save(validate: false)
- end
+ BlindIndex.backfill(User, columns: [:email_bidx_v2])
  ```

  Then update your model
@@ -279,6 +289,30 @@ or create `config/initializers/blind_index.rb` with something like
  BlindIndex.master_key = Rails.application.credentials.blind_index_master_key
  ```

+ ## LIKE, ILIKE, and Full-Text Searching
+
+ Unfortunately, blind indexes can’t be used for `LIKE`, `ILIKE`, or full-text searching. Instead, records must be loaded, decrypted, and searched in memory.
+
+ For `LIKE`, use:
+
+ ```ruby
+ User.select { |u| u.email.include?("value") }
+ ```
+
+ For `ILIKE`, use:
+
+ ```ruby
+ User.select { |u| u.email =~ /value/i }
+ ```
+
+ For full-text or fuzzy searching, use a gem like [FuzzyMatch](https://github.com/seamusabshere/fuzzy_match):
+
+ ```ruby
+ FuzzyMatch.new(User.all, read: :email).find("value")
+ ```
+
+ If the number of records is large, try to find a way to narrow it down. An [expression index](#expressions) is one way to do this, but leaks which records have the same value of the expression, so use it carefully.
+
  ## Reference

  Set default options in an initializer with:
@@ -448,3 +482,5 @@ cd blind_index
  bundle install
  bundle exec rake test
  ```
+
+ For security issues, send an email to the address on [this page](https://github.com/ankane).
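
As a rough illustration of the keyed-hash approach and the `expression` option described in the README changes above, the following standalone sketch uses HMAC-SHA256 from Ruby's OpenSSL bindings. The helper `sketch_bidx` and its parameters are hypothetical and not part of blind_index; the gem derives its own per-column keys and hashing settings, so this only shows why exact-match lookups work and why an expression such as `downcase` makes them case-insensitive.

```ruby
require "openssl"

# Hypothetical helper for illustration only; not the blind_index API.
# Applies the optional expression, then computes a keyed hash of the result.
def sketch_bidx(value, key:, expression: nil)
  value = expression.call(value) if expression
  return nil if value.nil?

  digest = OpenSSL::HMAC.digest("SHA256", key, value.to_s)
  [digest].pack("m0") # Base64 without newlines, suitable for a string column
end

key = OpenSSL::Random.random_bytes(32) # assumed 32-byte secret key
downcase = ->(v) { v.downcase }

# Value stored in the index column when the record is written
stored = sketch_bidx("Test@Example.org", key: key, expression: downcase)

# Value computed at query time; the database compares it with the stored hash
query = sketch_bidx("test@example.org", key: key, expression: downcase)

puts stored == query
```

Because only the hash is compared, a query can match whole (transformed) values but not substrings, which is why `LIKE` style searches fall back to loading and decrypting records.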
@@ -4,6 +4,7 @@ require "openssl"
  require "argon2/kdf"

  # modules
+ require "blind_index/backfill"
  require "blind_index/key_generator"
  require "blind_index/model"
  require "blind_index/version"
@@ -116,17 +117,21 @@ module BlindIndex
      key
    end

-   def self.decode_key(key)
+   def self.decode_key(key, name: "Key")
      # decode hex key
      if key.encoding != Encoding::BINARY && key =~ /\A[0-9a-f]{64}\z/i
        key = [key].pack("H*")
      end

-     raise BlindIndex::Error, "Key must use binary encoding" if key.encoding != Encoding::BINARY
-     raise BlindIndex::Error, "Key must be 32 bytes" if key.bytesize != 32
+     raise BlindIndex::Error, "#{name} must be 32 bytes (64 hex digits)" if key.bytesize != 32
+     raise BlindIndex::Error, "#{name} must use binary encoding" if key.encoding != Encoding::BINARY

      key
    end
+
+   def self.backfill(relation, columns: nil, batch_size: 1000)
+     Backfill.new(relation, columns: columns, batch_size: batch_size).perform
+   end
  end

  ActiveSupport.on_load(:active_record) do
@@ -136,17 +141,18 @@ ActiveSupport.on_load(:active_record) do
    ActiveRecord::TableMetadata.prepend(BlindIndex::Extensions::TableMetadata)
    ActiveRecord::DynamicMatchers::Method.prepend(BlindIndex::Extensions::DynamicMatchers)

-   unless ActiveRecord::VERSION::STRING.start_with?("5.1.")
+   unless ActiveRecord::VERSION::STRING.to_f == 5.1
      ActiveRecord::Validations::UniquenessValidator.prepend(BlindIndex::Extensions::UniquenessValidator)
    end
- end

- if defined?(Mongoid)
-   # TODO find better ActiveModel hook
-   require "active_model/callbacks"
-   ActiveModel::Callbacks.include(BlindIndex::Model)
+   if ActiveRecord::VERSION::STRING.to_f >= 5.2
+     ActiveRecord::PredicateBuilder.prepend(BlindIndex::Extensions::PredicateBuilder)
+   end
+ end

+ ActiveSupport.on_load(:mongoid) do
    require "blind_index/mongoid"
+   Mongoid::Document::ClassMethods.include(BlindIndex::Model)
    Mongoid::Criteria.prepend(BlindIndex::Mongoid::Criteria)
    Mongoid::Validatable::UniquenessValidator.prepend(BlindIndex::Mongoid::UniquenessValidator)
  end
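
The reordered checks in `decode_key` above can be tried in isolation. The sketch below copies the logic from the diff into a standalone method; `decode_key_sketch` and `SketchError` are hypothetical names standing in for `BlindIndex.decode_key` and `BlindIndex::Error`. It suggests why checking the length first gives a clearer message: a string that is simply too short now reports its size (with the caller-supplied name) instead of an encoding complaint.

```ruby
# Stand-in for BlindIndex::Error so the sketch runs without the gem.
class SketchError < StandardError; end

# Same checks, in the same order, as the new decode_key in the diff.
def decode_key_sketch(key, name: "Key")
  # decode hex key (64 hex digits become 32 binary bytes)
  if key.encoding != Encoding::BINARY && key =~ /\A[0-9a-f]{64}\z/i
    key = [key].pack("H*")
  end

  raise SketchError, "#{name} must be 32 bytes (64 hex digits)" if key.bytesize != 32
  raise SketchError, "#{name} must use binary encoding" if key.encoding != Encoding::BINARY

  key
end

# A 64-digit hex string is accepted and decoded to binary.
decoded = decode_key_sketch("0f" * 32)

# A key of the wrong length reports the length problem first.
message =
  begin
    decode_key_sketch("too short", name: "Master key")
  rescue SketchError => e
    e.message
  end

puts decoded.bytesize
puts message
```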
@@ -0,0 +1,113 @@
+ module BlindIndex
+   class Backfill
+     attr_reader :blind_indexes
+
+     def initialize(relation, batch_size:, columns:)
+       @relation = relation
+       @transaction = @relation.respond_to?(:transaction)
+       @batch_size = batch_size
+       @blind_indexes = @relation.blind_indexes
+       filter_columns!(columns) if columns
+     end
+
+     def perform
+       each_batch do |records|
+         backfill_records(records)
+       end
+     end
+
+     private
+
+     # modify in-place
+     def filter_columns!(columns)
+       columns = Array(columns).map(&:to_s)
+       blind_indexes.select! { |_, v| columns.include?(v[:bidx_attribute]) }
+       bad_columns = columns - blind_indexes.map { |_, v| v[:bidx_attribute] }
+       raise ArgumentError, "Bad column: #{bad_columns.first}" if bad_columns.any?
+     end
+
+     def build_relation
+       # build relation
+       relation = @relation
+
+       if defined?(ActiveRecord::Base) && relation.is_a?(ActiveRecord::Base)
+         relation = relation.unscoped
+       end
+
+       # convert from possible class to ActiveRecord::Relation or Mongoid::Criteria
+       relation = relation.all
+
+       attributes = blind_indexes.map { |_, v| v[:bidx_attribute] }
+
+       if defined?(ActiveRecord::Relation) && relation.is_a?(ActiveRecord::Relation)
+         base_relation = relation.unscoped
+         or_relation = relation.unscoped
+
+         attributes.each_with_index do |attribute, i|
+           or_relation =
+             if i == 0
+               base_relation.where(attribute => nil)
+             else
+               or_relation.or(base_relation.where(attribute => nil))
+             end
+         end
+
+         relation.merge(or_relation)
+       else
+         relation.merge(relation.unscoped.or(attributes.map { |a| {a => nil} }))
+       end
+     end
+
+     def each_batch
+       relation = build_relation
+
+       if relation.respond_to?(:find_in_batches)
+         relation.find_in_batches(batch_size: @batch_size) do |records|
+           yield records
+         end
+       else
+         # https://github.com/karmi/tire/blob/master/lib/tire/model/import.rb
+         # use cursor for Mongoid
+         records = []
+         relation.all.each do |record|
+           records << record
+           if records.length == @batch_size
+             yield records
+             records = []
+           end
+         end
+         yield records if records.any?
+       end
+     end
+
+     def backfill_records(records)
+       # do expensive blind index computation outside of transaction
+       records.each do |record|
+         blind_indexes.each do |k, v|
+           record.send("compute_#{k}_bidx") if !record.send(v[:bidx_attribute])
+         end
+       end
+
+       # don't need to save records that went from nil => nil
+       records.select! { |r| r.changed? }
+
+       if records.any?
+         with_transaction do
+           records.each do |record|
+             record.save!(validate: false)
+           end
+         end
+       end
+     end
+
+     def with_transaction
+       if @transaction
+         @relation.transaction do
+           yield
+         end
+       else
+         yield
+       end
+     end
+   end
+ end
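
The cursor branch of `each_batch` above (used when the relation has no `find_in_batches`) is a manual batching loop. A minimal sketch of the same loop over a plain Ruby enumerable follows; `each_batch_sketch` is a hypothetical name, and the range stands in for a stream of records.

```ruby
# Hypothetical stand-in for the fallback branch of Backfill#each_batch:
# collect records into an array and yield it whenever it reaches batch_size.
def each_batch_sketch(enumerable, batch_size)
  records = []
  enumerable.each do |record|
    records << record
    if records.length == batch_size
      yield records
      records = []
    end
  end
  # yield the final, possibly smaller, batch
  yield records if records.any?
end

batches = []
each_batch_sketch(1..7, 3) { |batch| batches << batch }

p batches # => [[1, 2, 3], [4, 5, 6], [7]]
```

For an in-memory enumerable this produces the same batches as `each_slice(3)`. Yielding arrays of a fixed size is what lets `backfill_records` compute the blind indexes first and then save each batch inside a single transaction.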
@@ -1,22 +1,24 @@
  module BlindIndex
    module Extensions
      module TableMetadata
-       def resolve_column_aliases(hash)
-         new_hash = super
-         if has_blind_indexes?
-           hash.each do |key, _|
-             if key.respond_to?(:to_sym) && (bi = klass.blind_indexes[key.to_sym]) && !new_hash[key].is_a?(ActiveRecord::StatementCache::Substitute)
-               value = new_hash.delete(key)
-               new_hash[bi[:bidx_attribute]] =
-                 if value.is_a?(Array)
-                   value.map { |v| BlindIndex.generate_bidx(v, **bi) }
-                 else
-                   BlindIndex.generate_bidx(value, **bi)
-                 end
+       if ActiveRecord::VERSION::STRING.to_f < 5.2
+         def resolve_column_aliases(hash)
+           new_hash = super
+           if has_blind_indexes?
+             hash.each_key do |key|
+               if key.respond_to?(:to_sym) && (bi = klass.blind_indexes[key.to_sym]) && !new_hash[key].is_a?(ActiveRecord::StatementCache::Substitute)
+                 value = new_hash.delete(key)
+                 new_hash[bi[:bidx_attribute]] =
+                   if value.is_a?(Array)
+                     value.map { |v| BlindIndex.generate_bidx(v, **bi) }
+                   else
+                     BlindIndex.generate_bidx(value, **bi)
+                   end
+               end
              end
            end
+           new_hash
          end
-         new_hash
        end

        # memoize for performance
@@ -28,11 +30,36 @@ module BlindIndex
        end
      end

+     module PredicateBuilder
+       # https://github.com/rails/rails/commit/56f30962b84fc53b76001301fb830c1594fd377e
+       def build(attribute, value, *args)
+         if table.has_blind_indexes? && (bi = table.send(:klass).blind_indexes[attribute.name.to_sym]) && !value.is_a?(ActiveRecord::StatementCache::Substitute)
+           attribute = attribute.relation[bi[:bidx_attribute]]
+           value =
+             if value.is_a?(Array)
+               value.map { |v| BlindIndex.generate_bidx(v, **bi) }
+             else
+               BlindIndex.generate_bidx(value, **bi)
+             end
+         end
+
+         super(attribute, value, *args)
+       end
+     end
+
      module UniquenessValidator
-       if ActiveRecord::VERSION::STRING >= "5.2"
+       def validate_each(record, attribute, value)
+         klass = record.class
+         if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
+           value = record.read_attribute_for_validation(bi[:bidx_attribute])
+         end
+         super(record, attribute, value)
+       end
+
+       # change attribute name here instead of validate_each for better error message
+       if ActiveRecord::VERSION::STRING.to_f >= 5.2
          def build_relation(klass, attribute, value)
            if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
-             value = BlindIndex.generate_bidx(value, **bi)
              attribute = bi[:bidx_attribute]
            end
            super(klass, attribute, value)
@@ -40,7 +67,6 @@ module BlindIndex
        else
          def build_relation(klass, table, attribute, value)
            if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
-             value = BlindIndex.generate_bidx(value, **bi)
              attribute = bi[:bidx_attribute]
            end
            super(klass, table, attribute, value)
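
Both `resolve_column_aliases` and the new `PredicateBuilder#build` above follow the same pattern: a condition on a blind-indexed attribute is swapped for a condition on its index column, with each value replaced by its hash. The sketch below shows that pattern on a plain hash of conditions, outside Active Record. `rewrite_conditions`, the `blind_indexes` registry, and the `hasher` lambda are all hypothetical; the lambda is only a placeholder for `BlindIndex.generate_bidx`.

```ruby
require "openssl"

# Hypothetical sketch: rewrite {email: value} into {"email_bidx" => hash},
# leaving conditions on attributes without a blind index untouched.
def rewrite_conditions(conditions, blind_indexes, hasher)
  conditions.each_with_object({}) do |(key, value), result|
    bi = key.respond_to?(:to_sym) ? blind_indexes[key.to_sym] : nil
    if bi
      result[bi[:bidx_attribute]] =
        if value.is_a?(Array)
          value.map { |v| hasher.call(v) } # IN (...) style conditions
        else
          hasher.call(value)
        end
    else
      result[key] = value
    end
  end
end

# Assumed registry shape, modeled on the blind_indexes hash used in the diff.
blind_indexes = {email: {bidx_attribute: "email_bidx"}}

# Placeholder keyed hash; the gem's own hashing is configured differently.
secret = OpenSSL::Random.random_bytes(32)
hasher = ->(v) { OpenSSL::HMAC.hexdigest("SHA256", secret, v.to_s) }

rewritten = rewrite_conditions(
  {email: ["a@example.org", "b@example.org"], active: true},
  blind_indexes,
  hasher
)

p rewritten.keys # => ["email_bidx", :active]
```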
@@ -11,7 +11,7 @@ module BlindIndex
        raise ArgumentError, "Missing field for key generation" if bidx_attribute.to_s.empty?

        c = "\x7E"*32
-       root_key = hkdf(BlindIndex.decode_key(@master_key), salt: table.to_s, info: "#{c}#{bidx_attribute}", length: 32, hash: "sha384")
+       root_key = hkdf(BlindIndex.decode_key(@master_key, name: "Master key"), salt: table.to_s, info: "#{c}#{bidx_attribute}", length: 32, hash: "sha384")
        hash_hmac("sha256", pack([table, bidx_attribute, bidx_attribute]), root_key)
      end

@@ -10,7 +10,7 @@ module BlindIndex
          # check here so we validate rotate options as well
          unknown_keywords = options.keys - [:algorithm, :attribute, :bidx_attribute,
            :callback, :cost, :encode, :expression, :insecure_key, :iterations, :key,
-           :legacy, :master_key, :size, :slow]
+           :legacy, :master_key, :size, :slow, :version]
          raise ArgumentError, "unknown keywords: #{unknown_keywords.join(", ")}" if unknown_keywords.any?

          attribute = options[:attribute] || name
@@ -65,7 +65,7 @@ module BlindIndex
            end

            define_method method_name do
-             self.send("#{bidx_attribute}=", self.class.send(class_method_name, send(attribute)))
+             send("#{bidx_attribute}=", self.class.send(class_method_name, send(attribute)))
            end

            if callback
@@ -26,9 +26,9 @@ module BlindIndex

                criterion[bidx_key] =
                  if value.is_a?(Array)
-                   value.map { |v| BlindIndex.generate_bidx(v, bi) }
+                   value.map { |v| BlindIndex.generate_bidx(v, **bi) }
                  else
-                   BlindIndex.generate_bidx(value, bi)
+                   BlindIndex.generate_bidx(value, **bi)
                  end
              end
            end
@@ -39,9 +39,18 @@ module BlindIndex
      end

      module UniquenessValidator
+       def validate_each(record, attribute, value)
+         klass = record.class
+         if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
+           value = record.read_attribute_for_validation(bi[:bidx_attribute])
+         end
+         super(record, attribute, value)
+       end
+
+       # change attribute name here instead of validate_each for better error message
        def create_criteria(base, document, attribute, value)
-         if base.respond_to?(:blind_indexes) && (bi = base.blind_indexes[attribute])
-           value = BlindIndex.generate_bidx(value, bi)
+         klass = document.class
+         if klass.respond_to?(:blind_indexes) && (bi = klass.blind_indexes[attribute])
            attribute = bi[:bidx_attribute]
          end
          super(base, document, attribute, value)
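
The switch from `generate_bidx(v, bi)` to `generate_bidx(v, **bi)` in the Mongoid hunks above matches the changelog entry about Ruby 2.7 deprecation warnings. A small sketch of the underlying language behavior follows; `generate_sketch` is a hypothetical method with keyword parameters, not the real `BlindIndex.generate_bidx` signature.

```ruby
# Hypothetical method that takes its options as keyword arguments.
def generate_sketch(value, key:, encode: true)
  "#{value}:#{key}:#{encode}"
end

# Options stored as a hash, as the blind index settings are in the diff.
bi = {key: "k", encode: false}

# The double splat passes the hash entries explicitly as keywords.
# Passing the hash positionally, generate_sketch("v", bi), relies on implicit
# conversion: Ruby 2.7 warns about it and Ruby 3.0+ raises ArgumentError.
result = generate_sketch("v", **bi)

puts result # => v:k:false
```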
@@ -1,3 +1,3 @@
  module BlindIndex
-   VERSION = "2.0.0"
+   VERSION = "2.2.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: blind_index
  version: !ruby/object:Gem::Version
-   version: 2.0.0
+   version: 2.2.0
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-02-10 00:00:00.000000000 Z
+ date: 2020-09-08 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activesupport
@@ -174,6 +174,7 @@ files:
  - LICENSE.txt
  - README.md
  - lib/blind_index.rb
+ - lib/blind_index/backfill.rb
  - lib/blind_index/extensions.rb
  - lib/blind_index/key_generator.rb
  - lib/blind_index/model.rb