disco 0.3.2 → 0.4.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 645a5f169c78e36ee6394a71ba61ac611cb333176adff84812c8a25549e2ea28
- data.tar.gz: 68caa44554bff09a39a68522ad5c8008164840186c08e0fdbad8f9e465855c89
+ metadata.gz: eb58ae5c2579ef03fc0089f9f9e0421cd38bd8996472bf04ef9383b601759e5a
+ data.tar.gz: 9bb8c5aa186ab1e6364d0ed446dc61e69a0f0ee730cee4c8367586dac69b666a
  SHA512:
- metadata.gz: 104a016c693c0256cae13d4545df4112538a1f7a66c8790c38a1b434f144c8d2fb173d2428bbc9168e2581e0e9399d761083500fca69919c07f86b2f1e6582ee
- data.tar.gz: 3d06f45fdcf63ea26fa48c18ec5304d22f3306da5bb3fa0a2bcd107455002dbfda56e0df22f61270c12374cb693ad6da2d747bd363c0ec001048f23690b3efe1
+ metadata.gz: ecee89ddb2db25a9ba697ec637f16a2408079b8b3d7d287df0cfdb30d9b7c0a996d6cb38075766d7e3fecf529be6ad90187a1ec95d2d141076ecfe3e2236209a
+ data.tar.gz: 31d7259ac86779530b468392b611f0920b24682f8017e3b50f9a3ca41d30ad08454219881e26eae67d1add7a47d645f7a8b77d253ada4a3d0b5db0f5dbf5e33d
data/CHANGELOG.md CHANGED
@@ -1,3 +1,13 @@
+ ## 0.4.1 (2024-05-23)
+
+ - Reduced memory for `item_recs` and `similar_users`
+
+ ## 0.4.0 (2023-01-30)
+
+ - Fixed issue with `has_recommended` and inheritance with Rails < 6.1
+ - Deprecated marshal serialization
+ - Dropped support for Ruby < 2.7 and Rails < 6
+
  ## 0.3.2 (2022-09-26)
 
  - Fixed issue when `fit` is called multiple times
data/README.md CHANGED
@@ -6,7 +6,7 @@
  - Works with explicit and implicit feedback
  - Uses high-performance matrix factorization
 
- [![Build Status](https://github.com/ankane/disco/workflows/build/badge.svg?branch=master)](https://github.com/ankane/disco/actions)
+ [![Build Status](https://github.com/ankane/disco/actions/workflows/build.yml/badge.svg)](https://github.com/ankane/disco/actions)
 
  ## Installation
 
@@ -176,8 +176,6 @@ user.update_recommended_products_v2(recs)
  user.recommended_products_v2
  ```
 
- For Rails < 6, speed up inserts by adding [activerecord-import](https://github.com/zdennis/activerecord-import) to your app.
-
  ## Storing Recommenders
 
  If you’d prefer to perform recommendations on-the-fly, store the recommender
@@ -231,8 +229,8 @@ recommender.user_recs(new_user_id) # returns empty array
 
  There are a number of ways to deal with this, but here are some common ones:
 
- - For user-based recommendations, show new users the most popular items.
- - For item-based recommendations, make content-based recommendations with a gem like [tf-idf-similarity](https://github.com/jpmckinney/tf-idf-similarity).
+ - For user-based recommendations, show new users the most popular items
+ - For item-based recommendations, make content-based recommendations with a gem like [tf-idf-similarity](https://github.com/jpmckinney/tf-idf-similarity)
 
  Get top items with:
 
@@ -331,28 +329,6 @@ Thanks to:
  - [Implicit](https://github.com/benfred/implicit/) for serving as an initial reference for user and item similarity
  - [@dasch](https://github.com/dasch) for the gem name
 
- ## Upgrading
-
- ### 0.2.7
-
- There’s now a warning when passing `:value` with implicit feedback, as this has no effect on recommendations and can be removed. Earlier versions of the library incorrectly stated this was used.
-
- ```ruby
- recommender.fit([
- {user_id: 1, item_id: 1, value: 1},
- {user_id: 2, item_id: 1, value: 3}
- ])
- ```
-
- to:
-
- ```ruby
- recommender.fit([
- {user_id: 1, item_id: 1},
- {user_id: 2, item_id: 1}
- ])
- ```
-
  ## History
 
  View the [changelog](https://github.com/ankane/disco/blob/master/CHANGELOG.md)
data/lib/disco/data.rb CHANGED
@@ -9,7 +9,7 @@ module Disco
  file_hash: "06416e597f82b7342361e41163890c81036900f418ad91315590814211dca490")
 
  # convert u.item to utf-8
- movies_str = File.read(item_path).encode("UTF-8", "binary", invalid: :replace, undef: :replace, replace: "")
+ movies_str = File.read(item_path).encode("UTF-8", "ISO-8859-1")
 
  movies = {}
  CSV.parse(movies_str, col_sep: "|") do |row|
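
The effect of the encoding change in this hunk is easier to see on a single byte. The sketch below uses a made-up string (an assumption for illustration, not a line from `u.item`) containing the byte `0xE9`, which is `é` in ISO-8859-1. Treating the input as binary and replacing undefined characters with an empty string drops the byte, while declaring the source encoding transcodes it.

```ruby
# Hypothetical sample: "Caf\xE9" is "Café" encoded as ISO-8859-1.
raw = "Caf\xE9".b

# 0.3.2 approach: treat the bytes as binary and drop anything that has
# no UTF-8 equivalent (every byte >= 0x80).
dropped = raw.encode("UTF-8", "binary", invalid: :replace, undef: :replace, replace: "")

# 0.4.1 approach: declare the source encoding so the byte is transcoded.
converted = raw.encode("UTF-8", "ISO-8859-1")

puts dropped
puts converted
```

With the old call the accented character is lost silently; with the new call it survives as valid UTF-8.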
data/lib/disco/model.rb CHANGED
@@ -1,7 +1,12 @@
  module Disco
  module Model
  def has_recommended(name, class_name: nil)
+ if ActiveRecord::VERSION::MAJOR < 6
+ raise Disco::Error, "Requires Active Record 6+"
+ end
+
  class_name ||= name.to_s.singularize.camelize
+ subject_type = model_name.name
 
  class_eval do
  unless reflect_on_association(:recommendations)
@@ -12,21 +17,13 @@ module Disco
 
  define_method("update_recommended_#{name}") do |items|
  now = Time.now
- items = items.map { |item| {subject_type: model_name.name, subject_id: id, item_type: class_name, item_id: item.fetch(:item_id), context: name, score: item.fetch(:score), created_at: now, updated_at: now} }
+ items = items.map { |item| {subject_type: subject_type, subject_id: id, item_type: class_name, item_id: item.fetch(:item_id), context: name, score: item.fetch(:score), created_at: now, updated_at: now} }
 
  self.class.transaction do
  recommendations.where(context: name).delete_all
 
  if items.any?
- if recommendations.respond_to?(:insert_all!)
- # Rails 6
- recommendations.insert_all!(items)
- elsif recommendations.respond_to?(:bulk_import!)
- # activerecord-import
- recommendations.bulk_import!(items, validate: false)
- else
- recommendations.create!([items])
- end
+ recommendations.insert_all!(items)
  end
  end
  end
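
The `subject_type` change moves a name lookup from inside the `define_method` block (evaluated on each instance, at call time) to the body of `has_recommended` (evaluated once, on the declaring class). The diff does not spell out the inheritance bug, so the following is only a plain-Ruby sketch of that scoping difference. `Macro`, `has_label`, `User` and `Admin` are hypothetical names, and no Active Record behavior is modeled.

```ruby
# Hypothetical stand-in for a class-level macro; not disco's code.
module Macro
  def has_label(name)
    declared_on = self.name # captured once, when the macro runs

    # Looked up on the receiver every time the method is called.
    define_method("#{name}_at_call_time") { self.class.name }

    # Closes over the value captured above.
    define_method("#{name}_at_declaration") { declared_on }
  end
end

class User
  extend Macro
  has_label :label
end

class Admin < User; end

admin = Admin.new
puts admin.label_at_call_time
puts admin.label_at_declaration
```

For a subclass instance the two methods can disagree: the call-time lookup sees the subclass, while the captured value stays with the class that declared the macro.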
data/lib/disco/recommender.rb CHANGED
@@ -99,8 +99,8 @@ module Disco
  @user_factors = model.p_factors(format: :numo)
  @item_factors = model.q_factors(format: :numo)
 
- @normalized_user_factors = nil
- @normalized_item_factors = nil
+ @user_norms = nil
+ @item_norms = nil
 
  @user_recs_index = nil
  @similar_users_index = nil
@@ -172,13 +172,13 @@ module Disco
 
  def similar_items(item_id, count: 5)
  check_fit
- similar(item_id, :item_id, @item_map, normalized_item_factors, count, @similar_items_index)
+ similar(item_id, :item_id, @item_map, @item_factors, item_norms, count, @similar_items_index)
  end
  alias_method :item_recs, :similar_items
 
  def similar_users(user_id, count: 5)
  check_fit
- similar(user_id, :user_id, @user_map, normalized_user_factors, count, @similar_users_index)
+ similar(user_id, :user_id, @user_map, @user_factors, user_norms, count, @similar_users_index)
  end
 
  def top_items(count: 5)
@@ -247,13 +247,13 @@ module Disco
 
  def optimize_similar_items(library: nil)
  check_fit
- @similar_items_index = create_index(normalized_item_factors, library: library)
+ @similar_items_index = create_index(@item_factors / item_norms.expand_dims(1), library: library)
  end
  alias_method :optimize_item_recs, :optimize_similar_items
 
  def optimize_similar_users(library: nil)
  check_fit
- @similar_users_index = create_index(normalized_user_factors, library: library)
+ @similar_users_index = create_index(@user_factors / user_norms.expand_dims(1), library: library)
  end
 
  def inspect
@@ -341,36 +341,37 @@ module Disco
  end
  end
 
- def normalized_user_factors
- @normalized_user_factors ||= normalize(@user_factors)
+ def user_norms
+ @user_norms ||= norms(@user_factors)
  end
 
- def normalized_item_factors
- @normalized_item_factors ||= normalize(@item_factors)
+ def item_norms
+ @item_norms ||= norms(@item_factors)
  end
 
- def normalize(factors)
+ def norms(factors)
  norms = Numo::SFloat::Math.sqrt((factors * factors).sum(axis: 1))
  norms[norms.eq(0)] = 1e-10 # no zeros
- factors / norms.expand_dims(1)
+ norms
  end
 
- def similar(id, key, map, norm_factors, count, index)
+ def similar(id, key, map, factors, norms, count, index)
  i = map[id]
 
- if i && norm_factors.shape[0] > 1
+ if i && factors.shape[0] > 1
  if index && count
+ norm_factors = factors[i, true] / norms[i]
  if defined?(Faiss) && index.is_a?(Faiss::Index)
- predictions, ids = index.search(norm_factors[i, true].expand_dims(0), count + 1).map { |v| v.to_a[0] }
+ predictions, ids = index.search(norm_factors.expand_dims(0), count + 1).map { |v| v.to_a[0] }
  else
- result = index.search(norm_factors[i, true], size: count + 1)
+ result = index.search(norm_factors, size: count + 1)
  # ids from batch_insert start at 1 instead of 0
  ids = result.map { |v| v[:id] - 1 }
  # convert cosine distance to cosine similarity
  predictions = result.map { |v| 1 - v[:distance] }
  end
  else
- predictions = norm_factors.inner(norm_factors[i, true])
+ predictions = factors.inner(factors[i, true]) / (norms * norms[i])
  indexes = predictions.sort_index.reverse
  indexes = indexes[0...[count + 1, indexes.size].min] if count
  predictions = predictions[indexes]
@@ -386,6 +387,7 @@ module Disco
  next if id == i
 
  result << {key => keys[id], score: predictions[j]}
+ break if result.size == count
  end
  result
  else
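
The `similar` changes replace a cached, fully normalized copy of the factor matrix with a cached vector of row norms, dividing the raw inner products by the product of norms instead. A minimal sketch of why both give the same cosine similarities, using plain Ruby arrays rather than the `Numo::SFloat` matrices the gem works with (the sample values and the `dot` helper are assumptions for illustration):

```ruby
# Plain-array sketch; sample factors are made up.
factors = [
  [3.0, 4.0],
  [6.0, 8.0],
  [4.0, -3.0]
]

# Hypothetical helper: inner product of two equal-length arrays.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# One norm per row: extra memory proportional to the number of rows.
norms = factors.map { |row| Math.sqrt(dot(row, row)) }

# 0.3.2 style: keep a normalized copy of the whole matrix
# (extra memory proportional to rows * columns).
normalized = factors.each_with_index.map { |row, j| row.map { |v| v / norms[j] } }

i = 0 # index of the row being compared against all others
via_copy = normalized.map { |row| dot(row, normalized[i]) }

# 0.4.1 style: divide raw inner products by the product of the norms.
via_norms = factors.each_with_index.map { |row, j| dot(row, factors[i]) / (norms[j] * norms[i]) }

p via_copy
p via_norms
```

The two results agree up to floating-point rounding, and only the norm vector has to stay in memory between calls.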
@@ -430,6 +432,8 @@ module Disco
  end
 
  def marshal_dump
+ warn "[disco] Marshal serialization is deprecated - use JSON instead"
+
  obj = {
  implicit: @implicit,
  user_map: @user_map,
@@ -457,6 +461,8 @@ module Disco
  end
 
  def marshal_load(obj)
+ warn "[disco] Marshal serialization is deprecated - use JSON instead"
+
  @implicit = obj[:implicit]
  @user_map = obj[:user_map]
  @item_map = obj[:item_map]
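
The two hunks above keep Marshal working but emit a warning on every dump and load, pointing users toward JSON. The pattern can be sketched with a small stand-alone class. `TinyModel`, its `weights` attribute and its `load_json` method are hypothetical and exist only for this illustration; they are not disco's implementation.

```ruby
require "json"

# Hypothetical class that mirrors the deprecation pattern; not part of disco.
class TinyModel
  attr_reader :weights

  def initialize(weights)
    @weights = weights
  end

  # Preferred path: explicit, portable JSON.
  def to_json(*args)
    {weights: @weights}.to_json(*args)
  end

  def self.load_json(json)
    new(JSON.parse(json).fetch("weights"))
  end

  # Deprecated path: still functional, but warns on stderr.
  def marshal_dump
    warn "[tiny_model] Marshal serialization is deprecated - use JSON instead"
    {weights: @weights}
  end

  def marshal_load(obj)
    warn "[tiny_model] Marshal serialization is deprecated - use JSON instead"
    @weights = obj[:weights]
  end
end

model = TinyModel.new([0.1, 0.2])
restored = TinyModel.load_json(model.to_json)
puts restored.weights.inspect
```

Warning instead of raising keeps existing Marshal dumps loadable during a transition period while nudging callers to re-save in the preferred format.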
data/lib/disco/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Disco
- VERSION = "0.3.2"
+ VERSION = "0.4.1"
  end
data/lib/disco.rb CHANGED
@@ -3,13 +3,13 @@ require "libmf"
  require "numo/narray"
 
  # modules
- require "disco/data"
- require "disco/metrics"
- require "disco/recommender"
- require "disco/version"
+ require_relative "disco/data"
+ require_relative "disco/metrics"
+ require_relative "disco/recommender"
+ require_relative "disco/version"
 
  # integrations
- require "disco/engine" if defined?(Rails)
+ require_relative "disco/engine" if defined?(Rails)
 
  module Disco
  class Error < StandardError; end
@@ -19,7 +19,7 @@ end
 
  if defined?(ActiveSupport.on_load)
  ActiveSupport.on_load(:active_record) do
- require "disco/model"
+ require_relative "disco/model"
  extend Disco::Model
  end
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: disco
  version: !ruby/object:Gem::Version
- version: 0.3.2
+ version: 0.4.1
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2022-09-27 00:00:00.000000000 Z
+ date: 2024-05-23 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: libmf
@@ -16,14 +16,14 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.2.0
+ version: '0.2'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.2.0
+ version: '0.2'
  - !ruby/object:Gem::Dependency
  name: numo-narray
  requirement: !ruby/object:Gem::Requirement
@@ -69,14 +69,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '2.6'
+ version: '2.7'
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.3.7
+ rubygems_version: 3.5.9
  signing_key:
  specification_version: 4
  summary: Recommendations for Ruby and Rails using collaborative filtering