inci_score 4.3.0 → 4.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 0affbfe591b6551bc8761ccfc245e1b233f311e5804da9761661ebe959e5be4a
4
- data.tar.gz: 772501375864af4b28617351a060a839048e97794c1b596062771d3e6d807e8a
3
+ metadata.gz: 462ec33d1c493272235feaef061ac62822c4dfd6ad6c339e858da8fdfa491894
4
+ data.tar.gz: 42f1e47b971185e92d4af19f2fbdee0f363dc9f241041b8ed800230ae9bd0e22
5
5
  SHA512:
6
- metadata.gz: 0367e213bff98076b13ed908f36380f1e452dc4a8c1c85c3e16c0b4a69acac66fdcf783787040b2004603175fd8476def74deea86b41c05064baddbea4e5563a
7
- data.tar.gz: 931e2a90a874d865418481a3beae838f614537ababf9386edc0d186d0921786972a54eb6cfa73ab235c260412241bc557298814cad07bc493b06aaf04651ae37
6
+ metadata.gz: cc4f049d56ea9fc60ce92943d7da50c15b697f8712733b409e3327ab73b3c4c0e60d0687f55fcbbefb9f4fd2fdc4c05fe5aa86435f5b75fb558bf456744c9cc4
7
+ data.tar.gz: a706360921a1cc36b1b5f1fef53932b859de69a27b02fd8b662b76dc8fe9808e1409779297bfa159c0d57bc6bb1dfb6e5287d4455def84500ce194fa0102706a
data/README.md CHANGED
@@ -9,8 +9,9 @@
9
9
  * [Usage](#usage)
10
10
  * [Library](#library)
11
11
  * [CLI](#cli)
12
- * [Benchmark](#benchmark)
12
+ * [Benchmarks](#benchmark)
13
13
  * [Levenshtein in C](#levenshtein-in-c)
14
+ * [Run benchmarks](#run-benchmarks)
14
15
 
15
16
  ## Scope
16
17
  This gem computes the score of cosmetic components basing on the information provided by the [Biodizionario site](http://www.biodizionario.it/) by Fabrizio Zago.
@@ -56,7 +57,7 @@ You can include this gem into your own library and start computing the INCI scor
56
57
  require "inci_score"
57
58
 
58
59
  inci = InciScore::Computer.new(src: 'aqua, dimethicone').call
59
- inci.score # 53.7629
60
+ inci.score # 53.76
60
61
  ```
61
62
 
62
63
  As you see the results are wrapped by an *InciScore::Response* object, this is useful when dealing with the CLI and HTTP interfaces (read below).
@@ -80,12 +81,10 @@ inci_score --src="ingredients: aqua, dimethicone, pej-10, noent"
80
81
 
81
82
  TOTAL SCORE:
82
83
  47.18
83
- VALID STATE:
84
- true
85
84
  PRECISION:
86
85
  75.0
87
86
  COMPONENTS:
88
- aqua\n dimethicone\n peg-10
87
+ aqua (0), dimethicone (4), peg-10 (3)
89
88
  UNRECOGNIZED:
90
89
  noent
91
90
  ```
@@ -98,15 +97,17 @@ Usage: inci_score --src="aqua, parfum, etc"
98
97
  -h, --help Prints this help
99
98
  ```
100
99
 
101
- ## Benchmark
100
+ ## Benchmarks
102
101
 
103
102
  ### Levenshtein in C
104
103
  I noticed the APIs slows down dramatically when dealing with unrecognized components to fuzzy match on.
105
104
  I profiled the code by using the [benchmark-ips](https://github.com/evanphx/benchmark-ips) gem, finding the bottleneck was the pure Ruby implementation of the Levenshtein distance algorithm.
106
- After some pointless optimization, i replaced this routine with a C implementation: i opted for the straightforward [Ruby Inline](https://github.com/seattlerb/rubyinline) library to call the C code straight from Ruby.
107
105
 
108
- Once downloaded source code, run the bench specs by:
106
+ After some pointless optimization, i replaced this routine with a C implementation: i opted for the straightforward [Ruby Inline](https://github.com/seattlerb/rubyinline) library to call the C code straight from Ruby, gaining an order of magnitude in speed (x30).
107
+
108
+ ### Run benchmarks
109
+ Once downloaded source code, run the benchmarks by:
109
110
 
110
111
  ```shell
111
- bundle exec rake spec:bench
112
+ bundle exec rake bench
112
113
  ```
data/config/catalog.yml CHANGED
@@ -1,5 +1,4 @@
1
1
  ---
2
- generic-hazard: 3
3
2
  aqua: 0
4
3
  water: 0
5
4
  parfum: 3
data/config/hazards.yml CHANGED
@@ -1,31 +1,29 @@
1
- - peg-
2
- - ppg-
3
- - dea-
4
- - mipa-
5
- - edta-
6
- - thicone
7
- - siloxane
8
- - chlorexidine
9
- - petrolatum
10
- - paraffinum liquidum
11
- - carbomer
12
- - crosspolymer
13
- - acrylate
14
- - styrene
15
- - copolymer
16
- - triethanolamine
17
- - triclosan
18
- - dmdm
19
- - hydantoin
20
- - imidazolidinyl urea
21
- - diazolidinyl urea
22
- - formaldheyde
23
- - methylchloroisothiazolinone
24
- - methylisothiazolinone
25
- - sodium hydroxymethylglycinate
26
- - nonoxynol
27
- - poloxamer
28
- - trimonium
29
- - dimonium
30
- - glycol
31
- - glicol
1
+ ---
2
+ peg-: 3
3
+ ppg-: 3
4
+ dea-: 3
5
+ mipa-: 3
6
+ edta-: 4
7
+ thicone: 4
8
+ siloxane: 4
9
+ chlorexidine: 4
10
+ petrolatum: 3
11
+ paraffinum: 3
12
+ carbomer: 3
13
+ crosspolymer: 3
14
+ acrylate: 3
15
+ styrene: 3
16
+ copolymer: 3
17
+ triethanolamine: 3
18
+ triclosan: 4
19
+ dmdm: 3
20
+ hydantoin: 3
21
+ imidazolidinyl: 4
22
+ diazolidinyl: 3
23
+ methylchloroisothiazolinone: 3
24
+ methylisothiazolinone: 3
25
+ nonoxynol: 4
26
+ poloxamer: 3
27
+ trimonium: 3
28
+ dimonium: 3
29
+ glycol: 3
@@ -4,8 +4,6 @@ module InciScore
4
4
  class Recognizer
5
5
  DEFAULT_RULES = [Rules::Key, Rules::Levenshtein, Rules::Hazard, Rules::Prefix, Rules::Tokens].freeze
6
6
 
7
- Component = Struct.new(:name, :hazard)
8
-
9
7
  attr_reader :ingredient, :rules, :applied
10
8
 
11
9
  def initialize(ingredient, rules = DEFAULT_RULES)
@@ -17,9 +15,7 @@ module InciScore
17
15
 
18
16
  def call
19
17
  return if ingredient.to_s.empty?
20
- component = find_component
21
- return unless component
22
- Component.new(component, Config::CATALOG[component])
18
+ find_component
23
19
  end
24
20
 
25
21
  private
@@ -7,14 +7,23 @@ module InciScore
7
7
  module Rules
8
8
  TOLERANCE = 3
9
9
 
10
- Key = ->(src) { src if Config::CATALOG.has_key?(src) }
10
+ Component = Struct.new(:name, :hazard)
11
11
 
12
- Hazard = ->(src) { 'generic-hazard' if Config::HAZARDS.any? { |h| src.include?(h) } }
12
+ Key = ->(src) do
13
+ score = Config::CATALOG[src]
14
+ Component.new(src, score) if score
15
+ end
16
+
17
+ Hazard = ->(src) do
18
+ if hazard = Config::HAZARDS.detect { |name, _| src.include?(name) }
19
+ Component.new(src, hazard.last)
20
+ end
21
+ end
13
22
 
14
23
  module Levenshtein
15
24
  extend self
16
25
 
17
- Result = Struct.new(:name, :distance) do
26
+ Result = Struct.new(:name, :distance, :score) do
18
27
  def tolerable?(size)
19
28
  distance < TOLERANCE && distance <= (size-1)
20
29
  end
@@ -25,14 +34,14 @@ module InciScore
25
34
  size = src.size
26
35
  farthest = Result.new(nil, size)
27
36
  initial = src[0]
28
- result = Config::CATALOG.reduce(farthest) do |nearest, (component, _)|
29
- next nearest unless component.start_with?(initial)
30
- next nearest if component.size > (size + TOLERANCE)
31
- d = src.distance(component)
32
- nearest = Result.new(component, d) if d < nearest.distance
37
+ result = Config::CATALOG.reduce(farthest) do |nearest, (name, score)|
38
+ next nearest unless name.start_with?(initial)
39
+ next nearest if name.size > (size + TOLERANCE)
40
+ d = src.distance(name)
41
+ nearest = Result.new(name, d, score) if d < nearest.distance
33
42
  nearest
34
43
  end
35
- result.name if result.tolerable?(size)
44
+ Component.new(result.name, result.score) if result.tolerable?(size)
36
45
  end
37
46
  end
38
47
 
@@ -44,7 +53,8 @@ module InciScore
44
53
  def call(src)
45
54
  return if src.size < TOLERANCE
46
55
  digits = src[0, MIN_MEANINGFUL]
47
- Config::CATALOG.detect { |component, _| component.start_with?(digits) }.to_a.first
56
+ pairs = Config::CATALOG.detect { |name, _| name.start_with?(digits) }.to_a.first
57
+ Component.new(*pairs) if pairs
48
58
  end
49
59
  end
50
60
 
@@ -56,8 +66,8 @@ module InciScore
56
66
  def call(src)
57
67
  return if src.size <= TOLERANCE
58
68
  tokens(src).each do |token|
59
- Config::CATALOG.each do |component, _|
60
- return component if component.include?(token)
69
+ Config::CATALOG.each do |name, score|
70
+ return Component.new(name, score) if name.include?(token)
61
71
  end
62
72
  end
63
73
  nil
@@ -20,16 +20,35 @@ module InciScore
20
20
  end
21
21
 
22
22
  def to_s
23
+ [score_str, precision_str, components_str, unrecognized_str].join
24
+ end
25
+
26
+ private
27
+
28
+ def score_str
23
29
  %Q{
24
30
  TOTAL SCORE:
25
- \t#{score}
31
+ \t#{score}}
32
+ end
33
+
34
+ def precision_str
35
+ %Q{
26
36
  PRECISION:
27
- \t#{precision}
37
+ \t#{precision}}
38
+ end
39
+
40
+ def components_str
41
+ return '' if components.empty?
42
+ %Q{
28
43
  COMPONENTS:
29
- \t#{components.map { |c| "#{c.name} (#{c.hazard})" }.join(', ')}
44
+ \t#{components.map { |c| "#{c.name} (#{c.hazard})" }.join(', ')}}
45
+ end
46
+
47
+ def unrecognized_str
48
+ return '' if unrecognized.empty?
49
+ %Q{
30
50
  UNRECOGNIZED:
31
- \t#{unrecognized.join(', ')}
32
- }
51
+ \t#{unrecognized.join(', ')}}
33
52
  end
34
53
  end
35
54
  end
@@ -2,7 +2,7 @@
2
2
 
3
3
  module InciScore
4
4
  class Scorer
5
- HAZARD_PERCENT = 25
5
+ HAZARD_RATIO = 25
6
6
  WEIGHT_FACTOR = 5
7
7
 
8
8
  attr_reader :hazards, :size
@@ -15,7 +15,7 @@ module InciScore
15
15
 
16
16
  def call
17
17
  return 0 if hazards.empty?
18
- (100 - avg * HAZARD_PERCENT).round(4)
18
+ (100 - avg * HAZARD_RATIO).round(4)
19
19
  end
20
20
 
21
21
  private
@@ -25,10 +25,8 @@ module InciScore
25
25
  end
26
26
 
27
27
  def avg_weighted
28
- return hazards.reduce(&:+) if same_hazard?
29
- weighted.reduce(0.0) do |acc,score|
30
- acc += score.value
31
- end
28
+ return hazards.sum if same_hazard?
29
+ weighted.sum(&:value)
32
30
  end
33
31
 
34
32
  def same_hazard?
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module InciScore
4
- VERSION = '4.3.0'
4
+ VERSION = '4.5.0'
5
5
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: inci_score
3
3
  version: !ruby/object:Gem::Version
4
- version: 4.3.0
4
+ version: 4.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - costajob