inci_score 1.2.1 → 2.0.1
- checksums.yaml +4 -4
- data/.travis.yml +1 -2
- data/README.md +44 -10
- data/bin/inci_score +4 -4
- data/inci_score.gemspec +1 -1
- data/lib/inci_score/api/app.rb +1 -1
- data/lib/inci_score/cli.rb +44 -0
- data/lib/inci_score/computer.rb +9 -9
- data/lib/inci_score/{parser.rb → fetcher.rb} +1 -1
- data/lib/inci_score/normalizer.rb +4 -4
- data/lib/inci_score/normalizer_rules.rb +51 -34
- data/lib/inci_score/recognizer.rb +8 -12
- data/lib/inci_score/recognizer_rules.rb +6 -6
- data/lib/inci_score/version.rb +1 -1
- data/lib/inci_score.rb +3 -2
- metadata +6 -5
    
        checksums.yaml
    CHANGED
    
    | @@ -1,7 +1,7 @@ | |
| 1 1 | 
             
            ---
         | 
| 2 2 | 
             
            SHA1:
         | 
| 3 | 
            -
              metadata.gz:  | 
| 4 | 
            -
              data.tar.gz:  | 
| 3 | 
            +
              metadata.gz: 5970cfdecac8492dbfd510dce7a24488e543233c
         | 
| 4 | 
            +
              data.tar.gz: fb5b1171f1fcab479e24b33dc7d7f37582b93741
         | 
| 5 5 | 
             
            SHA512:
         | 
| 6 | 
            -
              metadata.gz:  | 
| 7 | 
            -
              data.tar.gz:  | 
| 6 | 
            +
              metadata.gz: 42624a99c66bc3fcfb53cff14ebe6a153b220901df9b9e5f49f3d8ec2c9378436cd7090446bb449fefc7320c8ff61fdd5375b551b015312858fac8b8bfa8b66c
         | 
| 7 | 
            +
              data.tar.gz: 2f6bcc48dd8727a6b2b9665882cd3a809b849a7876a7a5303b7a8c3438c373fb3e25c28545889f70fcc14aa1724ea5febfbb5a78a555456ec99e44b5ca9de329
         | 
    
        data/.travis.yml
    CHANGED
    
    
    
        data/README.md
    CHANGED
    
    | @@ -11,9 +11,13 @@ | |
| 11 11 | 
             
              * [Starting Puma](#starting-puma)
         | 
| 12 12 | 
             
              * [Triggering a request](#triggering-a-request)
         | 
| 13 13 | 
             
            * [CLI API](#cli-api)
         | 
| 14 | 
            -
            * [ | 
| 14 | 
            +
              * [Refresh catalog](#refresh-catalog)
         | 
| 15 | 
            +
            * [Benchmark](#benchmark)
         | 
| 15 16 | 
             
              * [Levenshtein in C](#levenshtein-in-c)
         | 
| 16 | 
            -
              * [ | 
| 17 | 
            +
              * [Platform](#platform)
         | 
| 18 | 
            +
              * [Wrk](#wrk)
         | 
| 19 | 
            +
              * [Results](#results)
         | 
| 20 | 
            +
              * [Ruby 2.4](#ruby-2.4)
         | 
| 17 21 |  | 
| 18 22 | 
             
            ## Scope
         | 
| 19 23 | 
             
            This gem computes the score of cosmetic components based on the information provided by the [Biodizionario site](http://www.biodizionario.it/) by Fabrizio Zago.
         | 
| @@ -75,8 +79,8 @@ The Web API exposes the *InciScore* library over HTTP via the [Puma](http://puma | |
| 75 79 |  | 
| 76 80 | 
             
            ### Starting Puma
         | 
| 77 81 | 
             
            Simply start Puma via the *config.ru* file included in the repository, spawning as many workers as your workstation supports:
         | 
| 78 | 
            -
            ```
         | 
| 79 | 
            -
            bundle exec puma -w  | 
| 82 | 
            +
            ```shell
         | 
| 83 | 
            +
            bundle exec puma -w 8 -t 0:2 --preload
         | 
| 80 84 | 
             
            ```
         | 
| 81 85 |  | 
| 82 86 | 
             
            ### Triggering a request
         | 
| @@ -84,7 +88,7 @@ The Web API responds with a JSON object representing the original *InciScore::Re | |
| 84 88 |  | 
| 85 89 | 
             
            You can pass the source string directly as an HTTP parameter:
         | 
| 86 90 |  | 
| 87 | 
            -
            ```
         | 
| 91 | 
            +
            ```shell
         | 
| 88 92 | 
             
            curl http://127.0.0.1:9292?src=aqua,dimethicone
         | 
| 89 93 | 
             
            => {"components":{"aqua":0,"dimethicone":4},"unrecognized":[],"score":53.762874945799766,"valid":true}
         | 
| 90 94 | 
             
            ```
         | 
| @@ -92,8 +96,8 @@ curl http://127.0.0.1:9292?src=aqua,dimethicone | |
| 92 96 | 
             
            ## CLI API
         | 
| 93 97 | 
             
            You can collect INCI data by using the available binary:
         | 
| 94 98 |  | 
| 95 | 
            -
            ```
         | 
| 96 | 
            -
            inci_score "aqua,dimethicone,pej-10,noent"
         | 
| 99 | 
            +
            ```shell
         | 
| 100 | 
            +
            inci_score --src="aqua,dimethicone,pej-10,noent"
         | 
| 97 101 |  | 
| 98 102 | 
             
            TOTAL SCORE:
         | 
| 99 103 | 
             
                    47.18034913243358
         | 
| @@ -107,11 +111,41 @@ UNRECOGNIZED: | |
| 107 111 | 
             
                    noent
         | 
| 108 112 | 
             
            ```
         | 
| 109 113 |  | 
| 110 | 
            -
             | 
| 114 | 
            +
            ### Refresh catalog
         | 
| 115 | 
            +
            When using the CLI you have the option to fetch a fresh catalog from remote by specifying a flag:
         | 
| 116 | 
            +
            ```shell
         | 
| 117 | 
            +
            inci_score --fresh --src="aqua,dimethicone,pej-10,noent"
         | 
| 118 | 
            +
            ```
         | 
| 119 | 
            +
             | 
| 120 | 
            +
            ## Benchmark
         | 
| 121 | 
            +
             | 
| 122 | 
            +
            ### Levenshtein in C
         | 
| 111 123 | 
             
            I noticed the API slows down dramatically when it has to fuzzy-match unrecognized components.  
         | 
| 112 124 | 
             
            I profiled the code by using the [benchmark-ips](https://github.com/evanphx/benchmark-ips) gem, finding the bottleneck was the pure Ruby implementation of the Levenshtein distance algorithm.  
         | 
| 113 125 | 
             
            After some pointless optimization, I replaced this routine with a C implementation: I opted for the straightforward [Ruby Inline](https://github.com/seattlerb/rubyinline) library to call the C code straight from Ruby.  
         | 
| 114 126 | 
             
            As a result I got a 10x increase in throughput, all without sacrificing code readability.
         | 
| 115 127 |  | 
| 116 | 
            -
            ###  | 
| 117 | 
            -
            I  | 
| 128 | 
            +
            ### Platform
         | 
| 129 | 
            +
            I recorded these benchmarks on a mid-2015 15-inch MacBook Pro with these specs:
         | 
| 130 | 
            +
            * OS X El Capitan
         | 
| 131 | 
            +
            * 2.2 GHz Intel Core i7 (4 cores)
         | 
| 132 | 
            +
            * 16 GB 1600 MHz DDR3
         | 
| 133 | 
            +
             | 
| 134 | 
            +
            ### Wrk
         | 
| 135 | 
            +
            As always, I used [wrk](https://github.com/wg/wrk) as the load-testing tool.
         | 
| 136 | 
            +
            I measured each library three times, picking the best lap.  
         | 
| 137 | 
            +
            I used the following command:
         | 
| 138 | 
            +
             | 
| 139 | 
            +
            ```
         | 
| 140 | 
            +
            wrk -t 4 -c 100 -d 30s --timeout 2000 http://127.0.0.1:9292/?src=<list_of_ingredients>
         | 
| 141 | 
            +
            ```
         | 
| 142 | 
            +
             | 
| 143 | 
            +
            ### Results
         | 
| 144 | 
            +
            | Type               | Ingredients              | Throughput (req/s) | Latency in ms (avg/stdev/max) |
         | 
| 145 | 
            +
            | :----------------- | :----------------------- | -----------------: | ----------------------------: |
         | 
| 146 | 
            +
            | exact matching     | aqua,parfum,zeolite      |           48863.58 |               0.31/0.55/10.82 |
         | 
| 147 | 
            +
             | 
| 148 | 
            +
            ## Ruby 2.4
         | 
| 149 | 
            +
            After upgrading to Ruby 2.4 I doubled the throughput of the matcher: I assume Ruby's optimization of [Hash access](https://blog.heroku.com/ruby-2-4-features-hashes-integers-rounding) is the main reason.  
         | 
| 150 | 
            +
            I also adopted the new `#match?` method to avoid creating a MatchData object when I only need a boolean check.  
         | 
| 151 | 
            +
            In the end, the Ruby upgrade is a big deal for my gem: give it a try!
         | 
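
The README's note about `#match?` can be illustrated with a minimal sketch. The component string and pattern below are made up for the example and are not taken from the gem's catalog; the only point is the difference in return values between `String#match` and `String#match?` (available from Ruby 2.4).

```ruby
# Hypothetical sample data, chosen only for illustration.
component = "citric acid"
pattern = /\bcitric\b/

# String#match builds and returns a MatchData object (or nil on no match).
match_data = component.match(pattern)

# String#match? (Ruby 2.4+) returns a plain boolean and allocates no MatchData.
predicate = component.match?(pattern)

puts match_data.class
puts predicate
```

When the result is only used as a condition, as in the `Digits` and `Tokens` rules of this diff, the boolean form is sufficient.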
    
        data/bin/inci_score
    CHANGED
    
    | @@ -1,7 +1,7 @@ | |
| 1 1 | 
             
            #!/usr/bin/env ruby
         | 
| 2 | 
            +
            lib = File.expand_path("../../lib", __FILE__)
         | 
| 3 | 
            +
            $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
         | 
| 2 4 |  | 
| 3 | 
            -
            require  | 
| 4 | 
            -
            require 'inci_score'
         | 
| 5 | 
            +
            require "inci_score"
         | 
| 5 6 |  | 
| 6 | 
            -
             | 
| 7 | 
            -
            puts InciScore::Computer.new(ARGV[0], InciScore::Catalog.fetch).call
         | 
| 7 | 
            +
            InciScore::CLI.new(args: ARGV.clone).call
         | 
    
        data/inci_score.gemspec
    CHANGED
    
    | @@ -14,7 +14,7 @@ Gem::Specification.new do |s| | |
| 14 14 | 
             
              s.executables << "inci_score"
         | 
| 15 15 | 
             
              s.require_paths = ["lib"]
         | 
| 16 16 | 
             
              s.license = "MIT"
         | 
| 17 | 
            -
              s.required_ruby_version = ">= 2. | 
| 17 | 
            +
              s.required_ruby_version = ">= 2.4"
         | 
| 18 18 |  | 
| 19 19 | 
             
              s.add_runtime_dependency "nokogiri", "~> 1.6"
         | 
| 20 20 | 
             
              s.add_runtime_dependency "puma", "~> 3"
         | 
    
        data/lib/inci_score/api/app.rb
    CHANGED
    
    | @@ -13,7 +13,7 @@ module InciScore | |
| 13 13 | 
             
                  def call(env)
         | 
| 14 14 | 
             
                    req = Rack::Request.new(env)
         | 
| 15 15 | 
             
                    src = req.params["src"]
         | 
| 16 | 
            -
                    json = src ? Computer.new(src, catalog).call.to_json : %q({"error": "no valid source"})
         | 
| 16 | 
            +
                    json = src ? Computer.new(src: src, catalog: catalog).call.to_json : %q({"error": "no valid source"})
         | 
| 17 17 | 
             
                    ['200', {'Content-Type' => 'application/json'}, [json]]
         | 
| 18 18 | 
             
                  end
         | 
| 19 19 | 
             
                end
         | 
| @@ -0,0 +1,44 @@ | |
| 1 | 
            +
            require "optparse"
         | 
| 2 | 
            +
            require "inci_score/computer"
         | 
| 3 | 
            +
             | 
| 4 | 
            +
            module InciScore
         | 
| 5 | 
            +
              class CLI
         | 
| 6 | 
            +
                def initialize(args:, io: STDOUT, catalog: InciScore::Catalog.fetch)
         | 
| 7 | 
            +
                  @args = args
         | 
| 8 | 
            +
                  @io = io
         | 
| 9 | 
            +
                  @catalog = catalog
         | 
| 10 | 
            +
                  @src = nil
         | 
| 11 | 
            +
                  @fresh = nil
         | 
| 12 | 
            +
                end
         | 
| 13 | 
            +
             | 
| 14 | 
            +
                def call(computer_klass = Computer, fetcher = Fetcher.new)
         | 
| 15 | 
            +
                  parser.parse!(@args)
         | 
| 16 | 
            +
                  return @io.puts("Specify inci list as: --src='aqua, parfum, etc'") unless @src
         | 
| 17 | 
            +
                  @io.puts computer_klass.new(src: @src, catalog: catalog(fetcher)).call
         | 
| 18 | 
            +
                end
         | 
| 19 | 
            +
             | 
| 20 | 
            +
                private def parser
         | 
| 21 | 
            +
                  OptionParser.new do |opts|
         | 
| 22 | 
            +
                    opts.banner = %q{Usage: ./bin/inci_score --src='aqua, parfum, etc' --fresh}
         | 
| 23 | 
            +
             | 
| 24 | 
            +
                    opts.on("-sSRC", "--src=SRC", "The INCI list: 'aqua, parfum, etc'") do |src|
         | 
| 25 | 
            +
                      @src = src
         | 
| 26 | 
            +
                    end
         | 
| 27 | 
            +
             | 
| 28 | 
            +
                    opts.on("-f", "--fresh", "Fetch a fresh catalog from remote") do |fresh|
         | 
| 29 | 
            +
                      @fresh = fresh
         | 
| 30 | 
            +
                    end
         | 
| 31 | 
            +
             | 
| 32 | 
            +
                    opts.on("-h", "--help", "Prints this help") do
         | 
| 33 | 
            +
                      @io.puts opts
         | 
| 34 | 
            +
                      exit
         | 
| 35 | 
            +
                    end
         | 
| 36 | 
            +
                  end
         | 
| 37 | 
            +
                end
         | 
| 38 | 
            +
             | 
| 39 | 
            +
                private def catalog(fetcher)
         | 
| 40 | 
            +
                  return @catalog unless @fresh
         | 
| 41 | 
            +
                  fetcher.call
         | 
| 42 | 
            +
                end
         | 
| 43 | 
            +
              end
         | 
| 44 | 
            +
            end
         | 
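
The option handling in the new `cli.rb` can be tried in isolation. The following is a minimal sketch, not the gem's class: `parse_cli` is a hypothetical helper that collects the same `--src` and `--fresh` switches into a hash using the standard library `OptionParser`.

```ruby
require "optparse"

# Hypothetical helper: parses the two switches shown in the diff and
# returns them as a hash instead of storing them in instance variables.
def parse_cli(args)
  options = { src: nil, fresh: false }
  parser = OptionParser.new do |opts|
    opts.on("-sSRC", "--src=SRC", "The INCI list") { |src| options[:src] = src }
    opts.on("-f", "--fresh", "Fetch a fresh catalog") { |fresh| options[:fresh] = fresh }
  end
  # parse! mutates its argument, so work on a copy of the caller's array.
  parser.parse!(args.dup)
  options
end

options = parse_cli(["--fresh", "--src=aqua,dimethicone"])
puts options.inspect
```

Copying the argument array before `parse!` mirrors the `ARGV.clone` call in `bin/inci_score`, presumably for the same reason: `parse!` removes the switches it recognizes from the array it is given.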
    
        data/lib/inci_score/computer.rb
    CHANGED
    
    | @@ -7,9 +7,11 @@ module InciScore | |
| 7 7 | 
             
              class Computer
         | 
| 8 8 | 
             
                TOLERANCE = 30.0
         | 
| 9 9 |  | 
| 10 | 
            -
                def initialize(src,  | 
| 10 | 
            +
                def initialize(src:, catalog:, tolerance: TOLERANCE, rules: Normalizer::DEFAULT_RULES)
         | 
| 11 11 | 
             
                  @src = src
         | 
| 12 12 | 
             
                  @catalog = catalog
         | 
| 13 | 
            +
                  @tolerance = Float(tolerance)
         | 
| 14 | 
            +
                  @rules = rules
         | 
| 13 15 | 
             
                  @unrecognized = []
         | 
| 14 16 | 
             
                end
         | 
| 15 17 |  | 
| @@ -20,17 +22,15 @@ module InciScore | |
| 20 22 | 
             
                                             valid: valid?)
         | 
| 21 23 | 
             
                end
         | 
| 22 24 |  | 
| 23 | 
            -
                private
         | 
| 24 | 
            -
             | 
| 25 | 
            -
                def score
         | 
| 25 | 
            +
                private def score
         | 
| 26 26 | 
             
                  Scorer.new(components.map(&:last)).call
         | 
| 27 27 | 
             
                end
         | 
| 28 28 |  | 
| 29 | 
            -
                def ingredients
         | 
| 30 | 
            -
                  @ingredients ||= Normalizer.new(src: @src).call
         | 
| 29 | 
            +
                private def ingredients
         | 
| 30 | 
            +
                  @ingredients ||= Normalizer.new(src: @src, rules: @rules).call
         | 
| 31 31 | 
             
                end
         | 
| 32 32 |  | 
| 33 | 
            -
                def components
         | 
| 33 | 
            +
                private def components
         | 
| 34 34 | 
             
                  @components ||= ingredients.map do |ingredient|
         | 
| 35 35 | 
             
                    Recognizer.new(ingredient, @catalog).call.tap do |component|
         | 
| 36 36 | 
             
                      @unrecognized << ingredient unless component
         | 
| @@ -38,8 +38,8 @@ module InciScore | |
| 38 38 | 
             
                  end.compact
         | 
| 39 39 | 
             
                end
         | 
| 40 40 |  | 
| 41 | 
            -
                def valid?
         | 
| 42 | 
            -
                  @unrecognized.size / (ingredients.size / 100.0) <=  | 
| 41 | 
            +
                private def valid?
         | 
| 42 | 
            +
                  @unrecognized.size / (ingredients.size / 100.0) <= @tolerance
         | 
| 43 43 | 
             
                end
         | 
| 44 44 | 
             
              end
         | 
| 45 45 | 
             
            end
         | 
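
The `valid?` check in `computer.rb` compares the percentage of unrecognized ingredients against the tolerance. Here is a stand-alone sketch of that arithmetic; the method name and signature are hypothetical, and the explicit guard for an empty list is an addition of this sketch rather than something the diff shows.

```ruby
# Same default value the diff shows for Computer::TOLERANCE.
TOLERANCE = 30.0

# Hypothetical stand-alone version of the validity check: the list is
# valid when the share of unrecognized ingredients (in percent) does not
# exceed the tolerance.
def valid_list?(unrecognized_count, ingredients_count, tolerance: TOLERANCE)
  # Assumption of this sketch: an empty list is treated as invalid.
  return false if ingredients_count.zero?
  percent = unrecognized_count / (ingredients_count / 100.0)
  percent <= Float(tolerance)
end

puts valid_list?(1, 4)                 # 1 of 4 unrecognized
puts valid_list?(2, 4)                 # 2 of 4 unrecognized
puts valid_list?(2, 4, tolerance: 60)  # same list, looser tolerance
```

Passing the tolerance as a keyword argument, as version 2.0.1 does, lets callers loosen or tighten the threshold without touching the constant.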
| @@ -2,7 +2,7 @@ require 'inci_score/normalizer_rules' | |
| 2 2 |  | 
| 3 3 | 
             
            module InciScore
         | 
| 4 4 | 
             
              class Normalizer
         | 
| 5 | 
            -
                DEFAULT_RULES = Rules | 
| 5 | 
            +
                DEFAULT_RULES = [Rules::Replacer, Rules::Downcaser, Rules::Beheader, Rules::Separator, Rules::Tokenizer, Rules::Sanitizer, Rules::Desynonymizer]
         | 
| 6 6 |  | 
| 7 7 | 
             
                attr_reader :src
         | 
| 8 8 |  | 
| @@ -12,9 +12,9 @@ module InciScore | |
| 12 12 | 
             
                end
         | 
| 13 13 |  | 
| 14 14 | 
             
                def call
         | 
| 15 | 
            -
                  @rules | 
| 16 | 
            -
             | 
| 17 | 
            -
                    src = rule.call
         | 
| 15 | 
            +
                  yield(@rules) if block_given?
         | 
| 16 | 
            +
                  @rules.reduce(@src) do |src, rule|
         | 
| 17 | 
            +
                    @src = rule.call(src)
         | 
| 18 18 | 
             
                  end
         | 
| 19 19 | 
             
                end
         | 
| 20 20 | 
             
              end
         | 
| @@ -1,73 +1,90 @@ | |
| 1 1 | 
             
            module InciScore
         | 
| 2 2 | 
             
              class Normalizer
         | 
| 3 3 | 
             
                module Rules
         | 
| 4 | 
            -
                   | 
| 5 | 
            -
                    SEPARATOR = ','
         | 
| 4 | 
            +
                  SEPARATOR = ','
         | 
| 6 5 |  | 
| 7 | 
            -
             | 
| 8 | 
            -
             | 
| 9 | 
            -
                    end
         | 
| 6 | 
            +
                  module Replacer
         | 
| 7 | 
            +
                    extend self
         | 
| 10 8 |  | 
| 11 | 
            -
                    def call
         | 
| 12 | 
            -
                      fail NotImplementedError
         | 
| 13 | 
            -
                    end
         | 
| 14 | 
            -
                  end
         | 
| 15 | 
            -
             | 
| 16 | 
            -
                  class Replacer < Base
         | 
| 17 9 | 
             
                    REPLACEMENTS = [
         | 
| 18 10 | 
             
                      [/\n+|\t+/, ' '],
         | 
| 19 11 | 
             
                      ['‘', "'"],
         | 
| 20 12 | 
             
                      ['—', '-'],
         | 
| 21 | 
            -
                      ['(', 'C'],
         | 
| 22 13 | 
             
                      ['_', ' '],
         | 
| 23 14 | 
             
                      ['~', '-'],
         | 
| 24 15 | 
             
                      ['|', 'l'],
         | 
| 25 16 | 
             
                      [' I ', '/']
         | 
| 26 17 | 
             
                    ]
         | 
| 27 18 |  | 
| 28 | 
            -
                    def call
         | 
| 29 | 
            -
                      REPLACEMENTS.reduce( | 
| 19 | 
            +
                    def call(src)
         | 
| 20 | 
            +
                      REPLACEMENTS.reduce(src) do |_src, replacement|
         | 
| 30 21 | 
             
                        invalid, valid = *replacement
         | 
| 31 | 
            -
                         | 
| 22 | 
            +
                        _src.index(invalid) ? _src.gsub(invalid, valid) : _src
         | 
| 32 23 | 
             
                      end
         | 
| 33 24 | 
             
                    end
         | 
| 34 25 | 
             
                  end
         | 
| 35 26 |  | 
| 36 | 
            -
                   | 
| 37 | 
            -
                     | 
| 38 | 
            -
             | 
| 27 | 
            +
                  module Downcaser
         | 
| 28 | 
            +
                    extend self
         | 
| 29 | 
            +
             | 
| 30 | 
            +
                    def call(src)
         | 
| 31 | 
            +
                      src.downcase
         | 
| 39 32 | 
             
                    end
         | 
| 40 33 | 
             
                  end 
         | 
| 41 34 |  | 
| 42 | 
            -
                   | 
| 35 | 
            +
                  module Beheader
         | 
| 36 | 
            +
                    extend self
         | 
| 37 | 
            +
             | 
| 43 38 | 
             
                    TITLE_SEP = ':'
         | 
| 44 39 | 
             
                    MAX_INDEX = 50
         | 
| 45 40 |  | 
| 46 | 
            -
                    def call
         | 
| 47 | 
            -
                      sep_index =  | 
| 48 | 
            -
                      return  | 
| 49 | 
            -
                       | 
| 41 | 
            +
                    def call(src)
         | 
| 42 | 
            +
                      sep_index = src.index(TITLE_SEP)
         | 
| 43 | 
            +
                      return src if !sep_index || sep_index > MAX_INDEX
         | 
| 44 | 
            +
                      src[sep_index+1, src.size]
         | 
| 50 45 | 
             
                    end
         | 
| 51 46 | 
             
                  end
         | 
| 52 47 |  | 
| 53 | 
            -
                   | 
| 48 | 
            +
                  module Separator
         | 
| 49 | 
            +
                    extend self
         | 
| 50 | 
            +
             | 
| 54 51 | 
             
                    SEPARATORS = ["; ", ". ", " ' ", " - ", " : "]
         | 
| 55 52 |  | 
| 56 | 
            -
                    def call
         | 
| 57 | 
            -
                      SEPARATORS.reduce( | 
| 58 | 
            -
                         | 
| 53 | 
            +
                    def call(src)
         | 
| 54 | 
            +
                      SEPARATORS.reduce(src) do |_src, separator|
         | 
| 55 | 
            +
                        _src = _src.gsub(separator, SEPARATOR)
         | 
| 59 56 | 
             
                      end
         | 
| 60 57 | 
             
                    end
         | 
| 61 58 | 
             
                  end 
         | 
| 62 59 |  | 
| 63 | 
            -
                   | 
| 64 | 
            -
                     | 
| 60 | 
            +
                  module Tokenizer
         | 
| 61 | 
            +
                    extend self
         | 
| 62 | 
            +
             | 
| 63 | 
            +
                    def call(src)
         | 
| 64 | 
            +
                      src.split(SEPARATOR).map(&:strip)
         | 
| 65 | 
            +
                    end
         | 
| 66 | 
            +
                  end
         | 
| 67 | 
            +
             | 
| 68 | 
            +
                  module Sanitizer
         | 
| 69 | 
            +
                    extend self
         | 
| 70 | 
            +
             | 
| 71 | 
            +
                    INVALID_CHARS = /[^\/\(\)\w\s-]/
         | 
| 72 | 
            +
             | 
| 73 | 
            +
                    def call(src)
         | 
| 74 | 
            +
                      Array(src).map do |token|
         | 
| 75 | 
            +
                        token.gsub(INVALID_CHARS, '')
         | 
| 76 | 
            +
                      end.reject(&:empty?)
         | 
| 77 | 
            +
                    end
         | 
| 78 | 
            +
                  end
         | 
| 79 | 
            +
             | 
| 80 | 
            +
                  module Desynonymizer
         | 
| 81 | 
            +
                    extend self
         | 
| 82 | 
            +
             | 
| 83 | 
            +
                    SYNONYM = /\/.*/
         | 
| 65 84 |  | 
| 66 | 
            -
                    def call
         | 
| 67 | 
            -
                       | 
| 68 | 
            -
                        token | 
| 69 | 
            -
                        token = token.gsub(INVALID_CHARS, '')
         | 
| 70 | 
            -
                        token = token.strip
         | 
| 85 | 
            +
                    def call(src)
         | 
| 86 | 
            +
                      Array(src).map do |token|
         | 
| 87 | 
            +
                        token.sub(SYNONYM, '').strip
         | 
| 71 88 | 
             
                      end.reject(&:empty?)
         | 
| 72 89 | 
             
                    end
         | 
| 73 90 | 
             
                  end
         | 
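
In 2.0.1 the normalizer rules become stateless modules that respond to `call(src)`, chained with `reduce`. The sketch below reproduces that shape with three simplified rules; the module bodies are reduced versions written for this example, and `normalize` is a hypothetical helper rather than the gem's `Normalizer` class.

```ruby
# Simplified stand-ins for the rules in the diff; each one exposes a
# module-level call(src) through `extend self`.
module Downcaser
  extend self

  def call(src)
    src.downcase
  end
end

module Tokenizer
  extend self

  def call(src)
    src.split(",").map(&:strip)
  end
end

module Sanitizer
  extend self

  INVALID_CHARS = /[^\/\(\)\w\s-]/

  def call(tokens)
    Array(tokens).map { |token| token.gsub(INVALID_CHARS, "") }.reject(&:empty?)
  end
end

RULES = [Downcaser, Tokenizer, Sanitizer]

# Hypothetical helper: feeds the output of each rule into the next one.
def normalize(src, rules = RULES)
  rules.reduce(src) { |acc, rule| rule.call(acc) }
end

puts normalize("Aqua, Dimethicone*, !!").inspect
```

Because a rule only has to respond to `call`, a lambda can be dropped into the list as well, for example `normalize("A,B", [Downcaser, Tokenizer, ->(tokens) { tokens.reverse }])`.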
| @@ -2,7 +2,7 @@ require 'inci_score/recognizer_rules' | |
| 2 2 |  | 
| 3 3 | 
             
            module InciScore
         | 
| 4 4 | 
             
              class Recognizer
         | 
| 5 | 
            -
                DEFAULT_RULES = Rules | 
| 5 | 
            +
                DEFAULT_RULES = [Rules::Key, Rules::Levenshtein, Rules::Digits, Rules::Tokens]
         | 
| 6 6 |  | 
| 7 7 | 
             
                def initialize(src, catalog, rules = DEFAULT_RULES)
         | 
| 8 8 | 
             
                  @src = src
         | 
| @@ -11,17 +11,13 @@ module InciScore | |
| 11 11 | 
             
                end
         | 
| 12 12 |  | 
| 13 13 | 
             
                def call
         | 
| 14 | 
            -
                  @component =  | 
| 15 | 
            -
             | 
| 16 | 
            -
             | 
| 17 | 
            -
             | 
| 18 | 
            -
             | 
| 19 | 
            -
             | 
| 20 | 
            -
                def apply_rules
         | 
| 21 | 
            -
                  @rules.reduce(nil) do |component, name|
         | 
| 22 | 
            -
                    rule = Rules.const_get(name).new(@src, @catalog)
         | 
| 23 | 
            -
                    component || rule.call
         | 
| 14 | 
            +
                  @component = @rules.reduce(nil) do |component, rule|
         | 
| 15 | 
            +
                    break(component) if component
         | 
| 16 | 
            +
                    _rule = rule.new(@src, @catalog)
         | 
| 17 | 
            +
                    yield(rule) if block_given?
         | 
| 18 | 
            +
                    _rule.call
         | 
| 24 19 | 
             
                  end
         | 
| 25 | 
            -
             | 
| 20 | 
            +
                  [@component, @catalog[@component]] if @component
         | 
| 21 | 
            +
                end 
         | 
| 26 22 | 
             
              end
         | 
| 27 23 | 
             
            end
         | 
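
The rewritten `Recognizer#call` walks the rule classes with `reduce` and stops at the first rule that returns a component. A minimal sketch of that short-circuit follows; the catalog and the two rule classes are hypothetical and far simpler than the gem's `Key`, `Levenshtein`, `Digits` and `Tokens` rules.

```ruby
# Hypothetical tiny catalog: component name => hazard score.
CATALOG = { "aqua" => 0, "dimethicone" => 4 }.freeze

# Hypothetical rule: exact lookup by key.
class ExactKey
  def initialize(src, catalog)
    @src = src
    @catalog = catalog
  end

  def call
    @src if @catalog.key?(@src)
  end
end

# Hypothetical rule: match on the first four characters.
class Prefix
  def initialize(src, catalog)
    @src = src
    @catalog = catalog
  end

  def call
    return if @src.size < 4
    @catalog.keys.detect { |key| key.start_with?(@src[0, 4]) }
  end
end

# Applies the rules in order; `break` leaves reduce as soon as a
# previous rule has produced a component.
def recognize(src, catalog, rules = [ExactKey, Prefix])
  component = rules.reduce(nil) do |found, rule|
    break(found) if found
    rule.new(src, catalog).call
  end
  [component, catalog[component]] if component
end

puts recognize("aqua", CATALOG).inspect
puts recognize("dimethicon", CATALOG).inspect
puts recognize("xyz", CATALOG).inspect
```

Ordering the rules from cheapest to most expensive means the costly ones only run when the cheaper ones fail, which is consistent with the README's remark that unrecognized components are the slow path.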
| @@ -28,12 +28,12 @@ module InciScore | |
| 28 28 | 
             
                    def call
         | 
| 29 29 | 
             
                      size = @src.size
         | 
| 30 30 | 
             
                      initial = @src[0]
         | 
| 31 | 
            -
                      component, distance = @catalog.reduce([nil, size]) do |min, ( | 
| 32 | 
            -
                        next min unless  | 
| 33 | 
            -
                        match = (n =  | 
| 31 | 
            +
                      component, distance = @catalog.reduce([nil, size]) do |min, (_component, _)|
         | 
| 32 | 
            +
                        next min unless _component.start_with?(initial)
         | 
| 33 | 
            +
                        match = (n = _component.index(ALTERNATE_SEP)) ? _component[0, n] : _component
         | 
| 34 34 | 
             
                        next min if match.size > (size + TOLERANCE)
         | 
| 35 35 | 
             
                        dist = @src.distance(match)
         | 
| 36 | 
            -
                        min = [ | 
| 36 | 
            +
                        min = [_component, dist] if dist < min[1]
         | 
| 37 37 | 
             
                        min
         | 
| 38 38 | 
             
                      end
         | 
| 39 39 | 
             
                      component unless distance > TOLERANCE || distance >= (size-1)
         | 
| @@ -47,7 +47,7 @@ module InciScore | |
| 47 47 | 
             
                      return if @src.size < TOLERANCE
         | 
| 48 48 | 
             
                      digits = @src[0, MIN_MEANINGFUL]
         | 
| 49 49 | 
             
                      @catalog.detect do |component, _| 
         | 
| 50 | 
            -
                        component.match(/^#{Regexp::escape(digits)}/)
         | 
| 50 | 
            +
                        component.match?(/^#{Regexp::escape(digits)}/)
         | 
| 51 51 | 
             
                      end.to_a.first
         | 
| 52 52 | 
             
                    end
         | 
| 53 53 | 
             
                  end
         | 
| @@ -58,7 +58,7 @@ module InciScore | |
| 58 58 | 
             
                    def call
         | 
| 59 59 | 
             
                      tokens.each do |token|
         | 
| 60 60 | 
             
                        @catalog.each do |component, _| 
         | 
| 61 | 
            -
                          return component if component.match(/\b#{Regexp.escape(token)}\b/)
         | 
| 61 | 
            +
                          return component if component.match?(/\b#{Regexp.escape(token)}\b/)
         | 
| 62 62 | 
             
                        end
         | 
| 63 63 | 
             
                      end
         | 
| 64 64 | 
             
                      nil
         | 
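
The `Levenshtein` rule above relies on a `distance` method that the gem implements in C through Ruby Inline. For readers who want to see the idea without the C extension, here is a pure-Ruby sketch of the edit distance plus a nearest-match helper. Both method names and the tolerance of 3 are assumptions of this example, not values taken from the gem.

```ruby
# Pure-Ruby edit distance (two-row dynamic programming). This is a
# swapped-in illustration, not the gem's C implementation.
def levenshtein(a, b)
  return b.size if a.empty?
  return a.size if b.empty?

  previous = (0..b.size).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

# Hypothetical helper: returns the closest component, or nil when even
# the best candidate is farther than the tolerance.
def nearest(src, components, tolerance: 3)
  best, distance = components.map { |c| [c, levenshtein(src, c)] }.min_by(&:last)
  best if distance && distance <= tolerance
end

puts levenshtein("kitten", "sitting")
puts nearest("dimeticone", ["aqua", "dimethicone"]).inspect
```

The gem's rule additionally skips candidates that do not share the first letter or are much longer than the source, which keeps the number of distance computations down.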
    
        data/lib/inci_score/version.rb
    CHANGED
    
    
    
        data/lib/inci_score.rb
    CHANGED
    
    
    
        metadata
    CHANGED
    
    | @@ -1,14 +1,14 @@ | |
| 1 1 | 
             
            --- !ruby/object:Gem::Specification
         | 
| 2 2 | 
             
            name: inci_score
         | 
| 3 3 | 
             
            version: !ruby/object:Gem::Version
         | 
| 4 | 
            -
              version:  | 
| 4 | 
            +
              version: 2.0.1
         | 
| 5 5 | 
             
            platform: ruby
         | 
| 6 6 | 
             
            authors:
         | 
| 7 7 | 
             
            - costajob
         | 
| 8 8 | 
             
            autorequire: 
         | 
| 9 9 | 
             
            bindir: bin
         | 
| 10 10 | 
             
            cert_chain: []
         | 
| 11 | 
            -
            date:  | 
| 11 | 
            +
            date: 2017-01-03 00:00:00.000000000 Z
         | 
| 12 12 | 
             
            dependencies:
         | 
| 13 13 | 
             
            - !ruby/object:Gem::Dependency
         | 
| 14 14 | 
             
              name: nokogiri
         | 
| @@ -145,11 +145,12 @@ files: | |
| 145 145 | 
             
            - lib/inci_score.rb
         | 
| 146 146 | 
             
            - lib/inci_score/api/app.rb
         | 
| 147 147 | 
             
            - lib/inci_score/catalog.rb
         | 
| 148 | 
            +
            - lib/inci_score/cli.rb
         | 
| 148 149 | 
             
            - lib/inci_score/computer.rb
         | 
| 150 | 
            +
            - lib/inci_score/fetcher.rb
         | 
| 149 151 | 
             
            - lib/inci_score/levenshtein.rb
         | 
| 150 152 | 
             
            - lib/inci_score/normalizer.rb
         | 
| 151 153 | 
             
            - lib/inci_score/normalizer_rules.rb
         | 
| 152 | 
            -
            - lib/inci_score/parser.rb
         | 
| 153 154 | 
             
            - lib/inci_score/recognizer.rb
         | 
| 154 155 | 
             
            - lib/inci_score/recognizer_rules.rb
         | 
| 155 156 | 
             
            - lib/inci_score/response.rb
         | 
| @@ -169,7 +170,7 @@ required_ruby_version: !ruby/object:Gem::Requirement | |
| 169 170 | 
             
              requirements:
         | 
| 170 171 | 
             
              - - ">="
         | 
| 171 172 | 
             
                - !ruby/object:Gem::Version
         | 
| 172 | 
            -
                  version: 2. | 
| 173 | 
            +
                  version: '2.4'
         | 
| 173 174 | 
             
            required_rubygems_version: !ruby/object:Gem::Requirement
         | 
| 174 175 | 
             
              requirements:
         | 
| 175 176 | 
             
              - - ">="
         | 
| @@ -177,7 +178,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement | |
| 177 178 | 
             
                  version: '0'
         | 
| 178 179 | 
             
            requirements: []
         | 
| 179 180 | 
             
            rubyforge_project: 
         | 
| 180 | 
            -
            rubygems_version: 2. | 
| 181 | 
            +
            rubygems_version: 2.6.8
         | 
| 181 182 | 
             
            signing_key: 
         | 
| 182 183 | 
             
            specification_version: 4
         | 
| 183 184 | 
             
            summary: A library that computes the hazard of cosmetic products components, based
         |