vader_sentiment_ruby 0.1.0

Files changed (26)
  1. checksums.yaml +7 -0
  2. data/LICENSE.txt +21 -0
  3. data/README.md +51 -0
  4. data/lib/vader_sentiment_ruby.rb +20 -0
  5. data/lib/vader_sentiment_ruby/checker.rb +13 -0
  6. data/lib/vader_sentiment_ruby/checker/but_word_negation_checker.rb +34 -0
  7. data/lib/vader_sentiment_ruby/checker/least_word_negation_checker.rb +38 -0
  8. data/lib/vader_sentiment_ruby/checker/negation_checker.rb +114 -0
  9. data/lib/vader_sentiment_ruby/checker/no_word_checker.rb +49 -0
  10. data/lib/vader_sentiment_ruby/checker/previous_words_influence_checker.rb +55 -0
  11. data/lib/vader_sentiment_ruby/checker/sentiment_laden_idioms_checker.rb +30 -0
  12. data/lib/vader_sentiment_ruby/checker/special_idioms_checker.rb +107 -0
  13. data/lib/vader_sentiment_ruby/constants.rb +135 -0
  14. data/lib/vader_sentiment_ruby/data/emoji_utf8_lexicon.txt +3570 -0
  15. data/lib/vader_sentiment_ruby/data/vader_lexicon.txt +7518 -0
  16. data/lib/vader_sentiment_ruby/emojis_describer.rb +39 -0
  17. data/lib/vader_sentiment_ruby/emojis_dictionary_creator.rb +21 -0
  18. data/lib/vader_sentiment_ruby/lexicon_dictionary_creator.rb +21 -0
  19. data/lib/vader_sentiment_ruby/punctuation_emphasis_amplifier.rb +36 -0
  20. data/lib/vader_sentiment_ruby/sentiment_intensity_analyzer.rb +105 -0
  21. data/lib/vader_sentiment_ruby/sentiment_properties_identifier.rb +48 -0
  22. data/lib/vader_sentiment_ruby/sentiment_scores_sifter.rb +27 -0
  23. data/lib/vader_sentiment_ruby/valence_score_calculator.rb +82 -0
  24. data/lib/vader_sentiment_ruby/version.rb +5 -0
  25. data/lib/vader_sentiment_ruby/word_helper.rb +93 -0
  26. metadata +156 -0
checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 57fb7693c238e33224556fb6a2d7d8c479ed88d76cd6576edef5c8befe4ad144
4
+ data.tar.gz: 8fc4484045923da3ec6986b8dbbf39f41437c635a25dbfdd14c0973baebe271b
5
+ SHA512:
6
+ metadata.gz: 127836bbe570da1cd60082da181d22e4a3619d32300d8e16dfc9aa9a3fcbbc0b3d51ad0e1b4c8e1143879f4b51f6e70a047590468af7c7de260468ddb7987ab9
7
+ data.tar.gz: 94d770a25884484554c2dce05607c68bda7742482d883d18c2cf693c979a31085189fec042f78b7303e6b13e0b75f13c0c09ee7e39f5d756787cde27db0a6102
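The digests in checksums.yaml can be compared against a locally computed value. The following is a minimal sketch, not part of the gem: `sha256_matches?` is a hypothetical helper name, and the usage comment assumes the `.gem` archive has already been unpacked so that `data.tar.gz` is on disk.

```ruby
require 'digest/sha2'

# Hypothetical helper (not part of the gem): compares the SHA256 hex digest
# of the given bytes with an expected hex string, such as one of the values
# listed under SHA256 in checksums.yaml.
def sha256_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# Usage, assuming data.tar.gz has been extracted from the .gem archive:
# sha256_matches?(
#   File.binread('data.tar.gz'),
#   '8fc4484045923da3ec6986b8dbbf39f41437c635a25dbfdd14c0973baebe271b'
# )
```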
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2021 N. Bulavin
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,51 @@
1
+ # VaderSentimentRuby
2
+
3
+ VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media.
4
+
5
+ This is a port of the [VADER sentiment analysis tool](https://github.com/cjhutto/vaderSentiment), originally written in Python. If you'd like to contribute, please check out the original author's work.
6
+
7
+ ## Installation
8
+
9
+ Add this line to your application's Gemfile:
10
+ ```ruby
11
+ gem 'vader_sentiment_ruby'
12
+ ```
13
+ And then execute:
14
+ ```shell
15
+ bundle install
16
+ ```
17
+ Or install it yourself as:
18
+ ```shell
19
+ gem install vader_sentiment_ruby
20
+ ```
21
+ ## Usage
22
+ ```ruby
23
+ require 'vader_sentiment_ruby'
24
+
25
+ VaderSentimentRuby.polarity_scores('VADER is smart, handsome, and funny.')
26
+ # => {:negative=>0.0, :neutral=>0.254, :positive=>0.746, :compound=>0.8316}
27
+ ```
28
+
29
+ ## About the Scoring
30
+ The compound score is computed by summing the valence scores of each word in the lexicon, adjusted according to the rules, and then normalized to be between -1 (most extreme negative) and +1 (most extreme positive). This is the most useful metric if you want a single unidimensional measure of sentiment for a given sentence. Calling it a 'normalized, weighted composite score' is accurate.
31
+
32
+ It is also useful for researchers who would like to set standardized thresholds for classifying sentences as either positive, neutral, or negative. Typical threshold values (used in the literature cited on this page) are:
33
+
34
+ positive sentiment: compound score >= 0.05
35
+ neutral sentiment: (compound score > -0.05) and (compound score < 0.05)
36
+ negative sentiment: compound score <= -0.05
37
+
38
+ The positive, neutral, and negative scores (pos, neu, and neg in the original Python tool) are the proportions of the text that fall in each category, so they should sum to 1, or very close to it given floating-point rounding. These are the most useful metrics if you want multidimensional measures of sentiment for a given sentence.
39
+
40
+ ## Citation Information
41
+ If you use either the dataset or any of the VADER sentiment analysis tools (the VADER sentiment lexicon or the Ruby code for the rule-based sentiment analysis engine) in your research, please cite the following paper:
42
+
43
+ > Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.
44
+
45
+ For questions, please contact: C.J. Hutto Georgia Institute of Technology, Atlanta, GA 30032
46
+ cjhutto [at] gatech [dot] edu
47
+
48
+ ## License
49
+ The original source code is copyright © 2013 C.J. Hutto
50
+
51
+ This port gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
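The thresholds quoted in the README's "About the Scoring" section can be turned into a small classifier. This is a minimal sketch rather than part of the gem: `sentiment_label` is a hypothetical helper, and the sample hash simply repeats the scores shown in the README's usage example.

```ruby
# Hypothetical helper (not part of the gem): maps a compound score to a label
# using the typical thresholds quoted in the README.
def sentiment_label(compound)
  return :positive if compound >= 0.05
  return :negative if compound <= -0.05

  :neutral
end

# Scores copied from the README usage example.
scores = { negative: 0.0, neutral: 0.254, positive: 0.746, compound: 0.8316 }
puts sentiment_label(scores[:compound]) # prints "positive"
```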
data/lib/vader_sentiment_ruby.rb ADDED
@@ -0,0 +1,20 @@
1
+ # frozen_string_literal: true
2
+
3
+ # VaderSentimentRuby namespace
4
+ module VaderSentimentRuby
5
+ autoload(:Constants, 'vader_sentiment_ruby/constants')
6
+ autoload(:WordHelper, 'vader_sentiment_ruby/word_helper')
7
+ autoload(:LexiconDictionaryCreator, 'vader_sentiment_ruby/lexicon_dictionary_creator')
8
+ autoload(:EmojisDictionaryCreator, 'vader_sentiment_ruby/emojis_dictionary_creator')
9
+ autoload(:PunctuationEmphasisAmplifier, 'vader_sentiment_ruby/punctuation_emphasis_amplifier')
10
+ autoload(:SentimentScoresSifter, 'vader_sentiment_ruby/sentiment_scores_sifter')
11
+ autoload(:SentimentIntensityAnalyzer, 'vader_sentiment_ruby/sentiment_intensity_analyzer')
12
+ autoload(:ValenceScoreCalculator, 'vader_sentiment_ruby/valence_score_calculator')
13
+ autoload(:EmojiDescriber, 'vader_sentiment_ruby/emojis_describer')
14
+ autoload(:SentimentPropertiesIdentifier, 'vader_sentiment_ruby/sentiment_properties_identifier')
15
+ autoload(:Checker, 'vader_sentiment_ruby/checker')
16
+
17
+ def self.polarity_scores(text)
18
+ VaderSentimentRuby::SentimentIntensityAnalyzer.new.polarity_scores(text)
19
+ end
20
+ end
data/lib/vader_sentiment_ruby/checker.rb ADDED
@@ -0,0 +1,13 @@
1
+ # frozen_string_literal: true
2
+
3
+ module VaderSentimentRuby
4
+ # Used only for checkers autoload
5
+ module Checker
6
+ autoload(:ButWordNegationChecker, 'vader_sentiment_ruby/checker/but_word_negation_checker')
7
+ autoload(:LeastWordNegationChecker, 'vader_sentiment_ruby/checker/least_word_negation_checker')
8
+ autoload(:NegationChecker, 'vader_sentiment_ruby/checker/negation_checker')
9
+ autoload(:NoWordChecker, 'vader_sentiment_ruby/checker/no_word_checker')
10
+ autoload(:PreviousWordsInfluenceChecker, 'vader_sentiment_ruby/checker/previous_words_influence_checker')
11
+ autoload(:SpecialIdiomsChecker, 'vader_sentiment_ruby/checker/special_idioms_checker')
12
+ end
13
+ end
data/lib/vader_sentiment_ruby/checker/but_word_negation_checker.rb ADDED
@@ -0,0 +1,34 @@
1
+ # frozen_string_literal: true
2
+
3
+ module VaderSentimentRuby
4
+ module Checker
5
+ # Checks for modification in sentiment due to contrastive conjunction 'but'
6
+ class ButWordNegationChecker
7
+ def initialize(words_and_emoticons, sentiments)
8
+ @words_and_emoticons_lower = words_and_emoticons.map { |w| w.to_s.downcase }
9
+ @sentiments = sentiments
10
+ end
11
+
12
+ def call
13
+ return @sentiments unless @words_and_emoticons_lower.include?('but')
14
+
15
+ but_index = @words_and_emoticons_lower.index('but')
16
+ updated_sentiments = []
17
+ @sentiments.each_with_index do |sentiment, senti_index|
18
+ updated_sentiments << modified_sentiment(sentiment, senti_index, but_index)
19
+ end
20
+
21
+ updated_sentiments
22
+ end
23
+
24
+ private
25
+
26
+ def modified_sentiment(sentiment, senti_index, but_index)
27
+ return sentiment * 0.5 if senti_index < but_index
28
+ return sentiment * 1.5 if senti_index > but_index
29
+
30
+ sentiment
31
+ end
32
+ end
33
+ end
34
+ end
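The effect of the 'but' rule above is easiest to see on a tiny input. The sketch below restates the rule as a standalone method so it runs without the gem; `apply_but_rule` is a hypothetical name and the valences are made-up illustration values, not lexicon entries.

```ruby
# Standalone restatement of the 'but' rule (not the gem's class):
# sentiments before the first 'but' are halved, sentiments after it are
# multiplied by 1.5, and the entry at the 'but' position is left unchanged.
def apply_but_rule(words, sentiments)
  but_index = words.map { |w| w.to_s.downcase }.index('but')
  return sentiments if but_index.nil?

  sentiments.each_with_index.map do |sentiment, i|
    if i < but_index
      sentiment * 0.5
    elsif i > but_index
      sentiment * 1.5
    else
      sentiment
    end
  end
end

words = %w[good but bad]
sentiments = [2.0, 0.0, -2.0] # made-up valences for illustration
p apply_but_rule(words, sentiments) # prints [1.0, 0.0, -3.0]
```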
data/lib/vader_sentiment_ruby/checker/least_word_negation_checker.rb ADDED
@@ -0,0 +1,38 @@
1
+ # frozen_string_literal: true
2
+
3
+ module VaderSentimentRuby
4
+ module Checker
5
+ # Checks for negation case using "least"
6
+ class LeastWordNegationChecker
7
+ def initialize(valence, words_and_emoticons, index, lexicon)
8
+ @valence = valence
9
+ @words_and_emoticons = words_and_emoticons
10
+ @index = index
11
+ @lexicon = lexicon
12
+ end
13
+
14
+ def call
15
+ valence = @valence
16
+ return valence unless !word_in_lexicon?(@index - 1) && word_is?(@index - 1, 'least')
17
+
18
+ if @index > 1
19
+ valence *= Constants::N_SCALAR if !word_is?(@index - 2, 'at') && !word_is?(@index - 2, 'very')
20
+ elsif @index.positive?
21
+ valence *= Constants::N_SCALAR
22
+ end
23
+
24
+ valence
25
+ end
26
+
27
+ private
28
+
29
+ def word_in_lexicon?(index)
30
+ @lexicon.key?(@words_and_emoticons[index].downcase)
31
+ end
32
+
33
+ def word_is?(index, word)
34
+ @words_and_emoticons[index].downcase == word
35
+ end
36
+ end
37
+ end
38
+ end
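The "least" rule above can be restated as a standalone method to make its three cases visible. This is a sketch under stated assumptions: `apply_least_rule` is a hypothetical name, the default `n_scalar` of -0.74 is assumed (the gem's actual value lives in constants.rb, which is not shown here), and the one-entry lexicon is made up for illustration.

```ruby
# Standalone restatement of the "least" rule (not the gem's class).
# n_scalar defaults to -0.74, which is an ASSUMED value for illustration.
def apply_least_rule(valence, words, index, lexicon, n_scalar: -0.74)
  return valence unless index.positive?

  previous = words[index - 1].downcase
  return valence unless previous == 'least' && !lexicon.key?(previous)

  # "at least" and "very least" do not negate the following word.
  return valence if index > 1 && %w[at very].include?(words[index - 2].downcase)

  valence * n_scalar
end

lexicon = { 'good' => 1.9 } # made-up single-entry lexicon
p apply_least_rule(1.9, %w[least good], 1, lexicon)    # scaled by n_scalar
p apply_least_rule(1.9, %w[at least good], 2, lexicon) # unchanged
```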
data/lib/vader_sentiment_ruby/checker/negation_checker.rb ADDED
@@ -0,0 +1,114 @@
1
+ # frozen_string_literal: true
2
+
3
+ module VaderSentimentRuby
4
+ module Checker
5
+ # Checks for negation
6
+ class NegationChecker
7
+ # @param [Float] valence
8
+ # @param [Array(String)] words_and_emoticons
9
+ # @param [Integer] start_index
10
+ # @param [Integer] index
11
+ def initialize(valence, words_and_emoticons, start_index, index)
12
+ @valence = valence
13
+ @words_and_emoticons_lower = words_and_emoticons.map { |word| word.to_s.downcase }
14
+ @start_index = start_index
15
+ @index = index
16
+ end
17
+
18
+ # @return [Float]
19
+ def call
20
+ valence = @valence
21
+ valence = check_zero_index(valence) if @start_index.zero?
22
+ valence = check_first_index(valence) if @start_index == 1
23
+ valence = check_second_index(valence) if @start_index == 2
24
+
25
+ valence
26
+ end
27
+
28
+ private
29
+
30
+ def check_zero_index(valence)
31
+ # 1 word preceding lexicon word (w/o stopwords)
32
+ return valence unless negated?([@words_and_emoticons_lower[@index - (@start_index + 1)]])
33
+
34
+ valence * Constants::N_SCALAR
35
+ end
36
+
37
+ def check_first_index(valence)
38
+ return valence * 1.25 if word_is_never?(-2) && (word_is_so?(-1) || word_is_this?(-1))
39
+ return valence if word_is_without?(-2) && word_is_doubt?(-1)
40
+
41
+ if negated?([@words_and_emoticons_lower[@index - (@start_index + 1)]])
42
+ # 2 words preceding the lexicon word position
43
+ return valence * Constants::N_SCALAR
44
+ end
45
+
46
+ valence
47
+ end
48
+
49
+ # rubocop:disable Metrics/CyclomaticComplexity
50
+ # rubocop:disable Metrics/PerceivedComplexity
51
+ def check_second_index(valence)
52
+ if word_is_never?(-3) &&
53
+ (word_is_so?(-2) || word_is_this?(-2)) ||
54
+ (word_is_so?(-1) || word_is_this?(-1))
55
+ return valence * 1.25
56
+ elsif word_is_without?(-3) && (word_is_doubt?(-2) || word_is_doubt?(-1))
57
+ return valence
58
+ elsif negated?([@words_and_emoticons_lower[@index - (@start_index + 1)]])
59
+ # 3 words preceding the lexicon word position
60
+ return valence * Constants::N_SCALAR
61
+ end
62
+
63
+ valence
64
+ end
65
+ # rubocop:enable Metrics/CyclomaticComplexity
66
+ # rubocop:enable Metrics/PerceivedComplexity
67
+
68
+ # Determine if input contains negation words
69
+ def negated?(input_words, include_nt: true)
70
+ input_words = input_words.map { |w| w.to_s.downcase }
71
+ Constants::NEGATE.each do |word|
72
+ return true if input_words.include?(word)
73
+ end
74
+
75
+ if include_nt
76
+ input_words.each do |word|
77
+ return true if word.include?("n't")
78
+ end
79
+ end
80
+
81
+ # if input_words.include?('least')
82
+ # index = input_words.index('least')
83
+ # return true if index.positive? && input_words[index - 1] != 'at'
84
+ # end
85
+
86
+ false
87
+ end
88
+
89
+ def word_is_never?(index_shift)
90
+ word_is?(index_shift, 'never')
91
+ end
92
+
93
+ def word_is_so?(index_shift)
94
+ word_is?(index_shift, 'so')
95
+ end
96
+
97
+ def word_is_this?(index_shift)
98
+ word_is?(index_shift, 'this')
99
+ end
100
+
101
+ def word_is_without?(index_shift)
102
+ word_is?(index_shift, 'without')
103
+ end
104
+
105
+ def word_is_doubt?(index_shift)
106
+ word_is?(index_shift, 'doubt')
107
+ end
108
+
109
+ def word_is?(index_shift, word)
110
+ @words_and_emoticons_lower[@index + index_shift] == word
111
+ end
112
+ end
113
+ end
114
+ end
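The `negated?` predicate above combines a word-list lookup with a check for the "n't" contraction. The sketch below shows the same two-step test in standalone form; the default `negations` list is a small hypothetical sample, not the gem's `Constants::NEGATE`, which is defined in constants.rb and not shown here.

```ruby
# Standalone sketch of the negation test (not the gem's method).
# The default negations list is a hypothetical sample for illustration.
def negated?(words, negations: %w[not never without], include_nt: true)
  lowered = words.map { |w| w.to_s.downcase }
  return true if lowered.any? { |w| negations.include?(w) }

  # Optionally treat any word containing "n't" (isn't, don't, ...) as negation.
  include_nt && lowered.any? { |w| w.include?("n't") }
end

p negated?(%w[not])                      # prints true
p negated?(["isn't"])                    # prints true
p negated?(["isn't"], include_nt: false) # prints false
```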
data/lib/vader_sentiment_ruby/checker/no_word_checker.rb ADDED
@@ -0,0 +1,49 @@
1
+ # frozen_string_literal: true
2
+
3
+ module VaderSentimentRuby
4
+ module Checker
5
+ # Check for "no" as negation for an adjacent lexicon item vs "no" as its own stand-alone lexicon item
6
+ class NoWordChecker
7
+ # @param [Float] valence
8
+ # @param [String] item_lowercase
9
+ # @param [Integer] index
10
+ # @param [Array] words_and_emoticons
11
+ # @param [Hash] lexicon
12
+ def initialize(valence, item_lowercase, index, words_and_emoticons, lexicon)
13
+ @valence = valence
14
+ @item_lowercase = item_lowercase
15
+ @index = index
16
+ @words_and_emoticons = words_and_emoticons
17
+ @lexicon = lexicon
18
+ end
19
+
20
+ # @return [Float]
21
+ def call
22
+ valence = @valence
23
+
24
+ if @item_lowercase == 'no' &&
25
+ @index != @words_and_emoticons.size - 1 &&
26
+ @lexicon.key?(@words_and_emoticons[@index + 1].downcase)
27
+ # Don't use the valence of "no" as a lexicon item. Instead, set its valence to 0.0 and negate the next item
28
+ valence = 0.0
29
+ end
30
+
31
+ valence = @lexicon[@item_lowercase] * Constants::N_SCALAR if one_of_preceding_words_is_no?
32
+
33
+ valence
34
+ end
35
+
36
+ private
37
+
38
+ def one_of_preceding_words_is_no?
39
+ preceding_word_is_no?(0) ||
40
+ preceding_word_is_no?(1) ||
41
+ (preceding_word_is_no?(2) && %w[or nor].include?(@words_and_emoticons[@index - 1].downcase))
42
+ end
43
+
44
+ def preceding_word_is_no?(distance)
45
+ @index > distance && @words_and_emoticons[@index - (distance + 1)].downcase == 'no'
46
+ end
47
+ end
48
+ end
49
+ end
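The window that `one_of_preceding_words_is_no?` inspects is easier to follow with concrete sentences. The following standalone sketch mirrors that check; `no_precedes?` is a hypothetical name and the word lists are made-up examples.

```ruby
# Standalone sketch of the "no" window check (not the gem's class).
# "no" counts when it is one or two words back, or three words back
# when the word directly before the current one is "or" / "nor".
def no_precedes?(words, index)
  lowered = words.map(&:downcase)
  return true if index.positive? && lowered[index - 1] == 'no'
  return true if index > 1 && lowered[index - 2] == 'no'

  index > 2 && lowered[index - 3] == 'no' && %w[or nor].include?(lowered[index - 1])
end

p no_precedes?(%w[no good], 1)         # prints true
p no_precedes?(%w[no fun or good], 3)  # prints true
p no_precedes?(%w[no fun and good], 3) # prints false
```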
data/lib/vader_sentiment_ruby/checker/previous_words_influence_checker.rb ADDED
@@ -0,0 +1,55 @@
1
+ # frozen_string_literal: true
2
+
3
+ module VaderSentimentRuby
4
+ module Checker
5
+ # Checks if the preceding words increase, decrease, or negate/nullify the valence
6
+ class PreviousWordsInfluenceChecker
7
+ # @param [String] word
8
+ # @param [Float] valence
9
+ # @param [Boolean] is_cap_diff
10
+ def initialize(word, valence, is_cap_diff)
11
+ @word = word
12
+ @word_lower = word.downcase
13
+ @valence = valence
14
+ @is_cap_diff = is_cap_diff
15
+ @scalar = 0.0
16
+ end
17
+
18
+ # @return [Float]
19
+ def call
20
+ return @scalar unless word_in_booster_dictionary?
21
+
22
+ take_scalar_from_dictionary
23
+ @scalar *= -1 if @valence.negative?
24
+ amplify_scalar_by_word_case
25
+
26
+ @scalar
27
+ end
28
+
29
+ private
30
+
31
+ def word_in_booster_dictionary?
32
+ Constants::BOOSTER_DICT.key?(@word_lower)
33
+ end
34
+
35
+ def take_scalar_from_dictionary
36
+ @scalar = Constants::BOOSTER_DICT[@word_lower]
37
+ end
38
+
39
+ def amplify_scalar_by_word_case
40
+ # Check if booster/dampener word is in ALLCAPS (while others aren't)
41
+ return unless WordHelper.word_upcase?(@word) && @is_cap_diff
42
+
43
+ amplified_scalar
44
+ end
45
+
46
+ def amplified_scalar
47
+ if @valence.positive?
48
+ @scalar += Constants::C_INCR
49
+ else
50
+ @scalar -= Constants::C_INCR
51
+ end
52
+ end
53
+ end
54
+ end
55
+ end
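The scalar computed above depends on three things: the booster entry, the sign of the valence, and capitalization. The sketch below reproduces that flow in standalone form. Everything numeric in it is an assumption for illustration: the two-entry `boosters` hash and the `c_incr` default stand in for `Constants::BOOSTER_DICT` and `Constants::C_INCR`, and the all-caps test is a simple stand-in for `WordHelper.word_upcase?`, none of which are shown in this part of the diff.

```ruby
# Standalone sketch of the booster scalar (not the gem's class).
# The boosters hash and c_incr are ASSUMED illustration values.
def booster_scalar(word, valence, cap_diff,
                   boosters: { 'very' => 0.293, 'slightly' => -0.293 },
                   c_incr: 0.733)
  key = word.downcase
  return 0.0 unless boosters.key?(key)

  scalar = boosters[key]
  scalar = -scalar if valence.negative?

  # Simple stand-in for an ALLCAPS test: has letters and all are uppercase.
  all_caps = word == word.upcase && word != word.downcase
  return scalar unless all_caps && cap_diff

  valence.positive? ? scalar + c_incr : scalar - c_incr
end

p booster_scalar('very', 1.9, false)  # prints 0.293
p booster_scalar('very', -1.9, false) # prints -0.293
```

With a negative valence the booster flips sign, so an intensifier pushes a negative word further negative rather than toward zero.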
data/lib/vader_sentiment_ruby/checker/sentiment_laden_idioms_checker.rb ADDED
@@ -0,0 +1,30 @@
1
+ # frozen_string_literal: true
2
+
3
+ module VaderSentimentRuby
4
+ module Checker
5
+ # Not implemented
6
+ # check for sentiment laden idioms that don't contain a lexicon word
7
+ class SentimentLadenIdiomsChecker
8
+ def initialize(valence, senti_text_lower)
9
+ @valence = valence
10
+ @senti_text_lower = senti_text_lower
11
+ end
12
+
13
+ def call
14
+ idioms_valences = []
15
+ valence = @valence
16
+
17
+ Constants::SENTIMENT_LADEN_IDIOMS.each_key do |idiom|
18
+ next unless @senti_text_lower.include?(idiom)
19
+
20
+ valence = Constants::SENTIMENT_LADEN_IDIOMS[idiom]
21
+ idioms_valences.push(valence)
22
+ end
23
+
24
+ valence = idioms_valences.sum / idioms_valences.size.to_f if idioms_valences.size.positive?
25
+
26
+ valence
27
+ end
28
+ end
29
+ end
30
+ end
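The averaging step in this checker can be shown in isolation. The sketch below assumes the idioms are held in a Hash from phrase to valence, as the lookup in the class suggests; `idiom_valence` is a hypothetical name and the sample idioms and values are made up for illustration, since the gem's `Constants::SENTIMENT_LADEN_IDIOMS` is not shown here.

```ruby
# Standalone sketch of idiom averaging (not the gem's class).
# Returns the mean valence of all idioms found in the text, or the
# incoming valence unchanged when no idiom matches.
def idiom_valence(valence, text_lower, idioms)
  matched = idioms.select { |idiom, _score| text_lower.include?(idiom) }.values
  return valence if matched.empty?

  matched.sum / matched.size.to_f
end

idioms = { 'on the ball' => 2.0, 'upper hand' => 1.0 } # made-up sample values
p idiom_valence(0.0, 'she is on the ball and has the upper hand', idioms) # prints 1.5
p idiom_valence(0.7, 'nothing idiomatic here', idioms)                    # prints 0.7
```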